Unleash your creativity
Check out our currently supported models below. We support the best, highest-rated models available, and we update our model selection frequently. Sign up to receive a monthly update about our latest models and other news, and join our Discord server to ask questions or request additional models.
Top LLMs
- RP
- BF16
- rAIfle-SorcererLM-8x22b-bf16
- Context: 16K
- Recommended Settings: https://files.catbox.moe/9tj7m0.json
- RP, Storywriting. This model has spatial awareness, memory, and detailed descriptions that keep generations entertaining, with very good creativity and strong NSFW writing.
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32K
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
- Settings provided by: GERGE
- RP, Storywriting. Coherent, emotional and very creative.
- FP8 Dynamic
- L3.1-70B-Euryale-v2.2-FP8-Dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/1c9sp0.json
- RP Context: https://files.catbox.moe/5wwpin.json
- Settings provided by: ShotMisser64
All Models
- RP. All the qualities of the best models merged into one.
- Merge of:
- Fizzarolli/MN-12b-Sunrose
- anthracite-org/magnum-v4-12b
- nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2
- nothingiisreal/MN-12B-Starcannon-v3
- Format: ChatML
- Context: 32K
- Settings:
- Temperature: 1
- Min-P: 0.65
- Top-A: 0.2
- Repetition Penalty: 1.03
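These sampler values, like the similar settings blocks listed for other models below, are applied client-side on each request. As a minimal, hedged sketch, here is how they might be passed through an OpenAI-compatible chat completions endpoint; the base URL, API key, model ID, and the extended sampler fields (min_p, top_a, repetition_penalty) are assumptions rather than confirmed API details, since support for non-standard samplers depends on the backend, and frontends like SillyTavern configure them for you via the linked presets.

```python
# Minimal sketch: applying the sampler settings listed above via an
# OpenAI-compatible chat completions request. The base URL, API key, model ID,
# and the extended sampler fields are assumptions, not confirmed API details.
import requests

API_BASE = "https://api.example-host.ai/v1"  # placeholder base URL
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "MODEL_ID_FROM_THE_LIST",  # use the exact model ID shown in the list
    "messages": [
        {"role": "user", "content": "Write the opening scene of a heist story."}
    ],
    "max_tokens": 512,
    # Sampler settings from the entry above:
    "temperature": 1.0,
    "min_p": 0.65,
    "top_a": 0.2,
    "repetition_penalty": 1.03,
}

response = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```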
- An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-72B on a mixture of synthetic and natural data.
- Format: ChatML
- Context: 32K
- Settings:
- Temperature: 1
- Min-P: 0.05
- Top-A: 0.2
- Repetition Penalty: 1.03
- An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data.
- Context: 32K
- Settings:
- Temperature: 1
- Min-P: 0.05
- Typical-P: 0.9
- Top-A: 0.2
- Repetition Penalty: 1.03
- RP
- BF16
- TheDrummer-UnslopNemo-12B-v4.1
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
- GP
- FP8
- Qwen2.5-72B-Instruct-Turbo
- Context: 32K
- GP
- BF16
- nvidia-Llama-3.1-Nemotron-70B-Instruct-HF
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
- RP
- FP16
- Midnight-Miqu-70B-v1.5
- Context: 16K
- Preset: https://files.catbox.moe/l8e5zt.json
- RP Instruct: https://files.catbox.moe/eaj6gy.json
- RP Context: https://files.catbox.moe/mvn3jo.json
- Settings provided by: ShadingCrawler
- RP, Storywriting, General purpose
- WizardLM-2-8x22B
- Context: 16K
- Context template: Alpaca
- RP Instruct: https://files.catbox.moe/q0a07u.json
- RP Context: https://files.catbox.moe/68194o.json
- Settings provided by: GERGE
- FP8 Dynamic
- RP
- Llama-3-TenyxChat-DaybreakStorywriter-70B-fp8-dynamic
- Context: 16K
- Settings:
- RP Instruct: https://files.catbox.moe/i3z4wv.json
- RP Context: https://files.catbox.moe/1k8p5b.json
- Settings provided by: ShotMisser64
- GP, RP
- FP16
- Sao10K-3.1-70B-Hanami-x1
- Context: 32K
- Mixtral-8x7B-Instruct-v0.1
- General purpose
- Context: 32K
- RP
- BF16
- TheDrummer-Rocinante-12B-v1.1
- Context: 32K
- FP8 Dynamic
- RP
- llama-3-lumimaid-8b-v0.1
- Context: 8K
- BF16
- GP
- Qwen2-72B-Instruct
- Context: 32K
- BF16
- GP
- Mixtral-8x7B-Instruct-v0.1
- Context: 32K
- BF16
- GP
- Llama-3.2-11B-Vision-Instruct-Turbo
- Context: 128K
Guides
- Guide to quant FP8: a simple guide to converting an FP16 model to FP8 (see the sketch after this list).
- L3-70B-Euryale-v2.1: meet L3 70B Euryale v2.1, your new creative companion.
- Using Infermatic.ai API with SillyTavern: SillyTavern is one of the most popular interfaces for interacting with LLMs.
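As a companion to the FP8 guide above, here is a minimal, data-free FP8-Dynamic conversion sketch assuming the open-source llm-compressor library; the guide's actual tooling and steps may differ, and the model ID below is only an example.

```python
# Minimal sketch: converting an FP16 model to FP8-Dynamic with llm-compressor.
# Assumptions: llmcompressor, transformers, and accelerate are installed, and the
# model ID below is only an example, not necessarily the one used in the guide.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot  # on older versions: llmcompressor.transformers
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # example; substitute your own

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8_DYNAMIC: static per-channel weight scales plus dynamic per-token activation
# scales, so no calibration dataset is needed; lm_head stays in full precision.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

oneshot(model=model, recipe=recipe)

# Save the quantized checkpoint for serving (e.g. with vLLM).
save_dir = MODEL_ID.split("/")[-1] + "-FP8-Dynamic"
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
```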
Docs
API Docs
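If you would rather check the current model list programmatically than on this page, the sketch below assumes an OpenAI-compatible /v1/models endpoint; the base URL is a placeholder, so see the API Docs for the real endpoint and authentication details.

```python
# Minimal sketch: listing currently available models, assuming an
# OpenAI-compatible /v1/models endpoint. The base URL is a placeholder.
import requests

API_BASE = "https://api.example-host.ai/v1"  # placeholder base URL
API_KEY = "YOUR_API_KEY"

response = requests.get(
    f"{API_BASE}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
for model in response.json()["data"]:
    print(model["id"])
```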
Frequently Asked Questions from Geek to Geek
- What is prompt engineering, and why is it critical in working with LLMs?
- How can I design effective prompts for LLMs?
- What are some standard techniques used in prompt engineering?
- How does prompt length impact the output of an LLM?
- How do LLMs understand and generate human-like text?
- What is the difference between Llama, Mixtral, and Qwen?
- What are some examples of advanced use cases of prompt engineering with LLMs?
- How do I choose the best LLM model for my project?
- What are large language models, and how do they differ from traditional NLP models?
- Can LLMs write code well?