Unleash your creativity
Check out our currently supported models below. We support the best, highest-rated models available and update the selection frequently. Sign up to receive a monthly update on our latest models and other news, and join our Discord server to ask questions or request additional models.
TOP LLMS
SorcererLM 8x22b bf16
- RP
- BF16
- rAIfle-SorcererLM-8x22b-bf16
- Context: 16K
- Recommended Settings: https://files.catbox.moe/9tj7m0.json
Magnum-72b-v4
- RP, Storywriting
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32K
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
Settings provided by: GERGE
72B Qwen2.5 Kunou v1
- Sao10K-72B-Qwen2.5-Kunou-v1-FP8-Dynamic
- FP8
- Context: 32K
ALL MODELS
Anubis 70B v1
- RP, Storywriting
- TheDrummer-Anubis-70B-v1-FP8-Dynamic
- Context: 32K
MN 12B Mag Mell R1
- inflatebot-MN-12B-Mag-Mell-R1
- Context: 32K
L3.1-70B-Euryale-v2.2
- RP, Storywriting
- FP8 Dynamic
- L3.1-70B-Euryale-v2.2-FP8-Dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/1c9sp0.json
- RP Context: https://files.catbox.moe/5wwpin.json
Settings provided by: ShotMisser64
L3.3-70B-Euryale-v2.3
- Sao10K-L3.3-70B-Euryale-v2.3-FP8-Dynamic
- FP8
- Context: 32K
Llama 3.3 70B Instruct
- meta-llama-Llama-3.3-70B-Instruct-FP8-Dynamic
- FP8
- Context: 32K
Magnum v2 72B
- anthracite-org-magnum-v2-72b-FP8-Dynamic
- FP8
- Context: 32K
Hermes 3 Llama 3.1 70B
- RP
- NousResearch-Hermes-3-Llama-3.1-70B-FP8
- Context: 64K
QwQ 32B Preview
- RP
- Qwen-QwQ-32B-Preview
- Context: 32K
MN 12B Inferor v0.0
- RP
- Infermatic-MN-12B-Inferor-v0.0
- Context: 32K
- Settings & Review: MN 12B Inferor v0.0
UnslopNemo 12B v4.1
- RP
- BF16
- TheDrummer-UnslopNemo-12B-v4.1
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
Qwen2.5-72B-Instruct
- GP
- FP8
- Qwen2.5-72B-Instruct-Turbo
- Context: 32K
Llama 3.1 Nemotron 70B Instruct HF
- GP
- BF16
- nvidia-Llama-3.1-Nemotron-70B-Instruct-HF
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
Midnight Miqu 70B v1.5
- RP
- FP16
- Midnight-Miqu-70B-v1.5
- Context: 18K
- Preset: https://files.catbox.moe/l8e5zt.json
- RP Instruct: https://files.catbox.moe/eaj6gy.json
- RP Context: https://files.catbox.moe/mvn3jo.json
Settings provided by: ShadingCrawler
WizardLM 2 8x22B
- RP, Storywriting, General purpose
- WizardLM-2-8x22B
- Context: 16K
- Context template: Alpaca
- RP Instruct: https://files.catbox.moe/q0a07u.json
- RP Context: https://files.catbox.moe/68194o.json
Settings provided by: GERGE
Llama3 TenyxChat DaybreakStorywriter 70B
- FP8 Dynamic
- RP
- Llama-3-TenyxChat-DaybreakStorywriter-70B-fp8-dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/i3z4wv.json
- RP Context: https://files.catbox.moe/1k8p5b.json
Settings provided by: ShotMisser64
L3.1 70B Hanami x1
- GP, RP
- FP16
- Sao10K-L3.1-70B-Hanami-x1
- Context: 32K
Mixtral 8x7B Instruct v0.1
- BF16
- GP
- Mixtral-8x7B-Instruct-v0.1
- Context: 32K
Rocinante-12B-v1.1
- RP
- BF16
- TheDrummer-Rocinante-12B-v1.1
- Context: 32K
Qwen2-72B-Instruct
- BF16
- GP
- Qwen2.5-72B-Instruct
- Context: 32K
Llama-3.2-11B-Vision-Instruct
- BF16
- GP
- Llama-3.2-11B-Vision-Instruct-Turbo
- Context: 128K
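The hyphenated identifiers in each entry above (for example, TheDrummer-Rocinante-12B-v1.1) are typically the model names you reference when connecting through the API. Below is a minimal sketch of checking which models are currently live, assuming the API follows the common OpenAI-compatible convention of a model listing endpoint; the base URL and environment variable names are placeholders, not documented values, so confirm them against the API Docs linked further down.

```python
# Minimal sketch: list the currently available model IDs from an
# OpenAI-compatible API. The base URL and environment variable names are
# placeholders (assumptions), not documented values; see the API Docs.
import json
import os
import urllib.request

API_BASE = os.environ.get("INFERMATIC_API_BASE", "https://api.example.com/v1")  # placeholder
API_KEY = os.environ.get("INFERMATIC_API_KEY", "")  # set your key in the environment

req = urllib.request.Request(
    f"{API_BASE}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
with urllib.request.urlopen(req) as resp:
    models = json.load(resp)

# OpenAI-compatible APIs usually return {"data": [{"id": ...}, ...]}.
for entry in models.get("data", []):
    print(entry["id"])
```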
GUIDES & SETTINGS
Models: L3-70B-Euryale-v2.1 (Aug)
- Meet L3 70B Euryale v2.1: Your New Creative Companion. What is L3 70B Euryale v2.1 [...]
Guides: Using Infermatic.ai API with SillyTavern (Jun)
- SillyTavern is one of the most popular interfaces to interact with LLMs. We have been [...] (a minimal API call sketch follows this list)
Models: nvidia/Llama-3.1-Nemotron-70B-Instruct (Dec)
- Llama 3.1 Nemotron 70B Instruct: Follow and assert. Llama 3.1 Nemotron 70B Instruct is NVIDIA’s [...]
Models: Infermatic/MN 12B Inferor v0.0 (Dec)
- MN 12B Inferor v0.0: Dynamic and Creative. MN 12B Inferor, also known as Mistral Nemo [...]
Docs: API Docs
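As a quick illustration of what the SillyTavern guide and the API Docs cover, here is a minimal sketch of sending a chat-completions style request with one of the models listed above, using only Python's standard library. The base URL, environment variable names, and request fields are assumptions for illustration; the API Docs are the authority on the real endpoint, authentication, and supported parameters.

```python
# Minimal sketch of a chat-completions request to an OpenAI-compatible API.
# The base URL and environment variable names below are placeholders
# (assumptions), not the documented endpoint; see the API Docs.
import json
import os
import urllib.request

API_BASE = os.environ.get("INFERMATIC_API_BASE", "https://api.example.com/v1")  # placeholder
API_KEY = os.environ.get("INFERMATIC_API_KEY", "")  # set your key in the environment

payload = {
    "model": "Sao10K-72B-Qwen2.5-Kunou-v1-FP8-Dynamic",  # any model ID from the list above
    "messages": [
        {"role": "system", "content": "You are a creative storytelling assistant."},
        {"role": "user", "content": "Write the opening paragraph of a cozy mystery."},
    ],
    "max_tokens": 300,
    "temperature": 0.8,
}

req = urllib.request.Request(
    f"{API_BASE}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["choices"][0]["message"]["content"])
```

SillyTavern does roughly the same thing under the hood once you point its API connection at the endpoint and paste your key, which is what the guide above walks through.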
Frequently Asked Questions from Geek to Geek
- What is prompt engineering, and why is it critical in working with LLMs?
- How can I design effective prompts for LLMs?
- What are some standard techniques used in prompt engineering?
- How does prompt length impact the output of an LLM?
- How do LLMs understand and generate human-like text?
- What is the difference between Llama, Mixtral, and Qwen?
- What are some examples of advanced use cases of prompt engineering with LLMs?
- How do I choose the best LLM model for my project?
- What are large language models, and how do they differ from traditional NLP models?
- Can LLMs write code well?
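Several of the prompt-engineering questions above come down to how a prompt is assembled before it reaches the model. The snippet below is a small, generic illustration (not taken from our docs) of one standard technique: keeping the persona, the standing instructions, and the user's message in separate roles, and roughly tracking prompt length against the model's context window.

```python
# Illustrative only: a generic way to assemble a role-separated prompt
# and sanity-check its rough length against a model's context window.
PERSONA = "You are a meticulous fantasy storyteller who keeps continuity."
INSTRUCTIONS = "Stay in third person, past tense. End each reply on a hook."

def build_messages(user_turn: str, history: list[dict]) -> list[dict]:
    """Combine persona, standing instructions, prior turns, and the new turn."""
    return (
        [{"role": "system", "content": f"{PERSONA}\n\n{INSTRUCTIONS}"}]
        + history
        + [{"role": "user", "content": user_turn}]
    )

def rough_token_count(messages: list[dict]) -> int:
    """Very rough estimate: about 4 characters per token for English text."""
    return sum(len(m["content"]) for m in messages) // 4

history: list[dict] = []
messages = build_messages("The detective finally opens the sealed letter.", history)

CONTEXT_WINDOW = 32_000  # e.g. a 32K model from the list above
used = rough_token_count(messages)
print(f"Approximate tokens used: {used}; room left: {CONTEXT_WINDOW - used}")
```

Separating the persona and instructions from the conversation history keeps the standing behavior stable across turns, while the length check is a reminder that anything beyond the context window is silently dropped.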