Unleash your creativity
Check our our current supported models below. We support the best, highest-rated models available, and we update the selection of models we support frequently. Sign up to receive a monthly update about our latest models and other news, and participate in our Discord Server to ask questions or request additional models.
TOP LLMS
Fallen Llama 3.3 R1 70B v1
- TheDrummer/Fallen-Llama-3.3-R1-70B-v1
- Context: 32K
L3.3 70B Magnum v4 SE
- Doctor-Shotgun-L3.3-70B-Magnum-v4-SE
- Context: 32K
Qwen3 235B A22B Thinking 2507
- Qwen-Qwen3-235B-A22B-Thinking-2507
- Context: 100K
Strawberrylemonade L3 70B v1.1
- Infermatic/Strawberrylemonade-L3-70B-v1.1-FP8-Dynamic
- FP8
- Context: 32K
Qwen3.6-35B-A3B
- Qwen-Qwen3.6-35B-A3B
- Context: 66K
- Video Limit : 2
- Image Limit: 5
Cydonia-24B-v4.3
- Infermatic-Cydonia-24B-v4.3-FP8-Dynamic
- FP8
- Context: 66K
Magnum-72b-v4
- RP, Storywriting.
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32k
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
Settings provided by: GERGE
72B Qwen2.5 Kunou v1
- Sao10K-72B-Qwen2.5-Kunou-v1-FP8-Dynamic
- FP8
- Context: 32K
L3.3-70B-Euryale-v2.3
- Sao10K-L3.3-70B-Euryale-v2.3-FP8-Dynamic
- FP8
- Context: 32K
L3.1 70B Hanami x1
- GP, RP
- FP16
- Sao10K-L3.1-70B-Hanami-x1
- Context: 32K
PLUS PLAN MODELS
All of Free + Essential + Standard and the following:
Kokoro-82M
- TTS-hexgrad-Kokoro-82M
Qwen3 235B A22B Thinking 2507
- Qwen-Qwen3-235B-A22B-Thinking-2507
- Context: 100K
STANDARD PLAN MODELS
All of Free + Essential and the following:
Fallen Llama 3.3 R1 70B v1
- TheDrummer/Fallen-Llama-3.3-R1-70B-v1
- Context: 32K
Strawberrylemonade L3 70B v1.1
- Infermatic/Strawberrylemonade-L3-70B-v1.1-FP8-Dynamic
- FP8
- Context: 32K
Qwen3.6-35B-A3B
- Qwen-Qwen3.6-35B-A3B
- Context: 66K
- Video Limit : 2
- Image Limit: 5
Prometheus 7b v2.0
- Context: 32K
- Type: LLM-as-a-judge (evaluator)
Magnum-72b-v4
- RP, Storywriting.
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32k
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
Settings provided by: GERGE
ESSENTIAL PLAN MODELS
All of Free and the following:
multilingual-e5-base
- intfloat-multilingual-e5-base
L3.3 70B Magnum v4 SE
- Doctor-Shotgun-L3.3-70B-Magnum-v4-SE
- Context: 32K
Valkyrie 49B V1
- TheDrummer-Valkyrie-49B-v1
- Context: 64K
- RP
- BF16
Cydonia-24B-v4.3
- Infermatic-Cydonia-24B-v4.3-FP8-Dynamic
- FP8
- Context: 66K
Qwen3 Embedding 8B
- Context: 32K
- Dimension: From 32 to 4036
- Qwen-Qwen3-Embedding-8B
Qwen3 VL 8B Instruct
- Qwen-Qwen3-VL-8B-Instruct
- Context: 32K
- Text + image
- Max image: 4
72B Qwen2.5 Kunou v1
- Sao10K-72B-Qwen2.5-Kunou-v1-FP8-Dynamic
- FP8
- Context: 32K
L3.3-70B-Euryale-v2.3
- Sao10K-L3.3-70B-Euryale-v2.3-FP8-Dynamic
- FP8
- Context: 32K
L3.1 70B Hanami x1
- GP, RP
- FP16
- Sao10K-L3.1-70B-Hanami-x1
- Context: 32K
Anubis 70B v1.1
- RP – Storywriting
- TheDrummer-Anubis-70B-v1.1-FP8-Dynamic
- Context: 32K
Midnight Miqu 70B v1.5
- RP
- FP16
- Midnight-Miqu-70B-v1.5
- Context: 18K
- Preset: https://files.catbox.moe/l8e5zt.json
- RP Instruct: https://files.catbox.moe/eaj6gy.json
- RP Context: https://files.catbox.moe/mvn3jo.json
Settings provided by: ShadingCrawler
FREE MODELS
Rocinante-12B-v1.1
- RP
- BF16
- TheDrummer-Rocinante-12B-v1.1
- Context: 32K
GUIDES
Models
L3-70B-Euryale-v2.1
Meet L3 70B Euryale v2.1: Your New Creative Companion What is L3 70B Euryale v2.1 [...]
Aug
Guides
Using Infermatic.ai API with SillyTavern
SillyTavern is one of the most popular interfaces to interact with LLMs. We have been [...]
Jun
Models
nvidia/Llama-3.1-Nemotron-70B-Instruct
Llama 3.1 Nemotron 70B Instruct: Follow and assert Llama 3.1 Nemotron 70B Instruct is NVIDIA’s [...]
Dec
Models
Infermatic/MN 12B Inferor v0.0
MN 12B Inferor v0.0: Dynamic and Creative MN 12B Inferor, also known as Mistral Nemo [...]
Dec
Frequently Asked Questions from Geek to Geek
- What is prompt engineering, and why is it critical in working with LLMs?
- How can I design effective prompts for LLMs?
- What are some standard techniques used in prompt engineering?
- How does prompt length impact the output of an LLM?
- How do LLMs understand and generate human-like text?
- What is the difference between Llama, Mixtral, and Qwen?
- What are some examples of advanced use cases of prompt engineering with LLMs?
- How do I choose the best LLM model for my project?
- What are large language models, and how do they differ from traditional NLP models?
- Can LLMs write code well?
