Unleash your creativity
Check out our currently supported models below. We support the best, highest-rated models available and update the selection frequently. Sign up to receive a monthly update about our latest models and other news, and join our Discord server to ask questions or request additional models.
Top LLMs
- RP
- BF16
- rAIfle-SorcererLM-8x22b-bf16
- Context: 16K
- Recommended Settings: https://files.catbox.moe/9tj7m0.json
- RP, Storywriting. This model has spatial awareness, memory and detailed descriptions to keep the generation entertaining. Very good creativity and NSFW.
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32K
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
Settings provided by: GERGE
All Models
- Fine-tune of Llama 3.3. RP, Storywriting
- TheDrummer-Anubis-70B-v1-FP8-Dynamic
- Context: 32K
- A merge of pre-trained Mistral Nemo language models.
- Context: 32K
- RP, Storywriting. Coherent, emotional and very creative.
- FP8 Dynamic
- L3.1-70B-Euryale-v2.2-FP8-Dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/1c9sp0.json
- RP Context: https://files.catbox.moe/5wwpin.json
Settings provided by: ShotMisser64
- A direct replacement / successor to Euryale v2.2
- Sao10K-L3.3-70B-Euryale-v2.3-FP8-Dynamic
- FP8
- Context: 32K
- This model is fine-tuned on top of Qwen-2 72B Instruct.
- anthracite-org-magnum-v2-72b-FP8-Dynamic
- FP8
- Context: 32K
- All the qualities of the best models merged into one. RP
- NousResearch-Hermes-3-Llama-3.1-70B-FP8
- Context: 64K
- All the qualities of the best models merged into one. RP
- Qwen-QwQ-32B-Preview
- Context: 32K
- All the qualities of the best models merged into one. RP
- Context: 32K
- Settings & Review:
- MN 12B Inferor v0.0
- RP
- BF16
- TheDrummer-UnslopNemo-12B-v4.1
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
- GP
- FP8
- Qwen2.5-72B-Instruct-Turbo
- Context: 32K
- GP
- BF16
- nvidia-Llama-3.1-Nemotron-70B-Instruct-HF
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
- RP
- FP16
- Midnight-Miqu-70B-v1.5
- Context: 18K
- Preset: https://files.catbox.moe/l8e5zt.json
- RP Instruct: https://files.catbox.moe/eaj6gy.json
- RP Context: https://files.catbox.moe/mvn3jo.json
Settings provided by: ShadingCrawler
- RP.
- Storywriting.
- General purpose.
- WizardLM-2-8x22B
- Context: 16K
- Context template: Alpaca
- RP Instruct: https://files.catbox.moe/q0a07u.json
- RP Context: https://files.catbox.moe/68194o.json
Settings provided by: GERGE
- FP8 Dynamic
- RP
- Llama-3-TenyxChat-DaybreakStorywriter-70B-fp8-dynamic
- Context: 16K
- Settings:
- RP Instruct: https://files.catbox.moe/i3z4wv.json
- RP Context: https://files.catbox.moe/1k8p5b.json
Settings provided by: ShotMisser64
- GP, RP
- BF16
- Sao10K-L3.1-70B-Hanami-x1
- Context: 32K
- Mixtral-8x7B-Instruct-v0.1
- General purpose
- Context: 32K
- RP
- BF16
- TheDrummer-Rocinante-12B-v1.1
- Context: 32K
- BF16
- GP
- Qwen2-72B-Instruct
- Context: 32K
- BF16
- GP
- Mixtral-8x7B-Instruct-v0.1
- Context: 32K
- BF16
- GP
- Llama-3.2-11B-Vision-Instruct-Turbo
- Context: 128K
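Because context sizes differ from model to model, it can help to filter the list by how much text you plan to send. The snippet below is a minimal sketch, not an official tool: the `CONTEXT_K` table copies a few entries from the list above, the helper name `models_that_fit` and the 512-token reply budget are hypothetical, and "K" is assumed to mean 1024 tokens.

```python
# Context windows copied from the list above.
# Assumption: "K" means 1024 tokens; the page does not say so explicitly.
CONTEXT_K = {
    "rAIfle-SorcererLM-8x22b-bf16": 16,
    "anthracite-org-magnum-v4-72b-FP8-Dynamic": 32,
    "NousResearch-Hermes-3-Llama-3.1-70B-FP8": 64,
    "Midnight-Miqu-70B-v1.5": 18,
    "WizardLM-2-8x22B": 16,
    "Llama-3.2-11B-Vision-Instruct-Turbo": 128,
}


def models_that_fit(prompt_tokens: int, reply_tokens: int = 512) -> list[str]:
    """Return the names of models whose context holds the prompt plus the reply."""
    needed = prompt_tokens + reply_tokens
    return sorted(name for name, k in CONTEXT_K.items() if k * 1024 >= needed)


# Example: a 20,000-token chat history plus the default reply budget.
print(models_that_fit(20_000))
```

With a 20,000-token prompt, the 16K and 18K models drop out and only the 32K, 64K, and 128K entries remain. Token counts depend on each model's tokenizer, so treat the result as a rough guide.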
Guides & Settings
Models
L3-70B-Euryale-v2.1
Meet L3 70B Euryale v2.1: Your New Creative Companion What is L3 70B Euryale v2.1 [...]
Aug
Guides
Using Infermatic.ai API with SillyTavern
SillyTavern is one of the most popular interfaces to interact with LLMs. We have been [...]
Jun
Models
nvidia/Llama-3.1-Nemotron-70B-Instruct
Llama 3.1 Nemotron 70B Instruct: Follow and assert Llama 3.1 Nemotron 70B Instruct is NVIDIA’s [...]
Dec
Models
Infermatic/MN 12B Inferor v0.0
MN 12B Inferor v0.0: Dynamic and Creative MN 12B Inferor, also known as Mistral Nemo [...]
Dec
Docs
API Docs
Frequently Asked Questions from Geek to Geek
- What is prompt engineering, and why is it critical in working with LLMs?
- How can I design effective prompts for LLMs?
- What are some standard techniques used in prompt engineering?
- How does prompt length impact the output of an LLM?
- How do LLMs understand and generate human-like text?
- What is the difference between Llama, Mixtral, and Qwen?
- What are some examples of advanced use cases of prompt engineering with LLMs?
- How do I choose the best LLM model for my project?
- What are large language models, and how do they differ from traditional NLP models?
- Can LLMs write code well?