Unleash your creativity
Check out our currently supported models below. We support the best, highest-rated models available, and we update our model selection frequently. Sign up to receive a monthly update about our latest models and other news, and join our Discord server to ask questions or request additional models.
Top LLMs
- RP
- BF16
- rAIfle-SorcererLM-8x22b-bf16
- Context: 16K
- Recommended Settings: https://files.catbox.moe/9tj7m0.json
- RP, Storywriting. This model has spatial awareness, memory, and detailed descriptions that keep generations entertaining, with very good creativity and strong NSFW writing.
- FP8 Dynamic
- anthracite-org-magnum-v4-72b-FP8-Dynamic
- Context: 32K
- Preset: https://files.catbox.moe/rqei05.json
- RP Instruct: https://files.catbox.moe/btnhau.json
- RP Context: https://files.catbox.moe/7kct3f.json
- Settings provided by: GERGE
- RP, Storywriting. Coherent, emotional and very creative.
- FP8 Dynamic
- L3.1-70B-Euryale-v2.2-FP8-Dynamic
- Context: 16K
- RP Instruct: https://files.catbox.moe/1c9sp0.json
- RP Context: https://files.catbox.moe/5wwpin.json
- Settings provided by: ShotMisser64
All Models
- RP. All the qualities of the best models merged into one.
- Merge of:
- Fizzarolli/MN-12b-Sunrose
- anthracite-org/magnum-v4-12b
- nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2
- nothingiisreal/MN-12B-Starcannon-v3
- Format: ChatML
- Context: 32K
- Settings:
- Temperature: 1
- Min-P: 0.65
- Top-A: 0.2
- Repetition Penalty: 1.03
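These sampler values, like the similar settings blocks listed for other models below, are applied client-side on each request. As a minimal, hedged sketch, here is how they might be passed through an OpenAI-compatible chat completions endpoint; the base URL, API key, model ID, and the extended sampler fields (min_p, top_a, repetition_penalty) are assumptions rather than confirmed API details, since support for non-standard samplers depends on the backend, and frontends like SillyTavern configure them for you via the linked presets.

```python
# Minimal sketch: applying the sampler settings listed above via an
# OpenAI-compatible chat completions request. The base URL, API key, model ID,
# and the extended sampler fields are assumptions, not confirmed API details.
import requests

API_BASE = "https://api.example-host.ai/v1"  # placeholder base URL
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "MODEL_ID_FROM_THE_LIST",  # use the exact model ID shown in the list
    "messages": [
        {"role": "user", "content": "Write the opening scene of a heist story."}
    ],
    "max_tokens": 512,
    # Sampler settings from the entry above:
    "temperature": 1.0,
    "min_p": 0.65,
    "top_a": 0.2,
    "repetition_penalty": 1.03,
}

response = requests.post(
    f"{API_BASE}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```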
- An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-72B on a mixture of synthetic and natural data.
- Format: ChatML
- Context: 32K
- Settings:
- Temperature: 1
- Min-P: 0.05
- Top-A: 0.2
- Repetition Penalty: 1.03
- An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data.
- Context: 32K
- Settings:
- Temperature: 1
- Min-P: 0.05
- Typical-P: 0.9
- Top-A: 0.2
- Repetition Penalty: 1.03
- RP
- BF16
- TheDrummer-UnslopNemo-12B-v4.1
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
- GP
- FP8
- Qwen2.5-72B-Instruct-Turbo
- Context: 32K
- GP
- BF16
- nvidia-Llama-3.1-Nemotron-70B-Instruct-HF
- Context: 32K
- Recommended Settings: https://files.catbox.moe/7e6zjo.json
- RP
- FP16
- Midnight-Miqu-70B-v1.5
- Context: 16K
- Preset: https://files.catbox.moe/l8e5zt.json
- RP Instruct: https://files.catbox.moe/eaj6gy.json
- RP Context: https://files.catbox.moe/mvn3jo.json
- Settings provided by: ShadingCrawler
- RP, Storywriting, General purpose
- WizardLM-2-8x22B
- Context: 16K
- Context template: Alpaca
- RP Instruct: https://files.catbox.moe/q0a07u.json
- RP Context: https://files.catbox.moe/68194o.json
- Settings provided by: GERGE
- FP8 Dynamic
- RP
- Llama-3-TenyxChat-DaybreakStorywriter-70B-fp8-dynamic
- Context: 16K
- Settings:
- RP Instruct: https://files.catbox.moe/i3z4wv.json
- RP Context: https://files.catbox.moe/1k8p5b.json
- Settings provided by: ShotMisser64
- GP, RP
- FP16
- Sao10K-3.1-70B-Hanami-x1
- Context: 32K
- Mixtral-8x7B-Instruct-v0.1
- General purpose
- Context: 32K
- RP
- BF16
- TheDrummer-Rocinante-12B-v1.1
- Context: 32K
- FP8 Dynamic
- RP
- llama-3-lumimaid-8b-v0.1
- Context: 8K
- BF16
- GP
- Qwen2-72B-Instruct
- Context: 32K
- BF16
- GP
- Mixtral-8x7B-Instruct-v0.1
- Context: 32K
- BF16
- GP
- Llama-3.2-11B-Vision-Instruct-Turbo
- Context: 128K
Guides
- Guide to quant FP8: a simple guide to converting an FP16 model to FP8 (see the sketch after this list).
- L3-70B-Euryale-v2.1: meet L3 70B Euryale v2.1, your new creative companion.
- Using Infermatic.ai API with SillyTavern: SillyTavern is one of the most popular interfaces for interacting with LLMs.
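As a companion to the FP8 guide above, here is a minimal, data-free FP8-Dynamic conversion sketch assuming the open-source llm-compressor library; the guide's actual tooling and steps may differ, and the model ID below is only an example.

```python
# Minimal sketch: converting an FP16 model to FP8-Dynamic with llm-compressor.
# Assumptions: llmcompressor, transformers, and accelerate are installed, and the
# model ID below is only an example, not necessarily the one used in the guide.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot  # on older versions: llmcompressor.transformers
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # example; substitute your own

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8_DYNAMIC: static per-channel weight scales plus dynamic per-token activation
# scales, so no calibration dataset is needed; lm_head stays in full precision.
recipe = QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"])

oneshot(model=model, recipe=recipe)

# Save the quantized checkpoint for serving (e.g. with vLLM).
save_dir = MODEL_ID.split("/")[-1] + "-FP8-Dynamic"
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)
```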
Docs
API Docs
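If you would rather check the current model list programmatically than on this page, the sketch below assumes an OpenAI-compatible /v1/models endpoint; the base URL is a placeholder, so see the API Docs for the real endpoint and authentication details.

```python
# Minimal sketch: listing currently available models, assuming an
# OpenAI-compatible /v1/models endpoint. The base URL is a placeholder.
import requests

API_BASE = "https://api.example-host.ai/v1"  # placeholder base URL
API_KEY = "YOUR_API_KEY"

response = requests.get(
    f"{API_BASE}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
for model in response.json()["data"]:
    print(model["id"])
```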
Frequently Asked Questions from Geek to Geek
- What is prompt engineering, and why is it critical in working with LLMs?
- How can I design effective prompts for LLMs?
- What are some standard techniques used in prompt engineering?
- How does prompt length impact the output of an LLM?
- How do LLMs understand and generate human-like text?
- What is the difference between Llama, Mixtral, and Qwen?
- What are some examples of advanced use cases of prompt engineering with LLMs?
- How do I choose the best LLM model for my project?
- What are large language models, and how do they differ from traditional NLP models?
- Can LLMs write code well?