Llama 3.1 Nemotron 70B Instruct: Follow and assert
Review
Nemotron is the smartest model in the room. If you want a model to follow your system prompt exactly as intended, this is the best option available right now. It also has excellent general-knowledge capabilities. Combined with the 32K-token context offered on the platforms below, it delivers a really good experience.
- Use Cases:
- General Knowledge: Answers questions accurately across diverse topics.
- Roleplay (RP): Creative and flexible for interactive storytelling.
- Content Creation: Assists in storywriting, idea generation, and more.
- Code Generation: Helps with coding tasks and debugging.
- Question Answering: Offers precise and helpful responses.
- Limitations:
- Positivity Bias: May exhibit overly positive responses if not configured correctly.
- Stability: Can become unstable depending on system prompt settings.
- Specialized Domains: Not optimized for advanced mathematics or niche fields.
- Strengths:
- Dependable for following prompts and generating detailed outputs.
- Highly versatile across various use cases.
Want to Try It?
Experience Llama 3.1 Nemotron 70B Instruct on the following platforms, both offering a 32K context window:
Recommended Settings for Llama 3.1 Nemotron 70B Instruct
For optimal performance, here are the recommended settings:
| Setting | Value |
|---|---|
| Format | ChatML |
| Tokenizer | Llama 3 |
| Temperature | 0.85 |
| Top K | -1 |
| Top P | 0.95 |
| Typical P | 1 |
| Min P | 0.02 |
| Top A | 0 |
| Repetition Penalty | 1 |
| Frequency Penalty | 0.5 |
| Presence Penalty | 0.3 |
| Response Tokens | 600 |
Pro Tips: To make the model more deterministic, decrease the temperature. To avoid incomplete sentences, enable the ‘Trim incomplete sentences’ option (if using Silly Tavern).
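If you are calling the model through an OpenAI-compatible API rather than Silly Tavern, the table above maps directly onto request parameters. Here is a minimal sketch, assuming a placeholder local endpoint and model id, and that your backend accepts extension fields such as `min_p` (support and naming vary by server):

```python
# Minimal sketch: applying the recommended sampler settings through an
# OpenAI-compatible chat endpoint. The URL, API key, and model name are
# placeholders; `min_p` is a backend-specific extension that not every
# server accepts. Repetition Penalty of 1 means "off", so it is omitted.
import requests

payload = {
    "model": "Llama-3.1-Nemotron-70B-Instruct",  # placeholder model id
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain RLHF in two sentences."},
    ],
    "temperature": 0.85,       # lower this for more deterministic output
    "top_p": 0.95,
    "min_p": 0.02,             # extension field, if your backend supports it
    "frequency_penalty": 0.5,
    "presence_penalty": 0.3,
    "max_tokens": 600,         # "Response Tokens" in the table above
}

resp = requests.post(
    "http://localhost:5000/v1/chat/completions",  # placeholder endpoint
    headers={"Authorization": "Bearer sk-placeholder"},
    json=payload,
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```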
Are you using Silly Tavern?
Import the master settings from here: story formatting GERGE, Deterministic and uncreative GERGE
Additional Information
Performance Benchmarks
- Arena Hard: Score of 85.0, ranked #1 as of Oct 2024.
- AlpacaEval 2 LC: Score of 57.6, ranked #1 (verified tab).
- MT-Bench (GPT-4-Turbo): Score of 8.98, ranked #1 as of Oct 2024.
Chatbot Arena Leaderboard Rankings (Oct 2024)
- Elo Score: 1267 (±7).
- Overall Rank: 9.
- Style-Controlled Rank: 26.
Design Highlights
- Training Methodology: Built using RLHF with the REINFORCE algorithm (a minimal sketch of the update follows this list).
- Initial Policy: Derived from Llama-3.1-70B-Instruct.
- Training Framework: NVIDIA NeMo Aligner.
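For readers unfamiliar with REINFORCE: it is a policy-gradient method that scales the log-probability of sampled responses by a reward signal (here, scores from the reward model). Below is a minimal, self-contained illustration of the core update in PyTorch; the function name, tensors, and baseline are illustrative stand-ins, not NeMo Aligner's actual implementation:

```python
# Minimal sketch of a REINFORCE-style policy-gradient update as used in
# RLHF. Illustration only: the rewards and log-probs here are stand-ins
# for outputs of a reward model and a language-model policy.
import torch

def reinforce_loss(logprobs: torch.Tensor,  # (batch,) sum of log p(token) per sampled response
                   rewards: torch.Tensor,   # (batch,) scalar reward per response
                   baseline: float = 0.0) -> torch.Tensor:
    # Subtracting a baseline reduces gradient variance without changing
    # the expected gradient.
    advantage = rewards - baseline
    # REINFORCE maximizes E[advantage * log-prob]; negate so a standard
    # optimizer can minimize.
    return -(advantage.detach() * logprobs).mean()

# Toy usage: pretend these log-probs came from the policy with grad enabled.
logprobs = torch.tensor([-12.3, -8.7, -15.1], requires_grad=True)
rewards = torch.tensor([0.9, 0.2, 0.6])
loss = reinforce_loss(logprobs, rewards, baseline=rewards.mean().item())
loss.backward()
print(loss.item(), logprobs.grad)
```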
Conversion and Compatibility
- Model Format: Converted to HuggingFace Transformers as Llama-3.1-Nemotron-70B-Instruct-HF.
- Software Support: Compatible with Transformers v4.44.0 and torch v2.4.0.
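Because the HF conversion works with standard Transformers, loading it follows the usual pattern. A minimal sketch, assuming enough GPU memory for the 70B weights (the sampling values echo the recommended settings above; `device_map="auto"` shards the model across available devices):

```python
# Minimal sketch: loading the HF-converted checkpoint with Transformers
# (v4.44.0 and torch v2.4.0 per the compatibility note above). A 70B model
# needs multiple high-memory GPUs or quantization.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling values mirror the recommended settings table above.
outputs = model.generate(
    inputs, max_new_tokens=600, do_sample=True, temperature=0.85, top_p=0.95
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```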
Training and Evaluation
Alignment Methodology
- Trained with:
- HelpSteer2-Preference prompts.
- Llama-3.1-Nemotron-70B-Reward.
- Dataset Size:
- 21,362 prompt-response pairs to improve alignment with human preferences.
- Split into 20,324 training pairs and 1,038 validation pairs.
Datasets
- Data Sources: Combines human-labeled and synthetic data for hybrid training.
- Focus Areas:
- Helpfulness.
- Factual correctness.
- Coherence.
- Customization for complexity and verbosity.
Source: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF on Hugging Face
Looking for Similar Models?
Explore alternatives:
- TheDrummer/Nautilus-70B-v0.1 (a finetune of NVIDIA's Llama 3.1 Nemotron 70B)
- Review of Sao10K/L3 70B Euryale v2.1: Click here
- Review of Infermatic/MN 12B Inferor v0.0: Click here
Want to Know More?
Have questions or want to explore settings, examples, or community experiences? Join the discussion on our Discord server! Click the button below to connect: