AceReason-Nemotron-14B

Pricing: Freemium
Added on: May 27th, 2025

AceReason-Nemotron-14B is a 14-billion-parameter language model from NVIDIA, trained with reinforcement learning (RL) to improve mathematical and coding reasoning. Starting from DeepSeek-R1-Distill-Qwen-14B, it was trained in two RL phases: first on math-only prompts, then on code-only prompts. This approach led to significant improvements on benchmarks such as AIME 2025 and LiveCodeBench v5. Training relied on a data curation pipeline that collects challenging prompts paired with verifiable answers (for math) and test cases (for code), which makes verification-based RL possible in both domains. AceReason-Nemotron-14B demonstrates that large-scale RL can substantially improve the reasoning of already strong small- and mid-sized models, achieving results that surpass those of state-of-the-art distillation-based models.
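
To make "verification-based RL" concrete, here is a minimal sketch of what rule-based reward functions for the two domains could look like. It is an illustration only, not NVIDIA's training code: the function names, the `\boxed{...}` answer convention, exact string matching, and the all-tests-must-pass binary reward are assumptions made for this example.

```python
import re
from typing import Callable, Optional, Sequence, Tuple


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, or None.

    Assumption: the model is prompted to put its final answer in \\boxed{}.
    Nested braces are not handled in this simplified sketch.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def math_reward(response: str, reference: str) -> float:
    """Hypothetical math reward: 1.0 if the boxed answer equals the reference."""
    answer = extract_boxed(response)
    return 1.0 if answer is not None and answer == reference.strip() else 0.0


def code_reward(
    solution: Callable[..., object],
    tests: Sequence[Tuple[tuple, object]],
) -> float:
    """Hypothetical code reward: 1.0 only if every test case passes.

    Each test is (args, expected). Any wrong output or exception yields 0.0.
    A real pipeline would run untrusted code in a sandbox with timeouts.
    """
    for args, expected in tests:
        try:
            if solution(*args) != expected:
                return 0.0
        except Exception:
            return 0.0
    return 1.0


if __name__ == "__main__":
    print(math_reward(r"Therefore the answer is \boxed{42}.", "42"))
    print(code_reward(lambda a, b: a + b, [((1, 2), 3), ((0, 0), 0)]))
```

Because both rewards are computed by checking against a known answer or test suite rather than by a learned reward model, a prompt is only usable for this style of training if it ships with such a verifiable target, which is why the data curation step matters.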
