Reasoning

Reasoning Traps, LLM Chaos, and Steering Curves

Reasoning Traps, LLM Chaos, and Steering Curves

Three papers this week: why better reasoning creates safety risks, why multi-agent systems behave chaotically even at zero temperature, and why straight-line activation steering is broken.

NVIDIA Nemotron 3 Super 120B-A12B

NVIDIA Nemotron 3 Super 120B-A12B

NVIDIA Nemotron 3 Super is a 120B-parameter open model with 12B active at inference, combining Mamba-2, LatentMoE, and Multi-Token Prediction for agentic workloads with a 1M token context window.

Grok 4 - xAI's Flagship Reasoning Model

Grok 4 - xAI's Flagship Reasoning Model

Grok 4 is xAI's frontier reasoning model, the first to break 50% on Humanity's Last Exam, with a 256K context window, $3/M input pricing, and a Heavy multi-agent variant built on 200,000 GPUs.