Small language models

Nemotron 3 Nano 4B: NVIDIA Edge Model Runs on 8GB

NVIDIA's Nemotron 3 Nano 4B packs a Mamba-dominant hybrid architecture, a 262K-token context window, and a 95.4% score on MATH500 into a model that fits on an 8GB Jetson Orin Nano.

Microsoft's Phi-4 Vision Matches Models 10x Its Size

Microsoft releases Phi-4-reasoning-vision-15B, a 15B open-weight multimodal model trained on 240 GPUs in 4 days that competes with 100B+ parameter models on math, science, and GUI understanding.

Qwen3.5-27B vs Phi-4: When Twice the Parameters Is Not Twice as Obvious

A data-driven comparison of Alibaba's Qwen3.5-27B and Microsoft's Phi-4: a 27B hybrid architecture versus a 14B STEM specialist, testing whether raw parameter count or training efficiency wins in practice.

Cohere's Tiny Aya Fits 70 Languages Into 3.35 Billion Parameters and Runs on a Phone

Cohere Labs releases Tiny Aya, a 3.35B open-weight multilingual model that beats Gemma 3 4B in 46 of 61 languages on translation and runs at 32 tokens/sec on an iPhone.
