
Qwen3.5-27B vs Gemma 3 27B: Same Parameter Count, Completely Different Models
A data-driven comparison of Alibaba's Qwen3.5-27B and Google's Gemma 3 27B - two 27B dense models that share a parameter count and almost nothing else.

A data-driven comparison of Alibaba's Qwen3.5-27B and Mistral's Small 3.2 - two Apache 2.0 dense models in the 24-27B range with very different benchmark profiles and deployment strengths.

A data-driven comparison of Alibaba's Qwen3.5-27B and Microsoft's Phi-4 - a 27B hybrid architecture versus a 14B STEM specialist, testing whether raw parameter count or training efficiency wins in practice.

Head-to-head comparison of Qwen3.5-35B-A3B and GLM-4.7-Flash - two Chinese-origin ~30B-A3B MoE models with Apache 2.0/MIT licenses that dominate different benchmarks despite near-identical parameter budgets.

David vs Goliath: Qwen3.5-35B-A3B activates 3B parameters and beats Llama 4 Scout's 17B active on MMLU-Pro, GPQA, and coding benchmarks - but Scout's 10M context window and native multimodal support tell a different story.

A data-driven comparison of Alibaba's Qwen3.5-35B-A3B and NVIDIA's Nemotron 3 Nano 30B-A3B - two ~30B MoE models activating ~3B parameters that take fundamentally different architectural approaches to the same problem.