Qwen 3.5

Qwen3.5-27B Distilled vs Base: What You Gain

A comparison of the Claude Opus reasoning-distilled Qwen3.5-27B against the base model: what chain-of-thought distillation adds, and what it costs in context, multimodal support, and reliability.

Qwen3.5-0.8B

Qwen3.5-0.8B is the smallest natively multimodal model in the Qwen 3.5 family: 0.8B parameters handling text, images, and video with a 262K context window. Benchmark scores: MathVista 62.2, OCRBench 74.5. Apache 2.0 licensed.

Qwen3.5-2B

Qwen3.5-2B is a 2B dense multimodal model with a 262K context window, a thinking mode, and native vision, including video understanding. Benchmark scores: OCRBench 84.5, VideoMME 75.6. Apache 2.0 licensed.