
Qwen3.5-4B
Qwen3.5-4B is a 4B dense multimodal model that matches Qwen3-30B on MMLU-Pro and beats GPT-5-Nano on vision benchmarks. Runs on 8GB VRAM, Apache 2.0 licensed, 262K-1M context.

Qwen3.5-9B is a 9B dense model that outperforms Qwen3-30B on most benchmarks and beats GPT-5-Nano on vision tasks. Natively multimodal with 262K-1M context, Apache 2.0 licensed.

Alibaba releases official FP8-quantized weights for the Qwen3.5 flagship and the 27B dense model, cutting memory requirements roughly in half and enabling deployment on 8x H100 GPUs with native vLLM and SGLang support.
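An FP8 checkpoint with native vLLM support could, in principle, be served with vLLM's standard CLI; a minimal sketch, assuming a hypothetical Hub ID for the FP8 flagship (the exact repo name is not stated in the announcement):

```shell
# Hedged sketch: serve an FP8 checkpoint with vLLM across 8 GPUs.
# "Qwen/Qwen3.5-FP8" is a placeholder, not a confirmed model ID.
vllm serve Qwen/Qwen3.5-FP8 \
  --tensor-parallel-size 8 \
  --max-model-len 262144
```

vLLM detects FP8 quantization from the checkpoint config, so no extra quantization flag is required for pre-quantized weights.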

Comparing Kimi K2.5 and Qwen3.5 Flash - Moonshot AI's trillion-parameter frontier model against Alibaba's cheapest and fastest API offering.

Comparing Kimi K2.5's 1T-parameter benchmark dominance against Qwen3.5-122B-A10B's extraordinary parameter efficiency - and why the smaller model is harder to dismiss than the numbers suggest.

Comparing Kimi K2.5's trillion-parameter benchmark dominance against Qwen3.5-27B's single-GPU accessibility - two models from entirely different tiers that both have compelling use cases.