Multimodal

Kimi K2.5

Kimi K2.5

Moonshot AI's Kimi K2.5 is a 1T-parameter MoE model activating 32B per token with native multimodal vision via MoonViT-3D, Agent Swarm coordination of up to 100 sub-agents via PARL, and top-tier math and coding benchmarks under a modified MIT license.

Gemini 2.5 Flash-Lite

Gemini 2.5 Flash-Lite

Google's cheapest Gemini model pairs a 1M-token context window with $0.10/$0.40 per million token pricing, multimodal input, and 359 tokens/second throughput for high-volume production workloads.

Google Gemma 3 27B

Google Gemma 3 27B

Google Gemma 3 27B is a 27B dense multimodal model supporting text and vision with a 128K context window, 140+ languages, and single-GPU deployment - the most capable open model at its size class.

GPT-4o mini

GPT-4o mini

OpenAI's budget API workhorse pairs 128K context with $0.15/$0.60 per million token pricing, solid coding benchmarks, and the broadest third-party ecosystem of any small model.