Multimodal

DeepSeek V4 Lite Leaks Under NDA - 1M Context, Natively Multimodal, Codenamed Sealion-Lite

DeepSeek's V4 Lite model has leaked through inference provider testing under strict NDAs, revealing a 1M token context window, native multimodal capabilities, and the internal codename sealion-lite.

Kimi K2.5

Moonshot AI's Kimi K2.5 is a 1T-parameter MoE model activating 32B per token with native multimodal vision via MoonViT-3D, Agent Swarm coordination of up to 100 sub-agents via PARL, and top-tier math and coding benchmarks under a modified MIT license.

Kimi K2.5 vs Gemma 3 27B: Trillion-Parameter Frontier vs Google's Accessible Multimodal Model

Comparing Moonshot AI's 1T-parameter Kimi K2.5 with Google DeepMind's Gemma 3 27B - two multimodal open-weight models separated by 37x in parameter count but sharing a vision-first design philosophy.

Gemini 2.5 Flash-Lite

Google's cheapest Gemini model pairs a 1M-token context window with pricing of $0.10 input / $0.40 output per million tokens, multimodal input, and 359 tokens/second throughput for high-volume production workloads.

Google Gemma 3 27B

Google Gemma 3 27B is a 27B dense multimodal model supporting text and vision with a 128K context window, 140+ languages, and single-GPU deployment - the most capable open model in its size class.

GPT-4o mini

OpenAI's budget API workhorse pairs 128K context with pricing of $0.15 input / $0.60 output per million tokens, solid coding benchmarks, and the broadest third-party ecosystem of any small model.
