Benchmarks

MMLU-Pro Leaderboard: Graduate-Level Knowledge Rankings

Complete MMLU-Pro benchmark rankings measuring graduate-level knowledge across 14 subjects with 12,000 questions and 10 answer options per question.

AI Image Generation Leaderboard: Best Models for Visual Content

Rankings of the best AI image generation models including GPT Image 1.5, Gemini 3 Pro, Midjourney v7, FLUX 2 Max, Stable Diffusion 3.5, and Ideogram 2.0 across text rendering, photorealism, and artistic quality.

DeepSeek V3.2 Goes Open Source Under MIT License, Matches GPT-5 Performance

DeepSeek releases V3.2 under the MIT license with a 671B-parameter MoE architecture, matching GPT-5 at one-tenth the cost and achieving gold-medal performance on IMO and IOI competitions.

The Gap Between Open-Source and Proprietary AI Has Effectively Vanished

Analysis of how the MMLU benchmark gap between open-source and proprietary AI narrowed from 17.5 to 0.3 percentage points in a single year, reshaping the industry landscape.
