Leaderboards

Overall LLM Rankings: February 2026

Comprehensive ranking of the top large language models in February 2026, combining multiple benchmarks including reasoning, coding, knowledge, and multimodal capabilities.

Chatbot Arena Elo Rankings: Who Wins the Human Vote?

Explore the latest Chatbot Arena Elo rankings from LM Arena, where over 6 million human votes determine which AI models people actually prefer in blind comparisons.

Coding Benchmarks Leaderboard: SWE-Bench, Terminal-Bench, and LiveCodeBench

Rankings of the best AI models for coding tasks across SWE-Bench, Terminal-Bench, and LiveCodeBench benchmarks, measuring real-world software engineering and algorithmic problem-solving ability.

Reasoning Benchmarks: GPQA, AIME, and Humanity's Last Exam

Rankings of AI models on the hardest reasoning benchmarks available: GPQA Diamond, AIME competition math, and the notoriously difficult Humanity's Last Exam.

Multimodal AI Leaderboard: Vision, Video, and Beyond

Rankings of the best multimodal AI models for image understanding, video analysis, and visual reasoning, covering MMMU-Pro, Video-MMMU, and more.

Open-Source LLM Leaderboard: February 2026

Rankings of the best open-weight and open-source large language models in February 2026, including DeepSeek V3.2, Qwen 3.5, Llama 4 Maverick, GLM-5, and Mistral 3.

← Previous