AI Speed and Latency Leaderboard: Tokens/s Rankings
Rankings of the fastest AI models and inference providers by tokens per second, time to first token, and end-to-end latency.
Mercury 2 Review: 1,000 Tokens per Second, Tested
Mercury 2 by Inception Labs is the fastest reasoning LLM available, built on a diffusion architecture. We tested its speed, quality, and real-world trade-offs.