AI Speed and Latency Leaderboard: Tokens/s Rankings
Rankings of the fastest AI models and inference providers by tokens per second, time to first token, and end-to-end latency.
Mercury 2 Review: 1,000 Tokens per Second, Tested
Mercury 2 by Inception Labs is the fastest reasoning LLM available, built on a diffusion architecture. We tested its speed, quality, and real-world trade-offs.