LiveBench
A continuously refreshed benchmark covering reasoning, coding, math, data analysis, language, and instruction following. New questions are released every month to keep the benchmark contamination-free.
All results
| # | Model | Score | Conditions | Eval date | Source | Flags |
|---|---|---|---|---|---|---|
| 1 | Qwen3 235B A22B | 77.1 | — | Apr 28, 2025 | self reported | primary |
| 2 | Qwen3 30B A3B | 74.3 | — | Apr 28, 2025 | self reported | primary |
