MMMU-Pro
A harder variant of MMMU: it filters out questions that text-only models can solve, augments the candidate answer options, and adds a vision-only setting in which the question itself is rendered into the image.
All results
| # | Model | Score | Conditions | Eval date | Source | Flags |
|---|---|---|---|---|---|---|
| 1 | GPT 5.4 | 81.2% | — | 05 Mar 2026 | Self-reported | Primary |
| 2 | Gemini 3 Flash (Thinking) | 81.2% | — | 17 Dec 2025 | Self-reported | Primary |
| 3 | Gemini 3 Pro | 81.0% | CoT | 18 Nov 2025 | Self-reported | Primary |
| 4 | Gemini 3.1 Pro | 80.5% | CoT | 19 Feb 2026 | Self-reported | Primary |
| 5 | Kimi K2.6 | 79.4% | — | 20 Apr 2026 | Self-reported | Primary |
| 6 | GPT 5 (Thinking) | 78.4% | — | 07 Aug 2025 | Self-reported | Primary |
| 7 | Gemma 4 | 76.9% | — | 03 Apr 2026 | Self-reported | Primary |
| 8 | GPT 5.5 Instant | 76.0% | 0-shot | 05 May 2026 | Self-reported | Primary |
| 9 | Qwen 3.5 35B A3B | 75.1% | — | 15 Feb 2025 | Third-party | Primary, Verified |
| 10 | Claude Sonnet 4.6 | 74.5% | — | 17 Feb 2026 | Self-reported | Primary |
| 11 | Gemini 2.5 Pro (Thinking) | 68.0% | — | 17 Dec 2025 | Self-reported | Primary |
| 12 | Gemini 2.5 Flash (Thinking) | 66.7% | — | 17 Dec 2025 | Self-reported | Primary |
| 13 | GPT 5 | 62.7% | — | 07 Aug 2025 | Self-reported | Primary |
| 14 | Seed 1.5 | 59.3% | — | 22 Jan 2025 | Self-reported | Primary |
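
The table is sorted by score, so the order in which the best score improved over time is not obvious at a glance. The sketch below is one way to derive that running best from the rows above. It assumes the eval dates are correct as listed, and it counts only a strictly higher score as a new best, so a later tie does not replace the earlier holder. The names `ROWS`, `parse_date`, and `frontier` are illustrative and not part of any leaderboard tooling.

```python
from datetime import date

# (model, score in percent, eval date) copied from the table above.
ROWS = [
    ("GPT 5.4", 81.2, "05 Mar 2026"),
    ("Gemini 3 Flash (Thinking)", 81.2, "17 Dec 2025"),
    ("Gemini 3 Pro", 81.0, "18 Nov 2025"),
    ("Gemini 3.1 Pro", 80.5, "19 Feb 2026"),
    ("Kimi K2.6", 79.4, "20 Apr 2026"),
    ("GPT 5 (Thinking)", 78.4, "07 Aug 2025"),
    ("Gemma 4", 76.9, "03 Apr 2026"),
    ("GPT 5.5 Instant", 76.0, "05 May 2026"),
    ("Qwen 3.5 35B A3B", 75.1, "15 Feb 2025"),
    ("Claude Sonnet 4.6", 74.5, "17 Feb 2026"),
    ("Gemini 2.5 Pro (Thinking)", 68.0, "17 Dec 2025"),
    ("Gemini 2.5 Flash (Thinking)", 66.7, "17 Dec 2025"),
    ("GPT 5", 62.7, "07 Aug 2025"),
    ("Seed 1.5", 59.3, "22 Jan 2025"),
]

# Explicit month map so parsing does not depend on the system locale.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_date(text):
    """Parse a 'DD Mon YYYY' string such as '05 Mar 2026'."""
    day, month, year = text.split()
    return date(int(year), MONTHS[month], int(day))


def frontier(rows):
    """Return (eval date, model, score) for each row that set a new best."""
    # Sort by date; within one date, put the highest score first.
    ordered = sorted(rows, key=lambda r: (parse_date(r[2]), -r[1]))
    best = float("-inf")
    result = []
    for model, score, eval_date in ordered:
        if score > best:  # strict: a tie does not count as a new best
            best = score
            result.append((eval_date, model, score))
    return result


if __name__ == "__main__":
    for eval_date, model, score in frontier(ROWS):
        print(f"{eval_date}  {model}  {score:.1f}%")
```

Switching the comparison to `score >= best` would instead credit the most recent model to reach the top score, which changes who holds the 81.2% entry.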
