MathVista

MathVista (testmini)

Mathematical reasoning over visual contexts: figures, charts, diagrams, geometric drawings.

Multimodal · Accuracy · Max 100.0% · Released Oct 2023
Results: 8 · Models scored: 8 · Top: 86.8% (o3) · Median: 71.9%

Best results

Top primary scores; one row per model.
Rank  Score
1     86.8%
2     84.3%
4     72.0%
5     71.8%
7     63.8%

Frontier over time

Each dot is one model result; the line traces the running best score.
[Figure: Best score over time. Vertical axis: score, 0 to 100; horizontal axis: eval date, Oct 2024 to Apr 2025.]
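The running-best line described above can be reproduced with a short sketch. It assumes the chart plots only the rows flagged Primary in the results table (the page does not say so explicitly, but the axis range of Oct 2024 to Apr 2025 is consistent with that); the names `results` and `running_best` are illustrative, not part of the site.

```python
from datetime import date
from itertools import groupby

# Primary rows copied from the "All results" table: (model, score %, eval date).
# Assumption: the chart uses only these rows.
results = [
    ("o3", 86.8, date(2025, 4, 16)),
    ("o4 mini", 84.3, date(2025, 4, 16)),
    ("Llama 4 Maverick", 73.7, date(2025, 4, 5)),
    ("GPT 4.1", 72.0, date(2025, 4, 14)),
    ("o1", 71.8, date(2025, 4, 16)),
    ("Llama 4 Scout", 70.7, date(2025, 4, 5)),
    ("GPT-4o", 63.8, date(2025, 4, 16)),
    ("Pixtral 12B", 58.3, date(2024, 10, 10)),
]


def running_best(rows):
    """Return one (eval_date, best_score_so_far) pair per distinct date."""
    frontier = []
    best = float("-inf")
    ordered = sorted(rows, key=lambda r: r[2])  # chronological order
    for day, group in groupby(ordered, key=lambda r: r[2]):
        # Raise the frontier only if a result on this date beats the best so far.
        best = max(best, max(score for _, score, _ in group))
        frontier.append((day, best))
    return frontier


frontier = running_best(results)
for day, best in frontier:
    print(day.isoformat(), f"{best:.1f}%")
```

Grouping by date first means several results released on the same day produce a single frontier point rather than an order-dependent zigzag.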

All results

Showing all configurations, including non-primary alternates.
| #  | Model             | Score | Conditions        | Eval date   | Source        | Flags   |
|----|-------------------|-------|-------------------|-------------|---------------|---------|
| 1  | o3                | 86.8% |                   | 16 Apr 2025 | Self-reported | Primary |
| 2  | o4 mini           | 84.3% |                   | 16 Apr 2025 | Self-reported | Primary |
| 3  | Llama 4 Maverick  | 73.7% |                   | 05 Apr 2025 | Self-reported | Primary |
| 4  | GPT 4.1           | 72.0% |                   | 14 Apr 2025 | Self-reported | Primary |
| 5  | o1                | 71.8% |                   | 16 Apr 2025 | Self-reported | Primary |
| 6  | Llama 4 Scout     | 70.7% |                   | 05 Apr 2025 | Self-reported | Primary |
| 7  | Claude Sonnet 3.5 | 67.7% | 0-shot · standard | 20 Jun 2024 | Self-reported |         |
| 8  | Gemini 1.5 Pro    | 63.9% | 0-shot · standard | 01 May 2024 | Self-reported |         |
| 9  | GPT-4o            | 63.8% |                   | 16 Apr 2025 | Self-reported | Primary |
| 10 | Pixtral 12B       | 58.3% | CoT               | 10 Oct 2024 | Self-reported | Primary |
| 11 | Gemini Ultra      | 53.0% | 0-shot · standard | 06 Dec 2023 | Self-reported |         |
| 12 | Claude Haiku 3    | 46.4% | 0-shot · standard | 04 Mar 2024 | Self-reported |         |
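The summary figures at the top of the page (8 models scored, top 86.8%, median 71.9%) appear to be computed over the rows flagged Primary rather than all 12 configurations. The page does not document this, so the sketch below treats it as an assumption and simply checks that the numbers agree; `primary_scores` is an illustrative name.

```python
from statistics import median

# Scores (%) of the rows flagged "Primary" in the table above.
# Assumption: the page's summary statistics use only these rows.
primary_scores = {
    "o3": 86.8,
    "o4 mini": 84.3,
    "Llama 4 Maverick": 73.7,
    "GPT 4.1": 72.0,
    "o1": 71.8,
    "Llama 4 Scout": 70.7,
    "GPT-4o": 63.8,
    "Pixtral 12B": 58.3,
}

top_model = max(primary_scores, key=primary_scores.get)
top_score = primary_scores[top_model]

# With an even number of scores, the median is the mean of the two middle
# values; rounding to one decimal matches the precision shown on the page.
median_score = round(median(primary_scores.values()), 1)

print(f"Models scored: {len(primary_scores)}")
print(f"Top: {top_score}% ({top_model})")
print(f"Median: {median_score}%")
```

With eight primary scores, the two middle values are 72.0 (GPT 4.1) and 71.8 (o1), whose mean is 71.9, matching the median shown in the page summary.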