GPQA Diamond

Graduate-Level Google-Proof Q&A — Diamond subset

PhD-level multiple-choice questions in biology, physics, and chemistry, written by domain experts so non-experts cannot answer them even with web search. Diamond is the hardest curated subset.

Knowledge · Text · Accuracy · Max 100.0% · Released Nov 2023

Results: 82
Models scored: 79
Top: GPT 5.4 Pro (94.4%)
Median: 80.0%

Best results

Top primary scores; one row per model.

Frontier over time

Each dot is one model result; the line traces the running best score.
[Chart: best score over time. Y-axis: 0 to 100; x-axis: Jan 2024 to Jul 2026.]
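
The running best score described for this chart is a running maximum over results ordered by evaluation date. The following is a minimal sketch of that idea, not the site's actual implementation (which the page does not describe). The function name `running_best` is hypothetical, and the five data points are a small sample taken from the results table.

```python
from datetime import date
from itertools import accumulate

# A small sample of (eval date, score in %) pairs from the results table.
results = [
    (date(2024, 10, 25), 32.8),  # Llama 3.2
    (date(2024, 1, 1), 50.4),    # GPT-4 Turbo
    (date(2024, 6, 20), 59.4),   # Claude Sonnet 3.5
    (date(2024, 3, 4), 50.4),    # Claude Opus 3
    (date(2024, 10, 22), 65.0),  # Claude Haiku 3.5
]


def running_best(points):
    """Return (date, best score so far) pairs, ordered by date.

    Hypothetical helper: sorts the points chronologically, then takes a
    running maximum of the scores.
    """
    ordered = sorted(points, key=lambda p: p[0])
    best_so_far = accumulate((score for _, score in ordered), max)
    return [(d, best) for (d, _), best in zip(ordered, best_so_far)]


frontier = running_best(results)
for d, best in frontier:
    print(d.isoformat(), best)
```

A later result that scores below the current best (here, the 25 Oct 2024 entry) still appears as a dot but leaves the line flat.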

All results

Showing all configurations, including non-primary alternates.
| # | Model | Score | Conditions | Eval date | Source | Flags |
|---|---|---|---|---|---|---|
| 1 | GPT 5.4 Pro | 94.4% | CoT | 05 Mar 2026 | Self-reported | Primary |
| 2 | Gemini 3.1 Pro | 94.3% | CoT | 19 Feb 2026 | Self-reported | Primary |
| 3 | Claude Opus 4.7 | 94.2% | | 16 Apr 2026 | Self-reported | Primary |
| 4 | Gemini 3 Deep Think | 93.8% | CoT | 12 Feb 2026 | Self-reported | Primary |
| 5 | GPT 5.5 | 93.6% | CoT | 23 Apr 2026 | Self-reported | Primary |
| 6 | GPT 5.2 Pro | 93.2% | CoT | 11 Dec 2025 | Self-reported | Primary |
| 7 | GPT 5.4 | 92.8% | CoT | 05 Mar 2026 | Self-reported | Primary |
| 8 | GPT 5.3 Codex | 92.6% | | 05 Mar 2026 | Self-reported | Primary |
| 9 | GPT 5.2 Thinking | 92.4% | CoT | 11 Dec 2025 | Self-reported | Primary |
| 10 | Qwen 3.7 Max | 92.4% | 0-shot · CoT · standard | 20 May 2026 | Self-reported | |
| 11 | Gemini 3 Pro | 91.9% | CoT | 18 Nov 2025 | Self-reported | Primary |
| 12 | Claude Opus 4.6 | 91.3% | | 05 Feb 2026 | Self-reported | Primary |
| 13 | Kimi K2.6 | 90.5% | CoT | 20 Apr 2026 | Self-reported | Primary |
| 14 | Gemini 3 Flash | 90.4% | CoT | 17 Dec 2025 | Self-reported | Primary |
| 15 | Gemini 3 Flash (Thinking) | 90.4% | | 17 Dec 2025 | Self-reported | Primary |
| 16 | Claude Sonnet 4.6 | 89.9% | | 17 Feb 2026 | Self-reported | Primary |
| 17 | Muse Spark | 89.5% | | 08 Apr 2026 | Self-reported | Primary |
| 18 | Grok 4 Heavy | 88.4% | CoT | 09 Jul 2025 | Self-reported | Primary |
| 19 | GPT 5.1 | 88.1% | | 13 Nov 2025 | Self-reported | Primary |
| 20 | GPT 5.1 Thinking | 88.1% | CoT | 12 Nov 2025 | Self-reported | Primary |
| 21 | GPT 5.4 Mini | 88.0% | CoT | 17 Mar 2026 | Self-reported | Primary |
| 22 | Grok 4 | 87.5% | CoT | 09 Jul 2025 | Self-reported | Primary |
| 23 | Claude Opus 4.5 | 87.0% | | 24 Nov 2025 | Self-reported | Primary |
| 24 | Qwen 3.5 122B A10B | 86.6% | | 24 Apr 2026 | Third-party | Primary, Verified |
| 25 | Gemini 2.5 Pro (Thinking) | 86.4% | | 17 Dec 2025 | Self-reported | Primary |
| 26 | GLM-5.1 | 86.2% | CoT | 08 Apr 2026 | Self-reported | Primary |
| 27 | GLM 5 | 86.0% | CoT | 12 Feb 2026 | Self-reported | Primary |
| 28 | GPT 5 (Thinking) | 85.7% | | 07 Aug 2025 | Self-reported | Primary |
| 29 | Qwen 3.5 27B | 85.5% | | 24 Feb 2026 | Third-party | Primary, Verified |
| 30 | Grok 3 Think | 84.6% | CoT | 19 Feb 2025 | Self-reported | Primary |
| 31 | Gemma 4 | 84.3% | CoT | 03 Apr 2026 | Self-reported | Primary |
| 32 | Qwen 3.5 35B A3B | 84.2% | | 15 Feb 2025 | Third-party | Primary, Verified |
| 33 | Gemini 2.5 Pro | 84.0% | CoT | 25 Mar 2025 | Self-reported | Primary |
| 34 | Claude Sonnet 4.5 | 83.4% | CoT | 29 Sep 2025 | Self-reported | Primary |
| 35 | o3 | 83.3% | | 16 Apr 2025 | Self-reported | Primary |
| 36 | GPT 5.4 Nano | 82.8% | CoT | 17 Mar 2026 | Self-reported | Primary |
| 37 | Gemini 2.5 Flash (Thinking) | 82.8% | | 17 Dec 2025 | Self-reported | Primary |
| 38 | Deepseek 3.2 | 82.4% | | 01 Dec 2025 | Paper | Primary |
| 39 | GLM 4.6 | 81.0% | CoT | 30 Sep 2025 | Self-reported | Primary |
| 40 | Opus 4.1 Thinking | 80.9% | CoT | 05 Aug 2025 | Self-reported | Primary |
| 41 | DeepSeek V3.1 Terminus | 80.7% | | 22 Sep 2025 | Self-reported | Primary |
| 42 | GPT OSS 120B | 80.1% | CoT | 05 Aug 2025 | Self-reported | Primary |
| 43 | DeepSeek V3.2 Exp | 79.9% | CoT | 29 Sep 2025 | Self-reported | Primary |
| 44 | Claude Sonnet 3.7 (Thinking) | 78.2% | | 24 Feb 2025 | Self-reported | Primary |
| 45 | o1 | 78.0% | | 16 Apr 2025 | Self-reported | Primary |
| 46 | GPT 5 | 77.8% | | 07 Aug 2025 | Self-reported | Primary |
| 47 | Llama 3.1 Nemotron Ultra | 76.0% | | 08 Apr 2025 | Self-reported | Primary |
| 48 | Claude Sonnet 4 | 75.4% | | 22 May 2025 | Self-reported | Primary |
| 49 | Grok 3 | 75.4% | | 19 Feb 2025 | Self-reported | Primary |
| 50 | Grok 3 | 75.4% | | 19 Feb 2025 | Self-reported | Primary |
| 51 | Kimi K2 Instruct | 75.1% | | 02 Jul 2025 | Paper | Primary |
| 52 | Nemotron 3 Nano | 75.0% | | 15 Dec 2025 | Self-reported | Primary |
| 53 | Llama 4 Behemoth | 73.7% | | 05 Apr 2025 | Self-reported | Primary |
| 54 | Claude Haiku 4.5 | 73.0% | | 15 Oct 2025 | Self-reported | Primary |
| 55 | Claude Haiku 4.5 | 73.0% | | 15 Oct 2025 | Self-reported | Primary |
| 56 | Gemma 3 | 72.6% | | 20 May 2025 | Self-reported | Primary |
| 57 | DeepSeek-R1 | 71.5% | CoT | 21 Jan 2025 | Paper | Primary |
| 58 | R1 1776 | 71.5% | | 18 Feb 2025 | Self-reported | Primary |
| 59 | Magistral Medium | 70.8% | CoT | 10 Jun 2025 | Self-reported | Primary |
| 60 | Llama 4 Maverick | 69.8% | | 05 Apr 2025 | Self-reported | Primary |
| 61 | Phi 4 reasoning plus | 69.3% | | 08 Jul 2026 | Self-reported | Primary |
| 62 | GPT 4.1 | 66.3% | | 14 Apr 2025 | Self-reported | Primary |
| 63 | Grok 3 mini | 66.2% | | 19 Feb 2025 | Self-reported | Primary |
| 64 | Qwen3-30B-A3B | 65.8% | CoT | 28 Apr 2025 | Self-reported | Primary |
| 65 | Qwen3 30B A3B | 65.8% | | 28 Apr 2025 | Self-reported | Primary |
| 66 | Claude Haiku 3.5 | 65.0% | 0-shot · CoT | 22 Oct 2024 | Self-reported | Primary |
| 67 | Seed 1.5 | 65.0% | 0-shot · CoT | 22 Jan 2025 | Self-reported | Primary |
| 68 | Gemini 2.5 Flash-Lite | 64.6% | | 26 Sep 2025 | Self-reported | Primary |
| 69 | Claude Sonnet 3.7 | 62.3% | | 24 Feb 2025 | Self-reported | Primary |
| 70 | Gemini 2.0 Flash | 60.1% | 0-shot · CoT · standard | 05 Feb 2025 | Self-reported | |
| 71 | Nemotron 3 Super | 60.0% | 5-shot · CoT | 03 Apr 2026 | Self-reported | Primary |
| 72 | Claude Sonnet 3.5 | 59.4% | 0-shot · CoT · standard | 20 Jun 2024 | Self-reported | |
| 73 | DeepSeek V3 | 59.1% | | 26 Dec 2024 | Paper | Primary |
| 74 | Llama 4 Scout | 57.2% | | 05 Apr 2025 | Self-reported | Primary |
| 75 | GPT-4o | 53.6% | | 16 Apr 2025 | Self-reported | Primary |
| 76 | Command A | 50.8% | | 07 Apr 2025 | Paper | Primary |
| 77 | Command A | 50.8% | | 07 Apr 2025 | Self-reported | Primary |
| 78 | Llama 3.3 | 50.5% | 0-shot · CoT | 06 Dec 2025 | Self-reported | Primary |
| 79 | GPT-4 Turbo | 50.4% | | 01 Jan 2024 | Paper | Primary |
| 80 | Claude Opus 3 | 50.4% | | 04 Mar 2024 | Self-reported | Primary |
| 81 | Nova Pro | 46.9% | 0-shot · CoT | 03 Dec 2024 | Self-reported | Primary |
| 82 | Mistral Large 3 | 43.9% | 5-shot | 02 Dec 2025 | Self-reported | Primary |
| 83 | Nova Lite | 42.0% | 0-shot · CoT | 03 Dec 2024 | Self-reported | Primary |
| 84 | Nova Micro | 40.0% | 0-shot · CoT | 03 Dec 2024 | Self-reported | Primary |
| 85 | Claude Haiku 3 | 33.3% | 0-shot · CoT · standard | 04 Mar 2024 | Self-reported | |
| 86 | Llama 3.2 | 32.8% | 0-shot | 25 Oct 2024 | Self-reported | Primary |