Overview
Chinchilla is DeepMind’s 2022 language model that showed a smaller model trained on far more tokens can beat much larger ones. With about 70 billion parameters trained on roughly 1.4 trillion tokens, it set a new compute-optimal recipe, improving benchmark accuracy while cutting inference cost.
Description
Chinchilla is a dense decoder-only Transformer built to test compute-optimal scaling. Instead of pushing parameter count ever higher, DeepMind kept the model moderate in size and dramatically increased the training corpus, landing near an optimal ratio of roughly 20 training tokens per parameter. Trained on roughly 1.4 trillion tokens with around 70 billion parameters, it outperformed larger predecessors such as Gopher (280 billion parameters) on a wide range of benchmarks, while being faster and cheaper to serve at inference. The result reshaped industry practice: for a fixed training budget, allocate more compute to data and less to parameters, scaling the two together, and you get better generalization, stronger few-shot performance, and more practical deployment costs. Chinchilla’s findings influenced later model families that emphasized token budgets, data quality, and extended pretraining over sheer parameter scale.
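To make the recipe concrete, here is a minimal Python sketch of the compute-optimal bookkeeping described above. It assumes the standard approximation that training a dense Transformer costs about 6 FLOPs per parameter per token (C ≈ 6·N·D) and uses the parametric loss fit reported in the Chinchilla paper, L(N, D) = E + A/N^α + B/D^β with E ≈ 1.69, A ≈ 406.4, B ≈ 410.7, α ≈ 0.34, β ≈ 0.28. The function names are illustrative, not from any DeepMind codebase.

```python
# Sketch of Chinchilla-style compute-optimal accounting.
# Assumptions: C ~= 6*N*D training FLOPs, ~20 tokens per parameter,
# and the parametric loss fit from Hoffmann et al. (2022).
# All names are illustrative, not from a DeepMind codebase.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

def compute_optimal_split(flop_budget: float, tokens_per_param: float = 20.0):
    """Given a FLOP budget C = 6*N*D and a target D/N ratio r,
    solve C = 6 * N * (r * N) for the optimal N, then D = r * N."""
    n_params = (flop_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Parametric loss fit L(N, D) = E + A/N**alpha + B/D**beta,
    using the constants reported in the Chinchilla paper."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

if __name__ == "__main__":
    # Chinchilla itself: ~70B parameters on ~1.4T tokens.
    N, D = 70e9, 1.4e12
    C = train_flops(N, D)  # roughly 5.9e23 FLOPs
    print(f"Budget: {C:.2e} FLOPs, {D / N:.0f} tokens/param")

    # Re-deriving the optimal split for that same budget lands
    # back near 70B parameters / 1.4T tokens, as expected.
    n_opt, d_opt = compute_optimal_split(C)
    print(f"Optimal: {n_opt:.2e} params, {d_opt:.2e} tokens")
    print(f"Predicted loss: {chinchilla_loss(n_opt, d_opt):.3f}")
```

The key design point the sketch illustrates: because C ≈ 6·N·D, a fixed token-per-parameter ratio makes both N and D scale as the square root of the compute budget, which is why doubling compute should roughly split between more parameters and more data rather than going entirely to model size.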
About DeepMind
DeepMind is an artificial intelligence research company, a subsidiary of Alphabet, specializing in machine learning research and its applications.
Industry: Research Services
Company Size: 501-1000
Location: London, GB