Chinchilla

Category: Text Gen
Released: March 29, 2022

Overview

Chinchilla is DeepMind’s 2022 language model that showed that smaller models trained on far more tokens can outperform much larger ones. It has about 70B parameters and was trained on roughly 1.4T tokens, establishing a new compute-optimal training recipe and improving accuracy while cutting inference cost.

Description

Chinchilla is a dense decoder-only Transformer built to test compute-optimal scaling. Instead of pushing parameter count ever higher, DeepMind kept the model moderate in size and dramatically increased the training corpus, landing near an optimal ratio of roughly 20 training tokens per parameter. With about 70 billion parameters trained on roughly 1.4 trillion tokens, it outperformed much larger predecessors such as the 280B-parameter Gopher on a wide range of benchmarks, while being faster and cheaper to serve at inference. The result reshaped industry practice: for a fixed training budget, allocate compute to more data rather than more parameters, and you get better generalization, stronger few-shot performance, and more practical deployment costs. Chinchilla’s findings influenced later model families that emphasized token budgets, data quality, and extended pretraining over sheer parameter scale.
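The recipe can be stated numerically. Training one pass over the data costs roughly C ≈ 6·N·D FLOPs for N parameters and D tokens, and combining this with the Chinchilla ratio D ≈ 20·N gives N ≈ √(C / 120). The Python sketch below illustrates these two standard approximations (it is not DeepMind code) and recovers Chinchilla’s own configuration from its training budget:

```python
import math

def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Estimate a compute-optimal model size and token count.

    Uses the standard approximation C ~ 6 * N * D (training FLOPs for
    N parameters and D tokens) plus the Chinchilla ratio D ~ 20 * N.
    Substituting gives C ~ 6 * 20 * N**2, so N = sqrt(C / 120).
    """
    params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens

# Chinchilla's budget: ~6 * 70e9 params * 1.4e12 tokens ~ 5.9e23 FLOPs.
n, d = chinchilla_optimal(5.88e23)
print(f"params = {n:.2e}, tokens = {d:.2e}")  # ~7.0e10 params, ~1.4e12 tokens
```

Fed Chinchilla’s roughly 5.9e23 FLOP budget, the sketch returns about 70 billion parameters and 1.4 trillion tokens, matching the published configuration.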

About DeepMind

DeepMind is an artificial intelligence research company based in London and part of Alphabet, known for machine learning research and systems such as AlphaGo and AlphaFold.

Industry: Research Services
Company Size: 501-1000
Location: London, GB

Last updated: October 14, 2025