Overview
Description
A large multimodal model that serves as a teacher for distilling Llama 4 Scout and Llama 4 Maverick. Mixture-of-experts (MoE) architecture with 16 experts, 288B active parameters, and roughly 2T total parameters. Outperforms GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on STEM benchmarks. Still in training as of April 2025.
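The gap between active and total parameters comes from expert routing: each token is processed by only a small subset of the experts, so only a fraction of the weights participate in any single forward pass. Below is a minimal sketch of a generic top-1 MoE layer in PyTorch to illustrate the idea; the class name, dimensions, and top-1 routing are illustrative assumptions, not this model's actual design.

```python
# Illustrative only: a generic top-1 mixture-of-experts (MoE) layer showing
# why "active" parameters (one routed expert per token) are far fewer than
# "total" parameters (all experts). Sizes and routing are placeholders.
import torch
import torch.nn as nn


class SimpleMoE(nn.Module):
    def __init__(self, d_model: int = 512, d_ff: int = 2048, num_experts: int = 16):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # scores each token against each expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token is routed to its single best expert.
        logits = self.router(x)                    # (tokens, num_experts)
        probs = logits.softmax(dim=-1)
        weights, idx = probs.max(dim=-1)           # top-1 gate value and expert index
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                # Only the tokens routed here pass through this expert's weights.
                out[mask] = weights[mask].unsqueeze(-1) * expert(x[mask])
        return out


tokens = torch.randn(8, 512)
layer = SimpleMoE()
print(layer(tokens).shape)  # torch.Size([8, 512]); each token used 1 of 16 experts
```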
About Meta
We're connecting people to what they care about, powering new, meaningful experiences, and advancing the state of the art through open research and accessible tooling.
Location:
California, US
Website:
ai.meta.com
Related Models
Llama 4 Scout
Llama 4 Maverick
Last updated: April 15, 2025