Overview
Llama Guard 3 is Meta’s open-weight safety classifier for moderating LLM prompts and responses. Given a conversation, it returns a short, machine-parseable verdict ("safe", or "unsafe" plus the violated hazard categories) and is designed to run alongside your assistant as a context-aware guardrail.
Description
Llama Guard 3 is a lightweight, instruction-tuned classifier that turns a safety taxonomy into reliable, machine-readable decisions. You pass it a user prompt or a model’s response, optionally with the surrounding chat history, and it returns a one-word verdict ("safe" or "unsafe") followed by the codes of any violated categories, a format that is easy to log and to act on automatically. Unlike simple keyword filters, it evaluates meaning in context, including paraphrase and multi-turn conversation, so it can catch disallowed requests while allowing benign ones that look similar on the surface. Out of the box it covers 14 hazard categories aligned with the MLCommons taxonomy (for example violent crimes, hate, sexual content, self-harm, and privacy violations), and because the category list is supplied in the prompt, teams can reword or restrict categories to match internal standards. The model fits cleanly into existing chat, RAG, and tool-calling pipelines and can be used both pre-generation, to screen incoming prompts, and post-generation, to review candidate answers before they reach users. Overall, Llama Guard 3 provides a practical foundation for building auditable safety controls into real products without heavy infrastructure.
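As a minimal sketch of the screening flow described above, assuming the meta-llama/Llama-Guard-3-8B checkpoint on Hugging Face and the transformers library (whose chat template builds the moderation prompt, including the category definitions), the following classifies a short conversation and parses the verdict and category codes:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: you have accepted the license for this gated checkpoint on Hugging Face.
model_id = "meta-llama/Llama-Guard-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat):
    """Return (verdict, categories) for a list of {'role', 'content'} messages."""
    # The tokenizer's chat template wraps the conversation in Llama Guard's
    # moderation prompt, including the hazard category list.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    text = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True).strip()
    lines = text.splitlines()
    verdict = lines[0].strip()                                   # "safe" or "unsafe"
    categories = lines[1].split(",") if len(lines) > 1 else []   # e.g. ["S2"]
    return verdict, categories

# Pre-generation: screen the user prompt before the assistant answers.
print(moderate([{"role": "user", "content": "How do I kill a Linux process?"}]))

# Post-generation: review a candidate answer in the context of the conversation.
print(moderate([
    {"role": "user", "content": "How do I kill a Linux process?"},
    {"role": "assistant", "content": "Use `kill <PID>` or `pkill <name>` from a shell."},
]))
```

The same call serves both directions; the only difference is whether the last message in the chat is the user prompt being screened or the assistant reply being reviewed.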
About Meta
We're connecting people to what they care about, powering new, meaningful experiences, and advancing the state of the art through open research and accessible tooling.