
Llama Guard 3

By Meta
Released: July 23, 2024

Overview

Llama Guard 3 is Meta's open-weight safety classifier (fine-tuned from Llama 3.1 8B) for moderating LLM prompts and responses. Given a conversation, it returns a short, machine-parsable verdict: "safe", or "unsafe" followed by the hazard category codes it detected. It is designed to run beside your assistant as a low-latency, context-aware guardrail.
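As an illustration of how that verdict can feed automated actions: the model emits "safe", or "unsafe" followed by a line of comma-separated category codes (e.g. "S1,S10"). A minimal parser for turning that text into a structured decision for logging might look like this sketch (the Verdict type and helper name are our own, not part of any Meta API):

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    allowed: bool
    categories: list = field(default_factory=list)

def parse_verdict(raw: str) -> Verdict:
    """Parse a Llama Guard-style verdict: 'safe', or 'unsafe' + category codes."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if not lines:
        # Fail closed: treat empty or malformed output as a block.
        return Verdict(allowed=False)
    if lines[0].lower() == "safe":
        return Verdict(allowed=True)
    # Second line, when present, carries the violated category codes.
    cats = lines[1].split(",") if len(lines) > 1 else []
    return Verdict(allowed=False, categories=[c.strip() for c in cats])
```

Failing closed on unparseable output is a deliberate choice here: a moderation layer that defaults to "allow" when confused defeats its purpose.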

Description

Llama Guard 3 is a lightweight, instruction-tuned classifier that turns a safety policy into reliable, machine-readable decisions. You pass it user input or a model's output, optionally with chat history, and it produces a concise verdict ("safe", or "unsafe" followed by the violated category codes), suitable for logging and automated actions. Unlike simple keyword filters, it reasons over paraphrase, sarcasm, and multi-turn context, helping catch disallowed requests while allowing benign ones that look similar on the surface. Its default taxonomy follows the MLCommons hazard categories (S1–S14), covering areas such as violent crimes, hate, sexual content, self-harm, and privacy violations, and because the policy is supplied as prompt text, teams can adapt the wording or category set to match internal standards. The model supports eight languages, includes a dedicated code-interpreter-abuse category for tool-calling scenarios, and can be used both pre-generation (to screen prompts) and post-generation (to review candidate answers before they reach the user). Overall, Llama Guard 3 provides a practical foundation for building auditable safety controls into real products without heavy infrastructure.
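The pre- and post-generation flow described above can be sketched as a small wrapper. In this sketch, classify stands in for a real call to Llama Guard 3 (e.g. via an inference endpoint); it is a hypothetical stub here so the control flow is runnable, and the function and field names are our own:

```python
def classify(conversation):
    # Hypothetical stub for a Llama Guard 3 call. Here it flags any message
    # mentioning "bomb" as S9 (Indiscriminate Weapons); a real deployment
    # would send the conversation to the model and parse its verdict.
    last = conversation[-1]["content"].lower()
    return ("unsafe", ["S9"]) if "bomb" in last else ("safe", [])

def guarded_reply(user_msg, generate):
    """Screen the prompt, generate an answer, then review the answer."""
    convo = [{"role": "user", "content": user_msg}]
    verdict, cats = classify(convo)  # pre-generation screen
    if verdict == "unsafe":
        return {"blocked": True, "stage": "input", "categories": cats}
    answer = generate(user_msg)
    convo.append({"role": "assistant", "content": answer})
    verdict, cats = classify(convo)  # post-generation review
    if verdict == "unsafe":
        return {"blocked": True, "stage": "output", "categories": cats}
    return {"blocked": False, "answer": answer}
```

Running both checks costs an extra classifier call per turn, but it catches the case where a benign-looking prompt elicits a disallowed answer, which an input-only filter would miss.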

About Meta

We're connecting people to what they care about, powering new, meaningful experiences, and advancing the state-of-the-art through open research and accessible tooling.

Location: California, US
Website: ai.meta.com


Last updated: September 23, 2025