Overview
Gemini 2.5 Flash-Lite is our most balanced Gemini model, optimized for low-latency use cases. It comes with the same capabilities that make other Gemini 2.5 models helpful: the ability to turn thinking on at different budgets, connections to tools such as Grounding with Google Search and code execution, multimodal input, and a 1 million-token context length.
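For illustration, a minimal sketch of a low-latency call, assuming the google-genai Python SDK, an API key supplied via environment variable, and the "gemini-2.5-flash-lite" model ID; these names are assumptions, not part of this listing, so adapt them to your setup.

```python
# Minimal sketch, assuming the google-genai Python SDK is installed
# (pip install google-genai) and GEMINI_API_KEY is set in the environment.
from google import genai

client = genai.Client()  # picks up the API key from the environment

# Thinking is off by default on Flash-Lite, so a plain call takes the
# low-latency path described above.
response = client.models.generate_content(
    model="gemini-2.5-flash-lite",  # assumed model ID
    contents="Classify this support ticket: 'My invoice total looks wrong.'",
)
print(response.text)
```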
Description
Gemini 2.5 Flash-Lite is a cost- and latency-optimized version of Google's Gemini 2.5 model, designed for high-volume tasks that require rapid responses, such as translation and classification. It offers the same core capabilities as the rest of the 2.5 family, including a 1 million-token context window, multimodal input (text, images, audio, and video), and tool integrations such as Google Search grounding and code execution. Thinking is off by default for maximum speed and can be enabled, at additional cost, for more complex reasoning, and the model delivers strong benchmark performance on coding, math, science, and reasoning tasks.
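The sketch below shows how thinking can be turned back on and a tool attached for a harder request, again assuming the google-genai Python SDK; the thinking_budget value and the prompt are illustrative assumptions, and field names may differ in other client libraries.

```python
# Hedged sketch: enable thinking with a token budget and attach
# Grounding with Google Search, assuming the google-genai Python SDK.
from google import genai
from google.genai import types

client = genai.Client()

config = types.GenerateContentConfig(
    # A non-zero budget re-enables reasoning for more complex prompts
    # (1024 here is an arbitrary example value).
    thinking_config=types.ThinkingConfig(thinking_budget=1024),
    # Grounding with Google Search, one of the tools mentioned above.
    tools=[types.Tool(google_search=types.GoogleSearch())],
)

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",  # assumed model ID
    contents="Summarize today's top headline about renewable energy.",
    config=config,
)
print(response.text)
```

Keeping thinking off for bulk workloads and reserving a budget for the occasional hard prompt is the trade-off the description above points to: speed and cost by default, deeper reasoning on demand.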
About Google AI
At Google, we think that AI can meaningfully improve people's lives and that the biggest impact will come when everyone can access it.
Industry: Research
Company Size: 501-1000
Location: Mountain View, CA, US