Overview
Gemini 2.5 Flash-Lite is our most balanced Gemini model, optimized for low-latency use cases. It comes with the same capabilities that make other Gemini 2.5 models helpful: the ability to turn thinking on at different budgets, connections to tools such as Grounding with Google Search and code execution, multimodal input, and a 1 million-token context length.
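For illustration, a minimal sketch of a low-latency call, assuming the google-genai Python SDK, an API key supplied via environment variable, and the "gemini-2.5-flash-lite" model ID; these names are assumptions, not part of this listing, so adapt them to your setup.

```python
# Minimal sketch, assuming the google-genai Python SDK is installed
# (pip install google-genai) and GEMINI_API_KEY is set in the environment.
from google import genai

client = genai.Client()  # picks up the API key from the environment

# Thinking is off by default on Flash-Lite, so a plain call takes the
# low-latency path described above.
response = client.models.generate_content(
    model="gemini-2.5-flash-lite",  # assumed model ID
    contents="Classify this support ticket: 'My invoice total looks wrong.'",
)
print(response.text)
```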
Description
Gemini 2.5 Flash-Lite is a cost- and latency-optimized version of Google's Gemini 2.5 model, designed for high-volume tasks that require rapid responses, such as translation and classification. It offers the same core capabilities as the rest of the 2.5 family, including a 1 million-token context window, multimodal input (text, images, audio, and video), and tool integrations such as Google Search grounding and code execution. Thinking is off by default for maximum speed and can be enabled, at additional cost, for more complex reasoning, and the model delivers strong benchmark performance on coding, math, science, and reasoning tasks.
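The sketch below shows how thinking can be turned back on and a tool attached for a harder request, again assuming the google-genai Python SDK; the thinking_budget value and the prompt are illustrative assumptions, and field names may differ in other client libraries.

```python
# Hedged sketch: enable thinking with a token budget and attach
# Grounding with Google Search, assuming the google-genai Python SDK.
from google import genai
from google.genai import types

client = genai.Client()

config = types.GenerateContentConfig(
    # A non-zero budget re-enables reasoning for more complex prompts
    # (1024 here is an arbitrary example value).
    thinking_config=types.ThinkingConfig(thinking_budget=1024),
    # Grounding with Google Search, one of the tools mentioned above.
    tools=[types.Tool(google_search=types.GoogleSearch())],
)

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",  # assumed model ID
    contents="Summarize today's top headline about renewable energy.",
    config=config,
)
print(response.text)
```

Keeping thinking off for bulk workloads and reserving a budget for the occasional hard prompt is the trade-off the description above points to: speed and cost by default, deeper reasoning on demand.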
About Google AI
At Google, we think that AI can meaningfully improve people's lives and that the biggest impact will come when everyone can access it.
Industry: Research
Company Size: 501-1000
Location: Mountain View, CA, US