DeepSeek models
Browse all models from this model family.
-
By DeepSeek | DeepSeek-V4-Flash is DeepSeek’s faster, smaller, and more economical V4 model for efficient large-scale use. It has 284B total parameters with 13B active parameters, supports a 1M token context window, and is positioned as a high-speed, cost-effective model whose reasoning comes close to V4-Pro. | New | Multimodal | Released 1mo ago
-
By DeepSeek | DeepSeek-V4-Pro is DeepSeek’s new flagship open-source model for top-end reasoning, coding, world knowledge, and agentic work. It uses a MoE architecture with 1.6T total parameters and 49B active parameters, supports a 1M token context window, and is positioned as rivaling leading closed-source models while leading current open models on several major capabilities. | New | Multimodal | Released 1mo ago
-
By DeepSeek | Second-generation DeepSeek OCR model, “Visual Causal Flow,” aimed at more human-like visual encoding, with dynamic resolution support and strong document-to-Markdown and layout-aware OCR for images and PDFs. | Text | Released 4mo ago
-
By DeepSeek | DeepSeek-V3.2-Speciale is a 685B-parameter, research-only variant of DeepSeek-V3.2 that pushes open-weight reasoning ability to the limit. It disables tool calling and is intended for experimentation rather than everyday agent use. | Text | Released 6mo ago
-
Text | Released 6mo ago
-
By DeepSeek | DeepSeek-Math-V2 is a math-specialized LLM built on DeepSeek-V3.2-Exp-Base, trained to generate and verify step-by-step proofs. It uses a learned verifier as a reward model so the generator learns to fix its own reasoning, reaching gold-level scores on contests like IMO 2025, CMO 2024, and near-perfect Putnam 2024 with scaled test-time compute. | Text | Released 6mo ago
-
By DeepSeek | LLM-centric OCR model using “Contexts Optical Compression” to explore visual-text compression and provide fast streaming and batch OCR for images and PDFs via vLLM and Transformers runtimes. | Text | Released 7mo ago
-
By DeepSeek | DeepSeek V3.2 Exp is an experimental build of the DeepSeek V3 line, tuned for deeper reasoning and stronger coding while keeping latency practical. It supports long context, function/tool calling, and schema-conformant JSON output, which suits RAG, agents, and repo-scale tasks where extra accuracy matters. | Text | Released 8mo ago
-
By DeepSeek | DeepSeek-V3.1-Terminus is DeepSeek’s flagship reasoning model, tuned for difficult analysis, math, and coding. It supports very long context, function/tool calling, reliable JSON outputs, and an optional extended-thinking mode, which suits enterprise RAG, agents, and high-stakes workflows. | Text | Released 8mo ago
-
By DeepSeek | DeepSeek R1 is a reasoning-first large language model built to solve complex problems with explicit multi-step thinking. It excels at math, coding, and logical analysis, supports long context, tool/function calling, and structured JSON outputs, and can trade latency for higher accuracy via extended "thinking" budgets. | Text | Released 1y ago
-
Text | Released 1y ago
-
Text | Released 1y ago
