Papers
- Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
- DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
- DeepSeek-OCR 2: Visual Causal Flow
- Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
- mHC: Manifold-Constrained Hyper-Connections
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
- Inference-Time Scaling for Generalist Reward Modeling
- X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms
- DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
- Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
