Papers
-
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
-
GraNNite: Enabling High-Performance Execution of Graph Neural Networks on Resource-Constrained Neural Processing Units
-
Reviving The Classics: Active Reward Modeling in Large Language Model Alignment
-
s1: Simple test-time scaling
Contextual AI / Allen Institute for Artificial Intelligence, Stanford University, University of Washington
-
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
-
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
-
EmbeddingGemma: Powerful and Lightweight Text Representations
-
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
-
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
-
MiniMax-01: Scaling Foundation Models with Lightning Attention
-
PoAct: Policy and Action Dual-Control Agent for Generalized Applications
-
Agent Laboratory: Using LLM Agents as Research Assistants
-
Retrieval-Augmented Generation with Graphs (GraphRAG)
-
Cosmos World Foundation Model Platform for Physical AI
-
Titans: Learning to Memorize at Test Time
-
Generative Video Propagation
-
In Case You Missed It: ARC 'Challenge' Is Not That Challenging
-
Qwen2.5 Technical Report
-
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
-
Alignment faking in large language models
-
How Often are Fingerprints Repeated in the Population? Expanding on Evidence from AI With the Birthday Paradox
University of Pennsylvania Department of Criminology and Statistics, University of Pennsylvania School of Engineering and Applied Sciences
-
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
-
VDB-GPDF: Online Gaussian Process Distance Field with VDB Structure
-
pfl-research: simulation framework for accelerating research in Private Federated Learning
-
Frontier AI systems have surpassed the self-replicating red line
Fudan University
-
InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention
-
Best-of-N Jailbreaking
Anthropic, Tangentic, Speechmatics / MATS, Stanford University, University College London, University of Oxford
-
Creating realistic 3D shapes using generative AI
Massachusetts Institute of Technology
-
Commit0: Library Generation from Scratch
-
ARTIST: Improving the Generation of Text-rich Images with Disentangled Diffusion Models and Large Language Models
-
Controlling Language and Diffusion Models by Transporting Activations
-
The Rise and Potential of Large Language Model Based Agents: A Survey
Massachusetts Institute of Technology
-
Evaluating Cultural and Social Awareness of LLM Web Agents
-
Helix Extractor 1.0 Update: High-Accuracy Information Extraction for Semi-Structured Documents
-
SF-V: Single Forward Video Generation Model
-
The Llama 3 Herd of Models
-
Improving Pinterest Search Relevance Using Large Language Models
-
NVLM: Open Frontier-Class Multimodal LLMs
-
HyQE: Ranking Contexts with Hypothetical Query Embeddings
-
RedPajama: an Open Dataset for Training Large Language Models
-
Understanding Chain-of-Thought in LLMs through Information Theory
-
Survival of the Safest: Towards Secure Prompt Optimization through Interleaved Multi-Objective Evolution
-
Nemotron-4-340B-Instruct
-
Pixtral 12B
-
Data-Driven Discovery of Conservation Laws from Trajectories via Neural Deflation
-
Chronos: Learning the Language of Time Series
Amazon / AWS AI Labs, New York University, Rutgers University, University of California, University of Freiburg
-
Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution
-
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
-
HM3: Heterogeneous Multi-Class Model Merging
-
Tarsier: Recipes for Training and Evaluating Large Video Description Models
