Papers
-
EmbeddingGemma: Powerful and Lightweight Text Representations
-
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
-
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
-
MiniMax-01: Scaling Foundation Models with Lightning Attention
-
PoAct: Policy and Action Dual-Control Agent for Generalized Applications
-
Agent Laboratory: Using LLM Agents as Research Assistants
-
Retrieval-Augmented Generation with Graphs (GraphRAG)
-
Cosmos World Foundation Model Platform for Physical AI
-
Titans: Learning to Memorize at Test Time
-
Generative Video Propagation
-
In Case You Missed It: ARC 'Challenge' Is Not That Challenging
-
Qwen2.5 Technical Report
-
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
-
Alignment faking in large language models
-
How Often are Fingerprints Repeated in the Population? Expanding on Evidence from AI With the Birthday Paradox (University of Pennsylvania Department of Criminology and Statistics, University of Pennsylvania School of Engineering and Applied Sciences)
-
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
-
VDB-GPDF: Online Gaussian Process Distance Field with VDB Structure
-
pfl-research: simulation framework for accelerating research in Private Federated Learning
-
Frontier AI systems have surpassed the self-replicating red line (Fudan University)
-
InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention
-
Best-of-N Jailbreaking
-
Creating realistic 3D shapes using generative AI (Massachusetts Institute of Technology)
-
Commit0: Library Generation from Scratch
-
ARTIST: Improving the Generation of Text-rich Images with Disentangled Diffusion Models and Large Language Models
-
Controlling Language and Diffusion Models by Transporting Activations
-
The Rise and Potential of Large Language Model Based Agents: A Survey (MIT)
-
Evaluating Cultural and Social Awareness of LLM Web Agents
-
SF-V: Single Forward Video Generation Model
-
The Llama 3 Herd of Models
-
Improving Pinterest Search Relevance Using Large Language Models
-
NVLM: Open Frontier-Class Multimodal LLMs
-
HyQE: Ranking Contexts with Hypothetical Query Embeddings
-
RedPajama: an Open Dataset for Training Large Language Models
-
Understanding Chain-of-Thought in LLMs through Information Theory
-
Survival of the Safest: Towards Secure Prompt Optimization through Interleaved Multi-Objective Evolution
-
Nemotron-4-340B-Instruct
-
Pixtral 12B
-
Data-Driven Discovery of Conservation Laws from Trajectories via Neural Deflation
-
Chronos: Learning the Language of Time Series
-
Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution
-
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
-
HM3: Heterogeneous Multi-Class Model Merging
-
Tarsier: Recipes for Training and Evaluating Large Video Description Models
-
OpenVLA: An Open-Source Vision-Language-Action Model
-
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
-
Learning-based Multi-View Stereo: A Survey
-
Arctic-TILT. Business Document Understanding at Sub-Billion Scale
-
General-Purpose User Modeling with Behavioral Logs
-
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
-
Qwen2-Audio Technical Report
