Papers

Lessons from Defending Gemini Against Indirect Prompt Injections

Google

Published on: 2025-05-20 1 author
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning

Moonshot AI

Published on: 2025-05-19 1 author
Progressive Autoregressive Video Diffusion Models

Adobe / Stony Brook University

Published on: 2025-05-18 1 author
FastVLM: Efficient Vision Encoding for Vision Language Models

Apple

Published on: 2025-05-15 1 author
VGGT: Visual Geometry Grounded Transformer

Meta Platforms / University of Oxford

Published on: 2025-05-14 6 authors
Qwen3 Technical Report

Alibaba

Published on: 2025-05-14 1 author
The Leaderboard Illusion

Cohere / Princeton University

Published on: 2025-05-12 1 author
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

MiniMax

Published on: 2025-05-12 1 author
LLMs Get Lost In Multi-Turn Conversation

Microsoft

Published on: 2025-05-09 1 author
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well

Salesforce / City University of Hong Kong

Published on: 2025-05-04 1 author
Command A: An Enterprise-Ready Large Language Model

Cohere

Published on: 2025-05-01 1 author
InteractRank: Personalized Web-Scale Search Pre-Ranking with Cross Interaction Features

Pinterest

Published on: 2025-05-01 1 author
Investigating the Overlooked Hessian Structure: From CNNs to LLMs

ByteDance

Published on: 2025-05-01 1 author
The Leaderboard Illusion

Cohere / Allen Institute for Artificial Intelligence, Massachusetts Institute of Technology, Princeton University, Stanford University, University of Washington, University of Waterloo

Published on: 2025-04-29 13 authors
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use

Google / Stanford University

Published on: 2025-04-28 1 author
Perception Encoder: The best visual embeddings are not at the output of the network

Meta Platforms / Fudan University

Published on: 2025-04-28 1 author
Kimi-Audio Technical Report

Moonshot AI

Published on: 2025-04-25 1 author
I-Con: A Unifying Framework for Representation Learning

Google, Microsoft / MIT

Published on: 2025-04-23 5 authors
Describe Anything: Detailed Localized Image and Video Captioning

NVIDIA / UC Berkeley

Published on: 2025-04-22 11 authors
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

Google / JKU Linz

Published on: 2025-04-22 5 authors
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing

Apple

Published on: 2025-04-22 1 author
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Apple

Published on: 2025-04-21 1 author
How Does Critical Batch Size Scale in Pre-training?

Amazon / Harvard University

Published on: 2025-04-21 1 author
Representation Engineering for Large-Language Models: Survey and Research Challenges

Perplexity

Published on: 2025-04-21 1 author
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

Perplexity

Published on: 2025-04-21 1 author
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

SenseTime / Fudan University, Nanjing University, Shanghai Jiao Tong University, The Chinese University of Hong Kong, Tsinghua University

Published on: 2025-04-19 1 author
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

Google

Published on: 2025-04-17 4 authors
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

ByteDance

Published on: 2025-04-17 1 author
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities

Apple

Published on: 2025-04-16 1 author
How new data permeates LLM knowledge and how to dilute it

Google

Published on: 2025-04-13 1 author
Migrating Code At Scale With LLMs At Google

Google

Published on: 2025-04-13 1 author
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

ByteDance

Published on: 2025-04-11 1 author
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

ByteDance

Published on: 2025-04-11 1 author
PixelFlow: Pixel-Space Generative Models with Flow

Adobe / The University of Hong Kong

Published on: 2025-04-10 5 authors
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

ByteDance

Published on: 2025-04-10 1 author
SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Cisco Systems / The Ohio State University

Published on: 2025-04-09 11 authors
Gemini: A Family of Highly Capable Multimodal Models

Google / Google DeepMind

Published on: 2025-04-09 1 author
SmolVLM: Redefining small and efficient multimodal models

Hugging Face / Stanford University

Published on: 2025-04-07 1 author
One-Minute Video Generation with Test-Time Training

NVIDIA / Stanford University

Published on: 2025-04-07 1 author
Data Scaling Laws for End-to-End Autonomous Driving

NVIDIA / New York University, Stanford University, University of Toronto

Published on: 2025-04-06 10 authors
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

MiniMax / Shanghai Jiao Tong University

Published on: 2025-04-04 1 author
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Together AI / Carnegie Mellon University

Published on: 2025-04-02 1 author
A Systematic Survey of Automatic Prompt Optimization Techniques

Amazon

Published on: 2025-04-02 1 author
Scaling Language-Free Visual Representation Learning

Meta Platforms / New York University, Princeton University

Published on: 2025-04-01 11 authors
Large Language Models Pass the Turing Test

UC San Diego

Published on: 2025-03-31 2 authors
XAMBA: SSMs on Edge NPUs

Intel / Purdue University

Published on: 2025-03-31 1 author
On the Biology of a Large Language Model

Anthropic

Published on: 2025-03-27 1 author
Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration

Alibaba

Published on: 2025-03-26 1 author
Qwen2.5-Omni Technical Report

Alibaba

Published on: 2025-03-26 1 author
Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2

Intel / University of California

Published on: 2025-03-25 1 author
