Papers

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

Tencent / Chinese Academy of Sciences Institute of Automation, Nanjing University

Published on: 2026-03-06 9 authors
OneRanker: Unified Generation and Ranking with One Model in Industrial Advertising Recommendation

Tencent

Published on: 2026-03-03 1 author
NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing

Tencent

Published on: 2026-03-03 1 author
RubricBench: Aligning Model-Generated Rubrics with Human Standards

Tencent / University of Illinois Springfield

Published on: 2026-03-02 1 author
WorldStereo: Bridging Camera-Guided Video Generation and Scene Reconstruction via 3D Geometric Memories

Tencent / Zhejiang University

Published on: 2026-03-02 1 author
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

Tencent

Published on: 2026-02-26 1 author
The Art of Efficient Reasoning: Data, Reward, and Optimization

Tencent / The University of Hong Kong

Published on: 2026-02-25 1 author
Haitao Lin

Tencent / Fudan University

Published on: 2026-02-23 1 author
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation

Tencent / Wuhan University

Published on: 2026-02-20 1 author
GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training

Tencent, AMD / Peking University

Published on: 2026-02-15 1 author
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention

Tencent

Published on: 2026-02-15 1 author
Gradients Must Earn Their Influence: Unifying SFT with Generalized Entropic Objectives

Tencent / Harbin Institute of Technology

Published on: 2026-02-11 1 author
MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

Tencent / Monash University

Published on: 2026-02-09 1 author
RISE-Video: Can Video Generators Decode Implicit World Rules?

Tencent / Shanghai Jiao Tong University

Published on: 2026-02-05 1 author
BlossomRec: Block-level Fused Sparse Attention Mechanism for Sequential Recommendations

Tencent / City University of Hong Kong

Published on: 2026-02-03 1 author
ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution

Tencent / Shanghai Jiao Tong University

Published on: 2026-02-03 1 author
HY3D-Bench: Generation of 3D Assets

Tencent

Published on: 2026-02-03 1 author
HunyuanImage 3.0 Technical Report

Tencent

Published on: 2026-02-02 1 author
MAIN-VLA: Modeling Abstraction of Intention and eNvironment for Vision-Language-Action Models

Tencent

Published on: 2026-02-02 1 author
AlignGemini: Generalizable AI-Generated Image Detection Through Task-Model Alignment

Tencent

Published on: 2026-01-30 1 author
PI-Light: Physics-Inspired Diffusion for Full-Image Relighting

Tencent / Nanyang Technological University

Published on: 2026-01-29 1 author
Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding

Tencent

Published on: 2026-01-28 1 author
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision

Tencent

Published on: 2026-01-27 1 author
RISER: Orchestrating Latent Reasoning Skills for Adaptive Activation Steering

Tencent / Tongji University

Published on: 2026-01-19 1 author
FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

Tencent / Singapore University of Technology and Design

Published on: 2026-01-09 1 author
UniFinEval: Towards Unified Evaluation of Financial Multimodal Models across Text, Images and Videos

Tencent / Shanghai University of Finance and Economics

Published on: 2026-01-09 1 author
Rotate Your Character: Revisiting Video Diffusion Models for High-Quality 3D Character Generation

Tencent / The University of Hong Kong

Published on: 2026-01-09 1 author
One Language-Free Foundation Model Is Enough for Universal Vision Anomaly Detection

Tencent

Published on: 2026-01-09 1 author
DocDancer: Towards Agentic Document-Grounded Information Seeking

Tencent / Peking University

Published on: 2026-01-08 1 author
Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing

Tencent / Institute of Information Engineering

Published on: 2026-01-08 1 author
FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning

Tencent / The Hong Kong Polytechnic University

Published on: 2026-01-07 1 author
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Tencent

Published on: 2026-01-06 1 author
A Versatile Multimodal Agent for Multimedia Content Generation

Tencent / University of Rochester

Published on: 2026-01-06 1 author
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Tencent

Published on: 2026-01-05 1 author
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

Tencent / Singapore Management University

Published on: 2025-12-30 1 author
HY-MT1.5 Technical Report

Tencent

Published on: 2025-12-30 1 author
D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning

Tencent / Shanghai Jiao Tong University

Published on: 2025-12-26 1 author
Streaming Video Instruction Tuning

Tencent / Hong Kong Baptist University

Published on: 2025-12-24 1 author
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

Tencent / The Chinese University of Hong Kong

Published on: 2025-12-18 1 author
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Tencent / Hong Kong University of Science and Technology

Published on: 2025-12-18 1 author
AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path

Tencent / Australian National University

Published on: 2025-12-15 1 author
Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation

Tencent

Published on: 2025-12-15 1 author
Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10×

Tencent

Published on: 2025-12-15 1 author
GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

Tencent / Tsinghua University

Published on: 2025-12-15 1 author
Distribution Matching Variational AutoEncoder

Tencent / Peking University

Published on: 2025-12-08 1 author
HunyuanVideo 1.5 Technical Report

Tencent

Published on: 2025-10-25 1 author
Training-Free Group Relative Policy Optimization

Tencent

Published on: 2025-10-09 1 author
Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Tencent, Apple / The University of Hong Kong, University of Illinois at Urbana-Champaign

Published on: 2025-05-31 1 author
HunyuanVideo: A Systematic Framework For Large Video Generative Models

Tencent

Published on: 2025-03-11 1 author
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

Tencent

Published on: 2024-09-03 1 author
