TAAFT

Papers

  • D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning
    Tencent / Shanghai Jiao Tong University
    Published on: 2025-12-26 1 author
  • Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration
    Published on: 2025-12-26 1 author
  • DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
    Published on: 2025-12-25 1 author
  • SemanticGen: Video Generation in Semantic Space
    Kuaishou Technology / Zhejiang University
    Published on: 2025-12-25 1 author
  • Streaming Video Instruction Tuning
    Tencent / Hong Kong Baptist University
    Published on: 2025-12-24 1 author
  • Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
    Published on: 2025-12-23 1 author
  • FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models
    Snap / Sun Yat-sen University
    Published on: 2025-12-23 1 author
  • GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation
    Published on: 2025-12-23 1 author
  • COBRA: Catastrophic Bit-flip Reliability Analysis of State-Space Models
    Published on: 2025-12-22 1 author
  • From Word to World: Can Large Language Models be Implicit Text-based World Models?
    Microsoft / Southern University of Science and Technology
    Published on: 2025-12-21 1 author
  • Secret mixtures of experts inside your LLM
    University of Pennsylvania, Wharton School of Statistics and Data Science
    Published on: 2025-12-20 1 author
  • Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience
    Published on: 2025-12-19 22 authors
  • GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation
    Xiaomi / The University of Hong Kong
    Published on: 2025-12-19 1 author
  • Diffusion Forcing for Multi-Agent Interaction Sequence Modeling
    Published on: 2025-12-19 1 author
  • Sigma-MoE-Tiny Technical Report
    Microsoft / Microsoft Research
    Published on: 2025-12-19 1 author
  • Journey Before Destination: On the importance of Visual Faithfulness in Slow Thinking
    Amazon / University of Wisconsin-Madison
    Published on: 2025-12-19 1 author
  • Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
    Published on: 2025-12-19 1 author
  • Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities.
    Published on: 2025-12-19
  • DVGT: Driving Visual Geometry Transformer
    Xiaomi / Tsinghua University
    Published on: 2025-12-18 1 author
  • RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing
    Tencent / The Chinese University of Hong Kong
    Published on: 2025-12-18 1 author
  • N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
    Tencent / Hong Kong University of Science and Technology
    Published on: 2025-12-18 1 author
  • Kling-Omni Technical Report
    Published on: 2025-12-18 1 author
  • EasyV2V: A High-quality Instruction-based Video Editing Framework
    Published on: 2025-12-18 1 author
  • FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction
    Microsoft / Fudan University
    Published on: 2025-12-18 1 author
  • GenEval 2: Addressing Benchmark Drift in Text-to-Image Evaluation
    Published on: 2025-12-18 1 author
  • Addendum to GPT-5.2 System Card: GPT-5.2-Codex
    Published on: 2025-12-18 1 author
  • Monitoring Monitorability
    Published on: 2025-12-18 1 author
  • Spatia: Video Generation with Updatable Spatial Memory
    Microsoft / The University of Sydney
    Published on: 2025-12-17 1 author
  • Prompt Repetition Improves Non-Reasoning LLMs
    Published on: 2025-12-17 1 author
  • Towards a Science of Scaling Agent Systems
    Google / MIT
    Published on: 2025-12-17 1 author
  • Fast and Accurate Causal Parallel Decoding using Jacobi Forcing
    Snowflake / UC San Diego
    Published on: 2025-12-16 1 author
  • TalkVerse: Democratizing Minute-Long Audio-Driven Video Generation
    Snap / The Chinese University of Hong Kong
    Published on: 2025-12-16 1 author
  • GLM-TTS Technical Report
    Z.ai / Tsinghua University
    Published on: 2025-12-16 1 author
  • Native and Compact Structured Latents for 3D Generation
    Microsoft / Tsinghua University
    Published on: 2025-12-16 1 author
  • One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
    Published on: 2025-12-16 1 author
  • T5Gemma 2: Seeing, Reading, and Understanding Longer
    Published on: 2025-12-16 1 author
  • Evaluating AI’s ability to perform scientific research tasks
    Published on: 2025-12-16 1 author
  • AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path
    Tencent / Australian National University
    Published on: 2025-12-15 1 author
  • Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation
    Published on: 2025-12-15 1 author
  • Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10
    Published on: 2025-12-15 1 author
  • GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training
    Tencent / Tsinghua University
    Published on: 2025-12-15 1 author
  • KlingAvatar 2.0 Technical Report
    Published on: 2025-12-15 1 author
  • Wait, Wait, Wait... Why Do Reasoning Models Loop?
    Microsoft / MIT
    Published on: 2025-12-15 1 author
  • World Models Can Leverage Human Videos for Dexterous Manipulation
    Meta Platforms / New York University
    Published on: 2025-12-15 1 author
  • Towards Scalable Pre-training of Visual Tokenizers for Generation
    MiniMax / Huazhong University of Science and Technology
    Published on: 2025-12-15 1 author
  • Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
    Published on: 2025-12-15 1 author
  • Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal
    Published on: 2025-12-14 1 author
  • Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling
    Kuaishou Technology / Peking University
    Published on: 2025-12-14 1 author
  • Diffusion Language Model Inference with Monte Carlo Tree Search
    Amazon / Dartmouth College
    Published on: 2025-12-13 1 author
  • SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder
    Published on: 2025-12-12 1 author