Models
Browse and discover AI models from leading companies in the industry.
-
By Liquid AI
LFM2.5-350M is Liquid AI's compact 350M-parameter language model optimized for fast inference, tool use, data extraction, and structured outputs. It is designed to run across cloud GPUs, CPUs, and edge devices, and is positioned for lightweight agentic workflows and large-scale data processing rather than math, coding, or creative writing.
Text · Released 7h ago
-
Atlas RF Studio is Arena Physica's AI-driven RF design platform and a research preview toward a foundation model for electromagnetics. It is built for inverse RF component design, using agentic workflows to generate, simulate, and refine candidate geometries from target specifications, with underlying models aimed at learning electromagnetic behavior rather than just approximating a narrow simulator task.
Text · Released 1d ago
-
By H Company
Holo3 is H Company's computer-use agent model for enterprise workflows. It is designed to see, reason, and act across desktop software and multi-app business environments, with production-focused training in synthetic enterprise systems. H Company positions it as a state-of-the-art model on OSWorld-Verified, with the flagship Holo3-122B-A10B using 10B active parameters out of 122B total.
Multimodal · Released 1d ago
-
By Google
Veo 3.1 Lite is a lower-cost AI video generation model designed for high-volume video applications. It supports both text-to-video and image-to-video, keeps the same generation speed as Veo 3.1 Fast while costing less than half as much, and offers flexible output options including landscape or portrait framing, 720p or 1080p resolution, and 4s, 6s, or 8s durations.
Multimodal · Released 1d ago
-
By Alibaba
Qwen3.6 Plus Preview is a preview large language model in the Qwen Plus line, built with a hybrid architecture aimed at better efficiency, scalability, reasoning, and agentic behavior. It is positioned as especially strong for coding, front-end development, and complex problem-solving, and is offered on OpenRouter with a 1,000,000-token context window.
Multimodal · Released 1d ago
-
By Pixverse
PixVerse V6 is PixVerse's AI video generation model focused on longer, more production-ready clips. It supports 15-second 1080p generation, multi-shot storytelling, integrated audio generation, and flexible aspect ratios, aiming to move beyond short isolated clips toward more consistent narrative and marketing video workflows.
Multimodal · Released 2d ago
-
By Meituan
LongCat-AudioDiT-3.5B is Meituan LongCat's diffusion-based text-to-speech model built directly in waveform latent space rather than mel-spectrogram space. It is designed for high-fidelity speech generation and zero-shot voice cloning, supports Chinese and English, and is positioned as a top-performing open model on the Seed benchmark for speaker similarity and intelligibility.
Audio · Released 2d ago
-
By Microsoft
Harrier OSS v1 0.6B is Microsoft's multilingual text embedding model for semantic search, retrieval, clustering, similarity, classification, bitext mining, and reranking. It uses a decoder-only architecture with last-token pooling and L2 normalization, supports 94 languages, handles up to 32,768 tokens, and is positioned as a strong multilingual embedding model.
Text · Released 2d ago
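The last-token-pooling and L2-normalization scheme described above can be sketched in a few lines of NumPy. The array shapes and the helper name below are illustrative assumptions, not Microsoft's actual API: the idea is simply to take each sequence's final non-padding hidden state from the decoder and normalize it to unit length.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Pool each sequence to its last non-padding hidden state, then L2-normalize.

    hidden_states: (batch, seq_len, dim) decoder outputs
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Index of the last real token in each sequence.
    last_idx = attention_mask.sum(axis=1) - 1
    pooled = hidden_states[np.arange(hidden_states.shape[0]), last_idx]
    # L2-normalize so cosine similarity reduces to a plain dot product.
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# Toy example: batch of 2 sequences (lengths 3 and 2), hidden dim 4.
rng = np.random.default_rng(0)
h = rng.random((2, 5, 4))
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 0, 0, 0]])
emb = last_token_pool(h, mask)
```

Because the embeddings come out unit-length, downstream retrieval can rank documents by dot product alone, which is the usual reason embedding models bake in the normalization step.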
-
KAT-Coder-Pro V2 is a hosted Kwaipilot coding model listed on OpenRouter with a 256,000-token context window. It is positioned as a large-context software engineering model for coding and agent-style development tasks, intended for API-based use rather than open-weight local deployment.
Multimodal · Released 5d ago
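Since the model is API-only, access goes through OpenRouter's OpenAI-compatible chat-completions endpoint. The sketch below builds such a request with only the standard library; the model slug is a guess based on the listing name (check OpenRouter's model list for the real identifier), and the request is constructed but not sent.

```python
import json
import os
import urllib.request

# Hypothetical slug; verify against OpenRouter's published model list.
MODEL = "kwaipilot/kat-coder-pro-v2"

def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat-completions request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Refactor this recursive function to be iterative.")
```

Sending it is a matter of `urllib.request.urlopen(req)` (or any HTTP client) with a valid `OPENROUTER_API_KEY` set; the 256k context window mainly matters for how much repository context you can pack into the `messages` list.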
-
SAM 3.1 is Meta's improved promptable segmentation model for images and video. It supports points, boxes, masks, text, and exemplar prompts, and is designed to segment and track objects more accurately than earlier SAM 3 releases, including open-vocabulary concepts across frames.
Multimodal · Released 5d ago
-
By Sii GAIR-NLP
daVinci-LLM-3B is a 3B base language model built to make pretraining transparent and reproducible. Its release includes not only the weights, but also training trajectories, intermediate checkpoints, data-processing decisions, and more than 200 ablation studies.
Text · Released 5d ago
-
By Chroma
Context-1 is Chroma's 20B agentic search model trained as a self-editing search agent. It is designed to decompose complex queries, prune irrelevant context, and deliver high retrieval quality at lower latency and cost than much larger frontier models.
Text · Released 5d ago
-
By Datalab
Chandra is an OCR model for difficult document extraction tasks. Its GitHub description says it handles complex tables, forms, and handwriting while preserving full layout structure, making it more document-understanding focused than plain-text OCR.
Multimodal · Released 5d ago
-
By Meituan
LongCat Next is a multimodal LongCat model focused on compact yet capable visual and speech understanding. The official intro highlights strong performance despite a 28x compression ratio, with particular strength in text rendering, speech comprehension, low-latency voice conversation, and customizable voice cloning.
Multimodal · Released 5d ago
-
By Topaz Labs
Topaz Starlight Precise 2.5 is an upgraded video upscaling model available through ComfyUI Partner Nodes. It is positioned as a direct replacement for the earlier SLP-2 model, promising sharper output, fewer artifacts, and better-preserved detail at the same per-frame cost.
Video · Released 5d ago
-
By Suno
Suno is an AI music creation platform that generates complete original songs from prompts, including vocals, lyrics, and full production. It is built for fast music generation, remixing, beat making, and sharing, and supports creation from text, images, or voice inputs.
Audio · Released 6d ago
-
By Google
Gemini 3.1 Flash Live Preview is Google's low-latency audio-to-audio model for real-time dialogue and voice-first AI apps. It is built for fast conversational interaction, with multimodal input support for text, images, audio, and video, and outputs in text and audio. Google positions it for acoustic nuance detection, numeric precision, and multimodal awareness.
Multimodal · Released 6d ago
-
By Cohere
Cohere Transcribe is an open-source automatic speech recognition model for highly accurate audio transcription. Cohere says it is built for practical enterprise use, supports 14 languages, uses a 2B-parameter Conformer-based encoder-decoder architecture, and currently ranks #1 on Hugging Face's Open ASR Leaderboard for accuracy.
Audio · Released 6d ago
-
By Mistral AI
Voxtral TTS is Mistral's new open-source text-to-speech model for building voice agents and enterprise speech applications. According to TechCrunch, it supports 9 languages, can clone a voice from under 5 seconds of audio, preserves accents and speaking style, and is optimized for real-time use on edge devices like phones, laptops, and wearables.
Audio · Released 6d ago
-
TRIBE v2 is Meta's multimodal brain-encoding research demo. It predicts whole-brain fMRI responses to naturalistic stimuli by combining video, audio, and text representations, aiming to model how the brain reacts over time across different cortical regions and people. It builds on Meta's TRIBE line for cross-modal brain response prediction.
Multimodal · Released 6d ago