Models
Browse and discover AI models from leading companies in the industry.
-
By Liquid AI
LFM2.5-350M is Liquid AI's compact 350M-parameter language model optimized for fast inference, tool use, data extraction, and structured outputs. It is designed to run across cloud GPUs, CPUs, and edge devices, and is positioned for lightweight agentic workflows and large-scale data processing rather than math, coding, or creative writing.
Text · Released 7h ago
-
Atlas RF Studio is Arena Physica's AI-driven RF design platform and a research preview toward a foundation model for electromagnetics. It is built for inverse RF component design, using agentic workflows to generate, simulate, and refine candidate geometries from target specifications, with underlying models aimed at learning electromagnetic behavior rather than just approximating a narrow simulator task.
Text · Released 1d ago
-
By H Company
Holo3 is H Company's computer-use agent model for enterprise workflows. It is designed to see, reason, and act across desktop software and multi-app business environments, with production-focused training in synthetic enterprise systems. H Company positions it as a state-of-the-art model on OSWorld-Verified, with the flagship Holo3-122B-A10B using 10B active parameters out of 122B total.
Multimodal · Released 1d ago
-
By Google
Veo 3.1 Lite is a lower-cost AI video generation model designed for high-volume video applications. It supports both text-to-video and image-to-video, keeps the same generation speed as Veo 3.1 Fast while costing less than half as much, and offers flexible output options including landscape or portrait framing, 720p or 1080p resolution, and 4s, 6s, or 8s durations.
Multimodal · Released 1d ago
-
By Alibaba
Qwen3.6 Plus Preview is a preview large language model in the Qwen Plus line, built with a hybrid architecture aimed at better efficiency, scalability, reasoning, and agentic behavior. It is positioned as especially strong for coding, front-end development, and complex problem-solving, and is offered on OpenRouter with a 1,000,000-token context window.
Multimodal · Released 1d ago
-
By Pixverse
PixVerse V6 is PixVerse's AI video generation model focused on longer, more production-ready clips. It supports 15-second 1080p generation, multi-shot storytelling, integrated audio generation, and flexible aspect ratios, aiming to move beyond short isolated clips toward more consistent narrative and marketing video workflows.
Multimodal · Released 2d ago
-
By Meituan
LongCat-AudioDiT-3.5B is Meituan LongCat's diffusion-based text-to-speech model built directly in waveform latent space rather than mel-spectrogram space. It is designed for high-fidelity speech generation and zero-shot voice cloning, supports Chinese and English, and is positioned as a top-performing open model on the Seed benchmark for speaker similarity and intelligibility.
Audio · Released 2d ago
-
By Microsoft
Harrier OSS v1 0.6B is Microsoft's multilingual text embedding model for semantic search, retrieval, clustering, similarity, classification, bitext mining, and reranking. It uses a decoder-only architecture with last-token pooling and L2 normalization, supports 94 languages, handles up to 32,768 tokens, and is positioned as a strong multilingual embedding model.
Text · Released 2d ago
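The last-token-pooling and L2-normalization scheme described above can be sketched in a few lines of NumPy. The array shapes and the helper name below are illustrative assumptions, not Microsoft's actual API: the idea is simply to take each sequence's final non-padding hidden state from the decoder and normalize it to unit length.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Pool each sequence to its last non-padding hidden state, then L2-normalize.

    hidden_states: (batch, seq_len, dim) decoder outputs
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Index of the last real token in each sequence.
    last_idx = attention_mask.sum(axis=1) - 1
    pooled = hidden_states[np.arange(hidden_states.shape[0]), last_idx]
    # L2-normalize so cosine similarity reduces to a plain dot product.
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# Toy example: batch of 2 sequences (lengths 3 and 2), hidden dim 4.
rng = np.random.default_rng(0)
h = rng.random((2, 5, 4))
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 0, 0, 0]])
emb = last_token_pool(h, mask)
```

Because the embeddings come out unit-length, downstream retrieval can rank documents by dot product alone, which is the usual reason embedding models bake in the normalization step.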
-
KAT-Coder-Pro V2 is a hosted Kwaipilot coding model listed on OpenRouter with a 256,000-token context window. It is positioned as a large-context software engineering model for coding and agent-style development tasks, intended for API-based use rather than open-weight local deployment.
Multimodal · Released 5d ago
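Since the model is API-only, access goes through OpenRouter's OpenAI-compatible chat-completions endpoint. The sketch below builds such a request with only the standard library; the model slug is a guess based on the listing name (check OpenRouter's model list for the real identifier), and the request is constructed but not sent.

```python
import json
import os
import urllib.request

# Hypothetical slug; verify against OpenRouter's published model list.
MODEL = "kwaipilot/kat-coder-pro-v2"

def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat-completions request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Refactor this recursive function to be iterative.")
```

Sending it is a matter of `urllib.request.urlopen(req)` (or any HTTP client) with a valid `OPENROUTER_API_KEY` set; the 256k context window mainly matters for how much repository context you can pack into the `messages` list.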
-
SAM 3.1 is Meta's improved promptable segmentation model for images and video. It supports points, boxes, masks, text, and exemplar prompts, and is designed to segment and track objects more accurately than earlier SAM 3 releases, including open-vocabulary concepts across frames.
Multimodal · Released 5d ago
-
By Sii GAIR-NLP
daVinci-LLM-3B is a 3B base language model built to make pretraining transparent and reproducible. Its release includes not only the weights, but also training trajectories, intermediate checkpoints, data-processing decisions, and more than 200 ablation studies.
Text · Released 5d ago
-
By Chroma
Context-1 is Chroma's 20B agentic search model trained as a self-editing search agent. It is designed to decompose complex queries, prune irrelevant context, and deliver high retrieval quality at lower latency and cost than much larger frontier models.
Text · Released 5d ago
-
By Datalab
Chandra is an OCR model for difficult document extraction tasks. Its GitHub description says it handles complex tables, forms, and handwriting while preserving full layout structure, making it more document-understanding focused than plain-text OCR.
Multimodal · Released 5d ago
-
By Meituan
LongCat Next is a multimodal LongCat model focused on compact yet capable visual and speech understanding. The official intro highlights strong performance despite a 28x compression ratio, with particular strength in text rendering, speech comprehension, low-latency voice conversation, and customizable voice cloning.
Multimodal · Released 5d ago
-
By Topaz Labs
Topaz Starlight Precise 2.5 is an upgraded video upscaling model available through ComfyUI Partner Nodes. It is positioned as a direct replacement for the earlier SLP-2 model, promising sharper output, fewer artifacts, and better-preserved detail at the same per-frame cost.
Video · Released 5d ago
-
By Suno
Suno is an AI music creation platform that generates complete original songs from prompts, including vocals, lyrics, and full production. It is built for fast music generation, remixing, beat making, and sharing, and supports creation from text, images, or voice inputs.
Audio · Released 6d ago
-
By Google
Gemini 3.1 Flash Live Preview is Google's low-latency audio-to-audio model for real-time dialogue and voice-first AI apps. It is built for fast conversational interaction, with multimodal input support for text, images, audio, and video, and outputs in text and audio. Google positions it for acoustic nuance detection, numeric precision, and multimodal awareness.
Multimodal · Released 6d ago
-
By Cohere
Cohere Transcribe is an open-source automatic speech recognition model for highly accurate audio transcription. Cohere says it is built for practical enterprise use, supports 14 languages, uses a 2B-parameter Conformer-based encoder-decoder architecture, and currently ranks #1 on Hugging Face's Open ASR Leaderboard for accuracy.
Audio · Released 6d ago
-
By Mistral AI
Voxtral TTS is Mistral's new open-source text-to-speech model for building voice agents and enterprise speech applications. According to TechCrunch, it supports 9 languages, can clone a voice from under 5 seconds of audio, preserves accents and speaking style, and is optimized for real-time use on edge devices like phones, laptops, and wearables.
Audio · Released 6d ago
-
TRIBE v2 is Meta's multimodal brain-encoding research demo. It predicts whole-brain fMRI responses to naturalistic stimuli by combining video, audio, and text representations, aiming to model how the brain reacts over time across different cortical regions and people. It builds on Meta's TRIBE line for cross-modal brain response prediction.
Multimodal · Released 6d ago