Parakeet TDT

By NVIDIA
New Audio Gen
Released: May 1, 2025

Overview

Parakeet TDT is NVIDIA’s low-latency, streaming speech-to-text model in the Parakeet family. It uses a Token-and-Duration Transducer (TDT) architecture tuned for high accuracy, robust noise handling, and real-time partial transcripts, and it can be deployed on GPUs from edge to data center.

Description

Parakeet TDT is an ASR model built around a Token-and-Duration Transducer (TDT) design. It pairs an acoustic encoder with a lightweight prediction network and a joint network that predicts both the next token and how many audio frames that token spans, which lets the decoder skip frames and produce accurate transcripts with minimal delay. It’s trained for robustness across accents, microphones, and noisy environments, and it preserves long-form coherence while still responding interactively in streaming mode.

The model emits stable partials and finalized text with punctuation, casing, and inverse-text normalization, and it can return timestamps for alignment or downstream analytics. It can also be adapted to a domain by fine-tuning on industry jargon or custom corpora, which improves recognition for contact centers, medical dictation, meetings, or embedded commands.

For production, it integrates with NVIDIA’s inference stack: it runs with TensorRT and scales as a NIM microservice. Quantization options help it fit tighter latency or memory budgets, making it practical on embedded Jetson devices as well as multi-GPU servers. If you need fast ASR that holds its quality under real-world conditions and fits into agent pipelines, Parakeet TDT is the workhorse of the Parakeet lineup.
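
To make the decoding idea concrete, here is a minimal sketch of a greedy token-and-duration decoding loop, the scheme the "TDT" in the model's name refers to. It is an illustration under stated assumptions, not NVIDIA's implementation: `greedy_tdt_decode`, `toy_joint`, and the scripted outputs are hypothetical stand-ins for the real encoder, prediction network, and joint network. The point it shows is that each decoding step yields a token plus a frame count to jump ahead, so the decoder visits far fewer frames than the audio contains, and the frame index at emission time doubles as a rough timestamp.

```python
from typing import Callable, List, Tuple

BLANK = "<b>"  # hypothetical blank symbol; real models use a vocabulary index


def greedy_tdt_decode(
    num_frames: int,
    joint: Callable[[int, str], Tuple[str, int]],
    max_symbols_per_frame: int = 4,
) -> List[Tuple[str, int]]:
    """Greedy token-and-duration decoding over `num_frames` encoder frames.

    `joint(t, last_token)` stands in for the joint network: it returns the
    most likely token at frame t and the number of frames to advance.
    Returns (token, start_frame) pairs.
    """
    hyp: List[Tuple[str, int]] = []
    last_token = BLANK
    t = 0
    emitted_here = 0  # tokens emitted without advancing the frame index
    while t < num_frames:
        token, duration = joint(t, last_token)
        if token != BLANK:
            hyp.append((token, t))
            last_token = token
            emitted_here += 1
        # Guard against stalling: a blank with duration 0, or too many
        # tokens on a single frame, is forced to advance by one frame.
        if duration == 0 and (token == BLANK or emitted_here >= max_symbols_per_frame):
            duration = 1
        if duration > 0:
            emitted_here = 0
        t += duration
    return hyp


# Scripted stand-in for a trained model (assumption, for illustration only).
SCRIPT = {0: ("he", 2), 2: ("llo", 3), 5: (BLANK, 2), 7: ("world", 3)}
calls: List[int] = []


def toy_joint(t: int, last_token: str) -> Tuple[str, int]:
    calls.append(t)  # record which frames the decoder actually visits
    return SCRIPT.get(t, (BLANK, 1))


result = greedy_tdt_decode(10, toy_joint)
print(result)  # [('he', 0), ('llo', 2), ('world', 7)]
print(calls)   # [0, 2, 5, 7]
```

In this toy run the decoder evaluates the joint at only four of the ten frames. A conventional transducer would advance at most one frame per step, which is why predicting durations can reduce decoding latency.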

About NVIDIA

Industry: Computer Hardware Manufacturing
Company Size: 10001+
Location: Santa Clara, California, US
Website: nvidia.com

Last updated: September 22, 2025