Parakeet STT

@carlulsoe
development · Speech-to-Text · Local Processing · Audio Transcription

Local speech-to-text transcription using NVIDIA Parakeet TDT 0.6B v3 with ONNX Runtime. Runs on CPU with no GPU required, roughly 30x faster than realtime, and exposes an OpenAI-compatible API.

🚀 Parakeet STT is a fast, local speech-to-text tool powered by NVIDIA's Parakeet model. It converts audio files to plain text, timestamped transcripts, or subtitles (SRT/WebVTT) through an OpenAI-compatible API. It runs entirely on your CPU, with no GPU needed, and processes audio roughly 30x faster than realtime.

💡 Perfect for transcribing meetings, podcasts, videos, and interviews while keeping everything private. Supports 25 languages with automatic detection. Use it via simple API calls, Python SDK, or a built-in web interface with drag-and-drop uploads.
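
Because the API is described as OpenAI-compatible, a client call could plausibly look like the sketch below, which uses the OpenAI Python SDK's transcription endpoint against a local server. The base URL, port, and model name are assumptions and are not stated in this listing, and the helper names `response_format_for` and `transcribe_to_file` are hypothetical. Check the project's own documentation for the real values.

```python
from pathlib import Path

# Assumptions (not confirmed by this listing): where the local server
# listens and which model identifier it expects.
BASE_URL = "http://localhost:8000/v1"   # hypothetical host and port
MODEL_NAME = "parakeet-tdt-0.6b-v3"     # hypothetical model identifier

# Map an output file extension to an OpenAI-style response_format value.
FORMAT_BY_SUFFIX = {".txt": "text", ".srt": "srt", ".vtt": "vtt"}


def response_format_for(output_path):
    """Pick a response format from the output extension (default: plain text)."""
    suffix = Path(output_path).suffix.lower()
    return FORMAT_BY_SUFFIX.get(suffix, "text")


def transcribe_to_file(audio_path, output_path, base_url=BASE_URL, model=MODEL_NAME):
    """Send one audio file to the local server and save the transcript."""
    # Imported here so the helper above works without the SDK installed.
    from openai import OpenAI

    # A local server may ignore the key, but the SDK requires a non-empty value.
    client = OpenAI(base_url=base_url, api_key="not-needed")

    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            response_format=response_format_for(output_path),
        )

    # For "text", "srt", and "vtt" the SDK is expected to return a plain string.
    Path(output_path).write_text(str(result), encoding="utf-8")
    return output_path
```

With a server running, `transcribe_to_file("meeting.wav", "meeting.srt")` would request SRT subtitles, while a `.vtt` or `.txt` output path would request WebVTT or plain text instead.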

✨ Get enterprise-grade accuracy comparable to Whisper, complete privacy with zero cloud dependencies, and instant setup via Docker or Python—all without vendor lock-in.

GitHub

Requirements

FastAPI: web framework for the API server

OpenAI Python SDK: optional, for client integration