Transcripto
Local Whisper · Speaker Diarization · Multi-language

Turn audio & video into clean, speaker-labeled transcripts

Upload a file, paste a YouTube link, or drop in text. Get transcripts, speaker separation, summaries, translations and downloadable reports.

Simple mode prefers existing captions; speaker mode always downloads and analyzes the audio.
Fastest path: transcript & summary only.
Adds a translated summary and per-line translations.
Processing time: on a CPU-only server expect roughly 1–3 minutes per minute of media. Speaker separation is slower. Keep clips short while testing.
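As a rough illustration of the estimate above, the following sketch turns the 1–3 minutes-per-minute range into a time window for a given clip. The function name and the constants are hypothetical, chosen only to mirror the stated range; the extra cost of speaker separation is not modeled because no figure is given for it.

```python
def estimate_processing_minutes(media_minutes: float) -> tuple[float, float]:
    """Return a (low, high) processing-time estimate in minutes.

    Assumes the quoted CPU-only range of 1 to 3 minutes of processing
    per minute of media. Speaker separation is slower and NOT included.
    """
    if media_minutes < 0:
        raise ValueError("media_minutes must be non-negative")
    low_factor, high_factor = 1.0, 3.0  # assumed from the stated range
    return media_minutes * low_factor, media_minutes * high_factor


if __name__ == "__main__":
    low, high = estimate_processing_minutes(5)
    print(f"A 5-minute clip: roughly {low:.0f} to {high:.0f} minutes")
```

A 5-minute test clip therefore lands somewhere between 5 and 15 minutes under these assumptions, which is why short clips are the sensible choice while testing.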
Accurate transcription
Local Whisper with word-level timestamps.
Speaker separation
pyannote 3.1 with a local CPU fallback.
Translation
Summaries & lines in 10+ languages.
Exports
.txt, .srt and polished PDF reports.
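To make the .srt export concrete, here is a minimal sketch of how speaker-labeled segments could be written in SubRip format. It is not Transcripto's actual exporter: the segment layout (dicts with `start`, `end`, `speaker`, `text`) and the bracketed speaker prefix are assumptions made for illustration. Only the SRT conventions themselves (a numbered cue, an `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then the text) are standard.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues, separated by blank lines.

    Each segment is a hypothetical dict with keys:
    start (s), end (s), speaker (label), text (transcribed line).
    """
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}"
        cues.append(f"{index}\n{timing}\n[{seg['speaker']}] {seg['text']}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "speaker": "SPEAKER_00", "text": "Hello."},
        {"start": 2.5, "end": 5.0, "speaker": "SPEAKER_01", "text": "Hi there."},
    ]
    print(to_srt(demo))
```

Rounding to whole milliseconds before splitting into hours, minutes and seconds avoids timestamps such as `00:00:59,1000` that can appear when the fields are rounded separately.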