
How to transcribe an MP3 file to text for free

Published on May 27, 2026 · 5 min read

MP3 is one of the most common audio formats on the internet. Podcast episodes, exported voice memos, meeting recordings and music files very often arrive as .mp3. This guide walks through the exact steps to turn an MP3 into editable text with VocaText, which bitrate to aim for if you need to fit a long recording into 25 MB, and what to do when the file is simply too big.

How to transcribe an MP3 file in 3 steps

Drop the file on VocaText. The dropzone accepts MP3 up to 25 MB.
Click "Transcribe". The engine detects the language automatically and processes the audio in the background.
Copy or download the text. The result appears within seconds for short clips and within a couple of minutes for a full hour of audio.

No sign-up, no email, no watermark on the output.

What is an MP3 file?

MP3 is the file extension for audio compressed with MPEG-1 Audio Layer III (and its MPEG-2 extension). The format was developed largely at the Fraunhofer Institute and standardised by ISO/IEC in 1993 as part of MPEG-1, and it quickly became the de facto way to store digital audio in compact files. Its key trick is lossy compression: it discards sound information the human ear barely perceives, shrinking files by roughly an order of magnitude versus uncompressed audio.

For voice, MP3 at modest bitrates is excellent: small files, near-perfect intelligibility, near-universal compatibility. Virtually every current operating system, phone and browser plays MP3 without extra software.

Bitrate cheat sheet: how much audio fits in 25 MB?

The free tier caps uploads at 25 MB. How much MP3 audio that buys you depends entirely on the bitrate of the file:

64 kbps — about 50 minutes of voice. Compact, perfectly intelligible for transcription.
96 kbps — about 35 minutes. The sweet spot for voice MP3.
128 kbps — about 25 minutes. The default of most podcast platforms.
192 kbps — about 17 minutes. Overkill for voice.
320 kbps — about 10 minutes. Music quality, never needed for transcription.
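The figures above come from simple arithmetic: the size limit in bits divided by the bitrate in bits per second. Here is a minimal sketch of that calculation. The function name is made up for illustration, and it assumes a constant bitrate and 1 MB = 1,000,000 bytes, so real files (with tags, cover art or variable bitrate) will differ slightly.

```python
def minutes_that_fit(bitrate_kbps: float, limit_mb: float = 25.0) -> float:
    """Return how many minutes of constant-bitrate audio fit in limit_mb."""
    limit_bits = limit_mb * 1_000_000 * 8          # assumption: 1 MB = 1,000,000 bytes
    seconds = limit_bits / (bitrate_kbps * 1000)   # 1 kbps = 1000 bits per second
    return seconds / 60


# Print the duration for each bitrate in the cheat sheet.
for kbps in (64, 96, 128, 192, 320):
    print(f"{kbps} kbps: about {minutes_that_fit(kbps):.0f} minutes")
```

At 64 kbps this works out to roughly 52 minutes, which the cheat sheet rounds down to "about 50" to leave a little headroom.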

Rule of thumb: if your MP3 is over 25 MB and contains only voice, re-export it at 64 to 96 kbps in Audacity or ffmpeg. Transcription accuracy should not suffer noticeably, and dropping from 128 to 64 kbps fits twice the duration in the same file size.
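If you prefer ffmpeg to Audacity, the re-export can be scripted. The sketch below only wraps a standard ffmpeg invocation; the function names are hypothetical, and it assumes ffmpeg is installed, on your PATH, and built with the libmp3lame encoder.

```python
import shutil
import subprocess


def build_reencode_command(src: str, dst: str, bitrate_kbps: int = 64) -> list[str]:
    """Build an ffmpeg command that re-encodes src to a lower-bitrate MP3."""
    return [
        "ffmpeg",
        "-i", src,                    # input file
        "-codec:a", "libmp3lame",     # MP3 encoder (assumes ffmpeg was built with LAME)
        "-b:a", f"{bitrate_kbps}k",   # target audio bitrate
        dst,                          # output file
    ]


def reencode(src: str, dst: str, bitrate_kbps: int = 64) -> None:
    """Run the command; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_reencode_command(src, dst, bitrate_kbps), check=True)
```

Calling reencode("episode.mp3", "episode-64k.mp3") would write a 64 kbps copy next to the original, which stays untouched.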

What if your MP3 file is bigger than 25 MB?

Three options, in order of effort:

Re-encode at a lower bitrate. Open the file in Audacity, choose File → Export → Export As MP3 (File → Export Audio in some recent versions), and set the bitrate to 64 or 96 kbps. A 60 MB podcast at 192 kbps becomes 20 MB at 64 kbps with little perceptible loss in voice quality.
Split the file into segments that each stay under 25 MB (roughly 25 minutes at 128 kbps or 50 minutes at 64 kbps) and transcribe each separately.
Switch to VocaText Pro for direct upload of files up to 200 MB.
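For the splitting option, it helps to work out how many parts you need before cutting. The sketch below is one way to do it, with hypothetical function names: it assumes file size grows evenly with duration (constant bitrate), keeps a 10 percent safety margin under the limit, and builds a command for ffmpeg's segment muxer, which cuts without re-encoding.

```python
import math


def segment_plan(file_mb: float, duration_min: float, limit_mb: float = 25.0,
                 margin: float = 0.9) -> tuple[int, float]:
    """Return (number of parts, minutes per part) so each part stays under limit_mb.

    Assumes size grows evenly with duration; margin leaves headroom under the limit.
    """
    parts = math.ceil(file_mb / (limit_mb * margin))
    return parts, duration_min / parts


def build_split_command(src: str, minutes_per_part: float,
                        pattern: str = "part_%03d.mp3") -> list[str]:
    """Build an ffmpeg command that cuts src into equal-length parts without re-encoding."""
    seconds = math.ceil(minutes_per_part * 60)  # round up so no tiny extra part is left over
    return [
        "ffmpeg", "-i", src,
        "-f", "segment",                # ffmpeg's segment muxer
        "-segment_time", str(seconds),  # length of each part in seconds
        "-c", "copy",                   # copy the audio stream as-is
        pattern,                        # part_000.mp3, part_001.mp3, ...
    ]


parts, minutes = segment_plan(file_mb=58, duration_min=60)
print(parts, minutes)
print(" ".join(build_split_command("episode.mp3", minutes)))
```

A 58 MB, one-hour episode comes out as three 20-minute parts. Cuts land wherever the clock says, so a sentence may be split across two transcripts.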

What transcription quality to expect from an MP3

Speech recognition accuracy depends overwhelmingly on the audio, not on the file extension. A clean MP3 recorded close to the speaker, in a quiet room, routinely beats 95 percent accuracy. A noisy MP3 captured across a room will struggle to clear 80 percent. Three factors matter:

Microphone distance — keep the microphone within 15 to 20 centimetres of the speaker's mouth, slightly off-axis to dodge plosives.
Background noise — turn off fans, close windows, mute notifications. Background hum is the single biggest cause of mistakes.
Speaker overlap — speech engines struggle to attribute simultaneous voices. Encourage one-at-a-time speaking.
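If you want to check accuracy on your own recordings, you need a reference transcript to compare against. This guide does not say how its percentages are measured; the sketch below assumes one common convention, word accuracy defined as 1 minus the word error rate (word-level edit distance divided by the number of reference words). The function name is illustrative.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, where WER = word-level edit distance / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # word missing from the transcript
                            curr[j - 1] + 1,       # extra word in the transcript
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return 1 - prev[-1] / len(ref)


print(word_accuracy("the quick brown fox", "the quick brown box"))  # one wrong word out of four
```

This simple version ignores punctuation handling, and the score can drop below zero when the transcript contains many extra words.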

MP3's lossy compression does not noticeably hurt speech recognition at 64 kbps and above. There is no point converting your MP3 to WAV before transcribing: you are just creating a bigger file with the same audio inside.

MP3 versus M4A, WAV and OGG

All four formats are accepted by VocaText natively.

MP3 — Universal compatibility, lossy, small files. The default for podcasts and most online audio.
M4A (AAC) — Apple default; slightly better quality than MP3 at the same bitrate.
WAV — Uncompressed, lossless, large files (about 10 MB per minute of CD-quality stereo audio). Reach for it only when you plan to edit afterwards.
OGG (Vorbis) — Open-source equivalent of MP3; common in Linux and games.
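The WAV size quoted above follows directly from the recording settings. A minimal sketch of the arithmetic, assuming plain PCM audio at CD quality (44.1 kHz, 16-bit, stereo) and 1 MB = 1,000,000 bytes; the function name is illustrative:

```python
def wav_mb_per_minute(sample_rate_hz: int = 44_100, bit_depth: int = 16,
                      channels: int = 2) -> float:
    """Return the size of one minute of uncompressed PCM audio in MB."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second * 60 / 1_000_000   # assumption: 1 MB = 1,000,000 bytes


print(f"stereo: {wav_mb_per_minute():.1f} MB per minute")
print(f"mono:   {wav_mb_per_minute(channels=1):.1f} MB per minute")
```

At these settings a 25 MB limit holds only a little over two minutes of stereo WAV, which is why a compressed format is the practical choice for uploads.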

For pure transcription work, MP3 and M4A are functionally equivalent. Pick whichever format you already have.

FAQ

Does VocaText support MP3 natively?

Yes. MP3 is in the list of accepted formats alongside M4A, WAV, OGG and FLAC.

How do I know my MP3 file size?

Right-click the file and select Properties (Windows) or Get Info (macOS). The size in megabytes (MB) is what counts towards the 25 MB free limit.
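If you have a whole folder of recordings, checking sizes one by one gets tedious. Here is a small sketch that does the same check in code. The function name is hypothetical, and whether the limit counts 1 MB as 1,000,000 or 1,048,576 bytes is not stated, so the stricter decimal value is assumed.

```python
import os

FREE_LIMIT_BYTES = 25 * 1_000_000  # assumption: the limit counts 1 MB as 1,000,000 bytes


def fits_free_tier(path: str, limit_bytes: int = FREE_LIMIT_BYTES) -> bool:
    """Print the file size in MB and report whether it is within the limit."""
    size = os.path.getsize(path)
    print(f"{path}: {size / 1_000_000:.1f} MB")
    return size <= limit_bytes
```

Calling fits_free_tier("meeting.mp3") returns True when the file is small enough to upload as-is.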

Can I transcribe a podcast episode?

Yes, if the episode is under 25 MB. A full hour at 128 kbps is about 58 MB, and most hour-long podcast MP3s land somewhere between 30 and 80 MB depending on bitrate, so re-encode at 64 kbps (enough for about 50 minutes), split the file, or use Pro.

Does converting MP3 to WAV improve transcription accuracy?

No. The audio inside the WAV would still come from the lossy MP3 source. You only add file weight without recovering any information.

Why is there no sign-up?

The VocaText free tier is sign-up-free by design. Drop the file, get the text, done. An optional Pro tier exists for power users who need larger files, batches, and stored transcripts.