You finished a 38-minute interview on your phone. Now you need it as text you can search and quote from. Here's the workflow that gets you there in five minutes, no matter which iPhone you're holding.
We get this question constantly from journalists and qualitative researchers who do their best work on the phone they already carry. The answer is short. The details matter.
The fastest path from voice memo to text
If your iPhone runs iOS 18 or later, you may already have a transcript inside the Voice Memos app — Apple added an automatic transcription feature in iOS 18. Open the recording, tap the transcript icon, copy the text.
If you're on an older iOS, or the built-in transcript isn't accurate enough for what you're doing (a long interview, multiple speakers, accented speech, technical vocabulary), the better workflow is two steps: export the .m4a file, then run it through a dedicated transcription tool. We'll walk through both paths.
How do you export a Voice Memo from iPhone?
1. Open the Voice Memos app and tap the recording.
2. Tap the three-dot menu (•••) next to the recording.
3. Choose Share.
4. Pick how you want it off the device. AirDrop is fastest if you're at your Mac; otherwise email it to yourself or save to Files, iCloud Drive, or Dropbox.
The file you get is a .m4a. That format is fine for transcription. You don't need to convert it to MP3 or WAV first. For more on why some formats transcribe better than others, see our post "What's the best audio format for AI transcription?"
Tip: if Audio Quality is set to Lossless (Settings → Voice Memos → Audio Quality), files will be larger and may transcribe slightly more accurately. The default Compressed setting works fine for one clear speaker.
Does iPhone transcribe Voice Memos automatically?
In iOS 18 and later, yes. Apple added automatic transcription to Voice Memos as part of the iOS 18 update, and the transcript shows up below the waveform once the recording finishes processing.
On iOS 17 and earlier, no. There is no native transcript in Voice Memos. You'll need to export the file and use something else.
Apple's built-in transcription handles short, clean, single-speaker English well. It struggles with three things in particular:
- Multiple speakers. It produces one block of text with no idea who said what.
- Background noise. Coffee shops, cars, wind on the mic.
- Specialized vocabulary. Medical terms, legal jargon, brand names, proper nouns.
How accurate is iPhone's built-in transcription?
For a quiet, single-speaker memo of a few minutes, Apple's transcript is usable as-is — close to what you'd expect from any modern on-device speech-to-text. For anything resembling a real interview, expect to fix things.
We've written a whole post on this: "Transcription accuracy: what to expect." Short version: word error rate on clean studio audio is in the low single digits for the best systems, and it climbs fast as soon as you add a second speaker or background noise. On-device transcription tends to sit at the higher end of that range because the model is optimized to fit on your phone, not to match server-class accuracy.
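If "word error rate" is new to you, it is commonly computed as the number of word substitutions, deletions, and insertions needed to turn the machine transcript into a correct reference transcript, divided by the number of words in the reference. Here's a minimal Python sketch of that calculation. The function name and the sample sentences are made up for illustration, and real evaluations usually normalize punctuation and numbers more carefully than this does.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word is missing
                curr[j - 1] + 1,     # insertion: extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one drug name misheard as four words.
reference = "the patient was prescribed ten milligrams of atorvastatin daily"
hypothesis = "the patient was prescribed ten milligrams of a tour of statin daily"
print(f"WER: {word_error_rate(reference, hypothesis):.0%}")  # prints "WER: 44%"
```

One misheard term costs four errors against nine reference words here, which is why specialized vocabulary hurts accuracy far more than its share of the recording would suggest.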
If you need a transcript that doesn't require a heavy editing pass, run the file through a tool built for it.
How do you transcribe a long voice memo with multiple speakers?
The built-in Voice Memos transcript runs out of room here. There's no speaker separation, the transcript is hard to navigate, and long recordings are slow to scroll.
The workflow that works:
1. Export the recording from Voice Memos using the steps above.
2. Upload it to a transcription tool. You can transcribe a voice memo directly without creating an account.
3. Turn on speaker labels if your tool supports it. This is the difference between a wall of text and "Speaker 1… Speaker 2…" you can actually read.
4. Wait for processing. A 30-minute recording typically takes 1–3 minutes with modern AI tools.
5. Download the transcript as plain text, SRT, or VTT, depending on what you need it for.
This is the standard workflow journalists use to clear a backlog of phone-recorded interviews. We wrote a deeper look at how journalists use transcription tools if you want the full picture.
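If your tool only gives you timed segments and you need captions, the conversion to SRT is simple enough to do yourself. The sketch below assumes each segment is a (start seconds, end seconds, speaker, text) tuple. That shape and the function names are our own illustration, not any particular tool's export format.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from (start_s, end_s, speaker, text) tuples."""
    blocks = []
    for index, (start, end, speaker, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"                                          # cue number
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"  # time range
            f"{speaker}: {text}\n"                                # caption text
        )
    # Joining with "\n" leaves the blank line SRT requires between cues.
    return "\n".join(blocks)


# Hypothetical two-line excerpt from an interview.
segments = [
    (0.0, 4.2, "Speaker 1", "Thanks for making the time today."),
    (4.2, 9.75, "Speaker 2", "Happy to. Where should we start?"),
]
print(segments_to_srt(segments))
```

VTT is close to the same layout: the file starts with a `WEBVTT` header line and the timestamps use a period instead of a comma before the milliseconds.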
How do you turn the transcript into something useful?
A transcript by itself isn't the deliverable. It's the raw material. Once you have the text:
- Search it. Cmd-F your way to the quote you remember.
- Lift pull quotes straight into a draft, with attribution.
- Build an outline by skimming the topics the speaker covered.
- Translate it if your audience isn't English-first.
- Time-code it if you're editing video and need to cut to the right moment.
If you record on your phone every week, the value compounds. Every interview becomes a searchable document you can revisit a year later.
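Searching and time-coding go together once the transcript carries timestamps. As a rough sketch, assuming each segment is a (start seconds, speaker, text) tuple (a hypothetical shape chosen for this example), a few lines of Python will find every mention of a phrase and tell you where to scrub to in the recording:

```python
def find_quotes(segments, phrase):
    """Return (mm:ss, speaker, text) for every segment containing `phrase`."""
    needle = phrase.casefold()  # case-insensitive match
    hits = []
    for start_s, speaker, text in segments:
        if needle in text.casefold():
            minutes, seconds = divmod(int(start_s), 60)
            hits.append((f"{minutes:02d}:{seconds:02d}", speaker, text))
    return hits


# Hypothetical interview excerpt.
interview = [
    (12.0, "Speaker 1", "How did the budget change after the merger?"),
    (19.5, "Speaker 2", "The budget was cut by a third in the first year."),
    (754.0, "Speaker 2", "Hiring froze almost immediately."),
]
for stamp, speaker, text in find_quotes(interview, "budget"):
    print(f"[{stamp}] {speaker}: {text}")
```

Keeping the speaker label next to each hit means the quote arrives in your draft already attributed.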
When the built-in transcript is enough — and when it isn't
Use the iOS 18 Voice Memos transcript when:
- The recording is under 10 minutes.
- One person is speaking.
- The audio is quiet and the speaker is close to the mic.
- You only need a rough text version to find a quote.
Switch to a dedicated tool when:
- Multiple speakers are involved.
- The recording is more than 15–20 minutes.
- You need timestamps or speaker labels.
- The vocabulary is technical, legal, or medical.
- You want SRT/VTT for captions, or a clean copy you don't have to re-edit.
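The two checklists above boil down to a small decision rule. Here it is as a Python sketch. The field names are invented for the example, the 15-minute cutoff is our reading of "more than 15–20 minutes", and the "either" result for recordings that fit neither list cleanly is an assumption on our part: try the built-in transcript first and switch if it needs heavy editing.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    minutes: float
    speakers: int = 1
    quiet_audio: bool = True
    specialized_vocabulary: bool = False
    needs_timestamps_or_labels: bool = False
    needs_captions: bool = False
    ios_major_version: int = 18


def recommend_tool(rec: Recording) -> str:
    """Return "dedicated", "built-in", or "either" for a recording."""
    reasons_for_dedicated = (
        rec.ios_major_version < 18,      # no native transcript before iOS 18
        rec.speakers > 1,
        rec.minutes > 15,                # assumed cutoff within "15-20 minutes"
        rec.needs_timestamps_or_labels,
        rec.specialized_vocabulary,
        rec.needs_captions,
    )
    if any(reasons_for_dedicated):
        return "dedicated"
    if rec.minutes < 10 and rec.quiet_audio:
        return "built-in"
    return "either"  # gray zone: our assumption, not a rule from the lists


print(recommend_tool(Recording(minutes=38, speakers=2)))  # prints "dedicated"
print(recommend_tool(Recording(minutes=5)))               # prints "built-in"
```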
Sources
- Apple, iOS 18 — New features — https://www.apple.com/ios/ios-18/
- Apple, iPhone User Guide — Voice Memos — https://support.apple.com/guide/iphone/



