Two interviews, same conversation, same software, very different transcripts. The difference wasn't the speaker. It was the mic.
If the question is which mic produces the cleaner transcript, the short answer is: a properly placed lavalier on each speaker beats a single shared handheld almost every time. The longer answer depends on the room, the format, and whether you remembered to charge the transmitter.
- A lav on each speaker wins on signal-to-noise ratio — the input AI transcription cares about most.
- A handheld can match it if you nail the distance, but it only covers the person holding it.
- For a sit-down 1-on-1 in a quiet room, both work. Pick the lav for hands-free flow.
- For a panel or walking interview, give each speaker their own lav. Never share one mic.
- Room and distance matter more than mic brand. A bad room kills any mic.
Why mic choice actually changes your transcript
AI transcription systems work better on audio with a high signal-to-noise ratio: the speech is loud, the background is quiet. Word error rate climbs fast in noise. A clip-on lav sits four to six inches from a speaker's mouth, so the speech signal arrives loud, and ambient noise sits well below it. A handheld held at a normal six to twelve inches sounds great too, until the speaker drops their hand, turns their head, or the second speaker mumbles their answer twelve feet away.
If you want the deeper version of why noise hurts transcripts, see what word error rate actually measures.
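To put rough numbers on the distance effect, here is a minimal sketch. It assumes an idealized point source in a free field (the inverse-square law), which real rooms only approximate, and the function names are ours, for illustration only.

```python
import math


def level_change_db(near_inches: float, far_inches: float) -> float:
    """Drop in direct-sound level (dB) when the mic moves from near to far.

    Assumption: point source, no reflections (inverse-square law).
    Treat the result as a rough guide, not a measurement.
    """
    return 20.0 * math.log10(far_inches / near_inches)


def snr_db(speech_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in dB from two RMS amplitudes."""
    return 20.0 * math.log10(speech_rms / noise_rms)


# A lav at 6 inches versus a shared handheld 20 inches from the far speaker.
loss = level_change_db(6, 20)
print(f"Far speaker arrives about {loss:.1f} dB quieter")
```

The room noise doesn't get quieter just because the speaker is farther away, so under these assumptions that level drop comes more or less straight out of the signal-to-noise ratio.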
The lavalier case: where it wins and loses
Where it wins:
- Mic stays close to the mouth no matter how the speaker moves
- One mic per speaker means one channel per speaker, which makes diarization far easier
- No fighting over a shared mic, so speech overlap is much rarer
- Hands-free, so interview body language stays natural
Where it loses:
- Clothing rustle on a stiff jacket or a chunky necklace is murder on transcripts
- Cheap 2.4 GHz wireless lavs can drop out, and Whisper-class models can hallucinate text over the resulting silence (see Whisper hallucinations and how to fix them)
- A forgotten mute captures the entire bathroom break
- More gear to charge, sync, and check before recording
For any transcript-first recording where you control the setup, the lav-per-speaker route is the default.
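If your recorder writes each lav to its own channel of a single WAV file, splitting that file into per-speaker tracks is straightforward. The sketch below uses only the Python standard library and assumes 16-bit PCM audio; `split_channels` and `split_wav` are illustrative helper names, not part of any transcription tool.

```python
import array
import sys
import wave


def split_channels(interleaved: bytes, channels: int) -> list:
    """Deinterleave 16-bit PCM frames into one sample array per channel."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(interleaved)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if len(samples) % channels:
        raise ValueError("byte count is not a whole number of frames")
    # Frames are laid out [ch0, ch1, ..., ch0, ch1, ...], so stride by channel.
    return [samples[ch::channels] for ch in range(channels)]


def split_wav(path: str):
    """Return (per-channel sample arrays, sample rate) for a 16-bit WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
        return split_channels(frames, wav.getnchannels()), wav.getframerate()
```

Each returned array can then be written out or transcribed separately and labeled by speaker, which is what "being told who spoke" looks like in practice.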
The handheld case: where it wins and loses
A handheld dynamic, like a Shure SM58 (cardioid) or Beta 58A (supercardioid), is forgiving in noisy rooms because it rejects sound that arrives off-axis. The directional pickup pattern cuts the room the moment you aim the capsule at the speaker. For street interviews, loud venues, and any "we can't control the room" situation, a handheld in a steady hand gives the transcription engine a fighting chance.
Where it wins:
- Tight directional pickup rejects crowd, traffic, and HVAC noise
- One mic, one cable, no transmitter to forget
- Easy to share across many short subjects in a vox pop
- Physically tough: survives drops, weather, and impatient producers
Where it loses:
- A single shared handheld puts the far speaker twenty inches from the capsule, so their words ghost
- Newer hosts hold the mic at their chest, not their mouth, and levels collapse
- One channel means the transcription engine has to guess who spoke when, instead of being told
- No hands-free option, which kills the flow of a long interview
So which wins, by scenario
| Scenario | Recommended | Why |
|---|---|---|
| Sit-down 1-on-1, quiet room | Lavalier per speaker | Hands-free, high SNR, separate channels |
| Panel of three or more | Lavalier per speaker | A shared mic mangles diarization |
| Walking or outdoor interview | Lavalier with wind muff | Handhelds pick up wind and handling noise |
| Loud venue, crowd noise | Handheld cardioid | Off-axis rejection saves the transcript |
| Vox pop or street interviews | Handheld, held close | One reporter, many quick subjects |
| Existing platform call | Neither, use per-speaker tracks | See transcribing a Zoom recording with multiple speakers |
Pickup pattern matters more than the form factor
A common mistake: assuming "lav equals better" because it's close. An omnidirectional lav in a reverberant conference room can sound worse than a cardioid handheld in the same room, because the omni captures every reflection off the glass walls. Match the pattern to the environment, not the brand to the budget.
- Cardioid (most handhelds, some lavs): rejects sound from the rear, forgiving of imperfect aim
- Omnidirectional (most lavs, the broadcast standard): natural-sounding, picks up everything, unforgiving in bad rooms
- Supercardioid (some handhelds and shotguns): tight off-axis rejection, demands accurate aim
If you're recording in a room with hard walls, glass, or a low ceiling, lean cardioid. If you're in a treated studio or outdoors, omni is fine and sounds more open.
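For a feel of how much each pattern rejects, the textbook first-order model treats a mic's sensitivity at angle θ as a + (1 − a)·cos θ. The sketch below uses that idealized model. The coefficients are standard approximations (0.37 for supercardioid is a commonly quoted value), and real capsules vary with frequency, so read the output as ballpark figures only.

```python
import math

# Idealized first-order pattern coefficients (assumed textbook values).
PATTERNS = {
    "omnidirectional": 1.0,
    "cardioid": 0.5,
    "supercardioid": 0.37,
}


def off_axis_db(pattern: str, angle_deg: float) -> float:
    """Sensitivity relative to on-axis (dB) for sound arriving at angle_deg."""
    a = PATTERNS[pattern]
    gain = abs(a + (1.0 - a) * math.cos(math.radians(angle_deg)))
    if gain < 1e-9:
        return float("-inf")  # the pattern's null: ideally no pickup at all
    return 20.0 * math.log10(gain)


for name in PATTERNS:
    side = off_axis_db(name, 90)
    rear = off_axis_db(name, 180)
    print(f"{name:16s} side: {side:6.1f} dB   rear: {rear:6.1f} dB")
```

In this model the omni picks up the side and rear walls at full level, which is the reflections problem described above, while the cardioid's null points straight behind the capsule.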
Distance and gain still rule
The single biggest predictor of a clean transcript isn't the brand on the mic. It's how close the mic sits to the mouth and whether the input level peaks around -12 to -6 dBFS without clipping. A $40 lav placed correctly will beat a $400 shotgun aimed across the room. Run levels for thirty seconds before you start, every single time. There's a full pre-roll list in our interview recording checklist for clean transcripts.
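If you'd rather check levels on a test clip than trust the meter, here's a rough sketch. It assumes 16-bit signed samples (full scale of 32768), and the function names and the "any sample at the rail counts as clipping" rule are our own simplifications.

```python
import math

FULL_SCALE = 32768  # assumption: 16-bit signed PCM


def peak_dbfs(samples) -> float:
    """Peak level of a sequence of 16-bit samples, in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # pure digital silence
    return 20.0 * math.log10(peak / FULL_SCALE)


def check_level(samples, low: float = -12.0, high: float = -6.0) -> str:
    """Classify a test clip against the -12 to -6 dBFS peak target."""
    if any(abs(s) >= FULL_SCALE - 1 for s in samples):
        return "clipping"
    peak = peak_dbfs(samples)
    if peak > high:
        return "too hot"
    if peak < low:
        return "too quiet"
    return "ok"
```

Feed it the samples from your thirty-second level check and adjust gain until it reports "ok" on the loudest thing the speaker is likely to say.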
The verdict
Use a lavalier per speaker for any sit-down recording where you control the setup: interviews, podcasts, panels, depositions, oral histories. Use a handheld cardioid for noisy rooms, fieldwork, and anywhere a single host is moving between subjects. Don't share a single mic across speakers if you care about the transcript.
Once you have a clean recording, you can drop the file into a transcription tool and the difference shows up right away: fewer [inaudible] markers, sharper speaker labels, far less time spent cleaning up after the fact.



