The setup, fast

Windows doesn't ship dictation that reliably types into arbitrary apps. The built-in Win+H shortcut works in some Office apps but stumbles in the browser, in ChatGPT specifically, and in most modern web inputs. So you need a small helper that types your speech into whatever you're focused on.

The shortest path:

  1. Install a Windows dictation tool that injects text at the cursor. WinTranscribe is one option — install takes under a minute, free 14-day trial, no card. The same idea works with other tools; the key feature you want is "push-to-talk that types into any app", not "open a separate window and copy-paste".
  2. Open ChatGPT in your browser. Click into the prompt box so the cursor is blinking there.
  3. Press and hold the dictation hotkey (default Alt+Q in WinTranscribe). Speak your prompt the way you'd say it to a colleague. Release the hotkey.
  4. The text appears in the ChatGPT input. Press Enter.

That's it. The whole loop is a key-hold and a release. No window-switching, no clipboard dance. The text shows up at the cursor as if you'd typed it. And because it works in any text field, the same trick works in Claude, Gemini, Perplexity, Copilot, your inbox, Slack, anywhere you'd usually type.

Why your prompts get better when you speak them

Here's the part that's actually interesting, and the part most "dictate into ChatGPT" tutorials skip. Speaking your prompts produces noticeably better AI answers. Not because the AI is doing anything different — the model gets the same text either way — but because you write a different prompt when you speak it.

Compare what people typically type:

follow up email firm but polite

To what people typically say:

I'm writing a follow-up email to a client who hasn't replied to my proposal from two weeks ago. They were enthusiastic in our last call but went quiet after I sent the quote. I want to keep it warm — I don't want to lose the relationship — but I also need to know if they're moving forward or not. Maybe two short paragraphs, ending with a low-pressure question.

The first one took 5 seconds to type. The second one took under half a minute to say. The first one gets you a generic four-paragraph corporate email. The second one gets you the email you actually wanted to write.

It's not that you couldn't have typed the longer version. It's that you wouldn't have. Typing makes you cut corners. Every word costs a keypress, so you ration them, you condense, you skip the context that would make the answer specific. Speaking removes that friction. The full thing comes out the way you'd explain it to a human.
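To put rough numbers on that friction, here is a back-of-the-envelope sketch in Python. The rates are assumptions, not measurements: roughly 40 words per minute for typing and roughly 150 for conversational speech. Your own speeds will differ, but the gap is the point.

```python
# Rough estimate: how long does the long prompt take to type vs. to say?
# Both rates below are assumptions for illustration, not measurements.
TYPING_WPM = 40     # assumed typing speed, words per minute
SPEAKING_WPM = 150  # assumed speaking speed, words per minute

# The spoken prompt from the example above (punctuation lightly simplified).
LONG_PROMPT = (
    "I'm writing a follow-up email to a client who hasn't replied to my "
    "proposal from two weeks ago. They were enthusiastic in our last call "
    "but went quiet after I sent the quote. I want to keep it warm, I don't "
    "want to lose the relationship, but I also need to know if they're "
    "moving forward or not. Maybe two short paragraphs, ending with a "
    "low-pressure question."
)


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def seconds_at(words, words_per_minute):
    """Time in seconds to produce `words` at the given rate."""
    return words / words_per_minute * 60


words = word_count(LONG_PROMPT)
type_seconds = seconds_at(words, TYPING_WPM)
speak_seconds = seconds_at(words, SPEAKING_WPM)

print(f"{words} words: ~{type_seconds:.0f}s typed, ~{speak_seconds:.0f}s spoken")
```

At these assumed rates, saying the prompt is nearly four times faster than typing it. That difference is what decides whether the context makes it into the prompt at all.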

This is why people who dictate their AI prompts get better results — not by trying, just by accident. You'd give the AI better instructions if it weren't so much effort, and dictation makes the effort go away.

What to watch out for

Punctuation

Modern transcription models add punctuation automatically. Question marks land where they should, commas mostly behave, sentence breaks come out roughly right. You don't have to say "comma" or "period" out loud — that's old-school dictation software thinking. Just talk normally and trust the model.

Names and jargon

Brand names, colleague names, technical terms — these get spelled phonetically the first time. If you say "send this to Daan", you might get "Don" back. Most decent dictation tools let you add a custom dictionary; in WinTranscribe you add the word once and it sticks across every session, in every language.
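If you're curious what a custom dictionary does under the hood, one simple approach is a find-and-replace pass over the transcript before it gets typed. The sketch below is an illustration only, not how WinTranscribe or any other tool necessarily implements it; the function name and the example mapping are hypothetical.

```python
import re


def apply_custom_dictionary(text, corrections):
    """Replace misheard words with the user's preferred spelling.

    `corrections` maps the spelling the transcriber tends to produce
    to the spelling the user wants. Matching is case-sensitive and only
    hits whole words, so "Don" does not match inside "Don't" or "London".
    """
    if not corrections:
        return text
    # Longest entries first, so multi-word entries win over shorter ones.
    keys = sorted(corrections, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in keys)
    # Lookarounds treat letters, digits and apostrophes as part of a word.
    pattern = re.compile(r"(?<![\w'’])(?:" + alternatives + r")(?![\w'’])")
    return pattern.sub(lambda m: corrections[m.group(0)], text)


# Hypothetical dictionary entry: the transcriber hears "Don", you mean "Daan".
my_words = {"Don": "Daan"}
print(apply_custom_dictionary("Don't forget to send this to Don.", my_words))
```

A blunt replacement like this will also rewrite a real "Don" if you know one, so a dictionary works best for names and terms that have no common lookalike in your own vocabulary.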

Long prompts

For prompts longer than a couple of sentences, push-to-talk gets cramped, because your finger gets tired holding the key for two minutes. Switch to a toggle-style hotkey for those: hit Alt+R to start recording, hit it again to stop. The text gets typed at your cursor when you stop. Use toggle mode for "explain the full background of my situation" type prompts, and push-to-talk for the snappy "rewrite this in plain English" type prompts.
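The difference between the two hotkey styles is easiest to see as a tiny state machine. This is a hypothetical sketch of the logic, not any tool's actual code; the class name, the mode names and the returned "start"/"stop" signals are all made up for illustration.

```python
class DictationHotkey:
    """Turns raw key-down/key-up events into "start"/"stop" signals.

    mode="hold":   record while the key is held (push-to-talk).
    mode="toggle": first press starts recording, next press stops it.
    """

    def __init__(self, mode):
        if mode not in ("hold", "toggle"):
            raise ValueError("mode must be 'hold' or 'toggle'")
        self.mode = mode
        self.recording = False
        self._key_is_down = False

    def key_down(self):
        # Keyboards auto-repeat key-down while a key is held; ignore repeats.
        if self._key_is_down:
            return None
        self._key_is_down = True
        if self.mode == "hold":
            self.recording = True
            return "start"
        # Toggle mode: each fresh press flips the recording state.
        self.recording = not self.recording
        return "start" if self.recording else "stop"

    def key_up(self):
        self._key_is_down = False
        # Only push-to-talk stops on release; toggle keeps recording.
        if self.mode == "hold" and self.recording:
            self.recording = False
            return "stop"
        return None


hold = DictationHotkey("hold")
print(hold.key_down(), hold.key_up())        # press and release

toggle = DictationHotkey("toggle")
print(toggle.key_down(), toggle.key_up())    # first press and release
print(toggle.key_down(), toggle.key_up())    # second press and release
```

In toggle mode the release does nothing, which is exactly why it suits long prompts: your hand is free while you talk.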

Privacy

Your speech gets converted to text by a speech-to-text service. It's a brief request, and the audio doesn't get stored. With WinTranscribe specifically you can also bring your own Groq API key, which means the audio goes directly from your machine to your own Groq account and never touches our servers. If you're dictating sensitive content (legal, medical, client-confidential), this is the path you want.

Tools that work for this

A few options if you want to compare:

  • WinTranscribe — Windows-only, focuses on push-to-talk plus meeting recording. 14-day free trial, then €9.95/month managed or €49 lifetime if you bring your own keys.
  • SuperWhisper — started as Mac-only, now ships Windows too. Free tier with local speech-to-text. Worth comparing if you also work on a Mac.
  • Windows' built-in Win+H — free, no install, works fine in Word and OneNote. Stumbles in browser inputs and most modern web apps. Try it first if you mostly dictate inside Microsoft apps.

The best one for you is the one whose hotkey you'll actually press. Try the built-in option first — it costs nothing. If it doesn't work in the apps you live in (the browser is the usual deal-breaker), try one of the dedicated tools.

The point

If you're using ChatGPT a few times a day, dictating into it instead of typing saves you typing time. Fine. But the real win is the one you'll only notice after a week: your prompts get longer, your answers get sharper, and you stop treating the AI like a search engine. You start treating it like someone you can explain things to. That's where the actual value of these tools lives, and dictation is the cheapest way to unlock it.