Getting Started with HyperVoice: Voice to Text in Any Windows App

· HyperVoice Team

HyperVoice turns your voice into text inside any Windows application — VS Code, Slack, Chrome, Notion, you name it. Press a hotkey, speak, and the transcribed text appears at your cursor. No browser tab, no copy-paste, no context switch.

This guide walks you through installation, your first dictation, recording and output modes, GPU acceleration, and AI-powered post-processing.

Installation

  1. Download the installer from hypervoice.app. The download is a standard .exe installer, around 10 MB.
  2. Run the installer. Windows SmartScreen may show a “Windows protected your PC” warning because HyperVoice is a new application. Click More info, then Run anyway.
  3. Launch HyperVoice. The app starts minimized in your system tray (bottom-right of the taskbar).

On first launch you’ll be asked to enter your license key. Once activated, HyperVoice works fully offline — no internet connection required for local transcription.

Your First Dictation

  1. Click the HyperVoice tray icon to open the window, or use the default hotkey Ctrl + Shift + Space.
  2. When you see the recording indicator, start speaking naturally.
  3. Press the hotkey again (or click the stop button) to stop recording.
  4. HyperVoice transcribes your speech locally using AI and pastes the text at your cursor in whichever app was focused before you started recording.

The entire flow — hotkey, speak, stop, paste — takes just a second or two of overhead. Most of the time is your actual speech.

Recording and Output Modes

Recording Mode

HyperVoice supports two recording modes, configurable in Settings > General:

Output Mode

You can also choose how transcribed text is delivered:

Choosing an AI Model

HyperVoice includes 11 AI model sizes. On first launch you’ll be prompted to download one. You can manage models in Settings > Processing. For a detailed comparison, see our guide to choosing the right AI model.

ModelSizeBest For
Tiny / Base75–142 MBQuick notes, low-end hardware
Small466 MBEveryday dictation, good speed-accuracy balance
Medium1.5 GBHigh accuracy, works well on most GPUs
Large-v3 Turbo1.6 GBBest balance — near Large-v3 accuracy at Medium speed
Large-v2 / Large-v33.1 GBMaximum accuracy, dedicated GPU recommended

Most models also have an English-only variant (Tiny English, Base English, etc.) that can be slightly faster and more accurate if you only dictate in English.

GPU Acceleration

HyperVoice uses Vulkan for GPU-accelerated transcription, which works with NVIDIA, AMD, and Intel GPUs. This is enabled by default — if a compatible GPU is detected, you’ll see a green “GPU” badge in the app.

GPU acceleration can cut transcription time by 3–5x compared to CPU-only mode. If you have a dedicated GPU with at least 2 GB of VRAM, you’ll get near-instant results even with larger models. All processing happens on your device — learn more about our privacy-first architecture.

To check or toggle GPU mode, go to Settings > Audio > GPU Acceleration.

AI Post-Processing

Raw transcriptions are good, but sometimes you want polished output — proper punctuation, cleaned-up grammar, or a specific format. That’s where processing modes come in.

Built-in Modes

HyperVoice includes seven built-in processing modes:

Select a mode from the cards on the Record tab. When no mode is selected, your raw transcription is used as-is.

Setting Up a Provider

Processing modes require an AI provider. You have three options:

  1. Bring Your Own Key (BYOK) — Use your own API key from your preferred provider. Available to all license tiers. Set it up in Settings > Processing.
  2. HyperVoice Cloud — Our hosted processing service, available with a Pro subscription. No API key needed.
  3. None — Skip post-processing entirely. Your raw transcription is pasted as-is.

Custom Modes

Want something specific? Create your own mode in Settings > Processing. Write a system prompt that describes how the AI should transform your text, and it shows up as a card on the Record tab alongside the built-in modes. For example:

Dictionary Replacements

HyperVoice also supports dictionary replacements — custom term substitutions that are applied to every transcription. This is useful for correcting words that the AI consistently gets wrong, expanding abbreviations, or enforcing specific spellings of names and jargon.

Configuring the Hotkey

The default hotkey is Ctrl + Shift + Space. You can change it in Settings > General to any key combination that works for your workflow. At least one modifier key (Ctrl, Alt, Shift) is required.

What’s Next

You’re all set to start dictating. Here are some tips to get the most out of HyperVoice:

Have questions or feedback? Visit your dashboard or reach out at support@hypervoice.app.

← Back to all posts