Features Overview
Conjure packs a wide set of features into a single desktop app. This page gives a brief overview of each, with links to detailed guides.
Beta -- Platform support
Conjure is in beta. It runs on Windows and Linux. macOS support is experimental. Pre-built installers are coming soon.
Dictation
The core feature. Hold a hotkey, speak, and text appears wherever your cursor is. Conjure supports two modes:
- Batch mode -- record, process, paste. Best with post-processing styles.
- Streaming mode -- text appears in real time as you speak (~70 ms latency). Best with the Verbatim tone for raw dictation speed.
Dictation works in any application that accepts text input, including terminals, code editors, and chat apps.
Writing Styles
Every dictation can be processed through a writing style (also called a "tone") before the text is pasted. Built-in styles include:
- Polished -- natural, well-written text that preserves your voice
- Verbatim -- exactly what you said, no editing
- Email -- professional email with greeting, body, and sign-off
- Chat -- casual, concise messages for chat apps
- Formal -- polished professional register for documents
You can also create custom styles with your own prompt templates.
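As a rough illustration of what a custom style might look like, the sketch below fills a prompt template with a raw transcript. The field names, the `$transcript` placeholder, and the `build_prompt` helper are assumptions made for this example; Conjure's actual template format may differ.

```python
from string import Template

# Hypothetical custom style: the dictionary keys and the $transcript
# placeholder are illustrative, not Conjure's documented template format.
STANDUP_STYLE = {
    "name": "Standup update",
    "prompt": Template(
        "Rewrite the dictated text as a short standup update with "
        "'Yesterday', 'Today', and 'Blockers' headings.\n\n"
        "Dictated text:\n$transcript"
    ),
}


def build_prompt(style: dict, transcript: str) -> str:
    """Fill the style's prompt template with the trimmed raw transcript."""
    return style["prompt"].substitute(transcript=transcript.strip())


prompt = build_prompt(
    STANDUP_STYLE, "  fixed the login bug, starting on export today  "
)
```

The resulting string is what would be sent to the LLM in place of the raw dictation.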
Dictionary
A personal glossary and auto-correction system:
- Glossary entries -- vocabulary hints fed to Whisper for better recognition of jargon, names, and technical terms
- Replacement rules -- auto-correct patterns applied after transcription (e.g., replacing "conjur" with "Conjure")
- Auto-learn -- Conjure tracks your corrections and suggests new dictionary entries when it detects recurring patterns
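The replacement and auto-learn behavior described above can be sketched roughly as follows. This is a minimal illustration, not Conjure's implementation: the rule table, the whole-word case-insensitive matching, and the `min_count` threshold of 3 are all assumptions.

```python
import re
from collections import Counter

# Hypothetical replacement rules: misrecognized form -> corrected form.
REPLACEMENTS = {"conjur": "Conjure", "pie torch": "PyTorch"}


def apply_replacements(text, rules):
    """Apply whole-word, case-insensitive replacements to a transcript."""
    for wrong, right in rules.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement avoids backslash/group expansion in `right`.
        text = re.sub(pattern, lambda _m, r=right: r, text, flags=re.IGNORECASE)
    return text


def suggest_entries(corrections, min_count=3):
    """Return (wrong, right) pairs the user has corrected at least min_count times."""
    counts = Counter((wrong.lower(), right) for wrong, right in corrections)
    return [pair for pair, n in counts.items() if n >= min_count]
```

For instance, `apply_replacements("I use conjur with pie torch", REPLACEMENTS)` yields `"I use Conjure with PyTorch"`, while text that already contains "Conjure" is left untouched because the rule matches whole words only.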
Agent Mode
A voice-driven AI assistant that goes beyond dictation. Speak a request and the agent can:
- Paste text into your active application
- Run terminal commands
- Take screenshots and analyze them
- Read and write files
- Send keyboard shortcuts
- Get accessibility information from focused elements
Agent mode uses your configured LLM provider (OpenAI, Claude, Groq, etc.) and persists conversations across app restarts.
Text-to-Speech
Conjure can read text aloud using system voices or OpenAI TTS:
- Speak selected text -- highlight text in any app, press the TTS hotkey, and hear it read aloud
- Auto-speak -- optionally speak agent responses automatically
- Pill overlay -- the dictation pill turns purple while speaking; click it to stop
Voice Commands
37 built-in voice commands for hands-free formatting:
- Punctuation (29 commands): say "period", "comma", "question mark", "open quote", etc.
- Whitespace (4 commands): "new line", "new paragraph", "tab"
- Capitalization (4 commands): "cap" (next word), "all caps" / "caps on" / "caps off"
Voice commands are processed before dictionary replacements and work in both streaming and batch modes.
See the Voice Commands Reference
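To make the processing order concrete, here is a small sketch in which voice commands are expanded first and dictionary replacements run afterwards. The command tables cover only a handful of the built-in commands, and the regex-based matching is an assumption for illustration; a real implementation would also need to tell a spoken command apart from the same word used literally (for example "a period of time"), which this sketch does not attempt.

```python
import re

# Hypothetical subset of the built-in commands.
WHITESPACE = {"new paragraph": "\n\n", "new line": "\n"}
PUNCTUATION = {"period": ".", "comma": ",", "question mark": "?"}


def apply_voice_commands(text):
    """Expand spoken formatting commands into the characters they stand for."""
    for phrase, ws in WHITESPACE.items():
        pattern = r"\s*\b" + re.escape(phrase) + r"\b\s*"
        text = re.sub(pattern, lambda _m, w=ws: w, text, flags=re.IGNORECASE)
    for phrase, mark in PUNCTUATION.items():
        # Swallow the space before the command so the mark attaches to the word.
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, lambda _m, m=mark: m, text, flags=re.IGNORECASE)
    # "cap <word>" capitalizes the first letter of the next word.
    return re.sub(r"\bcap\s+(\w)", lambda m: m.group(1).upper(), text,
                  flags=re.IGNORECASE)


def process(transcript, rules):
    """Voice commands first, then dictionary replacements."""
    text = apply_voice_commands(transcript)
    for wrong, right in rules.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        text = re.sub(pattern, lambda _m, r=right: r, text, flags=re.IGNORECASE)
    return text


result = process("hello comma this is conjur period new line cap thanks",
                 {"conjur": "Conjure"})
# result == "hello, this is Conjure.\nThanks"
```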
Analytics
Track your dictation habits on the analytics page:
- Total words dictated, sessions completed, average words per minute
- Time saved estimates compared to typing
- Correction statistics from auto-learn
- Top corrections bar chart
- Per-API-key usage with cost estimates
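One plausible way to derive the speed and time-saved figures is shown below. The formulas and the 40 words-per-minute typing baseline are assumptions for illustration; the analytics page may compute its estimates differently.

```python
def words_per_minute(words, seconds):
    """Dictation speed for one session; 0.0 for an empty or zero-length session."""
    return words / (seconds / 60) if seconds > 0 else 0.0


def time_saved_minutes(words, dictation_seconds, typing_wpm=40.0):
    """Estimated minutes saved versus typing the same words.

    typing_wpm is an assumed baseline, not a value taken from Conjure.
    """
    typing_minutes = words / typing_wpm
    return typing_minutes - dictation_seconds / 60
```

Under these assumptions, dictating 300 words in 120 seconds works out to 150 words per minute and about 5.5 minutes saved.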
Audio Preprocessing
A signal processing pipeline applied to your audio before Whisper transcription:
- Noise gate -- zeroes out silent frames to reduce background noise
- High-pass filter -- removes low-frequency rumble (fans, AC, traffic)
- RMS normalization -- ensures consistent audio levels
Read the Audio Preprocessing Guide
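The three stages can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the frame size, gate threshold, 80 Hz cutoff, target level, first-order filter design, and the stage order (taken from the list above) are all chosen for the example and are not Conjure's actual parameters.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples in the range [-1, 1]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0


def noise_gate(samples, frame_size=160, threshold=0.01):
    """Zero out every frame whose RMS level falls below the threshold."""
    out = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        out.extend(frame if rms(frame) >= threshold else [0.0] * len(frame))
    return out


def high_pass(samples, sample_rate=16000, cutoff_hz=80.0):
    """First-order RC high-pass filter that attenuates low-frequency rumble."""
    if not samples:
        return []
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / sample_rate)
    out = [samples[0]]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def normalize_rms(samples, target=0.1):
    """Scale the signal to the target RMS level, clipping to [-1, 1]."""
    level = rms(samples)
    if level == 0:
        return list(samples)
    gain = target / level
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def preprocess(samples, sample_rate=16000):
    """Run the stages in the order listed above (an assumed order)."""
    return normalize_rms(high_pass(noise_gate(samples), sample_rate))
```

A production pipeline would typically use vectorized arrays and a steeper filter; the loop-based version here is only meant to show what each stage does to the signal.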
Multi-Device Support
Conjure supports remote dictation via TCP tunneling:
- Pair devices using a simple pairing code
- Dictate on one device, text appears on another
- Useful for dictating from a phone or tablet to your desktop
Keyboard Shortcuts
Hotkeys cover all major actions -- dictation, agent mode, style switching, TTS, and cancellation. All shortcuts are customizable in Settings.
See the Keyboard Shortcuts Reference
Transcription History
Every dictation is saved to a local SQLite database:
- Browse past transcriptions with timestamps and word counts
- Click any transcription to see the original audio, raw transcript, and post-processed result
- Inline word correction: click words to select them, then fix any errors; your corrections feed back into the dictionary
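A history store along these lines could be modeled with a single SQLite table. The schema below is hypothetical: the table name, column names, and helper functions are assumptions for illustration, and Conjure's actual database layout may differ.

```python
import sqlite3

# Hypothetical schema; not Conjure's actual table definition.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transcriptions (
    id          INTEGER PRIMARY KEY,
    created_at  TEXT NOT NULL DEFAULT (datetime('now')),
    audio_path  TEXT,
    raw_text    TEXT NOT NULL,
    final_text  TEXT NOT NULL,
    word_count  INTEGER NOT NULL
);
"""


def save_transcription(conn, raw_text, final_text, audio_path=None):
    """Store one dictation and return its row id."""
    cur = conn.execute(
        "INSERT INTO transcriptions (audio_path, raw_text, final_text, word_count) "
        "VALUES (?, ?, ?, ?)",
        (audio_path, raw_text, final_text, len(final_text.split())),
    )
    conn.commit()
    return cur.lastrowid


def recent(conn, limit=20):
    """Newest transcriptions first, with timestamp and word count."""
    return conn.execute(
        "SELECT id, created_at, word_count, final_text "
        "FROM transcriptions ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.executescript(SCHEMA)
row_id = save_transcription(conn, "hello comma world", "Hello, world")
```

Keeping both the raw transcript and the post-processed result in the same row is what makes it possible to show them side by side in a detail view.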
Data Portability
Export and import your data for backup or device transfer:
- Selective export with per-category checkboxes (profile, API keys, transcriptions, dictionary, tones, settings)
- ZIP format with JSON data and audio files
- Security warnings for sensitive data like API keys
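A selective export of this kind might be assembled as sketched below. The archive layout (one JSON file per category plus an `audio/` folder), the category keys, and the warning text are assumptions for this example rather than Conjure's documented export format.

```python
import io
import json
import zipfile

# Categories assumed to warrant a warning before export.
SENSITIVE = {"api_keys"}


def export_data(data, selected, audio_files=None):
    """Build a ZIP archive for the selected categories.

    Returns (zip_bytes, warnings). Audio is included only when
    transcriptions are part of the selection.
    """
    warnings = [f"'{c}' contains sensitive data" for c in sorted(selected & SENSITIVE)]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for category in sorted(selected):
            zf.writestr(f"{category}.json", json.dumps(data.get(category, {}), indent=2))
        if "transcriptions" in selected:
            for name, blob in (audio_files or {}).items():
                zf.writestr(f"audio/{name}", blob)
    return buf.getvalue(), warnings


archive, warnings = export_data(
    {"dictionary": {"conjur": "Conjure"}, "api_keys": {"openai": "example-key"},
     "transcriptions": []},
    {"dictionary", "api_keys", "transcriptions"},
    {"1.wav": b"RIFF"},
)
```

Returning the warnings alongside the archive lets the caller confirm with the user before anything is written to disk.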
