Introduction
Conjure is an open-source, AI-powered voice dictation desktop app. It runs on Windows and Linux, works fully offline, and lets you dictate text into any application at roughly four times your typing speed.
Beta
Conjure is currently in beta. It's functional and actively developed, but expect rough edges. Public installers are not yet available; see Getting Started for building from source.
What Conjure Does
Hold a hotkey, speak, and your words appear as text in whatever app is focused -- a code editor, email client, chat window, or terminal. Conjure handles the full pipeline:
- Capture your microphone audio while the hotkey is held
- Transcribe speech to text using local whisper.cpp or a cloud API
- Post-process the transcript with an LLM to apply a writing style (polish grammar, format as email, keep verbatim, etc.)
- Inject the final text into the active application via clipboard paste or direct character injection
Everything runs locally by default. Cloud providers are optional and always bring-your-own-key -- Conjure never routes your audio or text through its own servers.
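The four pipeline steps above can be sketched as a chain of async stages. This is a minimal illustration, not Conjure's actual code: the `Stages` interface, the `dictate` function, and the choice to skip injection for an empty transcript are all assumptions made for the example.

```typescript
// Hypothetical stage interfaces; none of these names come from Conjure's codebase.
type AudioClip = { samples: Float32Array; sampleRate: number };

interface Stages {
  // Record microphone audio while the hotkey is held.
  capture(): Promise<AudioClip>;
  // Turn audio into text (local whisper.cpp or a cloud API).
  transcribe(clip: AudioClip): Promise<string>;
  // Apply a writing style with an LLM.
  postProcess(text: string, style: string): Promise<string>;
  // Deliver text to the focused app (clipboard paste or character injection).
  inject(text: string): Promise<void>;
}

// Runs the stages in order and returns the injected text,
// or null when the transcript is empty.
async function dictate(stages: Stages, style: string): Promise<string | null> {
  const clip = await stages.capture();
  const raw = (await stages.transcribe(clip)).trim();
  // Assumption of this sketch: an empty transcript means nothing is injected.
  if (raw.length === 0) return null;
  const styled = await stages.postProcess(raw, style);
  await stages.inject(styled);
  return styled;
}
```

Keeping each stage behind an interface like this is one way to let a local transcriber and a cloud one be swapped without touching the rest of the flow.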
Who It's For
- Developers who want to dictate into terminals, code editors, and AI tools without reaching for the keyboard
- Writers who think faster than they type and want clean, styled output from speech
- Professionals who compose emails, messages, and documents throughout the day
- Anyone with RSI, accessibility needs, or a preference for voice input
Key Principles
Local-first. The app ships with whisper.cpp for on-device transcription. No account, no internet connection, no API key required for basic dictation.
Bring your own key. For cloud transcription or AI post-processing, you add your own API keys from 13+ providers. You control the costs and the data.
Open source. Conjure is AGPLv3-licensed and forked from Voquill. The full source is available at github.com/Codename-11/conjure, is self-hostable, and is community-driven. Contributions are welcome.
Privacy by default. Audio is processed locally or sent directly to the provider you choose. Conjure has no telemetry, no analytics servers, and no data collection.
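The local-first and bring-your-own-key principles suggest a simple selection rule: use a cloud provider only when the user has asked for one and supplied a key for it, and otherwise stay on-device. The sketch below illustrates that rule. The `ProviderSettings` shape, the `resolveTranscriber` function, and the silent fallback to local transcription are assumptions for the example, not Conjure's real settings model.

```typescript
// Hypothetical settings shape; field names and provider ids are illustrative only.
interface ProviderSettings {
  preferred: "local" | "cloud";
  cloudProvider?: string;          // provider id chosen by the user
  apiKeys: Record<string, string>; // user-supplied keys, keyed by provider id
}

type ResolvedTranscriber =
  | { kind: "local" }
  | { kind: "cloud"; provider: string; apiKey: string };

// Picks where transcription runs. Cloud is used only when it is both
// requested and backed by a user-supplied key.
function resolveTranscriber(settings: ProviderSettings): ResolvedTranscriber {
  if (settings.preferred === "cloud" && settings.cloudProvider) {
    const apiKey = settings.apiKeys[settings.cloudProvider];
    if (apiKey) {
      return { kind: "cloud", provider: settings.cloudProvider, apiKey };
    }
  }
  // Assumption of this sketch: fall back to on-device transcription
  // rather than failing when no key is configured.
  return { kind: "local" };
}
```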
How It Works
Conjure is a Tauri v2 desktop app -- a Rust backend paired with a React/TypeScript frontend. The transcription engine runs as a standalone sidecar process (whisper.cpp, called from Rust through the whisper-rs bindings) and communicates with the app over a local HTTP API.
The architecture separates concerns cleanly:
- Rust handles system integration: audio capture, hotkey listening, text injection, database, and the whisper sidecar
- TypeScript handles all business logic: state management (Zustand), AI provider integrations, dictation strategy, voice commands, and the UI
- SQLite stores transcription history, user preferences, dictionary entries, and conversation data locally
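To make the sidecar boundary concrete, here is a rough sketch of how TypeScript code might call a transcription process over local HTTP. Everything specific in it is hypothetical: the port, the `/transcribe` path, the JSON request and response shapes, and the `transcribeViaSidecar` name are placeholders, since this page does not document the real sidecar API. The fetch function is passed in so the sketch can run against a stub.

```typescript
// Structural subset of the Fetch API, so a stub can stand in for a real sidecar.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// HYPOTHETICAL: the port, path, and payload shapes are placeholders,
// not Conjure's actual sidecar API.
const SIDECAR_URL = "http://127.0.0.1:8080";

async function transcribeViaSidecar(
  fetchImpl: FetchLike,
  audioBase64: string,
): Promise<string> {
  const res = await fetchImpl(`${SIDECAR_URL}/transcribe`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ audio: audioBase64 }),
  });
  if (!res.ok) {
    throw new Error(`sidecar returned HTTP ${res.status}`);
  }
  // Validate the response instead of trusting its shape.
  const data = (await res.json()) as { text?: unknown } | null;
  const text = data?.text;
  if (typeof text !== "string") {
    throw new Error("sidecar response had no text field");
  }
  return text;
}
```

Because the request targets the loopback address, audio in a setup like this would stay on the machine, which matches the local-first principle described earlier.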
Next Steps
- Getting Started -- install Conjure and make your first dictation
- Features Overview -- see everything Conjure can do
- API Keys & Providers -- set up cloud providers for better accuracy, or find free alternatives
