Introduction

Conjure is an open-source, AI-powered voice dictation desktop app. It runs on Windows and Linux, works fully offline, and lets you dictate text into any application roughly four times faster than typing.

Beta

Conjure is currently in beta. It's functional and actively developed, but expect rough edges. Public installers are not yet available; see Getting Started for instructions on building from source.

What Conjure Does

Hold a hotkey, speak, and your words appear as text in whatever app is focused -- a code editor, email client, chat window, or terminal. Conjure handles the full pipeline:

  1. Capture your microphone audio while the hotkey is held
  2. Transcribe speech to text using local whisper.cpp or a cloud API
  3. Post-process the transcript with an LLM to apply a writing style (polish grammar, format as email, keep verbatim, etc.)
  4. Inject the final text into the active application via clipboard paste or direct character injection
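
The four steps above compose naturally into a single asynchronous function. The sketch below is illustrative only: the names `DictationStages` and `runDictation` are hypothetical and not taken from Conjure's source, and each stage is passed in so the flow can be read without a real microphone, transcription engine, or LLM.

```typescript
// Hypothetical stage signatures for illustration; not Conjure's actual API.
interface DictationStages {
  // Step 1: resolve with the audio recorded while the hotkey was held.
  capture: () => Promise<Float32Array>;
  // Step 2: speech to text (local whisper.cpp or a cloud API).
  transcribe: (audio: Float32Array) => Promise<string>;
  // Step 3: LLM post-processing that applies the chosen writing style.
  postProcess: (transcript: string, style: string) => Promise<string>;
  // Step 4: deliver the text to the focused app (clipboard paste or character injection).
  inject: (text: string) => Promise<void>;
}

// Runs the stages in order and returns the text that was injected.
async function runDictation(stages: DictationStages, style: string): Promise<string> {
  const audio = await stages.capture();
  const transcript = await stages.transcribe(audio);
  const finalText = await stages.postProcess(transcript, style);
  await stages.inject(finalText);
  return finalText;
}

// Stub stages standing in for the real implementations.
const injected: string[] = [];
const stubStages: DictationStages = {
  capture: async () => new Float32Array(16000), // one second of silence at an assumed 16 kHz
  transcribe: async () => "hello world",
  postProcess: async (t, style) =>
    style === "verbatim" ? t : t.charAt(0).toUpperCase() + t.slice(1) + ".",
  inject: async (text) => {
    injected.push(text);
  },
};
```

Calling `runDictation(stubStages, "polish")` walks through all four stages with the stubs. Swapping a stub for a real implementation would not change the orchestration.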

Everything runs locally by default. Cloud providers are optional and always bring-your-own-key -- Conjure never routes your audio or text through its own servers.

Who It's For

  • Developers who want to dictate into terminals, code editors, and AI tools without reaching for the keyboard
  • Writers who think faster than they type and want clean, styled output from speech
  • Professionals who compose emails, messages, and documents throughout the day
  • Anyone with RSI, accessibility needs, or a preference for voice input

Key Principles

Local-first. The app ships with whisper.cpp for on-device transcription. No account, no internet connection, no API key required for basic dictation.

Bring your own key. For cloud transcription or AI post-processing, you supply your own API key for any of the 13+ supported providers. You control the costs and the data.

Open source. Conjure is AGPLv3-licensed and forked from Voquill. The full source is available, the app is self-hostable, and development is community-driven.

Open source

Conjure is AGPLv3 licensed. The full source code is available at github.com/Codename-11/conjure. Contributions are welcome.

Privacy by default. Audio is processed locally or sent directly to the provider you choose. Conjure has no telemetry, no analytics servers, and no data collection.
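
Taken together, the local-first and bring-your-own-key principles suggest a simple selection rule: use on-device transcription unless the user has explicitly chosen a cloud provider and supplied a key for it. A minimal sketch of that rule follows. The settings shape, the `resolveBackend` name, and the fallback-to-local behavior are assumptions for illustration, not Conjure's real configuration schema.

```typescript
// Hypothetical settings shape; Conjure's real schema may differ.
interface TranscriptionSettings {
  provider: string;                // "local" means on-device whisper.cpp
  apiKeys: Record<string, string>; // user-supplied keys, indexed by provider id
}

type Backend =
  | { kind: "local" }
  | { kind: "cloud"; provider: string; apiKey: string };

function resolveBackend(settings: TranscriptionSettings): Backend {
  if (settings.provider === "local") return { kind: "local" };
  const apiKey = settings.apiKeys[settings.provider];
  // Assumption: with no key for the chosen provider, fall back to local
  // transcription instead of failing, so basic dictation keeps working.
  if (!apiKey) return { kind: "local" };
  return { kind: "cloud", provider: settings.provider, apiKey };
}
```

Because the key travels with the resolved backend, a request built from it can go straight to the chosen provider without passing through any intermediate server.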

How It Works

Conjure is a Tauri v2 desktop app -- a Rust backend paired with a React/TypeScript frontend. The transcription engine runs as a standalone sidecar process (whisper.cpp, called from Rust through the whisper-rs bindings) and communicates with the app over a local HTTP API.
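
To make the sidecar boundary concrete, here is a sketch of how a client might describe a transcription request to a local HTTP endpoint. The port, the `/transcribe` route, and the WAV payload are all assumptions chosen for illustration; the sidecar's actual API is not specified here.

```typescript
// Everything below is hypothetical: the sidecar's real port, route, and
// payload format are not documented in this introduction.
interface SidecarRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: Uint8Array;
}

function buildTranscribeRequest(port: number, wavBytes: Uint8Array): SidecarRequest {
  return {
    // Loopback address only: the sidecar is a process on the same machine.
    url: `http://127.0.0.1:${port}/transcribe`,
    method: "POST",
    headers: { "Content-Type": "audio/wav" },
    body: wavBytes,
  };
}
```

Keeping request construction separate from sending makes the boundary easy to test; the resulting object could then be passed to `fetch` or any other HTTP client.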

The architecture separates concerns cleanly:

  • Rust handles system integration: audio capture, hotkey listening, text injection, database, and the whisper sidecar
  • TypeScript handles all business logic: state management (Zustand), AI provider integrations, dictation strategy, voice commands, and the UI
  • SQLite stores transcription history, user preferences, dictionary entries, and conversation data locally
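
One way to picture this split is a thin, typed wrapper on the TypeScript side around calls into Rust. In a Tauri v2 app such calls normally go through `invoke` from `@tauri-apps/api/core`; the sketch below instead takes an `invoke`-shaped function as a parameter so that it stays self-contained. The command names (`inject_text`, `save_transcription`) and the `finishDictation` helper are hypothetical and not taken from Conjure's source.

```typescript
// Same call shape as Tauri's invoke(cmd, args); passed in so the sketch runs without Tauri.
type Invoke = <T>(cmd: string, args?: Record<string, unknown>) => Promise<T>;

// Business logic stays in TypeScript; system work is delegated to
// (hypothetical) Rust commands.
async function finishDictation(invoke: Invoke, text: string): Promise<number> {
  const trimmed = text.trim();
  if (trimmed.length === 0) throw new Error("nothing to inject");
  // Rust side: paste or type the text into the focused application.
  await invoke<void>("inject_text", { text: trimmed });
  // Rust side: persist to the local SQLite history and return the new row id.
  return invoke<number>("save_transcription", { text: trimmed });
}
```

Injecting the `invoke` function also means the TypeScript logic can be unit-tested with a fake, without launching the desktop shell.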

Released under the AGPLv3 License.