Murmur is a desktop voice transcription app that runs entirely on your machine. https://dikkadev.github.io/murmur/



README.md

Murmur

Local voice-to-text for Windows. Hold a key, talk, let go — your words land wherever you're typing.

Everything runs on your machine. No cloud, no account, no sending audio anywhere. Murmur sits in your system tray and gives you a global hotkey that turns speech into text in any app — your editor, your browser, a chat window, whatever has focus.

How It Works

graph LR
    A["🎤 Hold hotkey"] --> B["🎙️ Mic capture"]
    B --> C["📡 WebSocket"]
    C --> D["🧠 Transcription engine"]
    D --> E["💬 Partials stream to overlay"]
    E --> F["📋 Release → clipboard + paste"]

Audio flows from your mic through an AudioWorklet, gets sent as 16-bit PCM over a local WebSocket, and hits the transcription engine running on your GPU (or CPU). Partials stream back in real time, so you see words forming as you speak. When you release the key, the final transcription lands in your clipboard and gets pasted automatically.
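
To make the "16-bit PCM" step concrete, here is a minimal sketch of the encoding in Python. The app does this on the capture side (in the AudioWorklet path), so this is only an illustration of the sample format, not Murmur's code. The function names are hypothetical, and little-endian byte order is an assumption; docs/protocol.md is the authority on the wire format.

```python
import struct


def float_to_pcm16(samples):
    """Encode float samples in [-1.0, 1.0] as signed 16-bit PCM bytes.

    Assumption: little-endian byte order ("<" in the struct format).
    """
    ints = []
    for sample in samples:
        clamped = max(-1.0, min(1.0, sample))  # clamp so the value fits in int16
        ints.append(int(round(clamped * 32767)))
    return struct.pack(f"<{len(ints)}h", *ints)


def pcm16_to_float(data):
    """Decode signed 16-bit little-endian PCM bytes back to floats."""
    count = len(data) // 2  # two bytes per sample; ignore a trailing odd byte
    values = struct.unpack(f"<{count}h", data[: count * 2])
    return [value / 32767 for value in values]
```

Each sample takes two bytes, so a chunk of N samples becomes a 2N-byte binary payload.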

The overlay is a transparent, always-on-top, click-through window — it shows up when you're recording and gets out of the way when you're not.

Requirements

Windows 10/11
Bun
Python 3.11+
uv
just
CUDA-capable GPU recommended (driver 525+; CPU is supported but slower)

Features

Hold-to-talk or toggle mode — bind any key as your global hotkey
Transparent overlay — live waveform and partial transcription while you speak
Two startup-selectable engines — Nemotron by default, Whisper for multilingual use
Auto-paste — transcribed text goes straight to your clipboard and into the active field
Post-processing — auto-append periods, spaces, or both
Searchable history — every transcription saved locally in SQLite
In-app server controls — start, stop, restart, stream logs, all from the settings panel
External server mode — point Murmur at a remote server if you want
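
As a rough idea of what the post-processing option does, here is a small Python sketch. The function name and flags are hypothetical, and skipping the period when the text already ends in punctuation is an assumption, not confirmed behavior of the app.

```python
def post_process(text, append_period=False, append_space=False):
    """Apply optional trailing period and/or space to a transcription.

    Assumption: no period is added when the text already ends in ., ! or ?
    """
    result = text.strip()
    if not result:
        return result  # nothing was transcribed, so nothing to append
    if append_period and result[-1] not in ".!?":
        result += "."
    if append_space:
        result += " "  # lets the next dictation continue without a manual space
    return result
```

With both options on, "hello world" becomes "hello world. ", which is handy when dictating several sentences in a row.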

Engines

Murmur ships with two transcription engines. Both run locally; choose the engine when starting the server with MURMUR_ENGINE or server/settings.json.

	Nemotron	Whisper
Model	`nvidia/nemotron-speech-streaming-en-0.6b`	`large-v3-turbo` (via faster-whisper)
Best for	English dictation, low latency	Multilingual, accuracy
Streaming	Native streaming architecture	Chunked re-transcription
Extras	—	Hotword boosting
VRAM	~1.5 GB	~3 GB

Nemotron is the default. It's a 0.6B-parameter model built for streaming: partials come back fast and the final result is usually identical to the last partial. Start the server with Whisper for non-English languages or when you need hotword support to nail domain-specific terms.
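
The rule of thumb above can be written down as a tiny helper. This is only a sketch of the decision described in this section; `choose_engine` and its parameters are hypothetical and not part of Murmur.

```python
def choose_engine(language="en", needs_hotwords=False):
    """Return a value suitable for MURMUR_ENGINE based on the table above.

    Hypothetical helper: Nemotron for plain English dictation,
    Whisper for other languages or when hotword boosting is needed.
    """
    base_language = language.lower().split("-")[0]  # "en-US" -> "en"
    if needs_hotwords or base_language != "en":
        return "whisper"
    return "nemotron"
```

For example, `choose_engine("de")` returns "whisper", which you would then pass to the server through MURMUR_ENGINE or server/settings.json.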

Quick Start

You need Windows 10/11 with Bun, Python 3.11+, uv, and just. A CUDA GPU is recommended but not required.

# Server
cd server
uv sync --extra all    # or: --extra nemotron / --extra whisper
just start

# Start with an explicit engine
pwsh -NoProfile -Command "$env:MURMUR_ENGINE='nemotron'; uv run python -m main"
pwsh -NoProfile -Command "$env:MURMUR_ENGINE='whisper'; uv run python -m main"

# App (separate terminal)
cd app
bun install
bun run dev

The app auto-detects a running server in dev mode. In production, it manages the server lifecycle itself.
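
If you want to check by hand whether a server is already up, a plain TCP probe is enough. This is a sketch under assumptions: it only tests that something accepts connections on the default port, and it is not necessarily how the app performs its own detection.

```python
import socket


def server_is_listening(host="127.0.0.1", port=51717, timeout=0.5):
    """Return True if something accepts TCP connections on host:port.

    Assumption: a successful connect is treated as "server running";
    the probe does not speak the Murmur WebSocket protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False
```

Call `server_is_listening()` after `just start`; it should return True once the server has bound its port.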

Note

If you develop from WSL, run all uv/bun/just commands through PowerShell, not the Linux shell. Running them from WSL replaces Windows binaries with Linux ones and breaks the Windows setup. See BUILDING.md.

Build

cd server && uv sync --extra all
cd ../app && bun run package:win

bun run package:win produces a small nsis-web installer stub plus payloads (for example .7z, .yml, .blockmap) in app/release/. End users need an internet connection to install (the payload is downloaded during setup) and to fetch models on first run.

Root-level helper:

just build

See BUILDING.md for full release and troubleshooting details.

Configuration

App settings (hotkey, audio device, post-processing, auto-paste) are configured through the UI.

Server settings use MURMUR_-prefixed environment variables or server/settings.json. Engine-related settings are read at server startup.

Server environment variables

Variable	Default	Description
`MURMUR_HOST`	`0.0.0.0`	Bind host
`MURMUR_PORT`	`51717`	Bind port
`MURMUR_ENGINE`	`nemotron`	Default engine (`nemotron` / `whisper`)
`MURMUR_NEMOTRON_MODEL`	`nvidia/nemotron-speech-streaming-en-0.6b`	Nemotron model
`MURMUR_NEMOTRON_DEVICE`	`auto`	Device (`auto`/`cuda`/`cpu`)
`MURMUR_WHISPER_MODEL`	`large-v3-turbo`	Whisper model
`MURMUR_WHISPER_DEVICE`	`auto`	Device (`auto`/`cuda`/`cpu`)
`MURMUR_WHISPER_COMPUTE_TYPE`	`auto`	Whisper precision mode
`MURMUR_MAX_SESSIONS`	`10`	Concurrent session cap
`MURMUR_LOG_LEVEL`	`INFO`	`DEBUG`/`INFO`/`WARNING`/`ERROR`
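
To show how the table maps to values at startup, here is a sketch of a settings loader in Python. It is not the server's actual code. The defaults are copied from the table; the lowercase JSON key names and the precedence (environment variables override server/settings.json) are assumptions.

```python
import json
import os
from pathlib import Path

# Defaults copied from the table above.
DEFAULTS = {
    "host": "0.0.0.0",
    "port": 51717,
    "engine": "nemotron",
    "nemotron_model": "nvidia/nemotron-speech-streaming-en-0.6b",
    "nemotron_device": "auto",
    "whisper_model": "large-v3-turbo",
    "whisper_device": "auto",
    "whisper_compute_type": "auto",
    "max_sessions": 10,
    "log_level": "INFO",
}


def load_settings(env=None, settings_path=None):
    """Merge defaults, an optional JSON file, and MURMUR_* variables.

    Assumptions: JSON keys match the DEFAULTS keys, and environment
    variables take precedence over the file.
    """
    env = os.environ if env is None else env
    settings = dict(DEFAULTS)
    if settings_path is not None and Path(settings_path).is_file():
        file_values = json.loads(Path(settings_path).read_text(encoding="utf-8"))
        settings.update({k: v for k, v in file_values.items() if k in DEFAULTS})
    for key, default in DEFAULTS.items():
        raw = env.get("MURMUR_" + key.upper())
        if raw is not None:
            # Environment values are strings; convert the integer settings.
            settings[key] = int(raw) if isinstance(default, int) else raw
    return settings
```

Passing `env={"MURMUR_ENGINE": "whisper"}` yields the defaults with only the engine changed, which mirrors starting the server with an explicit engine.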

Project Structure

app/      Electron desktop app (Svelte 5, TypeScript, Tailwind v4)
server/   Transcription server (FastAPI, faster-whisper, NeMo)
docs/     Protocol spec and technical docs

Protocol

The app and server communicate over a custom WebSocket protocol on port 51717 — binary frames for audio, JSON frames for control and text. Full spec: docs/protocol.md
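
The split between binary and JSON frames can be sketched as a small dispatcher. This is an illustration only: the function name is hypothetical, the little-endian 16-bit sample layout is an assumption, and any JSON field names you see in examples here are made up. docs/protocol.md defines the real messages.

```python
import json
import struct


def decode_frame(frame):
    """Classify a WebSocket frame as audio (binary) or JSON (text).

    Assumptions: binary frames carry little-endian signed 16-bit PCM,
    and text frames carry a single JSON document.
    """
    if isinstance(frame, (bytes, bytearray)):
        count = len(frame) // 2  # two bytes per sample
        samples = struct.unpack(f"<{count}h", bytes(frame[: count * 2]))
        return "audio", list(samples)
    return "json", json.loads(frame)
```

For instance, `decode_frame(b"\x01\x00\xff\xff")` gives `("audio", [1, -1])`, while a text frame such as `'{"type": "partial", "text": "hi"}'` (hypothetical fields) comes back as a parsed dictionary.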

License

MIT