Murmur is a desktop voice transcription app that runs entirely on your machine. https://dikkadev.github.io/murmur/

Murmur

Local voice-to-text for Windows. Hold a key, talk, let go — your words land wherever you're typing.

v0.2.0 alpha


Everything runs on your machine. No cloud, no account, no sending audio anywhere. Murmur sits in your system tray and gives you a global hotkey that turns speech into text in any app — your editor, your browser, a chat window, whatever has focus.

How It Works

graph LR
    A["🎤 Hold hotkey"] --> B["🎙️ Mic capture"]
    B --> C["📡 WebSocket"]
    C --> D["🧠 Transcription engine"]
    D --> E["💬 Partials stream to overlay"]
    E --> F["📋 Release → clipboard + paste"]

Audio flows from your mic through an AudioWorklet, is sent as 16-bit PCM over a local WebSocket, and reaches the transcription engine running on your GPU (or CPU). Partials stream back in real time, so you see words forming as you speak. When you release the key, the final transcription lands in your clipboard and is pasted automatically.
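As a rough illustration of the "16-bit PCM" step, here is a minimal Python sketch of converting floating-point samples into PCM bytes. The function name, the little-endian byte order, and the scaling factor are assumptions made for this example; the README does not specify them, and the real conversion happens in the app's AudioWorklet code.

```python
import struct


def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to 16-bit PCM bytes.

    Assumption for this sketch: little-endian signed integers.
    """
    ints = []
    for sample in samples:
        clamped = max(-1.0, min(1.0, sample))      # guard against out-of-range input
        ints.append(int(round(clamped * 32767)))   # scale to the signed 16-bit range
    return struct.pack(f"<{len(ints)}h", *ints)


# Four samples become eight bytes (two bytes per sample).
frame = float_to_pcm16([0.0, 1.0, -1.0, 2.0])
print(len(frame))
```

Each sample takes two bytes, so the payload size is easy to predict from the number of samples sent.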

The overlay is a transparent, always-on-top, click-through window — it shows up when you're recording and gets out of the way when you're not.

Requirements

  • Windows 10/11

  • Bun

  • Python 3.11+

  • uv

  • just

  • CUDA-capable GPU recommended (driver 525+; CPU is supported but slower)

Features

  • Hold-to-talk or toggle mode — bind any key as your global hotkey

  • Transparent overlay — live waveform and partial transcription while you speak

  • Two startup-selectable engines — Nemotron by default, Whisper for multilingual use

  • Auto-paste — transcribed text goes straight to your clipboard and into the active field

  • Post-processing — auto-append periods, spaces, or both

  • Searchable history — every transcription saved locally in SQLite

  • In-app server controls — start, stop, restart, stream logs, all from the settings panel

  • External server mode — point Murmur at a remote server if you want
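The post-processing option above (auto-append periods, spaces, or both) could look roughly like the following Python sketch. The function and parameter names are hypothetical, and the rule for skipping text that already ends in punctuation is an assumption; the app's actual behavior may differ.

```python
def post_process(text, append_period=False, append_space=False):
    """Optionally append a period and/or a trailing space to a transcription.

    Hypothetical helper for illustration, not Murmur's real implementation.
    """
    result = text.strip()
    if not result:
        return result                      # nothing to decorate
    if append_period and result[-1] not in ".!?":
        result += "."                      # assumed: skip if already punctuated
    if append_space:
        result += " "                      # lets the next dictation continue cleanly
    return result


print(repr(post_process("hello world", append_period=True, append_space=True)))
```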

Engines

Murmur ships with two transcription engines. Both run locally; choose the engine when starting the server with MURMUR_ENGINE or server/settings.json.

| | Nemotron | Whisper |
|---|---|---|
| Model | nvidia/nemotron-speech-streaming-en-0.6b | large-v3-turbo (via faster-whisper) |
| Best for | English dictation, low latency | Multilingual, accuracy |
| Streaming | Native streaming architecture | Chunked re-transcription |
| Extras | | Hotword boosting |
| VRAM | ~1.5 GB | ~3 GB |

Nemotron is the default. It's a 0.6B-parameter model built for streaming: partials come back fast, and the final result is usually identical to the last partial. Start the server with Whisper for non-English languages or when you need hotword support to nail domain-specific terms.

Quick Start

You need Windows 10/11 with Bun, Python 3.11+, uv, and just. A CUDA GPU is recommended but not required.

# Server
cd server
uv sync --extra all    # or: --extra nemotron / --extra whisper
just start

# Start with an explicit engine
pwsh -NoProfile -Command "$env:MURMUR_ENGINE='nemotron'; uv run python -m main"
pwsh -NoProfile -Command "$env:MURMUR_ENGINE='whisper'; uv run python -m main"

# App (separate terminal)
cd app
bun install
bun run dev

The app auto-detects a running server in dev mode. In production, it manages the server lifecycle itself.
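The README does not say how detection works. One simple way to check whether something is listening on the server port is a short TCP probe, sketched here in Python. The function name and timeout are assumptions for this example; the port default comes from the configuration table.

```python
import socket


def server_is_running(host="127.0.0.1", port=51717, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds.

    Illustrative probe only; it does not verify that the listener
    is actually a Murmur server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_running())
```

A real check would likely also confirm the listener speaks the expected protocol, since any process could be bound to that port.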

Note

If you develop from WSL, run all uv/bun/just commands through PowerShell — not Linux. Running them from WSL replaces Windows binaries with Linux ones and breaks everything. See BUILDING.md.

Build

cd server && uv sync --extra all
cd ../app && bun run package:win

bun run package:win produces a small nsis-web installer stub plus payloads (for example .7z, .yml, .blockmap) in app/release/. End users need internet to install (payload download) and to fetch models on first run.

Root-level helper:

just build

See BUILDING.md for full release and troubleshooting details.

Configuration

App settings (hotkey, audio device, post-processing, auto-paste) are configured through the UI.

Server settings use MURMUR_-prefixed environment variables or server/settings.json. Engine-related settings are read at server startup.

Server environment variables
| Variable | Default | Description |
|---|---|---|
| MURMUR_HOST | 0.0.0.0 | Bind host |
| MURMUR_PORT | 51717 | Bind port |
| MURMUR_ENGINE | nemotron | Default engine (nemotron / whisper) |
| MURMUR_NEMOTRON_MODEL | nvidia/nemotron-speech-streaming-en-0.6b | Nemotron model |
| MURMUR_NEMOTRON_DEVICE | auto | Device (auto/cuda/cpu) |
| MURMUR_WHISPER_MODEL | large-v3-turbo | Whisper model |
| MURMUR_WHISPER_DEVICE | auto | Device (auto/cuda/cpu) |
| MURMUR_WHISPER_COMPUTE_TYPE | auto | Whisper precision mode |
| MURMUR_MAX_SESSIONS | 10 | Concurrent session cap |
| MURMUR_LOG_LEVEL | INFO | DEBUG/INFO/WARNING/ERROR |
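To make the two configuration sources concrete, here is a minimal Python sketch of resolving a few of these settings. It is not the server's actual loader. Two things are assumptions for this example: that settings.json uses lowercase keys without the MURMUR_ prefix, and that environment variables take precedence over the file.

```python
import json
import os
from pathlib import Path

# Defaults copied from the table above (subset only).
DEFAULTS = {
    "host": "0.0.0.0",
    "port": 51717,
    "engine": "nemotron",
    "max_sessions": 10,
    "log_level": "INFO",
}


def load_settings(env=None, settings_path=None):
    """Merge defaults, an optional JSON file, and MURMUR_* variables."""
    env = os.environ if env is None else env
    settings = dict(DEFAULTS)

    # Assumed file layout: {"engine": "whisper", "port": 51717, ...}
    if settings_path is not None and Path(settings_path).is_file():
        file_values = json.loads(Path(settings_path).read_text(encoding="utf-8"))
        settings.update({k: v for k, v in file_values.items() if k in DEFAULTS})

    # Assumed precedence: environment variables override the file.
    for key, default in DEFAULTS.items():
        raw = env.get(f"MURMUR_{key.upper()}")
        if raw is not None:
            settings[key] = type(default)(raw)   # e.g. "6000" -> 6000

    if settings["engine"] not in ("nemotron", "whisper"):
        raise ValueError(f"unknown engine: {settings['engine']!r}")
    return settings


print(load_settings(env={"MURMUR_ENGINE": "whisper"})["engine"])
```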

Project Structure

app/      Electron desktop app (Svelte 5, TypeScript, Tailwind v4)
server/   Transcription server (FastAPI, faster-whisper, NeMo)
docs/     Protocol spec and technical docs

Protocol

The app and server communicate over a custom WebSocket protocol on port 51717 — binary frames for audio, JSON frames for control and text. Full spec: docs/protocol.md
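The split between binary and JSON frames can be pictured with a small Python sketch that sorts incoming WebSocket messages by type. The function name and the JSON fields in the example ("type", "text") are hypothetical; the real message shapes are defined in docs/protocol.md.

```python
import json


def classify_frame(frame):
    """Sort a WebSocket message into audio or JSON.

    Binary messages are treated as raw audio; text messages are
    parsed as JSON control or text payloads.
    """
    if isinstance(frame, (bytes, bytearray)):
        return ("audio", bytes(frame))
    if isinstance(frame, str):
        return ("json", json.loads(frame))
    raise TypeError(f"unsupported frame type: {type(frame).__name__}")


# Hypothetical message shape, used only to exercise the function.
kind, payload = classify_frame('{"type": "partial", "text": "hello"}')
print(kind, payload["text"])
```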

License

MIT