mirror of https://github.com/sudoxnym/wyoming-chatterbox.git synced 2026-04-14 03:27:06 +00:00

latest commit: bed25c51c7 by Your Name, 2025-12-14 21:33:26 -06:00
initial release - wyoming protocol server for chatterbox tts
features:
- voice cloning with 10-30s audio sample
- gpu-accelerated inference
- volume boost option
- pip installable
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

wyoming-chatterbox

wyoming protocol server for chatterbox tts with voice cloning.

clone any voice with a 10-30 second audio sample.

requirements

nvidia gpu with 4gb+ vram
cuda 12.x
python 3.10+

install

pip install wyoming-chatterbox

or from source:

git clone https://github.com/sudoxnym/wyoming-chatterbox
cd wyoming-chatterbox
pip install .

usage

wyoming-chatterbox --uri tcp://0.0.0.0:10201 --voice-ref /path/to/voice_sample.wav

options

| option | default | description |
|---|---|---|
| `--uri` | required | server uri (e.g., `tcp://0.0.0.0:10201`) |
| `--voice-ref` | required | path to voice reference wav (10-30s of speech) |
| `--volume-boost` | 3.0 | output volume multiplier |
| `--device` | cuda | torch device (`cuda` or `cpu`) |
| `--debug` | false | enable debug logging |
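
to make "output volume multiplier" concrete, here is a minimal sketch of what multiplying 16-bit pcm audio by a gain could look like. this is an illustration only, not the project's actual code: the function name `boost_pcm16` and the choice to clip at the int16 limits are assumptions.

```python
import array


def boost_pcm16(pcm: bytes, gain: float) -> bytes:
    """scale signed 16-bit pcm samples (native byte order) by gain.

    hypothetical helper for illustration; values that would overflow
    are clipped to the int16 range instead of wrapping around.
    """
    samples = array.array("h")
    samples.frombytes(pcm)
    boosted = array.array(
        "h",
        (max(-32768, min(32767, int(round(s * gain)))) for s in samples),
    )
    return boosted.tobytes()
```

with a gain of 3.0, any sample louder than about a third of full scale clips, so a high multiplier on an already loud voice can distort.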

voice reference tips

for best results:

10-30 seconds of clean speech
no background music or noise
consistent speaking style
wav format (any sample rate)
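
the length tip can be checked before starting the server. a minimal sketch using python's standard `wave` module; the function names are hypothetical, and `wave` only reads uncompressed pcm wav files, so other wav encodings may raise `wave.Error`.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """return the duration of a pcm wav file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(path: str, min_s: float = 10.0, max_s: float = 30.0) -> bool:
    """true if the clip falls inside the 10-30 second range suggested above."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

this only checks length. noise, music and speaking style still need a listen.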

systemd service

sudo tee /etc/systemd/system/wyoming-chatterbox.service << 'EOF'
[Unit]
Description=Wyoming Chatterbox TTS
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=YOUR_USER
Environment=PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
ExecStart=/path/to/venv/bin/wyoming-chatterbox \
  --uri tcp://0.0.0.0:10201 \
  --voice-ref /path/to/voice_reference.wav \
  --volume-boost 3.0
Restart=always
RestartSec=5

[Install]
WantedBy=default.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now wyoming-chatterbox

home assistant

settings → devices & services → add integration
search "wyoming protocol"
host: YOUR_IP (the address of the machine running wyoming-chatterbox), port: 10201
use in your voice assistant pipeline as tts
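
if home assistant cannot find the server, it helps to confirm the port is reachable from another machine first. a minimal sketch with the standard library; `parse_tcp_uri` and `is_reachable` are hypothetical helper names, and the check only opens a tcp connection, it does not speak the wyoming protocol.

```python
import socket
from urllib.parse import urlparse


def parse_tcp_uri(uri: str) -> tuple[str, int]:
    """split a uri like tcp://host:port into (host, port)."""
    parsed = urlparse(uri)
    if parsed.scheme != "tcp" or parsed.hostname is None or parsed.port is None:
        raise ValueError(f"expected tcp://host:port, got {uri!r}")
    return parsed.hostname, parsed.port


def is_reachable(uri: str, timeout: float = 3.0) -> bool:
    """true if a tcp connection to the uri can be opened within timeout."""
    host, port = parse_tcp_uri(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

call it with the server's real address, for example `is_reachable("tcp://YOUR_IP:10201")` with YOUR_IP filled in. `0.0.0.0` in `--uri` is the bind address on the server side, not an address to connect to.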

gpu memory

chatterbox uses ~3.5gb vram. if you get oom errors:

# check gpu usage
nvidia-smi

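# stop leftover wyoming-chatterbox processes that still hold vram
# (this also stops a running server; under systemd, Restart=always starts it again)
pkill -f wyoming-chatterbox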
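
the same check can be scripted. a minimal sketch that asks `nvidia-smi` for used and total memory in csv form and reports free mib per gpu; the function names are hypothetical, and it assumes `nvidia-smi` is on the path.

```python
import subprocess

# nvidia-smi query that prints one "used, total" line per gpu, in mib
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> list[tuple[int, int]]:
    """parse lines like '3512, 8192' into (used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def free_mib_per_gpu() -> list[int]:
    """run nvidia-smi and return free memory in mib for each gpu."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [total - used for used, total in parse_gpu_memory(out)]
```

going by the ~3.5gb figure above, a gpu with less than roughly 3600 mib free is a likely cause of oom errors at startup.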

license

mit