wyoming-chatterbox
wyoming protocol server for chatterbox tts with voice cloning.
clone any voice with a 10-30 second audio sample.
requirements
- nvidia gpu with 4gb+ vram
- cuda 12.x
- python 3.10+
install
pip install wyoming-chatterbox
or from source:
git clone https://github.com/sudoxnym/wyoming-chatterbox
cd wyoming-chatterbox
pip install .
usage
wyoming-chatterbox --uri tcp://0.0.0.0:10201 --voice-ref /path/to/voice_sample.wav
options
| option | default | description |
|---|---|---|
| `--uri` | required | server uri (e.g., `tcp://0.0.0.0:10201`) |
| `--voice-ref` | required | path to voice reference wav (10-30s of speech) |
| `--volume-boost` | 3.0 | output volume multiplier |
| `--device` | cuda | torch device (cuda or cpu) |
| `--debug` | false | enable debug logging |
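to make the defaults concrete, here is a minimal sketch of how these options could be declared with `argparse`. it mirrors the table only; `build_parser` is a hypothetical name and this is not taken from the project's actual source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options table (not the project's real code)."""
    parser = argparse.ArgumentParser(prog="wyoming-chatterbox")
    parser.add_argument("--uri", required=True,
                        help="server uri, e.g. tcp://0.0.0.0:10201")
    parser.add_argument("--voice-ref", required=True,
                        help="path to voice reference wav (10-30s of speech)")
    parser.add_argument("--volume-boost", type=float, default=3.0,
                        help="output volume multiplier")
    parser.add_argument("--device", default="cuda",
                        help="torch device (cuda or cpu)")
    parser.add_argument("--debug", action="store_true",
                        help="enable debug logging")
    return parser


# only the two required options are given, so the rest fall back to the table defaults
args = build_parser().parse_args(
    ["--uri", "tcp://0.0.0.0:10201", "--voice-ref", "/path/to/voice_sample.wav"]
)
print(args.volume_boost, args.device, args.debug)
```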
voice reference tips
for best results:
- 10-30 seconds of clean speech
- no background music or noise
- consistent speaking style
- wav format (any sample rate)
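a quick way to confirm a sample falls in the 10-30 second range before starting the server is to read its header with python's standard `wave` module. this is a sketch: the function names are made up for illustration, and `wave` only reads uncompressed pcm wav files.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed PCM wav file in seconds."""
    with wave.open(path, "rb") as wav:
        # duration = number of audio frames / frames per second
        return wav.getnframes() / wav.getframerate()


def check_voice_ref(path: str, min_s: float = 10.0, max_s: float = 30.0) -> bool:
    """True if the sample length is inside the recommended 10-30s window."""
    return min_s <= wav_duration_seconds(path) <= max_s


# usage (path is a placeholder):
# check_voice_ref("/path/to/voice_sample.wav")
```

this only checks length. background noise and speaking style still need a listen.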
systemd service
sudo tee /etc/systemd/system/wyoming-chatterbox.service << 'EOF'
[Unit]
Description=Wyoming Chatterbox TTS
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=YOUR_USER
Environment=PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
ExecStart=/path/to/venv/bin/wyoming-chatterbox \
--uri tcp://0.0.0.0:10201 \
--voice-ref /path/to/voice_reference.wav \
--volume-boost 3.0
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now wyoming-chatterbox
home assistant
- settings → devices & services → add integration
- search "wyoming protocol"
- host: `YOUR_IP`, port: `10201`
- use in your voice assistant pipeline as tts
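if home assistant cannot find the server, it helps to first confirm the port is reachable from another machine. the sketch below only tests that a tcp connection can be opened; it does not speak the wyoming protocol. `parse_tcp_uri` and `is_port_open` are illustrative names, not part of this package.

```python
import socket
from urllib.parse import urlsplit


def parse_tcp_uri(uri: str) -> tuple[str, int]:
    """Split a uri like tcp://0.0.0.0:10201 into (host, port)."""
    parts = urlsplit(uri)
    if parts.scheme != "tcp" or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected tcp://host:port, got {uri!r}")
    return parts.hostname, parts.port


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a tcp connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, unreachable, or timed out
        return False


# usage (replace YOUR_IP with the server's address):
# is_port_open("YOUR_IP", 10201)
```

note that `0.0.0.0` is a bind address for the server side. when checking from a client, use the machine's real ip.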
gpu memory
chatterbox uses ~3.5gb vram. if you get oom errors:
# check gpu usage
nvidia-smi
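# kill stale server processes that still hold vram
pkill -f wyoming-chatterbox
# under the systemd unit above, Restart=always respawns a killed process, so restart the service instead
sudo systemctl restart wyoming-chatterbox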
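to check headroom from a script rather than by eye, `nvidia-smi` can print used and total memory as csv. the sketch below parses that output and reports which gpus have room for the ~3.5gb (about 3584 mib) mentioned above. the function names and the threshold constant are assumptions for illustration.

```python
import subprocess

# nvidia-smi reports these values in MiB when nounits is set
NVIDIA_SMI_QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(output: str) -> list[tuple[int, int]]:
    """Parse 'used, total' csv lines into (used_mib, total_mib) per gpu."""
    rows = []
    for line in output.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpus_with_room(rows: list[tuple[int, int]], needed_mib: int = 3584) -> list[int]:
    """Return indices of gpus whose free memory is at least needed_mib."""
    return [i for i, (used, total) in enumerate(rows) if total - used >= needed_mib]


def query_gpu_memory() -> list[tuple[int, int]]:
    """Run nvidia-smi and parse its output (requires the nvidia driver tools)."""
    result = subprocess.run(NVIDIA_SMI_QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# usage on a machine with an nvidia gpu:
# print(gpus_with_room(query_gpu_memory()))
```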
license
mit