mirror of
https://github.com/sudoxnym/wyoming-chatterbox.git
synced 2026-06-17 09:44:04 +00:00
wyoming-chatterbox
wyoming protocol server for chatterbox tts with voice cloning.
clone any voice with a 10-30 second audio sample. integrates directly with home assistant as a tts provider.
requirements
- nvidia gpu with 4gb+ vram (3.5gb used at runtime)
- cuda 12.x host driver (≥550.54.14)
- nvidia container toolkit installed on host
- docker + docker compose v2
docker (recommended)
1. configure
git clone https://github.com/sudoxreboot/wyoming-chatterbox
cd wyoming-chatterbox
cp .env.example .env
edit .env:
WYOMING_PORT=10800 # host port — change if 10800 is taken
VOICE_REF_DIR=/path/to/dir # directory containing your reference wav
VOICE_REF_FILE=reference.wav
VOLUME_BOOST=3.0
TORCH_DEVICE=cuda
2. build and run
docker compose build
docker compose up -d
first run downloads ~3.5gb of chatterbox model weights into a named docker volume (chatterbox-cache). this only happens once.
3. check logs
docker compose logs -f
# you should see: "starting server at tcp://0.0.0.0:10800"
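if you want to confirm from another machine that the port is reachable, a plain tcp connect is enough. a minimal sketch using only the python standard library; the helper name `port_is_open` is hypothetical and not part of this project:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a tcp connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the tcp handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and unreachable hosts
        return False
```

call it as `port_is_open("your-server-ip", 10800)`, using whatever you set for WYOMING_PORT. a successful connect only shows that something is listening on the port, not that the model has finished loading.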
voice reference tips
- 10-30 seconds of clean speech
- no background music or noise
- consistent speaking style
- wav format (any sample rate)
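the length tip above can be checked before you point the server at a file. a minimal sketch with python's standard `wave` module; the function names are hypothetical, and `wave` only reads uncompressed pcm, so other wav encodings may raise `wave.Error` here even if the server accepts them:

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a pcm wav file in seconds."""
    with wave.open(path, "rb") as wav:
        # frames divided by frames-per-second gives seconds
        return wav.getnframes() / wav.getframerate()


def check_reference(path: str, min_s: float = 10.0, max_s: float = 30.0) -> bool:
    """True if the clip length falls inside the recommended 10-30 second window."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

this only checks length. background noise and speaking style still need a listen.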
install from source (no docker)
git clone https://github.com/sudoxreboot/wyoming-chatterbox
cd wyoming-chatterbox
python3 -m venv .venv
source .venv/bin/activate
pip install .
wyoming-chatterbox --uri tcp://0.0.0.0:10800 --voice-ref /path/to/voice.wav
options
| option | default | description |
|---|---|---|
| --uri | required | server uri (e.g., tcp://0.0.0.0:10800) |
| --voice-ref | required | path to voice reference wav (10-30s of speech) |
| --volume-boost | 3.0 | output volume multiplier |
| --device | cuda | torch device (cuda or cpu) |
| --debug | false | enable debug logging |
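to get a feel for what a volume multiplier does to audio, here is a rough sketch that scales signed 16-bit pcm samples and clips them to the valid range. this is an assumption about how such a gain typically works, not this project's actual code; `apply_gain` is a hypothetical name:

```python
from array import array


def apply_gain(pcm: bytes, gain: float) -> bytes:
    """Scale signed 16-bit pcm samples (native byte order) by gain, with clipping."""
    samples = array("h")
    samples.frombytes(pcm)
    # clamp each scaled sample to the int16 range so loud peaks do not wrap around
    boosted = array(
        "h",
        (max(-32768, min(32767, int(s * gain))) for s in samples),
    )
    return boosted.tobytes()
```

with a gain of 3.0, any sample above roughly a third of full scale hits the clip limit, so lowering --volume-boost is worth trying if the output sounds distorted.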
systemd service (source install)
sudo tee /etc/systemd/system/wyoming-chatterbox.service << EOF
[Unit]
Description=Wyoming Chatterbox TTS
After=network-online.target
[Service]
Type=simple
User=$(whoami)
Environment=PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
ExecStart=$(pwd)/.venv/bin/wyoming-chatterbox \
--uri tcp://0.0.0.0:10800 \
--voice-ref /path/to/voice_reference.wav \
--volume-boost 3.0
Restart=always
RestartSec=5
[Install]
WantedBy=default.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now wyoming-chatterbox
home assistant
- settings → devices & services → add integration
- search wyoming protocol
- host: your server ip, port: 10800 (or whatever you set in .env)
- select it as your tts provider in the voice assistant pipeline
gpu memory
chatterbox uses ~3.5gb vram at runtime. if you get oom errors, check what else is holding vram, then restart the server:
nvidia-smi
# docker
docker compose restart
# source
pkill -f wyoming-chatterbox
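to watch vram from a script instead of reading the nvidia-smi table by eye, you can use its csv query mode. a minimal sketch; the function names are hypothetical, and it assumes nvidia-smi is on the PATH and reports plain integers in mib:

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' lines (mib) from nvidia-smi csv output, one per gpu."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory() -> list[tuple[int, int]]:
    """Ask nvidia-smi for per-gpu memory usage; requires the nvidia driver."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_gpu_memory(out)
```

if the used figure is already close to the total before the server starts, another process is holding the memory and restarting chatterbox alone will not help.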
license
mit
made by sudoxnym ⚡