mirror of
https://github.com/sudoxnym/ha-addons.git
synced 2026-04-14 19:46:21 +00:00
33 lines
890 B
Markdown
# wyoming-chatterbox addon

wyoming protocol server for chatterbox tts with voice cloning. clone any voice with a 10-30 second sample.

## requirements

- nvidia gpu with 4gb+ vram
- gpu passthrough configured on your HA host

## configuration

| option | default | description |
|--------|---------|-------------|
| `voice_ref` | required | path to the voice reference wav (place the file in `/share/`) |
| `volume_boost` | 3.0 | output volume multiplier |
| `device` | cuda | torch device (`cuda` or `cpu`) |
| `debug` | false | enable debug logging |

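
the table above can be read as a set of defaults plus one required value. a minimal python sketch of that reading follows. `resolve_options`, `DEFAULTS` and `VALID_DEVICES` are hypothetical names used only for illustration; this is not the addon's own code, and the addon may validate its options differently.

```python
# hypothetical helper, not part of the addon.
# defaults are copied from the configuration table; voice_ref has no default.
DEFAULTS = {"volume_boost": 3.0, "device": "cuda", "debug": False}
VALID_DEVICES = {"cuda", "cpu"}


def resolve_options(user_options):
    """merge user options over the documented defaults and validate them."""
    unknown = set(user_options) - set(DEFAULTS) - {"voice_ref"}
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    if not user_options.get("voice_ref"):
        raise ValueError("voice_ref is required")

    # user-supplied values win over the defaults
    options = {**DEFAULTS, **user_options}
    if options["device"] not in VALID_DEVICES:
        raise ValueError("device must be 'cuda' or 'cpu'")
    options["volume_boost"] = float(options["volume_boost"])
    return options
```

calling `resolve_options({"voice_ref": "/share/voice_reference.wav"})` returns the full option set with `volume_boost`, `device` and `debug` filled in from the table.
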
## setup

1. place your voice reference wav at `/share/voice_reference.wav`
2. set `voice_ref` in the addon configuration to that path
3. start the addon
4. add the wyoming integration in HA, pointing it at the addon's host on port 10201

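
if the wyoming integration cannot connect, it can help to confirm that something is listening on port 10201 first. a minimal sketch using only the python standard library follows. `port_open` is a hypothetical helper name, and the host you pass in is an assumption that depends on your install. the check only tests that the tcp port accepts connections, not that the service speaks the wyoming protocol.

```python
import socket


def port_open(host, port=10201, timeout=3.0):
    """return True if a tcp connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and dns failures
        return False
```

for example, `port_open("homeassistant.local")` should return `True` once the addon has started, assuming your HA host is reachable under that name.
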
## voice reference tips

for best results:
- 10-30 seconds of clean speech
- no background music or noise
- consistent speaking style
- wav format (any sample rate)
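
the length recommendation above can be checked before copying a file to `/share/`. a minimal sketch using python's standard `wave` module follows. `wav_duration_seconds` and `check_reference` are hypothetical helper names, not part of the addon. note that the `wave` module only reads uncompressed pcm wav files, so other wav encodings will raise an error here even if the addon itself accepts them.

```python
import wave


def wav_duration_seconds(path):
    """return the duration of a pcm wav file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference(path, min_seconds=10.0, max_seconds=30.0):
    """return (ok, duration); ok is True when the clip is 10-30 seconds long."""
    duration = wav_duration_seconds(path)
    return min_seconds <= duration <= max_seconds, duration
```

`check_reference("/share/voice_reference.wav")` returns a pair such as `(True, 14.2)` for a clip of suitable length. it says nothing about noise or speaking style, which still need a listen.
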