Local TTS Fallback Options
Fallback Chain
- Primary: InWorld TTS Max ($10/1M chars) - Best quality
- Fallback 1: Piper - Most natural local TTS
- Fallback 2: Kokoro - Fastest, lower quality
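The chain above can be written down as data, which also makes the primary tier's price easy to reason about. This is a minimal sketch: `TtsTier`, `FALLBACK_CHAIN`, and `estimate_cost` are illustrative names, not part of any library, and only the $10 per 1M characters rate comes from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TtsTier:
    name: str
    local: bool                              # True if it runs without network access
    usd_per_million_chars: Optional[float]   # None for local engines with no per-character fee


# Order matters: earlier tiers are tried first.
FALLBACK_CHAIN = [
    TtsTier("inworld", local=False, usd_per_million_chars=10.0),
    TtsTier("piper", local=True, usd_per_million_chars=None),
    TtsTier("kokoro", local=True, usd_per_million_chars=None),
]


def estimate_cost(tier: TtsTier, text: str) -> float:
    """Return the approximate USD cost of synthesizing `text` on `tier`."""
    if tier.usd_per_million_chars is None:
        return 0.0
    return len(text) * tier.usd_per_million_chars / 1_000_000
```

At this rate, 500,000 characters on the primary tier come to about $5, while the local tiers add no per-character cost.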
When Fallback Activates
- No internet connection
- InWorld API unreachable
- InWorld request takes longer than 3 seconds
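One way to turn these three conditions into code is to classify the exception raised by the InWorld call. The sketch below is hedged: `fallback_reason` and `INWORLD_TIMEOUT_SECONDS` are hypothetical names, and it assumes the client surfaces network failures as standard Python exceptions. From inside the process, "no internet" and "API unreachable" usually look the same, so both map to one reason.

```python
import asyncio
import socket
from typing import Optional

INWORLD_TIMEOUT_SECONDS = 3.0  # assumed constant matching the 3-second rule above


def fallback_reason(exc: BaseException) -> Optional[str]:
    """Map an exception to a fallback trigger, or None if it should propagate."""
    # Before Python 3.11, asyncio.TimeoutError is a separate class from the
    # built-in TimeoutError, so both are checked.
    if isinstance(exc, (asyncio.TimeoutError, TimeoutError)):
        return "timeout"
    # ConnectionError covers refused or reset connections; socket.gaierror is
    # the DNS failure typically seen when the machine is offline.
    if isinstance(exc, (ConnectionError, socket.gaierror)):
        return "unreachable"
    return None
```

Anything that returns `None` here (for example a `ValueError` from bad input) is probably a bug rather than an outage, and is better re-raised than hidden behind a fallback.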
Piper TTS (Recommended Local)
Why Piper
- Most natural-sounding of the open-source options
- Uses VITS architecture (no separate vocoder)
- Fast, works offline
- Privacy-friendly
- Used by Home Assistant
Installation
pip install piper-tts
# Or download standalone binary
wget https://github.com/rhasspy/piper/releases/...
Usage
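from piper import PiperVoice
# The matching en_US-lessac-medium.onnx.json config is expected next to the .onnx file.
voice = PiperVoice.load("en_US-lessac-medium.onnx")
# The synthesize() signature differs between piper-tts releases (older ones take a
# wave file object as a second argument), so check the installed version.
audio = voice.synthesize("Hello world")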
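Piper produces 16-bit mono PCM audio, so the synthesized samples usually need a WAV header before they can be played or returned to a client. The sketch below uses only the standard library. `pcm_to_wav_bytes` is a hypothetical helper, and the 22050 Hz default is an assumption that should be replaced with the sample rate listed in the voice's .onnx.json config.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 22050) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Keeping this step separate from synthesis means the same helper can serve any local engine that emits 16-bit mono PCM.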
Voice Models
- en_US-lessac-medium (good quality)
- en_US-amy-low (faster)
- Many languages available
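The voice names above follow a `<locale>-<speaker>-<quality>` pattern, and each voice ships as an .onnx model plus an .onnx.json config with the same base name. Here is a small sketch for resolving both files from a name. `VoiceFiles`, `voice_files`, and the `voices` directory are illustrative assumptions, not part of Piper.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class VoiceFiles:
    locale: str    # e.g. "en_US"
    speaker: str   # e.g. "lessac"
    quality: str   # e.g. "medium"
    model: Path    # the .onnx weights
    config: Path   # the .onnx.json config next to the weights


def voice_files(voice_name: str, voices_dir: str = "voices") -> VoiceFiles:
    """Split a voice name into its parts and build the expected file paths."""
    # Raises ValueError if the name does not contain both separators.
    locale, rest = voice_name.split("-", 1)
    speaker, quality = rest.rsplit("-", 1)
    model = Path(voices_dir) / f"{voice_name}.onnx"
    return VoiceFiles(locale, speaker, quality, model, Path(f"{model}.json"))
```

For example, `voice_files("en_US-amy-low").quality` evaluates to `"low"`, which a server could use to pick the faster voice on constrained hardware.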
Kokoro TTS (Speed Fallback)
Why Kokoro
- Only 82M parameters
- Sub-0.3 second generation
- Apache 2.0 license
- Good for low-resource environments
Limitations
- No voice cloning
- Less natural than Piper
- Fewer inflections
Voice Server Logic
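import asyncio
import logging

logger = logging.getLogger(__name__)

# generate_audio_inworld and generate_audio_piper are defined elsewhere in the voice server.
async def generate_audio(text: str):
    # Try InWorld first, giving up after 3 seconds
    try:
        result = await generate_audio_inworld(text, timeout=3)
        if result["success"]:
            return result
    except (asyncio.TimeoutError, TimeoutError, ConnectionError) as exc:
        # asyncio.TimeoutError is a separate class before Python 3.11
        logger.warning("InWorld request failed: %r", exc)

    # Fall back to Piper (also reached when InWorld reports success=False)
    logger.info("InWorld unavailable, using Piper fallback")
    return await generate_audio_piper(text)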