Environment: Voice
Location: /opt/mcp-servers/voice/mcp_voice_server.py
Version: 5.2.0
Status: ✅ WORKING
Purpose
Multi-provider TTS (Text-to-Speech) system with an automatic fallback chain:
- Primary: Inworld AI ($10/1M chars)
- Fallback: ElevenLabs ($100/month)
- Local fallback: Piper TTS (no internet required)
Tools (1 total)
| Tool | Parameters | Description |
|---|---|---|
| voice | paragraphs (req), voice, force_piper | Speak to user via TTS |
Parameters
- paragraphs: Array of text strings (max 500 chars each)
- voice: "default" (Lena), "lars" (Edward), or an ElevenLabs voice ID
- force_piper: Boolean to force local TTS (testing)
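The parameter rules above can be sketched as a small validation step. This is only an illustration: `validate_voice_args` is a hypothetical helper, and the defaults (`voice="default"`, `force_piper=False`) and the non-empty check on `paragraphs` are assumptions, not confirmed server behavior.

```python
from typing import Any

MAX_PARAGRAPH_CHARS = 500  # per-paragraph limit stated in the parameter list


def validate_voice_args(args: dict[str, Any]) -> dict[str, Any]:
    """Check a `voice` tool call and fill in assumed defaults (hypothetical helper)."""
    paragraphs = args.get("paragraphs")
    # `paragraphs` is the only required parameter; rejecting an empty array is an assumption
    if not isinstance(paragraphs, list) or not paragraphs:
        raise ValueError("paragraphs is required and must be a non-empty array")
    for i, text in enumerate(paragraphs):
        if not isinstance(text, str):
            raise ValueError(f"paragraphs[{i}] must be a string")
        if len(text) > MAX_PARAGRAPH_CHARS:
            raise ValueError(f"paragraphs[{i}] exceeds {MAX_PARAGRAPH_CHARS} chars")

    voice = args.get("voice", "default")  # assumed default
    if not isinstance(voice, str):
        raise ValueError("voice must be a string")

    force_piper = args.get("force_piper", False)  # assumed default
    if not isinstance(force_piper, bool):
        raise ValueError("force_piper must be a boolean")

    return {"paragraphs": paragraphs, "voice": voice, "force_piper": force_piper}
```

Rejecting an over-long paragraph before any provider is contacted would keep a bad call from consuming paid API characters.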
Voice Options
| Voice | Provider | Character |
|---|---|---|
| default | Inworld | Lena v3 (female) |
| lars | Inworld | Edward (male) |
Features
- Parallel generation: All paragraphs generated simultaneously
- Sequential playback: Audio queued in order via WebSocket
- Auto-notes: Every voice call creates context.notes entry
- Phonetic conversion: Numbers → words for better TTS
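As a rough illustration of the phonetic conversion feature, the sketch below spells out whole numbers before text is sent to a TTS provider. The function names and the supported range (integers below one million) are assumptions for this example; the server's actual conversion rules are not documented here.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0 <= n < 1_000_000 in English."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def spell_out_numbers(text: str) -> str:
    """Replace each standalone run of one to six digits with its spelled-out form."""
    return re.sub(r"\b\d{1,6}\b", lambda m: number_to_words(int(m.group())), text)
```

For example, `spell_out_numbers("I have 3 tasks and 42 notes")` returns `"I have three tasks and forty-two notes"`. Decimals, ordinals, and version strings would need extra handling that this sketch does not attempt.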
Architecture
voice.voice(paragraphs)
→ Inworld/ElevenLabs API (or Piper fallback)
→ WebSocket server (localhost:8765)
→ Browser audio playback
→ context.notes entry saved
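The first two stages of this flow (generate all paragraphs in parallel, then queue the audio in the original order) can be sketched with `asyncio`. This is a minimal sketch under stated assumptions: `synthesize` is a stand-in for a real provider call, and an `asyncio.Queue` stands in for the WebSocket playback queue; neither reflects the server's actual code.

```python
import asyncio


async def synthesize(text: str) -> bytes:
    """Stand-in for a TTS provider call; returns fake audio bytes."""
    await asyncio.sleep(0.01)  # simulate network latency
    return text.encode("utf-8")


async def speak(paragraphs: list[str], playback_queue: asyncio.Queue) -> int:
    """Generate every paragraph concurrently, then enqueue the clips in input order."""
    # gather() runs the coroutines concurrently and returns results in argument order,
    # so playback order matches paragraph order even if later clips finish first.
    clips = await asyncio.gather(*(synthesize(p) for p in paragraphs))
    for clip in clips:
        await playback_queue.put(clip)
    return len(clips)


async def main() -> list[bytes]:
    queue: asyncio.Queue = asyncio.Queue()
    await speak(["First.", "Second.", "Third."], queue)
    # Drain the queue the way a playback consumer would
    return [queue.get_nowait() for _ in range(queue.qsize())]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Waiting for all clips before queueing is the simplest way to guarantee order; a real implementation might instead start playback as soon as the first clip is ready.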
Output Format
{
  "success": true,
  "provider": "inworld",
  "voice": "default",
  "paragraphs_spoken": 1,
  "total_chars": 26,
  "fallback_used": false
}
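A caller could load this response into a typed structure to check which provider actually spoke. The `VoiceResult` class and `parse_voice_result` function below are hypothetical names for illustration; the field names come from the output shown above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceResult:
    """Typed view of the voice tool's JSON response (hypothetical class)."""
    success: bool
    provider: str
    voice: str
    paragraphs_spoken: int
    total_chars: int
    fallback_used: bool


def parse_voice_result(raw: str) -> VoiceResult:
    """Parse the JSON response; raises TypeError on missing or unexpected fields."""
    return VoiceResult(**json.loads(raw))
```

Checking `fallback_used` on the parsed result is one way to notice when the primary provider is failing.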
Usage Example
gateway.run([{
  server: 'voice',
  tool: 'voice',
  args: {
    paragraphs: ['Hello Chris, I have completed the task.'],
    voice: 'default'
  }
}])
Fallback Chain
- Try Inworld AI (primary)
- If Inworld fails → ElevenLabs
- If ElevenLabs fails → Piper local TTS
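The chain above can be sketched as a loop that tries each provider in order and records whether a fallback was needed. The provider functions here are stand-ins that simulate success or failure, and `synthesize_with_fallback` is a hypothetical name; the real server's error handling may differ.

```python
from typing import Callable

Provider = Callable[[str], bytes]  # takes text, returns audio bytes


def synthesize_with_fallback(text: str, providers: list[tuple[str, Provider]]) -> dict:
    """Try each provider in order and return the first successful result."""
    errors: dict[str, str] = {}
    for index, (name, provider) in enumerate(providers):
        try:
            audio = provider(text)
        except Exception as exc:  # any provider failure moves on to the next one
            errors[name] = str(exc)
            continue
        # fallback_used is true whenever the primary (index 0) did not succeed
        return {"provider": name, "fallback_used": index > 0, "audio": audio}
    raise RuntimeError(f"all TTS providers failed: {errors}")


def failing_inworld(text: str) -> bytes:
    """Stand-in that simulates an Inworld outage."""
    raise ConnectionError("simulated outage")


def fake_elevenlabs(text: str) -> bytes:
    """Stand-in that simulates a successful ElevenLabs call."""
    return b"audio:" + text.encode("utf-8")


def fake_piper(text: str) -> bytes:
    """Stand-in that simulates local Piper synthesis."""
    return b"piper:" + text.encode("utf-8")


CHAIN = [("inworld", failing_inworld), ("elevenlabs", fake_elevenlabs), ("piper", fake_piper)]
```

With the simulated Inworld outage, `synthesize_with_fallback("Hello", CHAIN)` returns a result whose `provider` is `"elevenlabs"` and whose `fallback_used` is `True`.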
Security Assessment
- ✅ API keys stored in credentials (locker)
- ✅ WebSocket on localhost only
- ✅ Text sanitized before TTS
Audited by Maverick (a_7yma) | Documented by Rocky (o_cq0c) | 2026-01-06