Voice System v4.1 - Single Tool Architecture
Overview
The Voice system enables AI-to-human spoken communication via ElevenLabs TTS. Version 4.1 simplifies the architecture to a single voice tool with paragraph-based output.
Current Architecture (v4.1)
Voice MCP Server (v4.1.0)
Location: /opt/mcp-servers/voice/mcp_voice_server.py
Single Tool API:
gateway.run([{
  server: 'voice',
  tool: 'voice',
  args: {
    paragraphs: ['First paragraph', 'Second paragraph', ...]
  }
}])
Key Features:
- ONE tool instead of 6 (removed voice_200, voice_350, voice_500, voice_1500, voice_2500, voice_queue)
- paragraphs array - each string is a separate audio clip
- Max 500 chars per paragraph (auto-truncates if exceeded)
- All paragraphs generate in PARALLEL on server
- Sent to browser sequentially for playback
- Uses single ElevenLabs voice ID (no voice parameter needed)
- Phonetic conversion for numbers and dates
- Auto-saves to context.notes for session history
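The parallel-generate, sequential-send behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: `synthesize` is a hypothetical stand-in for the ElevenLabs request, `send` is a hypothetical callback that forwards a clip toward the browser, and truncation is assumed to be a plain hard cut at 500 characters.

```python
import asyncio

MAX_CHARS = 500  # per-paragraph limit stated in Key Features


def truncate(paragraph: str, limit: int = MAX_CHARS) -> str:
    """Cut a paragraph to the limit (assumption: a plain hard cut)."""
    return paragraph[:limit]


async def synthesize(text: str) -> bytes:
    """Hypothetical stand-in for the ElevenLabs TTS request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return text.encode("utf-8")


async def generate_clips(paragraphs: list[str]) -> list[bytes]:
    """Start every synthesis at once; gather returns results in input order."""
    tasks = [synthesize(truncate(p)) for p in paragraphs]
    return await asyncio.gather(*tasks)


async def speak(paragraphs: list[str], send) -> None:
    """Generate all clips in parallel, then deliver them one by one in order."""
    for clip in await generate_clips(paragraphs):
        await send(clip)
```

Because `asyncio.gather` preserves input order, the clips can be forwarded in paragraph order even when a later paragraph finishes synthesizing first.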
Nexus Voice VS Code Extension (v1.7.1)
Location: /home/nexus/.config/systemd/user/.cache/voice-extension/
Features:
- Background WebSocket connection (stays connected even when browsing files)
- Auto-opens sidebar on voice message
- Audio queue for sequential playback with minimal gaps
- Mute button (🔊/🔇) - mutes volume, audio keeps playing in background
- Messages display even when muted
- Auto-detects voice server URL based on VS Code remote host
Mute Behavior:
- Mute sets volume to 0, does NOT pause
- Audio continues playing silently
- Queue keeps advancing
- Unmute restores volume - you hear wherever the stream currently is
- Like muting a TV - show keeps going
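The mute semantics can be modeled in a few lines. The sketch below is a toy model written in Python for illustration only; it is not the extension's code, and the class and method names are hypothetical. It shows the one property that matters: muting changes the volume but never stops the queue from advancing.

```python
from collections import deque


class PlaybackModel:
    """Toy model of the extension's audio queue (all names are hypothetical)."""

    def __init__(self) -> None:
        self.queue: deque[str] = deque()
        self.muted = False
        self.played: list[tuple[str, float]] = []  # (clip, volume it played at)

    @property
    def volume(self) -> float:
        # Mute is modeled as volume 0, not as a paused player.
        return 0.0 if self.muted else 1.0

    def enqueue(self, clip: str) -> None:
        self.queue.append(clip)

    def toggle_mute(self) -> None:
        # Only the volume changes; the queue is never paused or cleared.
        self.muted = not self.muted

    def play_next(self) -> None:
        # Advances regardless of mute state.
        if self.queue:
            self.played.append((self.queue.popleft(), self.volume))
```

A clip that reaches the front of the queue while muted is consumed at volume 0, so unmuting resumes at whatever clip is current rather than replaying what was missed.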
Voice WebSocket Bridge
Location: /opt/mcp-servers/voice/voice_websocket_bridge.py
Port: 8765
Bridges HTTP POST from MCP server to WebSocket for VS Code extension.
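The bridging role can be illustrated with an in-memory model. This is a sketch under stated assumptions, not the contents of voice_websocket_bridge.py: each `asyncio.Queue` stands in for one connected WebSocket client, `handle_post` stands in for the HTTP POST handler, and the JSON message shape used in the usage note is hypothetical.

```python
import asyncio
import json


class BridgeModel:
    """In-memory stand-in for the bridge: one inbox per connected client."""

    def __init__(self) -> None:
        self.clients: set[asyncio.Queue] = set()

    def connect(self) -> asyncio.Queue:
        # Stands in for a VS Code extension opening a WebSocket.
        inbox: asyncio.Queue = asyncio.Queue()
        self.clients.add(inbox)
        return inbox

    def disconnect(self, inbox: asyncio.Queue) -> None:
        self.clients.discard(inbox)

    async def handle_post(self, body: bytes) -> int:
        # Stands in for the HTTP POST endpoint the MCP server calls.
        message = json.loads(body)  # reject malformed JSON early
        frame = json.dumps(message)
        for inbox in self.clients:
            await inbox.put(frame)  # fan out to every connected client
        return len(self.clients)
```

In this model, posting `b'{"text": "Got it!"}'` with two clients connected places the same frame in both inboxes and returns 2.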
Usage Pattern
Lightning Response Pattern
// Quick acknowledgment first, then substance
gateway.run([{
  server: 'voice',
  tool: 'voice',
  args: {
    paragraphs: [
      'Got it!',                              // Lightning response
      'Here is the detailed explanation...',  // Follow-up
      'And another point to consider...'      // More detail
    ]
  }
}])
All three generate in parallel, play sequentially with minimal gaps.
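When the follow-up text is long, it helps to split it before calling the tool so that nothing is lost to the 500-character truncation. The helper below is a hedged sketch of one way to do that; `split_paragraphs` and `build_voice_call` are hypothetical names, and the returned dict only mirrors the call shape shown above.

```python
import re

MAX_CHARS = 500  # per-paragraph limit from Key Features


def split_paragraphs(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `limit` chars.

    A single sentence longer than `limit` is kept whole and left for the
    server's own truncation.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def build_voice_call(ack: str, detail: str) -> dict:
    """Payload for the single voice tool: acknowledgment first, then detail."""
    return {
        "server": "voice",
        "tool": "voice",
        "args": {"paragraphs": [ack, *split_paragraphs(detail)]},
    }
```

Keeping the acknowledgment as its own short first paragraph means its clip is small, so playback can begin with it while the longer clips follow in order.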
Legacy Reference (Archived)
Previous Tools (v3.x - REMOVED)
- voice_200 - 200 char limit
- voice_350 - 350 char limit
- voice_500 - 500 char limit
- voice_1500 - 1500 char limit
- voice_2500 - 2500 char limit
- voice_queue - Multiple messages
- voice parameter for selecting different voices
These were replaced with the single voice tool in v4.0.
File Locations
- MCP Server: /opt/mcp-servers/voice/mcp_voice_server.py
- WebSocket Bridge: /opt/mcp-servers/voice/voice_websocket_bridge.py
- VS Code Extension: /home/nexus/.config/systemd/user/.cache/voice-extension/
- Built VSIX: nexus-voice-1.7.1.vsix
Redis Storage
Voice notes are stored in the Context environment (port 6620) under keys of the form:
ctxt:{timestamp}:NOTE:voice:{session_id}
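A small sketch of building and parsing keys in this shape is shown below. The function names are hypothetical, and the timestamp is assumed to be Unix epoch seconds, which the format above does not specify; a timestamp containing colons would need different parsing.

```python
import time
from typing import Optional


def voice_note_key(session_id: str, timestamp: Optional[int] = None) -> str:
    """Build a key shaped like ctxt:{timestamp}:NOTE:voice:{session_id}."""
    if timestamp is None:
        timestamp = int(time.time())  # assumption: Unix epoch seconds
    return f"ctxt:{timestamp}:NOTE:voice:{session_id}"


def parse_voice_note_key(key: str) -> tuple[str, str]:
    """Return (timestamp, session_id); raise ValueError on any other shape."""
    # Split at most 4 times so a session_id containing ':' stays intact.
    parts = key.split(":", 4)
    if len(parts) != 5 or parts[0] != "ctxt" or parts[2:4] != ["NOTE", "voice"]:
        raise ValueError(f"not a voice note key: {key!r}")
    return parts[1], parts[4]
```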