Voice Environment

Overview

The Voice Environment enables AI-to-human spoken communication through ElevenLabs text-to-speech. Unlike most Nexus environments, Voice does NOT have its own Redis storage; it writes its voice notes to the Context environment instead.

Components

1. Voice MCP Server (v4.1.0)

  • Location: /opt/mcp-servers/voice/mcp_voice_server.py
  • Protocol: MCP (Model Context Protocol)
  • Gateway Access: gateway.run([{server:'voice', tool:'voice', args:{...}}])

Single Tool: voice - Speak to the user via TTS

  • paragraphs: Array of strings (max 500 chars each)
  • Generates all audio in parallel
  • Sends to browser for sequential playback
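Because each entry in paragraphs is capped at 500 characters, longer text has to be split before the call. The following is a minimal sketch of one way to do that by packing whole sentences into chunks. The helper name split_into_paragraphs and the sentence-splitting rule are illustrative assumptions, not part of the Voice MCP server.

```python
import re

MAX_CHARS = 500  # per-paragraph limit stated for the voice tool


def split_into_paragraphs(text: str, limit: int = MAX_CHARS) -> list:
    """Greedily pack sentences into chunks of at most `limit` characters.

    Hypothetical helper for illustration; not part of the Voice MCP server.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

The resulting list can be passed directly as the paragraphs argument. Splitting on sentence boundaries keeps each clip a natural spoken unit.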

2. Voice WebSocket Bridge

  • Location: /opt/mcp-servers/voice/voice_websocket_bridge.py
  • Port: 8765
  • Protocol: HTTP POST → Socket.IO WebSocket

Bridges the MCP server to the VS Code extension:

  • Receives audio from the MCP server via HTTP POST
  • Emits to connected clients via Socket.IO
  • Handles queue position metadata
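The page does not document the payload the MCP server posts to the bridge. As a rough sketch of what "audio plus queue position metadata" could look like, the snippet below pairs each paragraph with its clip and a 1-based position. All field names (text, audio_b64, queue_position, queue_total) are assumptions made for illustration.

```python
import base64


def build_bridge_payloads(paragraphs: list, clips: list) -> list:
    """Pair each paragraph with its audio clip and queue metadata.

    Hypothetical payload shape; the real bridge format is not documented here.
    """
    if len(paragraphs) != len(clips):
        raise ValueError("need exactly one audio clip per paragraph")
    total = len(paragraphs)
    return [
        {
            "text": text,
            # Binary audio is base64-encoded so it can travel inside JSON.
            "audio_b64": base64.b64encode(clip).decode("ascii"),
            "queue_position": index,  # 1-based position within this message
            "queue_total": total,
        }
        for index, (text, clip) in enumerate(zip(paragraphs, clips), start=1)
    ]
```

Carrying the position and total with every clip would let a client restore playback order even if the clips arrive out of order.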

3. Nexus Voice VS Code Extension (v1.7.1)

  • Location: /home/nexus/.config/systemd/user/.cache/voice-extension/
  • Built Package: nexus-voice-1.7.1.vsix

Client-side audio playback:

  • Background WebSocket connection
  • Audio queue with sequential playback
  • Mute button (volume control, not pause)
  • Auto-open sidebar on voice message
  • Message display even when muted
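The difference between volume-based and pause-based mute is easiest to see in a small model. The sketch below is a Python stand-in for the queue behavior described above, not the extension's actual code; the class and method names are hypothetical. When muted, clips are still consumed in order, just at zero volume, so the queue never backs up.

```python
from collections import deque


class PlaybackQueue:
    """Toy model of sequential playback with volume-based mute."""

    def __init__(self) -> None:
        self._pending = deque()
        self.muted = False
        self.played = []  # (clip id, volume used) in playback order

    def enqueue(self, clip_id: str) -> None:
        self._pending.append(clip_id)

    def toggle_mute(self) -> None:
        self.muted = not self.muted

    def play_next(self) -> bool:
        """Play the oldest pending clip; return False if the queue is empty."""
        if not self._pending:
            return False
        clip_id = self._pending.popleft()
        # Mute changes the volume only; the clip is still dequeued and "played".
        volume = 0.0 if self.muted else 1.0
        self.played.append((clip_id, volume))
        return True
```

With a pause-based mute, clips received while muted would accumulate and all play once unmuted. The volume-based approach drops them silently in real time instead.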

Data Flow

AI → gateway.run([{server:'voice', tool:'voice', args:{paragraphs:[...]}}])
    ↓
Voice MCP Server
    ↓ (parallel ElevenLabs API calls)
    ↓ (generates all audio clips)
    ↓
HTTP POST to WebSocket Bridge (:8765)
    ↓
Socket.IO emit to VS Code Extension
    ↓
Audio Queue → Sequential Playback
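The "generate in parallel, play in order" step in the flow above can be sketched with asyncio. This is an illustration under assumptions: fake_synthesize is a stand-in that returns placeholder bytes rather than calling ElevenLabs, and the server's real concurrency mechanism is not documented on this page.

```python
import asyncio


async def fake_synthesize(text: str) -> bytes:
    """Stand-in for a TTS request; returns placeholder bytes, not real audio."""
    # Uneven delays so that requests finish in a different order than issued.
    await asyncio.sleep(0.01 * (5 - len(text) % 5))
    return text.encode("utf-8")


async def generate_all(paragraphs: list) -> list:
    """Run one synthesis per paragraph concurrently."""
    # gather() returns results in argument order, regardless of which
    # coroutine finishes first, so playback order is preserved.
    return await asyncio.gather(*(fake_synthesize(p) for p in paragraphs))


if __name__ == "__main__":
    clips = asyncio.run(generate_all(["Got it!", "Here is the detail."]))
    print(len(clips), "clips generated")
```

Total latency is then roughly that of the slowest single request rather than the sum of all requests.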

Storage

Voice does NOT have its own Redis instance. Voice notes are stored in the Context Environment (port 6620):

Key Format: ctxt:{timestamp}:NOTE:voice:{session_id}

Note Record:

{
  "content": "[voice:voice] Spoken text here",
  "type": "voice",
  "source": "voice",
  "session_id": "session_xxx",
  "timestamp": "2025-12-23T...",
  "links": []
}
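As a sketch of how a key and record of this shape might be assembled, the helper below builds both from the spoken text and session ID. The function name is hypothetical, and the format of the {timestamp} key segment is not specified on this page; epoch seconds are used here purely as an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Optional, Tuple


def build_voice_note(
    text: str, session_id: str, now: Optional[datetime] = None
) -> Tuple[str, str]:
    """Return (key, JSON record) for a voice note. Hypothetical helper."""
    now = now or datetime.now(timezone.utc)
    # Assumption: the {timestamp} key segment is epoch seconds.
    key = f"ctxt:{int(now.timestamp())}:NOTE:voice:{session_id}"
    record = {
        "content": f"[voice:voice] {text}",
        "type": "voice",
        "source": "voice",
        "session_id": session_id,
        "timestamp": now.isoformat(),
        "links": [],
    }
    return key, json.dumps(record)
```

The returned pair could then be written to the Context environment's Redis with whatever client that environment uses.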

External Dependencies

  • ElevenLabs API - Text-to-speech generation
  • Voice ID: cgSgspJ2msm6clMCkdW9 (configured in MCP server)
  • Model: eleven_multilingual_v2
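For orientation, the sketch below builds (but does not send) a request against the public ElevenLabs text-to-speech REST endpoint using the voice ID and model listed above. How the MCP server actually calls the API is not shown on this page, so treat the function name and structure as assumptions; the API key is a placeholder supplied by the caller.

```python
import json
import urllib.request

VOICE_ID = "cgSgspJ2msm6clMCkdW9"    # voice ID listed above
MODEL_ID = "eleven_multilingual_v2"  # model listed above


def build_tts_request(text: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for one paragraph of text (not sent here)."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
    body = json.dumps({"text": text, "model_id": MODEL_ID}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )
```

Passing the request to urllib.request.urlopen would perform the call; the response body is the generated audio.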

Service Management

Voice services run via PM2:

pm2 status          # Check status
pm2 restart voice   # Restart MCP server
pm2 logs voice      # View logs

WebSocket bridge:

pm2 restart voice-websocket

Usage Examples

Basic Voice Output

gateway.run([{
    server: 'voice',
    tool: 'voice',
    args: {
        paragraphs: ['Hello! How can I help you today?']
    }
}])

Lightning Response Pattern

gateway.run([{
    server: 'voice',
    tool: 'voice',
    args: {
        paragraphs: [
            'Got it!',  // Quick acknowledgment
            'Let me explain how this works...',  // Detail
            'And here is another important point.'  // More detail
        ]
    }
}])
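The pattern above leads with a brief acknowledgment and follows it with the detail paragraphs. A small helper can assemble that call structure and enforce the 500-character limit before anything is sent. This is a sketch only: lightning_call is a hypothetical name, and it merely builds the argument list for gateway.run rather than invoking the gateway.

```python
MAX_CHARS = 500  # per-paragraph limit of the voice tool


def lightning_call(ack: str, details: list) -> list:
    """Build the gateway.run argument: acknowledgment first, then details."""
    paragraphs = [ack, *details]
    for position, paragraph in enumerate(paragraphs, start=1):
        if not paragraph:
            raise ValueError(f"paragraph {position} is empty")
        if len(paragraph) > MAX_CHARS:
            raise ValueError(
                f"paragraph {position} has {len(paragraph)} chars (max {MAX_CHARS})"
            )
    return [{"server": "voice", "tool": "voice", "args": {"paragraphs": paragraphs}}]
```

Validating up front turns an over-long paragraph into an immediate, local error instead of a failed tool call.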

Legacy Reference (Archived)

Previous Tools (v3.x - REMOVED in v4.0)

  • voice_200, voice_350, voice_500, voice_1500, voice_2500 - Character-limited tools
  • voice_queue - Explicit queue tool
  • voice parameter - For selecting different voices

These were consolidated into the single voice tool, which takes a paragraphs array.

Previous Architecture

  • Server-side cooldown (removed in v3.4)
  • Multiple voice IDs/names (removed in v4.1)
  • Pause-based mute in extension (changed to volume-based in v1.7.1)
Updated: 2026-01-13T12:51:22