page

Phase 2 - Voice Pipeline

Phase 2: Voice Pipeline

Goal

Connect wake word to full voice conversation.

Architecture

[Wake Word Detected]
       |
       v
[Start Recording]
       |
       v
[Whisper STT] --> Text
       |
       v
[LARS/Ollama] --> Response
       |
       v
[InWorld TTS] --> Audio
       |
       v
[Play Response]

Tasks

1. STT Setup

  • [ ] Install Whisper locally
  • [ ] Test transcription accuracy
  • [ ] Optimize for speed vs accuracy

2. LARS Integration

  • [ ] Verify lars-trained model in Ollama
  • [ ] Create API wrapper for conversation
  • [ ] Handle conversation context/memory

3. TTS Integration

  • [ ] Route responses to Nexus voice MCP
  • [ ] Test InWorld AI output
  • [ ] Handle long responses (paragraph splitting)

4. End-to-End Test

  • [ ] "Hey LARS" → Question → Response → Speech
  • [ ] Measure total latency
  • [ ] Test interruption handling

Success Criteria

  • Full conversation loop working
  • Total latency < 5 seconds
  • Natural sounding responses
  • Handles multi-turn conversations
ID: 91484450
Path: LARS Voice Assistant > Implementation Roadmap > Phase 2 - Voice Pipeline
Updated: 2025-12-30T19:41:33