local-voice-ai Reference Implementation
Overview
A complete, Docker-based voice assistant stack that runs entirely on local hardware.
GitHub: https://github.com/ShayneP/local-voice-ai
Stack Components
- LiveKit: WebRTC media transport and signaling
- Whisper (VoxBox): Speech-to-text
- llama.cpp: Local LLM
- Kokoro: Text-to-speech (replaced in our build; see Our Modifications)
- FAISS + Sentence Transformers: RAG/knowledge retrieval (see the sketch after this list)
- Next.js: Frontend UI
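The FAISS + Sentence Transformers layer implies a simple embed-then-search retrieval pattern. Below is a minimal sketch of that pattern, not the repo's actual code; the all-MiniLM-L6-v2 model name and the toy documents are assumptions.

# Minimal retrieval sketch: embed documents, index them, search by cosine similarity.
import faiss
from sentence_transformers import SentenceTransformer

docs = [
    "LiveKit handles WebRTC transport between the browser and the agent.",
    "Whisper transcribes the user's speech to text.",
    "Kokoro synthesizes the agent's reply as audio.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(docs, normalize_embeddings=True)

# Inner-product index over normalized vectors == cosine similarity.
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

query = model.encode(["How does audio reach the agent?"], normalize_embeddings=True)
scores, ids = index.search(query, 2)
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {docs[i]}")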
System Requirements
- Docker + Docker Compose
- No GPU required (CPU-based models)
- Recommended: 12GB+ RAM
Quick Start
git clone https://github.com/ShayneP/local-voice-ai
cd local-voice-ai
./test.sh
# Access at http://localhost:3000
What We Can Learn From This
- Docker service orchestration pattern
- LiveKit agent configuration
- Whisper integration for STT
- How to swap out the TTS component (pattern sketched after this list)
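The TTS swap is mostly an interface question: if the agent only depends on a narrow synthesize-text-to-audio contract, Kokoro can be exchanged for our Inworld pipeline without touching the STT or LLM code. The sketch below illustrates that pattern with hypothetical class and method names; it is not LiveKit's actual plugin API.

# Illustrative "swappable TTS" pattern; names are hypothetical.
from typing import Protocol

class TTS(Protocol):
    def synthesize(self, text: str) -> bytes:
        """Return raw audio bytes for the given text."""
        ...

class KokoroTTS:
    def synthesize(self, text: str) -> bytes:
        # Placeholder: the real implementation would call the Kokoro container here.
        return b""

class InworldTTS:
    def synthesize(self, text: str) -> bytes:
        # Placeholder: the real implementation would call our Inworld pipeline here.
        return b""

def speak(tts: TTS, reply: str) -> bytes:
    # The rest of the pipeline only ever sees the TTS protocol.
    return tts.synthesize(reply)

audio = speak(InworldTTS(), "Hello from the local stack.")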
Our Modifications
- Replace Kokoro TTS with InWorld AI (our existing voice pipeline)
- Replace llama.cpp with Ollama running our lars-trained model (see the Ollama sketch below)
- Add a wake word detection layer with openWakeWord (sketch below)
- Add Claude Code SSH delegation (sketch below)
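For the LLM swap, Ollama exposes a local REST API on port 11434 that can stand in for the llama.cpp service. A minimal sketch, assuming a default Ollama install and that lars-trained is the local model tag; error handling and streaming are omitted.

# Query the local Ollama server's chat endpoint with our model tag.
import requests

def ask_lars(prompt: str) -> str:
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "lars-trained",
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

print(ask_lars("Summarize what the voice stack does."))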
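The wake word layer would sit in front of STT and only pass audio downstream once a detection fires. A sketch using openWakeWord, assuming the pretrained hey_jarvis model and 16 kHz, 16-bit mono frames of 1280 samples (80 ms); the real model choice and threshold need tuning, and recent openWakeWord releases may require downloading the pretrained models first.

# Wake word gate: score each audio frame and trigger above a threshold.
import numpy as np
from openwakeword.model import Model

detector = Model(wakeword_models=["hey_jarvis"])

def is_wake_word(frame: np.ndarray, threshold: float = 0.5) -> bool:
    # frame: int16 PCM samples from the microphone (16 kHz, mono).
    scores = detector.predict(frame)
    return any(score >= threshold for score in scores.values())

silence = np.zeros(1280, dtype=np.int16)  # stands in for a real microphone frame
print(is_wake_word(silence))  # expect False for silence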
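Claude Code delegation can be a thin SSH wrapper: run the claude CLI non-interactively on a remote machine and capture its stdout. The sketch assumes an SSH host alias dev-box with key-based auth and the CLI's -p/--print mode; both are assumptions to verify against our setup.

# Delegate a task to Claude Code over SSH and return its printed response.
import shlex
import subprocess

def delegate_to_claude(task: str, host: str = "dev-box") -> str:
    # Quote the task so the remote shell receives it as a single argument.
    remote_cmd = f"claude -p {shlex.quote(task)}"
    result = subprocess.run(
        ["ssh", host, remote_cmd],
        capture_output=True,
        text=True,
        timeout=600,
        check=True,
    )
    return result.stdout.strip()

print(delegate_to_claude("List the services defined in the local-voice-ai compose file."))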