
Local LLM Server Setup - Tomorrow's Plan

Session: s_801t · Date: December 9, 2025 · Track Project: 0187a93c

Prerequisites (Christopher)

  1. Flash Ubuntu Server 24.04 to USB drive
     • Download: https://ubuntu.com/download/server
     • Use Rufus or balenaEtcher
  2. Move GPUs from desktop to new machine
     • GTX 1070 (8GB) - Primary
     • GTX 1060 (6GB) - Secondary
  3. Connect desktop monitors to motherboard (Intel UHD 630)
  4. Boot new machine from USB, install Ubuntu

During Ubuntu Install

  • Hostname: llm-server or local-ai
  • Username: nexus (or your preference)
  • Enable OpenSSH Server
  • Use entire disk (or configure NVMe RAID)

Once SSH is Available (Claude takes over)

Step 1: System Updates

sudo apt update && sudo apt upgrade -y

Step 2: NVIDIA Drivers

sudo apt install nvidia-driver-550 -y
sudo reboot

Step 3: Verify GPUs

nvidia-smi
# Should show both GTX 1070 and GTX 1060
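If you want to check the GPUs from a script rather than eyeballing the `nvidia-smi` table, a small Python sketch can query its machine-readable CSV mode (`--query-gpu` and `--format=csv` are standard `nvidia-smi` options; the helper names here are my own):

```python
import subprocess

def parse_gpu_csv(csv_text: str) -> list[tuple[str, int]]:
    """Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`
    output into (name, memory_in_MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = (field.strip() for field in line.split(","))
        gpus.append((name, int(mem.split()[0])))  # "8192 MiB" -> 8192
    return gpus

def detected_gpus() -> list[tuple[str, int]]:
    """Run nvidia-smi on this machine and return the detected GPUs."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        text=True,
    )
    return parse_gpu_csv(out)
```

With both cards installed, `detected_gpus()` should return two entries (the 1070 with 8192 MiB and the 1060 with 6144 MiB).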

Step 4: CUDA Toolkit

sudo apt install nvidia-cuda-toolkit -y

Step 5: Tailscale

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

Step 6: Ollama

curl -fsSL https://ollama.com/install.sh | sh

Step 7: Pull Test Model

ollama pull qwen2.5:1.5b

Step 8: Test Locally

ollama run qwen2.5:1.5b
# Chat interactively and check responsiveness

Step 9: Configure Network Access

sudo systemctl edit ollama
# In the editor, add under a [Service] section:
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama
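For reference, the drop-in that `systemctl edit` creates should end up looking like this (the path is systemd's default override location for the unit):

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
```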

Step 10: Test from Cortex

curl http://<tailscale-ip>:11434/api/generate \
  -d '{"model":"qwen2.5:1.5b","prompt":"Hello"}'
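From Cortex-side code, the same check can be done in Python. Ollama's `/api/generate` endpoint streams newline-delimited JSON chunks, each carrying a `response` fragment, with `done: true` on the final one; the function and host names below are illustrative:

```python
import json
import urllib.request

def join_stream(ndjson_lines) -> str:
    """Concatenate the "response" fragments from Ollama's streaming NDJSON reply."""
    parts = []
    for raw in ndjson_lines:
        if not raw.strip():
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

def generate(host: str, prompt: str, model: str = "qwen2.5:1.5b") -> str:
    """POST a prompt to an Ollama server and return the full generated text.
    `host` is the server's Tailscale IP (placeholder - substitute your own)."""
    body = json.dumps({"model": model, "prompt": prompt}).encode()
    req = urllib.request.Request(f"http://{host}:11434/api/generate", data=body)
    with urllib.request.urlopen(req) as resp:
        return join_stream(line.decode() for line in resp)
```

`join_stream` is separated out so the NDJSON handling can be tested without a live server.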

Optional: Open WebUI

sudo apt install docker.io -y
# Note: --gpus all also requires the NVIDIA Container Toolkit on the host
sudo docker run -d -p 3000:8080 \
  --gpus all \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

LM Studio Setup (Windows Desktop)

  1. Download LM Studio from lmstudio.ai
  2. Settings → Remote Server
  3. Enter Tailscale IP: http://<tailscale-ip>:11434
  4. Browse and test models

Success Metrics

  • [ ] Both GPUs detected in nvidia-smi
  • [ ] Ollama responding on port 11434
  • [ ] Qwen2.5 1.5B running at 40+ tokens/sec
  • [ ] LM Studio connected from Windows
  • [ ] Accessible via Tailscale from anywhere
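The tokens/sec target can be measured from the final chunk of an `/api/generate` response, which includes `eval_count` (tokens generated) and `eval_duration` (nanoseconds); a one-line sketch of the arithmetic:

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from Ollama's final-chunk stats:
    eval_count tokens over eval_duration nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)
```

For example, 200 tokens generated in 5 seconds (5,000,000,000 ns) is exactly the 40 tok/s target.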
ID: 710d3131
Path: Tomorrow's Local LLM Setup Plan
Updated: 2026-01-13T12:51:00