Server Mirroring & Redundancy

Research by Ray on failover strategies using Lsyncd + Redis Sentinel

mirroring redundancy failover lsyncd sentinel ray

NEXUS SERVER REDUNDANCY RESEARCH REPORT

Agent: Ray | Date: January 13, 2026

EXECUTIVE SUMMARY

Recommend a layered approach for Nexus failover:

  1. Redis Sentinel for automatic database failover (built into Redis)
  2. Lsyncd for MCP server code replication (simple, proven)
  3. Manual failover initially (switch Tailscale DNS/IP)

CURRENT ARCHITECTURE (from KB)

  • Primary: cortex-nexus-master (2TB) - 100.73.67.78
  • Backup: cortex-storage (1TB) - 100.112.54.111
  • Redis: Vault/Operational dual-pod pattern already in place
  • Docker containers with volumes at /data/nexus3/

TOOL COMPARISON MATRIX

Tool           | Level | Sync Type      | Real-time     | Bi-directional     | Complexity | Best For
---------------|-------|----------------|---------------|--------------------|------------|---------------------------------------------------
rsync          | File  | Periodic       | No            | No                 | Low        | Simple backups
Lsyncd         | File  | Near real-time | Yes (inotify) | No                 | Low-Medium | MCP code sync
DRBD           | Block | Real-time      | Yes           | Yes (dual-primary) | High       | Databases, requires dedicated partition
GlusterFS      | File  | Real-time      | Yes           | Yes                | High       | Large-scale distributed storage (needs 3+ nodes)
Syncthing      | File  | Real-time      | Yes           | Yes                | Low        | Peer-to-peer sync, easy setup
Redis Sentinel | Redis | Real-time      | Yes           | N/A (master-slave) | Medium     | Redis automatic failover

LAYER 1: REDIS REPLICATION (Built-in Solution)

Current State

The vault/operational pattern already runs within a single server. It can be extended across servers.

Cross-Server Redis Sentinel Setup

  • Deploy 3 Sentinel instances (minimum) across both servers
  • Configure vault containers on backup as replicas of primary vaults
  • Sentinel monitors masters and auto-promotes replica on failure

Configuration:

sentinel monitor nexus-kb-master 100.73.67.78 6625 2
sentinel down-after-milliseconds nexus-kb-master 5000
sentinel failover-timeout nexus-kb-master 60000
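
The trailing 2 on the sentinel monitor line is the quorum needed to mark the master as down. Starting a failover additionally requires authorization from a majority of all Sentinels. With three Sentinels spread over two servers, the outcome therefore depends on which server hosts two of them. A minimal Python sketch of that arithmetic follows; the host labels and the can_fail_over helper are hypothetical and only illustrate the rule.

```python
from typing import Mapping


def can_fail_over(placement: Mapping[str, int], failed_host: str, quorum: int) -> bool:
    """Return True if the Sentinels that survive losing `failed_host`
    can both detect the failure (quorum) and authorize a failover (majority).

    `placement` maps a host label to the number of Sentinels it runs.
    """
    total = sum(placement.values())
    surviving = total - placement.get(failed_host, 0)
    majority = total // 2 + 1
    return surviving >= quorum and surviving >= majority


# Hypothetical layouts for the two-server setup, with quorum = 2 as configured above.
two_on_primary = {"primary": 2, "backup": 1}
two_on_backup = {"primary": 1, "backup": 2}
third_node = {"primary": 1, "backup": 1, "third": 1}

print(can_fail_over(two_on_primary, "primary", quorum=2))  # False: only 1 Sentinel left
print(can_fail_over(two_on_backup, "primary", quorum=2))   # True: 2 of 3 remain
print(can_fail_over(third_node, "primary", quorum=2))      # True: 2 of 3 remain
```

Under these assumptions, losing the host that runs two Sentinels leaves no majority, which is one argument for placing the third Sentinel on a separate node.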

Pros:

  • Native Redis feature, no new software
  • Automatic failover (30-60 seconds)
  • Already familiar with vault/operational pattern

Cons:

  • Asynchronous replication (some data loss possible)
  • Need minimum 3 Sentinel instances for quorum
  • MCP servers need Sentinel-aware connection logic
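
To make "Sentinel-aware connection logic" concrete: instead of hard-coding the master address, a client asks a Sentinel for the current master before connecting. The sketch below does this with only the Python standard library by sending SENTINEL get-master-addr-by-name over a raw socket. The helper names, the Sentinel port 26379 (the conventional default), and the single recv call are assumptions made for illustration, not the Nexus implementation.

```python
import socket
from typing import Optional, Sequence, Tuple


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_master_addr(reply: bytes) -> Optional[Tuple[str, int]]:
    """Parse the two-element array reply (ip, port); None for any other reply."""
    lines = reply.split(b"\r\n")
    if len(lines) < 5 or lines[0] != b"*2":
        return None
    return lines[2].decode(), int(lines[4])


def find_master(
    sentinels: Sequence[Tuple[str, int]], name: str, timeout: float = 1.0
) -> Optional[Tuple[str, int]]:
    """Ask each Sentinel in turn; return the first master address reported."""
    for host, port in sentinels:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                conn.sendall(encode_command("SENTINEL", "get-master-addr-by-name", name))
                # Sketch only: assumes the short reply arrives in one read.
                addr = parse_master_addr(conn.recv(4096))
        except OSError:
            continue  # this Sentinel is unreachable, try the next one
        if addr is not None:
            return addr
    return None


# Hypothetical usage; the Sentinel addresses are assumed, not taken from the KB:
# master = find_master([("100.73.67.78", 26379), ("100.112.54.111", 26379)],
#                      "nexus-kb-master")
```

In practice a client library with Sentinel support (for example redis-py's redis.sentinel.Sentinel) would likely replace this hand-rolled lookup. The point is only that MCP servers must re-resolve the master after a failover rather than cache a fixed address.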


LAYER 2: MCP SERVER CODE REPLICATION

Option A: Lsyncd

Lsyncd watches the source tree with inotify and batches changes into rsync runs for near real-time sync.

Setup:

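-- /etc/lsyncd.conf on primary
-- Mirror the MCP server code to the backup over SSH, one way only.
sync {
    default.rsync,
    source = "/opt/mcp-servers/",
    target = "100.112.54.111:/opt/mcp-servers/",
    delay = 15,  -- seconds to batch changes before syncing (Lsyncd default)
    rsync = {
        archive = true,   -- preserve permissions, times, and symlinks
        compress = true,  -- compress data in transit
        rsh = "/usr/bin/ssh -l nexus"  -- connect as user "nexus"
    }
}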

Pros:

  • 15-second batching (configurable)
  • Simple setup, low overhead
  • Uses existing SSH/rsync
  • One-way (prevents split-brain)

Cons:

  • Not truly real-time (15s delay default)
  • One-way only

Option B: Syncthing

Peer-to-peer real-time sync with web UI.

Pros:

  • Bidirectional by default
  • Easy Docker deployment
  • Encrypted, no manual SSH setup
  • Web UI for monitoring

Cons:

  • More overhead than Lsyncd
  • Conflict handling adds complexity


LAYER 3: DOCKER CONTAINER STRATEGY

Docker doesn't natively support live container migration; that would require DRBD or shared storage. A simpler cold-standby approach:

  1. Keep identical docker-compose files on backup server
  2. Sync /data/nexus3/ volumes via Lsyncd
  3. On failover: start containers on backup

Commands for failover:

# On backup server
cd /opt/mcp-servers
docker-compose up -d

LAYER 4: FAILOVER STRATEGY

Manual Failover (Phase 1)

  1. Detect failure (monitoring/alert)
  2. SSH to backup: start Docker containers
  3. Update Tailscale DNS or use floating IP
  4. MCP clients reconnect

Estimated failover time: 2-5 minutes with manual intervention

Semi-Automatic (Phase 2)

  1. Keepalived for floating IP between servers
  2. Health check scripts monitor primary
  3. Auto-start containers on failure detection
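
The health-check and auto-start steps above can be sketched as a small Python monitor. It probes a TCP port on the primary and only triggers after several consecutive failures, so a single dropped probe does not start the standby. The threshold, probe interval, and the FailureDetector, tcp_alive, start_standby, and monitor names are all hypothetical; the docker-compose invocation mirrors the manual failover commands in this report.

```python
import socket
import subprocess
import time


def tcp_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailureDetector:
    """Trigger only after `threshold` consecutive failed probes."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    def observe(self, healthy: bool) -> bool:
        """Record one probe result; return True once failover should start."""
        self.failures = 0 if healthy else self.failures + 1
        return self.failures >= self.threshold


def start_standby(compose_dir: str = "/opt/mcp-servers") -> None:
    """Start the standby containers (run on the backup server)."""
    subprocess.run(["docker-compose", "up", "-d"], cwd=compose_dir, check=True)


def monitor(host: str, port: int, interval: float = 5.0) -> None:
    """Probe the primary until the detector fires, then start the standby."""
    detector = FailureDetector(threshold=3)
    while not detector.observe(tcp_alive(host, port)):
        time.sleep(interval)
    start_standby()


# Hypothetical invocation on the backup server:
# monitor("100.73.67.78", 6625)
```

A consecutive-failure threshold trades a slightly slower reaction for fewer false failovers. Because the backup cannot tell a dead primary from a broken network link, this check alone does not rule out split-brain.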

Full Automatic (Phase 3)

  1. Redis Sentinel handles Redis failover
  2. Consul/Nomad for service discovery
  3. Automatic DNS updates

Phase 1: Foundation (Simplest Path)

  1. Install Lsyncd on primary
     • Sync /opt/mcp-servers/ → backup
     • Sync /data/nexus3/ volumes → backup (excluding container runtime data)
  2. Setup Redis replicas on backup
     • Each vault on backup = replica of primary vault
     • Same port scheme, just replica mode
  3. Document manual failover procedure

Phase 2: Automation

  1. Deploy Redis Sentinel (3 instances across both servers)
  2. Add Keepalived for floating Tailscale IP
  3. Create failover scripts

GOTCHAS & LIMITATIONS

  1. Tailscale IPs are fixed - IPs can't easily be moved between machines. Options:
     • Use Tailscale MagicDNS names
     • Run HAProxy/Nginx on a third node
     • Use Tailscale Funnel for external access
  2. Redis async replication - Some writes may be lost (milliseconds' worth)
  3. Storage size mismatch - Backup has 1TB vs. 2TB on primary. Prioritize critical data.
  4. Split-brain risk - Both servers may believe they are master. Sentinel quorum helps.
  5. MCP port conflicts - Same ports on both servers means both can't run simultaneously (not an issue for the standby model)
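
For the storage size mismatch, one way to decide what to replicate is to measure each candidate directory and then pick directories in priority order until the backup's capacity budget is used up. The Python sketch below assumes that approach; the directory labels, sizes, and budget in the example are made up, and tree_size and select_within_budget are hypothetical helpers.

```python
import os
from typing import Dict, List


def tree_size(path: str) -> int:
    """Total size in bytes of regular files under `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def select_within_budget(sizes: Dict[str, int], priority: List[str], budget: int) -> List[str]:
    """Walk `priority` from most to least critical, keeping each entry that still fits."""
    chosen: List[str] = []
    used = 0
    for name in priority:
        size = sizes[name]
        if used + size <= budget:
            chosen.append(name)
            used += size
    return chosen


# Made-up sizes in GB, with a 1000 GB budget standing in for the 1TB backup.
sizes = {"vault": 600, "media": 500, "configs": 300}
print(select_within_budget(sizes, ["vault", "media", "configs"], budget=1000))
# ['vault', 'configs']
```

In real use the sizes would come from tree_size on the actual /data/nexus3/ subdirectories, and the budget should leave headroom below the full 1TB.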


ESTIMATED COMPLEXITY

Component                      | Effort | Risk
-------------------------------|--------|-------
Lsyncd setup                   | Low    | Low
Redis cross-server replication | Medium | Low
Redis Sentinel                 | Medium | Medium
Manual failover docs           | Low    | Low
Automatic failover             | High   | Medium

SOURCES

  • Redis Sentinel: https://redis.io/docs/latest/operate/oss_and_stack/management/sentinel/
  • Lsyncd: https://lsyncd.github.io/lsyncd/
  • DRBD vs rsync: https://iamvhl.medium.com/drbd-vs-rsync-92e8c2c53f9d
  • Syncthing: https://syncthing.net/
  • Docker HA: https://www.evidian.com/products/high-availability-software-for-application-clustering/
ID: 2837bc42
Updated: 2026-01-13T12:08:09