AI Message (aimsg) MCP Server
Version: 2.3.0
Ports: 6690 (vault) / 6691 (operational)
Location: /opt/mcp-servers/aimsg/mcp_aimsg_server.py
Connection: DIRECT MCP (not through Gateway)
Status: ✅ WORKING
Overview
The AI Message server enables AI-to-AI coordination through AI Groups. It provides the infrastructure for multiple Claude instances to work together on complex tasks with role-based hierarchy.
CRITICAL: This server runs as DIRECT MCP connections from each Claude process, NOT through the Gateway. This prevents message truncation and allows true blocking waits.
Architecture
ID Hierarchy
| Prefix | Type | Role |
|---|---|---|
| g_XXXX | Group | Container for coordination |
| o_XXXX | Ops | Manager - creates tasks, voices to user |
| a_XXXX | Agent | Worker - executes tasks silently |
| msg_XXXX | Message | Individual message ID |
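The prefix table can be turned into a small lookup. This is a minimal sketch, not the server's code: the names `ID_PREFIXES`, `classify_id`, and `can_voice_to_user` are hypothetical helpers introduced here for illustration.

```python
# Prefix-to-type mapping copied from the ID table.
ID_PREFIXES = {
    "g_": "group",
    "o_": "ops",
    "a_": "agent",
    "msg_": "message",
}


def classify_id(identifier: str) -> str:
    """Return the entity type for an aimsg ID, based only on its prefix."""
    for prefix, kind in ID_PREFIXES.items():
        if identifier.startswith(prefix):
            return kind
    raise ValueError(f"unrecognized aimsg ID prefix: {identifier!r}")


def can_voice_to_user(identifier: str) -> bool:
    """Only ops IDs may speak to the user; agents stay silent."""
    return classify_id(identifier) == "ops"
```

A caller could use `can_voice_to_user(my_id)` as a guard before producing any user-facing output.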
Roles Explained
Ops Manager (o_XXXX)
- Creates and manages groups
- VOICES to user - only ops speaks to Chris
- Delegates tasks to agents
- Reviews agent reports
- Can transfer ops role to agent during handoffs

Agent (a_XXXX)
- Silent worker - NO voice output to user
- Receives tasks from ops
- Executes work using auto.* tools
- Reports completion via aimsg.send
- Uses blocking wait() instead of polling
Channel Structure
Channels are DIRECT (not group chat):
- aigroup:channel:{g_id}:{o_id}:{a_id} - Ops ↔ Agent
- aigroup:channel:{g_id}:{o_id1}:{o_id2} - Ops ↔ Ops
Redis Key Patterns
aigroup:group:{g_id} - Group metadata
aigroup:group:{g_id}:ops - Hash of ops members
aigroup:group:{g_id}:agents - Hash of agent members
aigroup:channel:{g_id}:{id1}:{id2} - Direct channel messages
aigroup:wait:{g_id}:{id} - Blocking wait queue (BLPOP)
aigroup:active - Set of active group IDs
aigroup:ai:{ai_name}:groups - Groups an AI is in
aigroup:pool:queue - Agent pool FIFO queue
aigroup:ops_pool:queue - Ops pool FIFO queue
aigroup:heartbeat:{agent_id} - Agent heartbeat tracking
aigroup:archive:{YYYYMMDD}:{msg_id} - Permanent message archive
aigroup:archive:timeline - Sorted set for time queries
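The key patterns above can be sketched as plain string builders. The helper names are hypothetical. The Security Assessment mentions sorted IDs, so this sketch assumes plain lexicographic sorting of the two member IDs; the server's actual ordering may differ (the Channel Structure patterns list the ops ID first).

```python
def channel_key(group_id: str, id_a: str, id_b: str) -> str:
    """Build the direct-channel key for two members of a group.

    ASSUMPTION: member IDs are sorted lexicographically so that both
    participants derive the same key no matter who is sending.
    """
    first, second = sorted((id_a, id_b))
    return f"aigroup:channel:{group_id}:{first}:{second}"


def wait_key(group_id: str, member_id: str) -> str:
    """Build the key of the blocking wait queue for one group member."""
    return f"aigroup:wait:{group_id}:{member_id}"


def heartbeat_key(agent_id: str) -> str:
    """Build the heartbeat tracking key for one agent."""
    return f"aigroup:heartbeat:{agent_id}"
```

Sorting makes the key symmetric: `channel_key(g, a, b)` and `channel_key(g, b, a)` name the same channel.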
Tools (23 total; 20 are documented below)
Group Management
| Tool | Parameters | Description |
|---|---|---|
| initiate | topic (req), my_name, description | Create new AI Group as ops |
| join | group_id (req), my_name, as_ops, assigned_by | Join group as agent or ops |
| status | group_id, my_name | Get group details or list groups |
| close | group_id (req), my_ops_id (req) | Archive a group |
Messaging
| Tool | Parameters | Description |
|---|---|---|
| send | group_id, my_id, to_id, content, message_type, tool_usage, files_modified | Send message (types: message, task, report, verification, question) |
| read | group_id, my_id, other_id, limit, unread_only | Read channel messages (marks as read) |
| pending | group_id, my_id | Get SUMMARY of unread (counts only) |
| mark_read | group_id, my_id, other_id, msg_id | Mark specific sender's messages as read |
| mark_all_read | group_id, my_id | Batch mark ALL pending as read |
| wait | group_id, my_id, timeout | Block until message arrives (uses Redis BLPOP) |
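As an illustration of the send parameters, the sketch below validates message_type against the five types listed for send before a message is dispatched. `OutgoingMessage` is a hypothetical client-side helper, not part of the server; the default type of "message" is an assumption, and tool_usage is omitted because its shape is not documented here.

```python
from dataclasses import dataclass, field

# The five message types listed for the send tool.
MESSAGE_TYPES = ("message", "task", "report", "verification", "question")


@dataclass
class OutgoingMessage:
    """Client-side container for the arguments passed to send."""

    group_id: str
    my_id: str
    to_id: str
    content: str
    message_type: str = "message"  # ASSUMPTION: default type
    files_modified: list = field(default_factory=list)

    def __post_init__(self):
        # Reject unknown types early instead of sending a malformed message.
        if self.message_type not in MESSAGE_TYPES:
            raise ValueError(f"unknown message_type: {self.message_type!r}")
```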
Recovery & Handoff
| Tool | Parameters | Description |
|---|---|---|
| rejoin | my_name (req), group_id | Recover after context reset (returns summary) |
| transfer_ops | group_id, from_ops_id, to_agent_id, context_summary, verification_passed | Handoff ops role to agent |
Agent Pool
| Tool | Parameters | Description |
|---|---|---|
| pool_register | agent_name (req), terminal_info | Register in agent holding pool |
| pool_register_ops | ops_name (req), terminal_info | Register in ops holding pool |
| pool_list | status | List agents in pool (available/claimed/working/offline) |
| pool_claim | ops_id (req), agent_id, target_group_id, task | Claim agent for assignment |
| pool_release | agent_id (req), remove | Return agent to pool or remove |
Health Monitoring
| Tool | Parameters | Description |
|---|---|---|
| heartbeat | agent_id (req), group_id | Record heartbeat (call every 2 min) |
| agent_status | agent_id (req) | Check if agent online/offline |
Administration
| Tool | Parameters | Description |
|---|---|---|
| wipe_all | - | Master reset - clears all groups, pools, channels |
Key Concepts
Blocking Wait vs Polling
Agents MUST use aimsg.wait() instead of polling loops:
# ✅ CORRECT - Blocking wait
result = aimsg.wait(group_id='g_abc1', my_id='a_xyz2', timeout=600)
# Blocks until message arrives or timeout
# ❌ WRONG - Polling loop
while True:
    messages = aimsg.pending(group_id, my_id)
    if messages['total_pending'] > 0:
        break
    time.sleep(5)  # Burns resources, creates context noise
Why BLPOP matters: Redis BLPOP blocks at the database level - no CPU usage, no context consumption, instant wake-up when message arrives.
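The blocking behaviour can be demonstrated without Redis. In this sketch, Python's `queue.Queue` stands in for the BLPOP wait queue: `get(timeout=...)` parks the caller until an item arrives or the timeout expires. The function name `blocking_wait` and the message fields are hypothetical.

```python
import queue
import threading
import time


def blocking_wait(inbox: queue.Queue, timeout: float):
    """Block until a message arrives; return None if the timeout expires."""
    try:
        return inbox.get(timeout=timeout)
    except queue.Empty:
        return None


inbox = queue.Queue()

# Simulate ops delivering a task 0.2 seconds from now.
task = {"type": "task", "content": "Build feature A"}
sender = threading.Timer(0.2, inbox.put, args=(task,))
sender.start()

started = time.monotonic()
message = blocking_wait(inbox, timeout=5)  # wakes as soon as the task lands
elapsed = time.monotonic() - started
sender.join()
```

The waiter returns shortly after the message is delivered rather than after the full timeout, and there is no loop or sleep on the waiting side.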
Context Bomb Prevention
Problem: pending() and rejoin() used to return full message content inline, causing 100K+ character responses.
Solution (v2.3.0):
- pending() returns SUMMARY only - counts, types, timestamps
- rejoin() returns pending_summary not pending_messages
- Use read() to fetch actual content when ready
# pending() returns:
{
    'by_sender': {
        'o_ulgh': {'count': 7, 'types': ['task', 'report'], 'latest': '...'},
        'a_7yma': {'count': 3, 'types': ['message'], 'latest': '...'}
    },
    'total_pending': 10
}
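A summary of this shape could be computed from raw messages roughly as follows. This is a sketch only: the input field names `from_id`, `type`, and `timestamp` are assumptions, and timestamps are assumed to be ISO-8601 strings in a single format so that string comparison matches chronological order.

```python
def summarize_pending(messages):
    """Collapse unread messages into per-sender counts, types, and latest time."""
    by_sender = {}
    for msg in messages:
        entry = by_sender.setdefault(
            msg["from_id"], {"count": 0, "types": [], "latest": None}
        )
        entry["count"] += 1
        # Record each message type once, in first-seen order.
        if msg["type"] not in entry["types"]:
            entry["types"].append(msg["type"])
        # Keep the most recent timestamp seen for this sender.
        if entry["latest"] is None or msg["timestamp"] > entry["latest"]:
            entry["latest"] = msg["timestamp"]
    return {"by_sender": by_sender, "total_pending": len(messages)}
```

Message bodies never enter the result, which keeps the response size proportional to the number of senders rather than the volume of unread content.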
Read Status Persistence
Fixed in v2.3.0: Read status now properly persists to Redis.
- read() marks messages as read (sets read: true)
- mark_read() explicitly marks sender's messages
- mark_all_read() batch clears entire pending queue
Heartbeat System
Agents should call heartbeat() every 2 minutes while working:
- Timeout: 5 minutes without heartbeat → marked offline
- Ops can check agent status before assigning work
- Offline agents can be released back to pool
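The timing rules above reduce to two comparisons. The sketch below uses hypothetical helper names and assumes timestamps in seconds; treating an agent as offline only once more than 5 minutes have passed (rather than at exactly 5 minutes) is also an assumption.

```python
HEARTBEAT_INTERVAL_SECONDS = 120  # agents call heartbeat() every 2 minutes
OFFLINE_AFTER_SECONDS = 300       # 5 minutes of silence marks an agent offline


def agent_state(last_heartbeat: float, now: float) -> str:
    """Classify an agent from the time of its most recent heartbeat."""
    if now - last_heartbeat > OFFLINE_AFTER_SECONDS:
        return "offline"
    return "online"


def heartbeat_due(last_sent: float, now: float) -> bool:
    """Tell a working agent whether it is time to send another heartbeat."""
    return now - last_sent >= HEARTBEAT_INTERVAL_SECONDS
```

With a 2-minute interval and a 5-minute timeout, an agent can miss one heartbeat and still be considered online.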
Agent Pool Architecture
Two persistent holding pools:
- Agent Pool (agents group)
  - Agents join via /agnt skill
  - Wait in pool until ops claims them
  - FIFO ordering (oldest available first)
- Ops Pool (ops group)
  - Backup ops join via /opssub skill
  - Wait for primary ops to assign QA/documentation work
Pool Workflow
/agnt → pool_register() → a_XXXX created
↓
aimsg.wait('agents', 'a_XXXX', 600) → BLOCKS
↓
Ops: pool_claim(ops_id, agent_id, task)
↓
Agent wakes with assignment message
↓
Agent joins work group, executes task
↓
Agent reports completion, ops releases
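The register/claim/release cycle in this workflow can be modelled in memory. This is a sketch under stated assumptions, not the server's implementation: `AgentPool` is a hypothetical class, a `deque` stands in for the Redis FIFO queue, and a released agent is assumed to rejoin at the back of the queue.

```python
from collections import deque


class AgentPool:
    """In-memory model of the FIFO holding pool."""

    def __init__(self):
        self._queue = deque()  # available agent IDs, oldest first
        self._claimed = {}     # agent_id -> ops_id that claimed it

    def register(self, agent_id):
        self._queue.append(agent_id)

    def claim(self, ops_id, agent_id=None):
        """Claim a specific agent, or the oldest available one if none is named."""
        if agent_id is None:
            if not self._queue:
                return None
            agent_id = self._queue.popleft()
        elif agent_id in self._queue:
            self._queue.remove(agent_id)
        else:
            return None
        self._claimed[agent_id] = ops_id
        return agent_id

    def release(self, agent_id, remove=False):
        """Return an agent to the pool, or drop it entirely when remove=True."""
        self._claimed.pop(agent_id, None)
        if not remove:
            self._queue.append(agent_id)  # ASSUMPTION: rejoins at the back

    def available(self):
        return list(self._queue)
```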
Critical Protocols
⚠️ NEVER Kill/Restart aimsg Server
Why: aimsg runs inside Claude's process as direct MCP, NOT through Gateway. Killing it:
- Severs ALL active agent connections
- Breaks all blocking waits
- Requires Chris to restart all terminals
Code changes are picked up automatically - just modify the file and call functions.
Agent Rules
❌ NO voice output - only Ops speaks to user
❌ NO polling loops - use wait() to block
❌ NO Claude built-in tools - use auto.* via gateway
✅ USE aimsg.wait() to block for messages
✅ USE auto.read, auto.write, auto.edit, auto.bash
✅ REPORT completion via aimsg.send()
✅ CALL heartbeat() every 2 minutes while working
Recovery Protocol
After context reset:
# 1. Rejoin to find your ID
result = aimsg.rejoin(my_name='Rocky', group_id='g_7emp')
# Returns: {my_id: 'o_cq0c', pending_count: 15, pending_summary: {...}}
# 2. Read messages from specific senders
messages = aimsg.read(group_id, my_id, 'o_ulgh', limit=5)
# 3. Mark as read when processed
aimsg.mark_all_read(group_id, my_id)
# 4. Resume waiting
aimsg.wait(group_id, my_id, timeout=300)
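Steps 1 to 3 can be wrapped in one helper. The sketch below is hypothetical: `recover` accepts any object exposing rejoin, read, and mark_all_read, and `FakeAimsg` is an in-memory stand-in used only to exercise it. It assumes pending_summary carries the same by_sender layout that pending() returns.

```python
def recover(client, my_name, group_id, per_sender_limit=5):
    """Rejoin, read a few messages per sender, then clear the pending queue."""
    rejoined = client.rejoin(my_name=my_name, group_id=group_id)
    my_id = rejoined["my_id"]

    fetched = {}
    # ASSUMPTION: pending_summary uses the same 'by_sender' layout as pending().
    for sender_id in rejoined["pending_summary"].get("by_sender", {}):
        fetched[sender_id] = client.read(
            group_id, my_id, sender_id, limit=per_sender_limit
        )

    client.mark_all_read(group_id, my_id)
    return my_id, fetched


class FakeAimsg:
    """Stand-in client that records calls; not the real server."""

    def __init__(self):
        self.calls = []

    def rejoin(self, my_name, group_id):
        self.calls.append("rejoin")
        return {
            "my_id": "o_cq0c",
            "pending_count": 2,
            "pending_summary": {"by_sender": {"o_ulgh": {"count": 2}}},
        }

    def read(self, group_id, my_id, other_id, limit=5):
        self.calls.append(f"read:{other_id}")
        return [{"content": "..."}]

    def mark_all_read(self, group_id, my_id):
        self.calls.append("mark_all_read")
```

The caller resumes the blocking wait (step 4) with the returned `my_id`, so the helper itself never blocks.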
Workflow Examples
Ops Creating Group & Claiming Agents
# 1. Create group
result = aimsg.initiate(topic='Project X', my_name='ops-primary')
# g_abc1, o_xyz2
# 2. See available agents
agents = aimsg.pool_list(status='available')
# 3. Claim agents
aimsg.pool_claim(ops_id='o_xyz2', target_group_id='g_abc1', task='Build feature A')
aimsg.pool_claim(ops_id='o_xyz2', target_group_id='g_abc1', task='Write tests')
# 4. Send detailed tasks
aimsg.send(group_id='g_abc1', my_id='o_xyz2', to_id='a_agent1',
           content='Task details...', message_type='task')
# 5. Wait for reports
aimsg.wait(group_id='g_abc1', my_id='o_xyz2', timeout=600)
Agent Receiving & Executing Task
# 1. Register in pool (via /agnt skill)
result = aimsg.pool_register(agent_name='Indiana')
# a_jh9b
# 2. Wait for assignment
assignment = aimsg.wait(group_id='agents', my_id='a_jh9b', timeout=600)
# 3. Join work group
aimsg.join(group_id='g_abc1', my_name='Indiana', assigned_by='o_xyz2')
# 4. Execute work (using auto.* tools)
gateway.run([{server:'auto', tool:'read', args:{path:'/opt/...'}}])
gateway.run([{server:'auto', tool:'edit', args:{...}}])
# 5. Send heartbeats while working
aimsg.heartbeat(agent_id='a_jh9b', group_id='g_abc1')
# 6. Report completion
aimsg.send(group_id='g_abc1', my_id='a_jh9b', to_id='o_xyz2',
           content='Task complete. Details...', message_type='report',
           files_modified=['/opt/mcp-servers/...'])
# 7. Wait for next task
aimsg.wait(group_id='g_abc1', my_id='a_jh9b', timeout=600)
Ops Handoff (Context Limit)
# 1. Send context summary to agent
aimsg.send(group_id, my_id, agent_id, 'CONTEXT SUMMARY: We are building...',
           message_type='message')
# 2. Verify understanding
aimsg.send(group_id, my_id, agent_id,
           'Questions: What are we building? What phase? Next steps?',
           message_type='verification')
# 3. Wait for confirmation
response = aimsg.wait(group_id, my_id, timeout=120)
# 4. Transfer ops role
aimsg.transfer_ops(group_id, from_ops_id=my_id, to_agent_id=agent_id,
                   context_summary='Full summary...', verification_passed=True)
Integration with LARS Training
Why This Matters for AI Training
- Coordination Patterns: Claude learns how to work with other AI instances
- Role Understanding: Clear ops vs agent responsibilities
- Recovery Protocols: How to resume after context resets
- Blocking Operations: Efficient resource usage via BLPOP
Training Data Opportunities
- Archive messages for coordination pattern examples
- Task → Report pairs for instruction tuning
- Recovery sequences for context management training
Message Archive
All messages are permanently archived and survive a wipe:
- Key: aigroup:archive:{YYYYMMDD}:{msg_id}
- Timeline: aigroup:archive:timeline sorted set
- Search via search.aimsg tool
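The archive key layout can be sketched as follows. The helper names are hypothetical, and two details are assumptions rather than documented behaviour: that the YYYYMMDD date is taken in UTC, and that the timeline sorted set is scored by Unix timestamp.

```python
from datetime import datetime, timezone


def archive_key(msg_id: str, sent_at: datetime) -> str:
    """Build the permanent archive key for one message."""
    # ASSUMPTION: the date component is computed in UTC.
    day = sent_at.astimezone(timezone.utc).strftime("%Y%m%d")
    return f"aigroup:archive:{day}:{msg_id}"


def timeline_score(sent_at: datetime) -> float:
    """Score for the aigroup:archive:timeline sorted set."""
    # ASSUMPTION: entries are scored by Unix timestamp.
    return sent_at.timestamp()
```

Scoring by timestamp would let a time-range query over the sorted set return message keys in chronological order.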
Security Assessment
✅ Passwords via credentials_helper (locker l_4f35)
✅ Auto-approve managed per session (enable on join, disable on close)
✅ No command injection vectors
✅ Channel isolation (sorted IDs prevent spoofing)
Server: v2.3.0 | Documented by Rocky (o_cq0c) | 2026-01-06