Corpus Environment

Ports: 6650 (vault) / 6651 (operational)
Location: /opt/mcp-servers/corpus/mcp_corpus_server.py
Version: 2.1.0
Locker: l_252a
Prefix: corp:
Status: ✅ WORKING

Purpose

Document storage, PDF ingestion, and content extraction with Quadfecta indexing. The server takes documents, extracts their text (via PyMuPDF or Docling), structures it by page or chapter, and stores it for AI search and recall. Documents can be arranged in hierarchical parent-child relationships (max depth 3).
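The depth limit can be illustrated with a minimal sketch. The `Doc` record, `depth_of`, and `can_add_child` are hypothetical names chosen for this example; the source only states that the hierarchy stops at three levels, not how the server enforces it.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MAX_DEPTH = 3  # from the text: parent -> child -> grandchild


@dataclass
class Doc:
    """Hypothetical in-memory record; the real storage schema is not documented."""
    id: str
    parent_id: Optional[str] = None


def depth_of(doc_id: str, docs: Dict[str, Doc]) -> int:
    """Return 1 for a root document, 2 for its child, and so on."""
    depth = 1
    current = docs[doc_id]
    while current.parent_id is not None:
        current = docs[current.parent_id]
        depth += 1
    return depth


def can_add_child(parent_id: str, docs: Dict[str, Doc]) -> bool:
    """A new child sits one level below its parent and must stay within MAX_DEPTH."""
    return depth_of(parent_id, docs) + 1 <= MAX_DEPTH


docs = {
    "c_root": Doc("c_root"),
    "c_chld": Doc("c_chld", parent_id="c_root"),
    "c_gran": Doc("c_gran", parent_id="c_chld"),
}
```

Under these assumptions, a child can be attached to `c_root` or `c_chld`, but not to `c_gran`, which already sits at the third level.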

Port Configuration

  • Vault (6650): Password required - stores documents with full indexing
  • Operational (6651): No password required - read replica for search
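The split between the two ports can be sketched as plain connection settings. This is an illustration only: `ConnectionSettings`, `vault_settings`, `operational_settings`, and the `localhost` default are assumptions, and the wire protocol the server speaks is not stated in this document, so no client library is used here.

```python
from dataclasses import dataclass
from typing import Optional

VAULT_PORT = 6650        # password required
OPERATIONAL_PORT = 6651  # no password required


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical settings holder; not the server's actual client class."""
    host: str
    port: int
    password: Optional[str]


def vault_settings(password: str, host: str = "localhost") -> ConnectionSettings:
    """Settings for the vault port, which rejects connections without a password."""
    if not password:
        raise ValueError("the vault port requires a password")
    return ConnectionSettings(host, VAULT_PORT, password)


def operational_settings(host: str = "localhost") -> ConnectionSettings:
    """Settings for the read replica; the password is deliberately None."""
    return ConnectionSettings(host, OPERATIONAL_PORT, None)
```

Keeping the password out of the operational settings entirely, rather than passing one and ignoring it, makes it harder to send credentials to the unauthenticated port by mistake.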

Stable ID Format

c_XXXX (4 alphanumeric chars)
Key Format: corp:{user}:{timestamp_id}
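As a rough illustration of the two formats, the sketch below checks a stable ID and assembles a storage key. The helper names are hypothetical, the regex assumes that "alphanumeric" covers both letter cases, and `timestamp_id` is treated as an opaque string because its exact shape is not documented here.

```python
import re

# "c_" followed by exactly four alphanumeric characters.
STABLE_ID_RE = re.compile(r"c_[A-Za-z0-9]{4}")


def is_stable_id(value: str) -> bool:
    """True when the whole string matches the c_XXXX pattern."""
    return STABLE_ID_RE.fullmatch(value) is not None


def storage_key(user: str, timestamp_id: str) -> str:
    """Assemble a key in the documented corp:{user}:{timestamp_id} layout."""
    return f"corp:{user}:{timestamp_id}"
```

For example, `is_stable_id("c_a1b2")` holds, while `"c_a1b"` and `"x_a1b2"` are rejected.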

Tools (11 total)

Tool | Parameters | Description
-----|------------|------------
create | title (req), content (req), category, parent_id, cdn_urls, track_refs, tags | Create document with hierarchy support
get | id (req) | Get document by ID (supports c_XXXX stable IDs)
update | id (req), title, content, category, cdn_urls, track_refs, tags | Update document fields
delete | id (req), force (bool) | Delete document (fails if it has children unless force=true)
list | limit (default 20), category, parent_id, query | List documents with filters and Quadfecta scoring
search | query (req), limit (default 10), category | Quadfecta search (keyword + vector + graph + temporal)
tree | root_id, max_depth (default 3) | Show hierarchical tree structure
convert | source (req), format, save, title, category | Convert PDF to markdown via Docling
extract | filepath (req), pages, extract_images | Fast text/image extraction via PyMuPDF
categories | action (list/add), name, description | Manage document categories
ingest | source (req), category (req), title, chunk_by, cdn_source, extract_images, tags | Full ingestion workflow: extract → chunk → store → index
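The required parameters in the table above can be captured in a small lookup for checking a call before it is sent. This is a sketch, not the server's own validation: `REQUIRED_PARAMS` and `missing_params` are hypothetical names, and tools whose parameters are not marked "req" in the table are assumed to have none.

```python
from typing import Any, Dict, FrozenSet, List

# Required parameters per tool, transcribed from the tools table.
REQUIRED_PARAMS: Dict[str, FrozenSet[str]] = {
    "create": frozenset({"title", "content"}),
    "get": frozenset({"id"}),
    "update": frozenset({"id"}),
    "delete": frozenset({"id"}),
    "list": frozenset(),
    "search": frozenset({"query"}),
    "tree": frozenset(),
    "convert": frozenset({"source"}),
    "extract": frozenset({"filepath"}),
    "categories": frozenset(),
    "ingest": frozenset({"source", "category"}),
}


def missing_params(tool: str, arguments: Dict[str, Any]) -> List[str]:
    """Return the required parameter names absent from `arguments`, sorted."""
    if tool not in REQUIRED_PARAMS:
        raise KeyError(f"unknown tool: {tool}")
    return sorted(REQUIRED_PARAMS[tool] - set(arguments))
```

Calling `missing_params("ingest", {"source": "/tmp/report.pdf"})` would report that `category` is still needed; the file path is only a placeholder.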

Ingestion Workflow

PDF file → ingest(source, category)
    ↓
PyMuPDF extraction (text + images)
    ↓
Chunking (by page, chapter, section, or none)
    ↓
Parent document + child chunks created
    ↓
Quadfecta indexing (keyword, vector, graph, temporal)
    ↓
Ready for corpus.search/get
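The chunking and parent/child steps of the workflow above can be sketched in pure Python, taking already-extracted page texts as input. The record fields, the `chunk_by_page` name, and the choice to skip blank pages are all assumptions made for illustration; the actual ingest implementation and its stored schema are not shown in this document.

```python
from typing import Dict, List, Optional


def make_record(title: str, category: str, content: str,
                parent_ref: Optional[int]) -> Dict[str, object]:
    """Hypothetical record layout; the real fields may differ."""
    return {"title": title, "category": category,
            "content": content, "parent_ref": parent_ref}


def chunk_by_page(title: str, category: str, pages: List[str]) -> List[Dict[str, object]]:
    """Build one parent record followed by one child record per non-blank page."""
    records = [make_record(title, category, "", None)]  # index 0 is the parent
    for number, text in enumerate(pages, start=1):
        if not text.strip():
            continue  # assumption: blank pages carry nothing worth indexing
        child_title = f"{title} (page {number})"
        records.append(make_record(child_title, category, text, 0))
    return records


records = chunk_by_page("Guide", "manual", ["Intro text", "   ", "Closing text"])
```

Here `records` holds the parent plus two children, because the blank second page is skipped while the original page numbers are kept in the child titles.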

Architecture

  • Stable ID System: c_XXXX prefix (4 alphanumeric chars)
  • Hierarchy: Max 3 levels deep (parent → child → grandchild)
  • Extractors: PyMuPDF (fast), Docling (high-quality markdown)
  • Graph Integration: FalkorDB via graph_helper
  • Quadfecta Scoring: keyword + vector + graph + temporal layers
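One way to picture how four layers could contribute to a single score is a weighted average. This is only a sketch under stated assumptions: the document does not say how Quadfecta combines its layers, so the equal default weights, the normalization, and the `combined_score` name are all hypothetical.

```python
from typing import Dict, Optional

LAYERS = ("keyword", "vector", "graph", "temporal")


def combined_score(layer_scores: Dict[str, float],
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of the four layer scores; missing layers count as 0.0."""
    if weights is None:
        weights = {layer: 1.0 for layer in LAYERS}  # assumption: equal weighting
    total_weight = sum(weights[layer] for layer in LAYERS)
    weighted_sum = sum(weights[layer] * layer_scores.get(layer, 0.0)
                       for layer in LAYERS)
    return weighted_sum / total_weight
```

With equal weights, a document that matches only on the keyword layer with a score of 1.0 ends up at 0.25 overall, which shows how agreement across layers would raise a result's rank in this model.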

Categories (16 default)

book, manual, research, report, blog, conversation, contract, invoice, proposal, documentation, analysis, notes, presentation, template, archive, other

Bug Fixes Applied

Locker Password Mismatch (Fixed 2026-01-06 by Maverick)

  • Locker l_5137 had an incorrect password stored (28dTIp)
  • Fixed: Updated to the correct password (crfWls) via locker.update
  • Impact: Corpus MCP tools failed when credentials_helper was available

Operational Auth Bug (Fixed 2026-01-06 by Maverick)

  • Operational client was passing a password to port 6651 (no auth required)
  • Fixed: Removed the password from get_operational_client(), which now uses password=None
  • File modified: /opt/mcp-servers/corpus/mcp_corpus_server.py

Security Assessment

✅ Locker password now correct
✅ No command injection (no shell execution)
✅ Stable ID system prevents key enumeration
✅ Hierarchy depth limit prevents infinite recursion


Audited by Maverick (a_7yma) | Documented by Rocky (o_cq0c) | 2026-01-06

ID: 183def33
Updated: 2026-01-06T16:29:23