Environment: Corpus
Ports: 6650 (vault) / 6651 (operational)
Location: /opt/mcp-servers/corpus/mcp_corpus_server.py
Version: 2.1.0
Locker: l_252a
Prefix: corp:
Status: ✅ WORKING
Purpose
Document storage, PDF ingestion, and content extraction with Quadfecta indexing. The server takes documents, extracts text (via PyMuPDF or Docling), structures it by pages/chapters, and stores it for AI search and recall. Documents can be linked in hierarchical parent-child relationships (max depth 3).
Port Configuration
- Vault (6650): Password required - stores documents with full indexing
- Operational (6651): No password required - read replica for search
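
The split between the two ports can be sketched as a small connection-settings helper. This is only an illustration: `ConnectionSettings`, `settings_for`, and the `localhost` default are hypothetical names and values, not taken from the server code.

```python
from dataclasses import dataclass
from typing import Optional

VAULT_PORT = 6650        # password required
OPERATIONAL_PORT = 6651  # no password required


@dataclass(frozen=True)
class ConnectionSettings:
    host: str
    port: int
    password: Optional[str]


def settings_for(role: str, vault_password: Optional[str] = None,
                 host: str = "localhost") -> ConnectionSettings:
    """Return connection settings for the 'vault' or 'operational' port.

    The host default is an assumption made for this sketch.
    """
    if role == "vault":
        if not vault_password:
            raise ValueError("the vault port (6650) requires a password")
        return ConnectionSettings(host, VAULT_PORT, vault_password)
    if role == "operational":
        # The operational replica takes no password, so never forward one,
        # even if the caller happens to supply it.
        return ConnectionSettings(host, OPERATIONAL_PORT, None)
    raise ValueError(f"unknown role: {role!r}")
```

Dropping the password for the operational role inside the helper means a caller cannot accidentally send vault credentials to port 6651.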
Stable ID Format
c_XXXX (4 alphanumeric chars)
Key Format: corp:{user}:{timestamp_id}
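
The two formats above can be checked and built with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and accepting both upper- and lowercase characters is an assumption, since the exact alphabet is not specified here.

```python
import re

# "c_" followed by exactly 4 alphanumeric characters.
# Accepting both cases is an assumption of this sketch.
STABLE_ID_RE = re.compile(r"c_[A-Za-z0-9]{4}")


def is_stable_id(value: str) -> bool:
    """Return True if value looks like a corpus stable ID (c_XXXX)."""
    return STABLE_ID_RE.fullmatch(value) is not None


def make_key(user: str, timestamp_id: str) -> str:
    """Build a storage key in the corp:{user}:{timestamp_id} format."""
    return f"corp:{user}:{timestamp_id}"
```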
Tools (11 total)
| Tool | Parameters | Description |
|---|---|---|
| create | title (req), content (req), category, parent_id, cdn_urls, track_refs, tags | Create document with hierarchy support |
| get | id (required) | Get document by ID (supports c_XXXX stable IDs) |
| update | id (req), title, content, category, cdn_urls, track_refs, tags | Update document fields |
| delete | id (required), force (bool) | Delete document (fails if the document has children unless force=true) |
| list | limit (default 20), category, parent_id, query | List documents with filters and Quadfecta scoring |
| search | query (req), limit (default 10), category | Quadfecta search (keyword + vector + graph + temporal) |
| tree | root_id, max_depth (default 3) | Show hierarchical tree structure |
| convert | source (req), format, save, title, category | Convert PDF to markdown via Docling |
| extract | filepath (req), pages, extract_images | Fast text/image extraction via PyMuPDF |
| categories | action (list/add), name, description | Manage document categories |
| ingest | source (req), category (req), title, chunk_by, cdn_source, extract_images, tags | Full ingestion workflow: extract → chunk → store → index |
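
The required parameters in the table can be turned into a simple pre-flight check before a tool is called. This is a sketch only: `REQUIRED_PARAMS` and `missing_params` are hypothetical helpers that mirror the table, not part of the server.

```python
from typing import Dict, List, Tuple

# Required parameters per tool, copied from the table above.
# Tools with no parameter marked as required map to an empty tuple.
REQUIRED_PARAMS: Dict[str, Tuple[str, ...]] = {
    "create": ("title", "content"),
    "get": ("id",),
    "update": ("id",),
    "delete": ("id",),
    "list": (),
    "search": ("query",),
    "tree": (),
    "convert": ("source",),
    "extract": ("filepath",),
    "categories": (),
    "ingest": ("source", "category"),
}


def missing_params(tool: str, arguments: dict) -> List[str]:
    """Return the required parameters that are absent from arguments."""
    if tool not in REQUIRED_PARAMS:
        raise KeyError(f"unknown tool: {tool!r}")
    return [name for name in REQUIRED_PARAMS[tool]
            if arguments.get(name) is None]
```

For example, calling `missing_params("ingest", {"source": "/tmp/book.pdf"})` reports that `category` still has to be supplied.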
Ingestion Workflow
PDF file → ingest(source, category)
↓
PyMuPDF extraction (text + images)
↓
Chunking (by page, chapter, section, or none)
↓
Parent document + child chunks created
↓
Quadfecta indexing (keyword, vector, graph, temporal)
↓
Ready for corpus.search/get
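
The chunking step in the workflow above (one parent document plus child chunks) might look roughly like this for page-based chunking. It is a sketch under stated assumptions: the record fields, the `build_chunks` name, and the child title pattern are invented for illustration, and the input is a plain list of page texts standing in for the PyMuPDF extraction output.

```python
from typing import Dict, List


def build_chunks(title: str, category: str, pages: List[str],
                 chunk_by: str = "page") -> List[Dict]:
    """Turn extracted page texts into one parent record plus child records.

    Record fields are hypothetical; parent_index points at the parent's
    position in the returned list (None for the parent itself).
    """
    parent = {"title": title, "category": category,
              "parent_index": None, "content": ""}
    if chunk_by == "none":
        # No chunking: the whole text lives on the single parent record.
        parent["content"] = "\n\n".join(pages)
        return [parent]
    if chunk_by != "page":
        # Chapter and section chunking need structure detection,
        # which this sketch does not attempt.
        raise NotImplementedError(f"chunk_by={chunk_by!r} not sketched here")
    records = [parent]
    for number, text in enumerate(pages, start=1):
        records.append({"title": f"{title} (page {number})",
                        "category": category,
                        "parent_index": 0,
                        "content": text})
    return records
```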
Architecture
- Stable ID System: c_ prefix followed by 4 alphanumeric chars (c_XXXX)
- Hierarchy: Max 3 levels deep (parent → child → grandchild)
- Extractors: PyMuPDF (fast), Docling (high-quality markdown)
- Graph Integration: FalkorDB via graph_helper
- Quadfecta Scoring: keyword + vector + graph + temporal layers
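
The three-level hierarchy limit can be enforced by walking the parent links before a child is attached. This is a minimal sketch, assuming parent links are available as a simple `id -> parent_id` mapping; `depth_of` and `can_add_child` are hypothetical names, and the server may implement the check differently.

```python
from typing import Dict, Optional

MAX_DEPTH = 3  # parent -> child -> grandchild


def depth_of(doc_id: str, parents: Dict[str, Optional[str]]) -> int:
    """Return the depth of a document, counting a root document as 1."""
    depth = 1
    seen = {doc_id}
    current = parents[doc_id]
    while current is not None:
        if current in seen:
            raise ValueError("cycle detected in parent links")
        seen.add(current)
        depth += 1
        current = parents[current]
    return depth


def can_add_child(parent_id: str, parents: Dict[str, Optional[str]]) -> bool:
    """A child is allowed only if it would not exceed MAX_DEPTH."""
    return depth_of(parent_id, parents) < MAX_DEPTH
```

With a root, a child, and a grandchild already in place, `can_add_child` returns `False` for the grandchild, which is what keeps tree traversal bounded.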
Categories (16 default)
book, manual, research, report, blog, conversation, contract, invoice, proposal, documentation, analysis, notes, presentation, template, archive, other
Bug Fixes Applied
✅ Locker Password Mismatch (Fixed 2026-01-06 by Maverick)
- Locker l_5137 had incorrect password stored (28dTIp)
- Fixed: Updated to correct password (crfWls) via locker.update
- Impact: Corpus MCP tools failed when credentials_helper was available
✅ Operational Auth Bug (Fixed 2026-01-06 by Maverick)
- Operational client was passing password to port 6651 (no auth required)
- Fixed: Removed password from get_operational_client() - uses password=None
- File modified: /opt/mcp-servers/corpus/mcp_corpus_server.py
Security Assessment
✅ Locker password now correct
✅ No command injection (no shell execution)
✅ Stable ID system prevents key enumeration
✅ Hierarchy depth limit prevents infinite recursion
Audited by Maverick (a_7yma) | Documented by Rocky (o_cq0c) | 2026-01-06