Knowledge Base — Jesse Pike

The Problem

Raw information is abundant. Useful, queryable knowledge is rare. Note-taking tools collect input — they don’t curate, classify, or surface it at the right moment. For an agentic system to get smarter over time, it needs a structured place to put what it learns.

The Knowledge Base is that place. Not a document store. Not a bookmark manager. A curated repository of learnings, patterns, and research that any agent in the ecosystem can query and contribute to.

Architecture

Four layers, each with a distinct responsibility:

Capture  →  Process  →  Store  →  Access

Capture — pluggable input sources: Link Triage pipeline (automated), MCP tool (agent-initiated), bulk import CLI (markdown, Raindrop exports, notes). Capture is designed to be multi-source without any source being privileged.

Process — incoming content is chunked with tiktoken (cl100k_base encoding), embedded with OpenAI's text-embedding-ada-002 model (1536 dimensions), and auto-classified by content type (learning, idea, note, to-read, backlog). Classification is not manual: agents and the system detect content type from structure and context.
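A minimal sketch of the chunking step, assuming a fixed token window with overlap. The 512-token window, the 64-token overlap, and the names chunk_tokens and chunk_text are illustrative assumptions, not the KB's actual settings; only the cl100k_base encoding comes from the description above. tiktoken is a third-party package and is imported lazily, so the windowing logic can be read and exercised on its own.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[int], max_tokens: int = 512,
                 overlap: int = 64) -> List[List[int]]:
    """Split a token sequence into windows of at most max_tokens.

    Consecutive windows share `overlap` tokens so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_tokens]))
        # Stop once a window reaches the end; a further window would
        # contain only tokens already covered by the overlap.
        if start + max_tokens >= len(tokens):
            break
    return chunks


def chunk_text(text: str, max_tokens: int = 512, overlap: int = 64) -> List[str]:
    """Tokenize with cl100k_base, window the tokens, decode each window."""
    import tiktoken  # third-party; install separately

    enc = tiktoken.get_encoding("cl100k_base")
    windows = chunk_tokens(enc.encode(text), max_tokens, overlap)
    return [enc.decode(window) for window in windows]
```

Each string returned by chunk_text would then be sent to the embedding model as one unit.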

Store — SQLite (WAL mode) holds metadata and lifecycle state as the authoritative record. Chroma holds vector embeddings for semantic search. Same dual-store pattern as the Memory Layer — SQLite is truth, Chroma is search.
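The dual-store pattern can be sketched as follows. This is an illustration under stated assumptions, not the KB's schema: the items table, its columns, and the open_store and add_item helpers are hypothetical, and a plain dict stands in for the Chroma collection. What it shows is the ordering: the SQLite row is committed first, and the vector entry is written afterward under the same ID, so the search index can always be rebuilt from the authoritative record.

```python
import sqlite3
import uuid
from typing import Dict, List, Sequence, Tuple


def open_store(path: str) -> Tuple[sqlite3.Connection, str]:
    """Open the metadata store in WAL mode and return (connection, journal mode)."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode actually in effect ("wal" for a file DB).
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS items (
            id           TEXT PRIMARY KEY,
            content_type TEXT NOT NULL,
            status       TEXT NOT NULL DEFAULT 'active',  -- lifecycle state
            title        TEXT,
            created_at   TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()
    return conn, mode


def add_item(conn: sqlite3.Connection, vector_index: Dict[str, List[float]],
             content_type: str, title: str, embedding: Sequence[float]) -> str:
    """Write metadata to SQLite first, then index the embedding under the same ID."""
    item_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO items (id, content_type, title) VALUES (?, ?, ?)",
            (item_id, content_type, title),
        )
    # Stand-in for the vector store; written only after the row is committed.
    vector_index[item_id] = list(embedding)
    return item_id
```

If the vector write fails, the row still exists in SQLite and the embedding can be regenerated, which is the practical meaning of "SQLite is truth, Chroma is search".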

Access — 15 MCP tools across three categories:

Query (10): search_knowledge, get_item, get_items, get_backlog, get_ideas, get_learnings, get_to_read, get_recent, get_focus_topics, get_stats
Write (1): send_to_kb — unified entry point with auto-detection
Manage (4): update_item, archive_item, mark_complete, set_focus_topics

Design Decisions

One Write Endpoint

send_to_kb is the only way content enters the Knowledge Base. It auto-detects content type, topics, and source project from context. Agents don’t need to classify content before writing — the system does it. This reduces friction at the capture point, which directly increases how much actually gets captured.
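To make auto-detection concrete, here is a hedged sketch of what a rule-based first pass could look like. The prefixes and the detect_content_type name are hypothetical; the source does not describe the real detection logic, which may rely on richer structure and context. Only the five type names come from this document.

```python
import re

CONTENT_TYPES = ("learning", "idea", "note", "to-read", "backlog")


def detect_content_type(text: str) -> str:
    """Guess a content type from surface cues; fall back to 'note'.

    All rules below are illustrative assumptions, not the KB's actual rules.
    """
    stripped = text.strip()
    lowered = stripped.lower()

    # A bare URL is most likely something staged for later reading.
    if re.fullmatch(r"https?://\S+", stripped):
        return "to-read"
    # Deferred work tends to announce itself.
    if lowered.startswith(("todo", "later:", "backlog:")):
        return "backlog"
    # Hypotheses and open questions.
    if lowered.startswith(("idea:", "what if", "hypothesis:")):
        return "idea"
    # Validated knowledge.
    if lowered.startswith(("learned", "lesson:", "til:")):
        return "learning"
    return "note"
```

The point of the sketch is the calling convention: the caller hands over raw text and gets a type back, so no agent has to decide the type before writing.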

Content Types as First-Class Structure

The KB distinguishes: learnings (validated knowledge), ideas (hypotheses to explore), notes (reference material), to-read (staged for review), backlog (deferred work). Each type surfaces differently — get_learnings returns different results than get_ideas. The taxonomy matters because retrieval intent varies by type.
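One way to read "first-class structure" is that the type is a closed set and each type-specific query is a thin filter over it. The sketch below assumes an in-memory list of items; the ContentType enum, the Item dataclass, and the by_type helper are hypothetical names, and the real tools presumably query SQLite rather than a Python list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class ContentType(str, Enum):
    LEARNING = "learning"   # validated knowledge
    IDEA = "idea"           # hypotheses to explore
    NOTE = "note"           # reference material
    TO_READ = "to-read"     # staged for review
    BACKLOG = "backlog"     # deferred work


@dataclass(frozen=True)
class Item:
    title: str
    content_type: ContentType


def by_type(items: Iterable[Item], content_type: ContentType) -> List[Item]:
    """Return only the items of one content type, preserving order."""
    return [item for item in items if item.content_type is content_type]


def get_learnings(items: Iterable[Item]) -> List[Item]:
    return by_type(items, ContentType.LEARNING)


def get_ideas(items: Iterable[Item]) -> List[Item]:
    return by_type(items, ContentType.IDEA)
```

Because the enum is closed, an unknown type string fails at construction (ContentType("draft") raises ValueError) instead of silently creating a sixth category.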

Global MCP Access

The KB MCP server is configured globally (~/.claude.json), not per-project. Any agent in any project can read from and write to the Knowledge Base. Concurrent access from multiple agents is handled by SQLite WAL mode, which lets readers proceed while one writer at a time commits, and by Chroma’s built-in locking.
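A small sketch of the SQLite side of concurrent writes, with threads standing in for separate agents. The table layout, the helper names, and the 30-second busy timeout are assumptions made for illustration. WAL mode is set once on the database file and persists; each writer then opens its own connection and waits for the write lock instead of failing when another writer holds it.

```python
import sqlite3
import threading
from typing import Sequence


def init_db(path: str) -> None:
    """Create the table and switch the database file to WAL mode (persistent)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(id INTEGER PRIMARY KEY, agent TEXT, body TEXT)"
    )
    conn.commit()
    conn.close()


def agent_writes(path: str, agent: str, count: int) -> None:
    """One simulated agent: its own connection, one transaction per insert."""
    # timeout is how long a writer waits for the lock before raising.
    conn = sqlite3.connect(path, timeout=30.0)
    try:
        for i in range(count):
            with conn:  # commit per insert keeps each write lock short
                conn.execute(
                    "INSERT INTO items (agent, body) VALUES (?, ?)",
                    (agent, f"entry {i}"),
                )
    finally:
        conn.close()


def run_agents(path: str, agents: Sequence[str], count: int) -> int:
    """Run all agents concurrently and return the total row count."""
    threads = [
        threading.Thread(target=agent_writes, args=(path, agent, count))
        for agent in agents
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()
```

Writes still serialize in WAL mode; what changes is that readers are not blocked while a write is in progress, and short transactions keep the wait for other writers small.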

Source project is auto-detected from the calling agent’s working directory — no manual tagging required.
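Working-directory detection could be sketched like this. The marker files and the detect_source_project name are assumptions; the source only says that the project is derived from the calling agent's working directory. The idea is to walk upward from the working directory until a directory that looks like a project root is found, and to use that directory's name as the source project.

```python
from pathlib import Path
from typing import Sequence

# Hypothetical markers that suggest "this directory is a project root".
DEFAULT_MARKERS = (".git", "pyproject.toml", "package.json")


def detect_source_project(cwd: Path,
                          markers: Sequence[str] = DEFAULT_MARKERS) -> str:
    """Return the name of the nearest ancestor (or cwd itself) containing a marker.

    Falls back to the working directory's own name when no marker is found.
    """
    cwd = cwd.resolve()
    for candidate in (cwd, *cwd.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate.name
    return cwd.name
```

An agent running in a nested subdirectory of a project would resolve to the project root's name, so entries written from different subdirectories of one repository share a single source tag.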

The KB ↔ Memory Distinction

These two systems look similar but serve different purposes:

Knowledge Base	Memory Layer
Curated, evergreen learnings	Atomic facts, observations, preferences
Manually reviewed and promoted	Written automatically by agents
Shared across all projects (global)	Scoped by namespace/project
Searchable by topic and type	Searchable by semantic similarity

The routing rule: if it’s something you’d want to read later and curate — KB. If it’s something an agent needs to recall in a future session — Memory.

Connection to the Pipeline

Link Triage feeds the KB. Links ingested by the pipeline are classified, extracted, and routed to the KB as to-read or learning entries. The KB is the downstream destination for the entire content intake pipeline.

Krypton queries the KB when generating focus recommendations and digests. The KB’s get_focus_topics tool drives the intelligence ring’s awareness of what the system is currently learning about.

Connection to the Agentic Work System

KB is Knowledge Ring infrastructure — it serves every layer. ADF agents write architectural learnings. Work Management sessions surface patterns. Krypton synthesizes across KB, Memory, and ADF in the same query. The Knowledge Ring is what makes the system cumulative rather than amnesic.