Files
Fulfilled-Knowledge/Understand-Anything-main/docs/superpowers/specs/2026-04-09-understand-knowledge-design.md
2026-05-27 15:40:32 +08:00

13 KiB

/understand-knowledge — Personal Knowledge Base Plugin Design

Overview

A new /understand-knowledge skill within the existing Understand Anything plugin that takes any folder of markdown notes and produces an interactive knowledge graph visualized in the existing dashboard.

Inspired by Andrej Karpathy's LLM Wiki pattern — where an LLM compiles and maintains a structured wiki from raw sources — this plugin goes further by adding typed relationship discovery and interactive graph visualization that tools like Obsidian and Logseq cannot provide.

Goals

  • Accept any markdown-based knowledge base (Obsidian vault, Logseq graph, Dendron workspace, Foam, Karpathy-style LLM wiki, Zettelkasten, or plain markdown)
  • Auto-detect the format and adapt parsing accordingly
  • Use LLM analysis to discover implicit relationships beyond explicit links
  • Produce a knowledge graph with typed nodes and edges
  • Visualize in the existing dashboard with knowledge-specific layout, sidebar, and reading mode

Non-Goals

  • Real-time sync with the knowledge base tool (Obsidian, Logseq, etc.)
  • Replacing the user's existing PKM tool — this is a visualization/analysis layer on top
  • Supporting non-markdown formats (PDFs, bookmarks) in v1

Schema Extensions

New Node Types (5)

Added to the existing NodeType union (currently 16 types):

export type NodeType =
  // existing (16)
  | "file" | "function" | "class" | "module" | "concept"
  | "config" | "document" | "service" | "table" | "endpoint"
  | "pipeline" | "schema" | "resource"
  | "domain" | "flow" | "step"
  // knowledge (5 new → 21 total)
  | "article" | "entity" | "topic" | "claim" | "source";
Type What it represents Example
article A wiki/note page — the primary content unit "LLM Knowledge Bases.md"
entity A named thing: person, tool, paper, org, project "Andrej Karpathy", "Obsidian"
topic A thematic cluster grouping related articles "Personal Knowledge Management"
claim A specific assertion, insight, or takeaway "RAG loses context at chunk boundaries"
source Raw/reference material that articles are compiled from A paper URL, a raw PDF reference

New Edge Types (6)

Added to the existing EdgeType union (currently 29 types):

export type EdgeType =
  // existing (29)
  | ...
  // knowledge (6 new → 35 total)
  | "cites" | "contradicts" | "builds_on"
  | "exemplifies" | "categorized_under" | "authored_by";
Type Direction Meaning
cites article → source References or draws from
contradicts claim → claim Conflicts or disagrees with
builds_on article → article Extends, refines, or deepens
exemplifies entity → concept/topic Is a concrete example of
categorized_under article/entity → topic Belongs to this theme
authored_by article → entity Written or created by

New Metadata Interface

export interface KnowledgeMeta {
  format?: "obsidian" | "logseq" | "dendron" | "foam" | "karpathy" | "zettelkasten" | "plain";
  wikilinks?: string[];
  backlinks?: string[];
  frontmatter?: Record<string, unknown>;
  sourceUrl?: string;
  confidence?: number; // 0-1, for LLM-inferred relationships
}

Added as an optional field on GraphNode:

export interface GraphNode {
  // ...existing fields
  knowledgeMeta?: KnowledgeMeta;
}

Graph-Level Kind Flag

export interface KnowledgeGraph {
  version: string;
  kind: "codebase" | "knowledge"; // NEW
  project: ProjectMeta;
  nodes: GraphNode[];
  edges: GraphEdge[];
  layers: Layer[];
  tour: TourStep[];
}

The kind field tells the dashboard which layout, sidebar, and visual styling to use. For backward compatibility, graphs without a kind field default to "codebase".


Format Detection & Format Guides

Auto-Detection Logic

Scans the target directory for signature files/patterns. Priority order (first match wins):

Priority Signal Detected Format
1 .obsidian/ directory Obsidian
2 logseq/ + pages/ directories Logseq
3 .dendron.yml or *.schema.yml Dendron
4 .foam/ or .vscode/foam.json Foam
5 raw/ + wiki/ + index.md Karpathy
6 [[wikilinks]] + unique ID prefixes in filenames Zettelkasten
7 Fallback Plain markdown

Format Guides

Located at skills/understand-knowledge/formats/. Each guide tells the LLM agents how to parse that format:

skills/understand-knowledge/
  SKILL.md
  formats/
    obsidian.md        — [[wikilinks]], [[note|alias]], [[note#heading]],
                         #tags, YAML frontmatter, .obsidian/ config,
                         dataview annotations, canvas files
    logseq.md          — block-based outliner, ((block-refs)),
                         journals/YYYY_MM_DD.md, pages/,
                         property:: value syntax, TODO/DONE states
    dendron.md         — dot-delimited hierarchy (a.b.c.md),
                         .schema.yml for structure validation,
                         cross-vault links, refactoring rules
    foam.md            — [[wikilinks]] + link reference definitions
                         at file bottom, .foam/config, placeholder links
    karpathy.md        — raw/ → wiki/ pipeline, index.md master map,
                         log.md append-only record, _meta/ state,
                         LLM-maintained cross-references
    zettelkasten.md    — atomic notes, unique ID prefixes (timestamps),
                         typed semantic links, one idea per note
    plain.md           — standard [markdown](links), folder hierarchy,
                         heading structure, no special conventions

Each format guide covers:

  • How to parse links (wikilinks vs standard vs block refs)
  • Where metadata lives (frontmatter vs inline properties vs block properties)
  • What the folder structure means (journals/ = daily notes, pages/ = permanent notes)
  • What conventions to respect vs what to infer

Format Guide Authoring Process

Format guides must be research-backed. During implementation, the agent building each format guide must:

  1. Read the official documentation for that format (Obsidian Help, Logseq docs, Dendron wiki, Foam docs, etc.)
  2. Study real-world examples of that format's structure
  3. Write the guide based on verified behavior, not assumptions

Agent Pipeline

knowledge-scanner → format-detector → article-analyzer → relationship-builder → graph-reviewer

Agent Definitions

Agent Input Output Model
knowledge-scanner Target directory path File manifest: all .md files with paths, sizes, first 20 lines preview inherit
format-detector File manifest + directory structure Detected format + format-specific parsing hints inherit
article-analyzer Individual .md file + format guide Per-file nodes (article, entities, claims) + explicit edges (wikilinks, tags) inherit
relationship-builder All per-file results Cross-file implicit edges (builds_on, contradicts, categorized_under) + topic clustering + layers inherit
graph-reviewer Assembled graph Validated graph — deduped entities, consistent edge weights, orphan detection inherit

Key Differences from Codebase Pipeline

  • No tree-sitter — markdown parsing is simpler, mostly regex + LLM interpretation
  • format-detector replaces framework detection — picks the right format guide
  • article-analyzer replaces file-analyzer — extracts knowledge concepts instead of code structure
  • relationship-builder is the heavy LLM step — discovers implicit connections across files that explicit links miss
  • graph-reviewer stays similar — validates the assembled graph for consistency

Intermediate Files

Same pattern as codebase analysis:

.understand-anything/intermediate/
  knowledge-manifest.json      — scanner output
  format-detection.json        — detected format + hints
  article-*.json               — per-file analysis
  relationships.json           — cross-file edges
  knowledge-graph.json         — final assembled graph

Intermediate files are cleaned up after graph assembly (same as codebase flow).

Incremental Mode (--ingest)

When the user runs /understand-knowledge --ingest path/to/new-source.md:

  1. knowledge-scanner — runs on just the new file(s)
  2. format-detector — skipped (format already known from initial scan)
  3. article-analyzer — processes only new/changed files
  4. relationship-builder — runs on new nodes against the existing graph, finds connections to what's already there
  5. graph-reviewer — validates the merged result

Existing nodes are preserved; only new nodes/edges are added or updated.


Dashboard Changes

All changes are scoped to graphs with "kind": "knowledge".

Vertical Flow Layout

  • Default to top-down vertical layout (like existing domain/business flow view)
  • Topics at top → articles in middle → entities/claims/sources at bottom
  • Reads like a knowledge hierarchy: broad themes flow down into specifics
  • User can still switch to horizontal or force-directed layout via controls

Knowledge Sidebar

Replaces NodeInfo when a knowledge graph is loaded:

Selection Sidebar Shows
Nothing selected ProjectOverview: format detected, total articles/entities/topics/claims/sources
Article node Title, summary, tags, frontmatter metadata, backlinks list (clickable), outgoing links, related topics
Entity node Name, type (person/tool/paper/org), articles that mention it, relationships to other entities
Topic node Description, child articles, child entities, cross-topic connections
Claim node Assertion text, supporting articles, contradicting claims (if any), confidence score
Source node Original URL/path, articles that cite it, ingestion date

Reading Mode

  • Clicking an article node triggers a reading panel that slides up from the bottom (same pattern as current code viewer overlay)
  • Shows the full compiled markdown rendered as HTML
  • Includes a mini backlinks sidebar within the panel
  • Clicking a [[wikilink]] or entity reference in the reading panel navigates the graph to that node

Node Visual Styling

Node Type Shape Color Accent
article Rounded rectangle Warm amber
entity Circle Soft blue
topic Large rounded rectangle Muted gold
claim Diamond Green/red depending on contradictions
source Small square Gray

Edge Visual Styling

Edge Type Style
cites Dashed line
contradicts Red line
builds_on Solid with arrow
categorized_under Thin gray
authored_by Dotted blue
exemplifies Dotted green

Skill Interface

Usage

# Full scan — first time or rescan
/understand-knowledge

# Point at a specific directory
/understand-knowledge path/to/my-notes

# Incremental ingest — add new sources to existing graph
/understand-knowledge --ingest path/to/new-note.md
/understand-knowledge --ingest path/to/new-folder/

Behavior

  1. Auto-detects format (Obsidian, Logseq, Karpathy, etc.)
  2. Announces: "Detected Obsidian vault with 342 notes. Scanning..."
  3. Runs the agent pipeline (scanner → detector → analyzer → relationship-builder → reviewer)
  4. Writes knowledge-graph.json to .understand-anything/ with "kind": "knowledge"
  5. Auto-triggers /understand-dashboard after completion

File Structure

skills/understand-knowledge/
  SKILL.md                     — skill entry point, orchestration logic
  formats/
    obsidian.md
    logseq.md
    dendron.md
    foam.md
    karpathy.md
    zettelkasten.md
    plain.md

Coexistence with /understand

  • /understand produces "kind": "codebase" graphs
  • /understand-knowledge produces "kind": "knowledge" graphs
  • Both write to .understand-anything/knowledge-graph.json
  • Running one replaces the other
  • To scope knowledge analysis to a subdirectory (e.g., docs/ within a code repo), use /understand-knowledge path/to/docs

What This Enables That Nothing Else Does

Existing Tools Limitation Our Advantage
Obsidian graph view Untyped edges — all links look the same Typed edges: cites, contradicts, builds_on
Logseq graph Only shows explicit links LLM discovers implicit relationships
All PKM tools Single-format only Cross-format support with auto-detection
Karpathy LLM Wiki Flat text wiki, no visualization Interactive graph dashboard with guided tours
None No knowledge graph tours Tour mode walks through a knowledge base step by step