Add under-anything knowledge dashboard

qiaoxinjiu
2026-05-27 15:40:32 +08:00
commit e31a75d2bb
565 changed files with 143063 additions and 0 deletions


@@ -0,0 +1,560 @@
# Multi-Platform Simple Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Make Understand-Anything skills work across Codex, OpenClaw, OpenCode, and Cursor — same files everywhere, no build step.
**Architecture:** Move 5 pipeline agents into `skills/understand/` as prompt templates. Create a reusable `knowledge-graph-guide` agent. Move per-platform config directories to repo root for auto-discovery. Add Cursor and Claude plugin descriptors.
**Tech Stack:** Markdown (SKILL.md, INSTALL.md), YAML frontmatter, JSON (plugin descriptors), Bash (symlink/clone commands in install docs).
**Design Doc:** `docs/plans/2026-03-18-multi-platform-simple-design.md`
---
### Task 1: Move pipeline agents into skills/understand/ as prompt templates
**Files:**
- Move: `understand-anything-plugin/agents/project-scanner.md` → `understand-anything-plugin/skills/understand/project-scanner-prompt.md`
- Move: `understand-anything-plugin/agents/file-analyzer.md` → `understand-anything-plugin/skills/understand/file-analyzer-prompt.md`
- Move: `understand-anything-plugin/agents/architecture-analyzer.md` → `understand-anything-plugin/skills/understand/architecture-analyzer-prompt.md`
- Move: `understand-anything-plugin/agents/tour-builder.md` → `understand-anything-plugin/skills/understand/tour-builder-prompt.md`
- Move: `understand-anything-plugin/agents/graph-reviewer.md` → `understand-anything-plugin/skills/understand/graph-reviewer-prompt.md`
**Step 1: Copy each agent file to the new location**
For each of the 5 files, copy from `agents/` to `skills/understand/` with the new name.
**Step 2: Strip agent frontmatter from the prompt templates**
Each prompt template file should remove the agent-specific YAML frontmatter (`name`, `description`, `tools`, `model`). Replace it with a simple Markdown header describing the template's purpose.
For example, `project-scanner-prompt.md` changes from:
```markdown
---
name: project-scanner
description: Scans a project directory...
tools: Bash, Glob, Grep, Read, Write
model: sonnet
---
You are a meticulous project inventory specialist...
```
To:
```markdown
# Project Scanner — Prompt Template
> Used by `/understand` Phase 1. Dispatch as a subagent with this full content as the prompt.
You are a meticulous project inventory specialist...
```
Apply this pattern to all 5 files:
- `project-scanner-prompt.md` — "Used by `/understand` Phase 1"
- `file-analyzer-prompt.md` — "Used by `/understand` Phase 2"
- `architecture-analyzer-prompt.md` — "Used by `/understand` Phase 4"
- `tour-builder-prompt.md` — "Used by `/understand` Phase 5"
- `graph-reviewer-prompt.md` — "Used by `/understand` Phase 6"
Keep the rest of the file content (the body instructions) exactly as-is.
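Steps 1 and 2 can be scripted rather than done by hand. The following is a minimal Node.js sketch, not part of the plan itself: `stripFrontmatter` and `toPromptTemplate` are hypothetical helper names, and the code assumes the frontmatter is a leading block delimited by lines containing only `---`.

```javascript
// Hypothetical helpers (not part of the plugin): turn an agent definition
// into a prompt template by dropping its YAML frontmatter.

// Returns the file body without a leading `---` ... `---` frontmatter block.
// Input without frontmatter is returned unchanged.
function stripFrontmatter(source) {
  const lines = source.split('\n');
  if (lines[0].trim() !== '---') return source;
  // Closing delimiter: the first `---` line after the opening one.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === '---');
  if (end === -1) return source; // unterminated frontmatter: leave as-is
  return lines.slice(end + 1).join('\n').replace(/^\n+/, '');
}

// Prepends the Markdown heading and the usage note to the untouched body.
function toPromptTemplate(source, heading, usageNote) {
  return [heading, usageNote, stripFrontmatter(source)].join('\n');
}
```

The heading and usage note are passed in by the caller, so the same function covers all five files.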
**Step 3: Delete the original agent files**
```bash
cd understand-anything-plugin
rm agents/project-scanner.md agents/file-analyzer.md agents/architecture-analyzer.md agents/tour-builder.md agents/graph-reviewer.md
```
**Step 4: Verify the files exist in the new location**
```bash
ls understand-anything-plugin/skills/understand/
```
Expected: `SKILL.md`, plus the 5 `*-prompt.md` files.
**Step 5: Commit**
```bash
git add -A understand-anything-plugin/agents/ understand-anything-plugin/skills/understand/
git commit -m "refactor: move pipeline agents into skills/understand/ as prompt templates"
```
---
### Task 2: Update SKILL.md dispatch references with context injection
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md`
**Step 1: Read the current SKILL.md**
Read `understand-anything-plugin/skills/understand/SKILL.md` in full.
**Step 2: Update Phase 0 — add context collection**
After the decision logic table (line ~47), add a new section for collecting project context that will be injected into later phases:
````markdown
7. **Collect project context for subagent injection:**
   - Read `README.md` (or `README.rst`, `readme.md`) from `$PROJECT_ROOT` if it exists. Store as `$README_CONTENT` (first 3000 characters).
   - Read the primary package manifest (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`) if it exists. Store as `$MANIFEST_CONTENT`.
   - Capture the top-level directory tree:
     ```bash
     find "$PROJECT_ROOT" -maxdepth 2 -type f | head -100
     ```
     Store as `$DIR_TREE`.
   - Detect the project entry point by checking for common patterns: `src/index.ts`, `src/main.ts`, `src/App.tsx`, `main.py`, `main.go`, `src/main.rs`, `index.js`. Store first match as `$ENTRY_POINT`.
````
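The entry-point and README rules above are simple enough to express deterministically. A minimal sketch, assuming the caller already has a list of paths relative to `$PROJECT_ROOT`; `detectEntryPoint` and `truncateReadme` are hypothetical names, not functions from the plugin.

```javascript
// Candidate entry points in priority order, taken from the list above.
const ENTRY_POINT_CANDIDATES = [
  'src/index.ts', 'src/main.ts', 'src/App.tsx',
  'main.py', 'main.go', 'src/main.rs', 'index.js',
];

// `relativePaths` is assumed to hold paths relative to $PROJECT_ROOT.
// Returns the first candidate that exists, or null when none matches.
function detectEntryPoint(relativePaths) {
  const present = new Set(relativePaths);
  return ENTRY_POINT_CANDIDATES.find((candidate) => present.has(candidate)) || null;
}

// Keeps at most `limit` UTF-16 code units, an approximation of "characters".
function truncateReadme(text, limit = 3000) {
  return text.slice(0, limit);
}
```

Because the candidates are checked in list order, a project containing both `index.js` and `src/main.ts` resolves to `src/main.ts`.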
**Step 3: Update Phase 1 dispatch — inject README + manifest**
Replace the Phase 1 dispatch line:
```
Dispatch the **project-scanner** agent with this prompt:
```
With:
```markdown
Dispatch a subagent using the prompt template at `./project-scanner-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project README (first 3000 chars):
> ```
> $README_CONTENT
> ```
>
> Package manifest:
> ```
> $MANIFEST_CONTENT
> ```
>
> Use this context to produce more accurate project name, description, and framework detection. The README and manifest are authoritative — prefer their information over heuristics.
Pass these parameters in the dispatch prompt:
```
**Step 4: Update Phase 2 dispatch — inject scan results + framework context**
Replace the Phase 2 dispatch paragraph:
```
For each batch, dispatch a **file-analyzer** agent. Run up to **3 agents concurrently** using parallel dispatch. Each agent gets this prompt:
```
With:
```markdown
For each batch, dispatch a subagent using the prompt template at `./file-analyzer-prompt.md`. Run up to **3 subagents concurrently** using parallel dispatch. Read the template once, then for each batch pass the full template content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project: `<projectName>` — `<projectDescription>`
> Frameworks detected: `<frameworks from Phase 1>`
> Languages: `<languages from Phase 1>`
>
> Framework-specific guidance:
> - If React/Next.js: files in `app/` or `pages/` are routes, `components/` are UI, `lib/` or `utils/` are utilities
> - If Express/Fastify: files in `routes/` are API endpoints, `middleware/` is middleware, `models/` or `db/` is data
> - If Python Django: `views.py` are controllers, `models.py` is data, `urls.py` is routing, `templates/` is UI
> - If Go: `cmd/` is entry points, `internal/` is private packages, `pkg/` is public packages
>
> Use this context to produce more accurate summaries and better classify file roles.
Fill in batch-specific parameters below and dispatch:
```
**Step 5: Update Phase 4 dispatch — inject framework hints + directory tree**
Replace the Phase 4 dispatch line:
```
Dispatch the **architecture-analyzer** agent with this prompt:
```
With:
```markdown
Dispatch a subagent using the prompt template at `./architecture-analyzer-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Frameworks detected: `<frameworks from Phase 1>`
>
> Directory tree (top 2 levels):
> ```
> $DIR_TREE
> ```
>
> Framework-specific layer hints:
> - If React/Next.js: `app/` or `pages/` → UI Layer, `api/` → API Layer, `lib/` → Service Layer, `components/` → UI Layer
> - If Express: `routes/` → API Layer, `controllers/` → Service Layer, `models/` → Data Layer, `middleware/` → Middleware Layer
> - If Python Django: `views/` → API Layer, `models/` → Data Layer, `templates/` → UI Layer, `management/` → CLI Layer
> - If Go: `cmd/` → Entry Points, `internal/` → Service Layer, `pkg/` → Shared Library, `api/` → API Layer
>
> Use the directory tree and framework hints to inform layer assignments. Directory structure is strong evidence for layer boundaries.
Pass these parameters in the dispatch prompt:
```
Also add after the "For incremental updates" note:
```markdown
**Context for incremental updates:** When re-running architecture analysis, also inject the previous layer definitions:
> Previous layer definitions (for naming consistency):
> ```json
> [previous layers from existing graph]
> ```
>
> Maintain the same layer names and IDs where possible. Only add/remove layers if the file structure has materially changed.
```
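The framework layer hints read naturally as a lookup table. The sketch below is illustrative only: the framework keys (`react`, `express`, `django`, `go`) and the `suggestLayer` helper are hypothetical, and letting the deepest matching directory win is an assumption made here, not something the plan specifies.

```javascript
// Directory name -> layer, per framework, copied from the hints above.
const LAYER_HINTS = {
  react: { app: 'UI Layer', pages: 'UI Layer', api: 'API Layer', lib: 'Service Layer', components: 'UI Layer' },
  express: { routes: 'API Layer', controllers: 'Service Layer', models: 'Data Layer', middleware: 'Middleware Layer' },
  django: { views: 'API Layer', models: 'Data Layer', templates: 'UI Layer', management: 'CLI Layer' },
  go: { cmd: 'Entry Points', internal: 'Service Layer', pkg: 'Shared Library', api: 'API Layer' },
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns a suggested layer name for `filePath`, or null when no hint applies.
// Assumption: the deepest directory with a hint wins (e.g. app/api -> API Layer).
function suggestLayer(framework, filePath) {
  if (!has(LAYER_HINTS, framework)) return null;
  const hints = LAYER_HINTS[framework];
  const dirs = filePath.split('/').slice(0, -1); // drop the file name
  for (let i = dirs.length - 1; i >= 0; i--) {
    if (has(hints, dirs[i])) return hints[dirs[i]];
  }
  return null;
}
```

A hint is only evidence; the subagent still makes the final layer assignment.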
**Step 6: Update Phase 5 dispatch — inject README + entry point**
Replace the Phase 5 dispatch line:
```
Dispatch the **tour-builder** agent with this prompt:
```
With:
```markdown
Dispatch a subagent using the prompt template at `./tour-builder-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project README (first 3000 chars):
> ```
> $README_CONTENT
> ```
>
> Project entry point: `$ENTRY_POINT`
>
> Use the README to align the tour narrative with the project's own documentation. Start the tour from the entry point if one was detected. The tour should tell the same story the README tells, but through the lens of actual code structure.
Pass these parameters in the dispatch prompt:
```
**Step 7: Update Phase 6 dispatch — inject scan results for cross-validation**
Replace the Phase 6 dispatch line:
```
2. Dispatch the **graph-reviewer** agent with this prompt:
```
With:
```markdown
2. Dispatch a subagent using the prompt template at `./graph-reviewer-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Phase 1 scan results (file inventory):
> ```json
> [list of {path, sizeLines} from scan-result.json]
> ```
>
> Phase warnings/errors accumulated during analysis:
> - [list any batch failures, skipped files, or warnings from Phases 2-5]
>
> Cross-validate: every file in the scan inventory should have a corresponding `file:` node in the graph. Flag any missing files. Also flag any graph nodes whose `filePath` doesn't appear in the scan inventory.
Pass these parameters in the dispatch prompt:
```
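The cross-validation rule is a two-way set difference. A minimal sketch, assuming file node IDs follow the `file:<relative-path>` convention and that the scan inventory is a list of `{path, sizeLines}` objects; `crossValidate` is a hypothetical helper name.

```javascript
// Compares the Phase 1 inventory with the graph.
// missingFiles: scanned paths that have no `file:` node.
// unknownNodes: node IDs whose filePath is absent from the inventory.
function crossValidate(inventory, nodes) {
  const scanned = new Set(inventory.map((file) => file.path));
  const fileNodePaths = new Set(
    nodes
      .filter((n) => n.type === 'file')
      .map((n) => (n.id.startsWith('file:') ? n.id.slice('file:'.length) : n.id))
  );
  const missingFiles = [...scanned].filter((path) => !fileNodePaths.has(path));
  const unknownNodes = nodes
    .filter((n) => n.filePath && !scanned.has(n.filePath))
    .map((n) => n.id);
  return { missingFiles, unknownNodes };
}
```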
**Step 8: Update Error Handling section**
Change:
```
- If any agent dispatch fails, retry **once** with the same prompt plus additional context about the failure.
```
To:
```
- If any subagent dispatch fails, retry **once** with the same prompt plus additional context about the failure.
- Track all warnings and errors from each phase in a `$PHASE_WARNINGS` list. Pass this list to the graph-reviewer in Phase 6 for comprehensive validation.
```
**Step 9: Verify no references to named agent dispatch remain**
Search for "Dispatch the **" in the file — should find 0 results.
**Step 10: Commit**
```bash
git add understand-anything-plugin/skills/understand/SKILL.md
git commit -m "refactor: update SKILL.md to dispatch subagents with context injection"
```
---
### Task 3: Create knowledge-graph-guide agent
**Files:**
- Create: `understand-anything-plugin/agents/knowledge-graph-guide.md`
**Step 1: Write the agent definition**
Create `understand-anything-plugin/agents/knowledge-graph-guide.md`:
````markdown
---
name: knowledge-graph-guide
description: |
  Use this agent when users need help understanding, querying, or working
  with an Understand-Anything knowledge graph. Guides users through graph
  structure, node/edge relationships, layer architecture, tours, and
  dashboard usage.
model: inherit
---
You are an expert on Understand-Anything knowledge graphs. You help users navigate, query, and understand the `knowledge-graph.json` files produced by the `/understand` skill.
## What You Know
### Graph Location
The knowledge graph lives at `<project-root>/.understand-anything/knowledge-graph.json`. Metadata is at `<project-root>/.understand-anything/meta.json`.
### Graph Structure
The JSON has this top-level shape:
```json
{
  "version": "1.0.0",
  "project": { "name", "languages", "frameworks", "description", "analyzedAt", "gitCommitHash" },
  "nodes": [...],
  "edges": [...],
  "layers": [...],
  "tour": [...]
}
```
### Node Types (5)
| Type | ID Convention | Description |
|---|---|---|
| `file` | `file:<relative-path>` | Source file |
| `function` | `func:<relative-path>:<name>` | Function or method |
| `class` | `class:<relative-path>:<name>` | Class, interface, or type |
| `module` | `module:<name>` | Logical module or package |
| `concept` | `concept:<name>` | Abstract concept or pattern |
### Edge Types (18)
| Category | Types |
|---|---|
| Structural | `imports`, `exports`, `contains`, `inherits`, `implements` |
| Behavioral | `calls`, `subscribes`, `publishes`, `middleware` |
| Data flow | `reads_from`, `writes_to`, `transforms`, `validates` |
| Dependencies | `depends_on`, `tested_by`, `configures` |
| Semantic | `related`, `similar_to` |
### Layers
Layers represent architectural groupings (e.g., API, Service, Data, UI). Each layer has an `id`, `name`, `description`, and `nodeIds` array.
### Tours
Tours are guided walkthroughs with sequential steps. Each step has a `title`, `description`, `nodeId` (focus node), and optional `highlightEdges`.
## How to Help Users
1. **Finding things**: Help users locate nodes by file path, function name, or concept. Use `jq` or grep on the JSON.
2. **Understanding relationships**: Trace edges between nodes to explain dependencies, call chains, and data flow.
3. **Architecture overview**: Summarize layers and their contents.
4. **Onboarding**: Walk through the tour steps to explain the codebase.
5. **Dashboard**: Guide users to run `/understand-dashboard` to visualize the graph interactively.
6. **Querying**: Help users write `jq` commands to extract specific information from the graph JSON.
````
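The agent definition suggests `jq` for queries; the same lookups can be done in Node.js once the graph JSON has been parsed. A minimal sketch, assuming edges carry `source`, `target`, and `type` fields and layers carry `name` and `nodeIds`; `findByType`, `outgoing`, and `layerOf` are hypothetical helper names.

```javascript
// All nodes of one type, e.g. 'file' or 'function'.
function findByType(graph, type) {
  return graph.nodes.filter((n) => n.type === type);
}

// IDs of nodes reachable over one outgoing edge, optionally filtered by edge type.
function outgoing(graph, nodeId, edgeType) {
  return graph.edges
    .filter((e) => e.source === nodeId && (!edgeType || e.type === edgeType))
    .map((e) => e.target);
}

// Name of the first layer that lists the node, or null if it is unassigned.
function layerOf(graph, nodeId) {
  const layer = (graph.layers || []).find((l) => (l.nodeIds || []).includes(nodeId));
  return layer ? layer.name : null;
}
```

For example, `outgoing(graph, 'file:src/a.ts', 'imports')` lists what a file imports, which is the starting point for tracing a dependency chain.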
**Step 2: Commit**
```bash
git add understand-anything-plugin/agents/knowledge-graph-guide.md
git commit -m "feat: add knowledge-graph-guide agent for graph navigation and querying"
```
---
### Task 4: Move platform INSTALL.md files to repo root
**Files:**
- Move: `understand-anything-plugin/.codex/INSTALL.md` → `.codex/INSTALL.md`
- Move: `understand-anything-plugin/.opencode/INSTALL.md` → `.opencode/INSTALL.md`
- Move: `understand-anything-plugin/.openclaw/INSTALL.md` → `.openclaw/INSTALL.md`
- Delete: `understand-anything-plugin/.cursor/INSTALL.md` (replaced by `.cursor-plugin/plugin.json`)
**Step 1: Move the three platform directories to root**
```bash
cd "$(git rev-parse --show-toplevel)"  # repository root; run from anywhere inside the clone
git mv understand-anything-plugin/.codex ./.codex
git mv understand-anything-plugin/.opencode ./.opencode
git mv understand-anything-plugin/.openclaw ./.openclaw
```
**Step 2: Delete .cursor/ (replaced by .cursor-plugin/ in Task 5)**
```bash
git rm -r understand-anything-plugin/.cursor/
```
**Step 3: Verify symlink paths are correct**
Read each INSTALL.md. The symlink paths should reference `understand-anything-plugin/skills` — this is still correct since the skills directory remains inside the plugin wrapper.
**Step 4: Commit**
```bash
git add -A
git commit -m "refactor: move platform config directories to repo root for discovery"
```
---
### Task 5: Add plugin descriptors
**Files:**
- Create: `.cursor-plugin/plugin.json`
- Create: `.claude-plugin/plugin.json`
**Step 1: Create `.cursor-plugin/plugin.json`**
```json
{
"name": "understand-anything",
"displayName": "Understand Anything",
"description": "AI-powered codebase understanding — analyze, visualize, and explain any project",
"version": "1.0.5",
"author": { "name": "Lum1104" },
"homepage": "https://github.com/Lum1104/Understand-Anything",
"repository": "https://github.com/Lum1104/Understand-Anything",
"license": "MIT",
"keywords": ["codebase-analysis", "knowledge-graph", "architecture", "onboarding", "dashboard"],
"skills": "./understand-anything-plugin/skills/",
"agents": "./understand-anything-plugin/agents/"
}
```
Note: paths point into `understand-anything-plugin/` since the source stays nested.
**Step 2: Create `.claude-plugin/plugin.json`**
```json
{
"name": "understand-anything",
"description": "AI-powered codebase understanding — analyze, visualize, and explain any project",
"version": "1.0.5",
"author": { "name": "Lum1104" },
"homepage": "https://github.com/Lum1104/Understand-Anything",
"repository": "https://github.com/Lum1104/Understand-Anything",
"license": "MIT",
"keywords": ["codebase-analysis", "knowledge-graph", "architecture", "onboarding", "dashboard"]
}
```
**Step 3: Commit**
```bash
git add .cursor-plugin/ .claude-plugin/plugin.json
git commit -m "feat: add Cursor and Claude plugin descriptors for auto-discovery"
```
---
### Task 6: Update README with corrected multi-platform URLs
**Files:**
- Modify: `README.md`
**Step 1: Read current README**
Read `README.md` in full.
**Step 2: Update raw GitHub URLs for INSTALL.md files**
The INSTALL.md files moved from `understand-anything-plugin/.codex/INSTALL.md` to `.codex/INSTALL.md`. Update all raw GitHub URLs:
```
OLD: .../refs/heads/main/understand-anything-plugin/.codex/INSTALL.md
NEW: .../refs/heads/main/.codex/INSTALL.md
OLD: .../refs/heads/main/understand-anything-plugin/.openclaw/INSTALL.md
NEW: .../refs/heads/main/.openclaw/INSTALL.md
OLD: .../refs/heads/main/understand-anything-plugin/.opencode/INSTALL.md
NEW: .../refs/heads/main/.opencode/INSTALL.md
```
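If the README contains many such links, the replacement can be applied mechanically. A minimal sketch, assuming the only affected paths are the three `INSTALL.md` files listed above; `rewriteInstallUrls` is a hypothetical helper name.

```javascript
// Platform directories that moved from the plugin wrapper to the repo root.
const MOVED_PLATFORM_DIRS = ['.codex', '.openclaw', '.opencode'];

// Rewrites every old nested INSTALL.md path in `text` to its new root path.
// Other paths under understand-anything-plugin/ (e.g. skills/) are untouched.
function rewriteInstallUrls(text) {
  return MOVED_PLATFORM_DIRS.reduce(
    (out, dir) =>
      out
        .split(`understand-anything-plugin/${dir}/INSTALL.md`)
        .join(`${dir}/INSTALL.md`),
    text
  );
}
```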
**Step 3: Replace Cursor section**
Replace the Cursor AI-driven install section with:
```markdown
### Cursor
Cursor auto-discovers the plugin via `.cursor-plugin/plugin.json` when this repo is cloned. No manual installation needed — just clone and open in Cursor.
```
**Step 4: Commit**
```bash
git add README.md
git commit -m "docs: update multi-platform URLs after moving configs to root"
```
---
### Task 7: Verify everything works
**Step 1: Check platform configs at root**
```bash
ls .codex/INSTALL.md .opencode/INSTALL.md .openclaw/INSTALL.md
ls .cursor-plugin/plugin.json .claude-plugin/plugin.json
```
All should exist.
**Step 2: Verify plugin source is intact**
```bash
ls understand-anything-plugin/skills/understand/
ls understand-anything-plugin/agents/
ls understand-anything-plugin/packages/
```
Skills, agents, and packages should all still exist inside the wrapper.
**Step 3: Verify no platform configs remain inside the wrapper**
```bash
ls understand-anything-plugin/.codex/ 2>/dev/null # should fail
ls understand-anything-plugin/.cursor/ 2>/dev/null # should fail
ls understand-anything-plugin/.opencode/ 2>/dev/null # should fail
ls understand-anything-plugin/.openclaw/ 2>/dev/null # should fail
```
**Step 4: Run tests**
```bash
pnpm --filter @understand-anything/core build && pnpm --filter @understand-anything/core test
```
All tests should pass — only config files moved, not source code.
**Step 5: Verify marketplace.json is unchanged**
```bash
cat .claude-plugin/marketplace.json | grep source
```
Expected: `"source": "./understand-anything-plugin"` — unchanged, still correct.
**Step 6: Verify no stale raw GitHub URLs**
```bash
grep -r "understand-anything-plugin/\." README.md
```
Expected: 0 results (no URLs pointing to old nested platform config locations).


@@ -0,0 +1,149 @@
# Design: Dashboard Robustness — Permissive Graph Loading
## Problem
When the LLM agent produces a knowledge-graph.json that deviates from the strict Zod schema, the dashboard shows a blank screen with cryptic Zod error paths. Users don't know whether it's a system bug or an agent generation issue, and their only recourse is a full re-run of `/understand`.
## Goals
1. **Maximize what the user can see** — load valid nodes/edges even if some are broken
2. **Clearly communicate generation issues** — amber warnings (not red errors) with copy-paste-friendly messages
3. **Empower targeted fixes** — users can copy the issue report and ask their agent to fix specific problems instead of a full re-run
## Design
### Four-Tier Robustness Pipeline
```
Raw JSON → Sanitize (Tier 1) → Normalize + Auto-fix (Tier 2) → Validate per-item (Tier 3) → Fatal check (Tier 4) → Dashboard
```
### Tier 1: Sanitize Silently
Common LLM quirks that are pure noise — fix without reporting.
| Issue | Fix |
|-------|-----|
| `null` on optional fields (`filePath`, `lineRange`, `description`, `languageNotes`) | Convert to `undefined` |
| Mixed-case enum strings (`"Forward"`, `"SIMPLE"`) | Lowercase before matching |
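As a rough illustration of Tier 1, the sketch below sanitizes a single node or edge. It is not the implementation in `schema.ts`: `sanitizeItem` is a hypothetical name, and treating `complexity` and `direction` as the enum fields is an assumption inferred from the example values `"SIMPLE"` and `"Forward"`.

```javascript
// Optional fields where `null` should behave like "absent".
const OPTIONAL_FIELDS = ['filePath', 'lineRange', 'description', 'languageNotes'];
// Assumed enum-valued fields (inferred, not confirmed by the design).
const ENUM_FIELDS = ['complexity', 'direction'];

// Returns a sanitized copy; the input is not mutated and nothing is reported.
function sanitizeItem(item) {
  const out = { ...item };
  for (const field of OPTIONAL_FIELDS) {
    // Deleting the key makes later reads yield `undefined`.
    if (out[field] === null) delete out[field];
  }
  for (const field of ENUM_FIELDS) {
    if (typeof out[field] === 'string') out[field] = out[field].toLowerCase();
  }
  return out;
}
```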
### Tier 2: Auto-fix With Info Notice
Recoverable issues — apply sensible defaults, track as `auto-corrected` issues.
| Issue | Default | Notes |
|-------|---------|-------|
| Missing `complexity` | `"moderate"` | Most common LLM omission |
| Missing `tags` | `[]` | Empty is valid |
| Missing `weight` | `0.5` | Middle of 0–1 range |
| `weight` as string | Coerce to number | e.g., `"0.8"` → `0.8` |
| Missing `direction` | `"forward"` | Safe default |
| Missing `summary` | Use node `name` | Better than empty |
| `tour: null` / `layers: null` | `[]` | Null vs empty array |
| Complexity aliases | `low/easy→simple`, `medium/intermediate→moderate`, `high/hard→complex` | |
| Direction aliases | `to/outbound→forward`, `from/inbound→backward`, `both→bidirectional` | |
| Existing node/edge type aliases | Already handled by `normalizeGraph` | No change needed |
| Missing node `type` | `"file"` | Safe fallback |
| Missing edge `type` | `"depends_on"` | Generic fallback |
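A sketch of how the edge rows of this table might be applied while recording `auto-corrected` issues. It is illustrative only: `autoFixEdge` is a hypothetical name, the `'alias'` category is invented for this example, and strings that cannot be coerced are deliberately left alone for Tier 3 to handle.

```javascript
const DIRECTION_ALIASES = new Map([
  ['to', 'forward'], ['outbound', 'forward'],
  ['from', 'backward'], ['inbound', 'backward'],
  ['both', 'bidirectional'],
]);

// Returns a fixed copy of `edge` and appends one issue per correction to `issues`.
function autoFixEdge(edge, index, issues) {
  const fixed = { ...edge };
  const record = (field, category, message) =>
    issues.push({ level: 'auto-corrected', category, message, path: `edges[${index}].${field}` });

  if (fixed.weight === undefined) {
    fixed.weight = 0.5;
    record('weight', 'missing-field', `edges[${index}]: missing "weight", defaulted to 0.5`);
  } else if (typeof fixed.weight === 'string' && fixed.weight.trim() !== '' &&
             Number.isFinite(Number(fixed.weight))) {
    const original = fixed.weight;
    fixed.weight = Number(original);
    record('weight', 'type-coercion', `edges[${index}]: weight was string "${original}", coerced to number`);
  }

  if (fixed.direction === undefined) {
    fixed.direction = 'forward';
    record('direction', 'missing-field', `edges[${index}]: missing "direction", defaulted to "forward"`);
  } else if (DIRECTION_ALIASES.has(fixed.direction)) {
    const alias = fixed.direction;
    fixed.direction = DIRECTION_ALIASES.get(alias);
    record('direction', 'alias', `edges[${index}]: direction "${alias}" mapped to "${fixed.direction}"`);
  }

  if (fixed.type === undefined) {
    fixed.type = 'depends_on';
    record('type', 'missing-field', `edges[${index}]: missing "type", defaulted to "depends_on"`);
  }
  return fixed;
}
```

Node defaults (`complexity`, `tags`, `summary`, `type`) would follow the same pattern with a `nodes[i]` path.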
### Tier 3: Drop With Warning
Can't safely guess — remove the item, track as `dropped` issue.
| Issue | Action |
|-------|--------|
| Edge references non-existent node ID | Drop edge |
| Node missing `id` | Drop node |
| Node missing `name` | Drop node |
| Edge missing `source` or `target` | Drop edge |
| Unrecognizable `type` value (not in canonical or alias list) | Drop item |
| `weight` not coercible to number | Drop edge |
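The drop rules can be sketched as two filters: nodes first, then edges against the surviving node IDs. This is an illustration under stated assumptions, not the `schema.ts` code: `dropInvalid` is a hypothetical name, it assumes Tier 2 has already run (so any remaining non-numeric `weight` is uncoercible), and it omits the unrecognizable-`type` rule because the alias lists are not shown in this document.

```javascript
// Returns the surviving nodes and edges plus one `dropped` issue per removed item.
function dropInvalid(graph) {
  const issues = [];
  const drop = (category, message, path) =>
    issues.push({ level: 'dropped', category, message, path });

  const nodes = graph.nodes.filter((n, i) => {
    for (const field of ['id', 'name']) {
      if (!n[field]) {
        drop('missing-field', `nodes[${i}]: missing required "${field}" field`, `nodes[${i}].${field}`);
        return false;
      }
    }
    return true;
  });

  // Edges are checked against nodes that survived, so edges to dropped nodes go too.
  const ids = new Set(nodes.map((n) => n.id));
  const edges = graph.edges.filter((e, i) => {
    for (const end of ['source', 'target']) {
      if (!e[end]) {
        drop('missing-field', `edges[${i}]: missing "${end}"`, `edges[${i}].${end}`);
        return false;
      }
      if (!ids.has(e[end])) {
        drop('invalid-reference', `edges[${i}]: ${end} "${e[end]}" does not exist in nodes`, `edges[${i}].${end}`);
        return false;
      }
    }
    if (e.weight !== undefined && !Number.isFinite(e.weight)) {
      drop('type-coercion', `edges[${i}]: weight is not a number`, `edges[${i}].weight`);
      return false;
    }
    return true;
  });

  return { nodes, edges, issues };
}
```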
### Tier 4: Fatal
Graph is unsalvageable — show red error banner.
| Condition | Message |
|-----------|---------|
| 0 valid nodes after filtering | "No valid nodes found in knowledge graph" |
| Missing `project` metadata entirely | "Missing project metadata" |
| Input is not an object / not valid JSON | "Invalid input format" |
### Return Type
```typescript
interface GraphIssue {
level: 'auto-corrected' | 'dropped' | 'fatal';
category: string; // e.g., "missing-field", "invalid-reference", "type-coercion"
message: string; // human-readable, copy-paste friendly
path?: string; // e.g., "nodes[3].complexity"
}
interface ValidationResult {
success: boolean;
data?: KnowledgeGraph;
issues: GraphIssue[];
fatal?: string;
}
```
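To show how the Tier 4 conditions map onto `ValidationResult`, here is a minimal sketch in plain JavaScript. `checkFatal` and `toResult` are hypothetical names, the `'fatal'` category string is invented for the example, and rejecting arrays as "not an object" is an assumption.

```javascript
// Returns the fatal message for the first matching Tier 4 condition, or null.
function checkFatal(input, validNodes) {
  if (input === null || typeof input !== 'object' || Array.isArray(input)) {
    return 'Invalid input format';
  }
  if (input.project === undefined || input.project === null) {
    return 'Missing project metadata';
  }
  if (validNodes.length === 0) {
    return 'No valid nodes found in knowledge graph';
  }
  return null;
}

// Builds a ValidationResult-shaped object from the outputs of the earlier tiers.
function toResult(input, validNodes, validEdges, issues) {
  const fatal = checkFatal(input, validNodes);
  if (fatal) {
    const fatalIssue = { level: 'fatal', category: 'fatal', message: fatal };
    return { success: false, issues: [...issues, fatalIssue], fatal };
  }
  return {
    success: true,
    data: { ...input, nodes: validNodes, edges: validEdges },
    issues,
  };
}
```

On success `fatal` is simply absent, which matches the optional `fatal?: string` field.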
### Dashboard UI: WarningBanner Component
**New component** in `packages/dashboard/src/components/WarningBanner.tsx`.
**Visual design:**
- **Amber/gold theme** — `bg-amber-900/20`, `border-amber-700`, `text-amber-200`
- Matches dashboard's gold accent aesthetic; signals "generation quality issue" not "system crash"
- **Collapsed by default** — summary line: "Knowledge graph loaded with 5 auto-corrections and 2 dropped items"
- **Expandable** — click to reveal categorized issue list
- **Copy button** — one-click copies the full issue report as a pre-formatted message
- **Actionable footer** — tells users to copy issues and ask their agent to fix them
**Copy-paste output format:**
```
The following issues were found in your knowledge-graph.json.
These are LLM generation errors — not a system bug.
You can ask your agent to fix these specific issues in the knowledge-graph.json file:
[Auto-corrected] nodes[3] ("AuthService"): missing "complexity" — defaulted to "moderate"
[Auto-corrected] nodes[7] ("utils.ts"): missing "tags" — defaulted to []
[Auto-corrected] edges[12]: weight was string "0.8" — coerced to number
[Dropped] edges[5]: target "file:src/nonexistent.ts" does not exist in nodes
[Dropped] nodes[14]: missing required "id" field — cannot recover
```
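The report body can be generated directly from the `issues` array. A minimal sketch, assuming each `message` already names its location (as in the sample above) and that every `level` is one of the three defined values; `formatIssueReport` is a hypothetical name and the preamble text is supplied by the caller.

```javascript
const LEVEL_LABELS = { 'auto-corrected': 'Auto-corrected', dropped: 'Dropped', fatal: 'Fatal' };
const LEVEL_ORDER = ['auto-corrected', 'dropped', 'fatal'];

// Joins the preamble and one "[Label] message" line per issue,
// grouped by level in the order used by the sample report.
function formatIssueReport(preamble, issues) {
  const sorted = [...issues].sort(
    (a, b) => LEVEL_ORDER.indexOf(a.level) - LEVEL_ORDER.indexOf(b.level)
  );
  const lines = sorted.map((issue) => `[${LEVEL_LABELS[issue.level]}] ${issue.message}`);
  return [preamble, ...lines].join('\n');
}
```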
**Fatal errors** stay red (`bg-red-900/30`) with message: "Knowledge graph is unsalvageable: [reason]. Please re-run `/understand` to generate a new one."
**Existing red error banner** for network/JSON-parse errors stays as-is (those ARE system/infra issues).
### App.tsx Changes
- On `result.success === true` with `result.issues.length > 0`: show `WarningBanner` with issues, load graph normally
- On `result.fatal`: show existing red banner with fatal message
- `console.warn` for auto-corrected items, `console.error` for dropped items
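The banner decision reduces to a small pure function, which keeps it testable outside React. The sketch below is an assumption about how `App.tsx` could be wired, not its actual code; `summarize` and `chooseBanner` are hypothetical names.

```javascript
const plural = (count, word) => `${count} ${word}${count === 1 ? '' : 's'}`;

// Collapsed summary line for the amber banner.
function summarize(issues) {
  const corrected = issues.filter((i) => i.level === 'auto-corrected').length;
  const dropped = issues.filter((i) => i.level === 'dropped').length;
  return `Knowledge graph loaded with ${plural(corrected, 'auto-correction')} and ${plural(dropped, 'dropped item')}`;
}

// Returns which banner to show, or null when the graph loaded cleanly.
function chooseBanner(result) {
  if (result.fatal) {
    return {
      kind: 'fatal',
      text: 'Knowledge graph is unsalvageable: ' + result.fatal +
        '. Please re-run `/understand` to generate a new one.',
    };
  }
  if (result.success && result.issues.length > 0) {
    return { kind: 'warning', text: summarize(result.issues) };
  }
  return null;
}
```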
### Test Coverage
All in `packages/core/src/__tests__/schema.test.ts`:
- **Tier 1:** `null` optional fields silently become `undefined`
- **Tier 2:** Missing `complexity`/`tags`/`weight`/`direction`/`summary` get defaults; issues tracked
- **Tier 2:** String `weight` coerced; complexity/direction aliases mapped
- **Tier 3:** Dangling edge references dropped; nodes missing `id` dropped; issues recorded
- **Tier 4:** Empty graph after filtering → fatal; missing `project` → fatal
- **Integration:** Graph with mixed good/bad nodes → loads with correct node count + correct issues list
### Files Changed
| File | Change |
|------|--------|
| `packages/core/src/schema.ts` | Sanitize, expanded normalize, permissive validate, new types |
| `packages/dashboard/src/components/WarningBanner.tsx` | New component |
| `packages/dashboard/src/App.tsx` | Wire issues to WarningBanner |
| `packages/core/src/__tests__/schema.test.ts` | Tests for all tiers |
### Files NOT Changed
- Agent prompts (can be tightened later as a separate effort)
- GraphView / store logic (they already handle valid `KnowledgeGraph` objects)
- Existing node/edge type alias maps (preserved, extended around)


@@ -0,0 +1,971 @@
# Token Reduction Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Reduce `/understand` token cost by ~85% on large codebases through import pre-resolution, batch consolidation, addendum removal, payload slimming, and gating the LLM reviewer.
**Architecture:** Five changes (C5 → C4 → C3 → C1+C2) applied in rollout order — lowest risk first. All changes are to prompt/skill markdown files in `understand-anything-plugin/skills/understand/`. No TypeScript source changes required.
**Tech Stack:** Markdown skill files, Node.js inline scripts embedded in SKILL.md, knowledge-graph JSON pipeline.
**Design doc:** `docs/plans/2026-03-27-token-reduction-design.md`
---
## Task 1: C5 — Gate graph-reviewer behind `--review` flag
Replaces the always-on LLM graph-reviewer subagent with a deterministic inline validation script. The LLM reviewer only runs when `--review` is in `$ARGUMENTS`. Saves ~58,500 tokens per default run.
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md` (Phase 6, lines 330–362)
### Step 1: Open SKILL.md and locate Phase 6
Read the file and find "## Phase 6 — REVIEW" (line 297). Identify steps 3–6 (lines 330–362), which currently always dispatch the LLM graph-reviewer subagent.
### Step 2: Replace Phase 6 steps 3–6 with conditional reviewer logic
Replace lines 330–362 (from "3. Dispatch a subagent using the prompt template" through "6. **If `approved: true`:** Proceed to Phase 7.") with:
````markdown
3. **Check `$ARGUMENTS` for `--review` flag.** Then run the appropriate validation path:
---
#### Default path (no `--review`): inline deterministic validation
Write the following Node.js script to `$PROJECT_ROOT/.understand-anything/tmp/ua-inline-validate.js`:
```javascript
#!/usr/bin/env node
const fs = require('fs');
const graphPath = process.argv[2];
const outputPath = process.argv[3];
try {
  const graph = JSON.parse(fs.readFileSync(graphPath, 'utf8'));
  const issues = [], warnings = [];
  const nodeIds = new Set();
  const seen = new Map();
  // Nodes: required fields and duplicate IDs.
  graph.nodes.forEach((n, i) => {
    if (!n.id) { issues.push(`Node[${i}] missing id`); return; }
    if (!n.type) issues.push(`Node[${i}] '${n.id}' missing type`);
    if (!n.name) issues.push(`Node[${i}] '${n.id}' missing name`);
    if (!n.summary) issues.push(`Node[${i}] '${n.id}' missing summary`);
    if (!n.tags || !n.tags.length) issues.push(`Node[${i}] '${n.id}' missing tags`);
    if (seen.has(n.id)) issues.push(`Duplicate node ID '${n.id}' at indices ${seen.get(n.id)} and ${i}`);
    else seen.set(n.id, i);
    nodeIds.add(n.id);
  });
  // Edges: both endpoints must exist.
  graph.edges.forEach((e, i) => {
    if (!nodeIds.has(e.source)) issues.push(`Edge[${i}] source '${e.source}' not found`);
    if (!nodeIds.has(e.target)) issues.push(`Edge[${i}] target '${e.target}' not found`);
  });
  // Layers: valid references, each node in at most one layer, every file node assigned.
  const fileNodes = graph.nodes.filter(n => n.type === 'file').map(n => n.id);
  const assigned = new Map();
  (graph.layers || []).forEach(layer => {
    (layer.nodeIds || []).forEach(id => {
      if (!nodeIds.has(id)) issues.push(`Layer '${layer.id}' refs missing node '${id}'`);
      if (assigned.has(id)) issues.push(`Node '${id}' appears in multiple layers`);
      assigned.set(id, layer.id);
    });
  });
  fileNodes.forEach(id => {
    if (!assigned.has(id)) issues.push(`File node '${id}' not in any layer`);
  });
  // Tour: steps may only reference existing nodes.
  (graph.tour || []).forEach((step, i) => {
    (step.nodeIds || []).forEach(id => {
      if (!nodeIds.has(id)) issues.push(`Tour step[${i}] refs missing node '${id}'`);
    });
  });
  // Orphans are reported as warnings, not issues.
  const withEdges = new Set([
    ...graph.edges.map(e => e.source),
    ...graph.edges.map(e => e.target)
  ]);
  graph.nodes.forEach(n => {
    if (!withEdges.has(n.id)) warnings.push(`Node '${n.id}' has no edges (orphan)`);
  });
  const stats = {
    totalNodes: graph.nodes.length,
    totalEdges: graph.edges.length,
    totalLayers: (graph.layers || []).length,
    tourSteps: (graph.tour || []).length,
    nodeTypes: graph.nodes.reduce((a, n) => { a[n.type] = (a[n.type]||0)+1; return a; }, {}),
    edgeTypes: graph.edges.reduce((a, e) => { a[e.type] = (a[e.type]||0)+1; return a; }, {})
  };
  fs.writeFileSync(outputPath, JSON.stringify({ issues, warnings, stats }, null, 2));
  process.exit(0);
} catch (err) { process.stderr.write(err.message + '\n'); process.exit(1); }
```
Execute it:
```bash
node "$PROJECT_ROOT/.understand-anything/tmp/ua-inline-validate.js" \
  "$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json" \
  "$PROJECT_ROOT/.understand-anything/intermediate/review.json"
```
If the script exits non-zero, read stderr, fix the script, and retry once.
---
#### `--review` path: full LLM reviewer
If `--review` IS in `$ARGUMENTS`, dispatch the LLM graph-reviewer subagent as follows:
Dispatch a subagent using the prompt template at `./graph-reviewer-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Phase 1 scan results (file inventory):
> ```json
> [list of {path, sizeLines} from scan-result.json]
> ```
>
> Phase warnings/errors accumulated during analysis:
> - [list any batch failures, skipped files, or warnings from Phases 2-5]
>
> Cross-validate: every file in the scan inventory should have a corresponding `file:` node in the graph. Flag any missing files. Also flag any graph nodes whose `filePath` doesn't appear in the scan inventory.
Pass these parameters in the dispatch prompt:
> Validate the knowledge graph at `$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json`.
> Project root: `$PROJECT_ROOT`
> Read the file and validate it for completeness and correctness.
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/review.json`
---
4. Read `$PROJECT_ROOT/.understand-anything/intermediate/review.json`.
5. **If `issues` array is non-empty:**
   - Review the `issues` list
   - Apply automated fixes where possible:
     - Remove edges with dangling references
     - Fill missing required fields with sensible defaults (e.g., empty `tags` -> `["untagged"]`, empty `summary` -> `"No summary available"`)
     - Remove nodes with invalid types
   - Re-run the final graph validation after automated fixes
   - If critical issues remain after one fix attempt, save the graph anyway but include the warnings in the final report and mark dashboard auto-launch as skipped
6. **If `issues` array is empty:** Proceed to Phase 7.
````
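The automated fixes listed in step 5 are deterministic, so they can also be sketched as a pure function. This is an illustration, not part of SKILL.md: `applyAutomatedFixes` is a hypothetical name, the list of valid node types is supplied by the caller, and the sketch does not clean up layer or tour references to removed nodes.

```javascript
// Returns a fixed copy of the graph; the input is not mutated.
function applyAutomatedFixes(graph, validTypes) {
  const nodes = graph.nodes
    // Remove nodes with invalid types.
    .filter((n) => validTypes.includes(n.type))
    // Fill missing required fields with the defaults named in step 5.
    .map((n) => ({
      ...n,
      tags: n.tags && n.tags.length ? n.tags : ['untagged'],
      summary: n.summary ? n.summary : 'No summary available',
    }));

  // Remove edges with dangling references (including edges to removed nodes).
  const ids = new Set(nodes.map((n) => n.id));
  const edges = graph.edges.filter((e) => ids.has(e.source) && ids.has(e.target));

  return { ...graph, nodes, edges };
}
```

After applying the fixes, the validation script is run again, as step 5 requires.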
### Step 3: Verify the edit
Re-read SKILL.md lines 297-380 and confirm:
- Phase 6 step 3 now checks for `--review` flag
- The inline validation script is present and complete
- The `--review` path still dispatches the LLM subagent identically to before
- Steps 4-6 handle the `review.json` output the same way as before
### Step 4: Commit
```bash
git add understand-anything-plugin/skills/understand/SKILL.md
git commit -m "perf(understand): gate LLM graph-reviewer behind --review flag, add inline deterministic validation"
```
---
## Task 2: C4a — Slim Phase 4 (architecture) node payload
Removes `name` from the file node format injected into the architecture-analyzer subagent, and states explicitly that `complexity` and `languageNotes` must be omitted as well. These fields are not needed for architectural layer assignment and add unnecessary tokens.
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md` (Phase 4, around lines 188-196)
### Step 1: Locate the Phase 4 dispatch prompt in SKILL.md
Find the block starting "Pass these parameters in the dispatch prompt:" under Phase 4 (around line 181). Look for:
```
> File nodes:
> ```json
> [list of {id, name, filePath, summary, tags} for all file-type nodes]
> ```
```
### Step 2: Update the file node format
Change the file nodes line from:
```
> [list of {id, name, filePath, summary, tags} for all file-type nodes]
```
To:
```
> [list of {id, filePath, summary, tags} for all file-type nodes — omit name, complexity, languageNotes]
```
### Step 3: Verify
Re-read Phase 4 and confirm the node format line is updated. The import edges line below it (`[list of edges with type "imports"]`) must remain unchanged.
### Step 4: Commit
```bash
git add understand-anything-plugin/skills/understand/SKILL.md
git commit -m "perf(understand): slim Phase 4 architecture payload — drop redundant node fields"
```
---
## Task 3: C4b — Slim Phase 5 (tour builder) payload
Phase 5 currently injects all nodes (including function/class), all edge types, and full layer objects (with nodeIds arrays). Only file nodes, import+calls edges, and slim layers are needed for tour design. This is the largest single payload change, saving ~105,000 tokens on a 500-file project.
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md` (Phase 5, lines 257-270)
- Modify: `understand-anything-plugin/skills/understand/tour-builder-prompt.md` (input schema)
### Step 1: Locate the Phase 5 dispatch prompt in SKILL.md
Find the block starting with the following (around line 257):
```
> Nodes (summarized):
> ```json
> [list of {id, name, filePath, summary, type} for key nodes]
> ```
>
> Layers:
> ```json
> [layers from Phase 4]
> ```
>
> Key edges:
> ```json
> [imports and calls edges]
> ```
```
### Step 2: Replace all three payload sections
Replace those lines with:
```markdown
> Nodes (file nodes only):
> ```json
> [list of {id, name, filePath, summary, type} for file-type nodes ONLY — do NOT include function or class nodes]
> ```
>
> Layers:
> ```json
> [list of {id, name, description} for each layer — omit nodeIds]
> ```
>
> Edges (imports and calls only):
> ```json
> [list of edges where type is "imports" or "calls" only — exclude all other edge types]
> ```
```
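As a rough sketch of what this slimmer payload amounts to, the Python below projects an assembled graph down to the three sections described above. The function name `build_tour_payload` is hypothetical and the graph shape (`nodes`, `edges`, `layers` keys) is assumed from the examples in this plan; in practice the orchestrating session assembles the payload from the prompt instructions rather than from code like this.

```python
NODE_FIELDS = ("id", "name", "filePath", "summary", "type")
LAYER_FIELDS = ("id", "name", "description")
TOUR_EDGE_TYPES = {"imports", "calls"}


def build_tour_payload(graph):
    """Project a full graph down to the slim Phase 5 tour-builder input."""
    # File-type nodes only; function and class nodes are excluded.
    nodes = [
        {k: n[k] for k in NODE_FIELDS if k in n}
        for n in graph.get("nodes", [])
        if n.get("type") == "file"
    ]
    # Layers without their nodeIds arrays.
    layers = [
        {k: layer[k] for k in LAYER_FIELDS if k in layer}
        for layer in graph.get("layers", [])
    ]
    # Only "imports" and "calls" edges; every other edge type is dropped.
    edges = [e for e in graph.get("edges", []) if e.get("type") in TOUR_EDGE_TYPES]
    return {"nodes": nodes, "layers": layers, "edges": edges}
```

Note that the plan filters edges by type only, so a `calls` edge may still reference a function node that is no longer in the payload; whether to drop such edges is left open here.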
### Step 3: Update tour-builder-prompt.md input schema
Open `tour-builder-prompt.md` and find the "Script Requirements" section (around lines 18-35). The input schema currently shows:
```json
{
"nodes": [...],
"edges": [...],
"layers": [
{"id": "layer:core", "name": "Core", "nodeIds": ["file:src/index.ts"]}
]
}
```
Update the layers example to reflect the slim format:
```json
{
"nodes": [
{"id": "file:src/index.ts", "type": "file", "name": "index.ts", "filePath": "src/index.ts", "summary": "..."}
],
"edges": [
{"source": "file:src/index.ts", "target": "file:src/utils.ts", "type": "imports"}
],
"layers": [
{"id": "layer:core", "name": "Core", "description": "Core application logic"}
]
}
```
Also update the "G. Node Summary Index" description (around line 84) to reflect that input nodes are file-type only:
Find:
```
**G. Node Summary Index**
Create a lookup of each node ID to its `summary`, `type`, `tags` (default to empty array `[]` if not present in input), and `name` for easy reference.
```
Add a note after it:
```
Note: input nodes are file-type only. The nodeSummaryIndex will contain only file nodes.
```
### Step 4: Verify
- Re-read the SKILL.md Phase 5 payload block and confirm: file-only nodes, slim layers (no nodeIds), imports+calls edges only
- Re-read the tour-builder-prompt.md input schema and confirm layers no longer have nodeIds
### Step 5: Commit
```bash
git add understand-anything-plugin/skills/understand/SKILL.md \
understand-anything-plugin/skills/understand/tour-builder-prompt.md
git commit -m "perf(understand): slim Phase 5 tour payload — file nodes only, imports+calls edges, slim layers"
```
---
## Task 4: C3 — Remove language/framework addendums from file-analyzer batches
The addendums (`languages/typescript.md`, `frameworks/react.md`, etc.) are currently injected into every file-analyzer batch prompt. They cost ~1,300 tokens × N batches. The model already knows these languages. Replace with a compact inline reference table (~150 tokens, paid once, embedded in the base template).
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md` (Phase 2, lines 104-117)
- Modify: `understand-anything-plugin/skills/understand/file-analyzer-prompt.md` (add quick reference section)
### Step 1: Update the "Build the combined prompt template" block in SKILL.md Phase 2
Find the block at lines 104-117:
```
**Build the combined prompt template:**
1. Read the base template at `./file-analyzer-prompt.md`.
2. **Language context injection:** ...
3. **Framework addendum injection:** ...
Then for each batch pass the combined template content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project: `<projectName>` — `<projectDescription>`
> Frameworks detected: `<frameworks from Phase 1>`
> Languages: `<languages from Phase 1>`
>
> Use the language context and framework addendums (appended above) to produce more accurate summaries and better classify file roles.
```
Replace it with:
```markdown
**Build the prompt for each batch:**
1. Read the base template at `./file-analyzer-prompt.md`. (Language and framework hints are embedded in the template — do NOT append addendum files for Phase 2 batches. Addendums are reserved for Phase 4.)
Then for each batch pass the template content as the subagent's prompt, appending the following additional context:
> **Additional context from main session:**
>
> Project: `<projectName>` — `<projectDescription>`
> Languages: `<languages from Phase 1>`
```
This removes steps 2 and 3 (the addendum injection loops) entirely from Phase 2.
### Step 2: Add Language and Framework Quick Reference to file-analyzer-prompt.md
Open `file-analyzer-prompt.md`. Find the "## Critical Constraints" section near the bottom (around line 299). Insert the following new section **before** "## Critical Constraints":
```markdown
## Language and Framework Quick Reference
Use these hints to improve tag and edge accuracy for common patterns. Your training knowledge covers these — this is a fast lookup for the most impactful signals.
**Tag signals:**
| Signal | Tags to apply |
|---|---|
| File in `hooks/`, exports a function starting with `use` | `hook`, `service` |
| File in `contexts/` or `context/`, exports a Provider component | `service`, `state` |
| File in `pages/` or `views/` | `ui`, `routing` |
| File in `store/`, `slices/`, `reducers/`, `state/` | `state` |
| File in `services/`, `api/`, `client/` | `service` |
| `__init__.py` at a package root with re-exports | `entry-point`, `barrel` |
| `manage.py` at the project root | `entry-point` |
| `mod.rs` in a directory | `barrel` |
| `main.go` in a `cmd/` subdirectory | `entry-point` |
**Edge signals:**
| Pattern | Edge to create |
|---|---|
| React component renders another component in its JSX | `contains` from parent to child |
| Component/hook calls a custom hook (`useX`) | `depends_on` from consumer to hook file |
| Context provider wraps components | `publishes` from provider to context definition |
| Component calls `useContext` or custom context hook | `subscribes` from consumer to context definition |
| Python file uses `from x import y` where x is a project file | `imports` edge (same rule as JS/TS) |
| Go file `import`s an internal package path | `imports` edge to the resolved file |
```
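To make the path-based rows of the tag table concrete, here is a small Python sketch. It is an illustration under stated assumptions, not part of the prompt template: `path_tags` and `DIRECTORY_TAGS` are hypothetical names, and only the signals that can be decided from the path alone are covered. Rows that depend on file contents (hooks exporting `use*`, context Providers, `__init__.py` re-exports) are deliberately left out.

```python
from pathlib import PurePosixPath

# Directory name -> tags, taken from the path-only rows of the table above.
DIRECTORY_TAGS = {
    "pages": ["ui", "routing"],
    "views": ["ui", "routing"],
    "store": ["state"],
    "slices": ["state"],
    "reducers": ["state"],
    "state": ["state"],
    "services": ["service"],
    "api": ["service"],
    "client": ["service"],
}


def path_tags(path):
    """Return tags implied by a project-relative path, in first-seen order."""
    p = PurePosixPath(path)
    dirs = p.parts[:-1]
    tags = []

    def add(tag):
        if tag not in tags:
            tags.append(tag)

    for directory in dirs:
        for tag in DIRECTORY_TAGS.get(directory, []):
            add(tag)
    if p.name == "mod.rs":
        add("barrel")
    if p.name == "manage.py" and not dirs:  # project root only
        add("entry-point")
    if p.name == "main.go" and "cmd" in dirs:
        add("entry-point")
    return tags
```

A lookup like this only yields hints; the file-analyzer subagent still decides the final tags from the file's actual contents.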
### Step 3: Verify
- Re-read SKILL.md Phase 2 "Build the prompt" block: steps 2 and 3 (addendum loops) are gone; "Frameworks detected" line in additional context is gone
- Re-read file-analyzer-prompt.md: new "Language and Framework Quick Reference" section appears before Critical Constraints; no reference to addendum files
- Confirm Phase 4 "Build the combined prompt template" (lines 163-167) is **unchanged** — addendums still apply there
### Step 4: Commit
```bash
git add understand-anything-plugin/skills/understand/SKILL.md \
understand-anything-plugin/skills/understand/file-analyzer-prompt.md
git commit -m "perf(understand): remove addendum injection from Phase 2 batches, add compact inline hints to file-analyzer"
```
---
## Task 5: C1a — Extend scanner to pre-resolve imports
Adds a new Step 8 to the project scanner script: parse import statements from every source file and resolve relative imports against the discovered file list. The resolved map is written into `scan-result.json` as `importMap`. This is the data that lets us eliminate `allProjectFiles` from every batch in Task 7.
**Files:**
- Modify: `understand-anything-plugin/skills/understand/project-scanner-prompt.md`
### Step 1: Add Step 8 to the scanner script requirements
Open `project-scanner-prompt.md`. Find "**Step 7 -- Project Name**" (around line 100). After its content (the priority list), add a new step:
````markdown
**Step 8 -- Import Resolution**
For each file in the discovered source list, extract and resolve relative import statements. The goal is to produce a map from each file's path to the list of project-internal files it imports. External package imports are ignored.
For each file, read its content and extract import paths using language-appropriate patterns:
| Language | Import patterns to match |
|---|---|
| TypeScript/JavaScript | `import ... from './...'` or `'../...'`, `require('./...')` or `require('../...')` |
| Python | `from .x import y`, `from ..x import y`, `from . import x` (relative only) |
| Go | Paths in `import (...)` blocks that start with the module path from `go.mod` |
| Rust | `use crate::`, `use super::`, `mod x` (within the same crate) |
| Java/Kotlin | Not resolvable by path — skip import resolution for these languages |
| Ruby | `require_relative '...'` paths |
For each extracted import path:
1. Compute the resolved file path relative to project root:
   - For relative imports (`./x`, `../x`): resolve from the importing file's directory
   - Try these extension variants in order if the import has no extension: `.ts`, `.tsx`, `.js`, `.jsx`, `/index.ts`, `/index.js`, `/index.tsx`, `/index.jsx`, `.py`, `.go`, `.rs`, `.rb`
2. Check if the resolved path exists in the discovered file list
3. If yes: add to this file's resolved imports list
4. If no: skip (external, unresolvable, or dynamic import)
Output format in the script result:
```json
"importMap": {
  "src/index.ts": ["src/utils.ts", "src/config.ts"],
  "src/utils.ts": [],
  "src/components/App.tsx": ["src/hooks/useAuth.ts", "src/store/index.ts"]
}
```
Keys are project-relative paths. Values are arrays of resolved project-relative paths. Every path in the file list must appear as a key in `importMap` (use an empty array `[]` if no imports were resolved). External packages and unresolvable imports are omitted entirely.
````
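The resolution steps in Step 8 can be sketched as follows for the relative-path case (`./x`, `../x`). This is a minimal illustration, not the scanner's real script: `resolve_relative_import` and `build_import_map` are hypothetical names, extraction of import specifiers from source text is assumed to have happened already, and module-path resolution for Go, Rust, and Python is not covered. As an assumption beyond the plan, the sketch checks for an exact match before trying the extension variants.

```python
import posixpath

# Extension variants from Step 8, tried in this order.
EXTENSION_VARIANTS = (
    ".ts", ".tsx", ".js", ".jsx",
    "/index.ts", "/index.js", "/index.tsx", "/index.jsx",
    ".py", ".go", ".rs", ".rb",
)


def resolve_relative_import(importer, spec, known_files):
    """Resolve `spec` as written in `importer` to a project-relative path, or None."""
    if not spec.startswith(("./", "../")):
        return None  # external package or non-relative import: ignored
    base = posixpath.normpath(posixpath.join(posixpath.dirname(importer), spec))
    if base == ".." or base.startswith("../"):
        return None  # points outside the project root
    if base in known_files:
        return base  # import already names an existing file
    for suffix in EXTENSION_VARIANTS:
        candidate = base + suffix
        if candidate in known_files:
            return candidate
    return None  # unresolvable: skipped


def build_import_map(specs_by_file, known_files):
    """Build importMap; every known file gets a key, with [] when nothing resolves."""
    import_map = {}
    for path in sorted(known_files):
        resolved = []
        for spec in specs_by_file.get(path, []):
            target = resolve_relative_import(path, spec, known_files)
            if target is not None and target not in resolved:
                resolved.append(target)
        import_map[path] = resolved
    return import_map
```

Passing `known_files` as a set keeps each membership check constant-time, which matters because every import is tested against up to twelve candidate paths.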
### Step 2: Update the scanner script output format
Find the "### Script Output Format" section (around line 109) and update the example JSON to include `importMap`:
Find this in the example:
```json
{
"scriptCompleted": true,
"name": "project-name",
...
"estimatedComplexity": "moderate"
}
```
Add `importMap` to the example:
```json
{
"scriptCompleted": true,
"name": "project-name",
"rawDescription": "...",
"readmeHead": "...",
"languages": ["javascript", "typescript"],
"frameworks": ["React", "Vite"],
"files": [
{"path": "src/index.ts", "language": "typescript", "sizeLines": 150}
],
"totalFiles": 42,
"estimatedComplexity": "moderate",
"importMap": {
"src/index.ts": ["src/utils.ts", "src/config.ts"],
"src/utils.ts": []
}
}
```
Also update the field documentation list below the example to add:
```
- `importMap` (object) — map from every source file path to its list of resolved project-internal import paths; empty array if no resolved imports; external packages excluded
```
### Step 3: Update the final assembly section to preserve importMap
Find "## Phase 2 -- Description and Final Assembly" (around line 153). Find the IMPORTANT note:
```
**IMPORTANT:** The final output must NOT contain the `scriptCompleted`, `rawDescription`, or `readmeHead` fields.
```
Update it to:
```
**IMPORTANT:** The final output must NOT contain the `scriptCompleted`, `rawDescription`, or `readmeHead` fields. All other fields — including `importMap` — MUST be preserved exactly as output by the script.
```
Also update the final output example to include `importMap`:
```json
{
"name": "project-name",
"description": "...",
"languages": ["typescript"],
"frameworks": ["React"],
"files": [...],
"totalFiles": 42,
"estimatedComplexity": "moderate",
"importMap": {
"src/index.ts": ["src/utils.ts"]
}
}
```
### Step 4: Verify
Re-read `project-scanner-prompt.md` and confirm:
- Step 8 is present with full import resolution logic
- Script output format includes `importMap`
- Field documentation includes `importMap`
- Final assembly section preserves `importMap` in output
### Step 5: Commit
```bash
git add understand-anything-plugin/skills/understand/project-scanner-prompt.md
git commit -m "perf(understand): extend scanner to pre-resolve imports, output importMap in scan-result.json"
```
---
## Task 6: C1b — Update file-analyzer to use batchImportData
Removes `allProjectFiles` from the file-analyzer input schema and replaces it with `batchImportData` (pre-resolved imports for this batch's files only). Updates the extraction script section to skip import resolution entirely (already done by scanner). Updates the edge creation step to use `batchImportData` directly.
**Files:**
- Modify: `understand-anything-plugin/skills/understand/file-analyzer-prompt.md`
### Step 1: Update the input JSON schema (Script Requirements, step 1)
Find the input schema block around line 19:
```json
{
"projectRoot": "/path/to/project",
"allProjectFiles": ["src/index.ts", "src/utils.ts", "..."],
"batchFiles": [
{"path": "src/index.ts", "language": "typescript", "sizeLines": 150},
{"path": "src/utils.ts", "language": "typescript", "sizeLines": 80}
]
}
```
Replace with:
```json
{
"projectRoot": "/path/to/project",
"batchFiles": [
{"path": "src/index.ts", "language": "typescript", "sizeLines": 150},
{"path": "src/utils.ts", "language": "typescript", "sizeLines": 80}
],
"batchImportData": {
"src/index.ts": ["src/utils.ts", "src/config.ts"],
"src/utils.ts": []
}
}
```
Update the field descriptions:
- Remove: `allProjectFiles` description
- Add: `batchImportData` (object) — map from each batch file's project-relative path to its list of pre-resolved project-internal imports. Produced by the project scanner. Use this directly for import edge creation — do NOT attempt to re-resolve imports yourself.
### Step 2: Remove the imports extraction from "What the Script Must Extract"
Find the "**Imports:**" subsection under "What the Script Must Extract" (around lines 49-53):
```
**Imports:**
- Source module path (exactly as written in the import statement)
- Imported specifiers (named imports, default import, namespace import)
- Line number
- For relative imports (starting with `./` or `../`), compute the resolved path...
```
Replace this entire subsection with:
```markdown
**Imports:**
- Do NOT extract imports in the script. Import resolution has already been performed by the project scanner.
- The pre-resolved imports for each file are provided in `batchImportData` in the input JSON.
- Do not include an `imports` field in the script output — import edges will be created in Phase 2 using `batchImportData` directly.
```
### Step 3: Update the script output format to remove imports
Find the `results` array in the script output format (around line 67). Each result currently includes an `imports` array:
```json
"imports": [
{"source": "./utils", "resolvedPath": "src/utils.ts", "specifiers": ["formatDate"], "line": 1, "isExternal": false},
{"source": "express", "resolvedPath": null, "specifiers": ["default"], "line": 2, "isExternal": true}
],
```
Remove the `imports` array from the script output format entirely. The result for each file should be:
```json
{
"path": "src/index.ts",
"language": "typescript",
"totalLines": 150,
"nonEmptyLines": 120,
"functions": [...],
"classes": [...],
"exports": [...],
"metrics": {
"importCount": 5,
"exportCount": 3,
"functionCount": 4,
"classCount": 1
}
}
```
Keep `metrics.importCount` (derived from `batchImportData[path].length`) as a useful metric.
Update the metrics description to say:
```
- `importCount` (integer) — use `batchImportData[file.path].length` from the input JSON
```
### Step 4: Update "Preparing the Script Input" section
Find the `cat` command around line 113 that creates the input JSON:
```bash
cat > $PROJECT_ROOT/.understand-anything/tmp/ua-file-analyzer-input-<batchIndex>.json << 'ENDJSON'
{
"projectRoot": "<project-root>",
"allProjectFiles": [<full file list from scan>],
"batchFiles": [<this batch's files>]
}
ENDJSON
```
Replace with:
```bash
cat > $PROJECT_ROOT/.understand-anything/tmp/ua-file-analyzer-input-<batchIndex>.json << 'ENDJSON'
{
"projectRoot": "<project-root>",
"batchFiles": [<this batch's files>],
"batchImportData": <batchImportData JSON object — provided in your dispatch prompt>
}
ENDJSON
```
### Step 5: Update Step 3 (Create Edges) — Import edge creation rule
Find the "**Import edge creation rule:**" in the "Step 3 -- Create Edges" section (around line 213):
```
**Import edge creation rule:** For each import in the script output where `isExternal` is `false` and `resolvedPath` is non-null, create an `imports` edge from the current file node to `file:<resolvedPath>`. Do NOT create edges for external package imports.
```
Replace with:
```markdown
**Import edge creation rule:** For each resolved path in `batchImportData[filePath]` (provided in the input JSON), create an `imports` edge from the current file node to `file:<resolvedPath>`. The `batchImportData` values contain only resolved project-internal paths — external packages have already been filtered out. Do NOT attempt to re-resolve imports from source.
```
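The new rule reduces import edge creation to a direct mapping over `batchImportData`. A minimal Python sketch, assuming the `source`/`target`/`type` edge shape shown in this plan's examples and using the hypothetical name `import_edges_for_batch`:

```python
def import_edges_for_batch(batch_import_data):
    """Create one `imports` edge per pre-resolved path; no re-resolution is done."""
    edges = []
    for file_path, resolved_paths in batch_import_data.items():
        for resolved_path in resolved_paths:
            edges.append({
                "source": f"file:{file_path}",
                "target": f"file:{resolved_path}",
                "type": "imports",
            })
    return edges
```

Because the scanner has already filtered out external packages, no `isExternal` check is needed at this stage.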
### Step 6: Update the import-resolution bullet in Critical Constraints
Find the last bullet in "## Critical Constraints" (around line 304):
```
- For import edges, use the script's `resolvedPath` field directly. Do NOT attempt to resolve import paths yourself -- the script already did this deterministically.
```
Replace with:
```markdown
- For import edges, use `batchImportData[filePath]` directly from the input JSON. Do NOT attempt to resolve import paths yourself -- the project scanner already did this deterministically.
```
### Step 7: Verify
Re-read `file-analyzer-prompt.md` and confirm:
- Input schema has `batchImportData`, no `allProjectFiles`
- Script "What to Extract" section: imports extraction replaced with "do not extract"
- Script output format: no `imports` array per file
- Preparing the Script Input: cat command has no `allProjectFiles`
- Import edge creation rule: uses `batchImportData` not script output
- Critical Constraints: no reference to `resolvedPath` from script
### Step 8: Commit
```bash
git add understand-anything-plugin/skills/understand/file-analyzer-prompt.md
git commit -m "perf(understand): replace allProjectFiles with batchImportData in file-analyzer — import resolution now done by scanner"
```
---
## Task 7: C1c + C2 — Update SKILL.md Phase 2 orchestration
Wires up the `importMap` from Phase 1 into per-batch `batchImportData` slices. Increases batch size from 5-10 to 20-30 files. Increases concurrency from 3 to 5. Removes `allProjectFiles` from the dispatch prompt.
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md` (Phase 0, Phase 1, Phase 2)
### Step 1: Update Phase 1 to note importMap is now in scan-result.json
Find Phase 1 (around line 62) where it says:
```
After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/scan-result.json` to get:
- Project name, description
- Languages, frameworks
- File list with line counts
- Complexity estimate
```
Add one item to the list:
```
- Import map (`importMap`): pre-resolved project-internal imports per file
```
Also add a note:
```
Store `importMap` in memory as `$IMPORT_MAP` for use in Phase 2 batch construction.
```
### Step 2: Change batch size and concurrency in Phase 2
Find line 100:
```
Batch the file list from Phase 1 into groups of **5-10 files each** (aim for balanced batch sizes).
```
Replace with:
```
Batch the file list from Phase 1 into groups of **20-30 files each** (aim for ~25 files per batch for balanced sizes).
```
Find line 102:
```
For each batch, dispatch a subagent using the prompt template at `./file-analyzer-prompt.md`. Run up to **3 subagents concurrently** using parallel dispatch.
```
Replace with:
```
For each batch, dispatch a subagent using the prompt template at `./file-analyzer-prompt.md`. Run up to **5 subagents concurrently** using parallel dispatch.
```
### Step 3: Add batchImportData construction to the dispatch block
Find the dispatch prompt block (around lines 119-134):
```
Fill in batch-specific parameters below and dispatch:
> Analyze these source files and produce GraphNode and GraphEdge objects.
> Project root: `$PROJECT_ROOT`
> Project: `<projectName>`
> Languages: `<languages>`
> Batch index: `<batchIndex>`
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/batch-<batchIndex>.json`
>
> All project files (for import resolution):
> `<full file path list from scan>`
>
> Files to analyze in this batch:
> 1. `<path>` (<sizeLines> lines)
> ...
```
Replace with:
````markdown
Before dispatching each batch, construct `batchImportData` from `$IMPORT_MAP` (pseudocode):
```text
batchImportData = {}
for each file in this batch:
    batchImportData[file.path] = $IMPORT_MAP[file.path] ?? []
```
Fill in batch-specific parameters below and dispatch:
> Analyze these source files and produce GraphNode and GraphEdge objects.
> Project root: `$PROJECT_ROOT`
> Project: `<projectName>`
> Languages: `<languages>`
> Batch index: `<batchIndex>`
> Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/batch-<batchIndex>.json`
>
> Pre-resolved import data for this batch (use this for all import edge creation — do NOT re-resolve imports from source):
> ```json
> <batchImportData JSON>
> ```
>
> Files to analyze in this batch:
> 1. `<path>` (<sizeLines> lines)
> 2. `<path>` (<sizeLines> lines)
> ...
````
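As a hedged illustration of the batching and slicing described in this step, the Python below splits a file list into balanced batches and builds each batch's `batchImportData`. The names `make_batches` and `slice_import_map` are hypothetical, and the balancing strategy (fewest batches that respect a 30-file cap, sizes evened out) is one possible reading of "20-30 files each"; very small projects will still produce batches under 20 files.

```python
import math


def make_batches(files, max_size=30):
    """Split `files` into the fewest batches of at most `max_size`, evenly sized."""
    if not files:
        return []
    count = math.ceil(len(files) / max_size)
    base, extra = divmod(len(files), count)
    batches, start = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)  # first `extra` batches get one more
        batches.append(files[start:start + size])
        start += size
    return batches


def slice_import_map(batch, import_map):
    """Build batchImportData: one key per batch file, [] when the map has no entry."""
    return {f["path"]: import_map.get(f["path"], []) for f in batch}
```

For a 100-file project this yields 4 batches of 25 files, each dispatched with only its own slice of the import map instead of the full project file list.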
### Step 4: Update incremental update path
Find "### Incremental update path" (around line 140):
```
Use the changed files list from Phase 0. Batch and dispatch file-analyzer subagents using the same process as above, but only for changed files.
```
Update to clarify that batchImportData still applies:
```
Use the changed files list from Phase 0. Batch and dispatch file-analyzer subagents using the same process as above (20-30 files per batch, up to 5 concurrent, with batchImportData constructed from $IMPORT_MAP), but only for changed files.
```
### Step 5: Verify all Phase 2 changes
Re-read SKILL.md Phase 2 in full and confirm:
- Batch size says "20-30 files"
- Concurrency says "5 subagents concurrently"
- "Build the prompt" block: only step 1 (read base template), no addendum steps
- Additional context block: no "Frameworks detected" line, no addendum reference
- Dispatch prompt: has `batchImportData` injection, no `allProjectFiles`
- Incremental path: mentions batchImportData
### Step 6: Commit
```bash
git add understand-anything-plugin/skills/understand/SKILL.md
git commit -m "perf(understand): wire importMap into batchImportData per batch, increase batch size 5-10→20-30, concurrency 3→5"
```
---
## Task 8: Version bump
Per project convention, all four version files must stay in sync when changes are pushed.
**Files:**
- Modify: `understand-anything-plugin/package.json`
- Modify: `.claude-plugin/marketplace.json`
- Modify: `.claude-plugin/plugin.json`
- Modify: `.cursor-plugin/plugin.json`
### Step 1: Read current version
```bash
node -e "const p = require('./understand-anything-plugin/package.json'); console.log(p.version)"
```
Expected: `1.2.1` (or whatever the current version is).
### Step 2: Bump patch version in all four files
New version: `1.2.2` (patch bump — internal optimization, no API changes).
Update each file:
- `understand-anything-plugin/package.json`: `"version": "1.2.2"`
- `.claude-plugin/marketplace.json`: `"version": "1.2.2"` in `plugins[0]`
- `.claude-plugin/plugin.json`: `"version": "1.2.2"`
- `.cursor-plugin/plugin.json`: `"version": "1.2.2"`
### Step 3: Verify all four files match
```bash
grep -r '"version"' understand-anything-plugin/package.json .claude-plugin/marketplace.json .claude-plugin/plugin.json .cursor-plugin/plugin.json
```
All four should show `"version": "1.2.2"`.
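If a scripted check is preferred over reading `grep` output, something like the following Python sketch could compare the four version fields. It is an assumption-laden example rather than an existing project script: `read_versions` and `versions_in_sync` are hypothetical names, and the location of the marketplace version (`plugins[0].version`) is taken from Step 2 above.

```python
import json
from pathlib import Path

# (file, accessor) pairs; the marketplace file keeps its version under plugins[0].
VERSION_FILES = [
    ("understand-anything-plugin/package.json", lambda d: d["version"]),
    (".claude-plugin/marketplace.json", lambda d: d["plugins"][0]["version"]),
    (".claude-plugin/plugin.json", lambda d: d["version"]),
    (".cursor-plugin/plugin.json", lambda d: d["version"]),
]


def read_versions(repo_root):
    """Return {relative path: version string} for the four version files."""
    versions = {}
    for rel_path, get_version in VERSION_FILES:
        data = json.loads((Path(repo_root) / rel_path).read_text(encoding="utf-8"))
        versions[rel_path] = get_version(data)
    return versions


def versions_in_sync(versions):
    """True when every file reports the same version."""
    return len(set(versions.values())) == 1
```

Calling `versions_in_sync(read_versions("."))` from the repository root returns `True` only when all four files agree.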
### Step 4: Commit
```bash
git add understand-anything-plugin/package.json \
.claude-plugin/marketplace.json \
.claude-plugin/plugin.json \
.cursor-plugin/plugin.json
git commit -m "chore: bump version to 1.2.2"
```
---
## Task 9: Build and smoke test
Verifies all changes work end-to-end by running `/understand --full` against a real project.
**Files:** None (testing only)
### Step 1: Build the packages
```bash
pnpm --filter @understand-anything/core build
pnpm --filter @understand-anything/skill build
```
Expected: both build without errors.
### Step 2: Find installed plugin version and copy to cache
```bash
ls ~/.claude/plugins/cache/understand-anything/understand-anything/
```
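Note the installed version (e.g., `1.0.1`). Copy the local build into the cache. The commands below use the version from the local `package.json`, so the target directory matches the installed one only if the two versions agree:
```bash
VERSION=$(node -e "const p = require('./understand-anything-plugin/package.json'); console.log(p.version)")
CACHE_DIR="$HOME/.claude/plugins/cache/understand-anything/understand-anything"
# Abort if VERSION is empty so rm -rf can never target the whole cache directory.
: "${VERSION:?could not read version from package.json}"
rm -rf "$CACHE_DIR/$VERSION"
cp -R ./understand-anything-plugin "$CACHE_DIR/$VERSION"
```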
### Step 3: Smoke test on a small project (~20 files)
Open a fresh Claude Code session in a small TypeScript project. Run:
```
/understand --full
```
Verify:
- Phases 0-7 complete without errors
- `knowledge-graph.json` is created
- Node count and edge count are reasonable
- Layers and tour are present
- No "allProjectFiles" or addendum errors in the output
### Step 4: Smoke test on a larger project (~100+ files)
Run `/understand --full` on a medium/large TypeScript+React project.
Verify:
- Batch count is ~4-6 (at 20-30 files per batch for 100 files), not 10-20
- No errors about missing import resolution
- `importMap` is present in `scan-result.json` (check `.understand-anything/intermediate/` before cleanup, or add a temporary debug log)
- Graph quality is comparable to before (summaries are descriptive, layers are correct)
### Step 5: Test `--review` flag
Run `/understand --full --review` on the same project.
Verify:
- Phase 6 now dispatches the LLM graph-reviewer subagent (not the inline script)
- `review.json` is produced with an `approved` field
- Pipeline completes normally
### Step 6: Final commit (if any fixes needed from smoke test)
```bash
git add -A
git commit -m "fix(understand): smoke test fixes for token reduction changes"
```
---
## Summary
| Task | Change | Risk |
|---|---|---|
| 1 | C5: Gate reviewer | Low |
| 2 | C4a: Slim Phase 4 payload | Low |
| 3 | C4b: Slim Phase 5 payload | Low |
| 4 | C3: Remove addendums from batches | Low |
| 5 | C1a: Scanner import resolution | Medium |
| 6 | C1b: File-analyzer uses batchImportData | Medium |
| 7 | C1c+C2: SKILL.md orchestration + batch size | Medium |
| 8 | Version bump | Low |
| 9 | Smoke test | — |
Tasks 1-4 are independent of Tasks 5-7. They can be shipped separately if needed. Tasks 5, 6, and 7 are tightly coupled (scanner produces importMap → SKILL.md passes batchImportData → file-analyzer consumes it) and must be shipped together.


@@ -0,0 +1,138 @@
# Homepage Feature Update Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Update the Astro homepage to reflect features from the v1.2.0 through v2.0.0 releases.
**Architecture:** Three file edits — expand Features.astro from 3→6 cards, update Install.astro platform note, update Footer.astro tagline. No new files or structural changes.
**Tech Stack:** Astro 6, CSS grid
---
### Task 1: Update Features.astro — Replace 3 Cards with 6
**Files:**
- Modify: `homepage/src/components/Features.astro`
**Step 1: Replace the features array (lines 2-18)**
Replace the entire frontmatter features array with:
```astro
---
const features = [
{
icon: '◈',
title: 'Interactive Knowledge Graph',
description: 'Visualize files, functions, and dependencies as an explorable graph with hierarchical drill-down and smart layout.',
},
{
icon: '⬡',
title: 'Beyond Code Analysis',
description: 'Analyze your entire project — Dockerfiles, Terraform, SQL, Markdown, and 26+ file types mapped into one unified graph.',
},
{
icon: '⊘',
title: 'Smart Filtering & Search',
description: 'Filter by node type, complexity, layer, or edge category. Fuzzy and semantic search to find anything instantly.',
},
{
icon: '⎙',
title: 'Export & Share',
description: 'Export your knowledge graph as high-quality PNG, SVG, or filtered JSON — ready for docs, presentations, or further analysis.',
},
{
icon: '⟿',
title: 'Dependency Path Finder',
description: 'Find the shortest path between any two components. Understand how parts of your system connect at a glance.',
},
{
icon: '⟐',
title: 'Guided Tours & Onboarding',
description: 'AI-generated walkthroughs that teach the codebase step by step, plus onboarding guides for new team members.',
},
];
---
```
**Step 2: Update the reveal delay logic (line 24)**
The current `reveal-delay-${i + 1}` only has CSS for delays 1-3. With 6 cards in 2 rows, use modulo so each row staggers 1/2/3:
```astro
<div class={`feature-card reveal reveal-delay-${(i % 3) + 1}`}>
```
**Step 3: Update the grid CSS to handle 2 rows properly**
No change is needed: `grid-template-columns: repeat(3, 1fr)` already wraps six cards onto a second row, and the mobile `1fr` breakpoint continues to work.
**Step 4: Verify build**
Run: `cd homepage && npx astro build`
Expected: Build completes with no errors.
**Step 5: Commit**
```bash
git add homepage/src/components/Features.astro
git commit -m "feat(homepage): expand features section to 6 cards for v2.0.0"
```
---
### Task 2: Update Install.astro — Multi-Platform Note
**Files:**
- Modify: `homepage/src/components/Install.astro`
**Step 1: Replace the platform note (line 13)**
Change:
```html
<p class="install-note">Works with <strong>Claude Code</strong> — Anthropic's official CLI for Claude.</p>
```
To:
```html
<p class="install-note">Works with <strong>Claude Code</strong>, <strong>Codex</strong>, <strong>OpenCode</strong>, <strong>Gemini CLI</strong>, and more.</p>
```
**Step 2: Commit**
```bash
git add homepage/src/components/Install.astro
git commit -m "feat(homepage): update install note for multi-platform support"
```
---
### Task 3: Update Footer.astro — Tagline
**Files:**
- Modify: `homepage/src/components/Footer.astro`
**Step 1: Replace the tagline (line 13)**
Change:
```html
<p class="footer-note">Built as a Claude Code plugin</p>
```
To:
```html
<p class="footer-note">Built for AI coding assistants</p>
```
**Step 2: Verify full build**
Run: `cd homepage && npx astro build`
Expected: Clean build, no errors.
**Step 3: Commit**
```bash
git add homepage/src/components/Footer.astro
git commit -m "feat(homepage): update footer tagline for multi-platform"
```

View File

@@ -0,0 +1,776 @@
# .understandignore Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add user-configurable file exclusion via `.understandignore` files using `.gitignore` syntax, with auto-generated starter files and a pre-analysis review pause.
**Architecture:** An `IgnoreFilter` module in `packages/core` uses the `ignore` npm package to parse `.understandignore` files and filter paths. A companion `IgnoreGenerator` scans the project for common patterns and produces a commented-out starter file. The `project-scanner` agent applies the filter as a second pass after its existing hardcoded exclusions. The `/understand` skill adds a Phase 0.5 that generates the starter file and pauses for user review.
**Tech Stack:** TypeScript, `ignore` npm package, Vitest
**Spec:** `docs/superpowers/specs/2026-04-10-understandignore-design.md`
---
## File Structure
### Core package
- Create: `understand-anything-plugin/packages/core/src/ignore-filter.ts` — parse .understandignore, merge with defaults, filter paths
- Create: `understand-anything-plugin/packages/core/src/ignore-generator.ts` — generate starter .understandignore by scanning project
- Create: `understand-anything-plugin/packages/core/src/__tests__/ignore-filter.test.ts` — filter tests
- Create: `understand-anything-plugin/packages/core/src/__tests__/ignore-generator.test.ts` — generator tests
- Modify: `understand-anything-plugin/packages/core/src/index.ts` — export new modules
- Modify: `understand-anything-plugin/packages/core/package.json` — add `ignore` dependency
### Agents & skills
- Modify: `understand-anything-plugin/agents/project-scanner.md` — add Layer 2 filtering step
- Modify: `understand-anything-plugin/skills/understand/SKILL.md` — add Phase 0.5
---
## Task 1: Add `ignore` dependency
**Files:**
- Modify: `understand-anything-plugin/packages/core/package.json`
- [ ] **Step 1: Install the `ignore` npm package**
Run:
```bash
cd understand-anything-plugin && pnpm add --filter @understand-anything/core ignore
```
- [ ] **Step 2: Verify it was added**
Run: `grep ignore understand-anything-plugin/packages/core/package.json`
Expected: `"ignore": "^7.x.x"` (or similar) in dependencies
- [ ] **Step 3: Commit**
```bash
git add understand-anything-plugin/packages/core/package.json understand-anything-plugin/pnpm-lock.yaml
git commit -m "chore(core): add ignore package for .understandignore support"
```
---
## Task 2: Create IgnoreFilter module with tests (TDD)
**Files:**
- Create: `understand-anything-plugin/packages/core/src/ignore-filter.ts`
- Create: `understand-anything-plugin/packages/core/src/__tests__/ignore-filter.test.ts`
- [ ] **Step 1: Write the failing tests**
Create `understand-anything-plugin/packages/core/src/__tests__/ignore-filter.test.ts`:
```typescript
import { describe, it, expect, beforeEach, afterEach } from "vitest";
import { createIgnoreFilter, DEFAULT_IGNORE_PATTERNS } from "../ignore-filter";
import { mkdirSync, writeFileSync, rmSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
describe("IgnoreFilter", () => {
let testDir: string;
beforeEach(() => {
testDir = join(tmpdir(), `ignore-filter-test-${Date.now()}`);
mkdirSync(testDir, { recursive: true });
mkdirSync(join(testDir, ".understand-anything"), { recursive: true });
});
afterEach(() => {
rmSync(testDir, { recursive: true, force: true });
});
describe("DEFAULT_IGNORE_PATTERNS", () => {
it("contains node_modules", () => {
expect(DEFAULT_IGNORE_PATTERNS).toContain("node_modules/");
});
it("contains .git", () => {
expect(DEFAULT_IGNORE_PATTERNS).toContain(".git/");
});
it("contains bin and obj for .NET", () => {
expect(DEFAULT_IGNORE_PATTERNS).toContain("bin/");
expect(DEFAULT_IGNORE_PATTERNS).toContain("obj/");
});
it("contains build output directories", () => {
expect(DEFAULT_IGNORE_PATTERNS).toContain("dist/");
expect(DEFAULT_IGNORE_PATTERNS).toContain("build/");
expect(DEFAULT_IGNORE_PATTERNS).toContain("out/");
expect(DEFAULT_IGNORE_PATTERNS).toContain("coverage/");
});
});
describe("createIgnoreFilter with no user file", () => {
it("ignores files matching default patterns", () => {
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("node_modules/foo/bar.js")).toBe(true);
expect(filter.isIgnored("dist/index.js")).toBe(true);
expect(filter.isIgnored(".git/config")).toBe(true);
expect(filter.isIgnored("bin/Debug/app.dll")).toBe(true);
expect(filter.isIgnored("obj/Release/net8.0/app.dll")).toBe(true);
});
it("does not ignore source files", () => {
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("src/index.ts")).toBe(false);
expect(filter.isIgnored("README.md")).toBe(false);
expect(filter.isIgnored("package.json")).toBe(false);
});
it("ignores lock files", () => {
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("pnpm-lock.yaml")).toBe(true);
expect(filter.isIgnored("package-lock.json")).toBe(true);
expect(filter.isIgnored("yarn.lock")).toBe(true);
});
it("ignores binary/asset files", () => {
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("logo.png")).toBe(true);
expect(filter.isIgnored("font.woff2")).toBe(true);
expect(filter.isIgnored("doc.pdf")).toBe(true);
});
it("ignores generated files", () => {
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("bundle.min.js")).toBe(true);
expect(filter.isIgnored("style.min.css")).toBe(true);
expect(filter.isIgnored("source.map")).toBe(true);
});
it("ignores IDE directories", () => {
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored(".idea/workspace.xml")).toBe(true);
expect(filter.isIgnored(".vscode/settings.json")).toBe(true);
});
});
describe("createIgnoreFilter with user .understandignore", () => {
it("reads patterns from .understand-anything/.understandignore", () => {
writeFileSync(
join(testDir, ".understand-anything", ".understandignore"),
"# Exclude tests\n__tests__/\n*.test.ts\n"
);
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("__tests__/foo.test.ts")).toBe(true);
expect(filter.isIgnored("src/utils.test.ts")).toBe(true);
expect(filter.isIgnored("src/utils.ts")).toBe(false);
});
it("reads patterns from project root .understandignore", () => {
writeFileSync(
join(testDir, ".understandignore"),
"docs/\n"
);
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("docs/README.md")).toBe(true);
expect(filter.isIgnored("src/index.ts")).toBe(false);
});
it("handles # comments and blank lines", () => {
writeFileSync(
join(testDir, ".understand-anything", ".understandignore"),
"# This is a comment\n\n\nfixtures/\n\n# Another comment\n"
);
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("fixtures/data.json")).toBe(true);
expect(filter.isIgnored("src/index.ts")).toBe(false);
});
it("supports ! negation to override defaults", () => {
writeFileSync(
join(testDir, ".understand-anything", ".understandignore"),
"!dist/\n"
);
const filter = createIgnoreFilter(testDir);
// dist/ is in defaults but negated by user
expect(filter.isIgnored("dist/index.js")).toBe(false);
});
it("supports ** recursive matching", () => {
writeFileSync(
join(testDir, ".understand-anything", ".understandignore"),
"**/snapshots/\n"
);
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("src/components/snapshots/Button.snap")).toBe(true);
expect(filter.isIgnored("snapshots/foo.snap")).toBe(true);
});
it("merges .understand-anything/ and root .understandignore", () => {
writeFileSync(
join(testDir, ".understand-anything", ".understandignore"),
"__tests__/\n"
);
writeFileSync(
join(testDir, ".understandignore"),
"fixtures/\n"
);
const filter = createIgnoreFilter(testDir);
expect(filter.isIgnored("__tests__/foo.ts")).toBe(true);
expect(filter.isIgnored("fixtures/data.json")).toBe(true);
expect(filter.isIgnored("src/index.ts")).toBe(false);
});
});
});
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `pnpm --filter @understand-anything/core test -- --run src/__tests__/ignore-filter.test.ts`
Expected: FAIL — module not found
- [ ] **Step 3: Implement IgnoreFilter**
Create `understand-anything-plugin/packages/core/src/ignore-filter.ts`:
```typescript
import ignore, { type Ignore } from "ignore";
import { readFileSync, existsSync } from "node:fs";
import { join } from "node:path";
/**
* Hardcoded default ignore patterns matching the project-scanner agent's
* exclusion rules, plus bin/obj for .NET projects.
*/
export const DEFAULT_IGNORE_PATTERNS: string[] = [
// Dependency directories
"node_modules/",
".git/",
"vendor/",
"venv/",
".venv/",
"__pycache__/",
// Build output
"dist/",
"build/",
"out/",
"coverage/",
".next/",
".cache/",
".turbo/",
"target/",
"bin/",
"obj/",
// Lock files
"*.lock",
"package-lock.json",
"yarn.lock",
"pnpm-lock.yaml",
// Binary/asset files
"*.png",
"*.jpg",
"*.jpeg",
"*.gif",
"*.svg",
"*.ico",
"*.woff",
"*.woff2",
"*.ttf",
"*.eot",
"*.mp3",
"*.mp4",
"*.pdf",
"*.zip",
"*.tar",
"*.gz",
// Generated files
"*.min.js",
"*.min.css",
"*.map",
"*.generated.*",
// IDE/editor
".idea/",
".vscode/",
// Misc
"LICENSE",
".gitignore",
".editorconfig",
".prettierrc",
".eslintrc*",
"*.log",
];
export interface IgnoreFilter {
/** Returns true if the given relative path should be excluded from analysis. */
isIgnored(relativePath: string): boolean;
}
/**
* Creates an IgnoreFilter that merges hardcoded defaults with user-defined
* patterns from .understandignore files.
*
* Pattern load order (later entries can override earlier ones via ! negation):
* 1. Hardcoded defaults
* 2. .understand-anything/.understandignore (if exists)
* 3. .understandignore at project root (if exists)
*/
export function createIgnoreFilter(projectRoot: string): IgnoreFilter {
const ig: Ignore = ignore();
// Layer 1: hardcoded defaults
ig.add(DEFAULT_IGNORE_PATTERNS);
// Layer 2: .understand-anything/.understandignore
const projectIgnorePath = join(projectRoot, ".understand-anything", ".understandignore");
if (existsSync(projectIgnorePath)) {
const content = readFileSync(projectIgnorePath, "utf-8");
ig.add(content);
}
// Layer 3: .understandignore at project root
const rootIgnorePath = join(projectRoot, ".understandignore");
if (existsSync(rootIgnorePath)) {
const content = readFileSync(rootIgnorePath, "utf-8");
ig.add(content);
}
return {
isIgnored(relativePath: string): boolean {
return ig.ignores(relativePath);
},
};
}
```
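When debugging why a path is or is not excluded, it can help to see which pattern sources were picked up and in what order. The following is a minimal sketch, assuming the same two file locations and load order as `createIgnoreFilter`; `collectPatternLayers` and `PatternLayer` are hypothetical names that are not part of this plan, and the sketch only lists patterns rather than evaluating them.

```typescript
import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical debugging aid: one entry per pattern source, in load order.
interface PatternLayer {
  source: string;
  patterns: string[];
}

function collectPatternLayers(projectRoot: string, defaults: string[]): PatternLayer[] {
  const layers: PatternLayer[] = [{ source: "defaults", patterns: [...defaults] }];
  const candidates = [
    join(".understand-anything", ".understandignore"),
    ".understandignore",
  ];
  for (const relative of candidates) {
    const fullPath = join(projectRoot, relative);
    if (!existsSync(fullPath)) continue;
    // Keep only active lines; comments and blank lines carry no patterns.
    const patterns = readFileSync(fullPath, "utf-8")
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line !== "" && !line.startsWith("#"));
    layers.push({ source: relative, patterns });
  }
  return layers;
}

// Usage: build a throwaway project and inspect the layers.
const demoRoot = mkdtempSync(join(tmpdir(), "ignore-layers-"));
writeFileSync(join(demoRoot, ".understandignore"), "# comment\ndocs/\n");
const layers = collectPatternLayers(demoRoot, ["node_modules/"]);
console.log(layers.map((layer) => layer.source));
rmSync(demoRoot, { recursive: true, force: true });
```

Later layers appear last, which matches the rule that a `!` negation in a user file can override an earlier default.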
- [ ] **Step 4: Run tests to verify they pass**
Run: `pnpm --filter @understand-anything/core test -- --run src/__tests__/ignore-filter.test.ts`
Expected: All tests PASS
- [ ] **Step 5: Build to verify no type errors**
Run: `pnpm --filter @understand-anything/core build`
Expected: Clean build
- [ ] **Step 6: Commit**
```bash
git add understand-anything-plugin/packages/core/src/ignore-filter.ts understand-anything-plugin/packages/core/src/__tests__/ignore-filter.test.ts
git commit -m "feat(core): add IgnoreFilter module with .understandignore parsing and tests"
```
---
## Task 3: Create IgnoreGenerator module with tests (TDD)
**Files:**
- Create: `understand-anything-plugin/packages/core/src/ignore-generator.ts`
- Create: `understand-anything-plugin/packages/core/src/__tests__/ignore-generator.test.ts`
- [ ] **Step 1: Write the failing tests**
Create `understand-anything-plugin/packages/core/src/__tests__/ignore-generator.test.ts`:
```typescript
import { describe, it, expect, beforeEach, afterEach } from "vitest";
import { generateStarterIgnoreFile } from "../ignore-generator";
import { mkdirSync, rmSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
describe("generateStarterIgnoreFile", () => {
let testDir: string;
beforeEach(() => {
testDir = join(tmpdir(), `ignore-gen-test-${Date.now()}`);
mkdirSync(testDir, { recursive: true });
});
afterEach(() => {
rmSync(testDir, { recursive: true, force: true });
});
it("includes a header comment explaining the file", () => {
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain(".understandignore");
expect(content).toContain("same as .gitignore");
expect(content).toContain("Built-in defaults");
});
it("all suggestions are commented out", () => {
// Create some directories to trigger suggestions
mkdirSync(join(testDir, "__tests__"), { recursive: true });
mkdirSync(join(testDir, "docs"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
const lines = content.split("\n").filter((l) => l.trim() && !l.startsWith("#"));
// No active (uncommented) patterns
expect(lines).toHaveLength(0);
});
it("suggests __tests__ when __tests__ directory exists", () => {
mkdirSync(join(testDir, "__tests__"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# __tests__/");
});
it("suggests docs when docs directory exists", () => {
mkdirSync(join(testDir, "docs"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# docs/");
});
it("suggests test directories when they exist", () => {
mkdirSync(join(testDir, "test"), { recursive: true });
mkdirSync(join(testDir, "tests"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# test/");
expect(content).toContain("# tests/");
});
it("suggests fixtures when fixtures directory exists", () => {
mkdirSync(join(testDir, "fixtures"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# fixtures/");
});
it("suggests examples when examples directory exists", () => {
mkdirSync(join(testDir, "examples"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# examples/");
});
it("suggests .storybook when .storybook directory exists", () => {
mkdirSync(join(testDir, ".storybook"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# .storybook/");
});
it("suggests migrations when migrations directory exists", () => {
mkdirSync(join(testDir, "migrations"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# migrations/");
});
it("suggests scripts when scripts directory exists", () => {
mkdirSync(join(testDir, "scripts"), { recursive: true });
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# scripts/");
});
it("always includes generic suggestions", () => {
const content = generateStarterIgnoreFile(testDir);
expect(content).toContain("# *.snap");
expect(content).toContain("# *.test.*");
expect(content).toContain("# *.spec.*");
});
it("does not suggest directories that don't exist", () => {
const content = generateStarterIgnoreFile(testDir);
// __tests__ doesn't exist, so it shouldn't be in directory suggestions
// (it may still be in generic test file patterns)
expect(content).not.toContain("# __tests__/");
expect(content).not.toContain("# .storybook/");
});
});
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `pnpm --filter @understand-anything/core test -- --run src/__tests__/ignore-generator.test.ts`
Expected: FAIL — module not found
- [ ] **Step 3: Implement IgnoreGenerator**
Create `understand-anything-plugin/packages/core/src/ignore-generator.ts`:
```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";
const HEADER = `# .understandignore — patterns for files/dirs to exclude from analysis
# Syntax: same as .gitignore (globs, # comments, ! negation, trailing / for dirs)
# Lines below are suggestions — uncomment to activate.
# Use ! prefix to force-include something excluded by defaults.
#
# Built-in defaults (always excluded unless negated):
# node_modules/, .git/, dist/, build/, bin/, obj/, *.lock, *.min.js, etc.
#
`;
/** Directories to check for and suggest excluding. */
const DETECTABLE_DIRS = [
{ dir: "__tests__", pattern: "__tests__/" },
{ dir: "test", pattern: "test/" },
{ dir: "tests", pattern: "tests/" },
{ dir: "fixtures", pattern: "fixtures/" },
{ dir: "testdata", pattern: "testdata/" },
{ dir: "docs", pattern: "docs/" },
{ dir: "examples", pattern: "examples/" },
{ dir: "scripts", pattern: "scripts/" },
{ dir: "migrations", pattern: "migrations/" },
{ dir: ".storybook", pattern: ".storybook/" },
];
/** Always-included generic suggestions. */
const GENERIC_SUGGESTIONS = [
"*.test.*",
"*.spec.*",
"*.snap",
];
/**
* Generates a starter .understandignore file by scanning the project root
* for common directories and suggesting them as commented-out exclusions.
*
* All suggestions are commented out — the user must uncomment to activate.
* Returns the file content as a string.
*/
export function generateStarterIgnoreFile(projectRoot: string): string {
const sections: string[] = [HEADER];
// Detected directory suggestions
const detected: string[] = [];
for (const { dir, pattern } of DETECTABLE_DIRS) {
if (existsSync(join(projectRoot, dir))) {
detected.push(pattern);
}
}
if (detected.length > 0) {
sections.push("# --- Detected directories (uncomment to exclude) ---\n");
for (const pattern of detected) {
sections.push(`# ${pattern}`);
}
sections.push("");
}
// Generic suggestions (always included)
sections.push("# --- Test file patterns (uncomment to exclude) ---\n");
for (const pattern of GENERIC_SUGGESTIONS) {
sections.push(`# ${pattern}`);
}
sections.push("");
return sections.join("\n");
}
```
- [ ] **Step 4: Run tests to verify they pass**
Run: `pnpm --filter @understand-anything/core test -- --run src/__tests__/ignore-generator.test.ts`
Expected: All tests PASS
- [ ] **Step 5: Build**
Run: `pnpm --filter @understand-anything/core build`
Expected: Clean build
- [ ] **Step 6: Commit**
```bash
git add understand-anything-plugin/packages/core/src/ignore-generator.ts understand-anything-plugin/packages/core/src/__tests__/ignore-generator.test.ts
git commit -m "feat(core): add IgnoreGenerator for starter .understandignore file creation"
```
---
## Task 4: Export new modules from core
**Files:**
- Modify: `understand-anything-plugin/packages/core/src/index.ts`
- [ ] **Step 1: Add exports**
Add to the end of `understand-anything-plugin/packages/core/src/index.ts`:
```typescript
export {
createIgnoreFilter,
DEFAULT_IGNORE_PATTERNS,
type IgnoreFilter,
} from "./ignore-filter.js";
export { generateStarterIgnoreFile } from "./ignore-generator.js";
```
- [ ] **Step 2: Build and run all tests**
Run: `pnpm --filter @understand-anything/core build && pnpm --filter @understand-anything/core test -- --run`
Expected: Clean build, all tests pass
- [ ] **Step 3: Commit**
```bash
git add understand-anything-plugin/packages/core/src/index.ts
git commit -m "feat(core): export IgnoreFilter and IgnoreGenerator from core index"
```
---
## Task 5: Update project-scanner agent
**Files:**
- Modify: `understand-anything-plugin/agents/project-scanner.md`
- [ ] **Step 1: Read the current project-scanner.md**
Read `understand-anything-plugin/agents/project-scanner.md` to understand the current structure.
- [ ] **Step 2: Add bin/ and obj/ to hardcoded exclusions**
In Step 2 (Exclusion Filtering), add `bin/` and `obj/` to the "Build output" line:
Change:
```
- **Build output:** paths with a directory segment matching `dist/`, `build/`, `out/`, `coverage/`, `.next/`, `.cache/`, `.turbo/`, `target/` (Rust)
```
To:
```
- **Build output:** paths with a directory segment matching `dist/`, `build/`, `out/`, `coverage/`, `.next/`, `.cache/`, `.turbo/`, `target/` (Rust), `bin/` (.NET), `obj/` (.NET)
```
- [ ] **Step 3: Add Layer 2 filtering step**
After Step 2 (Exclusion Filtering), add a new step:
```markdown
**Step 2.5 -- User-Configured Filtering (.understandignore)**
After applying the hardcoded exclusion filters above, apply user-configured patterns from `.understandignore`:
1. Check if `.understand-anything/.understandignore` exists in the project root. If so, read it.
2. Check if `.understandignore` exists in the project root. If so, read it.
3. Parse both files using `.gitignore` syntax (glob patterns, `#` comments, blank lines ignored, `!` prefix for negation, trailing `/` for directories, `**/` for recursive matching).
4. Filter the remaining file list through these patterns. Files matching any pattern are excluded.
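5. `!` negation patterns override the hardcoded exclusions from Step 2 (e.g., `!dist/` force-includes dist/). Because Step 2 has already dropped those files, evaluate negations against the full discovered file list (hardcoded defaults and user patterns together) rather than only against the files that survived Step 2.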
6. Track the count of files removed by this step as `filteredByIgnore`.
This filtering must be deterministic (not LLM-based). Use a Node.js script with the `ignore` npm package if implementing programmatically, or apply the patterns manually if the file list is small.
```
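The counting part of Step 2.5 can be sketched as follows. This is an illustration under stated assumptions: `PathFilter` is a stand-in for the `IgnoreFilter` interface from `ignore-filter.ts`, `applyIgnoreFilter` and `FilterResult` are hypothetical names, and the demo uses a toy predicate instead of a real `.understandignore` parser.

```typescript
// Minimal stand-in for the IgnoreFilter interface from ignore-filter.ts.
interface PathFilter {
  isIgnored(relativePath: string): boolean;
}

interface FilterResult {
  kept: string[];
  filteredByIgnore: number;
}

// Hypothetical helper: applies a filter to the scanned file list and
// counts how many paths it removed.
function applyIgnoreFilter(files: string[], filter: PathFilter): FilterResult {
  const kept = files.filter((file) => !filter.isIgnored(file));
  return { kept, filteredByIgnore: files.length - kept.length };
}

// Toy filter for demonstration only; a real one would come from createIgnoreFilter.
const docsFilter: PathFilter = { isIgnored: (path) => path.startsWith("docs/") };
const scanResult = applyIgnoreFilter(["src/index.ts", "docs/a.md", "docs/b.md"], docsFilter);
console.log(scanResult);
```

Keeping the count next to the filtered list makes it straightforward to populate the `filteredByIgnore` field in the scan output.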
- [ ] **Step 4: Update scan output schema**
Find the output JSON schema section and add `filteredByIgnore` field:
```json
{
"name": "...",
"description": "...",
"languages": ["..."],
"frameworks": ["..."],
"files": [...],
"totalFiles": 123,
"filteredByIgnore": 5,
"estimatedComplexity": "moderate",
"importMap": {}
}
```
- [ ] **Step 5: Commit**
```bash
git add understand-anything-plugin/agents/project-scanner.md
git commit -m "feat(agent): add .understandignore support and bin/obj exclusions to project-scanner"
```
---
## Task 6: Update /understand skill with Phase 0.5
**Files:**
- Modify: `understand-anything-plugin/skills/understand/SKILL.md`
- [ ] **Step 1: Read the current SKILL.md Phase 0 section**
Read `understand-anything-plugin/skills/understand/SKILL.md` lines 22-80 to understand Phase 0.
- [ ] **Step 2: Add Phase 0.5 after Phase 0**
After the Phase 0 section (after the `---` separator before Phase 1), insert:
```markdown
## Phase 0.5 — Ignore Configuration
Set up and verify the `.understandignore` file before scanning.
1. Check if `$PROJECT_ROOT/.understand-anything/.understandignore` exists.
2. **If it does NOT exist**, generate a starter file:
- Run a Node.js script (or inline logic) that scans `$PROJECT_ROOT` for common directories (`__tests__/`, `test/`, `tests/`, `fixtures/`, `testdata/`, `docs/`, `examples/`, `scripts/`, `migrations/`, `.storybook/`) and generates a `.understandignore` file with commented-out suggestions.
- Write the generated content to `$PROJECT_ROOT/.understand-anything/.understandignore`.
- Report to the user:
> "Generated `.understand-anything/.understandignore` with suggested exclusions based on your project structure. Please review it and uncomment any patterns you'd like to exclude from analysis. When ready, confirm to continue."
- **Wait for user confirmation before proceeding.**
3. **If it already exists**, report:
> "Found `.understand-anything/.understandignore`. Review it if needed, then confirm to continue."
- **Wait for user confirmation before proceeding.**
4. After confirmation, proceed to Phase 1.
**Note:** The `.understandignore` file uses `.gitignore` syntax. The user can add patterns to exclude files from analysis, or use `!` prefix to force-include files excluded by built-in defaults (e.g., `!dist/` to analyze dist/ files).
---
```
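The create-if-missing branch of Phase 0.5 might look like the sketch below. `ensureIgnoreFile` and `IgnoreFileStatus` are hypothetical names, and the `generate` callback stands in for `generateStarterIgnoreFile` so the example runs without the core package. The user-confirmation pause is not modeled here.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

type IgnoreFileStatus = "created" | "exists";

// Hypothetical helper for Phase 0.5: writes the starter file only when it is
// missing, so an existing user-edited file is never overwritten.
function ensureIgnoreFile(
  projectRoot: string,
  generate: (root: string) => string,
): IgnoreFileStatus {
  const target = join(projectRoot, ".understand-anything", ".understandignore");
  if (existsSync(target)) return "exists";
  mkdirSync(dirname(target), { recursive: true });
  writeFileSync(target, generate(projectRoot), "utf-8");
  return "created";
}

// Usage against a throwaway directory with stub generators.
const phaseRoot = mkdtempSync(join(tmpdir(), "phase05-"));
const firstRun = ensureIgnoreFile(phaseRoot, () => "# docs/\n");
const secondRun = ensureIgnoreFile(phaseRoot, () => "# overwritten\n");
const written = readFileSync(join(phaseRoot, ".understand-anything", ".understandignore"), "utf-8");
rmSync(phaseRoot, { recursive: true, force: true });
console.log(firstRun, secondRun);
```

Returning a status lets the skill choose between the "Generated" and "Found" messages before it waits for confirmation.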
- [ ] **Step 3: Update Phase 1 reporting**
In the Phase 1 section, after the gate check (~line 114), add a note about reporting ignore stats:
```markdown
After scanning, if the scan result includes `filteredByIgnore > 0`, report:
> "Scanned {totalFiles} files ({filteredByIgnore} excluded by .understandignore)"
```
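A small formatter keeps this message consistent. The sketch below is an assumption-laden illustration: `ScanStats` models only the two fields used here, `formatScanReport` is a hypothetical name, and the wording of the fallback message (when nothing was excluded) is not specified by this plan.

```typescript
// Subset of the scan output relevant to reporting; other fields omitted.
interface ScanStats {
  totalFiles: number;
  filteredByIgnore?: number;
}

// Hypothetical formatter: mentions ignore stats only when something was excluded.
function formatScanReport(stats: ScanStats): string {
  const excluded = stats.filteredByIgnore ?? 0;
  return excluded > 0
    ? `Scanned ${stats.totalFiles} files (${excluded} excluded by .understandignore)`
    : `Scanned ${stats.totalFiles} files`; // fallback wording is an assumption
}

console.log(formatScanReport({ totalFiles: 123, filteredByIgnore: 5 }));
```

Treating a missing `filteredByIgnore` as zero keeps the formatter compatible with scan results produced before this field existed.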
- [ ] **Step 4: Commit**
```bash
git add understand-anything-plugin/skills/understand/SKILL.md
git commit -m "feat(skill): add Phase 0.5 for .understandignore setup and review pause"
```
---
## Task 7: Build, test, and verify end-to-end
**Files:**
- All modified files
- [ ] **Step 1: Build core**
Run: `pnpm --filter @understand-anything/core build`
Expected: Clean build
- [ ] **Step 2: Run all core tests**
Run: `pnpm --filter @understand-anything/core test -- --run`
Expected: All tests pass (existing + new ignore-filter + ignore-generator tests)
- [ ] **Step 3: Build skill package**
Run: `pnpm --filter @understand-anything/skill build`
Expected: Clean build
- [ ] **Step 4: Verify files exist**
Run:
```bash
ls understand-anything-plugin/packages/core/src/ignore-filter.ts understand-anything-plugin/packages/core/src/ignore-generator.ts
```
Expected: Both files listed
- [ ] **Step 5: Verify exports work**
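Run from a directory where `@understand-anything/core` resolves (for example, a workspace package that depends on it), after building core: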
```bash
node -e "import('@understand-anything/core').then(m => { console.log('IgnoreFilter:', typeof m.createIgnoreFilter); console.log('Generator:', typeof m.generateStarterIgnoreFile); })"
```
Expected: Both show `function`
- [ ] **Step 6: Final commit (if any uncommitted changes remain)**
```bash
git status
# If clean, skip. If changes exist:
git add -A && git commit -m "chore: final verification for .understandignore support"
```

View File

@@ -0,0 +1,856 @@
# Language-Specific Extractor Architecture Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** (1) Decouple AST extraction logic from TS/JS-specific node types so 8 additional code languages (Python, Go, Rust, Java, Ruby, PHP, C/C++, C#) get tree-sitter-powered structural analysis. Swift and Kotlin are excluded — no WASM grammar packages available. (2) Replace the file-analyzer agent's ad-hoc regex script generation with a deterministic, pre-built tree-sitter extraction script.
**Architecture:** Introduce a `LanguageExtractor` interface that each language implements. `TreeSitterPlugin` delegates extraction to the registered extractor for the file's language. A bundled `extract-structure.mjs` script in `skills/understand/` uses `PluginRegistry` (which includes both `TreeSitterPlugin` and the non-code parsers) to provide deterministic structural extraction for the file-analyzer agent — replacing the current approach where the LLM writes throwaway regex scripts every run.
**Tech Stack:** web-tree-sitter (WASM), TypeScript, Vitest
---
## File Structure
```
packages/core/src/plugins/
├── extractors/
│ ├── types.ts # LanguageExtractor interface + TreeSitterNode re-export
│ ├── base-extractor.ts # Shared utilities (traverse, getStringValue)
│ ├── typescript-extractor.ts # TS/JS (moved from tree-sitter-plugin.ts)
│ ├── python-extractor.ts
│ ├── go-extractor.ts
│ ├── rust-extractor.ts
│ ├── java-extractor.ts
│ ├── ruby-extractor.ts
│ ├── php-extractor.ts
│ ├── cpp-extractor.ts
│ ├── csharp-extractor.ts
│ └── index.ts # builtinExtractors array + re-exports
├── tree-sitter-plugin.ts # Refactored to use extractors
└── tree-sitter-plugin.test.ts # Existing tests (should still pass)
packages/core/src/plugins/__tests__/
└── extractors.test.ts # Tests for all new extractors
skills/understand/
├── extract-structure.mjs # Pre-built tree-sitter extraction script (NEW)
└── SKILL.md # Updated to reference extract-structure.mjs
agents/
└── file-analyzer.md # Phase 1 rewritten to execute pre-built script
```
---
### Task 1: Create LanguageExtractor interface and shared utilities
**Files:**
- Create: `packages/core/src/plugins/extractors/types.ts`
- Create: `packages/core/src/plugins/extractors/base-extractor.ts`
- [ ] **Step 1: Create the extractor interface**
```typescript
// packages/core/src/plugins/extractors/types.ts
import type { StructuralAnalysis, CallGraphEntry } from "../../types.js";
// Re-export the tree-sitter Node type for use by extractors
export type TreeSitterNode = import("web-tree-sitter").Node;
/**
* Language-specific extractor that maps a tree-sitter AST
* to the common StructuralAnalysis / CallGraphEntry types.
*/
export interface LanguageExtractor {
/** Language IDs this extractor handles (must match LanguageConfig.id) */
languageIds: string[];
/** Extract functions, classes, imports, exports from the root AST node */
extractStructure(rootNode: TreeSitterNode): StructuralAnalysis;
/** Extract caller→callee relationships from the root AST node */
extractCallGraph(rootNode: TreeSitterNode): CallGraphEntry[];
}
```
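To make the shape of an implementation concrete, here is a toy extractor that runs without `web-tree-sitter` or the core package. Everything in it is an assumption for illustration: `SketchNode` and `SketchAnalysis` are simplified stand-ins for `TreeSitterNode` and `StructuralAnalysis`, the fake tree is built by hand, and the node type names (`function_definition`, `class_definition`, `identifier`) are assumed to follow the tree-sitter Python grammar and should be verified against it.

```typescript
// Stand-in types so the sketch runs without web-tree-sitter or the core package.
interface SketchNode {
  type: string;
  text: string;
  childCount: number;
  child(index: number): SketchNode | null;
}

interface SketchAnalysis {
  functions: { name: string }[];
  classes: { name: string }[];
}

// Builds a fake AST node for demonstration.
function makeNode(type: string, text: string, children: SketchNode[] = []): SketchNode {
  return { type, text, childCount: children.length, child: (i) => children[i] ?? null };
}

// Returns the text of the first `identifier` child, or null if there is none.
function nameOf(node: SketchNode): string | null {
  for (let i = 0; i < node.childCount; i++) {
    const child = node.child(i);
    if (child && child.type === "identifier") return child.text;
  }
  return null;
}

// Hypothetical extractor: walks the tree and records named definitions.
const toyPythonExtractor = {
  languageIds: ["python"],
  extractStructure(root: SketchNode): SketchAnalysis {
    const result: SketchAnalysis = { functions: [], classes: [] };
    const visit = (node: SketchNode): void => {
      const name = nameOf(node);
      if (name && node.type === "function_definition") result.functions.push({ name });
      if (name && node.type === "class_definition") result.classes.push({ name });
      for (let i = 0; i < node.childCount; i++) {
        const child = node.child(i);
        if (child) visit(child);
      }
    };
    visit(root);
    return result;
  },
};

const fakeModule = makeNode("module", "", [
  makeNode("function_definition", "def main(): ...", [makeNode("identifier", "main")]),
  makeNode("class_definition", "class App: ...", [makeNode("identifier", "App")]),
]);
const toyResult = toyPythonExtractor.extractStructure(fakeModule);
console.log(toyResult);
```

A real extractor would also fill in imports, exports, parameters, and line ranges, and would implement `extractCallGraph`.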
- [ ] **Step 2: Create base-extractor with shared utilities**
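Move `traverse()` and `getStringValue()` from `tree-sitter-plugin.ts` into a shared module, and add three small child-lookup helpers (`findChild`, `findChildren`, `hasChildOfType`) for the new extractors: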
```typescript
// packages/core/src/plugins/extractors/base-extractor.ts
import type { TreeSitterNode } from "./types.js";
/** Recursively traverse an AST tree, calling the visitor for each node. */
export function traverse(
node: TreeSitterNode,
visitor: (node: TreeSitterNode) => void,
): void {
visitor(node);
for (let i = 0; i < node.childCount; i++) {
const child = node.child(i);
if (child) traverse(child, visitor);
}
}
/** Extract the unquoted string value from a string-like node. */
export function getStringValue(node: TreeSitterNode): string {
for (let i = 0; i < node.childCount; i++) {
const child = node.child(i);
if (child && child.type === "string_fragment") {
return child.text;
}
}
return node.text.replace(/^['"`]|['"`]$/g, "");
}
/** Find the first child matching a type. */
export function findChild(node: TreeSitterNode, type: string): TreeSitterNode | null {
for (let i = 0; i < node.childCount; i++) {
const child = node.child(i);
if (child && child.type === type) return child;
}
return null;
}
/** Find all children matching a type. */
export function findChildren(node: TreeSitterNode, type: string): TreeSitterNode[] {
const result: TreeSitterNode[] = [];
for (let i = 0; i < node.childCount; i++) {
const child = node.child(i);
if (child && child.type === type) result.push(child);
}
return result;
}
/** Check if a node has a child of the given type (used for export/visibility checks). */
export function hasChildOfType(node: TreeSitterNode, type: string): boolean {
for (let i = 0; i < node.childCount; i++) {
const child = node.child(i);
if (child && child.type === type) return true;
}
return false;
}
```
- [ ] **Step 3: Commit**
```bash
git add packages/core/src/plugins/extractors/types.ts packages/core/src/plugins/extractors/base-extractor.ts
git commit -m "feat: add LanguageExtractor interface and shared base utilities"
```
---
### Task 2: Move TS/JS extraction logic into TypeScriptExtractor
**Files:**
- Create: `packages/core/src/plugins/extractors/typescript-extractor.ts`
- Modify: `packages/core/src/plugins/tree-sitter-plugin.ts`
This is a pure refactor. All existing tests must still pass with zero changes.
- [ ] **Step 1: Create TypeScriptExtractor**
Move all the TS/JS-specific extraction methods (`extractFunction`, `extractClass`, `extractVariableDeclarations`, `extractImport`, `processExportStatement`, `extractParams`, `extractReturnType`, `extractImportSpecifiers`, and the call graph walker) from `tree-sitter-plugin.ts` into `typescript-extractor.ts`, implementing the `LanguageExtractor` interface.
The `languageIds` should be `["typescript", "javascript"]`. Do NOT include `"tsx"` — it is a synthetic key internal to `TreeSitterPlugin` for grammar selection, not a `LanguageConfig.id`. The tsx→typescript mapping is handled in `getExtractor()` below.
- [ ] **Step 2: Refactor TreeSitterPlugin to use extractors**
Replace the hardcoded extraction logic in `TreeSitterPlugin` with extractor dispatch:
```typescript
// In TreeSitterPlugin
private extractors = new Map<string, LanguageExtractor>();
registerExtractor(extractor: LanguageExtractor): void {
for (const id of extractor.languageIds) {
this.extractors.set(id, extractor);
}
}
private getExtractor(langKey: string): LanguageExtractor | null {
// tsx is a synthetic grammar key — extraction logic is identical to typescript
const key = langKey === "tsx" ? "typescript" : langKey;
return this.extractors.get(key) ?? null;
}
```
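The dispatch rule, including the `tsx` alias, can be exercised in isolation. This is a standalone sketch rather than the plugin code: `ExtractorRegistry` and `SketchExtractor` are hypothetical names, and the extractor stand-in keeps only the `languageIds` field needed for lookup.

```typescript
// Stand-in for LanguageExtractor; only the field needed for dispatch is kept.
interface SketchExtractor {
  languageIds: string[];
}

class ExtractorRegistry {
  private extractors = new Map<string, SketchExtractor>();

  register(extractor: SketchExtractor): void {
    for (const id of extractor.languageIds) this.extractors.set(id, extractor);
  }

  // "tsx" is treated as an alias for "typescript", as in getExtractor().
  lookup(langKey: string): SketchExtractor | null {
    const key = langKey === "tsx" ? "typescript" : langKey;
    return this.extractors.get(key) ?? null;
  }
}

const registry = new ExtractorRegistry();
const tsExtractor: SketchExtractor = { languageIds: ["typescript", "javascript"] };
registry.register(tsExtractor);
console.log(registry.lookup("tsx") === tsExtractor, registry.lookup("python"));
```

Returning `null` for unregistered languages lets the caller fall back to an empty analysis instead of throwing.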
The `analyzeFile()` method becomes:
```typescript
analyzeFile(filePath: string, content: string): StructuralAnalysis {
const parser = this.getParser(filePath);
if (!parser) return { functions: [], classes: [], imports: [], exports: [] };
const tree = parser.parse(content);
if (!tree) { parser.delete(); return { functions: [], classes: [], imports: [], exports: [] }; }
const langKey = this.languageKeyFromPath(filePath);
const extractor = langKey ? this.getExtractor(langKey) : null;
let result: StructuralAnalysis;
if (extractor) {
result = extractor.extractStructure(tree.rootNode);
} else {
result = { functions: [], classes: [], imports: [], exports: [] };
}
tree.delete();
parser.delete();
return result;
}
```
The `extractCallGraph()` method follows the same pattern — parser lifecycle must be managed identically:
```typescript
extractCallGraph(filePath: string, content: string): CallGraphEntry[] {
const parser = this.getParser(filePath);
if (!parser) return [];
const tree = parser.parse(content);
if (!tree) { parser.delete(); return []; }
const langKey = this.languageKeyFromPath(filePath);
const extractor = langKey ? this.getExtractor(langKey) : null;
const result = extractor ? extractor.extractCallGraph(tree.rootNode) : [];
tree.delete();
parser.delete();
return result;
}
```
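One possible hardening, not part of the plan as written: if an extractor throws, the `tree.delete()` and `parser.delete()` calls above are skipped. A minimal sketch of a `try`/`finally` wrapper follows, assuming only that the handle exposes `delete()`. The `Deletable` interface and `withDeletable` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical stand-in for the parser and tree handles; only delete() matters here.
interface Deletable {
  delete(): void;
}

// Runs `use` and always releases the resource, even when `use` throws.
function withDeletable<T extends Deletable, R>(resource: T, use: (r: T) => R): R {
  try {
    return use(resource);
  } finally {
    resource.delete();
  }
}

// Example: a fake tree whose extraction step throws.
let deletedCount = 0;
const fakeTree: Deletable = {
  delete: () => {
    deletedCount += 1;
  },
};

let extractionFailed = false;
try {
  withDeletable(fakeTree, () => {
    throw new Error("extractor failed");
  });
} catch {
  extractionFailed = true;
}
```

The error still propagates to the caller, but the handle is released exactly once on both the success and failure paths.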
The constructor should accept an optional `extractors` array and register them. If none provided, register the built-in `TypeScriptExtractor` for backward compatibility.
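The constructor default and the `tsx` mapping can be sketched together. This is a simplified illustration, not the real `TreeSitterPlugin`: `ExtractorSketch` and `ExtractorDispatch` are hypothetical names, the interface is reduced to `languageIds`, and the `defaults` parameter stands in for the built-in extractor list.

```typescript
// Hypothetical, trimmed-down shape: the real LanguageExtractor also has
// extractStructure() and extractCallGraph().
interface ExtractorSketch {
  languageIds: string[];
}

class ExtractorDispatch {
  private extractors = new Map<string, ExtractorSketch>();

  // `defaults` stands in for the built-in extractor list.
  constructor(extractors?: ExtractorSketch[], defaults: ExtractorSketch[] = []) {
    for (const extractor of extractors ?? defaults) {
      for (const id of extractor.languageIds) {
        this.extractors.set(id, extractor);
      }
    }
  }

  get(langKey: string): ExtractorSketch | null {
    // "tsx" is a grammar key only; it shares the typescript extractor.
    const key = langKey === "tsx" ? "typescript" : langKey;
    return this.extractors.get(key) ?? null;
  }
}

const tsSketch: ExtractorSketch = { languageIds: ["typescript", "javascript"] };
const pySketch: ExtractorSketch = { languageIds: ["python"] };

const withDefaults = new ExtractorDispatch(undefined, [tsSketch]);
const customOnly = new ExtractorDispatch([pySketch], [tsSketch]);
```

In this sketch an explicitly passed empty array registers nothing. Whether the real constructor should treat `[]` as "none provided" is a design choice the plan leaves open.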
- [ ] **Step 3: Run existing tests to verify zero behavior change**
Run: `pnpm --filter @understand-anything/core test`
Expected: All 426 tests pass (identical to before)
- [ ] **Step 4: Commit**
```bash
git add packages/core/src/plugins/extractors/typescript-extractor.ts packages/core/src/plugins/tree-sitter-plugin.ts
git commit -m "refactor: move TS/JS extraction logic to TypeScriptExtractor, dispatch via LanguageExtractor interface"
```
---
### Task 2.5: Add extractCallGraph to PluginRegistry and update DEFAULT_PLUGIN_CONFIG
**Files:**
- Modify: `packages/core/src/plugins/registry.ts`
- Modify: `packages/core/src/plugins/discovery.ts`
**Context:** `PluginRegistry` currently only exposes `analyzeFile` and `resolveImports` — it has no `extractCallGraph`. The `extract-structure.mjs` script (Task 13) needs call graph data through the registry. Also, `DEFAULT_PLUGIN_CONFIG` hardcodes `["typescript", "javascript"]` which needs to reflect all supported languages.
- [ ] **Step 1: Add extractCallGraph to PluginRegistry**
```typescript
// In PluginRegistry (registry.ts)
extractCallGraph(filePath: string, content: string): CallGraphEntry[] | null {
const plugin = this.getPluginForFile(filePath);
if (!plugin?.extractCallGraph) return null;
return plugin.extractCallGraph(filePath, content);
}
```
- [ ] **Step 2: Update DEFAULT_PLUGIN_CONFIG to derive languages dynamically**
In `discovery.ts`, replace the hardcoded `["typescript", "javascript"]` with a dynamic derivation from `builtinLanguageConfigs`:
```typescript
import { builtinLanguageConfigs } from "../languages/configs/index.js";
export const DEFAULT_PLUGIN_CONFIG: PluginConfig = {
plugins: [
{
name: "tree-sitter",
enabled: true,
languages: builtinLanguageConfigs
.filter((c) => c.treeSitter)
.map((c) => c.id),
},
],
};
```
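To make the derivation concrete, here is a minimal sketch of the same filter/map over sample data. `LanguageConfigSketch` is a hypothetical, reduced config shape and `sampleConfigs` is illustrative; the real list is `builtinLanguageConfigs`.

```typescript
// Hypothetical, reduced config shape: only the fields this sketch needs.
interface LanguageConfigSketch {
  id: string;
  treeSitter?: { wasmPackage: string; wasmFile: string };
}

// Sample data for illustration only.
const sampleConfigs: LanguageConfigSketch[] = [
  { id: "python", treeSitter: { wasmPackage: "tree-sitter-python", wasmFile: "tree-sitter-python.wasm" } },
  { id: "go", treeSitter: { wasmPackage: "tree-sitter-go", wasmFile: "tree-sitter-go.wasm" } },
  { id: "swift" }, // no treeSitter block, so it is filtered out
  { id: "kotlin" }, // same
];

// Keeps only languages that declare a tree-sitter grammar.
function treeSitterLanguageIds(configs: LanguageConfigSketch[]): string[] {
  return configs.filter((c) => c.treeSitter !== undefined).map((c) => c.id);
}

const derivedIds = treeSitterLanguageIds(sampleConfigs);
```

With this approach, adding a `treeSitter` block to a config is enough to enable the language in the default plugin config.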
- [ ] **Step 3: Run tests, commit**
```bash
pnpm --filter @understand-anything/core test
git add packages/core/src/plugins/registry.ts packages/core/src/plugins/discovery.ts
git commit -m "feat: add extractCallGraph to PluginRegistry, derive DEFAULT_PLUGIN_CONFIG from configs"
```
---
### Task 3: Add npm dependencies and treeSitter configs for all 10 languages
**Files:**
- Modify: `packages/core/package.json` (add 8 deps: python, go, rust, java, ruby, php, cpp, c-sharp)
- Modify: 10 config files in `packages/core/src/languages/configs/`
- [ ] **Step 1: Add tree-sitter grammar dependencies to package.json**
Add to `dependencies`:
```json
"tree-sitter-c-sharp": "^0.23.1",
"tree-sitter-cpp": "^0.23.4",
"tree-sitter-go": "^0.25.0",
"tree-sitter-java": "^0.23.5",
"tree-sitter-php": "^0.23.11",
"tree-sitter-python": "^0.25.0",
"tree-sitter-ruby": "^0.23.1",
"tree-sitter-rust": "^0.24.0"
```
Then run `pnpm install`.
- [ ] **Step 2: Add treeSitter field to all 10 language configs**
Each config gets a `treeSitter` block. Examples:
```typescript
// python.ts
treeSitter: { wasmPackage: "tree-sitter-python", wasmFile: "tree-sitter-python.wasm" },
// go.ts
treeSitter: { wasmPackage: "tree-sitter-go", wasmFile: "tree-sitter-go.wasm" },
// rust.ts
treeSitter: { wasmPackage: "tree-sitter-rust", wasmFile: "tree-sitter-rust.wasm" },
// java.ts
treeSitter: { wasmPackage: "tree-sitter-java", wasmFile: "tree-sitter-java.wasm" },
// ruby.ts
treeSitter: { wasmPackage: "tree-sitter-ruby", wasmFile: "tree-sitter-ruby.wasm" },
// php.ts
treeSitter: { wasmPackage: "tree-sitter-php", wasmFile: "tree-sitter-php.wasm" },
// cpp.ts
treeSitter: { wasmPackage: "tree-sitter-cpp", wasmFile: "tree-sitter-cpp.wasm" },
// csharp.ts
treeSitter: { wasmPackage: "tree-sitter-c-sharp", wasmFile: "tree-sitter-c_sharp.wasm" },
```
Note: Swift and Kotlin configs are NOT changed (no WASM packages available).
- [ ] **Step 3: Run pnpm install and verify WASM files resolve**
```bash
pnpm install
node -e "const r=require('module').createRequire(require('path').resolve('packages/core/package.json')); console.log(r.resolve('tree-sitter-python/tree-sitter-python.wasm'))"
```
- [ ] **Step 4: Commit**
```bash
git add packages/core/package.json pnpm-lock.yaml packages/core/src/languages/configs/
git commit -m "feat: add tree-sitter grammar deps and treeSitter configs for 10 languages"
```
---
### Task 4: Create Python extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/python-extractor.ts`
- [ ] **Step 1: Write the Python extractor**
Key Python tree-sitter node types:
- Functions: `function_definition` (name, parameters, return_type)
- Classes: `class_definition` (name, body → methods + assignments as properties)
- Imports: `import_statement`, `import_from_statement`
- Decorated: `decorated_definition` wrapping function_definition or class_definition
- Calls: `call` (function field)
- No formal exports (all top-level names are "exported")
```typescript
languageIds: ["python"]
```
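The `decorated_definition` case is the main structural wrinkle. A minimal sketch of unwrapping it follows, using a hypothetical `PyNodeSketch` shape over hand-built mock nodes; the real `TreeSitterNode` interface in `extractors/types.ts` may differ.

```typescript
// Hypothetical minimal node shape for illustration.
interface PyNodeSketch {
  type: string;
  text?: string;
  children: PyNodeSketch[];
}

// Returns the function or class definition wrapped by a decorated_definition,
// or the node itself if it is not decorated.
function unwrapDecorated(node: PyNodeSketch): PyNodeSketch {
  if (node.type !== "decorated_definition") return node;
  const inner = node.children.find(
    (c) => c.type === "function_definition" || c.type === "class_definition",
  );
  return inner ?? node;
}

// Collects names of top-level functions, decorated or not.
function topLevelFunctionNames(root: PyNodeSketch): string[] {
  const names: string[] = [];
  for (const child of root.children) {
    const def = unwrapDecorated(child);
    if (def.type !== "function_definition") continue;
    const id = def.children.find((c) => c.type === "identifier");
    if (id?.text) names.push(id.text);
  }
  return names;
}

// Mock tree: one class, one plain function, one decorated function.
const pyRoot: PyNodeSketch = {
  type: "module",
  children: [
    { type: "class_definition", children: [{ type: "identifier", text: "DataProcessor", children: [] }] },
    { type: "function_definition", children: [{ type: "identifier", text: "helper", children: [] }] },
    {
      type: "decorated_definition",
      children: [
        { type: "decorator", children: [] },
        { type: "function_definition", children: [{ type: "identifier", text: "decorated_func", children: [] }] },
      ],
    },
  ],
};
```

Only direct children of the root are visited, so methods inside `DataProcessor` are not counted as top-level functions.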
- [ ] **Step 2: Write tests for Python extractor**
Test with representative Python code:
```python
import os
from pathlib import Path
from typing import Optional
class DataProcessor:
name: str
def __init__(self, name: str):
self.name = name
def process(self, data: list) -> dict:
return transform(data)
def helper(x: int) -> str:
return str(x)
@decorator
def decorated_func():
pass
```
Verify: 2 functions (helper, decorated_func), 1 class (DataProcessor with methods __init__/process and property name), 3 imports, call graph (process→transform).
- [ ] **Step 3: Run tests**
Run: `pnpm --filter @understand-anything/core test`
- [ ] **Step 4: Commit**
---
### Task 5: Create Go extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/go-extractor.ts`
- [ ] **Step 1: Write the Go extractor**
Key Go tree-sitter node types:
- Functions: `function_declaration` (name, parameter_list, result)
- Methods: `method_declaration` (receiver, name, parameter_list, result)
- Structs: `type_declaration` → `type_spec` → `struct_type`
- Interfaces: `type_declaration` → `type_spec` → `interface_type`
- Imports: `import_declaration` → `import_spec_list` → `import_spec`
- Exports: capitalized first letter of name
- Calls: `call_expression` (function field)
```typescript
languageIds: ["go"]
```
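The export rule can be sketched as a small predicate. This is an approximation under a stated assumption: Go exports identifiers whose first character is an uppercase Unicode letter, and the check below detects that by comparing case conversions rather than consulting Unicode categories. `isGoExported` and `goExportedNames` are hypothetical helper names.

```typescript
// Approximates Go's rule: a name is exported when its first character is an
// uppercase letter. Characters without case (digits, "_") are never exported.
function isGoExported(name: string): boolean {
  const first = name.charAt(0);
  if (first === "") return false;
  return first === first.toUpperCase() && first !== first.toLowerCase();
}

// Filters a list of declared names down to the exported ones.
function goExportedNames(names: string[]): string[] {
  return names.filter(isGoExported);
}

const goExports = goExportedNames(["Server", "Start", "NewServer", "main", "_internal"]);
```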
- [ ] **Step 2: Write tests**
Test with:
```go
package main
import (
"fmt"
"os"
)
type Server struct {
Host string
Port int
}
func (s *Server) Start() error {
fmt.Println("starting")
return nil
}
func NewServer(host string, port int) *Server {
return &Server{Host: host, Port: port}
}
```
Verify: 2 functions (Start, NewServer), 1 class/struct (Server with method Start, properties Host/Port), 2 imports, exports (Server, Start, NewServer — all capitalized), call graph (Start→fmt.Println).
- [ ] **Step 3: Run tests and commit**
---
### Task 6: Create Rust extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/rust-extractor.ts`
- [ ] **Step 1: Write the Rust extractor**
Key Rust tree-sitter node types:
- Functions: `function_item` (name, parameters, return_type via `->`)
- Structs: `struct_item` (name, field_declaration_list)
- Enums: `enum_item`
- Impl blocks: `impl_item` (type, body containing function_items)
- Traits: `trait_item`
- Imports: `use_declaration` (scoped_identifier, use_list, use_wildcard)
- Exports: `visibility_modifier` containing `pub`
- Calls: `call_expression` (function field)
```typescript
languageIds: ["rust"]
```
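As an illustration of the `pub` rule, here is a minimal sketch that walks mock nodes and collects names of public structs and functions, including methods inside `impl` blocks. `RustNodeSketch` and the `rs` builder are hypothetical; the mock mirrors the node types listed above but is hand-built, not parser output.

```typescript
// Hypothetical minimal node shape for illustration.
interface RustNodeSketch {
  type: string;
  text: string;
  children: RustNodeSketch[];
}

// Small builder to keep the mock tree readable.
const rs = (type: string, children: RustNodeSketch[] = [], text = ""): RustNodeSketch => ({
  type,
  text,
  children,
});

// True when the item carries a visibility_modifier starting with "pub".
// Note: this also accepts restricted forms such as pub(crate).
function isPub(node: RustNodeSketch): boolean {
  return node.children.some((c) => c.type === "visibility_modifier" && c.text.indexOf("pub") === 0);
}

function nameOf(node: RustNodeSketch): string | undefined {
  return node.children.find((c) => c.type === "identifier" || c.type === "type_identifier")?.text;
}

// Depth-first walk collecting names of public functions and structs.
function collectPubNames(node: RustNodeSketch, out: string[] = []): string[] {
  const isItem = node.type === "function_item" || node.type === "struct_item";
  if (isItem && isPub(node)) {
    const name = nameOf(node);
    if (name) out.push(name);
  }
  for (const child of node.children) collectPubNames(child, out);
  return out;
}

const rustRoot = rs("source_file", [
  rs("struct_item", [rs("visibility_modifier", [], "pub"), rs("type_identifier", [], "Config")]),
  rs("impl_item", [
    rs("type_identifier", [], "Config"),
    rs("declaration_list", [
      rs("function_item", [rs("visibility_modifier", [], "pub"), rs("identifier", [], "new")]),
      rs("function_item", [rs("identifier", [], "validate")]),
    ]),
  ]),
  rs("function_item", [rs("visibility_modifier", [], "pub"), rs("identifier", [], "check_port")]),
]);

const rustExports = collectPubNames(rustRoot);
```

Whether `pub(crate)` items should count as exports is left open here; a stricter extractor could require the modifier text to be exactly `pub`.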
- [ ] **Step 2: Write tests**
Test with:
```rust
use std::collections::HashMap;
use std::io::{self, Read};
pub struct Config {
name: String,
port: u16,
}
impl Config {
pub fn new(name: String, port: u16) -> Self {
Config { name, port }
}
fn validate(&self) -> bool {
check_port(self.port)
}
}
pub fn check_port(port: u16) -> bool {
port > 0
}
```
Verify: 3 functions (new, validate, check_port), 1 class/struct (Config with methods new/validate, properties name/port), 2 imports, exports (Config, new, check_port — those with `pub`), call graph (validate→check_port).
- [ ] **Step 3: Run tests and commit**
---
### Task 7: Create Java extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/java-extractor.ts`
- [ ] **Step 1: Write the Java extractor**
Key Java tree-sitter node types:
- Methods: `method_declaration` (name, formal_parameters, type/dimensions)
- Constructors: `constructor_declaration` (name, formal_parameters)
- Classes: `class_declaration` (name, class_body)
- Interfaces: `interface_declaration`
- Fields: `field_declaration` (declarator → variable_declarator → identifier)
- Imports: `import_declaration` (scoped_identifier)
- Exports: `public` modifier (modifiers node)
- Calls: `method_invocation` (name, object, arguments)
```typescript
languageIds: ["java"]
```
- [ ] **Step 2: Write tests with representative Java code, run, commit**
---
### Task 8: Create Ruby extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/ruby-extractor.ts`
- [ ] **Step 1: Write the Ruby extractor**
Key Ruby tree-sitter node types:
- Methods: `method` (name, parameters)
- Classes: `class` (name, body containing methods)
- Modules: `module` (name)
- Imports: `call` where method is `require` or `require_relative` (Ruby uses method calls for imports)
- Calls: `call` (method, receiver, arguments)
- No formal export syntax
```typescript
languageIds: ["ruby"]
```
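Because imports and ordinary calls share the `call` node type, the extractor has to separate them. A minimal sketch of that split follows, using a hypothetical flattened `RubyCallSketch` record; a real extractor would read these values from the node's method, receiver, and arguments.

```typescript
// Hypothetical flattened view of a tree-sitter `call` node.
interface RubyCallSketch {
  method: string;
  receiver?: string;
  firstStringArg?: string;
}

const IMPORT_METHODS = ["require", "require_relative"];

// Separates import-style calls from ordinary call graph entries.
function splitRubyCalls(calls: RubyCallSketch[]): { imports: string[]; callees: string[] } {
  const imports: string[] = [];
  const callees: string[] = [];
  for (const call of calls) {
    if (
      call.receiver === undefined &&
      IMPORT_METHODS.indexOf(call.method) !== -1 &&
      call.firstStringArg !== undefined
    ) {
      // Bare require/require_relative with a string argument is an import.
      imports.push(call.firstStringArg);
    } else {
      callees.push(call.receiver ? `${call.receiver}.${call.method}` : call.method);
    }
  }
  return { imports, callees };
}

const rubySample = splitRubyCalls([
  { method: "require", firstStringArg: "json" },
  { method: "require_relative", firstStringArg: "lib/helper" },
  { method: "puts", firstStringArg: "hi" },
  { method: "require", receiver: "loader", firstStringArg: "x" },
]);
```

A `require` called on an explicit receiver is treated as a regular call in this sketch, since it is a user-defined method rather than the top-level import.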
- [ ] **Step 2: Write tests, run, commit**
---
### Task 9: Create PHP extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/php-extractor.ts`
- [ ] **Step 1: Write the PHP extractor**
Key PHP tree-sitter node types:
- Functions: `function_definition` (name, formal_parameters, return_type)
- Methods: `method_declaration` (name, formal_parameters, return_type)
- Classes: `class_declaration` (name, declaration_list)
- Imports: `namespace_use_declaration` (namespace_use_clause)
- Calls: `function_call_expression` / `member_call_expression`
- Note: the PHP tree wraps everything in a `program` → `php_tag` + statements
```typescript
languageIds: ["php"]
```
- [ ] **Step 2: Write tests, run, commit**
---
### Task 10: Create C/C++ extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/cpp-extractor.ts`
- [ ] **Step 1: Write the C/C++ extractor**
Key C/C++ tree-sitter node types:
- Functions: `function_definition` (declarator → function_declarator → identifier + parameter_list)
- Classes: `class_specifier` (name, body → field_declaration_list)
- Structs: `struct_specifier` (name, body)
- Includes: `preproc_include` (path → string_literal or system_lib_string)
- Namespaces: `namespace_definition`
- Calls: `call_expression` (function, arguments)
Note: C/C++ function signatures are nested (the name is inside a `function_declarator` inside the `declarator` field).
The `cppConfig` has `id: "cpp"` and `extensions: [".cpp", ".cc", ".cxx", ".c", ".h", ".hpp", ".hxx"]`. Pure C files (`.c`, `.h`) are parsed with the C++ grammar, which works but won't produce C++-specific node types like `class_specifier`. The extractor must handle their absence gracefully (return empty arrays for classes when parsing pure C).
```typescript
languageIds: ["cpp"]
```
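The nested-declarator note can be illustrated with a short walk down the `declarator` chain. This is a sketch over hand-built mock nodes: `CppNodeSketch` is a hypothetical shape that models only the `declarator` field, and the pointer-returning case assumes the grammar wraps the `function_declarator` in a `pointer_declarator`.

```typescript
// Hypothetical minimal node shape: only the `declarator` field is modeled.
interface CppNodeSketch {
  type: string;
  text?: string;
  declarator?: CppNodeSketch;
}

// Follows the declarator chain until it reaches a name-bearing node.
function cppFunctionName(def: CppNodeSketch): string | null {
  let current = def.declarator;
  while (current) {
    if (current.type === "identifier" || current.type === "qualified_identifier") {
      return current.text ?? null;
    }
    current = current.declarator;
  }
  return null;
}

// int add(int a, int b) { ... }
const addDef: CppNodeSketch = {
  type: "function_definition",
  declarator: { type: "function_declarator", declarator: { type: "identifier", text: "add" } },
};

// char *name_of(int id) { ... }  (assumed to nest one level deeper)
const pointerDef: CppNodeSketch = {
  type: "function_definition",
  declarator: {
    type: "pointer_declarator",
    declarator: { type: "function_declarator", declarator: { type: "identifier", text: "name_of" } },
  },
};

// A malformed definition with no declarator at all.
const namelessDef: CppNodeSketch = { type: "function_definition" };
```

Returning `null` instead of throwing keeps the extractor tolerant of partial or unusual parse trees.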
- [ ] **Step 2: Write tests for both C++ and pure C code, run, commit**
---
### Task 11: Create C# extractor
**Files:**
- Create: `packages/core/src/plugins/extractors/csharp-extractor.ts`
- [ ] **Step 1: Write the C# extractor**
Key C# tree-sitter node types:
- Methods: `method_declaration` (name, parameter_list, return type)
- Constructors: `constructor_declaration`
- Classes: `class_declaration` (name, declaration_list)
- Interfaces: `interface_declaration`
- Properties: `property_declaration` (name, type)
- Imports: `using_directive` (qualified_name)
- Calls: `invocation_expression` (identifier/member_access, argument_list)
```typescript
languageIds: ["csharp"]
```
- [ ] **Step 2: Write tests, run, commit**
---
### Task 12: Create extractor index and wire into TreeSitterPlugin
**Files:**
- Create: `packages/core/src/plugins/extractors/index.ts`
- Modify: `packages/core/src/plugins/tree-sitter-plugin.ts` (import builtinExtractors)
- [ ] **Step 1: Create index.ts exporting all extractors**
```typescript
// packages/core/src/plugins/extractors/index.ts
export type { LanguageExtractor, TreeSitterNode } from "./types.js";
export { traverse, getStringValue, findChild, findChildren, hasChildOfType } from "./base-extractor.js";
export { TypeScriptExtractor } from "./typescript-extractor.js";
export { PythonExtractor } from "./python-extractor.js";
export { GoExtractor } from "./go-extractor.js";
export { RustExtractor } from "./rust-extractor.js";
export { JavaExtractor } from "./java-extractor.js";
export { RubyExtractor } from "./ruby-extractor.js";
export { PhpExtractor } from "./php-extractor.js";
export { CppExtractor } from "./cpp-extractor.js";
export { CSharpExtractor } from "./csharp-extractor.js";
import type { LanguageExtractor } from "./types.js";
import { TypeScriptExtractor } from "./typescript-extractor.js";
import { PythonExtractor } from "./python-extractor.js";
import { GoExtractor } from "./go-extractor.js";
import { RustExtractor } from "./rust-extractor.js";
import { JavaExtractor } from "./java-extractor.js";
import { RubyExtractor } from "./ruby-extractor.js";
import { PhpExtractor } from "./php-extractor.js";
import { CppExtractor } from "./cpp-extractor.js";
import { CSharpExtractor } from "./csharp-extractor.js";
export const builtinExtractors: LanguageExtractor[] = [
new TypeScriptExtractor(),
new PythonExtractor(),
new GoExtractor(),
new RustExtractor(),
new JavaExtractor(),
new RubyExtractor(),
new PhpExtractor(),
new CppExtractor(),
new CSharpExtractor(),
];
```
- [ ] **Step 2: Wire builtinExtractors into TreeSitterPlugin constructor**
When no extractors are provided, default to `builtinExtractors`.
- [ ] **Step 3: Run full test suite**
Run: `pnpm --filter @understand-anything/core test`
Expected: All tests pass (existing + new extractor tests)
- [ ] **Step 4: Commit**
---
### Task 13: Create bundled extract-structure.mjs script
**Files:**
- Create: `skills/understand/extract-structure.mjs`
**Context:** Currently the file-analyzer agent (Phase 1) instructs the LLM to write a throwaway regex-based Node.js/Python script every run. This is slow, non-deterministic, and ignores the tree-sitter infrastructure we just built. This task replaces that with a pre-built script that uses `PluginRegistry` (which routes to `TreeSitterPlugin` for code files and to the regex parsers for non-code files).
- [ ] **Step 1: Create extract-structure.mjs**
The script:
1. Accepts input JSON path (arg 1) and output JSON path (arg 2)
2. Input format matches what file-analyzer.md already specifies: `{ projectRoot, batchFiles: [{path, language, sizeLines, fileCategory}], batchImportData }`
3. Resolves `@understand-anything/core` from the plugin's own `node_modules` using `createRequire` relative to the script's own location (two directories up to plugin root)
4. Creates a `PluginRegistry` with `TreeSitterPlugin` (all builtin language configs) + all non-code parsers registered
5. For each file: reads content, calls `registry.analyzeFile()`, formats output to match the existing script output schema (functions, classes, exports, sections, definitions, services, etc.)
6. For code files with tree-sitter support: also extracts call graph via `registry.extractCallGraph()` (added in Task 2.5)
7. For files where no plugin exists (Swift, Kotlin, unknown languages): outputs `{ path, language, fileCategory, totalLines, nonEmptyLines, metrics }` with empty structural data — the LLM agent handles these in Phase 2
8. Writes output JSON matching the existing `scriptCompleted/filesAnalyzed/filesSkipped/results` schema
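Step 7 above (files with no plugin) can be sketched as a pure function. This is an illustration only: `BatchFileSketch`, `FallbackRecordSketch`, and `fallbackRecord` are hypothetical names, the line-counting convention (one trailing newline does not add a line) is an assumption, and the `metrics` field is omitted because its shape is not specified here.

```typescript
// Hypothetical input/output shapes for illustration.
interface BatchFileSketch {
  path: string;
  language: string;
  fileCategory: string;
}

interface FallbackRecordSketch extends BatchFileSketch {
  totalLines: number;
  nonEmptyLines: number;
}

// Builds the structural-data-free record for a file no plugin can analyze.
function fallbackRecord(file: BatchFileSketch, content: string): FallbackRecordSketch {
  const lines = content.split(/\r?\n/);
  // Assumed convention: a single trailing newline does not count as an extra line.
  if (lines.length > 0 && lines[lines.length - 1] === "") lines.pop();
  const nonEmptyLines = lines.filter((line) => line.trim() !== "").length;
  return { ...file, totalLines: lines.length, nonEmptyLines };
}

const swiftFile: BatchFileSketch = { path: "App.swift", language: "swift", fileCategory: "code" };
const swiftRecord = fallbackRecord(swiftFile, "import Foundation\n\nlet x = 1\n");
```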
Key resolution logic (with fallback for different install layouts):
```javascript
import { createRequire } from 'node:module';
import { dirname, resolve } from 'node:path';
import { fileURLToPath } from 'node:url';
const __dirname = dirname(fileURLToPath(import.meta.url));
const pluginRoot = resolve(__dirname, '../..');
const require = createRequire(resolve(pluginRoot, 'package.json'));
let core;
try {
core = await import(require.resolve('@understand-anything/core'));
} catch {
// Fallback: direct path for installed plugin cache where pnpm symlinks may differ
core = await import(resolve(pluginRoot, 'packages/core/dist/index.js'));
}
```
- [ ] **Step 2: Test the script locally**
Create a small test input JSON with a TS file, a Python file, and a YAML file. Run:
```bash
node skills/understand/extract-structure.mjs test-input.json test-output.json
```
Verify the output contains structural data for all three.
- [ ] **Step 3: Commit**
```bash
git add skills/understand/extract-structure.mjs
git commit -m "feat: add bundled tree-sitter extraction script for file-analyzer agent"
```
---
### Task 14: Rewrite file-analyzer.md Phase 1 to use bundled script
**Files:**
- Modify: `agents/file-analyzer.md`
**Context:** Phase 1 currently has ~150 lines instructing the agent to write a custom extraction script from scratch. Replace this with a short section that tells the agent to execute the pre-built `extract-structure.mjs` script.
- [ ] **Step 1: Replace Phase 1 in file-analyzer.md**
Delete the entire current Phase 1 (~150 lines of regex script generation instructions). Replace with:
1. Tell the agent to prepare the input JSON file (same format as before):
```bash
cat > "$PROJECT_ROOT/.understand-anything/tmp/ua-file-analyzer-input-<batchIndex>.json" << 'ENDJSON'
{
"projectRoot": "<project-root>",
"batchFiles": [<this batch's files including fileCategory>],
"batchImportData": <batchImportData JSON>
}
ENDJSON
```
2. Execute the bundled script:
```bash
node "<SKILL_DIR>/extract-structure.mjs" \
"$PROJECT_ROOT/.understand-anything/tmp/ua-file-analyzer-input-<batchIndex>.json" \
"$PROJECT_ROOT/.understand-anything/tmp/ua-file-extract-results-<batchIndex>.json"
```
3. If the script exits non-zero, read stderr, diagnose and report the error. Do NOT fall back to writing a manual script — the bundled script is the sole extraction path.
4. Keep the existing output format — Phase 2 (semantic analysis) is unchanged.
- [ ] **Step 2: Update SKILL.md to pass SKILL_DIR to file-analyzer dispatch**
In SKILL.md Phase 2, the file-analyzer dispatch prompt must include the skill directory path so the agent can locate `extract-structure.mjs`.
Add to the dispatch parameters:
```
> Skill directory (for bundled scripts): `<SKILL_DIR>`
```
This follows the established pattern — SKILL.md already passes `<SKILL_DIR>` for `merge-batch-graphs.py` (line 213) and `merge-subdomain-graphs.py` (line 44) using the same mechanism.
- [ ] **Step 3: Verify the file-analyzer output format is unchanged**
Phase 2 of file-analyzer.md should NOT need changes — it reads the same JSON structure from the script results. Verify the output schema from `extract-structure.mjs` matches what Phase 2 expects.
- [ ] **Step 4: Commit**
```bash
git add agents/file-analyzer.md skills/understand/SKILL.md
git commit -m "feat: file-analyzer uses bundled tree-sitter script instead of LLM-generated regex"
```
---
### Task 15: Final integration verification and cleanup
- [ ] **Step 1: Add exports to packages/core/src/index.ts**
This is required — `extract-structure.mjs` and external consumers need these exports:
```typescript
export type { LanguageExtractor } from "./plugins/extractors/types.js";
export { builtinExtractors } from "./plugins/extractors/index.js";
```
- [ ] **Step 2: Build the full package**
```bash
pnpm --filter @understand-anything/core build
```
- [ ] **Step 3: Run full test suite one final time**
```bash
pnpm --filter @understand-anything/core test
```
- [ ] **Step 4: Final commit**
```bash
git add packages/core/src/index.ts
git commit -m "feat: complete language extractor architecture — 10 languages with tree-sitter support"
```
---
## Implementation Notes
**Test file convention:** Each language extractor gets its own test file at `packages/core/src/plugins/extractors/__tests__/<language>-extractor.test.ts`. This follows the existing pattern where `tree-sitter-plugin.test.ts` is co-located.
**Lazy grammar loading (future optimization):** The current `TreeSitterPlugin.init()` loads all grammar WASMs upfront via `Promise.all`. With 10 grammars (~12MB total WASM), this may cause noticeable init delay. A future improvement: load TS/JS eagerly (most common), defer others to first use. Not required for this PR — measure first.
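A minimal sketch of the deferred-loading idea, assuming grammars are loaded through an injected async function. `LazyGrammarCache` and the loader signature are hypothetical; this is not the current `TreeSitterPlugin.init()` behavior.

```typescript
// Memoizes one in-flight or completed load per language key.
class LazyGrammarCache<G> {
  private pending = new Map<string, Promise<G>>();
  private load: (langKey: string) => Promise<G>;

  constructor(load: (langKey: string) => Promise<G>) {
    this.load = load;
  }

  // Starts the load on first use; later calls reuse the same promise.
  get(langKey: string): Promise<G> {
    let grammar = this.pending.get(langKey);
    if (!grammar) {
      grammar = this.load(langKey);
      this.pending.set(langKey, grammar);
    }
    return grammar;
  }
}

// Fake loader that counts how often a grammar is actually fetched.
let grammarLoads = 0;
const grammarCache = new LazyGrammarCache<string>((langKey) => {
  grammarLoads += 1;
  return Promise.resolve("grammar:" + langKey);
});

const firstPython = grammarCache.get("python");
const secondPython = grammarCache.get("python");
```

Caching the promise rather than the resolved value means concurrent callers share one load. A rejected load stays cached in this sketch, so a real implementation would likely evict failed entries.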
**Fingerprint side effect:** `buildFingerprintStore` in `fingerprint.ts` uses `PluginRegistry.analyzeFile` internally. Once the new extractors are wired up, fingerprinting for Python/Go/Rust/etc. will automatically produce structural fingerprints instead of content-hash-only ones. No code changes are needed.
**PHP grammar note:** `tree-sitter-php` ships both `tree-sitter-php.wasm` (full PHP + embedded HTML/CSS/JS) and `tree-sitter-php_only.wasm` (PHP only). We use `tree-sitter-php.wasm`. The PHP extractor should be robust to non-PHP AST nodes that appear when parsing files with embedded HTML templates.