Fulfilled-Knowledge/Understand-Anything-main/understand-anything-plugin/hooks/auto-update-prompt.md

# Auto-Update Knowledge Graph (Internal — Hook-Triggered)

Incrementally update the knowledge graph using deterministic structural fingerprinting to minimize token usage. This prompt is triggered automatically by the post-commit hook when `autoUpdate` is enabled. It is NOT a user-facing skill.

**Key principle:** Spend zero LLM tokens when changes are cosmetic (formatting, internal logic). Only invoke LLM agents when structural changes (new/removed functions, classes, imports, exports) are detected.

---

## Phase 0 — Pre-flight (Zero Token Cost)

1. Set `PROJECT_ROOT` to the current working directory.

2. Check that `$PROJECT_ROOT/.understand-anything/knowledge-graph.json` exists.
   - If not: report "No existing knowledge graph found. Run `/understand` first to create one." and **STOP**.

3. Check that `$PROJECT_ROOT/.understand-anything/meta.json` exists and read `gitCommitHash`.
   - If not: report "No analysis metadata found. Run `/understand` to create a baseline." and **STOP**.

4. Get current commit hash:
   ```bash
   git rev-parse HEAD
   ```

5. If commit hashes match and `--force` is NOT in `$ARGUMENTS`: report "Knowledge graph is already up to date." and **STOP**.

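6. Get the files changed since the last analyzed commit (`<lastCommitHash>` is the `gitCommitHash` read from `meta.json` in step 3):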
   ```bash
   git diff <lastCommitHash>..HEAD --name-only
   ```
   If no files changed: update `meta.json` with the new commit hash and **STOP**.

7. Filter to source files only (`.ts`, `.tsx`, `.js`, `.jsx`, `.py`, `.go`, `.rs`, `.java`, `.rb`, `.cpp`, `.c`, `.h`, `.cs`, `.swift`, `.kt`, `.php`).
   If no source files changed: update `meta.json` with the new commit hash, report "Only non-source files changed. Metadata updated." and **STOP**.

8. Create intermediate directory:
   ```bash
   mkdir -p "$PROJECT_ROOT/.understand-anything/intermediate"
   ```

9. **Apply `.understandignore` exclusions** (same semantics as `/understand` Step 2.5 in `agents/project-scanner.md`).

   Without this step, files in user-excluded paths (migrations, vendored code, tests) are counted as structural changes and can spuriously escalate the action to `FULL_UPDATE` even when the real change set is tiny.

   1. If neither `$PROJECT_ROOT/.understand-anything/.understandignore` nor `$PROJECT_ROOT/.understandignore` exists, the step 7 extension filter is sufficient — skip to Phase 1.

   2. Write the step 7 file list to `$PROJECT_ROOT/.understand-anything/intermediate/changed-files-pre.json` as a JSON array of relative paths.

   3. Resolve `$PLUGIN_ROOT`:
      - Use `$CLAUDE_PLUGIN_ROOT` if set (Claude Code's hook context sets this).
      - Otherwise try `$HOME/.understand-anything-plugin`.
      - Validate the chosen candidate by checking `$candidate/packages/core/dist/ignore-filter.js` exists.
      - If neither resolves: report "Cannot locate plugin install at `$CLAUDE_PLUGIN_ROOT` or `$HOME/.understand-anything-plugin`; auto-update aborted. Run `/understand` to re-baseline." and **STOP**. Do **not** silently skip — silent skip reproduces issue #153.

   4. Write `$PROJECT_ROOT/.understand-anything/intermediate/ignore-filter.mjs`:
      ```javascript
      import { readFileSync, writeFileSync } from 'node:fs';
      import { pathToFileURL } from 'node:url';
      import path from 'node:path';

      const PROJECT_ROOT = process.cwd();
      const PLUGIN_ROOT = process.argv[2];
      const inputPath = process.argv[3];

      const modUrl = pathToFileURL(
        path.join(PLUGIN_ROOT, 'packages/core/dist/ignore-filter.js'),
      ).href;
      const { createIgnoreFilter } = await import(modUrl);
      const filter = createIgnoreFilter(PROJECT_ROOT);

      const input = JSON.parse(readFileSync(inputPath, 'utf-8'));
      const kept = input.filter((p) => !filter.isIgnored(p));
      const removed = input.length - kept.length;

      writeFileSync(
        path.join(PROJECT_ROOT, '.understand-anything/intermediate/changed-files.json'),
        JSON.stringify({ kept, removed, total: input.length }, null, 2),
      );
      console.log(`.understandignore: kept ${kept.length}/${input.length} (removed ${removed})`);
      ```

   5. Run it:
      ```bash
      node "$PROJECT_ROOT/.understand-anything/intermediate/ignore-filter.mjs" \
        "$PLUGIN_ROOT" \
        "$PROJECT_ROOT/.understand-anything/intermediate/changed-files-pre.json"
      ```

   6. Read `$PROJECT_ROOT/.understand-anything/intermediate/changed-files.json`. Pass the `kept` array as the input file list for Phase 1's fingerprint-check script.

   7. If `kept.length === 0`: update `meta.json` with the new commit hash, report "All changed source files are in ignored paths. Metadata updated." and **STOP**.
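
The early exits in steps 5 to 7 are pure decisions, so they can be made without any model involvement. The following is a minimal sketch, assuming the output of `git diff --name-only` has already been split into an array of relative paths. `SOURCE_EXTENSIONS`, `filterSourceFiles`, and `preflightDecision` are hypothetical names used for illustration; they are not part of the plugin.

```javascript
// Extensions listed in step 7. All names here are illustrative, not plugin API.
const SOURCE_EXTENSIONS = new Set([
  '.ts', '.tsx', '.js', '.jsx', '.py', '.go', '.rs', '.java',
  '.rb', '.cpp', '.c', '.h', '.cs', '.swift', '.kt', '.php',
]);

// Keep only paths whose final extension is in the step 7 list.
function filterSourceFiles(changedPaths) {
  return changedPaths.filter((p) => {
    const base = p.slice(p.lastIndexOf('/') + 1);
    const dot = base.lastIndexOf('.');
    return dot > 0 && SOURCE_EXTENSIONS.has(base.slice(dot));
  });
}

// Mirror steps 5-7: decide whether to stop, and whether meta.json needs the new hash.
function preflightDecision({ storedHash, headHash, changedPaths, force = false }) {
  if (storedHash === headHash && !force) {
    return { stop: true, updateMeta: false, message: 'Knowledge graph is already up to date.' };
  }
  if (changedPaths.length === 0) {
    return { stop: true, updateMeta: true, message: null };
  }
  const sourceFiles = filterSourceFiles(changedPaths);
  if (sourceFiles.length === 0) {
    return { stop: true, updateMeta: true, message: 'Only non-source files changed. Metadata updated.' };
  }
  return { stop: false, updateMeta: false, message: null, sourceFiles };
}
```

When `stop` is `false`, `sourceFiles` is the list that step 9 filters further through `.understandignore`.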

---

## Phase 1 — Structural Fingerprint Check (Zero LLM Tokens)

This phase runs a deterministic Node.js script that compares each changed file's structure against its stored fingerprint. It costs **zero LLM tokens**; the only cost is running the script.

1. Write and execute a Node.js script (`$PROJECT_ROOT/.understand-anything/intermediate/fingerprint-check.mjs`):

```javascript
// The script should:
// 1. Read fingerprints.json from .understand-anything/fingerprints.json
// 2. For each changed source file:
//    a. Read the file content
//    b. Compute SHA-256 content hash
//    c. If content hash matches stored hash → NONE (skip)
//    d. Extract structural elements via regex:
//       - Functions: match patterns like `function NAME(`, `const NAME = (`, `export function NAME(`
//       - Classes: match `class NAME`, `export class NAME`
//       - Imports: match `import ... from '...'`, `import '...'`
//       - Exports: match `export { ... }`, `export default`, `export function`, `export class`, `export const`
//    e. Compare extracted elements against stored fingerprint
//    f. Classify as NONE (hash unchanged), COSMETIC (content changed, same elements),
//       or STRUCTURAL (elements added, removed, or renamed)
// 3. For new files (not in fingerprints.json): classify as STRUCTURAL
// 4. For deleted files (in fingerprints.json but not on disk): classify as STRUCTURAL
// 5. Determine overall decision (evaluate top to bottom; the first match wins):
//    - All NONE/COSMETIC → action: "SKIP"
//    - >30 structural files or >50% of graph → action: "FULL_UPDATE"
//    - New/deleted directories or >10 structural files → action: "ARCHITECTURE_UPDATE"
//    - Otherwise (≤10 structural files, same directories) → action: "PARTIAL_UPDATE"
// 6. Write result to .understand-anything/intermediate/change-analysis.json
```

The output JSON should have this shape:
```json
{
  "action": "SKIP | PARTIAL_UPDATE | ARCHITECTURE_UPDATE | FULL_UPDATE",
  "filesToReanalyze": ["src/new-feature.ts"],
  "rerunArchitecture": false,
  "rerunTour": false,
  "reason": "1 file has structural changes (new function added)",
  "fileChanges": [
    { "filePath": "src/utils.ts", "changeLevel": "COSMETIC", "details": ["internal logic changed"] },
    { "filePath": "src/new-feature.ts", "changeLevel": "STRUCTURAL", "details": ["new function: handleRequest"] }
  ]
}
```

2. Read `.understand-anything/intermediate/change-analysis.json`.

3. **Decision gate:**

   | Action | What to do |
   |---|---|
   | `SKIP` | Update `meta.json` with new commit hash. Report: "No structural changes detected. Graph metadata updated. Zero tokens spent." **STOP.** |
   | `FULL_UPDATE` | Report: "Major structural changes detected (`<reason>`). Recommend running `/understand --full` for a complete rebuild.", filling `<reason>` from the `reason` field. **STOP.** |
   | `PARTIAL_UPDATE` | Proceed to Phase 2 with `filesToReanalyze` |
   | `ARCHITECTURE_UPDATE` | Proceed to Phase 2 with `filesToReanalyze`, flag architecture re-run |

---

## Phase 2 — Targeted Re-Analysis (Minimal Token Cost)

Only re-analyze files with structural changes. This is the **only** phase that costs LLM tokens.

1. Read the existing knowledge graph from `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`.

2. Batch the files from `filesToReanalyze` (from Phase 1). Use a single batch if ≤10 files, otherwise batch into groups of 5-10.

3. For each batch, dispatch a subagent using the `file-analyzer` agent definition (at `agents/file-analyzer.md`). Append:

   > **Additional context from main session:**
   >
   > Project: `<projectName from existing graph>` — `<projectDescription>`
   > Frameworks detected: `<frameworks from existing graph>`
   > Languages: `<languages from existing graph>`
   >
   > **IMPORTANT:** This is an incremental update. Only the files listed below have structural changes. Analyze them thoroughly but do not invent nodes for files not in this batch.

   Fill in batch-specific parameters:

   > Analyze these source files and produce GraphNode and GraphEdge objects.
   > Project root: `$PROJECT_ROOT`
   > Project: `<projectName>`
   > Languages: `<languages>`
   > Batch index: `1`
   > Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/batch-1.json`
   >
   > All project files (for import resolution):
   > `<file list from existing graph nodes>`
   >
   > Files to analyze in this batch:
   > 1. `<path>` (`<sizeLines>` lines)
   > ...

4. After batch(es) complete, read each `batch-<N>.json` and merge results.

5. **Merge with existing graph:**
   - Remove old nodes whose `filePath` matches any file in `filesToReanalyze` or in the deleted files list
   - Remove old edges whose `source` or `target` references a removed node
   - Add new nodes and edges from the fresh analysis
   - Deduplicate nodes by ID (keep latest), edges by `source + target + type`
   - Remove any edge with dangling `source` or `target` references
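
The merge rules above can be sketched as a single pure function. This is an illustration under assumptions, not the plugin's implementation: `mergeGraph` is a hypothetical name, and the graph is assumed to expose top-level `nodes` (each with `id` and `filePath`) and `edges` (each with `source`, `target`, and `type`).

```javascript
// existing/fresh: { nodes: [{ id, filePath, ... }], edges: [{ source, target, type, ... }] }
function mergeGraph(existing, fresh, filesToReanalyze) {
  const stale = new Set(filesToReanalyze);

  // IDs of old nodes that belong to re-analyzed (or deleted) files.
  const removedIds = new Set(
    existing.nodes.filter((n) => stale.has(n.filePath)).map((n) => n.id),
  );

  // Deduplicate nodes by ID; fresh nodes are set last, so they win.
  const nodesById = new Map();
  for (const n of existing.nodes) {
    if (!removedIds.has(n.id)) nodesById.set(n.id, n);
  }
  for (const n of fresh.nodes) nodesById.set(n.id, n);

  // Drop old edges that touch a removed node, then add the fresh edges.
  const keptEdges = existing.edges.filter(
    (e) => !removedIds.has(e.source) && !removedIds.has(e.target),
  );
  const edgesByKey = new Map();
  for (const e of [...keptEdges, ...fresh.edges]) {
    // Skip dangling edges whose endpoints are not in the merged node set.
    if (!nodesById.has(e.source) || !nodesById.has(e.target)) continue;
    edgesByKey.set(JSON.stringify([e.source, e.target, e.type]), e);
  }

  return { ...existing, nodes: [...nodesById.values()], edges: [...edgesByKey.values()] };
}
```

Deleted files need no special branch here: their old nodes are removed because the path is in `filesToReanalyze`, and the fresh analysis produces no replacement for them.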

---

## Phase 3 — Conditional Architecture/Tour + Save

### 3a. Architecture update (only if `rerunArchitecture === true`)

If the change analysis flagged `ARCHITECTURE_UPDATE`:

1. Dispatch a subagent using the `architecture-analyzer` agent definition (at `agents/architecture-analyzer.md`), passing the full merged node set and import edges. Include previous layer definitions for naming consistency:

   > Previous layer definitions (for naming consistency):
   > ```json
   > [previous layers from existing graph]
   > ```
   > Maintain the same layer names and IDs where possible. Only add/remove layers if the file structure has materially changed.

2. After completion, read and normalize layers (same normalization as `/understand` Phase 4).

3. Optionally re-run tour builder if layers changed significantly.

### 3b. Lite layer update (if `rerunArchitecture === false`)

If only a partial update:
1. For **new files**: assign them to the most likely existing layer based on directory path matching
2. For **deleted files**: remove their IDs from layer `nodeIds` arrays
3. Remove any layer that ends up with zero nodeIds
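
One possible reading of "directory path matching" is to place each new file in the layer whose members share the most leading directory segments with it. The sketch below follows that reading. `liteLayerUpdate`, `sharedDepth`, and the `filePathById` lookup are hypothetical, and layers are assumed to be objects with a `nodeIds` array; a new file that shares no directory with any layer is left unassigned for the catch-all rule in 3c.

```javascript
// Directory segments of a POSIX-style relative path.
const dirSegments = (p) => p.split('/').slice(0, -1);

// Number of leading directory segments two paths have in common.
function sharedDepth(a, b) {
  const x = dirSegments(a);
  const y = dirSegments(b);
  let i = 0;
  while (i < x.length && i < y.length && x[i] === y[i]) i++;
  return i;
}

// layers: [{ id, name, nodeIds }], filePathById: Map of nodeId -> filePath,
// newNodes: [{ id, filePath }], deletedIds: node IDs of deleted files.
function liteLayerUpdate(layers, filePathById, newNodes, deletedIds) {
  const gone = new Set(deletedIds);
  const updated = layers.map((l) => ({
    ...l,
    nodeIds: l.nodeIds.filter((id) => !gone.has(id)),
  }));

  for (const node of newNodes) {
    let best = null;
    let bestDepth = 0; // require at least one shared directory segment
    for (const layer of updated) {
      for (const id of layer.nodeIds) {
        const other = filePathById.get(id);
        if (other === undefined) continue;
        const depth = sharedDepth(node.filePath, other);
        if (depth > bestDepth) {
          best = layer;
          bestDepth = depth;
        }
      }
    }
    if (best && !best.nodeIds.includes(node.id)) best.nodeIds.push(node.id);
  }

  // Drop layers that ended up empty.
  return updated.filter((l) => l.nodeIds.length > 0);
}
```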

### 3c. Lite validation

Perform lightweight validation (no graph-reviewer agent):
1. Remove any edge with dangling `source` or `target`
2. Remove any layer `nodeIds` entry that doesn't exist in the node set
3. Ensure every file node appears in exactly one layer (add to a catch-all layer if missing)
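
The three checks above fit in one pass over the graph. The following sketch assumes that file nodes are marked with `type === 'file'` and that the graph carries a top-level `layers` array; `liteValidate` and the `layer:uncategorized` ID are hypothetical. When a node is listed in more than one layer, this sketch keeps the first occurrence.

```javascript
// graph: { nodes: [{ id, type }], edges: [{ source, target }], layers: [{ id, name, nodeIds }] }
function liteValidate(graph, catchAllId = 'layer:uncategorized') {
  const ids = new Set(graph.nodes.map((n) => n.id));

  // 1. Remove edges with a dangling source or target.
  const edges = graph.edges.filter((e) => ids.has(e.source) && ids.has(e.target));

  // 2. Remove unknown node IDs from layers; keep each node in the first layer that lists it.
  const placed = new Set();
  const layers = graph.layers.map((layer) => ({
    ...layer,
    nodeIds: layer.nodeIds.filter((id) => {
      if (!ids.has(id) || placed.has(id)) return false;
      placed.add(id);
      return true;
    }),
  }));

  // 3. Put file nodes that are in no layer into a catch-all layer.
  const orphans = graph.nodes
    .filter((n) => n.type === 'file' && !placed.has(n.id))
    .map((n) => n.id);
  if (orphans.length > 0) {
    let catchAll = layers.find((l) => l.id === catchAllId);
    if (!catchAll) {
      catchAll = { id: catchAllId, name: 'Uncategorized', nodeIds: [] };
      layers.push(catchAll);
    }
    catchAll.nodeIds.push(...orphans);
  }

  return { ...graph, edges, layers };
}
```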

### 3d. Save

1. Write the final knowledge graph to `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`.

2. Write updated metadata to `$PROJECT_ROOT/.understand-anything/meta.json`:
   ```json
   {
     "lastAnalyzedAt": "<ISO 8601 timestamp>",
     "gitCommitHash": "<current commit hash>",
     "version": "1.0.0",
     "analyzedFiles": <total file count in graph>
   }
   ```

3. **Update fingerprints (LOAD-PATCH-SAVE, not OVERWRITE).**

   The most common failure mode here: writing only the freshly-computed batch entries to `fingerprints.json`, discarding every other file's fingerprint. The next auto-update then sees all those files as new (no stored fingerprint), classifies them as STRUCTURAL, and escalates to FULL_UPDATE permanently (issue #152). The script must LOAD ALL existing entries, PATCH only the re-analyzed ones, and SAVE the full dict back.

   Write and execute a Node.js script in this exact ordering:

   ```javascript
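   import { readFileSync, writeFileSync, existsSync } from 'node:fs';
   import { createHash } from 'node:crypto';
   import path from 'node:path';

   // Run from the project root, before step 4 removes the intermediate directory.
   const PROJECT_ROOT = process.cwd();
   // The same list Phase 2 re-analyzed, read back from the Phase 1 output.
   const { filesToReanalyze } = JSON.parse(readFileSync(
     path.join(PROJECT_ROOT, '.understand-anything', 'intermediate', 'change-analysis.json'),
     'utf-8',
   ));

   const fpPath = path.join(PROJECT_ROOT, '.understand-anything', 'fingerprints.json');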
   const existedAndNonEmpty = existsSync(fpPath) && readFileSync(fpPath, 'utf-8').trim().length > 0;

   // 1. LOAD ALL existing entries (NEVER skip — preserves un-analyzed files)
   const all = existedAndNonEmpty
     ? JSON.parse(readFileSync(fpPath, 'utf-8'))
     : {};
   const before = Object.keys(all).length;

   // 2. PATCH (file still exists) or REMOVE (file deleted) for each re-analyzed path.
   //    `filesToReanalyze` may include paths that were deleted in this commit —
   //    handle both branches inline rather than expecting a separate deleted list.
   for (const filePath of filesToReanalyze) {
     const fullPath = path.join(PROJECT_ROOT, filePath);
     if (!existsSync(fullPath)) {
       delete all[filePath];
       continue;
     }
     const content = readFileSync(fullPath, 'utf-8');
     const contentHash = createHash('sha256').update(content).digest('hex');
     // Extract `functions`, `classes`, `imports`, `exports` here with the same regexes
     // as the Phase 1 script; all four must be defined before the assignment below.
     all[filePath] = { contentHash, functions, classes, imports, exports };
   }

   // 3. GUARD against silent load failure: if fingerprints.json existed and was
   //    non-empty but `before` came out as 0, refuse to overwrite — something
   //    went wrong reading the file and writing now would clobber every entry.
   if (existedAndNonEmpty && before === 0) {
     throw new Error('fingerprints.json existed and was non-empty but loaded as {} — refusing to overwrite');
   }

   // 4. SAVE ALL entries back (full dict — not just the patched subset)
   writeFileSync(fpPath, JSON.stringify(all, null, 2));
   console.log(`Fingerprints: ${before} → ${Object.keys(all).length}`);
   ```

   The `existedAndNonEmpty && before === 0` guard catches the silent-load-failure case before it corrupts the store. If the count shrinks from N to a small number that matches the batch size, the LOAD step was skipped — abort the write rather than persist the wrong dict.

4. Clean up intermediate files:
   ```bash
   rm -rf "$PROJECT_ROOT/.understand-anything/intermediate"
   ```

5. Report a summary:
   - Files checked: N (total changed)
   - Structural changes found: N files
   - Cosmetic-only changes: N files (skipped)
   - Nodes updated: N
   - Action taken: PARTIAL_UPDATE / ARCHITECTURE_UPDATE
   - Path to output: `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`

---

## Error Handling

- If the fingerprint check script fails: fall back to treating all changed files as STRUCTURAL (conservative approach).
- If `fingerprints.json` doesn't exist: treat all changed files as STRUCTURAL and regenerate fingerprints after the update.
- If a subagent dispatch fails: retry once. If it fails again, save partial results and report the error.
- ALWAYS save partial results — a partially updated graph is better than no update.

---

## Notes

- This prompt reuses the same `file-analyzer` and `architecture-analyzer` agent definitions as `/understand`, so no separate agent prompts are needed.
- The fingerprint comparison in Phase 1 uses regex-based extraction (not tree-sitter) because it runs as a temporary Node.js script and doesn't need full AST accuracy — just signature-level detection.
- The authoritative fingerprints stored in `fingerprints.json` are generated by `/understand` Phase 7 using the core `fingerprint.ts` module (which uses tree-sitter for precise extraction).
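
To make the signature-level detection concrete, here is a minimal sketch of regex extraction and per-file classification. The patterns are illustrative approximations of the ones listed in Phase 1 for JavaScript/TypeScript and are not the plugin's actual expressions; `extractStructure` and `classifyChange` are hypothetical names. Content hashing is left to the caller (for example `createHash('sha256')` from `node:crypto`), and both fingerprints are assumed to come from the same extractor so their arrays are comparable.

```javascript
// Collect sorted, de-duplicated matches: capture group 1 or 2 if present,
// otherwise the whole match with whitespace collapsed.
function collect(source, re) {
  const found = [...source.matchAll(re)].map(
    (m) => m[1] ?? m[2] ?? m[0].replace(/\s+/g, ' ').trim(),
  );
  return [...new Set(found)].sort();
}

// Signature-level extraction; comments and strings are not excluded.
function extractStructure(source) {
  return {
    functions: collect(source, /\bfunction\s+(\w+)\s*\(|\bconst\s+(\w+)\s*=\s*\(/g),
    classes: collect(source, /\bclass\s+(\w+)/g),
    imports: collect(source, /\bimport\s+(?:[^'"]*?\sfrom\s+)?['"]([^'"]+)['"]/g),
    exports: collect(
      source,
      /\bexport\s+(?:default\b|\{[^}]*\}|(?:async\s+)?(?:function|class|const)\s+\w+)/g,
    ),
  };
}

// stored/current: { contentHash, functions, classes, imports, exports }
function classifyChange(stored, current) {
  if (!stored) return 'STRUCTURAL'; // new file, no stored fingerprint
  if (stored.contentHash === current.contentHash) return 'NONE';
  const sameShape = ['functions', 'classes', 'imports', 'exports'].every(
    (key) => JSON.stringify(stored[key]) === JSON.stringify(current[key]),
  );
  return sameShape ? 'COSMETIC' : 'STRUCTURAL';
}
```

Rewriting a function body changes the content hash but none of the four lists, so the file is classified as `COSMETIC`. Adding an exported helper changes `functions` and `exports`, which makes it `STRUCTURAL`.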