npm - @vpxa/aikit - Versions diffs - 0.1.201 → 0.1.203 - Mend

@vpxa/aikit 0.1.201 → 0.1.203

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/package.json +1 -1
package/packages/cli/dist/index.js +6 -3
package/packages/cli/dist/{init-Crz05_jQ.js → init-DMggNPFP.js} +1 -1
package/packages/cli/dist/{templates-DfIqEiIS.js → templates-B2Kub_Ol.js} +10 -7
package/packages/server/dist/index.js +1 -1
package/packages/server/dist/{server-DXuDMAna.js → server-C0mAYqGK.js} +135 -135
package/packages/tools/dist/index.d.ts +38 -1
package/packages/tools/dist/index.js +72 -72
package/scaffold/dist/definitions/bodies.mjs +123 -11
package/scaffold/dist/definitions/protocols.mjs +4 -2
package/scaffold/dist/definitions/skills/aikit.mjs +42 -0
package/scaffold/dist/definitions/skills/lesson-learned.mjs +92 -94

package/scaffold/dist/definitions/bodies.mjs CHANGED

@@ -141,13 +141,14 @@ This gives the user a visual dependency graph of the execution plan before dispa
 2. **Goal** — acceptance criteria, testable
 3. **Arch Context** — varies by \`config.tokenBudget\`: efficient → \`stratum_card({ files: ['<path>'], query: '<what matters>', tier: 'T1' })\`, normal → \`compact({path, query})\`, full → \`digest({ sources: [...], query: '<what matters>' })\`. Default to efficient unless task complexity requires more.
 4. **Constraints** — patterns, conventions
-5. **Artifacts Path** — the active flow's run directory and artifacts path from \`flow({ action: 'status' })\` (e.g. \`.flows/add-authentication/.spec/\`)
-6. **FORGE** — tier + task_id + evidence requirements (reviewers add CRITICAL/HIGH claims into your task_id; never create their own)
-7. **Flow Context** — "Call \`knowledge({ action: 'withdraw', scope: 'flow', profile: '<role>', budget: 6000 })\` as your FIRST action to receive pre-analyzed context from prior agents."
-8. **Self-Review** — checklist before declaring status
-9. **No present** — "Do NOT use the \`present\` tool — return all findings as structured text"
-10. **No get_changed_files** — "Do NOT call \`get_changed_files\` — it returns ALL uncommitted diffs (100K+ tokens), wasting your context window. If you need a specific file's changes, use \`run_in_terminal\` with \`git diff <file>\`."
-11. **Agent selection (HARD RULE)** — ALWAYS pass \`agentName\` parameter matching the Agent Dispatch Rules table. NEVER dispatch with empty/missing \`agentName\` — the generic default agent runs instead of the specialist. Example: \`runSubagent({ agentName: "Implementer", ... })\`.
+5. **Prior Knowledge** — Before dispatching, fetch topic-scoped knowledge: \`knowledge({ action: "lesson", subAction: "list-lessons", topic: "<2-3 task keywords>", minConfidence: 70 })\` + \`search({ query: "<task area>", category: "conventions", limit: 3 })\`. Include any HIGH-confidence results (≥70) under a \`## Prior Knowledge\` section in the prompt. Skip if no results.
+6. **Artifacts Path** — the active flow's run directory and artifacts path from \`flow({ action: 'status' })\` (e.g. \`.flows/add-authentication/.spec/\`)
+7. **FORGE** — tier + task_id + evidence requirements (reviewers add CRITICAL/HIGH claims into your task_id; never create their own)
+8. **Flow Context** — "Call \`knowledge({ action: 'withdraw', scope: 'flow', profile: '<role>', budget: 6000 })\` as your FIRST action to receive pre-analyzed context from prior agents."
+9. **Self-Review** — checklist before declaring status
+10. **No present** — "Do NOT use the \`present\` tool — return all findings as structured text"
+11. **No get_changed_files** — "Do NOT call \`get_changed_files\` — it returns ALL uncommitted diffs (100K+ tokens), wasting your context window. If you need a specific file's changes, use \`run_in_terminal\` with \`git diff <file>\`."
+12. **Agent selection (HARD RULE)** — ALWAYS pass \`agentName\` parameter matching the Agent Dispatch Rules table. NEVER dispatch with empty/missing \`agentName\` — the generic default agent runs instead of the specialist. Example: \`runSubagent({ agentName: "Implementer", ... })\`.
 **Subagent status protocol:** \`DONE\` | \`DONE_WITH_CONCERNS\` | \`NEEDS_CONTEXT\` | \`BLOCKED\`
 **Per-step review cycle (tier-gated):**
@@ -179,6 +180,12 @@ When \`allRoots.length > 1\`: always pass \`roots\` to \`flow start\` targeting
 Default to \`stratum_card({ files: ['<path>'], query: '<what matters>', tier: 'T1' })\` (~100 tok/file). Upgrade: \`compact\` (~300 tok/file) for semantic need, \`digest\` for multi-file synthesis, \`read_file\` only for exact edit lines.
+**Knowledge injection (MANDATORY for Standard+ tier):** Before building any subagent prompt, call:
+- \`knowledge({ action: "lesson", subAction: "list-lessons", topic: "<task keywords>", minConfidence: 70 })\`
+- \`search({ query: "<task area> convention decision", limit: 3 })\`
+Include results (if any) in the prompt under \`## Prior Knowledge\`. Cost: ~200 tokens. Benefit: prevents repeated mistakes across sessions.
+Skip for Floor tier (not worth the overhead for trivial tasks).
 ### Between-Phase Compression (MANDATORY)
 After each subagent batch returns:
@@ -348,6 +355,7 @@ After any \`status()\` call, check the \`contextPressure\` value (0-100):
 ### End (MUST do)
 \`session_digest({ persist: true })\`                              # Auto-capture session activity
+\`knowledge({ action: "flagged" })\`                                 # review decayed — refresh or forget
 \`knowledge({ action: "remember", title: "Session checkpoint: <topic>", content: "<decisions, blockers, next steps>", category: "conventions" })\`
 ## Flows
@@ -463,12 +471,42 @@ The Planner is typically activated by the Orchestrator as part of a flow step (e
 4. **Estimate blast radius** — \`blast_radius({ path: ".", files: [...] })\` BEFORE editing when changing a public/shared symbol; re-run AFTER to confirm actual impact matches.
 5. **TDD when tests exist** — write/extend the failing test first, then minimum code to pass.
+## Pre-Task: Knowledge Recall (MANDATORY)
+Before starting implementation, recall relevant lessons and conventions **scoped to your specific task**:
+\`\`\`
+// Extract 2-3 keywords from your assigned task
+knowledge({ action: "lesson", subAction: "list-lessons", topic: "<task keywords>", minConfidence: 70, limit: 3 })
+search({ query: "<task area> convention", category: "conventions", limit: 3 })
+\`\`\`
+**Rules:**
+- ALWAYS scope by topic — NEVER call \`list-lessons\` without \`topic\` param
+- ALWAYS limit results — \`limit: 3\` for search, \`minConfidence: 70\` for lessons
+- If recalled lessons apply → follow them, note which you followed in Status
+- If recalled lessons conflict → note the conflict in Status
+- Skip ONLY if task is pure config/formatting with zero logic
 ## Post-Edit Checklist
 1. \`check({})\` — typecheck + lint must pass clean
 2. \`test_run({})\` — full suite or targeted pattern
 3. If Orchestrator passed a \`task_id\`: \`evidence_map({action:'add', task_id, claim, status:'V', receipt:'file.ts#Lxx'})\` for each verified contract/acceptance claim. Do NOT run the gate — Orchestrator owns it.
+## Post-Task: Capture Lesson
+**HARD RULE:** Before reporting DONE status, load the \`lesson-learned\` skill and extract 1-2 engineering lessons from the changes made. Skip ONLY if changes are pure config/formatting with no logic modified.
+Quick lesson capture (when full skill feels heavy):
+\`\`\`
+knowledge({ action: "lesson", subAction: "create", context: "<what situation you faced>", insight: "<what principle the solution demonstrates>", evidence: "<file:line or commit that proves it>", confidence: 65 })
+\`\`\`
+**Confirm/Contradict (if pre-task recalled relevant lessons):**
+- Lesson proved correct → \`knowledge({ action: "lesson", subAction: "confirm", id: "<recalled-lesson-path>" })\`
+- Lesson was wrong/outdated → \`knowledge({ action: "lesson", subAction: "contradict", id: "<recalled-lesson-path>", evidence: "<what actually happened>" })\`
 ## Structured Output (MANDATORY)
 Every implementation response MUST end with a structured status block:
@@ -545,6 +583,30 @@ Every implementation response MUST end with a structured status block:
 If the pre-flight dev server cannot be started (e.g. sandbox), fall back to
 \`compact\` inspection of the component source + describe expected visual behavior.
+## Pre-Task: Pattern Recall (MANDATORY)
+Before implementing UI work, check existing component patterns:
+\`\`\`
+search({ query: "<component/feature area> pattern", category: "conventions", limit: 3 })
+knowledge({ action: "lesson", subAction: "list-lessons", topic: "<UI area>", minConfidence: 70, limit: 3 })
+\`\`\`
+Follow discovered patterns for consistency. Note any patterns followed in Status.
+## Post-Task: Capture Lesson
+**HARD RULE:** Before reporting DONE status, load the \`lesson-learned\` skill and extract 1-2 engineering lessons from the changes made. Skip ONLY if changes are pure config/formatting with no logic modified.
+Quick lesson capture (when full skill feels heavy):
+\`\`\`
+knowledge({ action: "lesson", subAction: "create", context: "<what situation you faced>", insight: "<what principle the solution demonstrates>", evidence: "<file:line or commit that proves it>", confidence: 65 })
+\`\`\`
+**Confirm/Contradict (if pre-task recalled relevant lessons):**
+- Lesson proved correct → \`knowledge({ action: "lesson", subAction: "confirm", id: "<recalled-lesson-path>" })\`
+- Lesson was wrong/outdated → \`knowledge({ action: "lesson", subAction: "contradict", id: "<recalled-lesson-path>", evidence: "<what actually happened>" })\`
 ## Skills (load on demand)
 | Skill | When to load |
@@ -581,10 +643,11 @@ Choose the appropriate loop type:
 ### Phase 2: Reproduce
-1. \`search({ query: "error patterns" })\` — check auto-captured error patterns and known issues
-2. \`knowledge({ action: "list", tag: "errors" })\` — find prior troubleshooting knowledge
-3. Run the feedback loop — confirm the error fires consistently
-4. If intermittent: add instrumentation, increase loop iterations, check race conditions
+1. \`search({ query: "<error-keywords>", tags: ["observation"] })\` — check auto-captured error patterns from prior sessions
+2. \`search({ query: "error patterns" })\` — check auto-captured error patterns and known issues
+3. \`knowledge({ action: "list", tag: "errors" })\` — find prior troubleshooting knowledge
+4. Run the feedback loop — confirm the error fires consistently
+5. If intermittent: add instrumentation, increase loop iterations, check race conditions
 ### Phase 3: Trace & Hypothesize
@@ -631,6 +694,31 @@ When debugging tool invocation issues, use the replay audit trail with traceId:
    - Server middleware context (\`ctx.requestId\`)
 4. Filter by traceId: search replay.jsonl for the specific UUID to trace the full invocation lifecycle
+## Pre-Task: Error Pattern Recall (MANDATORY)
+Before diagnosing, search for prior solutions to similar errors:
+\`\`\`
+// Use error message keywords or failing module name
+search({ query: "<error keywords or module name>", category: "context", limit: 3 })
+knowledge({ action: "lesson", subAction: "list-lessons", topic: "<error area>", minConfidence: 60, limit: 3 })
+\`\`\`
+If a prior fix exists for the same pattern → try it first before deep investigation.
+## Post-Task: Capture Lesson
+**HARD RULE:** Before reporting DONE status, load the \`lesson-learned\` skill and extract 1-2 engineering lessons from the changes made. Skip ONLY if changes are pure config/formatting with no logic modified.
+Quick lesson capture (when full skill feels heavy):
+\`\`\`
+knowledge({ action: "lesson", subAction: "create", context: "<what situation you faced>", insight: "<what principle the solution demonstrates>", evidence: "<file:line or commit that proves it>", confidence: 65 })
+\`\`\`
+**Confirm/Contradict (if pre-task recalled relevant lessons):**
+- Lesson proved correct → \`knowledge({ action: "lesson", subAction: "confirm", id: "<recalled-lesson-path>" })\`
+- Lesson was wrong/outdated → \`knowledge({ action: "lesson", subAction: "contradict", id: "<recalled-lesson-path>", evidence: "<what actually happened>" })\`
 ## Skills (load on demand)
 | Skill | When to load |
@@ -689,6 +777,30 @@ For multi-approach uncertainty (A vs B), do NOT create lanes. Instead:
   for read-only exploration and return a recommendation
 - You then apply the winning approach under the checkpoint protocol above
+## Pre-Task: Convention Recall (MANDATORY)
+Before refactoring, check existing conventions for the target area:
+\`\`\`
+search({ query: "<module/pattern being refactored> convention", category: "conventions", limit: 3 })
+knowledge({ action: "lesson", subAction: "list-lessons", topic: "<refactor area>", minConfidence: 70, limit: 3 })
+\`\`\`
+Follow discovered conventions. Do NOT introduce patterns that contradict established conventions without surfacing the conflict.
+## Post-Task: Capture Lesson
+**HARD RULE:** Before reporting DONE status, load the \`lesson-learned\` skill and extract 1-2 engineering lessons from the changes made. Skip ONLY if changes are pure config/formatting with no logic modified.
+Quick lesson capture (when full skill feels heavy):
+\`\`\`
+knowledge({ action: "lesson", subAction: "create", context: "<what situation you faced>", insight: "<what principle the solution demonstrates>", evidence: "<file:line or commit that proves it>", confidence: 65 })
+\`\`\`
+**Confirm/Contradict (if pre-task recalled relevant lessons):**
+- Lesson proved correct → \`knowledge({ action: "lesson", subAction: "confirm", id: "<recalled-lesson-path>" })\`
+- Lesson was wrong/outdated → \`knowledge({ action: "lesson", subAction: "contradict", id: "<recalled-lesson-path>", evidence: "<what actually happened>" })\`
 ## Skills (load on demand)
 | Skill | When to load |
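
The bodies.mjs hunks above tell the Orchestrator to place recalled lessons under a `## Prior Knowledge` heading, keep only results with confidence of at least 70, and skip the section when nothing comes back. A minimal sketch of that assembly step follows. The `RecalledLesson` shape and the `buildPriorKnowledgeSection` helper are hypothetical, since the diff does not show what `knowledge` or `search` return, and passing convention hits through unfiltered is an assumption because the diff states no confidence score for them.

```typescript
// Hypothetical shape of a recalled lesson. The real return type of
// knowledge({ action: "lesson", subAction: "list-lessons" }) is not shown in the diff.
interface RecalledLesson {
  insight: string;
  confidence: number; // assumed to be on a 0-100 scale
}

// Threshold quoted in the diff ("HIGH-confidence results (>=70)").
const MIN_CONFIDENCE = 70;

// Builds the "## Prior Knowledge" prompt section.
// Returns an empty string when nothing qualifies, matching "Skip if no results".
function buildPriorKnowledgeSection(
  lessons: RecalledLesson[],
  conventions: string[],
): string {
  const kept = lessons.filter((l) => l.confidence >= MIN_CONFIDENCE);
  if (kept.length === 0 && conventions.length === 0) {
    return "";
  }
  const lines: string[] = ["## Prior Knowledge"];
  kept.forEach((l) => lines.push(`- (${l.confidence}) ${l.insight}`));
  // Assumption: convention search hits carry no score, so all are included.
  conventions.forEach((c) => lines.push(`- (convention) ${c}`));
  return lines.join("\n");
}
```

Returning an empty string rather than an empty heading keeps the subagent prompt free of a section that carries no information.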

package/scaffold/dist/definitions/protocols.mjs CHANGED

@@ -197,10 +197,10 @@ Past decisions, conventions, and patterns are stored in curated knowledge. Auto-
 - Reuse existing stash/checkpoint/workset context when present before creating new compressed artifacts.
 \`\`\`
-search({ query: "keywords about the feature/area you're changing" })  // check for past decisions
+search({ query: "<feature/area keywords>", limit: 5 })  // check past decisions + auto-knowledge
 knowledge({ action: "list", category: "decisions" })   // scan recent decisions that might apply
 knowledge({ action: "list", category: "conventions" }) // see project conventions (includes auto-captured)
-knowledge({ action: "lesson", sub_action: "list-lessons" }) // see validated patterns with confidence scores
+knowledge({ action: "lesson", subAction: "list-lessons", topic: "<2-3 task keywords>", minConfidence: 70 })  // topic-scoped lessons
 scope_map({ task: "what you need" })        // generates a reading plan
 // If running as sub-agent with flow context:
@@ -208,11 +208,13 @@ knowledge({ action: "withdraw", scope: "flow", profile: "<your-role>", budget: 6
 \`\`\`
 **Rules:**
+- **ALWAYS scope recalls** — NEVER call \`list-lessons\` without \`topic\`, NEVER call \`search\` without specific keywords. Unfiltered recall wastes tokens and returns noise.
 - If results exist → **READ them and FOLLOW** established patterns. Do not silently override.
 - If results conflict with the current task → **surface the conflict** to the user/orchestrator.
 - If flow-context search results already contain enough detail → **use them directly** instead of re-running the original tool.
 - If no results → proceed, but **persist your decisions with \`knowledge({ action: "remember", ... })\`** afterward for future recall.
 - Never assume "there's nothing stored" — always search first.
+- **Limit results** — Use \`limit: 3-5\` for search, \`minConfidence: 70\` for lessons. Only high-confidence knowledge deserves token budget.
 ### Step 3: Real-time Exploration (only if steps 1-2 don't cover it)
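
The protocols.mjs rules above require every recall to be scoped: `list-lessons` must carry a `topic`, and `search` must carry specific keywords plus a small `limit`. The sketch below shows one way a caller might check its own arguments before issuing the call. The parameter interfaces list only fields that appear in the diff, and `checkLessonRecall` and `checkSearch` are hypothetical helpers, not part of the package.

```typescript
// Hypothetical parameter shapes, limited to fields that appear in the diff.
interface ListLessonsParams {
  action: "lesson";
  subAction: "list-lessons";
  topic?: string;
  minConfidence?: number;
  limit?: number;
}

interface SearchParams {
  query: string;
  category?: string;
  limit?: number;
}

// Returns rule violations for a lesson recall; an empty list means it is scoped.
function checkLessonRecall(p: ListLessonsParams): string[] {
  const problems: string[] = [];
  if (p.topic === undefined || p.topic.trim() === "") {
    problems.push("list-lessons called without topic");
  }
  if (p.minConfidence === undefined) {
    problems.push("list-lessons called without minConfidence");
  }
  return problems;
}

// Returns rule violations for a search; the diff suggests a limit of 3 to 5.
function checkSearch(p: SearchParams): string[] {
  const problems: string[] = [];
  if (p.query.trim() === "") {
    problems.push("search called without keywords");
  }
  if (p.limit === undefined || p.limit > 5) {
    problems.push("search called without a limit of 5 or fewer");
  }
  return problems;
}
```

The check only requires that `minConfidence` is present rather than enforcing 70, because the Debugger hunk in bodies.mjs deliberately uses 60.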

package/scaffold/dist/definitions/skills/aikit.mjs CHANGED

@@ -79,6 +79,11 @@ Need tool discovery?
 └─ Need details first? → describe_tool or guide
 ~~~
+**When tools return empty:**
+- \`search\` → 0 results? Broaden query terms OR fall back to \`find({ pattern })\` with regex
+- \`symbol\` → not found? Check spelling, try \`search\` with partial name
+- \`compact\` → irrelevant content? Refine \`query\` parameter or try different \`segmentation\`
 ## Session Protocol
 ### Why This Matters
@@ -106,6 +111,17 @@ Without session discipline, agents repeat work, miss context, and make decisions
 - \`session_digest({ persist: true })\` — Why: captures tool trajectory and supports crash recovery.
 - \`knowledge({ action: "remember", title: "Session checkpoint: <topic>" })\` — Why: the next session's first search is looking for exactly this.
 - \`reindex\` after structural code changes — Why: stale index data makes every later search worse.
+- \`knowledge({ action: "flagged" })\` periodically (not every session, but every few) — Why: decayed entries accumulate; reviewing them prevents stale advice from persisting.
+## Anti-Patterns (NEVER)
+- NEVER \`search\` then immediately \`read_file\` the result — search already returns content snippets
+- NEVER call \`compact\` on a file you just \`file_summary\`'d — pick one retrieval depth
+- NEVER stash >10 items without \`checkpoint\` — stash has no TTL, checkpoints do
+- NEVER \`read_file\` a file >50 lines to "understand" it — \`file_summary\` → \`compact\` decision tree
+- NEVER run \`reindex\` mid-implementation — wait until all edits are done
+- NEVER skip \`search\` before implementing — past decisions may exist that constrain your approach
+- NEVER echo full subagent output to user — compress with \`stash\` + brief summary
 ## Search Strategy
@@ -157,6 +173,32 @@ Use this instead of baking a static catalog into the skill. Tool metadata is liv
 - Search curated memory before inventing a new pattern.
 - Keep memory high-signal. Store what must survive, not everything you observed.
+## Memory Lifecycle
+AI Kit auto-manages memory behind the scenes. Know what happens so you can leverage it.
+### Auto-Observations
+Tool outputs from \`check\`, \`test_run\`, \`search\`, \`trace\` are auto-captured as observations.
+Query them: \`search({ query: "error pattern", tags: ["observation"] })\`
+They surface patterns you may have missed — especially useful in debugging sessions.
+### Retention & Tier Promotion
+- Memories start at **working** tier → promote on repeated access: episodic (2) → semantic (5) → procedural (10)
+- Unused entries decay via Ebbinghaus curve. Re-accessing important memories strengthens them.
+- Check decayed entries: \`knowledge({ action: "flagged" })\` — review, refresh important ones, forget stale ones.
+### Lessons
+Capture reusable engineering insights via \`knowledge({ action: "lesson", subAction: "create" })\`. Confirm with \`confirm\`, contradict with \`contradict\`, list with \`list-lessons\`. Confidence 60-70 for single-observation, 80+ when multiple diffs confirm. Auto-archived below 20 confidence after contradictions.
+See the \`lesson-learned\` skill for the full extraction workflow.
+### Supersession (automatic dedup)
+When you \`remember()\`, similar entries are detected automatically:
+- Jaccard > 0.7 → flagged for review
+- Jaccard > 0.95 → auto-superseded (replaced)
+- Use \`force: true\` to explicitly supersede flagged entries
 ## Flows
 Flows are the preferred path for guided multi-step work.
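
The "Supersession (automatic dedup)" text in the aikit.mjs hunk above names two Jaccard thresholds: above 0.7 an entry is flagged for review, above 0.95 it is auto-superseded. As a rough illustration of how such thresholds partition similarity scores, the sketch below computes Jaccard similarity over word sets. The tokenisation, the `classifySupersession` name, and the `"keep"` label are assumptions; the diff does not say what the package compares or how it tokenises.

```typescript
type Supersession = "keep" | "flag" | "supersede";

// Lowercased word set. How AI Kit actually tokenises entries is not shown in the diff.
function tokenSet(text: string): Set<string> {
  return new Set(
    text
      .toLowerCase()
      .split(/\W+/)
      .filter((t) => t.length > 0),
  );
}

// Jaccard similarity: |A intersect B| / |A union B|. Two empty sets count as identical.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) {
    return 1;
  }
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) {
      shared += 1;
    }
  });
  return shared / (a.size + b.size - shared);
}

// Applies the two thresholds quoted in the diff.
function classifySupersession(existing: string, incoming: string): Supersession {
  const score = jaccard(tokenSet(existing), tokenSet(incoming));
  if (score > 0.95) {
    return "supersede";
  }
  if (score > 0.7) {
    return "flag";
  }
  return "keep";
}
```

Under these assumptions, two entries sharing 8 of 9 distinct words score about 0.89 and would be flagged rather than replaced.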

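The "Retention & Tier Promotion" text in the aikit.mjs hunk lists the tiers as working, then episodic (2), semantic (5), and procedural (10). A small sketch of that ladder follows, assuming the parenthesised numbers are access-count thresholds; the diff does not state this outright, and `tierFor` is a hypothetical helper. Decay is left out because the diff gives no parameters for the Ebbinghaus curve.

```typescript
type Tier = "working" | "episodic" | "semantic" | "procedural";

// Assumption: the numbers in the diff are the access counts at which a memory
// is promoted to the named tier.
function tierFor(accessCount: number): Tier {
  if (accessCount >= 10) {
    return "procedural";
  }
  if (accessCount >= 5) {
    return "semantic";
  }
  if (accessCount >= 2) {
    return "episodic";
  }
  return "working";
}
```
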
package/scaffold/dist/definitions/skills/lesson-learned.mjs CHANGED

@@ -8,49 +8,41 @@ When analyzing a diff, check for these signals. Present findings gently -- as op
 ## God Object / God Class
-One module doing too much.
 **Diff signals:** A single file with many unrelated changes. One class/module imported everywhere. A file over 500 lines that keeps growing.
 **Suggest:** Extract responsibilities into focused modules (SRP).
 ## Shotgun Surgery
-One logical change scattered across many files.
 **Diff signals:** 10+ files changed for a single feature or fix. The same type of edit repeated in many places. A rename or config change touching dozens of files.
 **Suggest:** Consolidate the scattered logic. If one change requires editing many files, the abstraction boundaries may be wrong.
 ## Feature Envy
-A function that uses another module's data more than its own.
 **Diff signals:** Heavy cross-module imports. A function reaching deep into another object's properties. Utility functions that only serve one caller in a different module.
 **Suggest:** Move the function closer to the data it uses.
 ## Premature Abstraction
-Abstracting before there are multiple concrete cases.
 **Diff signals:** An interface with exactly one implementation. A factory that creates only one type. A generic solution for a problem that exists only once.
 **Suggest:** Wait for the second or third use case before abstracting (Rule of Three).
 ## Copy-Paste Programming
-Duplicated code blocks with minor variations.
 **Diff signals:** Similar code appearing in multiple places in the diff. Functions that differ by only a parameter or two. Repeated patterns that could be parameterized.
 **Suggest:** Extract shared logic, parameterize the differences.
 ## Magic Numbers / Strings
-Literal values without explanation.
 **Diff signals:** Hardcoded numbers in conditions (\`if (retries > 3)\`). String literals used as keys or identifiers. Timeouts, limits, or thresholds without named constants.
 **Suggest:** Extract to named constants that explain the "why."
 ## Long Method
-Functions that do too much.
 **Diff signals:** New functions over 40-50 lines. Functions with multiple levels of nesting. Functions that require scrolling to read.
 **Suggest:** Extract sub-steps into named functions. Each function should do one thing.
 ## Excessive Comments
-Comments explaining "what" instead of "why."
 **Diff signals:** Comments restating the code (\`// increment counter\`). Large comment blocks before straightforward code. Commented-out code left in place.
 **Suggest:** Make the code self-documenting through better naming. Use comments only for "why" -- intent, trade-offs, non-obvious constraints.
 `},{file:`references/se-principles.md`,content:`---
@@ -63,108 +55,88 @@ Use this as a lookup table. When you spot a pattern in a diff, find the matching
 ## Design Principles (SOLID)
-**Single Responsibility Principle (SRP)**
-A module should have one reason to change.
-Code signals: A class/file was split into two. A function was extracted. A component stopped handling both UI and data fetching.
+### SRP
+**Signals:** A class/file was split into two. A function was extracted. A component stopped handling both UI and data fetching.
-**Open/Closed Principle (OCP)**
-Open for extension, closed for modification.
-Code signals: New behavior added without changing existing code. A plugin/hook/callback system introduced. Strategy pattern or configuration used instead of conditionals.
+### OCP
+**Signals:** New behavior added without changing existing code. A plugin/hook/callback system introduced. Strategy pattern or configuration used instead of conditionals.
-**Liskov Substitution Principle (LSP)**
-Subtypes must be substitutable for their base types.
-Code signals: An interface was introduced to unify implementations. A subclass override changed behavior in a way that broke callers (violation). Type narrowing or guards added.
+### LSP
+**Signals:** An interface was introduced to unify implementations. A subclass override changed behavior in a way that broke callers (violation). Type narrowing or guards added.
-**Interface Segregation Principle (ISP)**
-No client should depend on methods it doesn't use.
-Code signals: A large interface was split into smaller ones. Optional methods removed from an interface. A "fat" props object was broken into focused ones.
+### ISP
+**Signals:** A large interface was split into smaller ones. Optional methods removed from an interface. A "fat" props object was broken into focused ones.
-**Dependency Inversion Principle (DIP)**
-Depend on abstractions, not concretions.
-Code signals: A concrete dependency replaced with an interface/injection. A factory or provider pattern introduced. Import paths changed from specific implementations to abstract layers.
+### DIP
+**Signals:** A concrete dependency replaced with an interface/injection. A factory or provider pattern introduced. Import paths changed from specific implementations to abstract layers.
-**Composition over Inheritance**
-Favor object composition over class inheritance.
-Code signals: Inheritance hierarchy replaced with delegation. Mixins or HOCs replaced with hooks or composition. A "base class" was removed.
+### Composition over Inheritance
+**Signals:** Inheritance hierarchy replaced with delegation. Mixins or HOCs replaced with hooks or composition. A "base class" was removed.
 ## Simplicity Principles
-**DRY (Don't Repeat Yourself)**
-Every piece of knowledge should have a single representation.
-Code signals: Duplicate code extracted into a shared function. A constant replaced repeated literals. A template/generator replaced copy-pasted boilerplate.
+### DRY
+**Signals:** Duplicate code extracted into a shared function. A constant replaced repeated literals. A template/generator replaced copy-pasted boilerplate.
-**KISS (Keep It Simple, Stupid)**
-The simplest solution that works is the best.
-Code signals: A complex abstraction replaced with a straightforward implementation. Unnecessary indirection removed. A clever one-liner replaced with readable code.
+### KISS
+**Signals:** A complex abstraction replaced with a straightforward implementation. Unnecessary indirection removed. A clever one-liner replaced with readable code.
-**YAGNI (You Aren't Gonna Need It)**
-Don't build it until you actually need it.
-Code signals: Speculative features removed. Unused configuration options deleted. An over-engineered solution simplified to match actual requirements.
+### YAGNI
+**Signals:** Speculative features removed. Unused configuration options deleted. An over-engineered solution simplified to match actual requirements.
-**Rule of Three**
-Wait until the third duplication before abstracting.
-Code signals: Similar code exists in 2 places and was left alone (good). A premature abstraction was introduced after only one use (violation). Third occurrence triggered extraction (textbook application).
+### Rule of Three
+**Signals:** Similar code exists in 2 places and was left alone (good). A premature abstraction was introduced after only one use (violation). Third occurrence triggered extraction (textbook application).
-**Principle of Least Surprise**
-Code should behave the way most users would expect.
-Code signals: A function renamed to better describe what it does. Return types made consistent. Side effects made explicit or removed.
+### Principle of Least Surprise
+**Signals:** A function renamed to better describe what it does. Return types made consistent. Side effects made explicit or removed.
 ## Structural Principles
-**Separation of Concerns**
-Different responsibilities should live in different modules.
-Code signals: Business logic extracted from UI components. Data access separated from domain logic. Configuration separated from behavior. A "god file" split into focused modules.
+### Separation of Concerns
+**Signals:** Business logic extracted from UI components. Data access separated from domain logic. Configuration separated from behavior. A "god file" split into focused modules.
-**High Cohesion**
-Related functionality should live together.
-Code signals: Scattered related functions gathered into one module. A utility file broken up so each piece lives near its consumers. A feature folder created.
+### High Cohesion
+**Signals:** Scattered related functions gathered into one module. A utility file broken up so each piece lives near its consumers. A feature folder created.
-**Loose Coupling**
-Modules should depend on each other as little as possible.
-Code signals: Direct imports replaced with events/callbacks. A shared dependency removed. Modules communicate through well-defined interfaces instead of reaching into internals.
+### Loose Coupling
+**Signals:** Direct imports replaced with events/callbacks. A shared dependency removed. Modules communicate through well-defined interfaces instead of reaching into internals.
-**Encapsulation**
-Hide internal details, expose only what's necessary.
-Code signals: Public methods reduced. Internal helpers made private. A module's API surface shrunk. Implementation details hidden behind a facade.
+### Encapsulation
+**Signals:** Public methods reduced. Internal helpers made private. A module's API surface shrunk. Implementation details hidden behind a facade.
-**Information Hiding**
-Modules should not expose their internal data structures.
-Code signals: Raw data structures wrapped in accessor methods. Internal state made private. A data transformation moved inside the module that owns the data.
+### Information Hiding
+**Signals:** Raw data structures wrapped in accessor methods. Internal state made private. A data transformation moved inside the module that owns the data.
 ## Pragmatic Principles
-**Boy Scout Rule**
-Leave the code better than you found it.
-Code signals: Small cleanups alongside a feature change. A renamed variable for clarity. A dead code path removed. An outdated comment updated.
+### Boy Scout Rule
+**Signals:** Small cleanups alongside a feature change. A renamed variable for clarity. A dead code path removed. An outdated comment updated.
-**Fail Fast**
-Detect and report errors as early as possible.
-Code signals: Input validation added at entry points. Assertions added for invariants. Early returns replacing deep nesting. Error handling moved closer to the source.
+### Fail Fast
+**Signals:** Input validation added at entry points. Assertions added for invariants. Early returns replacing deep nesting. Error handling moved closer to the source.
-**Defensive Programming**
-Anticipate and handle unexpected inputs gracefully.
-Code signals: Null checks added. Default values provided. Edge cases handled. Error boundaries introduced.
+### Defensive Programming
+**Signals:** Null checks added. Default values provided. Edge cases handled. Error boundaries introduced.
-**Premature Optimization (avoiding it)**
-Don't optimize until you've measured.
-Code signals: A simple implementation chosen over a "faster" complex one. Readability prioritized over micro-performance. A profiling step added before optimization work.
+### Premature Optimization (avoiding it)
+**Signals:** A simple implementation chosen over a "faster" complex one. Readability prioritized over micro-performance. A profiling step added before optimization work.
 ## Refactoring Patterns
-**Extract Method/Function**
-Code signals: A long function split into named sub-functions. Inline logic replaced with a well-named call.
+### Extract Method/Function
+**Signals:** A long function split into named sub-functions. Inline logic replaced with a well-named call.
-**Extract Class/Module**
-Code signals: A file split into multiple files. A class split into two with distinct responsibilities.
+### Extract Class/Module
+**Signals:** A file split into multiple files. A class split into two with distinct responsibilities.
-**Replace Conditional with Polymorphism**
-Code signals: A switch/if-else chain replaced with a strategy pattern or subclass dispatch. A type map introduced.
+### Replace Conditional with Polymorphism
+**Signals:** A switch/if-else chain replaced with a strategy pattern or subclass dispatch. A type map introduced.
-**Introduce Parameter Object**
-Code signals: Multiple related parameters grouped into a single options/config object. Function signatures simplified.
+### Introduce Parameter Object
+**Signals:** Multiple related parameters grouped into a single options/config object. Function signatures simplified.
 `},{file:`SKILL.md`,content:`---
 name: lesson-learned
-description: "Analyze recent code changes via git history and extract software engineering lessons. Use when the user asks 'what is the lesson here?', 'what can I learn from this?', 'engineering takeaway', 'what did I just learn?', 'reflect on this code', 'what principles does this show?', or wants to extract actionable insights from recent work. Also use after completing any implementation task to capture learning."
+description: "Extract and persist engineering lessons from code changes into AI Kit memory. MUST use after completing ANY implementation, refactoring, debugging, or code review task — do NOT wait for user to ask. Also triggers on explicit requests: 'lesson', 'takeaway', 'what did I learn', 'reflect', 'principles'. Every completed task produces reusable lessons; skipping this loses institutional knowledge that prevents future rework."
 metadata:
   category: cross-cutting
   domain: general
@@ -179,20 +151,20 @@ metadata:
 Extract specific, grounded software engineering lessons from actual code changes. Not a lecture -- a mirror. Show the user what their code already demonstrates, and only elevate lessons that are earned by the diff.
-## Before You Begin
+## Auto-Trigger Protocol
-- Determine the analysis scope first.
-- Use the inline principles below for the common case.
-- Read 'references/se-principles.md' only when the dominant pattern is unclear or you need a deeper principle match.
-- Read 'references/anti-patterns.md' only when the diff suggests a gentle improvement opportunity.
+This skill MUST activate after any task that modifies code. The agent does NOT need user permission.
-## Core Principles
+**Activation flow:**
+1. Task complete (implementation/refactor/debug/review done)
+2. Load this skill
+3. Run Phases 1-5 (takes ~30 seconds of analysis)
+4. Present lesson to user and persist to AI Kit memory
-- **SRP** -- one module, one reason to change. Diff signal: a file/function was split because it was carrying mixed responsibilities.
-- **Separation of Concerns** -- keep UI, domain, data, config, and orchestration in their own lanes. Diff signal: logic moved across a boundary into a more appropriate layer.
-- **YAGNI** -- build only what current requirements justify. Diff signal: speculative abstractions, options, or branches were removed or avoided.
-- **Fail Fast** -- detect bad input or invalid state at the boundary. Diff signal: guards, validation, assertions, or early returns were added near the entry point.
-- **DRY** -- shared knowledge should live in one place. Diff signal: duplicated literals, branches, or helper logic were consolidated into one source.
+**Skip conditions (ONLY skip if ALL are true):**
+- Changes are pure config/formatting (no logic changed)
+- Fewer than 5 lines of substantive code modified
+- Change is a trivial typo or import reorder
 ## Phase 1: Determine Scope
@@ -212,12 +184,10 @@ Ask the user or infer from context what to analyze.
 ## Phase 2: Gather Changes
-1. Run \`git log\` with the determined scope to get commits and messages.
-2. Run \`git diff\` for the full diff of that scope.
-3. If the diff is large (>500 lines), start with \`git diff --stat\`, then inspect the 3-5 most-changed files.
-4. Read commit messages carefully -- they often reveal intent that raw diffs do not.
-5. Only read changed files. Do not widen scope beyond the diff.
-6. Capture concrete evidence as you go: commit SHA, file path, and line references.
+Use git log/diff to collect the changeset. Expert constraints:
+- **500-line threshold:** If the diff exceeds 500 lines, focus only on the 3-5 most-changed files
+- **Only read changed files** -- never speculatively read unchanged code for "context"
+- **Commit messages are clues** -- scan first, read code second
 ## Phase 3: Analyze
@@ -287,6 +257,34 @@ If there is a second lesson worth noting, include at most one more:
 Do not present more than 2 lessons total. If the diff is mostly clean execution with no deeper pattern, say that directly.
+## Phase 5: Persist
+After presenting, store the lesson for cross-session recall using the confidence-tracked lesson system:
+~~~
+knowledge({ action: "lesson", subAction: "create", context: "<what happened — the code situation>", insight: "<what was learned — the principle applied>", evidence: "<proof — commit SHA, file:line, test result>", confidence: 70 })
+~~~
+**Confidence scale:**
+- 50-60: Single observation, no corroboration yet
+- 60-70: Clear pattern from one changeset (default starting point)
+- 70-80: Same pattern observed in 2+ independent changes
+- 80-90: Pattern confirmed across multiple PRs/sessions
+- 90+: Fundamental principle validated repeatedly -- treat as convention
+If you encounter a lesson that CONFIRMS a previously stored one:
+~~~
+knowledge({ action: "lesson", subAction: "confirm", id: "<lesson-path>" })
+~~~
+If a lesson CONTRADICTS a previously stored one:
+~~~
+knowledge({ action: "lesson", subAction: "contradict", id: "<lesson-path>", reason: "<why the old lesson no longer holds>" })
+~~~
+Before creating a new lesson, check existing ones: \`knowledge({ action: "lesson", subAction: "list-lessons" })\`
+Only create if no existing lesson already covers this insight.
 ## What NOT to Do
 | Avoid | Why | Instead |