npm - @event4u/agent-config - Versions diffs - 1.15.0 → 1.17.0 - Mend

@event4u/agent-config 1.15.0 → 1.17.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (354)

package/.agent-src/skills/systematic-debugging/SKILL.md CHANGED

@@ -8,8 +8,8 @@ source: package
 ## When to use
-* Test fails and failure is not self-explanatory
-* Bug reported (Jira, Sentry, user message), root cause not obvious
+* A test fails and the failure is not self-explanatory
+* A bug is reported (Jira, Sentry, user message) and the root cause is not obvious
 * Production or staging shows unexpected behavior
 * Code behaves differently than the developer expected
 * A previous fix did not resolve the issue or introduced a new one
@@ -17,14 +17,19 @@ source: package
 Do NOT use when:
-* Failure message names the fix (typo, missing import, obvious off-by-one) — fix it
+* The failure message already names the fix (typo, missing import, obvious
+  off-by-one) — fix it and move on
 * Pure style / formatting / lint issues
 * Documentation-only questions
+* You need a static trace of a specific data element — route to
+  [`data-flow-mapper`](../data-flow-mapper/SKILL.md)
+* You need to enumerate what a planned change will touch — route to
+  [`blast-radius-analyzer`](../blast-radius-analyzer/SKILL.md)
 ## Goal
-Find the **root cause** before changing any code. A symptom fix papering
-over an unknown cause is a regression waiting to happen.
+Find the **root cause** before changing any code. A symptom fix that
+papers over an unknown cause is a regression waiting to happen.
 ## The Iron Law
@@ -37,30 +42,31 @@ diff, a reproduced failure — those are evidence.
 ## Procedure
-Complete each phase before the next. Skipping ahead is the single
-biggest cause of wasted debug time.
+Complete each phase before starting the next. Skipping ahead is the
+single biggest cause of wasted debug time.
 ### Phase 1 — Reproduce
-Goal: make the failure happen on demand, smallest possible setup.
+Goal: make the failure happen on demand, with the smallest possible setup.
-1. Read the error message, stack trace, and logs **in full**. Note exact
-   file, line, and the chain of calls above it.
-2. Identify the minimum input, state, or action sequence triggering the
-   failure. Intermittent → gather more data before guessing.
+1. Read the error message, stack trace, and logs **in full**. Note the
+   exact file, line, and the chain of calls above it.
+2. Identify the minimum input, state, or sequence of actions that
+   triggers the failure. If it is intermittent — gather more data before
+   guessing.
 3. Capture the exact reproduction as a command or a test. Prefer a
    failing test (see [`test-driven-development`](../test-driven-development/SKILL.md))
-   — turns Phase 4 into a verified fix.
+   — it turns Phase 4 into a verified fix.
-Cannot reproduce? You do not yet understand the bug. Stop. Add logging,
-re-run, collect more evidence.
+If you cannot reproduce, you do not yet understand the bug. Stop. Add
+logging, re-run, collect more evidence.
 ### Phase 2 — Isolate
 Goal: locate the failure in a single component, layer, or call site.
-1. Bisect the surface area. Smallest code path that still fails? Turn
-   off/skip/mock adjacent features to narrow the window.
+1. Bisect the surface area. What is the smallest code path that still
+   fails? Turn off/skip/mock adjacent features to narrow the window.
 2. For multi-component systems (frontend → API → service → DB, or
    CI → build → deploy), log at **each boundary**:
@@ -68,11 +74,12 @@ Goal: locate the failure in a single component, layer, or call site.
    * What leaves the component
    * What config/env the component actually sees
-   Goal: answer "which boundary has expected ≠ actual?".
+   The goal is not to fix — it is to answer "which boundary is the one
+   where expected ≠ actual?".
 3. Check recent changes: `git log`, `git blame` on the failing line,
    recent dependency updates, config edits, infra changes.
 4. **Consult memory for prior matches.** Via
-   [`memory-access`](../../guidelines/agent-infra/memory-access.md):
+   [`memory-access`](../../../docs/guidelines/agent-infra/memory-access.md):
    ```python
    from scripts.memory_lookup import retrieve
    priors = retrieve(
@@ -81,13 +88,13 @@ Goal: locate the failure in a single component, layer, or call site.
        limit=3,
    )
    ```
-   A matching `incident-learning` may already name the root cause, fix,
-   and regression test. A matching `historical-pattern` narrows the
-   hypothesis space before Phase 3. Cite matching `id`s in the evidence
-   trail.
-5. Trace backwards from the symptom. `null` arrives at line 42 — where
-   does the value originate? Walk up the call stack until the origin is
-   found. Fix at origin, not at line 42.
+   A matching `incident-learning` may already name the root cause, the
+   fix, and the regression test. A matching `historical-pattern`
+   narrows the hypothesis space before Phase 3. Cite matching `id`s in
+   the Phase 1–4 evidence trail.
+5. Trace backwards from the symptom. If `null` arrives at line 42 —
+   where does the value originate? Walk up the call stack until the
+   origin is found. Fix at origin, not at line 42.
 ### Phase 3 — Hypothesize
@@ -95,35 +102,35 @@ Goal: one testable hypothesis at a time, rejected or confirmed by evidence.
 1. State the hypothesis in one sentence: *"The failure happens because
    X, which I can confirm by observing Y."*
-2. Design the smallest possible experiment that confirms or rejects the
-   hypothesis. One variable at a time.
+2. Design the smallest possible experiment that either confirms or
+   rejects the hypothesis. One variable at a time.
 3. Run it. Read the output.
-4. Confirmed → Phase 4. Rejected → back to Phase 2 with new
+4. If confirmed → Phase 4. If rejected → back to Phase 2 with the new
    information, then form a new hypothesis.
-Three hypotheses in a row fail → stop. You do not understand the system
-well enough yet, or the architecture itself is the problem — see
+If three hypotheses in a row fail, stop. You do not understand the
+system well enough yet, or the architecture is the problem itself — see
 "Three-strike rule" below.
 ### Phase 4 — Verify the fix
 Goal: the fix resolves the root cause, not just the observed symptom.
-1. Write or update a failing test reproducing the bug (if not already
-   done in Phase 1).
+1. Write or update a failing test that reproduces the bug (if not
+   already done in Phase 1).
 2. Apply a single, minimal fix targeting the root cause. No bundled
    refactors, no "while I'm here".
-3. Re-run the reproduction — failure gone.
-4. Re-run the surrounding test suite — nothing adjacent turned red.
-5. Read output carefully — no new warnings, deprecations, or silent
-   retries masking the same bug recurring.
+3. Re-run the reproduction — the failure is gone.
+4. Re-run the surrounding test suite — nothing adjacent has turned red.
+5. Read the output carefully — no new warnings, deprecations, or
+   silent retries that would mask the same bug recurring.
-Fix does not work? **Do not** stack a second fix on top. Go back to
-Phase 2, treat the failure as new evidence.
+If the fix does not work, **do not** stack a second fix on top. Go back
+to Phase 2, treat the failure as new evidence.
 ## Three-strike rule
-After **three** attempted fixes with the bug still present:
+If you have tried **three** fixes and the bug is still present:
 * Stop attempting fixes.
 * Re-read phases 1–3 — something about the root cause is wrong.
@@ -150,7 +157,7 @@ right line beats five minutes of IDE breakpoints.
 ## Condition-based waiting (intermittent bugs)
 Intermittent tests and race conditions usually stem from waiting on
-time instead of a condition. Replace `sleep(100)` or
+time instead of on a condition. Replace `sleep(100)` or
 `setTimeout(r, 100)` with an explicit wait-for:
 ```ts
@@ -172,11 +179,12 @@ async function waitFor<T>(
 ```
 Only use an arbitrary timeout when the timing itself is the contract
-(debounce, throttle) — add a comment explaining **why** the exact value.
+(debounce, throttle) — and add a comment explaining **why** the exact
+value.
 ## Output format
-When reporting debug findings:
+When reporting debug findings to the user:
 1. **Symptom** — what was observed (one sentence + failure message)
 2. **Reproduction** — the command or test that triggers it
@@ -189,27 +197,27 @@ When reporting debug findings:
 * Reading half a stack trace and jumping to a fix — the actual cause is
   usually two or three frames above the one you read.
-* "It works on my machine" — different env than the bug report.
-  Reproduce with exact conditions from the report.
-* Adding a retry or sleep to mask an intermittent failure — hides the
-  race condition, does not fix it. Use condition-based waiting.
-* Fixing the first line that throws when the bad value came from up the
-  call chain. Trace backwards to the origin.
+* "It works on my machine" — you are running a different env than the
+  bug report. Reproduce with the exact conditions from the report.
+* Adding a retry or sleep to mask an intermittent failure — this hides
+  the race condition, it does not fix it. Use condition-based waiting.
+* Fixing the first line that throws, when the bad value came from
+  somewhere up the call chain. Trace backwards to the origin.
 * "The fix works, the test is just flaky" — flaky tests are bugs in the
   test or the code. Diagnose them, do not retry-until-green.
-* Turning a failing assertion into a softer one ("maybe 2 or 3 retries,
-  accept both") to make it pass.
-* Bundling a bug fix with a refactor — test goes red again, cannot tell
-  which change broke it.
+* Turning a failing assertion into a softer one ("maybe it's 2 or 3
+  retries, let's accept both") to make it pass.
+* Bundling a bug fix with a refactor — if the test goes red again you
+  cannot tell which change broke it.
 ## Red flags — STOP and restart from Phase 1
 * "Let me just try X and see if it works"
 * "I don't fully understand it, but this probably fixes it"
 * Proposing a fix without having reproduced the bug
-* Bundling multiple changes in one attempt
+* Bundling multiple changes in one attempt ("fixing this and refactoring that")
 * "It's probably a race condition, let me add a sleep"
-* A green test run after changes without having first seen it red
+* A green test run after changes, without having first seen it red
 * "This looks similar to bug X, so it's the same fix"
 * Suppressing a log, warning, or exception instead of tracing its source
@@ -219,7 +227,7 @@ When reporting debug findings:
 * Do NOT change two things at once in a single experiment
 * Do NOT silence a warning, failing test, or noisy log as a "fix"
 * Do NOT mark a bug as fixed without a regression test
-* Do NOT attempt fix #4 after three failed fixes — surface the pattern
+* Do NOT attempt fix #4 after three failed fixes — surface the pattern instead
 ## When to hand over to another skill
@@ -234,11 +242,11 @@ When reporting debug findings:
 Before declaring a bug fixed:
-* [ ] Failure was reproduced before any code changed
-* [ ] Root cause named explicitly, not "probably"
+* [ ] The failure was reproduced before any code changed
+* [ ] The root cause is named explicitly, not "probably"
 * [ ] Evidence (log, trace, diff) supports the named root cause
-* [ ] Failing test reproducing the bug was added or updated
-* [ ] Fix is minimal, targets the root cause, not the symptom
-* [ ] Regression test now passes
+* [ ] A failing test reproducing the bug was added or updated
+* [ ] The fix is minimal and targets the root cause, not the symptom
+* [ ] The regression test now passes
 * [ ] Adjacent tests still pass
 * [ ] No warning or suppressed output hides a recurrence
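
The condition-based waiting advice in this skill is shown in TypeScript in the package. As a rough Python analogue, the same idea can be sketched as below. This is an illustration only, not code from the package: `wait_for`, its parameters, and the default values are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    condition: Callable[[], Optional[T]],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> T:
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # The condition holds: return what it produced instead of sleeping blindly.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)
```

A test would then call something like `wait_for(lambda: queue_is_empty())` instead of `time.sleep(0.1)`, so it waits exactly as long as the condition needs and fails loudly when the condition never holds. `queue_is_empty` is likewise a placeholder name.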

package/.agent-src/skills/test-driven-development/SKILL.md CHANGED

@@ -9,23 +9,23 @@ source: package
 ## When to use
 * Adding a new function, method, or behavior
-* Fixing a bug (bug needs a regression test before the fix)
+* Fixing a bug (the bug needs a regression test before the fix)
 * Refactoring a unit whose current behavior is unclear
 * Any task where expected behavior can be expressed as an assertion
 Do NOT use when:
-* Throwaway prototype or spike code explicitly marked as exploration
-* Boilerplate generation (migrations, config, scaffolding)
-* Pure documentation (`.md`, `AGENTS.md`, README)
+* Writing throwaway prototype or spike code explicitly marked as exploration
+* Generating boilerplate (migrations, config files, scaffolding)
+* Editing pure documentation (`.md`, `AGENTS.md`, README)
 * Working inside this `agent-config` package on skill/rule markdown
 ## Goal
-* Drive implementation from a verified-failing test, not the agent's
-  belief that code "should work".
-* Catch edge cases **before** production bugs.
-* Leave every change with a regression test running in CI.
+* Drive implementation from a verified-failing test, not from the agent's
+  belief that the code "should work".
+* Catch edge cases **before** they become production bugs.
+* Leave every change with a regression test that runs in CI.
 ## The core discipline
@@ -38,7 +38,7 @@ Do NOT use when:
 ```
 If step 2 is skipped, the test is not trusted — a test that has never
-failed proves nothing.
+failed proves nothing about the code under test.
 ## Procedure
@@ -46,21 +46,21 @@ failed proves nothing.
 State in one sentence: *"When X happens, the system should do Y."*
-Cannot state in one sentence? Scope too big — split into multiple tests,
-one sentence each.
+If you cannot state it in one sentence, the scope is too big — split into
+multiple tests, each covering one sentence.
 ### 2. Write the failing test first
-Smallest test expressing step-1 sentence.
+Write the smallest test that expresses the sentence from step 1.
-* One assertion per behavior (multiple assertions OK only when describing
-  the same single behavior).
+* One assertion per behavior (multiple assertions are OK only when they
+  describe the same single behavior).
 * Real code paths, not mocks — mock only at I/O boundaries (HTTP, DB, time).
-* Descriptive name: `it_rejects_empty_email`, not `test_email_1`.
+* Use a descriptive name: `it_rejects_empty_email`, not `test_email_1`.
 ### 3. Run the test and watch it fail
-Execute the single test (targeted, not full suite):
+Execute the single test (targeted, not the full suite):
 ```bash
 # PHP/Pest
@@ -72,26 +72,27 @@ npx vitest run --testNamePattern "rejects empty email"
 Required observations **before proceeding**:
-* Test **fails** (not errors).
-* Failure message matches expectation (missing behavior, not typo).
-* Passes immediately → it does not test what you think. Fix the test,
-  do not start writing production code.
+* The test **fails** (not errors).
+* The failure message matches what you expected (missing behavior, not typo).
+* If the test passes immediately → it does not test what you think. Fix
+  the test, do not start writing production code.
 ### 4. Write minimum code to pass
-**Just enough** production code to make the test green. No extra
+Add **just enough** production code to make the test green. No extra
 features, no unrelated refactoring, no "while I'm here" cleanups.
-Urge to add a parameter, edge case, or helper not covered by current
-test → stop. Belongs in the next RED step, not this GREEN step.
+If you feel the urge to add a parameter, edge case, or helper not covered
+by the current test — stop. That belongs in the next RED step, not this
+GREEN step.
 ### 5. Run again and watch it pass
 Re-run the same targeted command. Required:
-* New test passes.
-* No previously green tests turned red.
-* Output clean (no new warnings, deprecations, or noise).
+* The new test passes.
+* No previously green tests have turned red.
+* Test output is clean (no new warnings, deprecations, or noise).
 ### 6. Refactor (only if green)
@@ -101,8 +102,8 @@ With all tests green, you may:
 * Extract duplication into helpers
 * Tighten types
-Do **not** add new behavior during refactor — needs its own failing test
-first. Re-run tests after refactor to confirm still-green.
+Do **not** add new behavior during refactor — that needs its own failing
+test first. Re-run tests after the refactor to confirm still-green.
 ### 7. Repeat for the next behavior
@@ -110,25 +111,25 @@ Back to step 1 with the next single-sentence behavior.
 ## Output format
-1. Failing test (file + test name) with captured failure output
-2. Minimum-code diff that makes it pass
+1. The failing test (file + test name) with captured failure output
+2. The minimum-code diff that makes it pass
 3. Captured green-run output
-4. Refactor diff (optional)
+4. Any refactor diff (optional)
 ## Anti-rationalizations table
-The urge to skip TDD is strongest when TDD matters most. Name the
-rationalization and reject it:
+The urge to skip TDD is strongest on tasks where TDD matters most. Name
+the rationalization and reject it:
 | Thought | Reality |
 |---|---|
-| "Too simple to need a test" | Simple code still breaks. A test costs less than one debug cycle. |
-| "I'll add the test after the code works" | A test written after code passes on first run — never failed. Proves nothing. |
-| "I already ran it manually" | Manual runs not repeatable. Next edit breaks it silently. |
-| "Deleting code I just wrote is wasteful" | Sunk cost. Cheap path: delete, write test, reimplement minimally. |
-| "I'll keep the code as reference while I write the test" | You will adapt it. That is test-after-the-fact with extra steps. Delete it. |
-| "I just need to explore the API first" | Spike on a throwaway branch. Then delete and restart with TDD. |
-| "The test is too hard to write" | Signals a design problem in the code, not the test. Listen. |
+| "This is too simple to need a test" | Simple code still breaks. A test takes less time than one debug cycle. |
+| "I'll add the test after the code works" | A test written after code passes on the first run — it has never failed. It does not prove the code is correct. |
+| "I already ran it manually" | Manual runs are not repeatable. The next edit breaks it silently. |
+| "Deleting this code I just wrote is wasteful" | Sunk cost. The cheap path is: delete, write the test, reimplement minimally. |
+| "I'll keep the code as reference while I write the test" | You will read it and adapt it. That is test-after-the-fact with extra steps. Delete it. |
+| "I just need to explore the API first" | Spike it on a throwaway branch. Then delete it and restart with TDD. |
+| "The test is too hard to write" | That signals a design problem in the code, not in the test. Listen to it. |
 | "This bug is urgent, no time for a test" | The test **is** the fastest path to a verified fix. Guessing takes longer. |
@@ -162,7 +163,7 @@ final class EmailValidator
 }
 ```
-Run filter again → passes. No additional rules (format, MX, length)
+Run the filter again → passes. No additional rules (format, MX, length)
 until a next failing test drives them.
 ### JS / Vitest
@@ -184,7 +185,7 @@ it('retries a failing operation up to 3 times', async () => {
 ```
 Run: `npx vitest run --testNamePattern "retries a failing"` → fails
-(`retry` undefined).
+(`retry` is undefined).
 ```ts
 // src/retry.ts — GREEN (minimum)
@@ -197,21 +198,22 @@ export async function retry<T>(op: () => Promise<T>): Promise<T> {
 }
 ```
-Run again → passes. Configurable attempt count, backoff, jitter all
+Run again → passes. Configurable attempt count, backoff, and jitter all
 wait for their own failing tests.
 ## Gotchas
 * Running the full suite instead of a filtered test hides the RED→GREEN
   signal in noise. Always target first.
-* A test that passes on the very first run is not TDD — written against
-  existing code.
+* A test that passes on the very first run is not TDD — it was written
+  against code that already exists.
 * `expect()` with three or four assertions on unrelated fields describes
   multiple behaviors. Split them.
-* Snapshot tests invert the discipline — they generate the expected
-  value from the code. Only use where human-readable output is the
-  contract (CLI, SQL strings).
-* Mocking the thing under test (instead of its I/O) tests the mock.
+* Snapshot tests invert the discipline — they generate the expected value
+  from the code. Only use snapshots where human-readable output is the
+  contract (CLI output, SQL strings).
+* Mocking the thing under test (instead of its I/O) tests the mock, not
+  the code.
 ## Do NOT
@@ -219,22 +221,22 @@ wait for their own failing tests.
   and has been observed to fail
 * Do NOT accept a test that never failed as evidence the code works
 * Do NOT bundle refactors into the GREEN step
-* Do NOT silence a flaky test — diagnose or delete it
-* Do NOT skip the targeted RED-run because "I know it fails"
+* Do NOT silence a flaky test — diagnose it, or delete it
+* Do NOT skip the targeted RED-run because "I just wrote it, I know it fails"
 ## Anti-patterns
 * `it('works')` — no behavior described
 * One test covering "and/and/and" — split per behavior
-* Test reaching into private state instead of observable behavior
-* Test duplicating the production code's algorithm (tautology)
+* Test that reaches into private state instead of testing observable behavior
+* Test that duplicates the production code's algorithm (tautology)
 ## When to hand over to another skill
 * Quality tools, PHPStan, ECS, Rector → [`quality-tools`](../quality-tools/SKILL.md)
 * Full Pest conventions, Laravel testing helpers → [`pest-testing`](../pest-testing/SKILL.md)
 * Running tests inside Docker → [`tests-execute`](../tests-execute/SKILL.md)
-* Investigating why a test fails for non-obvious reasons →
+* Investigating why a test is failing for non-obvious reasons →
   [`systematic-debugging`](../systematic-debugging/SKILL.md)
 ## Validation checklist
@@ -243,10 +245,10 @@ Before marking TDD work complete:
 * [ ] Every new behavior has a test
 * [ ] Each test was observed to fail first, with a matching failure message
-* [ ] Minimum code was written to turn each RED into GREEN
+* [ ] The minimum code was written to turn each RED into GREEN
 * [ ] All targeted tests pass
-* [ ] No adjacent test turned red
-* [ ] Output clean (no new warnings or deprecations)
+* [ ] No adjacent test has turned red
+* [ ] Test output is clean (no new warnings or deprecations)
 See also [`developer-like-execution`](../developer-like-execution/SKILL.md)
 for the broader think → analyze → verify loop this skill plugs into.
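
The RED to GREEN cycle described in this skill can be sketched in Python, loosely mirroring the Vitest retry example that is only partly visible in this diff. The function `retry` and the test name below are hypothetical and written for this illustration. The fixed attempt count of 3 follows the skill's point that a configurable count, backoff, and jitter each wait for their own failing test.

```python
def retry(op):
    """GREEN (minimum): call `op` until it succeeds, at most 3 times."""
    last_error = None
    for _ in range(3):
        try:
            return op()
        except Exception as error:
            # Remember the failure and try again on the next loop pass.
            last_error = error
    # All three attempts failed: surface the last error to the caller.
    raise last_error


def test_retries_a_failing_operation_up_to_3_times():
    """RED first: written and observed failing before `retry` existed."""
    calls = []

    def flaky():
        calls.append(1)
        if len(calls) < 3:
            raise RuntimeError("boom")
        return "ok"

    assert retry(flaky) == "ok"
    assert len(calls) == 3
```

Run the test once before `retry` is defined, so it fails for the expected reason (missing behavior), then add the function and run the same targeted test again.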

package/.agent-src/skills/test-performance/SKILL.md CHANGED

@@ -148,7 +148,6 @@ Replace dynamic `getPdo()` probing with explicit config:
 ## Related project docs
-- `agents/roadmaps/test-performance-refactor.md` — active roadmap for this project
 - `agents/docs/seeders.md` — seeder conventions (if exists)
 ## Output format

package/.agent-src/skills/traefik/SKILL.md CHANGED

@@ -18,11 +18,11 @@ Use this skill when:
 ### 0. Understand current setup
-Before changing routing:
+Before changing any routing configuration:
-1. Check existing proxy — `docker ps | grep traefik`
-2. Review docker-compose for existing labels and network config
-3. Check `/etc/hosts` or dnsmasq for existing domain mappings
+1. **Check existing proxy setup** — is Traefik already running? Check `docker ps | grep traefik`.
+2. **Review docker-compose** — read `docker-compose.yml` for existing labels and network config.
+3. **Check DNS/hosts** — review `/etc/hosts` or dnsmasq for existing domain mappings.
 ### Architecture

package/.agent-src/skills/upstream-contribute/SKILL.md CHANGED

@@ -178,7 +178,7 @@ If not in the package repo, note that these checks will run in CI after the PR i
 If the contribution originates from a proposal under `agents/proposals/`
 (produced by `learning-to-rule-or-skill` via the pipeline in
-[`self-improvement-pipeline`](../../guidelines/agent-infra/self-improvement-pipeline.md)),
+[`self-improvement-pipeline`](../../../docs/guidelines/agent-infra/self-improvement-pipeline.md)),
 run the Stage-4 gate before opening the PR:
 ```bash

package/.agent-src/skills/validate-feature-fit/SKILL.md CHANGED

@@ -45,7 +45,7 @@ Verify the request doesn't conflict with:
 - Data flow (does it bypass existing services or repositories?)
 - **Product rules and domain invariants** — pull active rules via the
   shared abstraction (see
-  [`memory-access`](../../guidelines/agent-infra/memory-access.md)):
+  [`memory-access`](../../../docs/guidelines/agent-infra/memory-access.md)):
   ```python
   from scripts.memory_lookup import retrieve
@@ -60,7 +60,7 @@ Verify the request doesn't conflict with:
   (e.g., "free plan caps at 3 users"); the feature must either respect
   it or explicitly propose to retire it. Cite the matching `id:` in
   the findings. Schema:
-  [`engineering-memory-data-format`](../../guidelines/agent-infra/engineering-memory-data-format.md).
+  [`engineering-memory-data-format`](../../../docs/guidelines/agent-infra/engineering-memory-data-format.md).
 **If contradiction found** → show the existing pattern, explain why it matters.
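
Several hunks in this diff (in `systematic-debugging`, `upstream-contribute`, and `validate-feature-fit`) change guideline links from `../../guidelines/...` to `../../../docs/guidelines/...`. The sketch below shows where each form points when resolved against a skill file's location. `resolve_link` is a hypothetical helper written for this illustration. Whether the target files exist at those locations in 1.17.0 is not shown by this diff.

```python
import posixpath


def resolve_link(source_file: str, link: str) -> str:
    """Resolve a relative Markdown link against the file that contains it."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(source_file), link))


skill = "package/.agent-src/skills/systematic-debugging/SKILL.md"

old = resolve_link(skill, "../../guidelines/agent-infra/memory-access.md")
new = resolve_link(skill, "../../../docs/guidelines/agent-infra/memory-access.md")

print(old)  # package/.agent-src/guidelines/agent-infra/memory-access.md
print(new)  # package/docs/guidelines/agent-infra/memory-access.md
```

Read this way, the old links pointed inside `.agent-src/`, while the new ones point at a `docs/` directory at the package root.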