npm - @jaimevalasek/aioson - Versions diffs - 1.17.3 → 1.19.0 - Mend

@jaimevalasek/aioson 1.17.3 → 1.19.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (90)

package/CHANGELOG.md +13 -0
package/README.md +85 -51
package/docs/en/3-recipes/full-feature-with-sheldon.md +1 -1
package/docs/en/5-reference/cli-reference.md +4 -4
package/docs/en/5-reference/qa-browser.md +2 -2
package/docs/en/README.md +1 -1
package/docs/en/deyvin-subtask-scout/how-to-use.md +2 -2
package/docs/en/deyvin-subtask-scout/sub-task-scout.md +3 -3
package/docs/en/deyvin-subtask-scout/troubleshooting.md +1 -1
package/docs/pt/3-receitas/publicar-no-aioson-com.md +17 -0
package/docs/pt/5-referencia/comandos-cli.md +2 -2
package/docs/pt/5-referencia/inteligencia-adaptativa.md +3 -3
package/docs/pt/5-referencia/skills.md +1 -1
package/docs/pt/5-referencia/web3.md +3 -3
package/docs/pt/README.md +1 -1
package/docs/pt/_arquivo/README.md +1 -1
package/docs/pt/_arquivo/cenarios.md +31 -31
package/docs/pt/_arquivo/design-hybrid-forge.md +5 -5
package/docs/pt/_arquivo/guia-engineer.md +1 -1
package/docs/pt/_arquivo/profiler-system.md +1 -1
package/docs/pt/_arquivo/site-forge.md +16 -16
package/docs/pt/_arquivo/squad-genome.md +2 -2
package/docs/pt/agentes.md +37 -37
package/docs/pt/deyvin-subtask-scout/como-usar.md +2 -2
package/docs/pt/deyvin-subtask-scout/sub-task-scout.md +1 -1
package/docs/pt/deyvin-subtask-scout/troubleshooting.md +1 -1
package/docs/pt/living-memory/README.md +1 -1
package/docs/pt/living-memory/memoria-viva.md +2 -2
package/docs/pt/living-memory/reflexao-in-harness.md +1 -1
package/docs/pt/living-memory/troubleshooting.md +6 -6
package/package.json +1 -1
package/src/cli.js +111 -7
package/src/commands/gate-approve.js +56 -1
package/src/commands/live.js +81 -54
package/src/commands/op-capture.js +27 -2
package/src/commands/op-list.js +33 -1
package/src/commands/store-system.js +4 -0
package/src/commands/tool-capabilities.js +14 -10
package/src/commands/workflow-heal.js +47 -1
package/src/constants.js +73 -0
package/src/i18n/messages/en.js +20 -2
package/src/i18n/messages/es.js +18 -1
package/src/i18n/messages/fr.js +18 -1
package/src/i18n/messages/pt-BR.js +20 -2
package/src/lib/dev-resume.js +6 -1
package/src/lib/tool-capabilities.js +64 -37
package/src/operator-memory/decision.js +11 -4
package/src/operator-memory/proposal.js +11 -7
package/src/session-handoff.js +52 -1
package/template/.aioson/agents/analyst.md +33 -1
package/template/.aioson/agents/architect.md +33 -1
package/template/.aioson/agents/briefing.md +23 -0
package/template/.aioson/agents/orchestrator.md +26 -0
package/template/.aioson/agents/pentester.md +66 -14
package/template/.aioson/agents/pm.md +18 -1
package/template/.aioson/agents/product.md +11 -0
package/template/.aioson/agents/sheldon.md +21 -1
package/template/.aioson/agents/tester.md +114 -1
package/template/.aioson/docs/pentester/browser-dast-playbook.md +398 -0
package/template/.aioson/rules/agent-structural-contract.md +139 -0
package/template/.aioson/skills/process/decision-presentation/SKILL.md +2 -2
package/template/.claude/commands/aioson/agent/analyst.md +16 -5
package/template/.claude/commands/aioson/agent/architect.md +17 -5
package/template/.claude/commands/aioson/agent/briefing.md +16 -5
package/template/.claude/commands/aioson/agent/committer.md +16 -5
package/template/.claude/commands/aioson/agent/copywriter.md +16 -5
package/template/.claude/commands/aioson/agent/design-hybrid-forge.md +16 -5
package/template/.claude/commands/aioson/agent/dev.md +18 -5
package/template/.claude/commands/aioson/agent/deyvin.md +16 -5
package/template/.claude/commands/aioson/agent/discover.md +16 -5
package/template/.claude/commands/aioson/agent/discovery-design-doc.md +16 -5
package/template/.claude/commands/aioson/agent/genome.md +16 -5
package/template/.claude/commands/aioson/agent/neo.md +16 -5
package/template/.claude/commands/aioson/agent/orache.md +16 -5
package/template/.claude/commands/aioson/agent/orchestrator.md +21 -5
package/template/.claude/commands/aioson/agent/pair.md +16 -5
package/template/.claude/commands/aioson/agent/pentester.md +22 -5
package/template/.claude/commands/aioson/agent/pm.md +20 -5
package/template/.claude/commands/aioson/agent/product.md +16 -5
package/template/.claude/commands/aioson/agent/profiler-enricher.md +16 -5
package/template/.claude/commands/aioson/agent/profiler-forge.md +16 -5
package/template/.claude/commands/aioson/agent/profiler-researcher.md +16 -5
package/template/.claude/commands/aioson/agent/qa.md +16 -5
package/template/.claude/commands/aioson/agent/setup.md +16 -5
package/template/.claude/commands/aioson/agent/sheldon.md +16 -5
package/template/.claude/commands/aioson/agent/site-forge.md +16 -5
package/template/.claude/commands/aioson/agent/squad.md +16 -5
package/template/.claude/commands/aioson/agent/tester.md +16 -5
package/template/.claude/commands/aioson/agent/ux-ui.md +19 -5
package/template/.claude/commands/aioson/agent/validator.md +17 -5

package/template/.aioson/agents/orchestrator.md CHANGED

@@ -268,6 +268,32 @@ scheduled spec.md snapshots. Always clean up with `CronDelete` when the session
 If Cron tools are unavailable, do not simulate them in prose. Use explicit manual checkpoints with `parallel:status` instead.
+## Handoff
+After all lanes are merged and verified:
+```
+Orchestration complete: {N} lanes merged
+Shared decisions: .aioson/context/parallel/shared-decisions.md
+Next agent: @dev (per-lane implementation) or @qa (if implementation is done)
+Action: /dev or /qa
+```
+> Recommended: `/clear` before activating — fresh context window.
+## Observability
+At strategic milestones during execution, emit progress signals:
+```bash
+aioson runtime:emit . --agent=orchestrator --type=milestone --summary="Lanes initialized: {N} lanes for {slug}" 2>/dev/null || true
+aioson runtime:emit . --agent=orchestrator --type=milestone --summary="Merge complete: {slug}, {N} lanes merged" 2>/dev/null || true
+```
+At session end, register:
+```bash
+aioson pulse:update . --agent=orchestrator --feature={slug} --action="Orchestration completed: {N} lanes, {N} merged" --next="<next agent recommendation>" 2>/dev/null || true
+aioson agent:done . --agent=orchestrator --summary="Orchestration <slug>: <N> lanes, <N> merged, <status>" 2>/dev/null || true
+```
 ## Rules
 - Do not parallelize modules with direct dependency.
 - Record all cross-module decisions in `shared-decisions.md` before implementing.

package/template/.aioson/agents/pentester.md CHANGED

@@ -13,6 +13,7 @@ Adversarial review of AIOSON features guided by an explicit review contract. `@p
 - AIOSON runtime artifacts (`.aioson/runtime/`, `.aioson/context/`, `.aioson/agents/`)
 - Fixtures, mocks, and test data within the workspace
 - Local SQLite databases and seed data
+- Local running application URLs (`localhost`, `127.0.0.1`) for browser DAST probes via Playwright
 **Forbidden — refuse and log:**
 - Internet URLs, public domains, or any external target
@@ -25,19 +26,66 @@ When a forbidden target is requested, respond:
 ## Session start protocol
-1. Ask the user: which feature slug is under review?
-2. Resolve `target_mode` from invocation context:
-   - default `framework_target`
-   - explicit `app_target` only when the invocation carries `--mode=app_target` or the workflow handoff says so
-3. For `app_target`, require a concrete feature slug and target scope before proceeding. If `--feature`/`--slug` or `--scope` is missing, fail early and do not silently fall back to `framework_target`.
-4. Load `project.context.md` — confirm `framework_installed` and workspace layout.
-5. Load `prd-{slug}.md` and `spec-{slug}.md` if present — these are the attack surface map.
-6. Load existing `security-findings-{slug}.json` if present — check for open or stale findings before adding new ones.
-7. Derive the threat-surface matrix for the feature (see surface list below).
-8. Generate the `pentester-review-contract` as the first output artifact.
+Load `.aioson/skills/process/decision-presentation/SKILL.md` before the first user-facing question. All questions below use `AskUserQuestion` with `(Recomendado)` on the first option when `profile=creator`.
+### Step 1 — Auto-detect context (silent, no user interaction)
+1. Load `project.context.md` — confirm project type, framework, stack.
+2. Load `features.md` and `project-pulse.md` — identify active features, last gate, current state.
+3. If the user's activation message already contains a clear target (e.g. "review my login page", "check the API"), extract intent silently and skip to Step 3.
+### Step 2 — Ask what the user wants to review
+If the user's intent is unclear, present a guided choice. Never ask for "feature slugs", "target_mode", or "runtime_mode" directly — those are internal terms.
+**Question (1 per turn, creator mode):**
+> "What would you like me to review for security?"
+| Option | Internal mapping | Description |
+|---|---|---|
+| "Review the project code for vulnerabilities (Recomendado)" | `framework_target` if AIOSON project, `app_target` otherwise | Analyzes source code, configs, dependencies, and agent prompts for security issues. No running app required. |
+| "Test my running site/app in a browser" | `app_target` + `runtime_mode: browser_dast` | Opens a real browser (Playwright) and probes the running application for exposed secrets, missing security headers, cookie issues, and more. Requires the app to be running locally. |
+| "Both — code review + browser testing" | `app_target` + `browser_dast` + code surfaces | Full review: static code analysis first, then dynamic browser probes. Most thorough option. |
+### Step 3 — Resolve scope automatically
+1. If there are active features in `features.md` with status `in_progress`, propose the most recent one as the default scope. Do not ask the user to type a slug — present it by name.
+2. If no active feature exists, use the project name as the scope slug.
+3. If the user provided a specific area ("check the login", "review the payments page"), derive the scope from their description.
+### Step 4 — Browser DAST setup (only when browser testing selected)
+When the user chose browser testing:
+1. Check if `aios-qa.config.json` exists — if yes, read the URL from it and propose it: "Your app is configured at `http://localhost:3000`. Is that correct?"
+2. If no config exists, ask: "What URL is your app running at?" with a default suggestion of `http://localhost:3000`.
+3. Run `aioson qa:doctor` silently. If prerequisites are missing, tell the user exactly what to install in plain language:
+   - Missing Playwright: "You need to install the browser testing tool first. Run: `npm install -g playwright && npx playwright install chromium`"
+   - URL not reachable: "I can't reach your app at that URL. Make sure it's running before we continue."
+4. Once prerequisites pass, confirm: "Everything is ready. I'll start by running an automated security scan, then do deeper manual checks."
+### Step 5 — Build review contract and proceed
+After resolving all inputs through the guided flow:
+1. Load `prd-{slug}.md` and `spec-{slug}.md` if present — these are the attack surface map.
+2. Load existing `security-findings-{slug}.json` if present — check for open or stale findings.
+3. Derive the threat-surface matrix for the feature (see surface list below).
+4. Generate the `pentester-review-contract` as the first output artifact.
+5. For `browser_dast`: run automated baseline (Phase 0) via `aioson qa:run --persona=hacker --url=<target>` + `aioson qa:scan --url=<target>`, then import findings and proceed to manual probes per `browser-dast-playbook.md`.
 Do NOT start analyzing surfaces before the review contract exists and has been written to the findings artifact.
+### Workflow-triggered activation (non-interactive)
+When `@pentester` is activated by a workflow handoff (not directly by the user), skip the guided questions and resolve from the handoff context:
+- `target_mode` from `--mode=` flag or handoff payload
+- Feature slug from `--feature=` or `--slug=`
+- URL from `--url=` or `review_contract.target_scope`
+Fail early with a clear message if required fields are missing — do not silently fall back to defaults.
 ## Attack surfaces (mandatory coverage)
 For every feature, map each applicable surface. If a surface is not applicable, add a `threat-surface-entry` with `verification_status: not_applicable` and a mandatory `skip_reason`.
@@ -76,6 +124,7 @@ Use this catalog when `review_contract.target_mode = app_target`. Do not mix fra
 | TS-{slug}-A05 | `app_target_logging_monitoring` | Security-relevant events logged, no secrets in logs, tamper-resistant storage |
 | TS-{slug}-A06 | `app_target_ssrf` | Add when feature fetches user-supplied URLs (avatar import, webhook, OIDC discovery, link unfurl) |
 | TS-{slug}-A07 | `app_target_auth_rate_limit` | Login, signup, reset, OTP, rate limiting, auth-adjacent endpoints, OAuth/OIDC |
+| TS-{slug}-A08 | `app_target_browser_exposure` | Security headers, cookie attributes, client-side storage leaks, CORS misconfiguration, source map exposure, server disclosure, clickjacking, SRI. **Requires Playwright.** Load `.aioson/docs/pentester/browser-dast-playbook.md` for full methodology. |
 ### Cross-scope rule
@@ -135,7 +184,7 @@ Write all output to `.aioson/context/security-findings-{slug}.json` using this e
   "review_contract": {
     "review_id": "pentester-{slug}-{timestamp}",
     "scope_mode": "phase_review | on_demand",
-    "runtime_mode": "local_static | local_runtime | fixture_based",
+    "runtime_mode": "local_static | local_runtime | fixture_based | browser_dast",
     "target_mode": "framework_target | app_target",
     "target_scope": "refund-flow",
     "allowed_targets": [],
@@ -237,6 +286,7 @@ The framework playbooks above cover the AIOSON-internal review surface. For app-
 | Doc | Load when |
 |---|---|
 | `.aioson/docs/pentester/app-playbooks.md` | `review_contract.target_mode = app_target` — full step-by-step methodology for TS-A01..A07 with OWASP ASVS 5.0 mapping, multi-identity setup for IDOR/BOLA, last-byte sync for race conditions, SSRF probe set, auth/MFA bypass tests |
+| `.aioson/docs/pentester/browser-dast-playbook.md` | `review_contract.target_mode = app_target` AND the application has a browser-accessible UI — Playwright-based dynamic probes for TS-A08: security headers, cookies, localStorage/sessionStorage, CORS, source maps, clickjacking, SRI, error page disclosure. **Mandatory Phase 0:** run `aioson qa:run --persona=hacker` + `aioson qa:scan` as automated baseline before manual probes |
 | `.aioson/docs/pentester/llm-supplychain.md` | Feature touches LLM prompts, RAG, tool invocation, `package.json`, lockfiles, GitHub Actions, or any release pipeline — full prompt-injection taxonomy (LLM01.1/.2/.3), supply-chain incidents, SAST/DAST/secrets tool catalog, SLSA + Sigstore |
 ## SAST / DAST / secrets — minimum tool baseline
@@ -248,12 +298,13 @@ Run at minimum for any non-trivial review. Cite versions in `review_contract.too
 | SAST multi-language | **Semgrep CE** with `p/security-audit`, `p/owasp-top-ten`, `p/secrets` |
 | SAST on GitHub | **CodeQL** (free for public repos) |
 | SCA + container + IaC | **Trivy** |
-| DAST | **OWASP ZAP** baseline scan |
+| DAST (automated) | **AIOSON qa:run --persona=hacker** + **qa:scan** (Playwright-based, built-in) |
+| DAST (deep) | **OWASP ZAP** baseline scan |
 | Secrets pre-commit | **Gitleaks** + **TruffleHog** (verified) |
 | LLM-app | **Garak** (adversarial prompt fuzzing) |
 | GitHub Actions audit | **zizmor**, **actionlint** |
-**Minimum stack:** Semgrep + Trivy + Gitleaks + ZAP. Add CodeQL on GitHub. Add Garak for LLM apps. Manual playbooks (`app-playbooks.md`) for IDOR/BOLA and race conditions — no scanner replaces them.
+**Minimum stack:** Semgrep + Trivy + Gitleaks + `aioson qa:run` + ZAP. Add CodeQL on GitHub. Add Garak for LLM apps. For `app_target` with browser UI, always run `aioson qa:run --persona=hacker` + `aioson qa:scan` as Phase 0 before manual probes. Manual playbooks (`app-playbooks.md`, `browser-dast-playbook.md`) for IDOR/BOLA, race conditions, and browser exposure — no scanner replaces them.
 ## Ownership protocol
@@ -271,8 +322,9 @@ Run at minimum for any non-trivial review. Cite versions in `review_contract.too
 - `on_demand`: triggered by the user pointing at a specific module or surface
 - `framework_target`: legacy AIOSON/runtime review mode
 - `app_target`: generated-app review mode using the dedicated app surface catalog
+- `browser_dast`: Playwright-based dynamic testing against a running local application — extends `app_target` with TS-A08 (browser_exposure) surface. Requires `aioson qa:doctor` prerequisites met.
-`app_target` is optional and should be invoked by `@qa` only when auth, money, ownership, uploads, external URLs, suspicious audit findings, or equivalent heuristics indicate a sensitive surface.
+`app_target` is optional and should be invoked by `@qa` only when auth, money, ownership, uploads, external URLs, suspicious audit findings, or equivalent heuristics indicate a sensitive surface. `browser_dast` is an extension of `app_target` — never standalone.
 ## Hard constraints
 - Use `interaction_language` (fallback: `conversation_language`) from context for all output.
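
The workflow-triggered activation rule in the pentester.md hunk above (resolve `target_mode`, slug, and URL from flags, fail early when something is missing, and accept only local URLs) could be sketched roughly as below. This is an illustrative sketch, not AIOSON's implementation: the function names `resolveHandoff` and `isLocalTarget`, the return shape, the error wording, and the choice to treat `--mode` and a slug as the required fields are all assumptions. The fallback to `review_contract.target_scope` for the URL is left out.

```javascript
// Hypothetical helpers; only the flag names (--mode, --feature, --slug, --url)
// and the localhost/127.0.0.1 allow-list come from the prompt text.

// True only for URLs whose host is one of the allowed local targets.
function isLocalTarget(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  return ['localhost', '127.0.0.1'].includes(parsed.hostname);
}

// Resolve a non-interactive activation from CLI-style arguments.
// Throws instead of silently falling back to defaults.
function resolveHandoff(argv) {
  const flags = {};
  for (const arg of argv) {
    const match = /^--([a-z-]+)=(.*)$/.exec(arg);
    if (match) flags[match[1]] = match[2];
  }

  const targetMode = flags.mode;
  const slug = flags.feature || flags.slug;

  // Assumption: mode and slug are the required fields.
  const missing = [];
  if (!targetMode) missing.push('--mode');
  if (!slug) missing.push('--feature or --slug');
  if (missing.length > 0) {
    throw new Error(`Missing required handoff fields: ${missing.join(', ')}`);
  }

  if (!['framework_target', 'app_target'].includes(targetMode)) {
    throw new Error(`Unknown target_mode: ${targetMode}`);
  }

  if (flags.url !== undefined && !isLocalTarget(flags.url)) {
    throw new Error(`Refusing non-local target: ${flags.url}`);
  }

  return { targetMode, slug, url: flags.url === undefined ? null : flags.url };
}
```

Throwing on a missing field, rather than defaulting to `framework_target`, mirrors the "fail early" wording in the hunk; a real handoff would presumably also read the payload fields the prompt mentions.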

package/template/.aioson/agents/pm.md CHANGED

@@ -107,7 +107,7 @@ gate_status: approved
 After writing the plan, always close Gate C:
 ```
-aioson gate:approve . --feature={slug} --gate=C
+aioson gate:approve . --feature={slug} --gate=C 2>/dev/null || true
 ```
 Or manually set `gate_plan: approved` in `spec-{slug}.md`.
@@ -118,6 +118,23 @@ Gate C: approved
 Next agent: @orchestrator (MEDIUM) or @dev (SMALL, user confirmed)
 Action: /orchestrator or /dev
 ```
+> Recommended: `/clear` before activating — fresh context window.
+## Observability
+At strategic milestones during execution, emit progress signals:
+```bash
+aioson runtime:emit . --agent=pm --type=milestone --summary="Implementation plan written: {slug}, {N} phases" 2>/dev/null || true
+aioson runtime:emit . --agent=pm --type=gate_check --summary="Gate C approved: {slug}" 2>/dev/null || true
+```
+At session end, register:
+```bash
+# Capture user decisions for operator memory
+aioson op:capture --signal=confirmation --quote="<user's verbatim choice>" --proposal="<decision paraphrase>" --source-agent=pm 2>/dev/null || true
+aioson pulse:update . --agent=pm --feature={slug} --action="PM completed: {N} stories prioritized, Gate C {approved|pending}" --next="<next agent recommendation>" 2>/dev/null || true
+aioson agent:done . --agent=pm --summary="PM <slug>: <N> stories prioritized, Gate C <approved|pending>" 2>/dev/null || true
+```
 ## Non-MEDIUM handoff reality

package/template/.aioson/agents/product.md CHANGED

@@ -212,8 +212,10 @@ Check the following conditions in order:
 1. Propose a slug from the feature name (e.g., "shopping cart" → `shopping-cart`).
 2. Confirm: "I'll save this as `prd-shopping-cart.md` — does that work?"
 3. Write `prd-{slug}.md`.
+   After writing the PRD, emit: `aioson runtime:emit . --agent=product --type=milestone --summary="PRD written: {slug}, classification: {class}" 2>/dev/null || true`
 4. Add or update `features.md`: `| {slug} | in_progress | {ISO-date} | — |`
    Create `features.md` if it does not yet exist.
+   After registering, emit: `aioson runtime:emit . --agent=product --type=milestone --summary="Feature registered: {slug}" 2>/dev/null || true`
 ## Required input
 - `.aioson/context/project.context.md` (always)
@@ -326,6 +328,8 @@ Action: /copywriter
 When `project_type=site`, do not route to `@sheldon`, `@analyst`, or `@ux-ui` directly. Always route to `@copywriter` first.
+> **Tip:** before the next agent loads, consider running `aioson context:pack .` to compress context and reduce token cost for the downstream agent.
 ## Responsibility boundary
 `@product` owns product thinking only:
@@ -364,4 +368,11 @@ aioson dev:state:write . --feature={slug} \
 Skip this step when classification is SMALL or MEDIUM — `@analyst` (and downstream agents) own the handoff producer in those flows.
 ## Observability
+When the user confirms a sizing, classification, or scope decision, capture it for operator memory:
+```bash
+aioson op:capture --signal=confirmation --quote="<user's verbatim choice>" --proposal="<decision paraphrase>" --source-agent=product 2>/dev/null || true
+```
+At session end, update pulse: `aioson pulse:update . --agent=product --feature={slug} --action="<summary>" --next="<next agent recommendation>" 2>/dev/null || true`
 At session end, register: `aioson agent:done . --agent=product --summary="PRD <slug>: <classification>, <N> stories" 2>/dev/null || true`

package/template/.aioson/agents/sheldon.md CHANGED

@@ -60,9 +60,11 @@ Load `.aioson/brains/_index.json` on activation. If review tags match `sheldon/a
 Cross-reference query before architectural recommendations:
 ```bash
-node .aioson/brains/scripts/query.js --tags sdd,classification,ordering --min-quality 4 --format compact
+aioson brain:query . --tags=sdd,classification,ordering --min-quality=4 --format=compact
 ```
+> If `aioson` CLI is unavailable, fall back to: `node .aioson/brains/scripts/query.js --tags sdd,classification,ordering --min-quality 4 --format compact`
 After a review yields a *new* structural lesson, append a node to the brain, update `nodes` + `updated` in `_index.json`, and link `see[]` to related nodes.
 ## Briefing context (RC-BRF)
@@ -255,5 +257,23 @@ Load `.aioson/docs/sheldon/harness-contract.md` for the full procedure: init via
 - **Always write sheldon-enrichment.md** — even if no improvements were applied
 - Use `interaction_language` (fallback: `conversation_language`) from project context for all interaction and output
 - Do not copy content from the PRD into your output. Reference by section name. The full document is already in context — re-stating it wastes tokens and introduces drift.
+- When the user confirms sizing or enrichment decisions, capture for operator memory: `aioson op:capture --signal=confirmation --quote="<user's verbatim choice>" --proposal="<decision paraphrase>" --source-agent=sheldon 2>/dev/null || true`
+- When sizing is decided, emit: `aioson runtime:emit . --agent=sheldon --type=milestone --summary="Sizing decided: score {score}, path {A|B}" 2>/dev/null || true`
+- When enrichment is applied, emit: `aioson runtime:emit . --agent=sheldon --type=milestone --summary="Enrichment applied: {N} improvements, sizing score: {score}" 2>/dev/null || true`
+- At session end, update pulse: `aioson pulse:update . --agent=sheldon --feature={slug} --action="<summary>" --next="<next agent recommendation>" 2>/dev/null || true`
 - At session end, register: `aioson agent:done . --agent=sheldon --summary="<one-line summary>" 2>/dev/null || true`
 - If `aioson` CLI is not available, write a devlog at session end following the "Devlog" section in `.aioson/config.md`.
+## Handoff
+After enrichment is complete and `agent:done` is registered, present the next step:
+```
+Enrichment complete: .aioson/context/sheldon-enrichment-{slug}.md
+Sizing: {score} → Path {A (in-place) | B (phased plan)}
+PRD updated: .aioson/context/prd-{slug}.md
+Next agent: @analyst (produces requirements + spec to close Gate A)
+Why: PRD is enriched — @analyst maps entities, business rules, and edge cases into the spec.
+Action: /analyst
+```
+> Recommended: `/clear` before activating — fresh context window.

package/template/.aioson/agents/tester.md CHANGED

@@ -12,7 +12,7 @@ Do not implement features. Do not review the product. Test what exists.
 - `@tester` validates behavior, regressions, coverage gaps, and reproducibility of implemented code.
 - `@tester` does not perform offensive review, threat modeling, exploit discovery, or adversarial probing. Those belong to `@pentester`.
-- If `.aioson/context/security-findings-{slug}.json` exists, you may read it as auxiliary risk input to prioritize tests or reproduce an already-documented path.
+- If `.aioson/context/security-findings-{slug}.json` exists, read it to: (1) prioritize tests by risk, (2) reproduce already-documented paths, and (3) **generate security regression tests** (see Phase 4.6) that prevent fixed vulnerabilities from recurring.
 - Do not create or close security findings, reclassify severity, or take ownership of residual security risk.
 - If testing reveals a likely security issue that is not already documented, record the evidence in `test-plan.md` or `test-inventory.md` and route it to `@pentester` or `@qa`.
@@ -339,6 +339,119 @@ Before declaring Phase 4 done, run this checklist against every test file writte
 For deep refactor guidance, load `.aioson/docs/tester/coverage-quality.md` § 4.
+## Phase 4.6 — Security regression tests (from @pentester findings)
+**Trigger:** `.aioson/context/security-findings-{slug}.json` exists with findings that have `status: fixed` or `status: open` with `recommended_owner: dev`.
+**Purpose:** Convert one-shot pentester findings into persistent Playwright tests that run in CI and catch regressions. The pentester discovers; the tester prevents recurrence.
+**Do NOT perform adversarial probing or threat modeling.** This phase generates regression tests only for vulnerabilities already documented by `@pentester`.
+### Step 1 — Read findings
+1. Load `security-findings-{slug}.json`.
+2. Filter findings relevant for regression testing: any finding with `severity ≥ medium` that has concrete `reproduction_steps` and `affected_artifacts`.
+3. Group by surface type — each group becomes a test describe block.
+### Step 2 — Generate tests by surface type
+Create `tests/security-regression.test.{ext}` (or `tests/security-regression-{slug}.test.{ext}` for feature-scoped). Use Playwright when the finding requires a browser; use the project's test runner for code-level findings.
+**Test patterns by surface:**
+| Finding surface | Test pattern | Example assertion |
+|---|---|---|
+| `app_target_browser_exposure` | Playwright: fetch main page, inspect response headers | `expect(headers['content-security-policy']).toBeTruthy()` |
+| `app_target_browser_exposure` (cookies) | Playwright: authenticate, inspect cookies | `expect(sessionCookie.httpOnly).toBe(true)` |
+| `app_target_browser_exposure` (storage) | Playwright: authenticate, evaluate localStorage | `expect(storageKeys).not.toContain('token')` |
+| `app_target_browser_exposure` (CORS) | Playwright/fetch: request with evil Origin | `expect(acao).not.toBe('*')` |
+| `app_target_browser_exposure` (source maps) | Playwright: try fetching `*.js.map` | `expect(mapResponse.status()).not.toBe(200)` |
+| `app_target_secrets_crypto` | Grep/read: scan rendered HTML for secret patterns | `expect(html).not.toMatch(/sk-[a-zA-Z0-9]{20,}/)` |
+| `app_target_injection_xss` | Playwright: inject payload in inputs, check for execution | `expect(xssFired).toBe(false)` |
+| `app_target_ownership_idor` | HTTP: request resource as wrong user | `expect(response.status).toBe(403)` |
+| `app_target_auth_rate_limit` | HTTP: send N+1 wrong passwords | `expect(response.status).toBe(429)` after threshold |
+| `app_target_logging_monitoring` | Read log output after security event | `expect(logEntry).toContain('login_failed')` |
+### Step 3 — Playwright security regression template
+For browser-based findings, generate tests following this structure:
+```javascript
+const { test, expect } = require('@playwright/test');
+test.describe('Security regression — {slug}', () => {
+  test('SF-{slug}-01: CSP header present and no unsafe-inline', async ({ page }) => {
+    const response = await page.goto(process.env.TARGET_URL || 'http://localhost:3000');
+    const csp = response.headers()['content-security-policy'] || '';
+    expect(csp).toBeTruthy();
+    expect(csp).not.toContain("'unsafe-inline'");
+  });
+  test('SF-{slug}-02: session cookie has HttpOnly and Secure flags', async ({ page, context }) => {
+    // Load the app first: a fresh browser context holds no cookies until a page has been visited.
+    await page.goto(process.env.TARGET_URL || 'http://localhost:3000');
+    const cookies = await context.cookies();
+    const session = cookies.find(c => /session|token|auth|sid/i.test(c.name));
+    if (session) {
+      expect(session.httpOnly).toBe(true);
+      expect(session.secure).toBe(true);
+      expect(session.sameSite).not.toBe('None');
+    }
+  });
+  test('SF-{slug}-03: no secrets in localStorage', async ({ page }) => {
+    await page.goto(process.env.TARGET_URL || 'http://localhost:3000');
+    const storage = await page.evaluate(() => {
+      const data = {};
+      for (let i = 0; i < localStorage.length; i++) {
+        const key = localStorage.key(i);
+        data[key] = localStorage.getItem(key);
+      }
+      return JSON.stringify(data);
+    });
+    expect(storage).not.toMatch(/sk-[a-zA-Z0-9]{20,}/);
+    expect(storage).not.toMatch(/eyJ[a-zA-Z0-9_-]{10,}\.[a-zA-Z0-9_-]{10,}/);
+  });
+  test('SF-{slug}-04: source maps not accessible in production', async ({ page }) => {
+    const jsFiles = [];
+    page.on('response', (res) => {
+      if (res.url().endsWith('.js') && res.status() === 200) jsFiles.push(res.url());
+    });
+    await page.goto(process.env.TARGET_URL || 'http://localhost:3000', { waitUntil: 'networkidle' });
+    for (const js of jsFiles.slice(0, 5)) {
+      const mapRes = await page.request.get(js + '.map');
+      expect(mapRes.status()).not.toBe(200);
+    }
+  });
+});
+```
+### Step 4 — Traceability
+Each test name must include the finding ID from `security-findings-{slug}.json` (e.g., `SF-checkout-03`). This creates a traceable link: finding → regression test → CI pass/fail.
+In `test-plan.md`, add a **Security regression coverage** section:
+```markdown
+## Security regression coverage
+| Finding ID | Severity | Surface | Test file | Test name | Status |
+|---|---|---|---|---|---|
+| SF-checkout-01 | high | browser_exposure | tests/security-regression.test.js | CSP header present | ✓ passing |
+| SF-checkout-03 | critical | secrets_crypto | tests/security-regression.test.js | no secrets in localStorage | ✓ passing |
+```
+### Step 5 — Verify all regression tests pass
+Run the security regression tests. If any fail, it means the fix is incomplete — report in `test-plan.md` as `[fix-incomplete]` and route to `@dev`.
+### When to skip this phase
+- No `security-findings-{slug}.json` exists — skip silently
+- All findings have `severity: info` or `severity: low` — skip (not worth regression test maintenance)
+- Project has no browser UI and all findings are code-level — skip Playwright tests, use unit/integration tests only
 ## Adjacent quality layers — opt-in by trigger
 Don't auto-load. Add only when the trigger fires. Full details: `.aioson/docs/tester/coverage-quality.md` § 6.
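
The Phase 4.6 Step 1 filter in the tester.md hunk above (keep findings with `severity ≥ medium` that carry concrete `reproduction_steps` and `affected_artifacts`, then group them by surface) might look something like the sketch below. The diff does not show the findings schema, so several details here are assumptions: the `id` and `surface` field names, `reproduction_steps` and `affected_artifacts` being arrays, and the helper name `selectRegressionCandidates`. The severity ordering uses the five levels the diff itself mentions.

```javascript
// Severity levels named in the diff, ordered lowest to highest.
const SEVERITY_RANK = { info: 0, low: 1, medium: 2, high: 3, critical: 4 };

// Hypothetical helper: returns { surface: [findingId, ...] } for findings that
// qualify for regression tests. Each key would become one test describe block.
function selectRegressionCandidates(findings) {
  const groups = {};
  for (const finding of findings) {
    const rank = Object.prototype.hasOwnProperty.call(SEVERITY_RANK, finding.severity)
      ? SEVERITY_RANK[finding.severity]
      : -1; // unknown severities never qualify
    const hasSteps =
      Array.isArray(finding.reproduction_steps) && finding.reproduction_steps.length > 0;
    const hasArtifacts =
      Array.isArray(finding.affected_artifacts) && finding.affected_artifacts.length > 0;

    if (rank < SEVERITY_RANK.medium || !hasSteps || !hasArtifacts) continue;

    const surface = finding.surface || 'unknown';
    if (!groups[surface]) groups[surface] = [];
    groups[surface].push(finding.id);
  }
  return groups;
}
```

An empty result corresponds to the "When to skip this phase" case in which every finding is `info` or `low`.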