npm - @brunosps00/dev-workflow - Versions diffs - 0.9.0 → 0.10.0 - Mend

@brunosps00/dev-workflow 0.9.0 → 0.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (72) hide show

package/scaffold/skills/dw-simplification/references/behavior-preserving.md ADDED Viewed

@@ -0,0 +1,148 @@
+# Behavior-preserving refactor — the test gate that protects you
+The promise of simplification: the code is cleaner; nothing observable changed. The risk: you broke a subtle behavior nobody had a test for.
+## The test gate
+Before simplifying ANY non-trivial code, ensure tests cover what you're about to change. Three cases:
+### Case A — tests exist and pass
+Run them. They pass before, they pass after. Done.
+### Case B — tests exist but are incomplete
+Run them. They pass before. Apply your simplification. They still pass — but you don't trust them to catch your specific concern.
+Action: write a characterization test FIRST that pins the behavior you're worried about. Then simplify.
+### Case C — no tests at all
+Don't simplify. The "preserve behavior exactly" claim is unverifiable. Either:
+- Write characterization tests first (see below), then simplify.
+- Or open a separate task ("add tests for X") and don't touch the code now.
+## Characterization tests
+A characterization test documents what the code currently does, without claiming whether that's correct. It's a freeze of behavior at a point in time.
+Pattern:
+```javascript
+// Characterization test — not asserting CORRECTNESS,
+// asserting BEHAVIOR-AT-TIME-OF-WRITING.
+describe('legacyFormatter (characterization)', () => {
+  it('handles empty input by returning "(none)"', () => {
+    expect(legacyFormatter('')).toBe('(none)');
+  });
+  it('preserves trailing whitespace in output', () => {
+    expect(legacyFormatter('hello ')).toBe('hello ');
+  });
+  it('returns null when input is undefined', () => {
+    expect(legacyFormatter(undefined)).toBe(null);
+  });
+});
+```
+Goal: capture the WEIRD behaviors. Boring behaviors don't need tests; nobody breaks them. The weird parts are the load-bearing ones.
+## How to find weird behaviors to characterize
+1. Run the function with a range of inputs (empty, null, undefined, very long, special chars, edge values).
+2. Note any output that surprised you. That's a weird behavior.
+3. Write a test that pins it.
+Example session:
+```bash
+node -e "console.log(JSON.stringify(require('./src/legacy').format('')))"
+# Output: "(none)"   ← weird, document it.
+node -e "console.log(JSON.stringify(require('./src/legacy').format('hello ')))"
+# Output: "hello "   ← preserves trailing whitespace? unusual, document.
+node -e "console.log(JSON.stringify(require('./src/legacy').format(undefined)))"
+# Output: null      ← returns null not "(none)" for undefined? document.
+```
+Tests written; now you can simplify. If after your refactor any of these tests fail, you broke behavior — even if your version "looks better".
+## Refactor with the test gate
+```
+1. Run tests → GREEN (baseline)
+2. Apply simplification (one logical change)
+3. Run tests → expect GREEN
+   - If RED → revert, understand why, retry differently
+4. Commit (atomic — one simplification per commit)
+5. Repeat for next simplification
+```
+DO NOT batch multiple simplifications into one commit. If something breaks, you can't bisect which change caused it.
+## Rollback patterns
+If the simplification went wrong AND the issue surfaces only post-merge:
+```bash
+git revert <simplification-sha>
+```
+Atomic commits make this clean. The revert is one commit; tests pass again; you can investigate at leisure.
+If multiple simplifications were stacked and one broke:
+```bash
+# Find which one
+git bisect start
+git bisect bad HEAD
+git bisect good <known-good-sha>
+# Run tests at each step
+```
+Bisect requires that EACH commit be testable in isolation — another reason for atomic commits.
+## Codemod tooling per language
+For refactors >500 lines, manual edits are dangerous. Use AST-based codemods:
+### TypeScript / JavaScript
+- **jscodeshift** — Facebook's AST-based codemod runner. Many community codemods at `npmjs.com/package/<framework>-codemods`.
+- **ts-morph** — programmatic TypeScript AST manipulation. Good for one-off refactors.
+- **ESLint --fix** — for any rule that has a fixer (most of them); narrow scope.
+```bash
+# Run a community codemod
+npx jscodeshift -t ./codemods/extract-method.js src/
+```
+### Python
+- **libcst** — Instagram's concrete syntax tree library. Preserves comments + formatting.
+- **ast** module + custom walker for simpler cases.
+- **Ruff --fix** for known auto-fixable rules.
+### C# / .NET
+- **Roslyn analyzers + code fixes** — write a custom analyzer with a fix provider; runs across the solution.
+- **Visual Studio Refactor menu** for IDE-driven multi-file renames/extractions.
+### Rust
+- **rust-analyzer** has refactor support: rename, extract function, inline variable.
+- **rustfmt + clippy --fix** for style/idiom-level fixes.
+## Don't
+- Don't simplify and update tests in the same commit. The test changes hide what you changed in the code.
+- Don't consider lint passing as "behavior preserved". Lint catches syntax; tests catch behavior.
+- Don't trust "no test failed therefore behavior preserved" if test coverage is <60%. Add tests first.
+- Don't pre-emptively simplify for a future hypothetical maintainer. Simplify when you have a present reason (a bug, a read-time confusion).
+## When the refactor is too risky
+If steps 2-3 of the test gate consistently fail (you can't pin the behavior, or your simplification keeps breaking subtle things):
+- The code is genuinely complex for a reason; leave it.
+- Or: rewrite-with-tests as a separate, larger task — not in-flight simplification.
+The right call sometimes is "this code is ugly but works; ship around it; revisit when you have hours, not minutes."

package/scaffold/skills/dw-simplification/references/chestertons-fence.md ADDED Viewed

@@ -0,0 +1,152 @@
+# Chesterton's Fence — understand WHY before changing
+> "If you find a fence in the middle of a field, don't tear it down until you know why it was put there." — G.K. Chesterton (paraphrased)
+In code, the "fence" is anything that looks unnecessary on first read. Most code that looks unnecessary is actually load-bearing for a case the casual reader hasn't encountered.
+## The protocol
+Before removing or rewriting anything:
+### 1. What does it actually do?
+Don't trust the name. Names lie. Names get repurposed. Read the body. Trace the call graph.
+If you can't say in one sentence what the code does (not what it's named, what it DOES), you don't yet understand it.
+### 2. Why was it added?
+```bash
+# When did this line/block enter the codebase?
+git log -L <start>,<end>:<file> --oneline | head -5
+# Or for a whole file
+git log --follow --oneline <file> | head -10
+# Read the introducing commit
+git show <commit-hash>
+```
+The commit message + PR title + linked issue often have the rationale. Common findings:
+- "Fix race condition when X happens during Y" — the seemingly redundant check is the fix.
+- "Workaround for [framework] bug #1234" — the weird code is a workaround.
+- "Required for [external system] compatibility" — the cast/wrapper exists for a reason.
+- "TODO: simplify after we deprecate Y" — there's an explicit plan, follow it.
+### 3. What breaks if it's gone?
+```bash
+# Find callers
+rg -F "<symbol>" --type <lang>
+# Run the test suite
+<test-runner>
+```
+If tests pass with the code removed, it might be safe — but tests have gaps. Cross-check with:
+- Search the whole codebase for the symbol/string
+- Check `.dw/intel/files.json` for `imports` referencing it
+- For exported symbols, check downstream consumers (other repos, npm, etc.)
+If even one of the three answers is "I don't know", DON'T REMOVE.
+## Case studies
+### Case 1 — "Redundant" early return
+Code:
+```python
+def process(item):
+    if item is None:
+        return
+    if item.id is None:
+        return
+    # ...rest...
+```
+Looks like the second `if` could merge into the first via `if item is None or item.id is None`. But:
+- `item is None` is a system error (caller passed None).
+- `item.id is None` is a domain state (item not yet persisted).
+The two cases warrant different downstream behavior in the future (e.g., logging the second but not the first). Merging them collapses the distinction. Chesterton's Fence: leave alone unless you also remove the future flexibility intentionally.
+### Case 2 — "Useless" type cast
+Code:
+```typescript
+const value = (someApi.fetch() as unknown) as MyType;
+```
+Looks like `as MyType` would suffice. But the double-cast was added because the API's actual return type is incompatible with `MyType` and a direct cast errors. Removing the `as unknown` step breaks compilation.
+The fix isn't to remove the double-cast; it's to either fix `MyType` (if the API is right) or fix the API typings (if `MyType` is right). Chesterton's Fence: read the commit message — usually says "compiler workaround for incompatible types".
+### Case 3 — "Duplicated" validation
+Code:
+```javascript
+function createUser(payload) {
+  if (!payload.email) throw new ValidationError('email required');
+  // ...
+}
+function updateUser(id, payload) {
+  if (!payload.email) throw new ValidationError('email required');
+  // ...
+}
+```
+Looks like duplication; "extract" to `validatePayload(payload)`. But:
+- `createUser` requires email.
+- `updateUser` may eventually allow email-less updates (e.g., just change name).
+Today they happen to share validation; tomorrow they might not. The "duplication" preserves independence of evolution. Don't extract a shared validator until at least 3 callers exist with truly identical rules.
+### Case 4 — "Dead" branch
+Code:
+```typescript
+if (process.env.LEGACY_MIGRATION_MODE === 'true') {
+  return legacyTransform(input);
+}
+return modernTransform(input);
+```
+Looks dead because nobody sets `LEGACY_MIGRATION_MODE` in test/dev. But:
+- Production sets it on the data-import service.
+- The migration runs once a quarter when new historical data lands.
+Removing the branch means the next quarterly migration breaks silently. Chesterton's Fence: check production env vars / feature flags / runtime configs before deleting "dead" branches.
+## When you DO act
+After all three protocol steps, if:
+1. You understand what the code does.
+2. You know why it was added (and the reason no longer applies, or you have evidence it never applied).
+3. You've verified nothing breaks.
+THEN you can simplify. Cite the rationale in the commit message:
+```
+refactor(auth): remove fallback to legacy session API
+The fallback was added in 2022-04 for compatibility with the v1
+session service, which was decommissioned in 2024-09 (PR #4567).
+No callers reference the legacy code path; tests confirm the
+modern path covers all scenarios. Removed.
+```
+Future maintainers can see why the change was safe. They can also see — if it turns out NOT to be safe — exactly what assumption was wrong.
+## Anti-pattern
+Removing code because "it looks unused" without doing the three steps. Speed is not the goal; correctness is. A 30-minute investigation to avoid a 3-day production rollback is a great trade.

package/scaffold/skills/dw-simplification/references/complexity-metrics.md ADDED Viewed

@@ -0,0 +1,147 @@
+# Complexity metrics — when each one actually matters
+Metrics are signal, not verdict. A function with cyclomatic complexity 12 isn't automatically "bad" — but it's worth a second look. This file describes the four common metrics, when each matters, and how to measure cheaply.
+## The four metrics
+### 1. Cyclomatic complexity
+Counts independent paths through a function. Each `if`, `else`, `for`, `while`, `case`, `&&`, `||`, `?:` adds one.
+| Score | What it means | Action |
+|-------|---------------|--------|
+| 1-5 | Simple, easily testable | Leave alone |
+| 6-10 | Moderate; usually fine | Test thoroughly |
+| 11-20 | Complex; review case-by-case | Strong candidate to simplify |
+| 21+ | Very complex; high bug rate empirically | Definitely simplify (or split) |
+Caveat: long switch/match statements over enum values can score high but be naturally clear. Cyclomatic complexity over-penalizes them.
+### 2. Cognitive complexity
+Counts how hard the function is to **understand** (not just trace through). Nesting amplifies; flat structures don't. `goto`, `break N levels`, and recursion add weight.
+This metric correlates better with bug rate than cyclomatic. Most modern linters compute it (SonarQube, ESLint with `sonarjs`, `radon` for Python).
+| Score | Action |
+|-------|--------|
+| 0-9 | Fine |
+| 10-15 | Review |
+| 16+ | Refactor |
+### 3. Nesting depth
+Counts the maximum depth of `if`/`for`/`try` nesting in a function.
+| Depth | Action |
+|-------|--------|
+| 1-2 | Fine |
+| 3 | Acceptable; consider early returns |
+| 4 | Smell; usually flatten via guard clauses |
+| 5+ | Almost always refactor |
+Easiest to fix via early returns:
+```javascript
+// Before — depth 4
+function process(req) {
+  if (req) {
+    if (req.user) {
+      if (req.user.active) {
+        if (req.body) {
+          return doWork(req.body);
+        }
+      }
+    }
+  }
+  return null;
+}
+// After — depth 1
+function process(req) {
+  if (!req) return null;
+  if (!req.user) return null;
+  if (!req.user.active) return null;
+  if (!req.body) return null;
+  return doWork(req.body);
+}
+```
+### 4. Fan-out (outgoing dependencies)
+Number of distinct other modules a function calls. High fan-out = many touchpoints; refactor risky.
+| Fan-out | Action |
+|---------|--------|
+| 0-3 | Fine |
+| 4-7 | Acceptable |
+| 8+ | Function probably has too many concerns; consider splitting |
+## When each metric matters
+| Symptom | Metric to check |
+|---------|-----------------|
+| "This function has a lot of `if`s" | Cyclomatic + nesting depth |
+| "I keep getting lost reading this" | Cognitive complexity |
+| "Tests miss edge cases" | Cyclomatic (each path needs a test) |
+| "Refactor keeps breaking adjacent code" | Fan-out + fan-in |
+| "Function is 200 lines" | Length + nesting (length alone is weak signal) |
+## How to measure cheaply
+### TypeScript / JavaScript
+```bash
+# ESLint with sonarjs plugin
+npx eslint --rule 'sonarjs/cognitive-complexity: ["error", 15]' src/
+# Or per-file analysis
+npx complexity-report src/foo.ts
+# Bundle analysis adjacent
+npx eslint --rule 'complexity: ["warn", 10]' src/
+```
+### Python
+```bash
+# Radon — cyclomatic + cognitive
+pip install radon
+radon cc src/ -a    # cyclomatic, average per file
+radon mi src/       # maintainability index
+# Or via ruff
+ruff check --select C901 src/
+```
+### C# / .NET
+```bash
+# Code metrics via Roslyn
+dotnet build /p:CodeAnalysisRuleSet=...
+# Or in Visual Studio: Analyze → Calculate Code Metrics
+```
+### Rust
+```bash
+# Clippy + complexity lints
+cargo clippy -- -W clippy::cognitive_complexity
+# Or rust-code-analysis CLI
+rust-code-analysis-cli -m -p src/
+```
+## Don't
+- **Don't chase a score.** "Cyclomatic 9 vs 10" is meaningless. The function either reads clearly or doesn't.
+- **Don't refactor purely to lower a number.** If `cognitive_complexity: 16` flags a switch over 16 enum cases, leaving it alone is fine.
+- **Don't measure the whole repo at once and act on the top 100.** Most will be false positives. Focus on functions you're touching.
+- **Don't ignore the trend.** A file going from average cognitive 8 → 14 over 6 months is a smell, even if no individual function crossed the threshold.
+## Healthy use of metrics
+- Run on PRs that touched complex code; flag if a function moved to a worse bucket.
+- Run on long-lived hot files quarterly; spot drift.
+- Set CI gates ONLY at gross thresholds (e.g., cognitive >25), not at edge thresholds (>10).
+- Treat metrics as conversation starters, not pass/fail gates.

package/scaffold/skills/dw-source-grounding/SKILL.md ADDED Viewed

@@ -0,0 +1,128 @@
+---
+name: dw-source-grounding
+description: Discipline of grounding architectural and dependency decisions in versioned official documentation, with mandatory citations. Other commands invoke this skill when they need to decide based on framework/library behavior — never on hallucinated APIs or stale Stack Overflow answers. Adapted from addyosmani/agent-skills (MIT).
+allowed-tools:
+  - Read
+  - Bash
+  - Grep
+  - Glob
+  - WebFetch
+---
+# dw-source-grounding
+Behavioral protocol for grounding decisions in **versioned, official documentation** — and citing those sources verifiably. Used by `dw-create-techspec`, `dw-deps-audit`, `dw-deep-research` whenever a decision depends on what a framework or library actually does at the version installed in the project.
+## Why this skill exists
+Decisions made on hallucinated APIs or 2-year-old Stack Overflow answers cause silent breakage. The cost shows up later — code that "worked in testing" because the agent matched an older version's API, then breaks in production where the real version is different. This skill enforces a four-step protocol that prevents that class of failure.
+## When to Use
+Read this skill when:
+- A command needs to recommend a library/framework version (`dw-deps-audit` brainstorm phase).
+- A command must propose architectural patterns that depend on framework specifics (`dw-create-techspec`).
+- A command is researching a topic that has version-specific answers (`dw-deep-research`).
+- You're about to cite an API, CLI flag, configuration option, or behavior — and you want the citation to be verifiable later.
+Do NOT use when:
+- The decision doesn't depend on external documentation (e.g., naming a variable inside a single function).
+- The library/framework version is irrelevant to the answer (e.g., "use a hash map for O(1) lookup").
+- You're writing examples that are intentionally generic / pseudocode.
+## The Protocol — Detect → Fetch → Implement → Cite
+### 1. Detect — read the actual version first
+Before researching anything, read the project's manifest and identify the EXACT version of the library/framework that matters:
+| Stack | File | Field |
+|-------|------|-------|
+| Node/TS | `package.json` | `dependencies`, `devDependencies` |
+| Python | `pyproject.toml`, `requirements*.txt`, `Pipfile.lock` | each dep with version |
+| .NET | `*.csproj`, `packages.lock.json` | `PackageReference Version="..."` |
+| Rust | `Cargo.toml`, `Cargo.lock` | `[dependencies]` |
+Record the version. If a range (`^4.18.0`), note the lockfile-resolved version.
+If no manifest exists OR the dep is not yet in the manifest (e.g., choosing what to install), record "no version yet — choosing fresh".
+### 2. Fetch — pull the matching version's official docs
+Authority hierarchy:
+1. **Official docs** for the EXACT version (or nearest stable). E.g., `react.dev/reference/react?version=18` not `react.dev` default.
+2. **Official changelog / migration guide** when transitioning across versions.
+3. **Web standards** (MDN, RFCs, W3C) for cross-implementation behavior.
+4. **Compatibility tables** (caniuse, Compat data) for API support across runtimes.
+Forbidden as primary source:
+- Stack Overflow answers (use only as discovery, then verify via official).
+- Tutorial blogs (frequently outdated; never authoritative).
+- AI training data (your training is months/years stale).
+- README screenshots from random GitHub repos.
+Fetch via `WebFetch` or `mcp__context7__*` if available. If both fail, surface to the user that you're falling back to training-data knowledge AND mark the citation `[source: training-data, unverified]`.
+### 3. Implement — apply the documented pattern
+Use exactly the API the documented version provides. Don't mix patterns from multiple versions ("this useEffect example is from React 16; you're on 18.3"). When the doc shows multiple acceptable patterns, pick the simplest that matches the project's style.
+If the doc presents migration warnings (e.g., "deprecated in v5, use X instead"), follow the new path unless the project explicitly pins to the old version for a documented reason.
+### 4. Cite — record the source verifiably
+Every decision that depended on an external source ends with a citation block:
+```
+[source: <url>, version: <X.Y>, retrieved: <YYYY-MM-DD>]
+```
+Examples:
+```
+[source: https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07]
+[source: https://docs.python.org/3.12/library/asyncio-task.html, version: 3.12, retrieved: 2026-05-07]
+[source: https://docs.aws.amazon.com/sdk-for-javascript/v3/developer-guide/welcome.html, version: SDK v3, retrieved: 2026-05-07]
+```
+When citing in PRDs, techspecs, decision logs, or deps-audit reports, the citation is mandatory adjacent to each claim. A future engineer reading the doc can click and verify.
+## How `dw-create-techspec` uses this
+Before writing the "Architectural Decisions" section, the techspec command:
+1. Lists every framework/library decision the techspec depends on (e.g., "use Server Actions for mutations").
+2. For each, runs the protocol: detects version, fetches official doc, cites verifiably.
+3. Writes each decision as: `<decision> — <one-line rationale> — [source: ...]`.
+If the protocol can't reach official docs (offline, paywall, dead link), the techspec prefixes the decision with `⚠ training-data fallback` so the human reviewer knows to verify.
+## How `dw-deps-audit` uses this
+In the brainstorm phase (Conservative/Balanced/Bold per package), each option's "target version" cites the source where that version's release notes were checked. This catches the "agent recommends v5 because it sounds modern, but v5 dropped Node 18 support" class of error.
+## How `dw-deep-research` uses this
+Already does multi-source research; gains the citation discipline. Each finding line ends with a `[source: ...]` block. The output report's bibliography is built from these citations automatically.
+## Anti-patterns
+1. Citing Stack Overflow as primary source. (Use as DISCOVERY, then fetch the official doc the SO answer points to.)
+2. Citing "the docs" without a URL. The whole point is verifiability.
+3. Citing a doc URL that isn't pinned to a version (e.g., `react.dev` instead of `react.dev/reference/react?version=18`).
+4. Pretending knowledge is current when it's training data. Mark unverified.
+5. Citing your own previous answer in this session as authority. The chain has to terminate at an external source.
+## References
+- `references/citation-protocol.md` — exact format of `[source: ...]` blocks; how to consolidate multiple citations in a single decision; how to track citation freshness over time.
+- `references/source-priority.md` — full hierarchy with examples; when secondary sources are acceptable.
+- `references/freshness-check.md` — how to validate a doc URL still applies to the version in use; how to detect doc drift between when you fetched and when the user reads the artifact.
+## Inspired by
+Adapted from [`addyosmani/agent-skills/source-driven-development`](https://github.com/addyosmani/agent-skills) by Addy Osmani (MIT license). Core protocol (Detect → Fetch → Implement → Cite) and source authority hierarchy preserved. dev-workflow integration: invoked by `dw-create-techspec`, `dw-deps-audit`, `dw-deep-research` via Complementary Skills, and citation format aligned with our existing report frontmatter conventions (`type: ...`, `schema_version: ...`).

package/scaffold/skills/dw-source-grounding/references/citation-protocol.md ADDED Viewed

@@ -0,0 +1,108 @@
+# Citation protocol — exact format of `[source: ...]` blocks
+Citations are not optional decoration. They turn a decision into a verifiable claim. Future engineers reading a techspec/deps-audit/research report click the URL and confirm.
+## Required format
+Single citation:
+```
+[source: <url>, version: <X.Y>, retrieved: <YYYY-MM-DD>]
+```
+All three fields required:
+- `<url>` — direct URL to the section/page that supports the claim. Not the docs homepage.
+- `<version>` — the version of the library/framework the URL applies to. If the URL is version-pinned (e.g., `?version=18`), the value here matches.
+- `<retrieved>` — ISO date you fetched. Establishes when the doc was current.
+Examples (good):
+```
+[source: https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07]
+[source: https://nextjs.org/docs/app/building-your-application/data-fetching/server-actions-and-mutations, version: 14.2, retrieved: 2026-05-07]
+[source: https://docs.python.org/3.12/library/asyncio.html, version: 3.12, retrieved: 2026-05-07]
+[source: https://docs.docker.com/compose/compose-file/build/, version: Compose Spec, retrieved: 2026-05-07]
+```
+Examples (bad):
+```
+[source: react.dev]                           # no version, no date, not pinned
+[source: docs, retrieved: today]              # vague URL, vague date
+[source: https://stackoverflow.com/q/12345]   # not authoritative — see source-priority.md
+[the React docs say ...]                      # not a citation block; future reader can't verify
+```
+## Multiple citations on one decision
+When a decision rests on more than one source (e.g., "use Server Actions for mutations" depends on the React 19 form actions API + the Next.js 14 routing layer), list each:
+```
+Decision: use Server Actions for the form submission flow.
+Rationale: Server Actions provide automatic revalidation
+[source: https://nextjs.org/docs/app/building-your-application/data-fetching/server-actions-and-mutations, version: 14.2, retrieved: 2026-05-07]
+and align with React 19's `useActionState`
+[source: https://react.dev/reference/react/useActionState, version: 19.0, retrieved: 2026-05-07].
+```
+Inline form for short claims:
+```
+- Postgres 16 introduces `pg_stat_io` for IO observability
+  [source: https://www.postgresql.org/docs/16/monitoring-stats.html, version: 16, retrieved: 2026-05-07].
+```
+## Where citations live
+Per artifact:
+| Artifact | Citations live in |
+|----------|-------------------|
+| PRD (`prd.md`) | "Open Questions" section when the answer needs research, OR inline next to specific functional requirements |
+| TechSpec (`techspec.md`) | Inline next to every architectural decision; consolidated in a "Sources" section at the end |
+| Deps-audit report | Adjacent to each package's recommended version |
+| Deep-research report | Inline next to every finding; consolidated bibliography auto-generated |
+| ADR | Mandatory in the "References" section |
+## Freshness notation
+If the citation is older than 90 days when the artifact is consumed, suspect drift. Best-practice:
+- Re-verify the citation before acting on the decision.
+- If the cited URL has moved or been deprecated, update the artifact: `[source: <new-url>, version: ..., retrieved: <new-date>, supersedes: <old-url>]`.
+For artifacts that age (e.g., a 6-month-old ADR), agents reading them downstream should flag stale citations rather than silently trust them.
+## Unverified fallback
+When official docs are unreachable (network down, paywall, deleted page) AND the agent must still produce an answer, mark the citation explicitly:
+```
+[source: training-data, unverified, claim: <X>, last-known-version: <Y>]
+```
+This signals to the human reviewer that the answer needs verification before commit.
+NEVER use `unverified` as a default. The four-step protocol (Detect → Fetch → Implement → Cite) demands fetch attempts; `unverified` is the failsafe, not the path.
+## Consolidation in reports
+When a report has dozens of citations (e.g., a deep-research output), build a numbered bibliography at the end:
+```
+## Sources
+[1] https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07
+[2] https://nextjs.org/docs/app/building-your-application/data-fetching/server-actions-and-mutations, version: 14.2, retrieved: 2026-05-07
+[3] https://docs.python.org/3.12/library/asyncio.html, version: 3.12, retrieved: 2026-05-07
+```
+In-line, refer to bibliography entries: `... uses Server Actions [2] for mutations ...`.
+## Don't
+- Don't cite an internal Slack thread or private wiki as if it were authoritative — the citation must point to something the reader can read.
+- Don't cite a URL that requires login (paywalls); find an open-access equivalent or note the access barrier in the citation.
+- Don't backfill citations after the fact ("I'll add the source later"). Cite as you decide; if you can't cite, you don't have a verified decision.