@albinocrabs/feynman 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,162 @@
1
+ # Release Documentation
2
+
3
+ This document defines the full release process for `@albinocrabs/feynman`, including changelog usage, release notes generation, publishing, and verification.
4
+
5
+ ## 1) Scope and ownership
6
+
7
+ - Release versioning is package-driven (`package.json` `version`).
8
+ - GitHub release tags use `v${version}` format.
9
+ - Changelog is the canonical source for release notes.
10
+ - CI/release workflows live in `.github/workflows/ci.yml` and `.github/workflows/release.yml`.
11
+
12
+ ## 2) Pre-release checks
13
+
14
+ 1. Verify repo cleanliness and branch context.
15
+ 2. Ensure `main` is clean and CI checks are green.
16
+ 3. Confirm `package.json` version is the intended next version.
17
+ 4. Confirm matching plugin manifest versions if changed.
18
+
19
+ Recommended commands:
20
+
21
+ ```bash
22
+ git status --short --branch
23
+ git log --oneline -5 --decorate
24
+ npm run ci
25
+ ```
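A small guard can catch an un-bumped version before tagging. A sketch, assuming `package.json` sits at the repo root (the `check_not_tagged` helper name is ours, not part of the workflow):

```bash
# Sketch: fail fast if the package.json version is already tagged.
# Assumes package.json sits at the repo root; adapt as needed.
check_not_tagged() {
  local version
  version=$(node -p "require('./package.json').version")
  if git rev-parse -q --verify "refs/tags/v${version}" >/dev/null 2>&1; then
    echo "tag v${version} already exists; bump package.json first" >&2
    return 1
  fi
  echo "v${version} is free"
}
```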
26
+
27
+ ## 3) Changelog-first release notes (required)
28
+
29
+ ### What to update
30
+
31
+ - Edit the top of `CHANGELOG.md` before tagging.
32
+ - Use this section format:
33
+
34
+ ```md
35
+ ## 0.2.3 - 2026-05-07
36
+
37
+ Changes since v0.2.2.
38
+
39
+ ### Added
40
+ - ...
41
+
42
+ ### Changed
43
+ - ...
44
+
45
+ ### Fixed
46
+ - ...
47
+
48
+ ### Maintenance
49
+ - ...
50
+ ```
51
+
52
+ ### What goes into release notes
53
+
54
+ - The GitHub release body is generated from the `CHANGELOG.md` section matching the new version.
55
+ - If that section is missing or empty, the workflow falls back to:
56
+ - `## <version>`
57
+ - `Changes since <previousTag>.`
58
+ - `- Release published from package.json version <version>.`
59
+
60
+ ### Key rule
61
+
62
+ At publish time the workflow ignores any changelog command output; it extracts the notes directly from the repository's `CHANGELOG.md` section for that version.
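The extraction can be previewed locally before tagging. A sketch, assuming `## <version>` section headers; the actual script in `release.yml` may differ:

```bash
# Sketch of the changelog extraction: print the CHANGELOG.md section
# for a given version. The real workflow script may differ.
extract_changelog_section() {
  local version="$1" file="${2:-CHANGELOG.md}"
  awk -v v="$version" '
    index($0, "## " v) == 1 { found = 1; next }  # wanted section starts
    found && /^## /         { exit }             # next section header ends it
    found                   { print }            # body of the section
  ' "$file"
}
```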
63
+
64
+ ## 4) Version and release flow
65
+
66
+ ### Step-by-step sequence
67
+
68
+ 1. Update files for release:
69
+ - `package.json` version bump.
70
+ - top `CHANGELOG.md` section for that version.
71
+ 2. Run release checks:
72
+ - `npm run ci`
73
+ 3. Commit and push to `main`.
74
+ 4. Create and publish GitHub release tag `v<version>`:
75
+ - Via UI or CLI.
76
+ 5. GitHub release workflow executes automatically:
77
+ - validates repository and package checks,
78
+ - extracts changelog notes,
79
+ - uploads tarball artifact,
80
+ - updates/creates release body,
81
+ - publishes to npm (if not already published),
82
+ - verifies npm propagation.
83
+
84
+ Manual local command examples:
85
+
86
+ ```bash
87
+ git add package.json CHANGELOG.md
88
+ git commit -m "chore: release v0.2.3"
89
+ git push origin main
90
+
91
+ # --generate-notes only seeds a placeholder body; the release workflow
+ # later overwrites it from the CHANGELOG.md section (see section 3).
+ gh release create v0.2.3 --generate-notes --target main
92
+ ```
93
+
94
+ If you need a full dry run without publish, use workflow dispatch:
95
+
96
+ ```bash
97
+ gh workflow run release.yml -f dry_run=true
98
+ ```
99
+
100
+ ## 5) What each workflow step does
101
+
102
+ - Checkout with full history + tags (`fetch-depth: 0`, `fetch-tags: true`).
103
+ - Resolve metadata from `package.json`:
104
+ - `package_name`
105
+ - `package_version`
106
+ - `tag`
107
+ - previous tag for fallback context.
108
+ - Parse `CHANGELOG.md` for matching `## <version>` section.
109
+ - Build and test release artifact (`npm run ci` + `npm run build`).
110
+ - Upload `dist/*.tgz` artifact.
111
+ - Update release notes on GitHub release.
112
+ - Publish to npm using `NPM_TOKEN` (skipped if version already exists).
113
+ - Verify publication via repeated `npm view` checks.
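The propagation check can be approximated as a polling loop. A sketch; the workflow's real retry count and interval may differ:

```bash
# Sketch: poll the registry until the new version is visible.
# Attempt count and sleep interval are illustrative.
wait_for_npm() {
  local pkg="$1" version="$2" attempts="${3:-10}"
  local i
  for i in $(seq 1 "$attempts"); do
    if [ "$(npm view "$pkg@$version" version 2>/dev/null)" = "$version" ]; then
      echo "published"
      return 0
    fi
    [ "$i" -lt "$attempts" ] && sleep 15
  done
  echo "not visible after $attempts attempts" >&2
  return 1
}
```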
114
+
115
+ ## 6) Post-release verification
116
+
117
+ Run in order:
118
+
119
+ 1. Git state and alignment:
120
+
121
+ ```bash
122
+ git rev-parse --short HEAD
123
+ git rev-parse --short origin/main
124
+ ```
125
+
126
+ 2. Release artifact validation:
127
+
128
+ ```bash
129
+ gh release view v0.2.3 --json name,tagName,isDraft,isPrerelease,url
130
+ gh release view v0.2.3 --json body -q .body
131
+ ```
132
+
133
+ 3. NPM publication:
134
+
135
+ ```bash
136
+ npm view @albinocrabs/feynman@0.2.3 version
137
+ npm view @albinocrabs/feynman dist-tags
138
+ ```
139
+
140
+ 4. Smoke test from clean env:
141
+
142
+ ```bash
143
+ node -e "console.log('ok')" # placeholder for your install checks
144
+ npx -y @albinocrabs/feynman@latest version
145
+ ```
146
+
147
+ ## 7) Troubleshooting
148
+
149
+ - **Release notes still show CI/CD text**
150
+ - Ensure tag points to commit containing the new `CHANGELOG.md` section.
151
+ - Ensure section header exactly matches `## <version>`.
152
+ - **Publish failed**
153
+ - confirm `NPM_TOKEN` is set in repo secrets,
154
+ - verify package name in `package.json`.
155
+ - **Release not auto-updated**
156
+ - the workflow only updates the release on a `release` event or on a manual dispatch with `dry_run=false`; confirm which trigger ran.
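For the first troubleshooting case, check the changelog as seen from the tag rather than the working tree. A sketch (the `changelog_at_tag` helper is ours):

```bash
# Sketch: confirm the tagged commit actually contains the new
# changelog section (header format assumed to be "## <version>").
changelog_at_tag() {
  local version="$1"
  if git show "v${version}:CHANGELOG.md" 2>/dev/null | grep -q "^## ${version}"; then
    echo "section present at tag"
  else
    echo "section missing at tag"
  fi
}
```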
157
+
158
+ ## 8) Required docs references
159
+
160
+ - [Release process (short)](../README.md) – README overview.
161
+ - [CI workflow](./launch.md#release-checklist).
162
+ - [Release workflow](../.github/workflows/release.yml).
@@ -0,0 +1,105 @@
1
+ # Action Sequencing: Checkout Incident During Peak
2
+
3
+ ## Question
4
+
5
+ > Checkout error rate jumps during peak and latency is now above SLO. We need a
6
+ > concrete action sequence for recovery, including who does what, when to rollback,
7
+ > and when to stop doing damage-control.
8
+
9
+ ## Without feynman
10
+
11
+ The team usually starts by checking dashboards, then opens an emergency channel,
12
+ then asks backend and DB teams to investigate. If they find a single failing
13
+ dependency they rollback that change; otherwise they add read-through cache and
14
+ throttle traffic. Communication goes out if the user impact is material.
15
+
16
+ ## With feynman
17
+
18
+ Operational sequence:
19
+
20
+ ```
21
+ [Alert: checkout error spike] --> [On-call acknowledges in 60s]
22
+ |
23
+ +--> [Is impact external?]
24
+ |
25
+ +-- yes --> [Start Incident Commander]
26
+ |
27
+ +-- no --> [Fast path mitigation only]
28
+ |
29
+ v
30
+ [Notify org channel] --> [Freeze non-critical deploys] --> [Triage by layer]
31
+ |
32
+ v
33
+ [Cache/DB/API/Infra]
34
+ |
35
+ [Need rollback?]
36
+ |
37
+ +-- yes --> [Rollback scoped deploy]
38
+ |
39
+ +-- no --> [Apply temporary safeguard]
40
+ |
41
+ v
42
+ [Stabilize + reduce blast radius]
43
+ |
44
+ v
45
+ [Declare recovery ETA]
46
+ |
47
+ +-- success --> [Post-incident audit]
48
+ |
49
+ +-- degraded --> [Return to triage]
50
+ ```
51
+
52
+ State board (what changed inside the incident):
53
+
54
+ ```
55
+ ┌─ Incident Action Board ───────────────────────┐
56
+ │ triage : done │
57
+ │ command setup : done │
58
+ │ mitigation active : in-flight │
59
+ │ rollback : ready │
60
+ │ customer comms : live │
61
+ │ root-cause evidence : collecting │
62
+ └───────────────────────────────────────────────┘
63
+ ```
64
+
65
+ Critical path decomposition:
66
+
67
+ ```
68
+ [Containment]
69
+ ├── [Enable queue throttle]
70
+ ├── [Route reads to replica]
71
+ └── [Turn on circuit breaker]
72
+ [Recovery]
73
+ ├── [Collect flamegraphs]
74
+ ├── [Compare with healthy minute]
75
+ └── [Prepare rollback diff]
76
+ [Verification]
77
+ ├── [Canary checks]
78
+ ├── [Synthetic transaction replay]
79
+ └── [SLO probe]
80
+ ```
81
+
82
+ Priority ladder during incident:
83
+
84
+ ```
85
+ ▲ high
86
+ data integrity checks
87
+ user-visible checkout path
88
+ ▼ low
89
+ dashboard chart style changes
90
+ PR comments cleanup
91
+ ```
92
+
93
+ Runbook gate table:
94
+
95
+ ```
96
+ check | owner | threshold | action
97
+ --------------------|----------------|----------------|-------------------------
98
+ P95 latency | SRE | <= 700ms | continue with mitigation
99
+ error budget burn | SRE | <= 3%/15m | escalate communications
100
+ db retry pressure | Backend lead | <= 2x baseline | rotate to fallback path
101
+ cache hit rate | Platform lead | >= 78% | stop throttling traffic
102
+ ```
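The first gate row can be read mechanically. A sketch using the table's threshold; the metric input is a placeholder for whatever your monitoring stack reports:

```bash
# Mechanical sketch of the "P95 latency" gate row. The input value is
# hypothetical; real values come from your monitoring stack.
gate_p95() {
  local p95_ms="$1"
  if [ "$p95_ms" -le 700 ]; then
    echo "continue with mitigation"    # table action within threshold
  else
    echo "gate breached: re-evaluate mitigation"
  fi
}
```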
103
+
104
+ This template forces action ordering, makes ownership explicit, and keeps the
105
+ team from drifting between investigation and mitigation under stress.
@@ -44,12 +44,41 @@ error response is sent.
44
44
  [200 OK + token]
45
45
  ```
46
46
 
47
- Seven stages, two branch points. Every failure path exits early — success
48
- continues to the next stage.
47
+ ### Timing lens: what can degrade and where
48
+
49
+ ```
50
+ [Client]
51
+ |
52
+ v
53
+ [CORS] --> [Parser] --> [Validation] --> [Controller] --> [AuthService]
54
+ | |
55
+ | +--> [401]
56
+ |
57
+ +--> [429/503]
58
+ +--> [400]
59
+
60
+ [AuthService] --> [UserRepository] --> [bcrypt.compare]
61
+ |
62
+ +-- mismatch --> [401]
63
+ |
64
+ +-- hit --> [JWT sign] --> [200]
65
+ ```
66
+
67
+ Error-path table:
68
+
69
+ ```
70
+ stage | symptom | response | recovery
71
+ -------------------|--------------|----------|-------------------
72
+ validation failed | 400 | reject | fix request payload
73
+ user not found | 401 | reject | prompt signup/help
74
+ credentials mismatch | 401 | reject | suggest retry/reset
75
+ dependency timeout | 503/504 | fail | retry/backoff
76
+ success | 200 | token | cache token metadata
77
+ ```
49
78
 
50
79
  ## Why this works
51
80
 
52
81
  The request lifecycle is a sequential flow with conditional branches, which
53
- activates feynman's flow diagram rules. Boxes (`[…]`) mark processing stages;
82
+ activates feynman's flow-diagram rules. Boxes (`[…]`) mark processing stages;
54
83
  arrows (`-->`) mark data flow; branch splits show the conditional paths at
55
84
  validation and credential-check points.
@@ -0,0 +1,89 @@
1
+ # Bug Isolation: Intermittent 500s in Checkout
2
+
3
+ ## Question
4
+
5
+ > Checkout API starts returning 500s at random intervals under normal load. How do we
6
+ > isolate root cause without a full-service shutdown?
7
+
8
+ ## Without feynman
9
+
10
+ Start by checking logs, then look at DB and cache metrics, then inspect release notes,
11
+ and finally reproduce with synthetic traffic around the failure window. If needed, roll
12
+ back gradually while validating with a canary cohort.
13
+
14
+ ## With feynman
15
+
16
+ Hypothesis tree:
17
+
18
+ ```
19
+ intermittent 500s
20
+ ├── app code
21
+ │ ├── null dereference
22
+ │ └── unhandled exception
23
+ ├── data layer
24
+ │ ├── deadlock / lock timeout
25
+ │ ├── stale row locks
26
+ │ └── missing index
27
+ ├── infrastructure
28
+ │ ├── DB connection pool exhaustion
29
+ │ └── Redis timeout
30
+ └── operational
31
+ ├── rollout wave overlap
32
+ └── scheduled jobs interference
33
+ ```
34
+
35
+ Isolation flow:
36
+
37
+ ```
+ [Symptom observed] --> [Scope blast radius]
+                |
+        +-------+-------------------+
+        |                           |
+        v                           v
+ [Single endpoint only]      [All endpoints]
+        |                           |
+        v                           v
+ [Replay payload]            [Re-check infra]
+        |                           |
+        v                           v
+ [Exception trace matches?]  [Connection errors?]
+    |yes       |no              |yes        |no
+    v          v                v           v
+ [fix app] [next hypothesis] [fix infra] [rollback segment]
+ ```
61
+
62
+ Impact priority:
63
+
64
+ ```
65
+ ▲ high
66
+ user checkout failure
67
+ payment status integrity
68
+ canary regression safety
69
+ ▼ low
70
+ low-frequency telemetry noise
71
+ non-critical UI latency
72
+ ```
73
+
74
+ Validation decision table:
75
+
76
+ ```
77
+ check | command | expected
78
+ ------------------|--------------------------------|-------------------------
79
+ error correlation | logs + correlation ids | grouped by checkout_id
80
+ db pressure | pool saturation metrics | stable for 15m
81
+ cache health | hit rate and timeout count | no timeout spike
82
+ deployment diff | feature flags + release notes | no new high-risk delta
83
+ ```
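The "error correlation" row can be sketched as a log pass that groups 500s by `checkout_id`. The log line format (`checkout_id=<id> ... status=<code>` tokens per line) is an assumption:

```bash
# Sketch: count 500 responses per checkout_id from a structured log.
# Assumes whitespace-separated "key=value" tokens per line.
count_500s_by_checkout() {
  awk '/status=500/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^checkout_id=/) counts[$i]++
       }
       END { for (k in counts) print k, counts[k] }' "$1"
}
```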
84
+
85
+ ## Why this works
86
+
87
+ The diagrams drive diagnosis from hypotheses to actions: the tree narrows the
88
+ search space, the flow sequences the experiments, and the priority block
89
+ shows what must be fixed immediately.
@@ -0,0 +1,112 @@
1
+ # C4-Style Architecture and Request Flow
2
+
3
+ ## Question
4
+
5
+ > Sketch a clean C4-style view for an AI documentation tool and show the
6
+ > normal request flow. I want context, container split, component split, and a
7
+ > clear status view for blockers.
8
+
9
+ ## Without feynman
10
+
11
+ The tool has three users: author, reviewer, and operator. It includes a web
12
+ client, a prompt gateway API, a rules engine, a diagram renderer, and a
13
+ quality service that validates responses before returning them. There is
14
+ SSO-based auth, storage for templates, and a shared observability channel.
15
+ If the rules fail to load, the request still needs a deterministic fallback to
16
+ text mode so the user is not blocked.
17
+
18
+ ## With feynman
19
+
20
+ ### C4 context
21
+
22
+ ```
23
+ feynman-system
24
+ ├── actors
25
+ │ ├── [Document Author]
26
+ │ ├── [Reviewer]
27
+ │ └── [Operator]
28
+ ├── containers
29
+ │ ├── [Web Client]
30
+ │ ├── [Prompt Gateway API]
31
+ │ ├── [Rule Service]
32
+ │ ├── [Diagram Renderer]
33
+ │ └── [Quality Service]
34
+ └── external systems
35
+ ├── [SSO]
36
+ ├── [Template Storage]
37
+ ├── [Model Provider]
38
+ └── [Observability]
39
+ ```
40
+
41
+ ### C4 container run
42
+
43
+ ```
44
+ [Document Author] --> [Web Client]
45
+ |
46
+ v
47
+ [Reviewer] --> [Prompt Gateway API]
48
+ |
49
+ v
50
+ [Auth + Rate Limit]
51
+ |
52
+ +-- unauthorized --> [403 / 401]
53
+ |
54
+ +-- authorized
55
+ |
56
+ v
57
+ [Rule Service]
58
+ |
59
+ +-- rule set miss --> [Text fallback]
60
+ |
61
+ +-- rule set hit
62
+ |
63
+ v
64
+ [Diagram Renderer]
65
+ |
66
+ v
67
+ [Quality Service]
68
+ |
69
+ +-- blocked --> [Recovery Plan]
70
+ |
71
+ +-- pass --> [JSON response]
72
+ |
73
+ v
74
+ [Observability publish]
75
+ |
76
+ v
77
+ [Document Author/Reviewer]
78
+ ```
79
+
80
+ ### Architecture split
81
+
82
+ ```
83
+ criterion | Context (C1) | Containers (C2) | Components (C3)
84
+ ---------------|---------------------|----------------------|----------------------
85
+ Main question | Who talks to what | Who owns boundary | Who owns behavior
86
+ Primary risk | Missing actor path | Wrong trust boundary | Rule fallback bug
87
+ Owner now | Product + Ops | Backend + Security | Runtime rule authors
88
+ ```
89
+
90
+ ### Why this helps
91
+
92
+ ```
93
+ ┌─ Delivery readiness ────────────┐
94
+ context map done
95
+ container flow done
96
+ component split done
97
+ risk hotspots identified
98
+ └─────────────────────────────────┘
99
+ ```
100
+
101
+ ## Why this works
102
+
103
+ Without explicit structure, the explanation is a dense paragraph. With the C4
104
+ perspective, Feynman converts it into:
105
+ - actor/system decomposition,
106
+ - runtime sequence,
107
+ - boundary+fallback behavior,
108
+ - and explicit risk visibility.
109
+
110
+ The result is understandable quickly and can be reviewed or extended as a
111
+ single architecture baseline.
112
+
@@ -0,0 +1,77 @@
1
+ # Context Splitting: Product Initiative to Deploy a New Onboarding Assistant
2
+
3
+ ## Question
4
+
5
+ > We want to launch a new onboarding assistant next quarter. It touches web UI, backend,
6
+ > legal review, and customer support. How should we split work so leadership can
7
+ > understand risk, dependencies, and rollout order in one view?
8
+
9
+ ## Without feynman
10
+
11
+ The initiative includes multiple teams and timelines. First we should do discovery and
12
+ alignment between engineering, legal, and support. Then UI and backend work must start,
13
+ because both rely on design contracts. A pilot should run with 10% of new users, then
14
+ global rollout should happen after legal and reliability checks are complete.
15
+
16
+ ## With feynman
17
+
18
+ Scope decomposition:
19
+
20
+ ```
21
+ [Onboarding assistant]
22
+ ├── Product & UX
23
+ │ ├── onboarding copy
24
+ │ ├── microcopy guardrails
25
+ │ └── in-app hints
26
+ ├── Platform
27
+ │ ├── assistant API
28
+ │ ├── telemetry events
29
+ │ └── admin flags
30
+ ├── Legal & Compliance
31
+ │ ├── consent text
32
+ │ └── data retention policy
33
+ └── Operations
34
+ ├── runbook
35
+ ├── on-call drill
36
+ └── rollback script
37
+ ```
38
+
39
+ Cross-team launch flow:
40
+
41
+ ```
42
+ [Legal approves data flow] --> [Backend API contract ready]
43
+ |
44
+ v
45
+ [UX/Copy draft] --> [Integration tests] --> [10% pilot]
46
+        |                                   |
+        +-- fail --> [fix + retest]         v
+                                  [60-day rollback check]
+                                            |
+                                            +--> [full rollout]
51
+ ```
52
+
53
+ Dependency safety frame:
54
+
55
+ - legal-ok: mandatory
56
+ - telemetry path: must emit onboarding_success and onboarding_fail
57
+ - fallback: always-on silent mode if latency > 300ms
58
+ - rollback: kill-switch <2 minutes
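The latency fallback rule above, as a minimal sketch (where latency is measured from, and the helper name, are assumptions):

```bash
# Sketch of the fallback rule: silent mode once observed latency
# exceeds 300ms, so the user is never blocked by the assistant.
assistant_mode() {
  local latency_ms="$1"
  if [ "$latency_ms" -gt 300 ]; then
    echo "silent"    # always-on silent fallback
  else
    echo "active"
  fi
}
```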
59
+
60
+ Priority lanes:
61
+
62
+ ```
63
+ ▲ high
64
+ legal/consent review
65
+ backend idempotency
66
+ kill-switch + rollback readiness
67
+ ▼ low
68
+ copy polishing
69
+ dashboard cosmetics
70
+ ```
71
+
72
+ ## Why this works
73
+
74
+ The question has nested uncertainty and cross-team constraints. The tree diagram makes
75
+ the decomposition explicit. The flow diagram shows execution order and rework loops.
76
+ The frame and priority lanes turn soft governance requirements into checkable launch
77
+ conditions.
@@ -0,0 +1,107 @@
1
+ # Feature Planning: Build Internal Search or Use Managed API
2
+
3
+ ## Question
4
+
5
+ > We need fast text search by title/body tags. Should we build it ourselves or use
6
+ > a managed search service? Compare the options by cost, speed, and maintenance.
7
+
8
+ ## Without feynman
9
+
10
+ Building search internally gives us full control over ranking but requires schema
11
+ indexing work, query tuning, and ongoing reliability engineering. A managed API
12
+ is faster to deliver and has better relevance out of the box, but it increases
13
+ vendor dependency and recurring cost. We can reduce risk by evaluating latency,
14
+ cost, and maintenance for a 6-month period and then revisiting.
15
+
16
+ ## With feynman
17
+
18
+ Decision matrix:
19
+
20
+ ```
21
+ Option | build-internal | managed-search-service
22
+ ------------------|--------------------------------|--------------------------
23
+ speed-to-market | 8-12 weeks | 1-2 weeks
24
+ query latency | 60-120ms (with cache) | 40-80ms
25
+ maintenance | high (2 engineers, on-call) | low
26
+ vendor lock-in | none | medium-high
27
+ relevance quality | custom control, tuning effort | high, pre-tuned
28
+ ```
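The maintenance row implies a cost asymmetry that is easy to make concrete. A back-of-envelope sketch; every figure is a hypothetical placeholder to substitute with your own estimates:

```bash
# Back-of-envelope 6-month cost comparison. All figures are assumed
# placeholders, not real quotes.
eng_monthly=15000      # loaded cost per engineer per month (assumed)
build_engineers=2      # matrix row: "high (2 engineers, on-call)"
vendor_monthly=4000    # managed-service fee (assumed)
months=6

build_cost=$(( eng_monthly * build_engineers * months ))
managed_cost=$(( vendor_monthly * months ))
echo "build-internal over ${months}m: ${build_cost}"
echo "managed over ${months}m: ${managed_cost}"
```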
29
+
30
+ Decision flow:
31
+
32
+ ```
33
+ [Need search by title/body now?] --> [Yes]
34
+ |
35
+ v
36
+ [Need search now?] --> [Evaluate managed by default]
37
+ |
38
+ v
39
+ [POC in 2 weeks]
40
+ |
41
+ +-------------------- +--------------------+
42
+ | |
43
+ v v
44
+ [Latency/cost ok] [No]
45
+ | |
46
+ +--> [Adopt] +--> [Re-open internal build path]
47
+ |
48
+ +--> [Plan v1 migration in 1 sprint]
49
+ ```
50
+
51
+ Governance priority:
52
+
53
+ ```
54
+ ▲ high
55
+ Vendor contract review (SLA, data residency)
56
+ Incident drill: provider outage fallback plan
57
+ ▼ low
58
+ UI polish in search result cards
59
+ Advanced synonym tuning
60
+ ```
61
+
62
+ Phased rollout map:
63
+
64
+ ```
65
+ [Decision]
66
+ |
67
+ v
68
+ [POC + telemetry]
69
+ |
70
+ +-- latency/cost tests fail? --+--> [Re-scope]
71
+ |
72
+ +-- latency/cost tests pass? --> [Fallback path] --> [Adoption path]
73
+ |
74
+ +--> [Cost optimization] --> [Quarterly review]
75
+ ```
76
+
77
+ Rollback frame:
78
+
79
+ ```
80
+ Managed API chosen:
81
+ - fail trigger: P95 > 2x baseline + cost ↑
82
+ - response: cut volume 50%, enable fallback
83
+ - rollback time: 45 min
84
+ - owner: on-call + search guild
85
+ ```
86
+
87
+ Context split before execution:
88
+
89
+ ```
90
+ [Business goal]
91
+ ├── [Performance]
92
+ │ ├── latency
93
+ │ └── availability
94
+ ├── [Economics]
95
+ │ ├── direct cost
96
+ │ └── hidden support cost
97
+ └── [Risk]
98
+ ├── lock-in
99
+ ├── security
100
+ └── reversibility
101
+ ```
102
+
103
+ ## Why this works
104
+
105
+ The plain comparison becomes explicit with columns, and the execution path becomes
106
+ operational through a flow diagram. This helps teams decide with one view of
107
+ trade-offs and controls.