@albinocrabs/feynman 0.2.1 → 0.2.4
- package/.codex-plugin/plugin.json +1 -1
- package/CHANGELOG.md +50 -3
- package/CONTRIBUTING.md +1 -0
- package/README.md +249 -13
- package/SECURITY.md +11 -0
- package/bin/feynman.js +322 -9
- package/docs/architecture.md +15 -8
- package/docs/launch.md +10 -2
- package/docs/object-passport.md +91 -0
- package/docs/release.md +162 -0
- package/examples/activity-sequence.md +105 -0
- package/examples/api-flow.md +32 -3
- package/examples/bug-isolation.md +89 -0
- package/examples/c4-platform-diagramming.md +112 -0
- package/examples/context-splitting.md +77 -0
- package/examples/feature-planning.md +107 -0
- package/examples/incident-response.md +77 -0
- package/examples/release-readiness.md +73 -0
- package/examples/service-migration.md +72 -0
- package/package.json +5 -3
- package/rules/feynman-activate.md +5 -7
- package/skills/feynman/SKILL.md +9 -7
package/docs/release.md
ADDED
|
@@ -0,0 +1,162 @@
|
|
|
1
|
+
# Release Documentation
|
|
2
|
+
|
|
3
|
+
This document defines the full release process for `@albinocrabs/feynman`, including changelog usage, release notes generation, publishing, and verification.
|
|
4
|
+
|
|
5
|
+
## 1) Scope and ownership
|
|
6
|
+
|
|
7
|
+
- Release versioning is package-driven (`package.json` `version`).
|
|
8
|
+
- GitHub release tags use `v${version}` format.
|
|
9
|
+
- Changelog is the canonical source for release notes.
|
|
10
|
+
- CI/release workflows live in `.github/workflows/ci.yml` and `.github/workflows/release.yml`.
|
|
11
|
+
|
|
12
|
+
## 2) Pre-release checks
|
|
13
|
+
|
|
14
|
+
1. Verify repo cleanliness and branch context.
|
|
15
|
+
2. Ensure `main` is clean and CI checks are green.
|
|
16
|
+
3. Confirm `package.json` version is the intended next version.
|
|
17
|
+
4. Confirm matching plugin manifest versions if changed.
|
|
18
|
+
|
|
19
|
+
Recommended commands:
|
|
20
|
+
|
|
21
|
+
```bash
|
|
22
|
+
git status --short --branch
|
|
23
|
+
git log --oneline -5 --decorate
|
|
24
|
+
npm run ci
|
|
25
|
+
```
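Check 4 in the list above can be automated. The following is a minimal sketch, assuming each plugin manifest (for example `.codex-plugin/plugin.json`) is JSON with a top-level `version` field; `checkReleaseVersions` is a hypothetical helper and not part of the package.

```javascript
// Hypothetical helper: compares package.json with plugin manifest versions.
// Assumes every manifest is JSON with a top-level "version" field.
function checkReleaseVersions(packageJsonText, manifestTexts = []) {
  const { version } = JSON.parse(packageJsonText);
  // Collect every manifest version that differs from package.json.
  const mismatches = manifestTexts
    .map((text) => JSON.parse(text).version)
    .filter((manifestVersion) => manifestVersion !== version);
  return {
    version,
    tag: `v${version}`, // release tags use the v${version} format
    ok: mismatches.length === 0,
    mismatches,
  };
}
```

The function takes file contents rather than paths, so it can be fed from `fs.readFileSync(path, "utf8")` in a script or from fixtures in a test.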
|
|
26
|
+
|
|
27
|
+
## 3) Changelog-first release notes (required)
|
|
28
|
+
|
|
29
|
+
### What to update
|
|
30
|
+
|
|
31
|
+
- Edit the top of `CHANGELOG.md` before tagging.
|
|
32
|
+
- Use this section format:
|
|
33
|
+
|
|
34
|
+
```md
|
|
35
|
+
## 0.2.3 - 2026-05-07
|
|
36
|
+
|
|
37
|
+
Changes since v0.2.2.
|
|
38
|
+
|
|
39
|
+
### Added
|
|
40
|
+
- ...
|
|
41
|
+
|
|
42
|
+
### Changed
|
|
43
|
+
- ...
|
|
44
|
+
|
|
45
|
+
### Fixed
|
|
46
|
+
- ...
|
|
47
|
+
|
|
48
|
+
### Maintenance
|
|
49
|
+
- ...
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
### What goes into release notes
|
|
53
|
+
|
|
54
|
+
- The GitHub release body is generated from the section in `CHANGELOG.md` that matches the new version.
|
|
55
|
+
- If that section is missing or empty, the workflow falls back to:
|
|
56
|
+
  - `## <version>`
|
|
57
|
+
  - `Changes since <previousTag>.`
|
|
58
|
+
  - `- Release published from package.json version <version>.`
|
|
59
|
+
|
|
60
|
+
### Key rule
|
|
61
|
+
|
|
62
|
+
The workflow ignores any changelog command output at publish time; it extracts the notes directly from the matching version section of the repository's `CHANGELOG.md`.
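The extraction and fallback described in this section can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual code: `extractReleaseNotes` is a hypothetical name, and accepting a header with a trailing ` - <date>` is an assumption based on the section format shown earlier.

```javascript
// Hypothetical sketch of changelog-first release notes with fallback.
function extractReleaseNotes(changelog, version, previousTag) {
  const lines = changelog.split(/\r?\n/);
  // Match "## 0.2.3" or "## 0.2.3 - 2026-05-07", but not "## 0.2.30".
  const start = lines.findIndex(
    (line) => line === `## ${version}` || line.startsWith(`## ${version} `)
  );
  let body = "";
  if (start !== -1) {
    // The section runs until the next level-2 heading or the end of the file.
    let end = lines.findIndex((line, i) => i > start && line.startsWith("## "));
    if (end === -1) end = lines.length;
    body = lines.slice(start + 1, end).join("\n").trim();
  }
  if (body !== "") return body;
  // Fallback text, using the three lines listed in this section.
  return [
    `## ${version}`,
    `Changes since ${previousTag}.`,
    `- Release published from package.json version ${version}.`,
  ].join("\n");
}
```

Subsection headings such as `### Added` stay inside the extracted body because only a `## ` prefix ends a section.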
|
|
63
|
+
|
|
64
|
+
## 4) Version and release flow
|
|
65
|
+
|
|
66
|
+
### Step-by-step sequence
|
|
67
|
+
|
|
68
|
+
1. Update files for release:
|
|
69
|
+
   - `package.json` version bump.
|
|
70
|
+
   - top `CHANGELOG.md` section for that version.
|
|
71
|
+
2. Run release checks:
|
|
72
|
+
   - `npm run ci`
|
|
73
|
+
3. Commit and push to `main`.
|
|
74
|
+
4. Create and publish GitHub release tag `v<version>`:
|
|
75
|
+
   - Via UI or CLI.
|
|
76
|
+
5. GitHub release workflow executes automatically:
|
|
77
|
+
   - validates repository and package checks,
|
|
78
|
+
   - extracts changelog notes,
|
|
79
|
+
   - uploads tarball artifact,
|
|
80
|
+
   - updates/creates release body,
|
|
81
|
+
   - publishes to npm (if not already published),
|
|
82
|
+
   - verifies npm propagation.
|
|
83
|
+
|
|
84
|
+
Manual local command examples:
|
|
85
|
+
|
|
86
|
+
```bash
|
|
87
|
+
git add package.json CHANGELOG.md
|
|
88
|
+
git commit -m "chore: release v0.2.3"
|
|
89
|
+
git push origin main
|
|
90
|
+
|
|
91
|
+
gh release create v0.2.3 --generate-notes --target main
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
If you need a full dry run without publishing, use workflow dispatch:
|
|
95
|
+
|
|
96
|
+
```bash
|
|
97
|
+
gh workflow run release.yml -f dry_run=true
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
## 5) What each workflow step does
|
|
101
|
+
|
|
102
|
+
- Checkout with full history + tags (`fetch-depth: 0`, `fetch-tags: true`).
|
|
103
|
+
- Resolve metadata from `package.json`:
|
|
104
|
+
  - `package_name`
|
|
105
|
+
  - `package_version`
|
|
106
|
+
  - `tag`
|
|
107
|
+
  - previous tag for fallback context.
|
|
108
|
+
- Parse `CHANGELOG.md` for matching `## <version>` section.
|
|
109
|
+
- Build and test release artifact (`npm run ci` + `npm run build`).
|
|
110
|
+
- Upload `dist/*.tgz` artifact.
|
|
111
|
+
- Update release notes on GitHub release.
|
|
112
|
+
- Publish to npm using `NPM_TOKEN` (skipped if version already exists).
|
|
113
|
+
- Verify publication via repeated `npm view` checks.
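The last two steps (skip publishing when the version already exists, then poll until the registry shows it) can be sketched as below. The function names, the attempt count, and the delay are assumptions for illustration; the real workflow's values are not shown in this document. `lookup` is an injected function that, in practice, could wrap `npm view <name>@<version> version`.

```javascript
// Hypothetical: decide whether the publish step should run.
function shouldPublish(publishedVersions, version) {
  return !publishedVersions.includes(version);
}

// Hypothetical: poll `lookup()` until it reports the expected version.
// Resolves with the attempt number on which the version became visible.
async function waitForVersion(lookup, expected, { attempts = 5, delayMs = 1000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const seen = await lookup();
    if (seen === expected) return attempt;
    // Wait between attempts, but not after the final one.
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`version ${expected} not visible after ${attempts} attempts`);
}
```

Injecting `lookup` keeps the retry logic independent of the npm CLI, so it can be exercised without network access.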
|
|
114
|
+
|
|
115
|
+
## 6) Post-release verification
|
|
116
|
+
|
|
117
|
+
Run in order:
|
|
118
|
+
|
|
119
|
+
1. Git state and alignment:
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
git rev-parse --short HEAD
|
|
123
|
+
git rev-parse --short origin/main
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
2. Release artifact validation:
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
gh release view v0.2.3 --json name,tagName,isDraft,isPrerelease,url -q '.name, .tagName, .isDraft, .isPrerelease, .url'
|
|
130
|
+
gh release view v0.2.3 --json body -q .body
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
3. npm publication:
|
|
134
|
+
|
|
135
|
+
```bash
|
|
136
|
+
npm view @albinocrabs/feynman@0.2.3 version
|
|
137
|
+
npm view @albinocrabs/feynman dist-tags
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
4. Smoke test from clean env:
|
|
141
|
+
|
|
142
|
+
```bash
|
|
143
|
+
node -e "console.log('ok')" # placeholder for your install checks
|
|
144
|
+
npx -y @albinocrabs/feynman@latest version
|
|
145
|
+
```
|
|
146
|
+
|
|
147
|
+
## 7) Troubleshooting
|
|
148
|
+
|
|
149
|
+
- **Release notes still show CI/CD text**
|
|
150
|
+
  - Ensure tag points to commit containing the new `CHANGELOG.md` section.
|
|
151
|
+
  - Ensure section header exactly matches `## <version>`.
|
|
152
|
+
- **Publish failed**
|
|
153
|
+
  - Confirm `NPM_TOKEN` is set in repo secrets.
|
|
154
|
+
  - Verify the package name in `package.json`.
|
|
155
|
+
- **Release not auto-updated**
|
|
156
|
+
  - The release is updated only when the workflow runs on a `release` event or via manual dispatch with `dry_run=false`.
|
|
157
|
+
|
|
158
|
+
## 8) Required docs references
|
|
159
|
+
|
|
160
|
+
- [Release process (short)](../README.md) – README overview.
|
|
161
|
+
- [CI workflow](./launch.md#release-checklist).
|
|
162
|
+
- [Release workflow](../.github/workflows/release.yml).
|
package/examples/activity-sequence.md
ADDED
|
@@ -0,0 +1,105 @@
|
|
|
1
|
+
# Action Sequencing: Checkout Incident During Peak
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> Checkout error rate jumps during peak and latency is now above SLO. We need a
|
|
6
|
+
> concrete action sequence for recovery, including who does what, when to roll back,
|
|
7
|
+
> and when to stop doing damage-control.
|
|
8
|
+
|
|
9
|
+
## Without feynman
|
|
10
|
+
|
|
11
|
+
The team usually starts by checking dashboards, then opens an emergency channel,
|
|
12
|
+
then asks backend and DB teams to investigate. If they find a single failing
|
|
13
|
+
dependency, they roll back that change; otherwise they add a read-through cache and
|
|
14
|
+
throttle traffic. Communication goes out if the user impact is material.
|
|
15
|
+
|
|
16
|
+
## With feynman
|
|
17
|
+
|
|
18
|
+
Operational sequence:
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
[Alert: checkout error spike] --> [On-call acknowledges in 60s]
|
|
22
|
+
|
|
|
23
|
+
+--> [Is impact external?]
|
|
24
|
+
|
|
|
25
|
+
+-- yes --> [Start Incident Commander]
|
|
26
|
+
|
|
|
27
|
+
+-- no --> [Fast path mitigation only]
|
|
28
|
+
|
|
|
29
|
+
v
|
|
30
|
+
[Notify org channel] --> [Freeze non-critical deploys] --> [Triage by layer]
|
|
31
|
+
|
|
|
32
|
+
v
|
|
33
|
+
[Cache/DB/API/Infra]
|
|
34
|
+
|
|
|
35
|
+
[Need rollback?]
|
|
36
|
+
|
|
|
37
|
+
+-- yes --> [Rollback scoped deploy]
|
|
38
|
+
|
|
|
39
|
+
+-- no --> [Apply temporary safeguard]
|
|
40
|
+
|
|
|
41
|
+
v
|
|
42
|
+
[Stabilize + reduce blast radius]
|
|
43
|
+
|
|
|
44
|
+
v
|
|
45
|
+
[Declare recovery ETA]
|
|
46
|
+
|
|
|
47
|
+
+-- success --> [Post-incident audit]
|
|
48
|
+
|
|
|
49
|
+
+-- degraded --> [Resume changes]
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
State board (what changed inside the incident):
|
|
53
|
+
|
|
54
|
+
```
|
|
55
|
+
┌─ Incident Action Board ───────────────────────┐
|
|
56
|
+
│ triage : done │
|
|
57
|
+
│ command setup : done │
|
|
58
|
+
│ mitigation active : in-flight │
|
|
59
|
+
│ rollback : ready │
|
|
60
|
+
│ customer comms : live │
|
|
61
|
+
│ root-cause evidence : collecting │
|
|
62
|
+
└───────────────────────────────────────────────┘
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
Critical path decomposition:
|
|
66
|
+
|
|
67
|
+
```
|
|
68
|
+
[Containment]
|
|
69
|
+
├── [Enable queue throttle]
|
|
70
|
+
├── [Route reads to replica]
|
|
71
|
+
└── [Turn on circuit breaker]
|
|
72
|
+
[Recovery]
|
|
73
|
+
├── [Collect flamegraphs]
|
|
74
|
+
├── [Compare with healthy minute]
|
|
75
|
+
└── [Prepare rollback diff]
|
|
76
|
+
[Verification]
|
|
77
|
+
├── [Canary checks]
|
|
78
|
+
├── [Synthetic transaction replay]
|
|
79
|
+
└── [SLO probe]
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
Priority ladder during incident:
|
|
83
|
+
|
|
84
|
+
```
|
|
85
|
+
▲ high
|
|
86
|
+
data integrity checks
|
|
87
|
+
user-visible checkout path
|
|
88
|
+
▼ low
|
|
89
|
+
dashboard chart style changes
|
|
90
|
+
PR comments cleanup
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
Runbook gate table:
|
|
94
|
+
|
|
95
|
+
```
|
|
96
|
+
check | owner | threshold | action
|
|
97
|
+
--------------------|----------------|----------------|-------------------------
|
|
98
|
+
P95 latency | SRE | <= 700ms | continue with mitigation
|
|
99
|
+
error budget burn | SRE | <= 3%/15m | escalate communications
|
|
100
|
+
db retry pressure | Backend lead | <= 2x baseline | rotate to fallback path
|
|
101
|
+
cache hit rate | Platform lead | >= 78% | stop throttling traffic
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
This template forces action ordering, makes ownership explicit, and keeps the
|
|
105
|
+
team from drifting between investigation and mitigation under stress.
|
package/examples/api-flow.md
CHANGED
|
@@ -44,12 +44,41 @@ error response is sent.
|
|
|
44
44
|
[200 OK + token]
|
|
45
45
|
```
|
|
46
46
|
|
|
47
|
-
|
|
48
|
-
|
|
47
|
+
### Timing lens: what can degrade and where
|
|
48
|
+
|
|
49
|
+
```
|
|
50
|
+
[Client]
|
|
51
|
+
|
|
|
52
|
+
v
|
|
53
|
+
[CORS] --> [Parser] --> [Validation] --> [Controller] --> [AuthService]
|
|
54
|
+
| |
|
|
55
|
+
| +--> [401]
|
|
56
|
+
|
|
|
57
|
+
+--> [429/503]
|
|
58
|
+
+--> [400]
|
|
59
|
+
|
|
60
|
+
[AuthService] --> [UserRepository] --> [bcrypt.compare]
|
|
61
|
+
|
|
|
62
|
+
+-- mismatch --> [401]
|
|
63
|
+
|
|
|
64
|
+
+-- hit --> [JWT sign] --> [200]
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
Error-path table:
|
|
68
|
+
|
|
69
|
+
```
|
|
70
|
+
stage | symptom | response | recovery
|
|
71
|
+
-------------------|--------------|----------|-------------------
|
|
72
|
+
validation failed | 400 | reject | fix request payload
|
|
73
|
+
user not found | 401 | reject | prompt signup/help
|
|
74
|
+
credentials mismatch| 401 | reject | suggest retry/reset
|
|
75
|
+
dependency timeout | 503/504 | fail | retry/backoff
|
|
76
|
+
success | 200 | token | cache token metadata
|
|
77
|
+
```
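The error-path table maps onto a handler fairly directly. The sketch below is illustrative only: `login` and the injected dependencies (`findUser`, `comparePassword`, `signToken`) are hypothetical names standing in for the `UserRepository`, `bcrypt.compare`, and JWT-signing stages of the diagram, and dependency failures are collapsed to a single 503.

```javascript
// Hypothetical login handler following the error-path table.
// `deps` supplies findUser, comparePassword, and signToken (all assumed names).
async function login(body, deps) {
  // Validation stage: reject malformed payloads with 400.
  if (!body || typeof body.email !== "string" || typeof body.password !== "string") {
    return { status: 400, error: "validation failed" };
  }
  try {
    const user = await deps.findUser(body.email);
    // User not found and credential mismatch both map to 401.
    if (!user) return { status: 401, error: "invalid credentials" };
    const matches = await deps.comparePassword(body.password, user.passwordHash);
    if (!matches) return { status: 401, error: "invalid credentials" };
    return { status: 200, token: await deps.signToken(user.id) };
  } catch (err) {
    // Dependency failure or timeout: fail so the client can retry with backoff.
    return { status: 503, error: "dependency unavailable" };
  }
}
```

Returning the same 401 body for both "user not found" and "credentials mismatch" is a design choice in this sketch, so the response does not reveal which accounts exist.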
|
|
49
78
|
|
|
50
79
|
## Why this works
|
|
51
80
|
|
|
52
81
|
The request lifecycle is a sequential flow with conditional branches, which
|
|
53
|
-
activates feynman's flow
|
|
82
|
+
activates feynman's flow-diagram rules. Boxes (`[…]`) mark processing stages;
|
|
54
83
|
arrows (`-->`) mark data flow; branch splits show the conditional paths at
|
|
55
84
|
validation and credential-check points.
|
package/examples/bug-isolation.md
ADDED
|
@@ -0,0 +1,89 @@
|
|
|
1
|
+
# Bug Isolation: Intermittent 500s in Checkout
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> Checkout API starts returning 500s at random intervals under normal load. How do we
|
|
6
|
+
> isolate root cause without a full-service shutdown?
|
|
7
|
+
|
|
8
|
+
## Without feynman
|
|
9
|
+
|
|
10
|
+
Start by checking logs, then look at DB and cache metrics, then inspect release notes,
|
|
11
|
+
and finally reproduce with synthetic traffic around the failure window. If needed, roll
|
|
12
|
+
back gradually while validating with a canary cohort.
|
|
13
|
+
|
|
14
|
+
## With feynman
|
|
15
|
+
|
|
16
|
+
Hypothesis tree:
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
intermittent 500s
|
|
20
|
+
├── app code
|
|
21
|
+
│ ├── null dereference
|
|
22
|
+
│ └── unhandled exception
|
|
23
|
+
├── data layer
|
|
24
|
+
│ ├── deadlock / lock timeout
|
|
25
|
+
│ ├── stale row locks
|
|
26
|
+
│ └── missing index
|
|
27
|
+
├── infrastructure
|
|
28
|
+
│ ├── DB connection pool exhaustion
|
|
29
|
+
│ └── Redis timeout
|
|
30
|
+
└── operational
|
|
31
|
+
├── rollout wave overlap
|
|
32
|
+
└── scheduled jobs interference
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
Isolation flow:
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
[Symptom observed] --> [Scope blast radius]
|
|
39
|
+
|
|
|
40
|
+
+----------+----------+
|
|
41
|
+
| |
|
|
42
|
+
v v
|
|
43
|
+
[Single endpoint only] [All endpoints]
|
|
44
|
+
| |
|
|
45
|
+
yes/no yes/no
|
|
46
|
+
| |
|
|
47
|
+
v v
|
|
48
|
+
[Replay payload] [Re-check infra]
|
|
49
|
+
| |
|
|
50
|
+
v v
|
|
51
|
+
[Exception trace matches]
|
|
52
|
+
[Connection errors]
|
|
53
|
+
yes/no
|
|
54
|
+
| |
|
|
55
|
+
v v
|
|
56
|
+
[fix app] [next hypothesis]
|
|
57
|
+
| |
|
|
58
|
+
v v
|
|
59
|
+
[fix infra] [rollback segment]
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
Impact priority:
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
▲ high
|
|
66
|
+
user checkout failure
|
|
67
|
+
payment status integrity
|
|
68
|
+
canary regression safety
|
|
69
|
+
▼ low
|
|
70
|
+
low-frequency telemetry noise
|
|
71
|
+
non-critical UI latency
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
Validation decision table:
|
|
75
|
+
|
|
76
|
+
```
|
|
77
|
+
check | command | expected
|
|
78
|
+
------------------|--------------------------------|-------------------------
|
|
79
|
+
error correlation | logs + correlation ids | grouped by checkout_id
|
|
80
|
+
db pressure | pool saturation metrics | stable for 15m
|
|
81
|
+
cache health | hit rate and timeout count | no timeout spike
|
|
82
|
+
deployment diff | feature flags + release notes | no new high-risk delta
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
## Why this works
|
|
86
|
+
|
|
87
|
+
The diagrams drive diagnosis from hypotheses to actions: the tree narrows the search
|
|
88
|
+
space, the flow controls the order of experiments, and the priority block
|
|
89
|
+
shows what must be fixed immediately.
|
package/examples/c4-platform-diagramming.md
ADDED
|
@@ -0,0 +1,112 @@
|
|
|
1
|
+
# C4-Style Architecture and Request Flow
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> Sketch a clean C4-style view for an AI documentation tool and show the
|
|
6
|
+
> normal request flow. I want context, container split, component split, and a
|
|
7
|
+
> clear status view for blockers.
|
|
8
|
+
|
|
9
|
+
## Without feynman
|
|
10
|
+
|
|
11
|
+
The tool has three users: author, reviewer, and operator. It includes a web
|
|
12
|
+
client, a prompt gateway API, a rules engine, a diagram renderer, and a
|
|
13
|
+
quality service that validates responses before returning them. There is
|
|
14
|
+
SSO-based auth, storage for templates, and a shared observability channel.
|
|
15
|
+
If the rules fail to load, the request still needs a deterministic fallback to
|
|
16
|
+
text mode so the user is not blocked.
|
|
17
|
+
|
|
18
|
+
## With feynman
|
|
19
|
+
|
|
20
|
+
### C4 context
|
|
21
|
+
|
|
22
|
+
```
|
|
23
|
+
feynman-system
|
|
24
|
+
├── actors
|
|
25
|
+
│ ├── [Document Author]
|
|
26
|
+
│ ├── [Reviewer]
|
|
27
|
+
│ └── [Operator]
|
|
28
|
+
├── containers
|
|
29
|
+
│ ├── [Web Client]
|
|
30
|
+
│ ├── [Prompt Gateway API]
|
|
31
|
+
│ ├── [Rule Service]
|
|
32
|
+
│ ├── [Diagram Renderer]
|
|
33
|
+
│ └── [Quality Service]
|
|
34
|
+
└── external systems
|
|
35
|
+
├── [SSO]
|
|
36
|
+
├── [Template Storage]
|
|
37
|
+
├── [Model Provider]
|
|
38
|
+
└── [Observability]
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
### C4 container run
|
|
42
|
+
|
|
43
|
+
```
|
|
44
|
+
[Document Author] --> [Web Client]
|
|
45
|
+
|
|
|
46
|
+
v
|
|
47
|
+
[Reviewer] --> [Prompt Gateway API]
|
|
48
|
+
|
|
|
49
|
+
v
|
|
50
|
+
[Auth + Rate Limit]
|
|
51
|
+
|
|
|
52
|
+
+-- unauthorized --> [403 / 401]
|
|
53
|
+
|
|
|
54
|
+
+-- authorized
|
|
55
|
+
|
|
|
56
|
+
v
|
|
57
|
+
[Rule Service]
|
|
58
|
+
|
|
|
59
|
+
+-- rule set miss --> [Text fallback]
|
|
60
|
+
|
|
|
61
|
+
+-- rule set hit
|
|
62
|
+
|
|
|
63
|
+
v
|
|
64
|
+
[Diagram Renderer]
|
|
65
|
+
|
|
|
66
|
+
v
|
|
67
|
+
[Quality Service]
|
|
68
|
+
|
|
|
69
|
+
+-- blocked --> [Recovery Plan]
|
|
70
|
+
|
|
|
71
|
+
+-- pass --> [JSON response]
|
|
72
|
+
|
|
|
73
|
+
v
|
|
74
|
+
[Observability publish]
|
|
75
|
+
|
|
|
76
|
+
v
|
|
77
|
+
[Document Author/Reviewer]
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
### Architecture split
|
|
81
|
+
|
|
82
|
+
```
|
|
83
|
+
criterion | Context (C1) | Containers (C2) | Components (C3)
|
|
84
|
+
---------------|---------------------|----------------------|----------------------
|
|
85
|
+
Main question | Who talks to what | Who owns boundary | Who owns behavior
|
|
86
|
+
Primary risk | Missing actor path | Wrong trust boundary | Rule fallback bug
|
|
87
|
+
Owner now | Product + Ops | Backend + Security | Runtime rule authors
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
### Why this helps
|
|
91
|
+
|
|
92
|
+
```
|
|
93
|
+
┌─ Delivery readiness ────────────┐
|
|
94
|
+
context map done
|
|
95
|
+
container flow done
|
|
96
|
+
component split done
|
|
97
|
+
risk hotspots identified
|
|
98
|
+
└─────────────────────────────────┘
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
## Why this works
|
|
102
|
+
|
|
103
|
+
Without explicit structure, the explanation is a dense paragraph. With the C4
|
|
104
|
+
perspective, Feynman converts it into:
|
|
105
|
+
- actor/system decomposition,
|
|
106
|
+
- runtime sequence,
|
|
107
|
+
- boundary+fallback behavior,
|
|
108
|
+
- and explicit risk visibility.
|
|
109
|
+
|
|
110
|
+
The result is understandable quickly and can be reviewed or extended as a
|
|
111
|
+
single architecture baseline.
|
|
112
|
+
|
package/examples/context-splitting.md
ADDED
|
@@ -0,0 +1,77 @@
|
|
|
1
|
+
# Context Splitting: Product Initiative to Deploy a New Onboarding Assistant
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> We want to launch a new onboarding assistant next quarter. It touches web UI, backend,
|
|
6
|
+
> legal review, and customer support. How should we split work so leadership can
|
|
7
|
+
> understand risk, dependencies, and rollout order in one view?
|
|
8
|
+
|
|
9
|
+
## Without feynman
|
|
10
|
+
|
|
11
|
+
The initiative includes multiple teams and timelines. First we should do discovery and
|
|
12
|
+
alignment between engineering, legal, and support. Then UI and backend work must start,
|
|
13
|
+
because both rely on design contracts. A pilot should run with 10% of new users, then
|
|
14
|
+
global rollout should happen after legal and reliability checks are complete.
|
|
15
|
+
|
|
16
|
+
## With feynman
|
|
17
|
+
|
|
18
|
+
Scope decomposition:
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
[Onboarding assistant]
|
|
22
|
+
├── Product & UX
|
|
23
|
+
│ ├── onboarding copy
|
|
24
|
+
│ ├── microcopy guardrails
|
|
25
|
+
│ └── in-app hints
|
|
26
|
+
├── Platform
|
|
27
|
+
│ ├── assistant API
|
|
28
|
+
│ ├── telemetry events
|
|
29
|
+
│ └── admin flags
|
|
30
|
+
├── Legal & Compliance
|
|
31
|
+
│ ├── consent text
|
|
32
|
+
│ └── data retention policy
|
|
33
|
+
└── Operations
|
|
34
|
+
├── runbook
|
|
35
|
+
├── on-call drill
|
|
36
|
+
└── rollback script
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
Cross-team launch flow:
|
|
40
|
+
|
|
41
|
+
```
|
|
42
|
+
[Legal approves data flow] --> [Backend API contract ready]
|
|
43
|
+
|
|
|
44
|
+
v
|
|
45
|
+
[UX/Copy draft] --> [Integration tests] --> [10% pilot]
|
|
46
|
+
| |
|
|
47
|
+
fail --> [fix + retest] v
|
|
48
|
+
[60-day rollback check]
|
|
49
|
+
|
|
|
50
|
+
+--> [full rollout]
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
Dependency safety frame:
|
|
54
|
+
|
|
55
|
+
- legal-ok: mandatory
|
|
56
|
+
- telemetry path: must emit onboarding_success and onboarding_fail
|
|
57
|
+
- fallback: always-on silent mode if latency > 300ms
|
|
58
|
+
- rollback: kill-switch <2 minutes
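The four conditions in this frame are concrete enough to check in code. The following is a rough sketch under stated assumptions: the field names on `state` and the mode names `"silent"` and `"active"` are hypothetical, while the thresholds (300 ms, under 2 minutes) and event names come from the frame.

```javascript
// Hypothetical launch gate: returns the list of unmet conditions.
function launchBlockers(state) {
  const blockers = [];
  // Legal approval is mandatory.
  if (!state.legalOk) blockers.push("legal-ok");
  // Both telemetry events must be emitted.
  for (const event of ["onboarding_success", "onboarding_fail"]) {
    if (!state.telemetryEvents.includes(event)) blockers.push(`telemetry: ${event}`);
  }
  // Kill-switch must take effect in under 2 minutes.
  if (!(state.killSwitchSeconds < 120)) blockers.push("rollback");
  return blockers;
}

// Runtime fallback rule: switch to silent mode when latency exceeds 300 ms.
function assistantMode(latencyMs) {
  return latencyMs > 300 ? "silent" : "active";
}
```

An empty array from `launchBlockers` means every launch condition in the frame holds.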
|
|
59
|
+
|
|
60
|
+
Priority lanes:
|
|
61
|
+
|
|
62
|
+
```
|
|
63
|
+
▲ high
|
|
64
|
+
legal/consent review
|
|
65
|
+
backend idempotency
|
|
66
|
+
kill-switch + rollback readiness
|
|
67
|
+
▼ low
|
|
68
|
+
copy polishing
|
|
69
|
+
dashboard cosmetics
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
## Why this works
|
|
73
|
+
|
|
74
|
+
The question has nested uncertainty and cross-team constraints. The tree diagram makes
|
|
75
|
+
the decomposition explicit. The flow diagram shows execution order and rework loops.
|
|
76
|
+
The frame and priority lanes turn soft governance requirements into checkable launch
|
|
77
|
+
conditions.
|
package/examples/feature-planning.md
ADDED
|
@@ -0,0 +1,107 @@
|
|
|
1
|
+
# Feature Planning: Build Internal Search or Use Managed API
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> We need fast text search by title/body tags. Should we build it ourselves or use
|
|
6
|
+
> a managed search service? Compare the options by cost, speed, and maintenance.
|
|
7
|
+
|
|
8
|
+
## Without feynman
|
|
9
|
+
|
|
10
|
+
Building search internally gives us full control over ranking but requires schema
|
|
11
|
+
indexing work, query tuning, and ongoing reliability engineering. A managed API
|
|
12
|
+
is faster to deliver and has better relevance out of the box, but it increases
|
|
13
|
+
vendor dependency and recurring cost. We can reduce risk by evaluating latency,
|
|
14
|
+
cost, and maintenance for a 6-month period and then revisiting.
|
|
15
|
+
|
|
16
|
+
## With feynman
|
|
17
|
+
|
|
18
|
+
Decision matrix:
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
Option | build-internal | managed-search-service
|
|
22
|
+
------------------|--------------------------------|--------------------------
|
|
23
|
+
speed-to-market | 8-12 weeks | 1-2 weeks
|
|
24
|
+
query latency | 60-120ms (with cache) | 40-80ms
|
|
25
|
+
maintenance | high (2 engineers, on-call) | low
|
|
26
|
+
vendor lock-in | none | medium-high
|
|
27
|
+
relevance quality | custom control, tuning effort | high, pre-tuned
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
Decision flow:
|
|
31
|
+
|
|
32
|
+
```
|
|
33
|
+
[Need search by title/body now?] --> [Yes]
|
|
34
|
+
|
|
|
35
|
+
v
|
|
36
|
+
[Need search now?] --> [Evaluate managed by default]
|
|
37
|
+
|
|
|
38
|
+
v
|
|
39
|
+
[POC in 2 weeks]
|
|
40
|
+
|
|
|
41
|
+
+-------------------- +--------------------+
|
|
42
|
+
| |
|
|
43
|
+
v v
|
|
44
|
+
[Latency/cost ok] [No]
|
|
45
|
+
| |
|
|
46
|
+
+--> [Adopt] +--> [Re-open internal build path]
|
|
47
|
+
|
|
|
48
|
+
+--> [Plan v1 migration in 1 sprint]
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Governance priority:
|
|
52
|
+
|
|
53
|
+
```
|
|
54
|
+
▲ high
|
|
55
|
+
Vendor contract review (SLA, data residency)
|
|
56
|
+
Incident drill: provider outage fallback plan
|
|
57
|
+
▼ low
|
|
58
|
+
UI polish in search result cards
|
|
59
|
+
Advanced synonym tuning
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
Phased rollout map:
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
[Decision]
|
|
66
|
+
|
|
|
67
|
+
v
|
|
68
|
+
[POC + telemetry]
|
|
69
|
+
|
|
|
70
|
+
+-- latency/cost tests fail? --+--> [Re-scope]
|
|
71
|
+
|
|
|
72
|
+
+-- latency/cost tests pass? --> [Fallback path] --> [Adoption path]
|
|
73
|
+
|
|
|
74
|
+
+--> [Cost optimization] --> [Quarterly review]
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
Rollback frame:
|
|
78
|
+
|
|
79
|
+
```
|
|
80
|
+
Managed API chosen:
|
|
81
|
+
- fail trigger: P95 > 2x baseline + cost ↑
|
|
82
|
+
- response: cut volume 50%, enable fallback
|
|
83
|
+
- rollback time: 45 min
|
|
84
|
+
- owner: on-call + search guild
|
|
85
|
+
```
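The rollback frame can be read as a small decision rule. This sketch assumes that "P95 > 2x baseline + cost ↑" means both conditions must hold at once; the input field names and the function name are hypothetical.

```javascript
// Hypothetical reading of the rollback frame for the managed API option.
function rollbackDecision({ p95Ms, baselineP95Ms, costPerDay, previousCostPerDay }) {
  const latencyBreached = p95Ms > 2 * baselineP95Ms;
  const costRising = costPerDay > previousCostPerDay;
  // Assumption: the fail trigger requires latency breach AND rising cost.
  if (!(latencyBreached && costRising)) {
    return { trigger: false, actions: [] };
  }
  // Response steps as listed in the frame.
  return { trigger: true, actions: ["cut volume 50%", "enable fallback"] };
}
```

If the team intends either condition alone to trigger the response, the `&&` becomes `||`; the frame as written does not settle this.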
|
|
86
|
+
|
|
87
|
+
Context split before execution:
|
|
88
|
+
|
|
89
|
+
```
|
|
90
|
+
[Business goal]
|
|
91
|
+
├── [Performance]
|
|
92
|
+
│ ├── latency
|
|
93
|
+
│ └── availability
|
|
94
|
+
├── [Economics]
|
|
95
|
+
│ ├── direct cost
|
|
96
|
+
│ └── hidden support cost
|
|
97
|
+
└── [Risk]
|
|
98
|
+
├── lock-in
|
|
99
|
+
├── security
|
|
100
|
+
└── reversibility
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
## Why this works
|
|
104
|
+
|
|
105
|
+
The plain comparison becomes explicit with columns, and the execution path becomes
|
|
106
|
+
operational through a flow diagram. This helps teams decide with one view of
|
|
107
|
+
trade-offs and controls.
|