@albinocrabs/feynman 0.2.2 → 0.2.5
- package/.codex-plugin/plugin.json +1 -1
- package/CHANGELOG.md +67 -1
- package/CONTRIBUTING.md +1 -0
- package/README.md +260 -22
- package/SECURITY.md +11 -0
- package/bin/feynman.js +419 -36
- package/docs/architecture.md +27 -17
- package/docs/launch.md +10 -2
- package/docs/object-passport.md +91 -0
- package/docs/release.md +162 -0
- package/examples/activity-sequence.md +105 -0
- package/examples/api-flow.md +32 -3
- package/examples/bug-isolation.md +89 -0
- package/examples/c4-platform-diagramming.md +112 -0
- package/examples/context-splitting.md +77 -0
- package/examples/feature-planning.md +107 -0
- package/examples/incident-response.md +77 -0
- package/examples/release-readiness.md +73 -0
- package/examples/service-migration.md +72 -0
- package/hooks/feynman-activate.js +11 -4
- package/hooks/feynman-session-start.js +79 -0
- package/hooks/hooks.json +12 -2
- package/hooks.json +13 -2
- package/package.json +5 -3
- package/rules/feynman-activate.md +5 -7
- package/skills/feynman/SKILL.md +11 -9
package/docs/architecture.md
CHANGED
|
@@ -6,16 +6,18 @@ Three independent layers: hook lifecycle, lint pipeline, and state schema.
|
|
|
6
6
|
|
|
7
7
|
## Layer 1: Hook Lifecycle
|
|
8
8
|
|
|
9
|
-
The `
|
|
10
|
-
|
|
11
|
-
`additionalContext`.
|
|
9
|
+
The `SessionStart` hook primes fresh Claude Code or Codex sessions with the
|
|
10
|
+
active rules. The `UserPromptSubmit` hook fires before every prompt and
|
|
11
|
+
reinforces the same rules as `additionalContext`.
|
|
12
12
|
|
|
13
13
|
```
|
|
14
14
|
~/.claude/settings.json ~/.codex/hooks.json
|
|
15
15
|
│
|
|
16
|
-
|
|
16
|
+
├─ hooks.SessionStart primes new sessions
|
|
17
|
+
│
|
|
18
|
+
└─ hooks.UserPromptSubmit reinforces every prompt
|
|
17
19
|
▼
|
|
18
|
-
hooks/feynman-activate.js
|
|
20
|
+
hooks/feynman-session-start.js + hooks/feynman-activate.js
|
|
19
21
|
│
|
|
20
22
|
├─ [0] FEYNMAN_HOME selects client state root
|
|
21
23
|
│ unset → ~/.claude (backward compatible)
|
|
@@ -25,9 +27,10 @@ hooks/feynman-activate.js
|
|
|
25
27
|
├─ [1] validate session_id (path-traversal guard)
|
|
26
28
|
│
|
|
27
29
|
├─ [2] $FEYNMAN_HOME/.feynman-active ← flag file
|
|
28
|
-
│ absent + no state.json
|
|
29
|
-
│ absent + state.
|
|
30
|
-
│
|
|
30
|
+
│ absent + no state.json → bootstrap first run
|
|
31
|
+
│ absent + state.enabled=true → recreate flag
|
|
32
|
+
│ absent + state.enabled=false → exit 0 (user disabled)
|
|
33
|
+
│ present → continue
|
|
31
34
|
│
|
|
32
35
|
├─ [3] $FEYNMAN_HOME/.feynman/state.json
|
|
33
36
|
│ enabled: false → exit 0
|
|
@@ -121,7 +124,7 @@ All runtime state lives in two files under the selected client root:
|
|
|
121
124
|
~/.claude/ or ~/.codex/
|
|
122
125
|
├── .feynman-active ← presence flag
|
|
123
126
|
│ present = feynman active
|
|
124
|
-
│ absent = user disabled
|
|
127
|
+
│ absent = user disabled only when state.enabled=false
|
|
125
128
|
│ content = current intensity string (informational)
|
|
126
129
|
│
|
|
127
130
|
└── .feynman/
|
|
@@ -153,7 +156,7 @@ All runtime state lives in two files under the selected client root:
|
|
|
153
156
|
state.intensity = <value>
|
|
154
157
|
.feynman-active content updated
|
|
155
158
|
|
|
156
|
-
[npx @albinocrabs/feynman uninstall --target claude|codex|both]
|
|
159
|
+
[npx @albinocrabs/feynman uninstall --target claude|codex|both|all|*]
|
|
157
160
|
hook removed from target hook config
|
|
158
161
|
.feynman-active deleted
|
|
159
162
|
state.json preserved (user data)
|
|
@@ -171,20 +174,27 @@ managed by skill commands in `skills/feynman/SKILL.md`.
|
|
|
171
174
|
## CLI Subcommand Map
|
|
172
175
|
|
|
173
176
|
```
|
|
174
|
-
bin/feynman.js
|
|
175
|
-
├── install → writes target hook config + state.json + flag
|
|
176
|
-
├── uninstall → removes target hook entries + flag (keeps state)
|
|
177
|
-
├── doctor → checks target health criteria, prints frame
|
|
178
|
-
├── lint → delegates to bin/feynman-lint.js
|
|
179
|
-
|
|
177
|
+
bin/feynman.js
|
|
178
|
+
├── install → writes target hook config + state.json + flag
|
|
179
|
+
├── uninstall → removes target hook entries + flag (keeps state)
|
|
180
|
+
├── doctor → checks target health criteria, prints frame
|
|
181
|
+
├── lint → delegates to bin/feynman-lint.js
|
|
182
|
+
├── examples → list and render built-in ASCII examples
|
|
183
|
+
├── help → this help/usage block
|
|
184
|
+
├── bootstrap → exports examples + manifests + skill into local package folder
|
|
185
|
+
└── version → prints package.json version
|
|
180
186
|
```
|
|
181
187
|
|
|
188
|
+
`/feynman on|off|start|stop|lite|full|ultra` are handled by the skill contract
|
|
189
|
+
in `skills/feynman/SKILL.md` and share aliases:
|
|
190
|
+
`start` == `on`, `stop` == `off`.
|
|
191
|
+
|
|
182
192
|
Targets:
|
|
183
193
|
|
|
184
194
|
```
|
|
185
195
|
claude → ~/.claude/settings.json + ~/.claude/.feynman/
|
|
186
196
|
codex → ~/.codex/hooks.json + ~/.codex/.feynman/
|
|
187
|
-
both
|
|
197
|
+
both, all, * → runs claude and codex installers/uninstallers idempotently
|
|
188
198
|
```
|
|
189
199
|
|
|
190
200
|
**File:** `bin/feynman.js`
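The target table above says that `both`, `all`, and `*` each run the claude and codex installers. A minimal sketch of how such aliases could be normalized follows; `resolveTargets` and both constants are hypothetical names used for illustration, not taken from `bin/feynman.js`.

```javascript
'use strict';

// Hypothetical constants: the concrete clients and the aliases that mean "every client".
const KNOWN_TARGETS = ['claude', 'codex'];
const EVERYTHING_ALIASES = new Set(['both', 'all', '*']);

// Expands a --target value into the list of concrete clients to process.
function resolveTargets(target) {
  if (EVERYTHING_ALIASES.has(target)) return [...KNOWN_TARGETS];
  if (KNOWN_TARGETS.includes(target)) return [target];
  // Assumption: unknown targets are rejected rather than silently ignored.
  throw new Error(`Unknown target: ${target}`);
}

console.log(resolveTargets('*')); // [ 'claude', 'codex' ]
```

Returning a fresh array on every call keeps callers from mutating the shared list. When passing `*` on a real command line, quote it so the shell does not expand it as a glob.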
|
package/docs/launch.md
CHANGED
|
@@ -9,7 +9,7 @@ become columns, priorities become scales, and status summaries become frames.
|
|
|
9
9
|
## One-liner
|
|
10
10
|
|
|
11
11
|
```bash
|
|
12
|
-
npx -y @albinocrabs/feynman@latest install --target
|
|
12
|
+
npx -y @albinocrabs/feynman@latest install --target all
|
|
13
13
|
```
|
|
14
14
|
|
|
15
15
|
## Short Description
|
|
@@ -30,9 +30,10 @@ without asking every time.
|
|
|
30
30
|
## Demo Script
|
|
31
31
|
|
|
32
32
|
```bash
|
|
33
|
-
npx -y @albinocrabs/feynman@latest install --target
|
|
33
|
+
npx -y @albinocrabs/feynman@latest install --target '*'
|
|
34
34
|
npx -y @albinocrabs/feynman@latest doctor --target claude
|
|
35
35
|
npx -y @albinocrabs/feynman@latest doctor --target codex
|
|
36
|
+
feynman bootstrap --out ./feynman-package
|
|
36
37
|
```
|
|
37
38
|
|
|
38
39
|
Prompt:
|
|
@@ -65,3 +66,10 @@ limited writes | production-ready | persistence opt
|
|
|
65
66
|
```bash
|
|
66
67
|
npx -y @albinocrabs/feynman@latest version
|
|
67
68
|
```
|
|
69
|
+
|
|
70
|
+
For the full release playbook, see: [docs/release.md](release.md)
|
|
71
|
+
|
|
72
|
+
## Detailed Release Docs
|
|
73
|
+
|
|
74
|
+
- [docs/release.md](release.md): full end-to-end release procedure, release notes
|
|
75
|
+
contract, workflow behavior, and post-release verification.
|
package/docs/object-passport.md
ADDED
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
# Feynman Product Passport
|
|
2
|
+
|
|
3
|
+
## 1) Mission and scope
|
|
4
|
+
|
|
5
|
+
`feynman` is a plugin for Claude Code and Codex that injects ASCII diagram rules into every prompt at the `UserPromptSubmit` layer.
|
|
6
|
+
|
|
7
|
+
Main goal: make structure faster to parse and reduce cognitive load in answers that involve flows, hierarchies, comparisons, and statuses.
|
|
8
|
+
|
|
9
|
+
It does not handle:
|
|
10
|
+
- application business logic,
|
|
11
|
+
- redesign of API/frontend responses,
|
|
12
|
+
- network calls or background storage of text logs.
|
|
13
|
+
|
|
14
|
+
## 2) Project artifacts
|
|
15
|
+
|
|
16
|
+
| Layer | Artifact | Purpose |
|
|
17
|
+
|---|---|---|
|
|
18
|
+
| Hook | `hooks/feynman-activate.js` | Reads state and injects `additionalContext` |
|
|
19
|
+
| Rules | `rules/feynman-activate.md` | Rule set for `lite/full/ultra` |
|
|
20
|
+
| CLI | `bin/feynman.js` | `install/uninstall/doctor/lint/examples/bootstrap/version/help` |
|
|
21
|
+
| Lint | `bin/feynman-lint.js`, `lib/lint/*` | Validates ASCII diagram correctness |
|
|
22
|
+
| Skill | `skills/feynman/SKILL.md` | Slash commands for mode control |
|
|
23
|
+
| Package | `.claude-plugin/`, `.codex-plugin/`, `hooks.json` | Plugin delivery |
|
|
24
|
+
|
|
25
|
+
## 3) Runtime state
|
|
26
|
+
|
|
27
|
+
`~/.claude/.feynman/` or `~/.codex/.feynman/`:
|
|
28
|
+
|
|
29
|
+
- `state.json`: `enabled`, `intensity`, `injections`.
|
|
30
|
+
- `.feynman-active`: enable flag.
|
|
31
|
+
|
|
32
|
+
Semantics:
|
|
33
|
+
- flag present: injection may run (if `enabled: true`);
|
|
34
|
+
- flag absent + state present: disabled by the user;
|
|
35
|
+
- both files absent: first-run bootstrap.
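The flag and state semantics can be sketched as a pure decision function. This version follows the more detailed four-case branch table shown for step [2] in `docs/architecture.md`, where an absent flag means "user disabled" only when `state.enabled=false`. The function name and the returned action labels are hypothetical, chosen for illustration.

```javascript
'use strict';

// flagPresent: whether the .feynman-active file exists.
// state: the parsed state.json object, or null when the file is missing.
// Returns a hypothetical action label; the real hook may structure this differently.
function decideFlagAction(flagPresent, state) {
  if (flagPresent) return 'continue';            // present -> continue
  if (state === null) return 'bootstrap';        // absent + no state.json -> first run
  if (state.enabled === true) return 'recreate'; // absent + enabled=true -> recreate flag
  // Assumption: any value other than `true` is treated like enabled=false.
  return 'exit';                                 // absent + enabled=false -> user disabled
}

console.log(decideFlagAction(false, { enabled: true })); // recreate
```

Keeping the decision free of file-system calls makes each branch easy to test without touching `~/.claude` or `~/.codex`.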
|
|
36
|
+
|
|
37
|
+
## 4) State management (slash)
|
|
38
|
+
|
|
39
|
+
- `/feynman on` / `/feynman start`: enable injection;
|
|
40
|
+
- `/feynman off` / `/feynman stop`: disable injection;
|
|
41
|
+
- `/feynman lite|full|ultra`: switch intensity;
|
|
42
|
+
- `/feynman` / `/feynman status`: show status without changing anything.
|
|
43
|
+
|
|
44
|
+
## 5) CLI contract
|
|
45
|
+
|
|
46
|
+
| Command | What it does |
|
|
47
|
+
|---|---|
|
|
48
|
+
| `feynman install` | Registers the hook in `settings.json`/`hooks.json` |
|
|
49
|
+
| `feynman uninstall` | Removes the hook, keeping state |
|
|
50
|
+
| `feynman doctor` | Diagnostic health check |
|
|
51
|
+
| `feynman lint` | Checks Markdown against rules L01-L08 |
|
|
52
|
+
| `feynman examples` | Prints example responses |
|
|
53
|
+
| `feynman bootstrap` | Exports artifacts to a local folder |
|
|
54
|
+
| `feynman version` | Package version |
|
|
55
|
+
| `feynman help` | Current CLI help |
|
|
56
|
+
|
|
57
|
+
## 6) Diagram quality
|
|
58
|
+
|
|
59
|
+
Rule source:
|
|
60
|
+
- `rules/feynman-activate.md` (source of truth for `lite/full/ultra`)
|
|
61
|
+
|
|
62
|
+
Updating the rules requires keeping these in sync:
|
|
63
|
+
- `hooks/feynman-activate.js` + `bin/feynman.js` + `skills/feynman/SKILL.md`
|
|
64
|
+
- plus the smoke/pattern tests in `tests/cli.test.js`.
|
|
65
|
+
|
|
66
|
+
## 7) Technical and operational constraints
|
|
67
|
+
|
|
68
|
+
- Zero deps, CommonJS.
|
|
69
|
+
- Node >= 18.
|
|
70
|
+
- No network access at runtime for hook execution (no external paths are used).
|
|
71
|
+
- Bootstrap includes the runtime files needed for offline use: `hooks/feynman-activate.js`, `bin/feynman.js`, `package.json`, and assets.
|
|
72
|
+
|
|
73
|
+
## 8) Checks and maintenance
|
|
74
|
+
|
|
75
|
+
Required for release:
|
|
76
|
+
- `feynman bootstrap --out ... --force`
|
|
77
|
+
- `feynman lint --json README.md` (recursively across docs),
|
|
78
|
+
- CI matrix (`Node 18/20` на `ubuntu`/`macos`).
|
|
79
|
+
|
|
80
|
+
## 9) Market landscape and positioning
|
|
81
|
+
|
|
82
|
+
The market has comparable prompt-control tools (IDE buttons, CLI, prompt commands), but `feynman` uniquely combines:
|
|
83
|
+
- automatic ASCII semantics via hooks,
|
|
84
|
+
- a dedicated lint for diagram quality,
|
|
85
|
+
- zero-dep runtime.
|
|
86
|
+
|
|
87
|
+
## 10) Related docs
|
|
88
|
+
|
|
89
|
+
- [Architecture](./architecture.md)
|
|
90
|
+
- [Launch](./launch.md)
|
|
91
|
+
- [RTK Playbook](../RTK.md)
|
package/docs/release.md
ADDED
|
@@ -0,0 +1,162 @@
|
|
|
1
|
+
# Release Documentation
|
|
2
|
+
|
|
3
|
+
This document defines the full release process for `@albinocrabs/feynman`, including changelog usage, release notes generation, publishing, and verification.
|
|
4
|
+
|
|
5
|
+
## 1) Scope and ownership
|
|
6
|
+
|
|
7
|
+
- Release versioning is package-driven (`package.json` `version`).
|
|
8
|
+
- GitHub release tags use `v${version}` format.
|
|
9
|
+
- Changelog is the canonical source for release notes.
|
|
10
|
+
- CI/release workflows live in `.github/workflows/ci.yml` and `.github/workflows/release.yml`.
|
|
11
|
+
|
|
12
|
+
## 2) Pre-release checks
|
|
13
|
+
|
|
14
|
+
1. Verify repo cleanliness and branch context.
|
|
15
|
+
2. Ensure `main` is clean and CI checks are green.
|
|
16
|
+
3. Confirm `package.json` version is the intended next version.
|
|
17
|
+
4. Confirm matching plugin manifest versions if changed.
|
|
18
|
+
|
|
19
|
+
Recommended commands:
|
|
20
|
+
|
|
21
|
+
```bash
|
|
22
|
+
git status --short --branch
|
|
23
|
+
git log --oneline -5 --decorate
|
|
24
|
+
npm run ci
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
## 3) Changelog-first release notes (required)
|
|
28
|
+
|
|
29
|
+
### What to update
|
|
30
|
+
|
|
31
|
+
- Edit top of `CHANGELOG.md` before tagging.
|
|
32
|
+
- Use this section format:
|
|
33
|
+
|
|
34
|
+
```md
|
|
35
|
+
## 0.2.3 - 2026-05-07
|
|
36
|
+
|
|
37
|
+
Changes since v0.2.2.
|
|
38
|
+
|
|
39
|
+
### Added
|
|
40
|
+
- ...
|
|
41
|
+
|
|
42
|
+
### Changed
|
|
43
|
+
- ...
|
|
44
|
+
|
|
45
|
+
### Fixed
|
|
46
|
+
- ...
|
|
47
|
+
|
|
48
|
+
### Maintenance
|
|
49
|
+
- ...
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
### What goes into release notes
|
|
53
|
+
|
|
54
|
+
- `GitHub Release body` is generated from the section matching the new version in `CHANGELOG.md`.
|
|
55
|
+
- If no section exists or it is empty, workflow auto-falls back to:
|
|
56
|
+
- `## <version>`
|
|
57
|
+
- `Changes since <previousTag>.`
|
|
58
|
+
- `- Release published from package.json version <version>.`
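The extraction and fallback rule described above could look roughly like this. It is a sketch under stated assumptions, not the code in `release.yml`: `extractReleaseNotes` is a hypothetical name, a header counts as matching when it is `## <version>` optionally followed by a space and more text (such as a date), and the blank lines in the fallback body are a layout guess.

```javascript
'use strict';

// Returns the body of the "## <version>" section of a changelog,
// or the documented three-line fallback when the section is missing or empty.
function extractReleaseNotes(changelog, version, previousTag) {
  const lines = changelog.split(/\r?\n/);
  // Assumption: "## 0.2.3" and "## 0.2.3 - 2026-05-07" both match version 0.2.3.
  const isHeader = (line) =>
    line === `## ${version}` || line.startsWith(`## ${version} `);

  const start = lines.findIndex(isHeader);
  let body = '';
  if (start !== -1) {
    const rest = lines.slice(start + 1);
    // The section ends at the next level-2 heading ("### ..." does not match "## ").
    const end = rest.findIndex((line) => line.startsWith('## '));
    body = (end === -1 ? rest : rest.slice(0, end)).join('\n').trim();
  }
  if (body) return body;

  return [
    `## ${version}`,
    '',
    `Changes since ${previousTag}.`,
    '',
    `- Release published from package.json version ${version}.`,
  ].join('\n');
}
```

Because the function takes the changelog text as a string, it can be exercised against fixture changelogs without reading the repository.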
|
|
59
|
+
|
|
60
|
+
### Key rule
|
|
61
|
+
|
|
62
|
+
The workflow ignores changelog command output at publish time; it extracts directly from the repository changelog section for the version.
|
|
63
|
+
|
|
64
|
+
## 4) Version and release flow
|
|
65
|
+
|
|
66
|
+
### Step-by-step sequence
|
|
67
|
+
|
|
68
|
+
1. Update files for release:
|
|
69
|
+
- `package.json` version bump.
|
|
70
|
+
- top `CHANGELOG.md` section for that version.
|
|
71
|
+
2. Run release checks:
|
|
72
|
+
- `npm run ci`
|
|
73
|
+
3. Commit and push to `main`.
|
|
74
|
+
4. Create and publish GitHub release tag `v<version>`:
|
|
75
|
+
- Via UI or CLI.
|
|
76
|
+
5. GitHub release workflow executes automatically:
|
|
77
|
+
- validates repository and package checks,
|
|
78
|
+
- extracts changelog notes,
|
|
79
|
+
- uploads tarball artifact,
|
|
80
|
+
- updates/creates release body,
|
|
81
|
+
- publishes to npm (if not already published),
|
|
82
|
+
- verifies npm propagation.
|
|
83
|
+
|
|
84
|
+
Manual local command examples:
|
|
85
|
+
|
|
86
|
+
```bash
|
|
87
|
+
git add package.json CHANGELOG.md
|
|
88
|
+
git commit -m "chore: release v0.2.3"
|
|
89
|
+
git push origin main
|
|
90
|
+
|
|
91
|
+
gh release create v0.2.3 --generate-notes --target main
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
If you need a full dry run without publish, use workflow dispatch:
|
|
95
|
+
|
|
96
|
+
```bash
|
|
97
|
+
gh workflow run release.yml -f dry_run=true
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
## 5) What each workflow step does
|
|
101
|
+
|
|
102
|
+
- Checkout with full history + tags (`fetch-depth: 0`, `fetch-tags: true`).
|
|
103
|
+
- Resolve metadata from `package.json`:
|
|
104
|
+
- `package_name`
|
|
105
|
+
- `package_version`
|
|
106
|
+
- `tag`
|
|
107
|
+
- previous tag for fallback context.
|
|
108
|
+
- Parse `CHANGELOG.md` for matching `## <version>` section.
|
|
109
|
+
- Build and test release artifact (`npm run ci` + `npm run build`).
|
|
110
|
+
- Upload `dist/*.tgz` artifact.
|
|
111
|
+
- Update release notes on GitHub release.
|
|
112
|
+
- Publish to npm using `NPM_TOKEN` (skipped if version already exists).
|
|
113
|
+
- Verify publication via repeated `npm view` checks.
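The last step, repeated `npm view` checks, amounts to bounded polling. A minimal sketch is shown below; `waitForPublication`, its option names, and the defaults are hypothetical, and the registry lookup is injected as a callback so the sketch makes no claim about how the workflow actually invokes `npm view`.

```javascript
'use strict';

// Polls an injected async check until it reports success or attempts run out.
// checkPublished: async function resolving to true once the version is visible.
// Resolves with the attempt number that succeeded; rejects after the last failure.
async function waitForPublication(checkPublished, { attempts = 5, delayMs = 1000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    if (await checkPublished()) return attempt;
    if (attempt < attempts) {
      // Wait before the next check; there is no delay after the final attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Version not visible after ${attempts} attempts`);
}
```

In a real script the callback would wrap whatever registry query the release process uses. Injecting it keeps the retry logic testable offline.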
|
|
114
|
+
|
|
115
|
+
## 6) Post-release verification
|
|
116
|
+
|
|
117
|
+
Run in order:
|
|
118
|
+
|
|
119
|
+
1. Git state and alignment:
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
git rev-parse --short HEAD
|
|
123
|
+
git rev-parse --short origin/main
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
2. Release artifact validation:
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
gh release view v0.2.3 --json name,tagName,isDraft,isPrerelease,url -q '"\(.name)\n\(.tagName)\n\(.isDraft)\n\(.isPrerelease)\n\(.url)"'
|
|
130
|
+
gh release view v0.2.3 --json body -q .body
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
3. NPM publication:
|
|
134
|
+
|
|
135
|
+
```bash
|
|
136
|
+
npm view @albinocrabs/feynman@0.2.3 version
|
|
137
|
+
npm view @albinocrabs/feynman dist-tags
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
4. Smoke test from clean env:
|
|
141
|
+
|
|
142
|
+
```bash
|
|
143
|
+
node -e "console.log('ok')" # placeholder for your install checks
|
|
144
|
+
npx -y @albinocrabs/feynman@latest version
|
|
145
|
+
```
|
|
146
|
+
|
|
147
|
+
## 7) Troubleshooting
|
|
148
|
+
|
|
149
|
+
- **Release notes still show CI/CD text**
|
|
150
|
+
- Ensure tag points to commit containing the new `CHANGELOG.md` section.
|
|
151
|
+
- Ensure section header exactly matches `## <version>`.
|
|
152
|
+
- **Publish failed**
|
|
153
|
+
- confirm `NPM_TOKEN` is set in repo secrets,
|
|
154
|
+
- verify package name in `package.json`.
|
|
155
|
+
- **Release not auto-updated**
|
|
156
|
+
- the workflow updates the release only on a `release` event or on manual dispatch with `dry_run=false`.
|
|
157
|
+
|
|
158
|
+
## 8) Required docs references
|
|
159
|
+
|
|
160
|
+
- [Release process (short)](../README.md) – README overview.
|
|
161
|
+
- [CI workflow](./launch.md#release-checklist).
|
|
162
|
+
- [Release workflow](../.github/workflows/release.yml).
|
package/examples/activity-sequence.md
ADDED
|
@@ -0,0 +1,105 @@
|
|
|
1
|
+
# Action Sequencing: Checkout Incident During Peak
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> Checkout error rate jumps during peak and latency is now above SLO. We need a
|
|
6
|
+
> concrete action sequence for recovery, including who does what, when to rollback,
|
|
7
|
+
> and when to stop doing damage-control.
|
|
8
|
+
|
|
9
|
+
## Without feynman
|
|
10
|
+
|
|
11
|
+
The team usually starts by checking dashboards, then opens an emergency channel,
|
|
12
|
+
then asks backend and DB teams to investigate. If they find a single failing
|
|
13
|
+
dependency they rollback that change; otherwise they add read-through cache and
|
|
14
|
+
throttle traffic. Communication goes out if the user impact is material.
|
|
15
|
+
|
|
16
|
+
## With feynman
|
|
17
|
+
|
|
18
|
+
Operational sequence:
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
[Alert: checkout error spike] --> [On-call acknowledges in 60s]
|
|
22
|
+
|
|
|
23
|
+
+--> [Is impact external?]
|
|
24
|
+
|
|
|
25
|
+
+-- yes --> [Start Incident Commander]
|
|
26
|
+
|
|
|
27
|
+
+-- no --> [Fast path mitigation only]
|
|
28
|
+
|
|
|
29
|
+
v
|
|
30
|
+
[Notify org channel] --> [Freeze non-critical deploys] --> [Triage by layer]
|
|
31
|
+
|
|
|
32
|
+
v
|
|
33
|
+
[Cache/DB/API/Infra]
|
|
34
|
+
|
|
|
35
|
+
[Need rollback?]
|
|
36
|
+
|
|
|
37
|
+
+-- yes --> [Rollback scoped deploy]
|
|
38
|
+
|
|
|
39
|
+
+-- no --> [Apply temporary safeguard]
|
|
40
|
+
|
|
|
41
|
+
v
|
|
42
|
+
[Stabilize + reduce blast radius]
|
|
43
|
+
|
|
|
44
|
+
v
|
|
45
|
+
[Declare recovery ETA]
|
|
46
|
+
|
|
|
47
|
+
+-- success --> [Post-incident audit]
|
|
48
|
+
|
|
|
49
|
+
+-- degraded --> [Resume changes]
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
State board (what changed inside the incident):
|
|
53
|
+
|
|
54
|
+
```
|
|
55
|
+
┌─ Incident Action Board ───────────────────────┐
|
|
56
|
+
│ triage : done │
|
|
57
|
+
│ command setup : done │
|
|
58
|
+
│ mitigation active : in-flight │
|
|
59
|
+
│ rollback : ready │
|
|
60
|
+
│ customer comms : live │
|
|
61
|
+
│ root-cause evidence : collecting │
|
|
62
|
+
└───────────────────────────────────────────────┘
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
Critical path decomposition:
|
|
66
|
+
|
|
67
|
+
```
|
|
68
|
+
[Containment]
|
|
69
|
+
├── [Enable queue throttle]
|
|
70
|
+
├── [Route reads to replica]
|
|
71
|
+
└── [Turn on circuit breaker]
|
|
72
|
+
[Recovery]
|
|
73
|
+
├── [Collect flamegraphs]
|
|
74
|
+
├── [Compare with healthy minute]
|
|
75
|
+
└── [Prepare rollback diff]
|
|
76
|
+
[Verification]
|
|
77
|
+
├── [Canary checks]
|
|
78
|
+
├── [Synthetic transaction replay]
|
|
79
|
+
└── [SLO probe]
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
Priority ladder during incident:
|
|
83
|
+
|
|
84
|
+
```
|
|
85
|
+
▲ high
|
|
86
|
+
data integrity checks
|
|
87
|
+
user-visible checkout path
|
|
88
|
+
▼ low
|
|
89
|
+
dashboard chart style changes
|
|
90
|
+
PR comments cleanup
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
Runbook gate table:
|
|
94
|
+
|
|
95
|
+
```
|
|
96
|
+
check | owner | threshold | action
|
|
97
|
+
--------------------|----------------|----------------|-------------------------
|
|
98
|
+
P95 latency | SRE | <= 700ms | continue with mitigation
|
|
99
|
+
error budget burn | SRE | <= 3%/15m | escalate communications
|
|
100
|
+
db retry pressure | Backend lead | <= 2x baseline | rotate to fallback path
|
|
101
|
+
cache hit rate | Platform lead | >= 78% | stop throttling traffic
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
This template forces action ordering, makes ownership explicit, and keeps the
|
|
105
|
+
team from drifting between investigation and mitigation under stress.
|
package/examples/api-flow.md
CHANGED
|
@@ -44,12 +44,41 @@ error response is sent.
|
|
|
44
44
|
[200 OK + token]
|
|
45
45
|
```
|
|
46
46
|
|
|
47
|
-
|
|
48
|
-
|
|
47
|
+
### Timing lens: what can degrade and where
|
|
48
|
+
|
|
49
|
+
```
|
|
50
|
+
[Client]
|
|
51
|
+
|
|
|
52
|
+
v
|
|
53
|
+
[CORS] --> [Parser] --> [Validation] --> [Controller] --> [AuthService]
|
|
54
|
+
| |
|
|
55
|
+
| +--> [401]
|
|
56
|
+
|
|
|
57
|
+
+--> [429/503]
|
|
58
|
+
+--> [400]
|
|
59
|
+
|
|
60
|
+
[AuthService] --> [UserRepository] --> [bcrypt.compare]
|
|
61
|
+
|
|
|
62
|
+
+-- mismatch --> [401]
|
|
63
|
+
|
|
|
64
|
+
+-- hit --> [JWT sign] --> [200]
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
Error-path table:
|
|
68
|
+
|
|
69
|
+
```
|
|
70
|
+
stage | symptom | response | recovery
|
|
71
|
+
-------------------|--------------|----------|-------------------
|
|
72
|
+
validation failed | 400 | reject | fix request payload
|
|
73
|
+
user not found | 401 | reject | prompt signup/help
|
|
74
|
+
credentials mismatch| 401 | reject | suggest retry/reset
|
|
75
|
+
dependency timeout | 503/504 | fail | retry/backoff
|
|
76
|
+
success | 200 | token | cache token metadata
|
|
77
|
+
```
|
|
49
78
|
|
|
50
79
|
## Why this works
|
|
51
80
|
|
|
52
81
|
The request lifecycle is a sequential flow with conditional branches, which
|
|
53
|
-
activates feynman's flow
|
|
82
|
+
activates feynman's flow-diagram rules. Boxes (`[…]`) mark processing stages;
|
|
54
83
|
arrows (`-->`) mark data flow; branch splits show the conditional paths at
|
|
55
84
|
validation and credential-check points.
|
package/examples/bug-isolation.md
ADDED
|
@@ -0,0 +1,89 @@
|
|
|
1
|
+
# Bug Isolation: Intermittent 500s in Checkout
|
|
2
|
+
|
|
3
|
+
## Question
|
|
4
|
+
|
|
5
|
+
> Checkout API starts returning 500s at random intervals under normal load. How do we
|
|
6
|
+
> isolate root cause without a full-service shutdown?
|
|
7
|
+
|
|
8
|
+
## Without feynman
|
|
9
|
+
|
|
10
|
+
Start by checking logs, then look at DB and cache metrics, then inspect release notes,
|
|
11
|
+
and finally reproduce with synthetic traffic around the failure window. If needed, roll
|
|
12
|
+
back gradually while validating with a canary cohort.
|
|
13
|
+
|
|
14
|
+
## With feynman
|
|
15
|
+
|
|
16
|
+
Hypothesis tree:
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
intermittent 500s
|
|
20
|
+
├── app code
|
|
21
|
+
│ ├── null dereference
|
|
22
|
+
│ └── unhandled exception
|
|
23
|
+
├── data layer
|
|
24
|
+
│ ├── deadlock / lock timeout
|
|
25
|
+
│ ├── stale row locks
|
|
26
|
+
│ └── missing index
|
|
27
|
+
├── infrastructure
|
|
28
|
+
│ ├── DB connection pool exhaustion
|
|
29
|
+
│ └── Redis timeout
|
|
30
|
+
└── operational
|
|
31
|
+
├── rollout wave overlap
|
|
32
|
+
└── scheduled jobs interference
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
Isolation flow:
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
[Symptom observed] --> [Scope blast radius]
|
|
39
|
+
|
|
|
40
|
+
+----------+----------+
|
|
41
|
+
| |
|
|
42
|
+
v v
|
|
43
|
+
[Single endpoint only] [All endpoints]
|
|
44
|
+
| |
|
|
45
|
+
yes/no yes/no
|
|
46
|
+
| |
|
|
47
|
+
v v
|
|
48
|
+
[Replay payload] [Re-check infra]
|
|
49
|
+
| |
|
|
50
|
+
v v
|
|
51
|
+
[Exception trace matches]
|
|
52
|
+
[Connection errors]
|
|
53
|
+
yes/no
|
|
54
|
+
| |
|
|
55
|
+
v v
|
|
56
|
+
[fix app] [next hypothesis]
|
|
57
|
+
| |
|
|
58
|
+
v v
|
|
59
|
+
[fix infra] [rollback segment]
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
Impact priority:
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
▲ high
|
|
66
|
+
user checkout failure
|
|
67
|
+
payment status integrity
|
|
68
|
+
canary regression safety
|
|
69
|
+
▼ low
|
|
70
|
+
low-frequency telemetry noise
|
|
71
|
+
non-critical UI latency
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
Validation decision table:
|
|
75
|
+
|
|
76
|
+
```
|
|
77
|
+
check | command | expected
|
|
78
|
+
------------------|--------------------------------|-------------------------
|
|
79
|
+
error correlation | logs + correlation ids | grouped by checkout_id
|
|
80
|
+
db pressure | pool saturation metrics | stable for 15m
|
|
81
|
+
cache health | hit rate and timeout count | no timeout spike
|
|
82
|
+
deployment diff | feature flags + release notes | no new high-risk delta
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
## Why this works
|
|
86
|
+
|
|
87
|
+
The diagram drives diagnosis from hypotheses to actions: the tree narrows the search
|
|
88
|
+
space, the flow controls the order of experiments, and the priority block
|
|
89
|
+
shows what gets fixed immediately.
|