@simplysm/sd-claude 13.0.78 → 13.0.81

Files changed (68)
  1. package/claude/rules/sd-claude-rules.md +4 -63
  2. package/claude/rules/sd-simplysm-usage.md +7 -0
  3. package/claude/sd-session-start.sh +10 -0
  4. package/claude/sd-statusline.py +249 -0
  5. package/claude/skills/sd-api-review/SKILL.md +89 -0
  6. package/claude/skills/sd-check/SKILL.md +55 -57
  7. package/claude/skills/sd-commit/SKILL.md +37 -42
  8. package/claude/skills/sd-debug/SKILL.md +75 -265
  9. package/claude/skills/sd-document/SKILL.md +63 -53
  10. package/claude/skills/sd-document/_common.py +94 -0
  11. package/claude/skills/sd-document/extract_docx.py +19 -48
  12. package/claude/skills/sd-document/extract_pdf.py +22 -50
  13. package/claude/skills/sd-document/extract_pptx.py +17 -40
  14. package/claude/skills/sd-document/extract_xlsx.py +19 -40
  15. package/claude/skills/sd-email-analyze/SKILL.md +23 -31
  16. package/claude/skills/sd-email-analyze/email-analyzer.py +79 -65
  17. package/claude/skills/sd-init/SKILL.md +133 -0
  18. package/claude/skills/sd-plan/SKILL.md +69 -120
  19. package/claude/skills/sd-readme/SKILL.md +106 -131
  20. package/claude/skills/sd-review/SKILL.md +38 -155
  21. package/claude/skills/sd-simplify/SKILL.md +59 -0
  22. package/dist/commands/install.js +20 -6
  23. package/dist/commands/install.js.map +1 -1
  24. package/package.json +3 -2
  25. package/src/commands/install.ts +29 -7
  26. package/README.md +0 -297
  27. package/claude/refs/sd-angular.md +0 -127
  28. package/claude/refs/sd-code-conventions.md +0 -155
  29. package/claude/refs/sd-directories.md +0 -7
  30. package/claude/refs/sd-library-issue.md +0 -7
  31. package/claude/refs/sd-migration.md +0 -7
  32. package/claude/refs/sd-orm-v12.md +0 -81
  33. package/claude/refs/sd-orm.md +0 -23
  34. package/claude/refs/sd-service.md +0 -5
  35. package/claude/refs/sd-simplysm-docs.md +0 -52
  36. package/claude/refs/sd-solid.md +0 -68
  37. package/claude/refs/sd-workflow.md +0 -25
  38. package/claude/rules/sd-refs-linker.md +0 -52
  39. package/claude/sd-statusline.js +0 -296
  40. package/claude/skills/sd-api-name-review/SKILL.md +0 -154
  41. package/claude/skills/sd-brainstorm/SKILL.md +0 -215
  42. package/claude/skills/sd-debug/condition-based-waiting-example.ts +0 -158
  43. package/claude/skills/sd-debug/condition-based-waiting.md +0 -114
  44. package/claude/skills/sd-debug/defense-in-depth.md +0 -128
  45. package/claude/skills/sd-debug/find-polluter.sh +0 -64
  46. package/claude/skills/sd-debug/root-cause-tracing.md +0 -168
  47. package/claude/skills/sd-discuss/SKILL.md +0 -91
  48. package/claude/skills/sd-explore/SKILL.md +0 -118
  49. package/claude/skills/sd-plan-dev/SKILL.md +0 -294
  50. package/claude/skills/sd-plan-dev/code-quality-reviewer-prompt.md +0 -49
  51. package/claude/skills/sd-plan-dev/final-review-prompt.md +0 -50
  52. package/claude/skills/sd-plan-dev/implementer-prompt.md +0 -60
  53. package/claude/skills/sd-plan-dev/spec-reviewer-prompt.md +0 -45
  54. package/claude/skills/sd-review/api-reviewer-prompt.md +0 -75
  55. package/claude/skills/sd-review/code-reviewer-prompt.md +0 -82
  56. package/claude/skills/sd-review/convention-checker-prompt.md +0 -61
  57. package/claude/skills/sd-review/refactoring-analyzer-prompt.md +0 -92
  58. package/claude/skills/sd-skill/SKILL.md +0 -417
  59. package/claude/skills/sd-skill/anthropic-best-practices.md +0 -156
  60. package/claude/skills/sd-skill/cso-guide.md +0 -161
  61. package/claude/skills/sd-skill/examples/CLAUDE_MD_TESTING.md +0 -200
  62. package/claude/skills/sd-skill/persuasion-principles.md +0 -220
  63. package/claude/skills/sd-skill/testing-skills-with-subagents.md +0 -408
  64. package/claude/skills/sd-skill/writing-guide.md +0 -159
  65. package/claude/skills/sd-tdd/SKILL.md +0 -385
  66. package/claude/skills/sd-tdd/testing-anti-patterns.md +0 -317
  67. package/claude/skills/sd-use/SKILL.md +0 -67
  68. package/claude/skills/sd-worktree/SKILL.md +0 -78
@@ -1,154 +0,0 @@
1
- ---
2
- name: sd-api-name-review
3
- description: "Public API naming review (explicit invocation only)"
4
- model: sonnet
5
- ---
6
-
7
- # sd-api-name-review
8
-
9
- ## Overview
10
-
11
- Compare a library/module's public API names against industry standards and review internal consistency, producing a standardization report. Uses **sd-explore** to extract the API surface, then dispatches research agents for industry comparison.
12
-
13
- **Analysis only — no code modifications.**
14
-
15
- ## Principles
16
-
17
- - **Breaking changes are irrelevant**: Do NOT dismiss findings because renaming would cause a breaking change.
18
- - **Internal consistency first**: Internal naming consistency takes priority over external standards.
19
-
20
- ## Usage
21
-
22
- - `/sd-api-name-review packages/solid` — full naming review
23
- - `/sd-api-name-review packages/orm-common` — review specific package
24
- - `/sd-api-name-review` — if no argument, ask the user for the target path
25
-
26
- ## Target Selection
27
-
28
- - With argument: review source code at the given path
29
- - Without argument: ask the user for the target path
30
-
31
- **Important:** Review ALL source files under the target path. Do not use git status or git diff to limit scope.
32
-
33
- ## Workflow
34
-
35
- ### Step 1: Prepare Context
36
-
37
- Read these files:
38
- - `CLAUDE.md` — project overview
39
- - `.claude/rules/sd-refs-linker.md` — reference guide
40
- - Target's `package.json` — version (v12/v13)
41
-
42
- Based on version and target, read all applicable reference files (e.g., `sd-code-conventions.md`, `sd-solid.md`).
43
-
44
- Keep the collected conventions in memory — they will inform the analysis in later steps.
45
-
46
- ### Step 2: API Extraction (via sd-explore)
47
-
48
- Follow the **sd-explore** workflow to extract the target's public API surface.
49
-
50
- **sd-explore input:**
51
-
52
- - **Target path**: the review target directory
53
- - **Name**: `api-name-review`
54
- - **File patterns**: `**/*.ts`, `**/*.tsx` (exclude `node_modules`, `dist`)
55
- - **Analysis instructions**:
56
-
57
- "For each file, extract its public API surface:
58
- - All exported identifiers (functions, classes, types, constants, etc.)
59
- - Names and types of user-facing parameters/options/config
60
- - Naming pattern classification (prefixes, suffixes, verb/adjective/noun usage, abbreviations, etc.)
61
-
62
- Output format:
63
- ```
64
- # API Surface: [directory names]
65
-
66
- ## Exports
67
- - `path/to/file.ts` — `exportName`: type (function/class/type/const), signature summary
68
-
69
- ## Naming Patterns
70
- - Pattern: description (e.g., 'create-' prefix for factory functions)
71
- - Examples: list of identifiers using this pattern
72
- ```
73
- "
74
-
75
- ### Step 3: Industry Standard Research
76
-
77
- Based on Step 2 results:
78
-
79
- 1. Identify **recurring naming patterns** from the extracted API
80
- 2. Determine the target's domain and tech stack to **select comparable libraries**
81
- 3. Dispatch **parallel agents** to web-search/fetch official docs for each comparable library, investigating naming conventions for the same pattern categories
82
-
83
- Each research agent receives:
84
-
85
- ```
86
- Research naming conventions in [library name] for these pattern categories:
87
- [list of patterns from Step 2]
88
-
89
- For each pattern, document:
90
- - What naming convention the library uses
91
- - Specific examples from the API
92
- - Any documented rationale for the convention
93
-
94
- Write results to: .tmp/api-name-review/research-{library_name}.md
95
- ```
96
-
97
- ### Step 4: Comparative Analysis & Verification
98
-
99
- Cross-compare Step 2 (API surface) and Step 3 (industry research) results.
100
-
101
- Classify each naming pattern:
102
-
103
- | Priority | Criteria |
104
- | -------- | ------------------------------------------------------ |
105
- | **P0** | Misaligned with majority of surveyed libraries |
106
- | **P1** | Internal inconsistency (same concept, different names) |
107
- | **P2** | Better industry term exists (optional) |
108
- | **Keep** | Already aligned with standards |
109
-
110
- **MANDATORY: Read actual code for EVERY finding.** For each finding, `Read` the file at the referenced location before finalizing. Do NOT rely on explore descriptions alone — verify against the actual code.
111
-
112
- Each finding includes: current name, recommended change, rationale (usage patterns per library).
113
-
114
- ### Step 5: Report & User Confirmation
115
-
116
- Present **Keep** items to the user as a summary.
117
-
118
- Then present each **P0/P1/P2** finding to the user **one at a time**, ordered by priority (P0 → P1 → P2).
119
-
120
- For each finding, explain:
121
- 1. **What the problem is** — the current name and why it's misaligned or inconsistent
122
- 2. **How it could be fixed** — recommended name(s) with rationale from surveyed libraries
123
- 3. **Ask**: address this or skip?
124
-
125
- Collect only findings the user confirms. If the user skips all findings, report that and end.
126
-
127
- ### Step 6: Brainstorm Handoff
128
-
129
- Invoke **sd-brainstorm** with all user-confirmed findings as context:
130
-
131
- _
132
- "Design naming changes for the following review findings.
133
-
134
- **For each finding, you MUST:**
135
- 1. Review it thoroughly — examine the code, understand the context, assess the real impact
136
- 2. If any aspect is unclear or ambiguous, ask the user (one question at a time, per brainstorm rules)
137
- 3. If a finding has low cost-benefit (adds complexity for marginal gain, pure style preference, scope too small), drop it. After triage, briefly list all dropped findings with one-line reasons (no user confirmation needed).
138
- 4. For findings worth fixing, explore approaches and design solutions
139
-
140
- Findings that survive your triage become the design scope. Apply your normal brainstorm process (gap review → approaches → design presentation) to the surviving findings as a group.
141
-
142
- <include all confirmed findings with their priority, file:line, current name, recommended name, and rationale>"
143
-
144
- sd-brainstorm then owns the full cycle: triage (with user input as needed) → design.
145
-
146
- ## Common Mistakes
147
-
148
- | Mistake | Fix |
149
- |---------|-----|
150
- | Using git diff to limit scope | Review ALL source files under target |
151
- | Skipping context preparation | Always read conventions and refs before analysis |
152
- | Skipping verification | Always verify findings against actual code |
153
- | Dismissing findings due to breaking changes | Breaking changes are irrelevant — report the naming issue |
154
- | Not writing research results to files | Research agents MUST write to disk — prevents context bloat |
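Note on the removed Step 4 and Step 5 above: the priority table and the "P0 → P1 → P2" presentation order can be read as a small classification routine. The following is only a sketch of that reading. Every name in it (`PatternEvidence`, `classify`, `reviewOrder`) is hypothetical and does not appear in the skill file, and it assumes that the highest applicable priority wins when several criteria match, which the table itself does not state.

```typescript
// Hypothetical shapes for illustration; none of these names come from the skill file.
type Priority = "P0" | "P1" | "P2" | "Keep";

interface PatternEvidence {
  name: string; // e.g. "create- prefix for factory functions"
  librariesSurveyed: number; // comparable libraries researched in Step 3
  librariesAgreeing: number; // how many of them use the same convention
  internalVariants: number; // distinct names used internally for the same concept
  betterTermExists: boolean; // a more common industry term was found
}

// Assumption: the first matching row of the priority table wins.
function classify(e: PatternEvidence): Priority {
  // P0: more than half of the surveyed libraries use a different convention.
  if (e.librariesSurveyed > 0 && e.librariesAgreeing * 2 < e.librariesSurveyed) return "P0";
  // P1: the same concept carries more than one name inside the target.
  if (e.internalVariants > 1) return "P1";
  // P2: optional improvement.
  if (e.betterTermExists) return "P2";
  return "Keep";
}

const ORDER: Record<Priority, number> = { P0: 0, P1: 1, P2: 2, Keep: 3 };

// Keep items are summarized separately; findings are presented P0 -> P1 -> P2.
function reviewOrder(patterns: PatternEvidence[]) {
  const classified = patterns.map((p) => ({ name: p.name, priority: classify(p) }));
  return {
    keep: classified.filter((c) => c.priority === "Keep"),
    findings: classified
      .filter((c) => c.priority !== "Keep")
      .sort((a, b) => ORDER[a.priority] - ORDER[b.priority]),
  };
}
```

In the skill itself this judgment was made by the reviewing agent after reading the code, so the numeric thresholds here are illustrative rather than prescribed.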
@@ -1,215 +0,0 @@
1
- ---
2
- name: sd-brainstorm
3
- description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation."
4
- ---
5
-
6
- # Brainstorming Ideas Into Designs
7
-
8
- ## Overview
9
-
10
- Help turn ideas into fully formed designs and specs through natural collaborative dialogue.
11
-
12
- Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design in small sections (200-300 words), checking after each section whether it looks right so far.
13
-
14
- ## The Process
15
-
16
- **Understanding the idea:**
17
- - Check out the current project state first (files, docs, recent commits).
18
- - Ask questions one at a time to refine the idea
19
- - Prefer multiple choice questions when possible, but open-ended is fine too
20
- - Only one question per message - if a topic needs more exploration, break it into multiple questions
21
- - Focus on understanding: purpose, constraints, success criteria
22
-
23
- **When a main design document is provided as context:**
24
-
25
- ```mermaid
26
- flowchart TD
27
- A{"Main design with<br>section plan in context?"}
28
- A -->|no| B[Normal brainstorm]
29
- A -->|yes| C{Section specified?}
30
- C -->|no| D["Show section progress<br>Ask which section<br>(suggest next incomplete)"]
31
- C -->|yes| E{"Prerequisites<br>complete?"}
32
- E -->|yes| F[Proceed with section]
33
- E -->|no| G["Warn prerequisites incomplete<br>Ask: proceed anyway<br>or complete first?"]
34
- G -->|"user: proceed"| F
35
- ```
36
-
37
- When proceeding with a section:
38
-
39
- 1. **Read the main design** — understand goals, overall structure, and the target section's scope
40
- 2. **Read actual code** — check the current codebase state for what previous sections have built. Reference the **actual code**, NOT previous section design documents. Code may have diverged from earlier designs during implementation.
41
- 3. **Scope the brainstorm** — limit questions, gap review, approaches, and design presentation to the target section only. Do not re-question decisions already established in the main design.
42
- 4. **Conflict detection** — if the main design's direction conflicts with the actual code state, alert the user and ask for direction before proceeding.
43
- 5. After the design is complete, save as `docs/plans/YYYY-MM-DD-<topic>-section-N-design.md`
44
- 6. Update the main design document: mark the section `[ ]` → `[x]` in the section plan
45
- 7. Commit both files, then proceed to the normal **Next Steps Guide** (Path A/B)
46
-
47
- **Gap review loop:**
48
-
49
- When you think you've asked enough, **STOP and run a gap review before moving on.**
50
-
51
- Tell the user you're running a gap review, then check ALL categories. For each ✅, you MUST **cite specific evidence** (which Q&A, code reference, or explicit user requirement). "I already know" is not evidence.
52
-
53
- | Category | Check for... |
54
- |----------|-------------|
55
- | Scope | What's in? What's explicitly out? |
56
- | User flows | All inputs, outputs, feedback, navigation |
57
- | Edge cases | Empty states, errors, limits, concurrency, undo |
58
- | Data | Shape, validation, persistence, migration, relationships |
59
- | Integration | How does this connect to existing code/systems? |
60
- | Non-functional | Performance, accessibility, security, i18n |
61
- | Assumptions | Anything you assumed but never confirmed |
62
-
63
- Output format — cite evidence for each:
64
- - `✅ Scope — [Q2: user confirmed X / code at file:line / requirement doc says Y]`
65
- - `❓ Edge cases — gap: [what's missing]`
66
-
67
- If evidence is vague ("obvious", "I already know", "common sense") → mark as ❓, not ✅.
68
-
69
- - If ANY ❓ exists → ask about it. After the user answers, **run the full checklist again from scratch**.
70
- - Only when ALL categories show ✅ with cited evidence → proceed to exploring approaches.
71
-
72
- **All-✅ on first run is PROHIBITED — not "suspicious", prohibited.**
73
- If your first gap review shows all ✅:
74
- 1. You are rubber-stamping. Prior investigation ≠ complete design exploration.
75
- 2. Pick the 2 weakest categories (thinnest evidence).
76
- 3. Write one concrete unasked question per category.
77
- 4. Ask those questions, then re-run the full checklist from scratch.
78
-
79
- | Excuse | Reality |
80
- |--------|---------|
81
- | "Requirements are already clear" | Clear requirements ≠ complete design. Edge cases, error states, integration points still need exploration. |
82
- | "I already investigated the code" | Code investigation reveals what IS. Design exploration asks what SHOULD BE. Different activities. |
83
- | "It's just a bug fix" | Bug fixes have edge cases: error states, concurrent access, timing changes, consumer compatibility. |
84
- | "User is frustrated/in a hurry" | Rushing causes exactly the mistakes brainstorming prevents. Slow down. |
85
-
86
- **Rules:**
87
- - You MUST show the checklist to the user every time you run it. No silent/internal-only checks.
88
- - Each run must re-examine ALL categories from zero — do not carry over previous results.
89
- - When in doubt, ask. One extra question costs less than a flawed design.
90
-
91
- **Exploring approaches:**
92
- - Propose 2-3 different approaches with trade-offs
93
- - Present options conversationally with your recommendation and reasoning
94
- - Lead with your recommended option and explain why
95
-
96
- **Scale assessment:**
97
-
98
- After the approach is selected, assess scale (file count, logic complexity, number of distinct subsystems, scope of impact):
99
-
100
- ```mermaid
101
- flowchart TD
102
- A{"Assess design scale"}
103
- A -->|manageable| B["Proceed to<br>After the Design<br>(Path A/B)"]
104
- A -->|large| C["Propose to user:<br>proceed as-is OR<br>split into sections"]
105
- C --> D{"User choice?"}
106
- D -->|"proceed as-is"| B
107
- D -->|split| E["Propose 2-3 section<br>division approaches<br>(by feature/layer/dependency)"]
108
- E -->|"user selects"| F["Append section plan<br>to design doc<br>Save + commit"]
109
- F --> G["Show section guide<br>Brainstorm ENDS"]
110
- ```
111
-
112
- **How to present the split proposal:**
113
-
114
- When proposing the split to the user, you MUST clearly explain what "section split" means:
115
-
116
- - **Section split** = the design document is divided into sections, and each section goes through its own **separate brainstorm → plan → plan-dev → check → commit cycle**.
117
- - This is NOT about implementation phasing (doing some changes before others). It's about breaking the design work itself into independently deliverable chunks.
118
- - Explain: "Splitting into sections means each section goes through its own brainstorm → plan → plan-dev cycle. Complete and commit one section before moving to the next."
119
- - Contrast with: "Proceeding as-is means this single design document goes straight to plan → plan-dev."
120
-
121
- **Section plan format** (append to existing design content as-is):
122
-
123
- ```markdown
124
- ---
125
-
126
- ## Section Plan
127
-
128
- - [ ] Section 1: <name> — <scope summary>
129
- - [ ] Section 2: <name> — <scope summary> (after section 1)
130
- - [ ] Section 3: <name> — <scope summary> (after section 1, 2)
131
- ```
132
-
133
- **Section guide** (shown instead of Path A/B, in user's configured language):
134
-
135
- ```
136
- Design has been split into sections.
137
-
138
- Main design: docs/plans/YYYY-MM-DD-<topic>-design.md
139
-
140
- Section progress:
141
- - [ ] Section 1: <name>
142
- - [ ] Section 2: <name> (after section 1)
143
- - [ ] Section 3: <name> (after section 1, 2)
144
-
145
- Run each section in order:
146
- sd-brainstorm docs/plans/YYYY-MM-DD-<topic>-design.md section 1
147
-
148
- After each section's brainstorm completes, you can choose Path A/B
149
- to run plan → plan-dev → check → commit.
150
- ```
151
-
152
- Do NOT auto-proceed to any section.
153
-
154
- ## After the Design
155
-
156
- **Documentation:**
157
- - Write the validated design to `docs/plans/YYYY-MM-DD-<topic>-design.md`
158
- - Commit the design document to git
159
-
160
- **Next Steps Guide:**
161
-
162
- Present the following two workflow paths so the user can see the full process and choose.
163
- Display the guide in the **user's configured language** (follow the language settings from CLAUDE.md or system instructions).
164
-
165
- Before presenting, check git status for uncommitted changes. If there are any uncommitted changes (staged, unstaged, or untracked files), append the warning line (shown below) at the end of the guide block.
166
-
167
- ```
168
- Design complete! Here's how to proceed:
169
-
170
- --- Path A: With branch isolation (recommended for features/large changes) ---
171
-
172
- 1. /sd-worktree add <name> — Create a worktree branch
173
- 2. /sd-plan — Break into detailed tasks
174
- 3. /sd-plan-dev — Execute tasks in parallel (includes TDD + review)
175
- 4. /sd-check — Verify (modified + dependents)
176
- 5. /sd-commit — Commit
177
- 6. /sd-worktree merge — Merge back to main
178
- 7. /sd-worktree clean — Remove worktree
179
-
180
- --- Path B: Direct on current branch (quick fixes/small changes) ---
181
-
182
- 1. /sd-plan — Break into detailed tasks
183
- 2. /sd-plan-dev — Execute tasks in parallel (includes TDD + review)
184
- 3. /sd-check — Verify (modified + dependents)
185
- 4. /sd-commit — Commit
186
-
187
- You can start from any step or skip steps as needed.
188
-
189
- 💡 "Path A: yolo" or "Path B: yolo" to auto-run all steps
190
-
191
- ⚠️ You have uncommitted changes. To use Path A, run `/sd-commit all` first.
192
- ```
193
-
194
- - The last `⚠️` line is only shown when uncommitted changes exist. Omit it when working tree is clean.
195
- - If the design does NOT involve code modifications, omit the `/sd-check` step from both paths.
196
-
197
- - After presenting both paths, **recommend one** based on the design's scope:
198
- - Path A recommended: new features, multi-file changes, architectural changes, anything that benefits from isolation
199
- - Path B recommended: small bug fixes, single-file changes, config tweaks, minor adjustments
200
- - Briefly explain why (1 sentence)
201
- - Do NOT auto-proceed to any step. Present the overview with recommendation and wait for the user's choice.
202
- - **Yolo mode**: If the user responds with "Path A: yolo" or "Path B: yolo" (or similar intent like "A yolo", "B auto"), execute all steps of the chosen path sequentially without stopping between steps.
203
- - **Yolo sd-check — include dependents**: NEVER check only modified packages. Also check all packages that depend on them:
204
- 1. Identify modified packages from `git diff --name-only`
205
- 2. Trace reverse dependencies (packages that import from modified packages) using `package.json` or project dependency graph
206
- 3. Include integration/e2e tests that cover the modified packages
207
- 4. Run `/sd-check` with all affected paths, or `/sd-check` without path (whole project) when changes are widespread
208
-
209
- ## Key Principles
210
-
211
- - **One question at a time** - Don't overwhelm with multiple questions
212
- - **Multiple choice preferred** - Easier to answer than open-ended when possible
213
- - **YAGNI ruthlessly** - Remove unnecessary features from all designs
214
- - **Explore alternatives** - Always propose 2-3 approaches before settling
215
- - **Be flexible** - Go back and clarify when something doesn't make sense
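Note on the removed "Yolo sd-check" rule above: steps 1 and 2 (identify modified packages, then trace reverse dependencies) amount to a graph walk. The sketch below shows one way to compute the affected set. The `DependencyGraph` shape and the `affectedPackages` name are assumptions made for illustration, and following dependents transitively is also an assumption, since the rule only says to include packages that import from the modified ones.

```typescript
// Hypothetical input: package name -> workspace packages it depends on,
// as could be collected from each package.json.
type DependencyGraph = Record<string, string[]>;

// Returns the modified packages plus everything that depends on them,
// directly or (by assumption) transitively, sorted by name.
function affectedPackages(graph: DependencyGraph, modified: string[]): string[] {
  const affected = new Set<string>(modified);
  const queue = modified.slice();

  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const pkg of Object.keys(graph)) {
      const dependsOnCurrent = graph[pkg].indexOf(current) !== -1;
      if (dependsOnCurrent && !affected.has(pkg)) {
        affected.add(pkg);
        queue.push(pkg); // its own dependents must be checked too
      }
    }
  }

  return Array.from(affected).sort();
}

// Example graph with made-up package names.
const exampleGraph: DependencyGraph = {
  core: [],
  orm: ["core"],
  service: ["orm"],
  cli: [],
};
console.log(affectedPackages(exampleGraph, ["core"]));
```

The resulting list is what would be passed to `/sd-check`. When it covers most of the workspace, the rule's fallback of running `/sd-check` without a path applies.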
@@ -1,158 +0,0 @@
1
- // Complete implementation of condition-based waiting utilities
2
- // From: Lace test infrastructure improvements (2025-10-03)
3
- // Context: Fixed 15 flaky tests by replacing arbitrary timeouts
4
-
5
- import type { ThreadManager } from "~/threads/thread-manager";
6
- import type { LaceEvent, LaceEventType } from "~/threads/types";
7
-
8
- /**
9
- * Wait for a specific event type to appear in thread
10
- *
11
- * @param threadManager - The thread manager to query
12
- * @param threadId - Thread to check for events
13
- * @param eventType - Type of event to wait for
14
- * @param timeoutMs - Maximum time to wait (default 5000ms)
15
- * @returns Promise resolving to the first matching event
16
- *
17
- * Example:
18
- * await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT');
19
- */
20
- export function waitForEvent(
21
- threadManager: ThreadManager,
22
- threadId: string,
23
- eventType: LaceEventType,
24
- timeoutMs = 5000,
25
- ): Promise<LaceEvent> {
26
- return new Promise((resolve, reject) => {
27
- const startTime = Date.now();
28
-
29
- const check = () => {
30
- const events = threadManager.getEvents(threadId);
31
- const event = events.find((e) => e.type === eventType);
32
-
33
- if (event) {
34
- resolve(event);
35
- } else if (Date.now() - startTime > timeoutMs) {
36
- reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`));
37
- } else {
38
- setTimeout(check, 10); // Poll every 10ms for efficiency
39
- }
40
- };
41
-
42
- check();
43
- });
44
- }
45
-
46
- /**
47
- * Wait for a specific number of events of a given type
48
- *
49
- * @param threadManager - The thread manager to query
50
- * @param threadId - Thread to check for events
51
- * @param eventType - Type of event to wait for
52
- * @param count - Number of events to wait for
53
- * @param timeoutMs - Maximum time to wait (default 5000ms)
54
- * @returns Promise resolving to all matching events once count is reached
55
- *
56
- * Example:
57
- * // Wait for 2 AGENT_MESSAGE events (initial response + continuation)
58
- * await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2);
59
- */
60
- export function waitForEventCount(
61
- threadManager: ThreadManager,
62
- threadId: string,
63
- eventType: LaceEventType,
64
- count: number,
65
- timeoutMs = 5000,
66
- ): Promise<LaceEvent[]> {
67
- return new Promise((resolve, reject) => {
68
- const startTime = Date.now();
69
-
70
- const check = () => {
71
- const events = threadManager.getEvents(threadId);
72
- const matchingEvents = events.filter((e) => e.type === eventType);
73
-
74
- if (matchingEvents.length >= count) {
75
- resolve(matchingEvents);
76
- } else if (Date.now() - startTime > timeoutMs) {
77
- reject(
78
- new Error(
79
- `Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})`,
80
- ),
81
- );
82
- } else {
83
- setTimeout(check, 10);
84
- }
85
- };
86
-
87
- check();
88
- });
89
- }
90
-
91
- /**
92
- * Wait for an event matching a custom predicate
93
- * Useful when you need to check event data, not just type
94
- *
95
- * @param threadManager - The thread manager to query
96
- * @param threadId - Thread to check for events
97
- * @param predicate - Function that returns true when event matches
98
- * @param description - Human-readable description for error messages
99
- * @param timeoutMs - Maximum time to wait (default 5000ms)
100
- * @returns Promise resolving to the first matching event
101
- *
102
- * Example:
103
- * // Wait for TOOL_RESULT with specific ID
104
- * await waitForEventMatch(
105
- * threadManager,
106
- * agentThreadId,
107
- * (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123',
108
- * 'TOOL_RESULT with id=call_123'
109
- * );
110
- */
111
- export function waitForEventMatch(
112
- threadManager: ThreadManager,
113
- threadId: string,
114
- predicate: (event: LaceEvent) => boolean,
115
- description: string,
116
- timeoutMs = 5000,
117
- ): Promise<LaceEvent> {
118
- return new Promise((resolve, reject) => {
119
- const startTime = Date.now();
120
-
121
- const check = () => {
122
- const events = threadManager.getEvents(threadId);
123
- const event = events.find(predicate);
124
-
125
- if (event) {
126
- resolve(event);
127
- } else if (Date.now() - startTime > timeoutMs) {
128
- reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
129
- } else {
130
- setTimeout(check, 10);
131
- }
132
- };
133
-
134
- check();
135
- });
136
- }
137
-
138
- // Usage example from actual debugging session:
139
- //
140
- // BEFORE (flaky):
141
- // ---------------
142
- // const messagePromise = agent.sendMessage('Execute tools');
143
- // await new Promise(r => setTimeout(r, 300)); // Hope tools start in 300ms
144
- // agent.abort();
145
- // await messagePromise;
146
- // await new Promise(r => setTimeout(r, 50)); // Hope results arrive in 50ms
147
- // expect(toolResults.length).toBe(2); // Fails randomly
148
- //
149
- // AFTER (reliable):
150
- // ----------------
151
- // const messagePromise = agent.sendMessage('Execute tools');
152
- // await waitForEventCount(threadManager, threadId, 'TOOL_CALL', 2); // Wait for tools to start
153
- // agent.abort();
154
- // await messagePromise;
155
- // await waitForEventCount(threadManager, threadId, 'TOOL_RESULT', 2); // Wait for results
156
- // expect(toolResults.length).toBe(2); // Always succeeds
157
- //
158
- // Result: 60% pass rate → 100%, 40% faster execution
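Note on the removed helper file above: its usage example is commented out and depends on an `agent` object that is not shown. The sketch below exercises the same polling logic against an in-memory stand-in so that it can run on its own. `InMemoryThreadManager`, the reduced `LaceEvent` types, and `demo` are hypothetical. The real `ThreadManager` and `LaceEvent` definitions are not part of this diff.

```typescript
// Hypothetical minimal stand-ins for the types imported by the removed file.
type LaceEventType = "TOOL_CALL" | "TOOL_RESULT";
interface LaceEvent {
  type: LaceEventType;
}

class InMemoryThreadManager {
  private readonly eventsByThread = new Map<string, LaceEvent[]>();

  addEvent(threadId: string, event: LaceEvent): void {
    const events = this.eventsByThread.get(threadId) ?? [];
    events.push(event);
    this.eventsByThread.set(threadId, events);
  }

  getEvents(threadId: string): LaceEvent[] {
    return this.eventsByThread.get(threadId) ?? [];
  }
}

// Same polling approach as waitForEventCount in the removed file, typed against the stub.
function waitForEventCount(
  threadManager: InMemoryThreadManager,
  threadId: string,
  eventType: LaceEventType,
  count: number,
  timeoutMs = 5000,
): Promise<LaceEvent[]> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();
    const check = () => {
      // Re-read events on every poll so the check never sees stale data.
      const matching = threadManager.getEvents(threadId).filter((e) => e.type === eventType);
      if (matching.length >= count) {
        resolve(matching);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matching.length})`));
      } else {
        setTimeout(check, 10);
      }
    };
    check();
  });
}

// Events arrive at times the caller does not need to know in advance.
async function demo(): Promise<number> {
  const manager = new InMemoryThreadManager();
  setTimeout(() => manager.addEvent("t1", { type: "TOOL_RESULT" }), 20);
  setTimeout(() => manager.addEvent("t1", { type: "TOOL_RESULT" }), 40);
  const results = await waitForEventCount(manager, "t1", "TOOL_RESULT", 2, 1000);
  return results.length;
}
```

The caller waits on the count of `TOOL_RESULT` events rather than on a guessed delay, which is the point of the BEFORE/AFTER comparison in the removed file.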
@@ -1,114 +0,0 @@
1
- # Condition-Based Waiting
2
-
3
- ## Overview
4
-
5
- Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI.
6
-
7
- **Core principle:** Wait for the actual condition you care about, not a guess about how long it takes.
8
-
9
- ## When to Use
10
-
11
- ```mermaid
12
- flowchart TD
13
- A{"Test uses setTimeout/sleep?"} -->|yes| B{"Testing timing behavior?"}
14
- B -->|yes| C[Document WHY timeout needed]
15
- B -->|no| D[Use condition-based waiting]
16
- ```
17
-
18
- **Use when:**
19
-
20
- - Tests have arbitrary delays (`setTimeout`, `sleep`, `time.sleep()`)
21
- - Tests are flaky (pass sometimes, fail under load)
22
- - Tests time out when run in parallel
23
- - Waiting for async operations to complete
24
-
25
- **Don't use when:**
26
-
27
- - Testing actual timing behavior (debounce, throttle intervals)
28
- - Always document WHY if using arbitrary timeout
29
-
30
- ## Core Pattern
31
-
32
- ```typescript
33
- // ❌ BEFORE: Guessing at timing
34
- await new Promise((r) => setTimeout(r, 50));
35
- const result = getResult();
36
- expect(result).toBeDefined();
37
-
38
- // ✅ AFTER: Waiting for condition
39
- await waitFor(() => getResult() !== undefined);
40
- const result = getResult();
41
- expect(result).toBeDefined();
42
- ```
43
-
44
- ## Quick Patterns
45
-
46
- | Scenario | Pattern |
47
- | ----------------- | ---------------------------------------------------- |
48
- | Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
49
- | Wait for state | `waitFor(() => machine.state === 'ready')` |
50
- | Wait for count | `waitFor(() => items.length >= 5)` |
51
- | Wait for file | `waitFor(() => fs.existsSync(path))` |
52
- | Complex condition | `waitFor(() => obj.ready && obj.value > 10)` |
53
-
54
- ## Implementation
55
-
56
- Generic polling function:
57
-
58
- ```typescript
59
- async function waitFor<T>(
60
- condition: () => T | undefined | null | false,
61
- description: string,
62
- timeoutMs = 5000,
63
- ): Promise<T> {
64
- const startTime = Date.now();
65
-
66
- while (true) {
67
- const result = condition();
68
- if (result) return result;
69
-
70
- if (Date.now() - startTime > timeoutMs) {
71
- throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
72
- }
73
-
74
- await new Promise((r) => setTimeout(r, 10)); // Poll every 10ms
75
- }
76
- }
77
- ```
78
-
79
- See `condition-based-waiting-example.ts` in this directory for a complete implementation with domain-specific helpers (`waitForEvent`, `waitForEventCount`, `waitForEventMatch`) from an actual debugging session.
80
-
81
- ## Common Mistakes
82
-
83
- **❌ Polling too fast:** `setTimeout(check, 1)` - wastes CPU
84
- **✅ Fix:** Poll every 10ms
85
-
86
- **❌ No timeout:** Loop forever if condition never met
87
- **✅ Fix:** Always include timeout with clear error
88
-
89
- **❌ Stale data:** Cache state before loop
90
- **✅ Fix:** Call getter inside loop for fresh data
91
-
92
- ## When Arbitrary Timeout IS Correct
93
-
94
- ```typescript
95
- // Tool ticks every 100ms - need 2 ticks to verify partial output
96
- await waitForEvent(manager, "TOOL_STARTED"); // First: wait for condition
97
- await new Promise((r) => setTimeout(r, 200)); // Then: wait for timed behavior
98
- // 200ms = 2 ticks at 100ms intervals - documented and justified
99
- ```
100
-
101
- **Requirements:**
102
-
103
- 1. First wait for triggering condition
104
- 2. Based on known timing (not guessing)
105
- 3. Comment explaining WHY
106
-
107
- ## Real-World Impact
108
-
109
- From debugging session (2025-10-03):
110
-
111
- - Fixed 15 flaky tests across 3 files
112
- - Pass rate: 60% → 100%
113
- - Execution time: 40% faster
114
- - No more race conditions
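Note on the removed document above: the Core Pattern and Quick Patterns examples call `waitFor` with a single argument, while the generic implementation declares `description` as a required second parameter, so those calls would not type-check as written. One possible reconciliation, shown here only as a sketch, is to give `description` a default value. That default and the `quickPatternsDemo` function are assumptions and are not part of the removed file.

```typescript
async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description = "condition", // assumption: default added so one-argument calls compile
  timeoutMs = 5000,
): Promise<T> {
  const startTime = Date.now();

  while (true) {
    // Evaluate the condition inside the loop so each poll sees fresh state.
    const result = condition();
    if (result) return result;

    if (Date.now() - startTime > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }

    await new Promise((r) => setTimeout(r, 10)); // Poll every 10ms
  }
}

interface DemoEvent {
  type: string;
}

// Hypothetical usage covering two rows of the Quick Patterns table.
async function quickPatternsDemo(): Promise<DemoEvent> {
  const events: DemoEvent[] = [];
  setTimeout(() => events.push({ type: "DONE" }), 25);

  // "Wait for count": one-argument form, relying on the default description.
  await waitFor(() => events.length >= 1);

  // "Wait for event": the resolved value is the matching event itself.
  return waitFor(() => events.find((e) => e.type === "DONE"), "DONE event", 1000);
}
```

Passing an explicit description remains preferable in real tests, because it is the only context the timeout error carries.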