@simplysm/claude 13.0.26 → 13.0.27

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,143 +1,205 @@
  ---
  name: sd-check
- description: Verify code via typecheck, lint, and tests
- argument-hint: "[path]"
- model: opus
+ description: Use when verifying code quality via typecheck, lint, and tests - before deployment, PR creation, after code changes, or when type errors, lint violations, or test failures are suspected. Applies to whole project or specific paths.
  ---

- ## Usage
+ # sd-check

- - `/sd-check` — verify the entire project
- - `/sd-check packages/core-common` — verify a specific path only
+ Verify code quality through parallel execution of typecheck, lint, and test checks.

- If an argument is provided, run against that path. Otherwise, run against the entire project.
+ ## Overview

- ## Environment Pre-check
+ **This skill provides EXACT STEPS you MUST follow - it is NOT a command to invoke.**

- Before running any verification, confirm the project environment is properly set up.
- Run these checks **in parallel** and report results before proceeding.
+ **Foundational Principle:** Violating the letter of these steps is violating the spirit of verification.

- ### 1. Root package.json version
+ When the user asks to verify code, YOU will manually execute **EXACTLY THESE 4 STEPS** (no more, no less):

- Read the root `package.json` and check the `version` field.
- The major version must be `13` (e.g., `13.x.x`). If the major version is not `13`, stop and report:
+ **Step 1:** Environment Pre-check (4 checks in parallel)
+ **Step 2:** Launch 3 haiku agents in parallel (typecheck, lint, test ONLY)
+ **Step 3:** Collect results, fix errors in priority order
+ **Step 4:** Re-verify (go back to Step 2) until all pass

- > "This skill requires simplysm v13. Current version: {version}"
+ **Core principle:** Always re-run ALL checks after any fix - changes can cascade.

- ### 2. pnpm workspace
+ **CRITICAL:**
+ - This skill verifies ONLY typecheck, lint, and test
+ - **NO BUILD. NO DEV SERVER. NO TEAMS. NO TASK LISTS.**
+ - Do NOT create your own "better" workflow - follow these 4 steps EXACTLY

- Verify this is a pnpm project:
+ ## Usage

- ```
- ls pnpm-workspace.yaml pnpm-lock.yaml
- ```
+ - `/sd-check` — verify entire project
+ - `/sd-check packages/core-common` — verify specific path only

- Both files must exist. If missing, stop and report to the user.
+ **Default:** If no path argument provided, verify entire project.

- ### 3. package.json scripts
+ ## Quick Reference

- Read the root `package.json` and confirm these scripts are defined:
+ | Check | Command | Agent Model | Purpose |
+ |-------|---------|-------------|---------|
+ | Typecheck | `pnpm typecheck [path]` | haiku | Type errors |
+ | Lint | `pnpm lint --fix [path]` | haiku | Code quality |
+ | Test | `pnpm vitest [path] --run` | haiku | Functionality |

- - `typecheck`
- - `lint`
+ **All 3 run in PARALLEL** (separate haiku agents, single message)

- If either is missing, stop and report to the user.
+ ## Workflow

- ### 4. Vitest config
+ ### Step 1: Environment Pre-check

- Verify vitest is configured:
+ Before ANY verification, confirm environment setup with these checks **in parallel**:

- ```
- ls vitest.config.ts
- ```
+ 1. **Root package.json version** - Read `package.json`, verify major version is `13` (e.g., `13.x.x`)
+    - If not 13: STOP, report "This skill requires simplysm v13. Current: {version}"

- If missing, stop and report to the user.
+ 2. **pnpm workspace** - Verify `pnpm-workspace.yaml` and `pnpm-lock.yaml` exist
+    - Command: `ls pnpm-workspace.yaml pnpm-lock.yaml`
+    - If missing: STOP, report to user

- ---
-
- If all pre-checks pass, report "Environment OK" and proceed to code verification.
+ 3. **package.json scripts** - Read root `package.json`, confirm `typecheck` and `lint` scripts defined
+    - If missing: STOP, report to user

- ## Code Verification
+ 4. **Vitest config** - Verify `vitest.config.ts` exists
+    - Command: `ls vitest.config.ts`
+    - If missing: STOP, report to user

- Run verification checks using haiku agents for command execution, then analyze and fix errors.
- Repeat until all checks pass.
+ **If all pass:** Report "Environment OK", proceed to Step 2.
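Taken together, Step 1 boils down to four file and field inspections. A minimal POSIX-shell sketch of the same checks (the file names and the v13 requirement come from the list above; the `sed`/`grep` JSON probing and the `sd_precheck` name are illustrative assumptions, not part of the skill):

```sh
# Sketch of the four environment pre-checks as a single shell function.
# JSON fields are pulled out with sed/grep for illustration only.
sd_precheck() {
  # 1. Root package.json major version must be 13
  version=$(sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' package.json)
  case "$version" in
    13.*) ;;
    *) echo "This skill requires simplysm v13. Current: $version"; return 1 ;;
  esac
  # 2. pnpm workspace files must exist
  [ -f pnpm-workspace.yaml ] && [ -f pnpm-lock.yaml ] ||
    { echo "Missing pnpm-workspace.yaml or pnpm-lock.yaml"; return 1; }
  # 3. typecheck and lint scripts must be defined in the root package.json
  grep -q '"typecheck"' package.json && grep -q '"lint"' package.json ||
    { echo "Missing typecheck/lint scripts"; return 1; }
  # 4. vitest config must exist
  [ -f vitest.config.ts ] || { echo "Missing vitest.config.ts"; return 1; }
  echo "Environment OK"
}
```

A real implementation would parse `package.json` with a proper JSON reader; the grep-based script check in particular only confirms that the strings appear somewhere in the file.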

- ### Step 1: Launch Verification Agents (Parallel)
+ ### Step 2: Launch 3 Haiku Agents in Parallel

- Launch 3 haiku agents in parallel using the Task tool.
+ Launch ALL 3 agents in a **single message** using Task tool.

- **Important**: Replace `[path]` in the commands below with the actual path argument provided by the user. If no argument was provided, omit the path (runs on entire project).
+ **Replace `[path]` with user's argument, or OMIT if no argument (defaults to full project).**

  **Agent 1 - Typecheck:**
  ```
- Task tool with:
+ Task tool:
  subagent_type: Bash
  model: haiku
  description: "Run typecheck"
- prompt: "Run `pnpm typecheck [path]` and return the full output. Do NOT analyze or fix errors - just report the raw output."
+ prompt: "Run `pnpm typecheck [path]` and return full output. Do NOT analyze or fix - just report raw output."
  ```

  **Agent 2 - Lint:**
  ```
- Task tool with:
+ Task tool:
  subagent_type: Bash
  model: haiku
  description: "Run lint with auto-fix"
- prompt: "Run `pnpm lint --fix [path]` and return the full output. Do NOT analyze or fix errors - just report the raw output."
+ prompt: "Run `pnpm lint --fix [path]` and return full output. Do NOT analyze or fix - just report raw output."
  ```

  **Agent 3 - Test:**
  ```
- Task tool with:
+ Task tool:
  subagent_type: Bash
  model: haiku
  description: "Run tests"
- prompt: "Run `pnpm vitest [path] --run` and return the full output. Do NOT analyze or fix errors - just report the raw output."
+ prompt: "Run `pnpm vitest [path] --run` and return full output. Do NOT analyze or fix - just report raw output."
  ```

- ### Step 2: Collect Results and Fix Errors
+ ### Step 3: Collect Results and Fix Errors
+
+ Wait for ALL 3 agents. Collect outputs.
+
+ **If all checks passed:** Complete (see Completion Criteria).
+
+ **If any errors found:**

- Wait for all 3 agents to complete. Collect their outputs.
+ 1. **Analyze by priority:** Typecheck → Lint → Test
+    - Typecheck errors may cause lint/test errors (cascade)

- If any errors are found:
+ 2. **Read failing files** to identify root cause

- 1. **Analyze errors by priority**: typecheck → lint → test
-    - Typecheck errors may cause lint/test errors, so fix them first
- 2. **Read failing files** to identify root causes
- 3. **Fix with Edit**:
-    - Typecheck errors: Fix type issues
-    - Lint errors: Fix linting issues (most should be auto-fixed by `--fix`)
-    - Test failures:
-      - Run `git diff` to check for intentional code changes
-      - If intentional changes not reflected in tests: Update test code
-      - If source code bug: Fix source code
- 4. Proceed to Step 3
+ 3. **Fix with Edit:**
+    - **Typecheck:** Fix type issues
+    - **Lint:** Fix code quality (most auto-fixed by `--fix`)
+    - **Test:**
+      - Run `git diff` to check intentional changes
+      - If changes not reflected in tests: Update test
+      - If source bug: Fix source
+    - **If root cause unclear OR 2-3 fix attempts failed:** Recommend `/sd-debug`

- If all checks passed: Proceed to Completion.
+ 4. **Proceed to Step 4**

- ### Step 3: Re-verify (Loop)
+ ### Step 4: Re-verify (Loop Until All Pass)

- Go back to Step 1 and launch the 3 haiku agents again.
- Repeat until all checks pass with no errors.
+ **CRITICAL:** After ANY fix, re-run ALL 3 checks.
+
+ Go back to Step 2 and launch 3 haiku agents again.
+
+ **Do NOT assume:** "I only fixed typecheck → skip lint/test". Fixes cascade.
+
+ Repeat Steps 2-4 until all 3 checks pass.
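The Steps 2-4 loop can be modeled as a shell function: launch every check before waiting on any, and collect every exit status before deciding what to fix. A sketch under the assumption that each check is an ordinary command (in the real workflow each is a separate haiku agent; `run_all_checks` is an illustrative name):

```sh
# run_all_checks launches every check in the background first (parallel),
# then waits for ALL of them and reports one combined status.
run_all_checks() {
  pids=""
  for cmd in "$@"; do
    sh -c "$cmd" &          # launch every check before waiting on any
    pids="$pids $!"
  done
  fail=0
  for pid in $pids; do      # collect ALL results before deciding what to fix
    wait "$pid" || fail=1
  done
  return "$fail"
}
# Loop shape of Steps 2-4 (fix_errors is a hypothetical placeholder for the
# priority-ordered fixes of Step 3):
#   until run_all_checks "pnpm typecheck" "pnpm lint --fix" "pnpm vitest --run"; do
#     fix_errors   # typecheck first, then lint, then test
#   done
#   echo "All checks passed - code verified"
```

Note that every iteration re-runs all three checks, never a subset.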

  ## Common Mistakes

- ### Running checks sequentially instead of parallel
- **Wrong**: Launch agent 1, wait, then agent 2, wait, then agent 3
- **Right**: Launch all 3 agents in a single message with multiple Task tool calls
+ ### ❌ Running checks sequentially
+ **Wrong:** Launch agent 1, wait → agent 2, wait → agent 3
+ **Right:** Launch ALL 3 in single message (parallel Task calls)
+
+ ### ❌ Fixing before collecting all results
+ **Wrong:** Agent 1 returns error → fix immediately → re-verify
+ **Right:** Wait for all 3 → collect all errors → fix in priority order → re-verify
+
+ ### ❌ Skipping re-verification after fixes
+ **Wrong:** Fix typecheck → assume lint/test still pass
+ **Right:** ALWAYS re-run all 3 checks after any fix
+
+ ### ❌ Using wrong model
+ **Wrong:** `model: opus` or `model: sonnet` for verification agents
+ **Right:** `model: haiku` (cheaper, faster for command execution)
+
+ ### ❌ Including build/dev steps
+ **Wrong:** Run `pnpm build` or `pnpm dev` as part of verification
+ **Right:** sd-check is ONLY typecheck, lint, test (no build, no dev)
+
+ ### ❌ Asking user for path
+ **Wrong:** No path provided → ask "which package?"
+ **Right:** No path → verify entire project (omit path in commands)
+
+ ### ❌ Infinite fix loop
+ **Wrong:** Keep trying same fix when tests fail repeatedly
+ **Right:** After 2-3 failed attempts → recommend `/sd-debug`
+
+ ## Red Flags - STOP and Follow Workflow

- ### Fixing before collecting all results
- ❌ **Wrong**: Agent 1 returns error → fix immediately → launch agents again
- ✅ **Right**: Wait for all 3 agents → collect all errors → fix in priority order → re-verify
+ If you find yourself doing ANY of these, you're violating the skill:

- ### Skipping re-verification after fixes
- ❌ **Wrong**: Fix typecheck error → assume lint/test still pass
- ✅ **Right**: Always re-run all 3 checks after any fix (fixes can introduce new errors)
+ - Treating sd-check as a command to invoke (`Skill: sd-check Args: ...`)
+ - Including build or dev server in verification
+ - Running agents sequentially instead of parallel
+ - Not re-verifying after every fix
+ - Asking user for path when none provided
+ - Continuing past 2-3 failed fix attempts without recommending `/sd-debug`
+ - Spawning 4+ agents (only 3: typecheck, lint, test)

- ### Using wrong model for agents
- ❌ **Wrong**: `model: opus` or `model: sonnet` for verification agents
- ✅ **Right**: `model: haiku` for command execution (cheaper, faster)
+ **All of these violate the skill's core principles. Go back to Step 1 and follow the workflow exactly.**

  ## Completion Criteria

- Complete when all 3 checks pass without errors.
+ **Complete when:**
+ - All 3 checks (typecheck, lint, test) pass without errors
+ - Report: "All checks passed - code verified"
+
+ **Do NOT complete if:**
+ - Any check has errors
+ - Haven't re-verified after a fix
+ - Environment pre-checks failed
+
+ ## Rationalization Table
+
+ | Excuse | Reality |
+ |--------|---------|
+ | "I'm following the spirit, not the letter" | Violating the letter IS violating the spirit - follow EXACTLY |
+ | "I'll create a better workflow with teams/tasks" | Follow the 4 steps EXACTLY - no teams, no task lists |
+ | "I'll split tests into multiple agents" | Only 3 agents total: typecheck, lint, test |
+ | "Stratified parallel is faster" | Run ALL 3 in parallel via separate agents - truly parallel |
+ | "I only fixed lint, typecheck still passes" | Always re-verify ALL - fixes can cascade |
+ | "Build is part of verification" | Build is deployment, not verification - NEVER include it |
+ | "Let me ask which path to check" | Default to full project - explicit behavior |
+ | "I'll try one more fix approach" | After 2-3 attempts → recommend /sd-debug |
+ | "Tests are independent of types" | Type fixes affect tests - always re-run ALL |
+ | "I'll invoke sd-check skill with args" | sd-check is EXACT STEPS, not a command |
+ | "4 agents: typecheck, lint, test, build" | Only 3 agents - build is FORBIDDEN |
@@ -0,0 +1,129 @@
+ # Baseline Test Analysis - sd-check Skill
+
+ ## Summary
+
+ Tested 6 scenarios with agents WITHOUT the sd-check skill. All agents failed to follow optimal verification patterns.
+
+ ## Common Failures Across All Scenarios
+
+ ### 1. No Cost Optimization
+ **Failure:** All agents planned direct command execution instead of using haiku subagents.
+
+ **Observed in:** All scenarios (1-6)
+
+ **Impact:** Higher cost, no isolation
+
+ **What skill must prevent:** Skill must explicitly require haiku subagent usage
+
+ ### 2. Incomplete Parallelization
+ **Failure:** Agents either ran sequentially or only partially parallelized.
+
+ **Examples:**
+ - Scenario 1: Used `&` for typecheck/lint but ran tests sequentially ("stratified parallel")
+ - Scenario 2: No parallelization at all
+ - Scenario 3: Sequential fix → verify → fix → verify
+
+ **Impact:** Slower verification (60s → 120s+)
+
+ **What skill must prevent:** Skill must require ALL 3 checks (typecheck, lint, test) in parallel via 3 separate haiku agents
+
+ ### 3. Missing Environment Pre-checks
+ **Failure:** No systematic environment validation before running checks.
+
+ **Observed:**
+ - Scenario 1: Checked Docker for ORM tests, but not other prerequisites
+ - Scenario 6: Only checked pnpm-lock.yaml, missed package.json version, scripts, vitest.config.ts
+
+ **Impact:** Confusing errors if environment misconfigured
+
+ **What skill must prevent:** Skill must require 4 pre-checks (package.json v13, pnpm workspace, scripts, vitest config)
+
+ ### 4. Unclear Re-verification Loop
+ **Failure:** After fixing errors, no clear "re-run ALL checks" loop.
+
+ **Examples:**
+ - Scenario 3: Phase 1 verify → Phase 2 verify → Phase 3 verify (but no final "all phases" re-verify)
+ - Agents treated it as a linear progression, not a loop
+
+ **Impact:** Fixes in one area may break another (cascade errors)
+
+ **What skill must prevent:** Skill must explicitly state "re-run ALL 3 checks until ALL pass"
+
+ ### 5. No sd-debug Recommendation
+ **Failure:** When the root cause was unclear after multiple attempts, agents didn't recommend sd-debug.
+
+ **Observed:**
+ - Scenario 4: After 4 failed attempts, agent suggested various debugging approaches but NOT the `/sd-debug` skill
+
+ **Impact:** User wastes time when systematic root-cause investigation is needed
+
+ **What skill must prevent:** Skill must state "after 2-3 failed fix attempts → recommend /sd-debug"
+
+ ### 6. Incorrect Default Behavior
+ **Failure:** When no path argument was provided, agents asked the user for clarification instead of defaulting to the full project.
+
+ **Observed:**
+ - Scenario 5: Agent wanted to ask "which package?" instead of running on the entire project
+
+ **Impact:** Unnecessary user friction
+
+ **What skill must prevent:** Skill must state "if no path argument → run on entire project (omit path in commands)"
+
+ ### 7. Scope Creep (Unnecessary Steps)
+ **Failure:** Agents included steps not relevant to "verification".
+
+ **Examples:**
+ - Scenario 1: Included `pnpm build` (verification doesn't need build)
+ - Scenario 2: Included dev server test (not verification)
+
+ **Impact:** Wasted time, confusion about scope
+
+ **What skill must prevent:** Skill must clarify scope: typecheck, lint, test ONLY (no build, no dev)
+
+ ## Rationalization Patterns (Verbatim)
+
+ ### "Parallelization while maintaining logical dependencies"
+ - Used to justify partial parallelization
+ - Agents ran typecheck & lint in parallel, but tests sequentially
+ - **Counter:** ALL 3 checks are independent → run all 3 in parallel
+
+ ### "Stratified parallel execution"
+ - Used to justify sequential test runs grouped by environment
+ - **Counter:** Vitest projects are independent → run all via single command
+
+ ### "Faster to fail fast on static checks"
+ - Good principle, but used to justify including build step
+ - **Counter:** Build is not a static check, and not required for verification
+
+ ### "Type safety first" / "Incremental verification"
+ - Used to justify Phase 1 → Phase 2 → Phase 3 linear progression
+ - **Counter:** After fixes, must re-verify ALL phases (loop), not just next phase
+
+ ### "Understanding first, then ONE comprehensive fix"
+ - Used to justify continued debugging without tools
+ - **Counter:** After 2-3 attempts, recommend /sd-debug for systematic investigation
+
+ ### "Ask for clarification" / "Explicit and predictable"
+ - Used to justify asking the user for a path when none was provided
+ - **Counter:** Defaulting to the full project is explicit and predictable behavior
+
+ ## Success Criteria for Skill
+
+ The skill is effective if agents:
+ 1. ✅ Launch 3 haiku agents in parallel (typecheck, lint, test)
+ 2. ✅ Run environment pre-checks before verification
+ 3. ✅ Default to full project when no path argument
+ 4. ✅ Fix errors in priority order (typecheck → lint → test)
+ 5. ✅ Re-run ALL 3 checks after any fix (loop until all pass)
+ 6. ✅ Recommend /sd-debug after 2-3 failed fix attempts
+ 7. ✅ Do NOT include build or dev server steps
+
+ ## Test Scenarios for GREEN Phase
+
+ After writing the skill, re-run scenarios 1-6. Agents should then exhibit the correct behaviors listed above.
+
+ Focus on:
+ - Scenario 1: Verify parallel haiku agents + env checks
+ - Scenario 3: Verify re-verification loop + priority
+ - Scenario 4: Verify sd-debug recommendation
+ - Scenario 5: Verify default to full project
@@ -0,0 +1,172 @@
+ # sd-check Pressure Test Scenarios
+
+ ## Scenario 1: Basic Application - Full Project Check (Time Pressure)
+
+ **Setup:**
+ - Simulated project with typecheck, lint, test configured
+ - No existing errors
+
+ **Pressure:**
+ - Time constraint: "Need results quickly for deployment"
+
+ **Agent Prompt:**
+ ```
+ I need to verify the entire simplysm project before deployment. Can you run all checks? We need to deploy soon, so please be fast.
+ ```
+
+ **Expected Baseline Failures (without skill):**
+ - May run checks sequentially instead of in parallel (slower)
+ - May skip environment pre-checks
+ - May not use haiku model (more expensive)
+
+ **Success Criteria (with skill):**
+ - Runs environment pre-checks first
+ - Launches 3 haiku agents in parallel
+ - Reports results correctly
+
+ ---
+
+ ## Scenario 2: Variation - Specific Path Check (Complex Path)
+
+ **Setup:**
+ - Project with multiple packages
+ - Target path: `packages/solid-demo`
+
+ **Pressure:**
+ - Complex path with potential typos
+ - User expects path to be handled correctly
+
+ **Agent Prompt:**
+ ```
+ Can you verify just the packages/solid-demo directory? I only changed files there.
+ ```
+
+ **Expected Baseline Failures:**
+ - May forget to pass path argument to commands
+ - May run full project check instead
+ - May incorrectly format path in commands
+
+ **Success Criteria:**
+ - Correctly passes `packages/solid-demo` to all 3 commands
+ - Only reports errors from that path
+
+ ---
+
+ ## Scenario 3: Edge Case - Typecheck Errors (Fix Priority)
+
+ **Setup:**
+ - Simulated project with typecheck errors that cascade to lint/test
+
+ **Pressure:**
+ - Multiple failing checks (frustration)
+ - Desire to "just make it work"
+
+ **Agent Prompt:**
+ ```
+ Please verify the project. (Note: project has typecheck errors that cause lint and test failures)
+ ```
+
+ **Expected Baseline Failures:**
+ - May fix lint or test errors first (wrong priority)
+ - May not understand cascade relationship
+ - May fix all errors simultaneously without priority
+
+ **Success Criteria:**
+ - Fixes typecheck errors first
+ - Recognizes cascade relationship
+ - Re-verifies after each fix round
+
+ ---
+
+ ## Scenario 4: Edge Case - Repeated Failures (Loop Exit)
+
+ **Setup:**
+ - Simulated project with obscure test failure
+ - Root cause is unclear
+
+ **Pressure:**
+ - Repeated verification failures (fatigue)
+ - Temptation to give up or skip
+
+ **Agent Prompt:**
+ ```
+ Verify the project. (Note: test failures persist after 2-3 fix attempts)
+ ```
+
+ **Expected Baseline Failures:**
+ - May keep trying same fix repeatedly (infinite loop)
+ - May skip re-verification to "save time"
+ - May not recommend sd-debug
+
+ **Success Criteria:**
+ - After 2-3 failed attempts, recommends `/sd-debug`
+ - Does not enter infinite loop
+ - Always re-verifies after fixes
+
+ ---
+
+ ## Scenario 5: Missing Information Test - No Path Argument
+
+ **Setup:**
+ - Standard project setup
+
+ **Pressure:**
+ - Ambiguous user request
+
+ **Agent Prompt:**
+ ```
+ Run sd-check.
+ ```
+
+ **Expected Baseline Failures:**
+ - May ask user for path (skill should default to full project)
+ - May incorrectly assume a path
+
+ **Success Criteria:**
+ - Runs on entire project (no path argument)
+ - Does not ask user for clarification
+
+ ---
+
+ ## Scenario 6: Missing Information Test - Invalid Environment
+
+ **Setup:**
+ - Project missing pnpm-lock.yaml or vitest.config.ts
+
+ **Pressure:**
+ - User expects check to work
+
+ **Agent Prompt:**
+ ```
+ Please run sd-check on the project.
+ ```
+
+ **Expected Baseline Failures:**
+ - May proceed without environment checks
+ - May report confusing errors from missing dependencies
+
+ **Success Criteria:**
+ - Runs environment pre-checks
+ - Stops with clear error message if environment invalid
+ - Reports which specific check failed
+
+ ---
+
+ ## Testing Methodology
+
+ ### RED Phase (Current)
+ 1. Run each scenario WITHOUT sd-check skill loaded
+ 2. Document exact agent behavior verbatim
+ 3. Record rationalizations used
+ 4. Identify patterns in failures
+
+ ### GREEN Phase
+ 1. Write skill addressing specific baseline failures
+ 2. Run same scenarios WITH skill
+ 3. Verify compliance
+
+ ### REFACTOR Phase
+ 1. Identify new rationalizations from GREEN testing
+ 2. Add explicit counters
+ 3. Build rationalization table
+ 4. Re-test until bulletproof