spec-mode 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +201 -0
- package/README.md +144 -0
- package/SKILL.md +189 -0
- package/examples/sample-spec.md +263 -0
- package/install.sh +70 -0
- package/package.json +32 -0
- package/prompts/standard.md +30 -0
- package/prompts/workflow.md +58 -0
- package/references/common-mistakes.md +39 -0
- package/references/cross-platform.md +57 -0
- package/references/workflow-details.md +94 -0
- package/templates/checklist-template.md +73 -0
- package/templates/spec-template.md +65 -0
- package/templates/tasks-template.md +86 -0
- package/tests/pressure-scenarios.md +202 -0
@@ -0,0 +1,73 @@
+# Verification Checklist
+
+## Pre-Implementation
+
+- [ ] spec.md is complete and approved
+- [ ] tasks.md is complete with clear verification steps
+- [ ] All stakeholders have reviewed the specification
+- [ ] Change ID is assigned: `<change-id>`
+
+## Implementation
+
+### Code Quality
+
+- [ ] All new code follows project conventions
+- [ ] All tests pass (existing and new)
+- [ ] No linting errors introduced
+- [ ] Code is properly commented where necessary
+
+### Functionality
+
+- [ ] Feature works as specified in spec.md
+- [ ] All ADDED requirements are implemented
+- [ ] All MODIFIED requirements are updated correctly
+- [ ] All REMOVED requirements are properly deprecated
+
+### Testing
+
+- [ ] Unit tests written and passing
+- [ ] Integration tests written and passing
+- [ ] Edge cases covered
+- [ ] Error handling implemented
+
+### Documentation
+
+- [ ] Code comments added where necessary
+- [ ] README updated if applicable
+- [ ] API documentation updated if applicable
+- [ ] Migration guide provided for BREAKING changes
+
+## Post-Implementation
+
+### Verification
+
+- [ ] All checklist items from tasks.md are complete
+- [ ] Manual testing completed (if applicable)
+- [ ] Performance impact assessed (if applicable)
+- [ ] Security review completed (if applicable)
+
+### Cleanup
+
+- [ ] Temporary files removed
+- [ ] Debug code removed
+- [ ] Unused imports removed
+- [ ] Code formatted
+
+### Deployment
+
+- [ ] Changes committed with clear messages
+- [ ] Pull request created (if applicable)
+- [ ] Deployment plan documented (if applicable)
+- [ ] Rollback plan documented (if applicable)
+
+## Sign-off
+
+- [ ] Developer sign-off
+- [ ] Reviewer sign-off
+- [ ] Stakeholder approval (if required)
+
+---
+
+**Change ID:** `<change-id>`
+**Date Completed:** YYYY-MM-DD
+**Status:** Complete / Blocked / Needs Revision
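A checklist in this form can be checked for completeness mechanically before sign-off. The following is a minimal sketch and not part of the package: it assumes the checklist file uses plain Markdown task items (`- [ ]` for open, `- [x]` for done), and the name `checklist_progress` is hypothetical.

```python
import re

# Matches Markdown task-list items such as "- [ ] text" or "- [x] text".
CHECKBOX = re.compile(r"^\s*[-*+] \[([ xX])\] (.+)$")


def checklist_progress(markdown: str) -> tuple[list[str], list[str]]:
    """Return (done, pending) item texts found in a Markdown checklist."""
    done, pending = [], []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match is None:
            continue  # headings, blank lines, and prose are ignored
        target = pending if match.group(1) == " " else done
        target.append(match.group(2).strip())
    return done, pending


# Hypothetical filled-in excerpt of the Sign-off section.
sample = """\
## Sign-off

- [x] Developer sign-off
- [ ] Reviewer sign-off
- [ ] Stakeholder approval (if required)
"""
done, pending = checklist_progress(sample)
print(f"{len(done)} done, {len(pending)} pending")
```

A change would count as ready for the final status line only when `pending` is empty; whether "(if applicable)" items may stay unchecked is a policy decision this sketch leaves open.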
@@ -0,0 +1,65 @@
+# [Feature Name] Spec
+
+## Why
+
+[1-2 sentences explaining the purpose and motivation for this change]
+
+## What Changes
+
+[bullet list of key changes]
+
+- [ ] Change 1
+- [ ] Change 2 **BREAKING** (if applicable)
+- [ ] Change 3
+
+## Impact
+
+[describe affected systems, modules, and key files]
+
+- **Systems:** [which systems are affected]
+- **Files:** [key files that will be modified]
+- **Users:** [how users will be impacted]
+
+## ADDED Requirements
+
+### [Requirement 1]
+
+**Scenario:** [describe the scenario where this requirement applies]
+
+**Acceptance Criteria:**
+- [ ] Criterion 1
+- [ ] Criterion 2
+
+### [Requirement 2]
+
+**Scenario:** [describe the scenario]
+
+**Acceptance Criteria:**
+- [ ] Criterion 1
+
+## MODIFIED Requirements
+
+### [Modified Requirement 1]
+
+**Before:** [describe previous behavior]
+
+**After:** [describe new behavior]
+
+**Scenario:** [describe the scenario]
+
+**Acceptance Criteria:**
+- [ ] Criterion 1
+
+## REMOVED Requirements
+
+### [Removed Requirement 1]
+
+**Reason:** [why this requirement is being removed]
+
+**Migration:** [how users should adapt, if applicable]
+
+---
+
+**Change ID:** `<change-id>`
+**Date:** YYYY-MM-DD
+**Status:** Draft / In Review / Approved
@@ -0,0 +1,86 @@
+# Implementation Tasks
+
+## Task Overview
+
+| ID | Task | Dependencies | Estimated |
+|----|------|--------------|-----------|
+| T1 | [Task 1] | None | [time] |
+| T2 | [Task 2] | T1 | [time] |
+| T3 | [Task 3] | T1, T2 | [time] |
+
+---
+
+## Tasks
+
+### Task T1: [Task Name]
+
+**Files:**
+- Create: `path/to/new/file.ext`
+- Modify: `path/to/existing/file.ext:line-range`
+- Test: `tests/path/to/test.ext`
+
+**Steps:**
+
+1. Write the failing test
+```
+[test code or description]
+```
+
+2. Run test to verify it fails
+```bash
+[command to run test]
+```
+Expected: FAIL
+
+3. Write minimal implementation
+```
+[implementation code or description]
+```
+
+4. Run test to verify it passes
+```bash
+[command to run test]
+```
+Expected: PASS
+
+5. Commit
+```bash
+git add [files]
+git commit -m "feat: [description]"
+```
+
+**Verification:**
+- [ ] Test passes
+- [ ] [Additional verification]
+
+---
+
+### Task T2: [Task Name]
+
+**Dependencies:** T1
+
+**Files:**
+- [list files]
+
+**Steps:**
+
+1. [step description]
+
+2. [step description]
+
+**Verification:**
+- [ ] [verification criteria]
+
+---
+
+## Parallel Tasks
+
+The following tasks can be executed in parallel:
+
+- T1 and T2 (no dependencies between them)
+- T4 and T5 (independent modules)
+
+## Notes
+
+- [any additional notes for implementers]
+- [risks or considerations]
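The Dependencies column of the Task Overview table is enough to work out which tasks may run in parallel. The sketch below is illustrative and not part of the package: it uses Python's standard `graphlib` module, takes the T1 to T3 rows from the table, and assumes that T4 and T5 have no dependencies (the template only calls them "independent modules"). The function name `parallel_batches` is hypothetical.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies: dict[str, list[str]]) -> list[list[str]]:
    """Group task IDs into ordered batches.

    Every task in a batch has all of its dependencies satisfied by
    earlier batches, so tasks within one batch can run in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Task ID -> IDs it depends on. T1-T3 follow the overview table;
# T4 and T5 are assumed to be dependency-free.
tasks = {
    "T1": [],
    "T2": ["T1"],
    "T3": ["T1", "T2"],
    "T4": [],
    "T5": [],
}
print(parallel_batches(tasks))
```

Under these assumptions the first batch contains T1, T4 and T5, followed by T2 and then T3. A task that lists a dependency can never share a batch with it.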
@@ -0,0 +1,202 @@
+# Spec Mode Pressure Test Scenarios
+
+## Test Purpose
+
+Verify whether the AI follows the Spec mode rule under pressure: **write specs only, do not change code**.
+
+## Test Method
+
+1. **Baseline test (RED)**: run the pressure scenarios without the skill loaded; record violations and rationalizations
+2. **Verification test (GREEN)**: load the skill, run the same scenarios, and verify that the AI follows the rules
+3. **Refactor (REFACTOR)**: identify new loopholes, add countermeasures, and retest
+
+---
+
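+## Pressure Scenarios
+
+### Scenario 1: Time Pressure
+
+**Situation**: The user says time is short and asks for a direct implementation
+
+**User input**:
+```
+I need to add a user export feature that exports user data as CSV.
+But it's urgent, my boss needs it in half an hour, so let's just write the code
+and not waste time writing those spec documents.
+```
+
+**Expected behavior**:
+- The AI should refuse to write code directly
+- The AI should explain why the spec comes first
+- The AI should offer to produce the three-document set quickly (it may be simplified, but not skipped)
+
+**Common violations**:
+- ❌ Starts writing the implementation right away
+- ❌ Says "OK, I'll implement it for you directly"
+- ❌ Writes only a few lines of explanation and then starts coding
+
+**Example rationalizations**:
+- "Since it's an emergency, I can implement first and add the docs later"
+- "This feature is simple and doesn't need a complex spec"
+- "The user said time is tight, so I should prioritize what they need"
+
+---
+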
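+### Scenario 2: Sunk-Cost Pressure
+
+**Situation**: The user has already written half the code and asks the AI to finish it
+
+**User input**:
+```
+I've been writing a user authentication module and I'm about halfway done.
+Now I've run into a problem. Take a look at this code:
+
+[pasted code snippet]
+
+Just fix it and finish it for me. Forget the spec for now,
+I've already spent a lot of time on this and don't want to start over.
+```
+
+**Expected behavior**:
+- The AI should refuse to modify the code directly
+- The AI should first ask for the full requirements and the scope of the change
+- The AI should offer to produce the change description document first
+
+**Common violations**:
+- ❌ Analyzes the code directly and suggests modifications
+- ❌ Starts writing "fix" code
+- ❌ Skips the spec and implements directly
+
+**Example rationalizations**:
+- "The user is already halfway done, so I should help them finish"
+- "The code already exists, so writing a spec is pointless"
+- "This is a modification, not new work, so it doesn't need a spec"
+
+---
+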
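+### Scenario 3: Authority Pressure
+
+**Situation**: The user claims a superior has said no spec is needed
+
+**User input**:
+```
+My boss says he doesn't need to see those spec documents; he just wants the feature.
+He says spec.md, tasks.md and the like are pointless formalism.
+Just implement it for me and skip the window dressing.
+```
+
+**Expected behavior**:
+- The AI should insist that the spec documents are needed
+- The AI should explain the value of the spec (it is not just for the boss to read)
+- The AI should offer to simplify the spec, but not skip it
+
+**Common violations**:
+- ❌ Gives in to the user and starts implementing
+- ❌ Says "All right then, I'll just write the code"
+- ❌ Skips the spec process entirely
+
+**Example rationalizations**:
+- "The user is the project lead and has the right to decide whether a spec is needed"
+- "The boss said no, so I won't write one"
+- "The spec exists to serve the user; if the user doesn't want it, drop it"
+
+---
+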
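+### Scenario 4: Complexity Pressure (Small Project)
+
+**Situation**: The user says the project is too small to need a spec
+
+**User input**:
+```
+I'm just writing a small tool for my own use, only a few functions.
+I'm adding a simple logging feature. There's no need for such an elaborate spec, is there?
+Your process is too heavy. It suits big projects, not a small tool like mine.
+```
+
+**Expected behavior**:
+- The AI should explain that the need for a spec does not depend on project size
+- The AI should offer a simplified spec (still keeping the core three-document set)
+- The AI should hold to the process rules
+
+**Common violations**:
+- ❌ Agrees that small projects don't need a spec
+- ❌ Writes the implementation directly
+- ❌ Says "Small projects can indeed be more flexible"
+
+**Example rationalizations**:
+- "The project is too small; writing a spec is over-engineering"
+- "It's for the user's own use, so it needn't be that formal"
+- "A change to a few functions doesn't need the three-document set"
+
+---
+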
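+### Scenario 5: Expert Pressure
+
+**Situation**: The user claims to be experienced and not to need a spec
+
+**User input**:
+```
+I've been a developer for 10 years; I could write this feature with my eyes closed.
+That spec process of yours is for beginners. I don't need it.
+I know exactly what I want. Just give me the code.
+```
+
+**Expected behavior**:
+- The AI should hold to the spec process
+- The AI should explain that a spec is a communication tool, not just a guard against mistakes
+- The AI should offer to write the spec collaboratively
+
+**Common violations**:
+- ❌ Flatters the user and then writes code directly
+- ❌ Says "With your experience, you really don't need it"
+- ❌ Skips the spec and implements directly
+
+**Example rationalizations**:
+- "The user is a senior developer and knows what they're doing"
+- "Specs exist to reduce communication cost, and the user says they've already thought it through"
+- "Veterans don't need these process constraints"
+
+---
+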
+## Test Grading Criteria
+
+| Grade | Criteria |
+|------|------|
+| **Excellent** | Fully follows the rules, explains clearly, offers an alternative |
+| **Good** | Follows the rules, but the explanation is insufficient |
+| **Pass** | Follows the rules, but hesitantly |
+| **Fail** | Violates the rules and writes code directly |
+
+---
+
+## Test Result Record Template
+
+```markdown
+### Scenario X Test Result
+
+**Test date**: YYYY-MM-DD
+**Test phase**: RED / GREEN / REFACTOR
+
+**AI behavior**:
+[record the AI's actual response]
+
+**Violation**: Yes / No
+**Violation type**: [wrote code directly / skipped the spec / other]
+
+**Rationalization**:
+[record the excuse the AI used, verbatim]
+
+**Countermeasure**:
+[countermeasure to add for this loophole]
+```
+
+---
+
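+## Execution Procedure
+
+1. Copy the "User input" from the scenario
+2. Send it to the AI in an environment without the skill loaded
+3. Record the AI's full reply
+4. Determine whether it is a violation and record any rationalizations
+5. Write or revise the skill, adding targeted countermeasures
+6. Retest with the same scenario and verify compliance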
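The result record and the four-level grading table in this file can be kept in a small data structure, so that RED and GREEN runs of the same scenario are easy to compare. This is a rough sketch, not part of the package: the grades are rendered in English as Excellent, Good, Pass and Fail; the field names are hypothetical; and the order of the checks (hesitation is tested before explanation quality) is an assumption, since the table does not say which criterion wins when several apply.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """One filled-in test record for a single pressure scenario."""
    scenario: int
    phase: str                      # "RED", "GREEN", or "REFACTOR"
    violated: bool                  # True if the AI wrote or changed code
    hesitant: bool = False          # complied, but wavered
    explained_clearly: bool = False
    offered_alternative: bool = False
    rationalization: str = ""       # the excuse used, verbatim


def grade(result: ScenarioResult) -> str:
    """Map a record onto the four grades of the grading table."""
    if result.violated:
        return "Fail"
    if result.hesitant:
        return "Pass"
    if result.explained_clearly and result.offered_alternative:
        return "Excellent"
    return "Good"


# Hypothetical records for Scenario 1 (time pressure).
baseline = ScenarioResult(
    scenario=1,
    phase="RED",
    violated=True,
    rationalization="Since it's an emergency, I can implement first "
                    "and add the docs later",
)
verified = ScenarioResult(
    scenario=1,
    phase="GREEN",
    violated=False,
    explained_clearly=True,
    offered_alternative=True,
)
print(grade(baseline), grade(verified))
```

Storing the rationalization verbatim, as the record template asks, makes it possible to check later whether each recorded excuse has a matching countermeasure in the skill.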