super-opencode 1.1.0 → 1.1.2
- package/.opencode/agents/architect.md +84 -84
- package/.opencode/agents/backend.md +124 -124
- package/.opencode/agents/frontend.md +137 -137
- package/.opencode/agents/optimizer.md +51 -51
- package/.opencode/agents/pm-agent.md +105 -105
- package/.opencode/agents/quality.md +107 -107
- package/.opencode/agents/researcher.md +105 -105
- package/.opencode/agents/reviewer.md +80 -80
- package/.opencode/agents/security.md +107 -107
- package/.opencode/agents/writer.md +136 -136
- package/.opencode/commands/soc-analyze.md +136 -137
- package/.opencode/commands/soc-brainstorm.md +109 -110
- package/.opencode/commands/soc-cleanup.md +107 -107
- package/.opencode/commands/soc-design.md +0 -1
- package/.opencode/commands/soc-explain.md +113 -113
- package/.opencode/commands/soc-git.md +104 -104
- package/.opencode/commands/soc-help.md +94 -94
- package/.opencode/commands/soc-implement.md +112 -112
- package/.opencode/commands/soc-improve.md +105 -105
- package/.opencode/commands/soc-pm.md +99 -99
- package/.opencode/commands/soc-research.md +105 -105
- package/.opencode/commands/soc-review.md +102 -102
- package/.opencode/commands/soc-test.md +109 -109
- package/.opencode/commands/soc-workflow.md +97 -97
- package/.opencode/settings.json +3 -3
- package/.opencode/skills/confidence-check/SKILL.md +97 -97
- package/.opencode/skills/debug-protocol/SKILL.md +83 -83
- package/.opencode/skills/reflexion/SKILL.md +108 -108
- package/.opencode/skills/security-audit/SKILL.md +90 -90
- package/.opencode/skills/self-check/SKILL.md +95 -95
- package/.opencode/skills/simplification/SKILL.md +92 -92
- package/AGENTS.md +175 -175
- package/LICENSE +21 -21
- package/dist/cli.js +8 -5
- package/package.json +45 -45
package/.opencode/commands/soc-test.md
CHANGED
@@ -1,109 +1,109 @@
----
-description: Test generation and execution
----
-
-# /soc-test
-
-## 1. Command Overview
-The `/soc-test` command is the "Quality Gatekeeper." It generates, runs, and reports on tests. It ensures that no code is shipped without verification. It triggers the `quality` agent to write unit, integration, or E2E tests and uses the `reviewer` agent to analyze coverage.
-
-## 2. Triggers & Routing
-The command determines the scope and tool based on the arguments.
-
-| Trigger Scenario | Flag | Target Agent | Tool Used |
-| :--- | :--- | :--- | :--- |
-| **Unit Testing** | `--type unit` | `[quality]` | Jest/Vitest |
-| **Integration** | `--type integration` | `[quality]` | Supertest/Pytest |
-| **E2E Flows** | `--type e2e` | `[quality]` | Playwright/Cypress |
-| **Coverage Check**| `--coverage` | `[reviewer]` | Istanbul/C8 |
-
-## 3. Usage & Arguments
-```bash
-/soc-test [target] [flags]
-```
-
-### Arguments
-- **`[target]`**: File or directory to test (e.g., `src/utils/math.ts`).
-
-### Flags
-- **`--type [unit|integration|e2e]`**: (Optional) Default: `unit`.
-- **`--coverage`**: Generates a coverage report.
-- **`--watch`**: Runs in watch mode (interactive).
-
-## 4. Behavioral Flow (Orchestration)
-
-### Phase 1: Context & Gap Analysis
-1. **Read**: Analyzes the source code in `[target]`.
-2. **Scan**: Checks for existing `.test.ts` or `_test.py` files.
-3. **Plan**: Identifies missing test cases (Edge cases, Error states, Happy path).
-
-### Phase 2: Generation (The Writer)
-The command prompts the `quality` agent:
-> "Generate **[Type]** tests for **[Target]**.
-> Ensure **[Coverage]**.
-> Mock dependencies: **[External Services]**."
-
-### Phase 3: Execution (The Runner)
-1. **Run**: Executes the test command (e.g., `npm test`).
-2. **Report**: Captures `stdout/stderr`.
-3. **Reflect**: If tests fail, it suggests fixes (or fixes them if in `implement` mode).
-
-## 5. Output Guidelines (The Contract)
-
-### Test Execution Report
-```markdown
-## Test Report: [Target]
-
-### 1. Summary
-- **Status**: ❌ FAILED (2/10 failed)
-- **Time**: 1.2s
-
-### 2. Failures
-- **Test**: `should return 400 on invalid email`
-- **Expected**: `400`
-- **Received**: `500`
-- **Diagnosis**: Unhandled exception in controller.
-
-### 3. Coverage
-- **Line**: 85%
-- **Branch**: 70% (Missing: `if (!user)`)
-```
-
-## 6. Examples
-
-### A. Unit Test Generation
-```bash
-/soc-test src/utils/currency.ts --type unit
-```
-*Effect:* Generates `src/utils/currency.test.ts` covering rounding logic and currency codes.
-
-### B. Integration Test
-```bash
-/soc-test src/api/auth --type integration
-```
-*Effect:* Sets up a mock database and tests the `/login` flow including DB persistence.
-
-## 7. Dependencies & Capabilities
-
-### Agents
-- **Quality Assurance**: `@[.opencode/agents/quality.md]` - Primary agent for test generation.
-- **Reviewer**: `@[.opencode/agents/reviewer.md]` - For evaluating test coverage and effectiveness.
-
-### Skills
-- **Self Check**: `@[.opencode/skills/self-check/SKILL.md]` - Verifying test results against requirements.
-- **Debug Protocol**: `@[.opencode/skills/debug-protocol/SKILL.md]` - Analyzing test failures.
-
-### MCP Integration
-- **`run_command`**: Executing test runners (Jest, Vitest, Pytest).
-- **`filesystem`**: processing integration test artifacts or coverage reports.
-
-## 8. Boundaries
-
-**Will:**
-- Generate standard test files using the project's framework.
-- Run tests and parse output.
-- Mock external dependencies (S3, Stripe) to prevent flake.
-
-**Will Not:**
-- **Guarantee 100% Logic Coverage**: It covers code paths, not necessarily business intent.
-- **Debug Production**: Tests run in `test` environment only.
+---
+description: Test generation and execution
+---
+
+# /soc-test
+
+## 1. Command Overview
+The `/soc-test` command is the "Quality Gatekeeper." It generates, runs, and reports on tests. It ensures that no code is shipped without verification. It triggers the `quality` agent to write unit, integration, or E2E tests and uses the `reviewer` agent to analyze coverage.
+
+## 2. Triggers & Routing
+The command determines the scope and tool based on the arguments.
+
+| Trigger Scenario | Flag | Target Agent | Tool Used |
+| :--- | :--- | :--- | :--- |
+| **Unit Testing** | `--type unit` | `[quality]` | Jest/Vitest |
+| **Integration** | `--type integration` | `[quality]` | Supertest/Pytest |
+| **E2E Flows** | `--type e2e` | `[quality]` | Playwright/Cypress |
+| **Coverage Check**| `--coverage` | `[reviewer]` | Istanbul/C8 |
+
+## 3. Usage & Arguments
+```bash
+/soc-test [target] [flags]
+```
+
+### Arguments
+- **`[target]`**: File or directory to test (e.g., `src/utils/math.ts`).
+
+### Flags
+- **`--type [unit|integration|e2e]`**: (Optional) Default: `unit`.
+- **`--coverage`**: Generates a coverage report.
+- **`--watch`**: Runs in watch mode (interactive).
+
+## 4. Behavioral Flow (Orchestration)
+
+### Phase 1: Context & Gap Analysis
+1. **Read**: Analyzes the source code in `[target]`.
+2. **Scan**: Checks for existing `.test.ts` or `_test.py` files.
+3. **Plan**: Identifies missing test cases (Edge cases, Error states, Happy path).
+
+### Phase 2: Generation (The Writer)
+The command prompts the `quality` agent:
+> "Generate **[Type]** tests for **[Target]**.
+> Ensure **[Coverage]**.
+> Mock dependencies: **[External Services]**."
+
+### Phase 3: Execution (The Runner)
+1. **Run**: Executes the test command (e.g., `npm test`).
+2. **Report**: Captures `stdout/stderr`.
+3. **Reflect**: If tests fail, it suggests fixes (or fixes them if in `implement` mode).
+
+## 5. Output Guidelines (The Contract)
+
+### Test Execution Report
+```markdown
+## Test Report: [Target]
+
+### 1. Summary
+- **Status**: ❌ FAILED (2/10 failed)
+- **Time**: 1.2s
+
+### 2. Failures
+- **Test**: `should return 400 on invalid email`
+- **Expected**: `400`
+- **Received**: `500`
+- **Diagnosis**: Unhandled exception in controller.
+
+### 3. Coverage
+- **Line**: 85%
+- **Branch**: 70% (Missing: `if (!user)`)
+```
+
+## 6. Examples
+
+### A. Unit Test Generation
+```bash
+/soc-test src/utils/currency.ts --type unit
+```
+*Effect:* Generates `src/utils/currency.test.ts` covering rounding logic and currency codes.
+
+### B. Integration Test
+```bash
+/soc-test src/api/auth --type integration
+```
+*Effect:* Sets up a mock database and tests the `/login` flow including DB persistence.
+
+## 7. Dependencies & Capabilities
+
+### Agents
+- **Quality Assurance**: `@[.opencode/agents/quality.md]` - Primary agent for test generation.
+- **Reviewer**: `@[.opencode/agents/reviewer.md]` - For evaluating test coverage and effectiveness.
+
+### Skills
+- **Self Check**: `@[.opencode/skills/self-check/SKILL.md]` - Verifying test results against requirements.
+- **Debug Protocol**: `@[.opencode/skills/debug-protocol/SKILL.md]` - Analyzing test failures.
+
+### MCP Integration
+- **`run_command`**: Executing test runners (Jest, Vitest, Pytest).
+- **`filesystem`**: processing integration test artifacts or coverage reports.
+
+## 8. Boundaries
+
+**Will:**
+- Generate standard test files using the project's framework.
+- Run tests and parse output.
+- Mock external dependencies (S3, Stripe) to prevent flake.
+
+**Will Not:**
+- **Guarantee 100% Logic Coverage**: It covers code paths, not necessarily business intent.
+- **Debug Production**: Tests run in `test` environment only.
package/.opencode/commands/soc-workflow.md
CHANGED
@@ -1,97 +1,97 @@
----
-description: Custom workflow creation and execution
----
-
-# /soc-workflow
-
-## 1. Command Overview
-The `/soc-workflow` command is the "Factory." It allows users to create, list, and run custom sequences of agent commands. It turns a manual, multi-step process (e.g., "Check status -> Pull -> Build -> Deploy") into a single executable command.
-
-## 2. Triggers & Routing
-The command is a meta-orchestrator.
-
-| Trigger Scenario | Flag | Action |
-| :--- | :--- | :--- |
-| **New Automation** | `create` | Templates a new `.md` file in `.agent/workflows/` |
-| **Run Automation** | `run` | Parses and executes the steps in order |
-| **List Available** | `list` | Scans directory for `.md` files |
-
-## 3. Usage & Arguments
-```bash
-/soc-workflow [action] [name]
-```
-
-### Arguments
-- **`[action]`**: `create`, `run`, `list`, `edit`, `delete`.
-- **`[name]`**: Name of the workflow (e.g., `deploy`).
-
-## 4. Behavioral Flow (Orchestration)
-
-### Phase 1: Definition (Create)
-1. **Template**: Creates a standard markdown file with "Steps."
-2. **Define**: User fills in shell commands or agent commands (`/soc-git`, `/soc-test`).
-
-### Phase 2: Execution (Run)
-1. **Parse**: Reads the markdown.
-2. **Step**: Executes step 1.
-3. **Check**: If step 1 fails, stop (unless `continue_on_error: true`).
-4. **Next**: Proceed to step 2.
-
-## 5. Output Guidelines (The Contract)
-
-### Workflow Definition (Template)
-```markdown
----
-description: Deploy to Staging
----
-# Deploy Staging
-1. **Test**: `/soc-test --type e2e`
-2. **Build**: `npm run build`
-3. **Deploy**: `/soc-git push origin staging`
-```
-
-### Execution Log
-```markdown
-## Workflow: Deploy Staging
-1. [x] **Test**: Passed (2s)
-2. [x] **Build**: Passed (15s)
-3. [x] **Deploy**: Pushed a1b2c3d
-✅ Workflow Complete
-```
-
-## 6. Examples
-
-### A. Create Release Workflow
-```bash
-/soc-workflow create release
-```
-*Effect:* Creates `.agent/workflows/release.md` for the user to edit.
-
-### B. Run Nightly Build
-```bash
-/soc-workflow run nightly
-```
-*Effect:* Runs the sequence defined in `nightly.md`.
-
-## 7. Dependencies & Capabilities
-
-### Agents
-- **PM Agent**: `@[.opencode/agents/pm-agent.md]` - For oversight.
-
-### Skills
-- **None**: It relies on other commands.
-
-### MCP Integration
-- **`filesystem`**: Reading/Writing workflow files.
-- **`run_command`**: Executing shell steps.
-
-## 8. Boundaries
-
-**Will:**
-- Execute commands in sequence.
-- Stop on error.
-- Pass context between steps.
-
-**Will Not:**
-- **Auto-Debug**: If a step fails, the workflow just stops.
-- **Parallize**: Steps are currently sequential only.
+---
+description: Custom workflow creation and execution
+---
+
+# /soc-workflow
+
+## 1. Command Overview
+The `/soc-workflow` command is the "Factory." It allows users to create, list, and run custom sequences of agent commands. It turns a manual, multi-step process (e.g., "Check status -> Pull -> Build -> Deploy") into a single executable command.
+
+## 2. Triggers & Routing
+The command is a meta-orchestrator.
+
+| Trigger Scenario | Flag | Action |
+| :--- | :--- | :--- |
+| **New Automation** | `create` | Templates a new `.md` file in `.agent/workflows/` |
+| **Run Automation** | `run` | Parses and executes the steps in order |
+| **List Available** | `list` | Scans directory for `.md` files |
+
+## 3. Usage & Arguments
+```bash
+/soc-workflow [action] [name]
+```
+
+### Arguments
+- **`[action]`**: `create`, `run`, `list`, `edit`, `delete`.
+- **`[name]`**: Name of the workflow (e.g., `deploy`).
+
+## 4. Behavioral Flow (Orchestration)
+
+### Phase 1: Definition (Create)
+1. **Template**: Creates a standard markdown file with "Steps."
+2. **Define**: User fills in shell commands or agent commands (`/soc-git`, `/soc-test`).
+
+### Phase 2: Execution (Run)
+1. **Parse**: Reads the markdown.
+2. **Step**: Executes step 1.
+3. **Check**: If step 1 fails, stop (unless `continue_on_error: true`).
+4. **Next**: Proceed to step 2.
+
+## 5. Output Guidelines (The Contract)
+
+### Workflow Definition (Template)
+```markdown
+---
+description: Deploy to Staging
+---
+# Deploy Staging
+1. **Test**: `/soc-test --type e2e`
+2. **Build**: `npm run build`
+3. **Deploy**: `/soc-git push origin staging`
+```
+
+### Execution Log
+```markdown
+## Workflow: Deploy Staging
+1. [x] **Test**: Passed (2s)
+2. [x] **Build**: Passed (15s)
+3. [x] **Deploy**: Pushed a1b2c3d
+✅ Workflow Complete
+```
+
+## 6. Examples
+
+### A. Create Release Workflow
+```bash
+/soc-workflow create release
+```
+*Effect:* Creates `.agent/workflows/release.md` for the user to edit.
+
+### B. Run Nightly Build
+```bash
+/soc-workflow run nightly
+```
+*Effect:* Runs the sequence defined in `nightly.md`.
+
+## 7. Dependencies & Capabilities
+
+### Agents
+- **PM Agent**: `@[.opencode/agents/pm-agent.md]` - For oversight.
+
+### Skills
+- **None**: It relies on other commands.
+
+### MCP Integration
+- **`filesystem`**: Reading/Writing workflow files.
+- **`run_command`**: Executing shell steps.
+
+## 8. Boundaries
+
+**Will:**
+- Execute commands in sequence.
+- Stop on error.
+- Pass context between steps.
+
+**Will Not:**
+- **Auto-Debug**: If a step fails, the workflow just stops.
+- **Parallize**: Steps are currently sequential only.
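Reviewer note: the "Phase 2: Execution (Run)" steps in `soc-workflow.md` (parse, execute, stop on failure unless `continue_on_error: true`) can be sketched roughly as follows. This is only an illustration and is not taken from the package: `parse_steps`, `run_workflow`, the `execute` callback, and the regex for the step format are all hypothetical, and `continue_on_error` is treated here as a workflow-wide flag, which the document does not confirm.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical step format, modelled on the "Workflow Definition (Template)":
#   1. **Test**: `/soc-test --type e2e`
STEP_RE = re.compile(r"^\s*\d+\.\s+\*\*(?P<name>[^*]+)\*\*:\s+`(?P<cmd>[^`]+)`\s*$")

SAMPLE_WORKFLOW = """\
---
description: Deploy to Staging
---
# Deploy Staging
1. **Test**: `/soc-test --type e2e`
2. **Build**: `npm run build`
3. **Deploy**: `/soc-git push origin staging`
"""


def parse_steps(markdown: str) -> List[Tuple[str, str]]:
    """Return (step name, command) pairs in the order they appear."""
    steps = []
    for line in markdown.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append((match.group("name"), match.group("cmd")))
    return steps


def run_workflow(
    markdown: str,
    execute: Callable[[str], bool],
    continue_on_error: bool = False,
) -> List[Tuple[str, bool]]:
    """Run steps one at a time; stop at the first failure unless told otherwise.

    `execute` is an injected callback (an assumption of this sketch) that runs
    one command and returns True on success.
    """
    log = []
    for name, cmd in parse_steps(markdown):
        ok = execute(cmd)
        log.append((name, ok))
        if not ok and not continue_on_error:
            break  # sequential only: later steps are skipped
    return log
```

Injecting `execute` keeps the sketch independent of how shell steps and agent commands are actually dispatched, which this diff does not show.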
package/.opencode/settings.json
CHANGED
@@ -1,3 +1,3 @@
-{
-"version": "1.1.0"
-}
+{
+"version": "1.1.0"
+}
package/.opencode/skills/confidence-check/SKILL.md
CHANGED
@@ -1,97 +1,97 @@
----
-name: confidence-check
-description: Pre-execution risk assessment to prevent hallucinations and architectural drift.
----
-
-# Confidence Check Skill
-
-## Purpose
-To calculate a probabilistic "Success Score" (0.0 - 1.0) **before** generating code. This acts as a circuit breaker for the `execution` mode.
-
-**ROI Metric**: A 200-token analysis prevents 2,000+ tokens of incorrect code generation and subsequent debugging time.
-
-## When to Use
-- **Automatic Trigger**: Before any `write_file` operation affecting > 50 lines of code.
-- **Manual Trigger**: When requirements are vague (e.g., "Fix the bug").
-- **Agent Handoff**: When `pm-agent` assigns a task to `backend` or `frontend`.
-
-## The 5 Pillars of Confidence
-
-| Pillar | Weight | Verification Action | Related Agent |
-| :--- | :--- | :--- | :--- |
-| **1. Context Awareness** | 25% | Have I read *all* related files? (grep/read_file) | `pm-agent` |
-| **2. Specification** | 25% | Do I have a clear Interface/Schema? (Zod/Types) | `architect` |
-| **3. Ground Truth** | 20% | Do I have *current* docs? (No hallucinations) | `researcher` |
-| **4. Pattern Matching** | 15% | Does this match existing project style? | `review` |
-| **5. Impact Analysis** | 15% | Do I know what this might break? | `quality` |
-
-## Scoring & Protocols
-
-### 🟢 High Confidence (≥ 0.90)
-**Action**: **PROCEED** to `execution` mode.
-* *Definition:* You have the file paths, the schema is defined, you have verified the library version, and you have a rollback plan.
-
-### 🟡 Medium Confidence (0.70 - 0.89)
-**Action**: **REFINE** before coding.
-* *Protocol:* Identify the weak pillar and fix it.
-* *Weak Docs?* -> Trigger `researcher` to fetch API refs.
-* *Weak Specs?* -> Trigger `architect` to define the interface.
-* *Weak Context?* -> Run `grep` to find usages.
-
-### 🔴 Low Confidence (< 0.70)
-**Action**: **HALT**.
-* *Protocol:* Do not write code. Return control to `pm-agent` or ask the user clarifying questions.
-* *Example:* "I cannot proceed. I do not know the expected return type of the API, and I cannot find a design pattern for this module."
-
-## Execution Template
-
-*Copy this mental scratchpad into your context window before coding.*
-
-```markdown
-## 🛡️ Pre-Flight Confidence Check
-
-### 1. Context Audit (25%)
-- [ ] Mapped dependency tree? (Yes/No)
-- [ ] Checked for duplicate logic via `grep`? (Yes/No)
-- **Score: __ / 0.25**
-
-### 2. Spec Validation (25%)
-- [ ] Is the Input/Output clearly typed (TypeScript/Zod)?
-- [ ] Does an ADR or RFC exist for this?
-- **Score: __ / 0.25** -> *If 0, consult `architect`*
-
-### 3. Documentation Reality (20%)
-- [ ] Did I look up the library docs *today*?
-- [ ] Am I guessing the API syntax?
-- **Score: __ / 0.20** -> *If 0, consult `researcher`*
-
-### 4. Local Patterns (15%)
-- [ ] Am I using the project's established styling/naming?
-- **Score: __ / 0.15**
-
-### 5. Risk Assessment (15%)
-- [ ] If this fails, does the app crash or just error out?
-- [ ] Do I have a test case ready to verify this?
-- **Score: __ / 0.15** -> *If 0, consult `quality`*
-
-### 🏁 Final Score: [ 0.00 - 1.00 ]
-**Verdict:** [ PROCEED / REFINE / HALT ]
-```
-
-## Scenario Examples
-
-### Scenario A: Adding a new API Endpoint
-* **Context**: Read `server.ts`? Yes.
-* **Spec**: No request schema defined. (**0**)
-* **Docs**: Know Express well. (0.2)
-* **Pattern**: Copied existing controller style. (0.15)
-* **Risk**: Low. (0.15)
-* **Total**: **0.75 (Medium)**
-* **Action**: *STOP. Call `backend` agent to define Zod schema first. Then Proceed.*
-
-### Scenario B: "Fix the white screen"
-* **Context**: No error logs provided. (**0**)
-* **Spec**: Unknown. (**0**)
-* **Docs**: N/A.
-* **Total**: **0.15 (Low)**
-* **Action**: *HALT. Ask user for logs or reproduction steps. Trigger `brainstorming` mode.*
+---
+name: confidence-check
+description: Pre-execution risk assessment to prevent hallucinations and architectural drift.
+---
+
+# Confidence Check Skill
+
+## Purpose
+To calculate a probabilistic "Success Score" (0.0 - 1.0) **before** generating code. This acts as a circuit breaker for the `execution` mode.
+
+**ROI Metric**: A 200-token analysis prevents 2,000+ tokens of incorrect code generation and subsequent debugging time.
+
+## When to Use
+- **Automatic Trigger**: Before any `write_file` operation affecting > 50 lines of code.
+- **Manual Trigger**: When requirements are vague (e.g., "Fix the bug").
+- **Agent Handoff**: When `pm-agent` assigns a task to `backend` or `frontend`.
+
+## The 5 Pillars of Confidence
+
+| Pillar | Weight | Verification Action | Related Agent |
+| :--- | :--- | :--- | :--- |
+| **1. Context Awareness** | 25% | Have I read *all* related files? (grep/read_file) | `pm-agent` |
+| **2. Specification** | 25% | Do I have a clear Interface/Schema? (Zod/Types) | `architect` |
+| **3. Ground Truth** | 20% | Do I have *current* docs? (No hallucinations) | `researcher` |
+| **4. Pattern Matching** | 15% | Does this match existing project style? | `review` |
+| **5. Impact Analysis** | 15% | Do I know what this might break? | `quality` |
+
+## Scoring & Protocols
+
+### 🟢 High Confidence (≥ 0.90)
+**Action**: **PROCEED** to `execution` mode.
+* *Definition:* You have the file paths, the schema is defined, you have verified the library version, and you have a rollback plan.
+
+### 🟡 Medium Confidence (0.70 - 0.89)
+**Action**: **REFINE** before coding.
+* *Protocol:* Identify the weak pillar and fix it.
+* *Weak Docs?* -> Trigger `researcher` to fetch API refs.
+* *Weak Specs?* -> Trigger `architect` to define the interface.
+* *Weak Context?* -> Run `grep` to find usages.
+
+### 🔴 Low Confidence (< 0.70)
+**Action**: **HALT**.
+* *Protocol:* Do not write code. Return control to `pm-agent` or ask the user clarifying questions.
+* *Example:* "I cannot proceed. I do not know the expected return type of the API, and I cannot find a design pattern for this module."
+
+## Execution Template
+
+*Copy this mental scratchpad into your context window before coding.*
+
+```markdown
+## 🛡️ Pre-Flight Confidence Check
+
+### 1. Context Audit (25%)
+- [ ] Mapped dependency tree? (Yes/No)
+- [ ] Checked for duplicate logic via `grep`? (Yes/No)
+- **Score: __ / 0.25**
+
+### 2. Spec Validation (25%)
+- [ ] Is the Input/Output clearly typed (TypeScript/Zod)?
+- [ ] Does an ADR or RFC exist for this?
+- **Score: __ / 0.25** -> *If 0, consult `architect`*
+
+### 3. Documentation Reality (20%)
+- [ ] Did I look up the library docs *today*?
+- [ ] Am I guessing the API syntax?
+- **Score: __ / 0.20** -> *If 0, consult `researcher`*
+
+### 4. Local Patterns (15%)
+- [ ] Am I using the project's established styling/naming?
+- **Score: __ / 0.15**
+
+### 5. Risk Assessment (15%)
+- [ ] If this fails, does the app crash or just error out?
+- [ ] Do I have a test case ready to verify this?
+- **Score: __ / 0.15** -> *If 0, consult `quality`*
+
+### 🏁 Final Score: [ 0.00 - 1.00 ]
+**Verdict:** [ PROCEED / REFINE / HALT ]
+```
+
+## Scenario Examples
+
+### Scenario A: Adding a new API Endpoint
+* **Context**: Read `server.ts`? Yes.
+* **Spec**: No request schema defined. (**0**)
+* **Docs**: Know Express well. (0.2)
+* **Pattern**: Copied existing controller style. (0.15)
+* **Risk**: Low. (0.15)
+* **Total**: **0.75 (Medium)**
+* **Action**: *STOP. Call `backend` agent to define Zod schema first. Then Proceed.*
+
+### Scenario B: "Fix the white screen"
+* **Context**: No error logs provided. (**0**)
+* **Spec**: Unknown. (**0**)
+* **Docs**: N/A.
+* **Total**: **0.15 (Low)**
+* **Action**: *HALT. Ask user for logs or reproduction steps. Trigger `brainstorming` mode.*
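Reviewer note: the weighted scoring described in `confidence-check/SKILL.md` can be checked with a small calculation. The sketch below uses the pillar weights and the PROCEED / REFINE / HALT thresholds from the skill file; everything else is an assumption made for illustration, including the names `WEIGHTS`, `confidence_score`, and `verdict`, and the idea of rating each pillar as a fraction between 0 and 1.

```python
from typing import Dict

# Pillar weights as listed in "The 5 Pillars of Confidence" (they sum to 1.0).
WEIGHTS: Dict[str, float] = {
    "context": 0.25,
    "specification": 0.25,
    "ground_truth": 0.20,
    "pattern": 0.15,
    "impact": 0.15,
}


def confidence_score(ratings: Dict[str, float]) -> float:
    """Weighted sum of per-pillar ratings.

    `ratings` maps a pillar name to the fraction of that pillar that is
    satisfied (0.0 to 1.0). Pillars that are not rated count as 0. The
    fractional rating is an assumption of this sketch, not part of the skill.
    """
    total = 0.0
    for pillar, weight in WEIGHTS.items():
        fraction = ratings.get(pillar, 0.0)
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"rating for {pillar!r} must be within [0, 1]")
        total += weight * fraction
    return round(total, 2)


def verdict(score: float) -> str:
    """Map a score onto the three protocols from "Scoring & Protocols"."""
    if score >= 0.90:
        return "PROCEED"
    if score >= 0.70:
        return "REFINE"
    return "HALT"


# Scenario A from the skill file: every pillar satisfied except Specification.
scenario_a = {"context": 1.0, "specification": 0.0, "ground_truth": 1.0,
              "pattern": 1.0, "impact": 1.0}
```

With these inputs, Scenario A comes to 0.25 + 0.20 + 0.15 + 0.15 = 0.75, which falls in the Medium band and matches the total given in the skill file.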