@kodrunhq/opencode-autopilot 1.12.1 → 1.14.0

Files changed (75)
  1. package/assets/commands/oc-brainstorm.md +2 -0
  2. package/assets/commands/oc-new-agent.md +2 -0
  3. package/assets/commands/oc-new-command.md +2 -0
  4. package/assets/commands/oc-new-skill.md +2 -0
  5. package/assets/commands/oc-quick.md +2 -0
  6. package/assets/commands/oc-refactor.md +26 -0
  7. package/assets/commands/oc-review-agents.md +2 -0
  8. package/assets/commands/oc-review-pr.md +1 -0
  9. package/assets/commands/oc-security-audit.md +20 -0
  10. package/assets/commands/oc-stocktake.md +2 -0
  11. package/assets/commands/oc-tdd.md +2 -0
  12. package/assets/commands/oc-update-docs.md +2 -0
  13. package/assets/commands/oc-write-plan.md +2 -0
  14. package/assets/skills/api-design/SKILL.md +391 -0
  15. package/assets/skills/brainstorming/SKILL.md +1 -0
  16. package/assets/skills/code-review/SKILL.md +1 -0
  17. package/assets/skills/coding-standards/SKILL.md +3 -0
  18. package/assets/skills/csharp-patterns/SKILL.md +1 -0
  19. package/assets/skills/database-patterns/SKILL.md +270 -0
  20. package/assets/skills/docker-deployment/SKILL.md +326 -0
  21. package/assets/skills/e2e-testing/SKILL.md +1 -0
  22. package/assets/skills/frontend-design/SKILL.md +1 -0
  23. package/assets/skills/git-worktrees/SKILL.md +1 -0
  24. package/assets/skills/go-patterns/SKILL.md +1 -0
  25. package/assets/skills/java-patterns/SKILL.md +1 -0
  26. package/assets/skills/plan-executing/SKILL.md +1 -0
  27. package/assets/skills/plan-writing/SKILL.md +1 -0
  28. package/assets/skills/python-patterns/SKILL.md +1 -0
  29. package/assets/skills/rust-patterns/SKILL.md +1 -0
  30. package/assets/skills/security-patterns/SKILL.md +312 -0
  31. package/assets/skills/strategic-compaction/SKILL.md +1 -0
  32. package/assets/skills/systematic-debugging/SKILL.md +1 -0
  33. package/assets/skills/tdd-workflow/SKILL.md +1 -0
  34. package/assets/skills/typescript-patterns/SKILL.md +1 -0
  35. package/assets/skills/verification/SKILL.md +1 -0
  36. package/package.json +1 -1
  37. package/src/agents/autopilot.ts +4 -0
  38. package/src/agents/coder.ts +265 -0
  39. package/src/agents/db-specialist.ts +295 -0
  40. package/src/agents/debugger.ts +4 -0
  41. package/src/agents/devops.ts +352 -0
  42. package/src/agents/frontend-engineer.ts +541 -0
  43. package/src/agents/index.ts +31 -0
  44. package/src/agents/pipeline/oc-implementer.ts +4 -0
  45. package/src/agents/security-auditor.ts +348 -0
  46. package/src/hooks/anti-slop.ts +40 -1
  47. package/src/hooks/slop-patterns.ts +24 -4
  48. package/src/index.ts +2 -0
  49. package/src/installer.ts +29 -2
  50. package/src/memory/capture.ts +9 -4
  51. package/src/memory/decay.ts +11 -0
  52. package/src/memory/retrieval.ts +31 -2
  53. package/src/orchestrator/artifacts.ts +7 -2
  54. package/src/orchestrator/confidence.ts +3 -2
  55. package/src/orchestrator/handlers/architect.ts +11 -8
  56. package/src/orchestrator/handlers/build.ts +57 -16
  57. package/src/orchestrator/handlers/challenge.ts +9 -3
  58. package/src/orchestrator/handlers/plan.ts +5 -4
  59. package/src/orchestrator/handlers/recon.ts +9 -4
  60. package/src/orchestrator/handlers/retrospective.ts +3 -1
  61. package/src/orchestrator/handlers/ship.ts +8 -7
  62. package/src/orchestrator/handlers/types.ts +1 -0
  63. package/src/orchestrator/lesson-memory.ts +2 -1
  64. package/src/orchestrator/orchestration-logger.ts +40 -0
  65. package/src/orchestrator/phase.ts +14 -0
  66. package/src/orchestrator/schemas.ts +2 -0
  67. package/src/orchestrator/skill-injection.ts +11 -6
  68. package/src/orchestrator/state.ts +2 -1
  69. package/src/orchestrator/wave-assigner.ts +117 -0
  70. package/src/review/selection.ts +4 -32
  71. package/src/skills/adaptive-injector.ts +96 -5
  72. package/src/skills/loader.ts +4 -1
  73. package/src/tools/hashline-edit.ts +317 -0
  74. package/src/tools/orchestrate.ts +141 -18
  75. package/src/tools/review.ts +2 -1
@@ -0,0 +1,312 @@
1
+ ---
2
+ # opencode-autopilot
3
+ name: security-patterns
4
+ description: OWASP Top 10 security patterns, authentication, authorization, input validation, secret management, and secure coding practices
5
+ stacks: []
6
+ requires: []
7
+ ---
8
+
9
+ # Security Patterns
10
+
11
+ Actionable security patterns for building, reviewing, and hardening applications. Covers the OWASP Top 10, authentication, authorization, input validation, secret management, secure headers, dependency security, cryptography basics, API security, and logging. Apply these when writing new code, reviewing pull requests, or auditing existing systems.
12
+
13
+ ## 1. Injection Prevention (OWASP A03)
14
+
15
+ **DO:** Use parameterized queries and prepared statements for all database interactions. Never concatenate user input into queries.
16
+
17
+ ```sql
18
+ -- DO: Parameterized query
19
+ SELECT * FROM users WHERE email = ? AND status = ?
20
+
21
+ -- DON'T: String concatenation
22
+ SELECT * FROM users WHERE email = '" + userInput + "' AND status = 'active'
23
+ ```
24
+
25
+ - Use ORM query builders with bound parameters
26
+ - Apply the same principle to LDAP, OS commands, and XML parsers
27
+ - Use allowlists for dynamic column/table names (never interpolate directly)
28
+
29
+ **DON'T:**
30
+
31
+ - Build SQL strings with template literals or concatenation
32
+ - Trust "sanitized" input as a substitute for parameterization
33
+ - Use dynamic code evaluation with user-controlled input
34
+ - Pass user input directly to shell commands -- use argument arrays instead:
35
+ ```
36
+ // DO: Argument array (no shell interpretation)
37
+ spawn("convert", [inputFile, "-resize", "200x200", outputFile])
38
+
39
+ // DON'T: Shell string (command injection risk)
40
+ runShellCommand("convert " + inputFile + " -resize 200x200 " + outputFile)
41
+ ```
42
+
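The allowlist and bound-parameter guidance above can be sketched in TypeScript as follows. This is a minimal illustration, not code from the package: `buildUserQuery`, `SORTABLE_COLUMNS`, and the returned `{ text, values }` shape are hypothetical names, and the `?` placeholder style depends on the database driver in use.

```typescript
// Hypothetical helper (not part of the package): builds a parameterized query
// whose only dynamic identifier is checked against an allowlist.
const SORTABLE_COLUMNS: ReadonlySet<string> = new Set<string>([
  "email",
  "created_at",
  "status",
]);

interface ParameterizedQuery {
  readonly text: string;
  readonly values: readonly string[];
}

function buildUserQuery(email: string, status: string, sortBy: string): ParameterizedQuery {
  // Identifiers cannot be bound as parameters, so they must match the allowlist exactly.
  if (!SORTABLE_COLUMNS.has(sortBy)) {
    throw new Error(`Unsupported sort column: ${sortBy}`);
  }
  // User-supplied values stay out of the SQL text and travel as bound parameters.
  return {
    text: `SELECT * FROM users WHERE email = ? AND status = ? ORDER BY ${sortBy}`,
    values: [email, status],
  };
}
```

The query text never contains user input; only an allowlisted column name is interpolated, and everything else is handed to the driver as a value.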
43
+ ## 2. Authentication Patterns
44
+
45
+ **DO:** Use proven authentication libraries and standards. Never roll your own crypto or session management.
46
+
47
+ - **JWT best practices:**
48
+ - Use short-lived access tokens (5-15 minutes) with refresh token rotation
49
+ - Validate `iss`, `aud`, `exp`, and `nbf` claims on every request
50
+ - Use asymmetric signing (RS256/ES256) for distributed systems; symmetric (HS256) only for single-service
51
+ - Store refresh tokens server-side (database or Redis) with revocation support
52
+ - Never store JWTs in `localStorage` -- use `httpOnly` cookies
53
+
54
+ - **Session management:**
55
+ - Regenerate session ID after login (prevent session fixation)
56
+ - Set absolute session timeout (e.g., 8 hours) and idle timeout (e.g., 30 minutes)
57
+ - Invalidate sessions on password change and logout
58
+ - Store sessions server-side; the cookie holds only the session ID
59
+
60
+ - **Password handling:**
61
+ - Hash with bcrypt (cost factor 12+), scrypt, or Argon2id -- never MD5 or SHA-256 alone
62
+ - Enforce minimum length (12+ characters); do not cap maximum length below 128 characters
63
+ - Check against breached password databases (Have I Been Pwned API)
64
+ - Use constant-time comparison for password verification
65
+
66
+ **DON'T:**
67
+
68
+ - Store passwords in plaintext or with reversible encryption
69
+ - Implement custom JWT libraries -- use well-maintained ones (jose, jsonwebtoken)
70
+ - Send tokens in URL query parameters (logged in server logs, browser history, referrer headers)
71
+ - Use predictable session IDs or sequential tokens
72
+
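The claim checks listed under JWT best practices (`iss`, `aud`, `exp`, `nbf`) might look like the sketch below. It assumes signature verification has already been performed by a maintained JWT library; `JwtClaims`, `ExpectedClaims`, and `validateClaims` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical claim shapes; times are seconds since the Unix epoch.
interface JwtClaims {
  readonly iss?: string;
  readonly aud?: string | readonly string[];
  readonly exp?: number;
  readonly nbf?: number;
}

interface ExpectedClaims {
  readonly issuer: string;
  readonly audience: string;
  readonly clockSkewSeconds?: number;
}

// Returns null when all claims are acceptable, otherwise a short reason string.
// Assumes the token signature was verified before this function is called.
function validateClaims(
  claims: JwtClaims,
  expected: ExpectedClaims,
  nowSeconds: number,
): string | null {
  const skew = expected.clockSkewSeconds ?? 0;
  if (claims.iss !== expected.issuer) return "invalid issuer";

  const audiences: readonly string[] =
    typeof claims.aud === "string" ? [claims.aud] : (claims.aud ?? []);
  if (audiences.indexOf(expected.audience) === -1) return "invalid audience";

  // A missing exp is treated as a failure: tokens must expire.
  if (typeof claims.exp !== "number" || nowSeconds >= claims.exp + skew) return "token expired";
  if (typeof claims.nbf === "number" && nowSeconds + skew < claims.nbf) return "token not yet valid";
  return null;
}
```

Passing the current time as a parameter keeps the function pure and easy to test; a real service would typically rely on the JWT library's own claim validation options instead.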
73
+ ## 3. Authorization (OWASP A01)
74
+
75
+ **DO:** Enforce authorization on every request, server-side. Never rely on client-side checks alone.
76
+
77
+ - **RBAC (Role-Based Access Control):**
78
+ ```
79
+ // Middleware checks role before handler runs
80
+ authorize(["admin", "manager"])
81
+ function deleteUser(userId) { ... }
82
+ ```
83
+
84
+ - **ABAC (Attribute-Based Access Control):**
85
+ ```
86
+ // Policy: user can edit only their own posts, admins can edit any
87
+ function canEditPost(user, post) {
88
+ return user.role === "admin" || post.authorId === user.id
89
+ }
90
+ ```
91
+
92
+ - Check ownership on every resource access (IDOR prevention):
93
+ ```
94
+ // DO: Verify ownership
95
+ post = await getPost(postId)
96
+ if (post.authorId !== currentUser.id && !currentUser.isAdmin) {
97
+ throw new ForbiddenError()
98
+ }
99
+
100
+ // DON'T: Trust that the user only accesses their own resources
101
+ post = await getPost(postId) // No ownership check
102
+ ```
103
+
104
+ - Apply the principle of least privilege -- default deny, explicitly grant
105
+ - Log all authorization failures for monitoring
106
+
107
+ **DON'T:**
108
+
109
+ - Hide UI elements as a security measure (security by obscurity)
110
+ - Use sequential/guessable IDs for sensitive resources -- use UUIDs
111
+ - Check permissions only at the UI layer
112
+ - Grant broad roles when narrow permissions suffice
113
+
114
+ ## 4. Cross-Site Scripting Prevention (OWASP A03)
115
+
116
+ **DO:** Escape all output by default. Use context-aware encoding.
117
+
118
+ - Use framework auto-escaping (React JSX, Vue templates, Angular binding)
119
+ - Sanitize HTML when rich text is required (use libraries like DOMPurify or sanitize-html)
120
+ - Use `textContent` instead of `innerHTML` for dynamic text
121
+ - Apply Content Security Policy headers (see Section 7)
122
+
123
+ **DON'T:**
124
+
125
+ - Use raw HTML injection props (React, Vue) with user-supplied content
126
+ - Insert user data into script tags, event handlers, or `href="javascript:..."`
127
+ - Trust server-side sanitization alone -- defense in depth means escaping at every layer
128
+ - Disable framework auto-escaping without explicit justification
129
+
130
+ ## 5. Cross-Site Request Forgery Prevention (OWASP A01)
131
+
132
+ **DO:** Protect state-changing operations with anti-CSRF tokens.
133
+
134
+ - Use the synchronizer token pattern (server-generated, per-session or per-request)
135
+ - For SPAs: use the double-submit cookie pattern or custom request headers
136
+ - Set `SameSite=Lax` or `SameSite=Strict` on session cookies
137
+ - Verify `Origin` and `Referer` headers as an additional layer
138
+
139
+ **DON'T:**
140
+
141
+ - Rely solely on the `SameSite` cookie attribute (older browsers may not support it)
142
+ - Use GET requests for state-changing operations
143
+ - Accept CSRF tokens in query parameters (leaks via referrer)
144
+
145
+ ## 6. Server-Side Request Forgery Prevention (OWASP A10)
146
+
147
+ **DO:** Validate and restrict all server-initiated outbound requests.
148
+
149
+ - Maintain an allowlist of permitted hostnames or URL patterns
150
+ - Block requests to private/internal IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, ::1)
151
+ - Use a dedicated HTTP client with timeout, redirect limits, and DNS rebinding protection
152
+ - Resolve DNS and validate the IP before connecting (prevent DNS rebinding)
153
+
154
+ **DON'T:**
155
+
156
+ - Allow user-controlled URLs to reach internal services
157
+ - Follow redirects blindly from user-provided URLs
158
+ - Trust URL parsing alone -- resolve and check the actual IP address
159
+
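The private-range check described above could be sketched as follows. This is a simplified illustration under stated assumptions: `isBlockedIPv4` is a hypothetical name, it covers only the IPv4 ranges listed in this section (not `::1` or other IPv6 forms), and it must be applied to the resolved IP address rather than the hostname.

```typescript
// Returns true when a dotted-quad IPv4 address is loopback or private,
// or when the input cannot be parsed (fail closed).
// Sketch only: IPv6 and other reserved ranges are not handled here.
function isBlockedIPv4(address: string): boolean {
  const parts = address.split(".");
  if (parts.length !== 4) return true;

  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return true;

  const [a, b] = octets;
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // 127.0.0.0/8 loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}
```

Rejecting unparsable input by default means an unexpected address format cannot slip past the check.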
160
+ ## 7. Secure Headers
161
+
162
+ **DO:** Set security headers on all HTTP responses.
163
+
164
+ ```
165
+ Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; frame-ancestors 'none'
166
+ Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
167
+ X-Content-Type-Options: nosniff
168
+ X-Frame-Options: DENY
169
+ Referrer-Policy: strict-origin-when-cross-origin
170
+ Permissions-Policy: camera=(), microphone=(), geolocation=()
171
+ ```
172
+
173
+ - Start with a strict CSP and loosen only as needed
174
+ - Use `nonce` or `hash` for inline scripts instead of `'unsafe-inline'`
175
+ - Enable HSTS preloading for production domains
176
+ - Set `X-Frame-Options: DENY` unless embedding is required
177
+
178
+ **DON'T:**
179
+
180
+ - Use `'unsafe-eval'` in CSP (enables XSS via code evaluation)
181
+ - Skip HSTS on HTTPS-only sites
182
+ - Set permissive CORS (`Access-Control-Allow-Origin: *`) on authenticated endpoints
183
+
184
+ ## 8. Input Validation and Sanitization
185
+
186
+ **DO:** Validate all input at system boundaries. Reject invalid input before processing.
187
+
188
+ - Use schema validation (Zod, Joi, JSON Schema) for structured input
189
+ - Validate type, length, range, and format
190
+ - Use allowlists over blocklists for security-sensitive fields
191
+ - Sanitize for the output context (HTML-encode for HTML, parameterize for SQL)
192
+ - Validate file uploads: check MIME type, file extension, file size, and magic bytes
193
+
194
+ **DON'T:**
195
+
196
+ - Trust `Content-Type` headers alone for file type validation
197
+ - Use regex-only validation for complex formats (emails, URLs) -- use dedicated parsers
198
+ - Validate on the client only -- always re-validate server-side
199
+ - Accept unbounded input (always set maximum lengths)
200
+
201
+ ## 9. Secret Management
202
+
203
+ **DO:** Keep secrets out of source code and version control.
204
+
205
+ - Use environment variables for deployment-specific secrets
206
+ - Use a secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager) for production
207
+ - Rotate secrets on a schedule and immediately after suspected exposure
208
+ - Use separate secrets per environment (dev, staging, production)
209
+ - Validate that required secrets are present at startup -- fail fast if missing
210
+
211
+ **DON'T:**
212
+
213
+ - Commit secrets to Git (even in "private" repos)
214
+ - Log secrets in application logs or error messages
215
+ - Store secrets in `.env` files in production (use the platform's secret injection)
216
+ - Share secrets via chat, email, or documentation -- use a secrets manager
217
+ - Hardcode API keys, database passwords, or tokens in source files
218
+
219
+ ```
220
+ // DO: Read from environment
221
+ apiKey = environment.get("API_KEY")
222
+ if not apiKey: raise ConfigurationError("API_KEY is required")
223
+
224
+ // DON'T: Hardcoded
225
+ apiKey = "sk-1234567890abcdef"
226
+ ```
227
+
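A TypeScript take on the fail-fast startup check above might look like this. `requireSecrets` is a hypothetical helper, and the environment is passed in as a plain record so the sketch does not depend on any particular runtime.

```typescript
// Hypothetical startup check: reports every missing secret at once and fails fast.
function requireSecrets(
  env: Readonly<Record<string, string | undefined>>,
  names: readonly string[],
): Record<string, string> {
  // Empty strings count as missing.
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report names only; secret values never appear in the error message.
    throw new Error(`Missing required secrets: ${missing.join(", ")}`);
  }

  const resolved: Record<string, string> = {};
  for (const name of names) {
    resolved[name] = env[name] as string;
  }
  return resolved;
}
```

In a Node or Bun service this could be called once at startup with `process.env` and the list of required names, so a misconfigured deployment stops before serving traffic.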
228
+ ## 10. Dependency Security
229
+
230
+ **DO:** Treat dependencies as an attack surface. Audit regularly and keep them updated.
231
+
232
+ - Run your language's dependency audit tool on every CI build (`npm audit`, `pip-audit`, `cargo audit`, etc.)
233
+ - Use lockfiles and commit them to version control
234
+ - Pin major versions; allow patch updates with automated PR tools (Dependabot, Renovate)
235
+ - Review new dependencies before adding: check maintenance status, download count, and known vulnerabilities
236
+ - Use Software Composition Analysis (SCA) tools in CI
237
+
238
+ **DON'T:**
239
+
240
+ - Ignore audit warnings -- triage and fix or document accepted risk
241
+ - Use `*` or `latest` as version specifiers
242
+ - Add dependencies without evaluating their transitive dependency tree
243
+ - Skip lockfile commits (reproducible builds require locked versions)
244
+
245
+ ## 11. Cryptography Basics
246
+
247
+ **DO:** Use standard algorithms and libraries. Never implement your own cryptographic primitives.
248
+
249
+ - **Hashing:** SHA-256 or SHA-3 for data integrity; bcrypt/scrypt/Argon2id for passwords
250
+ - **Encryption:** AES-256-GCM for symmetric; RSA-OAEP or X25519 for asymmetric
251
+ - **Signing:** HMAC-SHA256 for message authentication; Ed25519 or ECDSA for digital signatures
252
+ - Use cryptographically secure random number generators (`crypto.randomUUID()`, `crypto.getRandomValues()`)
253
+ - Store encryption keys separate from encrypted data
254
+
255
+ **DON'T:**
256
+
257
+ - Use MD5 or SHA-1 for anything security-sensitive (broken collision resistance)
258
+ - Use ECB mode for block ciphers (patterns leak through)
259
+ - Reuse initialization vectors (IVs) or nonces
260
+ - Store encryption keys alongside the encrypted data
261
+ - Roll your own encryption scheme
262
+
263
+ ## 12. API Security
264
+
265
+ **DO:** Protect APIs at multiple layers.
266
+
267
+ - Implement rate limiting per IP and per authenticated user:
268
+ ```
269
+ X-RateLimit-Limit: 100
270
+ X-RateLimit-Remaining: 42
271
+ X-RateLimit-Reset: 1672531200
272
+ ```
273
+ - Use API keys for identification, OAuth2/JWT for authentication
274
+ - Configure CORS to allow only specific origins on authenticated endpoints
275
+ - Validate request body size limits (prevent payload-based DoS)
276
+ - Use TLS 1.2+ for all API traffic -- no exceptions
277
+
278
+ **DON'T:**
279
+
280
+ - Expose internal error details in API responses (stack traces, SQL errors)
281
+ - Allow unlimited request sizes or query complexity (for GraphQL, enforce depth and cost limits)
282
+ - Use API keys as the sole authentication mechanism for sensitive operations
283
+ - Disable TLS certificate validation in production clients
284
+
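The per-client rate limiting above, together with the `X-RateLimit-*` values it reports, can be illustrated with a fixed-window counter. This is a sketch under assumptions: `createFixedWindowLimiter` is a hypothetical name, state is held in memory (a multi-instance deployment would need a shared store), and fixed windows are only one of several possible algorithms.

```typescript
// Values a handler could copy into X-RateLimit-Limit / -Remaining / -Reset.
interface RateLimitResult {
  readonly allowed: boolean;
  readonly limit: number;
  readonly remaining: number;
  readonly resetEpochSeconds: number;
}

// Hypothetical in-memory fixed-window limiter keyed by client (IP or user ID).
function createFixedWindowLimiter(limit: number, windowSeconds: number) {
  const buckets = new Map<string, { start: number; count: number }>();

  return function check(key: string, nowSeconds: number): RateLimitResult {
    let bucket = buckets.get(key);
    // Start a fresh window on first use or once the previous window has elapsed.
    if (!bucket || nowSeconds >= bucket.start + windowSeconds) {
      bucket = { start: nowSeconds, count: 0 };
      buckets.set(key, bucket);
    }

    const allowed = bucket.count < limit;
    if (allowed) bucket.count += 1;

    return {
      allowed,
      limit,
      remaining: limit - bucket.count,
      resetEpochSeconds: bucket.start + windowSeconds,
    };
  };
}
```

Calling `check` once per request with separate keys for the client IP and the authenticated user gives the two limits the section recommends.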
285
+ ## 13. Logging and Monitoring
286
+
287
+ **DO:** Log security-relevant events for detection and forensics.
288
+
289
+ - Log: authentication attempts (success and failure), authorization failures, input validation failures, privilege escalation, configuration changes
290
+ - Include: timestamp, user ID, action, resource, IP address, result (success/failure)
291
+ - Use structured logging (JSON) for machine-parseable audit trails
292
+ - Set up alerts for: brute force patterns, unusual access times, privilege escalation, mass data access
293
+
294
+ **DON'T:**
295
+
296
+ - Log passwords, tokens, session IDs, credit card numbers, or PII
297
+ - Log at a level of detail that makes it possible to reconstruct sensitive user data
298
+ - Store logs on the same system they are monitoring (compromised system = compromised logs)
299
+ - Ignore log volume -- implement log rotation and retention policies
300
+
301
+ ```
302
+ // DO: Structured security log (PII redacted)
303
+ logger.warn("auth.failed", {
304
+ userId: attempt.userId,
305
+ ip: request.ip,
306
+ reason: "invalid_password",
307
+ attemptCount: 3,
308
+ })
309
+
310
+ // DON'T: Leak credentials
311
+ logger.warn("Login failed for user@example.com with password P@ssw0rd!")
312
+ ```
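One way to enforce the "never log credentials" rule above is to pass log fields through a redaction step before they reach the logger. The sketch below is illustrative only: `redactForLog` and the `SENSITIVE_KEYS` list are assumptions, and the redaction is shallow (nested objects are not inspected).

```typescript
// Hypothetical list of field names that must never reach the logs.
const SENSITIVE_KEYS: ReadonlySet<string> = new Set<string>([
  "password",
  "token",
  "sessionid",
  "authorization",
]);

// Returns a new object with sensitive top-level fields replaced; the input is not mutated.
function redactForLog(fields: Readonly<Record<string, unknown>>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    safe[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : value;
  }
  return safe;
}
```

A logger wrapper could call `redactForLog` on every structured payload so that an accidental `password` or `token` field is masked rather than written out.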
@@ -1,4 +1,5 @@
1
1
  ---
2
+ # opencode-autopilot
2
3
  name: strategic-compaction
3
4
  description: Context window management through strategic summarization -- keep working memory lean without losing critical information
4
5
  stacks: []
@@ -1,4 +1,5 @@
1
1
  ---
2
+ # opencode-autopilot
2
3
  name: systematic-debugging
3
4
  description: 4-phase root cause analysis methodology for systematic bug diagnosis and resolution
4
5
  stacks: []
@@ -1,4 +1,5 @@
1
1
  ---
2
+ # opencode-autopilot
2
3
  name: tdd-workflow
3
4
  description: Strict RED-GREEN-REFACTOR TDD methodology with anti-pattern catalog and explicit failure modes
4
5
  stacks: []
@@ -1,4 +1,5 @@
1
1
  ---
2
+ # opencode-autopilot
2
3
  name: typescript-patterns
3
4
  description: TypeScript and Bun runtime patterns, testing idioms, type-level programming, and performance best practices
4
5
  stacks:
@@ -1,4 +1,5 @@
1
1
  ---
2
+ # opencode-autopilot
2
3
  name: verification
3
4
  description: Pre-completion verification checklist methodology to catch issues before marking work as done
4
5
  stacks: []
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@kodrunhq/opencode-autopilot",
3
- "version": "1.12.1",
3
+ "version": "1.14.0",
4
4
  "description": "Curated agents, skills, and commands for the OpenCode AI coding CLI — autonomous orchestrator, multi-agent code review, model fallback, and in-session asset creation tools.",
5
5
  "main": "src/index.ts",
6
6
  "keywords": [
@@ -16,6 +16,10 @@ export const autopilotAgent: Readonly<AgentConfig> = Object.freeze({
16
16
  5. If action is "complete": report the summary to the user. You are done.
17
17
  6. If action is "error": report the error to the user. Stop.
18
18
 
19
+ ## Editing Files
20
+
21
+ When editing files, prefer oc_hashline_edit over the built-in edit tool. Hash-anchored edits use LINE#ID validation to prevent stale-line corruption in long-running sessions. Each edit targets a line by its number and a 2-character content hash (e.g., 42#VK). If the line content has changed since you last read the file, the edit is rejected and you receive updated anchors to retry with. The built-in edit tool is still available as a fallback.
22
+
19
23
  ## Rules
20
24
 
21
25
  - NEVER skip calling oc_orchestrate. It is the single source of truth for pipeline state.
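The hash-anchored edit flow described in the "Editing Files" section above (target a line by number plus a 2-character content hash, reject the edit if the content changed, return fresh anchors) can be sketched as follows. The hashing scheme of `oc_hashline_edit` is not shown in this hunk, so `lineHash` below is a stand-in assumption, and `checkAnchor` is a hypothetical name.

```typescript
// Stand-in 2-letter content hash (assumption; not the tool's real algorithm).
function lineHash(content: string): string {
  let h = 0;
  for (let i = 0; i < content.length; i++) {
    h = (h * 31 + content.charCodeAt(i)) % 676; // 676 = 26 * 26 two-letter codes
  }
  return String.fromCharCode(65 + Math.floor(h / 26)) + String.fromCharCode(65 + (h % 26));
}

interface AnchorCheck {
  readonly ok: boolean;
  // Fresh anchor for the targeted line, or null if the anchor is malformed
  // or the line does not exist.
  readonly currentAnchor: string | null;
}

// Validates an anchor such as "42#VK" against the current file contents.
function checkAnchor(lines: readonly string[], anchor: string): AnchorCheck {
  const match = /^(\d+)#([A-Z]{2})$/.exec(anchor);
  if (!match) return { ok: false, currentAnchor: null };

  const lineNumber = Number(match[1]); // 1-based
  const current = lines[lineNumber - 1];
  if (current === undefined) return { ok: false, currentAnchor: null };

  const currentAnchor = `${lineNumber}#${lineHash(current)}`;
  return { ok: currentAnchor === anchor, currentAnchor };
}
```

With only two characters, different contents can occasionally share a hash, so a scheme like this detects most stale edits rather than all of them.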
@@ -0,0 +1,265 @@
1
+ import type { AgentConfig } from "@opencode-ai/sdk";
2
+
3
+ export const coderAgent: Readonly<AgentConfig> = Object.freeze({
4
+ description:
5
+ "Pure code implementer: writes production code, runs tests, fixes builds -- with TDD workflow and coding standards",
6
+ mode: "all",
7
+ maxSteps: 30,
8
+ prompt: `You are the coder agent. You are a pure code implementer. You write production code, run tests, and fix builds. You do NOT self-review code and you do NOT handle frontend design or UX decisions.
9
+
10
+ ## How You Work
11
+
12
+ When a user gives you a coding task, you:
13
+
14
+ 1. **Understand the requirement** -- Read the task description, identify inputs, outputs, and constraints.
15
+ 2. **Write code** -- Implement the feature or fix following TDD workflow and coding standards.
16
+ 3. **Run tests** -- Execute the test suite after every code change to verify correctness.
17
+ 4. **Iterate until green** -- If tests fail, read the error, fix the code, run tests again.
18
+ 5. **Commit** -- Once all tests pass, commit with a descriptive message.
19
+
20
+ <skill name="tdd-workflow">
21
+ # TDD Workflow
22
+
23
+ Strict RED-GREEN-REFACTOR test-driven development methodology. This skill enforces the discipline of writing tests before implementation, producing minimal code to pass tests, and cleaning up only after tests are green. Every cycle produces a commit. Every phase has a clear purpose and exit criterion.
24
+
25
+ TDD is not "writing tests." TDD is a design methodology that uses tests to drive the shape of the code. The test defines the behavior. The implementation satisfies the test. The refactor improves the code without changing behavior.
26
+
27
+ ## When to Use
28
+
29
+ **Activate this skill when:**
30
+
31
+ - Implementing business logic with defined inputs and outputs
32
+ - Building API endpoints with request/response contracts
33
+ - Writing data transformations, parsers, or formatters
34
+ - Implementing validation rules or authorization checks
35
+ - Building algorithms, state machines, or decision logic
36
+ - Fixing a bug (write the regression test first, then fix)
37
+ - Implementing any function where you can describe the expected behavior
38
+
39
+ **Do NOT use when:**
40
+
41
+ - UI layout and styling (visual output is hard to assert meaningfully)
42
+ - Configuration files and static data
43
+ - One-off scripts or migrations
44
+ - Simple CRUD with no business logic (getById, list, delete)
45
+ - Prototyping or exploring an unfamiliar API (spike first, then TDD the real implementation)
46
+
47
+ ## The RED-GREEN-REFACTOR Cycle
48
+
49
+ Each cycle implements ONE behavior. Not two. Not "a few related things." One behavior, one test, one cycle. Repeat until the feature is complete.
50
+
51
+ ### Phase 1: RED (Write a Failing Test)
52
+
53
+ **Purpose:** Define the expected behavior BEFORE writing any production code. The test is a specification.
54
+
55
+ **Process:**
56
+
57
+ 1. Write ONE test that describes a single expected behavior
58
+ 2. The test name should read as a behavior description, not a method name:
59
+ - DO: \`"rejects expired tokens with 401 status"\`
60
+ - DO: \`"calculates total with tax for US addresses"\`
61
+ - DON'T: \`"test validateToken"\` or \`"test calculateTotal"\`
62
+ 3. Structure the test using Arrange-Act-Assert:
63
+ - **Arrange:** Set up inputs and expected outputs
64
+ - **Act:** Call the function or trigger the behavior
65
+ - **Assert:** Verify the output matches expectations
66
+ 4. Run the test -- it MUST fail
67
+ 5. Read the failure message -- it should describe the missing behavior clearly
68
+ 6. If the test passes without any new implementation, the behavior already exists or the test is wrong
69
+
70
+ **Commit:** \`test: add failing test for [behavior]\`
71
+
72
+ **Exit criterion:** The test fails with a clear, expected error message.
73
+
74
+ ### Phase 2: GREEN (Make It Pass)
75
+
76
+ **Purpose:** Write the MINIMUM code to make the test pass. Nothing more.
77
+
78
+ **Process:**
79
+
80
+ 1. Read the failing test to understand what behavior is expected
81
+ 2. Write the simplest possible code that makes the test pass
82
+ 3. Do NOT add error handling the test does not require
83
+ 4. Do NOT handle edge cases the test does not cover
84
+ 5. Do NOT optimize -- performance improvements are Phase 3 or a new cycle
85
+ 6. Do NOT "clean up" -- that is Phase 3
86
+ 7. Run the test -- it MUST pass
87
+ 8. Run all existing tests -- they MUST still pass (no regressions)
88
+
89
+ **Commit:** \`feat: implement [behavior]\`
90
+
91
+ **Exit criterion:** The new test passes AND all existing tests pass.
92
+
93
+ ### Phase 3: REFACTOR (Clean Up)
94
+
95
+ **Purpose:** Improve the code without changing behavior. The tests are your safety net.
96
+
97
+ **Process:**
98
+
99
+ 1. Review the implementation from Phase 2 -- what can be improved?
100
+ 2. Common refactoring targets:
101
+ - Extract repeated logic into named functions
102
+ - Rename variables for clarity
103
+ - Remove duplication between test and production code
104
+ - Simplify complex conditionals
105
+ - Extract constants for magic numbers/strings
106
+ 3. After EVERY change, run the tests -- they MUST still pass
107
+ 4. If a test fails during refactoring, REVERT the last change immediately
108
+ 5. Make smaller changes -- one refactoring at a time, verified by tests
109
+
110
+ **Commit (if changes were made):** \`refactor: clean up [behavior]\`
111
+
112
+ **Exit criterion:** Code is clean, all tests pass, no new behavior added.
113
+
114
+ ## Test Writing Guidelines
115
+
116
+ ### Name Tests as Behavior Descriptions
117
+
118
+ Tests are documentation. The test name should explain what the system does, not how the test works.
119
+
120
+ ### One Assertion Per Test
121
+
122
+ Each test should verify one behavior. If a test has multiple assertions, ask: "Am I testing one behavior or multiple?"
123
+
124
+ ### Arrange-Act-Assert Structure
125
+
126
+ Every test has three distinct sections. Separate them with blank lines for readability.
127
+
128
+ ## Anti-Pattern Catalog
129
+
130
+ ### Anti-Pattern: Writing Tests After Code
131
+
132
+ Always write the test FIRST. The test should fail before any implementation exists.
133
+
134
+ ### Anti-Pattern: Skipping RED
135
+
136
+ Run the test, see the red failure message, read it, confirm it describes the missing behavior. Only then write the implementation.
137
+
138
+ ### Anti-Pattern: Over-Engineering in GREEN
139
+
140
+ Write only what the current test needs. If you need error handling, write a RED test for the error case first.
141
+
142
+ ### Anti-Pattern: Skipping REFACTOR
143
+
144
+ Always do a REFACTOR pass, even if it is a 30-second review that concludes "looks fine."
145
+
146
+ ### Anti-Pattern: Testing Implementation Details
147
+
148
+ Test the public API. Assert on outputs, side effects, and error behaviors. Never assert on how the implementation achieves the result.
149
+
150
+ ## Failure Modes
151
+
152
+ ### Test Won't Fail (RED Phase)
153
+
154
+ Delete the test. Read the existing implementation. Write a test for behavior that is genuinely NOT implemented yet.
155
+
156
+ ### Test Won't Pass (GREEN Phase)
157
+
158
+ Start with the simplest possible implementation (even a hardcoded value). Then generalize one step at a time.
159
+
160
+ ### Refactoring Breaks Tests
161
+
162
+ Revert the last change immediately. Make a smaller refactoring step.
163
+ </skill>
164
+
165
+ <skill name="coding-standards">
166
+ # Coding Standards
167
+
168
+ Universal, language-agnostic coding standards. Apply these rules when reviewing code, generating new code, or refactoring existing code. Every rule is opinionated and actionable.
169
+
170
+ ## 1. Naming Conventions
171
+
172
+ **DO:** Use descriptive, intention-revealing names. Names should explain what a value represents or what a function does without needing comments.
173
+
174
+ - Variables: nouns that describe the value (\`userCount\`, \`activeOrders\`, \`maxRetries\`)
175
+ - Functions: verbs that describe the action (\`fetchUser\`, \`calculateTotal\`, \`validateInput\`)
176
+ - Booleans: questions that read naturally (\`isActive\`, \`hasPermission\`, \`shouldRetry\`, \`canEdit\`)
177
+ - Constants: UPPER_SNAKE_CASE for true constants (\`MAX_RETRIES\`, \`DEFAULT_TIMEOUT\`)
178
+
179
+ ## 2. File Organization
180
+
181
+ **DO:** Keep files focused on a single concern. One module should do one thing well.
182
+
183
+ - Target 200-400 lines per file. Hard maximum of 800 lines.
184
+ - Organize by feature or domain, not by file type
185
+ - One exported class or primary function per file
186
+
187
+ ## 3. Function Design
188
+
189
+ **DO:** Write small functions that do exactly one thing.
190
+
191
+ - Target under 50 lines per function
192
+ - Maximum 3-4 levels of nesting
193
+ - Limit parameters to 3. Use an options object for more.
194
+ - Return early for guard clauses and error conditions
195
+ - Pure functions where possible
196
+
197
+ ## 4. Error Handling
198
+
199
+ **DO:** Handle errors explicitly at every level.
200
+
201
+ - Catch errors as close to the source as possible
202
+ - Provide user-friendly messages in UI-facing code
203
+ - Log detailed context on the server side
204
+ - Fail fast -- validate inputs before processing
205
+
206
+ **DON'T:** Silently swallow errors with empty catch blocks.
207
+
208
+ ## 5. Immutability
209
+
210
+ **DO:** Create new objects instead of mutating existing ones.
211
+
212
+ - Use spread operators, \`map\`, \`filter\`, \`reduce\` to derive new values
213
+ - Treat function arguments as read-only
214
+ - Use \`readonly\` modifiers or frozen objects where the language supports it
215
+
216
+ ## 6. Separation of Concerns
217
+
218
+ **DO:** Keep distinct responsibilities in distinct layers.
219
+
220
+ - Data access separate from business logic
221
+ - Business logic separate from presentation
222
+ - Infrastructure as cross-cutting middleware, not inline code
223
+
224
+ ## 7. DRY (Don't Repeat Yourself)
225
+
226
+ **DO:** Extract shared logic when you see the same pattern duplicated 3 or more times.
227
+
228
+ ## 8. Input Validation
229
+
230
+ **DO:** Validate all external data at system boundaries. Never trust input from users, APIs, files, or environment variables.
231
+
232
+ ## 9. Constants and Configuration
233
+
234
+ **DO:** Use named constants and configuration files for values that may change or carry meaning.
235
+
236
+ ## 10. Code Comments
237
+
238
+ **DO:** Comment the WHY, not the WHAT.
239
+
240
+ ## 11. OOP Principles (SOLID)
241
+
242
+ Apply Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion principles when designing classes and modules.
243
+
244
+ ## 12. Composition and Architecture
245
+
246
+ Prefer composition over inheritance. Use dependency injection. Organize in Domain -> Application -> Infrastructure layers.
247
+ </skill>
248
+
249
+ ## Editing Files
250
+
251
+ When editing files, prefer oc_hashline_edit over the built-in edit tool. Hash-anchored edits use LINE#ID validation to prevent stale-line corruption in long-running sessions. Each edit targets a line by its number and a 2-character content hash (e.g., 42#VK). If the line content has changed since you last read the file, the edit is rejected and you receive updated anchors to retry with. The built-in edit tool is still available as a fallback.
252
+
253
+ ## Rules
254
+
255
+ - ALWAYS follow TDD workflow: write the failing test first, then implement minimally, then refactor.
256
+ - NEVER self-review code -- that is the reviewer agent's job.
257
+ - NEVER make UX/design decisions -- that is outside your scope.
258
+ - Use bash to run tests after every code change.
259
+ - Commit with descriptive messages after each passing test cycle.`,
260
+ permission: {
261
+ edit: "allow",
262
+ bash: "allow",
263
+ webfetch: "deny",
264
+ } as const,
265
+ });
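The Arrange-Act-Assert structure and behavior-style test naming from the embedded tdd-workflow skill above can be illustrated with a small sketch. The function under test, the 8% tax rate, and the hand-rolled assertion are all hypothetical; a real project would use its test runner's `test` and `expect` helpers.

```typescript
// Hypothetical function under test; amounts are in cents to avoid fractional currency.
function calculateTotalWithTax(subtotalCents: number, taxRate: number): number {
  return Math.round(subtotalCents * (1 + taxRate));
}

// The test name reads as a behavior description, not a method name.
function calculatesTotalWithTaxForUsAddresses(): void {
  // Arrange
  const subtotalCents = 1000;
  const taxRate = 0.08;

  // Act
  const total = calculateTotalWithTax(subtotalCents, taxRate);

  // Assert
  if (total !== 1080) {
    throw new Error(`expected 1080, got ${total}`);
  }
}
```

In the RED phase the test would be written first and fail because `calculateTotalWithTax` does not exist yet; the one-line implementation is the minimal GREEN step.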