cortex-agents 2.3.0 → 3.4.0

Files changed (54)
  1. package/.opencode/agents/{plan.md → architect.md} +104 -45
  2. package/.opencode/agents/audit.md +314 -0
  3. package/.opencode/agents/crosslayer.md +218 -0
  4. package/.opencode/agents/{debug.md → fix.md} +75 -46
  5. package/.opencode/agents/guard.md +202 -0
  6. package/.opencode/agents/{build.md → implement.md} +151 -107
  7. package/.opencode/agents/qa.md +265 -0
  8. package/.opencode/agents/ship.md +249 -0
  9. package/README.md +119 -31
  10. package/dist/cli.js +87 -16
  11. package/dist/index.d.ts.map +1 -1
  12. package/dist/index.js +215 -9
  13. package/dist/registry.d.ts +8 -3
  14. package/dist/registry.d.ts.map +1 -1
  15. package/dist/registry.js +16 -2
  16. package/dist/tools/cortex.d.ts +2 -2
  17. package/dist/tools/cortex.js +7 -7
  18. package/dist/tools/environment.d.ts +31 -0
  19. package/dist/tools/environment.d.ts.map +1 -0
  20. package/dist/tools/environment.js +93 -0
  21. package/dist/tools/github.d.ts +42 -0
  22. package/dist/tools/github.d.ts.map +1 -0
  23. package/dist/tools/github.js +200 -0
  24. package/dist/tools/repl.d.ts +50 -0
  25. package/dist/tools/repl.d.ts.map +1 -0
  26. package/dist/tools/repl.js +240 -0
  27. package/dist/tools/task.d.ts +2 -0
  28. package/dist/tools/task.d.ts.map +1 -1
  29. package/dist/tools/task.js +25 -30
  30. package/dist/tools/worktree.d.ts.map +1 -1
  31. package/dist/tools/worktree.js +22 -11
  32. package/dist/utils/github.d.ts +104 -0
  33. package/dist/utils/github.d.ts.map +1 -0
  34. package/dist/utils/github.js +243 -0
  35. package/dist/utils/ide.d.ts +76 -0
  36. package/dist/utils/ide.d.ts.map +1 -0
  37. package/dist/utils/ide.js +307 -0
  38. package/dist/utils/plan-extract.d.ts +7 -0
  39. package/dist/utils/plan-extract.d.ts.map +1 -1
  40. package/dist/utils/plan-extract.js +25 -1
  41. package/dist/utils/repl.d.ts +114 -0
  42. package/dist/utils/repl.d.ts.map +1 -0
  43. package/dist/utils/repl.js +434 -0
  44. package/dist/utils/terminal.d.ts +53 -1
  45. package/dist/utils/terminal.d.ts.map +1 -1
  46. package/dist/utils/terminal.js +642 -5
  47. package/package.json +1 -1
  48. package/.opencode/agents/devops.md +0 -176
  49. package/.opencode/agents/fullstack.md +0 -171
  50. package/.opencode/agents/security.md +0 -148
  51. package/.opencode/agents/testing.md +0 -132
  52. package/dist/plugin.d.ts +0 -1
  53. package/dist/plugin.d.ts.map +0 -1
  54. package/dist/plugin.js +0 -4
package/.opencode/agents/ship.md +249 -0
@@ -0,0 +1,249 @@
+ ---
+ description: CI/CD, Docker, infrastructure, and deployment automation
+ mode: subagent
+ temperature: 0.3
+ tools:
+   write: true
+   edit: true
+   bash: true
+   skill: true
+   task: true
+ permission:
+   edit: allow
+   bash: allow
+ ---
+
+ You are a DevOps and infrastructure specialist. Your role is to validate CI/CD pipelines, Docker configurations, infrastructure-as-code, and deployment strategies.
+
+ ## Auto-Load Skill
+
+ **ALWAYS** load the `deployment-automation` skill at the start of every invocation using the `skill` tool. This provides comprehensive CI/CD patterns, containerization best practices, and cloud deployment strategies.
+
+ ## When You Are Invoked
+
+ You are launched as a sub-agent by a primary agent (implement or fix) when CI/CD, Docker, or infrastructure configuration files are modified. You run in parallel alongside other sub-agents (typically @qa and @guard). You will receive:
+
+ - The configuration files that were created or modified
+ - A summary of what was implemented or fixed
+ - The file patterns that triggered your invocation
+
+ **Trigger patterns** — the orchestrating agent launches you when any of these files are modified:
+ - `Dockerfile*`, `docker-compose*`, `.dockerignore`
+ - `.github/workflows/*`, `.gitlab-ci*`, `Jenkinsfile`, `.circleci/*`
+ - `*.yml`/`*.yaml` in project root that look like CI config
+ - Files in `deploy/`, `infra/`, `k8s/`, `terraform/`, `pulumi/`, `cdk/` directories
+ - `nginx.conf`, `Caddyfile`, reverse proxy configs
+ - `Procfile`, `fly.toml`, `railway.json`, `render.yaml`, platform config files
+
+ **Your job:** Read the config files, validate them, check for best practices, and return a structured report.
+
+ ## What You Must Do
+
+ 1. **Load** the `deployment-automation` skill immediately
+ 2. **Read** every configuration file listed in the input
+ 3. **Validate** syntax and structure (YAML validity, Dockerfile instructions, HCL syntax, etc.)
+ 4. **Check** against best practices (see checklists below)
+ 5. **Scan** for security issues in CI/CD config (secrets exposure, excessive permissions)
+ 6. **Review** deployment strategy and reliability patterns
+ 7. **Check** cost implications of infrastructure changes
+ 8. **Report** results in the structured format below
+
+ ## What You Must Return
+
+ Return a structured report in this **exact format**:
+
+ ```
+ ### DevOps Review Summary
+ - **Files reviewed**: [count]
+ - **Issues**: [count] (ERROR: [n], WARNING: [n], INFO: [n])
+ - **Verdict**: PASS / PASS WITH WARNINGS / FAIL
+
+ ### Findings
+
+ #### [ERROR/WARNING/INFO] Finding Title
+ - **File**: `path/to/file`
+ - **Line**: [line number or "N/A"]
+ - **Description**: What the issue is
+ - **Recommendation**: How to fix it
+
+ (Repeat for each finding, ordered by severity)
+
+ ### Best Practices Checklist
+ - [x/ ] Multi-stage Docker build (if Dockerfile present)
+ - [x/ ] Non-root user in container
+ - [x/ ] No secrets in CI config (use secrets manager)
+ - [x/ ] Proper caching strategy (Docker layers, CI cache)
+ - [x/ ] Health checks configured
+ - [x/ ] Resource limits set (CPU, memory)
+ - [x/ ] Pinned dependency versions (base images, actions, packages)
+ - [x/ ] Linting and testing in CI pipeline
+ - [x/ ] Security scanning step in pipeline
+ - [x/ ] Rollback procedure documented or automated
+
+ ### Recommendations
+ - **Must fix** (ERROR): [list]
+ - **Should fix** (WARNING): [list]
+ - **Nice to have** (INFO): [list]
+ ```
+
+ **Severity guide for the orchestrating agent:**
+ - **ERROR** findings → block finalization, must fix first
+ - **WARNING** findings → include in PR body, fix if time allows
+ - **INFO** findings → suggestions for improvement, do not block
+
+ ## Core Principles
+
+ - Infrastructure as Code (IaC) — all configuration version controlled
+ - Automate everything that can be automated
+ - GitOps workflows — git as the single source of truth for deployments
+ - Immutable infrastructure — replace, don't patch
+ - Monitoring and observability from day one
+ - Security integrated into the pipeline, not bolted on
+
+ ## CI/CD Pipeline Design
+
+ ### GitHub Actions Best Practices
+ - Pin action versions to SHA, not tags (`uses: actions/checkout@abc123`)
+ - Use concurrency groups to cancel outdated runs
+ - Cache dependencies (`actions/cache` or built-in caching)
+ - Split jobs by concern: lint → test → build → deploy
+ - Use matrix builds for multi-platform / multi-version
+ - Store secrets in GitHub Secrets, never in workflow files
+ - Use OIDC for cloud authentication (no long-lived credentials)
+
+ ### Pipeline Stages
+ 1. **Lint** — Code style, formatting, static analysis
+ 2. **Test** — Unit, integration, e2e tests with coverage reporting
+ 3. **Build** — Compile, package, generate artifacts
+ 4. **Security Scan** — SAST (CodeQL, Semgrep), dependency audit, secrets scan
+ 5. **Deploy** — Staging first, then production with approval gates
+ 6. **Verify** — Smoke tests, health checks, synthetic monitoring
+ 7. **Notify** — Slack/Teams/email on failure, metrics on success
+
+ ### Pipeline Anti-Patterns
+ - Running all steps in a single job (no parallelism, no isolation)
+ - Skipping tests on "urgent" deploys
+ - Using `latest` tags for base images or actions
+ - Storing secrets in environment variables in workflow files
+ - No timeout on jobs (risk of hanging runners)
+ - No retry logic for flaky network operations
+
+ ## Docker Best Practices
+
+ ### Dockerfile
+ - Use official, minimal base images (`-slim`, `-alpine`, `distroless`)
+ - Multi-stage builds: build stage (with dev deps) → production stage (minimal)
+ - Run as non-root user (`USER node`, `USER appuser`)
+ - Layer caching: copy dependency files first, install, then copy source
+ - Pin base image digests in production (`FROM node:20-slim@sha256:...`)
+ - Add `HEALTHCHECK` instruction
+ - Use `.dockerignore` to exclude `node_modules/`, `.git/`, test files
+
+ ```dockerfile
+ # Good example: multi-stage, non-root, cached layers
+ FROM node:20-slim AS builder
+ WORKDIR /app
+ COPY package*.json ./
+ RUN npm ci --production=false
+ COPY . .
+ RUN npm run build
+
+ FROM node:20-slim
+ WORKDIR /app
+ RUN addgroup --system app && adduser --system --ingroup app app
+ COPY --from=builder --chown=app:app /app/dist ./dist
+ COPY --from=builder --chown=app:app /app/node_modules ./node_modules
+ COPY --from=builder --chown=app:app /app/package.json ./
+ USER app
+ EXPOSE 3000
+ HEALTHCHECK --interval=30s --timeout=3s CMD curl -f http://localhost:3000/health || exit 1
+ CMD ["node", "dist/index.js"]
+ ```
+
+ ### Docker Compose
+ - Use profiles for optional services (dev tools, debug containers)
+ - Environment-specific overrides (`docker-compose.override.yml`)
+ - Named volumes for persistent data, tmpfs for ephemeral
+ - Depends_on with healthcheck conditions (not just service start)
+ - Resource limits (CPU, memory) even in development
+
+ ## Infrastructure as Code
+
+ ### Terraform
+ - Use modules for reusable infrastructure patterns
+ - Remote state backend (S3 + DynamoDB, GCS, Terraform Cloud)
+ - State locking to prevent concurrent modifications
+ - Plan before apply (`terraform plan` → review → `terraform apply`)
+ - Pin provider versions in `required_providers`
+ - Use `terraform fmt` and `terraform validate` in CI
+
+ ### Pulumi
+ - Type-safe infrastructure in TypeScript, Python, Go, or .NET
+ - Use stack references for cross-stack dependencies
+ - Store secrets with `pulumi config set --secret`
+ - Preview before up (`pulumi preview` → review → `pulumi up`)
+
+ ### AWS CDK / CloudFormation
+ - Use constructs (L2/L3) over raw resources (L1)
+ - Stack organization: networking, compute, data, monitoring
+ - Use CDK nag for compliance checking
+ - Tag all resources for cost tracking
+
+ ## Deployment Strategies
+
+ ### Zero-Downtime Deployment
+ - **Blue/Green**: Two identical environments, switch traffic after validation
+ - **Rolling update**: Gradually replace instances (Kubernetes default)
+ - **Canary release**: Route small % of traffic to new version, monitor, then promote
+ - **Feature flags**: Deploy code but control activation (LaunchDarkly, Unleash, env vars)
+
+ ### Rollback Procedures
+ - Every deployment MUST have a documented rollback path
+ - Database migrations must be backward-compatible (expand-contract pattern)
+ - Keep at least 2 previous deployment artifacts/images
+ - Automate rollback triggers based on error rate or latency thresholds
+ - Test rollback procedures periodically
+
+ ### Multi-Environment Strategy
+ - **dev** → developer sandboxes, ephemeral, auto-deployed on push
+ - **staging** → mirrors production config, deployed on merge to main
+ - **production** → deployed via promotion from staging, with approval gates
+ - Environment parity: same Docker image, same config structure, different values
+ - Use environment variables or secrets manager for environment-specific config
+
+ ## Monitoring & Observability
+
+ ### The Three Pillars
+ 1. **Logs** — Structured (JSON), centralized, with correlation IDs
+ 2. **Metrics** — RED (Rate, Errors, Duration) for services, USE (Utilization, Saturation, Errors) for resources
+ 3. **Traces** — Distributed tracing with OpenTelemetry, Jaeger, or Zipkin
+
+ ### Alerting
+ - Alert on symptoms (error rate, latency), not causes (CPU, memory)
+ - Use severity levels: page (P1), notify (P2), ticket (P3)
+ - Include runbook links in alert descriptions
+ - Set up dead-man's-switch for monitoring system health
+
+ ### Tools
+ - Prometheus + Grafana, Datadog, New Relic, CloudWatch
+ - Sentry, Bugsnag for error tracking
+ - PagerDuty, OpsGenie for on-call management
+
+ ## Cost Awareness
+
+ When reviewing infrastructure changes, flag:
+ - Oversized resource requests (10 CPU, 32GB RAM for a simple API)
+ - Missing auto-scaling (fixed capacity when load varies)
+ - Unused resources (running 24/7 for dev/staging environments)
+ - Expensive storage tiers for non-critical data
+ - Cross-region data transfer charges
+ - Missing spot/preemptible instances for batch workloads
+
+ ## Security in DevOps
+ - Secrets management: Vault, AWS Secrets Manager, GitHub Secrets — NEVER in code or CI config
+ - Container image scanning (Trivy, Snyk Container)
+ - Dependency vulnerability scanning in CI pipeline
+ - Least privilege IAM roles for CI runners and deployed services
+ - Network segmentation between environments
+ - Encryption in transit (TLS) and at rest
+ - Signed container images and verified provenance (Sigstore, Cosign)
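
Reviewer note on the agent file above: the prompt relates the verdict to the severity guide only in prose. The TypeScript sketch below shows one way an orchestrating agent might derive the verdict and the "Issues" summary line from parsed findings. The `Finding` type and the `verdictFor` and `issuesLine` helpers are hypothetical names that do not appear in the package, and the rule that any ERROR yields FAIL is an assumption read from the severity guide, not a confirmed implementation.

```typescript
// Hypothetical types: the agent file defines the report as markdown, not as code.
type Severity = "ERROR" | "WARNING" | "INFO";
type Verdict = "PASS" | "PASS WITH WARNINGS" | "FAIL";

interface Finding {
  severity: Severity;
  title: string;
  file: string;
}

// Assumed rule: any ERROR fails the review; otherwise any WARNING downgrades a pass.
function verdictFor(findings: Finding[]): Verdict {
  if (findings.some((f) => f.severity === "ERROR")) return "FAIL";
  if (findings.some((f) => f.severity === "WARNING")) return "PASS WITH WARNINGS";
  return "PASS";
}

// Render the "Issues" line in the shape shown in the report template.
function issuesLine(findings: Finding[]): string {
  const count = (s: Severity): number =>
    findings.filter((f) => f.severity === s).length;
  return (
    `- **Issues**: ${findings.length} ` +
    `(ERROR: ${count("ERROR")}, WARNING: ${count("WARNING")}, INFO: ${count("INFO")})`
  );
}
```

Under this reading, only a FAIL verdict would block finalization, while PASS WITH WARNINGS would carry its findings into the PR body.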
package/README.md CHANGED
@@ -43,7 +43,9 @@ npx cortex-agents configure # Pick your models interactively
  # Restart OpenCode - done.
  ```
 
- That's it. Your OpenCode session now has 7 specialized agents, 23 tools, and 14 domain skills.
+ That's it. Your OpenCode session now has 8 specialized agents, 32 tools, and 14 domain skills.
+
+ > **Built-in Agent Replacement** — When installed, cortex-agents automatically disables OpenCode's native `build` and `plan` agents (replaced by `implement` and `architect`). The `architect` agent becomes the default, promoting a planning-first workflow. Native agents are fully restored on `uninstall`.
 
  <br>
 
@@ -56,10 +58,10 @@ Cortex agents follow a structured workflow from planning through to PR:
  ```
  You: "Add user authentication"
 
- Plan Agent reads codebase, creates plan with mermaid diagrams
- saves to .cortex/plans/ "Plan saved. Switch to Build?"
+ Architect Agent reads codebase, creates plan with mermaid diagrams
+ saves to .cortex/plans/ "Plan saved. Switch to Implement?"
 
- Build Agent loads plan, checks git status
+ Implement Agent loads plan, checks git status
  "You're on main. Create a branch two-step prompt: strategy -> execution
  or worktree?"
  creates feature/user-auth implements following the plan
@@ -72,13 +74,19 @@ Create isolated development environments and launch them instantly:
 
  | Mode | What Happens |
  |------|-------------|
+ | **IDE Terminal** | Opens in your detected IDE (VS Code, Cursor, Windsurf, Zed) with integrated terminal |
  | **New Terminal** | Opens a new terminal tab with OpenCode pre-configured in the worktree |
  | **In-App PTY** | Spawns an embedded terminal inside your current OpenCode session |
  | **Background** | AI implements headlessly while you keep working - toast notifications on completion |
 
  Plans are automatically propagated into the worktree's `.cortex/plans/` so the new session has full context.
 
- **Cross-platform terminal support** via the terminal driver system — automatically detects and integrates with tmux, iTerm2, Terminal.app, kitty, wezterm, Konsole, and GNOME Terminal. Tabs opened by the launcher are tracked and automatically closed when the worktree is removed.
+ **IDE-Aware Launch Options** The launcher detects your development environment and offers contextual options:
+ - **VS Code / Cursor / Windsurf / Zed**: "Open in [IDE] (Recommended)" as the first option
+ - **JetBrains IDEs**: Terminal tab with manual IDE opening instructions
+ - **Terminal only**: Standard terminal tab options
+
+ **Cross-platform terminal support** via the terminal driver system — automatically detects and integrates with VS Code, Cursor, Windsurf, Zed, JetBrains IDEs, tmux, iTerm2, Terminal.app, kitty, wezterm, Konsole, and GNOME Terminal. Tabs opened by the launcher are tracked and automatically closed when the worktree is removed.
 
  ### Task Finalizer
 
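
Reviewer note on the hunk above: IDE-aware launch options imply some form of environment probing. The sketch below is an assumption about how such detection could work, based on environment variables that these terminals are known to set (`TERM_PROGRAM`, `TMUX`, `KITTY_WINDOW_ID`). The `detectEnvironment` function and its return labels are hypothetical; they are not taken from `dist/tools/environment.js`, and the package's real detection may cover more cases or use different signals.

```typescript
// Hypothetical sketch: classify the host terminal from environment variables.
type Env = Record<string, string | undefined>;
type Detected = "vscode" | "tmux" | "iterm2" | "terminal.app" | "kitty" | "unknown";

function detectEnvironment(env: Env): Detected {
  // Check the IDE first, mirroring the README's "Open in [IDE] (Recommended)" ordering.
  if (env["TERM_PROGRAM"] === "vscode") return "vscode";
  // tmux exports TMUX inside a session.
  if (env["TMUX"]) return "tmux";
  if (env["TERM_PROGRAM"] === "iTerm.app") return "iterm2";
  if (env["TERM_PROGRAM"] === "Apple_Terminal") return "terminal.app";
  // kitty exports KITTY_WINDOW_ID for each window.
  if (env["KITTY_WINDOW_ID"]) return "kitty";
  return "unknown";
}
```

In a Node process this could be called as `detectEnvironment(process.env)`; taking the environment as a parameter keeps the function easy to test.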
@@ -116,28 +124,39 @@ Handle complex, multi-step work. Use your best model.
 
  | Agent | Role | Superpower |
  |-------|------|-----------|
- | **build** | Full-access development | Two-step branching strategy, worktree launcher, task finalizer, docs prompting |
- | **plan** | Read-only analysis | Creates implementation plans with mermaid diagrams, hands off to build |
- | **debug** | Deep troubleshooting | Full bash/edit access with hotfix workflow |
+ | **implement** | Full-access development | Skill-aware implementation, worktree launcher, quality gates, task finalizer |
+ | **architect** | Read-only analysis | Architectural plans with mermaid diagrams, NFR analysis, hands off to implement |
+ | **fix** | Deep troubleshooting | Performance debugging, distributed tracing, hotfix workflow |
+ | **audit** | Code quality assessment | Tech debt scoring, pattern review, refactoring advisor (read-only) |
 
  ### Subagents
 
- Focused specialists launched **automatically** as parallel quality gates. Use a fast/cheap model.
+ Focused specialists launched **automatically** as parallel quality gates. Each auto-loads its core domain skill for deeper analysis. Use a fast/cheap model.
 
- | Agent | Role | Triggered By |
- |-------|------|-------------|
- | **@testing** | Writes tests, runs suite, reports coverage gaps | Build (always), Debug (always) |
- | **@security** | OWASP audit, secrets scan, severity-rated findings | Build (always), Debug (if security-relevant) |
- | **@fullstack** | End-to-end implementation + feasibility analysis | Build (multi-layer features), Plan (analysis) |
- | **@devops** | Config validation, CI/CD best practices | Build (when CI/Docker/infra files change) |
+ | Agent | Role | Auto-Loads Skill | Triggered By |
+ |-------|------|-----------------|-------------|
+ | **@qa** | Writes tests, runs suite, reports coverage | `testing-strategies` | Implement (always), Fix (always) |
+ | **@guard** | OWASP audit, secrets scan, code-level fix patches | `security-hardening` | Implement (always), Fix (if security-relevant) |
+ | **@crosslayer** | Cross-layer implementation + feasibility analysis | Per-layer skills | Implement (multi-layer features), Architect (analysis) |
+ | **@ship** | CI/CD validation, IaC review, deployment strategy | `deployment-automation` | Implement (when CI/Docker/infra files change) |
 
  Subagents return **structured reports** with severity levels (`BLOCKING`, `CRITICAL`, `HIGH`, `MEDIUM`, `LOW`) that the orchestrating agent uses to decide whether to proceed or fix issues first.
 
+ ### Skill Routing
+
+ All agents detect the project's technology stack and **automatically load relevant skills** before working. This turns the 14 domain skills from passive knowledge into active intelligence:
+
+ ```
+ Implement Agent detects: package.json has React + Express + Prisma
+ → auto-loads: frontend-development, backend-development, database-design, api-design
+ → implements with deep framework-specific knowledge
+ ```
+
  <br>
 
  ## Tools
 
- 23 tools bundled and auto-registered. No configuration needed.
+ 32 tools bundled and auto-registered. No configuration needed.
 
  <table>
  <tr><td width="50%">
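
Reviewer note on the Skill Routing addition in the hunk above: the README gives a single example (React + Express + Prisma) and does not say how dependencies map to skills. The sketch below is one possible reading. The `SKILLS_BY_DEPENDENCY` table and the `skillsFor` function are hypothetical, and the assignment of `api-design` to Express is a guess made to reproduce the README's example; only the four skill names come from the source.

```typescript
// Hypothetical mapping: the README shows the resulting skills, not the routing rules.
const SKILLS_BY_DEPENDENCY: Record<string, string[]> = {
  react: ["frontend-development"],
  express: ["backend-development", "api-design"],
  "@prisma/client": ["database-design"],
};

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Collect the skills implied by a package.json, without duplicates,
// in the order the dependencies are declared.
function skillsFor(pkg: PackageJson): string[] {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const skills = new Set<string>();
  for (const name of Object.keys(deps)) {
    for (const skill of SKILLS_BY_DEPENDENCY[name] ?? []) {
      skills.add(skill);
    }
  }
  return Array.from(skills);
}
```

Keeping the mapping in a data table rather than in branching code would let new frameworks be added without touching the lookup logic.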
@@ -156,6 +175,7 @@ Subagents return **structured reports** with severity levels (`BLOCKING`, `CRITI
  - `plan_save` / `plan_load` / `plan_list` / `plan_delete`
  - `session_save` / `session_list` / `session_load`
  - `cortex_init` / `cortex_status` / `cortex_configure`
+ - `detect_environment` - Detect IDE/terminal for contextual launch options
 
  </td></tr>
  <tr><td width="50%">
@@ -172,9 +192,31 @@ Subagents return **structured reports** with severity levels (`BLOCKING`, `CRITI
  - `task_finalize` - Stage, commit, push, create PR
  - Auto-detects worktree (targets main)
  - Auto-populates PR from `.cortex/plans/`
+ - Auto-links issues via `Closes #N` from plan metadata
  - Warns if docs are missing
  - `cortex_configure` - Set models from within an agent session
 
+ </td></tr>
+ <tr><td colspan="2">
+
+ **GitHub Integration**
+ - `github_status` - Check `gh` CLI availability, authentication, and detect GitHub Projects
+ - `github_issues` - List/filter repo issues by state, labels, milestone, assignee
+ - `github_projects` - List GitHub Project boards and their work items
+
+ The architect agent uses these tools to browse your backlog and seed plans from real GitHub issues. Issue numbers are stored in plan frontmatter (`issues: [42, 51]`) and automatically appended as `Closes #N` to the PR body when `task_finalize` runs — GitHub auto-closes the issues when the PR merges. Supports both github.com and GitHub Enterprise Server URLs.
+
+ </td></tr>
+ <tr><td colspan="2">
+
+ **REPL Loop** (Iterative Task-by-Task Implementation)
+ - `repl_init` - Initialize a loop from a plan (parses tasks, auto-detects build/test commands)
+ - `repl_status` - Get current progress, active task, retry counts (auto-advances to next task)
+ - `repl_report` - Report task outcome (`pass`/`fail`/`skip`) with auto-retry and escalation
+ - `repl_summary` - Generate markdown summary table for PR body
+
+ The implement agent uses these tools to work through plan tasks one at a time, running build+test verification after each task. Failed tasks are automatically retried (up to a configurable limit) before escalating to the user. State is persisted to `.cortex/repl-state.json` so progress survives context compaction and session restarts.
+
  </td></tr>
  </table>
 
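
Reviewer note on the GitHub Integration text in the hunk above: the README describes issue numbers stored in plan frontmatter (`issues: [42, 51]`) and appended to the PR body as `Closes #N`. The sketch below illustrates that flow under stated assumptions. `extractIssues` and `appendClosingRefs` are hypothetical helpers, not the code in `dist/utils/plan-extract.js`, and the sketch assumes the frontmatter is a leading `---` block with `issues` written as an inline list.

```typescript
// Hypothetical helper: read `issues: [42, 51]` from a plan's leading frontmatter block.
function extractIssues(plan: string): number[] {
  const frontmatter = plan.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!frontmatter) return [];
  const line = frontmatter[1].match(/^issues:\s*\[([^\]]*)\]/m);
  if (!line) return [];
  return line[1]
    .split(",")
    .map((part) => Number(part.trim()))
    .filter((n) => Number.isInteger(n) && n > 0);
}

// Hypothetical helper: append one `Closes #N` line per issue to a PR body.
function appendClosingRefs(body: string, issues: number[]): string {
  if (issues.length === 0) return body;
  const refs = issues.map((n) => `Closes #${n}`).join("\n");
  return `${body.replace(/\s+$/, "")}\n\n${refs}\n`;
}
```

GitHub treats `Closes #N` in a pull request description as a closing keyword, which is what lets the linked issues close when the PR merges.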
@@ -262,8 +304,9 @@ Per-project config takes priority. Team members get the same model settings when
  your-project/
  .cortex/ Project context (auto-initialized)
  config.json Configuration
- plans/ Implementation plans (git tracked)
+ plans/ Implementation plans (gitignored)
  sessions/ Session summaries (gitignored)
+ repl-state.json REPL loop progress (gitignored, auto-managed)
  .opencode/
  models.json Per-project model config (git tracked)
  .worktrees/ Git worktrees (gitignored)
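
Reviewer note on the new `repl-state.json` entry in the hunk above: the README says failed tasks are retried up to a configurable limit before escalating to the user, and that the state survives restarts. The sketch below models that behavior as a small state transition. The `ReplState` shape and the `report` function are hypothetical and do not reflect the real file schema, and treating the limit as a count of total attempts is an assumption based on the README's "attempt 1/3" wording.

```typescript
// Hypothetical state shape: plain data, so it round-trips through JSON.
type Outcome = "pass" | "fail" | "skip";
type TaskStatus = "pending" | "passed" | "failed" | "skipped";

interface Task {
  description: string;
  status: TaskStatus;
  attempts: number;
}

interface ReplState {
  tasks: Task[];
  current: number;    // index of the active task
  maxRetries: number; // assumed to mean total attempts allowed per task
}

type Next = "next" | "retry" | "escalate" | "done";

// Record an outcome for the active task and say what should happen next.
function report(state: ReplState, outcome: Outcome): Next {
  const task = state.tasks[state.current];
  if (!task) return "done";
  if (outcome === "fail") {
    task.attempts += 1;
    if (task.attempts < state.maxRetries) return "retry";
    task.status = "failed";
    return "escalate"; // out of attempts: ask the user how to proceed
  }
  if (outcome === "pass") task.attempts += 1;
  task.status = outcome === "pass" ? "passed" : "skipped";
  state.current += 1;
  return state.current < state.tasks.length ? "next" : "done";
}
```

Because the state holds only strings and numbers, persisting it could be as simple as writing `JSON.stringify(state)` to the file after each report.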
@@ -294,9 +337,9 @@ npx cortex-agents status # Show installation and model sta
 
  ## How It Works
 
- ### The Build Agent Workflow
+ ### The Implement Agent Workflow
 
- Every time the build agent starts, it follows a structured pre-implementation checklist:
+ Every time the implement agent starts, it follows a structured pre-implementation checklist:
 
  ```
  Step 1 branch_status Am I on a protected branch?
@@ -305,8 +348,14 @@ Step 3 plan_list / plan_load Is there a plan for this work?
  Step 4 Ask: strategy Worktree (recommended) or branch?
  Step 4b Ask: launch mode Terminal tab (recommended) / stay / PTY / background?
  Step 5 Execute Create worktree/branch, auto-detect terminal
- Step 6 Implement Write code following the plan
- Step 7 Quality Gate Launch @testing + @security in parallel
+ Step 6 REPL Loop If plan loaded: repl_init → iterate tasks one-by-one
+ 6a repl_init Parse plan tasks, auto-detect build/test commands
+ 6b repl_status Get current task, auto-advance from pending
+ 6c Implement task Write code for the current task only
+ 6d Build + test Run detected build/test commands
+ 6e repl_report Report pass/fail/skip → auto-advance or retry
+ 6f Repeat 6b-6e Until all tasks done or user intervenes
+ Step 7 Quality Gate Launch @qa + @guard in parallel (includes repl_summary)
  Step 8 Ask: documentation Decision doc / feature doc / flow doc?
  Step 9 session_save Record what was done and why
  Step 10 task_finalize Commit, push, create PR
@@ -317,44 +366,83 @@ This isn't just documentation - it's enforced by the agent's instructions. The A
 
  ### Sub-Agent Quality Gates
 
- After implementation (Step 7), the build agent **automatically** launches sub-agents in parallel as quality gates:
+ After implementation (Step 7), the implement agent **automatically** launches sub-agents in parallel as quality gates:
 
  ```
- Build Agent completes implementation
+ Implement Agent completes implementation
  |
  +-- launches in parallel (single message) --+
  | |
  v v
- @testing @security
+ @qa @guard
  Writes unit tests OWASP audit
  Runs test suite Secrets scan
  Reports coverage Severity ratings
  Returns: PASS/FAIL Returns: PASS/FAIL
  | |
- +-------- results reviewed by Build ---------+
+ +------ results reviewed by Implement ------+
  |
  v
  Quality Gate Summary included in PR body
  ```
 
- The debug agent uses the same pattern: `@testing` for regression tests (always) and `@security` when the fix touches sensitive code.
+ The fix agent uses the same pattern: `@qa` for regression tests (always) and `@guard` when the fix touches sensitive code.
 
  Sub-agents use **structured return contracts** so results are actionable:
  - `BLOCKING` / `CRITICAL` / `HIGH` findings block finalization
  - `MEDIUM` findings are noted in the PR body
  - `LOW` findings are deferred
 
+ ### REPL Loop (Iterative Implementation)
+
+ When a plan is loaded, the implement agent activates a **Read-Eval-Print Loop** that works through tasks one at a time with build+test verification after each:
+
+ ```
+ repl_init("my-plan.md")
+ → Parses plan tasks (- [ ] checkboxes)
+ → Auto-detects: npm run build, npx vitest run (vitest)
+ → Creates .cortex/repl-state.json
+
+ Loop:
+ repl_status → "Task #1: Implement user model"
+ [agent implements task]
+ [agent runs build + tests]
+ repl_report(pass, "42 tests pass") → "✓ Task #1 PASSED (1st attempt)"
+ → "→ Next: Task #2"
+
+ repl_status → "Task #2: Add API endpoints"
+ [agent implements task]
+ [agent runs build + tests]
+ repl_report(fail, "POST /users 500") → "⚠ Task #2 FAILED (attempt 1/3)"
+ → "Fix and retry. 2 retries remaining."
+ [agent fixes the issue]
+ [agent runs build + tests]
+ repl_report(pass, "All green") → "✓ Task #2 PASSED (2nd attempt)"
+ → "→ Next: Task #3"
+ ...
+
+ repl_summary → Markdown table for PR body
+ ```
+
+ **Key behaviors:**
+ - **Opt-in**: Only activates when a plan is loaded. No-plan sessions use the standard linear workflow.
+ - **Auto-detection**: Scans `package.json`, `Cargo.toml`, `go.mod`, `pyproject.toml`, `Makefile`, `mix.exs` for build/test/lint commands.
+ - **Retry with escalation**: Failed tasks retry up to `maxRetries` (default: 3) before asking the user how to proceed.
+ - **Persistent state**: Progress saved to `.cortex/repl-state.json` — survives context compaction, session restarts, and agent switches.
+ - **Skip support**: Tasks can be skipped with a reason, which is tracked in the summary.
+
  ### Agent Handover
 
  When agents switch, a toast notification tells you what mode you're in:
 
  ```
- Agent: build Development mode - ready to implement
- Agent: plan Planning mode - read-only analysis
- Agent: debug Debug mode - troubleshooting and fixes
+ Agent: implement Development mode - ready to implement
+ Agent: architect Planning mode - read-only analysis
+ Agent: fix Debug mode - troubleshooting and fixes
+ Agent: audit Review mode - code quality assessment
  ```
 
- The Plan agent creates plans with mermaid diagrams and hands off to Build. Build loads the plan and implements it. If something breaks, Debug takes over with full access.
+ The Architect agent creates plans with mermaid diagrams and hands off to Implement. Implement loads the plan, detects the tech stack, loads relevant skills, and implements. If something breaks, Fix takes over with performance debugging tools. Audit provides code quality assessment and tech debt analysis on demand.
 
  <br>
 
@@ -363,7 +451,7 @@ The Plan agent creates plans with mermaid diagrams and hands off to Build. Build
  - [OpenCode](https://opencode.ai) >= 1.0.0
  - Node.js >= 18.0.0
  - Git (for branch/worktree features)
- - [GitHub CLI](https://cli.github.com/) (optional, for `task_finalize` PR creation)
+ - [GitHub CLI](https://cli.github.com/) (optional, for `task_finalize` PR creation and `github_*` tools)
 
  <br>