opencastle 0.30.1 → 0.31.0

Files changed (36)
  1. package/dist/cli/pipeline.d.ts +2 -1
  2. package/dist/cli/pipeline.d.ts.map +1 -1
  3. package/dist/cli/pipeline.js +120 -59
  4. package/dist/cli/pipeline.js.map +1 -1
  5. package/dist/cli/pipeline.test.js +82 -143
  6. package/dist/cli/pipeline.test.js.map +1 -1
  7. package/dist/cli/plan.d.ts +1 -1
  8. package/dist/cli/plan.d.ts.map +1 -1
  9. package/dist/cli/plan.js +43 -10
  10. package/dist/cli/plan.js.map +1 -1
  11. package/dist/cli/run/adapters/copilot.d.ts +1 -0
  12. package/dist/cli/run/adapters/copilot.d.ts.map +1 -1
  13. package/dist/cli/run/adapters/copilot.js +11 -1
  14. package/dist/cli/run/adapters/copilot.js.map +1 -1
  15. package/dist/cli/run/adapters/index.d.ts +5 -0
  16. package/dist/cli/run/adapters/index.d.ts.map +1 -1
  17. package/dist/cli/run/adapters/index.js +13 -0
  18. package/dist/cli/run/adapters/index.js.map +1 -1
  19. package/dist/cli/run.d.ts.map +1 -1
  20. package/dist/cli/run.js +62 -9
  21. package/dist/cli/run.js.map +1 -1
  22. package/dist/cli/types.d.ts +2 -0
  23. package/dist/cli/types.d.ts.map +1 -1
  24. package/package.json +1 -1
  25. package/src/cli/pipeline.test.ts +82 -140
  26. package/src/cli/pipeline.ts +142 -61
  27. package/src/cli/plan.ts +47 -11
  28. package/src/cli/run/adapters/copilot.ts +11 -1
  29. package/src/cli/run/adapters/index.ts +13 -0
  30. package/src/cli/run.ts +60 -9
  31. package/src/cli/types.ts +2 -0
  32. package/src/dashboard/node_modules/.vite/deps/_metadata.json +6 -6
  33. package/src/orchestrator/prompts/assess-complexity.prompt.md +67 -0
  34. package/src/orchestrator/prompts/generate-convoy.prompt.md +14 -24
  35. package/src/orchestrator/prompts/generate-prd.prompt.md +20 -34
  36. package/src/orchestrator/prompts/validate-prd.prompt.md +1 -1
@@ -0,0 +1,67 @@
1
+ ---
2
+ description: 'Assess PRD complexity and recommend convoy strategy (single vs chain). Returns structured JSON consumed by the pipeline.'
3
+ agent: 'Reviewer'
4
+ output: json
5
+ ---
6
+
7
+ <!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .opencastle/ directory instead. -->
8
+
9
+ # Assess PRD Complexity
10
+
11
+ Analyze the PRD below and produce a complexity assessment as a **single JSON object**. This JSON is consumed programmatically by the pipeline to decide whether to generate one convoy spec or a chain of convoy specs.
12
+
13
+ ## PRD to Analyze
14
+
15
+ {{goal}}
16
+
17
+ ## Original User Prompt
18
+
19
+ {{context}}
20
+
21
+ ---
22
+
23
+ ## Output Rules
24
+
25
+ **CRITICAL:** Return ONLY a single fenced JSON block — no prose, no explanation, no markdown headings. Start your response with the opening fence and end with the closing fence.
26
+
27
+ ## Required JSON Schema
28
+
29
+ ```json
30
+ {
31
+ "original_prompt": "<string>",
32
+ "total_tasks": <number>,
33
+ "total_phases": <number>,
34
+ "domains": ["<string>", ...],
35
+ "estimated_duration_minutes": <number>,
36
+ "complexity": "low" | "medium" | "high",
37
+ "recommended_strategy": "single" | "chain",
38
+ "chain_rationale": "<string — empty when strategy is single>",
39
+ "convoy_groups": [
40
+ {
41
+ "name": "<kebab-case-name>",
42
+ "description": "<one sentence>",
43
+ "phases": [<phase numbers>],
44
+ "depends_on": ["<group name>", ...]
45
+ }
46
+ ]
47
+ }
48
+ ```
49
+
50
+ ## Field Rules
51
+
52
+ - `original_prompt`: Copy the user's original feature request verbatim from the "Original User Prompt" section above. If that section is empty, extract a one-sentence summary from the PRD's Overview section.
53
+ - `total_tasks`: Count of individual workstreams in the Task Breakdown.
54
+ - `total_phases`: Count of phases in the Task Breakdown.
55
+ - `domains`: List of technical domains involved (e.g., "frontend", "api", "database", "testing", "config").
56
+ - `estimated_duration_minutes`: Rough estimate assuming AI agent execution (not human).
57
+ - `complexity`: `"low"` (1–4 tasks), `"medium"` (5–8 tasks), `"high"` (9+ tasks).
58
+ - `recommended_strategy`:
59
+ - `"single"` when: total tasks ≤ 8, OR total phases ≤ 3, OR all tasks are tightly coupled with heavy cross-phase file sharing.
60
+ - `"chain"` when: total tasks > 8 AND total phases > 3 AND domains have natural boundaries — AND splitting improves failure isolation, observability, or retry granularity.
61
+ - `chain_rationale`: Only filled when `recommended_strategy` is `"chain"` — explain WHY splitting benefits this specific feature.
62
+ - `convoy_groups`:
63
+ - When `"single"`: exactly one group covering all phases.
64
+ - When `"chain"`: 2–4 groups with explicit `depends_on` order. Each group covers a coherent domain boundary.
65
+ - **Minimum 3 tasks per group.** Never create a group that would produce a convoy with only 1–2 tasks — merge small groups with adjacent ones. A convoy with a single task is pointless overhead.
66
+ - **Do NOT map phases 1:1 to groups.** Groups should bundle multiple related phases when tasks are tightly coupled (e.g., config + data in one group, components + pages in another). Only split at genuine domain boundaries where failure isolation matters.
67
+ - Maximum 3 groups for projects with ≤ 15 tasks. Maximum 4 groups for 16+ tasks.
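
The field rules above can be checked mechanically before the pipeline acts on an assessment. The TypeScript below is a minimal sketch under stated assumptions, not the code in `pipeline.ts`: the names `ComplexityAssessment`, `complexityFor`, and `checkAssessment` are hypothetical, and only rules that the JSON itself can prove are covered. The minimum-tasks-per-group rule is left out because the schema carries no per-group task count.

```typescript
// Hypothetical types mirroring the JSON schema in the prompt above.
type Complexity = "low" | "medium" | "high";
type Strategy = "single" | "chain";

interface ConvoyGroup {
  name: string;
  description: string;
  phases: number[];
  depends_on: string[];
}

interface ComplexityAssessment {
  original_prompt: string;
  total_tasks: number;
  total_phases: number;
  domains: string[];
  estimated_duration_minutes: number;
  complexity: Complexity;
  recommended_strategy: Strategy;
  chain_rationale: string;
  convoy_groups: ConvoyGroup[];
}

// Task-count thresholds from the `complexity` field rule.
function complexityFor(totalTasks: number): Complexity {
  if (totalTasks <= 4) return "low";
  if (totalTasks <= 8) return "medium";
  return "high";
}

// Returns a list of rule violations; an empty list means the object is consistent.
function checkAssessment(a: ComplexityAssessment): string[] {
  const errors: string[] = [];
  const groups = a.convoy_groups;

  if (a.complexity !== complexityFor(a.total_tasks)) {
    errors.push(`complexity should be "${complexityFor(a.total_tasks)}"`);
  }
  if (a.recommended_strategy === "single") {
    if (groups.length !== 1) errors.push("single strategy needs exactly one group");
    if (a.chain_rationale !== "") errors.push("chain_rationale must be empty for single");
  } else {
    // Chain is only allowed with more than 8 tasks and more than 3 phases.
    if (a.total_tasks <= 8 || a.total_phases <= 3) {
      errors.push("chain requires more than 8 tasks and more than 3 phases");
    }
    const maxGroups = a.total_tasks <= 15 ? 3 : 4;
    if (groups.length < 2 || groups.length > maxGroups) {
      errors.push(`chain strategy needs 2 to ${maxGroups} groups`);
    }
    if (a.chain_rationale.trim() === "") errors.push("chain_rationale is required for chain");
  }
  // Every depends_on entry must name a different group in the same assessment.
  const names = groups.map((g) => g.name);
  for (const g of groups) {
    for (const dep of g.depends_on) {
      if (names.indexOf(dep) === -1 || dep === g.name) {
        errors.push(`group "${g.name}" has invalid dependency "${dep}"`);
      }
    }
  }
  return errors;
}
```

Run against the example object that this release removes from `generate-prd.prompt.md` (12 tasks with `"complexity": "low"`), the check reports a complexity mismatch, because 12 tasks falls in the `"high"` band.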
@@ -9,11 +9,13 @@ agent: 'Team Lead (OpenCastle)'
9
9
 
10
10
  You are the Team Lead. The user wants to run `opencastle run` to execute a batch of tasks autonomously via the convoy engine. Your job is to produce a valid `.convoy.yml` file they can feed to the CLI. Derive a short, descriptive, kebab-case filename from the user's goal (2–4 words max) and use it as the filename — for example `auth-refactor.convoy.yml` or `add-search.convoy.yml`. Always use the `.convoy.yml` extension. Store all generated convoy specs in the `.opencastle/convoys/` directory (create it if it doesn't exist).
11
11
 
12
+ > **⚠️ OUTPUT FORMAT: Your entire response must be a single ` ```yaml ` fenced code block containing the convoy spec. Do NOT output any text, explanations, summaries, or DAG diagrams before or after the YAML block. The parser only reads the ` ```yaml ` fence — everything else causes a failure.**
13
+
12
14
  ## User Goal
13
15
 
14
16
  {{goal}}
15
17
 
16
- ## Additional Context
18
+ ## PRD Reference
17
19
 
18
20
  {{context}}
19
21
 
@@ -311,24 +313,17 @@ For complex tasks, consider using `steps` to break the prompt into sequential su
311
313
 
312
314
  ### Chain Mode (Subset Generation)
313
315
 
314
- When the `{{context}}` field contains a JSON object with `"mode": "chain_subset"`, you are generating ONE convoy spec that is part of a larger convoy chain. The context will look like:
315
-
316
- ```json
317
- {
318
- "mode": "chain_subset",
319
- "group_name": "database-setup",
320
- "group_description": "Schema changes and migrations",
321
- "group_phases": [1],
322
- "depends_on_groups": [],
323
- "total_groups": 3,
324
- "group_index": 1
325
- }
326
- ```
327
-
328
- When this context is present:
329
- - **Only** generate tasks for the phases listed in `group_phases`. Do not include tasks from other phases.
316
+ When the `{{goal}}` section contains a "Convoy Group Scope" heading, you are generating ONE convoy spec that is part of a larger convoy chain. The goal will contain:
317
+
318
+ - The original user prompt
319
+ - The group name, description, phases to cover, and dependency info
320
+
321
+ The full PRD is available in the `{{context}}` section as reference.
322
+
323
+ When chain mode is detected:
324
+ - **Only** generate tasks for the phases listed in the group scope. Do not include tasks from other phases.
330
325
  - Use `version: 1` — this spec is a single convoy, not a pipeline.
331
- - Derive the convoy `name` from `group_name` (e.g., "Database Setup").
326
+ - Derive the convoy `name` from the group name (e.g., "Database Setup").
332
327
  - Derive the `branch` from the PRD's feature name, but it will be overridden by the pipeline anyway.
333
328
  - Keep all other conventions (prompts, files, gates, etc.) the same as for single-spec generation.
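
The change in how chain mode is signalled can be contrasted in code. This is an illustrative sketch only: `detectChainMode` and `ChainSignal` are hypothetical names, and the heading level of "Convoy Group Scope" is an assumption, since the prompt names the heading but not its depth.

```typescript
// Hypothetical helper contrasting the 0.30.1 and 0.31.0 chain-mode signals.
type ChainSignal = "heading" | "legacy-json" | "none";

function detectChainMode(goal: string, context: string): ChainSignal {
  // 0.31.0: a Markdown heading named "Convoy Group Scope" inside the goal text.
  // Any heading level (# to ######) is accepted here; that is an assumption.
  if (/^#{1,6}[ \t]+Convoy Group Scope[ \t]*$/m.test(goal)) return "heading";

  // 0.30.1: the context was a JSON object with "mode": "chain_subset".
  try {
    const parsed: unknown = JSON.parse(context);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { mode?: unknown }).mode === "chain_subset"
    ) {
      return "legacy-json";
    }
  } catch (err) {
    // Context is not JSON (for example, a full PRD in Markdown); fall through.
  }
  return "none";
}
```

With the new scheme the context slot is free to carry the full PRD, which is why the section heading changes from "Additional Context" to "PRD Reference".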
334
329
 
@@ -346,7 +341,7 @@ Before presenting the YAML, mentally verify:
346
341
 
347
342
  ### 7. Output
348
343
 
349
- Return the final YAML inside a fenced code block with a filename annotation:
344
+ Your response must contain **ONLY** a single ` ```yaml ` fenced code block: no text before it, no text after it, no explanations, no summaries, no DAG diagrams. The pipeline parser will only extract content from the ` ```yaml ` fence. Any other text in your response is discarded and may cause parsing failures.
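
To make the output contract concrete, here is a rough sketch of what fence extraction and a strict "fence only" check could look like. Both function names are hypothetical, and the actual parser in `pipeline.ts` is not shown in this section, so its behaviour may differ.

```typescript
// Hypothetical extractor: returns the body of the first yaml fence, or null.
function extractYamlFence(response: string): string | null {
  const match = /^```yaml[ \t]*\r?\n([\s\S]*?)\r?\n```[ \t]*$/m.exec(response);
  return match ? match[1] : null;
}

// Stricter check matching the prompt's wording: nothing outside the fence.
function isOnlyYamlFence(response: string): boolean {
  const lines = response.trim().split(/\r?\n/);
  if (lines.length < 2) return false;
  const opens = lines[0].trim() === "```yaml";
  const closes = lines[lines.length - 1].trim() === "```";
  // No other fence line may appear between the opening and closing fence.
  const inner = lines.slice(1, -1);
  return opens && closes && inner.every((l) => l.trim().indexOf("```") !== 0);
}
```

A lenient extractor tolerates surrounding prose, while the strict check rejects it. The prompt asks the model to satisfy the strict form so that either kind of parser succeeds.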
350
345
 
351
346
  ````yaml
352
347
  # .opencastle/convoys/<feature-name>.convoy.yml
@@ -393,9 +388,4 @@ gates:
393
388
  gate_retries: 1
394
389
  ````
395
390
 
396
- Also provide:
397
- 1. A **DAG summary** showing the phase structure so the user can verify execution order.
398
- 2. An **estimated total duration** (sum of timeouts on the critical path).
399
- 3. A `--dry-run` command they can use to validate: `npx opencastle run -f .opencastle/convoys/<feature-name>.convoy.yml --dry-run`
400
-
401
391
 
@@ -29,9 +29,13 @@ If the feature request involves a specific person, place, organization, topic, o
29
29
  > ℹ️ Content based on training data — verify before launch.
30
30
  3. **Never fabricate or hallucinate content.** If you genuinely have no knowledge about a real-world subject and cannot search, state what is unknown and use placeholder text. This applies to all content: bios, descriptions, histories, statistics, quotes, and any factual claims.
31
31
 
32
+ ## Output Rules
33
+
34
+ **CRITICAL:** Return the PRD as your text response. Do NOT create any files. Do NOT use file-writing tools. Simply output the full PRD document as text. Do not wrap it in a code fence — start directly with the `#` heading. Do not summarize — output the complete document.
35
+
32
36
  ## Required PRD Structure
33
37
 
34
- Produce the PRD in Markdown using **exactly** the sections below. Do not skip or merge sections. Do not wrap the output in a code fence — output raw Markdown starting directly with the `#` heading.
38
+ Produce the PRD in Markdown using **exactly** the sections below. Do not skip or merge sections.
35
39
 
36
40
  ---
37
41
 
@@ -56,6 +60,12 @@ Explicit exclusions — what this work does **not** cover. If nothing is exclude
56
60
 
57
61
  For each primary scenario, write a user story + binary acceptance criteria. Criteria must be testable (pass/fail — no subjective language).
58
62
 
63
+ **Quality rules for acceptance criteria (the validator WILL reject violations):**
64
+ - Every criterion must be evaluable as deterministic pass/fail — no subjective language ("looks good", "feels responsive", "is clean", "visually distinct")
65
+ - Do NOT use modal verbs that imply optionality: "should", "might", "could", "may"
66
+ - Do NOT use vague qualifiers: "or equivalent", "or similar", "as needed"
67
+ - State exact expected values (e.g., exact heading text, exact attribute names)
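
The wording rules above lend themselves to a simple lint. The sketch below is an assumption about how such a check might be written, not the validator the prompt refers to. `lintCriterion` is a hypothetical name, and the subjective-phrase list contains only the four examples quoted in the rule, which are not exhaustive.

```typescript
// Word lists copied from the quality rules above; the function name is hypothetical.
const MODAL_VERBS = ["should", "might", "could", "may"];
const VAGUE_QUALIFIERS = ["or equivalent", "or similar", "as needed"];
const SUBJECTIVE_PHRASES = ["looks good", "feels responsive", "is clean", "visually distinct"];

function lintCriterion(criterion: string): string[] {
  const text = criterion.toLowerCase();
  const problems: string[] = [];
  for (const verb of MODAL_VERBS) {
    // Whole-word match so that "may" does not fire on "mayor".
    if (new RegExp(`\\b${verb}\\b`).test(text)) problems.push(`modal verb: ${verb}`);
  }
  for (const phrase of [...VAGUE_QUALIFIERS, ...SUBJECTIVE_PHRASES]) {
    if (text.indexOf(phrase) !== -1) problems.push(`vague or subjective phrase: ${phrase}`);
  }
  return problems;
}
```

A plain word list has false positives (the month "May" would be flagged, for instance), so a check like this is a first pass rather than a replacement for review.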
68
+
59
69
  **US-1: [Short title]**
60
70
  As a [user type], I want [action] so that [benefit].
61
71
 
@@ -76,7 +86,7 @@ Specific technical constraints the implementation must respect:
76
86
 
77
87
  ## Implementation Scope
78
88
 
79
- List **every file and directory** that will be created, modified, or deleted. Use specific paths — not broad paths like `src/`. Group by concern.
89
+ List **every file and directory** that will be created, modified, or deleted. Use specific paths — not broad paths like `src/`. Group by concern. Use compact file lists — group related files with commas instead of separate rows when they share a concern. Do NOT use glob patterns (`*`, `**`). Every concern must list at least one specific file.
80
90
 
81
91
  | Concern | Files / Directories |
82
92
  |---------|---------------------|
@@ -95,6 +105,14 @@ List **every file and directory** that will be created, modified, or deleted. Us
95
105
 
96
106
  Decompose into the minimum number of phases. Tasks in the same phase run in parallel and **must not share any files**.
97
107
 
108
+ Keep task descriptions **brief** — 1 sentence each. List only file paths, not explanations. Prefer compact formatting.
109
+
110
+ **Quality rules (the validator WILL reject violations):**
111
+ - Each workstream must list exact files it will modify
112
+ - No two parallel workstreams (same phase) may claim the same file
113
+ - Phases must have explicit dependency declarations (`depends on: Phase N`)
114
+ - No circular dependencies
115
+
98
116
  ```
99
117
  Phase 1 — Foundation (parallel, no dependencies):
100
118
  - [Workstream A title]: [2-sentence description]
@@ -127,35 +145,3 @@ Measurable, binary checks that confirm the feature is shippable:
127
145
  - **[Open question]**: [What needs to be decided before implementation can start]
128
146
 
129
147
  If there are no risks or open questions, write "None identified."
130
-
131
- ## Complexity Assessment
132
-
133
- Produce a fenced JSON block with the following fields. This is consumed programmatically by the pipeline to decide whether to generate a single convoy spec or a convoy chain.
134
-
135
- ```json
136
- {
137
- "total_tasks": 12,
138
- "total_phases": 4,
139
- "domains": ["database", "api", "frontend", "testing"],
140
- "estimated_duration_minutes": 120,
141
- "complexity": "low",
142
- "recommended_strategy": "single",
143
- "chain_rationale": "",
144
- "convoy_groups": [
145
- {
146
- "name": "full-implementation",
147
- "description": "All phases in a single convoy",
148
- "phases": [1, 2, 3, 4],
149
- "depends_on": []
150
- }
151
- ]
152
- }
153
- ```
154
-
155
- **Strategy decision rules:**
156
- - Use `"single"` when: total tasks ≤ 8, or total phases ≤ 3, or all tasks are tightly coupled with heavy cross-phase file sharing.
157
- - Use `"chain"` when: total tasks > 8 AND total phases > 3 AND domains have natural boundaries (e.g., database changes are independent from frontend components from test suites) — AND splitting would improve failure isolation, observability, or retry granularity.
158
- - When `"single"`: provide exactly one convoy group covering all phases.
159
- - When `"chain"`: provide 2–4 convoy groups with explicit `depends_on` order. Each group should cover a coherent domain boundary.
160
- - `complexity` values: `"low"` (1–4 tasks), `"medium"` (5–8 tasks), `"high"` (9+ tasks).
161
- - `chain_rationale` is only filled when `recommended_strategy` is `"chain"` — explain WHY splitting benefits this specific feature.
@@ -56,7 +56,7 @@ Evaluate **every item** below. If ALL items pass, respond `VALID`. If ANY item f
56
56
 
57
57
  ### Language Quality
58
58
 
59
- - [ ] No undefined acronyms or jargon used without explanation
59
+ - [ ] No **domain-specific** acronyms or jargon used without explanation (standard software acronyms like API, CSS, HTML, CI/CD, CMS, SDK, CLI, URL, JSON, REST, SQL, SSR, SSG, CDN, DNS, TLS, JWT, OAuth, CRUD, DOM, UI, UX, HTTP, HTTPS, LTS, WCAG, RTL, MCP, PRD, E2E are considered universally understood and do not need expansion)
60
60
  - [ ] No conflicting requirements (e.g., "must be fast AND run full suite on every change")
61
61
  - [ ] Section content is not placeholder/template text (e.g., "2–3 sentences about…", "Description here")
62
62