@papi-ai/server 0.5.3 → 0.6.0-alpha.1
- package/README.md +28 -0
- package/dist/index.js +964 -202
- package/dist/prompts.js +83 -91
- package/package.json +1 -1
package/dist/prompts.js
CHANGED
@@ -57,7 +57,7 @@ After your natural language output, include this EXACT format on its own line:
 <!-- PAPI_STRUCTURED_OUTPUT -->
 \`\`\`json
 {
-"cycleLogTitle": "string \u2014 short descriptive title WITHOUT 'Cycle N' prefix (e.g. '
+"cycleLogTitle": "string \u2014 short descriptive title WITHOUT 'Cycle N' prefix. Should capture the cycle theme in 3-5 words (e.g. 'MCP Quality + Product Readiness' not 'Cycle 5 \u2014 Board Triage \u2014 Bug Fix'). This is the canonical theme label for the cycle.",
 "cycleLogContent": "string \u2014 5-10 line cycle log body in markdown, NO heading (the ### heading is generated automatically)",
 "cycleLogCarryForward": "string or null \u2014 carry-forward items for next cycle",
 "cycleLogNotes": "string or null \u2014 1-3 lines of cycle-level observations: estimation accuracy, recurring blockers, velocity trends, dependency signals. Omit if no noteworthy observations.",
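To make the contract concrete: a minimal sketch of consuming the marker-plus-fenced-JSON format described in the hunk above. `extractStructuredOutput`, the regex, and the sample response are illustrative assumptions; the package's real parser (`parseReviewStructuredOutput` appears later in this diff) is not shown in full here.

```javascript
// Minimal sketch (illustrative names): pull the structured JSON block that
// follows the PAPI_STRUCTURED_OUTPUT marker out of a model response.
function extractStructuredOutput(text) {
  const marker = "<!-- PAPI_STRUCTURED_OUTPUT -->";
  const idx = text.indexOf(marker);
  if (idx === -1) return null;
  // First fenced json block after the marker.
  const match = text.slice(idx).match(/```json\s*([\s\S]*?)```/);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // model emitted malformed JSON
  }
}

const sample = [
  "Cycle went well; board triaged.",
  "<!-- PAPI_STRUCTURED_OUTPUT -->",
  "```json",
  '{"cycleLogTitle": "MCP Quality + Product Readiness", "cycleLogCarryForward": null}',
  "```",
].join("\n");

console.log(extractStructuredOutput(sample).cycleLogTitle);
// → "MCP Quality + Product Readiness"
```

Returning `null` on both a missing marker and malformed JSON keeps the caller's handling uniform; the real implementation may distinguish those cases.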
@@ -198,6 +198,7 @@ Standard planning cycle with full board review.
 Within the same priority level, prefer tasks with the highest **impact-to-effort ratio**. Impact is measured by: (a) strategic alignment \u2014 does it advance the current horizon/phase? (b) unlocks other work \u2014 are tasks blocked by this? (c) user-facing \u2014 does it change what users see? (d) compounds over time \u2014 does it make future cycles faster? A high-impact Medium task beats a low-impact Small task at the same priority level. Justify in 2-3 sentences.
 **Blocked tasks:** Tasks with status "Blocked" MUST be skipped during task selection \u2014 they are waiting on external dependencies or gates and cannot be built. Do NOT generate BUILD HANDOFFs for blocked tasks. Do NOT recommend blocked tasks. If a blocked task's gate has been resolved (check the notes and recent build reports), emit a \`boardCorrections\` entry to move it back to Backlog. Report blocked task count in the cycle log.
 **Cycle sizing:** Size the cycle based on what the selected tasks actually require \u2014 not a fixed budget. Select the highest-priority unblocked tasks, estimate each one's effort from its scope, and let the total emerge from the tasks themselves. The historical average effort from Methodology Trends is a reference point for calibration, not a target or floor. A healthy cycle has 4-6 tasks. Cycles with fewer than 4 tasks require explicit justification in the cycle log \u2014 explain why more tasks could not be included. When the backlog has 10+ tasks, the cycle SHOULD have 5+ tasks \u2014 undersized cycles waste planning overhead relative to the available work. If fewer than 4 tasks qualify after filtering (blocked, deferred, raw), check Deferred tasks \u2014 some may be ready to un-defer via a \`boardCorrections\` entry. A 1-task cycle is almost never correct.
+**Theme coherence:** After selecting candidate tasks, check whether they form a coherent theme \u2014 all serving one goal, phase, or module. Single-theme cycles produce better build quality and less context switching. If the top candidates touch 3+ unrelated modules or epics, prefer regrouping around the highest-priority theme and deferring the outliers. Mixed-theme cycles are acceptable when justified (e.g. a P0 fix alongside P1 feature work), but the justification must appear in the cycle log. Name the theme in 3-5 words \u2014 it becomes the \`cycleLogTitle\`.
 
 8. **Cycle Log** \u2014 Write 5-10 line entry: what was triaged, what was recommended and why, observations, AD updates.
 **Cycle Notes** \u2014 Optionally include 1-3 lines of cycle-level observations in \`cycleLogNotes\`: estimation accuracy patterns, recurring blockers, velocity trends, or dependency signals. These notes persist across cycles so future planning runs can learn from them. Use null if there are no noteworthy observations this cycle.
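The sizing rule in this hunk is mechanical enough to sketch. A hypothetical helper: `assessCycleSize` and its messages are not part of the package, just an illustration of the 4-6 healthy band and the 10+-backlog floor described above.

```javascript
// Hypothetical sketch of the cycle-sizing rule: 4-6 tasks is healthy, fewer
// than 4 needs cycle-log justification, and a 10+ task backlog should yield
// a 5+ task cycle. Names and messages are illustrative, not the package API.
function assessCycleSize(selectedCount, backlogCount) {
  if (selectedCount < 4) {
    return "undersized: justify in cycle log";
  }
  if (backlogCount >= 10 && selectedCount < 5) {
    return "undersized: 10+ backlog tasks should yield a 5+ task cycle";
  }
  return selectedCount <= 6 ? "healthy" : "oversized: consider deferring";
}

console.log(assessCycleSize(5, 12)); // → "healthy"
console.log(assessCycleSize(1, 3)); // → "undersized: justify in cycle log"
```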
@@ -220,10 +221,29 @@ Standard planning cycle with full board review.
 **Estimation calibration:** Estimate **XS** for: copy/text-only changes, single string replacements, config tweaks, and any task where the scope is "change words in an existing file" with no logic changes. Estimate **S** for: wiring existing adapter methods, adding API routes following established patterns, modifying prompts, or documentation-only changes. Default to S for pattern-following work. Only use M when genuine new architecture, new DB tables, or multi-file architectural changes are needed. Historical data shows systematic over-estimation (198 over vs 8 under out of 528 tasks) \u2014 when in doubt, estimate smaller. If an "Estimation Calibration (Historical)" section is provided in the context below, use its data to adjust your estimates \u2014 it shows how often each estimated size matched the actual effort. Pay special attention to systematic over/under-estimation patterns (e.g. if M\u2192S happens frequently, estimate S instead of M for similar work).
 **Reference docs:** If a task's notes include a \`Reference:\` path (e.g. \`Reference: docs/architecture/papi-brain-v1.md\`), include a REFERENCE DOCS section in the BUILD HANDOFF with those paths. This tells the builder to read the referenced doc for background context before implementing. Do NOT omit or summarise the reference \u2014 pass it through so the builder can access the full document. Only tasks with explicit \`Reference:\` paths in their notes should have this section.
 **Pre-build verification:** EVERY handoff MUST include a PRE-BUILD VERIFICATION section listing 2-5 specific file paths the builder should read before implementing. Derive these from FILES LIKELY TOUCHED \u2014 pick the files most likely to already contain the target functionality. This is the #1 prevention mechanism for wasted build slots (C120, C125, C126 all scheduled already-shipped work). If the builder finds >80% of the scope already implemented, they report "already built" instead of re-implementing.
-**
-
-
-
+**Research task detection:** When a task's title starts with "Research:" or the task type is "research", add a RESEARCH OUTPUT section to the BUILD HANDOFF after ACCEPTANCE CRITERIA:
+
+RESEARCH OUTPUT
+Deliverable: docs/research/[topic]-findings.md (draft path)
+Review status: pending owner approval
+Follow-up tasks: DO NOT submit to backlog until owner confirms findings are actionable
+
+Also add to ACCEPTANCE CRITERIA: "[ ] Findings doc drafted and saved to docs/research/ before submitting any follow-up ideas"
+
+**Bug task detection:** When a task's task type is "bug" or the title starts with "Bug:" or "Fix:", apply these rules:
+- **Auto-P1:** If the task's current priority is P2 or lower, upgrade it to "P1 High" via a boardCorrections entry in Part 2. Note the upgrade in Part 1 analysis.
+- Add a BLAST RADIUS note to the BUILD HANDOFF SCOPE section: "Bug fix \u2014 minimal blast radius. Change only what is necessary to fix the reported behaviour. Do not refactor surrounding code or expand scope."
+- Add to ACCEPTANCE CRITERIA: "[ ] Fix is targeted \u2014 no unrelated code changed"
+
+**Idea task detection:** When a task's task type is "idea", add a scope clarification note to the BUILD HANDOFF:
+- Add to SCOPE (DO THIS): "This task originated as an idea. Confirm the exact deliverable before implementing \u2014 check task notes and any referenced docs for intent. If scope is unclear, flag it in the build report surprises."
+
+**UI/visual task detection:** When a task's title or notes contain keywords suggesting frontend visual work (e.g. "visual", "design", "UI", "styling", "refresh", "frontend", "landing page", "hero", "carousel", "theme", "layout", "cockpit", "dashboard", "page"), apply these handoff additions:
+- Add to SCOPE: "Read \`.impeccable.md\` for brand palette, design principles, and audience context before writing any code. Use the \`frontend-design\` skill for implementation."
+- For M/L UI tasks, add to SCOPE: "Use the full UI toolchain: Playground (design preview) \u2192 Frontend-design (build) \u2192 Playwright (verify). The playground is the quality bar. Expect 2-3 iterations."
+- Add to ACCEPTANCE CRITERIA: "[ ] Visually verify rendered output in browser \u2014 provide localhost URL or screenshot to user for review." and "[ ] No raw IDs, abbreviations, or jargon visible without human-readable labels or tooltips."
+- If the task involves image selection, add to SCOPE: "Include brand/theme direction constraints for image selection."
+The planner's job is scoping, not design direction. Design decisions happen at build time via \`.impeccable.md\` and the frontend-design skill \u2014 don't try to write design specs in the handoff.
 
 11. **New Tasks (max 3 per cycle)** \u2014 Actively mine the Recent Build Reports for task candidates. For each report, check:
 - **Discovered Issues:** If a build report lists a discovered issue and no existing board task covers it, propose a new task.
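The detection rules added in this hunk (research, bug, idea, UI) can be sketched as a small classifier. This is a hypothetical illustration, not the package's implementation; in particular the precedence between overlapping matches is an assumption the diff does not specify.

```javascript
// Hypothetical sketch of the task-type detection rules above. Precedence
// (research, then bug, then UI, then idea) is an assumption; the diff does
// not specify how overlapping matches are resolved.
const UI_KEYWORDS = [
  "visual", "design", "ui", "styling", "refresh", "frontend", "landing page",
  "hero", "carousel", "theme", "layout", "cockpit", "dashboard", "page",
];

function detectTaskKind(task) {
  const title = (task.title || "").trim();
  const type = (task.taskType || "").toLowerCase();
  if (type === "research" || title.startsWith("Research:")) return "research";
  if (type === "bug" || title.startsWith("Bug:") || title.startsWith("Fix:")) return "bug";
  // Word-boundary match so "ui" does not fire inside words like "build".
  const haystack = `${title} ${task.notes || ""}`.toLowerCase();
  if (UI_KEYWORDS.some((kw) => new RegExp(`\\b${kw}\\b`).test(haystack))) return "ui";
  if (type === "idea") return "idea";
  return "standard";
}

console.log(detectTaskKind({ title: "Fix: login crash" })); // → "bug"
console.log(detectTaskKind({ title: "Polish dashboard layout" })); // → "ui"
```

Note the word-boundary regex: a plain `includes("ui")` would misclassify any title containing "build" or "guide" as UI work.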
@@ -418,115 +438,84 @@ IMPORTANT: You are running as a non-interactive API call. Do NOT ask the user qu
 
 ## OUTPUT PRINCIPLES
 
-- **
+- **Product impact first, process second.** The reader wants to know: what got better for users, what's broken, what opportunities exist. Internal machinery (AD wording, taxonomy labels, hierarchy status) is secondary \u2014 handle it in a compact appendix, not the main review.
 - **Lead with insight, not data recitation.** Open each section with the strategic takeaway or pattern, THEN support it with cycle data and task references. Bad: "C131 built task-700, C132 built task-710." Good: "The last 5 cycles show a clear shift from infrastructure to user-facing work \u2014 80% of tasks were dashboard or onboarding, up from 30% in the prior review window."
 - **Cycle data first, conversation context second.** Base your review on build reports, cycle logs, board state, and ADs \u2014 not on whatever was discussed earlier in the conversation. If recent conversation context conflicts with the data, flag it but trust the data.
-- **
+- **Be concise and scannable.** Use short paragraphs, bullet points, and clear headings. Avoid walls of text. The review should be readable in 3 minutes, not 15. Format cycle summaries as compact bullet points, not multi-paragraph narratives.
+- **Every conditional section earns its place.** If a conditional section has nothing meaningful to say, skip it entirely. Do not write "No issues found" or "No concerns" \u2014 just omit the section.
+- **AD housekeeping is an appendix, not the centerpiece.** Just list changes and make them. Don't score every AD individually. Don't ask for approval on wording tweaks \u2014 small changes (confidence bumps, deleting stale ADs, fixing wording) should just happen. Only flag ADs that represent a genuine strategic question.
 
 ## TWO-PHASE DELIVERY
 
-
-
-2. **Phase 2 (after user discussion):** The structured action breakdown in Part 2 captures concrete next steps. But the user may refine, reject, or add to these after reading Phase 1. The structured output represents your best autonomous assessment \u2014 the user's feedback in conversation refines it.
+1. **Phase 1 (this output):** Present the review \u2014 all 5 mandatory sections plus relevant conditional sections. Be thorough on product gaps and opportunities, compact on housekeeping. The user will discuss and refine before the structured output is applied.
+2. **Phase 2 (after user discussion):** The structured data in Part 2 captures actions. The user may modify these after reading Phase 1.
 
-
+The review should be readable in one sitting. Don't pad sections for depth \u2014 earn every paragraph with a specific insight or actionable finding.
 
 ## YOUR JOB \u2014 STRUCTURED COVERAGE
 
-You MUST cover these
+You MUST cover these 5 sections. Each is mandatory.
 
-1. **
+1. **What Got Built & Why It Matters** \u2014 Compact summary of each cycle since last review. For each cycle: 1-2 bullet points on what shipped and the strategic significance. Reference task IDs. Don't write multi-paragraph narratives per cycle \u2014 keep it scannable. End with the cross-cycle pattern or theme.
 
-2. **
+2. **Product Gaps & User Experience** \u2014 This is the MOST IMPORTANT section. Answer these questions with specifics:
+- If a new user tried this product tomorrow, what would confuse or break for them?
+- What features are broken, half-built, or misleading on the dashboard/UI?
+- What's missing that would make the product noticeably better?
+- What user experience friction has been reported or observed in dogfood/build reports?
+Surface real problems, not theoretical ones. Reference specific pages, components, or flows that need attention.
 
-3. **
+3. **Opportunities & Growth** \u2014 What's not being built or explored that could move the needle?
+- Marketing, distribution, community, content opportunities
+- Methodology improvements that would make cycles more efficient
+- Features or improvements that would differentiate from competitors
+- Things the project owner mentioned wanting but that haven't been prioritised
 
-4. **
+4. **Strategic Direction Check** \u2014 Brief assessment (not a deep dive):
+- Is the North Star still accurate? (1-2 sentences, not a multi-paragraph analysis)
+- Has the target user or problem statement changed?
+- What carry-forward items are stuck and why?
+- If the product brief needs updating, include the update in Part 2.
 
-5. **
--
--
--
--
-
+5. **AD & Hierarchy Housekeeping** \u2014 Compact appendix. NOT the focus of the review.
+- List ADs being deleted, modified, or created \u2014 with the change, not a per-AD essay
+- Just make small changes (confidence bumps, stale AD deletion, wording fixes) \u2014 don't ask for approval
+- Only flag ADs that represent a genuine strategic question requiring owner input
+- Note any hierarchy/phase issues worth correcting (1-2 bullets max)
+- Delete ADs that are legacy, process-level, or redundant without discussion
 
-
-- **effort** \u2014 Implementation cost (1=trivial, 5=major project)
-- **risk** \u2014 Likelihood of failure or rework (1=safe, 5=unproven)
-- **reversibility** \u2014 How hard to undo (1=trivial rollback, 5=permanent)
-- **scale_cost** \u2014 What this costs at 10x/100x users or data (1=negligible, 5=bottleneck)
-- **lock_in** \u2014 Dependency on a specific vendor/tool (1=swappable, 5=deeply coupled)
-Only score ADs where you have enough context to evaluate meaningfully \u2014 skip ADs where scoring would be guesswork.
-**AD Quality Bar:** ADs are for product and architecture choices that constrain future work \u2014 technology selections, data model designs, UX principles, strategic positioning. They are NOT for: process preferences (commit style, PR size), configuration choices (linter rules, tab width), or temporary workarounds. If a decision doesn't affect what gets built or how it's architected, it's not an AD. Flag any existing ADs that fail this bar for deletion via \`activeDecisionUpdates\` with action \`delete\`.
-**IMPORTANT:** If your analysis recommends changing an AD's confidence, modifying its body, or creating a new AD, you MUST include it in \`activeDecisionUpdates\` in Part 2. Analysis without persistence is waste \u2014 the next plan won't see your recommendation unless it's in the structured output.
+## CONDITIONAL SECTIONS (include only when genuinely useful \u2014 most reviews should have 0-2 of these)
 
-
+6. **Security Posture Review** \u2014 Only if \`[SECURITY]\` tags exist in recent cycle logs.
 
-7. **
+7. **Dogfood Friction \u2192 Task Conversion** \u2014 Scan dogfood observations for recurring friction (2+ occurrences without a board task). Convert up to 3 to task proposals via \`actionItems\` in Part 2. Skip if nothing recurring.
 
-8. **
-**How to convert friction to tasks:**
-- In Part 1, write a "Dogfood Friction \u2192 Tasks" subsection listing each friction entry and your decision (convert or skip with reason).
-- For each converted friction, add an entry to \`actionItems\` in Part 2 with \`type: "submit"\` and a descriptive \`description\` that includes the task title and scope. Example: \`{"description": "Submit task: Fix deprioritise clearing handoffs unnecessarily \u2014 add flag to preserve handoff on deprioritise", "type": "submit", "target": null}\`
-- If friction points have been addressed by recent builds, note the resolution and skip them.
-- This closes the loop between "we noticed a problem" and "we created a task to fix it."
+8. **Architecture Health** \u2014 Only flag genuine broken data paths, config drift, or dead code. Keep it brief \u2014 bullet points, not paragraphs. Skip entirely if nothing found.
 ${compressionJob}
-
-- **Broken data paths** \u2014 DB tables that exist but aren't being read by the dashboard, file reads returning empty, API routes with no consumers. Cycle 42 showed an empty product brief going undetected for multiple cycles \u2014 this check catches that class of problem.
-- **Adapter parity gaps** \u2014 Features implemented in the pg adapter but missing from md (or vice versa). Both adapters must implement the same PapiAdapter interface, but runtime behavior can diverge.
-- **Config drift** \u2014 Environment variables referenced in code but not documented, stale .env.example entries, MCP config mismatches between what the server expects and what setup/init generates.
-- **Dead dependencies** \u2014 Packages in package.json that are no longer imported anywhere. These add install time and attack surface.
-- **Stale prompts or instructions** \u2014 Cycle numbers, AD references, or project-state assumptions in prompts.ts or CLAUDE.md that no longer match reality.
-- **Stage readiness gaps** \u2014 If the project is approaching or entering an access-widening stage (e.g. Alpha Distribution, Alpha Cohort, Public Launch), check that auth/security phases are complete. Stages that widen who can access the product must have auth hardening and security review as prerequisites \u2014 not post-hoc discoveries.
-Report findings in a brief "Architecture Health" section in Part 1. If no issues found, skip the section entirely \u2014 do not write "No issues found".
-
-10. **Discovery Canvas Audit** \u2014 If a Discovery Canvas section is provided in context, audit it for completeness and staleness. For each of the 5 canvas sections (Landscape & References, User Journeys, MVP Boundary, Assumptions & Open Questions, Success Signals):
-- If the section is **empty** and the project has run 5+ cycles, flag it as a gap and suggest a specific enrichment prompt (e.g. "Consider defining your MVP boundary \u2014 what's in v1 and what's deferred?").
-- If the section has content, assess whether it's still accurate given recent builds and decisions. Flag stale assumptions or outdated references.
-- If no Discovery Canvas is provided in context, note that the canvas hasn't been initialized and recommend starting with the highest-value section for the project's maturity.
-Report findings in a "Discovery Canvas Audit" section in Part 1. Persist findings in the \`discoveryGaps\` array in Part 2. If no gaps found, omit the section and use an empty array.
-
-11. **Hierarchy Assessment** \u2014 If hierarchy data (Horizons \u2192 Stages \u2192 Phases with task counts) is provided in context, assess the full project structure:
-**Phase-level:**
-- A phase marked "In Progress" with all tasks Done \u2192 flag as ready to close.
-- A phase marked "Done" with active Backlog/In Progress tasks \u2192 flag as incorrectly closed.
-- A phase marked "Not Started" while later-ordered phases are active \u2192 flag as out-of-sequence.
-- If builds in this review window created tasks that don't fit existing phases \u2192 suggest a new phase.
-**Stage-level:**
-- If all phases in a stage are Done \u2192 flag the stage as ready to complete. This is a significant milestone.
-- If the current stage has been active for 15+ cycles \u2192 assess whether it should be split or whether progress is genuinely slow.
-- If work is happening in phases that belong to a future stage while the current stage has incomplete phases \u2192 flag as scope leak.
-**Horizon-level:**
-- If all stages in the active horizon are complete \u2192 flag for Horizon Review (biggest-picture reflection).
-- If no phase data is provided, skip this section.
-Report findings in a "Hierarchy Assessment" section in Part 1. Persist findings in the \`stalePhases\` array in Part 2 (include stage/horizon observations too). If no issues found, omit the section and use an empty array.
-
-12. **Structural Drift Detection** \u2014 If decision usage data is provided in context, identify structural decay using drift-based criteria (not pure cycle counts):
-- **AD drift:** An AD is drifted when its content contradicts recent build evidence, references architecture/capabilities that no longer exist, or has been made redundant by newer ADs. Reference frequency is a secondary signal \u2014 an unreferenced AD that is still accurate is not necessarily stale; an AD referenced last cycle that contradicts shipped code IS drifted.
-- **Carry-forward drift:** Carry-forward items that have persisted across **3+ cycles** without resolution \u2192 flag as stuck.
-- **Confidence drift:** ADs with LOW confidence that have not gained supporting evidence within 5 cycles \u2192 flag as unvalidated. ADs where build reports contradict the decision \u2192 flag as confidence should decrease.
-Use decision usage data as a secondary signal (unreferenced ADs are more likely to be drifted, but verify by checking content alignment). Report findings in a "Structural Drift" section in Part 1. Persist findings in the \`staleDecisions\` array in Part 2. If no issues found, omit the section and use an empty array.
+Note: Hierarchy assessment and structural drift detection are handled within section 5 (AD & Hierarchy Housekeeping). They do not need their own sections.
 
 ## OUTPUT FORMAT
 
 Your output has TWO parts:
 
 ### Part 1: Natural Language Output
-Write your
-1. **
-2. **
-3. **
-4. **
-5. **
-
-
-Then include conditional sections only if relevant:
+Write your Strategy Review in markdown. Cover the 5 mandatory sections in order:
+1. **What Got Built & Why It Matters** \u2014 compact cycle summaries (bullets, not essays)
+2. **Product Gaps & User Experience** \u2014 THE MAIN EVENT. What's broken, confusing, or missing for users.
+3. **Opportunities & Growth** \u2014 what could move the needle that we're not doing
+4. **Strategic Direction Check** \u2014 brief North Star + carry-forward + brief update check
+5. **AD & Hierarchy Housekeeping** \u2014 compact appendix of changes being made
+
+Then include conditional sections only if genuinely useful:
 - **Security Posture Review** \u2014 only if [SECURITY] tags exist
-- **Dogfood Friction \u2192 Tasks** \u2014 only if
-- **Architecture Health** \u2014 only if
-
-
--
+- **Dogfood Friction \u2192 Tasks** \u2014 only if recurring unaddressed friction
+- **Architecture Health** \u2014 only if broken data paths or config drift found${compressionPart1}
+
+**FORMAT GUIDELINES:**
+- The entire review should be readable in 3-5 minutes
+- Use bullet points and short paragraphs, not walls of text
+- Section 2 (Product Gaps) should be the longest section
+- Section 5 (AD Housekeeping) should be the shortest \u2014 just a change list
 
 ### Part 2: Structured Data Block
 After your natural language output, include this EXACT format on its own line:
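The conditional-section gates in this hunk (only if `[SECURITY]` tags exist; only if friction recurred 2+ times without a board task) can be sketched as small predicates. Hypothetical names and shapes only; the package's real context checks are not shown in this diff.

```javascript
// Hypothetical sketch of the conditional-section gates described above.
// Section 6 fires only when [SECURITY] tags exist in recent cycle logs;
// section 7 fires only on friction observed 2+ times with no board task.
function needsSecuritySection(cycleLogs) {
  return cycleLogs.some((log) => log.includes("[SECURITY]"));
}

function recurringFriction(observations) {
  const counts = new Map();
  for (const obs of observations) {
    counts.set(obs.summary, (counts.get(obs.summary) || 0) + 1);
  }
  return observations
    // keep one entry per distinct summary
    .filter((obs, i, all) => all.findIndex((o) => o.summary === obs.summary) === i)
    .filter((obs) => counts.get(obs.summary) >= 2 && !obs.hasBoardTask)
    .map((obs) => obs.summary);
}

console.log(needsSecuritySection(["C132: [SECURITY] auth gap found"])); // → true
```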
@@ -607,9 +596,9 @@ Everything in Part 1 (natural language) is **display-only**. Part 2 (structured
 
 **If you analysed it in Part 1, it MUST appear in Part 2 to persist. Empty arrays/null = nothing saved.**
 
---
--
--
+- AD changes listed in section 5? \u2192 Put them in \`activeDecisionUpdates\` with full body including ### heading. Use \`delete\` action (with empty body) to permanently remove non-strategic ADs. Small changes (confidence bumps, stale deletions) don't need justification essays \u2014 just make them.
+- Only score ADs in \`decisionScores\` if the review surfaced a genuine strategic question about that AD. Do NOT score every AD \u2014 most reviews should have 0-3 scores at most.
+- Product brief needs updating? \u2192 Put the full updated brief in \`productBriefUpdates\`.${compressionPersistence}
 - Wrote a strategy review in Part 1? \u2192 \`sessionLogTitle\`, \`sessionLogContent\`, \`velocityAssessment\`, \`strategicRecommendations\` must all be populated
 - Made recommendations in Part 1? \u2192 Extract each into \`actionItems\` with a specific type (resolve/submit/close/investigate/defer) and target (AD-N, task-NNN, phase name, or null). Every recommendation must have an action item \u2014 this is how they get tracked and surfaced to the next plan
 - Converted dogfood friction to tasks in Part 1? \u2192 Each converted friction must appear as an \`actionItem\` with \`type: "submit"\`. If it's not in \`actionItems\`, it won't be tracked \u2014 the next plan will never see it
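The `actionItems` entries described in this persistence checklist have a describable shape: a description string, a type from resolve/submit/close/investigate/defer, and a target that is a string (AD-N, task-NNN, phase name) or null. A hedged sketch of a validator; the package's real schema enforcement is not shown in this diff.

```javascript
// Hypothetical validator for the actionItems shape implied above.
// Illustrative only, not the package's actual validation code.
const ACTION_TYPES = new Set(["resolve", "submit", "close", "investigate", "defer"]);

function isValidActionItem(item) {
  return (
    typeof item.description === "string" &&
    item.description.length > 0 &&
    ACTION_TYPES.has(item.type) &&
    (item.target === null || typeof item.target === "string")
  );
}

console.log(isValidActionItem({
  description: "Submit task: Fix deprioritise clearing handoffs",
  type: "submit",
  target: null,
})); // → true
```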
@@ -703,6 +692,9 @@ function buildReviewUserMessage(ctx) {
   if (ctx.unregisteredDocs) {
     parts.push("### Unregistered Docs", "", ctx.unregisteredDocs, "");
   }
+  if (ctx.taskComments) {
+    parts.push("### Task Discussion (Recent Comments)", "", ctx.taskComments, "");
+  }
   return parts.join("\n");
 }
 function parseReviewStructuredOutput(raw) {
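The new `taskComments` branch follows the same optional-section pattern as the surrounding code: each present context field contributes a (heading, blank line, body, blank line) run. A trimmed, runnable sketch of just the two sections visible in the hunk; the real `buildReviewUserMessage` assembles many more parts than shown in this diff.

```javascript
// Trimmed sketch of the optional-section pattern in buildReviewUserMessage.
// Only the two branches visible in the diff are reproduced here.
function buildReviewUserMessage(ctx) {
  const parts = [];
  if (ctx.unregisteredDocs) {
    parts.push("### Unregistered Docs", "", ctx.unregisteredDocs, "");
  }
  if (ctx.taskComments) {
    parts.push("### Task Discussion (Recent Comments)", "", ctx.taskComments, "");
  }
  return parts.join("\n");
}

console.log(buildReviewUserMessage({ taskComments: "task-700: needs retest" }));
```

Pushing the trailing `""` per section means each section ends with a newline after the join, so consecutive sections stay separated without extra bookkeeping.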
package/package.json
CHANGED