@bugzy-ai/bugzy 1.12.3 → 1.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/dist/cli/index.js CHANGED
@@ -1261,7 +1261,9 @@ Extract the following from arguments:
1261
1261
  "read-knowledge-base",
1262
1262
  // Step 5: Test Execution Strategy (library)
1263
1263
  "read-test-strategy",
1264
- // Step 6: Identify Tests (inline - task-specific)
1264
+ // Step 6: Clarification Protocol (library)
1265
+ "clarification-protocol",
1266
+ // Step 7: Identify Tests (inline - task-specific)
1265
1267
  {
1266
1268
  inline: true,
1267
1269
  title: "Identify Automated Tests to Run",
@@ -1487,7 +1489,9 @@ Store the detected trigger for use in output routing:
1487
1489
  - Set variable: \`TRIGGER_SOURCE\` = [GITHUB_PR | SLACK_MESSAGE | CI_CD | MANUAL]
1488
1490
  - This determines output formatting and delivery channel`
1489
1491
  },
1490
- // Step 6: Extract Context (inline)
1492
+ // Step 6: Clarification Protocol (library)
1493
+ "clarification-protocol",
1494
+ // Step 7: Extract Context (inline)
1491
1495
  {
1492
1496
  inline: true,
1493
1497
  title: "Extract Context Based on Trigger",
@@ -1948,8 +1952,11 @@ This command orchestrates the complete test coverage workflow in a single execut
1948
1952
  2. **Phase 2**: Generate lightweight test plan
1949
1953
  3. **Phase 3**: Generate and verify test cases (create + fix until passing)
1950
1954
  4. **Phase 4**: Triage failures and fix test issues
1951
- 5. **Phase 5**: Log product bugs
1952
- 6. **Phase 6**: Final report`
1955
+ 5. **Phase 5**: Log product bugs to issue tracker
1956
+ 6. **Phase 6**: Notify team via Slack/Teams with results summary
1957
+ 7. **Phase 7**: Generate final report
1958
+
1959
+ **IMPORTANT**: All phases must be completed. Do NOT skip Phase 5 (bug logging) or Phase 6 (team notification) \u2014 these are required deliverables.`
1953
1960
  },
1954
1961
  // Step 2: Security Notice (from library)
1955
1962
  "security-notice",
@@ -6595,6 +6602,16 @@ Determine exploration depth based on requirement quality:
6595
6602
  - **Vague:** "Fix the sorting in todo list page. The items are mixed up for premium users."
6596
6603
  - **Unclear:** "Improve the dashboard performance. Users say it's slow."
6597
6604
 
6605
+ ### Maturity Adjustment
6606
+
6607
+ If the Clarification Protocol determined project maturity, adjust exploration depth:
6608
+
6609
+ - **New project**: Default one level deeper than requirement clarity suggests (Clear \u2192 Moderate, Vague \u2192 Deep)
6610
+ - **Growing project**: Use requirement clarity as-is (standard protocol)
6611
+ - **Mature project**: Trust knowledge base \u2014 can stay at suggested depth or go one level shallower if KB covers the feature
6612
+
6613
+ **Always verify features exist before testing them.** If exploration reveals that a referenced page or feature does not exist in the application, this is CRITICAL severity \u2014 escalate via the Clarification Protocol regardless of maturity level. Do NOT silently adapt or work around the missing feature.
6614
+
6598
6615
  ### Quick Exploration (1-2 min)
6599
6616
 
6600
6617
  **When:** Requirements CLEAR
@@ -6827,6 +6844,33 @@ Before starting, check if this task is resuming from a blocked clarification:
6827
6844
 
6828
6845
  3. **If no clarification in $ARGUMENTS:** Proceed normally with ambiguity detection below.
6829
6846
 
6847
+ ### Assess Project Maturity
6848
+
6849
+ Before detecting ambiguity, assess how well you know this project. Maturity determines how aggressively you should ask questions \u2014 new projects require more questions, mature projects can rely on accumulated knowledge.
6850
+
6851
+ **Measure maturity from runtime artifacts:**
6852
+
6853
+ | Signal | New | Growing | Mature |
6854
+ |--------|-----|---------|--------|
6855
+ | \`knowledge-base.md\` | < 80 lines (template) | 80-300 lines | 300+ lines |
6856
+ | \`memory/\` files | 0 files | 1-3 files | 4+ files, >5KB each |
6857
+ | Test cases in \`test-cases/\` | 0 | 1-6 | 7+ |
6858
+ | Exploration reports | 0 | 1 | 2+ |
6859
+
6860
+ **Steps:**
6861
+ 1. Read \`.bugzy/runtime/knowledge-base.md\` and count lines
6862
+ 2. List \`.bugzy/runtime/memory/\` directory and count files
6863
+ 3. List \`test-cases/\` directory and count \`.md\` files (exclude README)
6864
+ 4. Count exploration reports in \`exploration-reports/\`
6865
+ 5. Classify: If majority of signals = New \u2192 **New**; majority Mature \u2192 **Mature**; otherwise \u2192 **Growing**
6866
+
6867
+ **Maturity adjusts your question threshold:**
6868
+ - **New**: Ask for CRITICAL + HIGH + MEDIUM severity (gather information aggressively)
6869
+ - **Growing**: Ask for CRITICAL + HIGH severity (standard protocol)
6870
+ - **Mature**: Ask for CRITICAL only (handle HIGH with documented assumptions)
6871
+
6872
+ **CRITICAL severity ALWAYS triggers a question, regardless of maturity level.**
6873
+
6830
6874
  ### Detect Ambiguity
6831
6875
 
6832
6876
  Scan for ambiguity signals:
@@ -6853,8 +6897,8 @@ If ambiguity is detected, assess its severity:
6853
6897
 
6854
6898
  | Severity | Characteristics | Examples | Action |
6855
6899
  |----------|----------------|----------|--------|
6856
- | **CRITICAL** | Expected behavior undefined/contradictory; test outcome unpredictable; core functionality unclear; success criteria missing; multiple interpretations = different strategies | "Fix the issue" (what issue?), "Improve performance" (which metrics?), "Fix sorting in todo list" (by date? priority? completion status?) | **STOP** - Seek clarification before proceeding |
6857
- | **HIGH** | Core underspecified but direction clear; affects majority of scenarios; vague success criteria; assumptions risky | "Fix ordering" (sequence OR visibility?), "Add validation" (what? messages?), "Update dashboard" (which widgets?) | **STOP** - Seek clarification before proceeding |
6900
+ | **CRITICAL** | Expected behavior undefined/contradictory; test outcome unpredictable; core functionality unclear; success criteria missing; multiple interpretations = different strategies; **referenced page/feature does not exist in the application** | "Fix the issue" (what issue?), "Improve performance" (which metrics?), "Fix sorting in todo list" (by date? priority? completion status?), "Test the Settings page" (no Settings page exists), "Verify the checkout flow" (no checkout page found) | **STOP** - You MUST ask via team-communicator before proceeding |
6901
+ | **HIGH** | Core underspecified but direction clear; affects majority of scenarios; vague success criteria; assumptions risky | "Fix ordering" (sequence OR visibility?), "Add validation" (what? messages?), "Update dashboard" (which widgets?) | **STOP** - You MUST ask via team-communicator before proceeding |
6858
6902
  | **MEDIUM** | Specific details missing; general requirements clear; affects subset of cases; reasonable low-risk assumptions possible; wrong assumption = test updates not strategy overhaul | Missing field labels, unclear error message text, undefined timeouts, button placement not specified, date formats unclear | **PROCEED** - (1) Moderate exploration, (2) Document assumptions: "Assuming X because Y", (3) Proceed with creation/execution, (4) Async clarification (team-communicator), (5) Mark [ASSUMED: description] |
6859
6903
  | **LOW** | Minor edge cases; documentation gaps don't affect execution; optional/cosmetic elements; minimal impact | Tooltip text, optional field validation, icon choice, placeholder text, tab order | **PROCEED** - (1) Mark [TO BE CLARIFIED: description], (2) Proceed, (3) Mention in report "Minor Details", (4) No blocking/async clarification |
6860
6904
 
@@ -6955,18 +6999,26 @@ Tasks waiting for clarification responses.
6955
6999
 
6956
7000
  ### Wait or Proceed Based on Severity
6957
7001
 
6958
- **CRITICAL/HIGH \u2192 STOP and Wait:**
6959
- - Do NOT create tests, run tests, or make assumptions
6960
- - Wait for clarification, resume after answer
7002
+ **Use your maturity assessment to adjust thresholds:**
7003
+ - **New project**: STOP for CRITICAL + HIGH + MEDIUM
7004
+ - **Growing project**: STOP for CRITICAL + HIGH (default)
7005
+ - **Mature project**: STOP for CRITICAL only; handle HIGH with documented assumptions
7006
+
7007
+ **When severity meets your STOP threshold:**
7008
+ - You MUST call team-communicator (Slack) to ask the question \u2014 do NOT just mention it in your text output
7009
+ - Do NOT create tests, run tests, or make assumptions about the unclear aspect
7010
+ - Do NOT silently adapt by working around the issue (e.g., running other tests instead)
7011
+ - Do NOT invent your own success criteria when none are provided
7012
+ - Register the blocked task and wait for clarification
6961
7013
  - *Rationale: Wrong assumptions = incorrect tests, false results, wasted time*
6962
7014
 
6963
- **MEDIUM \u2192 Proceed with Documented Assumptions:**
7015
+ **When severity is below your STOP threshold \u2192 Proceed with Documented Assumptions:**
6964
7016
  - Perform moderate exploration, document assumptions, proceed with creation/execution
6965
7017
  - Ask clarification async (team-communicator), mark results "based on assumptions"
6966
7018
  - Update tests after clarification received
6967
7019
  - *Rationale: Waiting blocks progress; documented assumptions allow forward movement with later corrections*
6968
7020
 
6969
- **LOW \u2192 Proceed and Mark:**
7021
+ **LOW \u2192 Always Proceed and Mark:**
6970
7022
  - Proceed with creation/execution, mark gaps [TO BE CLARIFIED] or [ASSUMED]
6971
7023
  - Mention in report but don't prioritize, no blocking
6972
7024
  - *Rationale: Details don't affect strategy/results significantly*
@@ -6994,11 +7046,12 @@ When reporting test results, always include an "Ambiguities" section if clarific
6994
7046
 
6995
7047
  ## Remember
6996
7048
 
6997
- - **Block for CRITICAL/HIGH** - Never proceed with assumptions on unclear core requirements
7049
+ - **STOP means STOP** - When you hit a STOP threshold, you MUST call team-communicator to ask via Slack. Do NOT silently adapt, skip, or work around the issue
7050
+ - **Non-existent features = CRITICAL** - If a page, component, or feature referenced in the task does not exist, this is always CRITICAL severity \u2014 ask what was meant
6998
7051
  - **Ask correctly > guess poorly** - Specific questions lead to specific answers
6999
- - **Document MEDIUM assumptions** - Track what you assumed and why
7052
+ - **Never invent success criteria** - If the task says "improve" or "fix" without metrics, ask what "done" looks like
7000
7053
  - **Check memory first** - Avoid re-asking previously answered questions
7001
- - **Specific questions \u2192 specific answers** - Vague questions get vague answers`,
7054
+ - **Maturity adjusts threshold, not judgment** - Even in mature projects, CRITICAL always triggers a question`,
7002
7055
  tags: ["clarification", "protocol", "ambiguity"]
7003
7056
  };
7004
7057