@bugzy-ai/bugzy 1.12.4 → 1.13.1
- package/dist/cli/index.cjs +68 -12
- package/dist/cli/index.cjs.map +1 -1
- package/dist/cli/index.js +68 -12
- package/dist/cli/index.js.map +1 -1
- package/dist/index.cjs +68 -12
- package/dist/index.cjs.map +1 -1
- package/dist/index.js +68 -12
- package/dist/index.js.map +1 -1
- package/dist/tasks/index.cjs +10 -2
- package/dist/tasks/index.cjs.map +1 -1
- package/dist/tasks/index.js +10 -2
- package/dist/tasks/index.js.map +1 -1
- package/package.json +1 -1
package/dist/cli/index.cjs (CHANGED)
@@ -268,6 +268,9 @@ Before invoking the agent, identify the test cases for the current area:
 - Existing automated tests: ./tests/specs/
 - Existing Page Objects: ./tests/pages/
 
+**Knowledge Base Patterns (MUST APPLY):**
+Include ALL relevant testing patterns from the knowledge base that apply to this area. For example, if the KB documents timing behaviors (animation delays, loading states), selector gotchas, or recommended assertion approaches \u2014 list them here explicitly and instruct the agent to use the specific patterns described (e.g., specific assertion methods with specific timeouts). The test-code-generator does not have access to the knowledge base, so you MUST relay the exact patterns and recommended code approaches.
+
 **The agent should:**
 1. Read the manual test case files for this area
 2. Check existing Page Object infrastructure for this area
@@ -276,6 +279,7 @@ Before invoking the agent, identify the test cases for the current area:
 5. For each test case marked \`automated: true\`:
 - Create automated Playwright test in ./tests/specs/
 - Update the manual test case file to reference the automated test path
+- Apply ALL knowledge base patterns listed above (timing, selectors, assertions)
 6. Run and iterate on each test until it passes or fails with a product bug
 7. Update .env.testdata with any new variables
 
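The relay requirement in the two hunks above can be pictured as a small message builder. This is only a sketch: `buildDelegationMessage`, the `kind`/`rule` fields, and the two sample patterns are hypothetical and do not come from the bugzy package; only the quoted paths and the section heading are taken from the prompt text.

```javascript
// Hypothetical helper: assembles a delegation message that carries
// knowledge base patterns to a subagent that cannot read the KB itself.
function buildDelegationMessage(area, kbPatterns) {
  const lines = [
    `Area: ${area}`,
    "Existing automated tests: ./tests/specs/",
    "Existing Page Objects: ./tests/pages/",
    "",
    "**Knowledge Base Patterns (MUST APPLY):**",
  ];
  if (kbPatterns.length === 0) {
    // Make the absence explicit so the subagent does not assume an omission.
    lines.push("- (none documented for this area)");
  } else {
    for (const pattern of kbPatterns) {
      lines.push(`- [${pattern.kind}] ${pattern.rule}`);
    }
  }
  return lines.join("\n");
}

// Illustrative patterns only; a real knowledge base would supply its own.
const message = buildDelegationMessage("login", [
  { kind: "timing", rule: "Wait for the loading state to clear before asserting" },
  { kind: "selectors", rule: "Prefer role-based selectors over CSS classes" },
]);
console.log(message);
```

Listing each pattern as its own bullet keeps the relayed rules explicit, which is what the added prompt text asks for.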
@@ -1269,7 +1273,9 @@ Extract the following from arguments:
 "read-knowledge-base",
 // Step 5: Test Execution Strategy (library)
 "read-test-strategy",
-// Step 6:
+// Step 6: Clarification Protocol (library)
+"clarification-protocol",
+// Step 7: Identify Tests (inline - task-specific)
 {
 inline: true,
 title: "Identify Automated Tests to Run",
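The hunk above shows a step list that mixes string IDs (library steps) with inline objects. How bugzy resolves that list is not visible in this diff, so the following is a sketch under assumptions: `stepLibrary`, `resolveSteps`, and the library titles are hypothetical, and only the step IDs and the inline title are copied from the diff.

```javascript
// Hypothetical library of reusable steps keyed by ID. The real step
// library is not shown in this diff; the titles here are illustrative.
const stepLibrary = {
  "read-knowledge-base": { title: "Read Knowledge Base" },
  "read-test-strategy": { title: "Read Test Strategy" },
  "clarification-protocol": { title: "Clarification Protocol" },
};

// Resolve a mixed list: strings are looked up in the library,
// objects are treated as inline, task-specific steps.
function resolveSteps(steps, library) {
  return steps.map((step) => {
    if (typeof step === "string") {
      const entry = library[step];
      if (!entry) throw new Error(`Unknown library step: ${step}`);
      return { source: "library", id: step, title: entry.title };
    }
    return { source: "inline", title: step.title };
  });
}

// Excerpt of the list from the hunk above (earlier steps omitted).
const resolved = resolveSteps(
  [
    "read-knowledge-base",
    "read-test-strategy",
    "clarification-protocol",
    { inline: true, title: "Identify Automated Tests to Run" },
  ],
  stepLibrary
);
console.log(resolved.map((s) => `${s.source}: ${s.title}`).join("\n"));
```

Failing loudly on an unknown ID is a design choice of this sketch; it would surface a misspelled step such as a missing `"clarification-protocol"` entry early.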
@@ -1495,7 +1501,9 @@ Store the detected trigger for use in output routing:
 - Set variable: \`TRIGGER_SOURCE\` = [GITHUB_PR | SLACK_MESSAGE | CI_CD | MANUAL]
 - This determines output formatting and delivery channel`
 },
-// Step 6:
+// Step 6: Clarification Protocol (library)
+"clarification-protocol",
+// Step 7: Extract Context (inline)
 {
 inline: true,
 title: "Extract Context Based on Trigger",
@@ -6463,6 +6471,8 @@ Before proceeding, read the curated knowledge base to inform your work:
 - Build on existing understanding
 - Maintain consistency with established practices
 
+3. **Relay to subagents**: Subagents do NOT read the knowledge base directly. When delegating work, you MUST include relevant KB patterns in your delegation message \u2014 especially testing patterns (timing, selectors, assertion approaches) that affect test reliability.
+
 **Note:** The knowledge base may not exist yet or may be empty. If it doesn't exist or is empty, proceed without this context and help build it as you work.`,
 tags: ["setup", "context"]
 };
@@ -6606,6 +6616,16 @@ Determine exploration depth based on requirement quality:
 - **Vague:** "Fix the sorting in todo list page. The items are mixed up for premium users."
 - **Unclear:** "Improve the dashboard performance. Users say it's slow."
 
+### Maturity Adjustment
+
+If the Clarification Protocol determined project maturity, adjust exploration depth:
+
+- **New project**: Default one level deeper than requirement clarity suggests (Clear \u2192 Moderate, Vague \u2192 Deep)
+- **Growing project**: Use requirement clarity as-is (standard protocol)
+- **Mature project**: Trust knowledge base \u2014 can stay at suggested depth or go one level shallower if KB covers the feature
+
+**Always verify features exist before testing them.** If exploration reveals that a referenced page or feature does not exist in the application, this is CRITICAL severity \u2014 escalate via the Clarification Protocol regardless of maturity level. Do NOT silently adapt or work around the missing feature.
+
 ### Quick Exploration (1-2 min)
 
 **When:** Requirements CLEAR
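The Maturity Adjustment rule added in this hunk amounts to shifting a position on a three-level scale. A minimal sketch, assuming the levels are quick, moderate, and deep, and that a mature project with KB coverage always takes the shallower option (the prompt only says it "can"); `adjustDepth` and its parameter names are hypothetical.

```javascript
// Exploration depths from shallowest to deepest.
const DEPTHS = ["quick", "moderate", "deep"];

// suggested: depth implied by requirement clarity alone.
// maturity: "new" | "growing" | "mature".
// kbCoversFeature: whether the knowledge base already documents the feature.
function adjustDepth(suggested, maturity, kbCoversFeature = false) {
  const index = DEPTHS.indexOf(suggested);
  if (index === -1) throw new Error(`Unknown depth: ${suggested}`);

  let shift = 0;
  if (maturity === "new") shift = 1; // one level deeper
  else if (maturity === "mature" && kbCoversFeature) shift = -1; // one level shallower

  // Clamp so the result never leaves the scale.
  const clamped = Math.min(DEPTHS.length - 1, Math.max(0, index + shift));
  return DEPTHS[clamped];
}

console.log(adjustDepth("quick", "new"));
console.log(adjustDepth("moderate", "mature", true));
```

The missing-feature rule is deliberately left out of this function: per the prompt text it is a CRITICAL escalation regardless of maturity, not a depth adjustment.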
@@ -6838,6 +6858,33 @@ Before starting, check if this task is resuming from a blocked clarification:
 
 3. **If no clarification in $ARGUMENTS:** Proceed normally with ambiguity detection below.
 
+### Assess Project Maturity
+
+Before detecting ambiguity, assess how well you know this project. Maturity determines how aggressively you should ask questions \u2014 new projects require more questions, mature projects can rely on accumulated knowledge.
+
+**Measure maturity from runtime artifacts:**
+
+| Signal | New | Growing | Mature |
+|--------|-----|---------|--------|
+| \`knowledge-base.md\` | < 80 lines (template) | 80-300 lines | 300+ lines |
+| \`memory/\` files | 0 files | 1-3 files | 4+ files, >5KB each |
+| Test cases in \`test-cases/\` | 0 | 1-6 | 7+ |
+| Exploration reports | 0 | 1 | 2+ |
+
+**Steps:**
+1. Read \`.bugzy/runtime/knowledge-base.md\` and count lines
+2. List \`.bugzy/runtime/memory/\` directory and count files
+3. List \`test-cases/\` directory and count \`.md\` files (exclude README)
+4. Count exploration reports in \`exploration-reports/\`
+5. Classify: If majority of signals = New \u2192 **New**; majority Mature \u2192 **Mature**; otherwise \u2192 **Growing**
+
+**Maturity adjusts your question threshold:**
+- **New**: Ask for CRITICAL + HIGH + MEDIUM severity (gather information aggressively)
+- **Growing**: Ask for CRITICAL + HIGH severity (standard protocol)
+- **Mature**: Ask for CRITICAL only (handle HIGH with documented assumptions)
+
+**CRITICAL severity ALWAYS triggers a question, regardless of maturity level.**
+
 ### Detect Ambiguity
 
 Scan for ambiguity signals:
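The signal table and the majority rule in step 5 can be sketched as a pure function over already-collected counts. Several details are assumptions, because the prompt leaves them open: "majority" is read as strictly more than half of the four signals, exactly 300 knowledge-base lines counts as Mature, 5KB is taken as 5 * 1024 bytes, and four or more memory files that are not all above 5KB count as Growing. All function and field names are hypothetical.

```javascript
// Rate each signal independently as "new", "growing", or "mature".
function rateKnowledgeBase(lineCount) {
  if (lineCount < 80) return "new";
  return lineCount < 300 ? "growing" : "mature";
}

function rateMemory(fileSizesInBytes) {
  if (fileSizesInBytes.length === 0) return "new";
  const allLarge = fileSizesInBytes.every((bytes) => bytes > 5 * 1024);
  return fileSizesInBytes.length >= 4 && allLarge ? "mature" : "growing";
}

function rateTestCases(count) {
  if (count === 0) return "new";
  return count <= 6 ? "growing" : "mature";
}

function rateReports(count) {
  if (count === 0) return "new";
  return count === 1 ? "growing" : "mature";
}

// Combine the four ratings with the majority rule from step 5.
function classifyMaturity({ kbLines, memoryFileSizes, testCaseCount, reportCount }) {
  const ratings = [
    rateKnowledgeBase(kbLines),
    rateMemory(memoryFileSizes),
    rateTestCases(testCaseCount),
    rateReports(reportCount),
  ];
  const countOf = (level) => ratings.filter((r) => r === level).length;
  const half = ratings.length / 2;
  if (countOf("new") > half) return "new";
  if (countOf("mature") > half) return "mature";
  return "growing";
}

console.log(
  classifyMaturity({ kbLines: 40, memoryFileSizes: [], testCaseCount: 0, reportCount: 0 })
);
```

Gathering the counts (reading `.bugzy/runtime/knowledge-base.md`, listing directories) is left to the caller so the classification itself stays easy to test.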
@@ -6864,8 +6911,8 @@ If ambiguity is detected, assess its severity:
 
 | Severity | Characteristics | Examples | Action |
 |----------|----------------|----------|--------|
-| **CRITICAL** | Expected behavior undefined/contradictory; test outcome unpredictable; core functionality unclear; success criteria missing; multiple interpretations = different strategies | "Fix the issue" (what issue?), "Improve performance" (which metrics?), "Fix sorting in todo list" (by date? priority? completion status?) | **STOP** -
-| **HIGH** | Core underspecified but direction clear; affects majority of scenarios; vague success criteria; assumptions risky | "Fix ordering" (sequence OR visibility?), "Add validation" (what? messages?), "Update dashboard" (which widgets?) | **STOP** -
+| **CRITICAL** | Expected behavior undefined/contradictory; test outcome unpredictable; core functionality unclear; success criteria missing; multiple interpretations = different strategies; **referenced page/feature does not exist in the application** | "Fix the issue" (what issue?), "Improve performance" (which metrics?), "Fix sorting in todo list" (by date? priority? completion status?), "Test the Settings page" (no Settings page exists), "Verify the checkout flow" (no checkout page found) | **STOP** - You MUST ask via team-communicator before proceeding |
+| **HIGH** | Core underspecified but direction clear; affects majority of scenarios; vague success criteria; assumptions risky | "Fix ordering" (sequence OR visibility?), "Add validation" (what? messages?), "Update dashboard" (which widgets?) | **STOP** - You MUST ask via team-communicator before proceeding |
 | **MEDIUM** | Specific details missing; general requirements clear; affects subset of cases; reasonable low-risk assumptions possible; wrong assumption = test updates not strategy overhaul | Missing field labels, unclear error message text, undefined timeouts, button placement not specified, date formats unclear | **PROCEED** - (1) Moderate exploration, (2) Document assumptions: "Assuming X because Y", (3) Proceed with creation/execution, (4) Async clarification (team-communicator), (5) Mark [ASSUMED: description] |
 | **LOW** | Minor edge cases; documentation gaps don't affect execution; optional/cosmetic elements; minimal impact | Tooltip text, optional field validation, icon choice, placeholder text, tab order | **PROCEED** - (1) Mark [TO BE CLARIFIED: description], (2) Proceed, (3) Mention in report "Minor Details", (4) No blocking/async clarification |
 
@@ -6966,18 +7013,26 @@ Tasks waiting for clarification responses.
 
 ### Wait or Proceed Based on Severity
 
-**
--
--
+**Use your maturity assessment to adjust thresholds:**
+- **New project**: STOP for CRITICAL + HIGH + MEDIUM
+- **Growing project**: STOP for CRITICAL + HIGH (default)
+- **Mature project**: STOP for CRITICAL only; handle HIGH with documented assumptions
+
+**When severity meets your STOP threshold:**
+- You MUST call team-communicator (Slack) to ask the question \u2014 do NOT just mention it in your text output
+- Do NOT create tests, run tests, or make assumptions about the unclear aspect
+- Do NOT silently adapt by working around the issue (e.g., running other tests instead)
+- Do NOT invent your own success criteria when none are provided
+- Register the blocked task and wait for clarification
 - *Rationale: Wrong assumptions = incorrect tests, false results, wasted time*
 
-**
+**When severity is below your STOP threshold \u2192 Proceed with Documented Assumptions:**
 - Perform moderate exploration, document assumptions, proceed with creation/execution
 - Ask clarification async (team-communicator), mark results "based on assumptions"
 - Update tests after clarification received
 - *Rationale: Waiting blocks progress; documented assumptions allow forward movement with later corrections*
 
-**LOW \u2192 Proceed and Mark:**
+**LOW \u2192 Always Proceed and Mark:**
 - Proceed with creation/execution, mark gaps [TO BE CLARIFIED] or [ASSUMED]
 - Mention in report but don't prioritize, no blocking
 - *Rationale: Details don't affect strategy/results significantly*
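The maturity-dependent STOP thresholds in this hunk reduce to a lookup table plus one decision function. A minimal sketch: the severity and maturity names follow the prompt text, while `STOP_SEVERITIES`, `decideAction`, and the returned action labels are hypothetical.

```javascript
// Severities that block work, per maturity level, as listed in the hunk above.
const STOP_SEVERITIES = {
  new: ["CRITICAL", "HIGH", "MEDIUM"],
  growing: ["CRITICAL", "HIGH"],
  mature: ["CRITICAL"],
};

const KNOWN_SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

function decideAction(severity, maturity) {
  const stops = STOP_SEVERITIES[maturity];
  if (!stops) throw new Error(`Unknown maturity: ${maturity}`);
  if (!KNOWN_SEVERITIES.includes(severity)) {
    throw new Error(`Unknown severity: ${severity}`);
  }
  // At or above the threshold: ask via team-communicator and wait.
  if (stops.includes(severity)) return "STOP";
  // LOW never blocks; anything else below the threshold proceeds
  // with documented assumptions and an async clarification.
  return severity === "LOW" ? "PROCEED_AND_MARK" : "PROCEED_WITH_ASSUMPTIONS";
}

console.log(decideAction("HIGH", "mature"));
console.log(decideAction("CRITICAL", "mature"));
```

Because CRITICAL appears in every row of the table, the rule that CRITICAL always triggers a question holds by construction rather than by a special case.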
@@ -7005,11 +7060,12 @@ When reporting test results, always include an "Ambiguities" section if clarific
 
 ## Remember
 
-- **
+- **STOP means STOP** - When you hit a STOP threshold, you MUST call team-communicator to ask via Slack. Do NOT silently adapt, skip, or work around the issue
+- **Non-existent features = CRITICAL** - If a page, component, or feature referenced in the task does not exist, this is always CRITICAL severity \u2014 ask what was meant
 - **Ask correctly > guess poorly** - Specific questions lead to specific answers
-- **
+- **Never invent success criteria** - If the task says "improve" or "fix" without metrics, ask what "done" looks like
 - **Check memory first** - Avoid re-asking previously answered questions
-- **
+- **Maturity adjusts threshold, not judgment** - Even in mature projects, CRITICAL always triggers a question`,
 tags: ["clarification", "protocol", "ambiguity"]
 };
 