joycraft 0.5.6 → 0.5.8
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +59 -338
- package/dist/{chunk-QIYIJ7VR.js → chunk-A2CQG5J5.js} +680 -61
- package/dist/chunk-A2CQG5J5.js.map +1 -0
- package/dist/cli.js +3 -3
- package/dist/index.d.ts +6 -0
- package/dist/index.js +78 -7
- package/dist/index.js.map +1 -1
- package/dist/{init-MKRU6SYT.js → init-QXG5BT4Y.js} +83 -11
- package/dist/init-QXG5BT4Y.js.map +1 -0
- package/dist/{init-autofix-V2Y2O4HO.js → init-autofix-Y5DQOFEU.js} +2 -2
- package/dist/{upgrade-HK6F5SXI.js → upgrade-VUOSXPR5.js} +2 -2
- package/package.json +1 -1
- package/dist/chunk-QIYIJ7VR.js.map +0 -1
- package/dist/init-MKRU6SYT.js.map +0 -1
- /package/dist/{init-autofix-V2Y2O4HO.js.map → init-autofix-Y5DQOFEU.js.map} +0 -0
- /package/dist/{upgrade-HK6F5SXI.js.map → upgrade-VUOSXPR5.js.map} +0 -0
@@ -904,6 +904,28 @@ Based on their answer, use the appropriate git rules in the Behavioral Boundarie
 - Ask "should I push?" or "should I create a PR?" — the answer is always yes, just do it
 \`\`\`
 
+### Permission Mode Recommendation
+
+After the git autonomy question and before the risk interview, recommend a Claude Code permission mode based on what you've learned so far. Present this guidance:
+
+> **What permission mode should you use?**
+>
+> | Your situation | Use | Why |
+> |---|---|---|
+> | Autonomous spec execution | \`--permission-mode dontAsk\` + allowlist | Only pre-approved commands run |
+> | Long session with some trust | \`--permission-mode auto\` | Safety classifier reviews each action |
+> | Interactive development | \`--permission-mode acceptEdits\` | Auto-approves file edits, prompts for commands |
+>
+> You do NOT need \`--dangerously-skip-permissions\`. The modes above provide autonomy with safety.
+
+**If the user chose Autonomous git:** Recommend \`auto\` mode as a good default -- it provides autonomy while the safety classifier catches risky operations. Note that \`dontAsk\` is even more autonomous but requires a well-configured allowlist.
+
+**If the user chose Cautious git:** Recommend \`auto\` mode -- it matches their preference for safety with less manual intervention than the default.
+
+**If the risk interview reveals production databases, live APIs, or billing systems:** Upgrade the recommendation to \`dontAsk\` with a tight allowlist. Explain that \`dontAsk\` with explicit deny patterns is safer than \`auto\` for high-risk environments because it uses a deterministic allowlist rather than a classifier.
+
+This is informational only -- do not change the user's permission mode. Just tell them what to use when they launch Claude Code.
+
 ### Risk Interview
 
 Before applying upgrades, ask 3-5 targeted questions to capture what's dangerous in this project. Skip this if \`docs/context/production-map.md\` or \`docs/context/dangerous-assumptions.md\` already exist (offer to update instead).
@@ -1300,6 +1322,26 @@ Adjust the content based on the actual interview responses:
 - Only include NEVER rules for directories/files the user specified
 - If the user allowed certain network tools or package managers, exclude those
 
+## Recommended Permission Mode
+
+After generating the boundaries above, also recommend a Claude Code permission mode. Include this section in your output:
+
+\`\`\`
+### Recommended Permission Mode
+
+You don't need \\\`--dangerously-skip-permissions\\\`. Safer alternatives exist:
+
+| Your situation | Use | Why |
+|---|---|---|
+| Autonomous spec execution | \\\`--permission-mode dontAsk\\\` + allowlist above | Only pre-approved commands run |
+| Long session with some trust | \\\`--permission-mode auto\\\` | Safety classifier reviews each action |
+| Interactive development | \\\`--permission-mode acceptEdits\\\` | Auto-approves file edits, prompts for commands |
+
+**For lockdown mode, we recommend \\\`--permission-mode dontAsk\\\`** combined with the deny patterns above. This gives you full autonomy for allowed operations while blocking everything else -- no classifier overhead, no prompts, and no safety bypass.
+
+\\\`--dangerously-skip-permissions\\\` disables ALL safety checks. The modes above give you autonomy without removing the guardrails.
+\`\`\`
+
 ## Step 4: Offer to Apply
 
 If the user asks you to apply the changes:
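Aside: the settings.json step in this diff previews what the `permissions.deny` array will look like after adding new patterns. A minimal sketch of that merge, with a hypothetical helper name and placeholder pattern strings (none of this is joycraft code):

```typescript
// Hypothetical shape of .claude/settings.json for this sketch only.
type Settings = { permissions?: { deny?: string[] } };

// Merge new deny patterns into the existing array, preserving order
// and dropping duplicates, without mutating the input.
function mergeDenyPatterns(settings: Settings, newPatterns: string[]): Settings {
  const existing = settings.permissions?.deny ?? [];
  const deny = [...new Set([...existing, ...newPatterns])];
  return { ...settings, permissions: { ...settings.permissions, deny } };
}

// Preview the result before asking the user to confirm the write.
// "pattern-a"/"pattern-b" are placeholders, not real deny syntax.
const current: Settings = { permissions: { deny: ["pattern-a"] } };
const merged = mergeDenyPatterns(current, ["pattern-b", "pattern-a"]);
console.log(JSON.stringify(merged.permissions, null, 2));
```

The dedupe matters because re-running the interview should not append the same pattern twice.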
@@ -1308,6 +1350,149 @@ If the user asks you to apply the changes:
 2. **For settings.json:** Read the existing \`.claude/settings.json\`, show the user what the \`permissions.deny\` array will look like after adding the new patterns. Ask for confirmation before writing.
 
 **Never auto-apply. Always show the exact changes and wait for explicit approval.**
+`,
+"joycraft-verify.md": `---
+name: joycraft-verify
+description: Spawn an independent verifier subagent to check an implementation against its spec -- read-only, no code edits, structured pass/fail verdict
+---
+
+# Verify Implementation Against Spec
+
+The user wants independent verification of an implementation. Your job is to find the relevant spec, extract its acceptance criteria and test plan, then spawn a separate verifier subagent that checks each criterion and produces a structured verdict.
+
+**Why a separate subagent?** Anthropic's research found that agents reliably skew positive when grading their own work. Separating the agent doing the work from the agent judging it consistently outperforms self-evaluation. The verifier gets a clean context window with no implementation bias.
+
+## Step 1: Find the Spec
+
+If the user provided a spec path (e.g., \`/joycraft-verify docs/specs/2026-03-26-add-widget.md\`), use that path directly.
+
+If no path was provided, scan \`docs/specs/\` for spec files. Pick the most recently modified \`.md\` file in that directory. If \`docs/specs/\` doesn't exist or is empty, tell the user:
+
+> No specs found in \`docs/specs/\`. Please provide a spec path: \`/joycraft-verify path/to/spec.md\`
+
+## Step 2: Read and Parse the Spec
+
+Read the spec file and extract:
+
+1. **Spec name** -- from the H1 title
+2. **Acceptance Criteria** -- the checklist under the \`## Acceptance Criteria\` section
+3. **Test Plan** -- the table under the \`## Test Plan\` section, including any test commands
+4. **Constraints** -- the \`## Constraints\` section if present
+
+If the spec has no Acceptance Criteria section, tell the user:
+
+> This spec doesn't have an Acceptance Criteria section. Verification needs criteria to check against. Add acceptance criteria to the spec and try again.
+
+If the spec has no Test Plan section, note this but proceed -- the verifier can still check criteria by reading code and running any available project tests.
+
+## Step 3: Identify Test Commands
+
+Look for test commands in these locations (in priority order):
+
+1. The spec's Test Plan section (look for commands in backticks or "Type" column entries like "unit", "integration", "e2e", "build")
+2. The project's CLAUDE.md (look for test/build commands in the Development Workflow section)
+3. Common defaults based on the project type:
+   - Node.js: \`npm test\` or \`pnpm test --run\`
+   - Python: \`pytest\`
+   - Rust: \`cargo test\`
+   - Go: \`go test ./...\`
+
+Build a list of specific commands the verifier should run.
+
+## Step 4: Spawn the Verifier Subagent
+
+Use Claude Code's Agent tool to spawn a subagent with the following prompt. Replace the placeholders with the actual content extracted in Steps 2-3.
+
+\`\`\`
+You are a QA verifier. Your job is to independently verify an implementation against its spec. You have NO context about how the implementation was done -- you are checking it fresh.
+
+RULES -- these are hard constraints, not suggestions:
+- You may READ any file using the Read tool or cat
+- You may RUN these specific test/build commands: [TEST_COMMANDS]
+- You may NOT edit, create, or delete any files
+- You may NOT run commands that modify state (no git commit, no npm install, no file writes)
+- You may NOT install packages or access the network
+- Report what you OBSERVE, not what you expect or hope
+
+SPEC NAME: [SPEC_NAME]
+
+ACCEPTANCE CRITERIA:
+[ACCEPTANCE_CRITERIA]
+
+TEST PLAN:
+[TEST_PLAN]
+
+CONSTRAINTS:
+[CONSTRAINTS_OR_NONE]
+
+YOUR TASK:
+For each acceptance criterion, determine if it PASSES or FAILS based on evidence:
+
+1. Run the test commands listed above. Record the output.
+2. For each acceptance criterion:
+   a. Check if there is a corresponding test and whether it passes
+   b. If no test exists, read the relevant source files to verify the criterion is met
+   c. If the criterion cannot be verified by reading code or running tests, mark it MANUAL CHECK NEEDED
+3. For criteria about build/test passing, actually run the commands and report results.
+
+OUTPUT FORMAT -- you MUST use this exact format:
+
+VERIFICATION REPORT
+
+| # | Criterion | Verdict | Evidence |
+|---|-----------|---------|----------|
+| 1 | [criterion text] | PASS/FAIL/MANUAL CHECK NEEDED | [what you observed] |
+| 2 | [criterion text] | PASS/FAIL/MANUAL CHECK NEEDED | [what you observed] |
+[continue for all criteria]
+
+SUMMARY: X/Y criteria passed. [Z failures need attention. / All criteria verified.]
+
+If any test commands fail to run (missing dependencies, wrong command, etc.), report the error as evidence for a FAIL verdict on the relevant criterion.
+\`\`\`
+
+## Step 5: Format and Present the Verdict
+
+Take the subagent's response and present it to the user in this format:
+
+\`\`\`
+## Verification Report -- [Spec Name]
+
+| # | Criterion | Verdict | Evidence |
+|---|-----------|---------|----------|
+| 1 | ... | PASS | ... |
+| 2 | ... | FAIL | ... |
+
+**Overall: X/Y criteria passed.**
+
+[If all passed:]
+All criteria verified. Ready to commit and open a PR.
+
+[If any failed:]
+N failures need attention. Review the evidence above and fix before proceeding.
+
+[If any MANUAL CHECK NEEDED:]
+N criteria need manual verification -- they can't be checked by reading code or running tests alone.
+\`\`\`
+
+## Step 6: Suggest Next Steps
+
+Based on the verdict:
+
+- **All PASS:** Suggest committing and opening a PR, or running \`/joycraft-session-end\` to capture discoveries.
+- **Some FAIL:** List the failed criteria and suggest the user fix them, then run \`/joycraft-verify\` again.
+- **MANUAL CHECK NEEDED items:** Explain what needs human eyes and why automation couldn't verify it.
+
+**Do NOT offer to fix failures yourself.** The verifier reports; the human (or implementation agent in a separate turn) decides what to do. This separation is the whole point.
+
+## Edge Cases
+
+| Scenario | Behavior |
+|----------|----------|
+| Spec has no Test Plan | Warn that verification is weaker without a test plan, but proceed by checking criteria through code reading and any available project-level tests |
+| All tests pass but a criterion is not testable | Mark as MANUAL CHECK NEEDED with explanation |
+| Subagent can't run tests (missing deps) | Report the error as FAIL evidence |
+| No specs found and no path given | Tell user to provide a spec path or create a spec first |
+| Spec status is "Complete" | Still run verification -- "Complete" means the implementer thinks it's done, verification confirms |
 `
 };
 var TEMPLATES = {
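Aside: the spec-parsing step and the SUMMARY line in the verifier prompt above are easy to mechanize. A hedged TypeScript sketch (the function names and the checklist regex are assumptions for illustration, not joycraft internals):

```typescript
// Pull the checklist items under "## Acceptance Criteria" from a spec's markdown.
function extractCriteria(spec: string): string[] {
  const lines = spec.split("\n");
  const start = lines.findIndex(l => l.trim() === "## Acceptance Criteria");
  if (start === -1) return [];
  const out: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith("## ")) break;               // next H2 ends the section
    const m = line.match(/^\s*- \[[ xX]\]\s*(.*)$/); // markdown checklist item
    if (m) out.push(m[1]);
  }
  return out;
}

// Build the "SUMMARY: X/Y criteria passed." line from per-criterion verdicts.
function summaryLine(verdicts: Array<"PASS" | "FAIL" | "MANUAL CHECK NEEDED">): string {
  const passed = verdicts.filter(v => v === "PASS").length;
  const failed = verdicts.filter(v => v === "FAIL").length;
  const tail = failed > 0 ? `${failed} failures need attention.` : "All criteria verified.";
  return `SUMMARY: ${passed}/${verdicts.length} criteria passed. ${tail}`;
}

// Demo with a tiny hypothetical spec.
const spec = [
  "# Add Widget",
  "## Acceptance Criteria",
  "- [ ] Widget renders",
  "- [x] Build passes",
  "## Test Plan",
].join("\n");
console.log(extractCriteria(spec));
console.log(summaryLine(["PASS", "FAIL"]));
```

Stopping at the next `## ` heading is what keeps Test Plan rows out of the criteria list.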
@@ -1689,13 +1874,59 @@ is required, though you can add one via the GitHub Checks API if you prefer.
 
 ---
 
+## Testing by Stack Type
+
+The scenario agent selects the appropriate test format based on the project's
+testing backbone. Each backbone tests the same holdout principle — observable
+behavior only, no source imports — but uses different tools.
+
+### Web Apps (Playwright)
+
+For Next.js, Vite, Nuxt, Remix, and other web frameworks. Tests run against a
+dev server or preview URL using a headless browser.
+
+- **Template:** \`example-scenario-web.spec.ts\`
+- **Config:** \`playwright.config.ts\`
+- **Package:** \`package-web.json\` (use instead of \`package.json\` for web projects)
+- **Run:** \`npx playwright test\`
+
+### Mobile Apps (Maestro)
+
+For React Native, Flutter, and native iOS/Android. Tests are declarative YAML
+flows that interact with a running app on a simulator.
+
+- **Template:** \`example-scenario-mobile.yaml\`
+- **Login sub-flow:** \`example-scenario-mobile-login.yaml\`
+- **Setup guide:** \`README-mobile.md\`
+- **Run:** \`maestro test example-scenario-mobile.yaml\`
+
+### API Backends (HTTP)
+
+For Express, FastAPI, Django, and other API-only backends. Tests send HTTP
+requests using Node.js built-in \`fetch\`.
+
+- **Template:** \`example-scenario-api.test.ts\`
+- **Run:** \`npx vitest run\`
+
+### CLI Tools & Libraries (native)
+
+For CLI tools, npm packages, and non-UI projects. Tests invoke the built
+binary via \`spawnSync\` and assert on stdout/stderr.
+
+- **Template:** \`example-scenario.test.ts\`
+- **Run:** \`npx vitest run\`
+
+---
+
 ## Adding scenarios
 
 ### Rules
 
-
-
-
+These rules apply to ALL backbones:
+
+1. **Behavioral, not structural.** Test what the app does from a user's
+   perspective. For web: navigate and assert on content. For CLI: run commands
+   and check output. For API: send requests and check responses.
 
 2. **End-to-end.** Each test should represent something a real user would
    actually do. If you would not put it in a demo or docs example, reconsider
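Aside: the per-backbone run commands listed in this hunk can be captured in a small lookup. A sketch with an assumed `Backbone` union type (the type and function are illustrative, not part of the package):

```typescript
// Backbones as described in the README section above.
type Backbone = "web" | "mobile" | "api" | "cli";

// Map a detected backbone to the run command the README lists for it.
function runCommand(backbone: Backbone): string {
  switch (backbone) {
    case "web":
      return "npx playwright test";
    case "mobile":
      return "maestro test example-scenario-mobile.yaml";
    case "api":
    case "cli":
      return "npx vitest run"; // API and CLI backbones share the vitest runner
  }
}

console.log(runCommand("web"));
```

The shared `api`/`cli` case mirrors the README: both use vitest, differing only in whether tests speak HTTP or spawn the built binary.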
@@ -1705,9 +1936,8 @@ is required, though you can add one via the GitHub Checks API if you prefer.
 see source code. Any \`import\` that reaches into \`../main-repo/src\` breaks
 the pattern.
 
-4. **Independent.** Each test must be able to run in isolation.
-
-state between tests.
+4. **Independent.** Each test must be able to run in isolation. No shared
+   mutable state between tests.
 
 5. **Deterministic.** Avoid network calls, timestamps, or random values in
    assertions unless the feature under test genuinely involves them.
@@ -1716,31 +1946,25 @@ is required, though you can add one via the GitHub Checks API if you prefer.
 
 \`\`\`
 $SCENARIOS_REPO/
-├── example-scenario.test.ts
+├── example-scenario.test.ts             # CLI/binary scenario template
+├── example-scenario-web.spec.ts         # Web app scenario template (Playwright)
+├── example-scenario-api.test.ts         # API backend scenario template
+├── example-scenario-mobile.yaml         # Mobile app scenario template (Maestro)
+├── example-scenario-mobile-login.yaml   # Reusable login sub-flow
+├── playwright.config.ts                 # Playwright config (web projects)
+├── package.json                         # Default (vitest for CLI/API)
+├── package-web.json                     # Alternative (Playwright for web)
+├── README-mobile.md                     # Mobile testing setup guide
 ├── workflows/
-│ \
-\
+│   ├── run.yml                          # CI workflow (do not rename)
+│   └── generate.yml                     # Scenario generation workflow
+├── prompts/
+│   └── scenario-agent.md                # Scenario agent instructions
 └── README.md
 \`\`\`
 
-
-
-
-### Example structure
-
-\`\`\`ts
-import { spawnSync } from "node:child_process";
-import { join } from "node:path";
-
-const CLI = join(__dirname, "..", "main-repo", "dist", "cli.js");
-
-it("init creates a CLAUDE.md file", () => {
-  const tmp = mkdtempSync(join(tmpdir(), "scenario-"));
-  const { status } = spawnSync("node", [CLI, "init", tmp], { encoding: "utf8" });
-  expect(status).toBe(0);
-  expect(existsSync(join(tmp, "CLAUDE.md"))).toBe(true);
-});
-\`\`\`
+Use the template that matches your project's stack. Remove the ones you
+don't need.
 
 ---
 
@@ -1752,6 +1976,7 @@ it("init creates a CLAUDE.md file", () => {
 | Visible to agent | Yes | No |
 | What they test | Units, modules, logic | End-to-end behavior |
 | Import source code | Yes | Never |
+| Test method | Unit test framework | Depends on backbone (Playwright/Maestro/vitest/fetch) |
 | Run on every push | Yes | Yes (via dispatch) |
 | Purpose | Catch regressions fast | Validate real behavior |
 
@@ -1900,6 +2125,304 @@ describe("CLI: init command (example — replace with your real scenarios)",
 }
 }
 `,
+"scenarios/package-web.json": `{
+  "name": "$SCENARIOS_REPO",
+  "version": "0.0.1",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "test": "playwright test"
+  },
+  "devDependencies": {
+    "@playwright/test": "^1.50.0"
+  }
+}
+`,
+"scenarios/playwright.config.ts": `import { defineConfig } from '@playwright/test';
+
+/**
+ * Playwright configuration for holdout scenario tests.
+ *
+ * BASE_URL can be set to test against a preview deployment URL
+ * or defaults to http://localhost:3000 for local dev server testing.
+ */
+export default defineConfig({
+  testDir: '.',
+  testMatch: '**/*.spec.ts',
+  timeout: 60_000,
+  retries: 0,
+  use: {
+    baseURL: process.env.BASE_URL || 'http://localhost:3000',
+    headless: true,
+    screenshot: 'only-on-failure',
+  },
+  projects: [
+    { name: 'chromium', use: { browserName: 'chromium' } },
+  ],
+});
+`,
+"scenarios/example-scenario-web.spec.ts": `/**
+ * Example Web Scenario Test (Playwright)
+ *
+ * This file is a template for scenario tests against web applications.
+ * The holdout pattern applies: test the running app through its UI,
+ * never import source code from the main repo.
+ *
+ * The main repo is available at ../main-repo and is already built.
+ * Tests run against either:
+ *   - A dev server started from ../main-repo (default)
+ *   - A preview deployment URL (set BASE_URL env var)
+ *
+ * DO:
+ *   - Navigate to pages, click elements, fill forms, assert on visible content
+ *   - Use page.locator() with accessible selectors (role, text, test-id)
+ *   - Keep each test fully independent
+ *
+ * DON'T:
+ *   - Import from ../main-repo/src — that defeats the holdout
+ *   - Test internal implementation details
+ *   - Rely on specific CSS classes or DOM structure (use accessible selectors)
+ */
+
+import { test, expect } from '@playwright/test';
+import { spawn, type ChildProcess } from 'node:child_process';
+import { join } from 'node:path';
+
+const MAIN_REPO = join(__dirname, '..', 'main-repo');
+let serverProcess: ChildProcess | undefined;
+
+/**
+ * Wait for a URL to become reachable.
+ */
+async function waitForServer(url: string, timeoutMs = 60_000): Promise<void> {
+  const start = Date.now();
+  while (Date.now() - start < timeoutMs) {
+    try {
+      const res = await fetch(url);
+      if (res.ok || res.status < 500) return;
+    } catch {
+      // Server not ready yet
+    }
+    await new Promise(r => setTimeout(r, 1000));
+  }
+  throw new Error(\`Server at \${url} did not become ready within \${timeoutMs}ms\`);
+}
+
+test.beforeAll(async () => {
+  // If BASE_URL is set, skip starting a dev server — test against the provided URL
+  if (process.env.BASE_URL) return;
+
+  serverProcess = spawn('npm', ['run', 'dev'], {
+    cwd: MAIN_REPO,
+    stdio: 'pipe',
+    env: { ...process.env, PORT: '3000' },
+  });
+
+  await waitForServer('http://localhost:3000');
+});
+
+test.afterAll(async () => {
+  if (serverProcess) {
+    serverProcess.kill('SIGTERM');
+    serverProcess = undefined;
+  }
+});
+
+// ---------------------------------------------------------------------------
+// Example scenarios — replace with real tests for your application
+// ---------------------------------------------------------------------------
+
+test.describe('Home page', () => {
+  test('loads successfully and shows main heading', async ({ page }) => {
+    await page.goto('/');
+    // Replace with your app's actual heading or key element
+    await expect(page.locator('h1')).toBeVisible();
+  });
+
+  test('navigates to a subpage', async ({ page }) => {
+    await page.goto('/');
+    // Replace with your app's actual navigation
+    // await page.click('text=About');
+    // await expect(page).toHaveURL(/\\/about/);
+    // await expect(page.locator('h1')).toContainText('About');
+  });
+});
+`,
+"scenarios/example-scenario-api.test.ts": `/**
+ * Example API Scenario Test
+ *
+ * This file is a template for scenario tests against API-only backends.
+ * The holdout pattern applies: test the running server via HTTP requests,
+ * never import route handlers or source code from the main repo.
+ *
+ * The main repo is available at ../main-repo and is already built.
+ * Tests run against either:
+ *   - A server started from ../main-repo (default)
+ *   - A deployed URL (set BASE_URL env var)
+ *
+ * Uses Node.js built-in fetch — no additional HTTP client dependencies.
+ *
+ * DO:
+ *   - Send HTTP requests to endpoints, assert on status codes and response bodies
+ *   - Test realistic user actions (create, read, update, delete flows)
+ *   - Keep each test fully independent
+ *
+ * DON'T:
+ *   - Import from ../main-repo/src — that defeats the holdout
+ *   - Use supertest or similar tools that import the app directly
+ *   - Test internal implementation details
+ */
+
+import { describe, it, expect, beforeAll, afterAll } from 'vitest';
+import { spawn, type ChildProcess } from 'node:child_process';
+import { join } from 'node:path';
+
+const MAIN_REPO = join(__dirname, '..', 'main-repo');
+const BASE_URL = process.env.BASE_URL || 'http://localhost:3000';
+let serverProcess: ChildProcess | undefined;
+
+/**
+ * Wait for a URL to become reachable.
+ */
+async function waitForServer(url: string, timeoutMs = 60_000): Promise<void> {
+  const start = Date.now();
+  while (Date.now() - start < timeoutMs) {
+    try {
+      const res = await fetch(url);
+      if (res.ok || res.status < 500) return;
+    } catch {
+      // Server not ready yet
+    }
+    await new Promise(r => setTimeout(r, 1000));
+  }
+  throw new Error(\`Server at \${url} did not become ready within \${timeoutMs}ms\`);
+}
+
+beforeAll(async () => {
+  // If BASE_URL is set externally, skip starting a server
+  if (process.env.BASE_URL) return;
+
+  serverProcess = spawn('npm', ['start'], {
+    cwd: MAIN_REPO,
+    stdio: 'pipe',
+    env: { ...process.env, PORT: '3000' },
+  });
+
+  await waitForServer(BASE_URL);
+}, 90_000);
+
+afterAll(() => {
+  if (serverProcess) {
+    serverProcess.kill('SIGTERM');
+    serverProcess = undefined;
+  }
+});
+
+// ---------------------------------------------------------------------------
+// Example scenarios — replace with real tests for your API
+// ---------------------------------------------------------------------------
+
+describe('API health', () => {
+  it('GET / returns a success status', async () => {
+    const res = await fetch(\`\${BASE_URL}/\`);
+    expect(res.status).toBeLessThan(500);
+  });
+});
+
+describe('API endpoints', () => {
+  it('GET /api/example returns JSON', async () => {
+    const res = await fetch(\`\${BASE_URL}/api/example\`);
+    // Replace with your actual endpoint
+    // expect(res.status).toBe(200);
+    // const body = await res.json();
+    // expect(body).toHaveProperty('data');
+  });
+
+  it('POST /api/example creates a resource', async () => {
+    // Replace with your actual endpoint and payload
+    // const res = await fetch(\\\`\\\${BASE_URL}/api/example\\\`, {
+    //   method: 'POST',
+    //   headers: { 'Content-Type': 'application/json' },
+    //   body: JSON.stringify({ name: 'test' }),
+    // });
+    // expect(res.status).toBe(201);
+    // const body = await res.json();
+    // expect(body).toHaveProperty('id');
+  });
+
+  it('returns 404 for unknown routes', async () => {
+    const res = await fetch(\`\${BASE_URL}/api/does-not-exist\`);
+    expect(res.status).toBe(404);
+  });
+});
+`,
+"scenarios/example-scenario-mobile.yaml": `# Example Mobile Scenario Test (Maestro)
+#
+# This file is a template for scenario tests against mobile applications.
+# The holdout pattern applies: test the running app through its UI,
+# never reference source code from the main repo.
+#
+# Maestro tests are declarative YAML flows that interact with a running
+# app on a simulator/emulator. Install Maestro:
+#   curl -Ls "https://get.maestro.mobile.dev" | bash
+#
+# Run this flow:
+#   maestro test example-scenario-mobile.yaml
+#
+# DO:
+#   - Tap elements, fill inputs, assert on visible text
+#   - Use runFlow for reusable sub-flows (e.g., login)
+#   - Use assertWithAI for natural-language assertions
+#
+# DON'T:
+#   - Reference source code paths or internal identifiers
+#   - Depend on exact pixel positions (use text and accessibility labels)
+
+appId: com.example.myapp # Replace with your app's bundle identifier
+name: "Core User Journey"
+tags:
+  - smoke
+  - holdout
+---
+# Step 1: Launch the app
+- launchApp
+
+# Step 2: Login (using a reusable sub-flow)
+- runFlow: example-scenario-mobile-login.yaml
+
+# Step 3: Verify the main screen loaded
+- assertVisible: "Home"
+
+# Step 4: Navigate to a feature
+# - tapOn: "Settings"
+# - assertVisible: "Account"
+
+# Step 5: AI-powered assertion (natural language)
+# - assertWithAI: "The main dashboard is visible with navigation tabs at the bottom"
+
+# Step 6: Go back
+# - back
+# - assertVisible: "Home"
+`,
+"scenarios/example-scenario-mobile-login.yaml": `# Reusable Login Sub-Flow (Maestro)
+#
+# This flow handles authentication. Other flows include it via:
+#   - runFlow: example-scenario-mobile-login.yaml
+#
+# Replace the selectors and credentials with your app's actual login flow.
+
+appId: com.example.myapp
+name: "Login"
+---
+- assertVisible: "Sign In"
+- tapOn: "Email"
+- inputText: "test@example.com"
+- tapOn: "Password"
+- inputText: "testpassword123"
+- tapOn: "Log In"
+- assertVisible: "Home" # Verify login succeeded
+`,
+"scenarios/README-mobile.md": '# Mobile Scenario Testing with Maestro\n\nThis guide explains how to set up and run mobile holdout scenario tests using [Maestro](https://maestro.dev/).\n\n## Prerequisites\n\n- **Maestro CLI:** `curl -Ls "https://get.maestro.mobile.dev" | bash`\n- **Java 17+** (required by Maestro)\n- **Simulator/Emulator:**\n - iOS: Xcode with iOS Simulator (macOS only)\n - Android: Android Studio with an AVD configured\n\n> **Important:** Joycraft does not install Maestro or manage simulators. This is your responsibility.\n\n## Running Tests Locally\n\n```bash\n# Boot your simulator/emulator first, then:\nmaestro test example-scenario-mobile.yaml\n\n# Run all flows in a directory:\nmaestro test .maestro/\n```\n\n## Writing Flows\n\nMaestro flows are declarative YAML. Core commands:\n\n| Command | Purpose |\n|---------|--------|\n| `launchApp` | Start or restart the app |\n| `tapOn: "text"` | Tap an element by visible text or test ID |\n| `inputText: "value"` | Type into a focused field |\n| `assertVisible: "text"` | Assert an element is on screen |\n| `assertNotVisible: "text"` | Assert an element is NOT on screen |\n| `scroll` | Scroll down |\n| `back` | Press the back button |\n| `runFlow: file.yaml` | Run a reusable sub-flow |\n| `assertWithAI: "description"` | Natural-language assertion (AI-powered) |\n\n## CI Options\n\n### Option A: Maestro Cloud (paid, easiest)\n\nUpload your app binary and flows to Maestro Cloud. No simulator management.\n\n```yaml\n- uses: mobile-dev-inc/action-maestro-cloud@v2\n with:\n api-key: ${{ secrets.MAESTRO_API_KEY }}\n app-file: app.apk # or app.ipa\n workspace: .\n```\n\n### Option B: Self-hosted emulator (free, more setup)\n\nSpin up an Android emulator on a Linux runner or iOS simulator on a macOS runner.\n\n> **Cost note:** macOS GitHub Actions runners are ~10x more expensive than Linux runners.\n\n## The Holdout Pattern\n\nThese tests live in the scenarios repo, separate from the main codebase. 
The scenario agent generates them from specs. They test observable behavior through the app\'s UI \u2014 never referencing source code or internal implementation.\n',
|
|
1903
2426
|
"scenarios/prompts/scenario-agent.md": `You are a QA engineer working in a holdout test repository. You CANNOT access the main repository's source code. Your job is to write or update behavioral scenario tests based on specs that are pushed from the main repo.
|
|
1904
2427
|
|
|
1905
2428
|
## What You Have Access To
|
|
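The fetch-based 404 check added at the top of this hunk assumes a server is already listening at `BASE_URL`. A self-contained sketch of the same holdout-style assertion, substituting a throwaway `node:http` server for the real app (the `/api/health` route is illustrative, not part of joycraft):

```typescript
import { createServer } from 'node:http';
import { once } from 'node:events';

// Stand-in for the app under test; the routes here are hypothetical.
const server = createServer((req, res) => {
  if (req.url === '/api/health') {
    res.writeHead(200, { 'content-type': 'application/json' });
    res.end(JSON.stringify({ ok: true }));
  } else {
    res.writeHead(404);
    res.end('Not Found');
  }
});

server.listen(0); // port 0 = pick any free port
await once(server, 'listening');
const { port } = server.address() as { port: number };
const BASE_URL = `http://127.0.0.1:${port}`;

// Holdout-style assertions: only observable HTTP behavior, no app imports.
const known = await fetch(`${BASE_URL}/api/health`);
const unknown = await fetch(`${BASE_URL}/api/does-not-exist`);
console.log(known.status, unknown.status); // 200 404

server.close();
```

In the real holdout repo, `BASE_URL` would point at the built app from `../main-repo` instead of a stub server.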
@@ -1907,7 +2430,23 @@ describe("CLI: init command (example \u2014 replace with your real scenarios)",
 - This scenarios repository (test files, \`specs/\` mirror, \`package.json\`)
 - The incoming spec (provided below)
 - A list of existing test files and spec mirrors (provided below)
-- The main repo is available at \`../main-repo\` and is already built
+- The main repo is available at \`../main-repo\` and is already built
+- The testing strategy for this project (provided below)
+
+## Testing Strategy
+
+This project uses the **$TESTING_BACKBONE** testing backbone.
+
+Select the correct test format based on the backbone:
+
+| Backbone | Tool | Test Format | File Extension | How to Test |
+|----------|------|-------------|---------------|-------------|
+| \`playwright\` | Playwright | Browser-based E2E | \`.spec.ts\` | Navigate pages, click elements, assert on visible content |
+| \`maestro\` | Maestro | YAML flows | \`.yaml\` | Tap elements, fill inputs, assert on screen state |
+| \`api\` | fetch (Node.js built-in) | HTTP requests | \`.test.ts\` | Send requests to endpoints, assert on responses |
+| \`native\` | vitest + spawnSync | CLI/binary invocation | \`.test.ts\` | Run commands, assert on stdout/stderr/exit codes |
+
+If the backbone is not provided or unrecognized, default to \`native\`.
 
 ## Triage Decision Tree
 
@@ -1926,7 +2465,7 @@ If you SKIP, write a brief comment in the relevant test file (or a new one) expl
 - A new output format or file that gets generated
 - A new user-facing behavior that doesn't map to any existing test file
 
-Name the file after the feature area
+Name the file after the feature area using the correct extension for the backbone.
 
 ### UPDATE \u2014 Modify an existing test file if the spec:
 - Changes behavior that is already tested
@@ -1937,25 +2476,20 @@ Match to the most relevant existing test file by feature area.
 
 **If you are unsure whether a spec is user-facing, err on the side of writing a test.**
 
-## Test Writing Rules
+## Test Writing Rules (All Backbones)
 
-1. **Behavioral only.** Test observable
+1. **Behavioral only.** Test observable behavior \u2014 what a real user would see. Never test internal implementation details or import source modules.
+2. **Each test is fully independent.** No shared mutable state between tests.
+3. **Assert on realistic user actions.** Write tests that reflect what a real user would do.
+4. **Never import from the parent repo's source.** If you find yourself writing \`import { ... } from '../main-repo/src/...'\`, stop \u2014 that defeats the holdout.
 
-
+## Backbone: native (CLI/Binary)
 
-
-
-4. **Each test is fully independent.** No shared mutable state between tests. Each test that touches the filesystem gets its own temp directory via \`mkdtempSync\`.
-
-5. **Assert on realistic user actions.** Write tests that reflect what a real user would do \u2014 not what the implementation happens to do.
-
-6. **Never import from the parent repo's source.** If you find yourself writing \`import { ... } from '../main-repo/src/...'\`, stop \u2014 that defeats the holdout.
-
-## Test File Template
+Use when the project is a CLI tool, library, or has no web/mobile UI.
 
 \`\`\`typescript
-import {
-import {
+import { spawnSync } from 'node:child_process';
+import { mkdtempSync, rmSync } from 'node:fs';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { describe, it, expect, beforeEach, afterEach } from 'vitest';
@@ -1968,39 +2502,122 @@ function runCLI(args: string[], cwd?: string) {
     cwd: cwd ?? process.cwd(),
     env: { ...process.env, NO_COLOR: '1' },
   });
-  return {
-    stdout: result.stdout ?? '',
-    stderr: result.stderr ?? '',
-    status: result.status ?? 1,
-  };
+  return { stdout: result.stdout ?? '', stderr: result.stderr ?? '', status: result.status ?? 1 };
 }
 
-describe('[feature area]
+describe('[feature area]', () => {
   let tmpDir: string;
+  beforeEach(() => { tmpDir = mkdtempSync(join(tmpdir(), 'scenarios-')); });
+  afterEach(() => { rmSync(tmpDir, { recursive: true, force: true }); });
 
-
-
+  it('[observable behavior]', () => {
+    const { stdout, status } = runCLI(['command', 'args'], tmpDir);
+    expect(status).toBe(0);
+    expect(stdout).toContain('expected output');
   });
+});
+\`\`\`
 
-
-
+## Backbone: playwright (Web Apps)
+
+Use when the project is a web application (Next.js, Vite, Nuxt, etc.).
+
+\`\`\`typescript
+import { test, expect } from '@playwright/test';
+
+// Tests run against BASE_URL (configured in playwright.config.ts)
+// The dev server is started automatically or BASE_URL points to a preview deploy
+
+test.describe('[feature area]', () => {
+  test('[observable behavior]', async ({ page }) => {
+    await page.goto('/');
+    await expect(page.locator('h1')).toBeVisible();
   });
 
-
-
-
-
+  test('[user interaction]', async ({ page }) => {
+    await page.goto('/login');
+    await page.fill('[name="email"]', 'test@example.com');
+    await page.click('button[type="submit"]');
+    await expect(page).toHaveURL(/dashboard/);
   });
 });
 \`\`\`
 
+## Backbone: api (API Backends)
+
+Use when the project is an API-only backend (Express, FastAPI, etc.).
+
+\`\`\`typescript
+import { describe, it, expect } from 'vitest';
+
+const BASE_URL = process.env.BASE_URL || 'http://localhost:3000';
+
+describe('[feature area]', () => {
+  it('[endpoint behavior]', async () => {
+    const res = await fetch(\\\`\\\${BASE_URL}/api/endpoint\\\`);
+    expect(res.status).toBe(200);
+    const body = await res.json();
+    expect(body).toHaveProperty('data');
+  });
+
+  it('[error handling]', async () => {
+    const res = await fetch(\\\`\\\${BASE_URL}/api/not-found\\\`);
+    expect(res.status).toBe(404);
+  });
+});
+\`\`\`
+
+## Backbone: maestro (Mobile Apps)
+
+Use when the project is a mobile application (React Native, Flutter, native iOS/Android).
+
+\`\`\`yaml
+appId: com.example.myapp
+name: "[feature area]: [behavior being tested]"
+tags:
+  - holdout
+---
+- launchApp
+- tapOn: "Sign In"
+- inputText: "test@example.com"
+- tapOn: "Submit"
+- assertVisible: "Welcome"
+# Use assertWithAI for complex visual assertions:
+# - assertWithAI: "The dashboard shows a list of recent items"
+\`\`\`
+
+## Graceful Degradation
+
+If the primary backbone tool is not available in this repo, fall back to the next deepest testable layer:
+
+| Layer | What's Tested | When to Use |
+|-------|-------------|-------------|
+| **Layer 4: UI** | Full user flows through browser/simulator | \`@playwright/test\` or Maestro is installed |
+| **Layer 3: API** | HTTP requests against running server | Server can be started from \`../main-repo\` |
+| **Layer 2: Logic** | Unit tests via test runner | Test runner (vitest/jest) is available |
+| **Layer 1: Static** | Build, typecheck, lint | Build toolchain is available |
+
+**Fallback rules:**
+- If backbone is \`playwright\` but \`@playwright/test\` is NOT in this repo's \`package.json\`: fall back to \`api\` (fetch-based HTTP tests)
+- If backbone is \`maestro\` but no simulator context is available: fall back to \`api\` if a server can be started, else \`native\`
+- If backbone is \`api\` but no server start script exists: fall back to \`native\`
+- \`native\` is always available as the floor
+
+Start each test file with a comment indicating the testing layer:
+\`// Testing Layer: [4|3|2|1] - [UI|API|Logic|Static]\`
+
+If you fell back from the intended backbone, note this in your commit message:
+\`scenarios: [action] for [spec] (layer: [N], reason: [why])\`
+
 ## Checklist Before Committing
 
 - [ ] Decision: SKIP / NEW / UPDATE (and why)
+- [ ] Correct backbone selected (or fallback justified)
 - [ ] Tests assert on observable behavior, not implementation
 - [ ] No imports from \`../main-repo/src\`
-- [ ] Each test
-- [ ] File
+- [ ] Each test is independent (own temp dir, own state)
+- [ ] File uses the correct extension for the backbone
+- [ ] Testing layer comment at top of file
 `,
 "scenarios/workflows/generate.yml": `# Scenario Generation Workflow
 #
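The hunk above shows only the tail of the `runCLI` helper the native-backbone template relies on. A self-contained reconstruction of that `spawnSync` wrapper, with the binary passed explicitly (the template presumably hardcodes the CLI under test, so this signature is an assumption):

```typescript
import { spawnSync } from 'node:child_process';

// Reconstruction of the template's spawnSync wrapper. NO_COLOR keeps the
// output free of ANSI escapes so string assertions stay stable.
function runCLI(command: string, args: string[], cwd?: string) {
  const result = spawnSync(command, args, {
    encoding: 'utf8',
    cwd: cwd ?? process.cwd(),
    env: { ...process.env, NO_COLOR: '1' },
  });
  // Nullish fallbacks: spawnSync reports a null status when the process was
  // killed by a signal, and undefined streams when spawning itself failed.
  return { stdout: result.stdout ?? '', stderr: result.stderr ?? '', status: result.status ?? 1 };
}

// Exercise a binary exactly as a user would: run it, read stdout and exit code.
const { stdout, status } = runCLI('node', ['--version']);
console.log(status, stdout.trim().startsWith('v')); // 0 true
```

Tests built on this helper never import the CLI's source, which is what keeps the holdout honest.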
@@ -2100,7 +2717,9 @@ jobs:
 ## Context
 
 Existing test files in this repo: \${{ steps.context.outputs.existing_tests }}
-Existing spec mirrors: \${{ steps.context.outputs.existing_specs }}
+Existing spec mirrors: \${{ steps.context.outputs.existing_specs }}
+
+Testing backbone: \${{ github.event.client_payload.testing_backbone || 'native' }}"
 
 # \u2500\u2500 7. Commit any changes the agent made \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500
 - name: Commit scenario changes
@@ -2676,4 +3295,4 @@ export {
   SKILLS,
   TEMPLATES
 };
-//# sourceMappingURL=chunk-
+//# sourceMappingURL=chunk-A2CQG5J5.js.map