npm - agent-scenario-loop - Versions diffs - 0.1.1 → 0.1.3 - Mend

agent-scenario-loop 0.1.1 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (69)

package/README.md +15 -9
package/app/profile-session.ts +98 -4
package/dist/core/agent-summary.d.ts +3 -2
package/dist/core/agent-summary.js +44 -2
package/dist/core/artifact-contract.d.ts +22 -4
package/dist/core/artifact-contract.js +512 -11
package/dist/core/comparison.d.ts +57 -3
package/dist/core/comparison.js +113 -1
package/dist/core/planner.d.ts +32 -1
package/dist/core/planner.js +144 -0
package/dist/core/run-index.d.ts +4 -0
package/dist/core/run-index.js +55 -1
package/dist/core/schema-validator.d.ts +1 -0
package/dist/core/schema-validator.js +1 -0
package/dist/runner/compare-latest.d.ts +8 -4
package/dist/runner/compare-latest.js +24 -5
package/dist/runner/example-android-live.d.ts +10 -1
package/dist/runner/example-android-live.js +55 -0
package/dist/runner/example-ios-live.d.ts +10 -1
package/dist/runner/example-ios-live.js +55 -0
package/dist/runner/init-project.d.ts +4 -1
package/dist/runner/init-project.js +26 -4
package/dist/runner/ios-simctl.d.ts +5 -0
package/dist/runner/ios-simctl.js +6 -0
package/dist/runner/live-comparison.d.ts +2 -2
package/dist/runner/live-comparison.js +2 -1
package/dist/runner/live-proof-summary.d.ts +5 -4
package/dist/runner/live-proof-summary.js +12 -2
package/dist/runner/live-proof.d.ts +3 -2
package/dist/runner/live-proof.js +9 -2
package/dist/runner/profile-android.d.ts +5 -0
package/dist/runner/profile-android.js +148 -24
package/dist/runner/profile-ios.d.ts +11 -1
package/dist/runner/profile-ios.js +128 -9
package/dist/runner/profile-mobile.d.ts +8 -0
package/dist/runner/profile-mobile.js +267 -28
package/docs/adapters.md +4 -0
package/docs/api.md +1 -1
package/docs/architecture.md +90 -0
package/docs/authoring.md +7 -1
package/docs/concepts.md +3 -24
package/docs/consumer-rehearsal.md +4 -0
package/docs/contracts.md +30 -100
package/docs/external-adapter-protocol.md +219 -0
package/docs/live-proofs.md +83 -2
package/docs/principles.md +9 -15
package/examples/mobile-app/README.md +12 -0
package/examples/mobile-app/runner-manifests/primary-runner.json +1 -0
package/examples/runners/README.md +1 -0
package/examples/runners/adb-android.json +1 -0
package/examples/runners/agent-device-android.json +1 -0
package/examples/runners/agent-device-ios.json +1 -0
package/examples/runners/argent-android.json +1 -0
package/examples/runners/argent-ios.json +1 -0
package/examples/runners/xcodebuildmcp-ios.json +1 -0
package/package.json +2 -1
package/schemas/causal-run.schema.json +85 -2
package/schemas/comparison.schema.json +130 -2
package/schemas/external-adapter-message.schema.json +693 -0
package/schemas/health.schema.json +72 -0
package/schemas/live-proof-set.schema.json +1 -1
package/schemas/live-proof.schema.json +14 -6
package/schemas/manifest.schema.json +442 -1
package/schemas/runner-capabilities.schema.json +20 -0
package/schemas/scenario.schema.json +16 -0
package/templates/primary-runner.json +1 -0
package/templates/skills/agent-scenario-loop/SKILL.md +93 -0
package/templates/skills/agent-scenario-loop/references/adoption-checklist.md +17 -0
package/templates/skills/agent-scenario-loop/references/artifact-interpretation.md +26 -0

package/schemas/scenario.schema.json CHANGED

@@ -331,6 +331,9 @@
         "selector": {
           "$ref": "#/$defs/selector"
         },
+        "uiContext": {
+          "$ref": "#/$defs/uiContext"
+        },
         "timeoutMs": {
           "type": "integer",
           "minimum": 1
@@ -361,6 +364,19 @@
         "collectPerfSignals"
       ]
     },
+    "uiContext": {
+      "type": "string",
+      "enum": [
+        "app",
+        "systemDialog",
+        "notificationShade",
+        "externalBrowser",
+        "webView",
+        "shareSheet",
+        "picker",
+        "otherApp"
+      ]
+    },
     "selector": {
       "type": "object",
       "additionalProperties": false,
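
The hunk above adds a `uiContext` enum to the schema and references it from a step-like object next to `timeoutMs`. A minimal TypeScript sketch of the same constraint follows. The names `isUiContext`, `StepLike`, and `validateStepLike` are hypothetical and not part of the package, and treating both fields as optional is an assumption, since the `required` list is not visible in this hunk.

```typescript
// The eight values mirror the enum added under "uiContext" in the hunk above.
const UI_CONTEXTS = [
  "app",
  "systemDialog",
  "notificationShade",
  "externalBrowser",
  "webView",
  "shareSheet",
  "picker",
  "otherApp",
] as const;

type UiContext = (typeof UI_CONTEXTS)[number];

// Type guard: true only for strings that appear in the enum.
function isUiContext(value: unknown): value is UiContext {
  return typeof value === "string" && UI_CONTEXTS.some((c) => c === value);
}

// Hypothetical step shape. Only these two property names come from the diff;
// both are assumed optional here.
interface StepLike {
  uiContext?: unknown;
  timeoutMs?: unknown;
}

// Returns a list of human-readable problems; an empty list means the step
// satisfies the two constraints visible in the hunk.
function validateStepLike(step: StepLike): string[] {
  const errors: string[] = [];
  if (step.uiContext !== undefined && !isUiContext(step.uiContext)) {
    errors.push("uiContext must be one of: " + UI_CONTEXTS.join(", "));
  }
  const t = step.timeoutMs;
  if (t !== undefined) {
    // "type": "integer", "minimum": 1
    const ok = typeof t === "number" && isFinite(t) && Math.floor(t) === t && t >= 1;
    if (!ok) {
      errors.push("timeoutMs must be an integer >= 1");
    }
  }
  return errors;
}
```

For example, `validateStepLike({ uiContext: "webView", timeoutMs: 500 })` returns an empty list, while an unknown context such as `"modal"` is reported.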

package/templates/primary-runner.json CHANGED

@@ -6,6 +6,7 @@
   "capabilities": ["launch", "sessionControl", "command", "logCapture", "artifactWrite"],
   "driverActions": ["tap", "readLogs"],
   "artifactOutputs": ["logs", "signals"],
+  "uiContexts": ["app"],
   "lifecycle": [
     "prepare",
     "launch",
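
The template now declares `"uiContexts": ["app"]`, which pairs with the `uiContext` field added to the scenario schema. One plausible use is a pre-flight check that a runner declares every context a scenario asks for. The sketch below illustrates that idea only; `missingUiContexts` and both interfaces are hypothetical, and the package's actual planner logic is not shown in this diff. Steps without a `uiContext` impose no requirement here, which is an assumption.

```typescript
// Hypothetical minimal shapes; only the property names come from the diff.
interface RunnerManifestLike {
  uiContexts?: string[];
}

interface ScenarioStepLike {
  uiContext?: string;
}

// Returns each UI context that some step requests but the runner does not
// declare, in first-seen order and without duplicates.
function missingUiContexts(
  runner: RunnerManifestLike,
  steps: ScenarioStepLike[],
): string[] {
  const declared = runner.uiContexts ?? [];
  const missing: string[] = [];
  for (const step of steps) {
    const ctx = step.uiContext;
    if (ctx === undefined) continue; // assumption: no declared context, no requirement
    if (declared.indexOf(ctx) === -1 && missing.indexOf(ctx) === -1) {
      missing.push(ctx);
    }
  }
  return missing;
}

// Example: the template runner only covers "app".
const templateRunner: RunnerManifestLike = { uiContexts: ["app"] };
const exampleSteps: ScenarioStepLike[] = [
  { uiContext: "app" },
  { uiContext: "systemDialog" },
  {},
  { uiContext: "systemDialog" },
];
const uncovered = missingUiContexts(templateRunner, exampleSteps);
```

In this example `uncovered` holds the single entry `"systemDialog"`, so a caller could stop before live device work and report the gap.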

package/templates/skills/agent-scenario-loop/SKILL.md ADDED

@@ -0,0 +1,93 @@
+---
+name: agent-scenario-loop
+description: Use Agent Scenario Loop when implementing, debugging, optimizing, or validating mobile app behavior through durable scenarios, Android or iOS live proofs, evidence artifacts, health checks, budgets, or before-and-after comparisons. Do not use for ordinary unit-test-only changes or work unrelated to observable app behavior.
+---
+# Agent Scenario Loop
+Use ASL to establish trustworthy evidence about changes to this mobile app.
+## Discover The Project Contract
+1. Inspect `package.json` for installed ASL commands and project-owned `asl:*` scripts.
+2. Locate `asl.config.json` or the configured alternative.
+3. Inspect scenario manifests and runner manifests.
+4. Identify the scenario representing the app behavior affected by the task.
+5. Reuse an existing scenario when it expresses the intended behavior.
+Do not copy runner infrastructure into the application.
+## Validate Before Execution
+Run the project validation command configured by the repository. Then validate the selected scenario and execution plan.
+Stop before live execution when required capabilities are unavailable. Report the missing capability, selected platform, runner, and the command needed to reproduce the failure.
+## Select The Proof Lane
+Use fixture proof when validating:
+- package installation;
+- scenario parsing;
+- artifact contracts;
+- comparison behavior;
+- agent summaries.
+Use Android or iOS live proof when making claims about actual app behavior. Use both platforms when the task or release gate requires cross-platform evidence.
+## Interpret Results
+Read evidence in this order:
+1. `health.json`
+2. `verdict.json`
+3. `comparison.json`, when present
+4. `agent-summary.md`
+5. supporting raw evidence and captures when diagnosis is needed
+If health is not passed:
+- do not trust dependent timing or budget conclusions;
+- classify the failure as execution, environment, instrumentation, lifecycle, or evidence capture;
+- fix or report the evidence problem before claiming a product regression.
+If health passed but verdict failed:
+- treat the run as trustworthy evidence of product failure;
+- diagnose the failed budget, event, milestone, or expectation.
+Only interpret a comparison when ASL considers the baseline compatible.
+## Non-Negotiable Rules
+- Run planner validation before expensive device work.
+- Prefer an existing durable scenario over inventing an ad hoc script.
+- Never infer product improvement from unhealthy or partial evidence.
+- Treat passed health plus failed verdict as trustworthy evidence of failure.
+- Do not silently retry and discard failed attempts.
+- Do not change budgets or scenarios merely to turn a failure green.
+- Keep selectors, app identifiers, authentication assumptions, routes, and truth events inside the consuming app.
+- Preserve the artifact directory and cite exact artifact paths in the final report.
+- Use fixture proof for package and contract validation; use live proof for product claims.
+- Avoid adding new runners when an existing capability already satisfies the plan.
+## Report
+Include:
+- scenario ID;
+- platform;
+- run ID;
+- health status;
+- product verdict;
+- comparison status;
+- failed evidence dependencies;
+- relevant artifact paths;
+- recommended next action.
+Do not summarize a run as simply "passed" or "failed" when health and verdict differ.
+## References
+- `references/artifact-interpretation.md`
+- `references/adoption-checklist.md`
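
The "Interpret Results" rules in the skill file above reduce to a small decision table. The following TypeScript sketch encodes them under stated assumptions: `RunEvidence` is a hypothetical flattened view of one run, since the real `health.json`, `verdict.json`, and `comparison.json` layouts are not shown in this diff, and `interpretRun` is an illustrative name, not a package API.

```typescript
// Hypothetical flattened view of one run's artifacts.
interface RunEvidence {
  healthPassed: boolean;
  verdictPassed: boolean;
  comparisonPresent: boolean;
  baselineCompatible: boolean;
}

type Conclusion = "evidence-problem" | "product-failure" | "product-pass";

interface Interpretation {
  conclusion: Conclusion;
  // True only when before/after claims may be drawn from comparison.json.
  comparisonUsable: boolean;
  note: string;
}

function interpretRun(e: RunEvidence): Interpretation {
  // Health is read first: unhealthy evidence blocks every dependent conclusion.
  if (!e.healthPassed) {
    return {
      conclusion: "evidence-problem",
      comparisonUsable: false,
      note: "Fix or report the evidence problem before claiming a product regression.",
    };
  }
  // A comparison is interpreted only when present and the baseline is compatible.
  const comparisonUsable = e.comparisonPresent && e.baselineCompatible;
  if (!e.verdictPassed) {
    return {
      conclusion: "product-failure",
      comparisonUsable,
      note: "Trustworthy failure: diagnose the failed budget, event, milestone, or expectation.",
    };
  }
  return {
    conclusion: "product-pass",
    comparisonUsable,
    note: "Health and verdict both passed.",
  };
}
```

Keeping the health check as the first branch mirrors the read order in the skill file: a failed verdict is only ever reported as a product failure when the evidence behind it was healthy.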

package/templates/skills/agent-scenario-loop/references/adoption-checklist.md ADDED

@@ -0,0 +1,17 @@
+# Adoption Checklist
+Use this checklist when bringing ASL into a consuming app or validating an existing adoption.
+1. Confirm `agent-scenario-loop` is installed from the registry, not a local tarball or link.
+2. Inspect `package.json` for project-owned `asl:*` scripts.
+3. Locate `asl.config.json` and verify app identifiers, artifact roots, and supported drivers are project-owned.
+4. Inspect scenario manifests under the configured scenario root.
+5. Inspect runner and evidence-provider manifests.
+6. Run the project validation script.
+7. Run the selected scenario's plan check before live device work.
+8. Use fixture/profile proof for package, parsing, and artifact-contract validation.
+9. Use Android or iOS live proof for product behavior claims.
+10. Use both platforms when a release or task requires cross-platform evidence.
+11. Preserve generated artifacts; do not delete failed attempts to make a later run look cleaner.
+12. Cite `health.json`, `verdict.json`, `comparison.json` when present, and `agent-summary.md` in the final report.
+13. Keep selectors, app identifiers, credentials, routes, and truth events in the consuming app, not ASL core.
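
Checklist items 1 and 2 above can be approximated by inspecting a parsed `package.json`. The sketch below is a heuristic, not the package's own check. `listAslScripts` and `looksLocallyInstalled` are hypothetical helpers, and treating a `file:` or `link:` prefix or a `.tgz` suffix as "local tarball or link" is an assumption about how such installs usually appear in dependency specifiers.

```typescript
// Minimal view of package.json; only standard fields are used.
interface PackageJsonLike {
  scripts?: Record<string, string>;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Item 2: project-owned scripts whose names start with "asl:".
function listAslScripts(pkg: PackageJsonLike): string[] {
  return Object.keys(pkg.scripts ?? {})
    .filter((name) => name.indexOf("asl:") === 0)
    .sort();
}

// Item 1 (heuristic): true when the dependency specifier points at a local
// path, link, or tarball rather than a registry version range.
function looksLocallyInstalled(
  pkg: PackageJsonLike,
  name: string = "agent-scenario-loop",
): boolean {
  const spec = (pkg.dependencies ?? {})[name] ?? (pkg.devDependencies ?? {})[name];
  if (spec === undefined) return false; // not listed at all
  return (
    spec.indexOf("file:") === 0 ||
    spec.indexOf("link:") === 0 ||
    /\.tgz$/.test(spec)
  );
}

// Example input; the script bodies are placeholders, not real ASL commands.
const examplePkg: PackageJsonLike = {
  scripts: {
    "asl:validate": "echo validate",
    test: "echo test",
    "asl:plan": "echo plan",
  },
  devDependencies: { "agent-scenario-loop": "^0.1.3" },
};
const aslScripts = listAslScripts(examplePkg);
```

With this input, `aslScripts` contains `asl:plan` and `asl:validate`, and `looksLocallyInstalled(examplePkg)` is false because `^0.1.3` is a registry range.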

package/templates/skills/agent-scenario-loop/references/artifact-interpretation.md ADDED

@@ -0,0 +1,26 @@
+# Artifact Interpretation
+ASL separates evidence health, product verdict, and comparison status. Keep those meanings distinct.
+## Read Order
+1. `health.json`: whether the evidence-producing run completed with enough trustworthy data to interpret downstream artifacts.
+2. `verdict.json`: whether the product behavior satisfied the scenario's expectations, budgets, milestones, and required events.
+3. `comparison.json`: whether a compatible baseline/current pair improved, regressed, stayed unchanged, or remained inconclusive.
+4. `agent-summary.md`: compressed outcome for humans and agents after the structured artifacts are understood.
+## Health
+When health is not passed, do not claim product improvement or regression from dependent timing, budget, or comparison artifacts. Classify the issue as execution, environment, instrumentation, lifecycle, or evidence capture.
+## Verdict
+Passed health plus failed verdict is trustworthy evidence of a product failure. Diagnose the failed event, milestone, budget, or expectation instead of treating the run as untrusted.
+## Comparison
+Only interpret comparison output when ASL selected or accepted a compatible baseline. If no trusted compatible prior exists, keep the current run as evidence and avoid before/after claims.
+## Reporting
+Reports must cite exact artifact paths and distinguish health status, product verdict, and comparison status.
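
The reporting rule above asks for health status, product verdict, and comparison status as separate fields, with exact artifact paths. A minimal TypeScript sketch of such a report builder follows. The `RunReport` shape and `formatRunReport` are hypothetical, the status strings other than the four comparison outcomes named in the document are assumptions, and the example IDs and paths are made up for illustration.

```typescript
// Hypothetical report input. The comparison outcomes come from the document;
// "not-available" is an added assumption for runs without a trusted baseline.
interface RunReport {
  scenarioId: string;
  platform: "android" | "ios";
  runId: string;
  health: "passed" | "failed";
  verdict: "passed" | "failed";
  comparison: "improved" | "regressed" | "unchanged" | "inconclusive" | "not-available";
  artifactPaths: string[];
}

// Builds a plain-text report that never collapses the three statuses into a
// single "passed" or "failed".
function formatRunReport(r: RunReport): string {
  if (r.artifactPaths.length === 0) {
    throw new Error("a report must cite at least one artifact path");
  }
  const lines = [
    "scenario: " + r.scenarioId,
    "platform: " + r.platform,
    "run: " + r.runId,
    "health: " + r.health,
    "verdict: " + r.verdict,
    "comparison: " + r.comparison,
    "artifacts:",
  ];
  for (const p of r.artifactPaths) {
    lines.push("  - " + p);
  }
  return lines.join("\n");
}

// Example with illustrative values only.
const exampleReport = formatRunReport({
  scenarioId: "example-scenario",
  platform: "android",
  runId: "run-001",
  health: "passed",
  verdict: "failed",
  comparison: "not-available",
  artifactPaths: ["artifacts/run-001/health.json", "artifacts/run-001/verdict.json"],
});
```

Here the report shows `health: passed` next to `verdict: failed`, which is the case the skill file singles out as trustworthy evidence of a product failure.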