agent-scenario-loop 0.1.0

Files changed (170)
  1. package/LICENSE +21 -0
  2. package/README.md +119 -0
  3. package/app/profile-session.ts +812 -0
  4. package/core/config-template.json +41 -0
  5. package/dist/core/agent-summary.d.ts +15 -0
  6. package/dist/core/agent-summary.js +177 -0
  7. package/dist/core/artifact-contract.d.ts +151 -0
  8. package/dist/core/artifact-contract.js +897 -0
  9. package/dist/core/artifact-layout.d.ts +56 -0
  10. package/dist/core/artifact-layout.js +61 -0
  11. package/dist/core/artifact-writer.d.ts +44 -0
  12. package/dist/core/artifact-writer.js +55 -0
  13. package/dist/core/comparison.d.ts +133 -0
  14. package/dist/core/comparison.js +294 -0
  15. package/dist/core/evidence-interpreter.d.ts +28 -0
  16. package/dist/core/evidence-interpreter.js +69 -0
  17. package/dist/core/execution-plan.d.ts +44 -0
  18. package/dist/core/execution-plan.js +95 -0
  19. package/dist/core/planner.d.ts +132 -0
  20. package/dist/core/planner.js +812 -0
  21. package/dist/core/ports.d.ts +198 -0
  22. package/dist/core/ports.js +146 -0
  23. package/dist/core/run-index.d.ts +62 -0
  24. package/dist/core/run-index.js +143 -0
  25. package/dist/core/schema-validator.d.ts +86 -0
  26. package/dist/core/schema-validator.js +407 -0
  27. package/dist/index.d.ts +11 -0
  28. package/dist/index.js +27 -0
  29. package/dist/runner/agent-device-driver.d.ts +126 -0
  30. package/dist/runner/agent-device-driver.js +168 -0
  31. package/dist/runner/agent-device.d.ts +295 -0
  32. package/dist/runner/agent-device.js +1271 -0
  33. package/dist/runner/android-adb-driver.d.ts +175 -0
  34. package/dist/runner/android-adb-driver.js +399 -0
  35. package/dist/runner/android-adb.d.ts +254 -0
  36. package/dist/runner/android-adb.js +1618 -0
  37. package/dist/runner/argent-driver.d.ts +183 -0
  38. package/dist/runner/argent-driver.js +297 -0
  39. package/dist/runner/argent.d.ts +349 -0
  40. package/dist/runner/argent.js +1211 -0
  41. package/dist/runner/check-plan.d.ts +45 -0
  42. package/dist/runner/check-plan.js +210 -0
  43. package/dist/runner/cli.d.ts +20 -0
  44. package/dist/runner/cli.js +23 -0
  45. package/dist/runner/compare-latest.d.ts +99 -0
  46. package/dist/runner/compare-latest.js +233 -0
  47. package/dist/runner/compare.d.ts +58 -0
  48. package/dist/runner/compare.js +157 -0
  49. package/dist/runner/demo-loop.d.ts +45 -0
  50. package/dist/runner/demo-loop.js +170 -0
  51. package/dist/runner/example-android-live.d.ts +137 -0
  52. package/dist/runner/example-android-live.js +454 -0
  53. package/dist/runner/example-ios-live.d.ts +137 -0
  54. package/dist/runner/example-ios-live.js +471 -0
  55. package/dist/runner/host-doctor.d.ts +131 -0
  56. package/dist/runner/host-doctor.js +628 -0
  57. package/dist/runner/init-project.d.ts +88 -0
  58. package/dist/runner/init-project.js +263 -0
  59. package/dist/runner/ios-simctl-driver.d.ts +69 -0
  60. package/dist/runner/ios-simctl-driver.js +97 -0
  61. package/dist/runner/ios-simctl.d.ts +254 -0
  62. package/dist/runner/ios-simctl.js +1415 -0
  63. package/dist/runner/live-android.d.ts +137 -0
  64. package/dist/runner/live-android.js +539 -0
  65. package/dist/runner/live-comparison.d.ts +67 -0
  66. package/dist/runner/live-comparison.js +147 -0
  67. package/dist/runner/live-ios.d.ts +137 -0
  68. package/dist/runner/live-ios.js +460 -0
  69. package/dist/runner/live-proof-summary.d.ts +263 -0
  70. package/dist/runner/live-proof-summary.js +465 -0
  71. package/dist/runner/live-proof.d.ts +467 -0
  72. package/dist/runner/live-proof.js +920 -0
  73. package/dist/runner/local-env.d.ts +64 -0
  74. package/dist/runner/local-env.js +155 -0
  75. package/dist/runner/profile-android.d.ts +82 -0
  76. package/dist/runner/profile-android.js +671 -0
  77. package/dist/runner/profile-ios.d.ts +108 -0
  78. package/dist/runner/profile-ios.js +532 -0
  79. package/dist/runner/profile-mobile.d.ts +254 -0
  80. package/dist/runner/profile-mobile.js +1307 -0
  81. package/dist/runner/validate-project.d.ts +273 -0
  82. package/dist/runner/validate-project.js +1501 -0
  83. package/docs/adapters.md +145 -0
  84. package/docs/api.md +94 -0
  85. package/docs/authoring.md +196 -0
  86. package/docs/concepts.md +136 -0
  87. package/docs/consumer-rehearsal.md +115 -0
  88. package/docs/contracts.md +267 -0
  89. package/docs/live-proofs.md +270 -0
  90. package/docs/principles.md +46 -0
  91. package/examples/event-logs/app-startup-baseline.log +4 -0
  92. package/examples/event-logs/app-startup-current.log +4 -0
  93. package/examples/minimal-app/README.md +70 -0
  94. package/examples/mobile-app/README.md +302 -0
  95. package/examples/mobile-app/app.json +22 -0
  96. package/examples/mobile-app/asl/package-scripts.json +32 -0
  97. package/examples/mobile-app/asl.config.json +37 -0
  98. package/examples/mobile-app/event-logs/android-app-startup.log +4 -0
  99. package/examples/mobile-app/event-logs/android-open-close-cycle.log +12 -0
  100. package/examples/mobile-app/event-logs/android-scroll-settle.log +12 -0
  101. package/examples/mobile-app/event-logs/app-startup.log +4 -0
  102. package/examples/mobile-app/event-logs/open-close-cycle.log +12 -0
  103. package/examples/mobile-app/event-logs/scroll-settle.log +12 -0
  104. package/examples/mobile-app/index.ts +20 -0
  105. package/examples/mobile-app/metro.config.js +20 -0
  106. package/examples/mobile-app/package.json +62 -0
  107. package/examples/mobile-app/patches/expo-modules-jsi@56.0.10.patch +19 -0
  108. package/examples/mobile-app/plugins/with-ios-build-compat.js +271 -0
  109. package/examples/mobile-app/pnpm-lock.yaml +4440 -0
  110. package/examples/mobile-app/runner-manifests/evidence-provider.json +79 -0
  111. package/examples/mobile-app/runner-manifests/primary-runner.json +19 -0
  112. package/examples/mobile-app/scenarios/android/app-startup-video.json +73 -0
  113. package/examples/mobile-app/scenarios/android/app-startup.json +44 -0
  114. package/examples/mobile-app/scenarios/android/open-close-cycle.json +54 -0
  115. package/examples/mobile-app/scenarios/android/scroll-settle.json +49 -0
  116. package/examples/mobile-app/scenarios/ios/app-startup.json +44 -0
  117. package/examples/mobile-app/scenarios/ios/open-close-cycle.json +54 -0
  118. package/examples/mobile-app/scenarios/ios/scroll-settle.json +49 -0
  119. package/examples/mobile-app/scenarios/mobile/app-startup.json +91 -0
  120. package/examples/mobile-app/scenarios/mobile/open-close-cycle.json +160 -0
  121. package/examples/mobile-app/scenarios/mobile/scroll-settle.json +148 -0
  122. package/examples/mobile-app/scripts/asl-capture-accessibility-provider.mjs +112 -0
  123. package/examples/mobile-app/scripts/asl-capture-profiler-provider.mjs +127 -0
  124. package/examples/mobile-app/src/devtools/profile-session.ts +7 -0
  125. package/examples/mobile-app/src/example-screen.tsx +322 -0
  126. package/examples/mobile-app/tsconfig.json +16 -0
  127. package/examples/mobile-app/tsconfig.typecheck.json +13 -0
  128. package/examples/runners/README.md +44 -0
  129. package/examples/runners/adb-android.json +25 -0
  130. package/examples/runners/agent-device-android.json +27 -0
  131. package/examples/runners/agent-device-ios.json +27 -0
  132. package/examples/runners/argent-android.json +32 -0
  133. package/examples/runners/argent-ios.json +32 -0
  134. package/examples/runners/argent-react-profiler-provider.json +15 -0
  135. package/examples/runners/axe-accessibility-provider.json +24 -0
  136. package/examples/runners/manual-log-ingest.json +9 -0
  137. package/examples/runners/rozenite-profiler-provider.json +9 -0
  138. package/examples/runners/script-accessibility-provider.json +24 -0
  139. package/examples/runners/script-memory-provider.json +24 -0
  140. package/examples/runners/script-network-provider.json +24 -0
  141. package/examples/runners/script-profiler-provider.json +30 -0
  142. package/examples/runners/xcodebuildmcp-ios.json +29 -0
  143. package/examples/scenarios/ios/app-startup.json +28 -0
  144. package/examples/scenarios/ios/open-close-cycle.json +35 -0
  145. package/examples/scenarios/mobile/app-startup.json +72 -0
  146. package/examples/scenarios/mobile/media-open-close.json +141 -0
  147. package/examples/scenarios/mobile/open-close-cycle.json +135 -0
  148. package/examples/scenarios/mobile/scroll-settle.json +106 -0
  149. package/package.json +240 -0
  150. package/schemas/budget-verdict.schema.json +115 -0
  151. package/schemas/causal-run.schema.json +279 -0
  152. package/schemas/comparison.schema.json +196 -0
  153. package/schemas/health.schema.json +108 -0
  154. package/schemas/live-proof-set.schema.json +195 -0
  155. package/schemas/live-proof.schema.json +413 -0
  156. package/schemas/manifest.schema.json +204 -0
  157. package/schemas/metrics.schema.json +137 -0
  158. package/schemas/project-validation.schema.json +343 -0
  159. package/schemas/runner-capabilities.schema.json +217 -0
  160. package/schemas/scenario.schema.json +400 -0
  161. package/schemas/verdict.schema.json +88 -0
  162. package/templates/evidence-provider.json +83 -0
  163. package/templates/gitignore-snippet +9 -0
  164. package/templates/integration-readme.md +125 -0
  165. package/templates/mobile-scenario.json +133 -0
  166. package/templates/package-scripts.json +32 -0
  167. package/templates/primary-runner.json +19 -0
  168. package/templates/project.config.json +37 -0
  169. package/templates/scripts/asl-capture-accessibility-provider.mjs +112 -0
  170. package/templates/scripts/asl-capture-profiler-provider.mjs +127 -0
package/docs/adapters.md ADDED
@@ -0,0 +1,145 @@
1
+ # Adapter Onboarding
2
+
3
+ Agent Scenario Loop treats runners as replaceable ports behind stable scenarios and artifacts. Add one adapter at a time: describe its capabilities, prove planner compatibility, run a scenario, and write the standard evidence artifacts.
4
+
5
+ ## Choose The Role
6
+
7
+ Use a primary runner when the tool owns the scenario lifecycle:
8
+
9
+ - install or verify the app
10
+ - launch the app
11
+ - start and stop a profile session
12
+ - execute scenario steps
13
+ - capture required logs or truth-event evidence
14
+ - write health, verdict, manifest, metrics, and summaries
15
+
16
+ Use an evidence provider when the tool only contributes evidence:
17
+
18
+ - accessibility inspection
19
+ - profiler output
20
+ - memory snapshots
21
+ - network captures
22
+ - screenshots, video, or UI tree snapshots
23
+
24
+ A scenario should have one primary runner. Evidence providers can satisfy required evidence outputs or optional driver actions when they are active for the selected platform.
25
+
26
+ ## Describe Capabilities
27
+
28
+ Create a runner manifest under `runner-manifests/` or use the fixtures in `examples/runners/` as a starting point. The shipped [runner and provider target matrix](../examples/runners/README.md) describes which fixtures are bundled adapters, external-tool targets, or project-local provider patterns.
29
+
30
+ Primary runner shape:
31
+
32
+ ```json
33
+ {
34
+ "schemaVersion": "1.0.0",
35
+ "runnerId": "my-android-runner",
36
+ "kind": "primary",
37
+ "platforms": ["android"],
38
+ "capabilities": ["launch", "sessionControl", "command", "logCapture", "artifactWrite"],
39
+ "driverActions": ["tap", "scroll", "assertVisible", "readLogs"],
40
+ "artifactOutputs": ["logs", "signals"],
41
+ "lifecycle": ["prepare", "launch", "startSession", "executeStep", "waitForTruthEvent", "captureEvidence", "stopSession", "finalize"]
42
+ }
43
+ ```
44
+
45
+ Evidence provider shape:
46
+
47
+ ```json
48
+ {
49
+ "schemaVersion": "1.0.0",
50
+ "runnerId": "my-accessibility-provider",
51
+ "kind": "evidenceProvider",
52
+ "platforms": ["ios", "android"],
53
+ "capabilities": ["accessibility"],
54
+ "artifactOutputs": ["accessibility"],
55
+ "lifecycle": ["prepare", "startWindow", "capture", "stopWindow", "finalize"]
56
+ }
57
+ ```
58
+
59
+ Keep manifests honest. Do not declare a driver action until the adapter can execute it or the provider can produce the required evidence.
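One way to keep a manifest honest is to compare its declared `driverActions` with the methods the driver object actually implements. The sketch below assumes a driver is a plain object keyed by action name; `findUnbackedDriverActions` is a hypothetical helper for illustration, not an export of the package.

```javascript
// Hypothetical helper, not part of agent-scenario-loop: list declared
// driver actions that the driver object does not implement as a function.
function findUnbackedDriverActions(manifest, driver) {
  const declared = manifest.driverActions ?? [];
  return declared.filter((action) => typeof driver[action] !== 'function');
}

// Manifest fields follow the primary runner shape shown above.
const manifest = {
  schemaVersion: '1.0.0',
  runnerId: 'my-android-runner',
  kind: 'primary',
  platforms: ['android'],
  driverActions: ['tap', 'scroll', 'assertVisible', 'readLogs'],
};

// Illustrative driver that backs only two of the four declared actions.
const driver = {
  tap: async () => {},
  readLogs: async () => [],
};

const unbacked = findUnbackedDriverActions(manifest, driver);
console.log(unbacked);
```

A non-empty result means the manifest declares more than the adapter can do, so either the driver or the manifest should change before the planner is asked to trust it.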
60
+
61
+ ## Prove The Plan
62
+
63
+ Run compatibility before runtime:
64
+
65
+ ```bash
66
+ asl-check-plan \
67
+ --scenario scenarios/mobile/app-startup.json \
68
+ --runner runner-manifests/primary-runner.json \
69
+ --provider runner-manifests/evidence-provider.json \
70
+ --platform android \
71
+ --out artifacts/asl/plan/app-startup-android
72
+ ```
73
+
74
+ For an initialized app, use the project-level gate:
75
+
76
+ ```bash
77
+ asl-validate-project --root . --platform all --out artifacts/asl/project-validation
78
+ ```
79
+
80
+ The project-validation artifact gives agents structured `nextActions` for missing files, unsupported platforms, incomplete helper wiring, invalid required config, package-script drift, and planner failures. Omitted optional package drivers are preserved as warnings so teams can declare only the runner lanes they intend to support.
81
+
82
+ ## Implement The Port
83
+
84
+ An adapter should map normalized scenario steps to tool calls:
85
+
86
+ | Scenario step | Port responsibility |
87
+ | --- | --- |
88
+ | `launch` | install, launch, or verify the app is open |
89
+ | `command` | dispatch an app command or driver gesture |
90
+ | `waitForMilestone` | wait for app-owned truth events |
91
+ | `captureEvidence` | collect logs, screenshots, UI trees, video, or provider output |
92
+
93
+ When a normalized step has a `driverAction`, use `dispatchDriverAction` from the package root to call the active driver. It rejects unknown actions and missing driver methods explicitly, so a scenario cannot silently pass through an adapter that lacks the requested capability.
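The explicit-rejection behavior described above can be illustrated with a small stand-in. This is a sketch of the idea only: `dispatchSketch`, its argument order, and the `KNOWN_ACTIONS` subset are assumptions, and the real `dispatchDriverAction` signature should be taken from the package's type declarations.

```javascript
// Illustrative stand-in, NOT the package's dispatchDriverAction.
// KNOWN_ACTIONS is an assumed subset of the action names used in these docs.
const KNOWN_ACTIONS = new Set(['tap', 'scroll', 'assertVisible', 'screenshot', 'readLogs']);

function dispatchSketch(driver, step) {
  const action = step.driverAction;
  if (!KNOWN_ACTIONS.has(action)) {
    // Unknown names fail loudly instead of being skipped.
    throw new Error(`Unknown driver action: ${action}`);
  }
  if (typeof driver[action] !== 'function') {
    // A known action the active driver cannot perform also fails loudly.
    throw new Error(`Driver does not implement: ${action}`);
  }
  // Hand the whole normalized step to the driver method.
  return driver[action](step);
}

// Hypothetical driver that only knows how to tap.
const tapOnlyDriver = {
  tap: (step) => `tapped ${step.selector.value}`,
};

console.log(
  dispatchSketch(tapOnlyDriver, {
    driverAction: 'tap',
    selector: { kind: 'testId', value: 'first-journey-start' },
  }),
);
```

Throwing in both failure cases is what keeps a scenario from passing silently through an adapter that lacks the requested capability.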
94
+
95
+ The built-in adapters and profile pipelines show the expected boundary:
96
+
97
+ - `runner/android-adb-driver.ts`: adb-backed tap, scroll, assertion, UI tree, screenshot, record, and log actions
98
+ - `runner/ios-simctl-driver.ts`: simctl-backed screenshot and log actions
99
+ - `runner/argent.ts`: Argent-backed ASL artifact runner for launch, coordinate-backed gestures, screenshot requests, and UI descriptions
100
+ - `runner/argent-driver.ts`: optional Argent-backed driver adapter without bundling Argent
101
+ - `runner/profile-android.ts` and `runner/profile-ios.ts`: profile artifact pipelines that turn raw evidence into health, metrics, verdicts, and summaries
102
+
103
+ External tools such as agent-device, Argent, XcodeBuildMCP, axe, profilers, and custom scripts should plug in behind the same shape. The tactical tool can change; the scenario and artifact contract should not.
104
+
105
+ ## Preserve Evidence
106
+
107
+ Every run should leave agent-readable proof:
108
+
109
+ - `health.json`
110
+ - `verdict.json`
111
+ - `agent-summary.md`
112
+ - `manifest.json`
113
+ - `metrics.json`
114
+ - `causal-run.json`
115
+ - `budget-verdict.json` when budgets exist
116
+ - raw evidence under `raw/`
117
+ - captures under `captures/`
118
+ - provider signals under `signals/`
119
+
120
+ Do not treat timing as trustworthy unless scenario health passed. If setup fails, write failed health with a concrete next action instead of producing optimistic timing claims.
121
+
122
+ ## Attach Provider Evidence
123
+
124
+ If a provider already wrote files, attach them during profiling:
125
+
126
+ ```bash
127
+ asl-profile-android \
128
+ --config asl.config.json \
129
+ --scenario scenarios/android/app-startup.json \
130
+ --events artifacts/raw/adb-logcat.txt \
131
+ --signal js:artifacts/provider/js-profile.json \
132
+ --signal network:artifacts/provider/network.har \
133
+ --capture screenshot:artifacts/provider/final-screen.png
134
+ ```
135
+
136
+ If the provider should run during profiling, declare `providerCommands` in its manifest. Commands run without a shell, preserve stdout/stderr/exit code, and inventory outputs in `manifest.artifacts.evidenceAttachments`. Runtime profiles reject a provider whose `platforms` do not include the selected platform before command execution, preserving the same active-provider semantics used by planner compatibility.
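The two guarantees in that paragraph (platform rejection before execution, and no shell) can be sketched with Node's `child_process.spawnSync`. The function name, its arguments, and the returned field names are assumptions for illustration; only the `platforms` field and the `provider_platform_unsupported` code come from these docs.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical sketch, not the package's provider-command runner.
function runProviderCommandSketch(provider, platform, command, args) {
  if (!provider.platforms.includes(platform)) {
    // Reject before any process is started.
    return { ran: false, reason: 'provider_platform_unsupported' };
  }
  // shell: false passes args as an argv array, so nothing is interpolated
  // or expanded by a shell.
  const result = spawnSync(command, args, { shell: false, encoding: 'utf8' });
  return {
    ran: true,
    exitCode: result.status,
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

const iosOnlyProvider = { runnerId: 'my-accessibility-provider', platforms: ['ios'] };

// Android is not listed, so the command never runs.
const skipped = runProviderCommandSketch(iosOnlyProvider, 'android', process.execPath, ['-e', '']);

// On iOS the command runs; "$HOME" stays literal because no shell is involved.
const executed = runProviderCommandSketch(iosOnlyProvider, 'ios', process.execPath, [
  '-e',
  "process.stdout.write('captured $HOME')",
]);

console.log(skipped, executed);
```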
137
+
138
+ ## Acceptance Checklist
139
+
140
+ - The manifest validates against `schemas/runner-capabilities.schema.json`.
141
+ - `asl-check-plan` passes for at least one scenario and platform.
142
+ - Failed setup produces failed health and a useful next action.
143
+ - Passed runs write the standard artifact set.
144
+ - Attached evidence is inventoried with stable run-relative paths.
145
+ - Package docs describe whether the adapter is bundled, a fixture target, or a project-local integration.
package/docs/api.md ADDED
@@ -0,0 +1,94 @@
1
+ # Public API
2
+
3
+ Agent Scenario Loop keeps its public surface small: the root package exports stable core contracts, while runner subpaths expose executable adapters for teams that want to compose the proof loop from code.
4
+
5
+ ## Root Package
6
+
7
+ Import core contracts from `agent-scenario-loop`:
8
+
9
+ ```js
10
+ const {
11
+ buildAgentSummaryMarkdown,
12
+ buildScenarioExecutionPlan,
13
+ buildRunIndex,
14
+ compareRunDirectories,
15
+ createArtifactLayout,
16
+ dispatchDriverAction,
17
+ evaluateRunnerCompatibility,
18
+ validateJson,
19
+ } = require('agent-scenario-loop');
20
+ ```
21
+
22
+ The root package is for stable, runner-neutral behavior:
23
+
24
+ - artifact layout and artifact writers
25
+ - profile-event parsing, metrics, manifests, causal runs, budget verdicts, and summaries
26
+ - scenario execution-plan normalization
27
+ - scenario/runner/provider compatibility checks
28
+ - port validation and driver dispatch helpers
29
+ - typed port contracts for primary runners, drivers, evidence providers, artifact writers, and interpreters
30
+ - evidence interpretation gates
31
+ - run indexing and lane-aware latest-trusted comparison selection
32
+ - comparison artifacts
33
+ - aggregate live-proof artifacts
34
+ - schema validation
35
+
36
+ Use `dispatchDriverAction()` when a runner has already normalized a scenario step and needs to call the active `DriverPort` implementation without binding to adb, simctl, agent-device, Argent, or another concrete tool.
37
+
38
+ ## Runner Subpaths
39
+
40
+ Runner subpaths are public when a consuming project needs to compose a workflow without shelling out to the installed binaries:
41
+
42
+ | Subpath | Purpose |
43
+ | --- | --- |
44
+ | `agent-scenario-loop/runner/agent-device` | agent-device capture runner that executes scenario-declared portable driver actions and writes ASL health, verdict, raw, and capture artifacts |
45
+ | `agent-scenario-loop/runner/android-adb` | Android adb readiness, launch, profile-session control, driver actions, and logcat capture |
46
+ | `agent-scenario-loop/runner/android-adb-driver` | adb-backed `tap`, `scroll`, `assertVisible`, `inspectTree`, `screenshot`, and `readLogs` driver adapter |
47
+ | `agent-scenario-loop/runner/agent-device-driver` | agent-device-backed portable action adapter for `tap`, `scroll`, `assertVisible`, `inspectTree`, `screenshot`, `readLogs`, app open/close, and alert helpers |
48
+ | `agent-scenario-loop/runner/argent` | Argent capture runner that executes launch and coordinate-backed portable driver actions, then writes ASL health, verdict, raw, and capture artifacts |
49
+ | `agent-scenario-loop/runner/argent-driver` | Argent-backed optional adapter for launch, URL open, normalized gestures, screenshot requests, and UI descriptions without bundling Argent |
50
+ | `agent-scenario-loop/runner/check-plan` | scenario/runner/provider compatibility artifact generation |
51
+ | `agent-scenario-loop/runner/compare` | direct baseline/current comparison |
52
+ | `agent-scenario-loop/runner/compare-latest` | latest trusted prior-run comparison |
53
+ | `agent-scenario-loop/runner/demo-loop` | fixture-only loop proof |
54
+ | `agent-scenario-loop/runner/example-android-live` | packaged Android example live proof |
55
+ | `agent-scenario-loop/runner/example-ios-live` | packaged iOS example live proof |
56
+ | `agent-scenario-loop/runner/host-doctor` | aggregate host/device preflight for adb, simctl, agent-device, and Argent availability before live proof |
57
+ | `agent-scenario-loop/runner/init-project` | template scaffold command for consuming app layouts |
58
+ | `agent-scenario-loop/runner/ios-simctl` | iOS simctl readiness, storage-backed session control, stored event capture, lifecycle crash detection, and host crash-report attachment |
59
+ | `agent-scenario-loop/runner/ios-simctl-driver` | simctl-backed `screenshot` and `readLogs` driver adapter |
60
+ | `agent-scenario-loop/runner/live-android` | generic one-scenario Android live proof runner with adb preflight, profile-session capture, optional agent-device and Argent sidecars, latest-trusted comparison, and aggregate live-proof artifacts |
61
+ | `agent-scenario-loop/runner/live-ios` | generic one-scenario iOS live proof runner with simctl preflight, storage or deep-link profile-session capture, optional agent-device and Argent sidecars, latest-trusted comparison, and aggregate live-proof artifacts |
62
+ | `agent-scenario-loop/runner/live-proof` | aggregate live-proof artifact validation, multi-artifact platform-set checks, durable `live-proof-set.json` writing, formatting, failed-proof gating, and regression gating |
63
+ | `agent-scenario-loop/runner/profile-android` | Android profile artifact pipeline |
64
+ | `agent-scenario-loop/runner/profile-ios` | iOS profile artifact pipeline |
65
+ | `agent-scenario-loop/runner/validate-project` | project-level validation for initialized consumer app scaffolds |
66
+
67
+ Installed binaries mirror those runner entrypoints for CLI use.
68
+
69
+ ## Shipped Fixtures
70
+
71
+ The package intentionally ships schemas and examples:
72
+
73
+ - `agent-scenario-loop/schemas/*`
74
+ - `agent-scenario-loop/examples/*`
75
+ - `agent-scenario-loop/templates/*`
76
+
77
+ These are public fixtures and contract references. Templates are safe starting points to copy into a consuming app and adapt.
78
+
79
+ For concrete runner and evidence-provider integration steps, see [Adapter Onboarding](adapters.md).
80
+
81
+ ## App Helper
82
+
83
+ `app/profile-session.ts` is shipped as source for React Native apps to copy into their own codebase. It is not a compiled CommonJS runtime export because it depends on app-side React Native modules, app bundling, and platform storage behavior.
84
+
85
+ The intended integration is:
86
+
87
+ 1. Copy `app/profile-session.ts` into the app.
88
+ 2. Wire `useProfileSessionBootstrap()` once near the app root.
89
+ 3. Emit app-owned truth events with `emitProfileEvent()`.
90
+ 4. Register optional command targets with `registerProfileCommandTargetHandler()`.
91
+
92
+ ## Stability Rule
93
+
94
+ If a function, binary, schema, or example path is listed here, package smoke should verify that it is present in the packed tarball. If a new public entrypoint is added, update this document and the smoke expectations in the same change.
package/docs/authoring.md ADDED
@@ -0,0 +1,196 @@
1
+ # Scenario Authoring
2
+
3
+ Start with one journey that matters. A good scenario is boring, repeatable, inspectable, and portable.
4
+
5
+ ## Init Command
6
+
7
+ After installing the package, scaffold the starter layout with:
8
+
9
+ ```bash
10
+ asl-init --out . --scenario first-journey
11
+ ```
12
+
13
+ That creates:
14
+
15
+ - `asl.config.json`
16
+ - `scenarios/mobile/first-journey.json`
17
+ - `runner-manifests/primary-runner.json`
18
+ - `runner-manifests/evidence-provider.json`
19
+ - `scripts/asl-capture-accessibility-provider.mjs`
20
+ - `scripts/asl-capture-profiler-provider.mjs`
21
+ - `src/devtools/profile-session.ts`
22
+ - `asl/README.md`
23
+ - `asl/package-scripts.json`
24
+ - `asl/gitignore-snippet`
25
+
26
+ The command refuses to overwrite existing files unless `--force` is provided. Use `--dry-run` to preview the file list without writing. It does not edit your existing `package.json` or `.gitignore`; merge the generated script and ignore snippets intentionally. Project validation reports an error until the required generated `asl:*` scripts are present in the app `package.json`, and it flags direct installed-bin scripts that drift from `asl/package-scripts.json`.
27
+
28
+ After filling in app identifiers, validate the whole initialized project before runtime proof:
29
+
30
+ ```bash
31
+ asl-validate-project --root . --platform all --out artifacts/asl/project-validation
32
+ ```
33
+
34
+ Project validation checks the app-side profile-session helper, package-script snippets, app `package.json` script merge and drift, project config required fields, declared `drivers.supported` entries for fixture, adb, simctl, agent-device, and Argent lanes, scenario manifests, runner manifests, provider manifests, local provider-command script references, and planner compatibility. Validation also classifies declared drivers into package-supported lanes, known external target contracts such as XcodeBuildMCP, and custom driver names, so agents can distinguish bundled ASL execution paths from adapter targets that must be supplied by the host project. Missing live app identifiers such as `app.profileSessionScheme`, `app.iosBundleId`, or `app.androidPackage` are errors for the selected platform, as are missing artifact roots and missing scenario-root declarations for the selected platform. Placeholder app identity values are reported as warnings so a fresh scaffold can still prove installability while real app setup remains visible before live proof. The JSON artifact also includes structured `nextActions` for agents.
35
+
36
+ Project validation also checks whether `.gitignore` includes the generated `asl/gitignore-snippet` patterns for runtime artifacts, local runner config, traces, and local proof captures. Missing patterns are warnings with an `ignore_runtime_artifacts` next action; they do not block setup, but they should be fixed before running live scenarios repeatedly.
37
+
38
+ The generated compare and live-proof scripts require `ASL_COMPARE_IOS_CURRENT`, `ASL_COMPARE_ANDROID_CURRENT`, or `ASL_LIVE_PROOF` so agents pass explicit artifact paths instead of leaving shell-sensitive placeholders in package scripts.
39
+
40
+ ## Templates
41
+
42
+ You can also copy these files manually and rename them as needed:
43
+
44
+ | Template | Use |
45
+ | --- | --- |
46
+ | `templates/project.config.json` | Project-local app identifiers, artifact paths, and runner defaults |
47
+ | `templates/mobile-scenario.json` | First portable mobile scenario |
48
+ | `templates/primary-runner.json` | Primary runner capability manifest |
49
+ | `templates/evidence-provider.json` | Optional evidence-provider manifest |
50
+ | `templates/scripts/asl-capture-accessibility-provider.mjs` | Runnable starter provider command for deterministic accessibility evidence |
51
+ | `templates/scripts/asl-capture-profiler-provider.mjs` | Runnable starter provider command for deterministic profiler, memory, and network evidence |
52
+ | `templates/integration-readme.md` | Consumer-app wiring guide generated into `asl/README.md` |
53
+ | `templates/package-scripts.json` | Package-script snippets generated into `asl/package-scripts.json`; project validation also checks that required scripts exist in app `package.json` and direct installed-bin scripts have not drifted |
54
+
55
+ The JSON templates are schema-checked, and every shipped template is checked by package smoke. They intentionally use neutral placeholder names.
56
+
57
+ ## Scenario Shape
58
+
59
+ A scenario should answer five questions:
60
+
61
+ 1. What journey does the app need to prove?
62
+ 2. Which app-owned truth events prove progress and completion?
63
+ 3. How many cycles should run?
64
+ 4. Which budgets are meaningful only after scenario health passes?
65
+ 5. Which runner capabilities or driver actions are required?
66
+
67
+ Minimal fields:
68
+
69
+ - `id`: stable scenario id, such as `feed-open` or `checkout-submit`
70
+ - `flowId`: stable product flow id used in summaries and causal artifacts
71
+ - `platforms`: `ios`, `android`, or both
72
+ - `requiredCapabilities`: lifecycle and evidence ownership needed for the run
73
+ - `truthEvents`: app-owned events that make the scenario trustworthy
74
+ - `steps`: launch, command, wait, gesture, assertion, or evidence capture steps
75
+
76
+ Preferred fields:
77
+
78
+ - `journey`: human-readable intent, actor, start state, and end state
79
+ - `comparisonLane`: default historical baseline lane for runs of this scenario
80
+ - `milestones`: named event checkpoints with phases and timeouts
81
+ - `cycles`: iteration count and stop policy
82
+ - `budgets`: thresholds to evaluate only after truth-event health passes
83
+ - `artifacts`: required and optional evidence outputs
84
+
85
+ Use `comparisonLane` when a scenario should always compare within one stable proof mode, such as `feed-open-android-live`. Profile CLIs can also receive `--comparison-lane`; the CLI flag wins when one-off runs need a different lane.
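The precedence rule can be stated in a couple of lines. `resolveComparisonLane` and the `cliOptions` shape are hypothetical names used only to illustrate that the CLI value takes priority over the scenario default.

```javascript
// Hypothetical resolver: the CLI value wins, then the scenario default,
// then no lane at all.
function resolveComparisonLane(scenario, cliOptions = {}) {
  return cliOptions.comparisonLane ?? scenario.comparisonLane ?? null;
}

const scenario = { id: 'feed-open', comparisonLane: 'feed-open-android-live' };

console.log(resolveComparisonLane(scenario));
console.log(resolveComparisonLane(scenario, { comparisonLane: 'one-off-experiment' }));
```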
86
+
87
+ ## Truth Events
88
+
89
+ Treat truth events as app-owned facts, not runner observations. The app should emit them from the code path that actually represents the journey state.
90
+
91
+ Good truth events:
92
+
93
+ - `feed_open_requested`
94
+ - `feed_first_content_visible`
95
+ - `message_send_completed`
96
+ - `checkout_submit_failed`
97
+
98
+ Weak truth events:
99
+
100
+ - `button_clicked`
101
+ - `waited_1000ms`
102
+ - `screen_probably_loaded`
103
+
104
+ Timing is not trusted unless scenario health passes. If a required truth event is missing, the run can still write artifacts, but verdicts and comparisons must remain inconclusive.
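A minimal sketch of that gate, assuming observed events are objects with a `name` field: the result field names and the `'evaluate-budgets'` string are illustrative placeholders, not the `health.json` or `verdict.json` schema.

```javascript
// Hypothetical gate: timing is only trusted when every required truth
// event was observed. Field names here are illustrative.
function evaluateTruthEvents(requiredEvents, observedEvents) {
  const seen = new Set(observedEvents.map((event) => event.name));
  const missing = requiredEvents.filter((name) => !seen.has(name));
  const healthPassed = missing.length === 0;
  return {
    health: healthPassed ? 'passed' : 'failed',
    missingTruthEvents: missing,
    timingTrusted: healthPassed,
    // Without passing health, the verdict stays inconclusive.
    verdict: healthPassed ? 'evaluate-budgets' : 'inconclusive',
  };
}

const result = evaluateTruthEvents(
  ['feed_open_requested', 'feed_first_content_visible'],
  [{ name: 'feed_open_requested', atMs: 12 }],
);

console.log(result);
```

The run can still write its artifacts in the failing case; what changes is that no timing claim is derived from them.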
105
+
106
+ ## Steps
107
+
108
+ Use steps to describe intent and required adapter actions:
109
+
110
+ - `launch`: app lifecycle start
111
+ - `command`: app command such as `activate-target:first-journey`
112
+ - `waitForMilestone`: wait for an app-owned truth event
113
+ - `captureEvidence`: collect logs, screenshot, profiler output, or another artifact
114
+ - `gesture`: portable UI gesture intent
115
+ - `assertUi`: UI assertion intent
116
+
117
+ Use `driverAction` only when the scenario truly requires a concrete operation such as `tap`, `scroll`, `assertVisible`, `screenshot`, `readLogs`, or `collectPerfSignals`. The planner fails early when no active runner or provider can satisfy a required driver action.
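The early-failure check can be pictured as a set difference between the actions a scenario requires and the actions offered by manifests that are active for the selected platform. This is a hypothetical sketch, not the planner's implementation; it also assumes a provider manifest may carry a `driverActions` array.

```javascript
// Hypothetical coverage check, not the package's planner.
function findUnsatisfiedDriverActions(scenario, manifests, platform) {
  // Only manifests that list the selected platform count as active.
  const active = manifests.filter((manifest) => manifest.platforms.includes(platform));
  const available = new Set(active.flatMap((manifest) => manifest.driverActions ?? []));
  const required = scenario.steps
    .map((step) => step.driverAction)
    .filter((action) => action !== undefined);
  // De-duplicate so each missing action is reported once.
  return [...new Set(required)].filter((action) => !available.has(action));
}

const scenario = {
  id: 'first-journey',
  steps: [
    { id: 'open', kind: 'launch' },
    { id: 'start-journey', kind: 'gesture', driverAction: 'tap' },
    { id: 'final-screen', kind: 'captureEvidence', driverAction: 'screenshot' },
  ],
};

const manifests = [
  { runnerId: 'my-android-runner', kind: 'primary', platforms: ['android'], driverActions: ['tap', 'scroll'] },
  { runnerId: 'ios-capture', kind: 'evidenceProvider', platforms: ['ios'], driverActions: ['screenshot'] },
];

// The iOS-only provider is inactive on Android, so screenshot is unsatisfied.
const missingOnAndroid = findUnsatisfiedDriverActions(scenario, manifests, 'android');
console.log(missingOnAndroid);
```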
118
+
119
+ Use `selector` to describe the intended app target without committing the scenario to one driver. Supported selector kinds are `testId`, `accessibilityId`, `accessibilityLabel`, `text`, `resourceId`, and `xpath`.
120
+
121
+ ```json
122
+ {
123
+ "id": "start-journey",
124
+ "kind": "gesture",
125
+ "driverAction": "tap",
126
+ "selector": {
127
+ "kind": "testId",
128
+ "value": "first-journey-start"
129
+ }
130
+ }
131
+ ```
132
+
133
+ Adapters may resolve selectors through accessibility trees, test ids, native UI inspection, or tool-specific selector engines. Android adb resolves `testId`, `resourceId`, `accessibilityId`, `accessibilityLabel`, and `text` selectors from UIAutomator bounds for tap and scroll actions. Argent gesture steps currently use normalized or pixel coordinates from `adapterOptions.argent`; the Argent adapter does not resolve tap or scroll targets from selectors. Put coordinates in adapter metadata only when the selected runner cannot resolve a durable selector.
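To make the "UIAutomator bounds" step concrete: a UIAutomator dump reports each node's `bounds` attribute as `[left,top][right,bottom]`, and a tap target can be taken as the center of that rectangle. The helper below is a hypothetical sketch of that arithmetic, not the adb adapter's code.

```javascript
// Hypothetical helper: turn a UIAutomator bounds string such as
// "[0,200][1080,400]" into the center point of the rectangle.
function centerOfBounds(bounds) {
  const match = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(bounds);
  if (!match) {
    throw new Error(`Unrecognized bounds: ${bounds}`);
  }
  const [left, top, right, bottom] = match.slice(1).map(Number);
  return {
    x: Math.round((left + right) / 2),
    y: Math.round((top + bottom) / 2),
  };
}

console.log(centerOfBounds('[0,200][1080,400]')); // { x: 540, y: 300 }
```

Resolving coordinates at run time from a selector keeps the scenario portable across screen sizes, which is why hard-coded coordinates are the fallback rather than the default.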
134
+
135
+ ## Runners And Providers
136
+
137
+ Primary runners own the run lifecycle: prepare, launch, start session, execute commands, wait, capture evidence, stop, and finalize.
138
+
139
+ Evidence providers attach smaller evidence windows: profiler data, accessibility snapshots, memory evidence, network evidence, or other signals.
140
+
141
+ Use an evidence provider when:
142
+
143
+ - the primary runner should not own that tool
144
+ - the evidence can be collected independently
145
+ - the same provider should work with multiple primary runners
146
+
147
+ When a provider or custom script has already written files, attach them to a profile run with repeatable CLI flags:
148
+
149
+ ```bash
150
+ asl-profile-android \
151
+ --config asl.config.json \
152
+ --scenario scenarios/android/app-startup.json \
153
+ --events artifacts/raw/adb-logcat.txt \
154
+ --signal js:artifacts/provider/js-profile.json \
155
+ --signal network:artifacts/provider/network.har \
156
+ --capture screenshot:artifacts/provider/final-screen.png \
157
+ --capture uiTree:artifacts/provider/ui-tree.json
158
+ ```
159
+
160
+ Signals are copied into `signals/js`, `signals/memory`, or `signals/network` and listed in `manifest.json`. Captures are copied into `captures`; screenshots are listed in `artifacts.captures.screenshots`, while video and UI tree captures replace the matching named capture path in the manifest. Every attached file is also listed in `artifacts.evidenceAttachments` with kind, run-relative path, source filename, byte size, and sha256 hash. Attached provider evidence is preserved as proof, but timing verdicts still come from app-owned truth events and budgets.
161
+
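The inventory entry described above can be illustrated with a short sketch. The field names in `EvidenceAttachment`, and the helpers `parseAttachFlag` and `describeAttachment`, are hypothetical; the document only states that each entry carries a kind, a run-relative path, the source filename, the byte size, and a sha256 hash. The hash uses Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one entry in artifacts.evidenceAttachments.
interface EvidenceAttachment {
  kind: string;
  path: string; // run-relative destination
  sourceFilename: string;
  byteSize: number;
  sha256: string;
}

// Splits a "<kind>:<path>" flag value such as "js:artifacts/provider/js-profile.json".
function parseAttachFlag(value: string): { kind: string; sourcePath: string } {
  const i = value.indexOf(":");
  if (i <= 0 || i === value.length - 1) {
    throw new Error(`expected <kind>:<path>, got "${value}"`);
  }
  return { kind: value.slice(0, i), sourcePath: value.slice(i + 1) };
}

// Builds the inventory entry for a file whose bytes have already been read.
function describeAttachment(
  kind: string,
  sourcePath: string,
  destDir: string, // for example "signals/js" or "captures"
  bytes: Uint8Array,
): EvidenceAttachment {
  const sourceFilename = sourcePath.split("/").pop() ?? sourcePath;
  return {
    kind,
    path: `${destDir}/${sourceFilename}`,
    sourceFilename,
    byteSize: bytes.byteLength,
    sha256: createHash("sha256").update(bytes).digest("hex"),
  };
}
```

Recording the size and hash alongside the copied file lets a later reader confirm that the preserved evidence is the file the provider actually wrote.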
162
+ Provider manifests can also declare `providerCommands`. Profile runners execute those commands when the manifest is passed with `--provider <manifest>`, but only when the provider manifest includes the selected platform. A provider with `platforms: ["ios"]` passed to an Android profile writes a failed `health.json` with `provider_platform_unsupported` and does not run the command. Commands run without a shell, can use placeholders such as `{providerDir}`, `{runDir}`, `{runId}`, `{scenarioId}`, and `{platform}`, and must declare their output files. Provider-channel outputs are copied or preserved under `raw/providers/<provider-id>/` and inventoried in `artifacts.evidenceAttachments`; signal and capture outputs can still map into the standard `signals/*` or `captures/` folders. Command stdout, stderr, exit code, phase, and argv are preserved under `raw/provider-commands/`. When a provider command exits nonzero, the runner writes a failed `health.json`, an inconclusive `verdict.json`, and an `agent-summary.md` with a next-action hint instead of making timing claims.
163
+
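The platform gate and placeholder substitution can be sketched as below. This is an assumption-laden illustration rather than the runner's code: `ProviderManifest`, `RunContext`, and `planProviderCommands` are hypothetical, and only the five placeholders named in this document are handled. Because commands run without a shell, each argv element is substituted independently and no quoting is applied.

```typescript
type Platform = "android" | "ios";

// Hypothetical manifest and run shapes; real manifests carry more fields.
interface ProviderCommand { argv: string[]; outputs: string[] }
interface ProviderManifest { id: string; platforms: Platform[]; providerCommands: ProviderCommand[] }
interface RunContext {
  providerDir: string;
  runDir: string;
  runId: string;
  scenarioId: string;
  platform: Platform;
}

// Replaces the placeholders named in this document inside a single argv element.
function expandPlaceholders(arg: string, ctx: RunContext): string {
  return arg.replace(
    /\{(providerDir|runDir|runId|scenarioId|platform)\}/g,
    (_match: string, key: string) => ctx[key as keyof RunContext],
  );
}

type CommandPlan =
  | { ok: false; healthCode: "provider_platform_unsupported" }
  | { ok: true; argvList: string[][] };

// Refuses to plan commands for a provider that does not list the selected platform.
function planProviderCommands(manifest: ProviderManifest, ctx: RunContext): CommandPlan {
  if (manifest.platforms.indexOf(ctx.platform) === -1) {
    return { ok: false, healthCode: "provider_platform_unsupported" };
  }
  const argvList = manifest.providerCommands.map((cmd) =>
    cmd.argv.map((arg) => expandPlaceholders(arg, ctx)),
  );
  return { ok: true, argvList };
}
```

Unknown placeholders are left untouched in this sketch, so a typo in a manifest surfaces in the preserved argv instead of being silently dropped.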
164
+ The `examples/runners/script-*.json` manifests show package-neutral wrappers for accessibility, profiler, memory, and network evidence. They intentionally reference placeholder commands such as `capture-accessibility` or `capture-memory`; replace those with your project-local script, binary, or agent command. The contract that matters is the declared output path and evidence kind, not the specific tool used to create the file.
165
+
166
+ ## Artifacts
167
+
168
+ A completed profile run should leave the standard artifact set:
169
+
170
+ - `health.json`
171
+ - `verdict.json`
172
+ - `agent-summary.md`
173
+ - `manifest.json`
174
+ - `metrics.json`
175
+ - `causal-run.json`
176
+ - `budget-verdict.json` when budgets are configured
177
+ - `summary.md`
178
+ - `raw/*`
179
+ - `captures/*`
180
+ - `signals/*`
181
+
182
+ Commit scenario definitions, runner manifests, docs, and app integration code. Do not commit generated native folders, runtime artifacts, simulator recordings, screenshots, profiler exports, or local app data containers.
183
+
184
+ ## Validation
185
+
186
+ Validate a scenario and runner before execution:
187
+
188
+ ```bash
189
+ pnpm check-plan -- --scenario templates/mobile-scenario.json --runner templates/primary-runner.json --platform ios --out artifacts/plan/first-journey
190
+ ```
191
+
192
+ Run the release gate before publishing package changes:
193
+
194
+ ```bash
195
+ pnpm release:check
196
+ ```
@@ -0,0 +1,136 @@
1
+ # Concepts
2
+
3
+ Agent Scenario Loop exists because agent-driven app work rarely belongs to one tool.
4
+
5
+ One runner may edit code. Another may build the app. Another may drive Android. Another may drive iOS. Others may collect logs, screenshots, accessibility output, profiler traces, memory evidence, network captures, or summaries.
6
+
7
+ Execution is not the missing piece.
8
+
9
+ The missing piece is a durable place for scenarios, evidence, and comparisons to live after any one runner finishes. Agent Scenario Loop coordinates scenarios, runners, and evidence so a project keeps a stable record of what happened across tools and over time.
10
+
11
+ ## What is an agent runner?
12
+
13
+ An agent runner is any tool that can carry out part of a software workflow on your behalf.
14
+
15
+ It might:
16
+
17
+ - click through an app
18
+ - run commands
19
+ - inspect a screen
20
+ - collect diagnostics
21
+ - drive a simulator or device
22
+ - collect logs, traces, or accessibility output
23
+
24
+ Examples include Codex, Argent, Agent Device, adb-based automation, accessibility tooling, Xcode instrumentation, Maestro, Detox, Appium, profilers, and custom internal runners. You do not need to know any specific one of these tools to understand Agent Scenario Loop. They are all ways to execute or observe part of a scenario.
25
+
26
+ ## Why orchestration matters
27
+
28
+ The moment you want to mix multiple runners, reuse scenarios, compare results across runs, preserve evidence, or evaluate changes over time, things become fragmented quickly.
29
+
30
+ Every tool has its own way to define work, capture results, and preserve context.
31
+
32
+ Agent Scenario Loop provides the layer that coordinates the work:
33
+
34
+ 1. Define an application scenario.
35
+ 2. Attach the runners and instrumentation appropriate for that scenario.
36
+ 3. Execute the scenario.
37
+ 4. Collect evidence throughout the run.
38
+ 5. Preserve the evidence as an artifact that humans and agents can inspect later.
39
+
40
+ ## Vendor-neutral by design
41
+
42
+ Scenarios should outlive tooling choices.
43
+
44
+ The best runner for a task today may not be the best runner six months from now. Agent Scenario Loop treats runners as interchangeable components. You can swap runners, combine runners, introduce new runners, or compare runners without rewriting your scenario definitions.
45
+
46
+ The goal is not to build another agent runner. The goal is to provide a common orchestration and evidence layer that sits above them.
47
+
48
+ ## Evidence is the output
49
+
50
+ Most testing systems produce a pass/fail result. Agent Scenario Loop produces evidence.
51
+
52
+ Evidence can include:
53
+
54
+ - logs
55
+ - traces
56
+ - memory measurements
57
+ - CPU measurements
58
+ - network activity
59
+ - accessibility results
60
+ - performance metrics
61
+ - custom signals
62
+
63
+ The scenario is not simply proving correctness. The scenario is generating evidence.
64
+
65
+ That evidence is preserved and becomes part of the project's understanding of itself. One run is useful. A hundred runs are more valuable because they let the project ask whether memory usage is improving, performance is degrading, regressions are appearing, or an optimization actually helped.
66
+
67
+ ## Scenarios become assets
68
+
69
+ Most automation is tightly coupled to the tools that created it.
70
+
71
+ When the tooling changes, the automation is rewritten. When the agent changes, the workflow changes. When the framework changes, the evidence disappears.
72
+
73
+ Agent Scenario Loop is built around the opposite idea: scenarios are long-lived project assets.
74
+
75
+ A scenario captures something important about your application:
76
+
77
+ - how users consume content
78
+ - how creators upload media
79
+ - how campaigns are created
80
+ - how livestreams behave
81
+ - how conversations load
82
+
83
+ These concerns exist independently of whichever tools happen to execute them today.
84
+
85
+ As tooling evolves, your scenarios remain. As better agents emerge, your scenarios remain. As instrumentation improves, your scenarios remain.
86
+
87
+ Over time, a project accumulates a growing library of scenarios that describe its most important behaviors. Those scenarios become a stable lens through which change can be evaluated.
88
+
89
+ Not just whether something works today. Whether it is improving over time.
90
+
91
+ ## The locus of control
92
+
93
+ Most teams unknowingly give the locus of control to the current tool.
94
+
95
+ Agent Scenario Loop moves it back into the application itself.
96
+
97
+ The feed is the thing that matters. The livestream is the thing that matters. The creator upload flow is the thing that matters. Agent Scenario Loop makes those concerns first-class citizens and lets tooling orbit around them instead of the other way around.
98
+
99
+ Every new scenario increases coverage. Every execution adds evidence. Every comparison adds historical context.
100
+
101
+ Eventually, the project develops a durable understanding of how critical parts of the application behave across releases, refactors, platform upgrades, and agent-driven changes.
102
+
103
+ The tooling may change. The runners may change. The agents may change. The scenarios remain the source of truth.
104
+
105
+ That is a different philosophy from frameworks that primarily evaluate agents. Agent Scenario Loop is built to evaluate the evolution of software.
106
+
107
+ ## How it differs from testing frameworks
108
+
109
+ Agent Scenario Loop does not make existing testing frameworks obsolete.
110
+
111
+ Traditional frameworks usually optimize for:
112
+
113
+ > Did the application behave correctly?
114
+
115
+ Agent Scenario Loop optimizes for:
116
+
117
+ > What did we learn from running this scenario?
118
+
119
+ Both questions matter. Agent Scenario Loop focuses on the second question by preserving health, verdicts, metrics, logs, traces, comparisons, and other run evidence in a stable artifact shape.
120
+
121
+ ## How it differs from agent evaluation
122
+
123
+ Agent Scenario Loop is not primarily evaluating agents.
124
+
125
+ An agent may execute part of a run. A runner may drive a device. A profiler may collect signals. None of those is the center of the model.
126
+
127
+ The scenario is.
128
+
129
+ The feed, livestream, upload flow, checkout flow, or conversation thread is the thing being studied over time.
130
+
131
+ ## Read next
132
+
133
+ - [Principles](principles.md) for the project doctrine
134
+ - [Contracts](contracts.md) for the current artifact and package surface
135
+ - [Live Proofs](live-proofs.md) for fixture, Android, iOS, and comparison runs
136
+ - [Runner docs](../runner/README.md) for the host execution boundary