@vitronai/alethia 0.8.5 → 0.9.0

package/LICENSE CHANGED
@@ -41,4 +41,4 @@ or any other patent rights of vitron.ai. Use of the Alethia Core runtime is
41
41
  governed by separate terms; commercial and production use may require a
42
42
  patent license from vitron.ai once the patent grants.
43
43
 
44
- For licensing inquiries: gatekeeper@vitron.ai
44
+ For licensing inquiries: team@vitron.ai
package/README.md CHANGED
@@ -15,7 +15,7 @@ This package is the **MIT-licensed MCP bridge** (~22 KB) — a thin stdio-to-HTT
15
15
 
16
16
  The cockpit is an **oversight surface**, not an authoring IDE. Humans do not write tests in a GUI. Agents propose tests, run them, and prove safety — humans review the evidence.
17
17
 
18
- > **Patent notice.** The MIT license on this bridge does **not** grant a patent license to the Alethia runtime (U.S. Application No. 19/571,437). Commercial runtime use may require a separate license. Contact **gatekeeper@vitron.ai**.
18
+ > **Patent notice.** The MIT license on this bridge does **not** grant a patent license to the Alethia runtime (U.S. Application No. 19/571,437). Commercial runtime use may require a separate license. Contact **team@vitron.ai**.
19
19
 
20
20
  ---
21
21
 
@@ -29,6 +29,9 @@ The cockpit is an **oversight surface**, not an authoring IDE. Humans do not wri
29
29
  | Speed (per call) | ~200 ms via Playwright MCP, ~2 s via Playwright CLI | ~40 ms — 2-5× faster than Playwright MCP; up to 50× vs Playwright CLI on simple flows — [reproduce the numbers yourself](https://github.com/vitron-ai/alethia-anvil#verify-the-faster-than-cdp-based-tools-claim-yourself) |
30
30
  | Evidence | screenshots, videos | signed evidence pack with per-step integrity hashes |
31
31
  | Network | Telemetry on by default; optional cloud dashboards | **Air-gap deployable** — no cloud product, no telemetry path, bound to 127.0.0.1 |
32
+ | Dev feedback during coding | reload + devtools + screenshot (can serve cached frames) | `alethia_eval` → live `getComputedStyle()` and layout values, no cache, no round-trip |
33
+
34
+ Alethia isn't only a post-development testing tool. AI coding agents can use `alethia_eval` as a real-time DOM oracle *while writing code* — call `getComputedStyle()` to read exact computed values, `offsetWidth` for layout dimensions, `querySelectorAll().length` to verify list renders. The eval path always returns live values from the current page with no caching ambiguity, catching CSS cascade bugs in seconds that would otherwise require multiple reload-and-inspect cycles.
32
35
 
33
36
  ---
34
37
 
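The README paragraph added above describes `alethia_eval` as a live DOM oracle. A minimal sketch of the kind of page-side probe an agent might evaluate is given here; `probeLayout` and its `win` parameter are hypothetical names, and how the expression is actually passed to `alethia_eval` is not shown in this diff, so only the page-side logic is covered.

```javascript
// Hypothetical page-side probe; `win` stands in for the page's `window`
// so the logic can also be exercised outside a browser.
const probeLayout = (win, selector) => {
  const el = win.document.querySelector(selector);
  if (!el) return { found: false, count: 0 };
  const style = win.getComputedStyle(el);
  return {
    found: true,
    display: style.display,                                  // computed, post-cascade value
    width: el.offsetWidth,                                   // layout width in CSS pixels
    count: win.document.querySelectorAll(selector).length,   // how many nodes matched
  };
};
```

In a real page the call would be `probeLayout(window, '.card')`; returning a plain object keeps the result JSON-serializable.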
@@ -225,25 +228,34 @@ If you don't care about any of those (quick iteration, scratch testing), you can
225
228
 
226
229
  Once the MCP is configured (above), Alethia is available to any agent in any project — no per-project install, no scaffold to run. To add tests:
227
230
 
228
- 1. **Create the directory.** Convention is `__alethia__/` at the project root, mirroring how Jest/Vitest treat `__tests__/`.
231
+ 1. **Drop a `.alethia` file anywhere your repo treats as test code.** No enforced directory; pick whatever fits your existing layout (e.g. `tests/e2e/`, `e2e/`, `cypress/`-style — your call).
232
+
233
+ 2. **Write the test in plain English.** First line is a `name <label>` so cockpit history reads cleanly when the same file runs locally:
229
234
 
230
- 2. **Write a smoke test.** Plain English, one file per scenario:
231
235
  ```
232
- # __alethia__/smoke.alethia
236
+ # tests/e2e/login.alethia
237
+ name login flow
233
238
  navigate to http://127.0.0.1:5173
234
239
  assert "Sign in" is visible
240
+ click Sign in
241
+ type dev@company.com into the email field
242
+ assert dashboard is visible
235
243
  ```
236
244
 
237
245
  3. **Ask your agent to run it:**
238
- > *"Run the Alethia tests in `__alethia__/` against the app at http://127.0.0.1:5173."*
246
+ > *"Run `tests/e2e/login.alethia` against the app at http://127.0.0.1:5173."*
239
247
 
240
- The agent calls `alethia_tell` once per file and reports pass/fail.
248
+ The agent calls `alethia_tell` and reports pass/fail.
241
249
 
242
- 4. **For CI**, copy [`ci-runner.mjs`](https://github.com/vitron-ai/alethia-anvil/blob/main/__alethia__/ci-runner.mjs) from alethia-anvil: a small stdio MCP client that pipes every `.alethia` file through the bridge and exits non-zero on failure. Wire it into GitHub Actions or your pipeline of choice.
250
+ 4. **For CI**, use the native `alethia run` subcommand, with no MCP host or extra scripts needed:
251
+ ```bash
252
+ alethia run tests/e2e/login.alethia
253
+ ```
254
+ Exits 0 on pass, 1 on fail. See [Running in CI](#running-in-ci) below + the drop-in workflow at [`examples/github-actions.yml`](examples/github-actions.yml).
243
255
 
244
256
  5. **For evidence**, ask the agent to call `alethia_export_session` after a run — produces a signed evidence pack with per-step integrity hashes and full audit trail.
245
257
 
246
- The full reference example lives at [**vitron-ai/alethia-anvil**](https://github.com/vitron-ai/alethia-anvil) — Anvil demo app + 14 spec files + CI workflow + the head-to-head Playwright/PW-MCP benchmark. Fork it to see the pattern end-to-end.
258
+ The full reference example lives at [**vitron-ai/alethia-anvil**](https://github.com/vitron-ai/alethia-anvil) — demo app + spec files + CI workflow + the head-to-head Playwright/PW-MCP benchmark. Fork it to see the pattern end-to-end.
247
259
 
248
260
  ---
249
261
 
@@ -319,11 +331,12 @@ subcommand that drives the runtime headless and exits 0 (all passed) or 1
319
331
  (any failed):
320
332
 
321
333
  ```bash
322
- # from a file
334
+ # from a file (recommended — first line "name <label>" lands in cockpit history)
323
335
  alethia run tests/e2e/login.alethia
324
336
 
325
337
  # inline
326
- alethia run --nlp "navigate to http://localhost:3000
338
+ alethia run --nlp "name smoke
339
+ navigate to http://localhost:3000
327
340
  click Sign In
328
341
  assert dashboard is visible"
329
342
 
@@ -453,7 +466,26 @@ The Alethia runtime (which this bridge connects to) is local-only **by architect
453
466
 
454
467
  **Full security posture** — threat model, cryptographic chain of custody, supply-chain posture, update cadence, disclosure process — is at [`SECURITY.md`](./SECURITY.md).
455
468
 
456
- Abuse reports + vulnerability disclosure: **`gatekeeper@vitron.ai`**.
469
+ Abuse reports + vulnerability disclosure: **`team@vitron.ai`**.
470
+
471
+ ---
472
+
473
+ ## Privacy Policy
474
+
475
+ Alethia is **local-only by architecture**. No data is collected, transmitted, or stored outside your machine.
476
+
477
+ | What | How it's handled |
478
+ |------|-----------------|
479
+ | **Page content** | Processed locally inside the Alethia runtime binary. Never sent to Vitron or any third party. |
480
+ | **Screenshots** | Held in memory for the duration of the tool call, returned to your MCP client. Never persisted or uploaded. |
481
+ | **Test instructions** | Compiled and executed locally. Never logged to external services. |
482
+ | **Session evidence packs** | Written to your local filesystem on explicit `alethia_export_session` call. You control the file. |
483
+ | **Telemetry** | Zero. The runtime contains no analytics, crash reporting, or usage tracking of any kind. |
484
+ | **Network access** | The signed runtime binary only navigates to `file://`, `localhost`, `127.0.0.1`, `.local`, and RFC1918 private ranges — hard-coded at compile time, not configurable. |
485
+ | **Third-party sharing** | None. No data reaches Vitron servers during normal operation. |
486
+ | **Data retention** | No data is retained by Vitron. In-memory state is cleared when the runtime exits. |
487
+
488
+ For questions or concerns: **team@vitron.ai**
457
489
 
458
490
  ---
459
491
 
@@ -463,4 +495,4 @@ MIT — see [LICENSE](./LICENSE). Covers **this MCP bridge only.**
463
495
 
464
496
  ## Patent Notice
465
497
 
466
- The Alethia runtime is patent pending (U.S. Application No. 19/571,437). The MIT license on this bridge does **not** grant any patent license. For licensing inquiries: **gatekeeper@vitron.ai**.
498
+ The Alethia runtime is patent pending (U.S. Application No. 19/571,437). The MIT license on this bridge does **not** grant any patent license. For licensing inquiries: **team@vitron.ai**.
@@ -2,7 +2,7 @@ name Claude Code TaskFlow verification
2
2
  navigate to http://localhost:8765/claude-code-app.html
3
3
  assert TaskFlow is visible
4
4
  type dev@company.com into the you@company.com field
5
- type Engineering into the Your team name field
5
+ type Engineering into the Team field
6
6
  click Sign in
7
7
  assert Signed in as is visible
8
8
  type Deploy to production into the Add a new task field
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AACA;;;;;;;;;;;;;;;;;;;GAmBG;AA6HH,eAAO,MAAM,WAAW,GAAI,GAAG,MAAM,KAAG,CAAC,MAAM,EAAE,MAAM,EAAE,MAAM,CAO9D,CAAC;AAEF,eAAO,MAAM,aAAa,GAAI,GAAG,MAAM,EAAE,GAAG,MAAM,KAAG,MAMpD,CAAC;AAEF,eAAO,MAAM,sBAAsB,GAAI,MAAM,MAAM,EAAE,IAAI,MAAM,KAAG,OACrB,CAAC;AAkH9C,eAAO,MAAM,qBAAqB,QAAO;IAAE,OAAO,EAAE,MAAM,CAAC;IAAC,MAAM,EAAE,MAAM,CAAC;IAAC,UAAU,EAAE,MAAM,CAAA;CAAE,GAAG,IA8ClG,CAAC;AA8PF,eAAO,MAAM,iCAAiC,GAAI,IAAI,CAAC,MAAM,OAAO,CAAC,MAAM,GAAG,IAAI,CAAC,CAAC,GAAG,IAAI,KAAG,IAE7F,CAAC;AAIF,eAAO,MAAM,+BAA+B,QAAO,MAA6B,CAAC;AAEjF,eAAO,MAAM,qBAAqB,QAAa,OAAO,CAAC,MAAM,CAmC5D,CAAC;AAuDF,eAAO,MAAM,eAAe,GAAI,gBAAgB,MAAM,KAAG,MAAM,GAAG,IAiBjE,CAAC;AAEF,eAAO,MAAM,oBAAoB,GAAI,gBAAgB,MAAM,KAAG,MACe,CAAC;AAsF9E,eAAO,MAAM,+BAA+B,GAAI,mBAAwB,KAAG,MAAM,GAAG,IAsBnF,CAAC;AAmTF,MAAM,MAAM,UAAU,GAClB;IAAE,IAAI,EAAE,MAAM,CAAA;CAAE,GAChB;IAAE,IAAI,EAAE,QAAQ,CAAC;IAAC,GAAG,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GAC7E;IAAE,IAAI,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GAC5E;IAAE,IAAI,EAAE,OAAO,CAAC;IAAC,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GAC/D;IAAE,IAAI,EAAE,OAAO,CAAC;IAAC,OAAO,EAAE,MAAM,CAAA;CAAE,CAAC;AAEvC,eAAO,MAAM,YAAY,GAAI,MAAM,MAAM,EAAE,KAAG,UAkD7C,CAAC;AAEF,MAAM,MAAM,OAAO,GAAG;IAAE,EAAE,EAAE,OAAO,CAAC;IAAC,IAAI,EAAE,MAAM,CAAC;IAAC,MAAM,EAAE,MAAM,CAAC;IAAC,SAAS,EAAE,MAAM,CAAC;IAAC,UAAU,CAAC,EAAE,MAAM,CAAA;CAAE,CAAC;AAC5G,MAAM,MAAM,SAAS,GAAG;IACtB,EAAE,EAAE,OAAO,CAAC;IACZ,IAAI,EAAE,MAAM,CAAC;IACb,SAAS,EAAE,MAAM,CAAC;IAClB,SAAS,EAAE,MAAM,CAAC;IAClB,SAAS,EAAE,MAAM,CAAC;IAClB,SAAS,EAAE,MAAM,CAAC;IAClB,KAAK,EAAE,OAAO,EAAE,CAAC;CAClB,CAAC;AAIF,eAAO,MAAM,gBAAgB,GAAI,UAAU,OAAO,KAAG,SA0CpD,CAAC;AAEF,eAAO,MAAM,eAAe,GAAI,QAAQ,SAAS,EAAE,MAAM;IAAE,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAA;CAAE,KAAG,MA4B5F,CAAC"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AACA;;;;;;;;;;;;;;;;;;;GAmBG;AA6HH,eAAO,MAAM,WAAW,GAAI,GAAG,MAAM,KAAG,CAAC,MAAM,EAAE,MAAM,EAAE,MAAM,CAO9D,CAAC;AAEF,eAAO,MAAM,aAAa,GAAI,GAAG,MAAM,EAAE,GAAG,MAAM,KAAG,MAMpD,CAAC;AAEF,eAAO,MAAM,sBAAsB,GAAI,MAAM,MAAM,EAAE,IAAI,MAAM,KAAG,OACrB,CAAC;AAkH9C,eAAO,MAAM,qBAAqB,QAAO;IAAE,OAAO,EAAE,MAAM,CAAC;IAAC,MAAM,EAAE,MAAM,CAAC;IAAC,UAAU,EAAE,MAAM,CAAA;CAAE,GAAG,IA8ClG,CAAC;AA8PF,eAAO,MAAM,iCAAiC,GAAI,IAAI,CAAC,MAAM,OAAO,CAAC,MAAM,GAAG,IAAI,CAAC,CAAC,GAAG,IAAI,KAAG,IAE7F,CAAC;AAIF,eAAO,MAAM,+BAA+B,QAAO,MAA6B,CAAC;AAEjF,eAAO,MAAM,qBAAqB,QAAa,OAAO,CAAC,MAAM,CAmC5D,CAAC;AAuDF,eAAO,MAAM,eAAe,GAAI,gBAAgB,MAAM,KAAG,MAAM,GAAG,IAiBjE,CAAC;AAEF,eAAO,MAAM,oBAAoB,GAAI,gBAAgB,MAAM,KAAG,MACe,CAAC;AAsF9E,eAAO,MAAM,+BAA+B,GAAI,mBAAwB,KAAG,MAAM,GAAG,IAsBnF,CAAC;AA8UF,MAAM,MAAM,UAAU,GAClB;IAAE,IAAI,EAAE,MAAM,CAAA;CAAE,GAChB;IAAE,IAAI,EAAE,QAAQ,CAAC;IAAC,GAAG,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GAC7E;IAAE,IAAI,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GAC5E;IAAE,IAAI,EAAE,OAAO,CAAC;IAAC,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAC;IAAC,IAAI,CAAC,EAAE,MAAM,CAAA;CAAE,GAC/D;IAAE,IAAI,EAAE,OAAO,CAAC;IAAC,OAAO,EAAE,MAAM,CAAA;CAAE,CAAC;AAEvC,eAAO,MAAM,YAAY,GAAI,MAAM,MAAM,EAAE,KAAG,UAkD7C,CAAC;AAEF,MAAM,MAAM,OAAO,GAAG;IAAE,EAAE,EAAE,OAAO,CAAC;IAAC,IAAI,EAAE,MAAM,CAAC;IAAC,MAAM,EAAE,MAAM,CAAC;IAAC,SAAS,EAAE,MAAM,CAAC;IAAC,UAAU,CAAC,EAAE,MAAM,CAAA;CAAE,CAAC;AAC5G,MAAM,MAAM,SAAS,GAAG;IACtB,EAAE,EAAE,OAAO,CAAC;IACZ,IAAI,EAAE,MAAM,CAAC;IACb,SAAS,EAAE,MAAM,CAAC;IAClB,SAAS,EAAE,MAAM,CAAC;IAClB,SAAS,EAAE,MAAM,CAAC;IAClB,SAAS,EAAE,MAAM,CAAC;IAClB,KAAK,EAAE,OAAO,EAAE,CAAC;CAClB,CAAC;AAQF,eAAO,MAAM,gBAAgB,GAAI,UAAU,OAAO,KAAG,SAkDpD,CAAC;AAEF,eAAO,MAAM,eAAe,GAAI,QAAQ,SAAS,EAAE,MAAM;IAAE,IAAI,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,OAAO,CAAA;CAAE,KAAG,MA4B5F,CAAC"}
package/dist/index.js CHANGED
@@ -823,7 +823,7 @@ const ensureRuntime = async () => {
823
823
  if (!artifactName) {
824
824
  throw new Error(`No Alethia runtime available for ${platform()}-${arch()}. ` +
825
825
  `Supported: macOS (x64/arm64), Linux (x64/arm64), Windows (x64). ` +
826
- `Contact gatekeeper@vitron.ai for assistance.`);
826
+ `Contact team@vitron.ai for assistance.`);
827
827
  }
828
828
  // Check what's installed on disk. Marker is fast path; Info.plist fallback
829
829
  // catches legacy installs / partial extracts that never wrote a marker.
@@ -857,7 +857,7 @@ const ensureRuntime = async () => {
857
857
  if (!verifyManifest(manifest)) {
858
858
  throw new Error('Release manifest signature verification FAILED. ' +
859
859
  'The download may have been tampered with. Aborting. ' +
860
- 'Contact gatekeeper@vitron.ai if this persists.');
860
+ 'Contact team@vitron.ai if this persists.');
861
861
  }
862
862
  debug('manifest signature verified');
863
863
  // Download the binary
@@ -926,9 +926,19 @@ const spawnRuntime = async (runtimeVersion) => {
926
926
  // setting on the caller's shell doesn't silently re-route the runtime
927
927
  // into a non-runtime interpreter mode.
928
928
  const { ELECTRON_RUN_AS_NODE: _stripped, ...safeEnv } = process.env;
929
- runtimeProcess = spawn(exe, [], {
929
+ // In CI, pass --no-sandbox to the runtime. Container/runner environments
930
+ // typically lack the kernel-level isolation primitives the runtime's
931
+ // sandbox relies on; without this flag the runtime aborts before binding
932
+ // its port. Production / local installs aren't affected — the flag only
933
+ // applies when the bridge has detected a CI environment above.
934
+ const ciArgs = isCi ? ['--no-sandbox'] : [];
935
+ // In CI, surface the runtime's stderr to the bridge's stderr so spawn
936
+ // failures aren't silent. Local/production keep stdio:'ignore' to avoid
937
+ // leaking lower-level diagnostics in normal use.
938
+ const ciStdio = ['ignore', 'ignore', 'inherit'];
939
+ runtimeProcess = spawn(exe, ciArgs, {
930
940
  env: { ...safeEnv, ...(visible ? {} : { ALETHIA_HEADLESS: '1' }) },
931
- stdio: 'ignore',
941
+ stdio: isCi ? ciStdio : 'ignore',
932
942
  detached: false,
933
943
  });
934
944
  runtimeProcess.on('exit', (code) => {
@@ -936,19 +946,36 @@ const spawnRuntime = async (runtimeVersion) => {
936
946
  runtimeProcess = null;
937
947
  });
938
948
  // Wait for port to bind
939
- const maxWait = 15_000;
949
+ // CI runners have cold caches, slower disk IO, and warmup penalties; the
950
+ // 15s budget that's plenty for local hits 'first run' floors in CI. Give
951
+ // CI a 60s window and keep local snappy.
952
+ const maxWait = isCi ? 60_000 : 15_000;
940
953
  const interval = 300;
941
954
  const start = Date.now();
955
+ let lastErr;
956
+ let pollCount = 0;
942
957
  while (Date.now() - start < maxWait) {
943
958
  try {
944
959
  await callAlethia({ jsonrpc: '2.0', id: 0, method: 'tools/list' }, 2_000);
945
960
  process.stderr.write('[alethia] Runtime is ready.\n');
946
961
  return;
947
962
  }
948
- catch {
963
+ catch (err) {
964
+ lastErr = err;
965
+ pollCount++;
966
+ if (isCi && pollCount % 10 === 1) {
967
+ // Surface the polling error every ~3s in CI so a stuck spawn is
968
+ // diagnosable. The error is otherwise silently swallowed.
969
+ const msg = err instanceof Error ? err.message : String(err);
970
+ process.stderr.write(`[alethia] poll ${pollCount} failed: ${msg.slice(0, 120)}\n`);
971
+ }
949
972
  await new Promise(r => setTimeout(r, interval));
950
973
  }
951
974
  }
975
+ if (isCi && lastErr) {
976
+ const msg = lastErr instanceof Error ? lastErr.message : String(lastErr);
977
+ process.stderr.write(`[alethia] last poll error: ${msg}\n`);
978
+ }
952
979
  throw new Error(`Runtime failed to start within ${maxWait / 1000}s. Check ${RUNTIME_DIR} for issues.`);
953
980
  };
954
981
  // Clean up spawned runtime on exit
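The readiness loop changed in this hunk follows a general poll-with-budget pattern: retry a probe at a fixed interval, remember the last error, and report it once the budget is spent. The sketch below shows that pattern in isolation; `waitUntilReady`, its options, and the `onError` hook are hypothetical names and not part of the package.

```javascript
// Hypothetical helper: poll an async readiness probe until it succeeds
// or the time budget is exhausted.
const waitUntilReady = async (probe, { maxWaitMs, intervalMs, onError = () => {} }) => {
  const start = Date.now();
  let lastErr;
  let attempts = 0;
  while (Date.now() - start < maxWaitMs) {
    try {
      await probe();
      return { ready: true, attempts: attempts + 1 };
    } catch (err) {
      lastErr = err;          // keep the most recent failure for the final message
      attempts++;
      onError(err, attempts); // caller decides how often to log
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  const reason = lastErr instanceof Error ? lastErr.message : String(lastErr);
  throw new Error(`not ready within ${maxWaitMs} ms (last error: ${reason})`);
};
```

Passing the budget in as a parameter mirrors the hunk's `isCi ? 60_000 : 15_000` choice without hard-coding it in the loop.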
@@ -993,7 +1020,7 @@ RUNTIME
993
1020
  Ed25519-signed, SHA-256 verified. No signup required.
994
1021
 
995
1022
  Releases: https://github.com/vitron-ai/alethia/releases
996
- Licensing: gatekeeper@vitron.ai
1023
+ Licensing: team@vitron.ai
997
1024
 
998
1025
  ENVIRONMENT
999
1026
  ALETHIA_HOST Host of the Alethia runtime (default: 127.0.0.1)
@@ -1014,7 +1041,7 @@ ABOUT
1014
1041
  Patent Pending — U.S. Application No. 19/571,437.
1015
1042
  Title: "Deterministic Local Automation Runtime with Zero-IPC Execution,
1016
1043
  Offline Operation, and Per-Step Policy Enforcement"
1017
- Licensing inquiries: gatekeeper@vitron.ai
1044
+ Licensing inquiries: team@vitron.ai
1018
1045
  Bridge source (MIT): https://github.com/vitron-ai/alethia-mcp
1019
1046
  Project landing: https://github.com/vitron-ai/alethia
1020
1047
  `;
@@ -1117,26 +1144,29 @@ export const parseRunArgs = (argv) => {
1117
1144
  return { mode: 'error', message: 'No NLP source provided. Use a file path, --nlp <text>, or - for stdin. Try --help.' };
1118
1145
  };
1119
1146
  // Extract a normalized RunResult from the alethia_tell response. The runtime
1120
- // wraps the run in MCP content blocks; we want the raw run object.
1147
+ // returns one of three shapes:
1148
+ // 1. Compact (default): { ok, runId, name, elapsedMs, steps[], snapshot }
1149
+ // 2. Audit: { ok, run: { stepRuns[], lines[], name, elapsedMs, ... }, ... }
1150
+ // 3. MCP-wrapped: { content: [{ type: 'text', text: <JSON of (1) or (2)> }] }
1151
+ // extractRunResult handles all three.
1121
1152
  export const extractRunResult = (response) => {
1122
- const r = response ?? {};
1123
- // The runtime returns either { ok, run: {...} } directly or wrapped in
1124
- // MCP content. Handle both shapes.
1125
- let run = r.run;
1126
- if (!run && r.content && Array.isArray(r.content)) {
1127
- // MCP content shape — first text block holds JSON
1153
+ let r = response ?? {};
1154
+ // MCP-wrapped: unwrap the inner JSON.
1155
+ if (Array.isArray(r.content) && r.content.length > 0) {
1128
1156
  const text = r.content[0]?.text;
1129
1157
  if (text) {
1130
1158
  try {
1131
- const parsed = JSON.parse(text);
1132
- run = parsed.run ?? parsed;
1159
+ r = JSON.parse(text);
1133
1160
  }
1134
- catch { /* leave run undefined; will fall through to defaults */ }
1161
+ catch { /* keep r */ }
1135
1162
  }
1136
1163
  }
1137
- run = run ?? r;
1138
- const stepRunsRaw = Array.isArray(run.stepRuns) ? run.stepRuns : [];
1139
- const linesRaw = Array.isArray(run.lines) ? run.lines : [];
1164
+ // Audit shape has r.run.stepRuns; compact has r.steps directly.
1165
+ const auditRun = (r.run && typeof r.run === 'object' ? r.run : null);
1166
+ const stepRunsRaw = auditRun && Array.isArray(auditRun.stepRuns) ? auditRun.stepRuns :
1167
+ Array.isArray(r.steps) ? r.steps :
1168
+ [];
1169
+ const linesRaw = auditRun && Array.isArray(auditRun.lines) ? auditRun.lines : [];
1140
1170
  const steps = stepRunsRaw.map((s, i) => {
1141
1171
  const step = s ?? {};
1142
1172
  const lineEntry = linesRaw[i] ?? {};
@@ -1150,13 +1180,20 @@ export const extractRunResult = (response) => {
1150
1180
  });
1151
1181
  const passCount = steps.filter((s) => s.ok).length;
1152
1182
  const failCount = steps.length - passCount;
1183
+ // Plan name: audit.run.name is the actual plan name (from NAME directive
1184
+ // or --name arg); compact's r.name is just the tool tag ("tell"). For
1185
+ // compact, prefer r.runId or fall back.
1186
+ const planName = (auditRun && typeof auditRun.name === 'string' && auditRun.name) ? auditRun.name :
1187
+ (typeof r.runId === 'string' ? r.runId : 'alethia run');
1153
1188
  return {
1154
1189
  ok: failCount === 0 && r.ok !== false,
1155
- name: typeof run.name === 'string' ? run.name : 'unnamed',
1190
+ name: planName,
1156
1191
  passCount,
1157
1192
  failCount,
1158
1193
  stepCount: steps.length,
1159
- elapsedMs: typeof run.elapsedMs === 'number' ? run.elapsedMs : 0,
1194
+ elapsedMs: auditRun && typeof auditRun.elapsedMs === 'number' ? auditRun.elapsedMs :
1195
+ typeof r.elapsedMs === 'number' ? r.elapsedMs :
1196
+ 0,
1160
1197
  steps,
1161
1198
  };
1162
1199
  };
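To see how the three response shapes listed in the new comment collapse to one result, a simplified sketch follows. It mirrors the unwrap order shown in the hunk but drops the per-step mapping; `normalizeRun` and the sample payloads are illustrative assumptions, not the package's code or real runtime output.

```javascript
// Simplified sketch of the three-shape normalization (hypothetical helper).
const normalizeRun = (response) => {
  let r = response ?? {};
  // MCP-wrapped: the first text block carries JSON of the compact or audit shape.
  if (Array.isArray(r.content) && r.content.length > 0) {
    const text = r.content[0]?.text;
    if (typeof text === 'string') {
      try { r = JSON.parse(text); } catch { /* not JSON: keep the wrapper */ }
    }
  }
  // Audit nests the run under r.run; compact keeps steps at the top level.
  const audit = r.run && typeof r.run === 'object' ? r.run : null;
  const steps = audit && Array.isArray(audit.stepRuns) ? audit.stepRuns
    : Array.isArray(r.steps) ? r.steps
    : [];
  const name = audit && typeof audit.name === 'string' && audit.name ? audit.name
    : typeof r.runId === 'string' ? r.runId
    : 'alethia run';
  return { name, stepCount: steps.length };
};

// Made-up sample payloads, one per shape.
const compactSample = { ok: true, runId: 'run-1', name: 'tell', steps: [{ ok: true }] };
const auditSample = { ok: true, run: { name: 'login flow', stepRuns: [{ ok: true }, { ok: true }], lines: [] } };
const wrappedSample = { content: [{ type: 'text', text: JSON.stringify(auditSample) }] };
```

Unwrapping first and branching on `r.run` second means the MCP-wrapped case needs no separate code path once the inner JSON is parsed.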
@@ -1267,20 +1304,13 @@ const runCli = async (argv) => {
1267
1304
  process.exit(1);
1268
1305
  }
1269
1306
  };
1270
- // `alethia-mcp run <...>`: agent-less CLI runner for CI. Branches BEFORE
1271
- // the global --help / --version handlers so `alethia run --help` shows
1272
- // run-specific help, not the top-level help. Below this block, control
1273
- // falls through to the existing MCP stdio-server bootstrap.
1274
- if (process.argv[2] === 'run') {
1275
- await runCli(process.argv.slice(3));
1276
- // runCli() always calls process.exit(); this is unreachable but satisfies
1277
- // the type system.
1278
- process.exit(0);
1279
- }
1280
- if (process.argv.includes('--help') || process.argv.includes('-h')) {
1307
+ // Skip the global --help / --version handlers when the user is in run mode.
1308
+ // run-mode has its own --help and the dispatcher fires below in isMainModule.
1309
+ const inRunMode = process.argv[2] === 'run';
1310
+ if (!inRunMode && (process.argv.includes('--help') || process.argv.includes('-h'))) {
1281
1311
  printAndExit(CLI_HELP);
1282
1312
  }
1283
- if (process.argv.includes('--version') || process.argv.includes('-v')) {
1313
+ if (!inRunMode && (process.argv.includes('--version') || process.argv.includes('-v'))) {
1284
1314
  printAndExit(`${PKG_NAME} v${PKG_VERSION}`);
1285
1315
  }
1286
1316
  // ---------------------------------------------------------------------------
@@ -1354,7 +1384,7 @@ const callAlethia = (body, timeoutMs = ALETHIA_TIMEOUT_MS) => new Promise((resol
1354
1384
  `Troubleshooting:\n` +
1355
1385
  ` → Run: alethia-mcp --health-check\n` +
1356
1386
  ` → Releases: https://github.com/vitron-ai/alethia/releases\n` +
1357
- ` → Licensing: gatekeeper@vitron.ai\n` +
1387
+ ` → Licensing: team@vitron.ai\n` +
1358
1388
  `\n` +
1359
1389
  `Override host/port with ALETHIA_HOST / ALETHIA_PORT environment vars\n` +
1360
1390
  `if your runtime listens on a non-default address.`));
@@ -1424,7 +1454,7 @@ const runHealthCheck = async () => {
1424
1454
  `GitHub Releases. Ed25519-signed, no signup required.\n` +
1425
1455
  `\n` +
1426
1456
  ` → https://github.com/vitron-ai/alethia/releases\n` +
1427
- ` → Licensing: gatekeeper@vitron.ai\n`);
1457
+ ` → Licensing: team@vitron.ai\n`);
1428
1458
  process.exit(1);
1429
1459
  }
1430
1460
  };
@@ -1473,6 +1503,12 @@ const TOOLS = [
1473
1503
  'Destructive actions (delete, purchase, transfer, etc.) are blocked unconditionally. ' +
1474
1504
  'Sensitive input (passwords, credit cards, SSN) is blocked unless allowSensitiveInput is true. ' +
1475
1505
  '~13 ms per step on average.',
1506
+ annotations: {
1507
+ title: 'Run E2E Tests',
1508
+ readOnlyHint: false,
1509
+ destructiveHint: false,
1510
+ openWorldHint: true,
1511
+ },
1476
1512
  inputSchema: {
1477
1513
  type: 'object',
1478
1514
  properties: {
@@ -1498,6 +1534,11 @@ const TOOLS = [
1498
1534
  'Returns the compiled IR, per-line confidence scores (0-1), and warnings for any lines the compiler ' +
1499
1535
  'could not parse. Use this to preview what tell() will run, debug coverage gaps, or generate ' +
1500
1536
  'reproducible IR scripts for CI pipelines.',
1537
+ annotations: {
1538
+ title: 'Compile Test Instructions',
1539
+ readOnlyHint: true,
1540
+ destructiveHint: false,
1541
+ },
1501
1542
  inputSchema: {
1502
1543
  type: 'object',
1503
1544
  properties: {
@@ -1515,6 +1556,12 @@ const TOOLS = [
1515
1556
  'kill switch state, driver statistics (queued plans, run count, audit count), the current page domain, ' +
1516
1557
  'and runtime capabilities. Use this for liveness checks before sending tell() calls, and to verify ' +
1517
1558
  'the runtime is in a known-good state at the start of an agent loop.',
1559
+ annotations: {
1560
+ title: 'Check Runtime Status',
1561
+ readOnlyHint: true,
1562
+ destructiveHint: false,
1563
+ idempotentHint: true,
1564
+ },
1518
1565
  inputSchema: { type: 'object', properties: {} },
1519
1566
  },
1520
1567
  {
@@ -1523,6 +1570,11 @@ const TOOLS = [
1523
1570
  'subsequent tell() calls will be blocked with reason KILL_SWITCH_ACTIVE until reset. ' +
1524
1571
  'Use this when an agent appears to be acting unsafely, when human review is required, or to enforce ' +
1525
1572
  'a hard boundary at the end of a controlled test run.',
1573
+ annotations: {
1574
+ title: 'Activate Kill Switch',
1575
+ readOnlyHint: false,
1576
+ destructiveHint: true,
1577
+ },
1526
1578
  inputSchema: {
1527
1579
  type: 'object',
1528
1580
  properties: {
@@ -1533,16 +1585,15 @@ const TOOLS = [
1533
1585
  },
1534
1586
  },
1535
1587
  },
1536
- {
1537
- name: 'alethia_reset_kill_switch',
1538
- description: 'Clear an active kill switch and resume normal operation. ' +
1539
- 'Re-enables tell() calls. The reset itself is logged in the audit trail for compliance review.',
1540
- inputSchema: { type: 'object', properties: {} },
1541
- },
1542
1588
  {
1543
1589
  name: 'alethia_screenshot',
1544
1590
  description: 'Capture a PNG screenshot of the current page and return it as a base64-encoded image. ' +
1545
1591
  'Use this to visually verify what the browser is showing after running test steps with alethia_tell.',
1592
+ annotations: {
1593
+ title: 'Take Screenshot',
1594
+ readOnlyHint: true,
1595
+ destructiveHint: false,
1596
+ },
1546
1597
  inputSchema: { type: 'object', properties: {} },
1547
1598
  },
1548
1599
  {
@@ -1551,6 +1602,12 @@ const TOOLS = [
1551
1602
  'Runs in the context of the navigated page, not the Alethia host UI. ' +
1552
1603
  'Use this for queries the NLP compiler cannot express — counting elements, reading computed styles, ' +
1553
1604
  'checking localStorage, or any DOM inspection that needs raw JS.',
1605
+ annotations: {
1606
+ title: 'Evaluate JavaScript',
1607
+ readOnlyHint: false,
1608
+ destructiveHint: false,
1609
+ openWorldHint: true,
1610
+ },
1554
1611
  inputSchema: {
1555
1612
  type: 'object',
1556
1613
  properties: {
@@ -1568,6 +1625,11 @@ const TOOLS = [
1568
1625
  'alt text, form labels, keyboard access, page title, lang attribute, link purpose, ' +
1569
1626
  'heading structure, duplicate IDs, and more. Call after navigating with alethia_tell. ' +
1570
1627
  'Returns findings with WCAG criterion numbers, severity levels, and issue counts.',
1628
+ annotations: {
1629
+ title: 'WCAG Accessibility Audit',
1630
+ readOnlyHint: true,
1631
+ destructiveHint: false,
1632
+ },
1571
1633
  inputSchema: { type: 'object', properties: {} },
1572
1634
  },
1573
1635
  {
@@ -1577,6 +1639,11 @@ const TOOLS = [
1577
1639
  'IA (unmasked passwords, weak password constraints, MFA indicators), ' +
1578
1640
  'SI (input validation, error information leakage). ' +
1579
1641
  'Call after navigating with alethia_tell. Returns findings with control IDs and severity levels.',
1642
+ annotations: {
1643
+ title: 'NIST 800-53 Security Audit',
1644
+ readOnlyHint: true,
1645
+ destructiveHint: false,
1646
+ },
1580
1647
  inputSchema: { type: 'object', properties: {} },
1581
1648
  },
1582
1649
  {
@@ -1585,6 +1652,11 @@ const TOOLS = [
1585
1652
  'made during this session with timestamps, inputs, outputs, policy decisions, and a ' +
1586
1653
  'SHA-256 integrity hash. Use at the end of an agent loop to produce cryptographic proof ' +
1587
1654
  'of everything the agent did. Designed for compliance review and chain-of-custody.',
1655
+ annotations: {
1656
+ title: 'Export Session Evidence',
1657
+ readOnlyHint: true,
1658
+ destructiveHint: false,
1659
+ },
1588
1660
  inputSchema: { type: 'object', properties: {} },
1589
1661
  },
1590
1662
  {
@@ -1592,6 +1664,12 @@ const TOOLS = [
1592
1664
  description: 'Run multiple test flows concurrently — each against a different URL. ' +
1593
1665
  'Takes an array of test specs, spawns a browser instance per spec, runs them in parallel, ' +
1594
1666
  'and returns all results together. Use this to verify multiple pages simultaneously.',
1667
+ annotations: {
1668
+ title: 'Run Parallel Tests',
1669
+ readOnlyHint: false,
1670
+ destructiveHint: false,
1671
+ openWorldHint: true,
1672
+ },
1595
1673
  inputSchema: {
1596
1674
  type: 'object',
1597
1675
  properties: {
@@ -1618,6 +1696,12 @@ const TOOLS = [
1618
1696
  'Use this to serve demo pages on localhost so they appear in preview panels (Claude Code, VS Code, etc.). ' +
1619
1697
  'The server runs on a random available port on 127.0.0.1. Call this before alethia_tell to get a localhost URL ' +
1620
1698
  'instead of a file:// path. Returns the base URL and a list of available demo pages.',
1699
+ annotations: {
1700
+ title: 'Serve Demo Pages',
1701
+ readOnlyHint: false,
1702
+ destructiveHint: false,
1703
+ idempotentHint: true,
1704
+ },
1621
1705
  inputSchema: { type: 'object', properties: {} },
1622
1706
  },
1623
1707
  {
@@ -1627,6 +1711,12 @@ const TOOLS = [
1627
1711
  'Returns an array of plain-English test blocks, including an auto-generated "EA1 Safety Gate Verification" ' +
1628
1712
  'block that uses "expect block: <action>" for every destructive control on the page. ' +
1629
1713
  'Use this to bootstrap test coverage for a new page or to discover what the safety gate should be watching.',
1714
+ annotations: {
1715
+ title: 'Propose Test Suite',
1716
+ readOnlyHint: true,
1717
+ destructiveHint: false,
1718
+ openWorldHint: true,
1719
+ },
1630
1720
  inputSchema: {
1631
1721
  type: 'object',
1632
1722
  properties: {
@@ -1645,6 +1735,12 @@ const TOOLS = [
1645
1735
  'This is the automated policy-verification primitive — proves the safety gate works on a real page ' +
1646
1736
  'without the agent or human having to click each destructive button manually. Use it as a compliance ' +
1647
1737
  'check before releasing an agent-driven workflow against a customer environment.',
1738
+ annotations: {
1739
+ title: 'Verify EA1 Safety Gate',
1740
+ readOnlyHint: true,
1741
+ destructiveHint: false,
1742
+ openWorldHint: true,
1743
+ },
1648
1744
  inputSchema: {
1649
1745
  type: 'object',
1650
1746
  properties: {
@@ -1661,12 +1757,24 @@ const TOOLS = [
1661
1757
  description: 'Show the Alethia cockpit window — the oversight surface where the target app is driven and each ' +
1662
1758
  'step is highlighted live (green = pass, blue = type, red = EA1 block). Use this to pop the UI ' +
1663
1759
  'into view during a headless-launched session for demos, review, or partner walkthroughs.',
1760
+ annotations: {
1761
+ title: 'Show Cockpit',
1762
+ readOnlyHint: false,
1763
+ destructiveHint: false,
1764
+ idempotentHint: true,
1765
+ },
1664
1766
  inputSchema: { type: 'object', properties: {} },
1665
1767
  },
1666
1768
  {
1667
1769
  name: 'alethia_hide_cockpit',
1668
1770
  description: 'Hide the Alethia cockpit window. The runtime keeps running and continues to accept tool calls; ' +
1669
1771
  'only the visible window is dismissed.',
1772
+ annotations: {
1773
+ title: 'Hide Cockpit',
1774
+ readOnlyHint: false,
1775
+ destructiveHint: false,
1776
+ idempotentHint: true,
1777
+ },
1670
1778
  inputSchema: { type: 'object', properties: {} },
1671
1779
  },
1672
1780
  ];
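The `annotations` objects added throughout `TOOLS` are hints an MCP client can use when deciding how much confirmation a tool call needs. The sketch below shows one way a client might read them. The three policy labels and the `classifyTool` helper are hypothetical, and the fallback values for unset hints follow the MCP specification as I understand it (`readOnlyHint` false, `destructiveHint` true), which should be checked against the spec version in use.

```javascript
// Hypothetical client-side policy built on MCP tool annotation hints.
const classifyTool = (tool) => {
  const a = tool.annotations ?? {};
  const readOnly = a.readOnlyHint === true;        // unset is treated as false
  if (readOnly) return 'auto-approve';
  // destructiveHint only matters for non-read-only tools; unset is treated as true.
  const destructive = a.destructiveHint !== false;
  return destructive ? 'confirm' : 'review';
};

// Annotation values copied from three entries in this diff.
const sampleTools = [
  { annotations: { title: 'Check Runtime Status', readOnlyHint: true, destructiveHint: false, idempotentHint: true } },
  { annotations: { title: 'Run E2E Tests', readOnlyHint: false, destructiveHint: false, openWorldHint: true } },
  { annotations: { title: 'Activate Kill Switch', readOnlyHint: false, destructiveHint: true } },
];
const decisions = sampleTools.map(classifyTool);
```

Hints are advisory, so a client would normally treat them as input to its own policy instead of as a guarantee about tool behavior.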
@@ -1676,7 +1784,6 @@ const TOOL_NAME_MAP = {
1676
1784
  alethia_compile: 'alethia_compile_nlp',
1677
1785
  alethia_status: 'alethia_status',
1678
1786
  alethia_activate_kill_switch: 'alethia_activate_kill_switch',
1679
- alethia_reset_kill_switch: 'alethia_reset_kill_switch',
1680
1787
  alethia_screenshot: 'alethia_screenshot',
1681
1788
  alethia_eval: 'alethia_eval',
1682
1789
  alethia_audit_wcag: 'alethia_audit_wcag',
@@ -1732,7 +1839,6 @@ const validateToolArgs = (toolName, args) => {
1732
1839
  return null;
1733
1840
  }
1734
1841
  case 'alethia_status':
1735
- case 'alethia_reset_kill_switch':
1736
1842
  case 'alethia_screenshot':
1737
1843
  case 'alethia_audit_wcag':
1738
1844
  case 'alethia_audit_nist':
@@ -1896,7 +2002,7 @@ const handle = async (request) => {
1896
2002
  '- alethia_status: Health check — version, policy profile, kill switch state.\n' +
1897
2003
  '- alethia_screenshot: Capture a PNG screenshot of the current page.\n' +
1898
2004
  '- alethia_eval: Run JavaScript in the page under test.\n' +
1899
- '- alethia_activate_kill_switch / alethia_reset_kill_switch: Emergency halt and resume.\n' +
2005
+ '- alethia_activate_kill_switch: Emergency halt. The kill auto-clears on the operator\'s next Run from the cockpit; agents have no self-release path by design.\n' +
1900
2006
  '- alethia_audit_wcag: WCAG 2.1 AA accessibility audit — 14 criteria.\n' +
1901
2007
  '- alethia_audit_nist: NIST SP 800-53 security controls audit — 8 controls.\n' +
1902
2008
  '- alethia_export_session: Export signed evidence pack of everything the agent did this session.\n' +
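The final hunk moves the `run` dispatch into the `isMainModule` block because calling `runCli` earlier hit a temporal-dead-zone error. A minimal illustration of that failure mode, with hypothetical names standing in for `runCli` and `callAlethia`:

```javascript
// `runEarly` closes over `connect`, which is declared later with const.
const runEarly = () => {
  try {
    return connect();
  } catch (err) {
    return err.name;
  }
};

// Called before the `const connect` line has executed: the binding exists
// but is uninitialized, so the access throws.
const before = runEarly();

const connect = () => 'connected';

// The same call succeeds once the declaration has been evaluated.
const after = runEarly();
```

Here `before` holds `'ReferenceError'` and `after` holds `'connected'`, which is why the dispatch has to run after all module-level `const` declarations have been evaluated.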
@@ -2142,6 +2248,15 @@ const isMainModule = (() => {
2142
2248
  }
2143
2249
  })();
2144
2250
  if (isMainModule) {
2251
+ // `alethia-mcp run <...>` — agent-less CLI runner for CI. Has to dispatch
2252
+ // here, AFTER all module-level const declarations (callAlethia, ensureRuntime
2253
+ // et al.) have evaluated; calling runCli earlier hits a temporal-dead-zone
2254
+ // error when spawnRuntime references callAlethia. Bypasses bootstrap
2255
+ // handoff + the MCP stdio server below — runCli always exits.
2256
+ if (process.argv[2] === 'run') {
2257
+ await runCli(process.argv.slice(3));
2258
+ process.exit(0);
2259
+ }
2145
2260
  // BOOTSTRAP HANDOFF: if a newer, signature-verified bridge is installed
2146
2261
  // at ~/.alethia/bridge/<version>/, exec to it instead of running ourselves.
2147
2262
  // This is how the self-update mechanism hands off control without requiring