npm - proteum - Versions diffs - 2.2.9 → 2.3.0 - Mend

proteum 2.2.9 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (45)

package/AGENTS.md +3 -2
package/README.md +49 -11
package/agents/project/AGENTS.md +43 -5
package/agents/project/diagnostics.md +6 -2
package/agents/project/optimizations.md +1 -0
package/agents/project/root/AGENTS.md +14 -5
package/agents/project/tests/AGENTS.md +6 -0
package/agents/project/tests/e2e/AGENTS.md +13 -0
package/agents/project/tests/e2e/REAL_WORLD_JOURNEY_TESTS.md +192 -0
package/cli/commands/connect.ts +40 -4
package/cli/commands/diagnose.ts +136 -5
package/cli/commands/doctor.ts +24 -4
package/cli/commands/explain.ts +105 -6
package/cli/commands/mcp.ts +16 -0
package/cli/commands/orient.ts +66 -3
package/cli/commands/perf.ts +118 -13
package/cli/commands/runtime.ts +151 -0
package/cli/commands/trace.ts +116 -21
package/cli/mcp/provider.ts +365 -0
package/cli/mcp/stdio.ts +16 -0
package/cli/presentation/commands.ts +77 -20
package/cli/presentation/devSession.ts +2 -0
package/cli/runtime/commands.ts +95 -12
package/cli/utils/agentOutput.ts +46 -0
package/cli/utils/agents.ts +116 -49
package/common/dev/inspection.ts +14 -6
package/common/dev/mcpPayloads.ts +736 -0
package/common/dev/mcpServer.ts +254 -0
package/docs/agent-routing.md +126 -0
package/docs/dev-commands.md +2 -0
package/docs/dev-sessions.md +2 -1
package/docs/diagnostics.md +68 -23
package/docs/mcp.md +149 -0
package/docs/migrate-from-2.1.3.md +15 -5
package/docs/request-tracing.md +12 -6
package/package.json +2 -1
package/server/app/devMcp.ts +159 -0
package/server/services/router/http/cache.ts +116 -0
package/server/services/router/http/index.ts +94 -35
package/server/services/router/index.ts +8 -11
package/tests/agents-utils.test.cjs +36 -13
package/tests/dev-transpile-watch.test.cjs +117 -8
package/tests/inspection.test.cjs +67 -0
package/tests/mcp.test.cjs +127 -0
package/tests/router-cache-config.test.cjs +74 -0

package/AGENTS.md CHANGED

@@ -58,6 +58,7 @@ npx prisma migrate dev --config ./prisma.config.ts --name <migration name>
     - `/Users/gaetan/Desktop/Projets/unique.domains/platform/apps/website`
 - Inspect how both apps currently use the touched feature, runtime, API, compiler behavior, or generated output before proposing or implementing changes.
 - Keep the developer-facing contract synchronized when framework work changes CLI commands, profiler capabilities, or the `proteum dev` banner. Update the live surfaces together in the same pass: CLI command/help definitions, profiler panels and dev-only endpoints, banner text/examples, and the most relevant agent docs that describe them, especially `AGENTS.md`, `agents/project/AGENTS.md`, `agents/project/root/AGENTS.md`, `agents/project/app-root/AGENTS.md`, `agents/project/diagnostics.md`, and any narrower `agents/project/**/AGENTS.md` file that mentions the changed workflow.
+- Proteum MCP contract: `proteum mcp` is the read-only stdio server for compact agent diagnostics, and `proteum dev` exposes the same runtime-adjacent MCP contract at `/__proteum/mcp`. Keep MCP tools/resources compact, typed, capped, paginated for full trace detail, and read-only unless a future task explicitly expands the mutation contract. MCP payloads are compact single-line `proteum-mcp-v1` JSON, not pretty-printed human output. Do not implement MCP tools as thin CLI process wrappers when the data is available through manifest readers, tracked sessions, or dev runtime registries.
 - Keep the same-system trace contract explicit when request instrumentation changes: `TRACE_*` controls the retained dev trace store plus the trace/perf CLI, dev-only HTTP endpoints, and bottom profiler, while `ENABLE_PROFILER` enables the reduced request-local `request.profiling` snapshot and `request.finished` hook payload without retaining finished requests globally unless dev trace is also enabled.
 - Current CLI banner contract: only the bare `proteum build` and bare `proteum dev` commands print the welcome banner and include the active Proteum installation method. Any extra argument or option skips the banner. Only `proteum dev` clears the interactive terminal before rendering, exposes `CTRL+R` reload plus `CTRL+C` shutdown hotkeys in its session UI, and reports connected app names plus successful connected `/ping` checks in the ready banner. Every `proteum dev` start ensures tracked instruction files contain the current managed `# Proteum Instructions` section before the dev loop begins.
 - Keep core changes aligned with the explicit controller/page architecture in `agents/project/root/AGENTS.md` and its standalone composition in `agents/project/AGENTS.md`.
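
The "compact, capped, paginated, single-line" wording in the MCP contract bullet above can be illustrated with a minimal sketch. Only the `proteum-mcp-v1` label comes from the diff; the envelope fields (`format`, `items`, `total`, `nextCursor`) and the helper names are hypothetical and are not taken from `package/common/dev/mcpPayloads.ts`.

```typescript
// Hypothetical envelope: only the "proteum-mcp-v1" label appears in the diff.
// The field names below are illustrative assumptions.
interface CompactPage<T> {
  format: 'proteum-mcp-v1';
  items: T[];
  total: number;
  nextCursor: number | null; // offset of the next page, or null when exhausted
}

// Return at most `limit` items starting at `cursor`, plus a cursor for the next page.
function paginate<T>(all: readonly T[], cursor: number, limit: number): CompactPage<T> {
  const start = Math.max(0, cursor);
  const end = Math.min(all.length, start + Math.max(1, limit));
  return {
    format: 'proteum-mcp-v1',
    items: all.slice(start, end),
    total: all.length,
    nextCursor: end < all.length ? end : null,
  };
}

// JSON.stringify without an indent argument emits a single line.
function toCompactJson(payload: unknown): string {
  return JSON.stringify(payload);
}

const events = ['auth', 'route', 'controller', 'render', 'response'];
const firstPage = paginate(events, 0, 2);
const line = toCompactJson(firstPage);
```

A reader following the cursor would call `paginate(events, firstPage.nextCursor, 2)` until `nextCursor` is `null`, which keeps each individual payload small.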
@@ -80,12 +81,12 @@ Do not stop at static analysis for routing, controllers, generated code, SSR, cl
 - Run `npx proteum dev --no-cache --session-file var/run/proteum/dev/framework-<app>.json --port <free-3xxx-port>` in both reference apps on explicit free ports and with elevated permissions outside the sandbox.
 - If either reference app uses local `file:` connected projects for the affected flow, run those producer apps too on their own free ports before exercising the consumer.
-- When validating a concrete route, controller path, or failing page on a running dev server, prefer `proteum diagnose <path> --port <port>` first. Use raw `proteum trace ...` output when you need lower-level event detail beyond the diagnose summary.
+- When validating a concrete route, controller path, or failing page on a running dev server, prefer `proteum diagnose <path> --port <port>` first. Use `proteum trace show <requestId> --events` only when you need lower-level event detail beyond the compact diagnose summary.
 - When the issue is latency, CPU, SQL cost, render cost, or memory drift, inspect `proteum perf top`, `proteum perf request`, `proteum perf compare`, or `proteum perf memory` against the running dev server before adding custom instrumentation.
 - When a framework change can affect shipped client code size, run `proteum build --prod --analyze` for static bundle artifacts or `proteum build --prod --analyze --analyze-serve --analyze-port auto` when you need a local analyzer URL.
 - For protected browser or API flows in dev, prefer `npx proteum session <email> --role <role>` for browser MCP validation, or `npx proteum e2e --session-email <email> --session-role <role>` for automated end-to-end suites, instead of automating the login UI. Use the login UI only when login itself is the feature under test.
 - When a task needs browser execution instead of the higher-level verifier, use the browser MCP. Keep Playwright inside `npx proteum e2e --port <port>` for targeted or full end-to-end suites. Keep auth sourced from Proteum session helpers, not UI login or shared browser state.
-- For request-time behavior, arm traces with `proteum trace arm --capture deep`, reproduce once, then inspect `proteum trace latest` or `proteum trace show <requestId>`.
+- For request-time behavior, arm traces with `proteum trace arm --capture deep`, reproduce once, then inspect compact `proteum trace latest` or raw `proteum trace show <requestId> --events` only when needed.
 - When the framework-facing workflow itself changed, verify the CLI surface too with `proteum verify framework-change --crosspath-port <port> --product-port <port> --website-port <port>`.
 - Only the final verifier agent should usually run browser flows. Other agents should stay on `orient`, `verify owner`, `verify request`, and command-level checks unless browser execution is the only trustworthy surface.
 - Open the real pages with the browser MCP.

package/README.md CHANGED

@@ -192,6 +192,42 @@ export default class MyApp extends Application {
 Proteum reads `server/index.ts` as the source of truth for installed root services and router plugins, and reads `server/config/*.ts` `Services.config(...)` exports for typed config such as service priority overrides.
+## Router Cache Policy
+Browser cache headers are configurable per app through the optional `routerBaseConfig.http.cache` object. Omit it to keep Proteum's defaults.
+```ts
+export const routerBaseConfig = {
+  currentDomain: AppContainer.Environment.router.currentDomain,
+  http: {
+    domain: 'example.com',
+    port: AppContainer.Environment.router.port,
+    ssl: true,
+    upload: { maxSize: '10mb' },
+    cache: {
+      html: {
+        dynamic: {
+          cacheControl: 'no-store, no-cache, must-revalidate, proxy-revalidate',
+          surrogateControl: 'no-store',
+        },
+        static: {
+          cacheControl: 'public, max-age=0, must-revalidate',
+          surrogateControl: false,
+        },
+      },
+      publicAssets: {
+        dev: 'no-store',
+        versioned: 'public, max-age=31536000, immutable',
+        unversioned: 'public, max-age=0, must-revalidate',
+      },
+    },
+  },
+  context: () => ({}),
+} satisfies RouterBaseConfig;
+```
+Default public asset validators depend on the environment: dev disables `ETag` and `Last-Modified`, while non-dev enables them. Use `etag: false` and `lastModified: false` when an app needs to fully disable browser cache for `/public` assets.
 ## Example: Page
 Proteum pages are explicit SSR entrypoints.
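
As a rough illustration of how the three `publicAssets` values in the README hunk above might be selected per request, here is a minimal sketch. The `isVersioned` heuristic and the `pickCacheControl` helper are hypothetical; the actual selection logic lives in `package/server/services/router/http/cache.ts`, which this excerpt does not show.

```typescript
// Shape and values copied from the `publicAssets` block in the README example.
interface PublicAssetsCache {
  dev: string;
  versioned: string;
  unversioned: string;
}

const publicAssets: PublicAssetsCache = {
  dev: 'no-store',
  versioned: 'public, max-age=31536000, immutable',
  unversioned: 'public, max-age=0, must-revalidate',
};

// Hypothetical heuristic: a file name carrying a hex content hash
// (for example `app.3f9a1c2b.js`) counts as versioned.
// Proteum's real rule is not shown in this diff.
function isVersioned(path: string): boolean {
  return /\.[0-9a-f]{8,}\.[a-z0-9]+$/i.test(path);
}

// Pick the Cache-Control value: dev wins, then versioned vs unversioned.
function pickCacheControl(path: string, isDev: boolean, policy: PublicAssetsCache): string {
  if (isDev) return policy.dev;
  return isVersioned(path) ? policy.versioned : policy.unversioned;
}

const hashedHeader = pickCacheControl('/public/app.3f9a1c2b.js', false, publicAssets);
const plainHeader = pickCacheControl('/public/logo.png', false, publicAssets);
```

The long-lived `immutable` value is only safe for assets whose URL changes with their content, which is why the sketch keeps unversioned files on `max-age=0, must-revalidate`.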
@@ -342,6 +378,7 @@ Proteum ships with a compact CLI focused on the real app lifecycle:
 | `proteum diagnose` | Combine owner lookup, diagnostics, trace data, and server logs for one concrete route or request target |
 | `proteum perf` | Aggregate request-trace performance into hot paths, one-request waterfalls, regressions, and memory drift views |
 | `proteum trace` | Inspect live dev-only request traces from the running SSR server |
+| `proteum mcp` | Start the read-only MCP server for compact agent diagnostics and runtime reads |
 | `proteum command` | Run a dev-only internal command locally or against a running dev server |
 | `proteum session` | Mint a dev-only auth session token and Playwright-ready cookie payload |
 | `proteum e2e` | Run Playwright with Proteum-managed `E2E_*` values instead of shell-leading env assignments |
@@ -383,6 +420,7 @@ proteum perf top --since today
 proteum perf request /dashboard --port 3101
 proteum perf compare --baseline yesterday --target today --group-by route
 proteum perf memory --since 1h --group-by controller
+proteum mcp --url http://localhost:3101
 proteum command proteum/diagnostics/ping
 proteum command proteum/diagnostics/ping --port 3101
 proteum session admin@example.com --role ADMIN --port 3101
@@ -408,7 +446,7 @@ proteum create service Conversion/Plans
 Every `proteum dev` start runs the same idempotent instruction check. It updates missing or stale managed sections automatically and prompts only when a blocked path would need to be replaced.
-`proteum connect`, `proteum explain`, `proteum doctor`, and `proteum diagnose` share the same generated manifest and contract state. `proteum perf` uses the same dev request-trace store as the profiler `Perf` tab. For the full diagnostics and tracing model, see [docs/diagnostics.md](docs/diagnostics.md) and [docs/request-tracing.md](docs/request-tracing.md).
+`proteum connect`, `proteum explain`, `proteum doctor`, `proteum diagnose`, and `proteum mcp` share the same generated manifest and contract state. `proteum perf` uses the same dev request-trace store as the profiler `Perf` tab. `proteum dev` also exposes the same read-only MCP contract at `/__proteum/mcp` for repeated agent reads against the live runtime. For the full diagnostics and tracing model, see [docs/diagnostics.md](docs/diagnostics.md), [docs/mcp.md](docs/mcp.md), and [docs/request-tracing.md](docs/request-tracing.md).
 ## Dev Commands
@@ -450,7 +488,7 @@ The CLI talks to the running app over the dev-only `__proteum/session/start` end
 Proteum includes a dev-only in-memory request trace buffer for auth, routing, controller, context, SSR, API, Prisma SQL, and render debugging.
-This is separate from `proteum explain` and `proteum doctor`: tracing is live request-time data, while explain/doctor are manifest-backed structure and diagnostics. `proteum perf` aggregates the same trace buffer into hot-path, waterfall, compare, and memory views. When you already know the failing path and want the fastest suspect list, start with `proteum diagnose`; when the issue is performance, start with `proteum perf`; then drop into raw trace output only if needed.
+This is separate from `proteum explain` and `proteum doctor`: tracing is live request-time data, while explain/doctor are manifest-backed structure and diagnostics. `proteum perf` aggregates the same trace buffer into hot-path, waterfall, compare, and memory views. When you already know the failing path and want the fastest suspect list, start with `proteum diagnose`; when the issue is performance, start with `proteum perf`; then drop into raw trace output only if needed. When an agent needs repeated trace, perf, diagnose, status, or instruction-routing reads from the same running app, use `proteum mcp --url <dev-url>` or the dev-hosted `/__proteum/mcp` endpoint.
 When diagnosing or testing against an app, first read the default port from `PORT` or `./.proteum/manifest.json` and check whether a server is already running there. If it is, inspect the existing traces before reproducing the issue so you can collect past errors and their context.
@@ -522,6 +560,7 @@ Proteum answers those questions with explicit artifacts:
 - `proteum explain owner <query>` for fast ownership lookup over routes, controllers, files, and generated artifacts
 - `proteum diagnose <path>` for a one-shot request diagnosis surface
 - `proteum perf top|request|compare|memory` for request-trace performance rollups
+- `proteum mcp` or `/__proteum/mcp` for repeated low-token agent reads over the same compact `proteum-mcp-v1` runtime/diagnostic contract
 - the profiler `Explain`, `Doctor`, `Diagnose`, and `Perf` tabs for a human-readable view over the same diagnostics and trace-derived perf contracts
 - `proteum command ...` plus the profiler `Commands` tab for dev-only internal execution
 - `proteum session ...` for explicit authenticated dev browser or API bootstrapping without login UI automation
@@ -529,16 +568,15 @@ Proteum answers those questions with explicit artifacts:
 If you are an LLM or automation agent, start here:
-1. Read `identity.config.ts` and `proteum.config.ts`.
-2. Read `PORT`, the relevant `ENV_*`, `URL`, `URL_INTERNAL`, any env values referenced by `proteum.config.ts`, plus `TRACE_*` and `ENABLE_PROFILER`, or run `proteum explain env`.
-3. Inspect `server/index.ts` and `server/config/*.ts` for the explicit app bootstrap.
-4. Read `.proteum/manifest.json` or run `proteum explain --json`.
-5. Inspect `server/controllers/**` for request entrypoints.
-6. Inspect `server/services/**` for business logic.
-7. Inspect `client/pages/**` for SSR routes and page data contracts.
-8. If the task touches a protected route or controller in dev and login UX is not the feature under test, use `proteum e2e --session-email <email> --session-role <role>` for Playwright suites or `proteum session <email> --role <role>` before direct HTTP calls.
+1. Run `proteum orient <query>` or MCP `orient` before broad source reads.
+2. Read only the returned `instructions.mustRead` files, plus conditional docs for diagnostics, coding style, or optimization when they apply.
+3. Run `proteum runtime status` once before starting a dev server; use MCP `runtime_status` for repeated status reads.
+4. Use `proteum diagnose`, `proteum perf`, and compact `proteum trace` for reproducible command evidence.
+5. Use `proteum mcp --url <dev-url>` or `/__proteum/mcp` for repeated live status, instruction, diagnose, trace, perf, and log reads.
+6. Inspect `server/index.ts`, controllers, services, or pages only after the routing/diagnostic surfaces identify the relevant owner.
+7. If the task touches a protected route or controller in dev and login UX is not the feature under test, use `proteum e2e --session-email <email> --session-role <role>` for Playwright suites or `proteum session <email> --role <role>` before direct HTTP calls.
-For implementation rules in a real Proteum app, treat the local `AGENTS.md` files plus `proteum explain`, `proteum doctor`, `proteum diagnose`, `proteum perf`, and `proteum trace` as the task contract. This README is the framework overview, not the project-local instruction layer.
+For implementation rules in a real Proteum app, treat the routed local `AGENTS.md` files plus `proteum orient`, compact CLI diagnostics, and MCP repeated-read surfaces as the task contract. This README is the framework overview, not the project-local instruction layer.
 ## What Proteum Avoids
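
Steps 1 and 2 of the revised agent checklist (orient first, then read only `instructions.mustRead` plus matching conditional docs) could be sketched as follows. Only `instructions.mustRead` is named in the README; the `conditional` map, its topic keys, and the file paths are assumptions made for illustration.

```typescript
// Hypothetical result shape: only `instructions.mustRead` is named in the README.
// The `conditional` field and its keys are illustrative assumptions.
interface OrientResult {
  instructions: {
    mustRead: string[];
    conditional?: Record<string, string>; // topic -> doc path
  };
}

// Collect the routed files, adding a conditional doc only when its topic applies.
function filesToRead(result: OrientResult, topics: readonly string[]): string[] {
  const picked = result.instructions.mustRead.slice();
  for (const topic of topics) {
    const doc = result.instructions.conditional?.[topic];
    if (doc !== undefined && picked.indexOf(doc) === -1) picked.push(doc);
  }
  return picked;
}

const orientResult: OrientResult = {
  instructions: {
    mustRead: ['AGENTS.md', 'server/AGENTS.md'],
    conditional: { diagnostics: 'diagnostics.md', optimization: 'optimizations.md' },
  },
};

// A server-side bug report pulls in diagnostics.md but not optimizations.md.
const toRead = filesToRead(orientResult, ['diagnostics']);
```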

package/agents/project/AGENTS.md CHANGED

@@ -23,6 +23,9 @@ Coding style source of truth: root-level `CODING_STYLE.md`.
     - re-print the complete list of suggested fixes, but strike the ones we already implemented or not necessary anymore
 - If the user asks to implement a feature, first inspect the relevant existing surface and state any implementation problem, pain point, attention point, or question you see. If a concern is blocking, or it can materially change product behavior, API shape, architecture, data model, cost, privacy, security, or UX, ask before editing; otherwise state the assumption and continue implementing.
 - If the task is ambiguous, generated, connected, or multi-repo, start with `npx proteum orient <query>` before reading large parts of the codebase.
+- Treat Proteum CLI and MCP output as the workflow router. Read only the instruction files returned in `orient` `mustRead` or MCP `instructions_resolve`, plus conditional docs that match the current task. Do not read broad instruction folders or every managed instruction file up front.
+- When a Proteum MCP client is available, prefer read-only MCP tools for repeated runtime/status/orientation/trace/perf/log reads. Use CLI commands when you need reproducible terminal validation, dev/build/check workflows, or output to share with a human.
+- MCP payloads are compact single-line `proteum-mcp-v1` JSON with capped and paginated detail. Do not expand MCP output for human readability.
 - If the user reports an issue, or the agent encounters one during exploration, implementation, verification, or runtime reproduction, load and follow root-level `diagnostics.md`.
 - If the task touches client-side files, especially `client/**` and page files, load and apply root-level `optimizations.md` only after implementation for post-implementation checking and optimization. Skip it at task start and skip it for server-only, test-only, doc-only, and non-client refactor tasks unless the user explicitly asks for optimization work.
 - If the task changes UX, copy, onboarding, pricing, product semantics, or commercial positioning, read the relevant files under `./docs/` first, especially `docs/PERSONAS.md`, `docs/PRODUCT.md`, and `docs/MARKETING.md` when they exist. If a dev server is already running, print the live dev server URL as a clickable Markdown link.
@@ -57,6 +60,7 @@ Coding style source of truth: root-level `CODING_STYLE.md`.
 - When starting a long-lived dev server for an agent task, always request elevated permissions and run `npx proteum dev` outside the sandbox. Use an explicit task/thread-scoped session file such as `var/run/proteum/dev/agents/<task>.json`, inspect `npx proteum dev list --json` plus current listeners first, for example with `lsof -nP -iTCP -sTCP:LISTEN`, then choose a port that is not currently used before starting `npx proteum dev --session-file <path> --port <port>`. After the server is ready, print the live server URL as a clickable Markdown link.
 - Use `--replace-existing` only when restarting the exact session file started by the current thread/task. Never replace another live session that belongs to a user, another thread, or an unknown owner.
 - If the current app depends on local `file:` connected projects, boot every connected producer app too, each with its own task-scoped session file and free port, and run every one of those `proteum dev` processes with elevated permissions outside the sandbox before starting or verifying the consumer app.
+- During `npx proteum dev`, the app exposes the read-only Proteum MCP runtime endpoint at `/__proteum/mcp`; use it for repeated agent reads instead of spawning equivalent diagnostics commands.
 - For browser validation, use the browser MCP against the running app. Keep Playwright inside `npx proteum e2e --port <port>` for targeted/full end-to-end suites. Bootstrap protected browser MCP state with `npx proteum session`; bootstrap protected E2E runs with `npx proteum e2e --session-email <email> --session-role <role>`.
 - Current CLI banner contract: only the bare `proteum build` and bare `proteum dev` commands print the welcome banner and include the active Proteum installation method. Any extra argument or option skips the banner. Only `proteum dev` clears the interactive terminal before rendering, exposes `CTRL+R` reload plus `CTRL+C` shutdown hotkeys in its session UI, and reports connected app names plus successful connected `/ping` checks in the ready banner. Every `proteum dev` start ensures tracked instruction files contain the current managed `# Proteum Instructions` section before the dev loop begins.
@@ -272,19 +276,24 @@ Project code should consume:
 Prefer structured CLI surfaces over re-deriving framework facts from source:
-- `npx proteum connect --json`
+- `npx proteum connect`
 - `npx proteum connect --controllers --strict`
 - `npx proteum orient <query>`
-- `npx proteum explain --json`
+- `npx proteum runtime status`
+- `npx proteum mcp`
+- `npx proteum mcp --url http://localhost:<port>`
+- `npx proteum explain`
+- `npx proteum explain --manifest`
 - `npx proteum explain --connected --controllers`
 - `npx proteum explain owner <query>`
-- `npx proteum doctor --json`
-- `npx proteum doctor --contracts --json`
+- `npx proteum doctor`
+- `npx proteum doctor --contracts`
 - `npx proteum diagnose <path> --port <port>`
 - `npx proteum verify owner <query>`
 - `npx proteum verify request <path>`
 - `npx proteum perf ...`
-- `npx proteum trace ...`
+- `npx proteum trace latest`
+- `npx proteum trace show <requestId> --events`
 - `npx proteum command ...`
 - `npx proteum session ...`
 - `npx proteum create ... --dry-run --json`
@@ -306,3 +315,32 @@ Edit these only when required, and keep changes minimal and explicit:
 - `PORT`, `ENV_*`, `URL`, `TRACE_*`, and `ENABLE_PROFILER` env setup
 - Prisma-generated files
 - symbolic links
+## Delivery Workflow
+Agents working in generated Proteum projects must use this delivery workflow for production code changes:
+1. BDD / ATDD: translate the requested behavior into acceptance scenarios before changing implementation code.
+2. TDD: write or update the smallest failing unit/integration test that proves the next behavior.
+3. Implementation: make the narrowest production change that satisfies the failing test while preserving Proteum boundaries.
+4. Proteum check: refresh and validate generated framework contracts after route, page, controller, service, command, or config changes.
+5. Validate unit + E2E: run the relevant unit tests and real-world journey E2E checks before calling the work complete.
+Unit test expectation: production package and service logic should target 100% meaningful unit coverage for touched behavior. Any excluded generated files, migrations, framework shims, or unreachable defensive branches must be documented in the completion note.
+E2E expectation: real-world journeys must follow the project-local instructions in `tests/e2e/REAL_WORLD_JOURNEY_TESTS.md`. These tests should model complete user workflows, role transitions, permissions, state changes, and cross-view consistency rather than isolated happy paths.
+Recommended validation sequence:
+```bash
+npm run refresh
+npm run typecheck
+npm run lint
+npm run test
+npm run test:integration
+npx proteum check
+npx proteum doctor --contracts --strict
+npx proteum e2e --port <port>
+```
+When bundling, SSR, server startup, routing, or build-time behavior changes, also run the project build command before finishing.
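
The recommended validation sequence above reads as "run in order, stop at the first failure". A minimal sketch of that loop follows; the `Runner` type, `ValidationReport`, and `runInOrder` are hypothetical names, and the executor is injected so the example itself spawns no processes.

```typescript
// Commands copied from the recommended validation sequence;
// `<port>` is left as a placeholder for the caller to fill in.
const validationSteps: string[] = [
  'npm run refresh',
  'npm run typecheck',
  'npm run lint',
  'npm run test',
  'npm run test:integration',
  'npx proteum check',
  'npx proteum doctor --contracts --strict',
  'npx proteum e2e --port <port>',
];

// Hypothetical injected executor: returns the exit code of one command.
type Runner = (command: string) => number;

interface ValidationReport {
  passed: string[];
  failed: string | null; // first failing command, or null when all passed
}

// Run each step in order and stop at the first non-zero exit code.
function runInOrder(steps: readonly string[], run: Runner): ValidationReport {
  const passed: string[] = [];
  for (const step of steps) {
    if (run(step) !== 0) return { passed, failed: step };
    passed.push(step);
  }
  return { passed, failed: null };
}

// Fake runner for illustration: pretend the lint step fails.
const report = runInOrder(validationSteps, (cmd) => (cmd === 'npm run lint' ? 1 : 0));
```

In a real script the runner would typically wrap a process spawn and return its exit status; stopping early keeps the slower E2E step from running against code that already fails typecheck or lint.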

package/agents/project/diagnostics.md CHANGED

@@ -4,7 +4,10 @@ This file is the canonical source of truth for diagnostics, temporary instrument
 ## Initial Triage
-- Start with machine-readable app state before reading large parts of the codebase: `npx proteum orient <query>`, `./.proteum/manifest.json`, `npx proteum connect --json`, `npx proteum explain --json`, `npx proteum doctor --json`, and `npx proteum doctor --contracts --json` when generated artifacts or manifest-owned files may be stale.
+- Start with compact machine-readable app state before reading large parts of the codebase: `npx proteum orient <query>`, `npx proteum runtime status`, `npx proteum connect`, `npx proteum explain`, `npx proteum doctor`, and `npx proteum doctor --contracts` when generated artifacts or manifest-owned files may be stale.
+- When a Proteum MCP client is available, prefer MCP `runtime_status`, `orient`, `instructions_resolve`, `diagnose`, `trace_latest`, `perf_top`, `perf_request`, and `logs_tail` for repeated reads of the same app/runtime state. Use compact CLI commands when you need a reproducible shell command, validation step, or CI-like output.
+- MCP payloads are compact `proteum-mcp-v1` JSON and are capped/paginated by default. Do not request full detail until compact output identifies the missing data.
+- Use full-detail escape hatches only after compact output identifies the missing detail: `npx proteum explain --manifest`, `npx proteum diagnose <target> --full`, `npx proteum trace show <requestId> --events`, or `npx proteum perf request <requestId> --full`.
 - When the user pastes raw errors, reproduce locally before listing possible causes: identify the likely app, route, command, or request target from the error, boot or reuse the relevant dev server with the elevated-permissions workflow below, replay the failing surface once, and base the probability/why/how-to-fix list on local server output, browser console output, diagnostics, traces, or the smallest relevant command result. If there is not enough information to reproduce, state the missing context and ground the cause list in the local evidence that is available.
 - When one app depends on another app's generated controllers, inspect `npx proteum connect --controllers`, `npx proteum explain --connected --controllers`, the producer `proteum.connected.json`, the consumer `proteum.config.ts` connected `source` value, and the producer `./.proteum/proteum.connected.d.ts` before assuming the contract is local.
 - Use `rg -n` first to narrow the exact code path, then read only the relevant files.
@@ -16,6 +19,7 @@ This file is the canonical source of truth for diagnostics, temporary instrument
 - For long-lived dev reproductions, always request elevated permissions and run `npx proteum dev` outside the sandbox. Use an explicit task/thread-scoped session file, inspect `npx proteum dev list --json` plus current listeners first, for example with `lsof -nP -iTCP -sTCP:LISTEN`, then choose a port that is not currently used before starting `npx proteum dev --session-file <path> --port <port>`. After the server is ready, print the live server URL as a clickable Markdown link.
 - Use `--replace-existing` only when restarting the exact session file started by the current thread/task. Never replace another live session that belongs to a user, another thread, or an unknown owner.
 - Only the bare `npx proteum build` and bare `npx proteum dev` commands print the welcome banner and active Proteum installation method. Any extra argument or option skips the banner. Only `npx proteum dev` clears the interactive terminal before rendering and reports connected app names plus successful connected `/ping` checks in the ready banner; keep that in mind when capturing or comparing command logs during diagnosis. Every `npx proteum dev` start ensures tracked instruction files contain the current managed `# Proteum Instructions` section before the dev loop begins.
+- During `npx proteum dev`, the running app exposes the read-only Proteum MCP transport at `/__proteum/mcp`. Use it for runtime-adjacent agent reads instead of repeatedly spawning equivalent CLI diagnostics.
 - For ownership or repo discovery questions, start with `npx proteum orient <query>` instead of jumping straight into source searches.
 - For request-time issues in dev, start with `npx proteum diagnose <path> --port <port>` when you have a concrete failing route, page, controller path, or request target. It combines owner lookup, manifest diagnostics, contract diagnostics, matching trace data, and buffered server logs in one pass.
 - Prefer focused verification before global checks: `npx proteum verify owner <query>`, `npx proteum verify request <path>`, and only then browser MCP validation when the bug is browser-visible. Use `npx proteum e2e --port <port> ...` only when automated end-to-end coverage or a Playwright suite is required.
@@ -26,7 +30,7 @@ This file is the canonical source of truth for diagnostics, temporary instrument
 - For bundle-size inspection, use `npx proteum build --prod --analyze` to emit `bin/bundle-analysis/client.html` and `client-stats.json`, or add `--analyze-serve --analyze-port auto` when you want a local analyzer URL instead of a static HTML file.
 - For request-time issues in dev, inspect traces before adding logs when the diagnose surface is still too coarse.
 - If a server is already running on the default port from `PORT` or `./.proteum/manifest.json`, inspect existing traces before reproducing the issue.
-- If existing traces are insufficient, arm `npx proteum trace arm --capture deep`, reproduce once, then inspect the new request with `npx proteum trace latest` or `npx proteum trace show <requestId>`.
+- If existing traces are insufficient, arm `npx proteum trace arm --capture deep`, reproduce once, then inspect the new request with compact `npx proteum trace latest`; use `npx proteum trace show <requestId> --events` only when raw event detail is still required.
 - Use the browser MCP to inspect browser console errors and warnings for frontend, SSR, hydration, and controller-call issues.
 - Inspect server startup and runtime errors.
 - For protected browser or API flows in dev, prefer `npx proteum session <email> --role <role>` over driving the login UI, then use that session for browser MCP validation. Use `npx proteum e2e --session-email <email> --session-role <role>` only when Playwright end-to-end suites need the auth token through the child process environment. Use the login UI only when auth UX itself is under test.
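
The MCP tool names listed in the Initial Triage hunk above would be invoked through the Model Context Protocol's JSON-RPC `tools/call` method. The sketch below only builds the request body. The `buildToolCall` helper and the `path` argument name are assumptions, and the excerpt does not show how `/__proteum/mcp` expects the request to be framed or sent.

```typescript
// Tool names as listed in the diagnostics.md hunk.
type ProteumMcpTool =
  | 'runtime_status'
  | 'orient'
  | 'instructions_resolve'
  | 'diagnose'
  | 'trace_latest'
  | 'perf_top'
  | 'perf_request'
  | 'logs_tail';

// Standard MCP JSON-RPC 2.0 envelope for calling a tool.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: ProteumMcpTool; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: assigns an incrementing request id per call.
function buildToolCall(name: ProteumMcpTool, args: Record<string, unknown> = {}): ToolCallRequest {
  return { jsonrpc: '2.0', id: nextId++, method: 'tools/call', params: { name, arguments: args } };
}

// The `path` argument name is an assumption; the tool's input schema is not shown here.
const request = buildToolCall('diagnose', { path: '/dashboard' });
const body = JSON.stringify(request);
```

An MCP client library would normally build this envelope itself; the sketch only makes visible what one repeated read amounts to on the wire.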

package/agents/project/optimizations.md CHANGED

@@ -19,6 +19,7 @@ When tradeoffs exist inside optimization work, optimize in this order:
 - Prefer established, flexible, well-typed, widely adopted, actively maintained packages.
 - Build custom or keep custom infrastructure only when packages would clearly hurt bundle size, SSR behavior, performance, typing quality, flexibility, licensing, explicit contracts, or long-term maintainability.
 - If you choose custom over a package, state briefly why.
+- For agent-facing repeated diagnostics, prefer the read-only Proteum MCP surface over adding broader CLI output. MCP should expose compact single-line `proteum-mcp-v1` JSON with capped, typed, paginated reads; the CLI should stay compact and reproducible.
 ## SSR And Page Size

package/agents/project/root/AGENTS.md CHANGED

@@ -14,6 +14,9 @@ Coding style source of truth: root-level `CODING_STYLE.md`.
 - If the user pastes raw errors without asking for a fix, do not implement changes yet. First run the task-safe local reproduction path: identify the likely app, route, command, or request from the error, boot or reuse the relevant dev server with the elevated-permissions workflow in `Task Lifecycle`, reproduce the failing surface locally, and inspect server output, browser console output, diagnostics, traces, or the smallest relevant command result. If the error does not identify enough context to reproduce, say what is missing and use the available local evidence before guessing. Then list likely causes and, for each one, give probability, why, and how to fix it.
 - If the user asks to implement a feature, first inspect the relevant existing surface and state any implementation problem, pain point, attention point, or question you see. If a concern is blocking, or it can materially change product behavior, API shape, architecture, data model, cost, privacy, security, or UX, ask before editing; otherwise state the assumption and continue implementing.
 - If the task is ambiguous, generated, connected, or multi-repo, start with `npx proteum orient <query>` before reading large parts of the codebase.
+- Treat Proteum CLI and MCP output as the workflow router. Read only the instruction files returned in `orient` `mustRead` or MCP `instructions_resolve`, plus conditional docs that match the current task. Do not read broad instruction folders or every managed instruction file up front.
+- When a Proteum MCP client is available, prefer read-only MCP tools for repeated runtime/status/orientation/trace/perf/log reads. Use CLI commands when you need reproducible terminal validation, dev/build/check workflows, or output to share with a human.
+- MCP payloads are compact single-line `proteum-mcp-v1` JSON with capped and paginated detail. Do not expand MCP output for human readability.
 - If the user reports an issue, or the agent encounters one during exploration, implementation, verification, or runtime reproduction, load and follow root-level `diagnostics.md`.
 - If the task touches client-side files, especially `client/**` and page files, load and apply root-level `optimizations.md` only after implementation for post-implementation checking and optimization. Skip it at task start and skip it for server-only, test-only, doc-only, and non-client refactor tasks unless the user explicitly asks for optimization work.
 - If the task needs new app or artifact boilerplate, prefer `npx proteum init ...` and `npx proteum create ...` before creating files by hand. Use `--dry-run --json` when an agent needs a machine-readable plan before writing files.
@@ -47,6 +50,7 @@ Coding style source of truth: root-level `CODING_STYLE.md`.
 - When starting a long-lived dev server for an agent task, always request elevated permissions and run `npx proteum dev` outside the sandbox. Use an explicit task/thread-scoped session file such as `var/run/proteum/dev/agents/<task>.json`, inspect `npx proteum dev list --json` plus current listeners first, for example with `lsof -nP -iTCP -sTCP:LISTEN`, then choose a port that is not currently used before starting `npx proteum dev --session-file <path> --port <port>`. After the server is ready, print the live server URL as a clickable Markdown link.
 - Use `--replace-existing` only when restarting the exact session file started by the current thread/task. Never replace another live session that belongs to a user, another thread, or an unknown owner.
 - If the current app depends on local `file:` connected projects, boot every connected producer app too, each with its own task-scoped session file and free port, and run every one of those `proteum dev` processes with elevated permissions outside the sandbox before starting or verifying the consumer app.
+- During `npx proteum dev`, the app exposes the read-only Proteum MCP runtime endpoint at `/__proteum/mcp`; use it for repeated agent reads instead of spawning equivalent diagnostics commands.
 - For browser validation, use the browser MCP against the running app. Keep Playwright inside `npx proteum e2e --port <port>` for targeted/full end-to-end suites. Bootstrap protected browser MCP state with `npx proteum session`; bootstrap protected E2E runs with `npx proteum e2e --session-email <email> --session-role <role>`.
 - Current CLI banner contract: only the bare `proteum build` and bare `proteum dev` commands print the welcome banner and include the active Proteum installation method. Any extra argument or option skips the banner. Only `proteum dev` clears the interactive terminal before rendering, exposes `CTRL+R` reload plus `CTRL+C` shutdown hotkeys in its session UI, and reports connected app names plus successful connected `/ping` checks in the ready banner. Every `proteum dev` start ensures tracked instruction files contain the current managed `# Proteum Instructions` section before the dev loop begins.
@@ -262,19 +266,24 @@ Project code should consume:
 Prefer structured CLI surfaces over re-deriving framework facts from source:
-- `npx proteum connect --json`
+- `npx proteum connect`
 - `npx proteum connect --controllers --strict`
 - `npx proteum orient <query>`
-- `npx proteum explain --json`
+- `npx proteum runtime status`
+- `npx proteum mcp`
+- `npx proteum mcp --url http://localhost:<port>`
+- `npx proteum explain`
+- `npx proteum explain --manifest`
 - `npx proteum explain --connected --controllers`
 - `npx proteum explain owner <query>`
-- `npx proteum doctor --json`
-- `npx proteum doctor --contracts --json`
+- `npx proteum doctor`
+- `npx proteum doctor --contracts`
 - `npx proteum diagnose <path> --port <port>`
 - `npx proteum verify owner <query>`
 - `npx proteum verify request <path>`
 - `npx proteum perf ...`
-- `npx proteum trace ...`
+- `npx proteum trace latest`
+- `npx proteum trace show <requestId> --events`
 - `npx proteum command ...`
 - `npx proteum session ...`
 - `npx proteum create ... --dry-run --json`
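
Reviewer sketch (not part of the package diff): the MCP guidance added in this hunk asks for compact single-line `proteum-mcp-v1` JSON with capped and paginated detail. The TypeScript below is a minimal illustration of that idea only. The envelope fields (`format`, `items`, `total`, `nextCursor`), the `paginate` and `toSingleLine` helpers, and the cap of 20 are assumptions made for this example; the real payload shape lives in `common/dev/mcpPayloads.ts`, which this part of the diff does not show.

```ts
// Hypothetical envelope: the field names are assumptions for illustration,
// not the actual proteum-mcp-v1 schema.
type CompactPage<T> = {
    format: 'proteum-mcp-v1';
    items: T[];
    total: number;
    nextCursor: number | null;
};

// Assumed hard cap on how many entries a single read may return.
const MAX_PAGE_SIZE = 20;

// Return one capped page starting at `cursor`, plus the cursor of the next page.
const paginate = <T>(all: T[], cursor = 0, limit = 10): CompactPage<T> => {
    const size = Math.min(Math.max(limit, 1), MAX_PAGE_SIZE);
    const start = Math.min(Math.max(cursor, 0), all.length);
    const end = Math.min(start + size, all.length);

    return {
        format: 'proteum-mcp-v1',
        items: all.slice(start, end),
        total: all.length,
        nextCursor: end < all.length ? end : null,
    };
};

// JSON.stringify without an indent argument emits a single line.
const toSingleLine = (payload: unknown): string => JSON.stringify(payload);

// Usage: 25 fake trace summaries read in pages of at most 10.
const traces = Array.from({ length: 25 }, (_, index) => ({ requestId: `req-${index}` }));
const firstPage = paginate(traces, 0, 10);
const lastPage = paginate(traces, 20, 50); // limit is clamped to MAX_PAGE_SIZE

console.log(toSingleLine(firstPage));
```

A client following this pattern would keep requesting pages until `nextCursor` is `null`, which keeps each individual read small.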

package/agents/project/tests/AGENTS.md CHANGED

@@ -23,3 +23,9 @@ Diagnostics source of truth: root-level `diagnostics.md`.
 - Keep end-to-end tests clean, well organized, and non-redundant. Prefer extending or reshaping the most relevant existing scenario over duplicating coverage, and remove or consolidate overlap when the suite becomes repetitive.
 - Reuse root catalog files from `/client/catalogs/**`, `/server/catalogs/**`, or `/common/catalogs/**` instead of duplicating catalog constants in tests.
 - For protected dev flows, prefer `npx proteum e2e --session-email <email> --session-role <role>` or `npx proteum session <email> --role <role>` over automating login unless the login flow itself is under test.
+### Real-World Journey E2E
+End-to-end tests must follow the workflow in `tests/e2e/REAL_WORLD_JOURNEY_TESTS.md`.
+Use E2E journeys for complete role-based flows, permissions, state transitions, derived metrics, filters, detail/edit surfaces, and cross-role visibility. Avoid replacing those journeys with isolated happy-path smoke checks when the feature has real workflow depth.

package/agents/project/tests/e2e/AGENTS.md ADDED

@@ -0,0 +1,13 @@
+# E2E Journey Test Instructions
+Follow `REAL_WORLD_JOURNEY_TESTS.md` in this directory when designing, implementing, extending, or reviewing end-to-end tests.
+Required structure for substantial E2E work:
+- `tests/e2e/journeys/`: real-world journey specs grouped by workflow or domain.
+- `tests/e2e/pages/`: page objects or screen helpers that keep selectors stable.
+- `tests/e2e/workflows/`: reusable multi-step role and business workflows.
+- `tests/e2e/factories/`: deterministic test data builders and fixtures.
+- `tests/e2e/utils/`: login, navigation, assertions, network, and cleanup helpers.
+Do not reduce E2E coverage to one-screen smoke tests when the feature involves multiple roles, permissions, workflow states, derived KPIs, or cross-view consistency.

package/agents/project/tests/e2e/REAL_WORLD_JOURNEY_TESTS.md ADDED

@@ -0,0 +1,192 @@
+# Write Real World Journey Tests
+Source: `/Users/gaetan/.codex/skills/write-real-world-journey-tests/SKILL.md`
+## Goal
+Create journey tests that read like an executable product spec. Prefer a realistic sequence of actions by real roles over a collection of disconnected UI checks.
+The journey should cover the requested feature area broadly, but every step must be justified by normal user behavior and by state created earlier in the journey.
+## Workflow
+1. Inspect the feature surface before designing the test.
+   - Read existing journey tests, page objects, factories, workflows, fixtures, selectors, and test utilities.
+   - Identify local conventions for auth, data creation, navigation, assertions, cleanup, retries, and timeouts.
+   - Reuse existing helpers before adding new ones.
+   - Preserve the repository's test-file organization instead of putting every helper inside the journey spec.
+2. Follow the local E2E file organization.
+   - Put complete user journeys in `tests/e2e/journeys/*.spec.ts` or the repository's equivalent journey/spec directory.
+   - Put reusable screen abstractions in `tests/e2e/pages/**`, grouped by product area; use focused page objects for pages, modals, filter bars, pickers, and detail panels.
+   - Put reusable multi-step business flows in `tests/e2e/workflows/**`, such as login, signup via invite, create entity with invite link, or other flows shared by several journeys.
+   - Put generated domain data in `tests/e2e/factories/**`, with separate minimum, maximum, and edited payload factories when the journey needs create/edit coverage.
+   - Put shared technical/domain helpers in `tests/e2e/utils/**`, such as context creation, constants, API response parsing, date formatting, KPI expectations, number/currency parsers, display-name builders, and journey tree logging.
+   - Add a helper to the nearest existing page/workflow/factory/utils module when it is reusable; keep it local to the spec only when it is specific to that one journey's memory or assertions.
+3. Map the real-world story.
+   - List the roles involved, their permissions, and why each role appears in the scenario.
+   - Identify the entities that move through the journey: account, team, organization, customer, order, lead, report, subscription, etc.
+   - Define the natural order: setup, invite/signup/login, empty state, creation, assignment, action by another role, management, reporting, audit/review.
+   - Prefer one coherent story over feature stuffing. Cover many features only when they belong to the same user journey.
+4. Make the report-visible hierarchy explicit.
+   - Group journeys with `describe` blocks or the framework equivalent by business phase, such as acquisition, activation, setup, core workflow, limits, billing, and admin review.
+   - Split each journey test into named substeps with `test.step(...)`, Cypress command logs, or the local framework equivalent so CI reports and traces show what business phase failed.
+   - Name steps from the user or business perspective, such as `Launch plan blocks the third filter`, not from implementation details, such as `Click button`.
+   - Use nested substeps for repeated matrices like roles, plans, permissions, limits, or locales, with one visible step per case and smaller substeps for settings, allowed actions, and blocked actions.
+   - Preserve local file organization and helper conventions; hierarchy should improve readability without moving unrelated code or hiding assertions.
+5. Use serial state only for real dependency chains.
+   - Use a serial describe block when later tests intentionally depend on state created by earlier actors.
+   - Split the journey into tests by actor/session or meaningful product phase.
+   - Use fresh browser contexts or sessions for different roles.
+   - Keep setup assertions at the top of each dependent test so failures explain the missing prerequisite.
+6. Keep journey memory as the source of truth.
+   - Store created IDs, invite links, generated users, current payloads, assignment state, statuses, dates, and monetary values in typed in-memory objects.
+   - Update memory immediately after successful UI/API responses.
+   - Compute expectations from memory instead of duplicating constants in assertions.
+   - Track both `createdPayload` and `currentPayload` when edits are part of the journey.
+7. Assert the product contract at each phase.
+   - Empty states and initial KPIs for new users.
+   - Role-specific navigation, tabs, menu labels, and restricted access.
+   - Created rows/cards/details match generated input.
+   - Default ownership, assignment, channel/source, payment terms, status, or permission behavior.
+   - Cross-role visibility after another role acts.
+   - Details modals, edit flows, notes/activity timelines, status changes, derived fields, filters, date ranges, and aggregate KPIs.
+   - Final manager/admin review across the whole organization or feature scope.
+8. Make derived assertions resilient.
+   - Use polling for eventually consistent UI totals, KPI cards, async table updates, revenue/pipeline calculations, closing rates, and status text.
+   - Parse formatted numbers, currency, and percentages with shared utilities.
+   - Use deterministic date helpers for UI date formatting and date-range assertions.
+   - Wait for important create/update responses when IDs or persistence are needed.
+9. Prefer expressive helpers.
+   - Introduce small helpers for repeated domain assertions, such as row basics, row KPI checks, snapshot comparison, filter setup, and derived totals.
+   - Keep helper names product-facing, not implementation-facing.
+   - Use page objects and workflow helpers to keep the test readable at the story level.
+10. Log or snapshot the journey shape when useful.
+   - For long multi-role journeys, optionally keep a small tree or timeline logger showing created roles and entities.
+   - Use it for debugging and readability, not as a substitute for assertions.
+## Implementation Pattern
+Use this shape as a guide, adapting to the repository's test framework:
+```ts
+test.describe.serial('Domain - Feature journey', () => {
+  test.describe.configure({ timeout: 300_000 });
+  let organization = createEmptyOrganizationMemory();
+  let manager = createEmptyAccountMemory();
+  let operator = createEmptyAccountMemory();
+  let externalPartner = createEmptyPartnerMemory();
+  let itemA = createItemMemory(createMinimumPayload(), { type: 'operator' });
+  let itemB = createItemMemory(createMaximumPayload(), { type: 'external-partner' });
+  const getCreatedItems = () => [itemA, itemB].filter((item) => item.id);
+  const buildKpis = () => ({
+    'Total Items': getCreatedItems().length,
+    'Completed': getCreatedItems().filter((item) => item.status === 'Completed').length,
+  });
+  const matrixCases = [{ name: 'Manager' }, { name: 'Operator' }];
+  test.describe('Setup and activation', () => {
+    test('Admin or owner creates the parent scope', async ({ browser }) => {
+      await test.step('Create the organization through the normal product path', async () => {
+        // Store IDs, invite links, and display data in memory.
+      });
+    });
+    test('Manager signs up and prepares the working structure', async ({ browser }) => {
+      await test.step('Assert restricted navigation and empty state', async () => {});
+      await test.step('Create the team and invite the next role', async () => {});
+    });
+  });
+  test.describe('Core workflow', () => {
+    test('Primary role performs the core workflow and delegates work', async ({ browser }) => {
+      await test.step('Create a minimum item', async () => {});
+      await test.step('Assert default ownership and visible table state', async () => {});
+      await test.step('Delegate the item to another role', async () => {});
+    });
+    test('Secondary role acts on assigned state', async ({ browser }) => {
+      await test.step('Assert this role only sees assigned work', async () => {});
+      await test.step('Create or update a maximum-data item', async () => {});
+      await test.step('Assert role-specific defaults and KPIs', async () => {});
+    });
+    test('Primary role manages resulting activity', async ({ browser }) => {
+      await test.step('Open details and validate data from the secondary role', async () => {});
+      await test.step('Add notes, edit fields, and change status', async () => {});
+      await test.step('Assert recalculated derived values', async () => {});
+    });
+  });
+  test.describe('Review and reporting', () => {
+    test('Manager or auditor reviews aggregate state', async ({ browser }) => {
+      await test.step('Assert cross-role visibility and filters', async () => {});
+      await test.step('Assert aggregate KPIs and final business outcomes', async () => {});
+    });
+    test('Plan, role, or permission matrix exposes the correct behavior', async ({ browser }) => {
+      for (const caseItem of matrixCases) {
+        await test.step(`${caseItem.name} shows the right settings and limits`, async () => {
+          await test.step(`${caseItem.name} settings match contract`, async () => {});
+          await test.step(`${caseItem.name} allowed actions succeed`, async () => {});
+          await test.step(`${caseItem.name} blocked actions fail with the expected reason`, async () => {});
+        });
+      }
+    });
+  });
+});
+```
+## Coverage Heuristics
+Include a feature when it naturally belongs to the journey and creates a durable assertion later. Strong candidates:
+- Invite/signup/login boundaries.
+- Permission differences between roles.
+- Empty state to populated state.
+- Minimum-data and maximum-data creation paths.
+- Assignment, reassignment, ownership, membership, or sharing.
+- Details view, edit view, notes, activity history, and status changes.
+- Derived totals, dashboards, reporting cards, and tables.
+- Filters, search, date ranges, sorting, and snapshots.
+- Cross-role consistency: what one actor creates, another actor sees or manages.
+Exclude or move to narrower tests:
+- Pure component styling.
+- Exhaustive validation errors.
+- Every filter permutation.
+- Rare edge cases that interrupt the main story.
+- Admin-only setup that is unrelated to the requested feature.
+## Quality Bar
+- The scenario should be explainable as a real customer workflow.
+- Each role should do work that role would actually do.
+- Assertions should prove business behavior, not only that buttons are clickable.
+- Later assertions should depend on earlier created state.
+- Names, IDs, and dates should be generated uniquely enough for shared test environments.
+- The test should fail near the broken product behavior, with step names that explain the business phase.
+- The test report should expose a clear hierarchy of phases, tests, and substeps so a reader can understand the journey without opening the source file.
+- Comments should clarify product intent or non-obvious timing behavior, not narrate obvious code.
+## Anti-Patterns
+- One huge test with no actor boundaries.
+- Flat journey tests with no report-visible phases or substeps.
+- Serial dependency without explicit prerequisite assertions.
+- Hardcoded duplicate KPI values that drift after edits.
+- Creating data directly in the database when the product flow under test is creation, signup, invite, assignment, or editing.
+- Covering unrelated features just to increase line count.
+- Hiding flakiness with arbitrary sleeps instead of waiting on UI state, network responses, or polling derived values.
+- Testing only the creator role when the feature's value appears through another role's visibility or management flow.
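
Reviewer sketch (not part of the package diff): the added guide asks journeys to keep typed in-memory state, to track both `createdPayload` and `currentPayload`, to compute expectations from memory, and to parse formatted numbers with shared utilities. The TypeScript below is one possible shape for those ideas. Every name in it (`ItemPayload`, `ItemMemory`, `newItemMemory`, `markCreated`, `applyEdit`, `parseFormattedNumber`, `buildKpisFromMemory`) is hypothetical, and the parser assumes US-style formatting with comma thousands separators.

```ts
// Hypothetical types and helpers, for illustration only.
type ItemPayload = { title: string; amount: number; status: 'Open' | 'Completed' };

type ItemMemory = {
    id?: string;
    createdPayload: ItemPayload; // what was originally submitted
    currentPayload: ItemPayload; // what the product should show now
};

const newItemMemory = (payload: ItemPayload): ItemMemory => ({
    createdPayload: { ...payload },
    currentPayload: { ...payload },
});

// Record the ID once the create response confirms persistence.
const markCreated = (item: ItemMemory, id: string): void => {
    item.id = id;
};

// Apply an edit to currentPayload only; createdPayload stays untouched.
const applyEdit = (item: ItemMemory, patch: Partial<ItemPayload>): void => {
    item.currentPayload = { ...item.currentPayload, ...patch };
};

// Parse UI text such as "$1,250.50" into a number (assumes comma thousands separators).
const parseFormattedNumber = (text: string): number => {
    const cleaned = text.replace(/[^0-9.-]/g, '');
    const value = Number(cleaned);
    if (cleaned === '' || Number.isNaN(value)) {
        throw new Error(`Cannot parse a number from "${text}"`);
    }
    return value;
};

// Derive expected KPI values from memory instead of hardcoding them.
const buildKpisFromMemory = (items: ItemMemory[]) => {
    const created = items.filter((item) => item.id !== undefined);
    return {
        totalItems: created.length,
        completed: created.filter((item) => item.currentPayload.status === 'Completed').length,
        totalAmount: created.reduce((sum, item) => sum + item.currentPayload.amount, 0),
    };
};

// Usage: two items are created, then one is completed by a later actor.
const itemA = newItemMemory({ title: 'A', amount: 1000, status: 'Open' });
const itemB = newItemMemory({ title: 'B', amount: 250.5, status: 'Open' });
markCreated(itemA, 'item-1');
markCreated(itemB, 'item-2');
applyEdit(itemB, { status: 'Completed' });

const kpis = buildKpisFromMemory([itemA, itemB]);
const displayedTotal = parseFormattedNumber('$1,250.50');
console.log(kpis, displayedTotal === kpis.totalAmount);
```

Because the expected totals are recomputed from `currentPayload`, an edit earlier in the journey changes the later KPI expectation automatically, which is what the guide means by avoiding hardcoded duplicate KPI values.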

package/cli/commands/connect.ts CHANGED

@@ -2,8 +2,9 @@ import cli from '..';
 import Compiler from '../compiler';
 import { readProteumManifest } from '../compiler/common/proteumManifest';
 import { buildConnectResponse, renderConnectHuman } from '@common/dev/connect';
+import { compactList, printAgentResponse, printJson, truncateForAgent } from '../utils/agentOutput';
-const allowedConnectArgs = new Set(['controllers', 'json', 'strict']);
+const allowedConnectArgs = new Set(['controllers', 'full', 'human', 'json', 'strict']);
 const validateConnectArgs = () => {
     const enabledArgs = Object.entries(cli.args)
@@ -19,6 +20,29 @@ const validateConnectArgs = () => {
     }
 };
+const compactDiagnostic = (diagnostic: ReturnType<typeof buildConnectResponse>['diagnostics'][number]) => ({
+    level: diagnostic.level,
+    code: diagnostic.code,
+    message: truncateForAgent(diagnostic.message),
+    filepath: diagnostic.filepath,
+    sourceLocation: diagnostic.sourceLocation,
+});
+const compactProject = (project: ReturnType<typeof buildConnectResponse>['projects'][number]) => ({
+    namespace: project.namespace,
+    identityIdentifier: project.identityIdentifier,
+    identityName: project.identityName,
+    sourceKind: project.sourceKind,
+    sourceConfigured: project.sourceConfigured,
+    urlInternalConfigured: project.urlInternalConfigured,
+    urlInternal: project.urlInternal,
+    controllerCount: project.controllerCount,
+    cachedContractExists: project.cachedContractExists,
+    cachedContractFilepath: project.cachedContractFilepath,
+    typingMode: project.typingMode,
+    controllers: project.controllers ? compactList(project.controllers, 8) : undefined,
+});
 export const run = async (): Promise<void> => {
     validateConnectArgs();
@@ -31,10 +55,22 @@ export const run = async (): Promise<void> => {
         strict: cli.args.strict === true,
     });
-    if (cli.args.json === true) {
-        console.log(JSON.stringify(response, null, 2));
-    } else {
+    if (cli.args.full === true) {
+        printJson(response);
+    } else if (cli.args.human === true) {
         console.log(renderConnectHuman(manifest, response));
+    } else {
+        printAgentResponse({
+            summary: `Connect: ${response.summary.connectedProjects} projects, ${response.summary.importedControllers} controllers, ${response.summary.errors} errors, ${response.summary.warnings} warnings`,
+            data: {
+                app: response.app,
+                summary: response.summary,
+                projects: response.projects.map(compactProject),
+                diagnostics: compactList(response.diagnostics, 10).map(compactDiagnostic),
+                totalDiagnostics: response.diagnostics.length,
+            },
+            fullDetailCommand: `proteum connect${cli.args.controllers === true ? ' --controllers' : ''} --full`,
+        });
     }
     if (cli.args.strict === true && response.diagnostics.length > 0) {