@amityco/social-plus-vise 0.8.1 → 0.11.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,149 @@
1
+ # Changelog
2
+
3
+ All notable changes to `@amityco/social-plus-vise` are documented in this file.
4
+
5
+ The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
+
7
+ ## 0.10.0 — 2026-05-29
8
+
9
+ **Theme:** Benchmark-driven sensor expansion. The Commune benchmark (9 slices spanning new SDK domains, including chat, push, social graph, moderation, and comments) produced the first measured, defensible advantage for vise+skill over pure MCP: **7/9 working features vs 3/9** with the same agent on the same prompts. This release ships the sensors, rules, and findings.json improvements that produced that result.
10
+
11
+ ### Added
12
+ - `react-native.chat.channel-type-dm` / `typescript.chat.channel-type-dm` (`warning`) — DM channels must use `type: 'conversation'`, not `type: 'community'`. Agents consistently choose `community` for 1-to-1 chats because it sounds plausible but silently creates a group channel with the wrong shape. Sensor requires `userIds` co-occurrence to avoid firing on legitimate community broadcasts.
13
+ - `react-native.follow.status-subscription` (`warning`) — `getFollowStatus` must be wrapped in a live subscription, not a one-shot query. A one-shot call captures state at mount and never updates — follow/unfollow actions are not reflected in the UI until the user navigates away.
14
+ - `rationale` field in `sp-vise/findings.json` — agents see *why* each rule exists, not just *what* it requires. Improves attestation quality on rules that allow it.
15
+ - Compliance.json rule entries now include a `title` field (digest-stable, separate from hashing) so agents and humans can identify rules without grepping definitions.
16
+ - Corpus grew from **262 → 265 rules**.
17
+
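The `userIds` co-occurrence condition described for `channel-type-dm` above can be pictured with a small sketch. This is a hypothetical text-level check, not the sensor Vise ships: the function name `looksLikeMisTypedDm`, both regular expressions, and the `createChannel(...)` strings used as sample input are assumptions made purely for illustration.

```javascript
// Hypothetical illustration only; the real sensor is not shown in this diff.
// Fires when a snippet sets type 'community' AND mentions userIds, which
// suggests a 1-to-1 chat was intended rather than a community broadcast.
const COMMUNITY_TYPE = /\btype\s*:\s*['"]community['"]/;
const USER_IDS = /\buserIds\b/;

function looksLikeMisTypedDm(snippet) {
  return COMMUNITY_TYPE.test(snippet) && USER_IDS.test(snippet);
}

// Sample inputs (createChannel is a made-up call used only as text to scan).
const dmAttempt = "createChannel({ type: 'community', userIds: [me, peer] })";
const broadcast = "createChannel({ type: 'community', displayName: 'News' })";
const correctDm = "createChannel({ type: 'conversation', userIds: [me, peer] })";

console.log(looksLikeMisTypedDm(dmAttempt)); // true
console.log(looksLikeMisTypedDm(broadcast)); // false
console.log(looksLikeMisTypedDm(correctDm)); // false
```

Requiring both signals is what keeps the sketch quiet on the broadcast case, mirroring the stated goal of not firing on legitimate community channels.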
18
+ ### Changed
19
+ - **`vise init` now writes `sp-vise/findings.json` immediately** — agents see current rule violations on startup with no exploration needed. Combined with the `npm run sp-check` script added to scaffolded workspaces, agents follow a directed (read findings → fix → verify) loop instead of an exploratory (search → search → search → implement) loop.
20
+ - **`live-collection.api-mismatch`, `posts.activity-tag-filter`, `posts.reaction-stale-post-ref`, `user.ban-state-respected`** — all now skip `.d.ts` files to eliminate false positives from type stubs.
21
+ - **`user.ban-state-respected`** — `flagComment` and `flagPost` added to the recognised write-pattern list. Flagging is a moderation action and must be ban-guarded.
22
+ - **`react-native.push.unregister.present`** — recommendation generalised; no longer references benchmark-specific state variables. Surfaces the exact `useEffect` cleanup pattern needed.
23
+ - Reactive markers now include `.on('dataUpdated', ...)` — the event-emitter style of subscribing to LiveCollection updates is now recognised as a valid alternative to property-callback subscription.
24
+ - README updated with a step-by-step Quick Start that references `findings.json` directly.
25
+
26
+ ### Benchmark infrastructure (`benchmarks/`)
27
+ - **Commune benchmark** added — 9-slice React Native scenario (CM-SETUP, CM-PRESENCE, CM-FEED, CM-EVENTS, CM-CHAT, CM-PUSH, CM-PROFILE, CM-MODERATE, CM-COMMENTS) covering chat, push, social graph, and moderation domains absent from TouchTunes. Three seed types per slice (`baseline`, `broken`, `greenfield`) for 27 fixture sets total.
28
+ - **Rules-as-markdown control arm** (`benchmarks/commune/run-commune-rules-arm.sh`) — injects the rule corpus as a static document into the agent prompt. Built to isolate whether vise's measured advantage comes from *information delivery* (the rules) or the *iterative verification loop* (sp-check).
29
+ - **TouchTunes runner improvements** — workspace isolation (`workspaces/broken/` vs `workspaces/baseline/` so agents can't peek at the answer), `< /dev/null` stdin redirect fix that was causing agy/codex to silently skip cells, `|| true` per-cell error isolation, and grader auto-attestation for no-file and `.d.ts`-pointing rules.
30
+ - **agy + codex runners** (`run-agy-cells.sh`, `run-codex-cells.sh`) — production-quality scripts with TTY-detection fixes and workspace isolation.
31
+
32
+ ### Findings & reports
33
+ - `benchmarks/FINDINGS.html` — engineering-facing summary of the benchmark methodology, results, and what was/wasn't proven.
34
+ - `benchmarks/MARKETING.html` — three-tier marketing-claim framework (safe / concrete / honest / aspirational) with supporting wallclock data and a list of metrics to instrument next.
35
+
36
+ ### Honest claim
37
+ On 9 new SDK domain implementations with codex gpt-5.4, vise+skill produced 7 working features vs 3 for pure MCP — same agent, same prompts. The cost: +28% wallclock per session. The net: −52% wallclock per *working* feature, because more features ship on the first try. Vise consistently catches five bug classes that capable models otherwise miss: wrong DM channel type, missing push register/unregister lifecycle, one-shot queries where live subscriptions are required, missing ban checks before write operations, and missing flag affordances on user-generated content.
38
+
39
+ ---
40
+
41
+ ## 0.9.0 — 2026-05-27
42
+
43
+ **Theme:** Business model-grounded gap analysis; Next.js / SSR guard; environment hygiene expanded to all platforms.
44
+
45
+ ### Added
46
+ - `typescript.client.no-ssr-init` (`error`) — SDK client must not be initialized in a Next.js Server Component, `layout.tsx` without `'use client'`, or inside `getServerSideProps`/`getStaticProps`. The primary demo-invisible failure mode for AI-native Next.js customers: `next dev` recovers from the error gracefully; `next build` + production does not.
47
+ - `react-native.secret.env-gitignore` — React Native env files containing secret-shaped keys must be excluded by `.gitignore`.
48
+ - `react-native.secret.env-example` — A `.env.example` or `.env.sample` must accompany any gitignored React Native env file.
49
+ - `flutter.secret.env-gitignore` — Flutter `.env` or `secrets.dart` files containing secret-shaped keys must be excluded by `.gitignore`.
50
+ - `android.secret.env-gitignore` — `local.properties` containing secret-shaped keys must be excluded by `.gitignore`.
51
+ - `ios.secret.env-gitignore` — `Secrets.plist` or `*.xcconfig` files containing secret-shaped keys must be excluded by `.gitignore`.
52
+ - Corpus grew from **256 → 262 rules**.
53
+ - `benchmarks/SDK_INTEGRATION_GAP_ANALYSIS.md` — business model-grounded gap analysis mapping every SDK-relevant value claim to Vise rule coverage, with a prioritised improvement backlog.
54
+
55
+ ### Changed
56
+ - **Skill — "Stop Instead Of Guessing":** intake list now asks about Next.js rendering mode (Server Component vs `'use client'` vs Pages Router) before implementing SDK initialization.
57
+ - **Skill — "Session Renewal":** new feedforward: SDK collection queries must not fire before `login()` completes; gate collection setup behind the session-active signal.
58
+ - **Skill — "Live Collection API Mismatch":** new guidance: handle connection-state changes and render a reconnecting indicator when the WebSocket drops.
59
+ - **Skill — "Debugging & Troubleshooting":** compact `--brief` flag documented; `repairBrief` output described.
60
+
61
+ ---
62
+
63
+ ## 0.7.0 — 2026-05-23
64
+
65
+ **Theme:** SDK-specific rule corpus expansion + measured cross-tool benchmark.
66
+
67
+ ### Added
68
+ - 17 new SDK-specific rule families across 5 platforms = **85 new compliance rules** (corpus grew from 167 → 252):
69
+ - **Tier 1 — Silent-failure traps:** `session-handler.retained`, `live-collection.api-mismatch`, `posts.status-filter-applied`, `pagination.cursor-opaque`, `posts.parent-child-rendered`
70
+ - **Tier 2 — Wrong-target / silent misroute:** `feed.target-type-explicit`, `comment.reference-type-enum`, `channel.type-matches-shape`
71
+ - **Tier 3 — Moderator-only data leaking to user UI:** `moderation.role-gated-action`, `flag-count.not-leaked-to-non-mods`, `user.ban-state-respected`
72
+ - **Tier 4 — Notifications & unread state:** `notifications.amity-preferences-configured`, `unread.subscribed-not-counted`
73
+ - **Tier 5 — Custom config & types:** `reactions.configured-name-used`, `custom-post-type.dataType-declared`
74
+ - **Tier 6 — File upload & media:** `file-upload.via-amity-file-client`, `image-post.child-resolution-awaited`
75
+ - Multi-outcome measured benchmark (chat / comments / push on React + Flutter) with cross-tool validation (Antigravity / Gemini 3.5 Flash). See `benchmarks/RESULTS.md`.
76
+ - Fixture-foundation gates: `run-happy-path-clean.mjs` (every canonical happy-path must fire zero rules) and `run-fixture-symmetry.mjs` (every rule's positive fixture must not fire the rule).
77
+ - Dedicated React Native canonical happy-path fixture (previously shared with TypeScript).
78
+ - New CI exit code `4` for `contract-drift` (rules in `sp-vise/compliance.json` no longer match current ruleset).
79
+
80
+ ### Fixed
81
+ - `*.secret.inline-api-key` now catches env-fallback literal leaks: `String.fromEnvironment(..., defaultValue: 'literal')` (Dart), `process.env.X ?? 'literal'` (JS/TS), ternary fallback. Previously these forms slipped past the regex because the literal wasn't directly assigned to `apiKey`.
82
+ - Four web/Flutter rule false-positives on idiomatic guarded code: `typescript.client.region` now accepts env-sourced and positional region declarations; `*.network.error-handling-present` recognizes React error-state idiom; `flutter.design.reuse-detected-tokens` credits `Theme.of(context)` reuse.
83
+ - Pre-existing CLI version assertion in `test/run-cli.mjs` (was pinned to `0.4.0`).
84
+
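To make the env-fallback leak forms in the `*.secret.inline-api-key` fix concrete, here is a minimal sketch for the JS/TS variants only. The two regular expressions and the helper `hasEnvFallbackLiteral` are assumptions written for this note; they are not the patterns Vise uses, and the environment variable name `SP_API_KEY` is invented.

```javascript
// Hypothetical patterns for illustration; not the regexes Vise ships.
// Form 1: process.env.X ?? 'literal'  or  process.env.X || 'literal'
const NULLISH_OR_FALLBACK =
  /process\.env\.[A-Za-z0-9_]+\s*(?:\?\?|\|\|)\s*(['"`])[^'"`]+\1/;
// Form 2: process.env.X ? process.env.X : 'literal'
const TERNARY_FALLBACK =
  /process\.env\.[A-Za-z0-9_]+\s*\?(?!\?)[^:?]+:\s*(['"`])[^'"`]+\1/;

function hasEnvFallbackLiteral(source) {
  return NULLISH_OR_FALLBACK.test(source) || TERNARY_FALLBACK.test(source);
}

console.log(hasEnvFallbackLiteral("const apiKey = process.env.SP_API_KEY ?? 'abc123';")); // true
console.log(hasEnvFallbackLiteral(
  "const apiKey = process.env.SP_API_KEY ? process.env.SP_API_KEY : 'abc123';"
)); // true
console.log(hasEnvFallbackLiteral("const apiKey = process.env.SP_API_KEY;")); // false
```

In each flagged case the literal is not assigned directly to `apiKey`; it only appears as a fallback, which matches the changelog's explanation of why these forms previously slipped through.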
85
+ ### Changed
86
+ - Project structure flattened — `packages/foundry/` layer removed; npm package now publishes from the repo root.
87
+ - README consolidated from two files (brand + developer) into a single customer-facing canonical README; internal architecture moved to `docs/`.
88
+
89
+ ## 0.6.0 — 2026-05-22
90
+
91
+ **Theme:** v0.6 compliance expansion + 5-platform measured benchmark.
92
+
93
+ ### Added
94
+ - Corpus grew to 167 rules across 10 domains.
95
+ - Outcomes: `add-comments`, `add-moderation`, `add-chat`.
96
+ - Five-platform measured benchmark (TypeScript / React Native / Flutter / Android / iOS) with real `vise check` and `vise run-sensors` artifacts.
97
+
98
+ ### Fixed
99
+ - React Native platform detection priority (previously misdetected as TypeScript when both signals were present).
100
+
101
+ ## 0.5.0 — 2026-05-21
102
+
103
+ **Theme:** AST-based sensors.
104
+
105
+ ### Added
106
+ - Tree-sitter AST sensors for Kotlin / Swift / Dart literal detection.
107
+ - Phase 1 pilot: `typescript.auth.no-literal-user-id` resolves identifier-via-constant indirections.
108
+ - Phase 4: AST-aware comment stripping for `ui-states-present` and `design-reuse-detected-tokens` rules.
109
+
110
+ ## 0.4.0 — 2026-05-20
111
+
112
+ **Theme:** Compliance harness.
113
+
114
+ ### Added
115
+ - `vise check --ci`: read-only verification with structured exit codes for CI pipelines.
116
+ - Attestation flow: `vise attest` with rule id, signer, confidence, evidence, and rationale.
117
+ - `vise sync`: persist deterministic-pass attestation files.
118
+ - Engagement tracking: `vise engagement init/show` for tier / customer-id / scope metadata.
119
+ - `sp-vise/` sidecar directory: customer-visible compliance contract (`compliance.json`, `attestations/`, `engagement.json`, `inspection.json`).
120
+ - Cross-platform rule corpus.
121
+ - Native project skill installs.
122
+
123
+ ## 0.3.0 — 2026-05-19
124
+
125
+ **Theme:** Foundry → Vise rename.
126
+
127
+ ### Changed
128
+ - Renamed npm package to `@amityco/social-plus-vise`.
129
+ - Added `vise` short binary alias; kept `foundry-mcp` as a compatibility binary alias.
130
+ - Added Claude Code skill targets (`--target claude`, `--target claude-project .`).
131
+ - Documented Cursor, Copilot, and VS Code instruction installs.
132
+
133
+ ## 0.2.1 — 2026-05-18
134
+
135
+ ### Added
136
+ - `vise install-skill`, `vise print-skill`, and `vise skill-path` for bundled-skill installation.
137
+
138
+ ## 0.2.0 — 2026-05-17
139
+
140
+ ### Added
141
+ - Skill-guided CLI commands: `inspect`, `plan`, `validate`, `run-sensors`.
142
+ - The `social-plus-vise` skill guidance shipped as part of the package.
143
+
144
+ ## 0.1.1 — 2026-05-16
145
+
146
+ ### Added
147
+ - Initial npm publish.
148
+ - MCP adapter (stdio).
149
+ - Doc search backed by `https://learn.social.plus/llms-full.txt`.
package/README.md CHANGED
@@ -1,7 +1,3 @@
1
- <p align="center">
2
- <img src="./social.plus-vise.png" alt="social.plus Vise" width="320" />
3
- </p>
4
-
5
1
  <h1 align="center">social.plus Vise</h1>
6
2
 
7
3
  <p align="center">
@@ -45,9 +41,11 @@ See [Usage Flow](#usage-flow) for the full step-by-step diagram.
45
41
 
46
42
  ---
47
43
 
48
- ## What Vise Does
44
+ ## What Vise Does: Agentic Workflow Governance
49
45
 
50
- Vise is a **CLI + AI skill** that wraps coding agents in deterministic compliance guardrails when they integrate social.plus SDKs. It inspects your project, grounds the agent in hosted docs, enforces 250+ platform-specific compliance rules, and runs your project's own build/lint/typecheck sensors. **Your source code never leaves your machine.**
46
+ Instead of just providing a CLI or AI skills, Vise implements a technique called **Agentic Workflow Governance**. Think of it as building a software factory directly on top of the customer's project.
47
+
48
+ Vise acts as the foreman of this factory, wrapping your local coding agents in compliance guardrails when they integrate social.plus SDKs. It inspects your project, grounds the agent in hosted docs, enforces 262 platform-specific compliance rules, and runs your project's own build/lint/typecheck sensors. **Your source code never leaves your machine.**
51
49
 
52
50
  | Layer | Purpose |
53
51
  |---|---|
@@ -61,53 +59,78 @@ A bench vise holds the workpiece steady so the craftsman's hands are free to sha
61
59
 
62
60
  ---
63
61
 
64
- ## Benchmark: First-Try Success
62
+ ## Benchmark: Phase 1 Results
63
+
64
+ > **Every feature delivered correctly — confirmed independently with two different AI coding tools.**
65
+ > With Vise, both agents built all 9 social features with no production gaps. Without Vise, 3 out of 9 features had hidden problems that would only surface after users complained.
66
+
67
+ ### What "delivered correctly" means
68
+
69
+ "Correct" doesn't just mean the code compiles. It means every feature handles the edge cases that matter to real users and real moderation teams:
70
+
71
+ - A **banned user** cannot type or submit a post — the send button is hidden, not just disabled-on-submit
72
+ - **Push notification preferences** are wired to the Amity API so users who opt out actually stop receiving notifications
73
+ - **Moderation actions** (report, flag, block) are surfaced in the UI so users can act on them, not buried in a hook
74
+ - **Chat and feed queries** use live, reactive subscriptions — not one-time fetches that go stale
75
+
76
+ Without Vise, AI agents frequently implement the primary feature correctly but miss these secondary requirements. They know about them in the abstract — but when building a chat screen, "ban state" feels out of scope and gets skipped. `sp-check` turns that vague awareness into a specific, actionable finding.
77
+
78
+ ### The experiment: three conditions, nine features
79
+
80
+ We ran a controlled experiment — the **Commune Benchmark** — to measure not just *whether* Vise helps, but *why*. Each of the nine features below was built from scratch by an AI agent under three independent conditions:
65
81
 
66
- > **100% first-try CI pass with Vise vs 0% without.**
67
- >
68
- > **76% cheaper · 28% faster · 86% fewer issues**
82
+ **Nine features built:**
83
+ SDK setup · User presence · Social feed · Events · Chat & DMs · Push notifications · User profile · Content moderation · Comments
69
84
 
70
- When an AI agent integrates social.plus with only docs access (Pure MCP), it produces code with real problems: hardcoded user IDs, missing authentication, no content moderation, broken reactive patterns. These aren't edge cases — they're the SDK-specific requirements that general AI knowledge reliably misses.
85
+ | Condition | What the agent had | The question it answers |
86
+ |---|---|---|
87
+ | **Pure MCP** | Access to social.plus docs only — no compliance guidance | Baseline: how well does the agent do on its own? |
88
+ | **Rules-as-Markdown** | The full 1,013-line compliance rulebook pasted directly into the prompt | Is the problem just that the agent doesn't know the rules? |
89
+ | **Vise + Skill** | Full Vise CLI — `sp-check` runs automatically, agent reads specific findings, fixes them, repeats until green | Does an active feedback loop change the outcome? |
90
+
91
+ The Rules-as-Markdown condition is the key isolation: if the agent already knows all the rules, does giving it the spec document fix the problem? The answer turned out to be **no** — knowing the rules and being forced to act on specific findings are different things.
71
92
 
72
- ### v0.8 Pilot Results (React/Next.js · "add comments")
93
+ ### Results: features delivered without production gaps
73
94
 
74
- | Surface | CI Pass | Issues | Tokens | Cost | Wall-clock |
75
- |---|---|---|---|---|---|
76
- | **Pure MCP** (docs only) | 0/2 | 4–7 | 36,219 | $0.0108 | 619s |
77
- | **Vise-as-MCP** (rules engine) | 2/2 | 1 | 21,047 | $0.0061 | 540s |
78
- | **Vise CLI + Skill** (full workflow) | ✅ 2/2 | 1 | 8,733 | $0.0024 | 447s |
95
+ | Coding agent (model) | Pure MCP | Rules-as-Markdown | Vise + Skill |
96
+ |---|---|---|---|
97
+ | **Cursor (Composer 2.5)** | 6 out of 9 | 5 out of 9 ✗ | **9 out of 9 ✅** |
98
+ | **Claude Code (Sonnet 4.6)** | 6 out of 9 | 7 out of 9 ✗ | **9 out of 9 ✅** |
79
99
 
80
- <sub>Token/cost data from Antigravity/Gemini Flash 3.5. Copilot CLI does not expose token accounting.</sub>
100
+ The three features that consistently fail without Vise — **Chat**, **Moderation**, and **Push Notifications** — are exactly the ones with secondary compliance requirements (ban-state, report affordances, Amity preference API). Vise's `sp-check` catches these with a specific finding; the rules doc does not.
81
101
 
82
- **What "Issues" means in plain language:**
102
+ Both agents reached a perfect score with Vise. Neither could reach it with the compliance spec pasted into the prompt. All 9 passes were independently verified by code inspection — no scoring shortcuts.
83
103
 
84
- Without Vise, both agents produced code with hardcoded user IDs (security vulnerability), no authentication flow (anonymous writes), missing moderation UI, non-reactive queries, and missing SDK initialization. With Vise, those problems are caught or prevented during generation.
104
+ ### Efficiency: rework sessions needed
85
105
 
86
- ### Why this matters
106
+ Vise delivers all 9 features correctly in a single session. The other conditions leave failing features that require additional sessions to diagnose (the gap isn't visible without `sp-check`) and fix.
87
107
 
88
- | Metric | Without Vise | With Vise (CLI + Skill) | Improvement |
108
+ | Coding agent (model) | Condition | Features correct | Rework sessions needed |
89
109
  |---|---|---|---|
90
- | Does it work on first try? | Fails CI | ✅ Passes CI | 100% pass rate |
91
- | Security issues? | Hardcoded IDs, no auth | 0 security findings | 100% eliminated |
92
- | Integration issues | 4–7 per run | 1 per run | **−86%** fewer issues |
93
- | Token cost | $0.0108 | $0.0024 | **−78%** cheaper |
94
- | Token usage | 36,219 | 8,733 | **−76%** fewer tokens |
95
- | Speed (Gemini) | 619s | 447s | **−28%** faster |
96
- | Manual rework needed? | Yes | No | Zero rework |
110
+ | **Cursor (Composer 2.5)** | Pure MCP | 6 / 9 ✗ | +3 or more |
111
+ | **Cursor (Composer 2.5)** | Rules-as-Markdown | 5 / 9 ✗ | +4 or more |
112
+ | **Cursor (Composer 2.5)** | **Vise + Skill** | **9 / 9 ✅** | **0 ✅** |
113
+ | **Claude Code (Sonnet 4.6)** | Pure MCP | 6 / 9 ✗ | +3 or more |
114
+ | **Claude Code (Sonnet 4.6)** | Rules-as-Markdown | 7 / 9 ✗ | +2 or more |
115
+ | **Claude Code (Sonnet 4.6)** | **Vise + Skill** | **9 / 9 ✅** | **0 ✅** |
97
116
 
98
- ### Cross-model validation
117
+ <sub>Rework sessions are additional developer-initiated prompts needed after the initial session to diagnose and fix the failing features. Each failing feature typically requires at least one session to identify the gap and one to fix it — and that's without the benefit of `sp-check` pointing directly at the problem.</sub>
99
118
 
100
- The effect holds across **Claude Sonnet 4.6** (Copilot CLI) and **Gemini Flash 3.5** (Antigravity). This is not a prompt trick for one model — it's domain knowledge applied consistently at the social.plus layer.
119
+ ### Reproducibility
120
+
121
+ - **Gate-checked:** Every pass was verified by code inspection — the Vise workspaces contain an actual UI-level ban gate; the pure-MCP workspaces do not. Zero attestation shortcuts.
122
+ - **Built from scratch** (greenfield seed) — not patching existing code.
123
+ - **Three arms run with separate tooling.** The Rules-as-Markdown arm has no `sp-check` tool available — it cannot "cheat" by running the checker.
124
+ - **N=1 per cell (Phase 1).** Each agent ran each scenario once. Repeatability seeds on the three most discriminating slices (CM-CHAT, CM-MODERATE, CM-PUSH) are pending. These results should be treated as a strong initial signal, not a statistically settled finding.
125
+ - Full per-feature scorecards, agent transcripts, and workspace diffs: [`benchmarks/FINDINGS.html`](benchmarks/FINDINGS.html) · [`benchmarks/RULES_AS_MARKDOWN.html`](benchmarks/RULES_AS_MARKDOWN.html)
101
126
 
102
127
  ### Which mode should I use?
103
128
 
104
- | If you... | Use | Why |
129
+ | If you | Use | Why |
105
130
  |---|---|---|
106
- | Can install the skill | **CLI + Skill** | Fastest, cheapest, best results |
107
- | Can't install skill but have MCP | **Vise-as-MCP** | Same compliance, slightly more tokens |
108
- | Want to validate existing code | `vise check --ci` | Grade any codebase, any time |
109
-
110
- For the full interactive report with charts, see [`benchmarks/report.html`](./benchmarks/report.html). For per-cell scorecards and prior benchmark versions, see [`benchmarks/RESULTS.md`](./benchmarks/RESULTS.md).
131
+ | Building new social features with an AI agent | **Vise CLI + Skill** | The only mode that reliably delivers all features correctly |
132
+ | Auditing existing social.plus code | `vise check --ci` | Grades any codebase against the full ruleset |
133
+ | Enforcing compliance in a CI pipeline | `vise check --ci` | Exits non-zero on failures; structured JSON output for logs |
111
134
 
112
135
  ---
113
136
 
@@ -121,7 +144,7 @@ For the full interactive report with charts, see [`benchmarks/report.html`](./be
121
144
  | **Android (Kotlin)** | ✅ Full | Gradle assemble, unit tests |
122
145
  | **iOS (Swift)** | ✅ Full | (static rule checks; runtime sensors WIP) |
123
146
 
124
- Each platform has 50–55 rules across 10 compliance domains (feed, comments, moderation, chat, secrets, session & auth, notifications, live objects, logging hygiene, design tokens).
147
+ Each platform has 52–54 rules across 10 compliance domains (feed, comments, moderation, chat, secrets, session & auth, notifications, live objects, logging hygiene, design tokens).
125
148
 
126
149
  ---
127
150
 
@@ -199,12 +222,13 @@ The flow above is what the skill teaches your AI agent. You — the human — dr
199
222
  | `vise plan-harness [path] --request "..."` | (Pre-planning step) Build the harness around the request |
200
223
  | `vise init [path] --request "..."` | Write the `sp-vise/` compliance contract for this project |
201
224
 
202
- ### Documentation grounding
225
+ ### Documentation grounding & Troubleshooting
203
226
 
204
227
  | Command | Purpose |
205
228
  |---|---|
206
229
  | `vise search-docs "<query>"` | Search social.plus docs for relevant pages |
207
230
  | `vise get-doc-page <path>` | Fetch a specific doc page by path |
231
+ | `vise debug [path] --error "..." [--brief]` | Debug an SDK-specific runtime failure and emit a likely-cause summary plus a minimal repair brief |
208
232
 
209
233
  ### Compliance verification
210
234
 
@@ -225,6 +249,18 @@ The flow above is what the skill teaches your AI agent. You — the human — dr
225
249
  | `vise run-sensors [path]` | Run detected project commands (npm scripts, Gradle, Flutter, lint, typecheck, SDK import smokes); never executes arbitrary shell |
226
250
  | `vise run-sensors [path] --dry-run` | List what would run without executing |
227
251
 
252
+ ### Troubleshooting quick loop
253
+
254
+ For SDK-specific runtime issues, start with the compact debug flow before broader repo exploration:
255
+
256
+ ```sh
257
+ vise debug . --error-file logs/crash.log --brief
258
+ vise check . --ci
259
+ vise run-sensors .
260
+ ```
261
+
262
+ `vise debug --brief` returns the likely rule, minimum patch shape, invariants to preserve, and verification commands for the first repair pass.
263
+
228
264
  ### Skill management
229
265
 
230
266
  | Command | Purpose |
@@ -266,7 +302,7 @@ MCP-capable hosts can call Vise as structured tool calls instead of shell comman
266
302
 
267
303
  ### Tool names (snake_case per MCP convention)
268
304
 
269
- `inspect_project`, `plan_harness`, `plan_integration`, `init_compliance`, `check_compliance`, `sync_compliance`, `attest_rule`, `explain_rule`, `init_engagement`, `show_engagement`, `search_docs`, `get_doc_page`, `validate_setup`, `run_sensors`.
305
+ `inspect_project`, `plan_harness`, `plan_integration`, `init_compliance`, `check_compliance`, `sync_compliance`, `attest_rule`, `explain_rule`, `init_engagement`, `show_engagement`, `search_docs`, `get_doc_page`, `debug_issue`, `validate_setup`, `run_sensors`.
270
306
 
271
307
  These are the same operations as the CLI commands above, exposed as MCP tools.
272
308
 
package/dist/outcomes.js CHANGED
@@ -5,15 +5,15 @@ export function hasAnswer(answers, id) {
5
5
  const CLASSIFY_ORDER = [
6
6
  "setup-push",
7
7
  "setup-live-data",
8
+ "add-comments",
8
9
  "add-moderation",
9
10
  "add-chat",
10
- "add-comments",
11
11
  "add-feed",
12
12
  "troubleshoot",
13
13
  "validate-setup",
14
14
  "setup-sdk",
15
15
  ];
16
- const PUSH_PATTERNS = [/\b(push|notification|firebase|fcm|apns)\b/];
16
+ const PUSH_PATTERNS = [/\b(push(?:\s+notification)?|push(?:\s+notifications)?|firebase|fcm|apns)\b/];
17
17
  const LIVE_PATTERNS = [
18
18
  /\b(live object|live objects|live collection|live collections|realtime collection|real-time collection|observe|observer|subscribe|subscription|unsubscribe|live update|live updates)\b/,
19
19
  ];
@@ -31,7 +31,7 @@ const CHAT_PATTERNS = [
31
31
  ];
32
32
  const TROUBLESHOOT_PATTERNS = [/\b(error|broken|crash|not working|fail|timeout|401|403)\b/];
33
33
  const VALIDATE_PATTERNS = [/\b(validate|check|correct|setup right|initiali[sz])\b/];
34
- const SETUP_PATTERNS = [/\b(setup|set up|install|integrate|wire|configure)\b/];
34
+ const SETUP_PATTERNS = [/\b(setup|set up|install|integrate|wire|configure|init sdk|sdk setup|session lifecycle)\b|initialise?s?\b/];
35
35
  export const BROAD_SOCIAL_REGEX = /\b(nice|social features|social feature|engagement|community experience)\b/i;
36
36
  export const DESIGN_REGEX = /\bdesign token|design tokens|theme|same design|design system|brand/i;
37
37
  export function classifyOutcome(request) {
@@ -134,10 +134,18 @@ const setupSdk = {
134
134
  step: "Initialize the social.plus client exactly once with API key and explicit region.",
135
135
  evidence: [platformQuickStart(ctx.platform).path, "requiredInputs.social.plus API key local env/config variable", "requiredInputs.social.plus region"],
136
136
  },
137
+ {
138
+ step: "When repairing setup, reuse the app's existing region or endpoint config source instead of hardcoding a guessed default value.",
139
+ evidence: [platformQuickStart(ctx.platform).path, "requiredInputs.social.plus region"],
140
+ },
137
141
  {
138
142
  step: "Wire login after user identity is known and before social.plus API queries/subscriptions.",
139
143
  evidence: ["social-plus-sdk/getting-started/authentication", "requiredInputs.user identity source for login"],
140
144
  },
145
+ {
146
+ step: "When adding renewal handling, keep it in the existing login path and retain the handler for the full session lifetime.",
147
+ evidence: ["social-plus-sdk/getting-started/authentication", "requiredInputs.user identity source for login"],
148
+ },
141
149
  { step: "Run validate_setup and detected command sensors after edits.", evidence: ["validate_setup", "run_sensors"] },
142
150
  ],
143
151
  validation: (platform) => [`${platform}.setup.present`, `${platform}.login.present`, `${platform}.region.explicit`],
@@ -470,6 +478,13 @@ const addFeed = {
470
478
  "social-plus-sdk/core-concepts/realtime-communication/live-objects-collections/overview",
471
479
  ],
472
480
  },
481
+ {
482
+ step: "When repairing or refactoring a feed query, preserve existing pagination inputs and state (for example pageToken, nextPage, hasMore/loadMore, or infinite-query wiring) unless the customer explicitly changes feed behavior.",
483
+ evidence: [
484
+ "requiredInputs.feed scope",
485
+ "implementationRules.file-specific edits",
486
+ ],
487
+ },
473
488
  { step: "Reuse the host app's existing visual system for the social surface.", evidence: designEvidence },
474
489
  { step: "Implement loading, empty, error, and data states.", evidence: ["implementationRules.file-specific edits"] },
475
490
  { step: "Run validate_setup and detected command sensors after edits.", evidence: ["validate_setup", "run_sensors"] },
@@ -878,6 +893,7 @@ const troubleshoot = {
878
893
  implementationRules: () => [],
879
894
  implementationSteps: () => [
880
895
  { step: "Gather more evidence before implementation.", evidence: ["stopConditions", "search_docs", "inspect_project"] },
896
+ { step: "If the issue is an SDK-specific runtime symptom, run vise debug first and use the repair brief before broader repo exploration.", evidence: ["search_docs", "inspect_project"] },
881
897
  ],
882
898
  validation: () => [],
883
899
  stopConditions: () => [],
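The effect of the two regex changes in the `outcomes.js` hunks above can be checked directly. The patterns below are copied from the diff; the sample requests are invented, and the sketch assumes requests are lowercased before matching (the patterns carry no `i` flag, and the body of `classifyOutcome` is not shown).

```javascript
// Patterns copied verbatim from the outcomes.js hunks above.
const OLD_PUSH = /\b(push|notification|firebase|fcm|apns)\b/;
const NEW_PUSH = /\b(push(?:\s+notification)?|push(?:\s+notifications)?|firebase|fcm|apns)\b/;
const NEW_SETUP = /\b(setup|set up|install|integrate|wire|configure|init sdk|sdk setup|session lifecycle)\b|initialise?s?\b/;

// A comments-style request that merely mentions "notification".
const request = "add notification badges to comments";
console.log(OLD_PUSH.test(request)); // true  (old pattern matched on "notification" alone)
console.log(NEW_PUSH.test(request)); // false (new pattern needs "push", "firebase", "fcm", or "apns")
console.log(NEW_PUSH.test("enable push notifications")); // true

// The added alternative covers the British spelling only.
console.log(NEW_SETUP.test("initialise the client")); // true
console.log(NEW_SETUP.test("initialize the client")); // false
```

If classification picks the first matching entry in `CLASSIFY_ORDER` (an assumption, since that logic is outside the hunk), the narrower push pattern would stop a bare "notification" from routing a request to `setup-push`. Note also that the new `initialise?s?` alternative has no leading `\b`, so it can match inside a longer word such as "reinitialises".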
package/dist/server.js CHANGED
@@ -13,6 +13,7 @@ import { planIntegrationTool } from "./tools/integration.js";
13
13
  import { inspectProjectTool, validateSetupTool } from "./tools/project.js";
14
14
  import { resolveRequestTool, suggestPatchTool } from "./tools/resolve.js";
15
15
  import { runSensorsTool } from "./tools/sensors.js";
16
+ import { debugIssueTool, debugIssue } from "./tools/debug.js";
16
17
  import { packageName, packageVersion } from "./version.js";
17
18
  const tools = new Map([
18
19
  searchDocsTool,
@@ -31,6 +32,7 @@ const tools = new Map([
31
32
  runSensorsTool,
32
33
  validateSetupTool,
33
34
  suggestPatchTool,
35
+ debugIssueTool,
34
36
  ].map((tool) => [tool.name, tool]));
35
37
  const bundledSkillName = "social-plus-vise";
36
38
  const cliResult = await handleCli(process.argv.slice(2));
@@ -119,6 +121,32 @@ async function handleCli(args) {
119
121
  });
120
122
  return "exit";
121
123
  }
124
+ if (command === "debug") {
125
+ assertOnlyKnownFlags(args, ["error", "error-file", "brief"], "debug");
126
+ let errorMessage = flagValue(args, "error");
127
+ if (!errorMessage) {
128
+ const errorFile = flagValue(args, "error-file");
129
+ if (errorFile) {
130
+ errorMessage = await readFile(path.resolve(errorFile), "utf8");
131
+ }
132
+ else if (!process.stdin.isTTY) {
133
+ const { readFileSync } = await import("node:fs");
134
+ try {
135
+ errorMessage = readFileSync(0, "utf-8");
136
+ }
137
+ catch {
138
+ errorMessage = undefined;
139
+ }
140
+ }
141
+ }
142
+ if (!errorMessage) {
143
+ console.error("debug requires --error, --error-file, or piped stdin.");
144
+ process.exitCode = 1;
145
+ return "exit";
146
+ }
147
+ console.log(JSON.stringify(await debugIssue(positionalRepoPath(args.slice(1)), errorMessage, { brief: hasFlag(args, "brief") }), null, 2));
148
+ return "exit";
149
+ }
122
150
  if (command === "plan" || command === "plan-integration") {
123
151
  await printToolResult(planIntegrationTool, {
124
152
  repoPath: positionalRepoPath(args.slice(1)),
@@ -349,6 +377,16 @@ Run deterministic social.plus setup validation for the current project.
349
377
 
350
378
  Usage:
351
379
  vise validate [repoPath] [--platform typescript] [--surface apps/web]`;
380
+ }
381
+ if (command === "debug") {
382
+ return `${packageName} debug
383
+
384
+ Correlate an SDK-specific runtime failure to likely compliance issues and emit a minimal repair brief.
385
+
386
+ Usage:
387
+ vise debug [repoPath] --error "401 Unauthorized: TokenExpiredException during social.plus session renewal"
388
+ vise debug [repoPath] --error-file logs/crash.log
389
+ vise debug [repoPath] --error-file logs/crash.log --brief`;
352
390
  }
353
391
  if (command === "run-sensors" || command === "run-sensor" || command === "run_sensor") {
354
392
  return `${packageName} run-sensors
@@ -425,6 +463,7 @@ Usage:
425
463
  vise install-skill --target codex Install bundled skill guidance
426
464
  vise print-skill Print bundled skill markdown
427
465
  vise inspect [repoPath] Inspect platform and design signals
466
+ vise debug [repoPath] --error ... Debug an SDK-specific runtime error and emit a repair brief
428
467
  vise plan [repoPath] --request "..." Create an implementation plan
429
468
  vise init [repoPath] --request "..." Initialize compliance sidecar
430
469
  vise check [repoPath] Check compliance contract
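The `debug` branch added to `handleCli` in this file resolves its error text in a fixed order: `--error`, then `--error-file`, then piped stdin. The sketch below restates that precedence as a pure function. It is a simplified, synchronous illustration rather than the shipped code: `resolveErrorMessage` and the injected `readFile` / `readStdin` stand-ins are assumptions introduced so the logic can run without touching the filesystem.

```javascript
// Simplified sketch of the input precedence in the `debug` CLI branch.
// readFile and readStdin are injected stand-ins (assumptions), not Vise APIs.
function resolveErrorMessage({ error, errorFile, stdinIsTTY }, { readFile, readStdin }) {
  if (error) return error;                    // 1. --error wins when present
  if (errorFile) return readFile(errorFile);  // 2. otherwise read --error-file
  if (!stdinIsTTY) {                          // 3. otherwise try piped stdin
    try {
      return readStdin();
    } catch {
      return undefined;
    }
  }
  return undefined; // the CLI then prints a usage error and sets exit code 1
}

// Fake readers for demonstration.
const io = {
  readFile: (p) => `contents of ${p}`,
  readStdin: () => "piped text",
};

console.log(resolveErrorMessage({ error: "401 Unauthorized", errorFile: "logs/crash.log", stdinIsTTY: false }, io));
// "401 Unauthorized"
console.log(resolveErrorMessage({ errorFile: "logs/crash.log", stdinIsTTY: false }, io));
// "contents of logs/crash.log"
console.log(resolveErrorMessage({ stdinIsTTY: false }, io)); // "piped text"
console.log(resolveErrorMessage({ stdinIsTTY: true }, io));  // undefined
```

One consequence visible in the diff: when `--error-file` is supplied, stdin is never consulted, so an empty file leaves the message empty and the command reports that it needs `--error`, `--error-file`, or piped stdin.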