npm - @amityco/social-plus-vise - Versions diffs - 0.14.11 → 0.14.12 - Mend

@amityco/social-plus-vise 0.14.11 → 0.14.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/CHANGELOG.md +10 -0
package/README.md +24 -13
package/dist/outcomes.js +1 -2
package/dist/productExpectations.js +83 -0
package/dist/tools/compliance.js +64 -9
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -4,6 +4,16 @@ All notable changes to `@amityco/social-plus-vise` are documented in this file.
 The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## 0.14.12 — 2026-06-05
+### Changed
+- **Multi-sensor shared expectations:** `comments.thread-read-write` now groups the concrete comment page-size, creation-affordance, observer-cleanup, and UI-state sensors under one public product expectation while preserving `contractRuleId` and `validator.sensorId` evidence.
+- **Ambiguous attestation guidance:** when a shared public expectation maps to multiple contract rules in the same sidecar, `vise check` now emits exact `vise attest --rule <contract-rule-id>` commands and `vise attest --rule <public-id>` rejects with a concrete rule list.
+- **Public benchmark claim:** README benchmark copy now describes the current full workflow, current release validation, and evidence boundaries without relying on local benchmark artifact links.
+### Verified
+- Packed-package smoke for `@amityco/social-plus-vise@0.14.12` confirmed plan question surfacing, design confirmation, optional feed capability sensors, shared comment expectation grouping, and ambiguous attestation handling.
 ## 0.14.11 — 2026-06-05
 ### Added

package/README.md CHANGED

@@ -143,27 +143,40 @@ A bench vise holds the workpiece steady so the craftsman's hands are free to sha
 ## Benchmark Results: Current Claim
-> **Vise gives AI coding agents a governed workflow for social.plus integrations, improving feature completeness, SDK compliance, and design consistency in greenfield work.**
+> **Vise gives AI coding agents a governed workflow for social.plus integrations: it makes scope explicit, checks the local code, and turns missing SDK capabilities or compliance gaps into repair work before the agent stops.**
-The strongest current claim is not a universal speed or quality promise. It is narrower and more useful: when agents build greenfield social.plus SDK features, the Vise workflow makes scope explicit, checks local code, and turns missing capabilities or SDK violations into concrete repair work before the agent stops.
+The strongest current claim is not a universal speed or quality promise. It is narrower and more useful: for greenfield social.plus SDK work, Vise improves the stopping condition. The agent is not done after reading docs or producing code; it is done when the local contract is green, attested, or blocked on explicit customer input.
-### Latest headline: feed completeness
+### Latest Quantitative Benchmark
-The latest Vise 0.14.5 opt-in comparison is the headline product proof. Same feed request, same SDK docs, no Vise workflow in the baseline; the Vise arm explicitly selected `feed_optional_capabilities=post-image-upload,post-poll-creation,post-edit`, persisted that choice into `sp-vise/compliance.json`, and activated the selected sensors.
+The latest feed-completeness benchmark remains the headline product proof. Same feed request, same SDK docs, no Vise workflow in the baseline; the Vise arm explicitly selected feed optional capabilities, persisted that choice into `sp-vise/compliance.json`, and activated selected source sensors.
-| Agent / model | Docs-only baseline | Vise 0.14.5 opt-in arm | Readout |
+| Agent / model | Docs-only baseline | Vise opt-in arm | Readout |
 |---|---:|---:|---|
 | Cursor / Composer 2.5 | 30% (3.3/11 avg) | **97% (32/33)** | One seed surfaced the remaining item instead of silently dropping it. |
 | Claude / Sonnet 4.6 medium | 27% (3.0/11 avg) | **100% (33/33)** | All three Vise seeds reached 11/11. |
 | Codex / GPT-5.4 medium | 21% (2.3/11 avg) | **100% (33/33)** | All three Vise seeds reached 11/11. |
-Aggregate: **98/99 expected feed capabilities** and **27/27 selected optional capabilities** implemented across the latest Vise arm. See the full table and per-seed grader links in [`benchmarks/CURSOR_VISE_0.14.5_RESULTS.html`](benchmarks/CURSOR_VISE_0.14.5_RESULTS.html).
+Aggregate: **98/99 expected feed capabilities** and **27/27 selected optional capabilities** implemented across the Vise arm.
-### Supporting proof
+### Current Release Validation
+Version 0.14.12 adds current release proof around the full feed-forward and validation flow:
+| Surface | What was validated |
+|---|---|
+| **Product flow** | Local end-to-end smoke covers design extraction, plan feed-forward, blocking intake, answered init, capability check, design conformance, and sensor discovery. |
+| **Plan questions** | Plans surface blocking questions such as `feature_surface` and `design_contract_confirmation`, plus optional choices such as `feed_optional_capabilities`. |
+| **Capability-to-sensor flow** | Vise checks platform support, matches the prompt to available capabilities, offers supported features as questions, records answers, and turns selected answers into sensors in `vise check`. |
+| **Shared product expectations** | Public IDs such as `comments.thread-read-write` stay platform-agnostic while check results retain concrete `contractRuleId` and `validator.sensorId` evidence. |
+| **Rule detection** | TP-track dashboard detects **311/311 seeded rule gaps (100.0%)** in the static corpus. |
+| **Packed-package smoke** | A real Antigravity agent smoke tested the 0.14.12 tarball, opted into surfaced plan questions, repaired selected optional poll capability sensors, and verified ambiguous shared comment attestations require exact contract rule IDs. |
+### Supporting Proof
 | Surface | Safe claim | Evidence |
 |---|---|---|
-| **Feature completeness** | Vise helps agents build more of the expected SDK capability surface. | Latest comparison: **21-30% without Vise vs 97-100% with Vise 0.14.5**. Earlier pre-registered Capability Matrix Row 2 also shipped a feature-completeness win: silently dropped items fell from 7.67/11 to 4.0/11. |
+| **Feature completeness** | Vise helps agents build more of the expected SDK capability surface. | Latest comparison: **21-30% without Vise vs 97-100% with Vise**, with **98/99** expected feed capabilities implemented in aggregate. Earlier Capability Matrix work also showed silently dropped items falling from 7.67/11 to 4.0/11. |
 | **SDK compliance** | Vise checks catch greenfield SDK compliance gaps that docs or static guidance can miss. | Commune benchmark: Vise averaged **100% greenfield SDK compliance** where docs/RAG-style controls averaged **67%** across the reported slices. |
 | **Design conformance** | Vise design checks reduce design drift under ambiguous briefs. | Ambiguous Spotify-style design test: Vise design runs produced **0 / 0 / 0 hex literals** across three seeds; without Vise, runs varied **0 / 2 / 15**. This supports variance reduction, not pixel-perfect visual quality. |
@@ -172,10 +185,10 @@ Aggregate: **98/99 expected feed capabilities** and **27/27 selected optional ca
 The benchmark story is the product flow:
 1. **Inspect** — Vise detects platform, app surface, SDK surface, sensors, and design signals from the local repo.
-2. **Plan** — Vise classifies the outcome, asks blocking intake questions, surfaces capability availability, and offers optional feed capabilities only when the platform SDK surface supports them.
+2. **Plan** — Vise classifies the outcome, asks blocking intake questions, checks platform capability availability, and offers optional features only when the platform SDK surface supports them.
 3. **Confirm design** — `vise design extract` writes a preview; `plan`/`init` withhold design feed-forward until the user confirms `design_contract_confirmation=yes`.
 4. **Initialize** — `vise init` records the resolved compliance contract, intake answers, selected optional capabilities, inspection result, and accepted design digest.
-5. **Build / check / repair** — the agent edits locally, runs `vise check`, fixes deterministic findings, resolves completeness gaps or selected-capability failures, and then runs project sensors.
+5. **Build / check / repair** — selected answers become sensors. The agent edits locally, runs `vise check`, fixes deterministic findings, resolves completeness gaps or selected-capability failures, and then runs project sensors.
 Static docs can tell an agent what exists. Vise changes the stopping condition: the agent is not done until the local contract is green, attested, or blocked on explicit customer input.
@@ -183,14 +196,12 @@ Static docs can tell an agent what exists. Vise changes the stopping condition:
 The benchmark suite is intentionally reported with boundaries:
-- **Latest feed-completeness comparison** is the current product claim for the opt-in capability flow in Vise 0.14.5. It is a best-case/opt-in comparison across Cursor, Claude, and Codex, not a universal result for every prompt.
+- **Latest feed-completeness comparison** is the current quantitative product claim for the opt-in capability flow. It is a best-case/opt-in comparison across Cursor, Claude, and Codex, not a universal result for every prompt.
 - **Capability Matrix v1** remains the pre-registered follow-up. It shipped the Row 2 feature-completeness claim, found **no Row 1 SDK-compliance claim** on chat/moderation/push under its registered margin, and withheld the Row 3 design claim on a technicality despite higher by-name token use.
 - **Commune Phase 1** remains useful historical evidence for the compliance loop: two agents reached 9/9 with Vise vs 5-7/9 under controls, but it was N=1 per cell and the grader overlaps Vise's own rules.
 - **Design tests** support design-drift reduction and token cleanup. They do not prove visual taste, pixel perfection, or production-ready UI without human review.
 - **Negative results must travel with the claim:** no measured Vise advantage on day-2 bug fixing; the push slice exposed a non-converging attestation loop when docs and SDK disagreed; earlier enumerative plan-time design guidance measured negative and was retracted; the original `scope-omit` affordance went unused in the matrix.
-Full evidence: [`benchmarks/CURSOR_VISE_0.14.5_RESULTS.html`](benchmarks/CURSOR_VISE_0.14.5_RESULTS.html), [`benchmarks/capability-matrix/RESULTS.md`](benchmarks/capability-matrix/RESULTS.md), [`benchmarks/commune/RESULTS.md`](benchmarks/commune/RESULTS.md), and [`benchmarks/brand/design-test/RESULTS.md`](benchmarks/brand/design-test/RESULTS.md).
 ### Which mode should I use?
 | If you… | Use | Why |

package/dist/outcomes.js CHANGED

@@ -788,8 +788,7 @@ const addComments = {
         "comment target resolved",
         "no invented postId/commentId",
         `${platform}.comments.target-resolved`,
-        `${platform}.comments.observer-cleanup`,
-        `${platform}.comments.ui-states-present`,
+        "comments.thread-read-write",
         `${platform}.comments.moderation-affordance-present`,
     ],
     stopConditions: (ctx) => filterStops(ctx.answers, [

package/dist/productExpectations.js CHANGED

@@ -49,31 +49,111 @@ export const PRODUCT_EXPECTATION_BINDINGS = [
         sensorId: "ios.feed.rich-post-composer-surfaced",
         platform: "ios",
     },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "typescript.comments.query-has-limit",
+        platform: "typescript",
+    },
     {
         expectationId: "comments.thread-read-write",
         sensorId: "typescript.comments.creation-affordance-present",
         platform: "typescript",
     },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "typescript.comments.observer-cleanup",
+        platform: "typescript",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "typescript.comments.ui-states-present",
+        platform: "typescript",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "react-native.comments.query-has-limit",
+        platform: "react-native",
+    },
     {
         expectationId: "comments.thread-read-write",
         sensorId: "react-native.comments.creation-affordance-present",
         platform: "react-native",
     },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "react-native.comments.observer-cleanup",
+        platform: "react-native",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "react-native.comments.ui-states-present",
+        platform: "react-native",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "android.comments.query-has-limit",
+        platform: "android",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "android.comments.creation-affordance-present",
+        platform: "android",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "android.comments.observer-cleanup",
+        platform: "android",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "android.comments.ui-states-present",
+        platform: "android",
+    },
     {
         expectationId: "comments.thread-read-write",
         sensorId: "android.comments.thread-ui-states-present",
         platform: "android",
     },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "flutter.comments.query-has-limit",
+        platform: "flutter",
+    },
     {
         expectationId: "comments.thread-read-write",
         sensorId: "flutter.comments.creation-affordance-present",
         platform: "flutter",
     },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "flutter.comments.observer-cleanup",
+        platform: "flutter",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "flutter.comments.ui-states-present",
+        platform: "flutter",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "ios.comments.query-has-limit",
+        platform: "ios",
+    },
     {
         expectationId: "comments.thread-read-write",
         sensorId: "ios.comments.creation-affordance-present",
         platform: "ios",
     },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "ios.comments.observer-cleanup",
+        platform: "ios",
+    },
+    {
+        expectationId: "comments.thread-read-write",
+        sensorId: "ios.comments.ui-states-present",
+        platform: "ios",
+    },
     {
         expectationId: "chat.unread-visible",
         sensorId: "typescript.chat.unread-visible",
@@ -169,3 +249,6 @@ export function contractRuleCandidatesForPublicId(ruleId) {
         .filter((binding) => binding.expectationId === ruleId)
         .map((binding) => binding.sensorId);
 }
+export function hasMultipleContractRuleCandidates(ruleId) {
+    return contractRuleCandidatesForPublicId(ruleId).length > 1;
+}
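
Note on the hunk above: the new bindings make one public expectation ID fan out to several concrete sensor IDs, and `hasMultipleContractRuleCandidates` reports whether that fan-out exists. A minimal sketch of that lookup follows. It assumes `contractRuleCandidatesForPublicId` filters `PRODUCT_EXPECTATION_BINDINGS` (the diff shows only the tail of that function), and it uses a trimmed, illustrative subset of the binding table rather than the full list.

```javascript
// Illustrative subset of the binding table; the real table covers more
// platforms and expectations than shown here.
const PRODUCT_EXPECTATION_BINDINGS = [
  { expectationId: "comments.thread-read-write", sensorId: "typescript.comments.query-has-limit", platform: "typescript" },
  { expectationId: "comments.thread-read-write", sensorId: "typescript.comments.creation-affordance-present", platform: "typescript" },
  { expectationId: "comments.thread-read-write", sensorId: "typescript.comments.observer-cleanup", platform: "typescript" },
  { expectationId: "comments.thread-read-write", sensorId: "typescript.comments.ui-states-present", platform: "typescript" },
  { expectationId: "chat.unread-visible", sensorId: "typescript.chat.unread-visible", platform: "typescript" },
];

// Assumption: the lookup starts from the binding table above.
function contractRuleCandidatesForPublicId(ruleId) {
  return PRODUCT_EXPECTATION_BINDINGS
    .filter((binding) => binding.expectationId === ruleId)
    .map((binding) => binding.sensorId);
}

// Same body as the function added in 0.14.12.
function hasMultipleContractRuleCandidates(ruleId) {
  return contractRuleCandidatesForPublicId(ruleId).length > 1;
}

console.log(contractRuleCandidatesForPublicId("comments.thread-read-write").length); // 4
console.log(hasMultipleContractRuleCandidates("comments.thread-read-write")); // true
console.log(hasMultipleContractRuleCandidates("chat.unread-visible")); // false (in this subset only)
```

In the published table `chat.unread-visible` may have bindings for other platforms as well; the `false` result here reflects only the trimmed subset.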

package/dist/tools/compliance.js CHANGED

@@ -4,7 +4,7 @@ import path from "node:path";
 import { fileURLToPath } from "node:url";
 import { assessProjectCompleteness, assessProjectSelectedOptionalCapabilities, availableOptionalCapabilityIds, optionalCapabilityChecklist, platformCapabilityAvailability, selectedOptionalCapabilityIds, } from "../capabilities.js";
 import { getOutcomeDefinition, hasAnswer, planContextFor, resolveOutcome, } from "../outcomes.js";
-import { contractRuleCandidatesForPublicId, productExpectationBindingForSensor, publicProductRuleId, } from "../productExpectations.js";
+import { contractRuleCandidatesForPublicId, hasMultipleContractRuleCandidates, productExpectationBindingForSensor, publicProductRuleId, } from "../productExpectations.js";
 import { objectInput, optionalBooleanField, optionalStringField, stringField, textResult } from "../types.js";
 import { packageVersion } from "../version.js";
 import { DESIGN_CONTRACT_CONFIRMATION_ANSWER_ID, buildDesignBrief, designContractConfirmationFromAnswers, designPreviewPath, readDesignContract, } from "./design.js";
@@ -568,7 +568,7 @@ export async function checkCompliance(repoPath) {
                     recommendation: finding?.recommendation,
                     rationale: rule.rationale,
                     current_rule: ruleSummary(rule),
-                    ...(failStatus === "attestation-needed" && rule.enforcement.attestation.allowed && attestHint(rule)),
+                    ...(failStatus === "attestation-needed" && rule.enforcement.attestation.allowed && attestHint(rule, compliance)),
                 });
                 continue;
             }
@@ -593,7 +593,7 @@ export async function checkCompliance(repoPath) {
                         rationale: rule.rationale,
                         current_rule: ruleSummary(rule),
                         source_fingerprint_status: sourceFingerprintStatus,
-                        ...(fingerprintStatus === "attestation-needed" && rule.enforcement.attestation.allowed && attestHint(rule)),
+                        ...(fingerprintStatus === "attestation-needed" && rule.enforcement.attestation.allowed && attestHint(rule, compliance)),
                     });
                     continue;
                 }
@@ -639,7 +639,7 @@ export async function checkCompliance(repoPath) {
             recommendation: finding?.recommendation,
             current_rule: ruleSummary(rule),
             ...(isInferential && { inferential_prompt: rule.enforcement.inferential?.prompt }),
-            ...(baseStatus === "attestation-needed" && rule.enforcement.attestation.allowed && attestHint(rule)),
+            ...(baseStatus === "attestation-needed" && rule.enforcement.attestation.allowed && attestHint(rule, compliance)),
         });
     }
     const summary = summarize(results);
@@ -738,6 +738,14 @@ export async function attestRule(args) {
     const contractRuleId = resolveRuleIdForContract(args.ruleId, compliance, rules);
     const rule = contractRuleId ? rules.get(contractRuleId) : undefined;
     if (!rule || !contractRuleId) {
+        const ambiguousCandidates = contractRuleCandidatesForContract(args.ruleId, compliance, rules);
+        if (ambiguousCandidates.length > 1) {
+            const candidateIds = ambiguousCandidates
+                .filter((candidate) => rules.get(candidate)?.enforcement.attestation.allowed)
+                .slice(0, 8);
+            const ids = candidateIds.length > 0 ? candidateIds : ambiguousCandidates.slice(0, 8);
+            throw new Error(`Rule is ambiguous in this compliance contract: ${args.ruleId}. Attest one contract rule at a time: ${ids.join(", ")}.`);
+        }
         // Collect up to 8 applicable attestable rule ids from this contract for the error hint. Prefer
         // ids that share the bad id's non-wildcard prefix so the agent can narrow down quickly.
         const attestableIds = compliance.rules
@@ -746,7 +754,10 @@ export async function attestRule(args) {
         const prefix = args.ruleId.replace(/\.\*$|\*$/, "");
         const prefixed = prefix !== args.ruleId ? attestableIds.filter((r) => r.id.startsWith(prefix) || r.id.includes(prefix) || publicProductRuleId(r.id).startsWith(prefix)) : [];
         const candidates = prefixed.length > 0 ? prefixed : attestableIds;
-        const hintIds = candidates.slice(0, 8).map((r) => publicProductRuleId(r.id));
+        const hintIds = uniqueStrings(candidates.slice(0, 8).map((r) => {
+            const publicId = publicProductRuleId(r.id);
+            return publicId === r.id ? r.id : `${publicId} (${r.id})`;
+        }));
         const hintSuffix = hintIds.length > 0 ? ` Applicable attestable rules: ${hintIds.join(", ")}.` : " Applicable attestable rules: none.";
         const preamble = args.ruleId.includes("*")
             ? `Wildcards are not supported — attest one rule at a time.`
@@ -780,6 +791,31 @@ export async function explainRule(ruleId) {
     const rules = await rulesById();
     const contractRuleId = resolveRuleIdForInstalledRules(ruleId, rules);
     const rule = contractRuleId ? rules.get(contractRuleId) : undefined;
+    const publicCandidates = contractRuleCandidatesForPublicId(ruleId)
+        .map((candidate) => rules.get(candidate))
+        .filter((candidate) => candidate !== undefined);
+    if (!rule && publicCandidates.length > 1) {
+        return {
+            id: ruleId,
+            kind: "product_expectation",
+            note: "This public product expectation is validated by multiple concrete contract rules. Use the contract_rule_id when attesting one sensor.",
+            contract_rules: publicCandidates.map((candidate) => {
+                const binding = productExpectationBindingForSensor(candidate.id);
+                return {
+                    contract_rule_id: candidate.id,
+                    ...(binding && { validator: { platform: binding.platform, sensorId: binding.sensorId } }),
+                    version: candidate.version,
+                    title: candidate.title,
+                    severity: candidate.severity,
+                    rationale: candidate.rationale,
+                    applies_when: candidate.applies_when,
+                    attestation: candidate.enforcement.attestation,
+                    summary: ruleSummary(candidate),
+                    rule_digest: digestRule(candidate),
+                };
+            }),
+        };
+    }
     if (!rule || !contractRuleId) {
         throw new Error(`Unknown compliance rule: ${ruleId}`);
     }
@@ -865,9 +901,12 @@ function resolveRuleIdForContract(ruleId, compliance, rules) {
     if (rules.has(ruleId) && compliance.rules.some((ref) => ref.rule_id === ruleId)) {
         return ruleId;
     }
-    const candidates = contractRuleCandidatesForPublicId(ruleId).filter((candidate) => rules.has(candidate) && compliance.rules.some((ref) => ref.rule_id === candidate));
+    const candidates = contractRuleCandidatesForContract(ruleId, compliance, rules);
     return candidates.length === 1 ? candidates[0] : null;
 }
+function contractRuleCandidatesForContract(ruleId, compliance, rules) {
+    return contractRuleCandidatesForPublicId(ruleId).filter((candidate) => rules.has(candidate) && compliance.rules.some((ref) => ref.rule_id === candidate));
+}
 function resolveRuleIdForInstalledRules(ruleId, rules) {
     if (rules.has(ruleId)) {
         return ruleId;
@@ -895,12 +934,16 @@ function ruleRefForFile(rule) {
 }
 // Benchmark-measured friction: agents looped on attest dialect for ~25 min/cell when docs and SDK
 // disagreed on exact invocation syntax (capability-matrix 2026-06, Row 5). Hand them the exact incantation.
-function attestHint(rule) {
+function attestHint(rule, compliance) {
     const minConfidence = rule.enforcement.attestation.host_agent_min_confidence ?? "high";
     const fields = rule.enforcement.attestation.evidence_required ?? [];
     const publicRuleId = publicProductRuleId(rule.id);
+    const candidatesInContract = compliance
+        ? contractRuleCandidatesForPublicId(publicRuleId).filter((candidate) => compliance.rules.some((ref) => ref.rule_id === candidate))
+        : contractRuleCandidatesForPublicId(publicRuleId);
+    const ruleArg = hasMultipleContractRuleCandidates(publicRuleId) && candidatesInContract.length > 1 ? rule.id : publicRuleId;
     return {
-        attest_command: `vise attest --rule ${publicRuleId} --confidence ${minConfidence} --signer host-agent --evidence-file sp-vise/evidence/${rule.id}.json --rationale "<why this rule is satisfied (or cannot apply) in this codebase>"`,
+        attest_command: `vise attest --rule ${ruleArg} --confidence ${minConfidence} --signer host-agent --evidence-file sp-vise/evidence/${rule.id}.json --rationale "<why this rule is satisfied (or cannot apply) in this codebase>"`,
         evidence_template: Object.fromEntries(fields.map((f) => [f.field, `<${f.description}>`])),
     };
 }
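
Note on the hunk above: `attestHint` now chooses which ID to print after `--rule`. The sketch below isolates that choice under simplifying assumptions: `SAMPLE_BINDINGS`, `candidatesFor`, and `attestRuleArg` are hypothetical names, the public ID is passed in directly instead of being derived through `publicProductRuleId`, and a compliance contract is always supplied.

```javascript
// Hypothetical stand-in for the package's binding table.
const SAMPLE_BINDINGS = [
  { expectationId: "comments.thread-read-write", sensorId: "ios.comments.query-has-limit" },
  { expectationId: "comments.thread-read-write", sensorId: "ios.comments.observer-cleanup" },
];

// Hypothetical helper: all concrete rule IDs bound to a public ID.
const candidatesFor = (publicId) =>
  SAMPLE_BINDINGS.filter((b) => b.expectationId === publicId).map((b) => b.sensorId);

// Mirrors the ruleArg decision: print the concrete contract rule ID only
// when the public ID maps to more than one rule present in this contract;
// otherwise the shorter public ID is unambiguous and is kept.
function attestRuleArg(ruleId, publicId, compliance) {
  const inContract = candidatesFor(publicId).filter((candidate) =>
    compliance.rules.some((ref) => ref.rule_id === candidate));
  return inContract.length > 1 ? ruleId : publicId;
}

const bothRules = { rules: [{ rule_id: "ios.comments.query-has-limit" }, { rule_id: "ios.comments.observer-cleanup" }] };
const oneRule = { rules: [{ rule_id: "ios.comments.query-has-limit" }] };

console.log(attestRuleArg("ios.comments.query-has-limit", "comments.thread-read-write", bothRules));
// ios.comments.query-has-limit
console.log(attestRuleArg("ios.comments.query-has-limit", "comments.thread-read-write", oneRule));
// comments.thread-read-write
```

The package code additionally checks `hasMultipleContractRuleCandidates(publicRuleId)`. With a contract supplied, the in-contract candidates are a subset of all candidates, so the sketch folds both conditions into the single length check.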
@@ -939,7 +982,7 @@ function contractDrift(compliance, rules) {
                 status: "stale",
                 reason,
                 current_rule: ruleSummary(rule),
-                ...(rule.enforcement.attestation.allowed && attestHint(rule)),
+                ...(rule.enforcement.attestation.allowed && attestHint(rule, compliance)),
             });
         }
     }
@@ -959,6 +1002,18 @@ function deterministicFindingIds(rule) {
         .filter((check) => check.check === "validator-finding-absent")
         .map((check) => check.finding_rule_id);
 }
+function uniqueStrings(values) {
+    const seen = new Set();
+    const result = [];
+    for (const value of values) {
+        if (seen.has(value)) {
+            continue;
+        }
+        seen.add(value);
+        result.push(value);
+    }
+    return result;
+}
 function buildAttestation(compliance, rule, signer, confidence, identity, rationale, evidence, sourceFingerprints = []) {
     const ref = compliance.rules.find((item) => item.rule_id === rule.id);
     if (!ref) {
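
Note on the `attestRule` and `resolveRuleIdForContract` hunks above: a public ID resolves only when exactly one bound rule is in the contract, and an ambiguous ID is rejected with the list of concrete rule IDs. The sketch below reproduces that control flow with made-up data. `TOY_BINDINGS`, `publicIdCandidates`, and `resolveOrThrow` are hypothetical names, and the final fallback message is a placeholder for the longer hint the package builds.

```javascript
// Hypothetical binding table and helper, standing in for productExpectations.js.
const TOY_BINDINGS = [
  { expectationId: "comments.thread-read-write", sensorId: "android.comments.query-has-limit" },
  { expectationId: "comments.thread-read-write", sensorId: "android.comments.observer-cleanup" },
];
const publicIdCandidates = (id) =>
  TOY_BINDINGS.filter((b) => b.expectationId === id).map((b) => b.sensorId);

// Candidates that are both installed and referenced by this contract.
function contractRuleCandidatesForContract(ruleId, compliance, rules) {
  return publicIdCandidates(ruleId).filter((candidate) =>
    rules.has(candidate) && compliance.rules.some((ref) => ref.rule_id === candidate));
}

// A concrete ID resolves to itself; a public ID resolves only when unique.
function resolveRuleIdForContract(ruleId, compliance, rules) {
  if (rules.has(ruleId) && compliance.rules.some((ref) => ref.rule_id === ruleId)) {
    return ruleId;
  }
  const candidates = contractRuleCandidatesForContract(ruleId, compliance, rules);
  return candidates.length === 1 ? candidates[0] : null;
}

function resolveOrThrow(ruleId, compliance, rules) {
  const resolved = resolveRuleIdForContract(ruleId, compliance, rules);
  if (resolved) return resolved;
  const ambiguous = contractRuleCandidatesForContract(ruleId, compliance, rules);
  if (ambiguous.length > 1) {
    // Prefer attestable rules in the hint, capped at 8 as in the diff.
    const attestable = ambiguous
      .filter((candidate) => rules.get(candidate)?.enforcement.attestation.allowed)
      .slice(0, 8);
    const ids = attestable.length > 0 ? attestable : ambiguous.slice(0, 8);
    throw new Error(`Rule is ambiguous in this compliance contract: ${ruleId}. Attest one contract rule at a time: ${ids.join(", ")}.`);
  }
  // Placeholder: the package builds a longer hint listing attestable rules.
  throw new Error(`Rule not found in this compliance contract: ${ruleId}`);
}

const allowed = { enforcement: { attestation: { allowed: true } } };
const rules = new Map([
  ["android.comments.query-has-limit", allowed],
  ["android.comments.observer-cleanup", allowed],
]);
const compliance = { rules: [...rules.keys()].map((rule_id) => ({ rule_id })) };

console.log(resolveOrThrow("android.comments.observer-cleanup", compliance, rules));
try {
  resolveOrThrow("comments.thread-read-write", compliance, rules);
} catch (error) {
  console.log(error.message);
}
```

With both Android rules in the contract, the concrete ID resolves to itself, while the shared public ID is rejected with both concrete IDs listed in binding order.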

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@amityco/social-plus-vise",
-  "version": "0.14.11",
+  "version": "0.14.12",
   "description": "Skill-guided deterministic CLI for social.plus SDK integration assistance.",
   "license": "SEE LICENSE IN LICENSE",
   "type": "module",