ai-collab-open-system 0.1.0
- package/.aict/START_HERE.md +127 -0
- package/.aict/WORKSPACE_MANIFEST.json +91 -0
- package/.aict/acceptance/EXAMPLE.synthetic.md +49 -0
- package/.aict/acceptance/FAILURE_MODES.md +40 -0
- package/.aict/acceptance/PROMPT.md +47 -0
- package/.aict/acceptance/README.md +44 -0
- package/.aict/acceptance/TEMPLATE.md +57 -0
- package/.aict/adapters/SHARED_CORE_CONTRACT.md +106 -0
- package/.aict/adapters/claude-code/ADAPTER.md +28 -0
- package/.aict/adapters/cline/ADAPTER.md +28 -0
- package/.aict/adapters/codex/ADAPTER.md +28 -0
- package/.aict/adapters/copilot/ADAPTER.md +28 -0
- package/.aict/adapters/cursor/ADAPTER.md +28 -0
- package/.aict/adapters/windsurf/ADAPTER.md +28 -0
- package/.aict/context/EXAMPLE.synthetic.md +53 -0
- package/.aict/context/FAILURE_MODES.md +40 -0
- package/.aict/context/PROMPT.md +47 -0
- package/.aict/context/README.md +44 -0
- package/.aict/context/TEMPLATE.md +63 -0
- package/.aict/cookbook/README.md +8 -0
- package/.aict/cookbook/bridge-to-a-second-family.md +103 -0
- package/.aict/cookbook/connect-a-tool.md +67 -0
- package/.aict/cookbook/review-a-half-product.md +79 -0
- package/.aict/cookbook/run-a-first-loop.md +81 -0
- package/.aict/examples/README.md +21 -0
- package/.aict/examples/ai-coding-long-task/CASE.md +161 -0
- package/.aict/examples/ai-coding-long-task/artifacts/acceptance-card.md +36 -0
- package/.aict/examples/ai-coding-long-task/artifacts/context-package.md +30 -0
- package/.aict/examples/ai-coding-long-task/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/ai-coding-long-task/artifacts/first-ai-output.md +109 -0
- package/.aict/examples/ai-coding-long-task/artifacts/guard-review.md +40 -0
- package/.aict/examples/ai-coding-long-task/artifacts/handoff-note.md +28 -0
- package/.aict/examples/ai-coding-long-task/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/ai-coding-long-task/artifacts/revised-output.md +62 -0
- package/.aict/examples/content-production-harvest/CASE.md +87 -0
- package/.aict/examples/content-production-harvest/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/context-package.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/content-production-harvest/artifacts/guard-review.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/handoff-note.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/multi-tool-collaboration/CASE.md +87 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/context-package.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/guard-review.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/handoff-note.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/CASE.md +87 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/context-package.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/guard-review.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/handoff-note.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/CASE.md +87 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/context-package.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/guard-review.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/handoff-note.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/harvest-seed.md +28 -0
- package/.aict/guard/EXAMPLE.synthetic.md +51 -0
- package/.aict/guard/FAILURE_MODES.md +40 -0
- package/.aict/guard/PROMPT.md +47 -0
- package/.aict/guard/README.md +44 -0
- package/.aict/guard/TEMPLATE.md +60 -0
- package/.aict/handoff/EXAMPLE.synthetic.md +51 -0
- package/.aict/handoff/FAILURE_MODES.md +40 -0
- package/.aict/handoff/PROMPT.md +47 -0
- package/.aict/handoff/README.md +44 -0
- package/.aict/handoff/TEMPLATE.md +60 -0
- package/.aict/harvest/EXAMPLE.synthetic.md +51 -0
- package/.aict/harvest/FAILURE_MODES.md +40 -0
- package/.aict/harvest/PROMPT.md +47 -0
- package/.aict/harvest/README.md +44 -0
- package/.aict/harvest/TEMPLATE.md +60 -0
- package/.aict/mechanisms/README.md +34 -0
- package/.aict/mechanisms/anti-drift-partner/EXAMPLE.synthetic.md +46 -0
- package/.aict/mechanisms/anti-drift-partner/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/anti-drift-partner/PROMPT.md +75 -0
- package/.aict/mechanisms/anti-drift-partner/README.md +82 -0
- package/.aict/mechanisms/anti-drift-partner/TEMPLATE.md +74 -0
- package/.aict/mechanisms/blind-spot-scan/EXAMPLE.synthetic.md +39 -0
- package/.aict/mechanisms/blind-spot-scan/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/blind-spot-scan/PROMPT.md +72 -0
- package/.aict/mechanisms/blind-spot-scan/README.md +79 -0
- package/.aict/mechanisms/blind-spot-scan/TEMPLATE.md +70 -0
- package/.aict/mechanisms/collaboration-coach/EXAMPLE.synthetic.md +40 -0
- package/.aict/mechanisms/collaboration-coach/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/collaboration-coach/PROMPT.md +72 -0
- package/.aict/mechanisms/collaboration-coach/README.md +79 -0
- package/.aict/mechanisms/collaboration-coach/TEMPLATE.md +61 -0
- package/.aict/mechanisms/do-not-handle-yet/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/do-not-handle-yet/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/do-not-handle-yet/PROMPT.md +41 -0
- package/.aict/mechanisms/do-not-handle-yet/README.md +30 -0
- package/.aict/mechanisms/do-not-handle-yet/TEMPLATE.md +38 -0
- package/.aict/mechanisms/dual-guard/EXAMPLE.synthetic.md +54 -0
- package/.aict/mechanisms/dual-guard/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/dual-guard/PROMPT.md +76 -0
- package/.aict/mechanisms/dual-guard/README.md +81 -0
- package/.aict/mechanisms/dual-guard/TEMPLATE.md +73 -0
- package/.aict/mechanisms/feedback-absorption-ledger/EXAMPLE.synthetic.md +49 -0
- package/.aict/mechanisms/feedback-absorption-ledger/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/feedback-absorption-ledger/PROMPT.md +74 -0
- package/.aict/mechanisms/feedback-absorption-ledger/README.md +81 -0
- package/.aict/mechanisms/feedback-absorption-ledger/TEMPLATE.md +69 -0
- package/.aict/mechanisms/half-product-review/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/half-product-review/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/half-product-review/PROMPT.md +41 -0
- package/.aict/mechanisms/half-product-review/README.md +30 -0
- package/.aict/mechanisms/half-product-review/TEMPLATE.md +38 -0
- package/.aict/mechanisms/handoff-abc/EXAMPLE.synthetic.md +47 -0
- package/.aict/mechanisms/handoff-abc/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/handoff-abc/PROMPT.md +75 -0
- package/.aict/mechanisms/handoff-abc/README.md +82 -0
- package/.aict/mechanisms/handoff-abc/TEMPLATE.md +60 -0
- package/.aict/mechanisms/harvest-and-erc/EXAMPLE.synthetic.md +43 -0
- package/.aict/mechanisms/harvest-and-erc/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/harvest-and-erc/PROMPT.md +74 -0
- package/.aict/mechanisms/harvest-and-erc/README.md +81 -0
- package/.aict/mechanisms/harvest-and-erc/TEMPLATE.md +60 -0
- package/.aict/mechanisms/honest-calibration/EXAMPLE.synthetic.md +43 -0
- package/.aict/mechanisms/honest-calibration/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/honest-calibration/PROMPT.md +74 -0
- package/.aict/mechanisms/honest-calibration/README.md +81 -0
- package/.aict/mechanisms/honest-calibration/TEMPLATE.md +66 -0
- package/.aict/mechanisms/one-click-dispatch/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/one-click-dispatch/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/one-click-dispatch/PROMPT.md +41 -0
- package/.aict/mechanisms/one-click-dispatch/README.md +30 -0
- package/.aict/mechanisms/one-click-dispatch/TEMPLATE.md +38 -0
- package/.aict/mechanisms/plain-language-first-screen/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/plain-language-first-screen/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/plain-language-first-screen/PROMPT.md +41 -0
- package/.aict/mechanisms/plain-language-first-screen/README.md +30 -0
- package/.aict/mechanisms/plain-language-first-screen/TEMPLATE.md +38 -0
- package/.aict/mechanisms/root-cause-brake/EXAMPLE.synthetic.md +55 -0
- package/.aict/mechanisms/root-cause-brake/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/root-cause-brake/PROMPT.md +73 -0
- package/.aict/mechanisms/root-cause-brake/README.md +79 -0
- package/.aict/mechanisms/root-cause-brake/TEMPLATE.md +74 -0
- package/.aict/mechanisms/scout-review-controller/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/scout-review-controller/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/scout-review-controller/PROMPT.md +41 -0
- package/.aict/mechanisms/scout-review-controller/README.md +30 -0
- package/.aict/mechanisms/scout-review-controller/TEMPLATE.md +38 -0
- package/.aict/mechanisms/single-tool-guard/EXAMPLE.synthetic.md +54 -0
- package/.aict/mechanisms/single-tool-guard/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/single-tool-guard/PROMPT.md +76 -0
- package/.aict/mechanisms/single-tool-guard/README.md +83 -0
- package/.aict/mechanisms/single-tool-guard/TEMPLATE.md +75 -0
- package/.aict/mechanisms/task-splitting/EXAMPLE.synthetic.md +53 -0
- package/.aict/mechanisms/task-splitting/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/task-splitting/PROMPT.md +72 -0
- package/.aict/mechanisms/task-splitting/README.md +79 -0
- package/.aict/mechanisms/task-splitting/TEMPLATE.md +76 -0
- package/.aict/modes/README.md +11 -0
- package/.aict/modes/execute.md +31 -0
- package/.aict/modes/handoff.md +29 -0
- package/.aict/modes/harvest.md +30 -0
- package/.aict/modes/review.md +28 -0
- package/.aict/modes/shape.md +34 -0
- package/.aict/privacy/COMMERCIAL_BOUNDARY.md +34 -0
- package/.aict/privacy/PRIVACY.md +36 -0
- package/.aict/privacy/REDACTION_CHECKLIST.md +12 -0
- package/.aict/profile/CANDIDATES.md +44 -0
- package/.aict/profile/EXAMPLE.synthetic.md +49 -0
- package/.aict/profile/FAILURE_MODES.md +40 -0
- package/.aict/profile/PROMPT.md +47 -0
- package/.aict/profile/README.md +44 -0
- package/.aict/profile/TEMPLATE.md +57 -0
- package/.aict/prompts/acceptance-definition.md +109 -0
- package/.aict/prompts/guard-review.md +116 -0
- package/.aict/prompts/handoff-generation.md +110 -0
- package/.aict/prompts/harvest-extraction.md +110 -0
- package/.aict/prompts/mode-switching.md +66 -0
- package/.aict/prompts/profile-creation.md +66 -0
- package/.aict/prompts/profile-refinement.md +66 -0
- package/.aict/prompts/project-context-packaging.md +113 -0
- package/.aict/prompts/red-team-challenge.md +106 -0
- package/.aict/prompts/rule-update-proposal.md +114 -0
- package/.aict/prompts/workflow-reset.md +109 -0
- package/.aict/roles/README.md +18 -0
- package/.aict/roles/executor.md +34 -0
- package/.aict/roles/harvester.md +33 -0
- package/.aict/roles/owner-controller.md +38 -0
- package/.aict/roles/scout.md +33 -0
- package/.aict/roles/supervisor.md +34 -0
- package/.aict/roles/system-guardian.md +34 -0
- package/.aict/skills/acceptance/SKILL.md +43 -0
- package/.aict/skills/context/SKILL.md +44 -0
- package/.aict/skills/evidence-pack/SKILL.md +42 -0
- package/.aict/skills/guard/SKILL.md +46 -0
- package/.aict/skills/handoff/SKILL.md +44 -0
- package/.aict/skills/harvest/SKILL.md +44 -0
- package/.aict/skills/mode-switch/SKILL.md +42 -0
- package/.aict/skills/profile/SKILL.md +42 -0
- package/.aict/skills/red-team/SKILL.md +42 -0
- package/.aict/skills/single-tool-guard/SKILL.md +42 -0
- package/.aict/state/CURRENT_STATE.md +13 -0
- package/.aict/state/DECISIONS.md +7 -0
- package/.aict/state/TASK_LOG.md +7 -0
- package/.aict/state/evidence.jsonl +2 -0
- package/.aict/state/learning-ledger.jsonl +1 -0
- package/.aict/state/receipts.jsonl +1 -0
- package/.aict/state/runs.jsonl +1 -0
- package/.aict/state/tasks.jsonl +1 -0
- package/.aict/walkthroughs/10-minute-your-task.md +107 -0
- package/.aict/walkthroughs/10-minute.md +43 -0
- package/.aict/walkthroughs/30-minute.md +22 -0
- package/.aict/walkthroughs/60-minute.md +27 -0
- package/.aict/walkthroughs/synthetic-loop-transcript.md +43 -0
- package/CHANGELOG.md +23 -0
- package/CODE_OF_CONDUCT.md +20 -0
- package/CONTRIBUTING.md +30 -0
- package/KNOWN_LIMITATIONS.md +54 -0
- package/LICENSE +199 -0
- package/PRODUCT_CONTRACT.md +446 -0
- package/README.md +245 -0
- package/RELEASE_CHECKLIST.md +78 -0
- package/SECURITY.md +56 -0
- package/START_HERE.md +89 -0
- package/bin/ai-collab.js +2 -0
- package/docs/DOGFOOD.md +85 -0
- package/docs/FEEDBACK.md +61 -0
- package/docs/FIRST_EXPERIENCE_SPEC.md +32 -0
- package/docs/FREE_VS_PAID.md +53 -0
- package/docs/PUBLIC_BOUNDARY.md +36 -0
- package/docs/PUBLIC_MAPPING.md +178 -0
- package/docs/RELEASE_PRIORITY.md +23 -0
- package/docs/WHY_THIS_EXISTS.md +36 -0
- package/docs/open-system/00-start-here.md +60 -0
- package/docs/open-system/01-ai-collaboration-os.md +33 -0
- package/docs/open-system/02-six-layer-architecture.md +45 -0
- package/docs/open-system/03-role-system.md +33 -0
- package/docs/open-system/04-core-mechanisms.md +34 -0
- package/docs/open-system/05-failure-patterns.md +31 -0
- package/docs/open-system/06-how-to-adapt-to-your-workflow.md +31 -0
- package/package.json +69 -0
- package/privacy-manifest.json +78 -0
- package/privacy-scan.local.json.example +18 -0
- package/scripts/lib/forbidden-in-pack.js +55 -0
- package/scripts/pack-check.js +154 -0
- package/scripts/privacy-scan.js +487 -0
- package/scripts/validate-contract.js +160 -0
- package/src/adapters.js +590 -0
- package/src/bootstrap.js +1184 -0
- package/src/catalog.js +2723 -0
- package/src/cli.js +2899 -0
- package/src/dialogue.js +470 -0
- package/src/i18n.js +1034 -0
- package/src/ledger.js +2011 -0
- package/src/render.js +1381 -0
- package/src/sendmodel.js +452 -0
- package/src/validate.js +1307 -0
- package/src/workspace.js +1679 -0
- package/tests/contract.test.js +8514 -0
@@ -0,0 +1,81 @@
# Honest Calibration

Part of the AI Collaboration Open System. This is a local-first, public-safe mechanism package you can copy into Claude Code, Codex, Cursor, Cline, Windsurf, or Copilot.

## Purpose

Offset the model's built-in eagerness to please by pinning one short user-side prefix to the front of every ask for a rating, an evaluation, or a recommendation: be candid, do not inflate, do not over-hedge. The point is not to hope the AI will be honest; it is to know that, left uncalibrated, a model slides back toward the answer that makes you feel good, so you re-aim it on each ask. The prefix pulls the baseline from make-you-happy back to tell-the-truth, and it matters most exactly where the temptation to flatter is highest: when you are asking the AI to judge your own work, your own ability, or your own output.

## When to use

Use whenever you ask the AI to grade, score, place, rank, or recommend, and most of all when the thing being judged is yours: your draft, your plan, your skill level, the quality tier your output would land in, whether something is ready to ship or publish. If a falsely high 'this is great' would cost you (you publish too early, you skip a fix, you misjudge where you really stand), put the calibration prefix in front of the ask.

## When not to use

Do not bolt it onto a plain fact lookup or a direct instruction to carry out: there is no evaluation to calibrate, so the prefix is just noise. 'Be candid, do not inflate' in front of 'what is the capital of X' or 'rename this file' adds nothing, and a calibration ritual stapled to every message trains you to stop noticing it on the one ask where it actually changes the answer. It is also not a license to flip to harsh: the instruction is to stop both inflating AND over-hedging, not to make the AI negative on command.

## Input shape

The specific thing to judge, stated plainly (the draft, the plan, the output, the ability). What the judgment is for (publish / ship / keep iterating / a self-honest gut check), so the AI calibrates to a real bar instead of a vague vibe. The reference frame you want it measured against (a quality tier, a percentile, a named standard, a comparison set) so 'good' has an anchor. The candor prefix itself, placed at the FRONT of the ask, not buried after it. And, when the thing under judgment is your own, an explicit nudge to step outside your point of view and not grade to please you.

## Input materials

- The exact artifact, ability, or output to evaluate, named concretely so the AI grades a real thing, not a generality.
- The purpose of the judgment (decide whether to publish, whether to ship, whether to keep working, or just to know honestly where you stand), so the bar is the real-world consequence, not a feeling.
- The reference frame: the quality tier, percentile band, named standard, or comparison set the answer should be measured against, so 'good' or 'B+' is anchored rather than floating.
- The candor prefix, placed at the FRONT of the request (be candid, do not inflate, do not over-hedge); position matters, because a prefix sets the stance before the model starts composing the pleasing version.
- When you are the subject (your work, your skill, your output), an explicit 'step outside my perspective and do not grade to make me feel good' so the highest-flattery case gets the strongest calibration.
- Optional: permission to deliver the verdict bluntly and lead with the weakest part, so the honest signal is not softened into mush on its way out.

## Process

1. Put the candor prefix first, before the actual ask. Lead the request with 'be candid, do not inflate, do not over-hedge' (or your own words for it) so the stance is set before the model reaches for the agreeable framing. A prefix after the question is half as effective as a prefix before it, because by then the answer is already forming around what would please you.
2. Anchor the judgment to a real bar, not a vibe. Name the reference frame (a tier, a percentile, a named standard, a comparison set) so the AI cannot retreat to a safely flattering 'it's pretty good'. 'Be candid' with nothing to be candid against just produces a more confident vague compliment.
3. Apply the strongest calibration when the subject is you. Self-evaluation is the peak-flattery case: the model most wants to please you exactly when you are asking about your own work or ability. Add the explicit 'step outside my point of view and do not grade to make me feel good' here, and treat a suspiciously warm verdict on your own output as a signal to re-ask, not as good news.
4. Read the answer for the tells of an uncalibrated slide-back: it opens with praise and buries the real critique; every weakness is immediately cushioned ('but this is genuinely strong'); the score drifts upward with no new evidence; it agrees with your own stated hope a little too readily. Any of these means the baseline slid back toward make-you-happy and the prefix needs re-asserting.
5. Re-aim when it slides. The model does not hold the candid stance forever; over a long thread it relaxes back into the pleasing default. When you catch the tells, restate the prefix and ask for the verdict again; do not accept the warmed-over version just because re-asking feels awkward.
6. Separate the candid verdict from encouragement, and keep them in that order. A useful honest answer can still end with 'and here is the fastest path up' — but the true placement comes first and unhedged, and the encouragement comes after, clearly marked as the next step rather than as a softener that quietly raises the grade.
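
The ordering rules in the process above can be sketched as a small prompt builder. This is only an illustration under stated assumptions: `buildCalibratedAsk`, `CANDOR_PREFIX`, `SELF_NUDGE`, and the field names are hypothetical and are not taken from the package's source. The sketch shows one way to guarantee the prefix lands first, the reference frame is never omitted, and the self-evaluation nudge is added only when the subject is yours.

```javascript
// Hypothetical helper: these names are illustrative, not part of the package.
const CANDOR_PREFIX = "Be candid, do not inflate, do not over-hedge.";
const SELF_NUDGE =
  "Step outside my perspective and do not grade this to make me feel good.";

function buildCalibratedAsk({ subject, purpose, referenceFrame, isMine = false }) {
  // Refuse to build an ask with no anchor: 'be candid' needs a bar to be candid against.
  if (!subject || !purpose || !referenceFrame) {
    throw new Error("subject, purpose, and referenceFrame are all required");
  }
  const lines = [CANDOR_PREFIX]; // step 1: the prefix always comes first
  if (isMine) {
    lines.push(SELF_NUDGE); // step 3: strongest calibration for self-evaluation
  }
  lines.push(
    `Judge this: ${subject}`,
    `The judgment is for: ${purpose}`,
    `Measure it against: ${referenceFrame}`, // step 2: a real bar, not a vibe
    "Give the verdict first, then the single biggest reason it is not higher.",
    "Put any encouragement last, marked as a next step." // step 6: verdict before encouragement
  );
  return lines.join("\n");
}

const ask = buildCalibratedAsk({
  subject: "my draft blog post (synthetic example)",
  purpose: "decide whether to publish",
  referenceFrame: "posts that a named trade publication would accept",
  isMine: true,
});
console.log(ask);
```

Because the prefix is a constant pushed before anything else, re-aiming after a slide-back (step 5) amounts to calling the builder again rather than appending "be candid" to the end of a follow-up.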

## Output shape

- A candid verdict stated first and plainly: the tier, score, percentile, or yes/no, without an opening cushion of praise.
- The anchor it was measured against (the named tier, percentile, standard, or comparison set), so the verdict is checkable rather than a floating adjective.
- The weakest part led with, not buried: the single biggest reason it is not higher, stated before any reassurance.
- No upward drift: the score does not creep higher than the evidence supports, and warmth is not substituted for a number.
- Encouragement, if any, clearly separated and placed last: the fastest path up, marked as a next step, never folded back into the grade.
- On a self-evaluation, an explicit note that the AI graded from outside your perspective rather than to please you.

## Pass bar (what counts as done / safe to trust)

- The candor prefix sat at the FRONT of the ask, setting the stance before the answer formed.
- The verdict is anchored to a named bar (tier / percentile / standard / comparison set), not a floating 'pretty good'.
- The weakest point is stated first and unhedged, rather than buried under an opening of praise.
- The score reflects the evidence and did not drift upward, and warmth was not used in place of a real number.
- On a self-evaluation, the AI grades from outside your perspective and says so, instead of grading to please you.
- Encouragement, if present, is separated out and placed last as a next step, never blended back into the grade.

## Reject bar (what sends it back)

- The answer opens with praise and the real critique is buried below it (the classic flatter-first slide-back).
- The verdict is a warm adjective with no anchor: 'this is strong' against nothing checkable.
- Every weakness is immediately cushioned so no honest signal survives to the reader.
- The grade crept upward across the thread with no new evidence, tracking your stated hope rather than the work.
- The prefix was tacked on AFTER the question, so the pleasing version had already formed.
- The candor was read as a license to be harsh, producing a put-down instead of an inflation-free, hedge-free truth.
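
One reject-bar item, the grade creeping upward with no new evidence, is mechanical enough to track. The sketch below assumes you note each verdict in a thread as a numeric score plus a flag for whether you supplied anything new (a revision, extra material) before asking again. `findUnsupportedDrift` and the record shape are hypothetical, not part of the package, and the other reject-bar items still need a human read.

```javascript
// Hypothetical record shape: { score: number, newEvidence: boolean }.
// newEvidence means "I gave the AI something new before this verdict".
function findUnsupportedDrift(verdicts) {
  const flagged = [];
  for (let i = 1; i < verdicts.length; i++) {
    const rose = verdicts[i].score > verdicts[i - 1].score;
    // A higher score is only suspicious when nothing new justified it.
    if (rose && !verdicts[i].newEvidence) {
      flagged.push(i);
    }
  }
  return flagged; // indexes of verdicts that drifted up without support
}

const thread = [
  { score: 6, newEvidence: false }, // first verdict
  { score: 7, newEvidence: false }, // rose with nothing new: flagged
  { score: 8, newEvidence: true },  // rose after a real revision: fine
  { score: 8, newEvidence: false }, // unchanged: fine
];
console.log(findUnsupportedDrift(thread)); // [ 1 ]
```

Any flagged index is a cue to restate the prefix and ask for the verdict again, as the process describes.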

## Common misuse

- Stapling the prefix to plain fact lookups and direct instructions, so it becomes background noise you stop noticing on the one ask that needs it.
- Putting 'be candid' after the question instead of in front of it, so the model has already composed the agreeable answer before the stance lands.
- Accepting a suspiciously warm verdict on your own work because re-asking feels awkward; the peak-flattery case is exactly where you must re-aim.
- Treating the prefix as a one-time setting rather than re-asserting it when the thread drifts back toward pleasing you.
- Flipping the instruction into 'be harsh', so you trade a flattering distortion for a punitive one instead of getting an undistorted read.
- Letting the encouragement at the end quietly raise the grade ('it's a B, but honestly almost an A') so the candid verdict is undone in its own last line.

## Package files

- `README.md` explains the mechanism.
- `PROMPT.md` gives the copy-paste prompt.
- `TEMPLATE.md` gives the blank operating card.
- `EXAMPLE.synthetic.md` shows a public-safe run.
- `FAILURE_MODES.md` names common ways this mechanism fails.
@@ -0,0 +1,66 @@
# Honest Calibration Template

AI Collaboration Open System mechanism card. Fill this in a local-first workflow with public-safe or redacted material.

## Purpose

Offset the model's built-in eagerness to please by pinning one short user-side prefix to the front of every ask for a rating, an evaluation, or a recommendation: be candid, do not inflate, do not over-hedge. The point is not to hope the AI will be honest; it is to know that, left uncalibrated, a model slides back toward the answer that makes you feel good, so you re-aim it on each ask. The prefix pulls the baseline from make-you-happy back to tell-the-truth, and it matters most exactly where the temptation to flatter is highest: when you are asking the AI to judge your own work, your own ability, or your own output.

## Template

### Candor prefix (paste at the FRONT of the ask): be candid, do not inflate and do not over-hedge; step outside my perspective and do not grade this to make me feel good.


### What to judge (the exact draft / plan / output / ability):


### What the judgment is for (publish / ship / keep iterating / honest gut check):


### Reference frame to measure against (tier / percentile / named standard / comparison set):


### Is the subject mine? (if yes, apply the strongest calibration and grade from outside my view):


### Candid verdict first (tier / score / percentile / yes-no, no opening praise):


### Single biggest reason it is not higher (led with, not cushioned):


### Tells of a slide-back to watch for (praise-first / every flaw cushioned / score drifts up / agrees with my hope too fast):


### Fastest path up (optional, placed LAST, marked as a next step not a grade-softener):



## Pass bar (tick before you trust the result)

- The candor prefix sat at the FRONT of the ask, setting the stance before the answer formed.
- The verdict is anchored to a named bar (tier / percentile / standard / comparison set), not a floating 'pretty good'.
- The weakest point is stated first and unhedged, rather than buried under an opening of praise.
- The score reflects the evidence and did not drift upward, and warmth was not used in place of a real number.
- On a self-evaluation, the AI grades from outside your perspective and says so, instead of grading to please you.
- Encouragement, if present, is separated out and placed last as a next step, never blended back into the grade.

## Reject bar (send it back if any of these is true)

- The answer opens with praise and the real critique is buried below it (the classic flatter-first slide-back).
- The verdict is a warm adjective with no anchor: 'this is strong' against nothing checkable.
- Every weakness is immediately cushioned so no honest signal survives to the reader.
- The grade crept upward across the thread with no new evidence, tracking your stated hope rather than the work.
- The prefix was tacked on AFTER the question, so the pleasing version had already formed.
- The candor was read as a license to be harsh, producing a put-down instead of an inflation-free, hedge-free truth.

## Worked example

See `EXAMPLE.synthetic.md` for this same card filled out end to end on a public-safe synthetic task.

## Completion check

- The mechanism has a named trigger.
- The next action is concrete.
- Private details are redacted or rewritten as synthetic examples.
- The result can be handed to another AI tool without extra chat history.
@@ -0,0 +1,15 @@
# One-Click Dispatch Synthetic Example

This is a public-safe synthetic example for the AI Collaboration Open System. It is local-first and contains no private account, customer, route, hook, or conversation material.

## Synthetic example

A controller sends an implementation packet that says: edit only the synthetic CLI files, run npm test, do not publish, and return changed files plus command output.

## How the mechanism changes the outcome

Without this mechanism, a single assistant can produce a smooth answer while hiding uncertainty. With this mechanism, the workflow records trigger, evidence, decision, residual risk, and next action.

## Reuse note

Copy the shape, not the synthetic facts. Adapt the template to your own redacted task.
@@ -0,0 +1,16 @@
# One-Click Dispatch Failure Modes

AI Collaboration Open System failure checklist. Use it in a local-first workflow before trusting a mechanism run, and rewrite any public example into public-safe language.

## Failure modes

- Dispatch packet contains a full transcript instead of compressed state.
- Authority is unclear, so the worker edits during a review-only task.
- The return format omits unverified areas.
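
The third failure mode can be caught before a worker's report is trusted. The sketch below assumes the report arrives as a plain object. The field names are hypothetical renderings of the four things this mechanism asks a worker to return (changed artifacts, verification evidence, blockers, unverified claims) and do not come from the package's source.

```javascript
// Hypothetical field names, mirroring the mechanism's required return items.
const REQUIRED_RETURN_FIELDS = [
  "changedArtifacts",
  "verificationEvidence",
  "blockers",
  "unverifiedClaims",
];

// Returns the names of required fields that are absent or not lists.
// An empty list passes: "nothing unverified" stated explicitly is different
// from the field being silently left out.
function missingReturnFields(report) {
  return REQUIRED_RETURN_FIELDS.filter((field) => !Array.isArray(report[field]));
}

const report = {
  changedArtifacts: ["synthetic-cli/index.js"],
  verificationEvidence: ["npm test: all passing (synthetic output)"],
  blockers: [],
  // unverifiedClaims omitted: this is the failure mode being checked
};
console.log(missingReturnFields(report)); // [ 'unverifiedClaims' ]
```

Treating an empty list as acceptable but a missing field as a rejection forces the worker to state explicitly that nothing was left unverified.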

## Guard questions

1. Did this mechanism change the decision, or just add ceremony?
2. Is any private material copied instead of summarized or synthesized?
3. Are blockers, residual risks, and next actions separated?
4. Could a new session continue from this file alone?
@@ -0,0 +1,41 @@
# One-Click Dispatch Prompt

This prompt belongs to the AI Collaboration Open System. Use it in a local-first workflow with public-safe or redacted material.

## Purpose

Turn a messy task into a compact work packet another AI tool can execute without inheriting the whole chat.

## Copy-paste prompt

```text
Use the One-Click Dispatch mechanism from my local AI Collaboration Open System workspace.

Purpose:
Turn a messy task into a compact work packet another AI tool can execute without inheriting the whole chat.

Trigger:
Use when handing a task from a controller session to Codex, Claude Code, Cursor, Cline, Windsurf, or Copilot.

Input:
[paste redacted task material, context package, and acceptance card here]

Process:
1. Package only the state required to act.
2. State authority: read-only, write allowed, review-only, or handoff-only.
3. Attach acceptance and stop conditions.
4. Require the worker to return changed artifacts, verification evidence, blockers, and unverified claims.

Return:
- Decision-changing findings only
- Evidence used
- Required fixes
- Residual risk
- Next action

Rules:
- Work from provided material only.
- Keep private material local.
- Use public-safe synthetic wording for examples.
- Label assumptions and unverified claims.
```
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# One-Click Dispatch
|
|
2
|
+
|
|
3
|
+
Part of the AI Collaboration Open System. This is a local-first, public-safe mechanism package you can copy into Claude Code, Codex, Cursor, Cline, Windsurf, or Copilot.
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Turn a messy task into a compact work packet another AI tool can execute without inheriting the whole chat.
|
|
8
|
+
|
|
9
|
+
## When to use
|
|
10
|
+
|
|
11
|
+
Use when handing a task from a controller session to Codex, Claude Code, Cursor, Cline, Windsurf, or Copilot.
|
|
12
|
+
|
|
13
|
+
## Input shape
|
|
14
|
+
|
|
15
|
+
Goal, files or artifacts, acceptance card, allowed actions, forbidden actions, and expected return shape.
|
|
16
|
+
|
|
17
|
+
## Process
|
|
18
|
+
|
|
19
|
+
1. Package only the state required to act.
|
|
20
|
+
2. State authority: read-only, write allowed, review-only, or handoff-only.
|
|
21
|
+
3. Attach acceptance and stop conditions.
|
|
22
|
+
4. Require the worker to return changed artifacts, verification evidence, blockers, and unverified claims.
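The four steps above can be sketched as a small data structure. This is a minimal Python sketch under stated assumptions, not something shipped in the package: `DispatchPacket`, `Authority`, `REQUIRED_RETURN`, `missing_parts`, and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Authority(Enum):
    """The four authority levels named in step 2."""
    READ_ONLY = "read-only"
    WRITE_ALLOWED = "write allowed"
    REVIEW_ONLY = "review-only"
    HANDOFF_ONLY = "handoff-only"


# Step 4: sections the worker must send back (hypothetical key names).
REQUIRED_RETURN = (
    "changed_artifacts",
    "verification_evidence",
    "blockers",
    "unverified_claims",
)


@dataclass
class DispatchPacket:
    """Hypothetical shape for a work packet; field names are assumptions."""
    goal: str
    state: list[str]             # step 1: only the state required to act
    authority: Authority         # step 2: exactly one authority level
    acceptance: list[str]        # step 3: acceptance criteria
    stop_conditions: list[str]   # step 3: when the worker must stop
    return_sections: tuple[str, ...] = REQUIRED_RETURN  # step 4


def missing_parts(packet: DispatchPacket) -> list[str]:
    """Return the names of packet parts that are empty or incomplete."""
    problems = []
    if not packet.goal.strip():
        problems.append("goal")
    if not packet.state:
        problems.append("state")
    if not packet.acceptance:
        problems.append("acceptance")
    if not packet.stop_conditions:
        problems.append("stop_conditions")
    for section in REQUIRED_RETURN:
        if section not in packet.return_sections:
            problems.append(f"return:{section}")
    return problems
```

Making `authority` a required enum field means a packet cannot be built without stating it, which is one way to guard against the "authority is unclear" failure mode. An empty list from `missing_parts` only says the packet is structurally complete, not that its content is good.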
|
|
23
|
+
|
|
24
|
+
## Package files
|
|
25
|
+
|
|
26
|
+
- `README.md` explains the mechanism.
|
|
27
|
+
- `PROMPT.md` gives the copy-paste prompt.
|
|
28
|
+
- `TEMPLATE.md` gives the blank operating card.
|
|
29
|
+
- `EXAMPLE.synthetic.md` shows a public-safe run.
|
|
30
|
+
- `FAILURE_MODES.md` names common ways this mechanism fails.
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
# One-Click Dispatch Template
|
|
2
|
+
|
|
3
|
+
AI Collaboration Open System mechanism card. Fill this in a local-first workflow with public-safe or redacted material.
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Turn a messy task into a compact work packet another AI tool can execute without inheriting the whole chat.
|
|
8
|
+
|
|
9
|
+
## Template
|
|
10
|
+
|
|
11
|
+
### Task:
|
|
12
|
+
|
|
13
|
+
|
|
14
|
+
### Authority:
|
|
15
|
+
|
|
16
|
+
|
|
17
|
+
### Required context:
|
|
18
|
+
|
|
19
|
+
|
|
20
|
+
### Acceptance:
|
|
21
|
+
|
|
22
|
+
|
|
23
|
+
### Stop conditions:
|
|
24
|
+
|
|
25
|
+
|
|
26
|
+
### Return format:
|
|
27
|
+
|
|
28
|
+
|
|
29
|
+
### Privacy boundary:
|
|
30
|
+
|
|
31
|
+
|
|
32
|
+
|
|
33
|
+
## Completion check
|
|
34
|
+
|
|
35
|
+
- The mechanism has a named trigger.
|
|
36
|
+
- The next action is concrete.
|
|
37
|
+
- Private details are redacted or rewritten as synthetic examples.
|
|
38
|
+
- The result can be handed to another AI tool without extra chat history.
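A card with blank sections is unlikely to pass the last item above, so a quick pre-check for unfilled headings can run before the completion check. The following is a minimal Python sketch, assuming the card keeps the `###` heading layout of this template; `empty_sections` is a hypothetical helper, not part of the package, and it does not replace the human judgment the completion check asks for.

```python
def empty_sections(card: str) -> list[str]:
    """Return the `###` headings in a card that have no text under them."""
    empty = []
    current = None      # name of the `###` section being scanned, if any
    has_body = False    # whether a non-blank line followed that heading
    for line in card.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading closes the previous `###` section.
            if current is not None and not has_body:
                empty.append(current)
            if stripped.startswith("### "):
                current = stripped[4:].rstrip(":").strip()
            else:
                current = None
            has_body = False
        elif stripped and current is not None:
            has_body = True
    # Close the final section if the card ends inside one.
    if current is not None and not has_body:
        empty.append(current)
    return empty
```

For example, a card whose `### Authority:` and `### Stop conditions:` headings have nothing under them would be reported as `["Authority", "Stop conditions"]`.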
|
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
# Plain-Language First Screen Synthetic Example
|
|
2
|
+
|
|
3
|
+
This is a public-safe synthetic example for the AI Collaboration Open System. It is local-first and contains no private account, customer, route, hook, or conversation material.
|
|
4
|
+
|
|
5
|
+
## Synthetic example
|
|
6
|
+
|
|
7
|
+
START_HERE opens with the demo path and raw-chat comparison, then links to architecture docs after the user has something runnable.
|
|
8
|
+
|
|
9
|
+
## How the mechanism changes the outcome
|
|
10
|
+
|
|
11
|
+
Without this mechanism, a single assistant can produce a smooth answer while hiding uncertainty. With this mechanism, the workflow records trigger, evidence, decision, residual risk, and next action.
|
|
12
|
+
|
|
13
|
+
## Reuse note
|
|
14
|
+
|
|
15
|
+
Copy the shape, not the synthetic facts. Adapt the template to your own redacted task.
|
|
@@ -0,0 +1,16 @@
|
|
|
1
|
+
# Plain-Language First Screen Failure Modes
|
|
2
|
+
|
|
3
|
+
AI Collaboration Open System failure checklist. Use it in a local-first workflow before trusting a mechanism run, and rewrite any public example into public-safe language.
|
|
4
|
+
|
|
5
|
+
## Failure modes
|
|
6
|
+
|
|
7
|
+
- The first screen becomes a manifesto.
|
|
8
|
+
- The user sees concepts before a runnable path.
|
|
9
|
+
- The guide claims value without a before/after proof.
|
|
10
|
+
|
|
11
|
+
## Guard questions
|
|
12
|
+
|
|
13
|
+
1. Did this mechanism change the decision, or just add ceremony?
|
|
14
|
+
2. Is any private material copied instead of summarized or synthesized?
|
|
15
|
+
3. Are blockers, residual risks, and next actions separated?
|
|
16
|
+
4. Could a new session continue from this file alone?
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
# Plain-Language First Screen Prompt
|
|
2
|
+
|
|
3
|
+
This prompt belongs to the AI Collaboration Open System. Use it in a local-first workflow with public-safe or redacted material.
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Make the first screen explain the result, path, and proof before concepts or framework names.
|
|
8
|
+
|
|
9
|
+
## Copy-paste prompt
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
Use the Plain-Language First Screen mechanism from my local AI Collaboration Open System workspace.
|
|
13
|
+
|
|
14
|
+
Purpose:
|
|
15
|
+
Make the first screen explain the result, path, and proof before concepts or framework names.
|
|
16
|
+
|
|
17
|
+
Trigger:
|
|
18
|
+
Use for README, START_HERE, handoff, review results, and any user-facing guide.
|
|
19
|
+
|
|
20
|
+
Input:
|
|
21
|
+
[paste redacted task material, context package, and acceptance card here]
|
|
22
|
+
|
|
23
|
+
Process:
|
|
24
|
+
1. Start with what the user can do in ten minutes.
|
|
25
|
+
2. Show before/after instead of abstract philosophy.
|
|
26
|
+
3. Name the files or commands that prove the claim.
|
|
27
|
+
4. Move deeper theory below the first-run path.
|
|
28
|
+
|
|
29
|
+
Return:
|
|
30
|
+
- Decision-changing findings only
|
|
31
|
+
- Evidence used
|
|
32
|
+
- Required fixes
|
|
33
|
+
- Residual risk
|
|
34
|
+
- Next action
|
|
35
|
+
|
|
36
|
+
Rules:
|
|
37
|
+
- Work from provided material only.
|
|
38
|
+
- Keep private material local.
|
|
39
|
+
- Use public-safe synthetic wording for examples.
|
|
40
|
+
- Label assumptions and unverified claims.
|
|
41
|
+
```
|
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# Plain-Language First Screen
|
|
2
|
+
|
|
3
|
+
Part of the AI Collaboration Open System. This is a local-first, public-safe mechanism package you can copy into Claude Code, Codex, Cursor, Cline, Windsurf, or Copilot.
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Make the first screen explain the result, path, and proof before concepts or framework names.
|
|
8
|
+
|
|
9
|
+
## When to use
|
|
10
|
+
|
|
11
|
+
Use for README, START_HERE, handoff, review results, and any user-facing guide.
|
|
12
|
+
|
|
13
|
+
## Input shape
|
|
14
|
+
|
|
15
|
+
Audience, first action, proof artifact, main contrast, and one next step.
|
|
16
|
+
|
|
17
|
+
## Process
|
|
18
|
+
|
|
19
|
+
1. Start with what the user can do in ten minutes.
|
|
20
|
+
2. Show before/after instead of abstract philosophy.
|
|
21
|
+
3. Name the files or commands that prove the claim.
|
|
22
|
+
4. Move deeper theory below the first-run path.
|
|
23
|
+
|
|
24
|
+
## Package files
|
|
25
|
+
|
|
26
|
+
- `README.md` explains the mechanism.
|
|
27
|
+
- `PROMPT.md` gives the copy-paste prompt.
|
|
28
|
+
- `TEMPLATE.md` gives the blank operating card.
|
|
29
|
+
- `EXAMPLE.synthetic.md` shows a public-safe run.
|
|
30
|
+
- `FAILURE_MODES.md` names common ways this mechanism fails.
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
# Plain-Language First Screen Template
|
|
2
|
+
|
|
3
|
+
AI Collaboration Open System mechanism card. Fill this in a local-first workflow with public-safe or redacted material.
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Make the first screen explain the result, path, and proof before concepts or framework names.
|
|
8
|
+
|
|
9
|
+
## Template
|
|
10
|
+
|
|
11
|
+
### Audience:
|
|
12
|
+
|
|
13
|
+
|
|
14
|
+
### First-screen claim:
|
|
15
|
+
|
|
16
|
+
|
|
17
|
+
### Ten-minute action:
|
|
18
|
+
|
|
19
|
+
|
|
20
|
+
### Before/after proof:
|
|
21
|
+
|
|
22
|
+
|
|
23
|
+
### Files or commands:
|
|
24
|
+
|
|
25
|
+
|
|
26
|
+
### What this is not:
|
|
27
|
+
|
|
28
|
+
|
|
29
|
+
### Next step:
|
|
30
|
+
|
|
31
|
+
|
|
32
|
+
|
|
33
|
+
## Completion check
|
|
34
|
+
|
|
35
|
+
- The mechanism has a named trigger.
|
|
36
|
+
- The next action is concrete.
|
|
37
|
+
- Private details are redacted or rewritten as synthetic examples.
|
|
38
|
+
- The result can be handed to another AI tool without extra chat history.
|
|
@@ -0,0 +1,55 @@
|
|
|
1
|
+
# Root-Cause Brake Synthetic Example
|
|
2
|
+
|
|
3
|
+
This is a public-safe synthetic example for the AI Collaboration Open System. It is local-first and contains no private account, customer, route, hook, or conversation material.
|
|
4
|
+
|
|
5
|
+
## Synthetic example
|
|
6
|
+
|
|
7
|
+
A synthetic data-quarantine feature is blocked twice in a row: round one for an inconsistent status field, round two for the same status field plus a self-check that 'passed' a broken case. Instead of shipping a third patch, the brake trips, the four questions reveal a contract conflict (the status field keeps being redefined) compounded by fake verification (the self-check was cosmetic), and the next version is rebuilt by freezing the contract first — not by adding a third fix.
|
|
8
|
+
|
|
9
|
+
## Full worked example (filled end to end)
|
|
10
|
+
|
|
11
|
+
An execution AI is building a 'quarantine' feature for a synthetic records tool: bad records get parked in a holding state instead of deleted. The owner reviews each version with an independent guard. Version 4 is blocked. Version 5 is blocked too. The owner is about to ask for version 6 — and trips the Root-Cause Brake instead.
|
|
12
|
+
|
|
13
|
+
### Twice-rejected target
|
|
14
|
+
The quarantine feature for the synthetic records tool. Two consecutive blocking reviews: V4 and V5.
|
|
15
|
+
|
|
16
|
+
### Trip condition met?
|
|
17
|
+
Yes. Same artifact, two blocks in a row. By the rule, version 6 may NOT be another patched draft. Stop and diagnose.
|
|
18
|
+
|
|
19
|
+
### Findings from block 1 (V4, verbatim)
|
|
20
|
+
BLOCK. The quarantine `status` field is written as the string 'held' in one path and the enum value QUARANTINED in another, so downstream reads disagree about whether a record is parked.
|
|
21
|
+
|
|
22
|
+
### Findings from block 2 (V5, verbatim)
|
|
23
|
+
BLOCK. The `status` mismatch from V4 is only half-fixed — one more path still writes 'held'. Also: the self-check claims 'all quarantine transitions verified' but it never exercises the restore-from-quarantine path, so a broken restore passed review.
|
|
24
|
+
|
|
25
|
+
### Patch-on-patch trail
|
|
26
|
+
V4 -> V5 was 'fix the status string in the path the guard named'. It patched the one spot the reviewer pointed at, did not sweep the rest, and added no real test — classic symptom-chasing.
|
|
27
|
+
|
|
28
|
+
### Q1 Contract conflict?
|
|
29
|
+
YES. Evidence: V4 finding + V5 finding both turn on `status` being two things at once ('held' vs QUARANTINED). The agreed definition of the status field is not frozen, so every patch fixes one writer and leaves others on the old assumption.
|
|
30
|
+
|
|
31
|
+
### Q2 Fake verification?
|
|
32
|
+
YES. Evidence: V5 finding — the self-check reported 'all transitions verified' while never running the restore path. The check was cosmetic; it passed a case it never tested.
|
|
33
|
+
|
|
34
|
+
### Q3 Scope too big?
|
|
35
|
+
PARTLY. Evidence: the feature bundles park + restore + audit-log in one unit, and the restore path is where the untested gap hid. Scope is not the primary cause, but the bundling gave the fake check more surface on which to miss a defect.
|
|
36
|
+
|
|
37
|
+
### Q4 Wrong split?
|
|
38
|
+
NO. Evidence: the task was a single coherent feature; the failures are about contract and verification, not about how the work was divided.
|
|
39
|
+
|
|
40
|
+
### Named root cause
|
|
41
|
+
A contract conflict on the `status` field (it was never frozen to one representation), made invisible each round by a verification step that only went through the motions. The patches kept fixing the spot the guard named while the unfrozen contract reintroduced the same class of bug elsewhere, and the hollow self-check kept certifying it.
|
|
42
|
+
|
|
43
|
+
### Owner decision
|
|
44
|
+
Agree with the root cause. Adjustment: freeze the `status` contract to a single enum as the first step of V6, and make the self-check fail first on the restore path before any further work.
|
|
45
|
+
|
|
46
|
+
### Version 6 direction, rebuilt around the cause (NOT another patch)
|
|
47
|
+
V6 does not start from the V5 patch list. Step 1: define `status` as one enum, single source of truth, and update every writer to it at once. Step 2: write a restore-from-quarantine check that fails against the current code, then make it pass. Only then continue. The brake record (both findings, four answers, cause, decision) is filed so a later session sees why the V4->V5->V6 chain was broken on purpose instead of patched a third time.
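The two steps above can be sketched in Python. This is an illustrative sketch only, not the synthetic tool's code: the dict-based record, the `ACTIVE` state, and the names `Status`, `park`, `restore`, and `check_restore_path` are all assumptions. It shows the end state after step 2, where the restore check passes; in the workflow described above, the check would be written first and seen to fail against the V5 code.

```python
from enum import Enum


class Status(Enum):
    """Step 1: one representation for `status`, used by every writer."""
    ACTIVE = "ACTIVE"            # hypothetical non-quarantined state
    QUARANTINED = "QUARANTINED"


def park(record: dict) -> dict:
    """Move a record into quarantine."""
    return {**record, "status": Status.QUARANTINED}


def restore(record: dict) -> dict:
    """Bring a quarantined record back; refuse anything else."""
    if record.get("status") is not Status.QUARANTINED:
        raise ValueError("only quarantined records can be restored")
    return {**record, "status": Status.ACTIVE}


def check_restore_path() -> None:
    """Step 2: exercise restore-from-quarantine instead of assuming it works."""
    parked = park({"id": 1, "status": Status.ACTIVE})
    assert parked["status"] is Status.QUARANTINED
    assert restore(parked)["status"] is Status.ACTIVE
    # The legacy string 'held' must no longer count as quarantined.
    try:
        restore({"id": 2, "status": "held"})
    except ValueError:
        pass
    else:
        raise AssertionError("legacy 'held' status was accepted")
```

Comparing with `is` against the enum member means a stray `'held'` string is rejected rather than silently treated as parked, which is the class of bug both blocks turned on.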
|
|
48
|
+
|
|
49
|
+
## How the mechanism changes the outcome
|
|
50
|
+
|
|
51
|
+
Without this mechanism, a single assistant can produce a smooth answer while hiding uncertainty. With this mechanism, the workflow records trigger, evidence, decision, residual risk, and next action.
|
|
52
|
+
|
|
53
|
+
## Reuse note
|
|
54
|
+
|
|
55
|
+
Copy the shape, not the synthetic facts. Adapt the template to your own redacted task.
|
|
@@ -0,0 +1,25 @@
|
|
|
1
|
+
# Root-Cause Brake Failure Modes
|
|
2
|
+
|
|
3
|
+
AI Collaboration Open System failure checklist. Use it in a local-first workflow before trusting a mechanism run, and rewrite any public example into public-safe language.
|
|
4
|
+
|
|
5
|
+
## Failure modes
|
|
6
|
+
|
|
7
|
+
- A third patched version goes out because nobody noticed the second block was the trip condition.
|
|
8
|
+
- The four questions get answered without evidence, so the named 'root cause' is just the old symptoms relabeled.
|
|
9
|
+
- The earlier findings are overwritten, so the across-rounds pattern that proves the real cause is lost.
|
|
10
|
+
|
|
11
|
+
## Common misuse (operator errors that look fine but break the mechanism)
|
|
12
|
+
|
|
13
|
+
- Quietly shipping 'just one more small patch' after the second block because the fix 'feels close', which is precisely the spiral the brake is built to stop.
|
|
14
|
+
- Filling in the four questions as a formality with no evidence, so the diagnostic theatre passes while the real cause stays unfound.
|
|
15
|
+
- Calling a list of surface symptoms the 'root cause', so version N+1 patches the same things again under a new name.
|
|
16
|
+
- Editing the earlier findings to look tidier, which erases the across-rounds pattern that is the entire point of the diagnosis.
|
|
17
|
+
- Tripping the brake on every minor rejection, so the team learns to ignore it and it no longer signals a genuine repeat-block.
|
|
18
|
+
- Treating the brake as a project freeze and stalling the work, when it is only a diagnostic stop — work resumes the moment the owner confirms the cause.
|
|
19
|
+
|
|
20
|
+
## Guard questions
|
|
21
|
+
|
|
22
|
+
1. Did this mechanism change the decision, or just add ceremony?
|
|
23
|
+
2. Is any private material copied instead of summarized or synthesized?
|
|
24
|
+
3. Are blockers, residual risks, and next actions separated?
|
|
25
|
+
4. Could a new session continue from this file alone?
|
|
@@ -0,0 +1,73 @@
|
|
|
1
|
+
# Root-Cause Brake Prompt
|
|
2
|
+
|
|
3
|
+
This prompt belongs to the AI Collaboration Open System. Use it in a local-first workflow with public-safe or redacted material.
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
Stop a patch-on-patch death spiral by treating repeated rejection as a signal to fix the cause, not the symptom. When the same artifact gets sent back twice in a row, an automatic brake trips: you may NOT ship another patched version. You must first stop and answer four diagnostic questions — is there a contract conflict, is the verification fake, is the scope too big, is the work split wrong — decide the real root cause, and only then write the next version, rebuilt around that cause instead of carrying forward another layer of fixes.
|
|
8
|
+
|
|
9
|
+
## Copy-paste prompt
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
Use the Root-Cause Brake mechanism from my local AI Collaboration Open System workspace.
|
|
13
|
+
|
|
14
|
+
Purpose:
|
|
15
|
+
Stop a patch-on-patch death spiral by treating repeated rejection as a signal to fix the cause, not the symptom. When the same artifact gets sent back twice in a row, an automatic brake trips: you may NOT ship another patched version. You must first stop and answer four diagnostic questions — is there a contract conflict, is the verification fake, is the scope too big, is the work split wrong — decide the real root cause, and only then write the next version, rebuilt around that cause instead of carrying forward another layer of fixes.
|
|
16
|
+
|
|
17
|
+
Trigger:
|
|
18
|
+
Trip the brake the moment the same thing has been rejected twice in a row (two consecutive blocking reviews on the same artifact or task), or whenever you catch yourself about to start version N+1 by adding more fixes to a growing patch list. It also fires on suspicion: a reviewer says 'we keep treating symptoms', or you notice the same kind of defect coming back under a different name each round.
|
|
19
|
+
|
|
20
|
+
Do not use when:
|
|
21
|
+
Do not trip it on a first rejection, on rejections of genuinely different things, or on a single small fix that clearly resolves a one-off mistake. One block is normal review; the brake is specifically for the repeated-block pattern. Forcing a full root-cause stop after every minor note is ceremony that buries the signal — the brake only means something if it stays reserved for the second consecutive block on the same target.
|
|
22
|
+
|
|
23
|
+
Input:
|
|
24
|
+
[paste redacted task material, context package, and acceptance card here]
|
|
25
|
+
|
|
26
|
+
Process:
|
|
27
|
+
1. Detect the trip condition: the same artifact has two consecutive blocking reviews. The moment that is true, stop. Do NOT open version N+1 as another patched draft — that move is exactly what the brake forbids.
|
|
28
|
+
2. Answer all four diagnostic questions, each with yes / no / partly AND concrete evidence (which finding, which version, where it shows). Skipped or unsupported answers are not allowed; a hand-waved 'probably fine' on any question defeats the brake. Q1 Contract conflict: are the agreed definitions (fields, states, interfaces, success criteria) quietly changing from round to round, so each fix breaks a different assumption? Q2 Fake verification: is the checking step (a self-review, a gate, a test) only going through the motions, passing things it should have caught? Q3 Scope too big: is a single unit of work carrying too many fields / states / responsibilities to get right in one pass? Q4 Wrong split: is the work cut too coarse or too fine, such as a packet that is really five tasks, or a job shattered into pieces that cannot be verified alone?
|
|
29
|
+
3. Name the root cause. From the four answers, state which underlying cause is actually generating the repeat blocks — not a list of surface fixes, but the one structural reason the patches keep failing.
|
|
30
|
+
4. Get the root cause confirmed by the human owner before proceeding (agree / adjust / reject and re-diagnose). The brake is a deliberate governance stop, so the person who owns the work signs off on the diagnosis before the next version starts. This is not a project pause — work resumes immediately after sign-off; it just resumes rebuilt around the cause.
|
|
31
|
+
5. Write version N+1 from the root cause, not from the patch list. The next version is a rebuild aimed at the named cause; it must not re-enact the old defect under a new patch. If the cause was 'scope too big', the next version is smaller; if it was 'fake verification', the next version fixes the check first, and so on.
|
|
32
|
+
6. File the brake record: the preserved findings from each block, the four answered questions with evidence, the named root cause, the owner's decision, and the rebuilt direction, so that a later session sees why the chain was broken and does not restart the patch spiral.
|
|
33
|
+
|
|
34
|
+
Output shape:
|
|
35
|
+
- Trip confirmation: a one-line statement that the same target hit two consecutive blocks, so the brake applies.
|
|
36
|
+
- Four answered questions: Q1 contract conflict, Q2 fake verification, Q3 scope too big, Q4 wrong split — each yes/no/partly with a concrete evidence pointer (finding + version + where).
|
|
37
|
+
- Named root cause: the single structural reason the patches kept failing, derived from the four answers.
|
|
38
|
+
- Owner decision: agree / adjust / reject-and-re-diagnose, recorded.
|
|
39
|
+
- Rebuilt direction for version N+1: how the next version is built around the cause, explicitly not a continuation of the patch list.
|
|
40
|
+
- Brake record: preserved per-round findings + answers + cause + decision, so the next session does not reopen the spiral.
|
|
41
|
+
|
|
42
|
+
Return:
|
|
43
|
+
- Decision-changing findings only
|
|
44
|
+
- Evidence used
|
|
45
|
+
- Required fixes
|
|
46
|
+
- Residual risk
|
|
47
|
+
- Next action
|
|
48
|
+
|
|
49
|
+
Pass bar (do not pass unless all hold):
|
|
50
|
+
- The brake actually tripped at the second consecutive block instead of a third patched version going out.
|
|
51
|
+
- All four diagnostic questions are answered with yes/no/partly AND a concrete evidence pointer — none hand-waved.
|
|
52
|
+
- A single structural root cause is named, not a longer list of surface fixes.
|
|
53
|
+
- The human owner confirmed (or adjusted) the root cause before the next version started.
|
|
54
|
+
- Version N+1 is visibly rebuilt around the cause, and the per-round findings are preserved on the record.
|
|
55
|
+
|
|
56
|
+
Reject bar (send back if any holds):
|
|
57
|
+
- A third patched version was shipped after two blocks without ever stopping to diagnose (the patch-on-patch spiral the brake exists to break).
|
|
58
|
+
- One or more of the four questions was skipped or answered 'probably fine' with no evidence, so the brake was ceremony, not a real stop.
|
|
59
|
+
- The 'root cause' is just a restated list of the same surface fixes, so the next version will reproduce the defect.
|
|
60
|
+
- The next version started before the owner signed off on the diagnosis.
|
|
61
|
+
- The original per-round findings were edited or discarded, destroying the cross-round pattern that the diagnosis depends on.
|
|
62
|
+
- The brake was tripped on a first block or on unrelated rejections, draining the signal so a real repeat-block does not stand out.
|
|
63
|
+
|
|
64
|
+
Rules:
|
|
65
|
+
- Work from provided material only.
|
|
66
|
+
- Keep private material local.
|
|
67
|
+
- Use public-safe synthetic wording for examples.
|
|
68
|
+
- Label assumptions and unverified claims.
|
|
69
|
+
```
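The trip condition and the evidence requirement in the prompt above can be sketched in Python. This is a minimal sketch under stated assumptions: `Review`, `brake_tripped`, `unanswered`, and the question keys are hypothetical names, and "two consecutive blocking reviews" is read here as the two most recent reviews of the same artifact both blocking. It does not cover the suspicion-based triggers, which need human judgment.

```python
from dataclasses import dataclass

VALID_ANSWERS = {"yes", "no", "partly"}
QUESTIONS = (
    "contract_conflict",
    "fake_verification",
    "scope_too_big",
    "wrong_split",
)


@dataclass(frozen=True)
class Review:
    """One review round; field names are assumptions for this sketch."""
    artifact: str
    version: int
    blocked: bool
    finding: str = ""


def brake_tripped(reviews: list[Review], artifact: str) -> bool:
    """True when the two most recent reviews of `artifact` both blocked."""
    rounds = sorted(
        (r for r in reviews if r.artifact == artifact),
        key=lambda r: r.version,
    )
    return len(rounds) >= 2 and rounds[-1].blocked and rounds[-2].blocked


def unanswered(answers: dict[str, tuple[str, str]]) -> list[str]:
    """Return the questions lacking a valid answer or an evidence pointer.

    `answers` maps a question key to an (answer, evidence) pair.
    """
    gaps = []
    for question in QUESTIONS:
        answer, evidence = answers.get(question, ("", ""))
        if answer not in VALID_ANSWERS or not evidence.strip():
            gaps.append(question)
    return gaps
```

Filtering by artifact before looking at the last two rounds keeps a block on an unrelated target from tripping the brake, and `Review` is frozen so earlier findings cannot be edited in place after the fact.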
|
|
70
|
+
|
|
71
|
+
## Full worked example
|
|
72
|
+
|
|
73
|
+
See `EXAMPLE.synthetic.md` for a start-to-finish run of this prompt on a public-safe synthetic task.
|