ai-collab-open-system 0.1.0
- package/.aict/START_HERE.md +127 -0
- package/.aict/WORKSPACE_MANIFEST.json +91 -0
- package/.aict/acceptance/EXAMPLE.synthetic.md +49 -0
- package/.aict/acceptance/FAILURE_MODES.md +40 -0
- package/.aict/acceptance/PROMPT.md +47 -0
- package/.aict/acceptance/README.md +44 -0
- package/.aict/acceptance/TEMPLATE.md +57 -0
- package/.aict/adapters/SHARED_CORE_CONTRACT.md +106 -0
- package/.aict/adapters/claude-code/ADAPTER.md +28 -0
- package/.aict/adapters/cline/ADAPTER.md +28 -0
- package/.aict/adapters/codex/ADAPTER.md +28 -0
- package/.aict/adapters/copilot/ADAPTER.md +28 -0
- package/.aict/adapters/cursor/ADAPTER.md +28 -0
- package/.aict/adapters/windsurf/ADAPTER.md +28 -0
- package/.aict/context/EXAMPLE.synthetic.md +53 -0
- package/.aict/context/FAILURE_MODES.md +40 -0
- package/.aict/context/PROMPT.md +47 -0
- package/.aict/context/README.md +44 -0
- package/.aict/context/TEMPLATE.md +63 -0
- package/.aict/cookbook/README.md +8 -0
- package/.aict/cookbook/bridge-to-a-second-family.md +103 -0
- package/.aict/cookbook/connect-a-tool.md +67 -0
- package/.aict/cookbook/review-a-half-product.md +79 -0
- package/.aict/cookbook/run-a-first-loop.md +81 -0
- package/.aict/examples/README.md +21 -0
- package/.aict/examples/ai-coding-long-task/CASE.md +161 -0
- package/.aict/examples/ai-coding-long-task/artifacts/acceptance-card.md +36 -0
- package/.aict/examples/ai-coding-long-task/artifacts/context-package.md +30 -0
- package/.aict/examples/ai-coding-long-task/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/ai-coding-long-task/artifacts/first-ai-output.md +109 -0
- package/.aict/examples/ai-coding-long-task/artifacts/guard-review.md +40 -0
- package/.aict/examples/ai-coding-long-task/artifacts/handoff-note.md +28 -0
- package/.aict/examples/ai-coding-long-task/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/ai-coding-long-task/artifacts/revised-output.md +62 -0
- package/.aict/examples/content-production-harvest/CASE.md +87 -0
- package/.aict/examples/content-production-harvest/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/context-package.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/content-production-harvest/artifacts/guard-review.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/handoff-note.md +28 -0
- package/.aict/examples/content-production-harvest/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/multi-tool-collaboration/CASE.md +87 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/context-package.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/guard-review.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/handoff-note.md +28 -0
- package/.aict/examples/multi-tool-collaboration/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/CASE.md +87 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/context-package.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/guard-review.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/handoff-note.md +28 -0
- package/.aict/examples/personal-judgment-growth-assistant/artifacts/harvest-seed.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/CASE.md +87 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/acceptance-card.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/context-package.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/execution-prompt.md +30 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/guard-review.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/handoff-note.md +28 -0
- package/.aict/examples/research-knowledge-synthesis/artifacts/harvest-seed.md +28 -0
- package/.aict/guard/EXAMPLE.synthetic.md +51 -0
- package/.aict/guard/FAILURE_MODES.md +40 -0
- package/.aict/guard/PROMPT.md +47 -0
- package/.aict/guard/README.md +44 -0
- package/.aict/guard/TEMPLATE.md +60 -0
- package/.aict/handoff/EXAMPLE.synthetic.md +51 -0
- package/.aict/handoff/FAILURE_MODES.md +40 -0
- package/.aict/handoff/PROMPT.md +47 -0
- package/.aict/handoff/README.md +44 -0
- package/.aict/handoff/TEMPLATE.md +60 -0
- package/.aict/harvest/EXAMPLE.synthetic.md +51 -0
- package/.aict/harvest/FAILURE_MODES.md +40 -0
- package/.aict/harvest/PROMPT.md +47 -0
- package/.aict/harvest/README.md +44 -0
- package/.aict/harvest/TEMPLATE.md +60 -0
- package/.aict/mechanisms/README.md +34 -0
- package/.aict/mechanisms/anti-drift-partner/EXAMPLE.synthetic.md +46 -0
- package/.aict/mechanisms/anti-drift-partner/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/anti-drift-partner/PROMPT.md +75 -0
- package/.aict/mechanisms/anti-drift-partner/README.md +82 -0
- package/.aict/mechanisms/anti-drift-partner/TEMPLATE.md +74 -0
- package/.aict/mechanisms/blind-spot-scan/EXAMPLE.synthetic.md +39 -0
- package/.aict/mechanisms/blind-spot-scan/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/blind-spot-scan/PROMPT.md +72 -0
- package/.aict/mechanisms/blind-spot-scan/README.md +79 -0
- package/.aict/mechanisms/blind-spot-scan/TEMPLATE.md +70 -0
- package/.aict/mechanisms/collaboration-coach/EXAMPLE.synthetic.md +40 -0
- package/.aict/mechanisms/collaboration-coach/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/collaboration-coach/PROMPT.md +72 -0
- package/.aict/mechanisms/collaboration-coach/README.md +79 -0
- package/.aict/mechanisms/collaboration-coach/TEMPLATE.md +61 -0
- package/.aict/mechanisms/do-not-handle-yet/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/do-not-handle-yet/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/do-not-handle-yet/PROMPT.md +41 -0
- package/.aict/mechanisms/do-not-handle-yet/README.md +30 -0
- package/.aict/mechanisms/do-not-handle-yet/TEMPLATE.md +38 -0
- package/.aict/mechanisms/dual-guard/EXAMPLE.synthetic.md +54 -0
- package/.aict/mechanisms/dual-guard/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/dual-guard/PROMPT.md +76 -0
- package/.aict/mechanisms/dual-guard/README.md +81 -0
- package/.aict/mechanisms/dual-guard/TEMPLATE.md +73 -0
- package/.aict/mechanisms/feedback-absorption-ledger/EXAMPLE.synthetic.md +49 -0
- package/.aict/mechanisms/feedback-absorption-ledger/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/feedback-absorption-ledger/PROMPT.md +74 -0
- package/.aict/mechanisms/feedback-absorption-ledger/README.md +81 -0
- package/.aict/mechanisms/feedback-absorption-ledger/TEMPLATE.md +69 -0
- package/.aict/mechanisms/half-product-review/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/half-product-review/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/half-product-review/PROMPT.md +41 -0
- package/.aict/mechanisms/half-product-review/README.md +30 -0
- package/.aict/mechanisms/half-product-review/TEMPLATE.md +38 -0
- package/.aict/mechanisms/handoff-abc/EXAMPLE.synthetic.md +47 -0
- package/.aict/mechanisms/handoff-abc/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/handoff-abc/PROMPT.md +75 -0
- package/.aict/mechanisms/handoff-abc/README.md +82 -0
- package/.aict/mechanisms/handoff-abc/TEMPLATE.md +60 -0
- package/.aict/mechanisms/harvest-and-erc/EXAMPLE.synthetic.md +43 -0
- package/.aict/mechanisms/harvest-and-erc/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/harvest-and-erc/PROMPT.md +74 -0
- package/.aict/mechanisms/harvest-and-erc/README.md +81 -0
- package/.aict/mechanisms/harvest-and-erc/TEMPLATE.md +60 -0
- package/.aict/mechanisms/honest-calibration/EXAMPLE.synthetic.md +43 -0
- package/.aict/mechanisms/honest-calibration/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/honest-calibration/PROMPT.md +74 -0
- package/.aict/mechanisms/honest-calibration/README.md +81 -0
- package/.aict/mechanisms/honest-calibration/TEMPLATE.md +66 -0
- package/.aict/mechanisms/one-click-dispatch/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/one-click-dispatch/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/one-click-dispatch/PROMPT.md +41 -0
- package/.aict/mechanisms/one-click-dispatch/README.md +30 -0
- package/.aict/mechanisms/one-click-dispatch/TEMPLATE.md +38 -0
- package/.aict/mechanisms/plain-language-first-screen/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/plain-language-first-screen/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/plain-language-first-screen/PROMPT.md +41 -0
- package/.aict/mechanisms/plain-language-first-screen/README.md +30 -0
- package/.aict/mechanisms/plain-language-first-screen/TEMPLATE.md +38 -0
- package/.aict/mechanisms/root-cause-brake/EXAMPLE.synthetic.md +55 -0
- package/.aict/mechanisms/root-cause-brake/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/root-cause-brake/PROMPT.md +73 -0
- package/.aict/mechanisms/root-cause-brake/README.md +79 -0
- package/.aict/mechanisms/root-cause-brake/TEMPLATE.md +74 -0
- package/.aict/mechanisms/scout-review-controller/EXAMPLE.synthetic.md +15 -0
- package/.aict/mechanisms/scout-review-controller/FAILURE_MODES.md +16 -0
- package/.aict/mechanisms/scout-review-controller/PROMPT.md +41 -0
- package/.aict/mechanisms/scout-review-controller/README.md +30 -0
- package/.aict/mechanisms/scout-review-controller/TEMPLATE.md +38 -0
- package/.aict/mechanisms/single-tool-guard/EXAMPLE.synthetic.md +54 -0
- package/.aict/mechanisms/single-tool-guard/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/single-tool-guard/PROMPT.md +76 -0
- package/.aict/mechanisms/single-tool-guard/README.md +83 -0
- package/.aict/mechanisms/single-tool-guard/TEMPLATE.md +75 -0
- package/.aict/mechanisms/task-splitting/EXAMPLE.synthetic.md +53 -0
- package/.aict/mechanisms/task-splitting/FAILURE_MODES.md +25 -0
- package/.aict/mechanisms/task-splitting/PROMPT.md +72 -0
- package/.aict/mechanisms/task-splitting/README.md +79 -0
- package/.aict/mechanisms/task-splitting/TEMPLATE.md +76 -0
- package/.aict/modes/README.md +11 -0
- package/.aict/modes/execute.md +31 -0
- package/.aict/modes/handoff.md +29 -0
- package/.aict/modes/harvest.md +30 -0
- package/.aict/modes/review.md +28 -0
- package/.aict/modes/shape.md +34 -0
- package/.aict/privacy/COMMERCIAL_BOUNDARY.md +34 -0
- package/.aict/privacy/PRIVACY.md +36 -0
- package/.aict/privacy/REDACTION_CHECKLIST.md +12 -0
- package/.aict/profile/CANDIDATES.md +44 -0
- package/.aict/profile/EXAMPLE.synthetic.md +49 -0
- package/.aict/profile/FAILURE_MODES.md +40 -0
- package/.aict/profile/PROMPT.md +47 -0
- package/.aict/profile/README.md +44 -0
- package/.aict/profile/TEMPLATE.md +57 -0
- package/.aict/prompts/acceptance-definition.md +109 -0
- package/.aict/prompts/guard-review.md +116 -0
- package/.aict/prompts/handoff-generation.md +110 -0
- package/.aict/prompts/harvest-extraction.md +110 -0
- package/.aict/prompts/mode-switching.md +66 -0
- package/.aict/prompts/profile-creation.md +66 -0
- package/.aict/prompts/profile-refinement.md +66 -0
- package/.aict/prompts/project-context-packaging.md +113 -0
- package/.aict/prompts/red-team-challenge.md +106 -0
- package/.aict/prompts/rule-update-proposal.md +114 -0
- package/.aict/prompts/workflow-reset.md +109 -0
- package/.aict/roles/README.md +18 -0
- package/.aict/roles/executor.md +34 -0
- package/.aict/roles/harvester.md +33 -0
- package/.aict/roles/owner-controller.md +38 -0
- package/.aict/roles/scout.md +33 -0
- package/.aict/roles/supervisor.md +34 -0
- package/.aict/roles/system-guardian.md +34 -0
- package/.aict/skills/acceptance/SKILL.md +43 -0
- package/.aict/skills/context/SKILL.md +44 -0
- package/.aict/skills/evidence-pack/SKILL.md +42 -0
- package/.aict/skills/guard/SKILL.md +46 -0
- package/.aict/skills/handoff/SKILL.md +44 -0
- package/.aict/skills/harvest/SKILL.md +44 -0
- package/.aict/skills/mode-switch/SKILL.md +42 -0
- package/.aict/skills/profile/SKILL.md +42 -0
- package/.aict/skills/red-team/SKILL.md +42 -0
- package/.aict/skills/single-tool-guard/SKILL.md +42 -0
- package/.aict/state/CURRENT_STATE.md +13 -0
- package/.aict/state/DECISIONS.md +7 -0
- package/.aict/state/TASK_LOG.md +7 -0
- package/.aict/state/evidence.jsonl +2 -0
- package/.aict/state/learning-ledger.jsonl +1 -0
- package/.aict/state/receipts.jsonl +1 -0
- package/.aict/state/runs.jsonl +1 -0
- package/.aict/state/tasks.jsonl +1 -0
- package/.aict/walkthroughs/10-minute-your-task.md +107 -0
- package/.aict/walkthroughs/10-minute.md +43 -0
- package/.aict/walkthroughs/30-minute.md +22 -0
- package/.aict/walkthroughs/60-minute.md +27 -0
- package/.aict/walkthroughs/synthetic-loop-transcript.md +43 -0
- package/CHANGELOG.md +23 -0
- package/CODE_OF_CONDUCT.md +20 -0
- package/CONTRIBUTING.md +30 -0
- package/KNOWN_LIMITATIONS.md +54 -0
- package/LICENSE +199 -0
- package/PRODUCT_CONTRACT.md +446 -0
- package/README.md +245 -0
- package/RELEASE_CHECKLIST.md +78 -0
- package/SECURITY.md +56 -0
- package/START_HERE.md +89 -0
- package/bin/ai-collab.js +2 -0
- package/docs/DOGFOOD.md +85 -0
- package/docs/FEEDBACK.md +61 -0
- package/docs/FIRST_EXPERIENCE_SPEC.md +32 -0
- package/docs/FREE_VS_PAID.md +53 -0
- package/docs/PUBLIC_BOUNDARY.md +36 -0
- package/docs/PUBLIC_MAPPING.md +178 -0
- package/docs/RELEASE_PRIORITY.md +23 -0
- package/docs/WHY_THIS_EXISTS.md +36 -0
- package/docs/open-system/00-start-here.md +60 -0
- package/docs/open-system/01-ai-collaboration-os.md +33 -0
- package/docs/open-system/02-six-layer-architecture.md +45 -0
- package/docs/open-system/03-role-system.md +33 -0
- package/docs/open-system/04-core-mechanisms.md +34 -0
- package/docs/open-system/05-failure-patterns.md +31 -0
- package/docs/open-system/06-how-to-adapt-to-your-workflow.md +31 -0
- package/package.json +69 -0
- package/privacy-manifest.json +78 -0
- package/privacy-scan.local.json.example +18 -0
- package/scripts/lib/forbidden-in-pack.js +55 -0
- package/scripts/pack-check.js +154 -0
- package/scripts/privacy-scan.js +487 -0
- package/scripts/validate-contract.js +160 -0
- package/src/adapters.js +590 -0
- package/src/bootstrap.js +1184 -0
- package/src/catalog.js +2723 -0
- package/src/cli.js +2899 -0
- package/src/dialogue.js +470 -0
- package/src/i18n.js +1034 -0
- package/src/ledger.js +2011 -0
- package/src/render.js +1381 -0
- package/src/sendmodel.js +452 -0
- package/src/validate.js +1307 -0
- package/src/workspace.js +1679 -0
- package/tests/contract.test.js +8514 -0
@@ -0,0 +1,42 @@
---
name: profile
description: Build and maintain collaboration profiles.
---

# profile skill

## When to use

Use before recurring or high-context work where the assistant's tone, autonomy, challenge style, and safety boundaries affect the result.

## Inputs

- The shared core contract.
- A redacted task context.
- The relevant layer template.
- Any acceptance criteria or review findings.

## Process

1. Extract reusable collaboration preferences from redacted material.
2. Separate stable preferences from task facts.
3. Mark inferred preferences as provisional until confirmed.
4. Return a compact profile card that future sessions can apply.

## Output

- Working style
- Decision preferences
- Hard boundaries
- Challenge and review preferences
- Update rule

## Safety

- Do not store secrets, client names, local paths, account details, or raw private conversations.
- Do not infer identity traits that the user did not provide.
- Do not turn a temporary mood into a permanent rule.

## Example

Create a profile that says: direct risk calls, no publishing without consent, ask before irreversible actions. Include a short evidence note for every stable preference: 'seen in repeated release work' is acceptable, while 'user sounded impatient once' stays provisional. The profile should help the next assistant choose autonomy level, response length, challenge style, and consent boundaries without copying private task history.
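
The profile process above (separate stable preferences from task facts, keep inferred ones provisional) can be sketched as a small transformation. This is a minimal sketch under stated assumptions, not the package's implementation: `buildProfileCard`, the observation fields (`kind`, `text`, `evidence`, `confirmedByUser`), and the sample rows are hypothetical names and data chosen for illustration.

```javascript
// Hypothetical sketch: turn redacted observations into a compact profile card.
// Field names are assumptions for illustration, not the package's schema.
function buildProfileCard(observations) {
  const card = { stable: [], provisional: [] };
  for (const obs of observations) {
    // Task facts describe one job, not how the user likes to work, so skip them.
    if (obs.kind !== "preference") continue;
    // Inferred preferences stay provisional until the user confirms them.
    const bucket = obs.confirmedByUser ? card.stable : card.provisional;
    bucket.push({ text: obs.text, evidence: obs.evidence });
  }
  return card;
}

// Made-up sample input that mirrors the example in the skill text.
const card = buildProfileCard([
  { kind: "preference", text: "ask before irreversible actions", evidence: "seen in repeated release work", confirmedByUser: true },
  { kind: "preference", text: "prefers very short replies", evidence: "user sounded impatient once", confirmedByUser: false },
  { kind: "task_fact", text: "this task touches the export module", evidence: "task brief", confirmedByUser: true },
]);
console.log(JSON.stringify(card, null, 2));
```

Keeping the evidence note next to each preference is what lets a later session see why a line is stable or still provisional.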
@@ -0,0 +1,42 @@
---
name: red-team
description: Find the failure path before shipping an idea.
---

# red-team skill

## When to use

Use for public releases, irreversible operations, broad claims, security-sensitive behavior, or expensive direction choices.

## Inputs

- The shared core contract.
- A redacted task context.
- The relevant layer template.
- Any acceptance criteria or review findings.

## Process

1. Name the most damaging plausible failure.
2. Attack assumptions through user behavior, safety, evidence, and rollback.
3. Separate blockers from tolerable risk.
4. Recommend the smallest mitigation or test.

## Output

- Worst plausible failure
- Attack paths
- Evidence gaps
- Mitigations
- Residual risk

## Safety

- Do not invent dramatic but irrelevant threats.
- Do not skip mundane data-loss or privacy failures.
- Do not treat red-team output as owner approval.

## Example

Before publishing, challenge whether README claims 'integration' when adapters are only guidance files.
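
Step 3 of the red-team process (separate blockers from tolerable risk) could look roughly like the triage below. This is an illustrative sketch only: the `triage` function, the finding fields, and especially the blocking rule are assumptions, since the skill text does not define how severity is decided. The second sample finding is invented for the example.

```javascript
// Hypothetical triage sketch; the blocking rule below is an assumption.
function triage(findings) {
  const report = { blockers: [], tolerable: [] };
  for (const finding of findings) {
    // Assumed rule: unmitigated findings that are irreversible, or that touch
    // privacy or data loss, block the release; everything else is tolerable.
    const severe = finding.irreversible || ["privacy", "data-loss"].includes(finding.area);
    const bucket = severe && !finding.mitigation ? report.blockers : report.tolerable;
    bucket.push(finding.title);
  }
  return report;
}

const report = triage([
  { title: "README claims 'integration' but adapters are guidance files", area: "claims", irreversible: false, mitigation: "reword the claim" },
  { title: "publish step cannot be rolled back", area: "release", irreversible: true, mitigation: null },
]);
console.log(report);
```

Whatever rule is used, the output stays advisory: per the Safety list, a clean triage is not owner approval.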
@@ -0,0 +1,42 @@
---
name: single-tool-guard
description: Run the minimum guard when only one model family is available, with the ceiling named on the record.
---

# single-tool-guard skill

## When to use

Use at a completion claim when no second, different model family exists to run the cross-family binding gate, and you would otherwise trust the same assistant that just wrote the work.

## Inputs

- The shared core contract.
- A redacted task context.
- The relevant layer template.
- Any acceptance criteria or review findings.

## Process

1. Open a brand new conversation rather than reusing the drafting thread, whose eagerness to please suppresses objections.
2. Paste an adversarial prompt that defaults to refuting and hunts for missing evidence, tying each finding to a line or section.
3. Bound the verdict at the single-tool ceiling: this tops out at L2 / pass_with_risk and may never be filed as a passed cross-family gate.
4. Name the residual risk a same-family reviewer most likely shares, and leave the upgrade note to run one cross-family pass once a second family appears.

## Output

- Verdict bounded at pass_with_risk (never a plain pass)
- Findings tied to specific lines or sections
- Residual risk a same-family reviewer would share
- Owner sign-off required before pass_with_risk counts as accepted
- Upgrade note: cross-family pass still owed

## Safety

- Do not record a single-family review as if the cross-family binding gate cleared it.
- Do not let pass_with_risk count as accepted without an explicit owner sign-off on the named risk.
- Do not reuse the thread that just claimed done, and do not leave the residual risk blank.

## Example

With only one tool available, a fresh adversarial pass downgrades a done claim to pass_with_risk, names the CSV-escaping blind spot a same-family reviewer would share, and leaves an upgrade note to run a cross-family pass later.
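
The ceiling rule above can be made concrete with a short sketch. It assumes a hypothetical `boundSingleToolVerdict` helper and a made-up record shape; only the `L2` / `pass_with_risk` ceiling, the owner sign-off requirement, and the upgrade note come from the skill text. The `reviewMode` value `single_family` is an assumed label, not a documented one.

```javascript
// Hypothetical sketch of the single-tool ceiling; the record shape is an assumption.
function boundSingleToolVerdict(review) {
  // The skill forbids leaving the residual risk blank.
  if (!review.residualRisk) {
    throw new Error("name the residual risk a same-family reviewer would share");
  }
  // A plain "pass" is capped at pass_with_risk; other verdicts pass through unchanged.
  const verdict = review.verdict === "pass" ? "pass_with_risk" : review.verdict;
  return {
    verdict,
    guardLevel: "L2", // single-tool ceiling named in the skill
    reviewMode: "single_family", // assumed label, never "cross_family"
    residualRisk: review.residualRisk,
    // pass_with_risk only counts as accepted after explicit owner sign-off.
    accepted: verdict === "pass_with_risk" && review.ownerSignedOff === true,
    upgradeNote: "cross-family pass still owed",
  };
}

const record = boundSingleToolVerdict({
  verdict: "pass",
  residualRisk: "CSV escaping",
  ownerSignedOff: false,
});
console.log(record.verdict, record.accepted); // pass_with_risk false
```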
@@ -0,0 +1,7 @@
# Decisions

Record decisions that future sessions should not reopen without new evidence.

| Date | Decision | Evidence | Revisit condition |
| --- | --- | --- | --- |
| synthetic | Keep examples synthetic | privacy boundary | public-safe replacement needed |
@@ -0,0 +1,2 @@
{"id":"e0","taskId":"t0","kind":"note","summary":"(synthetic) example evidence seed row bound to task t0","createdAt":"2026-01-01T00:00:00.000Z"}
{"id":"e1","taskId":"t0","kind":"cross_family_guard","summary":"(synthetic) cross-family guard review seed row bound to task t0","reviewer":"(synthetic) example reviewer","family":"(synthetic) other-model-family","createdAt":"2026-01-01T00:00:00.000Z"}
@@ -0,0 +1 @@
{"id":"l0","taskId":"t0","type":"harvest","content":"(synthetic) example learning seed row - written by the P4 harvest flow","status":"proposed","createdAt":"2026-01-01T00:00:00.000Z"}
@@ -0,0 +1 @@
{"id":"c0","taskId":"t0","verdict":"pass","guardLevel":"L3","reviewMode":"cross_family","evidenceIds":["e0","e1"],"familyUnverified":true,"status":"accepted","createdAt":"2026-01-01T00:00:00.000Z"}
@@ -0,0 +1 @@
{"id":"r0","taskId":"t0","command":"echo synthetic-seed","startedAt":"2026-01-01T00:00:00.000Z","finishedAt":"2026-01-01T00:00:00.000Z","exitCode":0,"status":"finished"}
@@ -0,0 +1 @@
{"id":"t0","title":"(synthetic) example task seed row - replace with your own","status":"open","createdAt":"2026-01-01T00:00:00.000Z"}
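
The seed rows above are JSON Lines: one JSON object per line, linked by `taskId` and `evidenceIds`. As a rough illustration of how such rows can be cross-checked, the sketch below parses trimmed copies of the evidence and receipt seed rows and reports any receipt that points at an evidence id that does not exist. `parseJsonl` and `danglingEvidence` are hypothetical helper names, not functions confirmed to exist in the package.

```javascript
// Parse JSON Lines text: one JSON object per non-empty line.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Seed rows trimmed to the fields this check needs.
const evidence = parseJsonl(
  [
    '{"id":"e0","taskId":"t0","kind":"note"}',
    '{"id":"e1","taskId":"t0","kind":"cross_family_guard"}',
  ].join("\n")
);
const receipts = parseJsonl(
  '{"id":"c0","taskId":"t0","verdict":"pass","evidenceIds":["e0","e1"]}'
);

// Hypothetical check: list receipt -> evidence references that resolve to nothing.
function danglingEvidence(receiptRows, evidenceRows) {
  const known = new Set(evidenceRows.map((row) => row.id));
  return receiptRows.flatMap((receipt) =>
    receipt.evidenceIds
      .filter((id) => !known.has(id))
      .map((id) => ({ receipt: receipt.id, missing: id }))
  );
}

console.log(danglingEvidence(receipts, evidence)); // []
```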
@@ -0,0 +1,107 @@
# 10-Minute Walkthrough (Your own task)

This is the recommended first run. You run the whole collaboration loop on one real task of your own, instead of a prepared example, and feel the value on work you actually care about. If you would rather watch the flow on a prepared case first, use `10-minute.md` (the demo preview) and then come back here.

Goal: take one messy task of yours and, in three short rounds, force the AI to (1) define "done" before it acts, (2) do only that, and (3) get re-checked by an independent AI that hunts for a thin "done" - then spend two minutes closing the loop into reusable cards so the next task starts ahead.

Everything stays local-first. You paste a redacted description into the AI tools you already use; nothing is uploaded by this workspace. Redact before you paste: replace any real name, path, customer, or internal number with a placeholder. The loop works on a redacted description; it does not need the private original.
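
If you redact often, a small helper can do the first pass before you review the text by hand. The sketch below is hypothetical and deliberately crude: the `redact` function, its patterns, and the placeholder labels are assumptions, the patterns will miss cases, and it is unrelated to the package's own `scripts/privacy-scan.js`. The sample sentence is invented.

```javascript
// Hypothetical redaction pass; the patterns are illustrative and incomplete.
const RULES = [
  [/[\w.+-]+@[\w-]+(\.[\w-]+)+/g, "<EMAIL>"], // email-like strings
  [/(?:\/[\w.-]+){2,}/g, "<PATH>"], // Unix-style paths with two or more segments
  [/\b\d{4,}\b/g, "<NUMBER>"], // long digit runs such as ticket or account numbers
];

function redact(text, names = []) {
  let out = text;
  // Replace names you list explicitly; nothing here guesses names for you.
  for (const name of names) out = out.split(name).join("<NAME>");
  for (const [pattern, placeholder] of RULES) out = out.replace(pattern, placeholder);
  return out;
}

const redacted = redact(
  "Fix export for Acme Corp in /home/sam/projects/billing, ticket 48213",
  ["Acme Corp"]
);
console.log(redacted);
// Fix export for <NAME> in <PATH>, ticket <NUMBER>
```

Treat the output as a draft: read it once before pasting, because a pattern list cannot know which words are private in your context.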

What you need: one real task that is a bit messy, and one AI tool you can paste into. A second tool of a different model family (a different AI brand) makes Step 3 much stronger, but you can run all three rounds in one tool if that is all you have.

Want the AI to prompt you for these steps on its own - to ping you to review every time it says "done", instead of you remembering to paste Step 3? Install the adapter into your tool's always-on instructions with `node bin/ai-collab.js adapters install --target <repo>`; it turns on the coaching reminders, and if you only have one tool it routes the completion-claim check through `single-tool-guard` (a fresh adversarial pass in the same tool).

## Step 1 (2 min) - Define done before any work

Paste this into your AI tool, with your own task in the brackets:

```text
I have a task in front of me that is a bit messy. Do NOT write any implementation yet.
Task (redacted): [describe your task in plain language; replace any private name, path, or number with a placeholder]
Return two things:
1) Boundary card: this run does only this one small slice; explicitly list what is NOT in scope.
2) Acceptance card: a numbered list of hard, checkable standards (AC1, AC2, ...). Mark anything that would be out of scope.
```

Expected: a boundary card and an acceptance card. You now have a written definition of "done" for your own task, before a line of work exists. This is the step people skip and then regret.

## Step 2 (3 min) - Do only the accepted slice, then produce an Evidence Pack

Paste this next, so the AI builds only what the acceptance card described and hands back a structured **Evidence Pack** the next round can actually check - not a prose "it's done":

```text
Do only the work the acceptance card describes. Do not expand scope.
When you are done, produce an "Evidence Pack" in exactly this shape (it is the artifact the re-check will judge):
1) Changed files / diff: the list of files you changed, with the key diff hunks (or the full patch). If you changed nothing, say so.
2) Commands run: the exact commands you ran to verify the work (tests, build, lint, a manual reproduction). If you ran none, write "none".
3) Command output summary: the real output of each command (paste it, do not paraphrase), trimmed to the relevant lines.
4) Exit codes: the exit code of each command (0 = passed). If a command failed, keep its non-zero code and error visible - do NOT hide it.
5) Acceptance mapping: for each acceptance criterion (AC1, AC2, ...), say PASS / FAIL / NOT-VERIFIED and point to the evidence above that backs it.
6) Not verified: everything you could NOT prove (edge cases, things you skipped, criteria with no command behind them).
Do not claim "done" for anything that does not have evidence in this pack.
```

Expected: an Evidence Pack with the six numbered parts above (changed files/diff, commands run, output summary, exit codes, acceptance mapping, not-verified). Keep this whole pack - it is exactly what the next round pressure-tests, and a missing or empty pack is itself a finding in Step 3.
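
To see why the six-part shape matters, it can help to treat the Evidence Pack as data and list its gaps mechanically. The sketch below is an assumption-laden illustration: the `evidenceGaps` function and the pack field names (`commands`, `exitCode`, `output`, `acceptance`) are hypothetical, and the sample command output is made up rather than taken from a real run.

```javascript
// Hypothetical Evidence Pack check; field names are assumptions that mirror
// the numbered parts of the Step 2 prompt.
function evidenceGaps(pack, criteria) {
  if (!pack) return ["no Evidence Pack"];
  const gaps = [];
  if (pack.commands.length === 0) gaps.push("no commands run");
  for (const cmd of pack.commands) {
    if (typeof cmd.exitCode !== "number") gaps.push(`missing exit code: ${cmd.command}`);
    if (!cmd.output) gaps.push(`missing output: ${cmd.command}`);
  }
  for (const id of criteria) {
    const mapping = pack.acceptance[id];
    if (!mapping) gaps.push(`${id} not mapped`);
    else if (mapping.status === "PASS" && !mapping.evidence) {
      gaps.push(`${id} claims PASS with no evidence`);
    }
  }
  return gaps;
}

// Made-up sample pack: AC2 is marked PASS but points at nothing.
const pack = {
  changedFiles: ["src/export.js"],
  commands: [{ command: "npm test", output: "12 passing", exitCode: 0 }],
  acceptance: {
    AC1: { status: "PASS", evidence: "npm test" },
    AC2: { status: "PASS", evidence: null },
  },
  notVerified: ["large files"],
};
const gaps = evidenceGaps(pack, ["AC1", "AC2"]);
console.log(gaps); // [ 'AC2 claims PASS with no evidence' ]
```

A non-empty gap list is the kind of finding the Step 3 reviewer is asked to surface before judging the work itself.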

## Step 3 (3 min, the aha moment) - Independent re-check

Open a fresh chat. Ideally use a different AI brand than the one that did Step 2 - a different model family is the pass most likely to catch what the first one missed. Paste this:

```text
You are an independent reviewer. The work below claims to be done. Assume it is NOT done and prove it from the evidence, not the tone.
Acceptance card: [paste your Step 1 acceptance card]
Evidence Pack under review: [paste the Step 2 Evidence Pack: changed files/diff, commands run, output summary, exit codes, acceptance mapping, not-verified]
Do this, in order:
1) First check the Evidence Pack itself. If there is no Evidence Pack, or it is missing real command output / exit codes, or a claimed PASS has no command behind it, you CANNOT pass the work: return the verdict INSUFFICIENT_EVIDENCE and list exactly what evidence is missing. A confident "done" with no evidence is INSUFFICIENT_EVIDENCE, not pass.
2) For each acceptance criterion, point to the exact line/output in the Evidence Pack that backs it, or say there is no evidence for it.
3) Walk it the way a stranger would actually use it and say exactly where it breaks.
4) List defects by severity, each pinned to a specific location.
5) Pick the verdict: REJECT if an evidence-grounded hard defect exists; INSUFFICIENT_EVIDENCE if the pack cannot support a pass; pass only if every criterion is backed by real evidence.
Return: verdict (pass / REJECT / INSUFFICIENT_EVIDENCE) + defect or missing-evidence list (with locations) + the smallest fix for each + what is still unverified.
```

Expected (the aha): the independent reviewer first weighs your Evidence Pack. If Step 2 handed over a fluent "done" with no real evidence, it returns `INSUFFICIENT_EVIDENCE` and names what is missing; if the evidence exists but a criterion is not actually met, it returns `REJECT` with the defect pinned to a location - on your own task, not a tutorial's. Either way, that is the gap a single fluent chat would have hidden from you: no evidence pack means no pass.
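
The verdict rule in the reviewer prompt reduces to a short decision function. The sketch below is one possible reading, not the package's logic: `chooseVerdict` and its input shape are hypothetical, and checking hard defects before missing evidence is an assumption based on the order the prompt lists the verdicts in.

```javascript
// Hypothetical sketch of the three-way verdict from the Step 3 prompt.
function chooseVerdict({ hardDefects, missingEvidence }) {
  // An evidence-grounded hard defect rejects the work outright.
  if (hardDefects.length > 0) return "REJECT";
  // Without enough evidence the work cannot pass, however confident the claim.
  if (missingEvidence.length > 0) return "INSUFFICIENT_EVIDENCE";
  // Pass only when every criterion is backed and nothing is broken.
  return "pass";
}

console.log(chooseVerdict({ hardDefects: [], missingEvidence: ["exit codes"] }));
// INSUFFICIENT_EVIDENCE
console.log(chooseVerdict({ hardDefects: ["AC2 fails on empty input"], missingEvidence: [] }));
// REJECT
console.log(chooseVerdict({ hardDefects: [], missingEvidence: [] }));
// pass
```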

## Step 4 (2 min) - Close the loop so it compounds

The re-check is the safety net; this step is where the loop starts paying you back. Keep it light - three short cards, not a report. Paste this:

```text
Close out this task in three short cards. Keep each card to a few lines - do NOT write a long report.
1) Handoff card (so the next session or tool resumes without re-explaining), three columns:
- Done: what is finished and evidence-backed.
- To do: what is left.
- Not verified: what was claimed but not proven (carry over anything the re-check flagged).
2) Harvest card: one reusable lesson from this task, as a single sentence I could apply to a future task.
3) Profile candidate (only if one applies): if a stable preference about how I want you to work showed up more than once, propose it as one line, with status `proposed`. Do NOT add it to my long-term profile yet. If nothing stable showed up, say "no profile candidate this time".
```

Expected: a three-column handoff, a one-line harvest lesson, and either one `proposed` profile candidate or an explicit "none". Save the handoff and harvest cards into your workspace (`../handoff/` and `../harvest/`). A profile candidate does NOT go straight into your long-term profile - it lands in `../profile/CANDIDATES.md` as `proposed` first. It only moves into `profile/EXAMPLE.synthetic.md` (or your real profile) after you review it: mark it `confirmed` (use as-is), `edited` (reword first), or `dropped` (discard) in CANDIDATES.md, and only `confirmed`/`edited` ones graduate. That buffer is why one task makes the next one start ahead without an unreviewed guess hardening into a standing rule - you walk away with a re-checked result *and* something reusable, but nothing edits your profile behind your back.

### Profile-candidate buffer (the state machine)

A profile candidate is a guess about a standing preference. An unreviewed guess must not silently become a rule future sessions obey, so candidates move through four states in `../profile/CANDIDATES.md`:

- `proposed` - the AI suggested it this loop; not yet trusted, not in your profile.
- `confirmed` - you reviewed it and it is correct as written; it may now graduate into your profile.
- `edited` - correct after you reword it; the edited line graduates, the original does not.
- `dropped` - you reviewed it and it does not belong; it stays recorded as dropped so it is not re-proposed every loop.

Rule: only `confirmed` and `edited` candidates graduate into your long-term profile, and only after you say so. `proposed` and `dropped` never edit your profile. Open `../profile/CANDIDATES.md` for the table and how to use it.
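
The four states and the graduation rule can be sketched as a tiny state machine. This is an illustration under assumptions, not the package's ledger code: `reviewCandidate`, `graduates`, and the candidate shape are hypothetical, and the sketch assumes a candidate is reviewed exactly once (only `proposed` can move), which the text above does not spell out.

```javascript
// Hypothetical state machine for profile candidates.
// Assumption: only a `proposed` candidate can move; reviewed states are final.
const TRANSITIONS = {
  proposed: ["confirmed", "edited", "dropped"],
  confirmed: [],
  edited: [],
  dropped: [],
};

function reviewCandidate(candidate, nextStatus, editedText) {
  if (!TRANSITIONS[candidate.status].includes(nextStatus)) {
    throw new Error(`cannot move ${candidate.status} -> ${nextStatus}`);
  }
  if (nextStatus === "edited" && !editedText) {
    throw new Error("an edited candidate needs the reworded text");
  }
  // For `edited`, the reworded line replaces the original text.
  const text = nextStatus === "edited" ? editedText : candidate.text;
  return { ...candidate, status: nextStatus, text };
}

// Only confirmed and edited candidates may enter the long-term profile.
function graduates(candidate) {
  return candidate.status === "confirmed" || candidate.status === "edited";
}

const candidate = { text: "ask before irreversible actions", status: "proposed" };
const confirmed = reviewCandidate(candidate, "confirmed");
const dropped = reviewCandidate(candidate, "dropped");
console.log(graduates(candidate), graduates(confirmed), graduates(dropped));
// false true false
```

Keeping `dropped` as a recorded state, rather than deleting the row, is what stops the same guess from being re-proposed every loop.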
|
|
90
|
+
|
|
91
|
+
Prefer to let the tool track this for you instead of hand-editing a table? The same four states are available as commands: `ai-collab learning add --type profile --content "..."` records the candidate (and `--type harvest` records the one-line lesson from card 2), then `ai-collab learning confirm` / `learning edit` / `learning drop` keep, reword, or discard it. Next time you run `ai-collab status`, it echoes back the one preference you most recently confirmed - so the next task literally starts with "still working the way you confirmed last time." Use the table or the commands, whichever you like; they share the same states, so you are never maintaining two systems.
|
|
92
|
+
|
|
93
|
+
## Two-track comparison (optional, makes the point undeniable)
|
|
94
|
+
|
|
95
|
+
Run your task once with no discipline first, then with the loop, and compare:
|
|
96
|
+
|
|
97
|
+
1. Track A (no discipline): in a fresh chat, paste your messy task with no structure and just ask the AI to do it. Save the smooth "Sure, I will do X, Y, Z" reply. That smooth line is your real before-evidence, generated on your own task.
|
|
98
|
+
2. Track B (the loop): the three steps above.
|
|
99
|
+
3. Side by side: ask the AI to put both tracks into one table with four rows - scope, definition of done, completion claim, and what would have been missed. The messy half is real evidence from your own task, not something the tutorial invented.
|
|
100
|
+
|
|
101
|
+
## Want the why behind each step?
|
|
102
|
+
|
|
103
|
+
This walkthrough is the operation card. For the reasoning behind each move and a longer copy-paste sequence to adapt, open `../cookbook/run-a-first-loop.md` (it runs this same loop on your own task and explains why each step exists). To turn Step 3 into a reusable habit on higher-stakes work, see `../cookbook/review-a-half-product.md` and `../mechanisms/dual-guard/README.md`.
|
|
104
|
+
|
|
105
|
+
## Completion check
|
|
106
|
+
|
|
107
|
+
You defined "done" before the work, had the AI do only that, had an independent AI re-check it against evidence, and closed the loop into a handoff card, a one-line harvest lesson, and (if one applied) a profile candidate - all on a real task of your own. You can name the exact place the re-check pointed to, and you leave with a re-checked result, reusable cards, and a habit (define done, do only that, get re-checked, then capture what is reusable) that makes your next task start ahead instead of from scratch.
|
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
# 10-Minute Walkthrough (Demo preview)
|
|
2
|
+
|
|
3
|
+
This is the demo preview: it runs the loop on a prepared case so you can see the flow without pasting anything of your own. To run the same loop on your own real task, use `10-minute-your-task.md` instead (that is the recommended first run). Pick this preview if your task feels too sensitive to paste right now, or you just want to watch the shape of the loop first.
|
|
4
|
+
|
|
5
|
+
Goal: walk one AI collaboration loop end to end on the prepared TaskBoard case, and watch a guard catch a false completion claim that a single agent would have accepted.
|
|
6
|
+
|
|
7
|
+
The case: a user asks an AI to add task reordering to a TaskBoard. The AI says it added mouse and keyboard reorder with tests. The guard proves the keyboard part was never implemented. You will see context, acceptance, first output, guard review, revised output, handoff, and harvest.
|
|
8
|
+
|
|
9
|
+
Everything is local-first and synthetic. You only read and copy files; nothing is uploaded.
|
|
10
|
+
|
|
11
|
+
## Step 1 (1 min) - Open the case
|
|
12
|
+
|
|
13
|
+
Open `../examples/ai-coding-long-task/CASE.md` and read "Confusing raw input" and "Likely single-agent failure". This is the messy request and the answer a raw chat usually gives.
|
|
14
|
+
|
|
15
|
+
Expected: you can say in one line why "I will refactor, add drag, keyboard, polish, and tests" is unsafe (it mixes scope and defines no pass standard).
|
|
16
|
+
|
|
17
|
+
## Step 2 (2 min) - Set context and acceptance
|
|
18
|
+
|
|
19
|
+
Open `../examples/ai-coding-long-task/artifacts/context-package.md`, then `acceptance-card.md`. Copy both into your AI tool together with `../adapters/SHARED_CORE_CONTRACT.md`.
|
|
20
|
+
|
|
21
|
+
Expected: your tool now has five checkable acceptance criteria (AC1 mouse, AC2 keyboard, AC3 tests for both, AC4 data preserved, AC5 visual polish out of scope).
|
|
22
|
+
|
|
23
|
+
## Step 3 (2 min) - Read the first AI output
|
|
24
|
+
|
|
25
|
+
Open `../examples/ai-coding-long-task/artifacts/first-ai-output.md`. Read the completion claim, then the `TaskBoard.tsx` code block.
|
|
26
|
+
|
|
27
|
+
Expected: you can point to the defect yourself. The claim says arrow-key reorder works, but `onKeyDown` (lines 27-30 of that code block) only logs the key and never calls `moveTask`, and the test block has no keyboard test.
|
|
28
|
+
|
|
29
|
+
## Step 4 (2 min) - Run the guard review
|
|
30
|
+
|
|
31
|
+
Open `../examples/ai-coding-long-task/artifacts/guard-review.md`. Optionally paste `first-ai-output.md` plus `../guard/PROMPT.md` into a second AI tool and ask it to review against the acceptance card.
|
|
32
|
+
|
|
33
|
+
Expected: the guard returns a cause-and-effect chain, not a one-line verdict. It cites `first-ai-output.md` lines 27-30 (stub handler) and the missing keyboard test, maps them to AC2 and AC3, and returns reject. Lines 27-30 are the exact location the guard points to, and you will name them again in the completion check.
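
One way to picture "findings mapped to criteria, then a verdict" is the sketch below. It is a hypothetical data shape for illustration: the real guard review is the Markdown artifact, and the function name `guardVerdict` and the `violates` field are assumptions.

```javascript
// Hypothetical sketch: derive the verdict from evidence, never from the claim.
function guardVerdict(criterionIds, findings) {
  const failed = new Map();
  for (const finding of findings) {
    for (const id of finding.violates) {
      if (!criterionIds.includes(id)) throw new Error(`unknown criterion ${id}`);
      if (!failed.has(id)) failed.set(id, []);
      failed.get(id).push(finding.evidence);
    }
  }
  return {
    // Any criterion with a cited violation blocks completion.
    verdict: failed.size === 0 ? 'accept' : 'reject',
    failed: Object.fromEntries(failed),
  };
}

const result = guardVerdict(
  ['AC1', 'AC2', 'AC3', 'AC4', 'AC5'],
  [
    { evidence: 'first-ai-output.md lines 27-30: onKeyDown only logs the key', violates: ['AC2'] },
    { evidence: 'test block has no keyboard test', violates: ['AC3'] },
  ],
);
console.log(result.verdict, Object.keys(result.failed)); // reject [ 'AC2', 'AC3' ]
```

Each failed criterion carries its own evidence string, so the result reads as a chain (evidence, criterion, verdict) rather than a bare "reject".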
|
|
34
|
+
|
|
35
|
+
## Step 5 (2 min) - Read the revised output and close the loop
|
|
36
|
+
|
|
37
|
+
Open `../examples/ai-coding-long-task/artifacts/revised-output.md`, then `handoff-note.md`, then `harvest-seed.md`.
|
|
38
|
+
|
|
39
|
+
Expected: `onKeyDown` now calls `moveTask` for ArrowUp/ArrowDown, a keyboard test was added that fails on the old stub and passes on the fix, the handoff separates done / pending / unverified (visual polish), and the harvest seed is the reusable artifact you keep: verify completion claims with code and test evidence, do not trust a fluent "done".
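
The phrase "fails on the old stub and passes on the fix" can be made concrete with a small sketch. Everything below is a hypothetical stand-in written in plain JavaScript, not the `TaskBoard.tsx` code from the case artifacts; only the names `onKeyDown` and `moveTask` and the ArrowUp/ArrowDown behavior come from the walkthrough.

```javascript
// Hypothetical stand-in: swap a task with its neighbor, staying in bounds.
function moveTask(tasks, index, delta) {
  const target = index + delta;
  if (target < 0 || target >= tasks.length) return tasks;
  const next = tasks.slice();
  [next[index], next[target]] = [next[target], next[index]];
  return next;
}

// Old stub: logs the key and never reorders anything.
function onKeyDownStub(tasks, index, key) {
  console.log('key pressed:', key);
  return tasks;
}

// Fix: ArrowUp and ArrowDown call moveTask.
function onKeyDownFixed(tasks, index, key) {
  if (key === 'ArrowUp') return moveTask(tasks, index, -1);
  if (key === 'ArrowDown') return moveTask(tasks, index, 1);
  return tasks;
}

// Keyboard check: ArrowDown on the first task must move it one slot down.
function keyboardReorderPasses(handler) {
  const after = handler(['a', 'b', 'c'], 0, 'ArrowDown');
  return after.join(',') === 'b,a,c';
}

console.log(keyboardReorderPasses(onKeyDownStub));  // false
console.log(keyboardReorderPasses(onKeyDownFixed)); // true
```

A check that returns `false` for the stub is what turns the fluent "done" into evidence: the same check must exist and pass before the keyboard criterion can be accepted.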
|
|
40
|
+
|
|
41
|
+
## Completion check
|
|
42
|
+
|
|
43
|
+
You have walked context -> acceptance -> first output -> guard -> revised -> handoff -> harvest on one case, you can name the exact line the guard pointed to, and you leave with one reusable artifact (`harvest-seed.md`) you can apply to your own next task.
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
# 30-Minute Walkthrough
|
|
2
|
+
|
|
3
|
+
Goal: adapt one layer to a real task.
|
|
4
|
+
|
|
5
|
+
## Input
|
|
6
|
+
|
|
7
|
+
Choose one current task and redact private identifiers.
|
|
8
|
+
|
|
9
|
+
## Steps
|
|
10
|
+
|
|
11
|
+
1. Open `../context/TEMPLATE.md`.
|
|
12
|
+
2. Fill goal, current state, constraints, facts, assumptions, risks, and open questions.
|
|
13
|
+
3. Open the adapter for your tool in `../adapters/`.
|
|
14
|
+
4. Ask the tool to produce one acceptance card or review note from your context.
|
|
15
|
+
|
|
16
|
+
## Expected output file
|
|
17
|
+
|
|
18
|
+
One completed context package or acceptance card.
|
|
19
|
+
|
|
20
|
+
## Completion check
|
|
21
|
+
|
|
22
|
+
Another session can tell what the task is, what is out of scope, and what evidence is still missing.
|
|
@@ -0,0 +1,27 @@
|
|
|
1
|
+
# 60-Minute Walkthrough
|
|
2
|
+
|
|
3
|
+
Goal: run one complete AI collaboration loop.
|
|
4
|
+
|
|
5
|
+
## Steps
|
|
6
|
+
|
|
7
|
+
1. Fill a light profile.
|
|
8
|
+
2. Package task context.
|
|
9
|
+
3. Define acceptance.
|
|
10
|
+
4. Run one execution prompt.
|
|
11
|
+
5. Challenge the result with guard review.
|
|
12
|
+
6. Write a handoff note.
|
|
13
|
+
7. Extract one harvest seed.
|
|
14
|
+
|
|
15
|
+
## Expected output files
|
|
16
|
+
|
|
17
|
+
- profile card
|
|
18
|
+
- context package
|
|
19
|
+
- acceptance card
|
|
20
|
+
- execution artifact
|
|
21
|
+
- guard review
|
|
22
|
+
- handoff note
|
|
23
|
+
- harvest seed
|
|
24
|
+
|
|
25
|
+
## Completion check
|
|
26
|
+
|
|
27
|
+
The next AI session can resume without asking what happened, and the useful lesson is saved for future reuse.
|
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
# Synthetic Loop Transcript
|
|
2
|
+
|
|
3
|
+
This transcript demonstrates one complete loop using `ai-coding-long-task`.
|
|
4
|
+
|
|
5
|
+
## Goal
|
|
6
|
+
|
|
7
|
+
Show that one user can move from a messy starting point to context, acceptance, execution, guard review, handoff, and harvest without relying on raw chat memory.
|
|
8
|
+
|
|
9
|
+
## Expected output
|
|
10
|
+
|
|
11
|
+
A complete artifact chain: context package, acceptance card, execution request, guard review result, handoff note, harvest seed, and a short comparison against a single raw AI chat.
|
|
12
|
+
|
|
13
|
+
## User
|
|
14
|
+
|
|
15
|
+
A developer asks an assistant to refactor a small task board, then keeps adding bugs, design requests, accessibility requests, and test fixes across multiple sessions. Each new chat forgets which tradeoffs were rejected, whether keyboard movement is required, and which visual polish is out of scope.
|
|
16
|
+
|
|
17
|
+
## Context package
|
|
18
|
+
|
|
19
|
+
Profile: prefers direct bug risk calls, small verified steps, and no silent scope expansion. Context: synthetic task board, local-only, no auth, no deployment, existing task data must survive, keyboard accessibility matters, visual redesign is not in scope.
|
|
20
|
+
|
|
21
|
+
## Acceptance card
|
|
22
|
+
|
|
23
|
+
Done means the board preserves existing task data, supports drag and keyboard reorder, has tests for both flows, reports changed files and verification output, and leaves a handoff note listing visual polish as unverified rather than done.
|
|
24
|
+
|
|
25
|
+
## Execution request
|
|
26
|
+
|
|
27
|
+
Implement only the reorder behavior described in the acceptance card. Keep the existing data shape. Do not redesign the board. After code, report changed files, tests run, failures, and unverified areas.
|
|
28
|
+
|
|
29
|
+
## Guard review result
|
|
30
|
+
|
|
31
|
+
Guard finds that mouse reorder was tested but keyboard movement lacks evidence. It rejects completion until a keyboard reorder test exists and the handoff labels visual polish as unverified.
|
|
32
|
+
|
|
33
|
+
## Handoff note
|
|
34
|
+
|
|
35
|
+
Current state: mouse drag and keyboard arrow-key reorder are both implemented and covered by tests (2 passing), and the guard re-review accepted the fix. Completed: data shape preserved; keyboard reorder implemented and tested. Pending: only visual polish for the reorder affordance, carried as unverified. Next action: pick up the visual polish, not the keyboard work.
|
|
36
|
+
|
|
37
|
+
## Harvest seed
|
|
38
|
+
|
|
39
|
+
Reusable pattern: long coding tasks need an acceptance card before implementation, a guard pass before handoff, and an explicit unverified bucket for visual polish. Do not generalize the synthetic task board data model.
|
|
40
|
+
|
|
41
|
+
## Difference from raw chat
|
|
42
|
+
|
|
43
|
+
A raw chat produces a plausible refactor plan but loses rejected scope and unverified accessibility work. The six-layer workspace keeps the goal, done standard, review finding, next action, and reusable lesson visible.
|
package/CHANGELOG.md
ADDED
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
This project's source is on GitHub with CI green, but it is **not published to npm**. The version
|
|
4
|
+
below is prepared and unpublished; see [Release Status](./README.md#release-status) for the
|
|
5
|
+
four-state ladder.
|
|
6
|
+
|
|
7
|
+
## 0.1.0 — Unreleased (GitHub source on `main`, CI green)
|
|
8
|
+
|
|
9
|
+
Status: source pushed to GitHub on `main` with CI green; **not git-tagged and not published to
|
|
10
|
+
npm**. No release date is claimed because no npm release has happened. The date here will be
|
|
11
|
+
filled in only when the package is actually published.
|
|
12
|
+
|
|
13
|
+
- Hardened CI: Node 18/20/22 matrix running `npm ci`, `npm run check`, `npm pack`, a fresh-tarball
|
|
14
|
+
install smoke test (install the packed artifact into a clean dir and run the installed `ai-collab`
|
|
15
|
+
bin), and a CLI smoke (init/guide/demo/check), so CI blocks "installs but does not run" and broken
|
|
16
|
+
CLI commands.
|
|
17
|
+
- Rebuilt the generated prompt and skill library as distinct capability packages.
|
|
18
|
+
- Added a flagship synthetic case showing messy input, baseline raw AI output, six-layer intervention, artifacts, comparison, and next step.
|
|
19
|
+
- Hardened CLI first-run behavior: real `ai-collab` commands, `init --dry-run`, required `--target`, required `--workspace`, JSON output, and bin-entry tests.
|
|
20
|
+
- Changed `--force` behavior to back up existing `.aict` content before replacement.
|
|
21
|
+
- Added adapter guidance installer for Codex, Claude Code, Cursor, GitHub Copilot, Cline, and Windsurf.
|
|
22
|
+
- Added privacy scanner coverage for common token, email, and local-path leaks.
|
|
23
|
+
- Added CI, issue templates, PR template, release checklist, and packed-package checks.
|
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
# Code of Conduct
|
|
2
|
+
|
|
3
|
+
This project is for building safer, more inspectable AI collaboration workflows.
|
|
4
|
+
|
|
5
|
+
## Expected behavior
|
|
6
|
+
|
|
7
|
+
- Be concrete and evidence-based.
|
|
8
|
+
- Respect privacy boundaries.
|
|
9
|
+
- Use synthetic examples for public discussion.
|
|
10
|
+
- Separate facts, assumptions, and opinions.
|
|
11
|
+
- Challenge claims without attacking people.
|
|
12
|
+
|
|
13
|
+
## Unacceptable behavior
|
|
14
|
+
|
|
15
|
+
- Sharing private user material without consent.
|
|
16
|
+
- Publishing secrets, local paths, or raw private conversations.
|
|
17
|
+
- Harassment or personal attacks.
|
|
18
|
+
- Misrepresenting paid services as required for the open method.
|
|
19
|
+
|
|
20
|
+
Maintainers may remove issues, discussions, or contributions that violate these boundaries.
|
package/CONTRIBUTING.md
ADDED
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# Contributing
|
|
2
|
+
|
|
3
|
+
This project is a local-first open collaboration workspace. Contributions should make the generated workspace more usable, safer to publish, or easier to verify.
|
|
4
|
+
|
|
5
|
+
## Good contributions
|
|
6
|
+
|
|
7
|
+
- Better synthetic cases.
|
|
8
|
+
- Clearer templates.
|
|
9
|
+
- More precise adapter instructions.
|
|
10
|
+
- Stronger privacy checks.
|
|
11
|
+
- Better contract tests.
|
|
12
|
+
- Bilingual clarity improvements.
|
|
13
|
+
|
|
14
|
+
## Not a fit
|
|
15
|
+
|
|
16
|
+
- Uploading user content by default.
|
|
17
|
+
- Adding hidden scoring or private calibration.
|
|
18
|
+
- Turning the CLI into a hosted assistant.
|
|
19
|
+
- Making paid help necessary for the generic loop.
|
|
20
|
+
- Copying private workflows or real user material into examples.
|
|
21
|
+
|
|
22
|
+
## Before opening a change
|
|
23
|
+
|
|
24
|
+
Run:
|
|
25
|
+
|
|
26
|
+
```bash
|
|
27
|
+
npm run check
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
The check must pass before a change can be treated as release-ready.
|
|
@@ -0,0 +1,54 @@
|
|
|
1
|
+
# Known Limitations
|
|
2
|
+
|
|
3
|
+
Honest, known boundaries of this tool. These are real and documented on purpose — the
|
|
4
|
+
whole point of the project is to not pretend a "done" (or a tool) is more than it is, and
|
|
5
|
+
that honesty applies to the tool itself. Nothing here is a secret failure mode; each entry
|
|
6
|
+
says what the limit is, when it can bite, and what catches or works around it.
|
|
7
|
+
|
|
8
|
+
## Concurrent writes and duplicate ids — now mitigated by a file lock
|
|
9
|
+
|
|
10
|
+
**What it is.** Each ledger id (`t1`, `e2`, `c1`, `r1`, …) is allocated by reading the
|
|
11
|
+
existing rows, taking the highest numeric suffix, and returning the next one (`nextId` in
|
|
12
|
+
`src/ledger.js`). That is a *read-then-write*: the new id is decided from what is on disk at
|
|
13
|
+
read time, then appended. Historically, if two writes ran **at the same moment** against the
|
|
14
|
+
**same workspace** (e.g. two CLI processes, or two AI tools driving the same `.aict/` in
|
|
15
|
+
parallel), both could read the same highest id and then both append the same next one —
|
|
16
|
+
producing two rows with the same id in one ledger.
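
The read-then-write hazard is easy to reproduce in miniature. The sketch below is a hypothetical reimplementation for illustration, not the `nextId` in `src/ledger.js`; the row shape (`{ id }`) and the prefix argument are assumptions.

```javascript
// Hypothetical sketch: next id = highest numeric suffix seen + 1.
function nextId(rows, prefix) {
  let highest = 0;
  for (const row of rows) {
    if (!row.id.startsWith(prefix)) continue;
    const n = Number(row.id.slice(prefix.length));
    if (Number.isInteger(n) && n > highest) highest = n;
  }
  return `${prefix}${highest + 1}`;
}

// Two writers that both read before either appends decide on the same id.
const ledger = [{ id: 't1' }, { id: 't2' }];
const snapshotA = ledger.slice(); // writer A reads
const snapshotB = ledger.slice(); // writer B reads the same state
const idA = nextId(snapshotA, 't');
const idB = nextId(snapshotB, 't');
ledger.push({ id: idA }, { id: idB });
console.log(idA, idB); // t3 t3
```

Nothing in `nextId` itself is wrong; the duplicate comes purely from the gap between the read and the append, which is why the fix has to wrap both steps.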
|
|
17
|
+
|
|
18
|
+
**Mitigation (file lock).** The id-allocation path is now serialized with a short on-disk
|
|
19
|
+
mutex. `withLedgerLock(stateDir, fn)` in `src/ledger.js` creates a lock file with
|
|
20
|
+
`openSync(lockPath, 'wx')` (the `O_EXCL` "create only, fail if it exists" flag), so exactly
|
|
21
|
+
one process holds it at a time; a loser retries with a small backoff (~25 ms) up to a ~5 s
|
|
22
|
+
timeout, and a **stale** lock left by a crashed process is reclaimed once it is older than
|
|
23
|
+
~10 s so the ledgers can never wedge permanently. The whole *read → compute next id → append*
|
|
24
|
+
(and run finish's *read-all → patch → rewrite*) happens **inside** the lock, with the ledger
|
|
25
|
+
re-read after the lock is held, so concurrent writers each see every row the others already
|
|
26
|
+
committed and cannot mint the same id. Verified by a test that spawns 15 truly parallel
|
|
27
|
+
`task create`s and asserts zero duplicate ids (`tests/contract.test.js`, "B6a-2"); with the
|
|
28
|
+
lock disabled the same test reliably fails, which is how we know the lock — not luck — is
|
|
29
|
+
doing the work.
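
A minimal sketch of this kind of lock, assuming a synchronous caller, might look like the following. It is not the `withLedgerLock` from `src/ledger.js`: the function name `withLockSketch`, the lock file name, the `Atomics.wait` sleep, and the use of the file's mtime to judge staleness are all assumptions; only the `'wx'` flag and the approximate 25 ms / 5 s / 10 s figures come from the description above.

```javascript
const fs = require('node:fs');
const path = require('node:path');

const RETRY_MS = 25;     // backoff between attempts
const TIMEOUT_MS = 5000; // give up after roughly 5 s
const STALE_MS = 10000;  // reclaim locks older than roughly 10 s

// Blocking sleep for a synchronous retry loop (assumed approach).
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

function withLockSketch(stateDir, fn) {
  const lockPath = path.join(stateDir, '.ledger.lock'); // hypothetical file name
  const deadline = Date.now() + TIMEOUT_MS;
  for (;;) {
    try {
      // 'wx' creates the file exclusively and throws EEXIST if it already exists.
      fs.closeSync(fs.openSync(lockPath, 'wx'));
      break;
    } catch (err) {
      if (err.code !== 'EEXIST') throw err;
      try {
        const ageMs = Date.now() - fs.statSync(lockPath).mtimeMs;
        if (ageMs > STALE_MS) {
          fs.unlinkSync(lockPath); // stale lock from a crashed holder
          continue;
        }
      } catch (statErr) {
        if (statErr.code !== 'ENOENT') throw statErr;
        continue; // lock vanished between attempts: retry immediately
      }
      if (Date.now() > deadline) throw new Error(`lock timeout: ${lockPath}`);
      sleepSync(RETRY_MS);
    }
  }
  try {
    return fn(); // read, compute next id, append: all while holding the lock
  } finally {
    fs.rmSync(lockPath, { force: true });
  }
}
```

The important design point is that the ledger is read inside `fn`, after the lock is held; reading first and locking only the append would reintroduce the original race.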
|
|
30
|
+
|
|
31
|
+
**Residual edge.** This is a best-effort local lock, not a distributed transaction. The
|
|
32
|
+
stale-lock reclamation window means a process paused (e.g. swapped out / suspended) for longer
|
|
33
|
+
than ~10 s while mid-write could in theory have its lock stolen — vanishingly unlikely for the
|
|
34
|
+
sub-millisecond ledger writes here, but not a hard mathematical guarantee. Networked or
|
|
35
|
+
case-insensitive filesystems with unusual `O_EXCL` semantics are likewise out of scope.
|
|
36
|
+
|
|
37
|
+
**What still catches anything that slips through.** The integrity check remains the backstop.
|
|
38
|
+
`node bin/ai-collab.js check --workspace <dir>/.aict` (also run by `npm run check`) validates
|
|
39
|
+
per-ledger id integrity and fails loudly with, for example:
|
|
40
|
+
|
|
41
|
+
```text
|
|
42
|
+
Contract check failed:
|
|
43
|
+
- ledger tasks.jsonl has duplicate id "t1"
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
So even in the residual edge, a duplicate cannot silently corrupt your trust trail: the next
|
|
47
|
+
`check` surfaces it with a pointable file + id. (The guard-level and acceptance logic also
|
|
48
|
+
recompute from evidence rather than trusting a stored field, so a duplicated row cannot quietly
|
|
49
|
+
upgrade a result.)
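
The backstop itself is a simple scan. The sketch below is a hypothetical checker, not the project's `check` command: the function name `findDuplicateIds` is an assumption, and the message wording is copied from the example output above only so the two are easy to compare.

```javascript
// Hypothetical sketch: report each id that appears more than once in a JSON-lines ledger.
function findDuplicateIds(ledgerName, jsonlText) {
  const seen = new Set();
  const reported = new Set();
  const errors = [];
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue; // tolerate a trailing newline
    const { id } = JSON.parse(line);
    if (seen.has(id) && !reported.has(id)) {
      reported.add(id);
      errors.push(`ledger ${ledgerName} has duplicate id "${id}"`);
    }
    seen.add(id);
  }
  return errors;
}

const sample = '{"id":"t1"}\n{"id":"t2"}\n{"id":"t1"}\n';
console.log(findDuplicateIds('tasks.jsonl', sample));
// [ 'ledger tasks.jsonl has duplicate id "t1"' ]
```

Because the scan only needs the file contents, it works the same whether the duplicate came from a stolen lock, a hand edit, or a merge of two workspaces.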
|
|
50
|
+
|
|
51
|
+
**How to recover.** If `check` ever reports a duplicate id, open the named `.jsonl` ledger
|
|
52
|
+
(plain JSON-lines, one record per line) and remove or renumber the offending duplicate row,
|
|
53
|
+
then re-run `check` to confirm it is clean. The ledgers remain plain, hand-inspectable files;
|
|
54
|
+
the lock is a thin coordination layer around the id allocation, not an opaque database.
|