@hallucination-studio/harness-engine 1.0.0-beta.11.2a4849a → 1.0.0-beta.13.cf40fab

@@ -0,0 +1,6 @@
+ {
+   "name": "harness-engine",
+   "version": "1.0.0",
+   "description": "Repository harness skill for Codex with Google DESIGN.md control-plane guidance.",
+   "skills": "./skills/"
+ }
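The manifest added above is small enough to check mechanically. The following is a minimal Python sketch, not code from the package: `check_manifest` is a hypothetical helper, and the only rule taken from this diff is that the eval further down expects `skills` to equal `./skills/`.

```python
import json

# Hypothetical helper, not part of harness-engine: returns a list of problems
# found in a parsed plugin manifest (an empty list means nothing was flagged).
def check_manifest(manifest: dict) -> list:
    problems = []
    for key in ("name", "version", "description", "skills"):
        if not isinstance(manifest.get(key), str) or not manifest[key]:
            problems.append(f"missing or empty field: {key}")
    # The eval in this diff compares the skills path against "./skills/".
    if manifest.get("skills") != "./skills/":
        problems.append('skills should be "./skills/"')
    return problems

manifest = json.loads("""
{
  "name": "harness-engine",
  "version": "1.0.0",
  "description": "Repository harness skill for Codex with Google DESIGN.md control-plane guidance.",
  "skills": "./skills/"
}
""")
print(check_manifest(manifest))
```

The required-field list is an assumption made for illustration; the diff itself only shows that these four fields are present.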
package/README.md CHANGED
@@ -24,6 +24,7 @@ ask for missing high-impact facts, create the harness files, and keep future wor
  - Supports durable knowledge closure with stable knowledge IDs and evidence text, so permanent docs can use natural wording instead of duplicated checklist strings.
  - Enforces a local quality gate for execution plans; failed scores write `## Rework Required` into the plan and block `plan-close`.
  - Tracks resumable workstreams so interrupted features, refactors, reliability work, and cleanup efforts can be recovered from repo state instead of chat history.
+ - Generates a frontend/design control plane that tells target projects to own `docs/DESIGN.md` and validate or export it with the official `@google/design.md` package.

  ## Why It Exists

@@ -65,22 +66,51 @@ Install into a custom skills directory:
  npx @hallucination-studio/harness-engine install --path /path/to/skills
  ```

- Replace an existing installed skill:
+ Replace an existing installed plugin bundle:

  ```bash
  npx @hallucination-studio/harness-engine install --local --force
  ```

- Show where the skill would be installed:
+ Show where the plugin bundle would be installed:

  ```bash
  npx @hallucination-studio/harness-engine where --local
  ```

+ ## Target Project Dependency: Google DESIGN.md
+
+ Harness Engine depends on the official Google DESIGN.md workflow for frontend style creation, but
+ does not bundle Google source code or install Google's package for the user. Install Google
+ DESIGN.md as a dev dependency in each target project that needs frontend style creation,
+ validation, diffs, or token exports:
+
+ ```bash
+ npm install --save-dev @google/design.md
+ ```
+
+ Use Google/Stitch to create the real `docs/DESIGN.md` through one of Google's documented paths:
+
+ - Create from a prompt in Stitch by describing the intended vibe, product, audience, and interaction feel.
+ - Derive from branding in Stitch by providing a brand URL or image.
+ - Write it by hand as markdown with optional YAML frontmatter.
+
+ Then validate or export from the target repository:
+
+ ```bash
+ npx @google/design.md lint docs/DESIGN.md
+ npx @google/design.md export docs/DESIGN.md --format css-tailwind
+ npx @google/design.md diff docs/DESIGN.md docs/DESIGN.next.md
+ ```
+
+ Harness Engine's role is to generate the control plane: `docs/FRONTEND.md` tells agents to read
+ `docs/DESIGN.md`, defines which project files are controlled by it, and blocks treating the
+ placeholder `status: design-source-required` DESIGN.md as an approved visual style.
+
  ## Update An Installed Skill Package

- The `npx` installer only installs or replaces the Codex skill package. To update an already
- installed skill, rerun `install` with `--force` in the same install location.
+ The `npx` installer installs or replaces the Codex plugin bundle and compatibility skill entries.
+ To update an already installed bundle, rerun `install` with `--force` in the same install location.

  Replace the local skill install:

@@ -149,6 +179,27 @@ The installed skill exposes the underlying script at:
  python3 .codex/skills/harness-engine/scripts/manage_harness.py --help
  ```

+ For frontend or visual-design work, the generated harness uses `docs/FRONTEND.md` to route agents through `docs/DESIGN.md`. Harness Engine does not generate style, choose themes, extract branding, or vendor Google DESIGN.md code.
+
+ Create the real `docs/DESIGN.md` through one of Google's documented paths:
+
+ - Create from a prompt in Stitch by describing the intended vibe, product, audience, and interaction feel.
+ - Derive from branding in Stitch by providing a brand URL or image.
+ - Write it by hand as markdown with optional YAML frontmatter.
+
+ Use Google's examples as references, not vendored source: `https://github.com/google-labs-code/design.md/tree/main/examples`.
+
+ Install the official package in the target project when the project wants DESIGN.md validation, diffs, or token exports:
+
+ ```bash
+ npm install --save-dev @google/design.md
+ npx @google/design.md lint docs/DESIGN.md
+ npx @google/design.md export docs/DESIGN.md --format css-tailwind
+ npx @google/design.md diff docs/DESIGN.md docs/DESIGN.next.md
+ ```
+
+ `docs/FRONTEND.md` defines which files are controlled by `docs/DESIGN.md`: generated token exports under `docs/design-docs/` or `src/styles/`, Tailwind theme files, global CSS variables, component theme modules, Storybook/theme previews, and UI implementation files that consume those tokens. Agents should read `docs/FRONTEND.md`, then `docs/DESIGN.md`, then generated token exports before changing controlled UI files.
+
  Common commands:

  ```bash
@@ -268,6 +319,15 @@ Check npm package contents:
  npm run pack:check
  ```

+ Before release, run:
+
+ ```bash
+ npm test
+ npm run smoke:install
+ npm run pack:check
+ git diff --check
+ ```
+
  The publish workflows expect an npm token when trusted publishing is not yet configured:

  ```text
package/bin/install.js CHANGED
@@ -5,8 +5,8 @@ const os = require("os");
  const path = require("path");

  const PACKAGE_ROOT = path.resolve(__dirname, "..");
- const SKILL_NAME = "harness-engine";
- const SOURCE_SKILL_DIR = path.join(PACKAGE_ROOT, "skills", SKILL_NAME);
+ const BUNDLE_NAME = "harness-engine-plugin";
+ const BUNDLE_ENTRIES = [".codex-plugin", "skills"];

  function printHelp() {
    console.log(`harness-engine
@@ -19,7 +19,7 @@ Options:
  --local Install into <cwd>/.codex/skills
  --global Install into \${CODEX_HOME:-~/.codex}/skills
  --path <dir> Install into a custom skills directory
- --force Replace an existing installed skill
+ --force Replace an existing installed bundle
  -h, --help Show this help text
  `);
  }
@@ -85,32 +85,71 @@ function copyDir(sourceDir, targetDir) {
    for (const entry of fs.readdirSync(sourceDir, { withFileTypes: true })) {
      const sourcePath = path.join(sourceDir, entry.name);
      const targetPath = path.join(targetDir, entry.name);
-     if (entry.isDirectory()) {
+     const stat = fs.statSync(sourcePath);
+     if (stat.isDirectory()) {
        copyDir(sourcePath, targetPath);
+     } else if (entry.isSymbolicLink()) {
+       const linkTarget = fs.readlinkSync(sourcePath);
+       fs.symlinkSync(linkTarget, targetPath);
      } else {
        fs.copyFileSync(sourcePath, targetPath);
-       const stat = fs.statSync(sourcePath);
        fs.chmodSync(targetPath, stat.mode);
      }
    }
  }

- function installSkill(destinationDir, force) {
-   const skillTargetDir = path.join(destinationDir, SKILL_NAME);
-   if (!fs.existsSync(SOURCE_SKILL_DIR)) {
-     throw new Error(`Bundled skill not found: ${SOURCE_SKILL_DIR}`);
+ function copyEntry(sourcePath, targetPath) {
+   const stat = fs.lstatSync(sourcePath);
+   if (stat.isDirectory()) {
+     copyDir(sourcePath, targetPath);
+   } else if (stat.isSymbolicLink()) {
+     fs.symlinkSync(fs.readlinkSync(sourcePath), targetPath);
+   } else {
+     fs.mkdirSync(path.dirname(targetPath), { recursive: true });
+     fs.copyFileSync(sourcePath, targetPath);
+     fs.chmodSync(targetPath, fs.statSync(sourcePath).mode);
    }
+ }

-   if (fs.existsSync(skillTargetDir)) {
-     if (!force) {
-       throw new Error(`Skill already exists at ${skillTargetDir}. Re-run with --force to replace it.`);
+ function assertBundleSources() {
+   for (const entry of BUNDLE_ENTRIES) {
+     const sourcePath = path.join(PACKAGE_ROOT, entry);
+     if (!fs.existsSync(sourcePath)) {
+       throw new Error(`Bundled plugin entry not found: ${sourcePath}`);
      }
-     fs.rmSync(skillTargetDir, { recursive: true, force: true });
+   }
+ }
+
+ function removeIfExists(targetPath, force, label) {
+   if (!fs.existsSync(targetPath)) {
+     return;
+   }
+
+   if (!force) {
+     throw new Error(`${label} already exists at ${targetPath}. Re-run with --force to replace it.`);
    }

+   fs.rmSync(targetPath, { recursive: true, force: true });
+ }
+
+ function installBundle(destinationDir, force) {
+   assertBundleSources();
    fs.mkdirSync(destinationDir, { recursive: true });
-   copyDir(SOURCE_SKILL_DIR, skillTargetDir);
-   return skillTargetDir;
+   const bundleTargetDir = path.join(destinationDir, BUNDLE_NAME);
+   removeIfExists(bundleTargetDir, force, "Plugin bundle");
+
+   fs.mkdirSync(bundleTargetDir, { recursive: true });
+   for (const entry of BUNDLE_ENTRIES) {
+     copyEntry(path.join(PACKAGE_ROOT, entry), path.join(bundleTargetDir, entry));
+   }
+
+   // Compatibility: older users invoke $harness-engine from a normal skills directory.
+   // Keep a top-level skill copy in place while the plugin root carries the bundle.
+   const compatTarget = path.join(destinationDir, "harness-engine");
+   removeIfExists(compatTarget, force, "Compatibility skill");
+   copyDir(path.join(PACKAGE_ROOT, "skills", "harness-engine"), compatTarget);
+
+   return bundleTargetDir;
  }

  function main() {
@@ -131,7 +170,7 @@ function main() {
    const destinationDir = resolveSkillsDir(args.mode, args.customPath);

    if (args.command === "where") {
-     console.log(path.join(destinationDir, SKILL_NAME));
+     console.log(path.join(destinationDir, BUNDLE_NAME));
      return;
    }

@@ -142,8 +181,8 @@ function main() {
    }

    try {
-     const installedPath = installSkill(destinationDir, args.force);
-     console.log(`Installed ${SKILL_NAME} to ${installedPath}`);
+     const installedPath = installBundle(destinationDir, args.force);
+     console.log(`Installed ${BUNDLE_NAME} plugin bundle to ${installedPath}`);
      console.log("Invoke it in Codex with $harness-engine.");
    } catch (error) {
      console.error(`Install failed: ${error.message}`);
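The installer changes above combine two ideas: copying entries while keeping symlinks as symlinks, and refusing to overwrite an existing target unless `--force` is given. The Python sketch below illustrates both under stated assumptions; it is not a port of `install.js`, and `copy_entry` and `remove_if_exists` are hypothetical names. One deliberate difference: the sketch inspects each entry with `lstat` before anything else, whereas the new `copyDir` calls `fs.statSync`, which follows links, so a symlink pointing at a directory would take the directory branch there.

```python
import os
import shutil
import stat
import tempfile

def copy_entry(source: str, target: str) -> None:
    """Copy a file, directory, or symlink without following symlinks."""
    info = os.lstat(source)  # lstat describes the link itself, not its target
    if stat.S_ISLNK(info.st_mode):
        os.symlink(os.readlink(source), target)
    elif stat.S_ISDIR(info.st_mode):
        os.makedirs(target, exist_ok=True)
        for name in os.listdir(source):
            copy_entry(os.path.join(source, name), os.path.join(target, name))
    else:
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        shutil.copyfile(source, target)
        os.chmod(target, stat.S_IMODE(info.st_mode))  # keep permission bits

def remove_if_exists(target: str, force: bool, label: str) -> None:
    """Refuse to replace an existing target unless force is set."""
    if not os.path.lexists(target):
        return
    if not force:
        raise FileExistsError(
            f"{label} already exists at {target}. Re-run with --force to replace it."
        )
    if os.path.isdir(target) and not os.path.islink(target):
        shutil.rmtree(target)
    else:
        os.remove(target)

# Demo on a throwaway directory tree (all paths here are made up).
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src")
    os.makedirs(os.path.join(src, "skills"))
    with open(os.path.join(src, "skills", "SKILL.md"), "w") as handle:
        handle.write("demo\n")
    os.symlink("SKILL.md", os.path.join(src, "skills", "alias.md"))

    dst = os.path.join(root, "dst")
    remove_if_exists(dst, force=False, label="Plugin bundle")  # nothing there yet
    copy_entry(src, dst)
    print(sorted(os.listdir(os.path.join(dst, "skills"))))
    print(os.path.islink(os.path.join(dst, "skills", "alias.md")))
```

Checking for the link first is a design choice of this sketch: it keeps dangling links and links to directories intact, at the cost of not copying what they point to.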
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@hallucination-studio/harness-engine",
-   "version": "1.0.0-beta.11.2a4849a",
+   "version": "1.0.0-beta.13.cf40fab",
    "description": "Install the harness-engine Codex skill for initializing and reconciling advanced repository harness docs.",
    "repository": {
      "type": "git",
@@ -19,6 +19,7 @@
    },
    "files": [
      "bin",
+     ".codex-plugin/**",
      "skills/**/SKILL.md",
      "skills/**/agents/**",
      "skills/**/assets/**",
@@ -12,24 +12,25 @@ Run the packaged script to inspect the target repository before editing files. U
  1. Run `python3 scripts/manage_harness.py analyze --repo <target-repo> --output <analysis.json>`.
  2. Read `analysis.json`.
  3. Ask the human only the unresolved, high-impact questions from `human_confirmations`.
- 4. Run `python3 scripts/manage_harness.py sample-answers --analysis <analysis.json> --output <answers.json>`.
- 5. Fill the placeholders in `answers.json` from the repository and the human's confirmed answers.
- 6. Run `python3 scripts/manage_harness.py init --repo <target-repo> --answers <answers.json>`. This is the single workspace entrypoint: it creates a new harness when none exists, and reconciles a managed or partial harness when managed harness files are already present. Reconcile refreshes managed files, backfills newly introduced managed files, and preserves unmanaged user files. Pass `--force` only with explicit user approval.
- 7. If the task is multi-step, run `python3 scripts/manage_harness.py plan-start --repo <target-repo> --slug <task-name> --goal "<goal>"`.
- 8. If you learn durable facts during the work, run `python3 scripts/manage_harness.py knowledge-log --repo <target-repo> --plan <plan-file> --fact "<fact>" --destination <durable-doc>` and keep the returned `id`. Use `--fact-file <file>` when the fact contains shell-sensitive characters.
- 9. Before closing the task, write those facts into their durable docs.
- 10. Run `python3 scripts/manage_harness.py knowledge-mark-written --repo <target-repo> --plan <plan-file> --id <knowledge-id> --evidence "<verbatim text already in durable doc>"`; prefer `--evidence-file <file>` when evidence contains backticks, globs, quotes, pipes, or other shell-sensitive characters. Evidence must be copied from the destination doc, not summarized. Use `--append` only when the exact fact should be appended mechanically.
- 11. If validation, evals, browser checks, or code review reveal a bug, immediately run `python3 scripts/manage_harness.py defect-log --repo <target-repo> --plan <plan-file> --severity <P0|P1|P2|P3> --summary "<bug>" --evidence "<failing check>"`. This forces the quality gate to fail.
- 12. Fix logged defects, then run `python3 scripts/manage_harness.py defect-resolve --repo <target-repo> --plan <plan-file> --id <bug-id> --fix-evidence "<passing check or code evidence>"`.
- 13. Score the finished work with `python3 scripts/manage_harness.py quality-score --repo <target-repo> --plan <plan-file> --product-correctness <0-10> --product-note "<evidence>" --ux-operator-clarity <0-10> --ux-note "<evidence>" --architecture-maintainability <0-10> --architecture-note "<evidence>" --reliability-observability <0-10> --reliability-note "<evidence>" --security-data-handling <0-10> --security-note "<evidence>"`. Every dimension needs an evidence note.
- 14. If `quality-score` fails, treat `## Rework Required` in the plan as the next implementation input, fix the work, then run `quality-score` again.
- 15. For phased or resumable work, run `python3 scripts/manage_harness.py phase-set --repo <target-repo> --plan <plan-file> --mode <multi-phase|paused|completed|stopped> --workstream <id> --current-phase <n> --continuation <target> --next-action "<next action>"`, then update `workstreams.md` with `workstream-upsert`.
- 16. Before closing, replace generic plan placeholders with task-specific scope, constraints, steps, validation, and completion notes; leave no open durable-knowledge placeholder except the default unused line.
- 17. Close the plan with `python3 scripts/manage_harness.py plan-close --repo <target-repo> --plan <plan-file> --summary "<summary>"`.
- 18. Before handoff, run `python3 .codex/skills/harness-engine/scripts/manage_harness.py check --repo <target-repo>` from an installed target repository.
- 19. To review stale generated evidence, run `python3 scripts/manage_harness.py evidence-prune --repo <target-repo>` first; it is dry-run by default. Add `--apply` only after checking the candidate list.
- 20. To clean transient harness runtime files or remove already committed runtime files from the remote, run `python3 scripts/manage_harness.py clean --repo <target-repo>` first; it is dry-run by default. Add `--apply` to clean local runtime state, update `.gitignore`, and stage `git rm --cached` removals, then commit and push.
- 21. After changing this skill, run `python3 evals/run_evals.py` and iterate until it passes.
+ 4. During initialization, create `docs/DESIGN.md`, `docs/FRONTEND.md`, and `docs/design-docs/style-options.md` as the target repository's design control plane. The target project owns `docs/DESIGN.md` and must create the real design system through an official Google DESIGN.md path: prompt in Stitch, brand URL/image import in Stitch, or hand-authored markdown/YAML. Harness-engine does not generate style, choose themes, extract branding, or vendor Google DESIGN.md source.
+ 5. Run `python3 scripts/manage_harness.py sample-answers --analysis <analysis.json> --output <answers.json>`.
+ 6. Fill the placeholders in `answers.json` from the repository and the human's confirmed answers.
+ 7. Run `python3 scripts/manage_harness.py init --repo <target-repo> --answers <answers.json>`. This is the single workspace entrypoint: it creates a new harness when none exists, and reconciles a managed or partial harness when managed harness files are already present. Reconcile refreshes managed files, backfills newly introduced managed files, and preserves unmanaged user files. Pass `--force` only with explicit user approval.
+ 8. If the task is multi-step, run `python3 scripts/manage_harness.py plan-start --repo <target-repo> --slug <task-name> --goal "<goal>"`.
+ 9. If you learn durable facts during the work, run `python3 scripts/manage_harness.py knowledge-log --repo <target-repo> --plan <plan-file> --fact "<fact>" --destination <durable-doc>` and keep the returned `id`. Use `--fact-file <file>` when the fact contains shell-sensitive characters.
+ 10. Before closing the task, write those facts into their durable docs.
+ 11. Run `python3 scripts/manage_harness.py knowledge-mark-written --repo <target-repo> --plan <plan-file> --id <knowledge-id> --evidence "<verbatim text already in durable doc>"`; prefer `--evidence-file <file>` when evidence contains backticks, globs, quotes, pipes, or other shell-sensitive characters. Evidence must be copied from the destination doc, not summarized. Use `--append` only when the exact fact should be appended mechanically.
+ 12. If validation, evals, browser checks, or code review reveal a bug, immediately run `python3 scripts/manage_harness.py defect-log --repo <target-repo> --plan <plan-file> --severity <P0|P1|P2|P3> --summary "<bug>" --evidence "<failing check>"`. This forces the quality gate to fail.
+ 13. Fix logged defects, then run `python3 scripts/manage_harness.py defect-resolve --repo <target-repo> --plan <plan-file> --id <bug-id> --fix-evidence "<passing check or code evidence>"`.
+ 14. Score the finished work with `python3 scripts/manage_harness.py quality-score --repo <target-repo> --plan <plan-file> --product-correctness <0-10> --product-note "<evidence>" --ux-operator-clarity <0-10> --ux-note "<evidence>" --architecture-maintainability <0-10> --architecture-note "<evidence>" --reliability-observability <0-10> --reliability-note "<evidence>" --security-data-handling <0-10> --security-note "<evidence>"`. Every dimension needs an evidence note.
+ 15. If `quality-score` fails, treat `## Rework Required` in the plan as the next implementation input, fix the work, then run `quality-score` again.
+ 16. For phased or resumable work, run `python3 scripts/manage_harness.py phase-set --repo <target-repo> --plan <plan-file> --mode <multi-phase|paused|completed|stopped> --workstream <id> --current-phase <n> --continuation <target> --next-action "<next action>"`, then update `workstreams.md` with `workstream-upsert`.
+ 17. Before closing, replace generic plan placeholders with task-specific scope, constraints, steps, validation, and completion notes; leave no open durable-knowledge placeholder except the default unused line.
+ 18. Close the plan with `python3 scripts/manage_harness.py plan-close --repo <target-repo> --plan <plan-file> --summary "<summary>"`.
+ 19. Before handoff, run `python3 .codex/skills/harness-engine/scripts/manage_harness.py check --repo <target-repo>` from an installed target repository.
+ 20. To review stale generated evidence, run `python3 scripts/manage_harness.py evidence-prune --repo <target-repo>` first; it is dry-run by default. Add `--apply` only after checking the candidate list.
+ 21. To clean transient harness runtime files or remove already committed runtime files from the remote, run `python3 scripts/manage_harness.py clean --repo <target-repo>` first; it is dry-run by default. Add `--apply` to clean local runtime state, update `.gitignore`, and stage `git rm --cached` removals, then commit and push.
+ 22. After changing this skill, run `python3 evals/run_evals.py` and iterate until it passes.

  ## Reading Order

@@ -42,6 +43,7 @@ Run the packaged script to inspect the target repository before editing files. U
  - Read [references/template-policy.md](references/template-policy.md) before overwriting existing files.
  - Read [references/evaluation-loop.md](references/evaluation-loop.md) before changing the skill, templates, scripts, or policy references.
  - Read [references/evidence-first-evals.md](references/evidence-first-evals.md) before designing evals for product correctness, frontend validation, or bug-discovery coverage.
+ - Read `docs/FRONTEND.md` and `docs/DESIGN.md` for frontend, UI, product design, visual design, canvas, or interface polish work. Use the target project's official `@google/design.md` CLI install to lint, diff, or export DESIGN.md-controlled files.

  ## Command Rules

@@ -68,6 +70,25 @@ Run the packaged script to inspect the target repository before editing files. U
  - Run `python3 evals/run_evals.py` after skill changes, read the structured report, and treat per-case failures as iteration input.
  - Do not add CI to user repositories unless the human explicitly asks for it.

+ ## Google DESIGN.md Integration
+
+ Harness-engine does not vendor Google DESIGN.md source, choose themes, extract branding, or ship a local Google adapter. Target repositories should create the real `docs/DESIGN.md` through one of Google's documented paths:
+
+ - Create from a prompt in Stitch.
+ - Derive from branding in Stitch with a URL or image.
+ - Write it by hand as markdown with optional YAML frontmatter.
+
+ Use examples only as references: `https://github.com/google-labs-code/design.md/tree/main/examples`.
+
+ After `docs/DESIGN.md` exists, target repositories should install the official package:
+
+ ```bash
+ npm install --save-dev @google/design.md
+ npx @google/design.md lint docs/DESIGN.md
+ ```
+
+ Use `docs/FRONTEND.md` to control which project files read `docs/DESIGN.md`, which generated token exports are allowed, and how agents validate those files.
+
  ## Output Rules

  - Keep `AGENTS.md` short and routing-oriented.
@@ -50,5 +50,25 @@
    {
      "id": "preserve-unmanaged-docs",
      "description": "Existing user-owned harness files should be skipped unless explicitly forced."
+   },
+   {
+     "id": "official-google-design-cli-documented",
+     "description": "Generated design docs should instruct target projects to install and use the official @google/design.md CLI."
+   },
+   {
+     "id": "readme-google-design-dependency-documented",
+     "description": "README should clearly tell users that target projects must install the official @google/design.md dependency."
+   },
+   {
+     "id": "frontend-design-control-plane",
+     "description": "Generated FRONTEND.md should define which files are controlled by docs/DESIGN.md and how agents read them."
+   },
+   {
+     "id": "plugin-does-not-bundle-google-design",
+     "description": "The plugin manifest and installer should bundle harness-engine without a local Google DESIGN.md adapter or upstream source."
+   },
+   {
+     "id": "pack-excludes-google-source",
+     "description": "The npm package dry-run should exclude third-party Google DESIGN.md source while retaining harness-engine files."
    }
  ]
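The last case above, `pack-excludes-google-source`, depends on reading the JSON that `npm pack --dry-run --json` prints, possibly after lifecycle-script log lines. The sketch below shows one way to do that in Python without running npm. It is illustrative only: `extract_json_array` and `forbidden_paths` are hypothetical helpers, the sample output is made up, and the report shape (a list of objects with a `files` list of `path` entries) is an assumption about npm's JSON output rather than something shown in this diff.

```python
import json

def extract_json_array(stdout: str) -> list:
    """Parse the JSON array at the end of stdout, ignoring log lines before it."""
    # Prefer a "[" that starts a line, searching from the end; fall back to the first "[".
    start = stdout.rfind("\n[")
    start = start + 1 if start != -1 else stdout.find("[")
    if start == -1:
        raise ValueError("no JSON array found in output")
    return json.loads(stdout[start:])

def forbidden_paths(pack_report: list, prefixes: tuple) -> list:
    """Return packed file paths that start with any forbidden prefix."""
    paths = [entry["path"] for item in pack_report for entry in item.get("files", [])]
    return [path for path in paths if path.startswith(prefixes)]

# Made-up sample output: two log lines followed by the JSON report.
sample_stdout = """> prepack
> echo building
[
  {
    "name": "@hallucination-studio/harness-engine",
    "files": [
      {"path": "bin/install.js"},
      {"path": ".codex-plugin/plugin.json"},
      {"path": "skills/harness-engine/SKILL.md"}
    ]
  }
]
"""

report = extract_json_array(sample_stdout)
# The two prefixes mirror the directories the evals in this diff reject.
print(forbidden_paths(report, ("third_party/google-design-md/", "skills/google-design-style/")))
```

Searching from the end for a line that starts with `[` keeps the parser tolerant of log lines that themselves contain brackets, as long as the report is the last thing printed.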
@@ -9,6 +9,7 @@ import time
  from pathlib import Path

  SKILL_DIR = Path(__file__).resolve().parents[1]
+ REPO_ROOT = SKILL_DIR.parents[1]
  MANAGER = SKILL_DIR / "scripts" / "manage_harness.py"
  CASES_PATH = Path(__file__).with_name("cases.json")

@@ -136,6 +137,44 @@ def test_empty_repo_init(tmp_root):
      assert_contains(repo, "docs/FRONTEND.md", "Evidence For Meaningful UI Work")
      assert_contains(repo, "docs/FRONTEND.md", "Define and verify layout invariants")
      assert_contains(repo, "docs/FRONTEND.md", "preserve the primary task area")
+     assert_contains(repo, "docs/FRONTEND.md", "Read `docs/DESIGN.md` before implementing frontend")
+     assert_contains(repo, "docs/FRONTEND.md", "status: design-source-required")
+     assert_contains(repo, "docs/FRONTEND.md", "prompt in Stitch, brand URL/image import in Stitch, or hand-authored markdown/YAML")
+     assert_contains(repo, "docs/FRONTEND.md", "npm install --save-dev @google/design.md")
+     assert_contains(repo, "docs/FRONTEND.md", "npx @google/design.md lint docs/DESIGN.md")
+     assert_contains(repo, "docs/FRONTEND.md", "Generated design-token files must be derived from `docs/DESIGN.md`")
+     assert_contains(repo, "docs/FRONTEND.md", "Files controlled by `docs/DESIGN.md` include design token exports")
+     assert_contains(repo, "docs/FRONTEND.md", "Agents must read in this order for UI work")
+     assert_contains(repo, "docs/DESIGN.md", "version: alpha")
+     assert_contains(repo, "docs/DESIGN.md", "status: design-source-required")
+     assert_contains(repo, "docs/DESIGN.md", "## Overview")
+     assert_contains(repo, "docs/DESIGN.md", "Create the actual DESIGN.md through one of the official Google DESIGN.md paths")
+     assert_contains(repo, "docs/DESIGN.md", "Create from a prompt in Stitch")
+     assert_contains(repo, "docs/DESIGN.md", "Derive from branding in Stitch")
+     assert_contains(repo, "docs/DESIGN.md", "Write it by hand")
+     assert_contains(repo, "docs/DESIGN.md", "https://github.com/google-labs-code/design.md/tree/main/examples")
+     assert_contains(repo, "docs/DESIGN.md", "npm install --save-dev @google/design.md")
+     assert_contains(repo, "docs/DESIGN.md", "npx @google/design.md lint docs/DESIGN.md")
+     assert_contains(repo, "docs/DESIGN.md", "npx @google/design.md export docs/DESIGN.md --format <format>")
+     assert_contains(repo, "docs/DESIGN.md", "Harness Engine does not generate visual style, choose themes, derive branding, or vendor Google DESIGN.md source")
+     for heading in [
+         "## Colors",
+         "## Typography",
+         "## Layout",
+         "## Elevation & Depth",
+         "## Shapes",
+         "## Components",
+         "## Do's and Don'ts",
+     ]:
+         assert_contains(repo, "docs/DESIGN.md", heading)
+     assert_exists(repo, "docs/design-docs/style-options.md")
+     assert_contains(repo, "docs/design-docs/style-options.md", "Official Creation Paths")
+     assert_contains(repo, "docs/design-docs/style-options.md", "Create from a prompt in Stitch")
+     assert_contains(repo, "docs/design-docs/style-options.md", "Derive from branding in Stitch")
+     assert_contains(repo, "docs/design-docs/style-options.md", "Write it by hand")
+     assert_contains(repo, "docs/design-docs/style-options.md", "npm install --save-dev @google/design.md")
+     assert_contains(repo, "docs/design-docs/style-options.md", "Controlled Files")
+     assert_contains(repo, "docs/design-docs/style-options.md", "Do not hand-edit generated token exports")
      assert_contains(repo, "docs/sops/evidence-first-eval-loop.md", "Report per-case results")
      assert_contains(repo, "docs/sops/evidence-first-eval-loop.md", "Read the Issue Workflows in `AGENTS.md`")

@@ -201,6 +240,20 @@ def test_clean_removes_runtime_state_and_untracks_artifacts(tmp_root):
      repo = tmp_root / "clean-repo"
      repo.mkdir()
      subprocess.run(["git", "init"], cwd=repo, text=True, capture_output=True, check=True)
+     subprocess.run(
+         ["git", "config", "user.email", "harness-eval@example.com"],
+         cwd=repo,
+         text=True,
+         capture_output=True,
+         check=True,
+     )
+     subprocess.run(
+         ["git", "config", "user.name", "Harness Eval"],
+         cwd=repo,
+         text=True,
+         capture_output=True,
+         check=True,
+     )
      tracked_files = [
          ".codex/skills/harness-engine/SKILL.md",
          "docs/generated/canvas-polish-desktop-final.png",
1147
1200
  raise AssertionError("Eval report should include a user-facing failure message")
1148
1201
 
1149
1202
 
1203
+ def test_official_google_design_cli_documented(tmp_root):
1204
+ repo = tmp_root / "official-google-design-repo"
1205
+ repo.mkdir()
1206
+ answers = tmp_root / "official-google-design-answers.json"
1207
+ write_answers(answers, project_name="official-google-design-demo")
1208
+ run_manager("init", "--repo", str(repo), "--answers", str(answers))
1209
+ design_text = (repo / "docs" / "DESIGN.md").read_text()
1210
+ options_text = (repo / "docs" / "design-docs" / "style-options.md").read_text()
1211
+ skill_text = (SKILL_DIR / "SKILL.md").read_text()
1212
+ for text, label in [(design_text, "DESIGN.md"), (options_text, "style-options.md"), (skill_text, "SKILL.md")]:
1213
+ for needle in [
1214
+ "Create from a prompt in Stitch",
1215
+ "Derive from branding",
1216
+ "Write it by hand",
1217
+ "npm install --save-dev @google/design.md",
1218
+ "npx @google/design.md lint docs/DESIGN.md",
1219
+ ]:
1220
+ if needle not in text:
1221
+ raise AssertionError(f"{label} should document official Google DESIGN.md CLI usage: {needle}")
1222
+ if "$google-design-style" in skill_text:
1223
+ raise AssertionError("harness-engine SKILL.md should not route to a local google-design-style skill")
1224
+ if "packaged Google DESIGN.md submodule" in skill_text:
1225
+ raise AssertionError("harness-engine SKILL.md should not claim a packaged Google submodule")
1226
+
1227
+
1228
+ def test_readme_google_design_dependency_documented(tmp_root):
1229
+ readme_text = (REPO_ROOT / "README.md").read_text()
1230
+ for needle in [
1231
+ "## Target Project Dependency: Google DESIGN.md",
1232
+ "does not bundle Google source code or install Google's package for the user",
1233
+ "npm install --save-dev @google/design.md",
1234
+ "Use Google/Stitch to create the real `docs/DESIGN.md`",
1235
+ "npx @google/design.md lint docs/DESIGN.md",
1236
+ "Harness Engine's role is to generate the control plane",
1237
+ ]:
1238
+ if needle not in readme_text:
1239
+ raise AssertionError(f"README should document the Google DESIGN.md target dependency: {needle}")
1240
+
1241
+
1242
+ def test_frontend_design_control_plane(tmp_root):
1243
+ repo = tmp_root / "frontend-design-control-repo"
1244
+ repo.mkdir()
1245
+ answers = tmp_root / "frontend-design-control-answers.json"
1246
+ write_answers(answers, project_name="frontend-design-control-demo")
1247
+ run_manager("init", "--repo", str(repo), "--answers", str(answers))
1248
+ frontend_text = (repo / "docs" / "FRONTEND.md").read_text()
1249
+ for needle in [
1250
+ "Read `docs/DESIGN.md` before implementing frontend",
1251
+ "status: design-source-required",
1252
+ "do not treat it as an approved visual style",
1253
+ "Treat `docs/DESIGN.md` as the source of truth",
1254
+ "Generated design-token files must be derived from `docs/DESIGN.md`",
1255
+ "Files controlled by `docs/DESIGN.md` include design token exports",
1256
+ "Tailwind theme files",
1257
+ "global CSS variables",
1258
+ "component theme modules",
1259
+ "Storybook/theme previews",
1260
+ "Agents must read in this order for UI work",
1261
+ "Do not hand-edit generated token exports",
1262
+ ]:
1263
+ if needle not in frontend_text:
1264
+ raise AssertionError(f"FRONTEND.md should define design control plane: {needle}")
1265
+
1266
+
1267
+ def test_plugin_does_not_bundle_google_design(tmp_root):
1268
+ manifest = REPO_ROOT / ".codex-plugin" / "plugin.json"
1269
+ if not manifest.exists():
1270
+ raise AssertionError("Missing plugin manifest")
1271
+ manifest_data = json.loads(manifest.read_text())
1272
+ if manifest_data.get("skills") != "./skills/":
1273
+ raise AssertionError("Plugin manifest should expose ./skills/")
1274
+ smoke = subprocess.run(
1275
+ ["node", str(REPO_ROOT / "scripts" / "smoke_install.js")],
1276
+ cwd=REPO_ROOT,
1277
+ text=True,
1278
+ capture_output=True,
1279
+ check=False,
1280
+ )
1281
+ if smoke.returncode != 0:
1282
+ raise AssertionError(smoke.stderr or smoke.stdout)
1283
+ installed = json.loads(smoke.stdout)
1284
+ if Path(installed["installed"]).name != "harness-engine-plugin":
1285
+ raise AssertionError("smoke install should report the installed plugin root")
1286
+ if (REPO_ROOT / "skills" / "google-design-style").exists():
1287
+ raise AssertionError("Package should not include a local google-design-style skill")
1288
+ if (REPO_ROOT / "third_party" / "google-design-md").exists():
1289
+ raise AssertionError("Package should not include Google DESIGN.md source")
1290
+
1291
+
1292
+ def test_pack_excludes_google_source(tmp_root):
1293
+ result = subprocess.run(
1294
+ ["npm", "pack", "--dry-run", "--json"],
1295
+ cwd=REPO_ROOT,
1296
+ text=True,
1297
+ capture_output=True,
1298
+ check=False,
1299
+ )
1300
+ if result.returncode != 0:
1301
+ raise AssertionError(result.stderr or result.stdout)
1302
+ json_start = result.stdout.rfind("\n[")
1303
+ if json_start == -1:
1304
+ json_start = result.stdout.find("[")
1305
+ if json_start == -1:
1306
+ raise AssertionError(f"npm pack did not emit JSON file data: {result.stdout}")
1307
+ pack_data = json.loads(result.stdout[json_start:].strip())
1308
+ files = {item["path"] for item in pack_data[0]["files"]}
1309
+ for required_path in [
1310
+ ".codex-plugin/plugin.json",
1311
+ "skills/harness-engine/SKILL.md",
1312
+ ]:
1313
+ if required_path not in files:
1314
+ raise AssertionError(f"npm pack should include {required_path}")
1315
+ forbidden_prefixes = [
1316
+ "skills/google-design-style/",
1317
+ "third_party/google-design-md/",
1318
+ ]
1319
+ for file_path in files:
1320
+ if any(file_path.startswith(prefix) for prefix in forbidden_prefixes):
1321
+ raise AssertionError(f"npm pack should not include Google design source or adapter: {file_path}")
1322
+
1323
+
1150
1324
  EVALS = [
1151
1325
  ("empty-repo-init", test_empty_repo_init),
1152
1326
  ("frontend-analysis", test_frontend_analysis),
@@ -1161,6 +1335,11 @@ EVALS = [
1161
1335
  ("evidence-prune-generated-artifacts", test_evidence_prune_generated_artifacts),
1162
1336
  ("eval-report-shape", test_eval_report_shape),
1163
1337
  ("preserve-unmanaged-docs", test_preserve_unmanaged_docs),
1338
+ ("official-google-design-cli-documented", test_official_google_design_cli_documented),
1339
+ ("readme-google-design-dependency-documented", test_readme_google_design_dependency_documented),
1340
+ ("frontend-design-control-plane", test_frontend_design_control_plane),
1341
+ ("plugin-does-not-bundle-google-design", test_plugin_does_not_bundle_google_design),
1342
+ ("pack-excludes-google-source", test_pack_excludes_google_source),
1164
1343
  ]
1165
1344
 
1166
1345
 
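Note on the `pack-excludes-google-source` eval added in the hunk above: it locates the JSON array in `npm pack --dry-run --json` output with `rfind("\n[")` and falls back to `find("[")`, presumably because other text can appear on stdout before the JSON. The following is a minimal standalone sketch of that parsing step, not the package's code. The helper names `extract_pack_files` and `find_forbidden` and the sample output are hypothetical and exist only for illustration.

```python
import json


def extract_pack_files(stdout):
    """Return the set of packed file paths from `npm pack --dry-run --json` output.

    Same strategy as the eval: prefer the last "[" that starts a line, since
    extra text may precede the JSON array, then fall back to the first "[".
    """
    json_start = stdout.rfind("\n[")
    if json_start == -1:
        json_start = stdout.find("[")
    if json_start == -1:
        raise ValueError("no JSON array found in npm pack output")
    pack_data = json.loads(stdout[json_start:].strip())
    return {item["path"] for item in pack_data[0]["files"]}


def find_forbidden(files, prefixes):
    """Return the sorted paths that start with any forbidden prefix."""
    return sorted(p for p in files if any(p.startswith(pre) for pre in prefixes))


# Hypothetical stdout: two lines of unrelated text, then the JSON array.
sample = (
    "> prepack\n"
    "> echo building\n"
    "[\n"
    '  {"files": [\n'
    '    {"path": "skills/harness-engine/SKILL.md"},\n'
    '    {"path": "third_party/google-design-md/LICENSE"}\n'
    "  ]}\n"
    "]\n"
)

files = extract_pack_files(sample)
forbidden = find_forbidden(
    files, ["skills/google-design-style/", "third_party/google-design-md/"]
)
print(forbidden)
```

Keeping the extraction separate from the prefix check lets each part be exercised without invoking `npm`. The eval itself runs the real command through `subprocess.run`.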
@@ -198,17 +198,77 @@ For each issue:
198
198
 
199
199
  DOC_FILES = {
200
200
  "docs/DESIGN.md": """{marker}
201
+ ---
202
+ version: alpha
203
+ name: {project_name} Design System
204
+ description: Placeholder for the project-owned DESIGN.md. Create the real design system through an official Google DESIGN.md creation path before UI implementation.
205
+ status: design-source-required
206
+ ---
207
+
201
208
  # Design
202
209
 
203
- ## Product Experience Bar
210
+ ## Overview
204
211
 
205
212
  {frontend_stack_notes}
206
213
 
207
- ## Review Heuristics
214
+ Harness Engine does not generate visual style, choose themes, derive branding, or vendor Google DESIGN.md source. This file is a control point for the real project-owned DESIGN.md.
215
+
216
+ Create the actual DESIGN.md through one of the official Google DESIGN.md paths:
217
+
218
+ 1. Create from a prompt in Stitch: describe the intended vibe, product, audience, and interaction feel. Stitch generates the design system and summarizes it as DESIGN.md.
219
+ 2. Derive from branding in Stitch: provide a brand URL or image so Stitch can extract palette, typography, and style patterns into DESIGN.md.
220
+ 3. Write it by hand: advanced users can author markdown and optional YAML frontmatter directly.
221
+
222
+ Chosen path for this project:
223
+
224
+ {design_creation_path}
225
+
226
+ After the real design system is created, install and use the official package in this target project:
227
+
228
+ ```bash
229
+ npm install --save-dev @google/design.md
230
+ npx @google/design.md lint docs/DESIGN.md
231
+ ```
232
+
233
+ Use upstream examples only as references, not as vendored source:
234
+
235
+ ```text
236
+ https://github.com/google-labs-code/design.md/tree/main/examples
237
+ ```
238
+
239
+ ## Colors
240
+
241
+ Pending real DESIGN.md creation. Fill this from Stitch output, brand extraction, or hand-authored design decisions.
242
+
243
+ ## Typography
244
+
245
+ Pending real DESIGN.md creation. Record font families, hierarchy, body readability, label treatment, and any tabular or technical text rules.
246
+
247
+ ## Layout
248
+
249
+ Pending real DESIGN.md creation. Record grid, spacing rhythm, density, content grouping, responsive behavior, and workflow ergonomics.
250
+
251
+ ## Elevation & Depth
252
+
253
+ Pending real DESIGN.md creation. Record how hierarchy is created through shadows, borders, tonal layers, transparency, or flat contrast.
254
+
255
+ ## Shapes
256
+
257
+ Pending real DESIGN.md creation. Record shape language for buttons, cards, inputs, chips, modals, and fixed-format UI elements.
258
+
259
+ ## Components
208
260
 
209
- - Prefer intentional interaction patterns over generic defaults.
210
- - Keep visual and UX rationale durable in `docs/design-docs/`.
211
- - Validate meaningful UI work in a real browser before closing it out.
261
+ Pending real DESIGN.md creation. Record treatment for buttons, form fields, navigation, cards or panels, tables or lists, badges, empty states, loading states, and error states.
262
+
263
+ ## Do's and Don'ts
264
+
265
+ - Do replace this placeholder with a real Google DESIGN.md generated by Stitch, derived from branding, or written by hand.
266
+ - Do run `npx @google/design.md lint docs/DESIGN.md` after edits.
267
+ - Do export tokens with `npx @google/design.md export docs/DESIGN.md --format <format>` when the frontend stack consumes generated token files.
268
+ - Do validate meaningful UI work in a real browser before closing it out.
269
+ - Don't edit generated token exports by hand; update `docs/DESIGN.md` and regenerate them.
270
+ - Don't treat this placeholder as an approved visual style.
271
+ - Don't rely on harness-engine to generate, choose, or extract product taste.
212
272
  """,
213
273
  "docs/FRONTEND.md": """{marker}
214
274
  # Frontend
@@ -225,6 +285,17 @@ DOC_FILES = {
225
285
 
226
286
  {frontend_validation_loop}
227
287
 
288
+ ## Design Style Contract
289
+
290
+ - Read `docs/DESIGN.md` before implementing frontend, UI, layout, visual-state, canvas, or interaction work.
291
+ - If `docs/DESIGN.md` has `status: design-source-required` or pending sections, do not treat it as an approved visual style. First create the real DESIGN.md through an official Google path: prompt in Stitch, brand URL/image import in Stitch, or hand-authored markdown/YAML.
292
+ - The project owns `docs/DESIGN.md`; maintain and validate it with the official Google package: `npm install --save-dev @google/design.md`, then `npx @google/design.md lint docs/DESIGN.md`.
293
+ - Treat `docs/DESIGN.md` as the source of truth for UI tokens, colors, typography, spacing, radius, elevation, component treatment, and Do's and Don'ts.
294
+ - Generated design-token files must be derived from `docs/DESIGN.md` with `npx @google/design.md export docs/DESIGN.md --format <format>`.
295
+ - Files controlled by `docs/DESIGN.md` include design token exports under `docs/design-docs/` or `src/styles/`, Tailwind theme files, global CSS variables, component theme modules, Storybook/theme previews, and any UI implementation that consumes those tokens.
296
+ - Agents must read in this order for UI work: `docs/FRONTEND.md`, `docs/DESIGN.md`, generated token exports, then the component or stylesheet being changed.
297
+ - Do not hand-edit generated token exports. Update `docs/DESIGN.md`, regenerate exports with the official CLI, and cite the lint/export command in validation.
298
+
228
299
  ## Evidence For Meaningful UI Work
229
300
 
230
301
  - Capture desktop and mobile evidence for significant UI changes.
@@ -336,6 +407,46 @@ DOC_FILES = {
336
407
 
337
408
  - Add one document per durable design decision.
338
409
  - Link active design decisions from plans and specs.
410
+ """,
411
+ "docs/design-docs/style-options.md": """{marker}
412
+ # Design System Control
413
+
414
+ The project owns `docs/DESIGN.md`. Harness Engine does not generate style or choose themes.
415
+
416
+ ## Official Creation Paths
417
+
418
+ Create the real DESIGN.md through one of the official Google DESIGN.md paths:
419
+
420
+ 1. **Create from a prompt in Stitch**: describe the intended vibe, product, audience, and interaction feel.
421
+ 2. **Derive from branding in Stitch**: provide a brand URL or image so Stitch can extract palette, typography, and style patterns.
422
+ 3. **Write it by hand**: author markdown and optional YAML frontmatter directly.
423
+
424
+ Use upstream examples only as references:
425
+
426
+ ```text
427
+ https://github.com/google-labs-code/design.md/tree/main/examples
428
+ ```
429
+
430
+ After `docs/DESIGN.md` contains the real project design system, install and use the official Google DESIGN.md CLI in this target repository:
431
+
432
+ ```bash
433
+ npm install --save-dev @google/design.md
434
+ npx @google/design.md lint docs/DESIGN.md
435
+ ```
436
+
437
+ ## Controlled Files
438
+
439
+ - `docs/DESIGN.md`: source of truth for design tokens and design rationale.
440
+ - `docs/design-docs/`: design decisions, lint reports, token export notes, and generated design evidence.
441
+ - `src/styles/`, `app/styles/`, or equivalent style directories: CSS variables, Tailwind theme exports, or framework-specific theme modules generated from `docs/DESIGN.md`.
442
+ - Component theme files, Storybook theme previews, and UI implementation files that consume exported tokens.
443
+
444
+ ## Operating Rules
445
+
446
+ - Read `docs/FRONTEND.md` before editing controlled files.
447
+ - Read `docs/DESIGN.md` before changing UI implementation.
448
+ - Do not hand-edit generated token exports; edit `docs/DESIGN.md` and rerun the official CLI export command.
449
+ - Store lint/export outputs under `docs/generated/` or cite the command output in the active plan.
339
450
  """,
340
451
  "docs/design-docs/core-beliefs.md": """{marker}
341
452
  # Core Beliefs
@@ -568,6 +679,11 @@ QUESTION_CATALOG = [
568
679
  "prompt": "If there is a frontend, what experience bar, platforms, or UX constraints should the docs enforce?",
569
680
  "reason": "Needed for design and frontend policies.",
570
681
  },
682
+ {
683
+ "id": "design_creation_path",
684
+ "prompt": "How should the project create its real DESIGN.md: Stitch prompt, Stitch brand URL/image import, or hand-authored markdown/YAML?",
685
+ "reason": "Needed because harness-engine does not generate visual style; the project must choose an official Google DESIGN.md creation path.",
686
+ },
571
687
  {
572
688
  "id": "quality_focus",
573
689
  "prompt": "Which product areas or architectural layers deserve the strictest quality scoring?",
@@ -734,6 +850,7 @@ def make_default_answers(analysis):
734
850
  if has_frontend
735
851
  else "No frontend detected. Replace this if the repo includes UI work."
736
852
  ),
853
+ "design_creation_path": "Choose one before UI implementation: create from a prompt in Stitch, derive from branding URL/image in Stitch, or write docs/DESIGN.md by hand.",
737
854
  "quality_focus": "List the product areas and architectural layers that deserve the strictest quality bar.",
738
855
  "frontend_scope": frontend_scope,
739
856
  "frontend_validation_loop": frontend_validation_loop,