forgecad 0.9.13 → 0.9.15
- package/LICENSE +6 -4
- package/README.md +8 -4
- package/dist/assets/{AdminPage-DramHHDf.js → AdminPage-CDyGUinA.js} +2 -2
- package/dist/assets/{BenchmarkPage-Bjgkh5m9.js → BenchmarkPage-DfPMY_-d.js} +4 -15
- package/dist/assets/{BlogPage-n_HGP3Qm.js → BlogPage-kF0fkdJT.js} +2 -2
- package/dist/assets/{DocsPage-WCIkPmzC.js → DocsPage-B954L3YN.js} +9 -3
- package/dist/assets/EditorApp-Beb-IZ0y.js +14014 -0
- package/dist/assets/{EditorApp-BAnckbsk.css → EditorApp-CuDLxKqL.css} +698 -0
- package/dist/assets/{EmbedViewer-DEZKqdfW.js → EmbedViewer-C77B-TrF.js} +3 -3
- package/dist/assets/{LandingPageProofDriven-CeRIctuj.js → LandingPageProofDriven-Cr6fXMDj.js} +35 -37
- package/dist/assets/LegalPage-BRlScr9A.css +91 -0
- package/dist/assets/LegalPage-Dzklqmmg.js +39 -0
- package/dist/assets/{PricingPage-BMedqFef.css → PricingPage-BPF6HKyO.css} +25 -0
- package/dist/assets/{PricingPage-rIRa8p4Y.js → PricingPage-zWXkvlwl.js} +19 -19
- package/dist/assets/{SettingsPage-BqCUvEXM.js → SettingsPage-Bz0of4KQ.js} +2 -2
- package/dist/assets/app-CE3sYcV7.css +3890 -0
- package/dist/assets/{app-BUZqJvSO.js → app-D3kDkggg.js} +2305 -960
- package/dist/assets/cli/{render-lhGxj50Y.js → render-DSY3mMQa.js} +423 -30
- package/dist/assets/{constructionHistoryWorker-ipD1jcIv.js → constructionHistoryWorker-gpDo-uH2.js} +927 -243
- package/dist/assets/{evalWorker-CHXSe_-u.js → evalWorker-CU0Ke6DP.js} +7799 -4163
- package/dist/assets/{forgecad_geometry-BVnIeXMG.js → forgecad_geometry-Dgceylq9.js} +43 -1
- package/dist/assets/{forgecad_geometry_bg-DufhhCBV.wasm → forgecad_geometry_bg-dD4RNQF1.wasm} +0 -0
- package/dist/assets/{inspectWorker-DeRnMVv1.js → inspectWorker-COyp8XXA.js} +927 -243
- package/dist/assets/{javascript-70-4uGcz.js → javascript-1kQXfVaz.js} +1 -1
- package/dist/assets/landing-proof-driven-DiGqdtWa.js +18 -0
- package/dist/assets/{landing-proof-driven-oFYW6mjz.css → landing-proof-driven-ORyigZ6p.css} +13 -7
- package/dist/assets/legalContent-ZfFGMmi4.js +251 -0
- package/dist/assets/{manifold-D1LZIHqn.js → manifold-BRI5prcH.js} +1 -1
- package/dist/assets/{manifold-C2fwoTgd.js → manifold-C-3h2M7p.js} +2 -2
- package/dist/assets/{manifold-BTkzxi9V.js → manifold-DNkrUWpA.js} +1 -1
- package/dist/assets/{reportWorker-Cq1qGmg0.js → reportWorker-CdBz5bNg.js} +7537 -10856
- package/dist/assets/{scalar-sampling-budget-D9Qv_UlJ.js → scalar-sampling-budget-wJF98aY9.js} +6943 -4345
- package/dist/assets/{scanProxyWorker-Bs2TDgLw.js → scanProxyWorker-B-9VbLIs.js} +32 -1
- package/dist/assets/{renderSceneState-Dr0xPq1A.js → targets-B9sGB5nB.js} +27 -1
- package/dist/assets/{vendor-react-Da3A2QmU.js → vendor-react-6j1Kke-Y.js} +6 -5
- package/dist/cli/render.html +1 -1
- package/dist/docs/index.html +2 -2
- package/dist/docs-raw/AI/ai-native-cad.md +50 -0
- package/dist/docs-raw/AI/usage.md +9 -17
- package/dist/docs-raw/CLI.md +71 -21
- package/dist/docs-raw/component-model.md +27 -11
- package/dist/docs-raw/generated/assembly.md +301 -212
- package/dist/docs-raw/generated/concepts.md +238 -240
- package/dist/docs-raw/generated/core.md +283 -6
- package/dist/docs-raw/generated/curves.md +274 -361
- package/dist/docs-raw/generated/lib.md +7 -1
- package/dist/docs-raw/generated/output.md +19 -4
- package/dist/docs-raw/generated/runtime-names.md +41 -0
- package/dist/docs-raw/generated/sdf.md +31 -0
- package/dist/docs-raw/generated/sheet-metal.md +9 -0
- package/dist/docs-raw/generated/sketch.md +44 -1
- package/dist/docs-raw/generated/viewport.md +14 -6
- package/dist/docs-raw/guides/coordinate-system.md +20 -16
- package/dist/docs-raw/guides/geometry-conventions.md +2 -2
- package/dist/docs-raw/guides/inspection-bundles.md +2 -1
- package/dist/docs-raw/guides/joint-design.md +24 -0
- package/dist/docs-raw/guides/positioning.md +13 -3
- package/dist/docs-raw/legal/privacy.md +63 -0
- package/dist/docs-raw/legal/software-license.md +55 -0
- package/dist/docs-raw/legal/terms.md +87 -0
- package/dist/docs-raw/skills/forgecad-3d-reconstruction.md +3 -3
- package/dist/docs-raw/skills/forgecad-blockout-model.md +1 -1
- package/dist/docs-raw/skills/forgecad-component-model.md +11 -2
- package/dist/docs-raw/skills/forgecad-high-level-spec.md +1 -1
- package/dist/docs-raw/skills/forgecad-image-replicator.md +8 -8
- package/dist/docs-raw/skills/forgecad-lld.md +1 -1
- package/dist/docs-raw/skills/forgecad-make-a-model.md +4 -4
- package/dist/docs-raw/skills/forgecad-model-grader.md +2 -2
- package/dist/docs-raw/skills/forgecad-prepare-prompt.md +2 -2
- package/dist/docs-raw/skills/forgecad-project.md +1 -1
- package/dist/docs-raw/skills/forgecad-reconstruction-benchmark.md +4 -4
- package/dist/docs-raw/skills/forgecad-render-inspect.md +4 -2
- package/dist/docs-raw/skills/forgecad-visual-spec.md +1 -1
- package/dist/docs-raw/skills/forgecad.md +4 -3
- package/dist/index.html +40 -12
- package/dist/llms.txt +8 -0
- package/dist/site.webmanifest +1 -1
- package/dist/sitemap.xml +49 -13
- package/dist-cli/{check-compiler-LOXCPEOI.js → check-compiler-SDX5QIXI.js} +1 -2
- package/dist-cli/{check-query-propagation-BAKNVWXR.js → check-query-propagation-EAYEFT77.js} +1 -2
- package/dist-cli/{chunk-RY43WF46.js → chunk-N4O47JLF.js} +13772 -9938
- package/dist-cli/forgecad.js +2387 -899
- package/dist-cli/{forgecad_geometry-GYVNKPIE.js → forgecad_geometry-QOQIIP53.js} +42 -1
- package/dist-cli/forgecad_geometry_bg.wasm +0 -0
- package/dist-cli/{solver-46FFSK2U.js → solver-OK4HECRH.js} +0 -1
- package/dist-skill/CONTEXT.md +1120 -724
- package/dist-skill/SKILL.md +3 -2
- package/dist-skill/docs/API/core/concepts.md +64 -1
- package/dist-skill/docs/CLI.md +71 -21
- package/dist-skill/docs/generated/assembly.md +277 -229
- package/dist-skill/docs/generated/core.md +283 -6
- package/dist-skill/docs/generated/curves.md +272 -362
- package/dist-skill/docs/generated/lib.md +7 -1
- package/dist-skill/docs/generated/output.md +19 -4
- package/dist-skill/docs/generated/runtime-names.md +41 -0
- package/dist-skill/docs/generated/sdf.md +31 -0
- package/dist-skill/docs/generated/sheet-metal.md +9 -0
- package/dist-skill/docs/generated/sketch.md +44 -2
- package/dist-skill/docs/generated/viewport.md +5 -90
- package/dist-skill/docs/guides/coordinate-system.md +20 -16
- package/dist-skill/docs/guides/geometry-conventions.md +2 -2
- package/dist-skill/docs/guides/inspection-bundles.md +2 -1
- package/dist-skill/docs/guides/joint-design.md +24 -0
- package/dist-skill/docs/guides/positioning.md +13 -3
- package/dist-skill/library/forgecad-3d-reconstruction/SKILL.md +2 -2
- package/dist-skill/library/forgecad-component-model/SKILL.md +10 -1
- package/dist-skill/library/forgecad-image-replicator/SKILL.md +6 -6
- package/dist-skill/library/forgecad-image-replicator/scripts/compare_images.py +166 -0
- package/dist-skill/library/forgecad-make-a-model/SKILL.md +3 -3
- package/dist-skill/library/forgecad-model-grader/SKILL.md +1 -1
- package/dist-skill/library/forgecad-prepare-prompt/SKILL.md +1 -1
- package/dist-skill/library/forgecad-reconstruction-benchmark/SKILL.md +3 -3
- package/dist-skill/library/forgecad-render-inspect/SKILL.md +3 -1
- package/examples/api/assembly-kinematics-foundation.forge.js +65 -0
- package/examples/api/assembly-kinematics-four-bar.forge.js +115 -0
- package/examples/api/assembly-kinematics-limb.forge.js +116 -0
- package/examples/api/connector-frame-rig-chain.forge.js +102 -0
- package/examples/api/exact-sheet-shell-assembly.forge.js +0 -2
- package/examples/api/exact-surface-studio.forge.js +6 -8
- package/examples/api/helix-basics.forge.js +6 -6
- package/examples/api/lean-foundations/README.md +12 -0
- package/examples/api/lean-foundations/curve-blend-exact.forge.js +22 -0
- package/examples/api/lean-foundations/curve-fit-interpolation.forge.js +18 -0
- package/examples/api/lean-foundations/curve-helix-canonicalization.forge.js +27 -0
- package/examples/api/lean-foundations/curve-route-canonicalization.forge.js +16 -0
- package/examples/api/lean-foundations/curve-trim-reverse.forge.js +24 -0
- package/examples/api/lean-foundations/exact-curve-arc.forge.js +36 -0
- package/examples/api/mixed-edge-finishes-proof.forge.js +8 -11
- package/examples/api/route3d-elbow.forge.js +68 -0
- package/examples/api/transition-curves.forge.js +44 -15
- package/examples/api/y-blend-corner-showcase.forge.js +0 -2
- package/examples/generative/coral-vase.forge.js +1 -1
- package/examples/nurbs-tube.forge.js +1 -1
- package/package.json +14 -18
- package/dist/assets/EditorApp-CP9Za6tm.js +0 -13630
- package/dist/assets/app-CsHnaBWt.css +0 -1789
- package/dist/docs-raw/API/README.md +0 -16
- package/dist/docs-raw/API/core/concepts.md +0 -118
- package/dist/docs-raw/INDEX.md +0 -138
- package/dist/docs-raw/RELEASING.md +0 -87
- package/dist/docs-raw/agent-native-api.md +0 -27
- package/dist/docs-raw/beta-deployment.md +0 -304
- package/dist/docs-raw/beta-operations.md +0 -325
- package/dist/docs-raw/blueprint-first.md +0 -145
- package/dist/docs-raw/cli-monetization.md +0 -112
- package/dist/docs-raw/coding-best-practices.md +0 -120
- package/dist/docs-raw/coding.md +0 -340
- package/dist/docs-raw/deployment.md +0 -374
- package/dist/docs-raw/guides/skill-maintenance.md +0 -161
- package/dist/docs-raw/guides/surface-members.md +0 -82
- package/dist/docs-raw/internals/backend-vocabulary.md +0 -35
- package/dist/docs-raw/internals/compiler.md +0 -307
- package/dist/docs-raw/internals/constraint-solver-quality.md +0 -161
- package/dist/docs-raw/internals/constraint-solver.md +0 -176
- package/dist/docs-raw/internals/shape-from-slices.md +0 -152
- package/dist/docs-raw/internals/sketch-2d-pipeline.md +0 -108
- package/dist/docs-raw/platform/admin.md +0 -45
- package/dist/docs-raw/platform/architecture.md +0 -82
- package/dist/docs-raw/platform/auth.md +0 -139
- package/dist/docs-raw/platform/email.md +0 -67
- package/dist/docs-raw/platform/google-oauth-setup.md +0 -88
- package/dist/docs-raw/platform/observability.md +0 -197
- package/dist/docs-raw/platform/projects.md +0 -111
- package/dist/docs-raw/platform/sharing.md +0 -90
- package/dist/docs-raw/product/README.md +0 -39
- package/dist/docs-raw/product/api-as-product-language.md +0 -13
- package/dist/docs-raw/product/business-model.md +0 -15
- package/dist/docs-raw/product/competitive-positioning.md +0 -17
- package/dist/docs-raw/product/creative-manufacturing.md +0 -15
- package/dist/docs-raw/product/founder-story.md +0 -11
- package/dist/docs-raw/product/manufacturing-workflows.md +0 -15
- package/dist/docs-raw/product/onboarding-first-experience.md +0 -256
- package/dist/docs-raw/product/product-loop.md +0 -17
- package/dist/docs-raw/product/strategic-decisions.md +0 -22
- package/dist/docs-raw/product/user-outreach-email-templates.md +0 -161
- package/dist/docs-raw/product/user-segments.md +0 -15
- package/dist/docs-raw/product/vision.md +0 -26
- package/dist/docs-raw/rl-environments.md +0 -508
- package/dist/docs-raw/runbook.md +0 -611
- package/dist-cli/check-compiler-LOXCPEOI.js.map +0 -1
- package/dist-cli/check-query-propagation-BAKNVWXR.js.map +0 -1
- package/dist-cli/chunk-RY43WF46.js.map +0 -1
- package/dist-cli/forgecad.js.map +0 -1
- package/dist-cli/forgecad_geometry-GYVNKPIE.js.map +0 -1
- package/dist-cli/solver-46FFSK2U.js.map +0 -1
- package/dist-skill/SKILL-dev.md +0 -145
- package/dist-skill/docs-dev/API/core/concepts.md +0 -118
- package/dist-skill/docs-dev/CLI.md +0 -647
- package/dist-skill/docs-dev/agent-native-api.md +0 -27
- package/dist-skill/docs-dev/blueprint-first.md +0 -145
- package/dist-skill/docs-dev/coding-best-practices.md +0 -120
- package/dist-skill/docs-dev/coding.md +0 -340
- package/dist-skill/docs-dev/component-model.md +0 -164
- package/dist-skill/docs-dev/generated/assembly.md +0 -794
- package/dist-skill/docs-dev/generated/core.md +0 -2117
- package/dist-skill/docs-dev/generated/curves.md +0 -2583
- package/dist-skill/docs-dev/generated/lib.md +0 -169
- package/dist-skill/docs-dev/generated/output.md +0 -247
- package/dist-skill/docs-dev/generated/sdf.md +0 -446
- package/dist-skill/docs-dev/generated/sheet-metal.md +0 -504
- package/dist-skill/docs-dev/generated/sketch.md +0 -1811
- package/dist-skill/docs-dev/generated/viewport.md +0 -585
- package/dist-skill/docs-dev/generated/wood.md +0 -108
- package/dist-skill/docs-dev/guides/coordinate-system.md +0 -46
- package/dist-skill/docs-dev/guides/geometry-conventions.md +0 -52
- package/dist-skill/docs-dev/guides/inspection-bundles.md +0 -485
- package/dist-skill/docs-dev/guides/joint-design.md +0 -78
- package/dist-skill/docs-dev/guides/modeling-recipes.md +0 -78
- package/dist-skill/docs-dev/guides/positioning.md +0 -161
- package/dist-skill/docs-dev/guides/skill-maintenance.md +0 -161
- package/dist-skill/docs-dev/internals/backend-vocabulary.md +0 -35
- package/dist-skill/docs-dev/internals/compiler.md +0 -307
- package/dist-skill/docs-dev/internals/constraint-solver-quality.md +0 -161
- package/dist-skill/docs-dev/internals/constraint-solver.md +0 -176
- package/dist-skill/docs-dev/internals/sketch-2d-pipeline.md +0 -108
- package/dist-skill/library/forgecad-image-replicator/scripts/compare_images.mjs +0 -289
@@ -1,508 +0,0 @@
# ForgeCAD RL Environments

This is the permanent operating guide for ForgeCAD reinforcement-learning and
AI-lab reconstruction environments. Temporary notes can record experiments, but
this document is the stable contract for building, running, evaluating, and
maintaining reconstruction tasks.

## Product Goal

ForgeCAD RL environments should let an external lab train or evaluate an agent
that turns reference physical geometry into readable ForgeCAD source code.

The loop is intentionally simple:

1. ForgeCAD authors Harbor task folders directly under
   `ai-labs/reconstruction/tasks/`.
2. `npm run harbor:reconstruction:refresh` validates those folders and updates
   `ai-labs/reconstruction/dataset.toml` digests.
3. Harbor builds the sandbox and runs the agent.
4. The hidden RewardKit verifier evaluates `/app/submission/main.forge.js`
   after the agent exits.
5. RewardKit writes Harbor rewards and per-criterion details; ForgeCAD-specific
   verifier reports preserve geometry comparison evidence.

Harbor is the executable contract for both RL rollouts and benchmark evals.
There is no second task catalog, no generated sibling registry, and no local
runner command surface to keep in sync.

The external collaboration repository for SpaceXAI-facing handoff material is:

```text
https://github.com/KoStard/SpaceXAI_ForgeCAD
```

Treat that repository as coordination and distribution context. The executable
benchmark contract is the Harbor dataset in this repo.

## Current Reconstruction Suite

The canonical reconstruction dataset lives in:

```text
ai-labs/reconstruction/
```

The current seed tasks are:

```text
ball-bearing
sealing-frame
wheel
```

The old source-file-style labels are not primary task IDs. Public benchmark data
normalizes historical rows to the user-facing labels above.

## Harbor Task Contract

Each task folder includes:

```text
README.md
instruction.md
task.toml
environment/
Dockerfile
forgecad-wrapper.sh
reference/<task-reference>.3mf
starter/main.forge.js
tests/
test.sh
checks.py
enrich-reward.mjs
forgecad-reconstruction-check.mjs
score-calibration.mjs
task.json
```

The agent sees the reference asset, starter source, instructions, and the
workspace-local `forgecad` wrapper. The verifier files under `tests/` are hidden
until the agent exits.

The final answer is always:

```text
/app/submission/main.forge.js
```

The answer must be editable parametric ForgeCAD source. It must not import,
read, embed, copy, or encode the reference asset.

The `forgecad` wrapper exposes run/render/inspect/export checks but blocks
direct `score` and `compare` use during the episode. The hidden verifier uses
the internal comparison path after the agent exits.

Verifier execution goes through RewardKit:

```bash
uvx --from 'harbor-rewardkit==0.1.*' rewardkit /tests
```

`checks.py` exposes one programmatic criterion,
`forgecad_reconstruction_score`, backed by the ForgeCAD guard/render/compare
script. RewardKit writes `/logs/verifier/reward.json` and
`/logs/verifier/reward-details.json`; `enrich-reward.mjs` preserves the
RewardKit output while ensuring `reward.json` has numeric Harbor fields:
`reward`, `score`, `rawCompareScore`, `guard`, and `valid`.

## NPM Command Surface

Keep this surface small and boring:

```bash
npm run harbor:reconstruction:refresh
npm run harbor:reconstruction:check
npm run harbor:reconstruction:smoke
npm run harbor:reconstruction:pack-local
npm run harbor:reconstruction:clean-local
npm run benchmark:reconstruction:list
npm run benchmark:reconstruction:import-harbor -- <job-or-trial-dir> --budget standard
npm run benchmark:reconstruction:leaderboard
npm run benchmark:reconstruction:report
```

`refresh` validates task layout and rewrites dataset digests. It does not
generate task folders. If a task should change, edit the Harbor task folder
itself.

`check` runs the same validation and fails if `dataset.toml` is stale.

`smoke` runs Harbor with the no-op agent on `ball-bearing`. Use it whenever the
task image, verifier dependencies, or RewardKit contract changes.

## Local Task Workflow

Use this flow when adding or changing tasks inside the ForgeCAD repository.

1. Add or edit the task under `ai-labs/reconstruction/tasks/<task-id>/`.
2. Keep the task folder self-contained and Harbor-runnable.
3. Update the task `README.md` with environment, verifier dimensions, layout,
   and concrete Harbor run commands.
4. Refresh the dataset manifest:

   ```bash
   npm run harbor:reconstruction:refresh
   npm run harbor:reconstruction:check
   ```

5. List tasks through the local reporting wrapper:

   ```bash
   npm run benchmark:reconstruction:list
   ```

6. Run Harbor directly when testing the lab-facing package:

   ```bash
   npm run harbor:reconstruction:smoke
   ```

7. Import completed Harbor jobs when you need local leaderboard/report output:

   ```bash
   npm run benchmark:reconstruction:import-harbor -- \
     "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/smoke/<timestamp>" \
     --budget smoke
   ```

## Direct Harbor Runs

### One Task, One Agent

Use `--include-task-name` to select one Harbor task. Use `--agent` for a
built-in Harbor agent:

```bash
CODEX_AUTH_JSON_PATH="$HOME/.codex/auth.json" \
uvx --from harbor harbor run \
  -p ai-labs/reconstruction/tasks \
  --include-task-name ball-bearing \
  --agent codex \
  --model gpt-5.5 \
  --artifact /app/submission/main.forge.js \
  --jobs-dir "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/codex" \
  --n-concurrent 1
```

Use `--agent-import-path` for a custom adapter such as the bundled Grok adapter:

```bash
PYTHONPATH="$PWD/ai-labs/reconstruction/agents" \
GROK_AUTH_JSON="$HOME/.grok/auth.json" \
GROK_CONFIG_TOML="$HOME/.grok/config.toml" \
uvx --from harbor harbor run \
  -p ai-labs/reconstruction/tasks \
  --include-task-name ball-bearing \
  --agent-import-path grok_harbor_agent:GrokCliAgent \
  --model grok-build \
  --artifact /app/submission/main.forge.js \
  --jobs-dir "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/grok" \
  --n-concurrent 1
```

Current task names are `ball-bearing`, `sealing-frame`, and `wheel`.

### Full Suite

Codex:

```bash
CODEX_AUTH_JSON_PATH="$HOME/.codex/auth.json" \
uvx --from harbor harbor run \
  -p ai-labs/reconstruction/tasks \
  --agent codex \
  --model gpt-5.5 \
  --jobs-dir "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/codex" \
  --artifact /app/submission/main.forge.js \
  --n-concurrent 3
```

Grok:

```bash
PYTHONPATH="$PWD/ai-labs/reconstruction/agents" \
GROK_AUTH_JSON="$HOME/.grok/auth.json" \
GROK_CONFIG_TOML="$HOME/.grok/config.toml" \
uvx --from harbor harbor run \
  -p ai-labs/reconstruction/tasks \
  --agent-import-path grok_harbor_agent:GrokCliAgent \
  --model grok-build \
  --jobs-dir "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/grok" \
  --artifact /app/submission/main.forge.js \
  --n-concurrent 3
```

Direct Harbor CLI-agent runs need outbound internet from the task container for
agent install/auth/model calls, so the task definitions set
`allow_internet = true`. The verifier remains hidden until the agent exits.
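
The single-task and full-suite commands in this section differ only in whether a task name is given, which agent flag is used, and the concurrency. As a rough sketch (not part of the repository), a small Python helper could assemble the same argument list for use with `subprocess.run`. The function name `harbor_run_argv` and its parameters are hypothetical; the flags themselves are copied from the commands in this section.

```python
def harbor_run_argv(agent, model, jobs_dir, task_name=None, n_concurrent=1,
                    custom_adapter=False,
                    tasks_path="ai-labs/reconstruction/tasks",
                    artifact="/app/submission/main.forge.js"):
    """Build the argv for `uvx --from harbor harbor run` (hypothetical helper)."""
    argv = ["uvx", "--from", "harbor", "harbor", "run", "-p", tasks_path]
    # Omitting the task name runs the full suite, as in the commands above.
    if task_name is not None:
        argv += ["--include-task-name", task_name]
    # Built-in agents use --agent; custom adapters use --agent-import-path.
    agent_flag = "--agent-import-path" if custom_adapter else "--agent"
    argv += [
        agent_flag, agent,
        "--model", model,
        "--artifact", artifact,
        "--jobs-dir", str(jobs_dir),
        "--n-concurrent", str(n_concurrent),
    ]
    return argv
```

Auth variables such as `CODEX_AUTH_JSON_PATH` would still have to be supplied through the process environment, for example via the `env` argument of `subprocess.run`.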

## Remote Harbor Runs On popos

Use `popos` as a remote Harbor execution host when local machine resources are
the bottleneck. Do not add a second benchmark runner for this. The remote
machine should run the same Harbor commands against the same task folders.

Remote prerequisites:

1. `ssh popos` reaches the server.
2. Docker, `uvx`, Node, and npm are installed on the server.
3. A dedicated checkout exists at `~/Projects/CAD/ForgeCAD`.
4. Agent auth lives outside the repo on the server.
5. Run outputs stay outside the checkout under
   `~/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/`.

For normal runs, update the dedicated server checkout with git after pushing
the branch you want to test:

```bash
branch="$(git branch --show-current)"
git push origin "$branch"
ssh popos "cd ~/Projects/CAD/ForgeCAD && \
  git fetch origin && \
  git switch '$branch' && \
  git pull --ff-only origin '$branch'"
```

Use `rsync` only for scratch experiments where the local work is intentionally
uncommitted. It mirrors the dirty checkout onto the server, so use it only
against a dedicated throwaway checkout:

```bash
rsync -az --delete \
  --exclude .git \
  --exclude node_modules \
  --exclude dist \
  --exclude dist-cli \
  --exclude dist-bundles \
  --exclude .data \
  --exclude 'ai-labs/reconstruction/tasks/*/environment/forgecad-package/*.tgz' \
  ./ popos:~/Projects/CAD/ForgeCAD/
```

Run preflight validation on the server:

```bash
ssh popos 'cd ~/Projects/CAD/ForgeCAD && npm run harbor:reconstruction:check'
```

If `popos` reports `Missing script: "harbor:reconstruction:check"`, the server
checkout is stale. Update the server checkout with git first.

Run the no-op Harbor smoke on the server:

```bash
ssh popos 'cd ~/Projects/CAD/ForgeCAD && npm run harbor:reconstruction:smoke'
```

Run one task on one Codex agent remotely:

```bash
ssh popos 'cd ~/Projects/CAD/ForgeCAD && \
  CODEX_AUTH_JSON_PATH="$HOME/.codex/auth.json" \
  uvx --from harbor harbor run \
  -p ai-labs/reconstruction/tasks \
  --include-task-name ball-bearing \
  --agent codex \
  --model gpt-5.5 \
  --artifact /app/submission/main.forge.js \
  --jobs-dir "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/codex" \
  --n-concurrent 1'
```

Run one task on the bundled Grok adapter remotely:

```bash
ssh popos 'cd ~/Projects/CAD/ForgeCAD && \
  PYTHONPATH="$PWD/ai-labs/reconstruction/agents" \
  GROK_AUTH_JSON="$HOME/.grok/auth.json" \
  GROK_CONFIG_TOML="$HOME/.grok/config.toml" \
  uvx --from harbor harbor run \
  -p ai-labs/reconstruction/tasks \
  --include-task-name ball-bearing \
  --agent-import-path grok_harbor_agent:GrokCliAgent \
  --model grok-build \
  --artifact /app/submission/main.forge.js \
  --jobs-dir "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/grok" \
  --n-concurrent 1'
```

Fetch remote Harbor jobs back before importing them into the local benchmark
ledger:

```bash
rsync -az \
  popos:~/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/ \
  "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/"

npm run benchmark:reconstruction:import-harbor -- \
  "$HOME/Projects/CAD/ForgeCAD-RL-Agent-Runs/harbor/codex/<timestamp>" \
  --budget standard
```

By default, task Dockerfiles install the pinned published ForgeCAD npm package.
For unpublished local ForgeCAD changes, run the local package flow on the server
checkout:

```bash
ssh popos 'cd ~/Projects/CAD/ForgeCAD && npm run harbor:reconstruction:pack-local'
ssh popos 'cd ~/Projects/CAD/ForgeCAD && npm run harbor:reconstruction:smoke'
ssh popos 'cd ~/Projects/CAD/ForgeCAD && npm run harbor:reconstruction:clean-local'
```

If `pack-local` fails because repo build tools are missing, run `npm install`
once in the server checkout.

There is intentionally no `popos` npm script. The host alias, checkout path,
and auth file locations are per-developer machine configuration; the repo-owned
commands remain the small Harbor command surface documented above.

## Local Reporting Wrapper

The local reporting wrapper is not a runner. It exists only to list Harbor
tasks, import completed Harbor jobs, and build submission ledgers, leaderboards,
and visual reports.

There is no local matrix command, no host-side agent adapter layer, no separate
task tracker, and no host-side evaluator. If an agent should run, run it
through Harbor.

## Run Data

Keep run outputs outside the source checkout under:

```text
~/Projects/CAD/ForgeCAD-RL-Agent-Runs/
benchmark/
harbor/
research/
smoke/
```

Keep raw agent logs and Harbor job directories. They are evidence for prompt
design, task packaging, tool behavior, and reward design.

Important Harbor files:

```text
<jobs-dir>/<timestamp>/<task-trial>/
result.json
trial.log
artifacts/main.forge.js
verifier/reward.json
verifier/reward-details.json
verifier/report.json
verifier/score.json
verifier/submission.png
verifier/reference.png
```
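
Since every trial keeps its reward at the same relative path, saved runs can be scanned without any extra tooling. The following is a minimal sketch, assuming the `<jobs-dir>/<timestamp>/<task-trial>/verifier/reward.json` layout listed above; the function name `collect_rewards` is hypothetical and is not one of the repo's npm commands.

```python
import json
from pathlib import Path


def collect_rewards(jobs_dir):
    """Map '<timestamp>/<task-trial>' to the parsed verifier/reward.json."""
    rewards = {}
    # Layout assumed from the listing above:
    # <jobs-dir>/<timestamp>/<task-trial>/verifier/reward.json
    for reward_path in sorted(Path(jobs_dir).glob("*/*/verifier/reward.json")):
        trial_dir = reward_path.parent.parent
        key = f"{trial_dir.parent.name}/{trial_dir.name}"
        with reward_path.open(encoding="utf-8") as fh:
            rewards[key] = json.load(fh)
    return rewards
```

Trials that never reached the verifier have no `reward.json` and are simply absent from the result, which makes missing rewards easy to spot by comparing against the trial folders.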

## Result Payload Contract

The Harbor verifier writes numeric reward files:

| Field | Purpose |
|---|---|
| `reward` | Scalar in `[0, 1]` for RL. |
| `score` | Human scale score in `[0, 100]`, normally `reward * 100`. |
| `rawCompareScore` | Raw ForgeCAD geometry comparison score before calibration. |
| `guard` | Numeric guardrail pass marker. |
| `valid` | Numeric validity marker. |

Harbor reward values must stay numeric because Harbor validates reward payloads
as scalar metrics. Detailed status, aggregation, guard issues, score vectors,
renders, and command logs live in `/logs/verifier/report.json`,
`/logs/verifier/score.json`, `/logs/verifier/reward-details.json`, and the
copied Harbor trial artifacts.
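
The numeric-only rule is easy to check mechanically. Below is a small sketch of such a check, assuming a parsed `reward.json` dictionary as input. The field names and the two ranges come from the table above; the function name `reward_payload_issues` and the choice to reject booleans are assumptions made for illustration, not behavior confirmed for Harbor or RewardKit.

```python
import math
from numbers import Real

REQUIRED_FIELDS = ("reward", "score", "rawCompareScore", "guard", "valid")
# Ranges stated in the table above; the other fields have no documented range.
RANGES = {"reward": (0.0, 1.0), "score": (0.0, 100.0)}


def reward_payload_issues(payload):
    """Return a list of human-readable problems; an empty list means OK."""
    issues = []
    for name in REQUIRED_FIELDS:
        value = payload.get(name)
        # Assumption: booleans are rejected even though Python treats them as ints.
        if isinstance(value, bool) or not isinstance(value, Real) or not math.isfinite(value):
            issues.append(f"{name}: expected a finite number, got {value!r}")
            continue
        low, high = RANGES.get(name, (-math.inf, math.inf))
        if not low <= value <= high:
            issues.append(f"{name}: {value!r} is outside [{low}, {high}]")
    return issues
```

Returning a list of issues instead of raising on the first one keeps all problems visible in a single verifier log line.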

## Scoring Principles

The scalar reward is a projection, not the whole truth. Keep the score vector
first-class.

For reconstruction tasks, the geometry component should include:

- Bidirectional surface coverage over multiple thresholds.
- Tail-distance penalties such as p95/p99.
- Feature-edge matching for sharp and boundary structure.
- Dimension agreement.
- Volume or occupancy overlap where reliable.
- Hard caps for obvious shortcut failures.

Avoid solving score problems by only changing a curve shape. If a two-ring
bearing approximation scores too high, the missing signal is structural: rolling
elements, feature edges, component count, section occupancy, or topology. Add
the missing measurement and then calibrate the scalar projection.
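
To make the "projection plus hard caps" idea concrete, here is a minimal sketch. It assumes each score-vector component is already normalized to `[0, 1]`; the component names, weights, and cap ceiling are invented for illustration and do not reflect the actual ForgeCAD aggregation profile.

```python
def project_reward(components, weights, caps=()):
    """Weighted mean of score-vector components, then hard caps, clamped to [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    reward = sum(weights[name] * components[name] for name in weights) / total_weight
    # Each cap is (triggered, ceiling): a triggered cap limits the reward
    # no matter how well the remaining components score.
    for triggered, ceiling in caps:
        if triggered:
            reward = min(reward, ceiling)
    return max(0.0, min(1.0, reward))


# Hypothetical two-ring bearing approximation: good coverage, weak feature edges.
vector = {"coverage": 0.9, "feature_edges": 0.5, "dimensions": 1.0}
weights = {"coverage": 2.0, "feature_edges": 1.0, "dimensions": 1.0}
uncapped = project_reward(vector, weights)
capped = project_reward(vector, weights, caps=[(True, 0.4)])
```

In this sketch the cap is driven by a separate structural measurement (for example a missing-component check) rather than by reshaping the weighting curve, which matches the guidance above: add the missing signal first, then calibrate the scalar.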

## Maintaining Tasks

When adding or changing a task:

1. Put the task under `ai-labs/reconstruction/tasks/<task-id>/`.
2. Use a user-facing slug such as `ball-bearing`, not an opaque source asset ID.
3. Include the full Harbor layout: prompt, task TOML, environment, reference,
   starter, tests, task metadata, and README.
4. Record the reference asset SHA-256 in `tests/task.json`.
5. Keep `/app/submission/main.forge.js` as the stable deliverable.
6. Refresh the dataset with `npm run harbor:reconstruction:refresh`.
7. Run identity scoring: reference vs reference should score near 100.
8. Run the starter score and record it.
9. Run at least one known weak shortcut candidate and one stronger candidate.
10. Confirm the stronger candidate ranks higher for the right reasons.
11. Store local experiment artifacts under `docs/temporary/projects/...`, then
    promote only durable conclusions here.
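
For step 4, the digest can be computed with the Python standard library. This is a minimal sketch; the helper name `sha256_of_file` is hypothetical, and the name of the field that holds the digest inside `tests/task.json` is not specified in this guide, so writing the value into that file is left out.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even for large `.3mf` reference assets.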

## Maintaining Agent Adapters

Agent adapters belong in Harbor, not in the local benchmark reporting wrapper.
For each supported Harbor agent CLI, keep a short note answering:

1. How to run headlessly.
2. How to set cwd.
3. How to set task-local home/config.
4. How skills are discovered.
5. How to disable global/user config.
6. How to disable web search, subagents, plugins, MCP, and memory.
7. What filesystem sandbox exists.
8. How auth is supplied.
9. How stdin is made non-interactive.
10. How logs are captured.
11. How the wall-clock timeout is enforced.

Promote stable CLI adapter facts here when they become part of the supported
contract.

## Versioning Policy

Version every surface that labs depend on:

- Harbor dataset digest.
- Task IDs.
- `tests/task.json` schema fields.
- Verifier payload shape.
- Score aggregation profile.
- Geometry comparison algorithm version.

Changing the scalar reward is a benchmark change. Keep the score vector and raw
compare metrics in the payload so old and new rewards can be compared across
saved runs.

## Merge Readiness Checklist

Before merging reconstruction environment changes:

1. `npm test` passes, or a narrower check is justified in the PR.
2. `npm run harbor:reconstruction:refresh` passes.
3. `npm run harbor:reconstruction:check` passes.
4. `npm run benchmark:reconstruction:list` passes.
5. `npm run harbor:reconstruction:smoke` passes when environment packaging
   changes.
6. Verifier reward output contains numeric `reward`, `score`,
   `rawCompareScore`, `guard`, and `valid`.
7. Verifier report output contains the detailed status, score vector,
   aggregation, guard data, and artifacts.
8. RewardKit writes `reward-details.json`.
9. Agent logs are captured in the Harbor trial folder.
10. Timeouts are enforced by Harbor and inside the verifier.
11. No global skills or parent repo instructions are required.
12. The docs index and this guide are updated when the contract changes.