cawdex 1.35.74 → 1.35.76
- package/README.md +5 -5
- package/bin/anycode.js +2 -2
- package/bin/cawdex.js +408 -408
- package/bin/ecc-hooks.cjs +11 -11
- package/dist/agents-md.d.ts +31 -0
- package/dist/agents-md.js +340 -0
- package/dist/agents-md.js.map +1 -0
- package/dist/agents.js +1424 -1424
- package/dist/api.d.ts +1 -0
- package/dist/api.js +19 -14
- package/dist/api.js.map +1 -1
- package/dist/autonomous-loops.js +287 -287
- package/dist/benchmark-repos.d.ts +31 -0
- package/dist/benchmark-repos.js +234 -8
- package/dist/benchmark-repos.js.map +1 -1
- package/dist/command-palette.js +4 -2
- package/dist/command-palette.js.map +1 -1
- package/dist/compaction.js +8 -8
- package/dist/config.js +51 -36
- package/dist/config.js.map +1 -1
- package/dist/content-engine.js +543 -543
- package/dist/context-brief.d.ts +4 -0
- package/dist/context-brief.js +230 -0
- package/dist/context-brief.js.map +1 -0
- package/dist/cost-tracker.d.ts +33 -14
- package/dist/cost-tracker.js +81 -19
- package/dist/cost-tracker.js.map +1 -1
- package/dist/coverage.js +39 -39
- package/dist/docs-sync.js +98 -98
- package/dist/evaluation.js +452 -452
- package/dist/fixed-footer.d.ts +7 -1
- package/dist/fixed-footer.js +92 -18
- package/dist/fixed-footer.js.map +1 -1
- package/dist/git-workflow.js +49 -49
- package/dist/index.d.ts +2 -0
- package/dist/index.js +197 -65
- package/dist/index.js.map +1 -1
- package/dist/instant-artifact.d.ts +6 -0
- package/dist/instant-artifact.js +397 -0
- package/dist/instant-artifact.js.map +1 -0
- package/dist/live-queue.js +1 -1
- package/dist/live-queue.js.map +1 -1
- package/dist/model-aliases.d.ts +37 -0
- package/dist/model-aliases.js +203 -0
- package/dist/model-aliases.js.map +1 -0
- package/dist/orchestration.js +15 -15
- package/dist/permissions.d.ts +6 -0
- package/dist/permissions.js +53 -0
- package/dist/permissions.js.map +1 -1
- package/dist/pm2-manager.js +26 -26
- package/dist/query.d.ts +0 -1
- package/dist/query.js +74 -39
- package/dist/query.js.map +1 -1
- package/dist/refactor.js +87 -87
- package/dist/repo-command.js +7 -1
- package/dist/repo-command.js.map +1 -1
- package/dist/search-first.js +92 -92
- package/dist/skill-create.js +100 -100
- package/dist/stitch.js +1 -1
- package/dist/system-prompt.d.ts +2 -1
- package/dist/system-prompt.js +10 -5
- package/dist/system-prompt.js.map +1 -1
- package/dist/tools/github-repo-digest.d.ts +1 -1
- package/dist/tools/github-repo-digest.js +38 -6
- package/dist/tools/github-repo-digest.js.map +1 -1
- package/dist/types.d.ts +3 -0
- package/dist/types.js.map +1 -1
- package/dist/verification.js +55 -55
- package/package.json +1 -1
- package/resources/__init__.py +1 -1
- package/resources/exgentic/cawdex_agent/README.md +114 -114
- package/resources/exgentic/cawdex_agent/__init__.py +5 -5
- package/resources/exgentic/cawdex_agent/agent.py +605 -605
- package/resources/exgentic/cawdex_agent/requirements.txt +2 -2
- package/resources/exgentic/cawdex_agent/setup.sh +21 -21
- package/resources/exgentic/cawdex_agent/utils.py +1061 -1061
- package/resources/hal/cawdex_agent/README.md +24 -24
- package/resources/hal/cawdex_agent/__init__.py +1 -1
- package/resources/hal/cawdex_agent/main.py +550 -550
- package/resources/hal/cawdex_agent/requirements.txt +2 -2
- package/resources/kbench/cawdex_agent/README.md +107 -107
- package/resources/kbench/cawdex_agent/adapter.manifest.json +19 -19
- package/resources/kbench/cawdex_agent/runner.mjs +753 -753
- package/resources/open_agent_leaderboard/cawdex-agent-card.md +119 -119
- package/resources/terminal_bench/__init__.py +1 -1
- package/resources/terminal_bench/cawdex_agent.py +174 -174
- package/resources/terminal_bench/setup.sh +121 -121
@@ -1,24 +1,24 @@
-# Cawdex HAL adapter
-
-This directory is a HAL-style custom agent package. It exposes `run(input, **kwargs)` in `main.py` and shells out to the installed `cawdex` CLI in non-interactive benchmark mode.
-
-Defaults:
-
-- SWE-bench-like tasks return a git patch string.
-- ScienceAgentBench-like tasks return a solution/trajectory string.
-- AppWorld-like tasks return `Completed` after a successful Cawdex run.
-- WebDevBench-like tasks are routed to the `webdevbench` benchmark profile, which keeps canary requirements plus frontend-backend and production/security validation evidence visible.
-- SWE-Cycle-like tasks are routed to the `swe-cycle` benchmark profile, which keeps lifecycle phase, environment setup, implementation, verification-test generation, and static/dynamic judge evidence visible.
-- SWE-CI-like tasks are routed to the `swe-ci` benchmark profile, which keeps current/target commits, test gaps, inferred requirements, and CI-loop verifier deltas visible.
-- SWE-PRBench-like tasks are routed to the `swe-prbench` benchmark profile, which reviews PR metadata and diff first, expands only for concrete suspected findings, and returns severity-rated file/line findings unless patches are explicitly requested.
-- TML-Bench/Kaggle-style tabular ML tasks are routed to the `tml-bench` benchmark profile, which extracts the data contract, avoids hidden-label leakage, trains an honest baseline, and validates the generated submission schema before completion.
-- Pi-Bench-style proactive personal assistant tasks are routed to the `pi-bench` benchmark profile, which builds a user/workspace/app context contract, tracks hidden-intent hypotheses, asks focused clarification only when needed, and verifies observable state after proactive actions.
-- USACO and other text-response tasks return the original task dict with a `response` field.
-- Oracle-like fields such as `patch`, `test_patch`, `solution`, `answer`, `gold`, `FAIL_TO_PASS`, and `PASS_TO_PASS` are omitted from the prompt unless `CAWDEX_HAL_INCLUDE_ORACLE_FIELDS=1` is set.
-- Traces and logs are written under `.cawdex/hal-trace/` unless `CAWDEX_HAL_TRACE_DIR` is set.
-
-Useful overrides:
-
-- `CAWDEX_HAL_COMMAND` or `CAWDEX_HAL_COMMAND`: command used to invoke Cawdex, default `cawdex`.
-- `CAWDEX_HAL_TIMEOUT_SEC`: per-task timeout, default `1800`.
-- HAL `-A model_name=...`, `-A provider=...`, `-A max_turns=...`, and `-A output_format=...` are forwarded to Cawdex CLI flags when present.
+# Cawdex HAL adapter
+
+This directory is a HAL-style custom agent package. It exposes `run(input, **kwargs)` in `main.py` and shells out to the installed `cawdex` CLI in non-interactive benchmark mode.
+
+Defaults:
+
+- SWE-bench-like tasks return a git patch string.
+- ScienceAgentBench-like tasks return a solution/trajectory string.
+- AppWorld-like tasks return `Completed` after a successful Cawdex run.
+- WebDevBench-like tasks are routed to the `webdevbench` benchmark profile, which keeps canary requirements plus frontend-backend and production/security validation evidence visible.
+- SWE-Cycle-like tasks are routed to the `swe-cycle` benchmark profile, which keeps lifecycle phase, environment setup, implementation, verification-test generation, and static/dynamic judge evidence visible.
+- SWE-CI-like tasks are routed to the `swe-ci` benchmark profile, which keeps current/target commits, test gaps, inferred requirements, and CI-loop verifier deltas visible.
+- SWE-PRBench-like tasks are routed to the `swe-prbench` benchmark profile, which reviews PR metadata and diff first, expands only for concrete suspected findings, and returns severity-rated file/line findings unless patches are explicitly requested.
+- TML-Bench/Kaggle-style tabular ML tasks are routed to the `tml-bench` benchmark profile, which extracts the data contract, avoids hidden-label leakage, trains an honest baseline, and validates the generated submission schema before completion.
+- Pi-Bench-style proactive personal assistant tasks are routed to the `pi-bench` benchmark profile, which builds a user/workspace/app context contract, tracks hidden-intent hypotheses, asks focused clarification only when needed, and verifies observable state after proactive actions.
+- USACO and other text-response tasks return the original task dict with a `response` field.
+- Oracle-like fields such as `patch`, `test_patch`, `solution`, `answer`, `gold`, `FAIL_TO_PASS`, and `PASS_TO_PASS` are omitted from the prompt unless `CAWDEX_HAL_INCLUDE_ORACLE_FIELDS=1` is set.
+- Traces and logs are written under `.cawdex/hal-trace/` unless `CAWDEX_HAL_TRACE_DIR` is set.
+
+Useful overrides:
+
+- `CAWDEX_HAL_COMMAND` or `CAWDEX_HAL_COMMAND`: command used to invoke Cawdex, default `cawdex`.
+- `CAWDEX_HAL_TIMEOUT_SEC`: per-task timeout, default `1800`.
+- HAL `-A model_name=...`, `-A provider=...`, `-A max_turns=...`, and `-A output_format=...` are forwarded to Cawdex CLI flags when present.
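The README hunk above documents two behaviors that a short example may make concrete: oracle-like fields are dropped from the prompt unless an opt-in variable is set, and a few environment variables override the defaults. The following is a minimal sketch of that logic, not the code in `main.py`. The helper names `strip_oracle_fields` and `resolve_settings` and the `env` parameter are hypothetical, and because the README lists the oracle fields with "such as", the set below may be incomplete.

```python
import os

# Field names copied from the README; the real adapter may filter more keys.
ORACLE_FIELDS = frozenset(
    {"patch", "test_patch", "solution", "answer", "gold",
     "FAIL_TO_PASS", "PASS_TO_PASS"}
)


def strip_oracle_fields(task, env=None):
    """Return a copy of the task dict without oracle-like keys.

    Hypothetical helper: the keys are kept only when
    CAWDEX_HAL_INCLUDE_ORACLE_FIELDS is set to "1".
    """
    env = os.environ if env is None else env
    if env.get("CAWDEX_HAL_INCLUDE_ORACLE_FIELDS") == "1":
        return dict(task)
    return {key: value for key, value in task.items()
            if key not in ORACLE_FIELDS}


def resolve_settings(env=None):
    """Resolve the documented overrides, falling back to the README defaults.

    Hypothetical helper: only the variable names and defaults come from
    the README.
    """
    env = os.environ if env is None else env
    return {
        "command": env.get("CAWDEX_HAL_COMMAND", "cawdex"),
        "timeout_sec": int(env.get("CAWDEX_HAL_TIMEOUT_SEC", "1800")),
        "trace_dir": env.get("CAWDEX_HAL_TRACE_DIR", ".cawdex/hal-trace"),
    }
```

Passing `env` explicitly keeps both helpers easy to test without mutating the process environment; in normal use they would read `os.environ`.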
@@ -1 +1 @@
-"""HAL adapter package for Cawdex."""
+"""HAL adapter package for Cawdex."""