harness-evolver 1.8.0 → 1.9.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "harness-evolver",
-   "version": "1.8.0",
+   "version": "1.9.0",
    "description": "Meta-Harness-style autonomous harness optimization for Claude Code",
    "author": "Raphael Valdetaro",
    "license": "MIT",
@@ -36,15 +36,34 @@ python3 -c "import json; s=json.load(open('.harness-evolver/summary.json')); pri
  
  ### 1.5. Gather LangSmith Traces (MANDATORY after every evaluation)
  
- **Run these commands unconditionally after EVERY evaluation** (including baseline). If langsmith-cli is not installed or there are no runs, the commands fail silently; that's fine. But you MUST attempt them.
+ **Run these commands unconditionally after EVERY evaluation** (including baseline). Do NOT guess project names; discover them.
+
+ **Step 1: Find the actual LangSmith project name**
  
  ```bash
- langsmith-cli --json runs list --project harness-evolver-{last_evaluated_version} --failed --fields id,name,error,inputs --limit 10 > .harness-evolver/langsmith_diagnosis.json 2>/dev/null || echo "[]" > .harness-evolver/langsmith_diagnosis.json
+ langsmith-cli --json projects list --name-pattern "harness-evolver*" --limit 10 2>/dev/null
+ ```
  
- langsmith-cli --json runs stats --project harness-evolver-{last_evaluated_version} > .harness-evolver/langsmith_stats.json 2>/dev/null || echo "{}" > .harness-evolver/langsmith_stats.json
+ This returns all projects matching the prefix. Pick the most recently updated one, or the one matching the current version. Save the project name:
+
+ ```bash
+ LS_PROJECT=$(langsmith-cli --json projects list --name-pattern "harness-evolver*" --limit 1 2>/dev/null | python3 -c "import sys,json; data=json.load(sys.stdin); print(data[0]['name'] if data else '')" 2>/dev/null || echo "")
  ```
  
- For the first iteration, use `baseline` as the version. For subsequent iterations, use the latest evaluated version.
+ If `LS_PROJECT` is empty, langsmith-cli is not available or no projects exist; skip to step 2.
+
+ **Step 2: Gather traces from the discovered project**
+
+ ```bash
+ if [ -n "$LS_PROJECT" ]; then
+     langsmith-cli --json runs list --project "$LS_PROJECT" --failed --fields id,name,error,inputs --limit 10 > .harness-evolver/langsmith_diagnosis.json 2>/dev/null || echo "[]" > .harness-evolver/langsmith_diagnosis.json
+     langsmith-cli --json runs stats --project "$LS_PROJECT" > .harness-evolver/langsmith_stats.json 2>/dev/null || echo "{}" > .harness-evolver/langsmith_stats.json
+     echo "$LS_PROJECT" > .harness-evolver/langsmith_project.txt
+ else
+     echo "[]" > .harness-evolver/langsmith_diagnosis.json
+     echo "{}" > .harness-evolver/langsmith_stats.json
+ fi
+ ```
  
  These files are included in the proposer's `<files_to_read>` so it has real trace data for diagnosis.
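
Reviewer note: the prose added in Step 1 says to pick the project matching the current version, while the one-liner takes the first entry returned with `--limit 1`. A minimal sketch of the selection rule, assuming the CLI's JSON output is a list of objects with a `name` key (as the one-liner above also assumes). The helper name `pick_project` and the version label `v001` are hypothetical and not part of the package:

```python
import json


def pick_project(projects_json: str, version: str, prefix: str = "harness-evolver") -> str:
    """Return the project named '<prefix>-<version>' if listed,
    else the first listed name, else an empty string."""
    try:
        projects = json.loads(projects_json) if projects_json.strip() else []
    except json.JSONDecodeError:
        # Treat unparsable output the same as "CLI unavailable"
        return ""
    if not isinstance(projects, list):
        return ""
    names = [p["name"] for p in projects if isinstance(p, dict) and "name" in p]
    wanted = f"{prefix}-{version}"
    if wanted in names:
        return wanted
    # Fallback mirrors the shell one-liner: take the first entry
    return names[0] if names else ""


# Hypothetical CLI output, for illustration only
sample = '[{"name": "harness-evolver-baseline"}, {"name": "harness-evolver-v001"}]'
print(pick_project(sample, "v001"))  # harness-evolver-v001
```

An empty return value corresponds to the empty `LS_PROJECT` case, in which Step 2 writes the placeholder `[]` and `{}` files.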
 
package/tools/evaluate.py CHANGED
@@ -118,12 +118,17 @@ def cmd_run(args):
      api_key = os.environ.get(ls.get("api_key_env", "LANGSMITH_API_KEY"), "")
      if api_key:
          version = os.path.basename(os.path.dirname(traces_dir))
+         ls_project = f"{ls.get('project_prefix', 'harness-evolver')}-{version}"
          langsmith_env = {
              **os.environ,
              "LANGCHAIN_TRACING_V2": "true",
              "LANGCHAIN_API_KEY": api_key,
-             "LANGCHAIN_PROJECT": f"{ls.get('project_prefix', 'harness-evolver')}-{version}",
+             "LANGCHAIN_PROJECT": ls_project,
          }
+         # Write the project name so the evolve skill knows where to find traces
+         ls_project_file = os.path.join(os.path.dirname(os.path.dirname(traces_dir)), "langsmith_project.txt")
+         with open(ls_project_file, "w") as f:
+             f.write(ls_project)
  
      for task_file in task_files:
          task_path = os.path.join(tasks_dir, task_file)
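
Reviewer note: a small sketch of where the new code writes `langsmith_project.txt`, using the same `os.path` calls as the hunk above. The directory layout `.harness-evolver/harnesses/v001/traces` is an assumption made for illustration and is not confirmed by this diff; `project_and_file` is a hypothetical helper:

```python
import os


def project_and_file(traces_dir: str, prefix: str = "harness-evolver"):
    """Reproduce the path arithmetic from the evaluate.py hunk."""
    # The version is the name of the directory that contains traces_dir
    version = os.path.basename(os.path.dirname(traces_dir))
    ls_project = f"{prefix}-{version}"
    # The file lands two directory levels above traces_dir
    ls_project_file = os.path.join(
        os.path.dirname(os.path.dirname(traces_dir)), "langsmith_project.txt"
    )
    return ls_project, ls_project_file


# Assumed layout, for illustration only
traces_dir = os.path.join(".harness-evolver", "harnesses", "v001", "traces")
project, path = project_and_file(traces_dir)
print(project)  # harness-evolver-v001
print(path)
```

Under this assumed layout the file is written inside `harnesses/`, whereas the skill's Step 2 writes `.harness-evolver/langsmith_project.txt`; whether the two paths coincide depends on the real directory structure and is worth checking. Note also that a trailing path separator on `traces_dir` changes what `os.path.dirname` returns, so the derived version would become `traces`.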