harness-evolver 2.2.0 → 2.3.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "harness-evolver",
-  "version": "2.2.0",
+  "version": "2.3.0",
   "description": "Meta-Harness-style autonomous harness optimization for Claude Code",
   "author": "Raphael Valdetaro",
   "license": "MIT",
@@ -52,21 +52,85 @@ LS_PROJECT=$(langsmith-cli --json projects list --name-pattern "harness-evolver*
 
 If `LS_PROJECT` is empty, langsmith-cli is not available or no projects exist — skip to step 2.
 
-**Step 2: Gather traces from the discovered project**
+**Step 2: Gather raw traces from the discovered project**
 
 ```bash
 if [ -n "$LS_PROJECT" ]; then
-  langsmith-cli --json runs list --project "$LS_PROJECT" --failed --fields id,name,error,inputs --limit 10 > .harness-evolver/langsmith_diagnosis.json 2>/dev/null || echo "[]" > .harness-evolver/langsmith_diagnosis.json
-  langsmith-cli --json runs list --project "$LS_PROJECT" --fields id,name,inputs,outputs,latency_ms,total_tokens --limit 20 > .harness-evolver/langsmith_runs.json 2>/dev/null || echo "[]" > .harness-evolver/langsmith_runs.json
+  langsmith-cli --json runs list --project "$LS_PROJECT" --recent --fields id,name,inputs,outputs,error,total_tokens --limit 30 > /tmp/langsmith_raw.json 2>/dev/null || echo "[]" > /tmp/langsmith_raw.json
   langsmith-cli --json runs stats --project "$LS_PROJECT" > .harness-evolver/langsmith_stats.json 2>/dev/null || echo "{}" > .harness-evolver/langsmith_stats.json
   echo "$LS_PROJECT" > .harness-evolver/langsmith_project.txt
 else
-  echo "[]" > .harness-evolver/langsmith_diagnosis.json
+  echo "[]" > /tmp/langsmith_raw.json
   echo "{}" > .harness-evolver/langsmith_stats.json
 fi
 ```
 
-These files are included in the proposer's `<files_to_read>` so it has real trace data for diagnosis.
+**Step 3: Process raw LangSmith data into a readable format for proposers**
+
+The raw LangSmith data contains LangChain-serialized messages that are hard to read. Process it into a clean summary:
+
+```bash
+python3 -c "
+import json, sys
+
+raw = json.load(open('/tmp/langsmith_raw.json'))
+if not raw:
+    json.dump([], open('.harness-evolver/langsmith_runs.json', 'w'))
+    sys.exit(0)
+
+clean = []
+for r in raw:
+    entry = {'name': r.get('name', '?'), 'tokens': r.get('total_tokens', 0), 'error': r.get('error')}
+
+    # Extract readable prompt from LangChain serialized inputs
+    inputs = r.get('inputs', {})
+    if isinstance(inputs, dict) and 'messages' in inputs:
+        msgs = inputs['messages']
+        for msg_group in (msgs if isinstance(msgs, list) else [msgs]):
+            for msg in (msg_group if isinstance(msg_group, list) else [msg_group]):
+                if isinstance(msg, dict):
+                    kwargs = msg.get('kwargs', msg)
+                    content = kwargs.get('content', '')
+                    msg_type = msg.get('id', ['', '', '', ''])[3] if isinstance(msg.get('id'), list) else 'unknown'
+                    if 'Human' in str(msg_type) or 'user' in str(msg_type).lower():
+                        entry['user_message'] = str(content)[:300]
+                    elif 'System' in str(msg_type):
+                        entry['system_prompt_preview'] = str(content)[:200]
+
+    # Extract readable output
+    outputs = r.get('outputs', {})
+    if isinstance(outputs, dict) and 'generations' in outputs:
+        gens = outputs['generations']
+        if gens and isinstance(gens, list) and gens[0]:
+            gen = gens[0][0] if isinstance(gens[0], list) else gens[0]
+            if isinstance(gen, dict):
+                msg = gen.get('message', gen)
+                if isinstance(msg, dict):
+                    kwargs = msg.get('kwargs', msg)
+                    entry['llm_response'] = str(kwargs.get('content', ''))[:300]
+
+    clean.append(entry)
+
+json.dump(clean, open('.harness-evolver/langsmith_runs.json', 'w'), indent=2, ensure_ascii=False)
+print(f'Processed {len(clean)} LangSmith runs into readable format')
+" 2>/dev/null || echo "[]" > .harness-evolver/langsmith_runs.json
+```
+
+The resulting `langsmith_runs.json` has clean, readable entries:
+```json
+[
+  {
+    "name": "ChatGoogleGenerativeAI",
+    "tokens": 1332,
+    "error": null,
+    "user_message": "Analise este texto: Bom dia pessoal...",
+    "system_prompt_preview": "Você é um moderador de conteúdo...",
+    "llm_response": "{\"categories\": [\"safe\"], \"severity\": \"safe\"...}"
+  }
+]
+```
+
+These files are included in the proposer's `<files_to_read>` so it has readable trace data for diagnosis.
 
 ### 2. Propose (3 parallel candidates)
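For reference, the extraction logic introduced by the new Step 3 leans on LangChain's serialization shape: each message's `id` is a list whose last element is the class name, and generated text nests under `outputs.generations[0][0].message.kwargs.content`. A minimal standalone sketch of that logic (the `run` record below is hand-built to mirror those shapes, not taken from a real trace):

```python
# Hand-built sample run mirroring the fields the processing script reads
# (hypothetical data; field nesting follows LangChain's serialized form).
run = {
    "name": "ChatGoogleGenerativeAI",
    "total_tokens": 1332,
    "error": None,
    "inputs": {"messages": [[{
        "id": ["langchain", "schema", "messages", "HumanMessage"],
        "kwargs": {"content": "Analise este texto: Bom dia pessoal..."},
    }]]},
    "outputs": {"generations": [[{
        "message": {"kwargs": {"content": '{"categories": ["safe"]}'}},
    }]]},
}

# Input side: the message class name sits at index 3 of the "id" list,
# which is how the script distinguishes Human/System messages.
msg = run["inputs"]["messages"][0][0]
msg_type = msg.get("id", ["", "", "", ""])[3] if isinstance(msg.get("id"), list) else "unknown"
assert msg_type == "HumanMessage"
user_message = str(msg.get("kwargs", msg).get("content", ""))[:300]

# Output side: generations wrap a message whose kwargs hold the text.
gen = run["outputs"]["generations"][0][0]
llm_response = str(gen.get("message", gen).get("kwargs", {}).get("content", ""))[:300]

print({"name": run["name"], "user_message": user_message, "llm_response": llm_response})
```

The defensive `isinstance` checks matter because non-chat runs (tool calls, chains) serialize their inputs and outputs with different shapes, and the script must skip them without crashing.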