harness-evolver 3.0.5 → 3.0.6
- package/agents/evolver-proposer.md +28 -10
- package/package.json +1 -1
- package/skills/setup/SKILL.md +7 -5
- package/tools/detect_stack.py +0 -173
package/agents/evolver-proposer.md

````diff
@@ -64,7 +64,34 @@ Based on your strategy and diagnosis, modify the code:
 - **Error handling**: retry logic, fallback strategies, timeout handling
 - **Model selection**: which model for which task
 
-
+### Phase 3.5: Consult Documentation (MANDATORY)
+
+**Before writing ANY code**, you MUST consult Context7 for every library you'll be modifying or using. This is NOT optional.
+
+**Step 1 — Identify libraries from the code you read:**
+Read the imports in the files you're about to modify. For each framework/library (LangGraph, OpenAI, Anthropic, CrewAI, etc.):
+
+**Step 2 — Resolve library ID:**
+```
+resolve-library-id(libraryName: "langgraph", query: "what you're trying to do")
+```
+This returns up to 10 matches. Pick the one with the highest relevance.
+
+**Step 3 — Query docs for your specific task:**
+```
+get-library-docs(libraryId: "/langchain-ai/langgraph", query: "conditional edges StateGraph", topic: "routing")
+```
+Ask about the SPECIFIC API you're going to use or change.
+
+**Examples of what to query:**
+- About to modify a StateGraph? → `query: "StateGraph add_conditional_edges"`
+- Changing prompt template? → `query: "ChatPromptTemplate from_messages"` for langchain
+- Adding a tool? → `query: "StructuredTool create tool definition"` for langchain
+- Changing model? → `query: "ChatOpenAI model parameters temperature"` for openai
+
+**Why this matters:** Your training data may be outdated. Libraries change APIs between versions. A quick Context7 lookup takes seconds and prevents proposing code that uses deprecated or incorrect patterns. The documentation is the source of truth, not your model knowledge.
+
+**If Context7 MCP is not available:** Note in proposal.md "API patterns not verified against current docs — verify before deploying."
 
 ### Phase 4: Commit and Document
 
@@ -99,15 +126,6 @@ If `production_seed.json` exists:
 
 Prioritize changes that fix real production failures over synthetic test failures.
 
-## Context7 — Documentation Lookup
-
-Use Context7 MCP tools proactively when:
-- Writing code that uses a library API
-- Unsure about method signatures or patterns
-- Checking if a better approach exists in the latest version
-
-If Context7 is not available, proceed with model knowledge but note in proposal.md.
-
 ## Rules
 
 1. **Read before writing** — understand the code before changing it
````
package/package.json (CHANGED)

package/skills/setup/SKILL.md (CHANGED)
````diff
@@ -47,20 +47,22 @@ Use `$EVOLVER_PY` instead of `python3` for ALL tool invocations. This ensures th
 ## Phase 1: Explore Project (automatic)
 
 ```bash
-find . -maxdepth 3 -type f -name "*.py" | head -30
-$EVOLVER_PY $TOOLS/detect_stack.py .
+find . -maxdepth 3 -type f -name "*.py" -not -path "*/.venv/*" -not -path "*/node_modules/*" -not -path "*/__pycache__/*" | head -30
 ```
 
+**Monorepo detection**: if the project root has multiple subdirectories with their own `main.py` or `pyproject.toml`, it's a monorepo. Use AskUserQuestion to ask WHICH app to optimize before proceeding — do NOT scan everything.
+
 Look for:
 - Entry points: files with `if __name__`, or named `main.py`, `app.py`, `agent.py`, `graph.py`, `pipeline.py`
-- Framework: LangGraph, CrewAI, OpenAI SDK, Anthropic SDK, etc.
 - Existing LangSmith config: `LANGCHAIN_PROJECT` / `LANGSMITH_PROJECT` in env or `.env`
 - Existing test data: JSON files with inputs, CSV files, etc.
 - Dependencies: `requirements.txt`, `pyproject.toml`
 
-
+To identify the **framework**, read the entry point file and its immediate imports. The proposer agents will use Context7 MCP for detailed documentation lookup — you don't need to detect every library, just identify the main framework (LangGraph, CrewAI, OpenAI Agents SDK, etc.) from the imports you see.
+
+Identify the **run command** — how to execute the agent:
 - `python main.py` (if it accepts `--input` flag)
--
+- The command in the project's README, Makefile, or scripts/
 
 ## Phase 2: Confirm Detection (interactive)
 
````
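The monorepo heuristic added above (multiple subdirectories each carrying their own `main.py` or `pyproject.toml`) can be sketched in a few lines. A minimal sketch; `app_roots` and `is_monorepo` are hypothetical names, not package tools:

```python
from pathlib import Path

def app_roots(project_root: str) -> list[Path]:
    """Immediate subdirectories that look like standalone apps."""
    return sorted(
        d for d in Path(project_root).iterdir()
        if d.is_dir()
        and not d.name.startswith(".")
        and ((d / "main.py").exists() or (d / "pyproject.toml").exists())
    )

def is_monorepo(project_root: str) -> bool:
    # More than one candidate app root: stop and ask the user
    # which app to optimize instead of scanning everything.
    return len(app_roots(project_root)) > 1
```

When `is_monorepo` is true, the skill's instruction is to surface `app_roots` as AskUserQuestion options rather than proceeding.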
package/tools/detect_stack.py (DELETED)
````diff
@@ -1,173 +0,0 @@
-#!/usr/bin/env python3
-"""Detect the technology stack of a harness by analyzing Python imports via AST.
-
-Usage:
-    detect_stack.py <file_or_directory> [-o output.json]
-
-Maps imports to known libraries and their Context7 IDs for documentation lookup.
-Stdlib-only. No external dependencies.
-"""
-
-import ast
-import json
-import os
-import sys
-
-KNOWN_LIBRARIES = {
-    "langchain": {
-        "context7_id": "/langchain-ai/langchain",
-        "display": "LangChain",
-        "modules": ["langchain", "langchain_core", "langchain_openai",
-                    "langchain_anthropic", "langchain_community"],
-    },
-    "langgraph": {
-        "context7_id": "/langchain-ai/langgraph",
-        "display": "LangGraph",
-        "modules": ["langgraph"],
-    },
-    "llamaindex": {
-        "context7_id": "/run-llama/llama_index",
-        "display": "LlamaIndex",
-        "modules": ["llama_index"],
-    },
-    "openai": {
-        "context7_id": "/openai/openai-python",
-        "display": "OpenAI Python SDK",
-        "modules": ["openai"],
-    },
-    "anthropic": {
-        "context7_id": "/anthropics/anthropic-sdk-python",
-        "display": "Anthropic Python SDK",
-        "modules": ["anthropic"],
-    },
-    "dspy": {
-        "context7_id": "/stanfordnlp/dspy",
-        "display": "DSPy",
-        "modules": ["dspy"],
-    },
-    "crewai": {
-        "context7_id": "/crewAIInc/crewAI",
-        "display": "CrewAI",
-        "modules": ["crewai"],
-    },
-    "autogen": {
-        "context7_id": "/microsoft/autogen",
-        "display": "AutoGen",
-        "modules": ["autogen"],
-    },
-    "chromadb": {
-        "context7_id": "/chroma-core/chroma",
-        "display": "ChromaDB",
-        "modules": ["chromadb"],
-    },
-    "pinecone": {
-        "context7_id": "/pinecone-io/pinecone-python-client",
-        "display": "Pinecone",
-        "modules": ["pinecone"],
-    },
-    "qdrant": {
-        "context7_id": "/qdrant/qdrant",
-        "display": "Qdrant",
-        "modules": ["qdrant_client"],
-    },
-    "weaviate": {
-        "context7_id": "/weaviate/weaviate",
-        "display": "Weaviate",
-        "modules": ["weaviate"],
-    },
-    "fastapi": {
-        "context7_id": "/fastapi/fastapi",
-        "display": "FastAPI",
-        "modules": ["fastapi"],
-    },
-    "flask": {
-        "context7_id": "/pallets/flask",
-        "display": "Flask",
-        "modules": ["flask"],
-    },
-    "pydantic": {
-        "context7_id": "/pydantic/pydantic",
-        "display": "Pydantic",
-        "modules": ["pydantic"],
-    },
-    "pandas": {
-        "context7_id": "/pandas-dev/pandas",
-        "display": "Pandas",
-        "modules": ["pandas"],
-    },
-    "numpy": {
-        "context7_id": "/numpy/numpy",
-        "display": "NumPy",
-        "modules": ["numpy"],
-    },
-}
-
-
-def detect_from_file(filepath):
-    """Analyze imports of a Python file and return detected stack."""
-    with open(filepath) as f:
-        try:
-            tree = ast.parse(f.read())
-        except SyntaxError:
-            return {}
-
-    imports = set()
-    for node in ast.walk(tree):
-        if isinstance(node, ast.Import):
-            for alias in node.names:
-                imports.add(alias.name.split(".")[0])
-        elif isinstance(node, ast.ImportFrom):
-            if node.module:
-                imports.add(node.module.split(".")[0])
-
-    detected = {}
-    for lib_key, lib_info in KNOWN_LIBRARIES.items():
-        found = imports & set(lib_info["modules"])
-        if found:
-            detected[lib_key] = {
-                "context7_id": lib_info["context7_id"],
-                "display": lib_info["display"],
-                "modules_found": sorted(found),
-            }
-
-    return detected
-
-
-def detect_from_directory(directory):
-    """Analyze all .py files in a directory and consolidate the stack."""
-    all_detected = {}
-    for root, dirs, files in os.walk(directory):
-        for f in files:
-            if f.endswith(".py"):
-                filepath = os.path.join(root, f)
-                file_detected = detect_from_file(filepath)
-                for lib_key, lib_info in file_detected.items():
-                    if lib_key not in all_detected:
-                        all_detected[lib_key] = lib_info
-                    else:
-                        existing = set(all_detected[lib_key]["modules_found"])
-                        existing.update(lib_info["modules_found"])
-                        all_detected[lib_key]["modules_found"] = sorted(existing)
-    return all_detected
-
-
-if __name__ == "__main__":
-    import argparse
-
-    parser = argparse.ArgumentParser(description="Detect stack from Python files")
-    parser.add_argument("path", help="File or directory to analyze")
-    parser.add_argument("--output", "-o", help="Output JSON path")
-    args = parser.parse_args()
-
-    if os.path.isfile(args.path):
-        result = detect_from_file(args.path)
-    else:
-        result = detect_from_directory(args.path)
-
-    output = json.dumps(result, indent=2)
-
-    if args.output:
-        with open(args.output, "w") as f:
-            f.write(output)
-    else:
-        print(output)
````
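For reference, the per-directory consolidation step of the removed tool (merging per-file detections into one stack report) can be exercised standalone. A condensed reproduction, assuming the same `modules_found` shape the script emitted; `merge` is a name chosen here for illustration:

```python
# Condensed reproduction of detect_from_directory's merge logic from the
# deleted script: union modules_found across files, keeping sorted order.
def merge(all_detected: dict, file_detected: dict) -> dict:
    for key, info in file_detected.items():
        if key not in all_detected:
            all_detected[key] = dict(info)
        else:
            mods = set(all_detected[key]["modules_found"])
            mods.update(info["modules_found"])
            all_detected[key]["modules_found"] = sorted(mods)
    return all_detected

acc = {}
merge(acc, {"langchain": {"context7_id": "/langchain-ai/langchain",
                          "modules_found": ["langchain_core"]}})
merge(acc, {"langchain": {"context7_id": "/langchain-ai/langchain",
                          "modules_found": ["langchain", "langchain_core"]}})
print(acc["langchain"]["modules_found"])  # ['langchain', 'langchain_core']
```

In 3.0.6 this machinery is gone: the setup skill now identifies only the main framework by reading the entry point, and the proposer resolves Context7 IDs at query time.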