language-operator 0.1.46 → 0.1.47
- checksums.yaml +4 -4
- data/.claude/commands/task.md +10 -0
- data/Gemfile.lock +1 -1
- data/components/agent/.rubocop.yml +1 -0
- data/components/agent/Dockerfile +43 -0
- data/components/agent/Dockerfile.dev +38 -0
- data/components/agent/Gemfile +15 -0
- data/components/agent/Makefile +67 -0
- data/components/agent/bin/langop-agent +140 -0
- data/components/agent/config/config.yaml +47 -0
- data/components/base/Dockerfile +34 -0
- data/components/base/Makefile +42 -0
- data/components/base/entrypoint.sh +12 -0
- data/components/base/gem-credentials +2 -0
- data/components/tool/.gitignore +10 -0
- data/components/tool/.rubocop.yml +19 -0
- data/components/tool/.yardopts +7 -0
- data/components/tool/Dockerfile +44 -0
- data/components/tool/Dockerfile.dev +39 -0
- data/components/tool/Gemfile +18 -0
- data/components/tool/Makefile +77 -0
- data/components/tool/README.md +145 -0
- data/components/tool/config.ru +4 -0
- data/components/tool/examples/calculator.rb +63 -0
- data/components/tool/examples/example_tool.rb +190 -0
- data/components/tool/lib/langop/dsl.rb +20 -0
- data/components/tool/server.rb +7 -0
- data/lib/language_operator/agent/task_executor.rb +39 -7
- data/lib/language_operator/cli/commands/agent.rb +0 -3
- data/lib/language_operator/cli/commands/system.rb +1 -0
- data/lib/language_operator/cli/formatters/log_formatter.rb +19 -67
- data/lib/language_operator/cli/formatters/log_style.rb +151 -0
- data/lib/language_operator/cli/formatters/progress_formatter.rb +10 -6
- data/lib/language_operator/logger.rb +3 -8
- data/lib/language_operator/templates/agent_synthesis.tmpl +35 -28
- data/lib/language_operator/templates/schema/agent_dsl_openapi.yaml +1 -1
- data/lib/language_operator/templates/schema/agent_dsl_schema.json +1 -1
- data/lib/language_operator/version.rb +1 -1
- data/synth/001/README.md +72 -0
- data/synth/001/output.log +13 -13
- data/synth/002/Makefile +12 -0
- data/synth/002/README.md +287 -0
- data/synth/002/agent.rb +23 -0
- data/synth/002/agent.txt +1 -0
- data/synth/002/output.log +22 -0
- metadata +33 -3
- data/synth/Makefile +0 -39
- data/synth/README.md +0 -342
data/synth/README.md
DELETED
@@ -1,342 +0,0 @@
-# Synthesis Test Suite
-
-This directory contains a test suite for validating agent code synthesis locally without requiring a Kubernetes cluster.
-
-## Purpose
-
-The synthesis test suite allows you to:
-
-1. **Test synthesis locally** - Generate Ruby DSL code from LanguageAgent YAML specs
-2. **Compare models** - See how different LLMs (Claude, GPT-4) synthesize the same agent
-3. **Iterate quickly** - Test template/prompt changes without deploying to K8s
-4. **Build regression tests** - Verify synthesis quality doesn't degrade
-5. **Debug synthesis issues** - Identify and fix problems in the synthesis pipeline
-
-## Directory Structure
-
-```
-synth/
-├── Makefile                 # Top-level test runner
-├── README.md                # This file
-└── 001/                     # Test case: "hello-world"
-    ├── agent.yaml           # Input: LanguageAgent spec
-    ├── agent.rb             # Output: Generated code (default model)
-    ├── agent.sonnet.rb      # Output: Claude Sonnet
-    ├── agent.gpt-4.rb       # Output: GPT-4
-    └── Makefile             # Test-specific targets
-```
-
-## Quick Start
-
-### Prerequisites
-
-**Option 1: Local OpenAI-Compatible Endpoint (Recommended)**
-
-Use a local LLM server (LMStudio, vLLM, Ollama with OpenAI adapter, etc.):
-
-```bash
-export SYNTHESIS_ENDPOINT="http://192.168.68.54:1234/v1"
-export SYNTHESIS_API_KEY="dummy" # Optional, defaults to "dummy"
-export SYNTHESIS_MODEL="mistralai/magistral-small-2509" # Or your quantized model
-```
-
-**Option 2: Cloud API Keys**
-
-```bash
-export ANTHROPIC_API_KEY="sk-ant-..." # For Claude
-export OPENAI_API_KEY="sk-..." # For GPT-4
-```
-
-The harness prioritizes `SYNTHESIS_ENDPOINT` if set, allowing you to test on quantized local models before hitting cloud APIs.
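
The precedence described in the removed README (local endpoint first, cloud keys second) could be sketched in Ruby roughly as follows. This is a minimal illustration, not the harness's actual code: `select_backend`, the returned hash shape, and the ordering of Anthropic before OpenAI are assumptions; only the environment variable names and the `"dummy"` default come from the README.

```ruby
# Hypothetical helper: pick a synthesis backend from environment variables.
# SYNTHESIS_ENDPOINT wins when set, as the README states. Everything else
# (hash keys, Anthropic-before-OpenAI ordering) is an assumption.
def select_backend(env = ENV)
  endpoint = env['SYNTHESIS_ENDPOINT']
  if endpoint && !endpoint.empty?
    return {
      kind: :openai_compatible,
      endpoint: endpoint,
      api_key: env.fetch('SYNTHESIS_API_KEY', 'dummy'), # README: defaults to "dummy"
      model: env['SYNTHESIS_MODEL']
    }
  end

  return { kind: :anthropic, api_key: env['ANTHROPIC_API_KEY'] } if env['ANTHROPIC_API_KEY']
  return { kind: :openai, api_key: env['OPENAI_API_KEY'] } if env['OPENAI_API_KEY']

  raise ArgumentError,
        'No API key found. Set SYNTHESIS_ENDPOINT, ANTHROPIC_API_KEY, or OPENAI_API_KEY'
end
```

Passing a plain hash instead of `ENV` keeps the selection logic easy to exercise without touching the real environment.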
-
-### Run a Test
-
-```bash
-# Run synthesis for test 001
-cd synth/001
-make synthesize
-
-# View the generated code
-cat agent.rb
-
-# Execute the agent locally
-make run
-
-# Clean up
-make clean
-```
-
-### Compare Models
-
-```bash
-cd synth/001
-
-# Generate with all models
-make synthesize-all
-
-# Compare outputs
-make compare
-
-# Or manually inspect
-cat agent.sonnet.rb
-cat agent.gpt-4.rb
-```
-
-## Test Case Format
-
-Each test case is a numbered directory (`001`, `002`, etc.) containing:
-
-### agent.yaml
-
-A LanguageAgent CRD spec:
-
-```yaml
-apiVersion: langop.io/v1alpha1
-kind: LanguageAgent
-metadata:
-  name: hello-world
-spec:
-  instructions: |
-    Say something in your logs
-  # Optional: toolRefs, modelRefs, personaRefs, etc.
-```
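
Reading such a spec and pulling out the fields used for synthesis might look like the sketch below. It relies only on Ruby's standard `yaml` library. The helper name `extract_fields` and the assumption that `toolRefs`, `modelRefs`, and `personaRefs` are lists are illustrative guesses, since the README does not show their shape.

```ruby
require 'yaml'

# Sample spec copied from the README's hello-world test case.
SAMPLE_SPEC = <<~YAML
  apiVersion: langop.io/v1alpha1
  kind: LanguageAgent
  metadata:
    name: hello-world
  spec:
    instructions: |
      Say something in your logs
YAML

# Hypothetical helper: parse a LanguageAgent document and return the
# fields a synthesis prompt would need. The *Refs fields are assumed
# to be arrays when present.
def extract_fields(yaml_text)
  doc = YAML.safe_load(yaml_text)
  raise ArgumentError, 'expected kind: LanguageAgent' unless doc['kind'] == 'LanguageAgent'

  spec = doc.fetch('spec', {})
  {
    name: doc.dig('metadata', 'name'),
    instructions: spec['instructions'].to_s.strip,
    tool_refs: Array(spec['toolRefs']),
    model_refs: Array(spec['modelRefs']),
    persona_refs: Array(spec['personaRefs'])
  }
end
```

Calling `extract_fields(SAMPLE_SPEC)` yields the agent name and the stripped instruction text, with empty lists for the optional references.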
-
-### Expected Output
-
-The synthesis process should generate a Ruby file like:
-
-```ruby
-require 'language_operator'
-
-agent "hello-world" do
-  description "Say something in your logs"
-  mode :autonomous
-  objectives [
-    "Log a message to the console"
-  ]
-  constraints do
-    max_iterations 1
-    timeout "30s"
-  end
-end
-```
-
-## Makefile Targets
-
-### In Test Directory (`synth/001/`)
-
-| Target | Description |
-|--------|-------------|
-| `make synthesize` | Generate `agent.rb` with default model |
-| `make synthesize-sonnet` | Generate `agent.sonnet.rb` with Claude |
-| `make synthesize-gpt-4` | Generate `agent.gpt-4.rb` with GPT-4 |
-| `make synthesize-all` | Generate for all configured models |
-| `make run` | Execute the synthesized `agent.rb` locally |
-| `make validate` | Validate Ruby syntax of `agent.rb` |
-| `make clean` | Remove all generated `.rb` files |
-| `make compare` | Diff outputs from different models |
-
-### Top-Level (`synth/`)
-
-| Target | Description |
-|--------|-------------|
-| `make test` | Run default synthesis for all tests |
-| `make test-all` | Run synthesis with all models |
-| `make clean` | Clean all test artifacts |
-| `make list` | List available test cases |
-
-## How It Works
-
-### Synthesis Flow
-
-1. **Load agent.yaml** - Parse LanguageAgent spec
-2. **Extract fields** - Get instructions, tools, models, persona
-3. **Build prompt** - Fill synthesis template with extracted data
-4. **Call LLM** - Send prompt to Claude/GPT-4
-5. **Extract code** - Parse Ruby code from markdown response
-6. **Validate** - Check syntax and security (AST validation)
-7. **Write output** - Save to `agent.rb` or model-specific file
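
Steps 5 and 6 of the flow above (extracting code from a markdown reply and checking its syntax) can be approximated with Ruby's standard `ripper` library. This is a sketch under assumptions: `extract_ruby` and `valid_syntax?` are hypothetical names, and the fallback of treating an unfenced reply as code is a design guess, not documented behavior.

```ruby
require 'ripper'

# Three backticks, built programmatically so this example does not
# contain a literal fence marker.
FENCE = '`' * 3

# Hypothetical helper: return the body of the first fenced block
# (optionally tagged "ruby") in an LLM reply. If no fence is found,
# the whole reply is treated as code (an assumption).
def extract_ruby(response)
  match = response.match(/#{FENCE}(?:ruby)?[ \t]*\n(.*?)#{FENCE}/m)
  code = (match ? match[1] : response).strip
  raise ArgumentError, 'Empty code generated' if code.empty?

  code
end

# Ripper.sexp returns nil when the source fails to parse.
def valid_syntax?(code)
  !Ripper.sexp(code).nil?
end
```

A reply whose fenced block is empty raises with the same "Empty code generated" message that the README lists under troubleshooting, though whether the real harness raises at this point is not confirmed by the source.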
-
-### Implementation
-
-The synthesis functionality is now integrated directly into the `aictl` CLI:
-
-```bash
-aictl system synthesize [INSTRUCTIONS]
-```
-
-This command uses LanguageModel resources from your cluster to generate agent code.
-
-## Adding New Test Cases
-
-1. Create a new directory:
-   ```bash
-   mkdir synth/002
-   ```
-
-2. Copy the Makefile template:
-   ```bash
-   cp synth/001/Makefile synth/002/
-   ```
-
-3. Create `agent.yaml`:
-   ```yaml
-   apiVersion: langop.io/v1alpha1
-   kind: LanguageAgent
-   metadata:
-     name: my-test-agent
-   spec:
-     instructions: |
-       Your test instructions here
-   ```
-
-4. Run synthesis:
-   ```bash
-   cd synth/002
-   make synthesize
-   ```
-
-5. Update top-level Makefile to include new test
-
-## Example Test Cases
-
-### 001 - Hello World
-**Instructions**: "Say something in your logs"
-**Expected**: Simple autonomous agent with single objective
-
-### 002 - Scheduled Agent (Future)
-**Instructions**: "Check website daily at noon"
-**Expected**: Scheduled agent with cron expression
-
-### 003 - Reactive Webhook (Future)
-**Instructions**: "When webhook received, send email"
-**Expected**: Reactive agent with webhook definition
-
-### 004 - Multi-Step Workflow (Future)
-**Instructions**: "Fetch data from API, analyze it, save results"
-**Expected**: Agent with workflow steps and dependencies
-
-## Relationship to `aictl system test-synthesis`
-
-The `aictl system test-synthesis` command provides similar functionality but with different interface:
-
-```bash
-# CLI-based (existing command)
-aictl system test-synthesis --instructions "Say something in your logs"
-
-# YAML-based (this test suite)
-cd synth/001 && make synthesize
-```
-
-**Benefits of YAML test suite:**
-- ✅ Version controlled test cases
-- ✅ Easy to compare model outputs side-by-side
-- ✅ Repeatable regression testing
-- ✅ Can specify full LanguageAgent spec (tools, models, etc.)
-
-**Benefits of CLI command:**
-- ✅ Quick one-off testing
-- ✅ No file management
-- ✅ Integrated with aictl workflow
-
-Both are valuable for different use cases!
-
-## Troubleshooting
-
-### API Key Not Found
-
-```
-Error: No API key found. Set either:
-  SYNTHESIS_ENDPOINT (for local/OpenAI-compatible)
-  ANTHROPIC_API_KEY (for Claude)
-  OPENAI_API_KEY (for GPT)
-```
-
-**Solution**: Set environment variables:
-```bash
-# For local endpoint (recommended)
-export SYNTHESIS_ENDPOINT="http://localhost:1234/v1"
-export SYNTHESIS_MODEL="your-model-name"
-
-# OR for cloud APIs
-export ANTHROPIC_API_KEY="sk-ant-..."
-export OPENAI_API_KEY="sk-..."
-```
-
-### Synthesis Failed
-
-```
-Error: LLM call failed: ...
-```
-
-**Solution**: Check:
-- API key is valid
-- Network connectivity
-- Model name is correct
-- LLM service is available
-
-### Validation Failed
-
-```
-Error: Security validation failed
-```
-
-**Solution**: The generated code contains dangerous methods. This is a synthesis quality issue - the template needs improvement or the LLM hallucinated unsafe code.
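
To see which calls might trip such a check, a token-level scan with the standard `ripper` library gives a rough approximation. This is not the project's AST validator: the `DANGEROUS` list and the `dangerous_calls` helper are hypothetical, and a token scan is a heuristic that also flags local variables sharing a listed name and misses backtick commands.

```ruby
require 'ripper'

# Hypothetical deny-list; the real validator's rules are not shown in the README.
DANGEROUS = %w[system exec eval spawn fork].freeze

# Return the deny-listed identifiers that appear as code tokens.
# Ripper.lex yields [[line, col], type, token, state] entries, so text
# inside string literals (type :on_tstring_content) is ignored.
def dangerous_calls(code)
  Ripper.lex(code)
        .select { |_pos, type, token| type == :on_ident && DANGEROUS.include?(token) }
        .map { |_pos, _type, token| token }
        .uniq
end
```

Running `dangerous_calls(File.read('agent.rb'))` on a rejected file lists candidate identifiers to look for when adjusting the template.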
-
-### Empty Output
-
-```
-Error: Empty code generated
-```
-
-**Solution**: The LLM didn't return code in the expected format. Check the prompt and template.
-
-## Development Workflow
-
-### Iterate on Template Changes
-
-1. Edit template: `lib/language_operator/templates/examples/agent_synthesis.tmpl`
-2. Test locally: `cd synth/001 && make clean && make synthesize`
-3. Review output: `cat agent.rb`
-4. Repeat until satisfied
-5. Copy to operator: Update Go operator's embedded template
-
-### Test DSL Changes
-
-1. Add new DSL feature to schema
-2. Update template to show example of new feature
-3. Create test case exercising new feature
-4. Run synthesis: `make synthesize`
-5. Verify generated code uses new feature correctly
-
-## Future Enhancements
-
-- [ ] Automated comparison with expected output (golden files)
-- [ ] CI/CD integration (run on every PR)
-- [ ] Metrics tracking (synthesis quality over time)
-- [ ] More test cases covering all DSL features
-- [ ] Support for additional models (Gemini, etc.)
-- [ ] Template A/B testing (compare different prompt versions)
-
-## Related Commands
-
-```bash
-# View DSL schema
-aictl system schema
-
-# View synthesis template
-aictl system synthesis-template
-
-# Validate template
-aictl system validate_template
-
-# Test synthesis (CLI)
-aictl system test-synthesis --instructions "..."
-```
-
-## Questions?
-
-See the main project documentation:
-- [Agent DSL Reference](../docs/dsl/agent-reference.md)
-- [Best Practices](../docs/dsl/best-practices.md)
-- [CLAUDE.md](../CLAUDE.md) - AI context document