language-operator 0.0.1 → 0.1.31

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (120)
  1. checksums.yaml +4 -4
  2. data/.rubocop.yml +125 -0
  3. data/CHANGELOG.md +88 -0
  4. data/Gemfile +8 -0
  5. data/Gemfile.lock +284 -0
  6. data/LICENSE +229 -21
  7. data/Makefile +82 -0
  8. data/README.md +3 -11
  9. data/Rakefile +63 -0
  10. data/bin/aictl +7 -0
  11. data/completions/_aictl +232 -0
  12. data/completions/aictl.bash +121 -0
  13. data/completions/aictl.fish +114 -0
  14. data/docs/architecture/agent-runtime.md +585 -0
  15. data/docs/dsl/SCHEMA_VERSION.md +250 -0
  16. data/docs/dsl/agent-reference.md +604 -0
  17. data/docs/dsl/best-practices.md +1078 -0
  18. data/docs/dsl/chat-endpoints.md +895 -0
  19. data/docs/dsl/constraints.md +671 -0
  20. data/docs/dsl/mcp-integration.md +1177 -0
  21. data/docs/dsl/webhooks.md +932 -0
  22. data/docs/dsl/workflows.md +744 -0
  23. data/lib/language_operator/agent/base.rb +110 -0
  24. data/lib/language_operator/agent/executor.rb +440 -0
  25. data/lib/language_operator/agent/instrumentation.rb +54 -0
  26. data/lib/language_operator/agent/metrics_tracker.rb +183 -0
  27. data/lib/language_operator/agent/safety/ast_validator.rb +272 -0
  28. data/lib/language_operator/agent/safety/audit_logger.rb +104 -0
  29. data/lib/language_operator/agent/safety/budget_tracker.rb +175 -0
  30. data/lib/language_operator/agent/safety/content_filter.rb +93 -0
  31. data/lib/language_operator/agent/safety/manager.rb +207 -0
  32. data/lib/language_operator/agent/safety/rate_limiter.rb +150 -0
  33. data/lib/language_operator/agent/safety/safe_executor.rb +127 -0
  34. data/lib/language_operator/agent/scheduler.rb +183 -0
  35. data/lib/language_operator/agent/telemetry.rb +116 -0
  36. data/lib/language_operator/agent/web_server.rb +610 -0
  37. data/lib/language_operator/agent/webhook_authenticator.rb +226 -0
  38. data/lib/language_operator/agent.rb +149 -0
  39. data/lib/language_operator/cli/commands/agent.rb +1205 -0
  40. data/lib/language_operator/cli/commands/cluster.rb +371 -0
  41. data/lib/language_operator/cli/commands/install.rb +404 -0
  42. data/lib/language_operator/cli/commands/model.rb +266 -0
  43. data/lib/language_operator/cli/commands/persona.rb +393 -0
  44. data/lib/language_operator/cli/commands/quickstart.rb +22 -0
  45. data/lib/language_operator/cli/commands/status.rb +143 -0
  46. data/lib/language_operator/cli/commands/system.rb +772 -0
  47. data/lib/language_operator/cli/commands/tool.rb +537 -0
  48. data/lib/language_operator/cli/commands/use.rb +47 -0
  49. data/lib/language_operator/cli/errors/handler.rb +180 -0
  50. data/lib/language_operator/cli/errors/suggestions.rb +176 -0
  51. data/lib/language_operator/cli/formatters/code_formatter.rb +77 -0
  52. data/lib/language_operator/cli/formatters/log_formatter.rb +288 -0
  53. data/lib/language_operator/cli/formatters/progress_formatter.rb +49 -0
  54. data/lib/language_operator/cli/formatters/status_formatter.rb +37 -0
  55. data/lib/language_operator/cli/formatters/table_formatter.rb +163 -0
  56. data/lib/language_operator/cli/formatters/value_formatter.rb +113 -0
  57. data/lib/language_operator/cli/helpers/cluster_context.rb +62 -0
  58. data/lib/language_operator/cli/helpers/cluster_validator.rb +101 -0
  59. data/lib/language_operator/cli/helpers/editor_helper.rb +58 -0
  60. data/lib/language_operator/cli/helpers/kubeconfig_validator.rb +167 -0
  61. data/lib/language_operator/cli/helpers/pastel_helper.rb +24 -0
  62. data/lib/language_operator/cli/helpers/resource_dependency_checker.rb +74 -0
  63. data/lib/language_operator/cli/helpers/schedule_builder.rb +108 -0
  64. data/lib/language_operator/cli/helpers/user_prompts.rb +69 -0
  65. data/lib/language_operator/cli/main.rb +236 -0
  66. data/lib/language_operator/cli/templates/tools/generic.yaml +66 -0
  67. data/lib/language_operator/cli/wizards/agent_wizard.rb +246 -0
  68. data/lib/language_operator/cli/wizards/quickstart_wizard.rb +588 -0
  69. data/lib/language_operator/client/base.rb +214 -0
  70. data/lib/language_operator/client/config.rb +136 -0
  71. data/lib/language_operator/client/cost_calculator.rb +37 -0
  72. data/lib/language_operator/client/mcp_connector.rb +123 -0
  73. data/lib/language_operator/client.rb +19 -0
  74. data/lib/language_operator/config/cluster_config.rb +101 -0
  75. data/lib/language_operator/config/tool_patterns.yaml +57 -0
  76. data/lib/language_operator/config/tool_registry.rb +96 -0
  77. data/lib/language_operator/config.rb +138 -0
  78. data/lib/language_operator/dsl/adapter.rb +124 -0
  79. data/lib/language_operator/dsl/agent_context.rb +90 -0
  80. data/lib/language_operator/dsl/agent_definition.rb +427 -0
  81. data/lib/language_operator/dsl/chat_endpoint_definition.rb +115 -0
  82. data/lib/language_operator/dsl/config.rb +119 -0
  83. data/lib/language_operator/dsl/context.rb +50 -0
  84. data/lib/language_operator/dsl/execution_context.rb +47 -0
  85. data/lib/language_operator/dsl/helpers.rb +109 -0
  86. data/lib/language_operator/dsl/http.rb +184 -0
  87. data/lib/language_operator/dsl/mcp_server_definition.rb +73 -0
  88. data/lib/language_operator/dsl/parameter_definition.rb +124 -0
  89. data/lib/language_operator/dsl/registry.rb +36 -0
  90. data/lib/language_operator/dsl/schema.rb +1102 -0
  91. data/lib/language_operator/dsl/shell.rb +125 -0
  92. data/lib/language_operator/dsl/tool_definition.rb +112 -0
  93. data/lib/language_operator/dsl/webhook_authentication.rb +114 -0
  94. data/lib/language_operator/dsl/webhook_definition.rb +106 -0
  95. data/lib/language_operator/dsl/workflow_definition.rb +259 -0
  96. data/lib/language_operator/dsl.rb +161 -0
  97. data/lib/language_operator/errors.rb +60 -0
  98. data/lib/language_operator/kubernetes/client.rb +279 -0
  99. data/lib/language_operator/kubernetes/resource_builder.rb +194 -0
  100. data/lib/language_operator/loggable.rb +47 -0
  101. data/lib/language_operator/logger.rb +141 -0
  102. data/lib/language_operator/retry.rb +123 -0
  103. data/lib/language_operator/retryable.rb +132 -0
  104. data/lib/language_operator/templates/README.md +23 -0
  105. data/lib/language_operator/templates/examples/agent_synthesis.tmpl +115 -0
  106. data/lib/language_operator/templates/examples/persona_distillation.tmpl +19 -0
  107. data/lib/language_operator/templates/schema/.gitkeep +0 -0
  108. data/lib/language_operator/templates/schema/CHANGELOG.md +93 -0
  109. data/lib/language_operator/templates/schema/agent_dsl_openapi.yaml +306 -0
  110. data/lib/language_operator/templates/schema/agent_dsl_schema.json +452 -0
  111. data/lib/language_operator/tool_loader.rb +242 -0
  112. data/lib/language_operator/validators.rb +170 -0
  113. data/lib/language_operator/version.rb +1 -1
  114. data/lib/language_operator.rb +65 -3
  115. data/requirements/tasks/challenge.md +9 -0
  116. data/requirements/tasks/iterate.md +36 -0
  117. data/requirements/tasks/optimize.md +21 -0
  118. data/requirements/tasks/tag.md +5 -0
  119. data/test_agent_dsl.rb +108 -0
  120. metadata +507 -20
@@ -0,0 +1,895 @@
# Chat Endpoint Guide

Complete guide to exposing agents as OpenAI-compatible chat completion endpoints.

## Table of Contents

- [Overview](#overview)
- [Basic Configuration](#basic-configuration)
- [System Prompt](#system-prompt)
- [Model Parameters](#model-parameters)
- [API Endpoints](#api-endpoints)
- [Streaming Support](#streaming-support)
- [Authentication](#authentication)
- [Usage Examples](#usage-examples)
- [Integration with OpenAI SDK](#integration-with-openai-sdk)
- [Complete Examples](#complete-examples)
- [Best Practices](#best-practices)

## Overview

Language Operator agents can expose OpenAI-compatible chat completion endpoints, allowing them to be used as drop-in replacements for LLM APIs in existing applications.

### What is a Chat Endpoint?

A chat endpoint transforms an agent into an API-compatible language model that:
- Accepts OpenAI-format chat completion requests
- Supports both streaming and non-streaming responses
- Provides model listing via `/v1/models`
- Returns usage statistics (token counts)
- Works with existing OpenAI SDKs and tools

### Use Cases

- **Domain-specific models**: Create specialized "models" for specific tasks
- **Agent as a service**: Expose agents to other applications
- **LLM proxy**: Add custom logic, caching, or rate limiting
- **Testing**: Use agents as mock LLM endpoints
- **Integration**: Connect agents to LangChain, AutoGPT, etc.

## Basic Configuration

Define a chat endpoint using the `as_chat_endpoint` block:

```ruby
agent "github-expert" do
  description "GitHub API and workflow expert"
  mode :reactive

  as_chat_endpoint do
    system_prompt "You are a GitHub expert assistant"
    temperature 0.7
    max_tokens 2000
  end
end
```

**Key points:**
- Agent automatically switches to `:reactive` mode
- Endpoints are automatically created at `/v1/chat/completions` and `/v1/models`
- Agent processes chat messages and returns completions
- Works with existing OpenAI client libraries

## System Prompt

The system prompt defines the agent's behavior and expertise. It's prepended to every conversation.

### Basic System Prompt

```ruby
as_chat_endpoint do
  system_prompt "You are a helpful customer service assistant"
end
```

### Detailed System Prompt

Use a heredoc for multi-line prompts:

```ruby
as_chat_endpoint do
  system_prompt <<~PROMPT
    You are a GitHub expert assistant with deep knowledge of:
    - GitHub API and workflows
    - Pull requests, issues, and code review
    - GitHub Actions and CI/CD
    - Repository management and best practices

    Provide helpful, accurate answers about GitHub topics.
    Keep responses concise but informative.
  PROMPT
end
```

### System Prompt Best Practices

**Be specific about expertise:**
```ruby
system_prompt <<~PROMPT
  You are a Kubernetes troubleshooting expert specializing in:
  - Pod scheduling and resource issues
  - Network policy debugging
  - Storage and volume problems
  - Performance optimization
PROMPT
```

**Include behavioral guidelines:**
```ruby
system_prompt <<~PROMPT
  You are a financial analyst assistant.

  Guidelines:
  - Base all analysis on factual data
  - Clearly distinguish facts from interpretations
  - Use industry-standard terminology
  - Never provide investment advice
  - Always cite sources when referencing data
PROMPT
```

**Set tone and style:**
```ruby
system_prompt <<~PROMPT
  You are a friendly technical support agent.

  Communication style:
  - Use clear, simple language
  - Be patient and encouraging
  - Provide step-by-step instructions
  - Offer to clarify if anything is unclear
PROMPT
```

## Model Parameters

Configure LLM behavior with standard OpenAI parameters.

### Temperature

Controls randomness in responses (0.0 - 2.0):

```ruby
as_chat_endpoint do
  temperature 0.7 # Balanced creativity and consistency
end
```

**Guidelines:**
- `0.0` - Deterministic, focused responses (good for factual tasks)
- `0.5-0.7` - Balanced (default for most use cases)
- `1.0+` - More creative and varied (good for brainstorming)

### Max Tokens

Maximum tokens in the response:

```ruby
as_chat_endpoint do
  max_tokens 2000 # Limit response length
end
```

**Guidelines:**
- Set based on expected response length
- Consider cost implications
- Default: 2000 tokens

### Top P (Nucleus Sampling)

Alternative to temperature for controlling randomness (0.0 - 1.0):

```ruby
as_chat_endpoint do
  top_p 0.9 # Consider top 90% probability mass
end
```

**Note:** Use either `temperature` or `top_p`, not both.

### Frequency Penalty

Reduces repetition of token sequences (-2.0 to 2.0):

```ruby
as_chat_endpoint do
  frequency_penalty 0.5 # Discourage repetition
end
```

**Guidelines:**
- `0.0` - No penalty (default)
- `0.5-1.0` - Moderate reduction in repetition
- Higher values - Stronger penalty against repetition

### Presence Penalty

Encourages talking about new topics (-2.0 to 2.0):

```ruby
as_chat_endpoint do
  presence_penalty 0.6 # Encourage topic diversity
end
```

### Stop Sequences

Token sequences that stop generation:

```ruby
as_chat_endpoint do
  stop ["\n\n", "END", "###"] # Stop on these sequences
end
```

### Model Name

Custom model identifier returned in API responses:

```ruby
as_chat_endpoint do
  model "github-expert-v1" # Custom model name
end
```

**Default:** Agent name (e.g., `"github-expert"`)

### Complete Parameter Configuration

```ruby
as_chat_endpoint do
  system_prompt "You are a helpful assistant"

  # Model identification
  model "my-custom-model-v1"

  # Sampling parameters
  temperature 0.7
  top_p 0.9

  # Length controls
  max_tokens 2000
  stop ["\n\n\n"]

  # Repetition controls
  frequency_penalty 0.5
  presence_penalty 0.6
end
```

## API Endpoints

Chat endpoints expose OpenAI-compatible HTTP endpoints.

### POST /v1/chat/completions

Chat completion endpoint (streaming and non-streaming).

**Request format:**
```json
{
  "model": "github-expert-v1",
  "messages": [
    {"role": "user", "content": "How do I create a pull request?"}
  ],
  "stream": false
}
```

**Response format (non-streaming):**
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "github-expert-v1",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "To create a pull request on GitHub..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 45,
    "total_tokens": 60
  }
}
```

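The `usage` block can be consumed programmatically, for example to track spend per request. A minimal sketch that reads the field names shown in the response above:

```python
def summarize_usage(response: dict) -> str:
    """Summarize token usage from a chat completion response dict."""
    usage = response.get("usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", prompt + completion)
    return f"{prompt} prompt + {completion} completion = {total} total tokens"

response = {
    "usage": {"prompt_tokens": 15, "completion_tokens": 45, "total_tokens": 60}
}
print(summarize_usage(response))  # 15 prompt + 45 completion = 60 total tokens
```
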
### GET /v1/models

List available models.

**Request:**
```bash
GET /v1/models
```

**Response:**
```json
{
  "object": "list",
  "data": [
    {
      "id": "github-expert-v1",
      "object": "model",
      "created": 1677652288,
      "owned_by": "language-operator"
    }
  ]
}
```

### Health Check Endpoints

**GET /health** - Health check
```bash
curl http://localhost:8080/health
# Returns: {"status":"healthy"}
```

**GET /ready** - Readiness check
```bash
curl http://localhost:8080/ready
# Returns: {"status":"ready"}
```

## Streaming Support

Chat endpoints support Server-Sent Events (SSE) for streaming responses.

### Enabling Streaming

Set `stream: true` in the request:

```bash
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-expert-v1",
    "messages": [{"role": "user", "content": "Explain GitHub Actions"}],
    "stream": true
  }'
```

### Streaming Response Format

Responses are sent as SSE events:

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"github-expert-v1","choices":[{"index":0,"delta":{"content":"To"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"github-expert-v1","choices":[{"index":0,"delta":{"content":" create"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"github-expert-v1","choices":[{"index":0,"delta":{"content":" a"},"finish_reason":null}]}

...

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"github-expert-v1","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

**Key points:**
- Each chunk contains a delta with new content
- Final chunk includes `finish_reason: "stop"`
- Stream ends with `data: [DONE]`

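Parsing this format without an SDK is straightforward. A minimal sketch that concatenates the delta content from raw `data:` lines like the ones above:

```python
import json

def extract_stream_text(sse_lines):
    """Concatenate delta content from OpenAI-style SSE chunk lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

lines = [
    'data: {"choices":[{"index":0,"delta":{"content":"To"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":" create"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: [DONE]',
]
print(extract_stream_text(lines))  # To create
```
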
### Streaming vs Non-Streaming

**Non-streaming (default):**
- Complete response returned at once
- Simpler to consume
- Higher perceived latency
- Better for batch processing

**Streaming (`stream: true`):**
- Response sent incrementally
- Lower perceived latency
- Better user experience
- More complex to consume

## Authentication

The chat endpoint examples above are unauthenticated. To add authenticated custom routes, combine a chat endpoint with webhooks:

```ruby
agent "secure-chat-agent" do
  mode :reactive

  # Chat endpoint
  as_chat_endpoint do
    system_prompt "You are a helpful assistant"
    temperature 0.7
  end

  # Webhook for custom routes (can add auth)
  webhook "/authenticated" do
    method :post

    authenticate do
      verify_api_key(
        header: 'X-API-Key',
        secret: ENV['API_KEY']
      )
    end

    on_request do |context|
      # Custom authenticated logic
    end
  end
end
```

**Note:** Standard OpenAI SDK clients expect the `/v1/chat/completions` endpoint. For production deployments, add authentication at the infrastructure level (API gateway, ingress controller, etc.).

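At the infrastructure level, the key check itself typically reduces to a constant-time comparison of a shared secret. A minimal, illustrative sketch of such a gateway-side check (the `X-API-Key` header name mirrors the webhook example; the secret value here is a placeholder):

```python
import hmac

API_KEY = "example-secret"  # placeholder; in practice, load from a secret store

def authorized(headers: dict) -> bool:
    """Constant-time check of the X-API-Key header against the shared secret."""
    supplied = headers.get("X-API-Key", "")
    return hmac.compare_digest(supplied, API_KEY)

print(authorized({"X-API-Key": "example-secret"}))  # True
print(authorized({"X-API-Key": "wrong"}))           # False
```

`hmac.compare_digest` avoids the timing side channel of an ordinary string comparison.
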
## Usage Examples

### Using curl (Non-streaming)

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-expert-v1",
    "messages": [
      {"role": "user", "content": "How do I create a pull request?"}
    ]
  }'
```

### Using curl (Streaming)

```bash
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-expert-v1",
    "messages": [
      {"role": "user", "content": "Explain GitHub Actions"}
    ],
    "stream": true
  }'
```

### Multi-turn Conversation

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-expert-v1",
    "messages": [
      {"role": "user", "content": "What is a pull request?"},
      {"role": "assistant", "content": "A pull request is a way to propose changes..."},
      {"role": "user", "content": "How do I review one?"}
    ]
  }'
```

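The endpoint is stateless: the client resends the full message list on every turn. Managing that history client-side can be sketched as follows (an illustrative helper, not part of the DSL):

```python
def next_request(history, user_message, model="github-expert-v1"):
    """Append the user's turn and build the next chat completion request body."""
    history.append({"role": "user", "content": user_message})
    return {"model": model, "messages": list(history)}

history = []
req = next_request(history, "What is a pull request?")
# ...send req, then record the assistant's reply before the next turn:
history.append({"role": "assistant", "content": "A pull request is a way to propose changes..."})
req = next_request(history, "How do I review one?")
print(len(req["messages"]))  # 3
```
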
### With System Message

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-expert-v1",
    "messages": [
      {"role": "system", "content": "You are an expert in GitHub Actions."},
      {"role": "user", "content": "How do I set up CI/CD?"}
    ]
  }'
```

## Integration with OpenAI SDK

Chat endpoints are compatible with OpenAI client libraries.

### Python

```python
from openai import OpenAI

# Point client to your agent
client = OpenAI(
    api_key="not-needed",  # Not used, but required by SDK
    base_url="http://localhost:8080/v1"
)

# Non-streaming
response = client.chat.completions.create(
    model="github-expert-v1",
    messages=[
        {"role": "user", "content": "How do I create a PR?"}
    ]
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="github-expert-v1",
    messages=[
        {"role": "user", "content": "Explain GitHub Actions"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

### JavaScript/TypeScript

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'not-needed',
  baseURL: 'http://localhost:8080/v1',
});

// Non-streaming
const response = await client.chat.completions.create({
  model: 'github-expert-v1',
  messages: [
    { role: 'user', content: 'How do I create a PR?' }
  ],
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'github-expert-v1',
  messages: [
    { role: 'user', content: 'Explain GitHub Actions' }
  ],
  stream: true,
});

for await (const chunk of stream) {
  if (chunk.choices[0]?.delta?.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
```

### Ruby

```ruby
require 'openai'

client = OpenAI::Client.new(
  access_token: "not-needed",
  uri_base: "http://localhost:8080/v1/"
)

# Non-streaming
response = client.chat(
  parameters: {
    model: "github-expert-v1",
    messages: [
      { role: "user", content: "How do I create a PR?" }
    ]
  }
)
puts response.dig("choices", 0, "message", "content")

# Streaming
client.chat(
  parameters: {
    model: "github-expert-v1",
    messages: [
      { role: "user", content: "Explain GitHub Actions" }
    ],
    stream: proc do |chunk, _bytesize|
      print chunk.dig("choices", 0, "delta", "content")
    end
  }
)
```

### LangChain Integration

```python
from langchain_openai import ChatOpenAI

# Use agent as LangChain LLM
llm = ChatOpenAI(
    model="github-expert-v1",
    openai_api_key="not-needed",
    openai_api_base="http://localhost:8080/v1"
)

# Use in LangChain chains
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in GitHub"
)

chain = LLMChain(llm=llm, prompt=prompt)
result = chain.run(topic="pull requests")
print(result)
```

## Complete Examples

### GitHub Expert Agent

```ruby
agent "github-expert" do
  description "GitHub API and workflow expert"
  mode :reactive

  as_chat_endpoint do
    system_prompt <<~PROMPT
      You are a GitHub expert assistant with deep knowledge of:
      - GitHub API and workflows
      - Pull requests, issues, and code review
      - GitHub Actions and CI/CD
      - Repository management and best practices

      Provide helpful, accurate answers about GitHub topics.
      Keep responses concise but informative.
    PROMPT

    model "github-expert-v1"
    temperature 0.7
    max_tokens 2000
  end

  constraints do
    timeout '30s'
    requests_per_minute 30
    daily_budget 1000 # $10/day
  end
end
```

### Customer Support Agent

```ruby
agent "customer-support" do
  description "Friendly customer support assistant"
  mode :reactive

  as_chat_endpoint do
    system_prompt <<~PROMPT
      You are a friendly customer support representative.

      Guidelines:
      - Be empathetic and understanding
      - Provide clear, step-by-step solutions
      - Ask clarifying questions when needed
      - Escalate to human support for complex issues
      - Always maintain a professional, helpful tone

      Available topics:
      - Account management
      - Billing and payments
      - Technical troubleshooting
      - Product features and usage
    PROMPT

    model "support-assistant-v1"
    temperature 0.8 # Slightly more conversational
    max_tokens 1500
    presence_penalty 0.6 # Encourage topic variety
  end

  constraints do
    timeout '15s' # Quick responses for support
    requests_per_minute 60
    hourly_budget 500
    daily_budget 5000

    # Safety
    blocked_topics ['violence', 'hate-speech']
  end
end
```

### Technical Documentation Assistant

```ruby
agent "docs-assistant" do
  description "Technical documentation expert"
  mode :reactive

  as_chat_endpoint do
    system_prompt <<~PROMPT
      You are a technical documentation assistant specializing in API documentation.

      Your expertise:
      - REST API design and documentation
      - OpenAPI/Swagger specifications
      - Authentication and authorization patterns
      - Rate limiting and pagination
      - Error handling best practices

      When answering:
      - Provide code examples when relevant
      - Explain concepts clearly with examples
      - Reference industry standards (REST, OpenAPI, etc.)
      - Include best practices and gotchas
      - Format responses with proper markdown
    PROMPT

    model "docs-expert-v1"
    temperature 0.5 # More consistent/factual
    max_tokens 3000 # Longer for detailed explanations
    frequency_penalty 0.3 # Reduce repetition in docs
  end

  constraints do
    timeout '45s'
    requests_per_minute 20
    daily_budget 2000
  end
end
```

### Code Review Assistant

```ruby
agent "code-reviewer" do
  description "Automated code review assistant"
  mode :reactive

  as_chat_endpoint do
    system_prompt <<~PROMPT
      You are a senior software engineer conducting code reviews.

      Focus areas:
      - Code correctness and logic errors
      - Security vulnerabilities
      - Performance issues
      - Code style and best practices
      - Test coverage
      - Documentation quality

      Review approach:
      - Be constructive and specific
      - Explain the "why" behind suggestions
      - Prioritize issues by severity
      - Suggest concrete improvements
      - Acknowledge good practices

      Format reviews with:
      - Summary of overall code quality
      - Specific issues with line references
      - Suggested improvements
      - Security concerns (if any)
    PROMPT

    model "code-reviewer-v1"
    temperature 0.3 # Consistent, focused reviews
    max_tokens 4000 # Detailed reviews
    frequency_penalty 0.5 # Avoid repetitive comments
  end

  constraints do
    timeout '1m' # Allow time for thorough review
    requests_per_hour 50
    daily_budget 3000
  end
end
```

### SQL Query Helper

```ruby
agent "sql-helper" do
  description "SQL query assistance and optimization"
  mode :reactive

  as_chat_endpoint do
    system_prompt <<~PROMPT
      You are a database expert specializing in SQL query writing and optimization.

      Expertise:
      - SQL syntax (PostgreSQL, MySQL, SQLite)
      - Query optimization and performance
      - Index design
      - Join strategies
      - Aggregation and window functions
      - Common table expressions (CTEs)

      When helping:
      - Write clean, readable SQL
      - Explain query logic
      - Suggest optimizations
      - Warn about performance pitfalls
      - Include comments in complex queries
      - Consider different SQL dialects
    PROMPT

    model "sql-expert-v1"
    temperature 0.4 # Precise for SQL
    max_tokens 2500
    stop ["```\n\n"] # Stop after code block
  end

  constraints do
    timeout '30s'
    requests_per_minute 40
    daily_budget 1500
  end
end
```

## Best Practices

### System Prompt Design

1. **Be specific about expertise** - Define clear areas of knowledge
2. **Include guidelines** - Specify how the agent should respond
3. **Set boundaries** - Define what the agent should/shouldn't do
4. **Provide context** - Explain the agent's role and purpose
5. **Use examples** - Show expected behavior in the prompt

### Parameter Tuning

1. **Temperature**
   - Lower (0.0-0.3) for factual, consistent responses
   - Medium (0.5-0.7) for balanced interactions
   - Higher (0.8-1.0) for creative, varied responses

2. **Max Tokens**
   - Set based on expected response length
   - Consider cost implications
   - Balance between completeness and efficiency

3. **Penalties**
   - Use `frequency_penalty` to reduce repetition
   - Use `presence_penalty` to encourage topic diversity
   - Start low (0.0-0.5) and adjust based on behavior

### Performance

1. **Set appropriate timeouts** - Balance thoroughness and responsiveness
2. **Use streaming for long responses** - Better user experience
3. **Cache responses** - When appropriate for repeated queries
4. **Monitor token usage** - Track costs and optimize prompts

### Cost Management

1. **Set budget constraints** - Use `daily_budget` and `hourly_budget`
2. **Limit max_tokens** - Prevent unexpectedly long responses
3. **Monitor usage** - Track requests and token consumption
4. **Optimize prompts** - Shorter system prompts reduce costs

```ruby
constraints do
  hourly_budget 100 # $1/hour
  daily_budget 1000 # $10/day
  requests_per_minute 30
end
```

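Budget constraints are enforced on the agent side, but it is also useful to track spend client-side from the `usage` field of each response. A sketch, using a hypothetical blended per-token price:

```python
PRICE_PER_1K_TOKENS = 0.002  # hypothetical blended rate, in dollars

def accumulate_cost(responses):
    """Sum token usage across responses and estimate cost in dollars."""
    total_tokens = sum(r["usage"]["total_tokens"] for r in responses)
    return total_tokens, total_tokens / 1000 * PRICE_PER_1K_TOKENS

responses = [
    {"usage": {"total_tokens": 60}},
    {"usage": {"total_tokens": 140}},
]
tokens, cost = accumulate_cost(responses)
print(tokens, round(cost, 4))  # 200 0.0004
```
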
### Security

1. **Don't expose credentials** - Never include API keys in prompts
2. **Validate inputs** - Sanitize user messages
3. **Filter outputs** - Use `blocked_patterns` for PII
4. **Add authentication** - Use API gateway or webhook auth
5. **Rate limit** - Prevent abuse with `requests_per_minute`

### Testing

1. **Test with OpenAI SDK** - Verify compatibility
2. **Test streaming** - Ensure SSE works correctly
3. **Test error cases** - Handle malformed requests
4. **Test conversation history** - Multi-turn interactions
5. **Load test** - Verify performance under load

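A quick compatibility check can assert the response shape that OpenAI clients expect. A minimal sketch against the non-streaming format shown earlier:

```python
def valid_completion(resp: dict) -> bool:
    """Check the minimal shape of a non-streaming chat completion response."""
    try:
        assert resp["object"] == "chat.completion"
        choice = resp["choices"][0]
        assert choice["message"]["role"] == "assistant"
        assert isinstance(choice["message"]["content"], str)
        usage = resp["usage"]
        assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
        return True
    except (AssertionError, KeyError, IndexError):
        return False

sample = {
    "object": "chat.completion",
    "choices": [{"index": 0,
                 "message": {"role": "assistant", "content": "To create a pull request..."},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 15, "completion_tokens": 45, "total_tokens": 60},
}
print(valid_completion(sample))  # True
```
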
### Monitoring

1. **Track usage metrics** - Requests, tokens, costs
2. **Monitor latency** - Response time distribution
3. **Log errors** - Capture and analyze failures
4. **Monitor quality** - Track user feedback
5. **Alert on anomalies** - Unusual usage patterns

## See Also

- [Agent Reference](agent-reference.md) - Complete agent DSL reference
- [MCP Integration](mcp-integration.md) - Tool server capabilities
- [Webhooks](webhooks.md) - Reactive agent configuration
- [Best Practices](best-practices.md) - Production deployment patterns