agent99 0.0.3 → 0.0.5
- checksums.yaml +4 -4
- data/A2A_SPEC-dev.md +1829 -0
- data/CHANGELOG.md +38 -0
- data/COMMITS.md +196 -0
- data/DOCS.md +96 -0
- data/README.md +212 -84
- data/Rakefile +62 -0
- data/docs/AI/htm.md +215 -0
- data/docs/AI/htm.rb +141 -0
- data/docs/AI/htm_demo.db +0 -0
- data/docs/AI/notes_on_htm_implementation.md +1319 -0
- data/docs/AI/some_code.rb +692 -0
- data/docs/advanced-topics/a2a-protocol.md +13 -0
- data/docs/{advanced_features.md → advanced-topics/advanced-features.md} +9 -4
- data/docs/{control_actions.md → advanced-topics/control-actions.md} +2 -0
- data/docs/advanced-topics/model-context-protocol.md +4 -0
- data/docs/advanced-topics/multi-agent-processing.md +674 -0
- data/docs/agent-development/request-response-handling.md +512 -0
- data/docs/agent99_framework/central_registry.md +94 -0
- data/docs/agent99_framework/message_client.md +120 -0
- data/docs/agent99_framework/registry_client.md +119 -0
- data/docs/api-reference/agent99-base.md +463 -0
- data/docs/api-reference/message-clients.md +495 -0
- data/docs/{api_reference.md → api-reference/overview.md} +14 -4
- data/docs/api-reference/registry-client.md +470 -0
- data/docs/api-reference/schemas.md +518 -0
- data/docs/assets/css/custom.css +27 -0
- data/docs/assets/images/agent-lifecycle.svg +73 -0
- data/docs/assets/images/agent-registry-process.svg +86 -0
- data/docs/assets/images/agent-registry-processes.svg +114 -0
- data/docs/assets/images/agent-types-overview.svg +51 -0
- data/docs/assets/images/agent99-architecture.svg +85 -0
- data/docs/assets/images/agent99_logo.png +0 -0
- data/docs/assets/images/control-actions-state.svg +83 -0
- data/docs/assets/images/knowledge-graph.svg +77 -0
- data/docs/assets/images/message-processing-flow.svg +148 -0
- data/docs/assets/images/multi-agent-system.svg +66 -0
- data/docs/assets/images/proxy-pattern-sequence.svg +48 -0
- data/docs/assets/images/request-flow.svg +97 -0
- data/docs/assets/images/request-processing-lifecycle.svg +50 -0
- data/docs/assets/images/request-response-sequence.svg +39 -0
- data/docs/{agent_lifecycle.md → core-concepts/agent-lifecycle.md} +2 -0
- data/docs/core-concepts/agent-types.md +255 -0
- data/docs/{architecture.md → core-concepts/architecture.md} +5 -5
- data/docs/core-concepts/what-is-an-agent.md +293 -0
- data/docs/diagrams/message-flow-sequence.svg +198 -0
- data/docs/diagrams/p2p-network-topology.svg +181 -0
- data/docs/diagrams/smart-transport-routing.svg +165 -0
- data/docs/diagrams/three-layer-architecture.svg +77 -0
- data/docs/diagrams/transport-extension-api.svg +309 -0
- data/docs/diagrams/transport-extension-architecture.svg +234 -0
- data/docs/diagrams/transport-selection-flowchart.svg +264 -0
- data/docs/examples/advanced-examples.md +951 -0
- data/docs/examples/basic-examples.md +268 -0
- data/docs/{agent_discovery.md → framework-components/agent-discovery.md} +9 -5
- data/docs/{agent_registry_processes.md → framework-components/agent-registry.md} +9 -3
- data/docs/{message_processing.md → framework-components/message-processing.md} +3 -1
- data/docs/getting-started/basic-example.md +306 -0
- data/docs/getting-started/installation.md +160 -0
- data/docs/getting-started/overview.md +64 -0
- data/docs/getting-started/quick-start.md +179 -0
- data/docs/index.md +97 -0
- data/docs/operations/breaking-changes.md +26 -0
- data/examples/DEMO.md +148 -0
- data/examples/README.md +50 -0
- data/examples/agent_watcher.rb +5 -1
- data/examples/bad_agent.rb +32 -0
- data/examples/chief_agent.rb +17 -6
- data/examples/control.rb +16 -7
- data/examples/example_agent.rb +16 -3
- data/examples/maxwell_agent86.rb +15 -26
- data/examples/registry.rb +10 -9
- data/examples/run_demo.rb +433 -0
- data/lib/agent99/agent_discovery.rb +4 -0
- data/lib/agent99/agent_lifecycle.rb +34 -10
- data/lib/agent99/amqp_message_client.rb +2 -2
- data/lib/agent99/base.rb +6 -2
- data/lib/agent99/message_processing.rb +6 -10
- data/lib/agent99/registry_client.rb +15 -11
- data/lib/agent99/tcp_message_client.rb +183 -0
- data/lib/agent99/version.rb +1 -1
- data/lib/agent99.rb +1 -1
- data/mkdocs.yml +195 -0
- data/p2p_plan.md +533 -0
- data/p2p_roadmap.md +299 -0
- data/registry_plan.md +1818 -0
- metadata +93 -30
- data/docs/README.md +0 -57
- data/docs/diagrams/agent_registry_processes.dot +0 -42
- data/docs/diagrams/agent_registry_processes.png +0 -0
- data/docs/diagrams/high_level_architecture.dot +0 -26
- data/docs/diagrams/high_level_architecture.png +0 -0
- data/docs/diagrams/request_flow.dot +0 -42
- data/docs/diagrams/request_flow.png +0 -0
- /data/docs/{extending_the_framework.md → advanced-topics/extending-the-framework.md} +0 -0
- /data/docs/{custom_agent_implementation.md → agent-development/custom-agent-implementation.md} +0 -0
- /data/docs/{error_handling_and_logging.md → agent-development/error-handling-and-logging.md} +0 -0
- /data/docs/{schema_definition.md → agent-development/schema-definition.md} +0 -0
- /data/docs/{messaging_system.md → framework-components/messaging-system.md} +0 -0
- /data/docs/{configuration.md → operations/configuration.md} +0 -0
- /data/docs/{preformance_considerations.md → operations/performance-considerations.md} +0 -0
- /data/docs/{security.md → operations/security.md} +0 -0
- /data/docs/{troubleshooting.md → operations/troubleshooting.md} +0 -0
@@ -33,10 +33,15 @@ The AgentWatcher will:
|
|
33
33
|
### Example Implementation
|
34
34
|
|
35
35
|
```ruby
|
36
|
-
class MyDynamicAgent < Agent99::Base
|
37
|
-
|
38
|
-
|
39
|
-
|
36
|
+
class MyDynamicAgent < Agent99::Base
|
37
|
+
def info
|
38
|
+
{
|
39
|
+
# ...
|
40
|
+
type: :server,
|
41
|
+
capabilities: ['my_capability'],
|
42
|
+
# ...
|
43
|
+
}
|
44
|
+
end
|
40
45
|
|
41
46
|
def receive_request
|
42
47
|
# Handle requests
|
@@ -4,6 +4,8 @@
|
|
4
4
|
|
5
5
|
Agent99 provides a robust control system that allows for dynamic management of agents during runtime. Control actions enable administrative operations and state management without requiring agent restarts or redeployment.
|
6
6
|
|
7
|
+

|
8
|
+
|
7
9
|
### Built-in Control Actions
|
8
10
|
|
9
11
|
The framework includes several built-in control actions:
|
@@ -0,0 +1,674 @@
|
|
1
|
+
# Multi-Agent Processing
|
2
|
+
|
3
|
+
Multi-agent processing allows you to run multiple agents within the same process or coordinate agents across different processes. This guide covers patterns, strategies, and best practices for multi-agent systems.
|
4
|
+
|
5
|
+
## Overview
|
6
|
+
|
7
|
+

|
8
|
+
|
9
|
+
## Single Process, Multiple Agents
|
10
|
+
|
11
|
+
Running multiple agents in the same Ruby process can be efficient for related services that need to share resources.
|
12
|
+
|
13
|
+
### Basic Multi-Agent Setup
|
14
|
+
|
15
|
+
```ruby
|
16
|
+
require 'agent99'
|
17
|
+
|
18
|
+
# Define your agents
|
19
|
+
class DatabaseAgent < Agent99::Base
|
20
|
+
def info
|
21
|
+
{
|
22
|
+
name: self.class.to_s,
|
23
|
+
type: :server,
|
24
|
+
capabilities: ['database', 'storage']
|
25
|
+
}
|
26
|
+
end
|
27
|
+
|
28
|
+
def process_request(payload)
|
29
|
+
# Database operations
|
30
|
+
operation = payload.dig(:operation)
|
31
|
+
case operation
|
32
|
+
when 'store'
|
33
|
+
result = store_data(payload[:data])
|
34
|
+
send_response(result: result)
|
35
|
+
when 'retrieve'
|
36
|
+
data = retrieve_data(payload[:id])
|
37
|
+
send_response(data: data)
|
38
|
+
else
|
39
|
+
send_error("Unknown operation: #{operation}")
|
40
|
+
end
|
41
|
+
end
|
42
|
+
end
|
43
|
+
|
44
|
+
class CacheAgent < Agent99::Base
|
45
|
+
def initialize
|
46
|
+
super
|
47
|
+
@cache = {}
|
48
|
+
end
|
49
|
+
|
50
|
+
def info
|
51
|
+
{
|
52
|
+
name: self.class.to_s,
|
53
|
+
type: :server,
|
54
|
+
capabilities: ['cache', 'memory']
|
55
|
+
}
|
56
|
+
end
|
57
|
+
|
58
|
+
def process_request(payload)
|
59
|
+
key = payload.dig(:key)
|
60
|
+
case payload.dig(:operation)
|
61
|
+
when 'get'
|
62
|
+
send_response(value: @cache[key])
|
63
|
+
when 'set'
|
64
|
+
@cache[key] = payload[:value]
|
65
|
+
send_response(status: 'stored')
|
66
|
+
when 'delete'
|
67
|
+
@cache.delete(key)
|
68
|
+
send_response(status: 'deleted')
|
69
|
+
end
|
70
|
+
end
|
71
|
+
end
|
72
|
+
|
73
|
+
class LoggingAgent < Agent99::Base
|
74
|
+
def info
|
75
|
+
{
|
76
|
+
name: self.class.to_s,
|
77
|
+
type: :server,
|
78
|
+
capabilities: ['logging', 'audit']
|
79
|
+
}
|
80
|
+
end
|
81
|
+
|
82
|
+
def process_request(payload)
|
83
|
+
# Log the event
|
84
|
+
log_entry = {
|
85
|
+
timestamp: Time.now.iso8601,
|
86
|
+
level: payload[:level] || 'info',
|
87
|
+
message: payload[:message],
|
88
|
+
source: payload[:source]
|
89
|
+
}
|
90
|
+
|
91
|
+
File.open('agent_audit.log', 'a') do |f|
|
92
|
+
f.puts log_entry.to_json
|
93
|
+
end
|
94
|
+
|
95
|
+
send_response(status: 'logged', entry_id: SecureRandom.uuid)
|
96
|
+
end
|
97
|
+
end
|
98
|
+
|
99
|
+
# Multi-agent process manager
|
100
|
+
class MultiAgentProcess
|
101
|
+
def initialize
|
102
|
+
@agents = []
|
103
|
+
@threads = []
|
104
|
+
@running = false
|
105
|
+
end
|
106
|
+
|
107
|
+
def add_agent(agent_class, options = {})
|
108
|
+
agent = agent_class.new(**options) # empty options pass no arguments, so zero-arg initializers still work
|
109
|
+
@agents << agent
|
110
|
+
agent
|
111
|
+
end
|
112
|
+
|
113
|
+
def start_all
|
114
|
+
@running = true
|
115
|
+
|
116
|
+
@agents.each do |agent|
|
117
|
+
thread = Thread.new do
|
118
|
+
begin
|
119
|
+
agent.run
|
120
|
+
rescue => e
|
121
|
+
puts "Agent #{agent.class} failed: #{e.message}"
|
122
|
+
end
|
123
|
+
end
|
124
|
+
@threads << thread
|
125
|
+
end
|
126
|
+
|
127
|
+
puts "Started #{@agents.size} agents in #{@threads.size} threads"
|
128
|
+
self
|
129
|
+
end
|
130
|
+
|
131
|
+
def stop_all
|
132
|
+
@running = false
|
133
|
+
|
134
|
+
@agents.each(&:shutdown)
|
135
|
+
@threads.each(&:join)
|
136
|
+
|
137
|
+
puts "All agents stopped"
|
138
|
+
end
|
139
|
+
|
140
|
+
def wait_for_shutdown
|
141
|
+
trap('INT') do
|
142
|
+
puts "\nShutting down all agents..."
|
143
|
+
stop_all
|
144
|
+
exit
|
145
|
+
end
|
146
|
+
|
147
|
+
@threads.each(&:join)
|
148
|
+
end
|
149
|
+
end
|
150
|
+
|
151
|
+
# Usage
|
152
|
+
if __FILE__ == $0
|
153
|
+
process = MultiAgentProcess.new
|
154
|
+
|
155
|
+
# Add agents to the process
|
156
|
+
process.add_agent(DatabaseAgent)
|
157
|
+
process.add_agent(CacheAgent)
|
158
|
+
process.add_agent(LoggingAgent)
|
159
|
+
|
160
|
+
# Start all agents
|
161
|
+
process.start_all
|
162
|
+
|
163
|
+
# Wait for shutdown signal
|
164
|
+
process.wait_for_shutdown
|
165
|
+
end
|
166
|
+
```
|
167
|
+
|
168
|
+
## Agent Coordination Patterns
|
169
|
+
|
170
|
+
### Producer-Consumer Pattern
|
171
|
+
|
172
|
+
```ruby
|
173
|
+
class ProducerAgent < Agent99::Base
|
174
|
+
def initialize
|
175
|
+
super
|
176
|
+
@job_counter = 0
|
177
|
+
end
|
178
|
+
|
179
|
+
def info
|
180
|
+
{
|
181
|
+
name: self.class.to_s,
|
182
|
+
type: :hybrid,
|
183
|
+
capabilities: ['producer', 'job_generator']
|
184
|
+
}
|
185
|
+
end
|
186
|
+
|
187
|
+
def start_producing
|
188
|
+
Thread.new do
|
189
|
+
loop do
|
190
|
+
# Find consumer agents
|
191
|
+
consumers = discover_agents(['consumer'])
|
192
|
+
|
193
|
+
if consumers.any?
|
194
|
+
# Create a job
|
195
|
+
job = create_job
|
196
|
+
|
197
|
+
# Send to a random consumer
|
198
|
+
consumer = consumers.sample
|
199
|
+
send_request(consumer[:name], job)
|
200
|
+
|
201
|
+
logger.info "Sent job #{job[:id]} to #{consumer[:name]}"
|
202
|
+
else
|
203
|
+
logger.warn "No consumers available"
|
204
|
+
end
|
205
|
+
|
206
|
+
sleep(5) # Produce every 5 seconds
|
207
|
+
end
|
208
|
+
end
|
209
|
+
end
|
210
|
+
|
211
|
+
def process_request(payload)
|
212
|
+
# Handle requests for job status, etc.
|
213
|
+
case payload[:operation]
|
214
|
+
when 'status'
|
215
|
+
send_response(jobs_produced: @job_counter, status: 'running')
|
216
|
+
end
|
217
|
+
end
|
218
|
+
|
219
|
+
private
|
220
|
+
|
221
|
+
def create_job
|
222
|
+
@job_counter += 1
|
223
|
+
{
|
224
|
+
id: "job_#{@job_counter}",
|
225
|
+
type: 'data_processing',
|
226
|
+
data: Array.new(100) { rand(1000) },
|
227
|
+
created_at: Time.now.iso8601
|
228
|
+
}
|
229
|
+
end
|
230
|
+
end
|
231
|
+
|
232
|
+
class ConsumerAgent < Agent99::Base
|
233
|
+
def initialize
|
234
|
+
super
|
235
|
+
@processed_jobs = 0
|
236
|
+
end
|
237
|
+
|
238
|
+
def info
|
239
|
+
{
|
240
|
+
name: "#{self.class}_#{Socket.gethostname}_#{Process.pid}",
|
241
|
+
type: :server,
|
242
|
+
capabilities: ['consumer', 'data_processor']
|
243
|
+
}
|
244
|
+
end
|
245
|
+
|
246
|
+
def process_request(payload)
|
247
|
+
job_id = payload.dig(:id)
|
248
|
+
job_type = payload.dig(:type)
|
249
|
+
|
250
|
+
logger.info "Processing job #{job_id} of type #{job_type}"
|
251
|
+
|
252
|
+
# Simulate processing time
|
253
|
+
processing_time = rand(1..3)
|
254
|
+
sleep(processing_time)
|
255
|
+
|
256
|
+
# Process the data
|
257
|
+
data = payload.dig(:data) || []
|
258
|
+
result = data.empty? ? 0 : data.sum / data.size.to_f # avoid NaN from 0 / 0.0 on empty data
|
259
|
+
|
260
|
+
@processed_jobs += 1
|
261
|
+
|
262
|
+
send_response(
|
263
|
+
job_id: job_id,
|
264
|
+
result: result,
|
265
|
+
processing_time: processing_time,
|
266
|
+
processed_by: info[:name],
|
267
|
+
total_processed: @processed_jobs
|
268
|
+
)
|
269
|
+
|
270
|
+
logger.info "Completed job #{job_id}"
|
271
|
+
end
|
272
|
+
end
|
273
|
+
```
|
274
|
+
|
275
|
+
### Load Balancing Pattern
|
276
|
+
|
277
|
+
```ruby
|
278
|
+
class LoadBalancerAgent < Agent99::Base
|
279
|
+
def initialize
|
280
|
+
super
|
281
|
+
@worker_stats = {}
|
282
|
+
@request_count = 0
|
283
|
+
end
|
284
|
+
|
285
|
+
def info
|
286
|
+
{
|
287
|
+
name: self.class.to_s,
|
288
|
+
type: :hybrid,
|
289
|
+
capabilities: ['load_balancer', 'proxy']
|
290
|
+
}
|
291
|
+
end
|
292
|
+
|
293
|
+
def process_request(payload)
|
294
|
+
@request_count += 1
|
295
|
+
|
296
|
+
# Find available worker agents
|
297
|
+
workers = discover_agents(['worker'])
|
298
|
+
|
299
|
+
if workers.empty?
|
300
|
+
return send_error("No workers available", "NO_WORKERS")
|
301
|
+
end
|
302
|
+
|
303
|
+
# Choose worker based on load balancing strategy
|
304
|
+
chosen_worker = choose_worker(workers)
|
305
|
+
|
306
|
+
# Forward request to chosen worker
|
307
|
+
begin
|
308
|
+
response = send_request(chosen_worker[:name], payload)
|
309
|
+
|
310
|
+
# Update worker stats
|
311
|
+
update_worker_stats(chosen_worker[:name], success: true)
|
312
|
+
|
313
|
+
# Add load balancer info to response
|
314
|
+
response[:routed_by] = info[:name]
|
315
|
+
response[:worker] = chosen_worker[:name]
|
316
|
+
|
317
|
+
send_response(response)
|
318
|
+
rescue => e
|
319
|
+
update_worker_stats(chosen_worker[:name], success: false)
|
320
|
+
send_error("Worker failed: #{e.message}", "WORKER_ERROR")
|
321
|
+
end
|
322
|
+
end
|
323
|
+
|
324
|
+
private
|
325
|
+
|
326
|
+
def choose_worker(workers)
|
327
|
+
# Round-robin load balancing
|
328
|
+
worker_index = @request_count % workers.size
|
329
|
+
workers[worker_index]
|
330
|
+
end
|
331
|
+
|
332
|
+
def update_worker_stats(worker_name, success:)
|
333
|
+
@worker_stats[worker_name] ||= { requests: 0, successes: 0, failures: 0 }
|
334
|
+
@worker_stats[worker_name][:requests] += 1
|
335
|
+
|
336
|
+
if success
|
337
|
+
@worker_stats[worker_name][:successes] += 1
|
338
|
+
else
|
339
|
+
@worker_stats[worker_name][:failures] += 1
|
340
|
+
end
|
341
|
+
end
|
342
|
+
end
|
343
|
+
|
344
|
+
class WorkerAgent < Agent99::Base
|
345
|
+
def initialize(worker_id = nil)
|
346
|
+
super()
|
347
|
+
@worker_id = worker_id || "worker_#{SecureRandom.hex(4)}"
|
348
|
+
end
|
349
|
+
|
350
|
+
def info
|
351
|
+
{
|
352
|
+
name: "#{self.class}_#{@worker_id}",
|
353
|
+
type: :server,
|
354
|
+
capabilities: ['worker', 'processing']
|
355
|
+
}
|
356
|
+
end
|
357
|
+
|
358
|
+
def process_request(payload)
|
359
|
+
# Simulate different processing capabilities
|
360
|
+
task_type = payload.dig(:task_type)
|
361
|
+
|
362
|
+
case task_type
|
363
|
+
when 'cpu_intensive'
|
364
|
+
result = perform_cpu_task(payload[:data])
|
365
|
+
when 'io_intensive'
|
366
|
+
result = perform_io_task(payload[:data])
|
367
|
+
when 'memory_intensive'
|
368
|
+
result = perform_memory_task(payload[:data])
|
369
|
+
else
|
370
|
+
result = perform_generic_task(payload[:data])
|
371
|
+
end
|
372
|
+
|
373
|
+
send_response(
|
374
|
+
result: result,
|
375
|
+
worker_id: @worker_id,
|
376
|
+
task_type: task_type,
|
377
|
+
processed_at: Time.now.iso8601
|
378
|
+
)
|
379
|
+
end
|
380
|
+
|
381
|
+
private
|
382
|
+
|
383
|
+
def perform_cpu_task(data)
|
384
|
+
# Simulate CPU-intensive work
|
385
|
+
(1..1000000).sum
|
386
|
+
end
|
387
|
+
|
388
|
+
def perform_io_task(data)
|
389
|
+
# Simulate I/O work
|
390
|
+
File.write("/tmp/worker_#{@worker_id}_output.txt", data.to_s)
|
391
|
+
"Data written to file"
|
392
|
+
end
|
393
|
+
|
394
|
+
def perform_memory_task(data)
|
395
|
+
# Simulate memory-intensive work
|
396
|
+
large_array = Array.new(100000) { rand }
|
397
|
+
large_array.sum
|
398
|
+
end
|
399
|
+
|
400
|
+
def perform_generic_task(data)
|
401
|
+
# Generic task processing
|
402
|
+
"Processed: #{data}"
|
403
|
+
end
|
404
|
+
end
|
405
|
+
```
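
The `LoadBalancerAgent` above records per-worker statistics but routes round-robin without consulting them. As a rough sketch of how those statistics could drive the choice instead, the hypothetical helper below (`choose_worker_by_stats` is not part of Agent99) picks the worker with the lowest observed failure rate. It assumes workers are hashes with a `:name` key and that the stats hash has the shape built by `update_worker_stats`.

```ruby
# Hypothetical helper, not part of Agent99: choose the worker with the
# lowest observed failure rate.
def choose_worker_by_stats(workers, worker_stats)
  workers.min_by do |worker|
    stats = worker_stats.fetch(worker[:name], { requests: 0, failures: 0 })
    requests = stats[:requests]
    # Workers with no recorded requests are treated as having a 0.0 failure rate.
    requests.zero? ? 0.0 : stats[:failures].to_f / requests
  end
end

workers = [{ name: 'WorkerAgent_a' }, { name: 'WorkerAgent_b' }]
stats = {
  'WorkerAgent_a' => { requests: 10, successes: 7, failures: 3 },
  'WorkerAgent_b' => { requests: 10, successes: 9, failures: 1 }
}

chosen = choose_worker_by_stats(workers, stats)
puts chosen[:name]
```

A selection like this tends to starve workers that failed early, so in practice it would probably be combined with round-robin or with a decay on old failures.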
|
406
|
+
|
407
|
+
## Multi-Process Coordination
|
408
|
+
|
409
|
+
### Process Manager
|
410
|
+
|
411
|
+
```ruby
|
412
|
+
class AgentProcessManager
|
413
|
+
def initialize
|
414
|
+
@processes = {}
|
415
|
+
end
|
416
|
+
|
417
|
+
def start_agent_process(agent_class, count: 1, options: {})
|
418
|
+
count.times do |i|
|
419
|
+
process_name = "#{agent_class.name.downcase}_#{i}"
|
420
|
+
|
421
|
+
pid = fork do
|
422
|
+
# Set process title for easier identification
|
423
|
+
Process.setproctitle("agent99_#{process_name}")
|
424
|
+
|
425
|
+
# Create and run the agent
|
426
|
+
agent = agent_class.new(**options) # empty options pass no arguments to the agent's initializer
|
427
|
+
|
428
|
+
# Handle graceful shutdown
|
429
|
+
trap('TERM') do
|
430
|
+
agent.shutdown
|
431
|
+
exit
|
432
|
+
end
|
433
|
+
|
434
|
+
agent.run
|
435
|
+
end
|
436
|
+
|
437
|
+
@processes[process_name] = {
|
438
|
+
pid: pid,
|
439
|
+
agent_class: agent_class, options: options, # options are read back by restart_process
|
440
|
+
started_at: Time.now
|
441
|
+
}
|
442
|
+
|
443
|
+
puts "Started #{agent_class} as process #{pid} (#{process_name})"
|
444
|
+
end
|
445
|
+
end
|
446
|
+
|
447
|
+
def stop_all_processes
|
448
|
+
@processes.each do |name, process_info|
|
449
|
+
begin
|
450
|
+
Process.kill('TERM', process_info[:pid])
|
451
|
+
Process.wait(process_info[:pid])
|
452
|
+
puts "Stopped process #{name} (#{process_info[:pid]})"
|
453
|
+
rescue Errno::ESRCH
|
454
|
+
puts "Process #{name} (#{process_info[:pid]}) already stopped"
|
455
|
+
end
|
456
|
+
end
|
457
|
+
|
458
|
+
@processes.clear
|
459
|
+
end
|
460
|
+
|
461
|
+
def monitor_processes
|
462
|
+
Thread.new do
|
463
|
+
loop do
|
464
|
+
@processes.to_a.each do |name, process_info| # iterate over a snapshot; restart_process mutates @processes
|
465
|
+
begin
|
466
|
+
# Check if process is still running
|
467
|
+
Process.getpgid(process_info[:pid])
|
468
|
+
rescue Errno::ESRCH
|
469
|
+
puts "Process #{name} (#{process_info[:pid]}) died, restarting..."
|
470
|
+
restart_process(name, process_info)
|
471
|
+
end
|
472
|
+
end
|
473
|
+
|
474
|
+
sleep(5) # Check every 5 seconds
|
475
|
+
end
|
476
|
+
end
|
477
|
+
end
|
478
|
+
|
479
|
+
def wait_for_shutdown
|
480
|
+
trap('INT') do
|
481
|
+
puts "\nShutting down all agent processes..."
|
482
|
+
stop_all_processes
|
483
|
+
exit
|
484
|
+
end
|
485
|
+
|
486
|
+
# Wait for all child processes
|
487
|
+
Process.waitall
|
488
|
+
end
|
489
|
+
|
490
|
+
private
|
491
|
+
|
492
|
+
def restart_process(name, process_info)
|
493
|
+
# Remove dead process
|
494
|
+
@processes.delete(name)
|
495
|
+
|
496
|
+
# Start new process
|
497
|
+
start_agent_process(
|
498
|
+
process_info[:agent_class],
|
499
|
+
count: 1,
|
500
|
+
options: process_info[:options] || {}
|
501
|
+
)
|
502
|
+
end
|
503
|
+
end
|
504
|
+
|
505
|
+
# Usage example
|
506
|
+
if __FILE__ == $0
|
507
|
+
manager = AgentProcessManager.new
|
508
|
+
|
509
|
+
# Start multiple instances of different agents
|
510
|
+
manager.start_agent_process(WorkerAgent, count: 3)
|
511
|
+
manager.start_agent_process(LoadBalancerAgent, count: 1)
|
512
|
+
manager.start_agent_process(LoggingAgent, count: 1)
|
513
|
+
|
514
|
+
# Start process monitoring
|
515
|
+
manager.monitor_processes
|
516
|
+
|
517
|
+
# Wait for shutdown
|
518
|
+
manager.wait_for_shutdown
|
519
|
+
end
|
520
|
+
```
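
One caveat with the liveness check in `monitor_processes`: on Unix-like systems, a child that has exited but has not yet been waited on may still be visible to `Process.getpgid`, so the `Errno::ESRCH` branch might not fire for a crashed child. A possible alternative, sketched here with a hypothetical helper name (`reap_if_exited`), is a non-blocking `Process.waitpid2` with `Process::WNOHANG`, which both detects the exit and reaps the child. The sketch assumes a platform where `fork` is available.

```ruby
# Hypothetical helper: returns the child's Process::Status if it has exited
# (reaping it in the process), or nil if it is still running.
# Raises Errno::ECHILD if pid is not a child of the calling process.
def reap_if_exited(pid)
  Process.waitpid2(pid, Process::WNOHANG)&.last
end

# Demonstration with a short-lived child that exits with status 3.
pid = fork { exit!(3) }

status = nil
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 5
until status || Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
  status = reap_if_exited(pid)
  sleep 0.05 unless status
end

puts "child #{pid} exited with status #{status.exitstatus}" if status
```

In a monitor loop, a non-nil status would be the signal to restart the agent process.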
|
521
|
+
|
522
|
+
## Performance Considerations
|
523
|
+
|
524
|
+
### Memory Management
|
525
|
+
|
526
|
+
```ruby
|
527
|
+
class MemoryEfficientAgent < Agent99::Base
|
528
|
+
def initialize
|
529
|
+
super
|
530
|
+
@request_count = 0
|
531
|
+
@memory_check_interval = 100
|
532
|
+
end
|
533
|
+
|
534
|
+
def process_request(payload)
|
535
|
+
@request_count += 1
|
536
|
+
|
537
|
+
# Periodic memory check
|
538
|
+
if @request_count % @memory_check_interval == 0
|
539
|
+
check_memory_usage
|
540
|
+
end
|
541
|
+
|
542
|
+
# Process request with memory awareness
|
543
|
+
result = process_with_memory_limit(payload)
|
544
|
+
send_response(result)
|
545
|
+
end
|
546
|
+
|
547
|
+
private
|
548
|
+
|
549
|
+
def check_memory_usage
|
550
|
+
# Get current memory usage
|
551
|
+
memory_mb = `ps -o rss= -p #{Process.pid}`.to_i / 1024
|
552
|
+
|
553
|
+
if memory_mb > 500 # 500MB limit
|
554
|
+
logger.warn "High memory usage: #{memory_mb}MB"
|
555
|
+
|
556
|
+
# Trigger garbage collection
|
557
|
+
GC.start
|
558
|
+
|
559
|
+
# Check again after GC
|
560
|
+
new_memory_mb = `ps -o rss= -p #{Process.pid}`.to_i / 1024
|
561
|
+
logger.info "Memory after GC: #{new_memory_mb}MB"
|
562
|
+
end
|
563
|
+
end
|
564
|
+
|
565
|
+
def process_with_memory_limit(payload)
|
566
|
+
# Implement memory-conscious processing
|
567
|
+
if payload[:data]&.size.to_i > 10000 # to_i turns a nil size (missing data) into 0
|
568
|
+
# Stream process large data instead of loading all at once
|
569
|
+
process_large_data_streaming(payload[:data])
|
570
|
+
else
|
571
|
+
# Normal processing for small data
|
572
|
+
process_small_data(payload[:data])
|
573
|
+
end
|
574
|
+
end
|
575
|
+
end
|
576
|
+
```
|
577
|
+
|
578
|
+
### Thread Pool Management
|
579
|
+
|
580
|
+
```ruby
|
581
|
+
require 'concurrent-ruby'
|
582
|
+
|
583
|
+
class ThreadPoolAgent < Agent99::Base
|
584
|
+
def initialize(options = {})
|
585
|
+
super(options)
|
586
|
+
|
587
|
+
# Create thread pools for different task types
|
588
|
+
@cpu_pool = Concurrent::ThreadPoolExecutor.new(
|
589
|
+
min_threads: 2,
|
590
|
+
max_threads: 4,
|
591
|
+
max_queue: 100
|
592
|
+
)
|
593
|
+
|
594
|
+
@io_pool = Concurrent::ThreadPoolExecutor.new(
|
595
|
+
min_threads: 5,
|
596
|
+
max_threads: 20,
|
597
|
+
max_queue: 1000
|
598
|
+
)
|
599
|
+
end
|
600
|
+
|
601
|
+
def process_request(payload)
|
602
|
+
task_type = payload.dig(:task_type)
|
603
|
+
|
604
|
+
# Choose appropriate thread pool
|
605
|
+
pool = case task_type
|
606
|
+
when 'cpu_intensive'
|
607
|
+
@cpu_pool
|
608
|
+
when 'io_intensive', 'network'
|
609
|
+
@io_pool
|
610
|
+
else
|
611
|
+
@cpu_pool
|
612
|
+
end
|
613
|
+
|
614
|
+
# Execute in thread pool
|
615
|
+
future = Concurrent::Future.execute(executor: pool) do
|
616
|
+
perform_task(payload)
|
617
|
+
end
|
618
|
+
|
619
|
+
# Wait for completion with timeout
|
620
|
+
begin
|
621
|
+
result = future.value!(30) # 30 second timeout; value! re-raises errors from the task
|
622
|
+
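future.complete? ? send_response(result) : send_error("Task timed out", "TIMEOUT") # a timed-out future returns nil rather than raising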
|
623
|
+
rescue Concurrent::TimeoutError
|
624
|
+
send_error("Task timed out", "TIMEOUT")
|
625
|
+
rescue => e
|
626
|
+
send_error("Task failed: #{e.message}", "EXECUTION_ERROR")
|
627
|
+
end
|
628
|
+
end
|
629
|
+
|
630
|
+
def shutdown
|
631
|
+
# Shutdown thread pools gracefully
|
632
|
+
[@cpu_pool, @io_pool].each do |pool|
|
633
|
+
pool.shutdown
|
634
|
+
unless pool.wait_for_termination(10)
|
635
|
+
pool.kill
|
636
|
+
end
|
637
|
+
end
|
638
|
+
|
639
|
+
super
|
640
|
+
end
|
641
|
+
end
|
642
|
+
```
|
643
|
+
|
644
|
+
## Best Practices
|
645
|
+
|
646
|
+
### 1. Resource Management
|
647
|
+
- **Limit agent count**: Don't run too many agents per process
|
648
|
+
- **Monitor memory**: Set limits and monitor usage
|
649
|
+
- **Use thread pools**: Prevent thread explosion
|
650
|
+
- **Clean up resources**: Properly shutdown agents and connections
|
651
|
+
|
652
|
+
### 2. Communication Patterns
|
653
|
+
- **Use discovery**: Don't hardcode agent names
|
654
|
+
- **Handle failures**: Agents may come and go
|
655
|
+
- **Implement timeouts**: Prevent hanging requests
|
656
|
+
- **Use circuit breakers**: Protect against cascade failures
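
The circuit-breaker advice above can be illustrated with a small, framework-independent sketch. The `CircuitBreaker` class, its parameters, and the injected `clock` are all hypothetical and are not part of Agent99. The idea is to stop calling a failing agent after a number of consecutive failures, then allow a trial call once a cool-down period has passed.

```ruby
# Minimal circuit breaker sketch; all names here are hypothetical.
class CircuitBreaker
  class OpenError < StandardError; end

  def initialize(failure_threshold: 3, reset_after: 30,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @failure_threshold = failure_threshold
    @reset_after = reset_after
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  # The circuit is open while the cool-down period has not yet elapsed.
  def open?
    return false if @opened_at.nil?

    @clock.call - @opened_at < @reset_after
  end

  # Runs the block unless the circuit is open; counts consecutive failures.
  def call
    raise OpenError, 'circuit open' if open?

    result = yield
    @failures = 0
    @opened_at = nil
    result
  rescue OpenError
    raise
  rescue StandardError
    @failures += 1
    @opened_at = @clock.call if @failures >= @failure_threshold
    raise
  end
end

breaker = CircuitBreaker.new(failure_threshold: 2, reset_after: 30)

3.times do |i|
  begin
    # Stand-in for a request to another agent that keeps failing.
    breaker.call { raise IOError, 'worker unreachable' }
  rescue CircuitBreaker::OpenError
    puts "call #{i + 1}: skipped, circuit open"
  rescue IOError => e
    puts "call #{i + 1}: failed (#{e.message})"
  end
end
```

Wrapping each outbound request in `breaker.call { ... }` is one way to keep a struggling worker from being hit with further requests while it recovers. This sketch is not thread-safe, so a shared breaker would need a mutex around its state.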
|
657
|
+
|
658
|
+
### 3. Monitoring and Debugging
|
659
|
+
- **Log extensively**: Track agent interactions
|
660
|
+
- **Use unique names**: Include process/thread identifiers
|
661
|
+
- **Monitor performance**: Track request rates and response times
|
662
|
+
- **Health checks**: Implement agent health endpoints
|
663
|
+
|
664
|
+
### 4. Testing Multi-Agent Systems
|
665
|
+
- **Integration tests**: Test agent interactions
|
666
|
+
- **Load testing**: Test under realistic load
|
667
|
+
- **Failure scenarios**: Test agent failures and recovery
|
668
|
+
- **Distributed tracing**: Track requests across agents
|
669
|
+
|
670
|
+
## Next Steps
|
671
|
+
|
672
|
+
- **[Control Actions](control-actions.md)** - Managing agent lifecycle
|
673
|
+
- **[Advanced Features](advanced-features.md)** - Dynamic loading and deployment
|
674
|
+
- **[Performance Considerations](../operations/performance-considerations.md)** - Optimization strategies
|