agent99 0.0.4 → 0.0.5

Files changed (92)
  1. checksums.yaml +4 -4
  2. data/A2A_SPEC-dev.md +1829 -0
  3. data/CHANGELOG.md +31 -0
  4. data/COMMITS.md +196 -0
  5. data/DOCS.md +96 -0
  6. data/README.md +200 -78
  7. data/Rakefile +62 -0
  8. data/docs/AI/htm.md +215 -0
  9. data/docs/AI/htm.rb +141 -0
  10. data/docs/AI/htm_demo.db +0 -0
  11. data/docs/AI/notes_on_htm_implementation.md +1319 -0
  12. data/docs/AI/some_code.rb +692 -0
  13. data/docs/advanced-topics/a2a-protocol.md +13 -0
  14. data/docs/{control_actions.md → advanced-topics/control-actions.md} +2 -0
  15. data/docs/advanced-topics/model-context-protocol.md +4 -0
  16. data/docs/advanced-topics/multi-agent-processing.md +674 -0
  17. data/docs/agent-development/request-response-handling.md +512 -0
  18. data/docs/api-reference/agent99-base.md +463 -0
  19. data/docs/api-reference/message-clients.md +495 -0
  20. data/docs/api-reference/registry-client.md +470 -0
  21. data/docs/api-reference/schemas.md +518 -0
  22. data/docs/assets/css/custom.css +27 -0
  23. data/docs/assets/images/agent-lifecycle.svg +73 -0
  24. data/docs/assets/images/agent-registry-process.svg +86 -0
  25. data/docs/assets/images/agent-registry-processes.svg +114 -0
  26. data/docs/assets/images/agent-types-overview.svg +51 -0
  27. data/docs/assets/images/agent99-architecture.svg +85 -0
  28. data/docs/assets/images/agent99_logo.png +0 -0
  29. data/docs/assets/images/control-actions-state.svg +83 -0
  30. data/docs/assets/images/knowledge-graph.svg +77 -0
  31. data/docs/assets/images/message-processing-flow.svg +148 -0
  32. data/docs/assets/images/multi-agent-system.svg +66 -0
  33. data/docs/assets/images/proxy-pattern-sequence.svg +48 -0
  34. data/docs/assets/images/request-flow.svg +97 -0
  35. data/docs/assets/images/request-processing-lifecycle.svg +50 -0
  36. data/docs/assets/images/request-response-sequence.svg +39 -0
  37. data/docs/{agent_lifecycle.md → core-concepts/agent-lifecycle.md} +2 -0
  38. data/docs/core-concepts/agent-types.md +255 -0
  39. data/docs/{architecture.md → core-concepts/architecture.md} +5 -5
  40. data/docs/{what_is_an_agent.md → core-concepts/what-is-an-agent.md} +1 -1
  41. data/docs/diagrams/message-flow-sequence.svg +198 -0
  42. data/docs/diagrams/p2p-network-topology.svg +181 -0
  43. data/docs/diagrams/smart-transport-routing.svg +165 -0
  44. data/docs/diagrams/three-layer-architecture.svg +77 -0
  45. data/docs/diagrams/transport-extension-api.svg +309 -0
  46. data/docs/diagrams/transport-extension-architecture.svg +234 -0
  47. data/docs/diagrams/transport-selection-flowchart.svg +264 -0
  48. data/docs/examples/advanced-examples.md +951 -0
  49. data/docs/examples/basic-examples.md +268 -0
  50. data/docs/{agent_registry_processes.md → framework-components/agent-registry.md} +1 -1
  51. data/docs/{message_processing.md → framework-components/message-processing.md} +3 -1
  52. data/docs/getting-started/basic-example.md +306 -0
  53. data/docs/getting-started/installation.md +160 -0
  54. data/docs/getting-started/overview.md +64 -0
  55. data/docs/getting-started/quick-start.md +179 -0
  56. data/docs/index.md +97 -0
  57. data/examples/DEMO.md +148 -0
  58. data/examples/README.md +50 -0
  59. data/examples/bad_agent.rb +32 -0
  60. data/examples/registry.rb +0 -8
  61. data/examples/run_demo.rb +433 -0
  62. data/lib/agent99/amqp_message_client.rb +2 -2
  63. data/lib/agent99/base.rb +1 -1
  64. data/lib/agent99/message_processing.rb +6 -12
  65. data/lib/agent99/registry_client.rb +4 -1
  66. data/lib/agent99/version.rb +1 -1
  67. data/lib/agent99.rb +1 -1
  68. data/mkdocs.yml +195 -0
  69. data/p2p_plan.md +533 -0
  70. data/p2p_roadmap.md +299 -0
  71. data/registry_plan.md +1818 -0
  72. metadata +89 -32
  73. data/docs/README.md +0 -57
  74. data/docs/diagrams/agent_registry_processes.dot +0 -42
  75. data/docs/diagrams/agent_registry_processes.png +0 -0
  76. data/docs/diagrams/high_level_architecture.dot +0 -26
  77. data/docs/diagrams/high_level_architecture.png +0 -0
  78. data/docs/diagrams/request_flow.dot +0 -42
  79. data/docs/diagrams/request_flow.png +0 -0
  80. /data/docs/{advanced_features.md → advanced-topics/advanced-features.md} +0 -0
  81. /data/docs/{extending_the_framework.md → advanced-topics/extending-the-framework.md} +0 -0
  82. /data/docs/{custom_agent_implementation.md → agent-development/custom-agent-implementation.md} +0 -0
  83. /data/docs/{error_handling_and_logging.md → agent-development/error-handling-and-logging.md} +0 -0
  84. /data/docs/{schema_definition.md → agent-development/schema-definition.md} +0 -0
  85. /data/docs/{api_reference.md → api-reference/overview.md} +0 -0
  86. /data/docs/{agent_discovery.md → framework-components/agent-discovery.md} +0 -0
  87. /data/docs/{messaging_system.md → framework-components/messaging-system.md} +0 -0
  88. /data/docs/{breaking_change_v0.0.4.md → operations/breaking-changes.md} +0 -0
  89. /data/docs/{configuration.md → operations/configuration.md} +0 -0
  90. /data/docs/{preformance_considerations.md → operations/performance-considerations.md} +0 -0
  91. /data/docs/{security.md → operations/security.md} +0 -0
  92. /data/docs/{troubleshooting.md → operations/troubleshooting.md} +0 -0
@@ -0,0 +1,674 @@
+ # Multi-Agent Processing
+
+ Multi-agent processing allows you to run multiple agents within the same process or coordinate agents across different processes. This guide covers patterns, strategies, and best practices for multi-agent systems.
+
+ ## Overview
+
+ ![Multi-Agent System](../assets/images/multi-agent-system.svg)
+
+ ## Single Process, Multiple Agents
+
+ Running multiple agents in the same Ruby process can be efficient for related services that need to share resources.
+
+ ### Basic Multi-Agent Setup
+
+ ```ruby
+ require 'agent99'
+ require 'json'         # Hash#to_json
+ require 'securerandom' # SecureRandom.uuid
+ require 'time'         # Time#iso8601
+
+ # Define your agents
+ class DatabaseAgent < Agent99::Base
+   def info
+     {
+       name: self.class.to_s,
+       type: :server,
+       capabilities: ['database', 'storage']
+     }
+   end
+
+   # store_data and retrieve_data are application-specific helpers (not shown)
+   def process_request(payload)
+     # Database operations
+     operation = payload.dig(:operation)
+     case operation
+     when 'store'
+       result = store_data(payload[:data])
+       send_response(result: result)
+     when 'retrieve'
+       data = retrieve_data(payload[:id])
+       send_response(data: data)
+     else
+       send_error("Unknown operation: #{operation}")
+     end
+   end
+ end
+
+ class CacheAgent < Agent99::Base
+   def initialize
+     super
+     @cache = {}
+   end
+
+   def info
+     {
+       name: self.class.to_s,
+       type: :server,
+       capabilities: ['cache', 'memory']
+     }
+   end
+
+   def process_request(payload)
+     key = payload.dig(:key)
+     operation = payload.dig(:operation)
+     case operation
+     when 'get'
+       send_response(value: @cache[key])
+     when 'set'
+       @cache[key] = payload[:value]
+       send_response(status: 'stored')
+     when 'delete'
+       @cache.delete(key)
+       send_response(status: 'deleted')
+     else
+       # Reply to unknown operations so the requester is not left waiting
+       send_error("Unknown operation: #{operation}")
+     end
+   end
+ end
+
+ class LoggingAgent < Agent99::Base
+   def info
+     {
+       name: self.class.to_s,
+       type: :server,
+       capabilities: ['logging', 'audit']
+     }
+   end
+
+   def process_request(payload)
+     # Log the event
+     log_entry = {
+       timestamp: Time.now.iso8601,
+       level: payload[:level] || 'info',
+       message: payload[:message],
+       source: payload[:source]
+     }
+
+     File.open('agent_audit.log', 'a') do |f|
+       f.puts log_entry.to_json
+     end
+
+     send_response(status: 'logged', entry_id: SecureRandom.uuid)
+   end
+ end
+
+ # Multi-agent process manager
+ class MultiAgentProcess
+   def initialize
+     @agents = []
+     @threads = []
+     @running = false
+   end
+
+   def add_agent(agent_class, options = {})
+     # Splat the options so agents whose initialize takes no arguments still work
+     agent = agent_class.new(**options)
+     @agents << agent
+     agent
+   end
+
+   def start_all
+     @running = true
+
+     @agents.each do |agent|
+       thread = Thread.new do
+         begin
+           agent.run
+         rescue => e
+           puts "Agent #{agent.class} failed: #{e.message}"
+         end
+       end
+       @threads << thread
+     end
+
+     puts "Started #{@agents.size} agents in #{@threads.size} threads"
+     self
+   end
+
+   def stop_all
+     @running = false
+
+     @agents.each(&:shutdown)
+     @threads.each(&:join)
+
+     puts "All agents stopped"
+   end
+
+   def wait_for_shutdown
+     trap('INT') do
+       puts "\nShutting down all agents..."
+       stop_all
+       exit
+     end
+
+     @threads.each(&:join)
+   end
+ end
+
+ # Usage
+ if __FILE__ == $0
+   process = MultiAgentProcess.new
+
+   # Add agents to the process
+   process.add_agent(DatabaseAgent)
+   process.add_agent(CacheAgent)
+   process.add_agent(LoggingAgent)
+
+   # Start all agents
+   process.start_all
+
+   # Wait for shutdown signal
+   process.wait_for_shutdown
+ end
+ ```
+
+ ## Agent Coordination Patterns
+
+ ### Producer-Consumer Pattern
+
+ ```ruby
+ require 'socket' # Socket.gethostname
+ require 'time'   # Time#iso8601
+
+ class ProducerAgent < Agent99::Base
+   def initialize
+     super
+     @job_counter = 0
+   end
+
+   def info
+     {
+       name: self.class.to_s,
+       type: :hybrid,
+       capabilities: ['producer', 'job_generator']
+     }
+   end
+
+   def start_producing
+     Thread.new do
+       loop do
+         # Find consumer agents
+         consumers = discover_agents(['consumer'])
+
+         if consumers.any?
+           # Create a job
+           job = create_job
+
+           # Send to a random consumer
+           consumer = consumers.sample
+           send_request(consumer[:name], job)
+
+           logger.info "Sent job #{job[:id]} to #{consumer[:name]}"
+         else
+           logger.warn "No consumers available"
+         end
+
+         sleep(5) # Produce every 5 seconds
+       end
+     end
+   end
+
+   def process_request(payload)
+     # Handle requests for job status, etc.
+     case payload[:operation]
+     when 'status'
+       send_response(jobs_produced: @job_counter, status: 'running')
+     end
+   end
+
+   private
+
+   def create_job
+     @job_counter += 1
+     {
+       id: "job_#{@job_counter}",
+       type: 'data_processing',
+       data: Array.new(100) { rand(1000) },
+       created_at: Time.now.iso8601
+     }
+   end
+ end
+
+ class ConsumerAgent < Agent99::Base
+   def initialize
+     super
+     @processed_jobs = 0
+   end
+
+   def info
+     {
+       name: "#{self.class}_#{Socket.gethostname}_#{Process.pid}",
+       type: :server,
+       capabilities: ['consumer', 'data_processor']
+     }
+   end
+
+   def process_request(payload)
+     job_id = payload.dig(:id)
+     job_type = payload.dig(:type)
+
+     logger.info "Processing job #{job_id} of type #{job_type}"
+
+     # Simulate processing time
+     processing_time = rand(1..3)
+     sleep(processing_time)
+
+     # Process the data (mean of the values; 0 for an empty job,
+     # since 0 / 0.0 yields NaN rather than raising)
+     data = payload.dig(:data) || []
+     result = data.empty? ? 0 : data.sum / data.size.to_f
+
+     @processed_jobs += 1
+
+     send_response(
+       job_id: job_id,
+       result: result,
+       processing_time: processing_time,
+       processed_by: info[:name],
+       total_processed: @processed_jobs
+     )
+
+     logger.info "Completed job #{job_id}"
+   end
+ end
+ ```
+
+ ### Load Balancing Pattern
+
+ ```ruby
+ require 'securerandom' # SecureRandom.hex
+ require 'time'         # Time#iso8601
+
+ class LoadBalancerAgent < Agent99::Base
+   def initialize
+     super
+     @worker_stats = {}
+     @request_count = 0
+   end
+
+   def info
+     {
+       name: self.class.to_s,
+       type: :hybrid,
+       capabilities: ['load_balancer', 'proxy']
+     }
+   end
+
+   def process_request(payload)
+     @request_count += 1
+
+     # Find available worker agents
+     workers = discover_agents(['worker'])
+
+     if workers.empty?
+       return send_error("No workers available", "NO_WORKERS")
+     end
+
+     # Choose worker based on load balancing strategy
+     chosen_worker = choose_worker(workers)
+
+     # Forward request to chosen worker
+     begin
+       response = send_request(chosen_worker[:name], payload)
+
+       # Update worker stats
+       update_worker_stats(chosen_worker[:name], success: true)
+
+       # Add load balancer info to response
+       response[:routed_by] = info[:name]
+       response[:worker] = chosen_worker[:name]
+
+       send_response(response)
+     rescue => e
+       update_worker_stats(chosen_worker[:name], success: false)
+       send_error("Worker failed: #{e.message}", "WORKER_ERROR")
+     end
+   end
+
+   private
+
+   def choose_worker(workers)
+     # Round-robin load balancing
+     worker_index = @request_count % workers.size
+     workers[worker_index]
+   end
+
+   def update_worker_stats(worker_name, success:)
+     @worker_stats[worker_name] ||= { requests: 0, successes: 0, failures: 0 }
+     @worker_stats[worker_name][:requests] += 1
+
+     if success
+       @worker_stats[worker_name][:successes] += 1
+     else
+       @worker_stats[worker_name][:failures] += 1
+     end
+   end
+ end
+
+ class WorkerAgent < Agent99::Base
+   def initialize(worker_id = nil)
+     super()
+     @worker_id = worker_id || "worker_#{SecureRandom.hex(4)}"
+   end
+
+   def info
+     {
+       name: "#{self.class}_#{@worker_id}",
+       type: :server,
+       capabilities: ['worker', 'processing']
+     }
+   end
+
+   def process_request(payload)
+     # Simulate different processing capabilities
+     task_type = payload.dig(:task_type)
+
+     case task_type
+     when 'cpu_intensive'
+       result = perform_cpu_task(payload[:data])
+     when 'io_intensive'
+       result = perform_io_task(payload[:data])
+     when 'memory_intensive'
+       result = perform_memory_task(payload[:data])
+     else
+       result = perform_generic_task(payload[:data])
+     end
+
+     send_response(
+       result: result,
+       worker_id: @worker_id,
+       task_type: task_type,
+       processed_at: Time.now.iso8601
+     )
+   end
+
+   private
+
+   def perform_cpu_task(data)
+     # Simulate CPU-intensive work
+     (1..1000000).sum
+   end
+
+   def perform_io_task(data)
+     # Simulate I/O work
+     File.write("/tmp/worker_#{@worker_id}_output.txt", data.to_s)
+     "Data written to file"
+   end
+
+   def perform_memory_task(data)
+     # Simulate memory-intensive work
+     large_array = Array.new(100000) { rand }
+     large_array.sum
+   end
+
+   def perform_generic_task(data)
+     # Generic task processing
+     "Processed: #{data}"
+   end
+ end
+ ```
+
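The load balancer above records per-worker successes and failures in `@worker_stats`, but `choose_worker` never consults them. As a rough sketch of how those statistics could drive selection, the helper below picks the worker with the lowest observed failure rate. The name `pick_least_failing` is hypothetical and not part of Agent99; the stats layout is assumed to match `update_worker_stats`.

```ruby
# Hypothetical helper, not part of Agent99: pick the worker with the lowest
# observed failure rate, treating workers with no history as failure-free.
def pick_least_failing(workers, stats)
  workers.min_by do |worker|
    s = stats.fetch(worker[:name], { requests: 0, failures: 0 })
    s[:requests].zero? ? 0.0 : s[:failures].to_f / s[:requests]
  end
end

workers = [
  { name: 'WorkerAgent_a' },
  { name: 'WorkerAgent_b' },
  { name: 'WorkerAgent_c' } # no stats recorded yet
]
stats = {
  'WorkerAgent_a' => { requests: 10, successes: 5, failures: 5 },
  'WorkerAgent_b' => { requests: 10, successes: 9, failures: 1 }
}

chosen = pick_least_failing(workers, stats)
puts chosen[:name]
```

Because unseen workers count as failure-free, a newly registered worker is preferred until it accumulates failures. Whether that is desirable depends on the workload; round-robin remains the simpler default.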
+ ## Multi-Process Coordination
+
+ ### Process Manager
+
+ ```ruby
+ class AgentProcessManager
+   def initialize
+     @processes = {}
+   end
+
+   def start_agent_process(agent_class, count: 1, options: {})
+     count.times do |i|
+       spawn_agent("#{agent_class.name.downcase}_#{i}", agent_class, options)
+     end
+   end
+
+   def stop_all_processes
+     @processes.each do |name, process_info|
+       begin
+         Process.kill('TERM', process_info[:pid])
+         Process.wait(process_info[:pid])
+         puts "Stopped process #{name} (#{process_info[:pid]})"
+       rescue Errno::ESRCH, Errno::ECHILD
+         # ESRCH: no such process; ECHILD: child was already reaped
+         puts "Process #{name} (#{process_info[:pid]}) already stopped"
+       end
+     end
+
+     @processes.clear
+   end
+
+   def monitor_processes
+     Thread.new do
+       loop do
+         # Iterate over a snapshot because restarts update @processes
+         @processes.dup.each do |name, process_info|
+           begin
+             # Check if process is still running
+             Process.getpgid(process_info[:pid])
+           rescue Errno::ESRCH
+             puts "Process #{name} (#{process_info[:pid]}) died, restarting..."
+             restart_process(name, process_info)
+           end
+         end
+
+         sleep(5) # Check every 5 seconds
+       end
+     end
+   end
+
+   def wait_for_shutdown
+     trap('INT') do
+       puts "\nShutting down all agent processes..."
+       stop_all_processes
+       exit
+     end
+
+     # Wait for all child processes
+     Process.waitall
+   end
+
+   private
+
+   def spawn_agent(process_name, agent_class, options)
+     pid = fork do
+       # Set process title for easier identification
+       Process.setproctitle("agent99_#{process_name}")
+
+       # Create and run the agent; splat the options so agents whose
+       # initialize takes no arguments still work
+       agent = agent_class.new(**options)
+
+       # Handle graceful shutdown
+       trap('TERM') do
+         agent.shutdown
+         exit
+       end
+
+       agent.run
+     end
+
+     # Keep the options so a restart can reuse them
+     @processes[process_name] = {
+       pid: pid,
+       agent_class: agent_class,
+       options: options,
+       started_at: Time.now
+     }
+
+     puts "Started #{agent_class} as process #{pid} (#{process_name})"
+   end
+
+   def restart_process(name, process_info)
+     # Reuse the original name so the new process replaces the dead entry
+     # instead of overwriting another live process's slot
+     spawn_agent(name, process_info[:agent_class], process_info[:options])
+   end
+ end
+
+ # Usage example
+ if __FILE__ == $0
+   manager = AgentProcessManager.new
+
+   # Start multiple instances of different agents
+   manager.start_agent_process(WorkerAgent, count: 3)
+   manager.start_agent_process(LoadBalancerAgent, count: 1)
+   manager.start_agent_process(LoggingAgent, count: 1)
+
+   # Start process monitoring
+   manager.monitor_processes
+
+   # Wait for shutdown
+   manager.wait_for_shutdown
+ end
+ ```
+
+ ## Performance Considerations
+
+ ### Memory Management
+
+ ```ruby
+ class MemoryEfficientAgent < Agent99::Base
+   def initialize
+     super
+     @request_count = 0
+     @memory_check_interval = 100
+   end
+
+   def process_request(payload)
+     @request_count += 1
+
+     # Periodic memory check
+     if @request_count % @memory_check_interval == 0
+       check_memory_usage
+     end
+
+     # Process request with memory awareness
+     result = process_with_memory_limit(payload)
+     send_response(result)
+   end
+
+   private
+
+   def check_memory_usage
+     # Get current memory usage (resident set size, reported by ps in KB)
+     memory_mb = `ps -o rss= -p #{Process.pid}`.to_i / 1024
+
+     if memory_mb > 500 # 500MB limit
+       logger.warn "High memory usage: #{memory_mb}MB"
+
+       # Trigger garbage collection
+       GC.start
+
+       # Check again after GC
+       new_memory_mb = `ps -o rss= -p #{Process.pid}`.to_i / 1024
+       logger.info "Memory after GC: #{new_memory_mb}MB"
+     end
+   end
+
+   # process_large_data_streaming and process_small_data are
+   # application-specific helpers (not shown)
+   def process_with_memory_limit(payload)
+     # Implement memory-conscious processing; a missing :data key counts as size 0
+     if (payload[:data]&.size || 0) > 10000
+       # Stream process large data instead of loading all at once
+       process_large_data_streaming(payload[:data])
+     else
+       # Normal processing for small data
+       process_small_data(payload[:data])
+     end
+   end
+ end
+ ```
+
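The example above calls `process_large_data_streaming` without defining it. One possible shape is sketched below, assuming the data is a collection of numbers and the goal is an aggregate such as the mean. The name `mean_in_chunks` and the default chunk size are illustrative and not part of Agent99.

```ruby
# Illustrative only: aggregate a large enumerable in fixed-size slices so that
# just one slice plus two running totals is held at a time.
def mean_in_chunks(data, chunk_size: 1_000)
  total = 0.0
  count = 0

  data.each_slice(chunk_size) do |slice|
    total += slice.sum
    count += slice.size
  end

  count.zero? ? 0.0 : total / count
end

puts mean_in_chunks(1..10_000)           # works on any Enumerable, not only arrays
puts mean_in_chunks([], chunk_size: 100) # empty input yields 0.0
```

Slicing only reduces peak memory when the source itself is produced lazily, such as lines read from a file or rows from a database cursor. An array that already arrived inside the payload is fully in memory regardless of how it is iterated.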
+ ### Thread Pool Management
+
+ ```ruby
+ require 'concurrent' # provided by the concurrent-ruby gem
+
+ class ThreadPoolAgent < Agent99::Base
+   def initialize(options = {})
+     super(options)
+
+     # Create thread pools for different task types
+     @cpu_pool = Concurrent::ThreadPoolExecutor.new(
+       min_threads: 2,
+       max_threads: 4,
+       max_queue: 100
+     )
+
+     @io_pool = Concurrent::ThreadPoolExecutor.new(
+       min_threads: 5,
+       max_threads: 20,
+       max_queue: 1000
+     )
+   end
+
+   # perform_task is an application-specific helper (not shown)
+   def process_request(payload)
+     task_type = payload.dig(:task_type)
+
+     # Choose appropriate thread pool
+     pool = case task_type
+            when 'cpu_intensive'
+              @cpu_pool
+            when 'io_intensive', 'network'
+              @io_pool
+            else
+              @cpu_pool
+            end
+
+     # Execute in thread pool
+     future = Concurrent::Future.execute(executor: pool) do
+       perform_task(payload)
+     end
+
+     # Wait for completion with timeout
+     begin
+       # Future#value returns nil on timeout instead of raising, so check completion explicitly
+       raise Concurrent::TimeoutError unless future.wait(30).complete? # 30 second timeout
+       send_response(future.value!) # value! re-raises any error from the task
+     rescue Concurrent::TimeoutError
+       send_error("Task timed out", "TIMEOUT")
+     rescue => e
+       send_error("Task failed: #{e.message}", "EXECUTION_ERROR")
+     end
+   end
+
+   def shutdown
+     # Shutdown thread pools gracefully
+     [@cpu_pool, @io_pool].each do |pool|
+       pool.shutdown
+       unless pool.wait_for_termination(10)
+         pool.kill
+       end
+     end
+
+     super
+   end
+ end
+ ```
+
+ ## Best Practices
+
+ ### 1. Resource Management
+ - **Limit agent count**: Don't run too many agents per process
+ - **Monitor memory**: Set limits and monitor usage
+ - **Use thread pools**: Prevent thread explosion
+ - **Clean up resources**: Properly shut down agents and connections
+
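Cleaning up resources matters most at shutdown: an unbounded `join` on every agent thread can block forever if a single agent never returns. Below is a minimal sketch of a bounded shutdown using only the Ruby standard library. The helper `stop_threads` and its `timeout:` parameter are hypothetical names, not part of Agent99.

```ruby
# Hypothetical helper, not part of Agent99: wait for threads to finish within
# a shared deadline and kill any that are still running afterwards.
def stop_threads(threads, timeout: 5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  threads.map do |thread|
    remaining = [deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC), 0].max
    if thread.join(remaining) # join returns nil when the limit expires
      :finished
    else
      thread.kill
      thread.join             # wait for the kill to take effect
      :killed
    end
  end
end

quick = Thread.new { :done }
stuck = Thread.new { sleep } # never returns on its own
p stop_threads([quick, stuck], timeout: 0.2)
```

Note that `Thread#join` re-raises an exception that terminated the thread, so a caller that wants to keep shutting down the remaining threads would need to rescue around it.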
+ ### 2. Communication Patterns
+ - **Use discovery**: Don't hardcode agent names
+ - **Handle failures**: Agents may come and go
+ - **Implement timeouts**: Prevent hanging requests
+ - **Use circuit breakers**: Protect against cascading failures
+
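The circuit-breaker item can be made concrete with a small sketch. The class below is hypothetical and not part of Agent99: after a configurable number of consecutive failures it rejects calls immediately, and once a cooldown has passed it lets the next call through as a trial. The `clock:` parameter is an assumption added here so the behavior can be exercised without sleeping.

```ruby
# Hypothetical circuit breaker, not part of Agent99. After `threshold`
# consecutive failures the circuit opens and calls fail fast until
# `cooldown` seconds have passed; the next call is then let through as a trial.
class CircuitBreaker
  class OpenError < StandardError; end

  def initialize(threshold: 3, cooldown: 30,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @threshold = threshold
    @cooldown = cooldown
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  def open?
    !@opened_at.nil? && (@clock.call - @opened_at) < @cooldown
  end

  def call
    raise OpenError, 'circuit open' if open?

    begin
      result = yield
      # A success closes the circuit and resets the failure count
      @failures = 0
      @opened_at = nil
      result
    rescue StandardError
      @failures += 1
      @opened_at = @clock.call if @failures >= @threshold
      raise
    end
  end
end

breaker = CircuitBreaker.new(threshold: 2, cooldown: 60)
2.times do
  begin
    breaker.call { raise IOError, 'worker unreachable' }
  rescue IOError
    # an agent would typically report the failure to its caller here
  end
end
puts breaker.open?
```

In an agent, the block passed to `call` would wrap the outgoing request to another agent. This sketch is not thread-safe; an agent serving requests from several threads would need to guard the counters with a `Mutex`.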
+ ### 3. Monitoring and Debugging
+ - **Log extensively**: Track agent interactions
+ - **Use unique names**: Include process/thread identifiers
+ - **Monitor performance**: Track request rates and response times
+ - **Health checks**: Implement agent health endpoints
+
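For the performance-monitoring item, response times can be collected with a few lines of plain Ruby. The `RequestTimer` class below is a hypothetical helper, not part of Agent99; it times a block with the monotonic clock and reports a simple summary.

```ruby
# Hypothetical metrics helper, not part of Agent99: records how long each
# measured block takes and summarizes the samples on demand.
class RequestTimer
  def initialize
    @samples = []
    @lock = Mutex.new
  end

  # Runs the block, records its duration (even if it raises), and returns
  # the block's value.
  def measure
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @lock.synchronize { @samples << elapsed }
  end

  def summary
    sorted = @lock.synchronize { @samples.sort }
    return { count: 0 } if sorted.empty?

    {
      count: sorted.size,
      mean: sorted.sum / sorted.size,
      p95: sorted[(sorted.size * 0.95).ceil - 1],
      max: sorted.last
    }
  end
end

timer = RequestTimer.new
5.times { timer.measure { sleep(0.01) } }
puts timer.summary
```

An agent could wrap the body of `process_request` in `timer.measure { ... }` and log the summary periodically. The sample list grows without bound here, so a long-running agent would want to cap or reset it.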
+ ### 4. Testing Multi-Agent Systems
+ - **Integration tests**: Test agent interactions
+ - **Load testing**: Test under realistic load
+ - **Failure scenarios**: Test agent failures and recovery
+ - **Distributed tracing**: Track requests across agents
+
+ ## Next Steps
+
+ - **[Control Actions](control-actions.md)** - Managing agent lifecycle
+ - **[Advanced Features](advanced-features.md)** - Dynamic loading and deployment
+ - **[Performance Considerations](../operations/performance-considerations.md)** - Optimization strategies