agent99 0.0.3 → 0.0.5

Files changed (103)
  1. checksums.yaml +4 -4
  2. data/A2A_SPEC-dev.md +1829 -0
  3. data/CHANGELOG.md +38 -0
  4. data/COMMITS.md +196 -0
  5. data/DOCS.md +96 -0
  6. data/README.md +212 -84
  7. data/Rakefile +62 -0
  8. data/docs/AI/htm.md +215 -0
  9. data/docs/AI/htm.rb +141 -0
  10. data/docs/AI/htm_demo.db +0 -0
  11. data/docs/AI/notes_on_htm_implementation.md +1319 -0
  12. data/docs/AI/some_code.rb +692 -0
  13. data/docs/advanced-topics/a2a-protocol.md +13 -0
  14. data/docs/{advanced_features.md → advanced-topics/advanced-features.md} +9 -4
  15. data/docs/{control_actions.md → advanced-topics/control-actions.md} +2 -0
  16. data/docs/advanced-topics/model-context-protocol.md +4 -0
  17. data/docs/advanced-topics/multi-agent-processing.md +674 -0
  18. data/docs/agent-development/request-response-handling.md +512 -0
  19. data/docs/agent99_framework/central_registry.md +94 -0
  20. data/docs/agent99_framework/message_client.md +120 -0
  21. data/docs/agent99_framework/registry_client.md +119 -0
  22. data/docs/api-reference/agent99-base.md +463 -0
  23. data/docs/api-reference/message-clients.md +495 -0
  24. data/docs/{api_reference.md → api-reference/overview.md} +14 -4
  25. data/docs/api-reference/registry-client.md +470 -0
  26. data/docs/api-reference/schemas.md +518 -0
  27. data/docs/assets/css/custom.css +27 -0
  28. data/docs/assets/images/agent-lifecycle.svg +73 -0
  29. data/docs/assets/images/agent-registry-process.svg +86 -0
  30. data/docs/assets/images/agent-registry-processes.svg +114 -0
  31. data/docs/assets/images/agent-types-overview.svg +51 -0
  32. data/docs/assets/images/agent99-architecture.svg +85 -0
  33. data/docs/assets/images/agent99_logo.png +0 -0
  34. data/docs/assets/images/control-actions-state.svg +83 -0
  35. data/docs/assets/images/knowledge-graph.svg +77 -0
  36. data/docs/assets/images/message-processing-flow.svg +148 -0
  37. data/docs/assets/images/multi-agent-system.svg +66 -0
  38. data/docs/assets/images/proxy-pattern-sequence.svg +48 -0
  39. data/docs/assets/images/request-flow.svg +97 -0
  40. data/docs/assets/images/request-processing-lifecycle.svg +50 -0
  41. data/docs/assets/images/request-response-sequence.svg +39 -0
  42. data/docs/{agent_lifecycle.md → core-concepts/agent-lifecycle.md} +2 -0
  43. data/docs/core-concepts/agent-types.md +255 -0
  44. data/docs/{architecture.md → core-concepts/architecture.md} +5 -5
  45. data/docs/core-concepts/what-is-an-agent.md +293 -0
  46. data/docs/diagrams/message-flow-sequence.svg +198 -0
  47. data/docs/diagrams/p2p-network-topology.svg +181 -0
  48. data/docs/diagrams/smart-transport-routing.svg +165 -0
  49. data/docs/diagrams/three-layer-architecture.svg +77 -0
  50. data/docs/diagrams/transport-extension-api.svg +309 -0
  51. data/docs/diagrams/transport-extension-architecture.svg +234 -0
  52. data/docs/diagrams/transport-selection-flowchart.svg +264 -0
  53. data/docs/examples/advanced-examples.md +951 -0
  54. data/docs/examples/basic-examples.md +268 -0
  55. data/docs/{agent_discovery.md → framework-components/agent-discovery.md} +9 -5
  56. data/docs/{agent_registry_processes.md → framework-components/agent-registry.md} +9 -3
  57. data/docs/{message_processing.md → framework-components/message-processing.md} +3 -1
  58. data/docs/getting-started/basic-example.md +306 -0
  59. data/docs/getting-started/installation.md +160 -0
  60. data/docs/getting-started/overview.md +64 -0
  61. data/docs/getting-started/quick-start.md +179 -0
  62. data/docs/index.md +97 -0
  63. data/docs/operations/breaking-changes.md +26 -0
  64. data/examples/DEMO.md +148 -0
  65. data/examples/README.md +50 -0
  66. data/examples/agent_watcher.rb +5 -1
  67. data/examples/bad_agent.rb +32 -0
  68. data/examples/chief_agent.rb +17 -6
  69. data/examples/control.rb +16 -7
  70. data/examples/example_agent.rb +16 -3
  71. data/examples/maxwell_agent86.rb +15 -26
  72. data/examples/registry.rb +10 -9
  73. data/examples/run_demo.rb +433 -0
  74. data/lib/agent99/agent_discovery.rb +4 -0
  75. data/lib/agent99/agent_lifecycle.rb +34 -10
  76. data/lib/agent99/amqp_message_client.rb +2 -2
  77. data/lib/agent99/base.rb +6 -2
  78. data/lib/agent99/message_processing.rb +6 -10
  79. data/lib/agent99/registry_client.rb +15 -11
  80. data/lib/agent99/tcp_message_client.rb +183 -0
  81. data/lib/agent99/version.rb +1 -1
  82. data/lib/agent99.rb +1 -1
  83. data/mkdocs.yml +195 -0
  84. data/p2p_plan.md +533 -0
  85. data/p2p_roadmap.md +299 -0
  86. data/registry_plan.md +1818 -0
  87. metadata +93 -30
  88. data/docs/README.md +0 -57
  89. data/docs/diagrams/agent_registry_processes.dot +0 -42
  90. data/docs/diagrams/agent_registry_processes.png +0 -0
  91. data/docs/diagrams/high_level_architecture.dot +0 -26
  92. data/docs/diagrams/high_level_architecture.png +0 -0
  93. data/docs/diagrams/request_flow.dot +0 -42
  94. data/docs/diagrams/request_flow.png +0 -0
  95. data/docs/{extending_the_framework.md → advanced-topics/extending-the-framework.md} +0 -0
  96. data/docs/{custom_agent_implementation.md → agent-development/custom-agent-implementation.md} +0 -0
  97. data/docs/{error_handling_and_logging.md → agent-development/error-handling-and-logging.md} +0 -0
  98. data/docs/{schema_definition.md → agent-development/schema-definition.md} +0 -0
  99. data/docs/{messaging_system.md → framework-components/messaging-system.md} +0 -0
  100. data/docs/{configuration.md → operations/configuration.md} +0 -0
  101. data/docs/{preformance_considerations.md → operations/performance-considerations.md} +0 -0
  102. data/docs/{security.md → operations/security.md} +0 -0
  103. data/docs/{troubleshooting.md → operations/troubleshooting.md} +0 -0
data/registry_plan.md ADDED
@@ -0,0 +1,1818 @@
1
+ # Agent99 Registry Architecture Plan
2
+
3
+ **Note**: The production-ready registry infrastructure will be implemented in a separate repository called **`control-registry`** to maintain clear separation of concerns from the core Agent99 framework. This repository will contain the base classes and pluggable architecture described in this plan.
4
+
5
+ ## Agent Data Schema
6
+
7
+ ### Core Agent Information
8
+ ```yaml
9
+ agent:
10
+ # Identity
11
+ uuid: string # Unique identifier (auto-generated)
12
+ name: string # Human-readable name
13
+ version: string # Agent version
14
+
15
+ # Classification
16
+ type: enum # server, client, hybrid
17
+ namespace: string # Organizational grouping (e.g., "production.us-east")
18
+ tags: array[string] # Arbitrary tags for filtering
19
+
20
+ # Capabilities
21
+ capabilities: array[object] # What the agent can do
22
+ - name: string # Capability name (e.g., "calculation")
23
+ - version: string # Capability version
24
+ - schema: json # Input/output schema for capability
25
+ - performance: object # Performance characteristics
26
+ avg_response_time: number
27
+ throughput: number
28
+ success_rate: number
29
+
30
+ # Network Information
31
+ network:
32
+ hostname: string # DNS hostname
33
+ ip_addresses: array[string] # All IPs agent is listening on
34
+ transport_endpoints: # How to reach this agent
35
+ amqp: string # amqp://queue_name
36
+ nats: string # nats://subject
37
+ named_pipe: string # /tmp/agent99/pipes/agent.in.pipe
38
+ lanet: string # 192.168.1.100:8080
39
+ http: string # https://api.example.com/agent
40
+
41
+ # Health & Status
42
+ status:
43
+ state: enum # active, idle, busy, offline, error
44
+ registered_at: timestamp # When agent joined registry
45
+ last_heartbeat: timestamp # Last health check
46
+ last_activity: timestamp # Last request processed
47
+ health_score: number # 0-100 health rating
48
+ current_load: number # Current processing load
49
+
50
+ # Metadata
51
+ metadata:
52
+ owner: string # Team/person responsible
53
+ environment: string # dev, staging, production
54
+ location: string # Geographic location/datacenter
55
+ runtime: object # Runtime information
56
+ platform: string # ruby, python, node, etc.
57
+ version: string # Runtime version
58
+ dependencies: array[string] # Required libraries
59
+ resources: # Resource constraints/usage
60
+ cpu_cores: number
61
+ memory_mb: number
62
+ disk_gb: number
63
+
64
+ # Security & Auth
65
+ security:
66
+ public_key: string # For message verification
67
+ auth_tokens: array[string] # Accepted auth methods
68
+ permissions: array[string] # What agent is allowed to do
69
+ audit_log: boolean # Whether to audit this agent
70
+
71
+ # Relationships
72
+ relationships:
73
+ depends_on: array[uuid] # Other agents this depends on
74
+ provides_to: array[uuid] # Agents that depend on this
75
+ peer_group: string # Logical grouping of peers
76
+ ```
77
+
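The core schema above implies a few checks a registry could run before accepting a record. The following is a minimal sketch, assuming records arrive as Ruby hashes with symbol keys; `validate_agent`, `build_agent`, and the specific rules are illustrative and not part of the Agent99 API.

```ruby
require "securerandom"

# Enum values taken from the schema comments above.
AGENT_TYPES  = %w[server client hybrid].freeze
AGENT_STATES = %w[active idle busy offline error].freeze

# Returns a list of human-readable problems; an empty list means the
# record passed these (assumed) checks.
def validate_agent(record)
  errors = []
  errors << "name is required" if record[:name].to_s.empty?
  errors << "type must be one of #{AGENT_TYPES.join(', ')}" unless AGENT_TYPES.include?(record[:type])

  state = record.dig(:status, :state)
  errors << "status.state must be one of #{AGENT_STATES.join(', ')}" unless AGENT_STATES.include?(state)

  score = record.dig(:status, :health_score)
  unless score.is_a?(Numeric) && (0..100).cover?(score)
    errors << "status.health_score must be within 0..100"
  end
  errors
end

# Hypothetical helper that fills in the auto-generated UUID.
def build_agent(name:, type:, state: "active", health_score: 100)
  {
    uuid: SecureRandom.uuid,
    name: name,
    type: type,
    status: { state: state, health_score: health_score }
  }
end
```

Returning a list of errors instead of raising on the first one lets a registry report every problem with a record in a single response.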
78
+ ### Extended Schema for Advanced Features
79
+ ```yaml
80
+ agent_extended:
81
+ # Performance Metrics (time-series data)
82
+ metrics:
83
+ - timestamp: timestamp
84
+ cpu_usage: number
85
+ memory_usage: number
86
+ request_count: number
87
+ error_count: number
88
+ response_times: array[number]
89
+
90
+ # Service Discovery
91
+ discovery:
92
+ ttl: number # Time-to-live in seconds
93
+ priority: number # Selection priority (lower = higher priority)
94
+ weight: number # Load balancing weight
95
+ backup_agents: array[uuid] # Fallback agents
96
+
97
+ # Semantic Capabilities (AI/ML features)
98
+ semantic:
99
+ capability_embeddings: array[vector] # Vector representations
100
+ description: string # Natural language description
101
+ examples: array[object] # Example requests/responses
102
+ ```
103
+
104
+ ## Centralized vs Distributed Architecture
105
+
106
+ ### Why Centralized (Current Approach)
107
+
108
+ **Advantages:**
109
+ - **Simplicity**: Single source of truth, easy to understand
110
+ - **Consistency**: No synchronization issues
111
+ - **Query Performance**: Fast lookups from single location
112
+ - **Management**: Easy backup, monitoring, debugging
113
+
114
+ **Disadvantages:**
115
+ - **Single Point of Failure**: Registry down = system down
116
+ - **Scalability Limits**: All queries hit one service
117
+ - **Network Latency**: Remote agents have higher latency
118
+ - **Bottleneck**: Can become performance bottleneck
119
+
120
+ ### Distributed Architecture Options
121
+
122
+ #### 1. DNS-Like Hierarchical Model
123
+ ```
124
+               Root Registry
125
+                     |
126
+       +-------------+-------------+
127
+       |             |             |
128
+   .americas      .europe    .asia-pacific
129
+       |             |             |
130
+   .us-east       .eu-west      .ap-south
131
+       |             |             |
132
+   [agents]       [agents]      [agents]
133
+ ```
134
+
135
+ **Implementation:**
136
+ ```ruby
137
+ class HierarchicalRegistry
138
+ def initialize(parent: nil, zone: nil)
139
+ @parent = parent # Parent registry
140
+ @zone = zone # e.g., "us-east.americas"
141
+ @local_agents = {}
142
+ @child_zones = {}
143
+ end
144
+
145
+ def discover(capability, recursive: true)
146
+ # Check local agents first
147
+ local_results = @local_agents.values.select { |a| a.has_capability?(capability) }
148
+
149
+ if local_results.empty? && recursive
150
+ # Query child zones
151
+ @child_zones.each do |zone, registry|
152
+ results = registry.discover(capability, recursive: false) # local-only, so a child never calls back into this parent
153
+ return results unless results.empty?
154
+ end
155
+
156
+ # Query parent if no local results
157
+ @parent&.discover(capability) || []
158
+ else
159
+ local_results
160
+ end
161
+ end
162
+ end
163
+ ```
164
+
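One way to guarantee that a parent and a child registry never query each other in a cycle is to pass along who asked. The following is a minimal sketch under that assumption; `ZoneRegistry`, `add_child`, `discover_down`, and the `skip` parameter are illustrative names, and agents are reduced to a UUID plus a list of capability names.

```ruby
class ZoneRegistry
  attr_reader :zone

  def initialize(zone, parent: nil)
    @zone = zone
    @parent = parent
    @agents = {}      # uuid => array of capability names
    @children = {}    # zone name => ZoneRegistry
    parent&.add_child(self)
  end

  def add_child(registry)
    @children[registry.zone] = registry
  end

  def register(uuid, capabilities)
    @agents[uuid] = capabilities
  end

  # Local agents first, then child zones, then the parent.
  # `skip` is the registry that asked us, so it is never queried back.
  def discover(capability, skip: nil)
    local = local_matches(capability)
    return local unless local.empty?

    @children.each_value do |child|
      next if child.equal?(skip)
      found = child.discover_down(capability)
      return found unless found.empty?
    end

    return [] if @parent.nil? || @parent.equal?(skip)
    @parent.discover(capability, skip: self)
  end

  # Downward-only search, used when a parent asks a child.
  def discover_down(capability)
    local = local_matches(capability)
    return local unless local.empty?

    @children.each_value do |child|
      found = child.discover_down(capability)
      return found unless found.empty?
    end
    []
  end

  private

  def local_matches(capability)
    @agents.select { |_uuid, caps| caps.include?(capability) }.keys
  end
end
```

Splitting the upward and downward searches keeps each call moving in one direction through the tree, so a lookup that finds nothing ends at the root with an empty list.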
165
+ #### 2. Peer-to-Peer DHT Model
166
+ ```ruby
167
+ # Distributed Hash Table approach (like Kademlia)
168
+ class DHTRegistry
169
+ def initialize(node_id)
170
+ @node_id = node_id
171
+ @routing_table = RoutingTable.new
172
+ @local_storage = {}
173
+ end
174
+
175
+ def store(agent_uuid, agent_data)
176
+ target_node = find_closest_node(hash(agent_uuid))
177
+ if target_node == @node_id
178
+ @local_storage[agent_uuid] = agent_data
179
+ else
180
+ forward_to_node(target_node, agent_uuid, agent_data)
181
+ end
182
+ end
183
+ end
184
+ ```
185
+
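The `find_closest_node` call above is left undefined. In Kademlia, closeness is the XOR of two IDs read as an integer. The following is a simplified sketch of that metric, assuming node IDs are derived from node names with SHA-1 and that every node is known locally; a real DHT would consult its routing table instead of scanning a list.

```ruby
require "digest"

# Map any string key to a 160-bit integer ID.
def dht_id(key)
  Digest::SHA1.hexdigest(key).to_i(16)
end

# Kademlia's distance metric: XOR of the two IDs, compared as integers.
def xor_distance(id_a, id_b)
  id_a ^ id_b
end

# Pick the known node whose ID is closest to the key's ID.
# `node_names` is a plain array here (an assumption for illustration).
def find_closest_node(node_names, key)
  target = dht_id(key)
  node_names.min_by { |name| xor_distance(dht_id(name), target) }
end
```

Because the distance depends only on the hashed IDs, every node that knows the same peers picks the same owner for a given agent UUID.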
186
+ #### 3. Consensus-Based (Raft/Etcd style)
187
+ ```ruby
188
+ # Multiple registry nodes with leader election
189
+ class ConsensusRegistry
190
+ def initialize
191
+ @state = :follower
192
+ @leader = nil
193
+ @peers = []
194
+ @log = []
195
+ end
196
+
197
+ def register_agent(agent_data)
198
+ if @state == :leader
199
+ # Replicate to majority of peers
200
+ replicate_to_peers(agent_data)
201
+ else
202
+ # Forward to leader
203
+ forward_to_leader(agent_data)
204
+ end
205
+ end
206
+ end
207
+ ```
208
+
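The "replicate to majority of peers" step is the heart of this model. The following is a minimal sketch of the quorum arithmetic only, with peer acknowledgements simulated by lambdas; `QuorumLog` is a hypothetical name, and a real Raft implementation would also handle terms, log matching, and leader election.

```ruby
class QuorumLog
  attr_reader :committed

  # `peers` are callables that return true when a peer stored the entry.
  def initialize(peers)
    @peers = peers
    @committed = []
  end

  # Strict majority of the whole cluster (peers plus this leader).
  def quorum
    (@peers.size + 1) / 2 + 1
  end

  # Commit the entry only if a majority acknowledged it.
  def append(entry)
    acks = 1 + @peers.count { |peer| peer.call(entry) }  # the leader counts itself
    return false if acks < quorum

    @committed << entry
    true
  end
end
```

With five nodes the quorum is three, so the registry keeps accepting registrations with up to two nodes unreachable.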
209
+ ## DNS-Inspired Model with WHOIS
210
+
211
+ ### Hierarchical Namespace
212
+ ```
213
+ agent99://greeting.services.us-east.americas/
214
+           └─capability
215
+                    └─service_type
216
+                             └─region
217
+                                     └─zone
218
+ ```
219
+
220
+ ### WHOIS-Like Query System
221
+
222
+ ```ruby
223
+ class Agent99Whois
224
+ def whois(query)
225
+ # Support multiple query types
226
+ case query
227
+ when /^uuid:(.+)/
228
+ lookup_by_uuid($1)
229
+ when /^capability:(.+)/
230
+ lookup_by_capability($1)
231
+ when /^namespace:(.+)/
232
+ lookup_by_namespace($1)
233
+ when /^owner:(.+)/
234
+ lookup_by_owner($1)
235
+ else
236
+ fuzzy_search(query)
237
+ end
238
+ end
239
+
240
+ def format_whois_response(agent)
241
+ <<~WHOIS
242
+ Agent UUID: #{agent.uuid}
243
+ Agent Name: #{agent.name}
244
+ Namespace: #{agent.namespace}
245
+ Type: #{agent.type}
246
+
247
+ Capabilities:
248
+ #{agent.capabilities.map { |c| " - #{c.name} (v#{c.version})" }.join("\n")}
249
+
250
+ Network Information:
251
+ Hostname: #{agent.network.hostname}
252
+ IP Addresses: #{agent.network.ip_addresses.join(', ')}
253
+ Transports: #{agent.network.transport_endpoints.keys.join(', ')}
254
+
255
+ Status:
256
+ State: #{agent.status.state}
257
+ Registered: #{agent.status.registered_at}
258
+ Last Seen: #{agent.status.last_heartbeat}
259
+ Health Score: #{agent.status.health_score}/100
260
+
261
+ Administrative Contact:
262
+ Owner: #{agent.metadata.owner}
263
+ Environment: #{agent.metadata.environment}
264
+ Location: #{agent.metadata.location}
265
+
266
+ Registry Information:
267
+ Registry Server: #{self.server_name}
268
+ Registry Zone: #{self.zone}
269
+ Query Time: #{Time.now}
270
+ WHOIS
271
+ end
272
+ end
273
+ ```
274
+
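The `case` dispatch in `whois` can also be expressed as a small parsing step that is easy to test in isolation. A minimal sketch, assuming the four prefixes shown above are the only structured query types; `parse_whois_query` is a hypothetical helper.

```ruby
WHOIS_PREFIXES = %w[uuid capability namespace owner].freeze

# Split "prefix:value" queries into a type and a value.
# Anything without a recognised prefix is treated as a fuzzy search.
def parse_whois_query(query)
  prefix, value = query.split(":", 2)
  if value && !value.empty? && WHOIS_PREFIXES.include?(prefix)
    { type: prefix.to_sym, value: value }
  else
    { type: :fuzzy, value: query }
  end
end
```

Separating parsing from lookup means unknown prefixes fall through to fuzzy search in one place, and the lookup methods never see malformed input.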
275
+ ### CLI WHOIS Command
276
+ ```bash
277
+ # Query agent information
278
+ $ agent99 whois uuid:86c7f0d1-e3a4-4e5c-b8a9-2d4f5e6a7b8c
279
+
280
+ # Find agents by capability
281
+ $ agent99 whois capability:greeting
282
+
283
+ # Find all agents in namespace
284
+ $ agent99 whois namespace:production.us-east
285
+
286
+ # Find agents by owner
287
+ $ agent99 whois owner:platform-team
288
+
289
+ # Fuzzy search
290
+ $ agent99 whois "calculation service"
291
+ ```
292
+
293
+ ### DNS-Like Resolution with Load Balancing
294
+
295
+ #### Agent Names vs UUIDs
296
+ **Design Principle: Names for Services, UUIDs for Instances**
297
+
298
+ ```ruby
299
+ # UUID: Always unique (instance identifier)
300
+ uuid: "86c7f0d1-e3a4-4e5c-b8a9-2d4f5e6a7b8c"
301
+
302
+ # Name: Can have duplicates (service identifier)
303
+ name: "calculation_agent"
304
+ ```
305
+
306
+ #### Multiple Agents with Same Name = Service Scaling
307
+ ```yaml
308
+ # Multiple instances of the same service
309
+ - uuid: "uuid-001"
310
+ name: "financial_processor" # Same name
311
+ instance: 1
312
+ zone: "us-east.americas"
313
+ status: "active"
314
+ load: 45%
315
+
316
+ - uuid: "uuid-002"
317
+ name: "financial_processor" # Same name
318
+ instance: 2
319
+ zone: "us-east.americas"
320
+ status: "active"
321
+ load: 78%
322
+
323
+ - uuid: "uuid-003"
324
+ name: "financial_processor" # Same name
325
+ instance: 3
326
+ zone: "us-west.americas"
327
+ status: "active"
328
+ load: 23%
329
+ ```
330
+
331
+ #### DNS-Like Resolution with Load Balancing
332
+ ```ruby
333
+ class Agent99Resolver
334
+ def resolve(query, strategy: :round_robin)
335
+ # Parse hierarchical query
336
+ # financial_processor.services.us-east.americas
337
+ parts = query.split('.')
338
+
339
+ service_name = parts[0] # "financial_processor"
340
+ service_type = parts[1] # "services"
341
+ region = parts[2] # "us-east"
342
+ zone = parts[3] # "americas"
343
+
344
+ # Find all agents with matching name
345
+ registry = find_registry(zone, region)
346
+ agents = registry.find_by_name(service_name)
347
+
348
+ # Filter by health and availability
349
+ healthy_agents = agents.select { |a| a.health_score > 70 && a.status == "active" }
350
+
351
+ # Apply load balancing strategy
352
+ select_agent(healthy_agents, strategy)
353
+ end
354
+
355
+ private
356
+
357
+ def select_agent(agents, strategy)
358
+ case strategy
359
+ when :round_robin
360
+ @round_robin_index = (@round_robin_index || 0) + 1
361
+ agents[@round_robin_index % agents.size]
362
+
363
+ when :least_loaded
364
+ agents.min_by { |agent| agent.current_load }
365
+
366
+ when :random
367
+ agents.sample
368
+
369
+ when :proximity
370
+ agents.min_by { |agent| calculate_latency(agent) }
371
+
372
+ when :weighted
373
+ # Consider both load and health score
374
+ agents.max_by { |agent| (agent.health_score * 0.7) + ((100 - agent.current_load) * 0.3) }
375
+ end
376
+ end
377
+ end
378
+ ```
379
+
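The selection strategies can be exercised without a registry. The following is a minimal sketch, assuming each instance exposes `health_score` and `current_load` as numbers; `Instance` and `Picker` are illustrative names. Unlike the resolver above, it returns `nil` for an empty pool so that round-robin never divides by zero.

```ruby
Instance = Struct.new(:uuid, :health_score, :current_load, keyword_init: true)

class Picker
  def initialize
    @next_index = 0
  end

  def pick(instances, strategy)
    return nil if instances.empty?   # no healthy instance to choose from

    case strategy
    when :round_robin
      chosen = instances[@next_index % instances.size]
      @next_index += 1
      chosen
    when :least_loaded
      instances.min_by(&:current_load)
    when :weighted
      # Same 70/30 blend of health and spare capacity as the resolver above.
      instances.max_by { |i| i.health_score * 0.7 + (100 - i.current_load) * 0.3 }
    else
      raise ArgumentError, "unknown strategy: #{strategy}"
    end
  end
end
```

For example, with the three `financial_processor` loads listed earlier (45, 78, 23) and assumed health scores of 90, 95, and 80, `:least_loaded` selects `uuid-003` while `:weighted` selects `uuid-001`, because health dominates the blended score.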
380
+ #### Service Discovery Patterns
381
+ ```ruby
382
+ # 1. Exact service name resolution
383
+ agents = resolver.resolve("financial_processor.services.us-east.americas")
384
+ # Returns: One agent from the pool based on load balancing
385
+
386
+ # 2. Get all instances of a service
387
+ all_instances = registry.find_all_by_name("financial_processor")
388
+ # Returns: All agents with that name across all zones
389
+
390
+ # 3. Service health check
391
+ healthy_count = registry.count_healthy("financial_processor")
392
+ # Returns: Number of healthy instances
393
+
394
+ # 4. Geographic distribution
395
+ east_agents = resolver.resolve("financial_processor.services.us-east.americas")
396
+ west_agents = resolver.resolve("financial_processor.services.us-west.americas")
397
+ ```
398
+
399
+ #### Naming Strategies for Scaling
400
+
401
+ **Strategy 1: Service Classes**
402
+ ```yaml
403
+ # Multiple agents providing same service
404
+ name: "calculation_service" # Same name = same service type
405
+ instances: 5 # 5 instances for scaling
406
+ capability: "mathematical_computation"
407
+ ```
408
+
409
+ **Strategy 2: Versioned Services**
410
+ ```yaml
411
+ # Different versions of same service
412
+ - name: "payment_processor_v1" # Legacy version
413
+ - name: "payment_processor_v2" # Current version
414
+ - name: "payment_processor_v3" # Beta version
415
+ ```
416
+
417
+ **Strategy 3: Specialized Variants**
418
+ ```yaml
419
+ # Specialized versions of base service
420
+ - name: "image_processor_gpu" # GPU-optimized
421
+ - name: "image_processor_cpu" # CPU-optimized
422
+ - name: "image_processor_edge" # Edge-optimized
423
+ ```
424
+
425
+ #### Registry Schema Enhancement for Scaling
426
+ ```yaml
427
+ agent:
428
+ uuid: string # Always unique
429
+ name: string # Can have duplicates
430
+ service_class: string # Logical service grouping
431
+ instance_id: number # Instance number within service
432
+
433
+ # Scaling metadata
434
+ scaling:
435
+ min_instances: number # Minimum required instances
436
+ max_instances: number # Maximum allowed instances
437
+ target_load: number # Target load per instance
438
+ scale_metric: string # CPU, memory, requests_per_second
439
+
440
+ # Load balancing
441
+ load_balancing:
442
+ weight: number # Weighted round-robin weight
443
+ priority: number # Lower value = higher priority (preferred)
444
+ sticky_sessions: boolean # Client session affinity
445
+ ```
446
+
447
+ ### Benefits of Duplicate Names
448
+
449
+ #### 1. Horizontal Scaling
450
+ ```ruby
451
+ # Start with one agent
452
+ register_agent(name: "data_processor", instance: 1)
453
+
454
+ # Scale up by adding more instances
455
+ register_agent(name: "data_processor", instance: 2)
456
+ register_agent(name: "data_processor", instance: 3)
457
+
458
+ # Client code doesn't change - still requests "data_processor"
459
+ agent = resolver.resolve("data_processor.services.production")
460
+ ```
461
+
462
+ #### 2. Rolling Deployments
463
+ ```ruby
464
+ # Deploy new version alongside old
465
+ register_agent(name: "api_gateway", version: "v2.1", weight: 10) # New version, low traffic
466
+ # Keep old version running
467
+ # api_gateway v2.0 instances still handling 90% traffic
468
+
469
+ # Gradually shift traffic
470
+ update_agent_weight("api_gateway", version: "v2.1", weight: 50) # 50/50 split
471
+ update_agent_weight("api_gateway", version: "v2.1", weight: 100) # Full traffic
472
+
473
+ # Remove old instances
474
+ withdraw_agents(name: "api_gateway", version: "v2.0")
475
+ ```
476
+
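How weights translate into traffic can be made concrete. The following is a minimal sketch, assuming each version carries one aggregate weight and traffic is split in proportion to it; `traffic_shares` and `pick_version` are hypothetical helpers, not registry API.

```ruby
# Expected percentage of traffic per version for the given weights.
def traffic_shares(weights)
  total = weights.values.sum.to_f
  return {} if total.zero?

  weights.transform_values { |w| (w / total * 100).round(1) }
end

# Weighted random choice. `rng` is injectable so callers can seed it.
def pick_version(weights, rng: Random.new)
  point = rng.rand * weights.values.sum
  weights.each do |version, weight|
    return version if point < weight
    point -= weight
  end
  weights.keys.last   # guard against floating-point edge cases
end
```

Under this model, weights of 90 for v2.0 and 10 for v2.1 send roughly 10% of requests to the new version. Reaching "full traffic" requires the old version's weight to drop to zero or its instances to be withdrawn, since a weight of 100 alongside a non-zero old weight is still only a share of the total.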
477
+ #### 3. Geographic Distribution
478
+ ```ruby
479
+ # Same service deployed globally
480
+ register_agent(name: "user_service", zone: "us-east.americas")
481
+ register_agent(name: "user_service", zone: "eu-west.europe")
482
+ register_agent(name: "user_service", zone: "ap-south.asia-pacific")
483
+
484
+ # Resolver picks closest instance automatically
485
+ local_agent = resolver.resolve("user_service.services.#{local_zone}")
486
+ ```
487
+
488
+ #### 4. Fault Tolerance
489
+ ```ruby
490
+ # Multiple instances provide redundancy
491
+ if primary_agent_fails?
492
+ # Registry automatically routes to healthy instances
493
+ backup_agent = resolver.resolve("critical_service.services.production")
494
+ # No application code changes needed
495
+ end
496
+ ```
497
+
498
+ ## Security Implications of Duplicate Names
499
+
500
+ ### Security Risks
501
+
502
+ #### 1. Agent Impersonation Attack
503
+ **Threat**: Malicious agent registers with same name as legitimate service
504
+ ```ruby
505
+ # Legitimate agent
506
+ register_agent(name: "payment_processor", owner: "finance_team", uuid: "legitimate-uuid")
507
+
508
+ # Malicious agent impersonating
509
+ register_agent(name: "payment_processor", owner: "attacker", uuid: "malicious-uuid")
510
+ # Now load balancer might route sensitive payments to attacker!
511
+ ```
512
+
513
+ #### 2. Service Hijacking
514
+ **Threat**: Attacker deploys agent with higher priority/weight
515
+ ```ruby
516
+ # Attacker registers with higher priority
517
+ register_agent(
518
+ name: "user_authentication",
519
+ priority: 1, # Higher than legitimate agents (priority: 10)
520
+ weight: 1000, # Much higher weight
521
+ zone: "production"
522
+ )
523
+ # All authentication requests now go to malicious agent
524
+ ```
525
+
526
+ #### 3. Data Exfiltration via DNS-like Queries
527
+ **Threat**: Attacker queries for sensitive services
528
+ ```bash
529
+ # Reconnaissance attacks
530
+ $ agent99 whois capability:financial_processor
531
+ $ agent99 discover namespace:production.sensitive
532
+ # Exposes internal architecture and service locations
533
+ ```
534
+
535
+ #### 4. Namespace Pollution
536
+ **Threat**: Filling namespace with fake agents
537
+ ```ruby
538
+ # Spam attack - register thousands of fake agents
539
+ 1000.times do |i|
540
+ register_agent(name: "critical_service", instance: i, owner: "attacker")
541
+ end
542
+ # Legitimate agents drowned out by noise
543
+ ```
544
+
545
+ ### Security Controls & Mitigations
546
+
547
+ #### 1. Ownership-Based Authorization
548
+ ```ruby
549
+ class Agent99::Registry::SecurityManager
550
+ def register_agent(agent_data, credentials)
551
+ # Verify authorization to use this service name
552
+ if existing_service?(agent_data[:name])
553
+ unless authorized_for_service?(credentials, agent_data[:name])
554
+ raise UnauthorizedError, "Not authorized to register agents for service: #{agent_data[:name]}"
555
+ end
556
+ else
557
+ # First registration - establish ownership
558
+ establish_service_ownership(agent_data[:name], credentials[:owner])
559
+ end
560
+
561
+ # Additional validations
562
+ verify_digital_signature(agent_data, credentials)
563
+ check_certificate_chain(credentials[:cert])
564
+ validate_network_location(agent_data[:network])
565
+
566
+ register(agent_data)
567
+ end
568
+
569
+ private
570
+
571
+ def authorized_for_service?(credentials, service_name)
572
+ service_owners = get_service_owners(service_name)
573
+ service_owners.include?(credentials[:owner]) ||
574
+ has_delegation_permission?(credentials[:owner], service_name)
575
+ end
576
+ end
577
+ ```
578
+
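The first-registration-wins rule can be shown with an in-memory ledger. The following is a minimal sketch, assuming credentials are reduced to an owner string and ignoring signatures and certificates; `OwnershipLedger`, `authorize_registration!`, and `delegate` are illustrative names.

```ruby
class OwnershipLedger
  class UnauthorizedError < StandardError; end

  def initialize
    @owners = Hash.new { |hash, name| hash[name] = [] }  # service name => owners
  end

  # Raises unless `owner` may register agents under `service_name`.
  def authorize_registration!(service_name, owner)
    if @owners.key?(service_name)
      unless @owners[service_name].include?(owner)
        raise UnauthorizedError, "Not authorized to register agents for service: #{service_name}"
      end
    else
      @owners[service_name] << owner   # first registration establishes ownership
    end
    true
  end

  # An existing owner grants another party the right to register.
  def delegate(service_name, from:, to:)
    unless @owners.key?(service_name) && @owners[service_name].include?(from)
      raise UnauthorizedError, "#{from} does not own #{service_name}"
    end
    @owners[service_name] << to unless @owners[service_name].include?(to)
  end
end
```

Note that first-come ownership only helps if the legitimate team registers before an attacker does, which is why the plan pairs it with signatures and certificate checks.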
579
+ #### 2. Digital Signatures & Certificates
580
+ ```yaml
581
+ agent:
582
+ uuid: string
583
+ name: string
584
+
585
+ # Security credentials
586
+ security:
587
+ public_key: string # Agent's public key
588
+ certificate: string # X.509 certificate
589
+ certificate_authority: string # Issuing CA
590
+ signature: string # Registry entry signature
591
+
592
+ # Service authorization
593
+ service_authorization:
594
+ service_name: string # Authorized service name
595
+ authorized_by: string # Who granted permission
596
+ expires_at: timestamp # Permission expiration
597
+ permissions: array[string] # Specific permissions
598
+ ```
599
+
600
+ #### 3. Namespace Access Control Lists (ACLs)
601
+ ```ruby
602
+ class ServiceACL
603
+ def initialize(service_name)
604
+ @service_name = service_name
605
+ @owners = [] # Full control
606
+ @operators = [] # Can deploy instances
607
+ @readers = [] # Can query only
608
+ end
609
+
610
+ def can_register?(user, action = :register)
611
+ case action
612
+ when :register
613
+ @owners.include?(user) || @operators.include?(user)
614
+ when :query
615
+ @owners.include?(user) || @operators.include?(user) || @readers.include?(user)
616
+ when :modify_acl
617
+ @owners.include?(user)
618
+ end
619
+ end
620
+ end
621
+
622
+ # Usage
623
+ payment_acl = ServiceACL.new("payment_processor")
624
+ payment_acl.owners = ["finance_team_lead"]
625
+ payment_acl.operators = ["finance_team", "platform_team"]
626
+ payment_acl.readers = ["monitoring_team", "audit_team"]
627
+ ```
628
+
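The usage lines above assign to `owners`, `operators`, and `readers`, which only works if the class exposes writers for them. A runnable variant might look like the following sketch; `RunnableServiceACL` and `allowed?` are illustrative names, and denying unknown actions is an assumption.

```ruby
class RunnableServiceACL
  attr_reader :service_name
  attr_accessor :owners, :operators, :readers

  def initialize(service_name)
    @service_name = service_name
    @owners = []      # full control
    @operators = []   # can deploy instances
    @readers = []     # can query only
  end

  def allowed?(user, action)
    case action
    when :register   then (owners + operators).include?(user)
    when :query      then (owners + operators + readers).include?(user)
    when :modify_acl then owners.include?(user)
    else false        # unknown actions are denied instead of returning nil
    end
  end
end

payment_acl = RunnableServiceACL.new("payment_processor")
payment_acl.owners = ["finance_team_lead"]
payment_acl.operators = ["finance_team", "platform_team"]
payment_acl.readers = ["monitoring_team", "audit_team"]
```

An explicit `else false` keeps the method's return value boolean for every input, so callers can use it directly in a guard clause.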
629
+ #### 4. Multi-Factor Registration Validation
630
+ ```ruby
631
+ def secure_register_agent(agent_data, credentials)
632
+ # 1. Cryptographic proof of identity
633
+ verify_agent_signature(agent_data, credentials[:public_key]) # verify with the public key; the private key never leaves the agent
634
+
635
+ # 2. Network location validation
636
+ verify_network_location(agent_data[:network][:ip_addresses])
637
+
638
+ # 3. Service ownership check
639
+ verify_service_authorization(agent_data[:name], credentials[:owner])
640
+
641
+ # 4. Certificate chain validation
642
+ verify_certificate_chain(credentials[:certificate])
643
+
644
+ # 5. Rate limiting
645
+ enforce_registration_rate_limits(credentials[:owner])
646
+
647
+ # 6. Audit logging
648
+ audit_log.record_registration(agent_data, credentials, result: :success)
649
+
650
+ register_with_security_metadata(agent_data, credentials)
651
+ end
652
+ ```
653
+
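The rate-limiting step is what blunts the namespace pollution attack described earlier. The following is a minimal sliding-window sketch, assuming limits are tracked per owner in memory; `RegistrationRateLimiter` and `check!` are hypothetical names, and the timestamp is injectable so the behaviour can be checked without sleeping.

```ruby
class RegistrationRateLimiter
  class RateLimitExceeded < StandardError; end

  def initialize(limit:, window_seconds:)
    @limit = limit
    @window = window_seconds
    @attempts = Hash.new { |hash, owner| hash[owner] = [] }  # owner => timestamps
  end

  # Records one registration attempt, or raises when the owner already
  # made `limit` attempts within the last `window_seconds`.
  def check!(owner, now: Time.now)
    recent = @attempts[owner].select { |t| now - t < @window }
    if recent.size >= @limit
      raise RateLimitExceeded, "#{owner} exceeded #{@limit} registrations per #{@window}s"
    end

    @attempts[owner] = recent << now
    true
  end
end
```

A limiter constructed with `limit: 3, window_seconds: 60` would stop the 1000-registration loop after its third call while leaving other owners unaffected.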
654
+ #### 5. Query Authorization & Audit
655
+ ```ruby
656
+ class SecureRegistryQuery
657
+ def whois(query, requester_credentials)
658
+ # Check query permissions
659
+ unless authorized_for_query?(requester_credentials, query)
660
+ audit_log.record_unauthorized_query(query, requester_credentials)
661
+ raise UnauthorizedError, "Insufficient permissions for query"
662
+ end
663
+
664
+ # Filter results based on permissions
665
+ results = perform_query(query)
666
+ filter_sensitive_data(results, requester_credentials)
667
+ end
668
+
669
+ private
670
+
671
+ def filter_sensitive_data(results, credentials)
672
+ results.map do |agent|
673
+ case credentials[:clearance_level]
674
+ when :public
675
+ agent.slice(:name, :capabilities, :status) # Basic info only
676
+ when :operator
677
+ agent.except(:security, :internal_metadata) # Most info
678
+ when :admin
679
+ agent # Full access
680
+ end
681
+ end
682
+ end
683
+ end
684
+ ```
685
+
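The per-clearance filtering can be tried on plain hashes. A minimal sketch, assuming agents are hashes with symbol keys; `filter_agent` is a hypothetical standalone version that also returns an empty hash for an unrecognised clearance level, where the `case` above would yield `nil`.

```ruby
SENSITIVE_KEYS = %i[security internal_metadata].freeze

# Return only the fields the given clearance level may see.
def filter_agent(agent, clearance_level)
  case clearance_level
  when :public   then agent.slice(:name, :capabilities, :status)        # basic info only
  when :operator then agent.reject { |key, _| SENSITIVE_KEYS.include?(key) }
  when :admin    then agent                                             # full access
  else {}                                                               # unknown level sees nothing
  end
end
```

Defaulting to an empty result means a misconfigured or missing clearance level fails closed.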
686
+ #### 6. Registry Integrity Protection
687
+ ```ruby
688
+ class RegistryIntegrityManager
689
+ def register_agent(agent_data)
690
+ # Create tamper-evident registry entry
691
+ registry_entry = {
692
+ agent: agent_data,
693
+ registered_at: Time.now.utc,
694
+ registered_by: current_user,
695
+ integrity_hash: calculate_integrity_hash(agent_data),
696
+ previous_entry_hash: get_last_entry_hash # Blockchain-like chaining
697
+ }
698
+
699
+ # Sign the entire entry
700
+ registry_entry[:registry_signature] = sign_entry(registry_entry)
701
+
702
+ store_registry_entry(registry_entry)
703
+ end
704
+
705
+ def verify_registry_integrity
706
+ # Verify the chain of registry entries hasn't been tampered with
707
+ verify_entry_chain
708
+ verify_all_signatures
709
+ detect_unauthorized_modifications
710
+ end
711
+ end
712
+ ```
713
+
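The chaining idea can be demonstrated with SHA-256 alone. The following is a minimal sketch, assuming entries are serialised with `JSON.generate` and omitting the signature step; `HashChainedLog` is an illustrative name. Chaining makes tampering detectable, not impossible.

```ruby
require "digest"
require "json"

class HashChainedLog
  GENESIS = "0" * 64   # placeholder "previous hash" for the first entry

  attr_reader :entries

  def initialize
    @entries = []
  end

  # Append an entry whose hash covers its data and the previous entry's hash.
  def append(agent_data)
    previous = @entries.empty? ? GENESIS : @entries.last[:entry_hash]
    entry = { agent: agent_data, previous_entry_hash: previous }
    entry[:entry_hash] = digest(entry)
    @entries << entry
    entry
  end

  # Recompute every hash and check each link to its predecessor.
  def intact?
    previous = GENESIS
    @entries.all? do |entry|
      body = { agent: entry[:agent], previous_entry_hash: entry[:previous_entry_hash] }
      ok = entry[:previous_entry_hash] == previous && entry[:entry_hash] == digest(body)
      previous = entry[:entry_hash]
      ok
    end
  end

  private

  def digest(body)
    Digest::SHA256.hexdigest(JSON.generate(body))
  end
end
```

Editing any stored entry changes its recomputed hash, so `intact?` returns false unless every later hash is rewritten too. Signing each entry, as the plan proposes, is what would stop an attacker from doing that rewrite.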
714
+ ### Access Control Models
715
+
716
+ #### Model 1: Hierarchical Ownership
717
+ ```
718
+ finance_team_lead (owner)
719
+ ├── finance_team (operators)
720
+ │ ├── payment_processor_v1
721
+ │ ├── payment_processor_v2
722
+ │ └── billing_service
723
+ └── contractors (readers only)
724
+ ```
725
+
726
+ #### Model 2: Service-Based RBAC
727
+ ```ruby
728
+ # Role definitions
729
+ roles = {
730
+ service_owner: [:register, :modify, :delete, :query, :admin],
731
+ service_operator: [:register, :modify, :query],
732
+ service_reader: [:query],
733
+ auditor: [:query_audit_logs, :security_scan]
734
+ }
735
+
736
+ # Service permissions
737
+ service_permissions = { "payment_processor" => {
738
+ owners: ["alice@finance.com"],
739
+ operators: ["finance-team@company.com"],
740
+ readers: ["monitoring@company.com", "audit@company.com"]
741
+ } }
742
+ ```
743
+
744
+ #### Model 3: Certificate-Based Trust
745
+ ```ruby
746
+ # Only agents with valid certificates from trusted CAs can register
747
+ TRUSTED_CAS = [
748
+ "CN=Company Internal CA",
749
+ "CN=Finance Department CA",
750
+ "CN=Platform Team CA"
751
+ ].freeze
752
+
753
+ def validate_agent_certificate(cert, agent_data) # cert: OpenSSL::X509::Certificate
754
+ # 1. Certificate is from trusted CA
755
+ ca_valid = TRUSTED_CAS.include?(cert.issuer.to_s(OpenSSL::X509::Name::RFC2253))
756
+
757
+ # 2. Certificate hasn't expired
758
+ time_valid = cert.not_after > Time.now
759
+
760
+ # 3. Certificate hasn't been revoked
761
+ revocation_valid = !certificate_revoked?(cert)
762
+
763
+ # 4. Service name matches certificate subject
764
+ service_authorized = cert.subject.to_s.include?(agent_data[:name])
765
+
766
+ ca_valid && time_valid && revocation_valid && service_authorized
767
+ end
768
+ ```
769
+
770
+ ### Recommendation: Layered Security Approach
771
+
772
+ 1. **Authentication**: Strong cryptographic identity
773
+ 2. **Authorization**: Service-level ACLs + ownership
774
+ 3. **Audit**: Complete logging of all registry operations
775
+ 4. **Integrity**: Tamper-evident registry entries
776
+ 5. **Isolation**: Network-level validation of agent locations
777
+ 6. **Monitoring**: Real-time detection of suspicious patterns
778
+
779
+ ## Current Implementation Analysis
780
+
781
+ ### Existing Sinatra-Based Registry
782
+ The current registry implementation (`examples/registry.rb`) is a simple Sinatra web application with:
783
+
784
+ **Current Features:**
785
+ - **In-memory storage**: Simple Ruby array (`AGENT_REGISTRY = []`)
786
+ - **RESTful HTTP API**: POST /register, GET /discover, DELETE /withdraw
787
+ - **Basic discovery**: Simple keyword matching for capabilities
788
+ - **Human-readable**: JSON responses; a web interface could be layered on top
789
+ - **Volatile state**: All data is held in memory and lost on restart
790
+
791
+ **Limitations:**
792
+ - **No persistence**: Registry data lost on process restart
793
+ - **Single point of failure**: No redundancy or high availability
794
+ - **Limited scalability**: In-memory array won't scale beyond single process
795
+ - **Simple matching**: Only exact keyword matching, no semantic search
796
+ - **HTTP-only**: Requires HTTP client for all interactions
797
+ - **No authentication**: Any agent can register/withdraw any other agent
798
+
799
+ ## Alternative Registry Architectures
800
+
801
+ ### 1. Redis-Based Registry
802
+
803
+ **Architecture:**
804
+ ```
805
+ Agent99 Agents ←→ Redis Server (Registry Data Store)
806
+ ├─ Hash: agents:{uuid} → agent data
807
+ ├─ Set: capabilities:{capability} → agent UUIDs
808
+ └─ Sorted Set: agent_heartbeats → timestamp scores
809
+ ```
810
+
811
+ **Advantages:**
812
+ - **Persistence**: Optional RDB/AOF persistence
813
+ - **High performance**: In-memory with optional disk backing
814
+ - **Pub/Sub**: Built-in notification for registry changes
815
+ - **TTL support**: Automatic agent expiration
816
+ - **Clustering**: Redis Cluster for high availability
817
+ - **Rich data structures**: Sets, sorted sets, hashes for efficient queries
818
+
819
+ **Disadvantages:**
820
+ - **External dependency**: Requires Redis server
821
+ - **Limited query capabilities**: No complex queries without Lua scripts
822
+ - **Memory constraints**: All data must fit in memory
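
To make the key layout concrete, here is an in-process model of the three structures (one hash per agent, one set per capability, one timestamp-scored heartbeat index). It deliberately uses plain Ruby collections instead of a Redis client, so the class name `RedisLayoutModel` and its methods are assumptions; a real backend would issue the corresponding Redis commands (`HSET`, `SADD`, `ZADD`, `ZRANGEBYSCORE`).

```ruby
require 'set'

# Plain-Ruby model of the Redis key layout; no Redis connection involved.
class RedisLayoutModel
  def initialize(ttl: 60)
    @ttl = ttl
    @agents = {}                                       # agents:{uuid} -> agent data
    @capabilities = Hash.new { |h, k| h[k] = Set.new } # capabilities:{capability} -> UUIDs
    @heartbeats = {}                                   # agent_heartbeats: uuid -> timestamp
  end

  def register(uuid, info, now: Time.now.to_f)
    @agents[uuid] = info
    info[:capabilities].each { |cap| @capabilities[cap] << uuid }
    @heartbeats[uuid] = now
  end

  def heartbeat(uuid, now: Time.now.to_f)
    @heartbeats[uuid] = now if @agents.key?(uuid)
  end

  def discover(capability)
    return [] unless @capabilities.key?(capability)
    @capabilities[capability].map { |uuid| @agents[uuid] }
  end

  # Mirrors a score-range query on the sorted set: remove every agent
  # whose last heartbeat is older than the TTL and return their UUIDs.
  def expire!(now: Time.now.to_f)
    stale = @heartbeats.select { |_, seen| seen < now - @ttl }.keys
    stale.each do |uuid|
      info = @agents.delete(uuid)
      info[:capabilities].each { |cap| @capabilities[cap].delete(uuid) }
      @heartbeats.delete(uuid)
    end
    stale
  end
end
```

Keeping heartbeats in a separate index is what makes expiry a single range scan rather than a walk over every agent record.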
823
+
824
+ ### 2. Database-Based Registry (SQLite/PostgreSQL)
825
+
826
+ **Architecture:**
827
+ ```sql
828
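+ -- Schema sketch (the JSONB column type and the vector extension are PostgreSQL features; on SQLite, store metadata as JSON text)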
829
+ CREATE TABLE agents (
830
+ uuid TEXT PRIMARY KEY,
831
+ name TEXT NOT NULL,
832
+ type TEXT,
833
+ registered_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
834
+ last_heartbeat TIMESTAMP,
835
+ metadata JSONB
836
+ );
837
+
838
+ CREATE TABLE capabilities (
839
+ id INTEGER PRIMARY KEY,
840
+ agent_uuid TEXT REFERENCES agents(uuid) ON DELETE CASCADE,
841
+ capability TEXT NOT NULL,
842
+ description TEXT
843
+ );
844
+
845
+ -- Vector extension for semantic search (PostgreSQL)
846
+ CREATE EXTENSION vector;
847
+ ALTER TABLE capabilities ADD COLUMN embedding vector(384);
848
+ ```
849
+
850
+ **Advantages:**
851
+ - **Full persistence**: ACID compliance, backups, replication
852
+ - **Complex queries**: SQL for sophisticated discovery patterns
853
+ - **Semantic search**: Vector embeddings for capability matching
854
+ - **Audit trail**: Historical tracking of agent registrations
855
+ - **Mature tooling**: Extensive ecosystem for management and monitoring
856
+
857
+ **Disadvantages:**
858
+ - **Higher latency**: Writes and uncached reads incur disk I/O, unlike a purely in-memory store
859
+ - **Complexity**: Requires database administration
860
+ - **External dependency**: Database server required
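
The semantic search advantage rests on ranking capabilities by vector similarity. The toy example below shows that ranking step in plain Ruby; the two-dimensional embeddings are made up, the function names are hypothetical, and in PostgreSQL the vector extension would do this ranking inside the database.

```ruby
# Cosine similarity between two non-zero vectors of equal length.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.(a) * norm.(b))
end

# Return the rows whose embeddings are closest to the query vector.
def rank_capabilities(query_vec, rows, limit: 3)
  rows.sort_by { |row| -cosine_similarity(query_vec, row[:embedding]) }.first(limit)
end

rows = [
  { capability: 'greeting',    embedding: [1.0, 0.0] },
  { capability: 'calculation', embedding: [0.0, 1.0] },
  { capability: 'arithmetic',  embedding: [0.1, 0.9] }
]
closest = rank_capabilities([0.0, 1.0], rows, limit: 2)
```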
861
+
862
+ ### 3. Filesystem-Based Registry
863
+
864
+ **Architecture:**
865
+ ```
866
+ /var/lib/agent99/registry/
867
+ ├── agents/
868
+ │ ├── {uuid1}.json # Agent metadata
869
+ │ ├── {uuid2}.json
870
+ │ └── {uuid3}.json
871
+ ├── capabilities/
872
+ │ ├── greeting.txt # List of agent UUIDs with this capability
873
+ │ ├── calculation.txt
874
+ │ └── data_processing.txt
875
+ └── index.json # Master index for quick lookups
876
+ ```
877
+
878
+ **Advantages:**
879
+ - **No dependencies**: Uses only filesystem
880
+ - **Simple backup**: Just copy files
881
+ - **Human readable**: Direct file inspection
882
+ - **Git-compatible**: Can version control registry state
883
+ - **Distributed potential**: Can sync via rsync, NFS, etc.
884
+
885
+ **Disadvantages:**
886
+ - **Performance**: File I/O for each operation
887
+ - **Concurrency**: File locking complexity
888
+ - **No queries**: Must implement search logic
889
+ - **Scalability**: Degrades with many agents
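
The concurrency concern is usually handled with two techniques: write-then-rename for whole files and advisory locks for appends. The sketch below applies both to the directory layout shown above. `FileRegistry` and its methods are hypothetical, and capability names are used as file names without validation, which real code would need to add.

```ruby
require 'json'
require 'fileutils'

# Hypothetical filesystem store following the agents/ and capabilities/ layout.
class FileRegistry
  def initialize(root)
    @root = root
    FileUtils.mkdir_p(File.join(@root, 'agents'))
    FileUtils.mkdir_p(File.join(@root, 'capabilities'))
  end

  def register(uuid, info)
    write_atomically(agent_path(uuid), JSON.generate(info))
    info.fetch('capabilities', []).each { |cap| append_locked(capability_path(cap), uuid) }
    uuid
  end

  def discover(capability)
    path = capability_path(capability)
    return [] unless File.exist?(path)
    File.readlines(path, chomp: true).uniq.map { |uuid| JSON.parse(File.read(agent_path(uuid))) }
  end

  private

  def agent_path(uuid)
    File.join(@root, 'agents', "#{uuid}.json")
  end

  def capability_path(capability)
    File.join(@root, 'capabilities', "#{capability}.txt")
  end

  # Write to a temp file in the same directory, then rename it into place:
  # readers see either the old file or the new one, never a partial write.
  def write_atomically(path, content)
    tmp = "#{path}.#{Process.pid}.tmp"
    File.write(tmp, content)
    File.rename(tmp, path)
  end

  # An exclusive advisory lock serializes concurrent appends to one file.
  def append_locked(path, line)
    File.open(path, 'a') do |file|
      file.flock(File::LOCK_EX)
      file.puts(line)
    end
  end
end
```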
890
+
891
+ ### 4. Agent99-Based Registry (Self-Hosted)
892
+
893
+ **Architecture:**
894
+ ```ruby
895
+ class RegistryAgent < Agent99::Base
896
+ def info
897
+ {
898
+ name: "RegistryAgent",
899
+ type: :hybrid,
900
+ capabilities: ['registry', 'discovery', 'agent_management'],
901
+ persistence: :configurable # Redis, DB, or filesystem
902
+ }
903
+ end
904
+
905
+ def process_request(payload)
906
+ case payload[:action]
907
+ when 'register'
908
+ register_agent(payload[:agent_info])
909
+ when 'discover'
910
+ discover_agents(payload[:capability])
911
+ when 'withdraw'
912
+ withdraw_agent(payload[:uuid])
913
+ end
914
+ end
915
+ end
916
+ ```
917
+
918
+ **Advantages:**
919
+ - **Dogfooding**: Registry uses Agent99 infrastructure
920
+ - **Distributed**: Multiple registry agents for redundancy
921
+ - **Transport agnostic**: Uses Agent99's transport layers
922
+ - **Consistent**: Same patterns as other agents
923
+ - **Extensible**: Add capabilities like semantic search easily
924
+
925
+ **Disadvantages:**
926
+ - **Bootstrap problem**: How to find registry agent initially?
927
+ - **Complexity**: Registry becomes dependent on Agent99 itself
928
+ - **Circular dependency**: Registry needs registry to register itself
929
+
930
+ ## Recommended Hybrid Architecture
931
+
932
+ ### Dual-Plugin Architecture: Frontend + Backend
933
+
934
+ **Complete Registry System:**
935
+ ```
936
+ ┌─────────────────────────────────────────────────────────┐
937
+ │ Frontend Interfaces (Pluggable) │
938
+ ├─────────────────────────────────────────────────────────┤
939
+ │ • Agent99 Frontend (dogfooding) │
940
+ │ • Sinatra HTTP API (lightweight) │
941
+ │ • Rails HTTP API + Admin UI (full-featured) │
942
+ │ • CLI Interface (command-line management) │
943
+ │ • gRPC Service (high-performance RPC) │
944
+ ├─────────────────────────────────────────────────────────┤
945
+ │ Registry Core (CRUD Operations) │
946
+ ├─────────────────────────────────────────────────────────┤
947
+ │ Backend Storage (Pluggable) │
948
+ ├─────────────────────────────────────────────────────────┤
949
+ │ • Memory • Redis • Database • Filesystem • Distributed │
950
+ └─────────────────────────────────────────────────────────┘
951
+ ```
952
+
953
+ ### Frontend Interface Plugins
954
+
955
+ #### 1. Agent99 Frontend (Registry as an Agent)
956
+ ```ruby
957
+ class RegistryAgent < Agent99::Base
958
+ def initialize(backend: :redis)
959
+ @registry = Agent99::Registry.new(backend: backend)
960
+ super
961
+ end
962
+
963
+ def info
964
+ {
965
+ name: "RegistryAgent",
966
+ type: :hybrid,
967
+ capabilities: ['registry.register', 'registry.discover',
968
+ 'registry.withdraw', 'registry.admin']
969
+ }
970
+ end
971
+
972
+ def process_request(payload)
973
+ case payload[:action]
974
+ when 'register'
975
+ @registry.register(payload[:agent_info])
976
+ when 'discover'
977
+ @registry.discover(payload[:capability])
978
+ when 'withdraw'
979
+ @registry.withdraw(payload[:uuid])
980
+ when 'stats'
981
+ @registry.statistics
982
+ end
983
+ end
984
+ end
985
+ ```
986
+
987
+ #### 2. Sinatra HTTP Frontend (Current, Enhanced)
988
+ ```ruby
989
+ class RegistryHTTP < Sinatra::Base
990
+ def initialize(backend: :redis)
991
+ @registry = Agent99::Registry.new(backend: backend)
992
+ super
993
+ end
994
+
995
+ # RESTful API
996
+ post '/register' do
997
+ @registry.register(json_body)
998
+ end
999
+
1000
+ # Human UI for troubleshooting
1001
+ get '/admin' do
1002
+ @agents = @registry.list_all
1003
+ erb :admin_dashboard
1004
+ end
1005
+ end
1006
+ ```
1007
+
1008
+ #### 3. Rails Frontend (Enterprise Features)
1009
+ ```ruby
1010
+ class RegistryController < ApplicationController
1011
+ before_action :authenticate_admin!, only: [:admin]
1012
+
1013
+ def register
1014
+ uuid = @registry.register(agent_params)
+ render json: { uuid: uuid }
1016
+ end
1017
+
1018
+ # Rich admin interface
1019
+ def admin
1020
+ @agents = @registry.list_all
1021
+ @stats = @registry.statistics
1022
+ @health = @registry.health_check
1023
+ end
1024
+ end
1025
+ ```
1026
+
1027
+ #### 4. CLI Frontend (Command-Line Management)
1028
+ ```ruby
1029
+ class RegistryCLI < Thor
1030
+ # Thor passes (args, options, config) to initialize, so forward them with super
+ def initialize(*args)
+ super
+ @registry = Agent99::Registry.new(
+ backend: (ENV['REGISTRY_BACKEND'] || 'redis').to_sym
+ )
+ end
1035
+
1036
+ desc "list", "List all registered agents"
1037
+ option :format, default: "table"
1038
+ def list
1039
+ agents = @registry.list_all
1040
+ case options[:format]
1041
+ when "json"
1042
+ puts agents.to_json
1043
+ when "table"
1044
+ print_table(agents)
1045
+ when "yaml"
1046
+ puts agents.to_yaml
1047
+ end
1048
+ end
1049
+
1050
+ desc "discover CAPABILITY", "Find agents by capability"
1051
+ def discover(capability)
1052
+ agents = @registry.discover(capability)
1053
+ print_table(agents)
1054
+ end
1055
+
1056
+ desc "show UUID", "Show details for specific agent"
1057
+ def show(uuid)
1058
+ agent = @registry.get_agent(uuid)
1059
+ puts agent.to_yaml
1060
+ end
1061
+
1062
+ desc "health", "Check registry health"
1063
+ def health
1064
+ status = @registry.health_check
1065
+ puts "Registry Status: #{status[:status]}"
1066
+ puts "Total Agents: #{status[:agent_count]}"
1067
+ puts "Active Agents: #{status[:active_count]}"
1068
+ puts "Backend: #{status[:backend_type]}"
1069
+ end
1070
+
1071
+ desc "watch", "Live monitoring of registry changes"
1072
+ def watch
1073
+ @registry.subscribe do |event|
1074
+ puts "[#{Time.now}] #{event[:type]}: #{event[:agent_uuid]}"
1075
+ end
1076
+ end
1077
+ end
1078
+ ```
1079
+
1080
+ ### Agent99 with CLI Component
1081
+
1082
+ **Integrated Agent99 + CLI Design:**
1083
+ ```ruby
1084
+ class RegistryAgent < Agent99::Base
1085
+ attr_reader :cli_enabled
1086
+
1087
+ def initialize(backend: :redis, cli: false)
1088
+ @registry = Agent99::Registry.new(backend: backend)
1089
+ @cli_enabled = cli
1090
+
1091
+ if @cli_enabled
1092
+ start_cli_interface
1093
+ end
1094
+
1095
+ super
1096
+ end
1097
+
1098
+ private
1099
+
1100
+ def start_cli_interface
1101
+ Thread.new do
1102
+ # Run CLI in separate thread
1103
+ RegistryCLI.new(@registry).start
1104
+ end
1105
+ end
1106
+
1107
+ # Or expose CLI through agent messages
1108
+ def process_request(payload)
1109
+ case payload[:action]
1110
+ when 'cli_command'
1111
+ execute_cli_command(payload[:command], payload[:args])
1112
+ else
1113
+ # Regular registry operations
1114
+ end
1115
+ end
1116
+
1117
+ def execute_cli_command(command, args)
1118
+ case command
1119
+ when 'list'
1120
+ format_agents_list(@registry.list_all, args[:format])
1121
+ when 'watch'
1122
+ subscribe_to_changes
1123
+ when 'export'
1124
+ export_registry_data(args[:format])
1125
+ end
1126
+ end
1127
+ end
1128
+ ```
1129
+
1130
+ ### Primary Strategy: Pluggable Storage Backends
1131
+
1132
+ **Design Pattern (Control-Registry Repository):**
1133
+ ```ruby
1134
+ module Control
1135
+ module Registry
1136
+ module DataStorage
1137
+ class Base
1138
+ def register(agent_info); raise NotImplementedError; end
1139
+ def discover(capability); raise NotImplementedError; end
1140
+ def withdraw(uuid); raise NotImplementedError; end
1141
+ def get_agent(uuid); raise NotImplementedError; end
1142
+ def list_all; raise NotImplementedError; end
1143
+ def heartbeat(uuid); raise NotImplementedError; end
1144
+ end
1145
+
1146
+ class Redis < Base
1147
+ # Redis implementation
1148
+ end
1149
+
1150
+ class Database < Base
1151
+ # SQLite/PostgreSQL implementation
1152
+ end
1153
+
1154
+ class Filesystem < Base
1155
+ # Filesystem implementation
1156
+ end
1157
+
1158
+ class Memory < Base
1159
+ # Current in-memory implementation
1160
+ end
1161
+ end
1162
+
1163
+ module Frontend
1164
+ class Base
1165
+ def initialize(data_storage:)
1166
+ @storage = data_storage
1167
+ end
1168
+ end
1169
+
1170
+ class Agent99 < Base
1171
+ # Agent99-based frontend
1172
+ end
1173
+
1174
+ class Sinatra < Base
1175
+ # HTTP API frontend
1176
+ end
1177
+
1178
+ class CLI < Base
1179
+ # Command-line frontend
1180
+ end
1181
+ end
1182
+ end
1183
+ end
1184
+ ```
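
To show what a concrete backend for this contract might look like, here is a self-contained in-memory implementation. It redefines a small `StorageBase` with the same six methods so the example runs on its own; both class names are stand-ins for the proposed `Control::Registry::DataStorage` classes, which the source only outlines.

```ruby
require 'securerandom'

# Stand-in for the storage contract: every operation must be overridden.
class StorageBase
  %i[register discover withdraw get_agent list_all heartbeat].each do |name|
    define_method(name) { |*| raise NotImplementedError, "#{self.class}##{name}" }
  end
end

# In-memory backend; a mutex guards the hash because several frontends
# may call into the same storage object from different threads.
class MemoryStorage < StorageBase
  def initialize
    @agents = {}
    @mutex = Mutex.new
  end

  def register(agent_info)
    uuid = SecureRandom.uuid
    @mutex.synchronize { @agents[uuid] = agent_info.merge(uuid: uuid, last_heartbeat: Time.now) }
    uuid
  end

  def discover(capability)
    @mutex.synchronize { @agents.values.select { |a| Array(a[:capabilities]).include?(capability) } }
  end

  def withdraw(uuid)
    @mutex.synchronize { !@agents.delete(uuid).nil? }
  end

  def get_agent(uuid)
    @mutex.synchronize { @agents[uuid] }
  end

  def list_all
    @mutex.synchronize { @agents.values }
  end

  # Returns true when the agent exists and its timestamp was refreshed.
  def heartbeat(uuid)
    @mutex.synchronize do
      agent = @agents[uuid]
      agent[:last_heartbeat] = Time.now if agent
      !agent.nil?
    end
  end
end
```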
1185
+
1186
+ ### Registry Access Patterns (Using Control-Registry)
1187
+
1188
+ **1. Direct Library Access**
1189
+ ```ruby
1190
+ # Using the control-registry gem
1191
+ storage = Control::Registry::DataStorage::Redis.new
1192
+ registry = Control::Registry::Core.new(storage: storage)
1193
+ uuid = registry.register(agent_info)
1194
+ ```
1195
+
1196
+ **2. HTTP API Frontend**
1197
+ ```ruby
1198
+ # HTTP API using control-registry backend
1199
+ storage = Control::Registry::DataStorage::Redis.new
1200
+ frontend = Control::Registry::Frontend::Sinatra.new(data_storage: storage)
1201
+ frontend.start_server
1202
+ ```
1203
+
1204
+ **3. Agent99 Registry Agent**
1205
+ ```ruby
1206
+ # Registry as an Agent99 agent using control-registry
1207
+ class RegistryAgent < Agent99::Base
1208
+ def initialize
1209
+ storage = Control::Registry::DataStorage::Redis.new
1210
+ @registry_frontend = Control::Registry::Frontend::Agent99.new(data_storage: storage)
+ super # complete Agent99::Base setup, as the other agent examples do
+ end
1212
+ end
1213
+ ```
1214
+
1215
+ ### Storage Backend Recommendations
1216
+
1217
+ **Development Environment:**
1218
+ - **Memory Backend**: Fast iteration, no dependencies
1219
+ - **Filesystem Backend**: Persistence without external services
1220
+
1221
+ **Production - Small Scale (< 100 agents):**
1222
+ - **SQLite Backend**: Simple, reliable, no server required
1223
+ - **Filesystem Backend**: For embedded systems
1224
+
1225
+ **Production - Medium Scale (100-10,000 agents):**
1226
+ - **Redis Backend**: High performance, pub/sub notifications
1227
+ - **PostgreSQL Backend**: If complex queries needed
1228
+
1229
+ **Production - Large Scale (more than 10,000 agents):**
1230
+ - **Redis Cluster**: Distributed, high availability
1231
+ - **PostgreSQL + Redis**: PostgreSQL for persistence, Redis for cache
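
These tiers can be encoded as data so that tooling can suggest a backend from an expected agent count. The sketch below is one way to do that; the function name, the symbol names, and the choice to treat exactly 10,000 agents as medium scale are assumptions.

```ruby
# Tier table derived from the recommendations above.
BACKEND_TIERS = [
  { max_agents: 99,              backends: %i[sqlite filesystem] },
  { max_agents: 10_000,          backends: %i[redis postgresql] },
  { max_agents: Float::INFINITY, backends: %i[redis_cluster postgresql_with_redis] }
].freeze

# Returns the recommended backends, most preferred first.
def recommended_backends(agent_count, environment: :production)
  return %i[memory filesystem] if environment == :development
  BACKEND_TIERS.find { |tier| agent_count <= tier[:max_agents] }[:backends]
end

recommended_backends(2_500)
```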
1232
+
1233
+ ## Implementation Roadmap
1234
+
1235
+ **Repository Structure**: The following implementation will be done in the `control-registry` repository, with integration points back to the `agent99` core framework.
1236
+
1237
+ ### Phase 1: Control-Registry Foundation (Week 1)
1238
+ - [ ] Create `control-registry` repository with proper gem structure
1239
+ - [ ] Create `Control::Registry::Frontend::Base` base class for frontend plugins
1240
+ - [ ] Create `Control::Registry::DataStorage::Base` base class for storage backends
1241
+ - [ ] Implement `Control::Registry::DataStorage::Memory` (extracted from the current Sinatra example)
1242
+ - [ ] Create backend factory pattern and plugin discovery system
1243
+ - [ ] Add configuration system for backend/frontend selection
1244
+
1245
+ ### Phase 2: Redis Backend (Week 2)
1246
+ - [ ] Implement `Control::Registry::DataStorage::Redis` class
1247
+ - [ ] Add Redis data structures for agents and capabilities
1248
+ - [ ] Implement pub/sub for registry change notifications
1249
+ - [ ] Add TTL-based agent expiration
1250
+ - [ ] Create Redis connection pooling
1251
+
1252
+ ### Phase 3: Database Backend (Week 3)
1253
+ - [ ] Implement `Control::Registry::DataStorage::Database` with SQLite support
1254
+ - [ ] Design schema for agents and capabilities
1255
+ - [ ] Add PostgreSQL support (optional)
1256
+ - [ ] Implement vector search for semantic discovery (PostgreSQL)
1257
+ - [ ] Add database migration system
1258
+
1259
+ ### Phase 4: Filesystem Backend (Week 4)
1260
+ - [ ] Implement `Control::Registry::DataStorage::Filesystem` class
1261
+ - [ ] Design directory structure for agent data
1262
+ - [ ] Add file locking for concurrent access
1263
+ - [ ] Implement atomic file operations
1264
+ - [ ] Add index file for performance optimization
1265
+
1266
+ ### Phase 5: Frontend Interfaces (Week 5)
1267
+ - [ ] Create `Control::Registry::Frontend::Agent99` (registry as an agent)
1268
+ - [ ] Create `Control::Registry::Frontend::Sinatra` (HTTP API)
1269
+ - [ ] Create `Control::Registry::Frontend::CLI` (command-line interface)
1270
+ - [ ] Solve bootstrap problem for Agent99 frontend (well-known addresses)
1271
+ - [ ] Add distributed registry coordination
1272
+
1273
+ ### Phase 6: Advanced Features & Security (Week 6)
1274
+ - [ ] Implement security framework with authentication and authorization
1275
+ - [ ] Add semantic capability matching with embeddings
1276
+ - [ ] Agent health monitoring and automatic deregistration
1277
+ - [ ] Registry federation for multi-cluster setups
1278
+ - [ ] MCP server integration for AI-powered operations
1279
+ - [ ] Audit logging and compliance features
1280
+
1281
+ ## Frontend Interface Comparison
1282
+
1283
+ ### Decision Matrix for Frontend Selection
1284
+
1285
+ | Frontend | Use Case | Pros | Cons |
+ |----------|----------|------|------|
+ | **Agent99** | Distributed systems | Dogfooding, uses Agent99 transport | Bootstrap complexity |
+ | **Sinatra** | Lightweight deployments | Simple, minimal dependencies | Limited UI capabilities |
+ | **Rails** | Enterprise deployments | Rich UI, authentication, audit trails | Heavy framework |
+ | **CLI** | DevOps/Admin | Direct access, scriptable | No remote access |
+ | **gRPC** | Microservices | High performance, type-safe | Complex setup |
1292
+
1293
+ ### Frontend Composability
1294
+
1295
+ **Multiple Frontends, Same Backend:**
1296
+ ```ruby
1297
+ # Start multiple frontends for the same registry backend
1298
+ registry_backend = Agent99::Registry.new(backend: :redis)
1299
+
1300
+ # Agent99 frontend for agent-to-agent communication
1301
+ agent_frontend = RegistryAgent.new(registry: registry_backend)
1302
+
1303
+ # HTTP API for external systems
1304
+ http_frontend = RegistryHTTP.new(registry: registry_backend)
1305
+
1306
+ # CLI for administrators
1307
+ cli_frontend = RegistryCLI.new(registry: registry_backend)
1308
+
1309
+ # All frontends operate on the same data
1310
+ ```
1311
+
1312
+ ### CLI Integration Patterns
1313
+
1314
+ #### Pattern 1: Standalone CLI Tool
1315
+ ```bash
1316
+ # Direct CLI access to registry
1317
+ $ agent99-registry list
1318
+ $ agent99-registry discover greeting
1319
+ $ agent99-registry show uuid-12345
1320
+ $ agent99-registry watch # Live updates
1321
+ ```
1322
+
1323
+ #### Pattern 2: Agent99 with Embedded CLI
1324
+ ```ruby
1325
+ # Agent that provides CLI interface
1326
+ class RegistryAgentWithCLI < Agent99::Base
1327
+ def initialize
1328
+ super
1329
+ start_repl if ENV['REGISTRY_CLI_MODE']
1330
+ end
1331
+
1332
+ def start_repl
1333
+ require 'pry'
1334
+ binding.pry # Drop into interactive console
1335
+ end
1336
+ end
1337
+ ```
1338
+
1339
+ #### Pattern 3: Remote CLI via Agent99 Messages
1340
+ ```ruby
1341
+ # CLI that sends commands through Agent99 transport
1342
+ class RemoteRegistryCLI
1343
+ def initialize
1344
+ @agent = Agent99::Base.new
1345
+ end
1346
+
1347
+ def execute(command)
1348
+ response = @agent.send_request(
1349
+ to: 'registry_agent',
1350
+ action: 'cli_command',
1351
+ command: command
1352
+ )
1353
+ display_response(response)
1354
+ end
1355
+ end
1356
+
1357
+ # Usage:
1358
+ # $ agent99-cli registry list
1359
+ # $ agent99-cli registry discover calculation
1360
+ ```
1361
+
1362
+ ## Configuration Examples
1363
+
1364
+ ### Environment Variables
1365
+ ```bash
1366
+ # Backend selection
1367
+ AGENT99_REGISTRY_BACKEND=redis
1368
+
1369
+ # Redis configuration
1370
+ AGENT99_REGISTRY_REDIS_URL=redis://localhost:6379/0
1371
+ AGENT99_REGISTRY_REDIS_TTL=3600
1372
+
1373
+ # Database configuration
1374
+ AGENT99_REGISTRY_DB_URL=sqlite://registry.db
1375
+ AGENT99_REGISTRY_DB_POOL_SIZE=10
1376
+
1377
+ # Filesystem configuration
1378
+ AGENT99_REGISTRY_FS_PATH=/var/lib/agent99/registry
1379
+ AGENT99_REGISTRY_FS_SYNC_INTERVAL=30
1380
+ ```
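
A loader for these variables could look like the following sketch. It accepts any Hash-like source so it can be exercised without touching the real environment. The function name, the default values, and the backend names `database`, `filesystem`, and `memory` are assumptions; the source only shows `redis` as a value.

```ruby
# Builds a settings hash from the AGENT99_REGISTRY_* variables listed above.
def registry_config(env = ENV)
  backend = env.fetch('AGENT99_REGISTRY_BACKEND', 'memory')
  settings =
    case backend
    when 'redis'
      { url: env.fetch('AGENT99_REGISTRY_REDIS_URL', 'redis://localhost:6379/0'),
        ttl: Integer(env.fetch('AGENT99_REGISTRY_REDIS_TTL', '3600')) }
    when 'database'
      { url: env.fetch('AGENT99_REGISTRY_DB_URL', 'sqlite://registry.db'),
        pool_size: Integer(env.fetch('AGENT99_REGISTRY_DB_POOL_SIZE', '10')) }
    when 'filesystem'
      { path: env.fetch('AGENT99_REGISTRY_FS_PATH', '/var/lib/agent99/registry'),
        sync_interval: Integer(env.fetch('AGENT99_REGISTRY_FS_SYNC_INTERVAL', '30')) }
    when 'memory'
      {}
    else
      # Fail at startup instead of silently falling back to a default backend.
      raise ArgumentError, "unknown registry backend: #{backend}"
    end
  { backend: backend.to_sym, settings: settings }
end
```

Using `Integer(...)` instead of `to_i` means a malformed value such as `AGENT99_REGISTRY_REDIS_TTL=abc` raises immediately instead of becoming `0`.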
1381
+
1382
+ ### YAML Configuration
1383
+ ```yaml
1384
+ registry:
+   backend: redis
+   redis:
+     url: redis://localhost:6379/0
+     ttl: 3600
+     namespace: agent99:registry
+
+   # Fallback chain
+   fallbacks:
+     - filesystem
+     - memory
+
+   # Replication
+   replicas:
+     - redis://backup1:6379/0
+     - redis://backup2:6379/0
1400
+ ```
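
One plausible reading of the fallback chain is "try the primary backend, then each fallback in order, and use the first that is reachable". The sketch below implements that reading. It assumes `fallbacks` is nested under `registry`, and the `select_backend` function and the probe lambdas are hypothetical placeholders for real connectivity checks.

```ruby
require 'yaml'

# Returns the first backend in [primary, *fallbacks] whose probe succeeds.
def select_backend(config, probes)
  registry = config.fetch('registry')
  chain = [registry.fetch('backend'), *registry.fetch('fallbacks', [])]
  chosen = chain.find { |name| probes.fetch(name, -> { false }).call }
  raise "no registry backend available (tried #{chain.join(', ')})" unless chosen
  chosen
end

config = YAML.safe_load(<<~CONFIG)
  registry:
    backend: redis
    fallbacks:
      - filesystem
      - memory
CONFIG

# Placeholder probes: pretend Redis is down and the local options work.
probes = {
  'redis'      => -> { false },
  'filesystem' => -> { true },
  'memory'     => -> { true }
}
active = select_backend(config, probes)
```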
1401
+
1402
+ ## Security Considerations
1403
+
1404
+ ### Authentication & Authorization
1405
+ ```ruby
1406
+ class Agent99::Registry::Backend
1407
+ def register(agent_info, credentials)
1408
+ authenticate!(credentials)
1409
+ authorize!(:register, agent_info)
1410
+ # ... registration logic
1411
+ end
1412
+ end
1413
+ ```
1414
+
1415
+ ### Secure Communication
1416
+ - **TLS/SSL**: For HTTP API and database connections
1417
+ - **Authentication tokens**: JWT or API keys for agent registration
1418
+ - **Rate limiting**: Prevent registry flooding
1419
+ - **Input validation**: Sanitize all registry inputs
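
Rate limiting against registry flooding is commonly done with a token bucket per client. The following is a minimal sketch of that technique; the class name, the capacity, and the refill rate are illustrative choices, not values taken from Agent99.

```ruby
# Token bucket keyed by client id: each request spends one token and
# tokens refill continuously up to a fixed capacity.
class RateLimiter
  def initialize(capacity: 5, refill_per_second: 1.0)
    @capacity = capacity.to_f
    @refill = refill_per_second
    @buckets = {} # client_id -> [tokens, last_seen]
    @mutex = Mutex.new
  end

  # Returns true if the request may proceed, false if the client is over its limit.
  def allow?(client_id, now: Process.clock_gettime(Process::CLOCK_MONOTONIC))
    @mutex.synchronize do
      tokens, last_seen = @buckets.fetch(client_id) { [@capacity, now] }
      tokens = [@capacity, tokens + (now - last_seen) * @refill].min
      allowed = tokens >= 1.0
      tokens -= 1.0 if allowed
      @buckets[client_id] = [tokens, now]
      allowed
    end
  end
end
```

A frontend would call `allow?` with a stable client identifier (for example an API key) before handling a register or withdraw request.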
1420
+
1421
+ ## Migration Strategy
1422
+
1423
+ ### From Current In-Memory to Persistent Backend
1424
+
1425
+ **Step 1: Backward Compatible Update**
1426
+ ```ruby
1427
+ # Update current Sinatra app
1428
+ class RegistryApp < Sinatra::Base
1429
+ def initialize
1430
+ # Start with memory backend (current behavior)
1431
+ @backend = Agent99::Registry::MemoryBackend.new
1432
+
1433
+ # Optional persistence layer
1434
+ if ENV['REGISTRY_PERSISTENCE']
1435
+ @persistent_backend = Agent99::Registry::RedisBackend.new
1436
+ @backend = Agent99::Registry::CacheBackend.new(
1437
+ cache: @backend,
1438
+ persistent: @persistent_backend
1439
+ )
1440
+ end
+
+ super() # Sinatra::Base#initialize must still run
+ end
1442
+ end
1443
+ ```
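
The `CacheBackend` used in Step 1 is not defined in this document. One way it could behave is as a write-through wrapper: reads are served from the in-memory layer, writes go to both layers, and the cache is warmed from persistent storage on startup. The sketch below shows that behavior with a small `HashStore` standing in for both layers; all class and method names here are assumptions.

```ruby
# Minimal store used for both layers in this sketch; a real persistent
# layer would talk to Redis or a database instead.
class HashStore
  def initialize
    @agents = {}
  end

  def register(agent)
    @agents[agent[:uuid]] = agent
    agent[:uuid]
  end

  def withdraw(uuid)
    !@agents.delete(uuid).nil?
  end

  def get_agent(uuid)
    @agents[uuid]
  end

  def list_all
    @agents.values
  end
end

# Hypothetical write-through wrapper around a cache and a persistent store.
class CacheBackend
  def initialize(cache:, persistent:)
    @cache = cache
    @persistent = persistent
    # Warm the cache so agents registered before a restart are visible again.
    @persistent.list_all.each { |agent| @cache.register(agent) }
  end

  # Persist first, so the cache never holds an entry that was not stored durably.
  def register(agent)
    @persistent.register(agent)
    @cache.register(agent)
  end

  def withdraw(uuid)
    @persistent.withdraw(uuid)
    @cache.withdraw(uuid)
  end

  def get_agent(uuid)
    @cache.get_agent(uuid)
  end

  def list_all
    @cache.list_all
  end
end
```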
1444
+
1445
+ **Step 2: Data Migration**
1446
+ ```ruby
1447
+ # Migrate existing agents to new backend
1448
+ class RegistryMigrator
1449
+ def migrate(from_backend, to_backend)
1450
+ from_backend.list_all.each do |agent|
1451
+ to_backend.register(agent)
1452
+ end
1453
+ end
1454
+ end
1455
+ ```
1456
+
1457
+ ## Model Context Protocol (MCP) Integration
1458
+
1459
+ ### Registry as MCP Server
1460
+
1461
+ The Agent99 registry can expose itself as an MCP server, providing AI assistants with direct access to agent information and management capabilities.
1462
+
1463
+ #### MCP Tools for Agent Management
1464
+ ```ruby
1465
+ class Agent99RegistryMCPServer < MCPServer
1466
+ def initialize(registry)
1467
+ @registry = registry
1468
+ super
1469
+ end
1470
+
1471
+ def tools
1472
+ [
1473
+ {
1474
+ name: "discover_agents",
1475
+ description: "Find agents by capability, namespace, or status",
1476
+ inputSchema: {
1477
+ type: "object",
1478
+ properties: {
1479
+ capability: { type: "string" },
1480
+ namespace: { type: "string" },
1481
+ status: { type: "string" },
1482
+ health_threshold: { type: "number" }
1483
+ }
1484
+ }
1485
+ },
1486
+
1487
+ {
1488
+ name: "agent_whois",
1489
+ description: "Get detailed information about specific agents",
1490
+ inputSchema: {
1491
+ type: "object",
1492
+ properties: {
1493
+ query: {
1494
+ type: "string",
1495
+ description: "UUID, capability, owner, or search term"
1496
+ }
1497
+ },
1498
+ required: ["query"]
1499
+ }
1500
+ },
1501
+
1502
+ {
1503
+ name: "semantic_agent_search",
1504
+ description: "Find agents using natural language descriptions",
1505
+ inputSchema: {
1506
+ type: "object",
1507
+ properties: {
1508
+ description: {
1509
+ type: "string",
1510
+ description: "Natural language description of needed capability"
1511
+ }
1512
+ }
1513
+ }
1514
+ },
1515
+
1516
+ {
1517
+ name: "diagnose_agent_issues",
1518
+ description: "Analyze agent problems and suggest solutions",
1519
+ inputSchema: {
1520
+ type: "object",
1521
+ properties: {
1522
+ symptoms: { type: "string" },
1523
+ affected_agents: { type: "array", items: { type: "string" } }
1524
+ }
1525
+ }
1526
+ },
1527
+
1528
+ {
1529
+ name: "suggest_agent_placement",
1530
+ description: "Recommend optimal agent deployment locations",
1531
+ inputSchema: {
1532
+ type: "object",
1533
+ properties: {
1534
+ capability_needed: { type: "string" },
1535
+ performance_requirements: { type: "object" }
1536
+ }
1537
+ }
1538
+ }
1539
+ ]
1540
+ end
1541
+
1542
+ def call_tool(name, arguments)
1543
+ case name
1544
+ when "discover_agents"
1545
+ @registry.discover_with_filters(arguments)
1546
+ when "agent_whois"
1547
+ @registry.whois(arguments[:query])
1548
+ when "semantic_agent_search"
1549
+ @registry.semantic_search(arguments[:description])
1550
+ when "diagnose_agent_issues"
1551
+ @registry.diagnose_problems(arguments[:symptoms], arguments[:affected_agents])
1552
+ when "suggest_agent_placement"
+ @registry.suggest_placement(arguments)
+ else
+ raise ArgumentError, "Unknown tool: #{name}"
+ end
1555
+ end
1556
+ end
1557
+ ```
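
Before dispatching in `call_tool`, the arguments can be checked against the tool's `inputSchema`. The sketch below covers only the `required` list and the simple `type` keywords used in the tool definitions above; it is not a full JSON Schema validator. It assumes argument keys arrive as strings (as they would after JSON parsing) while the schema uses symbol keys as in the Ruby hashes above, and the function name is hypothetical.

```ruby
# Type predicates for the schema types that appear in the tool list.
TYPE_CHECKS = {
  'string' => ->(value) { value.is_a?(String) },
  'number' => ->(value) { value.is_a?(Numeric) },
  'object' => ->(value) { value.is_a?(Hash) },
  'array'  => ->(value) { value.is_a?(Array) }
}.freeze

# Returns a list of human-readable problems; an empty list means valid.
def validate_arguments(schema, arguments)
  errors = []
  Array(schema[:required]).each do |key|
    errors << "missing required argument: #{key}" unless arguments.key?(key)
  end
  arguments.each do |key, value|
    spec = schema.fetch(:properties, {})[key.to_sym]
    if spec.nil?
      errors << "unknown argument: #{key}"
    elsif !TYPE_CHECKS.fetch(spec[:type], ->(_) { true }).call(value)
      errors << "#{key} must be a #{spec[:type]}"
    end
  end
  errors
end

whois_schema = {
  type: 'object',
  properties: { query: { type: 'string', description: 'UUID, capability, owner, or search term' } },
  required: ['query']
}
validate_arguments(whois_schema, { 'query' => 'abc-123' })
```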
1558
+
1559
+ #### MCP Resources for Documentation
1560
+ ```ruby
1561
+ def resources
1562
+ [
1563
+ {
1564
+ uri: "agent99://schemas/agent",
1565
+ name: "Agent Registration Schema",
1566
+ description: "JSON schema for agent registration data",
1567
+ mimeType: "application/json"
1568
+ },
1569
+
1570
+ {
1571
+ uri: "agent99://docs/capabilities",
1572
+ name: "Capability Documentation",
1573
+ description: "Available agent capabilities and their usage patterns",
1574
+ mimeType: "text/markdown"
1575
+ },
1576
+
1577
+ {
1578
+ uri: "agent99://topology/current",
1579
+ name: "Live Network Topology",
1580
+ description: "Current agent network topology and health status",
1581
+ mimeType: "application/json"
1582
+ },
1583
+
1584
+ {
1585
+ uri: "agent99://metrics/performance",
1586
+ name: "Performance Metrics",
1587
+ description: "Agent performance data and trends",
1588
+ mimeType: "application/json"
1589
+ }
1590
+ ]
1591
+ end
1592
+ ```
1593
+
1594
+ ### AI-Powered Agent Operations
1595
+
1596
+ #### Natural Language Agent Discovery
1597
+ ```text
1598
+ # AI assistant with MCP registry access
1599
+ User: "Find all calculation agents that are currently having issues"
1600
+
1601
+ AI: I'll check the registry for calculation agents with health problems.
1602
+ [Uses discover_agents tool with capability: "calculation", health_threshold: 70]
1603
+
1604
+ AI: Found 3 calculation agents with issues:
1605
+ - agent_calc_01 (uuid: abc-123): High error rate (45% failures), CPU: 95%
1606
+ - agent_calc_03 (uuid: def-456): Offline for 15 minutes
1607
+ - agent_calc_07 (uuid: ghi-789): Memory usage critical (98%)
1608
+
1609
+ Would you like me to get more details about any of these agents?
1610
+
1611
+ User: "What's the root cause of agent_calc_01's problems?"
1612
+
1613
+ AI: [Uses diagnose_agent_issues tool]
1614
+ Based on the metrics, agent_calc_01 is experiencing:
1615
+ 1. High CPU due to inefficient algorithm in recent deployment
1616
+ 2. Error rate spike correlates with increased request complexity
1617
+ 3. Recommend: Rollback to previous version or increase resource allocation
1618
+ ```
1619
+
1620
+ #### Intelligent Troubleshooting
1621
+ ```text
1622
+ User: "Our financial reporting system is slow today"
1623
+
1624
+ AI: I'll analyze agents involved in financial reporting.
1625
+ [Uses semantic_agent_search with "financial reporting data processing"]
1626
+
1627
+ AI: Found 4 agents in the financial reporting pipeline:
1628
+ - data_ingestion_agent: Normal (health: 85%)
1629
+ - financial_processor_01: Degraded (health: 60%) - High latency
1630
+ - report_generator: Normal (health: 90%)
1631
+ - financial_processor_02: Critical (health: 30%) - Frequent timeouts
1632
+
1633
+ The bottleneck appears to be financial_processor_02. It's showing:
1634
+ - 300% normal response time
1635
+ - 15% timeout rate
1636
+ - Memory usage at 95%
1637
+
1638
+ Recommended actions:
1639
+ 1. Scale up financial_processor_02 instance
1640
+ 2. Route traffic to financial_processor_01 as backup
1641
+ 3. Investigate memory leak in recent deployment
1642
+ ```
1643
+
1644
+ #### Automated Deployment Assistance
1645
+ ```text
1646
+ User: "We need to deploy a new image processing agent. Where should it go?"
1647
+
1648
+ AI: [Uses suggest_agent_placement tool]
1649
+ Based on current topology and load patterns:
1650
+
1651
+ Recommended placement: us-west-datacenter-2
1652
+ - Lowest latency to image storage (15ms avg)
1653
+ - Available compute capacity (40% CPU utilization)
1654
+ - Network proximity to related agents
1655
+ - Fallback options available in same zone
1656
+
1657
+ Alternative locations:
1658
+ 1. us-west-datacenter-1 (higher latency to storage: 45ms)
1659
+ 2. us-east-datacenter-1 (cross-region, 120ms latency)
1660
+
1661
+ Would you like me to initiate the deployment process?
1662
+ ```
1663
+
1664
+ ### MCP Integration Benefits
1665
+
1666
+ #### For Developers
1667
+ - **Natural language queries**: "Show me broken agents" instead of complex API calls
1668
+ - **Intelligent debugging**: AI correlates symptoms across agent fleet
1669
+ - **Context-aware help**: AI understands agent relationships and dependencies
1670
+
1671
+ #### For Operations Teams
1672
+ - **Proactive monitoring**: AI can flag likely failures early from health and error-rate trends
1673
+ - **Smart alerting**: Reduced false positives through intelligent correlation
1674
+ - **Automated remediation**: AI suggests and can execute fixes
1675
+
1676
+ #### For System Architecture
1677
+ - **Protocol standardization**: MCP provides standard AI integration interface
1678
+ - **Tool composability**: Registry tools combine with other MCP servers
1679
+ - **Future-ready**: Prepared for AI agent orchestration and management
1680
+
1681
+ ### Implementation Architecture
1682
+
1683
+ ```ruby
1684
+ # Complete registry with MCP integration
1685
+ class Agent99RegistryWithMCP
1686
+ def initialize(backend: :redis)
1687
+ @registry = Agent99::Registry.new(backend: backend)
1688
+ @mcp_server = Agent99RegistryMCPServer.new(@registry)
1689
+ end
1690
+
1691
+ def start
1692
+ # Start all frontend interfaces
1693
+ start_http_api # Traditional REST API
1694
+ start_agent_frontend # Agent99 messaging interface
1695
+ start_mcp_server # MCP for AI integration
1696
+ start_cli_interface # Command line tools
1697
+ start_grpc_server # High-performance RPC (optional)
1698
+ end
1699
+
1700
+ private
1701
+
1702
+ def start_mcp_server
1703
+ # MCP server runs alongside other interfaces
1704
+ Thread.new { @mcp_server.run }
1705
+ end
1706
+ end
1707
+ ```
1708
+
1709
+ ## Use Case Examples
1710
+
1711
+ ### Development Environment
1712
+ ```yaml
1713
+ # Simple setup for development
1714
+ frontend: cli # Direct CLI access
1715
+ backend: filesystem # No external dependencies
1716
+ path: ./dev_registry # Local directory
1717
+ ```
1718
+
1719
+ ### Small Team Deployment
1720
+ ```yaml
1721
+ # Balanced features and simplicity
1722
+ frontend:
1723
+ - sinatra # HTTP API
1724
+ - cli # Admin access
1725
+ backend: sqlite # Simple database
1726
+ auth: basic # Basic authentication
1727
+ ```
1728
+
1729
+ ### Enterprise Deployment
1730
+ ```yaml
1731
+ # Full-featured production system
1732
+ frontend:
+   - agent99 # Internal agent communication
+   - rails # Rich admin UI
+   - grpc # High-performance API
+ backend:
+   primary: postgresql # Complex queries
+   cache: redis # Performance optimization
+ auth: oauth2 # Enterprise SSO
+ audit: enabled # Compliance logging
1741
+ ```
1742
+
1743
+ ### Distributed Multi-Cluster
1744
+ ```yaml
1745
+ # Globally distributed system
1746
+ frontend:
+   - agent99 # Distributed agents
+   - cli # Local troubleshooting
+ backend:
+   type: federated # Multiple registries
+   regions:
+     - us-east: redis-cluster
+     - eu-west: redis-cluster
+     - ap-south: redis-cluster
+   sync: eventual # Cross-region sync
1756
+ ```
1757
+
1758
+ ## Recommendations
1759
+
1760
+ ### Immediate Actions (Maintain Compatibility)
1761
+ 1. **Refactor current registry** into backend abstraction
1762
+ 2. **Add Redis backend** as optional persistence layer
1763
+ 3. **Keep Sinatra API** for backward compatibility
1764
+ 4. **Add health monitoring** for registered agents
1765
+
1766
+ ### Medium-term Evolution
1767
+ 1. **Implement Registry Agent** for distributed access
1768
+ 2. **Add database backend** for complex queries
1769
+ 3. **Integrate with SmartMessage** transport layer
1770
+ 4. **Add semantic search** capabilities
1771
+
1772
+ ### Long-term Vision
1773
+ 1. **Fully distributed registry** with no single point of failure
1774
+ 2. **Federation support** for multi-cluster deployments
1775
+ 3. **AI-powered discovery** with semantic understanding
1776
+ 4. **Self-healing registry** with automatic reconciliation
1777
+
1778
+ ---
1779
+
1780
+ ## Repository Separation Strategy
1781
+
1782
+ ### Agent99 Core Repository
1783
+ - Contains core Agent99 framework and coordination logic
1784
+ - Includes simple in-memory registry example (current Sinatra implementation)
1785
+ - Depends on `control-registry` gem for production registry features
1786
+ - Integration points for registry discovery and agent coordination
1787
+
1788
+ ### Control-Registry Repository
1789
+ - Contains production-ready registry infrastructure
1790
+ - Pluggable frontend architecture (Agent99, Sinatra, CLI, gRPC)
1791
+ - Pluggable backend architecture (Memory, Redis, Database, Filesystem)
1792
+ - Security framework and authentication systems
1793
+ - DNS-like hierarchical model with WHOIS functionality
1794
+ - MCP integration for AI-powered operations
1795
+ - Comprehensive test suite and documentation
1796
+
1797
+ ### Integration
1798
+ ```ruby
1799
+ # Agent99 using control-registry
1800
+ require 'control-registry'
1801
+
1802
+ # Agent99 can use control-registry for production deployments
1803
+ storage = Control::Registry::DataStorage::Redis.new
1804
+ registry_client = Control::Registry::Client.new(storage: storage)
1805
+
1806
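+ class MyAgent < Agent99::Base
+ # A local variable defined outside the class is not visible inside `def`,
+ # so the registry client is injected through the constructor instead.
+ def initialize(registry:)
+ @registry = registry
+ super()
+ end
+ end
+
+ agent = MyAgent.new(registry: registry_client)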
1812
+ ```
1813
+
1814
+ ---
1815
+
1816
+ *Last Updated: 2025-01-03*
1817
+ *Status: Ready for Control-Registry Implementation*
1818
+ *Repository: Separate `control-registry` repository established*