flock-core 0.5.0b54__py3-none-any.whl → 0.5.0b55__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


Metadata-Version: 2.4
Name: flock-core
Version: 0.5.0b54
Summary: Add your description here
Author-email: Andre Ratzenberger <andre.ratzenberger@whiteduck.de>
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: devtools>=0.12.2
Requires-Dist: dspy==3.0.0
Requires-Dist: duckdb>=1.1.0
Requires-Dist: fastapi>=0.117.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: litellm==1.75.3
Requires-Dist: loguru>=0.7.3
Requires-Dist: mcp>=1.7.1
Requires-Dist: opentelemetry-api>=1.30.0
Requires-Dist: opentelemetry-exporter-jaeger-proto-grpc>=1.21.0
Requires-Dist: opentelemetry-exporter-jaeger>=1.21.0
Requires-Dist: opentelemetry-exporter-otlp>=1.30.0
Requires-Dist: opentelemetry-instrumentation-logging>=0.51b0
Requires-Dist: opentelemetry-sdk>=1.30.0
Requires-Dist: poethepoet>=0.30.0
Requires-Dist: pydantic[email]>=2.11.9
Requires-Dist: rich>=14.1.0
Requires-Dist: toml>=0.10.2
Requires-Dist: typer>=0.19.2
Requires-Dist: uvicorn>=0.37.0
Requires-Dist: websockets>=15.0.1
Description-Content-Type: text/markdown

<p align="center">
<img alt="Flock Banner" src="https://raw.githubusercontent.com/whiteducksoftware/flock/master/docs/assets/images/flock.png" width="800">
</p>
<p align="center">
<a href="https://pypi.org/project/flock-core/" target="_blank"><img alt="PyPI Version" src="https://img.shields.io/pypi/v/flock-core?style=for-the-badge&logo=pypi&label=pip%20version"></a>
<img alt="Python Version" src="https://img.shields.io/badge/python-3.10%2B-blue?style=for-the-badge&logo=python">
<a href="https://github.com/whiteducksoftware/flock/blob/master/LICENSE" target="_blank"><img alt="License" src="https://img.shields.io/pypi/l/flock-core?style=for-the-badge"></a>
<a href="https://whiteduck.de" target="_blank"><img alt="Built by white duck" src="https://img.shields.io/badge/Built%20by-white%20duck%20GmbH-white?style=for-the-badge&labelColor=black"></a>
<a href="https://www.linkedin.com/company/whiteduck" target="_blank"><img alt="LinkedIn" src="https://img.shields.io/badge/linkedin-%230077B5.svg?style=for-the-badge&logo=linkedin&logoColor=white&label=whiteduck"></a>
<a href="https://bsky.app/profile/whiteduck-gmbh.bsky.social" target="_blank"><img alt="Bluesky" src="https://img.shields.io/badge/bluesky-Follow-blue?style=for-the-badge&logo=bluesky&logoColor=%23fff&color=%23333&labelColor=%230285FF&label=whiteduck-gmbh"></a>
</p>

---

# 🚀 Flock 0.5: Agent Systems Without the Graphs

> **What if agents collaborated like experts at a whiteboard—not like nodes in a rigid workflow?**

---

## The Problem You Know Too Well

🤯 **Prompt Hell**: Brittle 500-line prompts that break with every model update
💥 **System Failures**: One bad LLM response crashes your entire workflow
🧪 **Testing Nightmares**: "How do I unit test a prompt?" (You don't.)
📏 **Measuring Quality**: "How do I know my prompts are optimal?" (You also don't.)
📄 **Output Chaos**: Parsing unstructured LLM responses into reliable data
⛓️ **Orchestration Limits**: Graph-based frameworks create rigid, tightly-coupled systems
🚀 **Production Gap**: Jupyter notebooks don't scale to enterprise systems
🔓 **No Security Model**: Every agent sees everything—no access controls

**The tooling is fundamentally broken. It's time for a better approach.**

Most of these issues are solvable, because decades of experience with microservices taught us hard lessons about decoupling, orchestration, and reliability.

**Let's bring those lessons to AI agents!**

---

## The Flock Solution: Declarative + Blackboard Architecture

**What if you could skip the 'prompt engineering' step AND avoid rigid workflow graphs?**

Flock 0.5 combines **declarative AI workflows** with **blackboard architecture**—the pattern that has powered groundbreaking AI systems since the 1970s (the Hearsay-II speech recognition system at CMU).

### ✅ Declarative at Heart

**No natural language prompts. No brittle instructions. Just type-safe contracts.**

```python
from flock_flow.orchestrator import Flock
from flock_flow.registry import flock_type
from pydantic import BaseModel

@flock_type
class MyDreamPizza(BaseModel):
    pizza_idea: str

@flock_type
class Pizza(BaseModel):
    ingredients: list[str]
    size: str
    crust_type: str
    step_by_step_instructions: list[str]

# Create orchestrator
flock = Flock("openai/gpt-4o")

# Define agent with ZERO natural language
pizza_master = (
    flock.agent("pizza_master")
    .consumes(MyDreamPizza)
    .publishes(Pizza)
)
```

**Hard-binding type contracts will even work with GPT-4729.**

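Kicking off this agent is just a matter of putting a typed artifact on the blackboard. A minimal sketch, using the `publish()` and `run_until_idle()` calls shown in the fuller examples further down (the pizza idea is a made-up placeholder):

```python
import asyncio

async def main():
    # Put a typed artifact on the blackboard; pizza_master reacts to it
    await flock.publish(MyDreamPizza(pizza_idea="a spicy four-cheese pizza"))
    await flock.run_until_idle()  # wait until no agent has work left

asyncio.run(main())
```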
<p align="center">
<img alt="Flock Blackboard" src="docs/img/pizza.png" width="1000">
</p>

### ✅ Key Advantages

✅ **Declarative Contracts**: Define inputs/outputs with Pydantic models. Flock handles the LLM complexity.
⚡ **Built-in Resilience**: Blackboard persists context—agents crash? They recover and resume.
🧪 **Actually Testable**: Clear contracts make agents unit-testable like any other code (see the test sketch below)
🔐 **Zero-Trust Security**: 5 built-in visibility types (Public, Private, Tenant, Label-based, Time-delayed)
🚀 **Dynamic Workflows**: Self-correcting loops, conditional routing, intelligent decision-making
🔧 **Production-Ready**: Real-time dashboard, WebSocket streaming, 743 passing tests
📊 **True Observability**: Agent View + Blackboard View with full data lineage

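What "testable" means in practice: the building blocks of an agent, Pydantic contracts and plain Python predicates, can be exercised with ordinary `pytest` and no LLM call at all. A minimal sketch; `is_high_quality` is a hypothetical routing predicate, not a framework API:

```python
import pytest
from pydantic import BaseModel, Field, ValidationError

class Review(BaseModel):
    text: str = Field(max_length=1000)
    score: int = Field(ge=1, le=10)

def is_high_quality(review: Review) -> bool:
    # The kind of predicate you would pass to .consumes(Review, where=...)
    return review.score > 8

def test_contract_rejects_out_of_range_scores():
    with pytest.raises(ValidationError):
        Review(text="ok", score=11)  # violates ge=1, le=10

def test_routing_predicate():
    assert is_high_quality(Review(text="great", score=9))
    assert not is_high_quality(Review(text="meh", score=5))
```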
---

## Why Graphs Fail (and Blackboards Win)

### The Problem with Graph-Based Frameworks

**LangGraph. CrewAI. AutoGen.** They all make the same fundamental mistake: **treating agent collaboration as a directed graph**.

```python
# ❌ The Graph-Based Way (LangGraph, CrewAI, etc.)
workflow.add_edge("agent_a", "agent_b")  # Tight coupling!
workflow.add_edge("agent_b", "agent_c")  # Predefined flow!

# What happens when you need to:
# - Add agent_d that consumes data from agent_a?
# - Run agent_b and agent_c in parallel?
# - Route conditionally based on agent_a's output quality?
# Answer: Rewrite the graph. Again. And again.
```

**Why graphs fail at scale:**

- 🔗 **Tight coupling**: Agents hardcode their successors
- 📐 **Rigid topology**: Adding an agent means rewiring the graph
- 🐌 **Sequential thinking**: Even independent agents wait in line
- 🧪 **Testing nightmare**: Can't test agents in isolation
- 🔓 **No security model**: Every agent sees everything
- 📈 **Doesn't scale**: 20+ agents = spaghetti graph
- 💀 **Single point of failure**: Orchestrator dies? Everything dies.
- 🧠 **God object anti-pattern**: One orchestrator needs domain knowledge of 20+ agents to route correctly
- 📦 **No context resilience**: Agent crashes? Context disappears. No recovery.

**This is workflow orchestration dressed up as "agent systems."**

---

### The Blackboard Alternative: How Experts Actually Collaborate

<p align="center">
<img alt="Flock Blackboard" src="docs/img/flock_ui_blackboard_view.png" width="1000">
</p>

Watch a team of specialists solve a complex problem:

1. **Radiologist** posts X-ray analysis on the whiteboard
2. **Lab tech** sees it, adds blood work results
3. **Diagnostician** waits for BOTH, then posts diagnosis
4. **Pharmacist** reacts to diagnosis, suggests treatment

**No one manages the workflow.** No directed graph. Just specialists reacting to relevant information appearing on a shared workspace.

**This is the blackboard pattern—proven since the 1970s (Hearsay-II speech recognition system at CMU).**

**Why this matters:**
- **Context IS the blackboard**: All state lives in one place, not scattered across agents
- **Crash resilience**: Agent dies? Blackboard persists. Restart agent, it picks up where it left off.
- **100% decoupled**: Agents don't know about each other. They only know data types.
- **Microservices lessons applied**: We learned in the 2000s that tight coupling kills scalability. Blackboards apply that wisdom to AI agents.

---

## 🎯 Flock 0.5: Blackboard-First Architecture

```python
from flock_flow.orchestrator import Flock
from flock_flow.registry import flock_type
from pydantic import BaseModel

# 1. Define typed artifacts (what goes on the blackboard)
@flock_type
class PatientScan(BaseModel):
    patient_id: str  # additional fields omitted for brevity

@flock_type
class XRayAnalysis(BaseModel):
    finding: str
    confidence: float

@flock_type
class LabResults(BaseModel):
    markers: dict[str, float]

@flock_type
class Diagnosis(BaseModel):
    condition: str
    reasoning: str

# 2. Create orchestrator (the blackboard)
orchestrator = Flock("openai/gpt-4o")

# 3. Agents subscribe to what they care about (NO explicit workflow!)
radiologist = (
    orchestrator.agent("radiologist")
    .consumes(PatientScan)
    .publishes(XRayAnalysis)
)

lab_tech = (
    orchestrator.agent("lab_tech")
    .consumes(PatientScan)
    .publishes(LabResults)
)

diagnostician = (
    orchestrator.agent("diagnostician")
    .consumes(XRayAnalysis, LabResults)  # Waits for BOTH!
    .publishes(Diagnosis)
)

# 4. Publish input, agents react opportunistically
await orchestrator.publish(PatientScan(patient_id="12345", ...))
await orchestrator.run_until_idle()
```

**What just happened:**
- Radiologist and lab_tech ran **in parallel** (both consume PatientScan)
- Diagnostician **automatically waited** for both to finish
- **No workflow graph.** No `.add_edge()`. Just subscriptions.
- Add new agents? Just subscribe them. No rewiring (see the sketch below).

**Resilience built-in:**
- Lab agent crashes? Blackboard still has XRayAnalysis. Restart the lab agent and it processes the scan again.
- No "orchestrator god object" deciding which agent runs when—agents decide themselves based on what's on the blackboard.
- Context lives on the blackboard, not in memory. Agents are stateless and recoverable.

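To make the "no rewiring" point concrete, here is a sketch of adding the pharmacist from the story above as a fourth specialist. `TreatmentPlan` is a hypothetical artifact type invented for this example; the existing agents are untouched:

```python
@flock_type
class TreatmentPlan(BaseModel):
    medication: str
    notes: str

# New specialist: reacts to any Diagnosis that appears on the blackboard.
# radiologist, lab_tech, and diagnostician need no changes at all.
pharmacist = (
    orchestrator.agent("pharmacist")
    .consumes(Diagnosis)
    .publishes(TreatmentPlan)
)
```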
---

## 🔥 Why Blackboard Beats Graphs

| Dimension | Graph-Based (LangGraph, CrewAI) | Blackboard (Flock 0.5) |
|-----------|--------------------------------|------------------------|
| **Add new agent** | Rewrite graph, update edges | Just subscribe to types |
| **Parallel execution** | Manual (split nodes, join nodes) | Automatic (multiple consumers) |
| **Conditional routing** | Complex graph branches | `where=lambda x: x.score > 8` |
| **Testing** | Need full graph setup | Test agents in isolation |
| **Security** | Add-on (if exists) | Built-in (5 visibility types) |
| **Coupling** | Tight (agents know successors) | Loose (agents know types) |
| **Scalability** | O(n²) edges at 20+ agents | O(n) subscriptions |
| **Mental model** | "Draw the workflow" | "What data triggers this?" |
| **Context management** | Scattered across agents | **Blackboard IS the context** |
| **Resilience** | Agent crash = data loss | **Blackboard persists, agents recover** |
| **Orchestrator pattern** | **God object with domain knowledge** | **Agents decide autonomously** |
| **Single point of failure** | Orchestrator dies = everything dies | **Agents independent, blackboard survives** |
| **Architecture wisdom** | Ignores 20 years of microservices | **Applies decoupling lessons learned** |

---

## 💡 Core Concepts: Rethinking Agent Coordination

### 1. Typed Artifacts (Not Unstructured Messages)

**Graph frameworks:** Agents pass dictionaries or unstructured text.

```python
# ❌ LangGraph/CrewAI style
agent_a.output = {"result": "some text", "score": 8}  # What's the schema?
```

**Flock 0.5:** Every artifact is a validated Pydantic model.

```python
# ✅ Flock 0.5 style
@flock_type
class Review(BaseModel):
    text: str = Field(max_length=1000)
    score: int = Field(ge=1, le=10)
    confidence: float = Field(ge=0.0, le=1.0)

# Constraint violations are caught at validation time, not deep inside your pipeline!
```

**Benefits:**
- ✅ **Debuggable**: Strong typing catches errors at development time
- ✅ **Measurable**: Validate outputs against explicit schemas (see the example below)
- ✅ **Migratable**: Type contracts survive model upgrades (GPT-4 → GPT-6)
- ✅ **Testable**: Mock inputs/outputs with concrete types

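As a concrete illustration of "measurable": the raw dictionary an LLM hands back can be checked against the `Review` contract above with plain Pydantic v2 calls (the payloads here are made up):

```python
from pydantic import ValidationError

good = {"text": "some text", "score": 8, "confidence": 0.9}
review = Review.model_validate(good)  # ✅ passes the declared schema

try:
    Review.model_validate({"text": "oops", "score": 42, "confidence": 2.0})
except ValidationError as err:
    print(err)  # ❌ score and confidence violate the declared bounds
```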
---

### 2. Subscriptions (Not Edges)

**Graph frameworks:** Explicit edges define flow.

```python
# ❌ LangGraph style
graph.add_edge("review_agent", "high_quality_handler")
graph.add_edge("review_agent", "low_quality_handler")  # How to route?
```

**Flock 0.5:** Declarative subscriptions define reactions.

```python
# ✅ Flock 0.5 style
high_quality = orchestrator.agent("high_quality").consumes(
    Review,
    where=lambda r: r.score > 8  # Conditional routing!
)

low_quality = orchestrator.agent("low_quality").consumes(
    Review,
    where=lambda r: r.score <= 8
)

# Both subscribe to Review, predicate determines who fires
```

---

### 3. Visibility Controls (Not Open Access)

**Graph frameworks:** Any agent can see any data.

**Flock 0.5:** Producer-controlled access to artifacts.

```python
# Multi-tenancy (customer data isolation)
agent.publishes(
    CustomerData,
    visibility=TenantVisibility(tenant_id="customer_123")
)

# Private (allowlist)
agent.publishes(
    SensitiveData,
    visibility=PrivateVisibility(agents={"compliance_agent"})
)

# Time-delayed (embargo periods)
artifact.visibility = AfterVisibility(
    ttl=timedelta(hours=24),
    then=PublicVisibility()
)

# Label-based RBAC
artifact.visibility = LabelledVisibility(
    required_labels={"clearance:secret"}
)
```

**Why this matters:** Financial services, healthcare, SaaS platforms NEED this for compliance.

---

### 4. Opportunistic Execution (Not Sequential Workflows)

**Graph frameworks:** Define start node, execute path.

```python
# ❌ LangGraph style
result = graph.invoke({"input": "..."}, config={"start": "node_a"})
# Executes: node_a → node_b → node_c (even if b and c are independent!)
```

**Flock 0.5:** Publish data, all matching agents fire (in parallel if independent).

```python
# ✅ Flock 0.5 style
await orchestrator.publish(Review(text="Great product!", score=9))

# Three agents all consume Review, run concurrently:
# - sentiment_analyzer
# - rating_validator
# - summary_generator

await orchestrator.run_until_idle()  # Waits for all agents
```

---

## 🔥 What You Get With Flock 0.5

<p align="center">
<img alt="Flock Banner" src="docs/img/flock_ui_agent_view.png" width="1000">
</p>

### ✅ Production Safety Built-In

```python
# Prevent infinite feedback loops
agent = (
    orchestrator.agent("processor")
    .consumes(Document)
    .publishes(Document)  # Could trigger itself!
    .prevent_self_trigger(True)  # But won't! ✅
)

# Circuit breaker for runaway agents
orchestrator = Flock(max_agent_iterations=1000)  # Automatic failsafe

# Configuration validation
agent.best_of(150, ...)  # ⚠️ Warns: "best_of(150) is very high"
```

**Graph frameworks:** No built-in loop prevention. No circuit breakers. Silent failures.

---

### ✅ Real-Time Observability

```python
# One line to activate dashboard
await orchestrator.serve(dashboard=True)
```

**What you get:**
- 🎯 **Agent View**: Live graph of agents and message flows
- 📋 **Blackboard View**: Transformation edges showing data lineage
- 🎛️ **Control Panel**: Publish artifacts and invoke agents from UI
- 📊 **EventLog Module**: Searchable, sortable event history
- ⌨️ **Keyboard Shortcuts**: Full accessibility (Ctrl+/ for help)
- 🔍 **Auto-Filter**: Correlation ID tracking

**Graph frameworks:** Basic logging at best. No real-time visualization.

---

### ✅ Advanced Execution Strategies

```python
# Best-of-N execution (run agent 5x, pick best)
agent.best_of(5, score=lambda r: r.metrics["confidence"])

# Exclusive delivery (lease-based, exactly-once)
agent.consumes(Task, delivery="exclusive")

# Batch processing (accumulate 10 items before triggering)
agent.consumes(Event, batch=BatchSpec(size=10, timeout=timedelta(seconds=30)))

# Join operations (wait for multiple artifact types)
agent.consumes(Review, Rating, join=JoinSpec(within=timedelta(minutes=5)))
```

**Graph frameworks:** None of these patterns exist.

---

## ⚡ Quick Start

```bash
# Install
pip install flock-flow

# Set API key
export OPENAI_API_KEY="sk-..."
export DEFAULT_MODEL="openai/gpt-4o"
```

**Your First Blackboard System (60 seconds):**

```python
import asyncio
from pydantic import BaseModel, Field
from flock_flow.orchestrator import Flock
from flock_flow.registry import flock_type

# 1. Define typed artifacts
@flock_type
class Idea(BaseModel):
    topic: str
    genre: str

@flock_type
class Movie(BaseModel):
    title: str = Field(description="Title in CAPS")
    runtime: int = Field(ge=60, le=400)
    synopsis: str

@flock_type
class Tagline(BaseModel):
    line: str

# 2. Create orchestrator (the blackboard)
orchestrator = Flock("openai/gpt-4o")

# 3. Agents subscribe to types (NO workflow graph!)
movie = (
    orchestrator.agent("movie")
    .description("Generate a compelling movie concept.")
    .consumes(Idea)
    .publishes(Movie)
)

tagline = (
    orchestrator.agent("tagline")
    .description("Write a one-sentence marketing tagline.")
    .consumes(Movie)  # Auto-chains after movie!
    .publishes(Tagline)
)

# 4. Run with real-time dashboard
async def main():
    await orchestrator.serve(dashboard=True)

asyncio.run(main())
```

**Publish an artifact:**
```bash
curl -X POST http://localhost:8000/api/control/publish \
  -H "Content-Type: application/json" \
  -d '{"type_name": "Idea", "payload": {"topic": "AI cats", "genre": "comedy"}}'
```

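Prefer Python over curl? The same request can be sent with `httpx` (already one of Flock's dependencies); this is just an illustrative translation of the call above:

```python
import httpx

resp = httpx.post(
    "http://localhost:8000/api/control/publish",
    json={"type_name": "Idea", "payload": {"topic": "AI cats", "genre": "comedy"}},
)
print(resp.status_code, resp.text)
```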
**Watch it execute:**
1. `movie` agent consumes `Idea`, publishes `Movie`
2. `tagline` agent automatically reacts (subscribed to `Movie`)
3. Dashboard shows live execution with full lineage
4. No graph wiring. Just subscriptions.

---

## 🚀 Enterprise Use Cases

### Financial Services: Real-Time Risk Monitoring

```python
# 20+ agents monitoring different market signals
volatility = orchestrator.agent("volatility").consumes(
    MarketData,
    where=lambda m: m.volatility > 0.5
).publishes(VolatilityAlert)

sentiment = orchestrator.agent("sentiment").consumes(
    NewsArticle,
    text="market crash",
    min_p=0.9
).publishes(SentimentAlert)

# Execution agent waits for BOTH signals
execute = orchestrator.agent("execute").consumes(
    VolatilityAlert,
    SentimentAlert,
    join=JoinSpec(within=timedelta(minutes=5))
).publishes(TradeOrder)

# Complete audit trail for regulators ✅
# Multi-agent decision making ✅
# Real-time risk correlation ✅
```

---

### Healthcare: Multi-Modal Clinical Decision Support

```python
# Different specialists contribute to diagnosis
radiology.publishes(
    XRayAnalysis,
    visibility=PrivateVisibility(agents=["diagnosis_agent"])  # HIPAA!
)

lab.publishes(
    LabResults,
    visibility=TenantVisibility(tenant_id="patient_123")  # Multi-tenancy!
)

# Diagnostician waits for both inputs
diagnosis.consumes(XRayAnalysis, LabResults).publishes(
    Diagnosis,
    visibility=PrivateVisibility(agents=["physician", "pharmacist"])
)

# Built-in access controls ✅
# Full data lineage ✅
# Compliance-ready ✅
```

---

### E-Commerce: 50-Agent Personalization Engine

```python
# Parallel signal analysis (all run concurrently!)
for signal in ["browsing", "purchase", "reviews", "social", "email", ...]:
    orchestrator.agent(f"{signal}_analyzer").consumes(UserEvent).publishes(Signal)

# Recommendation engine consumes ALL signals (batched)
recommender = orchestrator.agent("recommender").consumes(
    Signal,
    batch=BatchSpec(size=50, timeout=timedelta(seconds=1))
).publishes(Recommendation)

# Add new signal? Just create agent, no graph rewiring ✅
# Scale to 100+ agents? Linear complexity ✅
```

---

## 🗺️ Roadmap

**✅ Phase 1: Core Framework (DONE - v0.5.0)**
- [x] Blackboard orchestrator with typed artifacts
- [x] Sequential + parallel execution
- [x] Visibility controls (5 types)
- [x] Real-time dashboard with WebSocket streaming
- [x] Safety features (circuit breaker, feedback prevention)
- [x] 743 tests, 77.65% coverage

**🚧 Phase 2: Roadmap to 1.0 (Q1 2026)**
- [ ] **YAML/JSON Serialization** - Export/import full orchestrators
- [ ] **LLM-Powered Routing** - AI agent selection based on context
- [ ] **Batch API** - Process DataFrames/CSV files
- [ ] **Advanced Predicates** - Complex subscription logic
- [ ] **CLI Tool** - Management console
- [ ] Persistent blackboard (Redis/Postgres)
- [ ] Event log replay (Kafka)
- [ ] Distributed orchestration (multi-region)
- [ ] OAuth/SSO for dashboard
- [ ] Audit trail export (compliance)

**📅 Phase 3: Post 1.0 ideas**
- [ ] Migration tool (auto-convert from LangGraph/CrewAI)
- [ ] Template marketplace
- [ ] VS Code extension

---

## 📚 What's Built-In

✅ **LLM Provider Support** - LiteLLM (OpenAI, Anthropic, Azure, Google, etc.)
✅ **DSPy Integration** - Prompt optimization and structured outputs
✅ **MCP Protocol** - Model Context Protocol servers
✅ **Tool System** - Function calling with any LLM
✅ **Pydantic Models** - Type validation with Field constraints
✅ **Rich Output** - Beautiful console themes
✅ **FastAPI Service** - Production-grade HTTP API
✅ **Streaming** - Real-time LLM output
✅ **Async-First** - True concurrent execution

---

## 🔬 Production Quality

| Metric | Graph Frameworks | Flock 0.5 |
|--------|------------------|-----------|
| Test Coverage | Varies | **77.65%** (743 tests) |
| Critical Path Coverage | Unknown | **86-100%** |
| E2E Tests | Few | 6 comprehensive scenarios |
| Safety Features | None/Manual | Circuit breaker, feedback prevention |
| Real-time Monitoring | None/Basic | WebSocket streaming dashboard |
| Security | Add-on | 5 built-in visibility types |
| Documentation | Good | Excellent (AGENTS.md + examples) |

---

## 🔍 Observability & Debugging

### Built-in OpenTelemetry Tracing with DuckDB

Flock includes **production-ready distributed tracing** powered by OpenTelemetry and DuckDB—enabling AI-assisted debugging and performance analysis.

**Why DuckDB?** It's a columnar analytical database **10-100x faster than SQLite** for trace analytics. No external services, no Docker—just a single embedded database file.

> **📊 Production Status**: 85% Production-Ready | [View Assessment](docs/TRACING_PRODUCTION_READINESS.md)
>
> ✅ Complete architecture • ✅ Zero-config storage • ✅ Comprehensive UI • ⚠️ Add auth before production

### Enable Tracing

```bash
# Enable auto-tracing for all agents
export FLOCK_AUTO_TRACE=true
export FLOCK_TRACE_FILE=true

# Run your application
python your_app.py

# Traces stored in: .flock/traces.duckdb
```

### Filtering: Control What Gets Traced

Use whitelist/blacklist filtering to reduce overhead and avoid tracing noisy operations like streaming tokens:

```bash
# Trace only core services (recommended for production)
export FLOCK_TRACE_SERVICES='["flock", "agent", "dspyengine", "outpututilitycomponent"]'

# Exclude specific noisy operations
export FLOCK_TRACE_IGNORE='["DashboardEventCollector.set_websocket_manager"]'
```

**How filtering works:**
- **Whitelist** (`FLOCK_TRACE_SERVICES`): Only trace specified classes (case-insensitive)
- **Blacklist** (`FLOCK_TRACE_IGNORE`): Never trace specific operations (exact match)
- Filtering happens **before** span creation for near-zero overhead

📖 **Full documentation:** [docs/AUTO_TRACING.md](docs/AUTO_TRACING.md)

### Real-Time Trace Viewer

<p align="center">
<img alt="Trace Viewer" src="docs/img/trace_viewer.png" width="1000">
</p>

The dashboard includes a **production-ready trace viewer** with **7 powerful view modes**:

- 📅 **Timeline**: Waterfall visualization showing execution flow and span hierarchies
- 📊 **Statistics**: Sortable table view with durations, span counts, and error tracking
- 🔴 **RED Metrics**: Rate, Errors, Duration monitoring for service health
- 🔗 **Dependencies**: Service-to-service communication with operation-level drill-down
- 🗄️ **DuckDB SQL**: Interactive SQL query editor with CSV export for custom analytics
- ⚙️ **Configuration**: Real-time service/operation filtering without restarts
- 📚 **Guide**: Built-in documentation and query examples

**Additional Features:**
- **Smart Sorting**: Sort traces by date, span count, or duration with visual indicators
- **CSV Export**: Download query results for offline analysis and reporting
- **Maximize Mode**: Full-screen view for deep data exploration
- **Multi-Trace Support**: Open and compare multiple traces simultaneously
- **Full I/O Capture**: Complete input/output data with collapsible JSON viewer

### AI-Powered Debugging

**AI agents (including Claude Code) can query your traces directly:**

```python
import duckdb

conn = duckdb.connect('.flock/traces.duckdb', read_only=True)

# Find slow operations
slow_ops = conn.execute("""
    SELECT name, AVG(duration_ms) as avg_duration
    FROM spans
    WHERE duration_ms > 1000
    GROUP BY name
    ORDER BY avg_duration DESC
""").fetchall()

# Find errors with their inputs
errors = conn.execute("""
    SELECT name, status_description,
           json_extract(attributes, '$.input.message') as input
    FROM spans
    WHERE status_code = 'ERROR'
""").fetchall()

# Performance analysis by service
perf = conn.execute("""
    SELECT service,
           COUNT(*) as calls,
           AVG(duration_ms) as avg_ms,
           MAX(duration_ms) as max_ms,
           PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration_ms) as p95_ms
    FROM spans
    GROUP BY service
""").fetchall()
```

**Example AI-assisted debugging:**
```
You: "My pizza agent is slow, help me find why"
AI: [queries DuckDB] "The DSPyEngine.evaluate span takes 23s on average.
     Checking input attributes... You're passing 50KB of conversation history.
     Recommendation: Limit context window to last 5 messages."
```

### What Gets Traced

**Every operation is automatically traced with:**

✅ Full input arguments (with JSON serialization)
✅ Complete output values
✅ Duration and timestamps
✅ Parent-child span relationships
✅ Service and operation names
✅ Error messages and stack traces
✅ Agent metadata (name, description)
✅ Correlation IDs for request tracking

**No manual instrumentation required—just enable `FLOCK_AUTO_TRACE=true`.**

### Performance Analytics

```sql
-- Find bottlenecks
SELECT name, service, duration_ms
FROM spans
WHERE duration_ms > 5000
ORDER BY start_time DESC;

-- Track P95 latency by operation
SELECT operation,
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration_ms) as p95
FROM spans
WHERE service = 'DSPyEngine'
GROUP BY operation;

-- Error rate by service
SELECT service,
       COUNT(*) as total,
       SUM(CASE WHEN status_code = 'ERROR' THEN 1 ELSE 0 END) as errors,
       (errors * 100.0 / total) as error_rate
FROM spans
GROUP BY service;
```

### Production Monitoring

Deploy Flock with **OTEL exporters** to send traces to your observability platform:

```bash
# Send to Grafana Tempo/Loki
export OTEL_EXPORTER_OTLP_ENDPOINT="http://tempo:4317"
export FLOCK_AUTO_TRACE=true

# Or use local DuckDB + periodic exports
export FLOCK_TRACE_FILE=true
```

826
- ---
827
-
828
- ## 🤝 Contributing
829
-
830
- We're building Flock 0.5 in the open! See [`AGENTS.md`](AGENTS.md) for development setup and debugging guide.
831
-
832
- ```bash
833
- git clone https://github.com/whiteducksoftware/flock-flow.git
834
- cd flock-flow
835
- uv sync
836
- uv run pytest # 743 tests pass!
837
- ```
838
-
839
- ---
840
-
841
- ## 💬 Community & Support
842
-
843
- - **GitHub Issues:** [Report bugs or request features](https://github.com/whiteducksoftware/flock-flow/issues)
844
- - **Discussions:** [Ask questions or share ideas](https://github.com/whiteducksoftware/flock-flow/discussions)
845
- - **Documentation:** [Full docs and examples](https://whiteducksoftware.github.io/flock/)
846
- - **Email:** [support@whiteduck.de](mailto:support@whiteduck.de)
847
-
848
- ---
849
-
850
- ## 🌟 Why "0.5"?
851
-
852
- We're calling this 0.5 to signal:
853
-
854
- 1. **It's production-ready** - 743 tests, enterprise features, dashboard
855
- 2. **It's still evolving** - Some advanced features coming in Q1/Q2 2026
856
- 3. **It's the future** - Blackboard architecture scales better than graphs
857
-
858
- **1.0 will arrive** when we've added advanced routing, serialization, and enterprise persistence.
859
-
860
- ---
861
-
862
- ## 🔖 The Bottom Line
863
-
864
- **Graph-based frameworks** treat agents like nodes in a workflow. Rigid. Sequential. Hard to scale.
865
-
866
- **Flock 0.5** combines **declarative AI workflows** with **blackboard architecture**:
867
- - ✅ No brittle prompts (type-safe contracts)
868
- - ✅ No rigid graphs (opportunistic execution)
869
- - ✅ No testing nightmares (unit-testable agents)
870
- - ✅ No security gaps (5 visibility types)
871
- - ✅ No production fears (743 tests, real-time monitoring)
872
-
873
- **The future of AI agents isn't workflows—it's declarative blackboards.**
874
-
875
- **Try it. You'll never go back to graphs.**
876
-
877
- ---
878
-
879
- <div align="center">
880
-
881
- **Built with ❤️ by white duck GmbH**
882
-
883
- **"Agents are just microservices. Let's treat them that way."**
884
-
885
- [⭐ Star us on GitHub](https://github.com/whiteducksoftware/flock-flow) | [📖 Read the Docs](https://whiteducksoftware.github.io/flock/) | [🚀 Try Examples](examples/)
886
-
887
- </div>
888
-
889
- ---
890
-
891
- ## 📊 Framework Comparison
892
-
893
- | | LangGraph | CrewAI | AutoGen | Flock 0.5 |
894
- |-|-----------|---------|---------|-----------|
895
- | **Pattern** | Directed Graph | Sequential Tasks | Chat-Based | Blackboard |
896
- | **Coordination** | Explicit edges | Task context | Messages | Subscriptions |
897
- | **Parallelism** | Manual (split/join) | None | None | Automatic |
898
- | **Type Safety** | TypedDict | None | None | Pydantic |
899
- | **Security** | None | None | None | 5 visibility types |
900
- | **Conditional** | Route functions | Manual | Manual | `where=lambda` |
901
- | **Testing** | Full graph | Full crew | Full group | Isolated agents |
902
- | **Real-time UI** | None | None | None | WebSocket streaming |
903
- | **Feedback Prevention** | Manual | Manual | Manual | Automatic |
904
- | **Add Agent** | Rewrite graph | Rewrite tasks | Rewrite group | Just subscribe |
905
- | **Learning Curve** | Medium | Easy | Easy | Medium |
906
- | **Scalability** | 10-20 agents | 5-10 agents | 5-10 agents | 100+ agents |
907
-
908
- ---
909
-
910
- **Last Updated:** October 6, 2025
911
- **Version:** Flock 0.5.0 (Blackboard Edition) / flock-flow 0.1.20
912
- **Status:** Production-Ready, Active Development
913
-
914
- ---
915
-
916
- **"The blackboard pattern has been battle-tested for 50 years. Declarative contracts eliminate prompt hell. Together, they're the future of AI agents."**