penguiflow 1.0.3__py3-none-any.whl → 2.1.0__py3-none-any.whl

@@ -1,425 +0,0 @@
- Metadata-Version: 2.4
- Name: penguiflow
- Version: 1.0.3
- Summary: Async agent orchestration primitives.
- Author: PenguiFlow Team
- License: MIT License
-
- Copyright (c) 2025 hurtener
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
-
- Project-URL: Homepage, https://github.com/penguiflow/penguiflow
- Requires-Python: >=3.12
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: pydantic>=2.6
- Provides-Extra: dev
- Requires-Dist: mypy>=1.8; extra == "dev"
- Requires-Dist: pytest>=7.4; extra == "dev"
- Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
- Requires-Dist: ruff>=0.2; extra == "dev"
- Dynamic: license-file
-
- # PenguiFlow 🐧❄️
-
- <p align="center">
- <img src="asset/Penguiflow.png" alt="PenguiFlow logo" width="220">
- </p>
-
- **Async-first orchestration library for multi-agent and data pipelines**
-
- PenguiFlow is a **lightweight Python library** to orchestrate agent flows.
- It provides:
-
- * **Typed, async message passing** (Pydantic v2)
- * **Concurrent fan-out / fan-in patterns**
- * **Routing & decision points**
- * **Retries, timeouts, backpressure**
- * **Dynamic loops** (controller nodes)
- * **Runtime playbooks** (callable subflows with shared metadata)
-
- Built on pure `asyncio` (no threads), PenguiFlow is small, predictable, and repo-agnostic.
- Product repos only define **their models + node functions** — the core stays dependency-light.
-
- ---
-
- ## ✨ Why PenguiFlow?
-
- * **Orchestration is everywhere.** Every Pengui service needs to connect LLMs, retrievers, SQL, or external APIs.
- * **Stop rewriting glue.** This library gives you reusable primitives (nodes, flows, contexts) so you can focus on business logic.
- * **Typed & safe.** Every hop validated with Pydantic.
- * **Lightweight.** Only depends on asyncio + pydantic. No broker, no server, no threads.
-
- ---
-
- ## 🏗️ Core Concepts
-
- ### Message
-
- Every payload is wrapped in a `Message` with headers and metadata.
-
- ```python
- from pydantic import BaseModel
- from penguiflow.types import Message, Headers
-
- class QueryIn(BaseModel):
-     text: str
-
- msg = Message(
-     payload=QueryIn(text="unique reach last 30 days"),
-     headers=Headers(tenant="acme")
- )
- ```
-
- ### Node
-
- A node is an async function wrapped with a `Node`.
- It validates inputs/outputs (via `ModelRegistry`) and applies `NodePolicy` (timeout, retries, etc.).
-
- ```python
- from penguiflow.node import Node
-
- class QueryOut(BaseModel):
-     topic: str
-
- async def triage(msg: QueryIn, ctx) -> QueryOut:
-     return QueryOut(topic="metrics")
-
- triage_node = Node(triage, name="triage")
- ```
-
- ### Flow
-
- A flow wires nodes together in a directed graph.
- Edges are called **Floe**s, and flows have two invisible contexts:
-
- * **OpenSea** 🌊 — ingress (start of the flow)
- * **Rookery** 🐧 — egress (end of the flow)
-
- ```python
- from penguiflow.core import create
-
- flow = create(
-     triage_node.to(packer_node)
- )
- ```
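To make the Floe / OpenSea / Rookery vocabulary above concrete, here is a minimal sketch built only on `asyncio.Queue`. It is not part of the package contents shown in this diff and is not PenguiFlow's implementation; `worker`, `run_pipeline`, and the two stage functions are hypothetical names chosen for illustration. Each bounded queue plays the role of an edge, the first queue stands in for the ingress, and the last for the egress.

```python
import asyncio

_STOP = object()  # sentinel that tells each worker to shut down


async def worker(fn, inbox, outbox):
    """Pull from the incoming edge, apply fn, push to the outgoing edge."""
    while True:
        item = await inbox.get()
        if item is _STOP:
            await outbox.put(_STOP)  # propagate shutdown downstream
            return
        await outbox.put(await fn(item))


async def run_pipeline(stages, items, maxsize=4):
    """Chain stages with bounded queues (hypothetical helper).

    queues[0] stands in for the ingress, queues[-1] for the egress.
    A full queue blocks the producer, which is the backpressure effect.
    """
    queues = [asyncio.Queue(maxsize=maxsize) for _ in range(len(stages) + 1)]
    tasks = [
        asyncio.create_task(worker(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]

    async def feed():
        for item in items:
            await queues[0].put(item)
        await queues[0].put(_STOP)

    feeder = asyncio.create_task(feed())
    results = []
    while True:
        out = await queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    await asyncio.gather(feeder, *tasks)
    return results


async def triage(text):
    return {"text": text, "topic": "metrics" if "metric" in text else "general"}


async def pack(msg):
    return f"[{msg['topic']}] {msg['text']}"
```

Calling `asyncio.run(run_pipeline([triage, pack], ["show metrics", "hello"]))` pushes both inputs through the two stages in order. The real library adds typed validation and per-node policies on top of this basic shape.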
-
- ### Running a Flow
-
- ```python
- from penguiflow.registry import ModelRegistry
-
- registry = ModelRegistry()
- registry.register("triage", QueryIn, QueryOut)
- registry.register("packer", QueryOut, PackOut)
-
- flow.run(registry=registry)
-
- await flow.emit(msg)  # emit into OpenSea
- out = await flow.fetch()  # fetch from Rookery
- print(out.payload)  # PackOut(...)
- await flow.stop()
- ```
-
- ---
-
- ## 🧭 Design Principles
-
- 1. **Async-only (`asyncio`).**
-
-    * Flows are orchestrators, mostly I/O-bound.
-    * Async tasks are cheap, predictable, and cancellable.
-    * Heavy CPU work should be offloaded inside a node (process pool, Ray, etc.), not in PenguiFlow itself.
-    * v1 intentionally stays in-process; scaling out or persisting state will arrive with future pluggable backends.
-
- 2. **Typed contracts.**
-
-    * In/out models per node are defined with Pydantic.
-    * Validated at runtime via cached `TypeAdapter`s.
-    * `flow.run(registry=...)` verifies every validating node is registered so misconfigurations fail fast.
-
- 3. **Reliability first.**
-
-    * Timeouts, retries with backoff, backpressure on queues.
-    * Nodes run inside error boundaries.
-
- 4. **Minimal dependencies.**
-
-    * Only asyncio + pydantic.
-    * No broker, no server. Everything in-process.
-
- 5. **Repo-agnostic.**
-
-    * Product repos declare their models + node funcs, register them, and run.
-    * No product-specific code in the library.
-
- ---
-
- ## 📦 Installation
-
- ```bash
- pip install -e ./penguiflow
- ```
-
- Requires **Python 3.12+**.
-
- ## 🧭 Repo Structure
-
- penguiflow/
- __init__.py
- core.py # runtime orchestrator, retries, controller helpers, playbooks
- node.py
- types.py
- registry.py
- patterns.py
- middlewares.py
- viz.py
- README.md
- pyproject.toml # build metadata
- tests/ # pytest suite
- examples/ # runnable flows (fan-out, routing, controller, playbooks)
-
- ---
-
- ## 🚀 Quickstart Example
-
- ```python
- from pydantic import BaseModel
- from penguiflow import Headers, Message, ModelRegistry, Node, NodePolicy, create
-
-
- class TriageIn(BaseModel):
-     text: str
-
-
- class TriageOut(BaseModel):
-     text: str
-     topic: str
-
-
- class RetrieveOut(BaseModel):
-     topic: str
-     docs: list[str]
-
-
- class PackOut(BaseModel):
-     prompt: str
-
-
- async def triage(msg: TriageIn, ctx) -> TriageOut:
-     topic = "metrics" if "metric" in msg.text else "general"
-     return TriageOut(text=msg.text, topic=topic)
-
-
- async def retrieve(msg: TriageOut, ctx) -> RetrieveOut:
-     docs = [f"doc_{i}_{msg.topic}" for i in range(2)]
-     return RetrieveOut(topic=msg.topic, docs=docs)
-
-
- async def pack(msg: RetrieveOut, ctx) -> PackOut:
-     prompt = f"[{msg.topic}] summarize {len(msg.docs)} docs"
-     return PackOut(prompt=prompt)
-
-
- triage_node = Node(triage, name="triage", policy=NodePolicy(validate="both"))
- retrieve_node = Node(retrieve, name="retrieve", policy=NodePolicy(validate="both"))
- pack_node = Node(pack, name="pack", policy=NodePolicy(validate="both"))
-
- registry = ModelRegistry()
- registry.register("triage", TriageIn, TriageOut)
- registry.register("retrieve", TriageOut, RetrieveOut)
- registry.register("pack", RetrieveOut, PackOut)
-
- flow = create(
-     triage_node.to(retrieve_node),
-     retrieve_node.to(pack_node),
- )
- flow.run(registry=registry)
-
- message = Message(
-     payload=TriageIn(text="show marketing metrics"),
-     headers=Headers(tenant="acme"),
- )
-
- await flow.emit(message)
- out = await flow.fetch()
- print(out.prompt)  # PackOut(prompt='[metrics] summarize 2 docs')
-
- await flow.stop()
- ```
-
- ### Patterns Toolkit
-
- PenguiFlow ships a handful of **composable patterns** to keep orchestration code tidy
- without forcing you into a one-size-fits-all DSL. Each helper is opt-in and can be
- stitched directly into a flow adjacency list:
-
- - `map_concurrent(items, worker, max_concurrency=8)` — fan a single message out into
-   many in-memory tasks (e.g., batch document enrichment) while respecting a semaphore.
- - `predicate_router(name, mapping)` — route messages to successor nodes based on simple
-   boolean functions over payload or headers. Perfect for guardrails or conditional
-   tool invocation without building a full controller.
- - `union_router(name, discriminated_model)` — accept a Pydantic discriminated union and
-   forward each variant to the matching typed successor node. Keeps type-safety even when
-   multiple schema branches exist.
- - `join_k(name, k)` — aggregate `k` messages per `trace_id` before resuming downstream
-   work. Useful for fan-out/fan-in batching, map-reduce style summarization, or consensus.
-
- All helpers are regular `Node` instances under the hood, so they inherit retries,
- timeouts, and validation just like hand-written nodes.
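The semaphore-bounded fan-out described for `map_concurrent` can be illustrated with plain `asyncio`. The sketch below is not part of the package contents in this diff and is not the library's code; `bounded_map`, `demo`, and `enrich` are hypothetical names, and the real helper's return shape and error handling may differ.

```python
import asyncio


async def bounded_map(items, worker, max_concurrency=8):
    """Run worker over items with at most max_concurrency calls in flight.

    Hypothetical stand-in for the idea behind map_concurrent; results come
    back in input order because asyncio.gather preserves argument order.
    """
    sem = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with sem:  # wait here while max_concurrency calls are active
            return await worker(item)

    return await asyncio.gather(*(run_one(item) for item in items))


async def demo():
    """Enrich five documents while tracking how many run at once."""
    active = 0
    peak = 0

    async def enrich(doc):
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # simulated I/O
        active -= 1
        return doc.upper()

    results = await bounded_map(["a", "b", "c", "d", "e"], enrich, max_concurrency=2)
    return results, peak
```

Running `asyncio.run(demo())` returns the enriched items together with the observed peak concurrency, which never exceeds the semaphore limit.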
-
- ### Dynamic Controller Loops
-
- Long-running agents often need to **think, plan, and act over multiple hops**. PenguiFlow
- models this with a controller node that loops on itself:
-
- 1. Define a controller `Node` with `allow_cycle=True` and wire `controller.to(controller)`.
- 2. Emit a `Message` whose payload is a `WM` (working memory). PenguiFlow increments the
-    `hops` counter automatically and enforces `budget_hops` + `deadline_s` so controllers
-    cannot loop forever.
- 3. The controller can attach intermediate `Thought` artifacts or emit `PlanStep`s for
-    transparency/debugging. When it is ready to finish, it returns a `FinalAnswer` which
-    is immediately forwarded to Rookery.
-
- Deadlines and hop budgets turn into automated `FinalAnswer` error messages, making it
- easy to surface guardrails to downstream consumers.
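The loop-with-guardrails behaviour described above can be sketched without the library. The code below is not part of the package contents in this diff: `WorkingMemory`, `Final`, `run_controller`, and `gather_facts` are hypothetical stand-ins for `WM`, `FinalAnswer`, and a controller node, and `deadline_s` is interpreted here as an absolute `time.monotonic()` timestamp, which is an assumption rather than documented behaviour.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkingMemory:  # hypothetical stand-in for the README's WM
    query: str
    hops: int = 0
    budget_hops: int = 3
    deadline_s: Optional[float] = None  # assumed: absolute monotonic deadline
    facts: list = field(default_factory=list)


@dataclass
class Final:  # hypothetical stand-in for FinalAnswer
    text: str


async def run_controller(step, wm):
    """Call step repeatedly until it returns Final or a guardrail trips."""
    while True:
        if wm.deadline_s is not None and time.monotonic() > wm.deadline_s:
            return Final("Deadline exceeded")
        if wm.hops >= wm.budget_hops:
            return Final("Hop budget exhausted")
        result = await step(wm)
        if isinstance(result, Final):
            return result
        wm = result
        wm.hops += 1  # the runtime, not the controller, counts hops


async def gather_facts(wm):
    """Toy controller: collect one fact per hop, finish after two."""
    wm.facts.append(f"fact-{wm.hops}")
    if len(wm.facts) >= 2:
        return Final(", ".join(wm.facts))
    return wm
```

With the default budget, `asyncio.run(run_controller(gather_facts, WorkingMemory("q")))` finishes normally on the second hop; with `budget_hops=1` the same controller is cut off and a guardrail `Final` is returned instead.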
-
- ---
-
- ### Playbooks & Subflows
-
- Sometimes a controller or router needs to execute a **mini flow** — for example,
- retrieval → rerank → compress — without polluting the global topology. `call_playbook`
- spawns a brand-new `PenguiFlow` on demand and wires it into the parent message context:
-
- - Trace IDs and headers are reused so observability stays intact.
- - The helper respects optional timeouts and always stops the subflow (even on cancel).
- - The first payload emitted to the playbook's Rookery is returned to the caller,
-   allowing you to treat subflows as normal async functions.
-
- ```python
- from penguiflow import call_playbook
- from penguiflow.types import Message
-
- async def controller(msg: Message, ctx) -> Message:
-     playbook_result = await call_playbook(build_retrieval_playbook, msg)
-     return msg.model_copy(update={"payload": playbook_result})
- ```
-
- Playbooks are ideal for deploying frequently reused toolchains while keeping the main
- flow focused on high-level orchestration logic.
-
- ---
-
- ### Visualization
-
- Need a quick view of the flow topology? Call `flow_to_mermaid(flow)` to render the graph
- as a Mermaid diagram ready for Markdown or docs tools:
-
- ```python
- from penguiflow import flow_to_mermaid
-
- print(flow_to_mermaid(flow, direction="LR"))
- ```
-
- ---
-
- ## 🛡️ Reliability & Observability
-
- * **NodePolicy**: set validation scope plus per-node timeout, retries, and backoff curves.
- * **Structured logs**: enrich every node event with `{ts, trace_id, node_name, event, latency_ms, q_depth_in, attempt}`.
- * **Middleware hooks**: subscribe observers (e.g., MLflow) to the structured event stream.
- * See `examples/reliability_middleware/` for a concrete timeout + retry walkthrough.
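As an illustration of what a per-node timeout with retries and exponential backoff amounts to, here is a minimal `asyncio` sketch. It is not part of the package contents in this diff and does not reproduce `NodePolicy`; `call_with_policy`, its parameter names, and the `flaky` helper are hypothetical.

```python
import asyncio


async def call_with_policy(fn, *args, timeout_s=1.0, max_retries=2,
                           backoff_base=0.05, backoff_mult=2.0):
    """Await fn(*args) under a timeout, retrying failures with backoff.

    Hypothetical helper: the delay before retry k is
    backoff_base * backoff_mult ** k. Cancellation is not swallowed,
    because CancelledError does not derive from Exception.
    """
    attempt = 0
    while True:
        try:
            return await asyncio.wait_for(fn(*args), timeout=timeout_s)
        except Exception:  # includes the timeout raised by wait_for
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            await asyncio.sleep(backoff_base * backoff_mult ** attempt)
            attempt += 1


calls = {"n": 0}


async def flaky(x):
    """Fail on the first two calls, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return x * 2
```

With the default `max_retries=2`, `asyncio.run(call_with_policy(flaky, 21, backoff_base=0.001))` succeeds on the third attempt. A production policy would typically also log each attempt and restrict which exception types are retried.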
-
- ---
-
- ## ⚠️ Current Constraints
-
- - **In-process runtime**: there is no built-in distribution layer yet. Long-running CPU work should be delegated to your own pools or services.
- - **Registry-driven typing**: nodes default to validation. Provide a `ModelRegistry` when calling `flow.run(...)` or set `validate="none"` explicitly for untyped hops.
- - **Observability**: structured logs + middleware hooks are available, but integrations with third-party stacks (OTel, Prometheus) are DIY for now.
- - **Roadmap**: v2 targets streaming, distributed backends, richer observability, and test harnesses. Contributions and proposals are welcome!
-
- ---
-
- ## 📊 Benchmarks
-
- Lightweight benchmarks live under `benchmarks/`. Run them via `uv run python benchmarks/<name>.py`
- to capture baselines for fan-out throughput, retry/timeout overhead, and controller
- playbook latency. Copy them into product repos to watch for regressions over time.
-
- ---
-
- ## 🔮 Roadmap
-
- * **v1 (current)**: safe core runtime, type-safety, retries, timeouts, routing, controller loops, playbooks via examples.
- * **v2 (future)**: streaming support, per-trace cancel, deadlines/budgets, observability hooks, visualizer, testing harness.
-
- ---
-
- ## 🧪 Testing
-
- ```bash
- pytest -q
- ```
-
- * Unit tests cover core runtime, type safety, routing, retries.
- * Example flows under `examples/` are runnable end-to-end.
-
- ---
-
- ## 🐧 Naming Glossary
-
- * **Node**: an async function + metadata wrapper.
- * **Floe**: an edge (queue) between nodes.
- * **Context**: context passed into each node to fetch/emit.
- * **OpenSea** 🌊: ingress context.
- * **Rookery** 🐧: egress context.
-
- ---
-
- ## 📖 Examples
-
- * `examples/quickstart/`: hello world pipeline.
- * `examples/routing_predicate/`: branching with predicates.
- * `examples/routing_union/`: discriminated unions with typed branches.
- * `examples/fanout_join/`: split work and join with `join_k`.
- * `examples/map_concurrent/`: bounded fan-out work inside a node.
- * `examples/controller_multihop/`: dynamic multi-hop agent loop.
- * `examples/reliability_middleware/`: retries, timeouts, and middleware hooks.
- * `examples/playbook_retrieval/`: retrieval → rerank → compress playbook.
-
- ---
-
- ## 🤝 Contributing
-
- * Keep the library **lightweight and generic**.
- * Product-specific playbooks go into `examples/`, not core.
- * Every new primitive requires:
-
-   * Unit tests in `tests/`
-   * Runnable example in `examples/`
-   * Docs update in README
-
- ---
-
- ## License
-
- MIT
@@ -1,13 +0,0 @@
- penguiflow/__init__.py,sha256=fiQsp6-xYG2UjvuIhu71zvEiTeAjdfEjtoLRwZ8wROs,930
- penguiflow/core.py,sha256=fO5GXF7Hih-gEcUPbyXVJilgUmwbvd72j337r5oOWME,20908
- penguiflow/middlewares.py,sha256=LUlK4FrMScK3oaNSrAYNw3s4KcAZ716DTLAUqvsOkL8,319
- penguiflow/node.py,sha256=0NOs3rU6t1tHNNwwJopqzM2ufGcp82JpzhckynWBRqs,3563
- penguiflow/patterns.py,sha256=Ivuuy0on0OMsdYd5DRFZm1EgujXKPEaIIMH0ZWlJ1s0,4199
- penguiflow/registry.py,sha256=4lrGDMFjM7c8pfZFc_YG0YHg-F80JyF4c-j0UbAf150,1419
- penguiflow/types.py,sha256=QV2JvB_QnohfBATSaviPWm0HSR9B6dTc3UOwFIYyaqg,1154
- penguiflow/viz.py,sha256=B9T2O5A6nHBLn7JuEeujqDC6ZcwP5s6M2rpsUrj5Ul0,2091
- penguiflow-1.0.3.dist-info/licenses/LICENSE,sha256=JSvodvLXxSct_kI9IBsZOBpVKoESQTB_AGbkClwZ7HI,1065
- penguiflow-1.0.3.dist-info/METADATA,sha256=GISwzSiycpzXy0kmoP3mWeMAeutgNITeTKVGz2YRb1A,13658
- penguiflow-1.0.3.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- penguiflow-1.0.3.dist-info/top_level.txt,sha256=F-5jgzPP4Mo_ErgtzGDFJdRT4CIfFjFBnxxcn-RpWBU,11
- penguiflow-1.0.3.dist-info/RECORD,,
@@ -1 +0,0 @@
- penguiflow