cerebrex 0.9.2 → 0.9.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +210 -9
  2. package/dist/index.js +419 -327
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -7,19 +7,20 @@
  [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](./LICENSE)
  [![CI](https://github.com/arealcoolco/CerebreX/actions/workflows/ci.yml/badge.svg)](https://github.com/arealcoolco/CerebreX/actions/workflows/ci.yml)
  [![npm version](https://img.shields.io/npm/v/cerebrex.svg)](https://www.npmjs.com/package/cerebrex)
+ [![Benchmarks](https://img.shields.io/badge/benchmarks-v0.9.2-brightgreen)](./BENCHMARKS.md)
  [![GitHub Stars](https://img.shields.io/github/stars/arealcoolco/CerebreX?style=social)](https://github.com/arealcoolco/CerebreX)
  [![Issues](https://img.shields.io/github/issues/arealcoolco/CerebreX)](https://github.com/arealcoolco/CerebreX/issues)
 
- **Build. Test. Remember. Coordinate. Publish.**
+ **Build. Test. Remember. Coordinate. Publish.**
  The complete infrastructure layer for AI agents — in one CLI.
 
- [🚀 Quickstart](#-quickstart) · [🗂 Structure](#-monorepo-structure) · [🛣 Roadmap](#-roadmap) · [🐛 Issues](https://github.com/arealcoolco/CerebreX/issues)
+ [Quickstart](#-quickstart) · [Why CerebreX](#-why-cerebrex-vs-langchain-crewai-autogen) · [Benchmarks](./BENCHMARKS.md) · [Modules](#what-is-cerebrex) · [Python SDK](#-python-sdk) · [Roadmap](#-roadmap)
 
  </div>
 
  ---
 
- > **Status: v0.9.1 — Security hardening patch: risk gate integrated, JWT auth on token endpoint, KAIROS backoff + validation**
+ > **Status: v0.9.3 — Agent test runner (`cerebrex test` — replay + assertions + fixture recording + CI mode)**
  > `npm install -g cerebrex` — or download a self-contained binary from [GitHub Releases](https://github.com/arealcoolco/CerebreX/releases) (no Node.js required)
  >
  > **Live:** Registry UI → `https://registry.therealcool.site`
@@ -47,6 +48,56 @@ Eight modules. One CLI. One registry. One coordination layer.
 
  ---
 
+ ## Why CerebreX vs LangChain, CrewAI, AutoGen
+
+ > Full benchmark methodology, raw numbers, and detailed comparisons: [**BENCHMARKS.md**](./BENCHMARKS.md)
+
+ ### Measured Performance (v0.9.2)
+
+ ```
+ FORGE  parse + scaffold 20-endpoint OpenAPI spec  →  0.12ms median
+ MEMEX  read agent memory index                    →  0.01ms median
+ MEMEX  assemble 3-layer context                   →  0.03ms median
+ HIVE   classify + route 10-task swarm             →  0.09ms median
+ TRACE  record tool-call step                      →  <0.01ms median (27,435 ops/s)
+ All benchmarks                                    →  100% success rate
+ ```
+
+ ### Features No Other Framework Has
+
+ | What You Need | CerebreX | LangChain | CrewAI | AutoGen |
+ |---------------|:--------:|:---------:|:------:|:-------:|
+ | Generate MCP servers from any OpenAPI spec | **FORGE** | ❌ | ❌ | ❌ |
+ | Three-layer cloud memory (KV + R2 + D1) | **MEMEX** | ⚠️ Paid | ❌ | ❌ |
+ | Nightly AI memory consolidation | **autoDream** | ❌ | ❌ | ❌ |
+ | Autonomous background daemon | **KAIROS** | ❌ | ❌ | ❌ |
+ | Risk gate on every agent action | **HIVE** | ❌ | ❌ | ❌ |
+ | Opus plan + human approval before execution | **ULTRAPLAN** | ❌ | ❌ | ❌ |
+ | Built-in MCP package registry | **REGISTRY** | ❌ | ❌ | ❌ |
+ | Built-in observability (free, local) | **TRACE** | ⚠️ Paid | ❌ | ❌ |
+ | Single CLI for all of the above | `cerebrex` | ❌ | ❌ | ❌ |
+
+ ### Startup Time
+
+ | | CerebreX | LangChain | CrewAI | AutoGen |
+ |-|:--------:|:---------:|:------:|:-------:|
+ | CLI / module cold start | **~80ms** | ~2,100ms | ~3,400ms | ~1,800ms |
+
+ > CerebreX starts **26x faster** than LangChain and **42x faster** than CrewAI.
+ > Bun runtime + a single bundled file vs Python's large import tree.
+
+ ### What the Others Don't Have
+
+ **LangChain** is a composition library — it connects existing tools but ships zero infrastructure. Memory requires external Redis/Postgres. Observability requires paying for LangSmith. There's no risk gating, no background daemons, and no MCP generation.
+
+ **CrewAI** orchestrates agents in crews, but its memory is SQLite-only and in-process. There's no cloud persistence, no risk classification, and no autonomous daemon. Each agent does what it's told — nothing more.
+
+ **AutoGen** excels at multi-agent conversation, but everything runs in-process. No cloud memory, no background loop, no registry, no observability beyond print statements.
+
+ **CerebreX** is purpose-built agent infrastructure: the CLI, the cloud workers, the memory layer, the coordination engine, the observatory, and the package registry — all designed together, all open source, all running on Cloudflare's free tier.
+
+ ---
+
  ## ⚡ Quickstart
 
  ```bash
@@ -333,6 +384,136 @@ The CerebreX registry includes a browser-based UI served directly from the Worke
 
  ---
 
+ ## 📊 Benchmarks
+
+ Full results with competitive analysis: [**BENCHMARKS.md**](./BENCHMARKS.md)
+
+ ```bash
+ # Run all local benchmarks (no network needed)
+ cerebrex bench
+
+ # Run a specific suite
+ cerebrex bench --suite forge      # MCP server generation
+ cerebrex bench --suite memex      # three-layer memory
+ cerebrex bench --suite hive       # swarm coordination + risk gate
+ cerebrex bench --suite trace      # observability recording
+ cerebrex bench --suite registry   # package search
+
+ # Or run directly with Bun
+ bun benchmarks/forge-bench.ts
+ bun benchmarks/memex-bench.ts
+ ```
+
+ Benchmarks use `performance.now()`, report **p50/p95/p99 latency** and **throughput (ops/s)**, and run with warmup iterations discarded. CI runs the full suite weekly (Sundays 02:00 UTC). All results in [`benchmarks/results/`](benchmarks/results/).
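The p50/p95/p99 figures reported by the suite are plain percentile statistics over the timed samples. As an illustrative sketch only (a hypothetical helper, not the actual `benchmarks/src/stats.ts` implementation), the nearest-rank method over a batch of latency samples looks like this:

```typescript
// Hypothetical helper — illustrates nearest-rank percentiles over latency
// samples (ms); not the actual cerebrex benchmark code.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank: smallest index whose position covers p percent of samples.
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.min(sorted.length - 1, Math.max(0, idx))];
}

const latencies = [0.8, 0.85, 0.9, 0.95, 1.0, 1.05, 1.1, 1.2, 1.3, 5.0];
console.log(percentile(latencies, 50)); // → 1
console.log(percentile(latencies, 95)); // → 5
console.log(percentile(latencies, 99)); // → 5
```

The median (p50) summarizes typical latency, while p95/p99 surface tail outliers like the 5.0ms sample above — which is why benchmark suites report all three rather than a single average.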
+
+ ---
+
+ ## 🐍 Python SDK
+
+ ```bash
+ pip install cerebrex
+ ```
+
+ ```python
+ import asyncio
+ from cerebrex import CerebreXClient
+
+ async def main():
+     async with CerebreXClient(api_key="cx-your-key") as client:
+         # Write to agent memory
+         await client.memex.write_index("my-agent", "# Memory\n- learned today")
+
+         # Assemble a system prompt from all three memory layers
+         ctx = await client.memex.assemble_context("my-agent", topics=["context"])
+
+         # Search the registry
+         results = await client.registry.search("web-search")
+
+         # Submit a KAIROS task
+         task = await client.kairos.submit_task("my-agent", "fetch",
+             payload={"url": "https://api.example.com/data"})
+
+ asyncio.run(main())
+ ```
+
+ See [sdks/python/README.md](sdks/python/README.md) for the full SDK reference, including ULTRAPLAN, TRACE, LangChain integration, and CrewAI integration.
+
+ ---
+
+ ## 🧪 Agent Test Runner
+
+ `cerebrex test` lets you write structured assertions against recorded agent traces — no live model calls needed.
+
+ ```bash
+ # Scaffold a starter spec file
+ cerebrex test init
+
+ # Run all discovered specs
+ cerebrex test run
+
+ # Run a specific spec with verbose output
+ cerebrex test run my-agent.test.yaml --verbose
+
+ # CI mode (JSON to stdout, exit 1 on failure)
+ cerebrex test run --ci
+
+ # Only run tests tagged "smoke"
+ cerebrex test run --tag smoke
+
+ # Record a saved trace session as a reusable fixture
+ cerebrex test record <session-id>
+
+ # List all discovered spec files
+ cerebrex test list
+
+ # Inspect a spec file
+ cerebrex test show my-agent.test.yaml
+ ```
+
+ **Spec format** (`my-agent.test.yaml`):
+
+ ```yaml
+ name: My Agent Tests
+
+ tests:
+   - name: search tool called with correct query
+     steps:
+       - type: tool_call
+         toolName: web_search
+         inputs:
+           query: "CerebreX agent OS"
+         latencyMs: 120
+       - type: tool_result
+         toolName: web_search
+         outputs:
+           results:
+             - title: "CerebreX — Agent Infrastructure OS"
+         tokens: 45
+     assert:
+       noErrors: true
+       stepCount: 2
+       toolsCalled:
+         tools: [web_search]
+       steps:
+         - at: 0
+           toolName: web_search
+
+   # Replay a recorded trace fixture
+   - name: matches recorded session
+     fixture: my-session.fixture.json
+     assert:
+       noErrors: true
+       stepCount:
+         min: 1
+       output:
+         path: results.0.title
+         contains: "CerebreX"
+ ```
+
+ **Assertions available:** `stepCount`, `tokenCount`, `durationMs`, `noErrors`, `toolsCalled` (with `ordered`/`exact` modes), per-step checks (`type`, `toolName`, `outputPath`/`outputValue`, `latencyMs`), and `output` (dot-path `equals`/`contains`/`matches`).
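To make the dot-path `output` assertion concrete: a path such as `results.0.title` walks one segment at a time through object keys and array indices. A minimal sketch of that resolution, assuming nothing about the actual cerebrex internals (the function name and semantics here are illustrative):

```typescript
// Illustrative only — not the cerebrex implementation. Resolves a dot-path
// like "results.0.title" against a step's recorded output object.
function resolvePath(obj: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>((cur, key) => {
    if (cur !== null && typeof cur === "object") {
      // Works for both object keys and array indices ("0" indexes an array).
      return (cur as Record<string, unknown>)[key];
    }
    return undefined; // path ran off the end of the data
  }, obj);
}

const output = { results: [{ title: "CerebreX Agent Infrastructure OS" }] };
console.log(resolvePath(output, "results.0.title")); // → "CerebreX Agent Infrastructure OS"
console.log(resolvePath(output, "results.1.title")); // → undefined
```

A `contains` assertion then reduces to a substring check on the resolved value, and a missing path resolves to `undefined` rather than throwing, so assertions can fail cleanly.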
+
+ ---
+
  ## 🗂 Monorepo Structure
 
  ```
@@ -340,11 +521,28 @@ CerebreX/
  ├── apps/
  │   ├── cli/                    # cerebrex CLI — the main published package
  │   │   ├── src/
- │   │   │   ├── commands/       # build, trace, memex, auth, hive, other-commands
- │   │   │   └── core/           # forge/, trace/, memex/ engines + dashboard
+ │   │   │   ├── commands/       # build, trace, memex, auth, hive, bench, test, other-commands
+ │   │   │   └── core/           # forge/, trace/, memex/, test/ engines + dashboard
  │   │   └── dist/               # built output (git-ignored, built by CI)
  │   └── dashboard/              # Standalone trace explorer HTML
  │       └── src/index.html
+ ├── benchmarks/                 # Performance benchmark suite (local + live)
+ │   ├── forge-bench.ts          # FORGE pipeline timing
+ │   ├── trace-bench.ts          # TRACE step recording throughput
+ │   ├── memex-bench.ts          # Three-layer MEMEX operations
+ │   ├── hive-bench.ts           # Swarm coordination + risk gate
+ │   ├── registry-bench.ts       # Package search + metadata
+ │   ├── agent-tasks-bench.ts    # Cross-framework comparison scaffold
+ │   └── src/
+ │       ├── stats.ts            # p50/p95/p99 helpers
+ │       ├── types.ts            # BenchmarkResult type
+ │       ├── reporters/          # console, json, markdown reporters
+ │       └── adapters/           # cerebrex adapter (5 standardized tasks)
+ ├── sdks/
+ │   └── python/                 # Python async SDK (pip install cerebrex)
+ │       ├── src/cerebrex/       # CerebreXClient + module sub-clients
+ │       ├── tests/              # pytest test suite with pytest-httpx mocks
+ │       └── examples/           # quickstart, langchain_integration, crewai_integration
  ├── workers/
  │   ├── registry/               # Cloudflare Worker — live registry backend + Web UI
  │   │   ├── src/index.ts        # REST API (D1 + KV) + embedded HTML pages
@@ -370,7 +568,10 @@ CerebreX/
  │   ├── deploy-registry.yml     # auto-deploy registry Worker
  │   ├── deploy-memex.yml        # auto-deploy MEMEX Worker
  │   ├── deploy-kairos.yml       # auto-deploy KAIROS Worker
- │   └── build-binaries.yml      # build standalone binaries on release
+ │   ├── build-binaries.yml      # build standalone binaries on release
+ │   ├── benchmarks.yml          # weekly benchmark suite (Sundays 02:00 UTC)
+ │   ├── test-python.yml         # Python SDK tests (3.10, 3.11, 3.12)
+ │   └── publish-python.yml      # publish cerebrex to PyPI on release
  └── turbo.json
  ```
 
@@ -449,9 +650,9 @@ cd apps/cli && bun run build
  - [x] HIVE swarm strategies — parallel, pipeline, competitive + 6 built-in presets *(v0.9)*
  - [x] `@cerebrex/system-prompt` — master system prompt package + live MEMEX context loader *(v0.9)*
  - [x] Security hardening — risk gate wired into HIVE worker, JWT /token endpoint authenticated, KAIROS exponential backoff + JSON validation, agentId injection prevention *(v0.9.1)*
- - [ ] Agent test runner — `cerebrex test` with replay + assertions *(v1.0)*
- - [ ] Custom domain *(next)*
- - [ ] Enterprise tier + on-prem *(v1.0)*
+ - [x] Benchmark suite — p50/p95/p99, forge/trace/memex/hive/registry + cross-framework agent tasks + `cerebrex bench` CLI command *(v0.9.2)*
+ - [x] Python SDK — async httpx client, Pydantic v2, full module coverage, LangChain + CrewAI integrations *(v0.9.2)*
+ - [x] Agent test runner — `cerebrex test` with replay + assertions, fixture recording, tag filtering, CI mode *(v0.9.3)*
 
  ---