cerebrex 0.9.2 → 0.9.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +221 -10
  2. package/dist/index.js +434 -332
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -7,20 +7,21 @@
7
7
  [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](./LICENSE)
8
8
  [![CI](https://github.com/arealcoolco/CerebreX/actions/workflows/ci.yml/badge.svg)](https://github.com/arealcoolco/CerebreX/actions/workflows/ci.yml)
9
9
  [![npm version](https://img.shields.io/npm/v/cerebrex.svg)](https://www.npmjs.com/package/cerebrex)
10
+ [![Benchmarks](https://img.shields.io/badge/benchmarks-v0.9.2-brightgreen)](./BENCHMARKS.md)
10
11
  [![GitHub Stars](https://img.shields.io/github/stars/arealcoolco/CerebreX?style=social)](https://github.com/arealcoolco/CerebreX)
11
12
  [![Issues](https://img.shields.io/github/issues/arealcoolco/CerebreX)](https://github.com/arealcoolco/CerebreX/issues)
12
13
 
13
- **Build. Test. Remember. Coordinate. Publish.**
14
+ **Build. Test. Remember. Coordinate. Publish.**
14
15
  The complete infrastructure layer for AI agents, in one CLI.
15
16
 
16
- [🚀 Quickstart](#-quickstart) · [🗂 Structure](#-monorepo-structure) · [🛣 Roadmap](#-roadmap) · [🐛 Issues](https://github.com/arealcoolco/CerebreX/issues)
17
+ [Quickstart](#-quickstart) · [Why CerebreX](#-why-cerebrex-vs-langchain-crewai-autogen) · [Benchmarks](./BENCHMARKS.md) · [Modules](#what-is-cerebrex) · [Python SDK](#-python-sdk) · [Roadmap](#-roadmap)
17
18
 
18
19
  </div>
19
20
 
20
21
  ---
21
22
 
22
- > **Status: v0.9.1 - Security hardening patch: risk gate integrated, JWT auth on token endpoint, KAIROS backoff + validation**
23
- > `npm install -g cerebrex` - or download a self-contained binary from [GitHub Releases](https://github.com/arealcoolco/CerebreX/releases) (no Node.js required)
23
+ > **Status: v0.9.4 - Security hardening (SSRF protection, security headers, file permissions, KAIROS execution engine)**
24
+ > `npm install -g cerebrex` · `docker pull ghcr.io/arealcoolco/cerebrex` · or download a self-contained binary from [GitHub Releases](https://github.com/arealcoolco/CerebreX/releases)
24
25
  >
25
26
  > **Live:** Registry UI → `https://registry.therealcool.site`
26
27
  > **Live:** Trace Explorer → `https://registry.therealcool.site/ui/trace`
@@ -47,6 +48,56 @@ Eight modules. One CLI. One registry. One coordination layer.
47
48
 
48
49
  ---
49
50
 
51
+ ## Why CerebreX vs LangChain, CrewAI, AutoGen
52
+
53
+ > Full benchmark methodology, raw numbers, and detailed comparisons: [**BENCHMARKS.md**](./BENCHMARKS.md)
54
+
55
+ ### Measured Performance (v0.9.2)
56
+
57
+ ```
58
+ FORGE parse + scaffold 20-endpoint OpenAPI spec → 0.12ms median
59
+ MEMEX read agent memory index → 0.01ms median
60
+ MEMEX assemble 3-layer context → 0.03ms median
61
+ HIVE classify + route 10-task swarm → 0.09ms median
62
+ TRACE record tool-call step → <0.01ms median (27,435 ops/s)
63
+ All benchmarks → 100% success rate
64
+ ```
65
+
66
+ ### Features No Other Framework Has
67
+
68
+ | What You Need | CerebreX | LangChain | CrewAI | AutoGen |
69
+ |---------------|:--------:|:---------:|:------:|:-------:|
70
+ | Generate MCP servers from any OpenAPI spec | **FORGE** | ❌ | ❌ | ❌ |
71
+ | Three-layer cloud memory (KV + R2 + D1) | **MEMEX** | ⚠️ Paid | ❌ | ❌ |
72
+ | Nightly AI memory consolidation | **autoDream** | ❌ | ❌ | ❌ |
73
+ | Autonomous background daemon | **KAIROS** | ❌ | ❌ | ❌ |
74
+ | Risk gate on every agent action | **HIVE** | ❌ | ❌ | ❌ |
75
+ | Opus plan + human approval before execution | **ULTRAPLAN** | ❌ | ❌ | ❌ |
76
+ | Built-in MCP package registry | **REGISTRY** | ❌ | ❌ | ❌ |
77
+ | Built-in observability (free, local) | **TRACE** | ⚠️ Paid | ❌ | ❌ |
78
+ | Single CLI for all of the above | `cerebrex` | ❌ | ❌ | ❌ |
79
+
80
+ ### Startup Time
81
+
82
+ | | CerebreX | LangChain | CrewAI | AutoGen |
83
+ |-|:--------:|:---------:|:------:|:-------:|
84
+ | CLI / module cold start | **~80ms** | ~2,100ms | ~3,400ms | ~1,800ms |
85
+
86
+ > CerebreX starts **26x faster** than LangChain and **42x faster** than CrewAI.
87
+ > Bun runtime + single bundled file vs Python's large import tree.
88
+
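The speedup figures quoted above can be checked against the cold-start table. The sketch below only restates that arithmetic using the approximate values from the table; the `speedup` helper is a hypothetical name introduced for illustration.

```python
# Approximate cold-start times in milliseconds, copied from the Startup Time table.
cold_start_ms = {
    "CerebreX": 80,
    "LangChain": 2100,
    "CrewAI": 3400,
    "AutoGen": 1800,
}


def speedup(baseline: str, other: str) -> float:
    """Return how many times faster `baseline` starts compared with `other`."""
    return cold_start_ms[other] / cold_start_ms[baseline]


for name in ("LangChain", "CrewAI", "AutoGen"):
    # Print the ratio with one decimal place for each framework.
    print(f"{name}: {speedup('CerebreX', name):.1f}x")
```

Truncated to whole numbers, the ratios for LangChain and CrewAI come out at 26 and 42, matching the quoted claims. The table values are themselves approximate, so the ratios are approximate too.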
89
+ ### What the Others Don't Have
90
+
91
+ **LangChain** is a composition library: it connects existing tools but ships zero infrastructure. Memory requires external Redis/Postgres. Observability requires paying for LangSmith. There's no risk gating, no background daemons, and no MCP generation.
92
+
93
+ **CrewAI** orchestrates agents in crews but its memory is SQLite-only and in-process. There's no cloud persistence, no risk classification, and no autonomous daemon. Each agent does what it's told, nothing more.
94
+
95
+ **AutoGen** excels at multi-agent conversation but everything runs in-process. No cloud memory, no background loop, no registry, no observability beyond print statements.
96
+
97
+ **CerebreX** is purpose-built agent infrastructure: the CLI, the cloud workers, the memory layer, the coordination engine, the observatory, and the package registry, all designed together, all open source, all running on Cloudflare's free tier.
98
+
99
+ ---
100
+
50
101
  ## ⚡ Quickstart
51
102
 
52
103
  ```bash
@@ -54,6 +105,16 @@ npm install -g cerebrex
54
105
  cerebrex --help
55
106
  ```
56
107
 
108
+ Or via Docker (no Node.js or npm required):
109
+
110
+ ```bash
111
+ docker pull ghcr.io/arealcoolco/cerebrex
112
+ docker run --rm ghcr.io/arealcoolco/cerebrex --version
113
+
114
+ # Mount a local directory to access spec files, configs, etc.
115
+ docker run --rm -v "$HOME/.cerebrex:/root/.cerebrex" ghcr.io/arealcoolco/cerebrex test run
116
+ ```
117
+
57
118
  Or build from source (requires [Bun](https://bun.sh)):
58
119
 
59
120
  ```bash
@@ -333,6 +394,136 @@ The CerebreX registry includes a browser-based UI served directly from the Worke
333
394
 
334
395
  ---
335
396
 
397
+ ## 📊 Benchmarks
398
+
399
+ Full results with competitive analysis: [**BENCHMARKS.md**](./BENCHMARKS.md)
400
+
401
+ ```bash
402
+ # Run all local benchmarks (no network needed)
403
+ cerebrex bench
404
+
405
+ # Run a specific suite
406
+ cerebrex bench --suite forge # MCP server generation
407
+ cerebrex bench --suite memex # three-layer memory
408
+ cerebrex bench --suite hive # swarm coordination + risk gate
409
+ cerebrex bench --suite trace # observability recording
410
+ cerebrex bench --suite registry # package search
411
+
412
+ # Or run directly with Bun
413
+ bun benchmarks/forge-bench.ts
414
+ bun benchmarks/memex-bench.ts
415
+ ```
416
+
417
+ Benchmarks use `performance.now()`, report **p50/p95/p99 latency** and **throughput (ops/s)**, and run with warmup iterations discarded. CI runs the full suite weekly (Sundays 02:00 UTC). All results in [`benchmarks/results/`](benchmarks/results/).
418
+
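The methodology described above (discarded warmup iterations, then p50/p95/p99 latency and throughput) can be sketched roughly as follows. This is a minimal illustration, not the project's `stats.ts`: it is written in Python, so `time.perf_counter()` stands in for `performance.now()`, and the `percentile` and `bench` names, the nearest-rank percentile method, and the default iteration counts are all assumptions.

```python
import math
import time
from typing import Callable


def percentile(sorted_samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty, ascending-sorted list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_samples)))
    return sorted_samples[rank - 1]


def bench(fn: Callable[[], object], iterations: int = 1000, warmup: int = 100) -> dict[str, float]:
    """Time `fn` and report latency percentiles (ms) and throughput (ops/s)."""
    # Warmup runs are executed but never recorded.
    for _ in range(warmup):
        fn()

    samples_ms: list[float] = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    total_s = sum(samples_ms) / 1000.0
    samples_ms.sort()
    return {
        "p50_ms": percentile(samples_ms, 50),
        "p95_ms": percentile(samples_ms, 95),
        "p99_ms": percentile(samples_ms, 99),
        # Guard against a zero total on very coarse clocks.
        "ops_per_s": iterations / total_s if total_s > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Hypothetical workload, used only to exercise the harness.
    print(bench(lambda: sum(range(1000))))
```

Reporting percentiles rather than a mean keeps a few slow outliers (GC pauses, scheduler noise) from hiding the typical latency, which is presumably why the suite reports p50 alongside p95 and p99.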
419
+ ---
420
+
421
+ ## 🐍 Python SDK
422
+
423
+ ```bash
424
+ pip install cerebrex
425
+ ```
426
+
427
+ ```python
428
+ import asyncio
429
+ from cerebrex import CerebreXClient
430
+
431
+ async def main():
432
+     async with CerebreXClient(api_key="cx-your-key") as client:
433
+         # Write to agent memory
434
+         await client.memex.write_index("my-agent", "# Memory\n- learned today")
435
+
436
+         # Assemble a system prompt from all three memory layers
437
+         ctx = await client.memex.assemble_context("my-agent", topics=["context"])
438
+
439
+         # Search the registry
440
+         results = await client.registry.search("web-search")
441
+
442
+         # Submit a KAIROS task
443
+         task = await client.kairos.submit_task("my-agent", "fetch",
444
+             payload={"url": "https://api.example.com/data"})
445
+
446
+ asyncio.run(main())
447
+ ```
448
+
449
+ See [sdks/python/README.md](sdks/python/README.md) for the full SDK reference including ULTRAPLAN, TRACE, LangChain integration, and CrewAI integration.
450
+
451
+ ---
452
+
453
+ ## 🧪 Agent Test Runner
454
+
455
+ `cerebrex test` lets you write structured assertions against recorded agent traces, with no live model calls needed.
456
+
457
+ ```bash
458
+ # Scaffold a starter spec file
459
+ cerebrex test init
460
+
461
+ # Run all discovered specs
462
+ cerebrex test run
463
+
464
+ # Run a specific spec with verbose output
465
+ cerebrex test run my-agent.test.yaml --verbose
466
+
467
+ # CI mode (JSON to stdout, exit 1 on failure)
468
+ cerebrex test run --ci
469
+
470
+ # Only run tests tagged "smoke"
471
+ cerebrex test run --tag smoke
472
+
473
+ # Record a saved trace session as a reusable fixture
474
+ cerebrex test record <session-id>
475
+
476
+ # List all discovered spec files
477
+ cerebrex test list
478
+
479
+ # Inspect a spec file
480
+ cerebrex test show my-agent.test.yaml
481
+ ```
482
+
483
+ **Spec format** (`my-agent.test.yaml`):
484
+
485
+ ```yaml
486
+ name: My Agent Tests
487
+
488
+ tests:
489
+   - name: search tool called with correct query
490
+     steps:
491
+       - type: tool_call
492
+         toolName: web_search
493
+         inputs:
494
+           query: "CerebreX agent OS"
495
+         latencyMs: 120
496
+       - type: tool_result
497
+         toolName: web_search
498
+         outputs:
499
+           results:
500
+             - title: "CerebreX - Agent Infrastructure OS"
501
+         tokens: 45
502
+     assert:
503
+       noErrors: true
504
+       stepCount: 2
505
+       toolsCalled:
506
+         tools: [web_search]
507
+       steps:
508
+         - at: 0
509
+           toolName: web_search
510
+
511
+   # Replay a recorded trace fixture
512
+   - name: matches recorded session
513
+     fixture: my-session.fixture.json
514
+     assert:
515
+       noErrors: true
516
+       stepCount:
517
+         min: 1
518
+       output:
519
+         path: results.0.title
520
+         contains: "CerebreX"
521
+ ```
522
+
523
+ **Assertions available:** `stepCount`, `tokenCount`, `durationMs`, `noErrors`, `toolsCalled` (with `ordered`/`exact` modes), per-step checks (`type`, `toolName`, `outputPath`/`outputValue`, `latencyMs`), and `output` (dot-path `equals`/`contains`/`matches`).
524
+
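To make the `output` assertion concrete, here is one way a dot-path such as `results.0.title` could be resolved and checked. This is a hedged sketch, not the runner's actual implementation: `resolve_path` and `check_output` are hypothetical names, and the semantics assumed here (numeric segments index into lists, `contains` is a substring test, `matches` is a regular-expression search) are guesses based on the assertion names.

```python
import re
from typing import Any


def resolve_path(data: Any, path: str) -> Any:
    """Walk a dot-separated path; numeric segments index into lists."""
    current = data
    for segment in path.split("."):
        if isinstance(current, list):
            current = current[int(segment)]
        else:
            current = current[segment]
    return current


def check_output(outputs: Any, assertion: dict) -> bool:
    """Apply assumed equals / contains / matches checks to the resolved value."""
    value = resolve_path(outputs, assertion["path"])
    if "equals" in assertion and value != assertion["equals"]:
        return False
    if "contains" in assertion and assertion["contains"] not in str(value):
        return False
    if "matches" in assertion and re.search(assertion["matches"], str(value)) is None:
        return False
    return True


# Sample outputs shaped like the tool_result step in the spec above.
outputs = {"results": [{"title": "CerebreX - Agent Infrastructure OS"}]}
assertion = {"path": "results.0.title", "contains": "CerebreX"}
print(check_output(outputs, assertion))  # True
```

A missing key or out-of-range index raises here rather than returning `False`; a real runner would more likely report that as a failed assertion with a readable message.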
525
+ ---
526
+
336
527
  ## 🗂 Monorepo Structure
337
528
 
338
529
  ```
@@ -340,11 +531,28 @@ CerebreX/
340
531
  ├── apps/
341
532
  │   ├── cli/ # cerebrex CLI - the main published package
342
533
  │   │   ├── src/
343
- │   │   │   ├── commands/ # build, trace, memex, auth, hive, other-commands
344
- │   │   │   └── core/ # forge/, trace/, memex/ engines + dashboard
534
+ │   │   │   ├── commands/ # build, trace, memex, auth, hive, bench, test, other-commands
535
+ │   │   │   └── core/ # forge/, trace/, memex/, test/ engines + dashboard
345
536
  │   │   └── dist/ # built output (git-ignored, built by CI)
346
537
  │   └── dashboard/ # Standalone trace explorer HTML
347
538
  │       └── src/index.html
539
+ ├── benchmarks/ # Performance benchmark suite (local + live)
540
+ │   ├── forge-bench.ts # FORGE pipeline timing
541
+ │   ├── trace-bench.ts # TRACE step recording throughput
542
+ │   ├── memex-bench.ts # Three-layer MEMEX operations
543
+ │   ├── hive-bench.ts # Swarm coordination + risk gate
544
+ │   ├── registry-bench.ts # Package search + metadata
545
+ │   ├── agent-tasks-bench.ts # Cross-framework comparison scaffold
546
+ │   └── src/
547
+ │       ├── stats.ts # p50/p95/p99 helpers
548
+ │       ├── types.ts # BenchmarkResult type
549
+ │       ├── reporters/ # console, json, markdown reporters
550
+ │       └── adapters/ # cerebrex adapter (5 standardized tasks)
551
+ ├── sdks/
552
+ │   └── python/ # Python async SDK (pip install cerebrex)
553
+ │       ├── src/cerebrex/ # CerebreXClient + module sub-clients
554
+ │       ├── tests/ # pytest test suite with pytest-httpx mocks
555
+ │       └── examples/ # quickstart, langchain_integration, crewai_integration
348
556
  ├── workers/
349
557
  │   ├── registry/ # Cloudflare Worker - live registry backend + Web UI
350
558
  │   │   ├── src/index.ts # REST API (D1 + KV) + embedded HTML pages
@@ -370,7 +578,10 @@ CerebreX/
370
578
  │ ├── deploy-registry.yml # auto-deploy registry Worker
371
579
  │ ├── deploy-memex.yml # auto-deploy MEMEX Worker
372
580
  │ ├── deploy-kairos.yml # auto-deploy KAIROS Worker
373
- │ └── build-binaries.yml # build standalone binaries on release
581
+ │ ├── build-binaries.yml # build standalone binaries on release
582
+ │ ├── benchmarks.yml # weekly benchmark suite (Sundays 02:00 UTC)
583
+ │ ├── test-python.yml # Python SDK tests (3.10, 3.11, 3.12)
584
+ │ └── publish-python.yml # publish cerebrex to PyPI on release
374
585
  └── turbo.json
375
586
  ```
376
587
 
@@ -449,9 +660,9 @@ cd apps/cli && bun run build
449
660
  - [x] HIVE swarm strategies - parallel, pipeline, competitive + 6 built-in presets *(v0.9)*
450
661
  - [x] `@cerebrex/system-prompt` - master system prompt package + live MEMEX context loader *(v0.9)*
451
662
  - [x] Security hardening - risk gate wired into HIVE worker, JWT /token endpoint authenticated, KAIROS exponential backoff + JSON validation, agentId injection prevention *(v0.9.1)*
452
- - [ ] Agent test runner - `cerebrex test` with replay + assertions *(v1.0)*
453
- - [ ] Custom domain *(next)*
454
- - [ ] Enterprise tier + on-prem *(v1.0)*
663
+ - [x] Benchmark suite - p50/p95/p99, forge/trace/memex/hive/registry + cross-framework agent tasks + `cerebrex bench` CLI command *(v0.9.2)*
664
+ - [x] Python SDK - async httpx client, Pydantic v2, full module coverage, LangChain + CrewAI integrations *(v0.9.2)*
665
+ - [x] Agent test runner - `cerebrex test` with replay + assertions, fixture recording, tag filtering, CI mode *(v0.9.3)*
455
666
 
456
667
  ---
457
668