exoagent 0.0.1 → 0.0.3

package/README.md CHANGED
@@ -2,34 +2,54 @@
2
2
 
3
3
  The OS kernel to safely unleash your agents.
4
4
 
5
- **[Try the challenge](https://exoagent.io/challenge)** — Two agents, same LLM, same prompt injection vulnerability. $1,000 in BTC if you can hack the one protected by ExoAgent (coming soon!).
5
+ 🛡️ **[Live Challenge: Steal my $1,000 BTC](https://exoagent.io/challenge)**
6
+ We put a real Bitcoin wallet in a database protected by ExoAgent. If you can prompt-inject the agent to extract the private key, you keep the money.
6
7
 
7
8
  ## The Problem
8
9
 
9
- Today's agent frameworks give LLMs raw access to tools. The "security model" is hoping the prompt works.
10
+ Today's agent frameworks give LLMs raw access to tools. The "security model" is hoping the system prompt works.
10
11
 
11
- - **Authorization is broken:** Tool calls inherit user permissions. Your agent gets your credentials, all of them.
12
- - **Interfaces are opaque:** `execute_sql("SELECT * FROM users")` is all the policy sees; it can't see what's inside.
13
- - **No central data policy:** Each tool enforces its own rules. No holistic view. No real guarantees.
12
+ - 🚨 **Authorization is broken:** Tool calls inherit *your* full permissions. You asked for dinner delivery; your driver got your wallet.
13
+ - 🌫️ **Interfaces are opaque:** `execute_sql("SELECT * FROM users")` is a black box. Policy engines can't enforce constraints on raw strings.
14
+ - 🕸️ **No central policy:** Each tool enforces its own rules. There is no way to guarantee that data doesn't leak across them.
14
15
 
15
- ## The Fix
16
+ ## The Fix: Deterministic security, not prompts
16
17
 
17
- Security as a system invariant, not a polite suggestion.
18
+ ExoAgent uses **Object Capabilities (OCap)** to enforce security at the runtime layer. Instead of giving the agent a "Database Tool," you give it a constrained **Capability Object** that can only access specific rows.
18
19
 
19
- - **Object-capability tools:** Instead of flat tools, pass your agents dynamic objects with strict security guarantees
20
- - **Semantic interfaces:** Rich, secure interaction starting with SQL
21
- - **Central policy:** Declare & enforce what data can flow where
20
+ It doesn't matter if the LLM gets jailbroken. It runs inside a sandbox where invalid actions are mathematically impossible. Security as a system invariant, not a polite suggestion.
22
21
 
23
- ## Installation
22
+ ## Quick Start
23
+
24
+ ### 1. Installation
24
25
 
25
26
  ```bash
26
- npm install exoagent
27
+ npm install exoagent ai
28
+
29
+ # These two depend on your config
30
+ npm install @ai-sdk/google # ...or the model provider you plan to use
31
+ npm install better-sqlite3 # ...or the database you plan to use (Kysely compatible only)
27
32
  ```
28
33
 
29
- ## Quick Start
34
+ ### 2. Define your Safe Interface
35
+
36
+ Wrap your database in a semantic layer. This defines the **boundaries** the agent cannot cross.
37
+
38
+ ```typescript
39
+ import BetterSqlite3 from 'better-sqlite3'
40
+ import { tool } from 'exoagent'
41
+ import { Database } from 'exoagent/sql'
42
+ import { SqliteDialect } from 'kysely'
43
+
44
+ const sqlite = new BetterSqlite3(':memory:')
45
+ const db = new Database(new SqliteDialect({ database: sqlite }))
30
46
 
31
- ```ts
32
- import { Database, RpcToolset, tool } from 'exoagent'
47
+ class Todo extends db.Table('todos').as('todo') {
48
+ id = this.column('id')
49
+ userId = this.column('user_id')
50
+ title = this.column('title')
51
+ completed = this.column('completed')
52
+ }
33
53
 
34
54
  class User extends db.Table('users').as('user') {
35
55
  id = this.column('id')
@@ -37,30 +57,95 @@ class User extends db.Table('users').as('user') {
37
57
  email = this.column('email')
38
58
 
39
59
  @tool()
40
- orders() {
41
- return Order.on(order => order.userId['='](this.id)).from()
60
+ todos() {
61
+ // Defines relations -- don't forget the `from()`:
62
+ return Todo.on(todo => todo.userId['='](this.id)).from()
42
63
  }
43
64
  }
65
+ ```
44
66
 
45
- // Agent can only access what you've exposed
46
- class MyAgent extends RpcToolset {
47
- @tool()
48
- users() {
49
- return User.on(user => user.id['='](this.currentUserId)).from()
50
- }
51
- }
67
+ ### 3. Unleash the agent
68
+
69
+ ```typescript
70
+ import { google } from '@ai-sdk/google'
71
+ import { generateText, stepCountIs } from 'ai'
72
+ import { CodeMode, createDenoSandbox } from 'exoagent'
73
+
74
+ // Create a capability scoped to user_id=1
75
+ const userCap = User.on(u => u.id['='](1)).from()
76
+
77
+ // Wrap with CodeMode for sandboxed execution
78
+ const codeMode = new CodeMode(createDenoSandbox())
79
+ const codeTool = await codeMode.wrap({
80
+ currentUser: () => userCap,
81
+ }, schemaString) // schemaString = the class definitions above as a string
82
+
83
+ const result = await generateText({
84
+ model: google('gemini-2.5-flash'),
85
+ tools: { execute: codeTool },
86
+ stopWhen: stepCountIs(10),
87
+ system: '...', // See examples/simple.ts for full system prompt
88
+ prompt: 'Show me my incomplete todos',
89
+ })
52
90
  ```
53
91
 
92
+ **What just happened?** The agent cannot run `SELECT * FROM users`. It lacks the
93
+ reference to the global `User` table. It can only operate on `userCap`, which
94
+ forces the SQL to be scoped to the specific user.
95
+
96
+ ### 4. Try it out
97
+
98
+ Run the working examples:
99
+
100
+ ```bash
101
+ cd examples/
102
+ npm i
103
+ npm i exoagent@latest
104
+
105
+ # Simple example (users & todos)
106
+ npx tsx simple.ts
107
+
108
+ # Complex SaaS example (org -> project -> task -> comment)
109
+ npx tsx saas-bot.ts
110
+ ```
111
+
112
+ Note that the examples require:
113
+ 1. Node.js (runtime)
114
+ 2. [Deno](https://docs.deno.com/runtime/getting_started/installation/) (sandbox)
115
+ 3. An LLM API key set via one of the env vars:
116
+ - `OPENAI_API_KEY`
117
+ - `ANTHROPIC_API_KEY`
118
+ - `GOOGLE_GENERATIVE_AI_API_KEY`
119
+
120
+ ## Architecture
121
+ ExoAgent sits between your LLM and your infrastructure as a regular tool.
122
+ 1. **Protocol**: Uses [Cap'n Web](https://github.com/cloudflare/capnweb) (RPC) as the transport.
123
+ 2. **Runtime**: Runs in a JS code sandbox (user-configured; Deno supported out of the box, more to come).
124
+ 3. **Query Builder**: Uses a custom capability SQL builder that compiles to safe SQL.
125
+
126
+ ## ⚠️ Project Status: Experimental (v0.0.x)
127
+ ExoAgent is an exploration of capability-based security for LLMs. While the architecture (OCaps + Sandboxing) is theoretically robust, this specific implementation is new and may contain bugs.
128
+
129
+ **The Guarantee**: We are confident enough in the core design that we are putting real money on the line. If you find a bypass, you get paid.
130
+
54
131
  ## Roadmap
55
132
 
56
- - [ ] Additional SQL support
57
- - [ ] Policy engine: information flow control to declare & enforce what data can flow where
58
- - [ ] Python port
133
+ - [ ] Additional sandbox robustness
134
+ - [ ] Additional SQL support: aggregations, mutations, advanced SQL
135
+ - [ ] Policy engine: Declarative information flow controls (e.g., "PII cannot flow to Slack")
136
+ - [ ] Python SDK: for integration with the Python ecosystem.
137
+
138
+ ### FAQs
139
+ **Q: Why not just use RLS (row-level security)?**
140
+
141
+ A: Two main reasons:
142
+ 1. **Defense in Depth:** RLS has existed for a decade, yet no security team allows raw, untrusted SQL to run against production databases. You still need protection against resource exhaustion, unsafe functions, and column-level leaks.
143
+ 2. **Logic beyond the DB:** RLS is locked to the database. ExoAgent is a **general-purpose policy layer**. We want to enforce rules that span systems, like: *"The `email` column is PII. PII cannot be sent to the Slack tool."*
59
144
 
60
- ## Documentation
145
+ **Q: Why not just use "LLM Guardrails" or System Prompts?**
61
146
 
62
- Coming soon.
147
+ A: Those are **Probabilistic**. Guardrails reduce the *likelihood* of a breach, but they don't eliminate it. In security, a 99% success rate is a failing grade. ExoAgent provides **Deterministic** security—if the agent doesn't have the capability, the action is mathematically impossible.
63
148
 
64
149
  ## License
65
150
 
66
- [MIT](./LICENSE) License © [Ryan Rasti](https://github.com/ryanrasti)
151
+ [MIT](./LICENSE.md) License © [Ryan Rasti](https://github.com/ryanrasti)
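
Reviewer note on the "What just happened?" paragraph added in 0.0.3: the object-capability idea it describes (the agent holds a reference scoped to one user and never a reference to the whole table) can be illustrated without ExoAgent at all. The following is a minimal, self-contained sketch under stated assumptions, not ExoAgent's implementation or API. `TodoRow`, `UserTodosCap`, `makeUserTodosCap`, and `untrustedAgentCode` are hypothetical names introduced only for this illustration, and an in-memory array stands in for the database.

```typescript
// Hypothetical in-memory rows; this is NOT ExoAgent's data model.
interface TodoRow {
  id: number
  userId: number
  title: string
  completed: boolean
}

// The only surface the untrusted code is handed.
interface UserTodosCap {
  all(): TodoRow[]
  incomplete(): TodoRow[]
}

// Builds a capability bound to a single user. The full table is captured in
// the closure, so a holder of the capability cannot name other users' rows.
function makeUserTodosCap(table: readonly TodoRow[], userId: number): UserTodosCap {
  // Copy rows so callers cannot mutate the underlying table through them.
  const scoped = () => table.filter(row => row.userId === userId).map(row => ({ ...row }))
  return Object.freeze({
    all: () => scoped(),
    incomplete: () => scoped().filter(row => !row.completed),
  })
}

const table: TodoRow[] = [
  { id: 1, userId: 1, title: 'Write docs', completed: false },
  { id: 2, userId: 1, title: 'Ship release', completed: true },
  { id: 3, userId: 2, title: 'Rotate keys', completed: false },
]

// Stand-in for agent-generated code: it is passed the capability, not `table`.
function untrustedAgentCode(cap: UserTodosCap): string[] {
  return cap.incomplete().map(row => row.title)
}

const titles = untrustedAgentCode(makeUserTodosCap(table, 1))
console.log(titles)
```

In this single-file sketch `table` is still lexically visible to `untrustedAgentCode`. The isolation the README describes depends on running agent code in a separate sandbox where the capability is the only thing passed in, which this sketch does not attempt to reproduce.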