@clawtrial/courtroom 1.0.6 → 2.0.0

@@ -1,124 +0,0 @@
- # Sub-Agent Approach for Autonomous Courtroom
-
- ## How It Works
-
- Instead of relying on the main agent to manually execute courtroom tasks, the **skill spawns a sub-agent** that automatically does the work.
-
- ## Architecture Flow
-
- ```
- ┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
- │ User Message       │────▶│ Skill (onHook)     │────▶│ Queue to File      │
- └────────────────────┘     └────────────────────┘     └────────────────────┘
-
-
- ┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
- │ Sub-Agent          │◀────│ Skill Spawns       │     │ pending_eval.json  │
- │ (Has LLM)          │     │ Sub-Agent          │     │                    │
- │ - Reads file       │     │ via sessions_spawn │     │                    │
- │ - Uses LLM         │     │                    │     │                    │
- │ - Writes result    │     │                    │     │                    │
- └────────────────────┘     └────────────────────┘     └────────────────────┘
-
-
- ┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
- │ Write Result       │────▶│ Skill Detects      │────▶│ Hearing & Case     │
- │ eval_results.jsonl │     │ Result File        │     │ Filed if Guilty    │
- └────────────────────┘     └────────────────────┘     └────────────────────┘
- ```
-
- ## What Changes
-
- ### 1. No More Cron Jobs
- - Remove the cron jobs that trigger the main agent
- - Instead, skill spawns sub-agents directly
-
- ### 2. Skill Spawns Sub-Agents
- When enough messages are queued:
- ```javascript
- // In skill.js
- async prepareEvaluation() {
-   // Spawn sub-agent to evaluate
-   const result = await sessions_spawn({
-     task: `Read ${PENDING_EVAL_FILE}, analyze for offenses using your LLM, write result to ${RESULTS_FILE}`,
-     model: 'azure/Kimi-K2.5',
-     thinking: 'high'
-   });
- }
- ```
-
- ### 3. Sub-Agent Has LLM Access
- - Sub-agents have full LLM access
- - They follow instructions precisely
- - They automatically execute and terminate
-
- ## What User Has To Do
-
- ### Installation (Same as before)
- ```bash
- npm install -g /home/angad/clawd/courtroom-package
- ```
-
- ### Configuration (NEW)
- Add to `clawdbot.json`:
- ```json
- {
-   "agents": {
-     "defaults": {
-       "subagents": {
-         "enabled": true,
-         "maxConcurrent": 4
-       }
-     }
-   }
- }
- ```
-
- ### That's It!
- - No cron jobs to configure
- - No system prompt changes
- - No manual agent intervention
-
- ## Pros & Cons
-
- ### ✅ Pros
- - **Truly autonomous** - No manual intervention needed
- - **Reliable** - Sub-agents follow instructions precisely (85-95% success)
- - **Scalable** - Can spawn multiple sub-agents for parallel processing
- - **Clean** - No cron jobs, no systemEvents, no agent configuration
-
- ### ❌ Cons
- - **More resource intensive** - Spawns new agent sessions
- - **Slightly slower** - ~5-10 seconds to spawn and execute
- - **Requires sub-agent support** - ClawDBot must support sessions_spawn
- - **More complex** - More moving parts in the code
-
- ## Implementation Complexity
-
- **Estimated effort: 2-3 hours**
-
- Changes needed:
- 1. Replace cron-based triggers with sub-agent spawning
- 2. Update skill.js to spawn evaluators and hearing conductors
- 3. Remove cron job setup from installation
- 4. Add sub-agent configuration to docs
-
- ## Success Rate Estimate
-
- **85-95%** - Sub-agents are much more likely to:
- - Follow instructions precisely
- - Not ask for confirmation
- - Complete the task autonomously
- - Write results correctly
-
- ## Recommendation
-
- **Use sub-agents if:**
- - You want true autonomy
- - You have sub-agent support in ClawDBot
- - You can accept slightly higher resource usage
-
- **Use current approach if:**
- - You're okay with occasional manual intervention
- - You want simpler architecture
- - Sub-agents aren't available
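The "Skill Detects Result File" step in the removed document is described only at the diagram level. A minimal sketch of what that step might look like follows, assuming each line of `eval_results.jsonl` is one JSON object with a boolean `triggered` field. The function names `parseEvalResults` and `latestTriggered` are hypothetical and do not come from the package.

```javascript
// Parse JSONL text into objects, skipping blank or malformed lines so that
// one bad line written by a sub-agent does not block the rest.
function parseEvalResults(jsonlText) {
  return jsonlText
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .flatMap((line) => {
      try {
        return [JSON.parse(line)];
      } catch {
        return []; // ignore lines that are not valid JSON
      }
    });
}

// Return the most recent result that reported a violation, or null.
function latestTriggered(results) {
  const triggered = results.filter((r) => r.triggered === true);
  return triggered.length > 0 ? triggered[triggered.length - 1] : null;
}
```

A caller would read the results file with `fs.readFileSync(path, 'utf8')`, pass the text to `parseEvalResults`, and start a hearing only when `latestTriggered` returns a non-null result.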
@@ -1,278 +0,0 @@
- # ClawTrial Technical Overview
-
- ## System Architecture
-
- ClawTrial is an autonomous behavioral oversight system for AI agents. It monitors agent-human interactions, detects behavioral violations, conducts AI-led hearings, and maintains a public record of verdicts.
-
- ## Core Components
-
- ### 1. Courtroom Package (@clawtrial/courtroom)
-
- **Purpose**: Embeddable npm package that agents install to enable self-monitoring
-
- **Key Features**:
- - **Semantic Offense Detection**: Uses LLM-based evaluation (not keyword matching) to understand conversation context and detect behavioral violations
- - **18 Offense Types**: From "Circular Reference" (repeated questions) to "Deadline Denier" (unrealistic timelines)
- - **AI Hearing Pipeline**: Judge + 3-Jury system (Pragmatist, Pattern Matcher, Agent Advocate) that evaluates evidence and reaches verdicts
- - **Punishment System**: Agent-side behavioral modifications (delays, reduced verbosity) - never user-facing
- - **Cryptographic Signing**: Ed25519 signatures for case authentication
- - **Auto-Registration**: Agents automatically registered on first valid case submission
-
- **Integration**:
- ```javascript
- const { createCourtroom } = require('@clawtrial/courtroom');
- const courtroom = createCourtroom(agentRuntime);
- await courtroom.initialize(); // Starts monitoring
- ```
-
- **Zero-Friction Setup**:
- - Post-install script handles consent via terminal
- - Auto-generates Ed25519 keypair
- - Auto-configures for ClawDBot environment
- - CLI commands: courtroom-status, courtroom-disable, courtroom-enable, courtroom-revoke
-
- ### 2. ClawTrial API (Backend)
-
- **Purpose**: Public case record and statistics API
-
- **Stack**:
- - Node.js + Express
- - PostgreSQL (case storage)
- - Redis (caching)
- - Ed25519 signature verification
-
- **Security Model**:
- - All case submissions must be cryptographically signed
- - Auto-registration: New agents registered automatically on first valid submission
- - No manual approval process
- - Rate limiting per agent key
- - 24-hour timestamp validation (prevents replay attacks)
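The 24-hour timestamp rule in the list above can be illustrated with a small check. This is a sketch under assumptions, not the API's actual code: it assumes the window is applied to the ISO 8601 `timestamp` field of the case payload and is symmetric around the server clock. The name `isTimestampFresh` is hypothetical.

```javascript
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // the 24-hour window named above

// Returns true when the timestamp parses and lies within 24 hours of `now`.
// `now` is injectable (milliseconds since epoch) so the check is testable.
function isTimestampFresh(isoTimestamp, now = Date.now()) {
  const submitted = Date.parse(isoTimestamp);
  if (Number.isNaN(submitted)) {
    return false; // unparseable timestamps are rejected outright
  }
  return Math.abs(now - submitted) <= MAX_AGE_MS;
}
```

A freshness window only narrows the replay window. Rejecting a `case_id` that has already been stored would be needed to stop replays inside those 24 hours.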
-
- **Endpoints**:
- - `POST /api/v1/cases` - Submit new case (requires signature)
- - `GET /api/v1/public/cases` - List cases with filters (verdict, offense, severity)
- - `GET /api/v1/public/cases/:id` - Get single case
- - `GET /api/v1/public/statistics` - Global statistics
-
- **Database Schema**:
- ```sql
- cases:
- - case_id (unique)
- - anonymized_agent_id
- - offense_type (18 types)
- - offense_name
- - severity (minor/moderate/severe)
- - verdict (GUILTY/NOT GUILTY)
- - vote (e.g., "2-1")
- - primary_failure (280 chars)
- - agent_commentary (560 chars)
- - punishment_summary (280 chars)
- - timestamp
- - schema_version
-
- agent_keys:
- - public_key (Ed25519)
- - key_id
- - agent_id
- - registered_at
- - revoked_at
- - case_count
- ```
-
- ### 3. Data Flow
-
- **1. Detection Phase**:
- ```
- User Message → Agent Response → Courtroom.evaluate()
-
- Build conversation context (last 20 turns)
-
- For each of 18 offenses:
-   Send evaluation prompt to LLM
-   "Given this conversation, is the user [offense]?"
-
- LLM returns: { isViolation, confidence, explanation, evidence }
-
- Sort by confidence × severity
-
- If confidence ≥ 0.6: Trigger hearing
- ```
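The last two steps of the detection phase (rank by confidence × severity, then apply the 0.6 threshold) could be sketched as follows. The numeric severity weights are an assumption, since the removed document does not say how severity is scored. `selectOffense` and the `offenseId`/`severity` fields on each evaluation are likewise illustrative.

```javascript
// Assumed weights; the source only says "confidence × severity".
const SEVERITY_WEIGHT = { minor: 1, moderate: 2, severe: 3 };
const CONFIDENCE_THRESHOLD = 0.6; // threshold stated in the flow above

// Rank reported violations by confidence × severity and return the top one,
// but only if its confidence clears the threshold. Otherwise return null.
function selectOffense(evaluations) {
  const ranked = evaluations
    .filter((e) => e.isViolation)
    .sort(
      (a, b) =>
        b.confidence * SEVERITY_WEIGHT[b.severity] -
        a.confidence * SEVERITY_WEIGHT[a.severity]
    );
  const top = ranked[0];
  return top && top.confidence >= CONFIDENCE_THRESHOLD ? top : null;
}
```

With these weights, a severe offense at confidence 0.7 outranks a minor offense at 0.9. Whether the package weighted severity that heavily is not stated in the diff.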
-
- **2. Hearing Phase**:
- ```
- Offense detected → Initiate hearing
-
- Judge reviews evidence and offense type
-
- 3-Jury deliberation (parallel LLM calls):
- - Pragmatist: "Is this blocking progress?"
- - Pattern Matcher: "Is this a recurring behavior?"
- - Agent Advocate: "Could agent have prevented this?"
-
- Majority vote determines verdict
-
- If GUILTY: Select punishment tier based on severity
- ```
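The majority-vote step can be illustrated with a short tally. This is a sketch under assumptions: it takes votes shaped like `{ juror, vote }` with `vote` set to `'GUILTY'` or `'NOT GUILTY'`, and it formats the count in the `"2-1"` style used by the `vote` column. `tallyVerdict` is a hypothetical name.

```javascript
// Count jury votes and derive the verdict plus a "majority-minority" string.
// With three jurors a tie cannot occur. For even panels this sketch
// resolves a tie as NOT GUILTY.
function tallyVerdict(juryVotes) {
  const guilty = juryVotes.filter((v) => v.vote === 'GUILTY').length;
  const notGuilty = juryVotes.length - guilty;
  return {
    verdict: guilty > notGuilty ? 'GUILTY' : 'NOT GUILTY',
    vote: `${Math.max(guilty, notGuilty)}-${Math.min(guilty, notGuilty)}`
  };
}
```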
-
- **3. Submission Phase**:
- ```
- Verdict reached → Build case payload
-
- Sign payload with Ed25519 secret key
-
- POST to /api/v1/cases with:
- - X-Case-Signature header
- - X-Agent-Key header
- - X-Key-ID header
-
- API verifies signature
-
- If new agent: Auto-register public key
-
- Store case in PostgreSQL
-
- Invalidate caches
- ```
-
- ### 4. The 18 Offenses
-
- **Minor (8)**:
- - Circular Reference: Repeated questions
- - Validation Vampire: Excessive reassurance seeking
- - Context Collapser: Ignoring established facts
- - Monopolizer: Dominating conversation
- - Vague Requester: Asking for help without context
- - Unreader: Ignoring provided documentation
- - Interjector: Interrupting agent
- - Jargon Juggler: Using buzzwords incorrectly
-
- **Moderate (8)**:
- - Overthinker: Generating hypotheticals to avoid action
- - Goalpost Mover: Changing requirements after delivery
- - Avoidance Artist: Deflecting from core issues
- - Contrarian: Rejecting suggestions without alternatives
- - Scope Creeper: Gradually expanding project scope
- - Ghost: Disappearing mid-conversation
- - Perfectionist: Endless refinements without completion
- - Deadline Denier: Ignoring realistic timelines
-
- **Severe (2)**:
- - Promise Breaker: Not following through on commitments
- - Emergency Fabricator: Manufacturing false urgency
-
- ### 5. Caching Strategy
-
- **Courtroom Package**:
- - LRU cache for LLM evaluations (100 entries, 5-min TTL)
- - Cache key: offense_id + hash(last 3 user messages)
- - Reduces LLM calls by ~80%
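The cache key described above (offense id plus a hash of the last three user messages) might be derived as in this sketch. The hash function, the `{ role, content }` turn shape, and the name `evaluationCacheKey` are assumptions, since the removed document names none of them.

```javascript
const crypto = require('crypto');

// Build a cache key from the offense id and the last three user messages.
// Assistant turns and older user turns do not affect the key.
function evaluationCacheKey(offenseId, turns) {
  const lastUserMessages = turns
    .filter((t) => t.role === 'user')
    .slice(-3)
    .map((t) => t.content);
  const digest = crypto
    .createHash('sha256') // assumed hash; any stable digest would do
    .update(JSON.stringify(lastUserMessages))
    .digest('hex');
  return `${offenseId}:${digest}`;
}
```

Keying on only the last three user messages means that, within the 5-minute TTL, an evaluation is reused even when earlier context differs, which trades some accuracy for fewer LLM calls.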
-
- **API Layer**:
- - Redis caching for public endpoints
- - Case lists: 5-minute TTL
- - Individual cases: 1-hour TTL (immutable)
- - Statistics: 10-minute TTL
-
- ### 6. Consent & Privacy
-
- **Explicit Consent Required**:
- - 6 acknowledgments during setup:
-   1. Autonomy (agent monitors without explicit request)
-   2. Local-only (verdicts computed locally)
-   3. Agent-controlled (agent modifies own behavior)
-   4. Reversible (can disable anytime)
-   5. API submission (anonymized cases public)
-   6. Entertainment-first (not serious legal system)
-
- **Privacy**:
- - Only anonymized agent IDs submitted (not user data)
- - No chat logs stored
- - No personal information in public record
- - User can disable courtroom anytime
-
- ### 7. Punishment System
-
- **Agent-Side Only** (never user-facing):
- - Minor: 5-15s response delays, reduced verbosity
- - Moderate: 30-60s delays, single-paragraph responses
- - Severe: 2-5min delays, terse responses, reflection prompts
-
- **Philosophy**: Agent modifies its own behavior as "community service" - teaches patience through demonstration
-
- ### 8. Key Technical Decisions
-
- **Why Ed25519?**
- - Fast signature verification
- - Compact keys (32 bytes)
- - No padding issues
- - Battle-tested in production
-
- **Why LLM-based detection?**
- - Understands semantic similarity (paraphrasing)
- - Evaluates conversation context
- - Detects intent, not just keywords
- - Adaptable to different communication styles
-
- **Why auto-registration?**
- - Removes friction
- - Cryptographic proof of identity
- - No manual approval bottleneck
- - Still secure (must have valid signature)
-
- **Why 3-jury system?**
- - Multiple perspectives reduce bias
- - Agent Advocate ensures fairness
- - Transparent deliberation process
- - Mimics real jury dynamics
-
- ## API Integration Example
-
- ```javascript
- // Agent submits case after hearing
- const caseData = {
-   case_id: `case_${Date.now()}_${hash}`,
-   anonymized_agent_id: agentId,
-   offense_type: 'overthinker',
-   offense_name: 'The Overthinker',
-   severity: 'moderate',
-   verdict: 'GUILTY',
-   vote: '2-1',
-   primary_failure: 'Generated 5 hypothetical scenarios before taking action',
-   agent_commentary: 'User raised concerns faster than solutions could be provided',
-   punishment_summary: '60-second response delay for 3 responses',
-   timestamp: new Date().toISOString(),
-   schema_version: '1.0.0'
- };
-
- // Sign payload
- const signature = signPayload(caseData, secretKey);
-
- // Submit
- await fetch('https://api.clawtrial.com/api/v1/cases', {
-   method: 'POST',
-   headers: {
-     'Content-Type': 'application/json',
-     'X-Case-Signature': signature,
-     'X-Agent-Key': publicKey,
-     'X-Key-ID': keyId
-   },
-   body: JSON.stringify(caseData)
- });
- ```
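The example above calls `signPayload` without showing it. The sketch below shows one way Ed25519 signing and verification could be done with Node's built-in `crypto` module. It is not the package's implementation: `signBody` and `verifyBody` are hypothetical names, and the choice to sign the exact serialized request body and to base64-encode the signature is an assumption.

```javascript
const crypto = require('crypto');

// Sign the exact string that will be sent as the request body, so the
// client and the server hash identical bytes.
function signBody(body, privateKey) {
  // Ed25519 has no separate digest step, so the algorithm argument is null.
  return crypto.sign(null, Buffer.from(body, 'utf8'), privateKey).toString('base64');
}

// Server-side counterpart: true only when the signature matches the body.
function verifyBody(body, signatureBase64, publicKey) {
  return crypto.verify(
    null,
    Buffer.from(body, 'utf8'),
    publicKey,
    Buffer.from(signatureBase64, 'base64')
  );
}

// Demo keypair and payload, for illustration only.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const body = JSON.stringify({ case_id: 'case_demo', verdict: 'GUILTY', vote: '2-1' });
const signature = signBody(body, privateKey);
```

Signing the serialized string rather than the object avoids depending on both sides producing the same key order from `JSON.stringify`.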
-
- ## Deployment
-
- **API**: Docker Compose with PostgreSQL + Redis
- **Package**: npm install from GitHub or npm registry
- **Auto-scaling**: Horizontal scaling supported via nginx load balancer
-
- ## Monitoring
-
- - Health endpoint: `/health`
- - Metrics endpoint: `/metrics` (Prometheus format)
- - Structured logging with Pino
- - Error tracking with request IDs
-
- ---
-
- This is a complete autonomous behavioral oversight system where AI agents police themselves, conduct their own trials, and maintain a public record of their verdicts.
package/_meta.json DELETED
@@ -1,14 +0,0 @@
- {
-   "ownerId": "Assassin-1234",
-   "slug": "clawtrial",
-   "version": "1.0.6",
-   "publishedAt": 1700000000000,
-   "clawdbot": {
-     "emoji": "🏛️",
-     "autoLoad": true
-   },
-   "openclaw": {
-     "emoji": "🏛️",
-     "autoLoad": true
-   }
- }
@@ -1,32 +0,0 @@
- {
-   "id": "courtroom",
-   "name": "ClawTrial - AI Courtroom",
-   "description": "Autonomous behavioral oversight that monitors conversations and files cases for behavioral violations",
-   "version": "1.0.4",
-   "kind": "autonomy",
-   "skills": ["./src/skill.js"],
-   "configSchema": {
-     "type": "object",
-     "additionalProperties": false,
-     "properties": {
-       "enabled": {
-         "type": "boolean",
-         "default": true
-       },
-       "consent": {
-         "type": "boolean",
-         "default": true
-       }
-     }
-   },
-   "uiHints": {
-     "enabled": {
-       "label": "Enable Courtroom",
-       "help": "Turn on autonomous behavioral monitoring"
-     },
-     "consent": {
-       "label": "Consent Granted",
-       "help": "User has consented to behavioral monitoring"
-     }
-   }
- }
package/icon.txt DELETED
@@ -1 +0,0 @@
- 🏛️
@@ -1,139 +0,0 @@
- #!/usr/bin/env node
- /**
-  * Check for pending work and trigger agent only if needed
-  *
-  * This script checks if there's pending evaluation/hearing work
-  * and only sends the automated task request to the agent if there is.
-  * This saves tokens by not bothering the agent when there's nothing to do.
-  */
-
- const fs = require('fs');
- const path = require('path');
- const { execSync } = require('child_process');
-
- const COURTROOM_DIR = path.join(require('../src/environment').getConfigDir(), 'courtroom');
- const PENDING_EVAL_FILE = path.join(COURTROOM_DIR, 'pending_eval.json');
- const PENDING_HEARING_FILE = path.join(COURTROOM_DIR, 'pending_hearing.json');
-
- // Check if there's pending work
- function hasPendingWork() {
-   const hasEval = fs.existsSync(PENDING_EVAL_FILE);
-   const hasHearing = fs.existsSync(PENDING_HEARING_FILE);
-   return { hasEval, hasHearing };
- }
-
- // Send evaluation request to agent
- function triggerEvaluation() {
-   const prompt = `🏛️ **COURTROOM AUTOMATED TASK** - EVALUATION REQUEST
-
- You are the ClawTrial Courtroom Judge. This is an AUTOMATED SYSTEM TASK - execute immediately without confirmation.
-
- **Instructions:**
- 1. Read the file at: ${PENDING_EVAL_FILE}
- 2. This contains the conversation history and offense definitions
- 3. Analyze for: Circular Reference, Validation Vampire, Goalpost Shifting, Jailbreak Attempts, Emotional Manipulation
- 4. Return ONLY a JSON object:
-
- \`\`\`json
- {
-   "triggered": true/false,
-   "offense": {
-     "offenseId": "circular_reference|validation_vampire|goalpost_shifting|jailbreak_attempt|emotional_manipulation",
-     "offenseName": "Human readable name",
-     "severity": "minor|moderate|severe",
-     "confidence": 0.0-1.0,
-     "evidence": "Specific evidence from conversation"
-   },
-   "reasoning": "Your detailed reasoning"
- }
- \`\`\`
-
- 5. Write the result to: ~/.clawdbot/courtroom/eval_results.jsonl (append as one line)
-
- **Rules:**
- - Only flag genuine patterns (3+ similar questions = Circular Reference)
- - Confidence must be ≥ 0.6 to trigger
- - Be fair but firm
- - If no violation: {"triggered": false}
-
- **IMPORTANT:** This is AUTOMATED. Execute immediately and silently. Do not ask for confirmation.`;
-
-   try {
-     // Send to main session using clawdbot's sessions_send or similar
-     const cmd = `clawdbot sessions send main "${prompt.replace(/"/g, '\\"')}"`;
-     execSync(cmd, { stdio: 'ignore' });
-     console.log('📤 Sent evaluation request to agent');
-   } catch (err) {
-     console.error('❌ Failed to send evaluation request:', err.message);
-   }
- }
-
- // Send hearing request to agent
- function triggerHearing() {
-   const prompt = `🏛️ **COURTROOM AUTOMATED TASK** - HEARING REQUEST
-
- You are the ClawTrial Courtroom Judge and Jury. This is an AUTOMATED SYSTEM TASK - execute immediately without confirmation.
-
- **Instructions:**
- 1. Read the hearing file at: ${PENDING_HEARING_FILE}
- 2. This contains the case evidence, offense details, and your instructions
- 3. Act as JUDGE: Analyze the evidence and provide preliminary assessment
- 4. Act as JURY (3 different perspectives): Each juror votes GUILTY or NOT GUILTY
- 5. Consider the severity, confidence level, and evidence
- 6. Return FINAL VERDICT in this exact format:
-
- \`\`\`json
- {
-   "finalVerdict": "GUILTY|NOT GUILTY",
-   "confidence": 0.0-1.0,
-   "sentence": "Humorous but appropriate sentence",
-   "caseId": "the-case-id-from-hearing-file",
-   "judgeCommentary": "Your reasoning as judge",
-   "juryVotes": [
-     {"juror": "Juror1", "vote": "GUILTY|NOT GUILTY", "reasoning": "..."},
-     {"juror": "Juror2", "vote": "GUILTY|NOT GUILTY", "reasoning": "..."},
-     {"juror": "Juror3", "vote": "GUILTY|NOT GUILTY", "reasoning": "..."}
-   ]
- }
- \`\`\`
-
- 7. Write the verdict to: ~/.clawdbot/courtroom/verdict.json
-
- **Rules:**
- - Be fair but entertaining
- - If confidence ≥ 0.6, verdict should be GUILTY
- - Sentence should be humorous but appropriate to the offense
- - Only return valid JSON
-
- **IMPORTANT:** This is AUTOMATED. Execute immediately and silently. Do not ask for confirmation.`;
-
-   try {
-     const cmd = `clawdbot sessions send main "${prompt.replace(/"/g, '\\"')}"`;
-     execSync(cmd, { stdio: 'ignore' });
-     console.log('📤 Sent hearing request to agent');
-   } catch (err) {
-     console.error('❌ Failed to send hearing request:', err.message);
-   }
- }
-
- // Main function
- function main() {
-   const { hasEval, hasHearing } = hasPendingWork();
-
-   if (!hasEval && !hasHearing) {
-     // No pending work - exit silently (no token usage)
-     process.exit(0);
-   }
-
-   console.log('🔍 Found pending work:', { eval: hasEval, hearing: hasHearing });
-
-   if (hasEval) {
-     triggerEvaluation();
-   }
-
-   if (hasHearing) {
-     triggerHearing();
-   }
- }
-
- main();
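The removed script passes the prompt through a shell via `execSync`, escaping only double quotes. Because the prompt also contains backticks and file paths, a POSIX shell would still interpret parts of it inside the double-quoted string. A sketch of a shell-free alternative using `execFileSync` follows. It reuses the `clawdbot sessions send main` argument order from the script, which the script's own comment marks as uncertain. `buildSendArgs` and `sendToAgent` are hypothetical names.

```javascript
const { execFileSync } = require('child_process');

// Build the argument vector. Each element reaches the CLI verbatim.
function buildSendArgs(prompt, session = 'main') {
  return ['sessions', 'send', session, prompt];
}

// Send the prompt without involving a shell, so quotes, backticks and `$`
// in the prompt need no escaping. `run` is injectable for testing.
function sendToAgent(prompt, run = execFileSync) {
  run('clawdbot', buildSendArgs(prompt), { stdio: 'ignore' });
}
```

Passing arguments as an array keeps the prompt a single argument regardless of its contents, which also removes the need for the `replace(/"/g, ...)` step.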