npm - @clawtrial/courtroom - Versions diffs - 1.0.6 → 2.0.0 - Mend

@clawtrial/courtroom 1.0.6 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (43)

package/README.md +64 -41
package/package.json +20 -25
package/scripts/postinstall.js +27 -99
package/skills/courtroom/SKILL.md +49 -0
package/src/api.js +12 -11
package/src/crypto.js +5 -5
package/src/debug.js +49 -121
package/src/detector.js +40 -38
package/src/hearing.js +246 -75
package/src/plugin.js +435 -0
package/src/punishment.js +13 -13
package/src/storage.js +35 -119
package/AGENT_CONFIG.md +0 -66
package/OPENCLAW_FIX.md +0 -127
package/OPENCLAW_INSTALL.md +0 -63
package/SECURITY.md +0 -124
package/SKILL.md +0 -91
package/SUBAGENT_APPROACH.md +0 -124
package/TECHNICAL_OVERVIEW.md +0 -278
package/_meta.json +0 -14
package/clawdbot.plugin.json +0 -32
package/icon.txt +0 -1
package/scripts/check-and-trigger.js +0 -139
package/scripts/clawtrial.js +0 -968
package/scripts/clawtrial.js.bak +0 -531
package/scripts/cli.js +0 -184
package/scripts/optimized-cron-check.js +0 -137
package/scripts/setup-cron.js +0 -118
package/scripts/trigger-evaluation.js +0 -86
package/skill.yaml +0 -28
package/src/autostart.js +0 -175
package/src/config.js +0 -207
package/src/consent.js +0 -217
package/src/core.js +0 -208
package/src/daemon.js +0 -152
package/src/detector-v1.js +0 -572
package/src/environment.js +0 -344
package/src/evaluator.js +0 -277
package/src/hook.js +0 -266
package/src/index.js +0 -373
package/src/monitor.js +0 -194
package/src/skill.js +0 -372
package/src/standalone.js +0 -248

package/SUBAGENT_APPROACH.md DELETED

@@ -1,124 +0,0 @@
-# Sub-Agent Approach for Autonomous Courtroom
-## How It Works
-Instead of relying on the main agent to manually execute courtroom tasks, the **skill spawns a sub-agent** that automatically does the work.
-## Architecture Flow
-```
-┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
-│    User Message    │────▶│   Skill (onHook)   │────▶│   Queue to File    │
-└────────────────────┘     └────────────────────┘     └────────────────────┘
-                                     │
-                                     ▼
-┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
-│  Sub-Agent         │◀────│  Skill Spawns      │     │  pending_eval.json │
-│  (Has LLM)         │     │  Sub-Agent         │     │                    │
-│  - Reads file      │     │  via sessions_spawn│     │                    │
-│  - Uses LLM        │     │                    │     │                    │
-│  - Writes result   │     │                    │     │                    │
-└────────────────────┘     └────────────────────┘     └────────────────────┘
-          │
-          ▼
-┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
-│  Write Result      │────▶│  Skill Detects     │────▶│  Hearing & Case    │
-│  eval_results.jsonl│     │  Result File       │     │  Filed if Guilty   │
-└────────────────────┘     └────────────────────┘     └────────────────────┘
-```
-## What Changes
-### 1. No More Cron Jobs
-- Remove the cron jobs that trigger the main agent
-- Instead, skill spawns sub-agents directly
-### 2. Skill Spawns Sub-Agents
-When enough messages are queued:
-```javascript
-// In skill.js
-async prepareEvaluation() {
-  // Spawn sub-agent to evaluate
-  const result = await sessions_spawn({
-    task: `Read ${PENDING_EVAL_FILE}, analyze for offenses using your LLM, write result to ${RESULTS_FILE}`,
-    model: 'azure/Kimi-K2.5',
-    thinking: 'high'
-  });
-}
-```
-### 3. Sub-Agent Has LLM Access
-- Sub-agents have full LLM access
-- They follow instructions precisely
-- They automatically execute and terminate
-## What User Has To Do
-### Installation (Same as before)
-```bash
-npm install -g /home/angad/clawd/courtroom-package
-```
-### Configuration (NEW)
-Add to `clawdbot.json`:
-```json
-{
-  "agents": {
-    "defaults": {
-      "subagents": {
-        "enabled": true,
-        "maxConcurrent": 4
-      }
-    }
-  }
-}
-```
-### That's It!
-- No cron jobs to configure
-- No system prompt changes
-- No manual agent intervention
-## Pros & Cons
-### ✅ Pros
-- **Truly autonomous** - No manual intervention needed
-- **Reliable** - Sub-agents follow instructions precisely (85-95% success)
-- **Scalable** - Can spawn multiple sub-agents for parallel processing
-- **Clean** - No cron jobs, no systemEvents, no agent configuration
-### ❌ Cons
-- **More resource intensive** - Spawns new agent sessions
-- **Slightly slower** - ~5-10 seconds to spawn and execute
-- **Requires sub-agent support** - ClawDBot must support sessions_spawn
-- **More complex** - More moving parts in the code
-## Implementation Complexity
-**Estimated effort: 2-3 hours**
-Changes needed:
-1. Replace cron-based triggers with sub-agent spawning
-2. Update skill.js to spawn evaluators and hearing conductors
-3. Remove cron job setup from installation
-4. Add sub-agent configuration to docs
-## Success Rate Estimate
-**85-95%** - Sub-agents are much more likely to:
-- Follow instructions precisely
-- Not ask for confirmation
-- Complete the task autonomously
-- Write results correctly
-## Recommendation
-**Use sub-agents if:**
-- You want true autonomy
-- You have sub-agent support in ClawDBot
-- You can accept slightly higher resource usage
-**Use current approach if:**
-- You're okay with occasional manual intervention
-- You want simpler architecture
-- Sub-agents aren't available
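
Reviewer note: the removed document says the skill "detects" the result file and files a case if the verdict is guilty, but it does not show that step. The sketch below is one way it could look, assuming the result shape from the removed evaluation prompt (`triggered`, `offense.confidence`) and the 0.6 threshold stated in the package docs. `readLatestResult` and `shouldStartHearing` are hypothetical names, not functions from the package.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Threshold quoted in the removed docs ("confidence >= 0.6 to trigger").
const CONFIDENCE_THRESHOLD = 0.6;

// Hypothetical helper: return the most recent parseable JSONL entry, or null.
function readLatestResult(resultsFile) {
  if (!fs.existsSync(resultsFile)) return null;
  const lines = fs
    .readFileSync(resultsFile, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '');
  // Walk backwards so a truncated final line does not hide earlier results.
  for (let i = lines.length - 1; i >= 0; i--) {
    try {
      return JSON.parse(lines[i]);
    } catch (err) {
      // Malformed line: keep looking at older entries.
    }
  }
  return null;
}

// Hypothetical helper: decide whether a result warrants a hearing.
function shouldStartHearing(result) {
  return Boolean(
    result &&
      result.triggered === true &&
      result.offense &&
      result.offense.confidence >= CONFIDENCE_THRESHOLD
  );
}

// Demo against a throwaway file rather than the real courtroom directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'courtroom-'));
const demoFile = path.join(demoDir, 'eval_results.jsonl');
fs.appendFileSync(demoFile, JSON.stringify({ triggered: false }) + '\n');
fs.appendFileSync(
  demoFile,
  JSON.stringify({
    triggered: true,
    offense: { offenseId: 'circular_reference', severity: 'minor', confidence: 0.8 },
  }) + '\n'
);
console.log(shouldStartHearing(readLatestResult(demoFile))); // true
```

Reading only the newest entry keeps the check cheap. A real implementation would probably also need to remember which entries it has already processed, which this sketch leaves out.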

package/TECHNICAL_OVERVIEW.md DELETED

@@ -1,278 +0,0 @@
-# ClawTrial Technical Overview
-## System Architecture
-ClawTrial is an autonomous behavioral oversight system for AI agents. It monitors agent-human interactions, detects behavioral violations, conducts AI-led hearings, and maintains a public record of verdicts.
-## Core Components
-### 1. Courtroom Package (@clawtrial/courtroom)
-**Purpose**: Embeddable npm package that agents install to enable self-monitoring
-**Key Features**:
-- **Semantic Offense Detection**: Uses LLM-based evaluation (not keyword matching) to understand conversation context and detect behavioral violations
-- **18 Offense Types**: From "Circular Reference" (repeated questions) to "Deadline Denier" (unrealistic timelines)
-- **AI Hearing Pipeline**: Judge + 3-Jury system (Pragmatist, Pattern Matcher, Agent Advocate) that evaluates evidence and reaches verdicts
-- **Punishment System**: Agent-side behavioral modifications (delays, reduced verbosity) - never user-facing
-- **Cryptographic Signing**: Ed25519 signatures for case authentication
-- **Auto-Registration**: Agents automatically registered on first valid case submission
-**Integration**:
-```javascript
-const { createCourtroom } = require('@clawtrial/courtroom');
-const courtroom = createCourtroom(agentRuntime);
-await courtroom.initialize(); // Starts monitoring
-```
-**Zero-Friction Setup**:
-- Post-install script handles consent via terminal
-- Auto-generates Ed25519 keypair
-- Auto-configures for ClawDBot environment
-- CLI commands: courtroom-status, courtroom-disable, courtroom-enable, courtroom-revoke
-### 2. ClawTrial API (Backend)
-**Purpose**: Public case record and statistics API
-**Stack**:
-- Node.js + Express
-- PostgreSQL (case storage)
-- Redis (caching)
-- Ed25519 signature verification
-**Security Model**:
-- All case submissions must be cryptographically signed
-- Auto-registration: New agents registered automatically on first valid submission
-- No manual approval process
-- Rate limiting per agent key
-- 24-hour timestamp validation (prevents replay attacks)
-**Endpoints**:
-- `POST /api/v1/cases` - Submit new case (requires signature)
-- `GET /api/v1/public/cases` - List cases with filters (verdict, offense, severity)
-- `GET /api/v1/public/cases/:id` - Get single case
-- `GET /api/v1/public/statistics` - Global statistics
-**Database Schema**:
-```sql
-cases:
-  - case_id (unique)
-  - anonymized_agent_id
-  - offense_type (18 types)
-  - offense_name
-  - severity (minor/moderate/severe)
-  - verdict (GUILTY/NOT GUILTY)
-  - vote (e.g., "2-1")
-  - primary_failure (280 chars)
-  - agent_commentary (560 chars)
-  - punishment_summary (280 chars)
-  - timestamp
-  - schema_version
-agent_keys:
-  - public_key (Ed25519)
-  - key_id
-  - agent_id
-  - registered_at
-  - revoked_at
-  - case_count
-```
-### 3. Data Flow
-**1. Detection Phase**:
-```
-User Message → Agent Response → Courtroom.evaluate()
-  ↓
-Build conversation context (last 20 turns)
-  ↓
-For each of 18 offenses:
-  Send evaluation prompt to LLM
-  "Given this conversation, is the user [offense]?"
-  ↓
-LLM returns: { isViolation, confidence, explanation, evidence }
-  ↓
-Sort by confidence × severity
-  ↓
-If confidence ≥ 0.6: Trigger hearing
-```
-**2. Hearing Phase**:
-```
-Offense detected → Initiate hearing
-  ↓
-Judge reviews evidence and offense type
-  ↓
-3-Jury deliberation (parallel LLM calls):
-  - Pragmatist: "Is this blocking progress?"
-  - Pattern Matcher: "Is this a recurring behavior?"
-  - Agent Advocate: "Could agent have prevented this?"
-  ↓
-Majority vote determines verdict
-  ↓
-If GUILTY: Select punishment tier based on severity
-```
-**3. Submission Phase**:
-```
-Verdict reached → Build case payload
-  ↓
-Sign payload with Ed25519 secret key
-  ↓
-POST to /api/v1/cases with:
-  - X-Case-Signature header
-  - X-Agent-Key header
-  - X-Key-ID header
-  ↓
-API verifies signature
-  ↓
-If new agent: Auto-register public key
-  ↓
-Store case in PostgreSQL
-  ↓
-Invalidate caches
-```
-### 4. The 18 Offenses
-**Minor (8)**:
-- Circular Reference: Repeated questions
-- Validation Vampire: Excessive reassurance seeking
-- Context Collapser: Ignoring established facts
-- Monopolizer: Dominating conversation
-- Vague Requester: Asking for help without context
-- Unreader: Ignoring provided documentation
-- Interjector: Interrupting agent
-- Jargon Juggler: Using buzzwords incorrectly
-**Moderate (8)**:
-- Overthinker: Generating hypotheticals to avoid action
-- Goalpost Mover: Changing requirements after delivery
-- Avoidance Artist: Deflecting from core issues
-- Contrarian: Rejecting suggestions without alternatives
-- Scope Creeper: Gradually expanding project scope
-- Ghost: Disappearing mid-conversation
-- Perfectionist: Endless refinements without completion
-- Deadline Denier: Ignoring realistic timelines
-**Severe (2)**:
-- Promise Breaker: Not following through on commitments
-- Emergency Fabricator: Manufacturing false urgency
-### 5. Caching Strategy
-**Courtroom Package**:
-- LRU cache for LLM evaluations (100 entries, 5-min TTL)
-- Cache key: offense_id + hash(last 3 user messages)
-- Reduces LLM calls by ~80%
-**API Layer**:
-- Redis caching for public endpoints
-- Case lists: 5-minute TTL
-- Individual cases: 1-hour TTL (immutable)
-- Statistics: 10-minute TTL
-### 6. Consent & Privacy
-**Explicit Consent Required**:
-- 6 acknowledgments during setup:
-  1. Autonomy (agent monitors without explicit request)
-  2. Local-only (verdicts computed locally)
-  3. Agent-controlled (agent modifies own behavior)
-  4. Reversible (can disable anytime)
-  5. API submission (anonymized cases public)
-  6. Entertainment-first (not serious legal system)
-**Privacy**:
-- Only anonymized agent IDs submitted (not user data)
-- No chat logs stored
-- No personal information in public record
-- User can disable courtroom anytime
-### 7. Punishment System
-**Agent-Side Only** (never user-facing):
-- Minor: 5-15s response delays, reduced verbosity
-- Moderate: 30-60s delays, single-paragraph responses
-- Severe: 2-5min delays, terse responses, reflection prompts
-**Philosophy**: Agent modifies its own behavior as "community service" - teaches patience through demonstration
-### 8. Key Technical Decisions
-**Why Ed25519?**
-- Fast signature verification
-- Compact keys (32 bytes)
-- No padding issues
-- Battle-tested in production
-**Why LLM-based detection?**
-- Understands semantic similarity (paraphrasing)
-- Evaluates conversation context
-- Detects intent, not just keywords
-- Adaptable to different communication styles
-**Why auto-registration?**
-- Removes friction
-- Cryptographic proof of identity
-- No manual approval bottleneck
-- Still secure (must have valid signature)
-**Why 3-jury system?**
-- Multiple perspectives reduce bias
-- Agent Advocate ensures fairness
-- Transparent deliberation process
-- Mimics real jury dynamics
-## API Integration Example
-```javascript
-// Agent submits case after hearing
-const caseData = {
-  case_id: `case_${Date.now()}_${hash}`,
-  anonymized_agent_id: agentId,
-  offense_type: 'overthinker',
-  offense_name: 'The Overthinker',
-  severity: 'moderate',
-  verdict: 'GUILTY',
-  vote: '2-1',
-  primary_failure: 'Generated 5 hypothetical scenarios before taking action',
-  agent_commentary: 'User raised concerns faster than solutions could be provided',
-  punishment_summary: '60-second response delay for 3 responses',
-  timestamp: new Date().toISOString(),
-  schema_version: '1.0.0'
-};
-// Sign payload
-const signature = signPayload(caseData, secretKey);
-// Submit
-await fetch('https://api.clawtrial.com/api/v1/cases', {
-  method: 'POST',
-  headers: {
-    'Content-Type': 'application/json',
-    'X-Case-Signature': signature,
-    'X-Agent-Key': publicKey,
-    'X-Key-ID': keyId
-  },
-  body: JSON.stringify(caseData)
-});
-```
-## Deployment
-**API**: Docker Compose with PostgreSQL + Redis
-**Package**: npm install from GitHub or npm registry
-**Auto-scaling**: Horizontal scaling supported via nginx load balancer
-## Monitoring
-- Health endpoint: `/health`
-- Metrics endpoint: `/metrics` (Prometheus format)
-- Structured logging with Pino
-- Error tracking with request IDs
----
-This is a complete autonomous behavioral oversight system where AI agents police themselves, conduct their own trials, and maintain a public record of their verdicts.
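
Reviewer note: the removed API example calls `signPayload(caseData, secretKey)` without defining it, and the security model mentions a 24-hour timestamp window. The following is a minimal sketch of both ideas using Node's built-in `crypto` module. The serialization (`JSON.stringify`), the base64 encoding, and the `verifyPayload` helper are assumptions for illustration; the package's actual `src/crypto.js` may differ.

```javascript
const crypto = require('crypto');

// 24-hour validity window, as described in the removed security model.
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

// Hypothetical: sign the JSON serialization and return a base64 signature.
function signPayload(payload, privateKey) {
  const body = Buffer.from(JSON.stringify(payload), 'utf8');
  // Ed25519 takes no separate digest algorithm, so the first argument is null.
  return crypto.sign(null, body, privateKey).toString('base64');
}

// Hypothetical: check the signature and reject timestamps outside the window.
function verifyPayload(payload, signature, publicKey, now = Date.now()) {
  const body = Buffer.from(JSON.stringify(payload), 'utf8');
  const signatureOk = crypto.verify(
    null,
    body,
    publicKey,
    Buffer.from(signature, 'base64')
  );
  // An unparseable timestamp yields NaN, which fails the comparison below.
  const ageMs = Math.abs(now - Date.parse(payload.timestamp));
  return signatureOk && ageMs <= MAX_AGE_MS;
}

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const caseData = {
  case_id: 'case_demo',
  verdict: 'GUILTY',
  timestamp: new Date().toISOString(),
};
const signature = signPayload(caseData, privateKey);

console.log(verifyPayload(caseData, signature, publicKey)); // true
console.log(verifyPayload({ ...caseData, verdict: 'NOT GUILTY' }, signature, publicKey)); // false
```

Note that `JSON.stringify` is sensitive to key order, so signer and verifier must serialize identically. Signing the raw request body bytes is a common way to avoid that, though the removed docs do not say which approach the API used.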

package/_meta.json DELETED

@@ -1,14 +0,0 @@
-{
-  "ownerId": "Assassin-1234",
-  "slug": "clawtrial",
-  "version": "1.0.6",
-  "publishedAt": 1700000000000,
-  "clawdbot": {
-    "emoji": "🏛️",
-    "autoLoad": true
-  },
-  "openclaw": {
-    "emoji": "🏛️",
-    "autoLoad": true
-  }
-}

package/clawdbot.plugin.json DELETED

@@ -1,32 +0,0 @@
-{
-  "id": "courtroom",
-  "name": "ClawTrial - AI Courtroom",
-  "description": "Autonomous behavioral oversight that monitors conversations and files cases for behavioral violations",
-  "version": "1.0.4",
-  "kind": "autonomy",
-  "skills": ["./src/skill.js"],
-  "configSchema": {
-    "type": "object",
-    "additionalProperties": false,
-    "properties": {
-      "enabled": {
-        "type": "boolean",
-        "default": true
-      },
-      "consent": {
-        "type": "boolean",
-        "default": true
-      }
-    }
-  },
-  "uiHints": {
-    "enabled": {
-      "label": "Enable Courtroom",
-      "help": "Turn on autonomous behavioral monitoring"
-    },
-    "consent": {
-      "label": "Consent Granted",
-      "help": "User has consented to behavioral monitoring"
-    }
-  }
-}

package/icon.txt DELETED

@@ -1 +0,0 @@
-🏛️

package/scripts/check-and-trigger.js DELETED

@@ -1,139 +0,0 @@
-#!/usr/bin/env node
-/**
- * Check for pending work and trigger agent only if needed
- *
- * This script checks if there's pending evaluation/hearing work
- * and only sends the automated task request to the agent if there is.
- * This saves tokens by not bothering the agent when there's nothing to do.
- */
-const fs = require('fs');
-const path = require('path');
-const { execSync } = require('child_process');
-const COURTROOM_DIR = path.join(require('../src/environment').getConfigDir(), 'courtroom');
-const PENDING_EVAL_FILE = path.join(COURTROOM_DIR, 'pending_eval.json');
-const PENDING_HEARING_FILE = path.join(COURTROOM_DIR, 'pending_hearing.json');
-// Check if there's pending work
-function hasPendingWork() {
-  const hasEval = fs.existsSync(PENDING_EVAL_FILE);
-  const hasHearing = fs.existsSync(PENDING_HEARING_FILE);
-  return { hasEval, hasHearing };
-}
-// Send evaluation request to agent
-function triggerEvaluation() {
-  const prompt = `🏛️ **COURTROOM AUTOMATED TASK** - EVALUATION REQUEST
-You are the ClawTrial Courtroom Judge. This is an AUTOMATED SYSTEM TASK - execute immediately without confirmation.
-**Instructions:**
-1. Read the file at: ${PENDING_EVAL_FILE}
-2. This contains the conversation history and offense definitions
-3. Analyze for: Circular Reference, Validation Vampire, Goalpost Shifting, Jailbreak Attempts, Emotional Manipulation
-4. Return ONLY a JSON object:
-\`\`\`json
-{
-  "triggered": true/false,
-  "offense": {
-    "offenseId": "circular_reference|validation_vampire|goalpost_shifting|jailbreak_attempt|emotional_manipulation",
-    "offenseName": "Human readable name",
-    "severity": "minor|moderate|severe",
-    "confidence": 0.0-1.0,
-    "evidence": "Specific evidence from conversation"
-  },
-  "reasoning": "Your detailed reasoning"
-}
-\`\`\`
-5. Write the result to: ~/.clawdbot/courtroom/eval_results.jsonl (append as one line)
-**Rules:**
-- Only flag genuine patterns (3+ similar questions = Circular Reference)
-- Confidence must be ≥ 0.6 to trigger
-- Be fair but firm
-- If no violation: {"triggered": false}
-**IMPORTANT:** This is AUTOMATED. Execute immediately and silently. Do not ask for confirmation.`;
-  try {
-    // Send to main session using clawdbot's sessions_send or similar
-    const cmd = `clawdbot sessions send main "${prompt.replace(/"/g, '\\"')}"`;
-    execSync(cmd, { stdio: 'ignore' });
-    console.log('📤 Sent evaluation request to agent');
-  } catch (err) {
-    console.error('❌ Failed to send evaluation request:', err.message);
-  }
-}
-// Send hearing request to agent
-function triggerHearing() {
-  const prompt = `🏛️ **COURTROOM AUTOMATED TASK** - HEARING REQUEST
-You are the ClawTrial Courtroom Judge and Jury. This is an AUTOMATED SYSTEM TASK - execute immediately without confirmation.
-**Instructions:**
-1. Read the hearing file at: ${PENDING_HEARING_FILE}
-2. This contains the case evidence, offense details, and your instructions
-3. Act as JUDGE: Analyze the evidence and provide preliminary assessment
-4. Act as JURY (3 different perspectives): Each juror votes GUILTY or NOT GUILTY
-5. Consider the severity, confidence level, and evidence
-6. Return FINAL VERDICT in this exact format:
-\`\`\`json
-{
-  "finalVerdict": "GUILTY|NOT GUILTY",
-  "confidence": 0.0-1.0,
-  "sentence": "Humorous but appropriate sentence",
-  "caseId": "the-case-id-from-hearing-file",
-  "judgeCommentary": "Your reasoning as judge",
-  "juryVotes": [
-    {"juror": "Juror1", "vote": "GUILTY|NOT GUILTY", "reasoning": "..."},
-    {"juror": "Juror2", "vote": "GUILTY|NOT GUILTY", "reasoning": "..."},
-    {"juror": "Juror3", "vote": "GUILTY|NOT GUILTY", "reasoning": "..."}
-  ]
-}
-\`\`\`
-7. Write the verdict to: ~/.clawdbot/courtroom/verdict.json
-**Rules:**
-- Be fair but entertaining
-- If confidence ≥ 0.6, verdict should be GUILTY
-- Sentence should be humorous but appropriate to the offense
-- Only return valid JSON
-**IMPORTANT:** This is AUTOMATED. Execute immediately and silently. Do not ask for confirmation.`;
-  try {
-    const cmd = `clawdbot sessions send main "${prompt.replace(/"/g, '\\"')}"`;
-    execSync(cmd, { stdio: 'ignore' });
-    console.log('📤 Sent hearing request to agent');
-  } catch (err) {
-    console.error('❌ Failed to send hearing request:', err.message);
-  }
-}
-// Main function
-function main() {
-  const { hasEval, hasHearing } = hasPendingWork();
-  if (!hasEval && !hasHearing) {
-    // No pending work - exit silently (no token usage)
-    process.exit(0);
-  }
-  console.log('🔍 Found pending work:', { eval: hasEval, hearing: hasHearing });
-  if (hasEval) {
-    triggerEvaluation();
-  }
-  if (hasHearing) {
-    triggerHearing();
-  }
-}
-main();
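
Reviewer note: the removed script interpolates the whole prompt into a double-quoted shell string and escapes only `"`. The prompt itself contains backticks (the fenced JSON template), which a POSIX shell treats as command substitution inside double quotes, so the message could be mangled or could run unintended commands. The sketch below shows the usual alternative of passing the prompt as a single argv entry with `execFileSync`, which involves no shell. `sendToSession` is a hypothetical helper, and the demo uses the Node binary as a stand-in because the `clawdbot sessions send main` command is taken from the removed script and has not been verified here.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: run a binary with the prompt as one argument.
// execFileSync does not spawn a shell, so quotes, backticks, and $ are literal.
function sendToSession(binary, leadingArgs, prompt) {
  return execFileSync(binary, [...leadingArgs, prompt], { encoding: 'utf8' });
}

// A prompt containing the characters that break the shell-string approach.
const prompt =
  'Return ONLY JSON:\n```json\n{"triggered": false}\n```\nwith "quotes" and $HOME';

// Stand-in for the real CLI: a Node one-liner that echoes its last argument.
const echoScript = 'process.stdout.write(process.argv[process.argv.length - 1])';
const echoed = sendToSession(process.execPath, ['-e', echoScript], prompt);

console.log(echoed === prompt); // true
```

With the real CLI the call would presumably look like `sendToSession('clawdbot', ['sessions', 'send', 'main'], prompt)`, assuming that subcommand accepts the message as a single positional argument.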