aihype 0.1.0
- checksums.yaml +7 -0
- data/README.md +452 -0
- data/bin/aihype +10 -0
- data/bin/mock-claude +77 -0
- data/lib/aihype/ai_matcher.rb +144 -0
- data/lib/aihype/approval_prompt.rb +63 -0
- data/lib/aihype/blacklist.rb +65 -0
- data/lib/aihype/blacklist_rule.rb +43 -0
- data/lib/aihype/cli.rb +170 -0
- data/lib/aihype/core.rb +130 -0
- data/lib/aihype/env.rb +79 -0
- data/lib/aihype/log_entry.rb +43 -0
- data/lib/aihype/logger.rb +101 -0
- data/lib/aihype/memory.rb +131 -0
- data/lib/aihype/memory_file.rb +37 -0
- data/lib/aihype/model_selector.rb +66 -0
- data/lib/aihype/pty_controller.rb +180 -0
- data/lib/aihype/rate_limiter.rb +56 -0
- data/lib/aihype/version.rb +5 -0
- data/lib/aihype.rb +29 -0
- data/lib/defaults.rb +23 -0
- metadata +96 -0
checksums.yaml
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
---
|
|
2
|
+
SHA256:
|
|
3
|
+
metadata.gz: 7cc192201918569a5ff5128801c64cee31c2efc7117ce270e26d0b92026d01a1
|
|
4
|
+
data.tar.gz: 43f1de094be13bc2b3cc10bf455f8584caec0031537418030d0dc5b4a89bdcd2
|
|
5
|
+
SHA512:
|
|
6
|
+
metadata.gz: dd9ba38ee3b7c017336d59914d9819e9c4821d535af474de86a5198011272397b23922bd74757c9f7a9176c7a10daceac4afe7fe8d1cc4e25ca223acd30ed8e5
|
|
7
|
+
data.tar.gz: cd8c74bf746ba393ebdc4ab9c9ca4aea7f7929b1d182163f33ec9af092432129fa5ac1735bbd6d7bce1fcc8b289a63ef17b1ba39948d49a92b1218d65e495487
|
data/README.md
ADDED
|
@@ -0,0 +1,452 @@
|
|
|
1
|
+
# AIHype
|
|
2
|
+
|
|
3
|
+
Like `yes | ai` - auto-approve AI agent prompts with blacklist protection.
|
|
4
|
+
|
|
5
|
+
## TL;DR
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
# Demo: Bash blocks on read, aihype auto-answers, proceeds
|
|
9
|
+
aihype bash -c 'read -p "Deploy? (y/n) " x && echo "→ $x"'
|
|
10
|
+
# Without aihype: hangs forever waiting for input
|
|
11
|
+
# With aihype: auto-answers "yes", continues execution
|
|
12
|
+
```
|
|
13
|
+
|
|
14
|
+
## What It Does
|
|
15
|
+
|
|
16
|
+
AIHype wraps command-line tools that ask for approval. It spawns them via a PTY and automatically responds "yes" to prompts like "Do you want to proceed? (y/n)". Prompts that the AI-powered evaluation matches against a blacklist rule get "no" instead.
|
|
17
|
+
|
|
18
|
+
Perfect for:
|
|
19
|
+
- Automating interactive CLIs without modification
|
|
20
|
+
- CI/CD pipelines with confirmation prompts
|
|
21
|
+
- Controlling AI assistants (Claude, Gemini, etc.)
|
|
22
|
+
- Protecting against dangerous operations
|
|
23
|
+
|
|
24
|
+
---
|
|
25
|
+
|
|
26
|
+
## Integration Tests (Real-World Verification)
|
|
27
|
+
|
|
28
|
+
**IMPORTANT**: We provide real integration tests that verify aihype works with the actual Claude CLI, not just mocks.
|
|
29
|
+
|
|
30
|
+
### Running Integration Tests
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
# Set your API key
|
|
34
|
+
export ANTHROPIC_API_KEY=sk-ant-...
|
|
35
|
+
|
|
36
|
+
# Run all integration tests (2-3 minutes, costs apply)
|
|
37
|
+
./samples/00-verify-all.sh
|
|
38
|
+
|
|
39
|
+
# Or run individual tests:
|
|
40
|
+
./samples/01-api-key-approval.sh # Basic Claude integration
|
|
41
|
+
./samples/02-deployment-confirmation.sh # Multi-step workflow
|
|
42
|
+
./samples/03-blacklist-denial.sh # Security (CRITICAL)
|
|
43
|
+
./samples/04-multi-prompt-sequence.sh # Long output streaming
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
### What These Tests Verify
|
|
47
|
+
|
|
48
|
+
✅ **Real Claude CLI integration** - Actual Claude process via API (tests 1, 2, 4)
|
|
49
|
+
✅ **PTY process spawning and control** - Full process lifecycle management
|
|
50
|
+
✅ **Output streaming** - Multi-paragraph responses flow correctly through PTY
|
|
51
|
+
✅ **Interactive prompt handling** - Mock-claude tests y/n prompts (test 3)
|
|
52
|
+
✅ **Blacklist security** - Dangerous prompts are denied, safe ones approved (CRITICAL)
|
|
53
|
+
✅ **Claude --print mode** - Non-interactive automation mode (real-world CI/CD usage)
|
|
54
|
+
|
|
55
|
+
**These tests are your "gut check"** - if they pass, aihype works in the real world.
|
|
56
|
+
|
|
57
|
+
### Test Requirements
|
|
58
|
+
|
|
59
|
+
- **ANTHROPIC_API_KEY**: Must be set (tests hit real API)
|
|
60
|
+
- **claude CLI**: Must be installed (`npm install -g @anthropic-ai/claude-code`)
|
|
61
|
+
- **Reasonable timeouts**: 30-60s per test (API latency)
|
|
62
|
+
- **Conservative rate limits**: 5s delay between tests
|
|
63
|
+
- **Real costs**: API calls cost money (minimal, but real)
|
|
64
|
+
|
|
65
|
+
### Understanding Test Output
|
|
66
|
+
|
|
67
|
+
Each test shows:
|
|
68
|
+
- ✓ Prerequisites verified
|
|
69
|
+
- Expected behavior description
|
|
70
|
+
- Actual command output with [AIHype] logging
|
|
71
|
+
- Success/failure with detailed diagnosis
|
|
72
|
+
|
|
73
|
+
**If a test fails**, it tells you exactly what went wrong:
|
|
74
|
+
- Process spawning issues?
|
|
75
|
+
- Output streaming problems?
|
|
76
|
+
- Blacklist not working?
|
|
77
|
+
- API call failed?
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## How It Works
|
|
82
|
+
|
|
83
|
+
**Process Control via PTY:**
|
|
84
|
+
```
|
|
85
|
+
aihype <command> [args]
|
|
86
|
+
↓
|
|
87
|
+
Spawns command as child process via PTY
|
|
88
|
+
↓
|
|
89
|
+
Reads child's output, detects prompts
|
|
90
|
+
↓
|
|
91
|
+
Evaluates against blacklist via AI
|
|
92
|
+
↓
|
|
93
|
+
Writes "yes" or "no" to child's stdin
|
|
94
|
+
↓
|
|
95
|
+
Child continues with response
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
**NOT a pipe filter** - aihype actually controls the child process.
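
The flow above can be sketched with Ruby's standard `pty` library. This is a minimal illustration, not the gem's `pty_controller.rb`: the `prompt?` helper, its regular expression, and `run_with_auto_answer` are hypothetical names, and the blacklist check is left out, so every detected prompt receives the same answer.

```ruby
require 'pty'

# Hypothetical pattern: treat output ending in "(y/n)" or "(yes/no)" as a prompt.
PROMPT_PATTERN = %r{\((?:y/n|yes/no)\)\s*\z}i

def prompt?(text)
  PROMPT_PATTERN.match?(text.strip)
end

# Spawn a command on a pseudo-terminal, answer each detected prompt,
# and return everything the child printed.
def run_with_auto_answer(*command, answer: 'yes')
  transcript = +''
  PTY.spawn(*command) do |reader, writer, pid|
    pending = +''
    begin
      loop do
        chunk = reader.readpartial(4096)
        transcript << chunk
        pending << chunk
        next unless prompt?(pending)

        writer.puts(answer) # reaches the child's stdin through the PTY
        pending.clear
      end
    rescue EOFError, Errno::EIO
      # The child closed its side of the PTY (Linux reports this as EIO).
    ensure
      begin
        Process.wait(pid)
      rescue Errno::ECHILD
        # The child was already reaped.
      end
    end
  end
  transcript
end
```

Under these assumptions, `run_with_auto_answer('bash', '-c', 'read -p "Deploy? (y/n) " x && echo "→ $x"')` would correspond to the TL;DR demo. A real controller would also consult the blacklist before choosing between "yes" and "no".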
|
|
99
|
+
|
|
100
|
+
## Installation
|
|
101
|
+
|
|
102
|
+
```bash
|
|
103
|
+
bundle install
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
## Quick Start
|
|
107
|
+
|
|
108
|
+
```bash
|
|
109
|
+
# Initialize configuration
|
|
110
|
+
./bin/aihype init
|
|
111
|
+
|
|
112
|
+
# Run any interactive tool through aihype
|
|
113
|
+
./bin/aihype your-tool --with-args
|
|
114
|
+
|
|
115
|
+
# Set your Anthropic API key for AI-powered blacklist matching
|
|
116
|
+
export ANTHROPIC_API_KEY=your_key_here
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
## Real Examples
|
|
120
|
+
|
|
121
|
+
### Control Mock Claude
|
|
122
|
+
|
|
123
|
+
```bash
|
|
124
|
+
# Create a test config
|
|
125
|
+
cat > test.yml <<'EOF'
|
|
126
|
+
sequence:
|
|
127
|
+
- type: output
|
|
128
|
+
text: "Starting deployment..."
|
|
129
|
+
- type: prompt
|
|
130
|
+
text: "Do you want to proceed? (y/n)"
|
|
131
|
+
expect: ["y", "yes", "n", "no"]
|
|
132
|
+
if_yes: "Proceeding..."
|
|
133
|
+
if_no: "Cancelled."
|
|
134
|
+
- type: output
|
|
135
|
+
text: "Done!"
|
|
136
|
+
EOF
|
|
137
|
+
|
|
138
|
+
# Run with aihype
|
|
139
|
+
./bin/aihype ./bin/mock-claude test.yml
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
Output:
|
|
143
|
+
```
|
|
144
|
+
Starting deployment...
|
|
145
|
+
Do you want to proceed? (y/n)
|
|
146
|
+
yes ← Auto-answered!
|
|
147
|
+
Proceeding...
|
|
148
|
+
Done!
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
### Security: Blacklist Protection
|
|
152
|
+
|
|
153
|
+
```bash
|
|
154
|
+
# Test with dangerous prompts
|
|
155
|
+
./bin/aihype ./bin/mock-claude test/fixtures/dangerous_prompts.yml
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
Output:
|
|
159
|
+
```
|
|
160
|
+
Do you want to enable automatic backups? (y/n)
|
|
161
|
+
yes ← APPROVED ✅
|
|
162
|
+
|
|
163
|
+
Should I charge your credit card $500? (yes/no)
|
|
164
|
+
no ← DENIED ⛔
|
|
165
|
+
[AIHype] WARNING: DENIED: Should I charge your credit card $500?
|
|
166
|
+
(matched rule: Never use the credit card)
|
|
167
|
+
|
|
168
|
+
Payment cancelled.
|
|
169
|
+
```
|
|
170
|
+
|
|
171
|
+
### Control Real Claude
|
|
172
|
+
|
|
173
|
+
```bash
|
|
174
|
+
# AIHype controls Claude and auto-answers its prompts
|
|
175
|
+
export ANTHROPIC_API_KEY=your_key
|
|
176
|
+
./bin/aihype claude -p prompt.md
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
## Configuration
|
|
180
|
+
|
|
181
|
+
### Blacklist Rules
|
|
182
|
+
|
|
183
|
+
Edit `memory/aihype.md` to customize your blacklist rules:
|
|
184
|
+
|
|
185
|
+
```markdown
|
|
186
|
+
---
|
|
187
|
+
version: 1.0.0
|
|
188
|
+
created: 2025-10-02T00:00:00Z
|
|
189
|
+
updated: 2025-10-02T00:00:00Z
|
|
190
|
+
---
|
|
191
|
+
|
|
192
|
+
# AIHype Blacklist Rules
|
|
193
|
+
|
|
194
|
+
## Default Safety Rules
|
|
195
|
+
- Never execute rm -rf / or similar destructive filesystem operations
|
|
196
|
+
- Never modify critical system files (/etc/passwd, /boot/, /etc/shadow)
|
|
197
|
+
- Never disable security features or firewalls
|
|
198
|
+
|
|
199
|
+
## User Rules
|
|
200
|
+
- Never use the credit card
|
|
201
|
+
- Never delete production databases
|
|
202
|
+
- Never install code sponsored by shopify
|
|
203
|
+
```
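
To make the file format concrete, here is a rough sketch of how such a rule file could be read. It is not the gem's `memory.rb`; `parse_blacklist` is a hypothetical helper, and it assumes front matter consists of simple `key: value` lines and that every rule is a `- ` bullet under a `## ` heading.

```ruby
# Hypothetical parser for the rule file format shown above.
def parse_blacklist(markdown)
  lines = markdown.lines.map(&:chomp)
  meta = {}

  # Optional front matter: "key: value" lines between two '---' markers.
  if lines.first&.strip == '---'
    closing = lines[1..].index { |line| line.strip == '---' }
    if closing
      lines[1, closing].each do |line|
        key, value = line.split(':', 2)
        meta[key.strip] = value.strip if value
      end
      lines = lines[(closing + 2)..]
    end
  end

  # Bullets are grouped under the nearest preceding "## " heading.
  rules = {}
  section = nil
  lines.each do |line|
    if (heading = line[/\A##\s+(.+)\z/, 1])
      section = heading.strip
      rules[section] ||= []
    elsif section && (rule = line[/\A-\s+(.+)\z/, 1])
      rules[section] << rule.strip
    end
  end

  { meta: meta, rules: rules }
end
```

The result keeps default and user rules separate, so a caller could flatten `rules.values` into the plain-English list that is sent for evaluation.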
|
|
204
|
+
|
|
205
|
+
### Environment Variables
|
|
206
|
+
|
|
207
|
+
```bash
|
|
208
|
+
# API Configuration
|
|
209
|
+
export ANTHROPIC_API_KEY=your_key_here # Required for AI matching
|
|
210
|
+
export AIHYPE_MODEL=claude-3-5-sonnet-20241022 # Override model selection
|
|
211
|
+
export AIHYPE_MODEL_PREFERENCE=cheap # fast, cheap, balanced, powerful
|
|
212
|
+
export AIHYPE_API_TIMEOUT=5 # API timeout in seconds
|
|
213
|
+
|
|
214
|
+
# Rate Limiting
|
|
215
|
+
export AIHYPE_RATE_LIMIT_RPM=50 # Requests per minute (default: 50)
|
|
216
|
+
export AIHYPE_RATE_LIMIT_WINDOW=60 # Window size in seconds (default: 60)
|
|
217
|
+
|
|
218
|
+
# Paths
|
|
219
|
+
export AIHYPE_CONFIG=memory/aihype.md # Config file path
|
|
220
|
+
export AIHYPE_LOG=memory/aihype.log # Log file path
|
|
221
|
+
|
|
222
|
+
# Logging
|
|
223
|
+
export AIHYPE_VERBOSE=1 # Enable verbose logging
|
|
224
|
+
```
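
The two rate-limiting variables suggest a sliding-window limiter. The sketch below shows one way such a limiter could work; it is not the gem's `rate_limiter.rb`. The class name, the injectable `clock` and `sleeper`, and the reading of `AIHYPE_RATE_LIMIT_RPM` as "maximum requests per window" are all assumptions. It also assumes a positive limit.

```ruby
# Hypothetical sliding-window limiter. The clock and sleeper are injectable
# so the behaviour can be checked without real waiting.
class SlidingWindowLimiter
  def initialize(requests_per_minute: 50, window_size: 60,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @limit = requests_per_minute
    @window = window_size
    @clock = clock
    @sleeper = sleeper
    @timestamps = []
  end

  # Blocks until a request slot is free, then records the request.
  # Returns the number of seconds spent waiting.
  def throttle
    waited = 0.0
    now = prune(@clock.call)
    if @timestamps.size >= @limit
      # Wait until the oldest recorded request leaves the window.
      waited = @window - (now - @timestamps.first)
      @sleeper.call(waited)
      now = prune(@clock.call)
    end
    @timestamps << now
    waited
  end

  private

  # Drops timestamps that have left the window and returns `now`.
  def prune(now)
    @timestamps.reject! { |stamp| now - stamp >= @window }
    now
  end
end
```

Calling `throttle` before each API request keeps the request count inside the window at or below the limit, at the cost of blocking the caller while it waits.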
|
|
225
|
+
|
|
226
|
+
## Commands
|
|
227
|
+
|
|
228
|
+
### Spawn and Control a Command (default)
|
|
229
|
+
|
|
230
|
+
```bash
|
|
231
|
+
# AIHype spawns the command and controls it via PTY
|
|
232
|
+
./bin/aihype <command> [args...]
|
|
233
|
+
|
|
234
|
+
# Examples:
|
|
235
|
+
./bin/aihype mock-claude config.yml
|
|
236
|
+
./bin/aihype claude -p prompt.md
|
|
237
|
+
./bin/aihype deployment-script.sh --production
|
|
238
|
+
```
|
|
239
|
+
|
|
240
|
+
### Initialize Config
|
|
241
|
+
|
|
242
|
+
```bash
|
|
243
|
+
# Create default memory/aihype.md configuration
|
|
244
|
+
./bin/aihype init
|
|
245
|
+
```
|
|
246
|
+
|
|
247
|
+
### Validate Config
|
|
248
|
+
|
|
249
|
+
```bash
|
|
250
|
+
# Validate memory/aihype.md configuration
|
|
251
|
+
./bin/aihype validate
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
Output:
|
|
255
|
+
```
|
|
256
|
+
Configuration is valid
|
|
257
|
+
Version: 1.0.0
|
|
258
|
+
Default rules: 3
|
|
259
|
+
User rules: 2
|
|
260
|
+
Total enabled rules: 5
|
|
261
|
+
```
|
|
262
|
+
|
|
263
|
+
### Options
|
|
264
|
+
|
|
265
|
+
```bash
|
|
266
|
+
./bin/aihype --help # Show help
|
|
267
|
+
./bin/aihype --version # Show version
|
|
268
|
+
./bin/aihype --config PATH cmd # Custom config path
|
|
269
|
+
./bin/aihype --log PATH cmd # Custom log path
|
|
270
|
+
./bin/aihype --verbose cmd # Enable verbose logging
|
|
271
|
+
```
|
|
272
|
+
|
|
273
|
+
## Architecture
|
|
274
|
+
|
|
275
|
+
```
|
|
276
|
+
┌──────────────────────────────────────────────┐
|
|
277
|
+
│ aihype <command> [args] │
|
|
278
|
+
└────────────────┬─────────────────────────────┘
|
|
279
|
+
│
|
|
280
|
+
▼
|
|
281
|
+
┌────────────────────────────────────────────────┐
|
|
282
|
+
│ PTY.spawn(command, args) │
|
|
283
|
+
│ - Spawns child process │
|
|
284
|
+
│ - Creates pseudo-terminal │
|
|
285
|
+
│ - Controls child's stdin/stdout │
|
|
286
|
+
└──────────┬─────────────────────┬───────────────┘
|
|
287
|
+
│ │
|
|
288
|
+
│ stdout │ stdin
|
|
289
|
+
▼ ▲
|
|
290
|
+
┌──────────────────┐ ┌────────────────────┐
|
|
291
|
+
│ Read child │ │ Write responses │
|
|
292
|
+
│ output │ │ to child │
|
|
293
|
+
│ (detect prompts) │ │ ("yes" or "no") │
|
|
294
|
+
└────┬─────────────┘ └─────────▲──────────┘
|
|
295
|
+
│ │
|
|
296
|
+
│ │
|
|
297
|
+
▼ │
|
|
298
|
+
┌─────────────────────────────────┴──────────┐
|
|
299
|
+
│ Prompt detected? │
|
|
300
|
+
│ → Yes: Evaluate against blacklist │
|
|
301
|
+
│ → Match: respond "no" (DENIED) │
|
|
302
|
+
│ → No match: respond "yes" (APPROVED) │
|
|
303
|
+
│ → No: Pass through to stdout │
|
|
304
|
+
└────────────────────────────────────────────┘
|
|
305
|
+
```
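
The decision box at the bottom of the diagram can be written as a small function. The hash keys below mirror what `AIMatcher#evaluate` returns in this package (`matched`, `rule_id`, `confidence`, `reasoning`, `error`). The `response_for` name and the `fail_closed` option are hypothetical, and how `core.rb` actually treats the `error` field is not shown in this diff.

```ruby
# Maps an evaluation result to the text written to the child's stdin.
# `fail_closed` is a hypothetical option, not something the gem documents.
def response_for(result, fail_closed: false)
  return 'no' if result[:matched]                 # blacklist match: DENIED
  return 'no' if fail_closed && result[:error]    # optional: deny when the check failed

  'yes'                                           # no match: APPROVED
end
```

Note that the fallback result in `ai_matcher.rb` sets `matched: false`, so with the diagram's rule alone a timeout or network error would lead to "yes". A stricter setup might prefer the fail-closed branch sketched here.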
|
|
306
|
+
|
|
307
|
+
## Real-World Use Cases
|
|
308
|
+
|
|
309
|
+
### 1. Control AI assistants
|
|
310
|
+
|
|
311
|
+
```bash
|
|
312
|
+
# AIHype controls Claude and auto-approves safe operations
|
|
313
|
+
./bin/aihype claude -p "deploy to staging"
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
### 2. Automate deployment scripts
|
|
317
|
+
|
|
318
|
+
```bash
|
|
319
|
+
# Your deploy script asks for confirmations, aihype answers them
|
|
320
|
+
./bin/aihype ./deploy.sh --prod
|
|
321
|
+
```
|
|
322
|
+
|
|
323
|
+
### 3. CI/CD pipelines
|
|
324
|
+
|
|
325
|
+
```yaml
|
|
326
|
+
# In your CI config:
|
|
327
|
+
- run: aihype ./deployment-tool.sh
|
|
328
|
+
```
|
|
329
|
+
|
|
330
|
+
### 4. Protect against dangerous operations
|
|
331
|
+
|
|
332
|
+
```bash
|
|
333
|
+
# AIHype denies prompts about:
|
|
334
|
+
# - Deleting production data
|
|
335
|
+
# - Using credit cards
|
|
336
|
+
# - Disabling security
|
|
337
|
+
# - Any custom blacklist rules
|
|
338
|
+
./bin/aihype dangerous-script.sh
|
|
339
|
+
```
|
|
340
|
+
|
|
341
|
+
## Development
|
|
342
|
+
|
|
343
|
+
```bash
|
|
344
|
+
# Install dependencies
|
|
345
|
+
bundle install
|
|
346
|
+
|
|
347
|
+
# Run tests
|
|
348
|
+
bundle exec rake test
|
|
349
|
+
|
|
350
|
+
# Run specific test
|
|
351
|
+
ruby -Ilib -Itest test/unit/test_core.rb
|
|
352
|
+
|
|
353
|
+
# Run integration tests (requires ANTHROPIC_API_KEY)
|
|
354
|
+
ANTHROPIC_API_KEY=your_key ./samples/00-verify-all.sh
|
|
355
|
+
```
|
|
356
|
+
|
|
357
|
+
## Testing
|
|
358
|
+
|
|
359
|
+
### Unit Tests (Fast, No API)
|
|
360
|
+
|
|
361
|
+
```bash
|
|
362
|
+
# Run all unit tests (minitest)
|
|
363
|
+
bundle exec rake test
|
|
364
|
+
|
|
365
|
+
# Tests mock-claude interactions
|
|
366
|
+
# No API key needed, runs offline
|
|
367
|
+
```
|
|
368
|
+
|
|
369
|
+
### Integration Tests (Slow, Real API)
|
|
370
|
+
|
|
371
|
+
```bash
|
|
372
|
+
# Run full integration suite with real Claude
|
|
373
|
+
export ANTHROPIC_API_KEY=your_key
|
|
374
|
+
./samples/00-verify-all.sh
|
|
375
|
+
|
|
376
|
+
# Takes 3-5 minutes, hits real API
|
|
377
|
+
# This is your "does it actually work?" test
|
|
378
|
+
```
|
|
379
|
+
|
|
380
|
+
Test coverage:
|
|
381
|
+
- ✅ PTY process spawning and control
|
|
382
|
+
- ✅ Prompt detection and pattern matching
|
|
383
|
+
- ✅ Blacklist rule evaluation with real AI
|
|
384
|
+
- ✅ Response writing to child process stdin
|
|
385
|
+
- ✅ Integration tests with mock-claude
|
|
386
|
+
- ✅ Real-world tests with actual Claude CLI
|
|
387
|
+
- ✅ Security denials
|
|
388
|
+
|
|
389
|
+
## Key Features
|
|
390
|
+
|
|
391
|
+
✅ **Process control via PTY** - actually spawns and controls child processes
|
|
392
|
+
✅ **Auto-answers approval prompts** with "yes"
|
|
393
|
+
✅ **AI-powered blacklist protection** denies dangerous actions
|
|
394
|
+
✅ **Real stdin/stdout control** - not a pipe filter
|
|
395
|
+
✅ **Works with ANY interactive tool** - Claude, deploy scripts, installers, etc.
|
|
396
|
+
✅ **Configurable rules** via markdown file
|
|
397
|
+
✅ **Rate limiting** for API calls
|
|
398
|
+
✅ **Dynamic model selection** from Anthropic API
|
|
399
|
+
✅ **Graceful fallback** when API unavailable
|
|
400
|
+
✅ **Real-world verification** via integration tests
|
|
401
|
+
|
|
402
|
+
## Mock Claude for Testing
|
|
403
|
+
|
|
404
|
+
AIHype includes `mock-claude`, a configurable test CLI that simulates interactive tools:
|
|
405
|
+
|
|
406
|
+
```bash
|
|
407
|
+
# Create a test scenario
|
|
408
|
+
cat > scenario.yml <<'EOF'
|
|
409
|
+
sequence:
|
|
410
|
+
- type: output
|
|
411
|
+
text: "Starting process..."
|
|
412
|
+
- type: prompt
|
|
413
|
+
text: "Continue? (y/n)"
|
|
414
|
+
expect: ["y", "n"]
|
|
415
|
+
if_yes: "Continuing..."
|
|
416
|
+
if_no: "Stopped."
|
|
417
|
+
- type: output
|
|
418
|
+
text: "Done!"
|
|
419
|
+
EOF
|
|
420
|
+
|
|
421
|
+
# Test with aihype
|
|
422
|
+
./bin/aihype ./bin/mock-claude scenario.yml
|
|
423
|
+
```
|
|
424
|
+
|
|
425
|
+
See `test/fixtures/*.yml` for more examples.
|
|
426
|
+
|
|
427
|
+
## Limitations
|
|
428
|
+
|
|
429
|
+
**Claude AI Output Behavior**: In `claude -p` (print) mode, Claude outputs text and exits; it does not generate blocking interactive prompts that wait for user input. AIHype can still spawn Claude and stream its output, but there are no prompts to auto-answer in this non-interactive mode.
|
|
430
|
+
|
|
431
|
+
**Interactive Prompts**: To demonstrate auto-prompt-answering, use tools that actually generate blocking prompts:
|
|
432
|
+
- `bash -c 'read -p "Continue? " x'` - Blocks on read
|
|
433
|
+
- `rm -i file` - Interactive confirmation
|
|
434
|
+
- `mock-claude` - Simulated interactive tool for testing
|
|
435
|
+
|
|
436
|
+
**Real-World Value**: AIHype's primary use cases are:
|
|
437
|
+
1. Wrapping deployment scripts with confirmation prompts
|
|
438
|
+
2. Automating interactive CLI tools (`rm -i`, `apt-get`, etc.)
|
|
439
|
+
3. Process control and logging for AI agents
|
|
440
|
+
4. Blacklist security for any command-line tool
|
|
441
|
+
|
|
442
|
+
## Technical Documentation
|
|
443
|
+
|
|
444
|
+
- [PTY and Terminal Automation Analysis](docs/pty-terminal-automation.md) - Deep dive on why PTY approach is correct for Ink-based CLIs
|
|
445
|
+
|
|
446
|
+
## License
|
|
447
|
+
|
|
448
|
+
See LICENSE file for details.
|
|
449
|
+
|
|
450
|
+
## Author
|
|
451
|
+
|
|
452
|
+
See gemspec for author information.
|
data/bin/aihype
ADDED
data/bin/mock-claude
ADDED
|
@@ -0,0 +1,77 @@
|
|
|
1
|
+
#!/usr/bin/env ruby
|
|
2
|
+
# frozen_string_literal: true
|
|
3
|
+
|
|
4
|
+
# Mock Claude CLI for testing aihype
|
|
5
|
+
# Takes a config file that defines interaction sequences
|
|
6
|
+
|
|
7
|
+
require 'yaml'
|
|
8
|
+
require 'json'
|
|
9
|
+
|
|
10
|
+
def main
|
|
11
|
+
config_file = ARGV[0]
|
|
12
|
+
|
|
13
|
+
unless config_file
|
|
14
|
+
$stderr.puts "Usage: mock-claude <config.yml>"
|
|
15
|
+
exit 1
|
|
16
|
+
end
|
|
17
|
+
|
|
18
|
+
unless File.exist?(config_file)
|
|
19
|
+
$stderr.puts "Error: Config file not found: #{config_file}"
|
|
20
|
+
exit 1
|
|
21
|
+
end
|
|
22
|
+
|
|
23
|
+
config = YAML.load_file(config_file)
|
|
24
|
+
|
|
25
|
+
# Run the interaction sequence
|
|
26
|
+
config['sequence'].each do |step|
|
|
27
|
+
case step['type']
|
|
28
|
+
when 'output'
|
|
29
|
+
puts step['text']
|
|
30
|
+
$stdout.flush
|
|
31
|
+
|
|
32
|
+
when 'prompt'
|
|
33
|
+
puts step['text']
|
|
34
|
+
$stdout.flush
|
|
35
|
+
|
|
36
|
+
# Wait for input
|
|
37
|
+
response = $stdin.gets
|
|
38
|
+
|
|
39
|
+
if response.nil?
|
|
40
|
+
$stderr.puts "ERROR: Expected input but got EOF"
|
|
41
|
+
exit 1
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
response = response.strip.downcase
|
|
45
|
+
|
|
46
|
+
# Check if response matches expected
|
|
47
|
+
if step['expect']
|
|
48
|
+
expected = step['expect'].map(&:downcase)
|
|
49
|
+
unless expected.include?(response)
|
|
50
|
+
$stderr.puts "WARNING: Got '#{response}', expected one of: #{expected.join(', ')}"
|
|
51
|
+
end
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
# Handle conditional flow
|
|
55
|
+
if step['if_yes'] && ['y', 'yes'].include?(response)
|
|
56
|
+
puts step['if_yes']
|
|
57
|
+
$stdout.flush
|
|
58
|
+
elsif step['if_no'] && ['n', 'no'].include?(response)
|
|
59
|
+
puts step['if_no']
|
|
60
|
+
$stdout.flush
|
|
61
|
+
end
|
|
62
|
+
|
|
63
|
+
when 'error'
|
|
64
|
+
$stderr.puts step['text']
|
|
65
|
+
|
|
66
|
+
when 'sleep'
|
|
67
|
+
sleep step['duration'] || 0.1
|
|
68
|
+
|
|
69
|
+
else
|
|
70
|
+
$stderr.puts "Unknown step type: #{step['type']}"
|
|
71
|
+
end
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
exit 0
|
|
75
|
+
end
|
|
76
|
+
|
|
77
|
+
main
|
data/lib/aihype/ai_matcher.rb
ADDED
|
@@ -0,0 +1,144 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require 'faraday'
|
|
4
|
+
require 'json'
|
|
5
|
+
require_relative 'env'
|
|
6
|
+
require_relative 'model_selector'
|
|
7
|
+
require_relative 'rate_limiter'
|
|
8
|
+
|
|
9
|
+
module AIHype
|
|
10
|
+
class AIMatcher
|
|
11
|
+
attr_reader :api_key, :timeout, :model
|
|
12
|
+
|
|
13
|
+
CLAUDE_API_URL = 'https://api.anthropic.com/v1/messages'
|
|
14
|
+
|
|
15
|
+
# Class-level rate limiter shared across all instances
|
|
16
|
+
def self.rate_limiter
|
|
17
|
+
@rate_limiter ||= RateLimiter.new(
|
|
18
|
+
requests_per_minute: Env.rate_limit_rpm,
|
|
19
|
+
window_size: Env.rate_limit_window
|
|
20
|
+
)
|
|
21
|
+
end
|
|
22
|
+
|
|
23
|
+
def initialize(api_key: nil, timeout: nil, model: nil)
|
|
24
|
+
@api_key = api_key || Env.anthropic_api_key
|
|
25
|
+
@timeout = timeout || Env.api_timeout
|
|
26
|
+
@model = model || select_model
|
|
27
|
+
end
|
|
28
|
+
|
|
29
|
+
def evaluate(prompt_text, blacklist_rules)
|
|
30
|
+
return fallback_result('No API key provided') if @api_key.nil? || @api_key.empty?
|
|
31
|
+
|
|
32
|
+
request_payload = build_request(prompt_text, blacklist_rules)
|
|
33
|
+
response = http_post(request_payload)
|
|
34
|
+
parse_response(response)
|
|
35
|
+
rescue Faraday::TimeoutError
|
|
36
|
+
fallback_result('AI timeout')
|
|
37
|
+
rescue Faraday::UnauthorizedError
|
|
38
|
+
fallback_result('Authentication failed')
|
|
39
|
+
rescue Faraday::ConnectionFailed, Faraday::Error
|
|
40
|
+
fallback_result('Network error')
|
|
41
|
+
rescue JSON::ParserError, StandardError => e
|
|
42
|
+
fallback_result("Invalid response: #{e.message}")
|
|
43
|
+
end
|
|
44
|
+
|
|
45
|
+
def http_post(payload)
|
|
46
|
+
# Apply rate limiting before making request
|
|
47
|
+
self.class.rate_limiter.throttle
|
|
48
|
+
|
|
49
|
+
conn = Faraday.new(url: CLAUDE_API_URL) do |f|
|
|
50
|
+
f.request :json
|
|
51
|
+
f.response :json
|
|
52
|
+
f.adapter Faraday.default_adapter
|
|
53
|
+
f.options.timeout = @timeout
|
|
54
|
+
f.options.open_timeout = @timeout
|
|
55
|
+
end
|
|
56
|
+
|
|
57
|
+
response = conn.post do |req|
|
|
58
|
+
req.headers['anthropic-version'] = '2023-06-01'
|
|
59
|
+
req.headers['x-api-key'] = @api_key
|
|
60
|
+
req.headers['content-type'] = 'application/json'
|
|
61
|
+
req.body = payload
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
extract_claude_response(response.body)
|
|
65
|
+
end
|
|
66
|
+
|
|
67
|
+
private
|
|
68
|
+
|
|
69
|
+
def build_request(prompt_text, blacklist_rules)
|
|
70
|
+
rules_text = blacklist_rules.each_with_index.map { |rule, i| "#{i + 1}. #{rule}" }.join("\n")
|
|
71
|
+
|
|
72
|
+
system_prompt = 'You are a blacklist rule evaluator. Given an approval prompt and a list of blacklist rules in plain English, determine if the prompt violates any rule. Respond ONLY with JSON: {"matched": true/false, "rule_id": "<rule_content>" or null, "confidence": 0.0-1.0, "reasoning": "brief explanation"}'
|
|
73
|
+
|
|
74
|
+
user_content = <<~CONTENT
|
|
75
|
+
Prompt: "#{prompt_text}"
|
|
76
|
+
|
|
77
|
+
Blacklist Rules:
|
|
78
|
+
#{rules_text}
|
|
79
|
+
|
|
80
|
+
Does the prompt violate any rule?
|
|
81
|
+
CONTENT
|
|
82
|
+
|
|
83
|
+
{
|
|
84
|
+
model: @model,
|
|
85
|
+
max_tokens: 100,
|
|
86
|
+
temperature: 0.0,
|
|
87
|
+
system: system_prompt,
|
|
88
|
+
messages: [
|
|
89
|
+
{ role: 'user', content: user_content }
|
|
90
|
+
]
|
|
91
|
+
}
|
|
92
|
+
end
|
|
93
|
+
|
|
94
|
+
def select_model
|
|
95
|
+
# Check for explicit model override
|
|
96
|
+
explicit_model = Env.model
|
|
97
|
+
return explicit_model if explicit_model && !explicit_model.empty?
|
|
98
|
+
|
|
99
|
+
# Use model selector to pick best model
|
|
100
|
+
preference = Env.model_preference
|
|
101
|
+
selector = ModelSelector.new(api_key: @api_key, preference: preference)
|
|
102
|
+
selector.select_model
|
|
103
|
+
end
|
|
104
|
+
|
|
105
|
+
def extract_claude_response(body)
|
|
106
|
+
# Check for API errors
|
|
107
|
+
if body.is_a?(Hash) && body['type'] == 'error'
|
|
108
|
+
error_msg = body.dig('error', 'message') || 'Unknown API error'
|
|
109
|
+
raise StandardError, "API error: #{error_msg}"
|
|
110
|
+
end
|
|
111
|
+
|
|
112
|
+
# If already in the right format
|
|
113
|
+
return body if body.is_a?(Hash) && body.key?('matched')
|
|
114
|
+
|
|
115
|
+
# Extract from Claude's response format
|
|
116
|
+
content_text = body.dig('content', 0, 'text')
|
|
117
|
+
return body unless content_text
|
|
118
|
+
|
|
119
|
+
JSON.parse(content_text)
|
|
120
|
+
end
|
|
121
|
+
|
|
122
|
+
def parse_response(response)
|
|
123
|
+
raise StandardError, 'Invalid response format' unless response.is_a?(Hash)
|
|
124
|
+
|
|
125
|
+
{
|
|
126
|
+
matched: response['matched'] || false,
|
|
127
|
+
rule_id: response['rule_id'],
|
|
128
|
+
confidence: response['confidence'] || 0.0,
|
|
129
|
+
reasoning: response['reasoning'],
|
|
130
|
+
error: nil
|
|
131
|
+
}
|
|
132
|
+
end
|
|
133
|
+
|
|
134
|
+
def fallback_result(error_message)
|
|
135
|
+
{
|
|
136
|
+
matched: false,
|
|
137
|
+
rule_id: nil,
|
|
138
|
+
confidence: 0.0,
|
|
139
|
+
reasoning: nil,
|
|
140
|
+
error: error_message
|
|
141
|
+
}
|
|
142
|
+
end
|
|
143
|
+
end
|
|
144
|
+
end
|