sentinel_rb 0.1.0
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/.rubocop.yml +10 -0
- data/.rubocop_todo.yml +72 -0
- data/.sentinel-test.yml +20 -0
- data/.sentinel.yml +29 -0
- data/.sentinel.yml.example +74 -0
- data/AGENTS.md +87 -0
- data/CODE_OF_CONDUCT.md +132 -0
- data/LICENSE.txt +21 -0
- data/README.md +226 -0
- data/Rakefile +12 -0
- data/docs/architecture.md +130 -0
- data/docs/development.md +376 -0
- data/docs/usage.md +238 -0
- data/exe/sentinel_rb +6 -0
- data/lib/sentinel_rb/analyzer.rb +140 -0
- data/lib/sentinel_rb/analyzers/base.rb +53 -0
- data/lib/sentinel_rb/analyzers/base_model_usage.rb +188 -0
- data/lib/sentinel_rb/analyzers/dangerous_tools.rb +283 -0
- data/lib/sentinel_rb/analyzers/few_shot_bias.rb +75 -0
- data/lib/sentinel_rb/analyzers/irrelevant_info.rb +164 -0
- data/lib/sentinel_rb/analyzers/misinformation.rb +220 -0
- data/lib/sentinel_rb/cli.rb +151 -0
- data/lib/sentinel_rb/client/base.rb +34 -0
- data/lib/sentinel_rb/client/mock.rb +167 -0
- data/lib/sentinel_rb/client/openai.rb +167 -0
- data/lib/sentinel_rb/client.rb +25 -0
- data/lib/sentinel_rb/config.rb +64 -0
- data/lib/sentinel_rb/report.rb +224 -0
- data/lib/sentinel_rb/version.rb +5 -0
- data/lib/sentinel_rb.rb +39 -0
- data/sig/sentinel_rb.rbs +4 -0
- data/test_prompts/a2_bad_prompt.md +5 -0
- data/test_prompts/a2_good_prompt.md +9 -0
- data/test_prompts/a3_bad_prompt.md +19 -0
- data/test_prompts/a3_good_prompt.md +15 -0
- data/test_prompts/a4_bad_prompt.md +13 -0
- data/test_prompts/a4_good_prompt.md +11 -0
- data/test_prompts/a5_bad_prompt.md +13 -0
- data/test_prompts/a5_good_prompt.md +14 -0
- data/test_prompts/bad_prompt.md +15 -0
- data/test_prompts/comprehensive_good_prompt.md +11 -0
- data/test_prompts/good_prompt.md +9 -0
- data/test_prompts/multi_bad_prompt.md +11 -0
- data/test_prompts/very_bad_prompt.md +7 -0
- metadata +149 -0
data/docs/architecture.md
ADDED
@@ -0,0 +1,130 @@
|
|
1
|
+
# SentinelRb Architecture
|
2
|
+
|
3
|
+
## Overview
|
4
|
+
|
5
|
+
SentinelRb is a Ruby gem that uses an LLM to inspect prompts and detect common antipatterns that are hard to catch with static analysis.
|
6
|
+
|
7
|
+
## Core Components
|
8
|
+
|
9
|
+
### 1. Configuration Management
|
10
|
+
- **File**: `.sentinel.yml`
|
11
|
+
- **Purpose**: Centralized configuration for thresholds, providers, and dangerous tools
|
12
|
+
- **Key Settings**:
|
13
|
+
- `provider`: LLM provider (openai, anthropic, etc.)
|
14
|
+
- `model`: Specific model to use
|
15
|
+
- `relevance_threshold`: Threshold for relevance scoring (default: 0.55)
|
16
|
+
- `divergence_threshold`: Threshold for KL divergence detection (default: 0.25)
|
17
|
+
- `dangerous_tools`: List of tools considered dangerous for auto-execution
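A minimal sketch of how these settings might be loaded, assuming a flat YAML file and the two documented defaults. `DEFAULTS` and `load_config` are hypothetical names used for illustration, and the empty `dangerous_tools` default is an assumption; none of this is the gem's confirmed API.

```ruby
require 'yaml'

# Hypothetical defaults mirroring the threshold values documented above.
DEFAULTS = {
  'relevance_threshold' => 0.55,
  'divergence_threshold' => 0.25,
  'dangerous_tools' => [] # assumed default, not stated in the docs
}.freeze

# Parses YAML text and overlays it on the defaults (assumed helper).
def load_config(yaml_text)
  user_settings = YAML.safe_load(yaml_text) || {}
  DEFAULTS.merge(user_settings)
end

config = load_config("provider: openai\nrelevance_threshold: 0.7\n")
```

Settings present in the file override the defaults; anything omitted falls back to the documented value.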
|
18
|
+
|
19
|
+
### 2. LLM Client Layer
|
20
|
+
- **Purpose**: Abstraction layer for different LLM providers
|
21
|
+
- **Supported Providers**:
|
22
|
+
- OpenAI GPT models
|
23
|
+
- Anthropic Claude models
|
24
|
+
- Custom API endpoints
|
25
|
+
- **Key Methods**:
|
26
|
+
- `similarity(prompt, reference)`: Calculate semantic similarity
|
27
|
+
- `fact_check(statement)`: Verify factual accuracy
|
28
|
+
- `analyze_content(prompt)`: General content analysis
|
29
|
+
|
30
|
+
### 3. Analyzer System
|
31
|
+
|
32
|
+
#### Base Analyzer
|
33
|
+
```ruby
|
34
|
+
class SentinelRb::Analyzers::Base
|
35
|
+
def initialize(prompt, config, client)
|
36
|
+
@prompt = prompt
|
37
|
+
@config = config
|
38
|
+
@client = client
|
39
|
+
end
|
40
|
+
|
41
|
+
def call
|
42
|
+
# Subclasses override this and return an array of findings: [{id:, level:, message:}, ...]
|
43
|
+
end
|
44
|
+
end
|
45
|
+
```
|
46
|
+
|
47
|
+
#### Implemented Analyzers
|
48
|
+
|
49
|
+
##### A1: Irrelevant Information Detector
|
50
|
+
- **Purpose**: Detect prompts containing irrelevant or noisy information
|
51
|
+
- **Method**: Uses LLM to generate relevance scores
|
52
|
+
- **Threshold**: Configurable via `relevance_threshold`
|
53
|
+
- **Output**: Warning when average relevance score is below threshold
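The threshold rule for A1 can be sketched as follows. This assumes the relevance scores have already been obtained from the LLM client; `relevance_findings` and the message wording are illustrative, not the gem's actual implementation.

```ruby
# Assumed helper: averages relevance scores and emits an A1 warning
# when the mean falls below the configured threshold.
def relevance_findings(scores, threshold: 0.55)
  return [] if scores.empty?

  average = scores.sum / scores.size.to_f
  return [] if average >= threshold

  [{ id: 'A1', level: :warn,
     message: format('Average relevance %.2f is below threshold %.2f', average, threshold) }]
end

findings = relevance_findings([0.9, 0.2, 0.3])
```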
|
54
|
+
|
55
|
+
##### A2: Misinformation & Logical Contradictions
|
56
|
+
- **Purpose**: Verify factual accuracy and logical consistency
|
57
|
+
- **Method**: RAG-based knowledge base lookup or fact-checking API
|
58
|
+
- **Detection**: Cross-references statements against reliable sources
|
59
|
+
- **Output**: Error level for confirmed misinformation
|
60
|
+
|
61
|
+
##### A3: Few-shot Bias Order Detection
|
62
|
+
- **Purpose**: Detect ordering bias in few-shot examples
|
63
|
+
- **Method**: Compares example order with canonical examples in YAML
|
64
|
+
- **Metric**: KL Divergence calculation
|
65
|
+
- **Threshold**: Configurable via `divergence_threshold`
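For reference, a sketch of the KL divergence metric itself. The source does not say how example order is turned into a probability distribution, so this example assumes label frequencies; `kl_divergence`, `label_distribution`, and the epsilon smoothing are all assumptions of the sketch.

```ruby
# KL divergence D(P || Q) in nats over two discrete distributions given as
# hashes of label => probability. The epsilon keeps the logarithm finite
# when Q lacks a label (a smoothing choice made for this sketch).
def kl_divergence(p_dist, q_dist, epsilon: 1e-9)
  p_dist.sum do |label, p|
    next 0.0 if p.zero?

    q = q_dist.fetch(label, 0.0)
    p * Math.log(p / [q, epsilon].max)
  end
end

# Converts a list of example labels into relative frequencies.
def label_distribution(labels)
  labels.tally.transform_values { |count| count.to_f / labels.size }
end

observed   = label_distribution(%w[positive positive positive negative])
canonical  = label_distribution(%w[positive negative positive negative])
divergence = kl_divergence(observed, canonical)
```

With the documented default `divergence_threshold` of 0.25, this 3:1 skew against a balanced reference would stay under the threshold.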
|
66
|
+
|
67
|
+
##### A4: Base Model Usage Detection
|
68
|
+
- **Purpose**: Flag usage of base models instead of instruction-tuned models
|
69
|
+
- **Method**: String matching for '-base' in model names
|
70
|
+
- **Output**: Immediate warning for any base model usage
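The string match described above is small enough to sketch directly. `base_model_findings` and the message text are hypothetical; the case-insensitive comparison is an assumption.

```ruby
# Flags model names containing '-base'. Case-insensitive matching is an
# assumption of this sketch; the finding shape follows the Base contract.
def base_model_findings(model_name)
  return [] unless model_name.to_s.downcase.include?('-base')

  [{ id: 'A4', level: :warn, message: "Model '#{model_name}' looks like a base model" }]
end
```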
|
71
|
+
|
72
|
+
##### A5: Dangerous Automatic Tool Execution
|
73
|
+
- **Purpose**: Detect dangerous tools marked for automatic execution
|
74
|
+
- **Method**: Parses JSON tool definitions and flags entries where both `dangerous` and `auto_execute` are `true`
|
75
|
+
- **Configuration**: Uses `dangerous_tools` list from config
|
76
|
+
- **Output**: Critical level warning for security risks
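A sketch of the A5 check under an assumed input shape: a JSON array of tool objects with `name`, `dangerous`, and `auto_execute` keys. The real prompt format may differ, and `dangerous_tool_findings` is a hypothetical helper.

```ruby
require 'json'

# Assumed input: a JSON array of tool objects. A tool counts as dangerous
# when it is marked so, or when its name appears in the configured list.
def dangerous_tool_findings(json_text, dangerous_tools: [])
  tools = JSON.parse(json_text)
  risky = tools.select do |tool|
    flagged = tool['dangerous'] == true || dangerous_tools.include?(tool['name'])
    flagged && tool['auto_execute'] == true
  end

  risky.map do |tool|
    { id: 'A5', level: :critical,
      message: "Tool '#{tool['name']}' is dangerous and set to auto-execute" }
  end
end

tools_json = <<~JSON
  [
    {"name": "delete_file", "dangerous": true, "auto_execute": true},
    {"name": "search", "dangerous": false, "auto_execute": true},
    {"name": "shell", "auto_execute": true}
  ]
JSON

findings = dangerous_tool_findings(tools_json, dangerous_tools: ['shell'])
```

Here `delete_file` is caught by its own flag, while `shell` is caught only because it appears in the configured list.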
|
77
|
+
|
78
|
+
### 4. Reporting System
|
79
|
+
- **Formats**: Table, JSON, detailed reports
|
80
|
+
- **Integration**: Designed for CI/CD pipeline integration
|
81
|
+
- **Output Levels**: info, warn, error, critical
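For CI use, findings typically need to become machine-readable output plus an exit status. The sketch below assumes a `fail_on` option and an `exit_status` helper, neither of which is confirmed by the source; only the level names come from the list above.

```ruby
require 'json'

# Severity order taken from the output levels listed above.
LEVELS = %i[info warn error critical].freeze

# Returns 1 when any finding is at or above the failure level, else 0.
# The fail_on option is an assumption of this sketch.
def exit_status(findings, fail_on: :error)
  cutoff = LEVELS.index(fail_on)
  findings.any? { |finding| LEVELS.index(finding[:level]) >= cutoff } ? 1 : 0
end

findings = [
  { id: 'A1', level: :warn, message: 'Low relevance' },
  { id: 'A5', level: :critical, message: 'Dangerous tool auto-executes' }
]

json_report = JSON.generate(findings)
status = exit_status(findings)
```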
|
82
|
+
|
83
|
+
## File Processing Pipeline
|
84
|
+
|
85
|
+
1. **Discovery**: Glob pattern matching to find prompt files
|
86
|
+
2. **Loading**: Read and parse prompt files (MD, JSON, YAML support)
|
87
|
+
3. **Analysis**: Run all enabled analyzers on each prompt
|
88
|
+
4. **Aggregation**: Collect results from all analyzers
|
89
|
+
5. **Reporting**: Format and output results in specified format
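The first four stages can be sketched in a few lines. `run_pipeline` and `EmptyPromptCheck` are hypothetical; the only thing taken from the source is the analyzer contract of `new(prompt, config, client)` followed by `call`.

```ruby
require 'tmpdir'

# Stand-in analyzer following the documented contract; flags empty prompts.
class EmptyPromptCheck
  def initialize(prompt, _config, _client)
    @prompt = prompt
  end

  def call
    @prompt.strip.empty? ? [{ id: 'X1', level: :warn, message: 'Prompt is empty' }] : []
  end
end

# Discovery -> loading -> analysis -> aggregation, keyed by file path.
def run_pipeline(glob, analyzers, config: {}, client: nil)
  Dir.glob(glob).sort.to_h do |path|
    prompt = File.read(path)
    findings = analyzers.flat_map { |klass| klass.new(prompt, config, client).call }
    [path, findings]
  end
end

results = Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'good.md'), 'Summarize the text.')
  File.write(File.join(dir, 'empty.md'), '')
  run_pipeline(File.join(dir, '*.md'), [EmptyPromptCheck])
end
```

The resulting hash maps each file path to its findings, which a reporter could then format.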
|
90
|
+
|
91
|
+
## Extension Points
|
92
|
+
|
93
|
+
### Custom Analyzers
|
94
|
+
Developers can create custom analyzers by:
|
95
|
+
1. Inheriting from `SentinelRb::Analyzers::Base`
|
96
|
+
2. Implementing the `call` method
|
97
|
+
3. Registering the analyzer in the configuration
|
98
|
+
|
99
|
+
### Custom LLM Providers
|
100
|
+
Support for additional LLM providers can be added by:
|
101
|
+
1. Implementing the client interface
|
102
|
+
2. Adding provider-specific configuration
|
103
|
+
3. Registering the provider in the client factory
|
104
|
+
|
105
|
+
## Security Considerations
|
106
|
+
|
107
|
+
- API keys are loaded from environment variables
|
108
|
+
- Dangerous tool detection helps prevent accidental auto-execution
|
109
|
+
- No prompt data is persisted by default
|
110
|
+
- Configurable rate limiting for API calls
|
111
|
+
|
112
|
+
## Performance Optimization
|
113
|
+
|
114
|
+
- Parallel processing of multiple prompt files
|
115
|
+
- Caching of LLM responses for repeated analysis
|
116
|
+
- Configurable batch processing for large prompt sets
|
117
|
+
- Optional skip patterns for excluding files
|
118
|
+
|
119
|
+
## CI/CD Integration
|
120
|
+
|
121
|
+
### GitHub Actions
|
122
|
+
Pre-built workflow templates for:
|
123
|
+
- Pull request validation
|
124
|
+
- Scheduled prompt auditing
|
125
|
+
- Release gate checks
|
126
|
+
|
127
|
+
### Configuration Examples
|
128
|
+
- Development environment setup
|
129
|
+
- Production deployment settings
|
130
|
+
- Team-specific threshold configurations
|
data/docs/development.md
ADDED
@@ -0,0 +1,376 @@
|
|
1
|
+
# SentinelRb Development Guide
|
2
|
+
|
3
|
+
## Development Setup
|
4
|
+
|
5
|
+
### Prerequisites
|
6
|
+
- Ruby >= 3.1.0
|
7
|
+
- Bundler
|
8
|
+
- Git
|
9
|
+
|
10
|
+
### Initial Setup
|
11
|
+
```bash
|
12
|
+
git clone <repository-url>
|
13
|
+
cd sentinel_rb
|
14
|
+
bin/setup
|
15
|
+
```
|
16
|
+
|
17
|
+
### Running Tests
|
18
|
+
```bash
|
19
|
+
# All tests
|
20
|
+
rake spec
|
21
|
+
|
22
|
+
# Specific test file
|
23
|
+
rspec spec/analyzers/irrelevant_info_spec.rb
|
24
|
+
|
25
|
+
# With coverage
|
26
|
+
COVERAGE=true rake spec
|
27
|
+
```
|
28
|
+
|
29
|
+
### Development Console
|
30
|
+
```bash
|
31
|
+
bin/console
|
32
|
+
```
|
33
|
+
|
34
|
+
## Project Structure
|
35
|
+
|
36
|
+
```
|
37
|
+
sentinel_rb/
|
38
|
+
├── lib/
|
39
|
+
│ ├── sentinel_rb.rb # Main entry point
|
40
|
+
│ └── sentinel_rb/
|
41
|
+
│ ├── version.rb # Version management
|
42
|
+
│ ├── config.rb # Configuration loading
|
43
|
+
│ ├── cli.rb # Command line interface
|
44
|
+
│ ├── client/ # LLM client implementations
|
45
|
+
│ │ ├── base.rb
|
46
|
+
│ │ ├── openai.rb
|
47
|
+
│ │ └── anthropic.rb
|
48
|
+
│ ├── analyzers/ # Analysis modules
|
49
|
+
│ │ ├── base.rb
|
50
|
+
│ │ ├── irrelevant_info.rb
|
51
|
+
│ │ ├── misinformation.rb
|
52
|
+
│ │ ├── few_shot_bias.rb
|
53
|
+
│ │ ├── base_model.rb
|
54
|
+
│ │ └── dangerous_tools.rb
|
55
|
+
│ ├── report/ # Reporting system
|
56
|
+
│ │ ├── formatter.rb
|
57
|
+
│ │ ├── table.rb
|
58
|
+
│ │ ├── json.rb
|
59
|
+
│ │ └── detailed.rb
|
60
|
+
│ └── utils/ # Utility classes
|
61
|
+
│ ├── file_finder.rb
|
62
|
+
│ └── prompt_parser.rb
|
63
|
+
├── spec/ # Test files
|
64
|
+
├── docs/ # Documentation
|
65
|
+
├── exe/ # Executable files
|
66
|
+
└── bin/ # Development scripts
|
67
|
+
```
|
68
|
+
|
69
|
+
## Adding New Analyzers
|
70
|
+
|
71
|
+
### 1. Create Analyzer Class
|
72
|
+
```ruby
|
73
|
+
# lib/sentinel_rb/analyzers/your_analyzer.rb
|
74
|
+
module SentinelRb
|
75
|
+
module Analyzers
|
76
|
+
class YourAnalyzer < Base
|
77
|
+
def call
|
78
|
+
# Your analysis logic here
|
79
|
+
findings = []
|
80
|
+
|
81
|
+
if condition_met?
|
82
|
+
findings << {
|
83
|
+
id: 'A6',
|
84
|
+
level: :warn,
|
85
|
+
message: 'Your custom warning message'
|
86
|
+
}
|
87
|
+
end
|
88
|
+
|
89
|
+
findings
|
90
|
+
end
|
91
|
+
|
92
|
+
private
|
93
|
+
|
94
|
+
def condition_met?
|
95
|
+
# Your condition logic
|
96
|
+
end
|
97
|
+
end
|
98
|
+
end
|
99
|
+
end
|
100
|
+
```
|
101
|
+
|
102
|
+
### 2. Register Analyzer
|
103
|
+
```ruby
|
104
|
+
# lib/sentinel_rb.rb
|
105
|
+
require 'sentinel_rb/analyzers/your_analyzer'
|
106
|
+
|
107
|
+
module SentinelRb
|
108
|
+
ANALYZERS = {
|
109
|
+
'A1' => Analyzers::IrrelevantInfo,
|
110
|
+
'A2' => Analyzers::Misinformation,
|
111
|
+
'A3' => Analyzers::FewShotBias,
|
112
|
+
'A4' => Analyzers::BaseModel,
|
113
|
+
'A5' => Analyzers::DangerousTools,
|
114
|
+
'A6' => Analyzers::YourAnalyzer # Add here
|
115
|
+
}.freeze
|
116
|
+
end
|
117
|
+
```
|
118
|
+
|
119
|
+
### 3. Add Tests
|
120
|
+
```ruby
|
121
|
+
# spec/analyzers/your_analyzer_spec.rb
|
122
|
+
require 'spec_helper'
|
123
|
+
|
124
|
+
RSpec.describe SentinelRb::Analyzers::YourAnalyzer do
|
125
|
+
let(:config) { { 'your_threshold' => 0.5 } }
|
126
|
+
let(:client) { instance_double(SentinelRb::Client::OpenAI) }
|
127
|
+
let(:analyzer) { described_class.new(prompt, config, client) }
|
128
|
+
|
129
|
+
describe '#call' do
|
130
|
+
context 'when condition is met' do
|
131
|
+
let(:prompt) { 'test prompt that triggers condition' }
|
132
|
+
|
133
|
+
it 'returns warning' do
|
134
|
+
result = analyzer.call
|
135
|
+
expect(result).to include(
|
136
|
+
hash_including(
|
137
|
+
id: 'A6',
|
138
|
+
level: :warn,
|
139
|
+
message: include('Your custom warning')
|
140
|
+
)
|
141
|
+
)
|
142
|
+
end
|
143
|
+
end
|
144
|
+
|
145
|
+
context 'when condition is not met' do
|
146
|
+
let(:prompt) { 'normal prompt' }
|
147
|
+
|
148
|
+
it 'returns no findings' do
|
149
|
+
result = analyzer.call
|
150
|
+
expect(result).to be_empty
|
151
|
+
end
|
152
|
+
end
|
153
|
+
end
|
154
|
+
end
|
155
|
+
```
|
156
|
+
|
157
|
+
## Adding LLM Providers
|
158
|
+
|
159
|
+
### 1. Implement Client Interface
|
160
|
+
```ruby
|
161
|
+
# lib/sentinel_rb/client/your_provider.rb
|
162
|
+
module SentinelRb
|
163
|
+
module Client
|
164
|
+
class YourProvider < Base
|
165
|
+
def initialize(config)
|
166
|
+
super
|
167
|
+
@api_key = ENV.fetch('YOUR_PROVIDER_API_KEY') # raises KeyError if the key is unset
|
168
|
+
@base_url = config['base_url'] || 'https://api.yourprovider.com'
|
169
|
+
end
|
170
|
+
|
171
|
+
def similarity(text1, text2)
|
172
|
+
# Implement similarity calculation
|
173
|
+
end
|
174
|
+
|
175
|
+
def fact_check(statement)
|
176
|
+
# Implement fact checking
|
177
|
+
end
|
178
|
+
|
179
|
+
def analyze_content(prompt)
|
180
|
+
# Implement content analysis
|
181
|
+
end
|
182
|
+
|
183
|
+
private
|
184
|
+
|
185
|
+
def make_request(endpoint, payload)
|
186
|
+
# HTTP request implementation
|
187
|
+
end
|
188
|
+
end
|
189
|
+
end
|
190
|
+
end
|
191
|
+
```
|
192
|
+
|
193
|
+
### 2. Register Provider
|
194
|
+
```ruby
|
195
|
+
# lib/sentinel_rb/client.rb
|
196
|
+
module SentinelRb
|
197
|
+
module Client
|
198
|
+
PROVIDERS = {
|
199
|
+
'openai' => OpenAI,
|
200
|
+
'anthropic' => Anthropic,
|
201
|
+
'your_provider' => YourProvider # Add here
|
202
|
+
}.freeze
|
203
|
+
|
204
|
+
def self.create(config)
|
205
|
+
provider = config['provider']
|
206
|
+
client_class = PROVIDERS[provider]
|
207
|
+
raise ArgumentError, "Unsupported provider: #{provider}" unless client_class
|
208
|
+
|
209
|
+
client_class.new(config)
|
210
|
+
end
|
211
|
+
end
|
212
|
+
end
|
213
|
+
```
|
214
|
+
|
215
|
+
## Testing Guidelines
|
216
|
+
|
217
|
+
### Unit Tests
|
218
|
+
- Each analyzer should have comprehensive test coverage
|
219
|
+
- Mock external API calls
|
220
|
+
- Test edge cases and error conditions
|
221
|
+
|
222
|
+
### Integration Tests
|
223
|
+
- Test CLI commands end-to-end
|
224
|
+
- Test configuration loading
|
225
|
+
- Test file processing pipeline
|
226
|
+
|
227
|
+
### Example Test Structure
|
228
|
+
```ruby
|
229
|
+
RSpec.describe SentinelRb::Analyzers::IrrelevantInfo do
|
230
|
+
let(:config) { { 'relevance_threshold' => 0.55 } }
|
231
|
+
let(:client) { instance_double(SentinelRb::Client::OpenAI) }
|
232
|
+
let(:analyzer) { described_class.new(prompt, config, client) }
|
233
|
+
|
234
|
+
before do
|
235
|
+
allow(client).to receive(:similarity).and_return(similarity_score)
|
236
|
+
end
|
237
|
+
|
238
|
+
context 'with relevant prompt' do
|
239
|
+
let(:prompt) { 'Clear, focused task description' }
|
240
|
+
let(:similarity_score) { 0.8 }
|
241
|
+
|
242
|
+
it 'returns no findings' do
|
243
|
+
expect(analyzer.call).to be_empty
|
244
|
+
end
|
245
|
+
end
|
246
|
+
|
247
|
+
context 'with irrelevant prompt' do
|
248
|
+
let(:prompt) { 'Off-topic content mixed with task' }
|
249
|
+
let(:similarity_score) { 0.3 }
|
250
|
+
|
251
|
+
it 'returns relevance warning' do
|
252
|
+
result = analyzer.call
|
253
|
+
expect(result).to include(
|
254
|
+
hash_including(
|
255
|
+
id: 'A1',
|
256
|
+
level: :warn,
|
257
|
+
message: include('relevance')
|
258
|
+
)
|
259
|
+
)
|
260
|
+
end
|
261
|
+
end
|
262
|
+
end
|
263
|
+
```
|
264
|
+
|
265
|
+
## Code Style and Conventions
|
266
|
+
|
267
|
+
### Ruby Style
|
268
|
+
- Follow the Ruby community style guide
|
269
|
+
- Use RuboCop for style enforcement
|
270
|
+
- 2-space indentation
|
271
|
+
- Maximum line length: 100 characters
|
272
|
+
|
273
|
+
### Naming Conventions
|
274
|
+
- Classes: PascalCase
|
275
|
+
- Methods: snake_case
|
276
|
+
- Constants: SCREAMING_SNAKE_CASE
|
277
|
+
- Files: snake_case
|
278
|
+
|
279
|
+
### Documentation
|
280
|
+
- Use YARD for API documentation
|
281
|
+
- Include examples in method documentation
|
282
|
+
- Document all public methods and classes
|
283
|
+
|
284
|
+
### Example Documentation
|
285
|
+
```ruby
|
286
|
+
# Analyzes prompts for irrelevant information
|
287
|
+
#
|
288
|
+
# @param prompt [String] the prompt text to analyze
|
289
|
+
# @param config [Hash] configuration options
|
290
|
+
# @param client [Client::Base] LLM client instance
|
291
|
+
# @return [Array<Hash>] array of findings
|
292
|
+
#
|
293
|
+
# @example
|
294
|
+
# analyzer = IrrelevantInfo.new(prompt, config, client)
|
295
|
+
# findings = analyzer.call
|
296
|
+
# findings.each { |f| puts f[:message] }
|
297
|
+
class IrrelevantInfo < Base
|
298
|
+
# Performs the analysis
|
299
|
+
#
|
300
|
+
# @return [Array<Hash>] findings with :id, :level, :message keys
|
301
|
+
def call
|
302
|
+
# Implementation
|
303
|
+
end
|
304
|
+
end
|
305
|
+
```
|
306
|
+
|
307
|
+
## Performance Considerations
|
308
|
+
|
309
|
+
### Caching
|
310
|
+
- Implement response caching for repeated API calls
|
311
|
+
- Use a file-based cache for development
|
312
|
+
- Use a Redis cache for production deployments
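A file-based cache for development could look roughly like this. `FileCache` is a hypothetical class, and storing one JSON file per SHA-256 key is just one possible layout; it does not handle expiry or concurrent writers.

```ruby
require 'digest'
require 'json'
require 'tmpdir'

# Hypothetical file-backed cache: one JSON file per key, named by digest.
class FileCache
  def initialize(dir)
    @dir = dir
  end

  # Returns the cached value for key, or computes, stores, and returns it.
  def fetch(key)
    path = File.join(@dir, "#{Digest::SHA256.hexdigest(key)}.json")
    return JSON.parse(File.read(path))['value'] if File.exist?(path)

    value = yield
    File.write(path, JSON.generate({ 'value' => value }))
    value
  end
end

calls = 0
scores = Dir.mktmpdir do |dir|
  cache = FileCache.new(dir)
  2.times.map do
    cache.fetch('prompt text') do
      calls += 1 # stands in for an expensive LLM call
      0.8
    end
  end
end
```

The second lookup is served from disk, so the block standing in for the API call runs only once.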
|
313
|
+
|
314
|
+
### Parallel Processing
|
315
|
+
- Use thread pools for I/O bound operations
|
316
|
+
- Implement batch processing for large prompt sets
|
317
|
+
- Consider memory usage with large files
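A thread pool for I/O-bound work can be built from core Ruby alone. `parallel_map` is a hypothetical helper, shown as a sketch rather than the project's chosen approach; it preserves input order and caps concurrency at `workers`.

```ruby
# Minimal fixed-size worker pool built on Queue and Thread from core Ruby.
# Results are written by index, so output order matches input order.
def parallel_map(items, workers: 4)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  queue.close # pop returns nil once the closed queue is drained

  results = Array.new(items.size)
  threads = Array.new(workers) do
    Thread.new do
      while (job = queue.pop)
        item, index = job
        results[index] = yield(item)
      end
    end
  end
  threads.each(&:join)
  results
end

squares = parallel_map([1, 2, 3, 4, 5], workers: 2) { |n| n * n }
```

In practice the block would wrap an API call per prompt file; bounding the worker count also helps keep memory use and request rates predictable.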
|
318
|
+
|
319
|
+
### Rate Limiting
|
320
|
+
- Implement respectful rate limiting for API calls
|
321
|
+
- Make delays between requests configurable
|
322
|
+
- Use exponential backoff for failed requests
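Exponential backoff can be sketched as a small retry wrapper. `with_backoff` and its parameters are hypothetical; the sleeper is injectable so the behaviour can be checked without real waiting, and a production version would likely rescue only rate-limit or network errors.

```ruby
# Retries the block with exponentially growing delays: base, 2*base, 4*base, ...
# Re-raises the last error once max_attempts is reached.
def with_backoff(max_attempts: 4, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

delays = []
result = with_backoff(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise 'rate limited' if attempt < 3

  "ok after #{attempt} attempts"
end
```

In this run the first two attempts fail, so the recorded delays are 0.5 and 1.0 seconds before the third attempt succeeds.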
|
323
|
+
|
324
|
+
## Release Process
|
325
|
+
|
326
|
+
### Version Management
|
327
|
+
```bash
|
328
|
+
# Update version
|
329
|
+
vim lib/sentinel_rb/version.rb
|
330
|
+
|
331
|
+
# Update changelog
|
332
|
+
vim CHANGELOG.md
|
333
|
+
|
334
|
+
# Commit changes
|
335
|
+
git add lib/sentinel_rb/version.rb CHANGELOG.md
|
336
|
+
git commit -m "Release v1.2.3"
|
337
|
+
|
338
|
+
# Tag release
|
339
|
+
git tag v1.2.3
|
340
|
+
git push origin main --tags
|
341
|
+
```
|
342
|
+
|
343
|
+
### Gem Publishing
|
344
|
+
```bash
|
345
|
+
# Build gem
|
346
|
+
rake build
|
347
|
+
|
348
|
+
# Test gem installation
|
349
|
+
gem install pkg/sentinel_rb-1.2.3.gem
|
350
|
+
|
351
|
+
# Publish to RubyGems
|
352
|
+
rake release
|
353
|
+
```
|
354
|
+
|
355
|
+
## Contributing Guidelines
|
356
|
+
|
357
|
+
### Pull Request Process
|
358
|
+
1. Fork the repository
|
359
|
+
2. Create a feature branch
|
360
|
+
3. Add tests for new functionality
|
361
|
+
4. Ensure all tests pass
|
362
|
+
5. Update documentation
|
363
|
+
6. Submit a pull request
|
364
|
+
|
365
|
+
### Code Review Checklist
|
366
|
+
- [ ] Tests added for new functionality
|
367
|
+
- [ ] Documentation updated
|
368
|
+
- [ ] Performance impact considered
|
369
|
+
- [ ] Security implications reviewed
|
370
|
+
- [ ] Backward compatibility maintained
|
371
|
+
|
372
|
+
### Issue Reporting
|
373
|
+
- Use issue templates
|
374
|
+
- Include reproduction steps
|
375
|
+
- Provide relevant configuration
|
376
|
+
- Include error messages and logs
|