diogenes 0.1.2 → 0.1.3
- checksums.yaml +4 -4
- data/.mise/config.toml +72 -0
- data/.mise/mise.lock +179 -0
- data/.mise/tasks/update-hk-import +79 -0
- data/.release-please-config.json +1 -1
- data/.release-please-manifest.json +2 -2
- data/CHANGELOG.md +7 -0
- data/CLAUDE.md +107 -99
- data/CONTRIBUTING.md +206 -0
- data/README.md +157 -134
- data/Rakefile +15 -1
- data/Steepfile +11 -0
- data/docs/gates.md +178 -0
- data/docs/targets.md +11 -0
- data/exe/diogenes +6 -0
- data/hk.pkl +46 -0
- data/lib/diogenes/cli/init.rb +88 -0
- data/lib/diogenes/cli.rb +95 -0
- data/lib/diogenes/templates/init/artifacts/decision_record.md.erb +53 -0
- data/lib/diogenes/templates/init/diogenes.rb +13 -0
- data/lib/diogenes/templates/init/hooks/README.md +15 -0
- data/lib/diogenes/templates/init/rules/five_gates.rb +33 -0
- data/lib/diogenes/templates/init/skills/example_skill.rb +33 -0
- data/lib/diogenes/version.rb +2 -1
- data/lib/diogenes.rb +27 -2
- data/sig/generated/diogenes/cli/init.rbs +34 -0
- data/sig/generated/diogenes/cli.rbs +34 -0
- data/sig/generated/diogenes/version.rbs +5 -0
- data/sig/generated/diogenes.rbs +26 -0
- metadata +23 -9
- data/docs/context.md +0 -60
- data/docs/contributing.md +0 -228
- data/docs/dashboard.md +0 -365
- data/docs/examples.md +0 -162
- data/docs/framework.md +0 -146
- data/mise.lock +0 -48
- data/mise.toml +0 -6
data/docs/gates.md
ADDED
|
@@ -0,0 +1,178 @@
|
|
|
1
|
+
# Gate Reference
|
|
2
|
+
|
|
3
|
+
The five gates each map to a Ruby principle. A feature must pass all five to proceed. Gates are evaluated in order — failing early is cheaper than failing late.
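The ordering rule above can be sketched in a few lines of Ruby. This is a minimal illustration, assuming that evaluation stops at the first failing gate; `GateResult`, `evaluate_in_order`, and the `GATES` hash are hypothetical names, not part of the gem's API.

```ruby
# Hypothetical sketch: run gates in order and stop at the first failure.
GateResult = Struct.new(:name, :passed, :reason, keyword_init: true)

# Each gate is a name plus a callable returning [passed, reason].
# Only two gates are shown to keep the sketch short.
GATES = {
  failure_mode: ->(f) { [f[:severity] != :catastrophic, "severity: #{f[:severity].inspect}"] },
  right_tool: ->(f) { [f[:alternative_verdict] == :inferior, "alternative_verdict: #{f[:alternative_verdict].inspect}"] }
}.freeze

def evaluate_in_order(gates, feature)
  results = []
  gates.each do |name, check|
    passed, reason = check.call(feature)
    results << GateResult.new(name: name, passed: passed, reason: reason)
    break unless passed # later gates are never evaluated
  end
  results
end

results = evaluate_in_order(GATES, {severity: :catastrophic, alternative_verdict: :inferior})
results.each { |r| puts "#{r.name}: #{r.passed ? "PASS" : "FAIL"} (#{r.reason})" }
```

Because the first gate fails here, the second gate's check is never called, which is the sense in which failing early is cheaper.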
|
|
4
|
+
|
|
5
|
+
## Gate 1 — Failure Mode
|
|
6
|
+
|
|
7
|
+
**Ruby principle:** Least surprise — at scale
|
|
8
|
+
|
|
9
|
+
**The question:** What happens when this feature is wrong?
|
|
10
|
+
|
|
11
|
+
This gate isn't asking *if* the AI will be wrong. It will be. The gate asks what failure looks like for a real user at real scale. A wrong restaurant recommendation is recoverable. A wrong medication interaction summary is not.
|
|
12
|
+
|
|
13
|
+
### Options
|
|
14
|
+
|
|
15
|
+
| Option | Values | Notes |
|
|
16
|
+
|--------|--------|-------|
|
|
17
|
+
| `severity` | `:recoverable`, `:embarrassing`, `:catastrophic` | Required |
|
|
18
|
+
|
|
19
|
+
### Pass / Fail
|
|
20
|
+
|
|
21
|
+
| Severity | Result |
|
|
22
|
+
|----------|--------|
|
|
23
|
+
| `:recoverable` | ✓ PASS — user can identify and recover from a wrong answer |
|
|
24
|
+
| `:embarrassing` | ✓ PASS (with conditions) — wrong answers hurt trust but don't cause harm |
|
|
25
|
+
| `:catastrophic` | ✗ FAIL — wrong answers cause harm that can't be undone |
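The table above is a direct lookup from severity to result. A minimal sketch, assuming result symbols of my own choosing (`:pass`, `:pass_with_conditions`, `:fail`); `failure_mode_result` is a hypothetical helper, not the gem's implementation.

```ruby
# Hypothetical helper mirroring the Gate 1 Pass / Fail table.
FAILURE_MODE_RESULTS = {
  recoverable: :pass,
  embarrassing: :pass_with_conditions,
  catastrophic: :fail
}.freeze

def failure_mode_result(severity:)
  # `severity` is required, so an unknown value is treated as an error.
  FAILURE_MODE_RESULTS.fetch(severity) do
    raise ArgumentError, "unknown severity: #{severity.inspect}"
  end
end

puts failure_mode_result(severity: :embarrassing)
```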
|
|
26
|
+
|
|
27
|
+
### Failure message
|
|
28
|
+
|
|
29
|
+
```text
|
|
30
|
+
Gate 1 (failure_mode) failed.
|
|
31
|
+
severity: :catastrophic — catastrophic failures are not acceptable at scale.
|
|
32
|
+
When this feature is wrong, the user cannot recover without assistance or may
|
|
33
|
+
be actively harmed. Consider a software alternative: clearer UI, explicit error
|
|
34
|
+
states, or rule-based logic with predictable output.
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
## Gate 2 — User Verifiable
|
|
38
|
+
|
|
39
|
+
**Ruby principle:** Trust requires verification
|
|
40
|
+
|
|
41
|
+
**The question:** Can your average user tell when it's wrong?
|
|
42
|
+
|
|
43
|
+
The calibration gap. If the user lacks the domain knowledge to evaluate the AI's output, you've built a confident-sounding oracle nobody can fact-check. That's not a feature — it's a liability. The AI's confidence doesn't transfer knowledge to the user.
|
|
44
|
+
|
|
45
|
+
### Options
|
|
46
|
+
|
|
47
|
+
| Option | Values | Notes |
|
|
48
|
+
|--------|--------|-------|
|
|
49
|
+
| `domain` | `:general`, `:technical`, `:financial`, `:medical`, `:legal` | Required |
|
|
50
|
+
| `user_expertise` | `:novice`, `:intermediate`, `:expert` | Optional, defaults to `:novice` |
|
|
51
|
+
|
|
52
|
+
### Pass / Fail
|
|
53
|
+
|
|
54
|
+
| Domain | User Expertise | Result |
|
|
55
|
+
|--------|---------------|--------|
|
|
56
|
+
| `:general` | any | ✓ PASS |
|
|
57
|
+
| `:technical` | `:expert` | ✓ PASS |
|
|
58
|
+
| `:technical` | `:novice`, `:intermediate` | ✗ FAIL |
|
|
59
|
+
| `:financial` | `:expert` | ✓ PASS |
|
|
60
|
+
| `:financial` | `:novice`, `:intermediate` | ✗ FAIL |
|
|
61
|
+
| `:medical` | any | ✗ FAIL |
|
|
62
|
+
| `:legal` | any | ✗ FAIL |
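The Gate 2 table combines two inputs, with `user_expertise` defaulting to `:novice`. A minimal sketch of that decision; `user_verifiable?` and the two constant names are assumptions made for illustration.

```ruby
# Hypothetical predicate mirroring the Gate 2 Pass / Fail table.
EXPERT_ONLY_DOMAINS = %i[technical financial].freeze
ALWAYS_FAIL_DOMAINS = %i[medical legal].freeze

def user_verifiable?(domain:, user_expertise: :novice)
  case domain
  when :general then true
  when *EXPERT_ONLY_DOMAINS then user_expertise == :expert
  when *ALWAYS_FAIL_DOMAINS then false
  else raise ArgumentError, "unknown domain: #{domain.inspect}"
  end
end

puts user_verifiable?(domain: :financial)                          # novice by default
puts user_verifiable?(domain: :financial, user_expertise: :expert)
```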
|
|
63
|
+
|
|
64
|
+
### Failure message
|
|
65
|
+
|
|
66
|
+
```text
|
|
67
|
+
Gate 2 (user_verifiable) failed.
|
|
68
|
+
domain: :financial, user_expertise: :novice — the average user cannot verify
|
|
69
|
+
financial AI output without domain expertise. A confident wrong answer in a
|
|
70
|
+
financial context creates disputes users can't resolve themselves.
|
|
71
|
+
Consider: display the underlying data alongside any AI interpretation so the
|
|
72
|
+
user can evaluate both.
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
## Gate 3 — Human in the Loop
|
|
76
|
+
|
|
77
|
+
**Ruby principle:** Human-centered design, genuinely
|
|
78
|
+
|
|
79
|
+
**The question:** Is there a human checking — actually checking?
|
|
80
|
+
|
|
81
|
+
"A human reviews it" sounds responsible until you realize the human is reviewing 400 AI outputs a day and rubber-stamping them. A human in the loop must have the time, context, and authority to actually intervene. Theater is not a gate.
|
|
82
|
+
|
|
83
|
+
### Options
|
|
84
|
+
|
|
85
|
+
| Option | Values | Notes |
|
|
86
|
+
|--------|--------|-------|
|
|
87
|
+
| `required` | `true`, `false` | Required. Does this feature need human review? |
|
|
88
|
+
| `capacity` | `:real`, `:theoretical` | Required if `required: true` |
|
|
89
|
+
| `authority` | `true`, `false` | Optional. Can the human actually intervene? Defaults to `true` |
|
|
90
|
+
|
|
91
|
+
### Pass / Fail
|
|
92
|
+
|
|
93
|
+
| Required | Capacity | Authority | Result |
|
|
94
|
+
|----------|----------|-----------|--------|
|
|
95
|
+
| `false` | — | — | ✓ PASS — no human review needed |
|
|
96
|
+
| `true` | `:real` | `true` | ✓ PASS |
|
|
97
|
+
| `true` | `:real` | `false` | ✗ FAIL — human can see but not act |
|
|
98
|
+
| `true` | `:theoretical` | any | ✗ FAIL — rubber-stamping is not a loop |
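The Gate 3 table reduces to a short predicate. A minimal sketch, assuming that a missing `capacity` with `required: true` is an error (the options table only says it is required in that case); `human_in_loop?` is a hypothetical name.

```ruby
# Hypothetical predicate mirroring the Gate 3 Pass / Fail table.
def human_in_loop?(required:, capacity: nil, authority: true)
  return true unless required # no human review needed
  raise ArgumentError, "capacity is required when required: true" if capacity.nil?

  # Passes only when review capacity is real and the reviewer can intervene.
  capacity == :real && authority
end

puts human_in_loop?(required: true, capacity: :real, authority: false)
```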
|
|
99
|
+
|
|
100
|
+
### Failure message
|
|
101
|
+
|
|
102
|
+
```text
|
|
103
|
+
Gate 3 (human_in_loop) failed.
|
|
104
|
+
capacity: :theoretical — a human review process that doesn't scale to real
|
|
105
|
+
throughput is theater, not a safety gate. If reviewers are rubber-stamping,
|
|
106
|
+
the loop isn't doing the work you think it is.
|
|
107
|
+
Consider: reduce the volume of AI decisions that require review, or invest in
|
|
108
|
+
tooling that makes review genuinely fast and high-confidence.
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
## Gate 4 — Observability
|
|
112
|
+
|
|
113
|
+
**Ruby principle:** Craftsmanship — you wouldn't ship a sorting algorithm blind
|
|
114
|
+
|
|
115
|
+
**The question:** Do you have the monitoring to know when it's going wrong?
|
|
116
|
+
|
|
117
|
+
Most teams ship AI features with less monitoring than they'd give a sorting algorithm. Without the ability to detect hallucination rates, output drift, or user confusion signals, you don't know what you've shipped. You've released something that can degrade silently.
|
|
118
|
+
|
|
119
|
+
### Options
|
|
120
|
+
|
|
121
|
+
| Option | Values | Notes |
|
|
122
|
+
|--------|--------|-------|
|
|
123
|
+
| `monitoring` | `:required`, `:optional`, `:none` | Required |
|
|
124
|
+
| `signals` | array of symbols | Optional. e.g. `[:hallucination_rate, :user_correction_rate, :latency]` |
|
|
125
|
+
|
|
126
|
+
### Pass / Fail
|
|
127
|
+
|
|
128
|
+
| Monitoring | Result |
|
|
129
|
+
|------------|--------|
|
|
130
|
+
| `:required` (and signals defined) | ✓ PASS |
|
|
131
|
+
| `:required` (and no signals) | ✗ FAIL — required but no signals defined |
|
|
132
|
+
| `:optional` | ✓ PASS (with warning) |
|
|
133
|
+
| `:none` | ✗ FAIL |
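The Gate 4 table depends on both the monitoring level and whether any signals are defined. A minimal sketch, with result symbols chosen for illustration; `observability_result` is a hypothetical helper.

```ruby
# Hypothetical helper mirroring the Gate 4 Pass / Fail table.
def observability_result(monitoring:, signals: [])
  case monitoring
  when :required then signals.empty? ? :fail : :pass
  when :optional then :pass_with_warning
  when :none then :fail
  else raise ArgumentError, "unknown monitoring level: #{monitoring.inspect}"
  end
end

puts observability_result(monitoring: :required, signals: [:user_correction_rate])
puts observability_result(monitoring: :required)
```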
|
|
134
|
+
|
|
135
|
+
### Failure message
|
|
136
|
+
|
|
137
|
+
```text
|
|
138
|
+
Gate 4 (observability) failed.
|
|
139
|
+
monitoring: :none — you have no way to know when this feature starts going wrong
|
|
140
|
+
in production. AI output degrades over time as prompts, models, and underlying
|
|
141
|
+
data change. Silent degradation is worse than no feature.
|
|
142
|
+
Consider: define at minimum a user correction rate signal and a weekly review
|
|
143
|
+
cadence before shipping.
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
## Gate 5 — Right Tool
|
|
147
|
+
|
|
148
|
+
**Ruby principle:** Convention over configuration — use the boring right answer
|
|
149
|
+
|
|
150
|
+
**The question:** Is AI the right tool here, or just the exciting one?
|
|
151
|
+
|
|
152
|
+
This gate exists to give you explicit permission to say no. Sometimes a rule-based system, a good search index, clearer copy, or better UX solves the problem better, cheaper, and more reliably. The framework should make "not AI" a first-class answer, not a fallback.
|
|
153
|
+
|
|
154
|
+
### Options
|
|
155
|
+
|
|
156
|
+
| Option | Values | Notes |
|
|
157
|
+
|--------|--------|-------|
|
|
158
|
+
| `alternatives_considered` | `true`, `false` | Required |
|
|
159
|
+
| `alternative_verdict` | `:inferior`, `:equivalent`, `:superior` | Required if `alternatives_considered: true` |
|
|
160
|
+
|
|
161
|
+
### Pass / Fail
|
|
162
|
+
|
|
163
|
+
| Alternatives Considered | Alternative Verdict | Result |
|
|
164
|
+
|------------------------|---------------------|--------|
|
|
165
|
+
| `false` | — | ✗ FAIL — you haven't checked |
|
|
166
|
+
| `true` | `:inferior` | ✓ PASS — AI is genuinely the right choice |
|
|
167
|
+
| `true` | `:equivalent` | ✗ FAIL — use the simpler option |
|
|
168
|
+
| `true` | `:superior` | ✗ FAIL — the non-AI solution is better; build that instead |
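In the Gate 5 table the verdict describes the non-AI alternative, so only an `:inferior` alternative lets the AI feature pass. A minimal sketch; `right_tool?` is a hypothetical name.

```ruby
# Hypothetical predicate mirroring the Gate 5 Pass / Fail table.
def right_tool?(alternatives_considered:, alternative_verdict: nil)
  return false unless alternatives_considered # nothing was checked

  # The verdict rates the alternative: AI passes only if the alternative is worse.
  alternative_verdict == :inferior
end

puts right_tool?(alternatives_considered: true, alternative_verdict: :equivalent)
```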
|
|
169
|
+
|
|
170
|
+
### Failure message
|
|
171
|
+
|
|
172
|
+
```text
|
|
173
|
+
Gate 5 (right_tool) failed.
|
|
174
|
+
alternative_verdict: :equivalent — a non-AI solution would serve this need
|
|
175
|
+
equally well, with lower complexity, lower cost, and more predictable behavior.
|
|
176
|
+
Convention over configuration: use the boring right answer.
|
|
177
|
+
The software alternative should be the feature.
|
|
178
|
+
```
|
data/docs/targets.md
ADDED
data/exe/diogenes
ADDED
data/hk.pkl
ADDED
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
amends "package://github.com/jdx/hk/releases/download/v1.45.0/hk@1.45.0#/Config.pkl"
|
|
2
|
+
import "package://github.com/jdx/hk/releases/download/v1.45.0/hk@1.45.0#/Builtins.pkl"
|
|
3
|
+
|
|
4
|
+
fail_fast = false
|
|
5
|
+
env {
|
|
6
|
+
["HK_MISE"] = "1"
|
|
7
|
+
}
|
|
8
|
+
|
|
9
|
+
exclude =
|
|
10
|
+
List(
|
|
11
|
+
"vendor/**/*",
|
|
12
|
+
"tmp/**/*"
|
|
13
|
+
)
|
|
14
|
+
|
|
15
|
+
local linters = new Mapping<String, Step> {
|
|
16
|
+
["actionlint"] = Builtins.actionlint
|
|
17
|
+
["mise"] = Builtins.mise
|
|
18
|
+
["pkl"] = Builtins.pkl
|
|
19
|
+
["yamlfmt"] = Builtins.yamlfmt
|
|
20
|
+
["standard-rb"] = (Builtins.standard_rb) {
|
|
21
|
+
exclude = List("bin/*")
|
|
22
|
+
check = "bundle exec rake standard {{files}}"
|
|
23
|
+
fix = "bundle exec rake standard:fix {{ files }}"
|
|
24
|
+
}
|
|
25
|
+
|
|
26
|
+
// Generic linters
|
|
27
|
+
["fix-smart-quotes"] = Builtins.fix_smart_quotes
|
|
28
|
+
["mixed-line-ending"] = Builtins.mixed_line_ending
|
|
29
|
+
["newlines"] = Builtins.newlines
|
|
30
|
+
["trailing-whitespace"] = Builtins.trailing_whitespace
|
|
31
|
+
}
|
|
32
|
+
|
|
33
|
+
hooks {
|
|
34
|
+
["pre-commit"] {
|
|
35
|
+
fix = true
|
|
36
|
+
stash = "git"
|
|
37
|
+
steps = linters
|
|
38
|
+
}
|
|
39
|
+
["fix"] {
|
|
40
|
+
fix = true
|
|
41
|
+
steps = linters
|
|
42
|
+
}
|
|
43
|
+
["check"] {
|
|
44
|
+
steps = linters
|
|
45
|
+
}
|
|
46
|
+
}
|
data/lib/diogenes/cli/init.rb
ADDED
|
@@ -0,0 +1,88 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
# rbs_inline: enabled
|
|
3
|
+
|
|
4
|
+
require "fileutils"
|
|
5
|
+
|
|
6
|
+
module Diogenes
|
|
7
|
+
class Cli
|
|
8
|
+
class Init
|
|
9
|
+
DIOGENES_DIR = ".diogenes" #: String
|
|
10
|
+
TEMPLATES_ROOT = File.expand_path("../templates/init", __dir__) #: String
|
|
11
|
+
|
|
12
|
+
#: (cwd: String, out: IO, err: IO) -> Integer
|
|
13
|
+
def self.run(cwd:, out:, err:)
|
|
14
|
+
new(cwd: cwd, out: out, err: err).run
|
|
15
|
+
end
|
|
16
|
+
|
|
17
|
+
#: (cwd: String, out: IO, err: IO) -> void
|
|
18
|
+
def initialize(cwd:, out:, err:)
|
|
19
|
+
@cwd = cwd #: String
|
|
20
|
+
@out = out #: IO
|
|
21
|
+
@err = err #: IO
|
|
22
|
+
end
|
|
23
|
+
|
|
24
|
+
#: () -> Integer
|
|
25
|
+
def run
|
|
26
|
+
if diogenes_dir_exist?
|
|
27
|
+
@err.puts "Warning: #{DIOGENES_DIR}/ already exists — existing files will not be overwritten."
|
|
28
|
+
end
|
|
29
|
+
|
|
30
|
+
@out.puts "Initializing #{DIOGENES_DIR}/"
|
|
31
|
+
|
|
32
|
+
created = [] #: Array[String]
|
|
33
|
+
skipped = [] #: Array[String]
|
|
34
|
+
|
|
35
|
+
template_files.each do |relative_path|
|
|
36
|
+
destination = File.join(@cwd, DIOGENES_DIR, relative_path)
|
|
37
|
+
|
|
38
|
+
if File.exist?(destination)
|
|
39
|
+
skipped << relative_path
|
|
40
|
+
next
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
FileUtils.mkdir_p(File.dirname(destination))
|
|
44
|
+
FileUtils.cp(template_path(relative_path), destination)
|
|
45
|
+
created << relative_path
|
|
46
|
+
end
|
|
47
|
+
|
|
48
|
+
print_summary(created:, skipped:)
|
|
49
|
+
0
|
|
50
|
+
end
|
|
51
|
+
|
|
52
|
+
private
|
|
53
|
+
|
|
54
|
+
#: () -> bool
|
|
55
|
+
def diogenes_dir_exist?
|
|
56
|
+
Dir.exist?(File.join(@cwd, DIOGENES_DIR))
|
|
57
|
+
end
|
|
58
|
+
|
|
59
|
+
#: () -> Array[String]
|
|
60
|
+
def template_files
|
|
61
|
+
Dir.glob("**/*", base: TEMPLATES_ROOT, flags: File::FNM_DOTMATCH)
|
|
62
|
+
.reject { |path| path == "." || path.end_with?("/.") }
|
|
63
|
+
.select { |path| File.file?(template_path(path)) }
|
|
64
|
+
.sort
|
|
65
|
+
end
|
|
66
|
+
|
|
67
|
+
#: (String) -> String
|
|
68
|
+
def template_path(relative_path)
|
|
69
|
+
File.join(TEMPLATES_ROOT, relative_path)
|
|
70
|
+
end
|
|
71
|
+
|
|
72
|
+
#: (created: Array[String], skipped: Array[String]) -> void
|
|
73
|
+
def print_summary(created:, skipped:)
|
|
74
|
+
if created.any?
|
|
75
|
+
@out.puts "Initialized #{DIOGENES_DIR}/ with #{created.size} file#{"s" unless created.size == 1}."
|
|
76
|
+
created.each { |path| @out.puts " created #{DIOGENES_DIR}/#{path}" }
|
|
77
|
+
elsif skipped.any?
|
|
78
|
+
@out.puts "No files created — all #{skipped.size} template files already exist."
|
|
79
|
+
end
|
|
80
|
+
|
|
81
|
+
return if skipped.empty?
|
|
82
|
+
|
|
83
|
+
@out.puts "Skipped #{skipped.size} existing file#{"s" unless skipped.size == 1}:"
|
|
84
|
+
skipped.each { |path| @out.puts " skipped #{DIOGENES_DIR}/#{path}" }
|
|
85
|
+
end
|
|
86
|
+
end
|
|
87
|
+
end
|
|
88
|
+
end
|
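The copy loop in `Init#run` never overwrites an existing file. The stand-alone sketch below reproduces that technique outside the gem so it can be tried in a temporary directory; `copy_missing` and the file names are hypothetical, and `Dir.glob` is called with positional flags rather than the `flags:` keyword used in the diff.

```ruby
require "fileutils"
require "tmpdir"

# Copy every file under `source` into `target`, skipping files that already
# exist. Returns [created, skipped] as lists of relative paths.
def copy_missing(source, target)
  created = []
  skipped = []
  files = Dir.glob("**/*", File::FNM_DOTMATCH, base: source)
    .select { |path| File.file?(File.join(source, path)) }
    .sort

  files.each do |relative_path|
    destination = File.join(target, relative_path)
    if File.exist?(destination)
      skipped << relative_path
      next
    end
    FileUtils.mkdir_p(File.dirname(destination))
    FileUtils.cp(File.join(source, relative_path), destination)
    created << relative_path
  end
  [created, skipped]
end

Dir.mktmpdir do |root|
  source = File.join(root, "templates")
  target = File.join(root, ".diogenes")
  FileUtils.mkdir_p(File.join(source, "rules"))
  File.write(File.join(source, "diogenes.rb"), "# config\n")
  File.write(File.join(source, "rules", "five_gates.rb"), "# rule\n")

  p copy_missing(source, target) # first run creates both files
  p copy_missing(source, target) # second run skips both files
end
```

Running the function twice shows the idempotent behavior: the second call reports every file as skipped and leaves the target untouched.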
data/lib/diogenes/cli.rb
ADDED
|
@@ -0,0 +1,95 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
# rbs_inline: enabled
|
|
3
|
+
|
|
4
|
+
require_relative "cli/init"
|
|
5
|
+
|
|
6
|
+
module Diogenes
|
|
7
|
+
class Cli
|
|
8
|
+
COMMANDS = {
|
|
9
|
+
"init" => Init
|
|
10
|
+
}.freeze #: Hash[String, Class]
|
|
11
|
+
|
|
12
|
+
#: (?Array[String] argv, ?out: IO, ?err: IO) -> Integer
|
|
13
|
+
def self.run(argv = ARGV, out: $stdout, err: $stderr)
|
|
14
|
+
new(argv: argv, out: out, err: err).run
|
|
15
|
+
end
|
|
16
|
+
|
|
17
|
+
#: (argv: Array[String], out: IO, err: IO) -> void
|
|
18
|
+
def initialize(argv:, out:, err:)
|
|
19
|
+
@argv = argv.dup #: Array[String]
|
|
20
|
+
@out = out #: IO
|
|
21
|
+
@err = err #: IO
|
|
22
|
+
end
|
|
23
|
+
|
|
24
|
+
#: () -> Integer
|
|
25
|
+
def run
|
|
26
|
+
command, *args = @argv
|
|
27
|
+
|
|
28
|
+
case command
|
|
29
|
+
when nil, "help", "-h", "--help"
|
|
30
|
+
print_help
|
|
31
|
+
when "version", "-v", "--version"
|
|
32
|
+
print_version
|
|
33
|
+
when *COMMANDS.keys
|
|
34
|
+
run_command(COMMANDS.fetch(command), args)
|
|
35
|
+
else
|
|
36
|
+
raise UserError, unknown_command_message(command)
|
|
37
|
+
end
|
|
38
|
+
rescue OptionParser::ParseError, UserError => e
|
|
39
|
+
@err.puts e.message
|
|
40
|
+
1
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
private
|
|
44
|
+
|
|
45
|
+
# Run a command
|
|
46
|
+
# --
|
|
47
|
+
#: (Class, Array[String]) -> Integer
|
|
48
|
+
def run_command(command_class, args)
|
|
49
|
+
raise UserError, "Unexpected arguments: #{args.join(" ")}" if args.any?
|
|
50
|
+
|
|
51
|
+
command_class.run(cwd: Dir.pwd, out: @out, err: @err)
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
#: () -> Integer
|
|
55
|
+
def print_help
|
|
56
|
+
@out.puts <<~HELP
|
|
57
|
+
Diogenes - The Ruby AI Feature Gatekeeper
|
|
58
|
+
|
|
59
|
+
Usage:
|
|
60
|
+
diogenes <command> [options]
|
|
61
|
+
|
|
62
|
+
Description:
|
|
63
|
+
Diogenes is a Ruby gem that holds your AI features to the same light. It has two jobs:
|
|
64
|
+
1. A five-gate gauntlet decision framework that determines whether something should be an AI feature in production, or whether it's a software problem in disguise. Derived from Ruby's core principles: least surprise, programmer happiness, and human-centered design.
|
|
65
|
+
2. An agent configuration build tool that lets you write your AI agent skills, rules, and hooks once in a canonical Ruby DSL, and Diogenes builds the right configuration for every agent you use: Claude Code, Cursor, Copilot, Codex, Gemini, and more.
|
|
66
|
+
|
|
67
|
+
Commands:
|
|
68
|
+
init Initialize a new Diogenes project
|
|
69
|
+
build Build the Diogenes agent configuration for a given target
|
|
70
|
+
evaluate Evaluate a proposed AI feature through the five gates
|
|
71
|
+
validate Validate the Diogenes agent configuration for a given target
|
|
72
|
+
help Show help for a command
|
|
73
|
+
version Show version information
|
|
74
|
+
HELP
|
|
75
|
+
0
|
|
76
|
+
end
|
|
77
|
+
|
|
78
|
+
#: () -> Integer
|
|
79
|
+
def print_version
|
|
80
|
+
@out.puts(Diogenes::VERSION)
|
|
81
|
+
0
|
|
82
|
+
end
|
|
83
|
+
|
|
84
|
+
# Print the unknown command message
|
|
85
|
+
# --
|
|
86
|
+
#: (String) -> String
|
|
87
|
+
def unknown_command_message(command)
|
|
88
|
+
<<~MESSAGE.strip
|
|
89
|
+
Unknown command: #{command}
|
|
90
|
+
|
|
91
|
+
Run `diogenes help` to see available commands.
|
|
92
|
+
MESSAGE
|
|
93
|
+
end
|
|
94
|
+
end
|
|
95
|
+
end
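The dispatcher above takes `out` and `err` as arguments, so it can be exercised with `StringIO` instead of real streams. The sketch below is a stand-alone miniature of the same pattern; `MiniCli` and its `Hello` command are hypothetical. It loads `optparse` explicitly, since the constant named in a `rescue` clause has to be defined when an exception reaches it.

```ruby
require "optparse"
require "stringio"

# Hypothetical miniature of the dispatch pattern; not the gem's class.
class MiniCli
  class UserError < StandardError; end

  # Any object responding to .run(out:) can serve as a command.
  class Hello
    def self.run(out:)
      out.puts "hello"
      0
    end
  end

  COMMANDS = {"hello" => Hello}.freeze

  def self.run(argv, out: $stdout, err: $stderr)
    command, *args = argv

    case command
    when nil, "help"
      out.puts "Usage: mini <command>"
      0
    when *COMMANDS.keys # splat the registered names into the when clause
      raise UserError, "Unexpected arguments: #{args.join(" ")}" if args.any?
      COMMANDS.fetch(command).run(out: out)
    else
      raise UserError, "Unknown command: #{command}"
    end
  rescue OptionParser::ParseError, UserError => e
    err.puts e.message
    1
  end
end

out = StringIO.new
err = StringIO.new
status = MiniCli.run(["bogus"], out: out, err: err)
puts "exit status: #{status}"
puts "stderr: #{err.string}"
```

Returning an integer from every branch lets the executable pass the result straight to `exit`.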
|
data/lib/diogenes/templates/init/artifacts/decision_record.md.erb
ADDED
|
@@ -0,0 +1,53 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
Diogenes.artifact "decision_record" do
|
|
4
|
+
description "Decision record produced by gate evaluation"
|
|
5
|
+
filename "<%= feature_slug %>_decision.md"
|
|
6
|
+
|
|
7
|
+
template <<~TEMPLATE
|
|
8
|
+
# AI Feature Decision Record
|
|
9
|
+
|
|
10
|
+
**Feature:** <%= feature_name %>
|
|
11
|
+
**Date:** <%= Date.today.strftime("%Y-%m-%d") %>
|
|
12
|
+
**Evaluator:** <%= evaluator %>
|
|
13
|
+
**Verdict:** <%= verdict %>
|
|
14
|
+
|
|
15
|
+
---
|
|
16
|
+
|
|
17
|
+
## Gate Results
|
|
18
|
+
|
|
19
|
+
| Gate | Principle | Result | Reason |
|
|
20
|
+
|------|-----------|--------|--------|
|
|
21
|
+
| 1. Failure Mode | Least surprise at scale | <%= gate_result(:failure_mode) %> | <%= gate_reason(:failure_mode) %> |
|
|
22
|
+
| 2. User Verifiable | Trust requires verification | <%= gate_result(:user_verifiable) %> | <%= gate_reason(:user_verifiable) %> |
|
|
23
|
+
| 3. Human in Loop | Human-centered, genuinely | <%= gate_result(:human_in_loop) %> | <%= gate_reason(:human_in_loop) %> |
|
|
24
|
+
| 4. Observability | Craftsmanship | <%= gate_result(:observability) %> | <%= gate_reason(:observability) %> |
|
|
25
|
+
| 5. Right Tool | Convention over configuration | <%= gate_result(:right_tool) %> | <%= gate_reason(:right_tool) %> |
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
## Verdict: <%= verdict %>
|
|
30
|
+
|
|
31
|
+
<% if verdict == "REJECT" || verdict == "PROCEED WITH CONDITIONS" %>
|
|
32
|
+
## Recommended Alternative or Mitigation
|
|
33
|
+
|
|
34
|
+
<%= alternative %>
|
|
35
|
+
<% end %>
|
|
36
|
+
|
|
37
|
+
<% if verdict == "PROCEED" || verdict == "PROCEED WITH CONDITIONS" %>
|
|
38
|
+
## Conditions for Proceeding
|
|
39
|
+
|
|
40
|
+
<%= conditions.empty? ? "None — all gates passed." : conditions %>
|
|
41
|
+
<% end %>
|
|
42
|
+
|
|
43
|
+
---
|
|
44
|
+
|
|
45
|
+
## Notes
|
|
46
|
+
|
|
47
|
+
<%= notes.empty? ? "_No additional notes._" : notes %>
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
*Generated by [Diogenes](https://github.com/meaganewaller/diogenes)*
|
|
52
|
+
TEMPLATE
|
|
53
|
+
end
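The template above refers to names such as `feature_name`, `verdict`, `gate_result`, and `gate_reason`. The diff does not show how the gem supplies them, so the following is only a sketch of one way an ERB template of this shape can be rendered with Ruby's standard `erb` library; `DecisionContext` and the sample values are hypothetical.

```ruby
require "erb"

# Hypothetical rendering context exposing the names the template uses.
class DecisionContext
  attr_reader :feature_name, :verdict

  def initialize(feature_name:, verdict:, gates:)
    @feature_name = feature_name
    @verdict = verdict
    @gates = gates # e.g. {failure_mode: {passed: true, reason: "..."}}
  end

  def gate_result(name)
    @gates.fetch(name).fetch(:passed) ? "PASS" : "FAIL"
  end

  def gate_reason(name)
    @gates.fetch(name).fetch(:reason)
  end

  # Evaluate the template with this object's methods in scope.
  def render(template)
    ERB.new(template).result(binding)
  end
end

template = <<~TEMPLATE
  **Feature:** <%= feature_name %>
  **Verdict:** <%= verdict %>
  | 1. Failure Mode | <%= gate_result(:failure_mode) %> | <%= gate_reason(:failure_mode) %> |
TEMPLATE

context = DecisionContext.new(
  feature_name: "Receipt summaries",
  verdict: "PROCEED",
  gates: {failure_mode: {passed: true, reason: "severity: :recoverable"}}
)
puts context.render(template)
```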
|
data/lib/diogenes/templates/init/diogenes.rb
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
# Diogenes project configuration.
|
|
4
|
+
# Run `diogenes build --all` after changing source files.
|
|
5
|
+
|
|
6
|
+
Diogenes.configure do
|
|
7
|
+
name "My Project"
|
|
8
|
+
description "AI agent configuration for this project"
|
|
9
|
+
|
|
10
|
+
targets :claude_code, :cursor
|
|
11
|
+
|
|
12
|
+
artifacts_dir "docs/decisions"
|
|
13
|
+
end
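A `configure do ... end` block of this shape is commonly implemented by evaluating the block against a builder object. The diff does not include the gem's implementation, so the sketch below is an assumption about the general technique only; `MiniConfig` and `Builder` are hypothetical names.

```ruby
# Hypothetical block DSL; shows the instance_eval technique, not the gem's code.
module MiniConfig
  class Builder
    attr_reader :settings

    def initialize
      @settings = {}
    end

    # Each DSL word is an ordinary method that records its argument.
    def name(value)
      @settings[:name] = value
    end

    def targets(*values)
      @settings[:targets] = values
    end

    def artifacts_dir(path)
      @settings[:artifacts_dir] = path
    end
  end

  def self.configure(&block)
    builder = Builder.new
    builder.instance_eval(&block) # bare calls in the block hit the builder
    builder.settings
  end
end

settings = MiniConfig.configure do
  name "My Project"
  targets :claude_code, :cursor
  artifacts_dir "docs/decisions"
end
p settings
```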
|
data/lib/diogenes/templates/init/hooks/README.md
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
# Hooks
|
|
2
|
+
|
|
3
|
+
Hooks define event-triggered behaviors that run automatically during agent sessions.
|
|
4
|
+
|
|
5
|
+
Add `.rb` files to this directory when you're ready. Example:
|
|
6
|
+
|
|
7
|
+
```ruby
|
|
8
|
+
Diogenes.hook "run_tests_before_commit" do
|
|
9
|
+
trigger "PreToolUse"
|
|
10
|
+
description "Remind the agent to run tests before committing"
|
|
11
|
+
prompt <<~PROMPT
|
|
12
|
+
Before making any git commit, run the test suite and confirm it passes.
|
|
13
|
+
PROMPT
|
|
14
|
+
end
|
|
15
|
+
```
|
data/lib/diogenes/templates/init/rules/five_gates.rb
ADDED
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
Diogenes.rule "five_gates" do
|
|
4
|
+
description "The five Diogenes gates — default ruleset for AI feature decisions"
|
|
5
|
+
|
|
6
|
+
content <<~RULE
|
|
7
|
+
## The Five Diogenes Gates
|
|
8
|
+
|
|
9
|
+
When evaluating or building AI features, apply these gates in order.
|
|
10
|
+
A gate failure is information, not a verdict — but it must be addressed
|
|
11
|
+
before shipping.
|
|
12
|
+
|
|
13
|
+
**Gate 1 — Failure Mode** (Least surprise at scale)
|
|
14
|
+
What happens when this feature is wrong? Recoverable failures pass.
|
|
15
|
+
Embarrassing failures pass with conditions. Catastrophic failures do not pass.
|
|
16
|
+
|
|
17
|
+
**Gate 2 — User Verifiable** (Trust requires verification)
|
|
18
|
+
Can the average user tell when the output is wrong? If the user lacks
|
|
19
|
+
domain expertise to evaluate AI output, the feature creates a confidence gap.
|
|
20
|
+
|
|
21
|
+
**Gate 3 — Human in the Loop** (Human-centered design, genuinely)
|
|
22
|
+
Is there a human with time, context, and authority to actually intervene?
|
|
23
|
+
Rubber-stamping is not a loop.
|
|
24
|
+
|
|
25
|
+
**Gate 4 — Observability** (Craftsmanship — you wouldn't ship blind)
|
|
26
|
+
Do you have monitoring to detect when this feature degrades in production?
|
|
27
|
+
Silent degradation is worse than no feature.
|
|
28
|
+
|
|
29
|
+
**Gate 5 — Right Tool** (Convention over configuration)
|
|
30
|
+
Is AI the right answer, or just the exciting one? Prefer the boring,
|
|
31
|
+
predictable software alternative when it serves the same need.
|
|
32
|
+
RULE
|
|
33
|
+
end
|
data/lib/diogenes/templates/init/skills/example_skill.rb
ADDED
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
# Example skill demonstrating the full Diogenes skill DSL.
|
|
4
|
+
# Copy this file, rename it, and customize the values for your project.
|
|
5
|
+
|
|
6
|
+
Diogenes.skill "evaluate_feature" do
|
|
7
|
+
command "/evaluate-feature"
|
|
8
|
+
description "Walk a proposed AI feature through the five Diogenes gates"
|
|
9
|
+
|
|
10
|
+
prompt <<~PROMPT
|
|
11
|
+
You are helping a Ruby developer evaluate whether a proposed AI feature
|
|
12
|
+
should be built. Walk them through each of the five Diogenes gates in order,
|
|
13
|
+
asking focused questions and surfacing the failure mode clearly if a gate fails.
|
|
14
|
+
|
|
15
|
+
The feature being evaluated: {{input}}
|
|
16
|
+
|
|
17
|
+
The five gates are:
|
|
18
|
+
1. Failure Mode — What happens when this feature is wrong? Is the failure recoverable?
|
|
19
|
+
2. User Verifiable — Can the average user tell when it's wrong?
|
|
20
|
+
3. Human in the Loop — Is there a real human checking, with time and authority to intervene?
|
|
21
|
+
4. Observability — Do you have monitoring to know when it's going wrong in production?
|
|
22
|
+
5. Right Tool — Is AI the right answer, or just the exciting one?
|
|
23
|
+
|
|
24
|
+
For each gate:
|
|
25
|
+
- State the gate name and the Ruby principle it maps to
|
|
26
|
+
- Ask one focused question to determine pass or fail
|
|
27
|
+
- State the verdict clearly (PASS or FAIL)
|
|
28
|
+
- If FAIL: explain why and suggest what the right approach is instead
|
|
29
|
+
|
|
30
|
+
At the end, generate a structured decision record with a summary verdict:
|
|
31
|
+
PROCEED, REJECT, or PROCEED WITH CONDITIONS.
|
|
32
|
+
PROMPT
|
|
33
|
+
end
|
data/lib/diogenes/version.rb
CHANGED
data/lib/diogenes.rb
CHANGED
|
@@ -1,8 +1,33 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
|
+
# rbs_inline: enabled
|
|
2
3
|
|
|
3
|
-
|
|
4
|
+
require "zeitwerk"
|
|
5
|
+
|
|
6
|
+
# Diogenes is the main module for the Diogenes gem.
|
|
7
|
+
#
|
|
8
|
+
# Provides configuration, CLI, and runtime library for the Diogenes gem.
|
|
9
|
+
loader = Zeitwerk::Loader.for_gem
|
|
10
|
+
loader.setup
|
|
4
11
|
|
|
5
12
|
module Diogenes
|
|
13
|
+
# Base error class for Diogenes.
|
|
6
14
|
class Error < StandardError; end
|
|
7
|
-
|
|
15
|
+
|
|
16
|
+
# Raised when invalid arguments are passed to a method.
|
|
17
|
+
class ArgumentError < ::ArgumentError; end
|
|
18
|
+
|
|
19
|
+
# Raised when the configuration is invalid.
|
|
20
|
+
class ValidationError < Error; end
|
|
21
|
+
|
|
22
|
+
# Raised when the user provides invalid arguments to the CLI.
|
|
23
|
+
class UserError < Error; end
|
|
24
|
+
|
|
25
|
+
# Returns the version of the Diogenes gem.
|
|
26
|
+
#
|
|
27
|
+
# This is managed by the release-please workflow.
|
|
28
|
+
# --
|
|
29
|
+
#: () -> String
|
|
30
|
+
def self.version
|
|
31
|
+
VERSION
|
|
32
|
+
end
|
|
8
33
|
end
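The error classes added above have two consequences worth seeing in isolation. The sketch below copies the four class definitions from the diff into a stand-alone script; `rescued_by_base?` and `bare_argument_error` are hypothetical helpers added for the demonstration.

```ruby
# Class definitions reproduced from the diff above.
module Diogenes
  class Error < StandardError; end
  class ArgumentError < ::ArgumentError; end
  class ValidationError < Error; end
  class UserError < Error; end

  # Hypothetical helper: shows what a bare constant resolves to in this scope.
  def self.bare_argument_error
    ArgumentError
  end
end

# Hypothetical helper: true when `rescue Diogenes::Error` catches the class.
def rescued_by_base?(error_class)
  raise error_class, "boom"
rescue Diogenes::Error
  true
rescue StandardError
  false
end

puts rescued_by_base?(Diogenes::UserError)       # caught by the base class
puts rescued_by_base?(Diogenes::ArgumentError)   # not a Diogenes::Error
puts Diogenes.bare_argument_error == Diogenes::ArgumentError
```

`Diogenes::ArgumentError` inherits from `::ArgumentError` rather than `Diogenes::Error`, so a `rescue Diogenes::Error` clause will not catch it. Inside `module Diogenes`, a bare `ArgumentError` refers to the gem's subclass rather than the core class.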
|
data/sig/generated/diogenes/cli/init.rbs
ADDED
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
# Generated from lib/diogenes/cli/init.rb with RBS::Inline
|
|
2
|
+
|
|
3
|
+
module Diogenes
|
|
4
|
+
class Cli
|
|
5
|
+
class Init
|
|
6
|
+
DIOGENES_DIR: String
|
|
7
|
+
|
|
8
|
+
TEMPLATES_ROOT: String
|
|
9
|
+
|
|
10
|
+
# : (cwd: String, out: IO, err: IO) -> Integer
|
|
11
|
+
def self.run: (cwd: String, out: IO, err: IO) -> Integer
|
|
12
|
+
|
|
13
|
+
# : (cwd: String, out: IO, err: IO) -> void
|
|
14
|
+
def initialize: (cwd: String, out: IO, err: IO) -> void
|
|
15
|
+
|
|
16
|
+
# : () -> Integer
|
|
17
|
+
def run: () -> Integer
|
|
18
|
+
|
|
19
|
+
private
|
|
20
|
+
|
|
21
|
+
# : () -> bool
|
|
22
|
+
def diogenes_dir_exist?: () -> bool
|
|
23
|
+
|
|
24
|
+
# : () -> Array[String]
|
|
25
|
+
def template_files: () -> Array[String]
|
|
26
|
+
|
|
27
|
+
# : (String) -> String
|
|
28
|
+
def template_path: (String) -> String
|
|
29
|
+
|
|
30
|
+
# : (created: Array[String], skipped: Array[String]) -> void
|
|
31
|
+
def print_summary: (created: Array[String], skipped: Array[String]) -> void
|
|
32
|
+
end
|
|
33
|
+
end
|
|
34
|
+
end
|