socratic-method 0.1.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,11 @@
1
+ """socratic-method — a Socratic questioning skill for coding agents, with an installer.
2
+
3
+ The skill (``assets/SKILL.md``) turns an agent into a disciplined Socratic questioner
4
+ that interrogates an idea before real work starts and synthesizes an ``idea-brief-v1``
5
+ artifact. This package ships the skill, a deterministic brief validator, and a ``setup``
6
+ command that installs the skill for Claude Code, OpenAI Codex, and GitHub Copilot.
7
+ """
8
+
9
+ __version__ = "0.1.0"
10
+
11
+ SKILL_NAME = "socratic-method"
@@ -0,0 +1,240 @@
1
+ ---
2
+ name: socratic-method
3
+ description: "Interrogates the user's idea with disciplined Socratic questioning until hidden assumptions, contradictions, and gaps are surfaced, then synthesizes a refined idea brief (idea-brief-v1). Universal — works on any idea: a piece of software, a document, a plan, a decision, a research direction, a purchase, a life change. Use BEFORE real work or commitment starts. Triggers when the user says 'question me about', 'help me think through', 'is this clear enough to start', 'poke holes in this', 'play devil's advocate', 'stress-test this plan', 'sanity-check this idea', 'what am I missing here', or presents a fuzzy idea and asks what to do. Not for: ideas already specified precisely, or when the user wants answers rather than questions."
4
+ allowed-tools: AskUserQuestion, Read, Write
5
+ ---
6
+
7
+ ## Purpose
8
+
9
+ The Socratic method (elenchus) tests a belief by questioning it until its hidden assumptions
10
+ and contradictions are exposed. Its modern form — Socratic questioning — is a cooperative
11
+ dialogue: the questioner does not lecture or propose; they draw the idea out of the person who
12
+ holds it (maieutics, "midwifery"), and treat reaching honest puzzlement (aporia) as progress,
13
+ not failure.
14
+
15
+ This skill turns Claude into that questioner. The user brings an idea — *any* idea: a thing to
16
+ build, a document to write, a decision to make, a plan to commit to, a direction to research —
17
+ and Claude questions it until it is either sharpened into something actionable or honestly
18
+ shown to be unresolved. The result is written down as an **idea brief** that downstream work
19
+ (planning, building, researching, deciding) can consume.
20
+
21
+ **You are the questioner, not the answerer.** Until the synthesis phase you contribute no
22
+ solutions, no designs, no recommendations, no "have you considered X". Your only tools are
23
+ questions, restatements, and counterexamples. The method is domain-neutral: never let the
24
+ subject matter pull you into acting as a domain expert instead of an examiner.
25
+
26
+ ## Invocation
27
+
28
+ ```
29
+ /socratic-method <idea> [--mode stress|develop] [--depth quick|standard|deep]
30
+ ```
31
+
32
+ - **mode** — the questioning stance:
33
+ - `stress` (classic elenchus): hunt for contradictions and counterexamples; refuting the
34
+ idea counts as success. For ideas the user is about to invest heavily in.
35
+ - `develop` (guided discovery): supportive questioning that helps articulate and grow a
36
+ fragile early idea rather than attack it.
37
+ - **depth** — the cadence budget:
38
+ - `quick`: two rounds of 2–3 grouped questions, then synthesis. A sanity pass.
39
+ - `standard` (default): one question per turn, ~8–12 questions.
40
+ - `deep`: one question per turn, no budget; runs until answers stop moving the thesis or
41
+ the user stops.
42
+
43
+ **Setup turn:** if mode or depth is missing, ask for both in one exchange at the start —
44
+ a structured multiple-choice prompt where the tool supports it (`AskUserQuestion` on Claude
45
+ Code), plain text otherwise. Propose a default and let the user override it. Default rule:
46
+ `develop` + `standard`, unless the framing already reads as a firm proposal being
47
+ pressure-tested ("about to commit weeks/money to this", "convince me this is wrong" →
48
+ `stress`) or signals urgency or a light pass ("quick sanity check" → `quick`). If no idea was
49
+ given at all, the opening move is to ask for it.
50
+
51
+ ## Procedure
52
+
53
+ ### Phase 1 — Thesis
54
+
55
+ Get the idea stated in the user's own words, compressed:
56
+
57
+ > "State the idea in one or two sentences, as if convincing a skeptical friend."
58
+
59
+ Restate it back neutrally (steelman it — the strongest honest version, never a strawman) and
60
+ ask if the restatement is right. If the user corrects or rejects it, restate the **corrected**
61
+ thesis and get an explicit confirmation before asking any Phase 2 question — never begin
62
+ probing on an unconfirmed thesis, however clear the correction seemed. This restatement is
63
+ the *thesis under examination*; update it whenever an answer changes it.
64
+
65
+ **Scope check (before any Phase 2 question):** if the idea is already precisely specified —
66
+ clear scope, an owner, measurable success criteria, no open questions — or the user signals
67
+ they don't want questioning, this is the "Not for" case: say so and stop rather than
68
+ manufacture doubts. Offer what still helps (recording the idea as a brief as-is, or probing
69
+ one specific aspect the user names), but do not run the elenchus by default.
70
+
71
+ ### Phase 2 — Elenchus (the questioning rounds)
72
+
73
+ At `standard`/`deep`, ask **exactly one question per turn** — never bundling — and wait for
74
+ the answer: the adaptivity (each question chosen from the last answer) is where the value
75
+ comes from. At `quick`, group 2–3 related questions per turn as plain prose — never as
76
+ multiple-choice chips, which would collapse the very ambiguity being examined. At any depth,
77
+ never ask a checklist (a bulleted or numbered list of questions — a quick-mode group must
78
+ read as one connected paragraph) and never answer your own question.
79
+
80
+ Choose each question for maximum information gain against the *current* thesis, drawing from
81
+ the six classic Socratic question types:
82
+
83
+ | Type | Probes | Example shape |
84
+ |------|--------|---------------|
85
+ | Clarification | vague terms, scope | "When you say 'better', better for whom, measured how?" |
86
+ | Assumptions | what's taken for granted | "This assumes you'll still want this in a year. What tells you that?" |
87
+ | Evidence & reasons | why the user believes it | "What have you actually seen or tried that supports this?" |
88
+ | Viewpoints & alternatives | competing framings | "How would someone happy with the status quo describe your idea?" |
89
+ | Implications & consequences | where it leads | "If this works exactly as hoped, what does it displace or break?" |
90
+ | Questioning the question | the framing itself | "Is 'which option' the real question, or is it 'whether at all'?" |
91
+
92
+ Tactics that do the heavy lifting:
93
+
94
+ - **Counterexample:** when the user states a general rule ("everyone needs X"), offer one
95
+ concrete case that strains it and ask how the idea handles it.
96
+ - **Contradiction surfacing:** when two answers conflict, quote both back verbatim and ask
97
+ which one yields. Do not smooth the conflict over yourself.
98
+ - **Definition pressure:** any load-bearing word used twice with two meanings gets a
99
+ "define it once" question.
100
+ - **Concreteness pull:** abstract answer → ask for one specific instance ("walk me through
101
+ the very first time this gets used, step by step").
102
+
103
+ Sequencing: start with clarification (what/who/scope), then assumptions and evidence, then
104
+ alternatives and consequences; question the question whenever the dialogue reveals the framing
105
+ is off. If two or three consecutive questions draw from the same type without moving the
106
+ thesis, switch type (e.g. consequences → questioning-the-question) or go to the Phase 3
107
+ checkpoint. In `stress` mode, weight toward counterexamples and contradiction surfacing; in
108
+ `develop` mode, weight toward clarification and concreteness pulls, and probe gently.
109
+
110
+ **Incremental capture:** whenever an answer changes the thesis or surfaces a new assumption,
111
+ contradiction, or open question, silently update the draft brief at the output path (Phase 4
112
+ format, `verdict: aporia` while in progress). An interrupted or abandoned session must still
113
+ leave a usable partial brief.
114
+
115
+ ### Phase 3 — Verdict checkpoint
116
+
117
+ Stop questioning when (whichever comes first): the depth budget is spent; answers stop
118
+ changing the thesis; or the user says stop. Then state honestly which state was reached:
119
+
120
+ - **Sharpened:** the thesis survived, now with explicit scope, assumptions, and constraints.
121
+ - **Aporia:** a genuine unresolved hole remains. Name it plainly. Aporia is a *finding*, not a
122
+ failure — "we don't yet know who this is for" saves more than a confident wrong answer. Do
123
+ not paper over it with a proposed solution. Aporia also hides behind a "sharpened" label:
124
+ if the final thesis is mainly a plan to answer the open questions ("first find out X /
125
+ go ask / define the metric"), the verdict is aporia — the gathering plan belongs in
126
+ `next_step`, not the thesis. A genuinely sharpened thesis states what to do or build;
127
+ a thesis that states what to find out is aporia wearing the label.
128
+ - **Refuted:** two of the user's own answers collided and could not be reconciled. You may
129
+ declare refutation **only** by quoting the colliding answers verbatim — never by asserting
130
+ your own domain opinion ("that won't work"). The method refutes people out of their own
131
+ mouth, or not at all. State the refutation as the idea's own claims colliding ("the claim
132
+ that X can't hold with the claim that Y"), never as the person conceding or yielding.
133
+
134
+ ### Phase 4 — Maieutic synthesis (the deliverable)
135
+
136
+ Only now may you contribute content. Produce the **idea brief**, built strictly from the
137
+ user's own answers (cite their words; invent nothing they didn't say).
138
+
139
+ Format — `idea-brief-v1`, a markdown file with YAML frontmatter. The format is formal:
140
+ the packaged schema (`idea-brief-v1.schema.json`, installed alongside this skill's source)
141
+ defines the frontmatter, and the harness validates briefs with `socratic-method validate
142
+ <path>` (the eval matrix in the package repository's `evals/` runs it on every brief):
143
+
144
+ ```markdown
145
+ ---
146
+ schema: idea-brief-v1
147
+ idea: <slug> # slug only, no date — the filename adds -YYYYMMDD
148
+ date: <YYYY-MM-DD>
149
+ mode: stress # stress | develop
150
+ depth: standard # quick | standard | deep
151
+ verdict: sharpened # sharpened | aporia | refuted
152
+ thesis_final: "One- or two-sentence refined statement"
153
+ questions_asked: 9 # Phase 2 probing questions only (not the thesis ask, steelman
154
+ # # confirmations, or clarifying sub-questions of the same probe);
155
+ # # recount from the conversation when writing — never estimate
156
+ types_used: [clarification, assumptions, evidence] # exact tokens: clarification |
157
+ # assumptions | evidence | viewpoints | implications | questioning-the-question
158
+ # colliding_claims: ["quote 1", "quote 2"] # REQUIRED when verdict: refuted — the two
159
+ # # colliding answers, verbatim as the user said them
160
+ assumptions:
161
+ - text: "..."
162
+ status: unvalidated # validated | unvalidated | risky
163
+ open_questions:
164
+ - "..."
165
+ constraints:
166
+ - "..."
167
+ next_step: "One concrete action"
168
+ ---
169
+
170
+ # Idea brief: <short name>
171
+ ## What changed under questioning — initial vs final thesis, a line or two each
172
+ ## Scope — who/what it's for, and what is explicitly out
173
+ ## Assumptions surfaced — narrative behind the frontmatter list
174
+ ## Contradictions & how resolved — or "unresolved", carried to open questions
175
+ ## Open questions (aporia) — what must be answered before/while proceeding
176
+ ## Suggested next step
177
+ ```
178
+
179
+ Write it to `notes/idea-briefs/<slug>.md` (create the directory if needed; honor a
180
+ user-supplied path instead). Derive the slug from the idea plus the date
181
+ (`<idea-slug>-YYYYMMDD`, lowercase letters/digits/hyphens only — never path characters from
182
+ free-text input); if the file already exists, read it first and overwrite only if it
183
+ is an earlier draft of this same dialogue — otherwise pick a suffixed name. Never write into
184
+ areas owned by generators or other tooling (for example generated-artifact directories such
185
+ as `.claude/agents/generated/`). Print the brief in chat as well.
186
+
187
+ **Self-check before presenting:** after writing the file, `Read` it back from disk and
188
+ check what is actually there — never self-check from memory, and never say "saved" for a
189
+ file you have not re-read (a claimed save with no file on disk is fabrication, not a
190
+ save). Verify: every required frontmatter key present and enum-valid — including keys
191
+ with nothing gathered, which still appear with an empty list (`[]`); every assumption
192
+ carries a status; `questions_asked` recounted from the conversation, not estimated;
193
+ `verdict: aporia` ⇒ `open_questions` non-empty; `verdict: refuted` ⇒ `colliding_claims`
194
+ holds the two colliding answers exactly as the user said them, and both appear in the
195
+ body. Fix mismatches before printing. This read-back is only the inner loop — the
196
+ harness-side validator and eval matrix are the final authority, so do not claim the
197
+ brief "validates"; report only that the self-check passed.
198
+
199
+ A complete worked example — a sample exchange showing the steelman restatement,
200
+ contradiction-surfacing, and a verbatim-quote refutation, plus a full `idea-brief-v1` file —
201
+ is in [references/example-session.md](references/example-session.md). Consult it before
202
+ writing your first brief of a session.
203
+
204
+ ## Combining with other skills
205
+
206
+ This skill is a *front-end*: the brief is the input that makes downstream work better. The
207
+ frontmatter is designed for mechanical hand-off — `open_questions` is a ready-made research
208
+ agenda, `assumptions[status=unvalidated]` is a validation worklist.
209
+
210
+ - **Before building or writing anything:** run this, then start the real work (plan mode, a
211
+ draft, a spec) with the brief as the starting spec; open questions get verified first.
212
+ - **Before research** (e.g. `deep-research`): pass `open_questions` and unvalidated
213
+ assumptions as the research questions.
214
+ - **Before authoring an agent, skill, or subagent:** question the idea — domain, sources,
215
+ who consults it, what "good advice" means — before building anything; the brief informs
216
+ scope and source selection.
217
+ - **Chained by another skill:** any skill may invoke this one first when its own input is
218
+ fuzzy; it returns control once the brief exists.
219
+
220
+ ## Guardrails
221
+
222
+ - One question per turn at `standard`/`deep`; small prose groups only at `quick`. No
223
+ checklists, no "also, ...".
224
+ - No solutions, designs, or recommendations before Phase 4 — even if asked "what would you
225
+ do?", answer once with a question that helps the *user* decide, and offer to move to
226
+ synthesis if they'd rather stop.
227
+ - Stay the examiner: do not drift into domain-expert mode, whatever the subject.
228
+ - Non-adversarial tone even in `stress` mode: you are examining the idea, not the person.
229
+ Steelman before you probe.
230
+ - Never manufacture agreement: if aporia or refutation is the honest result, the brief says
231
+ so — and refutation is only ever declared in the user's own quoted words.
232
+ - Respect the stop signal instantly, including soft forms: "stop", "that's enough", "I'm
233
+ done", "let's not re-walk this" → go straight to Phase 4 with whatever was gathered. Never
234
+ politely push past a stop signal with more questions.
235
+
236
+ ## Source
237
+
238
+ Distilled from the standard account of the Socratic method
239
+ (https://en.wikipedia.org/wiki/Socratic_method): classic elenchus and aporia, maieutics, and
240
+ the modern Socratic-questioning taxonomy used in education and cognitive-behavioural therapy.
@@ -0,0 +1,97 @@
1
+ {
2
+ "$schema": "https://json-schema.org/draft/2020-12/schema",
3
+ "$id": "idea-brief-v1.schema.json",
4
+ "title": "idea-brief-v1 frontmatter",
5
+ "description": "YAML frontmatter of an idea brief produced by the socratic-method skill (.claude/skills/socratic-method). The markdown body is checked separately by tools/subagent_factory/validate_idea_brief.py (required headers, verbatim colliding claims).",
6
+ "type": "object",
7
+ "required": [
8
+ "schema",
9
+ "idea",
10
+ "date",
11
+ "mode",
12
+ "depth",
13
+ "verdict",
14
+ "thesis_final",
15
+ "assumptions",
16
+ "open_questions",
17
+ "constraints",
18
+ "next_step"
19
+ ],
20
+ "properties": {
21
+ "schema": { "const": "idea-brief-v1" },
22
+ "idea": {
23
+ "type": "string",
24
+ "pattern": "^[a-z0-9][a-z0-9-]*$",
25
+ "description": "Slug: lowercase letters, digits, hyphens only."
26
+ },
27
+ "date": {
28
+ "type": "string",
29
+ "pattern": "^\\d{4}-\\d{2}-\\d{2}$"
30
+ },
31
+ "mode": { "enum": ["stress", "develop"] },
32
+ "depth": { "enum": ["quick", "standard", "deep"] },
33
+ "verdict": { "enum": ["sharpened", "aporia", "refuted"] },
34
+ "thesis_final": { "type": "string", "minLength": 1 },
35
+ "assumptions": {
36
+ "type": "array",
37
+ "items": {
38
+ "type": "object",
39
+ "required": ["text", "status"],
40
+ "properties": {
41
+ "text": { "type": "string", "minLength": 1 },
42
+ "status": { "enum": ["validated", "unvalidated", "risky"] }
43
+ },
44
+ "additionalProperties": false
45
+ }
46
+ },
47
+ "open_questions": {
48
+ "type": "array",
49
+ "items": { "type": "string", "minLength": 1 }
50
+ },
51
+ "constraints": {
52
+ "type": "array",
53
+ "items": { "type": "string", "minLength": 1 }
54
+ },
55
+ "next_step": { "type": "string", "minLength": 1 },
56
+ "questions_asked": {
57
+ "type": "integer",
58
+ "minimum": 0,
59
+ "description": "Total elenchus questions the examiner asked (auditable session claim)."
60
+ },
61
+ "types_used": {
62
+ "type": "array",
63
+ "uniqueItems": true,
64
+ "items": {
65
+ "enum": [
66
+ "clarification",
67
+ "assumptions",
68
+ "evidence",
69
+ "viewpoints",
70
+ "implications",
71
+ "questioning-the-question"
72
+ ]
73
+ },
74
+ "description": "Which of the six Socratic question types were used."
75
+ },
76
+ "colliding_claims": {
77
+ "type": "array",
78
+ "minItems": 2,
79
+ "maxItems": 2,
80
+ "items": { "type": "string", "minLength": 1 },
81
+ "description": "verdict=refuted only: the two user answers that collided, quoted verbatim."
82
+ }
83
+ },
84
+ "additionalProperties": false,
85
+ "allOf": [
86
+ {
87
+ "if": { "properties": { "verdict": { "const": "refuted" } } },
88
+ "then": { "required": ["colliding_claims"] }
89
+ },
90
+ {
91
+ "if": { "properties": { "verdict": { "const": "aporia" } } },
92
+ "then": {
93
+ "properties": { "open_questions": { "minItems": 1 } }
94
+ }
95
+ }
96
+ ]
97
+ }
@@ -0,0 +1,155 @@
1
+ # Worked example — socratic-method session
2
+
3
+ A condensed but realistic session, chosen deliberately from a **non-software** domain to
4
+ anchor the skill's universality. It demonstrates the style-sensitive moves: the steelman
5
+ restatement, one question per turn, contradiction surfacing by verbatim quotation, the
6
+ difference between a refutation declaration and aporia, and the exact `idea-brief-v1` shape.
7
+
8
+ Invocation: `/socratic-method I want to start a weekly tech-talk series at work --mode stress --depth standard`
9
+
10
+ ## The exchange (abridged)
11
+
12
+ **Agent (Phase 1 — thesis):**
13
+ > State the idea in one or two sentences, as if convincing a skeptical friend.
14
+
15
+ **User:** We should run a weekly 30-minute tech talk every Friday. People barely know what
16
+ other teams are doing, and it'd help juniors learn faster.
17
+
18
+ **Agent (steelman restatement — strongest honest version, then confirm):**
19
+ > So the thesis is: *a recurring weekly 30-minute internal talk will fix cross-team
20
+ > visibility and accelerate junior growth, and weekly cadence is essential to that.* Is that
21
+ > the claim — including the "weekly is essential" part?
22
+
23
+ **User:** Yes, weekly, otherwise it loses momentum.
24
+
25
+ **Agent (Phase 2 — clarification, one question only):**
26
+ > When you say "people barely know what other teams are doing" — who specifically has told
27
+ > you that, and what did it cost them?
28
+
29
+ **User:** Mostly me, honestly. I was blindsided by the platform team's migration last month.
30
+
31
+ **Agent (assumptions):**
32
+ > The plan assumes a steady supply of speakers — roughly 50 talks a year. Who, concretely,
33
+ > gives talks 3 through 10?
34
+
35
+ **User:** That's the weak spot. Realistically I'd be begging people. Most engineers here hate
36
+ presenting.
37
+
38
+ **Agent (contradiction surfacing — both answers quoted verbatim, user chooses which yields):**
39
+ > Two of your answers are now in tension. You said **"weekly, otherwise it loses momentum"**,
40
+ > and you said **"realistically I'd be begging people. Most engineers here hate presenting."**
41
+ > A cadence that outruns your speaker supply doesn't build momentum — it builds visible
42
+ > no-shows. Which one yields: the weekly cadence, or the assumption that momentum requires it?
43
+
44
+ **User:** …the cadence, I guess. Monthly with strong talks beats weekly with filler.
45
+
46
+ *(The thesis is updated; questioning continues through evidence, alternatives, and
47
+ consequences until answers stop moving it.)*
48
+
49
+ **Agent (Phase 3 — verdict checkpoint):**
50
+ > Verdict: **sharpened, with one open hole.** The idea survived, but transformed: monthly, not
51
+ > weekly; and the evidence for the visibility problem is currently a sample of one (your
52
+ > migration surprise). That's an unvalidated assumption, not a refuted idea — so this is
53
+ > carried as an open question, not declared refuted.
54
+
55
+ ### How a refutation would have read instead
56
+
57
+ Refutation is declared **only** out of the user's own mouth. If the user had insisted the
58
+ cadence must stay weekly, it would read:
59
+
60
+ > Verdict: **refuted as stated.** You said **"weekly, otherwise it loses momentum"** and you
61
+ > said **"realistically I'd be begging people. Most engineers here hate presenting"** — and
62
+ > when asked which claim yields, neither did. The claim that momentum requires weekly cadence
63
+ > can't hold with the claim that speakers must be begged: the idea as stated contradicts
64
+ > itself, and here is what would need to change for it to survive.
65
+
66
+ Never: "weekly won't work at your company" — that is the agent's domain opinion, which is not
67
+ grounds for refutation.
68
+
69
+ ### How a pure aporia verdict would have read instead
70
+
71
+ Aporia is the honest terminus when questioning exhausts the budget and a load-bearing hole
72
+ remains that no amount of further dialogue can fill *right now* — the thesis is neither
73
+ sharpened into something actionable nor self-contradictory. If the user had answered "I
74
+ genuinely don't know" to both the speaker-supply and the who-feels-this-problem questions, it
75
+ would read:
76
+
77
+ > Verdict: **aporia.** The idea didn't collapse — but its two load-bearing inputs are
78
+ > unknowns you can't answer from where you sit: whether anyone besides you feels the
79
+ > visibility gap, and whether anyone at all will volunteer to speak. Until those are known,
80
+ > any cadence decision is guesswork. That's the finding: this isn't ready to decide, and the
81
+ > brief records exactly what would make it ready.
82
+
83
+ The brief then carries `verdict: aporia`, the unknowns as `open_questions`, and a `next_step`
84
+ that gathers the missing answers — never a proposed solution that papers over them. Aporia is
85
+ delivered as a *result*, not an apology.
86
+
87
+ ## The deliverable — `notes/idea-briefs/tech-talk-series-20260704.md`
88
+
89
+ ```markdown
90
+ ---
91
+ schema: idea-brief-v1
92
+ idea: tech-talk-series
93
+ date: 2026-07-04
94
+ mode: stress
95
+ depth: standard
96
+ verdict: sharpened
97
+ thesis_final: "A monthly 30-minute internal tech talk, seeded with 3 pre-committed speakers,
98
+ to improve cross-team visibility — piloted for a quarter before any cadence increase."
99
+ questions_asked: 9
100
+ types_used: [clarification, assumptions, evidence, implications]
101
+ assumptions:
102
+ - text: "Cross-team visibility is a widely felt problem, not just the proposer's"
103
+ status: unvalidated
104
+ - text: "Enough engineers will volunteer to speak at monthly cadence"
105
+ status: risky
106
+ - text: "30 minutes is short enough that attendance survives busy weeks"
107
+ status: unvalidated
108
+ open_questions:
109
+ - "Do at least 5 other people report being surprised by another team's work in the last quarter?"
110
+ - "Who are speakers 1-3, by name, before anything is announced?"
111
+ constraints:
112
+ - "No budget; must run on existing meeting infrastructure"
113
+ - "Cannot be mandatory (user: 'the moment it's mandatory it's dead')"
114
+ next_step: "Ask the last 5 people who joined what they wish they'd known — validates the
115
+ visibility assumption before committing anyone's Friday."
116
+ ---
117
+
118
+ # Idea brief: internal tech-talk series
119
+
120
+ ## What changed under questioning
121
+ Initial: weekly talks to fix visibility and grow juniors, weekly cadence essential.
122
+ Final: monthly pilot with pre-committed speakers; the junior-growth goal was dropped as a
123
+ separate untested claim riding along with the visibility one.
124
+
125
+ ## Scope
126
+ For: engineers across teams at the user's company. Out of scope: external audience,
127
+ recorded-content library, mandatory attendance.
128
+
129
+ ## Assumptions surfaced
130
+ The visibility problem rests on a sample of one (the user's own migration surprise). Speaker
131
+ supply was the load-bearing weakness: the user's own words — "I'd be begging people" —
132
+ forced the cadence change.
133
+
134
+ ## Contradictions & how resolved
135
+ "Weekly, otherwise it loses momentum" vs "most engineers here hate presenting" — resolved by
136
+ the user: cadence yielded ("monthly with strong talks beats weekly with filler").
137
+
138
+ ## Open questions (aporia)
139
+ Whether the visibility problem is real beyond the user, and who the first three named
140
+ speakers are. Both must be answered before announcing anything.
141
+
142
+ ## Suggested next step
143
+ Interview the last 5 joiners about what they wish they'd known. Cheap, fast, and directly
144
+ tests the founding assumption.
145
+ ```
146
+
147
+ ## What to imitate
148
+
149
+ - The restatement adds the hidden rider ("weekly is essential") *and asks for confirmation* —
150
+ that rider is what later collides.
151
+ - Exactly one question per message; each names why it matters only when that isn't obvious.
152
+ - The contradiction move quotes **both answers verbatim in bold** and hands the choice to the
153
+ user — the agent never resolves it.
154
+ - Frontmatter values come from the user's answers; the `constraints` entry even cites their
155
+ wording. Nothing in the brief was invented by the agent.
socratic_method/cli.py ADDED
@@ -0,0 +1,151 @@
1
+ """socratic-method command-line interface.
2
+
3
+ Subcommands:
4
+ setup install/refresh the skill for Claude Code, Codex, and/or Copilot
5
+ status show what is installed where, and whether it matches this package
6
+ uninstall remove managed skill files
7
+ validate check an idea brief against the idea-brief-v1 format
8
+ """
9
+
10
+ from __future__ import annotations
11
+
12
+ import argparse
13
+ import sys
14
+ from pathlib import Path
15
+
16
+ from . import __version__
17
+ from .installer import PLATFORMS, detect_platforms, install, status, uninstall
18
+ from .validator import validate_idea_brief
19
+
20
+ _TARGET_CHOICES = [*PLATFORMS.keys(), "all"]
21
+
22
+
23
+ def _targets(raw: list[str]) -> list[str]:
24
+ bad = [t for t in raw if t not in _TARGET_CHOICES]
25
+ if bad:
26
+ raise SystemExit(
27
+ f"unknown target(s): {', '.join(bad)} (choose from {', '.join(_TARGET_CHOICES)})"
28
+ )
29
+ return list(PLATFORMS) if (not raw or "all" in raw) else list(dict.fromkeys(raw))
30
+
31
+
32
+ def _print_action(a) -> None:
33
+ mark = {
34
+ "installed": "+",
35
+ "up-to-date": "=",
36
+ "skipped": "~",
37
+ "would-install": ">",
38
+ "blocked": "!",
39
+ "removed": "-",
40
+ "not-installed": " ",
41
+ "partial-or-modified": "!",
42
+ }.get(a.outcome, "?")
43
+ line = f" [{mark}] {a.platform:8s} {a.scope:8s} {a.outcome:20s} {a.target}"
44
+ print(line)
45
+ if a.detail:
46
+ print(f" {a.detail}")
47
+
48
+
49
+ def main(argv: list[str] | None = None) -> int:
50
+ parser = argparse.ArgumentParser(
51
+ prog="socratic-method",
52
+ description="Socratic questioning skill for coding agents — installer and brief validator.",
53
+ )
54
+ parser.add_argument("--version", action="version", version=f"%(prog)s {__version__}")
55
+ sub = parser.add_subparsers(dest="command", required=True)
56
+
57
+ p_setup = sub.add_parser("setup", help="install/refresh the skill for one or more platforms")
58
+ p_setup.add_argument(
59
+ "targets",
60
+ nargs="*",
61
+ metavar="TARGET",
62
+ help=f"platforms to set up: {', '.join(_TARGET_CHOICES)} "
63
+ "(default: auto-detect installed agents)",
64
+ )
65
+ p_setup.add_argument(
66
+ "--scope",
67
+ choices=["project", "user"],
68
+ default="project",
69
+ help="install into the current project (default) or the user home",
70
+ )
71
+ p_setup.add_argument(
72
+ "--root",
73
+ type=Path,
74
+ default=Path.cwd(),
75
+ help="project root for --scope project (default: cwd)",
76
+ )
77
+ p_setup.add_argument(
78
+ "--force",
79
+ action="store_true",
80
+ help="overwrite an existing install that differs from this package",
81
+ )
82
+ p_setup.add_argument("--dry-run", action="store_true", help="print the plan, write nothing")
83
+
84
+ p_status = sub.add_parser("status", help="show install state for every platform and scope")
85
+ p_status.add_argument("--root", type=Path, default=Path.cwd())
86
+
87
+ p_un = sub.add_parser("uninstall", help="remove managed skill files")
88
+ p_un.add_argument("targets", nargs="*", metavar="TARGET")
89
+ p_un.add_argument("--scope", choices=["project", "user"], default="project")
90
+ p_un.add_argument("--root", type=Path, default=Path.cwd())
91
+ p_un.add_argument("--dry-run", action="store_true")
92
+
93
+ p_val = sub.add_parser("validate", help="validate an idea-brief-v1 file")
94
+ p_val.add_argument("brief", type=Path, help="path to the brief markdown file")
95
+
96
+ args = parser.parse_args(argv)
97
+ home = Path.home()
98
+
99
+ if args.command == "validate":
100
+ errors = validate_idea_brief(args.brief)
101
+ for e in errors:
102
+ print(f"ERROR: {e}")
103
+ if not errors:
104
+ print(f"OK: {args.brief} is a valid idea-brief-v1")
105
+ return 0 if not errors else 1
106
+
107
+ if args.command == "status":
108
+ print(f"socratic-method {__version__} — install status (project root: {args.root})")
109
+ detected = detect_platforms(home)
110
+ for key, ev in detected.items():
111
+ print(f" agent {key:8s} {'detected: ' + ev if ev else 'not detected'}")
112
+ for a in status(args.root, home):
113
+ _print_action(a)
114
+ return 0
115
+
116
+ if args.command == "setup" and not args.targets:
117
+ # Auto-detect which agents are installed; configure only those.
118
+ detected = detect_platforms(home)
119
+ for key, ev in detected.items():
120
+ print(f" agent {key:8s} {'detected: ' + ev if ev else 'not detected'}")
121
+ keys = [k for k, ev in detected.items() if ev]
122
+ if not keys:
123
+ print(
124
+ "No supported agent detected on this machine. Name targets explicitly, "
125
+ "e.g. `socratic-method setup claude` or `socratic-method setup all`."
126
+ )
127
+ return 1
128
+ else:
129
+ keys = _targets(args.targets)
130
+
131
+ rc = 0
132
+ for key in keys:
133
+ try:
134
+ if args.command == "setup":
135
+ a = install(
136
+ key, args.scope, args.root, home, force=args.force, dry_run=args.dry_run
137
+ )
138
+ else:
139
+ a = uninstall(key, args.scope, args.root, home, dry_run=args.dry_run)
140
+ except ValueError as e:
141
+ print(f" [!] {key:8s} {args.scope:8s} error: {e}")
142
+ rc = 1
143
+ continue
144
+ _print_action(a)
145
+ if a.outcome == "blocked":
146
+ rc = 1
147
+ return rc
148
+
149
+
150
+ if __name__ == "__main__":
151
+ sys.exit(main())
@@ -0,0 +1,263 @@
1
+ """Install/inspect/remove the socratic-method skill for coding-agent platforms.
2
+
3
+ Design rules (inherited from the skill's own eval-hardened history):
4
+
5
+ - **Idempotent.** Every managed file is compared by content before writing;
6
+ an identical install is reported "up to date", never rewritten.
7
+ - **Never clobber local edits silently.** A file that differs from the packaged
8
+ version requires ``force=True`` to overwrite.
9
+ - **Verify before claiming.** After writing, every file is read back from disk and
10
+ compared to the packaged content; "installed" is only reported when the read-back
11
+ matches. A claimed install with no verified file is the failure mode this skill
12
+ exists to prevent.
13
+
14
+ Platform skill directories (data-driven so corrections are one-line; grounded in the
15
+ platform research of the agent-skills-advisor corpus — see README for caveats):
16
+
17
+ - Claude Code: ``.claude/skills/`` (project) and ``~/.claude/skills/`` (user).
18
+ - OpenAI Codex: ``.agents/skills/`` (project) and ``~/.agents/skills/`` (user) —
19
+ Codex scans the open Agent Skills directory, not ``.claude/skills``.
20
+ - GitHub Copilot: ``.github/skills/`` (project). Copilot also reads a repo's
21
+ ``.claude/skills/``, so a project-scope Claude install already covers Copilot;
22
+ the installer detects that and skips to avoid a double-triggering duplicate.
23
+ """
24
+
25
+ from __future__ import annotations
26
+
27
+ import contextlib
28
+ import hashlib
29
+ import shutil
30
+ from dataclasses import dataclass
31
+ from importlib.resources import files
32
+ from pathlib import Path
33
+
34
+ from . import SKILL_NAME
35
+
36
+ # Relative paths (within the skill directory) of every file this installer manages.
37
+ MANAGED_FILES = (
38
+ "SKILL.md",
39
+ "references/example-session.md",
40
+ "idea-brief-v1.schema.json",
41
+ )
42
+
43
+
44
+ @dataclass(frozen=True)
45
+ class Platform:
46
+ key: str
47
+ label: str
48
+ project_dir: str # relative to --root
49
+ user_dir: str | None # relative to ~; None = no user scope documented
50
+ note: str = ""
51
+
52
+
53
+ PLATFORMS: dict[str, Platform] = {
54
+ "claude": Platform(
55
+ key="claude",
56
+ label="Claude Code",
57
+ project_dir=".claude/skills",
58
+ user_dir=".claude/skills",
59
+ ),
60
+ "codex": Platform(
61
+ key="codex",
62
+ label="OpenAI Codex",
63
+ project_dir=".agents/skills",
64
+ user_dir=".agents/skills",
65
+ ),
66
+ "copilot": Platform(
67
+ key="copilot",
68
+ label="GitHub Copilot",
69
+ project_dir=".github/skills",
70
+ user_dir=None,
71
+ note="reads a repo's .claude/skills too — a project Claude install covers it",
72
+ ),
73
+ }
74
+
75
+
76
+ def detect_platforms(home: Path, path_env: str | None = None) -> dict[str, str | None]:
77
+ """Detect which agents are installed on this machine.
78
+
79
+ Returns ``{platform_key: evidence_or_None}`` — evidence is the concrete signal
80
+ found (a CLI on PATH, a config directory, an editor extension), so every claimed
81
+ detection is verifiable. ``path_env`` overrides the PATH searched (for tests).
82
+ """
83
+
84
+ def which(cmd: str) -> str | None:
85
+ return shutil.which(cmd, path=path_env)
86
+
87
+ evidence: dict[str, str | None] = {}
88
+
89
+ if exe := which("claude"):
90
+ evidence["claude"] = f"claude CLI on PATH ({exe})"
91
+ elif (home / ".claude").is_dir():
92
+ evidence["claude"] = f"config directory {home / '.claude'}"
93
+ else:
94
+ evidence["claude"] = None
95
+
96
+ if exe := which("codex"):
97
+ evidence["codex"] = f"codex CLI on PATH ({exe})"
98
+ elif (home / ".codex").is_dir():
99
+ evidence["codex"] = f"config directory {home / '.codex'}"
100
+ else:
101
+ evidence["codex"] = None
102
+
103
+ if exe := which("copilot"):
104
+ evidence["copilot"] = f"copilot CLI on PATH ({exe})"
105
+ elif (home / ".local/share/gh/extensions/gh-copilot").is_dir():
106
+ evidence["copilot"] = "gh-copilot extension installed"
107
+ elif vsix := sorted((home / ".vscode" / "extensions").glob("github.copilot*")):
108
+ evidence["copilot"] = f"VS Code extension {vsix[-1].name}"
109
+ else:
110
+ evidence["copilot"] = None
111
+
112
+ return evidence
113
+
114
+
115
+ def packaged_content(rel: str) -> bytes:
116
+ return files("socratic_method").joinpath(f"assets/{rel}").read_bytes()
117
+
118
+
119
+ def _digest(data: bytes) -> str:
120
+ return hashlib.sha256(data).hexdigest()
121
+
122
+
123
+ def skill_dir(platform: Platform, scope: str, root: Path, home: Path) -> Path:
124
+ """Resolve the target skill directory for a platform+scope."""
125
+ if scope == "project":
126
+ base = root / platform.project_dir
127
+ elif scope == "user":
128
+ if platform.user_dir is None:
129
+ raise ValueError(
130
+ f"{platform.label} has no documented user-scope skills directory; "
131
+ "use --scope project"
132
+ )
133
+ base = home / platform.user_dir
134
+ else:
135
+ raise ValueError(f"unknown scope '{scope}'")
136
+ return base / SKILL_NAME
137
+
138
+
139
+ def file_state(target: Path, rel: str) -> str:
140
+ """One managed file's state: 'missing' | 'up-to-date' | 'differs'."""
141
+ dst = target / rel
142
+ if not dst.is_file():
143
+ return "missing"
144
+ return (
145
+ "up-to-date" if _digest(dst.read_bytes()) == _digest(packaged_content(rel)) else "differs"
146
+ )
147
+
148
+
149
+ def install_state(target: Path) -> str:
150
+ """Whole-install state: 'not-installed' | 'up-to-date' | 'partial-or-modified'."""
151
+ states = {rel: file_state(target, rel) for rel in MANAGED_FILES}
152
+ if all(s == "missing" for s in states.values()):
153
+ return "not-installed"
154
+ if all(s == "up-to-date" for s in states.values()):
155
+ return "up-to-date"
156
+ return "partial-or-modified"
157
+
158
+
159
+ @dataclass
160
+ class Action:
161
+ platform: str
162
+ scope: str
163
+ target: Path
164
+ # 'installed' | 'up-to-date' | 'skipped' | 'would-install' | 'blocked' |
165
+ # 'removed' | 'not-installed'
166
+ outcome: str
167
+ detail: str = ""
168
+
169
+
170
+ def install(
171
+ platform_key: str,
172
+ scope: str,
173
+ root: Path,
174
+ home: Path,
175
+ *,
176
+ force: bool = False,
177
+ dry_run: bool = False,
178
+ ) -> Action:
179
+ platform = PLATFORMS[platform_key]
180
+ target = skill_dir(platform, scope, root, home)
181
+
182
+ # Copilot dedupe: a project-scope Claude install in the same root already covers it.
183
+ if platform_key == "copilot" and scope == "project" and not force:
184
+ claude_target = skill_dir(PLATFORMS["claude"], "project", root, home)
185
+ if install_state(claude_target) == "up-to-date":
186
+ return Action(
187
+ platform_key,
188
+ scope,
189
+ target,
190
+ "skipped",
191
+ f"Copilot reads .claude/skills — already covered by {claude_target} "
192
+ "(use --force to install to .github/skills anyway)",
193
+ )
194
+
195
+ state = install_state(target)
196
+ if state == "up-to-date":
197
+ return Action(
198
+ platform_key, scope, target, "up-to-date", "all files match the packaged version"
199
+ )
200
+ if state == "partial-or-modified" and not force:
201
+ differing = [rel for rel in MANAGED_FILES if file_state(target, rel) == "differs"]
202
+ return Action(
203
+ platform_key,
204
+ scope,
205
+ target,
206
+ "blocked",
207
+ f"existing install differs from packaged version "
208
+ f"({', '.join(differing) or 'partial'}); re-run with --force to overwrite",
209
+ )
210
+ if dry_run:
211
+ return Action(platform_key, scope, target, "would-install", f"{len(MANAGED_FILES)} files")
212
+
213
+ for rel in MANAGED_FILES:
214
+ dst = target / rel
215
+ dst.parent.mkdir(parents=True, exist_ok=True)
216
+ dst.write_bytes(packaged_content(rel))
217
+
218
+ # Verify before claiming: read back every file from disk.
219
+ unverified = [rel for rel in MANAGED_FILES if file_state(target, rel) != "up-to-date"]
220
+ if unverified:
221
+ return Action(
222
+ platform_key,
223
+ scope,
224
+ target,
225
+ "blocked",
226
+ f"post-write verification FAILED for: {', '.join(unverified)}",
227
+ )
228
+ return Action(
229
+ platform_key,
230
+ scope,
231
+ target,
232
+ "installed",
233
+ f"{len(MANAGED_FILES)} files written and read back",
234
+ )
235
+
236
+
237
+ def uninstall(
238
+ platform_key: str, scope: str, root: Path, home: Path, *, dry_run: bool = False
239
+ ) -> Action:
240
+ platform = PLATFORMS[platform_key]
241
+ target = skill_dir(platform, scope, root, home)
242
+ if install_state(target) == "not-installed":
243
+ return Action(platform_key, scope, target, "not-installed")
244
+ if dry_run:
245
+ return Action(platform_key, scope, target, "would-install", "would remove managed files")
246
+ for rel in MANAGED_FILES:
247
+ (target / rel).unlink(missing_ok=True)
248
+ # Remove now-empty directories we own, innermost first.
249
+ for d in sorted({(target / rel).parent for rel in MANAGED_FILES} | {target}, reverse=True):
250
+ with contextlib.suppress(OSError): # not empty — user files present; leave them
251
+ d.rmdir()
252
+ return Action(platform_key, scope, target, "removed")
253
+
254
+
255
+ def status(root: Path, home: Path) -> list[Action]:
256
+ out = []
257
+ for key, platform in PLATFORMS.items():
258
+ for scope in ("project", "user"):
259
+ if scope == "user" and platform.user_dir is None:
260
+ continue
261
+ target = skill_dir(platform, scope, root, home)
262
+ out.append(Action(key, scope, target, install_state(target)))
263
+ return out
@@ -0,0 +1,113 @@
1
+ """Validate an idea brief produced by the socratic-method skill.
2
+
3
+ Structural: YAML frontmatter validated against the packaged
4
+ ``idea-brief-v1.schema.json``. Cross-field rules the schema cannot express:
5
+
6
+ - ``verdict: refuted`` — each ``colliding_claims`` entry must appear verbatim in the
7
+ markdown body (the skill may only refute out of the user's own quoted mouth);
8
+ - required body headers from the skill's Phase 4 template must all be present;
9
+ - the filename slug (when the file follows ``<idea-slug>-YYYYMMDD.md``) must agree
10
+ with the frontmatter ``idea`` field.
11
+
12
+ API: ``validate_idea_brief(path) -> list[str]`` (empty list = valid).
13
+ CLI: ``socratic-method validate <brief.md>``.
14
+ """
15
+
16
+ from __future__ import annotations
17
+
18
+ import datetime
19
+ import json
20
+ import re
21
+ from importlib.resources import files
22
+ from pathlib import Path
23
+
24
+ import jsonschema
25
+ import yaml
26
+
27
+ _REQUIRED_HEADERS = (
28
+ "# Idea brief:",
29
+ "## What changed under questioning",
30
+ "## Scope",
31
+ "## Assumptions surfaced",
32
+ "## Contradictions & how resolved",
33
+ "## Open questions (aporia)",
34
+ "## Suggested next step",
35
+ )
36
+
37
+ _FILENAME_RE = re.compile(r"^(?P<slug>[a-z0-9][a-z0-9-]*)-(?P<date>\d{8})\.md$")
38
+
39
+
40
+ def load_schema() -> dict:
41
+ """Load the packaged idea-brief-v1 JSON schema."""
42
+ raw = (
43
+ files("socratic_method")
44
+ .joinpath("assets/idea-brief-v1.schema.json")
45
+ .read_text(encoding="utf-8")
46
+ )
47
+ return json.loads(raw)
48
+
49
+
50
+ def split_frontmatter(text: str) -> tuple[str | None, str]:
51
+ """Return (frontmatter_yaml, body), or (None, text) when no frontmatter block."""
52
+ if not text.startswith("---"):
53
+ return None, text
54
+ parts = text.split("\n---", 2)
55
+ if len(parts) < 2:
56
+ return None, text
57
+ return parts[0].removeprefix("---").strip("\n"), parts[1] + (parts[2] if len(parts) > 2 else "")
58
+
59
+
60
+ def validate_idea_brief(brief_path: str | Path) -> list[str]:
61
+ """Return list of error strings. Empty list = valid."""
62
+ path = Path(brief_path)
63
+ errors: list[str] = []
64
+
65
+ try:
66
+ text = path.read_text(encoding="utf-8")
67
+ except OSError as e:
68
+ return [f"Read error: {e}"]
69
+
70
+ raw_fm, body = split_frontmatter(text)
71
+ if raw_fm is None:
72
+ return ["No YAML frontmatter block (file must start with ---)"]
73
+
74
+ try:
75
+ fm = yaml.safe_load(raw_fm)
76
+ except yaml.YAMLError as e:
77
+ return [f"Frontmatter YAML parse error: {e}"]
78
+ if not isinstance(fm, dict):
79
+ return ["Frontmatter is not a mapping"]
80
+
81
+ # YAML parses an unquoted 2026-07-04 as datetime.date; normalize before schema check.
82
+ if isinstance(fm.get("date"), datetime.date):
83
+ fm["date"] = fm["date"].isoformat()
84
+
85
+ try:
86
+ schema = load_schema()
87
+ except (OSError, json.JSONDecodeError) as e:
88
+ return [f"Schema load error: {e}"]
89
+ validator = jsonschema.Draft202012Validator(schema)
90
+ errors.extend(
91
+ f"frontmatter{'.' + '.'.join(str(p) for p in err.path) if err.path else ''}: {err.message}"
92
+ for err in sorted(validator.iter_errors(fm), key=lambda e: list(e.path))
93
+ )
94
+
95
+ for header in _REQUIRED_HEADERS:
96
+ if not any(line.startswith(header) for line in body.splitlines()):
97
+ errors.append(f"body: missing required header '{header}'")
98
+
99
+ if fm.get("verdict") == "refuted":
100
+ for i, claim in enumerate(fm.get("colliding_claims") or []):
101
+ if isinstance(claim, str) and claim not in body:
102
+ errors.append(
103
+ f"colliding_claims[{i}]: not found verbatim in body — "
104
+ "refutation must quote the colliding answers exactly"
105
+ )
106
+
107
+ m = _FILENAME_RE.match(path.name)
108
+ if m and isinstance(fm.get("idea"), str) and m.group("slug") != fm["idea"]:
109
+ errors.append(
110
+ f"filename slug '{m.group('slug')}' does not match frontmatter idea '{fm['idea']}'"
111
+ )
112
+
113
+ return errors
@@ -0,0 +1,170 @@
1
+ Metadata-Version: 2.4
2
+ Name: socratic-method
3
+ Version: 0.1.0
4
+ Summary: A Socratic questioning skill for coding agents (Claude Code, Codex, Copilot), with a one-command installer and an idea-brief validator.
5
+ Project-URL: Homepage, https://github.com/grammy-jiang/socratic-method
6
+ Project-URL: Issues, https://github.com/grammy-jiang/socratic-method/issues
7
+ Author: Grammy Jiang
8
+ License-Expression: MIT
9
+ License-File: LICENSE
10
+ Keywords: agent-skills,claude-code,codex,copilot,socratic-method
11
+ Classifier: Development Status :: 4 - Beta
12
+ Classifier: Environment :: Console
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Classifier: Programming Language :: Python :: 3.13
17
+ Classifier: Topic :: Software Development :: Quality Assurance
18
+ Requires-Python: >=3.11
19
+ Requires-Dist: jsonschema>=4.0
20
+ Requires-Dist: pyyaml>=6.0
21
+ Description-Content-Type: text/markdown
22
+
23
+ # socratic-method
24
+
25
+ A **Socratic questioning skill for coding agents** — Claude Code, OpenAI Codex, and
26
+ GitHub Copilot — packaged with a one-command installer and a deterministic artifact
27
+ validator.
28
+
29
+ Invoke the skill before real work starts (writing software, drafting a plan or document,
30
+ making a decision) and the agent becomes a disciplined questioner: it steelmans your idea,
31
+ interrogates it one question at a time (classic elenchus — six Socratic question types,
32
+ counterexamples, contradiction-surfacing by quoting your own words), and ends with an
33
+ honest verdict — **sharpened**, **aporia** (a genuinely unresolved hole, treated as a
34
+ finding), or **refuted** (only ever out of your own mouth). The result is written down as
35
+ a machine-validatable **idea brief** (`idea-brief-v1`) that downstream work can consume:
36
+ open questions become a research agenda, unvalidated assumptions become a validation
37
+ worklist.
38
+
39
+ The skill was authored and hardened in
40
+ [subagent-factory](https://github.com/grammy-jiang/subagent-factory): seven rounds of
41
+ grounded review plus a six-cell adversarial behavioral eval (examiner vs. scripted user
42
+ simulator, deterministic graders + independent judge), which caught and fixed real
43
+ behavior gaps static review missed. The eval harness ships in this repo under `evals/`.
44
+
45
+ ## Install
46
+
47
+ ```bash
48
+ pip install git+https://github.com/grammy-jiang/socratic-method
49
+ # or, without installing:
50
+ uvx --from git+https://github.com/grammy-jiang/socratic-method socratic-method --help
51
+ ```
52
+
53
+ ## Set up the skill for your agents
54
+
55
+ ```bash
56
+ # auto-detect which agents are installed on this machine and configure those
57
+ socratic-method setup
58
+
59
+ # or name platforms explicitly / force all three
60
+ socratic-method setup claude codex
61
+ socratic-method setup all
62
+
63
+ # install into your user home instead of the current project
64
+ socratic-method setup claude --scope user
65
+
66
+ # see what would happen first / check current state / remove
67
+ socratic-method setup --dry-run
68
+ socratic-method status
69
+ socratic-method uninstall claude
70
+ ```
71
+
72
+ With no targets, `setup` **auto-detects** installed agents and configures only those,
73
+ printing the concrete evidence for each detection (never a bare claim):
74
+
75
+ | Agent | Detection signals, in order |
76
+ |---|---|
77
+ | Claude Code | `claude` CLI on PATH; else `~/.claude/` config directory |
78
+ | OpenAI Codex | `codex` CLI on PATH; else `~/.codex/` config directory |
79
+ | GitHub Copilot | `copilot` CLI on PATH; else `gh-copilot` extension; else a `github.copilot*` VS Code extension |
80
+
81
+ If nothing is detected, `setup` installs nothing and tells you how to name targets
82
+ explicitly. `setup all` bypasses detection.
83
+
84
+ `setup` is **idempotent** (content-hash comparison; an identical install reports
85
+ "up to date"), refuses to overwrite locally modified files without `--force`, and after
86
+ every write **reads the files back from disk before reporting success** — the skill's own
87
+ "verify before claiming" rule, applied to its installer.
88
+
89
+ ### Where the skill lands
90
+
91
+ | Platform | `--scope project` (default) | `--scope user` |
92
+ |---|---|---|
93
+ | Claude Code | `<root>/.claude/skills/socratic-method/` | `~/.claude/skills/socratic-method/` |
94
+ | OpenAI Codex | `<root>/.agents/skills/socratic-method/` | `~/.agents/skills/socratic-method/` |
95
+ | GitHub Copilot | `<root>/.github/skills/socratic-method/` | — |
96
+
97
+ Copilot also reads a repo's `.claude/skills/`, so if the Claude project install is already
98
+ present, the Copilot step reports "already covered" and skips (a duplicate would trigger
99
+ twice); `--force` installs to `.github/skills/` anyway. Paths live in one data-driven
100
+ registry (`installer.py`) — if a platform moves its skills directory, the fix is one line.
101
+
102
+ ## Use the skill
103
+
104
+ In any agent with the skill installed:
105
+
106
+ > stress-test this plan: …
107
+ > help me think through …
108
+ > poke holes in this idea: …
109
+
110
+ Or invoke directly: `/socratic-method <idea> [--mode stress|develop] [--depth quick|standard|deep]`.
111
+ The session ends with the brief saved to `notes/idea-briefs/<slug>-YYYYMMDD.md`.
112
+
113
+ ## Validate a brief
114
+
115
+ ```bash
116
+ socratic-method validate notes/idea-briefs/my-idea-20260704.md
117
+ ```
118
+
119
+ Checks the YAML frontmatter against the packaged `idea-brief-v1` JSON schema plus the
120
+ cross-field rules a schema can't express (e.g. `verdict: refuted` requires the two
121
+ colliding claims verbatim in the body; `verdict: aporia` requires open questions).
122
+
123
+ ## Behavioral eval harness (`evals/`)
124
+
125
+ The six-cell regression matrix that hardened the skill: normal cells (planted
126
+ contradiction → refuted; genuine unknowns → aporia; quick-depth cadence), edge cells
127
+ (mid-session stop; disputed restatement), and an out-of-scope cell (fully specified plan →
128
+ decline). Each cell runs a live examiner against a scripted user simulator, then grades
129
+ the transcript with deterministic graders and an independent model judge.
130
+
131
+ ```bash
132
+ python evals/run_eval.py --dry-run # list cells, no calls
133
+ python evals/run_eval.py --cell O1 # one cell
134
+ python evals/run_eval.py # full matrix — spawns ~30-60 headless `claude` calls
135
+ ```
136
+
137
+ Requires the `claude` CLI and real tokens; run cells individually while iterating. A cell
138
+ passes only when **all** deterministic graders pass AND the judge confirms the expected
139
+ behavior with no fabrication.
140
+
141
+ ## Development
142
+
143
+ ```bash
144
+ uv sync # or: pip install -e . && pip install pytest
145
+ uv run pytest # validator negatives, installer + detection behavior, CLI smoke
146
+ uv run pre-commit install # one tool per file type: ruff (py), syntax checks (yaml/json/toml), actionlint
147
+ ```
148
+
149
+ CI (`.github/workflows/ci.yml`) runs the test suite on Python 3.11–3.13, builds the
150
+ sdist/wheel, checks metadata, and smoke-tests the CLI installed from the built wheel.
151
+
152
+ ## Releasing to PyPI
153
+
154
+ Publishing uses **PyPI Trusted Publishing** (OIDC) — no API token lives in the repo.
155
+ One-time setup on pypi.org: *Account → Publishing → Add a new pending publisher* with
156
+ project `socratic-method`, owner `grammy-jiang`, repository `socratic-method`, workflow
157
+ `release.yml`, environment `pypi`. Then, to release:
158
+
159
+ 1. Bump `__version__` in `src/socratic_method/__init__.py`.
160
+ 2. Create a GitHub Release with tag `vX.Y.Z` (must match — the workflow verifies).
161
+ 3. `release.yml` builds, checks, and publishes; the package appears at
162
+ `pypi.org/project/socratic-method/`, after which `pip install socratic-method` works.
163
+
164
+ Maintenance rule inherited from the skill's history: a behavioral failure is not fixed
165
+ until something durable changes — a grader, a scenario, or a rail in `SKILL.md` — never a
166
+ prose spot-fix alone.
167
+
168
+ ## License
169
+
170
+ MIT
@@ -0,0 +1,12 @@
1
+ socratic_method/__init__.py,sha256=wxwvjw2VfqE6ikJq_8Y4UjhBIOwZnU01FcBFmZCwObY,495
2
+ socratic_method/cli.py,sha256=YGWfrqMkIdTh8Bq61zi3cV-F-4u-DLY9XRdnaBYEK5M,5315
3
+ socratic_method/installer.py,sha256=T9l9ASAvp-G9OdwhaKDS7PWmN8frFM1YV_P5QHzcAV4,9186
4
+ socratic_method/validator.py,sha256=_YL-U0sAl05vUYei6rdgDpOichW3IWciYijoY8cFCW0,3880
5
+ socratic_method/assets/SKILL.md,sha256=2mEJIbZPn1ke_8gIW7rfCMBm27qxF3IXQFQ7KKZ015w,14879
6
+ socratic_method/assets/idea-brief-v1.schema.json,sha256=LC4BnrKVvi8PH8LMY5S3kgI6Zs3x8XIKhFwBFhDVsKQ,2894
7
+ socratic_method/assets/references/example-session.md,sha256=GX8UDSPK9UshSb56O1xSBjJh1WArkoy5lJus3fRaH-4,7605
8
+ socratic_method-0.1.0.dist-info/METADATA,sha256=98_OpKSwwfL83ZBTCamgrT0FOq8Sn8sClWSQO8_MUo4,7530
9
+ socratic_method-0.1.0.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
10
+ socratic_method-0.1.0.dist-info/entry_points.txt,sha256=t5OsMhd77t_NZY95kBC99MlW9Ow4S6LpFYwKCC_9ztc,61
11
+ socratic_method-0.1.0.dist-info/licenses/LICENSE,sha256=uFPMuBlmnrkl9sxXavhNpH6trppskKcU_5AN435lCr4,1069
12
+ socratic_method-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: hatchling 1.30.1
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ socratic-method = socratic_method.cli:main
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Grammy Jiang
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.