research-copilot 0.2.20 → 0.2.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (81)
  1. package/README.md +1 -1
  2. package/app/out/main/index.mjs +2585 -48
  3. package/app/out/renderer/assets/{MilkdownMarkdownEditor-CCdZ2mtg.css → MilkdownMarkdownEditor-BW0Pt28W.css} +16 -1
  4. package/app/out/renderer/assets/{MilkdownMarkdownEditor-Bj7JSjF5.js → MilkdownMarkdownEditor-OhCrq3X0.js} +56 -51
  5. package/app/out/renderer/assets/{arc-CPL9nDFE.js → arc-DLr0RP8F.js} +1 -1
  6. package/app/out/renderer/assets/{blockDiagram-c4efeb88-BFOajDNs.js → blockDiagram-c4efeb88-XhKChw2n.js} +8 -8
  7. package/app/out/renderer/assets/{c4Diagram-c83219d4-LeqnQ2-5.js → c4Diagram-c83219d4-DDoJmoIQ.js} +3 -3
  8. package/app/out/renderer/assets/{channel-jk5Np8ud.js → channel-CJCgJSqV.js} +1 -1
  9. package/app/out/renderer/assets/{classDiagram-beda092f-CxOqB6OU.js → classDiagram-beda092f-CAmimZpz.js} +6 -6
  10. package/app/out/renderer/assets/{classDiagram-v2-2358418a-CyP_5qLa.js → classDiagram-v2-2358418a-Bma4E_Eg.js} +10 -10
  11. package/app/out/renderer/assets/{clone-PHFwh58n.js → clone-C338dmoI.js} +1 -1
  12. package/app/out/renderer/assets/{createText-1719965b-CE_0jsfj.js → createText-1719965b-_up4NJqB.js} +2 -2
  13. package/app/out/renderer/assets/{edges-96097737-DBk1JhZS.js → edges-96097737-Bpp6hVLn.js} +3 -3
  14. package/app/out/renderer/assets/{erDiagram-0228fc6a-DnR_LkSB.js → erDiagram-0228fc6a-bjTh_7ap.js} +5 -5
  15. package/app/out/renderer/assets/{flowDb-c6c81e3f-CJrZUKlS.js → flowDb-c6c81e3f-BjVV4DVk.js} +1 -1
  16. package/app/out/renderer/assets/{flowDiagram-50d868cf-CfNfrt17.js → flowDiagram-50d868cf-gmeaaZ6z.js} +12 -12
  17. package/app/out/renderer/assets/{flowDiagram-v2-4f6560a1-BGQtiK3j.js → flowDiagram-v2-4f6560a1-nem5zs2M.js} +12 -12
  18. package/app/out/renderer/assets/{flowchart-elk-definition-6af322e1-BXLraghz.js → flowchart-elk-definition-6af322e1-DPaGAYRw.js} +6 -6
  19. package/app/out/renderer/assets/{ganttDiagram-a2739b55-CAwaEMMm.js → ganttDiagram-a2739b55-CnAti19E.js} +3 -3
  20. package/app/out/renderer/assets/{gitGraphDiagram-82fe8481-vuSEC6ny.js → gitGraphDiagram-82fe8481-DQWHD3SJ.js} +2 -2
  21. package/app/out/renderer/assets/{graph-CZfltE7S.js → graph-DKiKgH8m.js} +1 -1
  22. package/app/out/renderer/assets/{index-DIZJXKQ6.js → index-4s-c5d65.js} +3 -3
  23. package/app/out/renderer/assets/{index-5325376f-DWTrHDEo.js → index-5325376f-G-0aO-2i.js} +6 -6
  24. package/app/out/renderer/assets/{index-CwPfquqm.js → index-9q_P5ULR.js} +4 -4
  25. package/app/out/renderer/assets/{index-EaGZvaBp.js → index-B1A3JxQj.js} +3 -3
  26. package/app/out/renderer/assets/{index-C2tqvXjC.js → index-BBUrmGmY.js} +6 -6
  27. package/app/out/renderer/assets/{index-D_7yOLk3.js → index-BQho5LH-.js} +6 -6
  28. package/app/out/renderer/assets/{index-B6f2bVW_.js → index-BUVlmsgO.js} +3 -3
  29. package/app/out/renderer/assets/{index-DpXI4mHb.js → index-BzEthrJ4.js} +3 -3
  30. package/app/out/renderer/assets/{index-CUsEKU8Q.js → index-C1YzkB4z.js} +93 -36
  31. package/app/out/renderer/assets/{index-CMfKxpBP.js → index-CGo665vD.js} +3 -3
  32. package/app/out/renderer/assets/{index-B5Mkpo9f.js → index-CPZaxR35.js} +3 -3
  33. package/app/out/renderer/assets/{index-BpdWQuss.js → index-CSyD1mbL.js} +3 -3
  34. package/app/out/renderer/assets/{index-DB8ImtMy.js → index-Cf7vlFSn.js} +3 -3
  35. package/app/out/renderer/assets/{index-CyDfvefg.js → index-CluH1o2q.js} +6 -6
  36. package/app/out/renderer/assets/{index-7dcVwInU.js → index-Cw1n3klA.js} +5 -5
  37. package/app/out/renderer/assets/{index-Ul-Kq9b2.js → index-DFzvntIw.js} +3 -3
  38. package/app/out/renderer/assets/{index-t0-md-MG.js → index-DHzyAhWM.js} +4 -4
  39. package/app/out/renderer/assets/{index-Cc9coKGN.js → index-DhliHfCM.js} +6 -6
  40. package/app/out/renderer/assets/{index-K0o5fHYG.js → index-DkVFbCxC.js} +3 -3
  41. package/app/out/renderer/assets/{index-DiCqe1UR.js → index-DpZJP5MT.js} +6 -6
  42. package/app/out/renderer/assets/{index-CaYWMBXT.js → index-Gfd_DiMG.js} +3 -3
  43. package/app/out/renderer/assets/{index-Di3HmXc-.js → index-jOvNAYyP.js} +3 -3
  44. package/app/out/renderer/assets/{index-B4V7cFWJ.js → index-rrJkk8KV.js} +6 -6
  45. package/app/out/renderer/assets/{index-BgAs-p8D.js → index-vfSerSmF.js} +1 -1
  46. package/app/out/renderer/assets/{infoDiagram-8eee0895-BmPESCfj.js → infoDiagram-8eee0895-BCnBkXXS.js} +2 -2
  47. package/app/out/renderer/assets/{journeyDiagram-c64418c1-BGsCbfr_.js → journeyDiagram-c64418c1-Bq2wSX3k.js} +4 -4
  48. package/app/out/renderer/assets/{layout-5MwFTPs7.js → layout-BvkumzoT.js} +2 -2
  49. package/app/out/renderer/assets/{line-D0U74KO0.js → line-eU4el-G4.js} +1 -1
  50. package/app/out/renderer/assets/{linear-BclyBoiT.js → linear-DlBjMBEa.js} +1 -1
  51. package/app/out/renderer/assets/{mindmap-definition-8da855dc-un1bPKBj.js → mindmap-definition-8da855dc-CzLBu7ao.js} +3 -3
  52. package/app/out/renderer/assets/{pieDiagram-a8764435-B7KM3duv.js → pieDiagram-a8764435--olrXFr_.js} +3 -3
  53. package/app/out/renderer/assets/{quadrantDiagram-1e28029f-C8i5m3Os.js → quadrantDiagram-1e28029f-BnpnBBgc.js} +3 -3
  54. package/app/out/renderer/assets/{requirementDiagram-08caed73-FjqENNN5.js → requirementDiagram-08caed73-6O9WS7hn.js} +5 -5
  55. package/app/out/renderer/assets/{sankeyDiagram-a04cb91d-BKV22yuJ.js → sankeyDiagram-a04cb91d-D-iJnK91.js} +2 -2
  56. package/app/out/renderer/assets/{sequenceDiagram-c5b8d532-DWO-Z2i3.js → sequenceDiagram-c5b8d532-DBlK15cV.js} +3 -3
  57. package/app/out/renderer/assets/{stateDiagram-1ecb1508-BqohgALA.js → stateDiagram-1ecb1508-DKXKPYuk.js} +6 -6
  58. package/app/out/renderer/assets/{stateDiagram-v2-c2b004d7-B3sEkrB8.js → stateDiagram-v2-c2b004d7-DY288Eo5.js} +10 -10
  59. package/app/out/renderer/assets/{styles-b4e223ce-BGytHk8n.js → styles-b4e223ce-CRJ_xgJ-.js} +1 -1
  60. package/app/out/renderer/assets/{styles-ca3715f6-B0PvBknL.js → styles-ca3715f6-Bp_k5KLD.js} +1 -1
  61. package/app/out/renderer/assets/{styles-d45a18b0-C6F384ai.js → styles-d45a18b0-DLA8Gg6D.js} +4 -4
  62. package/app/out/renderer/assets/{svgDrawCommon-b86b1483-BXgThwM_.js → svgDrawCommon-b86b1483-Dm5CK2gQ.js} +1 -1
  63. package/app/out/renderer/assets/{timeline-definition-faaaa080-iNn5igPR.js → timeline-definition-faaaa080-D-m9BHUg.js} +3 -3
  64. package/app/out/renderer/assets/{xychartDiagram-f5964ef8-oF_gxlk1.js → xychartDiagram-f5964ef8-Drn4Rqev.js} +5 -5
  65. package/app/out/renderer/index.html +1 -1
  66. package/lib/skills/builtin/academic-marp-slides/SKILL.md +933 -0
  67. package/lib/skills/builtin/research-grants/SKILL.md +15 -11
  68. package/lib/skills/builtin/scholar-evaluation/SKILL.md +12 -11
  69. package/lib/skills/builtin/scientific-schematics/SKILL.md +463 -560
  70. package/lib/skills/builtin/teaching-marp-slides/SKILL.md +1218 -0
  71. package/package.json +1 -1
  72. package/scripts/audit-diagram-prompts.mjs +67 -0
  73. package/scripts/test-skill-routing.mjs +238 -0
  74. package/lib/skills/builtin/marp-slides/SKILL.md +0 -642
  75. package/lib/skills/builtin/scientific-schematics/references/QUICK_REFERENCE.md +0 -182
  76. package/lib/skills/builtin/scientific-schematics/references/README.md +0 -292
  77. package/lib/skills/builtin/scientific-schematics/scripts/__pycache__/generate_schematic.cpython-312.pyc +0 -0
  78. package/lib/skills/builtin/scientific-schematics/scripts/__pycache__/generate_schematic_ai.cpython-312.pyc +0 -0
  79. package/lib/skills/builtin/scientific-schematics/scripts/example_usage.sh +0 -85
  80. package/lib/skills/builtin/scientific-schematics/scripts/generate_schematic.py +0 -141
  81. package/lib/skills/builtin/scientific-schematics/scripts/generate_schematic_ai.py +0 -910
@@ -1,603 +1,506 @@
  ---
  name: scientific-schematics
- description: Create publication-quality scientific diagrams using OpenRouter API with smart iterative refinement. The skill defaults to Gemini 3 Pro Image for generation plus Gemini 3 Pro for review, and only regenerates if quality is below threshold for your document type.
+ description: Prompt guidance for the generate_diagram tool how to describe different scientific diagram types (flowcharts, pathways, architecture, circuits, networks, conceptual) so the tool produces publication-grade output on the first iteration.
  category: Visualization
  tags: [science, visualization, diagrams]
- triggers: [diagram, schematic, flowchart, architecture diagram, system diagram, 示意图, 流程图, generate diagram]
- allowed-tools: [Read, Write, Edit, Bash]
+ triggers: [diagram, schematic, flowchart, architecture diagram, pathway, circuit, 示意图, 流程图]
+ allowed-tools: [Read, Write, Edit]
  license: MIT license
  metadata:
- skill-author: K-Dense Inc.
+ skill-author: Dong Dai
  ---

- # Scientific Schematics and Diagrams
+ # Scientific Schematics

- ## Overview
+ ## Scope of this skill

- Scientific schematics and diagrams transform complex concepts into clear visual representations for publication. We want the diagram a textbook style. **This skill uses OpenRouter API for image generation with Gemini quality review.**
+ This skill **does not generate images**. Image generation is owned by the
+ `generate_diagram` tool, which runs a provider-backed generate → review →
+ (optional) edit loop. The tool is always available; this skill is only
+ loaded to sharpen the prompt the tool receives.

- **How it works:**
- - Describe your diagram in natural language
- - Gemini image model generates publication-quality diagrams via OpenRouter
- - **Gemini reviews quality** against document-type thresholds
- - **Smart iteration**: Only regenerates if quality is below threshold
- - Publication-ready output in minutes
- - No coding, templates, or manual drawing required
+ Use this skill to:
+ 1. Pick the right `diagram_type` parameter for the request.
+ 2. Write a description that names components, labels, quantities, and
+ flow direction unambiguously (LLM image models hallucinate when these
+ are vague).
+ 3. Choose an appropriate `doc_type` so the reviewer applies the right
+ quality threshold.
+ 4. Embed the resulting image in the document correctly.

- **Quality Thresholds by Document Type:**
- | Document Type | Threshold | Description |
- |---------------|-----------|-------------|
- | journal | 8.5/10 | Nature, Science, peer-reviewed journals |
- | conference | 8.0/10 | Conference papers |
- | thesis | 8.0/10 | Dissertations, theses |
- | grant | 8.0/10 | Grant proposals |
- | preprint | 7.5/10 | arXiv, bioRxiv, etc. |
- | report | 7.5/10 | Technical reports |
- | poster | 7.0/10 | Academic posters |
- | presentation | 6.5/10 | Slides, talks |
- | default | 7.5/10 | General purpose |
-
- **Simply describe what you want, and AI generates it.** All diagrams are stored in the @ws/figures/ subfolder and referenced in papers/posters.
-
- ### Embedding Figures in Documents (MANDATORY)
-
- **After generating every figure, you MUST embed it in the document using markdown image syntax.**
-
- The image path must be **relative to the markdown file you are writing**, not the workspace root. This ensures the renderer resolves the path correctly regardless of where the markdown file lives.
-
- **Example — markdown at workspace root (`paper.md`):**
- ```bash
- python @skill/scripts/generate_schematic.py "CONSORT flowchart..." -o @ws/figures/consort.png
- ```
- Embed in `paper.md`:
- ```markdown
- ![Figure 2: CONSORT participant flow diagram](figures/consort.png)
- ```
-
- **Example — markdown in a subdirectory (`workspace/paper_draft.md`):**
- ```bash
- python @skill/scripts/generate_schematic.py "CONSORT flowchart..." -o @ws/figures/consort.png
- ```
- Embed in `workspace/paper_draft.md` — use `../` to go up:
- ```markdown
- ![Figure 2: CONSORT participant flow diagram](../figures/consort.png)
- ```
-
- **Rule of thumb:** count how many directories deep your markdown file is from the workspace root, and prepend that many `../` segments to `figures/filename.png`.
-
- Alternatively, you can generate the image alongside the markdown file:
- ```bash
- # If markdown is at workspace/paper_draft.md, save figures next to it:
- python @skill/scripts/generate_schematic.py "CONSORT flowchart..." -o @ws/workspace/figures/consort.png
- ```
- Then embed simply as:
- ```markdown
- ![Figure 2: CONSORT participant flow diagram](figures/consort.png)
- ```
-
- Do NOT only cite figures as text ("see Figure 1"). Always include the `![...](...)` embed so the figure renders visually in the output.
-
- **Default model path:** this skill intentionally uses one default generation model, `google/gemini-3-pro-image-preview`, with Gemini review via `google/gemini-3-pro-preview`. Agents should not pick among multiple image backends during normal operation.
-
- ### Path Conventions
-
- - Shell command examples use scoped paths:
- - `@skill/...` for files shipped with this skill (for example scripts)
- - `@ws/...` for workspace files (inputs/outputs)
- - Python API and JSON log examples show raw filesystem paths (typically workspace-relative, such as `figures/...`).
-
- ## Quick Start: Generate Any Diagram
-
- Create any scientific diagram by simply describing it. AI handles everything automatically with **smart iteration**:
-
- ```bash
- # Generate for journal paper (highest quality threshold: 8.5/10)
- python @skill/scripts/generate_schematic.py "CONSORT participant flow diagram with 500 screened, 150 excluded, 350 randomized" -o @ws/figures/consort.png --doc-type journal
-
- # Generate for presentation (lower threshold: 6.5/10 - faster)
- python @skill/scripts/generate_schematic.py "Transformer encoder-decoder architecture showing multi-head attention" -o @ws/figures/transformer.png --doc-type presentation
-
- # Generate for poster (moderate threshold: 7.0/10)
- python @skill/scripts/generate_schematic.py "MAPK signaling pathway from EGFR to gene transcription" -o @ws/figures/mapk_pathway.png --doc-type poster
-
- # Custom max iterations (max 2)
- python @skill/scripts/generate_schematic.py "Complex circuit diagram with op-amp, resistors, and capacitors" -o @ws/figures/circuit.png --iterations 2 --doc-type journal
- ```
-
- **What happens behind the scenes:**
- 1. **Generation 1**: Gemini image model creates an initial image following scientific diagram best practices
- 2. **Review 1**: **Gemini** evaluates quality against the document-type threshold
- 3. **Decision**: If quality >= threshold → **DONE** (no more iterations needed!)
- 4. **If below threshold**: Improved prompt based on critique, regenerate
- 5. **Repeat**: Until quality meets threshold OR max iterations reached
-
- **Smart Iteration Benefits:**
- - Saves API calls if first generation is good enough
- - Higher quality standards for journal papers
- - Faster turnaround for presentations/posters
- - Appropriate quality for each use case
-
- **Output**: Versioned images plus a detailed review log with quality scores, critiques, and early-stop information.
-
- ### Configuration
-
- Set your OpenRouter API key:
- ```bash
- export OPENROUTER_API_KEY='your-openrouter-api-key'
- ```
-
- Optional model overrides:
- ```bash
- export SCHEMATIC_IMAGE_MODEL='google/gemini-3-pro-image-preview' # default
- export SCHEMATIC_REVIEW_MODEL='google/gemini-3-pro-preview' # default
- ```
-
- ### AI Generation Best Practices
-
- **Effective Prompts for Scientific Diagrams:**
-
- ✓ **Good prompts** (specific, detailed):
- - "CONSORT flowchart showing participant flow from screening (n=500) through randomization to final analysis"
- - "Transformer neural network architecture with encoder stack on left, decoder stack on right, showing multi-head attention and cross-attention connections"
- - "Biological signaling cascade: EGFR receptor → RAS → RAF → MEK → ERK → nucleus, with phosphorylation steps labeled"
- - "Block diagram of IoT system: sensors → microcontroller → WiFi module → cloud server → mobile app"
-
- ✗ **Avoid vague prompts**:
- - "Make a flowchart" (too generic)
- - "Neural network" (which type? what components?)
- - "Pathway diagram" (which pathway? what molecules?)
-
- **Key elements to include:**
- - **Type**: Flowchart, architecture diagram, pathway, circuit, etc.
- - **Components**: Specific elements to include
- - **Flow/Direction**: How elements connect (left-to-right, top-to-bottom)
- - **Labels**: Key annotations or text to include
- - **Style**: A textbook style plus Any specific visual requirements
-
- **Scientific Quality Guidelines** (automatically applied):
- - Clean white/light background
- - High contrast for readability
- - Clear, readable labels (minimum 10pt)
- - Professional typography (sans-serif fonts)
- - Colorblind-friendly colors (Okabe-Ito palette)
- - Proper spacing to prevent crowding
- - Scale bars, legends, axes where appropriate
-
- ## When to Use This Skill
-
- This skill should be used when:
- - Creating neural network architecture diagrams (Transformers, CNNs, RNNs, etc.)
- - Illustrating system architectures and data flow diagrams
- - Drawing methodology flowcharts for study design (CONSORT, PRISMA)
- - Visualizing algorithm workflows and processing pipelines
- - Creating circuit diagrams and electrical schematics
- - Depicting biological pathways and molecular interactions
- - Generating network topologies and hierarchical structures
- - Illustrating conceptual frameworks and theoretical models
- - Designing block diagrams for technical papers
-
- ## How to Use This Skill
-
- **Simply describe your diagram in natural language.** AI generates it automatically:
-
- ```bash
- python @skill/scripts/generate_schematic.py "your diagram description" -o @ws/output.png
- ```
-
- **That's it!** The AI handles:
- - Layout and composition
- - Labels and annotations
- - Colors and styling
- - Quality review and refinement
- - Publication-ready output
-
- **Works for all diagram types:**
- - Flowcharts (CONSORT, PRISMA, etc.)
- - Neural network architectures
- - Biological pathways
- - Circuit diagrams
- - System architectures
- - Block diagrams
- - Any scientific visualization
-
- **No coding, no templates, no manual drawing required.**
+ Do not write custom Python, shell, or SVG — the tool produces the image.

  ---

- # AI Generation Mode (OpenRouter + Gemini Review)
-
- ## Smart Iterative Refinement Workflow
-
- The AI generation system uses **smart iteration** - it only regenerates if quality is below the threshold for your document type:
-
- ### How Smart Iteration Works
-
- ```
- ┌─────────────────────────────────────────────────────┐
- │ 1. Generate image with Gemini via OpenRouter │
- │ ↓ │
- │ 2. Review quality with Gemini │
- │ ↓ │
- │ 3. Score >= threshold? │
- │ YES → DONE! (early stop) │
- │ NO → Improve prompt, go to step 1 │
- │ ↓ │
- │ 4. Repeat until quality met OR max iterations │
- └─────────────────────────────────────────────────────┘
- ```
-
- ### Iteration 1: Initial Generation
- **Prompt Construction:**
- ```
- Scientific diagram guidelines + User request
- ```
+ ## Tool usage

- **Output:** `diagram_v1.png`
-
- ### Quality Review by Gemini
-
- Gemini evaluates the diagram on:
- 1. **Scientific Accuracy** (0-2 points) - Correct concepts, notation, relationships
- 2. **Clarity and Readability** (0-2 points) - Easy to understand, clear hierarchy
- 3. **Label Quality** (0-2 points) - Complete, readable, consistent labels
- 4. **Layout and Composition** (0-2 points) - Logical flow, balanced, no overlaps
- 5. **Professional Appearance** (0-2 points) - Publication-ready quality
-
- **Example Review Output:**
- ```
- SCORE: 8.0
-
- STRENGTHS:
- - Clear flow from top to bottom
- - All phases properly labeled
- - Professional typography
-
- ISSUES:
- - Participant counts slightly small
- - Minor overlap on exclusion box
-
- VERDICT: ACCEPTABLE (for poster, threshold 7.0)
- ```
-
- ### Decision Point: Continue or Stop?
-
- | If Score... | Action |
- |-------------|--------|
- | >= threshold | **STOP** - Quality is good enough for this document type |
- | < threshold | Continue to next iteration with improved prompt |
-
- **Example:**
- - For a **poster** (threshold 7.0): Score of 7.5 → **DONE after 1 iteration!**
- - For a **journal** (threshold 8.5): Score of 7.5 → Continue improving
-
- ### Subsequent Iterations (Only If Needed)
-
- If quality is below threshold, the system:
- 1. Extracts specific issues from Gemini's review
- 2. Enhances the prompt with improvement instructions
- 3. Regenerates via OpenRouter
- 4. Reviews again with Gemini
- 5. Repeats until threshold met or max iterations reached
-
- ### Review Log
- All iterations are saved with a JSON review log that includes early-stop information:
- ```json
- {
- "user_prompt": "CONSORT participant flow diagram...",
- "doc_type": "poster",
- "quality_threshold": 7.0,
- "iterations": [
- {
- "iteration": 1,
- "image_path": "figures/consort_v1.png",
- "score": 7.5,
- "needs_improvement": false,
- "critique": "SCORE: 7.5\nSTRENGTHS:..."
- }
- ],
- "final_score": 7.5,
- "early_stop": true,
- "early_stop_reason": "Quality score 7.5 meets threshold 7.0 for poster"
- }
- ```
-
- **Note:** With smart iteration, you may see only 1 iteration instead of the full 2 if quality is achieved early!
-
- ## Advanced AI Generation Usage
-
- ### Supported Automation Interface
-
- Use the CLI wrapper as the stable interface:
-
- ```bash
- python @skill/scripts/generate_schematic.py \
- "Transformer architecture diagram" \
- -o @ws/figures/transformer.png \
- --iterations 2
- ```
+ Call the `generate_diagram` tool with:
+
+ ```
+ prompt: <description crafted using the guidance below>
+ output: figures/<name>.png
+ doc_type: journal | conference | thesis | grant | preprint |
+ report | poster | presentation | default
+ diagram_type: flowchart | architecture | pathway | circuit |
+ network | conceptual | auto
+ iterations: 1 | 2 | 3 (default 2)
+ aspect: auto | square | landscape | portrait (default auto)
+ quality: low | medium | high | auto (default derived from doc_type)
+ format: auto | png | svg (default auto — inferred from output extension)
+ ```
+
+ **aspect guidance** — the default `auto` lets the model pick, but
+ when you already know the figure's shape, set it explicitly:
+ - `landscape` for wide architecture diagrams, left-to-right pipelines,
+ multi-panel (3+ columns) layouts
+ - `portrait` for CONSORT/PRISMA flows, top-to-bottom pathway cascades,
+ tall hierarchies
+ - `square` for single-concept schematics, cycles, small callouts
+ - `auto` when the shape is genuinely ambiguous or the prompt already
+ describes the layout strongly
+
+ **quality guidance** — gpt-image-2 exposes four tiers; the cost and
+ render time roughly follow low < medium < high. Omitting the field
+ selects a sensible default from `doc_type`:
+
+ | doc_type | default quality |
+ |--------------------------------------|-----------------|
+ | journal / conference / thesis / grant| high |
+ | preprint / report / poster | medium |
+ | presentation | low |
+ | default | medium |
+
+ The loop automatically bumps one tier on a `needs_edit` verdict, so the
+ first iteration runs cheap and refinement only spends more compute when
+ the reviewer signals the draft was close-but-not-there. Override with
+ `quality: "low"` for explicit exploration / drafts, or `"high"` to force
+ camera-ready on the first pass.
+
+ The tool returns the final image path, per-iteration review scores,
+ the quality tier used for each iteration, the verdict trail
+ (acceptable / needs_edit / needs_regen), and a JSON review log at
+ `<name>_review_log.json`.
+
+ When `diagram_type: auto`, the tool infers from keywords in the prompt.
+ Explicit types beat inference when you know what you want.
+
+ ### Choosing PNG vs SVG output
+
+ The `format` parameter (or the `output` extension) selects the output
+ shape. The two formats have different quality envelopes and use cases:
+
+ **PNG (default)** — gpt-image-2 native raster, drives the verdict-driven
+ review loop directly. Best raw visual quality. Use when:
+ - The figure is final and won't be edited further
+ - Embedding in slides, posters, web, or as a paper raster figure
+ - The user said "image", "PNG", "图片"
+
+ **SVG** — same gpt-image-2 PNG verdict loop, then a vision-capable chat
+ model transcribes the finalized PNG into editable SVG markup. The .png
+ anchor is preserved as a sibling file (`<name>.png` next to `<name>.svg`)
+ so the user can diff visual drift later. Use when:
+ - The user wants to **edit labels / colors** in Inkscape, draw.io, or
+ by hand
+ - Embedding in LaTeX as `\includegraphics{*.svg}` or in HTML/Markdown
+ for crisp scaling
+ - The user said "SVG", "vector", "矢量图", "向量图"
+
+ Quality tiers depending on the user's configuration:
+
+ | Configuration | SVG path active | Quality |
+ |----------------------------------------------|---------------------------|---------|
+ | `OPENAI_API_KEY` + vision-capable chat model | PNG-anchored transcription | High (~7+/10, matches PNG) |
+ | `OPENAI_API_KEY` + non-vision chat model | Tool returns `SVG_REQUIRES_VISION_MODEL` error | — (recover by switching model or using PNG) |
+ | No `OPENAI_API_KEY` | Chat-model-only synthesis | Limited (~6/10) |
+
+ All catalog chat models in this app currently support vision input, so
+ the second row is rare in practice — Path B exists as a safety net for
+ future non-vision models.
+
+ ### Internal prompt structure
+
+ Under the hood the tool converts your `prompt` into a fixed-slot
+ production brief that gpt-image-2 parses reliably:
+
+ ```
+ 【SCENE / USE】 publication-grade scientific <diagram_type>
+ 【SUBJECT】 <your prompt verbatim>
+ 【COMPOSITION】 <derived from diagram_type and aspect>
+ 【KEY DETAILS】 <type-specific rules + typography + colour>
+ 【TEXT】 render every quoted literal verbatim
+ 【MUST KEEP】 <auto-extracted: quoted strings, n=X, numeric+unit>
+ 【AVOID】 no figure numbers, titles, captions, 3D, mascots
+ ```
+
+ Two practical consequences for **how you write the prompt**:
+
+ 1. **Put every label, number, and identifier in quotes.** The tool
+ auto-extracts quoted strings (and `n=...`, `1kΩ` / `10µF` / `128
+ nodes` style numeric+unit tokens) into a `MUST KEEP` checklist the
+ model must render verbatim. `"EGFR"` survives better than just
+ EGFR; `"n=350"` survives better than just n=350.
+ 2. **Describe in fixed-slot order when you can** — scene/use → subject
+ → key details → composition → text → must-keep → avoid. The tool
+ will structure whatever you write, but pre-structured prose rewrites
+ less aggressively.
+
+ ### Reference images (iterative refinement and style boards)
+
+ Two modes of `reference_mode` are supported; a third is reserved.
+
+ **`reference_mode: "revise_layout"`** (default) — the reference is the
+ existing draft to polish. The tool treats iteration 1 as a surgical
+ edit: gpt-image-2 keeps layout, colour, and positions unchanged unless
+ specific blocking issues require otherwise. Use this when the user
+ says "improve this figure" or "fix X in this figure".
+
+ **`reference_mode: "style_only"`** — the reference is a **style board**,
+ not a layout. The tool redraws the subject from scratch in the
+ reference's visual idiom (palette, typography, line weights, geometry)
+ while ignoring its layout and content. Use this when:
+ - The user points at an older figure and says "use this style" but
+ wants a completely different diagram.
+ - A lab or institution has a house-style exemplar you want the tool
+ to lift from without copying content.
+ - Reference and subject are semantically different (e.g., reference is
+ a CONSORT flowchart, subject is an architecture diagram) but you want
+ them to look like siblings.
+
+ `style_only` works with both raster and SVG outputs. In raster mode the
+ reference is passed through `/v1/images/edits` with an explicit
+ "style only, do not copy layout" instruction. In SVG mode the
+ reference SVG source is embedded in the prompt as a style exemplar.
+
+ **`reference_mode: "local_edit"`** — reserved; masked-region edit is
+ not yet implemented and the tool rejects it with a clear error.
 
- Do not rely on importing private skill modules directly from Python; this skill ships as
- scripts and references, not as an installable Python package.
+ ---

- ### Command-Line Options
+ ## Diagram types and how to describe them
+
+ ### flowchart
+ Use for CONSORT, PRISMA, study-design flow, decision trees, swimlanes.
+
+ Required in prompt:
+ - Every node's label (exact text)
+ - Every arrow's source and target
+ - Exact counts (n = …), conditions, yes/no labels
+ - Flow direction (top-to-bottom default)
+
+ Example:
+ > CONSORT participant flow, vertical top-to-bottom. Boxes:
+ > "Assessed for eligibility (n=500)" → split into
+ > "Excluded (n=150): age<18 n=80, declined n=50, other n=20"
+ > (right branch) and "Randomized (n=350)" (down). "Randomized" splits to
+ > "Treatment (n=175)" and "Control (n=175)". Each arm shows
+ > "Lost to follow-up (n=15 / n=10)" then "Analyzed (n=160 / n=165)".
+ > Colour: blue for process, orange for exclusion, green for analysis.
+
+ ### architecture
+ Use for system diagrams, microservices, data pipelines, block diagrams.
+
+ Required in prompt:
+ - Every block's label and role
+ - Layering or grouping (what belongs together)
+ - Connection direction + protocol/interface labels on edges
+ - Any shared boundaries (VPC, cluster, device)
+
+ Example:
+ > IoT monitoring architecture, three layers stacked vertically.
+ > Bottom layer (sensors): "Temperature", "Humidity", "Motion" in green
+ > boxes. Middle layer: "ESP32 microcontroller" in blue; connects up via
+ > "WiFi" and sideways to "Local display". Top layer: "Cloud server" in
+ > grey, connected to "Mobile app" in light blue. Edges labelled with
+ > protocols (I2C, UART, WiFi, HTTPS).
+
+ ### pathway
+ Use for signalling cascades, metabolic pathways, gene regulation,
+ protein-protein interaction maps.
+
+ Required in prompt:
+ - Every molecule by its symbol
+ - Arrow type: activation (→) vs inhibition (⊣)
+ - Exact order and compartments (cytoplasm, nucleus, membrane)
+ - Post-translational modifications if relevant
+
+ Example:
+ > MAPK signalling pathway, top-to-bottom. Cell membrane at top with
+ > "EGFR receptor" embedded. Below: "RAS-GTP" (oval) → "RAF" → "MEK" →
+ > "ERK" (all kinases, rectangles). "ERK" arrow crosses nuclear envelope
+ > (dashed line) to "Transcription factors" in the nucleus. Label each
+ > arrow "phosphorylation". Okabe-Ito palette.
+
+ ### circuit
+ Use for analogue/digital circuits, signal chains, power systems.
+
+ Required in prompt:
+ - Every component type with value + unit (1kΩ, 10µF, 5V)
+ - Ground and supply nets
+ - Wire crossings: dot = connection, no dot = jump
+
+ Example:
+ > Non-inverting op-amp amplifier. Input signal on left connects via
+ > 1kΩ resistor (R1) to the + input of an op-amp. Feedback: output
+ > through 10kΩ (R2) to − input, and 1kΩ (R3) from − input to ground.
+ > Supply rails ±12V. Output on the right. Standard IEEE symbols.
+
+ ### network
+ Use for neural network architectures, graphs, trees, org charts.
+
+ Required in prompt:
+ - Node types and dimensions where applicable (Dense 128, Conv 3×3)
+ - Edge semantics (data flow vs attention vs skip)
+ - Hierarchy direction
+
+ Example:
+ > Transformer encoder-decoder, two stacks side by side. Encoder (left,
+ > light blue): "Input embedding" → "Positional encoding" → 6× {Multi-head
+ > self-attention, Add & Norm, Feed-forward, Add & Norm}. Decoder (right,
+ > light red): "Output embedding" → "Positional encoding" → 6× {Masked
+ > self-attention, Add & Norm, Cross-attention (dashed edge from encoder
261
+ > top to decoder cross-attention), Add & Norm, Feed-forward, Add & Norm}.
262
+ > → Linear → Softmax. Label every block.
263
+
264
+ ### conceptual
265
+ Use for frameworks, theoretical models, idea maps, whiteboard-style
266
+ diagrams. Looser rules, more artistic licence.
267
+
268
+ Required in prompt:
269
+ - Core concepts grouped by category
270
+ - Relationships between categories
271
+ - Any visual metaphor you want (layers, concentric rings, Venn)
323
272
 
- ```bash
- # Basic usage (default threshold 7.5/10)
- python @skill/scripts/generate_schematic.py "diagram description" -o @ws/output.png
+ ---
 
- # Specify document type for appropriate quality threshold
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type journal # 8.5/10
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type conference # 8.0/10
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type poster # 7.0/10
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type presentation # 6.5/10
+ ## doc_type thresholds
 
- # Custom max iterations (1-2)
- python @skill/scripts/generate_schematic.py "complex diagram" -o @ws/diagram.png --iterations 2
+ | doc_type | Behaviour |
+ |---------------|-----------|
+ | journal | Strictest threshold; accept only near-camera-ready output. |
+ | conference | Strict; small cosmetic issues allowed. |
+ | thesis, grant | Same bar as conference. |
+ | preprint, report | Standard threshold. |
+ | poster | Tolerates larger labels and bolder colours. |
+ | presentation | Lowest threshold; optimised for first-pass speed. |
+ | default | Middle of the range. |
 
- # Verbose output (see all API calls and reviews)
- python @skill/scripts/generate_schematic.py "flowchart" -o @ws/flow.png -v
+ Exact numeric thresholds depend on the review provider and are not
+ directly comparable across providers (see the tool's review log for the
+ applied threshold on each run).
 
- # Combine options
- python @skill/scripts/generate_schematic.py "neural network" -o @ws/nn.png --doc-type journal --iterations 2 -v
- ```
+ ---
 
- ### Prompt Engineering Tips
+ ## Iteration strategy
+
+ Default is **2 iterations** for almost every case. Do not bump to 3
+ without concrete evidence — the third iteration frequently *regresses*
+ a fix from the second, and the tool now detects this and stops early
+ anyway (see "Regression detection" below).
+
+ - `iterations: 1` — only when you want the first-pass output without
+ any review-driven rework.
+ - `iterations: 2` (default) — one draft plus one review-driven revision.
+ The second pass uses image-to-image editing when the reviewer judges
+ issues are cosmetic (needs_edit). When the reviewer flags structural
+ problems (needs_regen) the second pass redraws from scratch.
+ - `iterations: 3` — only when iteration 2 is expected to still be below
+ threshold. Good signals: `doc_type: journal | conference` AND the
+ prompt asks for something dense (many labels, multi-panel, intricate
+ relationships). Bad signals: `doc_type: presentation | poster`,
+ single-concept diagrams, simple flowcharts.
+
+ The tool stops early the moment a review comes back with verdict
+ `acceptable`, so higher `iterations` costs nothing when not needed —
+ but each *unnecessary* iteration risks the model undoing earlier
+ corrections.
+
+ ### Relaxed acceptance
+
+ When the reviewer returns `needs_edit` but the score is comfortably
+ above threshold (**≥ threshold + 1.0**) and no critical issues remain
+ (`wrong_content`, `missing_element`, `illegible_text`), the tool
+ promotes the verdict to `acceptable` and stops. Cosmetic notes
+ (`layout_collision`, `style_mismatch`) do not force another round
+ when the diagram is already well past the bar.
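
This promotion rule is mechanical enough to sketch. A minimal illustration in Python (the function and data shapes are assumptions for clarity, not the tool's actual code; only the issue names and the +1.0 margin come from the description above):

```python
# Sketch of the relaxed-acceptance rule (assumed names, not the tool's
# real implementation).
CRITICAL_ISSUES = {"wrong_content", "missing_element", "illegible_text"}

def promote_verdict(verdict, score, threshold, issues):
    """Promote needs_edit to acceptable when the draft is comfortably
    past the bar and only cosmetic notes remain."""
    comfortably_above = score >= threshold + 1.0
    has_critical = any(i in CRITICAL_ISSUES for i in issues)
    if verdict == "needs_edit" and comfortably_above and not has_critical:
        return "acceptable"
    return verdict
```

So a 9.1 draft against an 8.0 threshold with only a `layout_collision` note stops the loop, while the same score with a `wrong_content` issue does not.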
+
+ ### Regression detection
+
+ The tool tracks which blocking issues the reviewer stops complaining
+ about between iterations — those are the corrections we want to keep.
+ Each subsequent edit prompt explicitly reminds the model of the
+ already-fixed items and forbids regressing them. If a later draft
+ nevertheless re-introduces an issue from that fixed set,
+ the loop terminates early with `stoppedReason: "regression_detected"`
+ and the current best draft is returned. Two common causes of
+ regression:
+
+ - Too many iterations on a diagram that is already good enough (ships
+ a new cosmetic flaw while fixing an old cosmetic flaw).
+ - Prompt overspecification — the original request listed dozens of
+ tiny details, and the model keeps dropping a different one each
+ round.
+
+ When you see `regressedIssues` or `stoppedReason: "regression_detected"`
+ in the review log, prefer the current output over re-running with
+ more iterations; re-running usually makes it worse.
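
The bookkeeping behind this can be pictured with a small sketch (function name and data shapes are hypothetical; only the disappear-then-reappear logic mirrors the description above):

```python
# Minimal sketch of regression detection (not the tool's real code):
# issues the reviewer stopped flagging are treated as fixed; if a later
# review re-flags one of them, the loop would stop.
def check_regression(previous_issues, current_issues, fixed_so_far):
    """Return (updated fixed set, regressed issues)."""
    newly_fixed = previous_issues - current_issues  # reviewer stopped complaining
    fixed = fixed_so_far | newly_fixed
    regressed = fixed & current_issues              # a fixed issue came back
    return fixed, regressed

fixed, regressed = check_regression({"overlap", "typo"}, {"typo"}, set())
# "overlap" joins the fixed set; no regression yet
fixed, regressed = check_regression({"typo"}, {"overlap"}, fixed)
# "overlap" re-appeared: a regression, the loop would terminate
```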
 
- **1. Be Specific About Layout:**
- ```
- ✓ "Flowchart with vertical flow, top to bottom"
- ✓ "Architecture diagram with encoder on left, decoder on right"
- ✓ "Circular pathway diagram with clockwise flow"
- ```
-
- **2. Include Quantitative Details:**
- ```
- ✓ "Neural network with input layer (784 nodes), hidden layer (128 nodes), output (10 nodes)"
- ✓ "Flowchart showing n=500 screened, n=150 excluded, n=350 randomized"
- ✓ "Circuit with 1kΩ resistor, 10µF capacitor, 5V source"
- ```
+ ---
 
- **3. Specify Visual Style:**
- ```
- ✓ "Minimalist block diagram with clean lines"
- ✓ "Detailed biological pathway with protein structures"
- ✓ "Technical schematic with engineering notation"
- ```
+ ## Embedding the result in documents
 
- **4. Request Specific Labels:**
- ```
- ✓ "Label all arrows with activation/inhibition"
- ✓ "Include layer dimensions in each box"
- ✓ "Show time progression with timestamps"
- ```
+ The tool saves the final image to the exact `output` path you specified
+ (plus versioned `_v1.png`, `_v2.png`, … siblings for debugging, and a
+ `_review_log.json` next to them).
 
- **5. Mention Color Requirements:**
- ```
- ✓ "Use colorblind-friendly colors"
- ✓ "Grayscale-compatible design"
- ✓ "Color-code by function: blue for input, green for processing, red for output"
- ```
+ Embed with markdown image syntax. **The path is relative to the markdown
+ file you are writing, not the workspace root.**
 
- ## AI Generation Examples
-
- ### Example 1: CONSORT Flowchart
- ```bash
- python @skill/scripts/generate_schematic.py \
- "CONSORT participant flow diagram for randomized controlled trial. \
- Start with 'Assessed for eligibility (n=500)' at top. \
- Show 'Excluded (n=150)' with reasons: age<18 (n=80), declined (n=50), other (n=20). \
- Then 'Randomized (n=350)' splits into two arms: \
- 'Treatment group (n=175)' and 'Control group (n=175)'. \
- Each arm shows 'Lost to follow-up' (n=15 and n=10). \
- End with 'Analyzed' (n=160 and n=165). \
- Use blue boxes for process steps, orange for exclusion, green for final analysis." \
- -o @ws/figures/consort.png
+ Markdown at workspace root (`paper.md`) and tool output at `figures/consort.png`:
+ ```markdown
+ ![Figure 2: CONSORT participant flow](figures/consort.png)
  ```
 
- ### Example 2: Neural Network Architecture
- ```bash
- python @skill/scripts/generate_schematic.py \
- "Transformer encoder-decoder architecture diagram. \
- Left side: Encoder stack with input embedding, positional encoding, \
- multi-head self-attention, add & norm, feed-forward, add & norm. \
- Right side: Decoder stack with output embedding, positional encoding, \
- masked self-attention, add & norm, cross-attention (receiving from encoder), \
- add & norm, feed-forward, add & norm, linear & softmax. \
- Show cross-attention connection from encoder to decoder with dashed line. \
- Use light blue for encoder, light red for decoder. \
- Label all components clearly." \
- -o @ws/figures/transformer.png --iterations 2
+ Markdown in a subdirectory (`workspace/paper_draft.md`):
+ ```markdown
+ ![Figure 2: CONSORT participant flow](../figures/consort.png)
  ```
 
- ### Example 3: Biological Pathway
- ```bash
- python @skill/scripts/generate_schematic.py \
- "MAPK signaling pathway diagram. \
- Start with EGFR receptor at cell membrane (top). \
- Arrow down to RAS (with GTP label). \
- Arrow to RAF kinase. \
- Arrow to MEK kinase. \
- Arrow to ERK kinase. \
- Final arrow to nucleus showing gene transcription. \
- Label each arrow with 'phosphorylation' or 'activation'. \
- Use rounded rectangles for proteins, different colors for each. \
- Include membrane boundary line at top." \
- -o @ws/figures/mapk_pathway.png
- ```
+ Rule of thumb: prepend one `../` for each directory level the markdown
+ file sits below the workspace root.
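
When in doubt, compute the relative path rather than counting directories. A small sketch using Python's standard library (the file paths are hypothetical):

```python
import os.path

# Both paths expressed relative to the workspace root (example paths).
figure = "figures/consort.png"
markdown_file = "workspace/paper_draft.md"

# relpath gives the path from the markdown file's directory to the figure.
embed = os.path.relpath(figure, start=os.path.dirname(markdown_file))
print(embed)  # ../figures/consort.png
```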
 
- ### Example 4: System Architecture
- ```bash
- python @skill/scripts/generate_schematic.py \
- "IoT system architecture block diagram. \
- Bottom layer: Sensors (temperature, humidity, motion) in green boxes. \
- Middle layer: Microcontroller (ESP32) in blue box. \
- Connections to WiFi module (orange box) and Display (purple box). \
- Top layer: Cloud server (gray box) connected to mobile app (light blue box). \
- Show data flow arrows between all components. \
- Label connections with protocols: I2C, UART, WiFi, HTTPS." \
- -o @ws/figures/iot_architecture.png
- ```
+ Always include the `![...](...)` embed — do not only reference the
+ figure in prose ("see Figure 1"), or the rendered document will not
+ show the image.
 
  ---
 
- ## Command-Line Usage
-
- The main entry point for generating scientific schematics:
-
- ```bash
- # Basic usage
- python @skill/scripts/generate_schematic.py "diagram description" -o @ws/output.png
-
- # Custom iterations (max 2)
- python @skill/scripts/generate_schematic.py "complex diagram" -o @ws/diagram.png --iterations 2
-
- # Verbose mode
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png -v
- ```
-
- **Note:** The AI generation system includes automatic quality review in its iterative refinement process. Each iteration is evaluated for scientific accuracy, clarity, and accessibility.
-
- ## Best Practices Summary
-
- ### Design Principles
-
- 1. **Clarity over complexity** - Simplify, remove unnecessary elements
- 2. **Consistent styling** - Use templates and style files
- 3. **Colorblind accessibility** - Use Okabe-Ito palette, redundant encoding
- 4. **Appropriate typography** - Sans-serif fonts, minimum 7-8 pt
- 5. **Vector format** - Always use PDF/SVG for publication
-
- ### Technical Requirements
-
- 1. **Resolution** - Vector preferred, or 300+ DPI for raster
- 2. **File format** - PDF for LaTeX, SVG for web, PNG as fallback
- 3. **Color space** - RGB for digital, CMYK for print (convert if needed)
- 4. **Line weights** - Minimum 0.5 pt, typical 1-2 pt
- 5. **Text size** - 7-8 pt minimum at final size
-
- ### Integration Guidelines
-
- 1. **Include in markdown** - Use `![Figure N: caption](path/to/figures/filename.png)` for every generated image (path must be relative to the markdown file, not the workspace root)
- 2. **Include in LaTeX** - Use `\includegraphics{}` for generated images in LaTeX documents
- 3. **Caption thoroughly** - Describe all elements and abbreviations
- 4. **Reference in text** - Explain diagram in narrative flow
- 5. **Maintain consistency** - Same style across all figures in paper
- 6. **Version control** - Keep prompts and generated images in repository
+ ## What the tool does NOT do
 
- ## Troubleshooting Common Issues
+ - It does not embed figures into your document — you must add the
+ markdown image syntax yourself.
+ - It does not cite figures in text — write the narrative reference too.
+ - It does not deduplicate across iterations; the `_v1.png` and `_v2.png`
+ intermediates are left on disk for inspection.
 
- ### AI Generation Issues
-
- **Problem**: Overlapping text or elements
- - **Solution**: AI generation automatically handles spacing
- - **Solution**: Increase iterations: `--iterations 2` for better refinement
-
- **Problem**: Elements not connecting properly
- - **Solution**: Make your prompt more specific about connections and layout
- - **Solution**: Increase iterations for better refinement
-
- ### Image Quality Issues
-
- **Problem**: Export quality poor
- - **Solution**: AI generation produces high-quality images automatically
- - **Solution**: Increase iterations for better results: `--iterations 2`
-
- **Problem**: Elements overlap after generation
- - **Solution**: AI generation automatically handles spacing
- - **Solution**: Increase iterations: `--iterations 2` for better refinement
- - **Solution**: Make your prompt more specific about layout and spacing requirements
-
- ### API Issues
-
- **Problem**: Authentication error
- - **Solution**: Verify `OPENROUTER_API_KEY` is set correctly
- - **Solution**: Check your OpenRouter account has sufficient credits
-
- **Problem**: Model not available
- - **Solution**: Check OpenRouter model availability at openrouter.ai/models
- - **Solution**: Try alternative models via `--image-model` or `--review-model`
-
- ## Resources and References
-
- ### Detailed References
-
- Load these files for comprehensive information on specific topics:
-
- - **`@skill/references/best_practices.md`** - Publication standards and accessibility guidelines
- - **`@skill/references/README.md`** - Extended usage guide and troubleshooting
- - **`@skill/references/QUICK_REFERENCE.md`** - Condensed command cheat sheet
-
- ### External Resources
-
- **Python Libraries**
- - Schemdraw Documentation: https://schemdraw.readthedocs.io/
- - NetworkX Documentation: https://networkx.org/documentation/
- - Matplotlib Documentation: https://matplotlib.org/
-
- **Publication Standards**
- - Nature Figure Guidelines: https://www.nature.com/nature/for-authors/final-submission
- - Science Figure Guidelines: https://www.science.org/content/page/instructions-preparing-initial-manuscript
- - CONSORT Diagram: http://www.consort-statement.org/consort-statement/flow-diagram
-
- ## Integration with Other Skills
-
- This skill works synergistically with:
-
- - **Scientific Writing** - Diagrams follow figure best practices
- - **Scientific Visualization** - Shares color palettes and styling
- - **Research Grants** - Methodology diagrams for proposals
- - **Scholar Evaluation** - Evaluate clarity, completeness, and communication quality
-
- ## Quick Reference Checklist
-
- Before submitting diagrams, verify:
-
- ### Visual Quality
- - [ ] High-quality image format (PNG from AI generation)
- - [ ] No overlapping elements (AI handles automatically)
- - [ ] Adequate spacing between all components (AI optimizes)
- - [ ] Clean, professional alignment
- - [ ] All arrows connect properly to intended targets
-
- ### Accessibility
- - [ ] Colorblind-safe palette (Okabe-Ito) used
- - [ ] Works in grayscale (tested with accessibility checker)
- - [ ] Sufficient contrast between elements (verified)
- - [ ] Redundant encoding where appropriate (shapes + colors)
- - [ ] Colorblind simulation passes all checks
-
- ### Typography and Readability
- - [ ] Text minimum 7-8 pt at final size
- - [ ] All elements labeled clearly and completely
- - [ ] Consistent font family and sizing
- - [ ] No text overlaps or cutoffs
- - [ ] Units included where applicable
-
- ### Publication Standards
- - [ ] Consistent styling with other figures in manuscript
- - [ ] Comprehensive caption written with all abbreviations defined
- - [ ] Referenced appropriately in manuscript text
- - [ ] Meets journal-specific dimension requirements
- - [ ] Exported in required format for journal (PDF/EPS/TIFF)
-
- ## Environment Setup
-
- ```bash
- # Required
- export OPENROUTER_API_KEY='your-openrouter-api-key'
-
- # Optional model overrides
- export SCHEMATIC_IMAGE_MODEL='google/gemini-3-pro-image-preview'
- export SCHEMATIC_REVIEW_MODEL='google/gemini-3-pro-preview'
- ```
+ ---
 
- ## Getting Started
+ ## House style — what gives every figure its shared identity
+
+ Every diagram produced by this tool is drawn against a fixed **house
+ visual style** — a single profile bundled with the system that pins:
+
+ - **Theme** — editorial institutional, off-white background, graphite
+ text/strokes, restrained contrast, disciplined whitespace. Not
+ "AI-aesthetic", not startup-whitepaper.
+ - **Typography** — Inter / SF Pro Text / Helvetica Neue / Arial stack,
+ with fixed size tokens for section labels, node labels, edge
+ annotations, and foot labels.
+ - **Palette roles** — text/stroke, primaryStructure, secondaryStructure,
+ contextFill, resultAccent, warningAccent, grid — each mapped to a
+ specific hex. Not "Okabe-Ito preferred" — a concrete editorial blue/red
+ identity with accessibility as fallback, not default.
+ - **Geometry tokens** — stroke widths (1.5px / 1px), corner radius 8px,
+ filled-triangle arrowheads, dashed-border group containers, 32px min
+ gutter, 3-step box height scale.
+ - **Motifs** — labels outside boxes, arrow stroke matches source colour,
+ at most two accent colours per figure.
+
+ The reviewer is explicitly told to grade "house-style adherence" as the
+ fifth rubric dimension, so iterations that drift from the profile get
+ blocked from acceptance. You'll see `houseProfile: editorial-institutional-v1`
+ in the review log to confirm which profile was applied.
+
+ ### Overriding the house style via prompt
+
+ There is **no** user-facing theme selector — consistency across figures
+ is the whole point. If a user needs to deviate for a specific figure
+ (e.g., matching a publication's brand colours), express the overrides
+ in the `prompt` itself using quoted tokens or hex codes:
+
+ ```
+ prompt: >
+ Workflow diagram, three stages.
+ Use "#E30613" for the final "Result" node (publication brand red);
+ rest of the figure follows the default house palette.
+ ```
+
+ The tool's literal extractor picks up hex codes (`#RRGGBB`,
+ `#RRGGBBAA`), rgb()/rgba(), hsl()/hsla() tokens from plain prose and
+ pins them in MUST KEEP, so the override survives rendering.
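
For a feel of what the extractor latches onto, here is a rough approximation (the regex is illustrative only; the tool's actual extraction grammar may differ):

```python
import re

# Illustrative approximation of colour-literal extraction from prose.
# Order matters: try 8-digit hex before 6-digit so #RRGGBBAA matches fully.
COLOR_LITERAL = re.compile(
    r"#[0-9A-Fa-f]{8}\b|#[0-9A-Fa-f]{6}\b"  # #RRGGBBAA / #RRGGBB
    r"|rgba?\([^)]*\)"                       # rgb(...) / rgba(...)
    r"|hsla?\([^)]*\)"                       # hsl(...) / hsla(...)
)

prompt = 'Use "#E30613" for the final "Result" node; rest follows the house palette.'
print(COLOR_LITERAL.findall(prompt))  # ['#E30613']
```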
+
+ ## Configuration
+
+ Raster image generation (gpt-image-2) requires `OPENAI_API_KEY`
+ (set under Settings → API Keys). Review uses either OpenAI or
+ Anthropic based on Settings → Diagrams. The `auto` setting prefers
+ Claude when both keys are available, so the generator does not grade
+ its own family.
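
The `auto` preference can be pictured as a tiny selector (function name and signature are assumptions for illustration, not the app's real code):

```python
# Sketch of the auto review-provider choice: prefer the family that did
# not generate the image, so gpt-image-2 output is not self-graded.
def pick_reviewer(setting, has_openai, has_anthropic):
    if setting in ("openai", "anthropic"):
        return setting
    # auto: prefer Claude when its key is present
    if has_anthropic:
        return "anthropic"
    if has_openai:
        return "openai"
    return None
```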
438
+
439
+ ### Output format selection (png vs svg)
440
+
441
+ Format intent flows through two redundant channels — either is enough,
442
+ and when they disagree the explicit `format` parameter wins:
443
+
444
+ 1. **Explicit `format` parameter** — use this when the user said
445
+ something format-specific in the prompt:
446
+ - "SVG / 矢量图 / vector / 向量图" → `format: "svg"`
447
+ - "PNG / 图片 / raster / bitmap" → `format: "png"`
448
+ - Otherwise omit (defaults to `auto`).
449
+ 2. **Output filename extension** — `output: figures/foo.svg` implies
450
+ `format: "svg"`, `output: figures/foo.png` implies `format: "png"`.
451
+ The tool will rewrite the extension to match `format` if they
452
+ disagree (and report `extensionChanged` in the result so you know
453
+ to update any Markdown embed).
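
The precedence can be sketched as follows (a hypothetical helper: the parameter names mirror the tool's `format` and `output`, everything else is illustrative):

```python
import os.path

# Sketch of format resolution: explicit format wins; otherwise the
# output extension decides; otherwise fall back to auto. When the
# extension disagrees with the resolved format, rewrite it (the real
# tool additionally reports extensionChanged).
def resolve_format(output, format=None):
    """Return (resolved format, possibly-rewritten output path)."""
    ext = os.path.splitext(output)[1].lower()
    if format in ("svg", "png"):
        resolved = format
    elif ext in (".svg", ".png"):
        resolved = ext[1:]
    else:
        resolved = "auto"
    if resolved in ("svg", "png") and ext != "." + resolved:
        output = os.path.splitext(output)[0] + "." + resolved
    return resolved, output
```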
+
+ Behavioural summary:
+
+ - `format: "svg"` (or `.svg` extension) → **always** synthesises SVG
+ via the chat model, even when `OPENAI_API_KEY` is present. Choose
+ this for vector output (scales infinitely, small filesize, editable).
+ - `format: "png"` (or `.png` extension, the default) → raster via
+ `gpt-image-2` when `OPENAI_API_KEY` is configured. Falls back to
+ SVG-via-chat-model automatically when the key is missing; the file
+ is renamed `.svg` and `extensionChanged` is reported.
+
+ ### SVG fallback
+
+ The same verdict-driven iteration loop runs for both formats. In SVG
+ mode, generation goes through the chat model, and review picks the
+ strongest option available at runtime:
+
+ 1. **Rasterise-then-vision (preferred)** — when an offscreen renderer
+ is available (Electron main process) and a real vision reviewer
+ (OpenAI or Anthropic) is configured, the generated SVG is
+ rendered to PNG and passed to the vision model. This catches text
+ overflow, element overlap, and other problems invisible at the
+ SVG-source level.
+ 2. **Source-level fallback** — when rasterisation is unavailable (e.g.
+ running outside Electron) or no vision reviewer auth is present,
+ the reviewer reads the SVG markup as text. Structural checks work;
+ visual-layout checks do not.
+
+ Other SVG-mode notes:
+
+ - `mode: "svg_fallback"` appears in the tool result and review log.
+ The provider id distinguishes the two review paths:
+ `rasterize+openai:…` or `rasterize+anthropic:…` vs
+ `svg-fallback:…`.
+ - Quality of the SVG itself depends on the chat model's spatial
+ reasoning. Claude Opus and GPT-4o / GPT-5 produce usable output for
+ flowcharts, simple architecture, and box-and-arrow schemas. Pathway
+ illustrations and complex circuits will be noticeably weaker than
+ gpt-image-2.
+ - Self-grading bias only applies in the source-level path (same model
+ reads back its own SVG). The rasterise-then-vision path sends the
+ rendered image to a different model, so bias is negligible.
+ - Scores across raster and SVG modes are still not directly
+ comparable — thresholds are calibrated per reviewer and per path.
+
+ To embed an SVG in Markdown:
 
- **Simplest possible usage:**
- ```bash
- python @skill/scripts/generate_schematic.py "your diagram description" -o @ws/output.png
+ ```markdown
+ ![Figure 1: workflow](figures/workflow.svg)
  ```
 
- ---
-
- Use this skill to create clear, accessible, publication-quality diagrams that effectively communicate complex scientific concepts. The AI-powered workflow with iterative refinement ensures diagrams meet professional standards.
+ Most Markdown renderers (GitHub, typical preview extensions) and
+ Milkdown render SVG inline just like PNG.