research-copilot 0.2.20 → 0.2.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (81)
  1. package/README.md +1 -1
  2. package/app/out/main/index.mjs +2585 -48
  3. package/app/out/renderer/assets/{MilkdownMarkdownEditor-CCdZ2mtg.css → MilkdownMarkdownEditor-BW0Pt28W.css} +16 -1
  4. package/app/out/renderer/assets/{MilkdownMarkdownEditor-Bj7JSjF5.js → MilkdownMarkdownEditor-OhCrq3X0.js} +56 -51
  5. package/app/out/renderer/assets/{arc-CPL9nDFE.js → arc-DLr0RP8F.js} +1 -1
  6. package/app/out/renderer/assets/{blockDiagram-c4efeb88-BFOajDNs.js → blockDiagram-c4efeb88-XhKChw2n.js} +8 -8
  7. package/app/out/renderer/assets/{c4Diagram-c83219d4-LeqnQ2-5.js → c4Diagram-c83219d4-DDoJmoIQ.js} +3 -3
  8. package/app/out/renderer/assets/{channel-jk5Np8ud.js → channel-CJCgJSqV.js} +1 -1
  9. package/app/out/renderer/assets/{classDiagram-beda092f-CxOqB6OU.js → classDiagram-beda092f-CAmimZpz.js} +6 -6
  10. package/app/out/renderer/assets/{classDiagram-v2-2358418a-CyP_5qLa.js → classDiagram-v2-2358418a-Bma4E_Eg.js} +10 -10
  11. package/app/out/renderer/assets/{clone-PHFwh58n.js → clone-C338dmoI.js} +1 -1
  12. package/app/out/renderer/assets/{createText-1719965b-CE_0jsfj.js → createText-1719965b-_up4NJqB.js} +2 -2
  13. package/app/out/renderer/assets/{edges-96097737-DBk1JhZS.js → edges-96097737-Bpp6hVLn.js} +3 -3
  14. package/app/out/renderer/assets/{erDiagram-0228fc6a-DnR_LkSB.js → erDiagram-0228fc6a-bjTh_7ap.js} +5 -5
  15. package/app/out/renderer/assets/{flowDb-c6c81e3f-CJrZUKlS.js → flowDb-c6c81e3f-BjVV4DVk.js} +1 -1
  16. package/app/out/renderer/assets/{flowDiagram-50d868cf-CfNfrt17.js → flowDiagram-50d868cf-gmeaaZ6z.js} +12 -12
  17. package/app/out/renderer/assets/{flowDiagram-v2-4f6560a1-BGQtiK3j.js → flowDiagram-v2-4f6560a1-nem5zs2M.js} +12 -12
  18. package/app/out/renderer/assets/{flowchart-elk-definition-6af322e1-BXLraghz.js → flowchart-elk-definition-6af322e1-DPaGAYRw.js} +6 -6
  19. package/app/out/renderer/assets/{ganttDiagram-a2739b55-CAwaEMMm.js → ganttDiagram-a2739b55-CnAti19E.js} +3 -3
  20. package/app/out/renderer/assets/{gitGraphDiagram-82fe8481-vuSEC6ny.js → gitGraphDiagram-82fe8481-DQWHD3SJ.js} +2 -2
  21. package/app/out/renderer/assets/{graph-CZfltE7S.js → graph-DKiKgH8m.js} +1 -1
  22. package/app/out/renderer/assets/{index-DIZJXKQ6.js → index-4s-c5d65.js} +3 -3
  23. package/app/out/renderer/assets/{index-5325376f-DWTrHDEo.js → index-5325376f-G-0aO-2i.js} +6 -6
  24. package/app/out/renderer/assets/{index-CwPfquqm.js → index-9q_P5ULR.js} +4 -4
  25. package/app/out/renderer/assets/{index-EaGZvaBp.js → index-B1A3JxQj.js} +3 -3
  26. package/app/out/renderer/assets/{index-C2tqvXjC.js → index-BBUrmGmY.js} +6 -6
  27. package/app/out/renderer/assets/{index-D_7yOLk3.js → index-BQho5LH-.js} +6 -6
  28. package/app/out/renderer/assets/{index-B6f2bVW_.js → index-BUVlmsgO.js} +3 -3
  29. package/app/out/renderer/assets/{index-DpXI4mHb.js → index-BzEthrJ4.js} +3 -3
  30. package/app/out/renderer/assets/{index-CUsEKU8Q.js → index-C1YzkB4z.js} +93 -36
  31. package/app/out/renderer/assets/{index-CMfKxpBP.js → index-CGo665vD.js} +3 -3
  32. package/app/out/renderer/assets/{index-B5Mkpo9f.js → index-CPZaxR35.js} +3 -3
  33. package/app/out/renderer/assets/{index-BpdWQuss.js → index-CSyD1mbL.js} +3 -3
  34. package/app/out/renderer/assets/{index-DB8ImtMy.js → index-Cf7vlFSn.js} +3 -3
  35. package/app/out/renderer/assets/{index-CyDfvefg.js → index-CluH1o2q.js} +6 -6
  36. package/app/out/renderer/assets/{index-7dcVwInU.js → index-Cw1n3klA.js} +5 -5
  37. package/app/out/renderer/assets/{index-Ul-Kq9b2.js → index-DFzvntIw.js} +3 -3
  38. package/app/out/renderer/assets/{index-t0-md-MG.js → index-DHzyAhWM.js} +4 -4
  39. package/app/out/renderer/assets/{index-Cc9coKGN.js → index-DhliHfCM.js} +6 -6
  40. package/app/out/renderer/assets/{index-K0o5fHYG.js → index-DkVFbCxC.js} +3 -3
  41. package/app/out/renderer/assets/{index-DiCqe1UR.js → index-DpZJP5MT.js} +6 -6
  42. package/app/out/renderer/assets/{index-CaYWMBXT.js → index-Gfd_DiMG.js} +3 -3
  43. package/app/out/renderer/assets/{index-Di3HmXc-.js → index-jOvNAYyP.js} +3 -3
  44. package/app/out/renderer/assets/{index-B4V7cFWJ.js → index-rrJkk8KV.js} +6 -6
  45. package/app/out/renderer/assets/{index-BgAs-p8D.js → index-vfSerSmF.js} +1 -1
  46. package/app/out/renderer/assets/{infoDiagram-8eee0895-BmPESCfj.js → infoDiagram-8eee0895-BCnBkXXS.js} +2 -2
  47. package/app/out/renderer/assets/{journeyDiagram-c64418c1-BGsCbfr_.js → journeyDiagram-c64418c1-Bq2wSX3k.js} +4 -4
  48. package/app/out/renderer/assets/{layout-5MwFTPs7.js → layout-BvkumzoT.js} +2 -2
  49. package/app/out/renderer/assets/{line-D0U74KO0.js → line-eU4el-G4.js} +1 -1
  50. package/app/out/renderer/assets/{linear-BclyBoiT.js → linear-DlBjMBEa.js} +1 -1
  51. package/app/out/renderer/assets/{mindmap-definition-8da855dc-un1bPKBj.js → mindmap-definition-8da855dc-CzLBu7ao.js} +3 -3
  52. package/app/out/renderer/assets/{pieDiagram-a8764435-B7KM3duv.js → pieDiagram-a8764435--olrXFr_.js} +3 -3
  53. package/app/out/renderer/assets/{quadrantDiagram-1e28029f-C8i5m3Os.js → quadrantDiagram-1e28029f-BnpnBBgc.js} +3 -3
  54. package/app/out/renderer/assets/{requirementDiagram-08caed73-FjqENNN5.js → requirementDiagram-08caed73-6O9WS7hn.js} +5 -5
  55. package/app/out/renderer/assets/{sankeyDiagram-a04cb91d-BKV22yuJ.js → sankeyDiagram-a04cb91d-D-iJnK91.js} +2 -2
  56. package/app/out/renderer/assets/{sequenceDiagram-c5b8d532-DWO-Z2i3.js → sequenceDiagram-c5b8d532-DBlK15cV.js} +3 -3
  57. package/app/out/renderer/assets/{stateDiagram-1ecb1508-BqohgALA.js → stateDiagram-1ecb1508-DKXKPYuk.js} +6 -6
  58. package/app/out/renderer/assets/{stateDiagram-v2-c2b004d7-B3sEkrB8.js → stateDiagram-v2-c2b004d7-DY288Eo5.js} +10 -10
  59. package/app/out/renderer/assets/{styles-b4e223ce-BGytHk8n.js → styles-b4e223ce-CRJ_xgJ-.js} +1 -1
  60. package/app/out/renderer/assets/{styles-ca3715f6-B0PvBknL.js → styles-ca3715f6-Bp_k5KLD.js} +1 -1
  61. package/app/out/renderer/assets/{styles-d45a18b0-C6F384ai.js → styles-d45a18b0-DLA8Gg6D.js} +4 -4
  62. package/app/out/renderer/assets/{svgDrawCommon-b86b1483-BXgThwM_.js → svgDrawCommon-b86b1483-Dm5CK2gQ.js} +1 -1
  63. package/app/out/renderer/assets/{timeline-definition-faaaa080-iNn5igPR.js → timeline-definition-faaaa080-D-m9BHUg.js} +3 -3
  64. package/app/out/renderer/assets/{xychartDiagram-f5964ef8-oF_gxlk1.js → xychartDiagram-f5964ef8-Drn4Rqev.js} +5 -5
  65. package/app/out/renderer/index.html +1 -1
  66. package/lib/skills/builtin/academic-marp-slides/SKILL.md +933 -0
  67. package/lib/skills/builtin/research-grants/SKILL.md +15 -11
  68. package/lib/skills/builtin/scholar-evaluation/SKILL.md +12 -11
  69. package/lib/skills/builtin/scientific-schematics/SKILL.md +463 -560
  70. package/lib/skills/builtin/teaching-marp-slides/SKILL.md +1218 -0
  71. package/package.json +1 -1
  72. package/scripts/audit-diagram-prompts.mjs +67 -0
  73. package/scripts/test-skill-routing.mjs +238 -0
  74. package/lib/skills/builtin/marp-slides/SKILL.md +0 -642
  75. package/lib/skills/builtin/scientific-schematics/references/QUICK_REFERENCE.md +0 -182
  76. package/lib/skills/builtin/scientific-schematics/references/README.md +0 -292
  77. package/lib/skills/builtin/scientific-schematics/scripts/__pycache__/generate_schematic.cpython-312.pyc +0 -0
  78. package/lib/skills/builtin/scientific-schematics/scripts/__pycache__/generate_schematic_ai.cpython-312.pyc +0 -0
  79. package/lib/skills/builtin/scientific-schematics/scripts/example_usage.sh +0 -85
  80. package/lib/skills/builtin/scientific-schematics/scripts/generate_schematic.py +0 -141
  81. package/lib/skills/builtin/scientific-schematics/scripts/generate_schematic_ai.py +0 -910
@@ -1,603 +1,506 @@
  ---
  name: scientific-schematics
- description: Create publication-quality scientific diagrams using OpenRouter API with smart iterative refinement. The skill defaults to Gemini 3 Pro Image for generation plus Gemini 3 Pro for review, and only regenerates if quality is below threshold for your document type.
+ description: Prompt guidance for the generate_diagram tool how to describe different scientific diagram types (flowcharts, pathways, architecture, circuits, networks, conceptual) so the tool produces publication-grade output on the first iteration.
  category: Visualization
  tags: [science, visualization, diagrams]
- triggers: [diagram, schematic, flowchart, architecture diagram, system diagram, 示意图, 流程图, generate diagram]
- allowed-tools: [Read, Write, Edit, Bash]
+ triggers: [diagram, schematic, flowchart, architecture diagram, pathway, circuit, 示意图, 流程图]
+ allowed-tools: [Read, Write, Edit]
  license: MIT license
  metadata:
- skill-author: K-Dense Inc.
+ skill-author: Dong Dai
  ---

- # Scientific Schematics and Diagrams
+ # Scientific Schematics

- ## Overview
+ ## Scope of this skill

- Scientific schematics and diagrams transform complex concepts into clear visual representations for publication. We want the diagram a textbook style. **This skill uses OpenRouter API for image generation with Gemini quality review.**
+ This skill **does not generate images**. Image generation is owned by the
+ `generate_diagram` tool, which runs a provider-backed generate → review →
+ (optional) edit loop. The tool is always available; this skill is only
+ loaded to sharpen the prompt the tool receives.

- **How it works:**
- - Describe your diagram in natural language
- - Gemini image model generates publication-quality diagrams via OpenRouter
- - **Gemini reviews quality** against document-type thresholds
- - **Smart iteration**: Only regenerates if quality is below threshold
- - Publication-ready output in minutes
- - No coding, templates, or manual drawing required
+ Use this skill to:
+ 1. Pick the right `diagram_type` parameter for the request.
+ 2. Write a description that names components, labels, quantities, and
+ flow direction unambiguously (LLM image models hallucinate when these
+ are vague).
+ 3. Choose an appropriate `doc_type` so the reviewer applies the right
+ quality threshold.
+ 4. Embed the resulting image in the document correctly.

- **Quality Thresholds by Document Type:**
- | Document Type | Threshold | Description |
- |---------------|-----------|-------------|
- | journal | 8.5/10 | Nature, Science, peer-reviewed journals |
- | conference | 8.0/10 | Conference papers |
- | thesis | 8.0/10 | Dissertations, theses |
- | grant | 8.0/10 | Grant proposals |
- | preprint | 7.5/10 | arXiv, bioRxiv, etc. |
- | report | 7.5/10 | Technical reports |
- | poster | 7.0/10 | Academic posters |
- | presentation | 6.5/10 | Slides, talks |
- | default | 7.5/10 | General purpose |
-
- **Simply describe what you want, and AI generates it.** All diagrams are stored in the @ws/figures/ subfolder and referenced in papers/posters.
-
- ### Embedding Figures in Documents (MANDATORY)
-
- **After generating every figure, you MUST embed it in the document using markdown image syntax.**
-
- The image path must be **relative to the markdown file you are writing**, not the workspace root. This ensures the renderer resolves the path correctly regardless of where the markdown file lives.
-
- **Example — markdown at workspace root (`paper.md`):**
- ```bash
- python @skill/scripts/generate_schematic.py "CONSORT flowchart..." -o @ws/figures/consort.png
- ```
- Embed in `paper.md`:
- ```markdown
- ![Figure 2: CONSORT participant flow diagram](figures/consort.png)
- ```
-
- **Example — markdown in a subdirectory (`workspace/paper_draft.md`):**
- ```bash
- python @skill/scripts/generate_schematic.py "CONSORT flowchart..." -o @ws/figures/consort.png
- ```
- Embed in `workspace/paper_draft.md` — use `../` to go up:
- ```markdown
- ![Figure 2: CONSORT participant flow diagram](../figures/consort.png)
- ```
-
- **Rule of thumb:** count how many directories deep your markdown file is from the workspace root, and prepend that many `../` segments to `figures/filename.png`.
-
- Alternatively, you can generate the image alongside the markdown file:
- ```bash
- # If markdown is at workspace/paper_draft.md, save figures next to it:
- python @skill/scripts/generate_schematic.py "CONSORT flowchart..." -o @ws/workspace/figures/consort.png
- ```
- Then embed simply as:
- ```markdown
- ![Figure 2: CONSORT participant flow diagram](figures/consort.png)
- ```
-
- Do NOT only cite figures as text ("see Figure 1"). Always include the `![...](...)` embed so the figure renders visually in the output.
-
- **Default model path:** this skill intentionally uses one default generation model, `google/gemini-3-pro-image-preview`, with Gemini review via `google/gemini-3-pro-preview`. Agents should not pick among multiple image backends during normal operation.
-
- ### Path Conventions
-
- - Shell command examples use scoped paths:
- - `@skill/...` for files shipped with this skill (for example scripts)
- - `@ws/...` for workspace files (inputs/outputs)
- - Python API and JSON log examples show raw filesystem paths (typically workspace-relative, such as `figures/...`).
-
- ## Quick Start: Generate Any Diagram
-
- Create any scientific diagram by simply describing it. AI handles everything automatically with **smart iteration**:
-
- ```bash
- # Generate for journal paper (highest quality threshold: 8.5/10)
- python @skill/scripts/generate_schematic.py "CONSORT participant flow diagram with 500 screened, 150 excluded, 350 randomized" -o @ws/figures/consort.png --doc-type journal
-
- # Generate for presentation (lower threshold: 6.5/10 - faster)
- python @skill/scripts/generate_schematic.py "Transformer encoder-decoder architecture showing multi-head attention" -o @ws/figures/transformer.png --doc-type presentation
-
- # Generate for poster (moderate threshold: 7.0/10)
- python @skill/scripts/generate_schematic.py "MAPK signaling pathway from EGFR to gene transcription" -o @ws/figures/mapk_pathway.png --doc-type poster
-
- # Custom max iterations (max 2)
- python @skill/scripts/generate_schematic.py "Complex circuit diagram with op-amp, resistors, and capacitors" -o @ws/figures/circuit.png --iterations 2 --doc-type journal
- ```
-
- **What happens behind the scenes:**
- 1. **Generation 1**: Gemini image model creates an initial image following scientific diagram best practices
- 2. **Review 1**: **Gemini** evaluates quality against the document-type threshold
- 3. **Decision**: If quality >= threshold → **DONE** (no more iterations needed!)
- 4. **If below threshold**: Improved prompt based on critique, regenerate
- 5. **Repeat**: Until quality meets threshold OR max iterations reached
-
- **Smart Iteration Benefits:**
- - Saves API calls if first generation is good enough
- - Higher quality standards for journal papers
- - Faster turnaround for presentations/posters
- - Appropriate quality for each use case
-
- **Output**: Versioned images plus a detailed review log with quality scores, critiques, and early-stop information.
-
- ### Configuration
-
- Set your OpenRouter API key:
- ```bash
- export OPENROUTER_API_KEY='your-openrouter-api-key'
- ```
-
- Optional model overrides:
- ```bash
- export SCHEMATIC_IMAGE_MODEL='google/gemini-3-pro-image-preview' # default
- export SCHEMATIC_REVIEW_MODEL='google/gemini-3-pro-preview' # default
- ```
-
- ### AI Generation Best Practices
-
- **Effective Prompts for Scientific Diagrams:**
-
- ✓ **Good prompts** (specific, detailed):
- - "CONSORT flowchart showing participant flow from screening (n=500) through randomization to final analysis"
- - "Transformer neural network architecture with encoder stack on left, decoder stack on right, showing multi-head attention and cross-attention connections"
- - "Biological signaling cascade: EGFR receptor → RAS → RAF → MEK → ERK → nucleus, with phosphorylation steps labeled"
- - "Block diagram of IoT system: sensors → microcontroller → WiFi module → cloud server → mobile app"
-
- ✗ **Avoid vague prompts**:
- - "Make a flowchart" (too generic)
- - "Neural network" (which type? what components?)
- - "Pathway diagram" (which pathway? what molecules?)
-
- **Key elements to include:**
- - **Type**: Flowchart, architecture diagram, pathway, circuit, etc.
- - **Components**: Specific elements to include
- - **Flow/Direction**: How elements connect (left-to-right, top-to-bottom)
- - **Labels**: Key annotations or text to include
- - **Style**: A textbook style plus Any specific visual requirements
-
- **Scientific Quality Guidelines** (automatically applied):
- - Clean white/light background
- - High contrast for readability
- - Clear, readable labels (minimum 10pt)
- - Professional typography (sans-serif fonts)
- - Colorblind-friendly colors (Okabe-Ito palette)
- - Proper spacing to prevent crowding
- - Scale bars, legends, axes where appropriate
-
- ## When to Use This Skill
-
- This skill should be used when:
- - Creating neural network architecture diagrams (Transformers, CNNs, RNNs, etc.)
- - Illustrating system architectures and data flow diagrams
- - Drawing methodology flowcharts for study design (CONSORT, PRISMA)
- - Visualizing algorithm workflows and processing pipelines
- - Creating circuit diagrams and electrical schematics
- - Depicting biological pathways and molecular interactions
- - Generating network topologies and hierarchical structures
- - Illustrating conceptual frameworks and theoretical models
- - Designing block diagrams for technical papers
-
- ## How to Use This Skill
-
- **Simply describe your diagram in natural language.** AI generates it automatically:
-
- ```bash
- python @skill/scripts/generate_schematic.py "your diagram description" -o @ws/output.png
- ```
-
- **That's it!** The AI handles:
- - Layout and composition
- - Labels and annotations
- - Colors and styling
- - Quality review and refinement
- - Publication-ready output
-
- **Works for all diagram types:**
- - Flowcharts (CONSORT, PRISMA, etc.)
- - Neural network architectures
- - Biological pathways
- - Circuit diagrams
- - System architectures
- - Block diagrams
- - Any scientific visualization
-
- **No coding, no templates, no manual drawing required.**
+ Do not write custom Python, shell, or SVG — the tool produces the image.

  ---

- # AI Generation Mode (OpenRouter + Gemini Review)
-
- ## Smart Iterative Refinement Workflow
-
- The AI generation system uses **smart iteration** - it only regenerates if quality is below the threshold for your document type:
-
- ### How Smart Iteration Works
-
- ```
- ┌─────────────────────────────────────────────────────┐
- │ 1. Generate image with Gemini via OpenRouter │
- │ ↓ │
- │ 2. Review quality with Gemini │
- │ ↓ │
- │ 3. Score >= threshold? │
- │ YES → DONE! (early stop) │
- │ NO → Improve prompt, go to step 1 │
- │ ↓ │
- │ 4. Repeat until quality met OR max iterations │
- └─────────────────────────────────────────────────────┘
- ```
-
- ### Iteration 1: Initial Generation
- **Prompt Construction:**
- ```
- Scientific diagram guidelines + User request
- ```
+ ## Tool usage

- **Output:** `diagram_v1.png`
-
- ### Quality Review by Gemini
-
- Gemini evaluates the diagram on:
- 1. **Scientific Accuracy** (0-2 points) - Correct concepts, notation, relationships
- 2. **Clarity and Readability** (0-2 points) - Easy to understand, clear hierarchy
- 3. **Label Quality** (0-2 points) - Complete, readable, consistent labels
- 4. **Layout and Composition** (0-2 points) - Logical flow, balanced, no overlaps
- 5. **Professional Appearance** (0-2 points) - Publication-ready quality
-
- **Example Review Output:**
- ```
- SCORE: 8.0
-
- STRENGTHS:
- - Clear flow from top to bottom
- - All phases properly labeled
- - Professional typography
-
- ISSUES:
- - Participant counts slightly small
- - Minor overlap on exclusion box
-
- VERDICT: ACCEPTABLE (for poster, threshold 7.0)
- ```
-
- ### Decision Point: Continue or Stop?
-
- | If Score... | Action |
- |-------------|--------|
- | >= threshold | **STOP** - Quality is good enough for this document type |
- | < threshold | Continue to next iteration with improved prompt |
-
- **Example:**
- - For a **poster** (threshold 7.0): Score of 7.5 → **DONE after 1 iteration!**
- - For a **journal** (threshold 8.5): Score of 7.5 → Continue improving
-
- ### Subsequent Iterations (Only If Needed)
-
- If quality is below threshold, the system:
- 1. Extracts specific issues from Gemini's review
- 2. Enhances the prompt with improvement instructions
- 3. Regenerates via OpenRouter
- 4. Reviews again with Gemini
- 5. Repeats until threshold met or max iterations reached
-
- ### Review Log
- All iterations are saved with a JSON review log that includes early-stop information:
- ```json
- {
- "user_prompt": "CONSORT participant flow diagram...",
- "doc_type": "poster",
- "quality_threshold": 7.0,
- "iterations": [
- {
- "iteration": 1,
- "image_path": "figures/consort_v1.png",
- "score": 7.5,
- "needs_improvement": false,
- "critique": "SCORE: 7.5\nSTRENGTHS:..."
- }
- ],
- "final_score": 7.5,
- "early_stop": true,
- "early_stop_reason": "Quality score 7.5 meets threshold 7.0 for poster"
- }
- ```
-
- **Note:** With smart iteration, you may see only 1 iteration instead of the full 2 if quality is achieved early!
-
- ## Advanced AI Generation Usage
-
- ### Supported Automation Interface
-
- Use the CLI wrapper as the stable interface:
-
- ```bash
- python @skill/scripts/generate_schematic.py \
- "Transformer architecture diagram" \
- -o @ws/figures/transformer.png \
- --iterations 2
- ```
+ Call the `generate_diagram` tool with:
+
+ ```
+ prompt: <description crafted using the guidance below>
+ output: figures/<name>.png
+ doc_type: journal | conference | thesis | grant | preprint |
+ report | poster | presentation | default
+ diagram_type: flowchart | architecture | pathway | circuit |
+ network | conceptual | auto
+ iterations: 1 | 2 | 3 (default 2)
+ aspect: auto | square | landscape | portrait (default auto)
+ quality: low | medium | high | auto (default derived from doc_type)
+ format: auto | png | svg (default auto — inferred from output extension)
+ ```
+
+ **aspect guidance** — the default `auto` lets the model pick, but
+ when you already know the figure's shape, set it explicitly:
+ - `landscape` for wide architecture diagrams, left-to-right pipelines,
+ multi-panel (3+ columns) layouts
+ - `portrait` for CONSORT/PRISMA flows, top-to-bottom pathway cascades,
+ tall hierarchies
+ - `square` for single-concept schematics, cycles, small callouts
+ - `auto` when the shape is genuinely ambiguous or the prompt already
+ describes the layout strongly
+
+ **quality guidance** — gpt-image-2 exposes four tiers; the cost and
+ render time roughly follow low < medium < high. Omitting the field
+ selects a sensible default from `doc_type`:
+
+ | doc_type | default quality |
+ |--------------------------------------|-----------------|
+ | journal / conference / thesis / grant| high |
+ | preprint / report / poster | medium |
+ | presentation | low |
+ | default | medium |
+
+ The loop automatically bumps one tier on a `needs_edit` verdict, so the
+ first iteration runs cheap and refinement only spends more compute when
+ the reviewer signals the draft was close-but-not-there. Override with
+ `quality: "low"` for explicit exploration / drafts, or `"high"` to force
+ camera-ready on the first pass.
+
+ The tool returns the final image path, per-iteration review scores,
+ the quality tier used for each iteration, the verdict trail
+ (acceptable / needs_edit / needs_regen), and a JSON review log at
+ `<name>_review_log.json`.
+
+ When `diagram_type: auto`, the tool infers from keywords in the prompt.
+ Explicit types beat inference when you know what you want.
+
+ ### Choosing PNG vs SVG output
+
+ The `format` parameter (or the `output` extension) selects the output
+ shape. The two formats have different quality envelopes and use cases:
+
+ **PNG (default)** — gpt-image-2 native raster, drives the verdict-driven
+ review loop directly. Best raw visual quality. Use when:
+ - The figure is final and won't be edited further
+ - Embedding in slides, posters, web, or as a paper raster figure
+ - The user said "image", "PNG", "图片"
+
+ **SVG** — same gpt-image-2 PNG verdict loop, then a vision-capable chat
+ model transcribes the finalized PNG into editable SVG markup. The .png
+ anchor is preserved as a sibling file (`<name>.png` next to `<name>.svg`)
+ so the user can diff visual drift later. Use when:
+ - The user wants to **edit labels / colors** in Inkscape, draw.io, or
+ by hand
+ - Embedding in LaTeX as `\includegraphics{*.svg}` or in HTML/Markdown
+ for crisp scaling
+ - The user said "SVG", "vector", "矢量图", "向量图"
+
+ Quality tiers depending on the user's configuration:
+
+ | Configuration | SVG path active | Quality |
+ |----------------------------------------------|---------------------------|---------|
+ | `OPENAI_API_KEY` + vision-capable chat model | PNG-anchored transcription | High (~7+/10, matches PNG) |
+ | `OPENAI_API_KEY` + non-vision chat model | Tool returns `SVG_REQUIRES_VISION_MODEL` error | — (recover by switching model or using PNG) |
+ | No `OPENAI_API_KEY` | Chat-model-only synthesis | Limited (~6/10) |
+
+ All catalog chat models in this app currently support vision input, so
+ the second row is rare in practice — Path B exists as a safety net for
+ future non-vision models.
+
+ ### Internal prompt structure
+
+ Under the hood the tool converts your `prompt` into a fixed-slot
+ production brief that gpt-image-2 parses reliably:
+
+ ```
+ 【SCENE / USE】 publication-grade scientific <diagram_type>
+ 【SUBJECT】 <your prompt verbatim>
+ 【COMPOSITION】 <derived from diagram_type and aspect>
+ 【KEY DETAILS】 <type-specific rules + typography + colour>
+ 【TEXT】 render every quoted literal verbatim
+ 【MUST KEEP】 <auto-extracted: quoted strings, n=X, numeric+unit>
+ 【AVOID】 no figure numbers, titles, captions, 3D, mascots
+ ```
+
+ Two practical consequences for **how you write the prompt**:
+
+ 1. **Put every label, number, and identifier in quotes.** The tool
+ auto-extracts quoted strings (and `n=...`, `1kΩ` / `10µF` / `128
+ nodes` style numeric+unit tokens) into a `MUST KEEP` checklist the
+ model must render verbatim. `"EGFR"` survives better than just
+ EGFR; `"n=350"` survives better than just n=350.
+ 2. **Describe in fixed-slot order when you can** — scene/use → subject
+ → key details → composition → text → must-keep → avoid. The tool
+ will structure whatever you write, but pre-structured prose rewrites
+ less aggressively.
+
+ ### Reference images (iterative refinement and style boards)
+
+ Two modes of `reference_mode` are supported; a third is reserved.
+
+ **`reference_mode: "revise_layout"`** (default) — the reference is the
+ existing draft to polish. The tool treats iteration 1 as a surgical
+ edit: gpt-image-2 keeps layout, colour, and positions unchanged unless
+ specific blocking issues require otherwise. Use this when the user
+ says "improve this figure" or "fix X in this figure".
+
+ **`reference_mode: "style_only"`** — the reference is a **style board**,
+ not a layout. The tool redraws the subject from scratch in the
+ reference's visual idiom (palette, typography, line weights, geometry)
+ while ignoring its layout and content. Use this when:
+ - The user points at an older figure and says "use this style" but
+ wants a completely different diagram.
+ - A lab or institution has a house-style exemplar you want the tool
+ to lift from without copying content.
+ - Reference and subject are semantically different (e.g., reference is
+ a CONSORT flowchart, subject is an architecture diagram) but you want
+ them to look like siblings.
+
+ `style_only` works with both raster and SVG outputs. In raster mode the
+ reference is passed through `/v1/images/edits` with an explicit
+ "style only, do not copy layout" instruction. In SVG mode the
+ reference SVG source is embedded in the prompt as a style exemplar.
+
+ **`reference_mode: "local_edit"`** — reserved; masked-region edit is
+ not yet implemented and the tool rejects it with a clear error.
 
- Do not rely on importing private skill modules directly from Python; this skill ships as
- scripts and references, not as an installable Python package.
+ ---

- ### Command-Line Options
+ ## Diagram types and how to describe them
+
+ ### flowchart
+ Use for CONSORT, PRISMA, study-design flow, decision trees, swimlanes.
+
+ Required in prompt:
+ - Every node's label (exact text)
+ - Every arrow's source and target
+ - Exact counts (n = …), conditions, yes/no labels
+ - Flow direction (top-to-bottom default)
+
+ Example:
+ > CONSORT participant flow, vertical top-to-bottom. Boxes:
+ > "Assessed for eligibility (n=500)" → split into
+ > "Excluded (n=150): age<18 n=80, declined n=50, other n=20"
+ > (right branch) and "Randomized (n=350)" (down). "Randomized" splits to
+ > "Treatment (n=175)" and "Control (n=175)". Each arm shows
+ > "Lost to follow-up (n=15 / n=10)" then "Analyzed (n=160 / n=165)".
+ > Colour: blue for process, orange for exclusion, green for analysis.
+
+ ### architecture
+ Use for system diagrams, microservices, data pipelines, block diagrams.
+
+ Required in prompt:
+ - Every block's label and role
+ - Layering or grouping (what belongs together)
+ - Connection direction + protocol/interface labels on edges
+ - Any shared boundaries (VPC, cluster, device)
+
+ Example:
+ > IoT monitoring architecture, three layers stacked vertically.
+ > Bottom layer (sensors): "Temperature", "Humidity", "Motion" in green
+ > boxes. Middle layer: "ESP32 microcontroller" in blue; connects up via
+ > "WiFi" and sideways to "Local display". Top layer: "Cloud server" in
+ > grey, connected to "Mobile app" in light blue. Edges labelled with
+ > protocols (I2C, UART, WiFi, HTTPS).
+
+ ### pathway
+ Use for signalling cascades, metabolic pathways, gene regulation,
+ protein-protein interaction maps.
+
+ Required in prompt:
+ - Every molecule by its symbol
+ - Arrow type: activation (→) vs inhibition (⊣)
+ - Exact order and compartments (cytoplasm, nucleus, membrane)
+ - Post-translational modifications if relevant
+
+ Example:
+ > MAPK signalling pathway, top-to-bottom. Cell membrane at top with
+ > "EGFR receptor" embedded. Below: "RAS-GTP" (oval) → "RAF" → "MEK" →
+ > "ERK" (all kinases, rectangles). "ERK" arrow crosses nuclear envelope
+ > (dashed line) to "Transcription factors" in the nucleus. Label each
+ > arrow "phosphorylation". Okabe-Ito palette.
+
+ ### circuit
+ Use for analogue/digital circuits, signal chains, power systems.
+
+ Required in prompt:
+ - Every component type with value + unit (1kΩ, 10µF, 5V)
+ - Ground and supply nets
+ - Wire crossings: dot = connection, no dot = jump
+
+ Example:
+ > Non-inverting op-amp amplifier. Input signal on left connects via
+ > 1kΩ resistor (R1) to the + input of an op-amp. Feedback: output
+ > through 10kΩ (R2) to − input, and 1kΩ (R3) from − input to ground.
+ > Supply rails ±12V. Output on the right. Standard IEEE symbols.
+
+ ### network
+ Use for neural network architectures, graphs, trees, org charts.
+
+ Required in prompt:
+ - Node types and dimensions where applicable (Dense 128, Conv 3×3)
+ - Edge semantics (data flow vs attention vs skip)
+ - Hierarchy direction
+
+ Example:
+ > Transformer encoder-decoder, two stacks side by side. Encoder (left,
+ > light blue): "Input embedding" → "Positional encoding" → 6× {Multi-head
+ > self-attention, Add & Norm, Feed-forward, Add & Norm}. Decoder (right,
+ > light red): "Output embedding" → "Positional encoding" → 6× {Masked
+ > self-attention, Add & Norm, Cross-attention (dashed edge from encoder
261
+ > top to decoder cross-attention), Add & Norm, Feed-forward, Add & Norm}.
262
+ > → Linear → Softmax. Label every block.
263
+
264
+ ### conceptual
265
+ Use for frameworks, theoretical models, idea maps, whiteboard-style
266
+ diagrams. Looser rules, more artistic licence.
267
+
268
+ Required in prompt:
269
+ - Core concepts grouped by category
270
+ - Relationships between categories
271
+ - Any visual metaphor you want (layers, concentric rings, Venn)
323
272
 
- ```bash
- # Basic usage (default threshold 7.5/10)
- python @skill/scripts/generate_schematic.py "diagram description" -o @ws/output.png
+ ---
 
- # Specify document type for appropriate quality threshold
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type journal # 8.5/10
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type conference # 8.0/10
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type poster # 7.0/10
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png --doc-type presentation # 6.5/10
+ ## doc_type thresholds
 
- # Custom max iterations (1-2)
- python @skill/scripts/generate_schematic.py "complex diagram" -o @ws/diagram.png --iterations 2
+ | doc_type | Behaviour |
+ |---------------|-----------|
+ | journal | Strictest threshold; accept only near-camera-ready output. |
+ | conference | Strict; small cosmetic issues allowed. |
+ | thesis, grant | Same bar as conference. |
+ | preprint, report | Standard threshold. |
+ | poster | Tolerates larger labels and bolder colours. |
+ | presentation | Lowest threshold; optimised for first-pass speed. |
+ | default | Middle of the range. |
 
- # Verbose output (see all API calls and reviews)
- python @skill/scripts/generate_schematic.py "flowchart" -o @ws/flow.png -v
+ Exact numeric thresholds depend on the review provider and are not
+ directly comparable across providers (see the tool's review log for the
+ applied threshold on each run).
 
- # Combine options
- python @skill/scripts/generate_schematic.py "neural network" -o @ws/nn.png --doc-type journal --iterations 2 -v
- ```
+ ---
 
- ### Prompt Engineering Tips
+ ## Iteration strategy
+
+ Default is **2 iterations** for almost every case. Do not bump to 3
+ without concrete evidence — the third iteration frequently *regresses*
+ a fix from the second, and the tool now detects this and stops early
+ anyway (see "Regression detection" below).
+
+ - `iterations: 1` — only when you want the first-pass output without
+ any review-driven rework.
+ - `iterations: 2` (default) — one draft plus one review-driven revision.
+ The second pass uses image-to-image editing when the reviewer judges
+ issues are cosmetic (needs_edit). When the reviewer flags structural
+ problems (needs_regen) the second pass redraws from scratch.
+ - `iterations: 3` — only when iteration 2 is expected to still be below
+ threshold. Good signals: `doc_type: journal | conference` AND the
+ prompt asks for something dense (many labels, multi-panel, intricate
+ relationships). Bad signals: `doc_type: presentation | poster`,
+ single-concept diagrams, simple flowcharts.
+
+ The tool stops early the moment a review comes back with verdict
+ `acceptable`, so higher `iterations` costs nothing when not needed —
+ but each *unnecessary* iteration risks the model undoing earlier
+ corrections.
+
+ ### Relaxed acceptance
+
+ When the reviewer returns `needs_edit` but the score is comfortably
+ above threshold (**≥ threshold + 1.0**) and no critical issues remain
+ (`wrong_content`, `missing_element`, `illegible_text`), the tool
+ promotes the verdict to `acceptable` and stops. Cosmetic notes
+ (`layout_collision`, `style_mismatch`) do not force another round
+ when the diagram is already well past the bar.
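
This promotion rule is mechanical enough to sketch. A minimal illustration in Python (the function and data shapes are assumptions for clarity, not the tool's actual code; only the issue names and the +1.0 margin come from the description above):

```python
# Sketch of the relaxed-acceptance rule (assumed names, not the tool's
# real implementation).
CRITICAL_ISSUES = {"wrong_content", "missing_element", "illegible_text"}

def promote_verdict(verdict, score, threshold, issues):
    """Promote needs_edit to acceptable when the draft is comfortably
    past the bar and only cosmetic notes remain."""
    comfortably_above = score >= threshold + 1.0
    has_critical = any(i in CRITICAL_ISSUES for i in issues)
    if verdict == "needs_edit" and comfortably_above and not has_critical:
        return "acceptable"
    return verdict
```

So a 9.1 draft against an 8.0 threshold with only a `layout_collision` note stops the loop, while the same score with a `wrong_content` issue does not.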
+
+ ### Regression detection
+
+ The tool tracks which blocking issues the reviewer stops complaining
+ about between iterations — those are the corrections we want to keep.
+ Each subsequent edit prompt explicitly reminds the model of the
+ already-fixed items and forbids regressing them. If a later draft
+ nevertheless re-introduces an issue from that fixed set,
+ the loop terminates early with `stoppedReason: "regression_detected"`
+ and the current best draft is returned. Two common causes of
+ regression:
+
+ - Too many iterations on a diagram that is already good enough (ships
+ a new cosmetic flaw while fixing an old cosmetic flaw).
+ - Prompt overspecification — the original request listed dozens of
+ tiny details, and the model keeps dropping a different one each
+ round.
+
+ When you see `regressedIssues` or `stoppedReason: "regression_detected"`
+ in the review log, prefer the current output over re-running with
+ more iterations; re-running usually makes it worse.
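
The bookkeeping behind this can be pictured with a small sketch (function name and data shapes are hypothetical; only the disappear-then-reappear logic mirrors the description above):

```python
# Minimal sketch of regression detection (not the tool's real code):
# issues the reviewer stopped flagging are treated as fixed; if a later
# review re-flags one of them, the loop would stop.
def check_regression(previous_issues, current_issues, fixed_so_far):
    """Return (updated fixed set, regressed issues)."""
    newly_fixed = previous_issues - current_issues  # reviewer stopped complaining
    fixed = fixed_so_far | newly_fixed
    regressed = fixed & current_issues              # a fixed issue came back
    return fixed, regressed

fixed, regressed = check_regression({"overlap", "typo"}, {"typo"}, set())
# "overlap" joins the fixed set; no regression yet
fixed, regressed = check_regression({"typo"}, {"overlap"}, fixed)
# "overlap" re-appeared: a regression, the loop would terminate
```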
 
- **1. Be Specific About Layout:**
- ```
- ✓ "Flowchart with vertical flow, top to bottom"
- ✓ "Architecture diagram with encoder on left, decoder on right"
- ✓ "Circular pathway diagram with clockwise flow"
- ```
-
- **2. Include Quantitative Details:**
- ```
- ✓ "Neural network with input layer (784 nodes), hidden layer (128 nodes), output (10 nodes)"
- ✓ "Flowchart showing n=500 screened, n=150 excluded, n=350 randomized"
- ✓ "Circuit with 1kΩ resistor, 10µF capacitor, 5V source"
- ```
+ ---
 
- **3. Specify Visual Style:**
- ```
- ✓ "Minimalist block diagram with clean lines"
- ✓ "Detailed biological pathway with protein structures"
- ✓ "Technical schematic with engineering notation"
- ```
+ ## Embedding the result in documents
 
- **4. Request Specific Labels:**
- ```
- ✓ "Label all arrows with activation/inhibition"
- ✓ "Include layer dimensions in each box"
- ✓ "Show time progression with timestamps"
- ```
+ The tool saves the final image to the exact `output` path you specified
+ (plus versioned `_v1.png`, `_v2.png`, … siblings for debugging, and a
+ `_review_log.json` next to them).
 
- **5. Mention Color Requirements:**
- ```
- ✓ "Use colorblind-friendly colors"
- ✓ "Grayscale-compatible design"
- ✓ "Color-code by function: blue for input, green for processing, red for output"
- ```
+ Embed with markdown image syntax. **The path is relative to the markdown
+ file you are writing, not the workspace root.**
 
- ## AI Generation Examples
-
- ### Example 1: CONSORT Flowchart
- ```bash
- python @skill/scripts/generate_schematic.py \
- "CONSORT participant flow diagram for randomized controlled trial. \
- Start with 'Assessed for eligibility (n=500)' at top. \
- Show 'Excluded (n=150)' with reasons: age<18 (n=80), declined (n=50), other (n=20). \
- Then 'Randomized (n=350)' splits into two arms: \
- 'Treatment group (n=175)' and 'Control group (n=175)'. \
- Each arm shows 'Lost to follow-up' (n=15 and n=10). \
- End with 'Analyzed' (n=160 and n=165). \
- Use blue boxes for process steps, orange for exclusion, green for final analysis." \
- -o @ws/figures/consort.png
+ Markdown at workspace root (`paper.md`) and tool output at `figures/consort.png`:
+ ```markdown
+ ![Figure 2: CONSORT participant flow](figures/consort.png)
  ```
 
- ### Example 2: Neural Network Architecture
- ```bash
- python @skill/scripts/generate_schematic.py \
- "Transformer encoder-decoder architecture diagram. \
- Left side: Encoder stack with input embedding, positional encoding, \
- multi-head self-attention, add & norm, feed-forward, add & norm. \
- Right side: Decoder stack with output embedding, positional encoding, \
- masked self-attention, add & norm, cross-attention (receiving from encoder), \
- add & norm, feed-forward, add & norm, linear & softmax. \
- Show cross-attention connection from encoder to decoder with dashed line. \
- Use light blue for encoder, light red for decoder. \
- Label all components clearly." \
- -o @ws/figures/transformer.png --iterations 2
+ Markdown in a subdirectory (`workspace/paper_draft.md`):
+ ```markdown
+ ![Figure 2: CONSORT participant flow](../figures/consort.png)
  ```
 
- ### Example 3: Biological Pathway
- ```bash
- python @skill/scripts/generate_schematic.py \
- "MAPK signaling pathway diagram. \
- Start with EGFR receptor at cell membrane (top). \
- Arrow down to RAS (with GTP label). \
- Arrow to RAF kinase. \
- Arrow to MEK kinase. \
- Arrow to ERK kinase. \
- Final arrow to nucleus showing gene transcription. \
- Label each arrow with 'phosphorylation' or 'activation'. \
- Use rounded rectangles for proteins, different colors for each. \
- Include membrane boundary line at top." \
- -o @ws/figures/mapk_pathway.png
- ```
+ Rule of thumb: prepend one `../` for each directory level the markdown
+ file sits below the workspace root.
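
When in doubt, compute the relative path rather than counting directories. A small sketch using Python's standard library (the file paths are hypothetical):

```python
import os.path

# Both paths expressed relative to the workspace root (example paths).
figure = "figures/consort.png"
markdown_file = "workspace/paper_draft.md"

# relpath gives the path from the markdown file's directory to the figure.
embed = os.path.relpath(figure, start=os.path.dirname(markdown_file))
print(embed)  # ../figures/consort.png
```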
 
- ### Example 4: System Architecture
- ```bash
- python @skill/scripts/generate_schematic.py \
- "IoT system architecture block diagram. \
- Bottom layer: Sensors (temperature, humidity, motion) in green boxes. \
- Middle layer: Microcontroller (ESP32) in blue box. \
- Connections to WiFi module (orange box) and Display (purple box). \
- Top layer: Cloud server (gray box) connected to mobile app (light blue box). \
- Show data flow arrows between all components. \
- Label connections with protocols: I2C, UART, WiFi, HTTPS." \
- -o @ws/figures/iot_architecture.png
- ```
+ Always include the `![...](...)` embed — do not only reference the
+ figure in prose ("see Figure 1"), or the rendered document will not
+ show the image.
 
  ---
 
- ## Command-Line Usage
-
- The main entry point for generating scientific schematics:
-
- ```bash
- # Basic usage
- python @skill/scripts/generate_schematic.py "diagram description" -o @ws/output.png
-
- # Custom iterations (max 2)
- python @skill/scripts/generate_schematic.py "complex diagram" -o @ws/diagram.png --iterations 2
-
- # Verbose mode
- python @skill/scripts/generate_schematic.py "diagram" -o @ws/out.png -v
- ```
-
- **Note:** The AI generation system includes automatic quality review in its iterative refinement process. Each iteration is evaluated for scientific accuracy, clarity, and accessibility.
-
- ## Best Practices Summary
-
- ### Design Principles
-
- 1. **Clarity over complexity** - Simplify, remove unnecessary elements
- 2. **Consistent styling** - Use templates and style files
- 3. **Colorblind accessibility** - Use Okabe-Ito palette, redundant encoding
- 4. **Appropriate typography** - Sans-serif fonts, minimum 7-8 pt
- 5. **Vector format** - Always use PDF/SVG for publication
-
- ### Technical Requirements
-
- 1. **Resolution** - Vector preferred, or 300+ DPI for raster
- 2. **File format** - PDF for LaTeX, SVG for web, PNG as fallback
- 3. **Color space** - RGB for digital, CMYK for print (convert if needed)
- 4. **Line weights** - Minimum 0.5 pt, typical 1-2 pt
- 5. **Text size** - 7-8 pt minimum at final size
-
- ### Integration Guidelines
-
- 1. **Include in markdown** - Use `![Figure N: caption](path/to/figures/filename.png)` for every generated image (path must be relative to the markdown file, not the workspace root)
- 2. **Include in LaTeX** - Use `\includegraphics{}` for generated images in LaTeX documents
- 3. **Caption thoroughly** - Describe all elements and abbreviations
- 4. **Reference in text** - Explain diagram in narrative flow
- 5. **Maintain consistency** - Same style across all figures in paper
- 6. **Version control** - Keep prompts and generated images in repository
+ ## What the tool does NOT do
 
- ## Troubleshooting Common Issues
+ - It does not embed figures into your document — you must add the
+ markdown image syntax yourself.
+ - It does not cite figures in text — write the narrative reference too.
+ - It does not deduplicate across iterations; the `_v1.png` and `_v2.png`
+ intermediates are left on disk for inspection.
 
- ### AI Generation Issues
-
- **Problem**: Overlapping text or elements
- - **Solution**: AI generation automatically handles spacing
- - **Solution**: Increase iterations: `--iterations 2` for better refinement
-
- **Problem**: Elements not connecting properly
- - **Solution**: Make your prompt more specific about connections and layout
- - **Solution**: Increase iterations for better refinement
-
- ### Image Quality Issues
-
- **Problem**: Export quality poor
- - **Solution**: AI generation produces high-quality images automatically
- - **Solution**: Increase iterations for better results: `--iterations 2`
-
- **Problem**: Elements overlap after generation
- - **Solution**: AI generation automatically handles spacing
- - **Solution**: Increase iterations: `--iterations 2` for better refinement
- - **Solution**: Make your prompt more specific about layout and spacing requirements
-
- ### API Issues
-
- **Problem**: Authentication error
- - **Solution**: Verify `OPENROUTER_API_KEY` is set correctly
- - **Solution**: Check your OpenRouter account has sufficient credits
-
- **Problem**: Model not available
- - **Solution**: Check OpenRouter model availability at openrouter.ai/models
- - **Solution**: Try alternative models via `--image-model` or `--review-model`
-
- ## Resources and References
-
- ### Detailed References
-
- Load these files for comprehensive information on specific topics:
-
- - **`@skill/references/best_practices.md`** - Publication standards and accessibility guidelines
- - **`@skill/references/README.md`** - Extended usage guide and troubleshooting
- - **`@skill/references/QUICK_REFERENCE.md`** - Condensed command cheat sheet
-
- ### External Resources
-
- **Python Libraries**
- - Schemdraw Documentation: https://schemdraw.readthedocs.io/
- - NetworkX Documentation: https://networkx.org/documentation/
- - Matplotlib Documentation: https://matplotlib.org/
-
- **Publication Standards**
- - Nature Figure Guidelines: https://www.nature.com/nature/for-authors/final-submission
- - Science Figure Guidelines: https://www.science.org/content/page/instructions-preparing-initial-manuscript
- - CONSORT Diagram: http://www.consort-statement.org/consort-statement/flow-diagram
-
- ## Integration with Other Skills
-
- This skill works synergistically with:
-
- - **Scientific Writing** - Diagrams follow figure best practices
- - **Scientific Visualization** - Shares color palettes and styling
- - **Research Grants** - Methodology diagrams for proposals
- - **Scholar Evaluation** - Evaluate clarity, completeness, and communication quality
-
- ## Quick Reference Checklist
-
- Before submitting diagrams, verify:
-
- ### Visual Quality
- - [ ] High-quality image format (PNG from AI generation)
- - [ ] No overlapping elements (AI handles automatically)
- - [ ] Adequate spacing between all components (AI optimizes)
- - [ ] Clean, professional alignment
- - [ ] All arrows connect properly to intended targets
-
- ### Accessibility
- - [ ] Colorblind-safe palette (Okabe-Ito) used
- - [ ] Works in grayscale (tested with accessibility checker)
- - [ ] Sufficient contrast between elements (verified)
- - [ ] Redundant encoding where appropriate (shapes + colors)
- - [ ] Colorblind simulation passes all checks
-
- ### Typography and Readability
- - [ ] Text minimum 7-8 pt at final size
- - [ ] All elements labeled clearly and completely
- - [ ] Consistent font family and sizing
- - [ ] No text overlaps or cutoffs
- - [ ] Units included where applicable
-
- ### Publication Standards
- - [ ] Consistent styling with other figures in manuscript
- - [ ] Comprehensive caption written with all abbreviations defined
- - [ ] Referenced appropriately in manuscript text
- - [ ] Meets journal-specific dimension requirements
- - [ ] Exported in required format for journal (PDF/EPS/TIFF)
-
- ## Environment Setup
-
- ```bash
- # Required
- export OPENROUTER_API_KEY='your-openrouter-api-key'
-
- # Optional model overrides
- export SCHEMATIC_IMAGE_MODEL='google/gemini-3-pro-image-preview'
- export SCHEMATIC_REVIEW_MODEL='google/gemini-3-pro-preview'
- ```
+ ---
 
- ## Getting Started
+ ## House style — what gives every figure its shared identity
+
+ Every diagram produced by this tool is drawn against a fixed **house
+ visual style** — a single profile bundled with the system that pins:
+
+ - **Theme** — editorial institutional, off-white background, graphite
+ text/strokes, restrained contrast, disciplined whitespace. Not
+ "AI-aesthetic", not startup-whitepaper.
+ - **Typography** — Inter / SF Pro Text / Helvetica Neue / Arial stack,
+ with fixed size tokens for section labels, node labels, edge
+ annotations, and foot labels.
+ - **Palette roles** — text/stroke, primaryStructure, secondaryStructure,
+ contextFill, resultAccent, warningAccent, grid — each mapped to a
+ specific hex. Not "Okabe-Ito preferred" — a concrete editorial blue/red
+ identity with accessibility as fallback, not default.
+ - **Geometry tokens** — stroke widths (1.5px / 1px), corner radius 8px,
+ filled-triangle arrowheads, dashed-border group containers, 32px min
+ gutter, 3-step box height scale.
+ - **Motifs** — labels outside boxes, arrow stroke matches source colour,
+ at most two accent colours per figure.
+
+ The reviewer is explicitly told to grade "house-style adherence" as the
+ fifth rubric dimension, so iterations that drift from the profile get
+ blocked from acceptance. You'll see `houseProfile: editorial-institutional-v1`
+ in the review log to confirm which profile was applied.
+
+ ### Overriding the house style via prompt
+
+ There is **no** user-facing theme selector — consistency across figures
+ is the whole point. If a user needs to deviate for a specific figure
+ (e.g., matching a publication's brand colours), express the overrides
+ in the `prompt` itself using quoted tokens or hex codes:
+
+ ```
+ prompt: >
+ Workflow diagram, three stages.
+ Use "#E30613" for the final "Result" node (publication brand red);
+ rest of the figure follows the default house palette.
+ ```
+
+ The tool's literal extractor picks up hex codes (`#RRGGBB`,
+ `#RRGGBBAA`), rgb()/rgba(), hsl()/hsla() tokens from plain prose and
+ pins them in MUST KEEP, so the override survives rendering.
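
For a feel of what the extractor latches onto, here is a rough approximation (the regex is illustrative only; the tool's actual extraction grammar may differ):

```python
import re

# Illustrative approximation of colour-literal extraction from prose.
# Order matters: try 8-digit hex before 6-digit so #RRGGBBAA matches fully.
COLOR_LITERAL = re.compile(
    r"#[0-9A-Fa-f]{8}\b|#[0-9A-Fa-f]{6}\b"  # #RRGGBBAA / #RRGGBB
    r"|rgba?\([^)]*\)"                       # rgb(...) / rgba(...)
    r"|hsla?\([^)]*\)"                       # hsl(...) / hsla(...)
)

prompt = 'Use "#E30613" for the final "Result" node; rest follows the house palette.'
print(COLOR_LITERAL.findall(prompt))  # ['#E30613']
```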
+
+ ## Configuration
+
+ Raster image generation (gpt-image-2) requires `OPENAI_API_KEY`
+ (set under Settings → API Keys). Review uses either OpenAI or
+ Anthropic based on Settings → Diagrams. The `auto` setting prefers
+ Claude when both keys are available, so the generator does not grade
+ its own family.
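
The `auto` preference can be pictured as a tiny selector (function name and signature are assumptions for illustration, not the app's real code):

```python
# Sketch of the auto review-provider choice: prefer the family that did
# not generate the image, so gpt-image-2 output is not self-graded.
def pick_reviewer(setting, has_openai, has_anthropic):
    if setting in ("openai", "anthropic"):
        return setting
    # auto: prefer Claude when its key is present
    if has_anthropic:
        return "anthropic"
    if has_openai:
        return "openai"
    return None
```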
438
+
439
+ ### Output format selection (png vs svg)
440
+
441
+ Format intent flows through two redundant channels — either is enough,
442
+ and when they disagree the explicit `format` parameter wins:
443
+
444
+ 1. **Explicit `format` parameter** — use this when the user said
445
+ something format-specific in the prompt:
446
+ - "SVG / 矢量图 / vector / 向量图" → `format: "svg"`
447
+ - "PNG / 图片 / raster / bitmap" → `format: "png"`
448
+ - Otherwise omit (defaults to `auto`).
449
+ 2. **Output filename extension** — `output: figures/foo.svg` implies
450
+ `format: "svg"`, `output: figures/foo.png` implies `format: "png"`.
451
+ The tool will rewrite the extension to match `format` if they
452
+ disagree (and report `extensionChanged` in the result so you know
453
+ to update any Markdown embed).
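
The precedence can be sketched as follows (a hypothetical helper: the parameter names mirror the tool's `format` and `output`, everything else is illustrative):

```python
import os.path

# Sketch of format resolution: explicit format wins; otherwise the
# output extension decides; otherwise fall back to auto. When the
# extension disagrees with the resolved format, rewrite it (the real
# tool additionally reports extensionChanged).
def resolve_format(output, format=None):
    """Return (resolved format, possibly-rewritten output path)."""
    ext = os.path.splitext(output)[1].lower()
    if format in ("svg", "png"):
        resolved = format
    elif ext in (".svg", ".png"):
        resolved = ext[1:]
    else:
        resolved = "auto"
    if resolved in ("svg", "png") and ext != "." + resolved:
        output = os.path.splitext(output)[0] + "." + resolved
    return resolved, output
```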
+
+ Behavioural summary:
+
+ - `format: "svg"` (or `.svg` extension) → **always** synthesises SVG
+ via the chat model, even when `OPENAI_API_KEY` is present. Choose
+ this for vector output (scales infinitely, small filesize, editable).
+ - `format: "png"` (or `.png` extension, the default) → raster via
+ `gpt-image-2` when `OPENAI_API_KEY` is configured. Falls back to
+ SVG-via-chat-model automatically when the key is missing; the file
+ is renamed `.svg` and `extensionChanged` is reported.
+
+ ### SVG fallback
+
+ The same verdict-driven iteration loop runs for both formats. In SVG
+ mode, generation goes through the chat model, and review picks the
+ strongest option available at runtime:
+
+ 1. **Rasterise-then-vision (preferred)** — when an offscreen renderer
+ is available (Electron main process) and a real vision reviewer
+ (OpenAI or Anthropic) is configured, the generated SVG is
+ rendered to PNG and passed to the vision model. This catches text
+ overflow, element overlap, and other problems invisible at the
+ SVG-source level.
+ 2. **Source-level fallback** — when rasterisation is unavailable (e.g.
+ running outside Electron) or no vision reviewer auth is present,
+ the reviewer reads the SVG markup as text. Structural checks work;
+ visual-layout checks do not.
+
+ Other SVG-mode notes:
+
+ - `mode: "svg_fallback"` appears in the tool result and review log.
+ The provider id distinguishes the two review paths:
+ `rasterize+openai:…` or `rasterize+anthropic:…` vs
+ `svg-fallback:…`.
+ - Quality of the SVG itself depends on the chat model's spatial
+ reasoning. Claude Opus and GPT-4o / GPT-5 produce usable output for
+ flowcharts, simple architecture, and box-and-arrow schemas. Pathway
+ illustrations and complex circuits will be noticeably weaker than
+ gpt-image-2.
+ - Self-grading bias only applies in the source-level path (same model
+ reads back its own SVG). The rasterise-then-vision path sends the
+ rendered image to a different model, so bias is negligible.
+ - Scores across raster and SVG modes are still not directly
+ comparable — thresholds are calibrated per reviewer and per path.
+
+ To embed an SVG in Markdown:
 
- **Simplest possible usage:**
- ```bash
- python @skill/scripts/generate_schematic.py "your diagram description" -o @ws/output.png
+ ```markdown
+ ![Figure 1: workflow](figures/workflow.svg)
  ```
 
- ---
-
- Use this skill to create clear, accessible, publication-quality diagrams that effectively communicate complex scientific concepts. The AI-powered workflow with iterative refinement ensures diagrams meet professional standards.
+ Most Markdown renderers (GitHub, typical preview extensions) and
+ Milkdown render SVG inline just like PNG.