npm - @research-copilot/plugin - Versions diffs - 1.1.15 → 1.1.16 - Mend

@research-copilot/plugin 1.1.15 → 1.1.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (117)

package/dist/.claude-plugin/plugin.json +3 -2
package/dist/.codex-plugin/plugin.toml +2 -1
package/dist/.cursor-plugin/plugin.json +3 -2
package/dist/.gemini-plugin/plugin.json +3 -2
package/dist/.opencode-plugin/plugin.json +3 -2
package/dist/.windsurf-plugin/plugin.json +3 -2
package/dist/agents/copilot-conductor.agent.md +60 -0
package/dist/agents/copilot-experiment.agent.md +56 -0
package/dist/agents/copilot-ideation.agent.md +45 -0
package/dist/agents/copilot-literature.agent.md +34 -0
package/dist/agents/copilot-polisher.agent.md +30 -0
package/dist/agents/copilot-rebuttal.agent.md +35 -0
package/dist/agents/copilot-reviewer.agent.md +35 -0
package/dist/agents/copilot-writer.agent.md +39 -0
package/dist/hooks/dispatch-reminder.json +17 -0
package/dist/hooks/loop-armer.json +17 -0
package/dist/hooks/research-copilot-guard.hook.md +51 -0
package/dist/hooks/scientist-guardrails.json +17 -0
package/dist/hooks/scripts/__tests__/__init__.py +0 -0
package/dist/hooks/scripts/__tests__/test_post_tool_loop_armer.py +88 -0
package/dist/hooks/scripts/__tests__/test_research_copilot_guard_main_session.py +150 -0
package/dist/hooks/scripts/__tests__/test_session_start_memory_injector.py +66 -0
package/dist/hooks/scripts/__tests__/test_user_prompt_dispatch_reminder.py +37 -0
package/dist/hooks/scripts/_copilot_hook_lib.py +564 -0
package/dist/hooks/scripts/copilot_subagent_stop.py +203 -0
package/dist/hooks/scripts/copilot_write_guard.py +96 -0
package/dist/hooks/scripts/post_tool_loop_armer.py +61 -0
package/dist/hooks/scripts/research_copilot_guard.py +208 -0
package/dist/hooks/scripts/scientist_guardrails.py +29 -0
package/dist/hooks/scripts/session_start_memory_injector.py +188 -0
package/dist/hooks/scripts/user_prompt_dispatch_reminder.py +40 -0
package/dist/hooks/session-memory-injector.json +17 -0
package/dist/hooks/tests/__init__.py +0 -0
package/dist/hooks/tests/conftest.py +61 -0
package/dist/hooks/tests/fixtures/transcript_copilot_experiment_complete.jsonl +2 -0
package/dist/hooks/tests/fixtures/transcript_copilot_experiment_state_jump.jsonl +2 -0
package/dist/hooks/tests/fixtures/transcript_copilot_literature.jsonl +2 -0
package/dist/hooks/tests/fixtures/transcript_main_only.jsonl +2 -0
package/dist/hooks/tests/fixtures/transcript_malformed_state_output.jsonl +2 -0
package/dist/hooks/tests/integration_run.ps1 +65 -0
package/dist/hooks/tests/test_copilot_hook_lib.py +398 -0
package/dist/hooks/tests/test_copilot_subagent_stop.py +186 -0
package/dist/hooks/tests/test_copilot_write_guard.py +137 -0
package/dist/hooks/tests/test_session_start_snapshot.py +116 -0
package/dist/hooks/tests/test_state_machine_consistency.py +75 -0
package/dist/skills/arxivsub-skill/SKILL.md +98 -0
package/dist/skills/arxivsub-skill/skill.json +5 -0
package/dist/skills/de-ai-checker/SKILL.md +110 -0
package/dist/skills/de-ai-checker/skill.json +5 -0
package/dist/skills/deep-interview/SKILL.md +91 -0
package/dist/skills/deep-interview/skill.json +5 -0
package/dist/skills/grill-with-docs/SKILL.md +120 -0
package/dist/skills/grill-with-docs/skill.json +5 -0
package/dist/skills/init-mcp/SKILL.md +83 -0
package/dist/skills/init-mcp/skill.json +5 -0
package/dist/skills/model-escalation/SKILL.md +93 -0
package/dist/skills/model-escalation/skill.json +5 -0
package/dist/skills/paper-architecture-web-drawing/SKILL.md +282 -0
package/dist/skills/paper-architecture-web-drawing/skill.json +5 -0
package/dist/skills/paper-deai/SKILL.md +53 -0
package/dist/skills/paper-deai/skill.json +5 -0
package/dist/skills/paper-en2zh/SKILL.md +29 -0
package/dist/skills/paper-en2zh/skill.json +5 -0
package/dist/skills/paper-expand/SKILL.md +43 -0
package/dist/skills/paper-expand/skill.json +5 -0
package/dist/skills/paper-experiment-analysis/SKILL.md +38 -0
package/dist/skills/paper-experiment-analysis/skill.json +5 -0
package/dist/skills/paper-figure-caption/SKILL.md +29 -0
package/dist/skills/paper-figure-caption/skill.json +5 -0
package/dist/skills/paper-logic-check/SKILL.md +30 -0
package/dist/skills/paper-logic-check/skill.json +5 -0
package/dist/skills/paper-polish/SKILL.md +34 -305
package/dist/skills/paper-polish/skill.json +5 -0
package/dist/skills/paper-review/SKILL.md +49 -0
package/dist/skills/paper-review/skill.json +5 -0
package/dist/skills/paper-sanity-check/SKILL.md +122 -0
package/dist/skills/paper-sanity-check/skill.json +5 -0
package/dist/skills/paper-shorten/SKILL.md +42 -0
package/dist/skills/paper-shorten/skill.json +5 -0
package/dist/skills/paper-table-caption/SKILL.md +29 -0
package/dist/skills/paper-table-caption/skill.json +5 -0
package/dist/skills/paper-translate/SKILL.md +48 -0
package/dist/skills/paper-translate/skill.json +5 -0
package/dist/skills/plugin-dev-agent-development/SKILL.md +95 -0
package/dist/skills/plugin-dev-agent-development/skill.json +5 -0
package/dist/skills/research-workflow/SKILL.md +116 -0
package/dist/skills/research-workflow/skill.json +5 -0
package/dist/skills/scientist-experiment-runner/SKILL.md +76 -0
package/dist/skills/scientist-experiment-runner/skill.json +5 -0
package/dist/skills/scientist-ideation/SKILL.md +52 -0
package/dist/skills/scientist-ideation/skill.json +5 -0
package/dist/skills/scientist-plotting/SKILL.md +49 -0
package/dist/skills/scientist-plotting/skill.json +5 -0
package/dist/skills/scientist-review/SKILL.md +40 -0
package/dist/skills/scientist-review/skill.json +5 -0
package/dist/skills/scientist-runtime-init/SKILL.md +46 -0
package/dist/skills/scientist-runtime-init/skill.json +5 -0
package/dist/skills/scientist-writeup/SKILL.md +60 -0
package/dist/skills/scientist-writeup/skill.json +5 -0
package/dist/skills/talk-normal/SKILL.md +73 -0
package/dist/skills/talk-normal/skill.json +5 -0
package/package.json +1 -1
package/dist/agents/rc-experiment.md +0 -203
package/dist/agents/rc-ideation.md +0 -224
package/dist/agents/rc-literature.md +0 -228
package/dist/agents/rc-plan.md +0 -189
package/dist/agents/rc-polisher.md +0 -166
package/dist/agents/rc-rebuttal.md +0 -194
package/dist/agents/rc-reviewer.md +0 -187
package/dist/agents/rc-update-spec.md +0 -231
package/dist/agents/rc-verify.md +0 -234
package/dist/agents/rc-writer.md +0 -161
package/dist/skills/experiment-design/SKILL.md +0 -331
package/dist/skills/full-research-workflow/SKILL.md +0 -363
package/dist/skills/literature-search/SKILL.md +0 -244
package/dist/skills/sanity-check/SKILL.md +0 -449
package/dist/skills/submission-sprint/SKILL.md +0 -361

package/dist/skills/grill-with-docs/SKILL.md ADDED

@@ -0,0 +1,120 @@
+---
+name: grill-with-docs
+description: "Post-plan stress test. Use AFTER a plan is drafted (Goal anchor, ideation candidate, rebuttal strategy, pipeline template) to gap-check the plan against the project's existing documentation, terminology, and recent reviewer / handoff history in `.copilot/`. Sharpens fuzzy terms inline, cross-references the codebase / tex / logs, and offers ADRs only when a decision is hard to reverse. Never used to draft the plan itself. Triggers on: '校验计划', '对着文档拷问', '把计划放到文档里盘一遍', 'grill the plan', 'stress-test plan', 'audit plan against docs', 'check plan for gaps'."
+version: 0.2.0
+---
+# Grill with docs — Post-plan gap check
+Run **after** a plan exists (Goal anchor in `experiments.md`, selected direction in `ideas.md`, response strategy in `rebuttal/round-N.md`, or routing decision in `decisions.md`). Purpose: stress-test that plan against the project's existing language and documented state, surface contradictions, and update the docs inline as terminology is resolved.
+This skill is **not for plan drafting**. If no plan exists yet, run `deep-interview` first, hand off to the planning agent, then return here.
+## When this skill fires
+Fire automatically when the most recent disk write was one of:
+- `## Goal anchor` block freshly written to `.copilot/experiments.md`
+- A new `## Selected direction` in `.copilot/ideas.md`
+- A new `## Reviewer N strategy` block in `rebuttal/round-N.md`
+- A new pipeline template entry in `.copilot/decisions.md`
+Also fires on user request: "校验一下这个计划" / "grill this plan."
+## Documentation surface (auto-detected)
+```
+.copilot/
+├── state.md             ← stage cursor + loop counters
+├── literature.md        ← locked baseline + related work
+├── ideas.md             ← user preferences + candidates + selected
+├── experiments.md       ← Goal anchor + Run-N history
+├── handoff.md           ← writer / polisher / reviewer / rebuttal facts
+├── decisions.md         ← approval-gate decisions
+├── glossary.md          ← created lazily by THIS skill on first term resolve
+├── adr/                 ← created lazily by THIS skill on first ADR
+└── reviews/round-N.md   ← independent review rounds
+```
+`glossary.md` and `adr/` are created **only when** the first term / first ADR appears — do not pre-create empty scaffolds.
+## Procedure
+### Step 1 — Read the plan + the docs
+Load the just-written plan block and the relevant `.copilot/` files. For multi-doc projects (rare in this repo), also load any sibling `CONTEXT.md` / `CONTEXT-MAP.md` if present.
+### Step 2 — Run the four challenges, in order
+| Challenge | What you do |
+|---|---|
+| **Glossary clash** | For every noun phrase in the plan, check `.copilot/glossary.md` (and the existing tex / `ideas.md` / `literature.md`). If a term collides with prior usage or is fuzzy ("module," "robustness," "improvement"), propose a precise canonical term and ask the user to confirm. Update `glossary.md` inline when resolved. |
+| **Sharpen fuzzy language** | For every claim ("works better," "more robust," "faster"), demand the metric / unit / baseline / threshold. Push the user to a number or a falsifiable shape. |
+| **Concrete scenario stress test** | For every relationship in the plan ("Module A feeds Module B"), spell out one concrete scenario end-to-end. If the scenario breaks, flag it before any experiment burns compute. |
+| **Cross-reference with code / data** | For every "how it works" claim, grep the codebase / tex / logs and confirm the code agrees. If the plan describes behaviour the code does not exhibit, the plan is wrong — flag it. |
+Each challenge runs **once** per pass. One question at a time, with a recommended answer + the file:line that motivated the question.
+### Step 3 — Update docs inline (lazy creation)
+When a term is resolved, write to `.copilot/glossary.md` immediately:
+```markdown
+## <Canonical term>
+- Definition: <one sentence>
+- First defined: <YYYY-MM-DD> (during grill-with-docs of <plan slug>)
+- Aliases to avoid: <fuzzy or colliding terms now retired>
+- Used in: <file paths / sections>
+```
+Create `glossary.md` if it does not yet exist.
+### Step 4 — Offer an ADR only when all three are true
+Add to `.copilot/adr/NNNN-<slug>.md` only when:
+1. **Hard to reverse** — changing this mid-project means redoing experiments / rewriting sections
+2. **Surprising without context** — a future reader (or a reviewer) will ask "why this way?"
+3. **Result of a real trade-off** — there were ≥2 alternatives, one was picked for a specific reason
+If even one is false, skip the ADR. Most decisions live in `decisions.md` already and do not need promotion. Number ADRs by file count: first ADR is `0001-<slug>.md`.
+ADR template:
+```markdown
+# <NNNN> — <Title>
+- Status: accepted | superseded by NNNN
+- Date: <YYYY-MM-DD>
+- Context: <2-3 sentences — why this came up, which plan triggered it>
+- Decision: <the chosen alternative, one sentence>
+- Alternatives considered: <list with one-line "why not">
+- Consequences: <experiments / sections / future ablations this commits us to>
+```
+## Output
+When the pass finishes, emit:
+```markdown
+## Grill-with-docs report — <plan slug>
+- Date: <YYYY-MM-DD>
+- Plan reviewed: <file:line>
+- Glossary entries added / updated: <count> → <glossary.md anchors>
+- Fuzzy claims sharpened: <count> → <plan file:line edits>
+- Scenarios stress-tested: <count> → <list of scenarios + outcomes>
+- Code cross-references: <count> → <files / functions verified>
+- ADRs created: <count> → <adr/NNNN-*.md anchors, or "none — bar not met">
+- Plan changes proposed: <list of edits to the plan file, with file:line>
+- Residual risks: <list anything you grilled and could not resolve>
+- Hand off to: <agent who acts on the changes, or "user approval" if changes need confirmation>
+```
+## Hard constraints
+- **Post-plan only** — if no plan block exists, exit and recommend `deep-interview` first
+- **Read before challenging** — every challenge must cite a concrete file:line, not "in general"
+- **Update docs inline** — never batch glossary updates "for the end"
+- **ADR bar is strict** — three conditions, ALL must hold; otherwise leave the decision in `decisions.md`
+- **Do not edit the plan unilaterally** — propose edits with file:line; the writing agent (or user) applies them
+- **One challenge round per pass** — do not loop the four challenges; if more passes are needed, the user explicitly re-invokes
+- **Lazy file creation** — `glossary.md` and `adr/` directory created only on first real entry
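Steps 3 and 4 of the skill above describe lazy creation and ADR numbering in prose only. A minimal Python sketch of those two rules follows. It assumes ADRs live in `.copilot/adr/` as `NNNN-<slug>.md` and that "file count" means the number of existing ADR files. The helper names `next_adr_path` and `should_offer_adr` are hypothetical and are not part of the package.

```python
from pathlib import Path


def should_offer_adr(hard_to_reverse: bool, surprising: bool, real_tradeoff: bool) -> bool:
    """All three conditions must hold; otherwise the decision stays in decisions.md."""
    return hard_to_reverse and surprising and real_tradeoff


def next_adr_path(copilot_dir: Path, slug: str) -> Path:
    """Return the path for the next ADR, creating adr/ only when it is first needed."""
    adr_dir = copilot_dir / "adr"
    # Lazy creation: the directory appears with the first real ADR, not before.
    adr_dir.mkdir(parents=True, exist_ok=True)
    # Numbering by file count (assumed reading): existing ADR files + 1, zero-padded.
    existing = list(adr_dir.glob("[0-9][0-9][0-9][0-9]-*.md"))
    return adr_dir / f"{len(existing) + 1:04d}-{slug}.md"
```

Counting files keeps numbering simple, but it would reuse a number if an earlier ADR were deleted; a real implementation might prefer the highest existing number plus one.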

package/dist/skills/grill-with-docs/skill.json ADDED

@@ -0,0 +1,5 @@
+{
+  "name": "grill-with-docs",
+  "description": "Post-plan stress test. Use AFTER a plan is drafted (Goal anchor, ideation candidate, rebuttal strategy, pipeline template) to gap-check the plan against the project's existing documentation, terminology, and recent reviewer / handoff history in `.copilot/`. Sharpens fuzzy terms inline, cross-references the codebase / tex / logs, and offers ADRs only when a decision is hard to reverse. Never used to draft the plan itself. Triggers on: '校验计划', '对着文档拷问', '把计划放到文档里盘一遍', 'grill the plan', 'stress-test plan', 'audit plan against docs', 'check plan for gaps'.",
+  "entry": "SKILL.md"
+}

package/dist/skills/init-mcp/SKILL.md ADDED

@@ -0,0 +1,83 @@
+---
+name: init-mcp
+description: "Use when setting up the plugin for the first time, installing dependencies, configuring MCP servers, or when the user says '初始化', 'init', 'setup', '装环境', '配置', 'install', 'configure', 'first time', '首次使用'. Handles both dependency marketplace installation and MCP server setup."
+version: 0.3.0
+---
+# Init MCP
+One-shot plugin setup: install dependency marketplaces → install Python deps → write `.mcp.json` → register hooks → regenerate skill.json metadata → verify each server → report optional secrets.
+## Step 1: Install dependency marketplaces
+This plugin depends on skills from 5 third-party marketplaces. If they are not added, plugin dependencies will stay unresolved (skills from those sources will be missing). The `superpowers` dependency uses Claude Code's built-in `claude-plugins-official` marketplace.
+Check whether each marketplace is already added by looking at the user's installed plugins. For each missing marketplace, instruct the user to run:
+```
+/plugin marketplace add Imbad0202/academic-research-skills
+/plugin marketplace add Lylll9436/Paper-Polish-Workflow-skill
+/plugin marketplace add multica-ai/andrej-karpathy-skills
+/plugin marketplace add anthropics/skills
+/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
+```
+These are `/plugin` commands that must be typed by the user in the Claude Code prompt — they cannot be run via Bash. After the user adds all marketplaces, proceed to Step 2.
+If all marketplaces are already present, skip this step.
+## Step 2: Run the installer script
+`self/install.py` is a cross-platform Python script that handles MCP and hook setup.
+```bash
+python self/install.py
+```
+Supported flags:
+- `--target /path` install to a non-default workspace
+- `--dry-run` print plan without writing files
+- `--skip-deps` skip pip install
+- `--skip-verify` skip the MCP startup handshake
+## What the script does
+1. **Report dependency marketplaces** — print the prerequisite `plugin marketplace add` commands.
+2. **Install Python deps** — read `self/mcp/requirements.txt`, run `pip install` (default: `pdfplumber`).
+3. **Write `.mcp.json`** — scan `self/mcp/servers/` for every `server.py`, generate a Claude-Code-style `.mcp.json` with **absolute paths** to avoid `${workspaceFolder}`-expansion failures.
+4. **Register hooks** — inject SessionStart, PreToolUse, UserPromptSubmit, and PostToolUse hooks into `.claude/settings.json`. Idempotent; no duplicates.
+5. **Register conductor agent** — set `agent: copilot-conductor` in `.claude/settings.json`.
+6. **Regenerate skill.json metadata** — required by Claude Code 2.1.142+. Calls `self/scripts/generate-skill-json.py` to walk every skill and write a sibling `skill.json` from its SKILL.md frontmatter.
+7. **Verify MCP startup** — send `initialize` JSON-RPC to each server and confirm a response.
+8. **Report optional secrets** — check `ARXIVSUB_SKILL_KEY`; if unset, warn but do not block install.
+## Trigger scenarios
+- First-time use after a fresh clone → `/init-mcp`
+- An MCP server is unresponsive → `/init-mcp` (the script rewrites config and re-verifies)
+- After adding a new dependency marketplace → `/init-mcp` to re-verify everything
+## Servers currently under `self/mcp/servers/`
+The repo-root `.mcp.json` is generated by scanning `self/mcp/servers/` — there is no static `self/mcp/mcp.json`.
+| Server | Deps | Description |
+|---|---|---|
+| `ai-scientist` | stdlib only | runtime check, experiment directory browsing (non-model) |
+| `arxiv-search` | stdlib only | arXiv search, 3-second rate-limit + 429 retry |
+| `arxivsub-search` | stdlib + `ARXIVSUB_SKILL_KEY` | arXiv + top-venue joint search |
+| `dblp-bib` | stdlib only | DBLP BibTeX query, 1.5-second rate-limit |
+| `google-scholar` | stdlib only | Scholar metadata / citation formats |
+| `pdf-text` | `pdfplumber` (preferred) / `PyPDF2` (fallback) | Local PDF text extraction |
+## After installation
+1. **Restart Claude Code** (or run `/clear`) so the new MCP config takes effect.
+2. In a fresh session, verify: call `arxiv-search.search_arxiv` or `dblp-bib.search_dblp_bibtex` to confirm tool registration.
+3. If `ARXIVSUB_SKILL_KEY` is unset, `arxivsub-search` returns `missing_api_key`; configure via env var or `.env` as the warning suggests.
+## Notes
+- **Idempotent**: the script can run multiple times; existing hooks are not duplicated; existing `.mcp.json` is overwritten to stay in sync with `self/mcp/servers/`.
+- **Does not touch global settings**: writes only project-level `.claude/settings.json`; never touches `~/.claude/settings.json`.
+- **Other MCP entries**: the current implementation overwrites `.mcp.json`; if the user has non-`self/` MCP entries there, they must be merged manually. Use `python self/install.py --dry-run` to inspect the planned write.
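The `.mcp.json` step (item 3 under "What the script does") can be sketched as below. This is an illustration under stated assumptions, not the contents of `self/install.py`, which this diff does not show. It assumes one directory per server (`self/mcp/servers/<name>/server.py`), a top-level `mcpServers` object with `command` / `args` entries, and the current interpreter as the command. The function names are hypothetical.

```python
import json
import sys
from pathlib import Path


def build_mcp_config(servers_root: Path) -> dict:
    """Map each <servers_root>/<name>/server.py to a config entry keyed by <name>."""
    servers = {}
    for server_py in sorted(servers_root.glob("*/server.py")):
        servers[server_py.parent.name] = {
            "command": sys.executable,  # assumption: launch with the current Python
            # Absolute path, so nothing depends on ${workspaceFolder} expansion.
            "args": [str(server_py.resolve())],
        }
    return {"mcpServers": servers}


def write_mcp_config(servers_root: Path, target: Path, dry_run: bool = False) -> str:
    """Serialize the config; with dry_run=True, return the text without writing."""
    text = json.dumps(build_mcp_config(servers_root), indent=2) + "\n"
    if not dry_run:
        # Overwrites any existing file, as the Notes above say the script does.
        target.write_text(text, encoding="utf-8")
    return text
```

Because the file is regenerated from the directory scan on every run, repeated runs converge on the same output, which is consistent with the idempotency note; the cost is that hand-added entries are lost unless merged manually.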

package/dist/skills/init-mcp/skill.json ADDED

@@ -0,0 +1,5 @@
+{
+  "name": "init-mcp",
+  "description": "Use when setting up the plugin for the first time, installing dependencies, configuring MCP servers, or when the user says '初始化', 'init', 'setup', '装环境', '配置', 'install', 'configure', 'first time', '首次使用'. Handles both dependency marketplace installation and MCP server setup.",
+  "entry": "SKILL.md"
+}
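Each `skill.json` in this release mirrors the `name` and `description` of its sibling SKILL.md frontmatter and adds `"entry": "SKILL.md"`. The sketch below shows one way such a file could be derived. It is not `self/scripts/generate-skill-json.py`, whose source is not part of this diff. It assumes single-line `key: value` frontmatter with optional double-quoted values, and the function names are hypothetical.

```python
import json


def parse_frontmatter(skill_md: str) -> dict:
    """Parse single-line `key: value` pairs between the leading '---' fences."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md does not start with a frontmatter block")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line; ignored in this sketch
        value = value.strip()
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            # Double-quoted scalars with \" escapes are decoded as JSON strings.
            value = json.loads(value)
        fields[key.strip()] = value
    raise ValueError("frontmatter block is not closed")


def build_skill_json(skill_md: str) -> str:
    """Return skill.json text with the three fields seen in this release."""
    meta = parse_frontmatter(skill_md)
    payload = {
        "name": meta["name"],
        "description": meta["description"],
        "entry": "SKILL.md",
    }
    # ensure_ascii=False keeps non-ASCII trigger phrases readable in the output.
    return json.dumps(payload, indent=2, ensure_ascii=False) + "\n"
```

A full YAML parser would be the safer choice for multi-line or block-style descriptions; the hand-rolled parser here only covers the shape of frontmatter visible in this diff.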

package/dist/skills/model-escalation/SKILL.md ADDED

@@ -0,0 +1,93 @@
+---
+name: model-escalation
+description: "Use when repeated debugging or writing iterations fail, root cause is unclear, environment limits block progress, the user is still dissatisfied after multiple attempts, or the user says '疑难杂症', '卡住', '多轮迭代无解', '反复失败', '更强模型', '升级求助', 'stuck', 'escalate', 'stronger model'. Produces a hand-off summary suitable for a stronger model to pick up."
+version: 0.2.0
+---
+# Model Escalation
+## Role
+When a problem has resisted multiple solid attempts in the current session, or you can clearly perceive that the current model / environment / context cannot continue to make high-quality progress, your job is to **stop low-yield trial-and-error** and produce a high-quality help summary suitable for handoff to a stronger model.
+## Use this skill when
+- You have already done ≥ 2–3 rounds of substantive attempts; the problem is unresolved
+- The root cause is unclear; continuing edits will significantly raise the risk of accidental damage
+- Environment / permission / tool / context limits block verification
+- The user remains dissatisfied and you have no high-confidence improvement path
+- You can clearly describe the impasse but cannot reliably converge within this session
+## Core requirements
+- Be honest about the current state; do not exaggerate, do not cover up
+- Write only verified information; mark anything unverified explicitly as a "current hypothesis"
+- Separate goal, current state, attempts, results, and blocker
+- Preserve executable context: error messages, file paths, commands, I/O, blast radius
+- Do not push responsibility onto the user; your job is to make the handoff as easy to pick up as possible
+## Output format
+Output strictly in the structure below.
+### Recommend Escalating
+The current problem has entered a high-cost iteration zone; continued trial-and-error in this session has low yield. Forward the following summary to a stronger model to continue.
+### 1. Goal
+- 1–3 sentences on the desired end state
+- Acceptance criteria or the user's expected outcome
+### 2. Current state
+- Where you currently are
+- Actual behavior or error symptom
+- Files / modules / commands / data directly related to the issue
+### 3. Attempts so far
+List in chronological order; each item includes:
+1. What was done
+2. Observed result
+3. What this rules out, or why it still fails
+### 4. Current judgment
+- Confirmed facts
+- Current hypotheses
+- The actual blocker location
+### 5. Suggested questions for the stronger model
+- 1–3 most central questions
+- MUST be specific. NEVER "help me see what's wrong."
+### 6. Forwardable help prompt
+```text
+I am working on a problem; please continue from the information below and prioritize a minimum verifiable plan.
+Goal:
+...
+Current state:
+...
+Attempts so far:
+...
+Confirmed facts:
+...
+Current hypotheses:
+...
+Blocker:
+...
+Please focus on:
+1. ...
+2. ...
+3. ...
+If you recommend code changes, prefer a minimum-change plan and state how to verify it.
+```
+## Execution checklist before output
+1. Have you clearly separated facts from hypotheses?
+2. Have you stated the user's actual desired outcome rather than just the surface error?
+3. Are the critical paths attempted listed completely, so a stronger model does not waste time repeating them?
+4. Are the suggested questions specific enough to act on?
+5. Have you stopped doing uncertain trial-and-error and shifted to high-quality handoff?
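The forwardable help prompt in section 6 is a fixed template, so it can be filled from structured fields. The sketch below is one possible rendering, assuming a hypothetical `Handoff` record; the skill itself only prescribes the text layout and does not ship any such code.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    action: str      # what was done
    result: str      # observed result
    rules_out: str   # what this rules out, or why it still fails


@dataclass
class Handoff:
    goal: str
    current_state: str
    blocker: str
    attempts: list = field(default_factory=list)
    facts: list = field(default_factory=list)        # verified information only
    hypotheses: list = field(default_factory=list)   # unverified, kept separate
    questions: list = field(default_factory=list)


def _bullets(items) -> str:
    return "\n".join(f"- {item}" for item in items) or "- (none)"


def render_prompt(h: Handoff) -> str:
    """Fill the section 6 template; enforce the 1-3 question limit from section 5."""
    if not 1 <= len(h.questions) <= 3:
        raise ValueError("expected 1-3 specific questions")
    attempts = "\n".join(
        f"{i}. {a.action} -> {a.result} ({a.rules_out})"
        for i, a in enumerate(h.attempts, start=1)
    ) or "(none)"
    questions = "\n".join(f"{i}. {q}" for i, q in enumerate(h.questions, start=1))
    return "\n".join([
        "I am working on a problem; please continue from the information below "
        "and prioritize a minimum verifiable plan.",
        "Goal:", h.goal,
        "Current state:", h.current_state,
        "Attempts so far:", attempts,
        "Confirmed facts:", _bullets(h.facts),
        "Current hypotheses:", _bullets(h.hypotheses),
        "Blocker:", h.blocker,
        "Please focus on:", questions,
        "If you recommend code changes, prefer a minimum-change plan and state how to verify it.",
    ])
```

Keeping `facts` and `hypotheses` as separate fields makes the first checklist item (facts clearly separated from hypotheses) a property of the data structure rather than of the writer's discipline.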

package/dist/skills/model-escalation/skill.json ADDED

@@ -0,0 +1,5 @@
+{
+  "name": "model-escalation",
+  "description": "Use when repeated debugging or writing iterations fail, root cause is unclear, environment limits block progress, the user is still dissatisfied after multiple attempts, or the user says '疑难杂症', '卡住', '多轮迭代无解', '反复失败', '更强模型', '升级求助', 'stuck', 'escalate', 'stronger model'. Produces a hand-off summary suitable for a stronger model to pick up.",
+  "entry": "SKILL.md"
+}

package/dist/skills/paper-architecture-web-drawing/SKILL.md ADDED

@@ -0,0 +1,282 @@
+---
+name: paper-architecture-web-drawing
+description: "Use when the user wants a paper's abstract + method turned into a publication-ready architecture diagram, rendered as a single HTML file with inline SVG (and Python-generated SVG sub-figures for heatmaps / distributions / scatter / matrices). Triggers on: \"架构图\", \"结构图\", \"method figure\", \"overview figure\", \"pipeline diagram\", \"draw methodology\", \"网页绘图\". Enforces compactness numerics and a mandatory 10-round self-check loop. Do NOT use for line / bar / scatter data plots, posters, art, pure Mermaid sketches, or before the method is settled."
+version: 0.5.1
+---
+# Paper Architecture Web Drawing
+Input: the paper's Abstract + Method (or a `.tex` / `.md` / `.txt` paper file in the workspace).
+Output: a single HTML file with inline SVG rendering a top-conference-grade method figure; optionally an independent companion `.svg` of the same content.
+Not for: line / bar / scatter data plots, posters, illustrations, pure Mermaid sketches, or pre-method-settled drafts.
+## 0. Seven non-negotiable rules
+1. **White background + vector-first**: pure white background, inline SVG as the dominant medium. **Banned: gradients, shadows, glassmorphism, glow, decorative backgrounds.** Only MathJax / KaTeX may go online.
+2. **At least 3 real glyphs**: weights / distribution / tokens / cache / codebook / attention / scatter etc. must be drawn as matrix grids, heatmaps, histograms, boxplots, or scatter. **Text boxes only = fail.** Glyphs expressible in Python (see §2.7) **prefer Python**; do not hand-write complex heatmaps or curves.
+3. **Equations near their module**: LaTeX equations anchor as local labels on their corresponding module. **Do not pile them in a bottom strip.** Render via MathJax / KaTeX in HTML; never write ASCII pseudo-equations (`sum(...)/sum(...)`).
+4. **Font-size floor (top-conference density)**: main title ≥ 26 px, section title ≥ 22 px, module label ≥ 18 px, equation label ≥ 16 px, auxiliary ≥ 14 px. **When tight, delete words before shrinking type.** viewBox must give enough room — no 1060×330 strip that crushes the type.
+5. **English labels only**: no Chinese labels; no single-component description longer than 10 words.
+6. **Browser verification**: after writing the HTML, open it in a browser and take a screenshot. On Windows: `start "" "$(pwd -W)/path.html"` or `python -m http.server`.
+7. **Every arrow has paper grounding**: NEVER invent modules, losses, or feedback loops.
+## 1. Banned visual modes
+- **SmartArt / PowerPoint flowcharts**: equal-width rounded cards chained linearly, all nodes with the same corner radius and border.
+- **Dashboard / poster style**: right-side KPI column, result-card stack, marketing badges, statistic stickers, glow emphasis.
+- **Web-UI collage**: title bar + subtitle bar + content card patterns; pill-badge arrays.
+- **Big box + arrow + bottom legend** as the dominant frame.
+- **Top stage-label + bottom caption / problem statement / method summary**. The figure should be self-explanatory.
+- **bypass / feedback / dashed line crossing** through module titles, equations, badges, or result numbers.
+- **Small font + large whitespace** in exchange for content capacity.
+- High-saturation red / green / purple, or five or more salient light-color blocks at once.
+## 2. Workflow
+### 2.1 Read the paper
+- Look for `.tex` / `.md` / `.txt` in the workspace; read Abstract / Method / Approach / Overview / Framework.
+- If multiple candidate files exist, **confirm with the user first** — do not guess.
+- Only read context needed to reconstruct the main pipeline.
+### 2.2 Structure extraction (mandatory before drawing)
+For every key module fill out these 5 fields. If you cannot articulate one, **do not draw the module**:
+| Field | Content |
+|---|---|
+| `Name` | Short stable English module name, no slogans |
+| `Type` | input / encoder / alignment / retrieval / fusion / optimization / loss / output |
+| `Is novel?` | Is this a contribution that needs visual highlight? |
+| `Internal elements` | The objects / operations worth visualizing inside it (attention, MLP, codebook, feature map, cache update) |
+| `Topology role` | main-chain node / parallel branch / merge point / feedback point / training-only branch |
+### 2.3 Pick a layout family (in order of priority)
+1. Explicit feedback / iteration / alternating optimization / until convergence → **Loop / U-shape**
+2. train/infer or stage1/stage2 or coarse/fine or retrieve/generate → **Two-stage**
+3. ≥2 semantically independent branches merging into a shared main module → **Multi-branch with merge**
+4. Narrow column / single-column vertical reading → **Linear vertical**
+5. 3-6 serial stages, no strong feedback → **Linear horizontal** (default)
+6. Local complex substructure embedded in the main chain → **Hybrid composition**
+**Tie-breakers**: prefer the layout that preserves a strong visual center, gives the key mechanism panel enough area, minimizes arrow crossings and diagonal text overlaps, and forms a natural input-vs-output contrast.
+**Veto conditions** (any of these → switch layout):
+- No room for a main illustration; all modules forced into equal-weight small boxes.
+- Need >2 long cross-region connectors to convey the main flow.
+- Key equations forced into corners.
+- Output region and auxiliary text fighting for space.
+- Must shrink font or add whitespace to fit.
+### 2.4 Default blueprint
+**Input object → 2-3 mechanism panels → output object**
+- **Left**: tensor / weights / KV cache / tokens / feature grid as a visualized input object (NOT a text box).
+- **Middle**: each panel centers on a **main illustration**, not uniformly-sized small cards. Auxiliary objects (codebook / sensitivity map / objective / memory) anchor to their mechanism.
+- **Right**: the transformed object of the same kind, preferring **structural change** over KPI summary.
+- Input ↔ output **reuse the same graphic motif** to show state change (e.g. same-shape cache blocks before/after compression).
+- Highlight only 1–2 core contribution modules: same-family slightly heavier border / slightly darker fill / local bracket / callout. **Never** via high saturation or large badges.
+### 2.5 Palette (pick 1 of 5; one family across the whole figure)
+| Family | Use |
+|---|---|
+| **Blue-Gray** | Generic pipeline / system figure (default) |
+| **Warm Tones** | Moderate emphasis on novelty |
+| **Green-Cyan** | Generative / biological / light themes |
+| **Purple-Blue** | Theory / math-heavy |
+| **Monochrome** | Minimalist / B&W-print-friendly |
+**Color roles** (consistent across any chosen family): Primary background (normal modules), Secondary background (minor modules), Accent background (contribution modules), Input/Output background (lighter), Primary border, Accent border, **Arrow color: one dark color across the whole figure**, Main text, Secondary text. **Never mix families.**
+### 2.6 Typography
+- 2–3 stroke widths: main flow / secondary structure / coordinate auxiliaries.
+- Small / medium corner radius; avoid web-card-style large rounding.
+- **Sans for labels + serif for equations.**
+- Short stable module titles, no slogans; subtitles default to omitted.
+- Training-only branches: lighter fill + dashed arrows.
+- Long paths (bypass, feedback, training-only branches) follow the outer edge of regions; do not cross dense text inside sub-panels.
+- Compactness first: align, share edges, tighten via grouping. NEVER reduce font size to fit content.
+- Asymmetric layouts allowed; area reflects importance; never chase column parity.
+### 2.7 Abstract object → glyph (Python vs hand-written SVG)
+| Paper object | Glyph | Recommended source |
+|---|---|---|
+| `weights` / `kernels` / `parameters` matrix | matrix grid + outlier column / point highlight | matplotlib `imshow` |
+| `distribution` / `density` / histogram | histogram / KDE curve | matplotlib `hist` + seaborn `kdeplot` |
+| `outliers` / `IQR` / `boxplot` | boxplot + Q1 / Q3 / whisker annotations | matplotlib `boxplot` |
+| `scatter` / two-variable relation / error comparison | scatter + diagonal + highlight region | matplotlib `scatter` |
+| `attention` / `similarity` / `heatmap` | 2D heatmap + colorbar | matplotlib `imshow` (cmap viridis / coolwarm) |
+| `eigenvectors` / `subspace` / `basis` | disk + direction arrows / axes | matplotlib `quiver` or hand-SVG |
+| `quantization` / `clustering` / `codebook` | bin partition lines / cluster centers / lookup blocks | matplotlib + `axvline` for centroids |
+| `tokens` / `patches` / `cache blocks` | brick array + bit-width tag | hand-written SVG (structured) |
+| `loss landscape` / 3D surface | contour / pcolormesh | matplotlib |
+| `loss` / `objective` / `constraint` | short equation chip (attached to module) | hand-written SVG + MathJax |
+| module boxes, arrows, formula chips, brackets | box / line / label | **hand-written SVG** (Python is not elegant here) |
+**Rule**: if the glyph carries numerical / distribution / geometric content → Python-generated SVG. If the glyph is a structured layout (box, arrow, equation slot) → hand-written SVG.
+### 2.7.1 Python-generated SVG sub-figures (matplotlib)
+Place a `figures/<paper>_components.py` script; each subplot saves to a separate `.svg`; the main HTML embeds them inline or via `<img src=...svg>`. **Prefer inline** (single-file delivery, second-pass editable).
+Minimum skeleton:
+```python
+import matplotlib
+matplotlib.use("Agg")
+import matplotlib.pyplot as plt
+import numpy as np
+plt.rcParams.update({
+    "font.family": "DejaVu Sans",
+    "font.size": 14,
+    "axes.linewidth": 1.0,
+    "axes.spines.top": False,
+    "axes.spines.right": False,
+    "svg.fonttype": "none",   # keep text as <text>, do not outline
+})
+def save(fig, path):
+    fig.savefig(path, format="svg", bbox_inches="tight",
+                pad_inches=0.05, transparent=True)
+    plt.close(fig)
+```
+**Hard rules**:
+- `svg.fonttype="none"`: text stays as editable `<text>`, not paths.
+- `transparent=True` + main HTML white background, avoiding double background layers.
+- Per-subplot font size ≥ 12 (after embedding scale, still readable).
+- Use only your palette family (see §2.5); never matplotlib's default `tab:blue`.
+- One figure per function; never `plt.show()` in the script.
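As a usage sketch for the skeleton and hard rules above, the script below draws one heatmap sub-figure and saves it as SVG. The colormap endpoints, the random data, and the name `draw_attention_heatmap` are assumptions for illustration; `save` is repeated so the block runs on its own.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

plt.rcParams.update({
    "font.family": "DejaVu Sans",
    "font.size": 14,
    "svg.fonttype": "none",   # keep text as <text>, do not outline
})

# Placeholder two-stop colormap standing in for the chosen palette family.
FAMILY_CMAP = LinearSegmentedColormap.from_list("family", ["#f3f5f8", "#2f4f7f"])


def save(fig, path):
    fig.savefig(path, format="svg", bbox_inches="tight",
                pad_inches=0.05, transparent=True)
    plt.close(fig)


def draw_attention_heatmap(path, size=8, seed=0):
    """One figure per function: a synthetic attention-style heatmap."""
    rng = np.random.default_rng(seed)
    scores = rng.random((size, size))          # stand-in data, not paper results
    fig, ax = plt.subplots(figsize=(3, 3))
    image = ax.imshow(scores, cmap=FAMILY_CMAP)
    fig.colorbar(image, ax=ax, fraction=0.046, pad=0.04)
    ax.set_xlabel("key")
    ax.set_ylabel("query")
    save(fig, path)
```

Calling `draw_attention_heatmap("comp_a.svg")` writes one file; with `svg.fonttype` set to `"none"` the axis labels should remain editable `<text>` elements.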
+### 2.7.2 Inline embedding
+After Python produces `comp_a.svg`, in the main HTML:
+```html
+<g transform="translate(120, 80)">
+  <!-- inline-svg-include: comp_a.svg -->
+</g>
+```
+Two delivery paths:
+- **Copy inline**: paste `comp_a.svg`'s inner `<g>...</g>` into the main SVG at the corresponding spot; drop the outer `<svg>` header.
+- **Object reference**: in main HTML, `<image href="comp_a.svg" x=.. y=.. width=..>` or `<foreignObject>`. **MUST** verify rendering in the browser before delivery.
+After inline embedding, **manually adjust size / position**: matplotlib's default viewBox differs from the main figure's coordinate system; wrap in a `<g transform="translate(x,y) scale(s)">`.
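The copy-inline path can be sketched as a small helper that drops the outer `<svg>` element and wraps the remaining markup in a positioned, scaled `<g>`. `inline_fragment` and its parameters are hypothetical names, and the sketch assumes the sub-figure's viewBox starts at (0, 0).

```python
import xml.etree.ElementTree as ET


def inline_fragment(svg_text, x, y, target_width):
    """Wrap the inner markup of a standalone SVG in a translated, scaled <g>.

    Assumes the viewBox origin is (0, 0); target_width is in main-figure units.
    """
    root = ET.fromstring(svg_text)
    view_box = root.attrib["viewBox"].replace(",", " ").split()
    width = float(view_box[2])
    scale = target_width / width

    start = svg_text.index("<svg")
    open_end = svg_text.index(">", start) + 1   # end of the opening <svg ...> tag
    close = svg_text.rindex("</svg>")
    inner = svg_text[open_end:close].strip()    # inner markup, copied verbatim

    return (f'<g transform="translate({x},{y}) scale({scale:.4f})">\n'
            f"{inner}\n</g>")
```

The returned string can replace the `inline-svg-include` placeholder comment. When several sub-figures are inlined into one document, element ids may collide, so the browser check is still needed.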
+### 2.8 Equation placement
+- Every key equation has a dedicated slot (local white-background equation slot / module-internal equation strip / anchor aligned to glyph). **Never a floating web-sticker.**
+- Long equations split into two short labels or shorter equivalent forms; never let a single long line crush a panel.
+- objective / update / normalization / threshold equations live inside their module's slot, not piled in a unified bottom strip.
+- The equation chip sits ≤ 15 px below its main illustration; no big air gap between them.
+### 2.9 Compactness numerics (hard targets)
+Most "doesn't look like a method figure" failures come from **loose layout**. Enforce these targets (upper bounds unless marked ≥ or given as a range):
+| Metric | Target | Meaning |
+|---|---|---|
+| Whitespace ratio inside a panel | ≤ 15% | Title + glyphs + equations + labels cover ≥ 85% of panel area |
+| Main illustration's share of panel visible area | ≥ 65% | "The figure dominates; text is secondary" |
+| Cross-panel horizontal gap | 20-40 px (viewBox units) | >40 = loose |
+| Cross-panel vertical gap | 15-30 px | between adjacent rows |
+| Title bottom edge → first panel | ≤ 30 px | No top-air |
+| viewBox aspect ratio (height / width) | 0.45 - 0.65 (single-row 4-panel) | Top-venue `figure*` ≈ 2:1 (width : height) |
+| Panel top padding (above title) | ≤ 16 px | Title touches edge |
+| Panel bottom padding (below last element) | ≤ 16 px | No large dead space |
+| Distance: glyph ↔ shape / label | ≤ 10 px | Tightly aligned, not floating |
+| Nested `<rect>` levels in one panel | ≤ 2 | 3-level nesting = card bloat |
+**Panel-equal-height trap**: 4 panels in one row need not be equal height. Input/Output typically 200–300 px shorter than mechanism panels. **Forcing equal height = manufacturing whitespace.**
+**How to measure**: after rendering, use browser dev tools. Or temporarily draw `<rect>` boundary markers inside the SVG to eyeball dead-space ratios.
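If element bounding boxes are available (for example, copied from dev tools), a few of the numeric targets can be checked mechanically. This is a rough sketch under stated assumptions: boxes are `(x, y, width, height)` tuples in integer viewBox units, aspect ratio is taken as height / width, and bounding boxes overestimate real ink coverage. `whitespace_ratio` and `check_row` are hypothetical helpers.

```python
def whitespace_ratio(panel, boxes):
    """Fraction of the panel area not covered by any content box."""
    px, py, pw, ph = panel
    covered = set()
    for bx, by, bw, bh in boxes:
        # Clip each box to the panel, then mark its unit cells as covered.
        x0, x1 = max(bx, px), min(bx + bw, px + pw)
        y0, y1 = max(by, py), min(by + bh, py + ph)
        covered.update((x, y) for x in range(x0, x1) for y in range(y0, y1))
    return 1 - len(covered) / (pw * ph)


def check_row(view_w, view_h, panels, contents):
    """Check a single-row layout against three of the targets above."""
    problems = []
    aspect = view_h / view_w
    if not 0.45 <= aspect <= 0.65:
        problems.append(f"aspect ratio {aspect:.2f} outside 0.45-0.65")

    ordered = sorted(zip(panels, contents), key=lambda pc: pc[0][0])
    for i, (panel, boxes) in enumerate(ordered):
        ratio = whitespace_ratio(panel, boxes)
        if ratio > 0.15:
            problems.append(f"panel {i}: whitespace {ratio:.0%} > 15%")

    for (left, _), (right, _) in zip(ordered, ordered[1:]):
        gap = right[0] - (left[0] + left[2])   # next panel's x minus this panel's right edge
        if not 20 <= gap <= 40:
            problems.append(f"gap {gap} px outside 20-40")
    return problems
```

An empty list from `check_row` only says these three targets are met; the remaining rows of the table still need a visual check.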
+### 2.10 Mandatory ≥ 10-round self-check loop
+After writing the HTML, **never deliver directly**. Run 10 iterations from the table below; each round:
+1. Render PNG with `chrome --headless --screenshot`.
+2. Open the PNG; write down "this round's focus dimension and 3 most non-compliant spots" (concrete only; "looks fine" is banned).
+3. Fix HTML / Python / regenerate the sub-figure.
+4. Re-render.
+| Round | Focus | Must check |
+|---|---|---|
+| 1 | **Topology fidelity** | Aligned to paper's method? No missing / extra modules, arrows, losses, feedback? |
+| 2 | **Compactness** | Measure each panel's whitespace; tighten one by one; trim viewBox |
+| 3 | **Font sizes** | Spot-check every text against 26/22/18/16/14 floor; if below floor, delete words, do not shrink |
+| 4 | **Color discipline** | Single palette family? No gradients / shadows / glass / glow / decorative backgrounds? |
+| 5 | **Equation anchoring** | LaTeX truly rendered via MathJax? Chip glued to its module? |
+| 6 | **Arrow routing** | Main flow / bypass / feedback do not cross text / equations / badges |
+| 7 | **Python sub-figures** | `svg.fonttype="none"`, palette consistent with main, aspect matched to panel |
+| 8 | **Visual hierarchy** | Main reading path obvious at first glance? 1–2 contribution panels highlighted? Non-contribution modules restrained? |
+| 9 | **Anti-pattern scan** | Does it resemble SmartArt / PowerPoint / dashboard / web UI / poster? |
+| 10 | **Paper context** | Drop into an ICML/CVPR two-column layout — does it feel native? Cover the title — does it still feel like this paper? |
+**Never skip a round.** If a round finds nothing because earlier rounds already fixed it, explicitly record "this round 0 issues" — do not silently skip.
+**Round 11+ optional**: only when a previous fix triggered a new violation (e.g. tightening clipped some text); add rounds until stable.
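The loop above can be kept honest with a small round log. The sketch below builds the screenshot command and records each round, storing an explicit zero when a round finds nothing. The function names, the default window size, and the binary name `chrome` are assumptions; the executable is often `google-chrome` or `chromium` depending on the platform.

```python
ROUND_FOCUS = [
    "Topology fidelity", "Compactness", "Font sizes", "Color discipline",
    "Equation anchoring", "Arrow routing", "Python sub-figures",
    "Visual hierarchy", "Anti-pattern scan", "Paper context",
]


def screenshot_command(html_path, png_path, width=1600, height=900,
                       browser="chrome"):
    """Build the headless render command as an argument list."""
    return [browser, "--headless", f"--screenshot={png_path}",
            f"--window-size={width},{height}", html_path]


def record_round(log, round_no, png_path, notes):
    """Append one round; an empty notes list is stored as an explicit zero."""
    if round_no <= len(ROUND_FOCUS):
        focus = ROUND_FOCUS[round_no - 1]
    else:
        focus = "Stability"   # rounds 11+ exist only to re-check earlier fixes
    log.append({"round": round_no, "focus": focus, "png": png_path,
                "notes": list(notes) or ["this round 0 issues"]})


def loop_complete(log):
    """True once rounds 1..10 are all present, in order, with none skipped."""
    return [entry["round"] for entry in log][:10] == list(range(1, 11))
```

The command list can be passed to `subprocess.run(cmd, check=True)`; delivery should wait until `loop_complete(log)` is true.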
+### 2.11 Quick-rollback conditions
+During iteration, if **any** of the following holds, roll back to §2.4 default blueprint and redraw; do not keep tuning:
+- 3 rounds of compactness work still cannot hit the §2.9 numbers.
+- The main illustration is essentially stacked text, not graphics.
+- viewBox shrinking always leaves large whitespace → usually caused by forced panel-equal-height (§2.9 trap).
+## 3. File locations
+- If the repo has `figures/` or `dist/figures/` → output there.
+- Otherwise put the files next to the paper source.
+- Default artifacts:
+  - `method_architecture.html` (main figure, inline SVG)
+  - `method_architecture.svg` (equivalent standalone SVG, optional)
+  - `<paper>_components.py` (script generating Python sub-figures)
+  - `comp_*.svg` (Python sub-figure source files, for later editing)
+- The main HTML must NOT load remote resources (a MathJax / KaTeX CDN script is the only exception).
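The location rule can be sketched as a short helper. `figure_output_dir` is a hypothetical name, and preferring `figures/` over `dist/figures/` when both exist is an assumption; the text above does not rank them.

```python
from pathlib import Path


def figure_output_dir(repo_root, paper_source):
    """Return figures/ or dist/figures/ if present, else the paper's directory."""
    root = Path(repo_root)
    for candidate in (root / "figures", root / "dist" / "figures"):
        if candidate.is_dir():
            return candidate
    return Path(paper_source).parent   # fall back: next to the paper source
```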
+## 4. Completion checklist
+Confirm every item. Any miss → continue iterating:
+- [ ] Single HTML file opens directly; pure white background; inline SVG dominates
+- [ ] ≥ 3 real glyphs (heatmap / histogram / boxplot / scatter / matrix), with **≥ 1 Python-generated** if numerical
+- [ ] Each main mechanism panel has a main illustration, NOT just title + equation + cards
+- [ ] Python sub-figures use `svg.fonttype="none"`; text is editable
+- [ ] Python sub-figures use the same palette family as the main figure; no matplotlib defaults
+- [ ] **Compactness (§2.9)**: panel whitespace ≤ 15%, main illustration ≥ 65% of panel, cross-panel gap 20–40 px
+- [ ] **Panels NOT forced equal-height**: Input/Output 200–300 px shorter than mechanism panels
+- [ ] **viewBox aspect ratio 0.45–0.65** (single-row 4-panel); top/bottom padding ≤ 30 px
+- [ ] Equations sit in dedicated slots, aligned with their module; not floating; not crushed by codebook / arrow / badge
+- [ ] LaTeX equations render correctly (MathJax / KaTeX); no ASCII pseudo-equations
+- [ ] Font sizes meet the floor: 26 / 22 / 18 / 16 / 14
+- [ ] All English labels; single-component description ≤ 10 words
+- [ ] 1–2 core contribution modules highlighted (restrained: no high saturation, no big badges)
+- [ ] Palette: single family, 3–5 shades; **no gradients / shadows / glow / glass**
+- [ ] Main-flow arrows do not cross text / equations / badges; long paths follow outer edges
+- [ ] Input ↔ Output reuse a graphic motif or form a sensible before/after contrast
+- [ ] No top stage-label / phase-label; no bottom caption / problem statement / footer text
+- [ ] No right-side KPI column / result-card stack / dashboard / poster summary strip
+- [ ] Does not resemble SmartArt / PowerPoint / web component / product banner
+- [ ] **≥ 10 rounds of self-check completed (§2.10)**, with a PNG screenshot and 3 spot notes per round
+- [ ] Every arrow maps to a real data-flow / control-flow / supervision signal
+- [ ] No invented modules / losses / feedback loops absent from the paper
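A few checklist items are mechanical enough to scan for before the visual rounds. The sketch below flags gradient and filter elements and explicit `font-size` values under the floor. `scan_svg` is a hypothetical helper; it assumes sizes are written in px or unitless, and it does not see sizes set through the CSS `font` shorthand.

```python
import re

BANNED_TAGS = ("linearGradient", "radialGradient", "filter")
FONT_SIZE = re.compile(r'font-size\s*[:=]\s*"?\s*([0-9]+(?:\.[0-9]+)?)')


def scan_svg(svg_text, font_floor=14):
    """Return findings for banned elements and undersized explicit font sizes."""
    findings = []
    for tag in BANNED_TAGS:
        if re.search(rf"<{tag}\b", svg_text):
            findings.append(f"banned element <{tag}>")
    for size in FONT_SIZE.findall(svg_text):
        if float(size) < font_floor:
            findings.append(f"font-size {size} below floor {font_floor}")
    return findings
```

An empty result does not replace the checklist; it only rules out the most obvious violations.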
+## 5. Delivery contract
+1. Single HTML file as the main figure (inline-SVG-dominant).
+2. `<paper>_components.py`: script generating all Python numerical / geometric sub-figures.
+3. `comp_*.svg`: Python sub-figure source files (so the user can edit later).
+4. Optional: standalone `method_architecture.svg`.
+5. Brief notes: main flow, auxiliary branches, Python sub-figure manifest, source files referenced.
+6. When ambiguity remains, **explicitly flag the undefined modules**; do not invent.

package/dist/skills/paper-architecture-web-drawing/skill.json ADDED

@@ -0,0 +1,5 @@
+{
+  "name": "paper-architecture-web-drawing",
+  "description": "Use when the user wants a paper's abstract + method turned into a publication-ready architecture diagram, rendered as a single HTML file with inline SVG (and Python-generated SVG sub-figures for heatmaps / distributions / scatter / matrices). Triggers on: \"架构图\", \"结构图\", \"method figure\", \"overview figure\", \"pipeline diagram\", \"draw methodology\", \"网页绘图\". Enforces compactness numerics and a mandatory 10-round self-check loop. Do NOT use for line / bar / scatter data plots, posters, art, pure Mermaid sketches, or before the method is settled.",
+  "entry": "SKILL.md"
+}