medsci-skills 4.1.0 → 4.7.0

Files changed (274)

package/README.md CHANGED

@@ -2,14 +2,16 @@
 # MedSci Skills
-**44 skills that actually work.** Built by a physician-researcher, tested on real publications.
+**45 skills that actually work.** Built by a physician-researcher, tested on real publications.
-*MedSci Skills is a submission-grade clinical manuscript workflow, not a generic biomedical skill catalog. It competes on clinical submission reliability, not skill count.*
+*MedSci Skills is a submission-grade clinical manuscript workflow, not a generic biomedical skill catalog. Its moat is the compliance layer — 36 reporting guidelines and risk-of-bias tools, reference/citation verification, and deterministic integrity gates, before peer review sees the manuscript. It competes on clinical submission reliability, not skill count.*
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Release](https://img.shields.io/github/v/release/Aperivue/medsci-skills?style=flat-square&color=blue)](https://github.com/Aperivue/medsci-skills/releases/latest)
 [![CI](https://img.shields.io/github/actions/workflow/status/Aperivue/medsci-skills/validate.yml?branch=main&style=flat-square&label=CI)](https://github.com/Aperivue/medsci-skills/actions/workflows/validate.yml)
-![Skills](https://img.shields.io/badge/Skills-44-brightgreen?style=flat-square)
+![Skills](https://img.shields.io/badge/Skills-45-brightgreen?style=flat-square)
+[![npm](https://img.shields.io/npm/v/medsci-skills?style=flat-square&label=npm&color=cb3837)](https://www.npmjs.com/package/medsci-skills)
+[![good first issues](https://img.shields.io/github/issues/Aperivue/medsci-skills/good%20first%20issue?style=flat-square&label=good%20first%20issues&color=7057ff)](https://github.com/Aperivue/medsci-skills/contribute)
 [![Agent Skills](https://img.shields.io/badge/Agent_Skills-standard-blue?style=flat-square)](https://agentskills.io)
 [![Claude Code](https://img.shields.io/badge/Claude_Code-supported-success?style=flat-square)](docs/host_compatibility.md)
@@ -18,6 +20,7 @@
 [![GitHub Copilot](https://img.shields.io/badge/GitHub_Copilot-supported-success?style=flat-square)](docs/host_compatibility.md)
 [![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.20155321-blue?style=flat-square)](https://doi.org/10.5281/zenodo.20155321)
+[![arXiv](https://img.shields.io/badge/arXiv-2606.09500-b31b1b?style=flat-square)](https://arxiv.org/abs/2606.09500)
 [![Citation](https://img.shields.io/badge/Cite-CITATION.cff-blue?style=flat-square)](CITATION.cff)
 ![Built by](https://img.shields.io/badge/Built_by-Physician--Researcher-blue?style=flat-square)
@@ -35,10 +38,31 @@
 ---
+## What is MedSci Skills?
+MedSci Skills is an open-source Claude Code skill collection for **clinical
+manuscript preparation**. It helps physician-researchers and biomedical
+investigators move from literature search, study design, statistics, and figures to
+reporting-guideline compliance, citation/reference auditing, numerical-consistency
+checks, and response-to-reviewer workflows — combining agentic writing with
+**deterministic integrity gates** for submission-grade biomedical research. It is
+**not** a diagnostic tool, an autonomous author, or a general AI-scientist platform;
+every output requires human-expert verification. New here? See the
+[3 workflows below](#start-here-3-workflows), the [FAQ](docs/faq.md), and the
+[scope boundary](ROADMAP.md#not-planned--explicitly-out-of-scope).
+---
 ## Quick Start
 **No terminal?** Use the classroom installer ZIP — download, unzip, double-click the installer, then restart your agent app (see [Installation](#installation)).
+**Have a terminal?** Fastest path — one command, nothing to clone:
+```bash
+npx medsci-skills install        # copies every skill into your agent's folder
+```
 **Have git?** Install every skill in three commands:
 ```bash
@@ -81,10 +105,38 @@ All eight plugins share the same repository source, so this groups and enables s
 **Want just one capability?** Two skills are also published as focused standalone repos (generated mirrors; this repo stays the source of truth), each installable on its own with `/plugin marketplace add Aperivue/<repo>`:
 - [`Aperivue/verify-refs`](https://github.com/Aperivue/verify-refs) — catch fabricated/mismatched citations (PubMed + CrossRef).
-- [`Aperivue/check-reporting`](https://github.com/Aperivue/check-reporting) — audit a manuscript against 32 EQUATOR reporting guidelines.
+- [`Aperivue/check-reporting`](https://github.com/Aperivue/check-reporting) — audit a manuscript against the bundled EQUATOR reporting guidelines and risk-of-bias tools.
 ---
+## Start here: 3 workflows
+New users don't need all the skills at once. Most work starts as one of three
+workflows. Each runs through `/orchestrate` or by invoking the named skills in
+order; all outputs require human-expert review.
+**Workflow A — Manuscript pre-submission audit.** *Use when* a manuscript is nearly
+ready and you want it checked before a reviewer sees it. *Skills:* `/self-review` →
+`/check-reporting` → `/verify-refs` → `/sync-submission`. *In:* your manuscript
+(+ `refs.bib`, tables/figures). *Out:* anticipated reviewer comments, an item-by-item
+reporting-guideline audit, a citation-integrity report, and a submission-package
+drift check. *Safety:* it flags issues; you fix and verify them.
+**Workflow B — Data to manuscript package.** *Use when* you have a cleaned dataset
+and need a full draft. *Skills:* `/clean-data` → `/analyze-stats` → `/make-figures` →
+`/write-paper` → `/check-reporting` → `/find-journal`. *In:* a cleaned CSV/parquet
++ a research question. *Out:* reproducible analysis code, publication-ready figures,
+an IMRaD draft, a reporting checklist, and a journal shortlist. *Safety:* statistics
+and claims must be verified against your data; the toolkit never fabricates numbers
+or references.
+**Workflow C — Systematic review / meta-analysis.** *Use when* you are running an
+SR/MA. *Skills:* `/meta-analysis` (with `/search-lit`, `/make-figures`,
+`/check-reporting`). *In:* a research question + search strategy. *Out:* PROSPERO-style
+protocol scaffolding, screening/extraction structure, PRISMA-consistent counts and
+diagram, pooled-estimate figures, and a manuscript draft. *Safety:* screening and
+extraction decisions stay with the human review team.
 ## Live Demos: Three Study Types, Three Full Pipelines
 Three public datasets. Three study types. Each produces a complete manuscript, publication-ready figures, and a reporting compliance audit.
@@ -215,6 +267,43 @@ The E2E pipeline (`orchestrate --e2e`) produces everything up to `qc/`. The `sub
 ## What's New
+**v4.7** is the **self-update foundation** — physician-researchers stay current without GitHub, git, or a terminal. Additive and backward-compatible; still 45 skills / 36 guidelines / 30 detectors:
+- **Transactional, crash-recoverable installer.** Each install runs through a durable journal state machine recovered on the next run (roll back / forward-clean / fail-closed), with per-target SHA-256 inventories — your modified or third-party skills are backed up and never clobbered or auto-deleted.
+- **One-click self-updater** (`~/.medsci-skills/updater/`, `install.py --check-update`). Verifies the download against the github.com API digest and **never `extractall()`s** (per-entry rejection of traversal / symlink / duplicate / zip-bomb + an allowlist & per-file hash). The release pipeline injects a verified `provenance.json`, attests build provenance, runs on a protected `release` environment, and verifies each ZIP round-trips through the updater's own safe-extract before publishing.
+- **Opt-in update notice (off by default):** `install.py --enable-update-notify` shows a one-line "update available" message at Claude Code session start — no telemetry, reads nothing about your session, installs nothing. `--disable-update-notify` / `MEDSCI_NO_UPDATE_CHECK=1` turn it off. *(Honest scope: the digest/attestation detect transport tampering, not a compromised publisher account — see `SECURITY.md`.)*
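Reviewer note: the v4.7 entry above says the updater never calls `extractall()` and instead rejects entries one by one. The sketch below shows what such a per-entry check could look like using only the Python standard library. It is an illustration, not the package's updater code: the function names `check_entries` and `safe_extract`, the two size limits, and the allowlist shape are all assumptions, and the per-file hash step mentioned in the README is left out.

```python
import stat
import zipfile
from pathlib import Path, PurePosixPath

# Hypothetical limits; the README does not state the real updater's thresholds.
MAX_TOTAL_BYTES = 50 * 1024 * 1024
MAX_RATIO = 100


def check_entries(zf: zipfile.ZipFile, allowlist: set[str]) -> list[str]:
    """Validate every entry before anything is written.

    Returns the file names that are safe to extract, or raises ValueError
    on the first entry that fails a check.
    """
    seen: set[str] = set()
    files: list[str] = []
    total = 0
    for info in zf.infolist():
        name = info.filename
        # Traversal: absolute paths, drive letters, backslashes, or "..".
        parts = PurePosixPath(name).parts
        if name.startswith("/") or ":" in name or "\\" in name or ".." in parts:
            raise ValueError(f"path traversal: {name!r}")
        # Symlink: the Unix file mode sits in the high 16 bits of external_attr.
        if stat.S_ISLNK(info.external_attr >> 16):
            raise ValueError(f"symlink entry: {name!r}")
        if name in seen:
            raise ValueError(f"duplicate entry: {name!r}")
        seen.add(name)
        if info.is_dir():
            continue
        if name not in allowlist:
            raise ValueError(f"not on allowlist: {name!r}")
        # Zip bomb: cap the declared total size and the per-entry ratio.
        total += info.file_size
        too_dense = info.file_size > MAX_RATIO * max(info.compress_size, 1)
        if total > MAX_TOTAL_BYTES or too_dense:
            raise ValueError(f"suspicious size: {name!r}")
        files.append(name)
    return files


def safe_extract(zf: zipfile.ZipFile, dest: Path, allowlist: set[str]) -> list[Path]:
    """Write only validated entries, one at a time, with a bounded read."""
    written: list[Path] = []
    for name in check_entries(zf, allowlist):
        target = dest / name
        target.parent.mkdir(parents=True, exist_ok=True)
        with zf.open(name) as src:
            # Declared sizes can lie, so never read more than the budget.
            data = src.read(MAX_TOTAL_BYTES + 1)
        if len(data) > MAX_TOTAL_BYTES:
            raise ValueError(f"entry exceeds size budget: {name!r}")
        target.write_bytes(data)
        written.append(target)
    return written
```

Validating the whole archive before the first write means a bad entry late in the ZIP cannot leave a half-extracted directory behind, which fits the fail-closed behaviour the changelog describes.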
+**v4.6** is a maintainability, governance, and review-depth release — still 45 skills / 36 guidelines; analysis-integrity detectors **28 → 30**, domain probes 11 → 12:
+- **Fairness / equity / subgroup-performance probe (EQ0–EQ6)** for AI/prediction/diagnostic studies that claim cross-population performance, plus two new detectors: an **AI-disclosure + data/code-availability** check (`/sync-submission`) and a **structured-summary-box conformance** check (`/academic-aio`).
+- **Governance + answer-engine layer:** `ROADMAP.md`, `MAINTAINERS.md`, `SECURITY.md`, a maintainer workflow + release checklist, an AEO/GEO `docs/faq.md`, a "Start here: 3 workflows" + "Validation status" section in this README, and a new `maturity` field (official / experimental / community) on every skill.
+- **Token diet (pilot):** `write-paper` Phase 7 integrity audits moved to a load-on-demand reference (~2,559 tokens saved per invocation). Positioning now leads with the compliance moat rather than skill count.
+**v4.5** deepens the review + submission surface with no new skill or reporting-guideline count (still 45 skills / 36 guidelines); analysis-integrity detectors **27 → 28**:
+- **`/clean-data` + `/analyze-stats` — reverse-coded-item / negative-alpha detector.** A multi-item Likert scale with a negatively-worded item must be recoded `(min+max) − x` before the scale total or Cronbach's alpha is computed; left un-recoded, the item correlates negatively with the rest of the scale and alpha collapses (often negative). A negative alpha is a coding bug, not a "multidimensional construct." New stdlib-only `check_reverse_coding.py` returns `REVERSE_CODING_LIKELY` / `REVERSE_CODING_SUSPECT` / `OK` from per-item item-rest correlations + raw alpha; the Likert summary template gains a `--reverse-items` recode flag.
+- **`/peer-review` + `/self-review` — SR/MA + DTA + prediction-model probe batch.** `sr_ma.md` **P12** risk-of-bias table row-sum ↔ traffic-light figure-matrix reconciliation and **P13** included-study ↔ reference-list completeness; `diagnostic_accuracy.md` **D7** index-test-as-enrollment-criterion circularity; `clinical_prediction_model.md` **CP5** intended-use horizon leakage and **CP6** development/CV vs held-out/external validation-nomenclature conflation. Vendored byte-identical into `/self-review`.
+- **`/sync-submission` — embedded absolute-path leak scan.** A `word/*.xml` attribute (e.g. a pandoc-embedded image's `<pic:cNvPr descr="…">`) carrying an absolute home-dir path (`/Users/…`, `/home/…`) is a username leak invisible to a rendered-text scan; now flagged as `docx_embedded_abs_path` under `check_asset_anonymization.py`.
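Reviewer note: the v4.5 entry above describes the reverse-coded-item problem in words: recode with `(min+max) − x`, otherwise the item correlates negatively with the rest of the scale and Cronbach's alpha collapses. The following minimal sketch reproduces that effect on a made-up six-respondent, three-item dataset. It is not the bundled `check_reverse_coding.py`; the function names and the toy data are assumptions, and no decision thresholds are implied.

```python
from math import sqrt
from statistics import mean, variance


def recode(x: float, lo: float, hi: float) -> float:
    """Reverse-code one response on a scale running from lo to hi."""
    return (lo + hi) - x


def cronbach_alpha(items: list[list[float]]) -> float:
    """Raw alpha; items[i][j] is respondent j's answer to item i."""
    k = len(items)
    totals = [sum(col) for col in zip(*items)]
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


def pearson(a: list[float], b: list[float]) -> float:
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def item_rest_correlations(items: list[list[float]]) -> list[float]:
    """Correlate each item with the sum of all the other items."""
    result = []
    for i, item in enumerate(items):
        rest = [sum(col) - col[i] for col in zip(*items)]
        result.append(pearson(item, rest))
    return result


# Toy 1-5 Likert data (invented for illustration); item 3 is negatively worded.
item1 = [1, 2, 3, 4, 5, 4]
item2 = [2, 2, 3, 5, 5, 4]
item3_raw = [5, 4, 3, 1, 1, 2]

raw = [item1, item2, item3_raw]
fixed = [item1, item2, [recode(x, 1, 5) for x in item3_raw]]

print("alpha before recoding:", round(cronbach_alpha(raw), 2))
print("item-rest r before:   ", [round(r, 2) for r in item_rest_correlations(raw)])
print("alpha after recoding: ", round(cronbach_alpha(fixed), 2))
```

On this toy data the un-recoded scale yields a negative alpha and a negative item-rest correlation for item 3, while the recoded scale yields a high positive alpha, which is the signature the changelog calls a coding bug rather than a multidimensional construct.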
+**v4.4** adds reviewer/analysis depth with no new skill or reporting-guideline count (still 45 skills / 36 guidelines / 27 detectors):
+- **`/author-strategy` — trajectory-archetype classification (optional).** Classifies a queried author's PubMed trajectory into abstract career archetypes (A1 infrastructure builder, A2 methodology rule-maker, A3 clinical→AI hybrid, A4 SR/MA volume engine, A5 large-consortium participation, A6 device/technique depth, + a computed composite) as an **explainable, multi-label, confidence-scored heuristic — not an objective verdict**. The rubric is a single canonical YAML (the narrative doc is generated from it); scores exclude `unavailable` signals (h-index/citation/venue-tier → `[VERIFY]`, never fabricated); a **disambiguation gate** binds an approved `corpus_manifest.json` to the CSV (csv + PMID-set hashes) so a surname alone never classifies, and target-author attribution never borrows a co-author's ORCID/affiliation.
+- **`/peer-review` + `/self-review` — Image-Synthesis / cross-modality probe (IS1–IS4)** for studies that synthesize one imaging modality from another and claim the output carries the target's information, plus a reviewer-side reference-integrity spot-check.
+- **`/verify-refs` — OpenAlex tertiary index** recovers conference-proceedings / non-DOI citations (NeurIPS/ICLR/ACL) that fall through PubMed and CrossRef, the free analogue of a portal's second index.
+**v4.3** hardens the **cross-sectional / observational cohort** review surface end-to-end, much of it reverse-engineered from real CC-BY cohort papers (learn-only under the license firewall) — no new skill or reporting-guideline count (still 45 skills / 36 guidelines); analysis-integrity detectors **25 → 27**:
+- **Observational probes O1 → O14** (`/peer-review` + `/self-review`, vendored) — over-adjustment / analysis-unit clustering / outcome construct-validity (O7–O9), overlapping-subset gradient (O10), **complex-survey design & weighting** for NHANES/KNHANES (O11), **data-driven threshold / "inflection-point" mining** (O12), **cross-sectional mediation** temporal-order & sequential-ignorability (O13), and **interaction scale** — additive RERI/AP/S vs multiplicative (O14). Plus a new **clinical-prediction-model** probe module **CP1–CP4** and survival **S9** (panel-data / multistate variance).
+- **Two new detectors (25 → 27)** — `check_wordcount_cap.py` (the revision-inflation trap: body vs journal cap) and `check_paren_spans.py` (em-dash→paren conversions that wrap a whole sentence). Plus a `check_confounding_completeness.py` upgrade (DB-code↔prose alias map, SMD-from-mean±SD, exposure-defining-covariate exemption), a `check_cohort_arithmetic.py` `ANALYSIS_UNIT_UNDISCLOSED` check, a `check_scope_coherence.py` cross-sectional-yield lexicon, and a verify-refs corporate/collective-author render-abort fix.
+- **Analysis & submission tooling** — `/analyze-stats` gains **mediation** and **interaction & effect-modification** guides; `/sync-submission` gains `assemble_supplement.py` (S{N} index↔file integrity) and a `/revise` body-word-count exit gate; `/render-pdf-doc` gains a `scan_glyph_coverage.py` xelatex silent-glyph-drop scan.
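Reviewer note: probe O14 in the v4.3 entry above contrasts additive interaction measures (RERI, AP, S) with the multiplicative scale. A short worked example shows why the scale matters. The formulas are the standard ones for two binary exposures expressed as relative risks against the doubly unexposed group; the function name and the example numbers are illustrative assumptions and are not taken from the package.

```python
def interaction_measures(rr10: float, rr01: float, rr11: float) -> dict[str, float]:
    """Additive and multiplicative interaction from three relative risks.

    rr10: exposed to A only, rr01: exposed to B only, rr11: exposed to both,
    each relative to the group exposed to neither.
    """
    reri = rr11 - rr10 - rr01 + 1                   # relative excess risk due to interaction
    ap = reri / rr11                                # attributable proportion
    s = (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))      # synergy index
    ratio = rr11 / (rr10 * rr01)                    # multiplicative interaction ratio
    return {"RERI": reri, "AP": ap, "S": s, "multiplicative_ratio": ratio}


# Invented example: RR 2 for A alone, 3 for B alone, 6 for both.
result = interaction_measures(rr10=2.0, rr01=3.0, rr11=6.0)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these numbers the multiplicative ratio is exactly 1 (no interaction on the multiplicative scale) while RERI is 2 and S exceeds 1 (positive interaction on the additive scale), so a manuscript that reports "no interaction" without naming the scale is ambiguous.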
+**v4.2** builds out the case-report capability end-to-end, grounded in real CC-BY case reports (learn-only under the license firewall) — no new skill or reporting-guideline count (still 45 skills / 36 guidelines); journal profiles **68 → 73**:
+- **Case-report + case-series writing** — `/write-paper` gains a CARE narrative + 150-word-abstract case-report exemplar, a **case-series** paper type (methods-light mini-cohort, all-cases summary table, counts-not-rates), and **adverse-event/pharmacovigilance** (Naranjo/WHO-UMC causality) and **diagnostic-pitfall/mimic** subtypes.
+- **Radiology / imaging-led track** — a dedicated `exemplar_case_report_radiology.md` (per-modality technique→findings→impression, structured-reporting lexicons BI-RADS/LI-RADS/PI-RADS/TI-RADS/Lung-RADS/O-RADS, quantitative threshold honesty, an interventional-radiology procedure/complication subtype, DICOM de-identification) plus a `/make-figures` annotated multimodality imaging-panel exemplar.
+- **Case-report reviewer probe** — `/peer-review` + `/self-review` ship a vendored case-report domain probe **CR1–CR9** (novelty/consent/n=1 causality, case-series design, adverse-event causality, imaging-led discipline).
+- **Where to submit** — compact `/find-journal` profiles for Journal of Medical Case Reports, Cureus, Radiology Case Reports, BMJ Case Reports, and BJR Case Reports, and `/check-reporting` CARE notes for adverse-event and case-series subtypes.
 **v4.1** ships distribution levers and a submission pre-flight gate — analysis-integrity detectors **24 → 25** (still 43 skills):
 - **Claude Code plugin marketplace** — `/plugin marketplace add Aperivue/medsci-skills`, then `/plugin` discovery of eight `medsci-*` category plugins generated from the catalog SSOT (`.claude-plugin/marketplace.json`).
@@ -260,7 +349,7 @@ Earlier in this series: analysis-integrity guards (confounding completeness, cla
 | **Battle-tested** | Used on real manuscript submissions by a practicing physician-researcher | Unknown provenance and validation |
 | **Depth per skill** | 150-600 lines of documentation + bundled reference files (curated journal profile library, checklists, formula sheets, code templates) | Typically thin SKILL.md templates |
-**MedSci-Audit** — the verification edge in the first rows above is a named suite of **25 deterministic detectors** (citation & reference integrity, cohort & pool arithmetic, scope/estimand contracts, reporting compliance, and more) that catch fabricated or drifted content before a manuscript reaches a reviewer. See **[`MEDSCI_AUDIT.md`](MEDSCI_AUDIT.md)** for the suite, its six families, and its evaluation evidence.
+**MedSci-Audit** — the verification edge in the first rows above is a named suite of **28 deterministic detectors** (citation & reference integrity, cohort & pool arithmetic, scope/estimand contracts, reporting compliance, and more) that catch fabricated or drifted content before a manuscript reaches a reviewer. See **[`MEDSCI_AUDIT.md`](MEDSCI_AUDIT.md)** for the suite, its six families, and its evaluation evidence.
 ---
@@ -332,7 +421,7 @@ ma-scout -> search-lit -> fulltext-retrieval -> design-study ──> write-proto
 | **search-lit** | PubMed + Semantic Scholar + bioRxiv search with anti-hallucination citation verification. Token-efficient error handling -- CrossRef failures are silently batched, not repeated. BibTeX output tags each entry with `verified`/`verified_by`/`verified_on` fields so downstream skills can trust the citation provenance. |
 | **verify-refs** | Pre-submission reference audit for `.md`, `.docx`, `.bib`, or `.tsv` inputs. Extracts references, verifies DOI/PMID via CrossRef/PubMed when available, and writes `qc/reference_audit.json` as the sole output — row-level status (OK / MISMATCH / UNVERIFIED / FABRICATED) lives inside the JSON `records[]` block. `/search-lit` produces candidate BibTeX; `/lit-sync` owns `manuscript/_src/refs.bib`. |
 | **fulltext-retrieval** | Batch open-access PDF downloader. Unpaywall → PMC → OpenAlex → CrossRef pipeline. OA-only -- no paywall bypass. Input: DOI list or TSV. Optional PDF→Markdown conversion via [pymupdf4llm](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/) for token-efficient LLM analysis of academic papers. |
-| **check-reporting** | Manuscript compliance audit against 32 reporting guidelines and risk of bias tools (STROBE, STARD, STARD-AI, TRIPOD, TRIPOD+AI, PRISMA, PRISMA-DTA, PRISMA-P, MOOSE, ARRIVE, CONSORT, CARE, SPIRIT, CLAIM, SQUIRE 2.0, CLEAR, GRRAS, MI-CLEAR-LLM, SWiM, AMSTAR 2, QUADAS-2, QUADAS-C, RoB 2, ROBINS-I, ROBINS-E, ROBIS, ROB-ME, PROBAST, PROBAST+AI, NOS, COSMIN, RoB NMA). Machine-readable JSON summary with `compliance_pct` and `fixable_by_ai` flags for automated pipeline integration. |
+| **check-reporting** | Manuscript compliance audit against 36 reporting guidelines and risk of bias tools (STROBE, STARD, STARD-AI, TRIPOD, TRIPOD+AI, TRIPOD-LLM, PRISMA, PRISMA-DTA, PRISMA-P, MOOSE, ARRIVE, CONSORT, CONSORT-AI, CARE, SPIRIT, SPIRIT-AI, CLAIM, DECIDE-AI, SQUIRE 2.0, CLEAR, GRRAS, MI-CLEAR-LLM, SWiM, AMSTAR 2, QUADAS-2, QUADAS-C, RoB 2, ROBINS-I, ROBINS-E, ROBIS, ROB-ME, PROBAST, PROBAST+AI, NOS, COSMIN, RoB NMA). Machine-readable JSON summary with `compliance_pct` and `fixable_by_ai` flags for automated pipeline integration. |
 | **analyze-stats** | Statistical analysis code generation (Python/R) for diagnostic accuracy, DTA meta-analysis (bivariate/HSROC), inter-rater agreement, survival analysis, demographics tables, regression (logistic/linear), propensity score (matching/IPTW/overlap weighting), and repeated measures (RM ANOVA/GEE/mixed models). Calibration mandatory for prediction models. |
 | **meta-analysis** | Full systematic review and meta-analysis pipeline (8 phases). DTA (bivariate/HSROC) and intervention meta-analysis. Protocol to submission-ready manuscript with PRISMA-DTA compliance. |
 | **make-figures** | Publication-ready figures and visual abstracts: ROC curves, forest plots, PRISMA/CONSORT/STARD flow diagrams, Kaplan-Meier curves, Bland-Altman plots, confusion matrices, and journal-specific visual/graphical abstracts (python-pptx template-based). Communication-first design principles (Nat Hum Behav 2026 — key message, audience, cognitive load, figure-vs-table decision) and five flow-diagram production lessons (official-template fidelity, VML fallback PDF export, docx XML escape, sequential placeholder mapping, version freeze); critic rubric Section G adds 5 communication-first checks. `--study-type` auto-generates the full required figure set; structured `_figure_manifest.md` output for downstream pipeline consumption; D2 enforced as default for flow diagrams. |
@@ -363,6 +452,7 @@ ma-scout -> search-lit -> fulltext-retrieval -> design-study ──> write-proto
 | **ma-scout** | Meta-analysis topic discovery and feasibility assessment. Two modes: (A) Professor-first — profile → pillar analysis → MA gaps, (B) Topic-first — question → landscape scan → co-author matching. Multi-source validation (PubMed, PROSPERO, bioRxiv) with realistic k estimation (15-30% discount). |
 | **lit-sync** | Sync research references from .bib files to Zotero library + Obsidian literature notes. Concept extraction from 10+ literature notes with cross-cutting theme discovery. Works after `/search-lit` or standalone. |
 | **academic-aio** | AI search engine (Perplexity / ChatGPT web / Elicit / Consensus / SciSpace) and RAG visibility checklist for medical AI papers. Integrates TRIPOD+AI, CLAIM, STARD-AI, TRIPOD-LLM, DECIDE-AI reporting anchors with generative-engine-optimization (GEO) principles. Covers title, abstract, structured summary boxes (Key Points / Research in Context / Plain-Language Summary), preprints, GitHub README, `CITATION.cff`, Zenodo, and Hugging Face model/dataset cards. Explicit defense against LLM citation fabrication (Agarwal 2025, Nat Commun). Produces a visible PASS/PARTIAL/FAIL checklist; never applies edits silently. Pairs with `write-paper` Phase 4/6/7, runs after `self-review` + `humanize`. |
+| **polish-language** | Academic English consistency linting and non-native (ESL) clarity polish. A deterministic linter (`lint_consistency.py`) flags abbreviation define-once violations, US/UK spelling drift, hyphen-vs-en-dash numeric ranges, `P`/`p` case and impossible `P = 0.000`, hyphenation variants, small-number style, and value/unit spacing — then a gated, style-only clarity pass fixes wording without ever changing numbers, citations, or scientific meaning. Distinct from `humanize` (AI-tell removal) and `check-reporting` (guideline items); bundles a reproducible challenge card. |
 | **manage-refs** | Reference lifecycle as a single skill: citekey ↔ `.bib` validation, journal-CSL pandoc rendering (`render_pandoc.sh`), manuscript ↔ rendered DOCX cross-reference QC (`check_xref.py --strict` is the submission gate), `[N]` ↔ `[@key]` marker conversion, and native Zotero CWYW field-code injection for co-author Word workflows. Hybrid 3-phase strategy (pandoc draft → CWYW transition → Zotero CWYW for circulation/revision/submission). Sole writer of `manuscript_final.docx` and `qc/xref_audit.json`. Split out of `write-paper` Phase 7.6 so `revise`, `peer-review`, `sync-submission`, and `find-journal` can render directly without depending on a sibling skill. |
 | **render-pdf-doc** | Render non-bibliography academic markdown (proposal, briefing handout, anchor doc, IRB cover, reference table) to publication-quality PDF via `pandoc + xelatex` with CJK font fallback (Apple SD Gothic Neo on macOS, Noto Sans CJK KR on Linux) and content-proportional pipe-table column widths. Boundary opposite of `manage-refs` (bibliography-driven). Spun off from `write-paper` Phase 7.6. |
 | **define-variables** | Literature-grounded variable operationalization for observational research. Turns a data dictionary plus research question into a citation-backed table of exposure / outcome / covariate definitions, cutoffs, and DB-variable mappings. Tier 0 dictionary-first rule prevents ad-hoc phenotype definitions that invite reviewer rejection. Bridges `/search-lit` output into `/write-protocol` Methods. |
@@ -442,6 +532,30 @@ See [docs/classroom_distribution_plan.md](docs/classroom_distribution_plan.md) a
 > **Tip:** Not sure which skill to use? Start with `/orchestrate` -- it will classify your request and route you to the right tool.
+## Updating
+MedSci Skills updates often. You do **not** need GitHub, git, or the command line to stay current.
+- **One click (recommended for the classroom install).** After installing, an updater is placed at
+  `~/.medsci-skills/updater/` (and, if you chose `--desktop-launcher`, an **"Update MedSci Skills"**
+  icon on your Desktop). Double-click it: it downloads the latest release from GitHub, verifies it,
+  and re-installs — transactionally, so an interrupted update never corrupts your install.
+- **Already installed an old copy?** Re-download the latest classroom ZIP **once** and double-click
+  the installer; from then on the one-click updater is in place for every future update.
+- **Terminal users:** `npx medsci-skills@latest install` always installs the latest.
+- **Just checking:** `python3 installers/install.py --check-update` reports whether a newer version
+  is available and installs nothing.
+- **Get reminded (opt-in, Claude Code):** `python3 installers/install.py --enable-update-notify`
+  shows a one-line *"update available"* notice when a Claude Code session starts. It is **off by
+  default**, checks at most once a day, reads nothing about your session, and never installs
+  anything. Turn it off with `--disable-update-notify`, or silence it with `MEDSCI_NO_UPDATE_CHECK=1`.
+- **Claude Code plugin marketplace:** third-party marketplace **auto-update is off by default** —
+  enable it in Claude Code or run a manual plugin update.
+Updates connect only to GitHub, send no information about your machine or work, and create no
+telemetry or tracking. Modified skills are backed up before an update and never auto-deleted. See
+the [update privacy & data notice](docs/update_privacy.md).
 ## Key Features
 ### Autonomous E2E Pipeline
@@ -465,8 +579,8 @@ Projects declare their source-of-truth layout in `SSOT.yaml`, and a `qc/migratio
 ### Meta-Analysis Failure Modes
 `/meta-analysis` ships empirical failure-mode references (data integrity, review orchestration, submission package drift, post-submission release ops) with four automation hooks: `scripts/prisma_5way_consistency.py` (DI-6 PRISMA number consistency), `scripts/extraction_consensus_log_init.py` (DI-1 dual-extraction scaffold), `scripts/tag_cleanup_gate.sh` (DI-8 placeholder tag gate), and `scripts/verify_package_integrity.py` (SPD SHA-256 manifest for submission bundles).
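Reviewer note: the context line above mentions a SHA-256 manifest for submission bundles. As a rough sketch of the idea, and not the contents of `scripts/verify_package_integrity.py`, a manifest can map each file's relative path to its digest and later be compared against the bundle to detect drift. The function names and the shape of the drift report are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large figures or PDFs are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_drift(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the bundle on disk with a previously recorded manifest."""
    current = build_manifest(root)
    shared = manifest.keys() & current.keys()
    return {
        "missing": sorted(set(manifest) - set(current)),
        "unexpected": sorted(set(current) - set(manifest)),
        "changed": sorted(k for k in shared if manifest[k] != current[k]),
    }
```

A typical use would be to call `build_manifest` when the bundle is frozen, store the result as JSON, and run `find_drift` just before upload; an empty report in all three lists means the bundle is byte-identical to the frozen one.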
-### 32 Reporting Guidelines & RoB Tools Built-in
-`check-reporting` includes bundled checklists for 32 guidelines and risk-of-bias tools: STROBE, STARD, STARD-AI, TRIPOD, TRIPOD+AI, PRISMA 2020, PRISMA-DTA, PRISMA-P, MOOSE, ARRIVE, CONSORT, CARE, SPIRIT, CLAIM, SQUIRE 2.0, CLEAR, GRRAS, MI-CLEAR-LLM, SWiM, AMSTAR 2, QUADAS-2, QUADAS-C, RoB 2, ROBINS-I, ROBINS-E, ROBIS, ROB-ME, PROBAST, PROBAST+AI, NOS, COSMIN, RoB NMA. Includes Results/Discussion section boundary checks and machine-readable JSON summary for pipeline integration.
+### 36 Reporting Guidelines & RoB Tools Built-in
+`check-reporting` includes bundled checklists for 36 guidelines and risk-of-bias tools: STROBE, STARD, STARD-AI, TRIPOD, TRIPOD+AI, TRIPOD-LLM, PRISMA 2020, PRISMA-DTA, PRISMA-P, MOOSE, ARRIVE, CONSORT, CONSORT-AI, CARE, SPIRIT, SPIRIT-AI, CLAIM, DECIDE-AI, SQUIRE 2.0, CLEAR, GRRAS, MI-CLEAR-LLM, SWiM, AMSTAR 2, QUADAS-2, QUADAS-C, RoB 2, ROBINS-I, ROBINS-E, ROBIS, ROB-ME, PROBAST, PROBAST+AI, NOS, COSMIN, RoB NMA. Includes Results/Discussion section boundary checks and machine-readable JSON summary for pipeline integration.
 ### Publication-Ready Output
 `analyze-stats` generates reproducible Python/R code for 13 analysis types -- including regression, propensity score, and repeated measures -- with mandatory calibration for prediction models. `make-figures` produces journal-specification figures (300 DPI, colorblind-safe palettes, proper dimensions), visual/graphical abstracts, and a tool selection guide (D2 for flow diagrams, matplotlib for data plots). `--study-type` auto-generates the complete figure set for each study design.
@@ -480,6 +594,14 @@ Projects declare their source-of-truth layout in `SSOT.yaml`, and a `qc/migratio
 ### Skills Work Together
 Skills call each other. `check-reporting` invokes `make-figures` for PRISMA diagrams. `write-paper` calls `search-lit` for citation verification. `self-review` delegates reporting compliance to `check-reporting`. `calc-sample-size` output feeds directly into `write-protocol`'s IRB justification section.
+### Validation status — available vs CI-gated vs evaluated
+Be precise about what "validated" means here — the three tiers are different facts:
+- **Available** — every bundled skill and deterministic detector. The current totals are the single source of truth in [`metadata/catalog_counts.json`](metadata/catalog_counts.json) and [`MEDSCI_AUDIT.md`](MEDSCI_AUDIT.md).
+- **CI-gated** — detectors with a committed challenge/regression test that runs on every push via [`validate.yml`](.github/workflows/validate.yml).
+- **Formally evaluated** — the subset measured by the canonical evaluation harness **E1** in [`evaluation/`](evaluation/), which is v3.8-era and validates the then-current detector subset; detectors added since are **CI-tested, not yet E1-evaluated** (the size of the catalog and the size of the evaluated subset are deliberately reported as separate facts — see `MEDSCI_AUDIT.md`).
+The toolkit is *designed to reduce common manuscript-preparation errors*; it does **not** guarantee correctness and is **not** clinically validated.
 ## Setup
 **New to Python, R, or the command line?** The full step-by-step guide for clinicians is in [`docs/setup/`](docs/setup/README.md):
@@ -567,6 +689,35 @@ Or equivalently: `/write-paper --autonomous` if analysis and figures already exi
 /search-lit            # Find supporting literature with verified citations
 ```
+## Contributing
+Contributions are welcome — and most are **one small, self-contained file** that a
+template walks you through. You do not need to understand the whole pipeline to add value.
+Pick a [**good first issue**](https://github.com/Aperivue/medsci-skills/contribute), or start
+from one of these:
+| Want to add… | How | Issue |
+|---|---|---|
+| **A journal profile** (submission rules for a journal we don't cover) | `/add-journal`, or copy an existing `journal_profiles/*.md` | [#115](https://github.com/Aperivue/medsci-skills/issues/115) |
+| **A figure exemplar** (ROC, KM, forest, Bland–Altman, confusion matrix…) | one `make-figures/references/exemplar_plots/*.md` anatomy model | [#118](https://github.com/Aperivue/medsci-skills/issues/118) |
+| **A CSL citation style** for a journal that lacks one | drop a `.csl` into `manage-refs/citation_styles/` | [#117](https://github.com/Aperivue/medsci-skills/issues/117) |
+| **A de-identification locale pack** for one more country | add patterns to `deidentify/` | [#116](https://github.com/Aperivue/medsci-skills/issues/116) |
+| **A reporting checklist or peer-review exemplar** | one reference file in the matching skill | [#120](https://github.com/Aperivue/medsci-skills/issues/120) |
+| **A README translation** (e.g., zh-CN) | a translated `README` | [#119](https://github.com/Aperivue/medsci-skills/issues/119) |
+Every contribution is gated the same way the maintainers are: it must be a self-contained
+file, pass the CI (`validate.yml` — PII scan, structure, catalog consistency), and carry no
+patient or author identifiers. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the PR checklist
+and the PII/publication hygiene rules. New ideas that don't fit a template? Open a
+[skill request](https://github.com/Aperivue/medsci-skills/issues/new?template=skill_request.yml)
+or a [detector request](https://github.com/Aperivue/medsci-skills/issues/new?template=detector_request.yml).
+**Governance:** [`ROADMAP.md`](ROADMAP.md) (priorities + scope boundary),
+[`MAINTAINERS.md`](MAINTAINERS.md) (roles — clinical authority stays with the founder),
+[`docs/maintainer_workflow.md`](docs/maintainer_workflow.md) (review + release process),
+and [`SECURITY.md`](SECURITY.md) (vulnerability reporting + the medical-scope boundary).
+A change that touches a medical/research claim needs Clinical-Lead review.
 ## In the Wild
 Adoption is tracked openly in [`IMPACT.md`](IMPACT.md) (stars, forks, traffic,
@@ -577,6 +728,18 @@ and academic use is logged in [`docs/citations.md`](docs/citations.md).
 [let us know](https://github.com/Aperivue/medsci-skills/issues/new?template=used-in-research.yml).
 It helps other researchers find the toolkit — and we list it (with your permission).
+## Citation
+If you use MedSci Skills in your research, please cite the software via
+[`CITATION.cff`](CITATION.cff) (Zenodo concept DOI
+[10.5281/zenodo.20155321](https://doi.org/10.5281/zenodo.20155321)).
+The design and evaluation of the toolkit are described in a preprint:
+> Nam Y, Kim N. *Agentic Skills for Auditable and Reproducible Medical Research
+> Writing: An Integrity-gated Architecture for LLM-Assisted Clinical Manuscripts.*
+> arXiv:2606.09500 (2026). https://arxiv.org/abs/2606.09500
 ## Disclaimer
 These skills are research productivity tools. They do **not** provide clinical decision support, medical advice, or diagnostic recommendations. All outputs should be reviewed by qualified researchers before use in any publication or clinical context.

package/installers/install.py CHANGED

@@ -1,9 +1,10 @@
 #!/usr/bin/env python3
 """Install MedSci Skills for local agent apps.
-This installer is intentionally conservative and dependency-free. It copies the
-repository's skills into common local skill folders and optionally writes a
-small Cursor project rule that tells Cursor where to find the skills.
+Dependency-free. Installs the repository's skills into common local skill folders via a
+**transactional, crash-recoverable** install (see installers/medsci_txn.py) so an
+interrupted install is recovered on the next run, and optionally writes a small Cursor
+project rule. No network access here.
 """
 from __future__ import annotations
@@ -11,10 +12,11 @@ from __future__ import annotations
 import argparse
 import datetime as dt
 import os
-import shutil
 import sys
 from pathlib import Path
+sys.path.insert(0, str(Path(__file__).resolve().parent))  # allow `import medsci_txn` when run as a script
+import medsci_txn  # noqa: E402
 REPO_ROOT = Path(__file__).resolve().parents[1]
 SKILLS_DIR = REPO_ROOT / "skills"
@@ -53,20 +55,20 @@ def copy_skills(target: str, dest: Path, log_lines: list[str], dry_run: bool) ->
     if not SKILLS_DIR.exists():
         raise FileNotFoundError(f"skills directory not found: {SKILLS_DIR}")
-    skill_dirs = sorted(p for p in SKILLS_DIR.iterdir() if p.is_dir() and (p / "SKILL.md").exists())
-    log(f"\n[{target}] installing {len(skill_dirs)} skills to {dest}", log_lines)
+    owned = sorted(p.name for p in SKILLS_DIR.iterdir() if p.is_dir() and (p / "SKILL.md").exists())
+    log(f"\n[{target}] installing {len(owned)} skills to {dest}", log_lines)
     if dry_run:
-        for skill in skill_dirs:
-            log(f"  DRY RUN copy {skill.name}", log_lines)
-        return len(skill_dirs)
+        for name in owned:
+            log(f"  DRY RUN install {name}", log_lines)
+        return len(owned)
-    dest.mkdir(parents=True, exist_ok=True)
-    for skill in skill_dirs:
-        shutil.copytree(skill, dest / skill.name, dirs_exist_ok=True)
-        log(f"  installed {skill.name}", log_lines)
-    verify_discoverable(dest, [s.name for s in skill_dirs], log_lines)
-    return len(skill_dirs)
+    result = medsci_txn.install_target(
+        SKILLS_DIR, dest, target, owned, medsci_txn.state_home(),
+        lambda m: log(m, log_lines),
+    )
+    verify_discoverable(dest, owned, log_lines)
+    return result["installed"]
 def install_cursor_rule(project: Path, log_lines: list[str], dry_run: bool) -> None:
@@ -116,29 +118,44 @@ def run_self_test() -> int:
     problems: list[str] = []
     sink: list[str] = []
-    # Snapshot real host dirs to prove the self-test never creates them.
+    # Snapshot real host + state dirs to prove the self-test never creates them.
     host_dirs = [default_target_dir("claude"), default_target_dir("codex")]
-    existed_before = {d: d.exists() for d in host_dirs}
+    real_state = medsci_txn.state_home()
+    watched = host_dirs + [real_state]
+    existed_before = {d: d.exists() for d in watched}
+    prev_home = os.environ.get("MEDSCI_HOME")
     with tempfile.TemporaryDirectory(prefix="medsci-selftest-") as tmp:
         tmp_path = Path(tmp)
-        dest = tmp_path / "skills"
+        os.environ["MEDSCI_HOME"] = str(tmp_path / "state")  # isolate transactional state to temp
         try:
-            copied = copy_skills("self-test", dest, sink, dry_run=False)  # includes verify_discoverable
-        except Exception as exc:  # noqa: BLE001
-            problems.append(f"copy/verify raised: {exc}")
-            copied = -1
-        if copied != n:
-            problems.append(f"copied {copied} != source skill count {n}")
-        proj = tmp_path / "project"
-        install_cursor_rule(proj, sink, dry_run=False)
-        if not (proj / ".cursor" / "rules" / "medsci-skills.mdc").is_file():
-            problems.append("cursor project rule was not written")
-    for d in host_dirs:
+            dest = tmp_path / "skills"
+            try:
+                copied = copy_skills("self-test", dest, sink, dry_run=False)  # transactional + verify
+            except Exception as exc:  # noqa: BLE001
+                problems.append(f"install/verify raised: {exc}")
+                copied = -1
+            if copied != n:
+                problems.append(f"installed {copied} != source skill count {n}")
+            # a second install must be idempotent (recovery + re-commit, no error)
+            try:
+                copy_skills("self-test", dest, sink, dry_run=False)
+            except Exception as exc:  # noqa: BLE001
+                problems.append(f"second (idempotent) install raised: {exc}")
+            proj = tmp_path / "project"
+            install_cursor_rule(proj, sink, dry_run=False)
+            if not (proj / ".cursor" / "rules" / "medsci-skills.mdc").is_file():
+                problems.append("cursor project rule was not written")
+        finally:
+            if prev_home is None:
+                os.environ.pop("MEDSCI_HOME", None)
+            else:
+                os.environ["MEDSCI_HOME"] = prev_home
+    for d in watched:
         if not existed_before[d] and d.exists():
-            problems.append(f"self-test created a real host dir: {d}")
+            problems.append(f"self-test created a real dir: {d}")
     print("MedSci Skills installer self-test")
     print(f"  source skills: {n}")
@@ -146,7 +163,7 @@ def run_self_test() -> int:
         for p in problems:
             print(f"  FAIL: {p}")
         return 1
-    print(f"  OK: {n}/{n} skills discoverable in temp target; cursor rule written; no host dir touched")
+    print(f"  OK: {n}/{n} skills discoverable in temp target; idempotent; cursor rule written; no host/state dir touched")
     return 0
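Reviewer note: the self-test hunk above saves `MEDSCI_HOME`, points it at a temp directory, and restores it in a `finally` block. The same save-and-restore pattern can be sketched as a small context manager. This is only an illustration: the name `temporary_env` is hypothetical and does not appear in the package.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def temporary_env(name: str, value: str) -> Iterator[None]:
    """Set an environment variable for the duration of a `with` block.

    Hypothetical helper (not part of install.py). It mirrors the
    prev_home / finally logic in the hunk above.
    """
    previous: Optional[str] = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the prior state exactly: unset if it was unset before,
        # otherwise put the old value back.
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


if __name__ == "__main__":
    with temporary_env("MEDSCI_HOME", "/tmp/medsci-selftest/state"):
        print("inside:", os.environ["MEDSCI_HOME"])
    print("after:", os.environ.get("MEDSCI_HOME"))
```

The standard library's `unittest.mock.patch.dict(os.environ, {...})` gives similar restore-on-exit behavior; the explicit version is shown here because it maps one-to-one onto the diff.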
@@ -177,6 +194,27 @@ def parse_args() -> argparse.Namespace:
         action="store_true",
         help="Simulate installs into temp dirs, assert all skills are discoverable, and touch no host directory. Exits 0 on pass.",
     )
+    parser.add_argument(
+        "--check-update",
+        action="store_true",
+        help="Report whether a newer release is available (connects to GitHub; installs nothing).",
+    )
+    parser.add_argument(
+        "--desktop-launcher",
+        action="store_true",
+        help="With your consent, also place an 'Update MedSci Skills' launcher on your Desktop.",
+    )
+    parser.add_argument(
+        "--enable-update-notify",
+        action="store_true",
+        help="Opt in: show a one-line 'update available' notice at Claude Code session start "
+             "(merges a hook into ~/.claude/settings.json; 24h-cached; no telemetry).",
+    )
+    parser.add_argument(
+        "--disable-update-notify",
+        action="store_true",
+        help="Opt out: remove the session-start update-notice hook from ~/.claude/settings.json.",
+    )
     return parser.parse_args()
@@ -184,31 +222,80 @@ def main() -> int:
     args = parse_args()
     if args.self_test:
         return run_self_test()
+    if args.check_update:
+        try:
+            import update  # noqa: PLC0415 - optional, only when explicitly requested
+            return update.check_update(medsci_txn.state_home())
+        except Exception as exc:  # noqa: BLE001
+            print(f"MedSci Skills: update check unavailable ({exc}).", file=sys.stderr)
+            return 1
+    if args.enable_update_notify or args.disable_update_notify:
+        try:
+            import update  # noqa: PLC0415
+            home = medsci_txn.state_home()
+            if args.disable_update_notify:
+                r = update.unregister_session_hook(home, update.default_settings_path())
+                print("Session-start update notice disabled." if r == "disabled"
+                      else "Session-start update notice was not enabled; nothing to do.")
+                return 0
+            # Opt-in: ensure the updater home (with the hook script) exists, then register the hook.
+            update.install_updater_home(REPO_ROOT, home, lambda _m: None)
+            r = update.register_session_hook(home, update.default_settings_path())
+            print("Opted in: Claude Code will show a one-line update notice at session start "
+                  "(24h-cached, no telemetry). Disable with: install.py --disable-update-notify"
+                  if r == "enabled" else "Already opted in to the session-start update notice; no change.")
+            return 0
+        except Exception as exc:  # noqa: BLE001
+            print(f"MedSci Skills: could not change the update-notify setting ({exc}).", file=sys.stderr)
+            return 1
     log_lines: list[str] = []
     log("MedSci Skills Installer", log_lines)
     log(f"Repository: {REPO_ROOT}", log_lines)
     log(f"Python: {sys.version.split()[0]}", log_lines)
     log(f"OS: {os.name}", log_lines)
+    # Each target is an independent transaction: a failure on one (e.g. a fail-closed corrupt
+    # journal) is logged and the others still proceed; successful targets are fully committed.
+    targets = [t for t in ("claude", "codex") if args.target in {"all", t}]
+    failures: list[str] = []
+    for t in targets:
+        try:
+            copy_skills(t, default_target_dir(t), log_lines, args.dry_run)
+        except Exception as exc:  # noqa: BLE001 - classroom installer shows friendly per-target errors.
+            failures.append(t)
+            log(f"\n[{t}] FAILED: {exc}", log_lines)
+            log(f"  [{t}] left unchanged (transactional); other targets continue.", log_lines)
     try:
-        if args.target in {"all", "claude"}:
-            copy_skills("claude", default_target_dir("claude"), log_lines, args.dry_run)
-        if args.target in {"all", "codex"}:
-            copy_skills("codex", default_target_dir("codex"), log_lines, args.dry_run)
         if args.target == "cursor" and not args.cursor_project:
             log("\n[cursor] skipped: pass --cursor-project <folder> to install a Cursor rule.", log_lines)
         if args.cursor_project:
             install_cursor_rule(args.cursor_project.expanduser().resolve(), log_lines, args.dry_run)
+    except Exception as exc:  # noqa: BLE001
+        failures.append("cursor")
+        log(f"\n[cursor] FAILED: {exc}", log_lines)
+    # Place the one-click updater under ~/.medsci-skills/updater/ so a future update needs no
+    # GitHub/terminal even if this download folder is deleted (best-effort; never fatal).
+    if not args.dry_run:
+        try:
+            import update  # noqa: PLC0415
+            update.install_updater_home(REPO_ROOT, medsci_txn.state_home(),
+                                        lambda m: log(m, log_lines),
+                                        desktop=args.desktop_launcher)
+        except Exception as exc:  # noqa: BLE001
+            log(f"\n[updater] could not install the one-click updater ({exc}); updates still work via re-running the installer.", log_lines)
-        log("\nDone. Restart Claude Code, Codex, or Cursor before testing the skills.", log_lines)
-        log("First test prompt:", log_lines)
-        log("MedSci Skills가 설치됐는지 확인하고, 오늘 실습에 쓸 대표 스킬 5개만 보여줘.", log_lines)
-    except Exception as exc:  # noqa: BLE001 - classroom installer should show friendly errors.
-        log(f"\nERROR: {exc}", log_lines)
+    if failures:
+        log(f"\nCompleted with errors on: {', '.join(failures)}. Other targets are fully installed.", log_lines)
         log("If this happened during class, send the install log to the instructor.", log_lines)
-        write_log(log_lines)
+        log_path = write_log(log_lines)
+        print(f"\nInstall log: {log_path}")
         return 1
+    log("\nDone. Restart Claude Code, Codex, or Cursor before testing the skills.", log_lines)
+    log("First test prompt:", log_lines)
+    log("MedSci Skills가 설치됐는지 확인하고, 오늘 실습에 쓸 대표 스킬 5개만 보여줘.", log_lines)
     log_path = write_log(log_lines)
     print(f"\nInstall log: {log_path}")
     return 0
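Reviewer note: `copy_skills` now delegates to `medsci_txn.install_target`, whose source is not included in this excerpt. For readers unfamiliar with the idea of a "transactional, crash-recoverable" install, the following is a minimal sketch of one common approach (stage a copy, record intent in a journal, swap with renames, recover on the next run). Everything here is an assumption for illustration: the function names, the journal format, and the `.staging` / `.backup` naming are hypothetical and are not taken from `medsci_txn.py`.

```python
import json
import os
import shutil
from pathlib import Path


def _paths(dest_root: Path, name: str) -> tuple[Path, Path, Path]:
    """Return (final, staging, backup) locations for one skill (hypothetical layout)."""
    return dest_root / name, dest_root / f".{name}.staging", dest_root / f".{name}.backup"


def recover(dest_root: Path, journal: Path) -> None:
    """Undo or finish whatever an interrupted run left behind."""
    if not journal.exists():
        return
    # A corrupt journal raises here on purpose (fail closed rather than guess).
    name = json.loads(journal.read_text())["skill"]
    final, staging, backup = _paths(dest_root, name)
    if backup.exists() and not final.exists():
        # Crash happened between the two renames: put the old copy back.
        os.replace(backup, final)
    shutil.rmtree(staging, ignore_errors=True)  # discard any partial copy
    shutil.rmtree(backup, ignore_errors=True)   # new copy is in place; old one is obsolete
    journal.unlink()


def install_skill_atomically(src: Path, dest_root: Path, journal: Path) -> None:
    """Install one skill directory so a crash never leaves a half-copied skill."""
    dest_root.mkdir(parents=True, exist_ok=True)
    journal.parent.mkdir(parents=True, exist_ok=True)
    recover(dest_root, journal)

    final, staging, backup = _paths(dest_root, src.name)
    journal.write_text(json.dumps({"skill": src.name}))  # record intent first
    shutil.copytree(src, staging)                        # slow step happens off to the side
    if final.exists():
        os.replace(final, backup)                        # keep the old copy until the swap succeeds
    os.replace(staging, final)                           # quick rename into place
    shutil.rmtree(backup, ignore_errors=True)
    journal.unlink()                                     # commit: nothing left to recover
```

Because `recover` runs at the start of every install, calling the function twice in a row is safe, which is the property the new "second (idempotent) install" check in `run_self_test` exercises. A real implementation would likely also handle multiple skills per transaction and concurrent installers, which this sketch ignores.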