medsci-skills 4.10.0 → 5.0.0 (npm, version diff on Mend)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (103)

package/README.md CHANGED

@@ -2,14 +2,14 @@
 # MedSci Skills
-**45 skills that actually work.** Built by a physician-researcher, tested on real publications.
+**51 skills that actually work.** Built by a physician-researcher, tested on real publications.
-*MedSci Skills is a submission-grade clinical manuscript workflow, not a generic biomedical skill catalog. Its moat is the compliance layer — 38 reporting guidelines and risk-of-bias tools, reference/citation verification, and deterministic integrity gates, before peer review sees the manuscript. It competes on clinical submission reliability, not skill count.*
+*MedSci Skills is an end-to-end research tool for physician and medical-engineering researchers — design → scaffold → validate → publish — for the clinical manuscript and the medical-AI model behind it. Its moat is the compliance layer — 38 reporting guidelines and risk-of-bias tools, reference/citation verification, and deterministic integrity gates before peer review — now extended by a model-engineering lane that scaffolds reproducible, leakage-safe training repos and audits model validation. Clinical AI model research engineering is in scope; a general AI-scientist platform is not. It competes on clinical submission reliability, not skill count.*
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Release](https://img.shields.io/github/v/release/Aperivue/medsci-skills?style=flat-square&color=blue)](https://github.com/Aperivue/medsci-skills/releases/latest)
 [![CI](https://img.shields.io/github/actions/workflow/status/Aperivue/medsci-skills/validate.yml?branch=main&style=flat-square&label=CI)](https://github.com/Aperivue/medsci-skills/actions/workflows/validate.yml)
-![Skills](https://img.shields.io/badge/Skills-45-brightgreen?style=flat-square)
+![Skills](https://img.shields.io/badge/Skills-51-brightgreen?style=flat-square)
 [![npm](https://img.shields.io/npm/v/medsci-skills?style=flat-square&label=npm&color=cb3837)](https://www.npmjs.com/package/medsci-skills)
 [![Watch the 2-min intro](https://img.shields.io/badge/▶_Watch-2--min_intro-FF0000?style=flat-square&logo=youtube&logoColor=white)](https://youtu.be/MclQ_RIofpE)
 [![good first issues](https://img.shields.io/github/issues/Aperivue/medsci-skills/good%20first%20issue?style=flat-square&label=good%20first%20issues&color=7057ff)](https://github.com/Aperivue/medsci-skills/contribute)
@@ -42,14 +42,20 @@
 ## What is MedSci Skills?
 MedSci Skills is an open-source Claude Code skill collection for **clinical
-manuscript preparation**. It helps physician-researchers and biomedical
-investigators move from literature search, study design, statistics, and figures to
-reporting-guideline compliance, citation/reference auditing, numerical-consistency
-checks, and response-to-reviewer workflows — combining agentic writing with
-**deterministic integrity gates** for submission-grade biomedical research. It is
-**not** a diagnostic tool, an autonomous author, or a general AI-scientist platform;
-every output requires human-expert verification. New here? See the
-[3 workflows below](#start-here-3-workflows), the [FAQ](docs/faq.md), and the
+research — the manuscript and the medical-AI model alike**. It helps
+physician-researchers and biomedical/medical-engineering investigators move from
+literature search, study design, statistics, and figures to reporting-guideline
+compliance, citation/reference auditing, numerical-consistency checks, and
+response-to-reviewer workflows — combining agentic writing with **deterministic
+integrity gates** for submission-grade biomedical research. As of **v5.0** it adds a
+**model-engineering lane**: choose a paper-grounded architecture, scaffold a
+reproducible, leakage-safe PyTorch training repo, and validate, document, and
+evaluate a medical-imaging or LLM/MLLM model so the work reaches a paper — it
+**integrates** MONAI / nnU-Net, never reimplements them. Clinical AI model research
+engineering is in scope; it is **not** a diagnostic tool, an autonomous author, or a
+general AI-scientist platform, and every output requires human-expert verification.
+New here? See the [3 workflows below](#start-here-3-workflows), the
+[FAQ](docs/faq.md), and the
 [scope boundary](ROADMAP.md#not-planned--explicitly-out-of-scope).
 ---
@@ -82,17 +88,18 @@ Restart Claude Code, then start with **`/orchestrate`** — it classifies your r
 ### Install as a Claude Code plugin
-Prefer plugins? One line adds the marketplace; `/plugin` then lets you browse eight category plugins and enable the ones you want:
+Prefer plugins? One line adds the marketplace; `/plugin` then lets you browse nine category plugins and enable the ones you want:
 ```text
 /plugin marketplace add Aperivue/medsci-skills
-/plugin            # browse eight category plugins; enable the ones you want
+/plugin            # browse nine category plugins; enable the ones you want
 ```
 | Plugin | Covers |
 |--------|--------|
 | `medsci-literature` | Literature search, full-text retrieval, Zotero sync, reference-integrity audits |
 | `medsci-data` | Study design, variable operationalization, sample size, data cleaning, de-identification, codebooks, dataset versioning |
+| `medsci-modeling` | Architecture selection, reproducible model-scaffold repos, model-validation audits, Model Card/Datasheet, model & LLM/MLLM evaluation |
 | `medsci-analysis` | Statistics, figures, batch/cross-national/replication analysis, meta-analysis |
 | `medsci-writing` | IMRAD & protocol drafting, AI-pattern removal, AI-search optimization, reviewer responses |
 | `medsci-review` | Self-review, peer review, reporting-guideline compliance |
@@ -451,6 +458,12 @@ ma-scout -> search-lit -> fulltext-retrieval -> design-study ──> write-proto
 | **make-figures** | Publication-ready figures and visual abstracts: ROC curves, forest plots, PRISMA/CONSORT/STARD flow diagrams, Kaplan-Meier curves, Bland-Altman plots, confusion matrices, and journal-specific visual/graphical abstracts (python-pptx template-based). Communication-first design principles (Nat Hum Behav 2026 — key message, audience, cognitive load, figure-vs-table decision) and five flow-diagram production lessons (official-template fidelity, VML fallback PDF export, docx XML escape, sequential placeholder mapping, version freeze); critic rubric Section G adds 5 communication-first checks. `--study-type` auto-generates the full required figure set; structured `_figure_manifest.md` output for downstream pipeline consumption; D2 enforced as default for flow diagrams. |
 | **design-study** | Study design review: identifies analysis unit, cohort logic, data leakage risks, comparator design, validation strategy, and reporting guideline fit. |
 | **design-ai-benchmarking** | Design and validity review for benchmarking AI system(s) against a human-expert panel: evaluation-question and arm definition, decoupled multi-dimensional rubrics with anchors, planted calibration probes (positive-control / known-bad / instability / mechanism-contradiction), reviewer-panel construction with per-reviewer randomization, inter-rater reliability targets with separate control-item reliability, LLM-as-judge vs human-as-judge adjudication, construct-independence guards, and a structured JSON rating-export schema. Locks the rubric before data collection. |
+| **model-validation** | Design or audit the clinical-validation study for an engineer-built medical-imaging model (segmentation / classification / detection): patient-level split disjointness and the data-leakage taxonomy, tuning-on-test, internal vs genuine external validation, comparator design, single-run vs multi-seed variance, task-correct metric selection (Metrics Reloaded), test-set sizing, and CLAIM 2024 / TRIPOD+AI / STARD-AI reporting fit. Ships a deterministic split-leakage gate that proves patient disjointness by set arithmetic on the emitted split table. Integrates with MONAI / nnU-Net — does not replace them. |
+| **model-scaffold** | Generate a reproducible, runnable PyTorch training repo for a medical-imaging task — segmentation (U-Net), classification, detection, image-to-image synthesis, or self-supervised pretraining — the missing middle link between choosing an architecture and validating a trained model. Emits a patient-level seed-locked split as an auditable artifact, a task-appropriate model, train/evaluate scripts that seed every RNG and infer under eval mode, a config, requirements, a reproducibility record, and a Methods stub with VERIFY placeholders (no fabricated numbers). Reproducibility holds by construction; ships a `check_training_hygiene` AST gate + a network-free build→validate challenge. Integrates with MONAI / nnU-Net / TorchIO / timm / torchvision — does not reimplement them. |
+| **architecture-zoo** | "Which architecture for which research question" decision tool: maps task (classification / segmentation / detection / transfer), modality, data scale, and class imbalance to a paper-grounded architecture shortlist. Curates the foundational curriculum (ResNet / DenseNet / EfficientNet / ViT / Swin; U-Net / 3-D U-Net / Attention & Residual U-Net / nnU-Net / Mask R-CNN; SAM/MedSAM / TotalSegmentator / BiomedCLIP / DINO / MAE / SimCLR) — each with core idea, when-to-use, medical-imaging use, reference implementation, validation setup, and the matching model-scaffold template. Advisory; teaches archetypes, not a live SOTA leaderboard. |
+| **model-card** | Generate the documentation an engineer-built medical-imaging model must carry — a Model Card (Mitchell et al. 2019), a Datasheet for its dataset (Gebru et al. 2021), and a METRIC-informed data-quality pass — filled from user-supplied facts (never fabricated), then verify every required section is present and non-empty with a deterministic completeness gate (`check_model_card_complete`). Model Card / Datasheet are documentation standards vendored as templates, not counted reporting checklists. |
+| **model-evaluation** | Compute and report task-correct held-out metrics for a trained medical-imaging model — segmentation (Dice + a boundary metric HD95/NSD, per structure), classification (AUROC + AUPRC + sensitivity/specificity with bootstrap CIs at the deployment prevalence), or detection (FROC/mAP with a stated IoU criterion) — plus calibration and subgroup slices. Emits a per-case table for analyze-stats and gates the metric choice against Metrics Reloaded / CLAIM 2024 (`check_metric_reporting`). Numbers come only from executed code. |
+| **mllm-eval** | Model-agnostic evaluation harness (closed API or open weights) for an LLM/MLLM on a clinical task — radiology report generation, VQA, clinical text extraction — covering the adjudicated reference standard, clinical-efficacy metrics (RadGraph-F1 / CheXbert-F1 beyond BLEU/ROUGE), faithfulness/hallucination, pretraining-contamination, prompt sensitivity, and a reader study; gates the plan with `check_mllm_eval_completeness` and routes the reviewer audit to the MLLM probe. |
 | **intake-project** | Classifies new research projects, summarizes current state, identifies missing inputs, and recommends next steps. |
 | **grant-builder** | Structures grant proposals: significance, innovation, approach, milestones, and consortium roles. |
 | **present-paper** | Academic presentation preparation: paper analysis, supporting research, speaker scripts, slide note injection, and Q&A prep. |
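
The new `model-validation` row in the README hunk above describes a split-leakage gate that proves patient disjointness "by set arithmetic on the emitted split table". The diff does not show `check_split_leakage.py` itself, so the following is only a minimal sketch of that idea, not the package's implementation. The column names `patient_id` and `split`, the function name `find_patient_leaks`, and the sample rows are all hypothetical.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def find_patient_leaks(split_csv_text):
    """Return {(split_a, split_b): sorted shared patient IDs} for each overlapping pair."""
    # Collect the set of patient IDs assigned to each split.
    # Column names are assumptions made for this sketch.
    patients_by_split = defaultdict(set)
    for row in csv.DictReader(io.StringIO(split_csv_text)):
        patients_by_split[row["split"]].add(row["patient_id"])

    # Disjointness is plain set arithmetic: every pairwise intersection must be empty.
    leaks = {}
    for a, b in combinations(sorted(patients_by_split), 2):
        shared = patients_by_split[a] & patients_by_split[b]
        if shared:
            leaks[(a, b)] = sorted(shared)
    return leaks


# Hypothetical split tables: one clean, one where P002 appears in train and test.
CLEAN = "patient_id,split\nP001,train\nP002,train\nP003,val\nP004,test\n"
LEAKY = "patient_id,split\nP001,train\nP002,train\nP002,test\nP003,val\n"

if __name__ == "__main__":
    print(find_patient_leaks(CLEAN))  # {}
    print(find_patient_leaks(LEAKY))  # {('test', 'train'): ['P002']}
```

Working on patient IDs rather than image or scan IDs is the point of the check: two different scans from the same patient in different splits would pass an image-level comparison but still leak.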

package/metadata/distribution_files.json CHANGED
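
Each entry in this manifest records a `path`, a `size` in bytes, and a `sha256` digest, which is why a content change to a file shows up in the hunks as a changed size and hash pair. As a rough sketch of how one such entry could be checked against a file on disk (the helper names `verify_entry` and `demo` and the sample file are hypothetical, and the package's own validation tooling is not shown in this diff):

```python
import hashlib
import tempfile
from pathlib import Path


def verify_entry(entry, root):
    """Compare one manifest entry with the file on disk; return the mismatched fields."""
    data = (Path(root) / entry["path"]).read_bytes()
    problems = []
    if len(data) != entry["size"]:
        problems.append("size")
    if hashlib.sha256(data).hexdigest() != entry["sha256"]:
        problems.append("sha256")
    return problems


def demo():
    """Build a throwaway file, then check a matching and a stale entry against it."""
    with tempfile.TemporaryDirectory() as root:
        payload = b"example skill file\n"
        (Path(root) / "SKILL.md").write_bytes(payload)
        good = {
            "path": "SKILL.md",
            "size": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),
        }
        # A stale entry, as if the file changed after the manifest was written.
        stale = dict(good, size=good["size"] + 1, sha256="0" * 64)
        return verify_entry(good, root), verify_entry(stale, root)


if __name__ == "__main__":
    print(demo())  # ([], ['size', 'sha256'])
```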

@@ -396,6 +396,46 @@
       "size": 1421,
       "sha256": "912c52e9289a7ccb014aa8a18105b6dfe04c2cc040e970c73b4bbc6b2d8a8a39"
     },
+    {
+      "path": "skills/architecture-zoo/SKILL.md",
+      "size": 5712,
+      "sha256": "ffb9a52d417f309a3b22c8d0f74e6700b67350c0d7a2a2e2c785f0cb4cb066bc"
+    },
+    {
+      "path": "skills/architecture-zoo/references/classification.md",
+      "size": 5986,
+      "sha256": "035e0fddaccb0e19e23ffd7756b075154f847ee20f9a512cd2a010c0db4210fa"
+    },
+    {
+      "path": "skills/architecture-zoo/references/detection.md",
+      "size": 3917,
+      "sha256": "60549b53a87dcb149c442498695321e94c28f0c8853316dda995a2dd730dfeec"
+    },
+    {
+      "path": "skills/architecture-zoo/references/foundation_models.md",
+      "size": 5267,
+      "sha256": "495453b025f1cb13d5ba0bac9be6d3b0d63dd958ea48d2b9e55bc222e9c26786"
+    },
+    {
+      "path": "skills/architecture-zoo/references/index.md",
+      "size": 4024,
+      "sha256": "d2360eb8635347f0f22be0b0e337a540db865aafdc12678455cf32bc49fd9d3d"
+    },
+    {
+      "path": "skills/architecture-zoo/references/segmentation.md",
+      "size": 6508,
+      "sha256": "17618fe3d6884cf89b034ededcd69e081a65d7bfd473495eb2ab1fb5d8b15d8b"
+    },
+    {
+      "path": "skills/architecture-zoo/references/synthesis.md",
+      "size": 3973,
+      "sha256": "b7fec1a55a7b9e2f8eea6979d7bfbda1fc03664aeec0cc89daac8c8f9bdfb8bc"
+    },
+    {
+      "path": "skills/architecture-zoo/skill.yml",
+      "size": 2836,
+      "sha256": "7d7c727a9fc75383e775feac14faf1c4caad90307ed8ca20859ef499ac17deb0"
+    },
     {
       "path": "skills/author-strategy/SKILL.md",
       "size": 9209,
@@ -506,6 +546,11 @@
       "size": 4373,
       "sha256": "ee59d959b91c831d34e04853a83a969bb6315e49f692baa904ab8805b8f17147"
     },
+    {
+      "path": "skills/check-reporting/references/appraisal_tools/METRICS_RELOADED.md",
+      "size": 2384,
+      "sha256": "e06267be7ffa5b5e3f52de387b745bf2676f016c29b02a5872d38c68ddf762ec"
+    },
     {
       "path": "skills/check-reporting/references/checklists/AMSTAR2.md",
       "size": 4566,
@@ -1023,13 +1068,18 @@
     },
     {
       "path": "skills/find-journal/POLICY.md",
-      "size": 4486,
-      "sha256": "02de377328f457c57c5edbade8bae40e12c51b4c6ff8ab6293b330c482187ce7"
+      "size": 5255,
+      "sha256": "c2b61b5830844fa3eebe333dc4864acb4d085097e81ba2dbc1ef72dd65bebee3"
     },
     {
       "path": "skills/find-journal/SKILL.md",
-      "size": 14455,
-      "sha256": "48a95f5ee639e59f00241608b23b3e652395bffaa63a6bccba295e5b3d5a49d6"
+      "size": 22072,
+      "sha256": "45074f6499896700bff0eb20bd91c4076b143bcd39521d56559239edaa8b24d8"
+    },
+    {
+      "path": "skills/find-journal/references/acceptance_signals_schema.md",
+      "size": 6869,
+      "sha256": "620ba1c55272908a16125ba69074491df5246d6cf9afa14e6b08a08e8dc4e866"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/AJNR.md",
@@ -1038,8 +1088,8 @@
     },
     {
       "path": "skills/find-journal/references/journal_profiles/AJR.md",
-      "size": 1417,
-      "sha256": "ae26a8ea72672a973977a01f7e7a8cdbdc98b5dbcd2a7bc4605bb6eaee1a87fe"
+      "size": 2075,
+      "sha256": "9d9a5ee7b3b2287a6deae8fc06d54dd60b26afa843f5830f7bb5fb2e2ac633f7"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Abdominal_Radiology.md",
@@ -1098,8 +1148,8 @@
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Clinical_and_Molecular_Hepatology.md",
-      "size": 2151,
-      "sha256": "7b343dd7628ec9c6efcd4c9afcb66ca35846d48af20c62e58038c437f0405a0a"
+      "size": 2391,
+      "sha256": "7d923a09bc543fe49db28b9b79714958522e2fe1781d60c6f22dc1efcd3ba650"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Cureus.md",
@@ -1128,8 +1178,8 @@
     },
     {
       "path": "skills/find-journal/references/journal_profiles/European_Radiology.md",
-      "size": 1640,
-      "sha256": "b71a9fe293c985c47f674459b4a613083749bc8d246d062e2c3afddc988e349c"
+      "size": 2350,
+      "sha256": "fa50a9a9d6012b8c6b75d81a69b950dfbe7db29775c8112bf5fe68d6d5655dfa"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Hepatology_Communications.md",
@@ -1158,8 +1208,8 @@
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Investigative_Radiology.md",
-      "size": 1480,
-      "sha256": "ab740ec46a98b92fc243d4770b2a4f6ed400ea5a1f86e0fdffc121cdb2c98fa9"
+      "size": 2118,
+      "sha256": "c218b4e764a070a3297802076c1e6c68e514160cc1efa038552cc271b01af9ce"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/JACC_Advances.md",
@@ -1248,8 +1298,8 @@
     },
     {
       "path": "skills/find-journal/references/journal_profiles/KJR.md",
-      "size": 3036,
-      "sha256": "a0814e6d62288389db7528b73a25db870ab91635dc4b946fb0c8bf8af47150a3"
+      "size": 4061,
+      "sha256": "0849de001a47038b8bfc92b337285372c80ccf4861162cae876295d82ea4f1c8"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Korean_Circulation_Journal.md",
@@ -1348,8 +1398,8 @@
     },
     {
       "path": "skills/find-journal/references/journal_profiles/RYAI.md",
-      "size": 1601,
-      "sha256": "1a6a387ed715a559bb0a7fd535b76ecb9334ff21e43b6c025520ca1167993a5e"
+      "size": 2341,
+      "sha256": "6d85cd675a5e6f395f74865256b7fe937749a4e31cb4c6f8d43d4059cb33a717"
     },
     {
       "path": "skills/find-journal/references/journal_profiles/Radiology.md",
@@ -1396,10 +1446,45 @@
       "size": 1511,
       "sha256": "e2811a46b89f39a7395d98460347ce274ff58ce50138364e8cf96050a05aad91"
     },
+    {
+      "path": "skills/find-journal/scripts/acceptance_readiness_challenge/expected/report_ceiling.txt",
+      "size": 1570,
+      "sha256": "1e84f888350249a3b187d38731a371487232851c6521689e319ef59408ed9448"
+    },
+    {
+      "path": "skills/find-journal/scripts/acceptance_readiness_challenge/expected/report_clean.txt",
+      "size": 350,
+      "sha256": "40b1a3b94cf15d504fc2c04e25f5aa737410a8b2480118dcd3c9cd1927ee825c"
+    },
+    {
+      "path": "skills/find-journal/scripts/acceptance_readiness_challenge/fixture_ceiling/manuscript.md",
+      "size": 717,
+      "sha256": "557ce5c0607024aebae3375e117b1052acace3d1931a3dc8d015a6b9c6043843"
+    },
+    {
+      "path": "skills/find-journal/scripts/acceptance_readiness_challenge/fixture_clean/manuscript.md",
+      "size": 646,
+      "sha256": "77a215d5c5fe8d09035308f04cdf26ab7c2736c53563b71930a2be1941b7061b"
+    },
+    {
+      "path": "skills/find-journal/scripts/acceptance_readiness_challenge/problem.md",
+      "size": 1981,
+      "sha256": "469b518a373f5bef95016dc866f82b0d05395eb26ae162656abf1303495ed71c"
+    },
+    {
+      "path": "skills/find-journal/scripts/acceptance_readiness_challenge/verify.sh",
+      "size": 1232,
+      "sha256": "0c525f322306fd5229269f155179924d80a419301988fff0171115c3b323d795"
+    },
+    {
+      "path": "skills/find-journal/scripts/assess_acceptance_readiness.py",
+      "size": 10959,
+      "sha256": "32930f970f308d95aaf82c61d809a307685668165df4fe2bf7c2b4f5052b3e87"
+    },
     {
       "path": "skills/find-journal/skill.yml",
-      "size": 1428,
-      "sha256": "0fdf9fdce505cc1d264bdb9c69a4ad6107ea6d510d186385f854636b9f10e609"
+      "size": 1971,
+      "sha256": "8a4ce35c9fccb8a43de5d06bee09b27fc70d2c735d8fa9fce353ec9467fa9936"
     },
     {
       "path": "skills/fulltext-retrieval/SKILL.md",
@@ -2411,6 +2496,241 @@
       "size": 3891,
       "sha256": "d056566bb052bd917b4705fcd4912ef2173d5bd61b83bff4885b847bb824aa83"
     },
+    {
+      "path": "skills/mllm-eval/SKILL.md",
+      "size": 6288,
+      "sha256": "ccf3da2f70b356d432b3d33500f667f9d7858500a9ff325631a9a16f5f300a8b"
+    },
+    {
+      "path": "skills/mllm-eval/scripts/check_mllm_eval_completeness.py",
+      "size": 9380,
+      "sha256": "fe24a9b0dfcce7d826c29be59f2534c768daf73e1e98f545e8a7791bc0709079"
+    },
+    {
+      "path": "skills/mllm-eval/scripts/mllm_eval_completeness_challenge/fixture/plan_bad.md",
+      "size": 202,
+      "sha256": "41cd455b9a747f92506091f0a8fa607e069042c4add683d12b274e6081538ea1"
+    },
+    {
+      "path": "skills/mllm-eval/scripts/mllm_eval_completeness_challenge/fixture/plan_good.md",
+      "size": 795,
+      "sha256": "93b703c163348e766dfbd2d01b2fc7267b57a082652d8bdd839b23646caa99a6"
+    },
+    {
+      "path": "skills/mllm-eval/scripts/mllm_eval_completeness_challenge/problem.md",
+      "size": 1681,
+      "sha256": "dc9ce54a7bd47ff5d3383c76e681760c0891dc503933863948df1cd999a11a0e"
+    },
+    {
+      "path": "skills/mllm-eval/scripts/mllm_eval_completeness_challenge/verify.sh",
+      "size": 1335,
+      "sha256": "7c3afbcd5adecdcdc8a2c9d14251521e2bfa39cbd3f650e77c92f37aabe49c9c"
+    },
+    {
+      "path": "skills/mllm-eval/skill.yml",
+      "size": 3373,
+      "sha256": "72fcc60d62edc95808df7cc0ff283822069caaa61e9163fc8d3f298a3afa3801"
+    },
+    {
+      "path": "skills/model-card/SKILL.md",
+      "size": 5807,
+      "sha256": "070b1ab8391a37bc4ffebadb918b71e83251da2293d119f5b0cfbf28bea0d4c2"
+    },
+    {
+      "path": "skills/model-card/references/datasheet_template.md",
+      "size": 2280,
+      "sha256": "11998d71a43fcdf75d7f3b72cd46de7f9e4d999ac6b0b6febd5a3dee92a7a517"
+    },
+    {
+      "path": "skills/model-card/references/metric_dimensions.md",
+      "size": 3066,
+      "sha256": "70aeb1af8ed7510aff0562c8ac99852a0b1b49de5eb3ef923a1bbeb4f4c051eb"
+    },
+    {
+      "path": "skills/model-card/references/model_card_template.md",
+      "size": 3239,
+      "sha256": "e40ed0b7fd1a7370d22bb74a596048d1e09b75c14f1b2064ba89d06c886cefbd"
+    },
+    {
+      "path": "skills/model-card/scripts/check_model_card_complete.py",
+      "size": 8856,
+      "sha256": "aa1fdc6a88e18696d0dfc23397a8a9141c1219583081deb5e27b822f152d4005"
+    },
+    {
+      "path": "skills/model-card/scripts/check_model_card_complete_challenge/fixture/complete/DATASHEET.md",
+      "size": 2179,
+      "sha256": "1b53763c3017c88de0ddc583fc6a1975400d69d1929852580fd96c7fe08ccc3e"
+    },
+    {
+      "path": "skills/model-card/scripts/check_model_card_complete_challenge/fixture/complete/MODEL_CARD.md",
+      "size": 2785,
+      "sha256": "fbb5ce99d2d144ecb53d1581826758d4aece996351089e7592999873760dcfaf"
+    },
+    {
+      "path": "skills/model-card/scripts/check_model_card_complete_challenge/fixture/incomplete/MODEL_CARD.md",
+      "size": 880,
+      "sha256": "9b017400531316d787b71aaa7bc4a1fc82aa7f782e54df1eda1c34cde8980db4"
+    },
+    {
+      "path": "skills/model-card/scripts/check_model_card_complete_challenge/problem.md",
+      "size": 1853,
+      "sha256": "dc85f56704dfd049cd9807d0ffd77bc1264de69c2b20ca3e691f6eedc6843deb"
+    },
+    {
+      "path": "skills/model-card/scripts/check_model_card_complete_challenge/verify.sh",
+      "size": 1857,
+      "sha256": "8a8ee577857f4333b38313d6a322f02a54d8011698251bbec4e1a7d6bd758b21"
+    },
+    {
+      "path": "skills/model-card/skill.yml",
+      "size": 2877,
+      "sha256": "0d71da374191ca35545b90130d7d30d4693e4bd469014a3b0993063bdfd4b958"
+    },
+    {
+      "path": "skills/model-evaluation/SKILL.md",
+      "size": 5031,
+      "sha256": "17ffde905359e4cffdf747422b7d41c214d2abfcc71e50e1b1e4a689d87fa695"
+    },
+    {
+      "path": "skills/model-evaluation/references/metric_guide.md",
+      "size": 2454,
+      "sha256": "8d09ca7ce9fb9f66ee4942689294d9b12ae1d892ac67769cd68fdc38a4e220ee"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/check_metric_reporting.py",
+      "size": 9564,
+      "sha256": "c33f52ee62ae93417027d0bb0b6cf2a95f5747d52403ab99c075bc64a5e2c593"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/metric_reporting_challenge/fixture/clf_bad.md",
+      "size": 84,
+      "sha256": "571aceb9567454e8bbe12e0967bc4cde4b142da99f6056f7530f3b8723ddf77c"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/metric_reporting_challenge/fixture/clf_good.md",
+      "size": 186,
+      "sha256": "ba44a3b38b4128fa713555c8b332f211e9087e6bc016d80af4c3074c3ca6ef8e"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/metric_reporting_challenge/fixture/seg_bad.md",
+      "size": 119,
+      "sha256": "50c138840976b8fa010cb0ca9fd24fffc7181e275eb6c71e76500da7d2a0420d"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/metric_reporting_challenge/fixture/seg_good.md",
+      "size": 207,
+      "sha256": "903a2dbd61cf56ce8e3d808f867ebd47ba4d6f2bd1542d7482b85477a6876eb0"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/metric_reporting_challenge/problem.md",
+      "size": 1599,
+      "sha256": "1f842374e3569d5fc27d7c26aa7f6bc6be43f7f38d2d9f3e2f9ab62fb29ddd65"
+    },
+    {
+      "path": "skills/model-evaluation/scripts/metric_reporting_challenge/verify.sh",
+      "size": 1235,
+      "sha256": "f819d1333a7206db6383ccbd41c6eadae015004e61ae335e67a3334936c58cc8"
+    },
+    {
+      "path": "skills/model-evaluation/skill.yml",
+      "size": 2921,
+      "sha256": "9713dbab40c54ca88324e7dcd74d3142890f9625ad68d708ba40b0cf57a7b9ee"
+    },
+    {
+      "path": "skills/model-scaffold/SKILL.md",
+      "size": 7686,
+      "sha256": "cede2020d3ceee0e1599cbcfb412230ab9c04b08032c8bbaff620e4129ec6785"
+    },
+    {
+      "path": "skills/model-scaffold/references/training_guide.md",
+      "size": 2661,
+      "sha256": "4a3197a89b8d3473071051f67a7bcf40f71ecf007e942f81e9c40dc48fce1a9c"
+    },
+    {
+      "path": "skills/model-scaffold/scripts/check_training_hygiene.py",
+      "size": 11752,
+      "sha256": "66970049c85e46a0080b5f19eb3b5ac9dfd656a674b6f1698825e968eedd4814"
+    },
+    {
+      "path": "skills/model-scaffold/scripts/scaffold.py",
+      "size": 40470,
+      "sha256": "33569806ecd230aebb9d35c77a27b8480d23329b00f633d7fee9531a7bcec974"
+    },
+    {
+      "path": "skills/model-scaffold/scripts/scaffold_challenge/expected/split_assignment.csv",
+      "size": 144,
+      "sha256": "23949c5c9d179ef152127e0e6b865138169184f95c4e2d5234a65bab3762bd71"
+    },
+    {
+      "path": "skills/model-scaffold/scripts/scaffold_challenge/fixture/manifest.csv",
+      "size": 582,
+      "sha256": "defce0cdce35211f038b139a9ab4214c63b22f3e60ba7ee74fa6bea8fe33aa7f"
+    },
+    {
+      "path": "skills/model-scaffold/scripts/scaffold_challenge/problem.md",
+      "size": 2678,
+      "sha256": "369e98da65cbd16c67f5ff3f17d99b707d713bbf3239409f9921190c85f8f08a"
+    },
+    {
+      "path": "skills/model-scaffold/scripts/scaffold_challenge/verify.sh",
+      "size": 5136,
+      "sha256": "040acee712943e8392b70b40b790438bec056a3a4e7a282f76455d2a2b791447"
+    },
+    {
+      "path": "skills/model-scaffold/skill.yml",
+      "size": 3195,
+      "sha256": "06c4ed02872bfc38abd1c753c02a640a2bfaa1141fbfdd9f8b9ccbe18a148d82"
+    },
+    {
+      "path": "skills/model-validation/SKILL.md",
+      "size": 9347,
+      "sha256": "ecd48672a03923bf1ace63528fd2dbcf138cd880103cc8c40345b3857d66ad1c"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage.py",
+      "size": 11616,
+      "sha256": "a207b82abbb5914927e0de25663820c250207b0b1689f8ed17bfe6bda9eed6e2"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/expected/clean.txt",
+      "size": 358,
+      "sha256": "5d1e203ebc6656c172f60f4cd4eb3954fdbcd77579a6768c6bbd56952a966e6c"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/expected/leak.txt",
+      "size": 470,
+      "sha256": "34be75bbbabf10073a1fd11587e019491b7319e2030b166a1ab3e82e4899f3c2"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/fixture/split_seed.txt",
+      "size": 3,
+      "sha256": "084c799cd551dd1d8d5c5f9a5d593b2e931f5e36122ee5c793c1d08a19839cc0"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/fixture/splits_clean.csv",
+      "size": 150,
+      "sha256": "a4a15320685b58c4868737e40b4df0a07a942b39b5d94611e87d63d22e88aee0"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/fixture/splits_leak.csv",
+      "size": 145,
+      "sha256": "1bc2cc7b11a7a4a9ddc0168a83eab33ea1b03e3ca98bda5301679a18e19faf53"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/problem.md",
+      "size": 2347,
+      "sha256": "f140095ec5d190120bf73dc62710b2ccac31c7f7d54307a0854c8438df61940e"
+    },
+    {
+      "path": "skills/model-validation/scripts/check_split_leakage_challenge/verify.sh",
+      "size": 1925,
+      "sha256": "9ae88c109caa1408a3a3b08ea0583fc97ce98b93c909f75d043f08a231e91e61"
+    },
+    {
+      "path": "skills/model-validation/skill.yml",
+      "size": 3079,
+      "sha256": "4891c0698445f98552ab2a2315bc8288a6cc4744f8711e20f939c8af8739e434"
+    },
     {
       "path": "skills/orchestrate/SKILL.md",
       "size": 35203,
@@ -2438,8 +2758,8 @@
     },
     {
       "path": "skills/peer-review/SKILL.md",
-      "size": 55415,
-      "sha256": "4b75d67b35444eada1a08700b9a0d27e6b8453fbca197893799db77811904b52"
+      "size": 58493,
+      "sha256": "057c39cd131c49eeb1949560a0d510f176a012431a0c8d36dd803061f98e0629"
     },
     {
       "path": "skills/peer-review/references/aczel_2021_reviewer2_patterns.md",
@@ -2481,6 +2801,16 @@
       "size": 11244,
       "sha256": "197cfaa4bdcfe223a0ebfb69c229ccb3852160ab76870ade914eb8997ab684c5"
     },
+    {
+      "path": "skills/peer-review/references/domain-probes/mllm_evaluation.md",
+      "size": 7785,
+      "sha256": "1b63f7d987bc2ba8b9e67008b1713d4e874437f22c46002a12dcda1da0b73d2a"
+    },
+    {
+      "path": "skills/peer-review/references/domain-probes/model_development.md",
+      "size": 10438,
+      "sha256": "263db14ecedfc51caffb8b4966bade4566706b3ae4ce67d0b15840bdbbcdba07"
+    },
     {
       "path": "skills/peer-review/references/domain-probes/narrative_review.md",
       "size": 12598,
@@ -2643,8 +2973,8 @@
     },
     {
       "path": "skills/present-paper/SKILL.md",
-      "size": 29247,
-      "sha256": "aa8455317bd4996d5b1e0cc9d27c8be33b8112b7826749d87376c161d6cf87d5"
+      "size": 33518,
+      "sha256": "777197edf83a4d242508b366abe24681198467a95950b0acc385ac2714f33cb4"
     },
     {
       "path": "skills/present-paper/references/critic_rubrics/slide.md",
@@ -2661,11 +2991,41 @@
       "size": 15007,
       "sha256": "d0f964af7523ec8bfef50ca627878f8c2cfe58159c2a827c5f7dfd43585cf9ee"
     },
+    {
+      "path": "skills/present-paper/references/presentation_design_guidelines.md",
+      "size": 7460,
+      "sha256": "689d021ff7fc5e04abf93f3d6e1b0646bb5aa86430239b76b51d2224b3b92f0c"
+    },
     {
       "path": "skills/present-paper/references/slide_design_principles.md",
       "size": 10436,
       "sha256": "7f2a5e03c8f2ddbb2d84a163506c5f3a2d1cca1353a694abd7bfb14225324826"
     },
+    {
+      "path": "skills/present-paper/references/slide_visual_styles/CATALOG.md",
+      "size": 3177,
+      "sha256": "78782ce6916212bcae9e1d8513197721a8ac72ed1faa33ce335b36d0c88a828f"
+    },
+    {
+      "path": "skills/present-paper/references/slide_visual_styles/clinical_blue.md",
+      "size": 3249,
+      "sha256": "334ee770935b93a1ddf86c426a8f3da377d2618c0f6553308ccc38e871f26160"
+    },
+    {
+      "path": "skills/present-paper/references/slide_visual_styles/dark_modern.md",
+      "size": 3197,
+      "sha256": "b5adf2331318a1276fa5eedf173a36c056d220fbfa04819a5c4d4d68b1109f68"
+    },
+    {
+      "path": "skills/present-paper/references/slide_visual_styles/editorial_mono.md",
+      "size": 3068,
+      "sha256": "db07e84bbf4d248f0a4837fe22ed61c82e176fe94055b590131e24d25c59f20b"
+    },
+    {
+      "path": "skills/present-paper/references/slide_visual_styles/institutional_brand.md",
+      "size": 4431,
+      "sha256": "c8b04c93bf61072fc6c4ee7d402e107d7375e933d60085bf325d74830896684a"
+    },
     {
       "path": "skills/present-paper/references/slide_visual_styles/nature_lancet.md",
       "size": 7989,
@@ -2691,6 +3051,11 @@
       "size": 6758,
       "sha256": "6eeaf94c396d0f4ff365eaea0408f2fc00f8e2dc75b53a9514214194cb9329f9"
     },
+    {
+      "path": "skills/present-paper/scripts/inspect_pptx_template.py",
+      "size": 5073,
+      "sha256": "648fe3d2904a5ffffb41eb064d1780f605a672dc44618a74b5e3e59c023cb63d"
+    },
     {
       "path": "skills/present-paper/scripts/strip_notes_for_sharing.py",
       "size": 5508,
@@ -2933,8 +3298,8 @@
     },
     {
       "path": "skills/self-review/SKILL.md",
-      "size": 94121,
-      "sha256": "7222f5bc17832d84a23f1ea63fb090eb9a114c70bec3cf16ff2c7a16b24d0f40"
+      "size": 94657,
+      "sha256": "14c982c492d6305c238737366f0996415133fe02ff9aaad7e2c2207d78d260a2"
     },
     {
       "path": "skills/self-review/references/domain-probes/ai_overclaiming.md",
@@ -2971,6 +3336,16 @@
       "size": 11244,
       "sha256": "197cfaa4bdcfe223a0ebfb69c229ccb3852160ab76870ade914eb8997ab684c5"
     },
+    {
+      "path": "skills/self-review/references/domain-probes/mllm_evaluation.md",
+      "size": 7785,
+      "sha256": "1b63f7d987bc2ba8b9e67008b1713d4e874437f22c46002a12dcda1da0b73d2a"
+    },
+    {
+      "path": "skills/self-review/references/domain-probes/model_development.md",
+      "size": 10438,
+      "sha256": "263db14ecedfc51caffb8b4966bade4566706b3ae4ce67d0b15840bdbbcdba07"
+    },
     {
       "path": "skills/self-review/references/domain-probes/narrative_review.md",
       "size": 12598,
@@ -3418,8 +3793,8 @@
     },
     {
       "path": "skills/write-paper/references/journal_profiles/Clinical_and_Molecular_Hepatology.md",
-      "size": 7981,
-      "sha256": "666c53f8c68b3a3436beafdb76a3671825802c5d6fe0a92c052db9075848c7da"
+      "size": 8585,
+      "sha256": "b268d9cf8caa255e72ab4cbc7960473de7666e8c65e86cb3d2e41ea6791a0642"
     },
     {
       "path": "skills/write-paper/references/journal_profiles/Diabetes_Metabolism_Journal.md",