@yibeichan/claude-skills 1.0.2
- package/LICENSE +21 -0
- package/README.md +98 -0
- package/cli.js +272 -0
- package/install.py +240 -0
- package/package.json +44 -0
- package/skills/bidsapp-nidm-standards/SKILL.md +202 -0
- package/skills/bidsapp-nidm-standards/references/babs_config.md +20 -0
- package/skills/bidsapp-nidm-standards/references/cli_arguments.md +76 -0
- package/skills/bidsapp-nidm-standards/references/container_patterns.md +53 -0
- package/skills/bidsapp-nidm-standards/references/nidm_integration.md +403 -0
- package/skills/bidsapp-nidm-standards/references/repo_structure.md +121 -0
- package/skills/bidsapp-nidm-standards/references/testing_patterns.md +82 -0
- package/skills/dicom2fmriprep/SKILL.md +377 -0
- package/skills/dicom2fmriprep/evals/evals.json +26 -0
- package/skills/dicom2fmriprep/references/babs-details.md +407 -0
- package/skills/dicom2fmriprep/references/fmriprep-details.md +250 -0
- package/skills/dicom2fmriprep/references/heudiconv-details.md +243 -0
- package/skills/fmri-ssm/SKILL.md +317 -0
- package/skills/fmri-ssm/references/code_templates.md +1570 -0
- package/skills/fmri-ssm/references/downstream_analysis.md +680 -0
- package/skills/fmri-ssm/references/group_inference.md +608 -0
- package/skills/fmri-ssm/references/hrf_modeling.md +447 -0
- package/skills/fmri-ssm/references/model_catalog.md +436 -0
- package/skills/fmri-ssm/references/paradigm_guide.md +406 -0
- package/skills/fmri-ssm/references/preprocessing.md +614 -0
- package/skills/fmri-ssm.zip +0 -0
- package/skills/neuroimaging-qc/SKILL.md +203 -0
- package/skills/neuroimaging-qc/references/eeg_qc.md +400 -0
- package/skills/neuroimaging-qc/references/fmri_qc.md +343 -0
- package/skills/neuroimaging-qc/references/fnirs_qc.md +430 -0
- package/skills/neuroimaging-qc/references/structural_qc.md +454 -0
- package/skills/neuroimaging-qc/scripts/parse_fmriprep_confounds.py +153 -0
- package/skills/neuroimaging-qc/scripts/parse_mriqc.py +114 -0
- package/skills/neuroimaging-qc/scripts/qc_report.py +295 -0
- package/skills/scientific-writer/SKILL.md +202 -0
- package/skills/scientific-writer/references/citation_styles.md +163 -0
- package/skills/scientific-writer/references/field_conventions.md +245 -0
- package/skills/scientific-writer/references/figures_tables.md +225 -0
- package/skills/scientific-writer/references/reporting_guidelines.md +225 -0
- package/skills.json +54 -0
@@ -0,0 +1,243 @@
# heudiconv Detailed Reference

## Table of Contents
- [SeqInfo Fields](#seqinfo-fields)
- [Advanced Heuristic Patterns](#advanced-heuristic-patterns)
- [IntendedFor Population](#intendedfor-population)
- [Optional Heuristic Functions](#optional-heuristic-functions)
- [CLI Reference](#cli-reference)
- [Troubleshooting](#troubleshooting)

## SeqInfo Fields

Every `s` object in `seqinfo` has these 28 fields:

| Field | Type | Description |
|-------|------|-------------|
| `total_files_till_now` | int | Cumulative file count |
| `example_dcm_file` | str | Path to example DICOM |
| `series_id` | str | Unique series identifier |
| `dcm_dir_name` | str | DICOM directory name |
| `series_files` | int | Number of files in series |
| `unspecified` | str | Reserved |
| `dim1` | int | First dimension (x) |
| `dim2` | int | Second dimension (y) |
| `dim3` | int | Third dimension (slices) |
| `dim4` | int | Fourth dimension (volumes/timepoints) |
| `TR` | float | Repetition time (seconds) |
| `TE` | float | Echo time (ms) |
| `protocol_name` | str | Scanner protocol name |
| `is_motion_corrected` | bool | MoCo series flag |
| `is_derived` | bool | Derived/secondary series |
| `patient_id` | str | Patient identifier |
| `study_description` | str | Study description |
| `referring_physician_name` | str | Referring physician |
| `series_description` | str | Series description |
| `sequence_name` | str | Pulse sequence name |
| `image_type` | tuple | DICOM ImageType |
| `accession_number` | str | Accession number |
| `patient_age` | str | Patient age |
| `patient_sex` | str | Patient sex |
| `date` | str | Acquisition date |
| `series_uid` | str | Series instance UID |
| `time` | str | Acquisition time |
| `custom` | hashable | Value returned by `custom_seqinfo` (must be hashable) |
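
To show how these fields are typically consumed, here is a minimal sketch of a heuristic skeleton. `create_key` and `infotodict` follow the usual heudiconv heuristic layout; the protocol-name substrings, the size thresholds, and the `FakeSeq` stand-in are assumptions made up for illustration and should be adapted to your own `dicominfo.tsv`.

```python
from collections import namedtuple


def create_key(template, outtype=("nii.gz",), annotation_classes=None):
    """Build the (template, outtype, annotation_classes) key heudiconv expects."""
    if not template:
        raise ValueError("Template must be a valid format string")
    return template, outtype, annotation_classes


def infotodict(seqinfo):
    """Map each series to a BIDS key using a few SeqInfo fields."""
    t1w = create_key("sub-{subject}/{session}/anat/sub-{subject}_{session}_T1w")
    rest = create_key(
        "sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_run-{item:02d}_bold"
    )
    info = {t1w: [], rest: []}
    for s in seqinfo:
        if s.is_derived or s.is_motion_corrected:
            continue  # skip scanner-generated duplicates
        name = s.protocol_name.lower()
        if "t1" in name and s.dim3 > 100:      # hypothetical naming and slice count
            info[t1w].append(s.series_id)
        elif "rest" in name and s.dim4 > 50:   # hypothetical: ignore aborted short runs
            info[rest].append(s.series_id)
    return info


# Stand-in for heudiconv's SeqInfo, limited to the fields used above (illustration only).
FakeSeq = namedtuple(
    "FakeSeq", "series_id protocol_name dim3 dim4 is_derived is_motion_corrected"
)
demo = [
    FakeSeq("2-t1_mprage", "t1_mprage", 176, 1, False, False),
    FakeSeq("5-rest_bold", "rest_bold", 60, 300, False, False),
    FakeSeq("6-rest_bold_moco", "rest_bold", 60, 300, False, True),
]
result = infotodict(demo)
```

Initializing every key with an empty list is what lets pattern code call `info[key].append(...)` without first checking whether the key exists.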

## Advanced Heuristic Patterns

### Multi-Echo fMRI

```python
func_me_echo1 = create_key(
    'sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_echo-1_bold'
)
func_me_echo2 = create_key(
    'sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_echo-2_bold'
)
func_me_echo3 = create_key(
    'sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_echo-3_bold'
)

for s in seqinfo:
    if 'multiecho' in s.protocol_name.lower() and not s.is_motion_corrected:
        if abs(s.TE - 12.0) < 1:
            info[func_me_echo1].append(s.series_id)
        elif abs(s.TE - 28.0) < 1:
            info[func_me_echo2].append(s.series_id)
        elif abs(s.TE - 44.0) < 1:
            info[func_me_echo3].append(s.series_id)
```

### Magnitude and Phase Fieldmaps

```python
fmap_mag1 = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_magnitude1')
fmap_mag2 = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_magnitude2')
fmap_phase = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_phasediff')

for s in seqinfo:
    if 'field_map' in s.protocol_name.lower():
        if s.TE < 10 and 'M' in s.image_type:
            info[fmap_mag1] = [s.series_id]
        elif s.TE > 10 and 'M' in s.image_type:
            info[fmap_mag2] = [s.series_id]
        elif 'P' in s.image_type:
            info[fmap_phase] = [s.series_id]
```

### Spin-Echo Fieldmaps (pepolar)

```python
fmap_se_ap = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_dir-AP_epi')
fmap_se_pa = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_dir-PA_epi')

for s in seqinfo:
    name = s.protocol_name
    if 'spinecho' in name.lower() or 'sefield' in name.lower():
        # Match on the direction label only: a bare 'j' test would match
        # any protocol name that happens to contain the letter j.
        if 'AP' in name:
            info[fmap_se_ap] = [s.series_id]
        elif 'PA' in name:
            info[fmap_se_pa] = [s.series_id]
```

### T2w Anatomical

```python
t2w = create_key('sub-{subject}/{session}/anat/sub-{subject}_{session}_T2w')

for s in seqinfo:
    if 't2' in s.protocol_name.lower() and 'spc' in s.sequence_name.lower():
        info[t2w].append(s.series_id)
```

### Single-Session Studies

Omit `{session}` from all templates:

```python
t1w = create_key('sub-{subject}/anat/sub-{subject}_T1w')
func = create_key('sub-{subject}/func/sub-{subject}_task-rest_run-{item:02d}_bold')
```

## IntendedFor Population

Use `POPULATE_INTENDED_FOR_OPTS` in your heuristic to automatically link fieldmaps to their target scans:

```python
POPULATE_INTENDED_FOR_OPTS = {
    'matching_parameters': ['ImagingVolume', 'Shims'],
    'criterion': 'Closest'
}
```

Or run after conversion:
```bash
heudiconv --command populate-intended-for --files /path/to/bids_dataset
```

**`matching_parameters`** options include: `'ImagingVolume'`, `'Shims'`, `'Force'`

**`criterion`** options: `'Closest'` (nearest in time), `'First'`
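
After either route, it can help to confirm that every fieldmap sidecar actually received an `IntendedFor` entry. The following is a small standard-library sketch, not part of heudiconv; the function name `fmaps_missing_intended_for` and the synthetic file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def fmaps_missing_intended_for(bids_root):
    """Return fmap JSON sidecars (relative paths) with no IntendedFor entries."""
    root = Path(bids_root)
    missing = []
    # '**' also matches zero directories, so layouts without ses-* folders work too.
    for sidecar in sorted(root.glob("sub-*/**/fmap/*.json")):
        meta = json.loads(sidecar.read_text())
        if not meta.get("IntendedFor"):
            missing.append(sidecar.relative_to(root).as_posix())
    return missing


# Tiny synthetic dataset (hypothetical file names) to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    fmap_dir = Path(tmp) / "sub-01" / "ses-01" / "fmap"
    fmap_dir.mkdir(parents=True)
    linked = {"IntendedFor": ["ses-01/func/sub-01_ses-01_task-rest_bold.nii.gz"]}
    (fmap_dir / "sub-01_ses-01_dir-AP_epi.json").write_text(json.dumps(linked))
    (fmap_dir / "sub-01_ses-01_dir-PA_epi.json").write_text(json.dumps({}))
    unlinked = fmaps_missing_intended_for(tmp)
```

On a real dataset, call `fmaps_missing_intended_for('/path/to/bids_dataset')` and inspect any returned paths before running fMRIPrep.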

## Optional Heuristic Functions

### `filter_dicom(dcm_data)`
Return `True` to **exclude** a DICOM from consideration:
```python
def filter_dicom(dcm_data):
    """Exclude localizers and scouts."""
    if 'localizer' in dcm_data.SeriesDescription.lower():
        return True
    return False
```

### `filter_files(fl)`
Return `True` to **keep** a file path:
```python
def filter_files(fl):
    """Only process .dcm files."""
    return fl.endswith('.dcm')
```

### `infotoids(seqinfos, outdir)`
Override automatic subject/session/locator extraction:
```python
def infotoids(seqinfos, outdir):
    return {
        'locator': 'my_study',
        'session': None,
        'subject': None  # uses default extraction
    }
```

### `custom_seqinfo(wrapper, series_files)`
Add a custom value accessible via `s.custom`. The return value must be hashable, so return a tuple or string rather than a dict or list:
```python
def custom_seqinfo(wrapper, series_files):
    """Expose the patient position of each series as s.custom."""
    # wrapper is a nibabel DICOM wrapper; dcm_data is its pydicom dataset.
    position = wrapper.dcm_data.get('PatientPosition', '')
    return (str(position),)
```

## CLI Reference

### Key Flags

| Flag | Description |
|------|-------------|
| `--files` | DICOM files/directories/tarballs |
| `-d, --dicom_dir_template` | Path template with `{subject}`, `{session}` |
| `-o, --outdir` | Output directory |
| `-f, --heuristic` | Heuristic name or path to .py file |
| `-s, --subjects` | Subject ID(s) |
| `-ss, --ses` | Session label |
| `-c, --converter` | `dcm2niix` or `none` |
| `-b, --bids` | Enable BIDS output |
| `--minmeta` | Exclude dcmstack metadata from sidecars |
| `--overwrite` | Overwrite existing files |
| `-g, --grouping` | `studyUID` (default), `accession_number`, `all`, `custom` |
| `--dcmconfig` | JSON config for dcm2niix options |
| `-q, --queue` | `SLURM` for batch submission |

### Batch Processing

```bash
# Multiple subjects: {subject} is expanded only in -d templates, not in --files
heudiconv -d '/path/to/dicoms/{subject}/*/*/*.dcm' \
  -o /path/to/bids -f heuristic.py \
  -s sub01 sub02 sub03 \
  -c dcm2niix -b --minmeta

# Using dicom_dir_template for organized DICOMs, one session label per call
for ses in ses01 ses02; do
  heudiconv -d '/data/dicoms/{subject}/{session}/*/*.dcm' \
    -o /path/to/bids -f heuristic.py \
    -s sub01 -ss "$ses" \
    -c dcm2niix -b --minmeta
done
```

## Troubleshooting

### "No DICOMs found" or empty dicominfo.tsv
- Check that the `--files` glob pattern matches actual DICOM paths
- Try passing the parent directory instead of a glob: `--files /path/to/dicoms/`
- Verify DICOMs aren't compressed (decompress first)

### Duplicate series in output
- Siemens MoCo series: filter with `not s.is_motion_corrected`
- Derived reconstructions: filter with `not s.is_derived`
- Check `image_type` for `'ORIGINAL'` vs `'DERIVED'`

### Session mixing errors
- Don't combine multiple sessions in one DICOM folder
- Use `-g accession_number` if studyUID grouping fails

### Large JSON sidecars
- Always use `--minmeta` to prevent dcmstack metadata bloat

### "Template must be a valid format string"
- Check that all `create_key()` calls have non-empty template strings
- Ensure no `None` templates

### ReproIn Convention
If DICOMs were named following ReproIn at scan time, use the built-in ReproIn heuristic:
```bash
heudiconv --files /path/to/dicoms/ -o /path/to/bids -f reproin -c dcm2niix -b --minmeta
```
@@ -0,0 +1,317 @@
---
name: fmri-ssm
description: >
  State-space models (SSMs) for fMRI analysis: HMM, HMM-MAR, sticky/HDP-HMM, IO-HMM,
  SLDS, rSLDS, SNLDS. Covers resting-state, task-based (MID, SST, N-back), and naturalistic
  fMRI (movie, gaming). Python code generation (hmmlearn, ssm, pyhsmm, osl-dynamics, glhmm),
  HRF-aware modeling, fMRIPrep/XCP-D preprocessing, CIFTI/parcellation/ICA, model selection,
  and single-subject + group-level inference. Trigger keywords: HMM on brain data, brain
  state dynamics, dynamic FC, switching dynamics, latent states from BOLD, HRF deconvolution
  for state models, SLDS/rSLDS on neural timeseries, choosing K for fMRI, state-space +
  neuroimaging, task paradigms (MID, SST, N-back, movie-watching) with dynamic/latent-state
  analysis, temporal dynamics beyond standard GLM.
---

# State-Space Modeling for fMRI Data

Full pipeline skill: from preprocessed BOLD data to fitted SSMs and group-level inference,
across paradigms and data formats.

## When to Use This Skill

- User asks about HMMs, brain states, dynamic functional connectivity, or switching dynamics
- User wants to go beyond standard GLM and model temporal brain dynamics
- User specifies a paradigm (resting-state, task, movie-watching) and wants latent-state analysis
- User asks about choosing K (number of states), HRF deconvolution for SSMs, or state persistence
- User wants to fit or compare SSM libraries (hmmlearn, ssm, osl-dynamics, glhmm, pyhsmm)
- User asks about group-level state inference, label-switching, or multi-subject SSMs

---

## How to Use This Skill

1. **Identify paradigm and data format**: this determines HRF strategy, model family, and
   covariate structure.
2. **Use the decision tree below** to narrow the model choice.
3. **Read the relevant reference file** for implementation details before generating code.
4. **Use `references/code_templates.md` as the starting point for any code**: the templates
   handle common pitfalls (HRF, TR alignment, state ordering, multi-run boundaries).

## Reference Files

Read these before generating code or providing detailed guidance:

| File | When to read |
|------|-------------|
| `references/model_catalog.md` | User asks about a specific model, needs help choosing, or wants to understand model assumptions |
| `references/hrf_modeling.md` | Any question about HRF, temporal blurring, deconvolution, or how hemodynamics affect state inference |
| `references/paradigm_guide.md` | User specifies a paradigm (resting, task, naturalistic) and needs paradigm-specific recommendations |
| `references/preprocessing.md` | Questions about preprocessing before SSM fitting, confound regression, CIFTI handling, parcellation |
| `references/code_templates.md` | When generating Python code for any SSM (always read this first) |
| `references/group_inference.md` | User wants to compare states across subjects, conditions, or populations |
| `references/downstream_analysis.md` | User asks about behavioral correlates, reporting, methods section writing, simulation testing, or what to do after fitting the model |

---

## Quick-Reference Decision Tree

### Step 1: What is the paradigm?

**Resting state** → States represent intrinsic brain dynamics. No external timing. HRF is a
nuisance: you are modeling BOLD-level dynamics unless you explicitly deconvolve. Use HMM or
HMM-MAR on parcellated or ICA data.
→ Read `references/paradigm_guide.md § Resting State`

**Task-based (event/block design)** → External task structure provides ground truth for
validation. Key question: are you modeling task-evoked states or residual dynamics beyond the
task? HRF alignment is critical: task events are in neural time, but BOLD is delayed 4-6 s.
→ Read `references/paradigm_guide.md § Task-Based`

**Naturalistic (movie, TV, gaming)** → Continuous stimulation without discrete trial structure.
States reflect stimulus-driven + endogenous dynamics. Shared stimulus enables inter-subject
state alignment. Typically longer runs, which helps estimation.
→ Read `references/paradigm_guide.md § Naturalistic`

### Step 2: What data format?

| Format | Typical shape | Notes |
|--------|--------------|-------|
| ROI timeseries | (T, n_rois) | Most common. Parcellate first (Schaefer, Gordon, Glasser). |
| ICA components | (T, n_components) | From group ICA (FSL MELODIC, GIFT). Dimensionality-reduced. |
| CIFTI dtseries | (T, n_vertices+n_voxels) | Surface + subcortical. Parcellate or PCA before SSM. |
| Voxel-level NIfTI | (T, n_voxels) | Too high-dimensional for direct SSM. Apply parcellation/ICA/PCA first. |

→ See `references/preprocessing.md § Dimensionality Reduction` for CIFTI and voxel data.

### Step 3: Which model family?

**"I want discrete brain states and their transitions"**
→ HMM family. Gaussian HMM if states differ in mean/FC patterns. HMM-MAR if states differ
in temporal dynamics (autoregressive structure).

**"I want continuous latent dynamics that switch between regimes"**
→ SLDS or rSLDS. States govern a linear dynamical system; latent space is continuous.

**"I want external inputs (task events, stimuli) to drive state transitions"**
→ Input-output HMM or input-driven SLDS.

**"I want hierarchical structure (fast states within slow states)"**
→ Hierarchical HMM or hierarchical SLDS. Useful for naturalistic paradigms.

**"Deep data on few subjects"**
→ Per-subject models with more complex structure (rSLDS, HMM-MAR with many lags).
Cross-validate within-subject using held-out runs or time segments.

**"Many subjects with short scans"**
→ Group-level HMM or concatenation approaches. Simpler models (Gaussian HMM) are more
robust. → See `references/group_inference.md`

→ Full model specs, equations, and breakdown conditions: `references/model_catalog.md`

### Step 4: HRF strategy

The BOLD signal is not a direct readout of neural activity: it is the neural signal convolved
with the HRF, introducing a ~4-6 s delay and temporal smoothing.

**Why this matters:** If neural states switch rapidly (<5 s), the HRF blurs transitions in BOLD.
A 2-second neural state may appear as a gradual 8-10 s transition.

| Situation | Strategy |
|-----------|----------|
| States slow (>10 s) | Fit SSM directly on BOLD. Includes most resting-state and block designs. |
| States fast (<10 s) + have task timing | Deconvolve first (Wiener/FIR), then fit SSM. Or model HRF within emission. |
| Naturalistic / no explicit timing | Fit on BOLD directly. States at BOLD timescale are interpretable. |

→ Full treatment with code: `references/hrf_modeling.md`
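
The blurring described above can be illustrated with a short standard-library simulation. This is a sketch under stated assumptions, not the code from `references/hrf_modeling.md`: the HRF is a generic double-gamma shape (gamma shapes 6 and 16, undershoot ratio 1/6), and the 2 s event, 0.1 s step, and 40 s window are arbitrary illustrative choices.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def canonical_hrf(t):
    """Double-gamma HRF: response peaking near 5 s minus a small late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


dt = 0.1                                   # simulation step in seconds
n = int(40 / dt)                           # 40 s window
times = [i * dt for i in range(n)]
neural = [1.0 if 10.0 <= t < 12.0 else 0.0 for t in times]   # 2 s neural state
kernel = [canonical_hrf(i * dt) for i in range(int(30 / dt))]

# Discrete convolution: each BOLD sample is a weighted sum of past neural activity.
bold = []
for i in range(n):
    acc = 0.0
    for j in range(min(i + 1, len(kernel))):
        acc += neural[i - j] * kernel[j]
    bold.append(acc * dt)

peak = max(bold)
peak_time = times[bold.index(peak)]
above_half = [t for t, b in zip(times, bold) if b >= peak / 2]
bold_width = above_half[-1] - above_half[0]   # full width at half maximum, seconds
peak_delay = peak_time - 10.0                 # seconds from neural onset to BOLD peak
print(f"peak delay = {peak_delay:.1f} s, FWHM = {bold_width:.1f} s")
```

The printed width is larger than the 2 s neural event, which is why states lasting only a few TRs in BOLD should not be read as equally brief neural events.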

### Step 5: Key parameter decisions

**Number of states (K):** No single right answer. Use:
- BIC (tends to prefer fewer states with fMRI)
- Cross-validated log-likelihood (held-out time segments or runs)
- Free energy (variational methods)
- Stability: fit multiple K values, check state reproducibility across initializations
- Sanity check: states should have interpretable spatial patterns and plausible dwell times

**Covariance structure (Gaussian emissions):**
- Full: captures FC per state, many parameters; needs substantial data or regularization
- Diagonal: fewer parameters, more robust with limited data
- Factor analysis / low-rank: good for high-dimensional data

**Autoregressive order (HMM-MAR):** Typical range 1–5 for TR=0.7–2s. Cross-validate.

**Initialization:** K-means or GMM (much better than random). 20–50 random restarts.

**Stickiness:** If model infers rapid switching (every TR), check for motion artifacts or add
a sticky prior. Sticky HMM adds self-transition bias, discouraging unrealistic rapid switching.
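
The covariance trade-off above becomes concrete when the free parameters of a Gaussian HMM are counted. The sketch below uses the standard counts (initial distribution, transition rows, means, covariances); the helper name `gaussian_hmm_param_count` and the 300-TR, 100-ROI scenario are illustrative assumptions.

```python
def gaussian_hmm_param_count(n_states, n_features, covariance="full"):
    """Count free parameters of a Gaussian HMM."""
    initial = n_states - 1                       # initial distribution sums to 1
    transitions = n_states * (n_states - 1)      # each transition row sums to 1
    means = n_states * n_features
    if covariance == "full":
        cov = n_states * n_features * (n_features + 1) // 2   # symmetric matrices
    elif covariance == "diag":
        cov = n_states * n_features
    else:
        raise ValueError("covariance must be 'full' or 'diag'")
    return initial + transitions + means + cov


n_trs, n_rois = 300, 100   # hypothetical 5-minute scan at TR = 1 s, 100 parcels
for k in (4, 8, 12):
    full = gaussian_hmm_param_count(k, n_rois, "full")
    diag = gaussian_hmm_param_count(k, n_rois, "diag")
    print(f"K={k:2d}  full={full:6d}  diag={diag:5d}  data values={n_trs * n_rois}")
```

In this scenario the 12-state full-covariance model has roughly twice as many free parameters as there are data values, while the diagonal model stays well below the data size.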

### Step 6: What to do after fitting

**Connect states to behavior or cognition:**
→ Read `references/downstream_analysis.md § Behavioral Correlates`

**Write a methods/results section or prepare figures:**
→ Read `references/downstream_analysis.md § Reporting Checklist` and `§ Required Figures`

**Sanity check: does the pipeline even work?**
→ Run `simulate_and_recover()` from `references/downstream_analysis.md § Simulation Testing`
before trusting real-data results.

---

## Covariates and Confounds

**Regress out before SSM fitting:**
- Head motion parameters (6 or 24-parameter expansion)
- CSF and WM signals (or aCompCor components)
- Use fMRIPrep/XCP-D recommended confound strategy
- Global signal regression: controversial, because it can introduce anticorrelations

**Covariates that can enter the SSM:**
- Task timing (onsets, durations) → transition model or emission model
- Stimulus features (naturalistic) → emission model covariates
- Subject-level covariates (age, group) → hierarchical prior

→ Full confound strategy: `references/preprocessing.md`

---

## Examples

### Example 1: Resting-state HMM with hmmlearn

User: "I have fMRIPrep-preprocessed resting-state data parcellated to Schaefer-200.
I want to find recurring brain states."

Steps:
1. Read `references/paradigm_guide.md § Resting State` and `references/code_templates.md`
2. Recommend Gaussian HMM (states differ in FC patterns, not dynamics)
3. Suggest K=6-12 with BIC/cross-validation
4. Sticky HMM to avoid TR-level switching
5. Use full covariance if data permits (>300 TRs per state), else diagonal

Key code pattern (from `references/code_templates.md`):
```python
from hmmlearn import hmm

# X: (total_TRs, n_rois) array of concatenated runs
# run_lengths: number of TRs in each run; K: number of states
model = hmm.GaussianHMM(
    n_components=K,
    covariance_type="full",  # or "diag" for limited data
    n_iter=100,
    random_state=42
)
# Use K-means init; multiple restarts; pass lengths array for multi-run
model.fit(X, lengths=run_lengths)
```
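
Once a model is fitted, the decoded state path (for example from `model.predict(X, lengths=run_lengths)`, converted with `.tolist()`) can be summarized as fractional occupancy and mean dwell time. The sketch below uses only the standard library and a made-up state path; the helper names are hypothetical and are not taken from `references/code_templates.md`.

```python
from itertools import groupby


def split_runs(states, lengths):
    """Cut a concatenated state path back into per-run lists."""
    runs, start = [], 0
    for n in lengths:
        runs.append(states[start:start + n])
        start += n
    return runs


def fractional_occupancy(states, n_states):
    """Share of time points assigned to each state."""
    return [states.count(k) / len(states) for k in range(n_states)]


def mean_dwell_times(states, lengths, n_states, tr):
    """Mean visit duration per state in seconds, never merging visits across runs."""
    visits = {k: [] for k in range(n_states)}
    for run in split_runs(states, lengths):
        for state, group in groupby(run):
            visits[state].append(sum(1 for _ in group))
    return [tr * sum(v) / len(v) if v else 0.0 for v in visits.values()]


# Hypothetical decoded path for two runs of 6 TRs each, K = 2, TR = 2 s.
path = [0, 0, 0, 1, 1, 0,   0, 0, 1, 1, 1, 1]
run_lengths = [6, 6]
occupancy = fractional_occupancy(path, 2)
dwell = mean_dwell_times(path, run_lengths, 2, tr=2.0)
```

Splitting by run first matters here: without it, the state-0 visit at the end of run 1 would be merged with the one at the start of run 2 and inflate the dwell time.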

### Example 2: Task-based SSM with HRF-aware IO-HMM (N-back paradigm)

User: "I have N-back fMRI (0-back, 2-back). I want to model how cognitive load shifts
brain states over time."

Steps:
1. Read `references/paradigm_guide.md § Task-Based` and `references/hrf_modeling.md`
2. Recommend IO-HMM, because external task input drives state transitions
3. HRF strategy: convolve task regressors with canonical HRF before using as inputs
4. Read `references/code_templates.md` for IO-HMM template

Key decisions:
- Task input: HRF-convolved indicator for 0-back vs. 2-back condition
- Model: `ssm.HMM` with input-driven transitions (`transitions="inputdriven"`, i.e. `InputDrivenTransitions`)
- Covariates: regress out motion + CSF/WM before fitting

### Example 3: Naturalistic fMRI with rSLDS (movie-watching)

User: "I have movie-watching fMRI from 50 subjects. I want continuous latent dynamics
that switch between regimes, not just discrete states."

Steps:
1. Read `references/paradigm_guide.md § Naturalistic` and `references/model_catalog.md § rSLDS`
2. Recommend rSLDS: a continuous latent space with state-dependent switching boundaries
3. Start with a plain SLDS first; upgrade to rSLDS if the SLDS fits poorly
4. Group-level inference: `references/group_inference.md`

Key considerations:
- Fit on BOLD directly (naturalistic = slow states, HRF less critical)
- Use `ssm.SLDS(..., transitions="recurrent")` with Laplace-EM
- Shared stimulus across subjects enables inter-subject state alignment

---

## Common Pitfalls

1. **Fitting SSMs to raw data.** Always preprocess fully first (fMRIPrep + XCP-D with
   appropriate denoising).

2. **Ignoring HRF when interpreting fast states.** States lasting 2-3 TRs in BOLD ≠
   2-3 second neural events. Acknowledge the BOLD timescale.

3. **Too many states for the data.** Rule of thumb: ≥50-100 time points per state for
   stable full-covariance Gaussian HMM. A 5-min scan (~300 TRs at TR = 1 s) cannot reliably
   support a 12-state model with 100 ROIs.

4. **Naive multi-run concatenation.** The state at end of run 1 has no temporal relation
   to start of run 2. Pass `lengths` array to reset the forward algorithm at run boundaries.

5. **State label switching across subjects.** HMMs are invariant to state relabeling. Use
   Hungarian algorithm on emission parameters or fit a group model to align labels.

6. **Motion-driven states.** High-motion time points create artifactual states. Scrub/censor
   before fitting; verify states don't correlate with framewise displacement.

7. **Circular analysis.** Don't select K on full data then refit and report. Use proper
   cross-validation or a held-out set.
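
Pitfall 5 can be sketched in a few lines. Note the swap: instead of the Hungarian algorithm, this example searches all permutations, which gives the same optimal matching for small K and needs only the standard library; for larger K an assignment solver such as `scipy.optimize.linear_sum_assignment` is the scalable choice. The state means and function names below are hypothetical.

```python
from itertools import permutations


def squared_distance(a, b):
    """Squared Euclidean distance between two state mean vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align_states(reference_means, subject_means):
    """Return perm such that subject state perm[k] corresponds to reference state k.

    Exhaustive search over permutations: fine for small K only.
    """
    k = len(reference_means)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        cost = sum(
            squared_distance(reference_means[i], subject_means[perm[i]])
            for i in range(k)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm)


# Hypothetical 3-state, 2-ROI means: the subject's states are a shuffled, noisy copy.
reference = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
subject = [[-0.9, -1.1], [1.1, 0.1], [0.1, 0.9]]
perm = align_states(reference, subject)
relabel = {old: new for new, old in enumerate(perm)}   # subject label -> reference label
```

Apply `relabel` to each subject's decoded state path before computing group statistics, so that state k refers to the same pattern in every subject.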

---

## Python Library Landscape

| Library | Models | GPU/JAX | Best for | Limitations |
|---------|--------|---------|----------|-------------|
| `hmmlearn` | Gaussian HMM, GMM-HMM | No | Simple API, well-tested | No AR emissions, no SLDS |
| `ssm` (Linderman) | HMM, HSMM, SLDS, rSLDS, SNLDS, input-driven | No (autograd/NumPy) | Most complete SSM library | Less mature; CPU only |
| `dynamax` (probml) | HMM, LGSSM, custom SSMs | Yes (JAX-native) | **Modular Lego design**: swap emission/transition/initial independently; full JAX JIT | Newer; no rSLDS/SNLDS out of box |
| `pyhsmm` | HMM, sticky HMM, HDP-HMM | No | Bayesian nonparametric (auto-selects K) | Slower, less maintained |
| `brainiak` | HMM, HTFA | No | Neuro-specific, fMRI-designed | Limited model variety |
| `glhmm` (Vidaurre) | Gaussian-Linear HMM (≈ HMM-MAR) | Roadmap | Group-level neuroimaging | Newer, less documentation |
| `osl-dynamics` | HMM, DyNeMo | Yes (TensorFlow) | Oxford successor to HMM-MAR toolbox | GPU required for DyNeMo |
| `nilearn` | (not SSMs) | n/a | Preprocessing, parcellation, connectivity | No SSM fitting |

→ Working code examples for each library: `references/code_templates.md`

---

## Language & Compute Environment

**This skill is Python-first.** All code targets Python ≥ 3.10 with the library versions
in each template. JAX-based libraries such as `dynamax` run on GPU transparently: the same
API works on CPU or GPU. The `ssm` library (Linderman lab) is built on autograd and NumPy
and runs on CPU.

**GPU acceleration: recommended, not required**

| Scenario | Recommendation |
|----------|---------------|
| Standard Gaussian HMM, ≤ 50 subjects | CPU is fine (hmmlearn is fast) |
| dynamax HMM, > 50 subjects or > 1000 TRs/subject | GPU recommended |
| rSLDS / SNLDS (Laplace-EM) | Slowest to fit; a GPU helps only if the implementation is JAX- or TensorFlow-based (`ssm` itself is CPU-only) |
| dynamax custom SSM with JIT | GPU recommended; JIT gives 10–100× speedup even on CPU |
| DyNeMo (osl-dynamics) | GPU required for practical training |
| Model selection sweeps over K | Parallelize across GPUs or HPC jobs |

All code in `references/code_templates.md` runs on CPU. GPU is a drop-in speedup.
See `references/code_templates.md §12` (JAX/GPU) and `§13` (dynamax Lego models).

**JAX note:** Install JAX before the JAX-based libraries to get GPU support. `ssm` is
installed separately, following the instructions in its repository README.
```bash
# CPU-only JAX (default)
pip install jax dynamax

# GPU (CUDA 12)
pip install -U "jax[cuda12]"
pip install dynamax
```

**Julia:** For Bayesian HMM inference with full MCMC uncertainty, `Turing.jl` and
`StateSpaceModels.jl` are excellent alternatives with richer posterior sampling than
Python's `pyhsmm`. Recommend if the user is already in the Julia ecosystem or needs
calibrated credible intervals on transition probabilities.