@datagrok/peptides 1.27.3 → 1.27.4

package/CLAUDE.md ADDED
@@ -0,0 +1,184 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Overview
+
+ **Peptides** (`@datagrok/peptides`) is a Datagrok plugin for **Structure-Activity Relationship (SAR) analysis** of peptide collections. It detects macromolecule columns automatically, renders amino acids with color-coded monomers, and provides interactive viewers to identify point mutations and residues causing major activity changes.
+
+ Category: **Bioinformatics**. Top menu: `Bio | Analyze | SAR...`.
+
+ ## Build Commands
+
+ ```bash
+ npm install
+ npm run build # grok api && grok check --soft && webpack
+ npm run test # grok test
+ ```
+
+ ## Architecture
+
+ ### Entry Point — `src/package.ts`
+
+ `PackageFunctions` class registers all platform-visible functions via `@grok.decorators`:
+ - `initPeptides` — `@init`: loads MonomerWorks, TreeHelper, and monomer library components
+ - `Peptides` — `@func`: app landing view with Simple/Complex/HELM demo buttons
+ - `peptidesDialog` (`Bio Peptides`) — `@func`: top-menu SAR dialog (`Bio | Analyze | SAR...`). Validates that the active table has Macromolecule and numerical columns, shows the config dialog, and launches the analysis
+ - `peptidesPanel` — `@panel`: property panel widget for Macromolecule columns
+ - `manualAlignment` — `@panel`: property panel for Monomer semtype, manual sequence editing
+ - `macromoleculeSarFastaDemo` — `@func`: demo dashboard loading FASTA peptides with −lg scaling + MCL clustering
+ - `lstPiechartCellRenderer` — `@func`: custom grid cell renderer for Logo Summary Table pie charts
+
+ **Registered Viewers** (all `@func` with `role: 'viewer'`):
+ - `Sequence Variability Map` → `MonomerPosition`
+ - `Most Potent Residues` → `MostPotentResidues`
+ - `Sequence Mutation Cliffs` → `MutationCliffsViewer`
+ - `Logo Summary Table` → `LogoSummaryTable`
+ - `Sequence Position Statistics` → `SequencePositionStatsViewer`
+ - `Active peptide selection` → `ClusterMaxActivityViewer`
+
+ **Exports**: `_package`, `getMonomerWorksInstance()`, `getTreeHelperInstance()`
+
+ ### Model — `src/model.ts`
+
+ `PeptidesModel` is the central controller for SAR analysis. It is stored in `DataFrame.temp['peptidesModel']` as a singleton per dataframe.
+
+ **Key responsibilities:**
+ - Manages analysis settings (sequence column, activity column, scaling, viewer toggles, sequence space params, MCL settings)
+ - Splits aligned sequences into per-position columns (`joinDataFrames`)
+ - Creates scaled activity column (none/lg/−lg)
+ - Manages collaborative selection across all viewers (WebLogo, invariant map, mutation cliffs, clusters)
+ - Builds the Properties panel accordion (Distribution, Mutation Cliffs pairs, Selection)
+ - Manages viewer lifecycle: add/close dendrogram, sequence space, cluster max activity, logo summary table, MonomerPosition, MostPotentResidues
+ - Computes and caches `MonomerPositionStats` — per-monomer-per-position statistics
+ - Handles grid rendering setup: WebLogo column headers, monomer cell renderers, tooltips
+
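As a rough illustration of the splitting and scaling responsibilities listed above, the sketch below works on plain arrays rather than Datagrok columns. The names `splitAligned` and `scaleValue`, the one-character-per-position alignment, and the `-` gap symbol are assumptions made for this example; the package's own `scaleActivity` and splitting code may differ.

```typescript
// Hypothetical, simplified stand-ins for the model's splitting and scaling steps.
type Scaling = 'none' | 'lg' | '-lg';

// Split equal-length aligned sequences (one character per position) into
// per-position columns: result[p][i] is the monomer of sequence i at position p.
function splitAligned(sequences: string[]): string[][] {
  if (sequences.length === 0) return [];
  const length = sequences[0].length;
  if (sequences.some((s) => s.length !== length))
    throw new Error('Sequences must be aligned to the same length');
  const columns: string[][] = [];
  for (let p = 0; p < length; p++)
    columns.push(sequences.map((s) => s[p]));
  return columns;
}

// Scale one activity value: identity, log10, or negated log10.
function scaleValue(value: number, scaling: Scaling): number {
  switch (scaling) {
    case 'none': return value;
    case 'lg': return Math.log10(value);
    case '-lg': return -Math.log10(value);
  }
}

// Three aligned toy peptides; the second has a gap at position 1.
const positionColumns = splitAligned(['ACD', 'A-D', 'GCD']);
// IC50-like values in molar units, scaled with -lg.
const scaled = [1e-6, 1e-3].map((v) => scaleValue(v, '-lg'));
```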
+ **Settings** are stored as a JSON tag on the DataFrame (`TAGS.SETTINGS`). Changing settings triggers selective updates (only affected viewers/computations refresh).
+
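One way to picture selective updates is to diff the previous and new settings and refresh only what depends on the changed keys. The sketch below is an assumption about the general shape, not the model's actual logic: `DemoSettings`, `toTag`, `fromTag`, and `changedKeys` are hypothetical, and only three of the documented settings fields are included.

```typescript
// Hypothetical subset of the settings object; the real PeptidesSettings has more fields.
interface DemoSettings {
  sequenceColumnName: string;
  activityColumnName: string;
  activityScaling: 'none' | 'lg' | '-lg';
}

// Serialize settings for storage in a string-valued tag, and parse them back.
const toTag = (s: DemoSettings): string => JSON.stringify(s);
const fromTag = (tag: string): DemoSettings => JSON.parse(tag) as DemoSettings;

// Return the keys whose values differ, so a caller can refresh only affected parts.
function changedKeys(prev: DemoSettings, next: DemoSettings): (keyof DemoSettings)[] {
  const allKeys = Array.from(
    new Set<string>([...Object.keys(prev), ...Object.keys(next)]),
  ) as (keyof DemoSettings)[];
  return allKeys.filter((k) => JSON.stringify(prev[k]) !== JSON.stringify(next[k]));
}

// Round-trip through the tag string, then change only the scaling.
const prevSettings = fromTag(toTag({
  sequenceColumnName: 'sequence',
  activityColumnName: 'IC50',
  activityScaling: 'none',
}));
const nextSettings: DemoSettings = {...prevSettings, activityScaling: '-lg'};
const changed = changedKeys(prevSettings, nextSettings);
```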
+ **Important fields:**
+ - `positionColumns` — columns representing individual positions in split sequences (tagged with `TAGS.POSITION_COL`)
+ - `webLogoSelection` — current monomer-position selection from WebLogo headers
+ - `_dm` — cached distance matrix
+ - `_sequenceSpaceViewer` — scatter plot for sequence space
+ - `_mclViewer` — MCL clustering viewer
+
+ ### Types — `src/utils/types.ts`
+
+ Key types:
+ - `PeptidesSettings` — full analysis config: sequenceColumnName, activityColumnName, activityScaling, viewer toggles, sequenceSpaceParams, mclSettings
+ - `SequenceSpaceParams` — distance function, gap penalties, DBSCAN params, fingerprint type
+ - `MCLSettings` — MCL clustering params: inflation, threshold, iterations, WebGPU, min cluster size
+ - `MutationCliffs` — `Map<Monomer, Map<Position, Map<Index, Indexes>>>` — pairwise cliff data
+ - `Selection` — `{ [positionOrClusterType]: string[] }` — unified selection across viewers
+
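The nested-map and selection shapes are easier to read with concrete aliases. The sketch below assumes that monomers and positions are strings and that `Index`/`Indexes` are row indexes; the aliases are renamed (`MonomerName`, `PositionName`, `RowIndex`, `RowIndexes`, `DemoSelection`) to avoid clashing with global DOM types, and the helpers `addCliff` and `toggleSelection` are hypothetical. The toggle shown here is a plain add/remove and does not attempt to reproduce the Shift/Ctrl logic of `modifySelection`.

```typescript
// Assumed aliases; the package defines its own in src/utils/types.ts.
type MonomerName = string;
type PositionName = string;
type RowIndex = number;
type RowIndexes = number[];
type DemoMutationCliffs = Map<MonomerName, Map<PositionName, Map<RowIndex, RowIndexes>>>;
type DemoSelection = {[positionOrClusterType: string]: string[]};

// Record that row `from` forms a cliff with row `to` for a monomer at a position.
function addCliff(cliffs: DemoMutationCliffs, monomer: MonomerName,
  position: PositionName, from: RowIndex, to: RowIndex): void {
  let byPosition = cliffs.get(monomer);
  if (!byPosition) {
    byPosition = new Map();
    cliffs.set(monomer, byPosition);
  }
  let byIndex = byPosition.get(position);
  if (!byIndex) {
    byIndex = new Map();
    byPosition.set(position, byIndex);
  }
  const partners = byIndex.get(from) ?? [];
  partners.push(to);
  byIndex.set(from, partners);
}

// Toggle one monomer (or cluster name) under a key, returning a new selection object.
function toggleSelection(selection: DemoSelection, key: string, item: string): DemoSelection {
  const current = selection[key] ?? [];
  const next = current.indexOf(item) >= 0 ?
    current.filter((v) => v !== item) : [...current, item];
  return {...selection, [key]: next};
}

const cliffs: DemoMutationCliffs = new Map();
addCliff(cliffs, 'A', '3', 0, 5);
addCliff(cliffs, 'A', '3', 0, 7);
const selected = toggleSelection(toggleSelection({}, '3', 'A'), '3', 'G');
```

Returning a new object from `toggleSelection` keeps the previous selection intact, which makes it easy to compare old and new state before notifying viewers.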
+ ### Constants — `src/utils/constants.ts`
+
+ - `COLUMNS_NAMES` — Activity, Monomer, Position, P-Value, Mean difference, Count, Ratio, etc.
+ - `TAGS` — DataFrame/column tags: SETTINGS, POSITION_COL, ANALYSIS_COL, CUSTOM_CLUSTER, MONOMER_POSITION_MODE, etc.
+ - `SEM_TYPES` — Monomer, MacromoleculeDifference
+ - `SCALING_METHODS` — none, lg, −lg
+ - `SUFFIXES` — Viewer prefixes for namespacing (LST, MP, MPR, WL)
+
+ ## Viewers (`src/viewers/`)
+
+ | File | Class | Purpose |
+ |---|---|---|
+ | `sar-viewer.ts` | `SARViewer` (abstract) | Base for MonomerPosition and MostPotentResidues. Manages mutation cliffs computation, invariant map stats, selection, mode switching (Invariant Map / Mutation Cliffs) |
+ | `sar-viewer.ts` | `MonomerPosition` | Horizontal heatmap: monomers × positions. Two modes: Invariant Map (colored by aggregated stats) and Mutation Cliffs (circle size=count, color=mean diff). Has monomer search/filter |
+ | `sar-viewer.ts` | `MostPotentResidues` | Vertical viewer: one row per position showing the most potent monomer with mean difference, p-value, count, ratio |
+ | `logo-summary.ts` | `LogoSummaryTable` | Per-cluster summary grid: WebLogo, activity distribution histogram, members count, mean difference, p-value. Supports original + custom clusters, filtering small clusters |
+ | `mutation-cliffs-viewer.ts` | `MutationCliffsViewer` | Line chart of mutation cliffs at a selected position. Splits by series column, syncs selection with main dataframe |
+ | `cluster-max-activity-viewer.ts` | `ClusterMaxActivityViewer` | Scatter plot: cluster size vs max activity per cluster. Draws threshold lines, supports auto-selection of top quadrants |
+ | `position-statistics-viewer.ts` | `SequencePositionStatsViewer` | Box/violin plot of numerical values grouped by monomers at a selected position. Configurable motif overhang |
+
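The per-monomer figures these viewers display (count, ratio, mean difference) can be sketched on plain arrays. This is a minimal sketch under stated assumptions: `monomerStats` and `DemoStats` are hypothetical names, the test statistic is Welch's t (the source only says "t-test"), and the p-value is omitted because it needs a t-distribution CDF that is out of scope here.

```typescript
interface DemoStats {
  count: number; // rows that have the monomer at the position
  ratio: number; // count divided by the total number of rows
  meanDifference: number; // mean(activity | monomer) - mean(activity | rest)
  tStatistic: number; // Welch's t statistic (an assumed choice of test)
}

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

// Unbiased sample variance; needs at least two values.
function variance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// Compare activities of rows with the monomer against all remaining rows.
function monomerStats(activities: number[], hasMonomer: boolean[]): DemoStats {
  const group = activities.filter((_, i) => hasMonomer[i]);
  const rest = activities.filter((_, i) => !hasMonomer[i]);
  if (group.length < 2 || rest.length < 2)
    throw new Error('Each group needs at least two values');
  const meanDifference = mean(group) - mean(rest);
  const standardError = Math.sqrt(
    variance(group) / group.length + variance(rest) / rest.length);
  return {
    count: group.length,
    ratio: group.length / activities.length,
    meanDifference,
    tStatistic: meanDifference / standardError,
  };
}

// Two rows carry the monomer (activities 8 and 9); three do not (5, 6, 4).
const stats = monomerStats([8, 9, 5, 6, 4], [true, true, false, false, false]);
```

For this toy input the monomer group has mean 8.5 and the rest has mean 5, so the mean difference is 3.5 and the ratio is 2/5.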
+ ## Widgets (`src/widgets/`)
+
+ | File | Purpose |
+ |---|---|
+ | `peptides.ts` | Main SAR launch UI: `analyzePeptidesUI` (config dialog) + `startAnalysis` (creates model, adds viewers, starts analysis). This is the primary entry point for all SAR analysis |
+ | `distribution.ts` | Activity distribution panels with histograms. Supports breakdown by monomers, positions, or clusters |
+ | `mutation-cliffs.ts` | Mutation Cliffs panel: pairs grid + unique sequences grid. Filtering by monomer, shift-click multi-select |
+ | `selection.ts` | Selection summary grid mirroring selected rows with WebLogo headers and monomer-position stats |
+ | `manual-alignment.ts` | Text area for editing aligned peptide sequences with apply/reset |
+ | `settings.ts` | Settings dialog accordion: General (activity, scaling), Viewers (toggle each), Columns (aggregation), Sequence Space (distance, DBSCAN), MCL params |
+
+ ## Utilities (`src/utils/`)
+
+ | File | Purpose |
+ |---|---|
+ | `algorithms.ts` | Core computations: `findMutations` (mutation cliff detection via web workers), `calculateMonomerPositionStatistics` (per-position stats), `calculateClusterStatistics`, `calculateMutationCliffStats` |
+ | `cell-renderer.ts` | Custom grid cell renderers: `renderMutationCliffs` (colored circles), `renderInvariantMap` (colored rectangles), `renderWebLogo` (stacked letter logos), `setWebLogoRenderer` (column header WebLogos with mouse events), `LSTPieChartRenderer` |
+ | `misc.ts` | Helpers: `scaleActivity`, `modifySelection` (Shift/Ctrl logic), `highlightMonomerPosition`, `initSelection`, `getSelectionBitset`, `isSelectionEmpty`, `extractColInfo`, `expandGrid` |
+ | `statistics.ts` | Stats engine: `calculateStats` (t-test, mean difference, p-value, ratio for monomer vs rest), `getAggregatedValue`, `getAggregatedValues`, `getFrequencies` |
+ | `tooltips.ts` | Rich tooltips for monomer-position cells: histogram + stats table + aggregated columns |
+ | `parallel-mutation-cliffs.ts` | `MutationCliffsCalculator` class: distributes pairwise comparisons across Web Workers for parallel computation |
+ | `types.ts` | Type definitions (see Types section above) |
+ | `constants.ts` | Constants (see Constants section above) |
+
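Distributing pairwise comparisons across workers amounts to cutting the n(n-1)/2 upper-triangular pairs into contiguous ranges. The sketch below shows one possible partitioning; `pairCount`, `chunkRanges`, and `pairFromLinearIndex` are hypothetical helpers, and `MutationCliffsCalculator` may split the work differently.

```typescript
// Number of unordered pairs (i < j) among n rows.
const pairCount = (n: number): number => n * (n - 1) / 2;

// Split [0, total) into at most `workers` contiguous, near-equal half-open ranges.
function chunkRanges(total: number, workers: number): Array<[number, number]> {
  if (workers < 1) throw new Error('Need at least one worker');
  const ranges: Array<[number, number]> = [];
  const size = Math.ceil(total / workers);
  for (let start = 0; start < total; start += size)
    ranges.push([start, Math.min(start + size, total)]);
  return ranges;
}

// Map a linear index k (row-major over the upper triangle) back to the pair (i, j), i < j.
function pairFromLinearIndex(k: number, n: number): [number, number] {
  if (k < 0 || k >= pairCount(n)) throw new Error('Linear index out of range');
  let i = 0;
  let rowLength = n - 1; // row i holds the pairs (i, i+1) .. (i, n-1)
  while (k >= rowLength) {
    k -= rowLength;
    i++;
    rowLength--;
  }
  return [i, i + 1 + k];
}

// Five rows give 10 pairs; three workers get ranges of 4, 4 and 2 pairs.
const ranges = chunkRanges(pairCount(5), 3);
// First pair handled by the last worker.
const firstOfLast = pairFromLinearIndex(ranges[ranges.length - 1][0], 5);
```

Each worker can then walk its own range and convert linear indexes to row pairs locally, so only two numbers per chunk need to be sent along with the shared data.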
+ ### Web Worker — `src/workers/mutation-cliffs-worker.ts`
+
+ Receives a chunk of the upper-triangular pairwise comparison matrix. For each pair, checks activity delta threshold and max mutation count. Returns qualifying pairs (position, index1, index2).
+
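The per-pair check described above can be sketched as a pure function over a chunk of pairs. This is an illustration under assumptions, not the worker's code: sequences are taken as arrays of monomers already split per position, `findCliffsInChunk`, `minActivityDelta`, and `maxMutations` are hypothetical names, and emitting one triple per differing position is a guess at how the (position, index1, index2) results are laid out.

```typescript
interface CliffHit {
  position: number;
  index1: number;
  index2: number;
}

function findCliffsInChunk(
  sequences: string[][], activities: number[], pairs: Array<[number, number]>,
  minActivityDelta: number, maxMutations: number,
): CliffHit[] {
  const hits: CliffHit[] = [];
  for (const [index1, index2] of pairs) {
    // Skip pairs whose activities are too close to count as a cliff.
    if (Math.abs(activities[index1] - activities[index2]) < minActivityDelta) continue;
    // Collect positions where the two sequences differ, stopping once over the limit.
    const first = sequences[index1];
    const second = sequences[index2];
    const differing: number[] = [];
    for (let p = 0; p < first.length && differing.length <= maxMutations; p++)
      if (first[p] !== second[p]) differing.push(p);
    // Identical sequences and pairs with too many mutations do not qualify.
    if (differing.length === 0 || differing.length > maxMutations) continue;
    for (const position of differing) hits.push({position, index1, index2});
  }
  return hits;
}

// Rows 0 and 1 differ at one position with a large activity gap, so they qualify.
// Rows 0 and 2 are too close in activity; rows 1 and 2 differ at two positions.
const seqs = [['A', 'C', 'D'], ['A', 'G', 'D'], ['K', 'G', 'E']];
const hits = findCliffsInChunk(seqs, [9.0, 5.0, 8.5], [[0, 1], [0, 2], [1, 2]], 2, 1);
```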
+ ### Helpers — `src/peptideUtils.ts`
+
+ `PeptideUtils` static class: lazily loads and caches monomer library (`IMonomerLib`) and sequence helper (`ISeqHelper`). Call `loadComponents()` before using.
+
+ ### Demo — `src/demo/fasta.ts`
+
+ `macromoleculeSarFastaDemoUI`: loads sample CSV, configures FASTA/PT notation, scales IC50 with −lg, launches SAR with MCL clustering.
+
+ ## Analysis Flow
+
+ 1. User opens table with Macromolecule column → `initPeptides` loads MonomerWorks + TreeHelper
+ 2. User triggers SAR via top menu (`Bio | Analyze | SAR...`) or column panel → `analyzePeptidesUI` shows config dialog
+ 3. `startAnalysis` creates `PeptidesModel`, splits sequences into position columns, scales activity
+ 4. Model adds viewers (MonomerPosition, MostPotentResidues, LogoSummaryTable, etc.) to the TableView
+ 5. Viewers compute stats independently but share selection through the model's collaborative filtering
+ 6. Property panel accordion shows Distribution, Mutation Cliffs pairs, and Selection based on current viewer/selection state
+
+ ## Tests (`src/tests/`)
+
+ | File | Categories |
+ |---|---|
+ | `core.ts` | Start analysis (simple/complex), save/load project |
+ | `viewers.ts` | Viewer rendering, selection, interaction |
+ | `widgets.ts` | Settings dialog, distribution, mutation cliffs widgets |
+ | `model.ts` | Model initialization, settings changes, viewer management |
+ | `table-view.ts` | Grid interactions, selection sync, WebLogo behavior |
+ | `misc.ts` | Algorithm unit tests (mutation cliff detection) |
+ | `benchmarks.ts` | Performance tests: mutation cliffs, cluster stats, monomer-position stats at 5k–200k sequences |
+ | `utils.ts` | Test constants (`TEST_COLUMN_NAMES`) |
+
+ ## Key Dependencies
+
+ - `@datagrok-libraries/bio` — macromolecule utilities, sequence splitting, monomer palettes, WebLogo, MonomerWorks
+ - `@datagrok-libraries/ml` — distance matrices, dimensionality reduction (UMAP/tSNE), MCL clustering, macromolecule distance functions
+ - `@datagrok-libraries/math` — DBSCAN worker, WebGPU utilities
+ - `@datagrok-libraries/statistics` — statistical functions
+ - `@datagrok-libraries/utils` — BitArray, UI helpers (`u2.appHeader`)
+ - `@datagrok-libraries/tutorials` — tutorial framework
+ - `wu` — lazy iteration
+ - `cash-dom` — jQuery-like DOM manipulation
+ - `uuid` — unique ID generation
+ - `rxjs` — reactive subscriptions
+
+ ## Quick Lookups
+
+ | Looking for... | Check first |
+ |---|---|
+ | App/viewer/panel registration | `src/package.ts` |
+ | Analysis launch flow | `src/widgets/peptides.ts` → `startAnalysis` |
+ | Central model, settings, viewer management | `src/model.ts` |
+ | All type definitions | `src/utils/types.ts` |
+ | Column names, tags, scaling enums | `src/utils/constants.ts` |
+ | SAR heatmap viewers (MonomerPosition, MostPotentResidues) | `src/viewers/sar-viewer.ts` |
+ | Cluster summary viewer | `src/viewers/logo-summary.ts` |
+ | Mutation cliffs line chart | `src/viewers/mutation-cliffs-viewer.ts` |
+ | Custom grid cell rendering (WebLogo, heatmap cells) | `src/utils/cell-renderer.ts` |
+ | Statistics (t-test, p-value, mean diff) | `src/utils/statistics.ts` |
+ | Mutation cliff detection (parallel) | `src/utils/algorithms.ts` + `src/utils/parallel-mutation-cliffs.ts` |
+ | Selection logic (Shift/Ctrl) | `src/utils/misc.ts` → `modifySelection` |
+ | Tooltips for monomer-position cells | `src/utils/tooltips.ts` |
+ | Settings dialog UI | `src/widgets/settings.ts` |
+ | Demo data files | `files/aligned.csv`, `aligned_2.csv`, `aligned_3.csv` |
+ | Benchmark test data | `files/tests/` (5k–200k .d42 files) |