npm - immunum - Versions diffs - 0.9.0 → 0.9.2 - Mend

immunum 0.9.0 → 0.9.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 ENPICOM
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md CHANGED

@@ -2,23 +2,31 @@
 High-performance antibody and TCR sequence numbering in Rust, Python, and WebAssembly.
+[![Crates.io](https://img.shields.io/crates/v/immunum)](https://crates.io/crates/immunum)
+[![PyPI](https://img.shields.io/pypi/v/immunum)](https://pypi.org/project/immunum/)
+[![npm](https://img.shields.io/npm/v/immunum)](https://www.npmjs.com/package/immunum)
+[![License: MIT](https://img.shields.io/crates/l/immunum)](LICENSE)
+[![CI](https://img.shields.io/github/actions/workflow/status/ENPICOM/immunum/ci.yml?label=CI)](https://github.com/ENPICOM/immunum/actions/workflows/ci.yml)
+[![Docs](https://img.shields.io/badge/docs-immunum.enpicom.com-blue)](https://immunum.enpicom.com)
 ## Overview
-`immunum` is a library for numbering antibody and T-cell receptor (TCR) variable domain sequences. It uses Needleman-Wunsch semi-global alignment against position-specific scoring matrices (PSSM) built from consensus sequences, with BLOSUM62-based substitution scores.
+`immunum` is a library for numbering antibody and T-cell receptor (TCR) variable domain sequences. It uses Needleman-Wunsch semi-global alignment against position-specific scoring matrices built from consensus sequences, with BLOSUM62-based substitution scores.
 Available as:
 - **Rust crate** — core library and CLI
 - **Python package** — via PyPI (`pip install immunum`), with a [Polars](https://pola.rs) plugin for vectorized batch processing
-- **npm package** — WebAssembly build for Node.js and browsers
+- **npm package** — for Node.js and browsers
 ### Supported chains
-| Antibody | TCR |
-|----------|-----|
-| IGH (heavy) | TRA (alpha) |
-| IGK (kappa) | TRB (beta) |
+| Antibody     | TCR         |
+| ------------ | ----------- |
+| IGH (heavy)  | TRA (alpha) |
+| IGK (kappa)  | TRB (beta)  |
 | IGL (lambda) | TRD (delta) |
-| | TRG (gamma) |
+|              | TRG (gamma) |
 ### Numbering schemes
@@ -27,6 +35,26 @@ Available as:
 Chain type is automatically detected by aligning against all loaded chains and selecting the best match.
+## Table of Contents
+- [Python](#python)
+  - [Installation](#installation)
+  - [Numbering](#numbering)
+  - [Segmentation](#segmentation)
+  - [Polars plugin](#polars-plugin)
+- [JavaScript / npm](#javascript--npm)
+  - [Installation](#installation-1)
+  - [Usage](#usage)
+- [Rust](#rust)
+  - [Usage](#usage-1)
+- [CLI](#cli)
+  - [Options](#options)
+  - [Input](#input)
+  - [Output](#output)
+  - [Examples](#examples)
+- [Development](#development)
+- [Project structure](#project-structure)
 ## Python
 ### Installation
@@ -46,8 +74,8 @@ sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRI
 result = annotator.number(sequence)
 print(result.chain)       # H
-print(result.confidence)  # 0.97
-print(result.numbering)   # {"1": "E", "2": "V", "3": "Q", ...}
+print(result.confidence)  # 0.78
+print(result.numbering)   # {"1": "Q", "2": "V", "3": "Q", ...}
 ```
 ### Segmentation
@@ -55,14 +83,20 @@ print(result.numbering)   # {"1": "E", "2": "V", "3": "Q", ...}
 `segment` splits the sequence into FR/CDR regions:
 ```python
+from immunum import Annotator
+annotator = Annotator(chains=["H", "K", "L"], scheme="imgt")
+sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS"
 result = annotator.segment(sequence)
-print(result.fr1)   # EVQLVESGGGLVKPGGSLKLSCAAS
-print(result.cdr1)  # GFTFSSYAMS
-print(result.fr2)   # WVRQAPGKGLEWVS
-print(result.cdr2)  # AISGSGGS
-print(result.fr3)   # TYYADSVKGRFTISRDNAKN
-print(result.cdr3)  # ...
-print(result.fr4)   # ...
+assert result.fr1 == 'QVQLVQSGAEVKRPGSSVTVSCKAS'
+assert result.cdr1 == 'GGSFSTYA'
+assert result.fr2 == 'LSWVRQAPGRGLEWMGG'
+assert result.cdr2 == 'VIPLLTIT'
+assert result.fr3 == 'NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC'
+assert result.cdr3 == 'AREGTTGKPIGAFAH'
+assert result.fr4 == 'WGQGTLVTVSS'
 ```
 Chains: `"H"` (heavy), `"K"` (kappa), `"L"` (lambda), `"A"` (TRA), `"B"` (TRB), `"G"` (TRG), `"D"` (TRD).
@@ -73,7 +107,7 @@ For batch processing, `immunum.polars` registers elementwise Polars expressions:
 ```python
 import polars as pl
-import immunum.polars as ip
+import immunum.polars as imp
 df = pl.DataFrame({"sequence": [
     "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS",
@@ -82,18 +116,18 @@ df = pl.DataFrame({"sequence": [
 # Add a struct column with chain, scheme, confidence, numbering
 result = df.with_columns(
-    ip.number(pl.col("sequence"), chains=["H", "K", "L"], scheme="imgt").alias("numbered")
+    imp.number(pl.col("sequence"), chains=["H", "K", "L"], scheme="imgt").alias("numbered")
 )
 # Add a struct column with FR/CDR segments
 result = df.with_columns(
-    ip.segment(pl.col("sequence"), chains=["H", "K", "L"], scheme="imgt").alias("segmented")
+    imp.segment(pl.col("sequence"), chains=["H", "K", "L"], scheme="imgt").alias("segmented")
 )
 ```
 The `number` expression returns a struct with fields `chain`, `scheme`, `confidence`, and `numbering` (a struct of position→residue). The `segment` expression returns a struct with fields `fr1`, `cdr1`, `fr2`, `cdr2`, `fr3`, `cdr3`, `fr4`, `prefix`, `postfix`.
-## WebAssembly
+## JavaScript / npm
 ### Installation
@@ -110,12 +144,13 @@ await init(); // load the wasm module
 const annotator = new Annotator(["H", "K", "L"], "imgt");
-const sequence = "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS";
+const sequence =
+  "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS";
 const result = annotator.number(sequence);
-console.log(result.chain);       // "H"
-console.log(result.confidence);  // 0.97
-console.log(result.numbering);   // { "1": "E", "2": "V", ... }
+console.log(result.chain); // "H"
+console.log(result.confidence); // 0.97
+console.log(result.numbering); // { "1": "E", "2": "V", ... }
 const segments = annotator.segment(sequence);
 console.log(segments.cdr3);
@@ -148,6 +183,13 @@ for (aa, pos) in sequence.chars().zip(result.positions.iter()) {
 }
 ```
+Add to `Cargo.toml`:
+```toml
+[dependencies]
+immunum = "0.9"
+```
 ## CLI
 ```bash
@@ -156,11 +198,11 @@ immunum number [OPTIONS] [INPUT] [OUTPUT]
 ### Options
-| Flag | Description | Default |
-|------|-------------|---------|
-| `-s, --scheme` | Numbering scheme: `imgt` (`i`), `kabat` (`k`) | `imgt` |
-| `-c, --chain` | Chain filter: `h`,`k`,`l`,`a`,`b`,`g`,`d` or groups: `ig`, `tcr`, `all`. Accepts any form (`h`, `heavy`, `igh`), case-insensitive. | `ig` |
-| `-f, --format` | Output format: `tsv`, `json`, `jsonl` | `tsv` |
+| Flag           | Description                                                                                                                        | Default |
+| -------------- | ---------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| `-s, --scheme` | Numbering scheme: `imgt` (`i`), `kabat` (`k`)                                                                                      | `imgt`  |
+| `-c, --chain`  | Chain filter: `h`,`k`,`l`,`a`,`b`,`g`,`d` or groups: `ig`, `tcr`, `all`. Accepts any form (`h`, `heavy`, `igh`), case-insensitive. | `ig`    |
+| `-f, --format` | Output format: `tsv`, `json`, `jsonl`                                                                                              | `tsv`   |
 ### Input
@@ -219,7 +261,7 @@ uv tool install go-task-bin
 And then run `task` or `task --list-all` to get the full list of available tasks.
-By default, `dev` profile will be used in all but `behcnmark-*` taks, but you can change it
+By default, `dev` profile will be used in all but `benchmark-*` tasks, but you can change it
 via providing `PROFILE=release` to your task.
 Also, by default, `task` caches results, but you can ignore it by running `task my-task -f`.
@@ -251,12 +293,12 @@ task lint    # runs linting for python and rust
 ### Benchmarking
-There are multiple benchmarks in the repository. For full list, see `task | grep behcmark`:
+There are multiple benchmarks in the repository. For full list, see `task | grep benchmark`:
 ```bash
 $ task | grep benchmark
 * benchmark-accuracy:           Accuracy benchmark across all fixtures (1k sequences, 7 rounds each)
-* benchmark-cli:                Behcmark correctness of the CLI tool
+* benchmark-cli:                Benchmark correctness of the CLI tool
 * benchmark-comparison:         Speed + correctness benchmark: immunum vs antpack vs anarci (1k IGH sequences)
 * benchmark-scaling:            Scaling benchmark: sizes 100..10M (10x steps), 1 round, H/imgt. Pass CLI_ARGS to filter tools, e.g. -- --tools immunum
 * benchmark-speed:              Speed benchmark across dataset sizes (100 to 1M sequences, 7 rounds, H/imgt)
@@ -264,6 +306,7 @@ $ task | grep benchmark
 ```
 ## Project structure
 ```
 src/
 ├── main.rs          # CLI binary (immunum number ...)
@@ -291,8 +334,8 @@ fixtures/
 └── ig.tsv           # Example TSV input
 scripts/             # Python tooling for generating consensus data
 immunum/
-└── _internal.pyi    # python stub file for pyo3
-└── polars.py        # polars extension module
+├── _internal.pyi    # python stub file for pyo3
+├── polars.py        # polars extension module
 └── python.py        # python module
 ```
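
The updated segmentation example in the README hunk above replaces the placeholder output with concrete assertions for all seven regions. As a quick sanity check, the sketch below uses only the literal values shown in that hunk (it does not call the library) and confirms that the regions, taken in order, concatenate back to the input sequence. The variable names are illustrative, not part of the package.

```typescript
// Values copied verbatim from the README segmentation example in this diff.
const sequence =
  "QVQLVQSGAEVKRPGSSVTVSCKASGGSFSTYALSWVRQAPGRGLEWMGGVIPLLTITNYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYCAREGTTGKPIGAFAHWGQGTLVTVSS";

const segments = {
  fr1: "QVQLVQSGAEVKRPGSSVTVSCKAS",
  cdr1: "GGSFSTYA",
  fr2: "LSWVRQAPGRGLEWMGG",
  cdr2: "VIPLLTIT",
  fr3: "NYAPRFQGRITITADRSTSTAYLELNSLRPEDTAVYYC",
  cdr3: "AREGTTGKPIGAFAH",
  fr4: "WGQGTLVTVSS",
};

// Framework and CDR regions in N-to-C-terminal order.
const order = ["fr1", "cdr1", "fr2", "cdr2", "fr3", "cdr3", "fr4"] as const;

// Join the regions and compare against the original sequence.
const rebuilt = order.map((name) => segments[name]).join("");
const tilesSequence = rebuilt === sequence;

console.log(tilesSequence);
```

For this particular sequence the seven regions cover the whole input, so nothing is left over for the `prefix` and `postfix` fields that the README also mentions.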

package/immunum.d.ts CHANGED

@@ -8,7 +8,7 @@ export type Numbering = Record<string, string>;
 export interface NumberingResult {
     /** Detected chain type: `"H"`, `"K"`, `"L"`, `"A"`, `"B"`, `"G"`, or `"D"`. */
     chain: string;
-    /** Numbering scheme used: `"imgt"` or `"kabat"` */
+    /** Numbering scheme used: `"imgt"` or `"kabat"`. */
     scheme: string;
     /** Alignment confidence score between 0 and 1. */
     confidence: number;
@@ -31,7 +31,31 @@ export interface SegmentationResult {
     postfix: string;
 }
-/** Annotator for numbering antibody and TCR sequences. */
+/**
+ * Annotates antibody and T-cell receptor sequences with IMGT or Kabat position numbers.
+ *
+ * @param chains - Chain types to consider during auto-detection. Each entry is a
+ *   case-insensitive string. Accepted values:
+ *   - Antibody heavy chain: `"IGH"` / `"H"` / `"heavy"`
+ *   - Antibody kappa chain: `"IGK"` / `"K"` / `"kappa"`
+ *   - Antibody lambda chain: `"IGL"` / `"L"` / `"lambda"`
+ *   - TCR alpha chain:       `"TRA"` / `"A"` / `"alpha"`
+ *   - TCR beta chain:        `"TRB"` / `"B"` / `"beta"`
+ *   - TCR gamma chain:       `"TRG"` / `"G"` / `"gamma"`
+ *   - TCR delta chain:       `"TRD"` / `"D"` / `"delta"`
+ *
+ *   Pass all chains you want to consider; the annotator scores each and picks the
+ *   best-matching one. To consider every supported chain pass all seven values.
+ *
+ * @param scheme - Numbering scheme to use for output positions. Accepted values
+ *   (case-insensitive):
+ *   - `"IMGT"` / `"i"` — IMGT numbering (recommended; used internally)
+ *   - `"Kabat"` / `"k"` — Kabat numbering (derived from IMGT)
+ *
+ * @param min_confidence - Optional minimum alignment confidence threshold in the
+ *   range `[0, 1]`. Sequences scoring below this value are rejected with an error.
+ *   Defaults to `0.5` when `null` or omitted.
+ */
 export class Annotator {
     free(): void;
     [Symbol.dispose](): void;
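
The new doc comment on `Annotator` describes case-insensitive chain aliases and a `min_confidence` default of `0.5`. The sketch below restates those rules as plain TypeScript so they are easier to read at a glance. `normalizeChain` and `resolveMinConfidence` are hypothetical helpers written for illustration. They are not exported by the package and do not reflect how the WebAssembly module implements the behaviour. Throwing on an unknown alias or an out-of-range threshold is likewise an assumption.

```typescript
// Hypothetical helpers that mirror the rules in the Annotator doc comment.
// They are NOT part of the immunum package.
type ChainCode = "H" | "K" | "L" | "A" | "B" | "G" | "D";

// Alias table taken from the doc comment, keyed by lower-cased name.
const CHAIN_ALIASES: Record<string, ChainCode> = {
  igh: "H", h: "H", heavy: "H",
  igk: "K", k: "K", kappa: "K",
  igl: "L", l: "L", lambda: "L",
  tra: "A", a: "A", alpha: "A",
  trb: "B", b: "B", beta: "B",
  trg: "G", g: "G", gamma: "G",
  trd: "D", d: "D", delta: "D",
};

// Map any accepted spelling to the one-letter code used in NumberingResult.chain.
function normalizeChain(name: string): ChainCode {
  const code = CHAIN_ALIASES[name.toLowerCase()];
  if (code === undefined) {
    // Assumed behaviour: reject names that are not in the alias table.
    throw new Error(`Unknown chain: ${name}`);
  }
  return code;
}

// Apply the documented default of 0.5 when the threshold is null or omitted.
function resolveMinConfidence(value?: number | null): number {
  const resolved = value ?? 0.5;
  if (!(resolved >= 0 && resolved <= 1)) {
    // Assumed behaviour: the documented range is [0, 1].
    throw new RangeError(`min_confidence out of range: ${resolved}`);
  }
  return resolved;
}

console.log(["IGH", "kappa", "Delta"].map(normalizeChain)); // [ 'H', 'K', 'D' ]
console.log(resolveMinConfidence(null)); // 0.5
```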

package/immunum_bg.wasm CHANGED

Binary file

package/package.json CHANGED

@@ -4,7 +4,7 @@
     "ENPICOM <dev@enpicom.com>"
   ],
   "description": "Fast antibody and T-cell receptor numbering in Rust and Python",
-  "version": "0.9.0",
+  "version": "0.9.2",
   "license": "MIT",
   "repository": {
     "type": "git",