PyPI - LZGraphs - Versions diffs - 2.1.2__tar.gz → 2.3.0__tar.gz - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (69)

{lzgraphs-2.1.2 → lzgraphs-2.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: LZGraphs
-Version: 2.1.2
+Version: 2.3.0
 Summary: An Implementation of LZ76 Based Graphs for Repertoire Representation and Analysis
 Author-email: Thomas Konstantinovsky <thomaskon90@gmail.com>
 Maintainer-email: Thomas Konstantinovsky <thomaskon90@gmail.com>
@@ -29,7 +29,6 @@ Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: networkx>=3.0
 Requires-Dist: numpy>=1.24
-Requires-Dist: pandas>=1.5
 Requires-Dist: tqdm>=4.65
 Requires-Dist: scipy>=1.10
 Provides-Extra: viz
@@ -38,6 +37,7 @@ Requires-Dist: seaborn>=0.12; extra == "viz"
 Provides-Extra: dev
 Requires-Dist: pytest>=7.0; extra == "dev"
 Requires-Dist: pytest-cov>=4.0; extra == "dev"
+Requires-Dist: pandas>=1.5; extra == "dev"
 Requires-Dist: black>=23.0; extra == "dev"
 Requires-Dist: isort>=5.12; extra == "dev"
 Requires-Dist: ruff>=0.1.0; extra == "dev"
@@ -45,6 +45,8 @@ Requires-Dist: pre-commit>=3.0; extra == "dev"
 Requires-Dist: build>=1.0; extra == "dev"
 Requires-Dist: twine>=4.0; extra == "dev"
 Provides-Extra: docs
+Requires-Dist: mkdocs-material>=9.5; extra == "docs"
+Requires-Dist: mkdocstrings[python]>=0.24; extra == "docs"
 Dynamic: license-file
 <p align="center">
@@ -125,6 +127,7 @@ The diversity of T-cells and B-cells is crucial for producing receptors that rec
 - **Repertoire comparison** -- compare two repertoires via graph-level statistics
 - **Analytical probability distributions** -- exact moments and scipy-like distribution objects for generation probabilities
 - **Gene annotation support** -- optional V/J gene tracking on edges for gene usage analysis
+- **Bayesian posterior personalization** -- adapt population-level models to individual repertoires using Dirichlet-Multinomial conjugacy
 - **Abundance weighting** -- weight sequences by clonal abundance for more realistic models
 - **Serialization** -- save and load graphs in JSON format
@@ -150,23 +153,20 @@ print(LZGraphs.__version__)
 Build an amino acid positional graph from CDR3 sequences and compute sequence probabilities:
 ```python
-import pandas as pd
 from LZGraphs import AAPLZGraph
-# Prepare data as a DataFrame with a 'cdr3_amino_acid' column
-data = pd.DataFrame({
-    'cdr3_amino_acid': [
-        'CASSLAPGATNEKLFF',
-        'CASSLGQAYEQYF',
-        'CASSFSTCSANYGYTF',
-        'CASSQEGTEAFF',
-        'CASSLGQGNIQYF',
-        # ... your CDR3 amino acid sequences
-    ]
-})
+# Pass a plain list of CDR3 amino acid sequences
+sequences = [
+    'CASSLAPGATNEKLFF',
+    'CASSLGQAYEQYF',
+    'CASSFSTCSANYGYTF',
+    'CASSQEGTEAFF',
+    'CASSLGQGNIQYF',
+    # ... your CDR3 amino acid sequences
+]
 # Construct the graph
-graph = AAPLZGraph(data, verbose=True)
+graph = AAPLZGraph(sequences, verbose=True)
 # Compute the log-probability of a sequence under the model
 log_prob = graph.walk_log_probability('CASSLAPGATNEKLFF')
@@ -221,15 +221,14 @@ graph = NaiveLZGraph(cdr3_list, dictionary, verbose=True)
 ### Gene Annotation
-All three graph types support optional V and J gene annotation. Include `V` and `J` columns in your DataFrame (or pass them separately for NaiveLZGraph) to track gene usage on graph edges:
+All three graph types support optional V and J gene annotation. Pass gene lists alongside sequences to track gene usage on graph edges:
 ```python
-data = pd.DataFrame({
-    'cdr3_amino_acid': sequences,
-    'V': v_genes,
-    'J': j_genes,
-})
-graph = AAPLZGraph(data, verbose=True)
+sequences = ['CASSLEPSGGTDTQYF', 'CASSDTSGGTDTQYF', ...]
+v_genes   = ['TRBV16-1*01', 'TRBV1-1*01', ...]
+j_genes   = ['TRBJ1-2*01', 'TRBJ1-5*01', ...]
+graph = AAPLZGraph(sequences, v_genes=v_genes, j_genes=j_genes, verbose=True)
 # Gene data is now available
 print(graph.has_gene_data)           # True
@@ -248,16 +247,14 @@ This is particularly important for:
 - **Better representation of clonal expansion** -- dominant clones shape the graph structure proportionally to their prevalence
 - **More realistic sequence generation** -- simulated sequences reflect the abundance-weighted landscape, not just the unique sequence set
-To use abundance weighting, include an `abundance` column in your DataFrame:
+To use abundance weighting, pass an `abundances` list alongside your sequences:
 ```python
-data = pd.DataFrame({
-    'cdr3_amino_acid': ['CASSLAPGATNEKLFF', 'CASSLGQAYEQYF', 'CASSFSTCSANYGYTF'],
-    'abundance': [150, 42, 7],
-})
+sequences  = ['CASSLAPGATNEKLFF', 'CASSLGQAYEQYF', 'CASSFSTCSANYGYTF']
+abundances = [150, 42, 7]
 # Each sequence is weighted by its abundance during graph construction
-graph = AAPLZGraph(data, verbose=True)
+graph = AAPLZGraph(sequences, abundances=abundances, verbose=True)
 ```
 For `NaiveLZGraph`, pass abundances as a separate parameter:
@@ -335,6 +332,20 @@ jsd = jensen_shannon_divergence(graph1, graph2)
 comparison = compare_repertoires(graph1, graph2)
 ```
+### Bayesian Posterior Personalization
+```python
+# Adapt a population graph to an individual
+posterior = population_graph.get_posterior(
+    individual_sequences,
+    abundances=clonal_counts,
+    kappa=100.0  # prior strength
+)
+# The posterior is a full graph
+simulated = posterior.simulate(1000, seed=42)
+```
 ### Visualization
 ```python

{lzgraphs-2.1.2 → lzgraphs-2.3.0}/README.md RENAMED

@@ -76,6 +76,7 @@ The diversity of T-cells and B-cells is crucial for producing receptors that rec
 - **Repertoire comparison** -- compare two repertoires via graph-level statistics
 - **Analytical probability distributions** -- exact moments and scipy-like distribution objects for generation probabilities
 - **Gene annotation support** -- optional V/J gene tracking on edges for gene usage analysis
+- **Bayesian posterior personalization** -- adapt population-level models to individual repertoires using Dirichlet-Multinomial conjugacy
 - **Abundance weighting** -- weight sequences by clonal abundance for more realistic models
 - **Serialization** -- save and load graphs in JSON format
@@ -101,23 +102,20 @@ print(LZGraphs.__version__)
 Build an amino acid positional graph from CDR3 sequences and compute sequence probabilities:
 ```python
-import pandas as pd
 from LZGraphs import AAPLZGraph
-# Prepare data as a DataFrame with a 'cdr3_amino_acid' column
-data = pd.DataFrame({
-    'cdr3_amino_acid': [
-        'CASSLAPGATNEKLFF',
-        'CASSLGQAYEQYF',
-        'CASSFSTCSANYGYTF',
-        'CASSQEGTEAFF',
-        'CASSLGQGNIQYF',
-        # ... your CDR3 amino acid sequences
-    ]
-})
+# Pass a plain list of CDR3 amino acid sequences
+sequences = [
+    'CASSLAPGATNEKLFF',
+    'CASSLGQAYEQYF',
+    'CASSFSTCSANYGYTF',
+    'CASSQEGTEAFF',
+    'CASSLGQGNIQYF',
+    # ... your CDR3 amino acid sequences
+]
 # Construct the graph
-graph = AAPLZGraph(data, verbose=True)
+graph = AAPLZGraph(sequences, verbose=True)
 # Compute the log-probability of a sequence under the model
 log_prob = graph.walk_log_probability('CASSLAPGATNEKLFF')
@@ -172,15 +170,14 @@ graph = NaiveLZGraph(cdr3_list, dictionary, verbose=True)
 ### Gene Annotation
-All three graph types support optional V and J gene annotation. Include `V` and `J` columns in your DataFrame (or pass them separately for NaiveLZGraph) to track gene usage on graph edges:
+All three graph types support optional V and J gene annotation. Pass gene lists alongside sequences to track gene usage on graph edges:
 ```python
-data = pd.DataFrame({
-    'cdr3_amino_acid': sequences,
-    'V': v_genes,
-    'J': j_genes,
-})
-graph = AAPLZGraph(data, verbose=True)
+sequences = ['CASSLEPSGGTDTQYF', 'CASSDTSGGTDTQYF', ...]
+v_genes   = ['TRBV16-1*01', 'TRBV1-1*01', ...]
+j_genes   = ['TRBJ1-2*01', 'TRBJ1-5*01', ...]
+graph = AAPLZGraph(sequences, v_genes=v_genes, j_genes=j_genes, verbose=True)
 # Gene data is now available
 print(graph.has_gene_data)           # True
@@ -199,16 +196,14 @@ This is particularly important for:
 - **Better representation of clonal expansion** -- dominant clones shape the graph structure proportionally to their prevalence
 - **More realistic sequence generation** -- simulated sequences reflect the abundance-weighted landscape, not just the unique sequence set
-To use abundance weighting, include an `abundance` column in your DataFrame:
+To use abundance weighting, pass an `abundances` list alongside your sequences:
 ```python
-data = pd.DataFrame({
-    'cdr3_amino_acid': ['CASSLAPGATNEKLFF', 'CASSLGQAYEQYF', 'CASSFSTCSANYGYTF'],
-    'abundance': [150, 42, 7],
-})
+sequences  = ['CASSLAPGATNEKLFF', 'CASSLGQAYEQYF', 'CASSFSTCSANYGYTF']
+abundances = [150, 42, 7]
 # Each sequence is weighted by its abundance during graph construction
-graph = AAPLZGraph(data, verbose=True)
+graph = AAPLZGraph(sequences, abundances=abundances, verbose=True)
 ```
 For `NaiveLZGraph`, pass abundances as a separate parameter:
@@ -286,6 +281,20 @@ jsd = jensen_shannon_divergence(graph1, graph2)
 comparison = compare_repertoires(graph1, graph2)
 ```
+### Bayesian Posterior Personalization
+```python
+# Adapt a population graph to an individual
+posterior = population_graph.get_posterior(
+    individual_sequences,
+    abundances=clonal_counts,
+    kappa=100.0  # prior strength
+)
+# The posterior is a full graph
+simulated = posterior.simulate(1000, seed=42)
+```
 ### Visualization
 ```python
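
Note on the new "Bayesian Posterior Personalization" README section: the diff names Dirichlet-Multinomial conjugacy but does not show how `get_posterior` applies it. The sketch below is the textbook posterior-mean update for the outgoing edges of a single node, assuming the prior pseudo-counts are `kappa` times the population edge probabilities. The function name `posterior_transition_probs` and the dict-based inputs are illustrative assumptions, not the library's API.

```python
def posterior_transition_probs(prior_probs, observed_counts, kappa):
    """Posterior-mean edge probabilities for one node.

    prior_probs     : dict edge -> population-level probability (sums to 1)
    observed_counts : dict edge -> count seen in the individual's repertoire
    kappa           : prior strength (total pseudo-count), assumed > 0
    """
    edges = sorted(set(prior_probs) | set(observed_counts))
    n = sum(observed_counts.values())
    # Dirichlet(kappa * prior) + Multinomial(counts) -> Dirichlet(kappa * prior + counts);
    # the mean of that posterior is the normalized sum of pseudo-counts and counts.
    return {
        edge: (kappa * prior_probs.get(edge, 0.0) + observed_counts.get(edge, 0))
        / (kappa + n)
        for edge in edges
    }


# Hypothetical node with two outgoing edges, equally likely in the population.
prior = {"A": 0.5, "B": 0.5}
counts = {"A": 100}          # the individual only ever takes edge A
posterior = posterior_transition_probs(prior, counts, kappa=100.0)
```

With `kappa` equal to the number of observations, the result sits halfway between the prior and the empirical frequencies; a larger `kappa` keeps the posterior closer to the population model.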

{lzgraphs-2.1.2 → lzgraphs-2.3.0}/pyproject.toml RENAMED

@@ -45,7 +45,6 @@ classifiers = [
 dependencies = [
     "networkx>=3.0",
     "numpy>=1.24",
-    "pandas>=1.5",
     "tqdm>=4.65",
     "scipy>=1.10",
 ]
@@ -58,6 +57,7 @@ viz = [
 dev = [
     "pytest>=7.0",
     "pytest-cov>=4.0",
+    "pandas>=1.5",
     "black>=23.0",
     "isort>=5.12",
     "ruff>=0.1.0",
@@ -65,7 +65,10 @@ dev = [
     "build>=1.0",
     "twine>=4.0",
 ]
-docs = []
+docs = [
+    "mkdocs-material>=9.5",
+    "mkdocstrings[python]>=0.24",
+]
 [project.urls]
 Homepage = "https://github.com/MuteJester/LZGraphs"
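
Note on the dependency change: pandas moves from the runtime dependencies to the `dev` extra, and the README hunks switch the constructors to plain lists. For data that still lives in a tabular file, the lists can be built with the standard library alone. The sketch below assumes a CSV with the `cdr3_amino_acid` and `abundance` column names used in the removed README example; the helper `load_columns` is hypothetical, and the `AAPLZGraph` call is left as a comment so the block runs without the package installed.

```python
import csv
import io


def load_columns(handle, seq_col="cdr3_amino_acid", abundance_col="abundance"):
    """Read sequences and integer abundances from a CSV file handle."""
    sequences, abundances = [], []
    for row in csv.DictReader(handle):
        sequences.append(row[seq_col])
        abundances.append(int(row[abundance_col]))
    return sequences, abundances


# In-memory stand-in for a real file opened with open(path, newline="").
demo = io.StringIO(
    "cdr3_amino_acid,abundance\n"
    "CASSLAPGATNEKLFF,150\n"
    "CASSLGQAYEQYF,42\n"
    "CASSFSTCSANYGYTF,7\n"
)
sequences, abundances = load_columns(demo)

# Call shape taken from the 2.3.0 README hunk above:
# graph = AAPLZGraph(sequences, abundances=abundances, verbose=True)
```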

lzgraphs-2.3.0/setup.py ADDED

@@ -0,0 +1,40 @@
+"""
+Build script for optional C extensions.
+The _fast_walk extension accelerates LZGraph.simulate() by ~50-100x.
+If compilation fails (no C compiler), the package still installs and
+falls back to the pure-Python implementation automatically.
+"""
+import os
+import sys
+from setuptools import setup, Extension
+# Ensure setuptools can resolve the dynamic version (attr = "LZGraphs.__version__")
+# when running in an isolated build environment where src/ isn't on sys.path.
+sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), "src"))
+ext_modules = [
+    Extension(
+        "LZGraphs._fast_walk",
+        sources=[os.path.join("src", "LZGraphs", "_fast_walk.c")],
+        # No external library dependencies — pure C + Python.h
+    ),
+]
+def run_setup(extensions):
+    setup(ext_modules=extensions)
+try:
+    run_setup(ext_modules)
+except Exception:
+    print(
+        "\n"
+        "WARNING: Failed to compile C extension _fast_walk.\n"
+        "         LZGraphs will use the pure-Python fallback for simulate().\n"
+        "         This is fine — the package works without it, just slower.\n"
+        "\n"
+    )
+    run_setup([])
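
Note on the fallback described in the docstring: the build script only covers the install-time half. The import-time half is not part of the files shown here, so the following is a minimal sketch of how such a fallback is commonly written, not the package's actual code. The helper `load_optional` and the flag `USE_C_BACKEND` are hypothetical names.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and an extension that was never built.
        return None


# Module path taken from the Extension(...) declaration in setup.py.
_fast_walk = load_optional("LZGraphs._fast_walk")
USE_C_BACKEND = _fast_walk is not None
```

Callers can then branch on `USE_C_BACKEND` once, rather than wrapping every call to the extension in its own `try`/`except`.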

{lzgraphs-2.1.2 → lzgraphs-2.3.0}/src/LZGraphs/__init__.py RENAMED

@@ -1,4 +1,4 @@
-__version__ = "2.1.2"
+__version__ = "2.3.0"
 # =============================================================================
 # Graph classes

lzgraphs-2.3.0/src/LZGraphs/_fast_walk.c ADDED

@@ -0,0 +1,321 @@
+/*
+ * _fast_walk.c — CPython C extension for fast Markov chain random walks.
+ *
+ * Implements the full simulate() loop in C including string assembly,
+ * for ~100-200x speedup over the original pure-Python implementation.
+ * Uses xoshiro256++ for fast, high-quality RNG.
+ *
+ * The extension is optional: if it fails to compile (no C compiler),
+ * LZGraphs falls back to the pure-Python bisect-based implementation.
+ */
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include <stdint.h>
+#include <string.h>
+/* ========================================================================
+ * xoshiro256++ RNG — public domain by David Blackman and Sebastiano Vigna
+ * ======================================================================== */
+static inline uint64_t rotl(const uint64_t x, int k) {
+    return (x << k) | (x >> (64 - k));
+}
+typedef struct {
+    uint64_t s[4];
+} xoshiro256_state;
+static inline uint64_t xoshiro256pp_next(xoshiro256_state *state) {
+    const uint64_t result = rotl(state->s[0] + state->s[3], 23) + state->s[0];
+    const uint64_t t = state->s[1] << 17;
+    state->s[2] ^= state->s[0];
+    state->s[3] ^= state->s[1];
+    state->s[1] ^= state->s[2];
+    state->s[0] ^= state->s[3];
+    state->s[2] ^= t;
+    state->s[3] = rotl(state->s[3], 45);
+    return result;
+}
+static inline double xoshiro256pp_double(xoshiro256_state *state) {
+    return (double)(xoshiro256pp_next(state) >> 11) * 0x1.0p-53;
+}
+static inline uint64_t splitmix64(uint64_t *x) {
+    uint64_t z = (*x += 0x9e3779b97f4a7c15ULL);
+    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
+    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
+    return z ^ (z >> 31);
+}
+static void seed_xoshiro256(xoshiro256_state *state, uint64_t seed) {
+    state->s[0] = splitmix64(&seed);
+    state->s[1] = splitmix64(&seed);
+    state->s[2] = splitmix64(&seed);
+    state->s[3] = splitmix64(&seed);
+}
+/* ========================================================================
+ * Binary search (bisect_left) on a double array
+ * ======================================================================== */
+static inline Py_ssize_t bisect_left_double(
+    const double *arr, Py_ssize_t n, double value
+) {
+    Py_ssize_t lo = 0, hi = n;
+    while (lo < hi) {
+        Py_ssize_t mid = lo + (hi - lo) / 2;
+        if (arr[mid] < value)
+            lo = mid + 1;
+        else
+            hi = mid;
+    }
+    return lo;
+}
+/* ========================================================================
+ * simulate_walks — full simulation with string assembly in C
+ *
+ * Args:
+ *   n_walks       : int
+ *   offsets       : intp array [n_nodes+1] (buffer)
+ *   neighbors     : intp array [total_edges] (buffer)
+ *   cumweights    : float64 array [total_edges] (buffer)
+ *   stop_probs    : float64 array [n_nodes] (buffer)
+ *   initial_ids   : intp array [n_initial] (buffer)
+ *   initial_cw    : float64 array [n_initial] (buffer)
+ *   seed          : uint64
+ *   clean_labels  : list[str] — label for each node ID
+ *   return_walks  : bool — if True, return (walk, seq) tuples
+ *   id_to_node    : list[str] — node names (only used if return_walks)
+ *
+ * Returns:
+ *   list[str]  or  list[tuple[list[str], str]]
+ * ======================================================================== */
+static PyObject* py_simulate_walks(PyObject *self, PyObject *args) {
+    int n_walks, return_walks;
+    Py_buffer offsets_buf, neighbors_buf, cumweights_buf;
+    Py_buffer stop_probs_buf, initial_ids_buf, initial_cw_buf;
+    unsigned long long seed;
+    PyObject *clean_labels;  /* Python list of str */
+    PyObject *id_to_node;    /* Python list of str */
+    PyObject *result_list = NULL;
+    if (!PyArg_ParseTuple(args, "iy*y*y*y*y*y*KOpO",
+            &n_walks,
+            &offsets_buf, &neighbors_buf, &cumweights_buf,
+            &stop_probs_buf, &initial_ids_buf, &initial_cw_buf,
+            &seed,
+            &clean_labels,
+            &return_walks,
+            &id_to_node))
+        return NULL;
+    const Py_ssize_t *offsets = (const Py_ssize_t *)offsets_buf.buf;
+    const Py_ssize_t *neighbors = (const Py_ssize_t *)neighbors_buf.buf;
+    const double *cumweights = (const double *)cumweights_buf.buf;
+    const double *stop_probs = (const double *)stop_probs_buf.buf;
+    const Py_ssize_t *initial_ids = (const Py_ssize_t *)initial_ids_buf.buf;
+    const double *initial_cw = (const double *)initial_cw_buf.buf;
+    const Py_ssize_t n_initial = initial_cw_buf.len / (Py_ssize_t)sizeof(double);
+    if (n_initial <= 0) {
+        PyErr_SetString(PyExc_ValueError,
+            "Cannot simulate: graph has no initial states.");
+        goto cleanup;
+    }
+    /* Pre-fetch label UTF-8 data for fast string assembly */
+    const Py_ssize_t n_labels = PyList_GET_SIZE(clean_labels);
+    const char **label_ptrs = (const char **)PyMem_Malloc(n_labels * sizeof(char *));
+    Py_ssize_t *label_lens = (Py_ssize_t *)PyMem_Malloc(n_labels * sizeof(Py_ssize_t));
+    if (!label_ptrs || !label_lens) {
+        PyMem_Free(label_ptrs);
+        PyMem_Free(label_lens);
+        PyErr_NoMemory();
+        goto cleanup;
+    }
+    for (Py_ssize_t i = 0; i < n_labels; i++) {
+        PyObject *s = PyList_GET_ITEM(clean_labels, i);
+        label_ptrs[i] = PyUnicode_AsUTF8AndSize(s, &label_lens[i]);
+        if (!label_ptrs[i]) {
+            PyMem_Free(label_ptrs);
+            PyMem_Free(label_lens);
+            goto cleanup;
+        }
+    }
+    xoshiro256_state rng;
+    seed_xoshiro256(&rng, (uint64_t)seed);
+    result_list = PyList_New(n_walks);
+    if (!result_list) {
+        PyMem_Free(label_ptrs);
+        PyMem_Free(label_lens);
+        goto cleanup;
+    }
+    /* Reusable walk buffer */
+    Py_ssize_t walk_cap = 64;
+    Py_ssize_t *walk_buf = (Py_ssize_t *)PyMem_Malloc(walk_cap * sizeof(Py_ssize_t));
+    /* Reusable string buffer */
+    Py_ssize_t str_cap = 256;
+    char *str_buf = (char *)PyMem_Malloc(str_cap);
+    if (!walk_buf || !str_buf) {
+        PyMem_Free(walk_buf);
+        PyMem_Free(str_buf);
+        PyMem_Free(label_ptrs);
+        PyMem_Free(label_lens);
+        Py_CLEAR(result_list);  /* also NULLs the pointer, so cleanup returns NULL */
+        PyErr_NoMemory();
+        goto cleanup;
+    }
+    for (int i = 0; i < n_walks; i++) {
+        /* Pick initial state */
+        double r = xoshiro256pp_double(&rng);
+        Py_ssize_t init_idx = bisect_left_double(initial_cw, n_initial, r);
+        if (init_idx >= n_initial) init_idx = n_initial - 1;
+        Py_ssize_t current = initial_ids[init_idx];
+        Py_ssize_t walk_len = 0;
+        walk_buf[walk_len++] = current;
+        /* Build string incrementally */
+        Py_ssize_t str_len = 0;
+        Py_ssize_t llen = label_lens[current];
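+        if (str_len + llen > str_cap) {
+            str_cap = (str_len + llen) * 2;
+            /* Realloc into a temporary so the old buffer is still freed on failure */
+            char *new_str = (char *)PyMem_Realloc(str_buf, str_cap);
+            if (!new_str) goto oom;
+            str_buf = new_str;
+        }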
+        memcpy(str_buf + str_len, label_ptrs[current], llen);
+        str_len += llen;
+        while (1) {
+            double sp = stop_probs[current];
+            /* sp == sp is false only for NaN, which marks "no stop probability" */
+            if (sp == sp) {
+                if (xoshiro256pp_double(&rng) < sp)
+                    break;
+            }
+            Py_ssize_t start = offsets[current];
+            Py_ssize_t end = offsets[current + 1];
+            if (start == end)
+                break;
+            r = xoshiro256pp_double(&rng);
+            Py_ssize_t idx = bisect_left_double(cumweights + start, end - start, r);
+            if (idx >= end - start) idx = end - start - 1;
+            current = neighbors[start + idx];
+            /* Grow walk buffer if needed */
+            if (walk_len >= walk_cap) {
+                walk_cap *= 2;
+                Py_ssize_t *new_buf = (Py_ssize_t *)PyMem_Realloc(walk_buf, walk_cap * sizeof(Py_ssize_t));
+                if (!new_buf) goto oom;
+                walk_buf = new_buf;
+            }
+            walk_buf[walk_len++] = current;
+            /* Append label to string buffer */
+            llen = label_lens[current];
+            if (str_len + llen > str_cap) {
+                str_cap = (str_len + llen) * 2;
+                char *new_str = (char *)PyMem_Realloc(str_buf, str_cap);
+                if (!new_str) goto oom;
+                str_buf = new_str;
+            }
+            memcpy(str_buf + str_len, label_ptrs[current], llen);
+            str_len += llen;
+        }
+        /* Create Python string from buffer */
+        PyObject *seq = PyUnicode_FromStringAndSize(str_buf, str_len);
+        if (!seq) goto oom;
+        if (return_walks) {
+            /* Build walk list of node name strings */
+            PyObject *walk = PyList_New(walk_len);
+            if (!walk) { Py_DECREF(seq); goto oom; }
+            for (Py_ssize_t j = 0; j < walk_len; j++) {
+                PyObject *node_name = PyList_GET_ITEM(id_to_node, walk_buf[j]);
+                Py_INCREF(node_name);
+                PyList_SET_ITEM(walk, j, node_name);
+            }
+            PyObject *tup = PyTuple_Pack(2, walk, seq);
+            Py_DECREF(walk);
+            Py_DECREF(seq);
+            if (!tup) goto oom;
+            PyList_SET_ITEM(result_list, i, tup);
+        } else {
+            PyList_SET_ITEM(result_list, i, seq);
+        }
+    }
+    PyMem_Free(walk_buf);
+    PyMem_Free(str_buf);
+    PyMem_Free(label_ptrs);
+    PyMem_Free(label_lens);
+    goto cleanup;
+oom:
+    PyMem_Free(walk_buf);
+    PyMem_Free(str_buf);
+    PyMem_Free(label_ptrs);
+    PyMem_Free(label_lens);
+    Py_XDECREF(result_list);
+    result_list = NULL;
+    if (!PyErr_Occurred())
+        PyErr_NoMemory();
+cleanup:
+    PyBuffer_Release(&offsets_buf);
+    PyBuffer_Release(&neighbors_buf);
+    PyBuffer_Release(&cumweights_buf);
+    PyBuffer_Release(&stop_probs_buf);
+    PyBuffer_Release(&initial_ids_buf);
+    PyBuffer_Release(&initial_cw_buf);
+    return result_list;
+}
+/* ========================================================================
+ * Module definition
+ * ======================================================================== */
+static PyMethodDef FastWalkMethods[] = {
+    {"simulate_walks", py_simulate_walks, METH_VARARGS,
+     "Run n random walks on a CSR-encoded graph with string assembly.\n\n"
+     "Args:\n"
+     "    n_walks (int): Number of walks.\n"
+     "    offsets (array): CSR row offsets [n_nodes+1], dtype=intp.\n"
+     "    neighbors (array): Flat neighbor IDs, dtype=intp.\n"
+     "    cumweights (array): Flat cumulative weights, dtype=float64.\n"
+     "    stop_probs (array): Per-node stop probability (NaN=none), dtype=float64.\n"
+     "    initial_ids (array): Initial state IDs, dtype=intp.\n"
+     "    initial_cumprobs (array): Cumulative initial probs, dtype=float64.\n"
+     "    seed (int): RNG seed (xoshiro256++).\n"
+     "    clean_labels (list[str]): Subpattern label for each node.\n"
+     "    return_walks (bool): If True, return (walk, seq) tuples.\n"
+     "    id_to_node (list[str]): Node names for walk output.\n\n"
+     "Returns:\n"
+     "    list[str] or list[tuple[list[str], str]]\n"},
+    {NULL, NULL, 0, NULL}
+};
+static struct PyModuleDef fast_walk_module = {
+    PyModuleDef_HEAD_INIT,
+    "_fast_walk",
+    "C-accelerated random walk simulation for LZGraphs.\n"
+    "Uses xoshiro256++ RNG for high-quality, fast random number generation.\n"
+    "This module is optional — LZGraphs falls back to pure Python if unavailable.",
+    -1,
+    FastWalkMethods
+};
+PyMODINIT_FUNC PyInit__fast_walk(void) {
+    return PyModule_Create(&fast_walk_module);
+}
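
Note on the walk loop: for reviewers who want to check the C logic against something shorter, here is a pure-Python sketch of the same loop over the same CSR-style arrays (`offsets`, `neighbors`, `cumweights`, `stop_probs`, `initial_ids`, `initial_cw`). It is written for this review and is not the package's fallback implementation. It uses `random.Random` as a stand-in for xoshiro256++, so its output will not match the extension's for the same seed.

```python
import math
import random
from bisect import bisect_left


def simulate_walks(n_walks, offsets, neighbors, cumweights, stop_probs,
                   initial_ids, initial_cw, labels, seed=0):
    """Run n_walks random walks and return the assembled label strings."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_walks):
        # Pick the initial state by inverting the cumulative distribution.
        idx = min(bisect_left(initial_cw, rng.random()), len(initial_cw) - 1)
        current = initial_ids[idx]
        parts = [labels[current]]
        while True:
            sp = stop_probs[current]
            # NaN means the node has no stop probability.
            if not math.isnan(sp) and rng.random() < sp:
                break
            start, end = offsets[current], offsets[current + 1]
            if start == end:          # no outgoing edges
                break
            k = bisect_left(cumweights, rng.random(), start, end) - start
            k = min(k, end - start - 1)   # clamp against rounding in cumweights
            current = neighbors[start + k]
            parts.append(labels[current])
        results.append("".join(parts))
    return results


# Toy chain 0 -> 1 -> 2 where node 2 always stops.
nan = float("nan")
walks = simulate_walks(
    n_walks=3,
    offsets=[0, 1, 2, 2],
    neighbors=[1, 2],
    cumweights=[1.0, 1.0],
    stop_probs=[nan, nan, 1.0],
    initial_ids=[0],
    initial_cw=[1.0],
    labels=["C", "AS", "F"],
)
```

Because the toy graph has a single path, every walk yields the same string regardless of the seed, which makes it a convenient smoke test for either backend.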

lzgraphs-2.3.0/src/LZGraphs/constants.py ADDED

@@ -0,0 +1,6 @@
+"""Module-level numeric constants shared across LZGraphs internals."""
+import numpy as np
+# Machine epsilon — cached once at module level (avoids repeated np.finfo calls)
+_EPS = np.finfo(np.float64).eps
+_LOG_EPS = np.log(_EPS)
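
Note on the constants: the diff does not show where `_EPS` and `_LOG_EPS` are consumed. One typical use for such a pair is flooring probabilities before taking a logarithm, so that a zero probability maps to a finite value. The sketch below assumes that use; `safe_log` is a hypothetical helper, and `sys.float_info.epsilon` stands in for `np.finfo(np.float64).eps` (both are 2**-52) to keep the block free of third-party imports.

```python
import math
import sys

EPS = sys.float_info.epsilon   # 2**-52, the float64 machine epsilon
LOG_EPS = math.log(EPS)        # about -36.04


def safe_log(p):
    """Natural log of p, floored at log(EPS) so p == 0 stays finite."""
    return math.log(max(p, EPS))
```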
