graphwise 1.4.1 → 1.4.3
- package/README.md +58 -8
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -47,6 +47,8 @@ Three key properties:
|
|
|
47
47
|
2. **Frontier collision detection**: when a vertex is reached by multiple frontiers, the connecting path is recorded
|
|
48
48
|
3. **Implicit termination**: halts when all frontier queues are empty; no depth bound or size threshold
|
|
49
49
|
|
|
50
|
+
---
|
|
51
|
+
|
|
50
52
|
#### DOME: Degree-Ordered Multi-seed Expansion
|
|
51
53
|
|
|
52
54
|
The default priority function uses degree-based hub deferral:
|
|
@@ -55,6 +57,8 @@ $$\pi(v) = \frac{\deg^{+}(v) + \deg^{-}(v)}{w_V(v) + \varepsilon}$$
|
|
|
55
57
|
|
|
56
58
|
where $\deg^{+}(v)$ is weighted out-degree, $\deg^{-}(v)$ is weighted in-degree, $w_V(v)$ is node weight, and $\varepsilon > 0$ prevents division by zero.
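The DOME priority above can be sketched in a few lines. This is an illustrative sketch, not graphwise's API: the `NodeStats` shape, the field names, and the default `epsilon` are assumptions, and it assumes that a lower value means "expand earlier", which is what hub deferral suggests.

```typescript
// Hypothetical per-node statistics; field names are assumptions for illustration.
interface NodeStats {
  weightedOutDegree: number; // deg^+(v)
  weightedInDegree: number;  // deg^-(v)
  nodeWeight: number;        // w_V(v)
}

// pi(v) = (deg^+(v) + deg^-(v)) / (w_V(v) + epsilon)
// epsilon > 0 keeps the result finite when the node weight is zero.
function domePriority(node: NodeStats, epsilon = 1e-9): number {
  return (node.weightedOutDegree + node.weightedInDegree) / (node.nodeWeight + epsilon);
}
```

Under that ordering assumption, a hub with total degree 100 receives a larger value than a leaf with total degree 2 at equal node weight, so the hub is deferred.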
|
|
57
59
|
|
|
60
|
+
---
|
|
61
|
+
|
|
58
62
|
#### Expansion Variants
|
|
59
63
|
|
|
60
64
|
| Algorithm | Priority Function | Phases |
|
|
@@ -73,70 +77,108 @@ where $\deg^{+}(v)$ is weighted out-degree, $\deg^{-}(v)$ is weighted in-degree,
|
|
|
73
77
|
| **SIFT** | MI threshold with degree fallback | 1 |
|
|
74
78
|
| **FLUX** | Density-adaptive strategy switching | 1 |
|
|
75
79
|
|
|
80
|
+
---
|
|
81
|
+
|
|
76
82
|
#### EDGE: Entropy-Driven Graph Expansion
|
|
77
83
|
|
|
78
84
|
$$\pi_{\text{EDGE}}(v) = \frac{1}{H_{\text{local}}(v) + \varepsilon} \times \log(\deg(v) + 1)$$
|
|
79
85
|
|
|
80
86
|
where $H_{\text{local}}(v) = -\sum_{\tau} p(\tau) \log p(\tau)$ is the Shannon entropy of the neighbour type distribution. Nodes bridging heterogeneous structural regimes (high entropy) are explored first.
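A minimal sketch of the EDGE score, assuming each neighbour carries a string type label and that $\deg(v)$ is the unweighted neighbour count. The function names and the `epsilon` default are hypothetical.

```typescript
// Shannon entropy (natural log) of the neighbour type distribution.
function localEntropy(neighbourTypes: readonly string[]): number {
  // Prototype-free object so labels such as "constructor" count correctly.
  const counts: Record<string, number> = Object.create(null);
  for (const t of neighbourTypes) counts[t] = (counts[t] || 0) + 1;
  let h = 0;
  for (const t of Object.keys(counts)) {
    const p = counts[t] / neighbourTypes.length;
    h -= p * Math.log(p);
  }
  return h;
}

// pi_EDGE(v) = 1 / (H_local(v) + epsilon) * log(deg(v) + 1)
function edgePriority(neighbourTypes: readonly string[], epsilon = 1e-9): number {
  const degree = neighbourTypes.length; // assumption: unweighted degree
  return (1 / (localEntropy(neighbourTypes) + epsilon)) * Math.log(degree + 1);
}
```

With equal degree, a node whose neighbours all share one type has zero entropy and a very large score, while a node with mixed neighbour types scores lower and is therefore reached earlier if lower values are expanded first.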
|
|
81
87
|
|
|
88
|
+
---
|
|
89
|
+
|
|
82
90
|
#### PIPE: Path-potential Informed Priority Expansion
|
|
83
91
|
|
|
84
|
-
$$\pi_{\text{PIPE}}(v) = \frac{\deg(v)}{1 + \
|
|
92
|
+
$$\pi_{\text{PIPE}}(v) = \frac{\deg(v)}{1 + \mathrm{pathPotential}(v)}$$
|
|
85
93
|
|
|
86
|
-
where $\
|
|
94
|
+
where $\mathrm{pathPotential}(v) = \lvert N(v) \cap \bigcup_{j \neq i} V_j \rvert$ counts neighbours already visited by other seed frontiers. High path potential indicates imminent path completion.
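The path potential is a set intersection, which the following sketch spells out. The representation (one array of visited node ids per seed frontier, duplicate-free neighbour lists) and the function names are assumptions made for illustration.

```typescript
// Counts neighbours of v that some *other* frontier has already visited.
// A neighbour seen by several other frontiers is still counted once,
// matching the union over j != i in the definition.
function pathPotential(
  neighbours: readonly string[],
  visitedByFrontier: readonly (readonly string[])[],
  current: number, // index i of the frontier evaluating v
): number {
  let count = 0;
  for (const n of neighbours) {
    const seenElsewhere = visitedByFrontier.some(
      (visited, j) => j !== current && visited.indexOf(n) !== -1,
    );
    if (seenElsewhere) count++;
  }
  return count;
}

// pi_PIPE(v) = deg(v) / (1 + pathPotential(v)), with deg(v) taken as the neighbour count.
function pipePriority(
  neighbours: readonly string[],
  visitedByFrontier: readonly (readonly string[])[],
  current: number,
): number {
  return neighbours.length / (1 + pathPotential(neighbours, visitedByFrontier, current));
}
```

For a node with neighbours `a`, `b`, `c` evaluated by frontier 0, where the other frontiers have visited `b` and `c`, the path potential is 2 and the priority is 3 / (1 + 2) = 1.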
|
|
95
|
+
|
|
96
|
+
---
|
|
87
97
|
|
|
88
98
|
#### SAGE: Salience-Accumulation Guided Expansion
|
|
89
99
|
|
|
90
|
-
|
|
100
|
+
$$
|
|
101
|
+
\pi_{\text{SAGE}}(v) = \begin{cases} \log(\deg(v) + 1) & \text{Phase 1 (before first path)} \\ -(\text{salience}(v) \times 1000 - \deg(v)) & \text{Phase 2 (after first path)} \end{cases}
|
|
102
|
+
$$
|
|
91
103
|
|
|
92
104
|
where $\text{salience}(v)$ counts discovered paths containing $v$. Salience dominates in Phase 2; degree serves as tiebreaker.
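The two-phase rule can be written directly from the cases above. In this sketch the `firstPathFound` flag and the function name are hypothetical; only the arithmetic comes from the formula.

```typescript
// Phase 1: log(deg(v) + 1)
// Phase 2: -(salience(v) * 1000 - deg(v))
function sagePriority(degree: number, salience: number, firstPathFound: boolean): number {
  if (!firstPathFound) {
    return Math.log(degree + 1);
  }
  return -(salience * 1000 - degree);
}
```

In Phase 2, a node on two discovered paths with degree 5 scores -1995, while a node on one path with the same degree scores -995. As written, degree acts as a pure tiebreaker only while degree differences stay below 1000, because one extra unit of salience shifts the score by exactly 1000.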
|
|
93
105
|
|
|
106
|
+
---
|
|
107
|
+
|
|
94
108
|
#### REACH: Retrospective Expansion with Adaptive Convergence
|
|
95
109
|
|
|
96
|
-
|
|
110
|
+
$$
|
|
111
|
+
\pi_{\text{REACH}}(v) = \begin{cases} \log(\deg(v) + 1) & \text{Phase 1} \\ \log(\deg(v) + 1) \times (1 - \widehat{\text{MI}}(v)) & \text{Phase 2} \end{cases}
|
|
112
|
+
$$
|
|
113
|
+
|
|
114
|
+
where $\widehat{\text{MI}}(v)$ estimates MI via Jaccard similarity to discovered path endpoints:
|
|
97
115
|
|
|
98
|
-
|
|
116
|
+
$$
|
|
117
|
+
\widehat{\text{MI}}(v) = \frac{1}{\lvert \mathcal{P}\_{\text{top}} \rvert} \sum\_{p} J(N(v), N(p\_{\text{endpoint}}))
|
|
118
|
+
$$
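A sketch of the estimator and the Phase 2 priority, assuming one endpoint neighbourhood per top-ranked path and duplicate-free neighbour arrays. All names are hypothetical, and returning 0 when no paths exist yet is an assumption, not documented behaviour.

```typescript
// Jaccard similarity |A ∩ B| / |A ∪ B| for duplicate-free id arrays.
function jaccard(a: readonly string[], b: readonly string[]): number {
  let intersection = 0;
  for (const x of a) {
    if (b.indexOf(x) !== -1) intersection++;
  }
  const union = a.length + b.length - intersection;
  return union === 0 ? 0 : intersection / union;
}

// Mean Jaccard similarity between N(v) and the neighbourhoods of path endpoints.
function estimateMI(
  neighbours: readonly string[],
  endpointNeighbourhoods: readonly (readonly string[])[],
): number {
  if (endpointNeighbourhoods.length === 0) return 0; // assumption: no paths yet
  let sum = 0;
  for (const endpoint of endpointNeighbourhoods) sum += jaccard(neighbours, endpoint);
  return sum / endpointNeighbourhoods.length;
}

// Phase 1: log(deg + 1); Phase 2: log(deg + 1) * (1 - estimated MI).
function reachPriority(degree: number, miEstimate: number, phase: 1 | 2): number {
  const base = Math.log(degree + 1);
  return phase === 1 ? base : base * (1 - miEstimate);
}
```

For example, a node with neighbours `a`, `b` compared against endpoint neighbourhoods `[a, b]` and `[c]` gets an estimate of (1 + 0) / 2 = 0.5, which halves its Phase 1 score in Phase 2.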
|
|
119
|
+
|
|
120
|
+
---
|
|
99
121
|
|
|
100
122
|
#### MAZE: Multi-frontier Adaptive Zone Expansion
|
|
101
123
|
|
|
102
|
-
|
|
124
|
+
$$
|
|
125
|
+
\pi^{(1)}(v) = \frac{\deg(v)}{1 + \mathrm{pathPotential}(v)} \qquad \pi^{(2)}(v) = \pi^{(1)}(v) \times \frac{1}{1 + \lambda \cdot \text{salience}(v)}
|
|
126
|
+
$$
|
|
103
127
|
|
|
104
128
|
Phase 1 uses PIPE's path potential until $M$ paths found. Phase 2 incorporates SAGE's salience feedback. Phase 3 evaluates diversity, path count, and salience plateau for termination.
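The first two phases differ only by the salience damping factor, as this sketch shows. The function name, the phase argument, and the default `lambda = 1` are assumptions; the Phase 3 termination test is not modelled here.

```typescript
// pi^(1)(v) = deg(v) / (1 + pathPotential(v))
// pi^(2)(v) = pi^(1)(v) * 1 / (1 + lambda * salience(v))
function mazePriority(
  degree: number,
  pathPotential: number,
  salience: number,
  phase: 1 | 2,
  lambda = 1, // hypothetical default; the source does not state a value
): number {
  const base = degree / (1 + pathPotential);
  return phase === 1 ? base : base / (1 + lambda * salience);
}
```

With degree 6 and path potential 2 the Phase 1 score is 2. Once the node lies on 3 discovered paths, the Phase 2 score with `lambda = 1` drops to 0.5.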
|
|
105
129
|
|
|
130
|
+
---
|
|
131
|
+
|
|
106
132
|
#### TIDE: Total Interconnected Degree Expansion
|
|
107
133
|
|
|
108
134
|
$$\pi_{\text{TIDE}}(v) = \deg(v) + \sum_{w \in N(v)} \deg(w)$$
|
|
109
135
|
|
|
110
136
|
Nodes in sparse regions (low aggregate neighbourhood degree) are explored first. Related to EDGE but uses raw degree sums rather than entropy.
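A minimal sketch of the TIDE sum over an adjacency list. The `Adjacency` type and the use of unweighted degree are assumptions for illustration.

```typescript
// Hypothetical adjacency list: node id -> neighbour ids.
type Adjacency = Record<string, readonly string[]>;

// pi_TIDE(v) = deg(v) + sum of deg(w) over neighbours w of v.
function tidePriority(v: string, adjacency: Adjacency): number {
  const neighbours = adjacency[v] || [];
  let total = neighbours.length;
  for (const w of neighbours) {
    total += (adjacency[w] || []).length;
  }
  return total;
}
```

On the path graph `a - b - c`, the endpoint `a` scores 1 + 2 = 3 and the middle node `b` scores 2 + 1 + 1 = 4, so the endpoint is the sparser region.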
|
|
111
137
|
|
|
138
|
+
---
|
|
139
|
+
|
|
112
140
|
#### LACE: Local Affinity-Computed Expansion
|
|
113
141
|
|
|
114
142
|
$$\pi_{\text{LACE}}(v) = 1 - \overline{\text{MI}}(v, \text{frontier})$$
|
|
115
143
|
|
|
116
144
|
Prioritises nodes by average MI to already-visited frontier nodes. Related to HAE but uses MI to visited nodes rather than type entropy.
|
|
117
145
|
|
|
146
|
+
---
|
|
147
|
+
|
|
118
148
|
#### WARP: Weighted Adjacent Reachability Priority
|
|
119
149
|
|
|
120
150
|
$$\pi_{\text{WARP}}(v) = \frac{1}{1 + \text{bridge}(v)}$$
|
|
121
151
|
|
|
122
152
|
Pure cross-frontier bridge score without degree normalisation. Related to PIPE but omits the degree numerator.
|
|
123
153
|
|
|
154
|
+
---
|
|
155
|
+
|
|
124
156
|
#### FUSE: Fused Utility-Salience Expansion
|
|
125
157
|
|
|
126
158
|
$$\pi_{\text{FUSE}}(v) = (1 - w) \cdot \deg(v) + w \cdot (1 - \overline{\text{MI}})$$
|
|
127
159
|
|
|
128
160
|
Single-phase weighted blend of degree and MI. Related to SAGE but uses continuous blending rather than two-phase transition.
|
|
129
161
|
|
|
162
|
+
---
|
|
163
|
+
|
|
130
164
|
#### SIFT: Salience-Informed Frontier Threshold
|
|
131
165
|
|
|
132
|
-
|
|
166
|
+
$$
|
|
167
|
+
\pi_{\text{SIFT}}(v) = \begin{cases} 1 - \overline{\text{MI}} & \text{if } \overline{\text{MI}} \geq \tau \\ \deg(v) + 100 & \text{otherwise} \end{cases}
|
|
168
|
+
$$
|
|
133
169
|
|
|
134
170
|
MI-threshold-based priority with degree fallback. Related to REACH but uses a hard threshold instead of continuous MI-weighted priority.
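The hard threshold can be sketched as a single branch. The function and parameter names are hypothetical; `meanMI` stands for $\overline{\text{MI}}$ and `tau` for $\tau$.

```typescript
// If mean MI reaches the threshold, use 1 - mean MI;
// otherwise fall back to deg(v) + 100.
function siftPriority(degree: number, meanMI: number, tau: number): number {
  return meanMI >= tau ? 1 - meanMI : degree + 100;
}
```

If mean MI lies in $[0, 1]$, the first branch never exceeds 1 while the fallback is at least 100, so nodes that pass the threshold are always ordered apart from nodes that fail it.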
|
|
135
171
|
|
|
172
|
+
---
|
|
173
|
+
|
|
136
174
|
#### FLUX: Flexible Local Utility Crossover
|
|
137
175
|
|
|
138
176
|
Density-adaptive strategy switching. Selects between DOME, EDGE, and PIPE modes per-node based on local graph density and cross-frontier bridge score. Related to MAZE but adapts spatially (per-node) rather than temporally (per-phase).
|
|
139
177
|
|
|
178
|
+
---
|
|
179
|
+
|
|
180
|
+
---
|
|
181
|
+
|
|
140
182
|
### Path Ranking: PARSE
|
|
141
183
|
|
|
142
184
|
**Path Aggregation Ranked by Salience Estimation** (PARSE) scores paths by the geometric mean of per-edge mutual information, eliminating length bias:
|
|
@@ -215,7 +257,7 @@ where $c(\tau_u)$ is the count of nodes with the same type as $u$. Weights Jacca
|
|
|
215
257
|
| --------------------------- | --------------------------------------------------- |
|
|
216
258
|
| **Katz Index** | $\sum_{k=1}^{\infty} \beta^k (A^k)_{st}$ |
|
|
217
259
|
| **Communicability** | $(e^A)_{st}$ |
|
|
218
|
-
| **Resistance Distance** | $L^{+}_{ss} + L^{+}_{tt} - 2L^{+}_{st}$
|
|
260
|
+
| **Resistance Distance** | $L^{+}\_{ss} + L^{+}\_{tt} - 2L^{+}\_{st}$ |
|
|
219
261
|
| **Jaccard Arithmetic Mean** | $\frac{1}{k} \sum J(N(u), N(v))$ |
|
|
220
262
|
| **Degree-Sum** | $\sum_{v \in P} \deg(v)$ |
|
|
221
263
|
| **Widest Path** | $\min_{(u,v) \in P} w(u,v)$ |
|
|
@@ -223,6 +265,10 @@ where $c(\tau_u)$ is the count of nodes with the same type as $u$. Weights Jacca
|
|
|
223
265
|
| **Betweenness** | Fraction of shortest paths through node |
|
|
224
266
|
| **Random** | Uniform random score (null baseline) |
|
|
225
267
|
|
|
268
|
+
---
|
|
269
|
+
|
|
270
|
+
---
|
|
271
|
+
|
|
226
272
|
### Seed Selection: GRASP
|
|
227
273
|
|
|
228
274
|
**Graph-agnostic Representative seed pAir Sampling**: selects structurally representative seed pairs from an unknown graph using reservoir sampling and structural feature clustering. Operates blind: no full graph loading, no ground-truth labels, no human-defined strata.
|
|
@@ -233,6 +279,10 @@ Three phases:
|
|
|
233
279
|
2. **Structural features**: for each sampled node compute $\log(\deg + 1)$, clustering coefficient, approximate PageRank
|
|
234
280
|
3. **Cluster and sample**: MiniBatchKMeans into $K$ groups; sample within-cluster and cross-cluster pairs
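The description above says GRASP relies on reservoir sampling so that the full graph never needs to be loaded. The following is a generic sketch of classic reservoir sampling (Algorithm R), not graphwise's implementation: the function name is hypothetical, an array stands in for a node stream, and the injectable `rng` is an assumption added to make runs reproducible.

```typescript
// Keeps a uniform random sample of k items from a sequence read once, in order.
// `rng` must return values in [0, 1); Math.random is the default.
function reservoirSample<T>(
  items: readonly T[],
  k: number,
  rng: () => number = Math.random,
): T[] {
  const reservoir: T[] = [];
  for (let i = 0; i < items.length; i++) {
    if (i < k) {
      // Fill the reservoir with the first k items.
      reservoir.push(items[i]);
    } else {
      // Replace a random slot with probability k / (i + 1).
      const j = Math.floor(rng() * (i + 1));
      if (j < k) reservoir[j] = items[i];
    }
  }
  return reservoir;
}
```

Memory use stays at `k` entries regardless of how many items are read, which is the property that makes this family of samplers suitable for graphs too large to hold in memory.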
|
|
235
281
|
|
|
282
|
+
---
|
|
283
|
+
|
|
284
|
+
---
|
|
285
|
+
|
|
236
286
|
## Module Exports
|
|
237
287
|
|
|
238
288
|
```typescript
|