semanticembed 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- semanticembed-0.1.0/.gitignore +11 -0
- semanticembed-0.1.0/LICENSE +18 -0
- semanticembed-0.1.0/PKG-INFO +257 -0
- semanticembed-0.1.0/README.md +233 -0
- semanticembed-0.1.0/dimensions.md +92 -0
- semanticembed-0.1.0/docs/api_reference.md +290 -0
- semanticembed-0.1.0/docs/dimensions.md +92 -0
- semanticembed-0.1.0/docs/getting_started.md +248 -0
- semanticembed-0.1.0/docs/input_format.md +71 -0
- semanticembed-0.1.0/docs/license_keys.md +52 -0
- semanticembed-0.1.0/docs/output_format.md +81 -0
- semanticembed-0.1.0/examples/ai_agent_pipeline.json +21 -0
- semanticembed-0.1.0/examples/cicd_pipeline.json +23 -0
- semanticembed-0.1.0/examples/google_online_boutique.json +21 -0
- semanticembed-0.1.0/examples/react/README.md +54 -0
- semanticembed-0.1.0/examples/react/RadarChart.tsx +109 -0
- semanticembed-0.1.0/examples/react/RiskTable.tsx +114 -0
- semanticembed-0.1.0/examples/react/TopologySummary.tsx +83 -0
- semanticembed-0.1.0/examples/react/useSemanticEmbed.ts +94 -0
- semanticembed-0.1.0/examples/weaveworks_sock_shop.json +21 -0
- semanticembed-0.1.0/notebooks/01_quickstart.ipynb +264 -0
- semanticembed-0.1.0/notebooks/02_dimensions.ipynb +309 -0
- semanticembed-0.1.0/notebooks/03_drift_detection.ipynb +195 -0
- semanticembed-0.1.0/notebooks/04_bring_your_own.ipynb +261 -0
- semanticembed-0.1.0/notebooks/05_ai_agent_pipelines.ipynb +257 -0
- semanticembed-0.1.0/notebooks/06_cicd_pipelines.ipynb +259 -0
- semanticembed-0.1.0/notebooks/07_opentelemetry.ipynb +343 -0
- semanticembed-0.1.0/pyproject.toml +37 -0
- semanticembed-0.1.0/src/semanticembed/__init__.py +37 -0
- semanticembed-0.1.0/src/semanticembed/client.py +240 -0
- semanticembed-0.1.0/src/semanticembed/exceptions.py +34 -0
- semanticembed-0.1.0/src/semanticembed/models.py +118 -0
@@ -0,0 +1,18 @@
SemanticEmbed SDK License

Copyright (c) 2026 SemanticEmbed Inc. All rights reserved.

This software is proprietary and confidential. Unauthorized copying,
distribution, modification, or use of this software, via any medium,
is strictly prohibited.

The SemanticEmbed SDK is distributed as a compiled package for the
sole purpose of interfacing with the SemanticEmbed cloud API. It does
not contain the encoding algorithm or any proprietary computation logic.

Free tier: graphs up to 50 nodes, no signup required.
Paid tier: unlimited nodes, requires a valid license key.

Patent pending. Application #63/994,075.

For licensing inquiries: jeffmurr@seas.upenn.edu
@@ -0,0 +1,257 @@
Metadata-Version: 2.4
Name: semanticembed
Version: 0.1.0
Summary: 6D structural intelligence for directed graphs. Six numbers per node. Sub-millisecond.
Project-URL: Homepage, https://github.com/jmurray10/semanticembed-sdk
Project-URL: Documentation, https://github.com/jmurray10/semanticembed-sdk
Project-URL: Repository, https://github.com/jmurray10/semanticembed-sdk
Author-email: Jeff Murray <jeffmurr@seas.upenn.edu>
License: Proprietary
License-File: LICENSE
Keywords: graph,microservices,observability,risk-detection,structural-analysis
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.9
Requires-Dist: httpx>=0.24
Description-Content-Type: text/markdown

# SemanticEmbed SDK

**Structural intelligence for directed graphs. Six numbers per node. Sub-millisecond.**

SemanticEmbed computes a 6-dimensional structural encoding for every node in a directed graph. From a bare edge list -- no runtime telemetry, no historical data, no tuning -- it produces six independent measurements that fully describe each node's structural role.

> **Validated against production incidents.** In a blind test against a live production environment (100+ services, 2,500+ incidents over 30 days), the majority of topology-relevant incidents occurred on nodes that 6D structural analysis had flagged as risky -- from the call graph alone, before any incident occurred.

---

## Why 6D?

Observability tools tell you **what broke**. SemanticEmbed tells you **what will break** -- from topology alone.

- **No agents, no instrumentation** -- just an edge list
- **Sub-millisecond** -- encodes 100+ node graphs in <1ms
- **Works on any directed graph** -- microservices, AI agent pipelines, data workflows, CI/CD
- **Mathematically independent axes** -- six dimensions, zero redundancy, each captures structural information no other metric provides

---

## Try It Now

**[Open the Interactive Demo in Google Colab](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/01_quickstart.ipynb)** -- runs in your browser, nothing to install locally.

---

## Install

```bash
pip install semanticembed
```

**Free tier:** Up to 50 nodes per graph. No signup required.

---

## Quick Start

```python
from semanticembed import encode, report

# Any directed graph as an edge list
edges = [
    ("frontend", "api-gateway"),
    ("api-gateway", "order-service"),
    ("api-gateway", "user-service"),
    ("order-service", "payment-service"),
    ("order-service", "inventory-service"),
    ("payment-service", "database"),
]

# Compute the 6D encoding (sub-millisecond)
result = encode(edges)

# Six structural measurements per node
for node, vector in result.vectors.items():
    print(f"{node}: {vector}")

# Structural risk report
print(report(result))
```

Output:

```
STRUCTURAL RISK REPORT
======================

AMPLIFICATION RISKS (high fanout, high criticality):
- api-gateway | fanout=0.667 | criticality=0.556

CONVERGENCE SINKS (low independence, many upstream callers):
- database | independence=0.000

STRUCTURAL SPOF (low independence, high upstream dependency):
- api-gateway | independence=0.000 | every request flows through this node
```

---

## What It Finds That Other Tools Miss

| Your current tools | SemanticEmbed |
|---|---|
| This service has high latency | This service is on 89% of all paths (structural SPOF) |
| This service had 5 errors | This service fans out to 12 downstream services (amplification risk) |
| This service is healthy | This service has zero lateral redundancy (convergence sink) |

Runtime monitoring tells you what is slow **now**. Structural analysis tells you what **will** cause cascading failures regardless of current load.

---

## The Six Dimensions

Every node gets six independent structural measurements:

| Dimension | What It Measures | Risk Signal |
|-----------|-----------------|-------------|
| **Depth** | Position in the execution pipeline (0.0 = entry, 1.0 = deepest) | Deep nodes accumulate upstream latency |
| **Independence** | Lateral redundancy at the same pipeline stage | Low independence = structural chokepoint |
| **Hierarchy** | Module or group membership | Cross-module dependencies = blast radius |
| **Throughput** | Fraction of total traffic flowing through the node | High throughput + low independence = hidden bottleneck |
| **Criticality** | Fraction of end-to-end paths depending on this node | High criticality = SPOF |
| **Fanout** | Broadcaster (1.0) vs aggregator (0.0) | High fanout = amplification risk |

These six properties are mathematically independent -- knowing any five tells you nothing about the sixth.

See [docs/dimensions.md](docs/dimensions.md) for the full reference.

---

## Use Cases

**Microservice architectures** -- Find SPOFs, amplification cascades, and convergence bottlenecks in any service mesh. Works with Kubernetes, Istio, OTel traces, or static architecture diagrams.

**AI agent pipelines** -- Identify vendor concentration risk, gateway bottlenecks, and guardrail single points of failure in LLM orchestration graphs.

**CI/CD and data pipelines** -- Detect structural fragility in build graphs, ETL workflows, and deployment pipelines before they cause cascading failures.

**Architecture drift monitoring** -- Compare structural fingerprints across releases. Know exactly which services changed structural role and by how much.
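
The drift comparison described above can be sketched with the standard library alone. This is a minimal illustration, not the SDK's drift API: the function name `structural_drift`, the sample vectors, and the assumption that each vector is a plain sequence of six floats are all hypothetical. It measures how far each node moved between two releases as a Euclidean distance and lists nodes that appeared or disappeared.

```python
import math

def structural_drift(before, after):
    """Compare two releases' 6D vectors (hypothetical helper, not an SDK call).

    `before` and `after` map node name -> sequence of six floats.
    Returns (distance moved per shared node, added nodes, removed nodes).
    """
    shared = before.keys() & after.keys()
    moved = {node: math.dist(before[node], after[node]) for node in shared}
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    return moved, added, removed

# Made-up vectors in the order:
# depth, independence, hierarchy, throughput, criticality, fanout
release_1 = {
    "api-gateway": (0.25, 0.0, 0.5, 0.4, 0.5, 0.6),
    "cache": (0.50, 0.5, 0.5, 0.1, 0.1, 0.5),
}
release_2 = {
    "api-gateway": (0.25, 0.0, 0.5, 0.4, 0.9, 0.6),  # criticality rose
    "queue": (0.50, 0.5, 0.5, 0.2, 0.2, 0.5),
}

moved, added, removed = structural_drift(release_1, release_2)
for node, distance in sorted(moved.items(), key=lambda item: -item[1]):
    print(f"{node}: moved {distance:.3f}")
print("added:", added, "removed:", removed)
```

Sorting by distance puts the services whose structural role changed most at the top; a release gate could compare those distances against a tolerance chosen by the team.
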

---

## Notebooks

Step-by-step Colab notebooks. Click to open, run in your browser.

| Notebook | Use Case | What You Learn |
|----------|----------|---------------|
| [01 - Quickstart](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/01_quickstart.ipynb) | Getting started | Install, encode a graph, read the risk report |
| [02 - Dimensions Deep Dive](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/02_dimensions.ipynb) | Understanding 6D | What each dimension means, with worked examples |
| [03 - Drift Detection](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/03_drift_detection.ipynb) | Architecture drift | Compare graph versions, detect structural changes |
| [04 - Bring Your Own Graph](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/04_bring_your_own.ipynb) | Any graph | Load from JSON, OTel traces, or Kubernetes |
| [05 - AI Agent Pipelines](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/05_ai_agent_pipelines.ipynb) | AI/LLM agents | Vendor concentration, gateway bottlenecks, guardrail SPOFs |
| [06 - CI/CD & Data Pipelines](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/06_cicd_pipelines.ipynb) | CI/CD & ETL | Build graph fragility, pipeline bottlenecks, drift gates |
| [07 - OpenTelemetry](https://colab.research.google.com/github/jmurray10/semanticembed-sdk/blob/main/notebooks/07_opentelemetry.ipynb) | OTel traces | Extract edges from traces, structural analysis, CI/CD gates |

---

## Example Graphs

The `examples/` directory contains edge lists for well-known architectures:

| File | Application | Nodes | Edges |
|------|------------|-------|-------|
| [google_online_boutique.json](examples/google_online_boutique.json) | Google Online Boutique (microservices) | 11 | 15 |
| [weaveworks_sock_shop.json](examples/weaveworks_sock_shop.json) | Weaveworks Sock Shop (microservices) | 15 | 15 |
| [ai_agent_pipeline.json](examples/ai_agent_pipeline.json) | Multi-agent LLM orchestration | 12 | 15 |
| [cicd_pipeline.json](examples/cicd_pipeline.json) | CI/CD build pipeline | 13 | 17 |

---

## React Components

Drop-in React components for rendering SDK results. See [examples/react/](examples/react/) for the full source.

| Component | What it renders |
|-----------|----------------|
| `useSemanticEmbed.ts` | React hook -- call `encode()` from your frontend |
| `RiskTable.tsx` | Sortable risk table with severity badges |
| `RadarChart.tsx` | 6D radar chart comparing node profiles |
| `TopologySummary.tsx` | KPI cards + risk breakdown |

```tsx
import { useSemanticEmbed } from './useSemanticEmbed';
import { RiskTable } from './RiskTable';

function App() {
  const { result, loading, encode } = useSemanticEmbed();
  return (
    <>
      <button onClick={() => encode([["A","B"],["B","C"],["C","D"]])}>Analyze</button>
      {result && <RiskTable risks={result.risks} />}
    </>
  );
}
```

---

## Input Format

SemanticEmbed accepts any directed graph as an edge list.

```python
# Assumes encode_file is exported at the package top level, like encode
from semanticembed import encode, encode_file

# Python tuples
edges = [("A", "B"), ("B", "C")]
result = encode(edges)

# JSON file
result = encode_file("my_graph.json")
```

JSON format:

```json
{
  "edges": [
    {"source": "A", "target": "B"},
    {"source": "B", "target": "C"}
  ]
}
```
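
For readers who want to inspect a JSON graph before sending it anywhere, the shape above can be parsed with the standard library. This is a rough sketch rather than SDK code: `edges_from_json`, `node_count`, and the `FREE_TIER_MAX_NODES` constant are names made up for illustration, and the only facts taken from this README are the `edges`/`source`/`target` layout and the 50-node free-tier limit.

```python
import json

FREE_TIER_MAX_NODES = 50  # limit quoted in this README; the constant name is hypothetical

def edges_from_json(text):
    """Parse the documented JSON shape into a list of (source, target) tuples."""
    document = json.loads(text)
    return [(edge["source"], edge["target"]) for edge in document["edges"]]

def node_count(edges):
    """Count the distinct node names that appear in the edge list."""
    return len({name for edge in edges for name in edge})

sample = '{"edges": [{"source": "A", "target": "B"}, {"source": "B", "target": "C"}]}'
edges = edges_from_json(sample)
fits_free_tier = node_count(edges) <= FREE_TIER_MAX_NODES
print(edges, node_count(edges), fits_free_tier)
```

The resulting list of tuples has the same form as the `edges` argument shown in the Python example above.
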

See [docs/input_format.md](docs/input_format.md) for the full spec.

---

## Documentation

| Document | Description |
|----------|-------------|
| [docs/getting_started.md](docs/getting_started.md) | Install, encode, read results, export -- one page |
| [docs/api_reference.md](docs/api_reference.md) | Every function, class, parameter, and return type |
| [docs/dimensions.md](docs/dimensions.md) | The six structural dimensions -- full reference |
| [docs/input_format.md](docs/input_format.md) | Edge list input specification |
| [docs/output_format.md](docs/output_format.md) | Encoding output and risk report format |

---

## License

SemanticEmbed SDK is proprietary software distributed as a compiled package.
Free tier available for graphs up to 50 nodes. See [LICENSE](LICENSE) for terms.

**Patent pending.** Application #63/994,075.

---

## Contact

Email jeffmurr@seas.upenn.edu
@@ -0,0 +1,92 @@
# The Six Dimensions

SemanticEmbed computes six independent structural measurements for every node in a directed graph. Together, these six numbers form a coordinate vector in 6-dimensional Euclidean space that fully describes the node's structural role.

---

## Overview

| # | Dimension | What It Measures | Range |
|---|-----------|-----------------|-------|
| 1 | **Depth** | Position in the execution pipeline | 0.0 -- 1.0 |
| 2 | **Independence** | Lateral redundancy at the same stage | 0.0 -- 1.0 |
| 3 | **Hierarchy** | Module or group membership | 0.0 -- 1.0 |
| 4 | **Throughput** | Fraction of total traffic through this node | 0.0 -- 1.0 |
| 5 | **Criticality** | Fraction of end-to-end paths depending on this node | 0.0 -- 1.0 |
| 6 | **Fanout** | Broadcaster (1.0) vs aggregator (0.0) | 0.0 -- 1.0 |

---

## Depth

Where a node sits in the execution pipeline. Entry points are 0.0, deepest sinks are 1.0.

A failure at low depth cascades forward through more of the graph. A failure at high depth has limited downstream impact.
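
One way to picture a 0.0 to 1.0 depth value is a longest-path layering of an acyclic graph, scaled by the deepest layer. The sketch below is only that illustration: the SDK's encoding is proprietary and is not claimed to work this way, `normalized_depth` is a made-up name, and the code assumes the graph has no cycles.

```python
from collections import defaultdict

def normalized_depth(edges):
    """Longest-path layer of each node, scaled to [0, 1]. Assumes a DAG."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for source, target in edges:
        successors[source].append(target)
        indegree[target] += 1
        nodes.update((source, target))

    layer = {node: 0 for node in nodes}
    ready = [node for node in nodes if indegree[node] == 0]  # entry points
    while ready:  # Kahn-style topological sweep
        node = ready.pop()
        for nxt in successors[node]:
            layer[nxt] = max(layer[nxt], layer[node] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    deepest = max(layer.values(), default=0) or 1  # avoid dividing by zero
    return {node: layer[node] / deepest for node in nodes}

edges = [
    ("frontend", "api-gateway"),
    ("api-gateway", "order-service"),
    ("api-gateway", "user-service"),
    ("order-service", "payment-service"),
    ("order-service", "inventory-service"),
    ("payment-service", "database"),
]
depth = normalized_depth(edges)
for node in sorted(depth, key=depth.get):
    print(f"{node}: {depth[node]:.2f}")
```

Under this reading the entry point scores 0.0 and the node at the end of the longest chain scores 1.0; sinks on shorter branches land in between.
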

---

## Independence

How many other nodes operate at the same pipeline stage. Measures lateral redundancy.

A node with independence 0.0 is the only node at its depth level -- a structural chokepoint. A node with high independence has many parallel peers.

**This dimension has no equivalent in standard graph centrality measures.** Maximum correlation with all 10 standard centrality metrics: |r| = 0.369. It captures a structural property that the entire field's standard toolkit cannot see.
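
The "only node at its depth level" idea can be sketched by counting same-layer peers. This is an illustrative reading, not the SDK's formula: `lateral_independence` is a hypothetical helper, the layer numbers are supplied by hand, and scaling by the widest layer is an assumed normalization.

```python
from collections import Counter

def lateral_independence(layer_of):
    """Same-layer peer count per node, scaled so the widest layer scores 1.0.

    `layer_of` maps node name -> integer pipeline stage (assumed input).
    """
    width = Counter(layer_of.values())
    most_peers = max(width.values(), default=1) - 1
    if most_peers == 0:  # every stage holds a single node
        return {node: 0.0 for node in layer_of}
    return {
        node: (width[layer] - 1) / most_peers
        for node, layer in layer_of.items()
    }

# Hand-assigned stages for a small service graph
layers = {
    "frontend": 0,
    "api-gateway": 1,
    "order-service": 2,
    "user-service": 2,
    "payment-service": 3,
    "inventory-service": 3,
    "database": 4,
}
independence = lateral_independence(layers)
print(independence["api-gateway"], independence["order-service"])
```

Nodes that sit alone at their stage score 0.0 here, which is the chokepoint case described above.
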

---

## Hierarchy

Which module, cluster, or logical group a node belongs to. Captures community structure.

Nodes in the same module have similar hierarchy values. Cross-module dependencies show up as connections between nodes with different hierarchy values.

---

## Throughput

What fraction of total traffic flows through this node, based on connectivity relative to the graph.

Independent of depth: a deep node can have high or low throughput. High throughput combined with low independence indicates a hidden bottleneck.

---

## Criticality

How many end-to-end paths through the graph depend on this node.

A node with criticality 0.5 sits on half of all source-to-sink paths. If it fails, half of all end-to-end flows break. Independent of throughput: a low-traffic node can be highly critical if it bridges two large subgraphs.
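
Read literally, "fraction of source-to-sink paths" can be computed by enumerating paths in a small acyclic graph. The sketch below does exactly that and nothing more: `path_fraction` is a made-up name, the enumeration is exponential in the worst case, and the SDK evidently normalizes differently (its sample report in the README shows criticality=0.556 for a gateway that every path crosses), so the numbers here should not be expected to match SDK output.

```python
from collections import defaultdict

def path_fraction(edges):
    """Fraction of source-to-sink paths passing through each node (small DAGs only)."""
    successors = defaultdict(list)
    has_parent = set()
    nodes = set()
    for source, target in edges:
        successors[source].append(target)
        has_parent.add(target)
        nodes.update((source, target))

    paths = []

    def walk(node, trail):
        trail = trail + [node]
        if not successors[node]:  # sink reached: record the complete path
            paths.append(trail)
        for nxt in successors[node]:
            walk(nxt, trail)

    for source in sorted(nodes - has_parent):  # nodes with no incoming edge
        walk(source, [])
    if not paths:
        return {}
    return {node: sum(node in path for path in paths) / len(paths) for node in nodes}

edges = [
    ("frontend", "api-gateway"),
    ("api-gateway", "order-service"),
    ("api-gateway", "user-service"),
    ("order-service", "payment-service"),
    ("order-service", "inventory-service"),
    ("payment-service", "database"),
]
fraction = path_fraction(edges)
for node in sorted(fraction, key=fraction.get, reverse=True):
    print(f"{node}: {fraction[node]:.3f}")
```
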

---

## Fanout

Whether a node amplifies (broadcasts to many downstream) or aggregates (collects from many upstream).

High fanout = amplification risk. A failure here multiplies across all downstream dependents. Low fanout = convergence sink. Many upstream services depend on this single point.
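
A simple ratio that behaves like the broadcaster-versus-aggregator scale is out-degree divided by total degree. Whether the SDK computes fanout this way is not stated anywhere in this package, so treat `fanout_ratio` as a hypothetical stand-in; it does happen to give 2/3 for `api-gateway` on the README's Quick Start graph, in line with the `fanout=0.667` shown in that sample report.

```python
from collections import Counter

def fanout_ratio(edges):
    """Out-degree / (in-degree + out-degree): 1.0 = pure broadcaster, 0.0 = pure aggregator."""
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(target for _, target in edges)
    nodes = set(out_degree) | set(in_degree)
    # Every node here has at least one edge, so the denominator is never zero.
    return {
        node: out_degree[node] / (out_degree[node] + in_degree[node])
        for node in nodes
    }

edges = [
    ("frontend", "api-gateway"),
    ("api-gateway", "order-service"),
    ("api-gateway", "user-service"),
    ("order-service", "payment-service"),
    ("order-service", "inventory-service"),
    ("payment-service", "database"),
]
ratio = fanout_ratio(edges)
for node in sorted(ratio, key=ratio.get, reverse=True):
    print(f"{node}: {ratio[node]:.3f}")
```
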

---

## Independence of Dimensions

The six dimensions are mathematically independent. Knowing any five tells you nothing about the sixth:

- A deep node can have high or low throughput
- A high-traffic node can sit on one path or many paths
- A node at any depth, throughput, or criticality can be a broadcaster or aggregator
- Independence correlates only weakly with standard centrality measures (maximum |r| = 0.369)

This independence is what makes the encoding efficient. No dimension is redundant. Every dimension carries structural information that no other dimension provides.

Adding more dimensions (7, 8, ...) was tested and hurt performance -- the extra dimensions carried redundant information. Six is the natural dimensionality of the structural property space for directed computational graphs.
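
Correlation figures such as the |r| value quoted in this reference are Pearson coefficients between two columns of per-node numbers. As a rough sketch of how a reader could check pairwise correlation on their own results, the helper below computes r by hand; `pearson_r` and both sample columns are hypothetical and do not come from the SDK.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

# Made-up per-node columns, one value per node in the same node order
independence_column = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0]
depth_column = [0.0, 0.25, 0.5, 0.5, 0.75, 0.75, 1.0]

r = pearson_r(independence_column, depth_column)
print(f"|r| = {abs(r):.3f}")
```

The function divides by zero if either column is constant, so constant columns need to be filtered out before comparing dimensions this way.
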

---

## Structural Risk Patterns

| Pattern | Dimensions | Risk |
|---------|-----------|------|
| **Amplification risk** | High fanout + high criticality | Failure cascades to many services across many paths |
| **Convergence sink** | Low independence + low fanout | Many services depend on one aggregation point |
| **Structural SPOF** | Low independence + high criticality | Only node at its depth, on most paths |
| **Hidden bottleneck** | High throughput + low independence | Carries most traffic with no redundancy |
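
The pattern table can be turned into a small rule check over one node's six values. This is a sketch under stated assumptions: the `HIGH` and `LOW` cut-offs are invented for illustration (the package does not publish its thresholds), `risk_patterns` is a hypothetical helper, and in the sample node only the fanout, criticality, and independence figures come from the README's sample report while the rest are made up.

```python
HIGH = 0.5  # hypothetical cut-off for "high"
LOW = 0.1   # hypothetical cut-off for "low"

def risk_patterns(vector):
    """Return the table's pattern names that match one node's six values.

    `vector` is a dict keyed by dimension name (assumed layout).
    """
    found = []
    if vector["fanout"] >= HIGH and vector["criticality"] >= HIGH:
        found.append("amplification risk")
    if vector["independence"] <= LOW and vector["fanout"] <= LOW:
        found.append("convergence sink")
    if vector["independence"] <= LOW and vector["criticality"] >= HIGH:
        found.append("structural SPOF")
    if vector["throughput"] >= HIGH and vector["independence"] <= LOW:
        found.append("hidden bottleneck")
    return found

gateway = {
    "depth": 0.25,        # made up
    "independence": 0.0,  # from the README sample report
    "hierarchy": 0.5,     # made up
    "throughput": 0.4,    # made up
    "criticality": 0.556, # from the README sample report
    "fanout": 0.667,      # from the README sample report
}
print(risk_patterns(gateway))
```

A node can match several patterns at once, which is why the function returns a list rather than a single label.
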