@robotaccomplice/architext 1.0.0 → 1.1.0

@@ -0,0 +1,480 @@
1
+ # Architext Routing Correctness Plan
2
+
3
+ Architext routing is a correctness subsystem. It should be developed and tested
4
+ as geometry, not tuned only by looking at screenshots.
5
+
6
+ ## Goals
7
+
8
+ - Keep edges out of node bodies.
9
+ - Keep labels out of node bodies and away from other labels when practical.
10
+ - Make fan-out and fan-in deterministic for repeated source/target groups.
11
+ - Keep route output stable for identical model data.
12
+ - Keep selected routes visually traceable.
13
+ - Allow users to choose a single route rendering style per view: orthogonal or
14
+ curved.
15
+ - Prefer automatic routing. Data-level hints should only influence scoring when
16
+ the automatic result is not good enough.
17
+
18
+ ## Non-Goals
19
+
20
+ - No manual per-edge coordinate authoring as the default workflow.
21
+ - No browser-only routing behavior that cannot be exercised from tests.
22
+ - No layout rewrites until current routing behavior is isolated behind a pure
23
+ API.
24
+
25
+ ## Target API
26
+
27
+ The viewer should call a pure diagram planning function before drawing anything:
28
+
29
+ ```ts
30
+ planDiagram(input: DiagramPlanningInput): PlannedDiagram
31
+ ```
32
+
33
+ `planDiagram` should see the whole rendered diagram:
34
+
35
+ - view lanes and lane bounds
36
+ - node rectangles
37
+ - relationship set
38
+ - expected label text and approximate label boxes
39
+ - current route style
40
+ - canvas bounds
41
+ - reserved UI bands and gutters
42
+ - routing/debug options
43
+
44
+ The edge router remains a subordinate pure function:
45
+
46
+ ```ts
47
+ routeEdges(input: RoutingInput): Map<string, RoutedEdge>
48
+ ```
49
+
50
+ `RoutingInput` should include:
51
+
52
+ - relationships to route
53
+ - node rectangles
54
+ - visible node ids
55
+ - lane and row indexes
56
+ - canvas bounds
57
+ - route options such as node padding, label padding, and debug mode
58
+
59
+ `RoutedEdge` should include:
60
+
61
+ - edge id
62
+ - SVG path string
63
+ - label point
64
+ - route samples
65
+ - total cost derived from named route-quality costs
66
+ - route-quality cost components for length, boundary pressure, node clearance,
67
+ edge proximity, crossings, repeated crossings, bends, doglegs, perimeter
68
+ fallback, fan-out direction, label movement, and label conflicts
69
+ - warnings when no clean route exists
70
+ - optional debug metadata such as rejected candidates and collision scores
71
+
72
+ `PlannedDiagram` should include:
73
+
74
+ - planned node rectangles
75
+ - planned lane bands
76
+ - routed edges
77
+ - label positions and label boxes
78
+ - warnings for node density, too-close nodes, least-bad routes, and label
79
+ conflicts
80
+ - debug geometry for corridors, ports, and rejected candidates
81
+
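The result shapes listed above could be sketched as TypeScript types. This is a minimal sketch, not the package's actual API: every identifier here (`NodeRect`, `RoutePoint`, `RouteCosts`, `sumRouteCosts`, and the property names) is an assumption chosen for illustration. The one rule it tries to capture from the plan is that total cost is derived from the named cost components rather than stored independently.

```ts
// Hypothetical shapes; names are illustrative, not the package's API.
interface NodeRect { id: string; x: number; y: number; width: number; height: number }
interface RoutePoint { x: number; y: number }

// One entry per named route-quality cost (length, bends, crossings, ...).
type RouteCosts = Record<string, number>;

interface RoutedEdge {
  id: string;
  path: string;          // SVG path string
  labelPoint: RoutePoint;
  samples: RoutePoint[]; // points sampled along the route
  costs: RouteCosts;     // named components
  totalCost: number;     // always derived from `costs`
  warnings: string[];    // filled when no clean route exists
}

// Sum the named components in a fixed key order so the floating-point
// result does not depend on object insertion order.
function sumRouteCosts(costs: RouteCosts): number {
  return Object.entries(costs)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .reduce((sum, [, value]) => sum + value, 0);
}

const exampleCosts: RouteCosts = { length: 100, bends: 0, crossings: 0 };

const exampleEdge: RoutedEdge = {
  id: "a->b",
  path: "M 0 0 L 100 0",
  labelPoint: { x: 50, y: 0 },
  samples: [{ x: 0, y: 0 }, { x: 100, y: 0 }],
  costs: exampleCosts,
  totalCost: sumRouteCosts(exampleCosts),
  warnings: [],
};
```

Keeping the components next to the total is what would let a debug overlay or a fixture budget name the dominant cost instead of only reporting a single number.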
82
+ ## Invariants
83
+
84
+ The routing test suite should encode these invariants:
85
+
86
+ - Every routed edge has finite numeric coordinates.
87
+ - A rendered view must not mix orthogonal and curved route styles.
88
+ - Every route has a stable path for stable input.
89
+ - Source and target anchors are outside or on the boundary of their nodes.
90
+ - The first and final route segments meet source and target node boundaries at a
91
+ perpendicular angle.
92
+ - Routes avoid unnecessary bends, doglegs, and corridor excursions when a
93
+ straighter clean route exists.
94
+ - Candidate generation must stay bounded. Flexible ports are useful only if they
95
+ do not make dense real-world views too slow to validate.
96
+ - Perpendicular line crossings should use hop-over rendering when the crossing
97
+ is accepted rather than avoidable.
98
+ - Crossing the same route more than once is almost always a planner failure and
99
+ should be heavily penalized before hop-over rendering is considered.
100
+ - Multiple routes using the same node side should not emerge from the exact same
101
+ surface point unless color, z-order, and selection highlighting make the stack
102
+ unambiguous.
103
+ - Perpendicular contact does not require anchoring to the center of a node side.
104
+ The planner should choose among valid points along a side when that avoids an
105
+ unnecessary bend.
106
+ - Short middle jogs between two parallel route segments are route-quality
107
+ failures. The planner should choose a better side or port instead of drawing a
108
+ shallow Z break.
109
+ - Labels and step badges must not obscure the beginning or end of short
110
+ connectors. For short straight connectors, place the badge beside the line
111
+ rather than centered on it.
112
+ - Port spacing must not introduce a dogleg into a clean direct route. Prefer a
113
+ centered direct connector over an offset connector when there is no overlap to
114
+ resolve.
115
+ - Route samples avoid non-endpoint node rectangles with configured padding when
116
+ a clean route is available.
117
+ - When no clean route exists, the router reports a warning instead of hiding the
118
+ failure behind a convoluted path. In practice, this often means nodes are too
119
+ close together or the view is too dense for the current layout.
120
+ - Multi-edge fan-out creates distinct routes or label positions.
121
+ - Labels avoid non-endpoint node rectangles when an alternative exists.
122
+ - Route order is deterministic and independent of JavaScript map iteration
123
+ accidents.
124
+ - Viewer route planning that takes longer than one second must show visible
125
+ progress feedback. Long planning must not leave the viewer looking frozen.
126
+
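A few of the invariants above translate directly into small geometry predicates. The sketch below is illustrative only: the function names and the `NodeRect`/`RoutePoint` shapes are assumptions, the perpendicular check assumes the anchor sits exactly on the node boundary, and it compares coordinates exactly, which only works if route coordinates are not subject to rounding drift.

```ts
// Hypothetical shapes and helpers for illustration.
interface NodeRect { x: number; y: number; width: number; height: number }
interface RoutePoint { x: number; y: number }

// Invariant: every routed edge has finite numeric coordinates.
function hasFiniteCoordinates(points: RoutePoint[]): boolean {
  return points.every((p) => Number.isFinite(p.x) && Number.isFinite(p.y));
}

// Invariant: the first (or final) segment meets the node boundary at a right
// angle. `anchor` must lie on a side of `rect`; `next` is the adjacent route
// point. The segment must be axis-aligned and point away from the node.
function leavesPerpendicular(rect: NodeRect, anchor: RoutePoint, next: RoutePoint): boolean {
  const right = rect.x + rect.width;
  const bottom = rect.y + rect.height;
  const withinX = anchor.x >= rect.x && anchor.x <= right;
  const withinY = anchor.y >= rect.y && anchor.y <= bottom;
  const dx = next.x - anchor.x;
  const dy = next.y - anchor.y;
  return (
    (anchor.x === rect.x && withinY && dy === 0 && dx < 0) || // left side
    (anchor.x === right && withinY && dy === 0 && dx > 0) ||  // right side
    (anchor.y === rect.y && withinX && dx === 0 && dy < 0) || // top side
    (anchor.y === bottom && withinX && dx === 0 && dy > 0)    // bottom side
  );
}

// Invariant: route order does not depend on Map iteration order. Sorting ids
// with a plain code-unit comparison avoids locale-dependent ordering.
function stableEdgeOrder(ids: Iterable<string>): string[] {
  return Array.from(ids).sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
}
```

A fixture test could run predicates like these over every routed edge, with `stableEdgeOrder` applied before routing so that insertion order in the model data cannot change the result.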
127
+ ## Viewer Responsiveness
128
+
129
+ Route planning is pure geometry, but the viewer must treat it as potentially
130
+ expensive work. A mature global CLI has to handle large target repositories
131
+ without making the browser appear broken.
132
+
133
+ The viewer should plan routed diagrams through a single asynchronous boundary
134
+ instead of calling the planner directly from React render. The first practical
135
+ boundary is a package-owned Web Worker:
136
+
137
+ - React builds the complete `planDiagram` input for the active view.
138
+ - A worker runs `planDiagram` and returns structured-cloneable geometry.
139
+ - The main thread reconstructs view helpers such as `positionFor` from returned
140
+ node rectangles.
141
+ - A route-planning overlay appears only after a plan has been pending for more
142
+ than 1000 ms.
143
+ - Fast plans should not flash a loading state.
144
+ - Worker failures should render a visible route-planning error instead of
145
+ silently leaving stale geometry on screen.
146
+
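The overlay rules above can be kept testable by expressing them as a pure function of planning state and the current time. The following is a minimal sketch under that assumption; `PlanStatus`, `OVERLAY_DELAY_MS`, `showPlanningOverlay`, and `planningErrorMessage` are hypothetical names, and the worker wiring itself is left out.

```ts
// Hypothetical planning state tracked on the main thread.
type PlanStatus =
  | { kind: "idle" }
  | { kind: "pending"; startedAtMs: number }
  | { kind: "ready" }
  | { kind: "failed"; message: string };

// The plan calls for an overlay only after more than 1000 ms of pending work.
const OVERLAY_DELAY_MS = 1000;

// Fast plans never show the overlay; slow plans show it once the delay passes.
function showPlanningOverlay(state: PlanStatus, nowMs: number): boolean {
  return state.kind === "pending" && nowMs - state.startedAtMs > OVERLAY_DELAY_MS;
}

// Worker failures are surfaced as a visible message instead of leaving
// stale geometry on screen.
function planningErrorMessage(state: PlanStatus): string | null {
  return state.kind === "failed" ? `Route planning failed: ${state.message}` : null;
}
```

Because both functions take the clock as an argument, the one-second rule can be exercised from a unit test without timers or a browser.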
147
+ This is a viewer-responsiveness rule, not a substitute for routing performance
148
+ work. Roboticus and synthetic benchmarks should still ratchet planner runtime
149
+ downward, but any runtime above one second must be made explicit to users.
150
+
151
+ ## Fixture Catalog
152
+
153
+ Initial fixtures:
154
+
155
+ - `simple-adjacent`: two nodes in neighboring lanes.
156
+ - `same-lane`: source and target in one lane.
157
+ - `multi-edge-fan-out`: one source routes to multiple targets.
158
+ - `multi-edge-fan-in`: multiple sources route to one target.
159
+ - `bidirectional`: opposite relationships between the same pair.
160
+ - `dense-lanes`: blockers between source and target lanes.
161
+ - `long-label`: label placement under wider text.
162
+ - `c4-component`: structural dependency view with container/component cards.
163
+ - `data-risks`: routes in the risk overlay view.
164
+
165
+ ## Fitness Tests
166
+
167
+ Roboticus remains useful as a real-project sentinel, but it is too broad and too
168
+ slow to be the primary routing litmus test. Routing correctness should be protected
169
+ by named synthetic fixtures that are dense enough to expose planner failures and
170
+ small enough to run on every local test pass.
171
+
172
+ Default local and CI tests should run the fixture suite. Real-project benchmark
173
+ runs, including Roboticus, should be explicit so normal routing iteration stays
174
+ fast and deterministic.
175
+
176
+ Fitness tests should operate on planned geometry, not screenshots. Each fixture
177
+ should assert the same invariants that define acceptable output:
178
+
179
+ - route coordinates are finite and deterministic
180
+ - routes do not enter non-endpoint node rectangles
181
+ - source and target contact is perpendicular
182
+ - clean direct routes stay straight
183
+ - fan-out and fan-in use distinguishable attachment points or labels
184
+ - accepted perpendicular crossings render hop-overs
185
+ - a route does not cross the same route more than once
186
+ - bend counts stay under fixture-specific limits
187
+ - labels stay outside node bodies when the fixture has enough space
188
+ - fixture-level metric budgets stay within agreed bounds for bends, crossings,
189
+ repeated crossings, dogleg cost, label movement, label conflicts, and warning
190
+ counts
191
+ - perimeter fallback routes are warnings, not invisible successes; fixture
192
+ budgets should ratchet allowed fallback counts downward as interior routing
193
+ improves
194
+ - Monotonic backtracking is now a named route-quality cost. Current complex
195
+ fixtures have zero backtracking, which means the remaining fallback problem is
196
+ corridor availability rather than path direction alone.
197
+ - Interior corridor candidates now reduce `complex-fan-out` perimeter fallback
198
+ routes from three to two. Perimeter fallback now considers the full port
199
+ candidate set, which removed the remaining `complex-fan-out` endpoint stack
200
+ without increasing fallback count.
201
+ - Route scoring now evaluates an estimated label box, not only the route label
202
+ anchor point. This keeps label readability in the same candidate-selection
203
+ pipeline as route geometry instead of relying solely on post-placement repair.
204
+ - Interior candidate generation must consider whole-diagram free-space gutters,
205
+ not just the midpoint gap between the source and destination rectangles. Dense
206
+ fan-out and fan-in diagrams often have a clean lane gutter between blocker and
207
+ endpoint columns; treating that as a first-class interior corridor avoids
208
+ perimeter fallback without adding per-fixture route hints.
209
+ - Endpoint stack detection is symmetric. Fan-out must separate source anchors,
210
+ and fan-in must separate destination anchors before bend count is allowed to
211
+ break ties.
212
+ - Corridor candidate generation is bounded to the source-target span and route
213
+ point sequences are deduplicated before scoring. This preserves whole-diagram
214
+ gutter awareness without forcing every edge to evaluate every corridor in the
215
+ diagram.
216
+ - Cheap direct and gutter candidates are scored before Dijkstra grid candidates
217
+ or perimeter fallbacks are generated. Grid/perimeter routing remains available
218
+ for hard cases, but clean cheap candidates short-circuit the expensive path.
219
+ - Edge-proximity scoring must not use pairwise sample scans in the main routing
220
+ loop. Until it is backed by a spatial index, correctness checks rely on
221
+ collisions, crossings, repeated crossings, endpoint stacks, doglegs, and
222
+ fallback warnings.
223
+ - Roboticus benchmark after cheap-candidate short-circuiting, bounded corridors,
224
+ grid side-pair pruning, and disabled pairwise edge-proximity scans: 69 seconds
225
+ on May 14, 2026. Previous successful benchmark was 409 seconds; intermediate
226
+ attempts that kept pairwise edge-proximity scans exceeded ten minutes.
227
+ - Next optimization target: replace repeated previous-route scans with a route
228
+ spatial index. Candidate scoring should query only nearby prior route samples
229
+ or segments instead of walking every previous route for every candidate.
230
+ - Route crossing and endpoint-stack checks now use an incremental route index.
231
+ Roboticus benchmark after this change: 27.8 seconds on May 14, 2026, down from
232
+ 69 seconds after the first optimization pass and 409 seconds before routing
233
+ optimization.
234
+ - Next optimization target: index node rectangles for route quality, label
235
+ clearance, and collision checks. Candidate scoring should query nearby
236
+ blockers by sample bounds instead of scanning every non-endpoint node for
237
+ every sample.
238
+ - Blocker rectangle indexing was tested after the route index and did not improve
239
+ the Roboticus benchmark enough to keep as the next retained optimization.
240
+ The next retained target is the grid router's Dijkstra implementation: it
241
+ should use a priority queue instead of repeatedly scanning every graph point.
242
+ - Priority-queue Dijkstra did not materially improve the Roboticus benchmark;
243
+ it remains useful as bounded algorithmic cleanup for hard grid-route cases.
244
+   The dominant repeated work was planning the same route geometry separately for orthogonal
244
+   and curved render styles. Raw route geometry is now cached independently of
246
+ style so a style change only re-renders the path shape. Roboticus benchmark
247
+ after raw-route caching: 15.4 seconds on May 14, 2026.
248
+ - Subsequent local Roboticus benchmark runs after adding worker-backed viewer
249
+   planning still passed but measured 20.5 to 29.5 seconds. The worker change
250
+ improves viewer responsiveness rather than pure planner speed; the real-project
251
+ sentinel remains too slow and variable to run by default.
252
+ - CPU profiling shows the retained hot path is route-clearance scoring:
253
+ `distanceToRect`, `routeQualityFromSamples`, grid-route segment checks, and
254
+ test collision verification dominate runtime. The next retained optimization
255
+ should preserve route semantics while reducing repeated blocker lookup and
256
+ avoiding square-root distance work until a point is within a clearance
257
+ threshold.
258
+ - Retained clearance optimizations now cache blocker rectangles per endpoint
259
+ pair, prefilter blockers by candidate sample bounds, avoid square-root
260
+ distance work outside threshold ranges, and use exact segment/rectangle checks
261
+ for orthogonal collision counting. Roboticus benchmark after these changes:
262
+ 5.6 seconds on May 14, 2026.
263
+ - A grid graph adjacency cache was tested and not retained. In the current
264
+ route shape, cache-key and graph materialization overhead outweighed reuse and
265
+ regressed Roboticus from roughly 6.0 seconds to 7.2 seconds.
266
+ - The next retained grid-route candidate is scan-line blocker prefiltering:
267
+ horizontal grid segments only need blockers whose padded vertical span contains
268
+ that y value, and vertical grid segments only need blockers whose padded
269
+ horizontal span contains that x value.
270
+ - Scan-line blocker prefiltering was retained. It keeps grid topology unchanged
271
+ while reducing impossible segment/blocker checks. Roboticus benchmark after
272
+ this change: 5.5 seconds on May 14, 2026.
273
+ - Array-indexed grid adjacency and visited flags replaced `Map`/`Set`
274
+ bookkeeping inside Dijkstra. This keeps pathfinding behavior unchanged while
275
+ reducing inner-loop overhead. Roboticus benchmark after this cleanup:
276
+ 5.25 seconds on May 14, 2026.
277
+ - The next optimization target is reducing grid-route invocation count, not
278
+ further tuning grid internals. The router should measure how many edges reach
279
+ grid routing, why cheap candidates were rejected, and whether bounded cheap
280
+ candidates can be expanded before invoking Dijkstra.
281
+ - Roboticus measurement showed 67 of 395 routed edges escalated to grid routing,
282
+ but those edges caused 9,188 grid-route calls. Most cheap-candidate rejections
283
+ were crossings, but accepting those blindly would violate the crossing
284
+ avoidance invariant. The safer optimization is reducing grid port fan-out while
285
+ leaving the broad cheap candidate set intact.
286
+ - Bounded grid port fan-out was retained. Cheap routing still evaluates the broad
287
+ aligned port set, but grid routing now uses representative offsets only. This
288
+ reduced Roboticus grid-route calls from 9,188 to 4,324 and moved the benchmark
289
+ to 4.2 seconds on May 14, 2026.
290
+
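Two of the retained optimizations in the notes above are easy to show in isolation: deferring the square root until a sample is inside the clearance threshold, and scan-line prefiltering of blockers for a horizontal grid segment. This is a sketch of the general technique only. The names `distanceSqToRect`, `clearanceShortfall`, and `blockersForHorizontal` are assumptions and are not taken from the package (the plan mentions a `distanceToRect` function, whose real signature is not shown).

```ts
// Hypothetical rectangle shape for illustration.
interface NodeRect { x: number; y: number; width: number; height: number }

// Squared distance from a point to a rectangle; 0 when the point is inside.
function distanceSqToRect(px: number, py: number, r: NodeRect): number {
  const dx = Math.max(r.x - px, 0, px - (r.x + r.width));
  const dy = Math.max(r.y - py, 0, py - (r.y + r.height));
  return dx * dx + dy * dy;
}

// How far a sample intrudes into the clearance band around a blocker.
// Points at or beyond the threshold return 0 without any square root.
function clearanceShortfall(px: number, py: number, r: NodeRect, threshold: number): number {
  const dSq = distanceSqToRect(px, py, r);
  if (dSq >= threshold * threshold) return 0;
  return threshold - Math.sqrt(dSq);
}

// Scan-line prefilter: a horizontal segment at `y` can only touch blockers
// whose padded vertical span contains `y`. The vertical case is symmetric,
// testing `x` against the padded horizontal span.
function blockersForHorizontal(y: number, blockers: NodeRect[], padding: number): NodeRect[] {
  return blockers.filter((b) => y >= b.y - padding && y <= b.y + b.height + padding);
}
```

Neither helper changes which routes are acceptable; they only skip work that cannot affect the score, which is the property the notes above require of a retained optimization.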
291
+ Remaining ratchets:
292
+
293
+ - Keep `complex-fan-out` at zero perimeter fallback routes.
294
+ - Keep `complex-fan-in` at zero perimeter fallback routes.
295
+ - Keep `complex-c4-component` at zero perimeter fallback routes.
296
+ - Keep `endpointStackCost`, `doglegCost`, `monotonicBacktrackCost`,
297
+ `labelConflictCost`, and `labelNodeConflictCost` at zero for complex fixtures
298
+ unless the fixture is explicitly modeling an unavoidable warning.
299
+ - Keep Roboticus as an explicit benchmark until routing behavior stabilizes.
300
+
301
+ Initial complex fixtures:
302
+
303
+ - `complex-fan-out` covered: one source routes to multiple targets around intervening
304
+ nodes.
305
+ - `complex-fan-in` covered: multiple sources converge on one target without
306
+ sharing an unreadable endpoint stack.
307
+ - `complex-crossing-hops` covered: accepted perpendicular intersections are
308
+ rendered with hops after route selection.
309
+ - `complex-c4-component` covered: C4-style lanes route through the same planner as
310
+ system maps.
311
+ - `complex-too-close` covered: deliberately cramped nodes produce explicit warnings
312
+ rather than hiding the failure behind a convoluted path.
313
+
314
+ ## Roboticus Baseline
315
+
316
+ Roboticus is the first real-project routing benchmark. On May 14, 2026, the
317
+ data-only Roboticus install validated cleanly and reported no lifecycle
318
+ migration issues. Initial extraction exposed route/node collisions in dense
319
+ views. The first routing improvement made node-body collisions a dominant
320
+ selection constraint and added obstacle-aware orthogonal candidates.
321
+
322
+ Headless route checks covered non-C4, non-sequence views with both structural
323
+ relationships and flow relationships.
324
+
325
+ Initial collision baseline:
326
+
327
+ | View | Type | Relationship Set | Relationships | Route Collisions |
328
+ | --- | --- | --- | ---: | ---: |
329
+ | `system-map` | `system-map` | structural | 77 | 20 |
330
+ | `system-map` | `system-map` | flow | 65 | 24 |
331
+ | `agent-turn-flow` | `flow-explorer` | structural | 24 | 2 |
332
+ | `agent-turn-flow` | `flow-explorer` | flow | 32 | 1 |
333
+ | `dataflow-sensitive` | `dataflow` | structural | 46 | 13 |
334
+ | `dataflow-sensitive` | `dataflow` | flow | 38 | 12 |
335
+ | `deployment-local` | `deployment` | structural | 12 | 2 |
336
+ | `deployment-local` | `deployment` | flow | 13 | 3 |
337
+ | `risk-overlay` | `risk-overlay` | structural | 53 | 11 |
338
+ | `risk-overlay` | `risk-overlay` | flow | 35 | 5 |
339
+
340
+ Current benchmark:
341
+
342
+ | View | Type | Relationship Set | Relationships | Route Collisions |
343
+ | --- | --- | --- | ---: | ---: |
344
+ | `system-map` | `system-map` | structural | 77 | 0 |
345
+ | `system-map` | `system-map` | flow | 65 | 0 |
346
+ | `agent-turn-flow` | `flow-explorer` | structural | 24 | 0 |
347
+ | `agent-turn-flow` | `flow-explorer` | flow | 32 | 0 |
348
+ | `dataflow-sensitive` | `dataflow` | structural | 46 | 0 |
349
+ | `dataflow-sensitive` | `dataflow` | flow | 38 | 0 |
350
+ | `deployment-local` | `deployment` | structural | 12 | 0 |
351
+ | `deployment-local` | `deployment` | flow | 13 | 0 |
352
+ | `risk-overlay` | `risk-overlay` | structural | 53 | 0 |
353
+ | `risk-overlay` | `risk-overlay` | flow | 35 | 0 |
354
+
355
+ All routes have finite geometry. `first-party-surfaces` (`c4-container`) and
356
+ `release-gate-flow` (`sequence`) were skipped because those views still use
357
+ separate drawing logic.
358
+
359
+ The benchmark is now covered by a conditional local test that runs when
360
+ `../roboticus` exists next to Architext. It exercises both orthogonal and curved
361
+ route rendering modes against the same obstacle-aware geometry. Curved-mode
362
+ collision checks use samples from the rendered curved path, not only the
363
+ pre-smoothed polyline. The next correctness target is to bring C4 routing under
364
+ the same pure routing API and then add label-box collision checks.
365
+
366
+ ## Implementation Sequence
367
+
368
+ 1. Extract the current route planner into a pure module without changing visual
369
+ behavior.
370
+ 2. Add fixture tests that check determinism, finite geometry, collision
371
+ avoidance, and fan-out uniqueness.
372
+ 3. Introduce a holistic `planDiagram` pass that computes nodes, approximate
373
+ label boxes, lanes, route corridors, and warnings before drawing SVG/HTML
374
+ elements.
375
+ 4. Add a debug overlay hidden behind `?debugRouting=1`.
376
+ The overlay should read directly from `planDiagram` output and show route
377
+ warnings, label warnings, and dominant named cost components. It must not
378
+ have separate routing math.
379
+ 5. Replace the current candidate-scoring approach with library-derived routing
380
+ concepts:
381
+ - plan all edges against fixed node rectangles before rendering
382
+ - use explicit source and target port candidates
383
+ - use perpendicular source and target port stubs
384
+ - support flexible side-port placement instead of side-midpoint anchoring
385
+ - apply monotonic path restrictions where source-to-target direction is clear
386
+ - prefer center/direct routes first, then space-distributed alternatives
387
+ - bound candidate search and report search-exhausted warnings
388
+ - score named costs: node collisions, edge crossings, repeated crossings,
389
+ bends, long corridors, shallow doglegs, label conflicts, and perimeter
390
+ fallback
391
+ - reserve bridge/hop rendering for accepted perpendicular intersections after
392
+ route selection
393
+ - handle same-side port spacing with geometry first and color/z-order second
394
+ - return route warnings for least-bad fallbacks and too-close node
395
+ arrangements
396
+ 6. Use ELK, libavoid, yFiles, and JointJS as algorithm references, not default
397
+ dependencies.
398
+ 7. Add optional schema-supported routing hints only after automatic routing has
399
+ measurable coverage.
400
+
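Step 5 asks for explicit, bounded port candidates that prefer the centered connector first and then space-distributed alternatives. One way that could look is sketched below; `portCandidates`, its parameters, and the `Side` type are hypothetical, and the real planner may distribute ports differently.

```ts
// Hypothetical shapes for illustration.
interface NodeRect { x: number; y: number; width: number; height: number }
interface RoutePoint { x: number; y: number }
type Side = "top" | "right" | "bottom" | "left";

// Returns at most `maxPorts` points on one side of `rect`: the side midpoint
// first, then alternatives at increasing distance on either side of it.
// Offsets stay strictly inside the side so ports never land on a corner.
function portCandidates(rect: NodeRect, side: Side, spacing: number, maxPorts: number): RoutePoint[] {
  if (maxPorts < 1) return [];
  const horizontal = side === "top" || side === "bottom";
  const half = (horizontal ? rect.width : rect.height) / 2;
  const offsets: number[] = [0];
  if (spacing > 0) {
    for (let step = 1; offsets.length < maxPorts && step * spacing < half; step++) {
      offsets.push(-step * spacing);
      if (offsets.length < maxPorts) offsets.push(step * spacing);
    }
  }
  const cx = rect.x + rect.width / 2;
  const cy = rect.y + rect.height / 2;
  return offsets.map((offset): RoutePoint => {
    switch (side) {
      case "top": return { x: cx + offset, y: rect.y };
      case "bottom": return { x: cx + offset, y: rect.y + rect.height };
      case "left": return { x: rect.x, y: cy + offset };
      default: return { x: rect.x + rect.width, y: cy + offset };
    }
  });
}
```

Because the candidate list is ordered and capped, the search stays bounded and a scorer that breaks ties by list position keeps a centered direct connector whenever nothing forces an offset.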
401
+ ## Curved Routing Track
402
+
403
+ Curved routing must not mean "draw arbitrary Bézier edges and hope they look
404
+ better." It needs the same geometry discipline as orthogonal routing: fixed
405
+ inputs, sampled paths, collision checks, label scoring, and deterministic
406
+ output.
407
+
408
+ Near-term approach:
409
+
410
+ - Route first, curve second. Compute an obstacle-aware polyline or orthogonal
411
+ route, then transform it into a smooth path as a rendering stage.
412
+ - Use cubic Bézier or quadratic spline smoothing over accepted route points.
413
+ This is the practical yFiles/yEd-style post-processing model and matches
414
+ Architext's current lane/row constraints.
415
+ - Keep the route samples tied to the rendered curve, not only the pre-smoothed
416
+ polyline, before claiming collision correctness for curved mode. This is now
417
+ covered for the rounded-curve rendering path.
418
+ - Score curve candidates by node clearance, label clearance, bend smoothness,
419
+ edge-edge proximity, and route length.
420
+ - Preserve style purity: a view rendered in curved mode uses curved edges
421
+ consistently; a view rendered in orthogonal mode uses orthogonal edges
422
+ consistently.
423
+
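The requirement to tie route samples to the rendered curve can be illustrated by sampling a cubic Bézier segment and running the collision check on those samples. This is a minimal sketch with hypothetical names (`cubicBezierPoint`, `sampleCubicBezier`, `samplesEnterRect`); it uses uniform parameter steps, which is an assumption, and a real check would need enough samples that a blocker cannot fit between two of them.

```ts
// Hypothetical shapes for illustration.
interface NodeRect { x: number; y: number; width: number; height: number }
interface RoutePoint { x: number; y: number }

// Point on a cubic Bézier curve at parameter t in [0, 1] (Bernstein form).
function cubicBezierPoint(p0: RoutePoint, p1: RoutePoint, p2: RoutePoint, p3: RoutePoint, t: number): RoutePoint {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

// Samples the curve at `segments + 1` evenly spaced parameter values,
// including both endpoints.
function sampleCubicBezier(p0: RoutePoint, p1: RoutePoint, p2: RoutePoint, p3: RoutePoint, segments: number): RoutePoint[] {
  const count = Math.max(1, Math.floor(segments));
  const samples: RoutePoint[] = [];
  for (let i = 0; i <= count; i++) {
    samples.push(cubicBezierPoint(p0, p1, p2, p3, i / count));
  }
  return samples;
}

// True when any sample falls strictly inside the padded blocker rectangle.
function samplesEnterRect(samples: RoutePoint[], rect: NodeRect, padding: number): boolean {
  return samples.some(
    (p) =>
      p.x > rect.x - padding &&
      p.x < rect.x + rect.width + padding &&
      p.y > rect.y - padding &&
      p.y < rect.y + rect.height + padding,
  );
}
```

The point of checking the curve rather than the pre-smoothed polyline is that smoothing moves the path: a blocker the polyline clears can be clipped by the curve, and the reverse can also happen.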
424
+ Algorithm ideas to lift:
425
+
426
+ - **Bézier spline post-processing:** transform selected polyline/orthogonal
427
+ routes into smooth cubic or quadratic segments while preserving anchors and
428
+ obstacle clearance.
429
+ - **Tangent-visibility routing:** treat node rectangles as inflated obstacles
430
+ and generate curve control points from visible tangent corridors.
431
+ - **Geometric control-point modeling:** make control points explicit route data
432
+ so curves can be sampled, scored, debugged, and tested.
433
+ - **Edge bundling:** consider only for dense overview modes. Bundling can reduce
434
+ clutter, but it can also hide individual dependency paths and should not be
435
+ the default for workflow or C4 views.
436
+
437
+ Deferred ideas:
438
+
439
+ - Force-directed edge bundling is useful for large network visualizations, but
440
+ it is iterative, less deterministic, and can obscure individual architecture
441
+ relationships.
442
+ - Differential-equation-based routing is too complex for Architext's current
443
+ needs and should not be introduced without a concrete fixture that simpler
444
+ geometric routing cannot solve.
445
+ - Curve-based planar graph routing is aimed at general graph traversal problems,
446
+ not the fixed-node architecture diagrams Architext currently renders.
447
+
448
+ ## Debug Overlay
449
+
450
+ The debug overlay should be disabled by default and enabled with:
451
+
452
+ ```text
453
+ ?debugRouting=1
454
+ ```
455
+
456
+ It should show:
457
+
458
+ - node rectangles
459
+ - chosen route samples
460
+ - label boxes
461
+ - selected route points and warning-colored route points
462
+ - route cost
463
+ - collision warnings
464
+
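Reading the flag can stay a one-line pure function so it is testable outside the browser. The sketch below assumes the viewer would pass `window.location.search`; the function name `isDebugRoutingEnabled` is hypothetical. `URLSearchParams` is the standard URL API available in browsers and Node.

```ts
// Returns true only for an explicit `debugRouting=1` query parameter, so the
// overlay stays disabled by default. A leading "?" is accepted.
function isDebugRoutingEnabled(search: string): boolean {
  return new URLSearchParams(search).get("debugRouting") === "1";
}
```

Taking the query string as an argument, rather than reading the location inside the function, keeps the check consistent with the non-goal of browser-only behavior that tests cannot exercise.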
465
+ ## Verification
466
+
467
+ Routing changes should run:
468
+
469
+ ```sh
470
+ npm run verify
471
+ ```
472
+
473
+ Before release packaging, run:
474
+
475
+ ```sh
476
+ npm run release:check
477
+ ```
478
+
479
+ For visual changes, update the self-hosted screenshots only after the geometry
480
+ tests pass.