@reicek/neataptic-ts 0.1.13 → 0.1.15

@@ -1,419 +1,285 @@
1
- # Hyper MorphoNEAT: Concrete Implementation Plan (Aligned With Current NeatapticTS Core)
1
+ # Hyper MorphoNEAT (draft)
2
2
 
3
- Hyper MorphoNEAT is a proposed hybrid framework for neural network evolution that unifies key principles from:
3
+ Hyper MorphoNEAT sits between HyperNEAT and a small developmental language: it keeps HyperNEAT's compact spatial patterns, but adds a deterministic rule layer and runtime morph policies so evolution can propose macro motifs while the phenotype adapts locally during its lifetime. It targets problems where geometry, lifelong adaptation, and scalability all matter.
4
4
 
5
- - NEAT (incremental complexification, speciation, innovation tracking)
6
- - HyperNEAT / ES-HyperNEAT (indirect CPPN encodings over geometric substrates, scalable connectivity sampling)
7
- - Evo-Devo / Developmental Biology (rule‑driven staged growth, differentiation, symmetry, modular morphogenesis)
8
- - NeuroMorph / Structural Plasticity research (activity‑driven synaptogenesis & pruning, local rewiring)
9
- - Classic synaptic plasticity (Hebbian / anti-Hebbian, homeostatic adjustments)
5
+ Hyper MorphoNEAT is deliberately pragmatic: introduce an indirect, rule‑driven layer that remains opt‑in and deterministic, provide lightweight runtime morph policies that act only when enabled, and preserve the current NEAT/Network behavior when the feature flag is off.
10
6
 
11
- Its long‑term goal: mimic a simplified “digital embryogenesis + lifelong adaptation” pipeline—starting from a handful of proto‑cells (minimal input/output scaffold) plus a _genetic rulebook_ (developmental & pattern genes) that can:
7
+ What follows is a phased implementation plan and rationale. Each phase lists safe, reviewable changes, validation checks (determinism, small smoke tests, typechecks), and explicit rollback boundaries so reviewers can validate correctness and performance before accepting further complexity.
12
8
 
13
- 1. Elaborate structure (grow) when additional capacity is _demonstrably_ needed.
14
- 2. Reshape or prune (shrink) when regions are underutilized or wasteful.
15
- 3. Focus evolutionary search dynamically on emergent _bottlenecks_ or _frontier regions_ rather than the entire brain uniformly.
16
- 4. Maintain indirect encodings so extremely large phenotypes (millions of potential connections) are generated _on demand_ without materializing every dormant element.
9
+ ---
17
10
 
18
- This document both (A) expands the conceptual/educational narrative and (B) refines the pragmatic phased implementation grounded in the existing codebase (`src/architecture/*`, `src/neat/*`, pruning, slabs, pooling, optimizers). The aim: introduce indirect & developmental encodings, runtime morphogenesis, and adaptive large‑scale network support without destabilizing current NEAT / Network functionality.
11
+ ## Conceptual Expansion: “From ProtoBrain to Adaptive Cortex”
19
12
 
20
- ---
13
+ Hyper MorphoNEAT reframes topology growth as a staged, deterministic engineering pipeline rather than ad‑hoc structural mutation. The pipeline is intentionally compositional: small, well‑specified rule primitives and compact pattern generators (CPPNs) produce large, traceable phenotypes via repeatable passes. That design lets evolution operate on a concise, high‑leverage symbolic layer (the genotype) while runtime morph policies make bounded, local trade‑offs during an individual's lifetime (the phenotype).
21
14
 
22
- ## 1. Conceptual Expansion: “From Proto-Brain to Adaptive Cortex”
15
+ Practical design principles applied throughout this plan:
23
16
 
24
- This section formalizes the developmental metaphor underpinning the architecture. Instead of treating topology growth as a flat sequence of structural mutations, we frame the system as a staged developmental process: _pattern specification → proliferation → differentiation → guidance → refinement → lifelong adaptation_. Each mechanism (developmental rules, CPPNs, morphogenesis hooks) is deliberately aligned to one of these abstracted biological roles so that (1) contributors can predict emergent behavior, (2) design trade‑offs inherit a coherent vocabulary, and (3) future extensions (e.g. chemical gradient analogues, temporal morph phases) have an obvious insertion point.
17
+ - Opt‑in and isolated: hyper features live under guarded flags and an isolated namespace to avoid regressions.
18
+ - Deterministic by default: genotype + seed → canonical phenotype; canonical hashing and stable ordering are required for reproducible experiments.
19
+ - Budgeted growth: all expansions obey explicit complexity caps (nodes, edges, memory) and are reversible or roll‑backable for safety.
20
+ - Lazy instrumentation: telemetry and extra buffers allocate only when enabled to preserve baseline performance.
21
+ - Traceability: every developmental action carries ancestry/trace metadata to help debugging and analysis.
25
22
 
26
- Analogy to biological development (simplified):
23
+ Analogy to biological development (engineered mapping):
27
24
 
28
- | Stage | Biological Inspiration | Hyper MorphoNEAT Analog |
29
- | -------------------- | -------------------------------------------------- | ------------------------------------------------------------------------ |
30
- | Embryonic Seed | Few stem cells | Minimal genotype with input/output anchor nodes |
31
- | Patterning Gradients | Morphogens, HOX genes | CPPN fields + developmental rules assigning coordinates & tags |
32
- | Proliferation | Cell division | `replicate` rules splitting regions / subdividing connection paths |
33
- | Differentiation | Neurons specialize (sensory / interneuron / motor) | Rule‑based assignment of activation fn, plasticity mode, gating role |
34
- | Axon Guidance | Growth cones follow gradients | Spatial CPPN thresholds & local activity heuristics form new connections |
35
- | Pruning & Refinement | Synaptic pruning (use-it-or-lose-it) | Activity/frequency / contribution metrics drive removal |
36
- | Lifelong Plasticity | Hebbian, structural changes | Morphogenesis hooks + plastic weight adaptation |
25
+ | Stage | Biological Inspiration | Hyper MorphoNEAT Engineering Analog |
26
+ | ------------------------- | --------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
27
+ | Embryonic Seed | Few stem cells | Minimal, deterministic genotype with input/output anchors |
28
+ | Patterning Gradients | Morphogens, HOX genes | CPPN fields + compact developmental rules that assign coordinates, tags and role metadata |
29
+ | Proliferation | Cell division | Deterministic `replicate` rules that expand regions or bands under growth budgets |
30
+ | Differentiation | Neurons specialize | Rules assign activation families, plasticity profiles, and gating behaviors (explicit, testable traits) |
31
+ | Axon Guidance | Growth cones follow gradients | Spatial CPPN thresholds, local heuristics and wiring‑cost bias producing sparse, locality‑aware connectivity |
32
+ | Pruning & Refinement | Synaptic pruning | Activity and contribution metrics drive removal; pruning prefers long or low‑contribution inter‑module links under budget |
33
+ | Lifelong Plasticity | Hebbian and structural remodeling | Optional, gated plasticity + morph hooks that adjust weights and (rarely) local structure during runtime |
34
+ | Wiring cost (engineering) | Metabolic / material cost | Explicit per‑genotype wiring knobs (counts, length, inter‑module penalties) used by CPPNs, morph policies and selection to favor compact, modular wiring |
37
35
 
38
- ### 1.1 Core Idea
36
+ The remainder of this section spells out how these mappings are realized as deterministic passes, safe runtime hooks, and explicit validation checks. See the next subsection for the core "rule‑first" rationale and the planned deterministic invariants that must hold across phases.
39
37
 
40
- Motivates a _rule‑first_ search strategy: by beginning with a minimal developmental program the algorithm explores a compact, high‑leverage encoding space where each mutation can reshape large swaths of potential phenotype. This delays costly exploration of the vast combinatorial topology space until rules and pattern generators establish macro‑regularities (symmetry, modular bands, coordinate partitions) that scaffold efficient scaling. The approach mirrors biological canalization: early developmental constraints bias later structural elaboration toward coherent, reusable motifs.
41
- Start _tiny_ to minimize initial search space. Instead of evolving a large static genome, we evolve a **compact developmental program** that is repeatedly executed (or partially re‑executed) to refine the phenotype. Evolution acts on rules & CPPN weights (indirect), while _runtime morphogenesis_ adjusts expression intensity and local connectivity. The phenotype thus becomes an emergent, continually reshaped structure rather than a fixed topology.
38
+ ### Core Idea
42
39
 
43
- ### 1.2 Genotype vs Phenotype Layering
40
+ Hyper MorphoNEAT adopts a rule‑first engineering strategy: treat the genotype as a compact, declarative program and execute it in deterministic, well‑scoped passes to produce a runtime phenotype. Evolution operates on concise symbolic elements (rules, CPPNs, substrate modifiers) while runtime morph policies perform bounded, local adjustments under explicit budgets. This separation keeps the heritable search space small and interpretable, and makes large structural effects reproducible and debuggable.
44
41
 
45
- Defines a strict stratification between _heritable specification_ (rules + CPPN parameters) and _ephemeral realization_ (instantiated nodes, connections, plastic state). This boundary supports determinism (phenotypes can be reconstructed bit‑exactly from genotype + seed), enables aggressive memory reclamation (transient runtime arrays can be discarded or downsampled between evaluations), and decouples evolutionary operators from runtime adaptation. The layering also allows analytical tooling (diffing, hashing, compression) to operate on a concise symbolic genome rather than sprawling structural graphs.
46
- | Component | Genotype (Stored) | Phenotype (Materialized / Runtime) |
47
- |------------------------|----------------------------------------|---------------------------------------------------------------------|
48
- | Node existence | Rule-derived (virtual slots) | Instantiated `Node` objects (possibly pooled) |
49
- | Connection potential | CPPN output, rule masks | Subset realized (meets thresholds + budget) |
50
- | Module boundaries | Hierarchy / symmetry rules | Tag arrays on nodes/edges, used for focused evolution & pruning |
51
- | Plasticity configuration | Gene flags (hebbian rate, gating ability) | Runtime per-connection accumulators (optional) |
42
+ Pipeline (high level):
52
43
 
53
- ### 1.3 Dynamic Evolution Focus
44
+ 1. Substrate: deterministically assign coordinates and region tags for inputs/outputs/hidden slots.
45
+ 2. Rule passes: apply prioritized, deterministic rules (replicate, symmetry, hierarchy, etc.) to expand a virtual node list and attach ancestry/trace metadata.
46
+ 3. CPPN evaluation: procedurally score candidate pairs (weight, mask, meta) with cost‑aware thresholding; cache adjacency results keyed by canonical genotype/substrate hashes and wiring prefs.
47
+ 4. Materialize: instantiate pooled Node/Connection slabs from the realized adjacency list; apply activation/plasticity traits.
54
48
 
55
- Explains the allocation of evolutionary variance as an _adaptive resource scheduling_ problem: mutation budget is directed toward regions exhibiting evidence of constraint (high error attribution relative to size), stagnation (age without improvement), or under‑exploration (low structural diversity). By contrast, mature, stable, and well‑performing regions experience mutation cooling, reducing destructive interference. This dynamic focus mimics targeted neurogenesis and synaptic remodeling phenomena, yielding faster convergence for a fixed total mutation rate and curbing indiscriminate bloat.
56
- Rather than applying uniform mutation pressure, we maintain _region metrics_ (error attribution, novelty, saturation). Evolution probabilities shift toward regions that are:
49
+ Deterministic invariants & safety guarantees:
57
50
 
58
- - Bottlenecked (high gradient / error flow concentration)
59
- - Under-explored (low structural entropy, few recent mutations)
60
- - Recently regressed (performance dip localized)
51
+ - Reproducibility: genotype + seed → canonical phenotype (stable ordering, canonical JSON/hash).
52
+ - Idempotence: repeated builds produce identical node/edge orderings and hashes unless the genotype or seed changes.
53
+ - Budget enforcement: all expansion steps respect explicit caps (maxNodes, maxEdges, memoryBudget) and are reversible or roll‑backable.
54
+ - No global side effects: imports and builds must not mutate global runtime state when hyper mode is disabled.
55
+ - Lazy diagnostics: telemetry, traces, and per‑connection extras allocate only when enabled.
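
The reproducibility and idempotence invariants above depend on a canonical serialization and a stable hash. The following is a minimal sketch of one way to get both, not the project's actual implementation: `canonicalJson`, `fnv1a32`, and `hashGenotypeSketch` are hypothetical names, and the FNV-1a-style hash (applied to UTF-16 code units) is only a stand-in for whatever hash the project adopts.

```ts
// Hypothetical helpers for illustration; none of these names come from NeatapticTS.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

/** Serialize with object keys sorted, so logically equal values yield equal strings. */
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalJson).join(",") + "]";
  const keys = Object.keys(value).sort();
  const fields = keys.map(
    (key) => JSON.stringify(key) + ":" + canonicalJson(value[key] as Json)
  );
  return "{" + fields.join(",") + "}";
}

/** 32-bit FNV-1a-style hash over UTF-16 code units, returned as 8 hex digits. */
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

/** Hash of the canonical form: independent of property insertion order. */
function hashGenotypeSketch(genotype: Json): string {
  return fnv1a32(canonicalJson(genotype));
}
```

Array order is preserved on purpose: lists such as rules are order-sensitive unless the schema says otherwise, so any order-independent rule hashing would need an explicit sort before this step.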
61
56
 
62
- Conceptually:
57
+ Minimal pseudocode example: (canonical snippet retained in the Implementation section below — see "Minimal pseudocode example")
63
58
 
64
- ```
65
- focusScore(region) = w_err * normalizedErrorShare
66
- + w_nov * (1 - structuralDiversity)
67
- + w_regress * recentPerfDrop
68
- - w_stable * stabilityAge
69
- ```
59
+ > NOTE: To avoid duplicate snippets and drift, keep the `buildPhenotype` example only in the Implementation section; this line acts as a local pointer.
60
+
61
+ These rules keep the implementation auditable and allow later phases (morph hooks, plasticity, evolution integration) to build on a deterministic, testable foundation.
62
+
63
+ ### Genotype vs Phenotype Layering
64
+
65
+ This document maintains a single canonical "Layer responsibilities" section in the Implementation area (see the "Layer responsibilities" and "Pipeline (high level)" headings later). To avoid duplication and maintenance drift, readers should consult that canonical section for details on genotype/phenotype separation, invariants, and practical guidelines.
66
+
67
+ ### Dynamic Evolution Focus
68
+
69
+ In Hyper MorphoNEAT we treat mutation targeting as a controlled allocation problem: rather than applying uniform mutation pressure across the entire phenotype, we compute per‑region focus scores from multiple, complementary signals and use those scores to bias where evolutionary operators and runtime morph actions apply. The intention is practical and measurable: concentrate structural edits (rule mutations, local growth/prune, CPPN perturbations) where they are most likely to reduce task error or increase useful diversity, while cooling mature, stable regions to avoid destructive churn.
70
+
71
+ Key signals
72
+
73
+ - errShare: fraction of total error attributed to the region (backprop attribution, gradient magnitude, or surrogate credit).
74
+ - utilization: fraction of active units / firing rate baseline (indicates capacity in use).
75
+ - contribGradient: aggregate magnitude of gradients flowing through the region (proxy for learning pressure).
76
+ - noveltyScore: structural or activation novelty relative to recent history (encourages exploration).
77
+ - stabilityAge: time since last successful mutation / performance improvement (cooling factor).
78
+ - wiringMetrics: meanEdgeLength, interModuleRatio, totalConnections (used to weight wiring‑cost penalties).
79
+
80
+ Normalized focus score
81
+
82
+ - Normalize each signal to [0,1] using rolling statistics (mean/var) or robust quantiles to avoid outlier domination.
83
+ - Combine with configurable weights and a small L2 regularizer to avoid score collapse:
84
+
85
+ focusScore(region) = Σ_k w_k * norm_k(region) - w_cost * norm_wiring(region) - γ * ||w||^2
86
+
87
+ where norm_k are per-signal normalizers, w_k are configuration weights, w_cost encodes wiring penalties, and γ stabilizes weights if learned/adapted.
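
As a rough illustration of the formula, the sketch below computes the score from already-normalized signals. All names (`FocusWeights`, `focusScoreFromSignals`) are hypothetical, and it assumes that ||w||^2 is taken over the per-signal weights w_k only, which the text does not specify.

```ts
// Illustrative types; not part of the documented API.
interface FocusWeights {
  signals: Record<string, number>; // w_k, keyed by signal name
  wiringCost: number; // w_cost
  gamma: number; // γ, the L2 regularization strength
}

function focusScoreFromSignals(
  normalizedSignals: Record<string, number>, // norm_k(region), each in [0, 1]
  normalizedWiring: number, // norm_wiring(region), in [0, 1]
  weights: FocusWeights
): number {
  let weightedSum = 0;
  let squaredNorm = 0;
  for (const name of Object.keys(weights.signals)) {
    const w = weights.signals[name];
    // A signal missing for this region contributes nothing to the sum.
    weightedSum += w * (normalizedSignals[name] ?? 0);
    squaredNorm += w * w;
  }
  return (
    weightedSum -
    weights.wiringCost * normalizedWiring -
    weights.gamma * squaredNorm
  );
}
```

Because the regularizer depends only on the weights, it shifts every region's score by the same amount; it matters when the weights are learned or adapted, not when ranking regions under fixed weights.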
88
+
89
+ Sampling & operator placement
90
+
91
+ 1. Compute focusScore for all modules/regions each epoch (or lower cadence).
92
+ 2. Convert scores to sampling probabilities via softmax with temperature τ to control selection sparsity:
70
93
 
71
- Sampling of mutation targets becomes weighted by `focusScore` to _steer_ complexification.
72
-
73
- Objectives:
74
- - Introduce namespace, feature gating, and minimal type contracts without altering runtime behavior.
75
- - Establish deterministic seeds & hashing utility stubs used in later phases.
76
- - Guarantee zero performance / bundle size regression when flag disabled.
77
-
78
- Scope Inclusions:
79
- - Directory skeleton, preliminary interfaces, config flag, smoke tests, documentation pointer in README.
80
- Scope Exclusions:
81
- - Any allocation of new runtime arrays; mutation / reproduction changes.
82
-
83
- Key Tasks:
84
- 1. Create `src/hyper/` with placeholder modules (`genotype.ts`, `developmentalRules.ts`, `substrate.ts`, `phenotypeBuilder.ts`, `internal/hash.ts`).
85
- 2. Extend `config.ts` with `enableHyper` & `enableHyperTelemetry` flags (default false).
86
- 3. Implement a minimal `createHyperContext(seed)` returning deterministic RNG + version tag.
87
- 4. Add TypeScript interfaces with TODO JSDoc blocks referencing later phases (versioned via `@phase` tag).
88
- 5. Add Jest smoke test: importing hyper entry when disabled does not mutate global state (heap snapshot diff < threshold, optional if infra present).
89
- 6. Add build size guard (compare gzip bundle delta; skip if tooling unavailable—document).
90
-
91
- Interfaces Added / Changed:
92
- - `config.enableHyper?: boolean`
93
- - `HyperVersion = { major:1, minor:0, phase:0 }` constant.
94
-
95
- Tests:
96
- - `hyper.disabled.import.spec.ts`: ensures side‑effect free import.
97
- - `config.flag.default.spec.ts`: asserts flag false by default.
98
-
99
- Metrics & Exit Criteria:
100
- - Bundle delta < 1% (or documented if tooling absent).
101
- - No additional failing tests; coverage for new lines ≥ 80% (interface lines excluded).
102
-
103
- Risks & Mitigations:
104
- - Risk: Accidental circular import with existing `neat` modules → Mitigation: forbid `src/hyper` importing `src/neat` in linter rule.
105
- - Risk: Silent performance regression → Mitigation: micro benchmark baseline captured (no hyper usage).
106
-
107
- Deferred Items:
108
- - Hash canonicalization logic (implemented in Phase 1 with real genotype structure).
109
- - CLI / docs surface.
110
-
111
- Acceptance:
112
- - Tree contains hyper skeleton; CI green; disabling flag yields identical test timings (± variance threshold).
113
-
114
- - If memory near cap → prune least-contributing edges/nodes (respecting module diversity quotas).
115
- - Else if sustained high error & high utilization → targeted growth (replicate high‑pressure path or densify a sparse region).
116
- - Else if plateau & low exploration → mutate developmental rules / CPPN (global structural shift) before adding raw capacity.
117
- Objectives: - Define persistent `HyperGenotype` schema and deterministic build path to a baseline phenotype. - Provide substrate coordinate assignment + hashing for regeneration equivalence. - Implement serialization & hash stability tests.
118
-
119
- Key Tasks:
120
- 1. Factory: `createInitialGenotype({ input, output, seed })` producing minimal rule list (empty for now) & substrate spec.
121
- 2. Substrate: implement 1D / simple 2D coordinate assignment strategies (line, grid) with deterministic ordering.
122
- 3. Phenotype builder (baseline): instantiate input & output nodes; create fully connected edges input→output only.
123
- 4. Stable hashing: `hashGenotype(geno)` (order‑independent rule hash, sorted JSON canonicalization).
124
- 5. Serialization: `encodeGenotype(geno)` / `decodeGenotype(json)`; version field check.
125
- 6. Determinism test: repeated build from same seed produces identical edge weight ordering & hash.
126
- 7. Performance micro‑benchmark harness (optional) to record baseline build time.
127
-
128
- Interfaces Added:
129
- - `interface HyperGenotype { seed:number; input:number; output:number; rules:DevelopmentalRule[]; substrateSpec:SubstrateSpec; version:1; }`
130
- - `buildPhenotype(geno, { config })` (returns `Network`).
131
-
132
- Tests:
133
- - Round trip serialization parity.
134
- - Hash stability after neutral no‑op mutate attempt.
135
- - Memory footprint comparison vs direct `Network` baseline (connections count equality).
136
-
137
- Metrics & Exit Criteria:
138
- - Deterministic hash reproducibility = 100% across N=50 rebuilds.
139
- - Build time overhead ≤ 1.1× direct instantiation.
140
-
141
- Risks & Mitigations:
142
- - Risk: Order dependence in JSON serialization → canonical sort property keys.
143
- - Risk: Hidden mutable references (arrays reused) → deep freeze in test environment when possible.
144
-
145
- Deferred Items:
146
- - Rules execution engine (Phase 2).
147
- - Indirect connectivity (Phase 3).
148
-
149
- Acceptance:
150
- - All genotype round‑trip & determinism tests pass; documentation updated with example.
151
-
152
- if (ctx.memory.pressure > 0.9) pruneLowContribution(net, ctx);
153
- else if (ctx.error.stagnant && ctx.utilization.high) replicateCriticalPath(geno, ctx);
154
- else if (ctx.novelty.low) diversifyRules(geno);
155
- Objectives: - Introduce deterministic multi‑pass rule application with priority & probability handling. - Support initial structural motif expansion (replication, symmetry, hierarchy tagging). - Ensure repeated genotype build yields identical phenotype for fixed random stream.
156
-
157
- Key Tasks:
158
- 1. Rule interface finalization: `priority`, `probability`, `enabled` semantics; stable normalized param hashing.
159
- 2. Execution engine: sort by priority; iterate applying rules; for probabilistic rules use deterministic RNG seeded from (genotype.seed + rule.id).
160
- 3. Implement rule kinds:
161
- - `replicate`: duplicate coordinate bands / node groups (maintain mapping table for ancestry).
162
- - `symmetry`: generate mirrored coordinates and optional parameter tying metadata.
163
- - `hierarchy`: assign module IDs & nested scopes.
164
- 4. Extend phenotype builder to expand virtual node list before physical instantiation.
165
- 5. Add complexity guard (maxNodes, maxRules) pre‑instantiation.
166
- 6. Mutation operators: toggle enable, adjust numeric params, shift priority (bounded), adjust probability sigmoid‑clamped.
167
- 7. Deterministic test battery: same seed + rule set produces identical node ordering & ancestry map.
168
-
169
- Metrics & Exit Criteria:
170
- - Phenotype node count scaling validates expected multiplicative factors for staged replication.
171
- - Rule application runtime ≤ 10% of total build time for small networks (<5k nodes) in benchmark harness.
172
-
173
- Risks:
174
- - Cascading replication explosion → enforce geometric growth cap per pass.
175
- - Symmetry drift (floating point coordinate error) → use rational or fixed‑point representation for mirror axes.
176
-
177
- Deferred:
178
- - Differentiation / activation assignment (future rule kind) if excluded here.
179
-
180
- Acceptance:
181
- - All rule semantics tests pass; no nondeterministic drift across multi‑build sequences.
182
-
183
- Nodes inherit _base role_ (input/output/hidden). Developmental rules add _traits_: gating, recurrence permission, plasticity profile, activation family set. A `differentiate` rule might look like:
94
+ prob(region) ∝ exp(focusScore(region) / τ)
95
+
96
+ 3. Sample a small set of targets (top‑K or N draws without replacement) and apply bounded, local edits:
97
+ - Local edits must be budgeted (maxNewEdgesPerCycle, maxNodesPerCycle) and test‑rollbackable.
98
+ - Prefer low‑latency edits first (edge densification, small replicate); defer expensive global replays.
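
The two sampling steps above can be sketched as follows. This is one possible shape for the helpers, assuming a caller-supplied random source; the `random` parameter is an addition for testability and is not part of the plan's own sketch.

```ts
/** Softmax with temperature; subtracting the max keeps exp() from overflowing. */
function softmax(scores: number[], temperature: number): number[] {
  if (scores.length === 0) return [];
  const maxScore = Math.max(...scores);
  const exps = scores.map((s) => Math.exp((s - maxScore) / temperature));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

/** Draw up to `count` distinct items; each draw is proportional to the remaining mass. */
function sampleWithoutReplacement<T>(
  items: T[],
  probs: number[],
  count: number,
  random: () => number = Math.random // pass a seeded generator for reproducible runs
): T[] {
  const pool = items.map((item, i) => ({ item, p: probs[i] }));
  const picked: T[] = [];
  while (picked.length < count && pool.length > 0) {
    const total = pool.reduce((sum, entry) => sum + entry.p, 0);
    if (total <= 0) break; // nothing left with positive probability
    let threshold = random() * total;
    let index = pool.length - 1; // fallback guards against rounding at the upper edge
    for (let i = 0; i < pool.length; i++) {
      threshold -= pool[i].p;
      if (threshold < 0) {
        index = i;
        break;
      }
    }
    picked.push(pool[index].item);
    pool.splice(index, 1); // removal renormalizes the remaining entries implicitly
  }
  return picked;
}
```

A lower τ concentrates probability on the highest-scoring regions, while a higher τ approaches uniform sampling, so τ is the knob that trades exploitation against exploration.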
99
+
100
+ Safety invariants and limits
101
+
102
+ - Budget caps: every morph or mutation is rejected if it would violate maxNodes, maxEdges, or memoryBudget.
103
+ - Cooldown: a region that was edited resets a cooldown counter preventing repeated edits for N epochs.
104
+ - Dry run validation: for costly edits perform a dry‑build check (no allocation) to validate referential integrity before committing.
105
+ - Deterministic seed usage: stochastic choices in sampling use a deterministic RNG seeded from (genotype.hash + epochCounter) for reproducibility.
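
A minimal sketch of how the seed, cooldown, and budget invariants might fit together, under stated assumptions: the genotype hash is a hex string, mulberry32 is used as an example PRNG (the plan does not name one), memoryBudget and dry-run validation are omitted for brevity, and `seedFromHash`, `createMorphGate`, and the interfaces are hypothetical names.

```ts
/** mulberry32: a small public-domain 32-bit PRNG returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Combine the first 8 hex digits of a genotype hash with the epoch counter. */
function seedFromHash(hashHex: string, epoch: number): number {
  return (parseInt(hashHex.slice(0, 8), 16) + epoch) >>> 0;
}

interface Budgets { maxNodes: number; maxEdges: number; }
interface GraphSize { nodes: number; edges: number; }
interface EditProposal { regionId: string; addedNodes: number; addedEdges: number; }

/** Accepts an edit only if the region is out of cooldown and the caps still hold. */
function createMorphGate(budgets: Budgets, cooldownEpochs: number) {
  const lastEditEpoch = new Map<string, number>();
  return {
    tryAccept(edit: EditProposal, current: GraphSize, epoch: number): boolean {
      const last = lastEditEpoch.get(edit.regionId);
      if (last !== undefined && epoch - last < cooldownEpochs) return false;
      if (current.nodes + edit.addedNodes > budgets.maxNodes) return false;
      if (current.edges + edit.addedEdges > budgets.maxEdges) return false;
      lastEditEpoch.set(edit.regionId, epoch); // an accepted edit restarts the cooldown
      return true;
    },
  };
}
```

Rejected edits leave the cooldown untouched, so a region that was refused for budget reasons can be retried as soon as capacity frees up.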
106
+
107
+ Practical notes for implementation & experiments
108
+
109
+ - Use online, robust normalization (e.g., exponential moving mean/std or median/MAD) to make norm_k stable across training phases.
110
+ - Start with conservative weights (favor errShare + utilization) and anneal toward novelty when stagnation persists.
111
+ - Evaluate ablations:
112
+ - uniform vs focus sampling (measure the number of evaluations needed to reach convergence).
113
+ - with/without wiringCost in the score (measure modularity Q, meanEdgeLength, task performance).
114
+ - Logging: record per‑epoch focusScore distributions, sampled regions, and before/after deltas for node/edge counts so causal effects are traceable.
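
For the online normalization mentioned in the first note, one option is an exponentially weighted mean and variance followed by a logistic squash. The sketch below is an assumption-laden illustration rather than the planned implementation: `createEmaNormalizer` is a hypothetical name, and the logistic mapping is just one way to land in [0, 1]; a median/MAD variant would need a windowed buffer instead.

```ts
/** Tracks an exponentially weighted mean/variance and maps values into [0, 1]. */
function createEmaNormalizer(alpha: number) {
  let mean = 0;
  let variance = 0;
  let initialized = false;
  return {
    /** Fold one observation into the running statistics. */
    update(value: number): void {
      if (!initialized) {
        mean = value;
        variance = 0;
        initialized = true;
        return;
      }
      const delta = value - mean;
      mean += alpha * delta;
      // Standard incremental form of the exponentially weighted variance.
      variance = (1 - alpha) * (variance + alpha * delta * delta);
    },
    /** Logistic squash of the z-score; returns 0.5 until there is spread to measure. */
    normalize(value: number): number {
      const std = Math.sqrt(variance);
      if (!initialized || std === 0) return 0.5;
      const z = (value - mean) / std;
      return 1 / (1 + Math.exp(-z));
    },
  };
}
```

One normalizer per signal (errShare, utilization, and so on) keeps the scales independent; a smaller `alpha` gives a longer memory and therefore scores that react more slowly to shifts between training phases.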
115
+
116
+ Example sketch (pseudo)
184
117
 
185
118
  ```ts
186
- { kind: 'differentiate', params: { region: 'motor', chooseActivation: ['tanh','relu'], plasticity: 'hebbian_fast' } }
187
- Objectives:
188
- - Replace exhaustive direct edge enumeration with procedural generation via CPPNs to enable large coordinate substrates.
189
- - Introduce controllable sparsity & pattern regularity.
190
- - Maintain cache for adjacency & weight pattern reuse across evaluations.
191
-
192
- Key Tasks:
193
- 1. `CPPNGene` definition: layers spec, activations list, weights TypedArray, innovation id.
194
- 2. Implement forward evaluator (pure function, no dynamic allocation in hot loop).
195
- 3. Candidate pair sampling strategy: configurable (all pairs up to dimension bound; radius-limited; focus-guided subset placeholder for later).
196
- 4. Connectivity decision: compute outputs (weight, mask probability, optional metadata channels), threshold to realize edge.
197
- 5. Weight scaling & optional normalization (e.g., fan-in scaling) hooks.
198
- 6. Adjacency cache keyed by (genotype structural hash, substrate hash, CPPN hash, threshold) → returns list of (src,dst,weight,meta).
199
- 7. Cache invalidation on CPPN mutation or substrate change.
200
- 8. Benchmark harness: measure generation vs baseline direct full connect for moderate size (e.g., 2k×2k potential pairs pruned to <5%).
201
- 9. Tests: symmetry invariance for mirrored coordinates; deterministic adjacency ordering; threshold monotonicity (a higher threshold yields a subset of the lower-threshold edges).
202
-
203
- Metrics & Exit Criteria:
204
- - Indirect build time ≤ 1.5× baseline for medium nets (document exact ratio & hardware).
205
- - Memory footprint: adjacency cache reuses buffers; additional overhead per realized connection < +8 bytes vs baseline.
206
-
207
- Risks & Mitigations:
208
- - Risk: Cache blowup for many thresholds → LRU eviction & size cap.
209
- - Risk: Floating mismatch causing non-determinism → fixed precision rounding for CPPN inputs.
210
-
211
- Deferred:
212
- - Multi-CPPN stacking & plasticity channel usage (Phase 5/7).
213
-
214
- Acceptance:
215
- - Patterns reproducible; scaling metrics logged.
216
- Describes hierarchical assembly as recursive, scope‑stacked rule evaluation: a parent rule can spawn sub‑scopes whose local coordinate frames and probability adjustments produce self‑similar yet diversified descendants. This generates fractal‑like scaling (band → stripe cluster → columnar macro‑module) while preserving traceability (each object carries an ancestry chain). The result is exponential phenotype expressivity from logarithmic genotype growth.
217
- Rules can recursively introduce *proto‑modules* which themselves run a localized rule subset (mini developmental pass) enabling fractal expansion without a huge flat genome.
218
-
219
- Objectives:
220
- - Enable intra‑generation structural adjustments driven by live telemetry.
221
- - Enforce memory and sparsity budgets dynamically.
222
- - Provide consistent event sequencing & transactional rebuild semantics.
223
-
224
- Key Tasks:
225
- 1. Metrics collectors: per module (activity mean/var, utilization, contribution proxy) updated post‑epoch.
226
- 2. Event dispatcher: `maybeMorph(event: MorphEvent, metrics)` gating by cooldown & budget state.
227
- 3. Actions:
228
- - `activity_expand`: local replicate path or add focused edges (calls CPPN subset eval or direct small add).
229
- - `low_contrib_prune`: integrate existing pruning API with module tags; maintain target sparsity.
230
- - `module_split`: duplicate module nodes + incident edges; reassign subset based on coordinate partition.
231
- 4. Transaction layer: queue structural edits, apply in batch, rebuild slabs if size changed; reuse pools.
232
- 5. Budget enforcement: connection/node hard caps; forced prune order by (contribution score ↑ age).
233
- 6. Trace logging: append event entries with before/after counts & focus scores.
234
- 7. Tests: deterministic morph under fixed metrics stream; rollback safety test (simulate failed morph then revert).
235
-
236
- Metrics & Exit Criteria:
237
- - Morph hook overhead (disabled) <1% runtime; (enabled idle) <5%.
238
- - No memory leak (object pool size returns after prune cycles).
239
-
240
- Risks:
241
- - Concurrent training modifications → apply morph only between batches / epochs.
242
- - Cascading rebuild thrash → cooldown period & action quota per event.
243
-
244
- Deferred:
245
- - Focus scoring integration (Phase 7) beyond baseline metrics.
246
-
247
- Acceptance:
248
- - Structural edits validated by tests; performance overhead documented.
249
- | HyperNEAT | CPPN encodes connectivity over geometric substrate | CPPN(s) produce weight *and* activation / plasticity hints; substrate may be multi-dimensional |
250
- | ES-HyperNEAT | Adaptive sampling of large substrates | Deferred generation: only sample connection candidates near active regions or flagged by focus metrics |
251
- | Evo-Devo (Dev. Biology) | Growth via rules, differentiation, symmetry | Rule engine with execution probabilities & priorities |
252
- Objectives:
253
- - Add activity‑dependent micro‑update channel (Hebbian / anti‑Hebbian) plus optional decay for transient potentiation.
254
- - Guarantee negligible overhead when disabled.
255
-
256
- Key Tasks:
257
- 1. Extend `Connection` with optional side buffers (e.g., `trace`, `plasticAccum`), allocated only when enabled.
258
- 2. Implement update rule: `Δw = η * pre * post - λ * w` (configurable terms) executed post forward pass.
259
- 3. Config gating: early return fast path; ensure branch predictability.
260
- 4. Reset semantics on connection pooling release.
261
- 5. Integrate with morphogenesis (optional): plasticity metrics feed utilization / growth decisions.
262
- 6. Tests: numerical validation on toy 2‑node network; performance micro‑benchmark toggling plasticity.
263
-
264
- Metrics & Exit Criteria:
265
- - Overhead disabled <1%; enabled small net <10% additional time (doc exact numbers).
266
- - Weight drift stable (bounded by decay) over long run test.
267
-
268
- Risks:
269
- - Accidental interference with gradient updates → apply plasticity adjustments after optimizer step or in dedicated buffer.
270
-
271
- Deferred:
272
- - Heterosynaptic or triplet rules; plasticity-driven structural triggers (future extension).
273
-
274
- Acceptance:
275
- - Correctness + performance tests pass; documentation includes usage example.
276
- ## 3. Lifecycle Timeline (Educational Walkthrough)
119
+ const scores = regions.map((r) =>
120
+ computeFocusScore(r, metrics, geno.wiringPreferences)
121
+ );
122
+ const probs = softmax(scores, config.focusTemperature);
123
+ const targets = sampleWithoutReplacement(
124
+ regions,
125
+ probs,
126
+ config.maxRegionsPerEpoch
127
+ );
128
+ for (const t of targets) {
129
+ if (!withinBudgets(t)) continue;
130
+ applyLocalEditSafely(net, geno, t);
131
+ }
132
+ ```
133
+
134
+ This focused allocation framework keeps Hyper MorphoNEAT’s evolutionary pressure intentional and measurable: edits are data‑driven, budget‑bounded, reproducible, and amenable to systematic ablation studies.
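
The sampling loop above calls `softmax` and `sampleWithoutReplacement` without defining them. A minimal sketch of both follows, assuming a temperature-scaled softmax and an injectable random source; the signatures, the `Rng` type, and the round-off fallback are illustrative assumptions, not the package's API.

```ts
// Hypothetical random source returning a float in [0, 1); inject a seeded one for reproducibility.
type Rng = () => number;

/** Temperature-scaled softmax; a lower temperature concentrates mass on the top scores. */
function softmax(scores: number[], temperature = 1): number[] {
  const t = Math.max(temperature, 1e-9); // guard against division by zero
  const max = Math.max(...scores); // subtract the max for numerical stability
  const exps = scores.map((s) => Math.exp((s - max) / t));
  const sum = exps.reduce((acc, e) => acc + e, 0);
  return exps.map((e) => e / sum);
}

/** Draws up to k distinct items; each draw is proportional to the remaining probability mass. */
function sampleWithoutReplacement<T>(
  items: T[],
  probs: number[],
  k: number,
  rng: Rng = Math.random
): T[] {
  const pool = items.map((item, i) => ({ item, p: probs[i] ?? 0 }));
  const picked: T[] = [];
  while (picked.length < k && pool.length > 0) {
    const total = pool.reduce((acc, e) => acc + e.p, 0);
    if (total <= 0) break; // nothing left with positive mass
    let r = rng() * total;
    let idx = pool.findIndex((e) => (r -= e.p) < 0);
    if (idx < 0) idx = pool.length - 1; // fallback for floating-point round-off
    const chosen = pool.splice(idx, 1)[0];
    if (chosen !== undefined) picked.push(chosen.item);
  }
  return picked;
}
```

Passing a seeded `rng` keeps region selection reproducible across runs, which matches the reproducibility claim made for the focus framework.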
135
+
136
+ ## Lifecycle Timeline
137
+
277
138
  Supplies a stage‑wise state machine perspective enabling contributors to reason about invariants (e.g., phenotype immutability within a training epoch) and side‑effects (cache invalidation boundaries). Each numbered stage defines clear preconditions and postconditions, reducing coupling: reproduction only manipulates symbolic genotype; rebuild regenerates material state; morphogenesis edits runtime graph under strict budget guards. This segregation eases targeted profiling and correctness auditing.
278
139
 
279
- Objectives:
280
- - Provide structured introspection (metrics snapshots, developmental trace, module lineage) with opt‑in overhead.
281
- - Expose stable JSON schemas for external tooling.
282
-
283
- Key Tasks:
284
- 1. Metrics aggregator API: `collectModuleMetrics(net)` returns typed array views / plain objects.
285
- 2. Developmental trace structure: sequence of events with timestamps, hashes, delta summaries.
286
- 3. Export APIs: `exportTrace()`, `exportModuleReport()`, embed subset into ONNX metadata (if feature present) under reserved namespace.
287
- 4. Lazy allocation design: allocate buffers only on first enable call; free or mark reusable when disabled.
288
- 5. Tests: allocation counts via instrumentation harness; schema validation against JSON schema file.
289
-
290
- Metrics & Exit Criteria:
291
- - Disabled path adds zero additional retained objects (heap diff baseline).
292
- - Trace export round‑trip size overhead <5% of serialized genotype for 100 events (document).
293
-
294
- Risks:
295
- - Telemetry misuse in hot loop causing overhead → guidance docs + runtime warning when sampling frequency too high.
296
-
297
- Deferred:
298
- - Live streaming / websocket visualizer.
299
-
300
- Acceptance:
301
- - Introspection APIs stable & documented; performance invariants verified.
302
- | 2 | Development Pass (Static) | Building phenotype (Phase 1–2 implementation) | Apply rules in priority order (replicate, symmetry, hierarchy, differentiate) to expand virtual node set & module tags | Interim developmental trace entries |
303
- | 3 | Indirect Connectivity Synthesis | After rules applied; before instantiating physical network | Evaluate CPPN(s) selectively over candidate coordinate pairs (sparse sampling) to decide edges + initial weights + metadata | Edge candidate list (lazy) |
304
- | 4 | Phenotype Materialization | Edge candidates + nodes prepared | Instantiate pooled `Node` / `Connection` objects; pack into slabs; apply activation/plasticity traits | Runtime `Network` |
305
- Objectives:
306
- - Introduce full hyper‑aware mutation & recombination operators with innovation tracking and speciation distance extensions.
307
- - Maintain evolutionary performance (throughput) within acceptable overhead bounds.
308
-
309
- Key Tasks:
310
- 1. Mutation operators: add/remove rule, tweak rule params, adjust priority/probability, CPPN weight perturb, CPPN topology add‑layer/add‑node, substrate scale/rotation adjust.
311
- 2. Crossover implementation: multi‑family alignment (rules, CPPNs, substrate modifiers) with blending modes & conflict resolution.
312
- 3. Innovation tracking: extend registry to assign IDs to new rule signatures & CPPN topology changes.
313
- 4. Speciation metric: weighted combination of (rule hash edit distance, CPPN topology distance, averaged CPPN weight cosine distance, substrate modifier delta) + existing compatibility coefficients.
314
- 5. Population loop integration: hyper mode branch using new reproduction path; preserve classic mode unaffected.
315
- 6. Cache reuse: precompute parent phenotype hashes, reuse adjacency caches when structural invariants retained.
316
- 7. Reproduction tests & correctness: determinism under seeded RNG, invalid genome rejection, budget enforcement.
317
- 8. Performance test: evolving small population (e.g., 50) for N generations measuring evals/sec vs baseline NEAT.
318
-
319
- Metrics & Exit Criteria:
320
- - Throughput ≥ 60% of baseline NEAT at comparable population sizes (document conditions).
321
- - Speciation maintains diversity (no single species >80% population after burn‑in unless directed).
322
- - Crossover failure (invalid child) rate <5% (above triggers validation tuning).
323
-
324
- Risks:
325
- - Overly punitive distance causing species fragmentation → dynamic coefficient adjustment algorithm.
326
- - Genome bloat through blended duplication → complexity budgets enforced pre‑speciation assignment.
327
-
328
- Deferred:
329
- - Multi‑parent crossover; adaptive crossover operator selection.
330
-
331
- Acceptance:
332
- - End‑to‑end evolutionary run stable; diversity & performance metrics recorded.
140
+ | Build Step | Implemented In | Description | Artifacts |
141
+ | ---------------------------- | -------------- | ---------------------------------------------------------------------------------------------------------------------- | ----------------------------------- |
142
+ | 1. Static Development Pass | Phase 2 | Apply rules in priority order (replicate, symmetry, hierarchy, differentiate) to expand virtual node set & module tags | Interim developmental trace entries |
143
+ | 2. Indirect Connectivity | Phase 3 | Evaluate CPPN(s) selectively over candidate coordinate pairs (sparse sampling) to decide edges + initial weights | Edge candidate list (lazy) |
144
+ | 3. Phenotype Materialization | Phase 1-3 | Instantiate pooled `Node` / `Connection` objects; pack into slabs; apply activation/plasticity traits | Runtime `Network` |
145
+
146
+ Objectives:
147
+
148
+ - Introduce full hyper‑aware mutation & recombination operators with innovation tracking and speciation distance extensions.
149
+ - Maintain evolutionary performance (throughput) within acceptable overhead bounds.
150
+
151
+ Key Tasks:
152
+
153
+ 1. Mutation operators: add/remove rule, tweak rule params, adjust priority/probability, CPPN weight perturb, CPPN topology add‑layer/add‑node, substrate scale/rotation adjust.
154
+ 2. Crossover implementation: multi‑family alignment (rules, CPPNs, substrate modifiers) with blending modes & conflict resolution.
155
+ 3. Innovation tracking: extend registry to assign IDs to new rule signatures & CPPN topology changes.
156
+ 4. Speciation metric: weighted combination of (rule hash edit distance, CPPN topology distance, averaged CPPN weight cosine distance, substrate modifier delta) + existing compatibility coefficients.
157
+ 5. Population loop integration: hyper mode branch using new reproduction path; preserve classic mode unaffected.
158
+ 6. Cache reuse: precompute parent phenotype hashes, reuse adjacency caches when structural invariants retained.
159
+ 7. Reproduction tests & correctness: determinism under seeded RNG, invalid genome rejection, budget enforcement.
160
+ 8. Performance test: evolving small population (e.g., 50) for N generations measuring evals/sec vs baseline NEAT.
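
The speciation metric in task 4 can be sketched as a weighted sum of per-family distances. This is a rough illustration under stated assumptions: the component names, the `DistanceWeights` shape, and treating a zero-length weight vector as distance 0 are all hypothetical choices, and the rule edit distance and topology distance are taken as precomputed inputs.

```ts
// Hypothetical per-family distance components, each computed elsewhere.
interface DistanceParts {
  ruleEdit: number; // edit distance between rule hash sequences
  topology: number; // CPPN topology distance
  weightCosine: number; // averaged CPPN weight cosine distance
  substrateDelta: number; // substrate modifier delta
}

// Hypothetical coefficients, analogous to NEAT's compatibility coefficients.
interface DistanceWeights {
  rules: number;
  topology: number;
  weights: number;
  substrate: number;
}

/** Cosine distance (1 - cosine similarity) over the shared prefix of two weight vectors. */
function cosineDistance(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // assumption: degenerate vectors count as identical
  return 1 - dot / Math.sqrt(normA * normB);
}

/** Weighted combination of the four hyper-genotype distance components. */
function hyperDistance(parts: DistanceParts, w: DistanceWeights): number {
  return (
    w.rules * parts.ruleEdit +
    w.topology * parts.topology +
    w.weights * parts.weightCosine +
    w.substrate * parts.substrateDelta
  );
}
```

Keeping the coefficients in one object makes the dynamic coefficient adjustment mentioned under Risks a matter of updating a single value per generation.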
161
+
162
+ Metrics & Exit Criteria:
163
+
164
+ - Throughput ≥ 60% of baseline NEAT at comparable population sizes (document conditions).
165
+ - Speciation maintains diversity (no single species >80% population after burn‑in unless directed).
166
+ - Crossover failure (invalid child) rate <5% (above triggers validation tuning).
167
+
168
+ Risks:
169
+
170
+ - Overly punitive distance causing species fragmentation → dynamic coefficient adjustment algorithm.
171
+ - Genome bloat through blended duplication → complexity budgets enforced pre‑speciation assignment.
172
+
173
+ Deferred:
174
+
175
+ - Multi‑parent crossover; adaptive crossover operator selection.
176
+
177
+ Acceptance:
178
+
179
+ - End‑to‑end evolutionary run stable; diversity & performance metrics recorded.
333
180
 
334
181
  Unlike classic NEAT where crossover aligns by innovation numbers for direct connection genes, Hyper MorphoNEAT must align heterogeneous gene families:
335
- 1. Developmental Rules
336
- Objectives:
337
- - Demonstrate linear (or sublinear) memory growth with active connections and acceptable build/morph latencies at large scale.
338
- - Validate absence of memory leaks under churn (grow/prune cycles).
339
- - Produce public guidance (tuning flags, thresholds) prior to general availability.
340
-
341
- Key Tasks:
342
- 1. Synthetic benchmark suite: varying substrate sizes, sparsity thresholds, morph cycle frequencies.
343
- 2. Long‑run churn test: repeated morph cycles (e.g., 10k iterations) measuring pool high‑water marks & GC stabilized memory.
344
- 3. Profiling: identify hottest functions (CPPN eval, rule pass) and micro‑optimize (loop unrolling, typed array reusage) if >30% total time each.
345
- 4. Memory accounting: compute bytes/active connection (including pools) vs baseline; break down into slabs, metadata, plasticity extras.
346
- 5. Regression thresholds integrated into CI (fail if build time or memory exceeds stored baseline by >10%).
347
- 6. Documentation: produce scaling appendix with empirical curves (edges vs memory, edges vs build time).
348
-
349
- Metrics & Exit Criteria:
350
- - Peak bytes per active connection within target (establish numeric after Phase 5 measurement).
351
- - No upward trend in pool size after churn plateau (slope ~0 over final 20% iterations).
352
- - Build + morph latency percentiles (p95) documented and acceptable for release goals.
353
-
354
- Risks:
355
- - Benchmark instability across environments → pin Node.js version & isolate CPU scaling (single thread) for CI.
356
- - Hidden fragmentation in pooled arrays → implement periodic compaction or sentinel leak detection.
357
-
358
- Deferred:
359
- - GPU/WebGPU fast path; advanced compression.
360
-
361
- Acceptance:
362
- - Scaling report published; CI thresholds enforced; release readiness sign‑off.
182
+
183
+ 1. Developmental Rules
184
+ Objectives:
+
+ - Demonstrate linear (or sublinear) memory growth with active connections and acceptable build/morph latencies at large scale.
+ - Validate absence of memory leaks under churn (grow/prune cycles).
+ - Produce public guidance (tuning flags, thresholds) prior to general availability.
185
+
186
+ Key Tasks:
187
+
188
+ 1. Synthetic benchmark suite: varying substrate sizes, sparsity thresholds, morph cycle frequencies.
189
+ 2. Long‑run churn test: repeated morph cycles (e.g., 10k iterations) measuring pool high‑water marks & GC stabilized memory.
190
+ 3. Profiling: identify hottest functions (CPPN eval, rule pass) and micro‑optimize (loop unrolling, typed array reusage) if >30% total time each.
191
+ 4. Memory accounting: compute bytes/active connection (including pools) vs baseline; break down into slabs, metadata, plasticity extras.
192
+ 5. Regression thresholds integrated into CI (fail if build time or memory exceeds stored baseline by >10%).
193
+ 6. Documentation: produce scaling appendix with empirical curves (edges vs memory, edges vs build time).
194
+
195
+ Metrics & Exit Criteria:
196
+
197
+ - Peak bytes per active connection within target (establish numeric after Phase 5 measurement).
198
+ - No upward trend in pool size after churn plateau (slope ~0 over final 20% iterations).
199
+ - Build + morph latency percentiles (p95) documented and acceptable for release goals.
200
+
201
+ Risks:
202
+
203
+ - Benchmark instability across environments → pin Node.js version & isolate CPU scaling (single thread) for CI.
204
+ - Hidden fragmentation in pooled arrays → implement periodic compaction or sentinel leak detection.
205
+
206
+ Deferred:
207
+
208
+ - GPU/WebGPU fast path; advanced compression.
209
+
210
+ Acceptance:
211
+
212
+ - Scaling report published; CI thresholds enforced; release readiness sign‑off.
213
+
363
214
  |-------------|------------------------|---------------------|-----------------------|
364
- | Rule | Innovation ID or (kind + canonical param signature hash) | Same kind & equal normalized params | Uniform pick or parameter-wise blend |
365
- | CPPN | Innovation ID (per layer addition) + topology hash | Identical layer counts + activation sequence | Weight crossover (per-weight uniform or arithmetic) |
215
+ | Rule | Innovation ID or (kind + canonical param signature hash) | Same kind & equal normalized params | Uniform pick or parameter-wise blend |
216
+ | CPPN | Innovation ID (per layer addition) + topology hash | Identical layer counts + activation sequence | Weight crossover (per-weight uniform or arithmetic) |
366
217
  | Substrate Modifier | Modifier type (e.g., scale, rotation) + axis | Same type & axis | Average numeric params; random tie-break on enums |
367
- | Module Tag | Module ID (if shared ancestry) | ID equality | Inherit or merge (union of roles) |
218
+ | Module Tag | Module ID (if shared ancestry) | ID equality | Inherit or merge (union of roles) |
368
219
 
369
220
  Unmatched (disjoint / excess) genes: inclusion probability biased toward fitter parent (like NEAT) but capped to avoid bloat.
370
221
 
371
- #### 3.1.2 Rule Parameter Blending
372
- Argues for continuous parameter interpolation to buffer offspring against abrupt fitness cliffs introduced by discrete rule parameter jumps (e.g., replicate.times from 1→3). Blending supports *semantic continuity*: small α adjustments generate proportionally moderate structural differences after development, smoothing the adaptive landscape and improving combined efficacy of mutation + crossover.
373
- For numeric params we can apply *biased arithmetic crossover* (BAC):
374
- ```
222
+ #### Rule Parameter Blending
375
223
 
224
+ Argues for continuous parameter interpolation to buffer offspring against abrupt fitness cliffs introduced by discrete rule parameter jumps (e.g., replicate.times from 1→3). Blending supports _semantic continuity_: small α adjustments generate proportionally moderate structural differences after development, smoothing the adaptive landscape and improving combined efficacy of mutation + crossover.
225
+ For numeric params we can apply _biased arithmetic crossover_ (BAC):
226
+
227
+ ```
376
228
  childParam = α * paramA + (1-α) * paramB, α ~ U(0,1) (optionally biased toward fitter)
229
+ ```
377
230
 
378
- ````
379
231
  Boolean / categorical: coin flip or frequency-based if more than 2 parents (future multi-parent reproduction).
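
A minimal sketch of the biased arithmetic crossover described above, assuming fitness values are available for both parents. The `bias` parameter, the linear pull of α toward the fitter parent, and the rounding of integer-valued parameters such as `replicate.times` are illustrative assumptions rather than the package's behavior.

```ts
// Hypothetical random source returning a float in [0, 1).
type Rng = () => number;

/**
 * Blends one numeric rule parameter: child = α * paramA + (1 - α) * paramB.
 * With bias in (0, 1], α is pulled toward the fitter parent's side.
 */
function blendNumeric(
  paramA: number,
  paramB: number,
  fitnessA: number,
  fitnessB: number,
  bias = 0,
  rng: Rng = Math.random
): number {
  let alpha = rng(); // α ~ U(0, 1)
  if (bias > 0 && fitnessA !== fitnessB) {
    const target = fitnessA > fitnessB ? 1 : 0; // α = 1 keeps A, α = 0 keeps B
    alpha += bias * (target - alpha);
  }
  return alpha * paramA + (1 - alpha) * paramB;
}

/** Integer-valued parameters are rounded after blending (an assumption of this sketch). */
function blendInteger(
  paramA: number,
  paramB: number,
  fitnessA: number,
  fitnessB: number,
  bias = 0,
  rng: Rng = Math.random
): number {
  return Math.round(blendNumeric(paramA, paramB, fitnessA, fitnessB, bias, rng));
}
```

With `bias = 0` this reduces to plain arithmetic crossover; with `bias = 1` the child copies the fitter parent's value outright.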
380
232
 
381
- #### 3.1.3 CPPN Weight Crossover
233
+ #### CPPN Weight Crossover
234
+
382
235
  Characterizes crossover operator choice as shaping the exploration–stability frontier: uniform promotes high exploratory variance (diversity of micro‑patterns), arithmetic maintains macro‑structural coherence, and SBX simulates sampled interpolation with controllable spread parameter η. Operator selection can be meta‑optimized via telemetry feedback (tracking post‑crossover disruption metrics) to adapt exploration pressure over evolutionary time.
383
236
  Mode options (configurable):
384
- * `uniform`: per weight pick A or B.
385
- * `arithmetic`: `w_child = 0.5*(wA+wB)` with occasional noise injection.
386
- * `simulated_binary_crossover (SBX)`: for more exploratory offspring.
387
237
 
388
- #### 3.1.4 Innovation & History Tracking
238
+ - `uniform`: per weight pick A or B.
239
+ - `arithmetic`: `w_child = 0.5*(wA+wB)` with occasional noise injection.
240
+ - `simulated_binary_crossover (SBX)`: for more exploratory offspring.
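
The three modes above can be sketched as a single per-weight function. This is an illustrative sketch: the function name, the `Mode` labels, and the default spread `eta = 2` are assumptions, the arithmetic mode omits the occasional noise injection, and the SBX branch keeps only the first of the two children that SBX normally produces.

```ts
// Hypothetical random source returning a float in [0, 1).
type Rng = () => number;
type Mode = "uniform" | "arithmetic" | "sbx";

/** Crosses two aligned CPPN weight vectors of equal length. */
function crossoverWeights(
  wA: number[],
  wB: number[],
  mode: Mode,
  rng: Rng = Math.random,
  eta = 2
): number[] {
  if (wA.length !== wB.length) {
    throw new Error("weight vectors must have equal length");
  }
  return wA.map((a, i) => {
    const b = wB[i] ?? a;
    if (mode === "uniform") {
      return rng() < 0.5 ? a : b; // per-weight pick of A or B
    }
    if (mode === "arithmetic") {
      return 0.5 * (a + b); // midpoint; noise injection omitted in this sketch
    }
    // SBX: the spread factor beta is drawn so children cluster near the parents
    // for large eta and spread out for small eta.
    const u = rng();
    const beta =
      u <= 0.5
        ? Math.pow(2 * u, 1 / (eta + 1))
        : Math.pow(1 / (2 * (1 - u)), 1 / (eta + 1));
    return 0.5 * ((1 + beta) * a + (1 - beta) * b);
  });
}
```

A larger `eta` keeps SBX offspring close to their parents, so it can serve as the knob for the exploration pressure discussed above.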
241
+
242
+ #### Innovation & History Tracking
243
+
389
244
  Generalizes innovation numbers beyond direct connection genes to heterogeneous developmental and indirect encoding elements. This preserves historical distance metrics used in speciation clustering, preventing premature mixing of distinct morphogenetic strategies. Canonical hashing of normalized rule parameters minimizes spurious innovation inflation while allowing genuinely novel composite configurations to register distinct lineage identity.
390
245
  We extend innovation bookkeeping: every new blended rule or CPPN structure receives a fresh innovation id; however if two parents share the same canonical hash we preserve the id to aid speciation distance continuity.
391
246
 
392
- #### 3.1.5 Post-Crossover Normalization
247
+ #### Post-Crossover Normalization
248
+
393
249
  Details a sanitation pass ensuring the offspring genotype respects global complexity budgets and semantic minimality. Redundant or shadowed rules (those whose effects are subsumed by a higher‑priority equivalent) are culled; priorities are rebalanced to avoid starvation of late but essential rule classes; and heuristic impact estimates (e.g., historical contribution deltas) guide which excess elements to discard when budget pressure is high.
394
250
  After merging:
395
- * Re-sort rules by (priority → probability → innovation).
396
- * Deduplicate semantically equivalent rules (same normalized hash) keeping the one from fitter parent.
397
- * Enforce complexity budget (rule count <= configurable max) dropping lowest impact (estimated contribution heuristic) first.
398
251
 
399
- #### 3.1.6 Offspring Validation Pass
252
+ - Re-sort rules by (priority → probability → innovation).
253
+ - Deduplicate semantically equivalent rules (same normalized hash) keeping the one from fitter parent.
254
+ - Enforce complexity budget (rule count <= configurable max) dropping lowest impact (estimated contribution heuristic) first.
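
The three normalization steps can be sketched as follows. The `Rule` fields (`hash`, `fromFitter`, `impact`) and the sort directions (lower priority value first, higher probability first, older innovation first) are assumptions made for illustration. The sketch deduplicates and trims before the final sort, which yields the same ordering as sorting first.

```ts
// Hypothetical merged-rule record; field names are illustrative.
interface Rule {
  hash: string; // normalized semantic hash
  priority: number;
  probability: number;
  innovation: number;
  fromFitter: boolean; // true if inherited from the fitter parent
  impact: number; // estimated contribution heuristic
}

function normalizeRules(rules: Rule[], maxRules: number): Rule[] {
  // 1. Deduplicate by normalized hash, preferring the fitter parent's copy.
  const byHash = new Map<string, Rule>();
  for (const rule of rules) {
    const seen = byHash.get(rule.hash);
    if (!seen || (rule.fromFitter && !seen.fromFitter)) {
      byHash.set(rule.hash, rule);
    }
  }
  let kept = Array.from(byHash.values());

  // 2. Enforce the complexity budget, dropping the lowest-impact rules first.
  if (kept.length > maxRules) {
    kept = kept.sort((a, b) => b.impact - a.impact).slice(0, maxRules);
  }

  // 3. Re-sort by priority, then probability, then innovation.
  return kept.sort(
    (a, b) =>
      a.priority - b.priority ||
      b.probability - a.probability ||
      a.innovation - b.innovation
  );
}
```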
255
+
256
+ #### Offspring Validation Pass
257
+
400
258
  Defines a lightweight static checking phase performing referential integrity (regions, module tags), feasibility (symmetry without axis definition), and budget alignment validation before incurring allocation costs. Early rejection reduces wasted CPU cycles and prevents subtle runtime invariants (e.g., slab index density assumptions) from being violated in downstream materialization.
401
- Run a *dry build* (no object instantiation) to ensure no invalid combinations (e.g., differentiate targets region that no longer exists after rule pruning). Invalid references are either re-mapped (if a similar region persists) or rule disabled.
259
+ Run a _dry build_ (no object instantiation) to ensure no invalid combinations (e.g., differentiate targets region that no longer exists after rule pruning). Invalid references are either re-mapped (if a similar region persists) or rule disabled.
260
+
261
+ #### Pseudocode
402
262
 
403
- #### 3.1.7 Pseudocode
404
263
  Supplies a reference pseudocode outlining data flow and decision ordering so reviewers can validate conceptual correctness (alignment before normalization; budget enforcement post‑merge) independent of TypeScript specifics. This separation accelerates design iteration and lowers risk of misimplementation during incremental PRs.
264
+
405
265
  ```ts
406
- function crossoverHyperGenotype(a: HyperGenotype, b: HyperGenotype, cfg: CrossCfg): HyperGenotype {
266
+ function crossoverHyperGenotype(
267
+ a: HyperGenotype,
268
+ b: HyperGenotype,
269
+ cfg: CrossCfg
270
+ ): HyperGenotype {
407
271
  const child: HyperGenotype = seedChildBase(a, b);
408
272
  // 1. Align rules
409
273
  const aligned = alignRules(a.rules, b.rules);
410
274
  for (const pair of aligned) {
411
275
  if (pair.match) child.rules.push(blendRule(pair.a, pair.b, cfg));
412
- else child.rules.push(selectDisjoint(pair, fitnessBias(a,b)));
276
+ else child.rules.push(selectDisjoint(pair, fitnessBias(a, b)));
413
277
  }
414
278
  // 2. Merge CPPNs
415
279
  const cppnPairs = alignCPPNs(a.cppns, b.cppns);
416
- child.cppns = cppnPairs.map(p => p.match ? crossoverCPPN(p.a, p.b, cfg) : preferFitter(p, a, b));
280
+ child.cppns = cppnPairs.map((p) =>
281
+ p.match ? crossoverCPPN(p.a, p.b, cfg) : preferFitter(p, a, b)
282
+ );
417
283
  // 3. Substrate modifiers
418
284
  child.substrateSpec = mergeSubstrate(a.substrateSpec, b.substrateSpec, cfg);
419
285
  // 4. Clean up
@@ -422,31 +288,32 @@ function crossoverHyperGenotype(a: HyperGenotype, b: HyperGenotype, cfg: CrossCf
422
288
  validateChild(child);
423
289
  return child;
424
290
  }
425
- ````
291
+ ```
426
292
 
427
- ### 3.2 Educational Comparison: Classic NEAT vs Hyper MorphoNEAT Reproduction
293
+ ### Educational Comparison: Classic NEAT vs Hyper MorphoNEAT Reproduction
428
294
 
429
295
  Synthesizes the structural and informational expansion introduced by indirect + developmental encodings, clarifying why increased crossover complexity yields disproportionate expressive gains (large structural motifs negotiated at symbolic level). It contextualizes the trade‑off: added bookkeeping overhead versus potential for emergent macro‑regularities and smoother scaling to high node counts.
430
- | Aspect | Classic NEAT Crossover | Hyper MorphoNEAT Crossover |
431
- |---------------------|-----------------------------------------------|-----------------------------------------------------------------|
432
- | Alignment Basis | Innovation numbers (node/connection genes) | Multi-family: rules, CPPNs, substrate modifiers, tags |
433
- | Genome Size Control | Excess/disjoint from fitter | Budgeted + heuristic pruning post-merge |
434
- | Expressivity Change | Structural genes directly swapped | Developmental programs blended (indirect structural consequences)|
435
- | Weight Handling | A/B pick for matching connection weights | Mode selectable: uniform / arithmetic / SBX for CPPN weights |
436
- | Phenotype Rebuild | Direct reconstitution | Regeneration via rule + CPPN re-execution (cached) |
437
296
 
438
- ### 3.3 Reproduction Placement in Timeline
297
+ | Aspect | Classic NEAT Crossover | Hyper MorphoNEAT Crossover |
298
+ | ------------------- | ------------------------------------------ | ----------------------------------------------------------------- |
299
+ | Alignment Basis | Innovation numbers (node/connection genes) | Multi-family: rules, CPPNs, substrate modifiers, tags |
300
+ | Genome Size Control | Excess/disjoint from fitter | Budgeted + heuristic pruning post-merge |
301
+ | Expressivity Change | Structural genes directly swapped | Developmental programs blended (indirect structural consequences) |
302
+ | Weight Handling | A/B pick for matching connection weights | Mode selectable: uniform / arithmetic / SBX for CPPN weights |
303
+ | Phenotype Rebuild | Direct reconstitution | Regeneration via rule + CPPN re-execution (cached) |
304
+
305
+ ### Reproduction Placement in Timeline
439
306
 
440
307
  Explains that placing reproduction _before_ any growth/prune cycle in the offspring epoch guarantees a clean separation of heritable innovation and individual lifetime adaptation. This ordering preserves analytical decomposability: fitness deltas can be partitioned into genetic vs morphogenetic contribution without confounding carry‑over artifacts.
441
308
  Reproduction occurs _after_ selection and _before_ new morphogenesis-driven growth of the offspring. This preserves the principle that runtime morphogenesis acts on each individual's phenotype _post_ genetic inheritance, avoiding entangling heritable rules with ephemeral runtime adjustments.
442
309
 
443
- ### 3.4 Exported Genome Mix Example (Conceptual)
310
+ ### Exported Genome Mix Example (Conceptual)
444
311
 
445
312
  Demonstrates how overlapping parental rule sets synthesize into hybrid developmental trajectories: blended replication depth adjusts module proliferation rate, symmetry alignment ensures spatial coherence, and retained differentiation preserves functional specialization. The example concretely shows genotype‑level arithmetic producing qualitatively interpretable phenotype differences post‑development, reinforcing the utility of rule interpolation.
446
313
  Parent A (excerpt):
447
314
 
448
315
  ```
449
- Rules: [ replicate(times=2), symmetry(axis=y), differentiate(region=motor, act=tanh) ]
316
+ Rules: [ replicate(times=2), symmetry(axis=y), differentiate(region=motor, act='tanh') ]
450
317
  CPPN: topology hash H1, weights WA
451
318
  Substrate: dims=2, scale=1.0
452
319
  ```
@@ -462,18 +329,14 @@ Substrate: dims=2, scale=1.2 (stretched x-axis)
462
329
  Child (result):
463
330
 
464
331
  ```
465
- Rules: [ replicate(times=2 or 1→2 blended), symmetry(axis=y), hierarchy(levels=2), differentiate(region=motor, act=tanh) ]
332
+ Rules: [ replicate(times=2 or 1→2 blended), symmetry(axis=y), hierarchy(levels=2), differentiate(region=motor, act='tanh') ]
466
333
  CPPN: crossover(H1(WA,WB))
467
334
  Substrate: dims=2, scale ~1.1 (blended) with normalization
468
335
  ```
469
336
 
470
337
  Development then proceeds (rules executed, CPPN queried, phenotype materialized) producing a network that inherits broad symmetry + replication depth + motor differentiation.
471
338
 
472
- ---
473
-
474
- ---
475
-
476
- ## 4. Focusing Evolution Dynamically
339
+ ## Focusing Evolution Dynamically
477
340
 
478
341
  Frames focus scoring as a multi‑objective prioritization heuristic balancing exploitation (error attribution, gradient contribution) with exploration (novelty deficits, structural entropy). By converting heterogeneous signals into a scalar sampling weight, the system produces a soft, continuous pressure distribution that adapts as modules mature or regress, reducing manual tuning of per‑operator probabilities.
479
342
  We maintain per-module metrics:
@@ -481,7 +344,11 @@ We maintain per-module metrics:
481
344
  ```
482
345
  ModuleMetric = {
483
346
  id, errShare, actMean, actVar, age, lastMutationIter,
484
- contribGradient, noveltyScore, sparsity, utilization
347
+ contribGradient, noveltyScore, sparsity, utilization,
348
+ // wiring and modularity signals used to prefer compact/regular wiring
349
+ meanEdgeLength?: number,
350
+ interModuleEdgeRatio?: number,
351
+ modularityQ?: number
485
352
  }
486
353
  ```
487
354
 
@@ -493,7 +360,8 @@ function computeFocus(m: ModuleMetric) {
493
360
  0.35 * norm(m.errShare) +
494
361
  0.2 * (1 - norm(m.noveltyScore)) +
495
362
  0.25 * norm(m.contribGradient) +
496
- 0.2 * underUtilPenalty(m.utilization)
363
+ 0.15 * (1 - norm(m.modularityQ || 0)) + // prefer modules that increase modularity score
364
+ 0.15 * underUtilPenalty(m.utilization)
497
365
  );
498
366
  }
499
367
  ```
@@ -504,13 +372,131 @@ Modules chosen for:
504
372
  - Diversification if low novelty + moderate error.
505
373
  - Pruning if very low contribution & low utilization.
506
374
 
375
+ Notes:
376
+
377
+ - `meanEdgeLength` and `interModuleEdgeRatio` feed into focusScore and can increase pruning pressure on long or cross‑module links.
378
+ - `modularityQ` is used to prefer mutations/morphs that increase modular structure; it can also be used as a speciation axis.
379
+
507
380
  ---
508
381
 
509
- ## 5. Example Genotype Snippets
382
+ ## High-Level Goals
383
+
384
+ Links biologically inspired constructs (symmetry, differentiation, plasticity) to concrete engineering KPIs (sample efficiency, scaling curvature, memory per effective edge) to ensure aesthetic analogies are instrumented and falsifiable. This guards against ornamental complexity by demanding metric justification for each added mechanism.
385
+
386
+ 1. Add a compact Evo‑Devo genotype layer that can generate / regenerate phenotypic `Network` graphs deterministically.
387
+ 2. Support CPPN‑driven structural & weight pattern generation (indirect encoding) with caching for large substrates.
388
+ 3. Introduce morphogenesis (runtime growth/prune rules) integrated with existing pruning + connection pooling.
389
+ 4. Maintain or reduce memory per active connection via slab packing + sparsity while enabling growth to millions of edges incrementally.
390
+ 5. Provide introspection (modules, regions, developmental lineage) without heavy overhead when disabled.
391
+ 6. Provide wiring‑cost primitives and selection hooks so evolution and morphogenesis can prefer compact, modular, and regular wiring (configurable penalties and Pareto options).
392
+
393
+ ---
394
+
395
+ ## How to use these instructions
396
+
397
+ These instructions describe a step‑by‑step plan to implement the features of Hyper MorphoNEAT in a series of incremental, testable phases. Each section details a conceptual component, followed by a precise specification of the corresponding implementation step.
398
+
399
+ For each phase:
400
+
401
+ - Review the objectives and key tasks to understand the goals and technical requirements.
402
+ - Implement the changes in small, reviewable increments, following the specified order and guidelines.
403
+ - Validate correctness and performance at each step, using the provided acceptance criteria and tests.
404
+ - Update this plan as needed to reflect any changes or discoveries during implementation.
405
+
406
+ ### Policy & contribution notes (short)
407
+
408
+ For strict rules, automated validations, and contribution guidance see the canonical repository documents:
409
+
410
+ - `.github/copilot-instructions.md` — repo-specific contributor instructions and strict rules.
411
+ - `STYLEGUIDE.md` — coding/style/test conventions (tests: single-expect rule, naming, JSDoc requirements, etc.).
412
+
413
+ Summary (local): keep changes small and reviewable, gate hyper features behind a flag, and run the repo validation suite before merging (tests, lint, typecheck, build). Use the canonical files above as source-of-truth; do not duplicate policy here.
414
+
415
+ ### Minimal pseudocode example
416
+
417
+ To ground the conceptual pipeline in a concrete example, the following pseudocode illustrates the high-level function calls and data flow for building a phenotype from a genotype.
418
+
419
+ ```ts
420
+ function buildPhenotype(geno, seed) {
421
+ const substrate = layoutSubstrate(geno.substrateSpec);
422
+ const virtualNodes = applyRules(geno.rules, substrate, { seed });
423
+ const adjacency = evaluateCPPNSafe(
424
+ geno.cppns,
425
+ virtualNodes,
426
+ geno.wiringPreferences
427
+ );
428
+
429
+ return materializeNetwork(virtualNodes, adjacency, { poolReuse: true });
430
+ }
431
+ ```
432
+
433
+ ### Layer responsibilities
434
+
435
+ To maintain a clean separation of concerns, the system is divided into two primary layers: the Genotype and the Phenotype. This division is crucial for ensuring that the genetic representation remains compact and heritable, while the materialized network can be optimized for runtime performance.
436
+
437
+ 1. Genotype (persistent, small)
438
+
439
+ - Encodes rules, CPPN parameters, substrate spec, wiring preferences and version metadata.
440
+ - Must be immutable during a build pass; mutations create a new genotype object.
441
+ - Provides canonical serialization and deterministic hashing (order‑independent where appropriate).
442
+ - Small and cheap to copy/compare; used by evolutionary operators and CI checks.
443
+
444
+ 2. Phenotype (materialized, transient)
445
+
446
+ - Instantiates Node/Connection objects, activation buffers, plasticity accumulators and runtime telemetry.
447
+ - May be pooled and reused across builds; must be reconstructible from the genotype + seed.
448
+ - Holds transient performance state (running statistics, per‑connection traces) that is optional and lazy‑allocated.
449
+ - Subject to explicit budgets (maxNodes, maxEdges, memoryBudget) and reversible morph operations.
450
+
451
+ ### Key invariants & safety guarantees
452
+
453
+ To ensure robust and predictable behavior, the Hyper MorphoNEAT implementation must adhere to a set of key invariants and safety guarantees. These principles are designed to prevent common pitfalls in complex evolutionary systems, such as non-determinism and uncontrolled resource consumption.
454
+
455
+ - Determinism: genotype + seed → canonical phenotype (stable ordering and stable hash).
456
+ - Referential safety: no runtime phenotype object is retained as part of a genotype; genotype serialization contains only serializable fields.
457
+ - Lazy allocation: telemetry and per‑connection extras allocate only when enabled.
458
+ - Budget enforcement: all builds and morphs check and honor configured resource caps before committing structural changes.
459
+ - Side‑effect free import: importing hyper modules with the feature flag disabled must not mutate global runtime state.
460
+
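The budget-enforcement invariant can be sketched as a check that runs before any structural commit. This is a minimal illustration, not the repository's implementation: `ResourceBudget`, `StructuralDelta`, `checkBudget`, `commitWithinBudget`, and the per-edge byte estimate are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical types for illustration; field names echo the budgets named in this plan.
interface ResourceBudget {
  maxNodes: number;
  maxEdges: number;
  memoryBudgetBytes: number;
}

interface PhenotypeCounts {
  nodes: number;
  edges: number;
}

// Net structural change proposed by a build or morph step (negative values remove).
interface StructuralDelta {
  nodes: number;
  edges: number;
}

type BudgetDecision =
  | { ok: true; next: PhenotypeCounts }
  | { ok: false; reason: string };

// Assumed rough cost per edge, used only to turn an edge count into a byte estimate.
const ASSUMED_BYTES_PER_EDGE = 64;

// Pure check: computes the would-be counts and compares them against every cap.
function checkBudget(
  current: PhenotypeCounts,
  delta: StructuralDelta,
  budget: ResourceBudget
): BudgetDecision {
  const next: PhenotypeCounts = {
    nodes: current.nodes + delta.nodes,
    edges: current.edges + delta.edges,
  };
  if (next.nodes < 0 || next.edges < 0) return { ok: false, reason: 'negative counts' };
  if (next.nodes > budget.maxNodes) return { ok: false, reason: 'maxNodes exceeded' };
  if (next.edges > budget.maxEdges) return { ok: false, reason: 'maxEdges exceeded' };
  if (next.edges * ASSUMED_BYTES_PER_EDGE > budget.memoryBudgetBytes) {
    return { ok: false, reason: 'memoryBudget exceeded' };
  }
  return { ok: true, next };
}

// Runs `commit` only when the check passes, so a rejected morph leaves the phenotype untouched.
function commitWithinBudget(
  current: PhenotypeCounts,
  delta: StructuralDelta,
  budget: ResourceBudget,
  commit: (next: PhenotypeCounts) => void
): BudgetDecision {
  const decision = checkBudget(current, delta, budget);
  if (decision.ok) commit(decision.next);
  return decision;
}
```

Keeping the check pure and separate from the commit callback means the same function can back both build-time validation and runtime morph policies.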
461
+ ### Practical guidelines
462
+
463
+ To complement the strict invariants, the following practical guidelines should be followed during development. These are best practices that will help maintain code quality, performance, and debuggability.
464
+
465
+ - Canonicalize before hashing: sort rule lists and normalize numeric fields prior to JSON/string hashing to avoid order-dependent innovation ids.
466
+ - Use shallow, copy-on‑write genotype mutations for evolutionary operators; avoid mutating arrays in place.
467
+ - Cache adjacency/CPPN outputs keyed by canonical fingerprints (genotype hash, substrate hash, wiring prefs) and invalidate conservatively on genotype changes.
468
+ - Pool ephemeral storage (slabs, activation arrays, plasticity buffers) and reset on release to avoid steady heap growth.
469
+ - Include lightweight ancestry/trace metadata in phenotype objects when telemetry is enabled; keep it out of core runtime paths otherwise.
470
+
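The "canonicalize before hashing" guideline can be sketched as follows. This is an illustrative example under stated assumptions: `RuleLike`, `canonicalize`, and `hashRules` are hypothetical names, rules are sorted by `id` (which only makes sense where rule order carries no meaning), and FNV-1a stands in for whatever hash the repository actually uses.

```typescript
// Hypothetical rule shape for illustration; not the library's actual type.
interface RuleLike {
  id: number;
  kind: string;
  params: Record<string, number | string | boolean>;
  probability?: number;
}

// Round numeric fields so tiny float noise does not change the hash.
function normalizeNumber(value: number, digits = 6): number {
  const rounded = Number(value.toFixed(digits));
  return Object.is(rounded, -0) ? 0 : rounded;
}

// Serialize with sorted object keys so key insertion order is irrelevant.
function canonicalize(value: unknown): string {
  if (typeof value === 'number') return String(normalizeNumber(value));
  if (Array.isArray(value)) return '[' + value.map(canonicalize).join(',') + ']';
  if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>;
    const keys = Object.keys(record)
      .filter((key) => record[key] !== undefined)
      .sort();
    return '{' + keys.map((key) => JSON.stringify(key) + ':' + canonicalize(record[key])).join(',') + '}';
  }
  return JSON.stringify(value === undefined ? null : value);
}

// 32-bit FNV-1a over UTF-16 code units; a stand-in for the real hash function.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Copy before sorting so the caller's rule array is never mutated in place.
function hashRules(rules: readonly RuleLike[]): string {
  const sorted = [...rules].sort((a, b) => a.id - b.id);
  return fnv1a(canonicalize(sorted));
}
```

With this shape, two genotypes that list the same rules in a different order, or with object keys in a different order, produce the same fingerprint, which is the property the caching guideline relies on.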
471
+ ## Memory & performance alignment
472
+
473
+ Efficient memory management and high performance are critical for evolving complex neural networks. Hyper MorphoNEAT's implementation is therefore tightly aligned with the repository's dedicated memory and performance optimization roadmap. This section summarizes the key targets and how they map to the phased implementation.
474
+
475
+ **All memory operations, feature flags, and constants within the Hyper MorphoNEAT implementation must be sourced from the `Centralized Memory Manager` defined in Phase 3.5 of the `Memory_Optimization.md` plan. This ensures architectural consistency and adherence to the pay-for-use principle.**
476
+
477
+ Hyper MorphoNEAT relies on the repository's dedicated memory roadmap; this section summarizes the concrete targets and phase mapping from `plans/Memory_Optimization.md` so implementers and reviewers share the same acceptance criteria.
478
+
479
+ - Bytes / active connection: target ≈ 64 bytes per active connection (empirical baseline and Phase 1/3 goals in `plans/Memory_Optimization.md`). Use slab packing, bit‑flags, and optional pay‑for‑use slabs (gain, plasticity) to achieve this.
480
+ - Adjacency / phenotype caching: aim for an adjacency cache hit ratio > 70% in repeated evaluation scenarios (see Memory plan L7 and Hyper phases). Budget adjacency cache bytes to remain small (goal: ~+6 bytes amortized per active connection when the cache is effective).
481
+ - Rebuild variance: p95/median rebuild time ratio target < 2.5× (see Memory plan Phase targets for rebuild variance and slab reuse guidance).
482
+ - Plasticity side‑buffers and gains: keep optional side‑buffers < 8 bytes per plastic connection when enabled (Memory plan Phase 5/Hyper-specific targets).
483
+
484
+ Mapping to Memory phases (canonical references)
485
+
486
+ - Phase 0: Baseline instrumentation and dist‑only snapshots — use this to establish the Hyper baseline before enabling morphogenesis.
487
+ - Phase 1: Field audit & slimming — enforces connection enumerable key limits and documents bytes/connection baseline (~64–69 bytes observed).
488
+ - Phase 2: Node pooling — reuse reduces churn and enables deterministic parity when pooling is toggled on/off.
489
+ - Phase 3: Slab packing & optional slabs — the primary mechanism Hyper features should rely on to meet bytes/connection targets and to support low-overhead morphogenesis churn.
490
+
491
+ Implementers should consult `plans/Memory_Optimization.md` for benchmark artifacts, CV targets, and the detailed slab/pooling rules; Hyper PRs that touch growth/prune or caching must reference the relevant Memory phase tests (Field audit, NodePool stats, slab parity & gain omission tests) in their CI checklist.
492
+
493
+ ### Example Genotype Snippets
494
+
495
+ To help developers and users get started, this section provides canonical genotype examples. These snippets serve as reproducible starting points for experiments and as fixtures for regression tests, ensuring that the developmental process is both deterministic and performant.
510
496
 
511
497
  Offers canonical genotype archetypes that encode best‑practice starting conditions (minimal symmetric scaffold, hierarchical seed) enabling reproducible baselines for benchmarking. These snippets reduce ramp‑up cost for new users and function as fixtures in regression tests ensuring deterministic development and stable performance signatures across releases.
512
498
 
513
- ### 5.1 Minimal Seed
499
+ #### Minimal Seed
514
500
 
515
501
  Establishes a deterministic seed configuration with just enough structural variability (symmetry + shallow hierarchy) to exercise rule execution pathways while remaining analytically tractable. This baseline underpins performance profiling (isolated from higher‑order morphogenesis) and provides a control for evaluating incremental feature flags.
516
502
 
@@ -529,7 +515,12 @@ const geno: HyperGenotype = {
529
515
  { id: 2, kind: 'symmetry', params: { axis: 'y' }, probability: 0.7 },
530
516
  { id: 3, kind: 'hierarchy', params: { levels: 2 }, probability: 1 },
531
517
  ],
532
- cppns: [seedCPPN(/* architecture spec */)],
518
+ // TODO: replace with a concrete CPPNGene example; minimal example: seedCPPN({ layers:[{units:8,act:'tanh'},{units:1,act:'identity'}], seed:42 })
519
+ cppns: [
520
+ seedCPPN({
521
+ /* minimal architecture spec: layers, activations, seed */
522
+ }),
523
+ ],
533
524
  substrateSpec: {
534
525
  dims: 2,
535
526
  inputLayout: 'line',
@@ -540,7 +531,7 @@ const geno: HyperGenotype = {
540
531
  };
541
532
  ```
542
533
 
543
- ### 5.2 Rule Mutation Example
534
+ #### Rule Mutation Example
544
535
 
545
536
  Showcases a parameter mutation that incrementally increases structural capacity along an existing replication axis, illustrating fine‑grained controllability of developmental expansion without introducing novel rule kinds. This emphasizes mutation granularity: small numeric shifts propagate into proportionate phenotype elaboration, aiding smooth fitness landscape traversal.
546
537
 
@@ -553,7 +544,7 @@ mutateRule(
553
544
  );
554
545
  ```
555
546
 
556
- ### 5.3 Activity-Based Synaptogenesis (Runtime)
547
+ #### Activity-Based Synaptogenesis (Runtime)
557
548
 
558
549
  Exemplifies a morphogenesis policy that reacts to instantaneous utilization and error signals to add localized capacity, distinct from heritable genome alteration. This separation enables temporally adaptive fine‑tuning (short horizon structural adjustments) while preserving the slower evolutionary channel for consolidating successful motifs into heritable rules.
559
550
 
@@ -565,403 +556,83 @@ for (const region of regionsSortedByFocus(net)) {
565
556
  }
566
557
  ```
567
558
 
568
- ---
569
-
570
- ## 6. Indirect Connectivity Example
571
-
572
- Explains how coordinate‑based CPPN queries transform geometric relations (relative displacement, distance) into correlated weight patterns and probabilistic connectivity masks. This yields structured sparsity (e.g., banded, radial, or mirrored motifs) whose regularity enhances parameter sharing and reduces overfitting compared to unstructured random sparse graphs at equal edge budgets.
573
- Given node coordinates `(x_i,y_i)` and `(x_j,y_j)`, the CPPN input vector:
574
-
575
- ```
576
- v = [x_i, y_i, x_j, y_j, |x_i-x_j|, |y_i-y_j|, dist, 1]
577
- ```
578
-
579
- CPPN output channels (example):
580
-
581
- - `o0`: raw weight value
582
- - `o1`: connection mask probability (threshold)
583
- - `o2`: plasticity rate hint
584
- - `o3`: symmetry tag (for future tying)
585
-
586
- Edge realized if `sigmoid(o1) > maskThreshold`. Weight = `scale(o0)`. Optional plasticity metadata attached only if feature flag.
587
-
588
- ---
589
-
590
- ## 7. Morphogenesis Event Cycle (Detailed Example)
591
-
592
- Documents the order and conditionality of runtime adaptation actions (prune → grow → rebuild), making explicit the invariants (metrics snapshot immutability during a cycle, deferred rebuild) that guard against race conditions and inconsistent state. This trace form facilitates reproducibility and pedagogical walkthroughs.
593
-
594
- ```
595
- // Called after each training epoch
596
- onEpochEnd(net, geno, stats) {
597
- const metrics = collectModuleMetrics(net, stats);
598
- const focusOrder = rankModules(metrics);
599
- for (const m of focusOrder) {
600
- if (shouldPrune(m)) pruneModuleEdges(net, m, { keepFraction:0.6 });
601
- else if (shouldGrow(m)) growModule(net, geno, m, { strategy:'replicate_path' });
602
- }
603
- if (genoMutated) rebuildPhenotype(geno, net, { reuse: true });
604
- recordTrace(metrics, actionsTaken);
605
- }
606
- ```
607
-
608
- ---
559
+ ### Connection costs & wiring penalties
609
560
 
610
- ## 8. Educational Comparison Summary
561
+ A common challenge in generative neural network systems like HyperNEAT is the tendency to produce highly regular but inefficient wiring. To address this, Hyper MorphoNEAT treats wiring cost as a first-class citizen, allowing evolution to favor more compact and modular structures. This section details how wiring penalties are integrated into the system.
611
562
 
612
- Consolidates distinguishing dimensions (encoding granularity, growth triggers, scalability levers, plasticity integration) into a comparative matrix to contextualize Hyper MorphoNEAT’s hybrid positioning. The summary aids evaluators in mapping application constraints (e.g., need for extreme scaling, interpretability) to algorithm choice.
613
- | Aspect | Classic NEAT | HyperNEAT | Hyper MorphoNEAT |
614
- |------------------------|-------------------------------------|--------------------------|-------------------------------------------------------------|
615
- | Encoding | Direct genes (nodes/conns) | Indirect (CPPN) | Hybrid: rules + CPPN + runtime growth |
616
- | Growth Trigger | Genetic mutation only | Genetic (CPPN changes) | Genetic + runtime morphogenesis + focus policies |
617
- | Scalability Mechanism | Gradual complexification | Geometric regularities | Deferred materialization + region focus + sparsity budgets |
618
- | Plasticity | (Optional) weight updates | Not central | Built-in local synaptogenesis + Hebbian optional |
619
- | Speciation | Yes | Typically via CPPN genes | Extended to rules + CPPN + module signatures |
620
- | Pruning | Limited / manual | Not primary | Integrated cyclical prune/regrow + memory guards |
563
+ Plain HyperNEAT often produces highly regular but non‑modular wiring (many long inter‑module links) because there is no explicit penalty for wiring length or inter‑module connections. To reliably encourage compact, modular networks the plan should treat wiring cost as a first‑class signal across CPPN decisioning, morphogenesis, and telemetry.
621
564
 
622
- ---
565
+ - Treat wiring cost (connection count, euclidean length, inter‑module links) as an explicit, configurable signal used across CPPN decisioning, morphogenesis policies, and selection/fitness. In the developmental metaphor this is equivalent to metabolic or material costs that bias growth and pruning.
566
+ - Practically: CPPNs can combine a mask output with a cost term to produce an effective score used to realize edges; morph policies prefer local densification and prune long/inter‑module edges first under budget pressure; evolution may expose per‑genotype wiring preference weights (λ_count, λ_len, λ_inter) so wiring economics can be tuned or evolved.
567
+ - Making wiring cost explicit enables traceable trade‑offs (task performance vs wiring economy) and supports Pareto or penalized selection strategies described later in this plan.
623
568
 
624
- ## 9. Educational Hooks & Introspection
569
+ High level recommendations (what and why):
625
570
 
626
- Justifies first‑class introspection: complex indirect encodings risk opacity without structured tracing and metric surfacing. By instrumenting genotype→phenotype lineage, module utilization, and morph event deltas under optional flags, the system supports hypothesis‑driven debugging, comparative ablation studies, and educational visualization without imposing default overhead.
627
- Planned user-facing helpers:
571
+ - Add wiring cost terms to genotype/network bookkeeping: per‑genotype knobs (connectionCountWeight, lengthCostWeight, interModulePenalty) and an optional per‑genotype evolved preference for wiring economy.
572
+ - Apply cost awareness at three layers: (A) CPPN adjacency decision (cheap early pruning), (B) fitness/selection (global tradeoff), and (C) morphogenesis/telemetry (local growth/prune policy signals).
628
573
 
629
- - `hyper.inspectGenotype(geno)` prints rule list, CPPN summaries.
630
- - `hyper.trace()` → chronological list of developmental & morph actions.
631
- - `hyper.moduleReport(net)` → table of metrics (utilization, error share, sparsity, age).
632
- - `hyper.replay(traceLog)` → reproduce growth decisions (determinism demo).
574
+ Concrete lightweight API & formulas:
633
575
 
634
- ---
635
-
636
- ## 10. Alignment With Implementation Phases
637
-
638
- Articulates how conceptual pillars (indirect encoding, developmental rules, runtime morphogenesis, plasticity, telemetry, evolutionary integration) are deliberately staged to minimize compounded uncertainty. Each phase establishes a stable substrate (e.g., deterministic genotype regeneration) before layering on adaptive complexity, reducing confounding when performance regressions appear.
639
- | Concept Section | Implementation Phase(s) |
640
- |---------------------------------|-------------------------|
641
- | Seed Genotype & Substrate | 0–1 |
642
- | Replication / Symmetry Rules | 2 |
643
- | Indirect CPPN Connectivity | 3 |
644
- | Runtime Morphogenesis Hooks | 4 |
645
- | Plasticity Variables | 5 |
646
- | Telemetry / Trace Export | 6 |
647
- | Speciation Extensions | 7 |
648
- | Scaling Validation / Memory Budgets | 8 |
649
-
650
- ---
651
-
652
- ## 11. Future Educational Enhancements
653
-
654
- Enumerates future pedagogical tooling (interactive developmental replays, rule mutation sandboxes, lineage visualization) intended to lower cognitive barriers for newcomers and support empirical methodology (e.g., side‑by‑side growth trajectory comparison). These extensions act as force multipliers for community experimentation and reproducibility.
655
-
656
- - Interactive visualization: animate rule application passes.
657
- - Module lineage tree export (GraphViz / JSON).
658
- - “What-if” sandbox: apply hypothetical rule mutation and preview delta in nodes/edges before committing.
659
- - Tutorial notebooks: build from seed → mid complexity → large modular brain.
576
+ - Track per‑network summary values: totalConnections, totalWiringLength (sum of euclidean distances of realized edges), interModuleEdgeCount.
577
+ - Single‑objective penalty (simple):
660
578
 
661
- ---
662
-
663
- > NOTE: All conceptual elaborations above are descriptive; actual code remains gated behind flags and phases below.
664
-
665
- ---
666
-
667
- ## High-Level Goals
668
-
669
- Links biologically inspired constructs (symmetry, differentiation, plasticity) to concrete engineering KPIs (sample efficiency, scaling curvature, memory per effective edge) to ensure aesthetic analogies are instrumented and falsifiable. This guards against ornamental complexity by demanding metric justification for each added mechanism.
670
-
671
- 1. Add a compact Evo‑Devo genotype layer that can generate / regenerate phenotypic `Network` graphs deterministically.
672
- 2. Support CPPN‑driven structural & weight pattern generation (indirect encoding) with caching for large substrates.
673
- 3. Introduce morphogenesis (runtime growth/prune rules) integrated with existing pruning + connection pooling.
674
- 4. Maintain or reduce memory per active connection via slab packing + sparsity while enabling growth to millions of edges incrementally.
675
- 5. Provide introspection (modules, regions, developmental lineage) without heavy overhead when disabled.
676
-
677
- ---
678
-
679
- ## Current Baseline (What We Leverage)
680
-
681
- Surveys core infrastructural assets (connection pooling, slab packing, pruning logic, deterministic RNG) leveraged to accelerate development and reduce risk. Emphasizing reuse clarifies that innovation is concentrated in encoding and adaptive control layers, not low‑level numerical plumbing.
682
-
683
- Already present & reused:
684
-
685
- - `Connection.acquire/release` pooling.
686
- - Packed connection slab (`network.slab.ts`) + activation array pool.
687
- - Configurable pruning (`network.prune.ts`) and sparsity targeting.
688
- - Deterministic RNG snapshot / restore.
689
- - Multi‑optimizer update paths in `Node`.
690
- - NEAT mutation operators (can be extended to developmental rule mutation).
691
-
692
- Gaps to fill:
693
-
694
- - No genotype <-> phenotype separation beyond classic NEAT gene lists.
695
- - No spatial substrate abstraction or coordinate system.
696
- - No CPPN module.
697
- - No runtime growth triggers aside from pruning & add‑node mutation.
698
- - No hierarchical/module metadata structure.
699
-
700
- ---
701
-
702
- ## Module Layout (Proposed New Files / Folders)
703
-
704
- Defines a modular namespace that enforces separation of concerns: genotype logic isolated from runtime morph policies, CPPN generation segregated from phenotype materialization, and telemetry decoupled via lazy hooks. This isolation simplifies dependency analysis and reduces inadvertent cross‑layer coupling.
705
-
706
- ```
707
- src/hyper/
708
- genotype.ts // Core Evo-Devo genotype (rule list, CPPN refs, seeds)
709
- developmentalRules.ts // Rule interfaces + execution engine
710
- cppn/
711
- cppn.ts // Minimal differentiable / evolvable CPPN (reuse existing activations)
712
- compiler.ts // Optional fast-path compilation to slab weights
713
- substrate.ts // Spatial substrate (coords, regions, symmetry helpers)
714
- morphogenesis.ts // Runtime growth & pruning coordinator
715
- phenotypeBuilder.ts // Builds a Network from genotype + substrate
716
- telemetry.ts // Lightweight hooks, lazy when disabled
717
- serialization.ts // Genotype (not full Network) persistence
718
- mutation.ts // Mutations specific to rules / CPPN structure
719
- spec.md // (Design notes, constraints)
720
579
  ```
721
-
722
- All additions are additive; existing APIs remain stable until an eventual major version.
723
-
724
- ---
725
-
726
- ## Data Contracts (Initial Draft)
727
-
728
- Introduces stable interface nuclei enabling early consumer code (tests, tooling) to target contracts while internal algorithms iterate. Early formalization also enables forward‑compatible serialization (version tagging, optional field evolution) and eases future compression or remote execution strategies.
729
-
730
- Genotype core:
731
-
732
- ```ts
733
- interface DevelopmentalRule {
734
- id: number;
735
- kind:
736
- | 'replicate'
737
- | 'differentiate'
738
- | 'symmetry'
739
- | 'hierarchy'
740
- | 'prune_hint'
741
- | 'densify_region';
742
- params: Record<string, number | string | boolean>;
743
- probability?: number; // execution probability in a pass
744
- priority?: number; // ordering
745
- enabled?: boolean;
746
- }
747
-
748
- interface HyperGenotype {
749
- seed: number;
750
- input: number;
751
- output: number;
752
- rules: DevelopmentalRule[];
753
- cppns: CPPNGene[]; // Each produces pattern(s)
754
- substrateSpec: SubstrateSpec; // Dimensions, coordinate frames
755
- version: 1;
756
- }
580
+ penalizedFitness = rawFitness - λ_count * totalConnections
581
+ - λ_len * totalWiringLength
582
+ - λ_inter * interModuleEdgeCount
757
583
  ```
758
584
 
759
- Phenotype build call:
585
+ - Multi‑objective alternative: treat (taskFitness, wiringCost) as a Pareto pair and use Pareto selection (e.g., NSGA variants) to explore the trade‑off surface without scalarizing.
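
The summary values and both selection styles above can be sketched in a few lines. This is a hedged illustration: the `WiringEdge` shape, `summarizeWiring`, and `dominates` are hypothetical helpers, and the weight names reuse the per‑genotype knobs suggested earlier as stand-ins for λ_count, λ_len, and λ_inter.

```typescript
// Hypothetical edge shape: endpoint coordinates plus module tags.
interface WiringEdge {
  from: { x: number; y: number; module: number };
  to: { x: number; y: number; module: number };
}

interface WiringSummary {
  totalConnections: number;
  totalWiringLength: number;
  interModuleEdgeCount: number;
}

// Stand-ins for λ_count, λ_len and λ_inter.
interface WiringWeights {
  connectionCountWeight: number;
  lengthCostWeight: number;
  interModulePenalty: number;
}

// One pass over realized edges to collect the three summary values.
function summarizeWiring(edges: readonly WiringEdge[]): WiringSummary {
  let totalWiringLength = 0;
  let interModuleEdgeCount = 0;
  for (const edge of edges) {
    totalWiringLength += Math.hypot(edge.to.x - edge.from.x, edge.to.y - edge.from.y);
    if (edge.from.module !== edge.to.module) interModuleEdgeCount++;
  }
  return { totalConnections: edges.length, totalWiringLength, interModuleEdgeCount };
}

// Single-objective form: subtract each weighted cost term from the raw fitness.
function penalizedFitness(rawFitness: number, summary: WiringSummary, weights: WiringWeights): number {
  return (
    rawFitness -
    weights.connectionCountWeight * summary.totalConnections -
    weights.lengthCostWeight * summary.totalWiringLength -
    weights.interModulePenalty * summary.interModuleEdgeCount
  );
}

// Multi-objective form: a dominates b if it is no worse on both axes and better on one.
interface ParetoPoint {
  taskFitness: number; // higher is better
  wiringCost: number; // lower is better
}

function dominates(a: ParetoPoint, b: ParetoPoint): boolean {
  const noWorse = a.taskFitness >= b.taskFitness && a.wiringCost <= b.wiringCost;
  const strictlyBetter = a.taskFitness > b.taskFitness || a.wiringCost < b.wiringCost;
  return noWorse && strictlyBetter;
}
```

The dominance test is the only primitive a Pareto selector needs from this layer; the sorting and crowding logic of a particular NSGA variant would sit on top of it.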
760
586
 
761
- ```ts
762
- buildPhenotype(geno: HyperGenotype, opts): Network; // Reuses Connection pooling + slab
763
- ```
587
+ CPPN & adjacency recommendations (cheap, local bias):
764
588
 
765
- Runtime morphogenesis callback hook signature:
589
+ - Make the CPPN adjacency decision cost‑aware by combining the mask output with a distance / inter‑module cost term before thresholding:
766
590
 
767
- ```ts
768
- type MorphEvent =
769
- | 'epochEnd'
770
- | 'stagnation'
771
- | 'memoryPressure'
772
- | 'externalSignal';
773
- type MorphogenesisHook = (
774
- net: Network,
775
- ctx: { event: MorphEvent; metrics: any }
776
- ) => void;
777
591
  ```
778
-
779
- ## Risk & Mitigation Summary
780
-
781
- Transforms diffuse architectural risks into an actionable ledger, pairing each risk with concrete mitigation levers (caching, budgets, feature flags). This supports proactive monitoring and simplifies post‑mortem attribution if regressions emerge.
782
- | Risk | Mitigation |
783
- |------|------------|
784
- | Rebuild overhead for large CPPN substrates | Caching + incremental diff generation |
785
- | Memory blowup from plasticity state | Optional feature; pooled typed arrays with reuse |
786
- | Rule explosion causing combinatorial growth | Global complexity budget + rule priority throttle |
787
- | API instability | Feature-flag entire hyper layer until Phase 6 |
788
-
789
- ---
790
-
791
- ## Incremental PR Sequencing (Granular Checklist)
792
-
793
- Breaks delivery into reviewable micro‑increments so semantic drift or performance regressions are localized; each PR carries its own acceptance tests and plan diff, forming an auditable evolution trail of the design document itself.
794
-
795
- 1. PR1: Scaffolding + config flag + empty tests.
796
- 2. PR2: Genotype + basic phenotype builder.
797
- 3. PR3: Developmental rules (replicate/symmetry) + tests.
798
- 4. PR4: CPPN core + indirect edge generation.
799
- 5. PR5: Morphogenesis hooks + activity metrics.
800
- 6. PR6: Plasticity (Hebbian) + gating.
801
- 7. PR7: Telemetry + export trace.
802
- 8. PR8: Evolution integration (mutation + speciation extension).
803
- 9. PR9: Scaling benchmarks + docs.
804
- 10. PR10: Stabilization & API doc examples.
805
-
806
- Each PR keeps surface area small, adds tests, and updates this plan (append CHANGELOG section).
807
-
808
- ---
809
-
810
- ## Example (Future) User API Sketch (Post Phase 5)
811
-
812
- Provides a provisional user‑facing construct to validate naming consistency, configuration surface minimality, and composability with existing library patterns before hardening interfaces post Phase 6.
813
-
814
- ```ts
815
- import { createHyper } from 'neataptic-ts/hyper';
816
-
817
- const hyper = createHyper({
818
- input: 16,
819
- output: 4,
820
- rules: [{ kind: 'replicate', params: { times: 2 } }],
821
- cppn: { layers: [8, 8], activation: 'tanh' },
822
- enableMorphogenesis: true,
823
- plasticity: { mode: 'hebbian', rate: 1e-3 },
824
- });
825
-
826
- const net = hyper.build();
827
- // training loop ... hyper.maybeMorph(eventMetrics)
592
+ effectiveScore = maskOutput - costCoeff * (normalizedDistance + interModuleFlag)
593
+ realizeEdge if sigmoid(effectiveScore) > maskThreshold
828
594
  ```
829
595
 
830
- ---
596
+ - Provide per‑module mask bonuses so CPPNs can be biased toward intra‑module connectivity (e.g., boost maskOutput if src.module === dst.module).
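
A minimal sketch of the cost‑aware decision, including the intra‑module bonus, might look like this. The `NodeSite` and `CostAwareOptions` shapes, the `moduleBonus` and `maxDistance` parameters, and the clamping of the normalized distance to 1 are assumptions made for the example rather than settled design.

```typescript
// Hypothetical node placement: substrate coordinates plus a module tag.
interface NodeSite {
  x: number;
  y: number;
  module: number;
}

interface CostAwareOptions {
  costCoeff: number; // weight of the distance + inter-module cost term
  maskThreshold: number; // edge is realized when sigmoid(score) exceeds this
  maxDistance: number; // distance used to normalize edge length into [0, 1]
  moduleBonus: number; // added to maskOutput for intra-module pairs
}

function sigmoid(z: number): number {
  return 1 / (1 + Math.exp(-z));
}

// Combines the CPPN mask output with wiring cost before thresholding.
function shouldRealizeEdge(
  maskOutput: number,
  src: NodeSite,
  dst: NodeSite,
  options: CostAwareOptions
): boolean {
  const sameModule = src.module === dst.module;
  const distance = Math.hypot(dst.x - src.x, dst.y - src.y);
  const normalizedDistance = Math.min(1, distance / options.maxDistance);
  const interModuleFlag = sameModule ? 0 : 1;
  const boostedMask = maskOutput + (sameModule ? options.moduleBonus : 0);
  const effectiveScore = boostedMask - options.costCoeff * (normalizedDistance + interModuleFlag);
  return sigmoid(effectiveScore) > options.maskThreshold;
}
```

With `costCoeff` set to 0 and `moduleBonus` set to 0 the function reduces to the plain mask threshold, which keeps the cost‑aware path opt‑in.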
831
597
 
832
- ## Acceptance & Success Metrics
598
+ Morphogenesis & pruning recommendations:
833
599
 
834
- Anchors success to quantifiable, automatable metrics (build time ratio, memory per active connection, deterministic hash reproducibility) ensuring progress narratives are evidence‑based; thresholds provide regression guards in CI.
835
- | Metric | Target (initial) | Rationale |
836
- |----------------------------------|-------------------------------------------------|--------------------------|
837
- | Phenotype build time (50k edges) | < 1.2× baseline Network construction | Maintain responsiveness |
838
- | Memory per active connection | TBD after measurement (< baseline by Phase 8) | Scale to big nets |
839
- | Morph hook overhead (disabled) | < 1% runtime | Pay only when used |
840
- | Deterministic rebuild hash match | 100% | Reproducibility |
600
+ - Under memory pressure or when pruning is considered, prefer removing inter‑module or long‑distance edges first (unless they show high contribution score).
601
+ - When growing, prefer local densification (intra‑module) before adding long‑range shortcuts; allow occasional long links but gate by a budget or cooldown.
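
The pruning preference above amounts to an ordering over candidate edges. The sketch below is one possible reading, with hypothetical names (`PrunableEdge`, `selectEdgesToPrune`, `protectAbove`): edges whose contribution score reaches the protection level are skipped, inter‑module edges go first, and longer edges go before shorter ones.

```typescript
// Hypothetical per-edge record available to a prune policy.
interface PrunableEdge {
  id: number;
  length: number; // euclidean length of the realized edge
  interModule: boolean; // true when endpoints sit in different modules
  contribution: number; // higher means the edge matters more to task performance
}

// Returns up to `count` edges to remove, in removal order.
function selectEdgesToPrune(
  edges: readonly PrunableEdge[],
  count: number,
  protectAbove: number
): PrunableEdge[] {
  return edges
    .filter((edge) => edge.contribution < protectAbove) // keep high-contribution edges
    .sort(
      (a, b) =>
        Number(b.interModule) - Number(a.interModule) || // inter-module edges first
        b.length - a.length || // then longest first
        a.id - b.id // stable, deterministic tie-break
    )
    .slice(0, count);
}
```

The final tie‑break on `id` is there so that identical inputs always yield the same removal order, in line with the determinism invariant.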
841
602
 
842
- ---
843
-
844
- ## Future Extensions (Post v1)
845
-
846
- Signals strategic extension vectors (symbolic modules, GPU path, compressed serialization) to align community contributions and prevent ad‑hoc divergence once the core is stable.
847
-
848
- - Spatially aware neuro-symbolic modules.
849
- - GPU/WebGPU execution path using slab + typed buffers.
850
- - Compressed serialization (delta-coded genotype + pattern seeds).
851
-
852
- ---
853
-
854
- ## Maintenance Notes
855
-
856
- Enumerates non‑negotiable constraints (optional fields, isolation of hyper namespace, pooled state hygiene) that reviewers should enforce to avoid gradual entropy accumulation and maintain predictable memory/performance profiles.
603
+ Telemetry & metrics:
857
604
 
858
- - Keep `hyper/` isolated; do not import from it inside baseline `Network` unless flag enabled to avoid bundle bloat.
859
- - All added fields on `Network` must be optional and lazily allocated.
860
- - Pool any added per-connection arrays (plasticity, tags) via indexed side buffers.
605
+ - Expose wiring metrics in `telemetry.ts`: meanEdgeLength, interModuleRatio, modularityQ. Use these to track emergent modularity and guide focusScore adjustments.
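
As a rough sketch of how these three metrics could be computed in one pass, the function below treats the network as an undirected, unweighted graph and uses Newman's modularity, Q = Σ_c [L_c / m − (d_c / 2m)²], where L_c is the number of intra‑module edges and d_c the degree sum of module c. The `TelemetryEdge` shape and the `wiringMetrics` name are hypothetical; nothing here is the actual `telemetry.ts` API.

```typescript
// Hypothetical edge record: node indices plus precomputed euclidean length.
interface TelemetryEdge {
  from: number;
  to: number;
  length: number;
}

interface WiringMetrics {
  meanEdgeLength: number;
  interModuleRatio: number;
  modularityQ: number;
}

// moduleOf[i] is the module id of node i. Edges are treated as undirected and unweighted.
function wiringMetrics(edges: readonly TelemetryEdge[], moduleOf: readonly number[]): WiringMetrics {
  const edgeCount = edges.length;
  if (edgeCount === 0) return { meanEdgeLength: 0, interModuleRatio: 0, modularityQ: 0 };

  let totalLength = 0;
  let interModuleEdges = 0;
  const intraEdges = new Map<number, number>(); // module -> edges fully inside it
  const degreeSum = new Map<number, number>(); // module -> sum of endpoint degrees

  for (const edge of edges) {
    const a = moduleOf[edge.from];
    const b = moduleOf[edge.to];
    totalLength += edge.length;
    degreeSum.set(a, (degreeSum.get(a) ?? 0) + 1);
    degreeSum.set(b, (degreeSum.get(b) ?? 0) + 1);
    if (a === b) intraEdges.set(a, (intraEdges.get(a) ?? 0) + 1);
    else interModuleEdges++;
  }

  // Newman modularity: observed intra-module fraction minus the expected fraction.
  let modularityQ = 0;
  degreeSum.forEach((degree, moduleId) => {
    const intra = intraEdges.get(moduleId) ?? 0;
    modularityQ += intra / edgeCount - (degree / (2 * edgeCount)) ** 2;
  });

  return {
    meanEdgeLength: totalLength / edgeCount,
    interModuleRatio: interModuleEdges / edgeCount,
    modularityQ,
  };
}
```

For two triangles joined by a single bridge edge, with each triangle as its own module, this gives Q = 5/14, roughly 0.357.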
861
606
 
862
- ---
863
-
864
-
865
- ---
607
+ Practical experiment suggestions (to tune λ values and policies):
866
608
 
867
- ## Phased Implementation Plan
609
+ - A/B test: no cost vs scalar penalty vs Pareto selection; measure task performance vs modularity (Q), interModuleRatio, and edge count.
610
+ - Compare CPPN cost‑aware thresholding vs cost only in selection to see which induces more modularity with less performance loss.
868
611
 
869
- Specifies an incremental delivery roadmap where each phase yields a self‑contained, benchmark‑able capability with explicit rollback boundaries. This staging supports empirical validation (performance, memory) and confines failure domains—problems in morphogenesis (Phase 4) cannot corrupt genotype determinism established in earlier phases.
612
+ Risks & mitigations:
870
613
 
871
- ### Phase 0 Groundwork (Low Risk / Enablers)
614
+ - Over‑penalizing wiring reduces achievable task performance: mitigate by annealing λs, evolving λ per genotype, or using Pareto selection.
615
+ - Metric compute cost (modularity, length) — compute telemetry incrementally and at low cadence (end of epoch) or sample subnetworks.
872
616
 
873
- Focuses on infrastructural enablement: feature flags for conditional compilation, placeholder interfaces for type stability, and sentinel tests ensuring future expansions do not regress baseline construction pathways. No algorithmic semantics change in this phase, de‑risking the branch point.
874
- Goal: Create scaffolding without behavior changes.
875
- Steps:
876
-
877
- 1. Add `src/hyper/` folder with placeholder `genotype.ts`, `developmentalRules.ts`, `substrate.ts` exporting empty interfaces + TODO comments.
878
- 2. Add feature flag to `config` (e.g. `config.enableHyper = false`).
879
- 3. Add `test/hyper/` folder.
880
- 4. Add unit test stubs verifying import does not throw. Naming pattern: `hyper.*.test.ts`, with `*` taken from the topic or focus of the file under test.
881
- Acceptance: Build + tests unchanged; tree includes new folder.
617
+ Where to add this in the phased plan (quick mapping):
882
618
 
883
- ### Phase 1 Genotype & Substrate Core
619
+ - Phase 1: add config flags and genotype fields for wiring‑cost weights and basic wiring telemetry counters.
620
+ - Phase 3 (CPPN): implement cost‑aware adjacency thresholding and per‑module mask bonuses; include adjacency cache keys for cost parameters.
621
+ - Phase 4 (Morphogenesis): prefer local growth and prune inter‑module edges first; include wiring metrics in morph decision inputs.
622
+ - Phase 6 (Telemetry): export wiring metrics and modularity Q; add automated ablation dashboard entries.
623
+ - Phase 7 (Evolution): support penalized fitness option and/or Pareto selection; include wiringCost component in speciation distance if desired.
884
624
 
885
- Delivers reproducible genotype→phenotype translation with hashing and serialization hooks so subsequent adaptive layers (rules, CPPNs) inherit a verifiable foundation. Determinism here is pivotal for isolating non‑deterministic variance sources in later performance analyses.
886
- Steps:
625
+ This section intentionally keeps changes incremental and opt‑in: wiring penalties are gated by config flags and can be tuned or evolved, so users keep the option to explore pure regular HyperNEAT behaviour or wiring‑aware morphogenesis.
887
626
 
888
- 1. Implement `HyperGenotype` structure & factory (`createInitialGenotype(input, output, seed)`).
889
- 2. Implement minimal `substrate.ts` with coordinate assignment for input/output nodes (e.g., 1D normalized positions).
890
- 3. Implement deterministic rebuild to a trivial `Network` (linear fully‑connected input→output) via `phenotypeBuilder.ts`.
891
- 4. Add serialization for genotype only (`toJSONGenotype`, `fromJSONGenotype`).
892
- 5. Add tests: round‑trip genotype + phenotype equivalence vs direct Network baseline.
893
- Acceptance: Can build network from genotype deterministically, coverage >= existing threshold for new lines.
627
+ ### Morphogenesis Event Cycle (Detailed Example)
894
628
 
895
- ### Phase 2 Developmental Rule Engine (Static Application)
629
+ To illustrate how runtime adaptations occur, this section provides a detailed example of the morphogenesis event cycle. It documents the sequence of actions—pruning, growing, and rebuilding—and the invariants that ensure state consistency and prevent race conditions. This example serves as both a pedagogical tool and a reference for implementation.
896
630
 
897
- Confirms that static developmental rule application yields predictable, parameter‑controlled structural transformations (node/edge count scaling laws) before introducing temporal dependencies or indirect connectivity. This isolates semantic bugs (e.g., symmetry duplication drift) early.
898
- Steps:
899
-
900
- 1. Define rule execution ordering & priority.
901
- 2. Implement rule kinds: `replicate` (split connection region), `symmetry` (mirror coordinates), `hierarchy` (tag modular region ids).
902
- 3. Extend phenotype builder to apply rules in passes (no runtime dynamics yet).
903
- 4. Provide mutation operators for enabling/disabling rules & parameter tweak.
904
- 5. Tests: Rule application changes node & connection counts as expected; determinism with identical seeds.
905
- Acceptance: Complexity scaling via rules validated on synthetic tests.
906
-
907
- ### Phase 3 – CPPN Integration (Indirect Encoding)
908
-
909
- Adds CPPN‑based pattern synthesis to decouple connection enumeration from explicit genome length, enabling graceful scaling to large substrates via procedural generation plus sparsity thresholds. Performance and caching instrumentation ensure tractability before layering dynamic growth.
910
- Steps:
911
-
912
- 1. Implement lightweight CPPN (feedforward multi‑layer perceptron) using existing `Node` activation functions (no recursion) in `cppn/cppn.ts`.
913
- 2. Define `CPPNGene` (architecture description + weights array reused via typed arrays for memory efficiency).
914
- 3. Add mapping: for each candidate pair (i,j) the CPPN queries `(x_i, y_i, x_j, y_j, distance, bias)` to produce weight or mask.
915
- 4. Introduce sparsity threshold (e.g. absolute output < t -> skip connection) feeding into existing slab rebuild.
916
- 5. Cache generated adjacency (fingerprint genotype + substrate + threshold) to avoid recomputation across evaluations.
917
- 6. Tests: Weight pattern symmetry; threshold controls edge count.
918
- Acceptance: Indirect generation path within 1.5× baseline time for medium nets; memory overhead bounded (document).
919
-
920
- ### Phase 4 – Morphogenesis (Runtime Growth & Pruning Hooks)
921
-
922
- Installs runtime evaluators that adjust structure in response to short‑horizon performance signals, establishing a middle adaptation timescale between gradient updates and generational evolution. Policies are bounded by memory/growth budgets to preserve predictability.
923
- Steps:
924
-
925
- 1. Implement `morphogenesis.ts` maintaining counters: activity, error trend, stagnation iterations.
926
- 2. Expose `registerMorphHook` on Network (only active if hyper mode enabled) storing a list of `MorphogenesisHook`.
927
- 3. Provide built‑in policies: `activity_expand`, `low_contrib_prune` (leveraging existing pruning API), `module_split` (clone subgraph tag).
928
- 4. Introduce growth budget guard referencing memory plan (max connections & nodes; triggers forced pruning).
929
- 5. Tests: Controlled mock metrics trigger expected growth/prune actions (assert node/connection deltas).
930
- Acceptance: Hooks fire without breaking forward / backward passes; pool usage stable.
931
-
932
- ### Phase 5 – Plasticity & Local Adaptation
933
-
934
- Augments standard optimizer updates with local activity‑dependent modifications (Hebbian/anti‑Hebbian) providing rapid micro‑adaptation where gradient signals may be sparse or noisy, potentially improving credit assignment in deep or recurrent motifs.
935
- Steps:
936
-
937
- 1. Add optional per‑connection short‑term plasticity variables (reuse existing `Connection` extension fields; ensure pooling reset).
938
- 2. Implement Hebbian update option executed post‑activation batch (scales small typed array of weight deltas, not reusing `propagate`).
939
- 3. Provide configuration gating to avoid overhead when disabled.
940
- 4. Tests: Hebbian updates modify weights in isolation; disabled path adds near‑zero overhead (<5% baseline runtime for small net).
941
- Acceptance: Plasticity coexists with backprop/evolution.
942
-
943
- ### Phase 6 – Telemetry & Introspection (Lazy)
944
-
945
- Ensures instrumentation cost is pay‑as‑you‑go: metric buffers and trace arrays allocate only when enabled, and hot paths retain branch‑predictable checks. This design encourages pervasive measurability without penalizing production performance.
946
- Steps:
947
-
948
- 1. `telemetry.ts` gathers per‑module stats (avg activity, sparsity) only when `enableTelemetry` flag set.
949
- 2. Add `network.exportDevelopmentalTrace()` returning rule application + growth events.
950
- 3. Add optional ONNX metadata section embedding module tags.
951
- 4. Tests: Telemetry off -> no extra allocations (assert via heap sampling harness later). Basic JSON output validated.
952
- Acceptance: Non‑intrusive diagnostics present.
953
-
954
- ### Phase 7 – Evolutionary Integration (Mutation + Reproduction)
955
-
956
- Integrates multi‑family crossover and expanded distance metrics so population dynamics can exploit recombination benefits (innovation combination, deleterious mutation masking) while maintaining ecological niche protection for divergent developmental strategies.
957
- Steps:
631
+ Documents the order and conditionality of runtime adaptation actions (prune → grow → rebuild), making explicit the invariants (metrics snapshot immutability during a cycle, deferred rebuild) that guard against race conditions and inconsistent state. This trace form facilitates reproducibility and pedagogical walkthroughs.
958
632
 
959
- 1. Extend existing NEAT mutation registry with hyper‑aware mutations (add/remove rule, mutate CPPN weight/topology, adjust substrate scale, modify rule probabilities/priorities).
960
- 2. Implement `crossoverHyperGenotype(a,b,cfg)` with alignment & blending strategies (rules, CPPNs, substrate modifiers) + validation pass.
961
- 3. Add speciation distance components for genotype differences (rule vector hash distance, CPPN topology/weight signature, substrate modifier delta).
962
- 4. Integrate reproduction into population loop: select parents, generate offspring genotype(s), regenerate phenotype using cached artifacts.
963
- 5. Fitness evaluation optionally rebuilds phenotype each generation if genotype mutated or crossover occurred (cache reuse across offspring clones).
964
- 6. Tests:
633
+ 1a. Add mutations / crossover support for genotype wiring preference fields; allow `wiringPreferences.evolvePreference` to enable evolution of λ weights.
634
+ 1b. Add fitness options: scalarized penalized fitness (`fitness' = fitness - λ_count * count - λ_len * length - λ_inter * interModule`) and a Pareto multi‑objective mode. Make selection method configurable.
635
+ 1c. Optionally add a wiringCost component to the speciation distance to discourage mixing of very different wiring preferences unless desired.
+ 2. Implement `crossoverHyperGenotype(a,b,cfg)` with alignment & blending strategies (rules, CPPNs, substrate modifiers) + validation pass.
+ 3. Add speciation distance components for genotype differences (rule vector hash distance, CPPN topology/weight signature, substrate modifier delta).
+ 4. Integrate reproduction into population loop: select parents, generate offspring genotype(s), regenerate phenotype using cached artifacts.
+ 5. Fitness evaluation optionally rebuilds phenotype each generation if genotype mutated or crossover occurred (cache reuse across offspring clones).
+ 6. Tests:
965
636
 
966
637
  - Crossover determinism under fixed RNG.
967
638
  - Speciation separation on crafted genotype pairs.
@@ -978,6 +649,109 @@ Steps:
978
649
  2. Stress test morphogenesis growing then pruning to ensure no memory leaks (pool sizes stable).
979
650
  3. Document scaling heuristics & recommended flags.
980
651
  4. CI job thresholding memory & run time.
981
- Acceptance: Peak memory per active connection meets target (<X bytes; finalize after measurement Phase 6).
652
+ Acceptance: Peak memory per active connection meets target (TODO: finalize numeric target after Phase 8 measurement; suggested temporary target: <= 64 bytes/active connection measured on CI harness).
653
+
654
+ ## Phases 0–1: Acceptance, Objectives, Key Tasks, Tests, Risks (polished)
655
+
656
+ Thesis: Deliver a minimal Hyper MorphoNEAT skeleton that is opt‑in, deterministic, and verifiable; provide an explicit pruning policy for morphogenesis with deterministic validation and measurable acceptance targets.
657
+
658
+ 1. Objectives
659
+
660
+ 1) Provide a guarded hyper feature flag that is off by default.
661
+ 2) Define a persistent HyperGenotype schema and deterministic build path to a baseline phenotype.
662
+ 3) Add substrate coordinate assignment and canonical genotype hashing/serialization.
663
+ 4) Specify a deterministic pruning policy for morphogenesis with budget checks and rollback safety.
664
+ 5) Surface concrete tests and numeric acceptance criteria for CI gating.
665
+
666
+ 2. Key tasks
667
+
668
+ 1) Feature flag: add config.enableHyper (default false). All hyper imports must be no‑op when false.
669
+ 2) Genotype factory: implement createInitialGenotype({ input, output, seed }) returning a minimal rule list and substrateSpec.
670
+ 3) Substrate: deterministic 1D/2D coordinate assignment with canonical ordering.
671
+ 4) Serialization & hash: implement encodeGenotype/decodeGenotype and hashGenotype(geno) using canonical JSON (sorted keys) and stable numeric formatting.
672
+ 5) Phenotype builder: baseline materialization producing fully connected input→output network when hyper is enabled.
673
+ 6) Pruning policy: implement the stepwise pruning algorithm below (deterministic RNG, explicit budgets, dry‑run validation and rollback).
674
+ 7) Tests & microbench: add deterministic rebuild tests (N=50), serialization round‑trip, and build-time microbenchmark harness.
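
Key task 4 asks for canonical JSON with sorted keys and stable numeric formatting. The sketch below shows one way that could look; `canonicalJson` and its `precision` parameter are hypothetical names chosen for illustration, not part of the package, and the 9-digit rounding is an assumed default rather than a documented one.

```typescript
// Hypothetical helper (not in the package): serialize a genotype-like value
// so that key order and float noise do not change the output string.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function canonicalJson(value: Json, precision = 9): string {
  if (value === null || typeof value === 'boolean') return String(value);
  if (typeof value === 'number') {
    if (!Number.isFinite(value)) throw new Error('non-finite number in genotype');
    // Round to a fixed number of decimals, then normalize -0 to 0.
    const rounded = Number(value.toFixed(precision));
    return String(rounded === 0 ? 0 : rounded);
  }
  if (typeof value === 'string') return JSON.stringify(value);
  if (Array.isArray(value)) {
    // Arrays keep their order: element position is meaningful.
    return '[' + value.map((item) => canonicalJson(item, precision)).join(',') + ']';
  }
  // Objects are emitted with keys in sorted order.
  const keys = Object.keys(value).sort();
  const parts = keys.map(
    (key) => JSON.stringify(key) + ':' + canonicalJson(value[key], precision)
  );
  return '{' + parts.join(',') + '}';
}

// Usage: two genotypes that differ only in key order serialize identically.
const first = canonicalJson({ seed: 7, input: 2, output: 1, rules: [] });
const second = canonicalJson({ rules: [], output: 1, input: 2, seed: 7 });
console.log(first === second);
```

A `hashGenotype` implementation could then feed this string to any stable hash function; which hash to use is left open here, as it is in the plan.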
675
+
676
+ 3. Morphogenesis Event Cycle — Pruning policy (compact algorithm)
677
+
678
+ - Invariants:
679
+
680
+ - genotype + seed -> canonical phenotype ordering and hash.
681
+ - All stochastic choices use a deterministic RNG seeded via a canonical helper `combineSeeds(genoHash, epochCounter)` (e.g., concat+xxhash64 or HMAC with fixed key). TODO: define exact `combineSeeds` implementation and expose helper in `src/hyper/utils.ts` for reproducibility.
682
+ - Budgets: maxNodes, maxEdges, memoryBudget are checked before commit.
683
+
684
+ - Deterministic pruning algorithm (stepwise):
685
+
686
+ ```typescript
687
+ // Pseudocode: prunePolicy.ts
688
+ // deterministicSeed := combineSeeds(genoHash, epochCounter)
689
+ // NOTE: `combineSeeds` must be a documented, canonical combiner (stable across platforms).
690
+
691
+ /**
692
+ * pruneModuleEdges(net, moduleId, ctx)
693
+ * - Prefer inter-module and long edges.
694
+ * - Perform dry‑run removals to validate budget/constraints before commit.
695
+ */
696
+ function pruneModuleEdges(net, moduleId, ctx) {
697
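+ // Derive the seed up front so the trace entry in Step 7 can record it
+ // (assumes ctx carries genoHash and epochCounter).
+ const deterministicSeed = combineSeeds(ctx.genoHash, ctx.epochCounter);
+ // Step 1: Snapshot metrics (no mutation)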
698
+ const edges = net.getEdgesForModule(moduleId); // stable ordering
699
+ // Step 2: Score edges (higher -> better candidate for removal)
700
+ // score = λ_len * normalizedLength + λ_inter * interModuleFlag - contributionScore
701
+ const scored = edges.map((e) => ({
702
+ edge: e,
703
+ score:
704
+ ctx.wiringCost.lengthWeight * normalize(e.length) +
705
+ ctx.wiringCost.interWeight * (e.src.module !== e.dst.module ? 1 : 0) -
706
+ normalizeContribution(e.contribution),
707
+ }));
708
+ // Step 3: Sort descending by score (tie-break deterministic by edge.id)
709
+ scored.sort((a, b) =>
710
+ a.score === b.score ? a.edge.id - b.edge.id : b.score - a.score
711
+ );
712
+ // Step 4: Select up to ctx.maxRemovalsPerCycle edges, skipping any whose removal would break connectivity
713
+ const toRemove = [];
714
+ let removalCount = 0;
715
+ for (const s of scored) {
716
+ if (removalCount >= ctx.maxRemovalsPerCycle) break;
717
+ if (wouldViolateConnectivity(net, s.edge)) continue; // preserve minimal connectivity heuristics
718
+ toRemove.push(s.edge);
719
+ removalCount++;
720
+ }
721
+ // Step 5: Dry run validation
722
+ const dryNet = net.cloneDry(); // no heavy allocation, returns simulated counts
723
+ dryNet.removeEdges(toRemove);
724
+ if (!dryNet.withinBudgets(ctx.budgets)) {
725
+ return { success: false, reason: 'budget-violation' }; // abort, no mutation
726
+ }
727
+ // Step 6: Commit
728
+ net.removeEdges(toRemove);
729
+ // Step 7: Record trace entry (lazy allocate only when telemetry enabled)
730
+ traceAppend({
731
+ type: 'prune',
732
+ moduleId,
733
+ removed: toRemove.length,
734
+ seed: deterministicSeed,
735
+ });
736
+ return { success: true, removed: toRemove.length };
737
+ }
738
+ ```
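
The invariants above leave `combineSeeds` as a TODO. As a minimal sketch of the idea, the block below combines a genotype hash and an epoch counter into a 32-bit seed and drives a small seeded generator from it. Two assumptions are made loudly: 32-bit FNV-1a stands in for the xxhash64 or HMAC combiner that the plan suggests, and `mulberry32` is a well-known small PRNG picked for illustration, not something the package is known to use.

```typescript
// Assumption: FNV-1a (32-bit) replaces the xxhash64/HMAC combiner named in the plan.
// It hashes UTF-16 code units, which is deterministic across JS engines.
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Hypothetical signature: the plan only names combineSeeds(genoHash, epochCounter).
function combineSeeds(genoHash: string, epochCounter: number): number {
  return fnv1a32(`${genoHash}:${epochCounter}`);
}

// mulberry32: small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Usage: the same (genoHash, epochCounter) pair always yields the same stream.
const seedA = combineSeeds('abc', 1);
const seedB = combineSeeds('abc', 1);
const rngA = mulberry32(seedA);
const rngB = mulberry32(seedB);
console.log(rngA() === rngB());
```

A 32-bit seed space is small; a production combiner would likely want the wider hash the plan mentions, with the same call shape.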
982
739
 
983
- ---
740
+ 4. Tests (single‑expect rule; deterministic seeds; numeric targets)
741
+
742
+ 1) Unit: "prune selects inter-module edges first" — set up a small net with known inter/intra edges; run prunePolicy; expect(topRemovedIsInterModule). (1 expect)
743
+ 2) Determinism: "build + prune repeatability N=50" — rebuild from same genotype+seed 50 times, run prunePolicy with same epochCounter; expect(allHashesEqual). (1 expect)
744
+ 3) Safety: "dry‑run rejects budget violation" — construct net where the proposed removals leave the dry‑run net outside ctx.budgets (for example, below a minConnectivity floor); expect(dryRunRejected). (1 expect)
745
+
746
+ 5. Metrics and Acceptance criteria (numeric)
747
+
748
+ 1) Deterministic rebuild reproducibility: 100% identical phenotype ordering and hash across N=50 rebuilds.
749
+ 2) Pruning policy overhead: disabled path adds <1% runtime; enabled idle adds <5% runtime in microbench (instrumented).
750
+ 3) Pruned edges preference: at least 75% of first‑batch removals should be inter‑module edges or among the top 25% longest edges (measured on synthetic nets).
751
+ 4) Build time overhead for Phase 1 baseline: ≤1.1× direct instantiation for small nets (<=1k nodes).
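
Criterion 3 can be read as a ratio over the first removal batch. The sketch below computes that ratio under one interpretation: an edge counts as "preferred" if it is inter‑module or its length is at or above the 75th‑percentile cutoff of all edges. `EdgeInfo` and `preferredRemovalFraction` are hypothetical names for this illustration only.

```typescript
// Hypothetical edge summary used only for this measurement sketch.
interface EdgeInfo {
  id: number;
  length: number;
  interModule: boolean;
}

// Fraction of removed edges that are inter-module or among the longest 25%.
function preferredRemovalFraction(removed: EdgeInfo[], allEdges: EdgeInfo[]): number {
  if (removed.length === 0 || allEdges.length === 0) return 0;
  // Sort lengths descending and take the length of the last edge inside the top 25%.
  const lengths = allEdges.map((edge) => edge.length).sort((a, b) => b - a);
  const topCount = Math.max(1, Math.ceil(lengths.length * 0.25));
  const lengthCutoff = lengths[topCount - 1];
  const preferred = removed.filter(
    (edge) => edge.interModule || edge.length >= lengthCutoff
  ).length;
  return preferred / removed.length;
}

// Usage: eight edges with lengths 1..8; only edge 1 crosses modules.
const allEdges: EdgeInfo[] = [1, 2, 3, 4, 5, 6, 7, 8].map((n) => ({
  id: n,
  length: n,
  interModule: n === 1,
}));
const removedBatch = [allEdges[7], allEdges[6], allEdges[0], allEdges[1]];
const fraction = preferredRemovalFraction(removedBatch, allEdges);
console.log(fraction >= 0.75);
```

In a test following the single‑expect rule, the one assertion would be on `fraction >= 0.75`.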
752
+
753
+ 6. Risks & mitigations
754
+
755
+ 1) Risk: Non‑determinism from float rounding. Mitigation: canonical numeric formatting and fixed‑precision rounding for hashing and CPPN inputs.
756
+ 2) Risk: Cascade removal causing disconnected modules. Mitigation: connectivity checks in wouldViolateConnectivity and dry‑run validation.
757
+ 3) Risk: Memory blowup from adjacency cache. Mitigation: LRU eviction and explicit cache size config.
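
The LRU mitigation in risk 3 can be sketched with a plain `Map`, which iterates in insertion order. `LruCache` and `maxEntries` are hypothetical names; the plan only states that eviction should be LRU and that the cache size should be configurable.

```typescript
// Minimal LRU cache sketch for adjacency results, keyed by a fingerprint string.
// Assumption: entry count is the size limit; a byte-based budget would need extra accounting.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    if (maxEntries < 1) throw new Error('maxEntries must be >= 1');
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage: with room for two entries, touching "a" makes "b" the eviction victim.
const adjacencyCache = new LruCache<string, number[]>(2);
adjacencyCache.set('a', [1]);
adjacencyCache.set('b', [2]);
adjacencyCache.get('a');
adjacencyCache.set('c', [3]);
console.log(adjacencyCache.get('b') === undefined);
```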