bulltrackers-module 1.0.281 → 1.0.283

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,925 +1,210 @@
- # BullTrackers Computation System & Root Data Indexer
-
- ## Table of Contents
- 1. [System Philosophy](#system-philosophy)
- 2. [Root Data Indexer](#root-data-indexer)
- 3. [Computation Architecture](#computation-architecture)
- 4. [Context & Dependency Injection](#context--dependency-injection)
- 5. [Execution Pipeline](#execution-pipeline)
- 6. [Smart Hashing & Versioning](#smart-hashing--versioning)
- 7. [Data Loading & Streaming](#data-loading--streaming)
- 8. [Auto-Sharding System](#auto-sharding-system)
- 9. [Quality Assurance](#quality-assurance)
- 10. [Operational Modes](#operational-modes)
+ # BullTrackers Computation System: Architecture & Operational Manual
 
- ---
-
- ## System Philosophy
-
- The BullTrackers Computation System is a **dependency-aware, distributed calculation engine** that processes massive financial datasets with strict guarantees:
-
- - **Incremental Recomputation**: Only re-runs when code or dependencies change
- - **Historical Continuity**: Ensures chronological execution for time-series calculations
- - **Transparent Sharding**: Handles Firestore's 1MB document limit automatically
- - **Data Availability Gating**: Never runs when source data is missing
- - **Cascading Invalidation**: Upstream changes automatically invalidate downstream results
-
- ### Key Design Principles
-
- **1. Source of Truth Paradigm**
- - The Root Data Indexer creates a daily availability manifest
- - Computations are gated by data availability checks
- - Missing data triggers "IMPOSSIBLE" states, not retries
-
- **2. Merkle Tree Dependency Hashing**
- - Every calculation has a hash that includes:
- - Its own source code
- - Hashes of math layers it uses
- - Hashes of calculations it depends on
- - Changes cascade: updating calculation A invalidates all dependents
-
- **3. Stateless Execution**
- - Each worker receives complete context for its task
- - No inter-worker communication
- - Infinitely horizontally scalable
+ This document provides a comprehensive overview of the BullTrackers Computation System, a distributed, deterministic, and self-optimizing data pipeline. Unlike traditional task schedulers, this system operates on "Build System" principles, treating data calculations as compiled artifacts with strict versioning and dependency guarantees.
 
  ---
 
- ## Root Data Indexer
-
- ### Purpose
-
- The Root Data Indexer runs daily to scan all data sources and create a **centralized availability manifest**. This prevents the computation system from attempting to run calculations when source data doesn't exist.
-
- ### Architecture
-
- ```
- ┌──────────────────────────────────────────────────────┐
- │ Root Data Indexer (Daily Scan) │
- ├──────────────────────────────────────────────────────┤
- │ For each date (2023-01-01 → Tomorrow): │
- │ 1. Check Normal User Portfolios (Canary: 19M) │
- │ 2. Check Speculator Portfolios (Canary: 19M) │
- │ 3. Check Normal Trade History (Canary: 19M) │
- │ 4. Check Speculator Trade History (Canary: 19M) │
- │ 5. Check Daily Insights (/{date}) │
- │ 6. Check Social Posts (/{date}/posts) │
- │ 7. Pre-Load Price Shard (shard_0) → Date Map │
- │ │
- │ Output: /system_root_data_index/{date} │
- │ { │
- │ hasPortfolio: true, │
- │ hasHistory: false, │
- │ hasInsights: true, │
- │ hasSocial: true, │
- │ hasPrices: true, │
- │ details: { │
- │ normalPortfolio: true, │
- │ speculatorPortfolio: false, │
- │ normalHistory: false, │
- │ speculatorHistory: false │
- │ } │
- │ } │
- └──────────────────────────────────────────────────────┘
- ```
-
- ### Canary Block System
-
- Instead of scanning every user block (which would be prohibitively expensive), the indexer uses **representative blocks**:
-
- - **Block 19M**: Statistically verified to always contain data when the system is healthy
- - **Part 0**: First part of sharded collections
-
- **Logic**: If Block 19M has data for a date, all blocks have data for that date. This is a deliberate architectural assumption that reduces indexing cost by 99%.
-
- ### Granular UserType Tracking
-
- The indexer distinguishes between:
- - **Normal Users**: Traditional portfolio tracking (AggregatedPositions)
- - **Speculators**: Advanced traders (PublicPositions with leverage/SL/TP)
-
- This enables calculations to declare: `userType: 'speculator'` and be gated only when speculator data exists.
-
- ### Price Data Optimization
-
- Price data is handled differently:
+ ## 1. System Philosophy & Core Concepts
 
- 1. **Pre-Load Once**: The entire `shard_0` document is loaded into memory
- 2. **Extract Date Keys**: All dates with price data are extracted into a `Set`
- 3. **Fast Lookup**: Each date check becomes O(1) instead of a Firestore read
+ ### The "Build System" Paradigm
+ We treat the computation pipeline like a large-scale software build system (e.g., Bazel or Make). Every data point is an "artifact" produced by a specific version of code (Code Hash) acting on specific versions of dependencies (Dependency Hashes).
+ * **Determinism**: If the input data and code haven't changed, the output *must* be identical. We verify this to skip unnecessary work.
+ * **Merkle Tree Structure**: The state of the system is a DAG (Directed Acyclic Graph) of hashes. A change in a root node propagates potential invalidation down the tree, but invalidation stops as soon as a node produces the same output as before (Short-Circuiting); a minimal sketch follows this list.
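+
+ A minimal sketch of this Merkle-style composition (the helper `computeNodeHash` is illustrative, not the module's actual API; the `|DEPS:` layout mirrors the hash composition shown later in this diff):
+
+ ```javascript
+ const crypto = require('crypto');
+ const sha256 = (s) => crypto.createHash('sha256').update(s).digest('hex');
+
+ // Hypothetical node shape: { codeHash: string, deps: Node[] }
+ // A node's hash folds in the hashes of everything it depends on,
+ // so any upstream change cascades into a new hash here.
+ function computeNodeHash(node) {
+   const depHashes = node.deps.map(computeNodeHash).sort().join('|');
+   return sha256(`${node.codeHash}|DEPS:${depHashes}`);
+ }
+ ```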
 
- This reduces price availability checks from **~1000 reads/day** to **1 read total**.
+ ### Source-of-Truth Architecture
+ The **Root Data Index** is the absolute source of truth. No computation can start until the underlying raw data (prices, signals) is indexed and verified "Available" for the target date. This prevents partial runs and "garbage-in-garbage-out".
 
- ### Index Schema
-
- ```javascript
- {
- date: "2024-12-07",
- lastUpdated: Timestamp,
-
- // Aggregate Flags (true if ANY subtype exists)
- hasPortfolio: true, // normalPortfolio OR speculatorPortfolio
- hasHistory: false, // normalHistory OR speculatorHistory
- hasInsights: true, // Insights document exists
- hasSocial: true, // At least 1 social post exists
- hasPrices: true, // Price data exists for this date
-
- // Granular Breakdown
- details: {
- normalPortfolio: true,
- speculatorPortfolio: false,
- normalHistory: false,
- speculatorHistory: false
- }
- }
- ```
-
- ### Availability Check Logic (Computation System)
-
- ```javascript
- // AvailabilityChecker.js - checkRootDependencies()
-
- if (calculation.rootDataDependencies includes 'portfolio') {
- if (calculation.userType === 'speculator') {
- if (!rootDataStatus.speculatorPortfolio) → MISSING
- } else if (calculation.userType === 'normal') {
- if (!rootDataStatus.normalPortfolio) → MISSING
- } else {
- // userType: 'all' or 'aggregate'
- if (!rootDataStatus.hasPortfolio) → MISSING
- }
- }
-
- // Similar logic for 'history' dependency
- // Global data types (insights, social, price) have no subtypes
- ```
-
- **Critical Behavior**:
- - If data is missing for a **historical date** → Mark calculation as `IMPOSSIBLE` (permanent failure)
- - If data is missing for **today's date** → Mark as `BLOCKED` (retriable, data may arrive later)
+ ### The Three-Layer Hash Model
+ To optimize execution, we track three distinct hashes for every calculation:
+ 1. **Code Hash (Static)**: A SHA-256 hash of the cleaned source code (comments and whitespace stripped). This tells us if the logic *might* have changed.
+ 2. **SimHash (Behavioral)**: Generated by running the code against a deterministic "Fabricated" context. This tells us if the logic *actually* changed behavior (e.g., a refactor that changes variable names but not logic will have a different Code Hash but the same SimHash).
+ 3. **ResultHash (Output)**: A hash of the actual production output from a run. This tells us if the data changed. Used for downstream short-circuiting; an illustrative ledger entry follows this list.
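+
+ Conceptually, a status-ledger entry for one calculation on one date carries all three fingerprints. A sketch with illustrative field names (not the exact stored schema):
+
+ ```javascript
+ // computation_status/{date} — one entry per calculation (illustrative shape)
+ const statusEntry = {
+   calculation: 'rsi',
+   codeHash: '<sha256 of cleaned source>',      // static
+   simHash: '<sha256 of simulated output>',     // behavioral
+   resultHash: '<sha256 of production output>', // output
+   dependencyHashes: { 'price-extractor': '<upstream ResultHash>' }
+ };
+ ```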
 
  ---
 
- ## Computation Architecture
-
- ### Calculation Types
-
- #### **Standard (Per-User) Computations**
-
- ```javascript
- class UserRiskProfile {
- static getMetadata() {
- return {
- type: 'standard', // Runs once per user
- category: 'risk-management', // Storage category
- isHistorical: true, // Needs yesterday's data
- rootDataDependencies: ['portfolio', 'history'],
- userType: 'speculator' // Only for speculator users
- };
- }
-
- static getDependencies() {
- return ['market-volatility']; // Needs this calc to run first
- }
-
- async process(context) {
- const { user, computed, math } = context;
- // Process individual user
- this.results[user.id] = { riskScore: /* ... */ };
- }
- }
- ```
-
- **Execution Model**:
- - Portfolio data streams in batches (50 users at a time)
- - Each user processed independently
- - Memory-efficient for millions of users
+ ## 2. Core Components Overview
 
- #### **Meta (Once-Per-Day) Computations**
-
- ```javascript
- class MarketMomentum {
- static getMetadata() {
- return {
- type: 'meta', // Runs once per day globally
- category: 'market-signals',
- rootDataDependencies: ['price', 'insights']
- };
- }
-
- async process(context) {
- const { prices, insights, math } = context;
- // Process all tickers
- for (const [instId, data] of Object.entries(prices.history)) {
- this.results[data.ticker] = { momentum: /* ... */ };
- }
- }
- }
- ```
-
- **Execution Model**:
- - Loads all price data (or processes in shard batches)
- - Runs once, produces global results
- - Used for market-wide analytics
+ ### Root Data Indexer
+ A scheduled crawler that verifies the availability of raw external data (e.g., asset prices, global signals) for a given date. It produces an "Availability Manifest" that the Dispatcher consults before scheduling anything.
 
  ### Manifest Builder
+ * **Role**: Topology Discovery.
+ * **Mechanism**: It scans the `calculations/` directory, loads every module, and builds the global Dependency Graph (DAG) in memory.
+ * **Output**: A topological sort of all calculations assigned to "Passes" (Pass 0, Pass 1, etc.); a sketch of the pass assignment follows this list.
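+
+ A minimal sketch of the pass assignment (Kahn-style layering over `getDependencies()`, which is part of the calculation API described in this manual; the function itself is illustrative):
+
+ ```javascript
+ // calcs: Map<name, { getDependencies: () => string[] }>
+ function assignPasses(calcs) {
+   const passes = new Map();
+   const visiting = new Set();
+   function passOf(name) {
+     if (passes.has(name)) return passes.get(name);
+     if (visiting.has(name)) throw new Error(`Circular dependency at ${name}`);
+     visiting.add(name);
+     const deps = calcs.get(name).getDependencies();
+     // A calculation runs one pass after its deepest dependency.
+     const pass = deps.length ? Math.max(...deps.map(passOf)) + 1 : 0;
+     visiting.delete(name);
+     passes.set(name, pass);
+     return pass;
+   }
+   for (const name of calcs.keys()) passOf(name);
+   return passes;
+ }
+ ```
+
+ The same walk doubles as circular-dependency detection: a cycle re-enters a node that is still being visited and aborts the manifest build.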
 
- The Manifest Builder automatically:
+ ### The Dispatcher (`WorkflowOrchestrator.js`)
+ The "Brain" of the system. It is largely stateless, analyzing the `StatusRepository` against the `Manifest`.
+ * **Responsibility**: For a given Grid (Date x Calculation), it determines if the state is `RUNNABLE`, `BLOCKED`, `SKIPPED`, or `IMPOSSIBLE`.
+ * **Key Logic**: It implements the "Short-Circuiting" and "Historical Continuity" checks.
 
- 1. **Discovers** all calculation classes in the codebase
- 2. **Analyzes** their dependencies
- 3. **Sorts** them topologically (builds a DAG)
- 4. **Assigns** pass numbers (execution waves)
- 5. **Generates** smart hashes for each calculation
+ ### The Build Optimizer
+ A pre-flight tool that attempts to avoid running tasks by proving they are identical to previous versions.
+ * **Mechanism**: If a calculation's Code Hash changes, the Optimizer runs a **Simulation** (using `SimRunner`) to generate a SimHash. If the SimHash matches the registry, the system acts as if the code never changed, skipping the production re-run.
 
- ```
- Pass 1 (No Dependencies):
- - market-volatility
- - price-momentum
-
- Pass 2 (Depends on Pass 1):
- - user-risk-profile (needs market-volatility)
- - sentiment-score (needs price-momentum)
-
- Pass 3 (Depends on Pass 2):
- - combined-signal (needs sentiment-score + user-risk-profile)
- ```
-
- **Circular Dependency Detection**: If `A → B → C → A`, the builder throws a fatal error and refuses to generate a manifest.
+ ### The Worker (`StandardExecutor` / `MetaExecutor`)
+ The execution unit. It is unaware of the broader topology.
+ * **Input**: A target Calculation and Date.
+ * **Action**: Fetches inputs, runs `process()`, validates results, and writes to Firestore.
+ * **Output**: The computed data + the **ResultHash**.
 
  ---
 
- ## Context & Dependency Injection
+ ## 3. The Daily Lifecycle (Chronological Process)
 
- ### Context Structure
+ ### Phase 1: Indexing
+ The system waits for the `SystemEpoch` to advance. The Root Data Indexer checks for "Canary Blocks" (indicators that external data providers have finished for the day). Once confirmed, the date is marked `OPEN`.
 
- Every calculation receives exactly what it declares:
+ ### Phase 2: Pre-Flight Optimization
+ Before dispatching workers:
+ 1. The system identifies all calculations with new **Code Hashes**.
+ 2. It runs `SimRunner` for these calculations to generate fresh **SimHashes**.
+ 3. If `SimHash(New) == SimHash(Old)`, the system updates the Status Ledger to enable the new Code Hash without flagging it as "Changed" (see the sketch below).
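+
+ A compact sketch of that pre-flight decision (`preflight`, `simRunner.run`, and the ledger helpers are assumed names, not the real API):
+
+ ```javascript
+ // Runs for each calculation whose Code Hash differs from the ledger entry.
+ async function preflight(calc, ledger, simRunner) {
+   const simHash = await simRunner.run(calc);          // behavioral fingerprint
+   if (simHash === ledger.simHashFor(calc.name)) {
+     // Cosmetic change: adopt the new Code Hash, skip the production re-run.
+     await ledger.markEquivalent(calc.name, calc.codeHash, simHash);
+     return 'SKIP';
+   }
+   return 'RERUN';                                     // behavior changed
+ }
+ ```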
 
- ```javascript
- {
- // IDENTITY (Standard only)
- user: {
- id: "user_123",
- type: "speculator",
- portfolio: { today: {...}, yesterday: {...} },
- history: { today: {...}, yesterday: {...} }
- },
-
- // TEMPORAL
- date: { today: "2024-12-07" },
-
- // ROOT DATA (if declared)
- insights: { today: {...}, yesterday: {...} },
- social: { today: {...}, yesterday: {...} },
- prices: { history: {...} },
-
- // MAPPINGS
- mappings: {
- tickerToInstrument: { "AAPL": 123 },
- instrumentToTicker: { 123: "AAPL" }
- },
-
- // MATH LAYERS (always injected)
- math: {
- extract: DataExtractor,
- compute: MathPrimitives,
- signals: SignalPrimitives,
- history: HistoryExtractor,
- insights: InsightsExtractor,
- priceExtractor: priceExtractor,
- // ... 20+ utility classes
- },
-
- // DEPENDENCIES (if declared)
- computed: {
- "market-volatility": { "AAPL": { volatility: 0.25 } }
- },
-
- // HISTORICAL DEPENDENCIES (if isHistorical: true)
- previousComputed: {
- "market-volatility": { "AAPL": { volatility: 0.23 } }
- }
- }
- ```
-
- ### Lazy Loading Optimization
-
- The system **only loads what you declare**:
-
- ```javascript
- // Calculation A declares:
- rootDataDependencies: ['portfolio']
-
- // Context A receives:
- { user: { portfolio: {...} } } // No insights, social, or prices loaded
-
-
- // Calculation B declares:
- rootDataDependencies: ['portfolio', 'insights']
-
- // Context B receives:
- {
- user: { portfolio: {...} },
- insights: { today: {...} } // Insights fetched on-demand
- }
- ```
-
- This prevents unnecessary Firestore reads and keeps memory usage minimal.
-
- ---
-
- ## Execution Pipeline
-
- ### Phase 1: Analysis & Dispatch
-
- ```
- ┌─────────────────────────────────────────────────────┐
- │ Computation Dispatcher │
- │ (Smart Pre-Flight Checker) │
- ├─────────────────────────────────────────────────────┤
- │ For each date in range: │
- │ 1. Fetch Root Data Index │
- │ 2. Fetch Computation Status (stored hashes) │
- │ 3. Fetch Yesterday's Status (historical check) │
- │ 4. Run analyzeDateExecution() │
- │ │
- │ Decision Logic per Calculation: │
- │ ├─ Root Data Missing? │
- │ │ ├─ Historical Date → Mark IMPOSSIBLE │
- │ │ └─ Today's Date → Mark BLOCKED (retriable) │
- │ │ │
- │ ├─ Dependency Impossible? │
- │ │ └─ Mark IMPOSSIBLE (cascading failure) │
- │ │ │
- │ ├─ Dependency Missing/Hash Mismatch? │
- │ │ └─ Mark BLOCKED (wait for dependency) │
- │ │ │
- │ ├─ Historical Continuity Broken? │
- │ │ └─ Mark BLOCKED (wait for yesterday) │
- │ │ │
- │ ├─ Hash Mismatch? │
- │ │ └─ Mark RUNNABLE (re-run needed) │
- │ │ │
- │ └─ Hash Match? │
- │ └─ Mark SKIPPED (up-to-date) │
- │ │
- │ 5. Create Audit Ledger (PENDING state) │
- │ 6. Publish RUNNABLE tasks to Pub/Sub │
- └─────────────────────────────────────────────────────┘
- ```
-
- ### Phase 2: Worker Execution
-
- ```
- ┌─────────────────────────────────────────────────────┐
- │ Computation Worker │
- │ (Processes Single Task) │
- ├─────────────────────────────────────────────────────┤
- │ 1. Parse Pub/Sub Message │
- │ { date, pass, computation, previousCategory } │
- │ │
- │ 2. Load Manifest (cached in memory) │
- │ │
- │ 3. Fetch Dependencies │
- │ - Load dependency results (auto-reassemble) │
- │ - Load previous day's results (if historical) │
- │ │
- │ 4. Execute Calculation │
- │ ├─ Standard: Stream users in batches │
- │ └─ Meta: Load global data / price shards │
- │ │
- │ 5. Validate Results (HeuristicValidator) │
- │ - NaN Detection │
- │ - Flatline Detection (stuck values) │
- │ - Null/Empty Analysis │
- │ - Dead Object Detection │
- │ │
- │ 6. Store Results │
- │ ├─ Calculate size │
- │ ├─ If > 900KB → Auto-shard │
- │ └─ Write to Firestore │
- │ │
- │ 7. Update Status & Ledgers │
- │ ├─ computation_status/{date} → New hash │
- │ ├─ Audit Ledger → COMPLETED │
- │ └─ Run History → SUCCESS/FAILURE record │
- │ │
- │ 8. Category Migration (if detected) │
- │ └─ Delete old category's data │
- └─────────────────────────────────────────────────────┘
- ```
+ ### Phase 3: Dispatch Analysis
+ The Dispatcher iterates through the Topological Passes (0 -> N). For each calculation, it queries `calculateExecutionStatus` (condensed in the sketch after this list):
+ * Are dependencies done?
+ * Did dependencies change their output (`ResultHash`)?
+ * Is historical context available?
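+
+ A condensed sketch of that decision, folding in the dispatch rules from the removed flow chart above (the function shape and `cell` fields are illustrative):
+
+ ```javascript
+ // One (date, calculation) cell of the grid.
+ function calculateExecutionStatus(cell) {
+   if (cell.rootDataMissing) {
+     return cell.isHistoricalDate ? 'IMPOSSIBLE' : 'BLOCKED'; // retriable today
+   }
+   if (cell.deps.some((d) => d.status === 'IMPOSSIBLE')) return 'IMPOSSIBLE';
+   if (cell.deps.some((d) => d.status !== 'DONE')) return 'BLOCKED';
+   const depsUnchanged = cell.deps.every(
+     (d) => d.resultHash === cell.lastRunDeps[d.name]
+   );
+   if (cell.codeUnchanged && depsUnchanged) return 'SKIPPED';
+   return 'RUNNABLE';
+ }
+ ```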
 
- ### Error Handling Stages
+ ### Phase 4: Execution Waves
+ Workers are triggered via Pub/Sub or direct method invocation.
+ * **Pass 1**: Primitive conversions (e.g., Price Extractor).
+ * **Pass 2**: Technical Indicators that depend on Pass 1.
+ * **Pass 3**: Aggregations and Complex Metrics.
 
- The system tracks **where** failures occur:
-
- ```javascript
- // Run History Schema
- {
- status: "FAILURE" | "SUCCESS" | "CRASH",
- error: {
- message: "...",
- stage: "EXECUTION" | "PREPARE_SHARDS" | "COMMIT_BATCH" |
- "SHARDING_LIMIT_EXCEEDED" | "QUALITY_CIRCUIT_BREAKER" |
- "MANIFEST_LOAD" | "SYSTEM_CRASH"
- }
- }
- ```
-
- **Stage-Specific Handling**:
- - `QUALITY_CIRCUIT_BREAKER`: Block deployment, data integrity issue
- - `SHARDING_LIMIT_EXCEEDED`: Firestore hard limit hit, needs redesign
- - `SYSTEM_CRASH`: Infrastructure issue, retriable
- - `EXECUTION`: Logic bug in calculation code
+ ### Phase 5: Reconciliation
+ After all queues drain, the system performs a final sweep. Any tasks marked `FAILED` are retried (up to a limit). Impossible tasks are finalized as `IMPOSSIBLE`.
 
  ---
 
- ## Smart Hashing & Versioning
-
- ### Hash Composition
+ ## 4. Deep Dive: Hashing & Dependency Logic
 
+ ### Intrinsic Code Hashing
+ Located in `topology/HashManager.js`.
+ We generate a unique fingerprint for every calculation file:
  ```javascript
- // Step 1: Intrinsic Hash (Code + System Epoch)
- const codeHash = SHA256(calculation.toString());
- const intrinsicHash = SHA256(codeHash + "|EPOCH:v1.0-epoch-2");
-
- // Step 2: Layer Hashing (Dynamic Detection)
- let compositeHash = intrinsicHash;
- for (const [layer, exports] of MATH_LAYERS) {
- for (const [exportName, triggerPatterns] of exports) {
- if (codeString.includes(exportName)) {
- compositeHash += layerHashes[layer][exportName];
- }
- }
- }
-
- // Step 3: Dependency Hashing (Merkle Tree)
- const depHashes = dependencies.map(dep => dep.hash).join('|');
- const finalHash = SHA256(compositeHash + "|DEPS:" + depHashes);
+ // A runnable sketch of the clean-then-hash step (Node's crypto; the regex strips comments, then all whitespace):
+ const crypto = require('crypto');
+ const clean = codeString.replace(/\/\/.*|\/\*[\s\S]*?\*\//g, '').replace(/\s+/g, '');
+ const hash = crypto.createHash('sha256').update(clean).digest('hex');
  ```
+ This ensures that changes to comments or formatting do *not* trigger re-runs.
 
- ### Cascading Invalidation Example
-
- ```
- Initial State:
- PriceVolatility → hash: abc123
- UserRisk (uses PV) → hash: def456 (includes abc123)
- Signal (uses UR) → hash: ghi789 (includes def456)
-
- Developer Updates PriceVolatility:
- PriceVolatility → hash: xyz000 (NEW!)
- UserRisk → hash: uvw111 (NEW! Dependency changed)
- Signal → hash: rst222 (NEW! Cascade)
-
- Next Dispatch:
- All 3 calculations marked RUNNABLE (hash mismatch)
- ```
-
- ### System Epoch
-
- `system_epoch.js`:
- ```javascript
- module.exports = "v1.0-epoch-2";
- ```
+ ### Behavioral Hashing (SimHash)
+ Located in `simulation/SimRunner.js`.
+ When code changes, we can't be 100% sure it's safe just by looking at the source.
+ 1. **The Fabricator**: Generates a deterministic mock `Context` (prices, previous results) based on the input schema.
+ 2. **Simulation Run**: The calculation `process()` method is executed against this mock data.
+ 3. **The Registry**: The hash of the *output* of this simulation is stored.
+ If a refactor results in the exact same Mock Output, the system considers the change "Cosmetic".
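+
+ A minimal sketch of the simulation loop (`fabricateContext` and `stableStringify` stand in for the Fabricator and a canonical serializer; they are illustrative, not the real exports):
+
+ ```javascript
+ const crypto = require('crypto');
+
+ async function computeSimHash(CalcClass, fabricateContext, stableStringify) {
+   const calc = new CalcClass();
+   calc.results = {};
+   // Deterministic mock context: the same seed yields the same inputs every run.
+   const ctx = fabricateContext(CalcClass.getMetadata(), { seed: 42 });
+   await calc.process(ctx);
+   // Hash the canonicalized output so key order cannot affect the digest.
+   return crypto.createHash('sha256')
+     .update(stableStringify(calc.results))
+     .digest('hex');
+ }
+ ```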
 
- **Purpose**: Changing this string forces **global re-computation** of all calculations, even if code hasn't changed. Used for:
- - Schema migrations
- - Critical bug fixes requiring historical reprocessing
- - Firestore structure changes
+ ### Dependency Short-Circuiting
+ Implemented in `WorkflowOrchestrator.js` (`analyzeDateExecution`).
+ Even if an upstream calculation re-runs, downstream dependents might not need to.
+ * **Logic**:
+ * Calc A (Upstream) re-runs. Old Output Hash: `HashX`. New Output Hash: `HashX`.
+ * Calc B (Downstream) sees that Calc A "changed" (new timestamp), BUT the content hash `HashX` is identical to what Calc B used last time.
+ * **Result**: Calc B is `SKIPPED` (see the sketch below).
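+
+ The core comparison, reduced to a sketch (field names illustrative; the real check lives in `analyzeDateExecution`):
+
+ ```javascript
+ // lastRunDeps: dependency ResultHashes recorded when the downstream calc last ran.
+ // currentDeps: the ResultHashes its dependencies report right now.
+ function needsRerun(lastRunDeps, currentDeps) {
+   return Object.keys(currentDeps).some(
+     (dep) => lastRunDeps[dep] !== currentDeps[dep]
+   );
+ }
+
+ needsRerun({ PriceExtractor: 'HashX' }, { PriceExtractor: 'HashX' }); // false → SKIPPED
+ needsRerun({ PriceExtractor: 'HashX' }, { PriceExtractor: 'HashY' }); // true  → RUNNABLE
+ ```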
 
  ---
 
- ## Data Loading & Streaming
-
- ### Streaming Architecture (Standard Computations)
-
- ```javascript
- // Problem: 10M users × 5KB portfolio = 50GB
- // Solution: Stream in chunks
-
- async function* streamPortfolioData(dateStr, refs) {
- const BATCH_SIZE = 50; // 50 users at a time
-
- for (let i = 0; i < refs.length; i += BATCH_SIZE) {
- const batchRefs = refs.slice(i, i + BATCH_SIZE);
- const userData = await loadDataByRefs(batchRefs);
-
- yield userData; // { user1: {...}, user2: {...}, ... }
-
- // Memory cleared after each iteration
- }
- }
-
- // Usage in Executor
- for await (const userBatch of streamPortfolioData(date, refs)) {
- for (const [userId, portfolio] of Object.entries(userBatch)) {
- const context = buildContext({ userId, portfolio, ... });
- await calculation.process(context);
- }
- }
- ```
-
- ### Price Data Batching (Meta Computations)
-
- ```javascript
- // Problem: 10,000 tickers × 2 years history = 1GB
- // Solution: Process shards sequentially
-
- for (const shardRef of priceShardRefs) {
- const shardData = await loadPriceShard(shardRef); // ~100 tickers
-
- const context = buildMetaContext({
- prices: { history: shardData }
- });
-
- await calculation.process(context);
-
- // Results accumulate across shards
- // Memory cleared between shards
- }
- ```
-
- ### Smart Shard Indexing (Optimization)
-
- For targeted price lookups (e.g., "only calculate momentum for AAPL, GOOGL, MSFT"):
-
- ```javascript
- // Without Indexing: Load ALL shards, filter after
- // Cost: 100+ Firestore reads
-
- // With Indexing: Pre-map which shard contains each instrument
- const index = {
- "123": "shard_0", // AAPL
- "456": "shard_2", // GOOGL
- "789": "shard_0" // MSFT
- };
-
- const relevantShards = ["shard_0", "shard_2"];
- // Cost: 2 Firestore reads
- ```
-
- **Index Building**: Runs once, cached in `/system_metadata/price_shard_index`.
+ ## 5. Decision Logic & Edge Case Scenarios
+
+ ### Scenario A: Standard Code Change (Logic)
+ * **Trigger**: You change the formula for `RSI`. Code Hash changes. SimHash changes.
+ * **Dispatcher**: Sees `storedHash !== currentHash`.
+ * **Result**: Marks as `RUNNABLE`. Worker runs.
+
+ ### Scenario B: Cosmetic Code Change (Refactor)
+ * **Trigger**: You rename a variable in `RSI`. Code Hash changes. SimHash remains identical.
+ * **Optimizer**: Updates the centralized Status Ledger: "Version `Desc_v2` is equivalent to `Desc_v1`".
+ * **Dispatcher**: Sees the new hash in the ledger as "Verified".
+ * **Result**: Task is `SKIPPED`.
+
+ ### Scenario C: Upstream Invalidation (The Cascade)
+ * **Condition**: `PriceExtractor` fixes a bug. `ResultHash` changes from `HashA` to `HashB`.
+ * **Downstream**: `RSI` checks its detailed dependency report.
+ * **Check**: `LastRunDeps['PriceExtractor'] (HashA) !== CurrentDeps['PriceExtractor'] (HashB)`.
+ * **Result**: `RSI` is forced to re-run.
+
+ ### Scenario D: Upstream Stability (The Firewall)
+ * **Condition**: `PriceExtractor` runs an optimization. Output is the exact same data. `ResultHash` remains `HashA`.
+ * **Downstream**: `RSI` checks its dependency report.
+ * **Check**: `LastRunDeps['PriceExtractor'] (HashA) === CurrentDeps['PriceExtractor'] (HashA)`.
+ * **Result**: `RSI` is `SKIPPED`. This firewall prevents massive re-calculation storms for non-functional upstream changes.
+
+ ### Scenario E: The "Impossible" State
+ * **Condition**: Core market data is missing for `1990-01-01`.
+ * **Root Indexer**: Marks the date as providing `[]` (empty) for critical inputs.
+ * **Dispatcher**: Marks `PriceExtractor` as `IMPOSSIBLE: NO_DATA`.
+ * **Propagation**: Any calculation depending on `PriceExtractor` sees the `IMPOSSIBLE` status and marks *itself* as `IMPOSSIBLE: UPSTREAM`.
+ * **Benefit**: The system doesn't waste cycles retrying calculations that can never succeed.
+
+ ### Scenario F: Category Migration
+ * **Condition**: You change `getMetadata()` for a calculation, moving it from `signals` to `risk`.
+ * **Dispatcher**: Detects `storedCategory !== newCategory`.
+ * **Worker**:
+ 1. Runs `process()` and writes to the *new* path (`risk/CalculateX`).
+ 2. Detects the `previousCategory` flag.
+ 3. Deletes the data at the *old* path (`signals/CalculateX`) to prevent orphan data.
 
  ---
 
- ## Auto-Sharding System
-
- ### The 1MB Problem
-
- Firestore's hard limit: **1MB per document**. A calculation producing results for 5,000 tickers easily exceeds this.
+ ## 6. Data Management & Storage
 
- ### Transparent Sharding Solution
+ ### Input Streaming
+ To handle large datasets without OOM (Out Of Memory) errors:
+ * `StandardExecutor` does not load all users/tickers at once.
+ * It uses batch-and-stream logic (e.g., 50 IDs at a time) to build each `Context`; see the sketch below.
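+
+ A minimal sketch of that batching generator, in the spirit of the `streamPortfolioData` helper shown in the removed section (`loadDataByRefs` is assumed):
+
+ ```javascript
+ const BATCH_SIZE = 50; // IDs per Firestore round-trip
+
+ async function* streamInBatches(refs, loadDataByRefs) {
+   for (let i = 0; i < refs.length; i += BATCH_SIZE) {
+     // Only one batch is held in memory at a time.
+     yield await loadDataByRefs(refs.slice(i, i + BATCH_SIZE));
+   }
+ }
+
+ // Usage: build a Context per entry, then let the batch be garbage-collected.
+ // for await (const batch of streamInBatches(refs, loadDataByRefs)) { ... }
+ ```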
 
- ```javascript
- // ResultCommitter.js - prepareAutoShardedWrites()
-
- const totalSize = calculateFirestoreBytes(result);
-
- if (totalSize < 900KB) {
- // Write normally
- /results/{date}/{category}/{calc}
- → { "AAPL": {...}, "GOOGL": {...}, _completed: true }
-
- } else {
- // Auto-shard
- // Step 1: Split into chunks < 900KB each
- const chunks = splitIntoChunks(result);
-
- // Step 2: Write shards
- /results/{date}/{category}/{calc}/_shards/shard_0 → chunk 1
- /results/{date}/{category}/{calc}/_shards/shard_1 → chunk 2
- /results/{date}/{category}/{calc}/_shards/shard_N → chunk N
-
- // Step 3: Write pointer
- /results/{date}/{category}/{calc}
- → { _sharded: true, _shardCount: N, _completed: true }
- }
- ```
-
- ### Transparent Reassembly
-
- ```javascript
- // DependencyFetcher.js - fetchExistingResults()
-
- const doc = await docRef.get();
- const data = doc.data();
-
- if (data._sharded === true) {
- // 1. Fetch all shards
- const shardsCol = docRef.collection('_shards');
- const snapshot = await shardsCol.get();
-
- // 2. Merge back into single object
- const assembled = {};
- snapshot.forEach(shard => {
- Object.assign(assembled, shard.data());
- });
-
- // 3. Return as if never sharded
- return assembled;
- }
-
- // Normal path: return as-is
- return data;
- ```
+ ### Transparent Auto-Sharding
+ Firestore has a 1MB document limit.
+ * **Write Path**: If a calculation result exceeds 900KB, it is split into `DocID`, `DocID_shard1`, `DocID_shard2`.
+ * **Read Path**: The `DependencyFetcher` automatically detects sharding pointers and re-assembles (hydrates) the full object before passing it to `process()` (sketched below).
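+
+ A sketch of the read path, condensed from the `DependencyFetcher` logic shown in the removed section (shards live in a `_shards` subcollection behind a `_sharded` pointer document):
+
+ ```javascript
+ async function fetchResult(docRef) {
+   const doc = await docRef.get();
+   const data = doc.data();
+   if (!data || data._sharded !== true) return data; // normal, unsharded path
+
+   // Pointer doc: hydrate by merging every shard back into one object.
+   const snapshot = await docRef.collection('_shards').get();
+   const assembled = {};
+   snapshot.forEach((shard) => Object.assign(assembled, shard.data()));
+   return assembled;
+ }
+ ```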
 
- **Developer Experience**: You **never** know or care whether data is sharded. Read and write as if documents have no size limit.
-
- ### Sharding Limits
-
- **Maximum Calculation Size**: ~450MB (500 shards × 900KB)
-
- If a calculation exceeds this, the system throws:
- ```
- error: {
- stage: "SHARDING_LIMIT_EXCEEDED",
- message: "Firestore subcollection limit reached"
- }
- ```
-
- **Solution**: Refactor calculation to produce less data or split into multiple calculations.
+ ### Compression Strategy
+ * Payloads are inspected before write.
+ * If compression pays off (large, repetitive text/JSON), Zlib compression is applied.
+ * Metadata is tagged `encoding: 'zlib'` so readers know to inflate; see the sketch below.
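+
+ A minimal sketch of that encode/decode pair using Node's built-in `zlib` (the envelope shape is illustrative):
+
+ ```javascript
+ const zlib = require('zlib');
+
+ function encodePayload(obj) {
+   const raw = Buffer.from(JSON.stringify(obj), 'utf8');
+   const deflated = zlib.deflateSync(raw);
+   // Keep the compressed form only when it actually saves space.
+   return deflated.length < raw.length
+     ? { encoding: 'zlib', data: deflated.toString('base64') }
+     : { encoding: 'none', data: raw.toString('base64') };
+ }
+
+ function decodePayload({ encoding, data }) {
+   const buf = Buffer.from(data, 'base64');
+   const raw = encoding === 'zlib' ? zlib.inflateSync(buf) : buf;
+   return JSON.parse(raw.toString('utf8'));
+ }
+ ```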
 
  ---
 
- ## Quality Assurance
-
- ### HeuristicValidator (Grey Box Testing)
-
- Runs statistical analysis on results **before storage**:
-
- ```javascript
- // ResultsValidator.js
-
- 1. NaN Detection
- - Scans sample of results for NaN/Infinity
- - Threshold: 0% (strict, NaN is always a bug)
+ ## 7. Quality Assurance & Self-Healing
 
- 2. Flatline Detection
- - Checks if >95% of values are identical
- - Catches stuck loops or broken RNG
+ ### The Heuristic Validator
+ Before saving *any* result, the Executor runs heuristics (condensed in the sketch after this list):
+ * **NaN Check**: Are there `NaN` or `Infinity` values in key fields?
+ * **Flatline Check**: Is the data variance 0.00 across a large timespan?
+ * **Null Density**: Is >50% of the dataset null?
+ * **Circuit Breaker**: If heuristics fail, the task throws an error. It is better to fail and alert than to persist corrupted data that pollutes the cache.
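+
+ A condensed sketch of these checks (thresholds follow the figures quoted above; the function shape is illustrative):
+
+ ```javascript
+ function validateResults(results) {
+   const values = Object.values(results)
+     .flatMap((r) => (r && typeof r === 'object' ? Object.values(r) : [r]));
+   const nums = values.filter((v) => typeof v === 'number');
+
+   if (nums.some((v) => !Number.isFinite(v))) {
+     throw new Error('QUALITY_CIRCUIT_BREAKER: NaN/Infinity detected');
+   }
+   const nullPct = values.filter((v) => v == null).length / (values.length || 1);
+   if (nullPct > 0.5) {
+     throw new Error('QUALITY_CIRCUIT_BREAKER: null density above 50%');
+   }
+   if (nums.length > 1 && new Set(nums).size === 1) {
+     throw new Error('QUALITY_CIRCUIT_BREAKER: flatline (zero variance)');
+   }
+ }
+ ```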
 
- 3. Null/Empty Analysis
- - Threshold: 90% of results are null/0
- - Indicates data pipeline failure
+ ### Zombie Task Recovery
+ * **Lease Mechanism**: When a task starts, it sets a `startedAt` timestamp.
+ * **Detection**: The Dispatcher checks for tasks marked `RUNNING` whose `startedAt` is more than 15 minutes old.
+ * **Resolution**: These are assumed crashed (OOM/Timeout). They are reset to `PENDING` (or `FAILED` if the retry count is exceeded); see the sketch below.
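+
+ A minimal sketch of the sweep (the 15-minute lease comes from this manual; task storage access is illustrative):
+
+ ```javascript
+ const LEASE_MS = 15 * 60 * 1000;
+
+ function sweepZombies(tasks, now = Date.now()) {
+   for (const task of tasks) {
+     if (task.status !== 'RUNNING') continue;
+     if (now - task.startedAt < LEASE_MS) continue; // lease still valid
+     // Past the lease: assume the worker crashed (OOM/timeout).
+     task.status = task.retries >= task.maxRetries ? 'FAILED' : 'PENDING';
+   }
+ }
+ ```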
 
- 4. Dead Object Detection
- - Finds objects where all properties are null/0
- - Example: { profile: [], score: 0, signal: null }
-
- 5. Vector Emptiness (Distribution Calcs)
- - Checks if histogram/profile arrays are empty
- - Threshold: 90% empty → FAIL
- ```
-
- **Circuit Breaker**: If validation fails, the calculation **does not store results** and is marked as `FAILURE` with stage: `QUALITY_CIRCUIT_BREAKER`.
-
- ### Validation Overrides
-
- For legitimately sparse datasets:
-
- ```javascript
- // validation_overrides.js
- module.exports = {
- "bankruptcy-detector": {
- maxZeroPct: 100 // Rare event, 100% zeros is expected
- },
- "earnings-surprise": {
- maxNullPct: 99 // Only runs on earnings days
- }
- };
- ```
-
- ### Build Reporter (Pre-Deployment Analysis)
-
- ```bash
- npm run build-reporter
- ```
-
- Generates a **simulation report** without running calculations:
-
- ```
- Build Report: v1.2.5_2024-12-07
- ================================
-
- Summary:
- - 1,245 Re-Runs (hash mismatch)
- - 23 New Calculations
- - 0 Impossible
- - 45 Blocked (waiting for data)
-
- Detailed Breakdown:
-
- 2024-12-01:
- Will Re-Run:
- - user-risk-profile (Hash: abc123 → xyz789)
- - sentiment-score (Hash: def456 → uvw012)
-
- Blocked:
- - social-sentiment (Missing Root Data: social)
-
- 2024-12-02:
- Will Run:
- - new-momentum-signal (New calculation)
- ```
-
- **Use Case**: Review before deploying to production. If 10,000 re-runs are detected, investigate whether code change was intentional.
+ ### Dead Letter Queue (DLQ)
+ Tasks that deterministically fail (crash every time) after N retries are moved to a special DLQ status. This prevents the system from getting stuck in an infinite retry loop.
 
  ---
 
- ## Operational Modes
+ ## 8. Developer Workflows
 
- ### Mode 1: Local Orchestrator (Development)
+ ### How to Add a New Calculation
+ 1. Create `calculations/category/MyNewCalc.js`.
+ 2. Implement `getMetadata()` to define dependencies.
+ 3. Implement `process(context)`.
+ 4. Run `npm run build-manifest` to register it in the topology (a minimal skeleton of steps 1-3 follows this list).
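+
+ A minimal skeleton assembled from the calculation API shown elsewhere in this diff (`getMetadata`, `getDependencies`, `process`); treat the metadata values as placeholders:
+
+ ```javascript
+ // calculations/category/MyNewCalc.js
+ class MyNewCalc {
+   static getMetadata() {
+     return {
+       type: 'standard',                   // or 'meta' for once-per-day
+       category: 'category',
+       isHistorical: false,                // true if yesterday's result is needed
+       rootDataDependencies: ['portfolio']
+     };
+   }
+
+   static getDependencies() {
+     return [];                            // names of calcs that must run first
+   }
+
+   async process(context) {
+     this.results = this.results || {};
+     const { user } = context;
+     this.results[user.id] = { value: 0 }; // placeholder output
+   }
+ }
+
+ module.exports = MyNewCalc;
+ ```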
 
- ```bash
- # Run all calculations for Pass 1 sequentially
- COMPUTATION_PASS_TO_RUN=1 npm run computation-orchestrator
- ```
-
- **Behavior**:
- - Single-process execution
- - Loads manifest
- - Iterates through all dates
- - Runs calculations in order
- - Good for: Debugging, local testing
+ ### How to Force a Global Re-Run
+ * Change the `SYSTEM_EPOCH` constant in `system_epoch.js`.
+ * This changes the "Global Salt" for all hashes, causing every calculation to be treated as "New"; the salting step is sketched below.
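+
+ The salting step, as it appeared in the previous revision of this document (the epoch string is folded into every intrinsic hash, so bumping it invalidates everything):
+
+ ```javascript
+ const crypto = require('crypto');
+ const sha256 = (s) => crypto.createHash('sha256').update(s).digest('hex');
+ const SYSTEM_EPOCH = require('./system_epoch'); // e.g., "v1.0-epoch-2"
+
+ // A new epoch changes every intrinsic hash, so every calc reads as "New".
+ function intrinsicHash(codeHash) {
+   return sha256(`${codeHash}|EPOCH:${SYSTEM_EPOCH}`);
+ }
+ ```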
 
- ### Mode 2: Dispatcher + Workers (Production)
+ ### How to Backfill History
+ * **Standard Dispatcher**: Good for recent history (last 30 days).
+ * **BatchPriceExecutor**: Specialized for massive historical backfills (e.g., 20 years of price data). It bypasses some topology checks for raw speed.
 
+ ### Local Debugging
+ Run the orchestrator in "Dry Run" mode:
  ```bash
- # Step 1: Dispatch tasks to Pub/Sub
- COMPUTATION_PASS_TO_RUN=1 npm run computation-dispatcher
-
- # Step 2: Cloud Function workers consume tasks
- # (Auto-scaled by GCP, 0 to 1000+ workers)
+ node scripts/run_orchestrator.js --date=2024-01-01 --dry-run
  ```
-
- **Behavior**:
- - Dispatcher analyzes all dates
- - Publishes ~10,000 messages to Pub/Sub
- - Workers process in parallel
- - Each worker handles 1 date
- - Auto-retries on failure (Pub/Sub built-in)
-
- **Scaling**: 1,000 dates × 3 calcs = 3,000 tasks. With 100 workers, completes in ~5 minutes.
-
- ### Mode 3: Batch Price Executor (Optimization)
-
- ```bash
- # For price-dependent calcs, bulk-process historical data
- npm run batch-price-executor --dates=2024-12-01,2024-12-02 --calcs=momentum-signal
- ```
-
- **Behavior**:
- - Loads price shards once
- - Processes multiple dates in a single pass
- - Bypasses Pub/Sub overhead
- - **10x faster** for historical backfills
-
- **Use Case**: After deploying a new price-dependent calculation, backfill 2 years of history in 1 hour instead of 10.
-
- ---
-
- ## Advanced Topics
-
- ### Historical Continuity Enforcement
-
- For calculations that depend on their own previous results:
-
- ```javascript
- // Example: cumulative-pnl needs yesterday's cumulative-pnl
-
- static getMetadata() {
- return { isHistorical: true };
- }
-
- // Dispatcher Logic:
- if (calculation.isHistorical) {
- const yesterday = date - 1 day;
- const yesterdayStatus = await fetchComputationStatus(yesterday);
-
- if (!yesterdayStatus[calcName] ||
- yesterdayStatus[calcName].hash !== currentHash) {
- // Yesterday is missing or has wrong hash
- report.blocked.push({
- reason: "Waiting for historical continuity"
- });
- }
- }
- ```
-
- **Result**: Historical calculations run in **strict chronological order**, never skipping days.
-
- ### Category Migration System
-
- If a calculation's category changes:
-
- ```javascript
- // Before: category: 'signals'
- // After: category: 'risk-management'
-
- // System detects change:
- manifest.previousCategory = 'signals';
-
- // Worker executes:
- 1. Runs calculation normally
- 2. Stores in new category: /results/{date}/risk-management/{calc}
- 3. Deletes old category: /results/{date}/signals/{calc}
- ```
-
- **Automation**: Zero manual data migration needed.
-
- ### Audit Ledger vs Run History
-
- **Audit Ledger** (`computation_audit_ledger/{date}/passes/{pass}/tasks/{calc}`):
- - Created **before** dispatch
- - Status: PENDING → COMPLETED
- - Purpose: Track which tasks were dispatched
-
- **Run History** (`computation_run_history/{date}/runs/{runId}`):
- - Created **after** execution attempt
- - Status: SUCCESS | FAILURE | CRASH
- - Purpose: Debug failures, track performance
-
- **Why Both?**: Audit Ledger answers "What should run?", Run History answers "What actually happened?".
-
- ---
-
- ## Summary: The Complete Flow
-
- ### For a Standard Calculation
-
- ```
- 1. Root Data Indexer (Daily)
- └─ Scans all data sources
- └─ Creates availability manifest
-
- 2. Dispatcher (Per-Pass)
- ├─ Loads manifest
- ├─ For each date:
- │ ├─ Checks root data availability
- │ ├─ Checks dependency status
- │ ├─ Checks historical continuity
- │ └─ Decides: RUNNABLE | BLOCKED | IMPOSSIBLE
- ├─ Creates Audit Ledger (PENDING)
- └─ Publishes RUNNABLE tasks to Pub/Sub
-
- 3. Worker (Per-Task)
- ├─ Receives {date, pass, computation}
- ├─ Loads manifest (cached)
- ├─ Fetches dependencies (auto-reassembles shards)
- ├─ Streams portfolio data in batches
- ├─ For each user:
- │ ├─ Builds context (dependency injection)
- │ └─ Calls calculation.process(context)
- ├─ Validates results (HeuristicValidator)
- ├─ Auto-shards if > 900KB
- ├─ Commits to Firestore
- ├─ Updates status hash
- ├─ Updates Audit Ledger → COMPLETED
- └─ Records Run History → SUCCESS
-
- 4. Next Pass
- └─ Depends on results from this pass
- ```
-
- ### For a Meta Calculation
-
- Same flow, except:
- - **Step 3**: Loads global data instead of streaming users
- - **Context**: No user object, prices/insights instead
- - **Result**: One document with all tickers' data
-
- ---
-
- ## Key Takeaways
-
- 1. **Data Availability Gates Everything**: Computations never run when source data is missing
- 2. **Smart Hashing Enables Incremental Updates**: Only changed calculations re-run
- 3. **Sharding is Invisible**: Read/write as if documents have no size limit
- 4. **Streaming Handles Scale**: Process millions of users without OOM
- 5. **Quality Checks Prevent Bad Data**: Results validated before storage
- 6. **Historical Continuity is Enforced**: Time-series calculations run in order
- 7. **Distributed Execution Scales Infinitely**: 1 worker or 1,000 workers, same code
-
- ---
-
- ## Operational Checklist
-
- **Daily (Automated)**:
- - ✅ Root Data Indexer runs at 2 AM UTC
- - ✅ Computation Dispatchers run for each pass (3 AM, 4 AM, 5 AM)
- - ✅ Workers auto-scale based on Pub/Sub queue depth
-
- **After Code Changes**:
- 1. Run Build Reporter to preview impact
- 2. Review re-run count (expected vs actual)
- 3. Deploy to staging, run single date
- 4. Validate results in Firestore
- 5. Deploy to production
- 6. Monitor Run History for failures
-
- **Debugging a Failure**:
- 1. Check Run History for error stage
- 2. If `QUALITY_CIRCUIT_BREAKER`: Data integrity issue, review validator logs
- 3. If `EXECUTION`: Logic bug, reproduce locally with Orchestrator mode
- 4. If `SYSTEM_CRASH`: Infrastructure issue, check Cloud Function logs
- 5. Fix bug, redeploy, re-trigger specific pass
+ This prints the `Analysis Report` (Runnable/Blocked lists) without actually triggering workers.