bulltrackers-module 1.0.306 → 1.0.307

@@ -1,395 +0,0 @@
1
- # Complete Feature Inventory of BullTrackers Computation System
2
-
3
- ## Core DAG Engine Features
4
-
5
- ### 1. **Topological Sorting (Kahn's Algorithm)**
6
- - **Files**: `ManifestBuilder.js:187-205`
7
- - **Implementation**: Builds execution passes by tracking in-degrees, queuing zero-dependency nodes
8
- - **Niche aspect**: Dynamic pass assignment (line 201: `neighborEntry.pass = currentEntry.pass + 1`)
9
- - **Common in**: Airflow, Prefect, Dagster (all use topological sort)
10
-
11
- ### 2. **Cycle Detection (Tarjan's SCC Algorithm)**
12
- - **Files**: `ManifestBuilder.js:98-141`
13
- - **Implementation**: Strongly Connected Components detection with stack-based traversal
14
- - **Niche aspect**: Returns human-readable cycle chain (line 137: `cycle.join(' -> ') + ' -> ' + cycle[0]`)
15
- - **Common in**: Academic graph libraries, rare in production DAG systems (most use simpler DFS)
16
-
17
- ### 3. **Auto-Discovery Manifest Building**
18
- - **Files**: `ManifestBuilder.js:143-179`, `ManifestLoader.js:9-42`
19
- - **Implementation**: Scans directories, instantiates classes, extracts metadata via `getMetadata()` static method
20
- - **Niche aspect**: Singleton caching with multi-key support (ManifestLoader.js:9)
21
- - **Common in**: Plugin systems (Airflow providers), less common for computation graphs
22
-
23
- ## Dependency Management & Optimization
24
-
25
- ### 4. **Multi-Layered Hash Composition**
26
- - **Files**: `ManifestBuilder.js:56-95`, `HashManager.js:25-36`
27
- - **Implementation**: Composite hash from code + epoch + infrastructure + layers + dependencies
28
- - **Niche aspect**: Infrastructure hash (recursive file tree hashing, HashManager.js:38-79)
29
- - **Common in**: Build systems (Bazel, Buck), **very rare** in data pipelines
30
-
31
- ### 5. **Content-Based Dependency Short-Circuiting**
32
- - **Files**: `WorkflowOrchestrator.js:51-73`
33
- - **Implementation**: Tracks `resultHash` (output data hash), skips re-run if output unchanged despite code change
34
- - **Niche aspect**: `dependencyResultHashes` tracking (line 59-67)
35
- - **Common in**: **Extremely rare** - only seen in specialized incremental computation systems
36
-
37
- ### 6. **Behavioral Stability Detection (SimHash)**
38
- - **Files**: `BuildReporter.js:55-89`, `SimRunner.js:12-42`, `Fabricator.js:20-244`
39
- - **Implementation**: Runs code against deterministic mock data, hashes output to detect "logic changes" vs "cosmetic changes"
40
- - **Niche aspect**: Seeded random data generation (SeededRandom.js:1-38) for reproducible simulations
41
- - **Common in**: **Unique** - haven't seen this elsewhere. Conceptually similar to property-based testing but for optimization
42
-
43
- ### 7. **System Epoch Forcing**
44
- - **Files**: `system_epoch.js:1-2`, `ManifestBuilder.js:65`
45
- - **Implementation**: Manual version bump to force global re-computation
46
- - **Niche aspect**: Single-line file that invalidates all cached results
47
- - **Common in**: Cache invalidation patterns, but unusual to have a dedicated module
48
-
49
- ## Execution & Resource Management
50
-
51
- ### 8. **Streaming Execution with Batch Flushing**
52
- - **Files**: `StandardExecutor.js:86-158`
53
- - **Implementation**: Async generators yield data chunks, flush to DB every N users
54
- - **Niche aspect**: Adaptive flushing based on V8 heap pressure (line 128-145)
55
- - **Common in**: ETL tools (Spark, Flink use micro-batching), **heap-aware flushing is rare**
56
-
57
- ### 9. **Memory Heartbeat (Flight Recorder)**
58
- - **Files**: `computation_worker.js:30-53`
59
- - **Implementation**: Background timer writes memory stats to Firestore every 2 seconds
60
- - **Niche aspect**: Uses `.unref()` to prevent blocking process exit (line 50)
61
- - **Common in**: APM tools (DataDog, New Relic), **embedding in workers is custom**
62
-
63
- ### 10. **Forensic Crash Analysis & Intelligent Routing**
64
- - **Files**: `computation_dispatcher.js:31-68`
65
- - **Implementation**: Reads last memory stats from failed runs, routes to high-mem queue if OOM suspected
66
- - **Niche aspect**: Parses telemetry to distinguish crash types (line 44-50)
67
- - **Common in**: Kubernetes autoscaling heuristics, **application-level routing is rare**
68
-
69
- ### 11. **Circuit Breaker Pattern**
70
- - **Files**: `StandardExecutor.js:164-173`
71
- - **Implementation**: Tracks error rate, fails fast if >10% failures after 100 items
72
- - **Niche aspect**: Runs mid-stream (not just at job start)
73
- - **Common in**: Microservices (Hystrix, Resilience4j), uncommon in data pipelines
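A minimal sketch of a mid-stream circuit breaker using the thresholds quoted above (more than 10% failures once 100 items have been seen). The `createCircuitBreaker` factory and its option names are hypothetical and do not reflect the actual `StandardExecutor.js` code.

```javascript
// Hypothetical helper: counts outcomes and fails fast once the error rate
// exceeds maxErrorRate, but only after minItems items have been processed.
function createCircuitBreaker({ minItems = 100, maxErrorRate = 0.1 } = {}) {
  let processed = 0;
  let failed = 0;
  return {
    record(ok) {
      processed += 1;
      if (!ok) failed += 1;
      if (processed >= minItems && failed / processed > maxErrorRate) {
        throw new Error(`Circuit breaker tripped: ${failed}/${processed} items failed`);
      }
    },
  };
}
```

Calling `record()` once per item inside the streaming loop is what lets the check run mid-stream rather than only at job start.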
74
-
75
- ### 12. **Incremental Auto-Sharding**
76
- - **Files**: `ResultCommitter.js:234-302`
77
- - **Implementation**: Dynamically splits results into Firestore subcollection shards, tracks shard index across flushes
78
- - **Niche aspect**: `flushMode: INTERMEDIATE` flag (line 150) to avoid pointer updates mid-stream
79
- - **Common in**: Database sharding, **dynamic document sharding is custom**
80
-
81
- ### 13. **GZIP Compression Strategy**
82
- - **Files**: `ResultCommitter.js:128-157`
83
- - **Implementation**: Compresses results >50KB, stores as binary blob if <900KB compressed
84
- - **Niche aspect**: Falls back to sharding if compression fails or exceeds limit
85
- - **Common in**: Storage layers, integration at application level is custom
86
-
87
- ## Data Quality & Validation
88
-
89
- ### 14. **Heuristic Validation (Grey Box)**
90
- - **Files**: `ResultsValidator.js:8-96`
91
- - **Implementation**: Statistical analysis (zero%, null%, flatline detection) without knowing schema
92
- - **Niche aspect**: Weekend mode (line 57-64) - relaxes thresholds on Saturdays/Sundays
93
- - **Common in**: Data quality tools (Great Expectations, Soda), **weekend-aware thresholds are domain-specific**
94
-
95
- ### 15. **Contract Discovery & Enforcement**
96
- - **Files**: `ContractDiscoverer.js:11-120`, `ContractValidator.js:9-64`
97
- - **Implementation**: Monte Carlo simulation learns behavioral bounds, enforces at runtime
98
- - **Niche aspect**: Distinguishes "physics limits" (ratios 0-1) from "statistical envelopes" (6-sigma)
99
- - **Common in**: **Unique** - closest analogue is schema inference (Pandas Profiling) but this is probabilistic + enforced
100
-
101
- ### 16. **Semantic Gates**
102
- - **Files**: `ResultCommitter.js:118-127`
103
- - **Implementation**: Blocks results that violate contracts before writing
104
- - **Niche aspect**: Differentiated error handling - `SEMANTIC_GATE` errors are non-retryable (line 210-225)
105
- - **Common in**: Type systems (TypeScript, Mypy), **runtime probabilistic checks are rare**
106
-
107
- ### 17. **Root Data Availability Tracking**
108
- - **Files**: `AvailabilityChecker.js:49-87`, `utils.js:11-17`
109
- - **Implementation**: Centralized index (`system_root_data_index`) tracks what data exists per day
110
- - **Niche aspect**: Granular user-type checks (speculator vs normal portfolio, line 23-47)
111
- - **Common in**: Data catalogs (Amundsen, DataHub), **day-level granularity is custom**
112
-
113
- ### 18. **Impossible State Propagation**
114
- - **Files**: `WorkflowOrchestrator.js:94-96`, `logger.js:77-93`
115
- - **Implementation**: Marks calculations as `IMPOSSIBLE` instead of failing them, allows graph to continue
116
- - **Niche aspect**: Separate "impossible" category in analysis reports (logger.js:86-91)
117
- - **Common in**: Workflow engines handle failures, **explicit impossible state is rare**
118
-
119
- ## Orchestration & Coordination
120
-
121
- ### 19. **Event-Driven Callback Pattern (Zero Polling)**
122
- - **Files**: `bulltrackers_pipeline.yaml:49-76`, `computation_worker.js:82-104`
123
- - **Implementation**: Workflow creates callback endpoint, worker POSTs on completion, workflow wakes
124
- - **Niche aspect**: IAM authentication for callbacks (computation_worker.js:88-91)
125
- - **Common in**: Cloud Workflows, AWS Step Functions (both support callbacks), **IAM-secured callbacks are best practice but not default**
126
-
127
- ### 20. **Run State Counter Pattern**
128
- - **Files**: `computation_dispatcher.js:107-115`, `computation_worker.js:106-123`
129
- - **Implementation**: Shared Firestore doc tracks `remainingTasks`, workers decrement on completion
130
- - **Niche aspect**: Transaction-based decrement (computation_worker.js:109-119) ensures atomicity
131
- - **Common in**: Distributed systems, **Firestore-specific implementation is custom**
132
-
133
- ### 21. **Audit Ledger (Ledger-DB Pattern)**
134
- - **Files**: `computation_dispatcher.js:143-163`, `RunRecorder.js:26-99`
135
- - **Implementation**: Write-once ledger per task (`computation_audit_ledger/{date}/passes/{pass}/tasks/{calc}`)
136
- - **Niche aspect**: Stores granular timing breakdown (RunRecorder.js:64-70)
137
- - **Common in**: Event sourcing systems, **granular profiling in ledger is uncommon**
138
-
139
- ### 22. **Poison Message Handling (DLQ)**
140
- - **Files**: `computation_worker.js:36-60`
141
- - **Implementation**: Max retries check via Pub/Sub `deliveryAttempt`, moves to dead letter queue
142
- - **Niche aspect**: Differentiates deterministic errors (line 194-222) from transient failures
143
- - **Common in**: Message queues (RabbitMQ, SQS), **logic-aware routing is custom**
144
-
145
- ### 23. **Catch-Up Logic (Historical Scan)**
146
- - **Files**: `computation_dispatcher.js:65-81`
147
- - **Implementation**: Scans full date range (earliest data → target date) instead of just target date
148
- - **Niche aspect**: Parallel analysis with concurrency limit (line 85)
149
- - **Common in**: Data pipelines (backfill mode), **integrating it into the dispatcher is convenient**
150
-
151
- ## Observability & Debugging
152
-
153
- ### 24. **Structured Logging System**
154
- - **Files**: `logger.js:27-118`
155
- - **Implementation**: Dual output (human-readable + JSON), process tracking, context inheritance
156
- - **Niche aspect**: `ProcessLogger` class (line 120-148) for scoped logging with auto-stats
157
- - **Common in**: Production apps (Winston, Bunyan), **process-scoped loggers are a nice touch**
158
-
159
- ### 25. **Date Analysis Reports**
160
- - **Files**: `logger.js:77-132`
161
- - **Implementation**: Per-date breakdown of runnable/blocked/impossible/skipped calculations
162
- - **Niche aspect**: Unicode symbols for visual parsing (line 103)
163
- - **Common in**: DAG visualization tools, **inline CLI reports are developer-friendly**
164
-
165
- ### 26. **Build Report Generator**
166
- - **Files**: `BuildReporter.js:138-248`
167
- - **Implementation**: Pre-deployment impact analysis showing blast radius of code changes
168
- - **Niche aspect**: Blast radius calculation (line 62-77) - finds all downstream dependents
169
- - **Common in**: CI/CD tools (GitHub's "affected projects"), **calculation-level granularity is detailed**
170
-
171
- ### 27. **System Fingerprinting**
172
- - **Files**: `BuildReporter.js:28-51`, `HashManager.js:80-111`
173
- - **Implementation**: SHA-256 hash of entire codebase + manifest, triggers report on change
174
- - **Niche aspect**: Recursive directory walk with ignore patterns (HashManager.js:44-60)
175
- - **Common in**: Docker layer caching, **using it for deploy-time change detection is creative**
176
-
177
- ### 28. **Execution Statistics Tracking**
178
- - **Files**: `StandardExecutor.js:64-71`, `RunRecorder.js:57-70`
179
- - **Implementation**: Tracks processed/skipped users, setup/stream/processing time breakdowns
180
- - **Niche aspect**: Profiler-ready structure (RunRecorder.js:64-70) for BigQuery analysis
181
- - **Common in**: Profilers (cProfile, pyflame), **baked into business logic is pragmatic**
182
-
183
- ## Data Access Patterns
184
-
185
- ### 29. **Smart Shard Indexing**
186
- - **Files**: `data_loader.js:152-213`
187
- - **Implementation**: Maintains `instrumentId → shardId` index to avoid scanning all shards
188
- - **Niche aspect**: 24-hour TTL with rebuild logic (line 167-172)
189
- - **Common in**: Database indexes, **application-level shard routing is custom**
190
-
191
- ### 30. **Async Generator Streaming**
192
- - **Files**: `data_loader.js:130-150`
193
- - **Implementation**: `async function*` yields data chunks, caller consumes with `for await`
194
- - **Niche aspect**: Supports pre-provided refs (line 132) for dependency injection
195
- - **Common in**: Node.js streams, **generator-based approach is modern/clean**
196
-
197
- ### 31. **Cached Data Loader**
198
- - **Files**: `CachedDataLoader.js:14-73`
199
- - **Implementation**: Execution-scoped cache for mappings/insights/social data
200
- - **Niche aspect**: Decompression helper (line 24-32) for transparent GZIP handling
201
- - **Common in**: Data layers (Apollo Client, React Query), **per-execution scope is appropriate**
202
-
203
- ### 32. **Deferred Hydration**
204
- - **Files**: `DependencyFetcher.js:23-66`
205
- - **Implementation**: Fetches metadata documents, hydrates sharded data on-demand
206
- - **Niche aspect**: Parallel hydration promises (line 44-47)
207
- - **Common in**: ORMs (lazy loading), **manual shard hydration is low-level**
208
-
209
- ## Domain-Specific Intelligence
210
-
211
- ### 33. **User Classification Engine**
212
- - **Files**: `profiling.js:24-236`
213
- - **Implementation**: "Smart Money" scoring with 18+ behavioral signals
214
- - **Niche aspect**: Multi-factor scoring (portfolio allocation + trade history + execution timing)
215
- - **Common in**: Fintech risk models, **granularity is impressive**
216
-
217
- ### 34. **Convex Hull Risk Geometry**
218
- - **Files**: `profiling.js:338-365`
219
- - **Implementation**: Monotone Chain algorithm for efficient frontier analysis
220
- - **Niche aspect**: O(n log n) algorithm choice (profiling.js:345-363)
221
- - **Common in**: Computational geometry libraries, **integration into user profiling is domain-specific**
222
-
223
- ### 35. **Kadane's Maximum Drawdown**
224
- - **Files**: `extractors.js:27-52`
225
- - **Implementation**: O(n) single-pass algorithm for peak-to-trough decline
226
- - **Niche aspect**: Returns indices for visualization (line 47)
227
- - **Common in**: Finance libraries (QuantLib), **clean implementation**
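A minimal sketch of a single-pass maximum drawdown that also returns the peak and trough indices. It uses running-peak tracking, which is the usual O(n) approach; whether `extractors.js` is structured this way is not visible here. The function name and return shape are hypothetical, and values are assumed to be positive.

```javascript
// Hypothetical helper: largest peak-to-trough decline as a fraction of the peak.
// Assumes every value is positive (e.g. portfolio equity).
function maxDrawdown(values) {
  let peakIndex = 0;
  let best = { drawdown: 0, peakIndex: 0, troughIndex: 0 };
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[peakIndex]) peakIndex = i; // new running peak
    const drawdown = (values[peakIndex] - values[i]) / values[peakIndex];
    if (drawdown > best.drawdown) best = { drawdown, peakIndex, troughIndex: i };
  }
  return best;
}
```

For the series `[100, 120, 90, 110, 80, 130]` the peak is at index 1 (120) and the trough at index 4 (80), a drawdown of one third.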
228
-
229
- ### 36. **Fast Fourier Transform (Cooley-Tukey)**
230
- - **Files**: `mathematics.js:148-184`
231
- - **Implementation**: O(n log n) frequency domain analysis with zero-padding
232
- - **Niche aspect**: Recursive implementation (line 163-183)
233
- - **Common in**: Signal processing (NumPy, SciPy), **JavaScript implementation is rare**
234
-
235
- ### 37. **Sliding Window Extrema (Monotonic Queue)**
236
- - **Files**: `mathematics.js:227-259`
237
- - **Implementation**: O(n) min/max calculation using deque
238
- - **Niche aspect**: Dual deques (one for min, one for max, line 236-237)
239
- - **Common in**: Competitive programming, **production usage is uncommon**
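A minimal sketch of the dual-deque technique, assuming a fixed window size. Plain arrays with head pointers stand in for deques so that removing from the front stays O(1). The function name and output shape are hypothetical.

```javascript
// Hypothetical helper: min and max of every full window of `windowSize` values.
// Each index is pushed and popped at most once per deque, so the total work is O(n).
function slidingWindowExtrema(values, windowSize) {
  const minDeque = []; // indices whose values increase from front to back
  const maxDeque = []; // indices whose values decrease from front to back
  let minHead = 0;
  let maxHead = 0;
  const result = [];
  for (let i = 0; i < values.length; i++) {
    while (minDeque.length > minHead && values[minDeque[minDeque.length - 1]] >= values[i]) minDeque.pop();
    minDeque.push(i);
    while (maxDeque.length > maxHead && values[maxDeque[maxDeque.length - 1]] <= values[i]) maxDeque.pop();
    maxDeque.push(i);
    // Drop the front index once it falls out of the window.
    if (minDeque[minHead] <= i - windowSize) minHead += 1;
    if (maxDeque[maxHead] <= i - windowSize) maxHead += 1;
    if (i >= windowSize - 1) {
      result.push({ min: values[minDeque[minHead]], max: values[maxDeque[maxHead]] });
    }
  }
  return result;
}
```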
240
-
241
- ### 38. **Geometric Brownian Motion Simulator**
242
- - **Files**: `mathematics.js:99-118`
243
- - **Implementation**: Box-Muller transform for normal random variates, Monte Carlo simulation
244
- - **Niche aspect**: Returns `Float32Array` for memory efficiency (line 106)
245
- - **Common in**: Quant finance (Black-Scholes), **typed arrays are performance-conscious**
246
-
247
- ### 39. **Hit Probability Calculator**
248
- - **Files**: `mathematics.js:75-97`
249
- - **Implementation**: Closed-form barrier option pricing formula
250
- - **Niche aspect**: Custom `normCDF` implementation (line 85-89) avoids external deps
251
- - **Common in**: Options pricing libraries, **standalone implementation is self-contained**
252
-
253
- ### 40. **Kernel Density Estimation**
254
- - **Files**: `mathematics.js:263-288`
255
- - **Implementation**: Gaussian kernel with weighted samples
256
- - **Niche aspect**: 3-bandwidth cutoff for performance (line 276)
257
- - **Common in**: Stats packages (SciPy, R), **production KDE is uncommon**
258
-
259
- ## Schema & Type Management
260
-
261
- ### 41. **Schema Capture System**
262
- - **Files**: `schema_capture.js:28-68`
263
- - **Implementation**: Batch stores class-defined schemas to Firestore
264
- - **Niche aspect**: Pre-commit validation (line 32-34) prevents batch failures
265
- - **Common in**: Schema registries (Confluent), **lightweight alternative**
266
-
267
- ### 42. **Production Schema Validators**
268
- - **Files**: `validators.js:14-137`
269
- - **Implementation**: Structural validation matching schema.md definitions
270
- - **Niche aspect**: Separate validators per data type (portfolio/history/social/insights/prices)
271
- - **Common in**: Data quality frameworks, **schema.md alignment is discipline**
272
-
273
- ### 43. **Legacy Mapping System**
274
- - **Files**: `HashManager.js:8-23`, `ContextFactory.js:12-17`
275
- - **Implementation**: Alias mapping for backward compatibility (e.g., `extract` → `DataExtractor`)
276
- - **Niche aspect**: Dual injection into context (line 14-16)
277
- - **Common in**: API versioning, **maintaining during refactor is good practice**
278
-
279
- ## Infrastructure & Operations
280
-
281
- ### 44. **Self-Healing Sharding Strategy**
282
- - **Files**: `ResultCommitter.js:234-302`
283
- - **Implementation**: Progressively stricter sharding on failure (900KB → 450KB → 200KB → 100KB)
284
- - **Niche aspect**: Strategy array iteration (line 241-246)
285
- - **Common in**: Resilience patterns, **adaptive sharding is creative**
286
-
287
- ### 45. **Initial Write Cleanup Logic**
288
- - **Files**: `ResultCommitter.js:111-127`, `StandardExecutor.js:122-124`
289
- - **Implementation**: `isInitialWrite` flag triggers shard deletion before first write
290
- - **Niche aspect**: Transition detection (line 115-121) from sharded → compressed
291
- - **Common in**: Migration scripts, **baking it into the write path is convenient**
292
-
293
- ### 46. **Firestore Byte Calculator**
294
- - **Files**: `ResultCommitter.js:319-324`
295
- - **Implementation**: Estimates document size for batch limits
296
- - **Niche aspect**: Handles `DocumentReference` paths (line 322)
297
- - **Common in**: Firestore SDKs (internal), **custom implementation for control**
298
-
299
- ### 47. **Retry with Exponential Backoff**
300
- - **Files**: `utils.js:65-79`
301
- - **Implementation**: Async retry wrapper with configurable attempts and backoff
302
- - **Niche aspect**: 1s → 2s → 4s progression (line 75)
303
- - **Common in**: HTTP clients (axios, got), **standalone utility is reusable**
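A minimal sketch of an async retry wrapper with the 1s → 2s → 4s progression described above. The `withRetry` name, its options, and the injectable `sleep` parameter are hypothetical (the latter only exists so the delays can be observed without waiting).

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retries `task` up to `attempts` times, doubling the
// delay after each failure (baseDelayMs, 2x, 4x, ...).
async function withRetry(task, { attempts = 3, baseDelayMs = 1000, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // No point sleeping after the final attempt.
      if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```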
304
-
305
- ### 48. **Batch Commit Chunker**
306
- - **Files**: `utils.js:86-128`
307
- - **Implementation**: Splits writes into Firestore 500-op/10MB batches
308
- - **Niche aspect**: Supports DELETE operations (line 103-108)
309
- - **Common in**: ORMs (SQLAlchemy bulk), **DELETE support is complete**
310
-
311
- ### 49. **Date Range Generator**
312
- - **Files**: `utils.js:131-139`
313
- - **Implementation**: UTC-aware date string generation
314
- - **Niche aspect**: Forces UTC via `Date.UTC()` constructor (line 133-134)
315
- - **Common in**: Date libraries (date-fns, Luxon), **UTC enforcement is critical for finance**
316
-
317
- ### 50. **Earliest Date Discovery**
318
- - **Files**: `utils.js:158-207`
319
- - **Implementation**: Scans multiple collections to find first available data
320
- - **Niche aspect**: Handles both flat and sharded collections (line 142-157, 160-174)
321
- - **Common in**: Data discovery tools, **multi-source aggregation is thorough**
322
-
323
- ## Advanced Patterns
324
-
325
- ### 51. **Tarjan's Stack Management**
326
- - **Files**: `ManifestBuilder.js:98-141`
327
- - **Implementation**: Manual stack tracking for SCC detection
328
- - **Niche aspect**: `onStack` Set for O(1) membership checks (line 106)
329
- - **Common in**: Graph algorithm implementations, **production usage is advanced**
330
-
331
- ### 52. **Dependency-Injection Context Factory**
332
- - **Files**: `ContextFactory.js:17-61`
333
- - **Implementation**: Separate builders for per-user vs meta contexts
334
- - **Niche aspect**: Math layer injection with legacy aliases (line 12-17)
335
- - **Common in**: DI frameworks (Spring, Guice), **manual factory is lightweight**
336
-
337
- ### 53. **Price Batch Executor**
338
- - **Files**: `PriceBatchExecutor.js:12-104`
339
- - **Implementation**: Specialized executor for price-only calculations (optimization pass)
340
- - **Niche aspect**: Outer concurrency (2) + shard batching (20) + write batching (50) nested limits
341
- - **Common in**: MapReduce systems, **three-level batching is complex**
342
-
343
- ### 54. **Deterministic Mock Data Fabrication**
344
- - **Files**: `Fabricator.js:20-244`, `SeededRandom.js:8-38`
345
- - **Implementation**: LCG PRNG seeded by calculation name for reproducible fakes
346
- - **Niche aspect**: Iteration-based seed rotation (Fabricator.js:29)
347
- - **Common in**: Property-based testing (Hypothesis, QuickCheck), **for optimization is novel**
348
-
349
- ### 55. **Schema-Driven Fake Generation**
350
- - **Files**: `Fabricator.js:48-71`
351
- - **Implementation**: Recursively generates data matching JSON schema
352
- - **Niche aspect**: Volume scaling flag (line 49) for aggregate vs per-item data
353
- - **Common in**: Schema-based generators (JSF, json-schema-faker), **custom to domain**
354
-
355
- ### 56. **Migration Cleanup Hook**
356
- - **Files**: `ResultCommitter.js:81-83`, `ResultCommitter.js:305-317`
357
- - **Implementation**: Deletes old category data when calculation moves
358
- - **Niche aspect**: `previousCategory` tracking in manifest (WorkflowOrchestrator.js:50-54)
359
- - **Common in**: Schema migration tools (Alembic, Flyway), **inline cleanup is pragmatic**
360
-
361
- ### 57. **Non-Retryable Error Classification**
362
- - **Files**: `ResultCommitter.js:18-21`, `computation_worker.js:194-225`
363
- - **Implementation**: Distinguishes deterministic failures from transient errors
364
- - **Niche aspect**: `error.stage` property for categorization (computation_worker.js:205-209)
365
- - **Common in**: Error handling libraries (Sentry), **semantic error types are good practice**
366
-
367
- ### 58. **Reverse Adjacency Graph**
368
- - **Files**: `BuildReporter.js:62-77`
369
- - **Implementation**: Maintains child → parent edges for impact analysis
370
- - **Niche aspect**: Used for blast radius calculation (line 66-74)
371
- - **Common in**: Dependency analyzers (npm-why), **runtime maintenance is useful**
372
-
373
- ### 59. **Multi-Key Manifest Cache**
374
- - **Files**: `ManifestLoader.js:9-14`
375
- - **Implementation**: Cache key is JSON-stringified sorted product lines
376
- - **Niche aspect**: Handles `['ALL']` vs `['crypto', 'stocks']` as different keys
377
- - **Common in**: Memoization libraries (lodash.memoize), **cache key design is thoughtful**
378
-
379
- ### 60. **Workflow Variable Restoration**
380
- - **Files**: `bulltrackers_pipeline.yaml:11-17`
381
- - **Implementation**: Comment notes a bug fix restoring `passes` and `max_retries` variables
382
- - **Niche aspect**: T-1 date logic (line 13-15) for "process yesterday" pattern
383
- - **Common in**: Production YAML configs, **inline documentation is helpful**
384
-
385
- ---
386
-
387
- ## Summary Statistics
388
-
389
- - **Total Features Identified**: 60
390
- - **Unique/Rare Features**: ~15 (SimHash, content-based short-circuit, forensic routing, contract discovery, weekend validation, behavioral stability, heap-aware flushing, monotonic queue extrema, FFT, KDE, smart shard indexing, recursive infra hash, semantic gates, impossible propagation, blast radius)
391
- - **Advanced CS Algorithms**: 8 (Kahn's, Tarjan's, Convex Hull, Kadane's, FFT, Box-Muller, Monotonic Queue, LCG)
392
- - **Common Patterns (Elevated)**: ~25 (executed exceptionally well or with domain-specific twist)
393
- - **Standard Infrastructure**: ~22 (logging, retries, batching, streaming, caching, validation, etc.)
394
-
395
- **Verdict**: About 25% truly novel, 40% common patterns elevated to production-grade, 35% standard infrastructure executed well.
@@ -1,93 +0,0 @@
1
- # The BullTrackers Computation System: An Advanced DAG-Based Architecture for High-Fidelity Financial Simulation
2
-
3
- ## Abstract
4
-
5
- This paper details the design, implementation, and theoretical underpinnings of the BullTrackers Computation System, a proprietary high-performance execution engine designed for complex financial modeling and user behavior analysis. The system leverages a Directed Acyclic Graph (DAG) architecture to orchestrate interdependent calculations, employing Kahn’s Algorithm for topological sorting and Tarjan’s Algorithm for cycle detection. Key innovations include "Content-Based Dependency Short-Circuiting" for massive optimization, a "System Epoch" and "Infrastructure Hash" based auditing system for absolute reproducibility, and a batch-flushing execution model designed to mitigate Out-Of-Memory (OOM) errors during high-volume processing. We further explore the application of this system in running advanced psychometric and risk-geometry models ("Smart Money" scoring) and how the architecture supports self-healing workflows through granular state management.
6
-
7
- ## 1. Introduction
8
-
9
- In modern financial analytics, derived data often depends on a complex web of varying input frequencies—real-time price ticks, daily portfolio snapshots, and historical trade logs. Traditional linear batch processing protocols fail to capture the nuances of these interdependencies, often leading to race conditions or redundant computations.
10
-
11
- The BullTrackers Computation System was devised to solve this by treating the entire domain logic as a **Directed Acyclic Graph (DAG)**. Every calculation is a node, and every data requirement is an edge. By resolving the topology of this graph dynamically at runtime, the system ensures that:
12
- 1. Data is always available before it is consumed (referential integrity).
13
- 2. Only necessary computations are executed (efficiency).
14
- 3. Changes in code or infrastructure propagate deterministically through the graph (auditability).
15
-
16
- ## 2. Theoretical Foundations
17
-
18
- The core utility of the system is its ability to turn a collection of loosely coupled JavaScript classes into a strictly ordered execution plan.
19
-
20
- ### 2.1 Directed Acyclic Graphs (DAGs)
21
- We model the computation space as a DAG where $G = (V, E)$.
22
- * **Vertices ($V$)**: Individual Calculation Units (e.g., `NetProfit`, [SmartMoneyScore](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/layers/profiling.js#24-236)).
23
- * **Edges ($E$)**: Data dependencies, where an edge $(u, v)$ implies $v$ requires the output of $u$.
24
-
25
- ### 2.2 Topological Sorting (Kahn’s Algorithm)
26
- To execute the graph, we must linearize it such that for every dependency $u \rightarrow v$, $u$ precedes $v$ in the execution order. We implement **Kahn’s Algorithm** within [ManifestBuilder.js](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/context/ManifestBuilder.js) to achieve this:
27
- 1. Calculate the **in-degree** (number of incoming edges) for all nodes.
28
- 2. Initialize a queue with all nodes having an in-degree of 0 (independent nodes).
29
- 3. While the queue is not empty:
30
- * Dequeue node $N$ and add it to the `SortedManifest`.
31
- * For each neighbor $M$ dependent on $N$, decrement $M$'s in-degree.
32
- * If $M$'s in-degree becomes 0, enqueue $M$.
33
- 4. This generates a series of "Passes" or "Waves" of execution, allowing parallel processing of independent nodes within the same pass.
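The steps above can be sketched as follows. This is an illustrative version, not the code in `ManifestBuilder.js`: the `buildPasses` name and the `{ name: [dependencies] }` input shape are assumptions, and every dependency is assumed to be a key of the input.

```javascript
// Hypothetical sketch of Kahn's algorithm with pass assignment.
// `deps` maps each calculation name to the names it depends on.
function buildPasses(deps) {
  const names = Object.keys(deps);
  const inDegree = new Map(names.map((n) => [n, deps[n].length]));
  const dependents = new Map(names.map((n) => [n, []]));
  for (const name of names) {
    for (const parent of deps[name]) {
      if (!dependents.has(parent)) throw new Error(`Unknown dependency: ${parent}`);
      dependents.get(parent).push(name);
    }
  }

  const pass = new Map();
  const queue = names.filter((n) => inDegree.get(n) === 0);
  queue.forEach((n) => pass.set(n, 1)); // independent nodes run in pass 1
  const sorted = [];

  while (queue.length > 0) {
    const current = queue.shift();
    sorted.push(current);
    for (const next of dependents.get(current)) {
      // A node runs one pass after its latest-running dependency.
      pass.set(next, Math.max(pass.get(next) ?? 0, pass.get(current) + 1));
      inDegree.set(next, inDegree.get(next) - 1);
      if (inDegree.get(next) === 0) queue.push(next);
    }
  }

  // Unvisited nodes mean at least one cycle exists.
  if (sorted.length !== names.length) throw new Error('Cycle detected');
  return sorted.map((name) => ({ name, pass: pass.get(name) }));
}
```

Nodes that share a pass number have no dependencies on each other, so they can be dispatched in parallel.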
34
-
35
- ### 2.3 Cycle Detection (Tarjan’s Algorithm)
36
- A critical failure mode in DAGs is the introduction of a cycle (e.g., A needs B, B needs A), effectively turning the DAG into a DCG (Directed Cyclic Graph), which is unresolvable.
37
- If Kahn’s algorithm fails to visit all nodes (indicating a cycle exists), the system falls back to **Tarjan’s Strongly Connected Components (SCC) Algorithm**. This uses depth-first search to identify the exact cycle chain (e.g., `Calc A -> Calc B -> Calc C -> Calc A`), reporting the "First Cycle Found" to the developer for immediate remediation.
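A minimal sketch of the fallback, assuming the same hypothetical `{ name: [dependencies] }` input shape; the `findFirstCycle` name is also hypothetical. It runs Tarjan's SCC algorithm and reports the first component that contains a cycle. For a component that is more than a simple cycle, the reported order is DFS visitation order and may not be a walkable path.

```javascript
// Hypothetical sketch: returns a string such as "A -> B -> C -> A", or null.
function findFirstCycle(deps) {
  let counter = 0;
  const index = new Map();
  const lowLink = new Map();
  const stack = [];
  const onStack = new Set(); // O(1) membership checks
  let found = null;

  function visit(node) {
    index.set(node, counter);
    lowLink.set(node, counter);
    counter += 1;
    stack.push(node);
    onStack.add(node);

    for (const next of deps[node] ?? []) {
      if (found) return;
      if (!index.has(next)) {
        visit(next);
        lowLink.set(node, Math.min(lowLink.get(node), lowLink.get(next)));
      } else if (onStack.has(next)) {
        lowLink.set(node, Math.min(lowLink.get(node), index.get(next)));
      }
    }
    if (found) return;

    // `node` is the root of a strongly connected component.
    if (lowLink.get(node) === index.get(node)) {
      const component = [];
      let member;
      do {
        member = stack.pop();
        onStack.delete(member);
        component.push(member);
      } while (member !== node);
      const selfLoop = (deps[node] ?? []).includes(node);
      if (component.length > 1 || selfLoop) {
        component.reverse();
        found = component.join(' -> ') + ' -> ' + component[0];
      }
    }
  }

  for (const node of Object.keys(deps)) {
    if (found) break;
    if (!index.has(node)) visit(node);
  }
  return found;
}
```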
38
-
39
- ## 3. System Architecture & "Source of Truth"
40
-
41
- The architecture is centered around the **Manifest**, a dynamic, immutable registry of all capabilities within the system.
42
-
43
- ### 3.1 The Dynamic Manifest
44
- Unlike static build tools, the Manifest is built at runtime by [ManifestLoader.js](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/topology/ManifestLoader.js) and [ManifestBuilder.js](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/context/ManifestBuilder.js). It employs an **Auto-Discovery** mechanism that scans directories for calculation classes.
45
- * **Static Metadata**: Each class exposes `getMetadata()` and `getDependencies()`.
46
- * **Product Line Filtering**: The builder can slice the graph, generating a subgraph relevant only to specific product lines (e.g., "Crypto", "Stocks"), reducing overhead.
47
-
48
- ### 3.2 Granular Hashing & The Audit Chain
49
- To ensure that "if the code hasn't changed, the result shouldn't change," the system implements a multi-layered hashing strategy ([HashManager.js](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/topology/HashManager.js)):
50
- 1. **Code Hash**: The raw string content of the calculation class.
51
- 2. **Layer Hash**: Hashes of shared utility layers (`mathematics`, `profiling`) used by the class.
52
- 3. **Dependency Hash**: A composite hash of all upstream dependencies.
53
- 4. **Infrastructure Hash**: A hash representing the underlying system environment.
54
- 5. **System Epoch**: A manual versioning flag to force global re-computation.
55
-
56
- This results in a `Composite Hash`. If this hash matches the `storedHash` in the database, execution can be skipped entirely.
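One way the five inputs could be combined is sketched below. The field names, the `|` separator, and the use of SHA-256 for every layer are assumptions for illustration; the actual composition lives in `HashManager.js` and `ManifestBuilder.js`.

```javascript
const crypto = require('crypto');

function sha256(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Hypothetical sketch: any change to code, layers, dependencies,
// infrastructure, or epoch yields a different composite hash.
function compositeHash({ code, layerHashes, dependencyHashes, infrastructureHash, systemEpoch }) {
  const parts = [
    sha256(code),
    ...[...layerHashes].sort(), // sort so declaration order does not matter
    ...[...dependencyHashes].sort(),
    infrastructureHash,
    String(systemEpoch),
  ];
  return sha256(parts.join('|'));
}
```

Sorting the layer and dependency hashes keeps the result stable when a class lists the same dependencies in a different order.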
57
-
58
- ## 4. Execution Engine: Flow, Resilience & Optimization
59
-
60
- The `WorkflowOrchestrator` acts as the runtime kernel, utilizing [StandardExecutor](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/executors/StandardExecutor.js#16-257) and [MetaExecutor](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/executors/MetaExecutor.js#12-83) for the heavy lifting.
61
-
62
- ### 4.1 Content-Based Dependency Short-Circuiting
63
- A major optimization is the **Content-Based Short-Circuiting** logic found in [WorkflowOrchestrator.js](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/WorkflowOrchestrator.js):
64
- Even if an upstream dependency *re-runs* (e.g., its timestamp changed), its *output* might be identical to the previous run.
65
- 1. The system tracks `ResultHash` (hash of the actual output data).
66
- 2. When checking dependencies for Node B (which depends on A), if A has re-run but its `ResultHash` is unchanged from what B used last time, B **does not need to re-run**.
67
- 3. This effectively stops "change propagation" dead in its tracks if the data change is semantically null.
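The decision can be sketched as a comparison of result hashes. The `shouldRun` function and the shapes of `calc` and `stored` are hypothetical; `calc.hash` is assumed here to cover only the node's own code, not its upstream hashes.

```javascript
// Hypothetical sketch: decide whether a node must re-run.
// `stored.dependencyResultHashes` holds the upstream result hashes the node
// consumed on its last run; `currentResultHashes` holds the latest ones.
function shouldRun(calc, stored, currentResultHashes) {
  if (!stored) return true; // never computed before
  if (stored.hash !== calc.hash) return true; // the node's own logic changed
  // Re-run only if some upstream output actually differs from what was used.
  return calc.dependencies.some(
    (dep) => stored.dependencyResultHashes[dep] !== currentResultHashes[dep]
  );
}
```

If A re-runs and produces the same `ResultHash`, `shouldRun` returns `false` for B, and everything downstream of B is left untouched as well.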
68
-
69
- ### 4.2 Batch Flushing & OOM Prevention
70
- Financial datasets (processing 100k+ users with daily portfolios) often exceed Node.js heap limits. The [StandardExecutor](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/executors/StandardExecutor.js#16-257) implements a **Streaming & Flushing** architecture:
71
- * **Streams** inputs (Portfolio/History) using generators (`yield`), preventing loading all users into memory.
72
- * **Buffers** results in a `state` object.
73
- * **Flushes** to the database (Firestore/Storage) every $N$ users (e.g., 5000) and then clears the internal buffer, which helps avoid Out-Of-Memory crashes.
74
- * **Incremental Sharding**: It manages shard indices dynamically to split massive result sets into retrievable chunks.
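A minimal sketch of the stream, buffer, and flush loop. `streamUsers`, `runWithFlushing`, the `commit` callback, and the `'FINAL'` mode are hypothetical stand-ins; only the `INTERMEDIATE` flush mode name appears in the feature inventory.

```javascript
// Hypothetical data source: yields one user at a time instead of loading all.
async function* streamUsers(total) {
  for (let i = 0; i < total; i++) yield { userId: `u${i}`, value: i };
}

// Hypothetical executor loop: buffers results and flushes every `flushEvery` users.
async function runWithFlushing(source, flushEvery, commit) {
  let buffer = {};
  let buffered = 0;
  let flushes = 0;
  for await (const user of source) {
    buffer[user.userId] = user.value * 2; // stand-in for the real calculation
    buffered += 1;
    if (buffered >= flushEvery) {
      await commit(buffer, { flushMode: 'INTERMEDIATE' });
      buffer = {}; // drop references so the flushed chunk can be collected
      buffered = 0;
      flushes += 1;
    }
  }
  if (buffered > 0) {
    await commit(buffer, { flushMode: 'FINAL' });
    flushes += 1;
  }
  return flushes;
}
```

Memory use is bounded by the flush interval rather than by the total number of users.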
75
-
76
- ### 4.3 Handling "Impossible" States
77
- If a dependency fails or is missing critical data, the Orchestrator marks dependent nodes as `IMPOSSIBLE` rather than failing them. This allows the rest of the graph (independent branches) to continue execution, maximizing system throughput even in a partially degraded state.
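The propagation rule can be sketched over a topologically sorted list. The `planStatuses` name, the `RUNNABLE` label, and the input shapes are hypothetical; only the `IMPOSSIBLE` status comes from the text.

```javascript
// Hypothetical sketch: `sortedCalcs` is in dependency order, and `unavailable`
// is the set of calculations whose root data is missing or which failed.
function planStatuses(sortedCalcs, unavailable) {
  const status = new Map();
  for (const calc of sortedCalcs) {
    const blocked =
      unavailable.has(calc.name) ||
      calc.dependencies.some((dep) => status.get(dep) === 'IMPOSSIBLE');
    status.set(calc.name, blocked ? 'IMPOSSIBLE' : 'RUNNABLE');
  }
  return status;
}
```

Because only descendants of an unavailable node are marked, independent branches keep their `RUNNABLE` status.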
78
-
79
- ## 5. Advanced Application: Psychometrics & Risk Geometry
80
-
81
- The capabilities of this computation engine are best demonstrated by the [profiling.js](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/layers/profiling.js) layer it powers. Because the DAG ensures all historical and portfolio data is perfectly aligned, we can run sophisticated O(n^2) or O(n log n) algorithms on user data reliably.
82
-
83
- ### 5.1 "Smart Money" & Cognitive Profiling
84
- The system executes a [UserClassifier](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/layers/profiling.js#382-399) that computes:
85
- * **Risk Geometry**: Using the **Monotone Chain** algorithm to compute the Convex Hull of a user's risk/reward performance (Efficient Frontier analysis).
86
- * **Psychometrics**: Detecting "Revenge Trading" (increasing risk after losses) and "Disposition Skew" (holding losers too long).
87
- * **Attribution**: Separating "Luck" (market beta) from "Skill" (Alpha) by comparing performance against sector benchmarks.
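For reference, a compact sketch of the Monotone Chain convex hull mentioned under Risk Geometry. The `{ x, y }` point shape (for example risk on `x`, reward on `y`) and the function names are assumptions; the `profiling.js` implementation may differ.

```javascript
// Cross product of OA x OB; positive when O -> A -> B turns counter-clockwise.
function cross(o, a, b) {
  return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain: O(n log n) from the sort, O(n) for the two sweeps.
// Returns hull vertices in counter-clockwise order, without collinear points.
function convexHull(points) {
  const pts = [...points].sort((p, q) => p.x - q.x || p.y - q.y);
  if (pts.length <= 2) return pts;

  const lower = [];
  for (const p of pts) {
    while (lower.length >= 2 && cross(lower[lower.length - 2], lower[lower.length - 1], p) <= 0) lower.pop();
    lower.push(p);
  }
  const upper = [];
  for (let i = pts.length - 1; i >= 0; i--) {
    const p = pts[i];
    while (upper.length >= 2 && cross(upper[upper.length - 2], upper[upper.length - 1], p) <= 0) upper.pop();
    upper.push(p);
  }
  // The last point of each chain is the first point of the other.
  lower.pop();
  upper.pop();
  return lower.concat(upper);
}
```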
88
-
89
- These complex models depend on the *guarantee* provided by the DAG that all necessary history and price data is pre-computed and available in the [Context](file:///C:/Users/aiden/Desktop/code_projects/Bulltrackers2025/Backend/Entrypoints/BullTrackers/Backend/Core/bulltrackers-module/functions/computation-system/simulation/Fabricator.js#20-69).
90
-
91
- ## 6. Conclusion
92
-
93
- The BullTrackers Computation System represents a shift from "Action-Based" to "State-Based" architecture. By encoding the domain logic into a Directed Acyclic Graph, we achieve a system that is self-healing, massively scalable via short-circuiting and batching, and capable of supporting deep analytical models. It provides the robustness required for high-stakes financial simulation, ensuring that every decimal point is traceable, reproducible, and verifiable.