@simbimbo/brainstem 0.0.1 → 0.0.3
- package/CHANGELOG.md +87 -0
- package/README.md +99 -3
- package/brainstem/__init__.py +3 -0
- package/brainstem/api.py +257 -0
- package/brainstem/connectors/__init__.py +1 -0
- package/brainstem/connectors/logicmonitor.py +26 -0
- package/brainstem/connectors/types.py +16 -0
- package/brainstem/demo.py +64 -0
- package/brainstem/fingerprint.py +44 -0
- package/brainstem/ingest.py +108 -0
- package/brainstem/instrumentation.py +38 -0
- package/brainstem/interesting.py +62 -0
- package/brainstem/models.py +80 -0
- package/brainstem/recurrence.py +112 -0
- package/brainstem/scoring.py +38 -0
- package/brainstem/storage.py +428 -0
- package/docs/adapters.md +435 -0
- package/docs/api.md +380 -0
- package/docs/architecture.md +333 -0
- package/docs/connectors.md +66 -0
- package/docs/data-model.md +290 -0
- package/docs/design-governance.md +595 -0
- package/docs/mvp-flow.md +109 -0
- package/docs/roadmap.md +87 -0
- package/docs/scoring.md +424 -0
- package/docs/v0.0.1.md +277 -0
- package/docs/vision.md +85 -0
- package/package.json +6 -14
- package/pyproject.toml +18 -0
- package/tests/fixtures/sample_syslog.log +6 -0
- package/tests/test_api.py +319 -0
- package/tests/test_canonicalization.py +28 -0
- package/tests/test_demo.py +25 -0
- package/tests/test_fingerprint.py +22 -0
- package/tests/test_ingest.py +15 -0
- package/tests/test_instrumentation.py +16 -0
- package/tests/test_interesting.py +36 -0
- package/tests/test_logicmonitor.py +22 -0
- package/tests/test_recurrence.py +16 -0
- package/tests/test_scoring.py +21 -0
- package/tests/test_storage.py +294 -0
|
@@ -0,0 +1,595 @@
|
|
|
1
|
+
# brAInstem Design Governance
|
|
2
|
+
|
|
3
|
+
_Status: canonical working governance doc for early product design and v0.0.1 execution_
|
|
4
|
+
|
|
5
|
+
## Purpose
|
|
6
|
+
|
|
7
|
+
This document exists to keep brAInstem from becoming a vague, overbuilt "AIOps platform" before it becomes a useful product.
|
|
8
|
+
|
|
9
|
+
It is the design governor for:
|
|
10
|
+
- what brAInstem is
|
|
11
|
+
- what it is not
|
|
12
|
+
- how we decide what belongs in early versions
|
|
13
|
+
- how ingestion, attention, discovery, memory, and operator output should fit together
|
|
14
|
+
- how to keep the product honest while still moving fast
|
|
15
|
+
|
|
16
|
+
If a future feature, architecture idea, connector, or AI capability conflicts with this document, the burden of proof is on the new idea.
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## 1. Core product thesis
|
|
21
|
+
|
|
22
|
+
brAInstem is an **always-on operational memory runtime for weak signals**.
|
|
23
|
+
|
|
24
|
+
Its job is to:
|
|
25
|
+
1. continuously listen to operational event streams
|
|
26
|
+
2. normalize noisy raw inputs into a canonical event stream
|
|
27
|
+
3. assign and update **attention** over time
|
|
28
|
+
4. suppress most inconsequential activity quickly and cheaply
|
|
29
|
+
5. preserve enough evidence so small signals can earn more attention later
|
|
30
|
+
6. surface only the meaningful "meat" of the stream to humans
|
|
31
|
+
7. remember historically meaningful patterns so future weak signals can be compared against the past
|
|
32
|
+
|
|
33
|
+
### Short version
|
|
34
|
+
|
|
35
|
+
brAInstem remembers the weird operational stuff that monitoring systems do not escalate and humans do not have time to remember.
|
|
36
|
+
|
|
37
|
+
### Even shorter version
|
|
38
|
+
|
|
39
|
+
It tells operators what is quietly becoming important before it becomes an incident.
|
|
40
|
+
|
|
41
|
+
---
|
|
42
|
+
|
|
43
|
+
## 2. The problem brAInstem is solving
|
|
44
|
+
|
|
45
|
+
Modern operational environments produce enormous amounts of low-grade noise:
|
|
46
|
+
- recurring warnings
|
|
47
|
+
- self-healing service failures
|
|
48
|
+
- auth oddities
|
|
49
|
+
- VPN/firewall flap patterns
|
|
50
|
+
- restart storms
|
|
51
|
+
- low-severity events that never page anyone
|
|
52
|
+
- event clusters that matter only when viewed across time, hosts, or sources
|
|
53
|
+
|
|
54
|
+
Most existing tools are optimized for one or more of the following:
|
|
55
|
+
- hard failures
|
|
56
|
+
- alerting thresholds
|
|
57
|
+
- active incident response
|
|
58
|
+
- event correlation around incidents
|
|
59
|
+
- generic observability queries
|
|
60
|
+
|
|
61
|
+
But there is still a gap around the **forgotten operational middle**:
|
|
62
|
+
- patterns too weak to alert on today
|
|
63
|
+
- patterns too repetitive for humans to remember accurately
|
|
64
|
+
- patterns too individually small to justify immediate investigation
|
|
65
|
+
- patterns that later become tickets or outages
|
|
66
|
+
|
|
67
|
+
brAInstem is designed to fill that gap.
|
|
68
|
+
|
|
69
|
+
---
|
|
70
|
+
|
|
71
|
+
## 3. Category position
|
|
72
|
+
|
|
73
|
+
brAInstem is **not** best described as a generic AIOps platform.
|
|
74
|
+
|
|
75
|
+
That category is crowded, vague, and easy to disappear inside.
|
|
76
|
+
|
|
77
|
+
### Preferred category language
|
|
78
|
+
- operational memory engine
|
|
79
|
+
- weak-signal operational memory
|
|
80
|
+
- pre-incident pattern memory
|
|
81
|
+
- always-on operational attention engine
|
|
82
|
+
|
|
83
|
+
### Anti-positioning
|
|
84
|
+
Do **not** lead with:
|
|
85
|
+
- "AIOps platform"
|
|
86
|
+
- "AI observability"
|
|
87
|
+
- "chat with all your logs"
|
|
88
|
+
- "generic event correlation"
|
|
89
|
+
- "autonomous operations"
|
|
90
|
+
|
|
91
|
+
### Why
|
|
92
|
+
Those descriptions flatten the most differentiated part of brAInstem:
|
|
93
|
+
its ability to remember and score weak, recurring, self-healing, pre-threshold operational patterns across time.
|
|
94
|
+
|
|
95
|
+
---
|
|
96
|
+
|
|
97
|
+
## 4. Primary buyer and beachhead
|
|
98
|
+
|
|
99
|
+
### First target buyer
|
|
100
|
+
- MSPs
|
|
101
|
+
|
|
102
|
+
### Secondary buyers
|
|
103
|
+
- lean NOC teams
|
|
104
|
+
- infrastructure/SRE teams with persistent alert fatigue
|
|
105
|
+
- operations-heavy consultants who repeatedly see the same patterns across environments
|
|
106
|
+
|
|
107
|
+
### Why MSPs first
|
|
108
|
+
MSPs are a strong beachhead because they:
|
|
109
|
+
- manage many customer environments with recurring pattern overlap
|
|
110
|
+
- suffer from alert fatigue and technician memory loss
|
|
111
|
+
- need to reduce surprise tickets
|
|
112
|
+
- benefit directly from surfacing weak recurring issues before customers feel them
|
|
113
|
+
- can monetize proactive identification of trouble patterns
|
|
114
|
+
|
|
115
|
+
### Buyer-value language
|
|
116
|
+
brAInstem should help buyers:
|
|
117
|
+
- reduce surprise tickets
|
|
118
|
+
- identify recurring self-healing issues earlier
|
|
119
|
+
- preserve senior technician intuition as reusable memory
|
|
120
|
+
- make junior technicians more effective
|
|
121
|
+
- surface what to check before users notice symptoms
|
|
122
|
+
|
|
123
|
+
---
|
|
124
|
+
|
|
125
|
+
## 5. Product form
|
|
126
|
+
|
|
127
|
+
brAInstem should be designed as a **self-contained runtime**, not just an analytics script.
|
|
128
|
+
|
|
129
|
+
### Required properties
|
|
130
|
+
- always listening
|
|
131
|
+
- self-contained
|
|
132
|
+
- source-installable
|
|
133
|
+
- OS-agnostic as far as practical
|
|
134
|
+
- dependency-self-sufficient where possible
|
|
135
|
+
- suitable for local, edge, or on-prem deployment
|
|
136
|
+
|
|
137
|
+
### What this means in practice
|
|
138
|
+
The product should eventually include:
|
|
139
|
+
- input apparatus
|
|
140
|
+
- normalization apparatus
|
|
141
|
+
- discovery apparatus
|
|
142
|
+
- memory/store apparatus
|
|
143
|
+
- operator-facing output apparatus
|
|
144
|
+
|
|
145
|
+
The product is not just a batch analyzer. It is a continuously operating system for operational attention.
|
|
146
|
+
|
|
147
|
+
---
|
|
148
|
+
|
|
149
|
+
## 6. Non-negotiable day-1 principle: robust input apparatus
|
|
150
|
+
|
|
151
|
+
The input apparatus must be robust from the first serious release.
|
|
152
|
+
|
|
153
|
+
If ingestion is flaky, everything above it is untrustworthy.
|
|
154
|
+
|
|
155
|
+
### Day-1 robustness requirements
|
|
156
|
+
The intake layer must:
|
|
157
|
+
- never lose data silently
|
|
158
|
+
- survive malformed input
|
|
159
|
+
- preserve provenance
|
|
160
|
+
- support bursty input without collapsing downstream logic
|
|
161
|
+
- expose queue depth / health / parse failure visibility
|
|
162
|
+
- preserve enough raw evidence for audit and replay
|
|
163
|
+
- decouple intake from heavy processing using durable buffering/spooling
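As a rough illustration of "never lose data silently" and "parse failure visibility", the sketch below keeps every unparseable line, together with its position and error, instead of discarding it. The names `IntakeReport` and `intake` are hypothetical and are not taken from the package. Durable spooling is deliberately left out of the sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Tuple


@dataclass
class IntakeReport:
    """Hypothetical health counters an operator could inspect after a run."""
    received: int = 0
    parsed: int = 0
    # (line_number, raw_line, error) retained for audit and replay
    quarantined: List[Tuple[int, str, str]] = field(default_factory=list)


def intake(
    lines: Iterable[str],
    parse: Callable[[str], Any],
) -> Tuple[List[Any], IntakeReport]:
    """Parse each line; malformed input is quarantined, never dropped."""
    report = IntakeReport()
    events: List[Any] = []
    for number, line in enumerate(lines, start=1):
        report.received += 1
        try:
            events.append(parse(line))
            report.parsed += 1
        except ValueError as exc:
            # Keep the raw evidence and the reason so the failure is visible.
            report.quarantined.append((number, line, str(exc)))
    return events, report
```

With this shape, `received` always equals `parsed` plus the number of quarantined lines, which gives a cheap invariant to check for silent loss.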
|
|
164
|
+
|
|
165
|
+
### Robust does not mean universal on day one
|
|
166
|
+
We should distinguish between:
|
|
167
|
+
- **robustness**: strong ingestion behavior under ugly real conditions
|
|
168
|
+
- **breadth**: number of supported source types
|
|
169
|
+
|
|
170
|
+
Robustness is non-negotiable on day one.
|
|
171
|
+
Breadth expands over time through adapter contracts.
|
|
172
|
+
|
|
173
|
+
---
|
|
174
|
+
|
|
175
|
+
## 7. Universal input philosophy
|
|
176
|
+
|
|
177
|
+
brAInstem should be architected so that many source types can eventually feed one canonical internal stream.
|
|
178
|
+
|
|
179
|
+
### Examples of desired source classes
|
|
180
|
+
- syslog
|
|
181
|
+
- file tails
|
|
182
|
+
- JSON logs
|
|
183
|
+
- structured webhooks
|
|
184
|
+
- monitoring/alert APIs
|
|
185
|
+
- vendor event payloads
|
|
186
|
+
- streaming inputs
|
|
187
|
+
- audit/event exports
|
|
188
|
+
|
|
189
|
+
### Universal does **not** mean
|
|
190
|
+
- bespoke first-class support for every source in v0.0.1
|
|
191
|
+
- a giant connector zoo before the core event model is stable
|
|
192
|
+
- turning brAInstem into a generic log shipper
|
|
193
|
+
|
|
194
|
+
### Universal **does** mean
|
|
195
|
+
- a stable raw envelope contract
|
|
196
|
+
- a stable canonical event contract
|
|
197
|
+
- adapters that map input into those contracts cleanly
|
|
198
|
+
- downstream discovery logic that is source-agnostic once normalization is complete
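One way to picture a "stable raw envelope contract" is a small structural interface that every adapter satisfies. The following is a minimal sketch under assumptions: `RawEnvelope`, `Adapter`, `LineAdapter`, and `drain` are illustrative names, and the envelope fields are a guess at what provenance might include, not the package's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator, List, Protocol


@dataclass(frozen=True)
class RawEnvelope:
    """Hypothetical provenance-preserving wrapper around one raw input."""
    source_type: str
    source_id: str
    received_at: datetime
    payload: str


class Adapter(Protocol):
    """Anything that can emit envelopes counts as an adapter."""

    def envelopes(self) -> Iterator[RawEnvelope]: ...


class LineAdapter:
    """Emits one envelope per non-empty line of a line-oriented source."""

    def __init__(self, lines: Iterable[str], source_id: str) -> None:
        self._lines = lines
        self._source_id = source_id

    def envelopes(self) -> Iterator[RawEnvelope]:
        for line in self._lines:
            text = line.rstrip("\n")
            if text:
                yield RawEnvelope(
                    source_type="line",
                    source_id=self._source_id,
                    received_at=datetime.now(timezone.utc),
                    payload=text,
                )


def drain(adapter: Adapter) -> List[RawEnvelope]:
    """Downstream code depends only on the Adapter contract, not the source."""
    return list(adapter.envelopes())
```

Because `drain` only sees `RawEnvelope` objects, adding a new source type would mean writing a new adapter class and leaving downstream logic untouched.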
|
|
199
|
+
|
|
200
|
+
---
|
|
201
|
+
|
|
202
|
+
## 8. Canonical stream of consciousness
|
|
203
|
+
|
|
204
|
+
The system should internally treat all normalized events as part of one continuous operational stream of consciousness.
|
|
205
|
+
|
|
206
|
+
That does **not** mean one giant undifferentiated blob.
|
|
207
|
+
It means all events, regardless of source, should become compatible units inside one shared attention/discovery pipeline.
|
|
208
|
+
|
|
209
|
+
### Canonical pipeline concept
|
|
210
|
+
1. raw input arrives
|
|
211
|
+
2. raw input is wrapped in a provenance-preserving envelope
|
|
212
|
+
3. parser/canonicalizer emits canonical events
|
|
213
|
+
4. canonical events are normalized / deduped / baseline-adjusted
|
|
214
|
+
5. attention is assigned
|
|
215
|
+
6. attention evolves over time based on recurrence/spread/context/history
|
|
216
|
+
7. only the meaningful material is surfaced to human review or promoted into memory
|
|
217
|
+
|
|
218
|
+
This canonical stream is the substrate of the product.
|
|
219
|
+
|
|
220
|
+
---
|
|
221
|
+
|
|
222
|
+
## 9. Attention is the central primitive
|
|
223
|
+
|
|
224
|
+
This is one of the most important design rules in the entire product.
|
|
225
|
+
|
|
226
|
+
brAInstem should not operate on a naive binary:
|
|
227
|
+
- keep
|
|
228
|
+
- drop
|
|
229
|
+
|
|
230
|
+
Instead, it should operate on **attention**.
|
|
231
|
+
|
|
232
|
+
### Why
|
|
233
|
+
A lot of operationally meaningful patterns begin as tiny, inconsequential-looking events.
|
|
234
|
+
A single event often means nothing.
|
|
235
|
+
A recurring, spreading, or historically familiar version of that same event may matter a lot.
|
|
236
|
+
|
|
237
|
+
### Therefore
|
|
238
|
+
Most events should not be hard-dropped immediately unless they are clearly worthless.
|
|
239
|
+
Instead, they should be assigned an initial attention score and routed according to that score.
|
|
240
|
+
|
|
241
|
+
### Attention should be:
|
|
242
|
+
- cheap to assign initially
|
|
243
|
+
- dynamic over time
|
|
244
|
+
- evidence-based
|
|
245
|
+
- inspectable
|
|
246
|
+
- auditable
|
|
247
|
+
- promotable
|
|
248
|
+
|
|
249
|
+
### Attention is not just a score
|
|
250
|
+
It is a product behavior model.
|
|
251
|
+
Attention determines:
|
|
252
|
+
- retention depth
|
|
253
|
+
- compute budget
|
|
254
|
+
- routing behavior
|
|
255
|
+
- whether humans see something
|
|
256
|
+
- whether something enters long-term memory
|
|
257
|
+
|
|
258
|
+
---
|
|
259
|
+
|
|
260
|
+
## 10. Attention bands
|
|
261
|
+
|
|
262
|
+
brAInstem should use explicit attention bands rather than one magical opaque score.
|
|
263
|
+
|
|
264
|
+
A suggested early model:
|
|
265
|
+
|
|
266
|
+
### 0. Ignore-fast
|
|
267
|
+
Characteristics:
|
|
268
|
+
- trivial or clearly low-value noise
|
|
269
|
+
- low novelty
|
|
270
|
+
- low recurrence significance
|
|
271
|
+
- low spread
|
|
272
|
+
- low historical relevance
|
|
273
|
+
|
|
274
|
+
Behavior:
|
|
275
|
+
- aggregate counters only or very short retention
|
|
276
|
+
- do not send to deep discovery
|
|
277
|
+
- do not surface to humans
|
|
278
|
+
|
|
279
|
+
### 1. Background
|
|
280
|
+
Characteristics:
|
|
281
|
+
- probably not meaningful alone
|
|
282
|
+
- worth retaining at low cost
|
|
283
|
+
- may matter if repeated later
|
|
284
|
+
|
|
285
|
+
Behavior:
|
|
286
|
+
- keep counts/fingerprint statistics
|
|
287
|
+
- short evidence retention
|
|
288
|
+
- eligible for score growth through recurrence/time
|
|
289
|
+
|
|
290
|
+
### 2. Watch
|
|
291
|
+
Characteristics:
|
|
292
|
+
- weak but plausibly meaningful
|
|
293
|
+
- repeated enough or novel enough to monitor
|
|
294
|
+
|
|
295
|
+
Behavior:
|
|
296
|
+
- maintain context window
|
|
297
|
+
- keep more evidence
|
|
298
|
+
- eligible for candidate formation
|
|
299
|
+
|
|
300
|
+
### 3. Investigate
|
|
301
|
+
Characteristics:
|
|
302
|
+
- likely meaningful weak signal
|
|
303
|
+
- strong enough to justify active discovery logic now
|
|
304
|
+
|
|
305
|
+
Behavior:
|
|
306
|
+
- full candidate generation
|
|
307
|
+
- historical comparison
|
|
308
|
+
- richer explanation generation
|
|
309
|
+
|
|
310
|
+
### 4. Promote
|
|
311
|
+
Characteristics:
|
|
312
|
+
- meaningfully important
|
|
313
|
+
- digest-worthy
|
|
314
|
+
- review-worthy
|
|
315
|
+
- incident-memory-worthy
|
|
316
|
+
|
|
317
|
+
Behavior:
|
|
318
|
+
- surface to humans
|
|
319
|
+
- optionally persist as durable memory / incident pattern
|
|
320
|
+
|
|
321
|
+
### Design rule
|
|
322
|
+
A low-attention event must be able to **earn more attention later**.
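The five bands and the design rule above could be sketched as follows. The numeric floors and the fixed per-recurrence step are invented for illustration only; the document does not specify cut-offs, and real values would need tuning against actual streams.

```python
from enum import IntEnum


class AttentionBand(IntEnum):
    IGNORE_FAST = 0
    BACKGROUND = 1
    WATCH = 2
    INVESTIGATE = 3
    PROMOTE = 4


# Hypothetical lower bounds on a 0.0-1.0 attention score, highest band first.
BAND_FLOORS = [
    (0.80, AttentionBand.PROMOTE),
    (0.60, AttentionBand.INVESTIGATE),
    (0.35, AttentionBand.WATCH),
    (0.10, AttentionBand.BACKGROUND),
]


def band_for_score(score: float) -> AttentionBand:
    """Map a score to the highest band whose floor it reaches."""
    for floor, band in BAND_FLOORS:
        if score >= floor:
            return band
    return AttentionBand.IGNORE_FAST


def bump(score: float, recurrences: int, step: float = 0.05) -> float:
    """Let a signal earn attention: add a fixed step per recurrence, capped at 1.0."""
    return min(1.0, score + step * recurrences)
```

Under these assumed numbers, a signal that starts in Background at 0.12 and then recurs six times would move into Watch, which is the "earn more attention later" behavior in miniature.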
|
|
323
|
+
|
|
324
|
+
---
|
|
325
|
+
|
|
326
|
+
## 11. Discovery apparatus
|
|
327
|
+
|
|
328
|
+
The discovery apparatus is the part of brAInstem that turns canonical events plus accumulated attention into operationally meaningful signals.
|
|
329
|
+
|
|
330
|
+
### The discovery apparatus should:
|
|
331
|
+
- review events in near-real-time
|
|
332
|
+
- consume normalized event/signal state, not just raw payloads
|
|
333
|
+
- operate cheaply on the common path
|
|
334
|
+
- reserve heavier analysis for events/signatures/candidates that have earned attention
|
|
335
|
+
|
|
336
|
+
### Candidate classes that matter early
|
|
337
|
+
- recurrence candidate
|
|
338
|
+
- burst candidate
|
|
339
|
+
- self-heal candidate
|
|
340
|
+
- spread candidate
|
|
341
|
+
- precursor candidate
|
|
342
|
+
|
|
343
|
+
### Core design rule
|
|
344
|
+
Do not waste expensive reasoning on the whole stream.
|
|
345
|
+
Let the stream earn deeper review.
|
|
346
|
+
|
|
347
|
+
---
|
|
348
|
+
|
|
349
|
+
## 12. The "meat" principle
|
|
350
|
+
|
|
351
|
+
Most raw operational data is not the product.
|
|
352
|
+
The product is the meaningful reduction of that data.
|
|
353
|
+
|
|
354
|
+
brAInstem should be explicitly designed to:
|
|
355
|
+
- suppress the junk quickly
|
|
356
|
+
- preserve enough evidence for audit
|
|
357
|
+
- retain enough weak material for attention growth
|
|
358
|
+
- promote only what actually matters
|
|
359
|
+
|
|
360
|
+
### The "meat" is:
|
|
361
|
+
- recurring weak signals
|
|
362
|
+
- self-healing instability
|
|
363
|
+
- growing or spreading oddities
|
|
364
|
+
- patterns with historical precedent
|
|
365
|
+
- patterns that deserve a technician's eyes
|
|
366
|
+
|
|
367
|
+
If the system simply captures everything and summarizes indiscriminately, it has failed.
|
|
368
|
+
|
|
369
|
+
---
|
|
370
|
+
|
|
371
|
+
## 13. Output is the product
|
|
372
|
+
|
|
373
|
+
The internal pipeline matters, but operator-facing output is what creates value.
|
|
374
|
+
|
|
375
|
+
Every surfaced item should aim to answer:
|
|
376
|
+
- what is happening?
|
|
377
|
+
- why does it matter?
|
|
378
|
+
- how often / where is it happening?
|
|
379
|
+
- have we seen this before?
|
|
380
|
+
- what should a human check next?
|
|
381
|
+
|
|
382
|
+
### Early output surfaces
|
|
383
|
+
- interesting items list
|
|
384
|
+
- attention queue
|
|
385
|
+
- weak-signal digest
|
|
386
|
+
- history recall for similar patterns
|
|
387
|
+
|
|
388
|
+
### Anti-pattern
|
|
389
|
+
Do not surface giant blobs of internal state, giant raw logs, or inscrutable score dumps and call that product.
|
|
390
|
+
|
|
391
|
+
The output must feel like technician-grade operational guidance.
|
|
392
|
+
|
|
393
|
+
---
|
|
394
|
+
|
|
395
|
+
## 14. What brAInstem is not
|
|
396
|
+
|
|
397
|
+
brAInstem is not, at least in its early phases:
|
|
398
|
+
- a full SIEM
|
|
399
|
+
- a generic observability backend
|
|
400
|
+
- a generic log shipper
|
|
401
|
+
- a full root-cause engine
|
|
402
|
+
- a complete event correlation platform for all use cases
|
|
403
|
+
- a chatbot over arbitrary logs
|
|
404
|
+
- a replacement for monitoring, alerting, and dashboards
|
|
405
|
+
|
|
406
|
+
It can integrate with or complement those things.
|
|
407
|
+
It should not try to become all of them at once.
|
|
408
|
+
|
|
409
|
+
---
|
|
410
|
+
|
|
411
|
+
## 15. v0.0.1 release philosophy
|
|
412
|
+
|
|
413
|
+
A real v0.0.1 should be **honest, narrow, and usable**.
|
|
414
|
+
|
|
415
|
+
It should not pretend the universal always-on intake appliance is fully complete if it is not.
|
|
416
|
+
|
|
417
|
+
### v0.0.1 should prove:
|
|
418
|
+
- the category framing is coherent
|
|
419
|
+
- the canonical event model exists
|
|
420
|
+
- the attention model exists
|
|
421
|
+
- a small set of input types can feed one normalized stream
|
|
422
|
+
- recurrence/interesting-item output is real
|
|
423
|
+
- the product can emit something operator-meaningful
|
|
424
|
+
|
|
425
|
+
### v0.0.1 should not claim:
|
|
426
|
+
- full universal source coverage
|
|
427
|
+
- fully mature always-on distributed ingestion
|
|
428
|
+
- enterprise-grade multi-tenancy
|
|
429
|
+
- full discovery sophistication
|
|
430
|
+
- full memory retrieval maturity
|
|
431
|
+
|
|
432
|
+
The point of v0.0.1 is to put a truthful first stake in the ground.
|
|
433
|
+
|
|
434
|
+
---
|
|
435
|
+
|
|
436
|
+
## 16. Recommended exact v0.0.1 shape
|
|
437
|
+
|
|
438
|
+
### In scope
|
|
439
|
+
- self-contained local runtime / prototype framing
|
|
440
|
+
- canonical event schema
|
|
441
|
+
- narrow but real ingestion path(s)
|
|
442
|
+
- recurrence/interesting-item generation
|
|
443
|
+
- early attention scoring language and behavior
|
|
444
|
+
- SQLite persistence
|
|
445
|
+
- demo or CLI path that proves end-to-end flow
|
|
446
|
+
- documentation that explains the architecture honestly
|
|
447
|
+
|
|
448
|
+
### Out of scope
|
|
449
|
+
- broad connector ecosystem
|
|
450
|
+
- complete syslog appliance behavior
|
|
451
|
+
- production-grade multi-tenant orchestration
|
|
452
|
+
- heavy UI work
|
|
453
|
+
- cloud dependency sprawl
|
|
454
|
+
- black-box AI-heavy ranking
|
|
455
|
+
|
|
456
|
+
### Honest release description
|
|
457
|
+
brAInstem 0.0.1 is an experimental operational memory prototype that ingests early source types, normalizes them into a canonical stream, assigns attention, detects recurring weak signals, and emits concise operator-facing interesting items from a self-contained local runtime.
|
|
458
|
+
|
|
459
|
+
---
|
|
460
|
+
|
|
461
|
+
## 17. Day-1 architecture priority order
|
|
462
|
+
|
|
463
|
+
The right implementation spine is:
|
|
464
|
+
|
|
465
|
+
1. **Canonical event model**
|
|
466
|
+
2. **Adapter/raw envelope contract**
|
|
467
|
+
3. **Input apparatus**
|
|
468
|
+
4. **Normalization + dedupe**
|
|
469
|
+
5. **Attention scoring**
|
|
470
|
+
6. **Discovery/candidate generation**
|
|
471
|
+
7. **Output/interesting items/digest**
|
|
472
|
+
8. **Broader connectors**
|
|
473
|
+
|
|
474
|
+
### Why this order matters
|
|
475
|
+
If we build connectors before the event model and attention model are stable, we will create a swamp of special cases.
|
|
476
|
+
|
|
477
|
+
---
|
|
478
|
+
|
|
479
|
+
## 18. Dependency strategy
|
|
480
|
+
|
|
481
|
+
The dependency strategy should follow these rules:
|
|
482
|
+
|
|
483
|
+
### Prefer
|
|
484
|
+
- widely used Python runtime/web/schema packages
|
|
485
|
+
- small, understandable libraries
|
|
486
|
+
- dependencies that can be installed from source
|
|
487
|
+
- offline-friendly/self-contained deployment assumptions
|
|
488
|
+
|
|
489
|
+
### Avoid early
|
|
490
|
+
- giant vendor SDK dependency sprawl
|
|
491
|
+
- dependencies that imply cloud lock-in
|
|
492
|
+
- systems that require distributed infra for basic operation
|
|
493
|
+
- magic ML dependencies before the baseline rules are proven useful
|
|
494
|
+
|
|
495
|
+
### General principle
|
|
496
|
+
Use packages to accelerate protocol handling and schema validation, not to outsource the core product thesis.
|
|
497
|
+
|
|
498
|
+
---
|
|
499
|
+
|
|
500
|
+
## 19. Auditability and trust
|
|
501
|
+
|
|
502
|
+
Every significant system behavior must be explainable enough for operators and builders to trust it.
|
|
503
|
+
|
|
504
|
+
This applies especially to:
|
|
505
|
+
- parse failures
|
|
506
|
+
- dropped/suppressed events
|
|
507
|
+
- attention changes
|
|
508
|
+
- candidate promotion
|
|
509
|
+
- digest inclusion
|
|
510
|
+
|
|
511
|
+
### Early trust requirements
|
|
512
|
+
The system should be able to explain:
|
|
513
|
+
- why something was ignored
|
|
514
|
+
- why something stayed background
|
|
515
|
+
- why something was promoted
|
|
516
|
+
- why something surfaced now instead of earlier
|
|
517
|
+
|
|
518
|
+
If the system cannot explain its own behavior, it is not ready for operational trust.
|
|
519
|
+
|
|
520
|
+
---
|
|
521
|
+
|
|
522
|
+
## 20. Biggest design mistakes to avoid
|
|
523
|
+
|
|
524
|
+
### Mistake 1
|
|
525
|
+
Building a connector zoo before freezing the canonical event model.
|
|
526
|
+
|
|
527
|
+
### Mistake 2
|
|
528
|
+
Treating adapters as arbitrary plugin emitters rather than strict envelope emitters.
|
|
529
|
+
|
|
530
|
+
### Mistake 3
|
|
531
|
+
Conflating robustness with breadth.
|
|
532
|
+
|
|
533
|
+
### Mistake 4
|
|
534
|
+
Using opaque AI scoring before transparent attention logic exists.
|
|
535
|
+
|
|
536
|
+
### Mistake 5
|
|
537
|
+
Dropping tiny signals too aggressively and losing the very patterns the product exists to catch.
|
|
538
|
+
|
|
539
|
+
### Mistake 6
|
|
540
|
+
Treating dashboards as the product before the outputs are actually useful.
|
|
541
|
+
|
|
542
|
+
### Mistake 7
|
|
543
|
+
Shipping category language that overclaims beyond what the system truly does.
|
|
544
|
+
|
|
545
|
+
---
|
|
546
|
+
|
|
547
|
+
## 21. Working decision framework
|
|
548
|
+
|
|
549
|
+
When deciding whether a feature belongs now, ask:
|
|
550
|
+
|
|
551
|
+
1. Does this improve the canonical stream of consciousness?
|
|
552
|
+
2. Does this improve attention assignment or attention evolution?
|
|
553
|
+
3. Does this make the input apparatus more robust?
|
|
554
|
+
4. Does this make operator output more useful?
|
|
555
|
+
5. Does this preserve self-contained, source-installable deployment?
|
|
556
|
+
6. Does this tighten the product wedge for MSP/lean ops weak-signal detection?
|
|
557
|
+
|
|
558
|
+
If the answer is mostly no, it probably does not belong yet.
|
|
559
|
+
|
|
560
|
+
---
|
|
561
|
+
|
|
562
|
+
## 22. Recommended near-term sequence
|
|
563
|
+
|
|
564
|
+
### Immediate design work
|
|
565
|
+
1. write explicit `v0.0.1` scope doc
|
|
566
|
+
2. revise architecture doc around raw envelope → canonical event → attention → discovery → output
|
|
567
|
+
3. define adapter contract
|
|
568
|
+
4. define attention bands and routing behavior
|
|
569
|
+
5. define operator output contract
|
|
570
|
+
|
|
571
|
+
### Immediate implementation work after that
|
|
572
|
+
1. make the current prototype language align with attention
|
|
573
|
+
2. make the demo output read like product
|
|
574
|
+
3. ensure the tests and run path are reliable in a real environment
|
|
575
|
+
4. cut an honest `0.0.1`
|
|
576
|
+
|
|
577
|
+
### Next milestone after `0.0.1`
|
|
578
|
+
Build the first truly robust always-on intake runtime:
|
|
579
|
+
- syslog UDP/TCP
|
|
580
|
+
- HTTP ingest/batch
|
|
581
|
+
- durable spool
|
|
582
|
+
- parse/decode observability
|
|
583
|
+
- source health/status surfaces
|
|
584
|
+
|
|
585
|
+
---
|
|
586
|
+
|
|
587
|
+
## 23. Final governing statement
|
|
588
|
+
|
|
589
|
+
brAInstem wins if it becomes the system that:
|
|
590
|
+
- listens continuously
|
|
591
|
+
- forgets the right things cheaply
|
|
592
|
+
- remembers the right weird things accurately
|
|
593
|
+
- and tells a human what is quietly becoming important before it becomes painful
|
|
594
|
+
|
|
595
|
+
Everything else is secondary.
|
package/docs/mvp-flow.md
ADDED
|
@@ -0,0 +1,109 @@
|
|
|
1
|
+
# MVP Flow
|
|
2
|
+
|
|
3
|
+
> This document is subordinate to `design-governance.md`. If the MVP flow here drifts beyond the governance boundaries, the governance doc wins.
|
|
4
|
+
|
|
5
|
+
## Goal
|
|
6
|
+
|
|
7
|
+
Build the smallest possible brAInstem flow that proves the product thesis:
|
|
8
|
+
- logs can be remembered
|
|
9
|
+
- weak signals can be scored
|
|
10
|
+
- recurring self-resolving issues can be surfaced before traditional alerts
|
|
11
|
+
|
|
12
|
+
## MVP source types
|
|
13
|
+
|
|
14
|
+
Start with:
|
|
15
|
+
- syslog files
|
|
16
|
+
- line-oriented service logs
|
|
17
|
+
- auth logs
|
|
18
|
+
- firewall / VPN logs
|
|
19
|
+
- LogicMonitor alert/event payloads via a simple connector contract
|
|
20
|
+
|
|
21
|
+
## MVP processing flow
|
|
22
|
+
|
|
23
|
+
### 1. Ingest
|
|
24
|
+
Read raw lines from a source file or stream.
|
|
25
|
+
|
|
26
|
+
Example input:
|
|
27
|
+
- VPN rekey failure
|
|
28
|
+
- SSH auth failure burst
|
|
29
|
+
- repeated service restart line
|
|
30
|
+
|
|
31
|
+
### 2. Normalize
|
|
32
|
+
Convert each line into a standard event:
|
|
33
|
+
- tenant
|
|
34
|
+
- asset
|
|
35
|
+
- host
|
|
36
|
+
- service
|
|
37
|
+
- timestamp
|
|
38
|
+
- severity
|
|
39
|
+
- message_raw
|
|
40
|
+
- message_normalized
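The field list above maps naturally onto a small record type. The sketch below uses exactly those field names, but the input format (`<host> <service>: <message>`), the `normalize_line` helper, and the placeholder severity are assumptions made for illustration; real syslog lines carry more structure than this.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CanonicalEvent:
    tenant: str
    asset: str
    host: str
    service: str
    timestamp: datetime
    severity: str
    message_raw: str
    message_normalized: str


def normalize_line(line: str, tenant: str, received_at: datetime) -> CanonicalEvent:
    """Parse a simplified '<host> <service>: <message>' line (an assumed format)."""
    head, sep, message = line.partition(": ")
    parts = head.split()
    if not sep or len(parts) != 2:
        raise ValueError(f"unparseable line: {line!r}")
    host, service = parts
    return CanonicalEvent(
        tenant=tenant,
        asset=host,            # assumption: asset and host coincide here
        host=host,
        service=service,
        timestamp=received_at,
        severity="info",       # placeholder; this toy format carries no severity
        message_raw=message,
        # Lowercase and collapse whitespace; the raw message is kept alongside.
        message_normalized=" ".join(message.lower().split()),
    )
```

Keeping `message_raw` next to `message_normalized` means the original evidence stays available even after normalization has discarded detail.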
|
|
41
|
+
|
|
42
|
+
### 3. Fingerprint
|
|
43
|
+
Map normalized messages into signatures.
|
|
44
|
+
|
|
45
|
+
Example:
|
|
46
|
+
- many slightly different VPN errors -> one signature family
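A common way to collapse "many slightly different" messages into one family is to mask the variable parts before hashing. The sketch below shows that general technique; the masking rules and the function names `signature_text` and `fingerprint` are illustrative assumptions, not a description of the package's `fingerprint.py`.

```python
import hashlib
import re

# Variable fragments that usually differ between otherwise identical messages.
_IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
_HEX = re.compile(r"\b0x[0-9a-f]+\b")
_NUM = re.compile(r"\d+")


def signature_text(message: str) -> str:
    """Replace IPs, hex literals, and numbers with placeholders."""
    text = message.lower()
    text = _IPV4.sub("<ip>", text)   # mask addresses before bare numbers
    text = _HEX.sub("<hex>", text)
    text = _NUM.sub("<n>", text)
    return " ".join(text.split())


def fingerprint(message: str) -> str:
    """Short, stable identifier for a signature family."""
    digest = hashlib.sha256(signature_text(message).encode("utf-8"))
    return digest.hexdigest()[:16]
```

Two VPN drop lines that differ only in tunnel number, peer address, and duration would then share one fingerprint, while an unrelated SSH failure would get a different one.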
|
|
47
|
+
|
|
48
|
+
### 4. Candidate generation
|
|
49
|
+
Create candidates when patterns emerge:
|
|
50
|
+
- recurrence candidate
|
|
51
|
+
- self-heal candidate
|
|
52
|
+
- spread candidate
|
|
53
|
+
- burst candidate
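Of the four candidate types, recurrence is the simplest to illustrate. The sketch below counts how often each signature appears within a trailing window and reports those that reach a threshold. The function name, the window semantics (measured back from each signature's latest event), and the default threshold are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def recurrence_candidates(
    events: Iterable[Tuple[str, datetime]],
    window: timedelta,
    min_count: int = 3,
) -> Dict[str, int]:
    """Return signature -> count for signatures seen min_count+ times in window."""
    by_signature: Dict[str, List[datetime]] = defaultdict(list)
    for signature, ts in events:
        by_signature[signature].append(ts)

    candidates: Dict[str, int] = {}
    for signature, stamps in by_signature.items():
        latest = max(stamps)
        # Only occurrences within `window` of the most recent one count.
        recent = [ts for ts in stamps if latest - ts <= window]
        if len(recent) >= min_count:
            candidates[signature] = len(recent)
    return candidates
```

A wide window (days rather than minutes) is what lets a pattern that never trips a per-minute alert threshold still show up as a candidate.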
|
|
54
|
+
|
|
55
|
+
### 5. Score
|
|
56
|
+
Compute a human-significance score using:
|
|
57
|
+
- recurrence
|
|
58
|
+
- recovery
|
|
59
|
+
- spread
|
|
60
|
+
- novelty
|
|
61
|
+
- temporal correlation
|
|
62
|
+
- human impact likelihood
|
|
63
|
+
- precursor likelihood
|
|
64
|
+
- memory weight
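One transparent way to combine the factors above is a weighted sum that also reports each factor's contribution, so the score can explain itself. The weights below are invented for illustration and sum to 1.0; the document does not specify any weighting, and `score_candidate` is a hypothetical name.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; each factor is expected as a value in [0.0, 1.0].
WEIGHTS: Dict[str, float] = {
    "recurrence": 0.25,
    "recovery": 0.15,
    "spread": 0.15,
    "novelty": 0.10,
    "temporal_correlation": 0.10,
    "human_impact": 0.10,
    "precursor": 0.10,
    "memory_weight": 0.05,
}


def score_candidate(factors: Dict[str, float]) -> Tuple[float, List[Tuple[str, float]]]:
    """Return (total score, non-zero contributions ordered largest first)."""
    unknown = set(factors) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")

    contributions = {
        name: weight * min(1.0, max(0.0, factors.get(name, 0.0)))
        for name, weight in WEIGHTS.items()
    }
    total = sum(contributions.values())
    reasons = sorted(
        ((name, value) for name, value in contributions.items() if value > 0),
        key=lambda item: item[1],
        reverse=True,
    )
    return total, reasons
```

Returning the ordered contributions alongside the total keeps the score inspectable: an operator can see which factors drove it rather than receiving a bare number.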
|
|
65
|
+
|
|
66
|
+
### 6. Surface
|
|
67
|
+
Produce:
|
|
68
|
+
- daily digest item
|
|
69
|
+
- candidate explanation
|
|
70
|
+
- related prior history
|
|
71
|
+
|
|
72
|
+
### 7. Promote
|
|
73
|
+
If a candidate repeatedly matters, promote it into incident memory.
|
|
74
|
+
|
|
75
|
+
## First killer demo
|
|
76
|
+
|
|
77
|
+
### Scenario
|
|
78
|
+
Recurring self-healing VPN instability.
|
|
79
|
+
|
|
80
|
+
Input pattern:
|
|
81
|
+
- tunnel drops briefly
|
|
82
|
+
- recovers before alert threshold
|
|
83
|
+
- happens 10+ times over several days
|
|
84
|
+
|
|
85
|
+
Desired output:
|
|
86
|
+
- a digest item surfaces it
|
|
87
|
+
- score explains why it matters
|
|
88
|
+
- prior similar incident memory is linked if available
|
|
89
|
+
- raw log evidence is still drillable
|
|
90
|
+
|
|
91
|
+
## MVP success criteria
|
|
92
|
+
|
|
93
|
+
A successful MVP should:
|
|
94
|
+
- ingest raw syslog-like events
|
|
95
|
+
- cluster similar weak signals
|
|
96
|
+
- produce an explainable candidate score
|
|
97
|
+
- surface at least one meaningful item that classic threshold alerting would miss
|
|
98
|
+
- support a concise operator digest
|
|
99
|
+
|
|
100
|
+
## Non-goals for MVP
|
|
101
|
+
|
|
102
|
+
Do not try to solve:
|
|
103
|
+
- full SIEM workflows
|
|
104
|
+
- complete root-cause analysis
|
|
105
|
+
- all log formats
|
|
106
|
+
- all tenants and data sources at once
|
|
107
|
+
- perfect semantic understanding
|
|
108
|
+
|
|
109
|
+
The MVP should prove the weak-signal memory model works.
|