npm - @simbimbo/brainstem - Versions diffs - 0.0.1 → 0.0.3 - Mend

@simbimbo/brainstem 0.0.1 → 0.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (41)

package/CHANGELOG.md +87 -0
package/README.md +99 -3
package/brainstem/__init__.py +3 -0
package/brainstem/api.py +257 -0
package/brainstem/connectors/__init__.py +1 -0
package/brainstem/connectors/logicmonitor.py +26 -0
package/brainstem/connectors/types.py +16 -0
package/brainstem/demo.py +64 -0
package/brainstem/fingerprint.py +44 -0
package/brainstem/ingest.py +108 -0
package/brainstem/instrumentation.py +38 -0
package/brainstem/interesting.py +62 -0
package/brainstem/models.py +80 -0
package/brainstem/recurrence.py +112 -0
package/brainstem/scoring.py +38 -0
package/brainstem/storage.py +428 -0
package/docs/adapters.md +435 -0
package/docs/api.md +380 -0
package/docs/architecture.md +333 -0
package/docs/connectors.md +66 -0
package/docs/data-model.md +290 -0
package/docs/design-governance.md +595 -0
package/docs/mvp-flow.md +109 -0
package/docs/roadmap.md +87 -0
package/docs/scoring.md +424 -0
package/docs/v0.0.1.md +277 -0
package/docs/vision.md +85 -0
package/package.json +6 -14
package/pyproject.toml +18 -0
package/tests/fixtures/sample_syslog.log +6 -0
package/tests/test_api.py +319 -0
package/tests/test_canonicalization.py +28 -0
package/tests/test_demo.py +25 -0
package/tests/test_fingerprint.py +22 -0
package/tests/test_ingest.py +15 -0
package/tests/test_instrumentation.py +16 -0
package/tests/test_interesting.py +36 -0
package/tests/test_logicmonitor.py +22 -0
package/tests/test_recurrence.py +16 -0
package/tests/test_scoring.py +21 -0
package/tests/test_storage.py +294 -0

package/CHANGELOG.md ADDED

@@ -0,0 +1,87 @@
+# Changelog
+## 0.0.3 — 2026-03-22
+Intake Foundation follow-up release for **brAInstem**.
+### Highlights
+- persists `RawInputEnvelope` intake records to SQLite before canonicalization
+- records canonicalization outcomes explicitly (`received`, `canonicalized`, `parse_failed`, `unsupported`)
+- adds ingest accounting for:
+  - received
+  - canonicalized
+  - parse_failed
+  - candidates_generated
+- adds runtime inspection endpoints for intake trust and observability:
+  - `GET /stats`
+  - `GET /failures`
+  - `GET /failures/{id}`
+  - `GET /ingest/recent`
+  - `GET /sources`
+- adds storage/query helpers for recent raw envelopes, recent failures, and per-source summaries
+- expands tests around raw-envelope persistence, failure inspection, source summaries, and stats
+### Validation
+- local test suite passed (`26 passed`)
+## 0.0.2 — 2026-03-22
+First fully aligned public foundation release of **brAInstem**.
+### Why 0.0.2
+- `0.0.1` was already published to npm earlier as the first placeholder/bootstrap package version
+- `0.0.2` is the first release where the local repo, documentation, canonical location, runtime foundation, and validation are being shipped together intentionally
+### What this release adds/locks in
+- canonical repo location at `~/brAInstem`
+- design governance, `v0.0.1` scope, adapter contract, and attention scoring docs
+- explicit `RawInputEnvelope` and `CanonicalEvent` foundation models
+- canonicalization path for current syslog-like ingestion
+- minimal FastAPI runtime with:
+  - `POST /ingest/event`
+  - `POST /ingest/batch`
+  - `GET /interesting`
+  - `GET /healthz`
+- focused API and canonicalization tests
+### Validation
+- local test suite passed (`19 passed`)
+- local end-to-end demo path executed successfully against sample syslog input
+- minimal FastAPI runtime and canonicalization tests passed locally
+## 0.0.1 — 2026-03-22
+First public prototype release of **brAInstem**.
+### What this release is
+- an experimental, self-contained operational memory prototype for weak signals
+- a proof that early event sources can feed one normalized internal stream
+- a first cut at attention-oriented weak-signal discovery and operator-facing interesting items
+### Included in 0.0.1
+- syslog-like ingestion path
+- event fingerprinting and recurrence candidate generation
+- interpretable scoring with operator-facing decision/attention output
+- SQLite persistence for events, signatures, and candidates
+- local demo path for end-to-end validation
+- initial LogicMonitor connector model/mapping work
+- design governance docs covering:
+  - product thesis
+  - `v0.0.1` scope
+  - adapter/raw-envelope/canonical-event contract
+  - attention scoring model
+### Not claimed in 0.0.1
+- full universal intake apparatus
+- production-grade always-on ingestion runtime
+- mature multi-tenant MSP platform behavior
+- complete discovery apparatus breadth (burst/spread/self-heal/precursor at full maturity)
+- polished operator UI
+### Validation
+- local test suite passed (`19 passed`)
+- local end-to-end demo path executed successfully against sample syslog input
+- minimal FastAPI runtime and canonicalization tests passed locally
+### Release framing
+This release puts a truthful first stake in the ground for brAInstem as an operational memory runtime focused on weak signals and operator attention.
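
The 0.0.3 entry above describes persisting raw intake records before canonicalization and then recording an explicit outcome per record. A minimal sketch of that ordering follows. The table layout and the `canonicalize` stand-in are assumptions made for illustration; only the status names come from the changelog.

```python
import sqlite3

# Status names as listed in the 0.0.3 changelog entry.
STATUSES = ("received", "canonicalized", "parse_failed", "unsupported")


def canonicalize(message: str) -> str:
    """Hypothetical stand-in canonicalizer: rejects blank messages."""
    if not message.strip():
        raise ValueError("empty message")
    return message.strip().lower()


conn = sqlite3.connect(":memory:")
# Hypothetical table layout, not the package's actual schema.
conn.execute(
    "CREATE TABLE raw_envelopes ("
    "id INTEGER PRIMARY KEY, message_raw TEXT, "
    "status TEXT NOT NULL DEFAULT 'received', failure_reason TEXT)"
)

messages = ["sshd: Failed password", "   ", "nginx restarted"]

# Step 1: persist every raw record first, so nothing is lost if parsing fails.
ids = []
for message in messages:
    cur = conn.execute("INSERT INTO raw_envelopes (message_raw) VALUES (?)", (message,))
    ids.append(cur.lastrowid)

# Step 2: attempt canonicalization and record the outcome on the stored row.
for row_id, message in zip(ids, messages):
    try:
        canonicalize(message)
    except ValueError as exc:
        conn.execute(
            "UPDATE raw_envelopes SET status = ?, failure_reason = ? WHERE id = ?",
            ("parse_failed", str(exc), row_id),
        )
    else:
        conn.execute("UPDATE raw_envelopes SET status = ? WHERE id = ?", ("canonicalized", row_id))

# Ingest accounting: one count per outcome.
counts = dict(conn.execute("SELECT status, COUNT(*) FROM raw_envelopes GROUP BY status"))
print(counts)
```

Because the row exists before parsing is attempted, a failed record stays inspectable afterwards, which is what the `GET /failures` endpoints in the changelog rely on.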

package/README.md CHANGED

@@ -1,5 +1,101 @@
-# @simbimbo/brainstem
+# brAInstem
-brAInstem is an operational memory engine for weak signals.
+**Operational memory for weak signals**
-This package name is being reserved for the upcoming brAInstem project.
+brAInstem is an always-on operational memory runtime for weak signals. Instead of treating memory as conversational context, brAInstem treats logs and operational events as raw operational experience that can be normalized into one canonical stream, assigned attention, clustered into patterns, and promoted into durable operational knowledge.
+## One-line pitch
+brAInstem helps MSPs and lean ops teams detect recurring, self-resolving, and quietly escalating issues before they become major incidents.
+## The problem
+Most operational pain never becomes a classic threshold alert.
+It shows up as:
+- recurring low-grade warnings
+- brief self-healing failures
+- cross-system weak signals
+- near-misses that humans forget because there is too much noise
+Traditional monitoring catches hard failures. brAInstem is designed to catch patterns that matter to humans before they become obvious outages.
+## Core idea
+Logs and events should not only be stored and searched. They should be:
+1. ingested into a provenance-preserving raw envelope
+2. normalized into one canonical event stream
+3. assigned and updated with **attention** over time
+4. compressed so most inconsequential noise is handled cheaply
+5. promoted into operator-facing weak-signal outputs only when enough attention is earned
+6. promoted later into incident memory, lessons, and runbook hints when justified
+7. retrieved again when similar patterns recur
+## Primary users
+- MSP owners
+- MSP technicians
+- NOC teams
+- SRE / infrastructure operators
+- small security / ops teams dealing with alert fatigue and log blindness
+## MVP promise
+For a given tenant or environment, brAInstem should answer:
+- What happened today that mattered but never alerted?
+- What self-resolving issues are recurring?
+- What patterns are likely to become tickets later?
+- Have we seen this before?
+- What happened right before the last similar incident?
+## Relationship to ocmemog
+Shared DNA:
+- ingest -> candidate -> promotion pipeline
+- compact retrieval
+- provenance and explainability
+- memory scoring and recurrence awareness
+Different center of gravity:
+- ocmemog = assistant memory and continuity
+- brAInstem = operational event intelligence and weak-signal detection
+## Initial scope
+The long-term product direction is an always-on self-contained runtime with a robust input apparatus, a discovery apparatus, and operator-facing outputs.
+For the first public prototype line, start with:
+- a narrow but real ingestion story
+- a canonical event stream
+- attention-oriented weak-signal discovery
+- operator-facing interesting items / digest output
+- syslog-like events and LogicMonitor-shaped events as early proof sources
+Delay until later:
+- broad universal connector coverage
+- mature syslog appliance behavior across every input mode
+- full SIEM behavior
+- broad compliance workflows
+- generic observability replacement
+- "chat with all your logs" as the primary story
+## Proposed docs
+- `docs/design-governance.md` — canonical product/design guardrails
+- `docs/v0.0.1.md` — first release scope and acceptance criteria
+- `docs/adapters.md` — intake, raw envelope, and canonical event contract
+- `docs/vision.md`
+- `docs/architecture.md`
+- `docs/scoring.md` — attention scoring and routing model
+- `docs/roadmap.md`
+## Design governance
+Before expanding scope, adding connectors, or making architecture claims, read:
+- `docs/design-governance.md`
+That document is the canonical governor for:
+- what brAInstem is and is not
+- how attention should work
+- what belongs in `0.0.1`
+- how the input apparatus, discovery apparatus, and operator outputs should evolve
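
The "Core idea" list in the README above says weak signals are promoted only once they have earned enough attention. As a toy illustration of that gate, the sketch below counts repeated signature keys and keeps those that reach a threshold. It is not the package's `build_recurrence_candidates`, and the function name and sample keys are hypothetical; the `family|service|pattern` key shape and the default threshold of 2 appear elsewhere in this diff.

```python
from collections import Counter
from typing import Dict, List


def recurring_signatures(signature_keys: List[str], threshold: int = 2) -> Dict[str, int]:
    """Return the signature keys seen at least `threshold` times, with counts."""
    counts = Counter(signature_keys)
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical signature keys in the family|service|pattern shape.
stream = [
    "failure|sshd|failed password for root from <ip> port <n>",
    "service_lifecycle|nginx|nginx restarted",
    "failure|sshd|failed password for root from <ip> port <n>",
    "generic|cron|job <n> finished",
]

promoted = recurring_signatures(stream, threshold=2)
print(promoted)
```

Only the repeated `sshd` failure passes the gate; the one-off events are counted but not surfaced.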

package/brainstem/__init__.py ADDED

@@ -0,0 +1,3 @@
+"""brAInstem — operational memory for weak signals."""
+__version__ = "0.0.3"

package/brainstem/api.py ADDED

@@ -0,0 +1,257 @@
+from __future__ import annotations
+from datetime import datetime
+from typing import Any, Dict, List, Optional
+import json
+from fastapi import FastAPI, HTTPException, Query
+from fastapi.responses import JSONResponse
+from pydantic import BaseModel, Field
+from .ingest import canonicalize_raw_input_envelope
+from .interesting import interesting_items
+from .models import Candidate, RawInputEnvelope
+from .recurrence import build_recurrence_candidates
+from .storage import (
+    RAW_ENVELOPE_STATUSES,
+    get_ingest_stats,
+    init_db,
+    list_candidates,
+    get_raw_envelope_by_id,
+    get_source_dimension_summaries,
+    list_recent_failed_raw_envelopes,
+    list_recent_raw_envelopes,
+    set_raw_envelope_status,
+    store_candidates,
+    store_events,
+    store_raw_envelopes,
+    store_signatures,
+)
+from .ingest import signatures_for_events
+app = FastAPI(title="brAInstem Runtime")
+class RawEnvelopeRequest(BaseModel):
+    tenant_id: str
+    source_type: str
+    source_id: str = ""
+    source_name: str = ""
+    message_raw: str
+    timestamp: Optional[str] = None
+    host: str = ""
+    service: str = ""
+    severity: str = "info"
+    asset_id: str = ""
+    source_path: str = ""
+    facility: str = ""
+    structured_fields: Dict[str, Any] = Field(default_factory=dict)
+    correlation_keys: Dict[str, Any] = Field(default_factory=dict)
+    metadata: Dict[str, Any] = Field(default_factory=dict)
+class IngestEventRequest(RawEnvelopeRequest):
+    pass
+class IngestBatchRequest(BaseModel):
+    events: List[RawEnvelopeRequest]
+    threshold: int = Field(default=2, ge=1)
+    db_path: Optional[str] = None
+def _raw_envelope_from_request(payload: RawEnvelopeRequest) -> RawInputEnvelope:
+    return RawInputEnvelope(
+        tenant_id=payload.tenant_id,
+        source_type=payload.source_type,
+        source_id=payload.source_id,
+        source_name=payload.source_name,
+        timestamp=payload.timestamp or datetime.utcnow().isoformat() + "Z",
+        message_raw=payload.message_raw,
+        host=payload.host,
+        service=payload.service,
+        severity=payload.severity,
+        asset_id=payload.asset_id,
+        source_path=payload.source_path,
+        facility=payload.facility,
+        structured_fields=dict(payload.structured_fields),
+        correlation_keys=dict(payload.correlation_keys),
+        metadata=dict(payload.metadata),
+    )
+def _candidate_from_row(row) -> Candidate:
+    return Candidate(
+        candidate_type=row["candidate_type"],
+        title=row["title"],
+        summary=row["summary"],
+        score_total=float(row["score_total"]),
+        score_breakdown=json.loads(row["score_breakdown_json"] or "{}"),
+        decision_band=row["decision_band"],
+        source_signature_ids=json.loads(row["source_signature_ids_json"] or "[]"),
+        source_event_ids=json.loads(row["source_event_ids_json"] or "[]"),
+        confidence=float(row["confidence"]),
+        metadata=json.loads(row["metadata_json"] or "{}"),
+    )
+def _raw_envelope_from_row(row) -> Dict[str, Any]:
+    return {
+        "id": row["id"],
+        "tenant_id": row["tenant_id"],
+        "source_type": row["source_type"],
+        "source_id": row["source_id"],
+        "source_name": row["source_name"],
+        "timestamp": row["timestamp"],
+        "host": row["host"],
+        "service": row["service"],
+        "severity": row["severity"],
+        "asset_id": row["asset_id"],
+        "source_path": row["source_path"],
+        "facility": row["facility"],
+        "message_raw": row["message_raw"],
+        "structured_fields": json.loads(row["structured_fields_json"] or "{}"),
+        "correlation_keys": json.loads(row["correlation_keys_json"] or "{}"),
+        "metadata": json.loads(row["metadata_json"] or "{}"),
+        "canonicalization_status": row["canonicalization_status"],
+        "failure_reason": row["failure_reason"],
+    }
+def _run_ingest_batch(raw_events: List[RawInputEnvelope], *, threshold: int, db_path: Optional[str]) -> Dict[str, Any]:
+    raw_envelope_ids: List[int] = []
+    if db_path:
+        init_db(db_path)
+        raw_envelope_ids = store_raw_envelopes(raw_events, db_path)
+    events = []
+    parse_failed = 0
+    for idx, raw_event in enumerate(raw_events):
+        raw_envelope_id = raw_envelope_ids[idx] if idx < len(raw_envelope_ids) else None
+        try:
+            canonical_event = canonicalize_raw_input_envelope(raw_event)
+        except Exception as exc:
+            parse_failed += 1
+            if raw_envelope_id is not None:
+                set_raw_envelope_status(
+                    raw_envelope_id,
+                    "parse_failed",
+                    db_path=db_path,
+                    failure_reason=str(exc),
+                )
+            continue
+        events.append(canonical_event)
+        if raw_envelope_id is not None:
+            set_raw_envelope_status(raw_envelope_id, "canonicalized", db_path=db_path)
+    if not events:
+        return {
+            "ok": True,
+            "event_count": 0,
+            "signature_count": 0,
+            "candidate_count": 0,
+            "parse_failed": parse_failed,
+            "interesting_items": [],
+        }
+    signatures = signatures_for_events(events)
+    candidates = build_recurrence_candidates(events, signatures, threshold=threshold)
+    if db_path:
+        store_events(events, db_path)
+        store_signatures(signatures, db_path)
+        store_candidates(candidates, db_path)
+    return {
+        "ok": True,
+        "tenant_id": events[0].tenant_id if events else "",
+        "event_count": len(events),
+        "signature_count": len({sig.signature_key for sig in signatures}),
+        "candidate_count": len(candidates),
+        "parse_failed": parse_failed,
+        "interesting_items": interesting_items(candidates, limit=max(1, 5)),
+    }
+@app.post("/ingest/event")
+def ingest_event(payload: IngestEventRequest, threshold: int = 2, db_path: Optional[str] = None) -> Dict[str, Any]:
+    if threshold < 1:
+        raise HTTPException(status_code=422, detail="threshold must be >= 1")
+    return _run_ingest_batch([_raw_envelope_from_request(payload)], threshold=threshold, db_path=db_path)
+@app.post("/ingest/batch")
+def ingest_batch(payload: IngestBatchRequest) -> Dict[str, Any]:
+    raw_events = [_raw_envelope_from_request(event) for event in payload.events]
+    return _run_ingest_batch(raw_events, threshold=payload.threshold, db_path=payload.db_path)
+@app.get("/interesting")
+def get_interesting(
+    limit: int = Query(default=5, ge=1),
+    db_path: Optional[str] = None,
+) -> Dict[str, Any]:
+    if not db_path:
+        return {"ok": True, "items": []}
+    rows = list_candidates(db_path=db_path, limit=limit)
+    candidates = [_candidate_from_row(row) for row in rows]
+    return {"ok": True, "items": interesting_items(candidates, limit=limit)}
+@app.get("/stats")
+def get_stats(db_path: Optional[str] = None) -> Dict[str, Any]:
+    return {"ok": True, **get_ingest_stats(db_path)}
+@app.get("/failures")
+def get_failures(
+    limit: int = Query(default=20, ge=1),
+    status: Optional[str] = None,
+    db_path: Optional[str] = None,
+) -> Dict[str, Any]:
+    if status is not None and status not in RAW_ENVELOPE_STATUSES:
+        raise HTTPException(
+            status_code=422,
+            detail=f"invalid status '{status}'; expected one of: {', '.join(RAW_ENVELOPE_STATUSES)}",
+        )
+    rows = list_recent_failed_raw_envelopes(db_path=db_path, status=status, limit=limit)
+    items = [_raw_envelope_from_row(row) for row in rows]
+    return {"ok": True, "items": items, "count": len(items), "status": status}
+@app.get("/ingest/recent")
+def get_ingest_recent(
+    limit: int = Query(default=20, ge=1),
+    status: Optional[str] = None,
+    db_path: Optional[str] = None,
+) -> Dict[str, Any]:
+    if status is not None and status not in RAW_ENVELOPE_STATUSES:
+        raise HTTPException(
+            status_code=422,
+            detail=f"invalid status '{status}'; expected one of: {', '.join(RAW_ENVELOPE_STATUSES)}",
+        )
+    rows = list_recent_raw_envelopes(db_path=db_path, status=status, limit=limit, failures_only=False)
+    items = [_raw_envelope_from_row(row) for row in rows]
+    return {"ok": True, "items": items, "count": len(items), "status": status}
+@app.get("/sources")
+def get_sources(
+    limit: int = Query(default=10, ge=1),
+    db_path: Optional[str] = None,
+) -> Dict[str, Any]:
+    return {"ok": True, "items": get_source_dimension_summaries(db_path=db_path, limit=limit)}
+@app.get("/failures/{raw_envelope_id}")
+def get_failure(raw_envelope_id: int, db_path: Optional[str] = None) -> Dict[str, Any]:
+    row = get_raw_envelope_by_id(raw_envelope_id, db_path=db_path)
+    if row is None:
+        raise HTTPException(status_code=404, detail="raw envelope not found")
+    return {"ok": True, "item": _raw_envelope_from_row(row)}
+@app.get("/healthz")
+def healthz() -> Dict[str, str]:
+    return JSONResponse(content={"ok": True, "status": "ok"})
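
For readers who want to exercise `POST /ingest/batch`, the sketch below builds a request body matching the `IngestBatchRequest` model above, using only the standard library. The base URL, the database path, the `source_type` value, and the helper name `build_batch_request` are assumptions for illustration. The request is constructed but not sent.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_batch_request(
    base_url: str,
    events: List[Dict[str, Any]],
    threshold: int = 2,
    db_path: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /ingest/batch request."""
    body: Dict[str, Any] = {"events": events, "threshold": threshold}
    if db_path is not None:
        # In the batch endpoint, db_path travels in the JSON body.
        body["db_path"] = db_path
    return urllib.request.Request(
        base_url.rstrip("/") + "/ingest/batch",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_batch_request(
    "http://127.0.0.1:8000",  # hypothetical host and port
    [
        {
            # tenant_id, source_type and message_raw are the required fields.
            "tenant_id": "demo-tenant",
            "source_type": "syslog",  # assumed value
            "message_raw": "sshd: Failed password for root",
        }
    ],
    db_path="/tmp/brainstem-demo.sqlite3",  # hypothetical path
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it to a running server.
```

Note that `_run_ingest_batch` above only stores envelopes, events, signatures, and candidates when `db_path` is set; without it the call returns results but persists nothing.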

package/brainstem/connectors/__init__.py ADDED

@@ -0,0 +1 @@
+"""Connector entry points for brAInstem."""

package/brainstem/connectors/logicmonitor.py ADDED

@@ -0,0 +1,26 @@
+from __future__ import annotations
+from .types import ConnectorEvent
+def map_logicmonitor_event(payload: dict, *, tenant_id: str) -> ConnectorEvent:
+    metadata = payload.get("metadata") or {}
+    host = payload.get("host") or payload.get("resource_name") or ""
+    service = payload.get("service") or metadata.get("datasource") or "logicmonitor"
+    return ConnectorEvent(
+        tenant_id=tenant_id,
+        source_type="logicmonitor",
+        host=str(host),
+        service=str(service),
+        severity=str(payload.get("severity") or "info"),
+        timestamp=str(payload.get("timestamp") or ""),
+        message_raw=str(payload.get("message_raw") or payload.get("message") or ""),
+        metadata={
+            "alert_id": payload.get("alert_id"),
+            "resource_id": payload.get("resource_id"),
+            "datasource": metadata.get("datasource"),
+            "instance_name": metadata.get("instance_name"),
+            "acknowledged": metadata.get("acknowledged"),
+            "cleared_at": metadata.get("cleared_at"),
+        },
+    )
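
The fallback order in `map_logicmonitor_event` is easier to see with a concrete input. The sketch below copies the mapper and the `ConnectorEvent` dataclass from this diff into one standalone file and runs them on a sample payload. The payload is invented for illustration and is not a documented LogicMonitor format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ConnectorEvent:
    tenant_id: str
    source_type: str
    host: str
    service: str
    severity: str
    timestamp: str
    message_raw: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def map_logicmonitor_event(payload: dict, *, tenant_id: str) -> ConnectorEvent:
    metadata = payload.get("metadata") or {}
    # host falls back to resource_name, service falls back to the datasource.
    host = payload.get("host") or payload.get("resource_name") or ""
    service = payload.get("service") or metadata.get("datasource") or "logicmonitor"
    return ConnectorEvent(
        tenant_id=tenant_id,
        source_type="logicmonitor",
        host=str(host),
        service=str(service),
        severity=str(payload.get("severity") or "info"),
        timestamp=str(payload.get("timestamp") or ""),
        message_raw=str(payload.get("message_raw") or payload.get("message") or ""),
        metadata={
            "alert_id": payload.get("alert_id"),
            "resource_id": payload.get("resource_id"),
            "datasource": metadata.get("datasource"),
            "instance_name": metadata.get("instance_name"),
            "acknowledged": metadata.get("acknowledged"),
            "cleared_at": metadata.get("cleared_at"),
        },
    )


# Hypothetical payload: no "host", "service", or "timestamp" keys.
sample = {
    "resource_name": "edge-fw-01",
    "severity": "warn",
    "message": "Ping loss above threshold",
    "alert_id": "LMA123",
    "metadata": {"datasource": "Ping", "acknowledged": False},
}

event = map_logicmonitor_event(sample, tenant_id="demo-tenant")
print(event.host, event.service, repr(event.timestamp))
```

A missing timestamp maps to an empty string rather than the current time, so callers that need a timestamp have to supply one themselves.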

package/brainstem/connectors/types.py ADDED

@@ -0,0 +1,16 @@
+from __future__ import annotations
+from dataclasses import dataclass, field
+from typing import Any, Dict
+@dataclass
+class ConnectorEvent:
+    tenant_id: str
+    source_type: str
+    host: str
+    service: str
+    severity: str
+    timestamp: str
+    message_raw: str
+    metadata: Dict[str, Any] = field(default_factory=dict)

package/brainstem/demo.py ADDED

@@ -0,0 +1,64 @@
+from __future__ import annotations
+import argparse
+import json
+from dataclasses import asdict
+from typing import Any, Dict
+from .ingest import ingest_syslog_file, signatures_for_events
+from .instrumentation import emit, span
+from .interesting import interesting_items
+from .recurrence import build_recurrence_candidates, digest_items
+from .storage import init_db, store_candidates, store_events, store_signatures
+def run_syslog_demo(path: str, tenant_id: str, threshold: int = 2, db_path: str | None = None) -> Dict[str, Any]:
+    with span("syslog_demo", path=path, tenant_id=tenant_id, threshold=threshold):
+        init_db(db_path)
+        events = ingest_syslog_file(path, tenant_id=tenant_id)
+        emit("syslog_demo_events_loaded", count=len(events), path=path)
+        signatures = signatures_for_events(events)
+        emit("syslog_demo_signatures_built", count=len(signatures))
+        candidates = build_recurrence_candidates(events, signatures, threshold=threshold)
+        store_events(events, db_path)
+        store_signatures(signatures, db_path)
+        store_candidates(candidates, db_path)
+        digest = digest_items(candidates)
+        items = interesting_items(candidates, limit=5)
+        payload = {
+            "ok": True,
+            "tenant_id": tenant_id,
+            "event_count": len(events),
+            "signature_count": len({sig.signature_key for sig in signatures}),
+            "candidate_count": len(candidates),
+            "canonical_stream": {
+                "event_count": len(events),
+                "signature_count": len({sig.signature_key for sig in signatures}),
+            },
+            "digest": digest,
+            "interesting_items": items,
+            "top_candidate": asdict(candidates[0]) if candidates else None,
+        }
+        emit(
+            "syslog_demo_summary",
+            event_count=payload["event_count"],
+            signature_count=payload["signature_count"],
+            candidate_count=payload["candidate_count"],
+        )
+        return payload
+def main() -> int:
+    parser = argparse.ArgumentParser(description="Run the brAInstem syslog weak-signal demo.")
+    parser.add_argument("path", help="Path to a syslog-like input file")
+    parser.add_argument("--tenant", default="demo-tenant", help="Tenant/environment identifier")
+    parser.add_argument("--threshold", type=int, default=2, help="Minimum recurrence count for candidate emission")
+    parser.add_argument("--db-path", default=None, help="Optional SQLite path for persistent state")
+    args = parser.parse_args()
+    payload = run_syslog_demo(args.path, tenant_id=args.tenant, threshold=args.threshold, db_path=args.db_path)
+    print(json.dumps(payload, indent=2))
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())

package/brainstem/fingerprint.py ADDED

@@ -0,0 +1,44 @@
+from __future__ import annotations
+import re
+from .models import Event, Signature
+_WHITESPACE_RE = re.compile(r"\s+")
+_IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
+_NUMBER_RE = re.compile(r"\b\d+\b")
+def normalize_message(message: str) -> str:
+    text = (message or "").strip().lower()
+    text = _IPV4_RE.sub("<ip>", text)
+    text = _NUMBER_RE.sub("<n>", text)
+    text = _WHITESPACE_RE.sub(" ", text)
+    return text
+def event_family_for(event: Event) -> str:
+    message_normalized = getattr(event, "message_normalized", None) or normalize_message(event.message_raw)
+    base = message_normalized
+    if "fail" in base or "error" in base:
+        return "failure"
+    if "restart" in base or "stopped" in base or "started" in base:
+        return "service_lifecycle"
+    if "auth" in base or "login" in base:
+        return "auth"
+    return "generic"
+def fingerprint_event(event: Event) -> Signature:
+    normalized = getattr(event, "signature_input", None) or getattr(event, "message_normalized", None) or normalize_message(event.message_raw)
+    family = event_family_for(event)
+    service = (event.service or "").strip().lower()
+    host = (event.host or "").strip().lower()
+    signature_key = f"{family}|{service}|{normalized}"
+    return Signature(
+        signature_key=signature_key,
+        event_family=family,
+        normalized_pattern=normalized,
+        service=service or host,
+    )
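
To see what `normalize_message` does to variable tokens, the sketch below copies its three regular expressions into a standalone file and compares two messages that differ only in IP address, port, and spacing. The sample messages are invented for illustration and are not taken from the package fixtures.

```python
import re

# Same patterns as in package/brainstem/fingerprint.py above.
_WHITESPACE_RE = re.compile(r"\s+")
_IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
_NUMBER_RE = re.compile(r"\b\d+\b")


def normalize_message(message: str) -> str:
    text = (message or "").strip().lower()
    text = _IPV4_RE.sub("<ip>", text)      # IPs first, before bare numbers
    text = _NUMBER_RE.sub("<n>", text)     # then standalone integers
    text = _WHITESPACE_RE.sub(" ", text)   # finally collapse whitespace
    return text


# Hypothetical sample messages.
a = normalize_message("Failed password for root from 10.0.0.5 port 22")
b = normalize_message("Failed  password for root from 192.168.1.9 port 51234")

print(a)       # failed password for root from <ip> port <n>
print(a == b)  # True

# Digits glued to letters are left alone, because \b needs a word boundary.
print(normalize_message("eth0 down"))  # eth0 down
```

Both messages collapse to the same pattern, so `fingerprint_event` would give them the same `signature_key` when they share a service. The order of substitutions matters: replacing numbers before IP addresses would break each address into separate `<n>` tokens.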