legion-apollo 0.3.1 → 0.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 487b9b2441548d8c6e112f287119bc230f85d3a329e39b9b0ca5ffad0d483c23
- data.tar.gz: 1fcd2fc638ea141c8ff48a9e13ff1d56a8e469ade43a6a28242f1d036e0a131a
+ metadata.gz: 1c8e724eefe292faefec05ca4fc589ed935f4734bc5c4726b7c5d70ed3b69dbb
+ data.tar.gz: 3fc0246858d50305bd2d956e87c9ec844c71a769c4f81955f43d2640159a175c
  SHA512:
- metadata.gz: f4aa0535295bdded7f8fb087b344461cc269f477cd772e0c66c18706ae7f128e4eecac8c7f56c7ac29b808836d2f1c05eafeb088bbbe7f334aa13a256f45aaec
- data.tar.gz: 7faa330d5b94f70cd0083fbc49b0ccf329c63604f7de35f7196bd552ff356efd2d7c11856efa7138f2313fdb56b3aae4bdc36c07ba402877dd57c19be6604017
+ metadata.gz: eb2815139791a2ed4abd5f09f6ed2b5cf27b89c399ed00276fc58e8ebfaceede00eb1b0cc02f43371ce5099d12467398e7ce92e58352cc854bb4b5eecc4a2c1c
+ data.tar.gz: 763ff525cdf09284bc3a7c3337407bfcaa8a750f5dddaeb66e1a1b082742a9e158b813a39a0d606c4c1740ba26137c6b565c6152688bd6f007b601ed98dd6fd1
data/CHANGELOG.md CHANGED
@@ -1,5 +1,16 @@
  # Changelog
 
+ ## [0.3.2] - 2026-03-26
+
+ ### Added
+ - Self-knowledge seed system: 10 markdown documents covering LegionIO identity, architecture, extensions, security, LLM pipeline, Apollo, CLI, cognitive layer, Teams integration, and deployment
+ - `Apollo::Local.seed_self_knowledge` auto-ingests self-knowledge docs on boot (local + global)
+ - `Apollo::Local.seeded?` query method
+ - `data/**/*` included in gemspec so self-knowledge ships with the gem
+
+ ### Changed
+ - Refactored `seed_self_knowledge` into smaller helpers (`self_knowledge_files`, `seed_files`, `seed_single_file`) to satisfy RuboCop complexity limits
+
  ## [0.3.1] - 2026-03-26
 
  ### Added
data/README.md CHANGED
@@ -2,15 +2,62 @@
 
  Apollo client library for the LegionIO framework.
 
- Provides `query`, `ingest`, and `retrieve` with smart routing: co-located lex-apollo service, RabbitMQ transport, or graceful failure.
+ **Version**: 0.3.2
+
+ Provides `query`, `ingest`, and `retrieve` with smart routing: co-located lex-apollo service, RabbitMQ transport, or graceful failure. Supports a node-local SQLite knowledge store (`Apollo::Local`) that mirrors the same API without requiring any remote infrastructure.
 
  ## Usage
 
  ```ruby
  Legion::Apollo.start
 
+ # Global knowledge store (requires lex-apollo or RabbitMQ)
  Legion::Apollo.ingest(content: 'Some knowledge', tags: %w[fact ruby])
  results = Legion::Apollo.query(text: 'tell me about ruby', limit: 5)
+
+ # Node-local store (SQLite + FTS5, no network required)
+ Legion::Apollo.ingest(content: 'Local note', scope: :local)
+ results = Legion::Apollo.query(text: 'local note', scope: :local)
+
+ # Query both and merge (deduped by content hash, ranked by confidence)
+ results = Legion::Apollo.query(text: 'ruby', scope: :all)
+ ```
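
The `:all` merge described above can be pictured as a plain dedupe-and-rank over the two result sets. This is an illustrative sketch only; the `merge_results` helper and the exact hash keys are hypothetical, not the gem's API:

```ruby
# Sketch of the :all scope merge: deduplicate by content hash (first hit wins),
# then rank by confidence, highest first. Not the gem's actual internals.
def merge_results(local, global)
  (local + global)
    .uniq { |r| r[:content_hash] }
    .sort_by { |r| -r[:confidence] }
end

local  = [{ content_hash: 'a', confidence: 0.9 }, { content_hash: 'b', confidence: 0.4 }]
global = [{ content_hash: 'a', confidence: 0.7 }, { content_hash: 'c', confidence: 0.8 }]
merged = merge_results(local, global)
merged.map { |r| r[:content_hash] } # => ["a", "c", "b"]
```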
+
+ ## Scopes
+
+ | Scope | Route |
+ |-------|-------|
+ | `:global` (default) | Co-located lex-apollo or RabbitMQ transport |
+ | `:local` | `Apollo::Local` SQLite+FTS5 store (node-local) |
+ | `:all` | Both merged, deduped by `content_hash`, ranked by confidence |
+
+ ## Local Store
+
+ `Apollo::Local` provides a node-local knowledge store backed by SQLite + FTS5. When started (e.g., via `Legion::Apollo.start`, which calls `Legion::Apollo::Local.start` automatically), it uses `Legion::Data::Local` when available and respects `Settings[:apollo][:local][:enabled]`.
+
+ Features:
+ - Content-hash dedup (MD5 of normalized content)
+ - Optional LLM embeddings (1024-dim) with cosine rerank when `Legion::LLM.can_embed?`
+ - TTL expiry (default 5-year retention)
+ - FTS5 full-text search with `ILIKE` fallback
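
The content-hash dedup key can be sketched directly. The normalization (strip, downcase, collapse whitespace) mirrors the MD5 hashing visible in this gem's merge code, though this standalone helper is illustrative:

```ruby
require 'digest'

# Dedup key: MD5 of normalized content, so formatting differences
# (case, extra whitespace, newlines) do not create duplicate entries.
def content_hash(content)
  Digest::MD5.hexdigest(content.to_s.strip.downcase.gsub(/\s+/, ' '))
end

content_hash("  Ruby  is\n great ") == content_hash('ruby is great') # => true
```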
+
+ ## Configuration
+
+ ```json
+ {
+   "apollo": {
+     "default_limit": 5,
+     "min_confidence": 0.3,
+     "max_tags": 20,
+     "local": {
+       "enabled": true,
+       "retention_years": 5,
+       "default_limit": 5,
+       "min_confidence": 0.3,
+       "fts_candidate_multiplier": 3
+     }
+   }
+ }
  ```
 
  ## License
@@ -0,0 +1,51 @@
+ # What is LegionIO?
+
+ LegionIO is an extensible async job engine and cognitive platform for Ruby. It schedules tasks, creates relationships between services, and runs them concurrently. It was created by Matthew Iverson (@Esity) and is licensed under Apache-2.0.
+
+ LegionIO is not a chatbot. It is a framework that can power chatbots, AI assistants, background workers, service integrations, and autonomous agents. The chat interface is one of many ways to interact with it.
+
+ ## Core Purpose
+
+ LegionIO connects isolated systems — cloud accounts, on-premise services, SaaS tools — into a unified async task engine. It uses RabbitMQ for message passing, supports SQLite/PostgreSQL/MySQL for persistence, and Redis/Memcached for caching.
+
+ ## What LegionIO Does
+
+ - Schedules and executes async tasks across distributed services
+ - Chains tasks into workflows (Task A -> conditioner -> Task B -> transformer -> Task C)
+ - Auto-discovers and loads extension gems (LEX plugins) at boot
+ - Provides a unified REST API on port 4567 for all operations
+ - Integrates with HashiCorp Vault for secrets and authentication
+ - Supports Kerberos auto-authentication to Vault using existing AD credentials
+ - Runs a 19-step LLM pipeline with RAG, guardrails, cost tracking, and model routing
+ - Maintains a shared knowledge store (Apollo) for organizational knowledge
+ - Provides a rich terminal UI with AI chat, dashboard, and extension browser
+ - Tracks token usage and costs per-user and per-team
+ - Supports HIPAA PHI compliance with redaction, crypto-erasure, and audit trails
+ - Runs as a macOS/Linux background service via Homebrew or systemd
+
+ ## What LegionIO Does Not Do
+
+ - It does not provide direct cloud infrastructure (no VMs, no networking)
+ - It does not replace Terraform, Ansible, or Chef for infrastructure management
+ - It does not host web applications or serve static content
+ - It does not provide its own LLM — it routes to providers like Bedrock, Anthropic, OpenAI, Gemini, or Ollama
+ - It does not require RabbitMQ in lite mode (uses an in-process message adapter)
+ - It does not store credentials on disk — all secrets are in Vault or environment variables
+
+ ## Installation
+
+ Install via Homebrew on macOS:
+ ```
+ brew tap legionio/tap
+ brew install legionio
+ ```
+
+ Or via RubyGems:
+ ```
+ gem install legionio
+ ```
+
+ ## Key Binaries
+
+ - `legionio` — daemon and operational CLI (start, stop, config, lex, task, mcp, etc.)
+ - `legion` — interactive terminal shell with AI chat, onboarding wizard, and dashboard
@@ -0,0 +1,51 @@
+ # LegionIO Architecture
+
+ ## Boot Sequence
+
+ LegionIO starts subsystems in a fixed order. Each phase is individually toggleable.
+
+ 1. Logging (legion-logging)
+ 2. Settings (legion-settings — loads from /etc/legionio, ~/.legionio, ./settings)
+ 3. Crypt (legion-crypt — Vault connection, Kerberos auto-auth)
+ 4. Transport (legion-transport — RabbitMQ or InProcess lite adapter)
+ 5. Cache (legion-cache — Redis, Memcached, or Memory adapter)
+ 6. Data (legion-data — SQLite, PostgreSQL, or MySQL via Sequel)
+ 7. RBAC (legion-rbac — role-based access control)
+ 8. LLM (legion-llm — AI provider setup and routing)
+ 9. Apollo (legion-apollo — shared and local knowledge store)
+ 10. GAIA (legion-gaia — cognitive coordination layer, 24 phases)
+ 11. Telemetry (OpenTelemetry tracing, optional)
+ 12. Extensions (two-phase parallel: require+autobuild, then hook actors)
+ 13. API (Sinatra/Puma REST API on port 4567)
+
+ Shutdown runs in reverse order. Reload shuts down then re-runs from settings onward.
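
The fixed boot order and reverse-order shutdown can be sketched with a plain ordered list. The subsystem symbols mirror the sequence above; the list itself and the helper are illustrative, not the framework's code:

```ruby
# Boot runs left-to-right; shutdown is simply the reverse.
SUBSYSTEMS = %i[logging settings crypt transport cache data rbac llm
                apollo gaia telemetry extensions api].freeze

def shutdown_order(subsystems = SUBSYSTEMS)
  subsystems.reverse
end

shutdown_order.first # => :api (API stops first, logging stops last)
```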
+
+ ## Core Gems
+
+ | Gem | Purpose |
+ |-----|---------|
+ | legion-transport | RabbitMQ AMQP messaging + InProcess lite adapter |
+ | legion-cache | Caching (Redis/Memcached/Memory) |
+ | legion-crypt | Encryption, Vault integration, JWT, Kerberos auth, mTLS |
+ | legion-data | Database persistence via Sequel (SQLite/PostgreSQL/MySQL) |
+ | legion-json | JSON serialization (multi_json wrapper) |
+ | legion-logging | Console + structured JSON logging with redaction |
+ | legion-settings | Configuration management with schema validation |
+ | legion-llm | LLM integration with 19-step pipeline |
+ | legion-mcp | MCP server with 58+ tools |
+ | legion-gaia | Cognitive coordination (24 phases: 16 active + 8 dream) |
+ | legion-apollo | Shared knowledge store client (local SQLite + global pgvector) |
+ | legion-rbac | Role-based access control with Vault-style policies |
+ | legion-tty | Rich terminal UI with AI chat and operational dashboard |
+
+ ## Extension Loading
+
+ Extensions are gems named `lex-*`, auto-discovered via Bundler or Gem::Specification. Loading is two-phase and parallel: all extensions are required and `autobuild` runs concurrently on a thread pool, then `hook_all_actors` starts subscriptions sequentially. This prevents race conditions.
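
A minimal sketch of the two-phase idea, with stand-in extension names and no real `require`/`autobuild` (everything here is illustrative):

```ruby
# Phase 1: build every extension concurrently on its own thread.
# Phase 2: hook actors sequentially, after all builds have joined,
# so subscription setup never races with autobuild.
extensions = %w[lex-node lex-tasker lex-scheduler]
built = Queue.new

threads = extensions.map do |name|
  Thread.new { built << name } # stands in for require + autobuild
end
threads.each(&:join)

hooked = []
hooked << built.pop until built.empty? # sequential actor hookup
```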
+
+ ## Lite Mode
+
+ Setting `LEGION_MODE=lite` replaces RabbitMQ with an InProcess adapter and Redis with a Memory adapter. No external infrastructure required. Useful for development, demos, and single-machine deployments.
+
+ ## REST API
+
+ Full REST API served by Sinatra/Puma on port 4567. Endpoints include tasks, extensions, runners, nodes, schedules, relationships, settings, events (SSE), transport status, hooks, workers, teams, capacity, tenants, audit, RBAC, and webhooks. JWT Bearer auth middleware with rate limiting.
@@ -0,0 +1,54 @@
+ # LegionIO Extension System (LEX)
+
+ ## What is a LEX?
+
+ A LEX (Legion Extension) is a Ruby gem named `lex-*` that plugs into LegionIO. Each LEX defines runners (functions) and actors (execution modes). Extensions are auto-discovered at boot — install a gem and it loads automatically.
+
+ ## Actor Types
+
+ | Type | Behavior |
+ |------|----------|
+ | Subscription | Consumes messages from an AMQP queue |
+ | Polling | Polls on a schedule |
+ | Interval (Every) | Runs at a fixed interval |
+ | Once | Runs once at startup |
+ | Loop | Runs continuously |
+ | Nothing | Passive — only invoked via API or other extensions |
+
+ ## Creating an Extension
+
+ ```bash
+ legion lex create myextension # scaffold a new lex-myextension gem
+ legion generate runner myrunner # add a runner with functions
+ legion generate actor myactor # add an actor with type selection
+ legion generate tool mytool # add an MCP tool
+ ```
+
+ ## Extension Categories
+
+ ### Core Operational (21 extensions)
+ node, tasker, scheduler, synapse, LLM gateway, detect, telemetry, acp, react, webhook, health, metering, exec, conditioner, transformer, tick, audit, codegen, privatecore, lex (meta), knowledge
+
+ ### Agentic/Cognitive (13 consolidated gems + supporting)
+ self (identity, metacognition, reflection, personality, agency), affect (emotion, mood, sentiment), imagination (creative generation, dream ideation), language (NLU, discourse), memory (episodic, semantic, working memory), social (theory of mind, social cognition), swarm-github (code review), mesh (inter-agent communication), mind-growth (autonomous expansion), autofix, dataset, eval, factory
+
+ ### AI Provider Integrations (7)
+ azure-ai, bedrock, claude, foundry, gemini, openai, xai
+
+ ### Service Integrations (10 common + 40 additional)
+ Common: consul, github, http, kerberos, vault, tfe, microsoft-teams, slack, webhook, acp
+ Additional: chef, jfrog, ssh, smtp, kafka, jira, docker, kubernetes, and more
+
+ ## Role-Based Filtering
+
+ Extensions load based on role profile:
+ - `nil` (default): all extensions
+ - `:core`: 14 core operational only
+ - `:cognitive`: core + all agentic
+ - `:service`: core + service integrations
+ - `:dev`: core + AI + essential agentic
+ - `:custom`: explicit list from settings
+
+ ## Extension Discovery
+
+ At boot, LegionIO calls `Bundler.load.specs` (or Gem::Specification fallback) to find all `lex-*` gems. Each extension's `autobuild` creates runners and actors. After all extensions load, `hook_all_actors` activates AMQP subscriptions and timers.
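
The discovery filter amounts to selecting gems whose names start with `lex-`. A sketch over plain name strings (the real code inspects gem specifications; this helper is illustrative):

```ruby
# Filter a list of gem names down to LEX extensions by the lex- prefix.
def lex_gems(spec_names)
  spec_names.select { |name| name.start_with?('lex-') }
end

lex_gems(%w[rake lex-node legion-apollo lex-vault]) # => ["lex-node", "lex-vault"]
```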
@@ -0,0 +1,46 @@
+ # LegionIO Security
+
+ ## Authentication
+
+ ### Kerberos + Vault
+ LegionIO authenticates to HashiCorp Vault using Kerberos (SPNEGO). On macOS or Linux machines joined to Active Directory, the existing Kerberos ticket is used — no password entry needed. The SPNEGO token is sent as an HTTP Authorization header to Vault's Kerberos auth backend, which returns a Vault token. Token renewal runs in a background thread at 75% TTL.
+
+ ### JWT Authentication
+ The REST API uses JWT Bearer auth. Tokens are validated against JWKS endpoints. Skip paths exist for health and readiness checks.
+
+ ### mTLS
+ Optional mutual TLS for internal communications. Vault PKI issues certificates, and a background thread rotates them at 50% TTL. Feature-flagged via `security.mtls.enabled`.
+
+ ## Secrets Management
+
+ All secrets are stored in HashiCorp Vault, never on disk. Config files reference secrets using `vault://` URIs that are resolved at runtime. Environment variable fallback is supported via `env://` URIs.
+
+ Example: `"bearer_token": "vault://secret/data/llm/bedrock#bearer_token"`
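
A hypothetical parser for that reference style (`vault://<path>#<field>`); the actual resolver lives in legion-crypt, so this is a sketch of the URI shape only:

```ruby
# Split a vault:// reference into the secret path and the field name.
def parse_vault_ref(uri)
  return nil unless uri.start_with?('vault://')

  path, field = uri.delete_prefix('vault://').split('#', 2)
  { path: path, field: field }
end

parse_vault_ref('vault://secret/data/llm/bedrock#bearer_token')
# => { path: "secret/data/llm/bedrock", field: "bearer_token" }
```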
+
+ ## RBAC
+
+ Optional role-based access control using Vault-style flat policies. Policies map identities to allowed actions on resources. Enforced at the API middleware layer and in the LLM pipeline.
+
+ ## HIPAA PHI Compliance
+
+ - **PHI Tagging**: Metadata classification for sensitive data
+ - **PHI Access Logging**: Audit trail via Legion::Audit for all PHI access
+ - **PHI Erasure**: Crypto-erasure orchestration via Crypt::Erasure + Cache purge
+ - **PHI TTL Cap**: legion-cache enforces maximum TTL for PHI-tagged data
+ - **Redaction**: Automatic PII/PHI redaction in all log output via legion-logging
+
+ All PHI features are off by default and enabled via configuration.
+
+ ## Audit
+
+ - Tamper-evident hash chain for audit entries
+ - 7-year tiered retention (hot -> warm -> cold storage)
+ - SIEM export for Splunk/ELK ingestion
+ - Queryable via CLI (`legion audit`) and REST API
+
+ ## Network Security
+
+ - No public IPs or ingress in production deployments
+ - TLS required on all connections (Optum-sanctioned CAs only in UHG deployments)
+ - Rate limiting middleware with per-IP/agent/tenant tiers
+ - Request body size limits (1MB max)
@@ -0,0 +1,54 @@
+ # LegionIO LLM Pipeline
+
+ ## Overview
+
+ LegionIO routes AI requests through a 19-step pipeline that adds governance, RAG context, tool use, cost tracking, and knowledge capture. The pipeline is provider-agnostic — it works with Bedrock, Anthropic, OpenAI, Gemini, and local Ollama.
+
+ ## Pipeline Steps
+
+ 1. **Normalize** — standardize request format
+ 2. **Profile** — derive caller profile (user, system, service)
+ 3. **RBAC** — check access permissions
+ 4. **Classification** — classify request sensitivity
+ 5. **Billing** — check budget and rate limits
+ 6. **Guardrails** — input validation and safety checks
+ 7. **GAIA Advisory** — cognitive layer enrichment (optional system prompt)
+ 8. **RAG Context** — retrieve relevant knowledge from Apollo (global + local)
+ 9. **MCP Discovery** — discover available tools
+ 10. **Enrichment Injection** — prepend GAIA/RAG context to system prompt
+ 11. **Fleet Selection** — choose optimal model/provider
+ 12. **Dispatch** — send request to LLM provider
+ 13. **Parse Response** — extract text and tool calls
+ 14. **Tool Calls** — execute MCP tools if requested
+ 15. **Post-Response** — post-processing
+ 16. **Audit** — publish audit trail
+ 17. **Metering** — record token usage and cost
+ 18. **Timeline** — record timing data
+ 19. **Knowledge Capture** — write significant responses back to Apollo
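
The step list above can be pictured as an ordered reduce over a request context: each step receives the context and returns a (possibly enriched) context. The two stand-in steps here are illustrative, not the gem's implementation:

```ruby
# Each pipeline step is a callable: context in, enriched context out.
STEPS = [
  ->(ctx) { ctx.merge(normalized: true) },                # 1. Normalize (sketch)
  ->(ctx) { ctx.merge(allowed: ctx[:user] != :blocked) }  # 3. RBAC (sketch)
].freeze

def run_pipeline(ctx, steps = STEPS)
  steps.reduce(ctx) { |acc, step| step.call(acc) }
end

run_pipeline({ user: :alice }) # => { user: :alice, normalized: true, allowed: true }
```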
+
+ ## Supported Providers
+
+ | Provider | Models | Auth |
+ |----------|--------|------|
+ | AWS Bedrock | Claude, Llama, Mistral | Bearer token (from Vault) |
+ | Anthropic | Claude family | API key |
+ | OpenAI | GPT-4, GPT-3.5 | API key |
+ | Google Gemini | Gemini Pro, Flash | API key |
+ | Ollama | Any local model | None (localhost) |
+
+ ## Cost Tracking
+
+ Every LLM call is metered. Token counts (input/output) and estimated costs are tracked per-request, per-session, per-user, and per-team. The status bar in the terminal UI shows real-time token count and cost. Budget limits can be set per-user or per-team.
+
+ ## Model Routing
+
+ The fleet selection step chooses the optimal model based on request classification, cost constraints, and provider availability. Model escalation automatically retries with a more capable model if the initial response fails quality checks.
+
+ ## RAG Integration
+
+ Step 8 retrieves relevant context from Apollo using scope routing:
+ - `:local` — node-local SQLite+FTS5 store only
+ - `:global` — shared PostgreSQL+pgvector store
+ - `:all` — both merged, deduplicated by content hash, ranked by confidence
+
+ Retrieved context is injected into the system prompt by the Enrichment Injector (step 10).
@@ -0,0 +1,54 @@
+ # Apollo Knowledge Store
+
+ ## What is Apollo?
+
+ Apollo is LegionIO's shared knowledge store. It provides organizational memory that persists across sessions and is shared across all LegionIO nodes. Every AI response can reference knowledge from Apollo, and significant responses are captured back into it.
+
+ ## Architecture
+
+ ### Local Store (every node)
+ - SQLite database with FTS5 full-text search
+ - Content-hash deduplication (MD5)
+ - Optional LLM embeddings (1024-dim) with cosine rerank
+ - TTL-based expiry (default 5 years)
+ - Works offline — no network required
+
+ ### Global Store (shared)
+ - PostgreSQL with pgvector extension
+ - HNSW cosine similarity index
+ - Agents interact via RabbitMQ (no direct DB access)
+ - Hosted on Azure PostgreSQL Flexible Server
+
+ ## Scope Routing
+
+ All queries and ingests accept a `scope:` parameter:
+ - `:local` — SQLite only
+ - `:global` — PostgreSQL only (via transport or co-located extension)
+ - `:all` — both merged, deduplicated by content hash, ranked by confidence
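
Scope routing can be sketched as a simple dispatch on the `scope:` value (the store callables here are stand-ins, not the gem's objects):

```ruby
# Route a query to the local store, the global store, or both.
def route_query(scope, local:, global:)
  case scope
  when :local  then local.call
  when :global then global.call
  when :all    then local.call + global.call
  else raise ArgumentError, "unknown scope: #{scope}"
  end
end

local_store  = -> { [:local_hit] }
global_store = -> { [:global_hit] }
route_query(:all, local: local_store, global: global_store) # => [:local_hit, :global_hit]
```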
+
+ ## Knowledge Capture
+
+ The LLM pipeline's step 19 (Knowledge Capture) automatically writes significant responses back to Apollo. This creates a feedback loop where the system learns from its own interactions. Content-hash deduplication prevents echo chambers.
+
+ ## Knowledge CLI
+
+ ```
+ legion knowledge query "question" # query and synthesize answer
+ legion knowledge retrieve "question" # raw source chunks
+ legion knowledge ingest <path> # ingest file or directory
+ legion knowledge status # corpus stats
+ legion knowledge health # full health report
+ legion knowledge maintain # orphan detection and cleanup
+ legion knowledge quality # quality report
+ legion knowledge monitor add <path> # watch a directory for changes
+ legion knowledge capture commit # capture git commit as knowledge
+ ```
+
+ ## Content Pipeline
+
+ Files ingested via `legion knowledge ingest` go through:
+ 1. Format detection (Markdown, PDF, DOCX, plain text)
+ 2. Chunking by heading hierarchy (H1-H6 with ancestry path)
+ 3. Delta detection (only new/changed files via manifest)
+ 4. Batch embedding (one LLM call per file, not per chunk)
+ 5. Upsert to Apollo (local and/or global based on scope)
@@ -0,0 +1,59 @@
+ # LegionIO CLI Reference
+
+ ## Interactive Shell
+
+ Running `legion` with no arguments launches the rich terminal UI:
+ - Digital rain intro animation on first run
+ - Onboarding wizard with Kerberos identity detection
+ - AI chat shell with streaming responses
+ - Dashboard (Ctrl+D) with service status panels
+ - Extension browser, config editor, command palette (Ctrl+K)
+ - 115+ slash commands, tab completion, session persistence
+
+ ## Key Commands
+
+ ### Daemon Operations (legionio)
+ ```
+ legionio start # start daemon
+ legionio stop # stop daemon
+ legionio status # check daemon status
+ legionio doctor # 11-check environment diagnosis
+ legionio config scaffold # generate starter config files
+ legionio config import <url> # import config from URL
+ legionio bootstrap <url> # one-command setup (config + scaffold + install packs)
+ legionio setup agentic # install 47 cognitive gems
+ legionio setup claude-code # configure MCP server for Claude Code
+ legionio setup cursor # configure MCP server for Cursor
+ legionio mcp stdio # start MCP server (stdio transport)
+ legionio lex list # list loaded extensions
+ legionio update # self-update via Homebrew or gem
+ ```
+
+ ### Interactive / Dev (legion)
+ ```
+ legion # launch rich terminal UI
+ legion chat # AI chat REPL
+ legion do "natural language" # natural language command routing
+ legion knowledge query "question" # query knowledge base
+ legion commit # AI-generated commit message
+ legion pr # AI-generated PR description
+ legion review # AI code review
+ legion plan # read-only exploration mode
+ legion memory list # persistent memory management
+ legion mind-growth status # cognitive architecture status
+ ```
+
+ ### MCP Server
+
+ LegionIO exposes 58+ MCP tools when configured as an MCP server in Claude Code, Cursor, or VS Code. Tools cover knowledge queries, extension management, task operations, system status, and more.
+
+ ## Natural Language Commands
+
+ `legion do` routes free-text to the right extension capability:
+ ```
+ legion do "list all running extensions"
+ legion do "check system health"
+ legion do "show vault status"
+ ```
+
+ It tries three resolution paths in order: the daemon API, the in-process capability registry, then LLM classification.
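
The three-path fallback can be sketched as try-in-order, taking the first non-nil answer. The resolver lambdas below are stand-ins, not the CLI's internals:

```ruby
# Try each resolver in turn; return the first non-nil result.
def resolve(text, resolvers)
  resolvers.each do |resolver|
    result = resolver.call(text)
    return result if result
  end
  nil
end

daemon_api   = ->(_text) { nil }                                        # daemon not running
registry     = ->(text)  { "registry:#{text}" if text.include?('health') }
llm_fallback = ->(text)  { "llm:#{text}" }

resolve('check system health', [daemon_api, registry, llm_fallback])
# => "registry:check system health"
```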
@@ -0,0 +1,49 @@
+ # LegionIO Cognitive Layer
+
+ ## GAIA (Cognitive Coordination)
+
+ GAIA is LegionIO's cognitive coordination layer. It manages 24 phases (16 active + 8 dream) in a tick-based cycle. Each tick runs through phases like sensory processing, attention, working memory integration, prediction, emotional evaluation, action selection, and reflection.
+
+ ### Active Phases (16)
+ sensory_processing, attention_filtering, working_memory_integration, contextual_memory_retrieval, knowledge_retrieval, prediction_engine, emotional_evaluation, action_selection, social_cognition, theory_of_mind, homeostasis_regulation, autonomy_gating, identity_entropy_check, post_tick_reflection, audit_publish, metering
+
+ ### Dream Phases (8)
+ memory_consolidation, dream_generation, emotional_processing, creative_association, pattern_extraction, belief_update, schema_integration, dream_ideation
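
A tick can be pictured as running the phase callables in order over a shared state hash. The phase bodies here are stand-ins for two of the real phase names:

```ruby
# Each phase takes the tick state and returns an updated state.
ACTIVE_PHASES = {
  sensory_processing:   ->(state) { state.merge(sensed: true) },
  post_tick_reflection: ->(state) { state.merge(reflected: true) }
}.freeze

def tick(state)
  ACTIVE_PHASES.values.reduce(state) { |acc, phase| phase.call(acc) }
end

tick({}) # => { sensed: true, reflected: true }
```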
+
+ ## Agentic Extensions
+
+ LegionIO includes 13 consolidated cognitive domain gems:
+
+ | Gem | Domain |
+ |-----|--------|
+ | lex-agentic-self | Identity, metacognition, reflection, personality, agency, self-talk |
+ | lex-agentic-affect | Emotion modeling, mood tracking, sentiment analysis |
+ | lex-agentic-imagination | Creative generation, dream ideation, scenario planning |
+ | lex-agentic-language | Natural language understanding, discourse analysis |
+ | lex-agentic-memory | Episodic, semantic, and working memory management |
+ | lex-agentic-social | Social cognition, theory of mind, preference exchange |
+ | lex-mesh | Inter-agent communication, gossip protocol, preference profiles |
+ | lex-mind-growth | Autonomous cognitive architecture expansion |
+ | lex-swarm-github | Multi-agent code review |
+ | lex-eval | Evaluation and benchmarking |
+ | lex-autofix | Autonomous code fix pipeline |
+ | lex-dataset | Training data management |
+ | lex-factory | Spec-to-code generation pipeline |
+
+ ## Self-Awareness
+
+ The `lex-agentic-self` extension maintains a live self-model:
+ - **Metacognition**: Real-time snapshot of loaded extensions, capabilities, health
+ - **Self-narrative**: Prose description of current state (injected into system prompt)
+ - **Behavioral fingerprint**: 6-dimension identity tracking with drift detection
+ - **Personality**: Big Five OCEAN traits that evolve slowly over time
+ - **Reflection**: Post-tick analysis with health scores across 7 categories
+
+ ## Mind Growth
+
+ `lex-mind-growth` enables autonomous expansion of cognitive capabilities:
+ - Gap analysis against reference cognitive models
+ - Proposal, evaluation, and staged build pipeline
+ - Swarm-based building and distributed consensus
+ - Competitive evolution with fitness-based selection
+ - 25 completed phases, 998+ specs
@@ -0,0 +1,44 @@
+ # Microsoft Teams Integration
+
+ ## Overview
+
+ The `lex-microsoft_teams` extension connects LegionIO to Microsoft Teams via the Microsoft Graph API. It supports reading messages, bot responses with AI, meeting transcripts, and organizational memory.
+
+ ## Authentication
+
+ Two auth paths run in parallel:
+ - **Application (client credentials)**: Bot-to-bot communication via client_id/client_secret
+ - **Delegated (user OAuth)**: User-context access via browser PKCE flow or device code fallback
+
+ Tokens are persisted to Vault (with local file fallback) and auto-refreshed with a 60-second pre-expiry buffer.
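
The pre-expiry buffer amounts to refreshing once fewer than 60 seconds remain before the token expires. A sketch with plain integer timestamps (the constant and method names are illustrative):

```ruby
# Refresh when the remaining lifetime drops below the buffer.
REFRESH_BUFFER_SECONDS = 60

def needs_refresh?(expires_at, now)
  now >= expires_at - REFRESH_BUFFER_SECONDS
end

needs_refresh?(1_000, 950) # => true  (only 50s left)
needs_refresh?(1_000, 900) # => false (100s left)
```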
+
+ ## Capabilities
+
+ ### Message Reading
+ - 1:1 and group chat messages
+ - Channel messages across teams
+ - Real-time message processing via AMQP transport
+
+ ### AI Bot
+ - Direct chat mode: users DM the bot, get AI responses via the LLM pipeline
+ - Conversation observer mode: passive extraction from watched chats (disabled by default)
+ - Multi-turn sessions with context persistence
+ - Memory trace injection for organizational context
+
+ ### Meetings and Transcripts
+ - Online meeting CRUD and join URL lookup
+ - Meeting transcript retrieval (VTT/DOCX format)
+ - Attendance reports
+
+ ### Organizational Intelligence
+ - Profile ingestion: identity, contacts, conversation summaries
+ - Incremental sync every 15 minutes for new messages
+ - Memory traces stored across sender, teams, and chat domains
+
+ ## RAG Integration
+
+ The bot injects organizational memory context into every response:
+ - Retrieves traces from lex-agentic-memory across 3 domain scopes
+ - Deduplicates by trace_id, ranks by strength and recency
+ - Appends formatted context to the system prompt (2000-token budget)
+ - Per-user preference profiles from lex-mesh customize response style
@@ -0,0 +1,75 @@
+ # LegionIO Deployment
+
+ ## Installation Methods
+
+ ### Homebrew (macOS, recommended)
+ ```
+ brew tap legionio/tap
+ brew install legionio
+ ```
+ This installs a self-contained Ruby 3.4.8 runtime with YJIT, all core gems, and wrapper scripts. No system Ruby or rbenv required. Redis is installed as a recommended dependency.
+
+ ### RubyGems
+ ```
+ gem install legionio
+ ```
+
+ ### Docker
+ ```
+ docker pull legionio/legion
+ ```
+
+ ## Configuration
+
+ Config files live at `~/.legionio/settings/` as JSON files (one per subsystem). Generate starter configs:
+ ```
+ legionio config scaffold
+ ```
+
+ Bootstrap from a remote URL:
+ ```
+ legionio bootstrap https://example.com/config.json
+ ```
+
+ Settings resolution order: command-line flags > environment variables > config files > defaults.
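
That resolution order can be sketched as a first-match lookup across layers, highest priority first (layer contents and the helper are illustrative):

```ruby
# Return the first layer that defines the key: flags > env > file > defaults.
def resolve_setting(key, flags:, env:, file:, defaults:)
  [flags, env, file, defaults].each do |layer|
    return layer[key] if layer.key?(key)
  end
  nil
end

resolve_setting(:log_level,
                flags: {}, env: { log_level: 'debug' },
                file: { log_level: 'info' }, defaults: { log_level: 'warn' })
# => "debug"
```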
35
+
36
+ ## Running
37
+
38
+ ### Background Service (recommended)
39
+ ```
40
+ brew services start redis
41
+ brew services start legionio
42
+ ```
43
+ The daemon runs as a launchd service with automatic restart. Logs at `$(brew --prefix)/var/log/legion/legion.log`.
44
+
45
+ ### Foreground
46
+ ```
47
+ legionio start --log-level debug
48
+ ```
49
+
50
+ ### Lite Mode (no infrastructure)
51
+ ```
52
+ LEGION_MODE=lite legionio start
53
+ ```
54
+ Replaces RabbitMQ with in-process messaging and Redis with in-memory cache.
55
+
56
+ ## Infrastructure Requirements
57
+
58
+ | Service | Required? | Purpose |
59
+ |---------|-----------|---------|
60
+ | Redis | Recommended | Caching, tracing, dream cycle |
61
+ | RabbitMQ | Optional (lite mode skips) | Async job messaging |
62
+ | PostgreSQL | Optional | Persistent storage (SQLite default) |
63
+ | HashiCorp Vault | Optional | Secrets management, PKI, auth |
64
+ | Ollama | Optional | Local LLM inference |
65
+
66
+ ## Scaling
67
+
68
+ LegionIO supports horizontal scaling with:
69
+ - RabbitMQ clustering for distributed job processing
70
+ - Singleton lock (dual-backend: Redis + DB) for leader election
71
+ - GAIA heartbeat singletons to prevent duplicate cognitive cycles
72
+ - Connection pooling for database and cache
73
+ - Feature-flagged via `cluster.singleton_enabled` and `cluster.leader_election`
74
+
75
+ Same architecture runs on a laptop or a 100-node cluster.
@@ -86,10 +86,70 @@ module Legion
 
  def reset!
  @started = false
+ @seeded = false
+ end
+
+ def seed_self_knowledge
+ return unless started?
+ return if @seeded
+
+ files = self_knowledge_files
+ return if files.empty?
+
+ count = seed_files(files)
+ @seeded = true
+ Legion::Logging.info("Apollo::Local seeded #{count} self-knowledge files") if defined?(Legion::Logging)
+ rescue StandardError => e
+ Legion::Logging.warn("Apollo::Local seed failed: #{e.message}") if defined?(Legion::Logging)
+ end
+
+ def seeded?
+ @seeded == true
  end
 
  private
 
+ def self_knowledge_files
+ seed_dir = File.join(File.expand_path('../../..', __dir__), 'data', 'self-knowledge')
+ return [] unless File.directory?(seed_dir)
+
+ Dir[File.join(seed_dir, '*.md')]
+ end
+
+ def seed_files(files)
+ count = 0
+ files.each do |path|
+ count += 1 if seed_single_file(path)
+ end
+ count
+ end
+
+ def seed_single_file(path)
+ content = File.read(path)
+ return false if content.strip.empty?
+
+ tags = ['legionio', 'self-knowledge', File.basename(path, '.md')]
+ result = ingest(content: content, tags: tags, source_channel: 'self-knowledge',
+ submitted_by: 'legion-apollo', confidence: 0.9)
+ return false unless result[:success] && result[:mode] != :deduplicated
+
+ ingest_global(content: content, tags: tags) if global_available?
+ true
+ end
+
+ def ingest_global(content:, tags:)
+ Legion::Apollo.ingest(content: content, tags: tags, source_channel: 'self-knowledge',
+ submitted_by: 'legion-apollo', confidence: 0.9, scope: :global)
+ rescue StandardError => e
+ Legion::Logging.debug("Global seed ingest failed: #{e.message}") if defined?(Legion::Logging)
+ end
+
+ def global_available?
+ defined?(Legion::Apollo) && Legion::Apollo.started? && Legion::Apollo.respond_to?(:ingest)
+ rescue StandardError
+ false
+ end
+
  def local_enabled?
  return false unless defined?(Legion::Settings)
 
@@ -186,7 +246,7 @@ module Legion
  def parse_tags(tags_json)
  return [] if tags_json.nil? || tags_json.empty?
 
- ::JSON.parse(tags_json)
+ Legion::JSON.parse(tags_json)
  rescue StandardError
  []
  end
@@ -218,7 +278,7 @@ module Legion
  def parse_embedding(embedding_json)
  return nil if embedding_json.nil? || embedding_json.empty?
 
- parsed = ::JSON.parse(embedding_json)
+ parsed = Legion::JSON.parse(embedding_json)
  parsed.is_a?(Array) ? parsed.map(&:to_f) : nil
  rescue StandardError
  nil
@@ -2,6 +2,6 @@
 
  module Legion
  module Apollo
- VERSION = '0.3.1'
+ VERSION = '0.3.2'
  end
  end
data/lib/legion/apollo.rb CHANGED
@@ -22,6 +22,9 @@ module Legion
 
  @started = true
  Legion::Logging.info 'Legion::Apollo started' if defined?(Legion::Logging)
+
+ Legion::Apollo::Local.start
+ seed_self_knowledge
  end
 
  def shutdown
@@ -222,7 +225,7 @@ module Legion
  hash = e[:content_hash] || Digest::MD5.hexdigest(e[:content].to_s.strip.downcase.gsub(/\s+/, ' '))
  tags = if e[:tags].is_a?(String)
  begin
- ::JSON.parse(e[:tags])
+ Legion::JSON.parse(e[:tags])
  rescue StandardError
  []
  end
@@ -285,6 +288,14 @@ module Legion
  { success: false, error: e.message }
  end
 
+ def seed_self_knowledge
+ Legion::Apollo::Local.seed_self_knowledge if Legion::Apollo::Local.started?
+ rescue StandardError => e
+ if defined?(Legion::Logging)
+ Legion::Logging.warn("Apollo self-knowledge seed failed (#{e.class}): #{e.message}")
+ end
+ end
+
  def apollo_setting(key, default)
  return default unless defined?(Legion::Settings) && !Legion::Settings[:apollo].nil?
 
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: legion-apollo
  version: !ruby/object:Gem::Version
- version: 0.3.1
+ version: 0.3.2
  platform: ruby
  authors:
  - Esity
@@ -66,6 +66,16 @@ files:
  - CHANGELOG.md
  - LICENSE
  - README.md
+ - data/self-knowledge/01-what-is-legion.md
+ - data/self-knowledge/02-architecture.md
+ - data/self-knowledge/03-extensions.md
+ - data/self-knowledge/04-security.md
+ - data/self-knowledge/05-llm-pipeline.md
+ - data/self-knowledge/06-apollo-knowledge.md
+ - data/self-knowledge/07-cli-reference.md
+ - data/self-knowledge/08-cognitive-layer.md
+ - data/self-knowledge/09-teams-integration.md
+ - data/self-knowledge/10-deployment.md
  - lib/legion/apollo.rb
  - lib/legion/apollo/helpers/confidence.rb
  - lib/legion/apollo/helpers/similarity.rb