RubyGems - legion-apollo - Versions diffs - 0.3.0 → 0.3.2 - Mend

legion-apollo 0.3.0 → 0.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

checksums.yaml +4 -4
data/CHANGELOG.md +19 -0
data/README.md +48 -1
data/data/self-knowledge/01-what-is-legion.md +51 -0
data/data/self-knowledge/02-architecture.md +51 -0
data/data/self-knowledge/03-extensions.md +54 -0
data/data/self-knowledge/04-security.md +46 -0
data/data/self-knowledge/05-llm-pipeline.md +54 -0
data/data/self-knowledge/06-apollo-knowledge.md +54 -0
data/data/self-knowledge/07-cli-reference.md +59 -0
data/data/self-knowledge/08-cognitive-layer.md +49 -0
data/data/self-knowledge/09-teams-integration.md +44 -0
data/data/self-knowledge/10-deployment.md +75 -0
data/lib/legion/apollo/local.rb +62 -2
data/lib/legion/apollo/runners/request.rb +14 -0
data/lib/legion/apollo/runners.rb +3 -0
data/lib/legion/apollo/version.rb +1 -1
data/lib/legion/apollo.rb +161 -15
metadata +13 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 63a18f390ea4ed531450615fbff3877951d39b6a4d60bdf5fbee72a2572f3817
-  data.tar.gz: 9220f43c2c962e936d82257b8e944fb9b82fd7e1c306fbb8462a3323e6e37de6
+  metadata.gz: 1c8e724eefe292faefec05ca4fc589ed935f4734bc5c4726b7c5d70ed3b69dbb
+  data.tar.gz: 3fc0246858d50305bd2d956e87c9ec844c71a769c4f81955f43d2640159a175c
 SHA512:
-  metadata.gz: '08e857932f343c0b7a5d8ec98abb1e2c2bbb4c4ba3953ef9ae33927b6a0f1861901f8646f900fba9fc233f99737805b355fa0f0176c11771fc664fe97ee54214'
-  data.tar.gz: 1d2bc1a4f9d9bf637731c48061bda8e442316feeb4a0eb29ccf0546d20ea1363be795c737079730706707659b1beef7cb81de63087ff4944084c25a9320e9eda
+  metadata.gz: eb2815139791a2ed4abd5f09f6ed2b5cf27b89c399ed00276fc58e8ebfaceede00eb1b0cc02f43371ce5099d12467398e7ce92e58352cc854bb4b5eecc4a2c1c
+  data.tar.gz: 763ff525cdf09284bc3a7c3337407bfcaa8a750f5dddaeb66e1a1b082742a9e158b813a39a0d606c4c1740ba26137c6b565c6152688bd6f007b601ed98dd6fd1
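
To check a downloaded gem against these values, the digests can be recomputed locally. A minimal sketch, assuming the `.gem` archive has already been unpacked so that `metadata.gz` and `data.tar.gz` sit in one directory; `verify_sha256` is a hypothetical helper, not part of RubyGems:

```ruby
require 'digest'

# Hypothetical helper: compare each file's SHA256 digest with the expected
# hex string taken from checksums.yaml.
# Returns a hash of filename => true/false.
def verify_sha256(dir, expected)
  expected.to_h do |name, hex|
    actual = Digest::SHA256.file(File.join(dir, name)).hexdigest
    [name, actual == hex]
  end
end
```

Called with the directory and a hash such as `{ 'data.tar.gz' => '3fc02468...' }` (full hex string from the file above), it reports which archives match.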

data/CHANGELOG.md CHANGED

@@ -1,5 +1,24 @@
 # Changelog
+## [0.3.2] - 2026-03-26
+### Added
+- Self-knowledge seed system: 10 markdown documents covering LegionIO identity, architecture, extensions, security, LLM pipeline, Apollo, CLI, cognitive layer, Teams integration, and deployment
+- `Apollo::Local.seed_self_knowledge` auto-ingests self-knowledge docs on boot (local + global)
+- `Apollo::Local.seeded?` query method
+- `data/**/*` included in gemspec so self-knowledge ships with the gem
+### Changed
+- Refactored `seed_self_knowledge` into smaller helpers (`self_knowledge_files`, `seed_files`, `seed_single_file`) to satisfy rubocop complexity
+## [0.3.1] - 2026-03-26
+### Added
+- `scope:` param on `query`/`retrieve`/`ingest` — `:global` (default), `:local` (SQLite only), `:all` (merged global + local)
+- `Legion::Apollo::Runners::Request` shim — GAIA `knowledge_retrieval` phase now resolves to merged retrieval without any changes to `legion-gaia`
+- Merge helpers: `query_merged`, `normalize_local_entries`, `normalize_global_entries`, `dedup_and_rank`
+- Ingest routing: `ingest_local` and `ingest_all` private helpers
 ## [0.3.0] - 2026-03-25
 ### Added

data/README.md CHANGED

@@ -2,15 +2,62 @@
 Apollo client library for the LegionIO framework.
-Provides `query`, `ingest`, and `retrieve` with smart routing: co-located lex-apollo service, RabbitMQ transport, or graceful failure.
+**Version**: 0.3.2
+Provides `query`, `ingest`, and `retrieve` with smart routing: co-located lex-apollo service, RabbitMQ transport, or graceful failure. Supports a node-local SQLite knowledge store (`Apollo::Local`) that mirrors the same API without requiring any remote infrastructure.
 ## Usage
 ```ruby
 Legion::Apollo.start
+# Global knowledge store (requires lex-apollo or RabbitMQ)
 Legion::Apollo.ingest(content: 'Some knowledge', tags: %w[fact ruby])
 results = Legion::Apollo.query(text: 'tell me about ruby', limit: 5)
+# Node-local store (SQLite + FTS5, no network required)
+Legion::Apollo.ingest(content: 'Local note', scope: :local)
+results = Legion::Apollo.query(text: 'local note', scope: :local)
+# Query both and merge (deduped by content hash, ranked by confidence)
+results = Legion::Apollo.query(text: 'ruby', scope: :all)
+```
+## Scopes
+| Scope | Route |
+|-------|-------|
+| `:global` (default) | Co-located lex-apollo or RabbitMQ transport |
+| `:local` | `Apollo::Local` SQLite+FTS5 store (node-local) |
+| `:all` | Both merged, deduped by `content_hash`, ranked by confidence |
+## Local Store
+`Apollo::Local` provides a node-local knowledge store backed by SQLite + FTS5. When started (e.g., via `Legion::Apollo.start`, which calls `Legion::Apollo::Local.start` automatically), it uses `Legion::Data::Local` when available and respects `Settings[:apollo][:local][:enabled]`.
+Features:
+- Content-hash dedup (MD5 of normalized content)
+- Optional LLM embeddings (1024-dim) with cosine rerank when `Legion::LLM.can_embed?`
+- TTL expiry (default 5-year retention)
+- FTS5 full-text search with `ILIKE` fallback
+## Configuration
+```json
+{
+  "apollo": {
+    "default_limit": 5,
+    "min_confidence": 0.3,
+    "max_tags": 20,
+    "local": {
+      "enabled": true,
+      "retention_years": 5,
+      "default_limit": 5,
+      "min_confidence": 0.3,
+      "fts_candidate_multiplier": 3
+    }
+  }
+}
 ```
 ## License
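
The `:all` scope in the README hunk above merges two result sets, dedups by content hash, and ranks by confidence. The sketch below illustrates that idea; it is not the gem's `dedup_and_rank` implementation. The entry shape (`:content`, `:confidence`), the whitespace and case normalization, and the choice to keep the higher-confidence duplicate are all assumptions made for illustration.

```ruby
require 'digest'

# Assumed normalization: trim, collapse whitespace, lowercase.
def content_hash(content)
  Digest::MD5.hexdigest(content.strip.gsub(/\s+/, ' ').downcase)
end

# Merge both result sets, keep the highest-confidence entry per content hash,
# then sort by confidence (descending) and cap at limit.
def merge_results(local, global, limit: 5)
  best = {}
  (local + global).each do |entry|
    key = content_hash(entry[:content])
    current = best[key]
    best[key] = entry if current.nil? || entry[:confidence] > current[:confidence]
  end
  best.values.sort_by { |entry| -entry[:confidence] }.first(limit)
end
```

With this shape, a local entry and a global entry that differ only in spacing or case collapse into one result, and the surviving copy is the one with the higher confidence.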

data/data/self-knowledge/01-what-is-legion.md ADDED

@@ -0,0 +1,51 @@
+# What is LegionIO?
+LegionIO is an extensible async job engine and cognitive platform for Ruby. It schedules tasks, creates relationships between services, and runs them concurrently. It was created by Matthew Iverson (@Esity) and is licensed under Apache-2.0.
+LegionIO is not a chatbot. It is a framework that can power chatbots, AI assistants, background workers, service integrations, and autonomous agents. The chat interface is one of many ways to interact with it.
+## Core Purpose
+LegionIO connects isolated systems — cloud accounts, on-premise services, SaaS tools — into a unified async task engine. It uses RabbitMQ for message passing, supports SQLite/PostgreSQL/MySQL for persistence, and Redis/Memcached for caching.
+## What LegionIO Does
+- Schedules and executes async tasks across distributed services
+- Chains tasks into workflows (Task A -> conditioner -> Task B -> transformer -> Task C)
+- Auto-discovers and loads extension gems (LEX plugins) at boot
+- Provides a unified REST API on port 4567 for all operations
+- Integrates with HashiCorp Vault for secrets and authentication
+- Supports Kerberos auto-authentication to Vault using existing AD credentials
+- Runs a 19-step LLM pipeline with RAG, guardrails, cost tracking, and model routing
+- Maintains a shared knowledge store (Apollo) for organizational knowledge
+- Provides a rich terminal UI with AI chat, dashboard, and extension browser
+- Tracks token usage and costs per-user and per-team
+- Supports HIPAA PHI compliance with redaction, crypto-erasure, and audit trails
+- Runs as a macOS/Linux background service via Homebrew or systemd
+## What LegionIO Does Not Do
+- It does not provide direct cloud infrastructure (no VMs, no networking)
+- It does not replace Terraform, Ansible, or Chef for infrastructure management
+- It does not host web applications or serve static content
+- It does not provide its own LLM — it routes to providers like Bedrock, Anthropic, OpenAI, Gemini, or Ollama
+- It does not require RabbitMQ in lite mode (uses an in-process message adapter)
+- It does not store credentials on disk — all secrets are in Vault or environment variables
+## Installation
+Install via Homebrew on macOS:
+```
+brew tap legionio/tap
+brew install legionio
+```
+Or via RubyGems:
+```
+gem install legionio
+```
+## Key Binaries
+- `legionio` — daemon and operational CLI (start, stop, config, lex, task, mcp, etc.)
+- `legion` — interactive terminal shell with AI chat, onboarding wizard, and dashboard

data/data/self-knowledge/02-architecture.md ADDED

@@ -0,0 +1,51 @@
+# LegionIO Architecture
+## Boot Sequence
+LegionIO starts subsystems in a fixed order. Each phase is individually toggleable.
+1. Logging (legion-logging)
+2. Settings (legion-settings — loads from /etc/legionio, ~/.legionio, ./settings)
+3. Crypt (legion-crypt — Vault connection, Kerberos auto-auth)
+4. Transport (legion-transport — RabbitMQ or InProcess lite adapter)
+5. Cache (legion-cache — Redis, Memcached, or Memory adapter)
+6. Data (legion-data — SQLite, PostgreSQL, or MySQL via Sequel)
+7. RBAC (legion-rbac — role-based access control)
+8. LLM (legion-llm — AI provider setup and routing)
+9. Apollo (legion-apollo — shared and local knowledge store)
+10. GAIA (legion-gaia — cognitive coordination layer, 24 phases)
+11. Telemetry (OpenTelemetry tracing, optional)
+12. Extensions (two-phase parallel: require+autobuild, then hook actors)
+13. API (Sinatra/Puma REST API on port 4567)
+Shutdown runs in reverse order. Reload shuts down then re-runs from settings onward.
+## Core Gems
+| Gem | Purpose |
+|-----|---------|
+| legion-transport | RabbitMQ AMQP messaging + InProcess lite adapter |
+| legion-cache | Caching (Redis/Memcached/Memory) |
+| legion-crypt | Encryption, Vault integration, JWT, Kerberos auth, mTLS |
+| legion-data | Database persistence via Sequel (SQLite/PostgreSQL/MySQL) |
+| legion-json | JSON serialization (multi_json wrapper) |
+| legion-logging | Console + structured JSON logging with redaction |
+| legion-settings | Configuration management with schema validation |
+| legion-llm | LLM integration with 19-step pipeline |
+| legion-mcp | MCP server with 58+ tools |
+| legion-gaia | Cognitive coordination (24 phases: 16 active + 8 dream) |
+| legion-apollo | Shared knowledge store client (local SQLite + global pgvector) |
+| legion-rbac | Role-based access control with Vault-style policies |
+| legion-tty | Rich terminal UI with AI chat and operational dashboard |
+## Extension Loading
+Extensions are gems named `lex-*`, auto-discovered via Bundler or Gem::Specification. Loading is two-phase and parallel: all extensions are required and `autobuild` runs concurrently on a thread pool, then `hook_all_actors` starts subscriptions sequentially. This prevents race conditions.
+## Lite Mode
+Setting `LEGION_MODE=lite` replaces RabbitMQ with an InProcess adapter and Redis with a Memory adapter. No external infrastructure required. Useful for development, demos, and single-machine deployments.
+## REST API
+Full REST API served by Sinatra/Puma on port 4567. Endpoints include tasks, extensions, runners, nodes, schedules, relationships, settings, events (SSE), transport status, hooks, workers, teams, capacity, tenants, audit, RBAC, and webhooks. JWT Bearer auth middleware with rate limiting.

data/data/self-knowledge/03-extensions.md ADDED

@@ -0,0 +1,54 @@
+# LegionIO Extension System (LEX)
+## What is a LEX?
+A LEX (Legion Extension) is a Ruby gem named `lex-*` that plugs into LegionIO. Each LEX defines runners (functions) and actors (execution modes). Extensions are auto-discovered at boot — install a gem and it loads automatically.
+## Actor Types
+| Type | Behavior |
+|------|----------|
+| Subscription | Consumes messages from an AMQP queue |
+| Polling | Polls on a schedule |
+| Interval (Every) | Runs at a fixed interval |
+| Once | Runs once at startup |
+| Loop | Runs continuously |
+| Nothing | Passive — only invoked via API or other extensions |
+## Creating an Extension
+```bash
+legion lex create myextension    # scaffold a new lex-myextension gem
+legion generate runner myrunner   # add a runner with functions
+legion generate actor myactor     # add an actor with type selection
+legion generate tool mytool       # add an MCP tool
+```
+## Extension Categories
+### Core Operational (21 extensions)
+node, tasker, scheduler, synapse, LLM gateway, detect, telemetry, acp, react, webhook, health, metering, exec, conditioner, transformer, tick, audit, codegen, privatecore, lex (meta), knowledge
+### Agentic/Cognitive (13 consolidated gems + supporting)
+self (identity, metacognition, reflection, personality, agency), affect (emotion, mood, sentiment), imagination (creative generation, dream ideation), language (NLU, discourse), memory (episodic, semantic, working memory), social (theory of mind, social cognition), swarm-github (code review), mesh (inter-agent communication), mind-growth (autonomous expansion), autofix, dataset, eval, factory
+### AI Provider Integrations (7)
+azure-ai, bedrock, claude, foundry, gemini, openai, xai
+### Service Integrations (10 common + 40 additional)
+Common: consul, github, http, kerberos, vault, tfe, microsoft-teams, slack, webhook, acp
+Additional: chef, jfrog, ssh, smtp, kafka, jira, docker, kubernetes, and more
+## Role-Based Filtering
+Extensions load based on role profile:
+- `nil` (default): all extensions
+- `:core`: 14 core operational only
+- `:cognitive`: core + all agentic
+- `:service`: core + service integrations
+- `:dev`: core + AI + essential agentic
+- `:custom`: explicit list from settings
+## Extension Discovery
+At boot, LegionIO calls `Bundler.load.specs` (or `Gem::Specification` fallback) to find all `lex-*` gems. Each extension's `autobuild` creates runners and actors. After all extensions load, `hook_all_actors` activates AMQP subscriptions and timers.

data/data/self-knowledge/04-security.md ADDED

@@ -0,0 +1,46 @@
+# LegionIO Security
+## Authentication
+### Kerberos + Vault
+LegionIO authenticates to HashiCorp Vault using Kerberos (SPNEGO). On macOS or Linux machines joined to Active Directory, the existing Kerberos ticket is used — no password entry needed. The SPNEGO token is sent as an HTTP Authorization header to Vault's Kerberos auth backend, which returns a Vault token. Token renewal runs in a background thread at 75% TTL.
+### JWT Authentication
+The REST API uses JWT Bearer auth. Tokens are validated against JWKS endpoints. Skip paths exist for health and readiness checks.
+### mTLS
+Optional mutual TLS for internal communications. Vault PKI issues certificates, and a background thread rotates them at 50% TTL. Feature-flagged via `security.mtls.enabled`.
+## Secrets Management
+All secrets are stored in HashiCorp Vault, never on disk. Config files reference secrets using `vault://` URIs that are resolved at runtime. Environment variable fallback is supported via `env://` URIs.
+Example: `"bearer_token": "vault://secret/data/llm/bedrock#bearer_token"`
+## RBAC
+Optional role-based access control using Vault-style flat policies. Policies map identities to allowed actions on resources. Enforced at the API middleware layer and in the LLM pipeline.
+## HIPAA PHI Compliance
+- **PHI Tagging**: Metadata classification for sensitive data
+- **PHI Access Logging**: Audit trail via Legion::Audit for all PHI access
+- **PHI Erasure**: Crypto-erasure orchestration via Crypt::Erasure + Cache purge
+- **PHI TTL Cap**: legion-cache enforces maximum TTL for PHI-tagged data
+- **Redaction**: Automatic PII/PHI redaction in all log output via legion-logging
+All PHI features are off by default and enabled via configuration.
+## Audit
+- Tamper-evident hash chain for audit entries
+- 7-year tiered retention (hot -> warm -> cold storage)
+- SIEM export for Splunk/ELK ingestion
+- Queryable via CLI (`legion audit`) and REST API
+## Network Security
+- No public IPs or ingress in production deployments
+- TLS required on all connections (Optum-sanctioned CAs only in UHG deployments)
+- Rate limiting middleware with per-IP/agent/tenant tiers
+- Request body size limits (1MB max)
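
The `vault://` and `env://` references described under Secrets Management in the file above can be split into their parts with a small parser. This is a sketch only: `parse_secret_ref` and `SECRET_REF` are hypothetical names, and reading the text after `#` as the key inside the Vault secret is inferred from the single example, not confirmed by the source.

```ruby
# Matches "vault://<path>#<key>" and "env://<name>"; the "#<key>" part is optional.
SECRET_REF = %r{\A(?<scheme>vault|env)://(?<path>[^#]+)(?:#(?<key>.+))?\z}

# Returns { scheme:, path:, key: } for a reference string,
# or nil when the value is a plain literal.
def parse_secret_ref(value)
  match = SECRET_REF.match(value.to_s)
  return nil unless match

  { scheme: match[:scheme].to_sym, path: match[:path], key: match[:key] }
end
```

A resolver built on top of this would look up `path` in Vault (or `ENV`) at runtime and leave plain values untouched.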

data/data/self-knowledge/05-llm-pipeline.md ADDED

@@ -0,0 +1,54 @@
+# LegionIO LLM Pipeline
+## Overview
+LegionIO routes AI requests through a 19-step pipeline that adds governance, RAG context, tool use, cost tracking, and knowledge capture. The pipeline is provider-agnostic — it works with Bedrock, Anthropic, OpenAI, Gemini, and local Ollama.
+## Pipeline Steps
+1. **Normalize** — standardize request format
+2. **Profile** — derive caller profile (user, system, service)
+3. **RBAC** — check access permissions
+4. **Classification** — classify request sensitivity
+5. **Billing** — check budget and rate limits
+6. **Guardrails** — input validation and safety checks
+7. **GAIA Advisory** — cognitive layer enrichment (optional system prompt)
+8. **RAG Context** — retrieve relevant knowledge from Apollo (global + local)
+9. **MCP Discovery** — discover available tools
+10. **Enrichment Injection** — prepend GAIA/RAG context to system prompt
+11. **Fleet Selection** — choose optimal model/provider
+12. **Dispatch** — send request to LLM provider
+13. **Parse Response** — extract text and tool calls
+14. **Tool Calls** — execute MCP tools if requested
+15. **Post-Response** — post-processing
+16. **Audit** — publish audit trail
+17. **Metering** — record token usage and cost
+18. **Timeline** — record timing data
+19. **Knowledge Capture** — write significant responses back to Apollo
+## Supported Providers
+| Provider | Models | Auth |
+|----------|--------|------|
+| AWS Bedrock | Claude, Llama, Mistral | Bearer token (from Vault) |
+| Anthropic | Claude family | API key |
+| OpenAI | GPT-4, GPT-3.5 | API key |
+| Google Gemini | Gemini Pro, Flash | API key |
+| Ollama | Any local model | None (localhost) |
+## Cost Tracking
+Every LLM call is metered. Token counts (input/output) and estimated costs are tracked per-request, per-session, per-user, and per-team. The status bar in the terminal UI shows real-time token count and cost. Budget limits can be set per-user or per-team.
+## Model Routing
+The fleet selection step chooses the optimal model based on request classification, cost constraints, and provider availability. Model escalation automatically retries with a more capable model if the initial response fails quality checks.
+## RAG Integration
+Step 8 retrieves relevant context from Apollo using scope routing:
+- `:local` — node-local SQLite+FTS5 store only
+- `:global` — shared PostgreSQL+pgvector store
+- `:all` — both merged, deduplicated by content hash, ranked by confidence
+Retrieved context is injected into the system prompt by the Enrichment Injector (step 10).

data/data/self-knowledge/06-apollo-knowledge.md ADDED

@@ -0,0 +1,54 @@
+# Apollo Knowledge Store
+## What is Apollo?
+Apollo is LegionIO's shared knowledge store. It provides organizational memory that persists across sessions and is shared across all LegionIO nodes. Every AI response can reference knowledge from Apollo, and significant responses are captured back into it.
+## Architecture
+### Local Store (every node)
+- SQLite database with FTS5 full-text search
+- Content-hash deduplication (MD5)
+- Optional LLM embeddings (1024-dim) with cosine rerank
+- TTL-based expiry (default 5 years)
+- Works offline — no network required
+### Global Store (shared)
+- PostgreSQL with pgvector extension
+- HNSW cosine similarity index
+- Agents interact via RabbitMQ (no direct DB access)
+- Hosted on Azure PostgreSQL Flexible Server
+## Scope Routing
+All queries and ingests accept a `scope:` parameter:
+- `:local` — SQLite only
+- `:global` — PostgreSQL only (via transport or co-located extension)
+- `:all` — both merged, deduplicated by content hash, ranked by confidence
+## Knowledge Capture
+The LLM pipeline's step 19 (Knowledge Capture) automatically writes significant responses back to Apollo. This creates a feedback loop where the system learns from its own interactions. Content-hash deduplication prevents echo chambers.
+## Knowledge CLI
+```
+legion knowledge query "question"     # query and synthesize answer
+legion knowledge retrieve "question"  # raw source chunks
+legion knowledge ingest <path>        # ingest file or directory
+legion knowledge status               # corpus stats
+legion knowledge health               # full health report
+legion knowledge maintain             # orphan detection and cleanup
+legion knowledge quality              # quality report
+legion knowledge monitor add <path>   # watch a directory for changes
+legion knowledge capture commit       # capture git commit as knowledge
+```
+## Content Pipeline
+Files ingested via `legion knowledge ingest` go through:
+1. Format detection (Markdown, PDF, DOCX, plain text)
+2. Chunking by heading hierarchy (H1-H6 with ancestry path)
+3. Delta detection (only new/changed files via manifest)
+4. Batch embedding (one LLM call per file, not per chunk)
+5. Upsert to Apollo (local and/or global based on scope)
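
Step 2 of the content pipeline above (chunking by heading hierarchy with an ancestry path) can be pictured with the following sketch. It is an illustration, not the `legion knowledge ingest` implementation: `chunk_markdown` and the chunk shape are hypothetical, only ATX headings (`#` through `######`) are handled, and `#` lines inside fenced code blocks would be misread as headings.

```ruby
# Split Markdown into one chunk per heading. Each chunk carries the titles of
# its ancestor headings (the "ancestry path") plus its own title.
def chunk_markdown(text)
  chunks = []
  stack = [] # [level, title] pairs for the current heading ancestry
  current = nil

  text.each_line do |line|
    if (m = line.match(/\A([#]{1,6})\s+(.+?)\s*\z/))
      level = m[1].length
      # Drop headings at the same or deeper level before descending.
      stack.pop while stack.any? && stack.last[0] >= level
      stack.push([level, m[2]])
      current = { path: stack.map(&:last), lines: [] }
      chunks << current
    elsif current
      current[:lines] << line
    end
  end

  chunks.map { |chunk| { path: chunk[:path], body: chunk[:lines].join.strip } }
end
```

Keeping the path with each chunk means a retrieved fragment such as "brew install" still records that it came from "Guide > Install > macOS".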

data/data/self-knowledge/07-cli-reference.md ADDED

@@ -0,0 +1,59 @@
+# LegionIO CLI Reference
+## Interactive Shell
+Running `legion` with no arguments launches the rich terminal UI:
+- Digital rain intro animation on first run
+- Onboarding wizard with Kerberos identity detection
+- AI chat shell with streaming responses
+- Dashboard (Ctrl+D) with service status panels
+- Extension browser, config editor, command palette (Ctrl+K)
+- 115+ slash commands, tab completion, session persistence
+## Key Commands
+### Daemon Operations (legionio)
+```
+legionio start                    # start daemon
+legionio stop                     # stop daemon
+legionio status                   # check daemon status
+legionio doctor                   # 11-check environment diagnosis
+legionio config scaffold          # generate starter config files
+legionio config import <url>      # import config from URL
+legionio bootstrap <url>          # one-command setup (config + scaffold + install packs)
+legionio setup agentic            # install 47 cognitive gems
+legionio setup claude-code        # configure MCP server for Claude Code
+legionio setup cursor             # configure MCP server for Cursor
+legionio mcp stdio                # start MCP server (stdio transport)
+legionio lex list                 # list loaded extensions
+legionio update                   # self-update via Homebrew or gem
+```
+### Interactive / Dev (legion)
+```
+legion                            # launch rich terminal UI
+legion chat                       # AI chat REPL
+legion do "natural language"      # natural language command routing
+legion knowledge query "question" # query knowledge base
+legion commit                     # AI-generated commit message
+legion pr                         # AI-generated PR description
+legion review                     # AI code review
+legion plan                       # read-only exploration mode
+legion memory list                # persistent memory management
+legion mind-growth status         # cognitive architecture status
+```
+### MCP Server
+LegionIO exposes 58+ MCP tools when configured as an MCP server in Claude Code, Cursor, or VS Code. Tools cover knowledge queries, extension management, task operations, system status, and more.
+## Natural Language Commands
+`legion do` routes free-text to the right extension capability:
+```
+legion do "list all running extensions"
+legion do "check system health"
+legion do "show vault status"
+```
+It tries three resolution paths: daemon API, in-process capability registry, LLM classification.

data/data/self-knowledge/08-cognitive-layer.md ADDED

@@ -0,0 +1,49 @@
+# LegionIO Cognitive Layer
+## GAIA (Cognitive Coordination)
+GAIA is LegionIO's cognitive coordination layer. It manages 24 phases (16 active + 8 dream) in a tick-based cycle. Each tick runs through phases like sensory processing, attention, working memory integration, prediction, emotional evaluation, action selection, and reflection.
+### Active Phases (16)
+sensory_processing, attention_filtering, working_memory_integration, contextual_memory_retrieval, knowledge_retrieval, prediction_engine, emotional_evaluation, action_selection, social_cognition, theory_of_mind, homeostasis_regulation, autonomy_gating, identity_entropy_check, post_tick_reflection, audit_publish, metering
+### Dream Phases (8)
+memory_consolidation, dream_generation, emotional_processing, creative_association, pattern_extraction, belief_update, schema_integration, dream_ideation
+## Agentic Extensions
+LegionIO includes 13 consolidated cognitive domain gems:
+| Gem | Domain |
+|-----|--------|
+| lex-agentic-self | Identity, metacognition, reflection, personality, agency, self-talk |
+| lex-agentic-affect | Emotion modeling, mood tracking, sentiment analysis |
+| lex-agentic-imagination | Creative generation, dream ideation, scenario planning |
+| lex-agentic-language | Natural language understanding, discourse analysis |
+| lex-agentic-memory | Episodic, semantic, and working memory management |
+| lex-agentic-social | Social cognition, theory of mind, preference exchange |
+| lex-mesh | Inter-agent communication, gossip protocol, preference profiles |
+| lex-mind-growth | Autonomous cognitive architecture expansion |
+| lex-swarm-github | Multi-agent code review |
+| lex-eval | Evaluation and benchmarking |
+| lex-autofix | Autonomous code fix pipeline |
+| lex-dataset | Training data management |
+| lex-factory | Spec-to-code generation pipeline |
+## Self-Awareness
+The `lex-agentic-self` extension maintains a live self-model:
+- **Metacognition**: Real-time snapshot of loaded extensions, capabilities, health
+- **Self-narrative**: Prose description of current state (injected into system prompt)
+- **Behavioral fingerprint**: 6-dimension identity tracking with drift detection
+- **Personality**: Big Five OCEAN traits that evolve slowly over time
+- **Reflection**: Post-tick analysis with health scores across 7 categories
+## Mind Growth
+`lex-mind-growth` enables autonomous expansion of cognitive capabilities:
+- Gap analysis against reference cognitive models
+- Proposal, evaluation, and staged build pipeline
+- Swarm-based building and distributed consensus
+- Competitive evolution with fitness-based selection
+- 25 completed phases, 998+ specs

data/data/self-knowledge/09-teams-integration.md ADDED

@@ -0,0 +1,44 @@
+# Microsoft Teams Integration
+## Overview
+The `lex-microsoft_teams` extension connects LegionIO to Microsoft Teams via the Microsoft Graph API. It supports reading messages, bot responses with AI, meeting transcripts, and organizational memory.
+## Authentication
+Two auth paths run in parallel:
+- **Application (client credentials)**: Bot-to-bot communication via client_id/client_secret
+- **Delegated (user OAuth)**: User-context access via browser PKCE flow or device code fallback
+Tokens are persisted to Vault (with local file fallback) and auto-refreshed with a 60-second pre-expiry buffer.
+## Capabilities
+### Message Reading
+- 1:1 and group chat messages
+- Channel messages across teams
+- Real-time message processing via AMQP transport
+### AI Bot
+- Direct chat mode: users DM the bot, get AI responses via LLM pipeline
+- Conversation observer mode: passive extraction from watched chats (disabled by default)
+- Multi-turn sessions with context persistence
+- Memory trace injection for organizational context
+### Meetings and Transcripts
+- Online meeting CRUD and join URL lookup
+- Meeting transcript retrieval (VTT/DOCX format)
+- Attendance reports
+### Organizational Intelligence
+- Profile ingestion: identity, contacts, conversation summaries
+- Incremental sync every 15 minutes for new messages
+- Memory traces stored across sender, teams, and chat domains
+## RAG Integration
+The bot injects organizational memory context into every response:
+- Retrieves traces from lex-agentic-memory across 3 domain scopes
+- Deduplicates by trace_id, ranks by strength and recency
+- Appends formatted context to the system prompt (2000 token budget)
+- Per-user preference profiles from lex-mesh customize response style

data/data/self-knowledge/10-deployment.md ADDED

@@ -0,0 +1,75 @@
+# LegionIO Deployment
+## Installation Methods
+### Homebrew (macOS, recommended)
+```
+brew tap legionio/tap
+brew install legionio
+```
+This installs a self-contained Ruby 3.4.8 runtime with YJIT, all core gems, and wrapper scripts. No system Ruby or rbenv required. Redis is installed as a recommended dependency.
+### RubyGems
+```
+gem install legionio
+```
+### Docker
+```
+docker pull legionio/legion
+```
+## Configuration
+Config files live at `~/.legionio/settings/` as JSON files (one per subsystem). Generate starter configs:
+```
+legionio config scaffold
+```
+Bootstrap from a remote URL:
+```
+legionio bootstrap https://example.com/config.json
+```
+Settings resolution order: command-line flags > environment variables > config files > defaults.
+## Running
+### Background Service (recommended)
+```
+brew services start redis
+brew services start legionio
+```
+The daemon runs as a launchd service with automatic restart. Logs at `$(brew --prefix)/var/log/legion/legion.log`.
+### Foreground
+```
+legionio start --log-level debug
+```
+### Lite Mode (no infrastructure)
+```
+LEGION_MODE=lite legionio start
+```
+Replaces RabbitMQ with in-process messaging and Redis with in-memory cache.
+## Infrastructure Requirements
+| Service | Required? | Purpose |
+|---------|-----------|---------|
+| Redis | Recommended | Caching, tracing, dream cycle |
+| RabbitMQ | Optional (lite mode skips) | Async job messaging |
+| PostgreSQL | Optional | Persistent storage (SQLite default) |
+| HashiCorp Vault | Optional | Secrets management, PKI, auth |
+| Ollama | Optional | Local LLM inference |
+## Scaling
+LegionIO supports horizontal scaling with:
+- RabbitMQ clustering for distributed job processing
+- Singleton lock (dual-backend: Redis + DB) for leader election
+- GAIA heartbeat singletons to prevent duplicate cognitive cycles
+- Connection pooling for database and cache
+- Feature-flagged via `cluster.singleton_enabled` and `cluster.leader_election`
+Same architecture runs on a laptop or a 100-node cluster.

data/lib/legion/apollo/local.rb CHANGED

@@ -86,10 +86,70 @@ module Legion
         def reset!
           @started = false
+          @seeded = false
+        end
+        def seed_self_knowledge
+          return unless started?
+          return if @seeded
+          files = self_knowledge_files
+          return if files.empty?
+          count = seed_files(files)
+          @seeded = true
+          Legion::Logging.info("Apollo::Local seeded #{count} self-knowledge files") if defined?(Legion::Logging)
+        rescue StandardError => e
+          Legion::Logging.warn("Apollo::Local seed failed: #{e.message}") if defined?(Legion::Logging)
+        end
+        def seeded?
+          @seeded == true
         end
         private
+        def self_knowledge_files
+          seed_dir = File.join(File.expand_path('../../..', __dir__), 'data', 'self-knowledge')
+          return [] unless File.directory?(seed_dir)
+          Dir[File.join(seed_dir, '*.md')]
+        end
+        def seed_files(files)
+          count = 0
+          files.each do |path|
+            count += 1 if seed_single_file(path)
+          end
+          count
+        end
+        def seed_single_file(path)
+          content = File.read(path)
+          return false if content.strip.empty?
+          tags = ['legionio', 'self-knowledge', File.basename(path, '.md')]
+          result = ingest(content: content, tags: tags, source_channel: 'self-knowledge',
+                          submitted_by: 'legion-apollo', confidence: 0.9)
+          return false unless result[:success] && result[:mode] != :deduplicated
+          ingest_global(content: content, tags: tags) if global_available?
+          true
+        end
+        def ingest_global(content:, tags:)
+          Legion::Apollo.ingest(content: content, tags: tags, source_channel: 'self-knowledge',
+                                submitted_by: 'legion-apollo', confidence: 0.9, scope: :global)
+        rescue StandardError => e
+          Legion::Logging.debug("Global seed ingest failed: #{e.message}") if defined?(Legion::Logging)
+        end
+        def global_available?
+          defined?(Legion::Apollo) && Legion::Apollo.started? && Legion::Apollo.respond_to?(:ingest)
+        rescue StandardError
+          false
+        end
         def local_enabled?
           return false unless defined?(Legion::Settings)
@@ -186,7 +246,7 @@ module Legion
         def parse_tags(tags_json)
           return [] if tags_json.nil? || tags_json.empty?
-          ::JSON.parse(tags_json)
+          Legion::JSON.parse(tags_json)
         rescue StandardError
           []
         end
@@ -218,7 +278,7 @@ module Legion
         def parse_embedding(embedding_json)
           return nil if embedding_json.nil? || embedding_json.empty?
-          parsed = ::JSON.parse(embedding_json)
+          parsed = Legion::JSON.parse(embedding_json)
           parsed.is_a?(Array) ? parsed.map(&:to_f) : nil
         rescue StandardError
           nil
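
The seeding path added to local.rb can be tried in isolation. The sketch below is an assumption-laden stand-in, not the gem's code: `ToyStore`, its SHA256-keyed hash, the `:stored` mode, and `seed_from` are hypothetical names invented for illustration. Only the guard flag, the empty-file skip, the tag list, and the `:deduplicated` check mirror what the diff shows. The `.sort` on the file list is an addition here to make the order deterministic; the diff does not sort.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical stand-in for Legion::Apollo::Local; not the gem's API.
class ToyStore
  attr_reader :entries

  def initialize
    @entries = {}
    @seeded = false
  end

  def seeded?
    @seeded
  end

  # Mimics the result shape the diff checks: :success plus a :mode flag.
  # The :stored mode is an assumption made for this sketch.
  def ingest(content:, tags:)
    key = Digest::SHA256.hexdigest(content)
    return { success: true, mode: :deduplicated } if @entries.key?(key)

    @entries[key] = { content: content, tags: tags }
    { success: true, mode: :stored }
  end

  # Seeds every non-empty *.md file once; later calls return nil early.
  def seed_from(dir)
    return if @seeded

    files = Dir[File.join(dir, '*.md')].sort
    return if files.empty?

    count = files.count { |path| seed_single_file(path) }
    @seeded = true
    count
  end

  private

  def seed_single_file(path)
    content = File.read(path)
    return false if content.strip.empty?

    tags = ['legionio', 'self-knowledge', File.basename(path, '.md')]
    result = ingest(content: content, tags: tags)
    result[:success] && result[:mode] != :deduplicated
  end
end

# Builds a throwaway seed directory and seeds it twice.
def demo_seed
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, '01-intro.md'), "# Intro\nHello")
    File.write(File.join(dir, '02-empty.md'), "   \n")
    store = ToyStore.new
    first = store.seed_from(dir)
    second = store.seed_from(dir)
    [first, second, store.entries.values.first[:tags]]
  end
end
```

Calling `demo_seed` seeds one file, skips the whitespace-only one, and does nothing on the second call because the flag is already set.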

data/lib/legion/apollo/runners/request.rb ADDED

@@ -0,0 +1,14 @@
+# frozen_string_literal: true
+module Legion
+  module Apollo
+    module Runners
+      # GAIA knowledge_retrieval shim — delegates to Legion::Apollo.retrieve with scope: :all.
+      module Request
+        def self.retrieve(text:, limit: 5, **)
+          Legion::Apollo.retrieve(text: text, limit: limit, scope: :all, **)
+        end
+      end
+    end
+  end
+end
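
The shim in request.rb pins `scope: :all` and forwards every other keyword. Its bare `**` is Ruby's anonymous keyword-rest forwarding, which was introduced in Ruby 3.2. The sketch below shows the same delegation with a named `**opts` so it also runs on older interpreters. `ToyKnowledge` and `ToyShim` are hypothetical modules standing in for `Legion::Apollo` and `Runners::Request`; nothing here is the gem's API.

```ruby
# Hypothetical backend standing in for Legion::Apollo.retrieve.
# It simply echoes its arguments so the forwarding is visible.
module ToyKnowledge
  def self.retrieve(text:, limit: 5, scope: :global, **opts)
    { text: text, limit: limit, scope: scope, opts: opts }
  end
end

# Shim that fixes the scope and passes remaining keywords through.
# The diff writes `**` instead of `**opts` (Ruby 3.2 or newer).
module ToyShim
  def self.retrieve(text:, limit: 5, **opts)
    ToyKnowledge.retrieve(text: text, limit: limit, scope: :all, **opts)
  end
end

def demo_shim
  ToyShim.retrieve(text: 'what is legion?', tags: ['legionio'])
end
```

In `demo_shim` the caller never mentions a scope, yet the backend receives `scope: :all` together with the untouched `tags:` keyword.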

data/lib/legion/apollo/runners.rb ADDED

@@ -0,0 +1,3 @@
+# frozen_string_literal: true
+require_relative 'runners/request'

data/lib/legion/apollo/version.rb CHANGED

@@ -2,6 +2,6 @@
 module Legion
   module Apollo
-    VERSION = '0.3.0'
+    VERSION = '0.3.2'
   end
 end

data/lib/legion/apollo.rb CHANGED

@@ -1,13 +1,16 @@
 # frozen_string_literal: true
+require 'digest'
 require_relative 'apollo/version'
 require_relative 'apollo/settings'
 require_relative 'apollo/local'
+require_relative 'apollo/runners'
 module Legion
   # Apollo client library — query, ingest, and retrieve with smart routing.
   # Routes to a co-located lex-apollo service when available, falls back to
   # RabbitMQ transport, and degrades gracefully when neither is present.
+  # Supports scope: :global (default), :local (SQLite only), :all (merged).
   module Apollo # rubocop:disable Metrics/ModuleLength
     class << self # rubocop:disable Metrics/ClassLength
       def start
@@ -19,6 +22,9 @@ module Legion
         @started = true
         Legion::Logging.info 'Legion::Apollo started' if defined?(Legion::Logging)
+        Legion::Apollo::Local.start
+        seed_self_knowledge
       end
       def shutdown
@@ -36,39 +42,49 @@ module Legion
         Legion::Apollo::Local
       end
-      def query(text:, limit: nil, min_confidence: nil, tags: nil, **opts) # rubocop:disable Metrics/MethodLength
+      def query(text:, limit: nil, min_confidence: nil, tags: nil, scope: :global, **opts) # rubocop:disable Metrics/MethodLength,Metrics/CyclomaticComplexity,Metrics/ParameterLists
         return not_started_error unless started?
-        limit ||= apollo_setting(:default_limit, 5)
+        limit          ||= apollo_setting(:default_limit, 5)
         min_confidence ||= apollo_setting(:min_confidence, 0.3)
         payload = { text: text, limit: limit, min_confidence: min_confidence, tags: tags, **opts }
-        if co_located_reader?
-          direct_query(payload)
-        elsif transport_available?
-          publish_query(payload)
+        case scope
+        when :local then query_local(payload)
+        when :all   then query_merged(payload)
         else
-          { success: false, error: :no_path_available }
+          if co_located_reader?
+            direct_query(payload)
+          elsif transport_available?
+            publish_query(payload)
+          else
+            { success: false, error: :no_path_available }
+          end
         end
       end
-      def ingest(content:, tags: [], **opts)
+      def ingest(content:, tags: [], scope: :global, **opts) # rubocop:disable Metrics/MethodLength
         return not_started_error unless started?
         payload = { content: content, tags: Array(tags).first(apollo_setting(:max_tags, 20)), **opts }
-        if co_located_writer?
-          direct_ingest(payload)
-        elsif transport_available?
-          publish_ingest(payload)
+        case scope
+        when :local then ingest_local(payload)
+        when :all   then ingest_all(payload)
         else
-          { success: false, error: :no_path_available }
+          if co_located_writer?
+            direct_ingest(payload)
+          elsif transport_available?
+            publish_ingest(payload)
+          else
+            { success: false, error: :no_path_available }
+          end
         end
       end
-      def retrieve(text:, limit: 5, **)
-        query(text: text, limit: limit, **)
+      def retrieve(text:, limit: 5, scope: :global, **)
+        query(text: text, limit: limit, scope: scope, **)
       end
       def transport_available?
@@ -150,6 +166,136 @@ module Legion
         { success: false, error: e.message }
       end
+      def query_local(payload)
+        return { success: false, error: :no_path_available } unless Legion::Apollo::Local.started?
+        result = Legion::Apollo::Local.query(**payload.slice(:text, :limit, :min_confidence, :tags))
+        return result unless result[:success]
+        entries = normalize_local_entries(Array(result[:results]))
+        { success: true, entries: entries, count: entries.size, mode: :local }
+      rescue StandardError => e
+        { success: false, error: e.message }
+      end
+      def query_merged(payload) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
+        entries = []
+        attempted = false
+        any_success = false
+        errors = []
+        if co_located_reader?
+          attempted = true
+          global = direct_query(payload)
+          if global[:success]
+            any_success = true
+            entries.concat(normalize_global_entries(Array(global[:entries]))) if global[:entries]
+          else
+            errors << global[:error]
+          end
+        end
+        if Legion::Apollo::Local.started?
+          attempted = true
+          local = Legion::Apollo::Local.query(**payload.slice(:text, :limit, :min_confidence, :tags))
+          if local[:success]
+            any_success = true
+            entries.concat(normalize_local_entries(Array(local[:results]))) if local[:results]
+          else
+            errors << local[:error]
+          end
+        end
+        return { success: false, error: :no_path_available } unless attempted
+        unless any_success
+          combined_error = errors.compact.map(&:to_s).reject(&:empty?).join('; ')
+          combined_error = :upstream_query_failed if combined_error.empty?
+          return { success: false, error: combined_error }
+        end
+        ranked = dedup_and_rank(entries, limit: payload[:limit])
+        { success: true, entries: ranked, count: ranked.size, mode: :merged }
+      rescue StandardError => e
+        { success: false, error: e.message }
+      end
+      def normalize_local_entries(entries) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
+        entries.map do |e|
+          hash = e[:content_hash] || Digest::MD5.hexdigest(e[:content].to_s.strip.downcase.gsub(/\s+/, ' '))
+          tags = if e[:tags].is_a?(String)
+                   begin
+                     Legion::JSON.parse(e[:tags])
+                   rescue StandardError
+                     []
+                   end
+                 else
+                   Array(e[:tags])
+                 end
+          { id: e[:id], content: e[:content], content_hash: hash,
+            confidence: e[:confidence] || 0.5, content_type: 'fact', tags: tags, source: :local }
+        end
+      end
+      def normalize_global_entries(entries)
+        entries.map do |e|
+          hash = e[:content_hash] || Digest::MD5.hexdigest(e[:content].to_s.strip.downcase.gsub(/\s+/, ' '))
+          { id: e[:id], content: e[:content], content_hash: hash,
+            confidence: e[:confidence] || 0.5, content_type: e[:content_type] || 'fact',
+            tags: Array(e[:tags]), source: :global }
+        end
+      end
+      def dedup_and_rank(entries, limit:)
+        sorted = entries
+                 .sort_by { |e| -(e[:confidence] || 0) }
+                 .uniq { |e| e[:content_hash] }
+        limit ? sorted.first(limit) : sorted
+      end
+      def ingest_local(payload)
+        return { success: false, error: :no_path_available } unless Legion::Apollo::Local.started?
+        Legion::Apollo::Local.ingest(**payload)
+      rescue StandardError => e
+        { success: false, error: e.message }
+      end
+      def ingest_all(payload) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
+        results = []
+        if co_located_writer?
+          results << direct_ingest(payload)
+        elsif transport_available?
+          results << publish_ingest(payload)
+        end
+        results << Legion::Apollo::Local.ingest(**payload) if Legion::Apollo::Local.started?
+        return { success: false, error: :no_path_available } if results.empty?
+        overall_success = results.any? { |r| r.respond_to?(:[]) && r[:success] }
+        if overall_success
+          { success: true, mode: :all, results: results }
+        else
+          errors = results.select { |r| r.respond_to?(:[]) }.map { |r| r[:error] }.compact.uniq
+          error_value = errors.length <= 1 ? errors.first : errors
+          { success: false, mode: :all, results: results, error: error_value }
+        end
+      rescue StandardError => e
+        { success: false, error: e.message }
+      end
+      def seed_self_knowledge
+        Legion::Apollo::Local.seed_self_knowledge if Legion::Apollo::Local.started?
+      rescue StandardError => e
+        if defined?(Legion::Logging)
+          Legion::Logging.warn("Apollo self-knowledge seed failed (#{e.class}): #{e.message}")
+        end
+      end
       def apollo_setting(key, default)
         return default unless defined?(Legion::Settings) && !Legion::Settings[:apollo].nil?
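
The merged query path in apollo.rb relies on two small steps: hashing normalized content, then sorting by confidence before dropping duplicate hashes. The sketch below reproduces those two steps as free-standing methods so the effect is easy to check. The method bodies follow the diff, but `content_hash` and `demo_merge` are names made up for this example, and the sample entries are invented data.

```ruby
require 'digest'

# Same normalization the diff applies before hashing: trim, lowercase,
# and collapse whitespace runs to a single space.
def content_hash(content)
  Digest::MD5.hexdigest(content.to_s.strip.downcase.gsub(/\s+/, ' '))
end

# Highest confidence first, then keep the first entry seen per hash,
# so the surviving duplicate is always the most confident one.
def dedup_and_rank(entries, limit: nil)
  ranked = entries
           .sort_by { |e| -(e[:confidence] || 0) }
           .uniq { |e| e[:content_hash] }
  limit ? ranked.first(limit) : ranked
end

# Invented sample data: one local entry that duplicates a global one
# apart from case and spacing.
def demo_merge
  local = [{ content: 'Legion  is a FRAMEWORK ', confidence: 0.5, source: :local }]
  global = [
    { content: 'legion is a framework', confidence: 0.9, source: :global },
    { content: 'Apollo stores knowledge', confidence: 0.7, source: :global }
  ]
  entries = (local + global).map { |e| e.merge(content_hash: content_hash(e[:content])) }
  dedup_and_rank(entries, limit: 5).map { |e| [e[:source], e[:confidence]] }
end
```

Because sorting happens before `uniq`, the lower-confidence local copy is the one discarded. Reversing the two steps would keep whichever duplicate happened to come first in the input.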

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: legion-apollo
 version: !ruby/object:Gem::Version
-  version: 0.3.0
+  version: 0.3.2
 platform: ruby
 authors:
 - Esity
@@ -66,6 +66,16 @@ files:
 - CHANGELOG.md
 - LICENSE
 - README.md
+- data/self-knowledge/01-what-is-legion.md
+- data/self-knowledge/02-architecture.md
+- data/self-knowledge/03-extensions.md
+- data/self-knowledge/04-security.md
+- data/self-knowledge/05-llm-pipeline.md
+- data/self-knowledge/06-apollo-knowledge.md
+- data/self-knowledge/07-cli-reference.md
+- data/self-knowledge/08-cognitive-layer.md
+- data/self-knowledge/09-teams-integration.md
+- data/self-knowledge/10-deployment.md
 - lib/legion/apollo.rb
 - lib/legion/apollo/helpers/confidence.rb
 - lib/legion/apollo/helpers/similarity.rb
@@ -76,6 +86,8 @@ files:
 - lib/legion/apollo/messages/ingest.rb
 - lib/legion/apollo/messages/query.rb
 - lib/legion/apollo/messages/writeback.rb
+- lib/legion/apollo/runners.rb
+- lib/legion/apollo/runners/request.rb
 - lib/legion/apollo/settings.rb
 - lib/legion/apollo/version.rb
 homepage: https://github.com/LegionIO/legion-apollo