legionio 1.8.6 → 1.8.14

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: fc945e4cbc2538a5a3c3566be37282dd8e532af7c3dfdbe74a43214e2422883c
- data.tar.gz: 3d5a9e859cb710670e4cbdb480956871632c800426262a621d93c69686c10f39
+ metadata.gz: 47d2f43298e4d606c1a6817bddee1bc2e09fe63c78bcae858c45cdc29622ab85
+ data.tar.gz: 6043356cca31ed43034509f3c4e0f6de5b37dc8cc72011c4442cf50c072e392a
  SHA512:
- metadata.gz: cd3531730af067dba6c85f9f19e066c601c6e5ce0678d4f3b1f640a15e18477b5f782294d1c328d0fbed7123f0401f85b45a251566f1bc9b1c765d4c301caff1
- data.tar.gz: 4625e1bdcf1cc4f68c5a7df9621cfd22bfbcfa2a675cd5e34c3182a9bf273b4b92b9ca3e7b23b216d0468bb22d49bbca85739eb0b65dbec0ccfb5c4de6537f9a
+ metadata.gz: 29daf6503fc9a0fa6fd893d50b1dc05ac35040c6ed3b49024cb46df6863efe173a69b87c0bee65a5aaa65f6779b50f16784487cddde26c8ed92dfedc816a12c9
+ data.tar.gz: 90d6980b55176786950d8da78ace2f902c5974f1c2d5887a73eab97b7fb690a2cdb9fde657eeeddd44ee839d42d0fda4bbfe83dda2d9ee8f75cbef08e7b4941b
data/.gitignore CHANGED
@@ -1,6 +1,6 @@
  /.bundle/
  /.yardoc
- /Gemfile.lock
+ Gemfile.lock
  /_yardoc/
  /coverage/
  /doc/
@@ -28,3 +28,8 @@ legionio_wallpaper*.svg
  legionio_overview*
  # git worktrees
  .worktrees/
+ # local-only directories
+ docs/
+ config/tls/
+ # generated integration specs
+ spec/integration/self_generate_spec.rb
data/AGENTS.md ADDED
@@ -0,0 +1,11 @@
+ # AGENTS.md
+
+ Instructions for AI agents working in this repository.
+
+ ## Pre-Commit Requirements
+
+ Always run a full `bundle exec rspec` and `bundle exec rubocop -A` and fix all errors before committing.
+
+ ## Repository Context
+
+ This is the primary gem (`legionio`) of the LegionIO framework. See `CLAUDE.md` for full architecture, file map, and conventions.
data/CHANGELOG.md CHANGED
@@ -2,6 +2,63 @@
 
  ## [Unreleased]
 
+ ## [1.8.14] - 2026-04-18
+
+ ### Fixed
+ - Optional subsystem `LoadError`s (RBAC, Data, LLM, Apollo, Gaia, Telemetry) now log at the caller-specified level instead of always ERROR with a full stack trace — `handle_exception` respects the `level:` kwarg. Fixes #155
+ - `web_fetch` tool in `/api/llm/*` endpoints now delegates to `Legion::CLI::Chat::WebFetch.fetch` instead of bare `Net::HTTP.get`, gaining SSL, redirect-following, HTML-to-markdown conversion, and `maxLength` support. Fixes #153
+ - `web_search` tool in `/api/llm/*` endpoints no longer falls through to the generic "not executable server-side" error — added dispatch branch delegating to `Legion::CLI::Chat::WebSearch.search`. Fixes #154
+
+ ## [1.8.13] - 2026-04-17
+
+ ### Added
+ - `Absorbers::Base#query_knowledge` — scope-aware knowledge retrieval (`:local`, `:global`, `:all`) with deduplication, matching the pattern established by `Helpers::Knowledge`
+
+ ### Fixed
+ - `Absorbers::Base` now routes ingestion by scope: `absorb_to_knowledge`, `absorb_raw`, and `ingest_chunks` resolve `Legion::Apollo::Local` for `:local` scope and `Legion::Apollo` for `:global`, instead of always hitting the global store
+ - Added `apollo_local_available?` and `resolve_apollo_target` private helpers for scope-driven Apollo target selection
+
+ ## [1.8.12] - 2026-04-17
+
+ ### Fixed
+ - `Actors::Subscription` now supports `pattern` class method as a DSL accessor for routing key hints, delegating to `routing_key_hint` — extensions calling `pattern 'some.routing.key'` no longer raise `NoMethodError`. Fixes #143
+ - `Absorbers::Base` removed deprecated `alias handle absorb` — use `#absorb` directly
+ - Generator template (`legion generate absorber`) now emits `def absorb(...)` instead of `def handle(...)`
+ - `Matchers::File` is now required and registered alongside `Matchers::Url` in the absorber loader
+ - Absorber base spec updated to use `#absorb` instead of removed `#handle` alias
+
+ ## [1.8.11] - 2026-04-17
+
+ ### Fixed
+ - `Legion::CLI::Chat::WebFetch` — eliminated all remaining polynomial regex patterns (CodeQL `rb/polynomial-redos`): replaced `convert_blocks!`, `convert_headings!`, `convert_lists!`, `convert_formatting!`, and `strip_remaining_tags!` with index-based tag scanning helpers (`replace_tag_blocks!`, `replace_open_tags!`, `replace_close_tags!`, `replace_self_closing!`). No regex with `[^>]*` or `[^>]+` remains in the HTML-to-markdown pipeline.
+
+ ## [1.8.10] - 2026-04-17
+
+ ### Fixed
+ - `Legion::CLI::Chat::WebFetch#convert_links!` polynomial regex on uncontrolled data (CodeQL `rb/polynomial-redos`) — replaced backtracking `<a[^>]*href=...>` regex with index-based scanner that walks tag boundaries without backtracking
+ - Thor `[WARNING] Attempted to create command` noise during rspec — prepend `RSpec::Mocks::AnyInstance::Recorder` to wrap `observe!`, `mark_invoked!`, `restore_original_method!`, and `remove_dummy_method!` inside `Thor.no_commands_context` when the target class is a Thor subclass
+
+ ## [1.8.9] - 2026-04-17
+
+ ### Fixed
+ - `Legion::DigitalWorker::Registry#emit_blocked` passed positional hash to `Legion::Events.emit` which expects kwargs — caused `ArgumentError` masking intended domain exceptions (`WorkerNotFound`, `WorkerNotActive`, `InsufficientConsent`). Fixes #114
+
+ ### Added
+ - `Legion::Audit::HashChain` now includes `seq` in `CANONICAL_FIELDS` and `verify_chain` detects gaps in sequence numbers, preventing undetected record deletion from the tamper-evident audit chain. Backwards-compatible: gap check is skipped when `seq` is absent. Fixes #149
+
+ ## [1.8.8] - 2026-04-17
+
+ ### Fixed
+ - `Legion::Ingress` code injection (CodeQL `rb/code-injection`) — replaced `Kernel.const_get` with allowlist lookup against registered extension modules; `resolve_runner_class` now only resolves classes present in `loaded_extension_modules` or `local_tasks`
+ - `Legion::Graph::Exporter#to_dot` incomplete string escaping (CodeQL `rb/incomplete-sanitization`) — extracted `dot_escape` helper using char-by-char escaping of backslashes and quotes for DOT labels
+ - `Legion::CLI::Chat::WebFetch#strip_invisible!` polynomial regex / incomplete sanitization / bad tag filter (CodeQL `rb/polynomial-redos`, `rb/incomplete-multi-character-sanitization`, `rb/bad-tag-filter`) — replaced regex `gsub!` with iterative `strip_tag_blocks!` that finds open/close tags by index, eliminating backtracking and handling malformed closing tags
+
+ ## [1.8.7] - 2026-04-17
+
+ ### Fixed
+ - `Legion::CLI::Chat::WebSearch#extract_real_url` incomplete URL substring sanitization (CodeQL `rb/incomplete-url-substring-sanitization`) — replaced `include?('duckduckgo.com')` with `URI.parse` host check using `end_with?`
+ - `Legion::Tools::EmbeddingCache.clear` now flushes L1/L2 cache tiers in addition to L0 memory, preventing stale lookups after clear
+
  ## [1.8.6] - 2026-04-15
 
  ### Added
data/CLAUDE.md CHANGED
@@ -1,7 +1,7 @@
  # LegionIO: Async Job Engine and Task Framework
 
  **Repository Level 3 Documentation**
- - **Parent**: `/Users/miverso2/rubymine/legion/CLAUDE.md`
+ - **Parent**: `../CLAUDE.md`
 
  ## Purpose
 
@@ -9,7 +9,7 @@ The primary gem for the LegionIO framework. An extensible async job engine for s
 
  **GitHub**: https://github.com/LegionIO/LegionIO
  **Gem**: `legionio`
- **Version**: 1.7.34
+ **Version**: 1.8.12
  **License**: Apache-2.0
  **Docker**: `legionio/legion`
  **Ruby**: >= 3.4
@@ -796,6 +796,8 @@ bundle exec rspec # ~3500+ examples, 0 failures
  bundle exec rubocop # 0 offenses
  ```
 
+ **Always run a full `bundle exec rspec` and `bundle exec rubocop -A` and fix all errors before committing.**
+
  Specs use `rack-test` for API testing. `Legion::JSON.load` returns symbol keys — use `body[:data]` not `body['data']` in specs.
 
  ---
data/CODEOWNERS CHANGED
@@ -1,5 +1,5 @@
  # Default owner — all files
- * @Esity
+ * @Esity @LegionIO/core
 
  # Core library code
  # lib/ @Esity @future-core-team
data/README.md CHANGED
@@ -14,7 +14,11 @@ Schedule tasks, chain services into dependency graphs, run them concurrently via
  ╰──────────────────────────────────────╯
  ```
 
- **Ruby >= 3.4** | **v1.7.21** | **Apache-2.0** | [@Esity](https://github.com/Esity)
+ [![Gem Version](https://img.shields.io/gem/v/legionio.svg)](https://rubygems.org/gems/legionio)
+ [![Ruby](https://img.shields.io/badge/ruby-%3E%3D%203.4-red.svg)](https://www.ruby-lang.org/)
+ [![License](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)
+
+ **Ruby >= 3.4** | **v1.8.12** | **Apache-2.0** | [@Esity](https://github.com/Esity)
 
  ---
 
@@ -547,10 +551,34 @@ Each phase registers with `Legion::Readiness`. All phases are individually toggl
  git clone https://github.com/LegionIO/LegionIO.git
  cd LegionIO
  bundle install
- bundle exec rspec # 0 failures
+ bundle exec rspec # ~3500+ examples, 0 failures
  bundle exec rubocop # 0 offenses
  ```
 
+ Always run `bundle exec rspec` and `bundle exec rubocop -A` and fix all errors before committing.
+
+ ### Project Structure
+
+ | Path | Purpose |
+ |------|---------|
+ | `lib/legion.rb` | Entry point: `Legion.start`, `.shutdown`, `.reload` |
+ | `lib/legion/service.rb` | 15-phase startup orchestrator |
+ | `lib/legion/cli.rb` | Thor CLI: 40+ subcommands across two binaries |
+ | `lib/legion/api.rb` | Sinatra REST API with middleware stack |
+ | `lib/legion/extensions/` | LEX discovery, loading, actors, builders |
+ | `lib/legion/tools/` | Canonical tool layer (Registry, Discovery, EmbeddingCache) |
+ | `lib/legion/digital_worker/` | AI-as-labor governance platform |
+ | `lib/legion/cli/chat/` | Interactive AI REPL with 40 tools |
+ | `spec/` | RSpec suite (~3500+ examples) |
+
+ ### Contributing
+
+ 1. Fork the repo and create a feature branch
+ 2. Write specs for new functionality
+ 3. Ensure `bundle exec rspec` passes with 0 failures
+ 4. Ensure `bundle exec rubocop` passes with 0 offenses
+ 5. Open a PR targeting `main`
+
  ## License
 
  Apache-2.0
@@ -2,6 +2,9 @@
 
  require 'securerandom'
  require 'open3'
+ require 'resolv'
+ require 'ipaddr'
+ require 'uri'
 
  module Legion
  class API < Sinatra::Base
@@ -69,9 +72,36 @@ module Legion
  Dir.glob(pattern).first(100).join("\n")
  when 'web_fetch'
  url = kwargs[:url] || kwargs.values.first.to_s
- require 'net/http'
- uri = URI(url)
- Net::HTTP.get(uri)
+ raw_length = (kwargs[:maxLength] || kwargs[:max_length])&.to_i
+ max_length = raw_length&.positive? ? raw_length : nil
+ parsed = begin
+ URI.parse(url)
+ rescue StandardError
+ nil
+ end
+ raise 'Invalid or non-HTTP URL' unless parsed.is_a?(URI::HTTP)
+
+ addr = begin
+ ::Resolv.getaddress(parsed.host)
+ rescue StandardError
+ nil
+ end
+ if addr
+ ip = ::IPAddr.new(addr)
+ raise 'SSRF: private/loopback targets are not permitted' if
+ ip.loopback? || ip.private? || ip.link_local?
+ end
+ require 'legion/cli/chat/web_fetch'
+ content = Legion::CLI::Chat::WebFetch.fetch(url)
+ max_length ? content[0, max_length] : content
+ when 'web_search'
+ query = kwargs[:query] || kwargs.values.first.to_s
+ raw_results = (kwargs[:max_results] || kwargs[:maxResults]).to_i
+ max_results = raw_results.positive? ? [raw_results, 50].min : 5
+ require 'legion/cli/chat/web_search'
+ results = Legion::CLI::Chat::WebSearch.search(query, max_results: max_results,
+ auto_fetch: false)
+ results[:results].map { |r| "### #{r[:title]}\n#{r[:url]}\n#{r[:snippet]}" }.join("\n\n")
  else
  "Tool #{tool_ref} is not executable server-side. Use a legion_ prefixed tool instead."
  end
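The address check in the `web_fetch` branch above can be exercised in isolation. This is a minimal sketch that assumes only Ruby's standard `ipaddr` library; `blocked_address?` is a hypothetical helper name and not part of the gem.

```ruby
require 'ipaddr'

# Hypothetical helper (not in the gem): returns true when the resolved
# address is loopback, RFC 1918 private, or link-local, which is the same
# trio of predicates used in the hunk above.
def blocked_address?(addr)
  ip = IPAddr.new(addr)
  ip.loopback? || ip.private? || ip.link_local?
end

puts blocked_address?('127.0.0.1')   # loopback
puts blocked_address?('10.0.0.5')    # private range
puts blocked_address?('169.254.1.1') # link-local
puts blocked_address?('8.8.8.8')     # public address
```

Note that the hunk resolves the host once for the check and then hands the original URL to the fetcher, so the fetch may perform its own lookup; whether that gap matters depends on how `WebFetch.fetch` resolves hosts, which this diff does not show.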
@@ -7,7 +7,7 @@ module Legion
  module HashChain
  ALGORITHM = 'SHA256'
  GENESIS_HASH = ('0' * 64).freeze
- CANONICAL_FIELDS = %i[principal_id action resource source status detail created_at previous_hash].freeze
+ CANONICAL_FIELDS = %i[seq principal_id action resource source status detail created_at previous_hash].freeze
 
  module_function
 
@@ -23,7 +23,10 @@ module Legion
  def verify_chain(records)
  broken = []
  records.each_cons(2) do |prev, curr|
- broken << { id: curr[:id], expected: prev[:record_hash], got: curr[:previous_hash] } unless curr[:previous_hash] == prev[:record_hash]
+ unless curr[:previous_hash] == prev[:record_hash]
+ broken << { id: curr[:id], type: :broken_link, expected: prev[:record_hash], got: curr[:previous_hash] }
+ end
+ broken << { id: curr[:id], type: :gap, expected_seq: prev[:seq] + 1, got_seq: curr[:seq] } if prev[:seq] && curr[:seq] && curr[:seq] != prev[:seq] + 1
  end
  { valid: broken.empty?, broken_links: broken, records_checked: records.size }
  end
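To illustrate what the new gap check reports, here is a standalone sketch of the same `verify_chain` logic run against hand-made records. The record values are invented for illustration, and hash computation is left out entirely; only the link and sequence comparisons from the hunk above are reproduced.

```ruby
# Standalone copy of the comparison logic from the hunk above; the real
# module also computes record hashes, which this sketch omits.
def verify_chain(records)
  broken = []
  records.each_cons(2) do |prev, curr|
    unless curr[:previous_hash] == prev[:record_hash]
      broken << { id: curr[:id], type: :broken_link, expected: prev[:record_hash], got: curr[:previous_hash] }
    end
    next unless prev[:seq] && curr[:seq] # gap check is skipped when seq is absent

    broken << { id: curr[:id], type: :gap, expected_seq: prev[:seq] + 1, got_seq: curr[:seq] } if curr[:seq] != prev[:seq] + 1
  end
  { valid: broken.empty?, broken_links: broken, records_checked: records.size }
end

# Invented records: the hashes link up, but seq jumps from 1 to 3.
records = [
  { id: 1, seq: 1, record_hash: 'aaa', previous_hash: '000' },
  { id: 3, seq: 3, record_hash: 'ccc', previous_hash: 'aaa' }
]
result = verify_chain(records)
puts result.inspect
```

Records without a `seq` key pass through the gap check untouched, which matches the backwards-compatibility note in the changelog.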
@@ -93,46 +93,175 @@ module Legion
  end
 
  def strip_invisible!(text)
- text.gsub!(%r{<script[^>]*>.*?</script>}mi, '')
- text.gsub!(%r{<style[^>]*>.*?</style>}mi, '')
- text.gsub!(%r{<nav[^>]*>.*?</nav>}mi, '')
- text.gsub!(%r{<footer[^>]*>.*?</footer>}mi, '')
- text.gsub!(/<!--.*?-->/m, '')
+ %w[script style nav footer].each { |tag| strip_tag_blocks!(text, tag) }
+ strip_html_comments!(text)
+ end
+
+ def strip_html_comments!(text)
+ loop do
+ open_idx = text.index('<!--')
+ break unless open_idx
+
+ close_idx = text.index('-->', open_idx + 4)
+ if close_idx
+ text[open_idx..(close_idx + 2)] = ''
+ else
+ text[open_idx..] = ''
+ end
+ end
+ end
+
+ def strip_tag_blocks!(text, tag)
+ loop do
+ open_idx = text.index(/<#{tag}[\s>]/mi)
+ break unless open_idx
+
+ close_pat = %r{</#{tag}\s*>}mi
+ close_match = close_pat.match(text, open_idx)
+ if close_match
+ text[open_idx..(close_match.end(0) - 1)] = ''
+ else
+ text[open_idx..] = ''
+ end
+ end
+ end
+
+ def replace_tag_blocks!(text, tag)
+ loop do
+ open_idx = text.index(/<#{tag}[\s>]/mi)
+ break unless open_idx
+
+ tag_end = text.index('>', open_idx)
+ break unless tag_end
+
+ close_pat = %r{</#{tag}\s*>}mi
+ close_match = close_pat.match(text, tag_end)
+ if close_match
+ inner = text[(tag_end + 1)...close_match.begin(0)]
+ replacement = yield(inner)
+ text[open_idx..(close_match.end(0) - 1)] = replacement
+ else
+ text[open_idx..] = ''
+ end
+ end
+ end
+
+ def replace_open_tags!(text, tag, replacement)
+ loop do
+ idx = text.index(/<#{tag}[\s>]/mi)
+ break unless idx
+
+ close = text.index('>', idx)
+ break unless close
+
+ text[idx..close] = replacement
+ end
+ end
+
+ def replace_close_tags!(text, tag, replacement)
+ pat = %r{</#{tag}\s*>}mi
+ loop do
+ match = pat.match(text)
+ break unless match
+
+ text[match.begin(0)..(match.end(0) - 1)] = replacement
+ end
+ end
+
+ def replace_self_closing!(text, tag, replacement)
+ loop do
+ idx = text.index(%r{<#{tag}[\s>/]}mi)
+ break unless idx
+
+ close = text.index('>', idx)
+ break unless close
+
+ text[idx..close] = replacement
+ end
  end
 
  def convert_headings!(text)
  (1..6).each do |n|
  prefix = '#' * n
- text.gsub!(%r{<h#{n}[^>]*>(.*?)</h#{n}>}mi, "\n#{prefix} \\1\n")
+ replace_tag_blocks!(text, "h#{n}") { |inner| "\n#{prefix} #{inner}\n" }
  end
  end
 
  def convert_links!(text)
- text.gsub!(%r{<a[^>]*href=["']([^"']*)["'][^>]*>(.*?)</a>}mi, '[\\2](\\1)')
+ result = String.new
+ pos = 0
+ while pos < text.length
+ open_idx = text.index(/<a[\s>]/mi, pos)
+ break unless open_idx
+
+ close_idx = text.index(%r{</a\s*>}mi, open_idx)
+ unless close_idx
+ result << text[pos..]
+ pos = text.length
+ break
+ end
+
+ result << text[pos...open_idx]
+
+ tag_end = text.index('>', open_idx)
+ if tag_end && tag_end < close_idx
+ tag = text[open_idx..tag_end]
+ href = tag[/href=["']([^"']*)["']/i, 1]
+ inner = text[(tag_end + 1)...close_idx]
+ result << if href
+ "[#{inner}](#{href})"
+ else
+ inner
+ end
+ else
+ # Malformed opening tag — preserve the inner text up to the closing tag
+ result << text[open_idx...close_idx]
+ end
+
+ close_end = text.index('>', close_idx)
+ pos = close_end ? close_end + 1 : close_idx + 4
+ end
+ result << text[pos..] if pos < text.length
+ text.replace(result)
  end
 
  def convert_lists!(text)
- text.gsub!(%r{<li[^>]*>(.*?)</li>}mi, "\n- \\1")
- text.gsub!(%r{</?[ou]l[^>]*>}mi, "\n")
+ replace_tag_blocks!(text, 'li') { |inner| "\n- #{inner}" }
+ replace_open_tags!(text, 'ul', "\n")
+ replace_close_tags!(text, 'ul', "\n")
+ replace_open_tags!(text, 'ol', "\n")
+ replace_close_tags!(text, 'ol', "\n")
  end
 
  def convert_formatting!(text)
- text.gsub!(%r{<(b|strong)[^>]*>(.*?)</\1>}mi, '**\\2**')
- text.gsub!(%r{<(i|em)[^>]*>(.*?)</\1>}mi, '*\\2*')
- text.gsub!(%r{<code[^>]*>(.*?)</code>}mi, '`\\1`')
+ %w[b strong].each { |t| replace_tag_blocks!(text, t) { |inner| "**#{inner}**" } }
+ %w[i em].each { |t| replace_tag_blocks!(text, t) { |inner| "*#{inner}*" } }
+ replace_tag_blocks!(text, 'code') { |inner| "`#{inner}`" }
  end
 
  def convert_blocks!(text)
- text.gsub!(%r{<pre[^>]*>(.*?)</pre>}mi, "\n```\n\\1\n```\n")
- text.gsub!(%r{<blockquote[^>]*>(.*?)</blockquote>}mi, "\n> \\1\n")
- text.gsub!(/<p[^>]*>/mi, "\n\n")
- text.gsub!(%r{</p>}mi, "\n")
- text.gsub!(%r{<br\s*/?>}, "\n")
- text.gsub!(%r{<hr\s*/?>}, "\n---\n")
+ replace_tag_blocks!(text, 'pre') { |inner| "\n```\n#{inner}\n```\n" }
+ replace_tag_blocks!(text, 'blockquote') { |inner| "\n> #{inner}\n" }
+ replace_open_tags!(text, 'p', "\n\n")
+ replace_close_tags!(text, 'p', "\n")
+ replace_self_closing!(text, 'br', "\n")
+ replace_self_closing!(text, 'hr', "\n---\n")
  end
 
  def strip_remaining_tags!(text)
- text.gsub!(/<[^>]+>/, '')
+ result = String.new(capacity: text.length)
+ pos = 0
+ while pos < text.length
+ open_idx = text.index('<', pos)
+ unless open_idx
+ result << text[pos..]
+ break
+ end
+ result << text[pos...open_idx]
+ close_idx = text.index('>', open_idx)
+ pos = close_idx ? close_idx + 1 : text.length
+ end
+ text.replace(result)
  end
 
  def clean_whitespace(text)
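The index-based scanning that replaces the regexes in this hunk can be tried on its own. The sketch below is a non-mutating variant of `strip_remaining_tags!`; the name `strip_tags` is hypothetical, and the loop body follows the added lines above.

```ruby
# Non-mutating sketch of the scanner in strip_remaining_tags!: walk the
# string with String#index instead of a backtracking regex, copying
# everything that sits outside <...> pairs.
def strip_tags(text)
  result = String.new(capacity: text.length)
  pos = 0
  while pos < text.length
    open_idx = text.index('<', pos)
    unless open_idx
      result << text[pos..] # no more tags: keep the tail
      break
    end
    result << text[pos...open_idx]
    close_idx = text.index('>', open_idx)
    # An unterminated '<' discards the rest of the input, as in the hunk.
    pos = close_idx ? close_idx + 1 : text.length
  end
  result
end

puts strip_tags('<p>Hello <b>world</b></p>')
puts strip_tags('no markup here')
```

One behavior worth knowing: because an unterminated `<` drops everything after it, plain text such as `a < b` is truncated at the `<`.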
@@ -78,7 +78,8 @@ module Legion
  end
 
  def extract_real_url(ddg_url)
- return ddg_url unless ddg_url.include?('duckduckgo.com')
+ uri = URI.parse(ddg_url)
+ return ddg_url unless uri.host&.end_with?('.duckduckgo.com') || uri.host == 'duckduckgo.com'
 
  match = ddg_url.match(/uddg=([^&]+)/)
  return nil unless match
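The difference between the old substring test and the new host test shows up on URLs that merely mention the domain. A small sketch follows, assuming only Ruby's standard `uri` library; `duckduckgo_host?` is a hypothetical name, and the `rescue` is an addition of this sketch rather than something the hunk does.

```ruby
require 'uri'

# Hypothetical predicate mirroring the host comparison in the hunk above.
def duckduckgo_host?(url)
  host = URI.parse(url).host
  return false unless host

  host == 'duckduckgo.com' || host.end_with?('.duckduckgo.com')
rescue URI::InvalidURIError
  false # sketch-only safeguard; the hunk lets parse errors propagate
end

puts duckduckgo_host?('https://duckduckgo.com/l/?uddg=x')         # real host
puts duckduckgo_host?('https://evil.example/?q=duckduckgo.com')   # substring only
puts duckduckgo_host?('https://duckduckgo.com.evil.example/path') # lookalike prefix
```

The leading dot in `'.duckduckgo.com'` is what keeps a host such as `notduckduckgo.com` from matching.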
@@ -398,11 +398,11 @@ module Legion
  pattern :url, #{escaped_pat}
  description 'TODO: describe what this absorber handles'
 
- def handle(url: nil, content: nil, metadata: {}, context: {})
+ def absorb(url: nil, content: nil, metadata: {}, context: {})
  report_progress(message: 'starting absorption')
 
  # TODO: implement content acquisition and processing
- # absorb_to_knowledge(content: text, tags: ['tag'])
+ # absorb_to_knowledge(content: content, tags: ['tag'])
 
  report_progress(message: 'done', percent: 100)
  { success: true }
@@ -57,11 +57,10 @@ module Legion
  def self.emit_blocked(worker_id:, reason:)
  return unless defined?(Legion::Events)
 
- Legion::Events.emit('worker.blocked', {
- worker_id: worker_id,
- reason: reason,
- at: Time.now.utc
- })
+ Legion::Events.emit('worker.blocked',
+ worker_id: worker_id,
+ reason: reason,
+ at: Time.now.utc)
  end
 
  private_class_method :emit_blocked
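Why the braces mattered: in Ruby 3 a hash literal passed positionally is no longer converted to keyword arguments. The sketch below uses a stand-in `emit` method; its signature (one positional name plus `**payload`) is an assumption about `Legion::Events.emit`, which this diff does not show.

```ruby
# Stand-in for an emitter that accepts keyword arguments only.
# The signature is assumed, not taken from the gem.
def emit(event_name, **payload)
  [event_name, payload]
end

# Keyword form, as in the fixed code.
ok = emit('worker.blocked', worker_id: 'w1', reason: :paused)

# Positional-hash form, as in the old code: Ruby 3 treats the hash as a
# second positional argument and raises ArgumentError.
raised = begin
  emit('worker.blocked', { worker_id: 'w1', reason: :paused })
  false
rescue ArgumentError
  true
end

puts ok.inspect
puts raised
```

This is consistent with the changelog's description of an `ArgumentError` masking the intended domain exceptions.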
@@ -37,12 +37,17 @@ module Legion
  raise NotImplementedError, "#{self.class.name} must implement #absorb"
  end
 
- # @deprecated Use #absorb instead
- alias handle absorb
+ # @deprecated Use {#absorb} instead. Will be removed in a future major release.
+ def handle(url: nil, content: nil, metadata: {}, context: {})
+ Legion::Logging.warn("#{self.class.name}#handle is deprecated — use #absorb instead") if defined?(Legion::Logging)
+ absorb(url: url, content: content, metadata: metadata, context: context)
+ end
 
  def absorb_to_knowledge(content:, tags: [], scope: :global, **opts)
  return fallback_absorb(:chunker, content, tags, scope, opts) unless chunker_available?
- return fallback_absorb(:apollo, content, tags, scope, opts) unless apollo_available?
+
+ target = resolve_apollo_target(scope)
+ return fallback_absorb(:apollo, content, tags, scope, opts) unless target
 
  sections = [{ heading: opts.delete(:heading) || 'absorbed',
  content: content,
@@ -54,11 +59,27 @@ module Legion
  end
 
  def absorb_raw(content:, tags: [], scope: :global, **)
- if apollo_available?
- Legion::Apollo.ingest(content: content, tags: Array(tags), scope: scope, **)
+ target = resolve_apollo_target(scope)
+ unless target
+ Legion::Logging.warn("absorb_raw: Apollo not available for scope=#{scope}") if defined?(Legion::Logging)
+ return { success: false, error: :apollo_not_available }
+ end
+
+ target.ingest(content: content, tags: Array(tags), scope: scope, **)
+ end
+
+ def query_knowledge(text:, limit: 5, scope: :all, **)
+ case scope.to_sym
+ when :local
+ return { success: false, error: :apollo_not_available } unless apollo_local_available?
+
+ Legion::Apollo::Local.query(text: text, limit: limit, **)
+ when :global
+ return { success: false, error: :apollo_not_available } unless apollo_available?
+
+ Legion::Apollo.query(text: text, limit: limit, **)
  else
- Legion::Logging.warn('absorb_raw: Apollo not available') if defined?(Legion::Logging)
- { success: false, error: :apollo_not_available }
+ query_all_scopes(text: text, limit: limit, **)
  end
  end
 
@@ -107,6 +128,46 @@ module Legion
  (!Legion::Apollo.respond_to?(:started?) || Legion::Apollo.started?)
  end
 
+ def apollo_local_available?
+ defined?(Legion::Apollo::Local) &&
+ Legion::Apollo::Local.respond_to?(:ingest) &&
+ (!Legion::Apollo::Local.respond_to?(:started?) || Legion::Apollo::Local.started?)
+ rescue NameError
+ false
+ end
+
+ def resolve_apollo_target(scope)
+ case scope.to_sym
+ when :local
+ apollo_local_available? ? Legion::Apollo::Local : nil
+ else
+ apollo_available? ? Legion::Apollo : nil
+ end
+ end
+
+ def query_all_scopes(text:, limit:, **)
+ local_results = apollo_local_available? ? Array((Legion::Apollo::Local.query(text: text, limit: limit, **) || {})[:results]) : []
+ global_results = apollo_available? ? Array((Legion::Apollo.query(text: text, limit: limit, **) || {})[:results]) : []
+
+ if local_results.empty? && global_results.empty? && !apollo_local_available? && !apollo_available?
+ return { success: false, error: :apollo_not_available }
+ end
+
+ seen = {}
+ merged = []
+ local_results.each do |r|
+ key = r[:content_hash] || r[:content]
+ seen[key] = true
+ merged << r
+ end
+ global_results.each do |r|
+ key = r[:content_hash] || r[:content]
+ merged << r unless seen[key]
+ end
+
+ { success: true, results: merged.first(limit), count: [merged.size, limit].min, scope: :all }
+ end
+
  def fallback_absorb(reason, content, tags, scope, opts)
  if defined?(Legion::Logging)
  label = reason == :chunker ? 'lex-knowledge not available' : 'Apollo not available'
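The merge step inside `query_all_scopes` can be isolated from the Apollo calls. This is a sketch of just the deduplication, with invented result hashes; `merge_results` is a hypothetical name.

```ruby
# Sketch of the merge in query_all_scopes: local results win, and a global
# result is dropped when its key (content_hash, falling back to content)
# was already seen among the local results.
def merge_results(local_results, global_results, limit)
  seen = {}
  merged = []
  local_results.each do |r|
    seen[r[:content_hash] || r[:content]] = true
    merged << r
  end
  global_results.each do |r|
    key = r[:content_hash] || r[:content]
    merged << r unless seen[key]
  end
  merged.first(limit)
end

# Invented sample data.
local  = [{ content_hash: 'h1', content: 'alpha', source: :local }]
global = [{ content_hash: 'h1', content: 'alpha', source: :global },
          { content_hash: 'h2', content: 'beta',  source: :global }]
merged = merge_results(local, global, 5)
puts merged.inspect
```

As written in the hunk, keys are recorded only while walking the local results, so two identical entries that both come from the global store would both be kept.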
@@ -124,12 +185,15 @@ module Legion
  end
 
  def ingest_chunks(chunks, embeddings, tags, scope, opts)
+ target = resolve_apollo_target(scope)
+ return unless target
+
  chunks.each_with_index do |chunk, idx|
  vector = embeddings.is_a?(Array) ? embeddings.dig(idx, :vector) : nil
  payload = build_chunk_payload(chunk, tags, opts)
  payload[:embedding] = vector if vector
- Legion::Apollo.ingest(content: payload[:content], tags: payload[:tags],
- scope: scope, **payload.except(:content, :tags))
+ target.ingest(content: payload[:content], tags: payload[:tags],
+ scope: scope, **payload.except(:content, :tags))
  end
  end
 
@@ -2,6 +2,7 @@
 
  require_relative 'absorbers/matchers/base'
  require_relative 'absorbers/matchers/url'
+ require_relative 'absorbers/matchers/file'
  require_relative 'absorbers/base'
  require_relative 'absorbers/pattern_matcher'
 
@@ -20,6 +20,13 @@ module Legion
  define_dsl_accessor :delay_start, default: 0
  define_dsl_accessor :block, default: false
  define_dsl_accessor :prefetch, default: 2
+ define_dsl_accessor :routing_key_hint, default: nil
+
+ def self.pattern(routing_key = nil)
+ return routing_key_hint unless routing_key
+
+ routing_key_hint(routing_key)
+ end
 
  def initialize(**_options)
  super()
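The `pattern` class method above reads or writes the hint depending on whether it receives an argument. The sketch below shows that getter/setter shape end to end. The implementation of `define_dsl_accessor` is not part of this diff, so the version here is a guess written only to make the example runnable, and `ActorBase` and `OrderSubscriber` are hypothetical class names.

```ruby
class ActorBase
  # Assumed implementation: the gem's real define_dsl_accessor is not
  # shown in the diff and may differ.
  def self.define_dsl_accessor(name, default: nil)
    ivar = :"@#{name}"
    define_singleton_method(name) do |value = nil|
      return instance_variable_set(ivar, value) unless value.nil?

      instance_variable_defined?(ivar) ? instance_variable_get(ivar) : default
    end
  end

  define_dsl_accessor :routing_key_hint, default: nil

  # Same body as the method added in the hunk above.
  def self.pattern(routing_key = nil)
    return routing_key_hint unless routing_key

    routing_key_hint(routing_key)
  end
end

# Hypothetical extension class using the DSL.
class OrderSubscriber < ActorBase
  pattern 'orders.created'
end

puts OrderSubscriber.pattern.inspect
puts ActorBase.pattern.inspect
```

In this sketch the value is stored per class, so setting a pattern on one subclass does not leak into the base class or its siblings; whether the gem's accessor behaves the same way is not confirmed by the diff.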
@@ -37,19 +37,37 @@ module Legion
  lines = ['digraph legion_tasks {', ' rankdir=LR;']
 
  graph[:nodes].each do |key, node|
- label = node[:label].gsub('"', '\\"')
+ label = dot_escape(node[:label])
  shape = node[:type] == 'trigger' ? 'box' : 'ellipse'
  lines << " \"#{key}\" [label=\"#{label}\" shape=#{shape}];"
  end
 
  graph[:edges].each do |edge|
- label = edge[:label] && !edge[:label].empty? ? " [label=\"#{edge[:label]}\"]" : ''
+ escaped = dot_escape(edge[:label])
+ label = escaped && !escaped.empty? ? " [label=\"#{escaped}\"]" : ''
  lines << " \"#{edge[:from]}\" -> \"#{edge[:to]}\"#{label};"
  end
 
  lines << '}'
  lines.join("\n")
  end
+
+ private
+
+ def dot_escape(str)
+ return str unless str.is_a?(String)
+
+ result = String.new(capacity: str.length)
+ str.each_char do |ch|
+ escaped = case ch
+ when '\\' then '\\\\'
+ when '"' then '\\"'
+ else ch
+ end
+ result << escaped
+ end
+ result
+ end
  end
  end
  end
@@ -79,18 +79,23 @@ module Legion
 
  Legion::Events.emit('ingress.received', runner_class: rc.to_s, function: fn, source: source)
 
+ resolved_rc = begin
+ resolve_runner_class(rc)
+ rescue InvalidRunnerClass
+ rc
+ end
+
  if local_runner?(rc)
  Legion::Logging.debug "[Ingress] local short-circuit: #{rc}.#{fn}" if defined?(Legion::Logging)
- klass = rc.is_a?(String) ? Kernel.const_get(rc) : rc
  ctx = message.merge(runner_class: rc.to_s, function: fn.to_s)
- return Legion::Context.with_task_context(ctx) { klass.send(fn.to_sym, **message) }
+ return Legion::Context.with_task_context(ctx) { resolved_rc.send(fn.to_sym, **message) }
  end
 
  runner_block = lambda {
  ctx = message.merge(runner_class: rc.to_s, function: fn.to_s)
  Legion::Context.with_task_context(ctx) do
  Legion::Runner.run(
- runner_class: rc,
+ runner_class: resolved_rc,
  function: fn,
  check_subtask: check_subtask,
  generate_task: generate_task,
@@ -127,14 +132,47 @@ module Legion
  def local_runner?(runner_class)
  return false unless defined?(Legion::Extensions) && Legion::Extensions.local_tasks.is_a?(Array)
 
- klass = runner_class.is_a?(String) ? Kernel.const_get(runner_class) : runner_class
+ klass = resolve_runner_class(runner_class)
  Legion::Extensions.local_tasks.any? { |t| t[:runner_module] == klass }
- rescue NameError
+ rescue NameError, InvalidRunnerClass
  false
  end
 
  private
 
+ def resolve_runner_class(runner_class)
+ return runner_class unless runner_class.is_a?(String)
+
+ raise InvalidRunnerClass, "invalid runner_class format: #{runner_class}" unless runner_class.match?(RUNNER_CLASS_PATTERN)
+
+ resolved = registered_runner_modules[runner_class]
+ raise InvalidRunnerClass, "unregistered runner_class: #{runner_class}" unless resolved
+
+ resolved
+ end
+
+ def registered_runner_modules
+ return @registered_runner_modules if defined?(@registered_runner_modules) && @registered_runner_modules
+
+ modules = {}
+ if defined?(Legion::Extensions) && Legion::Extensions.respond_to?(:loaded_extension_modules)
+ Legion::Extensions.loaded_extension_modules.each do |mod|
+ modules[mod.to_s] = mod
+ end
+ end
+ if defined?(Legion::Extensions) && Legion::Extensions.local_tasks.is_a?(Array)
+ Legion::Extensions.local_tasks.each do |t|
+ mod = t[:runner_module]
+ modules[mod.to_s] = mod if mod
+ end
+ end
+ @registered_runner_modules = modules
+ end
+
+ def reset_runner_cache!
+ @registered_runner_modules = nil
+ end
+
  def parse_payload(payload)
  case payload
  when Hash
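The allowlist idea behind `resolve_runner_class` is that a string is only ever used as a lookup key, never handed to `const_get`. Here is a minimal sketch under stated assumptions: the value of `RUNNER_CLASS_PATTERN` is not shown in this diff, so the regex below is a guess, and `Demo::Runner` and `REGISTERED` are hypothetical stand-ins for the modules the gem collects from loaded extensions.

```ruby
class InvalidRunnerClass < StandardError; end

# Assumed pattern: the gem's actual RUNNER_CLASS_PATTERN is not in the diff.
RUNNER_CLASS_PATTERN = /\A[A-Z]\w*(::[A-Z]\w*)*\z/

# Hypothetical registered module and registry.
module Demo
  module Runner; end
end
REGISTERED = { 'Demo::Runner' => Demo::Runner }.freeze

# Resolve strings through the registry only; anything not registered is
# rejected, even when it names a real constant such as Kernel.
def resolve_runner_class(name, registry = REGISTERED)
  return name unless name.is_a?(String)
  raise InvalidRunnerClass, "invalid runner_class format: #{name}" unless name.match?(RUNNER_CLASS_PATTERN)

  registry.fetch(name) { raise InvalidRunnerClass, "unregistered runner_class: #{name}" }
end

puts resolve_runner_class('Demo::Runner')

begin
  resolve_runner_class('Kernel')
rescue InvalidRunnerClass => e
  puts e.message
end
```

A name can pass the format check and still be refused, which is the property the CodeQL `rb/code-injection` fix relies on.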
@@ -172,6 +172,7 @@ module Legion
 
  def clear
  clear_memory
+ clear_cache_tiers
  rescue StandardError => e
  handle_exception(e, level: :warn, handled: true, operation: :embedding_cache_clear)
  end
@@ -244,6 +245,13 @@ module Legion
  false
  end
 
+ def clear_cache_tiers
+ Legion::Cache.local.flush if cache_local_available? && Legion::Cache.local.respond_to?(:flush)
+ Legion::Cache.flush if cache_global_available? && Legion::Cache.respond_to?(:flush)
+ rescue StandardError => e
+ handle_exception(e, level: :debug, handled: true, operation: :clear_cache_tiers)
+ end
+
  # --- Cache tier helpers ---
  def cache_local_get(key)
  return nil unless cache_local_available?
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Legion
- VERSION = '1.8.6'
+ VERSION = '1.8.14'
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: legionio
  version: !ruby/object:Gem::Version
- version: 1.8.6
+ version: 1.8.14
  platform: ruby
  authors:
  - Esity
@@ -406,6 +406,7 @@ files:
  - ".github/workflows/publish-homebrew.yml"
  - ".gitignore"
  - ".rubocop.yml"
+ - AGENTS.md
  - CHANGELOG.md
  - CLAUDE.md
  - CODEOWNERS
@@ -417,9 +418,6 @@ files:
  - completions/_legionio
  - completions/legion.bash
  - completions/legionio.bash
- - config/tls/README.md
- - config/tls/generate-certs.sh
- - config/tls/settings-tls.json
  - deploy/helm/legion/Chart.yaml
  - deploy/helm/legion/templates/_helpers.tpl
  - deploy/helm/legion/templates/deployment-api.yaml
@@ -430,7 +428,6 @@ files:
430
428
  - deploy/helm/legion/templates/serviceaccount.yaml
431
429
  - deploy/helm/legion/values.yaml
432
430
  - docker_deploy.rb
433
- - docs/README.md
434
431
  - exe/legion
435
432
  - exe/legionio
436
433
  - extensions-agentic/lex-consent/db/migrations/001_create_consent_maps.rb
data/config/tls/README.md DELETED
@@ -1,31 +0,0 @@
1
- # LegionIO TLS Configuration
2
-
3
- Quick-start guide for enabling TLS on all LegionIO components.
4
-
5
- ## Generating Dev Certificates
6
-
7
- ```bash
8
- sudo ./generate-certs.sh /etc/legionio/tls
9
- ```
10
-
11
- Requires `openssl` in PATH. Creates:
12
- - `ca.pem` / `ca.key` — self-signed CA
13
- - `server.crt` / `server.key` — server certificate (localhost + 127.0.0.1 SAN)
14
- - `client.crt` / `client.key` — client certificate
15
-
16
- ## Applying the Settings
17
-
18
- Copy `settings-tls.json` to your LegionIO settings directory
19
- (`~/legionio/settings/` or `/etc/legionio/settings/`) and adjust paths.
20
-
21
- Feature flags (default false — plain connections preserved unless enabled):
22
- - `data.tls.enabled` — enables TLS for PostgreSQL/MySQL
23
- - `api.tls.enabled` — enables TLS for the Puma HTTP API
24
-
25
- ## Validating
26
-
27
- ```bash
28
- legion doctor
29
- ```
30
-
31
- The TLS doctor check verifies: TLS enabled/verify mode, cert file existence, sslmode correctness.
data/config/tls/generate-certs.sh DELETED
@@ -1,64 +0,0 @@
1
- #!/usr/bin/env bash
2
- set -euo pipefail
3
-
4
- # Generates a self-signed CA and service certificates for local TLS development.
5
- # Usage: ./generate-certs.sh [output-dir]
6
- # Default output-dir: /etc/legionio/tls
7
-
8
- OUTPUT_DIR="${1:-/etc/legionio/tls}"
9
- DAYS=365
10
- CA_CN="LegionIO Dev CA"
11
- SERVER_CN="legionio-server"
12
- CLIENT_CN="legionio-client"
13
-
14
- mkdir -p "${OUTPUT_DIR}"
15
-
16
- echo "Generating CA key and certificate..."
17
- openssl genrsa -out "${OUTPUT_DIR}/ca.key" 4096
18
- openssl req -new -x509 \
19
- -key "${OUTPUT_DIR}/ca.key" \
20
- -out "${OUTPUT_DIR}/ca.pem" \
21
- -days "${DAYS}" \
22
- -subj "/CN=${CA_CN}/O=LegionIO/OU=Dev"
23
-
24
- echo "Generating server key and CSR..."
25
- openssl genrsa -out "${OUTPUT_DIR}/server.key" 2048
26
- openssl req -new \
27
- -key "${OUTPUT_DIR}/server.key" \
28
- -out "${OUTPUT_DIR}/server.csr" \
29
- -subj "/CN=${SERVER_CN}/O=LegionIO/OU=Dev"
30
-
31
- echo "Signing server certificate with CA..."
32
- openssl x509 -req \
33
- -in "${OUTPUT_DIR}/server.csr" \
34
- -CA "${OUTPUT_DIR}/ca.pem" \
35
- -CAkey "${OUTPUT_DIR}/ca.key" \
36
- -CAcreateserial \
37
- -out "${OUTPUT_DIR}/server.crt" \
38
- -days "${DAYS}" \
39
- -extfile <(printf "subjectAltName=DNS:localhost,IP:127.0.0.1")
40
-
41
- echo "Generating client key and CSR..."
42
- openssl genrsa -out "${OUTPUT_DIR}/client.key" 2048
43
- openssl req -new \
44
- -key "${OUTPUT_DIR}/client.key" \
45
- -out "${OUTPUT_DIR}/client.csr" \
46
- -subj "/CN=${CLIENT_CN}/O=LegionIO/OU=Dev"
47
-
48
- echo "Signing client certificate with CA..."
49
- openssl x509 -req \
50
- -in "${OUTPUT_DIR}/client.csr" \
51
- -CA "${OUTPUT_DIR}/ca.pem" \
52
- -CAkey "${OUTPUT_DIR}/ca.key" \
53
- -CAcreateserial \
54
- -out "${OUTPUT_DIR}/client.crt" \
55
- -days "${DAYS}"
56
-
57
- chmod 600 "${OUTPUT_DIR}"/*.key
58
- rm -f "${OUTPUT_DIR}"/*.csr "${OUTPUT_DIR}"/*.srl
59
-
60
- echo ""
61
- echo "Certificates written to ${OUTPUT_DIR}:"
62
- ls -lh "${OUTPUT_DIR}"
63
- echo ""
64
- echo "Reference these paths in settings-tls.json or your legionio settings JSON."
data/config/tls/settings-tls.json DELETED
@@ -1,43 +0,0 @@
1
- {
2
- "transport": {
3
- "connection": {
4
- "port": 5671
5
- },
6
- "tls": {
7
- "enabled": true,
8
- "verify": "peer",
9
- "ca": "/etc/legionio/tls/ca.pem",
10
- "cert": "/etc/legionio/tls/client.crt",
11
- "key": "/etc/legionio/tls/client.key"
12
- }
13
- },
14
- "data": {
15
- "adapter": "postgres",
16
- "tls": {
17
- "enabled": true,
18
- "sslmode": "verify-full",
19
- "ca": "/etc/legionio/tls/ca.pem",
20
- "cert": "/etc/legionio/tls/client.crt",
21
- "key": "/etc/legionio/tls/client.key"
22
- }
23
- },
24
- "cache": {
25
- "adapter": "redis",
26
- "tls": {
27
- "enabled": true,
28
- "verify": "peer",
29
- "ca": "/etc/legionio/tls/ca.pem"
30
- }
31
- },
32
- "api": {
33
- "port": 4567,
34
- "bind": "0.0.0.0",
35
- "tls": {
36
- "enabled": true,
37
- "cert": "/etc/legionio/tls/server.crt",
38
- "key": "/etc/legionio/tls/server.key",
39
- "ca": "/etc/legionio/tls/ca.pem",
40
- "verify": "peer"
41
- }
42
- }
43
- }
data/docs/README.md DELETED
@@ -1,6 +0,0 @@
1
- # Moved
2
-
3
- All documentation has been consolidated into the workspace-level docs repo:
4
- `/Users/miverso2/rubymine/legion/docs/`
5
-
6
- That repo is local-only (no remote) to prevent accidental leakage of private content.