legion-apollo 0.3.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 63a18f390ea4ed531450615fbff3877951d39b6a4d60bdf5fbee72a2572f3817
+   data.tar.gz: 9220f43c2c962e936d82257b8e944fb9b82fd7e1c306fbb8462a3323e6e37de6
+ SHA512:
+   metadata.gz: '08e857932f343c0b7a5d8ec98abb1e2c2bbb4c4ba3953ef9ae33927b6a0f1861901f8646f900fba9fc233f99737805b355fa0f0176c11771fc664fe97ee54214'
+   data.tar.gz: 1d2bc1a4f9d9bf637731c48061bda8e442316feeb4a0eb29ccf0546d20ea1363be795c737079730706707659b1beef7cb81de63087ff4944084c25a9320e9eda
data/CHANGELOG.md ADDED
@@ -0,0 +1,21 @@
+ # Changelog
+
+ ## [0.3.0] - 2026-03-25
+
+ ### Added
+ - `Legion::Apollo::Local` — node-local knowledge store backed by SQLite + FTS5
+ - Local settings defaults (retention_years, default_query_scope, fts_candidate_multiplier)
+ - SQLite migration with FTS5 virtual table for full-text search
+ - Ingest with content hash dedup, optional LLM embedding, configurable TTL (5-year default)
+ - Query with FTS5 keyword search, tag filtering, confidence gating, cosine rerank
+ - `embedded_at` column for future embedding backfill identification
+ - `.local` accessor on `Legion::Apollo` module
+
+ ## [0.2.1] - 2026-03-25
+
+ ### Added
+ - Initial gem scaffold: `Legion::Apollo` public API (`start`, `shutdown`, `query`, `ingest`, `retrieve`)
+ - `Legion::Apollo::Settings` with default configuration values
+ - Transport message envelope classes: `Ingest`, `Query`, `Writeback`, `AccessBoost`
+ - Helper modules: `Confidence` constants, `Similarity` math, `TagNormalizer`
+ - Smart routing: co-located lex-apollo service, RabbitMQ transport, graceful failure
data/LICENSE ADDED
@@ -0,0 +1,168 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship made available under
36
+ the License, as indicated by a copyright notice that is included in
37
+ or attached to the work (an example is provided in the Appendix below).
38
+
39
+ "Derivative Works" shall mean any work, whether in Source or Object
40
+ form, that is based on (or derived from) the Work and for which the
41
+ editorial revisions, annotations, elaborations, or other transformations
42
+ represent, as a whole, an original work of authorship. For the purposes
43
+ of this License, Derivative Works shall not include works that remain
44
+ separable from, or merely link (or bind by name) to the interfaces of,
45
+ the Work and the Derivative Works thereof.
46
+
47
+ "Contribution" shall mean, as submitted to the Licensor for inclusion
48
+ in the Work by the copyright owner or by an individual or Legal Entity
49
+ authorized to submit on behalf of the copyright owner. For the purposes
50
+ of this definition, "submitted" means any form of electronic, verbal, or
51
+ written communication sent to the Licensor or its representatives,
52
+ including but not limited to communication on electronic mailing lists,
53
+ source code control systems, and issue tracking systems that are managed
54
+ by, or on behalf of, the Licensor for the purpose of developing and
55
+ improving the Work, but excluding communication that is conspicuously
56
+ marked or designated in writing by the copyright owner as "Not a
57
+ Contribution."
58
+
59
+ "Contributor" shall mean Licensor and any Legal Entity on behalf of
60
+ whom a Contribution has been received by the Licensor and included
61
+ within the Work.
62
+
63
+ 2. Grant of Copyright License. Subject to the terms and conditions of
64
+ this License, each Contributor hereby grants to You a perpetual,
65
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
66
+ copyright license to reproduce, prepare Derivative Works of,
67
+ publicly display, publicly perform, sublicense, and distribute the
68
+ Work and such Derivative Works in Source or Object form.
69
+
70
+ 3. Grant of Patent License. Subject to the terms and conditions of
71
+ this License, each Contributor hereby grants to You a perpetual,
72
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
73
+ (except as stated in this section) patent license to make, have made,
74
+ use, offer to sell, sell, import, and otherwise transfer the Work,
75
+ where such license applies only to those patent claims licensable
76
+ by such Contributor that are necessarily infringed by their
77
+ Contribution(s) alone or by the combined work (in which such
78
+ Contribution(s) were submitted). If You institute patent litigation
79
+ against any entity (including a cross-claim or counterclaim in a
80
+ lawsuit) alleging that the Work or a Contribution incorporated
+ within the Work constitutes direct or contributory patent
+ infringement, then any patent licenses granted to You under this
+ License for that Work shall terminate as of the date such
+ litigation is filed.
85
+
86
+ 4. Redistribution. You may reproduce and distribute copies of the
87
+ Work or Derivative Works thereof in any medium, with or without
88
+ modifications, and in Source or Object form, provided that You
89
+ meet the following conditions:
90
+
91
+ (a) You must give any other recipients of the Work or Derivative
92
+ Works a copy of this License; and
93
+
94
+ (b) You must cause any modified files to carry prominent notices
95
+ stating that You changed the files; and
96
+
97
+ (c) You must retain, in the Source form of any Derivative Works
98
+ that You distribute, all copyright, patent, trademark, and
99
+ attribution notices from the Source form of the Work,
100
+ excluding those notices that do not pertain to any part of
101
+ the Derivative Works; and
102
+
103
+ (d) If the Work includes a "NOTICE" text file as part of its
104
+ distribution, You must include a readable copy of the
105
+ attribution notices contained within such NOTICE file, in
106
+ at least one of the following places: within a NOTICE text
107
+ file distributed as part of the Derivative Works; within
108
+ the Source form or documentation, if provided along with the
109
+ Derivative Works; or, within a display generated by the
110
+ Derivative Works, if and wherever such third-party notices
111
+ normally appear. The contents of the NOTICE file are for
112
+ informational purposes only and do not modify the License.
113
+ You may add Your own attribution notices within Derivative
114
+ Works that You distribute, alongside or in addition to the
115
+ NOTICE text from the Work, provided that such additional
116
+ attribution notices cannot be construed as modifying the
117
+ License.
118
+
119
+ You may add Your own license statement for Your modifications and
120
+ may provide additional grant of rights to use, copy, modify, merge,
121
+ publish, distribute, sublicense, and/or sell copies of the
122
+ Contribution, and to permit persons to whom the Contribution is
123
+ furnished to do so.
124
+
125
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
126
+ any Contribution intentionally submitted for inclusion in the Work
127
+ by You to the Licensor shall be under the terms and conditions of
128
+ this License, without any additional terms or conditions.
129
+ Notwithstanding the above, nothing herein shall supersede or modify
130
+ the terms of any separate license agreement you may have executed
131
+ with Licensor regarding such Contributions.
132
+
133
+ 6. Trademarks. This License does not grant permission to use the trade
134
+ names, trademarks, service marks, or product names of the Licensor,
135
+ except as required for reasonable and customary use in describing the
136
+ origin of the Work and reproducing the content of the NOTICE file.
137
+
138
+ 7. Disclaimer of Warranty. Unless required by applicable law or
139
+ agreed to in writing, Licensor provides the Work (and each
140
+ Contributor provides its Contributions) on an "AS IS" BASIS,
141
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
142
+ implied, including, without limitation, any warranties or conditions
143
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
144
+ PARTICULAR PURPOSE. You are solely responsible for determining the
145
+ appropriateness of using or reproducing the Work and assume any
146
+ risks associated with Your exercise of permissions under this License.
147
+
148
+ 8. Limitation of Liability. In no event and under no legal theory,
149
+ whether in tort (including negligence), contract, or otherwise,
150
+ unless required by applicable law (such as deliberate and grossly
151
+ negligent acts) or agreed to in writing, shall any Contributor be
152
+ liable to You for damages, including any direct, indirect, special,
153
+ incidental, or exemplary damages of any character arising as a
154
+ result of this License or out of the use or inability to use the
155
+ Work (including but not limited to damages for loss of goodwill,
156
+ work stoppage, computer failure or malfunction, or all other
157
+ commercial damages or losses), even if such Contributor has been
158
+ advised of the possibility of such damages.
159
+
160
+ 9. Accepting Warranty or Additional Liability. While redistributing
161
+ the Work or Derivative Works thereof, You may choose to offer, and
162
+ charge a fee for, acceptance of support, warranty, indemnity, or
163
+ other liability obligations and/or rights consistent with this
164
+ License. However, in accepting such obligations, You may offer only
165
+ conditions consistent with the terms of this License, and may not
166
+ impose any additional obligations.
167
+
168
+ END OF TERMS AND CONDITIONS
data/README.md ADDED
@@ -0,0 +1,18 @@
+ # legion-apollo
+
+ Apollo client library for the LegionIO framework.
+
+ Provides `query`, `ingest`, and `retrieve` with smart routing: co-located lex-apollo service, RabbitMQ transport, or graceful failure.
+
+ ## Usage
+
+ ```ruby
+ Legion::Apollo.start
+
+ Legion::Apollo.ingest(content: 'Some knowledge', tags: %w[fact ruby])
+ results = Legion::Apollo.query(text: 'tell me about ruby', limit: 5)
+ ```
+
+ ## License
+
+ Apache-2.0
data/lib/legion/apollo/helpers/confidence.rb ADDED
@@ -0,0 +1,52 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Helpers
+       # Confidence constants and predicate helpers for Apollo knowledge entries.
+       # DB-dependent methods live in lex-apollo; only pure-function logic here.
+       module Confidence
+         INITIAL_CONFIDENCE = 0.5
+         CORROBORATION_BOOST = 0.15
+         CONTRADICTION_PENALTY = 0.20
+         DECAY_RATE = 0.005
+         WRITE_GATE_THRESHOLD = 0.3
+         HIGH_CONFIDENCE = 0.8
+         ARCHIVE_THRESHOLD = 0.1
+
+         STATUSES = %i[pending confirmed disputed deprecated archived].freeze
+
+         CONTENT_TYPES = %i[
+           fact observation hypothesis procedure opinion
+           question answer summary analysis synthesis
+         ].freeze
+
+         module_function
+
+         def valid_status?(status)
+           STATUSES.include?(status&.to_sym)
+         end
+
+         def valid_content_type?(type)
+           CONTENT_TYPES.include?(type&.to_sym)
+         end
+
+         def above_write_gate?(confidence)
+           confidence.to_f >= WRITE_GATE_THRESHOLD
+         end
+
+         def high_confidence?(confidence)
+           confidence.to_f >= HIGH_CONFIDENCE
+         end
+
+         def apollo_setting(key, default)
+           return default unless defined?(Legion::Settings) && !Legion::Settings[:apollo].nil?
+
+           Legion::Settings[:apollo][key] || default
+         rescue StandardError
+           default
+         end
+       end
+     end
+   end
+ end
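Review note (illustrative, not part of the package): the two gate predicates above coerce their argument with `to_f`, so `nil` and numeric strings pass through without raising. The standalone sketch below copies the two thresholds and buckets a few scores. The `bucket` helper is hypothetical and exists only for this example.

```ruby
# Standalone sketch: constants copied from Helpers::Confidence so the
# example runs without the gem installed.
WRITE_GATE_THRESHOLD = 0.3
HIGH_CONFIDENCE = 0.8

# Hypothetical helper (not in the gem): name the band a score falls into.
def bucket(confidence)
  score = confidence.to_f # nil.to_f is 0.0, '0.9'.to_f is 0.9
  return :high if score >= HIGH_CONFIDENCE
  return :writable if score >= WRITE_GATE_THRESHOLD

  :below_gate
end

puts [nil, 0.3, '0.9', 0.79].map { |c| bucket(c) }.inspect
# prints [:below_gate, :writable, :high, :writable]
```

Both comparisons are inclusive, so a score exactly at 0.3 clears the write gate.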
data/lib/legion/apollo/helpers/similarity.rb ADDED
@@ -0,0 +1,47 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Helpers
+       # Pure cosine similarity math and match classification for Apollo vectors.
+       module Similarity
+         EXACT_MATCH_THRESHOLD = 0.95
+         HIGH_SIMILARITY_THRESHOLD = 0.85
+         CORROBORATION_THRESHOLD = 0.75
+         RELATED_THRESHOLD = 0.5
+
+         module_function
+
+         def cosine_similarity(vec_a, vec_b) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
+           return 0.0 if vec_a.nil? || vec_b.nil? || vec_a.empty? || vec_b.empty?
+           return 0.0 unless vec_a.size == vec_b.size
+
+           dot = 0.0
+           mag_a = 0.0
+           mag_b = 0.0
+
+           vec_a.size.times do |i|
+             a = vec_a[i].to_f
+             b = vec_b[i].to_f
+             dot += a * b
+             mag_a += a * a
+             mag_b += b * b
+           end
+
+           denom = Math.sqrt(mag_a) * Math.sqrt(mag_b)
+           denom.zero? ? 0.0 : (dot / denom)
+         end
+
+         def classify_match(similarity)
+           case similarity
+           when EXACT_MATCH_THRESHOLD..Float::INFINITY then :exact
+           when HIGH_SIMILARITY_THRESHOLD...EXACT_MATCH_THRESHOLD then :high
+           when CORROBORATION_THRESHOLD...HIGH_SIMILARITY_THRESHOLD then :corroboration
+           when RELATED_THRESHOLD...CORROBORATION_THRESHOLD then :related
+           else :unrelated
+           end
+         end
+       end
+     end
+   end
+ end
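Review note (illustrative, not part of the package): a standalone sketch of the cosine math and the threshold bands in this file. The method names echo the hunk, but the `clamp` step is an addition for the example. It is there because floating-point rounding can leave a cosine score a hair above 1.0, and a range that ends at exactly 1.0 does not cover such a value.

```ruby
# Standalone sketch; thresholds copied from Helpers::Similarity.
def cosine_similarity(vec_a, vec_b)
  return 0.0 if vec_a.nil? || vec_b.nil? || vec_a.empty? || vec_a.size != vec_b.size

  dot = vec_a.zip(vec_b).sum { |a, b| a.to_f * b.to_f }
  mag_a = Math.sqrt(vec_a.sum { |a| a.to_f**2 })
  mag_b = Math.sqrt(vec_b.sum { |b| b.to_f**2 })
  denom = mag_a * mag_b
  denom.zero? ? 0.0 : dot / denom
end

# Assumption for this sketch: clamp to [-1, 1] before banding the score.
def classify(similarity)
  score = similarity.to_f.clamp(-1.0, 1.0)
  if score >= 0.95 then :exact
  elsif score >= 0.85 then :high
  elsif score >= 0.75 then :corroboration
  elsif score >= 0.5 then :related
  else :unrelated
  end
end

slightly_over = 1.0000000000000002 # smallest Float above 1.0
puts (0.95..1.0).cover?(slightly_over) # false
puts classify(slightly_over)           # exact
puts classify(cosine_similarity([1, 0], [0, 1])) # unrelated
```

Clamping keeps the bands closed over the mathematically valid range without widening the top band to infinity. Either approach avoids classifying a near-identical vector as unrelated.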
data/lib/legion/apollo/helpers/tag_normalizer.rb ADDED
@@ -0,0 +1,33 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Helpers
+       # Pure-function tag normalization: lowercase, strip invalid chars, dedup, truncate.
+       module TagNormalizer
+         MAX_TAG_LENGTH = 64
+         MAX_TAGS = 20
+
+         module_function
+
+         def normalize(tags)
+           return [] unless tags.is_a?(Array)
+
+           tags
+             .map { |t| normalize_one(t) }
+             .compact
+             .uniq
+             .first(MAX_TAGS)
+         end
+
+         def normalize_one(tag)
+           return nil if tag.nil?
+
+           normalized = tag.to_s.strip.downcase.gsub(/[^a-z0-9_:-]/, '_').squeeze('_')
+           normalized = normalized[0, MAX_TAG_LENGTH] if normalized.length > MAX_TAG_LENGTH
+           normalized.empty? ? nil : normalized
+         end
+       end
+     end
+   end
+ end
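Review note (illustrative, not part of the package): the normalization pipeline above, reproduced as top-level methods so it runs without the gem. The inputs are made up for the example. In this diff `Local.ingest` stores `Array(tags)` as given, so a caller who wants normalized tags would presumably apply the helper first.

```ruby
# Standalone sketch; logic copied from Helpers::TagNormalizer.
MAX_TAG_LENGTH = 64
MAX_TAGS = 20

def normalize_one(tag)
  return nil if tag.nil?

  # Lowercase, replace anything outside [a-z0-9_:-] with '_', collapse runs of '_'.
  normalized = tag.to_s.strip.downcase.gsub(/[^a-z0-9_:-]/, '_').squeeze('_')
  normalized = normalized[0, MAX_TAG_LENGTH]
  normalized.empty? ? nil : normalized
end

def normalize(tags)
  return [] unless tags.is_a?(Array)

  tags.map { |t| normalize_one(t) }.compact.uniq.first(MAX_TAGS)
end

puts normalize(['Ruby Lang', ' ruby_lang ', 'C++', nil, 'team:core']).inspect
# prints ["ruby_lang", "c_", "team:core"]
puts normalize_one('!!!').inspect
# prints "_"
```

A tag made only of invalid characters collapses to a single underscore rather than being dropped, because the emptiness check runs after substitution.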
data/lib/legion/apollo/local/migrations/001_create_local_knowledge.rb ADDED
@@ -0,0 +1,34 @@
+ # frozen_string_literal: true
+
+ Sequel.migration do # rubocop:disable Metrics/BlockLength
+   up do
+     create_table(:local_knowledge) do
+       primary_key :id
+       String :content, null: false, text: true
+       String :content_hash, null: false, size: 32
+       String :tags, text: true
+       String :embedding, text: true
+       String :embedded_at
+       String :source_channel
+       String :source_agent
+       String :submitted_by
+       Float :confidence, default: 1.0
+       String :expires_at, null: false
+       String :created_at, null: false
+       String :updated_at, null: false
+
+       unique :content_hash, name: :idx_local_knowledge_hash
+       index :expires_at, name: :idx_local_knowledge_expires
+       index :embedded_at, name: :idx_local_knowledge_embedded
+     end
+
+     fts_sql = 'CREATE VIRTUAL TABLE IF NOT EXISTS local_knowledge_fts ' \
+               "USING fts5(content, tags, content='local_knowledge', content_rowid='id')"
+     run fts_sql
+   end
+
+   down do
+     run 'DROP TABLE IF EXISTS local_knowledge_fts'
+     drop_table(:local_knowledge) if table_exists?(:local_knowledge)
+   end
+ end
data/lib/legion/apollo/local.rb ADDED
@@ -0,0 +1,244 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'digest'
4
+ require 'time'
5
+
6
+ module Legion
7
+ module Apollo
8
+ # Node-local knowledge store backed by SQLite + FTS5.
9
+ # Mirrors Legion::Apollo's public API but stores locally.
10
+ module Local # rubocop:disable Metrics/ModuleLength
11
+ MIGRATION_PATH = File.expand_path('local/migrations', __dir__).freeze
12
+
13
+ class << self # rubocop:disable Metrics/ClassLength
14
+ def start
15
+ return if @started
16
+ return unless local_enabled?
17
+ return unless data_local_available?
18
+
19
+ Legion::Data::Local.register_migrations(name: :apollo_local, path: MIGRATION_PATH)
20
+ @started = true
21
+ Legion::Logging.info 'Legion::Apollo::Local started' if defined?(Legion::Logging)
22
+ end
23
+
24
+ def shutdown
25
+ @started = false
26
+ Legion::Logging.info 'Legion::Apollo::Local shutdown' if defined?(Legion::Logging)
27
+ end
28
+
29
+ def started?
30
+ @started == true
31
+ end
32
+
33
+ def ingest(content:, tags: [], **opts) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
34
+ return not_started_error unless started?
35
+
36
+ hash = content_hash(content)
37
+ return { success: true, mode: :deduplicated } if duplicate?(hash)
38
+
39
+ embedding, embedded_at = generate_embedding(content)
40
+ now = Time.now.utc.strftime('%Y-%m-%dT%H:%M:%S.%LZ')
41
+ expires = compute_expires_at
42
+
43
+ row = {
44
+ content: content,
45
+ content_hash: hash,
46
+ tags: Legion::JSON.dump(Array(tags).first(local_setting(:max_tags, 20))),
47
+ embedding: embedding ? Legion::JSON.dump(embedding) : nil,
48
+ embedded_at: embedded_at,
49
+ source_channel: opts[:source_channel],
50
+ source_agent: opts[:source_agent],
51
+ submitted_by: opts[:submitted_by],
52
+ confidence: opts[:confidence] || 1.0,
53
+ expires_at: expires,
54
+ created_at: now,
55
+ updated_at: now
56
+ }
57
+
58
+ id = db[:local_knowledge].insert(row)
59
+ sync_fts(id, content, row[:tags])
60
+
61
+ { success: true, mode: :local, id: id }
62
+ rescue StandardError => e
63
+ { success: false, error: e.message }
64
+ end
65
+
66
+ def query(text:, limit: nil, min_confidence: nil, tags: nil, **) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
67
+ return not_started_error unless started?
68
+
69
+ limit ||= local_setting(:default_limit, 5)
70
+ min_confidence ||= local_setting(:min_confidence, 0.3)
71
+ multiplier = local_setting(:fts_candidate_multiplier, 3)
72
+
73
+ candidates = fts_search(text, limit: limit * multiplier)
74
+ candidates = filter_candidates(candidates, min_confidence: min_confidence, tags: tags)
75
+ candidates = cosine_rerank(text, candidates) if can_rerank?
76
+ results = candidates.first(limit)
77
+
78
+ { success: true, results: results, count: results.size, mode: :local }
79
+ rescue StandardError => e
80
+ { success: false, error: e.message }
81
+ end
82
+
83
+ def retrieve(text:, limit: 5, **)
84
+ query(text: text, limit: limit, **)
85
+ end
86
+
87
+ def reset!
88
+ @started = false
89
+ end
90
+
91
+ private
92
+
93
+ def local_enabled?
94
+ return false unless defined?(Legion::Settings)
95
+
96
+ settings = Legion::Settings[:apollo]
97
+ return true if settings.nil?
98
+
99
+ local = settings[:local]
100
+ return true if local.nil?
101
+
102
+ local[:enabled] != false
103
+ rescue StandardError
104
+ true
105
+ end
106
+
107
+ def data_local_available?
108
+ defined?(Legion::Data::Local) && Legion::Data::Local.connected?
109
+ rescue StandardError
110
+ false
111
+ end
112
+
113
+ def db
114
+ Legion::Data::Local.connection
115
+ end
116
+
117
+ def content_hash(content)
118
+ normalized = content.to_s.strip.downcase.gsub(/\s+/, ' ')
119
+ Digest::MD5.hexdigest(normalized)
120
+ end
121
+
122
+ def duplicate?(hash)
123
+ db[:local_knowledge].where(content_hash: hash).any?
124
+ rescue StandardError
125
+ false
126
+ end
127
+
128
+ def generate_embedding(content) # rubocop:disable Metrics/MethodLength,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
129
+ unless defined?(Legion::LLM) && Legion::LLM.respond_to?(:can_embed?) && Legion::LLM.can_embed?
130
+ return [nil, nil]
131
+ end
132
+
133
+ result = Legion::LLM::Embeddings.generate(text: content)
134
+ vector = result.is_a?(Hash) ? result[:vector] : result
135
+ if vector.is_a?(Array) && vector.any?
136
+ [vector, Time.now.utc.strftime('%Y-%m-%dT%H:%M:%S.%LZ')]
137
+ else
138
+ [nil, nil]
139
+ end
140
+ rescue StandardError
141
+ [nil, nil]
142
+ end
143
+
144
+ def compute_expires_at
145
+ years = local_setting(:retention_years, 5)
146
+ (Time.now.utc + (years * 365.25 * 24 * 3600)).strftime('%Y-%m-%dT%H:%M:%S.%LZ')
147
+ end
148
+
149
+ def sync_fts(id, content, tags_json)
150
+ sql = 'INSERT INTO local_knowledge_fts(rowid, content, tags) ' \
151
+ "VALUES (#{id}, #{db.literal(content)}, #{db.literal(tags_json)})"
152
+ db.run(sql)
153
+ rescue StandardError => e
154
+ Legion::Logging.warn("FTS5 sync failed for id=#{id}: #{e.message}") if defined?(Legion::Logging)
155
+ end
156
+
157
+ def fts_search(text, limit:) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
158
+ escaped = text.to_s.split.map { |t| format('"%s"', t.gsub('"', '""')) }.join(' ')
159
+ now = Time.now.utc.strftime('%Y-%m-%dT%H:%M:%S.%LZ')
160
+ db.fetch(
161
+ 'SELECT lk.* FROM local_knowledge lk ' \
162
+ 'INNER JOIN local_knowledge_fts fts ON lk.id = fts.rowid ' \
163
+ 'WHERE local_knowledge_fts MATCH ? AND lk.expires_at > ? ORDER BY fts.rank LIMIT ?',
164
+ escaped, now, limit
165
+ ).all
166
+ rescue StandardError
167
+ db[:local_knowledge]
168
+ .where(Sequel.lit('expires_at > ?', Time.now.utc.strftime('%Y-%m-%dT%H:%M:%S.%LZ')))
169
+ .where(Sequel.ilike(:content, "%#{text}%"))
170
+ .limit(limit)
171
+ .all
172
+ end
173
+
174
+ def filter_candidates(candidates, min_confidence:, tags:)
175
+ candidates = candidates.select { |c| (c[:confidence] || 0) >= min_confidence }
176
+ if tags && !tags.empty?
177
+ tag_set = Array(tags).map(&:to_s)
178
+ candidates = candidates.select do |c|
179
+ entry_tags = parse_tags(c[:tags])
180
+ tag_set.intersect?(entry_tags)
181
+ end
182
+ end
183
+ candidates
184
+ end
185
+
186
+ def parse_tags(tags_json)
187
+ return [] if tags_json.nil? || tags_json.empty?
188
+
189
+ ::JSON.parse(tags_json)
190
+ rescue StandardError
191
+ []
192
+ end
193
+
194
+ def can_rerank?
195
+ defined?(Legion::LLM) && Legion::LLM.respond_to?(:can_embed?) && Legion::LLM.can_embed?
196
+ end
197
+
198
+ def cosine_rerank(text, candidates) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
199
+ query_result = Legion::LLM::Embeddings.generate(text: text)
200
+ query_vec = query_result.is_a?(Hash) ? query_result[:vector] : query_result
201
+ return candidates unless query_vec.is_a?(Array) && query_vec.any?
202
+
203
+ scored = candidates.map do |c|
204
+ entry_vec = parse_embedding(c[:embedding])
205
+ score = if entry_vec
206
+ Legion::Apollo::Helpers::Similarity.cosine_similarity(query_vec, entry_vec)
207
+ else
208
+ 0.0
209
+ end
210
+ c.merge(similarity: score)
211
+ end
212
+
213
+ scored.sort_by { |c| -(c[:similarity] || 0) }
214
+ rescue StandardError
215
+ candidates
216
+ end
217
+
218
+ def parse_embedding(embedding_json)
219
+ return nil if embedding_json.nil? || embedding_json.empty?
220
+
221
+ parsed = ::JSON.parse(embedding_json)
222
+ parsed.is_a?(Array) ? parsed.map(&:to_f) : nil
223
+ rescue StandardError
224
+ nil
225
+ end
226
+
227
+ def local_setting(key, default)
228
+ return default unless defined?(Legion::Settings) && !Legion::Settings[:apollo].nil?
229
+
230
+ local = Legion::Settings[:apollo][:local]
231
+ return default if local.nil?
232
+
233
+ local[key] || default
234
+ rescue StandardError
235
+ default
236
+ end
237
+
238
+ def not_started_error
239
+ { success: false, error: :not_started }
240
+ end
241
+ end
242
+ end
243
+ end
244
+ end
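Review note (illustrative, not part of the package): two behaviours of `Local` that are easy to miss can be checked with the standard library alone. The first is the dedup key, which is computed over a whitespace-collapsed, lowercased copy of the content. The second is that timestamps are stored as fixed-width UTC strings, so the `expires_at > ?` filter is a plain string comparison. The sample strings and the fixed instant are assumptions chosen for the example.

```ruby
require 'digest'

# Same normalization as content_hash in the hunk above.
def content_hash(content)
  normalized = content.to_s.strip.downcase.gsub(/\s+/, ' ')
  Digest::MD5.hexdigest(normalized)
end

first = content_hash("Ruby  is\nfun")
second = content_hash('  ruby is FUN ')
puts first == second # true: case and whitespace differences deduplicate
puts first.length    # 32, matching the content_hash column size

# Fixed-width ISO 8601 strings sort in chronological order.
TIMESTAMP_FORMAT = '%Y-%m-%dT%H:%M:%S.%LZ'
now = Time.utc(2026, 3, 25, 12, 0, 0) # arbitrary instant for the example
expires = now + (5 * 365.25 * 24 * 3600) # default retention_years of 5
puts expires.strftime(TIMESTAMP_FORMAT) > now.strftime(TIMESTAMP_FORMAT) # true
```

MD5 here serves as a dedup fingerprint rather than a security control. Two entries that differ only in case or spacing are treated as the same knowledge item.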
data/lib/legion/apollo/messages/access_boost.rb ADDED
@@ -0,0 +1,20 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Messages
+       # Envelope for publishing access-frequency boost events to the Apollo exchange.
+       class AccessBoost
+         ROUTING_KEY = 'apollo.access.boost'
+         EXCHANGE = 'apollo'
+
+         def publish(payload)
+           return unless defined?(Legion::Transport)
+
+           exchange = Legion::Transport::Exchange.new(EXCHANGE, type: :topic, auto_delete: false)
+           exchange.publish(payload, routing_key: ROUTING_KEY)
+         end
+       end
+     end
+   end
+ end
data/lib/legion/apollo/messages/ingest.rb ADDED
@@ -0,0 +1,20 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Messages
+       # Envelope for publishing knowledge ingest requests to the Apollo exchange.
+       class Ingest
+         ROUTING_KEY = 'apollo.ingest'
+         EXCHANGE = 'apollo'
+
+         def publish(payload)
+           return unless defined?(Legion::Transport)
+
+           exchange = Legion::Transport::Exchange.new(EXCHANGE, type: :topic, auto_delete: false)
+           exchange.publish(payload, routing_key: ROUTING_KEY)
+         end
+       end
+     end
+   end
+ end
data/lib/legion/apollo/messages/query.rb ADDED
@@ -0,0 +1,20 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Messages
+       # Envelope for publishing knowledge query requests to the Apollo exchange.
+       class Query
+         ROUTING_KEY = 'apollo.query'
+         EXCHANGE = 'apollo'
+
+         def publish(payload)
+           return unless defined?(Legion::Transport)
+
+           exchange = Legion::Transport::Exchange.new(EXCHANGE, type: :topic, auto_delete: false)
+           exchange.publish(payload, routing_key: ROUTING_KEY)
+         end
+       end
+     end
+   end
+ end
data/lib/legion/apollo/messages/writeback.rb ADDED
@@ -0,0 +1,20 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     module Messages
+       # Envelope for publishing knowledge writeback events to the Apollo exchange.
+       class Writeback
+         ROUTING_KEY = 'apollo.writeback'
+         EXCHANGE = 'apollo'
+
+         def publish(payload)
+           return unless defined?(Legion::Transport)
+
+           exchange = Legion::Transport::Exchange.new(EXCHANGE, type: :topic, auto_delete: false)
+           exchange.publish(payload, routing_key: ROUTING_KEY)
+         end
+       end
+     end
+   end
+ end
data/lib/legion/apollo/settings.rb ADDED
@@ -0,0 +1,32 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     # Default configuration values for the Apollo client.
+     module Settings
+       def self.default
+         {
+           enabled: true,
+           transport_mode: :auto,
+           query_timeout: 5,
+           ingest_timeout: 10,
+           max_tags: 20,
+           default_limit: 5,
+           min_confidence: 0.3,
+           local: local_defaults
+         }
+       end
+
+       def self.local_defaults
+         {
+           enabled: true,
+           retention_years: 5,
+           default_query_scope: :all,
+           fts_candidate_multiplier: 3,
+           min_confidence: 0.3,
+           default_limit: 5
+         }
+       end
+     end
+   end
+ end
data/lib/legion/apollo/version.rb ADDED
@@ -0,0 +1,7 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Apollo
+     VERSION = '0.3.0'
+   end
+ end
data/lib/legion/apollo.rb ADDED
@@ -0,0 +1,166 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative 'apollo/version'
4
+ require_relative 'apollo/settings'
5
+ require_relative 'apollo/local'
6
+
7
+ module Legion
8
+ # Apollo client library — query, ingest, and retrieve with smart routing.
9
+ # Routes to a co-located lex-apollo service when available, falls back to
10
+ # RabbitMQ transport, and degrades gracefully when neither is present.
11
+ module Apollo # rubocop:disable Metrics/ModuleLength
12
+ class << self # rubocop:disable Metrics/ClassLength
13
+ def start
14
+ return if @started
15
+
16
+ merge_settings
17
+ detect_transport
18
+ detect_data
19
+
20
+ @started = true
21
+ Legion::Logging.info 'Legion::Apollo started' if defined?(Legion::Logging)
22
+ end
23
+
24
+ def shutdown
25
+ @started = false
26
+ @transport_available = nil
27
+ @data_available = nil
28
+ Legion::Logging.info 'Legion::Apollo shutdown' if defined?(Legion::Logging)
29
+ end
30
+
31
+ def started?
32
+ @started == true
33
+ end
34
+
35
+ def local
36
+ Legion::Apollo::Local
37
+ end
38
+
39
+ def query(text:, limit: nil, min_confidence: nil, tags: nil, **opts) # rubocop:disable Metrics/MethodLength
40
+ return not_started_error unless started?
41
+
42
+ limit ||= apollo_setting(:default_limit, 5)
43
+ min_confidence ||= apollo_setting(:min_confidence, 0.3)
44
+
45
+ payload = { text: text, limit: limit, min_confidence: min_confidence, tags: tags, **opts }
46
+
47
+ if co_located_reader?
48
+ direct_query(payload)
49
+ elsif transport_available?
50
+ publish_query(payload)
51
+ else
52
+ { success: false, error: :no_path_available }
53
+ end
54
+ end
55
+
56
+ def ingest(content:, tags: [], **opts)
57
+ return not_started_error unless started?
58
+
59
+ payload = { content: content, tags: Array(tags).first(apollo_setting(:max_tags, 20)), **opts }
60
+
61
+ if co_located_writer?
62
+ direct_ingest(payload)
63
+ elsif transport_available?
64
+ publish_ingest(payload)
65
+ else
66
+ { success: false, error: :no_path_available }
67
+ end
68
+ end
69
+
70
+ def retrieve(text:, limit: 5, **)
71
+ query(text: text, limit: limit, **)
72
+ end
73
+
74
+ def transport_available?
75
+ @transport_available == true
76
+ end
77
+
78
+ def data_available?
79
+ @data_available == true
80
+ end
81
+
82
+ private
83
+
84
+ def merge_settings
85
+ return unless defined?(Legion::Settings)
86
+
87
+ defaults = Legion::Apollo::Settings.default
88
+ Legion::Settings[:apollo] = defaults.merge(Legion::Settings[:apollo] || {})
89
+ rescue StandardError => e
90
+ Legion::Logging.debug("Apollo settings merge failed: #{e.message}") if defined?(Legion::Logging)
91
+ end
92
+
93
+ def detect_transport
94
+ @transport_available = defined?(Legion::Transport) &&
95
+ Legion::Settings[:transport][:connected] == true
96
+ rescue StandardError
97
+ @transport_available = false
98
+ end
99
+
100
+ def detect_data
101
+ @data_available = defined?(Legion::Data) &&
102
+ Legion::Settings[:data][:connected] == true
103
+ rescue StandardError
104
+ @data_available = false
105
+ end
106
+
107
+ def co_located_reader?
108
+ return false unless data_available?
109
+
110
+ defined?(Legion::Extensions::Apollo::Runners::Knowledge) &&
111
+ Legion::Extensions::Apollo::Runners::Knowledge.respond_to?(:handle_query)
112
+ rescue StandardError
113
+ false
114
+ end
115
+
116
+ def co_located_writer?
117
+ return false unless data_available?
118
+
119
+ defined?(Legion::Extensions::Apollo::Runners::Knowledge) &&
120
+ Legion::Extensions::Apollo::Runners::Knowledge.respond_to?(:handle_ingest)
121
+ rescue StandardError
122
+ false
123
+ end
124
+
125
+ def direct_query(payload)
126
+ Legion::Extensions::Apollo::Runners::Knowledge.handle_query(**payload)
127
+ rescue StandardError => e
128
+ { success: false, error: e.message }
129
+ end
130
+
131
+ def direct_ingest(payload)
132
+ Legion::Extensions::Apollo::Runners::Knowledge.handle_ingest(**payload)
133
+ rescue StandardError => e
134
+ { success: false, error: e.message }
135
+ end
136
+
137
+ def publish_query(payload)
138
+ require_relative 'apollo/messages/query' unless defined?(Legion::Apollo::Messages::Query)
139
+ Legion::Apollo::Messages::Query.new.publish(Legion::JSON.dump(payload))
140
+ { success: true, mode: :async }
141
+ rescue StandardError => e
142
+ { success: false, error: e.message }
143
+ end
144
+
145
+ def publish_ingest(payload)
146
+ require_relative 'apollo/messages/ingest' unless defined?(Legion::Apollo::Messages::Ingest)
147
+ Legion::Apollo::Messages::Ingest.new.publish(Legion::JSON.dump(payload))
148
+ { success: true, mode: :async }
149
+ rescue StandardError => e
150
+ { success: false, error: e.message }
151
+ end
152
+
153
+ def apollo_setting(key, default)
154
+ return default unless defined?(Legion::Settings) && !Legion::Settings[:apollo].nil?
155
+
156
+ Legion::Settings[:apollo][key] || default
157
+ rescue StandardError
158
+ default
159
+ end
160
+
161
+ def not_started_error
162
+ { success: false, error: :not_started }
163
+ end
164
+ end
165
+ end
166
+ end
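Review note (illustrative, not part of the package): `merge_settings` combines defaults and user settings with `Hash#merge`, which is shallow. The sketch below assumes `Legion::Settings[:apollo]` behaves like a plain Ruby hash, which this diff does not confirm. Under that assumption a partial `local:` override replaces the whole default `local:` hash. The per-key fallbacks in `local_setting` would still supply defaults at read time. The block-form merge is shown only as one possible alternative.

```ruby
# Plain hashes standing in for the defaults and a user override (hypothetical values).
defaults = {
  max_tags: 20,
  local: { enabled: true, retention_years: 5, fts_candidate_multiplier: 3 }
}
overrides = { local: { retention_years: 1 } }

# Shallow merge, as in merge_settings: the nested default hash is replaced.
shallow = defaults.merge(overrides)

# One-level deep merge: nested hashes are merged key by key instead.
deep = defaults.merge(overrides) do |_key, old, new|
  old.is_a?(Hash) && new.is_a?(Hash) ? old.merge(new) : new
end

puts shallow[:local].key?(:enabled)    # false
puts deep[:local][:enabled]            # true
puts deep[:local][:retention_years]    # 1
```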
metadata ADDED
@@ -0,0 +1,107 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: legion-apollo
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.3.0
5
+ platform: ruby
6
+ authors:
7
+ - Esity
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: legion-json
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - ">="
17
+ - !ruby/object:Gem::Version
18
+ version: 1.2.1
19
+ type: :runtime
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - ">="
24
+ - !ruby/object:Gem::Version
25
+ version: 1.2.1
26
+ - !ruby/object:Gem::Dependency
27
+ name: legion-logging
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - ">="
31
+ - !ruby/object:Gem::Version
32
+ version: 1.3.2
33
+ type: :runtime
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - ">="
38
+ - !ruby/object:Gem::Version
39
+ version: 1.3.2
40
+ - !ruby/object:Gem::Dependency
41
+ name: legion-settings
42
+ requirement: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - ">="
45
+ - !ruby/object:Gem::Version
46
+ version: 1.3.14
47
+ type: :runtime
48
+ prerelease: false
49
+ version_requirements: !ruby/object:Gem::Requirement
50
+ requirements:
51
+ - - ">="
52
+ - !ruby/object:Gem::Version
53
+ version: 1.3.14
54
+ description: Client-side Apollo knowledge store API for LegionIO. Provides query,
55
+ ingest, and retrieve with smart routing (co-located service, RabbitMQ, or graceful
56
+ failure).
57
+ email:
58
+ - matthewdiverson@gmail.com
59
+ executables: []
60
+ extensions: []
61
+ extra_rdoc_files:
62
+ - CHANGELOG.md
63
+ - LICENSE
64
+ - README.md
65
+ files:
66
+ - CHANGELOG.md
67
+ - LICENSE
68
+ - README.md
69
+ - lib/legion/apollo.rb
70
+ - lib/legion/apollo/helpers/confidence.rb
71
+ - lib/legion/apollo/helpers/similarity.rb
72
+ - lib/legion/apollo/helpers/tag_normalizer.rb
73
+ - lib/legion/apollo/local.rb
74
+ - lib/legion/apollo/local/migrations/001_create_local_knowledge.rb
75
+ - lib/legion/apollo/messages/access_boost.rb
76
+ - lib/legion/apollo/messages/ingest.rb
77
+ - lib/legion/apollo/messages/query.rb
78
+ - lib/legion/apollo/messages/writeback.rb
79
+ - lib/legion/apollo/settings.rb
80
+ - lib/legion/apollo/version.rb
81
+ homepage: https://github.com/LegionIO/legion-apollo
82
+ licenses:
83
+ - Apache-2.0
84
+ metadata:
85
+ bug_tracker_uri: https://github.com/LegionIO/legion-apollo/issues
86
+ changelog_uri: https://github.com/LegionIO/legion-apollo/blob/main/CHANGELOG.md
87
+ homepage_uri: https://github.com/LegionIO/LegionIO
88
+ source_code_uri: https://github.com/LegionIO/legion-apollo
89
+ rubygems_mfa_required: 'true'
90
+ rdoc_options: []
91
+ require_paths:
92
+ - lib
93
+ required_ruby_version: !ruby/object:Gem::Requirement
94
+ requirements:
95
+ - - ">="
96
+ - !ruby/object:Gem::Version
97
+ version: '3.4'
98
+ required_rubygems_version: !ruby/object:Gem::Requirement
99
+ requirements:
100
+ - - ">="
101
+ - !ruby/object:Gem::Version
102
+ version: '0'
103
+ requirements: []
104
+ rubygems_version: 3.6.9
105
+ specification_version: 4
106
+ summary: Apollo client library for the LegionIO framework
107
+ test_files: []