llm-lsp 0.2.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 200411c08bb9e2d5c1bf5e55ebcaf66359f50ff06729c3b310072d46dde0d79d
+   data.tar.gz: 0e677ff3b6c0c35c6f707eeb2ca95359fe74ee516060a48dd9bbf88218f11687
+ SHA512:
+   metadata.gz: 60461145a86057ee35e830c6df1d075d9ee30cbca4bc876d0a64fbef86867b43ab64006f177447b78ec40063d4722248e41a6e0bae22c6b8d6f3a14c5287b3f8
+   data.tar.gz: 96792ec7394ef784cebe710e7841c1d296fe0dd11198574fc81d7898e6f6d06110813570686e1a714c42a7732ad20efa67110264b58254a7c7843ae21745e268
data/README.md ADDED
@@ -0,0 +1,189 @@
+ # llm-lsp.rb
+
+ A general-purpose completion LSP server implemented in Ruby on top of LLMs, supporting Ollama and other OpenAI-API-compatible backends.
+
+ ## Features
+
+ 1. Uses the model's FIM (Fill-In-the-Middle) capability for code completion
+ 2. Streaming completion; a cancelled request immediately closes the connection and frees GPU resources
+ 3. Request debouncing: of consecutive completion requests for the same document, only the latest is processed
+ 4. Async-based asynchronous architecture that does not block editor interaction
+
+ ## Tech Stack
+
+ 1. [async](https://github.com/socketry/async) — asynchronous event-driven server framework
+ 2. [io-stream](https://github.com/socketry/io-stream) — efficient buffered IO that pairs with async
+ 3. [ruby-openai](https://github.com/alexrudall/ruby-openai) — client for OpenAI-compatible LLM APIs
+
+ ## Installation
+
+ ```bash
+ gem install llm-lsp
+ ```
+
+ Or add it to your Gemfile and run `bundle install`:
+
+ ```ruby
+ gem "llm-lsp"
+ ```
+
+ ## Usage
+
+ ```bash
+ llm-lsp [options]
+ # or via bundler
+ bundle exec llm-lsp [options]
+ ```
+
+ ### Command-line options
+
+ | Option | Description |
+ |--------|-------------|
+ | `-c`, `--config FILE` | Config file path (default: `~/.config/llm-lsp/llm-lsp.yml`) |
+ | `-m`, `--provider NAME` | Select a provider (overrides the config file) |
+ | `--verbose LEVEL` | Log level (1=ERROR, 2=WARN, 3=INFO, 4=DEBUG) |
+ | `--log FILE` | Log file path (default: STDERR) |
+ | `-v`, `--version` | Show the version |
+ | `-h`, `--help` | Show help |
+
+ ### Configuration file
+
+ A YAML configuration file is read from `~/.config/llm-lsp/llm-lsp.yml` by default; pass `-c` to use another path.
+
+ ```yaml
+ provider: ollama
+
+ providers:
+   ollama:
+     model: qwen2.5-coder:1.5b
+     api_base: http://localhost:11434/v1
+     context_window: 2048
+     tokens_to_clear:
+       - "<|endoftext|>"
+   openai:
+     model: gpt-4
+     api_base: https://api.openai.com/v1
+     access_token: sk-xxxx
+     context_window: 8192
+ ```
+
+ **Precedence** (lowest to highest):
+ 1. Configuration file (`~/.config/llm-lsp/llm-lsp.yml`)
+ 2. `-m` command-line option (overrides only the provider selection)
+ 3. LSP client `initializationOptions` (highest; a same-named provider is overridden entirely)
+
+ ### LSP protocol configuration
+
+ #### initializationOptions
+
+ Passed in via the LSP client's `initialize` request. Full structure:
+
+ ```jsonc
+ {
+   "provider": "ollama",  // optional; name of the active provider
+   "providers": {         // optional; provider definitions (multiple allowed)
+     "ollama": {
+       "model": "qwen2.5-coder:1.5b",            // required; model name
+       "api_base": "http://localhost:11434/v1",  // required; API endpoint
+       "access_token": "",                       // optional; API key (default: empty)
+       "context_window": 2048,                   // optional; context window size (default: 2048)
+       "fim": {                                  // optional; FIM special tokens (default: null, i.e. prompt+suffix mode)
+         "prefix": "<fim_prefix>",
+         "suffix": "<fim_suffix>",
+         "middle": "<fim_middle>"
+       },
+       "tokenizer_config": {                     // optional; tokenizer config (default: null, falls back to character counting)
+         // pick exactly one:
+         "path": "/path/to/tokenizer.json",          // local file
+         "repository": "Qwen/Qwen2.5-Coder-1.5B",    // HuggingFace Hub
+         "url": "https://example.com/tokenizer.json" // download from a URL
+       },
+       "tokens_to_clear": ["<|endoftext|>"]      // optional; tokens stripped from completion results (default: [])
+     }
+   }
+ }
+ ```
+
+ When `initializationOptions` provides no providers/provider, the defaults from the configuration file are used.
+
+ #### workspace/didChangeConfiguration
+
+ Changes the configuration at runtime; the payload lives in `params.settings`:
+
+ ```jsonc
+ {
+   "provider": "ollama",  // optional; switch the active provider
+   "providers": {         // optional; update provider configs (same structure as initializationOptions.providers)
+     "ollama": {
+       "model": "qwen2.5-coder:7b",
+       "api_base": "http://localhost:11434/v1"
+       // ... all fields as above
+     }
+   }
+ }
+ ```
+
+ ### Editor integration
+
+ #### coc.nvim
+
+ Add to `coc-settings.json`:
+
+ ```json
+ {
+   "languageserver": {
+     "llm-lsp": {
+       "command": "llm-lsp",
+       "args": ["--log", "/tmp/llm-lsp.log", "--verbose", "4"],
+       "filetypes": ["*"],
+       "initializationOptions": {
+         "provider": "ollama",
+         "providers": {
+           "ollama": {
+             "model": "qwen2.5-coder:1.5b",
+             "api_base": "http://localhost:11434/v1"
+           }
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ If providers are already defined in the configuration file, the providers part of `initializationOptions` can be omitted:
+
+ ```json
+ {
+   "languageserver": {
+     "llm-lsp": {
+       "command": "llm-lsp",
+       "args": ["-c", "/path/to/config.yml", "--log", "/tmp/llm-lsp.log"],
+       "filetypes": ["*"]
+     }
+   }
+ }
+ ```
+
+ ## Testing
+
+ The project uses Minitest for integration tests, which start the LSP server as a subprocess and verify it over JSON-RPC.
+
+ ```bash
+ # run all tests
+ bundle exec rake test
+ bundle exec ruby test/test_lsp.rb
+
+ # run a single test
+ bundle exec ruby test/test_lsp.rb --name test_inline_completion
+ ```
+
+ **Notes:**
+ - Some tests require Ollama running on `localhost:11434` and are skipped automatically when it is unavailable
+ - Ollama may need 30+ seconds to load a model for the first time
+ - Test logs are written to `/tmp/llm-lsp-test.log`
+
+ ## Dependencies
+
+ - Ruby (see `.ruby-version`)
+ - Bundler
+ - Ollama or another backend compatible with the OpenAI Completions API
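The configuration precedence described in the README can be sketched in a few lines of Ruby. This is an illustrative helper, not code shipped in the gem; the name `resolve_config` and the flat merge are simplifying assumptions (the server merges per-provider settings during `initialize`):

```ruby
# Sketch of the three-layer precedence: config file < -m flag < initializationOptions.
# Hypothetical helper for illustration only.
def resolve_config(file_conf, cli_provider, init_opts)
  conf = file_conf.dup
  # -m overrides only the provider selection
  conf[:provider] = cli_provider if cli_provider
  # initializationOptions wins overall; a same-named provider is replaced entirely
  conf[:providers] = (conf[:providers] || {}).merge(init_opts[:providers] || {})
  conf[:provider] = init_opts[:provider] if init_opts[:provider]
  conf
end

file_conf = { provider: "ollama", providers: { "ollama" => { model: "a" } } }
init_opts = { providers: { "ollama" => { model: "b" } } }
resolved = resolve_config(file_conf, "openai", init_opts)
# resolved[:provider] == "openai"; the "ollama" provider now has model "b"
```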
data/bin/llm-lsp ADDED
@@ -0,0 +1,109 @@
+ #!/usr/bin/env ruby
+
+ require "json"
+ require "logger"
+ require "optparse"
+ require "yaml"
+
+ require_relative "../lib/llm_lsp"
+
+ DEFAULT_CONFIG_PATH = File.join(Dir.home, ".config", "llm-lsp", "llm-lsp.yml")
+
+ def parse_options
+   options = {
+     verbose: 1,
+     logfile: nil,
+     config: DEFAULT_CONFIG_PATH,
+     provider: nil,
+     parser: nil,
+   }
+
+   opts = OptionParser.new do |o|
+     o.banner = "Usage: llm-lsp [options]"
+
+     o.on("-c", "--config FILE", String,
+          "Config file path (default: #{DEFAULT_CONFIG_PATH})") do |f|
+       options[:config] = f
+     end
+
+     o.on("-m", "--provider NAME", String,
+          "Select provider by name") do |name|
+       options[:provider] = name
+     end
+
+     o.on("--verbose LEVEL", Integer, "Set verbose level (1-4)") do |v|
+       options[:verbose] = v
+     end
+
+     o.on("--log FILE", String, "Path to log file (default STDERR)") do |f|
+       options[:logfile] = f
+     end
+
+     o.on("-v", "--version", "Show version") do
+       puts "llm-lsp #{LlmLsp::VERSION}"
+       exit 0
+     end
+
+     o.on_tail("-h", "--help", "Show this message") do
+       puts o
+       exit 0
+     end
+   end
+   opts.parse!(ARGV)
+   options[:parser] = opts
+   options
+ end
+
+ def create_logger(options)
+   logdev = if options[:logfile]
+     file = File.open(options[:logfile], "a")
+     file.sync = true
+     STDERR.reopen(file)
+     file
+   else
+     STDERR
+   end
+
+   logger = Logger.new(logdev)
+   logger.level = case options[:verbose]
+                  when 1 then Logger::ERROR
+                  when 2 then Logger::WARN
+                  when 3 then Logger::INFO
+                  when 4 then Logger::DEBUG
+                  else Logger::INFO
+                  end
+   logger.formatter = proc do |severity, datetime, _prg, msg|
+     if msg.is_a? Exception
+       msg = "#{msg.message} (#{msg.class})\n" << (msg.backtrace || []).join("\n")
+     end
+     timestamp = datetime.strftime("%Y-%m-%dT%H:%M:%S.%L%:z")
+     "#{severity[0]}[#{timestamp}] #{msg}\n"
+   end
+   logger
+ rescue => e
+   STDERR.puts "Failed to create logger: #{e.message}"
+   Logger.new(STDERR)
+ end
+
+ def load_config(path, logger)
+   return {} unless path && File.exist?(path)
+   config = YAML.safe_load_file(path, symbolize_names: true) || {}
+   logger.info("Loaded config from #{path}")
+   config
+ rescue => e
+   logger.warn("Failed to load config #{path}: #{e.message}")
+   {}
+ end
+
+ def main
+   options = parse_options
+   logger = create_logger(options)
+
+   config = load_config(options[:config], logger)
+   config[:provider] = options[:provider].to_sym if options[:provider]
+
+   server = LlmServer.new(logger, config: config)
+   server.run
+ end
+
+ main
data/lib/core_ext/hash_deep_merge.rb ADDED
@@ -0,0 +1,19 @@
+ module HashDeepMerge
+   refine Hash do
+     def deep_merge(other_hash)
+       merge(other_hash) do |key, old_val, new_val|
+         if old_val.is_a?(Hash) && new_val.is_a?(Hash)
+           old_val.deep_merge(new_val)
+         elsif old_val.is_a?(Array) && new_val.is_a?(Array)
+           old_val + new_val
+         else
+           new_val
+         end
+       end
+     end
+
+     def deep_merge!(other_hash, &block)
+       replace(deep_merge(other_hash, &block))
+     end
+   end
+ end
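The refinement above merges nested hashes recursively, concatenates arrays, and lets the new value win otherwise. It can be exercised like this (the module is reproduced so the snippet is self-contained; note that refinements must be activated with `using` in every file that calls `deep_merge`):

```ruby
module HashDeepMerge
  refine Hash do
    # Recursively merge: nested Hashes merge, Arrays concatenate,
    # everything else takes the new value.
    def deep_merge(other_hash)
      merge(other_hash) do |_key, old_val, new_val|
        if old_val.is_a?(Hash) && new_val.is_a?(Hash)
          old_val.deep_merge(new_val)
        elsif old_val.is_a?(Array) && new_val.is_a?(Array)
          old_val + new_val
        else
          new_val
        end
      end
    end
  end
end

using HashDeepMerge

a = { providers: { ollama: { model: "x", stop: ["\n"] } } }
b = { providers: { ollama: { model: "y", stop: ["\n\n"] } } }
merged = a.deep_merge(b)
# merged[:providers][:ollama] == { model: "y", stop: ["\n", "\n\n"] }
```

`deep_merge` is non-destructive, so `a` is left untouched; `deep_merge!` replaces the receiver in place.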
data/lib/llm_lsp/version.rb ADDED
@@ -0,0 +1,3 @@
+ module LlmLsp
+   VERSION = "0.2.0"
+ end
data/lib/llm_lsp.rb ADDED
@@ -0,0 +1,2 @@
+ require_relative "llm_lsp/version"
+ require_relative "llm_server"
data/lib/llm_server.rb ADDED
@@ -0,0 +1,537 @@
+ require "openai"
+ require_relative "lsp/session"
+ require_relative "lsp/json_rpc"
+ require_relative "lsp/stdio"
+ require_relative "lsp/code"
+
+ class Document
+   attr_reader :uri, :text, :position_encoding
+
+   # text arrives via JSON-RPC (JSON mandates UTF-8), so it is already a UTF-8 string on the Ruby side.
+   # position_encoding only affects the unit in which the Position.character number is counted.
+   def initialize(uri, text, position_encoding = "utf-16")
+     @uri = uri
+     @text = text
+     @position_encoding = position_encoding
+   end
+
+   # Convert an LSP Position (line, character) into a character offset within @text.
+   def position_to_offset(lineno, char)
+     offset = 0
+     @text.each_line.with_index do |line, index|
+       return offset + lsp_char_to_chars(line, char) if lineno == index
+       offset += line.length
+     end
+     # Position past the last line (e.g. cursor on a trailing empty line):
+     # clamp to the end of the text instead of returning nil.
+     offset
+   end
+
+   def slice(range)
+     @text[range]
+   end
+
+   private
+
+   # Convert an LSP character offset into a Ruby character count within the line.
+   # utf-16: UTF-16 code units; BMP characters count as 1, supplementary-plane characters as 2
+   # utf-32: code points, identical to the Ruby character count
+   # utf-8: bytes; accumulate bytesize character by character until the target is reached
+   def lsp_char_to_chars(line_text, lsp_char)
+     return lsp_char if @position_encoding == "utf-32"
+
+     consumed = 0
+     line_text.each_char.with_index do |ch, i|
+       return i if consumed >= lsp_char
+       consumed += char_units(ch)
+     end
+     line_text.length
+   end
+
+   def char_units(ch)
+     case @position_encoding
+     when "utf-8"
+       ch.bytesize
+     when "utf-16"
+       ch.ord > 0xFFFF ? 2 : 1
+     else
+       1
+     end
+   end
+ end
+
+ class LlmServer
+   attr_reader :session, :logger
+
+   def initialize(logger, config: {})
+     @logger = logger
+     @config = config
+     channel = JsonRpc.new(Stdio.new)
+     @session = LspSession.new("LlmLsp", channel, logger)
+     @providers = {}
+     @provider = nil
+
+     @position_encoding = "utf-16"
+     @documents = {}
+     @debounce_delay = 0.2
+     @pending_completions = {}
+
+     # Completion-acceptance tracking: one resident fiber plus an ordered queue.
+     # Ruby Hashes preserve insertion order; completions occur chronologically and
+     # expire_at grows monotonically, so the Hash naturally iterates from the
+     # earliest expiry to the latest.
+     # key: completion_id, value: expire_at (Float, monotonic clock)
+     @pending_accepts = {}
+     @accept_timeout = 30
+     @tokenizer_cache = {}
+
+     setup_handlers
+   end
+
+   def run
+     logger.info("Starting LLM LSP Server...")
+     Sync do |task|
+       task.async { accept_timer_loop }
+       task.async { @session.start }
+     end
+   end
+
+   def setup_handlers
+     session.on_method("initialize") do |msg|
+       handle_initialize(msg)
+     end
+
+     session.on_method("initialized") do |msg|
+       logger.info("Client initialized.")
+     end
+
+     session.on_method("textDocument/didOpen") do |msg|
+       params = msg.fetch(:params, {})
+       uri = params.dig(:textDocument, :uri)
+       text = params.dig(:textDocument, :text)
+       @documents[uri] = Document.new(uri, text, @position_encoding)
+       logger.debug("Opened document: #{uri}")
+     end
+
+     session.on_method("textDocument/didChange") do |msg|
+       params = msg.fetch(:params, {})
+       uri = params.dig(:textDocument, :uri)
+       changes = params.dig(:contentChanges)
+       @documents[uri] = Document.new(uri, changes.dig(0, :text), @position_encoding)
+       logger.debug("Updated document: #{uri}")
+     end
+
+     session.on_method("textDocument/didClose") do |msg|
+       uri = msg.dig(:params, :textDocument, :uri)
+       @documents.delete(uri)
+       @pending_completions.delete(uri)&.stop
+       logger.debug("Closed document: #{uri}")
+     end
+
+     session.on_method("textDocument/inlineCompletion") do |msg|
+       handle_inline_completion(msg)
+     end
+
+     session.on_method("workspace/didChangeConfiguration") do |msg|
+       handle_did_change_configuration(msg)
+     end
+
+     session.on_method("workspace/executeCommand") do |msg|
+       handle_execute_command(msg)
+     end
+   end
+
+   def handle_initialize(msg)
+     params = msg.fetch(:params, {})
+     opts = params.fetch(:initializationOptions, {})
+     logger.debug("initializationOptions: #{opts}")
+
+     errors = []
+
+     # 1) providers from the config file (low priority)
+     @config.fetch(:providers, {}).each do |name, conf|
+       if (err = add_provider(name, conf))
+         logger.warn(err)
+         errors << err
+       end
+     end
+
+     # 2) providers from initializationOptions (high priority; same names override)
+     opts.fetch(:providers, {}).each do |name, conf|
+       if (err = add_provider(name, conf))
+         logger.warn(err)
+         errors << err
+       end
+     end
+
+     # Provider selection: initializationOptions > config file (the latter already reflects -m)
+     @provider = (opts[:provider] || @config[:provider])&.to_sym
+
+     unless @providers.key?(@provider)
+       session.reply(
+         msg[:id],
+         code: Code::INVALID_PARAMS,
+         message: "Provider '#{@provider}' not found, available: #{@providers.keys.join(", ")}"
+       )
+       # Tell the client about non-fatal errors hit during initialization
+       errors.each { |err| notify_editor(err) }
+       return
+     end
+
+     logger.info("Providers: #{@providers.keys.join(", ")}, active: #{@provider}")
+
+     # Position-encoding negotiation: the client lists the encodings it supports
+     # in capabilities.general.positionEncodings and the server picks one;
+     # the default is utf-16 when nothing is declared.
+     client_encodings = params.dig(:capabilities, :general, :positionEncodings) || []
+     # Prefer utf-32 (simplest), then utf-16 (most widely supported).
+     @position_encoding = if client_encodings.include?("utf-32")
+       "utf-32"
+     elsif client_encodings.include?("utf-16")
+       "utf-16"
+     elsif client_encodings.include?("utf-8")
+       "utf-8"
+     else
+       "utf-16"
+     end
+     logger.info("Position encoding: #{@position_encoding}")
+
+     session.reply(msg[:id], result: {
+       capabilities: {
+         positionEncoding: @position_encoding,
+         # 0: None, 1: Full, 2: Incremental
+         textDocumentSync: 1, # full sync; TODO: use incremental sync
+         inlineCompletionProvider: true,
+         executeCommandProvider: {
+           commands: [
+             # custom command used to accept an inline completion
+             "inlineCompletion/accept"
+           ],
+         }
+         #completionProvider: {
+         #  resolveProvider: false,
+         #  #triggerCharacters: ["."]
+         #}
+       },
+       serverInfo: {
+         name: "llm-lsp",
+         version: LlmLsp::VERSION,
+       }
+     })
+
+     # Tell the client about non-fatal errors hit during initialization
+     errors.each { |err| notify_editor(err) }
+   end
+
+   def handle_did_change_configuration(msg)
+     settings = msg.dig(:params, :settings) || {}
+
+     settings.fetch(:providers, {}).each do |name, provider_conf|
+       if (err = add_provider(name, provider_conf))
+         logger.warn(err)
+         notify_editor(err)
+         next
+       end
+       logger.info("Provider #{name} updated")
+     end
+
+     provider = settings.dig(:provider)&.to_sym
+     # No provider switch requested, or already active: nothing more to do.
+     return if provider.nil? || @provider == provider
+     unless @providers.key?(provider)
+       err = "Unknown provider: '#{provider}', available: #{@providers.keys.join(", ")}"
+       logger.warn(err)
+       notify_editor(err)
+       return
+     end
+
+     @provider = provider
+     logger.info("Switched to provider: #{@provider}")
+     notify_editor("Switched to provider: #{@provider}", type: 3)
+   end
+
+   def current_provider
+     @providers[@provider]
+   end
+
+   def current_client
+     current_provider[:client]
+   end
+
+   # window/logMessage writes to the client's log
+   # window/showMessage shows the user a message
+   # window/showMessageRequest additionally requires user interaction and takes
+   #   actions: [ { title: "ok" }, { title: "no" } ]
+   # type: 1 Error, 2 Warn, 3 Info, 4 Log
+   def notify_editor(message, type: 2)
+     session.notify("window/showMessage", params: {
+       type: type,
+       message: "LlmLsp: " + message,
+     })
+   end
+
+   def handle_execute_command(msg)
+     id = msg[:id]
+     params = msg.fetch(:params, {})
+     command = params.dig(:command)
+     arguments = params.dig(:arguments)
+     kind = arguments&.at(0)
+     item = arguments&.at(1)
+
+     unless command == "inlineCompletion/accept" && kind == "llm-lsp" && item
+       return session.reply(id, code: Code::INVALID_PARAMS, message: "Unknown command: #{command}")
+     end
+
+     session.reply(id, result: nil)
+
+     completion_id = item.dig(:id)
+     @pending_accepts.delete(completion_id)
+     logger.info("Completion accepted: #{completion_id}, model: #{item.dig(:model)}")
+   end
+
+   # Record a completion awaiting acceptance; on timeout, accept_timer_loop
+   # removes it and logs a rejection.
+   def track_completion(completion_id)
+     @pending_accepts[completion_id] = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @accept_timeout
+   end
+
+   # Resident fiber: scan once per second and clean up expired completions.
+   # Ruby Hashes preserve insertion order and expire_at is monotonic, so we can
+   # stop at the first entry that has not expired yet. Entries are popped off
+   # the front rather than deleted mid-iteration.
+   def accept_timer_loop
+     loop do
+       sleep(1)
+       now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+       until @pending_accepts.empty?
+         cid, expire_at = @pending_accepts.first
+         break if expire_at > now
+         @pending_accepts.delete(cid)
+         logger.info("Completion rejected (timeout): #{cid}")
+       end
+     end
+   end
+
+   # Register a provider; returns nil on success, an error String on failure.
+   def add_provider(name, conf)
+     return "Provider '#{name}': config must be a Hash, got #{conf.class}" unless conf.is_a?(Hash)
+     return "Provider '#{name}': missing required field 'model'" unless conf.dig(:model)
+     return "Provider '#{name}': missing required field 'api_base'" unless conf.dig(:api_base)
+
+     client = build_client(conf)
+     return "Provider '#{name}': failed to create client" unless client
+
+     @providers[name] = {
+       name: name,
+       model: conf[:model],
+       api_base: conf[:api_base],
+       access_token: conf.fetch(:access_token, ""),
+       client: client,
+       context_window: conf.fetch(:context_window, 2048),
+       fim: conf.dig(:fim),
+       tokenizer_config: conf.dig(:tokenizer_config),
+       tokens_to_clear: conf.fetch(:tokens_to_clear, []),
+     }
+     nil
+   end
+
+   def build_client(conf)
+     OpenAI::Client.new(
+       access_token: conf.fetch(:access_token, ""),
+       uri_base: conf[:api_base],
+       log_errors: true,
+     )
+   rescue => e
+     logger.error("Failed to create OpenAI client: #{e.message}")
+     nil
+   end
+
+   def load_tokenizer(config)
+     return nil unless config
+     key = config[:path] || config[:repository] || config[:url]
+     return nil unless key
+     return @tokenizer_cache[key] if @tokenizer_cache.key?(key)
+
+     require "tokenizers" unless defined?(Tokenizers)
+
+     tokenizer = if config[:path]
+       Tokenizers.from_file(config[:path])
+     elsif config[:repository]
+       Tokenizers.from_pretrained(config[:repository])
+     elsif config[:url]
+       download_and_load_tokenizer(config[:url])
+     end
+     @tokenizer_cache[key] = tokenizer if tokenizer
+     tokenizer
+   rescue LoadError => e
+     logger.warn("Tokenizers gem not available, falling back to character counting: #{e.message}")
+     @tokenizer_cache[key] = nil
+     nil
+   rescue => e
+     logger.warn("Failed to load tokenizer: #{e.message}")
+     nil
+   end
+
+   def download_and_load_tokenizer(url)
+     require "open-uri"
+     require "fileutils"
+     require "digest"
+     cache_dir = File.join(Dir.home, ".cache", "llm-lsp", "tokenizers")
+     FileUtils.mkdir_p(cache_dir)
+     filename = File.join(cache_dir, Digest::SHA256.hexdigest(url) + ".json")
+     unless File.exist?(filename)
+       URI.open(url) do |remote|
+         File.write(filename, remote.read)
+       end
+     end
+     Tokenizers.from_file(filename)
+   end
+
+   def count_tokens(tokenizer, text)
+     if tokenizer
+       tokenizer.encode(text, add_special_tokens: false).ids.size
+     else
+       text.length
+     end
+   end
+
+   # Modeled on llm-ls's build_prompt: collect before/after lines alternately,
+   # truncating by token count.
+   def build_prompt(doc, line, char, tokenizer, context_window, fim)
+     curr = doc.position_to_offset(line, char)
+     before_text = doc.slice(0...curr) || ""
+     after_text = doc.slice(curr..) || ""
+
+     before_lines = before_text.lines
+     after_lines = after_text.lines
+     before_lines = [""] if before_lines.empty?
+     after_lines = [""] if after_lines.empty?
+
+     fim_overhead = fim ? count_tokens(tokenizer, "#{fim[:prefix]}#{fim[:suffix]}#{fim[:middle]}") : 0
+     remaining = context_window - fim_overhead
+
+     collected_before = []
+     collected_after = []
+
+     bi = before_lines.size - 1
+     ai = 0
+
+     while (bi >= 0 || ai < after_lines.size) && remaining > 0
+       if bi >= 0
+         tokens = count_tokens(tokenizer, before_lines[bi])
+         break if tokens > remaining
+         remaining -= tokens
+         collected_before.unshift(before_lines[bi])
+         bi -= 1
+       end
+       if ai < after_lines.size && remaining > 0
+         tokens = count_tokens(tokenizer, after_lines[ai])
+         break if tokens > remaining
+         remaining -= tokens
+         collected_after << after_lines[ai]
+         ai += 1
+       end
+     end
+
+     prefix = collected_before.join
+     suffix = collected_after.join
+
+     if fim
+       prompt = "#{fim[:prefix]}#{prefix}#{fim[:suffix]}#{suffix}#{fim[:middle]}"
+       { prompt: prompt, suffix: nil }
+     else
+       { prompt: prefix, suffix: suffix }
+     end
+   end
+
+   # Debounce + streaming completion:
+   # - With the streaming API, a cancelled request closes the connection right
+   #   away, freeing GPU resources; a non-streaming API call is billed for the
+   #   full generation once submitted, whether or not the output is read,
+   #   whereas cancelling a stream usually aborts inference.
+   # - Concurrent requests for the same document are coalesced via
+   #   @pending_completions; only the latest is processed.
+   # - ruby-openai has no direct cancellation method, but under Async,
+   #   task.stop raises Async::Stop, which interrupts the underlying IO and
+   #   drops the HTTP connection, freeing GPU resources.
+   # - The code must stay safe in the face of Async::Stop; where no explicit
+   #   cleanup is needed, the exception can be left to the framework.
+   def handle_inline_completion(msg)
+     id = msg[:id]
+     params = msg.fetch(:params, {})
+     uri = params.dig(:textDocument, :uri)
+     position = params.dig(:position)
+     doc = @documents.dig(uri)
+     return session.reply(id, code: Code::INVALID_REQUEST, message: "Document #{uri} not found") unless doc
+     return session.reply(id, code: Code::INVALID_PARAMS, message: "Invalid position") unless position
+
+     # Debounce: cancel any older completion request on this document and only
+     # process the latest one.
+     if (old_worker = @pending_completions.delete(uri))
+       old_worker.stop
+       logger.debug("Debounce: cancelled previous completion for #{uri}")
+     end
+     @pending_completions[uri] = Async::Task.current
+
+     # Wait out the debounce delay; a newer request will stop this task in the
+     # meantime. sleep is non-blocking under Async and only suspends this task.
+     sleep(@debounce_delay)
+
+     line = position.dig(:line)
+     char = position.dig(:character)
+
+     provider = current_provider
+     tokenizer = load_tokenizer(provider[:tokenizer_config])
+     prompt_data = build_prompt(doc, line, char, tokenizer, provider[:context_window], provider[:fim])
+     logger.debug("prompt: #{prompt_data[:prompt]&.length} chars, suffix: #{prompt_data[:suffix]&.length} chars")
+
+     items = llm_completions(prompt_data, provider[:tokens_to_clear]).map do |item|
+       {
+         insertText: item.fetch(:text, ""),
+         # attach telemetry info
+         command: {
+           title: "Accept Completion",
+           tooltip: "accept inline completion",
+           command: "inlineCompletion/accept",
+           # arguments may contain values of any type (Any[])
+           arguments: ["llm-lsp", {
+             id: id, # reuse the message id
+             uri: uri,
+             position: position,
+             insertText: item[:text],
+             model: "#{item[:provider]}:#{item[:model]}",
+           }],
+         }
+       }
+     end
+     session.reply(id, result: { items: items })
+     track_completion(id) unless items.empty?
+   rescue Async::Stop
+     logger.debug("abort request: #{id}")
+   ensure
+     @pending_completions.delete(uri)
+   end
+
+   def llm_completions(prompt_data, tokens_to_clear)
+     logger.info("Calling LLM with provider #{@provider} ...")
+     chunks = []
+     params = {
+       model: current_provider[:model],
+       prompt: prompt_data[:prompt],
+       max_tokens: 500,
+       temperature: 0.01,
+       top_p: 0.9,
+       stop: [
+         # "\n", for line completion
+         "\n\n",
+       ],
+       stream: proc { |chunk, _event|
+         text = chunk.dig("choices", 0, "text")
+         chunks << text if text
+       },
+     }
+     params[:suffix] = prompt_data[:suffix] if prompt_data[:suffix]
+
+     current_client.completions(parameters: params)
+
+     result = chunks.join
+     tokens_to_clear.each { |tok| result.gsub!(tok, "") }
+     logger.info("LLM completion: #{result.length} chars")
+     logger.debug("LLM completion: #{result}")
+     return [] if result.empty?
+     [{
+       text: result,
+       provider: current_provider[:name],
+       model: current_provider[:model],
+     }]
+   end
+ end
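How `build_prompt` assembles its FIM prompt can be illustrated standalone: the text before the cursor becomes the prefix, the text after it the suffix, and the model generates the "middle". A minimal sketch (the marker strings match the README's example defaults; real runs take them from the provider's `fim` config, and the helper name `fim_prompt` is hypothetical):

```ruby
# Sketch of FIM prompt assembly: prefix-marker, code before the cursor,
# suffix-marker, code after the cursor, then the middle-marker that cues
# the model to fill in the gap.
def fim_prompt(before, after, fim)
  "#{fim[:prefix]}#{before}#{fim[:suffix]}#{after}#{fim[:middle]}"
end

fim = { prefix: "<fim_prefix>", suffix: "<fim_suffix>", middle: "<fim_middle>" }
prompt = fim_prompt("def add(a, b)\n  ", "\nend\n", fim)
# => "<fim_prefix>def add(a, b)\n  <fim_suffix>\nend\n<fim_middle>"
```

Without a `fim` config, the server instead sends the prefix as `prompt` and the trailing text as the API's `suffix` parameter.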
data/lib/lsp/code.rb ADDED
@@ -0,0 +1,9 @@
+ module Code
+   PARSE_ERROR = -32700
+   INVALID_REQUEST = -32600
+   METHOD_NOT_FOUND = -32601
+   INVALID_PARAMS = -32602
+   INTERNAL_ERROR = -32603
+   SERVER_NOT_INITIALIZED = -32002
+   UNKNOWN_ERROR_CODE = -32001
+ end
data/lib/lsp/json_rpc.rb ADDED
@@ -0,0 +1,55 @@
+ require "json"
+ require_relative "code"
+
+ class JsonRpc
+   attr_reader :io
+
+   def initialize(io)
+     @io = io
+   end
+
+   def build_request(method, params: nil, id: nil)
+     j = {
+       jsonrpc: "2.0",
+       method: method,
+     }
+     j[:params] = params if params
+     j[:id] = id if id
+     j
+   end
+
+   def build_response(id, result: nil, code: nil, message: nil)
+     j = {
+       jsonrpc: "2.0",
+       id: id,
+     }
+     if result || (code.nil? && message.nil?)
+       j[:result] = result
+     else
+       j[:error] = {
+         code: code || Code::INVALID_REQUEST,
+         message: message || "",
+       }
+     end
+     j
+   end
+
+   def receive_message
+     headers = {}
+     while (line = io.gets) && line != "\r\n"
+       # limit 2 so header values containing ":" are not truncated
+       k, v = line.rstrip.split(/:\s*/, 2)
+       headers[k.downcase] = v
+     end
+     length = headers.dig("content-length")&.to_i
+     return nil unless length
+     raw_json = io.read_exactly(length)
+     JSON.parse(raw_json, symbolize_names: true)
+   end
+
+   def send_message(message)
+     jsonrpc = JSON.generate(message)
+     body = "Content-Length: #{jsonrpc.bytesize}\r\n\r\n#{jsonrpc}"
+     io.write(body)
+     io.flush
+   end
+ end
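The `Content-Length` framing used by `send_message`/`receive_message` can be seen in isolation with a round-trip through a `StringIO` (a minimal sketch using only the standard library; the helper names `frame`/`unframe` are for illustration):

```ruby
require "json"
require "stringio"

# Frame a JSON-RPC message the way send_message does: a byte-length header,
# a blank line, then the JSON body.
def frame(message)
  body = JSON.generate(message)
  "Content-Length: #{body.bytesize}\r\n\r\n#{body}"
end

# Parse it back the way receive_message does: read headers up to the blank
# line, then read exactly Content-Length bytes of JSON.
def unframe(io)
  headers = {}
  while (line = io.gets) && line != "\r\n"
    k, v = line.rstrip.split(/:\s*/, 2)
    headers[k.downcase] = v
  end
  length = headers["content-length"].to_i
  JSON.parse(io.read(length), symbolize_names: true)
end

msg = { jsonrpc: "2.0", id: 1, method: "initialize", params: {} }
round_tripped = unframe(StringIO.new(frame(msg)))
# round_tripped == msg
```

Note that the header counts bytes (`bytesize`), not characters, which matters once the JSON body contains non-ASCII text.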
data/lib/lsp/popen.rb ADDED
@@ -0,0 +1,33 @@
+ require "open3"
+ require "io/stream"
+
+ class Popen
+   attr_reader :input, :output, :wait_thr
+
+   def initialize(cmd)
+     pin, pout, @wait_thr = Open3.popen2(cmd)
+     @input = IO::Stream(pout)
+     @output = IO::Stream(pin)
+   end
+
+   def gets = input.gets
+   def read(size) = input.read(size)
+   def read_exactly(size) = input.read_exactly(size)
+   def write(buf) = output.write(buf)
+   def flush = output.flush
+
+   def close
+     input.close rescue nil
+     output.close rescue nil
+   end
+
+   def pid = wait_thr.pid
+
+   def alive? = wait_thr.alive?
+
+   def join(timeout = nil) = wait_thr.join(timeout)
+
+   def kill(signal = "TERM")
+     Process.kill(signal, pid) rescue nil
+   end
+ end
data/lib/lsp/session.rb ADDED
@@ -0,0 +1,141 @@
+ require "async"
+ require_relative "code"
+
+ class LspSession
+   attr_reader :name
+   attr_reader :channel
+   attr_reader :logger
+   attr_reader :id
+
+   def initialize(name, channel, logger)
+     @name = name
+     @channel = channel
+     @logger = logger
+
+     @id = Random.rand(10)
+     @once_handlers = {}
+     @on_handlers = {}
+     @running_request = {}
+     # :setup, :initialized, :running, :shutdown, :exit
+     @state = :setup
+   end
+
+   def next_id
+     @id += 1
+   end
+
+   def start
+     Sync do |task|
+       # try to call the setup handler first
+       task.async do
+         @once_handlers.delete(:setup)&.call(nil)
+       rescue => e
+         logger.fatal(e)
+       end
+
+       # main loop
+       loop do
+         message = receive_message
+         next unless message
+
+         # plain assignment, not ||=: locals persist across loop iterations,
+         # so ||= would keep the previous key and misroute later messages
+         key = if message.key?(:method)
+           [:req, message[:method]]
+         elsif message.key?(:id)
+           [:rsp, message[:id]]
+         end
+         next logger.warn("#{name}: Invalid message #{message}, ignore") unless key
+
+         # special request
+         next if handle_special_request(message)
+         # normal request
+         worker = task.async do
+           h ||= @once_handlers.delete(key) || @once_handlers.delete(:*)
+           h ||= @on_handlers.dig(key) || @on_handlers.dig(:*)
+           next logger.debug("#{name}: no handler, ignore message #{message}") unless h
+           h.call(message)
+         ensure
+           @running_request.delete(message[:id]) if message[:id]
+         end
+         if key.first == :req && (id = message.dig(:id))
+           @running_request[id] = worker
+         end
+       # TODO: standardize exception handling
+       rescue => e
+         logger.fatal(e)
+       end
+     end
+   end
+
+   def handle_special_request(message)
+     method = message.dig(:method)
+     if "$/cancelRequest" == method
+       id = message.dig(:params, :id)
+       worker = @running_request.delete(id)
+       worker&.stop # try to stop it (raises Async::Stop inside the task)
+       method
+     elsif "shutdown" == method
+       @state = :shutdown
+       reply(message[:id], result: nil)
+       method
+     elsif "exit" == method
+       code = @state == :shutdown ? 0 : 1
+       exit(code)
+     elsif [:shutdown, :exit].include?(@state)
+       # shutting down; reject new requests
+       reply(message.dig(:id), code: Code::INVALID_REQUEST, message: "Server is shutting down")
+       method
+     else
+       nil
+     end
+   end
+
+   def on(event = nil, &block)
+     event ||= :*
+     handlers = (event == :setup ? @once_handlers : @on_handlers)
+     handlers[event] = lambda { |m| block&.call(m) }
+   end
+   def on_method(method, &block) = on([:req, method], &block)
+
+   def once(event = nil, &block)
+     event ||= :*
+     @once_handlers[event] = lambda { |m| block&.call(m) }
+   end
+   # must delegate to once(), not on(): on() would register a persistent handler
+   def once_method(method, &block) = once([:req, method], &block)
+
+   def wait(event = nil, &block)
+     event ||= :*
+     cond = Async::Condition.new
+     @once_handlers[event] = lambda { |m| cond.signal(m) }
+     msg = cond.wait
+     block ? block.call(msg) : msg
+   end
+   def wait_method(method, &block) = wait([:req, method], &block)
+   def wait_response(id, &block) = wait([:rsp, id], &block)
+
+   def request(method, params: nil, id: nil)
+     id = next_id if id.nil?
+     message = channel.build_request(method, params:, id:)
+     send_message(message)
+     wait_response(id)
+   end
+
+   def notify(method, params: nil)
+     message = channel.build_request(method, params:)
+     send_message(message)
+   end
+
+   def reply(id, result: nil, code: nil, message: nil)
+     msg = channel.build_response(id, result:, code:, message:)
+     send_message(msg)
+   end
+
+   def send_message(msg)
+     channel.send_message(msg)
+     logger.debug { "<= #{msg}" }
+   end
+
+   def receive_message
+     msg = channel.receive_message
+     logger.debug { "=> #{msg}" }
+     msg
+   end
+ end
data/lib/lsp/stdio.rb ADDED
@@ -0,0 +1,25 @@
+ require "io/stream"
+
+ class Stdio
+   attr_reader :input
+   attr_reader :output
+
+   def initialize
+     # IO::Stream::Generic is a thin wrapper over IO objects that plays well with Async.
+     # IO::Stream::Buffered extends Generic with buffering and higher-level helpers
+     # such as read_exactly and read_until.
+     # The IO::Stream module also provides a call shorthand that builds a Buffered
+     # object directly.
+     @input = IO::Stream(STDIN)
+     @output = IO::Stream(STDOUT)
+   end
+
+   def gets = input.gets
+   def read(size) = input.read(size)
+   def read_exactly(size) = input.read_exactly(size)
+   def write(buf) = output.write(buf)
+   def flush = output.flush
+
+   def close
+     input.close
+     output.close
+   end
+ end
metadata ADDED
@@ -0,0 +1,107 @@
+ --- !ruby/object:Gem::Specification
+ name: llm-lsp
+ version: !ruby/object:Gem::Version
+   version: 0.2.0
+ platform: ruby
+ authors:
+ - alpha0x00
+ bindir: bin
+ cert_chain: []
+ date: 1980-01-02 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: async
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.0'
+ - !ruby/object:Gem::Dependency
+   name: io-stream
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.11'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.11'
+ - !ruby/object:Gem::Dependency
+   name: ruby-openai
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '8.0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '8.0'
+ - !ruby/object:Gem::Dependency
+   name: tokenizers
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.6'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.6'
+ description: Ruby LSP server that provides code completion using LLM FIM (Fill-In-the-Middle).
+   Supports Ollama and other OpenAI API compatible backends.
+ executables:
+ - llm-lsp
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - README.md
+ - bin/llm-lsp
+ - lib/core_ext/hash_deep_merge.rb
+ - lib/llm_lsp.rb
+ - lib/llm_lsp/version.rb
+ - lib/llm_server.rb
+ - lib/lsp/code.rb
+ - lib/lsp/json_rpc.rb
+ - lib/lsp/popen.rb
+ - lib/lsp/session.rb
+ - lib/lsp/stdio.rb
+ homepage: https://github.com/leetking/llm-lsp.rb
+ licenses:
+ - MIT
+ metadata: {}
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '3.1'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubygems_version: 3.6.9
+ specification_version: 4
+ summary: LLM-powered LSP server for code completion
+ test_files: []