tree_haver 1.0.0

data/REEK ADDED
File without changes
data/RUBOCOP.md ADDED
@@ -0,0 +1,71 @@
1
+ # RuboCop Usage Guide
2
+
3
+ ## Overview
4
+
5
+ A tale of two RuboCop plugin gems.
6
+
7
+ ### RuboCop Gradual
8
+
9
+ This project uses `rubocop_gradual` instead of vanilla RuboCop for code style checking. The `rubocop_gradual` tool allows for gradual adoption of RuboCop rules by tracking violations in a lock file.
10
+
11
+ ### RuboCop LTS
12
+
13
+ This project uses `rubocop-lts` to ensure, on a best-effort basis, compatibility with Ruby >= 1.9.2.
14
+ RuboCop rules are meticulously configured by the `rubocop-lts` family of gems to ensure that a project is compatible with a specific version of Ruby. See: https://rubocop-lts.gitlab.io for more.
15
+
16
+ ## Checking RuboCop Violations
17
+
18
+ To check for RuboCop violations in this project, always use:
19
+
20
+ ```bash
21
+ bundle exec rake rubocop_gradual:check
22
+ ```
23
+
24
+ **Do not use** the standard RuboCop commands like:
25
+ - `bundle exec rubocop`
26
+ - `rubocop`
27
+
28
+ ## Understanding the Lock File
29
+
30
+ The `.rubocop_gradual.lock` file tracks all current RuboCop violations in the project. This allows the team to:
31
+
32
+ 1. Prevent new violations while gradually fixing existing ones
33
+ 2. Track progress on code style improvements
34
+ 3. Ensure CI builds don't fail due to pre-existing violations
35
+
36
+ ## Common Commands
37
+
38
+ - **Check violations**
39
+ - `bundle exec rake rubocop_gradual`
40
+ - `bundle exec rake rubocop_gradual:check`
41
+ - **(Safe) Autocorrect violations, and update lockfile if no new violations**
42
+ - `bundle exec rake rubocop_gradual:autocorrect`
43
+ - **Force update the lock file (without autocorrect) to match violations present in code**
44
+ - `bundle exec rake rubocop_gradual:force_update`
45
+
46
+ ## Workflow
47
+
48
+ 1. Before submitting a PR, run `bundle exec rake rubocop_gradual:autocorrect`
49
+ a. Or just run the default `bundle exec rake`, since autocorrection is a prerequisite of the default task.
50
+ 2. If there are new violations, either:
51
+ - Fix them in your code
52
+ - Run `bundle exec rake rubocop_gradual:force_update` to update the lock file (only for violations you can't fix immediately)
53
+ 3. Commit the updated `.rubocop_gradual.lock` file along with your changes
54
+
55
+ ## Never add inline RuboCop disables
56
+
57
+ Do not add inline `rubocop:disable` / `rubocop:enable` comments anywhere in the codebase, including specs. The only exception is following one of the few existing `rubocop:disable` patterns for a rule that is already disabled elsewhere in the code. We handle exceptions in two supported ways:
58
+
59
+ - Permanent/structural exceptions: prefer adjusting the RuboCop configuration (e.g., in `.rubocop.yml`) to exclude a rule for a path or file pattern when it makes sense project-wide.
60
+ - Temporary exceptions while improving code: record the current violations in `.rubocop_gradual.lock` via the gradual workflow:
61
+ - `bundle exec rake rubocop_gradual:autocorrect` (preferred; will autocorrect what it can and update the lock only if no new violations were introduced)
62
+ - If needed, `bundle exec rake rubocop_gradual:force_update` (as a last resort when you cannot fix the newly reported violations immediately)
63
+
64
+ In general, treat the rules as guidance to follow; fix violations rather than ignore them. For example, RSpec conventions in this project expect `described_class` to be used in specs that target a specific class under test.
65
+
66
+ ## Benefits of rubocop_gradual
67
+
68
+ - Allows incremental adoption of code style rules
69
+ - Prevents CI failures due to pre-existing violations
70
+ - Provides a clear record of code style debt
71
+ - Enables focused efforts on improving code quality over time
data/SECURITY.md ADDED
@@ -0,0 +1,21 @@
1
+ # Security Policy
2
+
3
+ ## Supported Versions
4
+
5
+ | Version | Supported |
6
+ |----------|-----------|
7
+ | 1.latest | ✅ |
8
+
9
+ ## Security contact information
10
+
11
+ To report a security vulnerability, please use the
12
+ [Tidelift security contact](https://tidelift.com/security).
13
+ Tidelift will coordinate the fix and disclosure.
14
+
15
+ ## Additional Support
16
+
17
+ If you are interested in support for versions older than the latest release,
18
+ please consider sponsoring the project or maintainer at https://liberapay.com/pboling/donate,
19
+ or find other sponsorship links in the [README].
20
+
21
+ [README]: README.md
@@ -0,0 +1,410 @@
1
+ # frozen_string_literal: true
2
+
3
+ module TreeHaver
4
+ # The load condition isn't really worth testing, so :nocov:
5
+ # :nocov:
6
+ begin
7
+ require "ffi"
8
+ FFI_AVAILABLE = true
9
+ rescue LoadError
10
+ FFI_AVAILABLE = false
11
+ end
12
+ # :nocov:
13
+
14
+ module Backends
15
+ # FFI-based backend for calling libtree-sitter directly
16
+ #
17
+ # This backend uses Ruby FFI (JNR-FFI on JRuby) to call the native Tree-sitter
18
+ # C library without requiring MRI C extensions. This makes it compatible with
19
+ # JRuby, TruffleRuby, and other Ruby implementations that support FFI.
20
+ #
21
+ # The FFI backend currently supports:
22
+ # - Parsing source code
23
+ # - AST node traversal
24
+ # - Accessing node types and children
25
+ #
26
+ # Not yet supported:
27
+ # - Query API (Tree-sitter queries/patterns)
28
+ #
29
+ # @note Requires the `ffi` gem and libtree-sitter shared library to be installed
30
+ # @see https://github.com/ffi/ffi Ruby FFI
31
+ # @see https://tree-sitter.github.io/tree-sitter/ Tree-sitter
32
+ module FFI
33
+ # Native FFI bindings to libtree-sitter
34
+ #
35
+ # This module handles loading the Tree-sitter runtime library and defining
36
+ # FFI function attachments for the core Tree-sitter API.
37
+ #
38
+ # @api private
39
+ module Native
40
+ if FFI_AVAILABLE && defined?(::FFI)
41
+ extend ::FFI::Library
42
+
43
+ # FFI struct representation of TSNode
44
+ #
45
+ # Mirrors the C struct layout used by Tree-sitter. TSNode is passed
46
+ # by value in the Tree-sitter C API.
47
+ #
48
+ # @api private
49
+ class TSNode < ::FFI::Struct
50
+ layout :context,
51
+ [:uint32, 4],
52
+ :id,
53
+ :pointer,
54
+ :tree,
55
+ :pointer
56
+ end
57
+
58
+ typedef TSNode.by_value, :ts_node
59
+
60
+ class << self
61
+ # Get list of candidate library names for loading libtree-sitter
62
+ #
63
+ # The list is built dynamically to respect environment variables set at runtime.
64
+ # If TREE_SITTER_RUNTIME_LIB is set, it is tried first.
65
+ #
66
+ # @note TREE_SITTER_LIB is intentionally NOT supported
67
+ # @return [Array<String>] list of library names to try
68
+ # @example
69
+ # Native.lib_candidates
70
+ # # => ["tree-sitter", "libtree-sitter.so.0", "libtree-sitter.so", ...]
71
+ def lib_candidates
72
+ [
73
+ ENV["TREE_SITTER_RUNTIME_LIB"],
74
+ "tree-sitter",
75
+ "libtree-sitter.so.0",
76
+ "libtree-sitter.so",
77
+ "libtree-sitter.dylib",
78
+ "libtree-sitter.dll",
79
+ ].compact
80
+ end
81
+
82
+ # Load the Tree-sitter runtime library
83
+ #
84
+ # Tries each candidate library name in order until one succeeds.
85
+ # After loading, attaches FFI function definitions for the Tree-sitter API.
86
+ #
87
+ # @raise [TreeHaver::NotAvailable] if no library can be loaded
88
+ # @return [void]
89
+ # @example
90
+ # TreeHaver::Backends::FFI::Native.try_load!
91
+ def try_load!
92
+ return if @loaded # rubocop:disable ThreadSafety/ClassInstanceVariable
93
+ last_error = nil
94
+ candidates = lib_candidates
95
+ candidates.each do |name|
96
+ ffi_lib(name)
97
+ @loaded = true # rubocop:disable ThreadSafety/ClassInstanceVariable
98
+ break
99
+ rescue ::FFI::NotFoundError, LoadError => e
100
+ last_error = e
101
+ end
102
+ unless @loaded # rubocop:disable ThreadSafety/ClassInstanceVariable
103
+ # :nocov:
104
+ # This failure path cannot be tested in a shared test suite because:
105
+ # 1. Once FFI loads a library via ffi_lib, it cannot be unloaded
106
+ # 2. Other tests may load the library first (test order is randomized)
107
+ # 3. The @loaded flag can be reset, but ffi_lib state persists
108
+ # ENV precedence is tested implicitly by parsing tests that work when
109
+ # TREE_SITTER_RUNTIME_LIB is set correctly in the environment.
110
+ tried = candidates.join(", ")
111
+ env_hint = ENV["TREE_SITTER_RUNTIME_LIB"] ? " TREE_SITTER_RUNTIME_LIB=#{ENV["TREE_SITTER_RUNTIME_LIB"]}." : ""
112
+ msg = if last_error
113
+ "Could not load libtree-sitter (tried: #{tried}).#{env_hint} #{last_error.class}: #{last_error.message}"
114
+ else
115
+ "Could not load libtree-sitter (tried: #{tried}).#{env_hint}"
116
+ end
117
+ raise TreeHaver::NotAvailable, msg
118
+ # :nocov:
119
+ end
120
+
121
+ # Attach functions after lib is selected
122
+ attach_function(:ts_parser_new, [], :pointer)
123
+ attach_function(:ts_parser_delete, [:pointer], :void)
124
+ attach_function(:ts_parser_set_language, [:pointer, :pointer], :bool)
125
+ attach_function(:ts_parser_parse_string, [:pointer, :pointer, :string, :uint32], :pointer)
126
+
127
+ attach_function(:ts_tree_delete, [:pointer], :void)
128
+ attach_function(:ts_tree_root_node, [:pointer], :ts_node)
129
+
130
+ attach_function(:ts_node_type, [:ts_node], :string)
131
+ attach_function(:ts_node_child_count, [:ts_node], :uint32)
132
+ attach_function(:ts_node_child, [:ts_node, :uint32], :ts_node)
133
+ end
134
+
135
+ def loaded?
136
+ !!@loaded
137
+ end
138
+ end
139
+ else
140
+ # :nocov:
141
+ # Fallback stubs when FFI gem is not installed.
142
+ # These paths cannot be tested in a test suite where FFI is a dependency,
143
+ # since the gem is always available. They provide graceful degradation
144
+ # for environments where FFI cannot be installed.
145
+ class << self
146
+ def try_load!
147
+ raise TreeHaver::NotAvailable, "FFI not available"
148
+ end
149
+
150
+ def loaded?
151
+ false
152
+ end
153
+ end
154
+ # :nocov:
155
+ end
156
+ end
157
+
158
+ class << self
159
+ # Check if the FFI backend is available
160
+ #
161
+ # Returns true if the `ffi` gem is present. The actual runtime library
162
+ # (libtree-sitter) is loaded lazily when needed.
163
+ #
164
+ # @return [Boolean] true if FFI gem is available
165
+ # @example
166
+ # if TreeHaver::Backends::FFI.available?
167
+ # puts "FFI backend is ready"
168
+ # end
169
+ def available?
170
+ return false unless FFI_AVAILABLE && defined?(::FFI)
171
+ # We report available when ffi is present; loading lib happens lazily
172
+ true
173
+ end
174
+
175
+ # Get capabilities supported by this backend
176
+ #
177
+ # @return [Hash{Symbol => Object}] capability map
178
+ # @example
179
+ # TreeHaver::Backends::FFI.capabilities
180
+ # # => { backend: :ffi, parse: true, query: false, bytes_field: true }
181
+ def capabilities
182
+ return {} unless available?
183
+ {
184
+ backend: :ffi,
185
+ parse: true,
186
+ query: false,
187
+ bytes_field: true,
188
+ }
189
+ end
190
+ end
191
+
192
+ # Represents a Tree-sitter language loaded via FFI
193
+ #
194
+ # Holds a pointer to a TSLanguage struct from a loaded shared library.
195
+ class Language
196
+ # The FFI pointer to the TSLanguage struct
197
+ # @return [FFI::Pointer]
198
+ attr_reader :pointer
199
+
200
+ # @api private
201
+ # @param ptr [FFI::Pointer] pointer to TSLanguage
202
+ def initialize(ptr)
203
+ @pointer = ptr
204
+ end
205
+
206
+ # Convert to FFI pointer for passing to native functions
207
+ #
208
+ # @return [FFI::Pointer]
209
+ def to_ptr
210
+ @pointer
211
+ end
212
+
213
+ # Load a language from a shared library
214
+ #
215
+ # The library must export a function that returns a pointer to a TSLanguage struct.
216
+ # Symbol resolution works as follows (when symbol: is not provided):
217
+ # 1. ENV["TREE_SITTER_LANG_SYMBOL"], if set, is used strictly (no fallback)
218
+ # 2. Otherwise, a name guessed from the filename (e.g., "libtree-sitter-toml.so" → "tree_sitter_toml")
219
+ # 3. Finally, the default fallback ("tree_sitter_toml")
220
+ #
221
+ # @param path [String] absolute path to the language shared library
222
+ # @param symbol [String, nil] explicit exported function name (highest precedence)
223
+ # @param name [String, nil] optional logical name (accepted for compatibility, not used)
224
+ # @return [Language] loaded language handle
225
+ # @raise [TreeHaver::NotAvailable] if FFI not available or library cannot be loaded
226
+ # @example
227
+ # lang = TreeHaver::Backends::FFI::Language.from_library(
228
+ # "/usr/local/lib/libtree-sitter-toml.so",
229
+ # symbol: "tree_sitter_toml"
230
+ # )
231
+ class << self
232
+ def from_library(path, symbol: nil, name: nil)
233
+ raise TreeHaver::NotAvailable, "FFI not available" unless Backends::FFI.available?
234
+ begin
235
+ dl = ::FFI::DynamicLibrary.open(path, ::FFI::DynamicLibrary::RTLD_LAZY)
236
+ rescue LoadError => e
237
+ raise TreeHaver::NotAvailable, "Could not open language library at #{path}: #{e.message}"
238
+ end
239
+
240
+ requested = symbol || ENV["TREE_SITTER_LANG_SYMBOL"]
241
+ base = File.basename(path)
242
+ guessed_lang = base.sub(/\Alibtree[-_]sitter[-_]/, "").sub(/\.(?:so(?:\.\d+)*|dylib|dll)\z/, "")
243
+ # If an override was provided (arg or ENV), treat it as strict and do not fall back.
244
+ # Only when no override is provided do we attempt guessed and default candidates.
245
+ candidates = if requested && !requested.to_s.empty?
246
+ [requested]
247
+ else
248
+ [(guessed_lang.empty? ? nil : "tree_sitter_#{guessed_lang}"), "tree_sitter_toml"].compact
249
+ end
250
+
251
+ func = nil
252
+ last_err = nil
253
+ candidates.each do |name|
254
+ addr = dl.find_function(name)
255
+ func = ::FFI::Function.new(:pointer, [], addr)
256
+ break
257
+ rescue StandardError => e
258
+ last_err = e
259
+ end
260
+ unless func
261
+ env_used = []
262
+ env_used << "TREE_SITTER_LANG_SYMBOL=#{ENV["TREE_SITTER_LANG_SYMBOL"]}" if ENV["TREE_SITTER_LANG_SYMBOL"]
263
+ detail = env_used.empty? ? "" : " Env overrides: #{env_used.join(", ")}."
264
+ raise TreeHaver::NotAvailable, "Could not resolve language symbol in #{path} (tried: #{candidates.join(", ")}).#{detail} #{last_err&.message}"
265
+ end
266
+
267
+ # Only ensure the core lib is loaded when we actually need to interact with it
268
+ # (e.g., during parsing). Creating the Language handle does not require core to be loaded.
269
+ ptr = func.call
270
+ raise TreeHaver::NotAvailable, "Language factory returned NULL for #{path}" if ptr.null?
271
+ new(ptr)
272
+ end
273
+
274
+ # Backward-compatible alias
275
+ alias_method :from_path, :from_library
276
+ end
277
+ end
278
+
279
+ # FFI-based Tree-sitter parser
280
+ #
281
+ # Wraps a TSParser pointer and manages its lifecycle with a finalizer.
282
+ class Parser
283
+ # Create a new parser instance
284
+ #
285
+ # @raise [TreeHaver::NotAvailable] if FFI not available or parser creation fails
286
+ def initialize
287
+ raise TreeHaver::NotAvailable, "FFI not available" unless Backends::FFI.available?
288
+
289
+ Native.try_load!
290
+ @parser = Native.ts_parser_new
291
+ raise TreeHaver::NotAvailable, "Failed to create ts_parser" if @parser.null?
292
+
293
+ ObjectSpace.define_finalizer(self, self.class.finalizer(@parser))
294
+ end
295
+
296
+ class << self
297
+ # @api private
298
+ # @param ptr [FFI::Pointer] pointer to TSParser
299
+ # @return [Proc] finalizer that deletes the parser
300
+ def finalizer(ptr)
301
+ proc {
302
+ begin
303
+ Native.ts_parser_delete(ptr)
304
+ rescue StandardError
305
+ nil
306
+ end
307
+ }
308
+ end
309
+ end
310
+
311
+ # Set the language for this parser
312
+ #
313
+ # @param lang [Language] the language to use for parsing
314
+ # @return [Language] the language that was set
315
+ # @raise [TreeHaver::NotAvailable] if setting the language fails
316
+ def language=(lang)
317
+ ok = Native.ts_parser_set_language(@parser, lang.to_ptr)
318
+ raise TreeHaver::NotAvailable, "Failed to set language on parser" unless ok
319
+
320
+ lang
321
+ end
322
+
323
+ # Parse source code into a syntax tree
324
+ #
325
+ # @param source [String] the source code to parse (should be UTF-8)
326
+ # @return [Tree] the parsed syntax tree
327
+ # @raise [TreeHaver::NotAvailable] if parsing fails
328
+ def parse(source)
329
+ src = String(source)
330
+ tree_ptr = Native.ts_parser_parse_string(@parser, ::FFI::Pointer::NULL, src, src.bytesize)
331
+ raise TreeHaver::NotAvailable, "Parse returned NULL" if tree_ptr.null?
332
+
333
+ Tree.new(tree_ptr)
334
+ end
335
+ end
336
+
337
+ # FFI-based Tree-sitter tree
338
+ #
339
+ # Wraps a TSTree pointer and manages its lifecycle with a finalizer.
340
+ class Tree
341
+ # @api private
342
+ # @param ptr [FFI::Pointer] pointer to TSTree
343
+ def initialize(ptr)
344
+ @ptr = ptr
345
+ ObjectSpace.define_finalizer(self, self.class.finalizer(@ptr))
346
+ end
347
+
348
+ class << self
349
+ # @api private
350
+ # @param ptr [FFI::Pointer] pointer to TSTree
351
+ # @return [Proc] finalizer that deletes the tree
352
+ def finalizer(ptr)
353
+ proc {
354
+ begin
355
+ Native.ts_tree_delete(ptr)
356
+ rescue StandardError
357
+ nil
358
+ end
359
+ }
360
+ end
361
+ end
362
+
363
+ # Get the root node of the syntax tree
364
+ #
365
+ # @return [Node] the root node
366
+ def root_node
367
+ node_val = Native.ts_tree_root_node(@ptr)
368
+ Node.new(node_val)
369
+ end
370
+ end
371
+
372
+ # FFI-based Tree-sitter node
373
+ #
374
+ # Wraps a TSNode by-value struct. TSNode is passed by value in the
375
+ # Tree-sitter C API, so we store the struct value directly.
376
+ class Node
377
+ # @api private
378
+ # @param ts_node_value [Native::TSNode] the TSNode struct (by value)
379
+ def initialize(ts_node_value)
380
+ # Store by-value struct (FFI will copy); methods pass it back by value
381
+ @val = ts_node_value
382
+ end
383
+
384
+ # Get the type name of this node
385
+ #
386
+ # @return [String] the node type (e.g., "document", "table", "pair")
387
+ def type
388
+ Native.ts_node_type(@val)
389
+ end
390
+
391
+ # Iterate over child nodes
392
+ #
393
+ # @yieldparam child [Node] each child node
394
+ # @return [Enumerator, nil] an enumerator if no block given, nil otherwise
395
+ def each
396
+ return enum_for(:each) unless block_given?
397
+
398
+ count = Native.ts_node_child_count(@val)
399
+ i = 0
400
+ while i < count
401
+ child = Native.ts_node_child(@val, i)
402
+ yield Node.new(child)
403
+ i += 1
404
+ end
405
+ nil
406
+ end
407
+ end
408
+ end
409
+ end
410
+ end
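
The symbol resolution documented on `Language.from_library` can be tried out without FFI or a native library. The following is a minimal sketch, not code from the gem: the `SymbolCandidates` module, its method names, and the `env:` parameter are hypothetical and exist only for illustration. It also assumes the library suffix should be matched only at the end of the filename, with any number of trailing version components, which may differ from what the released code does.

```ruby
# Hypothetical helper, not part of tree_haver: it reproduces only the
# candidate-selection step described for Language.from_library.
module SymbolCandidates
  # Default fallback symbol, as named in the from_library documentation.
  DEFAULT_SYMBOL = "tree_sitter_toml"

  module_function

  # Strip the "libtree-sitter-" prefix and a trailing shared-library suffix
  # (.so, .so.N, .so.N.M, .dylib, .dll) from the file's basename.
  # Assumption: the suffix is anchored to the end of the name.
  def guess_language(path)
    File.basename(path)
      .sub(/\Alibtree[-_]sitter[-_]/, "")
      .sub(/\.(?:so(?:\.\d+)*|dylib|dll)\z/, "")
  end

  # Return the symbol names to try, in order.
  # An explicit symbol or environment override is strict: when one is
  # present, neither the guessed name nor the default is tried.
  # `env` is injectable so the behavior can be checked without touching ENV.
  def for_path(path, symbol: nil, env: ENV)
    requested = symbol || env["TREE_SITTER_LANG_SYMBOL"]
    return [requested] if requested && !requested.to_s.empty?

    guessed = guess_language(path)
    [(guessed.empty? ? nil : "tree_sitter_#{guessed}"), DEFAULT_SYMBOL].compact
  end
end
```

With an empty environment, `SymbolCandidates.for_path("/usr/local/lib/libtree-sitter-json.so", env: {})` returns `["tree_sitter_json", "tree_sitter_toml"]`, while passing `symbol: "my_sym"` returns only `["my_sym"]`. Keeping overrides strict means a mistyped symbol shows up as an error and does not silently resolve to a different grammar.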