tree_haver 3.2.2 → 3.2.4
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/CHANGELOG.md +218 -3
- data/LICENSE.txt +1 -1
- data/README.md +140 -22
- data/lib/tree_haver/backend_api.rb +349 -0
- data/lib/tree_haver/backends/citrus.rb +37 -8
- data/lib/tree_haver/backends/commonmarker.rb +24 -0
- data/lib/tree_haver/backends/java.rb +160 -22
- data/lib/tree_haver/backends/markly.rb +24 -0
- data/lib/tree_haver/backends/prism.rb +21 -8
- data/lib/tree_haver/backends/psych.rb +24 -0
- data/lib/tree_haver/language.rb +25 -10
- data/lib/tree_haver/language_registry.rb +57 -3
- data/lib/tree_haver/node.rb +15 -1
- data/lib/tree_haver/rspec/dependency_tags.rb +249 -23
- data/lib/tree_haver/version.rb +1 -1
- data/lib/tree_haver.rb +370 -108
- data.tar.gz.sig +0 -0
- metadata +5 -4
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 1cea4c9eec7fee0afdffcb4f6b66a7dce698af731cc23028a77f70bc71dc207e
|
|
4
|
+
data.tar.gz: 68ac3a9c026204a9f0cd42f0afc01f00478405f0be07b74c32c1a85488c7275e
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 4f48fad6061b7d14cc8cd0b0a7c98e0457f1e65613ef8f4ea469f5cdb36e6137512b42a329f846752e9016a43a2fe6cd24e73cbcce2ecb83b17f4cc9eb3653b6
|
|
7
|
+
data.tar.gz: 68949852113d833c6d40d8bde278d08c9b4b68b6d90299af99036683b070a3eeac97a92723de435be2c9e7025f718627d9b100a5fbcbcad554e5e3c6e3b141e3
|
checksums.yaml.gz.sig
CHANGED
|
Binary file
|
data/CHANGELOG.md
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
|
|
1
|
+
# Changelog
|
|
2
2
|
|
|
3
3
|
[![SemVer 2.0.0][📌semver-img]][📌semver] [![Keep-A-Changelog 1.0.0][📗keep-changelog-img]][📗keep-changelog]
|
|
4
4
|
|
|
@@ -30,6 +30,217 @@ Please file a bug if you notice a violation of semantic versioning.
|
|
|
30
30
|
|
|
31
31
|
### Security
|
|
32
32
|
|
|
33
|
+
## [3.2.4] - 2026-01-04
|
|
34
|
+
|
|
35
|
+
- TAG: [v3.2.4][3.2.4t]
|
|
36
|
+
- COVERAGE: 92.07% -- 2229/2421 lines in 23 files
|
|
37
|
+
- BRANCH COVERAGE: 74.79% -- 786/1051 branches in 23 files
|
|
38
|
+
- 90.37% documented
|
|
39
|
+
|
|
40
|
+
### Added
|
|
41
|
+
|
|
42
|
+
- **External backend registration via `backend_module`** - External gems can now register
|
|
43
|
+
their own pure Ruby backends using the same API as built-in backends. This enables gems
|
|
44
|
+
like rbs-merge to integrate with `TreeHaver.parser_for` without modifying tree_haver:
|
|
45
|
+
```ruby
|
|
46
|
+
TreeHaver.register_language(
|
|
47
|
+
:rbs,
|
|
48
|
+
backend_module: Rbs::Merge::Backends::RbsBackend,
|
|
49
|
+
backend_type: :rbs,
|
|
50
|
+
gem_name: "rbs",
|
|
51
|
+
)
|
|
52
|
+
# Now TreeHaver.parser_for(:rbs) works!
|
|
53
|
+
```
|
|
54
|
+
- **`Backends::PURE_RUBY_BACKENDS` constant** - Maps pure Ruby backend names to their
|
|
55
|
+
language and module info. Used for auto-registration of built-in backends.
|
|
56
|
+
- **`TreeHaver.register_builtin_backends!`** - Registers built-in pure Ruby backends
|
|
57
|
+
(Prism, Psych, Commonmarker, Markly) in the LanguageRegistry using the same API that
|
|
58
|
+
external backends use. Called automatically by `parser_for` on first use.
|
|
59
|
+
- **`TreeHaver.ensure_builtin_backends_registered!`** - Idempotent helper that ensures
|
|
60
|
+
built-in backends are registered exactly once.
|
|
61
|
+
- **`parser_for` now supports registered `backend_module` backends** - When a language
|
|
62
|
+
has a registered `backend_module`, `parser_for` will use it. This enables external
|
|
63
|
+
gems to provide language support without tree-sitter grammars:
|
|
64
|
+
- Checks LanguageRegistry for registered `backend_module` entries
|
|
65
|
+
- Creates parser from the backend module's `Parser` and `Language` classes
|
|
66
|
+
- Falls back to tree-sitter and Citrus if no backend_module matches
|
|
67
|
+
- **RBS dependency tags in `DependencyTags`** - New RSpec tags for RBS parsing:
|
|
68
|
+
- `:rbs_grammar` - tree-sitter-rbs grammar is available and parsing works
|
|
69
|
+
- `:rbs_parsing` - at least one RBS parser (rbs gem OR tree-sitter-rbs) is available
|
|
70
|
+
- `:rbs_gem` - the official rbs gem is available (MRI only)
|
|
71
|
+
- Negated versions: `:not_rbs_grammar`, `:not_rbs_parsing`, `:not_rbs_gem`
|
|
72
|
+
- New availability methods: `tree_sitter_rbs_available?`, `rbs_gem_available?`, `any_rbs_backend_available?`
|
|
73
|
+
- **Support for tree-sitter 0.26.x ABI** - TreeHaver now fully supports grammars built
|
|
74
|
+
against tree-sitter 0.26.x (LANGUAGE_VERSION 15). This required updates to vendored
|
|
75
|
+
dependencies:
|
|
76
|
+
- **ruby-tree-sitter**: Updated to support tree-sitter 0.26.3 C library API changes
|
|
77
|
+
including new `ts_language_abi_version()` function, UTF-16 encoding split, and
|
|
78
|
+
removal of deprecated parser timeout/cancellation APIs
|
|
79
|
+
- **tree_stump (Rust backend)**: Updated to tree-sitter Rust crate 0.26.3 with new
|
|
80
|
+
`abi_version()` method, `u32` child indices, and streaming iterator-based query matches
|
|
81
|
+
- **MRI backend now loads grammars with LANGUAGE_VERSION 15** - Previously, MRI backend
|
|
82
|
+
using ruby_tree_sitter could only load grammars with LANGUAGE_VERSION ≤ 14. Now supports
|
|
83
|
+
grammars built against tree-sitter 0.26.x.
|
|
84
|
+
- **Rust backend now loads grammars with LANGUAGE_VERSION 15** - Previously, the tree_stump
|
|
85
|
+
Rust backend reported "Incompatible language version 15. Expected minimum 13, maximum 14".
|
|
86
|
+
Now supports the latest grammar format.
|
|
87
|
+
- **BackendAPI validation module** - New `TreeHaver::BackendAPI` module for validating
|
|
88
|
+
backend API compliance:
|
|
89
|
+
- `BackendAPI.validate(backend_module)` - Returns validation results hash
|
|
90
|
+
- `BackendAPI.validate!(backend_module)` - Raises on validation failure
|
|
91
|
+
- `BackendAPI.validate_node_instance(node)` - Validates a node instance
|
|
92
|
+
- Defines required and optional methods for Language, Parser, Tree, and Node classes
|
|
93
|
+
- Documents API contract for wrapper vs raw backends
|
|
94
|
+
- New `examples/validate_backends.rb` script to validate all backends
|
|
95
|
+
- **Java backend Node class now implements full API** - Added missing methods to ensure
|
|
96
|
+
API consistency with other backends:
|
|
97
|
+
- `parent` - Get parent node
|
|
98
|
+
- `next_sibling` - Get next sibling node
|
|
99
|
+
- `prev_sibling` - Get previous sibling node
|
|
100
|
+
- `named?` - Check if node is named
|
|
101
|
+
- `child_by_field_name` - Get child by field name
|
|
102
|
+
- All methods properly handle jtreesitter 0.26.0's `Optional<Node>` return types
|
|
103
|
+
- **Three environment variables for backend control** - Fine-grained control over which
|
|
104
|
+
backends are available:
|
|
105
|
+
- `TREE_HAVER_BACKEND` - Single backend selection (auto, mri, ffi, rust, java, citrus, etc.)
|
|
106
|
+
- `TREE_HAVER_NATIVE_BACKEND` - Allow list for native backends (auto, none, or comma-separated
|
|
107
|
+
list like `mri,ffi`). Use `none` for pure-Ruby-only mode.
|
|
108
|
+
- `TREE_HAVER_RUBY_BACKEND` - Allow list for pure Ruby backends (auto, none, or comma-separated
|
|
109
|
+
list like `citrus,prism`). Use `none` for native-only mode.
|
|
110
|
+
- **Backend availability now respects allow lists** - When `TREE_HAVER_NATIVE_BACKEND` is set
|
|
111
|
+
to specific backends (e.g., `mri,ffi`), all other native backends are treated as unavailable.
|
|
112
|
+
This applies to ALL backend selection mechanisms:
|
|
113
|
+
- Auto-selection in `backend_module`
|
|
114
|
+
- Explicit selection via `with_backend(:rust)` - returns nil/unavailable
|
|
115
|
+
- Explicit selection via `resolve_backend_module(:rust)` - returns nil
|
|
116
|
+
- RSpec dependency tags (`ffi_available?`, etc.)
|
|
117
|
+
|
|
118
|
+
This makes the environment variables a **hard restriction**, not just a hint for auto-selection.
|
|
119
|
+
Use `TREE_HAVER_NATIVE_BACKEND=none` for pure-Ruby-only mode, or specify exactly which
|
|
120
|
+
native backends are permitted (e.g., `mri,ffi`).
|
|
121
|
+
- **Java backend updated for jtreesitter 0.26.0** - Full compatibility with jtreesitter 0.26.0:
|
|
122
|
+
- Updated `Parser#parse` and `Parser#parse_string` to handle `Optional<Tree>` return type
|
|
123
|
+
- Updated `Tree#root_node` to handle `Optional<Node>` return type
|
|
124
|
+
- Fixed `parse_string` argument order to match jtreesitter 0.26.0 API: `parse(String, Tree)`
|
|
125
|
+
- Updated `Language.load_by_name` to use `SymbolLookup` API (single-arg `load(name)` removed)
|
|
126
|
+
- Added `bin/setup-jtreesitter` script to download jtreesitter JAR from Maven Central
|
|
127
|
+
- Added `bin/build-grammar` script to build tree-sitter grammars from source
|
|
128
|
+
- Older versions of jtreesitter are NOT supported
|
|
129
|
+
- **`TREE_HAVER_BACKEND_PROTECT` environment variable** - Explicit control over backend
|
|
130
|
+
conflict protection. Set to `false` to disable protection that prevents mixing
|
|
131
|
+
incompatible native backends (e.g., FFI after MRI). Useful for testing scenarios
|
|
132
|
+
where you understand the risks. Default behavior (protection enabled) unchanged.
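The allow-list variables described above (`TREE_HAVER_NATIVE_BACKEND`, `TREE_HAVER_RUBY_BACKEND`) take `auto`, `none`, or a comma-separated list. The following is a minimal sketch of how such a value could be interpreted, not tree_haver's actual implementation. The helper names `parse_allow_list` and `backend_allowed_by?` are hypothetical, and treating an unset or empty value as `auto` is an assumption.

```ruby
# Hypothetical helpers for illustration; these are not part of tree_haver.

# Interpret an allow-list value such as "auto", "none", or "mri,ffi".
# Returns :auto (no restriction), an empty array (nothing allowed),
# or an array of backend symbols.
# Assumption: nil or an empty string is treated the same as "auto".
def parse_allow_list(raw)
  value = raw.to_s.strip.downcase
  return :auto if value.empty? || value == "auto"
  return [] if value == "none"

  value.split(",").map(&:strip).reject(&:empty?).map(&:to_sym).uniq
end

# A backend passes when the list is unrestricted or names it explicitly.
def backend_allowed_by?(allow_list, backend)
  allow_list == :auto || allow_list.include?(backend.to_sym)
end

native = parse_allow_list("mri,ffi")
puts backend_allowed_by?(native, :ffi)
puts backend_allowed_by?(native, :rust)
```

Returning an empty array for `none` keeps the check uniform: a hard restriction is simply a list that no backend appears in.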
|
|
133
|
+
|
|
134
|
+
### Changed
|
|
135
|
+
|
|
136
|
+
- **API normalized: `from_library` is now universal** - All language-specific backends
|
|
137
|
+
(Psych, Prism, Commonmarker, Markly) now implement `Language.from_library` for API
|
|
138
|
+
consistency. This allows `TreeHaver.parser_for(:yaml)` to work uniformly regardless
|
|
139
|
+
of which backend is active:
|
|
140
|
+
- **Psych**: `from_library` accepts (and ignores) path/symbol, returns YAML language
|
|
141
|
+
- **Prism**: `from_library` accepts (and ignores) path/symbol, returns Ruby language
|
|
142
|
+
- **Commonmarker**: `from_library` accepts (and ignores) path/symbol, returns Markdown language
|
|
143
|
+
- **Markly**: `from_library` accepts (and ignores) path/symbol, returns Markdown language
|
|
144
|
+
- All raise `TreeHaver::NotAvailable` if a different language is requested
|
|
145
|
+
- **Citrus backend `from_library` now looks up registered grammars** - Instead of always
|
|
146
|
+
raising an error, `Backends::Citrus::Language.from_library` now looks up registered
|
|
147
|
+
Citrus grammars by name via `LanguageRegistry`. This enables `TreeHaver.parser_for(:toml)`
|
|
148
|
+
to work seamlessly when a Citrus grammar has been registered with
|
|
149
|
+
`TreeHaver.register_language(:toml, grammar_module: TomlRB::Document)`.
|
|
150
|
+
- **Java backend requires jtreesitter >= 0.26.0** - Due to API changes in jtreesitter,
|
|
151
|
+
older versions are no longer supported. The tree-sitter runtime library must also be
|
|
152
|
+
version 0.26.x to match.
|
|
153
|
+
by the RSpec dependency tags. This ensures tests tagged with `:mri_backend` only run when
|
|
154
|
+
MRI is in the allow list. Same for `TREE_HAVER_RUBY_BACKEND` and pure Ruby backends.
|
|
155
|
+
- New `TreeHaver.allowed_native_backends` method returns the allow list for native backends.
|
|
156
|
+
- New `TreeHaver.allowed_ruby_backends` method returns the allow list for pure Ruby backends.
|
|
157
|
+
- New `TreeHaver.backend_allowed?(backend)` method checks if a specific backend is allowed
|
|
158
|
+
based on the current environment variable settings.
|
|
159
|
+
- New `DependencyTags.allowed_native_backends` and `DependencyTags.allowed_ruby_backends` methods.
|
|
160
|
+
- Updated `examples/test_backend_selection.rb` script to test all three environment variables.
|
|
161
|
+
- **`LanguageRegistry` now supports any backend type** - Previously only `:tree_sitter` and
|
|
162
|
+
`:citrus` were documented. Now supports arbitrary backend types including `:prism`, `:psych`,
|
|
163
|
+
`:commonmarker`, `:markly`, `:rbs`, or any custom type. External gems can register their
|
|
164
|
+
own backend types using the same API.
|
|
165
|
+
- **`register_language` accepts `backend_module` parameter** - New parameter for registering
|
|
166
|
+
pure Ruby backends. The module must provide `Language` and `Parser` classes with the
|
|
167
|
+
standard TreeHaver API (`available?`, `capabilities`, `from_library`, etc.).
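As a rough illustration of the `backend_module` contract described above (a module that provides `Language` and `Parser` classes), here is a sketch in plain Ruby. `ExampleBackend`, `missing_backend_classes`, and the `from_library` signature shown are all hypothetical; the real contract is defined by tree_haver itself and may differ.

```ruby
# Hypothetical backend module, for illustration only.
module ExampleBackend
  class Language
    attr_reader :name

    def initialize(name)
      @name = name
    end

    # Accepts (and ignores) a path and options so callers can treat
    # every backend alike. The signature here is an assumption.
    def self.from_library(_path = nil, **_options)
      new(:example)
    end
  end

  class Parser
    attr_accessor :language

    # Placeholder "parse": returns the source split into lines.
    def parse(source)
      source.lines.map(&:chomp)
    end
  end
end

# Returns the names of required classes that the module does not define.
def missing_backend_classes(backend_module, required: %i[Language Parser])
  required.reject do |const_name|
    backend_module.const_defined?(const_name, false) &&
      backend_module.const_get(const_name, false).is_a?(Class)
  end
end

puts missing_backend_classes(ExampleBackend).inspect
```

A check of this shape can run before registration so that an incomplete module is reported early instead of failing on first parse.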
|
|
168
|
+
|
|
169
|
+
### Fixed
|
|
170
|
+
|
|
171
|
+
- **`TreeHaver::Node#text` now handles backends with different `text` method signatures** -
|
|
172
|
+
Previously, `Node#text` would call `@inner_node.text` directly, but `TreeStump::Node#text`
|
|
173
|
+
(Rust backend) requires the source as an argument (`text(source)`). This caused
|
|
174
|
+
`ArgumentError: wrong number of arguments (given 0, expected 1)` when using the Rust
|
|
175
|
+
backend. Now `Node#text` checks the method arity and passes the source when required:
|
|
176
|
+
- Arity 0 or -1: calls `@inner_node.text` without arguments
|
|
177
|
+
- Arity >= 1: calls `@inner_node.text(@source)` with source
|
|
178
|
+
- Falls back to byte-based extraction if source is available
|
|
179
|
+
|
|
180
|
+
- **AUTO mode now gracefully falls back when explicitly requested backend is blocked** -
|
|
181
|
+
Previously, if `TREE_HAVER_BACKEND=ffi` was set in the environment but FFI was blocked
|
|
182
|
+
due to MRI being used first (backend conflict protection), `parser_for` would raise a
|
|
183
|
+
`BackendConflict` error. Now, when the explicitly requested backend is blocked by a
|
|
184
|
+
**backend conflict** (e.g., FFI after MRI causes segfaults):
|
|
185
|
+
- `backend_module` detects the conflict and falls back to auto-selection
|
|
186
|
+
- `resolve_native_backend_module` rescues `BackendConflict` and continues to the next
|
|
187
|
+
backend in the priority list
|
|
188
|
+
- This enables seamless multi-backend usage in test suites where different tests use
|
|
189
|
+
different backends, but one backend has already "poisoned" the process for another.
|
|
190
|
+
|
|
191
|
+
Note: This fallback only applies to **backend conflicts** (runtime incompatibility).
|
|
192
|
+
If a backend is disallowed by `TREE_HAVER_NATIVE_BACKEND` or `TREE_HAVER_RUBY_BACKEND`,
|
|
193
|
+
it will simply be unavailable—no error is raised, but no fallback occurs either.
|
|
194
|
+
|
|
195
|
+
- **`java_backend_available?` now verifies grammar loading works** - Previously, the
|
|
196
|
+
`DependencyTags.java_backend_available?` method only checked if java-tree-sitter
|
|
197
|
+
classes could be loaded, but didn't verify that grammars could actually be used.
|
|
198
|
+
This caused tests tagged with `:java_backend` to run on JRuby even when the grammar
|
|
199
|
+
`.so` files (built for MRI) were incompatible with java-tree-sitter's Foreign Function
|
|
200
|
+
Memory API. Now the check does a live test by attempting to load a grammar, ensuring
|
|
201
|
+
the tag accurately reflects whether the Java backend is fully functional.
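The `Node#text` arity check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the gem's code: `WrappedNode`, `NoArgNode`, `SourceArgNode`, and `ByteRangeNode` are hypothetical stand-ins for a wrapper node and backend nodes.

```ruby
# Hypothetical stand-ins for backend node classes.
class NoArgNode
  def text
    "from node"
  end
end

class SourceArgNode
  # Mimics a backend whose text method needs the source passed in.
  def text(source)
    source.byteslice(0, 5)
  end
end

# A node that exposes only byte offsets and has no text method.
ByteRangeNode = Struct.new(:start_byte, :end_byte)

class WrappedNode
  def initialize(inner_node, source)
    @inner_node = inner_node
    @source = source
  end

  def text
    if @inner_node.respond_to?(:text)
      arity = @inner_node.method(:text).arity
      # Arity 0 or -1: the method can be called without arguments.
      return @inner_node.text if arity.zero? || arity == -1
      # Otherwise the method wants the source, if we have it.
      return @inner_node.text(@source) if @source
    end

    # Fallback: slice the source by byte offsets when both are available.
    return unless @source && @inner_node.respond_to?(:start_byte)

    length = @inner_node.end_byte - @inner_node.start_byte
    @source.byteslice(@inner_node.start_byte, length)
  end
end

puts WrappedNode.new(SourceArgNode.new, "hello world").text
```

`Method#arity` returns 0 for a method with no parameters and -1 when every parameter is optional, which is why both cases take the no-argument path.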
|
|
202
|
+
|
|
203
|
+
## [3.2.3] - 2026-01-02
|
|
204
|
+
|
|
205
|
+
- TAG: [v3.2.3][3.2.3t]
|
|
206
|
+
- COVERAGE: 94.91% -- 2088/2200 lines in 22 files
|
|
207
|
+
- BRANCH COVERAGE: 81.37% -- 738/907 branches in 22 files
|
|
208
|
+
- 90.14% documented
|
|
209
|
+
|
|
210
|
+
### Fixed
|
|
211
|
+
|
|
212
|
+
- **`parser_for` now respects explicitly requested non-native backends** - Previously,
|
|
213
|
+
`parser_for` would always try tree-sitter backends first and only fall back to alternative
|
|
214
|
+
backends if tree-sitter was unavailable. Now it checks `effective_backend` and skips
|
|
215
|
+
tree-sitter attempts entirely when a non-native backend is explicitly requested via:
|
|
216
|
+
- `TREE_HAVER_BACKEND=citrus` (or `prism`, `psych`, `commonmarker`, `markly`)
|
|
217
|
+
- `TreeHaver.backend = :citrus`
|
|
218
|
+
- `TreeHaver.with_backend(:citrus) { ... }`
|
|
219
|
+
|
|
220
|
+
Native backends (`:mri`, `:rust`, `:ffi`, `:java`) still use tree-sitter grammar discovery.
|
|
221
|
+
|
|
222
|
+
- **`load_tree_sitter_language` now correctly ignores Citrus registrations** - Previously,
|
|
223
|
+
if a language was registered with Citrus first, `load_tree_sitter_language` would
|
|
224
|
+
incorrectly try to use it even when a native backend was explicitly requested. Now it
|
|
225
|
+
only uses registrations that have a `:tree_sitter` key, allowing proper backend switching
|
|
226
|
+
between Citrus and native tree-sitter backends.
|
|
227
|
+
|
|
228
|
+
- **`load_tree_sitter_language` now validates registered paths exist** - Previously,
|
|
229
|
+
if a language had a stale/invalid tree-sitter registration with a non-existent path
|
|
230
|
+
(e.g., from a test), the code would try to use it and fail. Now it checks
|
|
231
|
+
`File.exist?(path)` before using a registered path, falling back to auto-discovery
|
|
232
|
+
via `GrammarFinder` if the registered path doesn't exist.
|
|
233
|
+
|
|
234
|
+
- **`Language.method_missing` no longer falls back to Citrus when native backend explicitly requested** -
|
|
235
|
+
Previously, when tree-sitter loading failed (e.g., .so file missing), the code would
|
|
236
|
+
silently fall back to Citrus even if the user explicitly requested `:mri`, `:rust`,
|
|
237
|
+
`:ffi`, or `:java`. Now fallback to Citrus only happens when `effective_backend` is `:auto`.
|
|
238
|
+
This is a **breaking change** for users who relied on silent fallback behavior.
|
|
239
|
+
|
|
240
|
+
- **Simplified `parser_for` implementation** - Refactored from complex nested conditionals to
|
|
241
|
+
cleaner helper methods (`load_tree_sitter_language`, `load_citrus_language`). The logic is
|
|
242
|
+
now easier to follow and maintain.
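The registered-path validation described above for `load_tree_sitter_language` (check `File.exist?` first, otherwise fall back to discovery) might look something like this sketch. `resolve_grammar_path` and the `discover:` callable are hypothetical names; the callable stands in for whatever `GrammarFinder` does in the real code.

```ruby
# Hypothetical helper: use a registered grammar path only when the file
# actually exists; otherwise defer to a discovery callable.
def resolve_grammar_path(registered_path, discover:)
  return registered_path if registered_path && File.exist?(registered_path)

  discover.call
end

# A stale registration (path does not exist) falls through to discovery.
path = resolve_grammar_path(
  "/no/such/dir/libtree-sitter-toml.so",
  discover: -> { "discovered-path-placeholder" }
)
puts path
```

Passing discovery in as a callable keeps the fallback lazy: it runs only when the registered path is missing or stale.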
|
|
243
|
+
|
|
33
244
|
## [3.2.2] - 2026-01-01
|
|
34
245
|
|
|
35
246
|
- TAG: [v3.2.2][3.2.2t]
|
|
@@ -674,7 +885,11 @@ Despite the major version bump to 3.0.0 (following semver due to the breaking `L
|
|
|
674
885
|
|
|
675
886
|
- Initial release
|
|
676
887
|
|
|
677
|
-
[Unreleased]: https://github.com/kettle-rb/tree_haver/compare/v3.2.
|
|
888
|
+
[Unreleased]: https://github.com/kettle-rb/tree_haver/compare/v3.2.4...HEAD
|
|
889
|
+
[3.2.4]: https://github.com/kettle-rb/tree_haver/compare/v3.2.3...v3.2.4
|
|
890
|
+
[3.2.4t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.2.4
|
|
891
|
+
[3.2.3]: https://github.com/kettle-rb/tree_haver/compare/v3.2.2...v3.2.3
|
|
892
|
+
[3.2.3t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.2.3
|
|
678
893
|
[3.2.2]: https://github.com/kettle-rb/tree_haver/compare/v3.2.1...v3.2.2
|
|
679
894
|
[3.2.2t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.2.2
|
|
680
895
|
[3.2.1]: https://github.com/kettle-rb/tree_haver/compare/v3.2.0...v3.2.1
|
|
@@ -692,4 +907,4 @@ Despite the major version bump to 3.0.0 (following semver due to the breaking `L
|
|
|
692
907
|
[2.0.0]: https://github.com/kettle-rb/tree_haver/compare/v1.0.0...v2.0.0
|
|
693
908
|
[2.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v2.0.0
|
|
694
909
|
[1.0.0]: https://github.com/kettle-rb/tree_haver/compare/a89211bff10f4440b96758a8ac9d7d539001b0c8...v1.0.0
|
|
695
|
-
[1.0.0t]: https://github.com/kettle-rb/tree_haver/tags/v1.0.0
|
|
910
|
+
[1.0.0t]: https://github.com/kettle-rb/tree_haver/tags/v1.0.0
|
data/LICENSE.txt
CHANGED
data/README.md
CHANGED
|
@@ -185,25 +185,120 @@ gem "citrus", "~> 3.0"
|
|
|
185
185
|
|
|
186
186
|
#### Java Backend (JRuby only)
|
|
187
187
|
|
|
188
|
-
|
|
188
|
+
**Requires jtreesitter >= 0.26.0** from Maven Central. Older versions are not supported due to breaking API changes.
|
|
189
|
+
|
|
190
|
+
```ruby
|
|
191
|
+
# No gem dependency - uses JRuby's built-in Java integration
|
|
192
|
+
# Download the JAR:
|
|
193
|
+
# curl -L -o jtreesitter-0.26.0.jar \
|
|
194
|
+
# "https://repo1.maven.org/maven2/io/github/tree-sitter/jtreesitter/0.26.0/jtreesitter-0.26.0.jar"
|
|
195
|
+
|
|
196
|
+
# Set environment variable:
|
|
197
|
+
# export TREE_SITTER_JAVA_JARS_DIR=/path/to/jars
|
|
198
|
+
```
|
|
199
|
+
|
|
200
|
+
**Also requires**:
|
|
201
|
+
- Tree-sitter runtime library (`libtree-sitter.so`) version 0.26+ (must match jtreesitter version)
|
|
202
|
+
- Grammar `.so` files built against tree-sitter 0.26+ (or rebuilt with `tree-sitter generate`)
|
|
189
203
|
|
|
190
204
|
### Backend Platform Compatibility
|
|
191
205
|
|
|
192
206
|
Not all backends work on all Ruby platforms. Here's a complete compatibility matrix:
|
|
193
207
|
|
|
194
|
-
| Backend | MRI | JRuby | TruffleRuby | Notes |
|
|
195
|
-
| --- | :-: | :-: | :-: | --- |
|
|
196
|
-
| **MRI** ([ruby\_tree\_sitter](https://github.com/Faveod/ruby-tree-sitter)) | ✅ | ❌ | ❌ | C extension, MRI only |
|
|
197
|
-
| **Rust** ([tree\_stump](https://github.com/joker1007/tree_stump)) | ✅ | ❌ | ❌ | magnus/rb-sys incompatible with non-MRI |
|
|
198
|
-
| **FFI** | ✅ | ✅ | ❌ | TruffleRuby FFI doesn't support `STRUCT_BY_VALUE` |
|
|
199
|
-
| **Java** ([jtreesitter](https://central.sonatype.com/artifact/io.github.tree-sitter/jtreesitter)) | ❌ | ✅ | ❌ | JRuby only, requires
|
|
200
|
-
| **Prism** | ✅ | ✅ | ✅ | Ruby parsing, stdlib in Ruby 3.4+ |
|
|
201
|
-
| **Psych** | ✅ | ✅ | ✅ | YAML parsing, stdlib |
|
|
202
|
-
| **Citrus** | ✅ | ✅ | ✅ | Pure Ruby, no native dependencies |
|
|
203
|
-
| **Commonmarker** | ✅ | ❌ | ❓ | Rust extension for Markdown |
|
|
204
|
-
| **Markly** | ✅ | ❌ | ❓ | C extension for Markdown |
|
|
208
|
+
| Backend | MRI | JRuby | TruffleRuby | API Complete | Notes |
|
|
209
|
+
| --- | :-: | :-: | :-: | :-: | --- |
|
|
210
|
+
| **MRI** ([ruby\_tree\_sitter](https://github.com/Faveod/ruby-tree-sitter)) | ✅ | ❌ | ❌ | ✅ | C extension, MRI only |
|
|
211
|
+
| **Rust** ([tree\_stump](https://github.com/joker1007/tree_stump)) | ✅ | ❌ | ❌ | ✅ | magnus/rb-sys incompatible with non-MRI |
|
|
212
|
+
| **FFI** | ✅ | ✅ | ❌ | ⚠️ | TruffleRuby FFI doesn't support `STRUCT_BY_VALUE` |
|
|
213
|
+
| **Java** ([jtreesitter](https://central.sonatype.com/artifact/io.github.tree-sitter/jtreesitter)) | ❌ | ✅ | ❌ | ✅ | JRuby only, requires jtreesitter >= 0.26.0 |
|
|
214
|
+
| **Prism** | ✅ | ✅ | ✅ | ✅ | Ruby parsing, stdlib in Ruby 3.4+ |
|
|
215
|
+
| **Psych** | ✅ | ✅ | ✅ | ✅ | YAML parsing, stdlib |
|
|
216
|
+
| **Citrus** | ✅ | ✅ | ✅ | ⚠️ | Pure Ruby, no native dependencies |
|
|
217
|
+
| **Commonmarker** | ✅ | ❌ | ❓ | ✅ | Rust extension for Markdown |
|
|
218
|
+
| **Markly** | ✅ | ❌ | ❓ | ✅ | C extension for Markdown |
|
|
219
|
+
|
|
220
|
+
**Legend**: ✅ = Works / Complete, ❌ = Does not work, ❓ = Untested, ⚠️ = Partial (some optional methods missing)
|
|
221
|
+
|
|
222
|
+
**API Complete** indicates whether the backend implements all optional Node methods (`parent`, `next_sibling`, `prev_sibling`, `named?`, `missing?`, `text`, `child_by_field_name`). Backends marked ⚠️ work but may be missing some advanced traversal methods. Use `TreeHaver::BackendAPI.validate(backend_module)` to check specific backends.
|
|
223
|
+
|
|
224
|
+
### Version Requirements for Tree-Sitter Backends
|
|
225
|
+
|
|
226
|
+
#### tree-sitter Runtime Library
|
|
227
|
+
|
|
228
|
+
All tree-sitter backends (MRI, Rust, FFI, Java) require the tree-sitter runtime library. **Version 0.26+ is required** for the Java backend (to match jtreesitter 0.26.0). Other backends may work with 0.24+, but 0.26+ is recommended for consistency.
|
|
229
|
+
|
|
230
|
+
```bash
|
|
231
|
+
# Check your tree-sitter version
|
|
232
|
+
tree-sitter --version # Should be 0.26.0 or newer for Java backend
|
|
233
|
+
|
|
234
|
+
# macOS
|
|
235
|
+
brew install tree-sitter
|
|
236
|
+
|
|
237
|
+
# Ubuntu/Debian
|
|
238
|
+
apt-get install libtree-sitter0 libtree-sitter-dev
|
|
239
|
+
|
|
240
|
+
# Fedora
|
|
241
|
+
dnf install tree-sitter tree-sitter-devel
|
|
242
|
+
```
|
|
243
|
+
|
|
244
|
+
#### jtreesitter (Java Backend)
|
|
245
|
+
|
|
246
|
+
**The Java backend requires jtreesitter >= 0.26.0.** This version introduced breaking API changes:
|
|
247
|
+
|
|
248
|
+
- `Parser.parse()` returns `Optional<Tree>` instead of `Tree`
|
|
249
|
+
- `Tree.getRootNode()` returns `Node` directly (not `Optional<Node>`)
|
|
250
|
+
- `Node.getChild()`, `getParent()`, `getNextSibling()`, `getPrevSibling()` return `Optional<Node>`
|
|
251
|
+
- `Language.load(name)` was removed; use `SymbolLookup` API instead
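Because several of the methods listed above return `Optional` values, calling code on JRuby has to unwrap them before use. Below is a minimal duck-typed sketch. `unwrap_optional` and `FakeOptional` are hypothetical names, and the sketch assumes the wrapped object responds to `isPresent` and `get`, as `java.util.Optional` does when accessed from JRuby. The stand-in struct lets the example run without a JVM.

```ruby
# Hypothetical helper: unwrap Optional-like objects and pass plain
# values (including nil) through unchanged.
def unwrap_optional(value)
  return value unless value.respond_to?(:isPresent) && value.respond_to?(:get)

  value.isPresent ? value.get : nil
end

# Pure-Ruby stand-in for java.util.Optional, so the sketch runs on MRI.
FakeOptional = Struct.new(:value) do
  def isPresent
    !value.nil?
  end

  def get
    raise "empty optional" if value.nil?

    value
  end
end

puts unwrap_optional(FakeOptional.new("root node")).inspect
puts unwrap_optional(FakeOptional.new(nil)).inspect
```

Passing non-Optional values through unchanged means the same helper can sit in front of methods that return a value directly and methods that return an `Optional`.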
|
|
252
|
+
|
|
253
|
+
Older versions of jtreesitter are **NOT supported**.
|
|
254
|
+
|
|
255
|
+
```bash
|
|
256
|
+
# Download jtreesitter 0.26.0 from Maven Central
|
|
257
|
+
curl -L -o jtreesitter-0.26.0.jar \
|
|
258
|
+
"https://repo1.maven.org/maven2/io/github/tree-sitter/jtreesitter/0.26.0/jtreesitter-0.26.0.jar"
|
|
205
259
|
|
|
206
|
-
|
|
260
|
+
# Or use the provided setup script
|
|
261
|
+
bin/setup-jtreesitter
|
|
262
|
+
```
|
|
263
|
+
|
|
264
|
+
Set the environment variable to point to your JAR directory:
|
|
265
|
+
|
|
266
|
+
```bash
|
|
267
|
+
export TREE_SITTER_JAVA_JARS_DIR=/path/to/jars
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
#### Grammar ABI Compatibility
|
|
271
|
+
|
|
272
|
+
**CRITICAL**: Grammars must be built against a compatible tree-sitter version.
|
|
273
|
+
|
|
274
|
+
Tree-sitter 0.24+ changed how language ABI versions are reported (from `ts_language_version()` to `ts_language_abi_version()`). For the Java backend with jtreesitter 0.26.0, grammars must be built against tree-sitter 0.26+. If you get errors like:
|
|
275
|
+
|
|
276
|
+
```
|
|
277
|
+
Failed to load tree_sitter_toml
|
|
278
|
+
Version mismatch detected: The grammar was built against tree-sitter < 0.26
|
|
279
|
+
```
|
|
280
|
+
|
|
281
|
+
You need to rebuild the grammar from source:
|
|
282
|
+
|
|
283
|
+
```bash
|
|
284
|
+
# Use the provided build script
|
|
285
|
+
bin/build-grammar toml
|
|
286
|
+
|
|
287
|
+
# Or manually:
|
|
288
|
+
git clone https://github.com/tree-sitter-grammars/tree-sitter-toml
|
|
289
|
+
cd tree-sitter-toml
|
|
290
|
+
tree-sitter generate # Regenerates parser.c for your tree-sitter version
|
|
291
|
+
cc -shared -fPIC -o libtree-sitter-toml.so src/parser.c src/scanner.c -I src
|
|
292
|
+
```
|
|
293
|
+
|
|
294
|
+
**Grammar sources for common languages:**
|
|
295
|
+
|
|
296
|
+
| Language | Repository |
|
|
297
|
+
| --- | --- |
|
|
298
|
+
| TOML | [tree-sitter-grammars/tree-sitter-toml](https://github.com/tree-sitter-grammars/tree-sitter-toml) |
|
|
299
|
+
| JSON | [tree-sitter/tree-sitter-json](https://github.com/tree-sitter/tree-sitter-json) |
|
|
300
|
+
| JSONC | [WhyNotHugo/tree-sitter-jsonc](https://gitlab.com/WhyNotHugo/tree-sitter-jsonc) |
|
|
301
|
+
| Bash | [tree-sitter/tree-sitter-bash](https://github.com/tree-sitter/tree-sitter-bash) |
|
|
207
302
|
|
|
208
303
|
#### TruffleRuby Limitations
|
|
209
304
|
|
|
@@ -215,11 +310,12 @@ TruffleRuby users should use: **Prism** (Ruby), **Psych** (YAML), **Citrus** (TO
|
|
|
215
310
|
|
|
216
311
|
#### JRuby Limitations
|
|
217
312
|
|
|
218
|
-
JRuby runs on the JVM and **cannot load native `.so` extensions**:
|
|
313
|
+
JRuby runs on the JVM and **cannot load native `.so` extensions via Ruby's C API**:
|
|
219
314
|
|
|
220
315
|
- **MRI/Rust**: C and Rust extensions simply cannot be loaded
|
|
221
316
|
- **FFI**: Works\! JRuby has excellent FFI support
|
|
222
|
-
|
|
317
|
+
- **Java**: Works\! The Java backend uses jtreesitter (requires >= 0.26.0)
|
|
318
|
+
JRuby users should use: **Java backend** (best performance, full API) or **FFI backend** for tree-sitter, plus **Prism**, **Psych**, **Citrus** for other formats.
|
|
223
319
|
|
|
224
320
|
[ruby_tree_sitter]: https://github.com/Faveod/ruby-tree-sitter
|
|
225
321
|
[tree_stump]: https://github.com/joker1007/tree_stump
|
|
@@ -314,9 +410,9 @@ The `*-merge` gem family provides intelligent, AST-based merging for various fil
|
|
|
314
410
|
[citrus]: https://github.com/mjackson/citrus
|
|
315
411
|
[tree_haver]: https://github.com/kettle-rb/tree_haver
|
|
316
412
|
|
|
317
|
-
**Note:** Java backend works with grammar
|
|
413
|
+
**Note:** Java backend works with grammar `.so` files built against tree-sitter 0.24+. The grammars must be rebuilt with `tree-sitter generate` if they were compiled against older tree-sitter versions. FFI is recommended for JRuby as it's easier to set up.
|
|
318
414
|
|
|
319
|
-
**Note:** TreeHaver can use `ruby_tree_sitter` (MRI) or `tree_stump` (MRI
|
|
415
|
+
**Note:** TreeHaver can use `ruby_tree_sitter` (MRI) or `tree_stump` (MRI) as backends, or `java-tree-sitter` / `jtreesitter` >= 0.26.0 ([docs](https://tree-sitter.github.io/java-tree-sitter/), [maven](https://central.sonatype.com/artifact/io.github.tree-sitter/jtreesitter), [source](https://github.com/tree-sitter/java-tree-sitter), JRuby), or FFI on any backend, giving you TreeHaver's unified API, grammar discovery, and security features, plus full access to incremental parsing when using those backends.
|
|
320
416
|
|
|
321
417
|
**Note:** `tree_stump` currently requires unreleased fixes in the `main` branch.
|
|
322
418
|
|
|
@@ -471,7 +567,7 @@ TreeHaver supports 10 parsing backends, each with different trade-offs. The `aut
|
|
|
471
567
|
| **MRI** | C extension via ruby\_tree\_sitter | ⚡ Fastest | MRI only | [JSON](examples/mri_json.rb) · [JSONC](examples/mri_jsonc.rb) · \~\~Bash\~\~\* · [TOML](examples/mri_toml.rb) |
|
|
472
568
|
| **Rust** | Precompiled via tree\_stump | ⚡ Very Fast | ✅ Good | [JSON](examples/rust_json.rb) · [JSONC](examples/rust_jsonc.rb) · \~\~Bash\~\~\* · [TOML](examples/rust_toml.rb) |
|
|
473
569
|
| **FFI** | Dynamic linking via FFI | 🔵 Fast | ✅ Universal | [JSON](examples/ffi_json.rb) · [JSONC](examples/ffi_jsonc.rb) · [Bash](examples/ffi_bash.rb) · [TOML](examples/ffi_toml.rb) |
|
|
474
|
-
| **Java** | JNI bindings | ⚡ Very Fast | JRuby only | [JSON](examples/java_json.rb) · [JSONC](examples/java_jsonc.rb) · [Bash](examples/java_bash.rb) · [TOML](examples/java_toml.rb) |
|
|
570
|
+
| **Java** | JNI bindings (jtreesitter >= 0.26.0) | ⚡ Very Fast | JRuby only | [JSON](examples/java_json.rb) · [JSONC](examples/java_jsonc.rb) · [Bash](examples/java_bash.rb) · [TOML](examples/java_toml.rb) |
|
|
475
571
|
|
|
476
572
|
#### Language-Specific Backends (Native Parser Integration)
|
|
477
573
|
|
|
@@ -745,12 +841,34 @@ export TREE_SITTER_TOML_PATH=/usr/local/lib/libtree-sitter-toml.so
|
|
|
745
841
|
export TREE_SITTER_JSON_PATH=/usr/local/lib/libtree-sitter-json.so
|
|
746
842
|
```
|
|
747
843
|
|
|
748
|
-
#### JRuby-Specific: Java Backend
|
|
844
|
+
#### JRuby-Specific: Java Backend Configuration
|
|
845
|
+
|
|
846
|
+
For the Java backend on JRuby, you need:
|
|
749
847
|
|
|
750
|
-
|
|
848
|
+
1. **jtreesitter >= 0.26.0** JAR from Maven Central
|
|
849
|
+
2. **Tree-sitter runtime library** (`libtree-sitter.so`) version 0.26+
|
|
850
|
+
3. **Grammar `.so` files** built against tree-sitter 0.26+
|
|
751
851
|
|
|
752
852
|
``` bash
|
|
853
|
+
# Download jtreesitter JAR (or use bin/setup-jtreesitter)
|
|
753
854
|
export TREE_SITTER_JAVA_JARS_DIR=/path/to/java-tree-sitter/jars
|
|
855
|
+
|
|
856
|
+
# Point to tree-sitter runtime (must be 0.26+)
|
|
857
|
+
export TREE_SITTER_RUNTIME_LIB=/usr/local/lib/libtree-sitter.so
|
|
858
|
+
|
|
859
|
+
# Point to grammar libraries (must be built for tree-sitter 0.26+)
|
|
860
|
+
export TREE_SITTER_TOML_PATH=/path/to/libtree-sitter-toml.so
|
|
861
|
+
```
|
|
862
|
+
|
|
863
|
+
**Building grammars for Java backend:**
|
|
864
|
+
|
|
865
|
+
If you get "version mismatch" errors, rebuild the grammar:
|
|
866
|
+
|
|
867
|
+
```bash
|
|
868
|
+
# Use the provided build script
|
|
869
|
+
bin/build-grammar toml
|
|
870
|
+
|
|
871
|
+
# This regenerates parser.c for your tree-sitter version and compiles it
|
|
754
872
|
```
|
|
755
873
|
|
|
756
874
|
For more see [docs](https://tree-sitter.github.io/java-tree-sitter/), [maven](https://central.sonatype.com/artifact/io.github.tree-sitter/jtreesitter), and [source](https://github.com/tree-sitter/java-tree-sitter).
|
|
@@ -1788,7 +1906,7 @@ See [LICENSE.txt](LICENSE.txt) for the official [Copyright Notice](https://opens
|
|
|
1788
1906
|
|
|
1789
1907
|
<ul>
|
|
1790
1908
|
<li>
|
|
1791
|
-
Copyright (c) 2025 Peter H. Boling, of
|
|
1909
|
+
Copyright (c) 2025-2026 Peter H. Boling, of
|
|
1792
1910
|
<a href="https://discord.gg/3qme4XHNKN">
|
|
1793
1911
|
Galtzo.com
|
|
1794
1912
|
<picture>
|
|
@@ -1977,7 +2095,7 @@ Thanks for RTFM. ☺️
|
|
|
1977
2095
|
[📌gitmoji]: https://gitmoji.dev
|
|
1978
2096
|
[📌gitmoji-img]: https://img.shields.io/badge/gitmoji_commits-%20%F0%9F%98%9C%20%F0%9F%98%8D-34495e.svg?style=flat-square
|
|
1979
2097
|
[🧮kloc]: https://www.youtube.com/watch?v=dQw4w9WgXcQ
|
|
1980
|
-
[🧮kloc-img]: https://img.shields.io/badge/KLOC-2.
|
|
2098
|
+
[🧮kloc-img]: https://img.shields.io/badge/KLOC-2.421-FFDD67.svg?style=for-the-badge&logo=YouTube&logoColor=blue
|
|
1981
2099
|
[🔐security]: SECURITY.md
|
|
1982
2100
|
[🔐security-img]: https://img.shields.io/badge/security-policy-259D6C.svg?style=flat
|
|
1983
2101
|
[📄copyright-notice-explainer]: https://opensource.stackexchange.com/questions/5778/why-do-licenses-such-as-the-mit-license-specify-a-single-year
|