tree_haver 2.0.0 → 3.1.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/CHANGELOG.md +285 -1
- data/CONTRIBUTING.md +132 -0
- data/README.md +529 -36
- data/lib/tree_haver/backends/citrus.rb +177 -20
- data/lib/tree_haver/backends/commonmarker.rb +490 -0
- data/lib/tree_haver/backends/ffi.rb +341 -142
- data/lib/tree_haver/backends/java.rb +65 -16
- data/lib/tree_haver/backends/markly.rb +559 -0
- data/lib/tree_haver/backends/mri.rb +183 -17
- data/lib/tree_haver/backends/prism.rb +624 -0
- data/lib/tree_haver/backends/psych.rb +597 -0
- data/lib/tree_haver/backends/rust.rb +60 -17
- data/lib/tree_haver/citrus_grammar_finder.rb +170 -0
- data/lib/tree_haver/grammar_finder.rb +115 -11
- data/lib/tree_haver/language_registry.rb +62 -71
- data/lib/tree_haver/node.rb +220 -4
- data/lib/tree_haver/path_validator.rb +29 -24
- data/lib/tree_haver/tree.rb +63 -9
- data/lib/tree_haver/version.rb +2 -2
- data/lib/tree_haver.rb +835 -75
- data/sig/tree_haver.rbs +18 -1
- data.tar.gz.sig +0 -0
- metadata +9 -4
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: ad698ec9d939142aa2fd735e356ee50e83be0b2bc9ac16358f15b48f18d11d13
+data.tar.gz: 51d0e7f2cb6d39bd0e10ffca602e096e86d463ccc38cbce5e0997dce39ca1bb2
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 0c0c55b6be53012357cd638e8997f1bd201310e494750b90a82a3f9588fb505a059ead9f056ffc443c424ca2a3cf6370d27e0b508a30faa90a39403bb067cb85
+data.tar.gz: 71d9577716ab0edb436fa784dbfc5d6c81360c745ed98ef156c57475b6cd1f11eccf0030f2565e12d7dcf5035fcdf8fdc2c1eca64f81623109b1318048f41db4
checksums.yaml.gz.sig
CHANGED
Binary file
data/CHANGELOG.md
CHANGED
@@ -30,6 +30,286 @@ Please file a bug if you notice a violation of semantic versioning.
 
 ### Security
 
+## [3.1.0] - 2025-12-18
+
+- TAG: [v3.1.0][3.1.0t]
+- COVERAGE: 82.65% -- 943/1141 lines in 11 files
+- BRANCH COVERAGE: 63.80% -- 349/547 branches in 11 files
+- 88.97% documented
+
+### Added
+
+- **Position API Enhancements** – Added consistent position methods to all backend Node classes for compatibility with `*-merge` gems
+- `start_line` - Returns 1-based line number where node starts (converts 0-based `start_point.row` to 1-based)
+- `end_line` - Returns 1-based line number where node ends (converts 0-based `end_point.row` to 1-based)
+- `source_position` - Returns hash `{start_line:, end_line:, start_column:, end_column:}` with 1-based lines and 0-based columns
+- `first_child` - Convenience method that returns `children.first` for iteration compatibility
+- **Fixed:** `TreeHaver::Node#start_point` and `#end_point` now handle both Point objects and hashes from backends (Prism, Citrus return hashes)
+- **Fixed:** Added Psych, Commonmarker, and Markly backends to `resolve_backend_module` and `backend_module` case statements so they can be explicitly selected with `TreeHaver.backend = :psych` etc.
+- **Fixed:** Added Prism, Psych, Commonmarker, and Markly backends to `unwrap_language` method so language objects are properly passed to backend parsers
+- **Fixed:** Commonmarker backend's `text` method now safely handles container nodes that don't have string_content (wraps in rescue TypeError)
+- **Added to:**
+- Main `TreeHaver::Node` wrapper (used by tree-sitter backends: MRI, FFI, Java, Rust)
+- `Backends::Commonmarker::Node` - uses Commonmarker's `sourcepos` (already 1-based)
+- `Backends::Markly::Node` - uses Markly's `source_position` (already 1-based)
+- `Backends::Prism::Node` - uses Prism's `location` (already 1-based)
+- `Backends::Psych::Node` - calculates from `start_point`/`end_point` (0-based)
+- `Backends::Citrus::Node` - calculates from `start_point`/`end_point` (0-based)
+- **Backward Compatible:** Existing `start_point`/`end_point` methods continue to work unchanged
+- **Purpose:** Enables all `*-merge` gems to use consistent position API without backend-specific workarounds
+
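The line-number conversion described in the Position API entry above can be sketched as follows. This is a minimal illustration, not tree_haver's implementation: the `PositionSketch` module, the `SketchPoint` struct, and the helper names are hypothetical, and the only details taken from the changelog are the 0-based `row`, the 1-based lines, the 0-based columns, and the fact that some backends return hashes instead of Point objects.

```ruby
# Hypothetical stand-in for a backend point object (0-based row and column).
SketchPoint = Struct.new(:row, :column)

module PositionSketch
  # Accept either a Point-like object or a hash such as {row: 0, column: 4}.
  def self.row_of(point)
    point.is_a?(Hash) ? point[:row] : point.row
  end

  def self.column_of(point)
    point.is_a?(Hash) ? point[:column] : point.column
  end

  # Build the hash shape named in the changelog:
  # 1-based lines, 0-based columns.
  def self.source_position(start_point, end_point)
    {
      start_line: row_of(start_point) + 1,
      end_line: row_of(end_point) + 1,
      start_column: column_of(start_point),
      end_column: column_of(end_point),
    }
  end
end

# One endpoint as an object, one as a hash, to exercise both code paths.
pos = PositionSketch.source_position(SketchPoint.new(0, 4), {row: 2, column: 1})
puts pos.inspect
```

Keeping the conversion in one place means callers never need to know which backend produced the point.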
+- **Prism Backend** – New backend wrapping Ruby's official Prism parser (stdlib in Ruby 3.4+, gem for 3.2+)
+- `TreeHaver::Backends::Prism::Language` - Language wrapper (Ruby-only)
+- `TreeHaver::Backends::Prism::Parser` - Parser with `parse` and `parse_string` methods
+- `TreeHaver::Backends::Prism::Tree` - Tree wrapper with `root_node`, `errors`, `warnings`, `comments`
+- `TreeHaver::Backends::Prism::Node` - Node wrapper implementing full TreeHaver::Node protocol
+- Registered with `:prism` backend name, no conflicts with other backends
+
+- **Psych Backend** – New backend wrapping Ruby's standard library YAML parser
+- `TreeHaver::Backends::Psych::Language` - Language wrapper (YAML-only)
+- `TreeHaver::Backends::Psych::Parser` - Parser with `parse` and `parse_string` methods
+- `TreeHaver::Backends::Psych::Tree` - Tree wrapper with `root_node`, `errors`
+- `TreeHaver::Backends::Psych::Node` - Node wrapper implementing TreeHaver::Node protocol
+- Psych-specific methods: `mapping?`, `sequence?`, `scalar?`, `alias?`, `mapping_entries`, `anchor`, `tag`, `value`
+- Registered with `:psych` backend name, no conflicts with other backends
+
+- **Commonmarker Backend** – New backend wrapping the Commonmarker gem (comrak Rust parser)
+- `TreeHaver::Backends::Commonmarker::Language` - Language wrapper with parse options passthrough
+- `TreeHaver::Backends::Commonmarker::Parser` - Parser with `parse` and `parse_string` methods
+- `TreeHaver::Backends::Commonmarker::Tree` - Tree wrapper with `root_node`
+- `TreeHaver::Backends::Commonmarker::Node` - Node wrapper implementing TreeHaver::Node protocol
+- Commonmarker-specific methods: `header_level`, `fence_info`, `url`, `title`, `next_sibling`, `previous_sibling`, `parent`
+- Registered with `:commonmarker` backend name, no conflicts with other backends
+
+- **Markly Backend** – New backend wrapping the Markly gem (cmark-gfm C library)
+- `TreeHaver::Backends::Markly::Language` - Language wrapper with flags and extensions passthrough
+- `TreeHaver::Backends::Markly::Parser` - Parser with `parse` and `parse_string` methods
+- `TreeHaver::Backends::Markly::Tree` - Tree wrapper with `root_node`
+- `TreeHaver::Backends::Markly::Node` - Node wrapper implementing TreeHaver::Node protocol
+- Type normalization: `:header` → `"heading"`, `:hrule` → `"thematic_break"`, `:html` → `"html_block"`
+- Markly-specific methods: `header_level`, `fence_info`, `url`, `title`, `next_sibling`, `previous_sibling`, `parent`, `raw_type`
+- Registered with `:markly` backend name, no conflicts with other backends
+
+- **Automatic Citrus Fallback** – When tree-sitter fails, automatically fall back to Citrus backend
+- `TreeHaver::Language.method_missing` now catches tree-sitter loading errors (`NotAvailable`, `ArgumentError`, `LoadError`, `FFI::NotFoundError`) and falls back to registered Citrus grammar
+- `TreeHaver::Parser#initialize` now catches parser creation errors and falls back to Citrus parser when backend is `:auto`
+- `TreeHaver::Parser#language=` automatically switches to Citrus parser when a Citrus language is assigned
+- Enables seamless use of pure-Ruby parsers (like toml-rb) when tree-sitter runtime is unavailable
+
+- **GrammarFinder Runtime Check** – `GrammarFinder#available?` now verifies tree-sitter runtime is actually usable
+- New `GrammarFinder.tree_sitter_runtime_usable?` class method tests if parser can be created
+- `TREE_SITTER_BACKENDS` constant defines which backends use tree-sitter (MRI, FFI, Rust, Java)
+- Prevents registration of grammars when tree-sitter runtime isn't functional
+- `GrammarFinder.reset_runtime_check!` for testing
+
+- **Empty ENV Variable as Explicit Skip** – Setting `TREE_SITTER_<LANG>_PATH=''` explicitly disables that grammar
+- Previously, empty string was treated same as unset (would search paths)
+- Now, empty string means "do not use tree-sitter for this language"
+- Allows explicit opt-out to force fallback to alternative backends like Citrus
+- Useful for testing and environments where tree-sitter isn't desired
+
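The distinction the last entry above draws between an unset variable and an empty one can be sketched as below. The helper name `grammar_path_setting` and its return symbols are hypothetical and chosen only for illustration; the changelog does not show how `GrammarFinder` implements the check. The helper takes a hash-like environment so it can be exercised without touching the real `ENV`, which responds to the same `key?` and `[]` calls.

```ruby
# Hypothetical helper: classify a TREE_SITTER_<LANG>_PATH-style variable.
def grammar_path_setting(env, name)
  # Unset: behave as before and search the default locations.
  return :search_default_paths unless env.key?(name)

  value = env[name]
  # Set but empty: treated as an explicit opt-out for this language.
  return :skip_tree_sitter if value.empty?

  # Anything else is taken as an explicit path to the grammar library.
  value
end

name = "TREE_SITTER_TOML_PATH"
puts grammar_path_setting({}, name).inspect
puts grammar_path_setting({name => ""}, name).inspect
puts grammar_path_setting({name => "/path/to/libtree-sitter-toml.so"}, name).inspect
```

Checking `key?` before reading the value is what separates the two cases; a plain truthiness test on `env[name]` would not, because an empty string is truthy in Ruby.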
+- **TOML Examples** – New example scripts demonstrating TOML parsing with various backends
+- `examples/auto_toml.rb` - Auto backend selection with Citrus fallback demonstration
+- `examples/ffi_toml.rb` - FFI backend with TOML
+- `examples/mri_toml.rb` - MRI backend with TOML
+- `examples/rust_toml.rb` - Rust backend with TOML
+- `examples/java_toml.rb` - Java backend with TOML (JRuby only)
+
+### Fixed
+
+- **BREAKING**: `TreeHaver::Language.method_missing` no longer raises `ArgumentError` when only Citrus grammar is registered and tree-sitter backend is active – it now falls back to Citrus instead
+- Previously: Would raise "No grammar registered for :lang compatible with tree_sitter backend"
+- Now: Returns `TreeHaver::Backends::Citrus::Language` if Citrus grammar is registered
+- Migration: If you were catching this error, update your code to handle the fallback behavior
+- This is a bug fix, but would be a breaking change for some users who were relying on the old behavior
+
+## [3.0.0] - 2025-12-16
+
+- TAG: [v3.0.0][3.0.0t]
+- COVERAGE: 85.19% -- 909/1067 lines in 11 files
+- BRANCH COVERAGE: 67.47% -- 338/501 branches in 11 files
+- 92.93% documented
+
+### Added
+
+#### Backend Requirements
+
+- **MRI Backend**: Requires `ruby_tree_sitter` v2.0+ (exceptions inherit from `Exception` not `StandardError`)
+- In ruby_tree_sitter v2.0, TreeSitter errors were changed to inherit from Exception for thread-safety
+- TreeHaver now properly handles: `ParserNotFoundError`, `LanguageLoadError`, `SymbolNotFoundError`, etc.
+
+#### Thread-Safe Backend Selection (Hybrid Approach)
+
+- **NEW: Block-based backend API** - `TreeHaver.with_backend(:ffi) { ... }` for thread-safe backend selection
+- Thread-local context with proper nesting support
+- Exception-safe (context restored even on errors)
+- Fully backward compatible with existing global backend setting
+- **NEW: Explicit backend parameters**
+- `Parser.new(backend: :mri)` - specify backend when creating parser
+- `Language.from_library(path, backend: :ffi)` - specify backend when loading language
+- Backend parameters override thread context and global settings
+- **NEW: Backend introspection** - `parser.backend` returns the current backend name (`:ffi`, `:mri`, etc.)
+- **Backend precedence chain**: `explicit parameter > thread context > global setting > :auto`
+- **Backend-aware caching** - Language cache now includes backend in cache key to prevent cross-backend pollution
+- Added `TreeHaver.effective_backend` - returns the currently effective backend considering precedence
+- Added `TreeHaver.current_backend_context` - returns thread-local backend context
+- Added `TreeHaver.resolve_backend_module(explicit_backend)` - resolves backend module with precedence
+
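The block-based selection and the precedence chain listed above (`explicit parameter > thread context > global setting > :auto`) can be sketched with thread-local variables and an `ensure` clause. This is an assumption-laden illustration rather than tree_haver's code: the `BackendContextSketch` module, its `KEY` constant, and the `global_backend` accessor are hypothetical, and the choice of `Thread#thread_variable_get`/`thread_variable_set` is mine, not something the changelog states.

```ruby
# Hypothetical module; only the method names with_backend and
# effective_backend echo the changelog, the rest is invented for this sketch.
module BackendContextSketch
  KEY = :backend_context_sketch

  class << self
    # Process-wide default, nil when nothing has been set.
    attr_accessor :global_backend

    # Run the block with a thread-local backend and restore the previous
    # value afterwards, even if the block raises.
    def with_backend(name)
      previous = Thread.current.thread_variable_get(KEY)
      Thread.current.thread_variable_set(KEY, name)
      yield
    ensure
      Thread.current.thread_variable_set(KEY, previous)
    end

    # Precedence: explicit parameter > thread context > global setting > :auto
    def effective_backend(explicit = nil)
      explicit ||
        Thread.current.thread_variable_get(KEY) ||
        global_backend ||
        :auto
    end
  end
end

BackendContextSketch.with_backend(:ffi) do
  puts BackendContextSketch.effective_backend        # thread context applies
  puts BackendContextSketch.effective_backend(:mri)  # explicit parameter applies
end
puts BackendContextSketch.effective_backend          # nothing set on this thread
```

Saving the previous value instead of clearing it is what makes nested blocks work: an inner block restores the outer block's backend when it exits.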
+#### Examples and Discovery
+
+- Added 18 comprehensive examples demonstrating all backends and languages
+- JSON examples (5): auto, MRI, Rust, FFI, Java
+- JSONC examples (5): auto, MRI, Rust, FFI, Java
+- Bash examples (5): auto, MRI, Rust, FFI, Java
+- Citrus examples (3): TOML, Finitio, Dhall
+- All examples use bundler inline (self-contained, no Gemfile needed)
+- Added `examples/run_all.rb` - comprehensive test runner with colored output
+- Updated `examples/README.md` - complete guide to all examples
+- Added `TreeHaver::CitrusGrammarFinder` for language-agnostic discovery and registration of Citrus-based grammar gems
+- Automatically discovers Citrus grammar gems by gem name and grammar constant path
+- Validates grammar modules respond to `.parse(source)` before registration
+- Provides helpful error messages when grammars are not found
+- Added multi-backend language registry supporting multiple backends per language simultaneously
+- Restructured `LanguageRegistry` to use nested hash: `{ language: { backend_type: config } }`
+- Enables registering both tree-sitter and Citrus grammars for the same language without conflicts
+- Supports runtime backend switching, benchmarking, and fallback scenarios
+- Added `LanguageRegistry.register(name, backend_type, **config)` with backend-specific configuration storage
+- Added `LanguageRegistry.registered(name, backend_type = nil)` to query by specific backend or get all backends
+- Added `TreeHaver::Backends::Citrus::Node#structural?` method to distinguish structural nodes from terminals
+- Uses Citrus grammar's `terminal?` method to dynamically determine node classification
+- Works with any Citrus grammar without language-specific knowledge
+
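The nested-hash registry described above (`{ language: { backend_type: config } }`) might look roughly like the following. The two method signatures mirror the ones quoted in the changelog; everything else is assumed for illustration, including the `RegistrySketch` module name, the return values, the example path and symbol strings, and the `FakeGrammar` placeholder that stands in for a real Citrus grammar module.

```ruby
# Hypothetical placeholder for a Citrus grammar module.
FakeGrammar = Module.new

module RegistrySketch
  # Shape: { language => { backend_type => config } }
  @languages = {}

  # Store one backend's configuration without disturbing other backends
  # already registered for the same language.
  def self.register(name, backend_type, **config)
    (@languages[name.to_sym] ||= {})[backend_type.to_sym] = config
  end

  # With a backend_type, return that backend's config; without one, return
  # the hash of all backends registered for the language (nil if none).
  def self.registered(name, backend_type = nil)
    backends = @languages[name.to_sym]
    return nil unless backends

    backend_type ? backends[backend_type.to_sym] : backends
  end
end

RegistrySketch.register(:toml, :tree_sitter,
  path: "/path/to/libtree-sitter-toml.so", symbol: "tree_sitter_toml")
RegistrySketch.register(:toml, :citrus, grammar_module: FakeGrammar)

puts RegistrySketch.registered(:toml).keys.inspect
puts RegistrySketch.registered(:toml, :citrus).inspect
```

Because each backend type gets its own key, a second registration for the same language adds to the entry instead of replacing it.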
+### Changed
+
+- **BREAKING**: All errors now inherit from `TreeHaver::Error` which inherits from `Exception`
+- see: https://github.com/Faveod/ruby-tree-sitter/pull/83 for reasoning
+- **BREAKING**: `LanguageRegistry.register` signature changed from `register(name, path:, symbol:)` to `register(name, backend_type, **config)`
+- This enables proper separation of tree-sitter and Citrus configurations
+- Users should update to use `TreeHaver.register_language` instead of calling `LanguageRegistry.register` directly
+- Updated `TreeHaver.register_language` to support both tree-sitter and Citrus grammars in single call or separate calls
+- Can now register: `register_language(:toml, path: "...", symbol: "...", grammar_module: TomlRB::Document)`
+- **INTENTIONAL DESIGN**: Uses separate `if` statements (not `elsif`) to allow registering both backends simultaneously
+- Enables maximum flexibility: runtime backend switching, performance benchmarking, fallback scenarios
+- Multiple registrations for same language now merge instead of overwrite
+
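The practical consequence of the first breaking change above is that a bare `rescue` or `rescue StandardError` no longer catches these errors. The sketch below uses a hypothetical `ErrorSketch` hierarchy with the same shape as the one the changelog describes (a base error class inheriting from `Exception`); it does not load tree_haver itself.

```ruby
# Hypothetical hierarchy shaped like the one described in the changelog.
module ErrorSketch
  class Error < Exception; end      # rooted at Exception, not StandardError
  class NotAvailable < Error; end

  def self.risky
    raise NotAvailable, "backend missing"
  end
end

# Rescue clauses are tried in order. NotAvailable is not a StandardError,
# so the first clause is skipped and only the explicit clause matches.
def handler_reached
  ErrorSketch.risky
rescue StandardError
  :standard_error_rescue
rescue ErrorSketch::Error
  :explicit_error_rescue
end

puts handler_reached.inspect
puts ErrorSketch::NotAvailable.ancestors.include?(StandardError).inspect
```

Code that previously relied on `rescue => e` around parser calls would therefore need to name the error class explicitly.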
+### Improved
+
+#### Code Quality and Documentation
+
+- **Uniform backend API**: All backends now implement `reset!` method for consistent testing interface
+- Eliminates need for tests to manipulate private instance variables
+- Provides clean way to reset backend state between tests
+- **Documented design decisions** with inline rationale
+- FFI Tree finalizer behavior and why Parser doesn't use finalizers
+- `resolve_backend_module` early-return pattern with comprehensive comments
+- `register_language` multi-backend registration capability extensively documented
+- **Enhanced YARD documentation**
+- All Citrus examples now include `gem_name` parameter (matches actual usage patterns)
+- Added complete examples showing both single-backend and multi-backend registration
+- Documented backend precedence chain and thread-safety guarantees
+- **Comprehensive test coverage** for thread-safe backend selection
+- Thread-local context tests
+- Parser backend parameter tests
+- Language backend parameter tests
+- Concurrent parsing tests with multiple backends
+- Backend-aware cache isolation tests
+- Nested block behavior tests (inner blocks override outer blocks)
+- Exception safety tests (context restored even on errors)
+- Explicit parameter precedence tests
+- Updated `Language.method_missing` to automatically select appropriate grammar based on active backend
+- tree-sitter backends (MRI, Rust, FFI, Java) query `:tree_sitter` registry key
+- Citrus backend queries `:citrus` registry key
+- Provides clear error messages when requested backend has no registered grammar
+- Improved `TreeHaver::Backends::Citrus::Node#type` to use dynamic Citrus grammar introspection
+- Uses event `.name` method and Symbol events for accurate type extraction
+- Works with any Citrus grammar without language-specific code
+- Handles compound rules (Repeat, Choice, Optional) intelligently
+
+### Fixed
+
+#### Thread-Safety and Backend Selection
+
+- Fixed `resolve_backend_module` to properly handle mocked backends without `available?` method
+- Assumes modules without `available?` are available (for test compatibility and backward compatibility)
+- Only rejects if module explicitly has `available?` method and returns false
+- Makes code more defensive and test-friendly
+- Fixed Language cache to include backend in cache key
+- Prevents returning wrong backend's Language object when switching backends
+- Essential for correctness with multiple backends in use
+- Cache key now: `"#{path}:#{symbol}:#{backend}"` instead of just `"#{path}:#{symbol}"`
+- Fixed `TreeHaver.register_language` to properly support multi-backend registration
+- Documented intentional design: uses `if` not `elsif` to allow both backends in one call
+- Added comprehensive inline comments explaining why no early return
+- Added extensive YARD documentation with examples
+
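Two of the fixes above lend themselves to a short sketch: the backend-aware cache key and the lenient `available?` check. The key format is the one quoted in the changelog; the helper names `language_cache_key` and `backend_usable?` and the stand-in modules are hypothetical, and the real `resolve_backend_module` may be structured quite differently.

```ruby
# Cache key that includes the backend, in the format quoted in the changelog.
def language_cache_key(path, symbol, backend)
  "#{path}:#{symbol}:#{backend}"
end

# Treat a backend module as usable unless it defines available? and that
# method returns a falsy value.
def backend_usable?(mod)
  return true unless mod.respond_to?(:available?)

  mod.available? ? true : false
end

# Hypothetical stand-ins for backend modules.
mocked_backend = Module.new # defines no available? at all
missing_backend = Module.new do
  def self.available?
    false
  end
end

path = "/path/to/libtree-sitter-toml.so"
ffi_key = language_cache_key(path, "tree_sitter_toml", :ffi)
mri_key = language_cache_key(path, "tree_sitter_toml", :mri)

puts ffi_key
puts (ffi_key == mri_key).inspect
puts backend_usable?(mocked_backend).inspect
puts backend_usable?(missing_backend).inspect
```

With the backend in the key, the same grammar file loaded through two backends occupies two cache slots, so switching backends cannot hand back the other backend's language object.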
+#### Backend Bug Fixes
+
+- Fixed critical double-wrapping bug in ALL backends (MRI, Rust, FFI, Java, Citrus)
+- Backend `Parser#parse` and `parse_string` methods now return raw backend trees
+- TreeHaver::Parser wraps the raw tree in TreeHaver::Tree (single wrapping)
+- Previously backends were returning TreeHaver::Tree, then TreeHaver::Parser wrapped it again (double wrapping)
+- This caused `@inner_tree` to be a TreeHaver::Tree instead of raw backend tree, leading to nil errors
+- Fixed TreeHaver::Parser to pass source parameter when wrapping backend trees
+- Enables `Node#text` to work correctly by providing source for text extraction
+- Fixes all parse and parse_string methods to include `source: source` parameter
+- Fixed MRI backend to properly use ruby_tree_sitter API
+- Fixed `require "tree_sitter"` (gem name is `ruby_tree_sitter` but requires `tree_sitter`)
+- Fixed `Language.load` to use correct argument order: `(symbol_name, path)`
+- Fixed `Parser#parse` to use `parse_string(nil, source)` instead of creating Input objects
+- Fixed `Language.from_library` to implement the expected signature matching other backends
+- Fixed FFI backend missing essential node methods
+- Added `ts_node_start_byte`, `ts_node_end_byte`, `ts_node_start_point`, `ts_node_end_point`
+- Added `ts_node_is_null`, `ts_node_is_named`
+- These methods are required for accessing node byte positions and metadata
+- Fixes `NoMethodError` when using FFI backend to traverse AST nodes
+- Fixed GrammarFinder error messages for environment variable validation
+- Detects leading/trailing whitespace in paths and provides correction suggestions
+- Shows when TREE_SITTER_*_PATH is set but points to nonexistent file
+- Provides helpful guidance for setting environment variables correctly
+- Fixed registry conflicts when registering multiple backend types for the same language
+- Fixed `CitrusGrammarFinder` to properly handle gems with non-standard require paths (e.g., `toml-rb.rb` vs `toml/rb.rb`)
+- Fixed Citrus backend infinite recursion in `Node#extract_type_from_event`
+- Added cycle detection to prevent stack overflow when traversing recursive grammar structures
+
+### Known Issues
+
+- **MRI backend + Bash grammar**: ABI/symbol loading incompatibility
+- The ruby_tree_sitter gem cannot load tree-sitter-bash grammar (symbol not found)
+- Workaround: Use FFI backend instead (works perfectly)
+- This is documented in examples and test runner
+- **Rust backend + Bash grammar**: Version mismatch due to static linking
+- tree_stump statically links tree-sitter at compile time
+- System bash.so may be compiled with different tree-sitter version
+- Workaround: Use FFI backend (dynamic linking avoids version conflicts)
+- This is documented in examples with detailed explanations
+
+### Notes on Backward Compatibility
+
+Despite the major version bump to 3.0.0 (following semver due to the breaking `LanguageRegistry.register` signature change), **most users will experience NO BREAKING CHANGES**:
+
+#### Why 3.0.0?
+
+- `LanguageRegistry.register` signature changed to support multi-backend registration
+- However, most users should use `TreeHaver.register_language` (which remains backward compatible)
+- Direct calls to `LanguageRegistry.register` are rare in practice
+
+#### What Stays the Same?
+
+- **Global backend setting**: `TreeHaver.backend = :ffi` works unchanged
+- **Parser creation**: `Parser.new` without parameters works as before
+- **Language loading**: `Language.from_library(path)` works as before
+- **Auto-detection**: Backend auto-selection still works when backend is `:auto`
+- **All existing code** continues to work without modifications
+
+#### What's New (All Optional)?
+
+- Thread-safe block API: `TreeHaver.with_backend(:ffi) { ... }`
+- Explicit backend parameters: `Parser.new(backend: :mri)`
+- Backend introspection: `parser.backend`
+- Multi-backend language registration
+
+**Migration Path**: Existing codebases can upgrade to 3.0.0 and gain access to new thread-safe features without changing any existing code. The new features are purely additive and opt-in.
+
 ## [2.0.0] - 2025-12-15
 
 - TAG: [v2.0.0][2.0.0t]
@@ -85,7 +365,11 @@ Please file a bug if you notice a violation of semantic versioning.
 
 - Initial release
 
-[Unreleased]: https://github.com/kettle-rb/tree_haver/compare/
+[Unreleased]: https://github.com/kettle-rb/tree_haver/compare/v3.1.0...HEAD
+[3.1.0]: https://github.com/kettle-rb/tree_haver/compare/v3.0.0...v3.1.0
+[3.1.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.1.0
+[3.0.0]: https://github.com/kettle-rb/tree_haver/compare/v2.0.0...v3.0.0
+[3.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.0.0
 [2.0.0]: https://github.com/kettle-rb/tree_haver/compare/v1.0.0...v2.0.0
 [2.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v2.0.0
 [1.0.0]: https://github.com/kettle-rb/tree_haver/compare/a89211bff10f4440b96758a8ac9d7d539001b0c8...v1.0.0
data/CONTRIBUTING.md
CHANGED
@@ -42,6 +42,138 @@ There are many Rake tasks available as well. You can see them by running:
 bin/rake -T
 ```
 
+## Backend Compatibility Testing
+
+TreeHaver supports multiple backends with different characteristics:
+
+- **MRI**: ruby_tree_sitter (C extension, tree-sitter grammars)
+- **FFI**: Pure Ruby FFI bindings (tree-sitter grammars)
+- **Rust**: tree_stump (Rust extension, tree-sitter grammars)
+- **Citrus**: Pure Ruby parser (TOML only via toml-rb grammar)
+
+Not all backends can coexist in the same Ruby process. Notably, **FFI and MRI backends conflict**
+at the libtree-sitter runtime level—using both in the same process will cause segfaults.
+
+The **Citrus backend** works differently:
+- Uses pure Ruby parsing (no .so files)
+- Currently only supports TOML via toml-rb grammar
+- Can coexist with tree-sitter backends
+- Useful for testing multi-backend scenarios
+
+The `bin/backend-matrix` script helps test and document backend compatibility by running tests
+in isolated subprocesses.
+
+### Basic Usage
+
+```shell
+# Test all backends with TOML grammar (default)
+bin/backend-matrix
+
+# Test specific backend order (including Citrus)
+bin/backend-matrix ffi mri rust citrus
+
+# Test Citrus with tree-sitter backends
+bin/backend-matrix citrus mri ffi # Citrus before tree-sitter
+bin/backend-matrix mri citrus ffi # Citrus between tree-sitter
+
+# Test with a different grammar
+bin/backend-matrix --grammar=json
+
+# Test multiple grammars
+bin/backend-matrix --grammars=json,toml,bash
+
+# Citrus only supports TOML
+bin/backend-matrix --grammar=toml citrus
+```
+
+### All Permutations Mode
+
+Test all possible backend combinations by spawning fresh subprocesses for each:
+
+```shell
+# Test all 64 backend combinations (4 backends: 4 1-backend + 12 2-backend + 24 3-backend + 24 4-backend)
+bin/backend-matrix --all-permutations
+
+# With multiple grammars
+bin/backend-matrix --all-permutations --grammars=json,toml
+
+# Note: Citrus only supports TOML, so JSON/Bash tests will skip for Citrus
+```
+
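The figure of 64 combinations quoted in the comment above follows from counting ordered selections of 1 to 4 backends out of 4. The sketch below reproduces that arithmetic with `Array#permutation`; it assumes the four backends are the ones listed earlier in this section (MRI, FFI, Rust, Citrus) and that order matters, which matches the "A → B" sequencing the script's output describes. It does not invoke `bin/backend-matrix`.

```ruby
# The four backends listed in the CONTRIBUTING section (assumed set).
backends = %i[mri ffi rust citrus]

# Number of ordered selections for each group size from 1 to 4.
counts = (1..backends.size).map { |n| backends.permutation(n).count }

# Every ordered selection, e.g. [:ffi, :mri] and [:mri, :ffi] are distinct.
orderings = (1..backends.size).flat_map { |n| backends.permutation(n).to_a }

puts counts.inspect
puts counts.sum
puts orderings.size
```

The per-size counts correspond to the 4, 12, 24, and 24 terms in the comment, and their sum matches the stated total.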
+### Cross-Grammar Testing
+
+The most interesting test: can different backends coexist if they use *different* grammar files?
+
+```shell
+# Test: FFI+json then MRI+toml, MRI+json then FFI+toml, etc.
+bin/backend-matrix --cross-grammar --grammars=json,toml
+
+# Full cross-grammar matrix
+bin/backend-matrix --all-permutations --cross-grammar --grammars=json,toml
+```
+
+### Custom Source Files
+
+Provide your own source files for parsing:
+
+```shell
+bin/backend-matrix --toml-source=my_config.toml --json-source=data.json
+```
+
+### List Available Grammars
+
+Check which grammars are configured and available:
+
+```shell
+bin/backend-matrix --list-grammars
+```
+
+### Understanding the Output
+
+The script produces tables showing:
+
+1. **1-Backend Tests**: Each backend tested in isolation with all grammars
+2. **2-Backend Tests**: Pairs of backends tested in sequence (A → B)
+3. **3-Backend Tests**: Triples tested in sequence (A → B → C)
+4. **Backend Pair Compatibility**: Data-driven analysis of which backends can coexist
+5. **Statistics**: Success rates and combination counts
+
+Example findings:
+
+```
+Backend Pair Compatibility:
+╭───────────────┬────────────────────┬─────────┬────────╮
+│ Backend Pair │ Compatibility │ Working │ Failed │
+├───────────────┼────────────────────┼─────────┼────────┤
+│ ffi+mri │ ✗ Incompatible │ 0 │ 8 │
+│ mri+rust │ ✓ Fully compatible │ 8 │ 0 │
+│ ffi+rust │ ✓ Fully compatible │ 8 │ 0 │
+│ citrus+mri │ ✓ Fully compatible │ 2 │ 0 │
+│ citrus+ffi │ ✓ Fully compatible │ 2 │ 0 │
+│ citrus+rust │ ✓ Fully compatible │ 2 │ 0 │
+╰───────────────┴────────────────────┴─────────┴────────╯
+
+Note: Citrus only supports TOML, so it has fewer total combinations.
+```
+
+### Required Environment Variables
+
+The script requires grammar paths to be set for tree-sitter backends:
+
+```shell
+export TREE_SITTER_TOML_PATH=/path/to/libtree-sitter-toml.so
+export TREE_SITTER_JSON_PATH=/path/to/libtree-sitter-json.so
+export TREE_SITTER_BASH_PATH=/path/to/libtree-sitter-bash.so
+```
+
+See `.envrc` for examples of how these are typically configured.
+
+**For Citrus backend:**
+- Requires the `toml-rb` gem (pure Ruby TOML parser)
+- **Auto-installs**: Script uses bundler inline to install `toml-rb` automatically if missing
+- No environment variables needed (doesn't use .so files)
+- Only supports TOML grammar
+
 ## Environment Variables for Local Development
 
 Below are the primary environment variables recognized by stone_checksums (and its integrated tools). Unless otherwise noted, set boolean values to the string "true" to enable.