tree_haver 2.0.0 → 3.0.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/CHANGELOG.md +190 -1
- data/CONTRIBUTING.md +100 -0
- data/README.md +342 -11
- data/lib/tree_haver/backends/citrus.rb +141 -20
- data/lib/tree_haver/backends/ffi.rb +338 -141
- data/lib/tree_haver/backends/java.rb +65 -16
- data/lib/tree_haver/backends/mri.rb +154 -17
- data/lib/tree_haver/backends/rust.rb +59 -16
- data/lib/tree_haver/citrus_grammar_finder.rb +170 -0
- data/lib/tree_haver/grammar_finder.rb +42 -7
- data/lib/tree_haver/language_registry.rb +62 -71
- data/lib/tree_haver/node.rb +150 -0
- data/lib/tree_haver/path_validator.rb +29 -24
- data/lib/tree_haver/tree.rb +63 -9
- data/lib/tree_haver/version.rb +2 -2
- data/lib/tree_haver.rb +697 -56
- data.tar.gz.sig +0 -0
- metadata +5 -4
- metadata.gz.sig +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 0ddc5d837509119a581acd92a33fc7819b6e6dd40a727d1eb0b074d1e944c22f
+data.tar.gz: f7b329ee2245068b3bc05e9e6c250e565bbb3b617a50840617f389c2f18dc679
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 66e9dda1e2638d5340ad99a48959d63002293225a3fc636ea776f5718e0ed388628a4a9fe87c1b0150ae265ec812dd763b1cecfe434044148a1fffc16a9e78b1
+data.tar.gz: 9545f9a6b25e6f5944c60b1c7e1a3ecc278cea04c5274b3c34a3d3a0d9d83f14324bda96820286e21ae627c380200866ee47cf6161a12f9bffc9ff72064c3aec
checksums.yaml.gz.sig
CHANGED
Binary file
data/CHANGELOG.md
CHANGED
@@ -30,6 +30,193 @@ Please file a bug if you notice a violation of semantic versioning.
 
 ### Security
 
+## [3.0.0] - 2025-12-16
+
+- TAG: [v3.0.0][3.0.0t]
+- COVERAGE: 85.19% -- 909/1067 lines in 11 files
+- BRANCH COVERAGE: 67.47% -- 338/501 branches in 11 files
+- 92.93% documented
+
+### Added
+
+#### Backend Requirements
+
+- **MRI Backend**: Requires `ruby_tree_sitter` v2.0+ (exceptions inherit from `Exception` not `StandardError`)
+- In ruby_tree_sitter v2.0, TreeSitter errors were changed to inherit from Exception for thread-safety
+- TreeHaver now properly handles: `ParserNotFoundError`, `LanguageLoadError`, `SymbolNotFoundError`, etc.
+
+#### Thread-Safe Backend Selection (Hybrid Approach)
+
+- **NEW: Block-based backend API** - `TreeHaver.with_backend(:ffi) { ... }` for thread-safe backend selection
+- Thread-local context with proper nesting support
+- Exception-safe (context restored even on errors)
+- Fully backward compatible with existing global backend setting
+- **NEW: Explicit backend parameters**
+- `Parser.new(backend: :mri)` - specify backend when creating parser
+- `Language.from_library(path, backend: :ffi)` - specify backend when loading language
+- Backend parameters override thread context and global settings
+- **NEW: Backend introspection** - `parser.backend` returns the current backend name (`:ffi`, `:mri`, etc.)
+- **Backend precedence chain**: `explicit parameter > thread context > global setting > :auto`
+- **Backend-aware caching** - Language cache now includes backend in cache key to prevent cross-backend pollution
+- Added `TreeHaver.effective_backend` - returns the currently effective backend considering precedence
+- Added `TreeHaver.current_backend_context` - returns thread-local backend context
+- Added `TreeHaver.resolve_backend_module(explicit_backend)` - resolves backend module with precedence
+
+#### Examples and Discovery
+
+- Added 18 comprehensive examples demonstrating all backends and languages
+- JSON examples (5): auto, MRI, Rust, FFI, Java
+- JSONC examples (5): auto, MRI, Rust, FFI, Java
+- Bash examples (5): auto, MRI, Rust, FFI, Java
+- Citrus examples (3): TOML, Finitio, Dhall
+- All examples use bundler inline (self-contained, no Gemfile needed)
+- Added `examples/run_all.rb` - comprehensive test runner with colored output
+- Updated `examples/README.md` - complete guide to all examples
+- Added `TreeHaver::CitrusGrammarFinder` for language-agnostic discovery and registration of Citrus-based grammar gems
+- Automatically discovers Citrus grammar gems by gem name and grammar constant path
+- Validates grammar modules respond to `.parse(source)` before registration
+- Provides helpful error messages when grammars are not found
+- Added multi-backend language registry supporting multiple backends per language simultaneously
+- Restructured `LanguageRegistry` to use nested hash: `{ language: { backend_type: config } }`
+- Enables registering both tree-sitter and Citrus grammars for the same language without conflicts
+- Supports runtime backend switching, benchmarking, and fallback scenarios
+- Added `LanguageRegistry.register(name, backend_type, **config)` with backend-specific configuration storage
+- Added `LanguageRegistry.registered(name, backend_type = nil)` to query by specific backend or get all backends
+- Added `TreeHaver::Backends::Citrus::Node#structural?` method to distinguish structural nodes from terminals
+- Uses Citrus grammar's `terminal?` method to dynamically determine node classification
+- Works with any Citrus grammar without language-specific knowledge
+
+### Changed
+
+- **BREAKING**: All errors now inherit from `TreeHaver::Error` which inherits from `Exception`
+- see: https://github.com/Faveod/ruby-tree-sitter/pull/83 for reasoning
+- **BREAKING**: `LanguageRegistry.register` signature changed from `register(name, path:, symbol:)` to `register(name, backend_type, **config)`
+- This enables proper separation of tree-sitter and Citrus configurations
+- Users should update to use `TreeHaver.register_language` instead of calling `LanguageRegistry.register` directly
+- Updated `TreeHaver.register_language` to support both tree-sitter and Citrus grammars in single call or separate calls
+- Can now register: `register_language(:toml, path: "...", symbol: "...", grammar_module: TomlRB::Document)`
+- **INTENTIONAL DESIGN**: Uses separate `if` statements (not `elsif`) to allow registering both backends simultaneously
+- Enables maximum flexibility: runtime backend switching, performance benchmarking, fallback scenarios
+- Multiple registrations for same language now merge instead of overwrite
+
+### Improved
+
+#### Code Quality and Documentation
+
+- **Uniform backend API**: All backends now implement `reset!` method for consistent testing interface
+- Eliminates need for tests to manipulate private instance variables
+- Provides clean way to reset backend state between tests
+- **Documented design decisions** with inline rationale
+- FFI Tree finalizer behavior and why Parser doesn't use finalizers
+- `resolve_backend_module` early-return pattern with comprehensive comments
+- `register_language` multi-backend registration capability extensively documented
+- **Enhanced YARD documentation**
+- All Citrus examples now include `gem_name` parameter (matches actual usage patterns)
+- Added complete examples showing both single-backend and multi-backend registration
+- Documented backend precedence chain and thread-safety guarantees
+- **Comprehensive test coverage** for thread-safe backend selection
+- Thread-local context tests
+- Parser backend parameter tests
+- Language backend parameter tests
+- Concurrent parsing tests with multiple backends
+- Backend-aware cache isolation tests
+- Nested block behavior tests (inner blocks override outer blocks)
+- Exception safety tests (context restored even on errors)
+- Explicit parameter precedence tests
+- Updated `Language.method_missing` to automatically select appropriate grammar based on active backend
+- tree-sitter backends (MRI, Rust, FFI, Java) query `:tree_sitter` registry key
+- Citrus backend queries `:citrus` registry key
+- Provides clear error messages when requested backend has no registered grammar
+- Improved `TreeHaver::Backends::Citrus::Node#type` to use dynamic Citrus grammar introspection
+- Uses event `.name` method and Symbol events for accurate type extraction
+- Works with any Citrus grammar without language-specific code
+- Handles compound rules (Repeat, Choice, Optional) intelligently
+
+### Fixed
+
+#### Thread-Safety and Backend Selection
+
+- Fixed `resolve_backend_module` to properly handle mocked backends without `available?` method
+- Assumes modules without `available?` are available (for test compatibility and backward compatibility)
+- Only rejects if module explicitly has `available?` method and returns false
+- Makes code more defensive and test-friendly
+- Fixed Language cache to include backend in cache key
+- Prevents returning wrong backend's Language object when switching backends
+- Essential for correctness with multiple backends in use
+- Cache key now: `"#{path}:#{symbol}:#{backend}"` instead of just `"#{path}:#{symbol}"`
+- Fixed `TreeHaver.register_language` to properly support multi-backend registration
+- Documented intentional design: uses `if` not `elsif` to allow both backends in one call
+- Added comprehensive inline comments explaining why no early return
+- Added extensive YARD documentation with examples
+
+#### Backend Bug Fixes
+
+- Fixed critical double-wrapping bug in ALL backends (MRI, Rust, FFI, Java, Citrus)
+- Backend `Parser#parse` and `parse_string` methods now return raw backend trees
+- TreeHaver::Parser wraps the raw tree in TreeHaver::Tree (single wrapping)
+- Previously backends were returning TreeHaver::Tree, then TreeHaver::Parser wrapped it again (double wrapping)
+- This caused `@inner_tree` to be a TreeHaver::Tree instead of raw backend tree, leading to nil errors
+- Fixed TreeHaver::Parser to pass source parameter when wrapping backend trees
+- Enables `Node#text` to work correctly by providing source for text extraction
+- Fixes all parse and parse_string methods to include `source: source` parameter
+- Fixed MRI backend to properly use ruby_tree_sitter API
+- Fixed `require "tree_sitter"` (gem name is `ruby_tree_sitter` but requires `tree_sitter`)
+- Fixed `Language.load` to use correct argument order: `(symbol_name, path)`
+- Fixed `Parser#parse` to use `parse_string(nil, source)` instead of creating Input objects
+- Fixed `Language.from_library` to implement the expected signature matching other backends
+- Fixed FFI backend missing essential node methods
+- Added `ts_node_start_byte`, `ts_node_end_byte`, `ts_node_start_point`, `ts_node_end_point`
+- Added `ts_node_is_null`, `ts_node_is_named`
+- These methods are required for accessing node byte positions and metadata
+- Fixes `NoMethodError` when using FFI backend to traverse AST nodes
+- Fixed GrammarFinder error messages for environment variable validation
+- Detects leading/trailing whitespace in paths and provides correction suggestions
+- Shows when TREE_SITTER_*_PATH is set but points to nonexistent file
+- Provides helpful guidance for setting environment variables correctly
+- Fixed registry conflicts when registering multiple backend types for the same language
+- Fixed `CitrusGrammarFinder` to properly handle gems with non-standard require paths (e.g., `toml-rb.rb` vs `toml/rb.rb`)
+- Fixed Citrus backend infinite recursion in `Node#extract_type_from_event`
+- Added cycle detection to prevent stack overflow when traversing recursive grammar structures
+
+### Known Issues
+
+- **MRI backend + Bash grammar**: ABI/symbol loading incompatibility
+- The ruby_tree_sitter gem cannot load tree-sitter-bash grammar (symbol not found)
+- Workaround: Use FFI backend instead (works perfectly)
+- This is documented in examples and test runner
+- **Rust backend + Bash grammar**: Version mismatch due to static linking
+- tree_stump statically links tree-sitter at compile time
+- System bash.so may be compiled with different tree-sitter version
+- Workaround: Use FFI backend (dynamic linking avoids version conflicts)
+- This is documented in examples with detailed explanations
+
+### Notes on Backward Compatibility
+
+Despite the major version bump to 3.0.0 (following semver due to the breaking `LanguageRegistry.register` signature change), **most users will experience NO BREAKING CHANGES**:
+
+#### Why 3.0.0?
+
+- `LanguageRegistry.register` signature changed to support multi-backend registration
+- However, most users should use `TreeHaver.register_language` (which remains backward compatible)
+- Direct calls to `LanguageRegistry.register` are rare in practice
+
+#### What Stays the Same?
+
+- **Global backend setting**: `TreeHaver.backend = :ffi` works unchanged
+- **Parser creation**: `Parser.new` without parameters works as before
+- **Language loading**: `Language.from_library(path)` works as before
+- **Auto-detection**: Backend auto-selection still works when backend is `:auto`
+- **All existing code** continues to work without modifications
+
+#### What's New (All Optional)?
+
+- Thread-safe block API: `TreeHaver.with_backend(:ffi) { ... }`
+- Explicit backend parameters: `Parser.new(backend: :mri)`
+- Backend introspection: `parser.backend`
+- Multi-backend language registration
+
+**Migration Path**: Existing codebases can upgrade to 3.0.0 and gain access to new thread-safe features without changing any existing code. The new features are purely additive and opt-in.
+
 ## [2.0.0] - 2025-12-15
 
 - TAG: [v2.0.0][2.0.0t]
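The 3.0.0 entry in the hunk above describes a block-based, thread-local backend override and a precedence chain (`explicit parameter > thread context > global setting > :auto`). The following is a minimal sketch of how such a chain could work. The module `BackendSketch`, its `CONTEXT_KEY`, and all method bodies are hypothetical illustrations that only borrow method names from the changelog; this is not TreeHaver's actual implementation.

```ruby
# Hypothetical sketch of a thread-local backend override with a precedence
# chain. Not TreeHaver's code; names are borrowed from the changelog only.
module BackendSketch
  CONTEXT_KEY = :backend_sketch_context # hypothetical thread-variable name

  class << self
    attr_writer :global_backend

    # Global setting, falling back to :auto when nothing was configured.
    def global_backend
      @global_backend || :auto
    end

    # Thread-local override, or nil when no block is active on this thread.
    def current_backend_context
      Thread.current.thread_variable_get(CONTEXT_KEY)
    end

    # Sets the override for the duration of the block. The ensure clause
    # restores the previous value, so nesting works and an exception raised
    # inside the block does not leak the override.
    def with_backend(name)
      previous = current_backend_context
      Thread.current.thread_variable_set(CONTEXT_KEY, name)
      yield
    ensure
      Thread.current.thread_variable_set(CONTEXT_KEY, previous)
    end

    # explicit parameter > thread context > global setting > :auto
    def effective_backend(explicit = nil)
      explicit || current_backend_context || global_backend
    end
  end
end

BackendSketch.global_backend = :mri
BackendSketch.with_backend(:ffi) do
  puts BackendSketch.effective_backend        # thread context wins over global
  puts BackendSketch.effective_backend(:rust) # explicit parameter wins over both
end
puts BackendSketch.effective_backend          # global setting applies again
```

The sketch uses `Thread#thread_variable_get` and `Thread#thread_variable_set` rather than `Thread.current[]`, because the latter is fiber-local in Ruby. Which of the two the gem itself uses is not stated in the changelog.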
@@ -85,7 +272,9 @@ Please file a bug if you notice a violation of semantic versioning.
 
 - Initial release
 
-[Unreleased]: https://github.com/kettle-rb/tree_haver/compare/
+[Unreleased]: https://github.com/kettle-rb/tree_haver/compare/v3.0.0...HEAD
+[3.0.0]: https://github.com/kettle-rb/tree_haver/compare/v2.0.0...v3.0.0
+[3.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.0.0
 [2.0.0]: https://github.com/kettle-rb/tree_haver/compare/v1.0.0...v2.0.0
 [2.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v2.0.0
 [1.0.0]: https://github.com/kettle-rb/tree_haver/compare/a89211bff10f4440b96758a8ac9d7d539001b0c8...v1.0.0
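The changelog's first breaking change (errors inheriting from `Exception` rather than `StandardError`) affects rescue clauses: in Ruby, a bare `rescue` or `rescue => e` only handles `StandardError` and its subclasses. The sketch below shows the difference with a hypothetical `SketchError` class standing in for `TreeHaver::Error`; the method names are illustrative and none of this is code from the gem.

```ruby
# Hypothetical stand-in for an error class that inherits from Exception,
# as the changelog describes for TreeHaver::Error in 3.0.0.
class SketchError < Exception; end

# A bare rescue only handles StandardError and its subclasses,
# so SketchError passes straight through it.
def bare_rescue
  raise SketchError, "backend not available"
rescue => e
  "bare rescue caught #{e.class}"
end

# Naming the class explicitly does catch it.
def explicit_rescue
  raise SketchError, "backend not available"
rescue SketchError => e
  "explicit rescue caught #{e.class}"
end

# Wraps bare_rescue so the escaping error can be observed safely.
def call_bare_rescue
  bare_rescue
rescue SketchError
  "escaped the bare rescue"
end

puts call_bare_rescue # prints "escaped the bare rescue"
puts explicit_rescue  # prints "explicit rescue caught SketchError"
```

Code that previously relied on a bare `rescue` around parsing calls would, under this assumption, need to name the error class explicitly after upgrading.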
data/CONTRIBUTING.md
CHANGED
@@ -42,6 +42,106 @@ There are many Rake tasks available as well. You can see them by running:
 bin/rake -T
 ```
 
+## Backend Compatibility Testing
+
+TreeHaver supports multiple backends (MRI, FFI, Rust, Citrus), but not all backends can coexist
+in the same Ruby process. Notably, **FFI and MRI backends conflict** at the libtree-sitter runtime
+level—using both in the same process will cause segfaults.
+
+The `bin/backend-matrix` script helps test and document backend compatibility by running tests
+in isolated subprocesses.
+
+### Basic Usage
+
+```shell
+# Test all backends with TOML grammar (default)
+bin/backend-matrix
+
+# Test specific backend order
+bin/backend-matrix ffi mri rust
+
+# Test with a different grammar
+bin/backend-matrix --grammar=json
+
+# Test multiple grammars
+bin/backend-matrix --grammars=json,toml,bash
+```
+
+### All Permutations Mode
+
+Test all possible backend combinations by spawning fresh subprocesses for each:
+
+```shell
+# Test all 15 backend combinations (1-backend, 2-backend, 3-backend)
+bin/backend-matrix --all-permutations
+
+# With multiple grammars
+bin/backend-matrix --all-permutations --grammars=json,toml
+```
+
+### Cross-Grammar Testing
+
+The most interesting test: can different backends coexist if they use *different* grammar files?
+
+```shell
+# Test: FFI+json then MRI+toml, MRI+json then FFI+toml, etc.
+bin/backend-matrix --cross-grammar --grammars=json,toml
+
+# Full cross-grammar matrix
+bin/backend-matrix --all-permutations --cross-grammar --grammars=json,toml
+```
+
+### Custom Source Files
+
+Provide your own source files for parsing:
+
+```shell
+bin/backend-matrix --toml-source=my_config.toml --json-source=data.json
+```
+
+### List Available Grammars
+
+Check which grammars are configured and available:
+
+```shell
+bin/backend-matrix --list-grammars
+```
+
+### Understanding the Output
+
+The script produces tables showing:
+
+1. **1-Backend Tests**: Each backend tested in isolation with all grammars
+2. **2-Backend Tests**: Pairs of backends tested in sequence (A → B)
+3. **3-Backend Tests**: Triples tested in sequence (A → B → C)
+4. **Backend Pair Compatibility**: Data-driven analysis of which backends can coexist
+5. **Statistics**: Success rates and combination counts
+
+Example findings:
+
+```
+Backend Pair Compatibility:
+╭──────────────┬────────────────────┬─────────┬────────╮
+│ Backend Pair │ Compatibility      │ Working │ Failed │
+├──────────────┼────────────────────┼─────────┼────────┤
+│ ffi+mri      │ ✗ Incompatible     │ 0       │ 8      │
+│ mri+rust     │ ✓ Fully compatible │ 8       │ 0      │
+│ ffi+rust     │ ✓ Fully compatible │ 8       │ 0      │
+╰──────────────┴────────────────────┴─────────┴────────╯
+```
+
+### Required Environment Variables
+
+The script requires grammar paths to be set:
+
+```shell
+export TREE_SITTER_TOML_PATH=/path/to/libtree-sitter-toml.so
+export TREE_SITTER_JSON_PATH=/path/to/libtree-sitter-json.so
+export TREE_SITTER_BASH_PATH=/path/to/libtree-sitter-bash.so
+```
+
+See `.envrc` for examples of how these are typically configured.
+
 ## Environment Variables for Local Development
 
 Below are the primary environment variables recognized by stone_checksums (and its integrated tools). Unless otherwise noted, set boolean values to the string "true" to enable.