tree_haver 2.0.0 → 3.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f711d2013566d14c8ae9d5f3609de3c32f8621e04d573543841c88e04fb6b1fb
- data.tar.gz: '09c5054535b0399848c728302631daf9fdebb6c9d6a1aa3fa7d5428e52875b36'
+ metadata.gz: 0ddc5d837509119a581acd92a33fc7819b6e6dd40a727d1eb0b074d1e944c22f
+ data.tar.gz: f7b329ee2245068b3bc05e9e6c250e565bbb3b617a50840617f389c2f18dc679
  SHA512:
- metadata.gz: fd50cd072f0e37bd051e184eb4e8fb2f198f3645f403732d04e73d08990890c28cf30d678a40142b459ea361577cfa1c22a68e1b0050f71d5cc9fe631d1a4f0d
- data.tar.gz: 46072b508a3a538750b202a948a8a9e14dcc0134e659bc481df66de28d334daee69e55cd4c3d7b66bf01fe193f3fed9a419c785eea0a79266680261186367c29
+ metadata.gz: 66e9dda1e2638d5340ad99a48959d63002293225a3fc636ea776f5718e0ed388628a4a9fe87c1b0150ae265ec812dd763b1cecfe434044148a1fffc16a9e78b1
+ data.tar.gz: 9545f9a6b25e6f5944c60b1c7e1a3ecc278cea04c5274b3c34a3d3a0d9d83f14324bda96820286e21ae627c380200866ee47cf6161a12f9bffc9ff72064c3aec
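The entries in checksums.yaml are hex digests of the `metadata.gz` and `data.tar.gz` members of the gem archive. A minimal sketch of how such digests can be computed with Ruby's standard library, for readers who want to compare an extracted member against the values above. The helper name `gem_member_digests` is hypothetical, and the example hashes an empty string rather than a real archive member:

```ruby
require "digest/sha2"

# Hypothetical helper: returns the two digests that checksums.yaml
# records for one archive member, given that member's raw bytes.
def gem_member_digests(bytes)
  {
    "SHA256" => Digest::SHA256.hexdigest(bytes),
    "SHA512" => Digest::SHA512.hexdigest(bytes),
  }
end

# Demonstration on an empty string; for a real check, pass the bytes of
# an extracted member, e.g. File.binread("data.tar.gz").
digests = gem_member_digests("")
puts digests["SHA256"]
puts digests["SHA512"].length
```

A SHA256 hex digest is 64 characters long and a SHA512 hex digest is 128, which matches the lengths of the values in the hunk above.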
checksums.yaml.gz.sig CHANGED
Binary file
data/CHANGELOG.md CHANGED
@@ -30,6 +30,193 @@ Please file a bug if you notice a violation of semantic versioning.
 
  ### Security
 
+ ## [3.0.0] - 2025-12-16
+
+ - TAG: [v3.0.0][3.0.0t]
+ - COVERAGE: 85.19% -- 909/1067 lines in 11 files
+ - BRANCH COVERAGE: 67.47% -- 338/501 branches in 11 files
+ - 92.93% documented
+
+ ### Added
+
+ #### Backend Requirements
+
+ - **MRI Backend**: Requires `ruby_tree_sitter` v2.0+ (exceptions inherit from `Exception` not `StandardError`)
+ - In ruby_tree_sitter v2.0, TreeSitter errors were changed to inherit from Exception for thread-safety
+ - TreeHaver now properly handles: `ParserNotFoundError`, `LanguageLoadError`, `SymbolNotFoundError`, etc.
+
+ #### Thread-Safe Backend Selection (Hybrid Approach)
+
+ - **NEW: Block-based backend API** - `TreeHaver.with_backend(:ffi) { ... }` for thread-safe backend selection
+ - Thread-local context with proper nesting support
+ - Exception-safe (context restored even on errors)
+ - Fully backward compatible with existing global backend setting
+ - **NEW: Explicit backend parameters**
+ - `Parser.new(backend: :mri)` - specify backend when creating parser
+ - `Language.from_library(path, backend: :ffi)` - specify backend when loading language
+ - Backend parameters override thread context and global settings
+ - **NEW: Backend introspection** - `parser.backend` returns the current backend name (`:ffi`, `:mri`, etc.)
+ - **Backend precedence chain**: `explicit parameter > thread context > global setting > :auto`
+ - **Backend-aware caching** - Language cache now includes backend in cache key to prevent cross-backend pollution
+ - Added `TreeHaver.effective_backend` - returns the currently effective backend considering precedence
+ - Added `TreeHaver.current_backend_context` - returns thread-local backend context
+ - Added `TreeHaver.resolve_backend_module(explicit_backend)` - resolves backend module with precedence
+
+ #### Examples and Discovery
+
+ - Added 18 comprehensive examples demonstrating all backends and languages
+ - JSON examples (5): auto, MRI, Rust, FFI, Java
+ - JSONC examples (5): auto, MRI, Rust, FFI, Java
+ - Bash examples (5): auto, MRI, Rust, FFI, Java
+ - Citrus examples (3): TOML, Finitio, Dhall
+ - All examples use bundler inline (self-contained, no Gemfile needed)
+ - Added `examples/run_all.rb` - comprehensive test runner with colored output
+ - Updated `examples/README.md` - complete guide to all examples
+ - Added `TreeHaver::CitrusGrammarFinder` for language-agnostic discovery and registration of Citrus-based grammar gems
+ - Automatically discovers Citrus grammar gems by gem name and grammar constant path
+ - Validates grammar modules respond to `.parse(source)` before registration
+ - Provides helpful error messages when grammars are not found
+ - Added multi-backend language registry supporting multiple backends per language simultaneously
+ - Restructured `LanguageRegistry` to use nested hash: `{ language: { backend_type: config } }`
+ - Enables registering both tree-sitter and Citrus grammars for the same language without conflicts
+ - Supports runtime backend switching, benchmarking, and fallback scenarios
+ - Added `LanguageRegistry.register(name, backend_type, **config)` with backend-specific configuration storage
+ - Added `LanguageRegistry.registered(name, backend_type = nil)` to query by specific backend or get all backends
+ - Added `TreeHaver::Backends::Citrus::Node#structural?` method to distinguish structural nodes from terminals
+ - Uses Citrus grammar's `terminal?` method to dynamically determine node classification
+ - Works with any Citrus grammar without language-specific knowledge
+
+ ### Changed
+
+ - **BREAKING**: All errors now inherit from `TreeHaver::Error` which inherits from `Exception`
+ - see: https://github.com/Faveod/ruby-tree-sitter/pull/83 for reasoning
+ - **BREAKING**: `LanguageRegistry.register` signature changed from `register(name, path:, symbol:)` to `register(name, backend_type, **config)`
+ - This enables proper separation of tree-sitter and Citrus configurations
+ - Users should update to use `TreeHaver.register_language` instead of calling `LanguageRegistry.register` directly
+ - Updated `TreeHaver.register_language` to support both tree-sitter and Citrus grammars in single call or separate calls
+ - Can now register: `register_language(:toml, path: "...", symbol: "...", grammar_module: TomlRB::Document)`
+ - **INTENTIONAL DESIGN**: Uses separate `if` statements (not `elsif`) to allow registering both backends simultaneously
+ - Enables maximum flexibility: runtime backend switching, performance benchmarking, fallback scenarios
+ - Multiple registrations for same language now merge instead of overwrite
+
+ ### Improved
+
+ #### Code Quality and Documentation
+
+ - **Uniform backend API**: All backends now implement `reset!` method for consistent testing interface
+ - Eliminates need for tests to manipulate private instance variables
+ - Provides clean way to reset backend state between tests
+ - **Documented design decisions** with inline rationale
+ - FFI Tree finalizer behavior and why Parser doesn't use finalizers
+ - `resolve_backend_module` early-return pattern with comprehensive comments
+ - `register_language` multi-backend registration capability extensively documented
+ - **Enhanced YARD documentation**
+ - All Citrus examples now include `gem_name` parameter (matches actual usage patterns)
+ - Added complete examples showing both single-backend and multi-backend registration
+ - Documented backend precedence chain and thread-safety guarantees
+ - **Comprehensive test coverage** for thread-safe backend selection
+ - Thread-local context tests
+ - Parser backend parameter tests
+ - Language backend parameter tests
+ - Concurrent parsing tests with multiple backends
+ - Backend-aware cache isolation tests
+ - Nested block behavior tests (inner blocks override outer blocks)
+ - Exception safety tests (context restored even on errors)
+ - Explicit parameter precedence tests
+ - Updated `Language.method_missing` to automatically select appropriate grammar based on active backend
+ - tree-sitter backends (MRI, Rust, FFI, Java) query `:tree_sitter` registry key
+ - Citrus backend queries `:citrus` registry key
+ - Provides clear error messages when requested backend has no registered grammar
+ - Improved `TreeHaver::Backends::Citrus::Node#type` to use dynamic Citrus grammar introspection
+ - Uses event `.name` method and Symbol events for accurate type extraction
+ - Works with any Citrus grammar without language-specific code
+ - Handles compound rules (Repeat, Choice, Optional) intelligently
+
+ ### Fixed
+
+ #### Thread-Safety and Backend Selection
+
+ - Fixed `resolve_backend_module` to properly handle mocked backends without `available?` method
+ - Assumes modules without `available?` are available (for test compatibility and backward compatibility)
+ - Only rejects if module explicitly has `available?` method and returns false
+ - Makes code more defensive and test-friendly
+ - Fixed Language cache to include backend in cache key
+ - Prevents returning wrong backend's Language object when switching backends
+ - Essential for correctness with multiple backends in use
+ - Cache key now: `"#{path}:#{symbol}:#{backend}"` instead of just `"#{path}:#{symbol}"`
+ - Fixed `TreeHaver.register_language` to properly support multi-backend registration
+ - Documented intentional design: uses `if` not `elsif` to allow both backends in one call
+ - Added comprehensive inline comments explaining why no early return
+ - Added extensive YARD documentation with examples
+
+ #### Backend Bug Fixes
+
+ - Fixed critical double-wrapping bug in ALL backends (MRI, Rust, FFI, Java, Citrus)
+ - Backend `Parser#parse` and `parse_string` methods now return raw backend trees
+ - TreeHaver::Parser wraps the raw tree in TreeHaver::Tree (single wrapping)
+ - Previously backends were returning TreeHaver::Tree, then TreeHaver::Parser wrapped it again (double wrapping)
+ - This caused `@inner_tree` to be a TreeHaver::Tree instead of raw backend tree, leading to nil errors
+ - Fixed TreeHaver::Parser to pass source parameter when wrapping backend trees
+ - Enables `Node#text` to work correctly by providing source for text extraction
+ - Fixes all parse and parse_string methods to include `source: source` parameter
+ - Fixed MRI backend to properly use ruby_tree_sitter API
+ - Fixed `require "tree_sitter"` (gem name is `ruby_tree_sitter` but requires `tree_sitter`)
+ - Fixed `Language.load` to use correct argument order: `(symbol_name, path)`
+ - Fixed `Parser#parse` to use `parse_string(nil, source)` instead of creating Input objects
+ - Fixed `Language.from_library` to implement the expected signature matching other backends
+ - Fixed FFI backend missing essential node methods
+ - Added `ts_node_start_byte`, `ts_node_end_byte`, `ts_node_start_point`, `ts_node_end_point`
+ - Added `ts_node_is_null`, `ts_node_is_named`
+ - These methods are required for accessing node byte positions and metadata
+ - Fixes `NoMethodError` when using FFI backend to traverse AST nodes
+ - Fixed GrammarFinder error messages for environment variable validation
+ - Detects leading/trailing whitespace in paths and provides correction suggestions
+ - Shows when TREE_SITTER_*_PATH is set but points to nonexistent file
+ - Provides helpful guidance for setting environment variables correctly
+ - Fixed registry conflicts when registering multiple backend types for the same language
+ - Fixed `CitrusGrammarFinder` to properly handle gems with non-standard require paths (e.g., `toml-rb.rb` vs `toml/rb.rb`)
+ - Fixed Citrus backend infinite recursion in `Node#extract_type_from_event`
+ - Added cycle detection to prevent stack overflow when traversing recursive grammar structures
+
+ ### Known Issues
+
+ - **MRI backend + Bash grammar**: ABI/symbol loading incompatibility
+ - The ruby_tree_sitter gem cannot load tree-sitter-bash grammar (symbol not found)
+ - Workaround: Use FFI backend instead (works perfectly)
+ - This is documented in examples and test runner
+ - **Rust backend + Bash grammar**: Version mismatch due to static linking
+ - tree_stump statically links tree-sitter at compile time
+ - System bash.so may be compiled with different tree-sitter version
+ - Workaround: Use FFI backend (dynamic linking avoids version conflicts)
+ - This is documented in examples with detailed explanations
+
+ ### Notes on Backward Compatibility
+
+ Despite the major version bump to 3.0.0 (following semver due to the breaking `LanguageRegistry.register` signature change), **most users will experience NO BREAKING CHANGES**:
+
+ #### Why 3.0.0?
+
+ - `LanguageRegistry.register` signature changed to support multi-backend registration
+ - However, most users should use `TreeHaver.register_language` (which remains backward compatible)
+ - Direct calls to `LanguageRegistry.register` are rare in practice
+
+ #### What Stays the Same?
+
+ - **Global backend setting**: `TreeHaver.backend = :ffi` works unchanged
+ - **Parser creation**: `Parser.new` without parameters works as before
+ - **Language loading**: `Language.from_library(path)` works as before
+ - **Auto-detection**: Backend auto-selection still works when backend is `:auto`
+ - **All existing code** continues to work without modifications
+
+ #### What's New (All Optional)?
+
+ - Thread-safe block API: `TreeHaver.with_backend(:ffi) { ... }`
+ - Explicit backend parameters: `Parser.new(backend: :mri)`
+ - Backend introspection: `parser.backend`
+ - Multi-backend language registration
+
+ **Migration Path**: Existing codebases can upgrade to 3.0.0 and gain access to new thread-safe features without changing any existing code. The new features are purely additive and opt-in.
+
  ## [2.0.0] - 2025-12-15
 
  - TAG: [v2.0.0][2.0.0t]
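The 3.0.0 entry above lists as breaking that errors now inherit from `Exception` rather than `StandardError`. In Ruby this matters because a bare `rescue` only catches `StandardError` and its descendants. A minimal sketch of the difference, using a hypothetical `DemoError` class as a stand-in (it does not load tree_haver and makes no claim about the gem's actual class layout):

```ruby
# Hypothetical stand-in for an error class that inherits from Exception,
# as the changelog says TreeHaver::Error does in 3.0.0.
class DemoError < Exception; end

# A bare `rescue` is shorthand for `rescue StandardError`, so it does not
# catch DemoError; the error propagates to the caller instead.
def bare_rescue
  raise DemoError, "parser not found"
rescue => e
  "caught by bare rescue: #{e.message}"
end

# Naming the class explicitly does catch it.
def explicit_rescue
  raise DemoError, "parser not found"
rescue DemoError => e
  "caught explicitly: #{e.message}"
end

bare_result =
  begin
    bare_rescue
  rescue DemoError
    :escaped # the bare rescue inside bare_rescue never ran
  end

puts bare_result.inspect
puts explicit_rescue
```

Code that wraps parser calls in a bare `rescue` or `rescue StandardError` may therefore need to name the error class explicitly after upgrading.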
@@ -85,7 +272,9 @@ Please file a bug if you notice a violation of semantic versioning.
 
  - Initial release
 
- [Unreleased]: https://github.com/kettle-rb/tree_haver/compare/v2.0.0...HEAD
+ [Unreleased]: https://github.com/kettle-rb/tree_haver/compare/v3.0.0...HEAD
+ [3.0.0]: https://github.com/kettle-rb/tree_haver/compare/v2.0.0...v3.0.0
+ [3.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v3.0.0
  [2.0.0]: https://github.com/kettle-rb/tree_haver/compare/v1.0.0...v2.0.0
  [2.0.0t]: https://github.com/kettle-rb/tree_haver/releases/tag/v2.0.0
  [1.0.0]: https://github.com/kettle-rb/tree_haver/compare/a89211bff10f4440b96758a8ac9d7d539001b0c8...v1.0.0
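The changelog describes a block-based backend API with nesting, exception safety, and the precedence chain `explicit parameter > thread context > global setting > :auto`. The sketch below shows one way such behavior could be built. The module `BackendDemo` and everything in it are hypothetical and are not taken from TreeHaver's source; only the method names `with_backend` and `effective_backend` echo the changelog.

```ruby
# Hypothetical illustration only; this is not TreeHaver's implementation.
module BackendDemo
  CONTEXT_KEY = :backend_demo_context

  class << self
    attr_writer :global_backend

    # Falls back to :auto when no global backend has been set.
    def global_backend
      @global_backend || :auto
    end

    # Runs the block with `name` as the current context, then restores the
    # previous value in `ensure`, so nesting and raised errors are handled.
    # Thread.current[] is per-thread (strictly, per-fiber) storage.
    def with_backend(name)
      previous = Thread.current[CONTEXT_KEY]
      Thread.current[CONTEXT_KEY] = name
      yield
    ensure
      Thread.current[CONTEXT_KEY] = previous
    end

    # explicit parameter > thread context > global setting > :auto
    def effective_backend(explicit = nil)
      explicit || Thread.current[CONTEXT_KEY] || global_backend
    end
  end
end

BackendDemo.global_backend = :mri
results = []
isolated = nil

BackendDemo.with_backend(:ffi) do
  results << BackendDemo.effective_backend           # thread context beats global
  BackendDemo.with_backend(:rust) { results << BackendDemo.effective_backend }
  results << BackendDemo.effective_backend           # outer context restored
  results << BackendDemo.effective_backend(:citrus)  # explicit parameter wins
  isolated = Thread.new { BackendDemo.effective_backend }.value
end

begin
  BackendDemo.with_backend(:ffi) { raise "boom" }
rescue RuntimeError
  # The ensure clause has already restored the previous context here.
end
after_error = BackendDemo.effective_backend

p results
p isolated
p after_error
```

In this sketch a new thread started inside the block does not inherit the block's context and falls back to the global setting, which is the isolation a thread-local design is meant to provide.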
data/CONTRIBUTING.md CHANGED
@@ -42,6 +42,106 @@ There are many Rake tasks available as well. You can see them by running:
  bin/rake -T
  ```
 
+ ## Backend Compatibility Testing
+
+ TreeHaver supports multiple backends (MRI, FFI, Rust, Citrus), but not all backends can coexist
+ in the same Ruby process. Notably, **FFI and MRI backends conflict** at the libtree-sitter runtime
+ level—using both in the same process will cause segfaults.
+
+ The `bin/backend-matrix` script helps test and document backend compatibility by running tests
+ in isolated subprocesses.
+
+ ### Basic Usage
+
+ ```shell
+ # Test all backends with TOML grammar (default)
+ bin/backend-matrix
+
+ # Test specific backend order
+ bin/backend-matrix ffi mri rust
+
+ # Test with a different grammar
+ bin/backend-matrix --grammar=json
+
+ # Test multiple grammars
+ bin/backend-matrix --grammars=json,toml,bash
+ ```
+
+ ### All Permutations Mode
+
+ Test all possible backend combinations by spawning fresh subprocesses for each:
+
+ ```shell
+ # Test all 15 backend combinations (1-backend, 2-backend, 3-backend)
+ bin/backend-matrix --all-permutations
+
+ # With multiple grammars
+ bin/backend-matrix --all-permutations --grammars=json,toml
+ ```
+
+ ### Cross-Grammar Testing
+
+ The most interesting test: can different backends coexist if they use *different* grammar files?
+
+ ```shell
+ # Test: FFI+json then MRI+toml, MRI+json then FFI+toml, etc.
+ bin/backend-matrix --cross-grammar --grammars=json,toml
+
+ # Full cross-grammar matrix
+ bin/backend-matrix --all-permutations --cross-grammar --grammars=json,toml
+ ```
+
+ ### Custom Source Files
+
+ Provide your own source files for parsing:
+
+ ```shell
+ bin/backend-matrix --toml-source=my_config.toml --json-source=data.json
+ ```
+
+ ### List Available Grammars
+
+ Check which grammars are configured and available:
+
+ ```shell
+ bin/backend-matrix --list-grammars
+ ```
+
+ ### Understanding the Output
+
+ The script produces tables showing:
+
+ 1. **1-Backend Tests**: Each backend tested in isolation with all grammars
+ 2. **2-Backend Tests**: Pairs of backends tested in sequence (A → B)
+ 3. **3-Backend Tests**: Triples tested in sequence (A → B → C)
+ 4. **Backend Pair Compatibility**: Data-driven analysis of which backends can coexist
+ 5. **Statistics**: Success rates and combination counts
+
+ Example findings:
+
+ ```
+ Backend Pair Compatibility:
+ ╭──────────────┬────────────────────┬─────────┬────────╮
+ │ Backend Pair │ Compatibility      │ Working │ Failed │
+ ├──────────────┼────────────────────┼─────────┼────────┤
+ │ ffi+mri      │ ✗ Incompatible     │ 0       │ 8      │
+ │ mri+rust     │ ✓ Fully compatible │ 8       │ 0      │
+ │ ffi+rust     │ ✓ Fully compatible │ 8       │ 0      │
+ ╰──────────────┴────────────────────┴─────────┴────────╯
+ ```
+
+ ### Required Environment Variables
+
+ The script requires grammar paths to be set:
+
+ ```shell
+ export TREE_SITTER_TOML_PATH=/path/to/libtree-sitter-toml.so
+ export TREE_SITTER_JSON_PATH=/path/to/libtree-sitter-json.so
+ export TREE_SITTER_BASH_PATH=/path/to/libtree-sitter-bash.so
+ ```
+
+ See `.envrc` for examples of how these are typically configured.
+
  ## Environment Variables for Local Development
 
  Below are the primary environment variables recognized by stone_checksums (and its integrated tools). Unless otherwise noted, set boolean values to the string "true" to enable.
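
The "15 backend combinations" figure in the All Permutations Mode section of the CONTRIBUTING.md hunk above is consistent with counting ordered sequences of one, two, and three distinct backends. A short check of that arithmetic, assuming the three backends are `ffi`, `mri`, and `rust` (the names used in the `bin/backend-matrix ffi mri rust` example) and assuming order matters because backends are tested in sequence; this does not reproduce the script's own logic:

```ruby
# Assumed backend list, taken from the usage example in the hunk above.
backends = %i[ffi mri rust]

# Ordered sequences of 1, 2, and 3 distinct backends (A, A -> B, A -> B -> C).
orderings = (1..backends.size).flat_map { |n| backends.permutation(n).to_a }

# Number of sequences of each length.
counts = (1..backends.size).map { |n| backends.permutation(n).count }

puts counts.inspect
puts orderings.size
```

The per-length counts are 3, 6, and 6, which sum to 15.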