tree_haver 3.0.0 → 3.1.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +3 -3
- data/CHANGELOG.md +96 -1
- data/CONTRIBUTING.md +46 -14
- data/README.md +248 -86
- data/lib/tree_haver/backends/citrus.rb +36 -0
- data/lib/tree_haver/backends/commonmarker.rb +490 -0
- data/lib/tree_haver/backends/ffi.rb +15 -13
- data/lib/tree_haver/backends/java.rb +1 -1
- data/lib/tree_haver/backends/markly.rb +559 -0
- data/lib/tree_haver/backends/mri.rb +41 -12
- data/lib/tree_haver/backends/prism.rb +624 -0
- data/lib/tree_haver/backends/psych.rb +597 -0
- data/lib/tree_haver/backends/rust.rb +1 -1
- data/lib/tree_haver/grammar_finder.rb +74 -5
- data/lib/tree_haver/node.rb +72 -6
- data/lib/tree_haver/version.rb +1 -1
- data/lib/tree_haver.rb +143 -24
- data/sig/tree_haver.rbs +18 -1
- data.tar.gz.sig +0 -0
- metadata +8 -4
- metadata.gz.sig +0 -0
data/README.md
CHANGED
|
@@ -54,20 +54,20 @@
|
|
|
54
54
|
|
|
55
55
|
## 🌻 Synopsis
|
|
56
56
|
|
|
57
|
-
TreeHaver is a cross-Ruby adapter for the [tree-sitter](https://tree-sitter.github.io/tree-sitter/) parsing
|
|
57
|
+
TreeHaver is a cross-Ruby adapter for the [tree-sitter](https://tree-sitter.github.io/tree-sitter/) and [Citrus](https://github.com/mjackson/citrus) parsing libraries and other dedicated parsing tools that works seamlessly across MRI Ruby, JRuby, and TruffleRuby. It provides a unified API for parsing source code using grammars, regardless of your Ruby implementation.
|
|
58
58
|
|
|
59
59
|
### The Adapter Pattern: Like Faraday, but for Parsing
|
|
60
60
|
|
|
61
61
|
If you've used [Faraday](https://github.com/lostisland/faraday), [multi_json](https://github.com/intridea/multi_json), or [multi_xml](https://github.com/sferik/multi_xml), you'll feel right at home with TreeHaver. These gems share a common philosophy:
|
|
62
62
|
|
|
63
|
-
| Gem | Unified API for | Backend Examples
|
|
64
|
-
|
|
65
|
-
| **Faraday** | HTTP requests | Net::HTTP, Typhoeus, Patron, Excon
|
|
66
|
-
| **multi_json** | JSON parsing | Oj, Yajl, JSON gem
|
|
67
|
-
| **multi_xml** | XML parsing | Nokogiri, LibXML, Ox
|
|
68
|
-
| **TreeHaver** |
|
|
63
|
+
| Gem | Unified API for | Backend Examples |
|
|
64
|
+
|----------------|---------------------|--------------------------------------------------------------------------|
|
|
65
|
+
| **Faraday** | HTTP requests | Net::HTTP, Typhoeus, Patron, Excon |
|
|
66
|
+
| **multi_json** | JSON parsing | Oj, Yajl, JSON gem |
|
|
67
|
+
| **multi_xml** | XML parsing | Nokogiri, LibXML, Ox |
|
|
68
|
+
| **TreeHaver** | Code parsing | MRI, Rust, FFI, Java, Prism, Psych, Commonmarker, Markly, Citrus (& Co.) |
|
|
69
69
|
|
|
70
|
-
**Write once, run anywhere.**
|
|
70
|
+
**Write once, run anywhere.**
|
|
71
71
|
|
|
72
72
|
**Learn once, write anywhere.**
|
|
73
73
|
|
|
@@ -88,18 +88,26 @@ tree = parser.parse(source_code)
|
|
|
88
88
|
### Key Features
|
|
89
89
|
|
|
90
90
|
- **Universal Ruby Support**: Works on MRI Ruby, JRuby, and TruffleRuby
|
|
91
|
-
- **
|
|
92
|
-
- **
|
|
93
|
-
|
|
94
|
-
- **
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
91
|
+
- **10 Parsing Backends** - Choose the right backend for your needs:
|
|
92
|
+
- **Tree-sitter Backends** (high-performance, incremental parsing):
|
|
93
|
+
- **MRI Backend**: Leverages [`ruby_tree_sitter`](https://github.com/Faveod/ruby-tree-sitter) gem (C extension, fastest on MRI)
|
|
94
|
+
- **Rust Backend**: Uses [`tree_stump`](https://github.com/joker1007/tree_stump) gem (Rust with precompiled binaries)
|
|
95
|
+
- **Note**: Currently requires [pboling's fork](https://github.com/pboling/tree_stump/tree/tree_haver) until PRs [#5](https://github.com/joker1007/tree_stump/pull/5), [#7](https://github.com/joker1007/tree_stump/pull/7), [#11](https://github.com/joker1007/tree_stump/pull/11), and [#13](https://github.com/joker1007/tree_stump/pull/13) are merged
|
|
96
|
+
- **FFI Backend**: Pure Ruby FFI bindings to `libtree-sitter` (ideal for JRuby, TruffleRuby)
|
|
97
|
+
- **Java Backend**: Native Java integration for JRuby with java-tree-sitter grammar JARs
|
|
98
|
+
- **Language-Specific Backends** (native parser integration):
|
|
99
|
+
- **Prism Backend**: Ruby's official parser ([Prism](https://github.com/ruby/prism), stdlib in Ruby 3.4+)
|
|
100
|
+
- **Psych Backend**: Ruby's YAML parser ([Psych](https://github.com/ruby/psych), stdlib)
|
|
101
|
+
- **Commonmarker Backend**: Fast Markdown parser ([Commonmarker](https://github.com/gjtorikian/commonmarker), comrak Rust)
|
|
102
|
+
- **Markly Backend**: GitHub Flavored Markdown ([Markly](https://github.com/ioquatix/markly), cmark-gfm C)
|
|
103
|
+
- **Pure Ruby Fallback**:
|
|
104
|
+
- **Citrus Backend**: Pure Ruby parsing via [`citrus`](https://github.com/mjackson/citrus) (no native dependencies)
|
|
98
105
|
- **Automatic Backend Selection**: Intelligently selects the best backend for your Ruby implementation
|
|
99
|
-
- **Language Agnostic**:
|
|
106
|
+
- **Language Agnostic**: Parse any language - Ruby, Markdown, YAML, JSON, Bash, TOML, JavaScript, etc.
|
|
100
107
|
- **Grammar Discovery**: Built-in `GrammarFinder` utility for platform-aware grammar library discovery
|
|
108
|
+
- **Unified Position API**: Consistent `start_line`, `end_line`, `source_position` across all backends
|
|
101
109
|
- **Thread-Safe**: Built-in language registry with thread-safe caching
|
|
102
|
-
- **Minimal API Surface**: Simple, focused API that covers the most common
|
|
110
|
+
- **Minimal API Surface**: Simple, focused API that covers the most common use cases
|
|
103
111
|
|
|
104
112
|
### Backend Requirements
|
|
105
113
|
|
|
@@ -204,7 +212,7 @@ TreeHaver solves these problems by providing a unified API that automatically se
|
|
|
204
212
|
|
|
205
213
|
**Note:** Java backend works with grammar JARs built specifically for java-tree-sitter, or grammar .so files that statically link tree-sitter. This is why FFI is recommended for JRuby & TruffleRuby.
|
|
206
214
|
|
|
207
|
-
**Note:** TreeHaver can use `ruby_tree_sitter` or `tree_stump` as backends, giving you TreeHaver's unified API, grammar discovery, and security features, plus full access to incremental parsing when using those backends.
|
|
215
|
+
**Note:** TreeHaver can use `ruby_tree_sitter` (MRI) or `tree_stump` (MRI, JRuby?) as backends, or `jruby-tree-sitter` (JRuby), giving you TreeHaver's unified API, grammar discovery, and security features, plus full access to incremental parsing when using those backends.
|
|
208
216
|
|
|
209
217
|
**Note:** `tree_stump` currently requires [pboling's fork (tree_haver branch)](https://github.com/pboling/tree_stump/tree/tree_haver) until upstream PRs [#5](https://github.com/joker1007/tree_stump/pull/5), [#7](https://github.com/joker1007/tree_stump/pull/7), [#11](https://github.com/joker1007/tree_stump/pull/11), and [#13](https://github.com/joker1007/tree_stump/pull/13) are merged.
|
|
210
218
|
|
|
@@ -355,18 +363,29 @@ NOTE: Be prepared to track down certs for signed gems and add them the same way
|
|
|
355
363
|
|
|
356
364
|
### Available Backends
|
|
357
365
|
|
|
358
|
-
TreeHaver supports
|
|
366
|
+
TreeHaver supports 10 parsing backends, each with different trade-offs. The `auto` backend automatically selects the best available option.
|
|
367
|
+
|
|
368
|
+
#### Tree-sitter Backends (Universal Parsing)
|
|
369
|
+
|
|
370
|
+
| Backend | Description | Performance | Portability | Examples |
|
|
371
|
+
|---------|-------------|-------------|-------------|----------|
|
|
372
|
+
| **Auto** | Auto-selects best backend | Varies | ✅ Universal | [JSON](examples/auto_json.rb) · [JSONC](examples/auto_jsonc.rb) · [Bash](examples/auto_bash.rb) · [TOML](examples/auto_toml.rb) |
|
|
373
|
+
| **MRI** | C extension via ruby_tree_sitter | ⚡ Fastest | MRI only | [JSON](examples/mri_json.rb) · [JSONC](examples/mri_jsonc.rb) · ~~Bash~~* · [TOML](examples/mri_toml.rb) |
|
|
374
|
+
| **Rust** | Precompiled via tree_stump | ⚡ Very Fast | ✅ Good | [JSON](examples/rust_json.rb) · [JSONC](examples/rust_jsonc.rb) · ~~Bash~~* · [TOML](examples/rust_toml.rb) |
|
|
375
|
+
| **FFI** | Dynamic linking via FFI | 🔵 Fast | ✅ Universal | [JSON](examples/ffi_json.rb) · [JSONC](examples/ffi_jsonc.rb) · [Bash](examples/ffi_bash.rb) · [TOML](examples/ffi_toml.rb) |
|
|
376
|
+
| **Java** | JNI bindings | ⚡ Very Fast | JRuby only | [JSON](examples/java_json.rb) · [JSONC](examples/java_jsonc.rb) · [Bash](examples/java_bash.rb) · [TOML](examples/java_toml.rb) |
|
|
377
|
+
|
|
378
|
+
#### Language-Specific Backends (Native Parser Integration)
|
|
359
379
|
|
|
360
380
|
| Backend | Description | Performance | Portability | Examples |
|
|
361
381
|
|---------|-------------|-------------|-------------|----------|
|
|
362
|
-
| **
|
|
363
|
-
| **
|
|
364
|
-
| **
|
|
365
|
-
| **
|
|
366
|
-
| **Java** | JNI bindings | ⚡ Very Fast | JRuby only | [JSON](examples/java_json.rb) · [JSONC](examples/java_jsonc.rb) · [Bash](examples/java_bash.rb) |
|
|
382
|
+
| **Prism** | Ruby's official parser | ⚡ Very Fast | ✅ Universal | [Ruby](examples/prism_ruby.rb) |
|
|
383
|
+
| **Psych** | Ruby's YAML parser (stdlib) | ⚡ Very Fast | ✅ Universal | [YAML](examples/psych_yaml.rb) |
|
|
384
|
+
| **Commonmarker** | Markdown via comrak (Rust) | ⚡ Very Fast | ✅ Good | [Markdown](examples/commonmarker_markdown.rb) · [Merge](examples/commonmarker_merge_example.rb) |
|
|
385
|
+
| **Markly** | GFM via cmark-gfm (C) | ⚡ Very Fast | ✅ Good | [Markdown](examples/markly_markdown.rb) · [Merge](examples/markly_merge_example.rb) |
|
|
367
386
|
| **Citrus** | Pure Ruby parsing | 🟡 Slower | ✅ Universal | [TOML](examples/citrus_toml.rb) · [Finitio](examples/citrus_finitio.rb) · [Dhall](examples/citrus_dhall.rb) |
|
|
368
387
|
|
|
369
|
-
**Selection Priority (Auto mode):** MRI → Rust → FFI → Java → Citrus
|
|
388
|
+
**Selection Priority (Auto mode):** MRI → Rust → FFI → Java → Prism → Psych → Commonmarker → Markly → Citrus
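The priority order above reads as a first-match scan over the backend list. The following is a minimal sketch of that idea only; `PRIORITY` and `pick_backend` are hypothetical names, not part of TreeHaver's API, and the availability check is supplied by the caller because this diff does not show how TreeHaver probes each backend.

```ruby
# Illustrative only: PRIORITY and pick_backend are hypothetical names,
# not TreeHaver's API. The order mirrors the README's Auto-mode priority.
PRIORITY = %i[mri rust ffi java prism psych commonmarker markly citrus].freeze

# Returns the first backend, in priority order, for which the caller's
# block returns a truthy value, or nil when no backend qualifies.
def pick_backend(priority = PRIORITY)
  priority.find { |name| yield(name) }
end

# Pretend only FFI and Citrus are usable on this machine.
installed = %i[citrus ffi]
puts pick_backend { |name| installed.include?(name) } # prints "ffi"
```

FFI wins here even though Citrus is listed first in `installed`, because the scan follows the priority list rather than the order in which backends happen to be detected.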
|
|
370
389
|
|
|
371
390
|
**Known Issues:**
|
|
372
391
|
- *MRI + Bash: ABI incompatibility (use FFI instead)
|
|
@@ -375,29 +394,43 @@ TreeHaver supports multiple parsing backends, each with different trade-offs. Th
|
|
|
375
394
|
**Backend Requirements:**
|
|
376
395
|
|
|
377
396
|
```ruby
|
|
378
|
-
#
|
|
379
|
-
gem
|
|
380
|
-
|
|
381
|
-
#
|
|
382
|
-
gem '
|
|
383
|
-
|
|
384
|
-
#
|
|
385
|
-
gem
|
|
386
|
-
|
|
387
|
-
#
|
|
388
|
-
gem
|
|
397
|
+
# Tree-sitter backends
|
|
398
|
+
gem "ruby_tree_sitter", "~> 2.0" # MRI backend
|
|
399
|
+
gem "tree_stump" # Rust backend
|
|
400
|
+
gem "ffi", ">= 1.15", "< 2.0" # FFI backend
|
|
401
|
+
# Java backend: no gem required (uses JRuby's built-in JNI)
|
|
402
|
+
|
|
403
|
+
# Language-specific backends
|
|
404
|
+
gem "prism", "~> 1.0" # Ruby parsing (stdlib in Ruby 3.4+)
|
|
405
|
+
# Psych: no gem required (Ruby stdlib)
|
|
406
|
+
gem "commonmarker", ">= 0.23" # Markdown parsing (comrak)
|
|
407
|
+
gem "markly", "~> 0.11" # GFM parsing (cmark-gfm)
|
|
408
|
+
|
|
409
|
+
# Pure Ruby fallback
|
|
410
|
+
gem "citrus", "~> 3.0" # Citrus backend
|
|
389
411
|
# Plus grammar gems: toml-rb, dhall, finitio, etc.
|
|
390
412
|
```
|
|
391
413
|
|
|
392
414
|
**Force Specific Backend:**
|
|
393
415
|
|
|
394
416
|
```ruby
|
|
417
|
+
# Tree-sitter backends
|
|
418
|
+
TreeHaver.backend = :mri # Force MRI backend (ruby_tree_sitter)
|
|
419
|
+
TreeHaver.backend = :rust # Force Rust backend (tree_stump)
|
|
395
420
|
TreeHaver.backend = :ffi # Force FFI backend
|
|
396
|
-
TreeHaver.backend = :
|
|
397
|
-
|
|
398
|
-
|
|
421
|
+
TreeHaver.backend = :java # Force Java backend (JRuby only)
|
|
422
|
+
|
|
423
|
+
# Language-specific backends
|
|
424
|
+
TreeHaver.backend = :prism # Force Prism (Ruby parsing)
|
|
425
|
+
TreeHaver.backend = :psych # Force Psych (YAML parsing)
|
|
426
|
+
TreeHaver.backend = :commonmarker # Force Commonmarker (Markdown)
|
|
427
|
+
TreeHaver.backend = :markly # Force Markly (GFM Markdown)
|
|
428
|
+
|
|
429
|
+
# Pure Ruby fallback
|
|
399
430
|
TreeHaver.backend = :citrus # Force Citrus backend
|
|
400
|
-
|
|
431
|
+
|
|
432
|
+
# Auto-selection (default)
|
|
433
|
+
TreeHaver.backend = :auto # Let TreeHaver choose
|
|
401
434
|
```
|
|
402
435
|
|
|
403
436
|
**Block-based Backend Switching:**
|
|
@@ -453,7 +486,7 @@ TreeHaver.backend_module # => TreeHaver::Backends::FFI
|
|
|
453
486
|
TreeHaver.capabilities # => { backend: :ffi, parse: true, query: false, ... }
|
|
454
487
|
```
|
|
455
488
|
|
|
456
|
-
See [examples/](examples/) directory for
|
|
489
|
+
See [examples/](examples/) directory for **26 complete working examples** demonstrating all 10 backends with multiple languages (JSON, JSONC, Bash, TOML, Ruby, YAML, Markdown) plus markdown-merge integration examples.
|
|
457
490
|
|
|
458
491
|
### Security Considerations
|
|
459
492
|
|
|
@@ -765,74 +798,103 @@ parser = TreeSitter::Parser.new # Actually creates TreeHaver::Parser
|
|
|
765
798
|
|
|
766
799
|
This is safe and idempotent—if the real `TreeSitter` module is already loaded, the shim does nothing.
|
|
767
800
|
|
|
768
|
-
#### ⚠️
|
|
801
|
+
#### ⚠️ Important: Exception Hierarchy
|
|
769
802
|
|
|
770
|
-
**ruby_tree_sitter v2+ exceptions inherit from `Exception` (not `StandardError`).**
|
|
771
|
-
**TreeHaver exceptions follow Ruby best practices and inherit from `StandardError`.**
|
|
803
|
+
**Both ruby_tree_sitter v2+ and TreeHaver exceptions inherit from `Exception` (not `StandardError`).**
|
|
772
804
|
|
|
773
|
-
This
|
|
805
|
+
This design decision follows ruby_tree_sitter's lead for thread-safety and signal handling reasons. See [ruby_tree_sitter PR #83](https://github.com/Faveod/ruby-tree-sitter/pull/83) for the rationale.
|
|
774
806
|
|
|
775
|
-
|
|
776
|
-
|----------|---------------------|----------------------|
|
|
777
|
-
| `rescue => e` | Does NOT catch TreeSitter errors | DOES catch TreeHaver errors |
|
|
778
|
-
| Behavior | Errors propagate (inherit Exception) | Errors caught (inherit StandardError) |
|
|
779
|
-
|
|
780
|
-
**Example showing the difference:**
|
|
807
|
+
**What this means for exception handling:**
|
|
781
808
|
|
|
782
809
|
```ruby
|
|
783
|
-
#
|
|
810
|
+
# ⚠️ This will NOT catch TreeHaver errors
|
|
784
811
|
begin
|
|
785
|
-
|
|
812
|
+
TreeHaver::Language.from_library("/nonexistent.so")
|
|
786
813
|
rescue => e
|
|
787
|
-
puts "Caught!" # Never reached -
|
|
814
|
+
puts "Caught!" # Never reached - TreeHaver::Error inherits Exception
|
|
788
815
|
end
|
|
789
816
|
|
|
790
|
-
#
|
|
791
|
-
require "tree_haver/compat"
|
|
817
|
+
# ✅ Explicit rescue is required
|
|
792
818
|
begin
|
|
793
|
-
|
|
794
|
-
rescue => e
|
|
795
|
-
puts "Caught!" #
|
|
819
|
+
TreeHaver::Language.from_library("/nonexistent.so")
|
|
820
|
+
rescue TreeHaver::Error => e
|
|
821
|
+
puts "Caught!" # This works
|
|
822
|
+
end
|
|
823
|
+
|
|
824
|
+
# ✅ Or rescue specific exceptions
|
|
825
|
+
begin
|
|
826
|
+
TreeHaver::Language.from_library("/nonexistent.so")
|
|
827
|
+
rescue TreeHaver::NotAvailable => e
|
|
828
|
+
puts "Grammar not available: #{e.message}"
|
|
796
829
|
end
|
|
797
830
|
```
|
|
798
831
|
|
|
799
|
-
**
|
|
832
|
+
**TreeHaver Exception Hierarchy:**
|
|
833
|
+
|
|
834
|
+
```
|
|
835
|
+
Exception
|
|
836
|
+
└── TreeHaver::Error # Base error class
|
|
837
|
+
├── TreeHaver::NotAvailable # Backend/grammar not available
|
|
838
|
+
└── TreeHaver::BackendConflict # Backend incompatibility detected
|
|
839
|
+
```
|
|
840
|
+
|
|
841
|
+
**Compatibility Mode Behavior:**
|
|
842
|
+
|
|
843
|
+
The compat mode (`require "tree_haver/compat"`) creates aliases but **does not change the exception hierarchy**:
|
|
800
844
|
|
|
801
845
|
```ruby
|
|
802
|
-
|
|
803
|
-
|
|
804
|
-
|
|
805
|
-
|
|
806
|
-
|
|
807
|
-
|
|
846
|
+
require "tree_haver/compat"
|
|
847
|
+
|
|
848
|
+
# TreeSitter constants are now aliases to TreeHaver
|
|
849
|
+
TreeSitter::Error # => TreeHaver::Error (still inherits Exception)
|
|
850
|
+
TreeSitter::Parser # => TreeHaver::Parser
|
|
851
|
+
TreeSitter::Language # => TreeHaver::Language
|
|
808
852
|
|
|
809
|
-
#
|
|
853
|
+
# Exception handling remains the same
|
|
810
854
|
begin
|
|
811
|
-
|
|
812
|
-
rescue
|
|
813
|
-
|
|
855
|
+
TreeSitter::Language.load("missing", "/nonexistent.so")
|
|
856
|
+
rescue TreeSitter::Error => e # Still requires explicit rescue
|
|
857
|
+
puts "Error: #{e.message}"
|
|
814
858
|
end
|
|
815
859
|
```
|
|
816
860
|
|
|
817
|
-
**
|
|
861
|
+
**Best Practices:**
|
|
862
|
+
|
|
863
|
+
1. **Always use explicit rescue** for TreeHaver errors:
|
|
864
|
+
```ruby
|
|
865
|
+
begin
|
|
866
|
+
finder = TreeHaver::GrammarFinder.new(:toml)
|
|
867
|
+
finder.register! if finder.available?
|
|
868
|
+
language = TreeHaver::Language.toml
|
|
869
|
+
rescue TreeHaver::NotAvailable => e
|
|
870
|
+
warn("TOML grammar not available: #{e.message}")
|
|
871
|
+
# Fallback to another backend or fail gracefully
|
|
872
|
+
end
|
|
873
|
+
`````
|
|
874
|
+
|
|
875
|
+
2. **Never rely on `rescue => e`** to catch TreeHaver errors (it won't work)
|
|
818
876
|
|
|
819
|
-
|
|
820
|
-
2. **Safety**: Inheriting from `Exception` can catch system signals (`SIGTERM`, `SIGINT`) and `exit`, which is dangerous
|
|
821
|
-
3. **Consistency**: Most Ruby libraries follow this convention
|
|
822
|
-
4. **Testability**: StandardError exceptions are easier to test and mock
|
|
877
|
+
**Why inherit from Exception?**
|
|
823
878
|
|
|
824
|
-
|
|
879
|
+
Following ruby_tree_sitter's reasoning:
|
|
880
|
+
- **Thread safety**: Prevents accidental catching in thread cleanup code
|
|
881
|
+
- **Signal handling**: Ensures parsing errors don't interfere with SIGTERM/SIGINT
|
|
882
|
+
- **Intentional handling**: Forces developers to explicitly handle parsing errors
|
|
883
|
+
|
|
884
|
+
See `lib/tree_haver/compat.rb` for compatibility layer documentation.
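The rescue behaviour described in this section can be reproduced without TreeHaver installed. The sketch below uses `DemoParseError`, a hypothetical stand-in class that inherits from `Exception` the way the README says `TreeHaver::Error` does; it is not a TreeHaver class.

```ruby
# Illustrative only: DemoParseError is a hypothetical stand-in for an error
# class that inherits directly from Exception.
class DemoParseError < Exception; end

# A bare `rescue => e` is shorthand for `rescue StandardError => e`,
# so it does not match DemoParseError and the error propagates.
def bare_rescue
  raise DemoParseError, "boom"
rescue => e
  "caught by bare rescue: #{e.message}"
end

# Naming the class explicitly does match.
def explicit_rescue
  raise DemoParseError, "boom"
rescue DemoParseError => e
  "caught explicitly: #{e.message}"
end

begin
  bare_rescue
rescue DemoParseError
  puts "bare rescue let the error propagate"
end

puts explicit_rescue # prints "caught explicitly: boom"
```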
|
|
825
885
|
|
|
826
886
|
## 🔧 Basic Usage
|
|
827
887
|
|
|
828
888
|
### Quick Start
|
|
829
889
|
|
|
830
|
-
Here
|
|
890
|
+
TreeHaver works with any language through its 10 backends. Here are examples for different parsing needs:
|
|
891
|
+
|
|
892
|
+
#### Parsing with Tree-sitter (Universal Languages)
|
|
831
893
|
|
|
832
894
|
```ruby
|
|
833
895
|
require "tree_haver"
|
|
834
896
|
|
|
835
|
-
# Load a
|
|
897
|
+
# Load a tree-sitter grammar (works with MRI, Rust, FFI, or Java backend)
|
|
836
898
|
language = TreeHaver::Language.from_library(
|
|
837
899
|
"/usr/local/lib/libtree-sitter-toml.so",
|
|
838
900
|
symbol: "tree_sitter_toml",
|
|
@@ -842,7 +904,7 @@ language = TreeHaver::Language.from_library(
|
|
|
842
904
|
parser = TreeHaver::Parser.new
|
|
843
905
|
parser.language = language
|
|
844
906
|
|
|
845
|
-
# Parse
|
|
907
|
+
# Parse source code
|
|
846
908
|
source = <<~TOML
|
|
847
909
|
[package]
|
|
848
910
|
name = "my-app"
|
|
@@ -851,16 +913,116 @@ TOML
|
|
|
851
913
|
|
|
852
914
|
tree = parser.parse(source)
|
|
853
915
|
|
|
854
|
-
# Access the
|
|
916
|
+
# Access the unified Position API (works across all backends)
|
|
855
917
|
root = tree.root_node
|
|
856
|
-
puts "Root
|
|
918
|
+
puts "Root type: #{root.type}" # => "document"
|
|
919
|
+
puts "Start line: #{root.start_line}" # => 1 (1-based)
|
|
920
|
+
puts "End line: #{root.end_line}" # => 3
|
|
921
|
+
puts "Position: #{root.source_position}" # => {start_line: 1, end_line: 3, ...}
|
|
857
922
|
|
|
858
923
|
# Traverse the tree
|
|
859
924
|
root.each do |child|
|
|
860
|
-
puts "Child
|
|
861
|
-
|
|
862
|
-
|
|
925
|
+
puts "Child: #{child.type} at line #{child.start_line}"
|
|
926
|
+
end
|
|
927
|
+
```
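The 1-based line numbers in the position API are derived from 0-based points, as the Citrus backend change in this release shows (`start_point[:row] + 1`). A minimal sketch of that conversion follows; `DemoNode` is a hypothetical stand-in and not a TreeHaver class.

```ruby
# Illustrative only: DemoNode is a hypothetical stand-in, not a TreeHaver class.
# It stores 0-based points and derives 1-based line numbers from them.
DemoNode = Struct.new(:start_point, :end_point) do
  # 1-based line where the node starts.
  def start_line
    start_point[:row] + 1
  end

  # 1-based line where the node ends.
  def end_line
    end_point[:row] + 1
  end

  # Position hash with 1-based lines and 0-based columns.
  def source_position
    {
      start_line: start_line,
      end_line: end_line,
      start_column: start_point[:column],
      end_column: end_point[:column],
    }
  end
end

node = DemoNode.new({row: 0, column: 0}, {row: 2, column: 4})
puts node.start_line # prints 1
puts node.end_line   # prints 3
```

Columns are left 0-based, so only the row values are shifted.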
|
|
928
|
+
|
|
929
|
+
#### Parsing Ruby with Prism
|
|
930
|
+
|
|
931
|
+
```ruby
|
|
932
|
+
require "tree_haver"
|
|
933
|
+
|
|
934
|
+
TreeHaver.backend = :prism
|
|
935
|
+
parser = TreeHaver::Parser.new
|
|
936
|
+
parser.language = TreeHaver::Backends::Prism::Language.ruby
|
|
937
|
+
|
|
938
|
+
source = <<~RUBY
|
|
939
|
+
class Example
|
|
940
|
+
def hello
|
|
941
|
+
puts "Hello, world!"
|
|
942
|
+
end
|
|
863
943
|
end
|
|
944
|
+
RUBY
|
|
945
|
+
|
|
946
|
+
tree = parser.parse(source)
|
|
947
|
+
root = tree.root_node
|
|
948
|
+
|
|
949
|
+
# Find all method definitions
|
|
950
|
+
def find_methods(node, results = [])
|
|
951
|
+
results << node if node.type == "def_node"
|
|
952
|
+
node.children.each { |child| find_methods(child, results) }
|
|
953
|
+
results
|
|
954
|
+
end
|
|
955
|
+
|
|
956
|
+
methods = find_methods(root)
|
|
957
|
+
methods.each do |method_node|
|
|
958
|
+
pos = method_node.source_position
|
|
959
|
+
puts "Method at lines #{pos[:start_line]}-#{pos[:end_line]}"
|
|
960
|
+
end
|
|
961
|
+
```
|
|
962
|
+
|
|
963
|
+
#### Parsing YAML with Psych
|
|
964
|
+
|
|
965
|
+
```ruby
|
|
966
|
+
require "tree_haver"
|
|
967
|
+
|
|
968
|
+
TreeHaver.backend = :psych
|
|
969
|
+
parser = TreeHaver::Parser.new
|
|
970
|
+
parser.language = TreeHaver::Backends::Psych::Language.yaml
|
|
971
|
+
|
|
972
|
+
source = <<~YAML
|
|
973
|
+
database:
|
|
974
|
+
host: localhost
|
|
975
|
+
port: 5432
|
|
976
|
+
YAML
|
|
977
|
+
|
|
978
|
+
tree = parser.parse(source)
|
|
979
|
+
root = tree.root_node
|
|
980
|
+
|
|
981
|
+
# Navigate YAML structure
|
|
982
|
+
def show_structure(node, indent = 0)
|
|
983
|
+
prefix = " " * indent
|
|
984
|
+
puts "#{prefix}#{node.type} (line #{node.start_line})"
|
|
985
|
+
node.children.each { |child| show_structure(child, indent + 1) }
|
|
986
|
+
end
|
|
987
|
+
|
|
988
|
+
show_structure(root)
|
|
989
|
+
```
|
|
990
|
+
|
|
991
|
+
#### Parsing Markdown with Commonmarker or Markly
|
|
992
|
+
|
|
993
|
+
```ruby
|
|
994
|
+
require "tree_haver"
|
|
995
|
+
|
|
996
|
+
# Choose your backend
|
|
997
|
+
TreeHaver.backend = :commonmarker # or :markly for GFM
|
|
998
|
+
|
|
999
|
+
parser = TreeHaver::Parser.new
|
|
1000
|
+
parser.language = TreeHaver::Backends::Commonmarker::Language.markdown
|
|
1001
|
+
|
|
1002
|
+
source = <<~MARKDOWN
|
|
1003
|
+
# My Document
|
|
1004
|
+
|
|
1005
|
+
## Section
|
|
1006
|
+
|
|
1007
|
+
- Item 1
|
|
1008
|
+
- Item 2
|
|
1009
|
+
MARKDOWN
|
|
1010
|
+
|
|
1011
|
+
tree = parser.parse(source)
|
|
1012
|
+
root = tree.root_node
|
|
1013
|
+
|
|
1014
|
+
# Find all headings
|
|
1015
|
+
def find_headings(node, results = [])
|
|
1016
|
+
results << node if node.type == "heading"
|
|
1017
|
+
node.children.each { |child| find_headings(child, results) }
|
|
1018
|
+
results
|
|
1019
|
+
end
|
|
1020
|
+
|
|
1021
|
+
headings = find_headings(root)
|
|
1022
|
+
headings.each do |heading|
|
|
1023
|
+
level = heading.header_level
|
|
1024
|
+
text = heading.children.map(&:text).join
|
|
1025
|
+
puts "H#{level}: #{text} (line #{heading.start_line})"
|
|
864
1026
|
end
|
|
865
1027
|
```
|
|
866
1028
|
|
|
@@ -1159,7 +1321,7 @@ RSpec.describe("MyParser") do
|
|
|
1159
1321
|
TreeHaver.with_backend(backend_name) do
|
|
1160
1322
|
parser = TreeHaver::Parser.new
|
|
1161
1323
|
result = parser.parse("x = 42")
|
|
1162
|
-
expect(result.root_node.type).to
|
|
1324
|
+
expect(result.root_node.type).to(eq("document"))
|
|
1163
1325
|
end
|
|
1164
1326
|
# Backend automatically restored after block
|
|
1165
1327
|
end
|
|
@@ -1623,7 +1785,7 @@ Thanks for RTFM. ☺️
|
|
|
1623
1785
|
[📌gitmoji]: https://gitmoji.dev
|
|
1624
1786
|
[📌gitmoji-img]: https://img.shields.io/badge/gitmoji_commits-%20%F0%9F%98%9C%20%F0%9F%98%8D-34495e.svg?style=flat-square
|
|
1625
1787
|
[🧮kloc]: https://www.youtube.com/watch?v=dQw4w9WgXcQ
|
|
1626
|
-
[🧮kloc-img]: https://img.shields.io/badge/KLOC-1.
|
|
1788
|
+
[🧮kloc-img]: https://img.shields.io/badge/KLOC-1.141-FFDD67.svg?style=for-the-badge&logo=YouTube&logoColor=blue
|
|
1627
1789
|
[🔐security]: SECURITY.md
|
|
1628
1790
|
[🔐security-img]: https://img.shields.io/badge/security-policy-259D6C.svg?style=flat
|
|
1629
1791
|
[📄copyright-notice-explainer]: https://opensource.stackexchange.com/questions/5778/why-do-licenses-such-as-the-mit-license-specify-a-single-year
|
data/lib/tree_haver/backends/citrus.rb
CHANGED
|
@@ -372,6 +372,42 @@ module TreeHaver
|
|
|
372
372
|
calculate_point(@match.offset + @match.length)
|
|
373
373
|
end
|
|
374
374
|
|
|
375
|
+
# Get the 1-based line number where this node starts
|
|
376
|
+
#
|
|
377
|
+
# @return [Integer] 1-based line number
|
|
378
|
+
def start_line
|
|
379
|
+
start_point[:row] + 1
|
|
380
|
+
end
|
|
381
|
+
|
|
382
|
+
# Get the 1-based line number where this node ends
|
|
383
|
+
#
|
|
384
|
+
# @return [Integer] 1-based line number
|
|
385
|
+
def end_line
|
|
386
|
+
end_point[:row] + 1
|
|
387
|
+
end
|
|
388
|
+
|
|
389
|
+
# Get position information as a hash
|
|
390
|
+
#
|
|
391
|
+
# Returns a hash with 1-based line numbers and 0-based columns.
|
|
392
|
+
# Compatible with *-merge gems' FileAnalysisBase.
|
|
393
|
+
#
|
|
394
|
+
# @return [Hash{Symbol => Integer}] Position hash
|
|
395
|
+
def source_position
|
|
396
|
+
{
|
|
397
|
+
start_line: start_line,
|
|
398
|
+
end_line: end_line,
|
|
399
|
+
start_column: start_point[:column],
|
|
400
|
+
end_column: end_point[:column],
|
|
401
|
+
}
|
|
402
|
+
end
|
|
403
|
+
|
|
404
|
+
# Get the first child node
|
|
405
|
+
#
|
|
406
|
+
# @return [Node, nil] First child or nil
|
|
407
|
+
def first_child
|
|
408
|
+
child(0)
|
|
409
|
+
end
|
|
410
|
+
|
|
375
411
|
def text
|
|
376
412
|
@match.string
|
|
377
413
|
end
|