parsanol 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.


This version of parsanol might be problematic.

Files changed (336)
  1. checksums.yaml +7 -0
  2. data/HISTORY.txt +25 -0
  3. data/LICENSE +23 -0
  4. data/README.adoc +643 -0
  5. data/Rakefile +189 -0
  6. data/example/balanced-parens/basic.rb +42 -0
  7. data/example/balanced-parens/basic.rb.md +86 -0
  8. data/example/balanced-parens/parens.rb +42 -0
  9. data/example/balanced-parens/ruby_transform.rb +162 -0
  10. data/example/big.erb +73 -0
  11. data/example/boolean-algebra/basic.rb +70 -0
  12. data/example/boolean-algebra/basic.rb.md +108 -0
  13. data/example/boolean-algebra/ruby_transform.rb +263 -0
  14. data/example/calculator/basic.rb +153 -0
  15. data/example/calculator/basic.rb.md +120 -0
  16. data/example/calculator/pattern.rb +153 -0
  17. data/example/calculator/ruby_transform.rb +156 -0
  18. data/example/calculator/ruby_transform.rb.md +32 -0
  19. data/example/calculator/serialized.rb +257 -0
  20. data/example/calculator/serialized.rb.md +32 -0
  21. data/example/calculator/transform.rb +153 -0
  22. data/example/calculator/zero_copy.rb +269 -0
  23. data/example/calculator/zero_copy.rb.md +36 -0
  24. data/example/capture/basic.rb +49 -0
  25. data/example/capture/basic.rb.md +106 -0
  26. data/example/capture/example.json +39 -0
  27. data/example/comments/basic.rb +35 -0
  28. data/example/comments/basic.rb.md +110 -0
  29. data/example/csv/ruby_transform.rb +148 -0
  30. data/example/csv/ruby_transform.rb.md +131 -0
  31. data/example/csv/serialized.rb +201 -0
  32. data/example/csv/serialized.rb.md +31 -0
  33. data/example/csv/zero_copy.rb +276 -0
  34. data/example/csv/zero_copy.rb.md +36 -0
  35. data/example/custom_atoms/indent_atom.rb +79 -0
  36. data/example/deepest-errors/basic.rb +131 -0
  37. data/example/deepest-errors/basic.rb.md +152 -0
  38. data/example/documentation/basic.rb +18 -0
  39. data/example/documentation/basic.rb.md +97 -0
  40. data/example/email/basic.rb +55 -0
  41. data/example/email/basic.rb.md +102 -0
  42. data/example/email/ruby_transform.rb +106 -0
  43. data/example/empty/basic.rb +13 -0
  44. data/example/empty/basic.rb.md +73 -0
  45. data/example/empty/example.json +38 -0
  46. data/example/erb/basic.rb +47 -0
  47. data/example/erb/basic.rb.md +103 -0
  48. data/example/erb/optimized.rb +42 -0
  49. data/example/error-reporting/basic.rb +132 -0
  50. data/example/error-reporting/basic.rb.md +122 -0
  51. data/example/expression-evaluator/basic.rb +284 -0
  52. data/example/expression-evaluator/basic.rb.md +138 -0
  53. data/example/ini/basic.rb +154 -0
  54. data/example/ini/basic.rb.md +129 -0
  55. data/example/ini/ruby_transform.rb +154 -0
  56. data/example/ip-address/basic.rb +125 -0
  57. data/example/ip-address/basic.rb.md +139 -0
  58. data/example/iso-6709/basic.rb +231 -0
  59. data/example/iso-6709/basic.rb.md +143 -0
  60. data/example/iso-8601/basic.rb +275 -0
  61. data/example/iso-8601/basic.rb.md +149 -0
  62. data/example/json/basic.rb +128 -0
  63. data/example/json/basic.rb.md +121 -0
  64. data/example/json/pattern.rb +128 -0
  65. data/example/json/ruby_transform.rb +200 -0
  66. data/example/json/ruby_transform.rb.md +32 -0
  67. data/example/json/serialized.rb +233 -0
  68. data/example/json/serialized.rb.md +31 -0
  69. data/example/json/transform.rb +128 -0
  70. data/example/json/zero_copy.rb +316 -0
  71. data/example/json/zero_copy.rb.md +36 -0
  72. data/example/local/basic.rb +34 -0
  73. data/example/local/basic.rb.md +91 -0
  74. data/example/local/example.json +38 -0
  75. data/example/markdown/basic.rb +287 -0
  76. data/example/markdown/basic.rb.md +160 -0
  77. data/example/markup/basic.rb +173 -0
  78. data/example/markup/basic.rb.md +118 -0
  79. data/example/mathn/basic.rb +47 -0
  80. data/example/mathn/basic.rb.md +96 -0
  81. data/example/mathn/example.json +39 -0
  82. data/example/minilisp/basic.rb +94 -0
  83. data/example/minilisp/basic.rb.md +133 -0
  84. data/example/modularity/basic.rb +47 -0
  85. data/example/modularity/basic.rb.md +152 -0
  86. data/example/nested-errors/basic.rb +132 -0
  87. data/example/nested-errors/basic.rb.md +157 -0
  88. data/example/output/boolean_algebra.out +4 -0
  89. data/example/output/calc.out +1 -0
  90. data/example/output/capture.out +3 -0
  91. data/example/output/comments.out +8 -0
  92. data/example/output/deepest_errors.out +54 -0
  93. data/example/output/documentation.err +4 -0
  94. data/example/output/documentation.out +1 -0
  95. data/example/output/email_parser.out +2 -0
  96. data/example/output/empty.err +1 -0
  97. data/example/output/erb.out +7 -0
  98. data/example/output/ignore.out +1 -0
  99. data/example/output/ignore_whitespace.out +1 -0
  100. data/example/output/ip_address.out +9 -0
  101. data/example/output/json.out +5 -0
  102. data/example/output/local.out +3 -0
  103. data/example/output/mathn.out +4 -0
  104. data/example/output/minilisp.out +5 -0
  105. data/example/output/modularity.out +0 -0
  106. data/example/output/nested_errors.out +54 -0
  107. data/example/output/optimized_erb.out +1 -0
  108. data/example/output/parens.out +8 -0
  109. data/example/output/prec_calc.out +5 -0
  110. data/example/output/readme.out +1 -0
  111. data/example/output/scopes.out +1 -0
  112. data/example/output/seasons.out +28 -0
  113. data/example/output/sentence.out +1 -0
  114. data/example/output/simple_xml.out +2 -0
  115. data/example/output/string_parser.out +3 -0
  116. data/example/prec-calc/basic.rb +71 -0
  117. data/example/prec-calc/basic.rb.md +114 -0
  118. data/example/readme/basic.rb +30 -0
  119. data/example/readme/basic.rb.md +80 -0
  120. data/example/scopes/basic.rb +15 -0
  121. data/example/scopes/basic.rb.md +73 -0
  122. data/example/scopes/example.json +38 -0
  123. data/example/seasons/basic.rb +46 -0
  124. data/example/seasons/basic.rb.md +117 -0
  125. data/example/seasons/example.json +40 -0
  126. data/example/sentence/basic.rb +36 -0
  127. data/example/sentence/basic.rb.md +81 -0
  128. data/example/sexp/ruby_transform.rb +180 -0
  129. data/example/sexp/ruby_transform.rb.md +143 -0
  130. data/example/simple-xml/basic.rb +54 -0
  131. data/example/simple-xml/basic.rb.md +125 -0
  132. data/example/simple.lit +3 -0
  133. data/example/string-literal/basic.rb +77 -0
  134. data/example/string-literal/basic.rb.md +128 -0
  135. data/example/test.lit +4 -0
  136. data/example/toml/basic.rb +226 -0
  137. data/example/toml/basic.rb.md +173 -0
  138. data/example/url/basic.rb +219 -0
  139. data/example/url/basic.rb.md +142 -0
  140. data/example/url/ruby_transform.rb +219 -0
  141. data/example/yaml/basic.rb +216 -0
  142. data/example/yaml/basic.rb.md +148 -0
  143. data/ext/parsanol_native/extconf.rb +4 -0
  144. data/lib/parsanol/accelerator/application.rb +62 -0
  145. data/lib/parsanol/accelerator/engine.rb +112 -0
  146. data/lib/parsanol/accelerator.rb +162 -0
  147. data/lib/parsanol/ast_visitor.rb +122 -0
  148. data/lib/parsanol/atoms/alternative.rb +97 -0
  149. data/lib/parsanol/atoms/base.rb +214 -0
  150. data/lib/parsanol/atoms/can_flatten.rb +192 -0
  151. data/lib/parsanol/atoms/capture.rb +41 -0
  152. data/lib/parsanol/atoms/context.rb +351 -0
  153. data/lib/parsanol/atoms/context_optimized.rb +42 -0
  154. data/lib/parsanol/atoms/custom.rb +110 -0
  155. data/lib/parsanol/atoms/cut.rb +62 -0
  156. data/lib/parsanol/atoms/dsl.rb +130 -0
  157. data/lib/parsanol/atoms/dynamic.rb +33 -0
  158. data/lib/parsanol/atoms/entity.rb +55 -0
  159. data/lib/parsanol/atoms/ignored.rb +28 -0
  160. data/lib/parsanol/atoms/infix.rb +121 -0
  161. data/lib/parsanol/atoms/lookahead.rb +64 -0
  162. data/lib/parsanol/atoms/named.rb +50 -0
  163. data/lib/parsanol/atoms/re.rb +61 -0
  164. data/lib/parsanol/atoms/repetition.rb +241 -0
  165. data/lib/parsanol/atoms/scope.rb +28 -0
  166. data/lib/parsanol/atoms/sequence.rb +157 -0
  167. data/lib/parsanol/atoms/str.rb +90 -0
  168. data/lib/parsanol/atoms/visitor.rb +91 -0
  169. data/lib/parsanol/atoms.rb +36 -0
  170. data/lib/parsanol/buffer.rb +130 -0
  171. data/lib/parsanol/builder_callbacks.rb +353 -0
  172. data/lib/parsanol/cause.rb +101 -0
  173. data/lib/parsanol/context.rb +23 -0
  174. data/lib/parsanol/convenience.rb +35 -0
  175. data/lib/parsanol/edit_tracker.rb +107 -0
  176. data/lib/parsanol/error_reporter/contextual.rb +122 -0
  177. data/lib/parsanol/error_reporter/deepest.rb +106 -0
  178. data/lib/parsanol/error_reporter/tree.rb +68 -0
  179. data/lib/parsanol/error_reporter.rb +98 -0
  180. data/lib/parsanol/export.rb +163 -0
  181. data/lib/parsanol/expression/treetop.rb +94 -0
  182. data/lib/parsanol/expression.rb +51 -0
  183. data/lib/parsanol/fast_mode.rb +145 -0
  184. data/lib/parsanol/first_set.rb +75 -0
  185. data/lib/parsanol/grammar_builder.rb +177 -0
  186. data/lib/parsanol/graphviz.rb +97 -0
  187. data/lib/parsanol/incremental_parser.rb +179 -0
  188. data/lib/parsanol/interval_tree.rb +215 -0
  189. data/lib/parsanol/lazy_result.rb +178 -0
  190. data/lib/parsanol/lexer.rb +146 -0
  191. data/lib/parsanol/native/parser.rb +630 -0
  192. data/lib/parsanol/native/serializer.rb +245 -0
  193. data/lib/parsanol/native/transformer.rb +438 -0
  194. data/lib/parsanol/native/types.rb +41 -0
  195. data/lib/parsanol/native.rb +217 -0
  196. data/lib/parsanol/optimizer.rb +86 -0
  197. data/lib/parsanol/optimizers/choice_optimizer.rb +78 -0
  198. data/lib/parsanol/optimizers/cut_inserter.rb +175 -0
  199. data/lib/parsanol/optimizers/lookahead_optimizer.rb +58 -0
  200. data/lib/parsanol/optimizers/quantifier_optimizer.rb +62 -0
  201. data/lib/parsanol/optimizers/sequence_optimizer.rb +97 -0
  202. data/lib/parsanol/options/ruby_transform.rb +109 -0
  203. data/lib/parsanol/options/serialized.rb +94 -0
  204. data/lib/parsanol/options/zero_copy.rb +130 -0
  205. data/lib/parsanol/options.rb +20 -0
  206. data/lib/parsanol/parallel.rb +133 -0
  207. data/lib/parsanol/parsanol_native.bundle +0 -0
  208. data/lib/parsanol/parser.rb +151 -0
  209. data/lib/parsanol/parslet.rb +148 -0
  210. data/lib/parsanol/parslet_native.bundle +0 -0
  211. data/lib/parsanol/pattern/binding.rb +49 -0
  212. data/lib/parsanol/pattern.rb +115 -0
  213. data/lib/parsanol/pool.rb +220 -0
  214. data/lib/parsanol/pools/array_pool.rb +75 -0
  215. data/lib/parsanol/pools/buffer_pool.rb +173 -0
  216. data/lib/parsanol/pools/position_pool.rb +92 -0
  217. data/lib/parsanol/pools/slice_pool.rb +64 -0
  218. data/lib/parsanol/position.rb +89 -0
  219. data/lib/parsanol/result.rb +44 -0
  220. data/lib/parsanol/result_builder.rb +208 -0
  221. data/lib/parsanol/result_stream.rb +262 -0
  222. data/lib/parsanol/rig/rspec.rb +52 -0
  223. data/lib/parsanol/rope.rb +78 -0
  224. data/lib/parsanol/scope.rb +42 -0
  225. data/lib/parsanol/slice.rb +172 -0
  226. data/lib/parsanol/source/line_cache.rb +99 -0
  227. data/lib/parsanol/source.rb +171 -0
  228. data/lib/parsanol/source_location.rb +164 -0
  229. data/lib/parsanol/streaming_parser.rb +124 -0
  230. data/lib/parsanol/string_view.rb +192 -0
  231. data/lib/parsanol/transform.rb +267 -0
  232. data/lib/parsanol/version.rb +5 -0
  233. data/lib/parsanol/wasm/README.md +80 -0
  234. data/lib/parsanol/wasm/package.json +51 -0
  235. data/lib/parsanol/wasm/parsanol.js +252 -0
  236. data/lib/parsanol/wasm/parslet.d.ts +129 -0
  237. data/lib/parsanol/wasm_parser.rb +239 -0
  238. data/lib/parsanol.rb +408 -0
  239. data/parsanol-ruby.gemspec +56 -0
  240. data/spec/acceptance/examples_spec.rb +96 -0
  241. data/spec/acceptance/infix_parser_spec.rb +145 -0
  242. data/spec/acceptance/mixing_parsers_spec.rb +74 -0
  243. data/spec/acceptance/regression_spec.rb +329 -0
  244. data/spec/acceptance/repetition_and_maybe_spec.rb +44 -0
  245. data/spec/acceptance/unconsumed_input_spec.rb +21 -0
  246. data/spec/benchmark/comparative/runner_spec.rb +105 -0
  247. data/spec/integration/array_pooling_spec.rb +193 -0
  248. data/spec/integration/buffer_allocation_spec.rb +324 -0
  249. data/spec/integration/position_pooling_spec.rb +184 -0
  250. data/spec/integration/result_builder_spec.rb +282 -0
  251. data/spec/integration/rope_stringview_integration_spec.rb +188 -0
  252. data/spec/integration/slice_pooling_spec.rb +63 -0
  253. data/spec/integration/string_view_integration_spec.rb +125 -0
  254. data/spec/lexer_spec.rb +231 -0
  255. data/spec/parsanol/atom_results_spec.rb +39 -0
  256. data/spec/parsanol/atoms/alternative_spec.rb +26 -0
  257. data/spec/parsanol/atoms/base_spec.rb +127 -0
  258. data/spec/parsanol/atoms/capture_spec.rb +21 -0
  259. data/spec/parsanol/atoms/combinations_spec.rb +5 -0
  260. data/spec/parsanol/atoms/custom_spec.rb +79 -0
  261. data/spec/parsanol/atoms/dsl_spec.rb +7 -0
  262. data/spec/parsanol/atoms/entity_spec.rb +77 -0
  263. data/spec/parsanol/atoms/ignored_spec.rb +15 -0
  264. data/spec/parsanol/atoms/infix_spec.rb +5 -0
  265. data/spec/parsanol/atoms/lookahead_spec.rb +22 -0
  266. data/spec/parsanol/atoms/named_spec.rb +4 -0
  267. data/spec/parsanol/atoms/re_spec.rb +14 -0
  268. data/spec/parsanol/atoms/repetition_spec.rb +24 -0
  269. data/spec/parsanol/atoms/scope_spec.rb +26 -0
  270. data/spec/parsanol/atoms/sequence_spec.rb +28 -0
  271. data/spec/parsanol/atoms/str_spec.rb +15 -0
  272. data/spec/parsanol/atoms/visitor_spec.rb +101 -0
  273. data/spec/parsanol/atoms_spec.rb +488 -0
  274. data/spec/parsanol/auto_optimize_spec.rb +334 -0
  275. data/spec/parsanol/buffer_spec.rb +219 -0
  276. data/spec/parsanol/builder_callbacks_spec.rb +377 -0
  277. data/spec/parsanol/choice_optimizer_spec.rb +231 -0
  278. data/spec/parsanol/convenience_spec.rb +54 -0
  279. data/spec/parsanol/cut_inserter_spec.rb +248 -0
  280. data/spec/parsanol/cut_spec.rb +66 -0
  281. data/spec/parsanol/edit_tracker_spec.rb +218 -0
  282. data/spec/parsanol/error_reporter/contextual_spec.rb +122 -0
  283. data/spec/parsanol/error_reporter/deepest_spec.rb +82 -0
  284. data/spec/parsanol/error_reporter/tree_spec.rb +7 -0
  285. data/spec/parsanol/export_spec.rb +67 -0
  286. data/spec/parsanol/expression/treetop_spec.rb +75 -0
  287. data/spec/parsanol/first_set_spec.rb +298 -0
  288. data/spec/parsanol/interval_tree_spec.rb +205 -0
  289. data/spec/parsanol/lazy_result_spec.rb +288 -0
  290. data/spec/parsanol/lookahead_optimizer_spec.rb +252 -0
  291. data/spec/parsanol/minilisp.citrus +29 -0
  292. data/spec/parsanol/minilisp.tt +29 -0
  293. data/spec/parsanol/optimizer_spec.rb +459 -0
  294. data/spec/parsanol/options/parslet_compat_spec.rb +166 -0
  295. data/spec/parsanol/options/ruby_transform_spec.rb +70 -0
  296. data/spec/parsanol/options/serialized_spec.rb +69 -0
  297. data/spec/parsanol/options/zero_copy_spec.rb +230 -0
  298. data/spec/parsanol/parser_spec.rb +36 -0
  299. data/spec/parsanol/parslet_spec.rb +38 -0
  300. data/spec/parsanol/pattern_spec.rb +272 -0
  301. data/spec/parsanol/pool_spec.rb +392 -0
  302. data/spec/parsanol/pools/array_pool_spec.rb +356 -0
  303. data/spec/parsanol/pools/buffer_pool_spec.rb +365 -0
  304. data/spec/parsanol/pools/position_pool_spec.rb +118 -0
  305. data/spec/parsanol/pools/slice_pool_spec.rb +262 -0
  306. data/spec/parsanol/position_spec.rb +14 -0
  307. data/spec/parsanol/result_builder_spec.rb +391 -0
  308. data/spec/parsanol/rig/rspec_spec.rb +54 -0
  309. data/spec/parsanol/rope_spec.rb +207 -0
  310. data/spec/parsanol/scope_spec.rb +45 -0
  311. data/spec/parsanol/slice_spec.rb +249 -0
  312. data/spec/parsanol/source/line_cache_spec.rb +74 -0
  313. data/spec/parsanol/source_spec.rb +207 -0
  314. data/spec/parsanol/string_view_spec.rb +345 -0
  315. data/spec/parsanol/transform/context_spec.rb +56 -0
  316. data/spec/parsanol/transform_spec.rb +183 -0
  317. data/spec/parsanol/tree_memoization_spec.rb +149 -0
  318. data/spec/parslet_compatibility/expressir_edge_cases_spec.rb +153 -0
  319. data/spec/parslet_compatibility/minimal_reproduction.rb +199 -0
  320. data/spec/parslet_compatibility_spec.rb +399 -0
  321. data/spec/parslet_imported/atom_spec.rb +93 -0
  322. data/spec/parslet_imported/combinator_spec.rb +161 -0
  323. data/spec/parslet_imported/spec_helper.rb +73 -0
  324. data/spec/performance/batch_parsing_benchmark.rb +129 -0
  325. data/spec/performance/complete_optimization_summary.rb +143 -0
  326. data/spec/performance/grammar_caching_analysis.rb +121 -0
  327. data/spec/performance/grammar_caching_benchmark.rb +80 -0
  328. data/spec/performance/native_benchmark_spec.rb +230 -0
  329. data/spec/performance/phase5_benchmark.rb +144 -0
  330. data/spec/performance/profiling_benchmark.rb +131 -0
  331. data/spec/performance/ruby_improvements_benchmark.rb +171 -0
  332. data/spec/performance_spec.rb +374 -0
  333. data/spec/spec_helper.rb +79 -0
  334. data/spec/support/opal.rb +8 -0
  335. data/spec/support/opal.rb.erb +14 -0
  336. metadata +485 -0
@@ -0,0 +1,276 @@
1
+ # CSV Parser Example - ZeroCopy: Mirrored Objects (Direct FFI)
2
+ #
3
+ # This example demonstrates ZeroCopy for parsing CSV:
4
+ # 1. Rust parser (parsanol-rs) does the parsing
5
+ # 2. Rust constructs typed CSV value objects
6
+ # 3. Direct Ruby object construction via FFI (no serialization!)
7
+ # 4. Maximum performance with zero-copy
8
+
9
+ $:.unshift File.dirname(__FILE__) + "/../lib"
10
+
11
+ require 'parsanol'
12
+
13
+ # NOTE: This example requires:
14
+ # 1. ZeroCopy extension support for parse_to_objects
15
+ # 2. #[derive(RubyObject)] proc macro in Rust
16
+ # 3. Matching Ruby class definitions
17
+ #
18
+ # This serves as an API preview.
19
+
20
+ # Step 1: Define Ruby classes that mirror Rust struct definitions
21
+ module Csv
22
+ class Value
23
+ end
24
+
25
+ # Represents a single CSV field
26
+ class Field < Value
27
+ attr_reader :raw, :value
28
+
29
+ def initialize(raw:, value:)
30
+ @raw = raw
31
+ @value = value
32
+ end
33
+
34
+ def to_s = @value
35
+ def ==(other)
36
+ other.is_a?(Field) && @value == other.value
37
+ end
38
+ end
39
+
40
+ # Represents a row of fields
41
+ class Row < Value
42
+ attr_reader :fields
43
+
44
+ def initialize(fields)
45
+ @fields = fields
46
+ end
47
+
48
+ def [](index)
49
+ @fields[index]
50
+ end
51
+
52
+ def size
53
+ @fields.size
54
+ end
55
+
56
+ def each(&block)
57
+ @fields.each(&block)
58
+ end
59
+
60
+ def to_a
61
+ @fields.map(&:value)
62
+ end
63
+
64
+ def to_s
65
+ @fields.map(&:value).join(',')
66
+ end
67
+ end
68
+
69
+ # Represents an entire CSV document
70
+ class Document < Value
71
+ attr_reader :rows
72
+
73
+ def initialize(rows)
74
+ @rows = rows
75
+ end
76
+
77
+ def size
78
+ @rows.size
79
+ end
80
+
81
+ def [](index)
82
+ @rows[index]
83
+ end
84
+
85
+ def each(&block)
86
+ @rows.each(&block)
87
+ end
88
+
89
+ def headers
90
+ @rows.first&.fields&.map(&:value) if @rows.any?
91
+ end
92
+
93
+ def data
94
+ @rows[1..] || []
95
+ end
96
+
97
+ def to_a
98
+ @rows.map(&:to_a)
99
+ end
100
+
101
+ def to_hashes
102
+ return [] unless headers && !data.empty?
103
+
104
+ keys = headers
105
+ data.map { |row| keys.zip(row.to_a).to_h }
106
+ end
107
+ end
108
+ end
109
+
110
+ # Step 2: Define the parser with output type mapping
111
+ class CsvParser < Parsanol::Parser
112
+ # Include ZeroCopy module (planned)
113
+ # include Parsanol::ZeroCopy
114
+
115
+ root :csv
116
+
117
+ rule(:csv) {
118
+ space? >> (row >> (newline >> row).repeat).maybe >> space?
119
+ }
120
+
121
+ rule(:row) {
122
+ (field.as(:f) >> (comma >> field.as(:f)).repeat).as(:row)
123
+ }
124
+
125
+ rule(:field) {
126
+ quoted_field | simple_field
127
+ }
128
+
129
+ rule(:quoted_field) {
130
+ str('"') >> (
131
+ str('""') | str('"').absent? >> any
132
+ ).repeat.as(:quoted) >> str('"')
133
+ }
134
+
135
+ rule(:simple_field) {
136
+ (comma.absent? >> newline.absent? >> any).repeat.as(:simple)
137
+ }
138
+
139
+ rule(:comma) { str(',') }
140
+ rule(:newline) { str("\n") | str("\r\n") | str("\r") }
141
+ rule(:space) { match('\s').repeat(1) }
142
+ rule(:space?) { space.maybe }
143
+
144
+ # Output type mapping (planned feature)
145
+ # output_types(
146
+ # field: Csv::Field,
147
+ # row: Csv::Row,
148
+ # csv: Csv::Document
149
+ # )
150
+ end
151
+
152
+ # Step 3: Parse with direct object construction
153
+ def parse_csv(input)
154
+ parser = CsvParser.new
155
+
156
+ # ZeroCopy: Parse and get direct Ruby objects
157
+ # NOTE: This requires native extension support
158
+ # doc = parser.parse(input)
159
+ # # doc is already a Csv::Document!
160
+ # # No transform needed, no JSON serialization!
161
+
162
+ # For demonstration, simulate what ZeroCopy would return
163
+ doc = simulate_parse(input)
164
+ puts "Parsed: #{doc.class} with #{doc.size} rows"
165
+
166
+ doc
167
+ end
168
+
169
+ # Simulated parsing for demonstration
170
+ def simulate_parse(input)
171
+ lines = input.strip.split("\n")
172
+ return Csv::Document.new([]) if lines.empty?
173
+
174
+ rows = lines.map do |line|
175
+ fields = line.split(',').map do |field|
176
+ raw = field
177
+ # Unescape if quoted
178
+ value = if field.start_with?('"') && field.end_with?('"')
179
+ field[1..-2].gsub('""', '"')
180
+ else
181
+ field.strip
182
+ end
183
+ Csv::Field.new(raw: raw, value: value)
184
+ end
185
+ Csv::Row.new(fields)
186
+ end
187
+
188
+ Csv::Document.new(rows)
189
+ end
190
+
191
+ # Example usage
192
+ if __FILE__ == $0
193
+ puts "=" * 60
194
+ puts "CSV Parser Example - ZeroCopy: Mirrored Objects"
195
+ puts "=" * 60
196
+ puts
197
+ puts "NOTE: This example shows the planned API for ZeroCopy."
198
+ puts "The native extension support for parse_to_objects is coming soon."
199
+ puts
200
+
201
+ simple_csv = <<~CSV
202
+ name,age,city
203
+ Alice,30,New York
204
+ Bob,25,San Francisco
205
+ CSV
206
+
207
+ puts "Simple CSV:"
208
+ puts "-" * 40
209
+ doc = parse_csv(simple_csv)
210
+
211
+ puts "As arrays:"
212
+ doc.to_a.each { |row| puts row.inspect }
213
+
214
+ puts
215
+ puts "Headers: #{doc.headers.inspect}"
216
+ puts "Data rows: #{doc.data.size}"
217
+
218
+ puts
219
+ puts "As hashes:"
220
+ doc.to_hashes.each { |row| puts row.inspect }
221
+
222
+ # Type-safe access
223
+ puts
224
+ puts "Type-safe access:"
225
+ puts "First row class: #{doc[0].class}"
226
+ puts "First field class: #{doc[0][0].class}"
227
+ puts "First field raw: #{doc[0][0].raw.inspect}"
228
+ puts "First field value: #{doc[0][0].value.inspect}"
229
+
230
+ # Custom method example
231
+ puts
232
+ puts "Custom method on Field:"
233
+ field = Csv::Field.new(raw: '"Hello, World"', value: 'Hello, World')
234
+ puts "Field: #{field.value}"
235
+
236
+ puts
237
+ puts "=" * 60
238
+ puts "ZeroCopy Benefits for CSV:"
239
+ puts "- FASTEST: No serialization overhead"
240
+ puts "- Type-safe: Each field is a Csv::Field object"
241
+ puts "- Custom methods: Can add validation, formatting, etc."
242
+ puts "- Zero-copy: Direct construction from Rust"
243
+ puts
244
+ puts "When to use ZeroCopy for CSV:"
245
+ puts "- High-throughput CSV processing"
246
+ puts "- When you need typed field access"
247
+ puts "- When you want custom methods on fields/rows"
248
+ puts "- When performance is critical"
249
+ puts "=" * 60
250
+ end
251
+
252
+ # Rust code that would be needed (for reference):
253
+ #
254
+ # // In parsanol-rs
255
+ # use parsanol_ruby_derive::RubyObject;
256
+ #
257
+ # #[derive(Debug, Clone, RubyObject)]
258
+ # #[ruby_class("Csv::Value")]
259
+ # pub enum CsvValue {
260
+ # #[ruby_variant("field")]
261
+ # Field {
262
+ # raw: String,
263
+ # value: String,
264
+ # },
265
+ #
266
+ # #[ruby_variant("row")]
267
+ # Row(Vec<CsvValue>),
268
+ #
269
+ # #[ruby_variant("document")]
270
+ # Document(Vec<CsvValue>),
271
+ # }
272
+ #
273
+ # // The #[derive(RubyObject)] proc macro generates:
274
+ # // - Ruby class definitions
275
+ # // - to_ruby() implementation
276
+ # // - Direct object construction via FFI
@@ -0,0 +1,36 @@
1
+ # CSV (Zero-Copy - Option C)
2
+
3
+ ## Purpose
4
+
5
+ This implementation demonstrates direct FFI object construction for CSV
6
+ parsing without serialization overhead.
7
+
8
+ ## When to Use
9
+
10
+ - Maximum performance required
11
+ - Production systems
12
+ - When zero-copy is critical
13
+
14
+ ## Key Concepts
15
+
16
+ 1. **Direct FFI**: No serialization overhead
17
+ 2. **Ruby Object Construction**: Direct via rb_funcall
18
+ 3. **Type Safety**: Mirrored types on both sides
19
+
20
+ ## Running
21
+
22
+ ```bash
23
+ ruby example/csv/zero_copy.rb
24
+ ```
25
+
26
+ ## Output
27
+
28
+ ```
29
+ Input: a,b,c
30
+ Result: Array
31
+ Value: [["a", "b", "c"]]
32
+ ```
33
+
34
+ ## Note
35
+
36
+ This is the fastest option but requires more complex FFI setup.
@@ -0,0 +1,79 @@
1
+ # frozen_string_literal: true
2
+
3
+ # IndentAtom - Custom atom for indentation-sensitive matching
4
+ #
5
+ # This example demonstrates how to create a custom atom that matches
6
+ # lines with a specific indentation level. This is useful for parsing
7
+ # Python-like languages where indentation matters.
8
+ #
9
+ # Usage:
10
+ # class PythonParser < Parsanol::Parser
11
+ # rule(:indented_block) { IndentAtom.new(4) >> statement }
12
+ # end
13
+
14
+ require 'parsanol'
15
+
16
+ class IndentAtom < Parsanol::Atoms::Custom
17
+ # Create a new indentation matcher
18
+ #
19
+ # @param expected_indent [Integer] Number of spaces expected at start of line
20
+ def initialize(expected_indent)
21
+ @expected_indent = expected_indent
22
+ super()
23
+ end
24
+
25
+ # Required: Implement the matching logic
26
+ def try_match(source, context, consume_all)
27
+ # Save position for potential backtrack
28
+ start_pos = source.bytepos
29
+
30
+ # Count leading spaces at current position
31
+ indent = 0
32
+ while source.chars_left > 0
33
+ # Peek at the next character without consuming it
34
+ break unless source.matches?(/ /)
35
+
36
+ # Consume one space
37
+ char = source.consume(1)
38
+ break unless char == ' '
39
+ indent += 1
40
+ end
41
+
42
+ if indent == @expected_indent
43
+ # Success - return the matched indentation as a slice
44
+ # Use source.slice to create a proper Slice object
45
+ [true, source.slice(start_pos, ' ' * indent)]
46
+ else
47
+ # Failure - restore position for backtracking
48
+ source.bytepos = start_pos
49
+ [false, nil]
50
+ end
51
+ end
52
+
53
+ # Override to_s_inner for better error messages
54
+ def to_s_inner(prec = nil)
55
+ "indent(#{@expected_indent})"
56
+ end
57
+ end
58
+
59
+ # Example usage
60
+ if __FILE__ == $0
61
+ class IndentedParser < Parsanol::Parser
62
+ rule(:line) { indent >> content }
63
+ rule(:indent) { IndentAtom.new(2) }
64
+ rule(:content) { match['a-z'].repeat(1) }
65
+ root(:line)
66
+ end
67
+
68
+ parser = IndentedParser.new
69
+
70
+ # This should parse - exactly 2 spaces of indentation
71
+ puts parser.parse("  hello").inspect
72
+
73
+ # This will fail - wrong indentation
74
+ begin
75
+ puts parser.parse(" hello").inspect
76
+ rescue Parsanol::ParseFailed => e
77
+ puts "Failed as expected: #{e.message}"
78
+ end
79
+ end
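The matching logic in `try_match` can be sketched independently of Parsanol with the standard library's `StringScanner`: remember the start position, consume leading spaces, and restore the position when the count is wrong. `match_indent` is a hypothetical helper introduced only for illustration; it is not part of the Parsanol API and makes no claim about how `Parsanol::Atoms::Custom` works internally.

```ruby
require 'strscan'

# Hypothetical helper (not a Parsanol API): match exactly `expected`
# leading spaces at the scanner's current position.
# Returns the matched indentation, or nil after restoring the position.
def match_indent(scanner, expected)
  start_pos = scanner.pos
  spaces = scanner.scan(/ */) || ''
  return spaces if spaces.length == expected

  scanner.pos = start_pos # backtrack so another alternative can be tried
  nil
end

scanner = StringScanner.new('  hello')
p match_indent(scanner, 2) # "  "
p scanner.rest             # "hello"

scanner = StringScanner.new('    hello')
p match_indent(scanner, 2) # nil
p scanner.pos              # 0
```

Restoring the position on failure is what lets a backtracking parser try a different alternative from the same starting point.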
@@ -0,0 +1,131 @@
1
+ $:.unshift File.dirname(__FILE__) + "/../lib"
2
+
3
+ # This example demonstrates how to do deepest error reporting, as invented
4
+ # by John Mettraux (issue #64).
5
+
6
+ require 'parsanol/parslet'
7
+ require 'parsanol/convenience'
8
+
9
+ def prettify(str)
10
+ puts " "*3 + " "*4 + "." + " "*4 + "10" + " "*3 + "." + " "*4 + "20"
11
+ str.lines.each_with_index do |line, index|
12
+ printf "%02d %s\n",
13
+ index+1,
14
+ line.chomp
15
+ end
16
+ end
17
+
18
+ class MyParser < Parsanol::Parser
19
+ # commons
20
+
21
+ rule(:space) { match('[ \t]').repeat(1) }
22
+ rule(:space?) { space.maybe }
23
+
24
+ rule(:newline) { match('[\r\n]') }
25
+
26
+ rule(:comment) { str('#') >> match('[^\r\n]').repeat }
27
+
28
+ rule(:line_separator) {
29
+ (space? >> ((comment.maybe >> newline) | str(';')) >> space?).repeat(1)
30
+ }
31
+
32
+ rule(:blank) { line_separator | space }
33
+ rule(:blank?) { blank.maybe }
34
+
35
+ rule(:identifier) { match('[a-zA-Z0-9_]').repeat(1) }
36
+
37
+ # res_statement
38
+
39
+ rule(:reference) {
40
+ (str('@').repeat(1,2) >> identifier).as(:reference)
41
+ }
42
+
43
+ rule(:res_action_or_link) {
44
+ str('.').as(:dot) >> (identifier >> str('?').maybe ).as(:name) >> str('()')
45
+ }
46
+
47
+ rule(:res_actions) {
48
+ (
49
+ reference
50
+ ).as(:resources) >>
51
+ (
52
+ res_action_or_link.as(:res_action)
53
+ ).repeat(0).as(:res_actions)
54
+ }
55
+
56
+ rule(:res_statement) {
57
+ res_actions >>
58
+ (str(':') >> identifier.as(:name)).maybe.as(:res_field)
59
+ }
60
+
61
+ # expression
62
+
63
+ rule(:expression) {
64
+ res_statement
65
+ }
66
+
67
+ # body
68
+
69
+ rule(:body) {
70
+ (line_separator >> (block | expression)).repeat(1).as(:body) >>
71
+ line_separator
72
+ }
73
+
74
+ # blocks
75
+
76
+ rule(:begin_block) {
77
+ (str('concurrent').as(:type) >> space).maybe.as(:pre) >>
78
+ str('begin').as(:begin) >>
79
+ body >>
80
+ str('end')
81
+ }
82
+
83
+ rule(:define_block) {
84
+ str('define').as(:define) >> space >>
85
+ identifier.as(:name) >> str('()') >>
86
+ body >>
87
+ str('end')
88
+ }
89
+
90
+ rule(:block) {
91
+ define_block | begin_block
92
+ }
93
+
94
+ # root
95
+
96
+ rule(:radix) {
97
+ line_separator.maybe >> block >> line_separator.maybe
98
+ }
99
+
100
+ root(:radix)
101
+ end
102
+
103
+ ds = [
104
+ %{
105
+ define f()
106
+ @res.name
107
+ end
108
+ },
109
+ %{
110
+ define f()
111
+ begin
112
+ @res.name
113
+ end
114
+ end
115
+ }
116
+ ]
117
+
118
+ ds.each do |d|
119
+
120
+ puts '-' * 80
121
+ prettify(d)
122
+
123
+ parser = MyParser.new
124
+
125
+ begin
126
+ parser.parse_with_debug(d,
127
+ :reporter => Parsanol::ErrorReporter::Deepest.new)
128
+ end
129
+ end
130
+
131
+ puts '-' * 80
@@ -0,0 +1,152 @@
1
+ # Deepest Error Reporting - Ruby Implementation
2
+
3
+ ## How to Run
4
+
5
+ ```bash
6
+ cd parsanol-ruby/example/deepest-errors
7
+ ruby basic.rb
8
+ ```
9
+
10
+ ## Code Walkthrough
11
+
12
+ ### Error Reporter Configuration
13
+
14
+ Parsanol provides different error reporters:
15
+
16
+ ```ruby
17
+ parser.parse_with_debug(input,
18
+ :reporter => Parsanol::ErrorReporter::Deepest.new)
19
+ ```
20
+
21
+ The `Deepest` reporter finds the point where parsing progressed furthest.
22
+
23
+ ### Helper for Display
24
+
25
+ Format input with line numbers:
26
+
27
+ ```ruby
28
+ def prettify(str)
29
+ puts " "*3 + " "*4 + "." + " "*4 + "10" + " "*3 + "." + " "*4 + "20"
30
+ str.lines.each_with_index do |line, index|
31
+ printf "%02d %s\n", index+1, line.chomp
32
+ end
33
+ end
34
+ ```
35
+
36
+ This helps users locate errors in their input.
37
+
38
+ ### Complex Grammar
39
+
40
+ The example uses a realistic grammar with multiple constructs:
41
+
42
+ ```ruby
43
+ rule(:define_block) {
44
+ str('define').as(:define) >> space >>
45
+ identifier.as(:name) >> str('()') >>
46
+ body >>
47
+ str('end')
48
+ }
49
+
50
+ rule(:begin_block) {
51
+ (str('concurrent').as(:type) >> space).maybe.as(:pre) >>
52
+ str('begin').as(:begin) >>
53
+ body >>
54
+ str('end')
55
+ }
56
+ ```
57
+
58
+ Multiple block types demonstrate error scenarios.
59
+
60
+ ### Reference Parsing
61
+
62
+ Resources use dot notation:
63
+
64
+ ```ruby
65
+ rule(:reference) {
66
+ (str('@').repeat(1,2) >> identifier).as(:reference)
67
+ }
68
+
69
+ rule(:res_action_or_link) {
70
+ str('.').as(:dot) >> (identifier >> str('?').maybe).as(:name) >> str('()')
71
+ }
72
+ ```
73
+
74
+ A single `@` marks a reference; `@@` marks a class reference.
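To see which inputs the two rules above accept, here is a rough regular-expression approximation in plain Ruby. The constants `REFERENCE`, `ACTION_OR_LINK`, and `RES_ACTIONS` are names invented for this sketch; a PEG and a regex do not behave identically in general, so treat this only as an illustration of the accepted shapes.

```ruby
# Approximate regex equivalents of the grammar rules (illustration only).
REFERENCE      = /@{1,2}[a-zA-Z0-9_]+/          # one or two '@', then an identifier
ACTION_OR_LINK = /\.[a-zA-Z0-9_]+\??\(\)/       # '.', identifier, optional '?', then '()'
RES_ACTIONS    = /\A#{REFERENCE}(?:#{ACTION_OR_LINK})*\z/

%w[@res @@res @res.name() @res.ready?() @res.name @@@res].each do |s|
  puts format('%-16s %s', s, RES_ACTIONS.match?(s))
end
```

Note that `@res.name` without the trailing `()` is rejected, which is the shape used in the example's test inputs.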
75
+
76
+ ### Body Structure
77
+
78
+ Bodies contain nested content:
79
+
80
+ ```ruby
81
+ rule(:body) {
82
+ (line_separator >> (block | expression)).repeat(1).as(:body) >>
83
+ line_separator
84
+ }
85
+ ```
86
+
87
+ Recursion allows arbitrarily nested blocks.
88
+
89
+ ### Testing with Errors
90
+
91
+ ```ruby
92
+ ds = [
93
+ %{
94
+ define f()
95
+ @res.name
96
+ end
97
+ },
98
+ %{
99
+ define f()
100
+ begin
101
+ @res.name
102
+ end
103
+ end
104
+ }
105
+ ]
106
+
107
+ ds.each do |d|
108
+ parser.parse_with_debug(d, :reporter => Parsanol::ErrorReporter::Deepest.new)
109
+ end
110
+ ```
111
+
112
+ Each test case shows how errors are reported.
113
+
114
+ ## Output Types
115
+
116
+ ```
117
+ 01
118
+ 02 define f()
119
+ 03 @res.name
120
+ 04 end
121
+
122
+ Parsed successfully
123
+ ```
124
+
125
+ Or for errors:
126
+
127
+ ```
128
+ 01
129
+ 02 define f()
130
+ 03 @res.name(
131
+ 04 end
132
+
133
+ Expected ')' at line 3, column 12
134
+ ```
135
+
136
+ ## Design Decisions
137
+
138
+ ### Why Deepest Reporter?
139
+
140
+ The deepest failure point is usually where the user made their mistake. Earlier failures are often from trying wrong alternatives.
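The idea can be sketched in plain Ruby as a search over a tree of failure records for the one with the largest input offset. `Cause` and `deepest` here are hypothetical stand-ins written for this sketch; they are not the classes in `parsanol/cause.rb` or `parsanol/error_reporter/deepest.rb`, whose internals may differ.

```ruby
# Hypothetical failure record: a message, the input offset where the failure
# occurred, and the failures of any sub-rules that were tried.
Cause = Struct.new(:message, :pos, :children)

# Return the cause with the largest offset anywhere in the tree.
def deepest(cause)
  candidates = cause.children.map { |child| deepest(child) }
  (candidates + [cause]).max_by(&:pos)
end

tree = Cause.new('Expected one of [define_block, begin_block]', 0, [
  Cause.new("Expected 'begin'", 0, []),
  Cause.new('Failed to match define_block', 0, [
    Cause.new("Expected '()'", 18, [])
  ])
])

puts deepest(tree).message # Expected '()'
```

The failures at offset 0 come from alternatives that never got started; the one at offset 18 is where the parser had committed the most input.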
141
+
142
+ ### Why parse_with_debug?
143
+
144
+ This method automatically formats and prints errors. For programmatic access, call `parse` and rescue `Parsanol::ParseFailed`.
145
+
146
+ ### Why Line Numbers?
147
+
148
+ Humans think in lines, not byte positions. Line-oriented output matches how users read their input.
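A minimal sketch of the conversion this implies, assuming a plain `String` and a 0-based character offset (for ASCII input this equals the byte offset). `line_and_column` is a hypothetical helper for illustration, not a Parsanol method.

```ruby
# Hypothetical helper: convert a 0-based character offset into a
# 1-based [line, column] pair.
def line_and_column(text, offset)
  before = text[0, offset]
  line = before.count("\n") + 1
  last_newline = before.rindex("\n")
  column = last_newline ? offset - last_newline : offset + 1
  [line, column]
end

input = "define f()\n  @res.name(\nend\n"
p line_and_column(input, 0)  # [1, 1]
p line_and_column(input, 8)  # [1, 9]
p line_and_column(input, 11) # [2, 1]
```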
149
+
150
+ ### Why Multiple Test Cases?
151
+
152
+ Different error scenarios demonstrate the reporter's behavior across grammar constructs.
@@ -0,0 +1,18 @@
1
+ # A small example that shows a really small parser and what happens on parser
2
+ # errors.
3
+
4
+ $:.unshift File.dirname(__FILE__) + "/../lib"
5
+
6
+ require 'pp'
7
+ require 'parsanol/parslet'
8
+
9
+ class MyParser < Parsanol::Parser
10
+ rule(:a) { str('a').repeat }
11
+
12
+ def parse(str)
13
+ a.parse(str)
14
+ end
15
+ end
16
+
17
+ pp MyParser.new.parse('aaaa')
18
+ pp MyParser.new.parse('bbbb')