kpeg 0.7 → 0.8.0

data/README.md CHANGED
@@ -5,4 +5,65 @@ KPeg is a simple PEG library for Ruby. It provides an API as well as native gram
5
5
 
6
6
  KPeg strives to provide a simple, powerful API without being too exotic.
7
7
 
8
- KPeg supports direct left recursion of rules via the [OMeta memoization](http://www.vpri.org/pdf/tr2008003_experimenting.pdf) trick.
8
+ KPeg supports direct left recursion of rules via the [OMeta memoization](http://www.vpri.org/pdf/tr2008003_experimenting.pdf) trick.
9
+
10
+ ## Writing your first grammar
11
+
12
+ ### Setting up your grammar
13
+
14
+ All grammars start with the class/module name that will be your parser:
15
+
16
+ %% name = Example::Parser
17
+
18
+ After that, a block of Ruby code can be defined that will be added into the class body of your parser. Attributes that are defined in this block can be accessed within your parser as instance variables:
19
+
20
+ %% {
21
+ attr_accessor :something_cool
22
+ }
23
+
24
+ ### Defining literals
25
+
26
+ Literals are static declarations of characters or regular expressions designed for reuse in the grammar. These can be constants or variables.
27
+
28
+ ALPHA = /[A-Za-z]/
29
+ DIGIT = /[0-9]/
30
+ period = "."
31
+
32
+ Literals can also accept multiple definitions
33
+
34
+ vowel = "a" | "e" | "i" | "o" | "u"
35
+ alpha = /[A-Z]/ | /[a-z]/
36
+
37
+ ### Defining Rules for Values
38
+
39
+ Before you can start parsing a string, you will need to define the rules that you will use to accept or reject that string. There are many different types of rules available in kpeg.
40
+
41
+ The most basic of these rules is a string capture
42
+
43
+ alpha = < /[A-Za-z]/ > { text }
44
+
45
+ While this looks very much like the ALPHA literal defined above, it differs in one important way: the text captured by the rule defined between the < and > symbols will be set as the text variable in the block that follows. You can also explicitly name the variable that you would like, but only with existing rules or literals.
46
+
47
+ num = /[1-9][0-9]*/
48
+ sum = < num:n1 "+" num:n2 > { n1 + n2 }
49
+
50
+ Additionally blocks can return true or false values based upon an expression within the block. To test if something is true do the following:
51
+
52
+ greater_than_10 = < num:n > &{ n > 10 }
53
+
54
+ To test for a false value do the following:
55
+
56
+ not_greater_than_10 = < num:n > !{ n > 10 }
57
+
58
+ Rules can also act like functions and take parameters. An example of this can be lifted from the [Email Address Validator](https://github.com/andrewvc/email_address_validator), where an ASCII value is passed in and the character is evaluated against it, returning true if it matches:
59
+
60
+ d(num) = <.> &{ text[0] == num }
61
+
62
+
63
+
64
+ ## Projects using kpeg
65
+
66
+ [Dang](https://github.com/veganstraightedge/dang)
67
+ [Email Address Validator](https://github.com/andrewvc/email_address_validator)
68
+ [Callisto](https://github.com/dwaite/Callisto)
69
+ [Doodle](https://github.com/vito/doodle)
@@ -34,6 +34,59 @@ module KPeg
34
34
  @saves = 0
35
35
  end
36
36
 
37
+ def output_ast(short, code, description)
38
+ parser = FormatParser.new description
39
+
40
+ # just skip it if it's bad.
41
+ return unless parser.parse "ast_root"
42
+
43
+ name, attrs = parser.result
44
+
45
+ code << " class #{name} < Node\n"
46
+ code << " def initialize(#{attrs.join(', ')})\n"
47
+ attrs.each do |at|
48
+ code << " @#{at} = #{at}\n"
49
+ end
50
+ code << " end\n"
51
+ attrs.each do |at|
52
+ code << " attr_reader :#{at}\n"
53
+ end
54
+ code << " end\n"
55
+
56
+ [short, name, attrs]
57
+ end
58
+
59
+ def handle_ast(code)
60
+ output_node = false
61
+
62
+ root = @grammar.variables["ast-location"] || "AST"
63
+
64
+ methods = []
65
+
66
+ @grammar.variables.each do |name, val|
67
+ if val.index("ast ") == 0
68
+ unless output_node
69
+ code << "\n"
70
+ code << " module #{root}\n"
71
+ code << " class Node; end\n"
72
+ output_node = true
73
+ end
74
+ if m = output_ast(name, code, val[4..-1])
75
+ methods << m
76
+ end
77
+ end
78
+ end
79
+
80
+ if output_node
81
+ code << " end\n"
82
+ methods.each do |short, name, attrs|
83
+ code << " def #{short}(#{attrs.join(', ')})\n"
84
+ code << " #{root}::#{name}.new(#{attrs.join(', ')})\n"
85
+ code << " end\n"
86
+ end
87
+ end
88
+ end
89
+
37
90
  def output_op(code, op)
38
91
  case op
39
92
  when Dot
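The `output_ast` and `handle_ast` methods added above turn a grammar variable such as `%% bracket = ast BracketOperator(receiver, argument)` into a node class plus a constructor helper. The following is a minimal standalone sketch of that idea, not the generator itself: `parse_ast_description` and `ast_source` are hypothetical names, a regexp stands in for the `ast_root` rule of `FormatParser`, and the helper method is defined at the top level here rather than inside the generated parser class.

```ruby
# Hypothetical stand-in for FormatParser's ast_root rule. A regexp replaces the
# generated PEG parser and is slightly more lenient about whitespace.
# Returns [name, attrs], or nil when the description is malformed.
def parse_ast_description(description)
  m = /\A([A-Z][A-Za-z0-9_]*)\(([^()]*)\)\z/.match(description)
  return nil unless m

  attrs = m[2].split(",").map(&:strip)
  return nil if attrs.empty?
  return nil unless attrs.all? { |at| at =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/ }

  [m[1], attrs]
end

# Mirrors what output_ast and handle_ast emit for a single ast variable:
# a Node base class, one subclass with readers, and a short helper method.
def ast_source(short, description, root = "AST")
  name, attrs = parse_ast_description(description)
  return nil unless name # like the generator, skip a bad description

  args = attrs.join(", ")
  code = String.new
  code << "module #{root}\n"
  code << "  class Node; end\n"
  code << "  class #{name} < Node\n"
  code << "    def initialize(#{args})\n"
  attrs.each { |at| code << "      @#{at} = #{at}\n" }
  code << "    end\n"
  attrs.each { |at| code << "    attr_reader :#{at}\n" }
  code << "  end\n"
  code << "end\n"
  code << "def #{short}(#{args})\n"
  code << "  #{root}::#{name}.new(#{args})\n"
  code << "end\n"
  code
end

source = ast_source("bracket", "BracketOperator(receiver, argument)")
eval(source, TOPLEVEL_BINDING) # load the generated classes and helper

node = bracket(:recv, :arg)
puts node.class
puts node.receiver.inspect
```

In a grammar, the generated helper pairs naturally with the new `~` action syntax, so an action like `~bracket(r, a)` would build the node. That pairing is an inference from the two features in this diff rather than something the diff states.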
@@ -41,7 +94,8 @@ module KPeg
41
94
  when LiteralString
42
95
  code << " _tmp = match_string(#{op.string.dump})\n"
43
96
  when LiteralRegexp
44
- code << " _tmp = scan(/\\A#{op.regexp}/)\n"
97
+ lang = op.regexp.kcode.to_s[0,1]
98
+ code << " _tmp = scan(/\\A#{op.regexp}/#{lang})\n"
45
99
  when CharRange
46
100
  ss = save()
47
101
  if op.start.bytesize == 1 and op.fin.bytesize == 1
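The `LiteralRegexp` change above re-appends the encoding flag because interpolating a Regexp into generated source goes through `Regexp#to_s`, which keeps only the `m`, `i`, and `x` options. The sketch below illustrates the same idea on Ruby 1.9 or later, where `Regexp#kcode` is no longer available. `encoding_flag` and `scan_source` are hypothetical helper names, not part of KPeg, and only the `u` flag is handled.

```ruby
# Hypothetical helper (not KPeg API): recover the "u" flag from a Regexp.
# Ruby 1.8 exposed this through Regexp#kcode; later Rubies expose it through
# Regexp#options and Regexp#encoding instead.
def encoding_flag(re)
  fixed = (re.options & Regexp::FIXEDENCODING) != 0
  fixed && re.encoding == Encoding::UTF_8 ? "u" : ""
end

# Build the line of parser source that would be emitted for a regexp literal.
def scan_source(re)
  "_tmp = scan(/\\A#{re}/#{encoding_flag(re)})"
end

puts(/./u.to_s)          # the "u" flag is not part of the string form
puts scan_source(/./u)   # flag restored after the closing slash
puts scan_source(/./)    # no flag for a plain regexp
```

The line produced for `/./u` has the same shape as the one `test_reg_unicode` expects from the code generator.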
@@ -271,6 +325,8 @@ module KPeg
271
325
  code << "\n#{act.action}\n\n"
272
326
  end
273
327
 
328
+ handle_ast(code)
329
+
274
330
  fg = @grammar.foreign_grammars
275
331
 
276
332
  if fg.empty?
@@ -283,6 +339,13 @@ module KPeg
283
339
  code << " @_grammar_#{name} = #{gram}.new(nil)\n"
284
340
  end
285
341
  code << " end\n"
342
+
343
+ @grammar.foreign_grammars.each do |name, gram|
344
+ code << "\n"
345
+ code << " def invoke_#{name}(*args)\n"
346
+ code << " @_grammar_#{name}.external_invoke(self, :_root, *args)\n"
347
+ code << " end\n"
348
+ end
286
349
  end
287
350
 
288
351
  render = GrammarRenderer.new(@grammar)
@@ -3,6 +3,11 @@ require 'kpeg/position'
3
3
  module KPeg
4
4
  class CompiledParser
5
5
 
6
+ # Must be outside the STANDALONE block because a standalone
7
+ # parser always injects its own version of this method.
8
+ def setup_foreign_grammar
9
+ end
10
+
6
11
  # Leave these markers in! They allow us to generate standalone
7
12
  # code automatically!
8
13
  #
@@ -18,9 +23,6 @@ module KPeg
18
23
  setup_foreign_grammar
19
24
  end
20
25
 
21
- def setup_foreign_grammar
22
- end
23
-
24
26
  # This is distinct from setup_parser so that a standalone parser
25
27
  # can redefine #initialize and still have access to the proper
26
28
  # parser setup code.
@@ -11,9 +11,6 @@ class KPeg::FormatParser
11
11
  setup_foreign_grammar
12
12
  end
13
13
 
14
- def setup_foreign_grammar
15
- end
16
-
17
14
  # This is distinct from setup_parser so that a standalone parser
18
15
  # can redefine #initialize and still have access to the proper
19
16
  # parser setup code.
@@ -464,6 +461,32 @@ class KPeg::FormatParser
464
461
  return _tmp
465
462
  end
466
463
 
464
+ # method = < /[a-zA-Z_][a-zA-Z0-9_]*/ > { text }
465
+ def _method
466
+
467
+ _save = self.pos
468
+ while true # sequence
469
+ _text_start = self.pos
470
+ _tmp = scan(/\A(?-mix:[a-zA-Z_][a-zA-Z0-9_]*)/)
471
+ if _tmp
472
+ text = get_text(_text_start)
473
+ end
474
+ unless _tmp
475
+ self.pos = _save
476
+ break
477
+ end
478
+ @result = begin; text ; end
479
+ _tmp = true
480
+ unless _tmp
481
+ self.pos = _save
482
+ end
483
+ break
484
+ end # end sequence
485
+
486
+ set_failed_rule :_method unless _tmp
487
+ return _tmp
488
+ end
489
+
467
490
  # dbl_escapes = ("\\\"" { '"' } | "\\n" { "\n" } | "\\t" { "\t" } | "\\b" { "\b" } | "\\\\" { "\\" })
468
491
  def _dbl_escapes
469
492
 
@@ -1239,7 +1262,7 @@ class KPeg::FormatParser
1239
1262
  return _tmp
1240
1263
  end
1241
1264
 
1242
- # value = (value:v ":" var:n { @g.t(v,n) } | value:v "?" { @g.maybe(v) } | value:v "+" { @g.many(v) } | value:v "*" { @g.kleene(v) } | value:v mult_range:r { @g.multiple(v, *r) } | "&" value:v { @g.andp(v) } | "!" value:v { @g.notp(v) } | "(" - expression:o - ")" { o } | "<" - expression:o - ">" { @g.collect(o) } | curly_block | "." { @g.dot } | "@" var:name !(- "=") { @g.invoke(name) } | "^" var:name < nested_paren? > { @g.foreign_invoke("parent", name, text) } | "%" var:gram "." var:name < nested_paren? > { @g.foreign_invoke(gram, name, text) } | var:name < nested_paren? > !(- "=") { text.empty? ? @g.ref(name) : @g.invoke(name, text) } | char_range | regexp | string)
1265
+ # value = (value:v ":" var:n { @g.t(v,n) } | value:v "?" { @g.maybe(v) } | value:v "+" { @g.many(v) } | value:v "*" { @g.kleene(v) } | value:v mult_range:r { @g.multiple(v, *r) } | "&" value:v { @g.andp(v) } | "!" value:v { @g.notp(v) } | "(" - expression:o - ")" { o } | "<" - expression:o - ">" { @g.collect(o) } | curly_block | "~" method:m < nested_paren? > { @g.action("#{m}#{text}") } | "." { @g.dot } | "@" var:name !(- "=") { @g.invoke(name) } | "^" var:name < nested_paren? > { @g.foreign_invoke("parent", name, text) } | "%" var:gram "." var:name < nested_paren? > { @g.foreign_invoke(gram, name, text) } | var:name < nested_paren? > !(- "=") { text.empty? ? @g.ref(name) : @g.invoke(name, text) } | char_range | regexp | string)
1243
1266
  def _value
1244
1267
 
1245
1268
  _save = self.pos
@@ -1503,12 +1526,32 @@ class KPeg::FormatParser
1503
1526
 
1504
1527
  _save10 = self.pos
1505
1528
  while true # sequence
1506
- _tmp = match_string(".")
1529
+ _tmp = match_string("~")
1507
1530
  unless _tmp
1508
1531
  self.pos = _save10
1509
1532
  break
1510
1533
  end
1511
- @result = begin; @g.dot ; end
1534
+ _tmp = apply(:_method)
1535
+ m = @result
1536
+ unless _tmp
1537
+ self.pos = _save10
1538
+ break
1539
+ end
1540
+ _text_start = self.pos
1541
+ _save11 = self.pos
1542
+ _tmp = apply(:_nested_paren)
1543
+ unless _tmp
1544
+ _tmp = true
1545
+ self.pos = _save11
1546
+ end
1547
+ if _tmp
1548
+ text = get_text(_text_start)
1549
+ end
1550
+ unless _tmp
1551
+ self.pos = _save10
1552
+ break
1553
+ end
1554
+ @result = begin; @g.action("#{m}#{text}") ; end
1512
1555
  _tmp = true
1513
1556
  unless _tmp
1514
1557
  self.pos = _save10
@@ -1519,45 +1562,63 @@ class KPeg::FormatParser
1519
1562
  break if _tmp
1520
1563
  self.pos = _save
1521
1564
 
1522
- _save11 = self.pos
1565
+ _save12 = self.pos
1566
+ while true # sequence
1567
+ _tmp = match_string(".")
1568
+ unless _tmp
1569
+ self.pos = _save12
1570
+ break
1571
+ end
1572
+ @result = begin; @g.dot ; end
1573
+ _tmp = true
1574
+ unless _tmp
1575
+ self.pos = _save12
1576
+ end
1577
+ break
1578
+ end # end sequence
1579
+
1580
+ break if _tmp
1581
+ self.pos = _save
1582
+
1583
+ _save13 = self.pos
1523
1584
  while true # sequence
1524
1585
  _tmp = match_string("@")
1525
1586
  unless _tmp
1526
- self.pos = _save11
1587
+ self.pos = _save13
1527
1588
  break
1528
1589
  end
1529
1590
  _tmp = apply(:_var)
1530
1591
  name = @result
1531
1592
  unless _tmp
1532
- self.pos = _save11
1593
+ self.pos = _save13
1533
1594
  break
1534
1595
  end
1535
- _save12 = self.pos
1596
+ _save14 = self.pos
1536
1597
 
1537
- _save13 = self.pos
1598
+ _save15 = self.pos
1538
1599
  while true # sequence
1539
1600
  _tmp = apply(:__hyphen_)
1540
1601
  unless _tmp
1541
- self.pos = _save13
1602
+ self.pos = _save15
1542
1603
  break
1543
1604
  end
1544
1605
  _tmp = match_string("=")
1545
1606
  unless _tmp
1546
- self.pos = _save13
1607
+ self.pos = _save15
1547
1608
  end
1548
1609
  break
1549
1610
  end # end sequence
1550
1611
 
1551
1612
  _tmp = _tmp ? nil : true
1552
- self.pos = _save12
1613
+ self.pos = _save14
1553
1614
  unless _tmp
1554
- self.pos = _save11
1615
+ self.pos = _save13
1555
1616
  break
1556
1617
  end
1557
1618
  @result = begin; @g.invoke(name) ; end
1558
1619
  _tmp = true
1559
1620
  unless _tmp
1560
- self.pos = _save11
1621
+ self.pos = _save13
1561
1622
  end
1562
1623
  break
1563
1624
  end # end sequence
@@ -1565,37 +1626,37 @@ class KPeg::FormatParser
1565
1626
  break if _tmp
1566
1627
  self.pos = _save
1567
1628
 
1568
- _save14 = self.pos
1629
+ _save16 = self.pos
1569
1630
  while true # sequence
1570
1631
  _tmp = match_string("^")
1571
1632
  unless _tmp
1572
- self.pos = _save14
1633
+ self.pos = _save16
1573
1634
  break
1574
1635
  end
1575
1636
  _tmp = apply(:_var)
1576
1637
  name = @result
1577
1638
  unless _tmp
1578
- self.pos = _save14
1639
+ self.pos = _save16
1579
1640
  break
1580
1641
  end
1581
1642
  _text_start = self.pos
1582
- _save15 = self.pos
1643
+ _save17 = self.pos
1583
1644
  _tmp = apply(:_nested_paren)
1584
1645
  unless _tmp
1585
1646
  _tmp = true
1586
- self.pos = _save15
1647
+ self.pos = _save17
1587
1648
  end
1588
1649
  if _tmp
1589
1650
  text = get_text(_text_start)
1590
1651
  end
1591
1652
  unless _tmp
1592
- self.pos = _save14
1653
+ self.pos = _save16
1593
1654
  break
1594
1655
  end
1595
1656
  @result = begin; @g.foreign_invoke("parent", name, text) ; end
1596
1657
  _tmp = true
1597
1658
  unless _tmp
1598
- self.pos = _save14
1659
+ self.pos = _save16
1599
1660
  end
1600
1661
  break
1601
1662
  end # end sequence
@@ -1603,48 +1664,48 @@ class KPeg::FormatParser
1603
1664
  break if _tmp
1604
1665
  self.pos = _save
1605
1666
 
1606
- _save16 = self.pos
1667
+ _save18 = self.pos
1607
1668
  while true # sequence
1608
1669
  _tmp = match_string("%")
1609
1670
  unless _tmp
1610
- self.pos = _save16
1671
+ self.pos = _save18
1611
1672
  break
1612
1673
  end
1613
1674
  _tmp = apply(:_var)
1614
1675
  gram = @result
1615
1676
  unless _tmp
1616
- self.pos = _save16
1677
+ self.pos = _save18
1617
1678
  break
1618
1679
  end
1619
1680
  _tmp = match_string(".")
1620
1681
  unless _tmp
1621
- self.pos = _save16
1682
+ self.pos = _save18
1622
1683
  break
1623
1684
  end
1624
1685
  _tmp = apply(:_var)
1625
1686
  name = @result
1626
1687
  unless _tmp
1627
- self.pos = _save16
1688
+ self.pos = _save18
1628
1689
  break
1629
1690
  end
1630
1691
  _text_start = self.pos
1631
- _save17 = self.pos
1692
+ _save19 = self.pos
1632
1693
  _tmp = apply(:_nested_paren)
1633
1694
  unless _tmp
1634
1695
  _tmp = true
1635
- self.pos = _save17
1696
+ self.pos = _save19
1636
1697
  end
1637
1698
  if _tmp
1638
1699
  text = get_text(_text_start)
1639
1700
  end
1640
1701
  unless _tmp
1641
- self.pos = _save16
1702
+ self.pos = _save18
1642
1703
  break
1643
1704
  end
1644
1705
  @result = begin; @g.foreign_invoke(gram, name, text) ; end
1645
1706
  _tmp = true
1646
1707
  unless _tmp
1647
- self.pos = _save16
1708
+ self.pos = _save18
1648
1709
  end
1649
1710
  break
1650
1711
  end # end sequence
@@ -1652,54 +1713,54 @@ class KPeg::FormatParser
1652
1713
  break if _tmp
1653
1714
  self.pos = _save
1654
1715
 
1655
- _save18 = self.pos
1716
+ _save20 = self.pos
1656
1717
  while true # sequence
1657
1718
  _tmp = apply(:_var)
1658
1719
  name = @result
1659
1720
  unless _tmp
1660
- self.pos = _save18
1721
+ self.pos = _save20
1661
1722
  break
1662
1723
  end
1663
1724
  _text_start = self.pos
1664
- _save19 = self.pos
1725
+ _save21 = self.pos
1665
1726
  _tmp = apply(:_nested_paren)
1666
1727
  unless _tmp
1667
1728
  _tmp = true
1668
- self.pos = _save19
1729
+ self.pos = _save21
1669
1730
  end
1670
1731
  if _tmp
1671
1732
  text = get_text(_text_start)
1672
1733
  end
1673
1734
  unless _tmp
1674
- self.pos = _save18
1735
+ self.pos = _save20
1675
1736
  break
1676
1737
  end
1677
- _save20 = self.pos
1738
+ _save22 = self.pos
1678
1739
 
1679
- _save21 = self.pos
1740
+ _save23 = self.pos
1680
1741
  while true # sequence
1681
1742
  _tmp = apply(:__hyphen_)
1682
1743
  unless _tmp
1683
- self.pos = _save21
1744
+ self.pos = _save23
1684
1745
  break
1685
1746
  end
1686
1747
  _tmp = match_string("=")
1687
1748
  unless _tmp
1688
- self.pos = _save21
1749
+ self.pos = _save23
1689
1750
  end
1690
1751
  break
1691
1752
  end # end sequence
1692
1753
 
1693
1754
  _tmp = _tmp ? nil : true
1694
- self.pos = _save20
1755
+ self.pos = _save22
1695
1756
  unless _tmp
1696
- self.pos = _save18
1757
+ self.pos = _save20
1697
1758
  break
1698
1759
  end
1699
1760
  @result = begin; text.empty? ? @g.ref(name) : @g.invoke(name, text) ; end
1700
1761
  _tmp = true
1701
1762
  unless _tmp
1702
- self.pos = _save18
1763
+ self.pos = _save20
1703
1764
  end
1704
1765
  break
1705
1766
  end # end sequence
@@ -2009,7 +2070,7 @@ class KPeg::FormatParser
2009
2070
  return _tmp
2010
2071
  end
2011
2072
 
2012
- # statement = (- var:v "(" args:a ")" - "=" - expression:o { @g.set(v, o, a) } | - var:v - "=" - expression:o { @g.set(v, o) } | - "%" var:name - "=" - < /[::A-Za-z]+/ > { @g.add_foreign_grammar(name, text) } | - "%%" - curly:act { @g.add_setup act } | - "%%" - var:name - "=" - < (!"\n" .)+ > { @g.set_variable(name, text) })
2073
+ # statement = (- var:v "(" args:a ")" - "=" - expression:o { @g.set(v, o, a) } | - var:v - "=" - expression:o { @g.set(v, o) } | - "%" var:name - "=" - < /[::A-Za-z0-9_]+/ > { @g.add_foreign_grammar(name, text) } | - "%%" - curly:act { @g.add_setup act } | - "%%" - var:name - "=" - < (!"\n" .)+ > { @g.set_variable(name, text) })
2013
2074
  def _statement
2014
2075
 
2015
2076
  _save = self.pos
@@ -2155,7 +2216,7 @@ class KPeg::FormatParser
2155
2216
  break
2156
2217
  end
2157
2218
  _text_start = self.pos
2158
- _tmp = scan(/\A(?-mix:[::A-Za-z]+)/)
2219
+ _tmp = scan(/\A(?-mix:[::A-Za-z0-9_]+)/)
2159
2220
  if _tmp
2160
2221
  text = get_text(_text_start)
2161
2222
  end
@@ -2402,12 +2463,197 @@ class KPeg::FormatParser
2402
2463
  return _tmp
2403
2464
  end
2404
2465
 
2466
+ # ast_constant = < /[A-Z][A-Za-z0-9_]*/ > { text }
2467
+ def _ast_constant
2468
+
2469
+ _save = self.pos
2470
+ while true # sequence
2471
+ _text_start = self.pos
2472
+ _tmp = scan(/\A(?-mix:[A-Z][A-Za-z0-9_]*)/)
2473
+ if _tmp
2474
+ text = get_text(_text_start)
2475
+ end
2476
+ unless _tmp
2477
+ self.pos = _save
2478
+ break
2479
+ end
2480
+ @result = begin; text ; end
2481
+ _tmp = true
2482
+ unless _tmp
2483
+ self.pos = _save
2484
+ end
2485
+ break
2486
+ end # end sequence
2487
+
2488
+ set_failed_rule :_ast_constant unless _tmp
2489
+ return _tmp
2490
+ end
2491
+
2492
+ # ast_word = < /[A-Za-z_][A-Za-z0-9_]*/ > { text }
2493
+ def _ast_word
2494
+
2495
+ _save = self.pos
2496
+ while true # sequence
2497
+ _text_start = self.pos
2498
+ _tmp = scan(/\A(?-mix:[A-Za-z_][A-Za-z0-9_]*)/)
2499
+ if _tmp
2500
+ text = get_text(_text_start)
2501
+ end
2502
+ unless _tmp
2503
+ self.pos = _save
2504
+ break
2505
+ end
2506
+ @result = begin; text ; end
2507
+ _tmp = true
2508
+ unless _tmp
2509
+ self.pos = _save
2510
+ end
2511
+ break
2512
+ end # end sequence
2513
+
2514
+ set_failed_rule :_ast_word unless _tmp
2515
+ return _tmp
2516
+ end
2517
+
2518
+ # ast_sp = (" " | "\t")*
2519
+ def _ast_sp
2520
+ while true
2521
+
2522
+ _save1 = self.pos
2523
+ while true # choice
2524
+ _tmp = match_string(" ")
2525
+ break if _tmp
2526
+ self.pos = _save1
2527
+ _tmp = match_string("\t")
2528
+ break if _tmp
2529
+ self.pos = _save1
2530
+ break
2531
+ end # end choice
2532
+
2533
+ break unless _tmp
2534
+ end
2535
+ _tmp = true
2536
+ set_failed_rule :_ast_sp unless _tmp
2537
+ return _tmp
2538
+ end
2539
+
2540
+ # ast_words = (ast_words:r ast_sp "," ast_sp ast_word:w { r + [w] } | ast_word:w { [w] })
2541
+ def _ast_words
2542
+
2543
+ _save = self.pos
2544
+ while true # choice
2545
+
2546
+ _save1 = self.pos
2547
+ while true # sequence
2548
+ _tmp = apply(:_ast_words)
2549
+ r = @result
2550
+ unless _tmp
2551
+ self.pos = _save1
2552
+ break
2553
+ end
2554
+ _tmp = apply(:_ast_sp)
2555
+ unless _tmp
2556
+ self.pos = _save1
2557
+ break
2558
+ end
2559
+ _tmp = match_string(",")
2560
+ unless _tmp
2561
+ self.pos = _save1
2562
+ break
2563
+ end
2564
+ _tmp = apply(:_ast_sp)
2565
+ unless _tmp
2566
+ self.pos = _save1
2567
+ break
2568
+ end
2569
+ _tmp = apply(:_ast_word)
2570
+ w = @result
2571
+ unless _tmp
2572
+ self.pos = _save1
2573
+ break
2574
+ end
2575
+ @result = begin; r + [w] ; end
2576
+ _tmp = true
2577
+ unless _tmp
2578
+ self.pos = _save1
2579
+ end
2580
+ break
2581
+ end # end sequence
2582
+
2583
+ break if _tmp
2584
+ self.pos = _save
2585
+
2586
+ _save2 = self.pos
2587
+ while true # sequence
2588
+ _tmp = apply(:_ast_word)
2589
+ w = @result
2590
+ unless _tmp
2591
+ self.pos = _save2
2592
+ break
2593
+ end
2594
+ @result = begin; [w] ; end
2595
+ _tmp = true
2596
+ unless _tmp
2597
+ self.pos = _save2
2598
+ end
2599
+ break
2600
+ end # end sequence
2601
+
2602
+ break if _tmp
2603
+ self.pos = _save
2604
+ break
2605
+ end # end choice
2606
+
2607
+ set_failed_rule :_ast_words unless _tmp
2608
+ return _tmp
2609
+ end
2610
+
2611
+ # ast_root = ast_constant:c "(" ast_words:w ")" { [c, w] }
2612
+ def _ast_root
2613
+
2614
+ _save = self.pos
2615
+ while true # sequence
2616
+ _tmp = apply(:_ast_constant)
2617
+ c = @result
2618
+ unless _tmp
2619
+ self.pos = _save
2620
+ break
2621
+ end
2622
+ _tmp = match_string("(")
2623
+ unless _tmp
2624
+ self.pos = _save
2625
+ break
2626
+ end
2627
+ _tmp = apply(:_ast_words)
2628
+ w = @result
2629
+ unless _tmp
2630
+ self.pos = _save
2631
+ break
2632
+ end
2633
+ _tmp = match_string(")")
2634
+ unless _tmp
2635
+ self.pos = _save
2636
+ break
2637
+ end
2638
+ @result = begin; [c, w] ; end
2639
+ _tmp = true
2640
+ unless _tmp
2641
+ self.pos = _save
2642
+ end
2643
+ break
2644
+ end # end sequence
2645
+
2646
+ set_failed_rule :_ast_root unless _tmp
2647
+ return _tmp
2648
+ end
2649
+
2405
2650
  Rules = {}
2406
2651
  Rules[:_eol] = rule_info("eol", "\"\\n\"")
2407
2652
  Rules[:_comment] = rule_info("comment", "\"\#\" (!eol .)* eol")
2408
2653
  Rules[:_space] = rule_info("space", "(\" \" | \"\\t\" | eol)")
2409
2654
  Rules[:__hyphen_] = rule_info("-", "(space | comment)*")
2410
2655
  Rules[:_var] = rule_info("var", "< (\"-\" | /[a-zA-Z][\\-_a-zA-Z0-9]*/) > { text }")
2656
+ Rules[:_method] = rule_info("method", "< /[a-zA-Z_][a-zA-Z0-9_]*/ > { text }")
2411
2657
  Rules[:_dbl_escapes] = rule_info("dbl_escapes", "(\"\\\\\\\"\" { '\"' } | \"\\\\n\" { \"\\n\" } | \"\\\\t\" { \"\\t\" } | \"\\\\b\" { \"\\b\" } | \"\\\\\\\\\" { \"\\\\\" })")
2412
2658
  Rules[:_dbl_seq] = rule_info("dbl_seq", "< /[^\\\\\"]+/ > { text }")
2413
2659
  Rules[:_dbl_not_quote] = rule_info("dbl_not_quote", "(dbl_escapes:s | dbl_seq:s)+:ary { ary }")
@@ -2427,14 +2673,19 @@ class KPeg::FormatParser
2427
2673
  Rules[:_curly_block] = rule_info("curly_block", "curly")
2428
2674
  Rules[:_curly] = rule_info("curly", "\"{\" < (/[^{}]+/ | curly)* > \"}\" { @g.action(text) }")
2429
2675
  Rules[:_nested_paren] = rule_info("nested_paren", "\"(\" (/[^()]+/ | nested_paren)* \")\"")
2430
- Rules[:_value] = rule_info("value", "(value:v \":\" var:n { @g.t(v,n) } | value:v \"?\" { @g.maybe(v) } | value:v \"+\" { @g.many(v) } | value:v \"*\" { @g.kleene(v) } | value:v mult_range:r { @g.multiple(v, *r) } | \"&\" value:v { @g.andp(v) } | \"!\" value:v { @g.notp(v) } | \"(\" - expression:o - \")\" { o } | \"<\" - expression:o - \">\" { @g.collect(o) } | curly_block | \".\" { @g.dot } | \"@\" var:name !(- \"=\") { @g.invoke(name) } | \"^\" var:name < nested_paren? > { @g.foreign_invoke(\"parent\", name, text) } | \"%\" var:gram \".\" var:name < nested_paren? > { @g.foreign_invoke(gram, name, text) } | var:name < nested_paren? > !(- \"=\") { text.empty? ? @g.ref(name) : @g.invoke(name, text) } | char_range | regexp | string)")
2676
+ Rules[:_value] = rule_info("value", "(value:v \":\" var:n { @g.t(v,n) } | value:v \"?\" { @g.maybe(v) } | value:v \"+\" { @g.many(v) } | value:v \"*\" { @g.kleene(v) } | value:v mult_range:r { @g.multiple(v, *r) } | \"&\" value:v { @g.andp(v) } | \"!\" value:v { @g.notp(v) } | \"(\" - expression:o - \")\" { o } | \"<\" - expression:o - \">\" { @g.collect(o) } | curly_block | \"~\" method:m < nested_paren? > { @g.action(\"\#{m}\#{text}\") } | \".\" { @g.dot } | \"@\" var:name !(- \"=\") { @g.invoke(name) } | \"^\" var:name < nested_paren? > { @g.foreign_invoke(\"parent\", name, text) } | \"%\" var:gram \".\" var:name < nested_paren? > { @g.foreign_invoke(gram, name, text) } | var:name < nested_paren? > !(- \"=\") { text.empty? ? @g.ref(name) : @g.invoke(name, text) } | char_range | regexp | string)")
2431
2677
  Rules[:_spaces] = rule_info("spaces", "(space | comment)+")
2432
2678
  Rules[:_values] = rule_info("values", "(values:s spaces value:v { @g.seq(s, v) } | value:l spaces value:r { @g.seq(l, r) } | value)")
2433
2679
  Rules[:_choose_cont] = rule_info("choose_cont", "- \"|\" - values:v { v }")
2434
2680
  Rules[:_expression] = rule_info("expression", "(values:v choose_cont+:alts { @g.any(v, *alts) } | values)")
2435
2681
  Rules[:_args] = rule_info("args", "(args:a \",\" - var:n - { a + [n] } | - var:n - { [n] })")
2436
- Rules[:_statement] = rule_info("statement", "(- var:v \"(\" args:a \")\" - \"=\" - expression:o { @g.set(v, o, a) } | - var:v - \"=\" - expression:o { @g.set(v, o) } | - \"%\" var:name - \"=\" - < /[::A-Za-z]+/ > { @g.add_foreign_grammar(name, text) } | - \"%%\" - curly:act { @g.add_setup act } | - \"%%\" - var:name - \"=\" - < (!\"\\n\" .)+ > { @g.set_variable(name, text) })")
2682
+ Rules[:_statement] = rule_info("statement", "(- var:v \"(\" args:a \")\" - \"=\" - expression:o { @g.set(v, o, a) } | - var:v - \"=\" - expression:o { @g.set(v, o) } | - \"%\" var:name - \"=\" - < /[::A-Za-z0-9_]+/ > { @g.add_foreign_grammar(name, text) } | - \"%%\" - curly:act { @g.add_setup act } | - \"%%\" - var:name - \"=\" - < (!\"\\n\" .)+ > { @g.set_variable(name, text) })")
2437
2683
  Rules[:_statements] = rule_info("statements", "statement (- statements)?")
2438
2684
  Rules[:_eof] = rule_info("eof", "!.")
2439
2685
  Rules[:_root] = rule_info("root", "statements - \"\\n\"? eof")
2686
+ Rules[:_ast_constant] = rule_info("ast_constant", "< /[A-Z][A-Za-z0-9_]*/ > { text }")
2687
+ Rules[:_ast_word] = rule_info("ast_word", "< /[A-Za-z_][A-Za-z0-9_]*/ > { text }")
2688
+ Rules[:_ast_sp] = rule_info("ast_sp", "(\" \" | \"\\t\")*")
2689
+ Rules[:_ast_words] = rule_info("ast_words", "(ast_words:r ast_sp \",\" ast_sp ast_word:w { r + [w] } | ast_word:w { [w] })")
2690
+ Rules[:_ast_root] = rule_info("ast_root", "ast_constant:c \"(\" ast_words:w \")\" { [c, w] }")
2440
2691
  end
@@ -1,3 +1,3 @@
1
1
  module KPeg
2
- VERSION = "0.7"
2
+ VERSION = "0.8.0"
3
3
  end
@@ -93,6 +93,36 @@ end
93
93
  assert !cg.parse("a")
94
94
  end
95
95
 
96
+ def test_reg_unicode
97
+ gram = KPeg.grammar do |g|
98
+ g.root = g.reg(/./u)
99
+ end
100
+
101
+ str = <<-STR
102
+ require 'kpeg/compiled_parser'
103
+
104
+ class Test < KPeg::CompiledParser
105
+
106
+ # root = /./u
107
+ def _root
108
+ _tmp = scan(/\\A(?-mix:.)/u)
109
+ set_failed_rule :_root unless _tmp
110
+ return _tmp
111
+ end
112
+
113
+ Rules = {}
114
+ Rules[:_root] = rule_info("root", "/./u")
115
+ end
116
+ STR
117
+
118
+ cg = KPeg::CodeGenerator.new "Test", gram
119
+
120
+ assert_equal str, cg.output
121
+
122
+ assert cg.parse("う")
123
+ assert cg.parse("a")
124
+ end
125
+
96
126
  def test_char_range
97
127
  gram = KPeg.grammar do |g|
98
128
  g.root = g.range("a", "z")
@@ -791,6 +821,10 @@ class Test < KPeg::CompiledParser
791
821
  @_grammar_blah = TestKPegCodeGenerator::TestParser.new(nil)
792
822
  end
793
823
 
824
+ def invoke_blah(*args)
825
+ @_grammar_blah.external_invoke(self, :_root, *args)
826
+ end
827
+
794
828
  # root = %blah.greeting
795
829
  def _root
796
830
  _tmp = @_grammar_blah.external_invoke(self, :_greeting)
@@ -824,6 +858,10 @@ class Test < KPeg::CompiledParser
824
858
  @_grammar_blah = TestKPegCodeGenerator::TestParser.new(nil)
825
859
  end
826
860
 
861
+ def invoke_blah(*args)
862
+ @_grammar_blah.external_invoke(self, :_root, *args)
863
+ end
864
+
827
865
  # root = %blah.greeting2(1,2)
828
866
  def _root
829
867
  _tmp = @_grammar_blah.external_invoke(self, :_greeting2, 1,2)
@@ -1304,4 +1342,95 @@ end
1304
1342
  assert cg.parse("hello")
1305
1343
  end
1306
1344
 
1345
+ def test_ast_generation
1346
+ gram = KPeg.grammar do |g|
1347
+ g.root = g.dot
1348
+ g.set_variable "bracket", "ast BracketOperator(receiver, argument)"
1349
+ end
1350
+
1351
+ str = <<-STR
1352
+ require 'kpeg/compiled_parser'
1353
+
1354
+ class Test < KPeg::CompiledParser
1355
+
1356
+ module AST
1357
+ class Node; end
1358
+ class BracketOperator < Node
1359
+ def initialize(receiver, argument)
1360
+ @receiver = receiver
1361
+ @argument = argument
1362
+ end
1363
+ attr_reader :receiver
1364
+ attr_reader :argument
1365
+ end
1366
+ end
1367
+ def bracket(receiver, argument)
1368
+ AST::BracketOperator.new(receiver, argument)
1369
+ end
1370
+
1371
+ # root = .
1372
+ def _root
1373
+ _tmp = get_byte
1374
+ set_failed_rule :_root unless _tmp
1375
+ return _tmp
1376
+ end
1377
+
1378
+ Rules = {}
1379
+ Rules[:_root] = rule_info("root", ".")
1380
+ end
1381
+ STR
1382
+
1383
+ cg = KPeg::CodeGenerator.new "Test", gram
1384
+
1385
+ assert_equal str, cg.output
1386
+
1387
+ assert cg.parse("hello")
1388
+ end
1389
+
1390
+ def test_ast_generation_in_different_location
1391
+ gram = KPeg.grammar do |g|
1392
+ g.root = g.dot
1393
+ g.set_variable "bracket", "ast BracketOperator(receiver, argument)"
1394
+ g.set_variable "ast-location", "MegaAST"
1395
+ end
1396
+
1397
+ str = <<-STR
1398
+ require 'kpeg/compiled_parser'
1399
+
1400
+ class Test < KPeg::CompiledParser
1401
+
1402
+ module MegaAST
1403
+ class Node; end
1404
+ class BracketOperator < Node
1405
+ def initialize(receiver, argument)
1406
+ @receiver = receiver
1407
+ @argument = argument
1408
+ end
1409
+ attr_reader :receiver
1410
+ attr_reader :argument
1411
+ end
1412
+ end
1413
+ def bracket(receiver, argument)
1414
+ MegaAST::BracketOperator.new(receiver, argument)
1415
+ end
1416
+
1417
+ # root = .
1418
+ def _root
1419
+ _tmp = get_byte
1420
+ set_failed_rule :_root unless _tmp
1421
+ return _tmp
1422
+ end
1423
+
1424
+ Rules = {}
1425
+ Rules[:_root] = rule_info("root", ".")
1426
+ end
1427
+ STR
1428
+
1429
+ cg = KPeg::CodeGenerator.new "Test", gram
1430
+
1431
+ assert_equal str, cg.output
1432
+
1433
+ assert cg.parse("hello")
1434
+ end
1435
+
1307
1436
  end
@@ -109,6 +109,16 @@ b(p) = x
109
109
  assert_equal "OtherGrammar", gram.foreign_grammars["blah"]
110
110
  end
111
111
 
112
+ def test_add_foreign_grammar_with_numbers
113
+ gram = match "%blah = Thing1::OtherGrammar"
114
+ assert_equal "Thing1::OtherGrammar", gram.foreign_grammars["blah"]
115
+ end
116
+
117
+ def test_add_foreign_grammar_with_undescore
118
+ gram = match "%blah = Other_Grammar"
119
+ assert_equal "Other_Grammar", gram.foreign_grammars["blah"]
120
+ end
121
+
112
122
  def test_invoke_parent_rule
113
123
  assert_rule G.foreign_invoke("parent", "letters"),
114
124
  match("a=^letters"), "a"
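The two new tests above exercise the widened character class in the `statement` rule for foreign grammar names. As a quick illustration in plain Ruby (the local variable names are only for this sketch), the old class stops at the first digit or underscore, while the new one consumes the whole constant path:

```ruby
# Character classes copied from the statement rule before and after the change.
old_name = /\A[::A-Za-z]+/
new_name = /\A[::A-Za-z0-9_]+/

["Thing1::OtherGrammar", "Other_Grammar", "OtherGrammar"].each do |name|
  puts "#{name}: old=#{name[old_name].inspect} new=#{name[new_name].inspect}"
end
```

With the old class only a prefix of names like `Thing1::OtherGrammar` is matched, so the rest of the line is left unconsumed.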
@@ -305,6 +315,16 @@ Value = NUMBER:i { i }
305
315
  assert_rule G.seq(:b, :c, G.action(" b + { c + d } ")), m
306
316
  end
307
317
 
318
+ def test_action_send
319
+ m = match 'a=b c ~d'
320
+ assert_rule G.seq(:b, :c, G.action("d")), m
321
+ end
322
+
323
+ def test_action_send_with_args
324
+ m = match 'a=b c ~d(b,c)'
325
+ assert_rule G.seq(:b, :c, G.action("d(b,c)")), m
326
+ end
327
+
308
328
  def test_collect
309
329
  m = match 'a = < b c >'
310
330
  assert_rule G.collect(G.seq(:b, :c)), m
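The `~method` form tested in `test_action_send` and `test_action_send_with_args` is shorthand for an action block whose body is the method call: the `value` rule passes `"#{m}#{text}"` to `@g.action`. A rough standalone sketch of that desugaring follows. `action_send_text` and `balanced_parens` are hypothetical helpers written for this illustration, with a hand-written scanner approximating the `nested_paren` rule.

```ruby
# Return the leading balanced "(...)" group of s, or nil if there is none.
# Approximates the nested_paren rule with a simple depth counter.
def balanced_parens(s)
  return nil unless s.start_with?("(")

  depth = 0
  s.each_char.with_index do |ch, i|
    depth += 1 if ch == "("
    depth -= 1 if ch == ")"
    return s[0..i] if depth.zero?
  end
  nil # unbalanced input: treat the optional argument list as absent
end

# Turn "~name" or "~name(args)" into the text handed to the action,
# or nil when the input does not start with a "~" method send.
def action_send_text(src)
  return nil unless src.start_with?("~")

  rest = src[1..-1]
  name = rest[/\A[a-zA-Z_][a-zA-Z0-9_]*/] # same pattern as the method rule
  return nil unless name

  args = balanced_parens(rest[name.length..-1]) || ""
  name + args
end

puts action_send_text("~d")
puts action_send_text("~d(b,c)")
```

So `a = b c ~d(b,c)` behaves like `a = b c { d(b,c) }`, matching the `G.action("d(b,c)")` expectation in the test.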
metadata CHANGED
@@ -1,11 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: kpeg
3
3
  version: !ruby/object:Gem::Version
4
- prerelease: false
4
+ hash: 63
5
+ prerelease:
5
6
  segments:
6
7
  - 0
7
- - 7
8
- version: "0.7"
8
+ - 8
9
+ - 0
10
+ version: 0.8.0
9
11
  platform: ruby
10
12
  authors:
11
13
  - Evan Phoenix
@@ -13,16 +15,18 @@ autorequire:
13
15
  bindir: bin
14
16
  cert_chain: []
15
17
 
16
- date: 2011-03-15 00:00:00 -07:00
18
+ date: 2011-04-06 00:00:00 -07:00
17
19
  default_executable:
18
20
  dependencies:
19
21
  - !ruby/object:Gem::Dependency
20
22
  name: rake
21
23
  prerelease: false
22
24
  requirement: &id001 !ruby/object:Gem::Requirement
25
+ none: false
23
26
  requirements:
24
27
  - - ">="
25
28
  - !ruby/object:Gem::Version
29
+ hash: 3
26
30
  segments:
27
31
  - 0
28
32
  version: "0"
@@ -57,6 +61,13 @@ files:
57
61
  - Rakefile
58
62
  - kpeg.gemspec
59
63
  - Gemfile
64
+ - test/test_file_parser_roundtrip.rb
65
+ - test/test_gen_calc.rb
66
+ - test/test_kpeg.rb
67
+ - test/test_kpeg_code_generator.rb
68
+ - test/test_kpeg_compiled_parser.rb
69
+ - test/test_kpeg_format.rb
70
+ - test/test_kpeg_grammar_renderer.rb
60
71
  has_rdoc: true
61
72
  homepage: https://github.com/evanphx/kpeg
62
73
  licenses: []
@@ -67,23 +78,27 @@ rdoc_options: []
67
78
  require_paths:
68
79
  - lib
69
80
  required_ruby_version: !ruby/object:Gem::Requirement
81
+ none: false
70
82
  requirements:
71
83
  - - ">="
72
84
  - !ruby/object:Gem::Version
85
+ hash: 3
73
86
  segments:
74
87
  - 0
75
88
  version: "0"
76
89
  required_rubygems_version: !ruby/object:Gem::Requirement
90
+ none: false
77
91
  requirements:
78
92
  - - ">="
79
93
  - !ruby/object:Gem::Version
94
+ hash: 3
80
95
  segments:
81
96
  - 0
82
97
  version: "0"
83
98
  requirements: []
84
99
 
85
100
  rubyforge_project:
86
- rubygems_version: 1.3.6
101
+ rubygems_version: 1.6.2
87
102
  signing_key:
88
103
  specification_version: 3
89
104
  summary: Peg-based Code Generator