fop_lang 0.1.0 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 62ded9f974ca670a6eca5d4b293f98195af25ab476e9b3571e7def5299499fa9
4
- data.tar.gz: b53b6e276ee31f122571a096dd534ed5c1407d27384fab78d216886322751fcb
3
+ metadata.gz: 1166e1e43fd54ed2263db8a37ff288431a6152543869d961f007a134483f1a4b
4
+ data.tar.gz: 17e55b17448c38a37afb6e24a5798c87de368a352bdb0c74df13ad865a7e3ad0
5
5
  SHA512:
6
- metadata.gz: 45509b09122e4f76f1f8219d8c0f094bc5ac77231dec25f6e0e9818932b68186d74c71dd74db80638dcfc9bae210892b563e660004cfd32c25f439c761c3edc2
7
- data.tar.gz: 49c6153cf728e909e60bc8fa1f1214dcdd8f9532703c8bc3455999aac1aa421c629370e1c7669bc9a846b5886fa2f10b7f5b698426cae2830e73ca9623a32a4a
6
+ metadata.gz: cbf5d8c7f6c10ca395518cbd7bf0e9083b1f217b3523386a482f19e5cdf47a16aa7ffb50a6290f89f0ec80ba113900c499944fdf35b7e2e71a941d537483a7e1
7
+ data.tar.gz: bff3c613a575687d0d3223c5bd60bb1128b0ae78accf2e5101228c3ec14d61f46cdc798855af9d07692de0b25ba01737dbda97409d4e8f44881e9bbe6da9c523
data/README.md CHANGED
@@ -1,44 +1,92 @@
1
1
  # fop_lang
2
2
 
3
- Fop is a tiny expression language implemented in Ruby for text filtering and modification.
3
+ Fop (Filter and OPerations language) is an experimental, tiny expression language in the vein of awk and sed. This is a Ruby implementation. It is useful for simultaneously matching and transforming text input.
4
4
 
5
- ## Examples
5
+ ```ruby
6
+ gem 'fop_lang'
7
+ ```
8
+
9
+ ## Release Number Example
10
+
11
+ This example takes in GitHub branch names, decides if they're release branches, and if so, increments the version number.
6
12
 
7
13
  ```ruby
8
- f = Fop("release-{N}.{N+1}.{N=0}")
14
+ f = Fop('release-{N}.{N+1}.{N=0}')
9
15
 
10
- puts f.apply("release-5.99.1")
11
- => "release-5.100.0"
16
+ puts f.apply('release-5.99.1')
17
+ => 'release-5.100.0'
12
18
 
13
- puts f.apply("release-5")
19
+ puts f.apply('release-5')
14
20
  => nil
15
21
  # doesn't match the pattern
16
22
  ```
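For readers who think in plain regular expressions, the expression in this example behaves roughly like the hand-written Ruby below. This is only a sketch of the observable behavior; `bump_release` is a hypothetical helper name and is not part of fop_lang.

```ruby
# Hypothetical plain-Ruby equivalent of Fop('release-{N}.{N+1}.{N=0}').
# The whole input must match; otherwise nil is returned.
def bump_release(branch)
  m = /\Arelease-(\d+)\.(\d+)\.(\d+)\z/.match(branch)
  return nil unless m

  major = m[1]            # {N}   : matched and passed through unchanged
  minor = m[2].to_i + 1   # {N+1} : matched and incremented
  patch = 0               # {N=0} : matched and replaced with 0
  "release-#{major}.#{minor}.#{patch}"
end

p bump_release('release-5.99.1')
p bump_release('release-5')
```

The Fop expression states the same intent in one line, with the match and the transformation written side by side.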
17
23
 
24
+ ## Anatomy of a Fop expression
25
+
26
+ `Text Literal {Operation}`
27
+
28
+ The above expression contains the only two parts of Fop (except for the wildcard and escape characters).
29
+
30
+ **Text Literals**
31
+
32
+ A text literal works how it sounds: the input must match it exactly. If it matches, it passes through unchanged. The only exception is the `*` (wildcard) character, which matches 0 or more of anything. Wildcards can be used anywhere except inside `{...}` (operations).
33
+
34
+ If `\` (escape) is used before the special characters `*`, `{` or `}`, then that character is treated like a text literal. It's recommended to use single-quoted Ruby strings with Fop expressions so that you don't need to double-escape.
35
+
36
+ **Operations**
37
+
38
+ Operations are the interesting part of Fop, and are specified between `{` and `}`. An Operation can consist of one to three parts:
39
+
40
+ 1. Matching class (required): Defines what characters the operation will match and operate on.
41
+ * `N` is the numeric class and will match one or more digits.
42
+ * `A` is the alpha class and will match one or more letters (lower or upper case).
43
+ * `W` is the word class and matches alphanumeric chars and underscores.
44
+ * `*` is the wildcard class and greedily matches everything after it.
45
+ * `/.../` matches on the supplied regex between the `/`'s. If your regex contains a `/`, it must be escaped. Capture groups may be referenced in the operator argument as `$1`, `$2`, etc.
46
+ 2. Operator (optional): What to do to the matching characters.
47
+ * `=` Replace the matching character(s) with the given argument. If no argument is given, drop the matching chars.
48
+ * `>` Append the following chars to the matching value.
49
+ * `<` Prepend the following chars to the matching value.
50
+ * `+` Perform addition on the matching number and the argument (`N` only).
51
+ * `-` Subtract the argument from the matching number (`N` only).
52
+ 3. Operator argument (required for some operators): its meaning varies by operator.
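The matching classes listed above correspond to ordinary regular expressions. The sketch below mirrors the class-to-pattern table that the parser in this release defines as `REGEX_MATCHES`; the `CLASS_PATTERNS` constant and the `match_class` helper are illustrative names only.

```ruby
# Illustrative table: Fop matching class => regex source.
CLASS_PATTERNS = {
  'N' => '[0-9]+',     # numeric: one or more digits
  'A' => '[a-zA-Z]+',  # alpha: one or more letters
  'W' => '\w+',        # word: alphanumerics and underscores
  '*' => '.*',         # wildcard: everything that remains
}.freeze

# Returns the leading portion of `input` matched by the class, or nil.
def match_class(klass, input)
  regex = Regexp.new('^' + CLASS_PATTERNS.fetch(klass))
  input[regex]
end

p match_class('N', '42abc')     # leading digits
p match_class('A', 'abc_42')    # leading letters
p match_class('W', 'abc_42-x')  # letters, digits and underscores
p match_class('N', 'abc')       # no leading digits, so no match
```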
53
+
54
+ ## More Examples
55
+
18
56
  ```ruby
19
- f = Fop("release-{N=5}.{N+1}.{N=0}")
57
+ f = Fop('release-{N=5}.{N+1}.{N=0}')
58
+
59
+ puts f.apply('release-4.99.1')
60
+ => 'release-5.100.0'
61
+ ```
62
+
63
+ ```ruby
64
+ f = Fop('rel{/(ease)?/}-{N=5}.{N+1}.{N=0}')
65
+
66
+ puts f.apply('release-4.99.1')
67
+ => 'release-5.100.0'
20
68
 
21
- puts f.apply("release-4.99.1")
22
- => "release-5.100.0"
69
+ puts f.apply('rel-4.99.1')
70
+ => 'rel-5.100.0'
23
71
  ```
24
72
 
25
73
  ```ruby
26
- f = Fop("release-*{N=5}.{N+100}.{N=0}")
74
+ f = Fop('release-*{N=5}.{N+100}.{N=0}')
27
75
 
28
- puts f.apply("release-foo-4.100.1")
29
- => "release-foo-5.200.0"
76
+ puts f.apply('release-foo-4.100.1')
77
+ => 'release-foo-5.200.0'
30
78
  ```
31
79
 
32
80
  ```ruby
33
- f = Fop("release-{N=5}.{N+1}.{N=0}{*=}")
81
+ f = Fop('release-{N=5}.{N+1}.{N=0}{*=}')
34
82
 
35
- puts f.apply("release-4.100.1.foo.bar")
36
- => "release-5.101.0"
83
+ puts f.apply('release-4.100.1.foo.bar')
84
+ => 'release-5.101.0'
37
85
  ```
38
86
 
39
87
  ```ruby
40
- f = Fop("{W=version}-{N=5}.{N+1}.{N=0}")
88
+ f = Fop('{W=version}-{N=5}.{N+1}.{N=0}')
41
89
 
42
- puts f.apply("release-4.100.1")
43
- => "version-5.101.0"
90
+ puts f.apply('release-4.100.1')
91
+ => 'version-5.101.0'
44
92
  ```
data/lib/fop/compiler.rb ADDED
@@ -0,0 +1,72 @@
1
+ require_relative 'parser'
2
+
3
+ module Fop
4
+ module Compiler
5
+ def self.compile(src)
6
+ parser = Parser.new(src)
7
+ nodes, errors = parser.parse
8
+
9
+ instructions = nodes.map { |node|
10
+ case node
11
+ when Nodes::Text, Nodes::Regex
12
+ Instructions.regex_match(node.regex)
13
+ when Nodes::Expression
14
+ Instructions::ExpressionMatch.new(node)
15
+ else
16
+ raise "Unknown node type #{node}"
17
+ end
18
+ }
19
+
20
+ return nil, errors if errors.any?
21
+ return instructions, nil
22
+ end
23
+
24
+ module Instructions
25
+ BLANK = "".freeze
26
+ OPERATIONS = {
27
+ "=" => ->(_val, arg) { arg || BLANK },
28
+ "+" => ->(val, arg) { val.to_i + arg.to_i },
29
+ "-" => ->(val, arg) { val.to_i - arg.to_i },
30
+ ">" => ->(val, arg) { val + arg },
31
+ "<" => ->(val, arg) { arg + val },
32
+ }
33
+
34
+ def self.regex_match(regex)
35
+ ->(input) { input.slice! regex }
36
+ end
37
+
38
+ class ExpressionMatch
39
+ def initialize(node)
40
+ @regex = node.regex&.regex
41
+ @op = node.operator ? OPERATIONS.fetch(node.operator) : nil
42
+ @regex_match = node.regex_match
43
+ if node.arg&.any? { |a| a.is_a? Integer }
44
+ @arg, @arg_with_caps = nil, node.arg
45
+ else
46
+ @arg = node.arg&.join("")
47
+ @arg_with_caps = nil
48
+ end
49
+ end
50
+
51
+ def call(input)
52
+ if (match = @regex.match(input))
53
+ val = match.to_s
54
+ blank = val == BLANK
55
+ input.sub!(val, BLANK) unless blank
56
+ found_val = @regex_match || !blank
57
+ arg = @arg_with_caps ? sub_caps(@arg_with_caps, match.captures) : @arg
58
+ @op && found_val ? @op.call(val, arg) : val
59
+ end
60
+ end
61
+
62
+ private
63
+
64
+ def sub_caps(args, caps)
65
+ args.map { |a|
66
+ a.is_a?(Integer) ? caps[a].to_s : a
67
+ }.join("")
68
+ end
69
+ end
70
+ end
71
+ end
72
+ end
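To make the operator table and capture substitution above easier to follow, here is a standalone sketch that can be run without the gem. `OPS` and the top-level `sub_caps` are modeled on `Instructions::OPERATIONS` and `ExpressionMatch#sub_caps`; the sample input and argument are assumptions chosen for illustration.

```ruby
# Operator table modeled on Instructions::OPERATIONS.
OPS = {
  '=' => ->(_val, arg) { arg || '' },
  '+' => ->(val, arg) { val.to_i + arg.to_i },
  '-' => ->(val, arg) { val.to_i - arg.to_i },
  '>' => ->(val, arg) { val + arg },
  '<' => ->(val, arg) { arg + val },
}.freeze

# Argument parts are either literal strings or zero-based capture indexes.
def sub_caps(parts, captures)
  parts.map { |part| part.is_a?(Integer) ? captures[part].to_s : part }.join
end

match = /^(\w+)-(\d+)/.match('release-41')
arg = sub_caps(['v', 1], match.captures) # like the operator argument "v$2"

p OPS.fetch('=').call(match.to_s, arg)
p OPS.fetch('+').call('41', '1')
p OPS.fetch('>').call('release', '-candidate')
```

Storing captures as integers inside the argument array lets the compiler decide once, at compile time, whether substitution is needed at all.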
data/lib/fop/nodes.rb CHANGED
@@ -1,68 +1,30 @@
1
1
  module Fop
2
2
  module Nodes
3
- Text = Struct.new(:wildcard, :str) do
4
- def consume!(input)
5
- @regex ||= Regexp.new((wildcard ? ".*" : "^") + Regexp.escape(str))
6
- input.slice!(@regex)
7
- end
8
-
3
+ Text = Struct.new(:wildcard, :str, :regex) do
9
4
  def to_s
10
5
  w = wildcard ? "*" : nil
11
- "Text #{w}#{str}"
6
+ "[#{w}txt] #{str}"
12
7
  end
13
8
  end
14
9
 
15
- Match = Struct.new(:wildcard, :tokens) do
16
- NUM = "N".freeze
17
- WORD = "W".freeze
18
- WILD = "*".freeze
19
- BLANK = "".freeze
20
-
21
- def consume!(input)
22
- if (val = input.slice!(@regex))
23
- @expression && val != BLANK ? @expression.call(val) : val
24
- end
25
- end
26
-
10
+ Regex = Struct.new(:wildcard, :src, :regex) do
27
11
  def to_s
28
12
  w = wildcard ? "*" : nil
29
- @op ? "#{w}#{@match} #{@op} #{@arg}" : "#{w}#{@match}"
13
+ "[#{w}reg] #{src}"
30
14
  end
15
+ end
31
16
 
32
- def parse!
33
- match = tokens.shift || raise(ParserError, "Empty match")
34
- raise ParserError, "Unexpected #{match}" unless match.is_a? Tokenizer::Char
35
-
36
- @match = match.char
37
- @regex =
38
- case @match
39
- when NUM then Regexp.new((wildcard ? ".*?" : "^") + "[0-9]+")
40
- when WORD then Regexp.new((wildcard ? ".*?" : "^") + "[a-zA-Z]+")
41
- when WILD then /.*/
42
- else raise ParserError, "Unknown match type '#{@match}'"
43
- end
44
-
45
- if (op = tokens.shift)
46
- raise ParserError, "Unexpected #{op}" unless op.is_a? Tokenizer::Char
47
- arg = tokens.reduce("") { |acc, t|
48
- raise ParserError, "Unexpected #{t}" unless t.is_a? Tokenizer::Char
49
- acc + t.char
50
- }
51
-
52
- @op = op.char
53
- @arg = arg == BLANK ? nil : arg
54
- @expression =
55
- case @op
56
- when "=" then ->(_) { @arg || BLANK }
57
- when "+", "-", "*", "/"
58
- raise ParserError, "Operator #{@op} is only available for numeric matches" unless @match == NUM
59
- raise ParserError, "Operator #{@op} expects an argument" if @arg.nil?
60
- ->(x) { x.to_i.send(@op, @arg.to_i) }
61
- else raise ParserError, "Unknown operator #{@op}"
62
- end
63
- else
64
- @op, @arg, @expression = nil, nil, nil
17
+ Expression = Struct.new(:wildcard, :match, :regex_match, :regex, :operator, :arg) do
18
+ def to_s
19
+ w = wildcard ? "*" : nil
20
+ s = "[#{w}exp] #{match}"
21
+ if operator
22
+ arg_str = arg
23
+ .map { |a| a.is_a?(Integer) ? "$#{a+1}" : a.to_s }
24
+ .join("")
25
+ s << " #{operator} #{arg_str}"
65
26
  end
27
+ s
66
28
  end
67
29
  end
68
30
  end
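The new node types are plain `Struct`s that the parser fills in step by step. The sketch below shows the Ruby behavior this relies on: a `Struct` may be constructed with fewer arguments than members, and the remaining members default to `nil`. The `Expression` struct here is a simplified, hypothetical stand-in for `Nodes::Expression`.

```ruby
# Simplified stand-in for Nodes::Expression (fewer members, same idea).
Expression = Struct.new(:wildcard, :match, :operator, :arg) do
  def to_s
    s = "[#{wildcard ? '*' : nil}exp] #{match}"
    s += " #{operator} #{arg.join}" if operator
    s
  end
end

exp = Expression.new(false) # only :wildcard is set; the rest are nil
exp.match = 'N'
puts exp                    # no operator yet

exp.operator = '+'
exp.arg = ['1']
puts exp
```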
data/lib/fop/parser.rb CHANGED
@@ -1,93 +1,162 @@
1
+ require_relative 'tokenizer'
1
2
  require_relative 'nodes'
2
3
 
3
4
  module Fop
4
- module Parser
5
- Error = Class.new(StandardError)
5
+ class Parser
6
+ DIGIT = /^[0-9]$/
7
+ REGEX_START = "^".freeze
8
+ REGEX_LAZY_WILDCARD = ".*?".freeze
9
+ REGEX_MATCHES = {
10
+ "N" => "[0-9]+".freeze,
11
+ "W" => "\\w+".freeze,
12
+ "A" => "[a-zA-Z]+".freeze,
13
+ "*" => ".*".freeze,
14
+ }.freeze
15
+ OPS_WITH_OPTIONAL_ARGS = [Tokenizer::OP_REPLACE]
16
+ TR_REGEX = /.*/
6
17
 
7
- def self.parse!(tokens)
8
- stack = []
9
- current_el = nil
18
+ Error = Struct.new(:type, :token, :message) do
19
+ def to_s
20
+ "#{type.to_s.capitalize} error: #{message} at column #{token.pos}"
21
+ end
22
+ end
23
+
24
+ attr_reader :errors
10
25
 
11
- tokens.each { |token|
12
- case current_el
13
- when nil
14
- current_el = new_element token
15
- when :wildcard
16
- current_el = new_element token, true
17
- raise Error, "Unexpected * after wildcard" if current_el == :wildcard
18
- when Nodes::Text
19
- current_el = parse_text stack, current_el, token
20
- when Nodes::Match
21
- current_el = parse_match stack, current_el, token
26
+ def initialize(src, debug: false)
27
+ @tokenizer = Tokenizer.new(src)
28
+ @errors = []
29
+ end
30
+
31
+ def parse
32
+ nodes = []
33
+ wildcard = false
34
+ eof = false
35
+ # Top-level parsing. It will always be looking for a String, Regex, or Expression.
36
+ until eof
37
+ @tokenizer.reset_escapes!
38
+ t = @tokenizer.next
39
+ case t.type
40
+ when Tokens::WILDCARD
41
+ errors << Error.new(:syntax, t, "Consecutive wildcards") if wildcard
42
+ wildcard = true
43
+ when Tokens::TEXT
44
+ reg = build_regex!(wildcard, t, Regexp.escape(t.val))
45
+ nodes << Nodes::Text.new(wildcard, t.val, reg)
46
+ wildcard = false
47
+ when Tokens::EXP_OPEN
48
+ nodes << parse_exp!(wildcard)
49
+ wildcard = false
50
+ when Tokens::REG_DELIM
51
+ nodes << parse_regex!(wildcard)
52
+ wildcard = false
53
+ when Tokens::EOF
54
+ eof = true
22
55
  else
23
- raise Error, "Unexpected token #{token} in #{current_el}"
56
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}")
24
57
  end
25
- }
26
-
27
- case current_el
28
- when nil
29
- # noop
30
- when :wildcard
31
- stack << Nodes::Text.new(true, "")
32
- when Nodes::Text
33
- stack << current_el
34
- when Nodes::Match
35
- raise Error, "Unclosed match"
36
58
  end
37
-
38
- stack
59
+ nodes << Nodes::Text.new(true, "", TR_REGEX) if wildcard
60
+ return nodes, @errors
39
61
  end
40
62
 
41
- private
63
+ def parse_exp!(wildcard = false)
64
+ exp = Nodes::Expression.new(wildcard)
65
+ parse_exp_match! exp
66
+ op_token = parse_exp_operator! exp
67
+ if exp.operator
68
+ parse_exp_arg! exp, op_token
69
+ end
70
+ return exp
71
+ end
42
72
 
43
- def self.new_element(token, wildcard = false)
44
- case token
45
- when Tokenizer::Char
46
- Nodes::Text.new(wildcard, token.char.clone)
47
- when :match_open
48
- Nodes::Match.new(wildcard, [])
49
- when :match_close
50
- raise ParserError, "Unmatched }"
51
- when :wildcard
52
- :wildcard
73
+ def parse_exp_match!(exp)
74
+ @tokenizer.escape.operators = false
75
+ t = @tokenizer.next
76
+ case t.type
77
+ when Tokens::TEXT, Tokens::WILDCARD
78
+ exp.match = t.val
79
+ if (src = REGEX_MATCHES[exp.match])
80
+ reg = Regexp.new((exp.wildcard ? REGEX_LAZY_WILDCARD : REGEX_START) + src)
81
+ exp.regex = Nodes::Regex.new(exp.wildcard, src, reg)
82
+ else
83
+ errors << Error.new(:name, t, "Unknown match type '#{exp.match}'") if exp.regex.nil?
84
+ end
85
+ when Tokens::REG_DELIM
86
+ exp.regex = parse_regex!(exp.wildcard)
87
+ exp.match = exp.regex&.src
88
+ exp.regex_match = true
89
+ @tokenizer.reset_escapes!
53
90
  else
54
- raise ParserError, "Unexpected #{token}"
91
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected a string or a regex")
55
92
  end
56
93
  end
57
94
 
58
- def self.parse_text(stack, text_el, token)
59
- case token
60
- when :match_open
61
- stack << text_el
62
- Nodes::Match.new(false, [])
63
- when :match_close
64
- raise ParserError.new, "Unexpected }"
65
- when Tokenizer::Char
66
- text_el.str << token.char
67
- text_el
68
- when :wildcard
69
- stack << text_el
70
- :wildcard
95
+ def parse_exp_operator!(exp)
96
+ @tokenizer.escape.operators = false
97
+ t = @tokenizer.next
98
+ case t.type
99
+ when Tokens::EXP_CLOSE
100
+ # no op
101
+ when Tokens::OPERATOR
102
+ exp.operator = t.val
71
103
  else
72
- raise ParserError, "Unexpected #{token}"
104
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected an operator")
73
105
  end
106
+ t
74
107
  end
75
108
 
76
- def self.parse_match(stack, match_el, token)
77
- case token
78
- when Tokenizer::Char
79
- match_el.tokens << token
80
- match_el
81
- when :wildcard
82
- match_el.tokens << Tokenizer::Char.new("*").freeze
83
- match_el
84
- when :match_close
85
- match_el.parse!
86
- stack << match_el
87
- nil
109
+ def parse_exp_arg!(exp, op_token)
110
+ @tokenizer.escape.operators = true
111
+ @tokenizer.escape.regex = true
112
+ @tokenizer.escape.regex_capture = false if exp.regex_match
113
+
114
+ exp.arg = []
115
+ found_close, eof = false, false
116
+ until found_close or eof
117
+ t = @tokenizer.next
118
+ case t.type
119
+ when Tokens::TEXT
120
+ exp.arg << t.val
121
+ when Tokens::REG_CAPTURE
122
+ exp.arg << t.val.to_i - 1
123
+ errors << Error.new(:syntax, t, "Invalid regex capture; must be between 0 and 9 (found #{t.val})") unless t.val =~ DIGIT
124
+ errors << Error.new(:syntax, t, "Unexpected regex capture; expected str or '}'") if !exp.regex_match
125
+ when Tokens::EXP_CLOSE
126
+ found_close = true
127
+ when Tokens::EOF
128
+ eof = true
129
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected str or '}'")
130
+ else
131
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected str or '}'")
132
+ end
133
+ end
134
+
135
+ if exp.arg.size != 1 and !OPS_WITH_OPTIONAL_ARGS.include?(exp.operator)
136
+ errors << Error.new(:arg, op_token, "Operator '#{op_token.val}' requires an argument")
137
+ end
138
+ end
139
+
140
+ def parse_regex!(wildcard)
141
+ @tokenizer.regex_mode!
142
+ t = @tokenizer.next
143
+ reg = Nodes::Regex.new(wildcard, t.val)
144
+ if t.type == Tokens::TEXT
145
+ reg.regex = build_regex!(wildcard, t)
88
146
  else
89
- raise ParserError, "Unexpected #{token}"
147
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected a string of regex")
90
148
  end
149
+
150
+ t = @tokenizer.next
151
+ errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected a string of regex") unless t.type == Tokens::REG_DELIM
152
+ reg
153
+ end
154
+
155
+ def build_regex!(wildcard, token, src = token.val)
156
+ Regexp.new((wildcard ? REGEX_LAZY_WILDCARD : REGEX_START) + src)
157
+ rescue RegexpError => e
158
+ errors << Error.new(:regex, token, e.message)
159
+ nil
91
160
  end
92
161
  end
93
162
  end
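A notable change in this parser is that it collects errors instead of raising on the first one. The sketch below isolates that pattern in the style of `build_regex!`: an invalid pattern is recorded and `nil` is returned so that parsing can continue. `ParseError` and the standalone `build_regex` are hypothetical names, and the column numbers are made up for the example.

```ruby
# Hypothetical error record; the parser in the diff uses a similar Struct.
ParseError = Struct.new(:type, :pos, :message) do
  def to_s
    "#{type.to_s.capitalize} error: #{message} at column #{pos}"
  end
end

# Compiles `src` anchored at the start of input. On failure, records an
# error and returns nil so the caller can keep looking for more problems.
def build_regex(src, pos, errors)
  Regexp.new('^' + src)
rescue RegexpError => e
  errors << ParseError.new(:regex, pos, e.message)
  nil
end

errors = []
good = build_regex('[0-9]+', 0, errors)
bad  = build_regex('(unclosed', 7, errors)

p good
p bad
puts errors.map(&:to_s)
```

Collecting errors this way allows a caller to report every problem in an expression at once, at the cost of having to tolerate partially built nodes.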
data/lib/fop/program.rb CHANGED
@@ -1,22 +1,16 @@
1
- require_relative 'tokenizer'
2
- require_relative 'parser'
3
-
4
1
  module Fop
5
2
  class Program
6
- attr_reader :nodes
7
-
8
- def initialize(src)
9
- tokens = Tokenizer.tokenize! src
10
- @nodes = Parser.parse! tokens
3
+ def initialize(instructions)
4
+ @instructions = instructions
11
5
  end
12
6
 
13
7
  def apply(input)
14
8
  input = input.clone
15
9
  output =
16
- @nodes.reduce("") { |acc, token|
17
- section = token.consume!(input)
18
- return nil if section.nil?
19
- acc + section.to_s
10
+ @instructions.reduce("") { |acc, ins|
11
+ result = ins.call(input)
12
+ return nil if result.nil?
13
+ acc + result.to_s
20
14
  }
21
15
  input.empty? ? output : nil
22
16
  end
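The reworked `apply` treats a program as a list of callables that each consume a prefix of the remaining input. The following self-contained sketch shows that pipeline under simplified assumptions; `literal`, `increment`, and the top-level `apply` are illustrative names and not the gem's API.

```ruby
# Each instruction takes the remaining input, removes what it consumed,
# and returns the text to emit (or nil when it does not match).
literal = ->(text) { ->(input) { input.slice!(/^#{Regexp.escape(text)}/) } }
increment = lambda do |input|
  digits = input.slice!(/^[0-9]+/)
  digits && (digits.to_i + 1).to_s
end

INSTRUCTIONS = [literal.call('v'), increment].freeze

def apply(instructions, input)
  input = input.dup               # never mutate the caller's string
  output = +''
  instructions.each do |ins|
    result = ins.call(input)
    return nil if result.nil?     # an instruction failed to match
    output << result.to_s
  end
  input.empty? ? output : nil     # leftover input means no match
end

p apply(INSTRUCTIONS, 'v9')
p apply(INSTRUCTIONS, 'v9-beta')
p apply(INSTRUCTIONS, 'x9')
```

Because every instruction mutates a private copy of the input, the final `input.empty?` check is enough to enforce that the whole string was matched.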
data/lib/fop/tokenizer.rb CHANGED
@@ -1,34 +1,175 @@
1
+ require_relative 'tokens'
2
+
1
3
  module Fop
2
- module Tokenizer
3
- Char = Struct.new(:char)
4
- Error = Class.new(StandardError)
5
-
6
- def self.tokenize!(src)
7
- tokens = []
8
- escape = false
9
- src.each_char { |char|
4
+ class Tokenizer
5
+ Token = Struct.new(:pos, :type, :val)
6
+ Error = Struct.new(:pos, :message)
7
+ Escapes = Struct.new(:operators, :regex_capture, :regex, :regex_escape, :wildcards, :exp)
8
+
9
+ EXP_OPEN = "{".freeze
10
+ EXP_CLOSE = "}".freeze
11
+ ESCAPE = "\\".freeze
12
+ WILDCARD = "*".freeze
13
+ REGEX_DELIM = "/".freeze
14
+ REGEX_CAPTURE = "$".freeze
15
+ OP_REPLACE = "=".freeze
16
+ OP_APPEND = ">".freeze
17
+ OP_PREPEND = "<".freeze
18
+ OP_ADD = "+".freeze
19
+ OP_SUB = "-".freeze
20
+
21
+ #
22
+ # Controls which "mode" the tokenizer is currently in. This is a necessary result of the syntax lacking
23
+ # explicit string delimiters. That *could* be worked around by requiring users to escape all reserved chars,
24
+ # but that's ugly af. Instead, the parser continually assesses the current context and flips these flags on
25
+ # or off to auto-escape certain chars for the next token.
26
+ #
27
+ attr_reader :escape
28
+
29
+ def initialize(src)
30
+ @src = src
31
+ @end = src.size - 1
32
+ @start_i = 0
33
+ @i = 0
34
+ reset_escapes!
35
+ end
36
+
37
+ # Auto-escape operators and regex capture vars. Appropriate for top-level syntax.
38
+ def reset_escapes!
39
+ @escape = Escapes.new(true, true)
40
+ end
41
+
42
+ # Auto-escape anything you'd find in a regular expression
43
+ def regex_mode!
44
+ @escape.regex = false # look for the final /
45
+ @escape.regex_escape = true # pass \ through to the regex engine UNLESS it's followed by a /
46
+ @escape.wildcards = true
47
+ @escape.operators = true
48
+ @escape.regex_capture = true
49
+ @escape.exp = true
50
+ end
51
+
52
+ def next
53
+ return Token.new(@i, Tokens::EOF) if @i > @end
54
+ char = @src[@i]
55
+ case char
56
+ when EXP_OPEN
57
+ @i += 1
58
+ token! Tokens::EXP_OPEN
59
+ when EXP_CLOSE
60
+ @i += 1
61
+ token! Tokens::EXP_CLOSE
62
+ when WILDCARD
63
+ @i += 1
64
+ token! Tokens::WILDCARD, WILDCARD
65
+ when REGEX_DELIM
66
+ if @escape.regex
67
+ get_str!
68
+ else
69
+ @i += 1
70
+ token! Tokens::REG_DELIM
71
+ end
72
+ when REGEX_CAPTURE
73
+ if @escape.regex_capture
74
+ get_str!
75
+ else
76
+ @i += 1
77
+ t = token! Tokens::REG_CAPTURE, @src[@i]
78
+ @i += 1
79
+ @start_i = @i
80
+ t
81
+ end
82
+ when OP_REPLACE, OP_APPEND, OP_PREPEND, OP_ADD, OP_SUB
83
+ if @escape.operators
84
+ get_str!
85
+ else
86
+ @i += 1
87
+ token! Tokens::OPERATOR, char
88
+ end
89
+ else
90
+ get_str!
91
+ end
92
+ end
93
+
94
+ private
95
+
96
+ def token!(type, val = nil)
97
+ t = Token.new(@start_i, type, val)
98
+ @start_i = @i
99
+ t
100
+ end
101
+
102
+ def get_str!
103
+ str = ""
104
+ escape, found_end = false, false
105
+ until found_end or @i > @end
106
+ char = @src[@i]
107
+
10
108
  if escape
11
- tokens << Char.new(char)
109
+ @i += 1
110
+ str << char
12
111
  escape = false
13
112
  next
14
113
  end
15
114
 
16
115
  case char
17
- when "\\".freeze
18
- escape = true
19
- when "{".freeze
20
- tokens << :match_open
21
- when "}".freeze
22
- tokens << :match_close
23
- when "*".freeze
24
- tokens << :wildcard
116
+ when ESCAPE
117
+ @i += 1
118
+ if @escape.regex_escape and @src[@i] != REGEX_DELIM
119
+ str << char
120
+ else
121
+ escape = true
122
+ end
123
+ when EXP_OPEN
124
+ if @escape.exp
125
+ @i += 1
126
+ str << char
127
+ else
128
+ found_end = true
129
+ end
130
+ when EXP_CLOSE
131
+ if @escape.exp
132
+ @i += 1
133
+ str << char
134
+ else
135
+ found_end = true
136
+ end
137
+ when WILDCARD
138
+ if @escape.wildcards
139
+ @i += 1
140
+ str << char
141
+ else
142
+ found_end = true
143
+ end
144
+ when REGEX_DELIM
145
+ if @escape.regex
146
+ @i += 1
147
+ str << char
148
+ else
149
+ found_end = true
150
+ end
151
+ when REGEX_CAPTURE
152
+ if @escape.regex_capture
153
+ @i += 1
154
+ str << char
155
+ else
156
+ found_end = true
157
+ end
158
+ when OP_REPLACE, OP_APPEND, OP_PREPEND, OP_ADD, OP_SUB
159
+ if @escape.operators
160
+ @i += 1
161
+ str << char
162
+ else
163
+ found_end = true
164
+ end
25
165
  else
26
- tokens << Char.new(char)
166
+ @i += 1
167
+ str << char
27
168
  end
28
- }
169
+ end
29
170
 
30
- raise Error, "Trailing escape" if escape
31
- tokens
171
+ return Token.new(@i - 1, Tokens::TR_ESC) if escape
172
+ token! Tokens::TEXT, str
32
173
  end
33
174
  end
34
175
  end
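For orientation, here is a much smaller tokenizer sketch. It is closer to the character-at-a-time approach of the removed 0.1.0 code than to the context-sensitive tokenizer added here: it handles only text, `{`, `}`, `*` and backslash escapes, and it ignores operators, regex delimiters, and the escape-mode flags. The token shapes and the `tokenize` name are assumptions made for the example.

```ruby
SPECIAL = { '{' => :open, '}' => :close, '*' => :wildcard }.freeze

# Splits a top-level expression into [:txt, str], [:open], [:close]
# and [:wildcard] tokens, honoring backslash escapes.
def tokenize(src)
  tokens = []
  text = +''
  escaped = false
  src.each_char do |char|
    if escaped
      text << char                # an escaped char is always literal text
      escaped = false
    elsif char == '\\'
      escaped = true
    elsif SPECIAL.key?(char)
      tokens << [:txt, text] unless text.empty?
      text = +''
      tokens << [SPECIAL[char]]
    else
      text << char
    end
  end
  raise ArgumentError, 'trailing escape' if escaped

  tokens << [:txt, text] unless text.empty?
  tokens
end

p tokenize('rel\*{N}*')
```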
data/lib/fop/tokens.rb ADDED
@@ -0,0 +1,13 @@
1
+ module Fop
2
+ module Tokens
3
+ TEXT = :TXT
4
+ EXP_OPEN = :"{"
5
+ EXP_CLOSE = :"}"
6
+ REG_CAPTURE = :"$"
7
+ REG_DELIM = :/
8
+ WILDCARD = :*
9
+ OPERATOR = :op
10
+ TR_ESC = :"trailing escape"
11
+ EOF = :EOF
12
+ end
13
+ end
data/lib/fop/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Fop
2
- VERSION = "0.1.0"
2
+ VERSION = "0.5.0"
3
3
  end
data/lib/fop_lang.rb CHANGED
@@ -1,12 +1,22 @@
1
1
  require_relative 'fop/version'
2
+ require_relative 'fop/compiler'
2
3
  require_relative 'fop/program'
3
4
 
4
5
  def Fop(src)
5
- ::Fop::Program.new(src)
6
+ ::Fop.compile!(src)
6
7
  end
7
8
 
8
9
  module Fop
10
+ def self.compile!(src)
11
+ prog, errors = compile(src)
12
+ # TODO better exception
13
+ raise "Fop errors: " + errors.map(&:message).join(",") if errors
14
+ prog
15
+ end
16
+
9
17
  def self.compile(src)
10
- Program.new(src)
18
+ instructions, errors = ::Fop::Compiler.compile(src)
19
+ return nil, errors if errors
20
+ return Program.new(instructions), nil
11
21
  end
12
22
  end
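The entry points now come in two flavors: `compile` returns a `[program, errors]` pair, while `compile!` raises when there are errors. A minimal sketch of that convention follows. The brace-counting check and the returned array are placeholders, assumed only so the example runs without the gem.

```ruby
# Hypothetical stand-in for a compiler: returns [program, nil] on success
# or [nil, errors] on failure, mirroring the convention in the diff.
def compile(src)
  errors = []
  errors << 'unbalanced braces' unless src.count('{') == src.count('}')
  return nil, errors if errors.any?

  return [:program, src], nil # a real compiler would return a program object
end

# Bang variant: same work, but raises instead of returning errors.
def compile!(src)
  program, errors = compile(src)
  raise ArgumentError, "compile errors: #{errors.join(', ')}" if errors

  program
end

p compile('a{N}')
p compile('a{N')
p compile!('a{N}')
```

The non-raising form suits tools that want to display every error to a user, while the bang form keeps one-line usage such as `Fop('...')` short.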
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fop_lang
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jordan Hollinger
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2021-08-15 00:00:00.000000000 Z
11
+ date: 2021-08-20 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: A micro expression language for Filter and OPerations on text
14
14
  email: jordan.hollinger@gmail.com
@@ -17,10 +17,12 @@ extensions: []
17
17
  extra_rdoc_files: []
18
18
  files:
19
19
  - README.md
20
+ - lib/fop/compiler.rb
20
21
  - lib/fop/nodes.rb
21
22
  - lib/fop/parser.rb
22
23
  - lib/fop/program.rb
23
24
  - lib/fop/tokenizer.rb
25
+ - lib/fop/tokens.rb
24
26
  - lib/fop/version.rb
25
27
  - lib/fop_lang.rb
26
28
  homepage: https://jhollinger.github.io/fop-lang-rb/