fop_lang 0.7.0 → 0.8.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 798fd7c335f394e878fba2f70a9f60372ea356c79f2dc63392398920d0ffce38
-   data.tar.gz: 654786ff77823e8d8dd9a348f958828346e3755e43a04a0f38e711a6c5571ea9
+   metadata.gz: e23d8d937f5a4b5e4d74010bb91923dedce019543d4d3baefc228dece938a731
+   data.tar.gz: cc97f6953b708498be169352269b861c73c9dbe52ded1a72f4370a8d18d32d48
  SHA512:
-   metadata.gz: 6761f3d7dd602d1c93a2387fc73ea14c11484e88d0d319bbf87df98925977aa15de59a63f23aafffafa384ce3b9def9f81edabae669aabc2012b00d3131e46f4
-   data.tar.gz: 7f5187cd510d691dda996284d5a400804b7573f67506701e39a6d2909c8a4026b58655f6b2800708e911377ccce790885a2238eed7a75d4873e4b599d23e67df
+   metadata.gz: e2cec9cd47a472298f7af0268a9dc03aacce374ed88da7b505e33cb4536f6f1d04107cce7c33eba4718809d54591a3111bfd26971eef3c52073ba1226be4da4f
+   data.tar.gz: 80b5700d0cdda44dd021fe48d5c134cb992c6967b10681c43488a5a7276fbf03df7d7a9427a9aa92529569eaf0d134fa789df0c8e27cd2250dc50bcb16727d13
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # fop_lang

- Fop (Filter and OPerations language) is a tiny, experimental language for filtering and transforming text. Think of it like awk but with the condition and action segments combined.
+ Fop (Filter and OPerations language) is a tiny, experimental language for filtering and operating on text. Think of it like awk but with the condition and action segments combined.

  This is a Ruby implementation with both a library interface and a bin command.

@@ -16,10 +16,8 @@ You may use fop in a Ruby script:
  require 'fop_lang'

  f = Fop('foo {N+1}')
-
  f.apply('foo 1')
  => "foo 2"
-
  f.apply('bar 1')
  => nil
  ```
@@ -40,18 +38,24 @@ The above program demonstrates a text match, a regex match, and a match expressi

  ### Text match

+ `Text ` and ` ` in the above example.
+
  The input must match this text exactly. Whitespace is part of the match. Wildcards (`*`) are allowed. Special characters (`*/{}\`) may be escaped with `\`.

  The output of a text match will be the matching input.

  ### Regex match

+ `/(R|r)egex/` in the above example.
+
  Regular expressions may be placed between `/`s. If the regular expression contains a `/`, you may escape it with `\`. Special regex characters like `[]()+.*` may also be escaped with `\`.

  The output of a regex match will be the matching input.

  ### Match expression

+ `{N+1}` in the above example.
+
  A match expression both matches on input and modifies that input. An expression is made up of 1 - 3 parts:

  1. The match, e.g. `N` for numeric.
@@ -76,6 +80,10 @@ The output of a match expression will be the _modified_ matching input. If no op
  * `+` Perform addition on the matching number and the argument (`N` only).
  * `-` Subtract the argument from the matching number (`N` only).

+ **Whitespace**
+
+ Inside of match expressions, whitespace is an optional separator of terms, i.e. `{ N + 1 }` is the same as `{N+1}`. This means that any spaces in string arguments must be escaped. For example, replacing a word with `foo bar` looks like `{W = foo\ bar}`.
+
  ## Examples

  ### Release Number Example
@@ -103,10 +111,10 @@ This example takes in GitHub branch names, decides if they're release branches,
  ```

  ```ruby
- f = Fop('rel{/(ease)?/}-{N=5}.{N+1}.{N=0}')
+ f = Fop('rel{/(ease)?/=}-{N=5}.{N+1}.{N=0}')

  puts f.apply('release-4.99.1')
- => 'release-5.100.0'
+ => 'rel-5.100.0'

  puts f.apply('rel-4.99.1')
  => 'rel-5.100.0'
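
The Whitespace paragraph added to the README says that spaces inside a match expression separate terms unless they are escaped with `\`. As a rough illustration of that rule only, the sketch below splits an argument string on unescaped spaces. `split_args` is a hypothetical helper written for this example; it is not part of fop_lang, and the gem's tokenizer is structured differently.

```ruby
# Hypothetical helper (not part of fop_lang): split an argument string on
# unescaped spaces, treating "\ " as a literal space.
def split_args(src)
  args = []
  current = +""
  chars = src.each_char.to_a
  i = 0
  while i < chars.size
    c = chars[i]
    if c == "\\" && i + 1 < chars.size
      # An escaped character is kept literally, including an escaped space.
      current << chars[i + 1]
      i += 2
    elsif c == " "
      # An unescaped space ends the current argument, if there is one.
      args << current unless current.empty?
      current = +""
      i += 1
    else
      current << c
      i += 1
    end
  end
  args << current unless current.empty?
  args
end

p split_args('foo\ bar')   # one argument containing a space
p split_args('foo bar')    # two arguments
```

Under this reading, `{W = foo\ bar}` passes a single argument to `=`, while `{W = foo bar}` would pass two.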
data/lib/fop/compiler.rb CHANGED
@@ -11,6 +11,8 @@ module Fop
  when Nodes::Text, Nodes::Regex
    Instructions.regex_match(node.regex)
  when Nodes::Expression
+   arg_error = Validations.validate_args(node)
+   errors << arg_error if arg_error
    Instructions::ExpressionMatch.new(node)
  else
    raise "Unknown node type #{node}"
@@ -22,13 +24,14 @@ module Fop
  end

  module Instructions
+   Op = Struct.new(:proc, :arity, :max_arity)
    BLANK = "".freeze
    OPERATIONS = {
-     "=" => ->(_val, arg) { arg || BLANK },
-     "+" => ->(val, arg) { val.to_i + arg.to_i },
-     "-" => ->(val, arg) { val.to_i - arg.to_i },
-     ">" => ->(val, arg) { val + arg },
-     "<" => ->(val, arg) { arg + val },
+     "=" => Op.new(->(_val, args) { args[0] || BLANK }, 0, 1),
+     "+" => Op.new(->(val, args) { val.to_i + args[0].to_i }, 1),
+     "-" => Op.new(->(val, args) { val.to_i - args[0].to_i }, 1),
+     ">" => Op.new(->(val, args) { val + args[0] }, 1),
+     "<" => Op.new(->(val, args) { args[0] + val }, 1),
    }

    def self.regex_match(regex)
@@ -38,14 +41,11 @@ module Fop
  class ExpressionMatch
    def initialize(node)
      @regex = node.regex&.regex
-     @op = node.operator ? OPERATIONS.fetch(node.operator) : nil
+     @op = node.operator_token ? OPERATIONS.fetch(node.operator_token.val) : nil
      @regex_match = node.regex_match
-     if node.arg&.any? { |a| a.is_a? Integer }
-       @arg, @arg_with_caps = nil, node.arg
-     else
-       @arg = node.arg&.join("")
-       @arg_with_caps = nil
-     end
+     @args = node.args&.map { |arg|
+       arg.has_captures ? arg.segments : arg.segments.join("")
+     }
    end

    def call(input)
@@ -54,8 +54,18 @@ module Fop
      blank = val == BLANK
      input.sub!(val, BLANK) unless blank
      found_val = @regex_match || !blank
-     arg = @arg_with_caps ? sub_caps(@arg_with_caps, match.captures) : @arg
-     @op && found_val ? @op.call(val, arg) : val
+     if @op and @args and found_val
+       args = @args.map { |arg|
+         case arg
+         when String then arg
+         when Array then sub_caps(arg, match.captures)
+         else raise "Unexpected arg type #{arg.class.name}"
+         end
+       }
+       @op.proc.call(val, args)
+     else
+       val
+     end
    end
  end

@@ -68,5 +78,18 @@ module Fop
          end
        end
      end
+
+     module Validations
+       def self.validate_args(exp_node)
+         op_token = exp_node.operator_token || return
+         op = Instructions::OPERATIONS.fetch(op_token.val)
+         num = exp_node.args&.size || 0
+         arity = op.arity
+         max_arity = op.max_arity || arity
+         if num < arity or num > max_arity
+           Parser::Error.new(:argument, op_token, "#{op_token.val} expects #{arity}..#{max_arity} arguments; #{num} given")
+         end
+       end
+     end
    end
  end
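
The compiler change above moves argument-count checking out of the parser and into an arity check driven by the new `Op` struct. The standalone sketch below shows that idea in isolation. `Op` mirrors the struct in the diff; `OPS` and `arity_error` are names invented for this example, and the real `Validations.validate_args` returns a `Parser::Error` rather than a string.

```ruby
# Standalone sketch of the arity check. Op mirrors the struct in the diff;
# OPS and arity_error are hypothetical names used only in this example.
Op = Struct.new(:proc, :arity, :max_arity)

OPS = {
  "=" => Op.new(->(_val, args) { args[0] || "" }, 0, 1),
  "+" => Op.new(->(val, args) { val.to_i + args[0].to_i }, 1),
}

# Returns an error message, or nil when the argument count is acceptable.
def arity_error(op_name, args)
  op = OPS.fetch(op_name)
  max = op.max_arity || op.arity # max_arity falls back to arity when omitted
  return nil if args.size.between?(op.arity, max)
  "#{op_name} expects #{op.arity}..#{max} arguments; #{args.size} given"
end

p arity_error("=", [])     # nil: "=" accepts zero or one argument
p arity_error("+", [])     # an error message: "+" needs exactly one argument
p arity_error("+", ["1"])  # nil
```

Keeping the arity next to each operation's proc means a new operator only needs one table entry to get both behavior and validation.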
data/lib/fop/nodes.rb CHANGED
@@ -14,18 +14,29 @@ module Fop
        end
      end

-     Expression = Struct.new(:wildcard, :match, :regex_match, :regex, :operator, :arg) do
+     Expression = Struct.new(:wildcard, :match, :regex_match, :regex, :operator_token, :args) do
        def to_s
          w = wildcard ? "*" : nil
          s = "[#{w}exp] #{match}"
-         if operator
-           arg_str = arg
+         if operator_token
+           arg_str = args
              .map { |a| a.is_a?(Integer) ? "$#{a+1}" : a.to_s }
              .join("")
-           s << " #{operator} #{arg_str}"
+           s << " #{operator_token.val} #{arg_str}"
          end
          s
        end
      end
+
+     Arg = Struct.new(:segments, :has_captures) do
+       def to_s
+         segments.map { |s|
+           case s
+           when Integer then "$#{s + 1}"
+           else s.to_s
+         end
+         }.join("")
+       end
+     end
    end
  end
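
The new `Nodes::Arg` stores an argument as a list of segments, where a String is literal text and an Integer is a zero-based regex capture index. The sketch below shows how such segments might be rendered and resolved. `Arg` mirrors the struct in the diff; `substitute_captures` is a hypothetical stand-in for the gem's `sub_caps`, whose body does not appear in this diff.

```ruby
# Arg mirrors the struct added in the diff. substitute_captures is a
# hypothetical stand-in for the gem's sub_caps, which is not shown here.
Arg = Struct.new(:segments, :has_captures) do
  def to_s
    segments.map { |s| s.is_a?(Integer) ? "$#{s + 1}" : s.to_s }.join("")
  end
end

# Replace each Integer segment with the capture at that zero-based index.
def substitute_captures(segments, captures)
  segments.map { |s| s.is_a?(Integer) ? captures[s].to_s : s }.join("")
end

arg = Arg.new(["v", 0, ".", 1], true)
match = /(\d+)-(\d+)/.match("10-20")

puts arg.to_s                                           # v$1.$2
puts substitute_captures(arg.segments, match.captures)  # v10.20
```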
data/lib/fop/parser.rb CHANGED
@@ -12,7 +12,7 @@ module Fop
    "A" => "[a-zA-Z]+".freeze,
    "*" => ".*".freeze,
  }.freeze
- OPS_WITH_OPTIONAL_ARGS = [Tokenizer::OP_REPLACE]
+ #OPS_WITH_OPTIONAL_ARGS = [Tokenizer::OP_REPLACE]
  TR_REGEX = /.*/

  Error = Struct.new(:type, :token, :message) do
@@ -63,14 +63,15 @@ module Fop
  def parse_exp!(wildcard = false)
    exp = Nodes::Expression.new(wildcard)
    parse_exp_match! exp
-   op_token = parse_exp_operator! exp
-   if exp.operator
-     parse_exp_arg! exp, op_token
+   parse_exp_operator! exp
+   if exp.operator_token
+     parse_exp_arg! exp
    end
    return exp
  end

  def parse_exp_match!(exp)
+   @tokenizer.escape.whitespace = false
    @tokenizer.escape.operators = false
    t = @tokenizer.next
    case t.type
@@ -93,35 +94,44 @@ module Fop
  end

  def parse_exp_operator!(exp)
+   @tokenizer.escape.whitespace = false
    @tokenizer.escape.operators = false
    t = @tokenizer.next
    case t.type
    when Tokens::EXP_CLOSE
      # no op
-   when Tokens::OPERATOR
-     exp.operator = t.val
+   when Tokens::OPERATOR, Tokens::TEXT
+     exp.operator_token = t
    else
      errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected an operator")
    end
-   t
  end

- def parse_exp_arg!(exp, op_token)
+ def parse_exp_arg!(exp)
+   @tokenizer.escape.whitespace = false
+   @tokenizer.escape.whitespace_sep = false
    @tokenizer.escape.operators = true
    @tokenizer.escape.regex = true
    @tokenizer.escape.regex_capture = false if exp.regex_match

-   exp.arg = []
+   arg = Nodes::Arg.new([], false)
+   exp.args = []
    found_close, eof = false, false
    until found_close or eof
      t = @tokenizer.next
      case t.type
      when Tokens::TEXT
-       exp.arg << t.val
+       arg.segments << t.val
      when Tokens::REG_CAPTURE
-       exp.arg << t.val.to_i - 1
+       arg.has_captures = true
+       arg.segments << t.val.to_i - 1
        errors << Error.new(:syntax, t, "Invalid regex capture; must be between 0 and 9 (found #{t.val})") unless t.val =~ DIGIT
        errors << Error.new(:syntax, t, "Unexpected regex capture; expected str or '}'") if !exp.regex_match
+     when Tokens::WHITESPACE_SEP
+       if arg.segments.any?
+         exp.args << arg
+         arg = Nodes::Arg.new([])
+       end
      when Tokens::EXP_CLOSE
        found_close = true
      when Tokens::EOF
@@ -131,10 +141,11 @@ module Fop
        errors << Error.new(:syntax, t, "Unexpected #{t.type}; expected str or '}'")
      end
    end
+   exp.args << arg if arg.segments.any?

-   if exp.arg.size != 1 and !OPS_WITH_OPTIONAL_ARGS.include?(exp.operator)
-     errors << Error.new(:arg, op_token, "Operator '#{op_token.val}' requires an argument")
-   end
+   #if exp.arg.size != 1 and !OPS_WITH_OPTIONAL_ARGS.include?(exp.operator)
+   # errors << Error.new(:arg, op_token, "Operator '#{op_token.val}' requires an argument")
+   #end
  end

  def parse_regex!(wildcard)
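
The reworked `parse_exp_arg!` above collects text tokens into the current argument and starts a new argument whenever a whitespace separator arrives, skipping separators that would produce an empty argument. The sketch below is a simplified model of that grouping loop. The `[type, value]` token pairs and the `:text`, `:sep`, and `:close` names are assumptions made for this example, not the gem's token types.

```ruby
# Simplified model of the grouping loop in parse_exp_arg!. Tokens are
# [type, value] pairs; :text, :sep, and :close are hypothetical names.
def group_args(tokens)
  args = []
  current = []
  tokens.each do |type, val|
    case type
    when :text
      current << val
    when :sep
      # A separator closes the current argument but never creates an empty one.
      args << current.join("") unless current.empty?
      current = []
    when :close
      break
    end
  end
  # Flush whatever was collected before the closing brace.
  args << current.join("") unless current.empty?
  args
end

tokens = [[:sep, " "], [:text, "foo"], [:sep, " "], [:text, "bar"], [:close, "}"]]
p group_args(tokens)  # ["foo", "bar"]
```

The final flush after the loop corresponds to the `exp.args << arg if arg.segments.any?` line added after the `until` loop in the diff.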
data/lib/fop/tokenizer.rb CHANGED
@@ -3,8 +3,7 @@ require_relative 'tokens'
  module Fop
    class Tokenizer
      Token = Struct.new(:pos, :type, :val)
-     Error = Struct.new(:pos, :message)
-     Escapes = Struct.new(:operators, :regex_capture, :regex, :regex_escape, :wildcards, :exp)
+     Escapes = Struct.new(:whitespace, :whitespace_sep, :operators, :regex_capture, :regex, :regex_escape, :wildcards, :exp)

      EXP_OPEN = "{".freeze
      EXP_CLOSE = "}".freeze
@@ -17,6 +16,7 @@ module Fop
  OP_PREPEND = "<".freeze
  OP_ADD = "+".freeze
  OP_SUB = "-".freeze
+ WHITESPACE = " ".freeze

  #
  # Controls which "mode" the tokenizer is currently in. This is a necessary result of the syntax lacking
@@ -36,11 +36,12 @@ module Fop

  # Auto-escape operators and regex capture vars. Appropriate for top-level syntax.
  def reset_escapes!
-   @escape = Escapes.new(true, true)
+   @escape = Escapes.new(true, true, true, true)
  end

  # Auto-escape anything you'd find in a regular expression
  def regex_mode!
+   @escape.whitespace = true
    @escape.regex = false # look for the final /
    @escape.regex_escape = true # pass \ through to the regex engine UNLESS it's followed by a /
    @escape.wildcards = true
@@ -86,6 +87,17 @@ module Fop
      @i += 1
      token! Tokens::OPERATOR, char
    end
+ when WHITESPACE
+   if @escape.whitespace
+     get_str!
+   elsif !@escape.whitespace_sep
+     @i += 1
+     token! Tokens::WHITESPACE_SEP
+   else
+     @i += 1
+     @start_i = @i
+     self.next
+   end
  else
    get_str!
  end
@@ -162,6 +174,13 @@ module Fop
    else
      found_end = true
    end
+ when WHITESPACE
+   if @escape.whitespace
+     @i += 1
+     str << char
+   else
+     found_end = true
+   end
  else
    @i += 1
    str << char
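
The tokenizer changes above give a space three possible meanings, chosen by the `whitespace` and `whitespace_sep` escape flags. The sketch below restates that branch order on its own. `Escapes` here keeps only those two flags, and `whitespace_action` is a hypothetical helper; the mode labels in the comments assume `whitespace_sep` still holds its reset value of `true` while the match and operator are parsed.

```ruby
# Escapes keeps only the two whitespace flags from the diff.
# whitespace_action is a hypothetical helper summarizing the branch order.
Escapes = Struct.new(:whitespace, :whitespace_sep)

def whitespace_action(escape)
  if escape.whitespace
    :text        # the space is ordinary text
  elsif !escape.whitespace_sep
    :separator   # the space is emitted as a separator token
  else
    :skip        # the space is dropped and tokenizing continues
  end
end

p whitespace_action(Escapes.new(true, true))    # :text (top-level syntax)
p whitespace_action(Escapes.new(false, true))   # :skip (assumed: match and operator)
p whitespace_action(Escapes.new(false, false))  # :separator (as set in parse_exp_arg!)
```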
data/lib/fop/tokens.rb CHANGED
@@ -8,6 +8,7 @@ module Fop
      WILDCARD = :*
      OPERATOR = :op
      TR_ESC = :"trailing escape"
+     WHITESPACE_SEP = :s
      EOF = :EOF
    end
  end
data/lib/fop/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Fop
-   VERSION = "0.7.0"
+   VERSION = "0.8.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fop_lang
  version: !ruby/object:Gem::Version
-   version: 0.7.0
+   version: 0.8.0
  platform: ruby
  authors:
  - Jordan Hollinger
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2021-08-30 00:00:00.000000000 Z
+ date: 2021-09-01 00:00:00.000000000 Z
  dependencies: []
  description: A micro expression language for Filter and OPerations on text
  email: jordan.hollinger@gmail.com