lexical_analyzer 0.2.2 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: cb89405e18a8a6774cf5e844f27d2f9006e70b79
- data.tar.gz: b8bc6349e4ecda409ae4234d3968a49575cb0fbc
+ metadata.gz: 17cf068290c697c20216d7e8a171392caa14dc9f
+ data.tar.gz: 44cc961d916c2b03e226138ff11c0b637f54e52b
  SHA512:
- metadata.gz: 10abdd9d749ff65755199bd82a4fc983efc870398f3c41ddc7f1f3c331361dcfc69d8c7948447932e291b5929298096612e69bc377e7354b3cc1e367db978b83
- data.tar.gz: c9bef6e94b267ee9342267f03f1eaa1d0f03b474fd1113c88ea6169250996b7b36c2aac69437cd56bc988a859549655bd49286d6e1c211787ca9a68ccff700c0
+ metadata.gz: 80a7590b7abd987fc22cdf2be1b547a79ec2b04d4ba39d9961eb71544a03ef1604935e01b66d12da0198cb8f6e5c3e7215cb74d513b018e8ebd90307449eab82
+ data.tar.gz: 8f0483b6d822154daf2757ebeb0fb28a35ab38a0b9e42e8da10615d5d075f7f20c78b633487b10d405792e532fd3a6546ed6568344ba29bc77939c005b8fb912
data/README.md CHANGED
@@ -30,44 +30,63 @@ be analyzed and an array of rules for performing that task.
  ```ruby
  lexical_analyser = LexicalAnalyzer.new(text: text, rules: rules)

+ token = lexical_analyser.get
+
  ```

- #### Rules
+ It is sometimes desirable to reuse an existing lexical analyzer. This can be
+ done with the renew method.
+
+ ```ruby
+ lexical_analyser.renew(text: new_text)

- A rule is an array with two or three elements. These elements are:
+ token = lexical_analyser.get

- rule[0] - a symbol that represents this rule.
+ ```
+
+ Note: The renew method takes the same arguments as the new method: text and an
+ array of rules. If either is omitted, that value is left unchanged. The renew
+ method returns the updated lexical analyzer, just as the new method returns
+ the newly created one.
 
- rule[1] - a regular expression. This must begin with a \\A clause to ensure
- correct operation of the analyzer.
+ #### Rules

- rule[2] - an optional block that generates the output token that corresponds
- to this rule. Some examples of these blocks are:
+ The rules are an array of LexicalRule objects. Each consists of a symbol, a
+ regular expression, and an optional action.

  ```ruby
- # Ignore this input, emit no token.
- Proc.new { false }
+ # Rule with the default block; returns [:equality, "=="] on a match.
+ LexicalRule.new(:equality, /\A==/)

- # The default block that is used if none is given.
- lambda {|symbol, value| [symbol, value] }
+ # Rule with an ignore block; matches are ignored.
+ LexicalRule.new(:spaces, /\A\s+/) {|_value| false }

- # Take the text retrieved and process it further with another analyzer.
- lambda {|_symbol, value| ka.set_text(value).get
+ # Rule with an integer block; returns [:integer, an_integer] on a match.
+ LexicalRule.new(:integer, /\A\d+/) {|value| [@symbol, value.to_i] }

+ # Rule with a block that delegates to a sub-analyzer. Returns the value of
+ # the lexical analyzer captured in the variable ka.
+ LexicalRule.new(:identifier, /\A[a-zA-Z_]\w*(?=\W|$|\z)/) {|value|
+   ka.renew(text: value).get
+ }
  ```

- Note: The order of rules is important. For example, if there are two rules
+ Notes:
+
+ * The regular expression must begin with a \A clause to ensure correct
+   operation of the analyzer.
+ * The order of rules is important. For example, if there are two rules
  looking for "==" and "=" respectively, if the "=" is ahead of the "==" rule
  in the array the "==" rule will never trigger and the analysis will be
  incorrect.

  #### Tokens

- The token is also an array, with two elements.
+ The output token is an array with two elements.

  token[0] - the symbol extracted from the rule that generated this token.

- token[1] - the text that generated this token.
+ token[1] - the text that generated this token or its value.


  #### Example
@@ -88,7 +107,9 @@ action.

  #### Plan B

- Go to the GitHub repository and raise an issue calling attention to some
+ Go to the GitHub repository and raise an
+ [issue](https://github.com/PeterCamilleri/lexical_analyzer/issues)
+ calling attention to some
  aspect that could use some TLC or a suggestion or an idea.

  ## License
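The ordering caveat in the notes above can be demonstrated with a small self-contained sketch. This is plain Ruby, independent of the gem; `first_token` is a hypothetical helper written for illustration, not part of the library:

```ruby
# Minimal sketch of first-match scanning, showing why rule order matters.
# Rules here are simple [symbol, regex] pairs; the gem uses LexicalRule objects.
def first_token(text, rules)
  rules.each do |symbol, regex|
    if md = text.match(regex)
      return [symbol, md.to_s]
    end
  end
  nil
end

bad_order  = [[:assign, /\A=/], [:equality, /\A==/]]
good_order = [[:equality, /\A==/], [:assign, /\A=/]]

p first_token("== b", bad_order)   # => [:assign, "="]    the "==" rule never fires
p first_token("== b", good_order)  # => [:equality, "=="]
```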
data/lib/lexical_analyzer/lexical_rule.rb ADDED
@@ -0,0 +1,24 @@
+ # The Ruby Compiler Toolkit Project - Lexical Rule
+ # A rule for lexical analysis.
+
+ class LexicalRule
+
+   # Create a lexical rule.
+   def initialize(symbol, regex, &action)
+     @symbol = symbol
+     @regex  = regex
+
+     define_singleton_method(:call, &action) if block_given?
+   end
+
+   # Does this rule match?
+   def match(text)
+     text.match(@regex)
+   end
+
+   # The default rule action.
+   def call(value)
+     [@symbol, value]
+   end
+
+ end
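A quick sketch of how the new class behaves. The class body is the one added above; the sample inputs and expected values are illustrative:

```ruby
# LexicalRule, as added in this release.
class LexicalRule
  def initialize(symbol, regex, &action)
    @symbol = symbol
    @regex  = regex

    # A supplied block replaces the default call as a singleton method,
    # so inside it self is the rule and @symbol is visible.
    define_singleton_method(:call, &action) if block_given?
  end

  def match(text)
    text.match(@regex)
  end

  def call(value)
    [@symbol, value]
  end
end

# Default action: returns [symbol, matched text].
equality = LexicalRule.new(:equality, /\A==/)
p equality.call(equality.match("== rest").to_s)  # => [:equality, "=="]

# Custom action: converts the matched text to an Integer.
integer = LexicalRule.new(:integer, /\A\d+/) { |value| [@symbol, value.to_i] }
p integer.call("42")                             # => [:integer, 42]

# Ignore action: returning false tells the analyzer to skip this match.
spaces = LexicalRule.new(:spaces, /\A\s+/) { |_value| false }
p spaces.call("   ")                             # => false
```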
data/lib/lexical_analyzer/version.rb CHANGED
@@ -1,3 +1,3 @@
  class LexicalAnalyzer
-   VERSION = "0.2.2"
+   VERSION = "0.3.0"
  end
data/lib/lexical_analyzer.rb CHANGED
@@ -1,6 +1,7 @@
  # The Ruby Compiler Toolkit Project - Lexical Analyzer
  # Scan input and extract lexical tokens.

+ require_relative 'lexical_analyzer/lexical_rule'
  require_relative 'lexical_analyzer/version'

  # The RCTP class for lexical analysis.
@@ -8,26 +9,25 @@ class LexicalAnalyzer
    attr_reader :text  # Access the text in the analyzer.
    attr_reader :rules # Access the array of lexical rules.

-   # Some array index values.
-   SYMBOL = 0
-   REGEX  = 1
-   BLOCK  = 2
-
-   # The default tokenizer block
-   DTB = lambda {|symbol, value| [symbol, value] }
-
    # Set things up.
    def initialize(text: "", rules: [])
      @text  = text
      @rules = rules
    end

+   # Reuse an existing lexical analyzer.
+   def renew(text: @text, rules: @rules)
+     @text  = text
+     @rules = rules
+     self
+   end
+
    # Get the next lexical token
    def get(extra=[])
      (rules + extra).each do |rule|
-       if match_data = text.match(rule[REGEX])
+       if match_data = rule.match(text)
          @text = match_data.post_match
-         return (rule[BLOCK] || DTB).call(rule[SYMBOL], match_data.to_s) || get
+         return rule.call(match_data.to_s) || get
        end
      end

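Putting the two refactored classes together, here is a sketch of how renew reuses an analyzer. The class bodies are reproduced from this diff; the nil fallback at the end of get is an assumption, since the tail of the method lies outside the hunk shown:

```ruby
# LexicalRule, as added in this release.
class LexicalRule
  def initialize(symbol, regex, &action)
    @symbol = symbol
    @regex  = regex
    define_singleton_method(:call, &action) if block_given?
  end

  def match(text)
    text.match(@regex)
  end

  def call(value)
    [@symbol, value]
  end
end

# LexicalAnalyzer, as refactored in this release.
class LexicalAnalyzer
  attr_reader :text, :rules

  def initialize(text: "", rules: [])
    @text  = text
    @rules = rules
  end

  # Reuse this analyzer; omitted arguments keep their current values.
  def renew(text: @text, rules: @rules)
    @text  = text
    @rules = rules
    self
  end

  def get(extra = [])
    (rules + extra).each do |rule|
      if match_data = rule.match(text)
        @text = match_data.post_match
        # A false action result (an "ignore" rule) recurses to the next token.
        return rule.call(match_data.to_s) || get
      end
    end
    nil # Assumed behavior when no rule matches; not shown in the hunk above.
  end
end

rules = [
  LexicalRule.new(:spaces,  /\A\s+/) { |_v| false },
  LexicalRule.new(:integer, /\A\d+/) { |v| [@symbol, v.to_i] }
]

analyzer = LexicalAnalyzer.new(text: "12 34", rules: rules)
p analyzer.get                   # => [:integer, 12]
p analyzer.get                   # => [:integer, 34]

# renew swaps in new text while keeping the old rules.
p analyzer.renew(text: "5").get  # => [:integer, 5]
```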
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: lexical_analyzer
  version: !ruby/object:Gem::Version
-   version: 0.2.2
+   version: 0.3.0
  platform: ruby
  authors:
  - PeterCamilleri
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-09-30 00:00:00.000000000 Z
+ date: 2018-10-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -80,6 +80,7 @@ files:
  - README.md
  - lexical_analyzer.gemspec
  - lib/lexical_analyzer.rb
+ - lib/lexical_analyzer/lexical_rule.rb
  - lib/lexical_analyzer/version.rb
  - rakefile.rb
  - reek.txt