lexical_analyzer 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/README.md +68 -3
- data/lib/lexical_analyzer.rb +11 -20
- data/lib/lexical_analyzer/version.rb +1 -3
- data/rakefile.rb +6 -0
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 54b488be7553b7be24ea6fa73c51fb35f13ad33d
|
4
|
+
data.tar.gz: 42df79a772749f16ab5af6b902b84897086385dc
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 41c89261b16eb1b5f626e5d824a59d1a84e0aad0d11eda7e15454fcc243f3f6c1e85645d96e7caa4b06fce90ff0007ad1796d84d15391e160df4e3f06f9a1c82
|
7
|
+
data.tar.gz: e9352a9008ae1353193c277162e0b9979e89e495c2dfa5d78e9407d088b612d483f76603dea29e0d0c4650f72dcca0aa95704613ed60993b311b744d825acda1
|
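The new checksums.yaml values can be checked against the packaged files. The following is a minimal sketch, assuming the .gem archive has already been unpacked so that metadata.gz and data.tar.gz sit on disk, and assuming the recorded values are hex digests of those files' raw bytes. The helper name `gem_member_digests` is hypothetical and not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper (not a RubyGems API): returns the SHA1 and SHA512
# hex digests of a file's raw bytes, keyed like the sections of checksums.yaml.
def gem_member_digests(path)
  bytes = File.binread(path)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end
```

Calling `gem_member_digests('data.tar.gz')['SHA1']` on the unpacked 0.2.0 archive would then be compared by hand with the `data.tar.gz:` entry listed under `SHA1:` above.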
data/README.md
CHANGED
@@ -1,6 +1,10 @@
|
|
1
1
|
# LexicalAnalyzer
|
2
2
|
|
3
|
-
|
3
|
+
The lexical analyzer is a component of the Ruby Compiler Toolkit Project that
|
4
|
+
scans an input text against an array of rules and generates the lexical
|
5
|
+
tokens that it detects. It is normally used in conjunction with a parse queue
|
6
|
+
object which handles queuing of tokens and backtracking of the compile process
|
7
|
+
when needed.
|
4
8
|
|
5
9
|
## Installation
|
6
10
|
|
@@ -20,11 +24,72 @@ Or install it yourself as:
|
|
20
24
|
|
21
25
|
## Usage
|
22
26
|
|
23
|
-
|
27
|
+
A lexical analyzer object is created with two keyword parameters, the text to
|
28
|
+
be analyzed and an array of rules for performing that task.
|
29
|
+
|
30
|
+
```ruby
|
31
|
+
lexical_analyser = LexicalAnalyzer.new(text: text, rules: rules)
|
32
|
+
|
33
|
+
```
|
34
|
+
|
35
|
+
#### Rules
|
36
|
+
|
37
|
+
A rule is an array with two or three elements. These elements are:
|
38
|
+
|
39
|
+
rule[0] - a symbol that represents this rule.
|
40
|
+
|
41
|
+
rule[1] - a regular expression. This must begin with a \\A clause to ensure
|
42
|
+
correct operation of the analyzer.
|
43
|
+
|
44
|
+
rule[2] - an optional block that generates the output token that corresponds
|
45
|
+
to this rule. Some examples of these blocks are:
|
46
|
+
|
47
|
+
```ruby
|
48
|
+
# Ignore this input, emit no token.
|
49
|
+
Proc.new { false }
|
50
|
+
|
51
|
+
# The default block that is used if none is given.
|
52
|
+
lambda {|symbol, value| [symbol, value] }
|
53
|
+
|
54
|
+
# Take the text retrieved and process it further with another analyzer.
|
55
|
+
lambda {|_symbol, value| ka.set_text(value).get }
|
56
|
+
|
57
|
+
```
|
58
|
+
|
59
|
+
Note: The order of rules is important. For example, if there are two rules
|
60
|
+
looking for "==" and "=" respectively, if the "=" rule is ahead of the "==" rule
|
61
|
+
in the array the "==" rule will never trigger and the analysis will be
|
62
|
+
incorrect.
|
63
|
+
|
64
|
+
#### Tokens
|
65
|
+
|
66
|
+
The token is also an array, with two elements.
|
67
|
+
|
68
|
+
token[0] - the symbol extracted from the rule that generated this token.
|
69
|
+
|
70
|
+
token[1] - the text that generated this token.
|
71
|
+
|
72
|
+
|
73
|
+
#### Example
|
74
|
+
|
75
|
+
The test file "lexical_analyzer_test.rb" has the method
|
76
|
+
test_some_lexical_analyzing that is a really good example of this gem in
|
77
|
+
action.
|
24
78
|
|
25
79
|
## Contributing
|
26
80
|
|
27
|
-
|
81
|
+
#### Plan A
|
82
|
+
|
83
|
+
1. Fork it ( https://github.com/PeterCamilleri/lexical_analyzer/fork )
|
84
|
+
2. Create your feature branch (`git checkout -b my-new-feature`)
|
85
|
+
3. Commit your changes (`git commit -am 'Add some feature'`)
|
86
|
+
4. Push to the branch (`git push origin my-new-feature`)
|
87
|
+
5. Create a new Pull Request
|
88
|
+
|
89
|
+
#### Plan B
|
90
|
+
|
91
|
+
Go to the GitHub repository and raise an issue calling attention to some
|
92
|
+
aspect that could use some TLC or a suggestion or an idea.
|
28
93
|
|
29
94
|
## License
|
30
95
|
|
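The rule, token, and ordering notes in the README above can be tried out with a small example. This is a minimal sketch, not the gem's own test: the class body is copied from the 0.2.0 side of the lib/lexical_analyzer.rb diff so that it runs without the gem installed, and the `scan` helper, the rule names (`:space`, `:equals`, `:assign`, `:name`), and the sample text are assumptions made for illustration.

```ruby
# Copied from the 0.2.0 source shown in this diff (comments shortened).
class LexicalAnalyzer
  attr_accessor :text
  attr_reader :rules

  SYMBOL = 0
  REGEX  = 1
  BLOCK  = 2

  # The default tokenizer block
  DTB = lambda {|symbol, value| [symbol, value] }

  def initialize(text: "", rules: [])
    @text  = text
    @rules = rules
  end

  # Get the next lexical token
  def get
    rules.each do |rule|
      if match_data = text.match(rule[REGEX])
        @text = match_data.post_match
        return (rule[BLOCK] || DTB).call(rule[SYMBOL], match_data.to_s) || get
      end
    end

    false
  end
end

# Hypothetical helper: collect tokens until get returns false.
def scan(text, rules)
  analyzer = LexicalAnalyzer.new(text: text, rules: rules)
  tokens = []
  while (token = analyzer.get)
    tokens << token
  end
  tokens
end

# "==" is listed before "=", so the longer operator is tried first.
GOOD_RULES = [
  [:space,  /\A\s+/, Proc.new { false }],  # matched but emits no token
  [:equals, /\A==/],
  [:assign, /\A=/],
  [:name,   /\A[a-z]+/]
]

# Same rules with "=" ahead of "==": the :equals rule can never trigger.
BAD_RULES = [
  [:space,  /\A\s+/, Proc.new { false }],
  [:assign, /\A=/],
  [:equals, /\A==/],
  [:name,   /\A[a-z]+/]
]

p scan("a == b = c", GOOD_RULES)
# → [[:name, "a"], [:equals, "=="], [:name, "b"], [:assign, "="], [:name, "c"]]

p scan("a == b", BAD_RULES)
# → [[:name, "a"], [:assign, "="], [:assign, "="], [:name, "b"]]
```

Note that `get` returns `false` both at the end of the input and when no rule matches the remaining text, so a caller that needs to tell the two apart could check whether `analyzer.text` is empty after the loop.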
data/lib/lexical_analyzer.rb
CHANGED
@@ -1,44 +1,35 @@
|
|
1
|
-
# coding: utf-8
|
2
|
-
|
3
|
-
require_relative 'lexical_analyzer/version'
|
4
|
-
|
5
1
|
# The Ruby Compiler Toolkit Project - Lexical Analyzer
|
6
2
|
# Scan input and extract lexical tokens.
|
7
3
|
|
8
|
-
|
4
|
+
require_relative 'lexical_analyzer/version'
|
9
5
|
|
10
|
-
|
11
|
-
|
6
|
+
class LexicalAnalyzer
|
7
|
+
attr_accessor :text # Access the text in the analyzer.
|
8
|
+
attr_reader :rules # Access the array of lexical rules.
|
12
9
|
|
13
|
-
#
|
14
|
-
|
10
|
+
# Some array index values.
|
11
|
+
SYMBOL = 0
|
12
|
+
REGEX = 1
|
13
|
+
BLOCK = 2
|
15
14
|
|
16
15
|
# The default tokenizer block
|
17
16
|
DTB = lambda {|symbol, value| [symbol, value] }
|
18
17
|
|
19
18
|
# Set things up.
|
20
19
|
def initialize(text: "", rules: [])
|
21
|
-
@text
|
20
|
+
@text = text
|
22
21
|
@rules = rules
|
23
22
|
end
|
24
23
|
|
25
|
-
# Set the text.
|
26
|
-
def set_text(text)
|
27
|
-
@text = text
|
28
|
-
self
|
29
|
-
end
|
30
|
-
|
31
24
|
# Get the next lexical token
|
32
25
|
def get
|
33
26
|
rules.each do |rule|
|
34
|
-
if match_data = text.match(rule[
|
27
|
+
if match_data = text.match(rule[REGEX])
|
35
28
|
@text = match_data.post_match
|
36
|
-
|
37
|
-
return (rule[2] || DTB).call(rule[0], match_data.to_s) || get
|
29
|
+
return (rule[BLOCK] || DTB).call(rule[SYMBOL], match_data.to_s) || get
|
38
30
|
end
|
39
31
|
end
|
40
32
|
|
41
33
|
false
|
42
34
|
end
|
43
|
-
|
44
35
|
end
|
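The 0.2.0 side of this file removes `set_text` and exposes the text through `attr_accessor :text` instead, while the README added in the same release still shows `ka.set_text(value).get`. The following is a minimal sketch of how the "process it further with another analyzer" pattern might be written against the 0.2.0 class. The class body is copied from this diff so the sketch runs without the gem installed. The constants `KEYWORDS` and `WORD_RULES`, the rule names, and the sample text are assumptions made for illustration.

```ruby
# Copied from the 0.2.0 source shown in this diff (comments shortened).
class LexicalAnalyzer
  attr_accessor :text
  attr_reader :rules

  SYMBOL = 0
  REGEX  = 1
  BLOCK  = 2

  DTB = lambda {|symbol, value| [symbol, value] }

  def initialize(text: "", rules: [])
    @text  = text
    @rules = rules
  end

  def get
    rules.each do |rule|
      if match_data = text.match(rule[REGEX])
        @text = match_data.post_match
        return (rule[BLOCK] || DTB).call(rule[SYMBOL], match_data.to_s) || get
      end
    end

    false
  end
end

# Hypothetical secondary analyzer that classifies a whole word.
KEYWORDS = LexicalAnalyzer.new(rules: [
  [:if,         /\Aif\z/],
  [:identifier, /\A.+/]
])

# With set_text gone, assign through the text= writer, then call get.
WORD_RULES = [
  [:space, /\A\s+/, Proc.new { false }],
  [:word,  /\A[a-z]+/, lambda {|_symbol, value| KEYWORDS.text = value; KEYWORDS.get }]
]

analyzer = LexicalAnalyzer.new(text: "if x", rules: WORD_RULES)
p analyzer.get  # → [:if, "if"]
p analyzer.get  # → [:identifier, "x"]
p analyzer.get  # → false
```

Unlike `set_text`, the `text=` writer returns the assigned value rather than the analyzer, so the assignment and the call to `get` are two separate statements here instead of one chained call.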
data/rakefile.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: lexical_analyzer
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- PeterCamilleri
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2018-08-
|
11
|
+
date: 2018-08-31 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|