lexical_analyzer 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/README.md +68 -3
- data/lib/lexical_analyzer.rb +11 -20
- data/lib/lexical_analyzer/version.rb +1 -3
- data/rakefile.rb +6 -0
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 54b488be7553b7be24ea6fa73c51fb35f13ad33d
+  data.tar.gz: 42df79a772749f16ab5af6b902b84897086385dc
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 41c89261b16eb1b5f626e5d824a59d1a84e0aad0d11eda7e15454fcc243f3f6c1e85645d96e7caa4b06fce90ff0007ad1796d84d15391e160df4e3f06f9a1c82
+  data.tar.gz: e9352a9008ae1353193c277162e0b9979e89e495c2dfa5d78e9407d088b612d483f76603dea29e0d0c4650f72dcca0aa95704613ed60993b311b744d825acda1
data/README.md
CHANGED
@@ -1,6 +1,10 @@
 # LexicalAnalyzer
 
-
+The lexical analyzer is a component of the Ruby Compiler Toolkit Project that
+scans an input text against an array of rules, generating the lexical
+tokens that it detects. It is normally used in conjunction with a parse queue
+object, which handles queuing of tokens and backtracking of the compile
+process when needed.
 
 ## Installation
 
@@ -20,11 +24,72 @@ Or install it yourself as:
 
 ## Usage
 
-
+A lexical analyzer object is created with two keyword parameters: the text to
+be analyzed and an array of rules for performing that task.
+
+```ruby
+lexical_analyser = LexicalAnalyzer.new(text: text, rules: rules)
+```
+
+#### Rules
+
+A rule is an array with two or three elements. These elements are:
+
+rule[0] - a symbol that represents this rule.
+
+rule[1] - a regular expression. This must begin with a \A clause to ensure
+correct operation of the analyzer.
+
+rule[2] - an optional block that generates the output token that corresponds
+to this rule. Some examples of these blocks are:
+
+```ruby
+# Ignore this input; emit no token.
+Proc.new { false }
+
+# The default block that is used if none is given.
+lambda {|symbol, value| [symbol, value] }
+
+# Take the text retrieved and process it further with another analyzer.
+lambda {|_symbol, value| ka.set_text(value).get }
+```
+
+Note: The order of rules is important. For example, if there are two rules
+looking for "==" and "=" respectively, and the "=" rule is ahead of the "=="
+rule in the array, the "==" rule will never trigger and the analysis will be
+incorrect.
+
+#### Tokens
+
+The token is also an array, with two elements.
+
+token[0] - the symbol extracted from the rule that generated this token.
+
+token[1] - the text that generated this token.
+
+#### Example
+
+The test file "lexical_analyzer_test.rb" has the method
+test_some_lexical_analyzing, which is a really good example of this gem in
+action.
 
 ## Contributing
 
-
+#### Plan A
+
+1. Fork it ( https://github.com/PeterCamilleri/lexical_analyzer/fork )
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create a new Pull Request
+
+#### Plan B
+
+Go to the GitHub repository and raise an issue calling attention to some
+aspect that could use some TLC, or offering a suggestion or an idea.
 
 ## License
 
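The rule-ordering caveat in the README diff above can be demonstrated without the gem itself. The sketch below is a minimal first-match scanning loop (a hypothetical `scan` helper, not the gem's API) that mimics the analyzer's behavior of taking the first rule whose regular expression matches:

```ruby
# Minimal sketch of first-match rule scanning, to show why rule order
# matters. The scan helper is illustrative only; it is not part of the gem.
def scan(text, rules)
  tokens = []
  until text.empty?
    rule = rules.find { |(_symbol, regex)| text.match(regex) }
    break unless rule
    match = text.match(rule[1])
    tokens << [rule[0], match.to_s]
    text = match.post_match
  end
  tokens
end

good = [[:equality, /\A==/], [:assign, /\A=/]]   # "==" checked first
bad  = [[:assign, /\A=/], [:equality, /\A==/]]   # "=" shadows "=="

scan("==", good)  # => [[:equality, "=="]]
scan("==", bad)   # => [[:assign, "="], [:assign, "="]]
```

With the badly ordered rules, the input "==" is consumed as two `:assign` tokens and the `:equality` rule never fires.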
data/lib/lexical_analyzer.rb
CHANGED
@@ -1,44 +1,35 @@
-# coding: utf-8
-
-require_relative 'lexical_analyzer/version'
-
 # The Ruby Compiler Toolkit Project - Lexical Analyzer
 # Scan input and extract lexical tokens.
 
-
+require_relative 'lexical_analyzer/version'
 
-
-
+class LexicalAnalyzer
+  attr_accessor :text  # Access the text in the analyzer.
+  attr_reader :rules   # Access the array of lexical rules.
 
-  #
-
+  # Some array index values.
+  SYMBOL = 0
+  REGEX  = 1
+  BLOCK  = 2
 
   # The default tokenizer block
   DTB = lambda {|symbol, value| [symbol, value] }
 
   # Set things up.
   def initialize(text: "", rules: [])
-    @text
+    @text = text
     @rules = rules
   end
 
-  # Set the text.
-  def set_text(text)
-    @text = text
-    self
-  end
-
   # Get the next lexical token
   def get
     rules.each do |rule|
-      if match_data = text.match(rule[
+      if match_data = text.match(rule[REGEX])
         @text = match_data.post_match
-
-        return (rule[2] || DTB).call(rule[0], match_data.to_s) || get
+        return (rule[BLOCK] || DTB).call(rule[SYMBOL], match_data.to_s) || get
       end
     end
 
     false
   end
-
 end
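The 0.2.0 class shown in the diff above is small enough to exercise on its own. The sketch below reproduces the class body from the diff (dropping only the `require_relative` for the version file) and runs it over a tiny input; the two-element rule list and the `Proc.new { false }` whitespace rule are illustrative choices, not taken from the gem's tests:

```ruby
# The LexicalAnalyzer class as it appears in the 0.2.0 diff above,
# minus the require_relative for the version file.
class LexicalAnalyzer
  attr_accessor :text  # Access the text in the analyzer.
  attr_reader :rules   # Access the array of lexical rules.

  # Some array index values.
  SYMBOL = 0
  REGEX  = 1
  BLOCK  = 2

  # The default tokenizer block
  DTB = lambda {|symbol, value| [symbol, value] }

  def initialize(text: "", rules: [])
    @text = text
    @rules = rules
  end

  # Get the next lexical token, or false when no rule matches.
  def get
    rules.each do |rule|
      if match_data = text.match(rule[REGEX])
        @text = match_data.post_match
        return (rule[BLOCK] || DTB).call(rule[SYMBOL], match_data.to_s) || get
      end
    end
    false
  end
end

# Illustrative rules: skip whitespace (block returns false, so get
# recurses to the next token), then match runs of digits.
rules = [
  [:space,  /\A\s+/, Proc.new { false }],
  [:number, /\A\d+/]
]

la = LexicalAnalyzer.new(text: "12 34", rules: rules)
la.get  # => [:number, "12"]
la.get  # => [:number, "34"]
la.get  # => false
```

Note how a block returning `false` makes `get` recurse (`... || get`), so ignored input such as whitespace never surfaces as a token.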
data/rakefile.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: lexical_analyzer
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.0
 platform: ruby
 authors:
 - PeterCamilleri
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-08-
+date: 2018-08-31 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler