klue-langcraft 0.0.7 → 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +7 -0
- data/docs/dsl-examples.md +105 -0
- data/docs/dsl-rules.md +0 -107
- data/docs/dsl-samples/index.md +4 -0
- data/docs/dsl-samples/youtube-launch-optimizer-old.klue +300 -0
- data/docs/dsl-samples/youtube-launch-optimizer-strawberry.json +132 -0
- data/docs/dsl-samples/youtube-launch-optimizer-strawberry.klue +47 -0
- data/docs/dsl-samples/youtube-launch-optimizer.defn.klue +1 -8
- data/docs/dsl-samples/youtube-launch-optimizer.json +337 -0
- data/lib/klue/langcraft/-brief.md +357 -0
- data/lib/klue/langcraft/parser.rb +78 -0
- data/lib/klue/langcraft/sample_usage.rb +15 -0
- data/lib/klue/langcraft/tokenizer.rb +20 -0
- data/lib/klue/langcraft/version.rb +1 -1
- data/package-lock.json +2 -2
- data/package.json +1 -1
- metadata +12 -2
@@ -0,0 +1,357 @@
|
|
1
|
+
# Brief for creating the Parser, Tokenizer and Parsing our first DSL
|
2
|
+
|
3
|
+
# 1. Parsing Libraries in Ruby
|
4
|
+
|
5
|
+
Here are three Ruby parsing libraries with their pros and cons:
|
6
|
+
|
7
|
+
### Parslet
|
8
|
+
**Pros:**
|
9
|
+
- Pure Ruby library for constructing parsers using parsing expression grammars (PEG).
|
10
|
+
- Intuitive and readable grammar definitions embedded in Ruby code.
|
11
|
+
- Actively maintained with compatibility for modern Ruby versions.
|
12
|
+
|
13
|
+
**Cons:**
|
14
|
+
- Can be slower for large inputs due to backtracking.
|
15
|
+
- Verbose grammars can become complex for intricate DSLs.
|
16
|
+
|
17
|
+
### Racc
|
18
|
+
**Pros:**
|
19
|
+
- LALR(1) parser generator that comes standard with Ruby.
|
20
|
+
- Generates fast parsers suitable for complex grammars.
|
21
|
+
- Actively maintained as part of the Ruby language.
|
22
|
+
|
23
|
+
**Cons:**
|
24
|
+
- Steeper learning curve with Yacc-like syntax.
|
25
|
+
- Less intuitive for those unfamiliar with parser generators.
|
26
|
+
|
27
|
+
### Treetop
|
28
|
+
**Pros:**
|
29
|
+
- Provides a powerful parsing DSL and supports PEG.
|
30
|
+
- Clean syntax with grammars defined in separate files.
|
31
|
+
- Memoization for improved parsing performance.
|
32
|
+
|
33
|
+
**Cons:**
|
34
|
+
- Less active development; may not be updated for recent Ruby versions.
|
35
|
+
- Potential compatibility issues with newer Ruby releases.
|
36
|
+
|
37
|
+
**Note:** Based on maintenance and compatibility, Parslet and Racc are more suitable for your needs.
|
38
|
+
|
39
|
+
---
|
40
|
+
|
41
|
+
# 2. Converting the DSL Definition into JSON
|
42
|
+
|
43
|
+
Transforming your DSL definition into JSON will facilitate parsing and validation. Here's how your DSL definition can be represented in JSON:
|
44
|
+
|
45
|
+
```json
|
46
|
+
{
|
47
|
+
"definition": {
|
48
|
+
"name": "workflow",
|
49
|
+
"params": [
|
50
|
+
{
|
51
|
+
"name": "name",
|
52
|
+
"type": "positional"
|
53
|
+
}
|
54
|
+
],
|
55
|
+
"nodes": [
|
56
|
+
{
|
57
|
+
"name": "description",
|
58
|
+
"params": [
|
59
|
+
{
|
60
|
+
"name": "description",
|
61
|
+
"type": "positional"
|
62
|
+
}
|
63
|
+
]
|
64
|
+
},
|
65
|
+
{
|
66
|
+
"name": "settings",
|
67
|
+
"nodes": [
|
68
|
+
{
|
69
|
+
"name": "setting",
|
70
|
+
"repeat": true,
|
71
|
+
"params": [
|
72
|
+
{
|
73
|
+
"name": "key",
|
74
|
+
"type": "declarative"
|
75
|
+
},
|
76
|
+
{
|
77
|
+
"name": "value",
|
78
|
+
"type": "positional"
|
79
|
+
}
|
80
|
+
]
|
81
|
+
}
|
82
|
+
]
|
83
|
+
},
|
84
|
+
{
|
85
|
+
"name": "prompts",
|
86
|
+
"nodes": [
|
87
|
+
{
|
88
|
+
"name": "prompt",
|
89
|
+
"repeat": true,
|
90
|
+
"params": [
|
91
|
+
{
|
92
|
+
"name": "key",
|
93
|
+
"type": "positional"
|
94
|
+
},
|
95
|
+
{
|
96
|
+
"name": "content",
|
97
|
+
"type": "named",
|
98
|
+
"default": ""
|
99
|
+
}
|
100
|
+
]
|
101
|
+
}
|
102
|
+
]
|
103
|
+
},
|
104
|
+
{
|
105
|
+
"name": "section",
|
106
|
+
"repeat": true,
|
107
|
+
"params": [
|
108
|
+
{
|
109
|
+
"name": "name",
|
110
|
+
"type": "positional"
|
111
|
+
}
|
112
|
+
],
|
113
|
+
"nodes": [
|
114
|
+
{
|
115
|
+
"name": "step",
|
116
|
+
"repeat": true,
|
117
|
+
"params": [
|
118
|
+
{
|
119
|
+
"name": "key",
|
120
|
+
"type": "positional"
|
121
|
+
}
|
122
|
+
],
|
123
|
+
"nodes": [
|
124
|
+
{
|
125
|
+
"name": "input",
|
126
|
+
"repeat": true,
|
127
|
+
"params": [
|
128
|
+
{
|
129
|
+
"name": "key",
|
130
|
+
"type": "positional"
|
131
|
+
}
|
132
|
+
]
|
133
|
+
},
|
134
|
+
{
|
135
|
+
"name": "prompt",
|
136
|
+
"params": [
|
137
|
+
{
|
138
|
+
"name": "key",
|
139
|
+
"type": "positional"
|
140
|
+
}
|
141
|
+
]
|
142
|
+
},
|
143
|
+
{
|
144
|
+
"name": "output",
|
145
|
+
"repeat": true,
|
146
|
+
"params": [
|
147
|
+
{
|
148
|
+
"name": "key",
|
149
|
+
"type": "positional"
|
150
|
+
}
|
151
|
+
]
|
152
|
+
}
|
153
|
+
]
|
154
|
+
}
|
155
|
+
]
|
156
|
+
},
|
157
|
+
{
|
158
|
+
"name": "actions",
|
159
|
+
"nodes": [
|
160
|
+
{
|
161
|
+
"name": "save",
|
162
|
+
"params": []
|
163
|
+
},
|
164
|
+
{
|
165
|
+
"name": "save_json",
|
166
|
+
"params": [
|
167
|
+
{
|
168
|
+
"name": "path",
|
169
|
+
"type": "positional"
|
170
|
+
}
|
171
|
+
]
|
172
|
+
}
|
173
|
+
]
|
174
|
+
}
|
175
|
+
]
|
176
|
+
}
|
177
|
+
}
|
178
|
+
```
|
179
|
+
|
180
|
+
# 3. Writing a Parser in Raw Ruby
|
181
|
+
|
182
|
+
Given the simplicity and hierarchical nature of your DSL, you can write a custom parser in Ruby without external libraries. Below is an outline of how to approach this:
|
183
|
+
|
184
|
+
### Step 1: Tokenization
|
185
|
+
- Create a tokenizer that reads the DSL code and breaks it down into tokens (keywords, symbols, identifiers, strings, etc.).
|
186
|
+
|
187
|
+
```ruby
|
188
|
+
class Tokenizer
|
189
|
+
attr_reader :tokens
|
190
|
+
|
191
|
+
def initialize(code)
|
192
|
+
@code = code
|
193
|
+
@tokens = []
|
194
|
+
end
|
195
|
+
|
196
|
+
def tokenize
|
197
|
+
# Implement logic to convert code into tokens
|
198
|
+
# Handle strings, symbols, keywords, and delimiters
|
199
|
+
end
|
200
|
+
end
|
201
|
+
```
|
202
|
+
|
203
|
+
### Step 2: Parsing
|
204
|
+
- Use recursive descent parsing to process tokens according to the rules defined in your JSON schema.
|
205
|
+
|
206
|
+
```ruby
|
207
|
+
class Parser
|
208
|
+
def initialize(tokens, schema)
|
209
|
+
@tokens = tokens
|
210
|
+
@schema = schema
|
211
|
+
@position = 0
|
212
|
+
end
|
213
|
+
|
214
|
+
def parse
|
215
|
+
parse_node(@schema['definition'])
|
216
|
+
end
|
217
|
+
|
218
|
+
private
|
219
|
+
|
220
|
+
def parse_node(node_schema)
|
221
|
+
node = { 'name' => node_schema['name'], 'params' => {}, 'children' => [] }
|
222
|
+
|
223
|
+
# Parse parameters
|
224
|
+
node['params'] = parse_params(node_schema['params'])
|
225
|
+
|
226
|
+
# If node has child nodes
|
227
|
+
if node_schema['nodes']
|
228
|
+
# Expect 'do'
|
229
|
+
expect('do')
|
230
|
+
|
231
|
+
# Parse child nodes
|
232
|
+
while peek != 'end'
|
233
|
+
child_node_schema = match_node_schema(node_schema['nodes'])
|
234
|
+
node['children'] << parse_node(child_node_schema)
|
235
|
+
end
|
236
|
+
|
237
|
+
expect('end')
|
238
|
+
end
|
239
|
+
|
240
|
+
node
|
241
|
+
end
|
242
|
+
|
243
|
+
def parse_params(params_schema)
|
244
|
+
params = {}
|
245
|
+
params_schema.each do |param_schema|
|
246
|
+
# Extract parameter based on its type
|
247
|
+
params[param_schema['name']] = extract_param(param_schema)
|
248
|
+
end
|
249
|
+
params
|
250
|
+
end
|
251
|
+
|
252
|
+
def extract_param(param_schema)
|
253
|
+
# Implement extraction logic based on param_schema['type']
|
254
|
+
end
|
255
|
+
|
256
|
+
def expect(expected_token)
|
257
|
+
actual_token = next_token
|
258
|
+
if actual_token != expected_token
|
259
|
+
raise "Expected '#{expected_token}', got '#{actual_token}'"
|
260
|
+
end
|
261
|
+
end
|
262
|
+
|
263
|
+
def next_token
|
264
|
+
token = @tokens[@position]
|
265
|
+
@position += 1
|
266
|
+
token
|
267
|
+
end
|
268
|
+
|
269
|
+
def peek
|
270
|
+
@tokens[@position]
|
271
|
+
end
|
272
|
+
|
273
|
+
def match_node_schema(nodes_schema)
|
274
|
+
current_token = peek
|
275
|
+
nodes_schema.find { |ns| ns['name'] == current_token } || raise("Unknown node '#{current_token}'")
|
276
|
+
end
|
277
|
+
end
|
278
|
+
```
|
279
|
+
|
280
|
+
### Step 3: Building the Abstract Syntax Tree (AST)
|
281
|
+
- As you parse, construct an AST that captures both the structural elements and their associated data.
|
282
|
+
|
283
|
+
#### Example Usage:
|
284
|
+
|
285
|
+
```ruby
|
286
|
+
# Read DSL code from file
|
287
|
+
dsl_code = File.read('workflow_dsl.rb')
|
288
|
+
|
289
|
+
# Tokenize the DSL code
|
290
|
+
tokenizer = Tokenizer.new(dsl_code)
|
291
|
+
tokenizer.tokenize
|
292
|
+
|
293
|
+
# Parse tokens into an AST
|
294
|
+
parser = Parser.new(tokenizer.tokens, schema)
|
295
|
+
ast = parser.parse
|
296
|
+
|
297
|
+
# Output the AST
|
298
|
+
puts ast.inspect
|
299
|
+
```
|
300
|
+
|
301
|
+
### Considerations:
|
302
|
+
|
303
|
+
#### Parsing Parameters:
|
304
|
+
- Implement the `parse_params` method to handle different parameter types:
|
305
|
+
- **Positional Parameters:** Split `params_str` by commas or spaces, and assign values in order.
|
306
|
+
- **Declarative Parameters:** Use the node name as the parameter value.
|
307
|
+
- **Named Parameters:** Look for `key: value` pairs.
|
308
|
+
|
309
|
+
#### Handling Repetition:
|
310
|
+
- For nodes with `repeat: true`, allow multiple instances by continuing to parse matching nodes until none are found.
|
311
|
+
|
312
|
+
### Advantages:
|
313
|
+
- **Simplicity:** Direct control over parsing logic tailored to your DSL.
|
314
|
+
- **No Dependencies:** Eliminates issues with outdated libraries.
|
315
|
+
- **Customizable:** Easy to modify as your DSL evolves.
|
316
|
+
|
317
|
+
### Challenges:
|
318
|
+
- **Complexity Management:** As your DSL grows, the parser logic may become more complex.
|
319
|
+
- **Testing:** Thorough testing is needed to ensure reliability.
|
320
|
+
- **Performance:** May need optimization for large DSL files.
|
321
|
+
|
322
|
+
---
|
323
|
+
|
324
|
+
# 5. Additional Considerations
|
325
|
+
|
326
|
+
While your immediate focus is on building the engine, keep in mind future integration with tools like IDEs:
|
327
|
+
|
328
|
+
- **Abstract Syntax Tree (AST):** A well-structured AST can facilitate features like syntax highlighting and code completion.
|
329
|
+
- **Language Server Protocol (LSP):** If you decide to provide IDE support, structuring your parser to output data compatible with LSP can be beneficial.
|
330
|
+
- **Extensibility:** Designing your parser and data structures with future enhancements in mind can save time later.
|
331
|
+
|
332
|
+
---
|
333
|
+
|
334
|
+
# 6. Conclusion
|
335
|
+
|
336
|
+
Creating a custom parser in Ruby without external libraries is feasible for your DSL, especially given its hierarchical and relatively simple structure. This approach offers:
|
337
|
+
|
338
|
+
- **Control and Flexibility:** Tailor the parser to your specific needs without external constraints.
|
339
|
+
- **Understanding:** Deepens your knowledge of parsing techniques and the inner workings of your DSL.
|
340
|
+
- **Maintainability:** Avoids dependency issues associated with outdated gems.
|
341
|
+
|
342
|
+
### Next Steps:
|
343
|
+
1. **Implement the Parser:** Start coding the parser using the outlined approaches.
|
344
|
+
2. **Test with Examples:** Use your existing DSL examples to validate the parser's functionality.
|
345
|
+
3. **Iterate:** Refine the parser based on testing, adding error handling and edge case management as needed.
|
346
|
+
- **Parameter Types:** Implement logic for different parameter types (positional, declarative, named, etc.).
|
347
|
+
- **Repeating Nodes:** Handle nodes with `repeat: true` by looping until no more matching nodes are found.
|
348
|
+
- **Error Handling:** Include meaningful error messages for unexpected tokens or structure violations.
|
349
|
+
- **Whitespace and Comments:** Strip out or ignore to simplify tokenization.
|
350
|
+
|
351
|
+
---
|
352
|
+
|
353
|
+
# 4. Writing the Parser Without External Libraries
|
354
|
+
|
355
|
+
You can implement the parser using Ruby's built-in capabilities, focusing on string manipulation and control structures.
|
356
|
+
|
357
|
+
### Simplified Parser Example:
|
@@ -0,0 +1,78 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Klue
|
4
|
+
module Langcraft
|
5
|
+
# Parser class
|
6
|
+
class Parser
|
7
|
+
def initialize(tokens, schema)
|
8
|
+
@tokens = tokens
|
9
|
+
@schema = schema
|
10
|
+
@position = 0
|
11
|
+
end
|
12
|
+
|
13
|
+
def parse
|
14
|
+
parse_node(@schema['definition'])
|
15
|
+
end
|
16
|
+
|
17
|
+
private
|
18
|
+
|
19
|
+
def parse_node(node_schema)
|
20
|
+
node = { 'name' => node_schema['name'], 'params' => {}, 'children' => [] }
|
21
|
+
|
22
|
+
# Parse parameters
|
23
|
+
node['params'] = parse_params(node_schema['params'])
|
24
|
+
|
25
|
+
# If node has child nodes
|
26
|
+
if node_schema['nodes']
|
27
|
+
# Expect 'do'
|
28
|
+
expect('do')
|
29
|
+
|
30
|
+
# Parse child nodes
|
31
|
+
while peek != 'end'
|
32
|
+
child_node_schema = match_node_schema(node_schema['nodes'])
|
33
|
+
node['children'] << parse_node(child_node_schema)
|
34
|
+
end
|
35
|
+
|
36
|
+
expect('end')
|
37
|
+
end
|
38
|
+
|
39
|
+
node
|
40
|
+
end
|
41
|
+
|
42
|
+
def parse_params(params_schema)
|
43
|
+
params = {}
|
44
|
+
params_schema.each do |param_schema|
|
45
|
+
# Extract parameter based on its type
|
46
|
+
params[param_schema['name']] = extract_param(param_schema)
|
47
|
+
end
|
48
|
+
params
|
49
|
+
end
|
50
|
+
|
51
|
+
def extract_param(param_schema)
|
52
|
+
# Implement extraction logic based on param_schema['type']
|
53
|
+
end
|
54
|
+
|
55
|
+
def expect(expected_token)
|
56
|
+
actual_token = next_token
|
57
|
+
return unless actual_token != expected_token
|
58
|
+
|
59
|
+
raise "Expected '#{expected_token}', got '#{actual_token}'"
|
60
|
+
end
|
61
|
+
|
62
|
+
def next_token
|
63
|
+
token = @tokens[@position]
|
64
|
+
@position += 1
|
65
|
+
token
|
66
|
+
end
|
67
|
+
|
68
|
+
def peek
|
69
|
+
@tokens[@position]
|
70
|
+
end
|
71
|
+
|
72
|
+
def match_node_schema(nodes_schema)
|
73
|
+
current_token = peek
|
74
|
+
nodes_schema.find { |ns| ns['name'] == current_token } || raise("Unknown node '#{current_token}'")
|
75
|
+
end
|
76
|
+
end
|
77
|
+
end
|
78
|
+
end
|
@@ -0,0 +1,15 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
# Read DSL code from file
|
4
|
+
dsl_code = File.read('workflow_dsl.rb')
|
5
|
+
|
6
|
+
# Tokenize the DSL code
|
7
|
+
tokenizer = Tokenizer.new(dsl_code)
|
8
|
+
tokenizer.tokenize
|
9
|
+
|
10
|
+
# Parse tokens into an AST
|
11
|
+
parser = Parser.new(tokenizer.tokens, schema)
|
12
|
+
ast = parser.parse
|
13
|
+
|
14
|
+
# Output the AST
|
15
|
+
puts ast.inspect
|
@@ -0,0 +1,20 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Klue
|
4
|
+
module Langcraft
|
5
|
+
# Tokenizer class
|
6
|
+
class Tokenizer
|
7
|
+
attr_reader :tokens
|
8
|
+
|
9
|
+
def initialize(code)
|
10
|
+
@code = code
|
11
|
+
@tokens = []
|
12
|
+
end
|
13
|
+
|
14
|
+
def tokenize
|
15
|
+
# Implement logic to convert code into tokens
|
16
|
+
# Handle strings, symbols, keywords, and delimiters
|
17
|
+
end
|
18
|
+
end
|
19
|
+
end
|
20
|
+
end
|
data/package-lock.json
CHANGED
@@ -1,12 +1,12 @@
|
|
1
1
|
{
|
2
2
|
"name": "klue-langcraft",
|
3
|
-
"version": "0.0
|
3
|
+
"version": "0.1.0",
|
4
4
|
"lockfileVersion": 3,
|
5
5
|
"requires": true,
|
6
6
|
"packages": {
|
7
7
|
"": {
|
8
8
|
"name": "klue-langcraft",
|
9
|
-
"version": "0.0
|
9
|
+
"version": "0.1.0",
|
10
10
|
"devDependencies": {
|
11
11
|
"@klueless-js/semantic-release-rubygem": "github:klueless-js/semantic-release-rubygem",
|
12
12
|
"@semantic-release/changelog": "^6.0.3",
|
data/package.json
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: klue-langcraft
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0
|
4
|
+
version: 0.1.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- David Cruwys
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date: 2024-09-
|
11
|
+
date: 2024-09-22 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: k_log
|
@@ -48,8 +48,14 @@ files:
|
|
48
48
|
- Rakefile
|
49
49
|
- bin/console
|
50
50
|
- bin/setup
|
51
|
+
- docs/dsl-examples.md
|
51
52
|
- docs/dsl-rules.md
|
53
|
+
- docs/dsl-samples/index.md
|
54
|
+
- docs/dsl-samples/youtube-launch-optimizer-old.klue
|
55
|
+
- docs/dsl-samples/youtube-launch-optimizer-strawberry.json
|
56
|
+
- docs/dsl-samples/youtube-launch-optimizer-strawberry.klue
|
52
57
|
- docs/dsl-samples/youtube-launch-optimizer.defn.klue
|
58
|
+
- docs/dsl-samples/youtube-launch-optimizer.json
|
53
59
|
- docs/dsl-samples/youtube-launch-optimizer.klue
|
54
60
|
- docs/project-plan/project-plan.md
|
55
61
|
- docs/project-plan/project.drawio
|
@@ -57,6 +63,10 @@ files:
|
|
57
63
|
- docs/project-plan/project_in_progress.svg
|
58
64
|
- docs/project-plan/project_todo.svg
|
59
65
|
- lib/klue/langcraft.rb
|
66
|
+
- lib/klue/langcraft/-brief.md
|
67
|
+
- lib/klue/langcraft/parser.rb
|
68
|
+
- lib/klue/langcraft/sample_usage.rb
|
69
|
+
- lib/klue/langcraft/tokenizer.rb
|
60
70
|
- lib/klue/langcraft/version.rb
|
61
71
|
- package-lock.json
|
62
72
|
- package.json
|