klue-langcraft 0.0.7 → 0.1.1

data/lib/klue/langcraft/-brief.md ADDED
@@ -0,0 +1,359 @@
1
+ # Brief for Creating the Parser and Tokenizer and Parsing Our First DSL
2
+
3
+ [ChatGPT Conversation](https://chatgpt.com/share/66efd141-6644-8002-970d-4ad641c54d00)
4
+
5
+ # 1. Parsing Libraries in Ruby
6
+
7
+ Here are three Ruby parsing libraries with their pros and cons:
8
+
9
+ ### Parslet
10
+ **Pros:**
11
+ - Pure Ruby library for constructing parsers using parsing expression grammars (PEG).
12
+ - Intuitive and readable grammar definitions embedded in Ruby code.
13
+ - Actively maintained with compatibility for modern Ruby versions.
14
+
15
+ **Cons:**
16
+ - Can be slower for large inputs due to backtracking.
17
+ - Grammars can become verbose and hard to follow for intricate DSLs.
18
+
19
+ ### Racc
20
+ **Pros:**
21
+ - LALR(1) parser generator that ships with Ruby; since Ruby 3.3 it is a bundled gem rather than a default gem.
22
+ - Generates fast parsers suitable for complex grammars.
23
+ - Maintained under the `ruby` GitHub organization (`ruby/racc`).
24
+
25
+ **Cons:**
26
+ - Steeper learning curve with Yacc-like syntax.
27
+ - Less intuitive for those unfamiliar with parser generators.
28
+
29
+ ### Treetop
30
+ **Pros:**
31
+ - Provides a powerful parsing DSL and supports PEG.
32
+ - Clean syntax with grammars defined in separate files.
33
+ - Memoization for improved parsing performance.
34
+
35
+ **Cons:**
36
+ - Less active development; may not be updated for recent Ruby versions.
37
+ - Potential compatibility issues with newer Ruby releases.
38
+
39
+ **Note:** Based on maintenance and compatibility, Parslet and Racc are more suitable for your needs.
40
+
41
+ ---
42
+
43
+ # 2. Converting the DSL Definition into JSON
44
+
45
+ Expressing your DSL definition as JSON gives the parser a machine-readable schema to drive both parsing and validation. Here's how your DSL definition can be represented in JSON:
46
+
47
+ ```json
48
+ {
49
+ "definition": {
50
+ "name": "workflow",
51
+ "params": [
52
+ {
53
+ "name": "name",
54
+ "type": "positional"
55
+ }
56
+ ],
57
+ "nodes": [
58
+ {
59
+ "name": "description",
60
+ "params": [
61
+ {
62
+ "name": "description",
63
+ "type": "positional"
64
+ }
65
+ ]
66
+ },
67
+ {
68
+ "name": "settings",
69
+ "nodes": [
70
+ {
71
+ "name": "setting",
72
+ "repeat": true,
73
+ "params": [
74
+ {
75
+ "name": "key",
76
+ "type": "declarative"
77
+ },
78
+ {
79
+ "name": "value",
80
+ "type": "positional"
81
+ }
82
+ ]
83
+ }
84
+ ]
85
+ },
86
+ {
87
+ "name": "prompts",
88
+ "nodes": [
89
+ {
90
+ "name": "prompt",
91
+ "repeat": true,
92
+ "params": [
93
+ {
94
+ "name": "key",
95
+ "type": "positional"
96
+ },
97
+ {
98
+ "name": "content",
99
+ "type": "named",
100
+ "default": ""
101
+ }
102
+ ]
103
+ }
104
+ ]
105
+ },
106
+ {
107
+ "name": "section",
108
+ "repeat": true,
109
+ "params": [
110
+ {
111
+ "name": "name",
112
+ "type": "positional"
113
+ }
114
+ ],
115
+ "nodes": [
116
+ {
117
+ "name": "step",
118
+ "repeat": true,
119
+ "params": [
120
+ {
121
+ "name": "key",
122
+ "type": "positional"
123
+ }
124
+ ],
125
+ "nodes": [
126
+ {
127
+ "name": "input",
128
+ "repeat": true,
129
+ "params": [
130
+ {
131
+ "name": "key",
132
+ "type": "positional"
133
+ }
134
+ ]
135
+ },
136
+ {
137
+ "name": "prompt",
138
+ "params": [
139
+ {
140
+ "name": "key",
141
+ "type": "positional"
142
+ }
143
+ ]
144
+ },
145
+ {
146
+ "name": "output",
147
+ "repeat": true,
148
+ "params": [
149
+ {
150
+ "name": "key",
151
+ "type": "positional"
152
+ }
153
+ ]
154
+ }
155
+ ]
156
+ }
157
+ ]
158
+ },
159
+ {
160
+ "name": "actions",
161
+ "nodes": [
162
+ {
163
+ "name": "save",
164
+ "params": []
165
+ },
166
+ {
167
+ "name": "save_json",
168
+ "params": [
169
+ {
170
+ "name": "path",
171
+ "type": "positional"
172
+ }
173
+ ]
174
+ }
175
+ ]
176
+ }
177
+ ]
178
+ }
179
+ }
180
+ ```
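
As a quick check that a definition like this is usable from Ruby, the following minimal sketch loads a trimmed-down copy of the JSON with the standard `json` library and lists every node path. The trimmed schema and the `node_paths` helper are illustrative assumptions, not part of the gem.

```ruby
require 'json'

# Hypothetical, shortened copy of the definition above (illustration only).
schema_json = <<~JSON
  {
    "definition": {
      "name": "workflow",
      "params": [{ "name": "name", "type": "positional" }],
      "nodes": [
        { "name": "settings",
          "nodes": [
            { "name": "setting", "repeat": true,
              "params": [{ "name": "key", "type": "declarative" },
                         { "name": "value", "type": "positional" }] }
          ] },
        { "name": "actions",
          "nodes": [{ "name": "save", "params": [] }] }
      ]
    }
  }
JSON

schema = JSON.parse(schema_json)

# Recursively collect "parent/child" paths for every node in the definition.
# Nodes without a "nodes" key are treated as leaves.
def node_paths(node, prefix = nil)
  path = [prefix, node['name']].compact.join('/')
  children = node['nodes'] || []
  [path] + children.flat_map { |child| node_paths(child, path) }
end

paths = node_paths(schema['definition'])
puts paths
```

Note that some nodes (such as `settings`) have no `params` key at all, so any code that walks the schema should treat a missing key as an empty list.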
181
+
182
+ # 3. Writing a Parser in Raw Ruby
183
+
184
+ Given the simplicity and hierarchical nature of your DSL, you can write a custom parser in Ruby without external libraries. Below is an outline of how to approach this:
185
+
186
+ ### Step 1: Tokenization
187
+ - Create a tokenizer that reads the DSL code and breaks it down into tokens (keywords, symbols, identifiers, strings, etc.).
188
+
189
+ ```ruby
190
+ class Tokenizer
191
+ attr_reader :tokens
192
+
193
+ def initialize(code)
194
+ @code = code
195
+ @tokens = []
196
+ end
197
+
198
+ def tokenize
199
+ # Implement logic to convert code into tokens
200
+ # Handle strings, symbols, keywords, and delimiters
201
+ end
202
+ end
203
+ ```
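
One possible way to fill in `tokenize` is with `StringScanner` from Ruby's standard library. The sketch below is only an illustration: the class name `SketchTokenizer`, the token rules, and the sample DSL text are assumptions, since the brief does not fix the DSL's lexical syntax. Tokens are kept as plain strings so that comparisons such as `peek != 'end'` keep working.

```ruby
require 'strscan'

# Illustrative tokenizer; the token rules below are assumptions, not the gem's grammar.
# Limitation: a quoted string "end" becomes indistinguishable from the keyword end.
class SketchTokenizer
  attr_reader :tokens

  def initialize(code)
    @code = code
    @tokens = []
  end

  def tokenize
    scanner = StringScanner.new(@code)
    until scanner.eos?
      if scanner.skip(/\s+/) || scanner.skip(/#[^\n]*/)
        next # whitespace and comments produce no tokens
      elsif scanner.scan(/"((?:[^"\\]|\\.)*)"/)
        @tokens << scanner[1]      # double-quoted string, quotes dropped
      elsif scanner.scan(/:(\w+)/)
        @tokens << scanner[1]      # :symbol, stored without the colon
      elsif scanner.scan(/\w+:/)
        @tokens << scanner.matched # named-parameter label such as "content:"
      elsif scanner.scan(/\w+/)
        @tokens << scanner.matched # keyword or identifier
      elsif scanner.scan(/,/)
        next # commas only separate parameters
      else
        raise "Unexpected character #{scanner.peek(1).inspect} at offset #{scanner.pos}"
      end
    end
    @tokens
  end
end

# Hypothetical DSL snippet used only to exercise the tokenizer.
tokens = SketchTokenizer.new(<<~DSL).tokenize
  workflow :demo do
    prompts do
      prompt :intro, content: "Say hello" # greeting
    end
  end
DSL
p tokens
```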
204
+
205
+ ### Step 2: Parsing
206
+ - Use recursive descent parsing to process tokens according to the rules defined in your JSON schema.
207
+
208
+ ```ruby
209
+ class Parser
210
+ def initialize(tokens, schema)
211
+ @tokens = tokens
212
+ @schema = schema
213
+ @position = 0
214
+ end
215
+
216
+ def parse
217
+ parse_node(@schema['definition'])
218
+ end
219
+
220
+ private
221
+
222
+ def parse_node(node_schema)
223
+ node = { 'name' => node_schema['name'], 'params' => {}, 'children' => [] }
224
+
225
+ # Parse parameters
226
+ node['params'] = parse_params(node_schema['params'])
227
+
228
+ # If node has child nodes
229
+ if node_schema['nodes']
230
+ # Expect 'do'
231
+ expect('do')
232
+
233
+ # Parse child nodes
234
+ while peek != 'end'
235
+ child_node_schema = match_node_schema(node_schema['nodes'])
236
+ node['children'] << parse_node(child_node_schema)
237
+ end
238
+
239
+ expect('end')
240
+ end
241
+
242
+ node
243
+ end
244
+
245
+ def parse_params(params_schema)
246
+ params = {}
247
+ (params_schema || []).each do |param_schema| # nodes such as 'settings' define no params
248
+ # Extract parameter based on its type
249
+ params[param_schema['name']] = extract_param(param_schema)
250
+ end
251
+ params
252
+ end
253
+
254
+ def extract_param(param_schema)
255
+ # Implement extraction logic based on param_schema['type']
256
+ end
257
+
258
+ def expect(expected_token)
259
+ actual_token = next_token
260
+ if actual_token != expected_token
261
+ raise "Expected '#{expected_token}', got '#{actual_token}'"
262
+ end
263
+ end
264
+
265
+ def next_token
266
+ token = @tokens[@position]
267
+ @position += 1
268
+ token
269
+ end
270
+
271
+ def peek
272
+ @tokens[@position]
273
+ end
274
+
275
+ def match_node_schema(nodes_schema)
276
+ current_token = peek
277
+ nodes_schema.find { |ns| ns['name'] == current_token } || raise("Unknown node '#{current_token}'")
278
+ end
279
+ end
280
+ ```
281
+
282
+ ### Step 3: Building the Abstract Syntax Tree (AST)
283
+ - As you parse, construct an AST that captures both the structural elements and their associated data.
284
+
285
+ #### Example Usage:
286
+
287
+ ```ruby
288
+ # Read DSL code from file
289
+ dsl_code = File.read('workflow_dsl.rb')
290
+
291
+ # Tokenize the DSL code
292
+ tokenizer = Tokenizer.new(dsl_code)
293
+ tokenizer.tokenize
294
+
295
+ # Parse tokens into an AST (schema is assumed to hold the parsed JSON definition from section 2)
296
+ parser = Parser.new(tokenizer.tokens, schema)
297
+ ast = parser.parse
298
+
299
+ # Output the AST
300
+ puts ast.inspect
301
+ ```
302
+
303
+ ### Considerations:
304
+
305
+ #### Parsing Parameters:
306
+ - Implement `extract_param` (called from `parse_params`) to handle the different parameter types:
307
+ - **Positional Parameters:** Take the values that follow the node name (separated by commas or spaces in the source) and assign them in order.
308
+ - **Declarative Parameters:** Use the node name as the parameter value.
309
+ - **Named Parameters:** Look for `key: value` pairs.
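
The three parameter types above could be handled along these lines. This is a minimal sketch under stated assumptions: the `ParamReader` class is hypothetical, tokens are plain strings, the first token of a node is its name, and a named parameter arrives as a label token ending in a colon (for example `content:`) followed by its value.

```ruby
# Hypothetical helper: consumes the tokens of one node and returns its params.
class ParamReader
  def initialize(tokens)
    @tokens = tokens
    @position = 0
  end

  # params_schema is the node's "params" array; nil is treated as no params.
  def read(params_schema)
    node_name = take # the first token is assumed to be the node name
    (params_schema || []).each_with_object({}) do |param, params|
      params[param['name']] = extract(param, node_name)
    end
  end

  private

  def extract(param, node_name)
    case param['type']
    when 'declarative' then node_name # the node name itself is the value
    when 'positional' then take       # next token, in order
    when 'named'
      if @tokens[@position] == "#{param['name']}:"
        @position += 1 # skip the label, then read its value
        take
      else
        param['default'] # label absent: fall back to the schema default
      end
    else
      raise "Unknown parameter type '#{param['type']}'"
    end
  end

  def take
    token = @tokens[@position]
    raise 'Missing parameter value' if token.nil?

    @position += 1
    token
  end
end

prompt_schema = [
  { 'name' => 'key', 'type' => 'positional' },
  { 'name' => 'content', 'type' => 'named', 'default' => '' }
]
setting_schema = [
  { 'name' => 'key', 'type' => 'declarative' },
  { 'name' => 'value', 'type' => 'positional' }
]

p ParamReader.new(['prompt', 'intro', 'content:', 'Say hello']).read(prompt_schema)
p ParamReader.new(%w[model gpt]).read(setting_schema)
```

With a declarative `key`, the token in the node-name position supplies the value, so a `setting` can be written under any name.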
310
+
311
+ #### Handling Repetition:
312
+ - For nodes with `repeat: true`, allow multiple instances by continuing to parse matching nodes until none are found.
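
The parsing loop already accepts any number of matching children, so the remaining work is rejecting duplicates of nodes that are not marked `repeat`. A minimal sketch of such a check follows; `check_repetition` is a hypothetical helper, and the assumption that a node without `repeat: true` may appear at most once is an interpretation of the schema, not something the brief states.

```ruby
# Hypothetical check: raise if a child that lacks "repeat": true appears more than once.
# child_names is the list of child node names in the order they were parsed.
def check_repetition(child_names, nodes_schema)
  counts = child_names.tally
  nodes_schema.each do |node_schema|
    name = node_schema['name']
    next if node_schema['repeat'] || counts.fetch(name, 0) <= 1

    raise ArgumentError, "Node '#{name}' may appear only once, found #{counts[name]}"
  end
  true
end

# Child schemas of "step", copied from the JSON definition (params omitted).
step_nodes = [
  { 'name' => 'input', 'repeat' => true },
  { 'name' => 'prompt' },
  { 'name' => 'output', 'repeat' => true }
]

check_repetition(%w[input input prompt output], step_nodes)
```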
313
+
314
+ ### Advantages:
315
+ - **Simplicity:** Direct control over parsing logic tailored to your DSL.
316
+ - **No Dependencies:** Eliminates issues with outdated libraries.
317
+ - **Customizable:** Easy to modify as your DSL evolves.
318
+
319
+ ### Challenges:
320
+ - **Complexity Management:** As your DSL grows, the parser logic may become more complex.
321
+ - **Testing:** Thorough testing is needed to ensure reliability.
322
+ - **Performance:** May need optimization for large DSL files.
323
+
324
+ ---
325
+
326
+ # 5. Additional Considerations
327
+
328
+ While your immediate focus is on building the engine, keep in mind future integration with tools like IDEs:
329
+
330
+ - **Abstract Syntax Tree (AST):** A well-structured AST can facilitate features like syntax highlighting and code completion.
331
+ - **Language Server Protocol (LSP):** If you decide to provide IDE support, structuring your parser to output data compatible with LSP can be beneficial.
332
+ - **Extensibility:** Designing your parser and data structures with future enhancements in mind can save time later.
333
+
334
+ ---
335
+
336
+ # 6. Conclusion
337
+
338
+ Creating a custom parser in Ruby without external libraries is feasible for your DSL, especially given its hierarchical and relatively simple structure. This approach offers:
339
+
340
+ - **Control and Flexibility:** Tailor the parser to your specific needs without external constraints.
341
+ - **Understanding:** Deepens your knowledge of parsing techniques and the inner workings of your DSL.
342
+ - **Maintainability:** Avoids dependency issues associated with outdated gems.
343
+
344
+ ### Next Steps:
345
+ 1. **Implement the Parser:** Start coding the parser using the outlined approaches.
346
+ 2. **Test with Examples:** Use your existing DSL examples to validate the parser's functionality.
347
+ 3. **Iterate:** Refine the parser based on testing, adding error handling and edge case management as needed.
348
+ - **Parameter Types:** Implement logic for different parameter types (positional, declarative, named, etc.).
349
+ - **Repeating Nodes:** Handle nodes with `repeat: true` by looping until no more matching nodes are found.
350
+ - **Error Handling:** Include meaningful error messages for unexpected tokens or structure violations.
351
+ - **Whitespace and Comments:** Strip or skip them during tokenization so the parser never sees them.
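
For the error-handling point, one option is a dedicated exception that records where parsing failed. The sketch below is an assumption-laden illustration: `ParseError` and `TokenCursor` are hypothetical names, and positions are token indexes rather than source line numbers, because the tokenizer outline does not track lines.

```ruby
# Hypothetical error type carrying the token index where parsing failed.
class ParseError < StandardError
  attr_reader :position

  def initialize(expected, actual, position)
    @position = position
    found = actual.nil? ? 'end of input' : "'#{actual}'"
    super("Expected '#{expected}' at token #{position}, got #{found}")
  end
end

# Minimal token cursor whose expect raises ParseError instead of a bare string.
class TokenCursor
  def initialize(tokens)
    @tokens = tokens
    @position = 0
  end

  def expect(expected)
    actual = @tokens[@position]
    raise ParseError.new(expected, actual, @position) unless actual == expected

    @position += 1
    actual
  end
end

cursor = TokenCursor.new(%w[workflow do])
cursor.expect('workflow')
begin
  cursor.expect('end')
rescue ParseError => e
  puts e.message
end
```

Running out of tokens is reported as `end of input` rather than an empty string, which makes a missing `end` easier to diagnose.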
352
+
353
+ ---
354
+
355
+ # 4. Writing the Parser Without External Libraries
356
+
357
+ You can implement the parser using Ruby's built-in capabilities, focusing on string manipulation and control structures.
358
+
359
+ ### Simplified Parser Example:
data/lib/klue/langcraft/parser.rb ADDED
@@ -0,0 +1,78 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Klue
4
+ module Langcraft
5
+ # Parser class
6
+ class Parser
7
+ def initialize(tokens, schema)
8
+ @tokens = tokens
9
+ @schema = schema
10
+ @position = 0
11
+ end
12
+
13
+ def parse
14
+ parse_node(@schema['definition'])
15
+ end
16
+
17
+ private
18
+
19
+ def parse_node(node_schema)
20
+ node = { 'name' => node_schema['name'], 'params' => {}, 'children' => [] }
21
+
22
+ # Parse parameters
23
+ node['params'] = parse_params(node_schema['params'])
24
+
25
+ # If node has child nodes
26
+ if node_schema['nodes']
27
+ # Expect 'do'
28
+ expect('do')
29
+
30
+ # Parse child nodes
31
+ while peek != 'end'
32
+ child_node_schema = match_node_schema(node_schema['nodes'])
33
+ node['children'] << parse_node(child_node_schema)
34
+ end
35
+
36
+ expect('end')
37
+ end
38
+
39
+ node
40
+ end
41
+
42
+ def parse_params(params_schema)
43
+ params = {}
44
+ (params_schema || []).each do |param_schema| # nodes such as 'settings' define no params
45
+ # Extract parameter based on its type
46
+ params[param_schema['name']] = extract_param(param_schema)
47
+ end
48
+ params
49
+ end
50
+
51
+ def extract_param(param_schema)
52
+ # Implement extraction logic based on param_schema['type']
53
+ end
54
+
55
+ def expect(expected_token)
56
+ actual_token = next_token
57
+ return if actual_token == expected_token
58
+
59
+ raise "Expected '#{expected_token}', got '#{actual_token}'"
60
+ end
61
+
62
+ def next_token
63
+ token = @tokens[@position]
64
+ @position += 1
65
+ token
66
+ end
67
+
68
+ def peek
69
+ @tokens[@position]
70
+ end
71
+
72
+ def match_node_schema(nodes_schema)
73
+ current_token = peek
74
+ nodes_schema.find { |ns| ns['name'] == current_token } || raise("Unknown node '#{current_token}'")
75
+ end
76
+ end
77
+ end
78
+ end
data/lib/klue/langcraft/sample_usage.rb ADDED
@@ -0,0 +1,15 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Read DSL code from file
4
+ dsl_code = File.read('workflow_dsl.rb')
5
+
6
+ # Tokenize the DSL code
7
+ tokenizer = Klue::Langcraft::Tokenizer.new(dsl_code)
8
+ tokenizer.tokenize
9
+
10
+ # Parse tokens into an AST (schema must hold the parsed JSON DSL definition; it is not defined in this file)
11
+ parser = Klue::Langcraft::Parser.new(tokenizer.tokens, schema)
12
+ ast = parser.parse
13
+
14
+ # Output the AST
15
+ puts ast.inspect
@@ -0,0 +1,20 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Klue
4
+ module Langcraft
5
+ # Tokenizer class
6
+ class Tokenizer
7
+ attr_reader :tokens
8
+
9
+ def initialize(code)
10
+ @code = code
11
+ @tokens = []
12
+ end
13
+
14
+ def tokenize
15
+ # Implement logic to convert code into tokens
16
+ # Handle strings, symbols, keywords, and delimiters
17
+ end
18
+ end
19
+ end
20
+ end
data/lib/klue/langcraft/version.rb CHANGED
@@ -2,6 +2,6 @@
2
2
 
3
3
  module Klue
4
4
  module Langcraft
5
- VERSION = '0.0.7'
5
+ VERSION = '0.1.1'
6
6
  end
7
7
  end
data/package-lock.json CHANGED
@@ -1,12 +1,12 @@
1
1
  {
2
2
  "name": "klue-langcraft",
3
- "version": "0.0.7",
3
+ "version": "0.1.1",
4
4
  "lockfileVersion": 3,
5
5
  "requires": true,
6
6
  "packages": {
7
7
  "": {
8
8
  "name": "klue-langcraft",
9
- "version": "0.0.7",
9
+ "version": "0.1.1",
10
10
  "devDependencies": {
11
11
  "@klueless-js/semantic-release-rubygem": "github:klueless-js/semantic-release-rubygem",
12
12
  "@semantic-release/changelog": "^6.0.3",
data/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "klue-langcraft",
3
- "version": "0.0.7",
3
+ "version": "0.1.1",
4
4
  "description": "Domain Specific Language Crafting",
5
5
  "scripts": {
6
6
  "release": "semantic-release"
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: klue-langcraft
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.7
4
+ version: 0.1.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - David Cruwys
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2024-09-21 00:00:00.000000000 Z
11
+ date: 2024-09-22 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: k_log
@@ -48,8 +48,14 @@ files:
48
48
  - Rakefile
49
49
  - bin/console
50
50
  - bin/setup
51
+ - docs/dsl-examples.md
51
52
  - docs/dsl-rules.md
53
+ - docs/dsl-samples/index.md
54
+ - docs/dsl-samples/youtube-launch-optimizer-old.klue
55
+ - docs/dsl-samples/youtube-launch-optimizer-strawberry.json
56
+ - docs/dsl-samples/youtube-launch-optimizer-strawberry.klue
52
57
  - docs/dsl-samples/youtube-launch-optimizer.defn.klue
58
+ - docs/dsl-samples/youtube-launch-optimizer.json
53
59
  - docs/dsl-samples/youtube-launch-optimizer.klue
54
60
  - docs/project-plan/project-plan.md
55
61
  - docs/project-plan/project.drawio
@@ -57,6 +63,10 @@ files:
57
63
  - docs/project-plan/project_in_progress.svg
58
64
  - docs/project-plan/project_todo.svg
59
65
  - lib/klue/langcraft.rb
66
+ - lib/klue/langcraft/-brief.md
67
+ - lib/klue/langcraft/parser.rb
68
+ - lib/klue/langcraft/sample_usage.rb
69
+ - lib/klue/langcraft/tokenizer.rb
60
70
  - lib/klue/langcraft/version.rb
61
71
  - package-lock.json
62
72
  - package.json