atomic_assessments_import 0.2.4 → 0.4.0
- checksums.yaml +4 -4
- data/README.md +21 -1
- data/docs/plans/2026-02-11-flexible-examsoft-importer-design.md +127 -0
- data/docs/plans/2026-02-11-flexible-examsoft-importer-plan.md +2635 -0
- data/lib/atomic_assessments_import/csv/converter.rb +3 -3
- data/lib/atomic_assessments_import/exam_soft/chunker/heading_split_strategy.rb +38 -0
- data/lib/atomic_assessments_import/exam_soft/chunker/horizontal_rule_split_strategy.rb +37 -0
- data/lib/atomic_assessments_import/exam_soft/chunker/metadata_marker_strategy.rb +38 -0
- data/lib/atomic_assessments_import/exam_soft/chunker/numbered_question_strategy.rb +41 -0
- data/lib/atomic_assessments_import/exam_soft/chunker/strategy.rb +22 -0
- data/lib/atomic_assessments_import/exam_soft/chunker.rb +46 -0
- data/lib/atomic_assessments_import/exam_soft/converter.rb +203 -0
- data/lib/atomic_assessments_import/exam_soft/extractor/correct_answer_detector.rb +36 -0
- data/lib/atomic_assessments_import/exam_soft/extractor/feedback_detector.rb +50 -0
- data/lib/atomic_assessments_import/exam_soft/extractor/metadata_detector.rb +37 -0
- data/lib/atomic_assessments_import/exam_soft/extractor/options_detector.rb +44 -0
- data/lib/atomic_assessments_import/exam_soft/extractor/question_stem_detector.rb +44 -0
- data/lib/atomic_assessments_import/exam_soft/extractor/question_type_detector.rb +51 -0
- data/lib/atomic_assessments_import/exam_soft/extractor.rb +96 -0
- data/lib/atomic_assessments_import/exam_soft.rb +10 -0
- data/lib/atomic_assessments_import/questions/cloze_dropdown.rb +62 -0
- data/lib/atomic_assessments_import/questions/essay.rb +20 -0
- data/lib/atomic_assessments_import/questions/fill_in_the_blank.rb +49 -0
- data/lib/atomic_assessments_import/questions/matching.rb +42 -0
- data/lib/atomic_assessments_import/questions/multiple_choice.rb +102 -0
- data/lib/atomic_assessments_import/questions/ordering.rb +53 -0
- data/lib/atomic_assessments_import/questions/question.rb +106 -0
- data/lib/atomic_assessments_import/questions/short_answer.rb +24 -0
- data/lib/atomic_assessments_import/utils.rb +21 -0
- data/lib/atomic_assessments_import/version.rb +1 -1
- data/lib/atomic_assessments_import/writer.rb +1 -1
- data/lib/atomic_assessments_import.rb +31 -12
- metadata +62 -13
- data/lib/atomic_assessments_import/csv/questions/multiple_choice.rb +0 -104
- data/lib/atomic_assessments_import/csv/questions/question.rb +0 -86
- data/lib/atomic_assessments_import/csv/utils.rb +0 -24
|
@@ -0,0 +1,2635 @@
|
|
|
1
|
+
# Flexible ExamSoft Importer Implementation Plan
|
|
2
|
+
|
|
3
|
+
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
|
4
|
+
|
|
5
|
+
**Goal:** Refactor the ExamSoft converter from rigid regex parsing into a flexible chunker + field detector pipeline that handles unknown format variations with best-effort extraction.
|
|
6
|
+
|
|
7
|
+
**Architecture:** Pandoc normalizes input to HTML, Nokogiri parses to DOM, a strategy-based chunker splits into per-question chunks, independent field detectors extract data from each chunk, and the existing Question pipeline produces Learnosity output. Warnings accumulate rather than halting.
|
|
8
|
+
|
|
9
|
+
**Tech Stack:** Ruby, RSpec, Nokogiri (already in bundle), PandocRuby (already in bundle), Learnosity format output
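
The control flow described under **Architecture** can be sketched in plain Ruby, with no Pandoc or Nokogiri involved. This is only an illustration of "warnings accumulate rather than halting": `ImportResult`, `run_pipeline`, and the blank-line chunking are hypothetical names and simplifications, not part of the gem.

```ruby
# Hypothetical stand-in for the pipeline: chunk -> detect fields -> collect warnings.
ImportResult = Struct.new(:questions, :warnings)

OPTION_LINE = /\A\*?[a-z]\)/i # assumed option format: "a) ..." or "*a) ..."

def run_pipeline(text)
  result = ImportResult.new([], [])

  # Naive chunking for illustration only: blank lines separate questions.
  text.split(/\n{2,}/).each_with_index do |chunk, index|
    lines = chunk.lines.map(&:strip).reject(&:empty?)
    next if lines.empty?

    stem = lines.find { |line| !line.match?(OPTION_LINE) }
    options = lines.select { |line| line.match?(OPTION_LINE) }

    # A missing field is recorded as a warning instead of raising.
    result.warnings << "Chunk #{index + 1}: no question stem found" if stem.nil?
    result.warnings << "Chunk #{index + 1}: no options found" if options.empty?

    result.questions << { stem: stem, options: options }
  end

  result
end

result = run_pipeline("1) Capital of France?\na) Paris\nb) London\n\na) Orphan option")
puts result.questions.length
puts result.warnings.inspect
```

Both chunks are kept in `result.questions`; the second one, which has no stem, only adds an entry to `result.warnings`.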
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
### Task 1: Chunking Strategy Base Class + MetadataMarkerStrategy
|
|
14
|
+
|
|
15
|
+
This is the foundation. The MetadataMarkerStrategy replicates the current chunking behavior (split on `Folder:` / `Type:` markers) so we can verify backward compatibility.
|
|
16
|
+
|
|
17
|
+
**Files:**
|
|
18
|
+
- Create: `lib/atomic_assessments_import/exam_soft/chunker/strategy.rb`
|
|
19
|
+
- Create: `lib/atomic_assessments_import/exam_soft/chunker/metadata_marker_strategy.rb`
|
|
20
|
+
- Test: `spec/atomic_assessments_import/examsoft/chunker/metadata_marker_strategy_spec.rb`
|
|
21
|
+
|
|
22
|
+
**Step 1: Write the failing test**
|
|
23
|
+
|
|
24
|
+
Create `spec/atomic_assessments_import/examsoft/chunker/metadata_marker_strategy_spec.rb`:
|
|
25
|
+
|
|
26
|
+
```ruby
|
|
27
|
+
# frozen_string_literal: true
|
|
28
|
+
|
|
29
|
+
require "atomic_assessments_import"
|
|
30
|
+
require "nokogiri"
|
|
31
|
+
|
|
32
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Chunker::MetadataMarkerStrategy do
|
|
33
|
+
describe "#split" do
|
|
34
|
+
it "splits HTML on Folder: markers" do
|
|
35
|
+
html = <<~HTML
|
|
36
|
+
<p>Folder: Geography Title: Q1 Category: Test 1) What is the capital? ~ Explanation</p>
|
|
37
|
+
<p>*a) Paris</p>
|
|
38
|
+
<p>b) London</p>
|
|
39
|
+
<p>Folder: Science Title: Q2 Category: Test 2) What is H2O? ~ Water</p>
|
|
40
|
+
<p>*a) Water</p>
|
|
41
|
+
<p>b) Fire</p>
|
|
42
|
+
HTML
|
|
43
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
44
|
+
strategy = described_class.new
|
|
45
|
+
chunks = strategy.split(doc)
|
|
46
|
+
|
|
47
|
+
expect(chunks.length).to eq(2)
|
|
48
|
+
end
|
|
49
|
+
|
|
50
|
+
it "splits HTML on Type: markers" do
|
|
51
|
+
html = <<~HTML
|
|
52
|
+
<p>Type: MA Folder: Geography Title: Q1 Category: Test 1) Question? ~ Expl</p>
|
|
53
|
+
<p>*a) Answer</p>
|
|
54
|
+
<p>Type: MCQ Folder: Science Title: Q2 Category: Test 2) Question2? ~ Expl</p>
|
|
55
|
+
<p>*a) Answer2</p>
|
|
56
|
+
HTML
|
|
57
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
58
|
+
strategy = described_class.new
|
|
59
|
+
chunks = strategy.split(doc)
|
|
60
|
+
|
|
61
|
+
expect(chunks.length).to eq(2)
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
it "returns empty array when no markers found" do
|
|
65
|
+
html = "<p>Just some text with no markers</p>"
|
|
66
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
67
|
+
strategy = described_class.new
|
|
68
|
+
chunks = strategy.split(doc)
|
|
69
|
+
|
|
70
|
+
expect(chunks).to eq([])
|
|
71
|
+
end
|
|
72
|
+
|
|
73
|
+
it "separates exam header from questions" do
|
|
74
|
+
html = <<~HTML
|
|
75
|
+
<p>Exam: Midterm 2024</p>
|
|
76
|
+
<p>Total Questions: 30</p>
|
|
77
|
+
<p>Folder: Geography Title: Q1 Category: Test 1) Question? ~ Expl</p>
|
|
78
|
+
<p>*a) Answer</p>
|
|
79
|
+
HTML
|
|
80
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
81
|
+
strategy = described_class.new
|
|
82
|
+
chunks = strategy.split(doc)
|
|
83
|
+
|
|
84
|
+
expect(chunks.length).to eq(1)
|
|
85
|
+
expect(strategy.header_nodes).not_to be_empty
|
|
86
|
+
end
|
|
87
|
+
|
|
88
|
+
it "returns chunks as arrays of Nokogiri nodes" do
|
|
89
|
+
html = <<~HTML
|
|
90
|
+
<p>Folder: Geo Title: Q1 Category: Test 1) Question? ~ Expl</p>
|
|
91
|
+
<p>*a) Answer</p>
|
|
92
|
+
<p>b) Wrong</p>
|
|
93
|
+
HTML
|
|
94
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
95
|
+
strategy = described_class.new
|
|
96
|
+
chunks = strategy.split(doc)
|
|
97
|
+
|
|
98
|
+
expect(chunks.length).to eq(1)
|
|
99
|
+
expect(chunks[0]).to all(be_a(Nokogiri::XML::Node))
|
|
100
|
+
end
|
|
101
|
+
end
|
|
102
|
+
end
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
**Step 2: Run test to verify it fails**
|
|
106
|
+
|
|
107
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker/metadata_marker_strategy_spec.rb --format documentation`
|
|
108
|
+
Expected: FAIL — uninitialized constant
|
|
109
|
+
|
|
110
|
+
**Step 3: Write the base Strategy class**
|
|
111
|
+
|
|
112
|
+
Create `lib/atomic_assessments_import/exam_soft/chunker/strategy.rb`:
|
|
113
|
+
|
|
114
|
+
```ruby
|
|
115
|
+
# frozen_string_literal: true
|
|
116
|
+
|
|
117
|
+
module AtomicAssessmentsImport
|
|
118
|
+
module ExamSoft
|
|
119
|
+
module Chunker
|
|
120
|
+
class Strategy
|
|
121
|
+
attr_reader :header_nodes
|
|
122
|
+
|
|
123
|
+
def initialize
|
|
124
|
+
@header_nodes = []
|
|
125
|
+
end
|
|
126
|
+
|
|
127
|
+
# Subclasses implement this. Returns an array of chunks,
|
|
128
|
+
# where each chunk is an array of Nokogiri nodes belonging to one question.
|
|
129
|
+
# Returns empty array if this strategy doesn't apply to the document.
|
|
130
|
+
def split(doc)
|
|
131
|
+
raise NotImplementedError
|
|
132
|
+
end
|
|
133
|
+
end
|
|
134
|
+
end
|
|
135
|
+
end
|
|
136
|
+
end
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
**Step 4: Write MetadataMarkerStrategy**
|
|
140
|
+
|
|
141
|
+
Create `lib/atomic_assessments_import/exam_soft/chunker/metadata_marker_strategy.rb`:
|
|
142
|
+
|
|
143
|
+
```ruby
|
|
144
|
+
# frozen_string_literal: true
|
|
145
|
+
|
|
146
|
+
require_relative "strategy"
|
|
147
|
+
|
|
148
|
+
module AtomicAssessmentsImport
|
|
149
|
+
module ExamSoft
|
|
150
|
+
module Chunker
|
|
151
|
+
class MetadataMarkerStrategy < Strategy
|
|
152
|
+
MARKER_PATTERN = /\A\s*(?:Type:\s*.+?\s+)?Folder:\s*/i
|
|
153
|
+
|
|
154
|
+
def split(doc)
|
|
155
|
+
@header_nodes = []
|
|
156
|
+
chunks = []
|
|
157
|
+
current_chunk = []
|
|
158
|
+
found_first = false
|
|
159
|
+
|
|
160
|
+
doc.children.each do |node|
|
|
161
|
+
text = node.text.strip
|
|
162
|
+
next if text.empty? && !node.name.match?(/^(img|table|hr)$/i)
|
|
163
|
+
|
|
164
|
+
if text.match?(MARKER_PATTERN)
|
|
165
|
+
found_first = true
|
|
166
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
167
|
+
current_chunk = [node]
|
|
168
|
+
elsif found_first
|
|
169
|
+
current_chunk << node
|
|
170
|
+
else
|
|
171
|
+
@header_nodes << node
|
|
172
|
+
end
|
|
173
|
+
end
|
|
174
|
+
|
|
175
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
176
|
+
chunks
|
|
177
|
+
end
|
|
178
|
+
end
|
|
179
|
+
end
|
|
180
|
+
end
|
|
181
|
+
end
|
|
182
|
+
```
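
To see which paragraphs `MARKER_PATTERN` treats as the start of a question, the pattern can be tried on a few strings in isolation. The sample strings below are made up for illustration; only the regular expression is copied from the plan.

```ruby
# Same regular expression as MetadataMarkerStrategy::MARKER_PATTERN in the plan.
MARKER_PATTERN = /\A\s*(?:Type:\s*.+?\s+)?Folder:\s*/i

# Illustrative paragraph texts (assumptions, not taken from a real export).
samples = [
  "Folder: Geography Title: Q1 Category: Test 1) What is the capital?",
  "Type: MA Folder: Geography Title: Q1",
  "*a) Paris",
  "See Folder: Geography",
]

marker_matches = samples.map { |text| text.match?(MARKER_PATTERN) }
samples.zip(marker_matches).each { |text, hit| puts "#{hit}\t#{text}" }
```

The first two samples match and the last two do not, because `\A` anchors the pattern to the start of the paragraph text. A `Type:` prefix is optional, but it only counts when `Folder:` follows it, so a paragraph with `Type:` and no `Folder:` would not start a new chunk.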
|
|
183
|
+
|
|
184
|
+
**Step 5: Run test to verify it passes**
|
|
185
|
+
|
|
186
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker/metadata_marker_strategy_spec.rb --format documentation`
|
|
187
|
+
Expected: PASS
|
|
188
|
+
|
|
189
|
+
**Step 6: Commit**
|
|
190
|
+
|
|
191
|
+
```bash
|
|
192
|
+
git add lib/atomic_assessments_import/exam_soft/chunker/ spec/atomic_assessments_import/examsoft/chunker/
|
|
193
|
+
git commit -m "feat: add chunker base class and MetadataMarkerStrategy"
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
---
|
|
197
|
+
|
|
198
|
+
### Task 2: NumberedQuestionStrategy
|
|
199
|
+
|
|
200
|
+
**Files:**
|
|
201
|
+
- Create: `lib/atomic_assessments_import/exam_soft/chunker/numbered_question_strategy.rb`
|
|
202
|
+
- Test: `spec/atomic_assessments_import/examsoft/chunker/numbered_question_strategy_spec.rb`
|
|
203
|
+
|
|
204
|
+
**Step 1: Write the failing test**
|
|
205
|
+
|
|
206
|
+
Create `spec/atomic_assessments_import/examsoft/chunker/numbered_question_strategy_spec.rb`:
|
|
207
|
+
|
|
208
|
+
```ruby
|
|
209
|
+
# frozen_string_literal: true
|
|
210
|
+
|
|
211
|
+
require "atomic_assessments_import"
|
|
212
|
+
require "nokogiri"
|
|
213
|
+
|
|
214
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Chunker::NumberedQuestionStrategy do
|
|
215
|
+
describe "#split" do
|
|
216
|
+
it "splits on paragraphs starting with number-paren pattern" do
|
|
217
|
+
html = <<~HTML
|
|
218
|
+
<p>1) What is the capital of France?</p>
|
|
219
|
+
<p>a) Paris</p>
|
|
220
|
+
<p>b) London</p>
|
|
221
|
+
<p>2) What is H2O?</p>
|
|
222
|
+
<p>a) Water</p>
|
|
223
|
+
<p>b) Fire</p>
|
|
224
|
+
HTML
|
|
225
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
226
|
+
chunks = described_class.new.split(doc)
|
|
227
|
+
|
|
228
|
+
expect(chunks.length).to eq(2)
|
|
229
|
+
end
|
|
230
|
+
|
|
231
|
+
it "splits on paragraphs starting with number-dot pattern" do
|
|
232
|
+
html = <<~HTML
|
|
233
|
+
<p>1. What is the capital of France?</p>
|
|
234
|
+
<p>a) Paris</p>
|
|
235
|
+
<p>2. What is H2O?</p>
|
|
236
|
+
<p>a) Water</p>
|
|
237
|
+
HTML
|
|
238
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
239
|
+
chunks = described_class.new.split(doc)
|
|
240
|
+
|
|
241
|
+
expect(chunks.length).to eq(2)
|
|
242
|
+
end
|
|
243
|
+
|
|
244
|
+
it "returns empty array when no numbered questions found" do
|
|
245
|
+
html = "<p>Just some regular text</p><p>More text</p>"
|
|
246
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
247
|
+
chunks = described_class.new.split(doc)
|
|
248
|
+
|
|
249
|
+
expect(chunks).to eq([])
|
|
250
|
+
end
|
|
251
|
+
|
|
252
|
+
it "separates header content before first question" do
|
|
253
|
+
html = <<~HTML
|
|
254
|
+
<p>Exam: Midterm</p>
|
|
255
|
+
<p>Total: 30 questions</p>
|
|
256
|
+
<p>1) First question?</p>
|
|
257
|
+
<p>a) Answer</p>
|
|
258
|
+
HTML
|
|
259
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
260
|
+
strategy = described_class.new
|
|
261
|
+
chunks = strategy.split(doc)
|
|
262
|
+
|
|
263
|
+
expect(chunks.length).to eq(1)
|
|
264
|
+
expect(strategy.header_nodes.length).to eq(2)
|
|
265
|
+
end
|
|
266
|
+
|
|
267
|
+
it "does not split on lettered options like a) b) c)" do
|
|
268
|
+
html = <<~HTML
|
|
269
|
+
<p>1) What is the capital of France?</p>
|
|
270
|
+
<p>a) Paris</p>
|
|
271
|
+
<p>b) London</p>
|
|
272
|
+
<p>c) Berlin</p>
|
|
273
|
+
HTML
|
|
274
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
275
|
+
chunks = described_class.new.split(doc)
|
|
276
|
+
|
|
277
|
+
expect(chunks.length).to eq(1)
|
|
278
|
+
end
|
|
279
|
+
end
|
|
280
|
+
end
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
**Step 2: Run test to verify it fails**
|
|
284
|
+
|
|
285
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker/numbered_question_strategy_spec.rb --format documentation`
|
|
286
|
+
Expected: FAIL — uninitialized constant
|
|
287
|
+
|
|
288
|
+
**Step 3: Write implementation**
|
|
289
|
+
|
|
290
|
+
Create `lib/atomic_assessments_import/exam_soft/chunker/numbered_question_strategy.rb`:
|
|
291
|
+
|
|
292
|
+
```ruby
|
|
293
|
+
# frozen_string_literal: true
|
|
294
|
+
|
|
295
|
+
require_relative "strategy"
|
|
296
|
+
|
|
297
|
+
module AtomicAssessmentsImport
|
|
298
|
+
module ExamSoft
|
|
299
|
+
module Chunker
|
|
300
|
+
class NumberedQuestionStrategy < Strategy
|
|
301
|
+
# Matches "1)" or "1." or "12)" etc. at start of text, but NOT single letters like "a)"
|
|
302
|
+
NUMBERED_PATTERN = /\A\s*(\d+)\s*[.)]/
|
|
303
|
+
|
|
304
|
+
def split(doc)
|
|
305
|
+
@header_nodes = []
|
|
306
|
+
chunks = []
|
|
307
|
+
current_chunk = []
|
|
308
|
+
found_first = false
|
|
309
|
+
|
|
310
|
+
doc.children.each do |node|
|
|
311
|
+
text = node.text.strip
|
|
312
|
+
next if text.empty? && !node.name.match?(/^(img|table|hr)$/i)
|
|
313
|
+
|
|
314
|
+
if text.match?(NUMBERED_PATTERN)
|
|
315
|
+
found_first = true
|
|
316
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
317
|
+
current_chunk = [node]
|
|
318
|
+
elsif found_first
|
|
319
|
+
current_chunk << node
|
|
320
|
+
else
|
|
321
|
+
@header_nodes << node
|
|
322
|
+
end
|
|
323
|
+
end
|
|
324
|
+
|
|
325
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
326
|
+
# Only valid if we found more than one chunk (single could be a false positive)
|
|
327
|
+
chunks.length > 1 ? chunks : []
|
|
328
|
+
end
|
|
329
|
+
end
|
|
330
|
+
end
|
|
331
|
+
end
|
|
332
|
+
end
|
|
333
|
+
```
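
A quick check of `NUMBERED_PATTERN` on its own shows why the strategy is cautious about a lone match. The candidate strings are invented for illustration; the regular expression is the one from the plan.

```ruby
# Same regular expression as NumberedQuestionStrategy::NUMBERED_PATTERN in the plan.
NUMBERED_PATTERN = /\A\s*(\d+)\s*[.)]/

# Illustrative paragraph texts (assumptions).
candidates = [
  "1) What is the capital of France?",
  "12. What is H2O?",
  "a) Paris",
  "3.5 grams of salt",
]

numbered = candidates.select { |text| text.match?(NUMBERED_PATTERN) }
puts numbered
```

Lettered options such as `a) Paris` are rejected, but ordinary prose that begins with a decimal such as `3.5 grams of salt` is accepted, which is the kind of false positive the "more than one chunk" guard in `split` is meant to limit.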
|
|
334
|
+
|
|
335
|
+
**Step 4: Run test to verify it passes**
|
|
336
|
+
|
|
337
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker/numbered_question_strategy_spec.rb --format documentation`
|
|
338
|
+
Expected: PASS
|
|
339
|
+
|
|
340
|
+
**Step 5: Commit**
|
|
341
|
+
|
|
342
|
+
```bash
|
|
343
|
+
git add lib/atomic_assessments_import/exam_soft/chunker/numbered_question_strategy.rb spec/atomic_assessments_import/examsoft/chunker/numbered_question_strategy_spec.rb
|
|
344
|
+
git commit -m "feat: add NumberedQuestionStrategy for chunking"
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
---
|
|
348
|
+
|
|
349
|
+
### Task 3: HeadingSplitStrategy + HorizontalRuleSplitStrategy
|
|
350
|
+
|
|
351
|
+
These two are simple and follow the same pattern, so they're combined.
|
|
352
|
+
|
|
353
|
+
**Files:**
|
|
354
|
+
- Create: `lib/atomic_assessments_import/exam_soft/chunker/heading_split_strategy.rb`
|
|
355
|
+
- Create: `lib/atomic_assessments_import/exam_soft/chunker/horizontal_rule_split_strategy.rb`
|
|
356
|
+
- Test: `spec/atomic_assessments_import/examsoft/chunker/heading_split_strategy_spec.rb`
|
|
357
|
+
- Test: `spec/atomic_assessments_import/examsoft/chunker/horizontal_rule_split_strategy_spec.rb`
|
|
358
|
+
|
|
359
|
+
**Step 1: Write failing tests**
|
|
360
|
+
|
|
361
|
+
Create `spec/atomic_assessments_import/examsoft/chunker/heading_split_strategy_spec.rb`:
|
|
362
|
+
|
|
363
|
+
```ruby
|
|
364
|
+
# frozen_string_literal: true
|
|
365
|
+
|
|
366
|
+
require "atomic_assessments_import"
|
|
367
|
+
require "nokogiri"
|
|
368
|
+
|
|
369
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Chunker::HeadingSplitStrategy do
|
|
370
|
+
describe "#split" do
|
|
371
|
+
it "splits on heading tags" do
|
|
372
|
+
html = <<~HTML
|
|
373
|
+
<h2>Question 1</h2>
|
|
374
|
+
<p>What is the capital of France?</p>
|
|
375
|
+
<p>a) Paris</p>
|
|
376
|
+
<h2>Question 2</h2>
|
|
377
|
+
<p>What is H2O?</p>
|
|
378
|
+
<p>a) Water</p>
|
|
379
|
+
HTML
|
|
380
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
381
|
+
chunks = described_class.new.split(doc)
|
|
382
|
+
|
|
383
|
+
expect(chunks.length).to eq(2)
|
|
384
|
+
end
|
|
385
|
+
|
|
386
|
+
it "returns empty array when no headings found" do
|
|
387
|
+
html = "<p>No headings here</p>"
|
|
388
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
389
|
+
chunks = described_class.new.split(doc)
|
|
390
|
+
|
|
391
|
+
expect(chunks).to eq([])
|
|
392
|
+
end
|
|
393
|
+
|
|
394
|
+
it "separates header content before first heading" do
|
|
395
|
+
html = <<~HTML
|
|
396
|
+
<p>Exam header info</p>
|
|
397
|
+
<h2>Question 1</h2>
|
|
398
|
+
<p>What is the capital?</p>
|
|
399
|
+
HTML
|
|
400
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
401
|
+
strategy = described_class.new
|
|
402
|
+
chunks = strategy.split(doc)
|
|
403
|
+
|
|
404
|
+
expect(chunks.length).to eq(1)
|
|
405
|
+
expect(strategy.header_nodes).not_to be_empty
|
|
406
|
+
end
|
|
407
|
+
end
|
|
408
|
+
end
|
|
409
|
+
```
|
|
410
|
+
|
|
411
|
+
Create `spec/atomic_assessments_import/examsoft/chunker/horizontal_rule_split_strategy_spec.rb`:
|
|
412
|
+
|
|
413
|
+
```ruby
|
|
414
|
+
# frozen_string_literal: true
|
|
415
|
+
|
|
416
|
+
require "atomic_assessments_import"
|
|
417
|
+
require "nokogiri"
|
|
418
|
+
|
|
419
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Chunker::HorizontalRuleSplitStrategy do
|
|
420
|
+
describe "#split" do
|
|
421
|
+
it "splits on hr tags" do
|
|
422
|
+
html = <<~HTML
|
|
423
|
+
<p>Question 1: What is the capital of France?</p>
|
|
424
|
+
<p>a) Paris</p>
|
|
425
|
+
<hr/>
|
|
426
|
+
<p>Question 2: What is H2O?</p>
|
|
427
|
+
<p>a) Water</p>
|
|
428
|
+
HTML
|
|
429
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
430
|
+
chunks = described_class.new.split(doc)
|
|
431
|
+
|
|
432
|
+
expect(chunks.length).to eq(2)
|
|
433
|
+
end
|
|
434
|
+
|
|
435
|
+
it "returns empty array when no hr tags found" do
|
|
436
|
+
html = "<p>No rules here</p>"
|
|
437
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
438
|
+
chunks = described_class.new.split(doc)
|
|
439
|
+
|
|
440
|
+
expect(chunks).to eq([])
|
|
441
|
+
end
|
|
442
|
+
|
|
443
|
+
it "separates header content before first hr" do
|
|
444
|
+
html = <<~HTML
|
|
445
|
+
<p>Exam header info</p>
|
|
446
|
+
<hr/>
|
|
447
|
+
<p>Question 1</p>
|
|
448
|
+
HTML
|
|
449
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
450
|
+
strategy = described_class.new
|
|
451
|
+
chunks = strategy.split(doc)
|
|
452
|
+
|
|
453
|
+
expect(chunks.length).to eq(1)
|
|
454
|
+
expect(strategy.header_nodes).not_to be_empty
|
|
455
|
+
end
|
|
456
|
+
end
|
|
457
|
+
end
|
|
458
|
+
```
|
|
459
|
+
|
|
460
|
+
**Step 2: Run tests to verify they fail**
|
|
461
|
+
|
|
462
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker/heading_split_strategy_spec.rb spec/atomic_assessments_import/examsoft/chunker/horizontal_rule_split_strategy_spec.rb --format documentation`
|
|
463
|
+
Expected: FAIL — uninitialized constants
|
|
464
|
+
|
|
465
|
+
**Step 3: Write implementations**
|
|
466
|
+
|
|
467
|
+
Create `lib/atomic_assessments_import/exam_soft/chunker/heading_split_strategy.rb`:
|
|
468
|
+
|
|
469
|
+
```ruby
|
|
470
|
+
# frozen_string_literal: true
|
|
471
|
+
|
|
472
|
+
require_relative "strategy"
|
|
473
|
+
|
|
474
|
+
module AtomicAssessmentsImport
|
|
475
|
+
module ExamSoft
|
|
476
|
+
module Chunker
|
|
477
|
+
class HeadingSplitStrategy < Strategy
|
|
478
|
+
HEADING_PATTERN = /^h[1-6]$/i
|
|
479
|
+
|
|
480
|
+
def split(doc)
|
|
481
|
+
@header_nodes = []
|
|
482
|
+
chunks = []
|
|
483
|
+
current_chunk = []
|
|
484
|
+
found_first = false
|
|
485
|
+
|
|
486
|
+
doc.children.each do |node|
|
|
487
|
+
if node.name.match?(HEADING_PATTERN)
|
|
488
|
+
found_first = true
|
|
489
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
490
|
+
current_chunk = [node]
|
|
491
|
+
elsif found_first
|
|
492
|
+
text = node.text.strip
|
|
493
|
+
next if text.empty? && !node.name.match?(/^(img|table|hr)$/i)
|
|
494
|
+
|
|
495
|
+
current_chunk << node
|
|
496
|
+
else
|
|
497
|
+
@header_nodes << node unless node.text.strip.empty?
|
|
498
|
+
end
|
|
499
|
+
end
|
|
500
|
+
|
|
501
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
502
|
+
chunks.length > 1 ? chunks : []
|
|
503
|
+
end
|
|
504
|
+
end
|
|
505
|
+
end
|
|
506
|
+
end
|
|
507
|
+
end
|
|
508
|
+
```
|
|
509
|
+
|
|
510
|
+
Create `lib/atomic_assessments_import/exam_soft/chunker/horizontal_rule_split_strategy.rb`:
|
|
511
|
+
|
|
512
|
+
```ruby
|
|
513
|
+
# frozen_string_literal: true
|
|
514
|
+
|
|
515
|
+
require_relative "strategy"
|
|
516
|
+
|
|
517
|
+
module AtomicAssessmentsImport
|
|
518
|
+
module ExamSoft
|
|
519
|
+
module Chunker
|
|
520
|
+
class HorizontalRuleSplitStrategy < Strategy
|
|
521
|
+
def split(doc)
|
|
522
|
+
@header_nodes = []
|
|
523
|
+
chunks = []
|
|
524
|
+
current_chunk = []
|
|
525
|
+
found_first = false
|
|
526
|
+
|
|
527
|
+
doc.children.each do |node|
|
|
528
|
+
if node.name == "hr"
|
|
529
|
+
if current_chunk.empty? && !found_first
|
|
530
|
+
# Content before first hr with no question content is header
|
|
531
|
+
next
|
|
532
|
+
end
|
|
533
|
+
found_first = true
|
|
534
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
535
|
+
current_chunk = []
|
|
536
|
+
elsif found_first || !chunks.empty?
|
|
537
|
+
text = node.text.strip
|
|
538
|
+
next if text.empty? && !node.name.match?(/^(img|table)$/i)
|
|
539
|
+
|
|
540
|
+
current_chunk << node
|
|
541
|
+
else
|
|
542
|
+
text = node.text.strip
|
|
543
|
+
if text.empty?
|
|
544
|
+
next
|
|
545
|
+
else
|
|
546
|
+
# Before any hr — could be header or first question
|
|
547
|
+
current_chunk << node
|
|
548
|
+
end
|
|
549
|
+
end
|
|
550
|
+
end
|
|
551
|
+
|
|
552
|
+
chunks << current_chunk unless current_chunk.empty?
|
|
553
|
+
|
|
554
|
+
if chunks.length > 1
|
|
555
|
+
chunks
|
|
556
|
+
else
|
|
557
|
+
@header_nodes = []
|
|
558
|
+
[]
|
|
559
|
+
end
|
|
560
|
+
end
|
|
561
|
+
end
|
|
562
|
+
end
|
|
563
|
+
end
|
|
564
|
+
end
|
|
565
|
+
```
|
|
566
|
+
|
|
567
|
+
Note: The HorizontalRuleSplitStrategy differs from the other strategies: the `<hr>` is a separator *between* chunks, not part of a chunk. In the code above, content before the first `<hr>` becomes the first chunk, and a leading `<hr>` with nothing before it is skipped.
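
The separator semantics can be shown with the standard library alone. In this sketch plain strings stand in for Nokogiri nodes and `"hr"` stands in for an `<hr>` element; it uses `Enumerable#slice_when` rather than the loop from the plan, so it illustrates the idea and is not the gem's implementation.

```ruby
# Strings stand in for DOM nodes (an assumption for illustration); "hr" is the separator.
nodes = ["p:Question 1", "p:a) Paris", "hr", "p:Question 2", "p:a) Water"]

hr_chunks = nodes
  .slice_when { |before, after| before == "hr" || after == "hr" } # cut on both sides of a rule
  .reject { |group| group == ["hr"] }                             # the rule itself belongs to no chunk
  .to_a

hr_chunks.each_with_index { |chunk, i| puts "chunk #{i + 1}: #{chunk.inspect}" }
```

With the marker-based strategies the matching node opens the new chunk and stays in it; here the separator is dropped, so neither chunk contains `"hr"`.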
|
|
568
|
+
|
|
569
|
+
**Step 4: Run tests to verify they pass**
|
|
570
|
+
|
|
571
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker/heading_split_strategy_spec.rb spec/atomic_assessments_import/examsoft/chunker/horizontal_rule_split_strategy_spec.rb --format documentation`
|
|
572
|
+
Expected: PASS
|
|
573
|
+
|
|
574
|
+
**Step 5: Commit**
|
|
575
|
+
|
|
576
|
+
```bash
|
|
577
|
+
git add lib/atomic_assessments_import/exam_soft/chunker/heading_split_strategy.rb lib/atomic_assessments_import/exam_soft/chunker/horizontal_rule_split_strategy.rb spec/atomic_assessments_import/examsoft/chunker/heading_split_strategy_spec.rb spec/atomic_assessments_import/examsoft/chunker/horizontal_rule_split_strategy_spec.rb
|
|
578
|
+
git commit -m "feat: add HeadingSplitStrategy and HorizontalRuleSplitStrategy"
|
|
579
|
+
```
|
|
580
|
+
|
|
581
|
+
---
|
|
582
|
+
|
|
583
|
+
### Task 4: Chunker Orchestrator
|
|
584
|
+
|
|
585
|
+
The orchestrator tries each strategy in a fixed priority order and uses the first one that returns a non-empty list of chunks.
|
|
586
|
+
|
|
587
|
+
**Files:**
|
|
588
|
+
- Create: `lib/atomic_assessments_import/exam_soft/chunker.rb`
|
|
589
|
+
- Test: `spec/atomic_assessments_import/examsoft/chunker_spec.rb`
|
|
590
|
+
|
|
591
|
+
**Step 1: Write the failing test**
|
|
592
|
+
|
|
593
|
+
Create `spec/atomic_assessments_import/examsoft/chunker_spec.rb`:
|
|
594
|
+
|
|
595
|
+
```ruby
|
|
596
|
+
# frozen_string_literal: true
|
|
597
|
+
|
|
598
|
+
require "atomic_assessments_import"
|
|
599
|
+
require "nokogiri"
|
|
600
|
+
|
|
601
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Chunker do
|
|
602
|
+
describe "#chunk" do
|
|
603
|
+
it "uses MetadataMarkerStrategy when Folder: markers are present" do
|
|
604
|
+
html = <<~HTML
|
|
605
|
+
<p>Folder: Geo Title: Q1 Category: Test 1) Question? ~ Expl</p>
|
|
606
|
+
<p>*a) Answer</p>
|
|
607
|
+
<p>Folder: Sci Title: Q2 Category: Test 2) Question2? ~ Expl</p>
|
|
608
|
+
<p>*a) Answer2</p>
|
|
609
|
+
HTML
|
|
610
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
611
|
+
chunker = described_class.new(doc)
|
|
612
|
+
result = chunker.chunk
|
|
613
|
+
|
|
614
|
+
expect(result[:chunks].length).to eq(2)
|
|
615
|
+
end
|
|
616
|
+
|
|
617
|
+
it "falls back to NumberedQuestionStrategy when no metadata markers" do
|
|
618
|
+
html = <<~HTML
|
|
619
|
+
<p>1) What is the capital of France?</p>
|
|
620
|
+
<p>a) Paris</p>
|
|
621
|
+
<p>b) London</p>
|
|
622
|
+
<p>2) What is H2O?</p>
|
|
623
|
+
<p>a) Water</p>
|
|
624
|
+
<p>b) Fire</p>
|
|
625
|
+
HTML
|
|
626
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
627
|
+
chunker = described_class.new(doc)
|
|
628
|
+
result = chunker.chunk
|
|
629
|
+
|
|
630
|
+
expect(result[:chunks].length).to eq(2)
|
|
631
|
+
end
|
|
632
|
+
|
|
633
|
+
it "falls back to HeadingSplitStrategy when no numbers" do
|
|
634
|
+
html = <<~HTML
|
|
635
|
+
<h2>Question 1</h2>
|
|
636
|
+
<p>What is the capital?</p>
|
|
637
|
+
<p>a) Paris</p>
|
|
638
|
+
<h2>Question 2</h2>
|
|
639
|
+
<p>What is H2O?</p>
|
|
640
|
+
<p>a) Water</p>
|
|
641
|
+
HTML
|
|
642
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
643
|
+
chunker = described_class.new(doc)
|
|
644
|
+
result = chunker.chunk
|
|
645
|
+
|
|
646
|
+
expect(result[:chunks].length).to eq(2)
|
|
647
|
+
end
|
|
648
|
+
|
|
649
|
+
it "returns whole document as single chunk when no strategy matches" do
|
|
650
|
+
html = <<~HTML
|
|
651
|
+
<p>Some question text here</p>
|
|
652
|
+
<p>a) An option</p>
|
|
653
|
+
HTML
|
|
654
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
655
|
+
chunker = described_class.new(doc)
|
|
656
|
+
result = chunker.chunk
|
|
657
|
+
|
|
658
|
+
expect(result[:chunks].length).to eq(1)
|
|
659
|
+
expect(result[:warnings]).to include(/No chunking strategy/i)
|
|
660
|
+
end
|
|
661
|
+
|
|
662
|
+
it "extracts header nodes" do
|
|
663
|
+
html = <<~HTML
|
|
664
|
+
<p>Exam: Midterm 2024</p>
|
|
665
|
+
<p>Total Questions: 30</p>
|
|
666
|
+
<p>Folder: Geo Title: Q1 Category: Test 1) Question? ~ Expl</p>
|
|
667
|
+
<p>*a) Answer</p>
|
|
668
|
+
HTML
|
|
669
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
670
|
+
chunker = described_class.new(doc)
|
|
671
|
+
result = chunker.chunk
|
|
672
|
+
|
|
673
|
+
expect(result[:header_nodes]).not_to be_empty
|
|
674
|
+
end
|
|
675
|
+
end
|
|
676
|
+
end
|
|
677
|
+
```
|
|
678
|
+
|
|
679
|
+
**Step 2: Run test to verify it fails**
|
|
680
|
+
|
|
681
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker_spec.rb --format documentation`
|
|
682
|
+
Expected: FAIL
|
|
683
|
+
|
|
684
|
+
**Step 3: Write implementation**
|
|
685
|
+
|
|
686
|
+
Create `lib/atomic_assessments_import/exam_soft/chunker.rb`:
|
|
687
|
+
|
|
688
|
+
```ruby
|
|
689
|
+
# frozen_string_literal: true
|
|
690
|
+
|
|
691
|
+
require_relative "chunker/strategy"
|
|
692
|
+
require_relative "chunker/metadata_marker_strategy"
|
|
693
|
+
require_relative "chunker/numbered_question_strategy"
|
|
694
|
+
require_relative "chunker/heading_split_strategy"
|
|
695
|
+
require_relative "chunker/horizontal_rule_split_strategy"
|
|
696
|
+
|
|
697
|
+
module AtomicAssessmentsImport
|
|
698
|
+
module ExamSoft
|
|
699
|
+
class Chunker
|
|
700
|
+
STRATEGIES = [
|
|
701
|
+
Chunker::MetadataMarkerStrategy,
|
|
702
|
+
Chunker::NumberedQuestionStrategy,
|
|
703
|
+
Chunker::HeadingSplitStrategy,
|
|
704
|
+
Chunker::HorizontalRuleSplitStrategy,
|
|
705
|
+
].freeze
|
|
706
|
+
|
|
707
|
+
def initialize(doc)
|
|
708
|
+
@doc = doc
|
|
709
|
+
end
|
|
710
|
+
|
|
711
|
+
def chunk
|
|
712
|
+
warnings = []
|
|
713
|
+
|
|
714
|
+
STRATEGIES.each do |strategy_class|
|
|
715
|
+
strategy = strategy_class.new
|
|
716
|
+
chunks = strategy.split(@doc)
|
|
717
|
+
next if chunks.empty?
|
|
718
|
+
|
|
719
|
+
return {
|
|
720
|
+
chunks: chunks,
|
|
721
|
+
header_nodes: strategy.header_nodes,
|
|
722
|
+
warnings: warnings,
|
|
723
|
+
}
|
|
724
|
+
end
|
|
725
|
+
|
|
726
|
+
# No strategy matched — return entire document as one chunk
|
|
727
|
+
all_nodes = @doc.children.reject { |n| n.text.strip.empty? && !n.name.match?(/^(img|table|hr)$/i) }
|
|
728
|
+
warnings << "No chunking strategy matched. Treating entire document as a single question."
|
|
729
|
+
|
|
730
|
+
{
|
|
731
|
+
chunks: [all_nodes],
|
|
732
|
+
header_nodes: [],
|
|
733
|
+
warnings: warnings,
|
|
734
|
+
}
|
|
735
|
+
end
|
|
736
|
+
end
|
|
737
|
+
end
|
|
738
|
+
end
|
|
739
|
+
```
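
The cascade in `chunk` (first strategy with a non-empty result wins, otherwise a single chunk plus a warning) can be reduced to a few lines over plain strings. `split_on`, `CASCADE`, and `chunk_lines` are hypothetical names for this sketch, and lines of text stand in for Nokogiri nodes.

```ruby
# Hypothetical helper: start a new chunk at every line matching the pattern,
# and drop any leading lines that come before the first match.
def split_on(lines, pattern)
  groups = lines.slice_before { |line| line.match?(pattern) }.to_a
  groups.select { |group| group.first.match?(pattern) }
end

# Ordered from most to least specific, in the spirit of STRATEGIES in the plan.
CASCADE = [
  [:metadata_marker, ->(lines) { split_on(lines, /\AFolder:/i) }],
  [:numbered, ->(lines) { split_on(lines, /\A\d+\s*[.)]/) }],
].freeze

def chunk_lines(lines)
  CASCADE.each do |name, strategy|
    chunks = strategy.call(lines)
    return { strategy: name, chunks: chunks, warnings: [] } unless chunks.empty?
  end

  # Fallback mirrors the plan: the whole input becomes one chunk, with a warning.
  { strategy: nil, chunks: [lines], warnings: ["No chunking strategy matched."] }
end

numbered_result = chunk_lines(["1) Capital?", "a) Paris", "2) H2O?", "a) Water"])
fallback_result = chunk_lines(["Some question text", "a) An option"])

puts numbered_result[:strategy].inspect
puts fallback_result[:warnings].inspect
```

Because the first non-empty result is returned immediately, the order of the list decides which strategy handles a document that several strategies could split.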
|
|
740
|
+
|
|
741
|
+
**Step 4: Run test to verify it passes**
|
|
742
|
+
|
|
743
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/chunker_spec.rb --format documentation`
|
|
744
|
+
Expected: PASS
|
|
745
|
+
|
|
746
|
+
**Step 5: Commit**
|
|
747
|
+
|
|
748
|
+
```bash
|
|
749
|
+
git add lib/atomic_assessments_import/exam_soft/chunker.rb spec/atomic_assessments_import/examsoft/chunker_spec.rb
|
|
750
|
+
git commit -m "feat: add Chunker orchestrator with strategy cascade"
|
|
751
|
+
```
|
|
752
|
+
|
|
753
|
+
---
|
|
754
|
+
|
|
755
|
+
### Task 5: Field Detectors — QuestionStem, Options, CorrectAnswer
|
|
756
|
+
|
|
757
|
+
These three are the core detectors needed for MCQ questions.
|
|
758
|
+
|
|
759
|
+
**Files:**
|
|
760
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor/question_stem_detector.rb`
|
|
761
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor/options_detector.rb`
|
|
762
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor/correct_answer_detector.rb`
|
|
763
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor/question_stem_detector_spec.rb`
|
|
764
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor/options_detector_spec.rb`
|
|
765
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor/correct_answer_detector_spec.rb`
|
|
766
|
+
|
|
767
|
+
**Step 1: Write failing tests**
|
|
768
|
+
|
|
769
|
+
Create `spec/atomic_assessments_import/examsoft/extractor/question_stem_detector_spec.rb`:
|
|
770
|
+
|
|
771
|
+
```ruby
|
|
772
|
+
# frozen_string_literal: true
|
|
773
|
+
|
|
774
|
+
require "atomic_assessments_import"
|
|
775
|
+
require "nokogiri"
|
|
776
|
+
|
|
777
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor::QuestionStemDetector do
|
|
778
|
+
def nodes_from(html)
|
|
779
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
780
|
+
end
|
|
781
|
+
|
|
782
|
+
describe "#detect" do
|
|
783
|
+
it "extracts question text before options" do
|
|
784
|
+
nodes = nodes_from(<<~HTML)
|
|
785
|
+
<p>1) What is the capital of France?</p>
|
|
786
|
+
<p>a) Paris</p>
|
|
787
|
+
<p>b) London</p>
|
|
788
|
+
HTML
|
|
789
|
+
result = described_class.new(nodes).detect
|
|
790
|
+
|
|
791
|
+
expect(result).to eq("What is the capital of France?")
|
|
792
|
+
end
|
|
793
|
+
|
|
794
|
+
it "extracts question text with tilde-separated explanation removed" do
|
|
795
|
+
nodes = nodes_from(<<~HTML)
|
|
796
|
+
<p>Folder: Geo Title: Q1 Category: Test 1) What is the capital? ~ Paris is the capital.</p>
|
|
797
|
+
<p>*a) Paris</p>
|
|
798
|
+
HTML
|
|
799
|
+
result = described_class.new(nodes).detect
|
|
800
|
+
|
|
801
|
+
expect(result).to eq("What is the capital?")
|
|
802
|
+
end
|
|
803
|
+
|
|
804
|
+
it "extracts question text without numbered prefix" do
|
|
805
|
+
nodes = nodes_from(<<~HTML)
|
|
806
|
+
<p>What is the capital of France?</p>
|
|
807
|
+
<p>a) Paris</p>
|
|
808
|
+
HTML
|
|
809
|
+
result = described_class.new(nodes).detect
|
|
810
|
+
|
|
811
|
+
expect(result).to eq("What is the capital of France?")
|
|
812
|
+
end
|
|
813
|
+
|
|
814
|
+
it "returns nil when no question text found" do
|
|
815
|
+
nodes = nodes_from("<p>a) Paris</p><p>b) London</p>")
|
|
816
|
+
result = described_class.new(nodes).detect
|
|
817
|
+
|
|
818
|
+
expect(result).to be_nil
|
|
819
|
+
end
|
|
820
|
+
end
|
|
821
|
+
end
|
|
822
|
+
```
|
|
823
|
+
|
|
824
|
+
Create `spec/atomic_assessments_import/examsoft/extractor/options_detector_spec.rb`:
|
|
825
|
+
|
|
826
|
+
```ruby
|
|
827
|
+
# frozen_string_literal: true
|
|
828
|
+
|
|
829
|
+
require "atomic_assessments_import"
|
|
830
|
+
require "nokogiri"
|
|
831
|
+
|
|
832
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor::OptionsDetector do
|
|
833
|
+
def nodes_from(html)
|
|
834
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
835
|
+
end
|
|
836
|
+
|
|
837
|
+
describe "#detect" do
|
|
838
|
+
it "extracts lettered options with paren format" do
|
|
839
|
+
nodes = nodes_from(<<~HTML)
|
|
840
|
+
<p>Question text</p>
|
|
841
|
+
<p>a) Paris</p>
|
|
842
|
+
<p>b) London</p>
|
|
843
|
+
<p>c) Berlin</p>
|
|
844
|
+
HTML
|
|
845
|
+
result = described_class.new(nodes).detect
|
|
846
|
+
|
|
847
|
+
expect(result.length).to eq(3)
|
|
848
|
+
expect(result[0][:text]).to eq("Paris")
|
|
849
|
+
expect(result[1][:text]).to eq("London")
|
|
850
|
+
expect(result[2][:text]).to eq("Berlin")
|
|
851
|
+
end
|
|
852
|
+
|
|
853
|
+
it "detects correct answer markers with asterisk" do
|
|
854
|
+
nodes = nodes_from(<<~HTML)
|
|
855
|
+
<p>*a) Paris</p>
|
|
856
|
+
<p>b) London</p>
|
|
857
|
+
HTML
|
|
858
|
+
result = described_class.new(nodes).detect
|
|
859
|
+
|
|
860
|
+
expect(result[0][:correct]).to be true
|
|
861
|
+
expect(result[1][:correct]).to be false
|
|
862
|
+
end
|
|
863
|
+
|
|
864
|
+
it "detects correct answer markers with bold" do
|
|
865
|
+
nodes = nodes_from(<<~HTML)
|
|
866
|
+
<p><strong>a) Paris</strong></p>
|
|
867
|
+
<p>b) London</p>
|
|
868
|
+
HTML
|
|
869
|
+
result = described_class.new(nodes).detect
|
|
870
|
+
|
|
871
|
+
expect(result[0][:correct]).to be true
|
|
872
|
+
expect(result[1][:correct]).to be false
|
|
873
|
+
end
|
|
874
|
+
|
|
875
|
+
it "returns empty array when no options found" do
|
|
876
|
+
nodes = nodes_from("<p>Just a paragraph</p>")
|
|
877
|
+
result = described_class.new(nodes).detect
|
|
878
|
+
|
|
879
|
+
expect(result).to eq([])
|
|
880
|
+
end
|
|
881
|
+
|
|
882
|
+
it "handles uppercase letter options" do
|
|
883
|
+
nodes = nodes_from(<<~HTML)
|
|
884
|
+
<p>A) Paris</p>
|
|
885
|
+
<p>B) London</p>
|
|
886
|
+
HTML
|
|
887
|
+
result = described_class.new(nodes).detect
|
|
888
|
+
|
|
889
|
+
expect(result.length).to eq(2)
|
|
890
|
+
expect(result[0][:text]).to eq("Paris")
|
|
891
|
+
end
|
|
892
|
+
end
|
|
893
|
+
end
|
|
894
|
+
```
|
|
895
|
+
|
|
896
|
+
Create `spec/atomic_assessments_import/examsoft/extractor/correct_answer_detector_spec.rb`:
|
|
897
|
+
|
|
898
|
+
```ruby
|
|
899
|
+
# frozen_string_literal: true
|
|
900
|
+
|
|
901
|
+
require "atomic_assessments_import"
|
|
902
|
+
require "nokogiri"
|
|
903
|
+
|
|
904
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor::CorrectAnswerDetector do
|
|
905
|
+
def nodes_from(html)
|
|
906
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
907
|
+
end
|
|
908
|
+
|
|
909
|
+
describe "#detect" do
|
|
910
|
+
it "detects correct answers from asterisk-marked options" do
|
|
911
|
+
options = [
|
|
912
|
+
{ text: "Paris", letter: "a", correct: true },
|
|
913
|
+
{ text: "London", letter: "b", correct: false },
|
|
914
|
+
]
|
|
915
|
+
result = described_class.new(nodes_from(""), options).detect
|
|
916
|
+
|
|
917
|
+
expect(result).to eq(["a"])
|
|
918
|
+
end
|
|
919
|
+
|
|
920
|
+
it "detects multiple correct answers" do
|
|
921
|
+
options = [
|
|
922
|
+
{ text: "Little Rock", letter: "a", correct: true },
|
|
923
|
+
{ text: "Denver", letter: "b", correct: true },
|
|
924
|
+
{ text: "Detroit", letter: "c", correct: false },
|
|
925
|
+
]
|
|
926
|
+
result = described_class.new(nodes_from(""), options).detect
|
|
927
|
+
|
|
928
|
+
expect(result).to eq(["a", "b"])
|
|
929
|
+
end
|
|
930
|
+
|
|
931
|
+
it "detects correct answer from Answer: label in chunk" do
|
|
932
|
+
nodes = nodes_from("<p>Answer: A</p>")
|
|
933
|
+
options = [
|
|
934
|
+
{ text: "Paris", letter: "a", correct: false },
|
|
935
|
+
{ text: "London", letter: "b", correct: false },
|
|
936
|
+
]
|
|
937
|
+
result = described_class.new(nodes, options).detect
|
|
938
|
+
|
|
939
|
+
expect(result).to eq(["a"])
|
|
940
|
+
end
|
|
941
|
+
|
|
942
|
+
it "returns empty array when no correct answer found" do
|
|
943
|
+
options = [
|
|
944
|
+
{ text: "Paris", letter: "a", correct: false },
|
|
945
|
+
{ text: "London", letter: "b", correct: false },
|
|
946
|
+
]
|
|
947
|
+
result = described_class.new(nodes_from(""), options).detect
|
|
948
|
+
|
|
949
|
+
expect(result).to eq([])
|
|
950
|
+
end
|
|
951
|
+
end
|
|
952
|
+
end
|
|
953
|
+
```
|
|
954
|
+
|
|
955
|
+
**Step 2: Run tests to verify they fail**
|
|
956
|
+
|
|
957
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/extractor/ --format documentation`
|
|
958
|
+
Expected: FAIL — uninitialized constants
|
|
959
|
+
|
|
960
|
+
**Step 3: Write implementations**
|
|
961
|
+
|
|
962
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor/question_stem_detector.rb`:
|
|
963
|
+
|
|
964
|
+
```ruby
|
|
965
|
+
# frozen_string_literal: true
|
|
966
|
+
|
|
967
|
+
module AtomicAssessmentsImport
|
|
968
|
+
module ExamSoft
|
|
969
|
+
module Extractor
|
|
970
|
+
class QuestionStemDetector
|
|
971
|
+
OPTION_PATTERN = /\A\s*\*?[a-oA-O][.)]/
|
|
972
|
+
NUMBERED_PREFIX = /\A\s*\d+\s*[.)]\s*/
|
|
973
|
+
# Strip leading metadata labels up to and including the question number ("1)" or "1.")
METADATA_PREFIX = /\A\s*(?:Type|Folder|Title|Category):.*?(?<=\s)\d+\s*[.)]\s*/m
|
|
974
|
+
TILDE_SPLIT = /\s*~\s*/
|
|
975
|
+
|
|
976
|
+
def initialize(nodes)
|
|
977
|
+
@nodes = nodes
|
|
978
|
+
end
|
|
979
|
+
|
|
980
|
+
def detect
|
|
981
|
+
@nodes.each do |node|
|
|
982
|
+
text = node.text.strip
|
|
983
|
+
next if text.empty?
|
|
984
|
+
next if text.match?(OPTION_PATTERN)
|
|
985
|
+
|
|
986
|
+
# This node contains the question stem (possibly with metadata prefix)
|
|
987
|
+
# Try to extract just the question part
|
|
988
|
+
stem = extract_stem(text)
|
|
989
|
+
return stem unless stem.nil? || stem.empty?
|
|
990
|
+
end
|
|
991
|
+
|
|
992
|
+
nil
|
|
993
|
+
end
|
|
994
|
+
|
|
995
|
+
private
|
|
996
|
+
|
|
997
|
+
def extract_stem(text)
|
|
998
|
+
# Remove metadata prefix if present (Folder:, Title:, Category:, etc.)
|
|
999
|
+
cleaned = text.sub(METADATA_PREFIX, "")
|
|
1000
|
+
# Remove numbered prefix
|
|
1001
|
+
cleaned = cleaned.sub(NUMBERED_PREFIX, "")
|
|
1002
|
+
# Split on tilde (explanation separator) and take the question part
|
|
1003
|
+
cleaned = cleaned.split(TILDE_SPLIT).first
|
|
1004
|
+
cleaned = cleaned&.strip
cleaned unless cleaned.nil? || cleaned.empty?
|
|
1005
|
+
end
|
|
1006
|
+
end
|
|
1007
|
+
end
|
|
1008
|
+
end
|
|
1009
|
+
end
|
|
1010
|
+
```
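
The prefix handling above can be tried on plain strings, without Nokogiri. This is a minimal sketch, assuming a hypothetical helper `strip_to_stem` and a `LABELLED_PREFIX` pattern chosen for this illustration: it drops everything from the first metadata label up to the first question number that follows whitespace, then cuts the text at the tilde.

```ruby
# frozen_string_literal: true

# Hypothetical names for illustration only; not part of the gem's API.
LABELLED_PREFIX = /\A\s*(?:Type|Folder|Title|Category):.*?(?<=\s)\d+\s*[.)]\s*/m
NUMBERED_PREFIX = /\A\s*\d+\s*[.)]\s*/
TILDE_SPLIT = /\s*~\s*/

def strip_to_stem(text)
  # Drop "Folder: ... 1) " style prefixes, then a bare "1) " prefix.
  cleaned = text.sub(LABELLED_PREFIX, "").sub(NUMBERED_PREFIX, "")
  # Everything after the first tilde is treated as explanation text.
  stem = cleaned.split(TILDE_SPLIT).first&.strip
  stem unless stem.nil? || stem.empty?
end

puts strip_to_stem("Folder: Geo Title: Q1 Category: Test 1) What is the capital? ~ Paris is the capital.")
puts strip_to_stem("1) What is the capital of France?")
```

The lookbehind `(?<=\s)` keeps the digit inside a title such as `Q1` from being mistaken for the question number. A title that itself contains something like `2.` after a space would still end the prefix early, so real exports may need a stricter pattern.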
|
|
1011
|
+
|
|
1012
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor/options_detector.rb`:
|
|
1013
|
+
|
|
1014
|
+
```ruby
|
|
1015
|
+
# frozen_string_literal: true
|
|
1016
|
+
|
|
1017
|
+
module AtomicAssessmentsImport
|
|
1018
|
+
module ExamSoft
|
|
1019
|
+
module Extractor
|
|
1020
|
+
class OptionsDetector
|
|
1021
|
+
OPTION_PATTERN = /\A\s*(\*?)([a-oA-O])\s*[.)]\s*(.+)/m
|
|
1022
|
+
|
|
1023
|
+
def initialize(nodes)
|
|
1024
|
+
@nodes = nodes
|
|
1025
|
+
end
|
|
1026
|
+
|
|
1027
|
+
def detect
|
|
1028
|
+
options = []
|
|
1029
|
+
|
|
1030
|
+
@nodes.each do |node|
|
|
1031
|
+
text = node.text.strip
|
|
1032
|
+
match = text.match(OPTION_PATTERN)
|
|
1033
|
+
next unless match
|
|
1034
|
+
|
|
1035
|
+
marker = match[1]
|
|
1036
|
+
letter = match[2].downcase
|
|
1037
|
+
option_text = match[3].strip
|
|
1038
|
+
|
|
1039
|
+
# Check for bold formatting as correct marker
|
|
1040
|
+
bold = node.at_css("strong, b")
|
|
1041
|
+
is_correct = marker == "*" || (bold && bold.text.strip == text.strip)
|
|
1042
|
+
|
|
1043
|
+
options << {
|
|
1044
|
+
text: option_text,
|
|
1045
|
+
letter: letter,
|
|
1046
|
+
correct: is_correct || false,
|
|
1047
|
+
}
|
|
1048
|
+
end
|
|
1049
|
+
|
|
1050
|
+
options
|
|
1051
|
+
end
|
|
1052
|
+
end
|
|
1053
|
+
end
|
|
1054
|
+
end
|
|
1055
|
+
end
|
|
1056
|
+
```
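
To see what the option pattern accepts, the same regex can be applied to plain strings. The sketch below assumes a hypothetical helper `parse_options` that takes an array of lines instead of Nokogiri nodes, so it covers only the asterisk marker and not the bold check.

```ruby
# frozen_string_literal: true

# Same shape as OPTION_PATTERN in the detector: optional "*", a letter a-o,
# a "." or ")" separator, then the option text.
OPTION_LINE = /\A\s*(\*?)([a-oA-O])\s*[.)]\s*(.+)/m

# Hypothetical helper for illustration; works on strings, not DOM nodes.
def parse_options(lines)
  lines.each_with_object([]) do |line, options|
    match = line.strip.match(OPTION_LINE)
    next unless match

    options << { letter: match[2].downcase, text: match[3].strip, correct: match[1] == "*" }
  end
end

parse_options(["Question text", "*a) Paris", "B. London"]).each { |opt| p opt }
```

One design consequence worth knowing: any paragraph that starts with a single letter a-o followed by `.` or `)` is treated as an option, so a stem beginning with an abbreviation such as `A. fumigatus` would be picked up as option `a`.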
|
|
1057
|
+
|
|
1058
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor/correct_answer_detector.rb`:
|
|
1059
|
+
|
|
1060
|
+
```ruby
|
|
1061
|
+
# frozen_string_literal: true
|
|
1062
|
+
|
|
1063
|
+
module AtomicAssessmentsImport
|
|
1064
|
+
module ExamSoft
|
|
1065
|
+
module Extractor
|
|
1066
|
+
class CorrectAnswerDetector
|
|
1067
|
+
ANSWER_LABEL_PATTERN = /\bAnswer:\s*([A-Oa-o,;\s]+)/i
|
|
1068
|
+
|
|
1069
|
+
def initialize(nodes, options)
|
|
1070
|
+
@nodes = nodes
|
|
1071
|
+
@options = options
|
|
1072
|
+
end
|
|
1073
|
+
|
|
1074
|
+
def detect
|
|
1075
|
+
# First: check options for correct markers (asterisk, bold)
|
|
1076
|
+
from_markers = @options.select { |o| o[:correct] }.map { |o| o[:letter] }
|
|
1077
|
+
return from_markers unless from_markers.empty?
|
|
1078
|
+
|
|
1079
|
+
# Second: look for "Answer:" label in the chunk
|
|
1080
|
+
@nodes.each do |node|
|
|
1081
|
+
text = node.text.strip
|
|
1082
|
+
match = text.match(ANSWER_LABEL_PATTERN)
|
|
1083
|
+
next unless match
|
|
1084
|
+
|
|
1085
|
+
letters = match[1].scan(/[a-oA-O]/).map(&:downcase)
|
|
1086
|
+
return letters unless letters.empty?
|
|
1087
|
+
end
|
|
1088
|
+
|
|
1089
|
+
[]
|
|
1090
|
+
end
|
|
1091
|
+
end
|
|
1092
|
+
end
|
|
1093
|
+
end
|
|
1094
|
+
end
|
|
1095
|
+
```
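
The detector resolves answers in a fixed order: inline markers on the options win, and the `Answer:` label is only consulted when no option is marked. A minimal sketch of that precedence, assuming a hypothetical function `correct_letters` that takes option hashes and plain text lines:

```ruby
# frozen_string_literal: true

ANSWER_LABEL = /\bAnswer:\s*([A-Oa-o,;\s]+)/i

# Hypothetical helper for illustration; mirrors the two-stage lookup.
def correct_letters(options, lines)
  # Stage 1: options already flagged as correct (asterisk or bold).
  marked = options.select { |opt| opt[:correct] }.map { |opt| opt[:letter] }
  return marked unless marked.empty?

  # Stage 2: first "Answer:" label that yields at least one letter.
  lines.each do |line|
    match = line.match(ANSWER_LABEL)
    next unless match

    letters = match[1].scan(/[a-oA-O]/).map(&:downcase)
    return letters unless letters.empty?
  end

  []
end

unmarked = [{ letter: "a", correct: false }, { letter: "b", correct: false }, { letter: "c", correct: false }]
p correct_letters(unmarked, ["Answer: A, C"])
```

Because the label pattern captures letters rather than option text, a label that spells out the answer (for example `Answer: Paris`) yields no letters and falls through to the empty result.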
|
|
1096
|
+
|
|
1097
|
+
**Step 4: Run tests to verify they pass**
|
|
1098
|
+
|
|
1099
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/extractor/ --format documentation`
|
|
1100
|
+
Expected: PASS
|
|
1101
|
+
|
|
1102
|
+
**Step 5: Commit**
|
|
1103
|
+
|
|
1104
|
+
```bash
|
|
1105
|
+
git add lib/atomic_assessments_import/exam_soft/extractor/ spec/atomic_assessments_import/examsoft/extractor/
|
|
1106
|
+
git commit -m "feat: add core field detectors (stem, options, correct answer)"
|
|
1107
|
+
```
|
|
1108
|
+
|
|
1109
|
+
---
|
|
1110
|
+
|
|
1111
|
+
### Task 6: Field Detectors — Metadata, Feedback, QuestionType
|
|
1112
|
+
|
|
1113
|
+
**Files:**
|
|
1114
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor/metadata_detector.rb`
|
|
1115
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor/feedback_detector.rb`
|
|
1116
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor/question_type_detector.rb`
|
|
1117
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor/metadata_detector_spec.rb`
|
|
1118
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor/feedback_detector_spec.rb`
|
|
1119
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor/question_type_detector_spec.rb`
|
|
1120
|
+
|
|
1121
|
+
**Step 1: Write failing tests**
|
|
1122
|
+
|
|
1123
|
+
Create `spec/atomic_assessments_import/examsoft/extractor/metadata_detector_spec.rb`:
|
|
1124
|
+
|
|
1125
|
+
```ruby
|
|
1126
|
+
# frozen_string_literal: true
|
|
1127
|
+
|
|
1128
|
+
require "atomic_assessments_import"
|
|
1129
|
+
require "nokogiri"
|
|
1130
|
+
|
|
1131
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor::MetadataDetector do
|
|
1132
|
+
def nodes_from(html)
|
|
1133
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
1134
|
+
end
|
|
1135
|
+
|
|
1136
|
+
describe "#detect" do
|
|
1137
|
+
it "extracts folder, title, and category" do
|
|
1138
|
+
nodes = nodes_from("<p>Folder: Geography Title: Question 1 Category: Subject/Capitals, Difficulty/Normal 1) Question?</p>")
|
|
1139
|
+
result = described_class.new(nodes).detect
|
|
1140
|
+
|
|
1141
|
+
expect(result[:folder]).to eq("Geography")
|
|
1142
|
+
expect(result[:title]).to eq("Question 1")
|
|
1143
|
+
expect(result[:categories]).to include("Subject/Capitals")
|
|
1144
|
+
end
|
|
1145
|
+
|
|
1146
|
+
it "extracts type when present" do
|
|
1147
|
+
nodes = nodes_from("<p>Type: MA Folder: Geography Title: Q1 Category: Test 1) Question?</p>")
|
|
1148
|
+
result = described_class.new(nodes).detect
|
|
1149
|
+
|
|
1150
|
+
expect(result[:type]).to eq("ma")
|
|
1151
|
+
end
|
|
1152
|
+
|
|
1153
|
+
it "returns empty hash when no metadata found" do
|
|
1154
|
+
nodes = nodes_from("<p>Just a question with no metadata</p>")
|
|
1155
|
+
result = described_class.new(nodes).detect
|
|
1156
|
+
|
|
1157
|
+
expect(result).to eq({})
|
|
1158
|
+
end
|
|
1159
|
+
end
|
|
1160
|
+
end
|
|
1161
|
+
```
|
|
1162
|
+
|
|
1163
|
+
Create `spec/atomic_assessments_import/examsoft/extractor/feedback_detector_spec.rb`:
|
|
1164
|
+
|
|
1165
|
+
```ruby
|
|
1166
|
+
# frozen_string_literal: true
|
|
1167
|
+
|
|
1168
|
+
require "atomic_assessments_import"
|
|
1169
|
+
require "nokogiri"
|
|
1170
|
+
|
|
1171
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor::FeedbackDetector do
|
|
1172
|
+
def nodes_from(html)
|
|
1173
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
1174
|
+
end
|
|
1175
|
+
|
|
1176
|
+
describe "#detect" do
|
|
1177
|
+
it "extracts feedback after tilde" do
|
|
1178
|
+
nodes = nodes_from("<p>1) What is the capital? ~ Paris is the capital of France.</p>")
|
|
1179
|
+
result = described_class.new(nodes).detect
|
|
1180
|
+
|
|
1181
|
+
expect(result).to eq("Paris is the capital of France.")
|
|
1182
|
+
end
|
|
1183
|
+
|
|
1184
|
+
it "extracts feedback from Explanation: label" do
|
|
1185
|
+
nodes = nodes_from(<<~HTML)
|
|
1186
|
+
<p>What is the capital?</p>
|
|
1187
|
+
<p>Explanation: Paris is the capital of France.</p>
|
|
1188
|
+
HTML
|
|
1189
|
+
result = described_class.new(nodes).detect
|
|
1190
|
+
|
|
1191
|
+
expect(result).to eq("Paris is the capital of France.")
|
|
1192
|
+
end
|
|
1193
|
+
|
|
1194
|
+
it "extracts feedback from Rationale: label" do
|
|
1195
|
+
nodes = nodes_from(<<~HTML)
|
|
1196
|
+
<p>What is the capital?</p>
|
|
1197
|
+
<p>Rationale: Paris is the capital of France.</p>
|
|
1198
|
+
HTML
|
|
1199
|
+
result = described_class.new(nodes).detect
|
|
1200
|
+
|
|
1201
|
+
expect(result).to eq("Paris is the capital of France.")
|
|
1202
|
+
end
|
|
1203
|
+
|
|
1204
|
+
it "returns nil when no feedback found" do
|
|
1205
|
+
nodes = nodes_from("<p>Just a question</p>")
|
|
1206
|
+
result = described_class.new(nodes).detect
|
|
1207
|
+
|
|
1208
|
+
expect(result).to be_nil
|
|
1209
|
+
end
|
|
1210
|
+
end
|
|
1211
|
+
end
|
|
1212
|
+
```
|
|
1213
|
+
|
|
1214
|
+
Create `spec/atomic_assessments_import/examsoft/extractor/question_type_detector_spec.rb`:
|
|
1215
|
+
|
|
1216
|
+
```ruby
|
|
1217
|
+
# frozen_string_literal: true
|
|
1218
|
+
|
|
1219
|
+
require "atomic_assessments_import"
|
|
1220
|
+
require "nokogiri"
|
|
1221
|
+
|
|
1222
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor::QuestionTypeDetector do
|
|
1223
|
+
def nodes_from(html)
|
|
1224
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
1225
|
+
end
|
|
1226
|
+
|
|
1227
|
+
describe "#detect" do
|
|
1228
|
+
it "detects type from Type: label" do
|
|
1229
|
+
nodes = nodes_from("<p>Type: MA Folder: Geo 1) Question?</p>")
|
|
1230
|
+
result = described_class.new(nodes, has_options: true).detect
|
|
1231
|
+
|
|
1232
|
+
expect(result).to eq("ma")
|
|
1233
|
+
end
|
|
1234
|
+
|
|
1235
|
+
it "detects essay from Type: label" do
|
|
1236
|
+
nodes = nodes_from("<p>Type: Essay Folder: Geo 1) Question?</p>")
|
|
1237
|
+
result = described_class.new(nodes, has_options: false).detect
|
|
1238
|
+
|
|
1239
|
+
expect(result).to eq("essay")
|
|
1240
|
+
end
|
|
1241
|
+
|
|
1242
|
+
it "defaults to mcq when options are present" do
|
|
1243
|
+
nodes = nodes_from("<p>A question with no type label</p>")
|
|
1244
|
+
result = described_class.new(nodes, has_options: true).detect
|
|
1245
|
+
|
|
1246
|
+
expect(result).to eq("mcq")
|
|
1247
|
+
end
|
|
1248
|
+
|
|
1249
|
+
it "defaults to short_answer when no options" do
|
|
1250
|
+
nodes = nodes_from("<p>A question with no type label and no options</p>")
|
|
1251
|
+
result = described_class.new(nodes, has_options: false).detect
|
|
1252
|
+
|
|
1253
|
+
expect(result).to eq("short_answer")
|
|
1254
|
+
end
|
|
1255
|
+
|
|
1256
|
+
it "detects true/false from Type: label" do
|
|
1257
|
+
nodes = nodes_from("<p>Type: True/False 1) Question?</p>")
|
|
1258
|
+
result = described_class.new(nodes, has_options: true).detect
|
|
1259
|
+
|
|
1260
|
+
expect(result).to eq("true_false")
|
|
1261
|
+
end
|
|
1262
|
+
|
|
1263
|
+
it "detects matching from Type: label" do
|
|
1264
|
+
nodes = nodes_from("<p>Type: Matching 1) Question?</p>")
|
|
1265
|
+
result = described_class.new(nodes, has_options: false).detect
|
|
1266
|
+
|
|
1267
|
+
expect(result).to eq("matching")
|
|
1268
|
+
end
|
|
1269
|
+
|
|
1270
|
+
it "detects ordering from Type: label" do
|
|
1271
|
+
nodes = nodes_from("<p>Type: Ordering 1) Question?</p>")
|
|
1272
|
+
result = described_class.new(nodes, has_options: false).detect
|
|
1273
|
+
|
|
1274
|
+
expect(result).to eq("ordering")
|
|
1275
|
+
end
|
|
1276
|
+
|
|
1277
|
+
it "detects fill_in_the_blank from Type: label" do
|
|
1278
|
+
nodes = nodes_from("<p>Type: Fill in the Blank 1) Question?</p>")
|
|
1279
|
+
result = described_class.new(nodes, has_options: false).detect
|
|
1280
|
+
|
|
1281
|
+
expect(result).to eq("fill_in_the_blank")
|
|
1282
|
+
end
|
|
1283
|
+
end
|
|
1284
|
+
end
|
|
1285
|
+
```
|
|
1286
|
+
|
|
1287
|
+
**Step 2: Run tests to verify they fail**
|
|
1288
|
+
|
|
1289
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/extractor/ --format documentation`
|
|
1290
|
+
Expected: FAIL — uninitialized constants for new detectors
|
|
1291
|
+
|
|
1292
|
+
**Step 3: Write implementations**
|
|
1293
|
+
|
|
1294
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor/metadata_detector.rb`:
|
|
1295
|
+
|
|
1296
|
+
```ruby
|
|
1297
|
+
# frozen_string_literal: true
|
|
1298
|
+
|
|
1299
|
+
module AtomicAssessmentsImport
|
|
1300
|
+
module ExamSoft
|
|
1301
|
+
module Extractor
|
|
1302
|
+
class MetadataDetector
|
|
1303
|
+
FOLDER_PATTERN = /Folder:\s*(.+?)(?=\s*(?:Title:|Category:|\d+[.)]))/
|
|
1304
|
+
TITLE_PATTERN = /Title:\s*(.+?)(?=\s*(?:Category:|\d+[.)]))/
|
|
1305
|
+
CATEGORY_PATTERN = /Category:\s*(.+?)(?=\s*\d+[.)]|\z)/
|
|
1306
|
+
TYPE_PATTERN = /Type:\s*(\S+)/
|
|
1307
|
+
|
|
1308
|
+
def initialize(nodes)
|
|
1309
|
+
@nodes = nodes
|
|
1310
|
+
end
|
|
1311
|
+
|
|
1312
|
+
def detect
|
|
1313
|
+
# Combine all text from nodes to search for metadata
|
|
1314
|
+
full_text = @nodes.map { |n| n.text.strip }.join(" ")
|
|
1315
|
+
result = {}
|
|
1316
|
+
|
|
1317
|
+
type_match = full_text.match(TYPE_PATTERN)
|
|
1318
|
+
result[:type] = type_match[1].strip.downcase if type_match
|
|
1319
|
+
|
|
1320
|
+
folder_match = full_text.match(FOLDER_PATTERN)
|
|
1321
|
+
result[:folder] = folder_match[1].strip if folder_match
|
|
1322
|
+
|
|
1323
|
+
title_match = full_text.match(TITLE_PATTERN)
|
|
1324
|
+
result[:title] = title_match[1].strip if title_match
|
|
1325
|
+
|
|
1326
|
+
category_match = full_text.match(CATEGORY_PATTERN)
|
|
1327
|
+
if category_match
|
|
1328
|
+
result[:categories] = category_match[1].split(",").map(&:strip)
|
|
1329
|
+
end
|
|
1330
|
+
|
|
1331
|
+
result
|
|
1332
|
+
end
|
|
1333
|
+
end
|
|
1334
|
+
end
|
|
1335
|
+
end
|
|
1336
|
+
end
|
|
1337
|
+
```
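
Each metadata field is captured lazily and ends at a lookahead for the next label or the question number. The sketch below applies the same three patterns to a plain string through a hypothetical helper `parse_metadata`, to show where each capture stops.

```ruby
# frozen_string_literal: true

# Same patterns as the detector; each capture ends where the lookahead matches.
FIELD_PATTERNS = {
  folder: /Folder:\s*(.+?)(?=\s*(?:Title:|Category:|\d+[.)]))/,
  title: /Title:\s*(.+?)(?=\s*(?:Category:|\d+[.)]))/,
  category: /Category:\s*(.+?)(?=\s*\d+[.)]|\z)/
}.freeze

# Hypothetical helper for illustration; returns only the fields that matched.
def parse_metadata(text)
  FIELD_PATTERNS.each_with_object({}) do |(key, pattern), found|
    match = text.match(pattern)
    found[key] = match[1].strip if match
  end
end

p parse_metadata("Folder: Geography Title: Question 1 Category: Subject/Capitals, Difficulty/Normal 1) Question?")
```

The digit in `Question 1` is safe because it is not followed by `.` or `)`. A title such as `Chapter 2. Review` is not: the lookahead fires at `2.` and the captured title is cut to `Chapter`.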
|
|
1338
|
+
|
|
1339
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor/feedback_detector.rb`:
|
|
1340
|
+
|
|
1341
|
+
```ruby
|
|
1342
|
+
# frozen_string_literal: true
|
|
1343
|
+
|
|
1344
|
+
module AtomicAssessmentsImport
|
|
1345
|
+
module ExamSoft
|
|
1346
|
+
module Extractor
|
|
1347
|
+
class FeedbackDetector
|
|
1348
|
+
TILDE_PATTERN = /~\s*(.+)/m
|
|
1349
|
+
LABEL_PATTERN = /\A\s*(?:Explanation|Rationale):\s*(.+)/im
|
|
1350
|
+
|
|
1351
|
+
def initialize(nodes)
|
|
1352
|
+
@nodes = nodes
|
|
1353
|
+
end
|
|
1354
|
+
|
|
1355
|
+
def detect
|
|
1356
|
+
# First: look for tilde-separated feedback in any node
|
|
1357
|
+
@nodes.each do |node|
|
|
1358
|
+
text = node.text.strip
|
|
1359
|
+
match = text.match(TILDE_PATTERN)
|
|
1360
|
+
if match
|
|
1361
|
+
feedback = match[1].strip
|
|
1362
|
+
return feedback unless feedback.empty?
|
|
1363
|
+
end
|
|
1364
|
+
end
|
|
1365
|
+
|
|
1366
|
+
# Second: look for labeled feedback (Explanation:, Rationale:)
|
|
1367
|
+
@nodes.each do |node|
|
|
1368
|
+
text = node.text.strip
|
|
1369
|
+
match = text.match(LABEL_PATTERN)
|
|
1370
|
+
return match[1].strip if match
|
|
1371
|
+
end
|
|
1372
|
+
|
|
1373
|
+
nil
|
|
1374
|
+
end
|
|
1375
|
+
end
|
|
1376
|
+
end
|
|
1377
|
+
end
|
|
1378
|
+
end
|
|
1379
|
+
```
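
The two passes above give tilde-separated feedback priority over labelled feedback. A minimal sketch of that ordering on plain strings, assuming a hypothetical helper `find_feedback`:

```ruby
# frozen_string_literal: true

TILDE = /~\s*(.+)/m
LABEL = /\A\s*(?:Explanation|Rationale):\s*(.+)/im

# Hypothetical helper for illustration; takes text lines instead of nodes.
def find_feedback(lines)
  # Pass 1: text after a tilde in any line.
  lines.each do |line|
    match = line.strip.match(TILDE)
    next unless match

    feedback = match[1].strip
    return feedback unless feedback.empty?
  end

  # Pass 2: a line that starts with "Explanation:" or "Rationale:".
  lines.each do |line|
    match = line.strip.match(LABEL)
    return match[1].strip if match
  end

  nil
end

p find_feedback(["What is the capital? ~ Paris is the capital.", "Explanation: ignored"])
```

Since the first pass scans every line, a tilde inside an option (for example `a) ~5 km`) is also read as feedback. Restricting the tilde search to the stem line would avoid that, at the cost of missing feedback placed elsewhere.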
|
|
1380
|
+
|
|
1381
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor/question_type_detector.rb`:
|
|
1382
|
+
|
|
1383
|
+
```ruby
|
|
1384
|
+
# frozen_string_literal: true
|
|
1385
|
+
|
|
1386
|
+
module AtomicAssessmentsImport
|
|
1387
|
+
module ExamSoft
|
|
1388
|
+
module Extractor
|
|
1389
|
+
class QuestionTypeDetector
|
|
1390
|
+
TYPE_LABEL_PATTERN = /Type:\s*(.+?)(?=\s*(?:Folder:|Title:|Category:|\d+[.)]|\z))/i
|
|
1391
|
+
|
|
1392
|
+
TYPE_MAP = {
|
|
1393
|
+
/\Amcq?\z/i => "mcq",
|
|
1394
|
+
/\Amultiple\s*choice\z/i => "mcq",
|
|
1395
|
+
/\Ama\z/i => "ma",
|
|
1396
|
+
/\Amultiple\s*(?:select|answer|response)\z/i => "ma",
|
|
1397
|
+
/\Atrue[\s\/]*false\z/i => "true_false",
|
|
1398
|
+
/\At\s*\/?\s*f\z/i => "true_false",
|
|
1399
|
+
/\Aessay\z/i => "essay",
|
|
1400
|
+
/\Along\s*answer\z/i => "essay",
|
|
1401
|
+
/\Ashort\s*answer\z/i => "short_answer",
|
|
1402
|
+
/\Afill[\s_-]*in[\s_-]*(?:the[\s_-]*)?blank\z/i => "fill_in_the_blank",
|
|
1403
|
+
/\Acloze\z/i => "fill_in_the_blank",
|
|
1404
|
+
/\Amatching\z/i => "matching",
|
|
1405
|
+
/\Aorder(?:ing)?\z/i => "ordering",
|
|
1406
|
+
}.freeze
|
|
1407
|
+
|
|
1408
|
+
def initialize(nodes, has_options:)
|
|
1409
|
+
@nodes = nodes
|
|
1410
|
+
@has_options = has_options
|
|
1411
|
+
end
|
|
1412
|
+
|
|
1413
|
+
def detect
|
|
1414
|
+
# Try to find an explicit Type: label
|
|
1415
|
+
full_text = @nodes.map { |n| n.text.strip }.join(" ")
|
|
1416
|
+
match = full_text.match(TYPE_LABEL_PATTERN)
|
|
1417
|
+
|
|
1418
|
+
if match
|
|
1419
|
+
type_text = match[1].strip
|
|
1420
|
+
TYPE_MAP.each do |pattern, type|
|
|
1421
|
+
return type if type_text.match?(pattern)
|
|
1422
|
+
end
|
|
1423
|
+
# Unknown explicit type — return it lowercased as-is
|
|
1424
|
+
return type_text.downcase.gsub(/\s+/, "_")
|
|
1425
|
+
end
|
|
1426
|
+
|
|
1427
|
+
# No explicit type — infer from structure
|
|
1428
|
+
@has_options ? "mcq" : "short_answer"
|
|
1429
|
+
end
|
|
1430
|
+
end
|
|
1431
|
+
end
|
|
1432
|
+
end
|
|
1433
|
+
end
|
|
1434
|
+
```
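
The type lookup has three outcomes: a known alias maps to a canonical name, an unknown label is passed through in snake case, and a missing label falls back on whether options exist. The sketch below shows those branches with a shortened alias table and a hypothetical function `normalize_type` that receives the already-extracted label text.

```ruby
# frozen_string_literal: true

# A shortened alias table for illustration; the detector's TYPE_MAP has more entries.
TYPE_ALIASES = {
  /\Amcq?\z/i => "mcq",
  /\Atrue[\s\/]*false\z/i => "true_false",
  /\Afill[\s_-]*in[\s_-]*(?:the[\s_-]*)?blank\z/i => "fill_in_the_blank"
}.freeze

# Hypothetical helper; `label` is the text after "Type:", or nil when absent.
def normalize_type(label, has_options:)
  if label.nil?
    # No explicit type: infer from structure.
    return has_options ? "mcq" : "short_answer"
  end

  text = label.strip
  TYPE_ALIASES.each { |pattern, name| return name if text.match?(pattern) }

  # Unknown explicit type: pass it through in snake case.
  text.downcase.gsub(/\s+/, "_")
end

puts normalize_type("True/False", has_options: true)
puts normalize_type("Hot Spot", has_options: false)
puts normalize_type(nil, has_options: false)
```

Passing unknown labels through, rather than raising, lets a later stage decide how to treat them; in this plan the orchestrator in Task 7 imports such questions as drafts with a warning.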
|
|
1435
|
+
|
|
1436
|
+
**Step 4: Run tests to verify they pass**
|
|
1437
|
+
|
|
1438
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/extractor/ --format documentation`
|
|
1439
|
+
Expected: PASS
|
|
1440
|
+
|
|
1441
|
+
**Step 5: Commit**
|
|
1442
|
+
|
|
1443
|
+
```bash
|
|
1444
|
+
git add lib/atomic_assessments_import/exam_soft/extractor/ spec/atomic_assessments_import/examsoft/extractor/
|
|
1445
|
+
git commit -m "feat: add metadata, feedback, and question type detectors"
|
|
1446
|
+
```
|
|
1447
|
+
|
|
1448
|
+
---
|
|
1449
|
+
|
|
1450
|
+
### Task 7: Extractor Orchestrator
|
|
1451
|
+
|
|
1452
|
+
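Assembles all detectors and builds the `row_mock` hash. Because this file declares `Extractor` as a class, the detector files from Tasks 5 and 6 must reopen it with `class Extractor` rather than `module Extractor`; Ruby raises `TypeError` when a constant defined as a module is reopened as a class.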
|
|
1453
|
+
|
|
1454
|
+
**Files:**
|
|
1455
|
+
- Create: `lib/atomic_assessments_import/exam_soft/extractor.rb`
|
|
1456
|
+
- Test: `spec/atomic_assessments_import/examsoft/extractor_spec.rb`
|
|
1457
|
+
|
|
1458
|
+
**Step 1: Write the failing test**
|
|
1459
|
+
|
|
1460
|
+
Create `spec/atomic_assessments_import/examsoft/extractor_spec.rb`:
|
|
1461
|
+
|
|
1462
|
+
```ruby
|
|
1463
|
+
# frozen_string_literal: true
|
|
1464
|
+
|
|
1465
|
+
require "atomic_assessments_import"
|
|
1466
|
+
require "nokogiri"
|
|
1467
|
+
|
|
1468
|
+
RSpec.describe AtomicAssessmentsImport::ExamSoft::Extractor do
|
|
1469
|
+
def nodes_from(html)
|
|
1470
|
+
Nokogiri::HTML.fragment(html).children.to_a
|
|
1471
|
+
end
|
|
1472
|
+
|
|
1473
|
+
describe "#extract" do
|
|
1474
|
+
it "extracts a complete MCQ question" do
|
|
1475
|
+
nodes = nodes_from(<<~HTML)
|
|
1476
|
+
<p>Folder: Geography Title: Question 1 Category: Subject/Capitals 1) What is the capital of France? ~ Paris is the capital.</p>
|
|
1477
|
+
<p>*a) Paris</p>
|
|
1478
|
+
<p>b) London</p>
|
|
1479
|
+
<p>c) Berlin</p>
|
|
1480
|
+
HTML
|
|
1481
|
+
result = described_class.new(nodes).extract
|
|
1482
|
+
|
|
1483
|
+
expect(result[:row]["question text"]).to eq("What is the capital of France?")
|
|
1484
|
+
expect(result[:row]["option a"]).to eq("Paris")
|
|
1485
|
+
expect(result[:row]["option b"]).to eq("London")
|
|
1486
|
+
expect(result[:row]["option c"]).to eq("Berlin")
|
|
1487
|
+
expect(result[:row]["correct answer"]).to eq("a")
|
|
1488
|
+
expect(result[:row]["title"]).to eq("Question 1")
|
|
1489
|
+
expect(result[:row]["folder"]).to eq("Geography")
|
|
1490
|
+
expect(result[:row]["general feedback"]).to eq("Paris is the capital.")
|
|
1491
|
+
expect(result[:row]["question type"]).to eq("mcq")
|
|
1492
|
+
expect(result[:status]).to eq("published")
|
|
1493
|
+
expect(result[:warnings]).to be_empty
|
|
1494
|
+
end
|
|
1495
|
+
|
|
1496
|
+
it "returns draft status when no correct answer" do
|
|
1497
|
+
nodes = nodes_from(<<~HTML)
|
|
1498
|
+
<p>1) What is the capital of France?</p>
|
|
1499
|
+
<p>a) Paris</p>
|
|
1500
|
+
<p>b) London</p>
|
|
1501
|
+
HTML
|
|
1502
|
+
result = described_class.new(nodes).extract
|
|
1503
|
+
|
|
1504
|
+
expect(result[:status]).to eq("draft")
|
|
1505
|
+
expect(result[:warnings]).to include(/correct answer/i)
|
|
1506
|
+
end
|
|
1507
|
+
|
|
1508
|
+
it "returns draft status when no question text found" do
|
|
1509
|
+
nodes = nodes_from(<<~HTML)
|
|
1510
|
+
<p>a) Paris</p>
|
|
1511
|
+
<p>b) London</p>
|
|
1512
|
+
HTML
|
|
1513
|
+
result = described_class.new(nodes).extract
|
|
1514
|
+
|
|
1515
|
+
expect(result[:status]).to eq("draft")
|
|
1516
|
+
expect(result[:warnings]).to include(/question text/i)
|
|
1517
|
+
end
|
|
1518
|
+
|
|
1519
|
+
it "handles multiple correct answers for MA type" do
|
|
1520
|
+
nodes = nodes_from(<<~HTML)
|
|
1521
|
+
<p>Type: MA Folder: Geo Title: Q1 Category: Test 1) Pick capitals? ~ Explanation</p>
|
|
1522
|
+
<p>*a) Paris</p>
|
|
1523
|
+
<p>*b) Berlin</p>
|
|
1524
|
+
<p>c) Detroit</p>
|
|
1525
|
+
HTML
|
|
1526
|
+
result = described_class.new(nodes).extract
|
|
1527
|
+
|
|
1528
|
+
expect(result[:row]["correct answer"]).to eq("a; b")
|
|
1529
|
+
expect(result[:row]["question type"]).to eq("ma")
|
|
1530
|
+
end
|
|
1531
|
+
|
|
1532
|
+
it "extracts essay questions without options" do
|
|
1533
|
+
nodes = nodes_from(<<~HTML)
|
|
1534
|
+
<p>Type: Essay Folder: Writing Title: Q1 Category: Test 1) Discuss the causes of WWI.</p>
|
|
1535
|
+
HTML
|
|
1536
|
+
result = described_class.new(nodes).extract
|
|
1537
|
+
|
|
1538
|
+
expect(result[:row]["question type"]).to eq("essay")
|
|
1539
|
+
expect(result[:row]["question text"]).to eq("Discuss the causes of WWI.")
|
|
1540
|
+
expect(result[:status]).to eq("published")
|
|
1541
|
+
end
|
|
1542
|
+
|
|
1543
|
+
it "warns for unsupported question types but still imports" do
|
|
1544
|
+
nodes = nodes_from(<<~HTML)
|
|
1545
|
+
<p>Type: Hotspot 1) Identify the region on the map.</p>
|
|
1546
|
+
HTML
|
|
1547
|
+
result = described_class.new(nodes).extract
|
|
1548
|
+
|
|
1549
|
+
expect(result[:status]).to eq("draft")
|
|
1550
|
+
expect(result[:warnings]).to include(/unsupported.*hotspot/i)
|
|
1551
|
+
end
|
|
1552
|
+
end
|
|
1553
|
+
end
|
|
1554
|
+
```
|
|
1555
|
+
|
|
1556
|
+
**Step 2: Run test to verify it fails**
|
|
1557
|
+
|
|
1558
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/extractor_spec.rb --format documentation`
|
|
1559
|
+
Expected: FAIL
|
|
1560
|
+
|
|
1561
|
+
**Step 3: Write implementation**
|
|
1562
|
+
|
|
1563
|
+
Create `lib/atomic_assessments_import/exam_soft/extractor.rb`:
|
|
1564
|
+
|
|
1565
|
+
```ruby
|
|
1566
|
+
# frozen_string_literal: true
|
|
1567
|
+
|
|
1568
|
+
require_relative "extractor/question_stem_detector"
|
|
1569
|
+
require_relative "extractor/options_detector"
|
|
1570
|
+
require_relative "extractor/correct_answer_detector"
|
|
1571
|
+
require_relative "extractor/metadata_detector"
|
|
1572
|
+
require_relative "extractor/feedback_detector"
|
|
1573
|
+
require_relative "extractor/question_type_detector"
|
|
1574
|
+
|
|
1575
|
+
module AtomicAssessmentsImport
|
|
1576
|
+
module ExamSoft
|
|
1577
|
+
class Extractor
|
|
1578
|
+
SUPPORTED_TYPES = %w[mcq ma true_false essay short_answer fill_in_the_blank matching ordering].freeze
|
|
1579
|
+
# Types that require options and a correct answer
|
|
1580
|
+
OPTION_TYPES = %w[mcq ma true_false].freeze
|
|
1581
|
+
|
|
1582
|
+
def initialize(nodes)
|
|
1583
|
+
@nodes = nodes
|
|
1584
|
+
end
|
|
1585
|
+
|
|
1586
|
+
def extract
|
|
1587
|
+
warnings = []
|
|
1588
|
+
|
|
1589
|
+
# Run detectors
|
|
1590
|
+
options = Extractor::OptionsDetector.new(@nodes).detect
|
|
1591
|
+
has_options = !options.empty?
|
|
1592
|
+
|
|
1593
|
+
metadata = Extractor::MetadataDetector.new(@nodes).detect
|
|
1594
|
+
question_type = Extractor::QuestionTypeDetector.new(@nodes, has_options: has_options).detect
|
|
1595
|
+
stem = Extractor::QuestionStemDetector.new(@nodes).detect
|
|
1596
|
+
feedback = Extractor::FeedbackDetector.new(@nodes).detect
|
|
1597
|
+
correct_answers = has_options ? Extractor::CorrectAnswerDetector.new(@nodes, options).detect : []
|
|
1598
|
+
|
|
1599
|
+
# Determine status
|
|
1600
|
+
status = "published"
|
|
1601
|
+
|
|
1602
|
+
unless SUPPORTED_TYPES.include?(question_type)
|
|
1603
|
+
warnings << "Unsupported question type '#{question_type}', imported as draft"
|
|
1604
|
+
status = "draft"
|
|
1605
|
+
end
|
|
1606
|
+
|
|
1607
|
+
if stem.nil?
|
|
1608
|
+
warnings << "No question text found, imported as draft"
|
|
1609
|
+
status = "draft"
|
|
1610
|
+
end
|
|
1611
|
+
|
|
1612
|
+
if OPTION_TYPES.include?(question_type)
|
|
1613
|
+
if options.empty?
|
|
1614
|
+
warnings << "No options found for #{question_type} question, imported as draft"
|
|
1615
|
+
status = "draft"
|
|
1616
|
+
end
|
|
1617
|
+
if correct_answers.empty?
|
|
1618
|
+
warnings << "No correct answer found, imported as draft"
|
|
1619
|
+
status = "draft"
|
|
1620
|
+
end
|
|
1621
|
+
end
|
|
1622
|
+
|
|
1623
|
+
# Build row_mock
|
|
1624
|
+
row = {
|
|
1625
|
+
"question id" => nil,
|
|
1626
|
+
"folder" => metadata[:folder],
|
|
1627
|
+
"title" => metadata[:title],
|
|
1628
|
+
"category" => metadata[:categories] || [],
|
|
1629
|
+
"import type" => nil,
|
|
1630
|
+
"description" => nil,
|
|
1631
|
+
"question text" => stem,
|
|
1632
|
+
"question type" => question_type,
|
|
1633
|
+
"stimulus review" => nil,
|
|
1634
|
+
"instructor stimulus" => nil,
|
|
1635
|
+
"correct answer" => correct_answers.join("; "),
|
|
1636
|
+
"scoring type" => nil,
|
|
1637
|
+
"points" => nil,
|
|
1638
|
+
"distractor rationale" => nil,
|
|
1639
|
+
"sample answer" => nil,
|
|
1640
|
+
"acknowledgements" => nil,
|
|
1641
|
+
"general feedback" => feedback,
|
|
1642
|
+
"correct feedback" => nil,
|
|
1643
|
+
"incorrect feedback" => nil,
|
|
1644
|
+
"shuffle options" => nil,
|
|
1645
|
+
"template" => question_type,
|
|
1646
|
+
}
|
|
1647
|
+
|
|
1648
|
+
# Add option keys
|
|
1649
|
+
options.each_with_index do |opt, index|
|
|
1650
|
+
letter = ("a".ord + index).chr
|
|
1651
|
+
row["option #{letter}"] = opt[:text]
|
|
1652
|
+
end
|
|
1653
|
+
|
|
1654
|
+
{
|
|
1655
|
+
row: row,
|
|
1656
|
+
status: status,
|
|
1657
|
+
warnings: warnings,
|
|
1658
|
+
}
|
|
1659
|
+
end
|
|
1660
|
+
end
|
|
1661
|
+
end
|
|
1662
|
+
end
|
|
1663
|
+
```
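The status rules in `extract` can be read in isolation from the detectors. The sketch below is a hypothetical standalone helper (`determine_status` is an illustrative name, not part of the gem) that applies the same checks, assuming the detector results are already available as plain values.

```ruby
# Same type lists as the Extractor in the plan above.
SUPPORTED_TYPES = %w[mcq ma true_false essay short_answer fill_in_the_blank matching ordering].freeze
OPTION_TYPES = %w[mcq ma true_false].freeze

# Hypothetical helper: returns [status, warnings] for already-detected fields.
# Any warning downgrades the question from "published" to "draft".
def determine_status(question_type:, stem:, options:, correct_answers:)
  warnings = []

  unless SUPPORTED_TYPES.include?(question_type)
    warnings << "Unsupported question type '#{question_type}', imported as draft"
  end
  warnings << "No question text found, imported as draft" if stem.nil?

  # Options and a correct answer are only required for option-based types.
  if OPTION_TYPES.include?(question_type)
    warnings << "No options found for #{question_type} question, imported as draft" if options.empty?
    warnings << "No correct answer found, imported as draft" if correct_answers.empty?
  end

  [warnings.empty? ? "published" : "draft", warnings]
end
```

In this sketch an essay with a stem but no options stays published, while an MCQ missing its options and correct answer collects one warning per missing piece.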
|
|
1664
|
+
|
|
1665
|
+
**Step 4: Run test to verify it passes**
|
|
1666
|
+
|
|
1667
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/extractor_spec.rb -v`
|
|
1668
|
+
Expected: PASS
|
|
1669
|
+
|
|
1670
|
+
**Step 5: Commit**
|
|
1671
|
+
|
|
1672
|
+
```bash
|
|
1673
|
+
git add lib/atomic_assessments_import/exam_soft/extractor.rb spec/atomic_assessments_import/examsoft/extractor_spec.rb
|
|
1674
|
+
git commit -m "feat: add Extractor orchestrator with field detection pipeline"
|
|
1675
|
+
```
|
|
1676
|
+
|
|
1677
|
+
---
|
|
1678
|
+
|
|
1679
|
+
### Task 8: New Question Type Classes — Essay and ShortAnswer
|
|
1680
|
+
|
|
1681
|
+
**Files:**
|
|
1682
|
+
- Create: `lib/atomic_assessments_import/questions/essay.rb`
|
|
1683
|
+
- Create: `lib/atomic_assessments_import/questions/short_answer.rb`
|
|
1684
|
+
- Test: `spec/atomic_assessments_import/questions/essay_spec.rb`
|
|
1685
|
+
- Test: `spec/atomic_assessments_import/questions/short_answer_spec.rb`
|
|
1686
|
+
- Modify: `lib/atomic_assessments_import/questions/question.rb:12-18` (add cases to `self.load`)
|
|
1687
|
+
|
|
1688
|
+
**Step 1: Write failing tests**
|
|
1689
|
+
|
|
1690
|
+
Create `spec/atomic_assessments_import/questions/essay_spec.rb`:
|
|
1691
|
+
|
|
1692
|
+
```ruby
|
|
1693
|
+
# frozen_string_literal: true
|
|
1694
|
+
|
|
1695
|
+
require "atomic_assessments_import"
|
|
1696
|
+
|
|
1697
|
+
RSpec.describe AtomicAssessmentsImport::Questions::Essay do
|
|
1698
|
+
let(:row) do
|
|
1699
|
+
{
|
|
1700
|
+
"question text" => "Discuss the causes of World War I.",
|
|
1701
|
+
"question type" => "essay",
|
|
1702
|
+
"general feedback" => "A good answer covers alliances, imperialism, and nationalism.",
|
|
1703
|
+
"sample answer" => "World War I was caused by...",
|
|
1704
|
+
"points" => "10",
|
|
1705
|
+
}
|
|
1706
|
+
end
|
|
1707
|
+
|
|
1708
|
+
describe "#question_type" do
|
|
1709
|
+
it "returns longanswer" do
|
|
1710
|
+
question = described_class.new(row)
|
|
1711
|
+
expect(question.question_type).to eq("longanswer")
|
|
1712
|
+
end
|
|
1713
|
+
end
|
|
1714
|
+
|
|
1715
|
+
describe "#to_learnosity" do
|
|
1716
|
+
it "returns correct structure" do
|
|
1717
|
+
question = described_class.new(row)
|
|
1718
|
+
result = question.to_learnosity
|
|
1719
|
+
|
|
1720
|
+
expect(result[:type]).to eq("longanswer")
|
|
1721
|
+
expect(result[:widget_type]).to eq("response")
|
|
1722
|
+
expect(result[:data][:stimulus]).to eq("Discuss the causes of World War I.")
|
|
1723
|
+
end
|
|
1724
|
+
|
|
1725
|
+
it "includes max_length when word limit specified" do
|
|
1726
|
+
row["word_limit"] = "500"
|
|
1727
|
+
question = described_class.new(row)
|
|
1728
|
+
result = question.to_learnosity
|
|
1729
|
+
|
|
1730
|
+
expect(result[:data][:max_length]).to eq(500)
|
|
1731
|
+
end
|
|
1732
|
+
|
|
1733
|
+
it "sets metadata" do
|
|
1734
|
+
question = described_class.new(row)
|
|
1735
|
+
result = question.to_learnosity
|
|
1736
|
+
|
|
1737
|
+
expect(result[:data][:metadata][:sample_answer]).to eq("World War I was caused by...")
|
|
1738
|
+
expect(result[:data][:metadata][:general_feedback]).to eq("A good answer covers alliances, imperialism, and nationalism.")
|
|
1739
|
+
end
|
|
1740
|
+
end
|
|
1741
|
+
end
|
|
1742
|
+
```
|
|
1743
|
+
|
|
1744
|
+
Create `spec/atomic_assessments_import/questions/short_answer_spec.rb`:
|
|
1745
|
+
|
|
1746
|
+
```ruby
|
|
1747
|
+
# frozen_string_literal: true
|
|
1748
|
+
|
|
1749
|
+
require "atomic_assessments_import"
|
|
1750
|
+
|
|
1751
|
+
RSpec.describe AtomicAssessmentsImport::Questions::ShortAnswer do
|
|
1752
|
+
let(:row) do
|
|
1753
|
+
{
|
|
1754
|
+
"question text" => "What is the chemical symbol for water?",
|
|
1755
|
+
"question type" => "short_answer",
|
|
1756
|
+
"correct answer" => "H2O",
|
|
1757
|
+
"points" => "1",
|
|
1758
|
+
}
|
|
1759
|
+
end
|
|
1760
|
+
|
|
1761
|
+
describe "#question_type" do
|
|
1762
|
+
it "returns shorttext" do
|
|
1763
|
+
question = described_class.new(row)
|
|
1764
|
+
expect(question.question_type).to eq("shorttext")
|
|
1765
|
+
end
|
|
1766
|
+
end
|
|
1767
|
+
|
|
1768
|
+
describe "#to_learnosity" do
|
|
1769
|
+
it "returns correct structure" do
|
|
1770
|
+
question = described_class.new(row)
|
|
1771
|
+
result = question.to_learnosity
|
|
1772
|
+
|
|
1773
|
+
expect(result[:type]).to eq("shorttext")
|
|
1774
|
+
expect(result[:widget_type]).to eq("response")
|
|
1775
|
+
expect(result[:data][:stimulus]).to eq("What is the chemical symbol for water?")
|
|
1776
|
+
end
|
|
1777
|
+
|
|
1778
|
+
it "includes validation with correct answer" do
|
|
1779
|
+
question = described_class.new(row)
|
|
1780
|
+
result = question.to_learnosity
|
|
1781
|
+
|
|
1782
|
+
expect(result[:data][:validation][:valid_response][:value]).to eq("H2O")
|
|
1783
|
+
expect(result[:data][:validation][:valid_response][:score]).to eq(1)
|
|
1784
|
+
end
|
|
1785
|
+
end
|
|
1786
|
+
end
|
|
1787
|
+
```
|
|
1788
|
+
|
|
1789
|
+
**Step 2: Run tests to verify they fail**
|
|
1790
|
+
|
|
1791
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/questions/essay_spec.rb spec/atomic_assessments_import/questions/short_answer_spec.rb -v`
|
|
1792
|
+
Expected: FAIL — uninitialized constants
|
|
1793
|
+
|
|
1794
|
+
**Step 3: Write implementations**
|
|
1795
|
+
|
|
1796
|
+
Create `lib/atomic_assessments_import/questions/essay.rb`:
|
|
1797
|
+
|
|
1798
|
+
```ruby
|
|
1799
|
+
# frozen_string_literal: true
|
|
1800
|
+
|
|
1801
|
+
require_relative "question"
|
|
1802
|
+
|
|
1803
|
+
module AtomicAssessmentsImport
|
|
1804
|
+
module Questions
|
|
1805
|
+
class Essay < Question
|
|
1806
|
+
def question_type
|
|
1807
|
+
"longanswer"
|
|
1808
|
+
end
|
|
1809
|
+
|
|
1810
|
+
def question_data
|
|
1811
|
+
data = super
|
|
1812
|
+
word_limit = @row["word_limit"]&.to_i
|
|
1813
|
+
data[:max_length] = word_limit if word_limit && word_limit > 0
|
|
1814
|
+
data
|
|
1815
|
+
end
|
|
1816
|
+
end
|
|
1817
|
+
end
|
|
1818
|
+
end
|
|
1819
|
+
```
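The word-limit handling in `Essay#question_data` relies on `&.to_i` and a positivity check. A minimal sketch of that behaviour, using a hypothetical helper name (`max_length_for`) and a plain hash in place of the row:

```ruby
# Hypothetical helper mirroring the word-limit logic in Essay#question_data.
# Returns the limit as an Integer, or nil when no usable limit is present.
def max_length_for(row)
  word_limit = row["word_limit"]&.to_i # nil stays nil; "abc".to_i is 0
  word_limit if word_limit&.positive?
end
```

Under these assumptions a missing key, a non-numeric string, and a zero or negative value all leave `max_length` unset instead of raising.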
|
|
1820
|
+
|
|
1821
|
+
Create `lib/atomic_assessments_import/questions/short_answer.rb`:
|
|
1822
|
+
|
|
1823
|
+
```ruby
|
|
1824
|
+
# frozen_string_literal: true
|
|
1825
|
+
|
|
1826
|
+
require_relative "question"
|
|
1827
|
+
|
|
1828
|
+
module AtomicAssessmentsImport
|
|
1829
|
+
module Questions
|
|
1830
|
+
class ShortAnswer < Question
|
|
1831
|
+
def question_type
|
|
1832
|
+
"shorttext"
|
|
1833
|
+
end
|
|
1834
|
+
|
|
1835
|
+
def question_data
|
|
1836
|
+
super.merge(
|
|
1837
|
+
validation: {
|
|
1838
|
+
valid_response: {
|
|
1839
|
+
score: points,
|
|
1840
|
+
value: @row["correct answer"] || "",
|
|
1841
|
+
},
|
|
1842
|
+
}
|
|
1843
|
+
)
|
|
1844
|
+
end
|
|
1845
|
+
end
|
|
1846
|
+
end
|
|
1847
|
+
end
|
|
1848
|
+
```
|
|
1849
|
+
|
|
1850
|
+
**Step 4: Update Question.load** in `lib/atomic_assessments_import/questions/question.rb`
|
|
1851
|
+
|
|
1852
|
+
Change the `self.load` method to include new types:
|
|
1853
|
+
|
|
1854
|
+
```ruby
|
|
1855
|
+
def self.load(row)
|
|
1856
|
+
case row["question type"]
|
|
1857
|
+
when nil, "", /multiple choice/i, /mcq/i, /^ma$/i
|
|
1858
|
+
MultipleChoice.new(row)
|
|
1859
|
+
when /true_false/i, /true\/false/i
|
|
1860
|
+
MultipleChoice.new(row)
|
|
1861
|
+
when /essay/i, /longanswer/i
|
|
1862
|
+
Essay.new(row)
|
|
1863
|
+
when /short_answer/i, /shorttext/i
|
|
1864
|
+
ShortAnswer.new(row)
|
|
1865
|
+
else
|
|
1866
|
+
raise "Unknown question type #{row['question type']}"
|
|
1867
|
+
end
|
|
1868
|
+
end
|
|
1869
|
+
```
|
|
1870
|
+
|
|
1871
|
+
No requires need to be added to `question.rb` itself: it is loaded first, and each subclass requires it. The existing pattern is that the `converter.rb` files require all question classes, so the new requires go there.
|
|
1872
|
+
|
|
1873
|
+
Concretely, add the requires in the files that call `Question.load`. The existing exam_soft converter already requires `question` and `multiple_choice`; add the `essay` and `short_answer` requires alongside those.
|
|
1874
|
+
|
|
1875
|
+
**Step 5: Run tests to verify they pass**
|
|
1876
|
+
|
|
1877
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/questions/essay_spec.rb spec/atomic_assessments_import/questions/short_answer_spec.rb -v`
|
|
1878
|
+
Expected: PASS
|
|
1879
|
+
|
|
1880
|
+
**Step 6: Run all tests to check nothing broke**
|
|
1881
|
+
|
|
1882
|
+
Run: `bundle exec rspec`
|
|
1883
|
+
Expected: All pass
|
|
1884
|
+
|
|
1885
|
+
**Step 7: Commit**
|
|
1886
|
+
|
|
1887
|
+
```bash
|
|
1888
|
+
git add lib/atomic_assessments_import/questions/essay.rb lib/atomic_assessments_import/questions/short_answer.rb lib/atomic_assessments_import/questions/question.rb spec/atomic_assessments_import/questions/essay_spec.rb spec/atomic_assessments_import/questions/short_answer_spec.rb
|
|
1889
|
+
git commit -m "feat: add Essay and ShortAnswer question types"
|
|
1890
|
+
```
|
|
1891
|
+
|
|
1892
|
+
---
|
|
1893
|
+
|
|
1894
|
+
### Task 9: New Question Type Classes — FillInTheBlank, Matching, Ordering
|
|
1895
|
+
|
|
1896
|
+
**Files:**
|
|
1897
|
+
- Create: `lib/atomic_assessments_import/questions/fill_in_the_blank.rb`
|
|
1898
|
+
- Create: `lib/atomic_assessments_import/questions/matching.rb`
|
|
1899
|
+
- Create: `lib/atomic_assessments_import/questions/ordering.rb`
|
|
1900
|
+
- Test: `spec/atomic_assessments_import/questions/fill_in_the_blank_spec.rb`
|
|
1901
|
+
- Test: `spec/atomic_assessments_import/questions/matching_spec.rb`
|
|
1902
|
+
- Test: `spec/atomic_assessments_import/questions/ordering_spec.rb`
|
|
1903
|
+
- Modify: `lib/atomic_assessments_import/questions/question.rb:12-18` (add remaining cases to `self.load`)
|
|
1904
|
+
|
|
1905
|
+
**Step 1: Write failing tests**
|
|
1906
|
+
|
|
1907
|
+
Create `spec/atomic_assessments_import/questions/fill_in_the_blank_spec.rb`:
|
|
1908
|
+
|
|
1909
|
+
```ruby
|
|
1910
|
+
# frozen_string_literal: true
|
|
1911
|
+
|
|
1912
|
+
require "atomic_assessments_import"
|
|
1913
|
+
|
|
1914
|
+
RSpec.describe AtomicAssessmentsImport::Questions::FillInTheBlank do
|
|
1915
|
+
let(:row) do
|
|
1916
|
+
{
|
|
1917
|
+
"question text" => "The capital of France is {{response}}.",
|
|
1918
|
+
"question type" => "fill_in_the_blank",
|
|
1919
|
+
"correct answer" => "Paris",
|
|
1920
|
+
"points" => "1",
|
|
1921
|
+
}
|
|
1922
|
+
end
|
|
1923
|
+
|
|
1924
|
+
describe "#question_type" do
|
|
1925
|
+
it "returns clozetext" do
|
|
1926
|
+
question = described_class.new(row)
|
|
1927
|
+
expect(question.question_type).to eq("clozetext")
|
|
1928
|
+
end
|
|
1929
|
+
end
|
|
1930
|
+
|
|
1931
|
+
describe "#to_learnosity" do
|
|
1932
|
+
it "returns correct structure" do
|
|
1933
|
+
question = described_class.new(row)
|
|
1934
|
+
result = question.to_learnosity
|
|
1935
|
+
|
|
1936
|
+
expect(result[:type]).to eq("clozetext")
|
|
1937
|
+
expect(result[:data][:stimulus]).to eq("The capital of France is {{response}}.")
|
|
1938
|
+
end
|
|
1939
|
+
|
|
1940
|
+
it "includes validation with correct answer" do
|
|
1941
|
+
question = described_class.new(row)
|
|
1942
|
+
result = question.to_learnosity
|
|
1943
|
+
|
|
1944
|
+
expect(result[:data][:validation][:valid_response][:score]).to eq(1)
|
|
1945
|
+
expect(result[:data][:validation][:valid_response][:value]).to eq(["Paris"])
|
|
1946
|
+
end
|
|
1947
|
+
end
|
|
1948
|
+
end
|
|
1949
|
+
```
|
|
1950
|
+
|
|
1951
|
+
Create `spec/atomic_assessments_import/questions/matching_spec.rb`:
|
|
1952
|
+
|
|
1953
|
+
```ruby
|
|
1954
|
+
# frozen_string_literal: true
|
|
1955
|
+
|
|
1956
|
+
require "atomic_assessments_import"
|
|
1957
|
+
|
|
1958
|
+
RSpec.describe AtomicAssessmentsImport::Questions::Matching do
|
|
1959
|
+
let(:row) do
|
|
1960
|
+
{
|
|
1961
|
+
"question text" => "Match the countries to their capitals.",
|
|
1962
|
+
"question type" => "matching",
|
|
1963
|
+
"option a" => "France",
|
|
1964
|
+
"option b" => "Germany",
|
|
1965
|
+
"option c" => "Spain",
|
|
1966
|
+
"match a" => "Paris",
|
|
1967
|
+
"match b" => "Berlin",
|
|
1968
|
+
"match c" => "Madrid",
|
|
1969
|
+
"points" => "3",
|
|
1970
|
+
}
|
|
1971
|
+
end
|
|
1972
|
+
|
|
1973
|
+
describe "#question_type" do
|
|
1974
|
+
it "returns association" do
|
|
1975
|
+
question = described_class.new(row)
|
|
1976
|
+
expect(question.question_type).to eq("association")
|
|
1977
|
+
end
|
|
1978
|
+
end
|
|
1979
|
+
|
|
1980
|
+
describe "#to_learnosity" do
|
|
1981
|
+
it "returns correct structure" do
|
|
1982
|
+
question = described_class.new(row)
|
|
1983
|
+
result = question.to_learnosity
|
|
1984
|
+
|
|
1985
|
+
expect(result[:type]).to eq("association")
|
|
1986
|
+
expect(result[:data][:stimulus]).to eq("Match the countries to their capitals.")
|
|
1987
|
+
end
|
|
1988
|
+
|
|
1989
|
+
it "includes stimulus and possible responses" do
|
|
1990
|
+
question = described_class.new(row)
|
|
1991
|
+
result = question.to_learnosity
|
|
1992
|
+
|
|
1993
|
+
expect(result[:data][:stimulus_list].length).to eq(3)
|
|
1994
|
+
expect(result[:data][:possible_responses].length).to eq(3)
|
|
1995
|
+
end
|
|
1996
|
+
|
|
1997
|
+
it "includes validation" do
|
|
1998
|
+
question = described_class.new(row)
|
|
1999
|
+
result = question.to_learnosity
|
|
2000
|
+
|
|
2001
|
+
expect(result[:data][:validation][:valid_response][:score]).to eq(3)
|
|
2002
|
+
expect(result[:data][:validation][:valid_response][:value].length).to eq(3)
|
|
2003
|
+
end
|
|
2004
|
+
end
|
|
2005
|
+
end
|
|
2006
|
+
```
|
|
2007
|
+
|
|
2008
|
+
Create `spec/atomic_assessments_import/questions/ordering_spec.rb`:
|
|
2009
|
+
|
|
2010
|
+
```ruby
|
|
2011
|
+
# frozen_string_literal: true
|
|
2012
|
+
|
|
2013
|
+
require "atomic_assessments_import"
|
|
2014
|
+
|
|
2015
|
+
RSpec.describe AtomicAssessmentsImport::Questions::Ordering do
|
|
2016
|
+
let(:row) do
|
|
2017
|
+
{
|
|
2018
|
+
"question text" => "Arrange these events in chronological order.",
|
|
2019
|
+
"question type" => "ordering",
|
|
2020
|
+
"option a" => "World War I",
|
|
2021
|
+
"option b" => "World War II",
|
|
2022
|
+
"option c" => "Cold War",
|
|
2023
|
+
"correct answer" => "a; b; c",
|
|
2024
|
+
"points" => "3",
|
|
2025
|
+
}
|
|
2026
|
+
end
|
|
2027
|
+
|
|
2028
|
+
describe "#question_type" do
|
|
2029
|
+
it "returns orderlist" do
|
|
2030
|
+
question = described_class.new(row)
|
|
2031
|
+
expect(question.question_type).to eq("orderlist")
|
|
2032
|
+
end
|
|
2033
|
+
end
|
|
2034
|
+
|
|
2035
|
+
describe "#to_learnosity" do
|
|
2036
|
+
it "returns correct structure" do
|
|
2037
|
+
question = described_class.new(row)
|
|
2038
|
+
result = question.to_learnosity
|
|
2039
|
+
|
|
2040
|
+
expect(result[:type]).to eq("orderlist")
|
|
2041
|
+
expect(result[:data][:stimulus]).to eq("Arrange these events in chronological order.")
|
|
2042
|
+
end
|
|
2043
|
+
|
|
2044
|
+
it "includes list of items" do
|
|
2045
|
+
question = described_class.new(row)
|
|
2046
|
+
result = question.to_learnosity
|
|
2047
|
+
|
|
2048
|
+
expect(result[:data][:list].length).to eq(3)
|
|
2049
|
+
end
|
|
2050
|
+
|
|
2051
|
+
it "includes validation with correct order" do
|
|
2052
|
+
question = described_class.new(row)
|
|
2053
|
+
result = question.to_learnosity
|
|
2054
|
+
|
|
2055
|
+
expect(result[:data][:validation][:valid_response][:score]).to eq(3)
|
|
2056
|
+
expect(result[:data][:validation][:valid_response][:value]).to eq(["0", "1", "2"])
|
|
2057
|
+
end
|
|
2058
|
+
end
|
|
2059
|
+
end
|
|
2060
|
+
```
|
|
2061
|
+
|
|
2062
|
+
**Step 2: Run tests to verify they fail**
|
|
2063
|
+
|
|
2064
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/questions/fill_in_the_blank_spec.rb spec/atomic_assessments_import/questions/matching_spec.rb spec/atomic_assessments_import/questions/ordering_spec.rb -v`
|
|
2065
|
+
Expected: FAIL — uninitialized constants
|
|
2066
|
+
|
|
2067
|
+
**Step 3: Write implementations**
|
|
2068
|
+
|
|
2069
|
+
Create `lib/atomic_assessments_import/questions/fill_in_the_blank.rb`:
|
|
2070
|
+
|
|
2071
|
+
```ruby
|
|
2072
|
+
# frozen_string_literal: true
|
|
2073
|
+
|
|
2074
|
+
require_relative "question"
|
|
2075
|
+
|
|
2076
|
+
module AtomicAssessmentsImport
|
|
2077
|
+
module Questions
|
|
2078
|
+
class FillInTheBlank < Question
|
|
2079
|
+
def question_type
|
|
2080
|
+
"clozetext"
|
|
2081
|
+
end
|
|
2082
|
+
|
|
2083
|
+
def question_data
|
|
2084
|
+
answers = (@row["correct answer"] || "").split(";").map(&:strip)
|
|
2085
|
+
|
|
2086
|
+
super.merge(
|
|
2087
|
+
validation: {
|
|
2088
|
+
valid_response: {
|
|
2089
|
+
score: points,
|
|
2090
|
+
value: answers,
|
|
2091
|
+
},
|
|
2092
|
+
}
|
|
2093
|
+
)
|
|
2094
|
+
end
|
|
2095
|
+
end
|
|
2096
|
+
end
|
|
2097
|
+
end
|
|
2098
|
+
```
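The answer parsing in `FillInTheBlank#question_data` is a plain split on semicolons. The sketch below isolates it in a hypothetical helper (`blank_answers` is an illustrative name) to show what a multi-blank row would produce.

```ruby
# Hypothetical helper: splits a "correct answer" cell into one entry per blank.
# A nil cell yields an empty list.
def blank_answers(raw)
  (raw || "").split(";").map(&:strip)
end
```

One consequence of this design is that an answer which itself contains a semicolon would be split into two entries, so such answers would need a different delimiter or escaping.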
|
|
2099
|
+
|
|
2100
|
+
Create `lib/atomic_assessments_import/questions/matching.rb`:
|
|
2101
|
+
|
|
2102
|
+
```ruby
|
|
2103
|
+
# frozen_string_literal: true
|
|
2104
|
+
|
|
2105
|
+
require_relative "question"
|
|
2106
|
+
|
|
2107
|
+
module AtomicAssessmentsImport
|
|
2108
|
+
module Questions
|
|
2109
|
+
class Matching < Question
|
|
2110
|
+
INDEXES = ("a".."o").to_a.freeze
|
|
2111
|
+
|
|
2112
|
+
def question_type
|
|
2113
|
+
"association"
|
|
2114
|
+
end
|
|
2115
|
+
|
|
2116
|
+
def question_data
|
|
2117
|
+
stimulus_list = []
|
|
2118
|
+
possible_responses = []
|
|
2119
|
+
valid_values = []
|
|
2120
|
+
|
|
2121
|
+
INDEXES.each do |letter|
|
|
2122
|
+
option = @row["option #{letter}"]
|
|
2123
|
+
match = @row["match #{letter}"]
|
|
2124
|
+
break unless option
|
|
2125
|
+
|
|
2126
|
+
stimulus_list << option
|
|
2127
|
+
possible_responses << match if match
|
|
2128
|
+
valid_values << match if match
|
|
2129
|
+
end
|
|
2130
|
+
|
|
2131
|
+
super.merge(
|
|
2132
|
+
stimulus_list: stimulus_list,
|
|
2133
|
+
possible_responses: possible_responses,
|
|
2134
|
+
validation: {
|
|
2135
|
+
valid_response: {
|
|
2136
|
+
score: points,
|
|
2137
|
+
value: valid_values,
|
|
2138
|
+
},
|
|
2139
|
+
}
|
|
2140
|
+
)
|
|
2141
|
+
end
|
|
2142
|
+
end
|
|
2143
|
+
end
|
|
2144
|
+
end
|
|
2145
|
+
```
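To make the row-walking in `Matching#question_data` easier to follow, here is a minimal sketch with a plain hash as the row. The helper name `matching_lists` and the constant `MATCH_LETTERS` are hypothetical; the loop follows the same `break unless option` rule as the class above.

```ruby
MATCH_LETTERS = ("a".."o").to_a.freeze

# Hypothetical helper: collects "option x" / "match x" pairs until the first
# missing option, as the Matching class in the plan does.
def matching_lists(row)
  stimulus_list = []
  possible_responses = []

  MATCH_LETTERS.each do |letter|
    option = row["option #{letter}"]
    break unless option # stops at the first gap in the lettering

    stimulus_list << option
    match = row["match #{letter}"]
    possible_responses << match if match
  end

  { stimulus_list: stimulus_list, possible_responses: possible_responses }
end
```

Because the loop breaks at the first missing letter, a row that has `option a` and `option c` but no `option b` would only contribute the first pair in this sketch.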
|
|
2146
|
+
|
|
2147
|
+
Create `lib/atomic_assessments_import/questions/ordering.rb`:
|
|
2148
|
+
|
|
2149
|
+
```ruby
|
|
2150
|
+
# frozen_string_literal: true
|
|
2151
|
+
|
|
2152
|
+
require_relative "question"
|
|
2153
|
+
|
|
2154
|
+
module AtomicAssessmentsImport
|
|
2155
|
+
module Questions
|
|
2156
|
+
class Ordering < Question
|
|
2157
|
+
INDEXES = ("a".."o").to_a.freeze
|
|
2158
|
+
|
|
2159
|
+
def question_type
|
|
2160
|
+
"orderlist"
|
|
2161
|
+
end
|
|
2162
|
+
|
|
2163
|
+
def question_data
|
|
2164
|
+
items = []
|
|
2165
|
+
INDEXES.each do |letter|
|
|
2166
|
+
option = @row["option #{letter}"]
|
|
2167
|
+
break unless option
|
|
2168
|
+
|
|
2169
|
+
items << option
|
|
2170
|
+
end
|
|
2171
|
+
|
|
2172
|
+
# Parse correct order from "a; b; c" format
|
|
2173
|
+
order = (@row["correct answer"] || "").split(";").map(&:strip).map(&:downcase)
|
|
2174
|
+
valid_values = order.filter_map { |letter| INDEXES.find_index(letter)&.to_s }
|
|
2175
|
+
|
|
2176
|
+
super.merge(
|
|
2177
|
+
list: items,
|
|
2178
|
+
validation: {
|
|
2179
|
+
valid_response: {
|
|
2180
|
+
score: points,
|
|
2181
|
+
value: valid_values,
|
|
2182
|
+
},
|
|
2183
|
+
}
|
|
2184
|
+
)
|
|
2185
|
+
end
|
|
2186
|
+
end
|
|
2187
|
+
end
|
|
2188
|
+
end
|
|
2189
|
+
```
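The conversion from an `"a; b; c"` answer string to index strings is the part of `Ordering#question_data` most likely to surprise. A small sketch, assuming a hypothetical helper name (`correct_order_indexes`) and the same `a`..`o` letter range:

```ruby
ORDER_LETTERS = ("a".."o").to_a.freeze

# Hypothetical helper: maps option letters to zero-based index strings,
# silently dropping anything that is not a known letter.
def correct_order_indexes(raw)
  letters = (raw || "").split(";").map { |part| part.strip.downcase }
  letters.filter_map { |letter| ORDER_LETTERS.index(letter)&.to_s }
end
```

Note that unknown letters are dropped instead of reported, so a typo in the source document would shorten the validation list without a warning in this sketch.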
|
|
2190
|
+
|
|
2191
|
+
**Step 4: Update Question.load** in `lib/atomic_assessments_import/questions/question.rb`
|
|
2192
|
+
|
|
2193
|
+
Final version of `self.load`:
|
|
2194
|
+
|
|
2195
|
+
```ruby
|
|
2196
|
+
def self.load(row)
|
|
2197
|
+
case row["question type"]
|
|
2198
|
+
when nil, "", /multiple choice/i, /mcq/i, /^ma$/i
|
|
2199
|
+
MultipleChoice.new(row)
|
|
2200
|
+
when /true_false/i, /true\/false/i
|
|
2201
|
+
MultipleChoice.new(row)
|
|
2202
|
+
when /essay/i, /longanswer/i
|
|
2203
|
+
Essay.new(row)
|
|
2204
|
+
when /short_answer/i, /shorttext/i
|
|
2205
|
+
ShortAnswer.new(row)
|
|
2206
|
+
when /fill_in_the_blank/i, /cloze/i
|
|
2207
|
+
FillInTheBlank.new(row)
|
|
2208
|
+
when /matching/i, /association/i
|
|
2209
|
+
Matching.new(row)
|
|
2210
|
+
when /ordering/i, /orderlist/i
|
|
2211
|
+
Ordering.new(row)
|
|
2212
|
+
else
|
|
2213
|
+
raise "Unknown question type #{row['question type']}"
|
|
2214
|
+
end
|
|
2215
|
+
end
|
|
2216
|
+
```
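The `case` in `self.load` depends on `Regexp#===` and on branch order. The sketch below is a hypothetical stand-in (`question_class_for` is an illustrative name) that returns a symbol instead of instantiating a question class, so the dispatch can be tried without the rest of the gem.

```ruby
# Hypothetical stand-in for Question.load: returns a symbol naming the class
# that would be chosen. `when` uses ===, so regexes match against the string.
def question_class_for(type)
  case type
  when nil, "", /multiple choice/i, /mcq/i, /^ma$/i
    :multiple_choice
  when /true_false/i, %r{true/false}i
    :multiple_choice
  when /essay/i, /longanswer/i
    :essay
  when /short_answer/i, /shorttext/i
    :short_answer
  when /fill_in_the_blank/i, /cloze/i
    :fill_in_the_blank
  when /matching/i, /association/i
    :matching
  when /ordering/i, /orderlist/i
    :ordering
  else
    raise ArgumentError, "Unknown question type #{type}"
  end
end
```

The anchors in `/^ma$/i` matter here: without them, `"matching"` would be caught by the first branch and loaded as multiple choice.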
|
|
2217
|
+
|
|
2218
|
+
**Step 5: Run tests to verify they pass**
|
|
2219
|
+
|
|
2220
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/questions/ -v`
|
|
2221
|
+
Expected: PASS
|
|
2222
|
+
|
|
2223
|
+
**Step 6: Commit**
|
|
2224
|
+
|
|
2225
|
+
```bash
|
|
2226
|
+
git add lib/atomic_assessments_import/questions/ spec/atomic_assessments_import/questions/
|
|
2227
|
+
git commit -m "feat: add FillInTheBlank, Matching, and Ordering question types"
|
|
2228
|
+
```
|
|
2229
|
+
|
|
2230
|
+
---
|
|
2231
|
+
|
|
2232
|
+
### Task 10: Refactor ExamSoft::Converter to Use New Pipeline
|
|
2233
|
+
|
|
2234
|
+
Replace the monolithic regex-based converter with the chunker + extractor pipeline.
|
|
2235
|
+
|
|
2236
|
+
**Files:**
|
|
2237
|
+
- Modify: `lib/atomic_assessments_import/exam_soft/converter.rb` (major rewrite)
|
|
2238
|
+
- Modify: `lib/atomic_assessments_import/exam_soft.rb` (add requires)
|
|
2239
|
+
|
|
2240
|
+
**Step 1: Read and understand the existing converter**
|
|
2241
|
+
|
|
2242
|
+
The existing converter is at `lib/atomic_assessments_import/exam_soft/converter.rb`. It handles:
|
|
2243
|
+
1. File input (String path or Tempfile)
|
|
2244
|
+
2. Pandoc conversion to HTML
|
|
2245
|
+
3. Regex chunking + extraction
|
|
2246
|
+
4. Building row_mock
|
|
2247
|
+
5. Calling convert_row to build items/questions
|
|
2248
|
+
|
|
2249
|
+
We keep steps 1-2 and 5, and replace steps 3-4 with Chunker + Extractor.
|
|
2250
|
+
|
|
2251
|
+
**Step 2: Rewrite the converter**
|
|
2252
|
+
|
|
2253
|
+
Replace `lib/atomic_assessments_import/exam_soft/converter.rb` with:
|
|
2254
|
+
|
|
2255
|
+
```ruby
|
|
2256
|
+
# frozen_string_literal: true
|
|
2257
|
+
|
|
2258
|
+
require "pandoc-ruby"
|
|
2259
|
+
require "nokogiri"
|
|
2260
|
+
require "active_support/core_ext/digest/uuid"
|
|
2261
|
+
|
|
2262
|
+
require_relative "../questions/question"
|
|
2263
|
+
require_relative "../questions/multiple_choice"
|
|
2264
|
+
require_relative "../questions/essay"
|
|
2265
|
+
require_relative "../questions/short_answer"
|
|
2266
|
+
require_relative "../questions/fill_in_the_blank"
|
|
2267
|
+
require_relative "../questions/matching"
|
|
2268
|
+
require_relative "../questions/ordering"
|
|
2269
|
+
require_relative "../utils"
|
|
2270
|
+
require_relative "chunker"
|
|
2271
|
+
require_relative "extractor"
|
|
2272
|
+
|
|
2273
|
+
module AtomicAssessmentsImport
|
|
2274
|
+
module ExamSoft
|
|
2275
|
+
class Converter
|
|
2276
|
+
def initialize(file)
|
|
2277
|
+
@file = file
|
|
2278
|
+
end
|
|
2279
|
+
|
|
2280
|
+
def convert
|
|
2281
|
+
html = normalize_to_html
|
|
2282
|
+
doc = Nokogiri::HTML.fragment(html)
|
|
2283
|
+
|
|
2284
|
+
# Chunk the document
|
|
2285
|
+
chunk_result = Chunker.new(doc).chunk
|
|
2286
|
+
all_warnings = chunk_result[:warnings].dup
|
|
2287
|
+
|
|
2288
|
+
# Log header info if present
|
|
2289
|
+
unless chunk_result[:header_nodes].empty?
|
|
2290
|
+
header_text = chunk_result[:header_nodes].map { |n| n.text.strip }.join(" ")
|
|
2291
|
+
all_warnings << "Exam header detected: #{header_text}" unless header_text.empty?
|
|
2292
|
+
end
|
|
2293
|
+
|
|
2294
|
+
items = []
|
|
2295
|
+
questions = []
|
|
2296
|
+
|
|
2297
|
+
chunk_result[:chunks].each_with_index do |chunk_nodes, index|
|
|
2298
|
+
# Extract fields from this chunk
|
|
2299
|
+
extraction = Extractor.new(chunk_nodes).extract
|
|
2300
|
+
all_warnings.concat(extraction[:warnings].map { |w| "Question #{index + 1}: #{w}" })
|
|
2301
|
+
|
|
2302
|
+
row = extraction[:row]
|
|
2303
|
+
status = extraction[:status]
|
|
2304
|
+
|
|
2305
|
+
# Skip completely unparseable chunks
|
|
2306
|
+
if row["question text"].nil? && row["option a"].nil?
|
|
2307
|
+
all_warnings << "Question #{index + 1}: Skipped — no usable content found"
|
|
2308
|
+
next
|
|
2309
|
+
end
|
|
2310
|
+
|
|
2311
|
+
begin
|
|
2312
|
+
item, question_widgets = convert_row(row, status)
|
|
2313
|
+
items << item
|
|
2314
|
+
questions += question_widgets
|
|
2315
|
+
rescue StandardError => e
|
|
2316
|
+
title = row["title"] || "Question #{index + 1}"
|
|
2317
|
+
all_warnings << "#{title}: #{e.message}, imported as draft"
|
|
2318
|
+
# Attempt bare-minimum import
|
|
2319
|
+
begin
|
|
2320
|
+
item, question_widgets = convert_row_minimal(row)
|
|
2321
|
+
items << item
|
|
2322
|
+
questions += question_widgets
|
|
2323
|
+
rescue StandardError
|
|
2324
|
+
all_warnings << "#{title}: Could not import even minimally, skipped"
|
|
2325
|
+
end
|
|
2326
|
+
end
|
|
2327
|
+
end
|
|
2328
|
+
|
|
2329
|
+
{
|
|
2330
|
+
activities: [],
|
|
2331
|
+
items: items,
|
|
2332
|
+
questions: questions,
|
|
2333
|
+
features: [],
|
|
2334
|
+
errors: all_warnings,
|
|
2335
|
+
}
|
|
2336
|
+
end
|
|
2337
|
+
|
|
2338
|
+
private
|
|
2339
|
+
|
|
2340
|
+
def normalize_to_html
|
|
2341
|
+
if @file.is_a?(String)
|
|
2342
|
+
PandocRuby.new([@file], from: @file.split(".").last).to_html
|
|
2343
|
+
else
|
|
2344
|
+
source_type = @file.path.split(".").last.match(/^[a-zA-Z]+/)[0]
|
|
2345
|
+
PandocRuby.new(@file.read, from: source_type).to_html
|
|
2346
|
+
end
|
|
2347
|
+
end
|
|
2348
|
+
|
|
2349
|
+
def categories_to_tags(categories)
|
|
2350
|
+
tags = {}
|
|
2351
|
+
(categories || []).each do |cat|
|
|
2352
|
+
if cat.include?("/")
|
|
2353
|
+
key, value = cat.split("/", 2).map(&:strip)
|
|
2354
|
+
tags[key.to_sym] ||= []
|
|
2355
|
+
tags[key.to_sym] << value
|
|
2356
|
+
else
|
|
2357
|
+
tags[cat.to_sym] ||= []
|
|
2358
|
+
end
|
|
2359
|
+
end
|
|
2360
|
+
tags
|
|
2361
|
+
end
|
|
2362
|
+
|
|
2363
|
+
def convert_row(row, status = "published")
|
|
2364
|
+
source = "<p>ExamSoft Import on #{Time.now.strftime('%Y-%m-%d')}</p>\n"
|
|
2365
|
+
if row["question id"].present?
|
|
2366
|
+
source += "<p>External id: #{row['question id']}</p>\n"
|
|
2367
|
+
end
|
|
2368
|
+
|
|
2369
|
+
question = Questions::Question.load(row)
|
|
2370
|
+
item = {
|
|
2371
|
+
reference: SecureRandom.uuid,
|
|
2372
|
+
title: row["title"] || "",
|
|
2373
|
+
status: status,
|
|
2374
|
+
tags: categories_to_tags(row["category"]),
|
|
2375
|
+
metadata: {
|
|
2376
|
+
import_date: Time.now.iso8601,
|
|
2377
|
+
import_type: row["import type"] || "examsoft",
|
|
2378
|
+
},
|
|
2379
|
+
source: source,
|
|
2380
|
+
description: row["description"] || "",
|
|
2381
|
+
questions: [
|
|
2382
|
+
{
|
|
2383
|
+
reference: question.reference,
|
|
2384
|
+
type: question.question_type,
|
|
2385
|
+
},
|
|
2386
|
+
],
|
|
2387
|
+
features: [],
|
|
2388
|
+
definition: {
|
|
2389
|
+
widgets: [
|
|
2390
|
+
{
|
|
2391
|
+
reference: question.reference,
|
|
2392
|
+
widget_type: "response",
|
|
2393
|
+
},
|
|
2394
|
+
],
|
|
2395
|
+
},
|
|
2396
|
+
}
|
|
2397
|
+
[item, [question.to_learnosity]]
|
|
2398
|
+
end
|
|
2399
|
+
|
|
2400
|
+
def convert_row_minimal(row)
|
|
2401
|
+
# Fallback: create a bare item with just the question text
|
|
2402
|
+
reference = SecureRandom.uuid
|
|
2403
|
+
item = {
|
|
2404
|
+
reference: reference,
|
|
2405
|
+
title: row["title"] || "",
|
|
2406
|
+
status: "draft",
|
|
2407
|
+
tags: {},
|
|
2408
|
+
metadata: {
|
|
2409
|
+
import_date: Time.now.iso8601,
|
|
2410
|
+
import_type: "examsoft",
|
|
2411
|
+
},
|
|
2412
|
+
source: "<p>ExamSoft Import on #{Time.now.strftime('%Y-%m-%d')}</p>\n",
|
|
2413
|
+
description: row["question text"] || "",
|
|
2414
|
+
questions: [],
|
|
2415
|
+
features: [],
|
|
2416
|
+
definition: { widgets: [] },
|
|
2417
|
+
}
|
|
2418
|
+
[item, []]
|
|
2419
|
+
end
|
|
2420
|
+
end
|
|
2421
|
+
end
|
|
2422
|
+
end
|
|
2423
|
+
```
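As a worked example of the tag mapping, the sketch below reproduces `categories_to_tags` as a top-level method so it can be run on its own. The input values are made up for illustration.

```ruby
# Standalone copy of the category-to-tag mapping used by the converter above.
# "Key/Value" categories become tags[:Key] << "Value"; bare categories become
# a tag type with an empty value list.
def categories_to_tags(categories)
  tags = {}
  (categories || []).each do |cat|
    if cat.include?("/")
      key, value = cat.split("/", 2).map(&:strip)
      (tags[key.to_sym] ||= []) << value
    else
      tags[cat.to_sym] ||= []
    end
  end
  tags
end

example = categories_to_tags(["Biology/Cells", "Biology/Genetics", "Chemistry"])
```

Because the split is limited to two parts, a deeper path such as `"A/B/C"` keeps `"B/C"` as a single value under `:A`.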
|
|
2424
|
+
|
|
2425
|
+
**Step 3: Run existing tests to check backward compatibility**
|
|
2426
|
+
|
|
2427
|
+
Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/ -v`
|
|
2428
|
+
Expected: Existing tests should mostly pass. Some may need minor adjustments due to error handling changes (e.g., "raises if no options" now produces a warning instead of an exception).
|
|
2429
|
+
|
|
2430
|
+
**Step 4: Update existing ExamSoft specs for new behavior**
|
|
2431
|
+
|
|
2432
|
+
The tests that expect `raise_error` for missing options/correct answers need to change — the new converter uses best-effort and produces warnings instead. Update `spec/atomic_assessments_import/examsoft/docx_converter_spec.rb`:
|
|
2433
|
+
|
|
2434
|
+
Change the "raises if no options" test to:
|
|
2435
|
+
```ruby
|
|
2436
|
+
it "warns and imports as draft if no options are given" do
|
|
2437
|
+
no_options = Tempfile.new("temp.docx")
|
|
2438
|
+
original_content = File.read("spec/fixtures/no_options.docx")
|
|
2439
|
+
no_options.write(original_content)
|
|
2440
|
+
no_options.rewind
|
|
2441
|
+
|
|
2442
|
+
data = described_class.new(no_options).convert
|
|
2443
|
+
expect(data[:errors]).to include(a_string_matching(/no options|missing options/i))
|
|
2444
|
+
end
|
|
2445
|
+
```
|
|
2446
|
+
|
|
2447
|
+
Change the "raises if no correct answer" test to:
|
|
2448
|
+
```ruby
|
|
2449
|
+
it "warns and imports as draft if no correct answer is given" do
|
|
2450
|
+
no_correct = Tempfile.new("temp.docx")
|
|
2451
|
+
original_content = File.read("spec/fixtures/no_correct.docx")
|
|
2452
|
+
no_correct.write(original_content)
|
|
2453
|
+
no_correct.rewind
|
|
2454
|
+
|
|
2455
|
+
data = described_class.new(no_correct).convert
|
|
2456
|
+
expect(data[:errors]).to include(a_string_matching(/correct answer/i))
|
|
2457
|
+
end
|
|
2458
|
+
```
|
|
2459
|
+
|
|
2460
|
+
Apply similar changes to `html_converter_spec.rb` and `rtf_converter_spec.rb`.
|
|
2461
|
+
|
|
2462
|
+
**Step 5: Run all tests**
|
|
2463
|
+
|
|
2464
|
+
Run: `bundle exec rspec`
|
|
2465
|
+
Expected: All pass
|
|
2466
|
+
|
|
2467
|
+
**Step 6: Commit**
|
|
2468
|
+
|
|
2469
|
+
```bash
|
|
2470
|
+
git add lib/atomic_assessments_import/exam_soft/ spec/atomic_assessments_import/examsoft/
|
|
2471
|
+
git commit -m "refactor: rewrite ExamSoft converter to use chunker + extractor pipeline"
|
|
2472
|
+
```
|
|
2473
|
+
|
|
2474
|
+
---
|
|
2475
|
+
|
|
2476
|
+
### Task 11: Integration Tests — Mixed Types, Messy Documents, Partial Parse

**Files:**
- Create: `spec/fixtures/mixed_types.html`
- Create: `spec/fixtures/messy_document.html`
- Create: `spec/atomic_assessments_import/examsoft/integration_spec.rb`

**Step 1: Create test fixtures**

Create `spec/fixtures/mixed_types.html`:

```html
<p>Exam: Midterm 2024</p>
<p>Total Questions: 4</p>
<p>Folder: Science Title: Q1 Category: Biology/Cells 1) What is the powerhouse of the cell? ~ The mitochondria produces ATP.</p>
<p>*a) Mitochondria</p>
<p>b) Nucleus</p>
<p>c) Ribosome</p>
<p>Type: Essay Folder: Writing Title: Q2 Category: English/Composition 2) Discuss the themes of Hamlet.</p>
<p>Type: MA Folder: Geography Title: Q3 Category: Capitals 3) Select all European capitals.</p>
<p>*a) Paris</p>
<p>*b) Berlin</p>
<p>c) New York</p>
<p>Folder: Science Title: Q4 Category: Chemistry 4) What is the chemical symbol for gold?</p>
<p>*a) Au</p>
<p>b) Ag</p>
<p>c) Fe</p>
```

Create `spec/fixtures/messy_document.html`:

```html
<p>Some random header text</p>
<p></p>
<p>Folder: Test Title: Q1 Category: General 1) A normal question? ~ Normal explanation</p>
<p>*a) Correct</p>
<p>b) Wrong</p>
<p>Folder: Test Title: Q2 Category: General 2) A question with no options at all</p>
<p>Folder: Test Title: Q3 Category: General 3) Another normal question? ~ Another explanation</p>
<p>*a) Right</p>
<p>b) Wrong</p>
```
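
In both fixtures, option lines have the form `a) text`, and a leading `*` appears on the options that are the correct answers. A minimal sketch of reading that convention follows; `OPTION_LINE` and `parse_option` are hypothetical names for illustration, and the regex is an assumption based only on these fixtures, not the pattern the gem's options detector uses.

```ruby
# Hypothetical sketch: recognise "a) text" option lines, where a leading "*"
# marks a correct answer (as seen in the fixtures above).
OPTION_LINE = /\A(?<correct>\*)?(?<letter>[a-z])\)\s*(?<text>.+)\z/i

# Returns a hash describing the option, or nil if the line is not an option.
def parse_option(line)
  match = OPTION_LINE.match(line.strip)
  return nil unless match

  {
    letter: match[:letter].downcase,
    text: match[:text],
    correct: !match[:correct].nil?,
  }
end

correct_option = parse_option("*a) Mitochondria")
wrong_option = parse_option("b) Nucleus")
not_an_option = parse_option("Some random header text")
```

Lines that do not match, such as the stray header text or a numbered question stem, return `nil`, so a question like Q2 in `messy_document.html` ends up with no options at all.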

**Step 2: Write integration tests**

Create `spec/atomic_assessments_import/examsoft/integration_spec.rb`:

```ruby
# frozen_string_literal: true

require "atomic_assessments_import"

RSpec.describe "ExamSoft Integration" do
  describe "mixed question types" do
    it "handles a document with MCQ, essay, and MA questions" do
      data = AtomicAssessmentsImport::ExamSoft::Converter.new("spec/fixtures/mixed_types.html").convert

      expect(data[:items].length).to eq(4)

      # MCQ question
      q1 = data[:questions].find { |q| q[:data][:stimulus]&.include?("powerhouse") }
      expect(q1).not_to be_nil
      expect(q1[:type]).to eq("mcq")

      # Essay question
      q2 = data[:questions].find { |q| q[:data][:stimulus]&.include?("Hamlet") }
      expect(q2).not_to be_nil
      expect(q2[:type]).to eq("longanswer")

      # MA question
      q3 = data[:questions].find { |q| q[:data][:stimulus]&.include?("European capitals") }
      expect(q3).not_to be_nil
    end

    it "reports exam header in warnings" do
      data = AtomicAssessmentsImport::ExamSoft::Converter.new("spec/fixtures/mixed_types.html").convert

      expect(data[:errors]).to include(a_string_matching(/header/i))
    end
  end

  describe "messy documents with partial parse" do
    it "imports what it can and warns about problems" do
      data = AtomicAssessmentsImport::ExamSoft::Converter.new("spec/fixtures/messy_document.html").convert

      # Should get at least 2 good items (Q1 and Q3)
      published = data[:items].select { |i| i[:status] == "published" }
      expect(published.length).to be >= 2

      # Should have warnings about Q2 (no options for what looks like MCQ)
      expect(data[:errors].length).to be > 0
    end
  end

  describe "backward compatibility" do
    it "produces the same structure from simple.html as before" do
      data = AtomicAssessmentsImport::ExamSoft::Converter.new("spec/fixtures/simple.html").convert

      expect(data[:items].length).to eq(3)
      expect(data[:questions].length).to eq(3)
      expect(data[:activities]).to eq([])
      expect(data[:features]).to eq([])

      item1 = data[:items].find { |i| i[:title] == "Question 1" }
      expect(item1).not_to be_nil
      expect(item1[:status]).to eq("published")

      q1 = data[:questions].find { |q| q[:data][:stimulus] == "What is the capital of France?" }
      expect(q1).not_to be_nil
      expect(q1[:data][:options].length).to eq(3)
    end
  end
end
```
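
When debugging a partial parse by hand, it can help to condense the converter result into a few counts. The helper below is a hypothetical sketch, not part of the gem: it assumes only the result shape the specs above rely on (`data[:items]` with a `:status` key, `data[:errors]` as an array of strings), and the sample hash is stand-in data rather than real converter output.

```ruby
# Hypothetical helper for inspecting an import result by hand.
# Assumes the hash shape used in the integration specs.
def summarize_import(data)
  counts = data[:items].map { |item| item[:status] }.tally
  {
    published: counts.fetch("published", 0),
    other: counts.reject { |status, _| status == "published" }.values.sum,
    warnings: data[:errors].length,
  }
end

# Stand-in data for illustration only.
sample = {
  items: [{ status: "published" }, { status: "draft" }, { status: "published" }],
  errors: ["Question 2: no options found"],
}
summary = summarize_import(sample)
```

Against the real converter, the same helper could be called as `summarize_import(AtomicAssessmentsImport::ExamSoft::Converter.new(path).convert)`, where `path` is one of the fixtures above.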

**Step 3: Run integration tests**

Run: `bundle exec rspec spec/atomic_assessments_import/examsoft/integration_spec.rb --format documentation`
Expected: PASS

**Step 4: Run full test suite**

Run: `bundle exec rspec`
Expected: All pass

**Step 5: Commit**

```bash
git add spec/fixtures/mixed_types.html spec/fixtures/messy_document.html spec/atomic_assessments_import/examsoft/integration_spec.rb
git commit -m "test: add integration tests for mixed types, messy docs, backward compat"
```

---

### Task 12: Final Cleanup and Full Test Run

**Files:**
- Review: all modified files
- Clean up: any dead code from old converter, unused comments

**Step 1: Run full test suite**

Run: `bundle exec rspec --format documentation`
Expected: All pass

**Step 2: Check for dead code**

Look for leftover references to the old regex patterns in the converter. The old `chunk_pattern`, `meta_regex`, `question_regex`, `explanation_regex`, and `options_regex` patterns should all be gone, since they were local variables in the old `convert` method rather than constants.
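
One way to make this check mechanical is a small scan for the old names. The script below is a sketch under assumptions: `LEFTOVER_NAMES` and `find_leftovers` are hypothetical helpers written for this plan, and the glob in the usage comment assumes the converter code lives under `lib/atomic_assessments_import/exam_soft/`.

```ruby
# Hypothetical dead-code check: report lines that still mention the old
# pattern names. Not part of the gem.
LEFTOVER_NAMES = %w[
  chunk_pattern meta_regex question_regex explanation_regex options_regex
].freeze

# sources: hash of label (e.g. file path) => file contents.
# Returns "label:line_number: text" for every line that mentions an old name.
def find_leftovers(sources, names = LEFTOVER_NAMES)
  pattern = Regexp.union(names.map { |name| /\b#{Regexp.escape(name)}\b/ })
  sources.flat_map do |label, text|
    text.each_line.with_index(1).filter_map do |line, number|
      "#{label}:#{number}: #{line.strip}" if line.match?(pattern)
    end
  end
end

# Usage against real files (path is an assumption):
#   paths = Dir.glob("lib/atomic_assessments_import/exam_soft/**/*.rb")
#   puts find_leftovers(paths.to_h { |path| [path, File.read(path)] })
```

An empty result means none of the old names remain; any output lists the file and line to review.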

**Step 3: Run rubocop if configured**

Run: `bundle exec rubocop lib/atomic_assessments_import/exam_soft/ lib/atomic_assessments_import/questions/`
Fix any style issues.

**Step 4: Final commit**

```bash
git add -A
git commit -m "chore: cleanup after ExamSoft converter refactor"
```