docsmith 0.1.0

Files changed (45)
  1. checksums.yaml +7 -0
  2. data/.rspec +3 -0
  3. data/.rspec_status +212 -0
  4. data/CHANGELOG.md +5 -0
  5. data/CODE_OF_CONDUCT.md +132 -0
  6. data/LICENSE.txt +21 -0
  7. data/README.md +66 -0
  8. data/Rakefile +8 -0
  9. data/USAGE.md +510 -0
  10. data/docs/superpowers/plans/2026-04-01-docsmith-full-plan.md +6459 -0
  11. data/docs/superpowers/plans/2026-04-08-parsers-remove-branches-docs.md +2112 -0
  12. data/docs/superpowers/specs/2026-04-01-docsmith-phase1-design.md +340 -0
  13. data/docsmith_spec.md +630 -0
  14. data/lib/docsmith/auto_save.rb +29 -0
  15. data/lib/docsmith/comments/anchor.rb +68 -0
  16. data/lib/docsmith/comments/comment.rb +44 -0
  17. data/lib/docsmith/comments/manager.rb +73 -0
  18. data/lib/docsmith/comments/migrator.rb +64 -0
  19. data/lib/docsmith/configuration.rb +95 -0
  20. data/lib/docsmith/diff/engine.rb +39 -0
  21. data/lib/docsmith/diff/parsers/html.rb +64 -0
  22. data/lib/docsmith/diff/parsers/markdown.rb +60 -0
  23. data/lib/docsmith/diff/renderers/base.rb +62 -0
  24. data/lib/docsmith/diff/renderers/registry.rb +41 -0
  25. data/lib/docsmith/diff/renderers.rb +10 -0
  26. data/lib/docsmith/diff/result.rb +77 -0
  27. data/lib/docsmith/diff.rb +6 -0
  28. data/lib/docsmith/document.rb +44 -0
  29. data/lib/docsmith/document_version.rb +50 -0
  30. data/lib/docsmith/errors.rb +18 -0
  31. data/lib/docsmith/events/event.rb +19 -0
  32. data/lib/docsmith/events/hook_registry.rb +14 -0
  33. data/lib/docsmith/events/notifier.rb +22 -0
  34. data/lib/docsmith/rendering/html_renderer.rb +36 -0
  35. data/lib/docsmith/rendering/json_renderer.rb +29 -0
  36. data/lib/docsmith/version.rb +5 -0
  37. data/lib/docsmith/version_manager.rb +143 -0
  38. data/lib/docsmith/version_tag.rb +25 -0
  39. data/lib/docsmith/versionable.rb +252 -0
  40. data/lib/docsmith.rb +52 -0
  41. data/lib/generators/docsmith/install/install_generator.rb +27 -0
  42. data/lib/generators/docsmith/install/templates/create_docsmith_tables.rb.erb +64 -0
  43. data/lib/generators/docsmith/install/templates/docsmith_initializer.rb.erb +19 -0
  44. data/sig/docsmith.rbs +4 -0
  45. metadata +196 -0
@@ -0,0 +1,2112 @@
1
+ # Docsmith: Format-Aware Parsers, Remove Branches, Docs
2
+
3
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
+
5
+ **Goal:** Add word-level Markdown and HTML-aware diff parsers, strip all branching/merging code, and write complete documentation.
6
+
7
+ **Architecture:** The diff `Engine` gains a `PARSERS` map (content-type → parser class) that overrides the generic line-level `Renderers::Base` for markdown and html documents. Parsers inherit from `Renderers::Base`, overriding only `compute`. Branch code (3 lib files, 1 result file, 5 spec files, schema table, generator template) is deleted wholesale. The `Renderers::Registry` is left untouched; it still exists for custom rendering overrides.
8
+
9
+ **Tech Stack:** Ruby, diff-lcs, RSpec, ActiveRecord (SQLite in tests)
10
+
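As a rough illustration of the architecture described above, the lookup from content type to parser class, with a line-level fallback, might look like the sketch below. Everything in it is a hypothetical stand-in: `SketchDiff`, `LineBase`, `tokenize`, and `parser_for` are names invented for this example (the plan itself says the real parsers override `compute` on `Renderers::Base`), and the tokenizing regexes are assumptions rather than the gem's actual rules.

```ruby
# Hypothetical stand-ins for illustration only; the real classes live under
# Docsmith::Diff and are not reproduced here.
module SketchDiff
  # Generic behaviour: one token per line.
  class LineBase
    def tokenize(text)
      text.to_s.split("\n")
    end
  end

  # Word-level tokens for Markdown (assumed rule: split on whitespace).
  class MarkdownParser < LineBase
    def tokenize(text)
      text.to_s.scan(/\S+/)
    end
  end

  # HTML-aware tokens (assumed rule: each tag is atomic, text splits into words).
  class HtmlParser < LineBase
    def tokenize(text)
      text.to_s.scan(/<[^>]+>|[^<\s]+/)
    end
  end

  # content-type => parser class; anything else falls back to LineBase.
  PARSERS = {
    "markdown" => MarkdownParser,
    "html"     => HtmlParser
  }.freeze

  def self.parser_for(content_type)
    PARSERS.fetch(content_type.to_s, LineBase).new
  end
end
```

Using `Hash#fetch` with a default keeps the fallback in one place, so content types without a dedicated parser (such as json) keep the line-level behaviour.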
11
+ ---
12
+
13
+ ## File Map
14
+
15
+ ```
16
+ # Created
17
+ lib/docsmith/diff/parsers/markdown.rb ← word-level tokenizer, inherits Renderers::Base
18
+ lib/docsmith/diff/parsers/html.rb ← HTML-tag-aware tokenizer, inherits Renderers::Base
19
+ spec/docsmith/diff/parsers/markdown_spec.rb ← unit tests for Markdown parser
20
+ spec/docsmith/diff/parsers/html_spec.rb ← unit tests for HTML parser
21
+ USAGE.md ← verbose usage guide
22
+
23
+ # Deleted
24
+ lib/docsmith/branches/branch.rb
25
+ lib/docsmith/branches/manager.rb
26
+ lib/docsmith/branches/merger.rb
27
+ lib/docsmith/merge_result.rb
28
+ spec/docsmith/branches/branch_spec.rb
29
+ spec/docsmith/branches/manager_spec.rb
30
+ spec/docsmith/branches/merger_spec.rb
31
+ spec/docsmith/merge_result_spec.rb
32
+ spec/docsmith/phase4_integration_spec.rb
33
+
34
+ # Modified
35
+ lib/docsmith.rb ← remove branch/merge_result requires, add parser requires (before engine require)
36
+ lib/docsmith/diff/engine.rb ← add PARSERS constant, use parser in between()
37
+ lib/docsmith/versionable.rb ← remove branch: param from save_version!, delete 4 branch methods
38
+ lib/docsmith/version_manager.rb ← remove branch: param + branch_id + branch.update_columns from save!
39
+ lib/docsmith/document_version.rb ← remove belongs_to :branch
40
+ spec/support/schema.rb ← remove branch_id column + docsmith_branches table
41
+ lib/generators/.../create_docsmith_tables.rb.erb ← remove docsmith_branches + branch_id
42
+ lib/generators/.../docsmith_initializer.rb.erb ← clean up stale comments
43
+ spec/docsmith/versionable_spec.rb ← remove branch describe blocks + fix addition count
44
+ spec/docsmith/diff/engine_spec.rb ← fix addition count (1 → 3 for word-level)
45
+ spec/docsmith/phase2_integration_spec.rb ← fix addition count (1 → 3 for word-level)
46
+ docsmith.gemspec ← update description
47
+ README.md ← rewrite with gem overview + link to USAGE.md
48
+ ```
49
+
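The File Map notes that several specs change their expected addition count from 1 to 3 once diffs become word-level. A minimal sketch of why that can happen is shown here, assuming a fixture in which one three-word line is appended. The `addition_count` helper and the sample strings are hypothetical; the gem relies on diff-lcs, whereas this sketch uses a small hand-written LCS table so that it runs with the standard library alone.

```ruby
# Number of tokens in new_tokens that are not part of the longest common
# subsequence shared with old_tokens, i.e. the tokens counted as additions.
def addition_count(old_tokens, new_tokens)
  # Classic dynamic-programming table of LCS lengths.
  table = Array.new(old_tokens.size + 1) { Array.new(new_tokens.size + 1, 0) }
  old_tokens.each_with_index do |old_token, i|
    new_tokens.each_with_index do |new_token, j|
      table[i + 1][j + 1] =
        if old_token == new_token
          table[i][j] + 1
        else
          [table[i][j + 1], table[i + 1][j]].max
        end
    end
  end
  new_tokens.size - table[old_tokens.size][new_tokens.size]
end

# Hypothetical fixture: the second version appends one line of three words.
old_text = "line one"
new_text = "line one\nline two added"

line_level = addition_count(old_text.split("\n"), new_text.split("\n"))
word_level = addition_count(old_text.scan(/\S+/), new_text.scan(/\S+/))
```

With this fixture the same edit is a single added line but three added words, which matches the direction of the spec changes listed in the File Map.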
50
+ ---
51
+
52
+ ## Task 1: Delete Branch Files
53
+
54
+ **Files:**
55
+ - Delete: `lib/docsmith/branches/branch.rb`
56
+ - Delete: `lib/docsmith/branches/manager.rb`
57
+ - Delete: `lib/docsmith/branches/merger.rb`
58
+ - Delete: `lib/docsmith/merge_result.rb`
59
+ - Delete: `spec/docsmith/branches/branch_spec.rb`
60
+ - Delete: `spec/docsmith/branches/manager_spec.rb`
61
+ - Delete: `spec/docsmith/branches/merger_spec.rb`
62
+ - Delete: `spec/docsmith/merge_result_spec.rb`
63
+ - Delete: `spec/docsmith/phase4_integration_spec.rb`
64
+
65
+ - [ ] **Step 1: Delete branch lib files and merge_result**
66
+
67
+ ```bash
68
+ rm lib/docsmith/branches/branch.rb
69
+ rm lib/docsmith/branches/manager.rb
70
+ rm lib/docsmith/branches/merger.rb
71
+ rm lib/docsmith/merge_result.rb
72
+ rmdir lib/docsmith/branches
73
+ ```
74
+
75
+ - [ ] **Step 2: Delete branch spec files and phase4 spec**
76
+
77
+ ```bash
78
+ rm spec/docsmith/branches/branch_spec.rb
79
+ rm spec/docsmith/branches/manager_spec.rb
80
+ rm spec/docsmith/branches/merger_spec.rb
81
+ rm spec/docsmith/merge_result_spec.rb
82
+ rm spec/docsmith/phase4_integration_spec.rb
83
+ rmdir spec/docsmith/branches
84
+ ```
85
+
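After the deletions, it can help to confirm that no remaining Ruby file still mentions the removed code. The helper below is a minimal sketch for that check and is not part of the plan: `stale_references` and its default pattern are assumptions made for this example.

```ruby
# Returns "path:line_number" for every line in a Ruby file under root that
# matches pattern. The default pattern is an assumption for this sketch.
def stale_references(root, pattern = /branch|merge_?result/i)
  Dir.glob(File.join(root, "**", "*.rb")).sort.flat_map do |path|
    File.readlines(path).each_with_index.filter_map do |line, index|
      "#{path}:#{index + 1}" if line.match?(pattern)
    end
  end
end
```

Running `puts stale_references("lib")` and `puts stale_references("spec")` from the gem root would list any leftovers. Hits are expected until the later tasks have scrubbed the core files and specs.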
86
+ ---
87
+
88
+ ## Task 2: Scrub Branch References from Core Files
89
+
90
+ **Files:**
91
+ - Modify: `lib/docsmith.rb`
92
+ - Modify: `lib/docsmith/versionable.rb`
93
+ - Modify: `lib/docsmith/version_manager.rb`
94
+ - Modify: `lib/docsmith/document_version.rb`
95
+
96
+ - [ ] **Step 1: Remove branch and merge_result requires from `lib/docsmith.rb`**
97
+
98
+ Replace the entire file content with:
99
+
100
+ ```ruby
101
+ # frozen_string_literal: true
102
+
103
+ require "active_record"
104
+ require "active_support"
105
+ require "active_support/core_ext/numeric/time"
106
+ require "active_support/notifications"
107
+
108
+ require_relative "docsmith/version"
109
+ require_relative "docsmith/errors"
110
+ require_relative "docsmith/configuration"
111
+ require_relative "docsmith/events/event"
112
+ require_relative "docsmith/events/hook_registry"
113
+ require_relative "docsmith/events/notifier"
114
+ require_relative "docsmith/document"
115
+ require_relative "docsmith/document_version"
116
+ require_relative "docsmith/version_tag"
117
+ require_relative "docsmith/auto_save"
118
+ require_relative "docsmith/version_manager"
119
+ require_relative "docsmith/versionable"
120
+ require_relative "docsmith/diff"
121
+ require_relative "docsmith/diff/renderers"
122
+ require_relative "docsmith/diff/renderers/base"
123
+ require_relative "docsmith/diff/renderers/registry"
124
+ require_relative "docsmith/diff/result"
125
+ require_relative "docsmith/diff/parsers/markdown"
126
+ require_relative "docsmith/diff/parsers/html"
127
+ require_relative "docsmith/diff/engine"
128
+ require_relative "docsmith/rendering/html_renderer"
129
+ require_relative "docsmith/rendering/json_renderer"
130
+ require_relative "docsmith/comments/comment"
131
+ require_relative "docsmith/comments/anchor"
132
+ require_relative "docsmith/comments/manager"
133
+ require_relative "docsmith/comments/migrator"
134
+
135
+ module Docsmith
136
+ class << self
137
+ # @yield [Docsmith::Configuration]
138
+ def configure
139
+ yield configuration
140
+ end
141
+
142
+ # @return [Docsmith::Configuration]
143
+ def configuration
144
+ @configuration ||= Configuration.new
145
+ end
146
+
147
+ # Reset to gem defaults. Call in specs via config.before(:each).
148
+ def reset_configuration!
149
+ @configuration = Configuration.new
150
+ end
151
+ end
152
+ end
153
+ ```
154
+
155
+ - [ ] **Step 2: Remove branch param from `save_version!` and delete 4 branch methods in `lib/docsmith/versionable.rb`**
156
+
157
+ Replace the file content with the following. The new content drops the `branch:` param from `save_version!` and removes `create_branch!`, `branches`, `active_branches`, and `merge_branch!`:
158
+
159
+ ```ruby
160
+ # frozen_string_literal: true
161
+
162
+ module Docsmith
163
+ # ActiveRecord mixin that adds full versioning to any model.
164
+ #
165
+ # Usage:
166
+ # class Article < ApplicationRecord
167
+ # include Docsmith::Versionable
168
+ # docsmith_config { content_field :body; content_type :markdown }
169
+ # end
170
+ module Versionable
171
+ def self.included(base)
172
+ base.extend(ClassMethods)
173
+ base.after_save(:_docsmith_auto_save_callback)
174
+ end
175
+
176
+ module ClassMethods
177
+ # Configure per-class Docsmith options. All keys optional.
178
+ # Unset keys fall through to global config then gem defaults.
179
+ # @yield block evaluated on a Docsmith::ClassConfig instance
180
+ # @return [Docsmith::ClassConfig]
181
+ def docsmith_config(&block)
182
+ @_docsmith_class_config ||= Docsmith::ClassConfig.new
183
+ @_docsmith_class_config.instance_eval(&block) if block_given?
184
+ @_docsmith_class_config
185
+ end
186
+
187
+ # @return [Hash] fully resolved config (read-time resolution)
188
+ def docsmith_resolved_config
189
+ Docsmith::Configuration.resolve(
190
+ @_docsmith_class_config&.settings || {},
191
+ Docsmith.configuration
192
+ )
193
+ end
194
+ end
195
+
196
+ # Create a new DocumentVersion snapshot of this record's content.
197
+ # Returns nil if content is identical to the latest version.
198
+ # Raises Docsmith::InvalidContentField if content_field returns a non-String
199
+ # and no content_extractor is configured.
200
+ #
201
+ # @param author [Object, nil]
202
+ # @param summary [String, nil]
203
+ # @return [Docsmith::DocumentVersion, nil]
204
+ def save_version!(author:, summary: nil)
205
+ _sync_docsmith_content!
206
+ Docsmith::VersionManager.save!(
207
+ _docsmith_document,
208
+ author: author,
209
+ summary: summary,
210
+ config: self.class.docsmith_resolved_config
211
+ )
212
+ end
213
+
214
+ # Debounced auto-save. Returns nil if the debounce window has not elapsed
215
+ # or the content is unchanged, and always returns nil when the config
216
+ # sets auto_save: false.
217
+ #
218
+ # @param author [Object, nil]
219
+ # @return [Docsmith::DocumentVersion, nil]
220
+ def auto_save_version!(author: nil)
221
+ config = self.class.docsmith_resolved_config
222
+ return nil unless config[:auto_save]
223
+
224
+ _sync_docsmith_content!
225
+ Docsmith::AutoSave.call(_docsmith_document, author: author, config: config)
226
+ end
227
+
228
+ # @return [ActiveRecord::Relation<Docsmith::DocumentVersion>] ordered by version_number
229
+ def versions
230
+ _docsmith_document.document_versions
231
+ end
232
+
233
+ # @return [Docsmith::DocumentVersion, nil] latest version
234
+ def current_version
235
+ _docsmith_document.current_version
236
+ end
237
+
238
+ # @param number [Integer] 1-indexed version_number
239
+ # @return [Docsmith::DocumentVersion, nil]
240
+ def version(number)
241
+ _docsmith_document.document_versions.find_by(version_number: number)
242
+ end
243
+
244
+ # Restore to a previous version. Creates a new version with the old content.
245
+ # Syncs restored content back to the model's content_field via update_column
246
+ # (bypasses after_save to prevent a duplicate auto-save).
247
+ # Never mutates existing versions.
248
+ #
249
+ # @param number [Integer] version_number to restore from
250
+ # @param author [Object, nil]
251
+ # @return [Docsmith::DocumentVersion]
252
+ # @raise [Docsmith::VersionNotFound]
253
+ def restore_version!(number, author:)
254
+ result = Docsmith::VersionManager.restore!(
255
+ _docsmith_document,
256
+ version: number,
257
+ author: author,
258
+ config: self.class.docsmith_resolved_config
259
+ )
260
+ field = self.class.docsmith_resolved_config[:content_field]
261
+ update_column(field, _docsmith_document.reload.content)
262
+ result
263
+ end
264
+
265
+ # Tag a specific version. Names are unique per document.
266
+ # @param number [Integer] version_number to tag
267
+ # @param name [String]
268
+ # @param author [Object, nil]
269
+ # @return [Docsmith::VersionTag]
270
+ def tag_version!(number, name:, author:)
271
+ Docsmith::VersionManager.tag!(
272
+ _docsmith_document, version: number, name: name, author: author)
273
+ end
274
+
275
+ # @param tag_name [String]
276
+ # @return [Docsmith::DocumentVersion, nil]
277
+ def tagged_version(tag_name)
278
+ tag = _docsmith_document.version_tags.find_by(name: tag_name)
279
+ tag&.version
280
+ end
281
+
282
+ # @param number [Integer] version_number
283
+ # @return [Array<String>] tag names on that version
284
+ def version_tags(number)
285
+ ver = version(number)
286
+ return [] unless ver
287
+ ver.version_tags.pluck(:name)
288
+ end
289
+
290
+ # Computes a diff from version N to the current (latest) version.
291
+ #
292
+ # @param version_number [Integer]
293
+ # @return [Docsmith::Diff::Result]
294
+ # @raise [ActiveRecord::RecordNotFound] if version_number does not exist
295
+ def diff_from(version_number)
296
+ doc = _docsmith_document
297
+ v_from = Docsmith::DocumentVersion.find_by!(document: doc, version_number: version_number)
298
+ v_to = Docsmith::DocumentVersion.where(document_id: doc.id).order(version_number: :desc).first!
299
+ Docsmith::Diff.between(v_from, v_to)
300
+ end
301
+
302
+ # Computes a diff between two specific versions, identified by version_number.
303
+ #
304
+ # @param from_version [Integer]
305
+ # @param to_version [Integer]
306
+ # @return [Docsmith::Diff::Result]
307
+ # @raise [ActiveRecord::RecordNotFound] if either version does not exist
308
+ def diff_between(from_version, to_version)
309
+ doc = _docsmith_document
310
+ v_from = Docsmith::DocumentVersion.find_by!(document: doc, version_number: from_version)
311
+ v_to = Docsmith::DocumentVersion.find_by!(document: doc, version_number: to_version)
312
+ Docsmith::Diff.between(v_from, v_to)
313
+ end
314
+
315
+ # Adds a comment to a specific version of this document.
316
+ #
317
+ # @param version [Integer] version_number
318
+ # @param body [String]
319
+ # @param author [Object] polymorphic author
320
+ # @param anchor [Hash, nil] { start_offset:, end_offset: } for inline range comments
321
+ # @param parent [Comments::Comment, nil] parent comment for threading
322
+ # @return [Docsmith::Comments::Comment]
323
+ def add_comment!(version:, body:, author:, anchor: nil, parent: nil)
324
+ Comments::Manager.add!(
325
+ _docsmith_document,
326
+ version_number: version,
327
+ body: body,
328
+ author: author,
329
+ anchor: anchor,
330
+ parent: parent
331
+ )
332
+ end
333
+
334
+ # Returns all comments across all versions of this document.
335
+ #
336
+ # @return [ActiveRecord::Relation<Docsmith::Comments::Comment>]
337
+ def comments
338
+ doc = _docsmith_document
339
+ Comments::Comment.joins(:version)
340
+ .where(docsmith_versions: { document_id: doc.id })
341
+ end
342
+
343
+ # Returns comments on a specific version, optionally filtered by anchor type.
344
+ #
345
+ # @param version [Integer] version_number
346
+ # @param type [Symbol, nil] :document or :range to filter; nil = all
347
+ # @return [ActiveRecord::Relation<Docsmith::Comments::Comment>]
348
+ def comments_on(version:, type: nil)
349
+ doc = _docsmith_document
350
+ dv = Docsmith::DocumentVersion.find_by!(document: doc, version_number: version)
351
+ rel = Comments::Comment.where(version: dv)
352
+ rel = rel.where(anchor_type: type.to_s) if type
353
+ rel
354
+ end
355
+
356
+ # Returns all unresolved comments across all versions.
357
+ #
358
+ # @return [ActiveRecord::Relation<Docsmith::Comments::Comment>]
359
+ def unresolved_comments
360
+ comments.merge(Comments::Comment.unresolved)
361
+ end
362
+
363
+ # Migrates top-level comments from one version to another.
364
+ #
365
+ # @param from [Integer] source version_number
366
+ # @param to [Integer] target version_number
367
+ # @return [void]
368
+ def migrate_comments!(from:, to:)
369
+ Comments::Migrator.migrate!(_docsmith_document, from: from, to: to)
370
+ end
371
+
372
+ private
373
+
374
+ # Finds or creates the shadow Docsmith::Document for this record.
375
+ # Cached in @_docsmith_document after first lookup.
376
+ def _docsmith_document
377
+ config = self.class.docsmith_resolved_config
378
+ @_docsmith_document ||= Docsmith::Document.find_or_create_by!(subject: self) do |doc|
379
+ doc.content_type = config[:content_type].to_s
380
+ doc.title = respond_to?(:title) ? title.to_s : self.class.name
381
+ end
382
+ end
383
+
384
+ # Reads content from the model via content_extractor or content_field,
385
+ # validates it is a String, then syncs to the shadow document's content column.
386
+ def _sync_docsmith_content!
387
+ config = self.class.docsmith_resolved_config
388
+
389
+ raw = if config[:content_extractor]
390
+ config[:content_extractor].call(self)
391
+ else
392
+ public_send(config[:content_field])
393
+ end
394
+
395
+ unless raw.nil? || raw.is_a?(String)
396
+ source = config[:content_extractor] ? "content_extractor" : "content_field :#{config[:content_field]}"
397
+ raise Docsmith::InvalidContentField,
398
+ "#{source} must return a String, got #{raw.class}. " \
399
+ "Use content_extractor: ->(record) { ... } for non-string fields."
400
+ end
401
+
402
+ _docsmith_document.update_column(:content, raw.to_s)
403
+ end
404
+
405
+ def _docsmith_auto_save_callback
406
+ auto_save_version!
407
+ rescue Docsmith::InvalidContentField
408
+ nil
409
+ end
410
+ end
411
+ end
412
+ ```
413
+
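The `docsmith_config` comments state that unset keys fall through to the global config and then to gem defaults. A minimal sketch of that resolution order follows. It is not the gem's `Configuration.resolve`: `resolve_config` and `GEM_DEFAULTS` are hypothetical names, the default values are copied from the commented lines of the initializer template, and treating a key as "set" whenever it is present in the hash is an assumption.

```ruby
# Hypothetical defaults, mirroring the commented values in the initializer template.
GEM_DEFAULTS = {
  content_field: :body,
  content_type: :markdown,
  auto_save: true,
  max_versions: nil
}.freeze

# Per-class settings win over global settings, which win over gem defaults.
# Hash#key? is used instead of ||, so an explicit false or nil is respected.
def resolve_config(class_settings, global_settings)
  GEM_DEFAULTS.to_h do |key, default|
    value =
      if class_settings.key?(key)
        class_settings[key]
      elsif global_settings.key?(key)
        global_settings[key]
      else
        default
      end
    [key, value]
  end
end
```

For example, `resolve_config({ auto_save: false }, { auto_save: true, max_versions: 10 })` keeps the per-class `auto_save: false`, takes `max_versions: 10` from the global settings, and fills the remaining keys from the defaults.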
414
+ - [ ] **Step 3: Remove branch param from `VersionManager.save!` in `lib/docsmith/version_manager.rb`**
415
+
416
+ Replace the file content with:
417
+
418
+ ```ruby
419
+ # frozen_string_literal: true
420
+
421
+ module Docsmith
422
+ # Service object for all version lifecycle operations.
423
+ # The Versionable mixin delegates here after resolving the shadow document.
424
+ # Always receives a Docsmith::Document instance.
425
+ module VersionManager
426
+ # Create a new DocumentVersion snapshot.
427
+ # Returns nil if content is identical to the latest version (string == check).
428
+ #
429
+ # @param document [Docsmith::Document]
430
+ # @param author [Object, nil]
431
+ # @param summary [String, nil]
432
+ # @param config [Hash] resolved config
433
+ # @return [Docsmith::DocumentVersion, nil]
434
+ def self.save!(document, author:, summary: nil, config: nil)
435
+ config ||= Configuration.resolve({}, Docsmith.configuration)
436
+ current = document.content.to_s
437
+ latest = document.document_versions.last
438
+
439
+ return nil if latest && latest.content == current
440
+
441
+ next_num = document.versions_count + 1
442
+
443
+ version = DocumentVersion.create!(
444
+ document: document,
445
+ version_number: next_num,
446
+ content: current,
447
+ content_type: document.content_type,
448
+ author: author,
449
+ change_summary: summary,
450
+ metadata: {}
451
+ )
452
+
453
+ document.update_columns(
454
+ versions_count: next_num,
455
+ last_versioned_at: Time.current
456
+ )
457
+ document.versions_count = next_num
458
+
459
+ prune_if_needed!(document, version, config) if config[:max_versions]
460
+
461
+ record = document.subject || document
462
+ Events::Notifier.instrument(:version_created,
463
+ record: record, document: document, version: version, author: author)
464
+
465
+ version
466
+ end
467
+
468
+ # Restore a previous version by creating a new version with its content.
469
+ # Fires :version_restored (not :version_created). Never mutates existing versions.
470
+ #
471
+ # @param document [Docsmith::Document]
472
+ # @param version [Integer] version_number to restore from
473
+ # @param author [Object, nil]
474
+ # @param config [Hash] resolved config
475
+ # @return [Docsmith::DocumentVersion] the new version
476
+ # @raise [Docsmith::VersionNotFound]
477
+ def self.restore!(document, version:, author:, config: nil)
478
+ config ||= Configuration.resolve({}, Docsmith.configuration)
479
+ from_version = document.document_versions.find_by(version_number: version)
480
+ raise VersionNotFound, "Version #{version} not found on this document" unless from_version
481
+
482
+ next_num = document.versions_count + 1
483
+
484
+ new_version = DocumentVersion.create!(
485
+ document: document,
486
+ version_number: next_num,
487
+ content: from_version.content,
488
+ content_type: document.content_type,
489
+ author: author,
490
+ change_summary: "Restored from v#{version}",
491
+ metadata: {}
492
+ )
493
+
494
+ document.update_columns(
495
+ content: from_version.content,
496
+ versions_count: next_num,
497
+ last_versioned_at: Time.current
498
+ )
499
+
500
+ record = document.subject || document
501
+ Events::Notifier.instrument(:version_restored,
502
+ record: record, document: document, version: new_version,
503
+ author: author, from_version: from_version)
504
+
505
+ new_version
506
+ end
507
+
508
+ # Tag a specific version with a name unique to this document.
509
+ #
510
+ # @param document [Docsmith::Document]
511
+ # @param version [Integer] version_number to tag
512
+ # @param name [String] unique per document
513
+ # @param author [Object, nil]
514
+ # @return [Docsmith::VersionTag]
515
+ # @raise [Docsmith::VersionNotFound]
516
+ # @raise [Docsmith::TagAlreadyExists]
517
+ def self.tag!(document, version:, name:, author:)
518
+ version_record = document.document_versions.find_by(version_number: version)
519
+ raise VersionNotFound, "Version #{version} not found on this document" unless version_record
520
+
521
+ if VersionTag.exists?(document_id: document.id, name: name)
522
+ raise TagAlreadyExists, "Tag '#{name}' already exists on this document"
523
+ end
524
+
525
+ tag = VersionTag.create!(
526
+ document: document,
527
+ version: version_record,
528
+ name: name,
529
+ author: author
530
+ )
531
+
532
+ record = document.subject || document
533
+ Events::Notifier.instrument(:version_tagged,
534
+ record: record, document: document, version: version_record,
535
+ author: author, tag_name: name)
536
+
537
+ tag
538
+ end
539
+
540
+ def self.prune_if_needed!(document, new_version, config)
541
+ max = config[:max_versions]
542
+ return unless max && document.versions_count > max
543
+
544
+ tagged_ids = VersionTag.where(document_id: document.id).select(:version_id)
545
+ oldest_untagged = document.document_versions
546
+ .where.not(id: tagged_ids)
547
+ .where.not(id: new_version.id)
548
+ .first
549
+
550
+ unless oldest_untagged
551
+ raise MaxVersionsExceeded,
552
+ "All #{document.versions_count} versions are tagged. Cannot prune to stay within " \
553
+ "max_versions: #{max}. Remove a tag or increase max_versions."
554
+ end
555
+
556
+ oldest_untagged.destroy!
557
+ document.update_column(:versions_count, document.versions_count - 1)
558
+ end
559
+ private_class_method :prune_if_needed!
560
+ end
561
+ end
562
+ ```
563
+
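The lifecycle rules in `VersionManager` (skip the save when content is unchanged, restore by appending rather than rewriting, prune the oldest untagged version) can be exercised without a database. The class below is an in-memory sketch under stated assumptions: `SketchHistory` and all of its method names are hypothetical, and it does not use ActiveRecord or fire events.

```ruby
# In-memory stand-in for the version lifecycle; not the gem's ActiveRecord code.
class SketchHistory
  class MaxVersionsExceeded < StandardError; end

  Version = Struct.new(:number, :content, :summary)

  attr_reader :versions

  def initialize(max_versions: nil)
    @max_versions = max_versions
    @versions = [] # oldest first
    @tagged = []   # version numbers protected from pruning
  end

  # Snapshot content; returns nil when it equals the latest snapshot.
  def save(content, summary: nil)
    return nil if @versions.any? && @versions.last.content == content

    version = append(content, summary)
    prune!(version)
    version
  end

  # Restoring never rewrites history: it appends a copy of the old content.
  def restore(number)
    source = @versions.find { |v| v.number == number }
    raise ArgumentError, "Version #{number} not found" unless source

    append(source.content, "Restored from v#{number}")
  end

  def tag(number)
    @tagged << number
  end

  private

  def append(content, summary)
    # This sketch derives the next number from the highest existing number,
    # so a pruned history never reuses a version number.
    next_number = (@versions.map(&:number).max || 0) + 1
    version = Version.new(next_number, content, summary)
    @versions << version
    version
  end

  # Drop the oldest untagged version, never the one just created.
  def prune!(newest)
    return unless @max_versions && @versions.size > @max_versions

    victim = @versions.find { |v| !@tagged.include?(v.number) && !v.equal?(newest) }
    raise MaxVersionsExceeded, "all versions are tagged" unless victim

    @versions.delete(victim)
  end
end
```

As in the plan's code, the new version is created before pruning is attempted, so the error for a fully tagged history is raised after the snapshot already exists.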
564
+ - [ ] **Step 4: Remove `belongs_to :branch` from `lib/docsmith/document_version.rb`**
565
+
566
+ Remove lines 14–15 (the branch association and its blank line):
567
+
568
+ ```ruby
569
+ # REMOVE these two lines (the association below and its adjacent blank line):
570
+ belongs_to :branch, class_name: "Docsmith::Branches::Branch", optional: true
571
+ ```
572
+
573
+ The updated associations block becomes:
574
+
575
+ ```ruby
576
+ belongs_to :document,
577
+ class_name: "Docsmith::Document",
578
+ foreign_key: :document_id
579
+ belongs_to :author, polymorphic: true, optional: true
580
+ has_many :version_tags,
581
+ class_name: "Docsmith::VersionTag",
582
+ foreign_key: :version_id,
583
+ dependent: :destroy
584
+ has_many :comments,
585
+ class_name: "Docsmith::Comments::Comment",
586
+ foreign_key: :version_id,
587
+ dependent: :destroy
588
+ ```
589
+
590
+ Full file after edit:
591
+
592
+ ```ruby
593
+ # frozen_string_literal: true
594
+
595
+ module Docsmith
596
+ # Immutable content snapshot. Table is docsmith_versions.
597
+ class DocumentVersion < ActiveRecord::Base
598
+ self.table_name = "docsmith_versions"
599
+
600
+ belongs_to :document,
601
+ class_name: "Docsmith::Document",
602
+ foreign_key: :document_id
603
+ belongs_to :author, polymorphic: true, optional: true
604
+ has_many :version_tags,
605
+ class_name: "Docsmith::VersionTag",
606
+ foreign_key: :version_id,
607
+ dependent: :destroy
608
+ has_many :comments,
609
+ class_name: "Docsmith::Comments::Comment",
610
+ foreign_key: :version_id,
611
+ dependent: :destroy
612
+
613
+ validates :version_number, presence: true,
614
+ uniqueness: { scope: :document_id }
615
+ validates :content, presence: true
616
+ validates :content_type, presence: true,
617
+ inclusion: { in: %w[html markdown json] }
618
+
619
+ # @return [Docsmith::DocumentVersion, nil]
620
+ def previous_version
621
+ document.document_versions
622
+ .where("version_number < ?", version_number)
623
+ .last
624
+ end
625
+
626
+ # Renders this version's content in the given output format.
627
+ #
628
+ # @param format [Symbol] :html or :json
629
+ # @param options [Hash] passed through to the renderer
630
+ # @return [String]
631
+ # @raise [ArgumentError] for unknown formats
632
+ def render(format, **options)
633
+ case format.to_sym
634
+ when :html then Rendering::HtmlRenderer.new.render(self, **options)
635
+ when :json then Rendering::JsonRenderer.new.render(self, **options)
636
+ else raise ArgumentError, "Unknown render format: #{format}. Supported: :html, :json"
637
+ end
638
+ end
639
+ end
640
+ end
641
+ ```
642
+
643
+ ---
644
+
645
+ ## Task 3: Remove Branch Schema and Generator Template
646
+
647
+ **Files:**
648
+ - Modify: `spec/support/schema.rb`
649
+ - Modify: `lib/generators/docsmith/install/templates/create_docsmith_tables.rb.erb`
650
+ - Modify: `lib/generators/docsmith/install/templates/docsmith_initializer.rb.erb`
651
+
652
+ - [ ] **Step 1: Remove `branch_id` column and `docsmith_branches` table from `spec/support/schema.rb`**
653
+
654
+ Replace the file with:
655
+
656
+ ```ruby
657
+ # frozen_string_literal: true
658
+
659
+ # In-memory SQLite schema for tests.
660
+ # Mirrors db/migrate/create_docsmith_tables.rb with two intentional differences:
661
+ # 1. :jsonb columns use :text here (SQLite has no jsonb type)
662
+ # 2. Foreign key constraints are omitted (SQLite does not enforce them by default)
663
+ # Production migration uses :jsonb and add_foreign_key for PostgreSQL.
664
+
665
+ ActiveRecord::Schema.define do
666
+ create_table :docsmith_documents, force: true do |t|
667
+ t.string :title
668
+ t.text :content
669
+ t.string :content_type, null: false, default: "markdown"
670
+ t.integer :versions_count, null: false, default: 0
671
+ t.datetime :last_versioned_at
672
+ t.string :subject_type
673
+ t.bigint :subject_id
674
+ t.text :metadata, default: "{}"
675
+ t.timestamps
676
+ end
677
+ add_index :docsmith_documents, %i[subject_type subject_id]
678
+
679
+ create_table :docsmith_versions, force: true do |t|
680
+ t.bigint :document_id, null: false
681
+ t.integer :version_number, null: false
682
+ t.text :content, null: false
683
+ t.string :content_type, null: false
684
+ t.string :author_type
685
+ t.bigint :author_id
686
+ t.string :change_summary
687
+ t.text :metadata, default: "{}"
688
+ t.datetime :created_at, null: false
689
+ end
690
+ add_index :docsmith_versions, %i[document_id version_number], unique: true
691
+ add_index :docsmith_versions, %i[author_type author_id]
692
+
693
+ create_table :docsmith_version_tags, force: true do |t|
694
+ t.bigint :document_id, null: false
695
+ t.bigint :version_id, null: false
696
+ t.string :name, null: false
697
+ t.string :author_type
698
+ t.bigint :author_id
699
+ t.datetime :created_at, null: false
700
+ end
701
+ add_index :docsmith_version_tags, %i[document_id name], unique: true
702
+ add_index :docsmith_version_tags, %i[version_id]
703
+
704
+ create_table :articles, force: true do |t|
705
+ t.string :title
706
+ t.text :body
707
+ t.timestamps
708
+ end
709
+
710
+ create_table :posts, force: true do |t|
711
+ t.text :body
712
+ t.timestamps
713
+ end
714
+
715
+ create_table :users, force: true do |t|
716
+ t.string :name
717
+ t.timestamps
718
+ end
719
+
720
+ create_table :docsmith_comments, force: true do |t|
721
+ t.bigint :version_id, null: false
722
+ t.bigint :parent_id
723
+ t.string :author_type
724
+ t.bigint :author_id
725
+ t.text :body, null: false
726
+ t.string :anchor_type, null: false, default: "document"
727
+ t.text :anchor_data, null: false, default: "{}"
728
+ t.boolean :resolved, null: false, default: false
729
+ t.string :resolved_by_type
730
+ t.bigint :resolved_by_id
731
+ t.datetime :resolved_at
732
+ t.datetime :created_at, null: false
733
+ t.datetime :updated_at, null: false
734
+ end
735
+ add_index :docsmith_comments, :version_id
736
+ add_index :docsmith_comments, :parent_id
737
+ add_index :docsmith_comments, %i[author_type author_id]
738
+ end
739
+ ```
740
+
741
+ - [ ] **Step 2: Remove `docsmith_branches` table and `branch_id` from the migration template**
742
+
743
+ Replace `lib/generators/docsmith/install/templates/create_docsmith_tables.rb.erb` with:
744
+
745
+ ```erb
746
+ # frozen_string_literal: true
747
+
748
+ class CreateDocsmithTables < ActiveRecord::Migration[<%= ActiveRecord::Migration.current_version %>]
749
+ def change
750
+ create_table :docsmith_documents do |t|
751
+ t.string :title
752
+ t.text :content
753
+ t.string :content_type, null: false, default: "markdown"
754
+ t.integer :versions_count, null: false, default: 0
755
+ t.datetime :last_versioned_at
756
+ t.string :subject_type
757
+ t.bigint :subject_id
758
+ t.jsonb :metadata, null: false, default: {}
759
+ t.timestamps
760
+ end
761
+ add_index :docsmith_documents, %i[subject_type subject_id]
762
+
763
+ create_table :docsmith_versions do |t|
764
+ t.references :document, null: false, foreign_key: { to_table: :docsmith_documents }
765
+ t.integer :version_number, null: false
766
+ t.text :content, null: false
767
+ t.string :content_type, null: false
768
+ t.string :author_type
769
+ t.bigint :author_id
770
+ t.string :change_summary
771
+ t.jsonb :metadata, null: false, default: {}
772
+ t.datetime :created_at, null: false
773
+ end
774
+ add_index :docsmith_versions, %i[document_id version_number], unique: true
775
+ add_index :docsmith_versions, %i[author_type author_id]
776
+
777
+ create_table :docsmith_version_tags do |t|
778
+ t.references :document, null: false, foreign_key: { to_table: :docsmith_documents }
779
+ t.bigint :version_id, null: false
780
+ t.string :name, null: false
781
+ t.string :author_type
782
+ t.bigint :author_id
783
+ t.datetime :created_at, null: false
784
+ end
785
+ add_index :docsmith_version_tags, %i[document_id name], unique: true
786
+ add_index :docsmith_version_tags, [:version_id]
787
+ add_foreign_key :docsmith_version_tags, :docsmith_versions, column: :version_id
788
+
789
+ create_table :docsmith_comments do |t|
790
+ t.bigint :version_id, null: false
791
+ t.bigint :parent_id
792
+ t.string :author_type
793
+ t.bigint :author_id
794
+ t.text :body, null: false
795
+ t.string :anchor_type, null: false, default: "document"
796
+ t.jsonb :anchor_data, null: false, default: {}
797
+ t.boolean :resolved, null: false, default: false
798
+ t.string :resolved_by_type
799
+ t.bigint :resolved_by_id
800
+ t.datetime :resolved_at
801
+ t.timestamps null: false
802
+ end
803
+ add_index :docsmith_comments, :version_id
804
+ add_index :docsmith_comments, :parent_id
805
+ add_index :docsmith_comments, %i[author_type author_id]
806
+ add_foreign_key :docsmith_comments, :docsmith_versions, column: :version_id
807
+ add_foreign_key :docsmith_comments, :docsmith_comments, column: :parent_id
808
+ end
809
+ end
810
+ ```
811
+
812
+ - [ ] **Step 3: Clean up initializer template comments**
813
+
814
+ Replace `lib/generators/docsmith/install/templates/docsmith_initializer.rb.erb` with:
815
+
816
+ ```erb
817
+ # frozen_string_literal: true
818
+
819
+ Docsmith.configure do |config|
820
+ # Resolution order: per-class docsmith_config > this global config > gem defaults
821
+ #
822
+ # config.default_content_field = :body # gem default: :body
823
+ # config.default_content_type = :markdown # gem default: :markdown (:html, :markdown, :json)
824
+ # config.auto_save = true # gem default: true
825
+ # config.default_debounce = 30 # gem default: 30 (integer seconds)
826
+ # config.max_versions = nil # gem default: nil (unlimited)
827
+ # config.content_extractor = nil # example: ->(record) { record.body.to_html }
828
+ # config.table_prefix = "docsmith" # gem default: "docsmith"
829
+ # config.diff_context_lines = 3
830
+ #
831
+ # Event hooks (fired synchronously, before ActiveSupport::Notifications):
832
+ # config.on(:version_created) { |event| Rails.logger.info "v#{event.version.version_number} saved" }
833
+ # config.on(:version_restored) { |event| }
834
+ # config.on(:version_tagged) { |event| }
835
+ end
836
+ ```
837
+
838
+ - [ ] **Step 4: Verify tests pass after branch removal**
839
+
840
+ Run: `bundle exec rspec --format progress`
841
+
842
+ Expected: all remaining specs pass (branch specs are deleted, so no failures from them).
843
+
844
+ If you see `undefined method 'branch'` errors anywhere, a branch reference was missed; search the codebase for remaining uses and remove them.
845
+
846
+ - [ ] **Step 5: Commit branch removal**
847
+
848
+ ```bash
849
+ git add -A
850
+ git commit -m "feat: remove branching and merging — too heavyweight for a document versioning gem"
851
+ ```
852
+
853
+ ---
854
+
855
+ ## Task 4: Implement Markdown Diff Parser
856
+
857
+ **Files:**
858
+ - Create: `lib/docsmith/diff/parsers/markdown.rb`
859
+ - Create: `spec/docsmith/diff/parsers/markdown_spec.rb`
860
+
861
+ - [ ] **Step 1: Write the failing tests**
862
+
863
+ Create `spec/docsmith/diff/parsers/markdown_spec.rb`:
864
+
865
+ ```ruby
866
+ # frozen_string_literal: true
867
+
868
+ require "spec_helper"
869
+
870
+ RSpec.describe Docsmith::Diff::Parsers::Markdown do
871
+ subject(:parser) { described_class.new }
872
+
873
+ describe "#compute" do
874
+ it "detects a word addition between versions" do
875
+ # "Hello world" → "Hello Ruby world"
876
+ # Old tokens: ["Hello", "world"]
877
+ # New tokens: ["Hello", "Ruby", "world"]
878
+ # LCS: ["Hello", "world"] — "Ruby" is inserted
879
+ changes = parser.compute("Hello world", "Hello Ruby world")
880
+ expect(changes).to include(a_hash_including(type: :addition, content: "Ruby"))
881
+ end
882
+
883
+ it "detects a word deletion between versions" do
884
+ # "Hello Ruby world" → "Hello world"
885
+ changes = parser.compute("Hello Ruby world", "Hello world")
886
+ expect(changes).to include(a_hash_including(type: :deletion, content: "Ruby"))
887
+ end
888
+
889
+ it "detects a word modification" do
890
+ # "Hello world" → "Hello Ruby"
891
+ # Old tokens: ["Hello", "world"]
892
+ # New tokens: ["Hello", "Ruby"]
893
+ # LCS: ["Hello"] — "world" modified to "Ruby"
894
+ changes = parser.compute("Hello world", "Hello Ruby")
895
+ expect(changes).to include(a_hash_including(
896
+ type: :modification,
897
+ old_content: "world",
898
+ new_content: "Ruby"
899
+ ))
900
+ end
901
+
902
+ it "returns empty array for identical content" do
903
+ expect(parser.compute("same text", "same text")).to be_empty
904
+ end
905
+
906
+ it "treats each whitespace-delimited word as a separate token" do
907
+ # Adding a new line adds 3 tokens: newline, word, word
908
+ # "line one\nline two" → "line one\nline two\nline three"
909
+ # Old tokens: ["line", "one", "\n", "line", "two"]
910
+ # New tokens: ["line", "one", "\n", "line", "two", "\n", "line", "three"]
911
+ # Additions: 3 tokens ("\n", "line", "three")
912
+ changes = parser.compute("line one\nline two", "line one\nline two\nline three")
913
+ additions = changes.select { |c| c[:type] == :addition }
914
+ expect(additions.count).to eq(3)
915
+ expect(additions.map { |c| c[:content] }).to contain_exactly("\n", "line", "three")
916
+ end
917
+
918
+ it "preserves newlines as distinct tokens for paragraph detection" do
919
+ # A blank-line paragraph break is one "\n\n" token
920
+ changes = parser.compute("Para one", "Para one\n\nPara two")
921
+ expect(changes).to include(a_hash_including(type: :addition, content: "\n\n"))
922
+ end
923
+
924
+ it "returns change hashes with :line (token index), :type, and :content keys" do
925
+ changes = parser.compute("foo", "foo bar")
926
+ addition = changes.find { |c| c[:type] == :addition }
927
+ expect(addition).to include(:line, :type, :content)
928
+ end
929
+ end
930
+ end
931
+ ```
932
+
933
+ - [ ] **Step 2: Run tests to verify they fail**
934
+
935
+ Run: `bundle exec rspec spec/docsmith/diff/parsers/markdown_spec.rb`
936
+
937
+ Expected: FAIL — `uninitialized constant Docsmith::Diff::Parsers::Markdown`
938
+
939
+ - [ ] **Step 3: Implement the Markdown parser**
940
+
941
+ Create `lib/docsmith/diff/parsers/markdown.rb`:
942
+
943
+ ```ruby
944
+ # frozen_string_literal: true
+
+ require "diff/lcs"
+
+ module Docsmith
+   module Diff
+     module Parsers
+       # Word-level diff parser for Markdown documents.
+       #
+       # Instead of comparing line-by-line (as Renderers::Base does), this parser
+       # tokenizes content into individual words and newline groups, then diffs
+       # those tokens. This gives precise word-level change detection for prose,
+       # which is far more useful than "the whole line changed."
+       #
+       # Tokenization: content.scan(/\S+|\n+/)
+       #   "Hello world\n\nFoo" → ["Hello", "world", "\n\n", "Foo"]
+       #
+       # The :line key in change hashes stores the 1-indexed token position
+       # (not a line number) for compatibility with Diff::Result serialization.
+       class Markdown < Renderers::Base
+         # @param old_content [String]
+         # @param new_content [String]
+         # @return [Array<Hash>] change hashes with :type, :line (token index), and content keys
+         def compute(old_content, new_content)
+           old_tokens = tokenize(old_content)
+           new_tokens = tokenize(new_content)
+           changes = []
+
+           ::Diff::LCS.sdiff(old_tokens, new_tokens).each do |hunk|
+             case hunk.action
+             when "+"
+               changes << { type: :addition, line: hunk.new_position + 1, content: hunk.new_element.to_s }
+             when "-"
+               changes << { type: :deletion, line: hunk.old_position + 1, content: hunk.old_element.to_s }
+             when "!"
+               changes << {
+                 type: :modification,
+                 line: hunk.old_position + 1,
+                 old_content: hunk.old_element.to_s,
+                 new_content: hunk.new_element.to_s
+               }
+             end
+           end
+
+           changes
+         end
+
+         private
+
+         # Splits markdown into word tokens.
+         # \S+ matches any non-whitespace run (words, punctuation, markdown markers).
+         # \n+ matches one or more consecutive newlines as a single token so that
+         # paragraph breaks (\n\n) and line breaks (\n) are each one diffable unit.
+         def tokenize(content)
+           content.scan(/\S+|\n+/)
+         end
+       end
+     end
+   end
+ end
1004
+ ```
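The tokenizer is the part of this parser that is easiest to check in isolation. The following is a minimal sketch that uses only the regex quoted above; the constant and helper names (`MARKDOWN_TOKEN`, `markdown_tokens`) are illustrative assumptions and not part of the gem, which keeps this logic in a private `#tokenize` method.

```ruby
# Illustrative only: same regex as the plan's Markdown tokenizer.
MARKDOWN_TOKEN = /\S+|\n+/

# Hypothetical helper name, used here so the tokenizer can be called directly.
def markdown_tokens(content)
  content.scan(MARKDOWN_TOKEN)
end

p markdown_tokens("Hello world\n\nFoo")  # => ["Hello", "world", "\n\n", "Foo"]
p markdown_tokens("line one\nline two")  # => ["line", "one", "\n", "line", "two"]

# Spaces and tabs match neither alternative, so they are dropped:
# a change that only alters spacing within a line yields identical tokens.
p markdown_tokens("a  b") == markdown_tokens("a b")  # => true
```

One consequence worth keeping in mind: because horizontal whitespace is discarded, this tokenization cannot report indentation-only edits, which may or may not matter for the documents being versioned.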
1005
+
1006
+ - [ ] **Step 4: Run tests to verify they pass**
1007
+
1008
+ Run: `bundle exec rspec spec/docsmith/diff/parsers/markdown_spec.rb`
1009
+
1010
+ Expected: all 7 examples pass.
1011
+
1012
+ - [ ] **Step 5: Commit**
1013
+
1014
+ ```bash
1015
+ git add lib/docsmith/diff/parsers/markdown.rb spec/docsmith/diff/parsers/markdown_spec.rb
1016
+ git commit -m "feat(diff): add Markdown word-level diff parser"
1017
+ ```
1018
+
1019
+ ---
1020
+
1021
+ ## Task 5: Implement HTML Diff Parser
1022
+
1023
+ **Files:**
1024
+ - Create: `lib/docsmith/diff/parsers/html.rb`
1025
+ - Create: `spec/docsmith/diff/parsers/html_spec.rb`
1026
+
1027
+ - [ ] **Step 1: Write the failing tests**
1028
+
1029
+ Create `spec/docsmith/diff/parsers/html_spec.rb`:
1030
+
1031
+ ```ruby
1032
+ # frozen_string_literal: true
1033
+
1034
+ require "spec_helper"
1035
+
1036
+ RSpec.describe Docsmith::Diff::Parsers::Html do
1037
+ subject(:parser) { described_class.new }
1038
+
1039
+ describe "#compute" do
1040
+ it "treats an opening tag as one atomic token" do
1041
+ # "<p>Hello</p>" → "<span>Hello</span>"
1042
+ # Tokens: ["<p>", "Hello", "</p>"] vs ["<span>", "Hello", "</span>"]
1043
+ # Modifications: "<p>"→"<span>", "</p>"→"</span>"
1044
+ changes = parser.compute("<p>Hello</p>", "<span>Hello</span>")
1045
+ mods = changes.select { |c| c[:type] == :modification }
1046
+ expect(mods).to include(a_hash_including(old_content: "<p>", new_content: "<span>"))
1047
+ expect(mods).to include(a_hash_including(old_content: "</p>", new_content: "</span>"))
1048
+ end
1049
+
1050
+ it "detects a new paragraph added (3 new tokens)" do
1051
+ # "<p>Hello</p>" → "<p>Hello</p><p>World</p>"
1052
+ # Old tokens: ["<p>", "Hello", "</p>"]
1053
+ # New tokens: ["<p>", "Hello", "</p>", "<p>", "World", "</p>"]
1054
+ # LCS: first 3 match — 3 additions: "<p>", "World", "</p>"
1055
+ changes = parser.compute("<p>Hello</p>", "<p>Hello</p><p>World</p>")
1056
+ additions = changes.select { |c| c[:type] == :addition }
1057
+ expect(additions.map { |c| c[:content] }).to contain_exactly("<p>", "World", "</p>")
1058
+ end
1059
+
1060
+ it "detects a word change inside a tag" do
1061
+ changes = parser.compute("<p>Hello world</p>", "<p>Hello Ruby</p>")
1062
+ expect(changes).to include(a_hash_including(
1063
+ type: :modification,
1064
+ old_content: "world",
1065
+ new_content: "Ruby"
1066
+ ))
1067
+ end
1068
+
1069
+ it "treats tag with attributes as one atomic token" do
1070
+ # "<div class=\"foo\">" must be ONE token, not split on spaces inside the tag
1071
+ changes = parser.compute('<div class="foo">bar</div>', '<div class="baz">bar</div>')
1072
+ mods = changes.select { |c| c[:type] == :modification }
1073
+ expect(mods).to include(a_hash_including(
1074
+ old_content: '<div class="foo">',
1075
+ new_content: '<div class="baz">'
1076
+ ))
1077
+ end
1078
+
1079
+ it "returns empty array for identical HTML" do
1080
+ html = "<p>Same content</p>"
1081
+ expect(parser.compute(html, html)).to be_empty
1082
+ end
1083
+
1084
+ it "does not split tag delimiters < and > as separate tokens" do
1085
+ # If the tokenizer split on < and >, the open bracket "<" would be its own token.
1086
+ # Verify that no change content is exactly "<" or ">"
1087
+ changes = parser.compute("<p>a</p>", "<p>b</p>")
1088
+ all_content = changes.flat_map { |c| [c[:content], c[:old_content], c[:new_content]] }.compact
1089
+ expect(all_content).not_to include("<", ">")
1090
+ end
1091
+
1092
+ it "returns change hashes with :line (token index), :type, and content keys" do
1093
+ changes = parser.compute("<p>foo</p>", "<p>foo</p><p>bar</p>")
1094
+ addition = changes.find { |c| c[:type] == :addition }
1095
+ expect(addition).to include(:line, :type, :content)
1096
+ end
1097
+ end
1098
+ end
1099
+ ```
1100
+
1101
+ - [ ] **Step 2: Run tests to verify they fail**
1102
+
1103
+ Run: `bundle exec rspec spec/docsmith/diff/parsers/html_spec.rb`
1104
+
1105
+ Expected: FAIL — `uninitialized constant Docsmith::Diff::Parsers::Html`
1106
+
1107
+ - [ ] **Step 3: Implement the HTML parser**
1108
+
1109
+ Create `lib/docsmith/diff/parsers/html.rb`:
1110
+
1111
+ ```ruby
1112
+ # frozen_string_literal: true
+
+ require "diff/lcs"
+
+ module Docsmith
+   module Diff
+     module Parsers
+       # HTML-aware diff parser for HTML documents.
+       #
+       # Tokenizes HTML so that each tag (including its attributes) is one atomic
+       # unit and text words are separate units. This prevents the diff engine from
+       # splitting `<p class="foo">` into angle brackets, attribute names, and values.
+       #
+       # Tokenization regex: /<[^>]+>|[^\s<>]+/
+       #   - /<[^>]+>/ matches any HTML tag: <p>, </p>, <div class="x">, <br/>
+       #   - /[^\s<>]+/ matches words in text content between tags
+       #
+       # Example: "<p>Hello world</p>" → ["<p>", "Hello", "world", "</p>"]
+       #
+       # The :line key in change hashes stores the 1-indexed token position
+       # (not a line number) for compatibility with Diff::Result serialization.
+       class Html < Renderers::Base
+         TAG_OR_WORD = /<[^>]+>|[^\s<>]+/.freeze
+
+         # @param old_content [String]
+         # @param new_content [String]
+         # @return [Array<Hash>] change hashes with :type, :line (token index), and content keys
+         def compute(old_content, new_content)
+           old_tokens = tokenize(old_content)
+           new_tokens = tokenize(new_content)
+           changes = []
+
+           ::Diff::LCS.sdiff(old_tokens, new_tokens).each do |hunk|
+             case hunk.action
+             when "+"
+               changes << { type: :addition, line: hunk.new_position + 1, content: hunk.new_element.to_s }
+             when "-"
+               changes << { type: :deletion, line: hunk.old_position + 1, content: hunk.old_element.to_s }
+             when "!"
+               changes << {
+                 type: :modification,
+                 line: hunk.old_position + 1,
+                 old_content: hunk.old_element.to_s,
+                 new_content: hunk.new_element.to_s
+               }
+             end
+           end
+
+           changes
+         end
+
+         private
+
+         # Splits HTML into tokens:
+         #   - Each HTML tag (including attributes) is one token
+         #   - Each word in text content is one token
+         # Whitespace between tokens is discarded.
+         def tokenize(content)
+           content.scan(TAG_OR_WORD)
+         end
+       end
+     end
+   end
+ end
1176
+ ```
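As with the Markdown parser, the tokenizing regex can be exercised on its own. This is a small sketch under the assumption that only the `TAG_OR_WORD` pattern from the plan is in play; the `html_tokens` helper is a hypothetical name introduced for illustration.

```ruby
# Illustrative only: same pattern as the plan's TAG_OR_WORD constant.
TAG_OR_WORD = /<[^>]+>|[^\s<>]+/

# Hypothetical helper name; the plan keeps this in a private #tokenize method.
def html_tokens(content)
  content.scan(TAG_OR_WORD)
end

p html_tokens("<p>Hello world</p>")          # => ["<p>", "Hello", "world", "</p>"]
p html_tokens('<div class="foo">bar</div>')  # => ["<div class=\"foo\">", "bar", "</div>"]

# Known limit of a regex tokenizer: a literal ">" inside an attribute value
# ends the tag token early, so the tag is no longer atomic.
p html_tokens('<a title="a > b">x</a>')
```

The last call shows why this approach is a pragmatic approximation and not an HTML parser: input with unescaped `>` in attribute values would need a real parser if atomic tags matter for that content.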
1177
+
1178
+ - [ ] **Step 4: Run tests to verify they pass**
1179
+
1180
+ Run: `bundle exec rspec spec/docsmith/diff/parsers/html_spec.rb`
1181
+
1182
+ Expected: all 7 examples pass.
1183
+
1184
+ - [ ] **Step 5: Commit**
1185
+
1186
+ ```bash
1187
+ git add lib/docsmith/diff/parsers/html.rb spec/docsmith/diff/parsers/html_spec.rb
1188
+ git commit -m "feat(diff): add HTML-aware diff parser treating tags as atomic tokens"
1189
+ ```
1190
+
1191
+ ---
1192
+
1193
+ ## Task 6: Wire Parsers into Engine
1194
+
1195
+ **Files:**
1196
+ - Modify: `lib/docsmith/diff/engine.rb`
1197
+
1198
+ The parsers are already required before the engine in `lib/docsmith.rb` (added in Task 2, Step 1). Now update the engine to use them.
1199
+
1200
+ - [ ] **Step 1: Write a failing test for the engine using format-aware parser**
1201
+
1202
+ Add to `spec/docsmith/diff/engine_spec.rb` (after the existing describe blocks):
1203
+
1204
+ ```ruby
1205
+ describe "format-aware parser dispatch" do
1206
+ let(:md_doc) { create(:document, content: "# Hello", content_type: "markdown") }
1207
+ let(:html_doc) { create(:document, content: "<p>Hello</p>", content_type: "html") }
1208
+
1209
+ let(:md_v1) { create(:document_version, document: md_doc, content: "Hello world", version_number: 1, content_type: "markdown") }
1210
+ let(:md_v2) { create(:document_version, document: md_doc, content: "Hello Ruby world", version_number: 2, content_type: "markdown") }
1211
+
1212
+ let(:html_v1) { create(:document_version, document: html_doc, content: "<p>Hello</p>", version_number: 1, content_type: "html") }
1213
+ let(:html_v2) { create(:document_version, document: html_doc, content: "<p>Hello</p><p>World</p>", version_number: 2, content_type: "html") }
1214
+
1215
+ it "uses Markdown parser for markdown content — detects word addition" do
1216
+ result = described_class.between(md_v1, md_v2)
1217
+ # "Hello world" → "Hello Ruby world": 1 word added ("Ruby")
1218
+ expect(result.additions).to eq(1)
1219
+ expect(result.changes.find { |c| c[:type] == :addition }[:content]).to eq("Ruby")
1220
+ end
1221
+
1222
+ it "uses HTML parser for html content — treats tags as atomic tokens" do
1223
+ result = described_class.between(html_v1, html_v2)
1224
+ # "<p>Hello</p>" → "<p>Hello</p><p>World</p>": 3 token additions
1225
+ expect(result.additions).to eq(3)
1226
+ end
1227
+ end
1228
+ ```
1229
+
1230
+ - [ ] **Step 2: Run to verify the new tests fail**
1231
+
1232
+ Run: `bundle exec rspec spec/docsmith/diff/engine_spec.rb -e "format-aware"`
1233
+
1234
+ Expected: FAIL — engine still uses `Renderers::Base` for all types.
1235
+
1236
+ - [ ] **Step 3: Update `lib/docsmith/diff/engine.rb` to use the PARSERS map**
1237
+
1238
+ Replace the file with:
1239
+
1240
+ ```ruby
1241
+ # frozen_string_literal: true
+
+ module Docsmith
+   module Diff
+     # Computes diffs between two DocumentVersion records.
+     # For markdown and html content types, a format-aware parser is used
+     # (word-level for markdown, tag-atomic for html).
+     # Falls back to Renderers::Base (line-level) for json and unknown types.
+     class Engine
+       PARSERS = {
+         "markdown" => Parsers::Markdown,
+         "html" => Parsers::Html
+       }.freeze
+
+       class << self
+         # @param version_a [Docsmith::DocumentVersion] the older version
+         # @param version_b [Docsmith::DocumentVersion] the newer version
+         # @return [Docsmith::Diff::Result]
+         def between(version_a, version_b)
+           content_type = version_a.content_type.to_s
+           parser = PARSERS.fetch(content_type, Renderers::Base).new
+           changes = parser.compute(version_a.content.to_s, version_b.content.to_s)
+
+           Result.new(
+             content_type: content_type,
+             from_version: version_a.version_number,
+             to_version: version_b.version_number,
+             changes: changes
+           )
+         end
+       end
+     end
+
+     # Convenience module method: Docsmith::Diff.between(v1, v2)
+     def self.between(version_a, version_b)
+       Engine.between(version_a, version_b)
+     end
+   end
+ end
1280
+ ```
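The dispatch itself is just `Hash#fetch` with a default. The sketch below isolates that pattern with stand-in classes (`LineLevel`, `MarkdownWords`, `HtmlTags` and `parser_class_for` are hypothetical names, not Docsmith constants) to show how unknown or missing content types fall back to the line-level behaviour.

```ruby
# Stand-ins for Renderers::Base, Parsers::Markdown and Parsers::Html.
class LineLevel; end
class MarkdownWords; end
class HtmlTags; end

# String keys, as in the plan's PARSERS map.
PARSERS = { "markdown" => MarkdownWords, "html" => HtmlTags }.freeze

# Hypothetical helper: normalise to a String first, then fall back to the
# line-level class when no format-aware parser is registered.
def parser_class_for(content_type)
  PARSERS.fetch(content_type.to_s, LineLevel)
end

p parser_class_for(:markdown)  # => MarkdownWords (symbol normalised by to_s)
p parser_class_for("json")     # => LineLevel
p parser_class_for(nil)        # => LineLevel (nil.to_s is "")
```

The `to_s` call matters because the map is keyed by strings: without it, a symbol such as `:markdown` would silently fall through to the default.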
1281
+
1282
+ - [ ] **Step 4: Run the new engine tests to verify they pass**
1283
+
1284
+ Run: `bundle exec rspec spec/docsmith/diff/engine_spec.rb -e "format-aware"`
1285
+
1286
+ Expected: 2 examples, 0 failures.
1287
+
1288
+ - [ ] **Step 5: Commit**
1289
+
1290
+ ```bash
1291
+ git add lib/docsmith/diff/engine.rb spec/docsmith/diff/engine_spec.rb
1292
+ git commit -m "feat(diff): wire format-aware parsers into Engine via PARSERS dispatch map"
1293
+ ```
1294
+
1295
+ ---
1296
+
1297
+ ## Task 7: Fix Broken Count Expectations in Existing Specs
1298
+
1299
+ The word-level Markdown parser produces more tokens than the old line-level Base renderer. Three spec files assert specific addition counts that now need updating.
1300
+
1301
+ **Affected expectations** (all involve `"line one\nline two"` → `"line one\nline two\nline three"` on `content_type: "markdown"`):
1302
+
1303
+ Old (line-level): 1 line added
1304
+ New (word-level): 3 tokens added (`"\n"`, `"line"`, `"three"`)
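To see where the jump from 1 to 3 comes from, the two granularities can be compared directly on the fixture strings. This is a rough sketch, not the engine's algorithm: because the new text only appends to the old one, dropping the shared prefix is enough here, whereas the real parsers use an LCS diff.

```ruby
old_text = "line one\nline two"
new_text = "line one\nline two\nline three"

# Line-level view: one new line.
old_lines = old_text.lines(chomp: true)
new_lines = new_text.lines(chomp: true)
added_lines = new_lines.drop(old_lines.length)
p added_lines   # => ["line three"]

# Word-level view, using the Markdown tokenizer regex from Task 4.
old_tokens = old_text.scan(/\S+|\n+/)
new_tokens = new_text.scan(/\S+|\n+/)
added_tokens = new_tokens.drop(old_tokens.length)
p added_tokens  # => ["\n", "line", "three"]
```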
1305
+
1306
+ **Files:**
1307
+ - Modify: `spec/docsmith/diff/engine_spec.rb`
1308
+ - Modify: `spec/docsmith/phase2_integration_spec.rb`
1309
+ - Modify: `spec/docsmith/versionable_spec.rb`
1310
+
1311
+ - [ ] **Step 1: Fix `spec/docsmith/diff/engine_spec.rb`**
1312
+
1313
+ In the `describe ".between"` block, the test `"detects the added line"`:
1314
+
1315
+ ```ruby
1316
+ # BEFORE:
1317
+ it "detects the added line" do
1318
+ expect(result.additions).to eq(1)
1319
+ expect(result.deletions).to eq(0)
1320
+ end
1321
+
1322
+ # AFTER:
1323
+ it "detects token additions (word-level for markdown)" do
1324
+ # v1: "line one\nline two" → tokens: ["line", "one", "\n", "line", "two"]
1325
+ # v2: adds "\nline three" → 3 new tokens: "\n", "line", "three"
1326
+ expect(result.additions).to eq(3)
1327
+ expect(result.deletions).to eq(0)
1328
+ end
1329
+ ```
1330
+
1331
+ In the `describe "Docsmith::Diff.between (module convenience method)"` block:
1332
+
1333
+ ```ruby
1334
+ # BEFORE:
1335
+ expect(result.additions).to eq(1)
1336
+
1337
+ # AFTER:
1338
+ expect(result.additions).to eq(3)
1339
+ ```
1340
+
1341
+ - [ ] **Step 2: Run engine spec to verify**
1342
+
1343
+ Run: `bundle exec rspec spec/docsmith/diff/engine_spec.rb`
1344
+
1345
+ Expected: all examples pass.
1346
+
1347
+ - [ ] **Step 3: Fix `spec/docsmith/phase2_integration_spec.rb`**
1348
+
1349
+ ```ruby
1350
+ # BEFORE — "diff_from returns correct addition count":
1351
+ expect(result.additions).to eq(1)
1352
+
1353
+ # AFTER:
1354
+ expect(result.additions).to eq(3)
1355
+ ```
1356
+
1357
+ ```ruby
1358
+ # BEFORE — "Diff::Result#to_json returns valid JSON with stats":
1359
+ expect(parsed["stats"]["additions"]).to eq(1)
1360
+
1361
+ # AFTER:
1362
+ expect(parsed["stats"]["additions"]).to eq(3)
1363
+ ```
1364
+
1365
+ - [ ] **Step 4: Run phase2 spec to verify**
1366
+
1367
+ Run: `bundle exec rspec spec/docsmith/phase2_integration_spec.rb`
1368
+
1369
+ Expected: all 6 examples pass.
1370
+
1371
+ - [ ] **Step 5: Fix `spec/docsmith/versionable_spec.rb`** — two changes
1372
+
1373
+ **Change A:** Remove the branch describe blocks (lines 435–531). Delete the following four describe blocks entirely:
1374
+
1375
+ ```ruby
1376
+ # DELETE this entire block (lines 435–455):
1377
+ describe "#save_version! with branch:" do
1378
+ ...
1379
+ end
1380
+
1381
+ # DELETE this entire block (lines 457–476):
1382
+ describe "#create_branch!" do
1383
+ ...
1384
+ end
1385
+
1386
+ # DELETE this entire block (lines 478–500):
1387
+ describe "#branches and #active_branches" do
1388
+ ...
1389
+ end
1390
+
1391
+ # DELETE this entire block (lines 502–531):
1392
+ describe "#merge_branch!" do
1393
+ ...
1394
+ end
1395
+ ```
1396
+
1397
+ **Change B:** In the `describe "#diff_from"` block, update the addition count:
1398
+
1399
+ ```ruby
1400
+ # BEFORE:
1401
+ expect(result.additions).to eq(1)
1402
+
1403
+ # AFTER:
1404
+ expect(result.additions).to eq(3)
1405
+ ```
1406
+
1407
+ - [ ] **Step 6: Run versionable spec to verify**
1408
+
1409
+ Run: `bundle exec rspec spec/docsmith/versionable_spec.rb`
1410
+
1411
+ Expected: all examples pass (branch describe blocks gone, count updated).
1412
+
1413
+ - [ ] **Step 7: Run the full suite**
1414
+
1415
+ Run: `bundle exec rspec --format progress`
1416
+
1417
+ Expected: 0 failures. Note the total example count will be lower than before (branch specs deleted).
1418
+
1419
+ - [ ] **Step 8: Commit**
1420
+
1421
+ ```bash
1422
+ git add spec/docsmith/diff/engine_spec.rb spec/docsmith/phase2_integration_spec.rb spec/docsmith/versionable_spec.rb
1423
+ git commit -m "test: update addition count assertions for word-level Markdown parser"
1424
+ ```
1425
+
1426
+ ---
1427
+
1428
+ ## Task 8: Update gemspec Description
1429
+
1430
+ **Files:**
1431
+ - Modify: `docsmith.gemspec`
1432
+
1433
+ - [ ] **Step 1: Update the description to remove branching mention**
1434
+
1435
+ In `docsmith.gemspec`, replace the `spec.description` block:
1436
+
1437
+ ```ruby
1438
+ # BEFORE:
1439
+ spec.description = <<~DESC
1440
+ Docsmith is a full-featured document versioning layer for Ruby on Rails.
1441
+
1442
+ It gives any ActiveRecord model snapshot-based versioning, multi-format diff rendering,
1443
+ inline range-anchored comments, and Git-like branching & merging — all with zero
1444
+ system dependencies.
1445
+
1446
+ • Full content snapshots (HTML, Markdown, JSON) for trivial rollbacks
1447
+ • Pure-Ruby diff-lcs engine with line-level diffs and stats
1448
+ • Document-level + range-anchored comments with threading and migration
1449
+ • Branching and three-way merge support
1450
+ • Per-class configuration, auto-save with debounce, events, and a clean service-object API
1451
+
1452
+ Perfect for wikis, CMS pages, API specs, legal documents, or any content that needs
1453
+ audit trails, collaboration, and version history.
1454
+ DESC
1455
+
1456
+ # AFTER:
1457
+ spec.description = <<~DESC
1458
+ Docsmith adds snapshot-based versioning to any ActiveRecord model with zero system dependencies.
1459
+
1460
+ • Full content snapshots (HTML, Markdown, JSON) for instant rollbacks
1461
+ • Format-aware diff engine: word-level diffs for Markdown, tag-atomic diffs for HTML
1462
+ • Document-level and range-anchored comments with threading and version migration
1463
+ • Per-class configuration, debounced auto-save, lifecycle events, and a clean API
1464
+
1465
+ Perfect for wikis, CMS pages, API specs, legal documents, or any content that needs
1466
+ an audit trail and inline collaboration.
1467
+ DESC
1468
+ ```
1469
+
1470
+ - [ ] **Step 2: Commit**
1471
+
1472
+ ```bash
1473
+ git add docsmith.gemspec
1474
+ git commit -m "docs(gemspec): update description to reflect current feature set"
1475
+ ```
1476
+
1477
+ ---
1478
+
1479
+ ## Task 9: Write USAGE.md and Update README.md
1480
+
1481
+ **Files:**
1482
+ - Create: `USAGE.md`
1483
+ - Modify: `README.md`
1484
+
1485
+ - [ ] **Step 1: Create `USAGE.md`**
1486
+
1487
+ ```markdown
1488
+ # Docsmith Usage Guide
1489
+
1490
+ Docsmith adds snapshot-based versioning, format-aware diffs, and inline comments to any
1491
+ ActiveRecord model. It stores all data in your existing database — no external services.
1492
+
1493
+ ---
1494
+
1495
+ ## Table of Contents
1496
+
1497
+ 1. [Installation](#1-installation)
1498
+ 2. [Setup — Migration](#2-setup--migration)
1499
+ 3. [Setup — Include Versionable](#3-setup--include-versionable)
1500
+ 4. [Per-Class Configuration](#4-per-class-configuration)
1501
+ 5. [Global Configuration](#5-global-configuration)
1502
+ 6. [Saving Versions](#6-saving-versions)
1503
+ 7. [Auto-Save and Debounce](#7-auto-save-and-debounce)
1504
+ 8. [Querying Versions](#8-querying-versions)
1505
+ 9. [Restoring Versions](#9-restoring-versions)
1506
+ 10. [Tagging Versions](#10-tagging-versions)
1507
+ 11. [Diffs](#11-diffs)
1508
+ 12. [Comments](#12-comments)
1509
+ 13. [Events and Hooks](#13-events-and-hooks)
1510
+ 14. [Standalone Document API](#14-standalone-document-api)
1511
+ 15. [Configuration Reference](#15-configuration-reference)
1512
+
1513
+ ---
1514
+
1515
+ ## 1. Installation
1516
+
1517
+ Add to your `Gemfile`:
1518
+
1519
+ ```ruby
1520
+ gem "docsmith"
1521
+ ```
1522
+
1523
+ Then:
1524
+
1525
+ ```bash
1526
+ bundle install
1527
+ ```
1528
+
1529
+ ---
1530
+
1531
+ ## 2. Setup — Migration
1532
+
1533
+ Run the install generator to create the migration:
1534
+
1535
+ ```bash
1536
+ rails generate docsmith:install
1537
+ rails db:migrate
1538
+ ```
1539
+
1540
+ This creates four tables:
1541
+
1542
+ | Table | Purpose |
1543
+ |-------------------------|----------------------------------------------|
1544
+ | `docsmith_documents` | One record per versioned model instance |
1545
+ | `docsmith_versions` | Content snapshots (immutable) |
1546
+ | `docsmith_version_tags` | Named tags on specific versions |
1547
+ | `docsmith_comments` | Inline and document-level comments |
1548
+
1549
+ ---
1550
+
1551
+ ## 3. Setup — Include Versionable
1552
+
1553
+ Add `include Docsmith::Versionable` to any ActiveRecord model. Optionally configure
1554
+ it with `docsmith_config`:
1555
+
1556
+ ```ruby
1557
+ class Article < ApplicationRecord
1558
+ include Docsmith::Versionable
1559
+
1560
+ docsmith_config do
1561
+ content_field :body # which column holds the document content
1562
+ content_type :markdown # :markdown, :html, or :json
1563
+ end
1564
+ end
1565
+ ```
1566
+
1567
+ That is all you need. Docsmith automatically creates a shadow `Docsmith::Document`
1568
+ record the first time a version is saved for each model instance.
1569
+
1570
+ ---
1571
+
1572
+ ## 4. Per-Class Configuration
1573
+
1574
+ `docsmith_config` accepts a block that can set any of the following keys:
1575
+
1576
+ ```ruby
1577
+ class LegalDocument < ApplicationRecord
1578
+ include Docsmith::Versionable
1579
+
1580
+ docsmith_config do
1581
+ content_field :body # column to snapshot (default: :body)
1582
+ content_type :html # :markdown (default), :html, :json
1583
+ auto_save false # disable auto-save callback (default: true)
1584
+ debounce 60 # seconds between auto-saves (default: 30)
1585
+ max_versions 50 # cap on stored versions per document (default: nil = unlimited)
1586
+ content_extractor ->(r) { r.body.to_s.strip } # override content_field with a proc
1587
+ end
1588
+ end
1589
+ ```
1590
+
1591
+ **`content_extractor`** is useful when the field you want to version is not a plain
1592
+ string column:
1593
+
1594
+ ```ruby
1595
+ docsmith_config do
1596
+ content_field :body_data # e.g. an Action Text rich-text attribute (responds to to_plain_text)
1597
+ content_type :markdown
1598
+ content_extractor ->(record) { record.body_data.to_plain_text }
1599
+ end
1600
+ ```
1601
+
1602
+ ---
1603
+
1604
+ ## 5. Global Configuration
1605
+
1606
+ Set defaults for the whole app in `config/initializers/docsmith.rb`:
1607
+
1608
+ ```ruby
1609
+ Docsmith.configure do |config|
1610
+ config.default_content_field = :body
1611
+ config.default_content_type = :markdown
1612
+ config.auto_save = true
1613
+ config.default_debounce = 30 # seconds
1614
+ config.max_versions = nil # nil = unlimited
1615
+ end
1616
+ ```
1617
+
1618
+ Resolution order: **per-class `docsmith_config`** > **global `Docsmith.configure`** > **gem defaults**.
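A rough sketch of that lookup order, using plain hashes in place of Docsmith's configuration objects. The helper name, the hash layout and the simplified key names are illustrative assumptions, not the gem's internals.

```ruby
# Assumed defaults, taken from the values documented in this guide.
GEM_DEFAULTS = { content_type: :markdown, auto_save: true, debounce: 30 }.freeze

# Hypothetical helper: the first layer that defines the key wins.
# key? is used instead of a truthiness check so that an explicit
# `false` (e.g. auto_save false) still overrides a later layer.
def resolve_setting(key, per_class:, global:)
  [per_class, global, GEM_DEFAULTS].each do |layer|
    return layer[key] if layer.key?(key)
  end
  nil
end

global    = { debounce: 10 }
per_class = { auto_save: false }

p resolve_setting(:auto_save,    per_class: per_class, global: global)  # => false
p resolve_setting(:debounce,     per_class: per_class, global: global)  # => 10
p resolve_setting(:content_type, per_class: per_class, global: global)  # => :markdown
```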
1619
+
1620
+ ---
1621
+
1622
+ ## 6. Saving Versions
1623
+
1624
+ Call `save_version!` to take an explicit snapshot:
1625
+
1626
+ ```ruby
1627
+ article = Article.find(1)
1628
+ article.body = "Updated content here."
1629
+ article.save!
1630
+
1631
+ version = article.save_version!(author: current_user, summary: "Fixed typo in intro")
1632
+ # => #<Docsmith::DocumentVersion version_number: 3, content_type: "markdown", ...>
1633
+ ```
1634
+
1635
+ - Returns the new `DocumentVersion` record.
1636
+ - Returns `nil` if the content has not changed since the last snapshot.
1637
+ - Raises `Docsmith::InvalidContentField` if `content_field` returns a non-String and
1638
+ no `content_extractor` is configured.
1639
+
1640
+ ---
1641
+
1642
+ ## 7. Auto-Save and Debounce
1643
+
1644
+ When `auto_save: true` (the default), Docsmith hooks into ActiveRecord's `after_save`
1645
+ callback and automatically takes a snapshot after every model save — subject to the
1646
+ debounce window.
1647
+
1648
+ ```ruby
1649
+ article.body = "New draft"
1650
+ article.save! # triggers auto_save_version! internally
1651
+ ```
1652
+
1653
+ The **debounce** prevents a snapshot from being created if another snapshot was already
1654
+ taken within the last N seconds (default: 30). This avoids flooding the version history
1655
+ when a user is rapidly typing and saving.
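The debounce check can be pictured as a simple time comparison. The sketch below is an assumption about the shape of that check, not Docsmith's actual code; `debounced?` is a hypothetical helper, and using the document's `last_versioned_at` timestamp as the reference point is likewise assumed.

```ruby
# Hypothetical helper: true when a snapshot should be skipped because the
# previous one is more recent than the debounce window.
def debounced?(last_versioned_at, debounce_seconds, now = Time.now)
  return false if last_versioned_at.nil?  # no earlier snapshot, nothing to debounce

  (now - last_versioned_at) < debounce_seconds
end

now = Time.at(1_000)
p debounced?(Time.at(990), 30, now)  # => true  (10 seconds since last snapshot)
p debounced?(Time.at(960), 30, now)  # => false (40 seconds since last snapshot)
p debounced?(nil, 30, now)           # => false (first snapshot)
```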
1656
+
1657
+ You can also call `auto_save_version!` directly:
1658
+
1659
+ ```ruby
1660
+ article.auto_save_version!(author: current_user)
1661
+ ```
1662
+
1663
+ To disable auto-save for a class:
1664
+
1665
+ ```ruby
1666
+ docsmith_config { auto_save false }
1667
+ ```
1668
+
1669
+ ---
1670
+
1671
+ ## 8. Querying Versions
1672
+
1673
+ ```ruby
1674
+ # All versions, ordered by version_number ascending
1675
+ article.versions
1676
+ # => ActiveRecord::Relation<Docsmith::DocumentVersion>
1677
+
1678
+ # Latest version
1679
+ article.current_version
1680
+ # => #<Docsmith::DocumentVersion version_number: 5, ...>
1681
+
1682
+ # Specific version by number
1683
+ article.version(3)
1684
+ # => #<Docsmith::DocumentVersion version_number: 3, ...>
1685
+
1686
+ # Inspect content
1687
+ article.version(2).content # => "Body text at v2"
1688
+ article.version(2).content_type # => "markdown"
1689
+ article.version(2).author # => #<User id: 1, ...>
1690
+ article.version(2).change_summary # => "Second draft"
1691
+ article.version(2).created_at # => 2026-03-01 14:22:00 UTC
1692
+
1693
+ # Render a version's content
1694
+ article.version(2).render(:html) # => "<p>Body text at v2</p>"
1695
+ article.version(2).render(:json) # => '{"version":2,"content":"..."}'
1696
+ ```
1697
+
1698
+ ---
1699
+
1700
+ ## 9. Restoring Versions
1701
+
1702
+ Restore creates a **new version** whose content matches an older snapshot. It never
1703
+ mutates existing version records.
1704
+
1705
+ ```ruby
1706
+ restored = article.restore_version!(2, author: current_user)
1707
+ # => #<Docsmith::DocumentVersion version_number: 6, change_summary: "Restored from v2", ...>
1708
+
1709
+ article.reload.body # => the body content from v2
1710
+ ```
1711
+
1712
+ - The model's `content_field` column is updated via `update_column` (bypasses callbacks
1713
+ to avoid a duplicate auto-save).
1714
+ - Fires the `:version_restored` event hook (see §13).
1715
+ - Raises `Docsmith::VersionNotFound` if the version number does not exist.
1716
+
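The append-only behaviour described above can be modelled in plain Ruby. This is a sketch under stated assumptions: `RestorableVersion` and `restore` are hypothetical, in-memory stand-ins and do not touch the database or the gem's classes.

```ruby
# Hypothetical in-memory model of append-only restore (not Docsmith's code).
RestorableVersion = Struct.new(:number, :content, :summary)

# Returns a new history with one extra version copied from the target.
def restore(versions, target_number)
  source = versions.find { |v| v.number == target_number }
  raise ArgumentError, "version #{target_number} not found" if source.nil?

  # Append a copy under the next version number; the source record is untouched.
  next_number = versions.map(&:number).max + 1
  versions + [RestorableVersion.new(next_number, source.content, "Restored from v#{target_number}")]
end

history = [RestorableVersion.new(1, "draft", "First"), RestorableVersion.new(2, "edited", "Second")]
updated = restore(history, 1)
updated.last.number  # => 3
updated.last.content # => "draft"
history.size         # => 2 (the original history is not mutated)
```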
1717
+ ---
1718
+
1719
+ ## 10. Tagging Versions
1720
+
1721
+ Tags are named pointers to specific versions, unique per document.
1722
+
1723
+ ```ruby
1724
+ # Create a tag
1725
+ article.tag_version!(3, name: "v1.0-release", author: current_user)
1726
+
1727
+ # Look up a version by tag name
1728
+ v = article.tagged_version("v1.0-release")
1729
+ # => #<Docsmith::DocumentVersion version_number: 3, ...>
1730
+
1731
+ # List tag names on a version
1732
+ article.version_tags(3)
1733
+ # => ["v1.0-release", "stable"]
1734
+ ```
1735
+
1736
+ - Raises `Docsmith::TagAlreadyExists` if the name is already used on this document.
1737
+ - Raises `Docsmith::VersionNotFound` if the version number does not exist.
1738
+
1739
+ **Interaction with `max_versions`:** Tagged versions are never pruned automatically.
1740
+ If all versions are tagged and a prune would be needed, `Docsmith::MaxVersionsExceeded`
1741
+ is raised.
1742
+
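One way to picture the interaction between tags and pruning is the loop below. Everything in it is hypothetical: `SketchMaxVersionsExceeded` stands in for `Docsmith::MaxVersionsExceeded`, and the oldest-untagged-first policy is an assumption about how a prune might pick its victim.

```ruby
# Hypothetical pruning sketch; names below are illustrative, not Docsmith's API.
class SketchMaxVersionsExceeded < StandardError; end

PrunableVersion = Struct.new(:number, :tagged)

# Drops the oldest untagged versions until at most max_versions remain.
def prune(versions, max_versions)
  kept = versions.sort_by(&:number)
  while kept.size > max_versions
    victim = kept.find { |v| !v.tagged } # oldest untagged version, if any
    raise SketchMaxVersionsExceeded, "all versions are tagged" if victim.nil?

    kept.delete(victim)
  end
  kept
end

versions = [PrunableVersion.new(1, true), PrunableVersion.new(2, false), PrunableVersion.new(3, false)]
prune(versions, 2).map(&:number) # => [1, 3]
```

Version 1 survives even though it is the oldest, because it is tagged; version 2 is removed instead.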
1743
+ ---
1744
+
1745
+ ## 11. Diffs
1746
+
1747
+ Docsmith computes diffs between any two versions. The parser used depends on the
1748
+ document's `content_type`.
1749
+
1750
+ ### Diff from version N to current
1751
+
1752
+ ```ruby
1753
+ result = article.diff_from(1)
1754
+ # => #<Docsmith::Diff::Result from_version: 1, to_version: 5, ...>
1755
+
1756
+ result.additions # => integer count of added tokens
1757
+ result.deletions # => integer count of removed tokens
1758
+ result.to_html # => HTML string with <ins>/<del> markup
1759
+ result.to_json # => JSON string with stats and changes array
1760
+ ```
1761
+
1762
+ ### Diff between two specific versions
1763
+
1764
+ ```ruby
1765
+ result = article.diff_between(2, 4)
1766
+ ```
1767
+
1768
+ ### Format-aware parsers
1769
+
1770
+ | `content_type` | Parser | Token unit |
1771
+ |----------------|--------|-----------|
1772
+ | `markdown` | `Docsmith::Diff::Parsers::Markdown` | Each whitespace-delimited word; newline runs are one token |
1773
+ | `html` | `Docsmith::Diff::Parsers::Html` | Each HTML tag (including attributes) is one token; words in text are separate tokens |
1774
+ | `json` | `Docsmith::Diff::Renderers::Base` | Line-level (whole lines) |
1775
+
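To make the "token unit" column concrete, here is a minimal tokenizer sketch for the Markdown and HTML rows. The regular expressions and the method names `markdown_tokens` and `html_tokens` are assumptions made for illustration; the gem's parsers may split text differently at the edges.

```ruby
# Hypothetical tokenizers matching the table's "token unit" column.
# These regexes are an assumption, not the gem's actual parser code.

# Markdown: each non-whitespace run is a word; a run of newlines is one token.
def markdown_tokens(text)
  text.scan(/\n+|\S+/)
end

# HTML: a whole tag (attributes included) is one token; text splits into words.
def html_tokens(html)
  html.scan(/<[^>]+>|[^<\s]+/)
end

markdown_tokens("The quick\n\nbrown fox")
# => ["The", "quick", "\n\n", "brown", "fox"]
html_tokens('<p class="lead">Hello world</p>')
# => ["<p class=\"lead\">", "Hello", "world", "</p>"]
```

Treating a tag as atomic means an attribute change shows up as one changed token instead of a scatter of character-level edits.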
1776
+ **Markdown example:**
1777
+
1778
+ ```ruby
1779
+ # v1 content: "The quick brown fox"
1780
+ # v2 content: "The quick red fox"
1781
+ result = article.diff_between(1, 2)
1782
+ result.changes
1783
+ # => [{ type: :modification, line: 3, old_content: "brown", new_content: "red" }]
1784
+ result.additions # => 0
1785
+ result.deletions # => 0
1786
+ ```
1787
+
1788
+ **HTML example:**
1789
+
1790
+ ```ruby
1791
+ # v1 content: "<p>Hello world</p>"
1792
+ # v2 content: "<p>Hello world</p><p>New paragraph</p>"
1793
+ # old tokens: ["<p>", "Hello", "world", "</p>"]
1794
+ # new tokens: ["<p>", "Hello", "world", "</p>", "<p>", "New", "paragraph", "</p>"]
1795
+ # LCS: first 4 tokens match exactly → 4 additions: "<p>", "New", "paragraph", "</p>"
1796
+ result = article.diff_between(1, 2)
1797
+ result.additions # => 4
1798
+ ```
1799
+
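The counting in the HTML example can be reproduced with a small longest-common-subsequence (LCS) sketch. This is an assumption about the general approach, not the gem's code: `lcs_length` and `diff_counts` are hypothetical helpers, and the sketch does not pair a removed token with an added one into a `:modification` the way the Markdown example does.

```ruby
# Sketch of LCS-based counting on token arrays (illustrative only).
def lcs_length(a, b)
  # table[i][j] holds the LCS length of a[0...i] and b[0...j]
  table = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  a.each_with_index do |x, i|
    b.each_with_index do |y, j|
      table[i + 1][j + 1] = x == y ? table[i][j] + 1 : [table[i][j + 1], table[i + 1][j]].max
    end
  end
  table[a.size][b.size]
end

# Tokens outside the common subsequence are either added or removed.
def diff_counts(old_tokens, new_tokens)
  common = lcs_length(old_tokens, new_tokens)
  { additions: new_tokens.size - common, deletions: old_tokens.size - common }
end

old_tokens = ["<p>", "Hello", "world", "</p>"]
new_tokens = old_tokens + ["<p>", "New", "paragraph", "</p>"]
diff_counts(old_tokens, new_tokens) # => { additions: 4, deletions: 0 }
```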
1800
+ ### to_html output
1801
+
1802
+ ```ruby
1803
+ result.to_html
1804
+ # => '<div class="docsmith-diff">
1805
+ # <ins class="docsmith-addition">Ruby</ins>
1806
+ # <del class="docsmith-deletion">Python</del>
1807
+ # </div>'
1808
+ ```
1809
+
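A renderer producing markup of this shape could look roughly like the following. It is a sketch under assumptions: `render_diff_html` is a hypothetical helper, it emits everything on one line instead of the indented layout shown above, it skips `:modification` entries, and escaping via `ERB::Util.html_escape` is a choice made here, not something the gem is confirmed to do.

```ruby
require "erb"

# Hypothetical renderer for change hashes. The CSS class names mirror the
# sample output; the method itself is not the gem's renderer.
def render_diff_html(changes)
  body = changes.map do |change|
    text = ERB::Util.html_escape(change[:content].to_s)
    case change[:type]
    when :addition then %(<ins class="docsmith-addition">#{text}</ins>)
    when :deletion then %(<del class="docsmith-deletion">#{text}</del>)
    end
  end
  %(<div class="docsmith-diff">#{body.compact.join}</div>)
end

render_diff_html([
  { type: :addition, content: "Ruby" },
  { type: :deletion, content: "a < b" }
])
# => '<div class="docsmith-diff"><ins class="docsmith-addition">Ruby</ins><del class="docsmith-deletion">a &lt; b</del></div>'
```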
1810
+ ### to_json output
1811
+
1812
+ ```ruby
1813
+ JSON.parse(result.to_json)
1814
+ # => {
1815
+ # "content_type" => "markdown",
1816
+ # "from_version" => 1,
1817
+ # "to_version" => 3,
1818
+ # "stats" => { "additions" => 2, "deletions" => 1 },
1819
+ # "changes" => [
1820
+ # { "type" => "addition", "position" => { "line" => 5 }, "content" => "Ruby" },
1821
+ # { "type" => "deletion", "position" => { "line" => 3 }, "content" => "Python" },
1822
+ # { "type" => "modification", "position" => { "line" => 7 }, "old_content" => "foo", "new_content" => "bar" }
1823
+ # ]
1824
+ # }
1825
+ ```
1826
+
1827
+ ---
1828
+
1829
+ ## 12. Comments
1830
+
1831
+ Comments can be attached to a specific version. They are either **document-level** (no
1832
+ position) or **range-anchored** (tied to a character offset range).
1833
+
1834
+ ### Add a comment
1835
+
1836
+ ```ruby
1837
+ # Document-level comment
1838
+ comment = article.add_comment!(
1839
+ version: 2,
1840
+ body: "This section needs a citation.",
1841
+ author: current_user
1842
+ )
1843
+ comment.anchor_type # => "document"
1844
+
1845
+ # Range-anchored (inline) comment — offsets into the version's content string
1846
+ comment = article.add_comment!(
1847
+ version: 2,
1848
+ body: "Cite this claim.",
1849
+ author: current_user,
1850
+ anchor: { start_offset: 42, end_offset: 78 }
1851
+ )
1852
+ comment.anchor_type # => "range"
1853
+ comment.anchor_data["start_offset"] # => 42
1854
+ comment.anchor_data["anchored_text"] # => the substring from offset 42–78
1855
+ ```
1856
+
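For intuition, the anchor data for a range comment can be derived from the version's content string as sketched below. `build_range_anchor` is a hypothetical helper, and treating `end_offset` as exclusive is an assumption; check the gem's anchor code before relying on either.

```ruby
# Hypothetical anchor builder; the end-exclusive offset convention is an assumption.
def build_range_anchor(content, start_offset:, end_offset:)
  unless start_offset >= 0 && start_offset < end_offset && end_offset <= content.length
    raise ArgumentError, "offsets #{start_offset}..#{end_offset} fall outside the content"
  end

  {
    "start_offset"  => start_offset,
    "end_offset"    => end_offset,
    "anchored_text" => content[start_offset...end_offset] # snapshot of the selected text
  }
end

content = "Ruby is fast. Citation needed."
anchor = build_range_anchor(content, start_offset: 0, end_offset: 13)
anchor["anchored_text"] # => "Ruby is fast."
```

Storing the selected text alongside the offsets makes it possible to detect later whether the same offsets still point at the same words.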
1857
+ ### Thread replies
1858
+
1859
+ ```ruby
1860
+ reply = article.add_comment!(
1861
+ version: 2,
1862
+ body: "Good point, fixing now.",
1863
+ author: other_user,
1864
+ parent: comment
1865
+ )
1866
+ comment.replies # => [reply]
1867
+ reply.parent # => comment
1868
+ ```
1869
+
1870
+ ### Query comments
1871
+
1872
+ ```ruby
1873
+ # All comments across all versions (AR relation)
1874
+ article.comments
1875
+
1876
+ # Comments on a specific version
1877
+ article.comments_on(version: 2)
1878
+
1879
+ # Filter by type
1880
+ article.comments_on(version: 2, type: :range) # inline only
1881
+ article.comments_on(version: 2, type: :document) # document-level only
1882
+
1883
+ # Unresolved comments across all versions
1884
+ article.unresolved_comments
1885
+ ```
1886
+
1887
+ ### Resolve a comment
1888
+
1889
+ ```ruby
1890
+ Docsmith::Comments::Manager.resolve!(comment, by: current_user)
1891
+ comment.reload.resolved # => true
1892
+ comment.resolved_by # => current_user
1893
+ comment.resolved_at # => Time
1894
+ ```
1895
+
1896
+ ### Migrate comments between versions
1897
+
1898
+ When a new version is saved, document-level comments from the previous version can be
1899
+ migrated forward:
1900
+
1901
+ ```ruby
1902
+ article.migrate_comments!(from: 2, to: 3)
1903
+ # Copies document-level (non-range) comments from v2 to v3.
1904
+ # Range comments are not migrated — their offsets may no longer be valid.
1905
+ ```
1906
+
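The migration rule (copy document-level comments, leave range comments behind) can be sketched in memory as follows. `SketchComment` and `migrate_comments` are hypothetical names for illustration and say nothing about how the gem persists the copies.

```ruby
# Hypothetical in-memory sketch of forward migration (not the gem's migrator).
SketchComment = Struct.new(:version, :anchor_type, :body)

# Copies document-level comments from one version to another.
# Range comments are skipped because their offsets may no longer be valid.
def migrate_comments(comments, from:, to:)
  copies = comments
    .select { |c| c.version == from && c.anchor_type == "document" }
    .map { |c| SketchComment.new(to, c.anchor_type, c.body) }
  comments + copies
end

comments = [
  SketchComment.new(2, "document", "Needs a citation."),
  SketchComment.new(2, "range", "Cite this claim.")
]
migrated = migrate_comments(comments, from: 2, to: 3)
migrated.select { |c| c.version == 3 }.map(&:body) # => ["Needs a citation."]
```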
1907
+ ---
1908
+
1909
+ ## 13. Events and Hooks
1910
+
1911
+ Docsmith fires synchronous events you can subscribe to via `Docsmith.configure`.
1912
+
1913
+ ```ruby
1914
+ Docsmith.configure do |config|
1915
+ config.on(:version_created) do |event|
1916
+ Rails.logger.info "[Docsmith] v#{event.version.version_number} saved on #{event.document.title}"
1917
+ AuditLog.create!(action: "version_created", record: event.record)
1918
+ end
1919
+
1920
+ config.on(:version_restored) do |event|
1921
+ Rails.logger.info "[Docsmith] Restored to v#{event.version.version_number}"
1922
+ end
1923
+
1924
+ config.on(:version_tagged) do |event|
1925
+ Rails.logger.info "[Docsmith] Tagged v#{event.version.version_number} as '#{event.tag_name}'"
1926
+ end
1927
+ end
1928
+ ```
1929
+
1930
+ **Event payload** (`event` is a `Docsmith::Events::Event`):
1931
+
1932
+ | Field | Type | Always present |
1933
+ |----------------|--------------------------------------|----------------|
1934
+ | `event.record` | The originating AR model (or Document if standalone) | yes |
1935
+ | `event.document` | `Docsmith::Document` | yes |
1936
+ | `event.version` | `Docsmith::DocumentVersion` | yes |
1937
+ | `event.author` | whatever you passed as `author:` | yes |
1938
+ | `event.tag_name` | String (`:version_tagged` only) | no |
1939
+ | `event.from_version` | DocumentVersion (`:version_restored` only) | no |
1940
+
1941
+ Hooks run synchronously and fire before `ActiveSupport::Notifications`, so a slow hook blocks the operation that triggered it.
1942
+ Keep hooks fast.
1943
+
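To see why slow hooks block the caller, consider a minimal registry that invokes each subscriber inline. `SketchHookRegistry` is a hypothetical class written for this illustration; the gem's hook registry may differ in structure and payload type.

```ruby
# Hypothetical hook registry sketch (illustrative, not the gem's implementation).
class SketchHookRegistry
  def initialize
    @hooks = Hash.new { |hash, key| hash[key] = [] }
  end

  # Registers a block to run whenever event_name fires.
  def on(event_name, &block)
    @hooks[event_name] << block
  end

  # Runs every hook in registration order on the calling thread,
  # so the caller waits until the last hook returns.
  def fire(event_name, payload)
    @hooks[event_name].each { |hook| hook.call(payload) }
  end
end

registry = SketchHookRegistry.new
log = []
registry.on(:version_created) { |event| log << "v#{event[:version_number]} saved" }
registry.fire(:version_created, { version_number: 3 })
log # => ["v3 saved"]
```

Work that is slow or may fail, such as outbound HTTP calls, is usually better handed off to a background job from inside the hook.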
1944
+ ---
1945
+
1946
+ ## 14. Standalone Document API
1947
+
1948
+ `Docsmith::Versionable` is a convenience wrapper. You can use `Docsmith::Document` and
1949
+ `Docsmith::VersionManager` directly without any model mixin:
1950
+
1951
+ ```ruby
1952
+ doc = Docsmith::Document.create!(
1953
+ title: "My API Spec",
1954
+ content: "# Version 1\n\nInitial spec.",
1955
+ content_type: "markdown"
1956
+ )
1957
+
1958
+ v1 = Docsmith::VersionManager.save!(doc, author: nil, summary: "Initial draft")
1959
+ doc.update_column(:content, "# Version 1\n\nRevised spec.")
1960
+ v2 = Docsmith::VersionManager.save!(doc, author: nil, summary: "Revised intro")
1961
+
1962
+ # Diff
1963
+ result = Docsmith::Diff.between(v1, v2)
1964
+ result.additions # => number of added tokens
1965
+ result.to_html # => HTML diff markup
1966
+
1967
+ # Restore
1968
+ Docsmith::VersionManager.restore!(doc, version: 1, author: nil)
1969
+ doc.reload.content # => "# Version 1\n\nInitial spec."
1970
+
1971
+ # Tag
1972
+ Docsmith::VersionManager.tag!(doc, version: 1, name: "golden", author: nil)
1973
+ ```
1974
+
1975
+ ---
1976
+
1977
+ ## 15. Configuration Reference
1978
+
1979
+ | Key | Default | Description |
1980
+ |-----|---------|-------------|
1981
+ | `default_content_field` | `:body` | Column to snapshot when no per-class override |
1982
+ | `default_content_type` | `:markdown` | Content type for new documents |
1983
+ | `auto_save` | `true` | Enable after_save auto-snapshot |
1984
+ | `default_debounce` | `30` | Minimum seconds between auto-save snapshots |
1985
+ | `max_versions` | `nil` | Max snapshots per document; `nil` = unlimited |
1986
+ | `content_extractor` | `nil` | Global proc overriding `content_field` |
1987
+ | `table_prefix` | `"docsmith"` | Table name prefix |
1988
+ | `diff_context_lines` | `3` | Context lines in diff output |
1989
+
1990
+ **Error classes:**
1991
+
1992
+ | Class | Raised when |
1993
+ |-------|-------------|
1994
+ | `Docsmith::InvalidContentField` | `content_field` returns a non-String |
1995
+ | `Docsmith::VersionNotFound` | Requested version number does not exist |
1996
+ | `Docsmith::TagAlreadyExists` | Tag name already used on this document |
1997
+ | `Docsmith::MaxVersionsExceeded` | All versions are tagged and pruning is blocked |
1998
+ ```
1999
+
2000
+ - [ ] **Step 2: Verify the file was created correctly**
2001
+
2002
+ Run: `wc -l USAGE.md`
2003
+
2004
+ Expected: more than 300 lines (it's a substantial file).
2005
+
2006
+ - [ ] **Step 3: Update `README.md`**
2007
+
2008
+ Replace the entire README.md with:
2009
+
2010
+ ```markdown
2011
+ # Docsmith
2012
+
2013
+ Docsmith adds snapshot-based versioning, format-aware diffs, and inline comments to any
2014
+ ActiveRecord model — with zero system dependencies.
2015
+
2016
+ ## Features
2017
+
2018
+ - **Full content snapshots** for HTML, Markdown, and JSON — instant rollback to any version
2019
+ - **Format-aware diffs** — word-level diffs for Markdown; HTML tags treated as atomic tokens
2020
+ - **Inline and document-level comments** with threading, resolution, and version migration
2021
+ - **Debounced auto-save** with per-class and global configuration
2022
+ - **Lifecycle events** — hook into version_created, version_restored, version_tagged
2023
+ - **Clean service API** — works standalone without any model mixin
2024
+
2025
+ ## Quick Start
2026
+
2027
+ ```ruby
2028
+ # Gemfile
2029
+ gem "docsmith"
2030
+ ```
2031
+
2032
+ ```bash
2033
+ rails generate docsmith:install
2034
+ rails db:migrate
2035
+ ```
2036
+
2037
+ ```ruby
2038
+ class Article < ApplicationRecord
2039
+ include Docsmith::Versionable
2040
+ docsmith_config { content_field :body; content_type :markdown }
2041
+ end
2042
+
2043
+ article.body = "New draft"
2044
+ article.save!
2045
+ article.save_version!(author: current_user, summary: "First draft")
2046
+
2047
+ result = article.diff_from(1)
2048
+ result.additions # word-level count for markdown
2049
+ result.to_html # <ins>/<del> markup
2050
+ ```
2051
+
2052
+ ## Documentation
2053
+
2054
+ See **[USAGE.md](USAGE.md)** for full documentation including:
2055
+
2056
+ - Installation and migration
2057
+ - Per-class and global configuration
2058
+ - Saving, querying, and restoring versions
2059
+ - Version tagging
2060
+ - Format-aware diffs (Markdown and HTML parsers)
2061
+ - Inline and document-level comments
2062
+ - Events and hooks
2063
+ - Standalone Document API
2064
+ - Configuration reference
2065
+
2066
+ ## Development
2067
+
2068
+ ```bash
2069
+ bin/setup
2070
+ bundle exec rspec # run tests
2071
+ bin/console # interactive console
2072
+ ```
2073
+
2074
+ ## License
2075
+
2076
+ MIT — see [LICENSE.txt](LICENSE.txt).
2077
+ ```
2078
+
2079
+ - [ ] **Step 4: Final full test run**
2080
+
2081
+ Run: `bundle exec rspec --format documentation`
2082
+
2083
+ Expected: 0 failures. Review the output to confirm parser specs, engine dispatch specs, and all integration specs pass.
2084
+
2085
+ - [ ] **Step 5: Commit documentation**
2086
+
2087
+ ```bash
2088
+ git add USAGE.md README.md
2089
+ git commit -m "docs: write verbose USAGE.md and update README with feature overview"
2090
+ ```
2091
+
2092
+ ---
2093
+
2094
+ ## Self-Review
2095
+
2096
+ **Spec coverage check:**
2097
+ - ✅ Markdown word-level parser → Task 4
2098
+ - ✅ HTML tag-atomic parser → Task 5
2099
+ - ✅ Engine dispatch for markdown/html → Task 6
2100
+ - ✅ Remove all branch lib code → Task 1 + 2
2101
+ - ✅ Remove branch schema → Task 3
2102
+ - ✅ Remove branch specs → Task 1 + 7
2103
+ - ✅ Fix broken count expectations → Task 7
2104
+ - ✅ USAGE.md created → Task 9
2105
+ - ✅ README.md updated → Task 9
2106
+
2107
+ **Placeholder scan:** None — all code blocks show complete, runnable Ruby.
2108
+
2109
+ **Type consistency:**
2110
+ - `Parsers::Markdown#compute` and `Parsers::Html#compute` both return `Array<Hash>` with keys `:type`, `:line`, `:content` / `:old_content` / `:new_content` — matches what `Renderers::Base#compute` returns and what `Result`, `render_html`, and `serialize_change` consume.
2111
+ - `Engine::PARSERS` references `Parsers::Markdown` and `Parsers::Html` — both defined before engine is loaded (require order in `lib/docsmith.rb` established in Task 2, Step 1).
2112
+ - `VersionManager.save!` signature after edit: `(document, author:, summary: nil, config: nil)` — matches all call sites in versionable.rb and specs.