llm-docs-builder 0.6.0 → 0.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.rspec +3 -0
- data/CHANGELOG.md +37 -0
- data/Gemfile.lock +1 -1
- data/README.md +182 -555
- data/bin/rspecs +2 -1
- data/lib/llm_docs_builder/cli.rb +1 -62
- data/lib/llm_docs_builder/comparator.rb +4 -16
- data/lib/llm_docs_builder/config.rb +42 -5
- data/lib/llm_docs_builder/markdown_transformer.rb +54 -128
- data/lib/llm_docs_builder/output_formatter.rb +93 -0
- data/lib/llm_docs_builder/parser.rb +1 -59
- data/lib/llm_docs_builder/text_compressor.rb +164 -0
- data/lib/llm_docs_builder/token_estimator.rb +52 -0
- data/lib/llm_docs_builder/transformers/base_transformer.rb +30 -0
- data/lib/llm_docs_builder/transformers/content_cleanup_transformer.rb +106 -0
- data/lib/llm_docs_builder/transformers/enhancement_transformer.rb +95 -0
- data/lib/llm_docs_builder/transformers/link_transformer.rb +84 -0
- data/lib/llm_docs_builder/transformers/whitespace_transformer.rb +44 -0
- data/lib/llm_docs_builder/version.rb +1 -1
- metadata +10 -3
- data/CLAUDE.md +0 -178
- data/llm-docs-builder.yml +0 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b8fed4a1c362dec44db4b8091bc81a7d1fc2c01e3c104993f74a810c048d5d02
+  data.tar.gz: 575cf59762de9438336d3c4127277f0ba89e6e540772427cdee9af0407b90983
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f8474aa5ed99b00c9b7d143a48dae066e7eb3d726fc9570d16b4bedae57e78e09bdf028bb679bf1d4f6c87ee94e690f48420ec8d67c5a22f9c97c5c4d8a064cd
+  data.tar.gz: 4ea7338aed98ff58d1173ceaf426bcd6d79949508e1ec0441652ef0202d58bf3f88d600f3284648d98210de33ead0201aac0de3fde3cfb9bc92ce4e5e3b05132
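The checksums.yaml entries above are SHA256 and SHA512 hex digests of the two archives inside the gem package. As a rough sketch of how such digests could be checked locally, assuming the archives have already been unpacked into a directory: the helper name `checksum_mismatches` and the directory layout are hypothetical and not part of the gem or of RubyGems.

```ruby
require "digest/sha2"

# Hypothetical helper (not part of llm-docs-builder or RubyGems):
# compares files in `dir` against a checksums.yaml-style hash such as
#   { "SHA256" => { "data.tar.gz" => "<hex digest>" }, "SHA512" => { ... } }
# and returns a list of "ALGORITHM filename" strings for every mismatch.
def checksum_mismatches(checksums, dir)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }
  mismatches = []

  algorithms.each do |name, digest_class|
    (checksums[name] || {}).each do |file, expected|
      # Digest::Class.file streams the file from disk and returns a digest object.
      actual = digest_class.file(File.join(dir, file)).hexdigest
      mismatches << "#{name} #{file}" unless actual == expected
    end
  end

  mismatches
end

# Possible usage, assuming checksums.yaml, metadata.gz and data.tar.gz
# sit in the current directory:
#   require "yaml"
#   checksums = YAML.safe_load(File.read("checksums.yaml"))
#   puts checksum_mismatches(checksums, ".")
```

An empty result means every listed digest matched; any returned entry names the algorithm and file that differ.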
data/.rspec
ADDED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,42 @@
 # Changelog
 
+## 0.7.0 (2025-10-09)
+- [Feature] **Advanced Token Optimization** - Added 8 new compression options to reduce token consumption:
+  - `remove_code_examples`: Remove code blocks and inline code
+  - `remove_images`: Remove all image syntax
+  - `simplify_links`: Simplify verbose link text (e.g., "Click here to see the docs" → "docs")
+  - `remove_blockquotes`: Remove blockquote formatting while preserving content
+  - `generate_toc`: Generate table of contents from headings with anchor links
+  - `custom_instruction`: Inject AI context messages at document top
+  - `remove_stopwords`: Remove common stopwords from prose (preserves code blocks)
+  - `remove_duplicates`: Remove duplicate paragraphs using fuzzy matching
+- [Feature] **Compression Presets** - 6 built-in presets for easy usage:
+  - `conservative`: 15-25% reduction (safest transformations)
+  - `moderate`: 30-45% reduction (balanced approach)
+  - `aggressive`: 50-70% reduction (maximum compression)
+  - `documentation`: 35-50% reduction (preserves code examples)
+  - `tutorial`: 20% reduction (minimal compression for learning materials)
+  - `api_reference`: 40% reduction (optimized for API documentation)
+- [Enhancement] **Refactored Architecture** - Split monolithic `MarkdownTransformer` into focused transformer classes following SRP:
+  - `BaseTransformer`: Common interface for all transformers
+  - `LinkTransformer`: Link expansion, URL conversion, link simplification
+  - `ContentCleanupTransformer`: All removal operations
+  - `EnhancementTransformer`: TOC generation and custom instructions
+  - `WhitespaceTransformer`: Whitespace normalization
+  - `MarkdownTransformer`: Pipeline orchestrator
+- [Enhancement] Added `TextCompressor` class for advanced text compression (stopwords, duplicates).
+- [Enhancement] Added `TokenEstimator` class for token count estimation.
+- [Enhancement] Added `OutputFormatter` class for formatted output (extracted from CLI).
+- [Enhancement] Added `CompressionPresets` class with preset configurations.
+- [Enhancement] Custom instructions now adapt to blockquote removal setting (no blockquote format when `remove_blockquotes: true`).
+- [Enhancement] Updated `Config#merge_with_options` to support all new compression options.
+- [Testing] Added 20 new integration tests for compression features and presets.
+- [Testing] Added automatic config file backup/restore in test suite to prevent interference.
+- [Testing] All 110 tests passing with 79.44% code coverage.
+- [Documentation] **Shortened README.md by 47%** (729 → 381 lines) while adding all new features.
+- [Documentation] Added comprehensive compression examples and use cases.
+- [Documentation] Added preset comparison table showing what each preset does.
+
 ## 0.6.0 (2025-10-09)
 - [Breaking] **Project renamed from `llms-txt-ruby` to `llm-docs-builder`** to better reflect expanded functionality beyond just llms.txt generation.
   - Gem name: `llms-txt-ruby` → `llm-docs-builder`