llm-docs-builder 0.6.0 → 0.7.0

data/CLAUDE.md DELETED
@@ -1,178 +0,0 @@
- # CLAUDE.md
-
- llm-docs-builder is a Ruby gem that generates [llms.txt](https://llmstxt.org/) files from existing markdown documentation and transforms markdown files to be AI-friendly. It provides both a CLI tool and Ruby API.
-
- ## Project Overview
-
- llm-docs-builder is a Ruby gem that generates [llms.txt](https://llmstxt.org/) files from existing markdown documentation and transforms markdown files to be AI-friendly. It provides both a CLI tool and Ruby API.
-
- **Key functionality:**
- - Generates llms.txt files from documentation directories by scanning markdown files, extracting metadata, and organizing by priority
- - Transforms individual markdown files by expanding relative links to absolute URLs
- - Bulk transforms entire documentation trees with customizable suffixes and exclusion patterns
- - Supports both config file and direct options for all operations
-
- ## Development Commands
-
- ### Testing
- ```bash
- # Run all tests
- ./bin/rspecs
-
- # Run specific test file
- bundle exec rspec spec/llm_docs_builder_spec.rb
-
- # Run specific test line
- bundle exec rspec spec/llm_docs_builder_spec.rb:42
- ```
-
- ### Code Quality
- ```bash
- # Run RuboCop linter
- bundle exec rubocop
-
- # Auto-fix RuboCop violations
- bundle exec rubocop -a
-
- # Run all checks (tests + linting)
- bundle exec rake
- ```
-
- ### CLI Testing
- ```bash
- # Test CLI locally
- bundle exec bin/llm-docs-builder generate --docs ./docs
- bundle exec bin/llm-docs-builder transform --docs README.md
- bundle exec bin/llm-docs-builder bulk-transform --docs ./docs
-
- # Test compare command (requires network)
- bundle exec bin/llm-docs-builder compare --url https://karafka.io/docs/Getting-Started.html
- bundle exec bin/llm-docs-builder compare --url https://example.com/page.html --file docs/local.md
- ```
-
- ### Building and Installing
- ```bash
- # Build gem locally
- bundle exec rake build
-
- # Install locally built gem
- gem install pkg/llm-docs-builder-*.gem
-
- # Release (maintainers only)
- bundle exec rake release
- ```
-
- ## Architecture
-
- ### Core Components
-
- **LlmDocsBuilder Module** (`lib/llm_docs_builder.rb`)
- - Main API entry point with class methods for all operations
- - Uses Zeitwerk for autoloading
- - Delegates to specialized classes for generation, transformation, and validation
- - All methods support both config file and direct options via `Config#merge_with_options`
-
- **Generator** (`lib/llm_docs_builder/generator.rb`)
- - Scans documentation directories recursively using `Find.find`
- - Extracts title from first H1 header, description from first paragraph
- - Prioritizes files: README (1), getting started (2), guides (3), tutorials (4), API (5), reference (6), others (7)
- - Builds formatted llms.txt with links and descriptions
-
- **MarkdownTransformer** (`lib/llm_docs_builder/markdown_transformer.rb`)
- - Transforms individual markdown files using regex patterns
- - `expand_relative_links`: Converts relative links to absolute URLs using base_url
- - `convert_html_urls`: Changes .html/.htm URLs to .md format
- - Leaves absolute URLs and anchor links unchanged
-
- **BulkTransformer** (`lib/llm_docs_builder/bulk_transformer.rb`)
- - Recursively processes all markdown files in a directory
- - Uses `MarkdownTransformer` for each file
- - Generates output paths with configurable suffix (default: `.llm`)
- - Empty suffix (`""`) enables in-place transformation
- - Supports glob-based exclusion patterns via `File.fnmatch`
-
- **Comparator** (`lib/llm_docs_builder/comparator.rb`)
- - Measures context window savings by comparing content sizes
- - Fetches URLs with different User-Agents (human browser vs AI bot)
- - Can compare remote URL with local markdown file
- - Uses Net::HTTP for fetching with redirect support
- - Calculates reduction percentage, bytes saved, and compression factor
-
- **Config** (`lib/llm_docs_builder/config.rb`)
- - Loads YAML config from file or auto-finds `llms-txt.yml`
- - Merges config file options with programmatic options (programmatic takes precedence)
- - Handles defaults: `suffix: '.llm'`, `output: 'llms.txt'`, `excludes: []`
-
- **CLI** (`lib/llm_docs_builder/cli.rb`)
- - Parses commands: generate, transform, bulk-transform, compare, parse, validate, version
- - Uses OptionParser for flag parsing
- - Loads config and merges with CLI options before delegating to main module
- - Handles errors gracefully with user-friendly messages
- - Compare command displays formatted output with human-readable byte sizes (bytes/KB/MB)
-
- ### Configuration Precedence
-
- Options are resolved in this order (highest to lowest priority):
- 1. Direct method arguments (e.g., `LlmDocsBuilder.generate_from_docs('./docs', title: 'Override')`)
- 2. CLI flags (e.g., `--docs ./docs`)
- 3. Config file values (e.g., `llms-txt.yml`)
- 4. Defaults (e.g., `suffix: '.llm'`, `output: 'llms.txt'`)
-
- ### File Priority System
-
- When generating llms.txt, files are automatically ordered by importance:
- - Priority 1: README files (always listed first)
- - Priority 2: Getting started guides
- - Priority 3: General guides
- - Priority 4: Tutorials
- - Priority 5: API documentation
- - Priority 6: Reference documentation
- - Priority 7: All other files
-
- ### Link Transformation Logic
-
- **Relative Link Expansion** (when `base_url` provided):
- - Converts `[text](./path.md)` → `[text](https://base.url/path.md)`
- - Converts `[text](../other.md)` → `[text](https://base.url/other.md)`
- - Skips URLs starting with `http://`, `https://`, `//`, or `#`
-
- **URL Conversion** (when `convert_urls: true`):
- - Changes `https://example.com/page.html` → `https://example.com/page.md`
- - Changes `https://example.com/doc.htm` → `https://example.com/doc.md`
-
- ### In-Place vs Separate Files
-
- **Separate Files** (`suffix: '.llm'` - default):
- - Creates new files: `README.md` → `README.llm.md`
- - Preserves originals for human-readable documentation
- - Useful for dual-serving human and AI versions
-
- **In-Place** (`suffix: ""`):
- - Overwrites originals: `README.md` → `README.md` (transformed)
- - Used in build pipelines (e.g., Karafka framework)
- - Transforms documentation before deployment
-
- ## Testing Strategy
-
- - RSpec for all tests with SimpleCov coverage tracking
- - Unit tests for each component in isolation
- - Integration tests in `spec/integrations/` for end-to-end workflows
- - Example outputs saved in `spec/examples.txt` for persistence
- - CI tests against Ruby 3.2, 3.3, 3.4 via GitHub Actions
-
- ## Dependencies
-
- - **zeitwerk**: Autoloading and code organization
- - **optparse**: Built-in Ruby CLI parsing (no external CLI framework)
- - **rspec**: Testing framework
- - **rubocop**: Code linting and style enforcement
- - **simplecov**: Test coverage reporting
-
- ## Code Style
-
- - Ruby 3.2+ syntax and features required
- - Frozen string literals in all files
- - Explicit module nesting (no `class Foo::Bar`)
- - Comprehensive YARD documentation for public APIs
- - Private methods clearly marked and documented
- - RuboCop enforces consistent style
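Note on the removed CLAUDE.md: its "Link Transformation Logic" section describes relative link expansion and `.html`/`.htm` to `.md` conversion. A minimal Ruby sketch of those rules follows. The method names mirror the ones mentioned in the file, but the bodies, the `LINK_PATTERN` and `SKIP_PREFIXES` constants, and the regexes are assumptions made for illustration, not the gem's actual implementation. In particular, leading `./` and `../` segments are simply dropped here rather than resolved against a directory.

```ruby
# Illustrative sketch only; not taken from the gem's source.

# Targets with these prefixes are left untouched.
SKIP_PREFIXES = ["http://", "https://", "//", "#"].freeze

# Matches inline markdown links of the form [text](target).
LINK_PATTERN = /\[([^\]]*)\]\(([^)\s]+)\)/

# Rewrites relative link targets so they point at base_url.
def expand_relative_links(markdown, base_url)
  root = base_url.chomp("/")
  markdown.gsub(LINK_PATTERN) do
    text, target = Regexp.last_match.captures
    next "[#{text}](#{target})" if target.start_with?(*SKIP_PREFIXES)

    # Simplification: strip leading "./" and "../" segments and any leading "/".
    path = target.sub(%r{\A(?:\.{1,2}/)+}, "").sub(%r{\A/}, "")
    "[#{text}](#{root}/#{path})"
  end
end

# Rewrites absolute URLs ending in .html or .htm to end in .md.
def convert_html_urls(markdown)
  markdown.gsub(%r{(https?://[^\s)<>"]+?)\.html?(?=[\s)#?]|\z)}, '\1.md')
end
```

Applied in sequence, `expand_relative_links` followed by `convert_html_urls` would cover both rules for a single markdown string.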
data/llm-docs-builder.yml DELETED
@@ -1,7 +0,0 @@
- docs: ./docs
- base_url: https://myproject.io
- title: My Awesome Project
- description: A Ruby library that helps developers build amazing applications
- output: llms.txt
- convert_urls: true
- verbose: false
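Note on the removed config file: the removed CLAUDE.md states that defaults are overridden by config file values, which are in turn overridden by directly passed options. A rough Ruby sketch of that precedence, applied to a YAML file shaped like the one above, could look as follows. The `resolve_options` helper and the `DEFAULTS` constant are hypothetical names introduced here; the default values are the ones listed in CLAUDE.md.

```ruby
require "yaml"

# Defaults as listed in the removed CLAUDE.md.
DEFAULTS = { suffix: ".llm", output: "llms.txt", excludes: [] }.freeze

# Hypothetical helper: merges defaults, YAML config text, and direct overrides.
# Later sources win; nil overrides are ignored so unset flags do not clobber
# values from the config file.
def resolve_options(config_yaml, **overrides)
  file_options = (YAML.safe_load(config_yaml) || {}).transform_keys(&:to_sym)
  DEFAULTS.merge(file_options).merge(overrides.compact)
end

config = <<~YAML
  docs: ./docs
  base_url: https://myproject.io
  title: My Awesome Project
  output: llms.txt
  convert_urls: true
  verbose: false
YAML

options = resolve_options(config, title: "Override", suffix: nil)
puts options[:title]  # direct argument wins over the config file
puts options[:suffix] # falls back to the default because the override is nil
```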