llmsbrieftxt 1.8.2__tar.gz (PyPI)

Files changed (45)

llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/bug_report.yml +104 -0
llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/config.yml +8 -0
llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/feature_request.yml +87 -0
llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/question.yml +63 -0
llmsbrieftxt-1.8.2/.github/copilot-instructions.md +115 -0
llmsbrieftxt-1.8.2/.github/workflows/ci.yml +40 -0
llmsbrieftxt-1.8.2/.github/workflows/claude-cli-qa.yml +360 -0
llmsbrieftxt-1.8.2/.github/workflows/claude-doc-review.yml +292 -0
llmsbrieftxt-1.8.2/.github/workflows/claude.yml +50 -0
llmsbrieftxt-1.8.2/.github/workflows/pr-title-check.yml +48 -0
llmsbrieftxt-1.8.2/.github/workflows/release.yml +111 -0
llmsbrieftxt-1.8.2/.gitignore +14 -0
llmsbrieftxt-1.8.2/CLAUDE.md +388 -0
llmsbrieftxt-1.8.2/CONTRIBUTING.md +264 -0
llmsbrieftxt-1.8.2/LICENSE +21 -0
llmsbrieftxt-1.8.2/PKG-INFO +457 -0
llmsbrieftxt-1.8.2/PRODUCTION_CLEANUP_PLAN.md +339 -0
llmsbrieftxt-1.8.2/README.md +424 -0
llmsbrieftxt-1.8.2/docs/USER_JOURNEYS.md +700 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/__init__.py +1 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/cli.py +282 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/constants.py +62 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/crawler.py +366 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/doc_loader.py +150 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/extractor.py +69 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/main.py +424 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/schema.py +42 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/summarizer.py +300 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/url_filters.py +75 -0
llmsbrieftxt-1.8.2/llmsbrieftxt/url_utils.py +73 -0
llmsbrieftxt-1.8.2/pyproject.toml +103 -0
llmsbrieftxt-1.8.2/pytest.ini +17 -0
llmsbrieftxt-1.8.2/scripts/bump_version.py +187 -0
llmsbrieftxt-1.8.2/tests/__init__.py +1 -0
llmsbrieftxt-1.8.2/tests/conftest.py +46 -0
llmsbrieftxt-1.8.2/tests/fixtures/__init__.py +1 -0
llmsbrieftxt-1.8.2/tests/integration/__init__.py +1 -0
llmsbrieftxt-1.8.2/tests/integration/test_doc_loader_integration.py +181 -0
llmsbrieftxt-1.8.2/tests/unit/__init__.py +1 -0
llmsbrieftxt-1.8.2/tests/unit/test_cli.py +418 -0
llmsbrieftxt-1.8.2/tests/unit/test_doc_loader.py +120 -0
llmsbrieftxt-1.8.2/tests/unit/test_extractor.py +70 -0
llmsbrieftxt-1.8.2/tests/unit/test_robustness.py +83 -0
llmsbrieftxt-1.8.2/tests/unit/test_summarizer.py +197 -0
llmsbrieftxt-1.8.2/uv.lock +1136 -0

llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/bug_report.yml ADDED

@@ -0,0 +1,104 @@
+name: Bug Report
+description: Report a bug or unexpected behavior
+title: "[Bug]: "
+labels: ["bug", "triage"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to report a bug! Please fill out the information below to help us resolve the issue.
+  - type: textarea
+    id: description
+    attributes:
+      label: Bug Description
+      description: A clear and concise description of what the bug is.
+      placeholder: When I run llmtxt with..., I expect..., but instead...
+    validations:
+      required: true
+  - type: textarea
+    id: reproduction
+    attributes:
+      label: Steps to Reproduce
+      description: Steps to reproduce the behavior
+      placeholder: |
+        1. Run command: llmtxt https://example.com
+        2. Observe error...
+        3. ...
+    validations:
+      required: true
+  - type: textarea
+    id: expected
+    attributes:
+      label: Expected Behavior
+      description: What you expected to happen
+      placeholder: I expected the tool to...
+    validations:
+      required: true
+  - type: textarea
+    id: actual
+    attributes:
+      label: Actual Behavior
+      description: What actually happened
+      placeholder: Instead, the tool...
+    validations:
+      required: true
+  - type: textarea
+    id: logs
+    attributes:
+      label: Error Logs / Output
+      description: Please paste any relevant error messages or output
+      render: shell
+      placeholder: |
+        Paste error output here...
+  - type: input
+    id: version
+    attributes:
+      label: llmsbrieftxt Version
+      description: Run `pip show llmsbrieftxt` to get the version
+      placeholder: "1.0.0"
+    validations:
+      required: true
+  - type: input
+    id: python-version
+    attributes:
+      label: Python Version
+      description: Run `python --version` to get your Python version
+      placeholder: "3.11.0"
+    validations:
+      required: true
+  - type: input
+    id: os
+    attributes:
+      label: Operating System
+      description: Which OS are you using?
+      placeholder: "macOS 14.0, Ubuntu 22.04, Windows 11, etc."
+    validations:
+      required: true
+  - type: textarea
+    id: context
+    attributes:
+      label: Additional Context
+      description: Any other context, screenshots, or information that might be helpful
+      placeholder: Add any other context about the problem here
+  - type: checkboxes
+    id: checks
+    attributes:
+      label: Pre-submission Checklist
+      description: Please confirm the following before submitting
+      options:
+        - label: I have searched existing issues to ensure this is not a duplicate
+          required: true
+        - label: I have tested with the latest version of llmsbrieftxt
+          required: true
+        - label: I have included complete error messages and logs
+          required: true

llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/config.yml ADDED

@@ -0,0 +1,8 @@
+blank_issues_enabled: false
+contact_links:
+  - name: 💬 GitHub Discussions
+    url: https://github.com/stevennevins/llmsbrief/discussions
+    about: Ask questions and discuss ideas with the community
+  - name: 📚 Documentation
+    url: https://github.com/stevennevins/llmsbrief#readme
+    about: Read the documentation and guides

llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/feature_request.yml ADDED

@@ -0,0 +1,87 @@
+name: Feature Request
+description: Suggest a new feature or enhancement
+title: "[Feature]: "
+labels: ["enhancement", "triage"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for suggesting a new feature! Please provide as much detail as possible to help us understand your request.
+        **Note:** llmsbrieftxt follows the Unix philosophy of doing one thing well. Features should align with the core mission: generating llms-brief.txt files from documentation websites.
+  - type: textarea
+    id: problem
+    attributes:
+      label: Problem or Use Case
+      description: What problem does this feature solve? What use case does it enable?
+      placeholder: |
+        I want to be able to... because...
+        Currently, I can't... which means...
+    validations:
+      required: true
+  - type: textarea
+    id: solution
+    attributes:
+      label: Proposed Solution
+      description: How would you like this feature to work?
+      placeholder: |
+        Add a new option `--example` that would...
+        The behavior should be...
+    validations:
+      required: true
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Alternatives Considered
+      description: What alternatives have you considered? Are there workarounds?
+      placeholder: |
+        I considered using X, but...
+        As a workaround, I currently...
+  - type: textarea
+    id: examples
+    attributes:
+      label: Usage Examples
+      description: Provide examples of how this feature would be used
+      render: bash
+      placeholder: |
+        # Example usage
+        llmtxt https://example.com --new-option value
+  - type: dropdown
+    id: alignment
+    attributes:
+      label: Alignment with Project Philosophy
+      description: Does this feature align with the Unix philosophy of doing one thing well?
+      options:
+        - "Yes - Directly enhances llms-brief.txt generation"
+        - "Maybe - Could be considered a core feature"
+        - "No - This is a complementary tool/feature"
+    validations:
+      required: true
+  - type: textarea
+    id: impact
+    attributes:
+      label: Impact
+      description: |
+        Who benefits from this feature? How critical is it?
+      placeholder: |
+        This would benefit users who...
+        Impact: High/Medium/Low
+  - type: checkboxes
+    id: checks
+    attributes:
+      label: Pre-submission Checklist
+      description: Please confirm the following before submitting
+      options:
+        - label: I have searched existing issues and discussions to ensure this hasn't been requested
+          required: true
+        - label: I have considered whether this aligns with the project's Unix philosophy
+          required: true
+        - label: I have provided a clear use case and motivation
+          required: true

llmsbrieftxt-1.8.2/.github/ISSUE_TEMPLATE/question.yml ADDED

@@ -0,0 +1,63 @@
+name: Question or Support
+description: Ask a question or get help using llmsbrieftxt
+title: "[Question]: "
+labels: ["question", "support"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Have a question about using llmsbrieftxt? We're here to help!
+        **Tip:** For general discussions and community help, consider using [GitHub Discussions](https://github.com/stevennevins/llmsbrief/discussions) instead.
+  - type: textarea
+    id: question
+    attributes:
+      label: Question
+      description: What would you like to know?
+      placeholder: How do I...? What is the best way to...?
+    validations:
+      required: true
+  - type: textarea
+    id: context
+    attributes:
+      label: Context
+      description: Provide any relevant context about what you're trying to accomplish
+      placeholder: |
+        I'm trying to...
+        My use case is...
+  - type: textarea
+    id: attempted
+    attributes:
+      label: What Have You Tried?
+      description: What have you already attempted? Include commands, configurations, etc.
+      render: bash
+      placeholder: |
+        I tried running:
+        llmtxt https://example.com --option value
+        But I'm not sure if...
+  - type: textarea
+    id: documentation
+    attributes:
+      label: Documentation Consulted
+      description: Which documentation have you already checked?
+      placeholder: |
+        - [ ] README.md
+        - [ ] CLAUDE.md
+        - [ ] CONTRIBUTING.md
+        - [ ] Searched existing issues
+  - type: checkboxes
+    id: checks
+    attributes:
+      label: Pre-submission Checklist
+      description: Please confirm the following
+      options:
+        - label: I have checked the README and documentation
+          required: true
+        - label: I have searched existing issues and discussions
+          required: true

llmsbrieftxt-1.8.2/.github/copilot-instructions.md ADDED

@@ -0,0 +1,115 @@
+# GitHub Copilot Instructions for llmsbrieftxt
+## Project Overview
+This is `llmsbrieftxt`, a Python package that generates llms-brief.txt files by crawling documentation websites and using OpenAI to create structured descriptions. The CLI command is `llmtxt` (not `llmsbrieftxt`).
+## Architecture and Code Patterns
+### Async-First Design
+All main functions use async/await patterns. Use `asyncio.gather()` for concurrent operations and semaphore control for rate limiting. The processing pipeline flows: URL Discovery → Content Extraction → LLM Summarization → File Generation.
+### Module Organization
+- **cli.py**: Simple CLI with positional URL argument (no subcommands)
+- **main.py**: Orchestrates the async generation pipeline
+- **crawler.py**: RobustDocCrawler for breadth-first URL discovery
+- **doc_loader.py**: DocLoader wraps crawler with document loading
+- **extractor.py**: HTML to markdown via trafilatura
+- **summarizer.py**: OpenAI integration with retry logic (tenacity)
+- **url_utils.py**: URLNormalizer for deduplication
+- **url_filters.py**: Filter non-documentation URLs
+- **schema.py**: Pydantic models (PageSummary)
+- **constants.py**: Configuration constants
+### Type Safety
+Use Pydantic models for all structured data. The OpenAI integration uses structured output with the PageSummary model.
+### Error Handling
+Failed URL loads should be logged but not stop processing. LLM failures use exponential backoff retries via tenacity. Never let one failure break the entire pipeline.
+## Development Practices
+### Testing Requirements
+Write tests before implementing features. Use pytest with these markers:
+- `@pytest.mark.unit` for fast, isolated tests
+- `@pytest.mark.requires_openai` for tests needing OPENAI_API_KEY
+- `@pytest.mark.slow` for tests making external API calls
+Tests go in:
+- `tests/unit/` for fast tests with no external dependencies
+- `tests/integration/` for tests requiring OPENAI_API_KEY
+### Code Quality Tools
+Before committing, always run:
+1. Format: `uv run ruff format llmsbrieftxt/ tests/`
+2. Lint: `uv run ruff check llmsbrieftxt/ tests/`
+3. Type check: `uv run pyright llmsbrieftxt/`
+4. Tests: `uv run pytest tests/unit/`
+### Package Management
+Use `uv` for all package operations:
+- Install: `uv sync --group dev`
+- Add dependency: `uv add package-name`
+- Build: `uv build`
+## Design Philosophy
+### Unix Philosophy
+This project follows "do one thing and do it well":
+- Generate llms-brief.txt files only (no built-in search/list features)
+- Compose with standard Unix tools (rg, grep, ls)
+- Simple CLI: URL is a positional argument, no subcommands
+- Plain text output for scriptability
+### Simplicity Over Features
+Avoid adding functionality that duplicates mature Unix tools. Every line of code must serve the core mission of generating llms-brief.txt files.
+## Configuration Defaults
+- **Crawl Depth**: 3 levels (hardcoded in crawler.py)
+- **Output**: `~/.claude/docs/<domain>.txt` (override with `--output`)
+- **Cache**: `.llmsbrieftxt_cache/` for intermediate results
+- **OpenAI Model**: `gpt-5-mini` (override with `--model`)
+- **Concurrency**: 10 concurrent LLM requests (prevents rate limiting)
+## Commit Convention
+Use conventional commits for automated versioning:
+- `fix:` → patch bump (1.0.0 → 1.0.1)
+- `feat:` → minor bump (1.0.0 → 1.1.0)
+- `BREAKING CHANGE` or `feat!:`/`fix!:` → major bump (1.0.0 → 2.0.0)
+Examples:
+```bash
+git commit -m "fix: handle empty sitemap gracefully"
+git commit -m "feat: add --depth option for custom crawl depth"
+git commit -m "feat!: change default output location"
+```
+## Non-Obvious Behaviors
+1. URL Discovery discovers ALL pages up to depth 3, not just direct links
+2. URLs like `/page`, `/page/`, and `/page#section` are deduplicated as the same URL
+3. Summaries are automatically cached in `.llmsbrieftxt_cache/summaries.json`
+4. Content extraction uses trafilatura to preserve HTML structure in markdown
+5. File I/O is synchronous (uses standard `Path.write_text()` for simplicity)
+## Known Limitations
+1. Only supports OpenAI API (no other LLM providers)
+2. Crawl depth is hardcoded to 3 in crawler.py
+3. No CLI flag to force resume from cache (though cache exists)
+4. No progress persistence if interrupted
+5. Prompts and parsing assume English documentation
+## Code Review Checklist
+When reviewing code changes:
+- Ensure async patterns are used correctly (no blocking I/O in async functions)
+- Verify all functions have type hints
+- Check that tests are included for new functionality
+- Confirm error handling doesn't break the pipeline
+- Validate that conventional commit format is used
+- Ensure code follows Unix philosophy (simplicity, composability)
+- Check that ruff and pyright pass without errors
+- **IMPORTANT**: Always include specific file names and line numbers when providing review feedback (e.g., "main.py:165" or "line 182 in cli.py")
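
Reviewer note on the file above (not part of the diff): copilot-instructions.md asks for `asyncio.gather()` with semaphore control, a limit of 10 concurrent LLM requests, and failures that are logged without stopping the pipeline. A minimal sketch of that pattern might look like the following. The names `summarize_all`, `fake_summarize`, and `demo` are hypothetical and are not taken from the package, and the tenacity retry layer the file mentions is left out to keep the example standard-library only.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

MAX_CONCURRENT = 10  # mirrors the documented default of 10 concurrent LLM requests


async def summarize_all(urls, summarize_one, max_concurrent=MAX_CONCURRENT):
    """Run summarize_one(url) for every URL, at most max_concurrent at a time.

    Returns a dict of url -> summary for the URLs that succeeded.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        # The semaphore caps how many summarize_one calls are in flight.
        async with semaphore:
            return await summarize_one(url)

    # return_exceptions=True keeps one failure from cancelling the others.
    results = await asyncio.gather(
        *(guarded(u) for u in urls), return_exceptions=True
    )

    summaries = {}
    for url, result in zip(urls, results):
        if isinstance(result, Exception):
            # Log and skip: a failed page must not break the whole run.
            logger.warning("Skipping %s: %s", url, result)
        else:
            summaries[url] = result
    return summaries


async def fake_summarize(url):
    # Hypothetical stand-in for the real LLM call; fails on one URL on purpose.
    await asyncio.sleep(0)
    if url.endswith("/broken"):
        raise RuntimeError("simulated LLM failure")
    return f"summary of {url}"


def demo():
    urls = [
        "https://example.com/a",
        "https://example.com/broken",
        "https://example.com/b",
    ]
    return asyncio.run(summarize_all(urls, fake_summarize))
```

Calling `demo()` returns summaries for the two healthy URLs and logs a warning for the failing one, which is the "never let one failure break the entire pipeline" behaviour the file describes.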

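Reviewer note (not part of the diff): the "Non-Obvious Behaviors" list in copilot-instructions.md says that `/page`, `/page/`, and `/page#section` are deduplicated as the same URL. The diff excerpt here does not show how `URLNormalizer` in url_utils.py does this, so the sketch below is only one plausible way to get that behaviour with `urllib.parse`. The function names `normalize_url` and `dedupe` are assumptions, and lowercasing the whole network location is a simplification.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Hypothetical normalizer: drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    # An empty path and "/" both normalize to "/".
    path = parts.path.rstrip("/") or "/"
    # Scheme and host are case-insensitive; the query string is kept as-is.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe(urls):
    """Keep the first URL seen for each normalized form, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

With this sketch, `https://example.com/page`, `https://example.com/page/`, and `https://example.com/page#section` all map to the same key, so a crawler that checks the key before queueing a page would visit it once.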
llmsbrieftxt-1.8.2/.github/workflows/ci.yml ADDED

@@ -0,0 +1,40 @@
+name: CI
+on:
+  push:
+    branches: [ main, master ]
+  pull_request:
+    branches: [ main, master ]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.12"]
+    steps:
+    - uses: actions/checkout@v4
+    - name: Install uv
+      uses: astral-sh/setup-uv@v5
+      with:
+        version: "latest"
+    - name: Install dependencies
+      run: |
+        uv python install ${{ matrix.python-version }}
+        uv sync --group dev
+    - name: Run unit tests
+      run: |
+        uv run pytest tests/unit -v --tb=short
+    - name: Run linting and formatting checks
+      run: |
+        uv run ruff check llmsbrieftxt/ tests/
+        uv run ruff format --check llmsbrieftxt/ tests/
+    - name: Type checking
+      run: |
+        uv run pyright llmsbrieftxt/