RubyGems - woods - Versions diffs - 1.0.0 - Mend

woods 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (185)

checksums.yaml +7 -0
data/CHANGELOG.md +89 -0
data/CODE_OF_CONDUCT.md +83 -0
data/CONTRIBUTING.md +65 -0
data/LICENSE.txt +21 -0
data/README.md +406 -0
data/exe/woods-console +59 -0
data/exe/woods-console-mcp +22 -0
data/exe/woods-mcp +34 -0
data/exe/woods-mcp-http +37 -0
data/exe/woods-mcp-start +58 -0
data/lib/generators/woods/install_generator.rb +32 -0
data/lib/generators/woods/pgvector_generator.rb +37 -0
data/lib/generators/woods/templates/add_pgvector_to_woods.rb.erb +15 -0
data/lib/generators/woods/templates/create_woods_tables.rb.erb +43 -0
data/lib/tasks/woods.rake +621 -0
data/lib/tasks/woods_evaluation.rake +115 -0
data/lib/woods/ast/call_site_extractor.rb +106 -0
data/lib/woods/ast/method_extractor.rb +71 -0
data/lib/woods/ast/node.rb +116 -0
data/lib/woods/ast/parser.rb +614 -0
data/lib/woods/ast.rb +6 -0
data/lib/woods/builder.rb +200 -0
data/lib/woods/cache/cache_middleware.rb +199 -0
data/lib/woods/cache/cache_store.rb +264 -0
data/lib/woods/cache/redis_cache_store.rb +116 -0
data/lib/woods/cache/solid_cache_store.rb +111 -0
data/lib/woods/chunking/chunk.rb +84 -0
data/lib/woods/chunking/semantic_chunker.rb +295 -0
data/lib/woods/console/adapters/cache_adapter.rb +58 -0
data/lib/woods/console/adapters/good_job_adapter.rb +33 -0
data/lib/woods/console/adapters/job_adapter.rb +68 -0
data/lib/woods/console/adapters/sidekiq_adapter.rb +33 -0
data/lib/woods/console/adapters/solid_queue_adapter.rb +33 -0
data/lib/woods/console/audit_logger.rb +75 -0
data/lib/woods/console/bridge.rb +177 -0
data/lib/woods/console/confirmation.rb +90 -0
data/lib/woods/console/connection_manager.rb +173 -0
data/lib/woods/console/console_response_renderer.rb +74 -0
data/lib/woods/console/embedded_executor.rb +373 -0
data/lib/woods/console/model_validator.rb +81 -0
data/lib/woods/console/rack_middleware.rb +87 -0
data/lib/woods/console/safe_context.rb +82 -0
data/lib/woods/console/server.rb +612 -0
data/lib/woods/console/sql_validator.rb +172 -0
data/lib/woods/console/tools/tier1.rb +118 -0
data/lib/woods/console/tools/tier2.rb +117 -0
data/lib/woods/console/tools/tier3.rb +110 -0
data/lib/woods/console/tools/tier4.rb +79 -0
data/lib/woods/coordination/pipeline_lock.rb +109 -0
data/lib/woods/cost_model/embedding_cost.rb +88 -0
data/lib/woods/cost_model/estimator.rb +128 -0
data/lib/woods/cost_model/provider_pricing.rb +67 -0
data/lib/woods/cost_model/storage_cost.rb +52 -0
data/lib/woods/cost_model.rb +22 -0
data/lib/woods/db/migrations/001_create_units.rb +38 -0
data/lib/woods/db/migrations/002_create_edges.rb +35 -0
data/lib/woods/db/migrations/003_create_embeddings.rb +37 -0
data/lib/woods/db/migrations/004_create_snapshots.rb +45 -0
data/lib/woods/db/migrations/005_create_snapshot_units.rb +40 -0
data/lib/woods/db/migrations/006_rename_tables.rb +34 -0
data/lib/woods/db/migrator.rb +73 -0
data/lib/woods/db/schema_version.rb +73 -0
data/lib/woods/dependency_graph.rb +236 -0
data/lib/woods/embedding/indexer.rb +140 -0
data/lib/woods/embedding/openai.rb +126 -0
data/lib/woods/embedding/provider.rb +162 -0
data/lib/woods/embedding/text_preparer.rb +112 -0
data/lib/woods/evaluation/baseline_runner.rb +115 -0
data/lib/woods/evaluation/evaluator.rb +139 -0
data/lib/woods/evaluation/metrics.rb +79 -0
data/lib/woods/evaluation/query_set.rb +148 -0
data/lib/woods/evaluation/report_generator.rb +90 -0
data/lib/woods/extracted_unit.rb +145 -0
data/lib/woods/extractor.rb +1028 -0
data/lib/woods/extractors/action_cable_extractor.rb +201 -0
data/lib/woods/extractors/ast_source_extraction.rb +46 -0
data/lib/woods/extractors/behavioral_profile.rb +309 -0
data/lib/woods/extractors/caching_extractor.rb +261 -0
data/lib/woods/extractors/callback_analyzer.rb +246 -0
data/lib/woods/extractors/concern_extractor.rb +292 -0
data/lib/woods/extractors/configuration_extractor.rb +219 -0
data/lib/woods/extractors/controller_extractor.rb +404 -0
data/lib/woods/extractors/database_view_extractor.rb +278 -0
data/lib/woods/extractors/decorator_extractor.rb +253 -0
data/lib/woods/extractors/engine_extractor.rb +223 -0
data/lib/woods/extractors/event_extractor.rb +211 -0
data/lib/woods/extractors/factory_extractor.rb +289 -0
data/lib/woods/extractors/graphql_extractor.rb +892 -0
data/lib/woods/extractors/i18n_extractor.rb +117 -0
data/lib/woods/extractors/job_extractor.rb +374 -0
data/lib/woods/extractors/lib_extractor.rb +218 -0
data/lib/woods/extractors/mailer_extractor.rb +269 -0
data/lib/woods/extractors/manager_extractor.rb +188 -0
data/lib/woods/extractors/middleware_extractor.rb +133 -0
data/lib/woods/extractors/migration_extractor.rb +469 -0
data/lib/woods/extractors/model_extractor.rb +988 -0
data/lib/woods/extractors/phlex_extractor.rb +252 -0
data/lib/woods/extractors/policy_extractor.rb +191 -0
data/lib/woods/extractors/poro_extractor.rb +229 -0
data/lib/woods/extractors/pundit_extractor.rb +223 -0
data/lib/woods/extractors/rails_source_extractor.rb +473 -0
data/lib/woods/extractors/rake_task_extractor.rb +343 -0
data/lib/woods/extractors/route_extractor.rb +181 -0
data/lib/woods/extractors/scheduled_job_extractor.rb +331 -0
data/lib/woods/extractors/serializer_extractor.rb +339 -0
data/lib/woods/extractors/service_extractor.rb +217 -0
data/lib/woods/extractors/shared_dependency_scanner.rb +91 -0
data/lib/woods/extractors/shared_utility_methods.rb +281 -0
data/lib/woods/extractors/state_machine_extractor.rb +398 -0
data/lib/woods/extractors/test_mapping_extractor.rb +225 -0
data/lib/woods/extractors/validator_extractor.rb +211 -0
data/lib/woods/extractors/view_component_extractor.rb +311 -0
data/lib/woods/extractors/view_template_extractor.rb +261 -0
data/lib/woods/feedback/gap_detector.rb +89 -0
data/lib/woods/feedback/store.rb +119 -0
data/lib/woods/filename_utils.rb +32 -0
data/lib/woods/flow_analysis/operation_extractor.rb +206 -0
data/lib/woods/flow_analysis/response_code_mapper.rb +154 -0
data/lib/woods/flow_assembler.rb +290 -0
data/lib/woods/flow_document.rb +191 -0
data/lib/woods/flow_precomputer.rb +102 -0
data/lib/woods/formatting/base.rb +30 -0
data/lib/woods/formatting/claude_adapter.rb +98 -0
data/lib/woods/formatting/generic_adapter.rb +56 -0
data/lib/woods/formatting/gpt_adapter.rb +64 -0
data/lib/woods/formatting/human_adapter.rb +78 -0
data/lib/woods/graph_analyzer.rb +374 -0
data/lib/woods/mcp/bootstrapper.rb +96 -0
data/lib/woods/mcp/index_reader.rb +394 -0
data/lib/woods/mcp/renderers/claude_renderer.rb +81 -0
data/lib/woods/mcp/renderers/json_renderer.rb +17 -0
data/lib/woods/mcp/renderers/markdown_renderer.rb +353 -0
data/lib/woods/mcp/renderers/plain_renderer.rb +240 -0
data/lib/woods/mcp/server.rb +962 -0
data/lib/woods/mcp/tool_response_renderer.rb +85 -0
data/lib/woods/model_name_cache.rb +51 -0
data/lib/woods/notion/client.rb +217 -0
data/lib/woods/notion/exporter.rb +219 -0
data/lib/woods/notion/mapper.rb +40 -0
data/lib/woods/notion/mappers/column_mapper.rb +57 -0
data/lib/woods/notion/mappers/migration_mapper.rb +39 -0
data/lib/woods/notion/mappers/model_mapper.rb +161 -0
data/lib/woods/notion/mappers/shared.rb +22 -0
data/lib/woods/notion/rate_limiter.rb +68 -0
data/lib/woods/observability/health_check.rb +79 -0
data/lib/woods/observability/instrumentation.rb +34 -0
data/lib/woods/observability/structured_logger.rb +57 -0
data/lib/woods/operator/error_escalator.rb +81 -0
data/lib/woods/operator/pipeline_guard.rb +92 -0
data/lib/woods/operator/status_reporter.rb +80 -0
data/lib/woods/railtie.rb +38 -0
data/lib/woods/resilience/circuit_breaker.rb +99 -0
data/lib/woods/resilience/index_validator.rb +167 -0
data/lib/woods/resilience/retryable_provider.rb +108 -0
data/lib/woods/retrieval/context_assembler.rb +261 -0
data/lib/woods/retrieval/query_classifier.rb +133 -0
data/lib/woods/retrieval/ranker.rb +277 -0
data/lib/woods/retrieval/search_executor.rb +316 -0
data/lib/woods/retriever.rb +152 -0
data/lib/woods/ruby_analyzer/class_analyzer.rb +170 -0
data/lib/woods/ruby_analyzer/dataflow_analyzer.rb +77 -0
data/lib/woods/ruby_analyzer/fqn_builder.rb +18 -0
data/lib/woods/ruby_analyzer/mermaid_renderer.rb +280 -0
data/lib/woods/ruby_analyzer/method_analyzer.rb +143 -0
data/lib/woods/ruby_analyzer/trace_enricher.rb +143 -0
data/lib/woods/ruby_analyzer.rb +87 -0
data/lib/woods/session_tracer/file_store.rb +104 -0
data/lib/woods/session_tracer/middleware.rb +143 -0
data/lib/woods/session_tracer/redis_store.rb +106 -0
data/lib/woods/session_tracer/session_flow_assembler.rb +254 -0
data/lib/woods/session_tracer/session_flow_document.rb +223 -0
data/lib/woods/session_tracer/solid_cache_store.rb +139 -0
data/lib/woods/session_tracer/store.rb +81 -0
data/lib/woods/storage/graph_store.rb +120 -0
data/lib/woods/storage/metadata_store.rb +196 -0
data/lib/woods/storage/pgvector.rb +195 -0
data/lib/woods/storage/qdrant.rb +205 -0
data/lib/woods/storage/vector_store.rb +167 -0
data/lib/woods/temporal/json_snapshot_store.rb +245 -0
data/lib/woods/temporal/snapshot_store.rb +345 -0
data/lib/woods/token_utils.rb +19 -0
data/lib/woods/version.rb +5 -0
data/lib/woods.rb +246 -0
metadata +270 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: ab164a85b76d9c97fc6142836da5349a444e9c62f507622fb327f5cc8f434ed4
+  data.tar.gz: 66752a95ddb4183a6f78d47417690242cfc3ad2bdfc622b8740fe2fbc388658e
+SHA512:
+  metadata.gz: 2d53024eefb62544ba536f23b1c9f36bebab988fc75223ef72e1d2ffd1d2ed0b46b2507781b040726b8059d14c9f6eefa3faa1c4d6b0a4b6c5019905ef41675d
+  data.tar.gz: 8d5c7a1e7ab4c7b401e61140a9ec5bea06848244d08192f05b0cc088a93980b3208cf3f22a0319545857051dc0b2a234f4d4c2ef8a5789ef108080f179aa6f99
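
The digests above can be checked locally. The following is a minimal sketch, assuming the `.gem` archive has already been unpacked (for example with `tar`) so that `metadata.gz` and `data.tar.gz` sit in one directory, and that the decompressed `checksums.yaml` text is available as a string. The method name `verify_gem_checksums` is hypothetical and not part of RubyGems or Woods.

```ruby
require 'digest'
require 'yaml'

# Compare the digests listed in checksums.yaml with the files found in `dir`.
# Returns a nested hash such as { "SHA256" => { "data.tar.gz" => true } }.
# A missing file is reported as false rather than raising.
def verify_gem_checksums(checksums_yaml, dir)
  algorithms = { 'SHA256' => Digest::SHA256, 'SHA512' => Digest::SHA512 }
  expected = YAML.safe_load(checksums_yaml)

  expected.each_with_object({}) do |(algo, files), results|
    digest_class = algorithms.fetch(algo) # raises KeyError on an unknown algorithm
    results[algo] = files.to_h do |name, hex|
      path = File.join(dir, name)
      [name, File.exist?(path) && digest_class.file(path).hexdigest == hex.to_s]
    end
  end
end
```

Calling `verify_gem_checksums(File.read('checksums.yaml'), '.')` from the unpack directory would return `true` for each entry whose digest matches the values listed in this hunk.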

data/CHANGELOG.md ADDED

@@ -0,0 +1,89 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.3.1] - 2026-03-04
+### Fixed
+- **Gemspec version** now reads from `version.rb` instead of being hardcoded — prevents version mismatch during gem builds
+- **Release workflow** replaced `rake release` (fails on tag-triggered detached HEAD) with `gem build` + `gem push`
+## [0.3.0] - 2026-03-04
+### Added
+- **Redis/SolidCache caching layer** for retrieval pipeline with TTL, namespace isolation, and nil-caching
+- **Engine classification** — engines tagged as `:framework` or `:application` based on install path (handles Docker vendor paths)
+- **Graph analysis staleness tracking** — `generated_at` timestamp and `graph_sha` for detecting stale analysis
+- **Docker setup guide** (`docs/DOCKER_SETUP.md`) — split architecture, volume mounts, bridge mode, troubleshooting
+- **Context7 documentation suite** — 10 new user-facing docs optimized for AI retrieval: FAQ, Troubleshooting, Architecture, Extractor Reference, WHY Woods, MCP Tool Cookbook, and 3 Context7 skills
+- **`context7.json`** configuration for controlling Context7 indexing scope
+### Fixed
+- **Vendor path leak** in source file resolution across 9 extractors — framework gems under `vendor/bundle` no longer produce empty source
+- **Prism cross-version compatibility** — handle API differences between Prism versions
+- **`schema_sha`** now supports `db/structure.sql` fallback (not just `db/schema.rb`)
+- **ViewComponent extractor** skips framework-internal components with no resolvable source file
+- **HTTP connection reuse** and retry handling in embedding providers
+- **DependencyGraph `to_h`** returns a dup to prevent cache pollution
+- **MCP tool counts** corrected across all documentation (27 index / 31 console)
+- **TROUBLESHOOTING.md** corrected: `config.extractors` controls retrieval scope, not which extractors run
+### Changed
+- **README streamlined** from 620 to 325 lines — added Quick Start, Documentation table; removed verbose sections in favor of links to dedicated docs
+- **Internal rake tasks** (`retrieve`, `self_analyze`) hidden from `rails -T`
+- **Estimated tokens memoization** removed to prevent stale values after source changes
+- **Simplification sweep** — dead code removal, shared helper extraction, bug fixes across caching and retrieval layers
+### Performance
+- Critical hotspots fixed across extraction, storage, and retrieval pipelines
+- `fetch_key` optimization for falsy value handling in cache layer
+## [0.2.1] - 2026-02-19
+### Changed
+- Switch release workflow to RubyGems trusted publishing
+## [0.2.0] - 2026-02-19
+### Added
+- **Embedded console MCP server** for zero-config Rails querying (no bridge process needed)
+- **Console MCP setup guide** (`docs/CONSOLE_MCP_SETUP.md`) — stdio, Docker, HTTP/Rack, SSH bridge options
+- **CODEOWNERS** and issue template configuration
+### Fixed
+- MCP gem compatibility and symbol key handling in embedded executor
+- Duplicate URI warning in gemspec
+## [0.1.0] - 2026-02-18
+### Added
+- **Extraction layer** with 13 extractors: Model, Controller, Service, Job, Mailer, Phlex, ViewComponent, GraphQL, Serializer, Manager, Policy, Validator, RailsSource
+- **Dependency graph** with PageRank scoring and GraphAnalyzer (orphans, hubs, cycles, bridges)
+- **Storage interfaces** with InMemory, SQLite, Pgvector, and Qdrant adapters
+- **Embedding pipeline** with OpenAI and Ollama providers, TextPreparer, resumable Indexer
+- **Semantic chunking** with type-aware splitting (model sections, controller per-action)
+- **Context formatting** adapters for Claude, GPT, generic LLMs, and humans
+- **Retrieval pipeline** with QueryClassifier, SearchExecutor, RRF Ranker, ContextAssembler
+- **Retriever orchestrator** with degradation tiers and RetrievalTrace
+- **Schema management** with versioned migrations and Rails generators
+- **Observability** with ActiveSupport::Notifications instrumentation, structured logging, health checks
+- **Resilience** with CircuitBreaker, RetryableProvider, IndexValidator
+- **MCP Index Server** (21 tools) for AI agent codebase retrieval
+- **Console MCP Server** (31 tools across 4 tiers) for live Rails data access
+- **AST layer** with Prism adapter for method extraction and call site analysis
+- **RubyAnalyzer** for class, method, and data flow analysis
+- **Flow extraction** with FlowAssembler, OperationExtractor, FlowDocument
+- **Evaluation harness** with Precision@k, Recall, MRR metrics and baseline comparisons
+- **Rake tasks** for extraction, incremental indexing, framework source, validation, stats, evaluation
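
The 0.1.0 entry lists an "RRF Ranker" in the retrieval pipeline. Assuming RRF stands for reciprocal rank fusion (the changelog does not expand the acronym), the general technique can be sketched as below. This is an illustration of the idea only, not the gem's `Ranker`; the method name and the constant `k = 60` (a commonly used default) are assumptions.

```ruby
# Reciprocal rank fusion: every ranked list contributes 1 / (k + rank) to the
# score of each item it contains, with rank starting at 1. Items that appear
# near the top of several lists end up with the highest fused score.
def reciprocal_rank_fusion(rankings, k: 60)
  scores = Hash.new(0.0)
  rankings.each do |ranked_ids|
    ranked_ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  # Highest score first; ties are broken by identifier for a stable order.
  scores.sort_by { |id, score| [-score, id.to_s] }.map(&:first)
end

vector_hits  = %w[User Order Checkout] # hypothetical semantic-search results
keyword_hits = %w[Order Invoice User]  # hypothetical keyword-search results
reciprocal_rank_fusion([vector_hits, keyword_hits])
# => ["Order", "User", "Invoice", "Checkout"]
```

`Order` outranks `User` because its ranks (2 and 1) sum to a slightly higher score than `User`'s (1 and 3).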

data/CODE_OF_CONDUCT.md ADDED

@@ -0,0 +1,83 @@
+# Contributor Covenant Code of Conduct
+## Our Pledge
+We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
+We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
+## Our Standards
+Examples of behavior that contributes to a positive environment for our community include:
+* Demonstrating empathy and kindness toward other people
+* Being respectful of differing opinions, viewpoints, and experiences
+* Giving and gracefully accepting constructive feedback
+* Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
+* Focusing on what is best not just for us as individuals, but for the overall community
+Examples of unacceptable behavior include:
+* The use of sexualized language or imagery, and sexual attention or advances of any kind
+* Trolling, insulting or derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or email address, without their explicit permission
+* Other conduct which could reasonably be considered inappropriate in a professional setting
+## Enforcement Responsibilities
+Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
+Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
+## Scope
+This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+## Enforcement
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at info@leah.wtf. All complaints will be reviewed and investigated promptly and fairly.
+All community leaders are obligated to respect the privacy and security of the reporter of any incident.
+## Enforcement Guidelines
+Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
+### 1. Correction
+**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
+**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
+### 2. Warning
+**Community Impact**: A violation through a single incident or series of actions.
+**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
+### 3. Temporary Ban
+**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
+**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
+### 4. Permanent Ban
+**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
+**Consequence**: A permanent ban from any sort of public interaction within the community.
+## Attribution
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.1, available at [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
+Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
+For answers to common questions about this code of conduct, see the FAQ at [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at [https://www.contributor-covenant.org/translations][translations].
+[homepage]: https://www.contributor-covenant.org
+[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
+[Mozilla CoC]: https://github.com/mozilla/diversity
+[FAQ]: https://www.contributor-covenant.org/faq
+[translations]: https://www.contributor-covenant.org/translations

data/CONTRIBUTING.md ADDED

@@ -0,0 +1,65 @@
+# Contributing to Woods
+Thank you for your interest in contributing to Woods!
+## Bug Reports
+Please open an issue on GitHub with:
+- A clear description of the bug
+- Steps to reproduce
+- Expected vs. actual behavior
+- Your Ruby version, Rails version, and database adapter
+## Feature Requests
+Open an issue describing:
+- The problem you're trying to solve
+- Your proposed solution
+- Any alternatives you've considered
+## Pull Requests
+1. Fork the repo and create your branch from `main`
+2. Install dependencies: `bin/setup`
+3. Make your changes
+4. Add tests for new functionality
+5. Ensure the test suite passes: `bundle exec rake spec`
+6. Ensure code style passes: `bundle exec rubocop`
+7. Update CHANGELOG.md with your changes
+8. Open a pull request
+## Development Setup
+```bash
+git clone https://github.com/lost-in-the/woods.git
+cd woods
+bin/setup
+bundle exec rake spec    # Run tests
+bundle exec rubocop      # Check style
+```
+## Testing
+Woods has two test suites:
+- **Gem unit specs** (`spec/`): Run with `bundle exec rake spec`. No Rails boot required.
+- **Integration specs**: Run inside a host Rails app to test real extraction.
+All new features need tests. Bug fixes should include a regression test.
+## Code Style
+- `frozen_string_literal: true` on every file
+- YARD documentation on public methods
+- `rescue StandardError`, never bare `rescue`
+- All extractors return `Array<ExtractedUnit>`
+## Runtime Introspection Requirement
+Woods uses runtime introspection, not static parsing. If your feature requires access to Rails internals (ActiveRecord reflections, route introspection, etc.), it must run inside a booted Rails environment. Unit tests should use mocks/stubs; integration tests should run in a real Rails app.
+## License
+By contributing, you agree that your contributions will be licensed under the MIT License.
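
The four code-style rules in this file can be seen together in one small class. The sketch below is hypothetical: `ExampleFileExtractor` does not exist in Woods, and the `ExtractedUnit` struct is a stand-in with assumed fields, since the real class in `lib/woods/extracted_unit.rb` is not shown in this part of the diff.

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for the gem's ExtractedUnit; the field names are assumptions.
ExtractedUnit = Struct.new(:type, :identifier, :source, keyword_init: true)

# Hypothetical extractor that illustrates the conventions; not part of Woods.
class ExampleFileExtractor
  # Reads each path and wraps its contents in a unit.
  #
  # @param paths [Array<String>] files to read
  # @return [Array<ExtractedUnit>] one unit per readable file
  def extract(paths)
    paths.filter_map do |path|
      ExtractedUnit.new(type: :example, identifier: File.basename(path), source: File.read(path))
    rescue StandardError => e # named class, never a bare rescue
      warn "skipping #{path}: #{e.message}"
      nil # filter_map drops unreadable files from the result
    end
  end
end
```

An unreadable path is logged and skipped, so the method still returns an `Array<ExtractedUnit>` rather than raising part-way through.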

data/LICENSE.txt ADDED

@@ -0,0 +1,21 @@
+The MIT License (MIT)
+Copyright (c) 2024-2026 Leah Armstrong
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,406 @@
+# Woods
+**Your AI coding assistant is guessing about your Rails app. Woods gives it the real answers.**
+Rails hides enormous amounts of behavior behind conventions, concerns, and runtime magic. When you ask an AI assistant "what callbacks fire when a User saves?" or "what routes map to this controller?", it guesses from training data — and gets it wrong. Woods runs *inside* your Rails app, extracts what's actually happening at runtime, and serves that context directly to your AI tools via [MCP](https://modelcontextprotocol.io/).
+Works with **Claude Code**, **Cursor**, **Windsurf**, and any MCP-compatible tool.
+---
+## The Problem
+Ask your AI assistant about your Rails app and watch it confidently hallucinate:
+| You ask | What the AI says | What's actually true |
+|---------|-----------------|---------------------|
+| "What callbacks fire when User saves?" | `before_save :set_slug` | 11 callbacks across 4 files, including 3 from concerns |
+| "What routes map to OrdersController?" | Standard REST routes | Custom `POST /checkout`, nested under `/shops/:shop_id` |
+| "What does the checkout flow do?" | Describes `CheckoutService` | Misses that `order.save!` triggers 3 callbacks that enqueue 2 jobs |
+The AI isn't bad — it just can't see what Rails is doing. Your 40-line model file has 10x that behavior when you factor in included concerns, schema context, callback chains, validations, and association reflections. Static analysis can't reach any of it.
+**Woods fixes this by running inside Rails and extracting what's actually there.**
+See [Why Woods?](docs/WHY_CODEBASE_INDEX.md) for detailed before/after examples.
+---
+## Quick Start
+Five steps from install to asking questions:
+```bash
+# 1. Add to your Rails app's Gemfile
+gem 'woods', group: :development
+# 2. Install and configure
+bundle install
+rails generate woods:install
+# 3. Extract your codebase (requires Rails to be running)
+bundle exec rake woods:extract
+# Aliases: woods:scan
+# 4. Verify it worked
+bundle exec rake woods:stats
+# Aliases: woods:look
+# 5. Add the MCP server to your AI tool (see "Connect to Your AI Tool" below)
+```
+After extraction, your AI tool gets accurate, structured context about every model, controller, service, job, route, and more — including all the behavior that Rails hides.
+> **Docker?** Run extraction inside the container: `docker compose exec app bundle exec rake woods:extract`. The MCP server runs on the host reading volume-mounted output. See [Docker Setup](docs/DOCKER_SETUP.md).
+See [Getting Started](docs/GETTING_STARTED.md) for the full walkthrough including storage presets, CI setup, and common first-run issues.
+---
+## What Does It Actually Do?
+Woods boots your Rails app, introspects everything using runtime APIs, and writes structured JSON that your AI tools can read. Here's what that means in practice:
+### Concern Inlining
+Your `User` model includes `Auditable`, `Searchable`, and `SoftDeletable`. An AI tool reading `app/models/user.rb` sees 40 lines. Woods inlines all three concerns directly into the extracted unit — the AI sees the full 200-line behavioral surface area in one block.
+### Schema Prepending
+Model source gets a header with actual column types, indexes, and foreign keys pulled from the live database. No more guessing whether `name` is a `string` or `text`, or whether there's an index on `email`.
+### Route Binding
+Controller source gets a route map prepended showing the real HTTP verb + path + constraints for every action. No more assuming standard REST when your app has custom routes and nested resources.
+### Dependency Graph
+34 extractors build a bidirectional graph: what each unit depends on, and what depends on it. Change a concern and trace every model it touches. Refactor a service and see every controller that calls it. PageRank scoring identifies the most important nodes in your codebase.
+### Callback Side-Effect Analysis
+`CallbackAnalyzer` detects what actually happens inside callbacks — which columns get written, which jobs get enqueued, which services get called, which mailers fire. This is the #1 source of unexpected bugs in Rails, and the #1 thing AI tools get wrong.
+---
+## Connect to Your AI Tool
+Woods ships two MCP servers. Most users only need the **Index Server**.
+### Index Server — Reads Pre-Extracted Data (No Rails Required)
+27 tools for code lookup, dependency traversal, semantic search, graph analysis, and more. Reads static JSON from disk — fast, no Rails boot needed.
+**Claude Code** — add to `.mcp.json` in your project root:
+```json
+{
+  "mcpServers": {
+    "woods": {
+      "command": "woods-mcp-start",
+      "args": ["./tmp/woods"]
+    }
+  }
+}
+```
+> `woods-mcp-start` is a self-healing wrapper that validates the index, checks dependencies, and auto-restarts on failure. Recommended for Claude Code.
+**Cursor / Windsurf** — add to your MCP config:
+```json
+{
+  "mcpServers": {
+    "woods": {
+      "command": "woods-mcp",
+      "args": ["/path/to/your-rails-app/tmp/woods"]
+    }
+  }
+}
+```
+### Console Server — Live Rails Queries (Optional)
+31 tools for querying real database records, monitoring job queues, running model diagnostics, and checking schema. Connects to a live Rails process. Every query runs in a rolled-back transaction with SQL validation — safe for development use.
+```json
+{
+  "mcpServers": {
+    "woods-console": {
+      "command": "bundle",
+      "args": ["exec", "rake", "woods:console"],
+      "cwd": "/path/to/your-rails-app"
+    }
+  }
+}
+```
+See [MCP Servers](docs/MCP_SERVERS.md) for the full tool catalog and [MCP Tool Cookbook](docs/MCP_TOOL_COOKBOOK.md) for scenario-based examples.
+---
+## What Gets Extracted
+34 extractors cover every major Rails concept:
+| Category | What's Extracted | Key Details |
+|----------|-----------------|-------------|
+| **Models** | Schema, associations, validations, scopes, callbacks, enums | Concerns inlined, callback side-effects analyzed |
+| **Controllers** | Actions, filters, permitted params, response formats | Route map prepended, per-action filter chains |
+| **Services & Jobs** | Entry points, dependencies, retry config, queue names | Includes services, interactors, operations, commands |
+| **Views & Components** | ERB templates, Phlex components, ViewComponents | Partial references, slot definitions, prop interfaces |
+| **Routes & Middleware** | Full route table, middleware stack order | Constraint resolution, engine mount points |
+| **GraphQL** | Types, mutations, resolvers, fields | Relay connections, argument definitions |
+| **Background Work** | Jobs, mailers, Action Cable channels, scheduled tasks | Queue configuration, retry policies |
+| **Data Layer** | Migrations, database views, state machines, events | DDL metadata, reversibility, transition graphs |
+| **Testing** | Factories, test-to-source mappings | FactoryBot definitions, spec file associations |
+| **Framework Source** | Rails internals, gem source for exact installed versions | Pinned to your `Gemfile.lock` versions |
+See [Extractor Reference](docs/EXTRACTOR_REFERENCE.md) for per-extractor documentation with configuration options and example output.
+---
+## Use Cases
+### For AI-Assisted Development
+- **Context-aware code generation** — your AI sees the full model (with concerns, schema, and callbacks) before writing new code
+- **Feature planning** — query the dependency graph to understand blast radius before changing anything
+- **PR context** — compute affected units from a diff and explain downstream impact
+- **Code review** — surface hidden callback side-effects that a reviewer might miss
+- **Onboarding** — new team members ask "how does checkout work?" and get the real execution flow
+### For Architecture & Technical Debt
+- **Dead code detection** — `GraphAnalyzer` finds orphaned units with no dependents
+- **Hub identification** — find models with 50+ dependents that are bottlenecks
+- **Cycle detection** — circular dependencies surfaced automatically
+- **Migration risk** — DDL metadata shows which pending migrations touch large tables
+- **API surface audit** — every endpoint, its method, path, filters, and permitted params
+- **Callback chain auditing** — the #1 source of Rails bugs, now visible and traceable
+---
+## Configuration
+### Zero-Config Start
+The install generator creates a working configuration. The only required option is `output_dir`, which defaults to `tmp/woods`:
+```ruby
+# config/initializers/woods.rb
+Woods.configure do |config|
+  config.output_dir = Rails.root.join('tmp/woods')
+end
+```
+### Storage Presets
+For embedding and semantic search, use a preset to configure storage and embedding together:
+```ruby
+# Local development — no external services needed
+Woods.configure_with_preset(:local)
+# PostgreSQL — pgvector + OpenAI embeddings
+Woods.configure_with_preset(:postgresql)
+# Production scale — Qdrant + OpenAI embeddings
+Woods.configure_with_preset(:production)
+```
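If different environments should use different presets, one option is to pick the preset name from the environment. The sketch below is only an illustration: `PRESET_BY_ENV` and `preset_for` are hypothetical helpers, and which environment maps to which preset is an assumption, not a recommendation from the project. Only the three preset names come from the list above.

```ruby
# Hypothetical mapping from environment name to a preset symbol.
PRESET_BY_ENV = {
  'development' => :local,
  'test'        => :local,
  'staging'     => :postgresql,
  'production'  => :production
}.freeze

# Fall back to :local, which needs no external services.
def preset_for(env_name)
  PRESET_BY_ENV.fetch(env_name.to_s, :local)
end

# In config/initializers/woods.rb this would presumably be used as:
#   Woods.configure_with_preset(preset_for(Rails.env))
p preset_for('production')
```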
+### Backend Compatibility
+Woods is backend-agnostic. Your app database, vector store, embedding provider, and job system are all configurable independently:
+| Component | Options |
+|-----------|---------|
+| **App Database** | MySQL, PostgreSQL, SQLite |
+| **Vector Store** | In-memory, pgvector, Qdrant |
+| **Embeddings** | OpenAI, Ollama (local, free) |
+| **Job System** | Sidekiq, Solid Queue, GoodJob, inline |
+| **View Layer** | ERB, Phlex, ViewComponent |
+See [Backend Matrix](docs/BACKEND_MATRIX.md) for supported combinations and [Configuration Reference](docs/CONFIGURATION_REFERENCE.md) for every option with defaults.
+---
+## Keeping the Index Current
+### Incremental Updates
+After the initial extraction, update only the changed files, which is typically 5-10x faster than a full extraction:
+```bash
+bundle exec rake woods:incremental
+# Alias: woods:tend
+```
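How the incremental task decides what changed is not documented here. As a rough mental model, an incremental run needs a list of changed source files to re-extract. The sketch below filters the output of a `git diff --name-only` style listing; the `WATCHED` pattern and `changed_sources` helper are hypothetical and do not reflect the actual task's logic.

```ruby
# Keep only paths that plausibly map to extracted units.
# The directory and extension filters are assumptions for illustration.
WATCHED = %r{\A(app|lib)/.+\.(rb|erb)\z}

# Takes newline-separated paths and returns the watched ones, deduplicated.
def changed_sources(diff_output)
  diff_output.lines.map(&:strip).reject(&:empty?).grep(WATCHED).uniq
end

sample = <<~DIFF
  app/models/order.rb
  README.md
  app/views/orders/show.html.erb
  db/schema.rb
  lib/checkout/tax.rb
DIFF

p changed_sources(sample)
```

In a real checkout the input could come from something like `git diff --name-only HEAD~1 HEAD`.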
+### CI Integration
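+```yaml
+# .github/workflows/index.yml
+name: Update Woods index
+on: pull_request  # github.base_ref is only populated for pull request events
+jobs:
+  index:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 2
+      - uses: ruby/setup-ruby@v1
+        with:
+          bundler-cache: true  # runs bundle install and caches gems
+      - name: Update index
+        run: bundle exec rake woods:incremental
+        env:
+          GITHUB_BASE_REF: ${{ github.base_ref }}
+```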
+### Other Tasks
+```bash
+rake woods:validate            # Check index integrity (alias: woods:vet)
+rake woods:stats               # Show unit counts and graph stats (alias: woods:look)
+rake woods:clean               # Remove index output (alias: woods:clear)
+rake woods:embed               # Embed units for semantic search (alias: woods:nest)
+rake woods:embed_incremental   # Embed changed units only (alias: woods:hone)
+rake woods:notion_sync         # Sync models/columns to Notion (alias: woods:send)
+```
+---
+## How It Works Under the Hood
+```
+Inside your Rails app (rake task):
+  1. Boot Rails, eager-load all application classes
+  2. 34 extractors introspect models, controllers, routes, etc.
+  3. Dependency graph is built with forward + reverse edges
+  4. Git metadata enriches each unit (last modified, contributors, churn)
+  5. JSON output written to tmp/woods/
+On the host (no Rails needed):
+  6. Embedding pipeline chunks and vectorizes units (optional)
+  7. MCP Index Server reads JSON and answers AI tool queries
+```
+### The ExtractedUnit
+Everything flows through `ExtractedUnit` — the universal data structure. Each unit carries:
+| Field | What It Contains |
+|-------|-----------------|
+| `identifier` | Class name or descriptive key (`"User"`, `"POST /orders"`) |
+| `type` | Category (`:model`, `:controller`, `:service`, `:job`, etc.) |
+| `source_code` | Annotated source with inlined concerns and schema |
+| `metadata` | Structured data — associations, callbacks, routes, fields |
+| `dependencies` | What this unit depends on (forward edges) |
+| `dependents` | What depends on this unit (reverse edges) |
+| `chunks` | Semantic sub-sections for large units |
+| `estimated_tokens` | Token count for LLM context budgeting |
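The field table above can be pictured as a plain value object. The sketch below mirrors the eight fields with a `Struct`; it is not the gem's `ExtractedUnit` class. `ExtractedUnitSketch` and `rough_token_estimate` are hypothetical names, and the four-characters-per-token heuristic is an assumption rather than the gem's actual estimator.

```ruby
# Stand-in for the real class, with the same field names as the table.
ExtractedUnitSketch = Struct.new(
  :identifier, :type, :source_code, :metadata,
  :dependencies, :dependents, :chunks, :estimated_tokens,
  keyword_init: true
)

# Rough heuristic (an assumption): about four characters per token.
def rough_token_estimate(source)
  (source.length / 4.0).ceil
end

source = "class Order < ApplicationRecord\n  belongs_to :user\nend\n"

unit = ExtractedUnitSketch.new(
  identifier: 'Order',
  type: :model,
  source_code: source,
  metadata: { associations: [{ kind: :belongs_to, name: :user }] },
  dependencies: ['User'],             # forward edges
  dependents: ['OrdersController'],   # reverse edges
  chunks: [],
  estimated_tokens: rough_token_estimate(source)
)

puts "#{unit.identifier} (#{unit.type}) depends on #{unit.dependencies.join(', ')}"
```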
+### Output Structure
+```
+tmp/woods/
+├── manifest.json              # Git SHA, timestamps, checksums
+├── dependency_graph.json      # Full graph with PageRank scores
+├── SUMMARY.md                 # Human-readable overview
+├── models/
+│   ├── _index.json            # Quick lookup index
+│   ├── User.json              # Full unit with inlined concerns
+│   └── Order.json
+├── controllers/
+│   └── OrdersController.json  # With route map prepended
+├── services/
+│   └── CheckoutService.json
+└── rails_source/
+    └── ...                    # Framework source for installed versions
+```
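Because the output is plain JSON on disk, any script can read it without booting Rails. The sketch below writes a stand-in unit file into a temporary directory laid out like the tree above and reads it back. The `load_unit` helper is hypothetical, and the JSON keys are assumed to mirror the `ExtractedUnit` field names; the real files may use a different shape.

```ruby
require 'json'
require 'tmpdir'
require 'fileutils'

# Read one unit file from an output directory laid out like the tree above.
def load_unit(output_dir, category, name)
  JSON.parse(File.read(File.join(output_dir, category, "#{name}.json")))
end

# Build a throwaway directory with one fake unit, then load it back.
loaded = Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'models'))
  fake = {
    'identifier' => 'User',
    'type' => 'model',
    'dependents' => %w[Order],
    'estimated_tokens' => 420
  }
  File.write(File.join(dir, 'models', 'User.json'), JSON.generate(fake))
  load_unit(dir, 'models', 'User')
end

puts "#{loaded['identifier']} (#{loaded['type']}): #{loaded['dependents'].size} dependent(s)"
```

Against a real extraction, `load_unit('tmp/woods', 'models', 'User')` would be the equivalent call, assuming the default `output_dir`.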
+### Architecture Diagram
+```
+┌──────────────────────────────────────────────────────────────────┐
+│                      Rails Application                           │
+│                                                                  │
+│  ┌────────────┐    ┌─────────────┐    ┌──────────────────────┐  │
+│  │  Extract   │───>│   Resolve   │───>│   Write JSON         │  │
+│  │ 34 types   │    │   graph +   │    │   per unit           │  │
+│  │            │    │   git data  │    │                      │  │
+│  └────────────┘    └─────────────┘    └──────────────────────┘  │
+└──────────────────────────────────────────────────────────────────┘
+                                               │
+                     ┌─────────────────────────┘
+                     ▼
+┌──────────────────────────────────────────────────────────────────┐
+│                   Host / CI Environment                           │
+│                                                                  │
+│  ┌────────────┐    ┌─────────────┐    ┌──────────────────────┐  │
+│  │  Embed     │───>│ Vector Store│    │  MCP Index Server    │  │
+│  │  OpenAI /  │    │ pgvector /  │    │  27 tools            │  │
+│  │  Ollama    │    │ Qdrant      │    │  No Rails required   │  │
+│  └────────────┘    └─────────────┘    └──────────────────────┘  │
+│                                                                  │
+│                              ┌────────────────────────────────┐  │
+│                              │  Console MCP Server            │  │
+│                              │  31 tools, bridges to Rails    │  │
+│                              └────────────────────────────────┘  │
+└──────────────────────────────────────────────────────────────────┘
+```
+See [Architecture](docs/ARCHITECTURE.md) for the deep dive — extraction phases, graph internals, retrieval pipeline, and semantic chunking.
+---
+## Advanced Features
+| Feature | What It Does | Guide |
+|---------|-------------|-------|
+| **Semantic Search** | Natural-language queries like "find email validation logic" | [Configuration Reference](docs/CONFIGURATION_REFERENCE.md) |
+| **Temporal Snapshots** | Compare extraction state across git SHAs | [FAQ](docs/FAQ.md#what-are-temporal-snapshots) |
+| **Session Tracing** | Record which code paths fire during a browser session | [FAQ](docs/FAQ.md#what-does-the-session-tracer-do) |
+| **Notion Export** | Sync model/column data to Notion for non-technical stakeholders | [Notion Integration](docs/NOTION_INTEGRATION.md) |
+| **Graph Analysis** | Find orphans, hubs, cycles, bridges in your dependency graph | [Architecture](docs/ARCHITECTURE.md) |
+| **Evaluation Harness** | Measure retrieval precision, recall, and MRR | [Architecture](docs/ARCHITECTURE.md) |
+| **Flow Precomputation** | Per-action request flow maps (controller → model → jobs) | [Configuration Reference](docs/CONFIGURATION_REFERENCE.md) |
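For readers unfamiliar with the metrics named in the Evaluation Harness row, the sketch below computes precision, recall, and mean reciprocal rank (MRR) using their standard definitions. It is not the gem's harness; the helper names and the sample `runs` data are made up for illustration.

```ruby
# Fraction of the top-k retrieved identifiers that are relevant.
def precision_at_k(retrieved, relevant, k)
  top = retrieved.first(k)
  return 0.0 if top.empty?

  (top & relevant).size.to_f / top.size
end

# Fraction of the relevant identifiers that appear in the top-k results.
def recall_at_k(retrieved, relevant, k)
  return 0.0 if relevant.empty?

  (retrieved.first(k) & relevant).size.to_f / relevant.size
end

# 1 / rank of the first relevant result, or 0.0 if none was retrieved.
def reciprocal_rank(retrieved, relevant)
  index = retrieved.index { |id| relevant.include?(id) }
  index ? 1.0 / (index + 1) : 0.0
end

# Average reciprocal rank over a set of queries.
def mean_reciprocal_rank(runs)
  return 0.0 if runs.empty?

  runs.sum { |run| reciprocal_rank(run[:retrieved], run[:relevant]) } / runs.size
end

# Made-up query results: ranked identifiers returned vs. expected ones.
runs = [
  { retrieved: %w[User Order Payment], relevant: %w[Order] },
  { retrieved: %w[CheckoutService Order], relevant: %w[CheckoutService Payment] }
]

puts "precision@2: #{precision_at_k(runs[0][:retrieved], runs[0][:relevant], 2)}"
puts "recall@2:    #{recall_at_k(runs[1][:retrieved], runs[1][:relevant], 2)}"
puts "MRR:         #{mean_reciprocal_rank(runs)}"
```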
+---
+## Documentation
+| Guide | Who It's For | Description |
+|-------|-------------|-------------|
+| [Getting Started](docs/GETTING_STARTED.md) | Everyone | Install, configure, extract, inspect |
+| [FAQ](docs/FAQ.md) | Everyone | Common questions about setup, extraction, MCP, Docker |
+| [Troubleshooting](docs/TROUBLESHOOTING.md) | Everyone | Symptom → cause → fix |
+| [MCP Servers](docs/MCP_SERVERS.md) | Setup | Full tool catalog for Claude Code, Cursor, Windsurf |
+| [MCP Tool Cookbook](docs/MCP_TOOL_COOKBOOK.md) | Daily use | Scenario-based "how do I..." examples |
+| [Docker Setup](docs/DOCKER_SETUP.md) | Docker users | Container extraction + host MCP server |
+| [Configuration Reference](docs/CONFIGURATION_REFERENCE.md) | Customization | Every option with defaults |
+| [Extractor Reference](docs/EXTRACTOR_REFERENCE.md) | Deep dive | What each of the 34 extractors captures |
+| [Architecture](docs/ARCHITECTURE.md) | Contributors | Pipeline stages, graph internals, retrieval |
+| [Backend Matrix](docs/BACKEND_MATRIX.md) | Infrastructure | Supported database, vector, and embedding combos |
+| [Why Woods?](docs/WHY_CODEBASE_INDEX.md) | Evaluation | Detailed before/after comparisons |
+---
+## Requirements
+- Ruby >= 3.0
+- Rails >= 6.1
+Works with MySQL, PostgreSQL, and SQLite. No additional infrastructure required for basic extraction — embedding and vector search are optional add-ons.
+## Development
+```bash
+bin/setup                  # Install dependencies
+bundle exec rake spec      # Run tests (~2500 examples)
+bundle exec rubocop        # Lint
+```
+## Contributing
+Bug reports and pull requests are welcome on GitHub at https://github.com/lost-in-the/woods. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
+## License
+Available as open source under the [MIT License](LICENSE.txt).