RubyGems - nous - Versions diffs - 0.2.0 → 0.4.0 - Mend

nous 0.2.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

checksums.yaml +4 -4
data/CHANGELOG.md +68 -0
data/README.md +82 -10
data/lib/nous/cli.rb +13 -10
data/lib/nous/command.rb +2 -2
data/lib/nous/configuration_builder.rb +56 -0
data/lib/nous/converter.rb +1 -1
data/lib/nous/crawler/async_page_fetcher.rb +83 -0
data/lib/nous/crawler/link_extractor.rb +11 -11
data/lib/nous/crawler/recursive_page_fetcher.rb +103 -0
data/lib/nous/crawler/redirect_follower.rb +60 -0
data/lib/nous/crawler/single_page_fetcher.rb +112 -0
data/lib/nous/crawler/url_filter.rb +6 -6
data/lib/nous/crawler.rb +15 -70
data/lib/nous/extractor/default/client.rb +68 -0
data/lib/nous/extractor/default.rb +10 -6
data/lib/nous/extractor/jina/client.rb +4 -4
data/lib/nous/extractor/jina.rb +10 -9
data/lib/nous/fetcher/extraction_runner.rb +31 -0
data/lib/nous/fetcher/page_extractor.rb +40 -0
data/lib/nous/fetcher.rb +38 -11
data/lib/nous/primitives/configuration.rb +17 -0
data/lib/nous/primitives/extracted_content.rb +5 -0
data/lib/nous/primitives/fetch_record.rb +26 -0
data/lib/nous/primitives/fetch_result.rb +21 -0
data/lib/nous/primitives/page.rb +5 -0
data/lib/nous/primitives/url.rb +45 -0
data/lib/nous/serializer.rb +14 -3
data/lib/nous/url_resolver.rb +25 -0
data/lib/nous/version.rb +1 -1
data/lib/nous.rb +6 -5
metadata +44 -8
data/lib/nous/configuration.rb +0 -39
data/lib/nous/crawler/page_fetcher.rb +0 -47
data/lib/nous/error.rb +0 -5
data/lib/nous/extraction_runner.rb +0 -29
data/lib/nous/extraction_thread.rb +0 -28
data/lib/nous/extractor.rb +0 -46
data/lib/nous/page.rb +0 -5

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: c44bdc52070c6430739f9b0258ea53e3dafc1cff42d87814fd940c2e9e26ee94
-  data.tar.gz: e4b42ca9917d7e4656f8e8bc2d9b8b328781021c2ed02e9e1912bdb9ce8ac744
+  metadata.gz: c73c21d427c9bb99cc148e089ed5899e7aa9e3ca86a4825540380d41771354d2
+  data.tar.gz: 4b361a7aed3c0dfb28a6a650b0813d371622c82b910b8880502281047152a739
 SHA512:
-  metadata.gz: f55c5122dd9a53611c7045e648c34870f9e423afae6777d0004f0bc909c0b916fd5a8a0350168d286e2e63339be6ae393f9ee02cbe4703d1a392fceaee317fd0
-  data.tar.gz: fb6bdb6b9c283bc8350a4e697412869e9c1062af0659ad5b56aee6a0cdcad33983f8a15da9a94f3ae37b45b469a5d67a1c5e977d3d2c8b27a59e8f66eeedd59c
+  metadata.gz: aacbc4777dc1e5bd66513ddc3bc5a1f667276ac2e89ff781f7b43762133ae640d68bc1191c11b61ee00ac7911a128fcf6ad80653f86d369d284223e830f09120
+  data.tar.gz: 90fc8f0cf3c30c6e06bf6aeebd2790539ff80d0d63469a367c4a09b008c9ceed3dedc0d18528e5f6fab3c4a02103b4d2c7c9b3788b1708c6a1a46790d9f5cbab

data/CHANGELOG.md CHANGED

@@ -1,5 +1,67 @@
 ## [Unreleased]
+## [0.4.0] - 2026-04-11
+### Added
+- **New `details: true` option for `Nous.fetch`** - Returns a `FetchResult` object containing both successful pages and failed fetch/extraction attempts. This enables explicit failure handling without exceptions.
+  ```ruby
+  result = Nous.fetch("https://example.com", details: true)
+  result.pages    # Array<Page> - successfully extracted
+  result.failures # [{requested_url:, error:}, ...]
+  ```
+- **Page metadata** - Every extracted page now includes provenance information:
+  - `extractor`: Which extractor backend was used (e.g., "Nous::Extractor::Default")
+  - `requested_url`: The original URL before any redirects
+  - `content_type`: HTTP Content-Type header from the response
+  - `redirected`: Boolean indicating if redirects occurred
+- **FetchRecord internal primitive** - Unified fetch result representation that captures both success and failure cases with full provenance tracking. Replaces the previous `RawPage`, which handled only successful fetches.
+- **Configuration#single_page? helper** - Convenience predicate method for checking if the current configuration is in single-page (non-recursive) mode.
+### Changed
+- **Improved title extraction** - Title extraction now uses a fallback chain: readability-extracted title → HTML `<title>` tag → first `<h1>` element. This improves title reliability on pages where readability fails to identify the title.
+- **Reduced aggressive DOM stripping** - The default extractor now preserves more content before readability processing. Previously removed elements (`header`, `img`, `video`, `svg`, `link`) are now retained, providing better context for readability scoring and preserving useful content like captions and bylines.
+- **Unified fetch contract** - Both single-page and recursive crawling now use the same internal `FetchRecord` structure, ensuring consistent provenance tracking and failure handling across all fetch modes.
+- **Serializer schema updated** - Both text and JSON output formats now include:
+  - `pathname`: URL path component
+  - `extractor`: Which extractor processed the page
+  - Full metadata object (JSON only)
+### Fixed
+- **JSON serialization** - The JSON output now correctly includes the `pathname` field that was documented but missing in previous versions.
+- **Extraction failure visibility** - Previously, extraction failures were only visible with debug logging enabled. The new `FetchResult` structure makes failures programmatically accessible.
+### Internal Changes
+- **Duck-typed extractor interface** - Extractors now receive the full `FetchRecord` object and can access the fields they need (`Default` uses `record.html`, `Jina` uses `record.final_url`).
+- **Removed `RawPage` primitive** - Superseded by the richer `FetchRecord` which handles both success and failure uniformly.
+## [0.3.0] - 2026-02-23
+- Remove `Nous::Error` base hierarchy; colocated errors inherit directly from `StandardError` with descriptive names
+- Move extraction pipeline under `Nous::Fetcher::*` namespace (`ExtractionRunner`, `ExtractionThread`)
+- Move readability command into `Nous::Extractor::Default::Client`, mirroring Jina structure
+- `Nous::Extractor` is now a module namespace (implicit via Zeitwerk), no longer a Command
+- Shared `Extractor::ExtractionError` contract: all extractor backends raise this on failure
+- Pull `seed_url` off `Configuration`; `Crawler` owns URL parsing and validation directly
+- Explicit rescue lists in CLI and extraction thread instead of broad `Nous::Error` rescue
+- Rename `--verbose`/`-v` to `--debug`/`-d`; `-v` is now `--version`
+- Add `Nous::Url`, `Nous::UrlResolver`, and `Crawler::RedirectFollower` to correctly handle redirects and path encoding (including spaces)
+- Add `-r`/`--recursive`; default mode now fetches only the seed page unless recursion is explicitly enabled
+- Split crawler fetchers by mode: `Crawler::AsyncPageFetcher`, `Crawler::RecursivePageFetcher`, and `Crawler::SinglePageFetcher`
+- Move configuration construction to `ConfigurationBuilder` and `Data.define`-based `Configuration` primitive
+- Add `faraday-follow_redirects` for single-page redirect handling and update integration/spec coverage for recursive and single-page flows
 ## [0.2.0] - 2026-02-21
 - Promote Configuration to module-level singleton (`Nous.configure`, `Nous.configuration`)
@@ -13,3 +75,9 @@
 ## [0.1.0] - 2026-02-21
 - Initial release
+[Unreleased]: https://github.com/danfrenette/nous/compare/v0.4.0...HEAD
+[0.4.0]: https://github.com/danfrenette/nous/compare/v0.3.0...v0.4.0
+[0.3.0]: https://github.com/danfrenette/nous/compare/v0.2.0...v0.3.0
+[0.2.0]: https://github.com/danfrenette/nous/compare/v0.1.0...v0.2.0
+[0.1.0]: https://github.com/danfrenette/nous/releases/tag/v0.1.0
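
The `details: true` result described in the 0.4.0 entry can be pictured with a minimal sketch. This is not the gem's implementation: the `Page` and `FetchResult` definitions below are hypothetical stand-ins, and computing `total_requested` as pages plus failures is an assumption.

```ruby
# Hypothetical stand-ins; the real Nous::Page and Nous::FetchResult may differ.
Page = Data.define(:title, :url)

FetchResult = Data.define(:pages, :failures) do
  # Assumption: every attempted URL ends up as either a page or a failure.
  def total_requested
    pages.size + failures.size
  end

  def all_succeeded?
    failures.empty?
  end

  def any_succeeded?
    pages.any?
  end
end

result = FetchResult.new(
  pages: [Page.new(title: "Home", url: "https://example.com/")],
  failures: [{requested_url: "https://example.com/missing", error: "status 404"}]
)

# Failures are plain data, so callers can report them without rescuing exceptions.
messages = result.failures.map do |failure|
  "#{failure[:requested_url]}: #{failure[:error]}"
end
```

Keeping failures as data next to the pages lets a caller decide per URL whether to retry, log, or ignore.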

data/README.md CHANGED

@@ -42,8 +42,8 @@ nous https://example.com -s "article.post"
 # Use Jina Reader API for JS-rendered sites (Next.js, SPAs)
 nous https://example.com --jina
-# Verbose logging
-nous https://example.com -v
+# Debug logging
+nous https://example.com -d
 ```
 ### Options
@@ -58,17 +58,21 @@ nous https://example.com -v
 | `-l`, `--limit N` | Maximum pages to fetch | `100` |
 | `--timeout N` | Per-request timeout in seconds | `15` |
 | `--jina` | Use Jina Reader API for extraction | off |
-| `-v`, `--verbose` | Verbose logging to stderr | off |
+| `-v`, `--version` | Print version and exit | off |
+| `-h`, `--help` | Print usage and exit | off |
+| `-d`, `--debug` | Debug logging to stderr | off |
 ## Ruby API
+### Basic Usage
 ```ruby
 require "nous"
 # Fetch pages with the default extractor
 pages = Nous.fetch("https://example.com", limit: 10, concurrency: 3)
-# Each page is a Nous::Page with title, url, pathname, content
+# Each page is a Nous::Page with title, url, pathname, content, metadata
 pages.each do |page|
   puts "#{page.title} (#{page.url})"
   puts page.content
@@ -87,11 +91,70 @@ pages = Nous.fetch("https://spa-site.com",
 )
 ```
+### Detailed Results
+Use the `details: true` option to receive full fetch results including failures:
+```ruby
+result = Nous.fetch("https://example.com", details: true)
+result.pages       # Array<Nous::Page> - successfully extracted pages
+result.failures    # Array<{requested_url:, error:}> - failed fetches
+result.total_requested  # Integer - total URLs attempted
+result.all_succeeded?   # Boolean - true if no failures
+result.any_succeeded?   # Boolean - true if at least one page extracted
+```
+This is useful when you need to handle failures explicitly:
+```ruby
+result = Nous.fetch("https://example.com/api-docs", details: true)
+if result.failures.any?
+  puts "Failed to fetch:"
+  result.failures.each do |failure|
+    puts "  #{failure[:requested_url]}: #{failure[:error]}"
+  end
+end
+result.pages.each do |page|
+  puts "Successfully extracted: #{page.title}"
+end
+```
+### Page Structure
+Each extracted page contains:
+| Field | Type | Description |
+|-------|------|-------------|
+| `title` | String | Page title (fallback chain: readability → `<title>` tag → `<h1>`) |
+| `url` | String | Final URL after redirects |
+| `pathname` | String | URL path component |
+| `content` | String | Extracted content as Markdown |
+| `metadata` | Hash | Provenance information (see below) |
+### Page Metadata
+```ruby
+page.metadata  # => {
+  #   extractor: "Nous::Extractor::Default",  # Which extractor was used
+  #   requested_url: "https://example.com/blog", # Original URL before redirects
+  #   content_type: "text/html; charset=utf-8",  # HTTP Content-Type header
+  #   redirected: true                           # Whether redirects occurred
+  # }
+```
 ## Extraction Backends
 ### Default (ruby-readability)
-Parses static HTML using [ruby-readability](https://github.com/cantino/ruby-readability), strips noisy elements (nav, footer, script, header), and converts to Markdown via [reverse_markdown](https://github.com/xijo/reverse_markdown). Fast and requires no external services, but cannot extract content from JS-rendered pages.
+Parses static HTML using [ruby-readability](https://github.com/cantino/ruby-readability), strips noisy elements (script, style, nav, footer), and converts to Markdown via [reverse_markdown](https://github.com/xijo/reverse_markdown). Fast and requires no external services, but cannot extract content from JS-rendered pages.
+Title extraction uses a fallback chain:
+1. Readability's extracted title
+2. Original `<title>` tag from HTML
+3. First `<h1>` from extracted content
 ### Jina Reader API
@@ -105,13 +168,15 @@ XML-tagged output designed for LLM context windows:
 ```xml
 <page>
-<title>Page Title</title>
-<url>https://example.com/page</url>
-<content>
+  <title>Page Title</title>
+  <url>https://example.com/page</url>
+  <pathname>/page</pathname>
+  <extractor>Nous::Extractor::Default</extractor>
+  <content>
 # Heading
 Extracted markdown content...
-</content>
+  </content>
 </page>
 ```
@@ -123,7 +188,13 @@ Extracted markdown content...
     "title": "Page Title",
     "url": "https://example.com/page",
     "pathname": "/page",
-    "content": "# Heading\n\nExtracted markdown content..."
+    "content": "# Heading\n\nExtracted markdown content...",
+    "metadata": {
+      "extractor": "Nous::Extractor::Default",
+      "requested_url": "https://example.com/page",
+      "content_type": "text/html; charset=utf-8",
+      "redirected": false
+    }
   }
 ]
 ```
@@ -134,6 +205,7 @@ Extracted markdown content...
 bin/setup               # Install dependencies
 bundle exec rspec       # Run tests
 bundle exec standardrb  # Lint
+bundle exec exe/nous    # Run the CLI in development
 ```
 ## License
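
The title fallback chain documented in the README changes (readability title, then `<title>`, then first `<h1>`) amounts to picking the first non-blank candidate. A minimal sketch, assuming the three candidates have already been extracted as strings; `pick_title` and its keyword names are hypothetical and do not appear in the gem:

```ruby
# Hypothetical helper: return the first non-blank candidate in priority order,
# or nil when every candidate is missing or blank.
def pick_title(readability:, title_tag:, first_h1:)
  [readability, title_tag, first_h1]
    .map { |candidate| candidate.to_s.strip }
    .find { |candidate| !candidate.empty? }
end

title = pick_title(readability: "", title_tag: " Docs ", first_h1: "Welcome")
untitled = pick_title(readability: nil, title_tag: "  ", first_h1: nil)
```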

data/lib/nous/cli.rb CHANGED

@@ -4,7 +4,7 @@ require "optparse"
 module Nous
   class Cli
-    class Error < Nous::Error; end
+    class CliError < StandardError; end
     def initialize(argv)
       @argv = argv
@@ -18,7 +18,9 @@ module Nous
       pages = Nous.fetch(seed_url, **fetch_options)
       output = Nous.serialize(pages, format: options[:format])
       write_output(output)
-    rescue Nous::Error => e
+    rescue CliError,
+      Fetcher::FetchError,
+      Serializer::SerializationError => e
       warn("nous: #{e.message}")
       exit 1
     end
@@ -32,7 +34,7 @@ module Nous
     end
     def fetch_options
-      opts = options.slice(:concurrency, :match, :limit, :timeout, :verbose)
+      opts = options.slice(*Configuration.members)
       opts[:extractor] = extractor
       opts
     end
@@ -44,7 +46,7 @@ module Nous
     end
     def validate!
-      raise Error, "no URL provided. Usage: nous <url> [options]" unless seed_url
+      raise CliError, "no URL provided. Usage: nous <url> [options]" unless seed_url
     end
     def write_output(output)
@@ -58,7 +60,7 @@ module Nous
     def parse_options!
       parser.parse!(argv)
     rescue OptionParser::InvalidOption => e
-      raise Error, e.message
+      raise CliError, e.message
     end
     def parser
@@ -77,13 +79,14 @@ module Nous
         opts.on("-l", "--limit N", Integer, "Maximum pages to fetch") { |v| options[:limit] = v }
         opts.on("--timeout N", Integer, "Per-request timeout in seconds (default: 15)") { |v| options[:timeout] = v }
         opts.on("--jina", "Use Jina Reader API for extraction (handles JS-rendered sites)") { options[:jina] = true }
-        opts.on("-v", "--verbose", "Verbose logging to stderr") { options[:verbose] = true }
-        opts.on("-h", "--help", "Show help") do
-          $stdout.puts(opts)
+        opts.on("-r", "--recursive", "Follow same-host links recursively") { options[:recursive] = true }
+        opts.on("-d", "--debug", "Debug logging to stderr") { options[:debug] = true }
+        opts.on("-v", "--version", "Show version") do
+          $stdout.puts("nous #{Nous::VERSION}")
           exit
         end
-        opts.on("--version", "Show version") do
-          $stdout.puts("nous #{Nous::VERSION}")
+        opts.on("-h", "--help", "Show help") do
+          $stdout.puts(opts)
           exit
         end
       end
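
To see how the renamed flags behave, here is a minimal sketch using only the standard-library `OptionParser`. The `parse_flags` helper is hypothetical and covers only a few of the options in the diff; it is not the gem's `Cli` class.

```ruby
require "optparse"

# Hypothetical stand-in for the gem's CLI parsing.
# Returns the first positional argument (the seed URL) and the parsed options.
def parse_flags(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on("-r", "--recursive", "Follow same-host links recursively") { options[:recursive] = true }
    opts.on("-d", "--debug", "Debug logging to stderr") { options[:debug] = true }
    opts.on("-l", "--limit N", Integer, "Maximum pages to fetch") { |v| options[:limit] = v }
  end
  rest = parser.parse(argv) # non-destructive; returns the non-option arguments
  [rest.first, options]
end

url, options = parse_flags(%w[https://example.com -r -d -l 5])
```

An unknown flag raises `OptionParser::InvalidOption`, which is the error the diff above converts into `CliError`.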

data/lib/nous/command.rb CHANGED

@@ -2,7 +2,7 @@
 module Nous
   class Command
-    class Error < Nous::Error; end
+    class CommandError < StandardError; end
     class Result
       attr_reader :payload, :error, :metadata
@@ -27,7 +27,7 @@ module Nous
       command = new(...)
       command.call
     rescue => e
-      return command.failure(Error.new("unexpected: #{e.message}")) if command
+      return command.failure(CommandError.new("unexpected: #{e.message}")) if command
       Result.new(success: false, error: e)
     end
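
The hunk above shows only part of `Command.call`. The following sketch illustrates the pattern it implies: build the command, run it, and turn any unexpected exception into a failed result. `MiniCommand`, its `Result` struct, and `Divide` are hypothetical and only approximate the gem's classes.

```ruby
# Hypothetical reduction of the Command pattern seen in the diff.
class MiniCommand
  class CommandError < StandardError; end

  Result = Struct.new(:success, :payload, :error, keyword_init: true) do
    def success?
      success
    end

    def failure?
      !success
    end
  end

  def self.call(...)
    command = new(...)
    command.call
  rescue => e
    # If construction succeeded, wrap the error; otherwise report it as-is.
    return command.failure(CommandError.new("unexpected: #{e.message}")) if command
    Result.new(success: false, error: e)
  end

  def success(payload:)
    Result.new(success: true, payload: payload)
  end

  def failure(error)
    Result.new(success: false, error: error)
  end
end

class Divide < MiniCommand
  def initialize(a:, b:)
    @a = a
    @b = b
  end

  def call
    success(payload: @a / @b)
  end
end

ok = Divide.call(a: 6, b: 3)
failed = Divide.call(a: 1, b: 0) # ZeroDivisionError raised inside #call
invalid = Divide.call(a: 1)      # ArgumentError raised inside .new
```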

data/lib/nous/configuration_builder.rb ADDED

@@ -0,0 +1,56 @@
+# frozen_string_literal: true
+module Nous
+  class ConfigurationBuilder
+    class UnknownOptionError < StandardError; end
+    DEFAULTS = {
+      concurrency: 3,
+      match: [],
+      limit: 100,
+      timeout: 15,
+      debug: false,
+      keep_query: false,
+      recursive: false
+    }.freeze
+    def self.call(**options)
+      new(options).call
+    end
+    def initialize(options)
+      @options = options
+    end
+    def call
+      validate_keys!
+      Configuration.new(**coerced_options)
+    end
+    private
+    attr_reader :options
+    def validate_keys!
+      unknown = options.keys - Configuration.members
+      return if unknown.empty?
+      raise UnknownOptionError, "unknown option(s): #{unknown.join(", ")}"
+    end
+    def coerced_options
+      merged = DEFAULTS.merge(options)
+      {
+        concurrency: Integer(merged[:concurrency]).clamp(1, 20),
+        match: Array(merged[:match]),
+        limit: Integer(merged[:limit]).clamp(1, 10_000),
+        timeout: Integer(merged[:timeout]),
+        debug: !!merged[:debug],
+        keep_query: !!merged[:keep_query],
+        recursive: !!merged[:recursive]
+      }
+    end
+  end
+end
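
The effect of `validate_keys!` and `coerced_options` is easiest to see with concrete inputs. This is a reduced sketch under stated assumptions: the three-field `Configuration`, the `DEFAULTS` subset, and `build_configuration` are hypothetical, and `ArgumentError` stands in for the gem's `UnknownOptionError`.

```ruby
# Hypothetical three-field reduction of the builder shown above.
Configuration = Data.define(:concurrency, :limit, :debug)
DEFAULTS = {concurrency: 3, limit: 100, debug: false}.freeze

def build_configuration(**options)
  unknown = options.keys - Configuration.members
  raise ArgumentError, "unknown option(s): #{unknown.join(", ")}" unless unknown.empty?

  merged = DEFAULTS.merge(options)
  Configuration.new(
    concurrency: Integer(merged[:concurrency]).clamp(1, 20), # out-of-range values are clamped, not rejected
    limit: Integer(merged[:limit]).clamp(1, 10_000),
    debug: !!merged[:debug]                                  # any truthy value becomes true
  )
end

config = build_configuration(concurrency: "50", limit: 0, debug: "yes")
```

Here the string `"50"` is coerced to an integer and clamped to 20, a limit of 0 is raised to 1, and `"yes"` becomes `true`.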

data/lib/nous/converter.rb CHANGED

@@ -4,7 +4,7 @@ require "reverse_markdown"
 module Nous
   class Converter < Command
-    class Error < Command::Error; end
+    class ConversionError < StandardError; end
     def initialize(html:)
       @html = html

data/lib/nous/crawler/async_page_fetcher.rb ADDED

@@ -0,0 +1,83 @@
+# frozen_string_literal: true
+module Nous
+  class Crawler < Command
+    class AsyncPageFetcher
+      HTML_CONTENT_TYPES = %w[text/html application/xhtml+xml].freeze
+      def initialize(client:, seed_host:)
+        @client = client
+        @seed_host = seed_host
+      end
+      def fetch(url)
+        Async::Task.current.with_timeout(config.timeout) do
+          result = RedirectFollower.call(client:, seed_host:, url:)
+          return build_failed_record(url, result.error.message) if result.failure?
+          response, final_url = result.payload
+          content_type = response.headers["content-type"].to_s
+          redirected = final_url.to_s != url
+          return build_failed_record(url, "status #{response.status}") unless response.status == 200
+          return build_failed_record(url, "non-html content") unless html?(content_type)
+          build_success_record(
+            url: url,
+            final_url: final_url.to_s,
+            pathname: final_url.path,
+            html: response.read,
+            content_type: content_type,
+            redirected: redirected
+          )
+        ensure
+          response&.close
+        end
+      rescue Async::TimeoutError
+        build_failed_record(url, "timeout after #{config.timeout}s")
+      rescue IOError, SocketError, Errno::ECONNREFUSED => e
+        build_failed_record(url, e.message)
+      end
+      private
+      attr_reader :client, :seed_host
+      def config
+        Nous.configuration
+      end
+      def html?(content_type)
+        HTML_CONTENT_TYPES.any? { |type| content_type.include?(type) }
+      end
+      def build_success_record(url:, final_url:, pathname:, html:, content_type:, redirected:)
+        FetchRecord.new(
+          requested_url: url,
+          final_url: final_url,
+          pathname: pathname,
+          html: html,
+          content_type: content_type,
+          ok: true,
+          error: nil,
+          redirected: redirected
+        )
+      end
+      def build_failed_record(url, error)
+        FetchRecord.new(
+          requested_url: url,
+          final_url: nil,
+          pathname: Url.new(url).path,
+          html: nil,
+          content_type: nil,
+          ok: false,
+          error: error,
+          redirected: false
+        ).tap do |record|
+          warn("[nous] skip #{url}: #{error}") if config.debug?
+        end
+      end
+    end
+  end
+end
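
The decision logic in `fetch` (status check, content-type check, then success) can be exercised without any network access. A minimal sketch, assuming a trimmed-down `FetchRecord` without `pathname`; the `classify` and `failed` helpers are hypothetical names, not part of the gem:

```ruby
HTML_CONTENT_TYPES = %w[text/html application/xhtml+xml].freeze

# Hypothetical, trimmed-down record; the gem's FetchRecord also carries pathname.
FetchRecord = Data.define(:requested_url, :final_url, :html, :content_type, :ok, :error, :redirected)

def html?(content_type)
  # Substring match so parameters such as "; charset=utf-8" are tolerated.
  HTML_CONTENT_TYPES.any? { |type| content_type.include?(type) }
end

def failed(url, error)
  FetchRecord.new(requested_url: url, final_url: nil, html: nil,
    content_type: nil, ok: false, error: error, redirected: false)
end

def classify(url:, final_url:, status:, content_type:, body:)
  return failed(url, "status #{status}") unless status == 200
  return failed(url, "non-html content") unless html?(content_type)

  FetchRecord.new(requested_url: url, final_url: final_url, html: body,
    content_type: content_type, ok: true, error: nil, redirected: final_url != url)
end

page = classify(url: "https://example.com/a", final_url: "https://example.com/b",
  status: 200, content_type: "text/html; charset=utf-8", body: "<html></html>")
missing = classify(url: "https://example.com/x", final_url: "https://example.com/x",
  status: 404, content_type: "text/html", body: "")
pdf = classify(url: "https://example.com/f.pdf", final_url: "https://example.com/f.pdf",
  status: 200, content_type: "application/pdf", body: "")
```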

data/lib/nous/crawler/link_extractor.rb CHANGED

@@ -1,5 +1,7 @@
 # frozen_string_literal: true
+require "nokogiri"
 module Nous
   class Crawler < Command
     class LinkExtractor
@@ -8,9 +10,7 @@ module Nous
       end
       def extract(current_url, html)
-        base_uri = URI.parse(current_url)
-        anchors(html).filter_map { |href| resolve(base_uri, href) }.uniq
+        anchors(html).filter_map { |href| resolve(current_url, href) }.uniq
       end
       private
@@ -21,19 +21,19 @@ module Nous
         Nokogiri::HTML(html).css("a[href]").map { |node| node["href"] }
       end
-      def resolve(base_uri, href)
+      def resolve(current_url, href)
         return unless url_filter.allowed?(href)
-        uri = URI.join(base_uri, href)
-        return unless url_filter.same_host?(uri)
+        result = UrlResolver.call(base_url: current_url, href:)
+        return unless result.success?
+        url = result.payload
+        return unless url_filter.same_host?(url)
-        canonical = url_filter.canonicalize(uri)
-        return unless url_filter.matches_path?(URI.parse(canonical).path)
+        canonical = url_filter.canonicalize(url)
+        return unless url_filter.matches_path?(Url.new(canonical).path)
         canonical
-      rescue URI::InvalidURIError => e
-        warn("[nous] malformed href #{href.inspect}: #{e.message}") if Nous.configuration.verbose?
-        nil
       end
     end
   end
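
`UrlResolver` and `UrlFilter` are not shown in this part of the diff, so the following is only an approximation of the resolve-and-filter step using the standard-library `URI`. The `resolve_same_host` helper is hypothetical. Unlike the gem, which per the changelog handles paths containing spaces, this sketch simply skips hrefs that `URI` rejects.

```ruby
require "uri"

# Hypothetical helper: resolve hrefs against the current page and keep
# unique same-host HTTP(S) URLs, with fragments dropped.
def resolve_same_host(base_url, hrefs)
  base_host = URI.parse(base_url).host
  hrefs.filter_map do |href|
    uri = URI.join(base_url, href)
    next unless uri.is_a?(URI::HTTP) && uri.host == base_host

    uri.fragment = nil
    uri.to_s
  rescue URI::InvalidURIError
    nil # stdlib URI rejects unescaped spaces; skipped here
  end.uniq
end

links = resolve_same_host(
  "https://example.com/docs/intro",
  ["/about", "guide#top", "guide", "https://other.test/x", "bad href with space"]
)
```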

data/lib/nous/crawler/recursive_page_fetcher.rb ADDED

@@ -0,0 +1,103 @@
+# frozen_string_literal: true
+require "async"
+require "async/http/internet"
+module Nous
+  class Crawler < Command
+    class RecursivePageFetcher < Command
+      def initialize(seed_url:, http_client: nil)
+        @seed_uri = Url.new(seed_url)
+        @http_client = http_client
+        @records = []
+        @queue = [url_filter.canonicalize(seed_uri)]
+        @seen = Set.new(queue)
+      end
+      def call
+        suppress_async_warnings unless config.debug?
+        open_connection do |client|
+          crawl(client)
+        end
+        success(payload: records)
+      end
+      private
+      attr_reader :seed_uri, :http_client, :records, :queue, :seen
+      def config
+        Nous.configuration
+      end
+      def crawl(client)
+        fetch_and_enqueue(queue.shift(config.concurrency), client) while queue.any? && within_limit?
+      end
+      def fetch_and_enqueue(batch, client)
+        fetch_batch(batch, client).each do |record|
+          next unless record.ok
+          break unless within_limit?
+          records << record
+          seen << record.final_url
+          enqueue_links(record)
+        end
+      end
+      def fetch_batch(urls, client)
+        tasks = []
+        Async do |task|
+          urls.each do |url|
+            tasks << task.async { page_fetcher(client).fetch(url) }
+          end
+        end.wait
+        tasks.map(&:wait)
+      end
+      def enqueue_links(record)
+        link_extractor.extract(record.final_url, record.html).each do |url|
+          next if seen.include?(url)
+          seen << url
+          queue << url
+        end
+      end
+      def within_limit?
+        records.count(&:ok) < config.limit
+      end
+      def open_connection
+        client = http_client || Async::HTTP::Internet.new
+        Async do
+          yield client
+        ensure
+          client.close
+        end.wait
+      end
+      def page_fetcher(client)
+        AsyncPageFetcher.new(client:, seed_host: seed_uri.host)
+      end
+      def url_filter
+        @url_filter ||= UrlFilter.new(seed_uri:)
+      end
+      def link_extractor
+        @link_extractor ||= LinkExtractor.new(url_filter:)
+      end
+      def suppress_async_warnings
+        require "console"
+        Console.logger.level = :error
+      end
+    end
+  end
+end
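
The crawl loop above is a breadth-first traversal that takes `concurrency` URLs off the queue per batch and stops once `limit` pages have succeeded. A synchronous sketch of that control flow, assuming an in-memory link graph in place of HTTP fetching and link extraction; the `crawl` helper and `site` hash are hypothetical:

```ruby
require "set"

# links: hypothetical site graph (URL => outgoing links) standing in for
# fetching a page and extracting its links. A URL missing from the hash
# simulates a failed fetch.
def crawl(seed, links:, concurrency: 3, limit: 100)
  queue = [seed]
  seen = Set.new(queue)
  visited = []

  while queue.any? && visited.size < limit
    queue.shift(concurrency).each do |url|
      break unless visited.size < limit
      next unless links.key?(url)

      visited << url
      # Set#add? returns nil for URLs already seen, so each URL is queued once.
      links[url].each { |link| queue << link if seen.add?(link) }
    end
  end

  visited
end

site = {
  "/" => ["/a", "/b"],
  "/a" => ["/", "/c"],
  "/b" => ["/c", "/missing"],
  "/c" => []
}

all_pages = crawl("/", links: site, concurrency: 2, limit: 10)
first_two = crawl("/", links: site, concurrency: 2, limit: 2)
```

In the gem each batch is fetched concurrently with `Async`; the batching, deduplication, and limit check are the parts this sketch mirrors.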

data/lib/nous/crawler/redirect_follower.rb ADDED

@@ -0,0 +1,60 @@
+# frozen_string_literal: true
+module Nous
+  class Crawler < Command
+    class RedirectFollower < Command
+      class RedirectError < StandardError; end
+      MAX_HOPS = 5
+      def initialize(client:, seed_host:, url:, hops_remaining: MAX_HOPS)
+        @client = client
+        @seed_host = seed_host
+        @url = url
+        @hops_remaining = hops_remaining
+      end
+      def call
+        response = client.get(url, {})
+        return success(payload: [response, Url.new(url)]) unless redirect?(response.status)
+        response.close
+        follow(response.headers["location"])
+      end
+      private
+      attr_reader :client, :seed_host, :url, :hops_remaining
+      def redirect?(status)
+        (300..399).cover?(status)
+      end
+      def follow(location)
+        target = resolve_target(location)
+        return target if target.failure?
+        self.class.call(client:, seed_host:, url: target.payload.to_s, hops_remaining: hops_remaining - 1)
+      end
+      def resolve_target(location)
+        return failure(RedirectError.new("redirect without location from #{url}")) unless location
+        return failure(RedirectError.new("too many redirects from #{url}")) if hops_remaining <= 0
+        result = UrlResolver.call(base_url: url, href: location)
+        return failure(RedirectError.new(result.error.message)) if result.failure?
+        unless safe?(result.payload)
+          return failure(RedirectError.new("redirect to #{result.payload} outside #{seed_host}"))
+        end
+        result
+      end
+      def safe?(target)
+        target.http? && target.host == seed_host
+      end
+    end
+  end
+end
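
The redirect rules above (a hop budget, a required `Location` header, and a same-host HTTP(S) target) can be traced with a small simulation. This is a sketch under stated assumptions: `follow_redirects` and the `responses` hash are hypothetical, the hash replaces the HTTP client, and standard-library `URI.join` replaces the gem's `UrlResolver`.

```ruby
require "uri"

MAX_HOPS = 5

# responses: hypothetical map of URL => [status, location header].
# Returns [:ok, final_url] or [:error, message].
def follow_redirects(url, responses:, seed_host:, hops_remaining: MAX_HOPS)
  status, location = responses.fetch(url)
  return [:ok, url] unless (300..399).cover?(status)
  return [:error, "redirect without location from #{url}"] unless location
  return [:error, "too many redirects from #{url}"] if hops_remaining <= 0

  target = URI.join(url, location)
  unless target.is_a?(URI::HTTP) && target.host == seed_host
    return [:error, "redirect to #{target} outside #{seed_host}"]
  end

  follow_redirects(target.to_s, responses: responses, seed_host: seed_host,
    hops_remaining: hops_remaining - 1)
end

responses = {
  "https://example.com/old" => [301, "/new"],
  "https://example.com/new" => [200, nil],
  "https://example.com/out" => [302, "https://evil.test/"],
  "https://example.com/loop" => [302, "/loop"]
}

moved = follow_redirects("https://example.com/old", responses: responses, seed_host: "example.com")
offsite = follow_redirects("https://example.com/out", responses: responses, seed_host: "example.com")
looping = follow_redirects("https://example.com/loop", responses: responses, seed_host: "example.com")
```

Refusing off-host targets keeps a crawl from being steered to another site by a redirect, and the hop budget bounds redirect loops.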