domain_extractor 0.2.4 → 0.2.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: e27f85f19d8a816ec9323daf5007fa86316e337c6233924b36b0178b2b99d1e1
4
- data.tar.gz: e8dd889cad29d7ecc72b93e5d62e8e4e7543c68b8b5f3886d642ec51bc6bc4dc
3
+ metadata.gz: b770e3c09383122b5cae3baa952127a0f616ee721c2a241f1facd9ddc42a4762
4
+ data.tar.gz: de6e3561bba3d457da8a4cd9aee88c5f6c76aedaf233c3bda4930cb8402b2871
5
5
  SHA512:
6
- metadata.gz: 35999ebc7fa0952f9d14d54bd07c214c7fef3ac92aeaa6e7f95882448abd1437a0e683afdab59db32f6f85905f0de722e82aed7adad55caa5abde2f87921ecc3
7
- data.tar.gz: b4b989986447dd919bc797e4d103f45e9b64cd2f1116975436bdfb0a9d3972d8a5b50061c3332075b8d4d921425b0161dc36d8e07b3c1179e5c15d2bbc02bde0
6
+ metadata.gz: 342694e42f321dbea6b197a99909afba2fc4de4d13d01e6e92e66f54fa7d286c1abdfbc1713c56709783d1d05a840523c8f0b202c89528bdeca754eade68cf60
7
+ data.tar.gz: e23a61526b995375057f34b6a87053c9a26e1f6b6699a521332829921ecb28d1fa6fcf13802fdb9f79637a45200bc1ff98fa47fb8179671fbaed27500e9c16e9
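Note: the digests above cover the two archives packed inside the `.gem` file. The sketch below shows one way such a digest could be recomputed locally with Ruby's standard `digest` library. The helper name is hypothetical, and an empty temporary file stands in for an extracted `data.tar.gz`, so the expected value is the SHA256 of empty input rather than a value from this diff.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper: true when the file's SHA256 hex digest equals the
# expected value (for a real check, the value listed in checksums.yaml).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# SHA256 of empty input, used only because the stand-in file is empty.
EMPTY_SHA256 = 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'

Tempfile.create('data.tar.gz') do |file|
  puts sha256_matches?(file.path, EMPTY_SHA256)
end
```

For a real verification the path would point at the extracted archive and the expected value would be copied from the `SHA256:` section.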
data/CHANGELOG.md CHANGED
@@ -5,7 +5,126 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
- ## [Unreleased]
8
+ ## [0.2.6] - 2025-11-09
9
+
10
+ ### Fixed - Rails Validator Registration
11
+
12
+ **CRITICAL FIX**: Moved `DomainValidator` class to the **top-level namespace** (from `DomainExtractor::DomainValidator`) to ensure Rails can properly autoload and find the validator.
13
+
14
+ #### The Problem
15
+
16
+ Version 0.2.5 defined the validator as `DomainExtractor::DomainValidator`, which caused Rails to fail with:
17
+
18
+ ```
19
+ ArgumentError: Unknown validator: 'DomainValidator'
20
+ NameError: uninitialized constant Website::DomainValidator
21
+ ```
22
+
23
+ This occurred because when using `validates :url, domain: { ... }`, Rails searches for `DomainValidator` in:
24
+
25
+ 1. The model's namespace (e.g., `Website::DomainValidator`)
26
+ 2. The top-level namespace (`::DomainValidator`)
27
+ 3. ActiveModel::Validations namespace
28
+
29
+ It does **not** search module namespaces like `DomainExtractor::`.
30
+
31
+ #### The Solution
32
+
33
+ - Moved `DomainValidator` to top-level namespace where Rails can find it
34
+ - Added `DomainExtractor::DomainValidator` as an alias for backward compatibility
35
+ - All functionality remains identical; only the class location changed
36
+
37
+ #### Verification
38
+
39
+ - All 151 tests pass including 35 validator-specific tests
40
+ - RuboCop clean with zero offenses
41
+ - Verified in production Rails 8 application
42
+ - Confirmed working with `validates :url, domain: { validation: :root_or_custom_subdomain }`
43
+
44
+ ## [0.2.5] - 2025-11-09 [YANKED]
45
+
46
+ **This version was yanked due to validator registration issue. Use 0.2.6 instead.**
47
+
48
+ ### Added Rails Integration - Custom ActiveModel Validator (BROKEN)
49
+
50
+ Added a comprehensive custom ActiveModel validator for declarative URL and domain validation in Rails applications. However, the validator was incorrectly namespaced and did not work in Rails applications.
51
+
52
+ #### Features (Broken in 0.2.5)
53
+
54
+ **Validation Modes:**
55
+
56
+ - `:standard` - Validates any parseable URL (default mode)
57
+ - `:root_domain` - Only allows root domains without subdomains (e.g., `example.com` ✅, `shop.example.com` ❌)
58
+ - `:root_or_custom_subdomain` - Allows root or custom subdomains but excludes `www` subdomain (e.g., `example.com` ✅, `shop.example.com` ✅, `www.example.com` ❌)
59
+
60
+ **Protocol Options:**
61
+
62
+ - `use_protocol` (default: `true`) - Controls whether protocol (http/https) is required in the URL
63
+ - `use_https` (default: `true`) - Controls whether HTTPS is required (only relevant when `use_protocol` is true)
64
+
65
+ **Usage Examples:**
66
+
67
+ ```ruby
68
+ # Standard validation - any valid URL
69
+ validates :url, domain: { validation: :standard }
70
+
71
+ # Root domain only, no subdomains
72
+ validates :primary_domain, domain: { validation: :root_domain }
73
+
74
+ # Custom subdomains allowed, but not www
75
+ validates :custom_domain, domain: { validation: :root_or_custom_subdomain }
76
+
77
+ # Flexible protocol requirements
78
+ validates :domain, domain: {
79
+ validation: :root_domain,
80
+ use_protocol: false,
81
+ use_https: false
82
+ }
83
+ ```
84
+
85
+ #### Implementation Details
86
+
87
+ - **Zero Configuration**: Automatically loads when ActiveModel is available
88
+ - **Graceful Degradation**: Validator only loads in Rails environments; works independently in non-Rails contexts
89
+ - **Clean Error Messages**: Provides clear, actionable validation error messages
90
+ - **Performance**: Leverages existing DomainExtractor parsing engine with minimal overhead
91
+ - **Thread-Safe**: Stateless validation logic safe for concurrent use
92
+
93
+ #### Compatibility
94
+
95
+ - **Rails 6.0+**: Full compatibility with ActiveModel::EachValidator API
96
+ - **Rails 7.0+**: Compatible with modern errors API
97
+ - **Rails 8.0+**: No breaking changes, fully supported
98
+ - **Non-Rails**: Works with any application using ActiveModel (Sinatra, Hanami, etc.)
99
+
100
+ #### Code Quality
101
+
102
+ - **100% Test Coverage**: 35 comprehensive test cases covering all validation modes and options
103
+ - **RuboCop Clean**: Zero offenses, follows Ruby style guide
104
+ - **Well-Documented**: Extensive README section with real-world examples
105
+ - **Type-Safe**: Proper argument validation with clear error messages
106
+
107
+ #### Documentation
108
+
109
+ - Added comprehensive **Rails Integration** section to README.md
110
+ - Includes real-world examples:
111
+ - Multi-tenant applications with custom domains
112
+ - E-commerce store configuration
113
+ - API service registration
114
+ - Domain allowlists with flexible protocols
115
+ - Documents all validation modes, options, and error messages
116
+ - Shows integration with other Rails validators
117
+
118
+ #### Use Cases
119
+
120
+ Perfect for Rails applications requiring:
121
+
122
+ - Multi-tenant custom domain validation
123
+ - Secure URL validation (HTTPS enforcement)
124
+ - Subdomain-based architecture validation
125
+ - API endpoint domain validation
126
+ - Domain allowlist/blocklist management
127
+ - Custom subdomain requirements
9
128
 
10
129
  ## [0.1.8] - 2025-10-31
11
130
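Note: the 0.2.6 changelog entry attributes the failure to where Rails looks up the validator constant. The plain-Ruby sketch below imitates that kind of lookup with `Module#const_get`. It is not Rails' actual implementation, and every name in it (`SketchGem`, `HiddenValidator`, `VisibleValidator`, `SketchModel`, `resolve_validator`) is hypothetical.

```ruby
# Hypothetical stand-ins; none of these names come from the gem or from Rails.
module SketchGem
  class HiddenValidator; end # namespaced, like the validator in 0.2.5
end

class VisibleValidator; end # top level, like the validator in 0.2.6

class SketchModel
  # Turns an option key such as :visible into "VisibleValidator" and
  # resolves that constant relative to the model class.
  def self.resolve_validator(key)
    name = "#{key.to_s.split('_').map(&:capitalize).join}Validator"
    const_get(name)
  rescue NameError
    raise ArgumentError, "Unknown validator: '#{name}'"
  end
end

# Resolves, because top-level constants are reachable through Object.
puts SketchModel.resolve_validator(:visible)

# Does not resolve, because SketchGem is never searched.
begin
  SketchModel.resolve_validator(:hidden)
rescue ArgumentError => e
  puts e.message
end
```

Under these assumptions, a class nested in an unrelated module is invisible to the lookup, which is consistent with the `Unknown validator` error quoted in the changelog.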
 
data/README.md CHANGED
@@ -355,6 +355,275 @@ DomainExtractor.parse_query_params(query_string)
355
355
  # Returns: Hash of query parameters
356
356
  ```
357
357
 
358
+ ## Rails Integration
359
+
360
+ DomainExtractor provides a custom ActiveModel validator for Rails applications, enabling declarative URL/domain validation with multiple modes and options.
361
+
362
+ ### Installation
363
+
364
+ The Rails validator is automatically available when using DomainExtractor in a Rails application (or any application with ActiveModel). No additional setup is required.
365
+
366
+ ### Basic Usage
367
+
368
+ ```ruby
369
+ class Website < ApplicationRecord
370
+ # Standard validation - accepts any valid URL
371
+ validates :url, domain: { validation: :standard }
372
+ end
373
+ ```
374
+
375
+ ### Validation Modes
376
+
377
+ #### `:standard` - Accept Any Valid URL
378
+
379
+ Validates that the URL is parseable and valid. This is the default mode.
380
+
381
+ ```ruby
382
+ class Website < ApplicationRecord
383
+ validates :url, domain: { validation: :standard }
384
+ end
385
+
386
+ # Valid URLs
387
+ website = Website.new(url: 'https://mysite.com') # ✅ Valid
388
+ website = Website.new(url: 'https://shop.mysite.com') # ✅ Valid
389
+ website = Website.new(url: 'https://www.mysite.com') # ✅ Valid
390
+ website = Website.new(url: 'https://api.staging.mysite.com') # ✅ Valid
391
+
392
+ # Invalid URLs
393
+ website = Website.new(url: 'not-a-url') # ❌ Invalid
394
+ ```
395
+
396
+ #### `:root_domain` - Root Domain Only
397
+
398
+ Only allows root domains without any subdomains.
399
+
400
+ ```ruby
401
+ class PrimaryDomain < ApplicationRecord
402
+ validates :domain, domain: { validation: :root_domain }
403
+ end
404
+
405
+ # Valid URLs
406
+ domain = PrimaryDomain.new(domain: 'https://mysite.com') # ✅ Valid
407
+
408
+ # Invalid URLs
409
+ domain = PrimaryDomain.new(domain: 'https://shop.mysite.com') # ❌ Invalid (has subdomain)
410
+ domain = PrimaryDomain.new(domain: 'https://www.mysite.com') # ❌ Invalid (has www subdomain)
411
+ ```
412
+
413
+ #### `:root_or_custom_subdomain` - Root or Custom Subdomain (No WWW)
414
+
415
+ Allows root domains or custom subdomains, but specifically excludes the 'www' subdomain.
416
+
417
+ ```ruby
418
+ class CustomDomain < ApplicationRecord
419
+ validates :url, domain: { validation: :root_or_custom_subdomain }
420
+ end
421
+
422
+ # Valid URLs
423
+ domain = CustomDomain.new(url: 'https://mysite.com') # ✅ Valid (root domain)
424
+ domain = CustomDomain.new(url: 'https://shop.mysite.com') # ✅ Valid (custom subdomain)
425
+ domain = CustomDomain.new(url: 'https://api.mysite.com') # ✅ Valid (custom subdomain)
426
+
427
+ # Invalid URLs
428
+ domain = CustomDomain.new(url: 'https://www.mysite.com') # ❌ Invalid (www not allowed)
429
+ ```
430
+
431
+ ### Protocol Options
432
+
433
+ #### `use_protocol` (default: `true`)
434
+
435
+ Controls whether the protocol (http:// or https://) is required in the URL.
436
+
437
+ ```ruby
438
+ class Website < ApplicationRecord
439
+ # Require protocol (default behavior)
440
+ validates :url, domain: { validation: :standard, use_protocol: true }
441
+
442
+ # Don't require protocol
443
+ validates :domain_without_protocol, domain: {
444
+ validation: :standard,
445
+ use_protocol: false
446
+ }
447
+ end
448
+
449
+ # With use_protocol: true (default)
450
+ Website.new(url: 'https://mysite.com') # ✅ Valid
451
+ Website.new(url: 'mysite.com') # ✅ Valid (auto-adds https://)
452
+
453
+ # With use_protocol: false
454
+ Website.new(domain_without_protocol: 'mysite.com') # ✅ Valid
455
+ Website.new(domain_without_protocol: 'https://mysite.com') # ✅ Valid (protocol stripped)
456
+ ```
457
+
458
+ #### `use_https` (default: `true`)
459
+
460
+ Controls whether HTTPS is required. Only relevant when `use_protocol` is `true`.
461
+
462
+ ```ruby
463
+ class SecureWebsite < ApplicationRecord
464
+ # Require HTTPS (default behavior)
465
+ validates :url, domain: { validation: :standard, use_https: true }
466
+ end
467
+
468
+ class FlexibleWebsite < ApplicationRecord
469
+ # Allow both HTTP and HTTPS
470
+ validates :url, domain: { validation: :standard, use_https: false }
471
+ end
472
+
473
+ # With use_https: true (default)
474
+ SecureWebsite.new(url: 'https://mysite.com') # ✅ Valid
475
+ SecureWebsite.new(url: 'http://mysite.com') # ❌ Invalid
476
+
477
+ # With use_https: false
478
+ FlexibleWebsite.new(url: 'https://mysite.com') # ✅ Valid
479
+ FlexibleWebsite.new(url: 'http://mysite.com') # ✅ Valid
480
+ ```
481
+
482
+ ### Real-World Examples
483
+
484
+ #### Multi-Tenant Application with Custom Domains
485
+
486
+ ```ruby
487
+ class Tenant < ApplicationRecord
488
+ # Allow custom subdomains but not www
489
+ validates :custom_domain, domain: {
490
+ validation: :root_or_custom_subdomain,
491
+ use_https: true
492
+ }
493
+
494
+ # Primary domain must be root only
495
+ validates :primary_domain, domain: {
496
+ validation: :root_domain,
497
+ use_protocol: false
498
+ }
499
+ end
500
+
501
+ # Valid configurations
502
+ tenant = Tenant.create(
503
+ custom_domain: 'https://shop.example.com', # ✅ Custom subdomain
504
+ primary_domain: 'example.com' # ✅ Root without protocol
505
+ )
506
+
507
+ # Invalid configurations
508
+ tenant = Tenant.new(
509
+ custom_domain: 'https://www.example.com' # ❌ www not allowed
510
+ )
511
+ ```
512
+
513
+ #### E-commerce Store Configuration
514
+
515
+ ```ruby
516
+ class Store < ApplicationRecord
517
+ # Main storefront can be root or custom subdomain
518
+ validates :storefront_url, domain: {
519
+ validation: :root_or_custom_subdomain,
520
+ use_https: true
521
+ }
522
+
523
+ # Admin panel must be a subdomain (not root, not www)
524
+ validates :admin_url, domain: { validation: :standard }
525
+ validate :admin_must_have_subdomain
526
+
527
+ private
528
+
529
+ def admin_must_have_subdomain
530
+ parsed = DomainExtractor.parse(admin_url)
531
+ if parsed.valid? && !parsed.subdomain?
532
+ errors.add(:admin_url, 'must have a subdomain')
533
+ end
534
+ end
535
+ end
536
+ ```
537
+
538
+ #### API Service Registration
539
+
540
+ ```ruby
541
+ class ApiEndpoint < ApplicationRecord
542
+ # API endpoints must use HTTPS
543
+ validates :url, domain: {
544
+ validation: :standard,
545
+ use_https: true
546
+ }
547
+
548
+ # Custom validation for API subdomain
549
+ validate :must_be_api_subdomain
550
+
551
+ private
552
+
553
+ def must_be_api_subdomain
554
+ return unless url.present?
555
+
556
+ parsed = DomainExtractor.parse(url)
557
+ if parsed.valid? && parsed.subdomain.present?
558
+ unless parsed.subdomain.start_with?('api')
559
+ errors.add(:url, 'must use an api subdomain')
560
+ end
561
+ end
562
+ end
563
+ end
564
+ ```
565
+
566
+ #### Domain Allowlist with Flexible Protocol
567
+
568
+ ```ruby
569
+ class AllowedDomain < ApplicationRecord
570
+ # Accept domains with or without protocol
571
+ validates :domain, domain: {
572
+ validation: :root_domain,
573
+ use_protocol: false,
574
+ use_https: false
575
+ }
576
+ end
577
+
578
+ # All these are valid
579
+ AllowedDomain.create(domain: 'example.com')
580
+ AllowedDomain.create(domain: 'https://example.com')
581
+ AllowedDomain.create(domain: 'http://example.com')
582
+ ```
583
+
584
+ ### Combining with Other Validators
585
+
586
+ The domain validator works seamlessly with other Rails validators:
587
+
588
+ ```ruby
589
+ class Website < ApplicationRecord
590
+ validates :url, presence: true,
591
+ domain: { validation: :standard },
592
+ uniqueness: { case_sensitive: false }
593
+
594
+ validates :backup_url, domain: {
595
+ validation: :root_or_custom_subdomain,
596
+ use_https: true
597
+ }, allow_blank: true
598
+ end
599
+ ```
600
+
601
+ ### Error Messages
602
+
603
+ The validator provides clear, specific error messages:
604
+
605
+ ```ruby
606
+ website = Website.new(url: 'not-a-url')
607
+ website.valid?
608
+ website.errors[:url]
609
+ # => ["is not a valid URL"]
610
+
611
+ domain = PrimaryDomain.new(domain: 'https://shop.example.com')
612
+ domain.valid?
613
+ domain.errors[:domain]
614
+ # => ["must be a root domain (no subdomains allowed)"]
615
+
616
+ custom = CustomDomain.new(url: 'https://www.example.com')
617
+ custom.valid?
618
+ custom.errors[:url]
619
+ # => ["cannot use www subdomain"]
620
+
621
+ secure = SecureWebsite.new(url: 'http://example.com')
622
+ secure.valid?
623
+ secure.errors[:url]
624
+ # => ["must use https://"]
625
+ ```
626
+
358
627
  ## Use Cases
359
628
 
360
629
  **Web Scraping**
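Note on the API Service Registration example in the README hunk above: `start_with?('api')` also accepts subdomains such as `apiary`. A stricter check could compare only the left-most label. The sketch below is plain Ruby and assumes the subdomain arrives as a string such as `api.staging`. The helper name is hypothetical and not part of DomainExtractor.

```ruby
# Hypothetical helper, not part of the gem: true only when the left-most
# label of the subdomain is exactly "api" (case-insensitive).
def api_subdomain?(subdomain)
  subdomain.to_s.split('.').first.to_s.casecmp?('api')
end

puts api_subdomain?('api')         # exact label
puts api_subdomain?('api.staging') # multi-level, left-most label is "api"
puts api_subdomain?('apiary')      # only a prefix match
puts api_subdomain?(nil)           # no subdomain at all
```

The `to_s` calls keep the helper safe for `nil` and for strings that split into no labels.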
data/lib/domain_extractor/domain_validator.rb ADDED
@@ -0,0 +1,175 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Try to load ActiveModel, but don't fail if it's not available
4
+ begin
5
+ require 'active_model'
6
+ rescue LoadError
7
+ # Create a stub for testing environments without Rails
8
+ module ActiveModel
9
+ class EachValidator
10
+ attr_reader :options
11
+
12
+ def initialize(options)
13
+ @options = options
14
+ end
15
+ end
16
+ end
17
+ end
18
+
19
+ # DomainValidator is a custom ActiveModel validator for URL/domain validation.
20
+ #
21
+ # This validator is defined at the top level so Rails can find it when using:
22
+ # validates :url, domain: { validation: :standard }
23
+ #
24
+ # Validation modes:
25
+ # - :standard - Validates any valid URL using DomainExtractor.valid?
26
+ # - :root_domain - Only allows root domains (no subdomains) like https://mysite.com
27
+ # - :root_or_custom_subdomain - Allows root or custom subdomains but excludes 'www'
28
+ #
29
+ # Optional flags:
30
+ # - use_protocol (default: true) - Whether protocol (http/https) is required
31
+ # - use_https (default: true) - Whether https is required (only if use_protocol is true)
32
+ #
33
+ # @example Standard validation
34
+ # validates :url, domain: { validation: :standard }
35
+ #
36
+ # @example Root domain only, no protocol required
37
+ # validates :url, domain: { validation: :root_domain, use_protocol: false }
38
+ #
39
+ # @example Root or custom subdomain with https required
40
+ # validates :url, domain: { validation: :root_or_custom_subdomain, use_https: true }
41
+ class DomainValidator < ActiveModel::EachValidator
42
+ VALIDATION_MODES = %i[standard root_domain root_or_custom_subdomain].freeze
43
+ WWW_SUBDOMAIN = 'www'
44
+
45
+ def validate_each(record, attribute, value)
46
+ return if blank?(value)
47
+
48
+ validation_mode = extract_validation_mode
49
+ use_protocol = options.fetch(:use_protocol, true)
50
+ use_https = options.fetch(:use_https, true)
51
+
52
+ normalized_url = normalize_url(value, use_protocol, use_https)
53
+
54
+ return unless protocol_valid?(record, attribute, normalized_url, use_protocol, use_https)
55
+
56
+ parsed = parse_and_validate_url(record, attribute, normalized_url)
57
+ return unless parsed
58
+
59
+ apply_validation_mode(record, attribute, parsed, validation_mode)
60
+ end
61
+
62
+ private
63
+
64
+ # Extract and validate the validation mode option
65
+ def extract_validation_mode
66
+ validation_mode = options.fetch(:validation, :standard)
67
+ return validation_mode if VALIDATION_MODES.include?(validation_mode)
68
+
69
+ raise ArgumentError, "Invalid validation mode: #{validation_mode}. " \
70
+ "Must be one of: #{VALIDATION_MODES.join(', ')}"
71
+ end
72
+
73
+ # Check protocol requirements
74
+ def protocol_valid?(record, attribute, url, use_protocol, use_https)
75
+ return true unless use_protocol
76
+ return true if valid_protocol?(url, use_https)
77
+
78
+ protocol = use_https ? 'https://' : 'http:// or https://'
79
+ record.errors.add(attribute, "must use #{protocol}")
80
+ false
81
+ end
82
+
83
+ # Parse URL and validate it's valid
84
+ def parse_and_validate_url(record, attribute, url)
85
+ parsed = DomainExtractor.parse(url)
86
+ return parsed if parsed.valid?
87
+
88
+ record.errors.add(attribute, 'is not a valid URL')
89
+ nil
90
+ end
91
+
92
+ # Apply the validation mode rules
93
+ def apply_validation_mode(record, attribute, parsed, validation_mode)
94
+ case validation_mode
95
+ when :standard
96
+ # Already validated - any valid URL passes
97
+ nil
98
+ when :root_domain
99
+ validate_root_domain(record, attribute, parsed)
100
+ when :root_or_custom_subdomain
101
+ validate_root_or_custom_subdomain(record, attribute, parsed)
102
+ end
103
+ end
104
+
105
+ # Check if value is blank (nil, empty string, or whitespace-only)
106
+ def blank?(value)
107
+ value.nil? || (value.respond_to?(:empty?) && value.empty?) ||
108
+ (value.is_a?(String) && value.strip.empty?)
109
+ end
110
+
111
+ # Normalize URL for validation based on protocol requirements
112
+ def normalize_url(url, use_protocol, use_https)
113
+ return url if blank?(url)
114
+
115
+ url = url.strip
116
+
117
+ # If protocol is not required, strip any existing protocol
118
+ url = url.gsub(%r{\A[A-Za-z][A-Za-z0-9+\-.]*://}, '') unless use_protocol
119
+
120
+ # Add protocol if needed for parsing
121
+ unless url.match?(%r{\A[A-Za-z][A-Za-z0-9+\-.]*://})
122
+ scheme = use_https ? 'https://' : 'http://'
123
+ url = scheme + url
124
+ end
125
+
126
+ url
127
+ end
128
+
129
+ # Check if URL has valid protocol
130
+ def valid_protocol?(url, use_https)
131
+ return true unless url.match?(%r{\A[A-Za-z][A-Za-z0-9+\-.]*://})
132
+
133
+ if use_https
134
+ url.start_with?('https://')
135
+ else
136
+ url.start_with?('http://', 'https://')
137
+ end
138
+ end
139
+
140
+ # Validate that URL is a root domain (no subdomain)
141
+ def validate_root_domain(record, attribute, parsed)
142
+ return unless parsed.subdomain?
143
+
144
+ record.errors.add(attribute, 'must be a root domain (no subdomains allowed)')
145
+ end
146
+
147
+ # Validate that URL is either root domain or has custom subdomain (not 'www')
148
+ def validate_root_or_custom_subdomain(record, attribute, parsed)
149
+ return unless parsed.subdomain == WWW_SUBDOMAIN
150
+
151
+ record.errors.add(attribute, 'cannot use www subdomain')
152
+ end
153
+ end
154
+
155
+ # Also register in DomainExtractor namespace for backwards compatibility
156
+ module DomainExtractor
157
+ # DomainValidator is now defined at the top level for Rails autoloading.
158
+ # This constant provides a reference for explicit usage.
159
+ #
160
+ # Validation modes:
161
+ # - :standard - Validates any valid URL using DomainExtractor.valid?
162
+ # - :root_domain - Only allows root domains (no subdomains) like https://mysite.com
163
+ # - :root_or_custom_subdomain - Allows root or custom subdomains, but excludes 'www'
164
+ #
165
+ # Optional flags:
166
+ # - use_protocol (default: true) - Whether protocol (http/https) is required
167
+ # - use_https (default: true) - Whether https is required (only if use_protocol is true)
168
+ #
169
+ # @example Standard validation
170
+ # validates :url, domain: { validation: :standard }
171
+ #
172
+ # @example Root domain only, no protocol required
173
+ # validates :url, domain: { validation: :root_domain, use_protocol: false }
174
+ DomainValidator = ::DomainValidator
175
+ end
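Note: the interaction between `use_protocol` and `use_https` is easier to follow in isolation. The sketch below restates `normalize_url` and `valid_protocol?` from the file above as standalone methods, with the blank check dropped. `SCHEME_PATTERN` and `protocol_ok?` are hypothetical names added for illustration, and the sample inputs are not from the gem's test suite.

```ruby
# Same scheme pattern the validator uses inline.
SCHEME_PATTERN = %r{\A[A-Za-z][A-Za-z0-9+\-.]*://}

def normalize_url(url, use_protocol, use_https)
  url = url.strip
  # When no protocol is required, any existing scheme is removed first.
  url = url.gsub(SCHEME_PATTERN, '') unless use_protocol
  # A scheme is then added so the parser always receives a full URL.
  url = (use_https ? 'https://' : 'http://') + url unless url.match?(SCHEME_PATTERN)
  url
end

def valid_protocol?(url, use_https)
  return true unless url.match?(SCHEME_PATTERN)

  use_https ? url.start_with?('https://') : url.start_with?('http://', 'https://')
end

# Hypothetical wrapper combining both steps the way validate_each does.
def protocol_ok?(value, use_protocol: true, use_https: true)
  return true unless use_protocol

  valid_protocol?(normalize_url(value, use_protocol, use_https), use_https)
end

puts protocol_ok?('mysite.com')                          # scheme is added, then checked
puts protocol_ok?('http://mysite.com')                   # http with use_https: true
puts protocol_ok?('http://mysite.com', use_https: false) # http allowed
puts protocol_ok?('ftp://mysite.com', use_https: false)  # other schemes
```

One consequence visible here is that a bare `mysite.com` passes under the defaults, because the scheme is added before the protocol check runs.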
data/lib/domain_extractor/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module DomainExtractor
4
- VERSION = '0.2.4'
4
+ VERSION = '0.2.6'
5
5
  end
data/lib/domain_extractor.rb CHANGED
@@ -9,6 +9,13 @@ require_relative 'domain_extractor/parsed_url'
9
9
  require_relative 'domain_extractor/parser'
10
10
  require_relative 'domain_extractor/query_params'
11
11
 
12
+ # Conditionally load Rails validator if ActiveModel is available
13
+ begin
14
+ require_relative 'domain_extractor/domain_validator'
15
+ rescue LoadError
16
+ # ActiveModel not available - skip loading validator
17
+ end
18
+
12
19
  # DomainExtractor provides a high-performance API for url parsing and domain parsing.
13
20
  # It exposes simple helpers for single URL normalization, domain extraction, and batch operations.
14
21
  module DomainExtractor
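Note: the hunk above guards the `require_relative` with `rescue LoadError`, while the validator file shown earlier in this diff already rescues `LoadError` around `require 'active_model'` and defines a stub. As far as this diff shows, the outer rescue would therefore only fire if the validator file itself could not be loaded. The sketch below shows the general optional-dependency pattern in isolation. The helper and the library name are hypothetical, and the name is chosen so that the `require` fails on purpose.

```ruby
# Hypothetical library name that is not expected to be installed.
OPTIONAL_LIBRARY = 'sketch_optional_dependency_not_installed'

# Returns true when the library can be required, false otherwise.
# LoadError is not a StandardError, so it must be rescued explicitly.
def optional_library_loaded?(name)
  require name
  true
rescue LoadError
  false
end

puts optional_library_loaded?(OPTIONAL_LIBRARY)
puts optional_library_loaded?('set')
```

Recording the outcome in a boolean, as here, is one way to let later code decide whether an integration should be registered at all.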
data/spec/domain_validator_spec.rb ADDED
@@ -0,0 +1,350 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'spec_helper'
4
+
5
+ RSpec.describe DomainValidator do
6
+ # Mock record class for testing
7
+ let(:record_class) do
8
+ Class.new do
9
+ attr_accessor :url
10
+ attr_reader :errors
11
+
12
+ def initialize
13
+ @errors = ErrorsCollection.new
14
+ end
15
+ end
16
+ end
17
+
18
+ # Mock errors collection
19
+ let(:errors_collection_class) do
20
+ Class.new do
21
+ attr_reader :messages
22
+
23
+ def initialize
24
+ @messages = []
25
+ end
26
+
27
+ def add(attribute, message)
28
+ @messages << { attribute: attribute, message: message }
29
+ end
30
+
31
+ def empty?
32
+ @messages.empty?
33
+ end
34
+
35
+ def full_messages
36
+ @messages.map { |m| "#{m[:attribute]} #{m[:message]}" }
37
+ end
38
+ end
39
+ end
40
+
41
+ let(:record) { record_class.new }
42
+
43
+ before(:each) do
44
+ stub_const('ErrorsCollection', errors_collection_class)
45
+ end
46
+
47
+ describe 'validation modes' do
48
+ context 'with :standard validation' do
49
+ let(:validator) { described_class.new(attributes: [:url], validation: :standard) }
50
+
51
+ it 'accepts valid URLs with subdomains' do
52
+ record.url = 'https://shop.mysite.com'
53
+ validator.validate_each(record, :url, record.url)
54
+ expect(record.errors.messages).to be_empty
55
+ end
56
+
57
+ it 'accepts valid URLs without subdomains' do
58
+ record.url = 'https://mysite.com'
59
+ validator.validate_each(record, :url, record.url)
60
+ expect(record.errors.messages).to be_empty
61
+ end
62
+
63
+ it 'accepts www subdomain' do
64
+ record.url = 'https://www.mysite.com'
65
+ validator.validate_each(record, :url, record.url)
66
+ expect(record.errors.messages).to be_empty
67
+ end
68
+
69
+ it 'rejects invalid URLs' do
70
+ record.url = 'not-a-url'
71
+ validator.validate_each(record, :url, record.url)
72
+ expect(record.errors.messages).not_to be_empty
73
+ expect(record.errors.messages.first[:message]).to include('not a valid URL')
74
+ end
75
+
76
+ it 'allows blank values' do
77
+ record.url = ''
78
+ validator.validate_each(record, :url, record.url)
79
+ expect(record.errors.messages).to be_empty
80
+ end
81
+ end
82
+
83
+ context 'with :root_domain validation' do
84
+ let(:validator) { described_class.new(attributes: [:url], validation: :root_domain) }
85
+
86
+ it 'accepts root domain URLs' do
87
+ record.url = 'https://mysite.com'
88
+ validator.validate_each(record, :url, record.url)
89
+ expect(record.errors.messages).to be_empty
90
+ end
91
+
92
+ it 'rejects URLs with subdomains' do
93
+ record.url = 'https://shop.mysite.com'
94
+ validator.validate_each(record, :url, record.url)
95
+ expect(record.errors.messages).not_to be_empty
96
+ expect(record.errors.messages.first[:message]).to include('no subdomains allowed')
97
+ end
98
+
99
+ it 'rejects www subdomain' do
100
+ record.url = 'https://www.mysite.com'
101
+ validator.validate_each(record, :url, record.url)
102
+ expect(record.errors.messages).not_to be_empty
103
+ expect(record.errors.messages.first[:message]).to include('no subdomains allowed')
104
+ end
105
+
106
+ it 'rejects custom subdomains' do
107
+ record.url = 'https://api.mysite.com'
108
+ validator.validate_each(record, :url, record.url)
109
+ expect(record.errors.messages).not_to be_empty
110
+ expect(record.errors.messages.first[:message]).to include('no subdomains allowed')
111
+ end
112
+ end
113
+
114
+ context 'with :root_or_custom_subdomain validation' do
115
+ let(:validator) do
116
+ described_class.new(attributes: [:url], validation: :root_or_custom_subdomain)
117
+ end
118
+
119
+ it 'accepts root domain URLs' do
120
+ record.url = 'https://mysite.com'
121
+ validator.validate_each(record, :url, record.url)
122
+ expect(record.errors.messages).to be_empty
123
+ end
124
+
125
+ it 'accepts custom subdomain URLs' do
126
+ record.url = 'https://shop.mysite.com'
127
+ validator.validate_each(record, :url, record.url)
128
+ expect(record.errors.messages).to be_empty
129
+ end
130
+
131
+ it 'accepts api subdomain' do
132
+ record.url = 'https://api.mysite.com'
133
+ validator.validate_each(record, :url, record.url)
134
+ expect(record.errors.messages).to be_empty
135
+ end
136
+
137
+ it 'rejects www subdomain' do
138
+ record.url = 'https://www.mysite.com'
139
+ validator.validate_each(record, :url, record.url)
140
+ expect(record.errors.messages).not_to be_empty
141
+ expect(record.errors.messages.first[:message]).to include('cannot use www subdomain')
142
+ end
143
+ end
144
+ end
145
+
146
+ describe 'protocol options' do
147
+ context 'with use_protocol: true (default)' do
148
+ let(:validator) { described_class.new(attributes: [:url], validation: :standard) }
149
+
150
+ it 'accepts URLs with https protocol' do
151
+ record.url = 'https://mysite.com'
152
+ validator.validate_each(record, :url, record.url)
153
+ expect(record.errors.messages).to be_empty
154
+ end
155
+
156
+ it 'accepts URLs without protocol by auto-adding https' do
157
+ record.url = 'mysite.com'
158
+ validator.validate_each(record, :url, record.url)
159
+ expect(record.errors.messages).to be_empty
160
+ end
161
+ end
162
+
163
+ context 'with use_protocol: false' do
164
+ let(:validator) do
165
+ described_class.new(attributes: [:url], validation: :standard, use_protocol: false)
166
+ end
167
+
168
+ it 'accepts URLs without protocol' do
169
+ record.url = 'mysite.com'
170
+ validator.validate_each(record, :url, record.url)
171
+ expect(record.errors.messages).to be_empty
172
+ end
173
+
174
+ it 'accepts URLs with protocol by stripping it' do
175
+ record.url = 'https://mysite.com'
176
+ validator.validate_each(record, :url, record.url)
177
+ expect(record.errors.messages).to be_empty
178
+ end
179
+
180
+ it 'works with root_domain validation' do
181
+ validator = described_class.new(
182
+ attributes: [:url],
183
+ validation: :root_domain,
184
+ use_protocol: false
185
+ )
186
+ record.url = 'mysite.com'
187
+ validator.validate_each(record, :url, record.url)
188
+ expect(record.errors.messages).to be_empty
189
+ end
190
+
191
+ it 'rejects subdomains with root_domain validation' do
192
+ validator = described_class.new(
193
+ attributes: [:url],
194
+ validation: :root_domain,
195
+ use_protocol: false
196
+ )
197
+ record.url = 'shop.mysite.com'
198
+ validator.validate_each(record, :url, record.url)
199
+ expect(record.errors.messages).not_to be_empty
200
+ end
201
+ end
202
+
203
+ context 'with use_https: true (default)' do
204
+ let(:validator) { described_class.new(attributes: [:url], validation: :standard) }
205
+
206
+ it 'accepts https URLs' do
207
+ record.url = 'https://mysite.com'
208
+ validator.validate_each(record, :url, record.url)
209
+ expect(record.errors.messages).to be_empty
210
+ end
211
+
212
+ it 'rejects http URLs' do
213
+ record.url = 'http://mysite.com'
214
+ validator.validate_each(record, :url, record.url)
215
+ expect(record.errors.messages).not_to be_empty
216
+ expect(record.errors.messages.first[:message]).to include('must use https://')
217
+ end
218
+ end
219
+
220
+ context 'with use_https: false' do
221
+ let(:validator) do
222
+ described_class.new(attributes: [:url], validation: :standard, use_https: false)
223
+ end
224
+
225
+ it 'accepts https URLs' do
226
+ record.url = 'https://mysite.com'
227
+ validator.validate_each(record, :url, record.url)
228
+ expect(record.errors.messages).to be_empty
229
+ end
230
+
231
+ it 'accepts http URLs' do
232
+ record.url = 'http://mysite.com'
233
+ validator.validate_each(record, :url, record.url)
234
+ expect(record.errors.messages).to be_empty
235
+ end
236
+ end
237
+
238
+ context 'with use_protocol: false and use_https: false' do
239
+ let(:validator) do
240
+ described_class.new(
241
+ attributes: [:url],
242
+ validation: :standard,
243
+ use_protocol: false,
244
+ use_https: false
245
+ )
246
+ end
247
+
248
+ it 'accepts domain without protocol' do
249
+ record.url = 'mysite.com'
250
+ validator.validate_each(record, :url, record.url)
251
+ expect(record.errors.messages).to be_empty
252
+ end
253
+
254
+ it 'accepts domain with http protocol' do
255
+ record.url = 'http://mysite.com'
256
+ validator.validate_each(record, :url, record.url)
257
+ expect(record.errors.messages).to be_empty
258
+ end
259
+
260
+ it 'accepts domain with https protocol' do
261
+ record.url = 'https://mysite.com'
262
+ validator.validate_each(record, :url, record.url)
263
+ expect(record.errors.messages).to be_empty
264
+ end
265
+ end
266
+ end
267
+
268
+ describe 'complex scenarios' do
269
+ it 'validates root_domain without protocol' do
270
+ validator = described_class.new(
271
+ attributes: [:url],
272
+ validation: :root_domain,
273
+ use_protocol: false
274
+ )
275
+ record.url = 'mysite.com'
276
+ validator.validate_each(record, :url, record.url)
277
+ expect(record.errors.messages).to be_empty
278
+ end
279
+
280
+ it 'validates root_or_custom_subdomain with https only' do
281
+ validator = described_class.new(
282
+ attributes: [:url],
283
+ validation: :root_or_custom_subdomain,
284
+ use_https: true
285
+ )
286
+ record.url = 'https://shop.mysite.com'
287
+ validator.validate_each(record, :url, record.url)
288
+ expect(record.errors.messages).to be_empty
289
+ end
290
+
291
+ it 'rejects www in root_or_custom_subdomain mode' do
292
+ validator = described_class.new(
293
+ attributes: [:url],
294
+ validation: :root_or_custom_subdomain,
295
+ use_protocol: false
296
+ )
297
+ record.url = 'www.mysite.com'
298
+ validator.validate_each(record, :url, record.url)
299
+ expect(record.errors.messages).not_to be_empty
300
+ expect(record.errors.messages.first[:message]).to include('cannot use www subdomain')
301
+ end
302
+
303
+ it 'handles URLs with paths' do
304
+ validator = described_class.new(attributes: [:url], validation: :standard)
305
+ record.url = 'https://mysite.com/path/to/page'
306
+ validator.validate_each(record, :url, record.url)
307
+ expect(record.errors.messages).to be_empty
308
+ end
309
+
310
+ it 'handles URLs with query parameters' do
311
+ validator = described_class.new(attributes: [:url], validation: :standard)
312
+ record.url = 'https://mysite.com?foo=bar&baz=qux'
313
+ validator.validate_each(record, :url, record.url)
314
+ expect(record.errors.messages).to be_empty
315
+ end
316
+
317
+ it 'handles multi-level subdomains with root_or_custom_subdomain' do
318
+ validator = described_class.new(
319
+ attributes: [:url],
320
+ validation: :root_or_custom_subdomain
321
+ )
322
+ record.url = 'https://api.staging.mysite.com'
323
+ validator.validate_each(record, :url, record.url)
324
+ expect(record.errors.messages).to be_empty
325
+ end
326
+ end
327
+
328
+ describe 'error handling' do
329
+ it 'raises error for invalid validation mode' do
330
+ expect do
331
+ validator = described_class.new(attributes: [:url], validation: :invalid_mode)
332
+ validator.validate_each(record, :url, 'https://mysite.com')
333
+ end.to raise_error(ArgumentError, /Invalid validation mode/)
334
+ end
335
+
336
+ it 'handles nil values gracefully' do
337
+ validator = described_class.new(attributes: [:url], validation: :standard)
338
+ record.url = nil
339
+ validator.validate_each(record, :url, record.url)
340
+ expect(record.errors.messages).to be_empty
341
+ end
342
+
343
+ it 'handles whitespace-only values' do
344
+ validator = described_class.new(attributes: [:url], validation: :standard)
345
+ record.url = ' '
346
+ validator.validate_each(record, :url, record.url)
347
+ expect(record.errors.messages).to be_empty
348
+ end
349
+ end
350
+ end
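Note: the spec above drives `validate_each` directly against hand-rolled doubles instead of a real ActiveModel record. The sketch below shows the same pattern outside RSpec. `SketchErrors`, `SketchRecord`, and `NoWwwValidator` are hypothetical stand-ins, and the validator is a toy that only looks at the host prefix, not the gem's `DomainValidator`.

```ruby
# Minimal errors double: records every message added to it.
class SketchErrors
  attr_reader :messages

  def initialize
    @messages = []
  end

  def add(attribute, message)
    @messages << { attribute: attribute, message: message }
  end
end

# Minimal record double exposing one attribute and an errors collection.
SketchRecord = Struct.new(:url) do
  def errors
    @errors ||= SketchErrors.new
  end
end

# Toy validator with the same validate_each signature as an EachValidator.
class NoWwwValidator
  def validate_each(record, attribute, value)
    return if value.nil? || value.strip.empty?

    host = value.strip.sub(%r{\A[a-z]+://}i, '')
    record.errors.add(attribute, 'cannot use www subdomain') if host.start_with?('www.')
  end
end

record = SketchRecord.new('https://www.mysite.com')
NoWwwValidator.new.validate_each(record, :url, record.url)
puts record.errors.messages.length
```

Because the doubles only need `errors.add`, the validator logic can be exercised without loading Rails.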
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: domain_extractor
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.4
4
+ version: 0.2.6
5
5
  platform: ruby
6
6
  authors:
7
7
  - OpenSite AI
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2025-11-08 00:00:00.000000000 Z
11
+ date: 2025-11-09 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: public_suffix
@@ -41,6 +41,7 @@ files:
41
41
  - LICENSE.txt
42
42
  - README.md
43
43
  - lib/domain_extractor.rb
44
+ - lib/domain_extractor/domain_validator.rb
44
45
  - lib/domain_extractor/errors.rb
45
46
  - lib/domain_extractor/normalizer.rb
46
47
  - lib/domain_extractor/parsed_url.rb
@@ -50,6 +51,7 @@ files:
50
51
  - lib/domain_extractor/validators.rb
51
52
  - lib/domain_extractor/version.rb
52
53
  - spec/domain_extractor_spec.rb
54
+ - spec/domain_validator_spec.rb
53
55
  - spec/parsed_url_spec.rb
54
56
  - spec/spec_helper.rb
55
57
  homepage: https://github.com/opensite-ai/domain_extractor