domain_extractor 0.1.8 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4bc4d6ad831692d1251048f8b21820bb0efb10ed5b3cce641441b31afb5308b4
- data.tar.gz: 67a96b33dc3544847af271c8bd837dbc592031bff5dac126022a147c2281460c
+ metadata.gz: 9be08df3b44102d414007ad7fc6865166cb89061a09340d5e49de211165c963f
+ data.tar.gz: a334da0e7a8dc42335b9710aaf9d601d9fc85323ca2e328359de9593a340abc3
  SHA512:
- metadata.gz: 02bca764446a3391461695cfeeaef9c6e7920308bc78768b062ae676005d3610b09733133cb10c34cd5e29dc35169f770b4789f418fd554cba762a6d5a19022a
- data.tar.gz: eeaaa8356b306feba33e08e54c8da2926f7e052ebac5d6920f0a6f26c0dacd3bfbb0d4f863fa694377d87441564a2c1eecff764d756ce4efde0569fabf573ee2
+ metadata.gz: 297852adc140faba6fc64b7589402c1f3b032be344d13ad781ede5ff2a1c1ba867480edbebaca488708da46d68152bf3ee80c02da36d6f8afbf97a74361960ae
+ data.tar.gz: 6b0191d8bfb2458a23e3c56c382f495c9eebf86715a16c8f2936e6804d4adc0643835485e100e42f3bcd3f4d83344d7d60aac47dee60f72214fd95297f714334
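For reference, either checksum set can be checked locally with Ruby's standard library once `metadata.gz` and `data.tar.gz` have been extracted from the downloaded `.gem` archive (a minimal sketch; the local file paths are assumptions):

```ruby
require 'digest'

# Hypothetical paths: files extracted from domain_extractor-0.2.0.gem beforehand.
expected_sha256 = {
  'metadata.gz' => '9be08df3b44102d414007ad7fc6865166cb89061a09340d5e49de211165c963f',
  'data.tar.gz' => 'a334da0e7a8dc42335b9710aaf9d601d9fc85323ca2e328359de9593a340abc3'
}

expected_sha256.each do |file, sum|
  actual = Digest::SHA256.file(file).hexdigest
  puts "#{file}: #{actual == sum ? 'OK' : 'MISMATCH'}"
end
```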
data/README.md CHANGED
@@ -65,6 +65,10 @@ end
  result.subdomain # => 'www'
  result.domain # => 'example'
  result.host # => 'www.example.co.uk'
+
+ # Opt into strict parsing when needed
+ DomainExtractor.parse!('notaurl')
+ # => raises DomainExtractor::InvalidURLError: Invalid URL Value
  ```

  ## ParsedURL API - Intuitive Method Access
@@ -74,6 +78,7 @@ DomainExtractor now returns a `ParsedURL` object that supports three accessor st
  ### Method Accessor Styles

  #### 1. Default Methods (Silent Nil)
+
  Returns the value or `nil` - perfect for exploratory code or when handling invalid data gracefully.

  ```ruby
@@ -89,6 +94,7 @@ result.domain # => 'example'
  ```

  #### 2. Bang Methods (!) - Explicit Errors
+
  Returns the value or raises `InvalidURLError` - ideal for production code where missing data should fail fast.

  ```ruby
@@ -98,6 +104,7 @@ result.subdomain! # raises InvalidURLError: "subdomain not found or invalid"
  ```

  #### 3. Question Methods (?) - Boolean Checks
+
  Always returns `true` or `false` - perfect for conditional logic without exceptions.

  ```ruby
@@ -150,6 +157,28 @@ DomainExtractor.parse('https://api.dashtrack.com').subdomain? # => true
  # Check for www subdomain specifically
  DomainExtractor.parse('https://www.dashtrack.com').www_subdomain? # => true
  DomainExtractor.parse('https://api.dashtrack.com').www_subdomain? # => false
+
+ ```
+
+ #### Handling Unknown or Invalid Data
+
+ ```ruby
+ # Default accessors fail silently with nil
+ DomainExtractor.parse(nil).domain # => nil
+ DomainExtractor.parse('').host # => nil
+ DomainExtractor.parse('asdfasdfds').domain # => nil
+
+ # Boolean checks never raise
+ DomainExtractor.parse(nil).subdomain? # => false
+ DomainExtractor.parse('').domain? # => false
+ DomainExtractor.parse('https://dashtrack.com').subdomain? # => false
+
+ # Bang methods raise when a component is missing
+ DomainExtractor.parse('').host! # => raises DomainExtractor::InvalidURLError
+ DomainExtractor.parse('asdfasdfds').domain! # => raises DomainExtractor::InvalidURLError
+
+ # Strict parsing helper mirrors legacy behaviour
+ DomainExtractor.parse!('asdfasdfds') # => raises DomainExtractor::InvalidURLError
  ```

  #### Safe Batch Processing
@@ -235,7 +264,7 @@ hash = result.to_h
  # }
  ```

- **See [docs/PARSED_URL_API.md](docs/PARSED_URL_API.md) for comprehensive documentation and real-world examples.**
+ **[Comprehensive documentation and real-world examples in the ParsedURL Quick Start guide](https://github.com/opensite-ai/domain_extractor/blob/master/docs/PARSED_URL_QUICK_START.md)**

  ## Usage Examples

@@ -300,31 +329,38 @@ DomainExtractor.parse('not-a-url')

  ## API Reference

- ### `DomainExtractor.parse(url_string)`
-
- Parses a URL string and extracts domain components.
+ ```ruby
+ DomainExtractor.parse(url_string)

- **Returns:** Hash with keys `:subdomain`, `:domain`, `:tld`, `:root_domain`, `:host`, `:path`
+ # => Parses a URL string and extracts domain components.

- **Raises:** `DomainExtractor::InvalidURLError` when the URL fails validation
+ # Returns: Hash with keys :subdomain, :domain, :tld, :root_domain, :host, :path
+ # Raises: DomainExtractor::InvalidURLError when the URL fails validation
+ ```

- ### `DomainExtractor.parse_batch(urls)`
+ ```ruby
+ DomainExtractor.parse_batch(urls)

- Parses multiple URLs efficiently.
+ # => Parses multiple URLs efficiently.

- **Returns:** Array of parsed results
+ # Returns: Array of parsed results
+ ```

- ### `DomainExtractor.valid?(url_string)`
+ ```ruby
+ DomainExtractor.valid?(url_string)

- Checks if a URL can be parsed successfully without raising.
+ # => Checks if a URL can be parsed successfully without raising.

- **Returns:** `true` or `false`
+ # Returns: true or false
+ ```

- ### `DomainExtractor.parse_query_params(query_string)`
+ ```ruby
+ DomainExtractor.parse_query_params(query_string)

- Parses a query string into a hash of parameters.
+ # => Parses a query string into a hash of parameters.

- **Returns:** Hash of query parameters
+ # Returns: Hash of query parameters
+ ```

  ## Use Cases

@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module DomainExtractor
- VERSION = '0.1.8'
+ VERSION = '0.2.0'
  end
@@ -15,10 +15,21 @@ module DomainExtractor
  class << self
  # Parse an individual URL and extract domain attributes.
  # Returns a ParsedURL object that supports hash-style access and method calls.
- # Raises DomainExtractor::InvalidURLError when the URL fails validation.
+ # For invalid inputs the returned ParsedURL will be marked invalid and all
+ # accessors (without bang) will evaluate to nil/false.
  # @param url [String, #to_s]
  # @return [ParsedURL]
  def parse(url)
+ Parser.call(url)
+ end
+
+ # Parse an individual URL and raise when extraction fails.
+ # This mirrors the legacy behaviour of .parse while giving callers an
+ # explicit opt-in to strict validation.
+ # @param url [String, #to_s]
+ # @return [ParsedURL]
+ # @raise [InvalidURLError]
+ def parse!(url)
  result = Parser.call(url)
  raise InvalidURLError unless result.valid?

@@ -142,40 +142,42 @@ RSpec.describe DomainExtractor do
  end

  context 'with invalid URLs' do
- it 'raises InvalidURLError for malformed URLs' do
- expect { described_class.parse('http://') }.to raise_error(
- DomainExtractor::InvalidURLError,
- 'Invalid URL Value'
- )
+ let(:invalid_inputs) { ['http://', 'not_a_url', '192.168.1.1', '[2001:db8::1]', '', nil] }
+
+ it 'returns an invalid ParsedURL that safely yields nil values' do
+ invalid_inputs.each do |input|
+ result = described_class.parse(input)
+
+ expect(result).to be_a(DomainExtractor::ParsedURL)
+ expect(result.valid?).to be(false)
+ expect(result.domain).to be_nil
+ expect(result.domain?).to be(false)
+ expect(result.host).to be_nil
+ expect(result.host?).to be(false)
+ end
  end

- it 'raises InvalidURLError for invalid domains' do
- expect { described_class.parse('not_a_url') }.to raise_error(
- DomainExtractor::InvalidURLError,
- 'Invalid URL Value'
- )
- end
+ it 'allows bang accessors to raise explicit errors' do
+ result = described_class.parse('not_a_url')

- it 'raises InvalidURLError for IP addresses' do
- expect { described_class.parse('192.168.1.1') }.to raise_error(
+ expect { result.domain! }.to raise_error(
  DomainExtractor::InvalidURLError,
- 'Invalid URL Value'
+ 'domain not found or invalid'
  )
- end

- it 'raises InvalidURLError for IPv6 addresses' do
- expect { described_class.parse('[2001:db8::1]') }.to raise_error(
+ expect { result.host! }.to raise_error(
  DomainExtractor::InvalidURLError,
- 'Invalid URL Value'
+ 'host not found or invalid'
  )
  end

- it 'raises InvalidURLError for empty string' do
- expect { described_class.parse('') }.to raise_error(DomainExtractor::InvalidURLError, 'Invalid URL Value')
- end
-
- it 'raises InvalidURLError for nil' do
- expect { described_class.parse(nil) }.to raise_error(DomainExtractor::InvalidURLError, 'Invalid URL Value')
+ it 'provides strict parsing via parse!' do
+ invalid_inputs.each do |input|
+ expect { described_class.parse!(input) }.to raise_error(
+ DomainExtractor::InvalidURLError,
+ 'Invalid URL Value'
+ )
+ end
  end
  end
  end
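A short caller-side sketch of the behavioural change covered by the two hunks above, lenient `.parse` versus the new strict `.parse!`; the outcomes are taken from the README examples and specs in this diff:

```ruby
# Assumed 0.2.0 behaviour, per the README examples and specs shown above.
lenient = DomainExtractor.parse('not_a_url')   # no longer raises
lenient.valid?   # => false
lenient.domain   # => nil
lenient.domain?  # => false
lenient.domain!  # raises DomainExtractor::InvalidURLError, 'domain not found or invalid'

DomainExtractor.parse!('not_a_url')            # raises DomainExtractor::InvalidURLError, 'Invalid URL Value'
```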
@@ -127,8 +127,8 @@ RSpec.describe DomainExtractor::ParsedURL do
  end
  end

- context 'with invalid URL' do
- let(:parsed) { DomainExtractor::ParsedURL.new(nil) }
+ context 'with invalid URL input' do
+ let(:parsed) { DomainExtractor.parse('invalid_url_value') }
 
  describe 'default accessor methods' do
  it 'returns nil for subdomain' do
@@ -189,6 +189,50 @@ RSpec.describe DomainExtractor::ParsedURL do
  end
  end
  end
+
+ context 'with nil input' do
+ let(:parsed) { DomainExtractor.parse(nil) }
+
+ it 'returns nil for default accessors' do
+ expect(parsed.domain).to be_nil
+ expect(parsed.host).to be_nil
+ expect(parsed.subdomain).to be_nil
+ end
+
+ it 'returns false for question accessors' do
+ expect(parsed.domain?).to be false
+ expect(parsed.host?).to be false
+ expect(parsed.subdomain?).to be false
+ end
+
+ it 'raises for bang accessors' do
+ expect { parsed.domain! }.to raise_error(
+ DomainExtractor::InvalidURLError,
+ 'domain not found or invalid'
+ )
+ end
+ end
+
+ context 'with empty string input' do
+ let(:parsed) { DomainExtractor.parse('') }
+
+ it 'returns nil for default accessors' do
+ expect(parsed.domain).to be_nil
+ expect(parsed.host).to be_nil
+ end
+
+ it 'returns false for question accessors' do
+ expect(parsed.domain?).to be false
+ expect(parsed.host?).to be false
+ end
+
+ it 'raises for bang accessors' do
+ expect { parsed.host! }.to raise_error(
+ DomainExtractor::InvalidURLError,
+ 'host not found or invalid'
+ )
+ end
+ end
  end

  describe '#www_subdomain?' do
@@ -208,7 +252,7 @@ RSpec.describe DomainExtractor::ParsedURL do
  end

  it 'returns false for invalid URL' do
- parsed = DomainExtractor::ParsedURL.new(nil)
+ parsed = DomainExtractor.parse('invalid_url_value')
  expect(parsed.www_subdomain?).to be false
  end
  end
@@ -220,7 +264,7 @@ RSpec.describe DomainExtractor::ParsedURL do
  end

  it 'returns false for invalid URL' do
- parsed = DomainExtractor::ParsedURL.new(nil)
+ parsed = DomainExtractor.parse('invalid_url_value')
  expect(parsed.valid?).to be false
  end

@@ -299,8 +343,7 @@ RSpec.describe DomainExtractor::ParsedURL do

  it 'handles example: domain returns nil for invalid URL' do
  # Parser returns ParsedURL with empty result for invalid URLs
- # But parse() raises error, so we need to construct directly
- parsed = DomainExtractor::ParsedURL.new(nil)
+ parsed = DomainExtractor.parse('invalid_url_value')
  expect(parsed.domain).to be_nil
  end
  end
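A minimal end-to-end sketch of the public API as documented in the README changes above; the sample URL and expected values follow the README's own examples, and the batch and query-string results are described only at the level the reference states:

```ruby
require 'domain_extractor'

url = 'https://www.example.co.uk/path?page=2'

DomainExtractor.valid?(url)                     # => true or false, never raises
result = DomainExtractor.parse(url)
result.host       # => 'www.example.co.uk'
result.subdomain  # => 'www'

# One parsed result per input URL.
DomainExtractor.parse_batch(['https://www.example.co.uk', 'not_a_url'])

# Query string parameters as a hash.
DomainExtractor.parse_query_params('page=2&sort=asc')
```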
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: domain_extractor
  version: !ruby/object:Gem::Version
- version: 0.1.8
+ version: 0.2.0
  platform: ruby
  authors:
  - OpenSite AI
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2025-10-31 00:00:00.000000000 Z
+ date: 2025-11-01 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: public_suffix