domain_extractor 0.1.9 → 0.2.1
- checksums.yaml +4 -4
- data/README.md +28 -18
- data/lib/domain_extractor/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: e98bbd81335ec24e45ed141b05b71f65106e3e441f54a772722b7a56d4e1cb5c
|
|
4
|
+
data.tar.gz: 9cdecbb53f3ec5bb9b6758d5e6b95991245f1ad97d8b22ef15fbc0fd4f074b32
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 23d81dafbcbb998c8bd94d0c649796ecc999fad5263354d49172ebd7db5ecc536c4705c537726d88be9b88d3c49bb071e9e841ef5f025e912ed0413557050777
|
|
7
|
+
data.tar.gz: 34b21e93c8e7351dcb867c8bacf106175dcd5cb432e372ad60031a7e0733e82371e66410790b5d6f77e8429d80fe9949b458cdda51ff9db2c399e347042d406e
|
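The digests in `checksums.yaml` cover the `metadata.gz` and `data.tar.gz` members packed inside the `.gem` archive. As a minimal sketch, assuming those members have already been extracted to disk, a digest can be compared with Ruby's standard `Digest` library. The helper name `sha256_matches?` is hypothetical and is not part of RubyGems or of this gem.

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems or domain_extractor):
# returns true when the SHA256 hex digest of `bytes` equals `expected_hex`.
def sha256_matches?(bytes, expected_hex)
  # Normalize the expected value so stray whitespace or uppercase hex
  # copied from a YAML file does not cause a false mismatch.
  Digest::SHA256.hexdigest(bytes) == expected_hex.strip.downcase
end
```

A call such as `sha256_matches?(File.binread('data.tar.gz'), expected)` would then check an extracted member against the value recorded above; the file path here is an assumption about where the member was unpacked.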
data/README.md
CHANGED
|
@@ -1,12 +1,12 @@
|
|
|
1
1
|
# DomainExtractor
|
|
2
2
|
|
|
3
|
-
[](https://badge.fury.io/rb/domain_extractor)
|
|
3
|
+
[](https://badge.fury.io/rb/domain_extractor)
|
|
4
4
|
[](https://github.com/opensite-ai/domain_extractor/actions/workflows/ci.yml)
|
|
5
5
|
[](https://codeclimate.com/github/opensite-ai/domain_extractor)
|
|
6
6
|
|
|
7
7
|
A lightweight, robust Ruby library for url parsing and domain parsing with **accurate multi-part TLD support**. DomainExtractor delivers a high-throughput url parser and domain parser that excels at domain extraction tasks while staying friendly to analytics pipelines. Perfect for web scraping, analytics, url manipulation, query parameter parsing, and multi-environment domain analysis.
|
|
8
8
|
|
|
9
|
-
Use DomainExtractor whenever you need a dependable tld parser for tricky multi-part tld registries or reliable subdomain extraction in production systems.
|
|
9
|
+
Use **DomainExtractor** whenever you need a dependable tld parser for tricky multi-part tld registries or reliable subdomain extraction in production systems.
|
|
10
10
|
|
|
11
11
|
## Why DomainExtractor?
|
|
12
12
|
|
|
@@ -78,6 +78,7 @@ DomainExtractor now returns a `ParsedURL` object that supports three accessor st
|
|
|
78
78
|
### Method Accessor Styles
|
|
79
79
|
|
|
80
80
|
#### 1. Default Methods (Silent Nil)
|
|
81
|
+
|
|
81
82
|
Returns the value or `nil` - perfect for exploratory code or when handling invalid data gracefully.
|
|
82
83
|
|
|
83
84
|
```ruby
|
|
@@ -93,6 +94,7 @@ result.domain # => 'example'
|
|
|
93
94
|
```
|
|
94
95
|
|
|
95
96
|
#### 2. Bang Methods (!) - Explicit Errors
|
|
97
|
+
|
|
96
98
|
Returns the value or raises `InvalidURLError` - ideal for production code where missing data should fail fast.
|
|
97
99
|
|
|
98
100
|
```ruby
|
|
@@ -102,6 +104,7 @@ result.subdomain! # raises InvalidURLError: "subdomain not found or invalid"
|
|
|
102
104
|
```
|
|
103
105
|
|
|
104
106
|
#### 3. Question Methods (?) - Boolean Checks
|
|
107
|
+
|
|
105
108
|
Always returns `true` or `false` - perfect for conditional logic without exceptions.
|
|
106
109
|
|
|
107
110
|
```ruby
|
|
@@ -261,7 +264,7 @@ hash = result.to_h
|
|
|
261
264
|
# }
|
|
262
265
|
```
|
|
263
266
|
|
|
264
|
-
**
|
|
267
|
+
**[Comprehensive documentation and real-world examples of parsed URL quick start guide](https://github.com/opensite-ai/domain_extractor/blob/master/docs/PARSED_URL_QUICK_START.md)**
|
|
265
268
|
|
|
266
269
|
## Usage Examples
|
|
267
270
|
|
|
@@ -326,31 +329,38 @@ DomainExtractor.parse('not-a-url')
|
|
|
326
329
|
|
|
327
330
|
## API Reference
|
|
328
331
|
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
Parses a URL string and extracts domain components.
|
|
332
|
+
```ruby
|
|
333
|
+
DomainExtractor.parse(url_string)
|
|
332
334
|
|
|
333
|
-
|
|
335
|
+
# => Parses a URL string and extracts domain components.
|
|
334
336
|
|
|
335
|
-
|
|
337
|
+
# Returns: Hash with keys :subdomain, :domain, :tld, :root_domain, :host, :path
|
|
338
|
+
# Raises: DomainExtractor::InvalidURLError when the URL fails validation
|
|
339
|
+
```
|
|
336
340
|
|
|
337
|
-
|
|
341
|
+
```ruby
|
|
342
|
+
DomainExtractor.parse_batch(urls)
|
|
338
343
|
|
|
339
|
-
Parses multiple URLs efficiently.
|
|
344
|
+
# => Parses multiple URLs efficiently.
|
|
340
345
|
|
|
341
|
-
|
|
346
|
+
# Returns: Array of parsed results
|
|
347
|
+
```
|
|
342
348
|
|
|
343
|
-
|
|
349
|
+
```ruby
|
|
350
|
+
DomainExtractor.valid?(url_string)
|
|
344
351
|
|
|
345
|
-
Checks if a URL can be parsed successfully without raising.
|
|
352
|
+
# => Checks if a URL can be parsed successfully without raising.
|
|
346
353
|
|
|
347
|
-
|
|
354
|
+
# Returns: true or false
|
|
355
|
+
```
|
|
348
356
|
|
|
349
|
-
|
|
357
|
+
```ruby
|
|
358
|
+
DomainExtractor.parse_query_params(query_string)
|
|
350
359
|
|
|
351
|
-
Parses a query string into a hash of parameters.
|
|
360
|
+
# => Parses a query string into a hash of parameters.
|
|
352
361
|
|
|
353
|
-
|
|
362
|
+
# Returns: Hash of query parameters
|
|
363
|
+
```
|
|
354
364
|
|
|
355
365
|
## Use Cases
|
|
356
366
|
|
|
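The API Reference hunk above describes `parse_query_params` only as turning a query string into a hash. The following is a rough sketch of that behavior built on Ruby's standard `URI.decode_www_form`. It is not the gem's implementation: the name `sketch_parse_query_params` is hypothetical, and the string keys are an assumption, since the diff does not show the key type the gem returns.

```ruby
require 'uri'

# Hypothetical stand-in for illustration only; the real
# DomainExtractor.parse_query_params may behave differently.
def sketch_parse_query_params(query_string)
  # Tolerate nil input and a leading "?" copied from a full URL.
  cleaned = query_string.to_s.sub(/\A\?/, '')
  # decode_www_form yields [key, value] pairs; to_h keeps the last
  # value when a key is repeated.
  URI.decode_www_form(cleaned).to_h
end

params = sketch_parse_query_params('?utm_source=google&page=2')
puts params.inspect
```

Repeated keys collapse to their last value in this sketch, which may or may not match what the gem does.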
@@ -389,7 +399,7 @@ Optimized for high-throughput production use:
|
|
|
389
399
|
- **Thread-safe**: Stateless modules, safe for concurrent use
|
|
390
400
|
- **Zero-allocation hot paths**: Frozen constants, pre-compiled regex
|
|
391
401
|
|
|
392
|
-
|
|
402
|
+
View [performance analysis](https://github.com/opensite-ai/domain_extractor/blob/master/docs/PERFORMANCE.md) for detailed benchmarks and optimization strategies and benchmark results along with a full set of enhancements made in order to meet the highly performance centric requirements of the OpenSite AI site rendering engine, showcased in the [optimization summary](https://github.com/opensite-ai/domain_extractor/blob/master/docs/OPTIMIZATION_SUMMARY.md)
|
|
393
403
|
|
|
394
404
|
## Comparison with Alternatives
|
|
395
405
|
|
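The README hunks above describe three accessor styles on the parsed result: default methods that return `nil`, bang methods that raise, and question methods that return a boolean. The pattern can be sketched with a small stand-in class. Everything here is hypothetical: `SketchParsedURL` and `SketchInvalidURLError` are illustrative names, not the gem's `ParsedURL` or `InvalidURLError`, and the field list is taken from the keys named in the API Reference hunk.

```ruby
# Hypothetical stand-in for the gem's InvalidURLError.
class SketchInvalidURLError < StandardError; end

# Hypothetical illustration of the three accessor styles;
# this is not the gem's ParsedURL implementation.
class SketchParsedURL
  FIELDS = %i[subdomain domain tld root_domain host path].freeze

  def initialize(parts)
    @parts = parts
  end

  FIELDS.each do |field|
    # 1. Default method: returns the value or nil.
    define_method(field) { @parts[field] }

    # 2. Bang method: returns the value or raises an explicit error.
    define_method("#{field}!") do
      value = @parts[field]
      if value.nil? || value.to_s.empty?
        raise SketchInvalidURLError, "#{field} not found or invalid"
      end
      value
    end

    # 3. Question method: always returns true or false.
    define_method("#{field}?") do
      value = @parts[field]
      !(value.nil? || value.to_s.empty?)
    end
  end
end

result = SketchParsedURL.new(domain: 'example', tld: 'com', host: 'example.com')
result.domain      # value is present, so it is returned as-is
result.subdomain   # missing, so the default accessor returns nil
result.subdomain?  # missing, so the question accessor returns false
```

Generating all three variants from one field list keeps them consistent, so a bang or question accessor cannot drift from its default counterpart.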
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: domain_extractor
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.1
|
|
4
|
+
version: 0.2.1
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- OpenSite AI
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: bin
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date: 2025-
|
|
11
|
+
date: 2025-11-01 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: public_suffix
|