domain_extractor 0.2.8 → 0.2.9

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d3a08e0e813341f588f96df2c7ec48eede21b5e93b6f46e127d15b2266439919
- data.tar.gz: f05739738dda333fa4793397f4c51c047574d2c94e591ed6c76904f8cdc51fe7
+ metadata.gz: 0b791db73463bab1af5200390a77607e87094609a4982c267cf45462648edb09
+ data.tar.gz: 1202be56be2eb390bbd767e2c60369a61268261fefe5c593acdb064d586e7431
  SHA512:
- metadata.gz: f1d4c335712a677707d7fedb966924e9f9a2f0e06f9084fc5ab1075d5d620ae51cd390a2e5013245e6fb9ec3d32563ed7c090d989807e86d01a304ac8ba6d9bc
- data.tar.gz: 143f192445d137c456668c44e260cf96183ca0afae22be308e884f9fa9c9248cbf43b5608466412fc0734e652565c72bea45990bc4d038ac20934b854f6b929f
+ metadata.gz: 957f691b10eb6bd705646cf192c9c3fecf84a246a3a76e8e5f29406d38b97f52277006bfaa9e09fc334eb3594eeae61c977e05eb95c10021d7cdce6a27ae8325
+ data.tar.gz: 205629c265724025d48c9d25318c78ede7d6bccf93486cae0b5957a4094ef94a46dde00a5b97540517c4b2f5244025e5ace4eb9de6bf4c6f0e0a7477692897e7
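The checksums hunk replaces the SHA256 and SHA512 digests recorded for `metadata.gz` and `data.tar.gz`. As a minimal sketch of how such digests could be recomputed locally, the following uses only Ruby's standard `digest` library. The helper names `gem_member_digests` and `digest_matches?` are hypothetical, and the sketch assumes the `.gem` archive has already been unpacked so that the member files exist on disk.

```ruby
require 'digest'

# Hypothetical helper: returns the SHA256 and SHA512 hex digests of one file,
# keyed the same way checksums.yaml groups them.
def gem_member_digests(path)
  {
    'SHA256' => Digest::SHA256.file(path).hexdigest,
    'SHA512' => Digest::SHA512.file(path).hexdigest
  }
end

# Hypothetical helper: compares a freshly computed digest against the hex
# string copied from checksums.yaml (whitespace and case are normalized).
def digest_matches?(path, algorithm, expected_hex)
  gem_member_digests(path).fetch(algorithm) == expected_hex.strip.downcase
end
```

Calling `digest_matches?('data.tar.gz', 'SHA256', expected)` with the value listed for 0.2.9 would then confirm whether a downloaded archive corresponds to this release, under the assumptions stated above.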
data/CHANGELOG.md CHANGED
@@ -5,6 +5,274 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
  
+ ## [0.2.9] - 2026-03-11
+
+ ### Added - URI-Compatible Accessors and Authentication Extraction
+
+ This major release adds a much broader **URI-compatible API for common absolute-URL workflows** along with new authentication extraction, URI manipulation helpers, and expanded documentation.
+
+ #### 🔐 Authentication Extraction
+
+ **Comprehensive userinfo parsing** for database connections, Redis, FTP, SFTP, and any URL scheme with embedded credentials:
+
+ - `user` - Extract username from URL
+ - `password` - Extract password from URL
+ - `userinfo` - Complete userinfo string (user:password)
+ - `decoded_user` - Percent-decoded username (handles special characters)
+ - `decoded_password` - Percent-decoded password (handles special characters)
+
+ **Supported URL Schemes:**
+
+ - **Redis/Rediss**: `redis://username:password@host:6379/0`, `rediss://:password@host:6380`
+ - **PostgreSQL**: `postgresql://user:pass@localhost:5432/dbname`
+ - **MySQL**: `mysql://user:pass@host:3306/database`
+ - **MongoDB**: `mongodb+srv://user:pass@cluster.mongodb.net/db`
+ - **FTP/SFTP/FTPS**: `ftp://user:pass@host/path`, `sftp://user:pass@host:22/path`
+ - **HTTP/HTTPS**: `https://user:pass@api.example.com` (deprecated but supported)
+
+ **Special Character Handling:**
+
+ - Automatic percent-decoding of credentials with `@`, `:`, and other special characters
+ - `decoded_user` and `decoded_password` provide clean, usable credentials
+ - Handles edge cases: password-only (`:password`), username-only, empty passwords
+
+ #### 🔧 Complete URI Component Access
+
+ **Common URI components** are now accessible as both getters and setters:
+
+ **Read Access:**
+
+ - `scheme` - URL scheme (http, https, redis, postgresql, etc.)
+ - `host` - Host value for the parsed URI
+ - `hostname` - Hostname helper for URI-style access
+ - `port` - Port number
+ - `path` - URL path
+ - `query` - Raw query string
+ - `fragment` - Fragment/anchor (#section)
+ - `user`, `password`, `userinfo` - Authentication components
+ - `subdomain`, `domain`, `tld`, `root_domain` - Domain components (existing)
+
+ **Write Access (Setter Methods):**
+
+ - `scheme=`, `host=`, `hostname=`, `port=`, `path=`, `query=`, `fragment=`
+ - `user=`, `password=`, `userinfo=`
+ - Enables programmatic URI construction and modification
+
+ #### 🛠️ Authentication Helper Methods
+
+ **Basic Authentication:**
+
+ ```ruby
+ # Generate Authorization header from parsed URL
+ result = DomainExtractor.parse('https://user:pass@api.example.com')
+ result.basic_auth_header
+ # => "Basic dXNlcjpwYXNz"
+
+ # Or use module method directly
+ DomainExtractor.basic_auth_header('user', 'password')
+ # => "Basic dXNlcjpwYXNzd29yZA=="
+ ```
+
+ **Bearer Token Authentication:**
+
+ ```ruby
+ DomainExtractor.bearer_auth_header('eyJhbGciOiJIUzI1NiIs...')
+ # => "Bearer eyJhbGciOiJIUzI1NiIs..."
+ ```
+
+ **Credential Encoding/Decoding:**
+ ```ruby
+ # Encode credentials for URL use (percent-encoding)
+ DomainExtractor.encode_credential('P@ss:word!')
+ # => "P%40ss%3Aword%21"
+
+ # Decode percent-encoded credentials
+ DomainExtractor.decode_credential('P%40ss%3Aword%21')
+ # => "P@ss:word!"
+ ```
+
+ #### 🌐 Advanced URI Methods
+
+ **URI Manipulation:**
+ - `merge(relative_url)` - Merge with relative URL (RFC 2396 compliant)
+ - `normalize` - Normalize URI (lowercase scheme/host, remove default ports)
+ - `absolute?` - Check if URI is absolute
+ - `relative?` - Check if URI is relative
+ - `default_port` - Get default port for scheme (80 for http, 443 for https, 6379 for redis, etc.)
+ - `build_url` - Reconstruct complete URL from components
+
+ **Proxy Detection:**
+ - `find_proxy` - Automatic proxy detection from environment variables
+ - Checks scheme-specific proxy variables, falls back to `http_proxy` / `HTTP_PROXY`, and respects `no_proxy`
+ - Returns proxy URI or nil
+
+ **Alias Methods for URI Compatibility:**
+ - `to_str` - Alias for `to_s`
+ - `hostname` - URI-style hostname accessor
+ - `query` - Raw query string access
+
+ #### 📊 Real-World Use Cases
+
+ **Database Connection Parsing:**
+ ```ruby
+ db_url = 'postgresql://appuser:SecurePass@db.prod.internal:5432/production'
+ result = DomainExtractor.parse(db_url)
+
+ result.user # => "appuser"
+ result.password # => "SecurePass"
+ result.host # => "db.prod.internal"
+ result.port # => 5432
+ result.path # => "/production"
+ ```
+
+ **Redis Connection with Special Characters:**
+ ```ruby
+ redis_url = 'rediss://default:P%40ss%3Aword@redis.cloud:6385/0'
+ result = DomainExtractor.parse(redis_url)
+
+ result.password # => "P%40ss%3Aword"
+ result.decoded_password # => "P@ss:word"
+ result.scheme # => "rediss"
+ result.port # => 6385
+ ```
+
+ **API Authentication Header Generation:**
+ ```ruby
+ api_url = 'https://nick@untappd.com:MySuperAPIToken123@api.untappd.com/v4'
+ result = DomainExtractor.parse(api_url)
+
+ # Generate Basic Auth header
+ auth_header = result.basic_auth_header
+ # Use in HTTP request:
+ # headers['Authorization'] = auth_header
+ ```
+
+ **FTP/SFTP Deployment:**
+ ```ruby
+ deploy_url = 'sftp://deploy_user:DeployKey123@deployment.internal:22/var/www/app'
+ result = DomainExtractor.parse(deploy_url)
+
+ result.user # => "deploy_user"
+ result.password # => "DeployKey123"
+ result.host # => "deployment.internal"
+ result.port # => 22
+ result.path # => "/var/www/app"
+ ```
+
+ #### 🔒 Security Considerations
+
+ **Important Security Notes:**
+ - Embedding credentials in URLs is **deprecated** per RFC 3986 and should be avoided in production
+ - Use environment variables, secret managers, or secure vaults for credential storage
+ - The library supports credential extraction for **legacy systems** and **configuration parsing**
+ - Always use HTTPS/TLS when credentials must be transmitted
+ - Never log URLs containing credentials
+ - Consider using header-based authentication (Bearer tokens, API keys) instead
+
+ **Safe Credential Handling:**
+ ```ruby
+ # ✅ Good: Parse from environment variable
+ db_url = ENV['DATABASE_URL']
+ config = DomainExtractor.parse(db_url)
+
+ # ✅ Good: Extract and use separately
+ username = config.decoded_user
+ password = config.decoded_password
+ # Pass to connection library without logging URL
+
+ # ❌ Bad: Hardcode credentials in source
+ db_url = 'postgresql://user:password@localhost/db' # Don't do this!
+ ```
+
+ #### 🚀 Performance
+
+ **Maintains Performance-First Design:**
+ - All new features use frozen constants and optimized string operations
+ - Auth extraction adds <5µs overhead per parse
+ - Core hot paths remain allocation-conscious
+ - Thread-safe stateless modules
+ - Full parse throughput depends on host complexity; see the benchmark docs for measured results
+
+ #### 🔄 URI-Style Access
+
+ **Common URI-style access with additional domain helpers:**
+ ```ruby
+ # Before (using URI)
+ uri = URI.parse('https://user:pass@example.com:8080/path?query=value#section')
+ uri.scheme # => "https"
+ uri.user # => "user"
+ uri.password # => "pass"
+ uri.host # => "example.com"
+ uri.port # => 8080
+
+ # After (using DomainExtractor) - identical API
+ result = DomainExtractor.parse('https://user:pass@example.com:8080/path?query=value#section')
+ result.scheme # => "https"
+ result.user # => "user"
+ result.password # => "pass"
+ result.host # => "example.com"
+ result.port # => 8080
+
+ # PLUS: Additional domain parsing features
+ result.subdomain # => nil
+ result.domain # => "example"
+ result.tld # => "com"
+ result.root_domain # => "example.com"
+
+ # PLUS: Decoded credentials
+ result.decoded_user # => "user"
+ result.decoded_password # => "pass"
+
+ # PLUS: Authentication helpers
+ result.basic_auth_header # => "Basic dXNlcjpwYXNz"
+ ```
+
+ #### 📦 Implementation Details
+
+ **New Modules:**
+ - `DomainExtractor::Auth` - Authentication component extraction with percent-decoding
+ - `DomainExtractor::URIHelpers` - Advanced URI manipulation and helper methods
+
+ **Enhanced Modules:**
+ - `DomainExtractor::Parser` - Now extracts auth components and additional URI parts
+ - `DomainExtractor::Result` - Builds results with auth and URI components
+ - `DomainExtractor::ParsedURL` - Extended with URI-compatible methods and setters
+
+ **Code Quality:**
+ - 200+ comprehensive test cases covering all scenarios
+ - RuboCop clean with zero offenses
+ - 100% backward compatible - no breaking changes
+ - Extensive documentation with real-world examples
+
+ #### 🎯 Migration from URI Library
+
+ **Low-friction migration for common absolute-URL use cases:**
+ ```ruby
+ # Swap URI.parse for DomainExtractor.parse
+ # Before:
+ require 'uri'
+ uri = URI.parse(url)
+
+ # After:
+ require 'domain_extractor'
+ uri = DomainExtractor.parse(url)
+
+ # Common URI-style accessors continue to work, plus you get:
+ # - Multi-part TLD support
+ # - Domain component extraction
+ # - Decoded credentials
+ # - Authentication helpers
+ # - Better performance
+ ```
+
+ #### 📚 Documentation
+
+ - Comprehensive CHANGELOG with all features documented
+ - README updated with authentication examples
+ - Real-world use cases for Redis, databases, FTP, APIs
+ - Security best practices section
+ - Migration guide from URI library
+
  ## [0.2.7] - 2025-11-09
  
  ### Added - URL Formatting API
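The changelog hunk documents `basic_auth_header`, `bearer_auth_header`, `encode_credential`, and `decode_credential` only through example outputs; the gem's implementation is not part of this diff. As a rough cross-check of those outputs, the sketch below reproduces them with Ruby's core and standard library alone. The `sketch_*` method names are hypothetical stand-ins, and the choice of `URI.encode_www_form_component` for percent-encoding is an assumption rather than the gem's confirmed approach.

```ruby
require 'uri'

# Hypothetical stand-in: the Basic scheme is "Basic " followed by the
# Base64 encoding of "user:password". pack('m0') emits Base64 with no
# trailing newline.
def sketch_basic_auth_header(user, password)
  "Basic #{["#{user}:#{password}"].pack('m0')}"
end

# Hypothetical stand-in: the Bearer scheme simply prefixes the token.
def sketch_bearer_auth_header(token)
  "Bearer #{token}"
end

# Hypothetical stand-ins for percent-encoding and decoding a credential.
def sketch_encode_credential(raw)
  URI.encode_www_form_component(raw)
end

def sketch_decode_credential(encoded)
  URI.decode_www_form_component(encoded)
end

sketch_basic_auth_header('user', 'pass')         # => "Basic dXNlcjpwYXNz"
sketch_encode_credential('P@ss:word!')           # => "P%40ss%3Aword%21"
sketch_decode_credential('P%40ss%3Aword%21')     # => "P@ss:word!"
```

These results agree with the outputs quoted in the changelog. One caveat of this sketch: the form-component functions map a space to `+` and decode `+` back to a space, which may differ from how the gem treats userinfo.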
data/LICENSE CHANGED
@@ -1,6 +1,6 @@
  BSD 3-Clause License
  
- Copyright (c) 2025, OpenSite AI. All rights reserved.
+ Copyright (c) 2026, OpenSite AI. All rights reserved.
  
  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are met: