uri-idna 0.1.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +22 -0
- data/LICENSE.txt +21 -0
- data/README.md +184 -0
- data/lib/uri/idna/data/idna.rb +4692 -0
- data/lib/uri/idna/data/uts46.rb +8190 -0
- data/lib/uri/idna/intranges.rb +49 -0
- data/lib/uri/idna/process.rb +139 -0
- data/lib/uri/idna/punycode.rb +174 -0
- data/lib/uri/idna/uts46.rb +60 -0
- data/lib/uri/idna/validation/bidi.rb +93 -0
- data/lib/uri/idna/validation.rb +199 -0
- data/lib/uri/idna/version.rb +7 -0
- data/lib/uri/idna.rb +60 -0
- metadata +62 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 06523fb12f478c36affa14f66e65ff8561afdb51a893df0e25a03d1d34325954
+  data.tar.gz: 78a948f74aee704e82e25dfb7d2d1be3b97ab43125b852623d95a5551944bdf0
+SHA512:
+  metadata.gz: e72151513e38a309ad3869a2ffaff53a207c8d4833d71d4faf1356b14ca6e6efc887c3761b20197f67b3ccb918afafd03eb9f0ba000e2707b6765d0f4f10f691
+  data.tar.gz: 2a02097ddf70e302367ad56088994a35511d81dac6735ded5d5d5f3badeed329ed19387eeda9e07d2243fa21e7826f5adb14728394d4738971568273db1759da
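The digests recorded in `checksums.yaml` can be compared against the `metadata.gz` and `data.tar.gz` members of an unpacked `.gem` archive. The following is a minimal sketch of such a check, assuming the YAML has already been parsed into a nested hash; `checksum_mismatches` is a hypothetical helper, not part of uri-idna or RubyGems, and the demonstration uses a throwaway file rather than the real archive members.

```ruby
require "digest"
require "tmpdir"

# Hypothetical helper (not part of uri-idna or RubyGems): compares the hex
# digests from a parsed checksums.yaml against the files found in `dir`.
# Returns an array of [algorithm, file name] pairs that did not match.
def checksum_mismatches(dir, checksums)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }
  mismatches = []
  checksums.each do |algorithm, files|
    digest_class = algorithms.fetch(algorithm)
    files.each do |name, expected|
      actual = digest_class.file(File.join(dir, name)).hexdigest
      mismatches << [algorithm, name] unless actual == expected
    end
  end
  mismatches
end

# Demonstration with a throwaway file standing in for an archive member.
Dir.mktmpdir do |dir|
  File.binwrite(File.join(dir, "metadata.gz"), "example")
  recorded = { "SHA256" => { "metadata.gz" => Digest::SHA256.hexdigest("example") } }
  p checksum_mismatches(dir, recorded) # an empty array means every digest matched
end
```

An empty result means every recorded digest matched; any returned pair names the algorithm and file that differ.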
data/CHANGELOG.md
ADDED
@@ -0,0 +1,22 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog],
+and this project adheres to [Semantic Versioning].
+
+## [Unreleased]
+
+## [0.1.0] - 2023-08-05
+
+### Added
+
+- Initial implementation. ([@skryukov])
+
+[@skryukov]: https://github.com/skryukov
+
+[Unreleased]: https://github.com/skryukov/uri-idna/compare/v0.1.0...HEAD
+[0.1.0]: https://github.com/skryukov/uri-idna/commits/v0.1.0
+
+[Keep a Changelog]: https://keepachangelog.com/en/1.0.0/
+[Semantic Versioning]: https://semver.org/spec/v2.0.0.html
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2023 Svyatoslav Kryukov
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,184 @@
+# URI::IDNA
+
+[](https://rubygems.org/gems/uri-idna)
+[](https://github.com/skryukov/uri-idna/actions/workflows/main.yml)
+
+An IDNA 2008, UTS 46, and Punycode implementation in pure Ruby.
+
+This gem provides a number of functions for converting internationalized domain names (IDNs) between the Unicode and ASCII Compatible Encoding (ACE) forms.
+
+<a href="https://evilmartians.com/?utm_source=rubocop-gradual&utm_campaign=project_page">
+<img src="https://evilmartians.com/badges/sponsored-by-evil-martians.svg" alt="Sponsored by Evil Martians" width="236" height="54">
+</a>
+
+## Installation
+
+Add to your Gemfile:
+```ruby
+gem "uri-idna"
+```
+
+And then run `bundle install`.
+
+## Usage
+
+This gem offers several ways to convert IDNs between Unicode and ACE forms.
+
+### IDNA 2008
+
+[RFC 5891] defines two protocols for IDN conversion: [Registration](https://datatracker.ietf.org/doc/html/rfc5891#section-4) and [Domain Name Lookup](https://datatracker.ietf.org/doc/html/rfc5891#section-5).
+
+#### Registration Protocol
+
+```ruby
+require "uri/idna"
+
+URI::IDNA.register(alabel: "xn--gdkl8fhk5egc", ulabel: "ハロー・ワールド")
+#=> "xn--gdkl8fhk5egc"
+
+URI::IDNA.register(ulabel: "ハロー・ワールド")
+#=> "xn--gdkl8fhk5egc"
+
+URI::IDNA.register(alabel: "xn--gdkl8fhk5egc")
+#=> "xn--gdkl8fhk5egc"
+
+URI::IDNA.register(ulabel: "☕.us")
+#<URI::IDNA::InvalidCodepointError: Codepoint U+2615 at position 1 of "☕" not allowed>
+```
+
+#### Domain Name Lookup Protocol
+
+```ruby
+require "uri/idna"
+
+URI::IDNA.lookup("ハロー・ワールド")
+#=> "xn--gdkl8fhk5egc"
+
+URI::IDNA.lookup("xn--gdkl8fhk5egc")
+#=> "xn--gdkl8fhk5egc"
+
+URI::IDNA.lookup("Ῠ.me")
+#<URI::IDNA::InvalidCodepointError: Codepoint U+1FE8 at position 1 of "Ῠ" not allowed>
+```
+
+### Unicode UTS 46 (TR46)
+
+[UTS 46](https://www.unicode.org/reports/tr46) defines two IDN conversion functions: [ToASCII](https://www.unicode.org/reports/tr46/#ToASCII) and [ToUnicode](https://www.unicode.org/reports/tr46/#ToUnicode).
+
+#### ToASCII
+
+```ruby
+require "uri/idna"
+
+URI::IDNA.to_ascii("Bloß.de")
+#=> "xn--blo-7ka.de"
+
+# UTS 46 transitional processing is disabled by default,
+# but can be enabled via an option:
+URI::IDNA.to_ascii("Bloß.de", uts46_transitional: true)
+#=> "bloss.de"
+
+# Note that UTS 46 processing is not fully IDNA 2008 compliant:
+URI::IDNA.to_ascii("☕.us")
+#=> "xn--53h.us"
+```
+
+#### ToUnicode
+
+```ruby
+require "uri/idna"
+
+URI::IDNA.to_unicode("xn--blo-7ka.de")
+#=> "bloß.de"
+```
+
+#### IDNA 2008 compatibility
+
+It's possible to apply both IDNA 2008 and UTS 46 at once:
+
+```ruby
+require "uri/idna"
+
+URI::IDNA.to_ascii("☕.us", idna_validity: true, contexto: true)
+#<URI::IDNA::InvalidCodepointError: Codepoint U+2615 at position 1 of "☕" not allowed>
+
+# It's also possible to apply UTS 46 to IDNA 2008 protocols:
+URI::IDNA.lookup("Ῠ.me", check_dot: true, uts46: true, uts46_std3: true)
+#=> "xn--rtg.me"
+```
+
+### Punycode
+
+The Punycode module converts between Unicode and Punycode. Note that Punycode on its own is not IDNA 2008 compliant: it only performs the conversion, with no validation.
+
+```ruby
+require "uri/idna/punycode"
+
+URI::IDNA::Punycode.encode("ハロー・ワールド")
+#=> "gdkl8fhk5egc"
+
+URI::IDNA::Punycode.decode("gdkl8fhk5egc")
+#=> "ハロー・ワールド"
+```
+
+## Full technical reference
+
+### IDNA 2008
+- [RFC 5890] – Definitions and Document Framework
+- [RFC 5891] – Protocol
+- [RFC 5892] – The Unicode Code Points
+- [RFC 5893] – Bidi rule
+
+### Punycode
+
+- [RFC 3492] – Punycode: A Bootstring encoding of Unicode
+
+### UTS 46 (also referenced as TR46)
+
+- [Unicode IDNA Compatibility Processing]
+
+## Development
+
+After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+
+To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+
+### Generating Unicode data
+
+This gem uses Unicode data files to perform IDN conversion. To generate new Unicode data files, run `bundle exec rake idna:generate`.
+
+To specify the Unicode version, use the `UNICODE_VERSION` environment variable, e.g. `UNICODE_VERSION=14.0.0 bundle exec rake idna:generate`.
+
+By default, the Unicode version is the one used by the running Ruby version (`RbConfig::CONFIG["UNICODE_VERSION"]`).
+
+To set the directory for generated files, use the `DATA_DIR` environment variable, e.g. `DATA_DIR=lib/uri/idna/data bundle exec rake idna:generate`.
+
+Unicode data is cached in the `tmp` directory by default; to change it, use the `CACHE_DIR` environment variable, e.g. `CACHE_DIR=~/.cache/unicode_data bundle exec rake idna:generate`.
+
+### Inspect Unicode data
+
+To inspect Unicode data, run `bundle exec rake idna:inspect[<HEX_CODE>]`.
+
+To specify the Unicode version or the cache directory, use the `UNICODE_VERSION` or `CACHE_DIR` environment variables, e.g. `UNICODE_VERSION=15.0.0 bundle exec rake idna:inspect[1f495]`.
+
+Note: if you are getting the `no matches found: idna:inspect[1f495]` error, try escaping the brackets: `bundle exec rake idna:inspect\[1f495\]`.
+
+### Update UTS 46 test suite data
+
+To update UTS 46 test suite data, run `bundle exec rake idna:update_uts46_test_suite`.
+
+## Contributing
+
+Bug reports and pull requests are welcome on GitHub at https://github.com/skryukov/uri-idna.
+
+## License
+
+The gem is available as open source under the terms of the [MIT License].
+
+[RFC 5890]: https://datatracker.ietf.org/doc/html/rfc5890
+[RFC 5891]: https://datatracker.ietf.org/doc/html/rfc5891
+[RFC 5892]: https://datatracker.ietf.org/doc/html/rfc5892
+[RFC 5893]: https://datatracker.ietf.org/doc/html/rfc5893
+[RFC 3492]: https://datatracker.ietf.org/doc/html/rfc3492
+[Unicode IDNA Compatibility Processing]: https://www.unicode.org/reports/tr46
+[MIT License]: https://opensource.org/licenses/MIT
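The README's Punycode section describes a plain conversion with no validation. As an illustration of what such a conversion involves, here is a minimal sketch of the RFC 3492 encoder in Ruby. It is not the gem's implementation (the diff of `punycode.rb` is authoritative for that); `PunycodeSketch` and `to_ace` are hypothetical names, and `to_ace` only prepends the `xn--` ACE prefix to non-ASCII labels without any UTS 46 mapping or IDNA 2008 checks.

```ruby
# Sketch of the RFC 3492 Punycode encoder. The constants are the parameter
# values that RFC 3492 fixes for Punycode. Illustrative only, not the gem's code.
module PunycodeSketch
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128
  DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"

  module_function

  # Bias adaptation function (RFC 3492, section 6.1).
  def adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + (BASE - TMIN + 1) * delta / (delta + SKEW)
  end

  # Threshold for digit position k, clamped into TMIN..TMAX.
  def threshold(k, bias)
    (k - bias).clamp(TMIN, TMAX)
  end

  # Encodes a single label; performs no validation of the input.
  def encode(input)
    codepoints = input.codepoints
    basic = codepoints.select { |cp| cp < INITIAL_N }
    output = basic.pack("U*")
    output << "-" unless basic.empty?
    handled = basic.length
    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS
    while handled < codepoints.length
      # Jump to the smallest code point that has not been encoded yet.
      m = codepoints.select { |cp| cp >= n }.min
      delta += (m - n) * (handled + 1)
      n = m
      codepoints.each do |cp|
        delta += 1 if cp < n
        next unless cp == n

        # Emit delta as a generalized variable-length integer.
        q = delta
        k = BASE
        loop do
          t = threshold(k, bias)
          break if q < t

          output << DIGITS[t + (q - t) % (BASE - t)]
          q = (q - t) / (BASE - t)
          k += BASE
        end
        output << DIGITS[q]
        bias = adapt(delta, handled + 1, handled == basic.length)
        delta = 0
        handled += 1
      end
      delta += 1
      n += 1
    end
    output
  end

  # Hypothetical helper: prefixes non-ASCII labels with "xn--".
  # No case mapping, normalization, or validity checks are applied.
  def to_ace(domain)
    domain.split(".").map { |label| label.ascii_only? ? label : "xn--#{encode(label)}" }.join(".")
  end
end

puts PunycodeSketch.encode("ハロー・ワールド") # prints "gdkl8fhk5egc"
puts PunycodeSketch.to_ace("bloß.de")          # prints "xn--blo-7ka.de"
```

The two printed values agree with the README examples for `URI::IDNA::Punycode.encode` and `URI::IDNA.to_ascii`, but the sketch skips everything that makes those calls IDNA-aware: an uppercase input such as `"Bloß.de"` would be encoded as-is rather than mapped to lowercase first.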