uri-idna 0.1.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +22 -0
- data/LICENSE.txt +21 -0
- data/README.md +184 -0
- data/lib/uri/idna/data/idna.rb +4692 -0
- data/lib/uri/idna/data/uts46.rb +8190 -0
- data/lib/uri/idna/intranges.rb +49 -0
- data/lib/uri/idna/process.rb +139 -0
- data/lib/uri/idna/punycode.rb +174 -0
- data/lib/uri/idna/uts46.rb +60 -0
- data/lib/uri/idna/validation/bidi.rb +93 -0
- data/lib/uri/idna/validation.rb +199 -0
- data/lib/uri/idna/version.rb +7 -0
- data/lib/uri/idna.rb +60 -0
- metadata +62 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA256:
|
3
|
+
metadata.gz: 06523fb12f478c36affa14f66e65ff8561afdb51a893df0e25a03d1d34325954
|
4
|
+
data.tar.gz: 78a948f74aee704e82e25dfb7d2d1be3b97ab43125b852623d95a5551944bdf0
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: e72151513e38a309ad3869a2ffaff53a207c8d4833d71d4faf1356b14ca6e6efc887c3761b20197f67b3ccb918afafd03eb9f0ba000e2707b6765d0f4f10f691
|
7
|
+
data.tar.gz: 2a02097ddf70e302367ad56088994a35511d81dac6735ded5d5d5f3badeed329ed19387eeda9e07d2243fa21e7826f5adb14728394d4738971568273db1759da
|
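The digests recorded in `checksums.yaml` can be checked locally against the files inside the `.gem` archive. The following is a minimal sketch, assuming `data.tar.gz` has already been extracted into the working directory; `sha256_matches?` is a hypothetical helper, not part of RubyGems.

```ruby
require "digest"

# Hypothetical helper: compare a file's SHA256 digest with the hex value
# recorded in checksums.yaml.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# SHA256 value for data.tar.gz, copied from checksums.yaml above.
expected = "78a948f74aee704e82e25dfb7d2d1be3b97ab43125b852623d95a5551944bdf0"

# Only run the check when the extracted file is actually present.
if File.exist?("data.tar.gz")
  puts sha256_matches?("data.tar.gz", expected) ? "checksum OK" : "checksum MISMATCH"
end
```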
data/CHANGELOG.md
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
# Changelog
|
2
|
+
|
3
|
+
All notable changes to this project will be documented in this file.
|
4
|
+
|
5
|
+
The format is based on [Keep a Changelog],
|
6
|
+
and this project adheres to [Semantic Versioning].
|
7
|
+
|
8
|
+
## [Unreleased]
|
9
|
+
|
10
|
+
## [0.1.0] - 2023-08-05
|
11
|
+
|
12
|
+
### Added
|
13
|
+
|
14
|
+
- Initial implementation. ([@skryukov])
|
15
|
+
|
16
|
+
[@skryukov]: https://github.com/skryukov
|
17
|
+
|
18
|
+
[Unreleased]: https://github.com/skryukov/uri-idna/compare/v0.1.0...HEAD
|
19
|
+
[0.1.0]: https://github.com/skryukov/uri-idna/commits/v0.1.0
|
20
|
+
|
21
|
+
[Keep a Changelog]: https://keepachangelog.com/en/1.0.0/
|
22
|
+
[Semantic Versioning]: https://semver.org/spec/v2.0.0.html
|
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
The MIT License (MIT)
|
2
|
+
|
3
|
+
Copyright (c) 2023 Svyatoslav Kryukov
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
11
|
+
|
12
|
+
The above copyright notice and this permission notice shall be included in
|
13
|
+
all copies or substantial portions of the Software.
|
14
|
+
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
21
|
+
THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,184 @@
|
|
1
|
+
# URI::IDNA
|
2
|
+
|
3
|
+
[![Gem Version](https://badge.fury.io/rb/uri-idna.svg)](https://rubygems.org/gems/uri-idna)
|
4
|
+
[![Ruby](https://github.com/skryukov/uri-idna/actions/workflows/main.yml/badge.svg)](https://github.com/skryukov/uri-idna/actions/workflows/main.yml)
|
5
|
+
|
6
|
+
An IDNA 2008, UTS 46, and Punycode implementation in pure Ruby.
|
7
|
+
|
8
|
+
This gem provides a number of functions for converting internationalized domain names (IDNs) between the Unicode and ASCII Compatible Encoding (ACE) forms.
|
9
|
+
|
10
|
+
<a href="https://evilmartians.com/?utm_source=rubocop-gradual&utm_campaign=project_page">
|
11
|
+
<img src="https://evilmartians.com/badges/sponsored-by-evil-martians.svg" alt="Sponsored by Evil Martians" width="236" height="54">
|
12
|
+
</a>
|
13
|
+
|
14
|
+
## Installation
|
15
|
+
|
16
|
+
Add to your Gemfile:
|
17
|
+
```ruby
|
18
|
+
gem "uri-idna"
|
19
|
+
```
|
20
|
+
|
21
|
+
And then run `bundle install`.
|
22
|
+
|
23
|
+
## Usage
|
24
|
+
|
25
|
+
The gem offers several ways to convert IDNs between Unicode and ACE forms: the IDNA 2008 protocols, the UTS 46 functions, and plain Punycode.
|
26
|
+
|
27
|
+
### IDNA 2008
|
28
|
+
|
29
|
+
[RFC 5891] defines two protocols for IDN conversion: [Registration](https://datatracker.ietf.org/doc/html/rfc5891#section-4) and [Domain Name Lookup](https://datatracker.ietf.org/doc/html/rfc5891#section-5).
|
30
|
+
|
31
|
+
#### Registration protocol
|
32
|
+
|
33
|
+
```ruby
|
34
|
+
require "uri/idna"
|
35
|
+
|
36
|
+
URI::IDNA.register(alabel: "xn--gdkl8fhk5egc", ulabel: "ハロー・ワールド")
|
37
|
+
#=> "xn--gdkl8fhk5egc"
|
38
|
+
|
39
|
+
URI::IDNA.register(ulabel: "ハロー・ワールド")
|
40
|
+
#=> "xn--gdkl8fhk5egc"
|
41
|
+
|
42
|
+
URI::IDNA.register(alabel: "xn--gdkl8fhk5egc")
|
43
|
+
#=> "xn--gdkl8fhk5egc"
|
44
|
+
|
45
|
+
URI::IDNA.register(ulabel: "☕.us")
|
46
|
+
#<URI::IDNA::InvalidCodepointError: Codepoint U+2615 at position 1 of "☕" not allowed>
|
47
|
+
```
|
48
|
+
|
49
|
+
#### Domain Name Lookup Protocol
|
50
|
+
|
51
|
+
```ruby
|
52
|
+
require "uri/idna"
|
53
|
+
|
54
|
+
URI::IDNA.lookup("ハロー・ワールド")
|
55
|
+
#=> "xn--pck0a1b0a6a2e"
|
56
|
+
|
57
|
+
URI::IDNA.lookup("xn--pck0a1b0a6a2e")
|
58
|
+
#=> "xn--pck0a1b0a6a2e"
|
59
|
+
|
60
|
+
URI::IDNA.lookup("Ῠ.me")
|
61
|
+
#<URI::IDNA::InvalidCodepointError: Codepoint U+1FE8 at position 1 of "Ῠ" not allowed>
|
62
|
+
```
|
63
|
+
|
64
|
+
### Unicode UTS 46 (TR46)
|
65
|
+
|
66
|
+
The [UTS 46](https://www.unicode.org/reports/tr46) defines two IDN conversion functions: [ToASCII](https://www.unicode.org/reports/tr46/#ToASCII) and [ToUnicode](https://www.unicode.org/reports/tr46/#ToUnicode).
|
67
|
+
|
68
|
+
#### ToASCII
|
69
|
+
|
70
|
+
```ruby
|
71
|
+
require "uri/idna"
|
72
|
+
|
73
|
+
URI::IDNA.to_ascii("Bloß.de")
|
74
|
+
#=> "xn--blo-7ka.de"
|
75
|
+
|
76
|
+
# UTS 46 transitional processing is disabled by default,
|
77
|
+
# but can be enabled via option:
|
78
|
+
URI::IDNA.to_ascii("Bloß.de", uts46_transitional: true)
|
79
|
+
#=> "bloss.de"
|
80
|
+
|
81
|
+
# Note that UTS 46 processing on its own is not fully IDNA 2008 compliant:
|
82
|
+
URI::IDNA.to_ascii("☕.us")
|
83
|
+
#=> "xn--53h.us"
|
84
|
+
```
|
85
|
+
|
86
|
+
#### ToUnicode
|
87
|
+
|
88
|
+
```ruby
|
89
|
+
require "uri/idna"
|
90
|
+
|
91
|
+
URI::IDNA.to_unicode("xn--blo-7ka.de")
|
92
|
+
#=> "bloß.de"
|
93
|
+
```
|
94
|
+
|
95
|
+
#### IDNA 2008 compatibility
|
96
|
+
|
97
|
+
It's possible to apply both IDNA 2008 and UTS 46 at once:
|
98
|
+
|
99
|
+
```ruby
|
100
|
+
require "uri/idna"
|
101
|
+
|
102
|
+
URI::IDNA.to_ascii("☕.us", idna_validity: true, contexto: true)
|
103
|
+
#<URI::IDNA::InvalidCodepointError: Codepoint U+2615 at position 1 of "☕" not allowed>
|
104
|
+
|
105
|
+
# It's also possible to apply UTS 46 to IDNA 2008 protocols:
|
106
|
+
URI::IDNA.lookup("Ῠ.me", check_dot: true, uts46: true, uts46_std3: true)
|
107
|
+
#=> "xn--rtg.me"
|
108
|
+
```
|
109
|
+
|
110
|
+
### Punycode
|
111
|
+
|
112
|
+
The Punycode module converts between Unicode and Punycode. Note that Punycode on its own is not IDNA 2008 compliant: it only performs the conversion and does no validation.
|
113
|
+
|
114
|
+
```ruby
|
115
|
+
require "uri/idna/punycode"
|
116
|
+
|
117
|
+
URI::IDNA::Punycode.encode("ハロー・ワールド")
|
118
|
+
#=> "gdkl8fhk5egc"
|
119
|
+
|
120
|
+
URI::IDNA::Punycode.decode("gdkl8fhk5egc")
|
121
|
+
#=> "ハロー・ワールド"
|
122
|
+
```
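
To illustrate what such an encoder does, here is a minimal standalone sketch of the encoding procedure from [RFC 3492]. It is not the gem's implementation: `SketchPunycode` and its method names are hypothetical, and the overflow checks from the RFC are omitted because Ruby integers do not overflow.

```ruby
# Standalone sketch of RFC 3492 encoding (hypothetical module, not the gem's code).
module SketchPunycode
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128
  DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"

  module_function

  # Bias adaptation function (RFC 3492, section 6.1).
  def adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + (((BASE - TMIN + 1) * delta) / (delta + SKEW))
  end

  # Threshold for the digit at position k given the current bias.
  def threshold(k, bias)
    return TMIN if k <= bias
    return TMAX if k >= bias + TMAX

    k - bias
  end

  # Encoding procedure (RFC 3492, section 6.3).
  def encode(input)
    codepoints = input.codepoints
    basic = codepoints.select { |cp| cp < INITIAL_N }
    output = basic.map(&:chr).join
    handled = basic.size
    output << "-" if handled.positive? # delimiter follows the basic code points

    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS

    while handled < codepoints.size
      # Next smallest code point that has not been encoded yet.
      m = codepoints.select { |cp| cp >= n }.min
      delta += (m - n) * (handled + 1)
      n = m

      codepoints.each do |cp|
        delta += 1 if cp < n
        next unless cp == n

        # Emit delta as a generalized variable-length integer.
        q = delta
        k = BASE
        loop do
          t = threshold(k, bias)
          break if q < t

          output << DIGITS[t + ((q - t) % (BASE - t))]
          q = (q - t) / (BASE - t)
          k += BASE
        end
        output << DIGITS[q]
        bias = adapt(delta, handled + 1, handled == basic.size)
        delta = 0
        handled += 1
      end

      delta += 1
      n += 1
    end

    output
  end
end

puts SketchPunycode.encode("bücher") # => bcher-kva
```

Prefixing the result with `xn--` yields the corresponding ACE label (here `xn--bcher-kva`), which is why bare Punycode output carries no prefix.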
|
123
|
+
|
124
|
+
## Full technical reference:
|
125
|
+
|
126
|
+
### IDNA 2008
|
127
|
+
- [RFC 5890] – Definitions and Document Framework
|
128
|
+
- [RFC 5891] – Protocol
|
129
|
+
- [RFC 5892] – The Unicode Code Points
|
130
|
+
- [RFC 5893] – Bidi rule
|
131
|
+
|
132
|
+
### Punycode
|
133
|
+
|
134
|
+
- [RFC 3492] – Punycode: A Bootstring encoding of Unicode
|
135
|
+
|
136
|
+
### UTS 46 (also referred to as TR46)
|
137
|
+
|
138
|
+
- [Unicode IDNA Compatibility Processing]
|
139
|
+
|
140
|
+
## Development
|
141
|
+
|
142
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
143
|
+
|
144
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
|
145
|
+
|
146
|
+
### Generating Unicode data
|
147
|
+
|
148
|
+
This gem uses Unicode data files to perform IDN conversion. To generate new Unicode data files, run `bundle exec rake idna:generate`.
|
149
|
+
|
150
|
+
To specify the Unicode version, use the `UNICODE_VERSION` environment variable, e.g. `UNICODE_VERSION=14.0.0 bundle exec rake idna:generate`.
|
151
|
+
|
152
|
+
By default, the Unicode version is the one used by the running Ruby (`RbConfig::CONFIG["UNICODE_VERSION"]`).
|
153
|
+
|
154
|
+
To set the directory for generated files, use the `DATA_DIR` environment variable, e.g. `DATA_DIR=lib/uri/idna/data bundle exec rake idna:generate`.
|
155
|
+
|
156
|
+
Unicode data is cached in the `tmp` directory by default. To change this, use the `CACHE_DIR` environment variable, e.g. `CACHE_DIR=~/.cache/unicode_data bundle exec rake idna:generate`.
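
The version fallback described above can be sketched as follows. This is only an illustration of the documented behavior, not the rake task's actual code; `target_unicode_version` is a hypothetical helper.

```ruby
require "rbconfig"

# Hypothetical helper mirroring the documented fallback: prefer the
# UNICODE_VERSION environment variable, otherwise use the Unicode version
# the running Ruby was built with.
def target_unicode_version(env = ENV)
  env.fetch("UNICODE_VERSION") { RbConfig::CONFIG["UNICODE_VERSION"] }
end

puts target_unicode_version                                   # version from ENV or Ruby
puts target_unicode_version({ "UNICODE_VERSION" => "14.0.0" }) # explicit override
```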
|
157
|
+
|
158
|
+
### Inspect Unicode data
|
159
|
+
|
160
|
+
To inspect Unicode data, run `bundle exec rake idna:inspect[<HEX_CODE>]`.
|
161
|
+
|
162
|
+
To specify the Unicode version or the cache directory, use the `UNICODE_VERSION` or `CACHE_DIR` environment variables, e.g. `UNICODE_VERSION=15.0.0 bundle exec rake idna:inspect[1f495]`.
|
163
|
+
|
164
|
+
Note: if you get the `no matches found: idna:inspect[1f495]` error, try escaping the brackets: `bundle exec rake idna:inspect\[1f495\]`.
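
To see which character a given hex code refers to before inspecting it, a short sketch like the one below can help. It assumes, as the `1f495` example suggests, that `<HEX_CODE>` is a hexadecimal code point without the `U+` prefix; `describe_codepoint` is a hypothetical helper and not part of the gem.

```ruby
# Hypothetical helper: turn a hex code such as "1f495" into a
# "U+XXXX <character>" description.
def describe_codepoint(hex)
  codepoint = Integer(hex, 16)
  format("U+%04X %s", codepoint, [codepoint].pack("U"))
end

puts describe_codepoint("1f495")
```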
|
165
|
+
|
166
|
+
### Update UTS 46 test suite data
|
167
|
+
|
168
|
+
To update UTS 46 test suite data, run `bundle exec rake idna:update_uts46_test_suite`.
|
169
|
+
|
170
|
+
## Contributing
|
171
|
+
|
172
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/skryukov/uri-idna.
|
173
|
+
|
174
|
+
## License
|
175
|
+
|
176
|
+
The gem is available as open source under the terms of the [MIT License].
|
177
|
+
|
178
|
+
[RFC 5890]: https://datatracker.ietf.org/doc/html/rfc5890
|
179
|
+
[RFC 5891]: https://datatracker.ietf.org/doc/html/rfc5891
|
180
|
+
[RFC 5892]: https://datatracker.ietf.org/doc/html/rfc5892
|
181
|
+
[RFC 5893]: https://datatracker.ietf.org/doc/html/rfc5893
|
182
|
+
[RFC 3492]: https://datatracker.ietf.org/doc/html/rfc3492
|
183
|
+
[Unicode IDNA Compatibility Processing]: https://www.unicode.org/reports/tr46
|
184
|
+
[MIT License]: https://opensource.org/licenses/MIT
|