unicode-confusable 1.0.1 → 1.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -1
- data/README.md +12 -9
- data/data/confusable.marshal.gz +0 -0
- data/lib/unicode/confusable/constants.rb +2 -1
- data/unicode-confusable.gemspec +1 -1
- metadata +6 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 250ef2174c93d982622bd010538575ab71da22a5
+  data.tar.gz: bb61daf9d9370be515fd26e234f7dd638ab65f3a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 944d65e2a740ac4ba02e6f6b871217c59bf6c43bf4748005f1f7f92d0cbf1d8a667dc5d44cbb232ff805514d753b513631864fab6d3b30585cd5ff8df848c675
+  data.tar.gz: 59e152950196c0794888a6581394ece3c0df2cf1746544dc957ac8e06b76904854714237e1cc6796fb23117ac5275725ae884d8f1b646ad2def6e0bb0cb9db45
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -1,29 +1,32 @@
 # Unicode::Confusable [![[version]](https://badge.fury.io/rb/unicode-confusable.svg)](http://badge.fury.io/rb/unicode-confusable) [![[travis]](https://travis-ci.org/janlelis/unicode-confusable.png)](https://travis-ci.org/janlelis/unicode-confusable)
 
-Compares two strings if they are visually confusable as described in [Unicode® Technical Standard #39](http://www.unicode.org/reports/tr39/#Confusable_Detection): Both strings get transformed into a skeleton format before comparing them. The skeleton is generated by normalizing the string, replacing [confusable characters](ftp://ftp.unicode.org/Public/security/8.0.0/confusables.txt), and normalizing the string again.
+Compares two strings if they are visually confusable as described in [Unicode® Technical Standard #39](http://www.unicode.org/reports/tr39/#Confusable_Detection): Both strings get transformed into a skeleton format before comparing them. The skeleton is generated by normalizing the string ([NFD](http://unicode.org/reports/tr15/#Norm_Forms)), replacing [confusable characters](ftp://ftp.unicode.org/Public/security/8.0.0/confusables.txt), and normalizing the string again.
 
-Unicode version: **
+Unicode version: **9.0.0**
 
 Supported Rubies: **2.3**, **2.2**
 
-## `Gemfile`
-
-```ruby
-gem "unicode-confusable"
-```
-
 ## Usage
 
 ```ruby
 require "unicode/confusable"
 
 Unicode::Confusable.confusable? "a", "b" # => false
+Unicode::Confusable.confusable? "C", "С" # => true
 Unicode::Confusable.confusable? "ℜ𝘂ᖯʏ", "Ruby" # => true
 Unicode::Confusable.confusable? "Michael", "Michae1" # => true
 Unicode::Confusable.confusable? "⁇", "?" # => false
 Unicode::Confusable.confusable? "⁇", "??" # => true
 ```
 
+### Skeleton
+
+```ruby
+Unicode::Confusable.skeleton "ℜ𝘂ᖯʏ" # => "Ruby"
+```
+
+**Please note:** The skeleton is an intermediate representation, not meant for any other use than testing confusability, [according to the standard](http://www.unicode.org/reports/tr39/#Confusable_Detection).
+
 ## No Advanced Detection
 
 TR 39 also describes mechanisms for a more exact recognition of confusables, also within the same string:
@@ -32,7 +35,7 @@ TR 39 also describes mechanisms for a more exact recognition of confusables, als
 - Mixed-script confusable
 - Whole-script confusable
 
-This is
+This is currently **not** supported by this gem.
 
 ## MIT License
 
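Note on the README change: the description above defines the skeleton as NFD normalization, replacement of confusable characters, and a second normalization. The following is a minimal Ruby sketch of that pipeline, not the gem's implementation. It assumes a tiny hand-written mapping (`SAMPLE_CONFUSABLES`, hypothetical) in place of the gem's data file, and it assumes the second pass is also NFD, since the README only says "normalizing the string again". The names `sketch_skeleton` and `sketch_confusable?` are illustrative and not part of the gem's API.

```ruby
# Illustrative excerpt only; the real mapping is far larger and ships
# with the gem as data/confusable.marshal.gz.
SAMPLE_CONFUSABLES = {
  "\u0421" => "C",  # CYRILLIC CAPITAL LETTER ES looks like Latin "C"
  "1"      => "l",  # DIGIT ONE looks like Latin small "l"
  "\u2047" => "??", # DOUBLE QUESTION MARK maps to two question marks
}.freeze

# Skeleton sketch: normalize (NFD), replace confusables, normalize again.
def sketch_skeleton(string)
  decomposed = string.unicode_normalize(:nfd)
  replaced = decomposed.each_char.map { |char| SAMPLE_CONFUSABLES.fetch(char, char) }.join
  replaced.unicode_normalize(:nfd)
end

# Two strings count as confusable when their skeletons are identical.
def sketch_confusable?(first, second)
  sketch_skeleton(first) == sketch_skeleton(second)
end

sketch_confusable?("Michael", "Michae1") # => true
sketch_confusable?("\u2047", "?")        # => false
sketch_confusable?("\u2047", "??")       # => true
```

Comparing skeletons rather than raw strings is what lets one character match a sequence of several, as in the double question mark case.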
data/data/confusable.marshal.gz
CHANGED
Binary file
data/lib/unicode/confusable/constants.rb
CHANGED
@@ -1,6 +1,7 @@
 module Unicode
   module Confusable
-    VERSION = "1.0
+    VERSION = "1.1.0".freeze
+    UNICODE_VERSION = "9.0.0".freeze
     DATA_DIRECTORY = File.expand_path(File.dirname(__FILE__) + '/../../../data/').freeze
     INDEX_FILENAME = (DATA_DIRECTORY + '/confusable.marshal.gz').freeze
   end
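Note on `INDEX_FILENAME`: the `.marshal.gz` extension suggests a gzip-compressed `Marshal` dump, but the loading code is not part of this diff. The sketch below shows how such a file could be written and read back with Ruby's standard library, under that assumption. The helper names `write_index` and `read_index` and the one-entry sample hash are hypothetical, and the demo uses a temporary directory rather than the gem's data directory.

```ruby
require "zlib"
require "tmpdir"

# Hypothetical helper: serialize an object with Marshal and gzip it.
def write_index(path, index)
  Zlib::GzipWriter.open(path) { |gz| gz.write(Marshal.dump(index)) }
end

# Hypothetical helper: gunzip the file and deserialize its contents.
# Marshal.load should only be used on trusted files.
def read_index(path)
  Zlib::GzipReader.open(path) { |gz| Marshal.load(gz.read) }
end

# Round trip inside a temporary directory that is removed afterwards.
ROUND_TRIPPED = Dir.mktmpdir do |dir|
  path = File.join(dir, "confusable.marshal.gz")
  write_index(path, { "\u0421" => "C" })
  read_index(path)
end
```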
data/unicode-confusable.gemspec
CHANGED
@@ -6,7 +6,7 @@ Gem::Specification.new do |gem|
   gem.name = "unicode-confusable"
   gem.version = Unicode::Confusable::VERSION
   gem.summary = "Detect characters that look visually similar."
-  gem.description = "Compares two strings if they are visually confusable as described in Unicode® Technical Standard #39: Both strings get transformed into a skeleton format before comparing them. The skeleton is generated by normalizing the string, replacing confusable characters, and normalizing the string again."
+  gem.description = "[Unicode #{Unicode::Confusable::UNICODE_VERSION}] Compares two strings if they are visually confusable as described in Unicode® Technical Standard #39: Both strings get transformed into a skeleton format before comparing them. The skeleton is generated by normalizing the string, replacing confusable characters, and normalizing the string again."
   gem.authors = ["Jan Lelis"]
   gem.email = ["mail@janlelis.de"]
   gem.homepage = "https://github.com/janlelis/unicode-confusable"
metadata
CHANGED
@@ -1,19 +1,19 @@
 --- !ruby/object:Gem::Specification
 name: unicode-confusable
 version: !ruby/object:Gem::Version
-  version: 1.0
+  version: 1.1.0
 platform: ruby
 authors:
 - Jan Lelis
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2016-
+date: 2016-06-21 00:00:00.000000000 Z
 dependencies: []
-description:
-  Unicode® Technical Standard #39: Both strings get transformed into
-  before comparing them. The skeleton is generated by normalizing
-  confusable characters, and normalizing the string again.
+description: "[Unicode 9.0.0] Compares two strings if they are visually confusable
+  as described in Unicode® Technical Standard #39: Both strings get transformed into
+  a skeleton format before comparing them. The skeleton is generated by normalizing
+  the string, replacing confusable characters, and normalizing the string again."
 email:
 - mail@janlelis.de
 executables: []
@@ -61,4 +61,3 @@ specification_version: 4
 summary: Detect characters that look visually similar.
 test_files:
 - spec/unicode_confusable_spec.rb
-has_rdoc: