unicode-emoji 1.1.0 → 2.0.0
- checksums.yaml +5 -5
- data/.travis.yml +5 -4
- data/CHANGELOG.md +9 -0
- data/MIT-LICENSE.txt +1 -1
- data/README.md +60 -10
- data/data/emoji.marshal.gz +0 -0
- data/lib/unicode/emoji.rb +92 -23
- data/lib/unicode/emoji/constants.rb +3 -3
- data/spec/unicode_emoji_spec.rb +92 -4
- metadata +4 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
|
-
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: 32bec9a0f826ab808cf77b3bf69e8248de000d99
|
4
|
+
data.tar.gz: 6c8a53dc8874ab6bf508aad2a914eded9a7a4889
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: c5ebaf7c4c6a66331af9c0f927f8f41079aaf89d3389c4ba84533c1f64fbd2b4456657971e05c437987568c6853f844f1084986721cede6da8979696e4efffd6
|
7
|
+
data.tar.gz: 3fc8af7fc6bdcaac8ac14ec148c4322a8d26a1e627895ec54a3a038a85062c9f00d37317208cdbe876cdc31603e83943eac569ee937bccfd7e44df15d239ac19
|
data/.travis.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,14 @@
|
|
1
1
|
## CHANGELOG
|
2
2
|
|
3
|
+
### 2.0.0
|
4
|
+
|
5
|
+
- Emoji 12.0 data (including valid subdivisions)
|
6
|
+
- Introduce new `REGEX_WELL_FORMED` to be able to match for invalid tag and region sequences
|
7
|
+
- Introduce new `*_INCLUDE_TEXT` regexes which include matching for textual presentation emoji
|
8
|
+
- Refactoring: Update Emoji matching to latest standard while keeping naming close to standard
|
9
|
+
- Issue warning when using `#list` method to retrieve outdated category
|
10
|
+
- Change matching for ZWJ sequences: Do not limit sequence to a maximum of 3 ZWJs
|
11
|
+
|
3
12
|
### 1.1.0
|
4
13
|
|
5
14
|
- Emoji 11.0
|
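The changelog entry about ZWJ sequences can be illustrated with a small sketch. It is not the gem's code: the `element` pattern is a simplified stand-in (an assumption) for the gem's data-driven element pattern, and the capped variant is a hypothetical form shown only for comparison.

```ruby
# Simplified stand-in for one emoji element (assumption: a few emoji blocks,
# optionally followed by the emoji variation selector).
zwj = "\u{200D}"
element = "[\u{1F300}-\u{1FAFF}\u{2600}-\u{27BF}]\u{FE0F}?"

# Uncapped: one or more "element + ZWJ" pairs, then a final element.
uncapped = Regexp.new("(?:#{element}#{zwj})+#{element}")
# Hypothetical capped form with at most three joiners, for comparison only.
capped = Regexp.new("(?:#{element}#{zwj}){1,3}#{element}")

# Five elements joined by four ZWJs.
sequence = ["\u{1F468}", "\u{1F469}", "\u{1F467}", "\u{1F466}", "\u{1F476}"].join(zwj)

uncapped_match = sequence[uncapped] # the whole sequence
capped_match = sequence[capped]     # stops after the third joiner

puts uncapped_match == sequence
puts capped_match == sequence
```

With a cap, the trailing joiner and element are left unmatched, so one intended sequence is split into two results.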
data/MIT-LICENSE.txt
CHANGED
data/README.md
CHANGED
@@ -1,12 +1,12 @@
|
|
1
|
-
# Unicode::Emoji [![[version]](https://badge.fury.io/rb/unicode-emoji.svg)](
|
1
|
+
# Unicode::Emoji [![[version]](https://badge.fury.io/rb/unicode-emoji.svg)](https://badge.fury.io/rb/unicode-emoji) [![[travis]](https://travis-ci.org/janlelis/unicode-emoji.svg)](https://travis-ci.org/janlelis/unicode-emoji)
|
2
2
|
|
3
3
|
A small Ruby library which provides Unicode Emoji data and regexes.
|
4
4
|
|
5
5
|
Also includes a categorized list of recommended Emoji.
|
6
6
|
|
7
|
-
Emoji version: **
|
7
|
+
Emoji version: **12.0** (February 2019)
|
8
8
|
|
9
|
-
Supported Rubies: **2.5**, **2.4**, **2.3**
|
9
|
+
Supported Rubies: **2.6**, **2.5**, **2.4**, **2.3**
|
10
10
|
|
11
11
|
If you are stuck on an older Ruby version, checkout the latest [0.9 version](https://rubygems.org/gems/unicode-emoji/versions/0.9.3) of this gem.
|
12
12
|
|
@@ -20,7 +20,7 @@ gem "unicode-emoji"
|
|
20
20
|
|
21
21
|
### Regex
|
22
22
|
|
23
|
-
|
23
|
+
The gem includes a bunch of Emoji regexes, which are compiled out of various Emoji Unicode data sources.
|
24
24
|
|
25
25
|
```ruby
|
26
26
|
require "unicode/emoji"
|
@@ -40,16 +40,64 @@ string = "String which contains all kinds of emoji:
|
|
40
40
|
string.scan(Unicode::Emoji::REGEX) # => ["😴", "▶️", "🛌🏽", "🇵🇹", "🏴󠁧󠁢󠁳󠁣󠁴󠁿", "2️⃣", "🤾🏽‍♀️"]
|
41
41
|
```
|
42
42
|
|
43
|
+
#### Main Regexes
|
44
|
+
|
45
|
+
Matches (non-textual) Emoji of all kinds:
|
46
|
+
|
43
47
|
Regex | Description | Example Matches | Example Non-Matches
|
44
48
|
------------------------------|-------------|-----------------|--------------------
|
45
|
-
`Unicode::Emoji::REGEX` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of
|
46
|
-
`Unicode::Emoji::REGEX_VALID` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of valid Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢` | `😴︎`, `▶`, `🏻`, `🇵🇵`
|
47
|
-
`Unicode::Emoji::
|
48
|
-
|
49
|
-
|
49
|
+
`Unicode::Emoji::REGEX` | **Use this if unsure!** Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of *recommended* Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🤾🏽‍♀️` | `😴︎`, `▶`, `🏻`, `🇵🇵`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤠‍🤢`
|
50
|
+
`Unicode::Emoji::REGEX_VALID` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of *valid* Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢` | `😴︎`, `▶`, `🏻`, `🇵🇵`
|
51
|
+
`Unicode::Emoji::REGEX_WELL_FORMED` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of *well-formed* Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`, `🇵🇵` | `😴︎`, `▶`, `🏻`
|
52
|
+
|
53
|
+
##### Picking the Right Emoji Regex
|
54
|
+
|
55
|
+
- Usually you just want `REGEX` (RGI set)
|
56
|
+
- If you want broader matching (e.g. more sub-regions), choose `REGEX_VALID`
|
57
|
+
- If you even want to match for invalid sequences, too, use `REGEX_WELL_FORMED`
|
58
|
+
|
59
|
+
Please see [the standard](http://www.unicode.org/reports/tr51/#Emoji_Sets) for details.
|
60
|
+
|
61
|
+
Property | `REGEX` (RGI / Recommended) | `REGEX_VALID` (Valid) | `REGEX_WELL_FORMED` (Well-formed)
|
62
|
+
---------|-----------------------------|-----------------------|----------------------------------
|
63
|
+
Region "🇵🇹" | Yes | Yes | Yes
|
64
|
+
Region "🇵🇵" | No | No | Yes
|
65
|
+
Tag Sequence "🏴󠁧󠁢󠁳󠁣󠁴󠁿" | Yes | Yes | Yes
|
66
|
+
Tag Sequence "🏴󠁧󠁢󠁡󠁧󠁢󠁿" | No | Yes | Yes
|
67
|
+
Tag Sequence "🏴󠁧󠁢󠁡󠁡󠁡󠁿" | No | No | Yes
|
68
|
+
ZWJ Sequence "🤾🏽‍♀️" | Yes | Yes | Yes
|
69
|
+
ZWJ Sequence "🤠‍🤢" | No | Yes | Yes
|
50
70
|
|
51
71
|
More info about valid vs. recommended Emoji in this [blog article on Emojipedia](http://blog.emojipedia.org/unicode-behind-the-curtain/).
|
52
72
|
|
73
|
+
#### Singleton Regexes
|
74
|
+
|
75
|
+
Matches only simple one-codepoint (+ optional variation selector) Emoji:
|
76
|
+
|
77
|
+
Regex | Description | Example Matches | Example Non-Matches
|
78
|
+
------------------------------|-------------|-----------------|--------------------
|
79
|
+
`Unicode::Emoji::REGEX_BASIC` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji), but no sequences at all | `😴`, `▶️` | `😴︎`, `▶`, `🏻`, `🛌🏽`, `🇵🇹`, `🇵🇵`,`2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`
|
80
|
+
`Unicode::Emoji::REGEX_TEXT` | Matches only textual singleton Emoji (except for singleton components, like digit 1) | `😴︎`, `▶` | `😴`, `▶️`, `🏻`, `🛌🏽`, `🇵🇹`, `🇵🇵`,`2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`
|
81
|
+
|
82
|
+
#### Include Textual Emoji
|
83
|
+
|
84
|
+
By default, textual Emoji (emoji characters with text variation selector or those that have a default text presentation) will not be included in the default regexes. However, if you wish to match for them too, you can include them in your regex by appending the `_INCLUDE_TEXT` suffix:
|
85
|
+
|
86
|
+
Regex | Description | Example Matches | Example Non-Matches
|
87
|
+
------------------------------|-------------|-----------------|--------------------
|
88
|
+
`Unicode::Emoji::REGEX_INCLUDE_TEXT` | `REGEX` + `REGEX_TEXT` | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🤾🏽‍♀️`, `😴︎`, `▶` | `🏻`, `🇵🇵`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤠‍🤢`
|
89
|
+
`Unicode::Emoji::REGEX_VALID_INCLUDE_TEXT` | `REGEX_VALID` + `REGEX_TEXT` | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`, `😴︎`, `▶` | `🏻`, `🇵🇵`
|
90
|
+
`Unicode::Emoji::REGEX_WELL_FORMED_INCLUDE_TEXT` | `REGEX_WELL_FORMED` + `REGEX_TEXT` | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`, `🇵🇵`, `😴︎`, `▶` | `🏻`
|
91
|
+
|
92
|
+
#### Partial Regexes
|
93
|
+
|
94
|
+
Matches potential Emoji parts (often, this is not what you want):
|
95
|
+
|
96
|
+
Regex | Description | Example Matches | Example Non-Matches
|
97
|
+
------------------------------|-------------|-----------------|--------------------
|
98
|
+
`Unicode::Emoji::REGEX_ANY` | Matches any Emoji-related codepoint (but no variation selectors, tags, or zero-width joiners). Please note that this will match Emoji-parts rather than complete Emoji, for example, single digits! | `😴`, `▶`, `🏻`, `🛌`, `🏽`, `🇵`, `🇹`, `2`, `🏴`, `🤾`, `♀`, `🤠`, `🤢` | -
|
99
|
+
|
100
|
+
|
53
101
|
### List
|
54
102
|
|
55
103
|
Use `Unicode::Emoji::LIST` or the list method to get a grouped (and ordered) list of Emoji:
|
@@ -65,6 +113,8 @@ Unicode::Emoji.list("Food & Drink", "food-asian")
|
|
65
113
|
=> ["🍱", "🍘", "🍙", "🍚", "🍛", "🍜", "🍝", "🍠", "🍢", "🍣", "🍤", "🍥", "🍡", "\u{1F95F}", "\u{1F960}", "\u{1F961}"]
|
66
114
|
```
|
67
115
|
|
116
|
+
Please note that categories might change with future versions of the Emoji standard. This gem will issue warnings when attempting to retrieve old categories using the `#list` method.
|
117
|
+
|
68
118
|
A markdown file with all recommended Emoji can be found [in this gist](https://gist.github.com/janlelis/72f9be1f0ecca07372c64cf13894b801).
|
69
119
|
|
70
120
|
### Properties
|
@@ -87,5 +137,5 @@ Unicode::Emoji.properties "☝" # => ["Emoji", "Emoji_Modifier_Base"]
|
|
87
137
|
|
88
138
|
## MIT
|
89
139
|
|
90
|
-
- Copyright (C) 2017
|
140
|
+
- Copyright (C) 2017-2019 Jan Lelis <http://janlelis.com>. Released under the MIT license.
|
91
141
|
- Unicode data: http://www.unicode.org/copyright.html#Exhibit1
|
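The difference between valid and well-formed region flags in the README tables can be sketched as follows. This is an illustration under stated assumptions, not the gem's implementation: `pack_and_join` is re-implemented here as a guess at the helper of the same name, and `valid_flags` is a hypothetical three-entry sample rather than the gem's full data set.

```ruby
# Helpers modelled on the lambdas in lib/unicode/emoji.rb (pack_and_join is an assumption).
pack = ->(ord) { Regexp.escape(Array(ord).pack("U*")) }
join = ->(*strings) { "(?:" + strings.join("|") + ")" }
pack_and_join = ->(ords) { join[*ords.map { |ord| pack[ord] }] }

regional_indicators = [*0x1F1E6..0x1F1FF]

# Map an ASCII letter to its regional indicator codepoint, and a country code to a flag.
indicator = ->(letter) { 0x1F1E6 + (letter.ord - "A".ord) }
flag = ->(code) { code.chars.map(&indicator).pack("U*") }

# Hypothetical sample of valid flags (the gem reads the real list from its index).
valid_flags = %w[PT DE JP].map { |code| code.chars.map(&indicator) }

# Well-formed: any two regional indicators. Valid: only known pairs.
well_formed_flag = Regexp.new(pack_and_join[regional_indicators] * 2)
valid_flag = Regexp.new(pack_and_join[valid_flags])

puts valid_flag.match?(flag["PT"])
puts valid_flag.match?(flag["PP"])
puts well_formed_flag.match?(flag["PP"])
```

The pair "PP" is rejected by the valid pattern but accepted by the well-formed one, which mirrors the Region rows of the property table.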
data/data/emoji.marshal.gz
CHANGED
Binary file
|
data/lib/unicode/emoji.rb
CHANGED
@@ -18,8 +18,10 @@ module Unicode
|
|
18
18
|
TEXT_VARIATION_SELECTOR = 0xFE0E
|
19
19
|
EMOJI_TAG_BASE_FLAG = 0x1F3F4
|
20
20
|
CANCEL_TAG = 0xE007F
|
21
|
+
TAGS = [*0xE0020..0xE007E]
|
21
22
|
EMOJI_KEYCAP_SUFFIX = 0x20E3
|
22
23
|
ZWJ = 0x200D
|
24
|
+
REGIONAL_INDICATORS = [*0x1F1E6..0x1F1FF]
|
23
25
|
|
24
26
|
EMOJI_CHAR = INDEX[:PROPERTIES].select{ |ord, props| props.include?(:E) }.keys.freeze
|
25
27
|
EMOJI_PRESENTATION = INDEX[:PROPERTIES].select{ |ord, props| props.include?(:P) }.keys.freeze
|
@@ -36,6 +38,10 @@ module Unicode
|
|
36
38
|
RECOMMENDED_ZWJ_SEQUENCES = INDEX[:ZWJ].freeze
|
37
39
|
|
38
40
|
LIST = INDEX[:LIST].freeze.each_value(&:freeze)
|
41
|
+
LIST_REMOVED_KEYS = [
|
42
|
+
"Smileys & People",
|
43
|
+
"Component",
|
44
|
+
]
|
39
45
|
|
40
46
|
pack = ->(ord){ Regexp.escape(Array(ord).pack("U*")) }
|
41
47
|
join = -> (*strings){ "(?:" + strings.join("|") + ")" }
|
@@ -61,6 +67,9 @@ module Unicode
|
|
61
67
|
emoji_presentation + "(?!" + pack[TEXT_VARIATION_SELECTOR] + ")" + pack[EMOJI_VARIATION_SELECTOR] + "?",
|
62
68
|
]
|
63
69
|
|
70
|
+
non_component_emoji_presentation_sequence = \
|
71
|
+
"(?!" + emoji_component + ")" + emoji_presentation_sequence
|
72
|
+
|
64
73
|
text_presentation_sequence = \
|
65
74
|
join[
|
66
75
|
pack_and_join[TEXT_PRESENTATION]+ "(?!" + join[emoji_modifier, pack[EMOJI_VARIATION_SELECTOR]] + ")" + pack[TEXT_VARIATION_SELECTOR] + "?",
|
@@ -73,9 +82,36 @@ module Unicode
|
|
73
82
|
emoji_keycap_sequence = \
|
74
83
|
pack_and_join[EMOJI_KEYCAPS] + pack[[EMOJI_VARIATION_SELECTOR, EMOJI_KEYCAP_SUFFIX]]
|
75
84
|
|
76
|
-
|
85
|
+
emoji_valid_flag_sequence = \
|
77
86
|
pack_and_join[VALID_REGION_FLAGS]
|
78
87
|
|
88
|
+
emoji_well_formed_flag_sequence = \
|
89
|
+
"(?:" +
|
90
|
+
pack_and_join[REGIONAL_INDICATORS] +
|
91
|
+
pack_and_join[REGIONAL_INDICATORS] +
|
92
|
+
")"
|
93
|
+
|
94
|
+
emoji_valid_core_sequence = \
|
95
|
+
join[
|
96
|
+
# emoji_character,
|
97
|
+
emoji_keycap_sequence,
|
98
|
+
emoji_modifier_sequence,
|
99
|
+
non_component_emoji_presentation_sequence,
|
100
|
+
emoji_valid_flag_sequence,
|
101
|
+
]
|
102
|
+
|
103
|
+
emoji_well_formed_core_sequence = \
|
104
|
+
join[
|
105
|
+
# emoji_character,
|
106
|
+
emoji_keycap_sequence,
|
107
|
+
emoji_modifier_sequence,
|
108
|
+
non_component_emoji_presentation_sequence,
|
109
|
+
emoji_well_formed_flag_sequence,
|
110
|
+
]
|
111
|
+
|
112
|
+
emoji_rgi_tag_sequence = \
|
113
|
+
pack_and_join[RECOMMENDED_SUBDIVISION_FLAGS]
|
114
|
+
|
79
115
|
emoji_valid_tag_sequence = \
|
80
116
|
"(?:" +
|
81
117
|
pack[EMOJI_TAG_BASE_FLAG] +
|
@@ -83,35 +119,60 @@ module Unicode
|
|
83
119
|
pack[CANCEL_TAG] +
|
84
120
|
")"
|
85
121
|
|
86
|
-
|
122
|
+
emoji_well_formed_tag_sequence = \
|
123
|
+
"(?:" +
|
124
|
+
join[
|
125
|
+
non_component_emoji_presentation_sequence,
|
126
|
+
emoji_modifier_sequence,
|
127
|
+
] +
|
128
|
+
pack_and_join[TAGS] + "+" +
|
129
|
+
pack[CANCEL_TAG] +
|
130
|
+
")"
|
131
|
+
|
132
|
+
emoji_rgi_zwj_sequence = \
|
133
|
+
pack_and_join[RECOMMENDED_ZWJ_SEQUENCES]
|
134
|
+
|
135
|
+
emoji_valid_zwj_element = \
|
87
136
|
join[
|
88
137
|
emoji_modifier_sequence,
|
89
138
|
emoji_presentation_sequence,
|
90
139
|
emoji_character,
|
91
140
|
]
|
92
141
|
|
93
|
-
|
94
|
-
|
95
|
-
|
96
|
-
|
97
|
-
|
98
|
-
|
99
|
-
|
100
|
-
|
101
|
-
|
102
|
-
|
142
|
+
emoji_valid_zwj_sequence = \
|
143
|
+
"(?:" +
|
144
|
+
"(?:" + emoji_valid_zwj_element + pack[ZWJ] + ")+" + emoji_valid_zwj_element +
|
145
|
+
")"
|
146
|
+
|
147
|
+
emoji_rgi_sequence = \
|
148
|
+
join[
|
149
|
+
emoji_rgi_zwj_sequence,
|
150
|
+
emoji_rgi_tag_sequence,
|
151
|
+
emoji_valid_core_sequence,
|
152
|
+
]
|
153
|
+
|
154
|
+
emoji_valid_sequence = \
|
155
|
+
join[
|
156
|
+
emoji_valid_zwj_sequence,
|
157
|
+
emoji_valid_tag_sequence,
|
158
|
+
emoji_valid_core_sequence,
|
159
|
+
]
|
160
|
+
|
161
|
+
emoji_well_formed_sequence = \
|
162
|
+
join[
|
163
|
+
emoji_valid_zwj_sequence,
|
164
|
+
emoji_well_formed_tag_sequence,
|
165
|
+
emoji_well_formed_core_sequence,
|
166
|
+
]
|
167
|
+
|
168
|
+
# Matches basic singleton emoji and all kind of sequences, but restrict zwj and tag sequences to known sequences (rgi)
|
169
|
+
REGEX = Regexp.compile(emoji_rgi_sequence)
|
103
170
|
|
104
171
|
# Matches basic singleton emoji and all kind of valid sequences
|
105
|
-
REGEX_VALID = Regexp.compile(
|
106
|
-
|
107
|
-
|
108
|
-
|
109
|
-
?| + emoji_modifier_sequence +
|
110
|
-
?| + "(?!" + emoji_component + ")" + emoji_presentation_sequence +
|
111
|
-
?| + emoji_keycap_sequence +
|
112
|
-
?| + emoji_valid_region_sequence +
|
113
|
-
""
|
114
|
-
)
|
172
|
+
REGEX_VALID = Regexp.compile(emoji_valid_sequence)
|
173
|
+
|
174
|
+
# Matches basic singleton emoji and all kind of sequences
|
175
|
+
REGEX_WELL_FORMED = Regexp.compile(emoji_well_formed_sequence)
|
115
176
|
|
116
177
|
# Matches only basic single, non-textual emoji
|
117
178
|
# Ignores "components" like modifiers or simple digits
|
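The shape of a well-formed tag sequence (base, one or more tag characters, cancel tag) can be sketched in isolation. This is a simplified illustration: it only accepts the black flag as base, whereas the diff above allows presentation and modifier sequences, and it uses a character class instead of the alternation that `pack_and_join[TAGS]` presumably produces. The `tag_sequence` helper is hypothetical.

```ruby
emoji_tag_base_flag = 0x1F3F4
cancel_tag = 0xE007F
tags = [*0xE0020..0xE007E]

pack = ->(ord) { Regexp.escape(Array(ord).pack("U*")) }

# Character class covering the whole tag range (a compact equivalent of an alternation).
tag_class = "[" + [tags.first].pack("U") + "-" + [tags.last].pack("U") + "]"

well_formed_tag_sequence = Regexp.new(
  pack[emoji_tag_base_flag] + tag_class + "+" + pack[cancel_tag]
)

# Hypothetical helper: turn ASCII text into base flag + tag characters + cancel tag.
tag_sequence = lambda do |ascii|
  [emoji_tag_base_flag, *ascii.each_char.map { |c| 0xE0000 + c.ord }, cancel_tag].pack("U*")
end

scotland = tag_sequence["gbsct"]     # a recommended subdivision flag
not_a_region = tag_sequence["gbaaa"] # well-formed, but not a valid subdivision
unterminated = [emoji_tag_base_flag, 0xE0067, 0xE0062].pack("U*") # no cancel tag

puts scotland[well_formed_tag_sequence] == scotland
puts not_a_region[well_formed_tag_sequence] == not_a_region
puts unterminated.match?(well_formed_tag_sequence)
```

Because the pattern checks structure only, it accepts "gbaaa" just like "gbsct"; rejecting the former requires a list of known subdivisions.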
@@ -125,11 +186,16 @@ module Unicode
|
|
125
186
|
"(?!" + emoji_component + ")" + text_presentation_sequence
|
126
187
|
)
|
127
188
|
|
128
|
-
# Matches any emoji-related codepoint
|
189
|
+
# Matches any emoji-related codepoint - Use with caution (returns partial matches)
|
129
190
|
REGEX_ANY = Regexp.compile(
|
130
191
|
emoji_character
|
131
192
|
)
|
132
193
|
|
194
|
+
# Combined REGEXes which also match for TEXTUAL emoji
|
195
|
+
REGEX_INCLUDE_TEXT = Regexp.union(REGEX, REGEX_TEXT)
|
196
|
+
REGEX_VALID_INCLUDE_TEXT = Regexp.union(REGEX_VALID, REGEX_TEXT)
|
197
|
+
REGEX_WELL_FORMED_INCLUDE_TEXT = Regexp.union(REGEX_WELL_FORMED, REGEX_TEXT)
|
198
|
+
|
133
199
|
def self.properties(char)
|
134
200
|
ord = get_codepoint_value(char)
|
135
201
|
props = INDEX[:PROPERTIES][ord]
|
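The `*_INCLUDE_TEXT` constants above are built with `Regexp.union`. A minimal sketch of that combination, using two tiny stand-in patterns (assumptions, not the gem's regexes) for the play button only:

```ruby
# Stand-in for REGEX: play button with emoji variation selector.
emoji_style = /\u{25B6}\u{FE0F}/
# Stand-in for REGEX_TEXT: play button without emoji variation selector,
# optionally followed by the text variation selector.
text_style = /\u{25B6}(?!\u{FE0F})\u{FE0E}?/

# Alternation of both patterns, tried left to right.
include_text = Regexp.union(emoji_style, text_style)

sample = "\u{25B6}\u{FE0F} and \u{25B6}"

emoji_only = sample.scan(emoji_style)
with_text = sample.scan(include_text)

puts emoji_only.size
puts with_text.size
```

`Regexp.union` keeps each source pattern intact, so the combined regex matches whatever either part matches.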
@@ -143,6 +209,9 @@ module Unicode
|
|
143
209
|
|
144
210
|
def self.list(key = nil, sub_key = nil)
|
145
211
|
return LIST unless key || sub_key
|
212
|
+
if LIST_REMOVED_KEYS.include?(key)
|
213
|
+
$stderr.puts "Warning(unicode-emoji): The category of #{key} does not exist anymore"
|
214
|
+
end
|
146
215
|
LIST.dig(*[key, sub_key].compact)
|
147
216
|
end
|
148
217
|
|
data/lib/unicode/emoji/constants.rb
CHANGED
@@ -2,12 +2,12 @@
|
|
2
2
|
|
3
3
|
module Unicode
|
4
4
|
module Emoji
|
5
|
-
VERSION = "
|
6
|
-
EMOJI_VERSION = "
|
5
|
+
VERSION = "2.0.0".freeze
|
6
|
+
EMOJI_VERSION = "12.0".freeze
|
7
7
|
DATA_DIRECTORY = File.expand_path(File.dirname(__FILE__) + '/../../../data/').freeze
|
8
8
|
INDEX_FILENAME = (DATA_DIRECTORY + '/emoji.marshal.gz').freeze
|
9
9
|
|
10
|
-
ENABLE_NATIVE_EMOJI_UNICODE_PROPERTIES = false
|
10
|
+
ENABLE_NATIVE_EMOJI_UNICODE_PROPERTIES = false # As of Ruby 2.6.1, Emoji version 11 is included
|
11
11
|
end
|
12
12
|
end
|
13
13
|
|
data/spec/unicode_emoji_spec.rb
CHANGED
@@ -158,7 +158,7 @@ describe Unicode::Emoji do
|
|
158
158
|
|
159
159
|
it "does not match invalid tag sequences" do
|
160
160
|
"🏴󠁧󠁢󠁡󠁡󠁡󠁿 GB AAA" =~ Unicode::Emoji::REGEX_VALID
|
161
|
-
assert_equal "🏴", $&
|
161
|
+
assert_equal "🏴", $& # only base flag is matched
|
162
162
|
end
|
163
163
|
|
164
164
|
it "matches recommended zwj sequences" do
|
@@ -172,6 +172,88 @@ describe Unicode::Emoji do
|
|
172
172
|
end
|
173
173
|
end
|
174
174
|
|
175
|
+
describe "REGEX_WELL_FORMED" do
|
176
|
+
it "matches most singleton emoji codepoints" do
|
177
|
+
"😴 sleeping face" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
178
|
+
assert_equal "😴", $&
|
179
|
+
end
|
180
|
+
|
181
|
+
it "matches singleton emoji in combination with emoji variation selector" do
|
182
|
+
"😴\u{FE0F} sleeping face" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
183
|
+
assert_equal "😴\u{FE0F}", $&
|
184
|
+
end
|
185
|
+
|
186
|
+
it "does not match singleton emoji when in combination with text variation selector" do
|
187
|
+
"😴\u{FE0E} sleeping face" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
188
|
+
assert_nil $&
|
189
|
+
end
|
190
|
+
|
191
|
+
it "does not match textual singleton emoji" do
|
192
|
+
"▶ play button" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
193
|
+
assert_nil $&
|
194
|
+
end
|
195
|
+
|
196
|
+
it "matches textual singleton emoji in combination with emoji variation selector" do
|
197
|
+
"▶\u{FE0F} play button" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
198
|
+
assert_equal "▶\u{FE0F}", $&
|
199
|
+
end
|
200
|
+
|
201
|
+
it "does not match singleton 'component' emoji codepoints" do
|
202
|
+
"🏻 light skin tone" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
203
|
+
assert_nil $&
|
204
|
+
end
|
205
|
+
|
206
|
+
it "matches modified emoji if modifier base emoji is used" do
|
207
|
+
"🛌🏽 person in bed: medium skin tone" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
208
|
+
assert_equal "🛌🏽", $&
|
209
|
+
end
|
210
|
+
|
211
|
+
it "does not match modified emoji if no modifier base emoji is used" do
|
212
|
+
"🌵🏽 cactus" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
213
|
+
assert_equal "🌵", $&
|
214
|
+
end
|
215
|
+
|
216
|
+
it "matches valid region flags" do
|
217
|
+
"🇵🇹 Portugal" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
218
|
+
assert_equal "🇵🇹", $&
|
219
|
+
end
|
220
|
+
|
221
|
+
it "does match invalid region flags" do
|
222
|
+
"🇵🇵 PP Land" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
223
|
+
assert_equal "🇵🇵", $&
|
224
|
+
end
|
225
|
+
|
226
|
+
it "matches emoji keycap sequences" do
|
227
|
+
"2️⃣ keycap: 2" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
228
|
+
assert_equal "2️⃣", $&
|
229
|
+
end
|
230
|
+
|
231
|
+
it "matches recommended tag sequences" do
|
232
|
+
"🏴󠁧󠁢󠁳󠁣󠁴󠁿 Scotland" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
233
|
+
assert_equal "🏴󠁧󠁢󠁳󠁣󠁴󠁿", $&
|
234
|
+
end
|
235
|
+
|
236
|
+
it "matches valid tag sequences, even though they are not recommended" do
|
237
|
+
"🏴󠁧󠁢󠁡󠁧󠁢󠁿 GB AGB" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
238
|
+
assert_equal "🏴󠁧󠁢󠁡󠁧󠁢󠁿", $&
|
239
|
+
end
|
240
|
+
|
241
|
+
it "does match invalid tag sequences" do
|
242
|
+
"🏴󠁧󠁢󠁡󠁡󠁡󠁿 GB AAA" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
243
|
+
assert_equal "🏴󠁧󠁢󠁡󠁡󠁡󠁿", $&
|
244
|
+
end
|
245
|
+
|
246
|
+
it "matches recommended zwj sequences" do
|
247
|
+
"🤾🏽‍♀️ woman playing handball: medium skin tone" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
248
|
+
assert_equal "🤾🏽‍♀️", $&
|
249
|
+
end
|
250
|
+
|
251
|
+
it "matches valid zwj sequences, even though they are not recommended" do
|
252
|
+
"🤠‍🤢 vomiting cowboy" =~ Unicode::Emoji::REGEX_WELL_FORMED
|
253
|
+
assert_equal "🤠‍🤢", $&
|
254
|
+
end
|
255
|
+
end
|
256
|
+
|
175
257
|
describe "REGEX_BASIC" do
|
176
258
|
it "matches most singleton emoji codepoints" do
|
177
259
|
"😴 sleeping face" =~ Unicode::Emoji::REGEX_BASIC
|
@@ -300,15 +382,21 @@ describe Unicode::Emoji do
|
|
300
382
|
|
301
383
|
describe ".list" do
|
302
384
|
it "returns a grouped list of emoji" do
|
303
|
-
assert_includes Unicode::Emoji.list.keys, "Smileys &
|
385
|
+
assert_includes Unicode::Emoji.list.keys, "Smileys & Emotion"
|
304
386
|
end
|
305
387
|
|
306
388
|
it "sub-groups the list of emoji" do
|
307
|
-
assert_includes Unicode::Emoji.list("Smileys &
|
389
|
+
assert_includes Unicode::Emoji.list("Smileys & Emotion").keys, "face-glasses"
|
308
390
|
end
|
309
391
|
|
310
392
|
it "has emoji in sub-groups" do
|
311
|
-
assert_includes Unicode::Emoji.list("Smileys &
|
393
|
+
assert_includes Unicode::Emoji.list("Smileys & Emotion", "face-glasses"), "😎"
|
394
|
+
end
|
395
|
+
|
396
|
+
it "issues a warning if attempting to retrieve old category" do
|
397
|
+
assert_output nil, "Warning(unicode-emoji): The category of Smileys & People does not exist anymore\n" do
|
398
|
+
assert_nil Unicode::Emoji.list("Smileys & People", "face-positive")
|
399
|
+
end
|
312
400
|
end
|
313
401
|
end
|
314
402
|
end
|
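The warning behaviour exercised by the last spec can be reproduced without the gem. The sketch below is an assumption-laden stand-in: `list` is a hypothetical one-entry hash instead of the gem's `LIST`, and `lookup` is a lambda mirroring the body of `Unicode::Emoji.list` shown in the diff above.

```ruby
require "stringio"

# Hypothetical miniature of the grouped emoji list.
list = { "Smileys & Emotion" => { "face-glasses" => ["\u{1F60E}"] } }
removed_keys = ["Smileys & People", "Component"]

# Mirrors the lookup logic: warn on removed categories, then dig into the hash.
lookup = lambda do |key = nil, sub_key = nil|
  return list unless key || sub_key
  if removed_keys.include?(key)
    $stderr.puts "Warning(unicode-emoji): The category of #{key} does not exist anymore"
  end
  list.dig(*[key, sub_key].compact)
end

# Capture $stderr so the warning text can be inspected.
original_stderr = $stderr
$stderr = StringIO.new
begin
  result = lookup["Smileys & People", "face-positive"]
  warning = $stderr.string
ensure
  $stderr = original_stderr
end

puts result.inspect
puts warning
```

The lookup still returns `nil` for a removed category; the warning only explains why, so callers that ignore stderr see the same result as for any unknown key.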
metadata
CHANGED
@@ -1,16 +1,16 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: unicode-emoji
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version:
|
4
|
+
version: 2.0.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Jan Lelis
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2019-02-19 00:00:00.000000000 Z
|
12
12
|
dependencies: []
|
13
|
-
description: "[Emoji
|
13
|
+
description: "[Emoji 12.0] Retrieve emoji data about Unicode codepoints. Also contains
|
14
14
|
a regex to match emoji."
|
15
15
|
email:
|
16
16
|
- mail@janlelis.de
|
@@ -53,7 +53,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
53
53
|
version: '0'
|
54
54
|
requirements: []
|
55
55
|
rubyforge_project:
|
56
|
-
rubygems_version: 2.
|
56
|
+
rubygems_version: 2.5.1
|
57
57
|
signing_key:
|
58
58
|
specification_version: 4
|
59
59
|
summary: Retrieve Emoji data about Unicode codepoints.
|