unicode-emoji 1.1.0 → 2.0.0
- checksums.yaml +5 -5
- data/.travis.yml +5 -4
- data/CHANGELOG.md +9 -0
- data/MIT-LICENSE.txt +1 -1
- data/README.md +60 -10
- data/data/emoji.marshal.gz +0 -0
- data/lib/unicode/emoji.rb +92 -23
- data/lib/unicode/emoji/constants.rb +3 -3
- data/spec/unicode_emoji_spec.rb +92 -4
- metadata +4 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA1:
+  metadata.gz: 32bec9a0f826ab808cf77b3bf69e8248de000d99
+  data.tar.gz: 6c8a53dc8874ab6bf508aad2a914eded9a7a4889
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c5ebaf7c4c6a66331af9c0f927f8f41079aaf89d3389c4ba84533c1f64fbd2b4456657971e05c437987568c6853f844f1084986721cede6da8979696e4efffd6
+  data.tar.gz: 3fc8af7fc6bdcaac8ac14ec148c4322a8d26a1e627895ec54a3a038a85062c9f00d37317208cdbe876cdc31603e83943eac569ee937bccfd7e44df15d239ac19
data/.travis.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,14 @@
 ## CHANGELOG
 
+### 2.0.0
+
+- Emoji 12.0 data (including valid subdivisions)
+- Introduce new `REGEX_WELL_FORMED` to be able to match for invalid tag and region sequences
+- Introduce new `*_INCLUDE_TEXT` regexes which include matching for textual presentation emoji
+- Refactoring: Update Emoji matching to latest standard while keeping naming close to standard
+- Issue warning when using `#list` method to retrieve outdated category
+- Change matching for ZWJ sequences: Do not limit sequence to a maximum of 3 ZWJs
+
 ### 1.1.0
 
 - Emoji 11.0
data/MIT-LICENSE.txt
CHANGED
data/README.md
CHANGED
@@ -1,12 +1,12 @@
-# Unicode::Emoji [![[version]](https://badge.fury.io/rb/unicode-emoji.svg)](
+# Unicode::Emoji [![[version]](https://badge.fury.io/rb/unicode-emoji.svg)](https://badge.fury.io/rb/unicode-emoji) [![[travis]](https://travis-ci.org/janlelis/unicode-emoji.svg)](https://travis-ci.org/janlelis/unicode-emoji)
 
 A small Ruby library which provides Unicode Emoji data and regexes.
 
 Also includes a categorized list of recommended Emoji.
 
-Emoji version: **
+Emoji version: **12.0** (February 2018)
 
-Supported Rubies: **2.5**, **2.4**, **2.3**
+Supported Rubies: **2.6**, **2.5**, **2.4**, **2.3**
 
 If you are stuck on an older Ruby version, checkout the latest [0.9 version](https://rubygems.org/gems/unicode-emoji/versions/0.9.3) of this gem.
 
@@ -20,7 +20,7 @@ gem "unicode-emoji"
 
 ### Regex
 
-
+The gem includes a bunch of Emoji regexes, which are compiled out of various Emoji Unicode data sources.
 
 ```ruby
 require "unicode/emoji"
@@ -40,16 +40,64 @@ string = "String which contains all kinds of emoji:
 string.scan(Unicode::Emoji::REGEX) # => ["😴", "▶️", "🛌🏽", "🇵🇹", "🏴󠁧󠁢󠁳󠁣󠁴󠁿", "2️⃣", "🤾🏽‍♀️"]
 ```
 
+#### Main Regexes
+
+Matches (non-textual) Emoji of all kinds:
+
 Regex | Description | Example Matches | Example Non-Matches
 ------------------------------|-------------|-----------------|--------------------
-`Unicode::Emoji::REGEX` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of
-`Unicode::Emoji::REGEX_VALID` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of valid Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢` | `😴︎`, `▶`, `🏻`, `🇵🇵`
-`Unicode::Emoji::
-
-
+`Unicode::Emoji::REGEX` | **Use this if unsure!** Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of *recommended* Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🤾🏽‍♀️` | `😴︎`, `▶`, `🏻`, `🇵🇵`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤠‍🤢`
+`Unicode::Emoji::REGEX_VALID` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of *valid* Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢` | `😴︎`, `▶`, `🏻`, `🇵🇵`
+`Unicode::Emoji::REGEX_WELL_FORMED` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kind of *well-formed* Emoji sequences | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`, `🇵🇵` | `😴︎`, `▶`, `🏻`
+
+##### Picking the Right Emoji Regex
+
+- Usually you just want `REGEX` (RGI set)
+- If you want broader matching (e.g. more sub-regions), choose `REGEX_VALID`
+- If you even want to match for invalid sequences, too, use `REGEX_WELL_FORMED`
+
+Please see [the standard](http://www.unicode.org/reports/tr51/#Emoji_Sets) for details.
+
+Property | `REGEX` (RGI / Recommended) | `REGEX_VALID` (Valid) | `REGEX_WELL_FORMED` (Well-formed)
+---------|-----------------------------|-----------------------|----------------------------------
+Region "🇵🇹" | Yes | Yes | Yes
+Region "🇵🇵" | No | No | Yes
+Tag Sequence "🏴󠁧󠁢󠁳󠁣󠁴󠁿" | Yes | Yes | Yes
+Tag Sequence "🏴󠁧󠁢󠁡󠁧󠁢󠁿" | No | Yes | Yes
+Tag Sequence "🏴󠁧󠁢󠁡󠁡󠁡󠁿" | No | No | Yes
+ZWJ Sequence "🤾🏽‍♀️" | Yes | Yes | Yes
+ZWJ Sequence "🤠‍🤢" | No | Yes | Yes
 
 More info about valid vs. recommended Emoji in this [blog article on Emojipedia](http://blog.emojipedia.org/unicode-behind-the-curtain/).
 
+#### Singleton Regexes
+
+Matches only simple one-codepoint (+ optional variation selector) Emoji:
+
+Regex | Description | Example Matches | Example Non-Matches
+------------------------------|-------------|-----------------|--------------------
+`Unicode::Emoji::REGEX_BASIC` | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji), but no sequences at all | `😴`, `▶️` | `😴︎`, `▶`, `🏻`, `🛌🏽`, `🇵🇹`, `🇵🇵`,`2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`
+`Unicode::Emoji::REGEX_TEXT` | Matches only textual singleton Emoji (except for singleton components, like digit 1) | `😴︎`, `▶` | `😴`, `▶️`, `🏻`, `🛌🏽`, `🇵🇹`, `🇵🇵`,`2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`
+
+#### Include Textual Emoji
+
+By default, textual Emoji (emoji characters with text variation selector or those that have a default text presentation) will not be included in the default regexes. However, if you wish to match for them too, you can include them in your regex by appending the `_INCLUDE_TEXT` suffix:
+
+Regex | Description | Example Matches | Example Non-Matches
+------------------------------|-------------|-----------------|--------------------
+`Unicode::Emoji::REGEX_INCLUDE_TEXT` | `REGEX` + `REGEX_TEXT` | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🤾🏽‍♀️`, `😴︎`, `▶` | `🏻`, `🇵🇵`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤠‍🤢`
+`Unicode::Emoji::REGEX_VALID_INCLUDE_TEXT` | `REGEX_VALID` + `REGEX_TEXT` | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`, `😴︎`, `▶` | `🏻`, `🇵🇵`
+`Unicode::Emoji::REGEX_WELL_FORMED_INCLUDE_TEXT` | `REGEX_WELL_FORMED` + `REGEX_TEXT` | `😴`, `▶️`, `🛌🏽`, `🇵🇹`, `2️⃣`, `🏴󠁧󠁢󠁳󠁣󠁴󠁿`, `🏴󠁧󠁢󠁡󠁧󠁢󠁿`, `🤾🏽‍♀️`, `🤠‍🤢`, `🇵🇵`, `😴︎`, `▶` | `🏻`
+
+#### Partial Regexes
+
+Matches potential Emoji parts (often, this is not what you want):
+
+Regex | Description | Example Matches | Example Non-Matches
+------------------------------|-------------|-----------------|--------------------
+`Unicode::Emoji::REGEX_ANY` | Matches any Emoji-related codepoint (but no variation selectors, tags, or zero-width joiners). Please note that this will match Emoji-parts rather than complete Emoji, for example, single digits! | `😴`, `▶`, `🏻`, `🛌`, `🏽`, `🇵`, `🇹`, `2`, `🏴`, `🤾`, `♀`, `🤠`, `🤢` | -
+
+
 ### List
 
 Use `Unicode::Emoji::LIST` or the list method to get a grouped (and ordered) list of Emoji:
@@ -65,6 +113,8 @@ Unicode::Emoji.list("Food & Drink", "food-asian")
 => ["🍱", "🍘", "🍙", "🍚", "🍛", "🍜", "🍝", "🍠", "🍢", "🍣", "🍤", "🍥", "🍡", "\u{1F95F}", "\u{1F960}", "\u{1F961}"]
 ```
 
+Please note that categories might change with future versions of the Emoji standard. This gem will issue warnings when attempting to retrieve old categories using the `#list` method.
+
 A markdown file with all recommended Emoji can be found [in this gist](https://gist.github.com/janlelis/72f9be1f0ecca07372c64cf13894b801).
 
 ### Properties
@@ -87,5 +137,5 @@ Unicode::Emoji.properties "β" # => ["Emoji", "Emoji_Modifier_Base"]
 
 ## MIT
 
-- Copyright (C) 2017
+- Copyright (C) 2017-2019 Jan Lelis <http://janlelis.com>. Released under the MIT license.
 - Unicode data: http://www.unicode.org/copyright.html#Exhibit1
data/data/emoji.marshal.gz
CHANGED
Binary file
|
data/lib/unicode/emoji.rb
CHANGED
@@ -18,8 +18,10 @@ module Unicode
     TEXT_VARIATION_SELECTOR = 0xFE0E
     EMOJI_TAG_BASE_FLAG = 0x1F3F4
     CANCEL_TAG = 0xE007F
+    TAGS = [*0xE0020..0xE007E]
     EMOJI_KEYCAP_SUFFIX = 0x20E3
     ZWJ = 0x200D
+    REGIONAL_INDICATORS = [*0x1F1E6..0x1F1FF]
 
     EMOJI_CHAR = INDEX[:PROPERTIES].select{ |ord, props| props.include?(:E) }.keys.freeze
     EMOJI_PRESENTATION = INDEX[:PROPERTIES].select{ |ord, props| props.include?(:P) }.keys.freeze
@@ -36,6 +38,10 @@ module Unicode
     RECOMMENDED_ZWJ_SEQUENCES = INDEX[:ZWJ].freeze
 
     LIST = INDEX[:LIST].freeze.each_value(&:freeze)
+    LIST_REMOVED_KEYS = [
+      "Smileys & People",
+      "Component",
+    ]
 
     pack = ->(ord){ Regexp.escape(Array(ord).pack("U*")) }
     join = -> (*strings){ "(?:" + strings.join("|") + ")" }
@@ -61,6 +67,9 @@ module Unicode
         emoji_presentation + "(?!" + pack[TEXT_VARIATION_SELECTOR] + ")" + pack[EMOJI_VARIATION_SELECTOR] + "?",
       ]
 
+    non_component_emoji_presentation_sequence = \
+      "(?!" + emoji_component + ")" + emoji_presentation_sequence
+
     text_presentation_sequence = \
       join[
         pack_and_join[TEXT_PRESENTATION]+ "(?!" + join[emoji_modifier, pack[EMOJI_VARIATION_SELECTOR]] + ")" + pack[TEXT_VARIATION_SELECTOR] + "?",
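The context line at the top of this hunk relies on a negative lookahead to keep emoji-style matches away from the text variation selector. A minimal stdlib-only sketch of that idea follows. The `VS_*` names, the `vs_pack` helper and the single sample code point are assumptions made for illustration; the gem builds its alternation from its own data index instead.

```ruby
# Illustrative sketch, not the gem's code.
VS_TEXT   = 0xFE0E  # text variation selector (TEXT_VARIATION_SELECTOR in the diff)
VS_EMOJI  = 0xFE0F  # emoji variation selector
VS_SAMPLE = 0x1F634 # sleeping face, a code point with default emoji presentation

# Turn one or more code points into an escaped regex fragment,
# in the spirit of the `pack` lambda shown in the diff.
def vs_pack(ord)
  Regexp.escape(Array(ord).pack("U*"))
end

# Sample character, rejected when the text selector follows,
# with an optional trailing emoji selector.
VS_EMOJI_STYLE = Regexp.compile(
  vs_pack(VS_SAMPLE) + "(?!" + vs_pack(VS_TEXT) + ")" + vs_pack(VS_EMOJI) + "?"
)

plain      = [VS_SAMPLE].pack("U*")
with_emoji = [VS_SAMPLE, VS_EMOJI].pack("U*")
with_text  = [VS_SAMPLE, VS_TEXT].pack("U*")

puts VS_EMOJI_STYLE.match?(plain)                        # bare character
puts VS_EMOJI_STYLE.match(with_emoji)[0] == with_emoji   # selector is consumed
puts VS_EMOJI_STYLE.match?(with_text)                    # text style is rejected
```

The lookahead sits before the optional selector so that a character followed by U+FE0E fails as a whole instead of matching just its first code point.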
@@ -73,9 +82,36 @@ module Unicode
     emoji_keycap_sequence = \
       pack_and_join[EMOJI_KEYCAPS] + pack[[EMOJI_VARIATION_SELECTOR, EMOJI_KEYCAP_SUFFIX]]
 
-
+    emoji_valid_flag_sequence = \
       pack_and_join[VALID_REGION_FLAGS]
 
+    emoji_well_formed_flag_sequence = \
+      "(?:" +
+        pack_and_join[REGIONAL_INDICATORS] +
+        pack_and_join[REGIONAL_INDICATORS] +
+      ")"
+
+    emoji_valid_core_sequence = \
+      join[
+        # emoji_character,
+        emoji_keycap_sequence,
+        emoji_modifier_sequence,
+        non_component_emoji_presentation_sequence,
+        emoji_valid_flag_sequence,
+      ]
+
+    emoji_well_formed_core_sequence = \
+      join[
+        # emoji_character,
+        emoji_keycap_sequence,
+        emoji_modifier_sequence,
+        non_component_emoji_presentation_sequence,
+        emoji_well_formed_flag_sequence,
+      ]
+
+    emoji_rgi_tag_sequence = \
+      pack_and_join[RECOMMENDED_SUBDIVISION_FLAGS]
+
     emoji_valid_tag_sequence = \
       "(?:" +
         pack[EMOJI_TAG_BASE_FLAG] +
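The difference between `emoji_valid_flag_sequence` and `emoji_well_formed_flag_sequence` in this hunk can be sketched without the gem's data. In the example below, `RI_RANGE` mirrors the `REGIONAL_INDICATORS` range from the diff, while `ri_alternation`, `RI_WELL_FORMED` and the two-entry `RI_VALID_SAMPLE` list are hypothetical stand-ins; the real valid list comes from the gem's index.

```ruby
# Illustrative sketch, not the gem's code.
RI_RANGE = (0x1F1E6..0x1F1FF) # regional indicator symbols A..Z

# Alternation over every regional indicator code point.
def ri_alternation
  "(?:" + RI_RANGE.map { |ord| Regexp.escape([ord].pack("U")) }.join("|") + ")"
end

# Well-formed: any two regional indicators in a row.
RI_WELL_FORMED = Regexp.compile("(?:" + ri_alternation + ri_alternation + ")")

# Valid: only pairs from a fixed list (two sample entries stand in for it).
RI_VALID_SAMPLE = Regexp.union(
  "\u{1F1F5}\u{1F1F9}", # P + T (Portugal)
  "\u{1F1E9}\u{1F1EA}"  # D + E (Germany)
)

pt_flag = "\u{1F1F5}\u{1F1F9}"
pp_flag = "\u{1F1F5}\u{1F1F5}" # P + P, not an assigned region

puts RI_WELL_FORMED.match?(pt_flag)
puts RI_WELL_FORMED.match?(pp_flag)
puts RI_VALID_SAMPLE.match?(pt_flag)
puts RI_VALID_SAMPLE.match?(pp_flag)
```

The well-formed pattern checks shape only, which is why the unassigned pair passes it and fails the list-based one.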
@@ -83,35 +119,60 @@ module Unicode
         pack[CANCEL_TAG] +
       ")"
 
-
+    emoji_well_formed_tag_sequence = \
+      "(?:" +
+        join[
+          non_component_emoji_presentation_sequence,
+          emoji_modifier_sequence,
+        ] +
+        pack_and_join[TAGS] + "+" +
+        pack[CANCEL_TAG] +
+      ")"
+
+    emoji_rgi_zwj_sequence = \
+      pack_and_join[RECOMMENDED_ZWJ_SEQUENCES]
+
+    emoji_valid_zwj_element = \
       join[
         emoji_modifier_sequence,
         emoji_presentation_sequence,
         emoji_character,
       ]
 
-
-
-
-
-
-
-
-
-
-
+    emoji_valid_zwj_sequence = \
+      "(?:" +
+        "(?:" + emoji_valid_zwj_element + pack[ZWJ] + ")+" + emoji_valid_zwj_element +
+      ")"
+
+    emoji_rgi_sequence = \
+      join[
+        emoji_rgi_zwj_sequence,
+        emoji_rgi_tag_sequence,
+        emoji_valid_core_sequence,
+      ]
+
+    emoji_valid_sequence = \
+      join[
+        emoji_valid_zwj_sequence,
+        emoji_valid_tag_sequence,
+        emoji_valid_core_sequence,
+      ]
+
+    emoji_well_formed_sequence = \
+      join[
+        emoji_valid_zwj_sequence,
+        emoji_well_formed_tag_sequence,
+        emoji_well_formed_core_sequence,
+      ]
+
+    # Matches basic singleton emoji and all kind of sequences, but restrict zwj and tag sequences to known sequences (rgi)
+    REGEX = Regexp.compile(emoji_rgi_sequence)
 
     # Matches basic singleton emoji and all kind of valid sequences
-    REGEX_VALID = Regexp.compile(
-
-
-
-      ?| + emoji_modifier_sequence +
-      ?| + "(?!" + emoji_component + ")" + emoji_presentation_sequence +
-      ?| + emoji_keycap_sequence +
-      ?| + emoji_valid_region_sequence +
-      ""
-    )
+    REGEX_VALID = Regexp.compile(emoji_valid_sequence)
+
+    # Matches basic singleton emoji and all kind of sequences
+    REGEX_WELL_FORMED = Regexp.compile(emoji_well_formed_sequence)
 
     # Matches only basic single, non-textual emoji
     # Ignores "components" like modifiers or simple digits
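The new `emoji_valid_zwj_sequence` repeats "element plus ZWJ" one or more times before the final element, which is what lifts the old limit of three joiners mentioned in the changelog. A rough sketch of that shape is below. The element here is a tiny hypothetical character class (`ZWJ_ELEMENT_SKETCH`); the gem's element is the much richer `emoji_valid_zwj_element` alternation.

```ruby
# Illustrative sketch, not the gem's code.
ZWJ_CP = 0x200D # zero width joiner (ZWJ in the diff)

# Stand-in element: man, woman, girl, boy, cowboy hat face, nauseated face.
ZWJ_ELEMENT_SKETCH = "[\u{1F468}\u{1F469}\u{1F467}\u{1F466}\u{1F920}\u{1F922}]"

# (element ZWJ)+ element, with no upper bound on the repetition.
ZWJ_SEQUENCE_SKETCH = Regexp.compile(
  "(?:(?:" + ZWJ_ELEMENT_SKETCH + [ZWJ_CP].pack("U") + ")+" + ZWJ_ELEMENT_SKETCH + ")"
)

# Join code points with ZWJ to build test input.
def zwj_join(*ords)
  ords.map { |ord| [ord].pack("U") }.join([ZWJ_CP].pack("U"))
end

two_parts  = zwj_join(0x1F920, 0x1F922)                              # 1 joiner
five_parts = zwj_join(0x1F468, 0x1F469, 0x1F467, 0x1F466, 0x1F466)   # 4 joiners

puts ZWJ_SEQUENCE_SKETCH.match(two_parts)[0] == two_parts
puts ZWJ_SEQUENCE_SKETCH.match(five_parts)[0] == five_parts
puts ZWJ_SEQUENCE_SKETCH.match?("\u{1F468}") # a lone element is not a sequence
```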
@@ -125,11 +186,16 @@ module Unicode
       "(?!" + emoji_component + ")" + text_presentation_sequence
     )
 
-    # Matches any emoji-related codepoint
+    # Matches any emoji-related codepoint - Use with caution (returns partial matches)
     REGEX_ANY = Regexp.compile(
       emoji_character
     )
 
+    # Combined REGEXes which also match for TEXTUAL emoji
+    REGEX_INCLUDE_TEXT = Regexp.union(REGEX, REGEX_TEXT)
+    REGEX_VALID_INCLUDE_TEXT = Regexp.union(REGEX_VALID, REGEX_TEXT)
+    REGEX_WELL_FORMED_INCLUDE_TEXT = Regexp.union(REGEX_WELL_FORMED, REGEX_TEXT)
+
     def self.properties(char)
       ord = get_codepoint_value(char)
       props = INDEX[:PROPERTIES][ord]
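The `*_INCLUDE_TEXT` constants in this hunk are plain `Regexp.union` combinations. The sketch below shows the same composition on two small hand-written patterns for the play button (U+25B6). The `UNION_*` names and both patterns are assumptions for illustration and are far narrower than the gem's regexes.

```ruby
# Illustrative sketch, not the gem's code.
UNION_EMOJI_ONLY = /\u{25B6}\u{FE0F}/              # play button with emoji selector
UNION_TEXT_ONLY  = /\u{25B6}(?!\u{FE0F})\u{FE0E}?/ # play button in text style

# Either alternative may match; alternatives are tried left to right.
UNION_INCLUDE_TEXT = Regexp.union(UNION_EMOJI_ONLY, UNION_TEXT_ONLY)

sample = "text \u{25B6} and emoji \u{25B6}\u{FE0F}"

p sample.scan(UNION_EMOJI_ONLY).size   # emoji-style matches only
p sample.scan(UNION_INCLUDE_TEXT).size # both styles
```

Because the union is built from already compiled regexes, the emoji-only constants stay available unchanged next to the combined ones.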
@@ -143,6 +209,9 @@ module Unicode
 
     def self.list(key = nil, sub_key = nil)
       return LIST unless key || sub_key
+      if LIST_REMOVED_KEYS.include?(key)
+        $stderr.puts "Warning(unicode-emoji): The category of #{key} does not exist anymore"
+      end
       LIST.dig(*[key, sub_key].compact)
     end
 
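The new branch in `.list` warns on `$stderr` for removed categories and then still falls through to `Hash#dig`, which returns `nil` for a missing key. The following sketch reproduces that flow on a one-entry stand-in hash. `LISTDEMO_DATA`, `LISTDEMO_REMOVED` and `listdemo` are hypothetical names; only the warning text and the lookup logic are taken from the diff.

```ruby
require "stringio"

# Illustrative sketch, not the gem's code or data.
LISTDEMO_DATA = {
  "Smileys & Emotion" => { "face-glasses" => ["\u{1F60E}"] },
}.freeze
LISTDEMO_REMOVED = ["Smileys & People", "Component"].freeze

def listdemo(key = nil, sub_key = nil)
  return LISTDEMO_DATA unless key || sub_key
  if LISTDEMO_REMOVED.include?(key)
    $stderr.puts "Warning(unicode-emoji): The category of #{key} does not exist anymore"
  end
  # Drop nil arguments, then walk the nested hash.
  LISTDEMO_DATA.dig(*[key, sub_key].compact)
end

# Capture stderr to show that the lookup both warns and returns nil.
original_stderr = $stderr
$stderr = StringIO.new
result  = listdemo("Smileys & People", "face-positive")
warning = $stderr.string
$stderr = original_stderr

p result
puts warning
```

Warning instead of raising keeps callers written against the older category names running, at the cost of a `nil` result they have to handle.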
data/lib/unicode/emoji/constants.rb
CHANGED
@@ -2,12 +2,12 @@
 
 module Unicode
   module Emoji
-    VERSION = "
-    EMOJI_VERSION = "
+    VERSION = "2.0.0".freeze
+    EMOJI_VERSION = "12.0".freeze
     DATA_DIRECTORY = File.expand_path(File.dirname(__FILE__) + '/../../../data/').freeze
     INDEX_FILENAME = (DATA_DIRECTORY + '/emoji.marshal.gz').freeze
 
-    ENABLE_NATIVE_EMOJI_UNICODE_PROPERTIES = false
+    ENABLE_NATIVE_EMOJI_UNICODE_PROPERTIES = false # As of Ruby 2.6.1, Emoji version 11 is included
   end
 end
 
data/spec/unicode_emoji_spec.rb
CHANGED
@@ -158,7 +158,7 @@ describe Unicode::Emoji do
 
     it "does not match invalid tag sequences" do
       "🏴󠁧󠁢󠁡󠁡󠁡󠁿 GB AAA" =~ Unicode::Emoji::REGEX_VALID
-      assert_equal "🏴", $&
+      assert_equal "🏴", $& # only base flag is matched
     end
 
     it "matches recommended zwj sequences" do
@@ -172,6 +172,88 @@ describe Unicode::Emoji do
     end
   end
 
+  describe "REGEX_WELL_FORMED" do
+    it "matches most singleton emoji codepoints" do
+      "😴 sleeping face" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "😴", $&
+    end
+
+    it "matches singleton emoji in combination with emoji variation selector" do
+      "😴\u{FE0F} sleeping face" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "😴\u{FE0F}", $&
+    end
+
+    it "does not match singleton emoji when in combination with text variation selector" do
+      "😴\u{FE0E} sleeping face" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_nil $&
+    end
+
+    it "does not match textual singleton emoji" do
+      "▶ play button" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_nil $&
+    end
+
+    it "matches textual singleton emoji in combination with emoji variation selector" do
+      "▶\u{FE0F} play button" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "▶\u{FE0F}", $&
+    end
+
+    it "does not match singleton 'component' emoji codepoints" do
+      "🏻 light skin tone" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_nil $&
+    end
+
+    it "matches modified emoji if modifier base emoji is used" do
+      "🛌🏽 person in bed: medium skin tone" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🛌🏽", $&
+    end
+
+    it "does not match modified emoji if no modifier base emoji is used" do
+      "🌵🏽 cactus" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🌵", $&
+    end
+
+    it "matches valid region flags" do
+      "🇵🇹 Portugal" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🇵🇹", $&
+    end
+
+    it "does match invalid region flags" do
+      "🇵🇵 PP Land" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🇵🇵", $&
+    end
+
+    it "matches emoji keycap sequences" do
+      "2️⃣ keycap: 2" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "2️⃣", $&
+    end
+
+    it "matches recommended tag sequences" do
+      "🏴󠁧󠁢󠁳󠁣󠁴󠁿 Scotland" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🏴󠁧󠁢󠁳󠁣󠁴󠁿", $&
+    end
+
+    it "matches valid tag sequences, even though they are not recommended" do
+      "🏴󠁧󠁢󠁡󠁧󠁢󠁿 GB AGB" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🏴󠁧󠁢󠁡󠁧󠁢󠁿", $&
+    end
+
+    it "does match invalid tag sequences" do
+      "🏴󠁧󠁢󠁡󠁡󠁡󠁿 GB AAA" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🏴󠁧󠁢󠁡󠁡󠁡󠁿", $&
+    end
+
+    it "matches recommended zwj sequences" do
+      "🤾🏽‍♀️ woman playing handball: medium skin tone" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🤾🏽‍♀️", $&
+    end
+
+    it "matches valid zwj sequences, even though they are not recommended" do
+      "🤠‍🤢 vomiting cowboy" =~ Unicode::Emoji::REGEX_WELL_FORMED
+      assert_equal "🤠‍🤢", $&
+    end
+  end
+
   describe "REGEX_BASIC" do
     it "matches most singleton emoji codepoints" do
       "😴 sleeping face" =~ Unicode::Emoji::REGEX_BASIC
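Several of the specs above feed tag sequences to the regex, and those are hard to read because tag characters are invisible. The sketch below builds such strings from ASCII letters and checks them against a simplified well-formed pattern. `TAGDEMO_*` and `tag_sequence` are hypothetical names, and the pattern only accepts the black flag as base, whereas `emoji_well_formed_tag_sequence` in the diff accepts a wider set of bases.

```ruby
# Illustrative sketch, not the gem's code.
TAGDEMO_BASE   = 0x1F3F4 # black flag (EMOJI_TAG_BASE_FLAG in the diff)
TAGDEMO_FIRST  = 0xE0020 # first tag character (start of TAGS in the diff)
TAGDEMO_LAST   = 0xE007E # last tag character (end of TAGS in the diff)
TAGDEMO_CANCEL = 0xE007F # cancel tag (CANCEL_TAG in the diff)

# Map ASCII letters onto tag characters (offset 0xE0000) and frame them
# with the base flag and the cancel tag.
def tag_sequence(letters)
  ([TAGDEMO_BASE] + letters.bytes.map { |b| 0xE0000 + b } + [TAGDEMO_CANCEL]).pack("U*")
end

# Base flag, one or more tag characters, cancel tag.
TAGDEMO_WELL_FORMED = Regexp.compile(
  Regexp.escape([TAGDEMO_BASE].pack("U")) +
  "[" + [TAGDEMO_FIRST].pack("U") + "-" + [TAGDEMO_LAST].pack("U") + "]+" +
  Regexp.escape([TAGDEMO_CANCEL].pack("U"))
)

scotland = tag_sequence("gbsct") # recommended subdivision flag
made_up  = tag_sequence("gbaaa") # well-formed, but no such subdivision

puts TAGDEMO_WELL_FORMED.match(scotland)[0] == scotland
puts TAGDEMO_WELL_FORMED.match(made_up)[0] == made_up
puts TAGDEMO_WELL_FORMED.match?([TAGDEMO_BASE].pack("U")) # bare flag, no tags
```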
@@ -300,15 +382,21 @@ describe Unicode::Emoji do
 
   describe ".list" do
     it "returns a grouped list of emoji" do
-      assert_includes Unicode::Emoji.list.keys, "Smileys &
+      assert_includes Unicode::Emoji.list.keys, "Smileys & Emotion"
     end
 
     it "sub-groups the list of emoji" do
-      assert_includes Unicode::Emoji.list("Smileys &
+      assert_includes Unicode::Emoji.list("Smileys & Emotion").keys, "face-glasses"
     end
 
     it "has emoji in sub-groups" do
-      assert_includes Unicode::Emoji.list("Smileys &
+      assert_includes Unicode::Emoji.list("Smileys & Emotion", "face-glasses"), "😎"
+    end
+
+    it "issues a warning if attempting to retrieve old category" do
+      assert_output nil, "Warning(unicode-emoji): The category of Smileys & People does not exist anymore\n" do
+        assert_nil Unicode::Emoji.list("Smileys & People", "face-positive")
+      end
     end
   end
 end
metadata
CHANGED
@@ -1,16 +1,16 @@
 --- !ruby/object:Gem::Specification
 name: unicode-emoji
 version: !ruby/object:Gem::Version
-  version:
+  version: 2.0.0
 platform: ruby
 authors:
 - Jan Lelis
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2019-02-19 00:00:00.000000000 Z
 dependencies: []
-description: "[Emoji
+description: "[Emoji 12.0] Retrieve emoji data about Unicode codepoints. Also contains
   a regex to match emoji."
 email:
 - mail@janlelis.de
@@ -53,7 +53,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.5.1
 signing_key:
 specification_version: 4
 summary: Retrieve Emoji data about Unicode codepoints.