gemoji-parser 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/README.md +84 -27
- data/gemoji-parser.gemspec +1 -2
- data/lib/gemoji-parser.rb +251 -23
- data/lib/gemoji-parser/version.rb +1 -1
- data/spec/emoji_parser_spec.rb +222 -0
- metadata +6 -6
- data/spec/emoji_helper_spec.rb +0 -88
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 3621544eb0c0dfe923ad4639f4ec51fefbf42cdc
|
4
|
+
data.tar.gz: 412e5b97b12b82fd2fcb1ae4a95546bb0eac8bff
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: dec9825da6d1d409f98c5afb6cce0363bc00c0ee34b5fa5f6b0ed3c8faca12a1228b7d20b10e75174ed36621756075ba6cb145b766acecd2bb0756fc12d57d5f
|
7
|
+
data.tar.gz: 67e7e574e303eda0b25f2fb1dca6227802afc86210827df428d9d5423b1cf98a92e0abba5788cc156a0f89a9d12281cb634d459832f3e3401af0b7cff555273d
|
data/README.md
CHANGED
@@ -1,6 +1,6 @@
|
|
1
1
|
# gemoji-parser
|
2
2
|
|
3
|
-
The missing helper methods for [GitHub's
|
3
|
+
The missing helper methods for [GitHub's gemoji](https://github.com/github/gemoji) gem. This utility provides a parsing API for the `Emoji` corelib (provided by *gemoji*). The parser transforms emoji symbols between unicode (😃), token (`:smile:`), and emoticon (`:-D`) formats, and may perform arbitrary replacement of emoji symbols into custom display formats (such as image tags). Internally, optimized regular expressions are generated and cached to maximize parsing efficiency.
|
4
4
|
|
5
5
|
## Installation
|
6
6
|
|
@@ -12,7 +12,7 @@ gem 'gemoji-parser'
|
|
12
12
|
|
13
13
|
And then execute:
|
14
14
|
|
15
|
-
$ bundle
|
15
|
+
$ bundle install
|
16
16
|
|
17
17
|
Or install it yourself as:
|
18
18
|
|
@@ -20,14 +20,13 @@ Or install it yourself as:
|
|
20
20
|
|
21
21
|
To run tests:
|
22
22
|
|
23
|
-
|
23
|
+
$ bundle exec rake spec
|
24
24
|
|
25
25
|
## Usage
|
26
26
|
|
27
|
-
|
28
27
|
### Tokenizing
|
29
28
|
|
30
|
-
|
29
|
+
The tokenizer methods perform basic conversions of unicode symbols into token symbols, and vice versa.
|
31
30
|
|
32
31
|
```ruby
|
33
32
|
EmojiParser.tokenize("Test 🙈 🙊 🙉")
|
@@ -37,56 +36,114 @@ EmojiParser.detokenize("Test :see_no_evil: :speak_no_evil: :hear_no_evil:")
|
|
37
36
|
# "Test 🙈 🙊 🙉"
|
38
37
|
```
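As a rough illustration of the round trip, here is a minimal sketch in plain Ruby. It assumes a hypothetical three-entry lookup table (`SKETCH_TABLE`) in place of the full *gemoji* set, and the helper names `sketch_tokenize` and `sketch_detokenize` are made up for this example; they are not part of the gem.

```ruby
# Hypothetical lookup table standing in for the gemoji character set.
SKETCH_TABLE = {
  "🙈" => "see_no_evil",
  "🙊" => "speak_no_evil",
  "🙉" => "hear_no_evil"
}.freeze

# Matches any unicode symbol listed in the table.
SKETCH_UNICODE_RE = Regexp.union(SKETCH_TABLE.keys)
# Matches :token: symbols (same character class as the parser's token pattern).
SKETCH_TOKEN_RE = /:([\w+-]+):/

# Replace each known unicode symbol with its :token: form.
def sketch_tokenize(text)
  text.gsub(SKETCH_UNICODE_RE) { |match| ":#{SKETCH_TABLE[match]}:" }
end

# Replace each known :token: with its unicode symbol; unknown tokens stay as-is.
def sketch_detokenize(text)
  text.gsub(SKETCH_TOKEN_RE) { |match| SKETCH_TABLE.key($1) || match }
end
```

Unknown tokens such as `:invalid:` pass through unchanged, which matches the behaviour the specs in this diff expect from the real parser.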
|
39
38
|
|
40
|
-
###
|
39
|
+
### Symbol Parsing
|
41
40
|
|
42
|
-
|
41
|
+
Use the symbol parser methods for custom transformations. All symbol parsers yield [Emoji::Character](https://github.com/github/gemoji/blob/master/lib/emoji/character.rb) instances into the parsing block for custom formatting.
|
43
42
|
|
44
43
|
**Unicode symbols**
|
45
44
|
|
46
45
|
```ruby
|
47
|
-
EmojiParser.parse_unicode(
|
46
|
+
EmojiParser.parse_unicode("Test 🐠") do |emoji|
|
48
47
|
%Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
|
49
48
|
end
|
50
49
|
|
51
|
-
# 'Test <img src="unicode/
|
50
|
+
# 'Test <img src="unicode/1f420.png" alt=":tropical_fish:">'
|
52
51
|
```
|
53
52
|
|
54
53
|
**Token symbols**
|
55
54
|
|
56
55
|
```ruby
|
57
|
-
EmojiParser.parse_tokens(
|
56
|
+
EmojiParser.parse_tokens("Test :tropical_fish:") do |emoji|
|
58
57
|
%Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
|
59
58
|
end
|
60
59
|
|
61
|
-
# 'Test <img src="unicode/
|
60
|
+
# 'Test <img src="unicode/1f420.png" alt=":tropical_fish:">'
|
62
61
|
```
|
63
62
|
|
64
|
-
**
|
63
|
+
**Emoticon symbols**
|
65
64
|
|
66
65
|
```ruby
|
67
|
-
EmojiParser.
|
66
|
+
EmojiParser.parse_emoticons("Test ;-)") do |emoji|
|
67
|
+
%Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
|
68
|
+
end
|
68
69
|
|
69
|
-
# 'Test
|
70
|
+
# 'Test <img src="unicode/1f609.png" alt=":wink:">'
|
70
71
|
```
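Judging by the emoticon pattern built in `lib/gemoji-parser.rb`, emoticons only match when they are bounded by whitespace or by the start/end of the string. The sketch below reproduces that boundary rule in plain Ruby with a hypothetical three-emoticon list (`SKETCH_EMOTICONS`); it is an illustration, not the gem's generated expression.

```ruby
# Hypothetical emoticon list; the real set lives in EmojiParser.emoticons.
SKETCH_EMOTICONS = [":-)", ":)", ";-)"].freeze

# Lookbehind/lookahead require whitespace or a string boundary on both sides.
BOUNDED_RE = Regexp.new(
  "(?<=^|\\s)(?:#{SKETCH_EMOTICONS.map { |e| Regexp.escape(e) }.join('|')})(?=\\s|$)"
)

"Test ;-)".scan(BOUNDED_RE)  # => [";-)"]
"a:)b (:)".scan(BOUNDED_RE)  # => []
```

The boundary checks keep the parser from rewriting character runs that merely look like emoticons inside words, URLs, or parentheses.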
|
71
72
|
|
72
|
-
|
73
|
+
**All symbol types**
|
73
74
|
|
74
|
-
|
75
|
+
Use the `parse` method to target all symbol types with a single parsing pass. Specific symbol types may be excluded using options:
|
75
76
|
|
76
77
|
```ruby
|
77
|
-
|
78
|
-
|
79
|
-
|
78
|
+
EmojiParser.parse("Test 🐠 :scream: ;-)") { |emoji| "[#{emoji.name}]" }
|
79
|
+
# 'Test [tropical_fish] [scream] [wink]'
|
80
|
+
|
81
|
+
EmojiParser.parse("Test 🐠 :scream: ;-)", emoticons: false) do |emoji|
|
82
|
+
"[#{emoji.name}]"
|
83
|
+
end
|
84
|
+
# 'Test [tropical_fish] [scream] ;-)'
|
80
85
|
```
|
81
86
|
|
82
|
-
|
87
|
+
While the `parse` method is heavier to run than the discrete parsing methods for each symbol type (`parse_unicode`, `parse_tokens`, etc.), it avoids multiple parsing passes. This is handy if you want parsed symbols to output new symbols in a different format, such as generating image tags that include a symbol in their alt text:
|
88
|
+
|
89
|
+
```ruby
|
90
|
+
EmojiParser.parse("Test 🐠 ;-)") do |emoji|
|
91
|
+
%Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
|
92
|
+
end
|
93
|
+
|
94
|
+
# 'Test <img src="unicode/1f420.png" alt=":tropical_fish:"> <img src="unicode/1f609.png" alt=":wink:">'
|
95
|
+
```
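To see why a single pass matters, here is a minimal sketch in plain Ruby that contrasts chained passes with one combined pass. The table `SKETCH_NAMES`, the two regexes, and the `sketch_*` helpers are assumptions made for this example and are much simpler than the parser's generated expressions.

```ruby
# Hypothetical symbol table; the real parser resolves symbols through gemoji.
SKETCH_NAMES = { ";-)" => "wink", ":wink:" => "wink" }.freeze

SKETCH_EMOTICON_RE = /(?<=^|\s)(?:;-\))(?=\s|$)/
SKETCH_TOKEN_RE    = /:[\w+-]+:/

# Render a known symbol as an image tag; leave anything else untouched.
def sketch_render(symbol)
  return symbol unless SKETCH_NAMES.key?(symbol)
  %Q(<img alt=":#{SKETCH_NAMES[symbol]}:">)
end

# Two passes: the token pass re-scans the output of the emoticon pass.
def sketch_two_pass(text)
  text.gsub(SKETCH_EMOTICON_RE) { |m| sketch_render(m) }
      .gsub(SKETCH_TOKEN_RE) { |m| sketch_render(m) }
end

# One pass: replacement text is never scanned again.
def sketch_one_pass(text)
  text.gsub(Regexp.union(SKETCH_EMOTICON_RE, SKETCH_TOKEN_RE)) { |m| sketch_render(m) }
end

sketch_two_pass("Test ;-)") # => 'Test <img alt="<img alt=":wink:">">'
sketch_one_pass("Test ;-)") # => 'Test <img alt=":wink:">'
```

In the two-pass version, the `:wink:` token written into the alt text by the first pass is picked up and replaced again by the second pass, which nests one tag inside another.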
|
96
|
+
|
97
|
+
### Lookups & File Paths
|
98
|
+
|
99
|
+
Use the `find` method to derive [Emoji::Character](https://github.com/github/gemoji/blob/master/lib/emoji/character.rb) instances from any symbol format (unicode, token, emoticon):
|
100
|
+
|
101
|
+
```ruby
|
102
|
+
emoji = EmojiParser.find('🐠')
|
103
|
+
emoji = EmojiParser.find('see_no_evil')
|
104
|
+
emoji = EmojiParser.find(';-)')
|
105
|
+
```
|
106
|
+
|
107
|
+
Use the `image_path` helper to derive an image filepath from any symbol format (unicode, token, emoticon). You may optionally provide a custom path that overrides the *gemoji* default location (this is useful if you'd like to reference your images from a CDN):
|
83
108
|
|
84
|
-
|
109
|
+
```ruby
|
110
|
+
EmojiParser.image_path('tropical_fish')
|
111
|
+
# "unicode/1f420.png"
|
112
|
+
|
113
|
+
EmojiParser.image_path('tropical_fish', '//cdn.fu/emoji/')
|
114
|
+
# "//cdn.fu/emoji/1f420.png"
|
115
|
+
```
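The path substitution itself is small. The sketch below mirrors the expression used by `image_path` in `lib/gemoji-parser.rb`, but takes an image filename string directly instead of looking up a symbol; the name `sketch_image_path` is hypothetical.

```ruby
# Strip one trailing slash from the custom path, then append only the
# basename of the default image filename.
def sketch_image_path(image_filename, path = nil)
  return image_filename unless path
  "#{path.sub(/\/$/, '')}/#{image_filename.split('/').last}"
end

sketch_image_path("unicode/1f420.png")                     # => "unicode/1f420.png"
sketch_image_path("unicode/1f420.png", "//cdn.fu/emoji/")  # => "//cdn.fu/emoji/1f420.png"
sketch_image_path("unicode/1f420.png", "//cdn.fu/emoji")   # => "//cdn.fu/emoji/1f420.png"
```

Because the trailing slash is normalized, a custom path works the same with or without it.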
|
116
|
+
|
117
|
+
## Custom Symbols
|
118
|
+
|
119
|
+
**Emoji**
|
120
|
+
|
121
|
+
The parser plays nicely with custom emoji defined through the *gemoji* core. You just need to call `rehash!` once after adding new emoji symbols to regenerate the parser's regex cache:
|
122
|
+
|
123
|
+
```ruby
|
124
|
+
Emoji.create('boxing_kangaroo') # << WHY IS THIS NOT STANDARD?!
|
125
|
+
EmojiParser.rehash!
|
126
|
+
```
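The need for `rehash!` follows from the regex cache. As a minimal sketch of that caching pattern, assuming a hypothetical `SketchCache` module with a tiny symbol list (not the gem's own code):

```ruby
module SketchCache
  extend self

  # Hypothetical symbol list; the real parser reads Emoji.all and its emoticon hash.
  def symbols
    @symbols ||= [":)", ";)"]
  end

  # Build the regex once; rebuild only when asked to rehash.
  def regex(opts = {})
    return @regex if defined?(@regex) && !opts[:rehash]
    @regex = Regexp.union(symbols)
  end

  def rehash!
    regex(rehash: true)
  end
end

SketchCache.regex                 # built and cached
SketchCache.symbols << ":D"       # the cached regex does not see this yet
SketchCache.regex.match?(":D")    # => false
SketchCache.rehash!
SketchCache.regex.match?(":D")    # => true
```

Until the cache is rebuilt, newly added symbols are invisible to the matcher, which is why one `rehash!` call is needed after all additions are made.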
|
127
|
+
|
128
|
+
**Emoticons**
|
129
|
+
|
130
|
+
Emoticon patterns are defined through the parser, and are simply mapped to an emoji name that exists within the *gemoji* core (this can be a standard emoji, or a custom emoji that you have added). To see default emoticons, inspect the `EmojiParser.emoticons` hash. For custom emoticons:
|
131
|
+
|
132
|
+
```ruby
|
133
|
+
# Alias a standard emoji:
|
134
|
+
EmojiParser.emoticons[':@'] = :angry
|
135
|
+
|
136
|
+
# Create a custom emoji, and alias it:
|
137
|
+
Emoji.create('bill_clinton')
|
138
|
+
EmojiParser.emoticons['=:o]'] = :bill_clinton
|
139
|
+
|
140
|
+
# IMPORTANT:
|
141
|
+
# Rehash once after adding new symbols to Emoji core, or to the EmojiParser:
|
142
|
+
EmojiParser.rehash!
|
143
|
+
```
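The default `EmojiParser.emoticons` hash is generated from a base set, with each pattern duplicated as a "noseless" variant (see the comments in `lib/gemoji-parser.rb`). A minimal sketch of that expansion, assuming a hypothetical three-entry base set (`SKETCH_BASE`) and a made-up helper name:

```ruby
# Hypothetical subset of the base emoticon definitions (name => pattern(s)).
SKETCH_BASE = {
  blush: ":-)",
  stuck_out_tongue: [":-p", ":-P"],
  monkey_face: ":o)"
}.freeze

# Flatten into a pattern => name table, adding a noseless copy of each pattern.
def sketch_expand_emoticons(base)
  base.each_with_object({}) do |(name, patterns), table|
    Array(patterns).each do |pattern|
      table[pattern] = name
      # Drop a "-" nose that directly follows ":", ";" or "=" eyes.
      table[pattern.sub(/(?<=:|;|=)-/, '')] = name
    end
  end
end

table = sketch_expand_emoticons(SKETCH_BASE)
table[":)"]   # => :blush
table[":P"]   # => :stuck_out_tongue
table[":o)"]  # => :monkey_face
```

Patterns without a nose, such as `:o)`, are stored once because the substitution leaves them unchanged.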
|
144
|
+
|
145
|
+
## Shoutout
|
85
146
|
|
86
|
-
|
147
|
+
Thanks to the GitHub team for the [gemoji](https://github.com/github/gemoji) gem, and my esteemed colleague Michael Lovitt for the fantastic [Rubular](http://rubular.com/) regex tool (it has been invaluable for this project).
|
87
148
|
|
88
|
-
|
89
|
-
2. Create your feature branch (`git checkout -b my-new-feature`)
|
90
|
-
3. Commit your changes (`git commit -am 'Add some feature'`)
|
91
|
-
4. Push to the branch (`git push origin my-new-feature`)
|
92
|
-
5. Create a new Pull Request
|
149
|
+
🙈 🙊 🙉
|
data/gemoji-parser.gemspec
CHANGED
@@ -9,7 +9,7 @@ Gem::Specification.new do |s|
|
|
9
9
|
s.authors = ["Greg MacWilliam"]
|
10
10
|
s.email = ["greg.macwilliam@voxmedia.com"]
|
11
11
|
s.summary = %q{The missing helper methods for GitHub's Gemoji gem.}
|
12
|
-
s.description = %q{
|
12
|
+
s.description = %q{Expands GitHub Gemoji to parse unicode and token emoji symbols into custom formats.}
|
13
13
|
s.homepage = "https://github.com/gmac/gemoji-parser"
|
14
14
|
s.license = "MIT"
|
15
15
|
|
@@ -19,7 +19,6 @@ Gem::Specification.new do |s|
|
|
19
19
|
s.require_paths = ["lib"]
|
20
20
|
|
21
21
|
s.required_ruby_version = '> 1.9'
|
22
|
-
|
23
22
|
s.add_dependency "gemoji", ">= 2.1.0"
|
24
23
|
s.add_development_dependency "bundler", "~> 1.6"
|
25
24
|
s.add_development_dependency "rake", "~> 10.0"
|
data/lib/gemoji-parser.rb
CHANGED
@@ -4,46 +4,215 @@ require 'gemoji'
|
|
4
4
|
module EmojiParser
|
5
5
|
extend self
|
6
6
|
|
7
|
-
#
|
8
|
-
#
|
9
|
-
|
10
|
-
|
11
|
-
|
7
|
+
# Emoticons
|
8
|
+
# ---------
|
9
|
+
# The base emoticons set (below) is generated with "noseless" variants, ie: :-) and :)
|
10
|
+
# The generated `EmojiParser.emoticons` hash is formatted as:
|
11
|
+
# ---
|
12
|
+
# > {
|
13
|
+
# > ":-)" => :blush,
|
14
|
+
# > ":)" => :blush,
|
15
|
+
# > ":-D" => :smile,
|
16
|
+
# > ":D" => :smile,
|
17
|
+
# > }
|
18
|
+
#
|
19
|
+
# This base set is selected for commonality and high degrees of author intention.
|
20
|
+
# If you want more/different emoticons:
|
21
|
+
# - Please DO customize the `EmojiParser.emoticons` hash in your app runtime.
|
22
|
+
# - Please DO NOT customize this source code and issue a pull request.
|
23
|
+
#
|
24
|
+
# To add an emoticon:
|
25
|
+
# ---
|
26
|
+
# > EmojiParser.emoticons[':-$'] = :grimacing
|
27
|
+
# > EmojiParser.rehash!
|
28
|
+
#
|
29
|
+
# To remove an emoticon:
|
30
|
+
# ---
|
31
|
+
# > EmojiParser.emoticons.delete(':-$')
|
32
|
+
# > EmojiParser.rehash!
|
33
|
+
#
|
34
|
+
# NOTE: call `rehash!` after making changes to Emoji/emoticon sets.
|
35
|
+
# Rehashing updates the parser's regex cache with the latest icons.
|
36
|
+
#
|
37
|
+
def emoticons
|
38
|
+
return @emoticons if defined? @emoticons
|
39
|
+
@emoticons = {}
|
40
|
+
emoticons = {
|
41
|
+
angry: ">:-(",
|
42
|
+
blush: ":-)",
|
43
|
+
cry: ":'(",
|
44
|
+
confused: [":-\\", ":-/"],
|
45
|
+
disappointed: ":-(",
|
46
|
+
kiss: ":-*",
|
47
|
+
neutral_face: ":-|",
|
48
|
+
monkey_face: ":o)",
|
49
|
+
open_mouth: ":-o",
|
50
|
+
smiley: "=-)",
|
51
|
+
smile: ":-D",
|
52
|
+
stuck_out_tongue: [":-p", ":-P", ":-b"],
|
53
|
+
stuck_out_tongue_winking_eye: [";-p", ";-P", ";-b"],
|
54
|
+
wink: ";-)"
|
55
|
+
}
|
56
|
+
|
57
|
+
# Parse all named patterns into a flat hash table,
|
58
|
+
# where pattern is the key and its token is the value.
|
59
|
+
# All patterns are duplicated with their "noseless" variants, i.e. :-) and :)
|
60
|
+
emoticons.each_pair do |name, patterns|
|
61
|
+
patterns = [patterns] unless patterns.is_a?(Array)
|
62
|
+
patterns.each do |pattern|
|
63
|
+
@emoticons[pattern] = name
|
64
|
+
@emoticons[pattern.sub(/(?<=:|;|=)-/, '')] = name
|
65
|
+
end
|
66
|
+
end
|
67
|
+
|
68
|
+
@emoticons
|
69
|
+
end
|
70
|
+
|
71
|
+
attr_writer :emoticons
|
72
|
+
|
73
|
+
# Rehashes all cached regular expressions.
|
74
|
+
# IMPORTANT: call this once after changing emoji characters or emoticon patterns.
|
75
|
+
def rehash!
|
76
|
+
unicode_regex(rehash: true)
|
77
|
+
token_regex(rehash: true)
|
78
|
+
emoticon_regex(rehash: true)
|
79
|
+
end
|
80
|
+
|
81
|
+
# Creates an optimized regular expression for matching unicode symbols.
|
82
|
+
# - Options: rehash:boolean
|
83
|
+
def unicode_regex(opts={})
|
84
|
+
return @unicode_regex if defined?(@unicode_regex) && !opts[:rehash]
|
85
|
+
pattern = []
|
12
86
|
|
13
87
|
Emoji.all.each do |emoji|
|
14
88
|
u = emoji.unicode_aliases.map do |str|
|
15
89
|
str.codepoints.map { |c| '\u{%s}' % c.to_s(16).rjust(4, '0') }.join('')
|
16
90
|
end
|
17
|
-
#
|
18
|
-
|
91
|
+
# Simple method: x10 slower!
|
92
|
+
# pattern.concat u.sort! { |a, b| b.length - a.length }
|
93
|
+
pattern << unicode_matcher(u) if u.any?
|
94
|
+
end
|
95
|
+
|
96
|
+
@unicode_pattern = pattern.join('|')
|
97
|
+
@unicode_regex = Regexp.new("(#{@unicode_pattern})")
|
98
|
+
end
|
99
|
+
|
100
|
+
# Creates a regular expression for matching token symbols.
|
101
|
+
# - Options: rehash:boolean (currently unused)
|
102
|
+
def token_regex(opts={})
|
103
|
+
return @token_regex if defined?(@token_regex)
|
104
|
+
@token_pattern = ':([\w+-]+):'
|
105
|
+
@token_regex = Regexp.new(@token_pattern)
|
106
|
+
end
|
107
|
+
|
108
|
+
# Creates an optimized regular expression for matching emoticon symbols.
|
109
|
+
# - Options: rehash:boolean
|
110
|
+
def emoticon_regex(opts={})
|
111
|
+
return @emoticon_regex if defined?(@emoticon_regex) && !opts[:rehash]
|
112
|
+
pattern = {}
|
113
|
+
|
114
|
+
emoticons.keys.each do |icon|
|
115
|
+
compact_icon = icon.gsub('-', '')
|
116
|
+
|
117
|
+
# Check to see if this icon has a compact version, ex: :-) versus :)
|
118
|
+
# One expression will match as many nose/noseless variants as possible.
|
119
|
+
if compact_icon != icon && emoticons[compact_icon]
|
120
|
+
compact_regex = Regexp.escape(icon).gsub('-', '-?')
|
121
|
+
|
122
|
+
# Keep this expression if it hasn't been defined yet,
|
123
|
+
# or if it's longer than a previously defined pattern.
|
124
|
+
if !pattern[compact_icon] || pattern[compact_icon].length < compact_regex.length
|
125
|
+
pattern[compact_icon] = compact_regex
|
126
|
+
end
|
127
|
+
elsif !pattern[icon]
|
128
|
+
pattern[icon] = Regexp.escape(icon)
|
129
|
+
end
|
130
|
+
end
|
131
|
+
|
132
|
+
@emoticon_pattern = "(?<=^|\\s)(?:#{ pattern.values.join('|') })(?=\\s|$)"
|
133
|
+
@emoticon_regex = Regexp.new("(#{@emoticon_pattern})")
|
134
|
+
end
|
135
|
+
|
136
|
+
# Generates a macro regex for matching one or more symbol sets.
|
137
|
+
# Regex uses various formats, based on symbol sets. Yields match as $1 OR $2
|
138
|
+
# T/EU: (token-$1)|(emoticon-unicode-$2)
|
139
|
+
# T/E or T/U: (token-$1)|(emoticon/unicode-$2)
|
140
|
+
# EU: (emoticon/unicode-$1)
|
141
|
+
# - Options: unicode:boolean, tokens:boolean, emoticons:boolean
|
142
|
+
def macro_regex(opts={})
|
143
|
+
unicode_regex if opts[:unicode]
|
144
|
+
token_regex if opts[:tokens]
|
145
|
+
emoticon_regex if opts[:emoticons]
|
146
|
+
pattern = []
|
147
|
+
|
148
|
+
if opts[:emoticons] && opts[:unicode]
|
149
|
+
pattern << "(?:#{ @emoticon_pattern })"
|
150
|
+
pattern << @unicode_pattern
|
151
|
+
else
|
152
|
+
pattern << @emoticon_pattern if opts[:emoticons]
|
153
|
+
pattern << @unicode_pattern if opts[:unicode]
|
154
|
+
end
|
155
|
+
|
156
|
+
pattern = pattern.any? ? "(#{ pattern.join('|') })" : ""
|
157
|
+
|
158
|
+
if opts[:tokens]
|
159
|
+
if pattern.empty?
|
160
|
+
pattern = @token_pattern
|
161
|
+
else
|
162
|
+
pattern = "(?:#{ @token_pattern })|#{ pattern }"
|
163
|
+
end
|
19
164
|
end
|
20
165
|
|
21
|
-
|
166
|
+
Regexp.new(pattern)
|
22
167
|
end
|
23
168
|
|
24
|
-
# Parses all unicode
|
25
|
-
#
|
169
|
+
# Parses all unicode symbols within a string.
|
170
|
+
# - Block: performs all symbol transformations.
|
26
171
|
def parse_unicode(text)
|
27
|
-
text.gsub(
|
172
|
+
text.gsub(unicode_regex) do |match|
|
28
173
|
emoji = Emoji.find_by_unicode($1)
|
29
174
|
block_given? && emoji ? yield(emoji) : match
|
30
175
|
end
|
31
176
|
end
|
32
177
|
|
33
|
-
# Parses all
|
34
|
-
#
|
178
|
+
# Parses all token symbols within a string.
|
179
|
+
# - Block: performs all symbol transformations.
|
35
180
|
def parse_tokens(text)
|
36
|
-
text.gsub(
|
37
|
-
emoji = Emoji.find_by_alias($1
|
181
|
+
text.gsub(token_regex) do |match|
|
182
|
+
emoji = Emoji.find_by_alias($1)
|
38
183
|
block_given? && emoji ? yield(emoji) : match
|
39
184
|
end
|
40
185
|
end
|
41
186
|
|
42
|
-
# Parses all
|
43
|
-
#
|
44
|
-
def
|
45
|
-
text
|
46
|
-
|
187
|
+
# Parses all emoticon symbols within a string.
|
188
|
+
# - Block: performs all symbol transformations.
|
189
|
+
def parse_emoticons(text)
|
190
|
+
text.gsub(emoticon_regex) do |match|
|
191
|
+
if emoticons.has_key?($1)
|
192
|
+
emoji = Emoji.find_by_alias(emoticons[$1].to_s)
|
193
|
+
block_given? && emoji ? yield(emoji) : match
|
194
|
+
else
|
195
|
+
match
|
196
|
+
end
|
197
|
+
end
|
198
|
+
end
|
199
|
+
|
200
|
+
# Parses all emoji unicode, tokens, and emoticons within a string.
|
201
|
+
# - Block: performs all symbol transformations.
|
202
|
+
# - Options: unicode:boolean, tokens:boolean, emoticons:boolean
|
203
|
+
def parse(text, opts={})
|
204
|
+
opts = { unicode: true, tokens: true, emoticons: true }.merge(opts)
|
205
|
+
if opts.one?
|
206
|
+
return parse_unicode(text) { |e| yield e } if opts[:unicode]
|
207
|
+
return parse_tokens(text) { |e| yield e } if opts[:tokens]
|
208
|
+
return parse_emoticons(text) { |e| yield e } if opts[:emoticons]
|
209
|
+
end
|
210
|
+
text.gsub(macro_regex(opts)) do |match|
|
211
|
+
a = defined?($1) ? $1 : nil
|
212
|
+
b = defined?($2) ? $2 : nil
|
213
|
+
emoji = find(a || b)
|
214
|
+
block_given? && emoji ? yield(emoji) : match
|
215
|
+
end
|
47
216
|
end
|
48
217
|
|
49
218
|
# Transforms all unicode emoji into token strings.
|
@@ -56,8 +225,67 @@ module EmojiParser
|
|
56
225
|
parse_tokens(text) { |emoji| emoji.raw }
|
57
226
|
end
|
58
227
|
|
59
|
-
#
|
60
|
-
|
61
|
-
|
228
|
+
# Finds an Emoji::Character instance for an unknown symbol type.
|
229
|
+
# - symbol: an <Emoji::Character>, or a unicode/token/emoticon string.
|
230
|
+
def find(symbol)
|
231
|
+
return symbol if (symbol.is_a?(Emoji::Character))
|
232
|
+
symbol = emoticons[symbol].to_s if emoticons.has_key?(symbol)
|
233
|
+
Emoji.find_by_alias(symbol) || Emoji.find_by_unicode(symbol) || nil
|
234
|
+
end
|
235
|
+
|
236
|
+
# Gets the image file reference for a symbol; optionally with a custom path.
|
237
|
+
# - symbol: an <Emoji::Character>, or a unicode/token/emoticon string.
|
238
|
+
# - path: a file path to sub into symbol's filename.
|
239
|
+
def image_path(symbol, path=nil)
|
240
|
+
emoji = find(symbol)
|
241
|
+
return nil unless emoji
|
242
|
+
return emoji.image_filename unless path
|
243
|
+
"#{ path.sub(/\/$/, '') }/#{ emoji.image_filename.split('/').pop }"
|
244
|
+
end
|
245
|
+
|
246
|
+
private
|
247
|
+
|
248
|
+
# Compiles an optimized unicode pattern for fast matching.
|
249
|
+
# Matchers use as small a base as possible, with added options. Ex:
|
250
|
+
# 1-char base w/ option: \u{1f6a9}\u{fe0f}?
|
251
|
+
# 2-char base w/ option: \u{1f1ef}\u{1f1f5}\u{fe0f}?
|
252
|
+
# 1-char base w/ options: \u{0031}(?:\u{fe0f}\u{20e3}|\u{20e3}\u{fe0f})?
|
253
|
+
def unicode_matcher(patterns)
|
254
|
+
return patterns.first if patterns.length == 1
|
255
|
+
|
256
|
+
# Sort patterns, longest to shortest:
|
257
|
+
patterns.sort! { |a, b| b.length - a.length }
|
258
|
+
|
259
|
+
# Select a base pattern:
|
260
|
+
# this is the shortest prefix contained by all patterns.
|
261
|
+
base = patterns.last
|
262
|
+
|
263
|
+
if patterns.all? { |p| p.start_with?(base) }
|
264
|
+
base = patterns.pop
|
265
|
+
else
|
266
|
+
base = base.match(/\\u\{.+?\}/).to_s
|
267
|
+
base = nil unless patterns.all? { |p| p.start_with?(base) }
|
268
|
+
end
|
269
|
+
|
270
|
+
# Collect base options and/or alternate patterns:
|
271
|
+
opts = []
|
272
|
+
alts = []
|
273
|
+
patterns.each do |pattern|
|
274
|
+
if base && pattern.start_with?(base)
|
275
|
+
opts << pattern.sub(base, '')
|
276
|
+
else
|
277
|
+
alts << pattern
|
278
|
+
end
|
279
|
+
end
|
280
|
+
|
281
|
+
# Format base options:
|
282
|
+
if opts.length == 1
|
283
|
+
base += "#{ opts.first }?"
|
284
|
+
elsif opts.length > 1
|
285
|
+
base += "(?:#{ opts.join('|') })?"
|
286
|
+
end
|
287
|
+
|
288
|
+
alts << base if base
|
289
|
+
alts.join('|')
|
62
290
|
end
|
63
291
|
end
|
data/spec/emoji_parser_spec.rb
ADDED
@@ -0,0 +1,222 @@
|
|
1
|
+
# coding: utf-8
|
2
|
+
require 'gemoji-parser'
|
3
|
+
|
4
|
+
describe EmojiParser do
|
5
|
+
let(:test_unicode) { 'Test 🙈 🙊 🙉 😰 :invalid: 🐠. :o)' }
|
6
|
+
let(:test_mixed) { 'Test 🙈 🙊 🙉 :cold_sweat: :invalid: :tropical_fish:. :o)' }
|
7
|
+
let(:test_tokens) { 'Test :see_no_evil: :speak_no_evil: :hear_no_evil: :cold_sweat: :invalid: :tropical_fish:. :o)' }
|
8
|
+
let(:test_emoticons) { ';-) Test (:cold_sweat:) :) :-D' }
|
9
|
+
let(:test_custom) { Emoji.create('custom') }
|
10
|
+
|
11
|
+
describe '#emoticons' do
|
12
|
+
it 'should provide a hash with emoticons and their tokens as key/value pairs.' do
|
13
|
+
expect(EmojiParser.emoticons[':o)']).to eq :monkey_face
|
14
|
+
end
|
15
|
+
end
|
16
|
+
|
17
|
+
describe '#unicode_regex' do
|
18
|
+
it 'generates once and remains cached.' do
|
19
|
+
first = EmojiParser.unicode_regex
|
20
|
+
second = EmojiParser.unicode_regex
|
21
|
+
expect(first).to be second
|
22
|
+
end
|
23
|
+
|
24
|
+
it 'regenerates when called with a :rehash option.' do
|
25
|
+
first = EmojiParser.unicode_regex
|
26
|
+
second = EmojiParser.unicode_regex(rehash: true)
|
27
|
+
expect(first).not_to be second
|
28
|
+
end
|
29
|
+
end
|
30
|
+
|
31
|
+
describe '#token_regex' do
|
32
|
+
it 'generates once and remains cached.' do
|
33
|
+
first = EmojiParser.token_regex
|
34
|
+
second = EmojiParser.token_regex
|
35
|
+
expect(first).to be second
|
36
|
+
end
|
37
|
+
end
|
38
|
+
|
39
|
+
describe '#emoticon_regex' do
|
40
|
+
it 'generates once and remains cached.' do
|
41
|
+
first = EmojiParser.emoticon_regex
|
42
|
+
second = EmojiParser.emoticon_regex
|
43
|
+
expect(first).to be second
|
44
|
+
end
|
45
|
+
|
46
|
+
it 'regenerates when called with a :rehash option.' do
|
47
|
+
first = EmojiParser.emoticon_regex
|
48
|
+
second = EmojiParser.emoticon_regex(rehash: true)
|
49
|
+
expect(first).not_to be second
|
50
|
+
end
|
51
|
+
end
|
52
|
+
|
53
|
+
describe '#parse_unicode' do
|
54
|
+
it 'successfully parses full Gemoji unicode set.' do
|
55
|
+
Emoji.all.each do |emoji|
|
56
|
+
emoji.unicode_aliases.each do |u|
|
57
|
+
parsed = EmojiParser.parse_unicode("Test #{u}") { |e| 'X' }
|
58
|
+
expect(parsed).to eq "Test X"
|
59
|
+
end
|
60
|
+
end
|
61
|
+
end
|
62
|
+
|
63
|
+
it 'replaces all valid unicode symbols via block transformation.' do
|
64
|
+
parsed = EmojiParser.parse_unicode(test_mixed) { |e| 'X' }
|
65
|
+
expect(parsed).to eq 'Test X X X :cold_sweat: :invalid: :tropical_fish:. :o)'
|
66
|
+
end
|
67
|
+
end
|
68
|
+
|
69
|
+
describe '#parse_tokens' do
|
70
|
+
it 'successfully parses full Gemoji name set.' do
|
71
|
+
Emoji.all.each do |emoji|
|
72
|
+
parsed = EmojiParser.parse_tokens("Test :#{emoji.name}:") { |e| 'X' }
|
73
|
+
expect(parsed).to eq "Test X"
|
74
|
+
end
|
75
|
+
end
|
76
|
+
|
77
|
+
it 'replaces all valid token symbols via block transformation.' do
|
78
|
+
parsed = EmojiParser.parse_tokens(test_tokens) { |e| 'X' }
|
79
|
+
expect(parsed).to eq 'Test X X X X :invalid: X. :o)'
|
80
|
+
end
|
81
|
+
end
|
82
|
+
|
83
|
+
describe '#parse_emoticons' do
|
84
|
+
it 'successfully parses full default emoticon set.' do
|
85
|
+
EmojiParser.emoticons.each_key do |emoticon|
|
86
|
+
parsed = EmojiParser.parse_emoticons("Test #{emoticon}") { |e| 'X' }
|
87
|
+
expect(parsed).to eq "Test X"
|
88
|
+
end
|
89
|
+
end
|
90
|
+
|
91
|
+
it 'replaces all valid emoticon symbols via block transformation.' do
|
92
|
+
parsed = EmojiParser.parse_emoticons(test_emoticons) { |e| 'X' }
|
93
|
+
expect(parsed).to eq 'X Test (:cold_sweat:) X X'
|
94
|
+
end
|
95
|
+
end
|
96
|
+
|
97
|
+
describe '#parse' do
|
98
|
+
it 'replaces valid symbols of all types via block transformation.' do
|
99
|
+
parsed = EmojiParser.parse(test_mixed) { |e| 'X' }
|
100
|
+
expect(parsed).to eq 'Test X X X X :invalid: X. X'
|
101
|
+
end
|
102
|
+
|
103
|
+
it 'replaces valid symbols of specified types (unicode, tokens).' do
|
104
|
+
parsed = EmojiParser.parse(test_mixed, emoticons: false) { |e| 'X' }
|
105
|
+
expect(parsed).to eq 'Test X X X X :invalid: X. :o)'
|
106
|
+
end
|
107
|
+
|
108
|
+
it 'replaces valid symbols of specified types (unicode, emoticons).' do
|
109
|
+
parsed = EmojiParser.parse(test_mixed, tokens: false) { |e| 'X' }
|
110
|
+
expect(parsed).to eq 'Test X X X :cold_sweat: :invalid: :tropical_fish:. X'
|
111
|
+
end
|
112
|
+
|
113
|
+
it 'replaces valid symbols of specified types (tokens, emoticons).' do
|
114
|
+
parsed = EmojiParser.parse(test_mixed, unicode: false) { |e| 'X' }
|
115
|
+
expect(parsed).to eq 'Test 🙈 🙊 🙉 X :invalid: X. X'
|
116
|
+
end
|
117
|
+
|
118
|
+
it 'allows symbols to safely insert other symbol types without getting re-parsed.' do
|
119
|
+
parsed = EmojiParser.parse('🙈 🙊 :hear_no_evil:') { |e| ":#{e.name}:" }
|
120
|
+
expect(parsed).to eq ':see_no_evil: :speak_no_evil: :hear_no_evil:'
|
121
|
+
end
|
122
|
+
end
|
123
|
+
|
124
|
+
describe '#tokenize' do
|
125
|
+
it 'successfully tokenizes full Gemoji unicode set.' do
|
126
|
+
Emoji.all.each do |emoji|
|
127
|
+
emoji.unicode_aliases.each do |u|
|
128
|
+
tokenized = EmojiParser.tokenize("Test #{u}")
|
129
|
+
expect(tokenized).to eq "Test :#{emoji.name}:"
|
130
|
+
end
|
131
|
+
end
|
132
|
+
end
|
133
|
+
|
134
|
+
it 'replaces all valid emoji unicode with their token equivalent.' do
|
135
|
+
tokenized = EmojiParser.tokenize(test_mixed)
|
136
|
+
expect(tokenized).to eq test_tokens
|
137
|
+
end
|
138
|
+
end
|
139
|
+
|
140
|
+
describe '#detokenize' do
|
141
|
+
it 'replaces all valid emoji tokens with their raw unicode equivalent.' do
|
142
|
+
tokenized = EmojiParser.detokenize(test_mixed)
|
143
|
+
expect(tokenized).to eq test_unicode
|
144
|
+
end
|
145
|
+
end
|
146
|
+
|
147
|
+
describe '#find' do
|
148
|
+
let (:the_unicode) { '🐵' }
|
149
|
+
let (:the_token) { 'monkey_face' }
|
150
|
+
let (:the_emoticon) { ':o)' }
|
151
|
+
let (:the_emoji) { Emoji.find_by_alias(the_token) }
|
152
|
+
|
153
|
+
it 'returns valid emoji characters.' do
|
154
|
+
expect(EmojiParser.find(the_emoji)).to eq the_emoji
|
155
|
+
end
|
156
|
+
|
157
|
+
it 'finds the proper emoji character for a unicode symbol.' do
|
158
|
+
expect(EmojiParser.find(the_unicode)).to eq the_emoji
|
159
|
+
end
|
160
|
+
|
161
|
+
it 'finds the proper emoji character for a token symbol.' do
|
162
|
+
expect(EmojiParser.find(the_token)).to eq the_emoji
|
163
|
+
end
|
164
|
+
|
165
|
+
it 'finds the proper emoji character for an emoticon symbol.' do
|
166
|
+
expect(EmojiParser.find(the_emoticon)).to eq the_emoji
|
167
|
+
end
|
168
|
+
end
|
169
|
+
|
170
|
+
describe '#image_path' do
|
171
|
+
let (:the_emoji) { Emoji.find_by_alias('smiley') }
|
172
|
+
let (:the_image) { '1f603.png' }
|
173
|
+
|
174
|
+
it 'gets the image filename by emoji character.' do
|
175
|
+
path = EmojiParser.image_path(the_emoji)
|
176
|
+
expect(path).to eq the_emoji.image_filename
|
177
|
+
end
|
178
|
+
|
179
|
+
it 'gets the image filename by unicode symbol.' do
|
180
|
+
path = EmojiParser.image_path(the_emoji.raw)
|
181
|
+
expect(path).to eq the_emoji.image_filename
|
182
|
+
end
|
183
|
+
|
184
|
+
it 'gets the image filename by token symbol.' do
|
185
|
+
path = EmojiParser.image_path(the_emoji.name)
|
186
|
+
expect(path).to eq the_emoji.image_filename
|
187
|
+
end
|
188
|
+
|
189
|
+
it 'gets the image filename by emoticon symbol.' do
|
190
|
+
path = EmojiParser.image_path('=)')
|
191
|
+
expect(path).to eq the_emoji.image_filename
|
192
|
+
end
|
193
|
+
|
194
|
+
it 'formats a Gemoji image path as a custom location (with trailing slash).' do
|
195
|
+
custom_path = '//fonts.test.com/emoji/'
|
196
|
+
path = EmojiParser.image_path(the_emoji, custom_path)
|
197
|
+
expect(path).to eq "#{ custom_path }#{ the_image }"
|
198
|
+
end
|
199
|
+
|
200
|
+
it 'formats a Gemoji image path to a custom location (no trailing slash).' do
|
201
|
+
custom_path = '//fonts.test.com/emoji'
|
202
|
+
path = EmojiParser.image_path(the_emoji, custom_path)
|
203
|
+
expect(path).to eq "#{ custom_path }/#{ the_image }"
|
204
|
+
end
|
205
|
+
end
|
206
|
+
|
207
|
+
describe 'custom emoji' do
|
208
|
+
it 'replaces tokens for custom Emoji.' do
|
209
|
+
Emoji.create('boxing_kangaroo')
|
210
|
+
parsed = EmojiParser.parse_tokens('Test :boxing_kangaroo:') { |e| 'X' }
|
211
|
+
expect(parsed).to eq 'Test X'
|
212
|
+
end
|
213
|
+
|
214
|
+
it 'replaces custom emoticons (requires rehashing the regex).' do
|
215
|
+
EmojiParser.emoticons['¯\\(°_o)/¯'] = :confused
|
216
|
+
EmojiParser.emoticon_regex(rehash: true)
|
217
|
+
|
218
|
+
parsed = EmojiParser.parse_emoticons('Test ¯\\(°_o)/¯') { |e| e.name }
|
219
|
+
expect(parsed).to eq 'Test confused'
|
220
|
+
end
|
221
|
+
end
|
222
|
+
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: gemoji-parser
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.
|
4
|
+
version: 1.1.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Greg MacWilliam
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2015-03-
|
11
|
+
date: 2015-03-21 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: gemoji
|
@@ -66,8 +66,8 @@ dependencies:
|
|
66
66
|
- - ">="
|
67
67
|
- !ruby/object:Gem::Version
|
68
68
|
version: '0'
|
69
|
-
description:
|
70
|
-
|
69
|
+
description: Expands GitHub Gemoji to parse unicode and token emoji symbols into custom
|
70
|
+
formats.
|
71
71
|
email:
|
72
72
|
- greg.macwilliam@voxmedia.com
|
73
73
|
executables: []
|
@@ -82,7 +82,7 @@ files:
|
|
82
82
|
- gemoji-parser.gemspec
|
83
83
|
- lib/gemoji-parser.rb
|
84
84
|
- lib/gemoji-parser/version.rb
|
85
|
-
- spec/
|
85
|
+
- spec/emoji_parser_spec.rb
|
86
86
|
homepage: https://github.com/gmac/gemoji-parser
|
87
87
|
licenses:
|
88
88
|
- MIT
|
@@ -108,4 +108,4 @@ signing_key:
|
|
108
108
|
specification_version: 4
|
109
109
|
summary: The missing helper methods for GitHub's Gemoji gem.
|
110
110
|
test_files:
|
111
|
-
- spec/
|
111
|
+
- spec/emoji_parser_spec.rb
|
data/spec/emoji_helper_spec.rb
DELETED
@@ -1,88 +0,0 @@
|
|
1
|
-
# coding: utf-8
|
2
|
-
require 'gemoji-parser'
|
3
|
-
|
4
|
-
describe EmojiParser do
|
5
|
-
let(:test_unicode) { 'Test 🙈 🙊 🙉 😰 :invalid: 🐠.' }
|
6
|
-
let(:test_mixed) { 'Test 🙈 🙊 🙉 :cold_sweat: :invalid: :tropical_fish:.' }
|
7
|
-
let(:test_tokens) { 'Test :see_no_evil: :speak_no_evil: :hear_no_evil: :cold_sweat: :invalid: :tropical_fish:.' }
|
8
|
-
|
9
|
-
describe '#emoji_regexp' do
|
10
|
-
it 'generates once and remains cached.' do
|
11
|
-
first = EmojiParser.emoji_regexp
|
12
|
-
second = EmojiParser.emoji_regexp
|
13
|
-
expect(first).to be second
|
14
|
-
end
|
15
|
-
|
16
|
-
it 'regenerates when called with a :rehash option.' do
|
17
|
-
first = EmojiParser.emoji_regexp
|
18
|
-
second = EmojiParser.emoji_regexp(rehash: true)
|
19
|
-
expect(first).not_to be second
|
20
|
-
end
|
21
|
-
end
|
22
|
-
|
23
|
-
describe '#parse_unicode' do
|
24
|
-
it 'replaces all valid emoji unicode via block transformation.' do
|
25
|
-
parsed = EmojiParser.parse_unicode(test_mixed) { |emoji| 'X' }
|
26
|
-
expect(parsed).to eq "Test X X X :cold_sweat: :invalid: :tropical_fish:."
|
27
|
-
end
|
28
|
-
end
|
29
|
-
|
30
|
-
describe '#parse_tokens' do
|
31
|
-
it 'replaces all valid emoji tokens via block transformation.' do
|
32
|
-
parsed = EmojiParser.parse_tokens(test_tokens) { |emoji| 'X' }
|
33
|
-
expect(parsed).to eq "Test X X X X :invalid: X."
|
34
|
-
end
|
35
|
-
end
|
36
|
-
|
37
|
-
describe '#parse_all' do
|
38
|
-
it 'replaces all valid emoji unicode and tokens via block transformation.' do
|
39
|
-
parsed = EmojiParser.parse_all(test_mixed) { |emoji| 'X' }
|
40
|
-
expect(parsed).to eq "Test X X X X :invalid: X."
|
41
|
-
end
|
42
|
-
end
|
43
|
-
|
44
|
-
describe '#tokenize' do
|
45
|
-
it 'successfully tokenizes all Gemoji unicode aliases.' do
|
46
|
-
Emoji.all.each do |emoji|
|
47
|
-
emoji.unicode_aliases.each do |u|
|
48
|
-
tokenized = EmojiParser.tokenize("Test #{u}")
|
49
|
-
expect(tokenized).to eq "Test :#{emoji.name}:"
|
50
|
-
end
|
51
|
-
end
|
52
|
-
end
|
53
|
-
|
54
|
-
it 'replaces all valid emoji unicodes with their token equivalent.' do
|
55
|
-
tokenized = EmojiParser.tokenize(test_mixed)
|
56
|
-
expect(tokenized).to eq test_tokens
|
57
|
-
end
|
58
|
-
end
|
59
|
-
|
60
|
-
describe '#detokenize' do
|
61
|
-
it 'replaces all valid emoji tokens with their raw unicode equivalent.' do
|
62
|
-
tokenized = EmojiParser.detokenize(test_mixed)
|
63
|
-
expect(tokenized).to eq test_unicode
|
64
|
-
end
|
65
|
-
end
|
66
|
-
|
67
|
-
describe '#filepath' do
|
68
|
-
let (:test_emoji) { Emoji.find_by_alias('de') }
|
69
|
-
let (:test_file) { '1f1e9-1f1ea.png' }
|
70
|
-
|
71
|
-
it 'formats a Gemoji image path as a root location by default.' do
|
72
|
-
path = EmojiParser.filepath(test_emoji)
|
73
|
-
expect(path).to eq "/#{test_file}"
|
74
|
-
end
|
75
|
-
|
76
|
-
it 'formats a Gemoji image path as a custom location (with trailing slash).' do
|
77
|
-
images_path = '//fonts.test.com/emoji/'
|
78
|
-
path = EmojiParser.filepath(test_emoji, images_path)
|
79
|
-
expect(path).to eq "#{images_path}#{test_file}"
|
80
|
-
end
|
81
|
-
|
82
|
-
it 'formats a Gemoji image path to a custom location (no trailing slash).' do
|
83
|
-
images_path = '//fonts.test.com/emoji'
|
84
|
-
path = EmojiParser.filepath(test_emoji, images_path)
|
85
|
-
expect(path).to eq "#{images_path}/#{test_file}"
|
86
|
-
end
|
87
|
-
end
|
88
|
-
end
|