gemoji-parser 1.0.0 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 1f5c233d77c332dabf52ca117a87ac008e0df2c2
4
- data.tar.gz: 79876f028d01361d173c93de9ca527ee6ec71c7b
3
+ metadata.gz: 3621544eb0c0dfe923ad4639f4ec51fefbf42cdc
4
+ data.tar.gz: 412e5b97b12b82fd2fcb1ae4a95546bb0eac8bff
5
5
  SHA512:
6
- metadata.gz: 20e96eb56067495de4b03295e5484853bf86162d0773203f8b67c7eff47bf1ffda05d8753c67c575fe4dd502e68ffbee231a40f75f2db94eb2908b95db1d52e6
7
- data.tar.gz: d24710b9c58651f74749ff7fcf2efb00440b04102ffe511e8eaa1b54b066ed5169c9074d8b37eacfb3a6a33fce06117919fc50f77798a08203b2144272eab365
6
+ metadata.gz: dec9825da6d1d409f98c5afb6cce0363bc00c0ee34b5fa5f6b0ed3c8faca12a1228b7d20b10e75174ed36621756075ba6cb145b766acecd2bb0756fc12d57d5f
7
+ data.tar.gz: 67e7e574e303eda0b25f2fb1dca6227802afc86210827df428d9d5423b1cf98a92e0abba5788cc156a0f89a9d12281cb634d459832f3e3401af0b7cff555273d
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # gemoji-parser
2
2
 
3
- The missing helper methods for [GitHub's Gemoji](https://github.com/github/gemoji) gem. This utility provides a parsing API for the `Emoji` corelib (provided by Gemoji). The parser includes quick tokenizers for transforming unicode symbols (🐠) into token symbols (`:tropical_fish:`), and arbitrary block replacement methods for custom formatting of symbols.
3
+ The missing helper methods for [GitHub's gemoji](https://github.com/github/gemoji) gem. This utility provides a parsing API for the `Emoji` corelib (provided by *gemoji*). The parser handles transformations of emoji symbols between unicode (😃), token (`:smile:`), and emoticon (`:-D`) formats; and may perform arbitrary replacement of emoji symbols into custom display formats (such as image tags). Internally, highly-optimized regular expressions are generated and cached to maximize parsing efficiency.
4
4
 
5
5
  ## Installation
6
6
 
@@ -12,7 +12,7 @@ gem 'gemoji-parser'
12
12
 
13
13
  And then execute:
14
14
 
15
- $ bundle
15
+ $ bundle install
16
16
 
17
17
  Or install it yourself as:
18
18
 
@@ -20,14 +20,13 @@ Or install it yourself as:
20
20
 
21
21
  To run tests:
22
22
 
23
- $ be rake spec
23
+ $ bundle exec rake spec
24
24
 
25
25
  ## Usage
26
26
 
27
-
28
27
  ### Tokenizing
29
28
 
30
- These methods perform basic conversions of unicode symbols to token symbols, and vice versa.
29
+ The tokenizer methods perform basic conversions of unicode symbols into token symbols, and vice versa.
31
30
 
32
31
  ```ruby
33
32
  EmojiParser.tokenize("Test 🙈 🙊 🙉")
@@ -37,56 +36,114 @@ EmojiParser.detokenize("Test :see_no_evil: :speak_no_evil: :hear_no_evil:")
37
36
  # "Test 🙈 🙊 🙉"
38
37
  ```
39
38
 
40
- ### Block Parsing
39
+ ### Symbol Parsing
41
40
 
42
- For custom symbol transformations, use the block parser methods. All parsers yeild Gemoji `Emoji::Character` instances into the parsing block for custom formatting.
41
+ Use the symbol parser methods for custom transformations. All symbol parsers yield [Emoji::Character](https://github.com/github/gemoji/blob/master/lib/emoji/character.rb) instances into the parsing block for custom formatting.
43
42
 
44
43
  **Unicode symbols**
45
44
 
46
45
  ```ruby
47
- EmojiParser.parse_unicode('Test 🐠') do |emoji|
46
+ EmojiParser.parse_unicode("Test 🐠") do |emoji|
48
47
  %Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
49
48
  end
50
49
 
51
- # 'Test <img src="unicode/1F420.png" alt=":tropical_fish:">'
50
+ # 'Test <img src="unicode/1f420.png" alt=":tropical_fish:">'
52
51
  ```
53
52
 
54
53
  **Token symbols**
55
54
 
56
55
  ```ruby
57
- EmojiParser.parse_tokens('Test :tropical_fish:') do |emoji|
56
+ EmojiParser.parse_tokens("Test :tropical_fish:") do |emoji|
58
57
  %Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
59
58
  end
60
59
 
61
- # 'Test <img src="unicode/1F420.png" alt=":tropical_fish:">'
60
+ # 'Test <img src="unicode/1f420.png" alt=":tropical_fish:">'
62
61
  ```
63
62
 
64
- **All symbols**
63
+ **Emoticon symbols**
65
64
 
66
65
  ```ruby
67
- EmojiParser.parse_all('Test 🐠 :tropical_fish:') { |emoji| emoji.hex_inspect }
66
+ EmojiParser.parse_emoticons("Test ;-)") do |emoji|
67
+ %Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
68
+ end
68
69
 
69
- # 'Test 1f420 1f420'
70
+ # 'Test <img src="unicode/1f609.png" alt=":wink:">'
70
71
  ```
71
72
 
72
- ### File Paths
73
+ **All symbol types**
73
74
 
74
- A helper is provided for formatting custom filepaths beyond the Gemoji default. This may be useful if you'd like to upload your images to a CDN, and simply reference them from there:
75
+ Use the `parse` method to target all symbol types with a single parsing pass. Specific symbol types may be excluded using options:
75
76
 
76
77
  ```ruby
77
- fish = Emoji.find_by_alias('tropical_fish')
78
- EmojiParser.filepath(fish, '//cdn.fu/emoji/')
79
- # "//cdn.fu/emoji/1F420.png"
78
+ EmojiParser.parse("Test 🐠 :scream: ;-)") { |emoji| "[#{emoji.name}]" }
79
+ # 'Test [tropical_fish] [scream] [wink]'
80
+
81
+ EmojiParser.parse("Test 🐠 :scream: ;-)", emoticons: false) do |emoji|
82
+ "[#{emoji.name}]"
83
+ end
84
+ # 'Test [tropical_fish] [scream] ;-)'
80
85
  ```
81
86
 
82
- ## Shoutout
87
+ While the `parse` method is heavier to run than the discrete parsing methods for each symbol type (`parse_unicode`, `parse_tokens`, etc...), it has the advantage of avoiding multiple parsing passes. This is handy if you want parsed symbols to output new symbols in a different format, such as generating image tags that include a symbol in their alt text:
88
+
89
+ ```ruby
90
+ EmojiParser.parse("Test 🐠 ;-)") do |emoji|
91
+ %Q(<img src="#{emoji.image_filename}" alt=":#{emoji.name}:">).html_safe
92
+ end
93
+
94
+ # 'Test <img src="unicode/1f420.png" alt=":tropical_fish:"> <img src="unicode/1f609.png" alt=":wink:">'
95
+ ```
96
+
97
+ ### Lookups & File Paths
98
+
99
+ Use the `find` method to derive [Emoji::Character](https://github.com/github/gemoji/blob/master/lib/emoji/character.rb) instances from any symbol format (unicode, token, emoticon):
100
+
101
+ ```ruby
102
+ emoji = EmojiParser.find('🐠')
103
+ emoji = EmojiParser.find('see_no_evil')
104
+ emoji = EmojiParser.find(';-)')
105
+ ```
106
+
107
+ Use the `image_path` helper to derive an image filepath from any symbol format (unicode, token, emoticon). You may optionally provide a custom path that overrides the *gemoji* default location (this is useful if you'd like to reference your images from a CDN):
83
108
 
84
- Thanks to the GitHub team for the [Gemoji](https://github.com/github/gemoji) gem. They're handling all the heavy lifting.
109
+ ```ruby
110
+ EmojiParser.image_path('tropical_fish')
111
+ # "unicode/1f420.png"
112
+
113
+ EmojiParser.image_path('tropical_fish', '//cdn.fu/emoji/')
114
+ # "//cdn.fu/emoji/1f420.png"
115
+ ```
116
+
117
+ ## Custom Symbols
118
+
119
+ **Emoji**
120
+
121
+ The parser plays nicely with custom emoji defined through the *gemoji* core. You just need to call `rehash!` once after adding new emoji symbols to regenerate the parser's regex cache:
122
+
123
+ ```ruby
124
+ Emoji.create('boxing_kangaroo') # << WHY IS THIS NOT STANDARD?!
125
+ EmojiParser.rehash!
126
+ ```
127
+
128
+ **Emoticons**
129
+
130
+ Emoticon patterns are defined through the parser, and are simply mapped to an emoji name that exists within the *gemoji* core (this can be a standard emoji, or a custom emoji that you have added). To see default emoticons, inspect the `EmojiParser.emoticons` hash. For custom emoticons:
131
+
132
+ ```ruby
133
+ # Alias a standard emoji:
134
+ EmojiParser.emoticons[':@'] = :angry
135
+
136
+ # Create a custom emoji, and alias it:
137
+ Emoji.create('bill_clinton')
138
+ EmojiParser.emoticons['=:o]'] = :bill_clinton
139
+
140
+ # IMPORTANT:
141
+ # Rehash once after adding new symbols to Emoji core, or to the EmojiParser:
142
+ EmojiParser.rehash!
143
+ ```
144
+
145
+ ## Shoutout
85
146
 
86
- ## Contributing
147
+ Thanks to the GitHub team for the [gemoji](https://github.com/github/gemoji) gem, and my esteemed colleague Michael Lovitt for the fantastic [Rubular](http://rubular.com/) regex tool (it has been invaluable for this project).
87
148
 
88
- 1. Fork it ( https://github.com/gmac/gemoji-parser/fork )
89
- 2. Create your feature branch (`git checkout -b my-new-feature`)
90
- 3. Commit your changes (`git commit -am 'Add some feature'`)
91
- 4. Push to the branch (`git push origin my-new-feature`)
92
- 5. Create a new Pull Request
149
+ 🙈 🙊 🙉
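Review note: the README's claim that a single `parse` pass avoids re-parsing its own output can be illustrated without the gem. The sketch below is a minimal stand-in, not the library's implementation; `NAMES`, `tag`, `two_pass`, and `single_pass` are hypothetical names introduced here, and the lookup table holds only one emoji.

```ruby
# Hypothetical stand-in for the gemoji lookup table (assumption, not gem code).
NAMES = { "🐠" => "tropical_fish" }.freeze

# Wrap a symbol in a tag whose alt text contains a token, as the README does.
def tag(name)
  %(<img alt=":#{name}:">)
end

# Two discrete passes: unicode first, then tokens. The second pass also sees
# the ":tropical_fish:" that the first pass wrote into the alt text.
def two_pass(text)
  text = text.gsub("🐠") { |match| tag(NAMES[match]) }
  text.gsub(/:([\w+-]+):/) { "[#{$1}]" }
end

# One pass over an alternation: every match comes from the original text,
# so output written by the block is never scanned again.
def single_pass(text)
  text.gsub(/:([\w+-]+):|(🐠)/) do
    $1 ? "[#{$1}]" : tag(NAMES[$2])
  end
end

puts two_pass("Test 🐠")
puts single_pass("Test 🐠")
```

With two passes the alt text is rewritten a second time; with the single alternation it survives intact, which is the property the README's image-tag example relies on.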
data/gemoji-parser.gemspec CHANGED
@@ -9,7 +9,7 @@ Gem::Specification.new do |s|
9
9
  s.authors = ["Greg MacWilliam"]
10
10
  s.email = ["greg.macwilliam@voxmedia.com"]
11
11
  s.summary = %q{The missing helper methods for GitHub's Gemoji gem.}
12
- s.description = %q{Parses emoji unicode symbols and string tokens, allowing for customizable transformations.}
12
+ s.description = %q{Expands GitHub Gemoji to parse unicode and token emoji symbols into custom formats.}
13
13
  s.homepage = "https://github.com/gmac/gemoji-parser"
14
14
  s.license = "MIT"
15
15
 
@@ -19,7 +19,6 @@ Gem::Specification.new do |s|
19
19
  s.require_paths = ["lib"]
20
20
 
21
21
  s.required_ruby_version = '> 1.9'
22
-
23
22
  s.add_dependency "gemoji", ">= 2.1.0"
24
23
  s.add_development_dependency "bundler", "~> 1.6"
25
24
  s.add_development_dependency "rake", "~> 10.0"
data/lib/gemoji-parser.rb CHANGED
@@ -4,46 +4,215 @@ require 'gemoji'
4
4
  module EmojiParser
5
5
  extend self
6
6
 
7
- # Generates a regular expression for matching emoji unicodes.
8
- # Call with "rehash: true" to regenerate the cached regex.
9
- def emoji_regexp(opts = {})
10
- return @emoji_regexp if defined?(@emoji_regexp) && !opts[:rehash]
11
- patterns = []
7
+ # Emoticons
8
+ # ---------
9
+ # The base emoticons set (below) is generated with "noseless" variants, ie: :-) and :)
10
+ # The generated `EmojiParser.emoticons` hash is formatted as:
11
+ # ---
12
+ # > {
13
+ # > ":-)" => :blush,
14
+ # > ":)" => :blush,
15
+ # > ":-D" => :smile,
16
+ # > ":D" => :smile,
17
+ # > }
18
+ #
19
+ # This base set is selected for commonality and high degrees of author intention.
20
+ # If you want more/different emoticons:
21
+ # - Please DO customize the `EmojiParser.emoticons` hash in your app runtime.
22
+ # - Please DO NOT customize this source code and issue a pull request.
23
+ #
24
+ # To add an emoticon:
25
+ # ---
26
+ # > EmojiParser.emoticons[':-$'] = :grimacing
27
+ # > EmojiParser.rehash!
28
+ #
29
+ # To remove an emoticon:
30
+ # ---
31
+ # > EmojiParser.emoticons.delete(':-$')
32
+ # > EmojiParser.rehash!
33
+ #
34
+ # NOTE: call `rehash!` after making changes to Emoji/emoticon sets.
35
+ # Rehashing updates the parser's regex cache with the latest icons.
36
+ #
37
+ def emoticons
38
+ return @emoticons if defined? @emoticons
39
+ @emoticons = {}
40
+ emoticons = {
41
+ angry: ">:-(",
42
+ blush: ":-)",
43
+ cry: ":'(",
44
+ confused: [":-\\", ":-/"],
45
+ disappointed: ":-(",
46
+ kiss: ":-*",
47
+ neutral_face: ":-|",
48
+ monkey_face: ":o)",
49
+ open_mouth: ":-o",
50
+ smiley: "=-)",
51
+ smile: ":-D",
52
+ stuck_out_tongue: [":-p", ":-P", ":-b"],
53
+ stuck_out_tongue_winking_eye: [";-p", ";-P", ";-b"],
54
+ wink: ";-)"
55
+ }
56
+
57
+ # Parse all named patterns into a flat hash table,
58
+ # where pattern is the key and its token is the value.
59
+ # all patterns are duplicated with the "noseless" variants, ie: :-) and :)
60
+ emoticons.each_pair do |name, patterns|
61
+ patterns = [patterns] unless patterns.is_a?(Array)
62
+ patterns.each do |pattern|
63
+ @emoticons[pattern] = name
64
+ @emoticons[pattern.sub(/(?<=:|;|=)-/, '')] = name
65
+ end
66
+ end
67
+
68
+ @emoticons
69
+ end
70
+
71
+ attr_writer :emoticons
72
+
73
+ # Rehashes all cached regular expressions.
74
+ # IMPORTANT: call this once after changing emoji characters or emoticon patterns.
75
+ def rehash!
76
+ unicode_regex(rehash: true)
77
+ token_regex(rehash: true)
78
+ emoticon_regex(rehash: true)
79
+ end
80
+
81
+ # Creates an optimized regular expression for matching unicode symbols.
82
+ # - Options: rehash:boolean
83
+ def unicode_regex(opts={})
84
+ return @unicode_regex if defined?(@unicode_regex) && !opts[:rehash]
85
+ pattern = []
12
86
 
13
87
  Emoji.all.each do |emoji|
14
88
  u = emoji.unicode_aliases.map do |str|
15
89
  str.codepoints.map { |c| '\u{%s}' % c.to_s(16).rjust(4, '0') }.join('')
16
90
  end
17
- # Append unicode patterns longest first for broader match:
18
- patterns.concat u.sort! { |a, b| b.length - a.length }
91
+ # Simple method: x10 slower!
92
+ # pattern.concat u.sort! { |a, b| b.length - a.length }
93
+ pattern << unicode_matcher(u) if u.any?
94
+ end
95
+
96
+ @unicode_pattern = pattern.join('|')
97
+ @unicode_regex = Regexp.new("(#{@unicode_pattern})")
98
+ end
99
+
100
+ # Creates a regular expression for matching token symbols.
101
+ # - Options: rehash:boolean (currently unused)
102
+ def token_regex(opts={})
103
+ return @token_regex if defined?(@token_regex)
104
+ @token_pattern = ':([\w+-]+):'
105
+ @token_regex = Regexp.new(@token_pattern)
106
+ end
107
+
108
+ # Creates an optimized regular expression for matching emoticon symbols.
109
+ # - Options: rehash:boolean
110
+ def emoticon_regex(opts={})
111
+ return @emoticon_regex if defined?(@emoticon_regex) && !opts[:rehash]
112
+ pattern = {}
113
+
114
+ emoticons.keys.each do |icon|
115
+ compact_icon = icon.gsub('-', '')
116
+
117
+ # Check to see if this icon has a compact version, ex: :-) versus :)
118
+ # One expression will match as many nose/noseless variants as possible.
119
+ if compact_icon != icon && emoticons[compact_icon]
120
+ compact_regex = Regexp.escape(icon).gsub('-', '-?')
121
+
122
+ # Keep this expression if it hasn't been defined yet,
123
+ # or if it's longer than a previously defined pattern.
124
+ if !pattern[compact_icon] || pattern[compact_icon].length < compact_regex.length
125
+ pattern[compact_icon] = compact_regex
126
+ end
127
+ elsif !pattern[icon]
128
+ pattern[icon] = Regexp.escape(icon)
129
+ end
130
+ end
131
+
132
+ @emoticon_pattern = "(?<=^|\\s)(?:#{ pattern.values.join('|') })(?=\\s|$)"
133
+ @emoticon_regex = Regexp.new("(#{@emoticon_pattern})")
134
+ end
135
+
136
+ # Generates a macro regex for matching one or more symbol sets.
137
+ # Regex uses various formats, based on symbol sets. Yields match as $1 OR $2
138
+ # T/EU: (token-$1)|(emoticon-unicode-$2)
139
+ # T/E or T/U: (token-$1)|(emoticon/unicode-$2)
140
+ # EU: (emoticon/unicode-$1)
141
+ # - Options: unicode:boolean, tokens:boolean, emoticons:boolean
142
+ def macro_regex(opts={})
143
+ unicode_regex if opts[:unicode]
144
+ token_regex if opts[:tokens]
145
+ emoticon_regex if opts[:emoticons]
146
+ pattern = []
147
+
148
+ if opts[:emoticons] && opts[:unicode]
149
+ pattern << "(?:#{ @emoticon_pattern })"
150
+ pattern << @unicode_pattern
151
+ else
152
+ pattern << @emoticon_pattern if opts[:emoticons]
153
+ pattern << @unicode_pattern if opts[:unicode]
154
+ end
155
+
156
+ pattern = pattern.any? ? "(#{ pattern.join('|') })" : ""
157
+
158
+ if opts[:tokens]
159
+ if pattern.empty?
160
+ pattern = @token_pattern
161
+ else
162
+ pattern = "(?:#{ @token_pattern })|#{ pattern }"
163
+ end
19
164
  end
20
165
 
21
- @emoji_regexp = Regexp.new("(#{patterns.join('|')})")
166
+ Regexp.new(pattern)
22
167
  end
23
168
 
24
- # Parses all unicode emoji characters within a string.
25
- # Provide a block that performs the character transformation.
169
+ # Parses all unicode symbols within a string.
170
+ # - Block: performs all symbol transformations.
26
171
  def parse_unicode(text)
27
- text.gsub(emoji_regexp) do |match|
172
+ text.gsub(unicode_regex) do |match|
28
173
  emoji = Emoji.find_by_unicode($1)
29
174
  block_given? && emoji ? yield(emoji) : match
30
175
  end
31
176
  end
32
177
 
33
- # Parses all emoji tokens within a string.
34
- # Provide a block that performs the token transformation.
178
+ # Parses all token symbols within a string.
179
+ # - Block: performs all symbol transformations.
35
180
  def parse_tokens(text)
36
- text.gsub(/:([\w+-]+):/) do |match|
37
- emoji = Emoji.find_by_alias($1.to_s)
181
+ text.gsub(token_regex) do |match|
182
+ emoji = Emoji.find_by_alias($1)
38
183
  block_given? && emoji ? yield(emoji) : match
39
184
  end
40
185
  end
41
186
 
42
- # Parses all emoji unicodes and tokens within a string.
43
- # Provide a block that performs all transformations.
44
- def parse_all(text)
45
- text = parse_unicode(text) { |emoji| yield(emoji) }
46
- parse_tokens(text) { |emoji| yield(emoji) }
187
+ # Parses all emoticon symbols within a string.
188
+ # - Block: performs all symbol transformations.
189
+ def parse_emoticons(text)
190
+ text.gsub(emoticon_regex) do |match|
191
+ if emoticons.has_key?($1)
192
+ emoji = Emoji.find_by_alias(emoticons[$1].to_s)
193
+ block_given? && emoji ? yield(emoji) : match
194
+ else
195
+ match
196
+ end
197
+ end
198
+ end
199
+
200
+ # Parses all emoji unicode, tokens, and emoticons within a string.
201
+ # - Block: performs all symbol transformations.
202
+ # - Options: unicode:boolean, tokens:boolean, emoticons:boolean
203
+ def parse(text, opts={})
204
+ opts = { unicode: true, tokens: true, emoticons: true }.merge(opts)
205
+ if opts.one?
206
+ return parse_unicode(text) { |e| yield e } if opts[:unicode]
207
+ return parse_tokens(text) { |e| yield e } if opts[:tokens]
208
+ return parse_emoticons(text) { |e| yield e } if opts[:emoticons]
209
+ end
210
+ text.gsub(macro_regex(opts)) do |match|
211
+ a = defined?($1) ? $1 : nil
212
+ b = defined?($2) ? $2 : nil
213
+ emoji = find(a || b)
214
+ block_given? && emoji ? yield(emoji) : match
215
+ end
47
216
  end
48
217
 
49
218
  # Transforms all unicode emoji into token strings.
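Review note: the emoticon handling in this hunk combines two ideas, expanding each pattern into a "noseless" variant and matching only when the emoticon is delimited by whitespace or a line edge. The following is a simplified, self-contained sketch of both under stated assumptions: `BASE`, `expand`, and `boundary_regex` are hypothetical names, the table is a small subset, and the regex is a plain alternation rather than the compacted `-?` form the diff generates.

```ruby
# Illustrative subset of a base set; each name maps to one or more patterns.
BASE = {
  blush: ":-)",
  stuck_out_tongue: [":-p", ":-P"],
  monkey_face: ":o)"
}.freeze

# Flatten into { pattern => name }, adding a noseless twin for each pattern.
def expand(base)
  base.each_with_object({}) do |(name, patterns), table|
    Array(patterns).each do |pattern|
      table[pattern] = name
      # Drop a "-" nose only when it directly follows the eyes (: ; or =).
      table[pattern.sub(/(?<=[:;=])-/, "")] = name
    end
  end
end

# Match any known emoticon, but only between whitespace or line boundaries.
# Longer patterns are listed first so they win over shorter ones.
def boundary_regex(table)
  body = table.keys.sort_by { |key| -key.length }
              .map { |key| Regexp.escape(key) }
              .join("|")
  /(?<=^|\s)(?:#{body})(?=\s|$)/
end

table = expand(BASE)
puts "Test :) (:o) :-P".gsub(boundary_regex(table), "X")
```

The boundary assertions are what keep `(:o)` untouched here, in the same way the spec in this release expects `(:cold_sweat:)` and trailing punctuation to be left alone.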
@@ -56,8 +225,67 @@ module EmojiParser
56
225
  parse_tokens(text) { |emoji| emoji.raw }
57
226
  end
58
227
 
59
- # Generates a custom emoji file path.
60
- def filepath(emoji, path='/')
61
- [path.sub(/\/$/, ''), emoji.image_filename.split('/').pop].join('/')
228
+ # Finds an Emoji::Character instance for an unknown symbol type.
229
+ # - symbol: an <Emoji::Character>, or a unicode/token/emoticon string.
230
+ def find(symbol)
231
+ return symbol if (symbol.is_a?(Emoji::Character))
232
+ symbol = emoticons[symbol].to_s if emoticons.has_key?(symbol)
233
+ Emoji.find_by_alias(symbol) || Emoji.find_by_unicode(symbol) || nil
234
+ end
235
+
236
+ # Gets the image file reference for a symbol; optionally with a custom path.
237
+ # - symbol: an <Emoji::Character>, or a unicode/token/emoticon string.
238
+ # - path: a file path to sub into symbol's filename.
239
+ def image_path(symbol, path=nil)
240
+ emoji = find(symbol)
241
+ return nil unless emoji
242
+ return emoji.image_filename unless path
243
+ "#{ path.sub(/\/$/, '') }/#{ emoji.image_filename.split('/').pop }"
244
+ end
245
+
246
+ private
247
+
248
+ # Compiles an optimized unicode pattern for fast matching.
249
+ # Matchers use as small a base as possible, with added options. Ex:
250
+ # 1-char base \w option: \u{1f6a9}\u{fe0f}?
251
+ # 2-char base \w option: \u{1f1ef}\u{1f1f5}\u{fe0f}?
252
+ # 1-char base \w options: \u{0031}(?:\u{fe0f}\u{20e3}|\u{20e3}\u{fe0f})?
253
+ def unicode_matcher(patterns)
254
+ return patterns.first if patterns.length == 1
255
+
256
+ # Sort patterns, longest to shortest:
257
+ patterns.sort! { |a, b| b.length - a.length }
258
+
259
+ # Select a base pattern:
260
+ # this is the shortest prefix contained by all patterns.
261
+ base = patterns.last
262
+
263
+ if patterns.all? { |p| p.start_with?(base) }
264
+ base = patterns.pop
265
+ else
266
+ base = base.match(/\\u\{.+?\}/).to_s
267
+ base = nil unless patterns.all? { |p| p.start_with?(base) }
268
+ end
269
+
270
+ # Collect base options and/or alternate patterns:
271
+ opts = []
272
+ alts = []
273
+ patterns.each do |pattern|
274
+ if base && pattern.start_with?(base)
275
+ opts << pattern.sub(base, '')
276
+ else
277
+ alts << pattern
278
+ end
279
+ end
280
+
281
+ # Format base options:
282
+ if opts.length == 1
283
+ base += "#{ opts.first }?"
284
+ elsif opts.length > 1
285
+ base += "(?:#{ opts.join('|') })?"
286
+ end
287
+
288
+ alts << base if base
289
+ alts.join('|')
62
290
  end
63
291
  end
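Review note: the private `unicode_matcher` folds the unicode aliases of one emoji into a shared base with optional suffixes. Below is a reduced sketch of that idea only; `escape_codepoints` and `fold` are hypothetical helpers written for this note, they cover just the case where the shortest alias is a prefix of all others, and they always wrap the suffixes in a group so the trailing `?` applies to the whole suffix.

```ruby
# Escape a string as a run of \u{...} codepoint literals.
def escape_codepoints(str)
  str.codepoints.map { |c| '\u{%s}' % c.to_s(16).rjust(4, "0") }.join
end

# Fold aliases that share the shortest alias as a prefix into
# "base(?:suffix|suffix)?"; otherwise fall back to a plain alternation.
def fold(aliases)
  patterns = aliases.uniq.map { |a| escape_codepoints(a) }
                    .sort_by { |p| -p.length }
  return patterns.first if patterns.length == 1

  base = patterns.last
  return patterns.join("|") unless patterns.all? { |p| p.start_with?(base) }

  suffixes = patterns[0...-1].map { |p| p[base.length..-1] }
  "#{base}(?:#{suffixes.join('|')})?"
end

flag = fold(["\u{1f6a9}", "\u{1f6a9}\u{fe0f}"])
puts flag
puts Regexp.new(flag).match?("\u{1f6a9}\u{fe0f}")
```

Folding keeps the alternation short: one branch per emoji instead of one per alias, which is presumably where the speed-up noted in the diff's comment comes from.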
data/lib/gemoji-parser/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module EmojiParser
2
- VERSION = "1.0.0"
2
+ VERSION = "1.1.0"
3
3
  end
data/spec/emoji_parser_spec.rb ADDED
@@ -0,0 +1,222 @@
1
+ # coding: utf-8
2
+ require 'gemoji-parser'
3
+
4
+ describe EmojiParser do
5
+ let(:test_unicode) { 'Test 🙈 🙊 🙉 😰 :invalid: 🐠. :o)' }
6
+ let(:test_mixed) { 'Test 🙈 🙊 🙉 :cold_sweat: :invalid: :tropical_fish:. :o)' }
7
+ let(:test_tokens) { 'Test :see_no_evil: :speak_no_evil: :hear_no_evil: :cold_sweat: :invalid: :tropical_fish:. :o)' }
8
+ let(:test_emoticons) { ';-) Test (:cold_sweat:) :) :-D' }
9
+ let(:test_custom) { Emoji.create('custom') }
10
+
11
+ describe '#emoticons' do
12
+ it 'should provide a hash with emoticons and their tokens as key/value pairs.' do
13
+ expect(EmojiParser.emoticons[':o)']).to eq :monkey_face
14
+ end
15
+ end
16
+
17
+ describe '#unicode_regex' do
18
+ it 'generates once and remains cached.' do
19
+ first = EmojiParser.unicode_regex
20
+ second = EmojiParser.unicode_regex
21
+ expect(first).to be second
22
+ end
23
+
24
+ it 'regenerates when called with a :rehash option.' do
25
+ first = EmojiParser.unicode_regex
26
+ second = EmojiParser.unicode_regex(rehash: true)
27
+ expect(first).not_to be second
28
+ end
29
+ end
30
+
31
+ describe '#token_regex' do
32
+ it 'generates once and remains cached.' do
33
+ first = EmojiParser.token_regex
34
+ second = EmojiParser.token_regex
35
+ expect(first).to be second
36
+ end
37
+ end
38
+
39
+ describe '#emoticon_regex' do
40
+ it 'generates once and remains cached.' do
41
+ first = EmojiParser.emoticon_regex
42
+ second = EmojiParser.emoticon_regex
43
+ expect(first).to be second
44
+ end
45
+
46
+ it 'regenerates when called with a :rehash option.' do
47
+ first = EmojiParser.emoticon_regex
48
+ second = EmojiParser.emoticon_regex(rehash: true)
49
+ expect(first).not_to be second
50
+ end
51
+ end
52
+
53
+ describe '#parse_unicode' do
54
+ it 'successfully parses full Gemoji unicode set.' do
55
+ Emoji.all.each do |emoji|
56
+ emoji.unicode_aliases.each do |u|
57
+ parsed = EmojiParser.parse_unicode("Test #{u}") { |e| 'X' }
58
+ expect(parsed).to eq "Test X"
59
+ end
60
+ end
61
+ end
62
+
63
+ it 'replaces all valid unicode symbols via block transformation.' do
64
+ parsed = EmojiParser.parse_unicode(test_mixed) { |e| 'X' }
65
+ expect(parsed).to eq 'Test X X X :cold_sweat: :invalid: :tropical_fish:. :o)'
66
+ end
67
+ end
68
+
69
+ describe '#parse_tokens' do
70
+ it 'successfully parses full Gemoji name set.' do
71
+ Emoji.all.each do |emoji|
72
+ parsed = EmojiParser.parse_tokens("Test :#{emoji.name}:") { |e| 'X' }
73
+ expect(parsed).to eq "Test X"
74
+ end
75
+ end
76
+
77
+ it 'replaces all valid token symbols via block transformation.' do
78
+ parsed = EmojiParser.parse_tokens(test_tokens) { |e| 'X' }
79
+ expect(parsed).to eq 'Test X X X X :invalid: X. :o)'
80
+ end
81
+ end
82
+
83
+ describe '#parse_emoticons' do
84
+ it 'successfully parses full default emoticon set.' do
85
+ EmojiParser.emoticons.each_key do |emoticon|
86
+ parsed = EmojiParser.parse_emoticons("Test #{emoticon}") { |e| 'X' }
87
+ expect(parsed).to eq "Test X"
88
+ end
89
+ end
90
+
91
+ it 'replaces all valid emoticon symbols via block transformation.' do
92
+ parsed = EmojiParser.parse_emoticons(test_emoticons) { |e| 'X' }
93
+ expect(parsed).to eq 'X Test (:cold_sweat:) X X'
94
+ end
95
+ end
96
+
97
+ describe '#parse' do
98
+ it 'replaces valid symbols of all types via block transformation.' do
99
+ parsed = EmojiParser.parse(test_mixed) { |e| 'X' }
100
+ expect(parsed).to eq 'Test X X X X :invalid: X. X'
101
+ end
102
+
103
+ it 'replaces valid symbols of specified types (unicode, tokens).' do
104
+ parsed = EmojiParser.parse(test_mixed, emoticons: false) { |e| 'X' }
105
+ expect(parsed).to eq 'Test X X X X :invalid: X. :o)'
106
+ end
107
+
108
+ it 'replaces valid symbols of specified types (unicode, emoticons).' do
109
+ parsed = EmojiParser.parse(test_mixed, tokens: false) { |e| 'X' }
110
+ expect(parsed).to eq 'Test X X X :cold_sweat: :invalid: :tropical_fish:. X'
111
+ end
112
+
113
+ it 'replaces valid symbols of specified types (tokens, emoticons).' do
114
+ parsed = EmojiParser.parse(test_mixed, unicode: false) { |e| 'X' }
115
+ expect(parsed).to eq 'Test 🙈 🙊 🙉 X :invalid: X. X'
116
+ end
117
+
118
+ it 'allows symbols to safely insert other symbol types without getting re-parsed.' do
119
+ parsed = EmojiParser.parse('🙈 🙊 :hear_no_evil:') { |e| ":#{e.name}:" }
120
+ expect(parsed).to eq ':see_no_evil: :speak_no_evil: :hear_no_evil:'
121
+ end
122
+ end
123
+
124
+ describe '#tokenize' do
125
+ it 'successfully tokenizes full Gemoji unicode set.' do
126
+ Emoji.all.each do |emoji|
127
+ emoji.unicode_aliases.each do |u|
128
+ tokenized = EmojiParser.tokenize("Test #{u}")
129
+ expect(tokenized).to eq "Test :#{emoji.name}:"
130
+ end
131
+ end
132
+ end
133
+
134
+ it 'replaces all valid emoji unicode with their token equivalent.' do
135
+ tokenized = EmojiParser.tokenize(test_mixed)
136
+ expect(tokenized).to eq test_tokens
137
+ end
138
+ end
139
+
140
+ describe '#detokenize' do
141
+ it 'replaces all valid emoji tokens with their raw unicode equivalent.' do
142
+ tokenized = EmojiParser.detokenize(test_mixed)
143
+ expect(tokenized).to eq test_unicode
144
+ end
145
+ end
146
+
147
+ describe '#find' do
148
+ let (:the_unicode) { '🐵' }
149
+ let (:the_token) { 'monkey_face' }
150
+ let (:the_emoticon) { ':o)' }
151
+ let (:the_emoji) { Emoji.find_by_alias(the_token) }
152
+
153
+ it 'returns valid emoji characters.' do
154
+ expect(EmojiParser.find(the_emoji)).to eq the_emoji
155
+ end
156
+
157
+ it 'finds the proper emoji character for a unicode symbol.' do
158
+ expect(EmojiParser.find(the_unicode)).to eq the_emoji
159
+ end
160
+
161
+ it 'finds the proper emoji character for a token symbol.' do
162
+ expect(EmojiParser.find(the_token)).to eq the_emoji
163
+ end
164
+
165
+ it 'finds the proper emoji character for an emoticon symbol.' do
166
+ expect(EmojiParser.find(the_emoticon)).to eq the_emoji
167
+ end
168
+ end
169
+
170
+ describe '#image_path' do
171
+ let (:the_emoji) { Emoji.find_by_alias('smiley') }
172
+ let (:the_image) { '1f603.png' }
173
+
174
+ it 'gets the image filename by emoji character.' do
175
+ path = EmojiParser.image_path(the_emoji)
176
+ expect(path).to eq the_emoji.image_filename
177
+ end
178
+
179
+ it 'gets the image filename by unicode symbol.' do
180
+ path = EmojiParser.image_path(the_emoji.raw)
181
+ expect(path).to eq the_emoji.image_filename
182
+ end
183
+
184
+ it 'gets the image filename by token symbol.' do
185
+ path = EmojiParser.image_path(the_emoji.name)
186
+ expect(path).to eq the_emoji.image_filename
187
+ end
188
+
189
+ it 'gets the image filename by emoticon symbol.' do
190
+ path = EmojiParser.image_path('=)')
191
+ expect(path).to eq the_emoji.image_filename
192
+ end
193
+
194
+ it 'formats a Gemoji image path as a custom location (with trailing slash).' do
195
+ custom_path = '//fonts.test.com/emoji/'
196
+ path = EmojiParser.image_path(the_emoji, custom_path)
197
+ expect(path).to eq "#{ custom_path }#{ the_image }"
198
+ end
199
+
200
+ it 'formats a Gemoji image path to a custom location (no trailing slash).' do
201
+ custom_path = '//fonts.test.com/emoji'
202
+ path = EmojiParser.image_path(the_emoji, custom_path)
203
+ expect(path).to eq "#{ custom_path }/#{ the_image }"
204
+ end
205
+ end
206
+
207
+ describe 'custom emoji' do
208
+ it 'replaces tokens for custom Emoji.' do
209
+ Emoji.create('boxing_kangaroo')
210
+ parsed = EmojiParser.parse_tokens('Test :boxing_kangaroo:') { |e| 'X' }
211
+ expect(parsed).to eq 'Test X'
212
+ end
213
+
214
+ it 'replaces custom emoticons (requires rehashing the regex).' do
215
+ EmojiParser.emoticons['¯\\(°_o)/¯'] = :confused
216
+ EmojiParser.emoticon_regex(rehash: true)
217
+
218
+ parsed = EmojiParser.parse_emoticons('Test ¯\\(°_o)/¯') { |e| e.name }
219
+ expect(parsed).to eq 'Test confused'
220
+ end
221
+ end
222
+ end
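Review note: the caching specs above (`generates once and remains cached`, `regenerates when called with a :rehash option`) describe a memoize-until-rehash pattern. A minimal sketch of that pattern, independent of the gem, might look like the following; `CachedPattern`, `words`, and `regex` are hypothetical names, and `Regexp.union` stands in for the gem's own pattern builder.

```ruby
# Hypothetical module showing a memoized regex with explicit invalidation.
module CachedPattern
  extend self

  # Mutable word list that callers may extend at runtime.
  def words
    @words ||= %w[foo bar]
  end

  # Build the regex once; rebuild only when asked to rehash.
  def regex(rehash: false)
    return @regex if defined?(@regex) && !rehash
    @regex = Regexp.union(words)
  end

  # Convenience wrapper, mirroring the role of a rehash! call.
  def rehash!
    regex(rehash: true)
  end
end

first = CachedPattern.regex
CachedPattern.words << "baz"
puts CachedPattern.regex.equal?(first)   # cached object is reused
puts CachedPattern.regex.match?("baz")   # stale until rehashed
CachedPattern.rehash!
puts CachedPattern.regex.match?("baz")
```

The trade-off is the one the README spells out: lookups stay cheap, but callers must remember to rehash after changing the underlying symbol set.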
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: gemoji-parser
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Greg MacWilliam
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-03-19 00:00:00.000000000 Z
11
+ date: 2015-03-21 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: gemoji
@@ -66,8 +66,8 @@ dependencies:
66
66
  - - ">="
67
67
  - !ruby/object:Gem::Version
68
68
  version: '0'
69
- description: Parses emoji unicode symbols and string tokens, allowing for customizable
70
- transformations.
69
+ description: Expands GitHub Gemoji to parse unicode and token emoji symbols into custom
70
+ formats.
71
71
  email:
72
72
  - greg.macwilliam@voxmedia.com
73
73
  executables: []
@@ -82,7 +82,7 @@ files:
82
82
  - gemoji-parser.gemspec
83
83
  - lib/gemoji-parser.rb
84
84
  - lib/gemoji-parser/version.rb
85
- - spec/emoji_helper_spec.rb
85
+ - spec/emoji_parser_spec.rb
86
86
  homepage: https://github.com/gmac/gemoji-parser
87
87
  licenses:
88
88
  - MIT
@@ -108,4 +108,4 @@ signing_key:
108
108
  specification_version: 4
109
109
  summary: The missing helper methods for GitHub's Gemoji gem.
110
110
  test_files:
111
- - spec/emoji_helper_spec.rb
111
+ - spec/emoji_parser_spec.rb
data/spec/emoji_helper_spec.rb REMOVED
@@ -1,88 +0,0 @@
1
- # coding: utf-8
2
- require 'gemoji-parser'
3
-
4
- describe EmojiParser do
5
- let(:test_unicode) { 'Test 🙈 🙊 🙉 😰 :invalid: 🐠.' }
6
- let(:test_mixed) { 'Test 🙈 🙊 🙉 :cold_sweat: :invalid: :tropical_fish:.' }
7
- let(:test_tokens) { 'Test :see_no_evil: :speak_no_evil: :hear_no_evil: :cold_sweat: :invalid: :tropical_fish:.' }
8
-
9
- describe '#emoji_regexp' do
10
- it 'generates once and remains cached.' do
11
- first = EmojiParser.emoji_regexp
12
- second = EmojiParser.emoji_regexp
13
- expect(first).to be second
14
- end
15
-
16
- it 'regenerates when called with a :rehash option.' do
17
- first = EmojiParser.emoji_regexp
18
- second = EmojiParser.emoji_regexp(rehash: true)
19
- expect(first).not_to be second
20
- end
21
- end
22
-
23
- describe '#parse_unicode' do
24
- it 'replaces all valid emoji unicode via block transformation.' do
25
- parsed = EmojiParser.parse_unicode(test_mixed) { |emoji| 'X' }
26
- expect(parsed).to eq "Test X X X :cold_sweat: :invalid: :tropical_fish:."
27
- end
28
- end
29
-
30
- describe '#parse_tokens' do
31
- it 'replaces all valid emoji tokens via block transformation.' do
32
- parsed = EmojiParser.parse_tokens(test_tokens) { |emoji| 'X' }
33
- expect(parsed).to eq "Test X X X X :invalid: X."
34
- end
35
- end
36
-
37
- describe '#parse_all' do
38
- it 'replaces all valid emoji unicode and tokens via block transformation.' do
39
- parsed = EmojiParser.parse_all(test_mixed) { |emoji| 'X' }
40
- expect(parsed).to eq "Test X X X X :invalid: X."
41
- end
42
- end
43
-
44
- describe '#tokenize' do
45
- it 'successfully tokenizes all Gemoji unicode aliases.' do
46
- Emoji.all.each do |emoji|
47
- emoji.unicode_aliases.each do |u|
48
- tokenized = EmojiParser.tokenize("Test #{u}")
49
- expect(tokenized).to eq "Test :#{emoji.name}:"
50
- end
51
- end
52
- end
53
-
54
- it 'replaces all valid emoji unicodes with their token equivalent.' do
55
- tokenized = EmojiParser.tokenize(test_mixed)
56
- expect(tokenized).to eq test_tokens
57
- end
58
- end
59
-
60
- describe '#detokenize' do
61
- it 'replaces all valid emoji tokens with their raw unicode equivalent.' do
62
- tokenized = EmojiParser.detokenize(test_mixed)
63
- expect(tokenized).to eq test_unicode
64
- end
65
- end
66
-
67
- describe '#filepath' do
68
- let (:test_emoji) { Emoji.find_by_alias('de') }
69
- let (:test_file) { '1f1e9-1f1ea.png' }
70
-
71
- it 'formats a Gemoji image path as a root location by default.' do
72
- path = EmojiParser.filepath(test_emoji)
73
- expect(path).to eq "/#{test_file}"
74
- end
75
-
76
- it 'formats a Gemoji image path as a custom location (with trailing slash).' do
77
- images_path = '//fonts.test.com/emoji/'
78
- path = EmojiParser.filepath(test_emoji, images_path)
79
- expect(path).to eq "#{images_path}#{test_file}"
80
- end
81
-
82
- it 'formats a Gemoji image path to a custom location (no trailing slash).' do
83
- images_path = '//fonts.test.com/emoji'
84
- path = EmojiParser.filepath(test_emoji, images_path)
85
- expect(path).to eq "#{images_path}/#{test_file}"
86
- end
87
- end
88
- end