unicode-scripts 1.0.0

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 80faaef79a195000de4ade1f5b6c371f1234a1ba
+   data.tar.gz: aa006391634143063c9132a3e5f6beff03c1a6ef
+ SHA512:
+   metadata.gz: 92487c1b6d5a87383c6ddea8455062231c45c99a7f2b5e711120f9e92b522a3d5688500d466c862d7a701b9aabd78b222f58de255003544bf9792eb97907c269
+   data.tar.gz: b990f3ec80f1388575912f07bcccea287d9a867bf9d5755ce382633a37ca0c536a627d3f840ccb6dccdbd7797f5d473c4761fe6cf26631d2d315740e6bf25993
@@ -0,0 +1,2 @@
+ Gemfile.lock
+ /pkg
@@ -0,0 +1,21 @@
+ sudo: false
+ language: ruby
+
+ script: bundle exec ruby spec/unicode_scripts_spec.rb
+
+ rvm:
+   - 2.3.0
+   - 2.2
+   - 2.1
+   - ruby-head
+   - rbx-2
+   - jruby-head
+   - jruby-9.0.5.0
+
+ cache:
+   - bundler
+
+ matrix:
+   allow_failures:
+     - rvm: jruby-head
+     - rvm: rbx-2
@@ -0,0 +1,6 @@
+ ## CHANGELOG
+
+ ### 1.0.0
+
+ * Initial release
+
@@ -0,0 +1,74 @@
+ # Contributor Covenant Code of Conduct
+
+ ## Our Pledge
+
+ In the interest of fostering an open and welcoming environment, we as
+ contributors and maintainers pledge to making participation in our project and
+ our community a harassment-free experience for everyone, regardless of age, body
+ size, disability, ethnicity, gender identity and expression, level of experience,
+ nationality, personal appearance, race, religion, or sexual identity and
+ orientation.
+
+ ## Our Standards
+
+ Examples of behavior that contributes to creating a positive environment
+ include:
+
+ * Using welcoming and inclusive language
+ * Being respectful of differing viewpoints and experiences
+ * Gracefully accepting constructive criticism
+ * Focusing on what is best for the community
+ * Showing empathy towards other community members
+
+ Examples of unacceptable behavior by participants include:
+
+ * The use of sexualized language or imagery and unwelcome sexual attention or
+   advances
+ * Trolling, insulting/derogatory comments, and personal or political attacks
+ * Public or private harassment
+ * Publishing others' private information, such as a physical or electronic
+   address, without explicit permission
+ * Other conduct which could reasonably be considered inappropriate in a
+   professional setting
+
+ ## Our Responsibilities
+
+ Project maintainers are responsible for clarifying the standards of acceptable
+ behavior and are expected to take appropriate and fair corrective action in
+ response to any instances of unacceptable behavior.
+
+ Project maintainers have the right and responsibility to remove, edit, or
+ reject comments, commits, code, wiki edits, issues, and other contributions
+ that are not aligned to this Code of Conduct, or to ban temporarily or
+ permanently any contributor for other behaviors that they deem inappropriate,
+ threatening, offensive, or harmful.
+
+ ## Scope
+
+ This Code of Conduct applies both within project spaces and in public spaces
+ when an individual is representing the project or its community. Examples of
+ representing a project or community include using an official project e-mail
+ address, posting via an official social media account, or acting as an appointed
+ representative at an online or offline event. Representation of a project may be
+ further defined and clarified by project maintainers.
+
+ ## Enforcement
+
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
+ reported by contacting the project team at opensource@janlelis.com. All
+ complaints will be reviewed and investigated and will result in a response that
+ is deemed necessary and appropriate to the circumstances. The project team is
+ obligated to maintain confidentiality with regard to the reporter of an incident.
+ Further details of specific enforcement policies may be posted separately.
+
+ Project maintainers who do not follow or enforce the Code of Conduct in good
+ faith may face temporary or permanent repercussions as determined by other
+ members of the project's leadership.
+
+ ## Attribution
+
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+ available at [http://contributor-covenant.org/version/1/4][version]
+
+ [homepage]: http://contributor-covenant.org
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,5 @@
+ source 'https://rubygems.org'
+
+ gemspec
+
+ gem 'minitest'
@@ -0,0 +1,20 @@
+ Copyright (c) 2016 Jan Lelis, mail@janlelis.de
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,340 @@
+ # Unicode::Scripts [![[version]](https://badge.fury.io/rb/unicode-scripts.svg)](http://badge.fury.io/rb/unicode-scripts) [![[travis]](https://travis-ci.org/janlelis/unicode-scripts.png)](https://travis-ci.org/janlelis/unicode-scripts)
+
+ Retrieve the [Unicode script(s)](https://en.wikipedia.org/wiki/Script_%28Unicode%29) a string belongs to. It can also return the *Script_Extension* property, which is defined for characters that are "commonly used with more than one script, but with a limited number of scripts".
+
+ Unicode version: **8.0.0**
+
+ Supported Rubies: **2.3**, **2.2**, **2.1**
+
+ ## Gemfile
+
+ ```ruby
+ gem "unicode-scripts"
+ ```
+
+ ## Usage
+
+ ```ruby
+ require "unicode/scripts"
+
+ Unicode::Scripts.scripts("СC") # => ["Cyrillic", "Latin"]
+
+ # 4 letter script aliases
+ Unicode::Scripts.scripts("СC", format: :short) # => ["Cyrl", "Latn"]
+
+ # Single character
+ Unicode::Scripts.script("ᴦ") # => "Greek"
+
+ # Script_Extension property
+ Unicode::Scripts.script_extensions("॥") # => ["Bengali", "Devanagari", "Grantha", "Gujarati",
+                                         #     "Gurmukhi", "Kannada", "Khudawadi", "Limbu",
+                                         #     "Mahajani", "Malayalam", "Oriya", "Sinhala",
+                                         #     "Syloti_Nagri", "Takri", "Tamil", "Telugu",
+                                         #     "Tirhuta"]
+ ```
+
+ ## Hints
+ ### Regex Matching
+
+ If you have a string and want to match a substring/character from a specific Unicode script, you actually won't need this gem. Instead, you can use the [Regexp Unicode Property Syntax `\p{}`](http://ruby-doc.org/core-2.3.0/Regexp.html#class-Regexp-label-Character+Properties):
+
+ ```ruby
+ "Coptic letter: ⲁ".scan(/\p{Coptic}/) # => ["ⲁ"]
+ ```
+
+ ### Script Names
+
+ You can extract all script names from the gem like this:
+
+ ```ruby
+ require "unicode/scripts"
+ puts Unicode::Scripts.names
+
+ # # # Output # # #
+
+ Caucasian_Albanian
+ Ahom
+ Arabic
+ Imperial_Aramaic
+ Armenian
+ Avestan
+ Balinese
+ Bamum
+ Bassa_Vah
+ Batak
+ Bengali
+ Bopomofo
+ Brahmi
+ Braille
+ Buginese
+ Buhid
+ Chakma
+ Canadian_Aboriginal
+ Carian
+ Cham
+ Cherokee
+ Coptic
+ Cypriot
+ Cyrillic
+ Devanagari
+ Deseret
+ Duployan
+ Egyptian_Hieroglyphs
+ Elbasan
+ Ethiopic
+ Georgian
+ Glagolitic
+ Gothic
+ Grantha
+ Greek
+ Gujarati
+ Gurmukhi
+ Hangul
+ Han
+ Hanunoo
+ Hatran
+ Hebrew
+ Hiragana
+ Anatolian_Hieroglyphs
+ Pahawh_Hmong
+ Katakana_Or_Hiragana
+ Old_Hungarian
+ Old_Italic
+ Javanese
+ Kayah_Li
+ Katakana
+ Kharoshthi
+ Khmer
+ Khojki
+ Kannada
+ Kaithi
+ Tai_Tham
+ Lao
+ Latin
+ Lepcha
+ Limbu
+ Linear_A
+ Linear_B
+ Lisu
+ Lycian
+ Lydian
+ Mahajani
+ Mandaic
+ Manichaean
+ Mende_Kikakui
+ Meroitic_Cursive
+ Meroitic_Hieroglyphs
+ Malayalam
+ Modi
+ Mongolian
+ Mro
+ Meetei_Mayek
+ Multani
+ Myanmar
+ Old_North_Arabian
+ Nabataean
+ Nko
+ Ogham
+ Ol_Chiki
+ Old_Turkic
+ Oriya
+ Osmanya
+ Palmyrene
+ Pau_Cin_Hau
+ Old_Permic
+ Phags_Pa
+ Inscriptional_Pahlavi
+ Psalter_Pahlavi
+ Phoenician
+ Miao
+ Inscriptional_Parthian
+ Rejang
+ Runic
+ Samaritan
+ Old_South_Arabian
+ Saurashtra
+ SignWriting
+ Shavian
+ Sharada
+ Siddham
+ Khudawadi
+ Sinhala
+ Sora_Sompeng
+ Sundanese
+ Syloti_Nagri
+ Syriac
+ Tagbanwa
+ Takri
+ Tai_Le
+ New_Tai_Lue
+ Tamil
+ Tai_Viet
+ Telugu
+ Tifinagh
+ Tagalog
+ Thaana
+ Thai
+ Tibetan
+ Tirhuta
+ Ugaritic
+ Vai
+ Warang_Citi
+ Old_Persian
+ Cuneiform
+ Yi
+ Inherited
+ Common
+ Unknown
+ ```
+
+ ### Script Extension Names
+
+ You can extract all script extension names from the gem like this:
+
+ ```ruby
+ require "unicode/scripts"
+ puts Unicode::Scripts.names(format: :short)
+
+ # # # Output # # #
+
+ Aghb
+ Ahom
+ Arab
+ Armi
+ Armn
+ Avst
+ Bali
+ Bamu
+ Bass
+ Batk
+ Beng
+ Bopo
+ Brah
+ Brai
+ Bugi
+ Buhd
+ Cakm
+ Cans
+ Cari
+ Cham
+ Cher
+ Copt
+ Qaac
+ Cprt
+ Cyrl
+ Deva
+ Dsrt
+ Dupl
+ Egyp
+ Elba
+ Ethi
+ Geor
+ Glag
+ Goth
+ Gran
+ Grek
+ Gujr
+ Guru
+ Hang
+ Hani
+ Hano
+ Hatr
+ Hebr
+ Hira
+ Hluw
+ Hmng
+ Hrkt
+ Hung
+ Ital
+ Java
+ Kali
+ Kana
+ Khar
+ Khmr
+ Khoj
+ Knda
+ Kthi
+ Lana
+ Laoo
+ Latn
+ Lepc
+ Limb
+ Lina
+ Linb
+ Lisu
+ Lyci
+ Lydi
+ Mahj
+ Mand
+ Mani
+ Mend
+ Merc
+ Mero
+ Mlym
+ Modi
+ Mong
+ Mroo
+ Mtei
+ Mult
+ Mymr
+ Narb
+ Nbat
+ Nkoo
+ Ogam
+ Olck
+ Orkh
+ Orya
+ Osma
+ Palm
+ Pauc
+ Perm
+ Phag
+ Phli
+ Phlp
+ Phnx
+ Plrd
+ Prti
+ Rjng
+ Runr
+ Samr
+ Sarb
+ Saur
+ Sgnw
+ Shaw
+ Shrd
+ Sidd
+ Sind
+ Sinh
+ Sora
+ Sund
+ Sylo
+ Syrc
+ Tagb
+ Takr
+ Tale
+ Talu
+ Taml
+ Tavt
+ Telu
+ Tfng
+ Tglg
+ Thaa
+ Thai
+ Tibt
+ Tirh
+ Ugar
+ Vaii
+ Wara
+ Xpeo
+ Xsux
+ Yiii
+ Zinh
+ Qaai
+ Zyyy
+ Zzzz
+ ```
+
+ ## MIT License
+
+ - Copyright (C) 2016 Jan Lelis <http://janlelis.com>. Released under the MIT license.
+ - Unicode data: http://www.unicode.org/copyright.html#Exhibit1
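
The README's Regex Matching hint can be tried without installing the gem. The sketch below is an illustration only and not part of the release: `CANDIDATE_SCRIPTS` and `scripts_via_regexp` are hypothetical names, and unlike `Unicode::Scripts.scripts` it checks only a hand-picked list of scripts, using whatever Unicode data the running Ruby ships with.

```ruby
# Hypothetical helper (not part of unicode-scripts): tests a fixed list of
# candidate scripts using Ruby's built-in \p{...} property syntax.
CANDIDATE_SCRIPTS = %w[Cyrillic Greek Latin].freeze

def scripts_via_regexp(string, candidates = CANDIDATE_SCRIPTS)
  # Keep every candidate whose \p{Name} pattern matches somewhere in the string.
  candidates.select { |name| string =~ Regexp.new("\\p{#{name}}") }
end

# "\u0421" is CYRILLIC CAPITAL LETTER ES, which looks like a Latin "C".
p scripts_via_regexp("\u0421C") # => ["Cyrillic", "Latin"]
p scripts_via_regexp("abc")     # => ["Latin"]
```

This approach works when the set of scripts of interest is known up front; the gem's index is the better fit when any script may occur.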
@@ -0,0 +1,37 @@
+ # # #
+ # Get gemspec info
+
+ gemspec_file = Dir['*.gemspec'].first
+ gemspec = eval File.read(gemspec_file), binding, gemspec_file
+ info = "#{gemspec.name} | #{gemspec.version} | " \
+        "#{gemspec.runtime_dependencies.size} dependencies | " \
+        "#{gemspec.files.size} files"
+
+ # # #
+ # Gem build and install task
+
+ desc info
+ task :gem do
+   puts info + "\n\n"
+   print " "; sh "gem build #{gemspec_file}"
+   FileUtils.mkdir_p 'pkg'
+   FileUtils.mv "#{gemspec.name}-#{gemspec.version}.gem", 'pkg'
+   puts; sh %{gem install --no-document pkg/#{gemspec.name}-#{gemspec.version}.gem}
+ end
+
+ # # #
+ # Start an IRB session with the gem loaded
+
+ desc "#{gemspec.name} | IRB"
+ task :irb do
+   sh "irb -I ./lib -r #{gemspec.name.gsub '-','/'}"
+ end
+
+ # # #
+ # Run Specs
+
+ desc "#{gemspec.name} | Spec"
+ task :spec do
+   sh "for file in spec/*.rb; do ruby $file; done"
+ end
+ task default: :spec
Binary file
@@ -0,0 +1,57 @@
+ require_relative "scripts/constants"
+
+ module Unicode
+   module Scripts
+     def self.scripts(string, **options)
+       res = []
+       string.each_char{ |char|
+         script_name = script(char, **options)
+         res << script_name unless res.include?(script_name)
+       }
+       res.sort
+     end
+     class << self; alias of scripts; end
+
+     def self.script(char, format: :long)
+       require_relative 'scripts/index' unless defined? ::Unicode::Scripts::INDEX
+       codepoint_depth_offset = char.unpack("U")[0] or
+         raise(ArgumentError, "Unicode::Scripts.script must be given a valid char")
+       index_or_value = INDEX[:SCRIPTS]
+       [0x10000, 0x1000, 0x100, 0x10].each{ |depth|
+         index_or_value = index_or_value[codepoint_depth_offset / depth]
+         codepoint_depth_offset = codepoint_depth_offset % depth
+         unless index_or_value.is_a? Array
+           res = index_or_value || INDEX[:SCRIPT_ALIASES]["Zzzz"]
+           return format == :long ? INDEX[:SCRIPT_NAMES][res] : INDEX[:SCRIPT_ALIASES].key(res)
+         end
+       }
+
+       res = index_or_value[codepoint_depth_offset] || INDEX[:SCRIPT_ALIASES]["Zzzz"]
+       format == :long ? INDEX[:SCRIPT_NAMES][res] : INDEX[:SCRIPT_ALIASES].key(res)
+     end
+
+     def self.script_extensions(string, format: :long)
+       require_relative 'scripts/index' unless defined? ::Unicode::Scripts::INDEX
+
+       string.each_codepoint.inject([]){ |res, codepoint|
+         if new_scripts = INDEX[:SCRIPT_EXTENSIONS][codepoint]
+           script_extension_names = new_scripts.map{ |new_script|
+             format == :long ? INDEX[:SCRIPT_NAMES][new_script] : INDEX[:SCRIPT_ALIASES].key(new_script)
+           }
+         else
+           script_extension_names = scripts([codepoint].pack("U"), format: format)
+         end
+
+         res | script_extension_names
+       }.sort
+     end
+
+     def self.names(format: :long)
+       require_relative 'scripts/index' unless defined? ::Unicode::Scripts::INDEX
+       format == :long ?
+         INDEX[:SCRIPT_NAMES].sort :
+         INDEX[:SCRIPT_ALIASES].keys.sort
+     end
+   end
+ end
+
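
`Unicode::Scripts.script` above walks a nested array, consuming the codepoint in steps of `0x10000`, `0x1000`, `0x100` and `0x10`, and returns early as soon as it reaches a non-array value. The following toy sketch mirrors that walk on a tiny hand-built index. `TOY_INDEX` and `toy_lookup` are hypothetical names, the data is invented for illustration, and the leaves are plain strings, whereas the gem resolves its leaf values through `INDEX[:SCRIPT_NAMES]` and `INDEX[:SCRIPT_ALIASES]`.

```ruby
# Divisors used to peel off one level of the nested index per step.
DEPTHS = [0x10000, 0x1000, 0x100, 0x10].freeze

# Invented toy data: only U+0040..U+004F and U+0400..U+04FF are populated.
TOY_INDEX = [
  [                                               # plane 0
    [                                             # U+0000..U+0FFF
      [nil, nil, nil, nil, Array.new(16, "Latin")], # U+0000..U+00FF, row 0x40 filled
      nil, nil, nil,
      "Cyrillic"                                  # U+0400..U+04FF collapsed into one leaf
    ]
  ]
].freeze

# Hypothetical lookup: descend one level per depth, stop at the first leaf.
def toy_lookup(index, codepoint)
  node = index
  offset = codepoint
  DEPTHS.each do |depth|
    node = node[offset / depth]
    offset %= depth
    return node unless node.is_a?(Array) # leaf, or nil for unassigned ranges
  end
  node[offset]
end

p toy_lookup(TOY_INDEX, "A".unpack("U")[0]) # => "Latin"
p toy_lookup(TOY_INDEX, 0x0421)             # => "Cyrillic"
p toy_lookup(TOY_INDEX, 0x0031)             # => nil
```

Collapsing a uniform range into a single leaf, as the toy data does for U+0400..U+04FF, is what makes the early return worthwhile: the walk ends as soon as the answer is known.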
@@ -0,0 +1,9 @@
+ module Unicode
+   module Scripts
+     VERSION = "1.0.0".freeze
+     UNICODE_VERSION = "8.0.0".freeze
+     DATA_DIRECTORY = File.expand_path(File.dirname(__FILE__) + '/../../../data/').freeze
+     INDEX_FILENAME = (DATA_DIRECTORY + '/scripts.marshal.gz').freeze
+   end
+ end
+
@@ -0,0 +1,7 @@
+ require_relative 'constants'
+
+ module Unicode
+   module Scripts
+     INDEX = Marshal.load(Gem.gunzip(File.binread(INDEX_FILENAME)))
+   end
+ end
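
The index file is a gzipped `Marshal` dump that is inflated and loaded once. As a rough sketch of that round trip, the helpers below use the standard library's `Zlib` classes instead of `Gem.gunzip`. `pack_index` and `unpack_index` are hypothetical names, and the sample hash only imitates the shape implied by the lookups in `scripts.rb`; how the real data file was generated is not shown in this diff.

```ruby
require "zlib"
require "stringio"

# Hypothetical writer: Marshal the object, then gzip the bytes in memory.
def pack_index(index)
  io = StringIO.new("".b)
  gz = Zlib::GzipWriter.new(io)
  gz.write(Marshal.dump(index))
  gz.close
  io.string
end

# Hypothetical reader: gunzip the bytes, then rebuild the object.
def unpack_index(bytes)
  gz = Zlib::GzipReader.new(StringIO.new(bytes))
  data = gz.read
  gz.close
  Marshal.load(data)
end

# Invented sample data: names indexed by id, aliases mapping to ids.
sample = {
  SCRIPT_NAMES: ["Latin", "Cyrillic"],
  SCRIPT_ALIASES: { "Latn" => 0, "Cyrl" => 1 }
}

bytes = pack_index(sample)
p unpack_index(bytes) == sample # => true
```

`Marshal.load` can instantiate arbitrary objects, so it should only be used on data from a trusted source, such as a file shipped inside the gem itself.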
@@ -0,0 +1,12 @@
+ require_relative "../scripts"
+
+ class String
+   # Optional string extension for your convenience
+   def unicode_scripts
+     Unicode::Scripts.scripts(self)
+   end
+
+   def unicode_script_extensions
+     Unicode::Scripts.script_extensions(self)
+   end
+ end
@@ -0,0 +1,117 @@
+ require_relative "../lib/unicode/scripts"
+ require "minitest/autorun"
+
+ describe Unicode::Scripts do
+   describe ".scripts (alias .of)" do
+     it "will always return an Array" do
+       assert_equal [], Unicode::Scripts.of("")
+     end
+
+     it "will return all scripts that characters in the string belong to" do
+       assert_equal ["Cyrillic", "Latin"], Unicode::Scripts.of("СC")
+     end
+
+     it "will return all scripts in sorted order" do
+       assert_equal ["Cyrillic", "Latin"], Unicode::Scripts.of("СA")
+       assert_equal ["Cyrillic", "Latin"], Unicode::Scripts.of("AС")
+     end
+
+     it "will call .script for every character" do
+       mocked_method = MiniTest::Mock.new
+       mocked_method.expect :call, "first script", ["С", {}]
+       mocked_method.expect :call, "second script", ["A", {}]
+       Unicode::Scripts.stub :script, mocked_method do
+         Unicode::Scripts.of("СA")
+       end
+       mocked_method.verify
+     end
+   end
+
+   describe ".script" do
+     it "will return script for that character" do
+       assert_equal "Greek", Unicode::Scripts.script("ᴦ")
+       assert_equal "Common", Unicode::Scripts.script("�")
+     end
+
+     it "will return Unknown for characters not in any script" do
+       assert_equal "Unknown", Unicode::Scripts.script("\u{10c50}")
+     end
+
+     it "will return 4 letter script codes with format: :short" do
+       assert_equal ["Cyrl", "Latn"], Unicode::Scripts.of("СC", format: :short)
+     end
+   end
+
+   describe ".script_extensions" do
+     it "will always return an Array" do
+       assert_equal [], Unicode::Scripts.script_extensions("")
+     end
+
+     it "will return all extended scripts that characters in the string belong to" do
+       assert_equal [
+         "Bengali",
+         "Devanagari",
+         "Grantha",
+         "Gujarati",
+         "Gurmukhi",
+         "Kannada",
+         "Khudawadi",
+         "Limbu",
+         "Mahajani",
+         "Malayalam",
+         "Oriya",
+         "Sinhala",
+         "Syloti_Nagri",
+         "Takri",
+         "Tamil",
+         "Telugu",
+         "Tirhuta"
+       ], Unicode::Scripts.script_extensions("॥")
+     end
+
+     it "will return 4 letter script codes with format: :short" do
+       assert_equal [
+         "Beng",
+         "Deva",
+         "Gran",
+         "Gujr",
+         "Guru",
+         "Knda",
+         "Limb",
+         "Mahj",
+         "Mlym",
+         "Orya",
+         "Sind",
+         "Sinh",
+         "Sylo",
+         "Takr",
+         "Taml",
+         "Telu",
+         "Tirh"
+       ], Unicode::Scripts.script_extensions("॥", format: :short)
+
+     end
+
+     it "will return all extended scripts in sorted order" do
+       assert_equal ["Cyrillic", "Latin"], Unicode::Scripts.script_extensions("СA")
+       assert_equal ["Cyrillic", "Latin"], Unicode::Scripts.script_extensions("AС")
+     end
+
+     it "will call .scripts for characters that have no explicit script extension" do
+       mocked_method = MiniTest::Mock.new
+       mocked_method.expect :call, ["scripts"], ["A", {format: :long}]
+       Unicode::Scripts.stub :scripts, mocked_method do
+         Unicode::Scripts.script_extensions("A")
+       end
+       mocked_method.verify
+     end
+   end
+
+   describe ".names" do
+     it "will return a list of all script names" do
+       assert_kind_of Array, Unicode::Scripts.names
+       assert_includes Unicode::Scripts.names, "Inscriptional_Parthian"
+     end
+   end
+ end
+
@@ -0,0 +1,21 @@
+ # -*- encoding: utf-8 -*-
+
+ require File.dirname(__FILE__) + "/lib/unicode/scripts/constants"
+
+ Gem::Specification.new do |gem|
+   gem.name = "unicode-scripts"
+   gem.version = Unicode::Scripts::VERSION
+   gem.summary = "Which script(s) does a Unicode string belong to?"
+   gem.description = "[Unicode version: #{Unicode::Scripts::UNICODE_VERSION}] Retrieve the Unicode script(s) a string belongs to. Can also return the Script_Extension property which is defined as characters which are 'commonly used with more than one script, but with a limited number of scripts'. "
+   gem.authors = ["Jan Lelis"]
+   gem.email = ["mail@janlelis.de"]
+   gem.homepage = "https://github.com/janlelis/unicode-scripts"
+   gem.license = "MIT"
+
+   gem.files = Dir["{**/}{.*,*}"].select{ |path| File.file?(path) && path !~ /^pkg/ }
+   gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+   gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
+   gem.require_paths = ["lib"]
+
+   gem.required_ruby_version = "~> 2.0"
+ end
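
The `required_ruby_version = "~> 2.0"` constraint in the gemspec is broader than the README's list of supported Rubies. As a quick way to see what the pessimistic operator accepts, the sketch below evaluates it with RubyGems' own `Gem::Requirement`. The helper name `ruby_allowed?` and the sample version strings are assumptions made for illustration.

```ruby
require "rubygems"

# Hypothetical helper: does a Ruby version satisfy the gemspec constraint?
def ruby_allowed?(version_string, constraint = "~> 2.0")
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version_string))
end

%w[1.9.3 2.0.0 2.3.0 3.0.0].each do |version|
  puts "#{version}: #{ruby_allowed?(version)}"
end
# 1.9.3: false
# 2.0.0: true
# 2.3.0: true
# 3.0.0: false
```

In other words, `~> 2.0` means at least 2.0 and below 3.0, so the gem installs on Ruby 2.0 even though the README and `.travis.yml` only cover 2.1 and later.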
metadata ADDED
@@ -0,0 +1,64 @@
+ --- !ruby/object:Gem::Specification
+ name: unicode-scripts
+ version: !ruby/object:Gem::Version
+   version: 1.0.0
+ platform: ruby
+ authors:
+ - Jan Lelis
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2016-04-13 00:00:00.000000000 Z
+ dependencies: []
+ description: "[Unicode version: 8.0.0] Retrieve the Unicode script(s) a string belongs
+   to. Can also return the Script_Extension property which is defined as characters
+   which are 'commonly used with more than one script, but with a limited number of
+   scripts'. "
+ email:
+ - mail@janlelis.de
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - ".gitignore"
+ - ".travis.yml"
+ - CHANGELOG.md
+ - CODE_OF_CONDUCT.md
+ - Gemfile
+ - MIT-LICENSE.txt
+ - README.md
+ - Rakefile
+ - data/scripts.marshal.gz
+ - lib/unicode/scripts.rb
+ - lib/unicode/scripts/constants.rb
+ - lib/unicode/scripts/index.rb
+ - lib/unicode/scripts/string_ext.rb
+ - spec/unicode_scripts_spec.rb
+ - unicode-scripts.gemspec
+ homepage: https://github.com/janlelis/unicode-scripts
+ licenses:
+ - MIT
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - "~>"
+     - !ruby/object:Gem::Version
+       version: '2.0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.6.3
+ signing_key:
+ specification_version: 4
+ summary: Which script(s) does a Unicode string belong to?
+ test_files:
+ - spec/unicode_scripts_spec.rb
+ has_rdoc: