smstools 0.0.1 → 0.2.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 4180ff46eb40f9d709cf9895d1dc493aeec4203c
4
- data.tar.gz: cba4fd705516f2418dfae93ce3732231e30faba5
2
+ SHA256:
3
+ metadata.gz: f2cecee4608c47f5abf1cf0a980b3a3a646e358d50a72e6b0f1931f554f86c5f
4
+ data.tar.gz: 46eae0938780419f4672581f4b1105a8d51cc4fe7f6150b2e44fdc3c00f16c6e
5
5
  SHA512:
6
- metadata.gz: b3b26b73c00f2acbb9e107ecc24738b97a713a3579c685339226b9ba369c0d2080ff7b40357a472c0dc9ebe41bca0f8ed38e28cd132242ce7ed3e514a78d69c3
7
- data.tar.gz: 9b7a0e6a64fdaec4f8f05130dc3b4ac735f2848fddfd1bbd8d274ec2b6e9781fcdb993ef121357e77f6b4c486508a067e6480516a72fd33d7c28007397f0fde5
6
+ metadata.gz: 6f40d959431dc1185a989b179c91858363978b5200bba7504e499b214ba8b1493c859eebb3308b343ac8000ec747db89ccfbc664692721315bb68c41f96000a0
7
+ data.tar.gz: 7670ac023de1612cd5e4573ad4879526e5f45e38c871416cef3466e4bbee6d1740d0b2ae760b907d57811f913670862e2907b3ed3c8f268b876085195bed2a61
@@ -1,3 +1,25 @@
1
- ## 0.1.1 (17 January 2014)
1
+ ## 0.2.2 (20 Jan 2021)
2
+
3
+ * #9 Fix the way some complex Unicode characters (like composite emojis) are counted. Thanks to @bryanrite for the neat implementation. Note the fix could be **potentially backwards-incompatible** if you were relying on the incorrect behaviour previously. Technically it's still a bug fix.
4
+
5
+ ## 0.2.1 (18 Aug 2020)
6
+
7
+ * #7 Introduce `SmsTools.use_ascii_encoding` option (defaults to `true` for backwards-compatibility) that allows disabling the `:ascii` workaround encoding. See #6 and #7 for details. Thanks @kingsley-wang.
8
+
9
+ ## 0.2.0 (2 March 2017)
10
+
11
+ * The non-breaking space character (0x00A0 in Unicode and "\xC2\xA0" in UTF-8) is no longer regarded as a valid GSM 7-bit symbol. [#4](https://github.com/livebg/smstools/issues/4)
12
+ * GsmEncoding.to_utf8 will now raise errors in case the provided argument is not a valid GSM 7-bit text.
13
+
14
+ ## 0.1.1 (18 April 2016)
15
+
16
+ * Replaces the small c with cedilla with the capital one, as per the GSM 03.38 standard (by @skliask)
17
+
18
+ ## 0.1.0 (08 October 2015)
19
+
20
+ * distinguish between ascii encoding and gsm encoding
21
+ * add option for preventing the use of gsm encoding, that is to use unicode instead
22
+
23
+ ## 0.0.1 (17 January 2014)
2
24
 
3
25
  * Initial release.
data/README.md CHANGED
@@ -1,23 +1,80 @@
1
1
  # Sms Tools
2
2
 
3
- A small collection of useful Ruby and JavaScript classes implementing often
4
- needed functionality for dealing with SMS messages.
3
+ A small collection of Ruby and JavaScript classes implementing often needed functionality for
4
+ dealing with SMS messages.
5
5
 
6
- The gem is also a Rails engine and using it in your Rails app will allow you
7
- to also use the JavaScript classes via the asset pipeline.
6
+ The gem can also be used in a Rails application as an engine. It integrates with the asset pipeline
7
+ and gives you access to some client-side SMS manipulation functionality.
8
8
 
9
9
  ## Features
10
10
 
11
11
  The following features are available on both the server side and the client
12
12
  side:
13
13
 
14
- - Detection of the most optimal encoding for sending an SMS message (GSM 7-bit
15
- or Unicode).
16
- - Correctly determining the message's length according to the most optimal
17
- encoding.
14
+ - Detection of the most optimal encoding for sending an SMS message (GSM 7-bit or Unicode).
15
+ - Correctly determining a message's length in the most optimal encoding.
18
16
  - Concatenation detection and concatenated message parts counting.
19
17
 
20
- And more.
18
+ The following can be accomplished only on the server with Ruby:
19
+
20
+ - Converting a UTF-8 string to a GSM 7-bit encoding and vice versa.
21
+ - Detecting if a UTF-8 string can be safely represented in a GSM 7-bit encoding.
22
+ - Detection of double-byte chars in the GSM 7-bit encoding.
23
+
24
+ And possibly more.
25
+
26
+ ### Note on the GSM encoding
27
+
28
+ All references to the "GSM" encoding or the "GSM 7-bit alphabet" in this text actually refer to the
29
+ [GSM 03.38 spec](http://en.wikipedia.org/wiki/GSM_03.38) and [its latest
30
+ version](ftp://ftp.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT), as defined by the Unicode
31
+ consortium.
32
+
33
+ This encoding is the most widely used one when sending SMS messages.
34
+
35
+ ### Note regarding non-ASCII symbols from the GSM encoding
36
+
37
+ The GSM 03.38 encoding is used by default. This standard defines a set of
38
+ symbols which can be encoded in 7-bits each, thus allowing up to 160 symbols
39
+ per SMS message (each SMS message can contain up to 140 bytes of data).
40
+
41
+ This standard covers most of the ASCII table, but also includes some non-ASCII
42
+ symbols such as `æ`, `ø` and `å`. If you use these in your messages, you can
43
+ still send them as GSM encoded, having a 160-symbol limit. This is technically
44
+ correct.
45
+
46
+ In reality, however, some SMS routes have problems delivering messages which
47
+ contain such non-ASCII symbols in the GSM encoding. The special symbols might
48
+ be omitted, or the message might not arrive at all.
49
+
50
+ Thus, it might be safer to just send messages in Unicode if the message's text
51
+ contains any non-ASCII symbols. This is not the default as it reduces the max
52
+ symbols count to 70 per message, instead of 160, and you might not have any
53
+ issues with GSM-encoded messages. In case you do, however, you can turn off
54
+ support for the GSM encoding and just treat messages as Unicode if they contain
55
+ non-ASCII symbols.
56
+
57
+ In case you decide to do so, you have to specify it in both the Ruby and the
58
+ JavaScript part of the library, like so:
59
+
60
+ #### In Ruby
61
+
62
+ SmsTools.use_gsm_encoding = false
63
+
64
+ #### In JavaScript
65
+
66
+ //= require sms_tools
67
+ SmsTools.use_gsm_encoding = false;
68
+
69
+ There is another alternative as well. As explained in this commit – f1ffd948d4b8c – SmsTools will by
70
+ default detect the encoding as `:ascii` if the SMS message contains ASCII-only symbols. The safest
71
+ way to send messages would be to use an ASCII subset of the GSM encoding.
72
+
73
+ The `:ascii` encoding is informative only, however. Your SMS sending implementation will have to
74
+ decide how to handle it. You may also find it confusing that the dummy `:ascii` encoding does not
75
+ consider double-byte chars at all when counting the length of the message.
76
+
77
+ To disable this dummy `:ascii` encoding, set `SmsTools.use_ascii_encoding` to `false`.
21
78
 
22
79
  ## Installation
23
80
 
@@ -33,32 +90,99 @@ Or install it yourself as:
33
90
 
34
91
  $ gem install smstools
35
92
 
93
+ If you're using the gem in Rails, you may also want to add the following to your `application.js`
94
+ manifest file to gain access to the client-side features:
95
+
96
+ //= require sms_tools
97
+
36
98
  ## Usage
37
99
 
38
100
  The gem consists of both server-side (Ruby) and client-side classes. You can
39
- use either one.
101
+ use either.
40
102
 
41
103
  ### Server-side code
42
104
 
43
- If you use the gem in Rails or via Bundler, just use the appropriate class,
44
- such as `SmsTools::EncodingDetection` or `SmsTools::GsmEncoding`.
105
+ First make sure you have installed the gem and have required the appropriate files.
106
+
107
+ #### Encoding detection
108
+
109
+ The `SmsTools::EncodingDetection` class provides you with a few simple methods to detect the most
110
+ optimal encoding for sending an SMS message, to correctly calculate its length in that encoding and
111
+ to see if the text would need to be concatenated or will fit in a single message.
112
+
113
+ Here is an example with a non-concatenated message which is best encoded in the GSM 7-bit alphabet:
114
+
115
+ ```ruby
116
+ sms_text = 'Text in GSM 03.38: ÄäøÆ with a double-byte char: ~ '
117
+ sms_encoding = SmsTools::EncodingDetection.new sms_text
118
+
119
+ sms_encoding.gsm? # => true
120
+ sms_encoding.unicode? # => false
121
+ sms_encoding.length # => 52 (because of the double-byte char)
122
+ sms_encoding.concatenated? # => false
123
+ sms_encoding.concatenated_parts # => 1
124
+ sms_encoding.encoding # => :gsm
125
+ ```
126
+
127
+ Here's another example with a concatenated Unicode message:
128
+
129
+ ```ruby
130
+ sms_text = 'Я' * 90
131
+ sms_encoding = SmsTools::EncodingDetection.new sms_text
132
+
133
+ sms_encoding.gsm? # => false
134
+ sms_encoding.unicode? # => true
135
+ sms_encoding.length # => 90
136
+ sms_encoding.concatenated? # => true
137
+ sms_encoding.concatenated_parts # => 2
138
+ sms_encoding.encoding # => :unicode
139
+ ```
140
+
141
+ You can check the specs for this class for more examples.
45
142
 
46
- #### `EncodingDetection`
47
- #### `GsmEncoding`
143
+ #### GSM 03.38 encoding conversion
144
+
145
+ The `SmsTools::GsmEncoding` class can be used to check if a given UTF-8 string can be fully
146
+ represented in the GSM 03.38 encoding as well as to convert from UTF-8 to GSM 03.38 and vice-versa.
147
+
148
+ The main API this class provides is the following:
149
+
150
+ ```ruby
151
+ SmsTools::GsmEncoding.valid? message_text_in_utf8 # => true or false
152
+
153
+ SmsTools::GsmEncoding.from_utf8 utf8_encoded_string # => a GSM 03.38 encoded string
154
+ SmsTools::GsmEncoding.to_utf8 gsm_encoded_string # => a UTF-8 encoded string
155
+ ```
156
+
157
+ Check out the source code of the class to find out more.
48
158
 
49
159
  ### Client-side code
50
160
 
51
- If you're using the gem in Rails 3.x or newer, you can just add the following
52
- to your `application.js` file to gain access to the JavaScript classes:
161
+ If you're using the gem in Rails 3.1 or newer, you can gain access to the `SmsTools.Message` class.
162
+ Its interface is similar to the one of `SmsTools::EncodingDetection`. Here is an example in
163
+ CoffeeScript:
53
164
 
54
- #= require 'sms_tools/all'
165
+ ```coffeescript
166
+ message = new SmsTools.Message 'The text of the message: ~'
55
167
 
56
- Or require only the files you need:
168
+ message.encoding # => 'gsm'
169
+ message.length # => 27
170
+ message.concatenatedPartsCount # => 1
171
+ ```
57
172
 
58
- #= require 'sms_tools/message'
173
+ You can also check how long this message can be in the current most optimal encoding, if you want to
174
+ limit the number of concatenated messages that may be sent:
59
175
 
60
- Note that this assumes you're using the asset pipeline. You need to have a
61
- CoffeeScript preprocessor set up.
176
+ ```coffeescript
177
+ maxConcatenatedPartsCount = 2
178
+ message.maxLengthFor(maxConcatenatedPartsCount) # => 306
179
+ ```
180
+
181
+ This allows you to have a dynamic instead of a fixed length limit: when you use a non-GSM 03.38
182
+ symbol in your text, your message length limit decreases significantly.
183
+
184
+ Note that to use this client-side code, a Rails application with an active asset pipeline is
185
+ assumed. It might be possible to use it in other setups as well, but you're on your own there.
62
186
 
63
187
  ## Contributing
64
188
 
@@ -69,3 +193,10 @@ CoffeeScript preprocessor set up.
69
193
  5. Commit your changes (`git commit -am 'Add some feature'`)
70
194
  6. Push to the branch (`git push origin my-new-feature`)
71
195
  7. Send a pull request.
196
+
197
+ ## Publishing a new version
198
+
199
+ 1. Pick a version number according to Semantic Versioning.
200
+ 2. Update `CHANGELOG.md`, `version.rb` and potentially this readme.
201
+ 3. Commit the changes, tag them with `vX.Y.Z` (e.g. `v0.2.1`) and push all with `git push --tags`.
202
+ 4. Build and publish the new version of the gem with `gem build smstools.gemspec && gem push *.gem`.
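The 160- and 70-symbol limits quoted in the README follow from the 140-byte SMS payload, and the per-part limits for concatenated messages follow once a concatenation header is subtracted. The sketch below works through that arithmetic. It is an illustration, not the gem's code: `capacity`, `LIMITS` and `parts_for` are hypothetical names, and the 6-byte header size is an assumption (the usual size when an 8-bit reference number is used).

```ruby
SMS_PAYLOAD_BYTES = 140
UDH_BYTES = 6 # assumption: concatenation header with an 8-bit reference number

# Whole characters that fit in the payload at the given bits per character.
def capacity(payload_bytes, bits_per_char)
  payload_bytes * 8 / bits_per_char
end

LIMITS = {
  gsm: {
    normal:       capacity(SMS_PAYLOAD_BYTES, 7),              # 160
    concatenated: capacity(SMS_PAYLOAD_BYTES - UDH_BYTES, 7),  # 153
  },
  unicode: {
    normal:       capacity(SMS_PAYLOAD_BYTES, 16),             # 70
    concatenated: capacity(SMS_PAYLOAD_BYTES - UDH_BYTES, 16), # 67
  },
}.freeze

# Number of message parts needed for a text of the given encoded length.
def parts_for(length, encoding)
  limits = LIMITS.fetch(encoding)
  return 1 if length <= limits[:normal]

  (length.to_f / limits[:concatenated]).ceil
end

parts_for(160, :gsm)     # => 1
parts_for(161, :gsm)     # => 2
parts_for(71, :unicode)  # => 2
```

Two GSM-encoded parts hold 2 * 153 = 306 symbols, which agrees with the `maxLengthFor(2)` value shown in the README.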
data/Rakefile CHANGED
@@ -1,11 +1,10 @@
1
1
  require 'bundler/gem_tasks'
2
+ require 'rake/testtask'
2
3
 
3
- task :test do
4
- test_files = Dir[File.expand_path('../spec/**/*_spec.rb', __FILE__)]
5
- command = "ruby -Ispec #{test_files.join ' '}"
4
+ task default: :test
6
5
 
7
- puts "Running #{command}"
8
- system command
6
+ Rake::TestTask.new do |t|
7
+ t.libs << 'spec'
8
+ t.test_files = FileList['spec/**/*_spec.rb']
9
+ t.verbose = true
9
10
  end
10
-
11
- task default: :test
@@ -2,6 +2,9 @@ window.SmsTools ?= {}
2
2
 
3
3
  class SmsTools.Message
4
4
  maxLengthForEncoding:
5
+ ascii:
6
+ normal: 160
7
+ concatenated: 153
5
8
  gsm:
6
9
  normal: 160
7
10
  concatenated: 153
@@ -20,6 +23,7 @@ class SmsTools.Message
20
23
  '€': true
21
24
  '\\': true
22
25
 
26
+ asciiPattern: /^[\x00-\x7F]*$/
23
27
  gsmEncodingPattern: /^[0-9a-zA-Z@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣçΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~ \$\.\-\+\(\)\*\\\/\?\|\^\}\{\[\]\'\r\n]*$/
24
28
 
25
29
  constructor: (@text) ->
@@ -33,8 +37,25 @@ class SmsTools.Message
33
37
 
34
38
  concatenatedPartsCount * @maxLengthForEncoding[@encoding][messageType]
35
39
 
40
+ use_gsm_encoding: ->
41
+ if SmsTools['use_gsm_encoding'] == undefined
42
+ true
43
+ else
44
+ SmsTools['use_gsm_encoding']
45
+
46
+ use_ascii_encoding: ->
47
+ if SmsTools['use_ascii_encoding'] == undefined
48
+ true
49
+ else
50
+ SmsTools['use_ascii_encoding']
51
+
36
52
  _encoding: ->
37
- if @gsmEncodingPattern.test(@text) then 'gsm' else 'unicode'
53
+ if @asciiPattern.test(@text) and @use_ascii_encoding()
54
+ 'ascii'
55
+ else if @use_gsm_encoding() and @gsmEncodingPattern.test(@text)
56
+ 'gsm'
57
+ else
58
+ 'unicode'
38
59
 
39
60
  _concatenatedPartsCount: ->
40
61
  encoding = @encoding
@@ -45,9 +66,9 @@ class SmsTools.Message
45
66
  else
46
67
  parseInt Math.ceil(length / @maxLengthForEncoding[encoding].concatenated), 10
47
68
 
48
- # Returns the number of symbols, which the given text will take up in an SMS
49
- # message, taking into account any double-space symbols in the GSM 03.38
50
- # encoding.
69
+ # Returns the number of symbols which the given text will eat up in an SMS
70
+ # message, taking into account any double-space symbols in the GSM 03.38
71
+ # encoding.
51
72
  _length: ->
52
73
  length = @text.length
53
74
 
@@ -1,7 +1,28 @@
1
1
  require 'sms_tools/version'
2
2
  require 'sms_tools/encoding_detection'
3
3
  require 'sms_tools/gsm_encoding'
4
+ require 'sms_tools/unicode_encoding'
4
5
 
5
6
  if defined?(::Rails) and ::Rails.version >= '3.1'
6
7
  require 'sms_tools/rails/engine'
7
8
  end
9
+
10
+ module SmsTools
11
+ class << self
12
+ def use_gsm_encoding?
13
+ @use_gsm_encoding.nil? ? true : @use_gsm_encoding
14
+ end
15
+
16
+ def use_gsm_encoding=(value)
17
+ @use_gsm_encoding = value
18
+ end
19
+
20
+ def use_ascii_encoding?
21
+ @use_ascii_encoding.nil? ? true : @use_ascii_encoding
22
+ end
23
+
24
+ def use_ascii_encoding=(value)
25
+ @use_ascii_encoding = value
26
+ end
27
+ end
28
+ end
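The readers above use `nil?` rather than `||` to apply the default, which matters because `false` is a legitimate setting. A minimal sketch of the difference, using a hypothetical `ToggleSketch` module rather than the gem itself:

```ruby
# Hypothetical stand-in for a configuration module; not the gem's code.
module ToggleSketch
  class << self
    # Stores whatever the caller assigns, including false.
    attr_writer :use_gsm_encoding

    # Only an unset (nil) flag falls back to the default of true.
    def use_gsm_encoding?
      @use_gsm_encoding.nil? ? true : @use_gsm_encoding
    end

    # What a naive `||` default would do: an explicit false is lost.
    def naive_use_gsm_encoding?
      @use_gsm_encoding || true
    end
  end
end

ToggleSketch.use_gsm_encoding = false
ToggleSketch.use_gsm_encoding?       # => false
ToggleSketch.naive_use_gsm_encoding? # => true
```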
@@ -3,6 +3,10 @@ require 'sms_tools/gsm_encoding'
3
3
  module SmsTools
4
4
  class EncodingDetection
5
5
  MAX_LENGTH_FOR_ENCODING = {
6
+ ascii: {
7
+ normal: 160,
8
+ concatenated: 153,
9
+ },
6
10
  gsm: {
7
11
  normal: 160,
8
12
  concatenated: 153,
@@ -20,7 +24,18 @@ module SmsTools
20
24
  end
21
25
 
22
26
  def encoding
23
- @encoding ||= GsmEncoding.valid?(text) ? :gsm : :unicode
27
+ @encoding ||=
28
+ if text.ascii_only? and SmsTools.use_ascii_encoding?
29
+ :ascii
30
+ elsif SmsTools.use_gsm_encoding? and GsmEncoding.valid?(text)
31
+ :gsm
32
+ else
33
+ :unicode
34
+ end
35
+ end
36
+
37
+ def ascii?
38
+ encoding == :ascii
24
39
  end
25
40
 
26
41
  def gsm?
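The order of the checks in `encoding` decides the result for texts that qualify for more than one encoding. The sketch below isolates that order. `detect_encoding` is a hypothetical helper, and the `gsm_valid` argument stands in for the result of a real GSM 03.38 validity check, which is not reproduced here.

```ruby
# Hypothetical helper mirroring the decision order shown in the diff.
# gsm_valid: assumed to be the outcome of a separate GSM 03.38 validity check.
def detect_encoding(text, gsm_valid:, use_ascii: true, use_gsm: true)
  if text.ascii_only? && use_ascii
    :ascii
  elsif use_gsm && gsm_valid
    :gsm
  else
    :unicode
  end
end

detect_encoding('Hello world.', gsm_valid: true)                   # => :ascii
detect_encoding('Hello world.', gsm_valid: true, use_ascii: false) # => :gsm
detect_encoding("\u00C4", gsm_valid: true, use_gsm: false)         # => :unicode
```

Because the ASCII check comes first, turning off GSM support alone leaves ASCII-only texts classified as `:ascii`, which matches the specs in this diff.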
@@ -49,12 +64,16 @@ module SmsTools
49
64
  concatenated_parts * MAX_LENGTH_FOR_ENCODING[encoding][message_type]
50
65
  end
51
66
 
52
- # Returns the number of symbols, which the given text will take up in an SMS
53
- # message, taking into account any double-space symbols in the GSM 03.38
54
- # encoding.
67
+ # Returns the number of symbols which the given text will eat up in an SMS
68
+ # message, taking into account any double-space symbols in the GSM 03.38
69
+ # encoding.
55
70
  def length
56
- length = text.length
57
- length += text.chars.count { |char| GsmEncoding.double_byte?(char) } if gsm?
71
+ if unicode?
72
+ length = text.chars.sum { |char| UnicodeEncoding.character_count(char) }
73
+ else
74
+ length = text.length
75
+ length += text.chars.count { |char| GsmEncoding.double_byte?(char) } if gsm?
76
+ end
58
77
 
59
78
  length
60
79
  end
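For GSM-encoded text, the length is the character count plus one extra position for every character from the GSM 03.38 extension table, since each of those is sent as an escape code followed by a second septet. A minimal sketch of that rule, assuming the text has already been found to be valid GSM 03.38; `GSM_EXTENSION_CHARS` and `gsm_length` are hypothetical names, and only the printable extension characters are listed:

```ruby
# Printable characters from the GSM 03.38 extension table. Each one takes
# two positions in a message: the escape code plus the character itself.
GSM_EXTENSION_CHARS = ['^', '{', '}', '\\', '[', '~', ']', '|', "\u20AC"].freeze

# Hypothetical helper: assumes the text is already known to be valid GSM 03.38.
def gsm_length(text)
  text.length + text.each_char.count { |char| GSM_EXTENSION_CHARS.include?(char) }
end

gsm_length('[]')        # => 4
gsm_length('Foo {bar}') # => 11
```

This is why the spec above expects `'[]'` to have a length of 4 when the `:ascii` encoding is turned off, and 2 when it is on.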
@@ -4,6 +4,8 @@ module SmsTools
4
4
  module GsmEncoding
5
5
  extend self
6
6
 
7
+ GSM_EXTENSION_TABLE_ESCAPE_CODE = "\x1B".freeze
8
+
7
9
  UTF8_TO_GSM_BASE_TABLE = {
8
10
  0x0040 => "\x00", # COMMERCIAL AT
9
11
  0x00A3 => "\x01", # POUND SIGN
@@ -14,7 +16,7 @@ module SmsTools
14
16
  0x00F9 => "\x06", # LATIN SMALL LETTER U WITH GRAVE
15
17
  0x00EC => "\x07", # LATIN SMALL LETTER I WITH GRAVE
16
18
  0x00F2 => "\x08", # LATIN SMALL LETTER O WITH GRAVE
17
- 0x00E7 => "\x09", # LATIN SMALL LETTER C WITH CEDILLA
19
+ 0x00C7 => "\x09", # LATIN CAPITAL LETTER C WITH CEDILLA
18
20
  0x000A => "\x0A", # LINE FEED
19
21
  0x00D8 => "\x0B", # LATIN CAPITAL LETTER O WITH STROKE
20
22
  0x00F8 => "\x0C", # LATIN SMALL LETTER O WITH STROKE
@@ -32,7 +34,7 @@ module SmsTools
32
34
  0x03A3 => "\x18", # GREEK CAPITAL LETTER SIGMA
33
35
  0x0398 => "\x19", # GREEK CAPITAL LETTER THETA
34
36
  0x039E => "\x1A", # GREEK CAPITAL LETTER XI
35
- 0x00A0 => "\x1B", # ESCAPE TO EXTENSION TABLE
37
+ nil => "\x1B", # ESCAPE TO EXTENSION TABLE or NON-BREAKING SPACE
36
38
  0x00C6 => "\x1C", # LATIN CAPITAL LETTER AE
37
39
  0x00E6 => "\x1D", # LATIN SMALL LETTER AE
38
40
  0x00DF => "\x1E", # LATIN SMALL LETTER SHARP S (German)
@@ -176,20 +178,25 @@ module SmsTools
176
178
  def to_utf8(gsm_encoded_string)
177
179
  utf8_encoded_string = ''
178
180
  escape = false
179
- escape_code = "\e".freeze
180
181
 
181
182
  gsm_encoded_string.each_char do |char|
182
- if char == escape_code
183
+ if char == GSM_EXTENSION_TABLE_ESCAPE_CODE
183
184
  escape = true
184
185
  elsif escape
185
186
  escape = false
186
- utf8_encoded_string << [GSM_TO_UTF8[escape_code + char]].pack('U')
187
+ utf8_encoded_string << [fetch_utf8_char(GSM_EXTENSION_TABLE_ESCAPE_CODE + char)].pack('U')
187
188
  else
188
- utf8_encoded_string << [GSM_TO_UTF8[char]].pack('U')
189
+ utf8_encoded_string << [fetch_utf8_char(char)].pack('U')
189
190
  end
190
191
  end
191
192
 
192
193
  utf8_encoded_string
193
194
  end
195
+
196
+ private
197
+
198
+ def fetch_utf8_char(char)
199
+ GSM_TO_UTF8.fetch(char) { raise "Unsupported symbol in GSM-7 encoding: #{char}" }
200
+ end
194
201
  end
195
202
  end
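The decoding loop above is a two-state machine: an escape code sets a flag, and the next character is looked up together with the escape code. Switching from `[]` to `fetch` with a block turns an unknown symbol into an explicit error. The sketch below shows both ideas on a deliberately tiny table; `SAMPLE_GSM_TO_UTF8` and `decode_sample` are hypothetical and cover only a handful of symbols.

```ruby
ESCAPE = "\x1B".freeze

# Tiny illustrative table: a few identity-mapped characters plus two
# extension-table entries. The real mapping is much larger.
SAMPLE_GSM_TO_UTF8 = {
  'G' => 0x47, 'S' => 0x53, 'M' => 0x4D, ' ' => 0x20,
  "#{ESCAPE}<" => 0x5B, # LEFT SQUARE BRACKET
  "#{ESCAPE}>" => 0x5D, # RIGHT SQUARE BRACKET
}.freeze

def decode_sample(gsm_text)
  decoded = +''
  escaped = false

  gsm_text.each_char do |char|
    if char == ESCAPE
      escaped = true # remember the escape, emit nothing yet
      next
    end

    key = escaped ? ESCAPE + char : char
    escaped = false
    # fetch with a block raises instead of silently returning nil
    codepoint = SAMPLE_GSM_TO_UTF8.fetch(key) do
      raise "Unsupported symbol in GSM-7 encoding: #{char}"
    end
    decoded << [codepoint].pack('U')
  end

  decoded
end

decode_sample("GSM \e<\e>") # => "GSM []"
decode_sample("\x1B")       # => ""
```

A lone trailing escape code produces no output because nothing follows it to look up, which is the behaviour the new spec for `to_utf8` describes.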
@@ -0,0 +1,15 @@
1
+ module SmsTools
2
+ module UnicodeEncoding
3
+ extend self
4
+
5
+ BASIC_PLANE = 0x0000..0xFFFF
6
+
7
+ # UCS-2/UTF-16 is used for unicode text messaging. UCS-2/UTF-16 represents characters in minimum
8
+ # 2-bytes, any characters in the basic plane are represented with 2-bytes, so each codepoint
9
+ # within the Basic Plane counts as a single character. Any codepoint outside the Basic Plane is
10
+ # encoded using 4-bytes and therefore counts as 2 characters in a text message.
11
+ def character_count(char)
12
+ char.each_codepoint.sum { |codepoint| BASIC_PLANE.include?(codepoint) ? 1 : 2 }
13
+ end
14
+ end
15
+ end
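The counting rule in `character_count` amounts to counting UTF-16 code units. The sketch below shows the same rule next to a cross-check that actually encodes the text; `utf16_units` and `utf16_units_by_encoding` are hypothetical helper names, not part of the gem.

```ruby
# Code points above U+FFFF need a surrogate pair (two UTF-16 code units);
# everything in the Basic Multilingual Plane needs one.
def utf16_units(text)
  text.each_codepoint.sum { |codepoint| codepoint > 0xFFFF ? 2 : 1 }
end

# Cross-check: encode to UTF-16 and divide the byte size by two.
def utf16_units_by_encoding(text)
  text.encode('UTF-16LE').bytesize / 2
end

sleeping_face = "\u{1F634}" # a single code point outside the basic plane
# A ZWJ sequence of five code points: two outside the basic plane, three inside.
handball = "\u{1F93E}\u{1F3FD}\u{200D}\u{2640}\u{FE0F}"

utf16_units(sleeping_face) # => 2
utf16_units(handball)      # => 7
```

What renders as one emoji can therefore use up seven of the 70 positions in a Unicode message, which is the case the 0.2.2 fix addresses.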
@@ -1,3 +1,3 @@
1
1
  module SmsTools
2
- VERSION = '0.0.1'
2
+ VERSION = '0.2.2'
3
3
  end
@@ -1,5 +1,5 @@
1
1
  require 'spec_helper'
2
- require 'sms_tools/encoding_detection'
2
+ require 'sms_tools'
3
3
 
4
4
  describe SmsTools::EncodingDetection do
5
5
  it "exposes the original text as a method" do
@@ -7,27 +7,77 @@ describe SmsTools::EncodingDetection do
7
7
  end
8
8
 
9
9
  describe "encoding" do
10
- it "defaults to GSM encoding for empty messages" do
11
- detection_for('').encoding.must_equal :gsm
10
+ it "defaults to ASCII encoding for empty messages" do
11
+ detection_for('').encoding.must_equal :ascii
12
12
  end
13
13
 
14
- it "returns GSM as encoding for simple ASCII text" do
15
- detection_for('foo bar baz').encoding.must_equal :gsm
14
+ it "returns ASCII as encoding for simple ASCII text" do
15
+ detection_for('foo bar baz').encoding.must_equal :ascii
16
16
  end
17
17
 
18
18
  it "returns GSM as encoding for special symbols defined in GSM 03.38" do
19
- detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣçΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :gsm
19
+ detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣCΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :gsm
20
20
  end
21
21
 
22
- it "returns GSM as encoding for puntucation and newline symbols" do
23
- detection_for('Foo bar {} [baz]! Larodi $5. What else?').encoding.must_equal :gsm
24
- detection_for("Spaces and newlines are GSM 03.38, too: \r\n").encoding.must_equal :gsm
22
+ it "returns ASCII as encoding for punctuation and newline symbols" do
23
+ detection_for('Foo bar {} [baz]! Larodi $5. What else?').encoding.must_equal :ascii
24
+ detection_for("Spaces and newlines are GSM 03.38, too: \r\n").encoding.must_equal :ascii
25
25
  end
26
26
 
27
27
  it "returns Unicode when non-GSM Unicode symbols are used" do
28
28
  detection_for('Foo bar лароди').encoding.must_equal :unicode
29
29
  detection_for('∞').encoding.must_equal :unicode
30
30
  end
31
+
32
+ it 'considers the non-breaking space character as a non-GSM Unicode symbol' do
33
+ non_breaking_space = "\xC2\xA0"
34
+
35
+ detection_for(non_breaking_space).encoding.must_equal :unicode
36
+ end
37
+
38
+ describe 'with SmsTools.use_gsm_encoding = false' do
39
+ before do
40
+ SmsTools.use_gsm_encoding = false
41
+ end
42
+
43
+ after do
44
+ SmsTools.use_gsm_encoding = true
45
+ end
46
+
47
+ it "returns Unicode as encoding for special symbols defined in GSM 03.38" do
48
+ detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣCΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :unicode
49
+ end
50
+
51
+ it 'returns ASCII for simple ASCII text' do
52
+ detection_for('Hello world.').encoding.must_equal :ascii
53
+ end
54
+
55
+ it "defaults to ASCII encoding for empty messages" do
56
+ detection_for('').encoding.must_equal :ascii
57
+ end
58
+ end
59
+
60
+ describe 'with SmsTools.use_ascii_encoding = false' do
61
+ before do
62
+ SmsTools.use_ascii_encoding = false
63
+ end
64
+
65
+ after do
66
+ SmsTools.use_ascii_encoding = true
67
+ end
68
+
69
+ it "returns GSM 03.38 as encoding for special symbols defined in GSM 03.38" do
70
+ detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣCΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :gsm
71
+ end
72
+
73
+ it 'returns GSM 03.38 for simple ASCII text' do
74
+ detection_for('Hello world.').encoding.must_equal :gsm
75
+ end
76
+
77
+ it "defaults to GSM 03.38 encoding for empty messages" do
78
+ detection_for('').encoding.must_equal :gsm
79
+ end
80
+ end
31
81
  end
32
82
 
33
83
  describe "message length" do
@@ -38,7 +88,7 @@ describe SmsTools::EncodingDetection do
38
88
  end
39
89
 
40
90
  it "computes the length of non-trivial GSM encoded messages correctly" do
41
- detection_for('GSM: 09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣçΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà').length.must_equal 63
91
+ detection_for('GSM: 09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣÇΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà').length.must_equal 63
42
92
  end
43
93
 
44
94
  it "correctly counts the length of whitespace-only messages" do
@@ -67,6 +117,34 @@ describe SmsTools::EncodingDetection do
67
117
  detection_for('Уникод: ^{}[~]|€\\').length.must_equal 17
68
118
  detection_for('Уникод: Σ: €').length.must_equal 12
69
119
  end
120
+
121
+ it "counts ZWJ unicode characters correctly" do
122
+ detection_for('😴').length.must_equal 2
123
+ detection_for('🛌🏽').length.must_equal 4
124
+ detection_for('🤾🏽‍♀️').length.must_equal 7
125
+ detection_for('🇵🇵').length.must_equal 4
126
+ detection_for('👩‍❤️‍👩').length.must_equal 8
127
+ end
128
+
129
+ describe 'with SmsTools.use_gsm_encoding = false' do
130
+ before do
131
+ SmsTools.use_gsm_encoding = false
132
+ end
133
+
134
+ it "returns ASCII encoded length for some specific symbols which are also in GSM 03.38" do
135
+ detection_for('[]').length.must_equal 2
136
+ end
137
+ end
138
+
139
+ describe 'with SmsTools.use_ascii_encoding = false' do
140
+ before do
141
+ SmsTools.use_ascii_encoding = false
142
+ end
143
+
144
+ it "returns GSM 03.38 encoded length for some specific symbols which are also in ASCII" do
145
+ detection_for('[]').length.must_equal 4
146
+ end
147
+ end
70
148
  end
71
149
 
72
150
  describe "concatenated message parts counting" do
@@ -96,11 +174,16 @@ describe SmsTools::EncodingDetection do
96
174
  concatenated_parts_for length: 135, encoding: :unicode, must_be: 3
97
175
  end
98
176
 
99
- it "counts parts for actual GSM-encoded and Unicode messages" do
177
+ it "counts parts for actual GSM-encoded messages" do
100
178
  detection_for('').concatenated_parts.must_equal 1
101
- detection_for('Я').concatenated_parts.must_equal 1
102
179
  detection_for('Σ' * 160).concatenated_parts.must_equal 1
103
180
  detection_for('Σ' * 159 + '~').concatenated_parts.must_equal 2
181
+ end
182
+
183
+ it "counts parts for actual Unicode-encoded messages" do
184
+ detection_for('Я').concatenated_parts.must_equal 1
185
+ detection_for('Я' * 70).concatenated_parts.must_equal 1
186
+ detection_for('Я' * 71).concatenated_parts.must_equal 2
104
187
  detection_for('Я' * 133 + '~').concatenated_parts.must_equal 2
105
188
  end
106
189
  end
@@ -0,0 +1,36 @@
1
+ require 'spec_helper'
2
+ require 'sms_tools'
3
+
4
+ describe SmsTools::GsmEncoding do
5
+ describe 'from_utf8' do
6
+ it 'converts simple UTF-8 text to GSM 03.38' do
7
+ SmsTools::GsmEncoding.from_utf8('simple').must_equal 'simple'
8
+ end
9
+
10
+ it 'converts UTF-8 text with double-byte chars to GSM 03.38' do
11
+ SmsTools::GsmEncoding.from_utf8('foo []').must_equal "foo \e<\e>"
12
+ end
13
+
14
+ it 'raises an exception if the UTF-8 text contains chars outside of GSM 03.38' do
15
+ -> { SmsTools::GsmEncoding.from_utf8('баба') }.must_raise RuntimeError, /Unsupported symbol in GSM-7 encoding/
16
+ end
17
+ end
18
+
19
+ describe 'to_utf8' do
20
+ it 'converts simple GSM 03.38 to UTF-8' do
21
+ SmsTools::GsmEncoding.to_utf8('simple').must_equal 'simple'
22
+ end
23
+
24
+ it 'converts GSM 03.38 text with double-byte chars to UTF-8' do
25
+ SmsTools::GsmEncoding.to_utf8("GSM \e<\e>").must_equal 'GSM []'
26
+ end
27
+
28
+ it 'raises an exception if the UTF-8 text contains chars outside of GSM 03.38' do
29
+ -> { SmsTools::GsmEncoding.to_utf8('баба') }.must_raise RuntimeError, /Unsupported symbol in GSM-7 encoding/
30
+ end
31
+
32
+ it 'ignores single occurrences of the GSM-7 extension table escape code' do
33
+ SmsTools::GsmEncoding.to_utf8("\x1B").must_equal ''
34
+ end
35
+ end
36
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: smstools
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.2.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - Dimitar Dimitrov
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-01-17 00:00:00.000000000 Z
11
+ date: 2021-01-20 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -86,16 +86,18 @@ files:
86
86
  - lib/sms_tools/encoding_detection.rb
87
87
  - lib/sms_tools/gsm_encoding.rb
88
88
  - lib/sms_tools/rails/engine.rb
89
+ - lib/sms_tools/unicode_encoding.rb
89
90
  - lib/sms_tools/version.rb
90
91
  - lib/smstools.rb
91
92
  - smstools.gemspec
92
93
  - spec/sms_tools/encoding_detection_spec.rb
94
+ - spec/sms_tools/gsm_encoding_spec.rb
93
95
  - spec/spec_helper.rb
94
96
  homepage: https://github.com/mitio/smstools
95
97
  licenses:
96
98
  - MIT
97
99
  metadata: {}
98
- post_install_message:
100
+ post_install_message:
99
101
  rdoc_options: []
100
102
  require_paths:
101
103
  - lib
@@ -110,11 +112,11 @@ required_rubygems_version: !ruby/object:Gem::Requirement
110
112
  - !ruby/object:Gem::Version
111
113
  version: '0'
112
114
  requirements: []
113
- rubyforge_project:
114
- rubygems_version: 2.2.0
115
- signing_key:
115
+ rubygems_version: 3.0.3
116
+ signing_key:
116
117
  specification_version: 4
117
118
  summary: Small library of classes for common SMS-related functionality.
118
119
  test_files:
119
120
  - spec/sms_tools/encoding_detection_spec.rb
121
+ - spec/sms_tools/gsm_encoding_spec.rb
120
122
  - spec/spec_helper.rb