smstools 0.0.1 → 0.2.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 4180ff46eb40f9d709cf9895d1dc493aeec4203c
4
- data.tar.gz: cba4fd705516f2418dfae93ce3732231e30faba5
2
+ SHA256:
3
+ metadata.gz: f2cecee4608c47f5abf1cf0a980b3a3a646e358d50a72e6b0f1931f554f86c5f
4
+ data.tar.gz: 46eae0938780419f4672581f4b1105a8d51cc4fe7f6150b2e44fdc3c00f16c6e
5
5
  SHA512:
6
- metadata.gz: b3b26b73c00f2acbb9e107ecc24738b97a713a3579c685339226b9ba369c0d2080ff7b40357a472c0dc9ebe41bca0f8ed38e28cd132242ce7ed3e514a78d69c3
7
- data.tar.gz: 9b7a0e6a64fdaec4f8f05130dc3b4ac735f2848fddfd1bbd8d274ec2b6e9781fcdb993ef121357e77f6b4c486508a067e6480516a72fd33d7c28007397f0fde5
6
+ metadata.gz: 6f40d959431dc1185a989b179c91858363978b5200bba7504e499b214ba8b1493c859eebb3308b343ac8000ec747db89ccfbc664692721315bb68c41f96000a0
7
+ data.tar.gz: 7670ac023de1612cd5e4573ad4879526e5f45e38c871416cef3466e4bbee6d1740d0b2ae760b907d57811f913670862e2907b3ed3c8f268b876085195bed2a61
@@ -1,3 +1,25 @@
1
- ## 0.1.1 (17 January 2014)
1
+ ## 0.2.2 (20 Jan 2021)
2
+
3
+ * #9 Fix the way some complex Unicode characters (like composite emojis) are counted. Thanks to @bryanrite for the neat implementation. Note the fix could be **potentially backwards-incompatible** if you were relying on the incorrect behaviour previously. Technically it's still a bug fix.
4
+
5
+ ## 0.2.1 (18 Aug 2020)
6
+
7
+ * #7 Introduce `SmsTools.use_ascii_encoding` option (defaults to `true` for backwards-compatibility) that allows disabling the `:ascii` workaround encoding. See #6 and #7 for details. Thanks @kingsley-wang.
8
+
9
+ ## 0.2.0 (2 March 2017)
10
+
11
+ * The non-breaking space character (0x00A0 in Unicode and "\xC2\xA0" in UTF-8) is no longer regarded as a valid GSM 7-bit symbol. [#4](https://github.com/livebg/smstools/issues/4)
12
+ * GsmEncoding.to_utf8 will now raise errors in case the provided argument is not a valid GSM 7-bit text.
13
+
14
+ ## 0.1.1 (18 April 2016)
15
+
16
+ * Replaces the small c with cedilla with the capital one, as per the GSM 03.38 standard (by @skliask)
17
+
18
+ ## 0.1.0 (08 October 2015)
19
+
20
+ * distinguish between ascii encoding and gsm encoding
21
+ * add option for preventing the use of gsm encoding, that is to use unicode instead
22
+
23
+ ## 0.0.1 (17 January 2014)
2
24
 
3
25
  * Initial release.
data/README.md CHANGED
@@ -1,23 +1,80 @@
1
1
  # Sms Tools
2
2
 
3
- A small collection of useful Ruby and JavaScript classes implementing often
4
- needed functionality for dealing with SMS messages.
3
+ A small collection of Ruby and JavaScript classes implementing often needed functionality for
4
+ dealing with SMS messages.
5
5
 
6
- The gem is also a Rails engine and using it in your Rails app will allow you
7
- to also use the JavaScript classes via the asset pipeline.
6
+ The gem can also be used in a Rails application as an engine. It integrates with the asset pipeline
7
+ and gives you access to some client-side SMS manipulation functionality.
8
8
 
9
9
  ## Features
10
10
 
11
11
  The following features are available on both the server side and the client
12
12
  side:
13
13
 
14
- - Detection of the most optimal encoding for sending an SMS message (GSM 7-bit
15
- or Unicode).
16
- - Correctly determining the message's length according to the most optimal
17
- encoding.
14
+ - Detection of the most optimal encoding for sending an SMS message (GSM 7-bit or Unicode).
15
+ - Correctly determining a message's length in the most optimal encoding.
18
16
  - Concatenation detection and concatenated message parts counting.
19
17
 
20
- And more.
18
+ The following can be accomplished only on the server with Ruby:
19
+
20
+ - Converting a UTF-8 string to a GSM 7-bit encoding and vice versa.
21
+ - Detecting if a UTF-8 string can be safely represented in a GSM 7-bit encoding.
22
+ - Detection of double-byte chars in the GSM 7-bit encoding.
23
+
24
+ And possibly more.
25
+
26
+ ### Note on the GSM encoding
27
+
28
+ All references to the "GSM" encoding or the "GSM 7-bit alphabet" in this text actually refer to the
29
+ [GSM 03.38 spec](http://en.wikipedia.org/wiki/GSM_03.38) and [its latest
30
+ version](ftp://ftp.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT), as defined by the Unicode
31
+ consortium.
32
+
33
+ This encoding is the most widely used one when sending SMS messages.
34
+
35
+ ### Note regarding non-ASCII symbols from the GSM encoding
36
+
37
+ The GSM 03.38 encoding is used by default. This standard defines a set of
38
+ symbols which can be encoded in 7-bits each, thus allowing up to 160 symbols
39
+ per SMS message (each SMS message can contain up to 140 bytes of data).
40
+
41
+ This standard covers most of the ASCII table, but also includes some non-ASCII
42
+ symbols such as `æ`, `ø` and `å`. If you use these in your messages, you can
43
+ still send them as GSM encoded, having a 160-symbol limit. This is technically
44
+ correct.
45
+
46
+ In reality, however, some SMS routes have problems delivering messages which
47
+ contain such non-ASCII symbols in the GSM encoding. The special symbols might
48
+ be omitted, or the message might not arrive at all.
49
+
50
+ Thus, it might be safer to just send messages in Unicode if the message's text
51
+ contains any non-ASCII symbols. This is not the default as it reduces the max
52
+ symbols count to 70 per message, instead of 160, and you might not have any
53
+ issues with GSM-encoded messages. In case you do, however, you can turn off
54
+ support for the GSM encoding and just treat messages as Unicode if they contain
55
+ non-ASCII symbols.
56
+
57
+ In case you decide to do so, you have to specify it in both the Ruby and the
58
+ JavaScript part of the library, like so:
59
+
60
+ #### In Ruby
61
+
62
+ SmsTools.use_gsm_encoding = false
63
+
64
+ #### In Javascript
65
+
66
+ //= require sms_tools
67
+ SmsTools.use_gsm_encoding = false;
68
+
69
+ There is another alternative as well. As explained in this commit – f1ffd948d4b8c – SmsTools will by
70
+ default detect the encoding as `:ascii` if the SMS message contains ASCII-only symbols. The safest
71
+ way to send messages would be to use an ASCII subset of the GSM encoding.
72
+
73
+ The `:ascii` encoding is informative only, however. Your SMS sending implementation will have to
74
+ decide how to handle it. You may also find it confusing that the dummy `:ascii` encoding does not
75
+ consider double-byte chars at all when counting the length of the message.
76
+
77
+ To disable this dummy `:ascii` encoding, set `SmsTools.use_ascii_encoding` to `false`.
21
78
 
22
79
  ## Installation
23
80
 
@@ -33,32 +90,99 @@ Or install it yourself as:
33
90
 
34
91
  $ gem install smstools
35
92
 
93
+ If you're using the gem in Rails, you may also want to add the following to your `application.js`
94
+ manifest file to gain access to the client-side features:
95
+
96
+ //= require sms_tools
97
+
36
98
  ## Usage
37
99
 
38
100
  The gem consists of both server-side (Ruby) and client-side classes. You can
39
- use either one.
101
+ use either.
40
102
 
41
103
  ### Server-side code
42
104
 
43
- If you use the gem in Rails or via Bundler, just use the appropriate class,
44
- such as `SmsTools::EncodingDetection` or `SmsTools::GsmEncoding`.
105
+ First make sure you have installed the gem and have required the appropriate files.
106
+
107
+ #### Encoding detection
108
+
109
+ The `SmsTools::EncodingDetection` class provides you with a few simple methods to detect the most
110
+ optimal encoding for sending an SMS message, to correctly calculate its length in that encoding and
111
+ to see if the text would need to be concatenated or will fit in a single message.
112
+
113
+ Here is an example with a non-concatenated message which is best encoded in the GSM 7-bit alphabet:
114
+
115
+ ```ruby
116
+ sms_text = 'Text in GSM 03.38: ÄäøÆ with a double-byte char: ~ '
117
+ sms_encoding = SmsTools::EncodingDetection.new sms_text
118
+
119
+ sms_encoding.gsm? # => true
120
+ sms_encoding.unicode? # => false
121
+ sms_encoding.length # => 52 (because of the double-byte char)
122
+ sms_encoding.concatenated? # => false
123
+ sms_encoding.concatenated_parts # => 1
124
+ sms_encoding.encoding # => :gsm
125
+ ```
126
+
127
+ Here's another example with a concatenated Unicode message:
128
+
129
+ ```ruby
130
+ sms_text = 'Я' * 90
131
+ sms_encoding = SmsTools::EncodingDetection.new sms_text
132
+
133
+ sms_encoding.gsm? # => false
134
+ sms_encoding.unicode? # => true
135
+ sms_encoding.length # => 90
136
+ sms_encoding.concatenated? # => true
137
+ sms_encoding.concatenated_parts # => 2
138
+ sms_encoding.encoding # => :unicode
139
+ ```
140
+
141
+ You can check the specs for this class for more examples.
45
142
 
46
- #### `EncodingDetection`
47
- #### `GsmEncoding`
143
+ #### GSM 03.38 encoding conversion
144
+
145
+ The `SmsTools::GsmEncoding` class can be used to check if a given UTF-8 string can be fully
146
+ represented in the GSM 03.38 encoding as well as to convert from UTF-8 to GSM 03.38 and vice-versa.
147
+
148
+ The main API this class provides is the following:
149
+
150
+ ```ruby
151
+ SmsTools::GsmEncoding.valid? message_text_in_utf8 # => true or false
152
+
153
+ SmsTools::GsmEncoding.from_utf8 utf8_encoded_string # => a GSM 03.38 encoded string
154
+ SmsTools::GsmEncoding.to_utf8 gsm_encoded_string # => a UTF-8 encoded string
155
+ ```
156
+
157
+ Check out the source code of the class to find out more.
48
158
 
49
159
  ### Client-side code
50
160
 
51
- If you're using the gem in Rails 3.x or newer, you can just add the following
52
- to your `application.js` file to gain access to the JavaScript classes:
161
+ If you're using the gem in Rails 3.1 or newer, you can gain access to the `SmsTools.Message` class.
162
+ Its interface is similar to the one of `SmsTools::EncodingDetection`. Here is an example in
163
+ CoffeeScript:
53
164
 
54
- #= require 'sms_tools/all'
165
+ ```coffeescript
166
+ message = new SmsTools.Message 'The text of the message: ~'
55
167
 
56
- Or require only the files you need:
168
+ message.encoding # => 'gsm'
169
+ message.length # => 27
170
+ message.concatenatedPartsCount # => 1
171
+ ```
57
172
 
58
- #= require 'sms_tools/message'
173
+ You can also check how long this message can be in the current most optimal encoding, if we want to
174
+ limit the number of concatenated messages we will allow to be sent:
59
175
 
60
- Note that this assumes you're using the asset pipeline. You need to have a
61
- CoffeeScript preprocessor set up.
176
+ ```coffeescript
177
+ maxConcatenatedPartsCount = 2
178
+ message.maxLengthFor(maxConcatenatedPartsCount) # => 306
179
+ ```
180
+
181
+ This allows you to have a dynamic instead of a fixed length limit, because when you use a non-GSM 03.38
182
+ symbol in your text, your message length limit decreases significantly.
183
+
184
+ Note that to use this client-side code, a Rails application with an active asset pipeline is
185
+ assumed. It might be possible to use it in other setups as well, but you're on your own there.
62
186
 
63
187
  ## Contributing
64
188
 
@@ -69,3 +193,10 @@ CoffeeScript preprocessor set up.
69
193
  5. Commit your changes (`git commit -am 'Add some feature'`)
70
194
  6. Push to the branch (`git push origin my-new-feature`)
71
195
  7. Send a pull request.
196
+
197
+ ## Publishing a new version
198
+
199
+ 1. Pick a version number according to Semantic Versioning.
200
+ 2. Update `CHANGELOG.md`, `version.rb` and potentially this readme.
201
+ 3. Commit the changes, tag them with `vX.Y.Z` (e.g. `v0.2.1`) and push all with `git push --tags`.
202
+ 4. Build and publish the new version of the gem with `gem build smstools.gemspec && gem push *.gem`.
data/Rakefile CHANGED
@@ -1,11 +1,10 @@
  require 'bundler/gem_tasks'
+ require 'rake/testtask'
  
- task :test do
-   test_files = Dir[File.expand_path('../spec/**/*_spec.rb', __FILE__)]
-   command = "ruby -Ispec #{test_files.join ' '}"
+ task default: :test
  
-   puts "Running #{command}"
-   system command
+ Rake::TestTask.new do |t|
+   t.libs << 'spec'
+   t.test_files = FileList['spec/**/*_spec.rb']
+   t.verbose = true
  end
-
- task default: :test
@@ -2,6 +2,9 @@ window.SmsTools ?= {}
2
2
 
3
3
  class SmsTools.Message
4
4
  maxLengthForEncoding:
5
+ ascii:
6
+ normal: 160
7
+ concatenated: 153
5
8
  gsm:
6
9
  normal: 160
7
10
  concatenated: 153
@@ -20,6 +23,7 @@ class SmsTools.Message
20
23
  '€': true
21
24
  '\\': true
22
25
 
26
+ asciiPattern: /^[\x00-\x7F]*$/
23
27
  gsmEncodingPattern: /^[0-9a-zA-Z@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣçΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~ \$\.\-\+\(\)\*\\\/\?\|\^\}\{\[\]\'\r\n]*$/
24
28
 
25
29
  constructor: (@text) ->
@@ -33,8 +37,25 @@ class SmsTools.Message
33
37
 
34
38
  concatenatedPartsCount * @maxLengthForEncoding[@encoding][messageType]
35
39
 
40
+ use_gsm_encoding: ->
41
+ if SmsTools['use_gsm_encoding'] == undefined
42
+ true
43
+ else
44
+ SmsTools['use_gsm_encoding']
45
+
46
+ use_ascii_encoding: ->
47
+ if SmsTools['use_ascii_encoding'] == undefined
48
+ true
49
+ else
50
+ SmsTools['use_ascii_encoding']
51
+
36
52
  _encoding: ->
37
- if @gsmEncodingPattern.test(@text) then 'gsm' else 'unicode'
53
+ if @asciiPattern.test(@text) and @use_ascii_encoding()
54
+ 'ascii'
55
+ else if @use_gsm_encoding() and @gsmEncodingPattern.test(@text)
56
+ 'gsm'
57
+ else
58
+ 'unicode'
38
59
 
39
60
  _concatenatedPartsCount: ->
40
61
  encoding = @encoding
@@ -45,9 +66,9 @@ class SmsTools.Message
45
66
  else
46
67
  parseInt Math.ceil(length / @maxLengthForEncoding[encoding].concatenated), 10
47
68
 
48
- # Returns the number of symbols, which the given text will take up in an SMS
49
- # message, taking into account any double-space symbols in the GSM 03.38
50
- # encoding.
69
+ # Returns the number of symbols which the given text will eat up in an SMS
70
+ # message, taking into account any double-space symbols in the GSM 03.38
71
+ # encoding.
51
72
  _length: ->
52
73
  length = @text.length
53
74
 
@@ -1,7 +1,28 @@
  require 'sms_tools/version'
  require 'sms_tools/encoding_detection'
  require 'sms_tools/gsm_encoding'
+ require 'sms_tools/unicode_encoding'
  
  if defined?(::Rails) and ::Rails.version >= '3.1'
    require 'sms_tools/rails/engine'
  end
+
+ module SmsTools
+   class << self
+     def use_gsm_encoding?
+       @use_gsm_encoding.nil? ? true : @use_gsm_encoding
+     end
+
+     def use_gsm_encoding=(value)
+       @use_gsm_encoding = value
+     end
+
+     def use_ascii_encoding?
+       @use_ascii_encoding.nil? ? true : @use_ascii_encoding
+     end
+
+     def use_ascii_encoding=(value)
+       @use_ascii_encoding = value
+     end
+   end
+ end
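The two flags above and the `_encoding` method in the CoffeeScript class follow the same pattern: an unset flag counts as enabled, and detection tries `ascii`, then `gsm`, then falls back to `unicode`. A minimal standalone sketch of that pattern follows. `DemoSmsConfig`, `GSM_SAMPLE`, `gsm_like?` and `detect_encoding` are hypothetical names invented for illustration, and the GSM check is deliberately simplified rather than a copy of the gem's table.

```ruby
# Hypothetical stand-in for the library module; names are illustrative only.
module DemoSmsConfig
  class << self
    attr_writer :use_gsm_encoding, :use_ascii_encoding

    # An unset (nil) flag means "enabled"; only an explicit false disables it.
    def use_gsm_encoding?
      @use_gsm_encoding.nil? ? true : @use_gsm_encoding
    end

    def use_ascii_encoding?
      @use_ascii_encoding.nil? ? true : @use_ascii_encoding
    end
  end
end

# Simplified GSM check (an assumption of this sketch): ASCII plus a small,
# incomplete sample of GSM 03.38 letters. The real alphabet differs; for
# example, it has no backtick.
GSM_SAMPLE = 'ÄäÖöÜüÑñØøÆæÅåßΔΣΩ£¥€'.chars.freeze

def gsm_like?(text)
  text.each_char.all? { |char| char.ascii_only? || GSM_SAMPLE.include?(char) }
end

# Same order of checks as in the diff: ascii first, then gsm, then unicode.
def detect_encoding(text)
  if text.ascii_only? && DemoSmsConfig.use_ascii_encoding?
    :ascii
  elsif DemoSmsConfig.use_gsm_encoding? && gsm_like?(text)
    :gsm
  else
    :unicode
  end
end

detect_encoding('Hello world.') # => :ascii
DemoSmsConfig.use_ascii_encoding = false
detect_encoding('Hello world.') # => :gsm
DemoSmsConfig.use_gsm_encoding = false
detect_encoding('Hello world.') # => :unicode
```

Treating `nil` as "enabled" keeps the default behaviour intact for callers that never touch the setters, which is presumably why the flags are not simply initialised to `true`.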
@@ -3,6 +3,10 @@ require 'sms_tools/gsm_encoding'
3
3
  module SmsTools
4
4
  class EncodingDetection
5
5
  MAX_LENGTH_FOR_ENCODING = {
6
+ ascii: {
7
+ normal: 160,
8
+ concatenated: 153,
9
+ },
6
10
  gsm: {
7
11
  normal: 160,
8
12
  concatenated: 153,
@@ -20,7 +24,18 @@ module SmsTools
20
24
  end
21
25
 
22
26
  def encoding
23
- @encoding ||= GsmEncoding.valid?(text) ? :gsm : :unicode
27
+ @encoding ||=
28
+ if text.ascii_only? and SmsTools.use_ascii_encoding?
29
+ :ascii
30
+ elsif SmsTools.use_gsm_encoding? and GsmEncoding.valid?(text)
31
+ :gsm
32
+ else
33
+ :unicode
34
+ end
35
+ end
36
+
37
+ def ascii?
38
+ encoding == :ascii
24
39
  end
25
40
 
26
41
  def gsm?
@@ -49,12 +64,16 @@ module SmsTools
49
64
  concatenated_parts * MAX_LENGTH_FOR_ENCODING[encoding][message_type]
50
65
  end
51
66
 
52
- # Returns the number of symbols, which the given text will take up in an SMS
53
- # message, taking into account any double-space symbols in the GSM 03.38
54
- # encoding.
67
+ # Returns the number of symbols which the given text will eat up in an SMS
68
+ # message, taking into account any double-space symbols in the GSM 03.38
69
+ # encoding.
55
70
  def length
56
- length = text.length
57
- length += text.chars.count { |char| GsmEncoding.double_byte?(char) } if gsm?
71
+ if unicode?
72
+ length = text.chars.sum { |char| UnicodeEncoding.character_count(char) }
73
+ else
74
+ length = text.length
75
+ length += text.chars.count { |char| GsmEncoding.double_byte?(char) } if gsm?
76
+ end
58
77
 
59
78
  length
60
79
  end
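The non-Unicode branch of `length` above adds one extra symbol for every character that lives in the GSM 03.38 extension table, but only when the detected encoding is `:gsm`. The sketch below illustrates that rule on its own. `GSM_EXTENSION_CHARS` is written out here from the standard's extension table rather than copied from the gem, and `sms_symbol_length` is a hypothetical helper name.

```ruby
# Printable characters from the GSM 03.38 extension table. Each one is sent
# as an escape septet followed by its own septet, so it takes two slots.
GSM_EXTENSION_CHARS = ['^', '{', '}', '\\', '[', '~', ']', '|', '€'].freeze

# Hypothetical helper: length of +text+ in SMS symbols for :ascii or :gsm.
# Only the :gsm encoding counts extension characters twice.
def sms_symbol_length(text, encoding)
  length = text.length
  length += text.each_char.count { |char| GSM_EXTENSION_CHARS.include?(char) } if encoding == :gsm
  length
end

sms_symbol_length('[]', :ascii) # => 2
sms_symbol_length('[]', :gsm)   # => 4
```

This mirrors the README's caveat that the informative `:ascii` encoding does not consider double-byte characters when counting length.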
@@ -4,6 +4,8 @@ module SmsTools
4
4
  module GsmEncoding
5
5
  extend self
6
6
 
7
+ GSM_EXTENSION_TABLE_ESCAPE_CODE = "\x1B".freeze
8
+
7
9
  UTF8_TO_GSM_BASE_TABLE = {
8
10
  0x0040 => "\x00", # COMMERCIAL AT
9
11
  0x00A3 => "\x01", # POUND SIGN
@@ -14,7 +16,7 @@ module SmsTools
14
16
  0x00F9 => "\x06", # LATIN SMALL LETTER U WITH GRAVE
15
17
  0x00EC => "\x07", # LATIN SMALL LETTER I WITH GRAVE
16
18
  0x00F2 => "\x08", # LATIN SMALL LETTER O WITH GRAVE
17
- 0x00E7 => "\x09", # LATIN SMALL LETTER C WITH CEDILLA
19
+ 0x00C7 => "\x09", # LATIN CAPITAL LETTER C WITH CEDILLA
18
20
  0x000A => "\x0A", # LINE FEED
19
21
  0x00D8 => "\x0B", # LATIN CAPITAL LETTER O WITH STROKE
20
22
  0x00F8 => "\x0C", # LATIN SMALL LETTER O WITH STROKE
@@ -32,7 +34,7 @@ module SmsTools
32
34
  0x03A3 => "\x18", # GREEK CAPITAL LETTER SIGMA
33
35
  0x0398 => "\x19", # GREEK CAPITAL LETTER THETA
34
36
  0x039E => "\x1A", # GREEK CAPITAL LETTER XI
35
- 0x00A0 => "\x1B", # ESCAPE TO EXTENSION TABLE
37
+ nil => "\x1B", # ESCAPE TO EXTENSION TABLE or NON-BREAKING SPACE
36
38
  0x00C6 => "\x1C", # LATIN CAPITAL LETTER AE
37
39
  0x00E6 => "\x1D", # LATIN SMALL LETTER AE
38
40
  0x00DF => "\x1E", # LATIN SMALL LETTER SHARP S (German)
@@ -176,20 +178,25 @@ module SmsTools
176
178
  def to_utf8(gsm_encoded_string)
177
179
  utf8_encoded_string = ''
178
180
  escape = false
179
- escape_code = "\e".freeze
180
181
 
181
182
  gsm_encoded_string.each_char do |char|
182
- if char == escape_code
183
+ if char == GSM_EXTENSION_TABLE_ESCAPE_CODE
183
184
  escape = true
184
185
  elsif escape
185
186
  escape = false
186
- utf8_encoded_string << [GSM_TO_UTF8[escape_code + char]].pack('U')
187
+ utf8_encoded_string << [fetch_utf8_char(GSM_EXTENSION_TABLE_ESCAPE_CODE + char)].pack('U')
187
188
  else
188
- utf8_encoded_string << [GSM_TO_UTF8[char]].pack('U')
189
+ utf8_encoded_string << [fetch_utf8_char(char)].pack('U')
189
190
  end
190
191
  end
191
192
 
192
193
  utf8_encoded_string
193
194
  end
195
+
196
+ private
197
+
198
+ def fetch_utf8_char(char)
199
+ GSM_TO_UTF8.fetch(char) { raise "Unsupported symbol in GSM-7 encoding: #{char}" }
200
+ end
194
201
  end
195
202
  end
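The `to_utf8` change above swaps a plain hash lookup for `Hash#fetch` with a block, so an unknown GSM byte raises a descriptive error. A lone escape code at the end of the input is consumed without producing output. The following is a reduced sketch of that decoding loop. `TINY_GSM_TO_UTF8`, `lookup` and `decode_gsm` are hypothetical names, and the table is a four-entry excerpt chosen for illustration only.

```ruby
ESCAPE = "\x1B".freeze

# Tiny, hypothetical excerpt of a GSM-to-Unicode table (keys are GSM bytes,
# values are Unicode codepoints). A real table covers the whole alphabet.
TINY_GSM_TO_UTF8 = {
  "\x00"  => 0x0040, # COMMERCIAL AT
  'a'     => 0x0061, # LATIN SMALL LETTER A
  "\x1B<" => 0x005B, # LEFT SQUARE BRACKET (extension table)
  "\x1B>" => 0x005D, # RIGHT SQUARE BRACKET (extension table)
}.freeze

# Hash#fetch with a block runs the block only when the key is missing.
def lookup(gsm_char)
  TINY_GSM_TO_UTF8.fetch(gsm_char) do
    raise "Unsupported symbol in GSM-7 encoding: #{gsm_char.inspect}"
  end
end

def decode_gsm(gsm_string)
  utf8 = String.new(encoding: 'UTF-8')
  escape = false

  gsm_string.each_char do |char|
    if char == ESCAPE
      escape = true # remember the escape and wait for the next char
    elsif escape
      escape = false
      utf8 << [lookup(ESCAPE + char)].pack('U')
    else
      utf8 << [lookup(char)].pack('U')
    end
  end

  utf8
end

decode_gsm("a\x00\x1B<\x1B>") # => "a@[]"
decode_gsm("\x1B")            # => "" (a lone trailing escape is dropped)
```

Calling `decode_gsm('z')` raises a `RuntimeError` in this sketch because `'z'` is missing from the reduced table.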
@@ -0,0 +1,15 @@
+ module SmsTools
+   module UnicodeEncoding
+     extend self
+
+     BASIC_PLANE = 0x0000..0xFFFF
+
+     # UCS-2/UTF-16 is used for unicode text messaging. UCS-2/UTF-16 represents characters in minimum
+     # 2-bytes, any characters in the basic plane are represented with 2-bytes, so each codepoint
+     # within the Basic Plane counts as a single character. Any codepoint outside the Basic Plane is
+     # encoded using 4-bytes and therefore counts as 2 characters in a text message.
+     def character_count(char)
+       char.each_codepoint.sum { |codepoint| BASIC_PLANE.include?(codepoint) ? 1 : 2 }
+     end
+   end
+ end
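The counting rule in `character_count` amounts to counting UTF-16 code units: one per Basic Plane codepoint, two for anything beyond it. A small standalone sketch can cross-check that reading by actually encoding the text to UTF-16. `utf16_units` and `utf16_units_via_encode` are hypothetical helper names, not part of the gem.

```ruby
BASIC_PLANE = 0x0000..0xFFFF

# Counts UTF-16 code units: one per Basic Plane codepoint, two otherwise.
def utf16_units(text)
  text.each_codepoint.sum { |codepoint| BASIC_PLANE.include?(codepoint) ? 1 : 2 }
end

# Cross-check: the same number obtained by encoding to UTF-16 and counting
# 2-byte units.
def utf16_units_via_encode(text)
  text.encode('UTF-16LE').bytesize / 2
end

sleeping_face = "\u{1F634}"                                   # 😴
couple = "\u{1F469}\u{200D}\u{2764}\u{FE0F}\u{200D}\u{1F469}" # 👩‍❤️‍👩

utf16_units(sleeping_face) # => 2
utf16_units(couple)        # => 8
utf16_units('Я')           # => 1
```

The composite emoji counts as 8 because its two woman codepoints lie outside the Basic Plane (2 units each), while the two zero-width joiners, the heart and the variation selector take 1 unit each.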
@@ -1,3 +1,3 @@
  module SmsTools
-   VERSION = '0.0.1'
+   VERSION = '0.2.2'
  end
@@ -1,5 +1,5 @@
1
1
  require 'spec_helper'
2
- require 'sms_tools/encoding_detection'
2
+ require 'sms_tools'
3
3
 
4
4
  describe SmsTools::EncodingDetection do
5
5
  it "exposes the original text as a method" do
@@ -7,27 +7,77 @@ describe SmsTools::EncodingDetection do
7
7
  end
8
8
 
9
9
  describe "encoding" do
10
- it "defaults to GSM encoding for empty messages" do
11
- detection_for('').encoding.must_equal :gsm
10
+ it "defaults to ASCII encoding for empty messages" do
11
+ detection_for('').encoding.must_equal :ascii
12
12
  end
13
13
 
14
- it "returns GSM as encoding for simple ASCII text" do
15
- detection_for('foo bar baz').encoding.must_equal :gsm
14
+ it "returns ASCII as encoding for simple ASCII text" do
15
+ detection_for('foo bar baz').encoding.must_equal :ascii
16
16
  end
17
17
 
18
18
  it "returns GSM as encoding for special symbols defined in GSM 03.38" do
19
- detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣçΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :gsm
19
+ detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣCΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :gsm
20
20
  end
21
21
 
22
- it "returns GSM as encoding for punctuation and newline symbols" do
23
- detection_for('Foo bar {} [baz]! Larodi $5. What else?').encoding.must_equal :gsm
24
- detection_for("Spaces and newlines are GSM 03.38, too: \r\n").encoding.must_equal :gsm
22
+ it "returns ASCII as encoding for punctuation and newline symbols" do
23
+ detection_for('Foo bar {} [baz]! Larodi $5. What else?').encoding.must_equal :ascii
24
+ detection_for("Spaces and newlines are GSM 03.38, too: \r\n").encoding.must_equal :ascii
25
25
  end
26
26
 
27
27
  it "returns Unicode when non-GSM Unicode symbols are used" do
28
28
  detection_for('Foo bar лароди').encoding.must_equal :unicode
29
29
  detection_for('∞').encoding.must_equal :unicode
30
30
  end
31
+
32
+ it 'considers the non-breaking space character as a non-GSM Unicode symbol' do
33
+ non_breaking_space = "\xC2\xA0"
34
+
35
+ detection_for(non_breaking_space).encoding.must_equal :unicode
36
+ end
37
+
38
+ describe 'with SmsTools.use_gsm_encoding = false' do
39
+ before do
40
+ SmsTools.use_gsm_encoding = false
41
+ end
42
+
43
+ after do
44
+ SmsTools.use_gsm_encoding = true
45
+ end
46
+
47
+ it "returns Unicode as encoding for special symbols defined in GSM 03.38" do
48
+ detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣCΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :unicode
49
+ end
50
+
51
+ it 'returns ASCII for simple ASCII text' do
52
+ detection_for('Hello world.').encoding.must_equal :ascii
53
+ end
54
+
55
+ it "defaults to ASCII encoding for empty messages" do
56
+ detection_for('').encoding.must_equal :ascii
57
+ end
58
+ end
59
+
60
+ describe 'with SmsTools.use_ascii_encoding = false' do
61
+ before do
62
+ SmsTools.use_ascii_encoding = false
63
+ end
64
+
65
+ after do
66
+ SmsTools.use_ascii_encoding = true
67
+ end
68
+
69
+ it "returns GSM 03.38 as encoding for special symbols defined in GSM 03.38" do
70
+ detection_for('09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣCΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà€~').encoding.must_equal :gsm
71
+ end
72
+
73
+ it 'returns GSM 03.38 for simple ASCII text' do
74
+ detection_for('Hello world.').encoding.must_equal :gsm
75
+ end
76
+
77
+ it "defaults to GSM 03.38 encoding for empty messages" do
78
+ detection_for('').encoding.must_equal :gsm
79
+ end
80
+ end
31
81
  end
32
82
 
33
83
  describe "message length" do
@@ -38,7 +88,7 @@ describe SmsTools::EncodingDetection do
38
88
  end
39
89
 
40
90
  it "computes the length of non-trivial GSM encoded messages correctly" do
41
- detection_for('GSM: 09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣçΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà').length.must_equal 63
91
+ detection_for('GSM: 09azAZ@Δ¡¿£_!Φ"¥Γ#èΛ¤éΩ%ùΠ&ìΨòΣÇΘΞ:Ø;ÄäøÆ,<Ööæ=ÑñÅß>Üüåɧà').length.must_equal 63
42
92
  end
43
93
 
44
94
  it "correctly counts the length of whitespace-only messages" do
@@ -67,6 +117,34 @@ describe SmsTools::EncodingDetection do
67
117
  detection_for('Уникод: ^{}[~]|€\\').length.must_equal 17
68
118
  detection_for('Уникод: Σ: €').length.must_equal 12
69
119
  end
120
+
121
+ it "counts ZWJ unicode characters correctly" do
122
+ detection_for('😴').length.must_equal 2
123
+ detection_for('🛌🏽').length.must_equal 4
124
+ detection_for('🤾🏽‍♀️').length.must_equal 7
125
+ detection_for('🇵🇵').length.must_equal 4
126
+ detection_for('👩‍❤️‍👩').length.must_equal 8
127
+ end
128
+
129
+ describe 'with SmsTools.use_gsm_encoding = false' do
130
+ before do
131
+ SmsTools.use_gsm_encoding = false
132
+ end
133
+
134
+ it "returns ASCII encoded length for some specific symbols which are also in GSM 03.38" do
135
+ detection_for('[]').length.must_equal 2
136
+ end
137
+ end
138
+
139
+ describe 'with SmsTools.use_ascii_encoding = false' do
140
+ before do
141
+ SmsTools.use_ascii_encoding = false
142
+ end
143
+
144
+ it "returns GSM 03.38 encoded length for some specific symbols which are also in ASCII" do
145
+ detection_for('[]').length.must_equal 4
146
+ end
147
+ end
70
148
  end
71
149
 
72
150
  describe "concatenated message parts counting" do
@@ -96,11 +174,16 @@ describe SmsTools::EncodingDetection do
96
174
  concatenated_parts_for length: 135, encoding: :unicode, must_be: 3
97
175
  end
98
176
 
99
- it "counts parts for actual GSM-encoded and Unicode messages" do
177
+ it "counts parts for actual GSM-encoded messages" do
100
178
  detection_for('').concatenated_parts.must_equal 1
101
- detection_for('Я').concatenated_parts.must_equal 1
102
179
  detection_for('Σ' * 160).concatenated_parts.must_equal 1
103
180
  detection_for('Σ' * 159 + '~').concatenated_parts.must_equal 2
181
+ end
182
+
183
+ it "counts parts for actual Unicode-encoded messages" do
184
+ detection_for('Я').concatenated_parts.must_equal 1
185
+ detection_for('Я' * 70).concatenated_parts.must_equal 1
186
+ detection_for('Я' * 71).concatenated_parts.must_equal 2
104
187
  detection_for('Я' * 133 + '~').concatenated_parts.must_equal 2
105
188
  end
106
189
  end
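The part counts asserted in the specs above (160 and 153 symbols for GSM, 70 and the implied 67 for Unicode) can be derived from the 140-byte SMS payload mentioned in the README. The sketch below is a back-of-the-envelope reconstruction, not the gem's code. The 6-byte concatenation header (`UDH_BYTES`) is an assumption of this sketch, and `capacity`, `LIMITS` and `parts_for` are hypothetical names.

```ruby
SMS_PAYLOAD_BYTES = 140
UDH_BYTES = 6 # assumed header size for concatenation with an 8-bit reference

# Symbols that fit in a payload: 7-bit septets for GSM, 16-bit units for
# Unicode. Integer division discards any partial symbol.
def capacity(payload_bytes, bits_per_symbol)
  payload_bytes * 8 / bits_per_symbol
end

LIMITS = {
  gsm: {
    normal: capacity(SMS_PAYLOAD_BYTES, 7),                   # 160
    concatenated: capacity(SMS_PAYLOAD_BYTES - UDH_BYTES, 7)  # 153
  },
  unicode: {
    normal: capacity(SMS_PAYLOAD_BYTES, 16),                  # 70
    concatenated: capacity(SMS_PAYLOAD_BYTES - UDH_BYTES, 16) # 67
  }
}.freeze

# A message that fits the single-message limit is one part; otherwise every
# part is limited to the smaller concatenated size.
def parts_for(length, encoding)
  limits = LIMITS.fetch(encoding)
  return 1 if length <= limits[:normal]

  (length.to_f / limits[:concatenated]).ceil
end

parts_for(160, :gsm)     # => 1
parts_for(161, :gsm)     # => 2
parts_for(70, :unicode)  # => 1
parts_for(71, :unicode)  # => 2
parts_for(135, :unicode) # => 3
```

The jump from one part to two happens at 161 GSM symbols or 71 Unicode symbols, even though two parts can hold 306 or 134 symbols respectively, because every part of a concatenated message carries the header.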
@@ -0,0 +1,36 @@
1
+ require 'spec_helper'
2
+ require 'sms_tools'
3
+
4
+ describe SmsTools::GsmEncoding do
5
+ describe 'from_utf8' do
6
+ it 'converts simple UTF-8 text to GSM 03.38' do
7
+ SmsTools::GsmEncoding.from_utf8('simple').must_equal 'simple'
8
+ end
9
+
10
+ it 'converts UTF-8 text with double-byte chars to GSM 03.38' do
11
+ SmsTools::GsmEncoding.from_utf8('foo []').must_equal "foo \e<\e>"
12
+ end
13
+
14
+ it 'raises an exception if the UTF-8 text contains chars outside of GSM 03.38' do
15
+ -> { SmsTools::GsmEncoding.from_utf8('баба') }.must_raise RuntimeError, /Unsupported symbol in GSM-7 encoding/
16
+ end
17
+ end
18
+
19
+ describe 'to_utf8' do
20
+ it 'converts simple GSM 03.38 to UTF-8' do
21
+ SmsTools::GsmEncoding.to_utf8('simple').must_equal 'simple'
22
+ end
23
+
24
+ it 'converts GSM 03.38 text with double-byte chars to UTF-8' do
25
+ SmsTools::GsmEncoding.to_utf8("GSM \e<\e>").must_equal 'GSM []'
26
+ end
27
+
28
+ it 'raises an exception if the text contains chars outside of GSM 03.38' do
29
+ -> { SmsTools::GsmEncoding.to_utf8('баба') }.must_raise RuntimeError, /Unsupported symbol in GSM-7 encoding/
30
+ end
31
+
32
+ it 'ignores single occurrences of the GSM-7 extension table escape code' do
33
+ SmsTools::GsmEncoding.to_utf8("\x1B").must_equal ''
34
+ end
35
+ end
36
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: smstools
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.2.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - Dimitar Dimitrov
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-01-17 00:00:00.000000000 Z
11
+ date: 2021-01-20 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -86,16 +86,18 @@ files:
86
86
  - lib/sms_tools/encoding_detection.rb
87
87
  - lib/sms_tools/gsm_encoding.rb
88
88
  - lib/sms_tools/rails/engine.rb
89
+ - lib/sms_tools/unicode_encoding.rb
89
90
  - lib/sms_tools/version.rb
90
91
  - lib/smstools.rb
91
92
  - smstools.gemspec
92
93
  - spec/sms_tools/encoding_detection_spec.rb
94
+ - spec/sms_tools/gsm_encoding_spec.rb
93
95
  - spec/spec_helper.rb
94
96
  homepage: https://github.com/mitio/smstools
95
97
  licenses:
96
98
  - MIT
97
99
  metadata: {}
98
- post_install_message:
100
+ post_install_message:
99
101
  rdoc_options: []
100
102
  require_paths:
101
103
  - lib
@@ -110,11 +112,11 @@ required_rubygems_version: !ruby/object:Gem::Requirement
110
112
  - !ruby/object:Gem::Version
111
113
  version: '0'
112
114
  requirements: []
113
- rubyforge_project:
114
- rubygems_version: 2.2.0
115
- signing_key:
115
+ rubygems_version: 3.0.3
116
+ signing_key:
116
117
  specification_version: 4
117
118
  summary: Small library of classes for common SMS-related functionality.
118
119
  test_files:
119
120
  - spec/sms_tools/encoding_detection_spec.rb
121
+ - spec/sms_tools/gsm_encoding_spec.rb
120
122
  - spec/spec_helper.rb