unicode-display_width 3.1.2 → 3.1.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a85ca57ca5e291c17993e526d222dda44b884286484b3831bb8173ce92aafb1a
- data.tar.gz: d1036dfc6464459de04a713e273d09dea767a3b9a9629d9e491052c2ffe97c23
+ metadata.gz: 4b0b5fe12467a22c6b21ad6dfb8dc422eb547e252690e432afc7504e8dae641c
+ data.tar.gz: a2c0e4c856034b1ef64946861d33845615cd8c950da462441faea2f900c14502
  SHA512:
- metadata.gz: d669e8a2866b56a78bafb3fff6d2d6430fab6bb1ca2633aeaac68e0634ca14374ac0b325bc7159ef90afe0bdffd9c154700cae1fc3183b1d74281ff4b5024e1b
- data.tar.gz: 5f319484d27dad70b3851398e11cd3cb93b5c4f41a6c3a76c958d505d8357f9e303b661fd7a0339262d1458b82cb8619e6682ee2dbf8c583d33fbde4fd1a8680
+ metadata.gz: 8a9499ffcdc0f6def0ac88fc13aaaaea0e46031a63e05c00c32872e1f066d7550abd8e0cb3efaea084a07ebc168e87ed2a3a32effa040c1a651c3288157704f1
+ data.tar.gz: f6a6c7e002476db323d52073ef00567c6eafb564864f1b9c6b72ba7228c5f1a0325e12a07445e0d7a7af66ff988ccd05e89d268898c78c2eaaf97735f79b3f90
data/CHANGELOG.md CHANGED
@@ -1,5 +1,12 @@
  # CHANGELOG

+ ## 3.1.3
+
+ Better handling of non-UTF-8 strings, patch by @Earlopain:
+
+ - Data with *BINARY* encoding is interpreted as UTF-8, if possible
+ - Use `invalid: :replace` and `undef: :replace` options when converting to UTF-8
+
  ## 3.1.2

  - Performance improvements
@@ -28,6 +35,7 @@

  ## 3.0.1

+
  - Add WezTerm and foot as good Emoji terminals

  ## 3.0.0
data/README.md CHANGED
@@ -71,6 +71,11 @@ Unicode::DisplayWidth.of("·", 1) # => 1
  Unicode::DisplayWidth.of("·", 2) # => 2
  ```

+ ### Encoding Notes
+
+ - Data with *BINARY* encoding is interpreted as UTF-8, if possible
+ - Non-UTF-8 strings are converted to UTF-8 before measuring, using the [`{invalid: :replace, undef: :replace}`) options](https://ruby-doc.org/3.3.5/encodings_rdoc.html#label-Encoding+Options)
+
  ### Custom Overwrites

  You can overwrite how to handle specific code points by passing a hash (or even a proc) as `overwrite:` parameter:
@@ -126,7 +131,7 @@ The `emoji:` option can be used to configure which type of Emoji should be consi

  Unfortunately, the level of Emoji support varies a lot between terminals. While some of them are able to display (almost) all Emoji sequences correctly, others fall back to displaying sequences of basic Emoji. When `emoji: true` or `emoji: :auto` is used, the gem will attempt to set the best fitting Emoji setting for you (e.g. `:rgi_at` on "Apple_Terminal" or `false` on Gnome's terminal widget).

- Please note that Emoji display and number of terminal columns used might differs a lot. For example, it might be the case that a terminal does not understand which Emoji to display, but still manages to calculate the proper amount of terminal cells. The automatic Emoji support level per terminal only considers the latter (cursor position), not the actual Emoji image(s) displayed. Please [open an issue](https://github.com/janlelis/unicode-display_width/issues/new) if you notice your terminal application could use a better default value. Also see the [ucs-detect project](https://ucs-detect.readthedocs.io/results.html), which is a great resource that compares various terminal's Unicode/Emoji capabilities.
+ Please note that Emoji display and number of terminal columns used might differs a lot. For example, it might be the case that a terminal does not understand which Emoji to display, but still manages to calculate the proper amount of terminal cells. The automatic Emoji support level per terminal only considers the latter (cursor position), not the actual Emoji image(s) displayed. Please [open an issue](https://github.com/janlelis/unicode-display_width/issues/new) if you notice your terminal application could use a better default value. Also see the [ucs-detect project](https://ucs-detect.readthedocs.io/results.html), which is a great resource that compares various terminal's Unicode/Emoji capabilities. You can checkout how your terminals renders different kind of Emoji types with this [terminal-emoji-width.rb script](https://github.com/janlelis/unicode-display_width/blob/main/misc/terminal-emoji-width.rb).

  **To terminal implementors reading this:** Although the practice of giving all Emoji/ZWJ sequences a width of 2 (`:all` mode described above) has some advantages, it does not lead to a particularly good developer experience. Since there is always the possibility of well-formed Emoji that are currently not supported (non-RGI / future Unicode) appearing, those sequences will take more cells. Instead of overflowing, cutting off sequences or displaying placeholder-Emoji, could it be worthwile to implement the `:rgi` option (only known Emoji get width 2) and give those unknown Emoji the space they need? This would support the idea that the meaning of an unknown Emoji sequence can still be conveyed (without messing up the terminal at the same time). Just a thought…
 
@@ -2,7 +2,7 @@

  module Unicode
  class DisplayWidth
- VERSION = "3.1.2"
+ VERSION = "3.1.3"
  UNICODE_VERSION = "16.0.0"
  DATA_DIRECTORY = File.expand_path(File.dirname(__FILE__) + "/../../../data/")
  INDEX_FILENAME = DATA_DIRECTORY + "/display_width.marshal.gz"
@@ -47,7 +47,14 @@ module Unicode

  # Returns monospace display width of string
  def self.of(string, ambiguous = nil, overwrite = nil, old_options = {}, **options)
- string = string.encode(Encoding::UTF_8) unless string.encoding == Encoding::UTF_8
+ # Binary strings don't make much sense when calculating display width.
+ # Assume it's valid UTF-8
+ if string.encoding == Encoding::BINARY && !string.force_encoding(Encoding::UTF_8).valid_encoding?
+ # Didn't work out, go back to binary
+ string.force_encoding(Encoding::BINARY)
+ end
+
+ string = string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace) unless string.encoding == Encoding::UTF_8
  options = normalize_options(string, ambiguous, overwrite, old_options, **options)

  width = 0
@@ -236,4 +243,3 @@ module Unicode
  end
  end
  end
-
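The two-step normalization added to `self.of` can be tried in isolation. The following is a minimal sketch and not the gem's code: `normalize_for_width` is a hypothetical helper name, and it works on a copy (`dup`) of the input, whereas the hunk calls `force_encoding` on the argument itself.

```ruby
# Hypothetical helper (not part of unicode-display_width) that mirrors the
# normalization added in 3.1.3, applied to a copy of the input.
def normalize_for_width(string)
  string = string.dup

  # Step 1: relabel BINARY data as UTF-8; undo that if the bytes are not valid UTF-8.
  if string.encoding == Encoding::BINARY &&
     !string.force_encoding(Encoding::UTF_8).valid_encoding?
    string.force_encoding(Encoding::BINARY)
  end

  # Step 2: transcode everything else, substituting U+FFFD for bytes that
  # are invalid or have no UTF-8 equivalent.
  return string if string.encoding == Encoding::UTF_8
  string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

utf8_bytes = "caf\u00E9".b   # BINARY label, but the bytes are valid UTF-8
bad_bytes  = "ab\xFF".b      # BINARY label, 0xFF is never valid in UTF-8
latin1     = "\xE9".b.force_encoding(Encoding::ISO_8859_1)

p normalize_for_width(utf8_bytes)  # => "café"
p normalize_for_width(bad_bytes)   # => "ab" followed by U+FFFD
p normalize_for_width(latin1)      # => "é"
```

Working on a copy also means frozen input is accepted, since `force_encoding` raises `FrozenError` on a frozen receiver.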
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: unicode-display_width
  version: !ruby/object:Gem::Version
- version: 3.1.2
+ version: 3.1.3
  platform: ruby
  authors:
  - Jan Lelis
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-11-20 00:00:00.000000000 Z
+ date: 2024-12-26 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: unicode-emoji
@@ -104,7 +104,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.5.21
+ rubygems_version: 3.1.6
  signing_key:
  specification_version: 4
  summary: Determines the monospace display width of a string in Ruby.