tokenizers 0.2.1 → 0.2.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +13 -0
- data/Cargo.lock +125 -1253
- data/Cargo.toml +0 -5
- data/README.md +1 -1
- data/ext/tokenizers/Cargo.toml +4 -3
- data/ext/tokenizers/src/encoding.rs +77 -3
- data/ext/tokenizers/src/lib.rs +26 -5
- data/ext/tokenizers/src/tokenizer.rs +20 -20
- data/lib/tokenizers/char_bpe_tokenizer.rb +2 -2
- data/lib/tokenizers/encoding.rb +19 -0
- data/lib/tokenizers/from_pretrained.rb +119 -0
- data/lib/tokenizers/tokenizer.rb +12 -0
- data/lib/tokenizers/version.rb +1 -1
- data/lib/tokenizers.rb +8 -7
- metadata +5 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e4f3cb98cb867df67a1c8a00b56f9ec5f4c6fafa178d760655dafb6735160773
+  data.tar.gz: 88c420f7a42f56330ce091df7f131878efd552488232282388e69d7a3c4b4aa2
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8e4746ccdf33dce78dc2b86d847f47f83576ca0d637671f825ad006a53b7ac3374654f7724f1e889618f322f9cfa5081e30083997ee9810eab282b9a8b99f807
+  data.tar.gz: 5dfe7b502d908f85ae16cfb28ebe1bd2ff51348c31151c7ee531504c00a0315dc22ea76fea963690de8c7390c7adb50d392e39de6db4a22101e91d31de1fa4e8
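The checksums.yaml entries record digests of the `metadata.gz` and `data.tar.gz` members packed inside the `.gem` archive. As a rough sketch of how such a digest could be checked by hand, the snippet below compares a file's SHA256 against an expected hex string using Ruby's standard `Digest` library. The helper name `sha256_matches?` is hypothetical, and the demonstration runs on a throwaway temp file rather than a real gem member.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper (not part of RubyGems or the tokenizers gem):
# returns true when the file's SHA256 hex digest equals expected_hex.
def sha256_matches?(path, expected_hex)
  actual = Digest::SHA256.file(path).hexdigest
  actual == expected_hex.strip.downcase
end

# Demonstration on a temporary file; a real check would point at an
# extracted data.tar.gz and the value listed under SHA256 in checksums.yaml.
Tempfile.create("demo") do |f|
  f.write("hello")
  f.flush
  expected = Digest::SHA256.hexdigest("hello")
  puts sha256_matches?(f.path, expected)   # matching digest
  puts sha256_matches?(f.path, "0" * 64)   # deliberately wrong digest
end
```

The same pattern works for the SHA512 entries by swapping in `Digest::SHA512`.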
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,16 @@
+## 0.2.3 (2022-01-22)
+
+- Added `add_special_tokens` option to `encode` method
+- Added warning about `encode` method including special tokens by default in 0.3.0
+- Added more methods to `Encoding`
+- Fixed error with precompiled gem on Mac ARM
+
+## 0.2.2 (2022-01-15)
+
+- Added precompiled gem for Linux ARM
+- Added `from_file` method
+- Fixed error with precompiled gem on Linux x86-64
+
 ## 0.2.1 (2022-01-12)
 
 - Added support for Ruby 3.2
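The 0.2.3 changelog entries describe an `add_special_tokens` option on `encode` plus a warning that the default will change in 0.3.0. The toy class below is a minimal sketch of that pattern and is not the gem's code: `ToyTokenizer`, the whitespace splitting, and the BERT-style `[CLS]`/`[SEP]` markers are all assumptions made for illustration. It also assumes the current default omits special tokens, which the changelog implies but does not state outright.

```ruby
# Toy stand-in for illustration only; this is NOT the tokenizers gem.
class ToyTokenizer
  CLS = "[CLS]" # assumed start marker (BERT-style)
  SEP = "[SEP]" # assumed end marker (BERT-style)

  # add_special_tokens: nil means the caller did not choose explicitly.
  def encode(text, add_special_tokens: nil)
    if add_special_tokens.nil?
      # Mirrors the idea of a deprecation warning ahead of a default change.
      warn "add_special_tokens will default to true in a future release; pass it explicitly"
      add_special_tokens = false # assumed current default
    end
    tokens = text.downcase.split
    add_special_tokens ? [CLS, *tokens, SEP] : tokens
  end
end

tok = ToyTokenizer.new
p tok.encode("Hello world", add_special_tokens: true)  # ["[CLS]", "hello", "world", "[SEP]"]
p tok.encode("Hello world", add_special_tokens: false) # ["hello", "world"]
```

Passing the option explicitly in calling code avoids the warning and keeps behavior stable across the default change.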