RubyGems - tomoto - Versions diffs - 0.3.2-arm64-darwin → 0.3.3-arm64-darwin - Mend

tomoto 0.3.2-arm64-darwin → 0.3.3-arm64-darwin

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 25a8fdacc489aed2a3521fdfcfe0d5e591d3eeb71120bd2ca9a7bbc9cf65f5b6
-  data.tar.gz: f21fd4b4d378392fbc1ed7bf0a53f37854a5c28caf65e1e8189bcab11f00f90b
+  metadata.gz: 5a103162a422fa8a45dcab76b8ee72ea3e7e755d4f924666601f204d868ea13e
+  data.tar.gz: ac60a94e66a6518bbd36a17e05d3bc79b4bbb0b32b14aa619d1a6cda2cef4b7b
 SHA512:
-  metadata.gz: 81fa53fbb42974675ec3e7e996a01372106c326d2e8713382b4866209b4c63511e861362e596e18d6f5d2219cb98e7f1ba4c42d43dd5a12d3306340402eba22a
-  data.tar.gz: 3c20a2402fb61c94d1a298eb743f3f2eda368fa64b828792b5685655d5ac0a62084a3f188a488d5ed1048d6e316def5fb55e85d07ae319d64f0d8d19e38e7e23
+  metadata.gz: 2d4891fbccb8fcd5572fdbdea4d9ffe6aa7684d16b1690feadad3a99a12b4c799263ba7ae86150ca0980357572ad65e95fcdcb2ce7a47f52ab78140d2752ef53
+  data.tar.gz: 6d4d3565f2c9db5f4b04f68428006f0ad7de028a1065411f6131e6468a332f53e601a507f92604423aa2eabf0b7a695cfbb5cf62fa622c364e82a13a81ef7465
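
The SHA256 and SHA512 entries above are digests of the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. A minimal sketch of checking such a digest with Ruby's standard `digest` library follows. The helper name `checksum_matches?` and the sample bytes are illustrative assumptions, not part of RubyGems or tomoto.

```ruby
require "digest"

# Hypothetical helper: compare the SHA256 of some bytes against an expected hex digest.
def checksum_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# In practice `bytes` would be the contents of data.tar.gz read in binary mode,
# e.g. File.binread("data.tar.gz"); a literal string keeps this sketch self-contained.
sample = "hello"
expected = Digest::SHA256.hexdigest(sample)

puts checksum_matches?(sample, expected)       # matching digest
puts checksum_matches?(sample + "!", expected) # altered content no longer matches
```

The same pattern should apply to the SHA512 entries by swapping in `Digest::SHA512`.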

data/CHANGELOG.md CHANGED

@@ -1,3 +1,8 @@
+## 0.3.3 (2023-02-01)
+- Added `topic_label_dict` method to `LLDA`
+- Fixed error with `infer` with loaded model
 ## 0.3.2 (2023-01-22)
 - Added precompiled gem for Mac ARM

data/LICENSE.txt CHANGED

@@ -1,7 +1,7 @@
 MIT License
 Copyright (c) 2019, bab2min
-Copyright (c) 2020-2021 Andrew Kane
+Copyright (c) 2020-2023 Andrew Kane
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

data/README.md CHANGED

@@ -12,17 +12,15 @@ Add this line to your application’s Gemfile:
 gem "tomoto"
 ```
-ARM is not currently supported
 ## Getting Started
 Train a model
 ```ruby
 model = Tomoto::LDA.new(k: 2)
-model.add_doc("text from document one")
-model.add_doc("text from document two")
-model.add_doc("text from document three")
+model.add_doc(["tokens", "from", "document", "one"])
+model.add_doc(["tokens", "from", "document", "two"])
+model.add_doc(["tokens", "from", "document", "three"])
 model.train(100) # iterations
 ```
@@ -78,7 +76,7 @@ model.ll_per_word
 Perform inference for unseen documents
 ```ruby
-doc = model.make_doc("unseen doc")
+doc = model.make_doc(["unseen", "doc"])
 topic_dist, ll = model.infer(doc)
 ```
@@ -114,14 +112,6 @@ If a method or option you need isn’t supported, feel free to open an issue.
 - [LDA](examples/lda_basic.rb)
 - [HDP](examples/hdp_basic.rb)
-## Tokenization
-Documents are tokenized by whitespace by default, or you can perform your own tokenization.
-```ruby
-model.add_doc(["tokens", "from", "document", "one"])
-```
 ## Performance
 tomoto uses AVX2, AVX, or SSE2 instructions to increase performance on machines that support it. Check which instruction set architecture it’s using with:
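
With the Tokenization section removed, the README examples in this diff pass arrays of tokens rather than raw strings. A rough sketch of preparing such arrays with plain whitespace splitting is shown below. `tokenize` is a hypothetical helper, and the commented `add_doc` call mirrors the README rather than being executed here.

```ruby
# Hypothetical helper: lowercase the text and split on runs of whitespace.
def tokenize(text)
  text.downcase.split
end

docs = [
  "text from document one",
  "Text  from\tdocument two"
]

token_lists = docs.map { |d| tokenize(d) }
token_lists.each { |tokens| p tokens }

# With the gem installed, each list could then be passed along as in the README:
#   token_lists.each { |tokens| model.add_doc(tokens) }
```

Lowercasing is a choice made for this sketch only. Real preprocessing might also drop punctuation or stop words, depending on the corpus.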

data/ext/tomoto/llda.cpp CHANGED

@@ -29,5 +29,18 @@ void init_llda(Rice::Module& m) {
       "topics_per_label",
       [](tomoto::ILLDAModel& self) {
         return self.getNumTopicsPerLabel();
+      })
+    .define_method(
+      "topic_label_dict",
+      [](tomoto::ILLDAModel& self) {
+        auto dict = self.getTopicLabelDict();
+        Array res;
+        auto utf8 = Rice::Class(rb_cEncoding).call("const_get", "UTF_8");
+        for (size_t i = 0; i < dict.size(); i++) {
+          VALUE value = Rice::detail::To_Ruby<std::string>().convert(dict.toWord(i));
+          Object obj(value);
+          res.push(obj.call("force_encoding", utf8));
+        }
+        return res;
       });
 }
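
The new binding calls `force_encoding` on each label so that Ruby treats the returned strings as UTF-8. The sketch below illustrates what that call does in plain Ruby, under the assumption that a string arrives tagged with a different encoding (binary here). It does not use the gem.

```ruby
# Bytes for "café" tagged as binary (ASCII-8BIT), standing in for a string
# handed over from native code. This starting encoding is an assumption.
raw = "caf\xC3\xA9".b

# force_encoding changes only the encoding tag; the bytes stay the same.
label = raw.dup.force_encoding(Encoding::UTF_8)

puts raw.encoding
puts label.encoding
puts label.valid_encoding?
puts label.bytes == raw.bytes
```

Because only the tag changes, this is cheap, but it relies on the underlying bytes already being valid UTF-8.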

data/lib/tomoto/2.7/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/3.0/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/3.1/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/3.2/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/lda.rb CHANGED

@@ -24,7 +24,7 @@ module Tomoto
     # TODO support multiple docs
     def infer(doc, iter: 100, tolerance: -1, workers: 0, parallel: :default, together: 0)
-      raise "cannot infer with untrained model" unless defined?(@prepared)
+      raise "cannot infer with untrained model" unless trained?
       _infer(doc, iter, tolerance, workers, to_ps(parallel), together)
     end
@@ -86,6 +86,7 @@ module Tomoto
       end
     end
+    # TODO raise error if iterations < 1
     def train(iterations = 10, workers: 0, parallel: :default)
       prepare
       _train(iterations, workers, to_ps(parallel))
@@ -97,6 +98,10 @@ module Tomoto
     private
+    def trained?
+      global_step.positive?
+    end
     def prepare
       unless defined?(@prepared)
         _prepare(@min_cf, @min_df, @rm_top)
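
The change to `infer` swaps an instance-variable check for a check on the model's step counter, which lines up with the changelog entry about `infer` failing on a loaded model. The sketch below mimics that situation with a stand-in class. `SketchModel`, its `load` method, and the way state is restored are all assumptions for illustration and are not tomoto's implementation.

```ruby
# Stand-in model: `global_step` represents state that is saved with the model,
# while @prepared only exists in the Ruby object that ran `train`.
class SketchModel
  attr_reader :global_step

  def initialize(global_step = 0)
    @global_step = global_step
  end

  def train(iterations = 10)
    @prepared = true
    @global_step += iterations
  end

  # Hypothetical loader: restores the saved step count but never sets @prepared.
  def self.load(saved_step)
    new(saved_step)
  end

  # Old-style check: depends on Ruby-side state.
  def prepared_flag?
    !!defined?(@prepared)
  end

  # New-style check: depends on state that travels with the model.
  def trained?
    global_step.positive?
  end
end

trained = SketchModel.new
trained.train(100)
loaded = SketchModel.load(trained.global_step)

puts loaded.prepared_flag? # the flag is missing on the loaded object
puts loaded.trained?       # the step counter still shows training happened
```

In this sketch a guard based on `prepared_flag?` would reject the loaded object even though it carries a trained state, while a guard based on `trained?` would accept it.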

data/lib/tomoto/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Tomoto
-  VERSION = "0.3.2"
+  VERSION = "0.3.3"
 end
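
The bump from 0.3.2 to 0.3.3 is a patch-level change. As a small illustration, RubyGems' own `Gem::Version` and `Gem::Requirement` classes can show how the two versions order and which constraints admit the new release. The constraint strings are examples chosen here, not taken from the package.

```ruby
require "rubygems"

old_version = Gem::Version.new("0.3.2")
new_version = Gem::Version.new("0.3.3")

puts new_version > old_version

# "~> 0.3.2" allows >= 0.3.2 and < 0.4, so it admits 0.3.3.
puts Gem::Requirement.new("~> 0.3.2").satisfied_by?(new_version)

# An exact pin on the old version does not.
puts Gem::Requirement.new("= 0.3.2").satisfied_by?(new_version)
```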

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: tomoto
 version: !ruby/object:Gem::Version
-  version: 0.3.2
+  version: 0.3.3
 platform: arm64-darwin
 authors:
 - Andrew Kane
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-01-23 00:00:00.000000000 Z
+date: 2023-02-02 00:00:00.000000000 Z
 dependencies: []
 description:
 email: andrew@ankane.org