RubyGems - tomoto - Versions diffs - 0.3.2-x86_64-darwin → 0.3.3-x86_64-darwin - Mend

tomoto 0.3.2-x86_64-darwin → 0.3.3-x86_64-darwin

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between versions as they appear in their public registries.

Files changed (12)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3d119a9cebe3238d7adec0b7599d44be4d0236b8141f926f7d38fbb7cac55b4c
-  data.tar.gz: 821d52e3399b0d380012c8c9e4baf1d7681d1879363921f1342fa14c427e239e
+  metadata.gz: b79fb57f7e14e6b483109a2ee2b9905b3ad30c5dc026477494d42238f6c3719d
+  data.tar.gz: 22f74746b73ad822f1fdd1cf8cabdcc28b995d1d3f18097c90ca2894dadb38f2
 SHA512:
-  metadata.gz: 4213d5f13f26e2fd41a1f6569fc54b4813315c9e99d03ece9639e6fe7311bb3918fe81d297954589f0ae5e499509816a9b97461b6e1d20c3a8553873799a8e1b
-  data.tar.gz: aa404ce2e31f8311245916ce022b8f3e6776d4db8cfcda25b9e9314b6bb83e7d85c559985911b2005bdf29e7ab1f82d5bd4379adb5414575767f31e039c8e762
+  metadata.gz: acbd74efa07f328b5326944bd836bbb55c310a9d4877f645785072bb08aaabf52453b18c2371f70573acb5806f491659460fdb7881d5733bb576b2694727275f
+  data.tar.gz: cd255e7ce35ef651c1cada54406631b9ddf71d7ee29af66a223bb12053b237e1619e294f2c6ff8a3481eac205a37e1d1e60c0b9d30cd2280c39b6f2c6c32f2ee
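The entries above are SHA256 and SHA512 digests of the two archives inside the gem (`metadata.gz` and `data.tar.gz`). As a minimal sketch of how such values could be checked locally, the snippet below compares a file's digests against expected hex strings using Ruby's standard `digest` library. The helper name `checksums_match?` and the temporary file standing in for `data.tar.gz` are assumptions made for illustration; they are not part of the gem or of RubyGems.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper (not part of tomoto or RubyGems): returns true when every
# expected digest, keyed by algorithm name, matches the file at `path`.
def checksums_match?(path, expected)
  actual = {
    "SHA256" => Digest::SHA256.file(path).hexdigest,
    "SHA512" => Digest::SHA512.file(path).hexdigest
  }
  # An algorithm name other than SHA256/SHA512 has no entry in `actual`,
  # so it is treated as a mismatch.
  expected.all? { |algo, hex| actual[algo] == hex }
end

# Demo: a temporary file stands in for an extracted data.tar.gz.
Tempfile.create("data.tar.gz") do |f|
  f.write("example payload")
  f.flush
  expected = { "SHA256" => Digest::SHA256.hexdigest("example payload") }
  puts checksums_match?(f.path, expected)
end
```

For a real check, `expected` would hold the values listed for the new version above, and `path` would point at the corresponding archive extracted from the `.gem` file.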

data/CHANGELOG.md CHANGED

@@ -1,3 +1,8 @@
+## 0.3.3 (2023-02-01)
+- Added `topic_label_dict` method to `LLDA`
+- Fixed error with `infer` with loaded model
 ## 0.3.2 (2023-01-22)
 - Added precompiled gem for Mac ARM

data/LICENSE.txt CHANGED

@@ -1,7 +1,7 @@
 MIT License
 Copyright (c) 2019, bab2min
-Copyright (c) 2020-2021 Andrew Kane
+Copyright (c) 2020-2023 Andrew Kane
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

data/README.md CHANGED

@@ -12,17 +12,15 @@ Add this line to your application’s Gemfile:
 gem "tomoto"
 ```
-ARM is not currently supported
 ## Getting Started
 Train a model
 ```ruby
 model = Tomoto::LDA.new(k: 2)
-model.add_doc("text from document one")
-model.add_doc("text from document two")
-model.add_doc("text from document three")
+model.add_doc(["tokens", "from", "document", "one"])
+model.add_doc(["tokens", "from", "document", "two"])
+model.add_doc(["tokens", "from", "document", "three"])
 model.train(100) # iterations
 ```
@@ -78,7 +76,7 @@ model.ll_per_word
 Perform inference for unseen documents
 ```ruby
-doc = model.make_doc("unseen doc")
+doc = model.make_doc(["unseen", "doc"])
 topic_dist, ll = model.infer(doc)
 ```
@@ -114,14 +112,6 @@ If a method or option you need isn’t supported, feel free to open an issue.
 - [LDA](examples/lda_basic.rb)
 - [HDP](examples/hdp_basic.rb)
-## Tokenization
-Documents are tokenized by whitespace by default, or you can perform your own tokenization.
-```ruby
-model.add_doc(["tokens", "from", "document", "one"])
-```
 ## Performance
 tomoto uses AVX2, AVX, or SSE2 instructions to increase performance on machines that support it. Check which instruction set architecture it’s using with:
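The README hunks above switch the examples from plain strings to arrays of tokens and drop the "Tokenization" section. For callers who still start from raw strings, whitespace tokenization can be sketched in plain Ruby as below. The helper name `whitespace_tokenize` is hypothetical, and the snippet makes no tomoto calls; whether `add_doc` still accepts a plain string in 0.3.3 is not shown by this diff.

```ruby
# Hypothetical helper: split a raw string into tokens on runs of whitespace.
# String#split with no arguments also drops leading and trailing whitespace.
def whitespace_tokenize(text)
  text.split
end

docs = [
  "text from document one",
  "text from document two"
]

# Each resulting array has the shape the updated README passes to add_doc.
tokenized = docs.map { |text| whitespace_tokenize(text) }
p tokenized.first # => ["text", "from", "document", "one"]
```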

data/ext/tomoto/llda.cpp CHANGED

@@ -29,5 +29,18 @@ void init_llda(Rice::Module& m) {
       "topics_per_label",
       [](tomoto::ILLDAModel& self) {
         return self.getNumTopicsPerLabel();
+      })
+    .define_method(
+      "topic_label_dict",
+      [](tomoto::ILLDAModel& self) {
+        auto dict = self.getTopicLabelDict();
+        Array res;
+        auto utf8 = Rice::Class(rb_cEncoding).call("const_get", "UTF_8");
+        for (size_t i = 0; i < dict.size(); i++) {
+          VALUE value = Rice::detail::To_Ruby<std::string>().convert(dict.toWord(i));
+          Object obj(value);
+          res.push(obj.call("force_encoding", utf8));
+        }
+        return res;
       });
 }

data/lib/tomoto/2.7/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/3.0/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/3.1/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/3.2/tomoto.bundle CHANGED

Binary file

data/lib/tomoto/lda.rb CHANGED

@@ -24,7 +24,7 @@ module Tomoto
     # TODO support multiple docs
     def infer(doc, iter: 100, tolerance: -1, workers: 0, parallel: :default, together: 0)
-      raise "cannot infer with untrained model" unless defined?(@prepared)
+      raise "cannot infer with untrained model" unless trained?
       _infer(doc, iter, tolerance, workers, to_ps(parallel), together)
     end
@@ -86,6 +86,7 @@ module Tomoto
       end
     end
+    # TODO raise error if iterations < 1
     def train(iterations = 10, workers: 0, parallel: :default)
       prepare
       _train(iterations, workers, to_ps(parallel))
@@ -97,6 +98,10 @@ module Tomoto
     private
+    def trained?
+      global_step.positive?
+    end
     def prepare
       unless defined?(@prepared)
         _prepare(@min_cf, @min_df, @rm_top)
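The guard change in `infer` above replaces a check on the Ruby-side `@prepared` instance variable with a check on the model's step counter, which lines up with the changelog entry "Fixed error with `infer` with loaded model". The sketch below illustrates the difference with a stand-in class. `FakeModel` and its methods are hypothetical and are not the tomoto API. It assumes that a loaded model never runs `prepare` in the current process (so `@prepared` is undefined) while its step count is restored; the diff does not state this explicitly.

```ruby
# Hypothetical stand-in for a topic model; NOT the tomoto API.
class FakeModel
  attr_reader :global_step

  def initialize(global_step: 0)
    @global_step = global_step
  end

  # Simulates loading a model trained in an earlier process: the step
  # counter is restored, but @prepared is never set on this object.
  def self.load(steps)
    new(global_step: steps)
  end

  def train(iterations)
    @prepared = true
    @global_step += iterations
  end

  # 0.3.2-style guard: true only if @prepared was set on this Ruby object.
  def old_guard_passes?
    !!defined?(@prepared)
  end

  # 0.3.3-style guard: true once at least one training step has run.
  def new_guard_passes?
    global_step.positive?
  end
end

loaded = FakeModel.load(100)
p loaded.old_guard_passes? # => false (the old guard would raise in infer)
p loaded.new_guard_passes? # => true

fresh = FakeModel.new
fresh.train(0)
p fresh.new_guard_passes?  # => false (zero iterations leave the counter at 0)
```

The last case shows why the new `TODO raise error if iterations < 1` comment is relevant: under this sketch's assumptions, training with zero iterations prepares the model but still leaves it untrained by the counter-based check.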

data/lib/tomoto/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Tomoto
-  VERSION = "0.3.2"
+  VERSION = "0.3.3"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: tomoto
 version: !ruby/object:Gem::Version
-  version: 0.3.2
+  version: 0.3.3
 platform: x86_64-darwin
 authors:
 - Andrew Kane
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-01-23 00:00:00.000000000 Z
+date: 2023-02-02 00:00:00.000000000 Z
 dependencies: []
 description:
 email: andrew@ankane.org