tomoto 0.3.2-x86_64-linux → 0.3.3-x86_64-linux (RubyGems)

Files changed (12)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: ee297a6bc0b0924af7d5eb880daf0922e57635fa24dbced92261c99a5a321330
-  data.tar.gz: '0163269051abc276e4748337c1ad5d9e92cf563fce54a5b541cc88db5958acef'
+  metadata.gz: 2bac6fcf87194d591166c8adaa6a5cf5e46a5c31b8c7a9b88b7924d41f3888a2
+  data.tar.gz: '0549edad5ab0133ea02b86b53b59e6bd76c3a4b9251ecbfb1b4525d1c721d287'
 SHA512:
-  metadata.gz: 73459d05fa990e8a5b44c845b20bc599659db1d8d22dc15dcdd5a28aa99acb40e8d0b338601126b1d39afeeb297e8ec235029dbc6d175df9b4811d9afbf5b997
-  data.tar.gz: fe0f88f2c4fbbae79674827ef69592fba7a6ea5d08e2d19f357f94a58bbdfae7c9c3c847027fb17710581332624ac67f29088ab3c83ab34139d85d40fcee41f2
+  metadata.gz: fffeaedac57ec10c59faf46ad0179394c94d05f098e0d32e0228bb744a80f3207b700a9acf1b1fe862ce6832274936f881b5566fddd4d4301583fdc22ce3a677
+  data.tar.gz: 974fd982373f3b7a90eea1eeb48f9e889656a1f10e5c2886ad38901d79f3e67983fd56fc696a44112c2d6f02c406a0e4b8de2a040ea60454baa9bdb10b22a1bb

data/CHANGELOG.md CHANGED

@@ -1,3 +1,8 @@
+## 0.3.3 (2023-02-01)
+- Added `topic_label_dict` method to `LLDA`
+- Fixed error with `infer` with loaded model
 ## 0.3.2 (2023-01-22)
 - Added precompiled gem for Mac ARM

data/LICENSE.txt CHANGED

@@ -1,7 +1,7 @@
 MIT License
 Copyright (c) 2019, bab2min
-Copyright (c) 2020-2021 Andrew Kane
+Copyright (c) 2020-2023 Andrew Kane
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

data/README.md CHANGED

@@ -12,17 +12,15 @@ Add this line to your application’s Gemfile:
 gem "tomoto"
 ```
-ARM is not currently supported
 ## Getting Started
 Train a model
 ```ruby
 model = Tomoto::LDA.new(k: 2)
-model.add_doc("text from document one")
-model.add_doc("text from document two")
-model.add_doc("text from document three")
+model.add_doc(["tokens", "from", "document", "one"])
+model.add_doc(["tokens", "from", "document", "two"])
+model.add_doc(["tokens", "from", "document", "three"])
 model.train(100) # iterations
 ```
@@ -78,7 +76,7 @@ model.ll_per_word
 Perform inference for unseen documents
 ```ruby
-doc = model.make_doc("unseen doc")
+doc = model.make_doc(["unseen", "doc"])
 topic_dist, ll = model.infer(doc)
 ```
@@ -114,14 +112,6 @@ If a method or option you need isn’t supported, feel free to open an issue.
 - [LDA](examples/lda_basic.rb)
 - [HDP](examples/hdp_basic.rb)
-## Tokenization
-Documents are tokenized by whitespace by default, or you can perform your own tokenization.
-```ruby
-model.add_doc(["tokens", "from", "document", "one"])
-```
 ## Performance
 tomoto uses AVX2, AVX, or SSE2 instructions to increase performance on machines that support it. Check which instruction set architecture it’s using with:
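
The README hunks above replace string arguments with token arrays and drop the note about default whitespace tokenization. A minimal sketch of building such arrays in plain Ruby, assuming lowercasing plus whitespace splitting is good enough for your text; `tokenize` is a hypothetical helper and not part of tomoto:

```ruby
# Hypothetical helper (not part of tomoto): turn raw text into a token array.
def tokenize(text)
  # String#split with no argument splits on runs of whitespace.
  text.downcase.split
end

docs = ["Text from document one", "text from  document two"]
token_lists = docs.map { |d| tokenize(d) }

# Each element is an array of strings, the shape the updated README passes
# to add_doc and make_doc, e.g.:
#   token_lists.each { |tokens| model.add_doc(tokens) }
```

Real corpora usually need more than this (punctuation stripping, stop words), so treat the helper as a placeholder for your own tokenizer.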

data/ext/tomoto/llda.cpp CHANGED

@@ -29,5 +29,18 @@ void init_llda(Rice::Module& m) {
       "topics_per_label",
       [](tomoto::ILLDAModel& self) {
         return self.getNumTopicsPerLabel();
+      })
+    .define_method(
+      "topic_label_dict",
+      [](tomoto::ILLDAModel& self) {
+        auto dict = self.getTopicLabelDict();
+        Array res;
+        auto utf8 = Rice::Class(rb_cEncoding).call("const_get", "UTF_8");
+        for (size_t i = 0; i < dict.size(); i++) {
+          VALUE value = Rice::detail::To_Ruby<std::string>().convert(dict.toWord(i));
+          Object obj(value);
+          res.push(obj.call("force_encoding", utf8));
+        }
+        return res;
       });
 }
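
The new `topic_label_dict` binding calls `force_encoding` with `Encoding::UTF_8` on every label before returning it. The sketch below shows what that call does in plain Ruby. It assumes, for illustration only, that a label arrives tagged as binary; the diff does not say which encoding the converted string actually carries.

```ruby
# Bytes for a 4-character word, tagged as ASCII-8BIT (binary) for illustration.
raw = "caf\xC3\xA9".b

# force_encoding relabels the same bytes as UTF-8 without transcoding them.
label = raw.dup.force_encoding(Encoding::UTF_8)

puts raw.encoding    # the binary tag
puts label.encoding  # the UTF-8 tag
puts raw.length      # counted per byte
puts label.length    # counted per character
```

Because the bytes are unchanged, the result is only valid when the underlying data really is UTF-8; `label.valid_encoding?` can be used to check.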

data/lib/tomoto/2.7/tomoto.so CHANGED

Binary file

data/lib/tomoto/3.0/tomoto.so CHANGED

Binary file

data/lib/tomoto/3.1/tomoto.so CHANGED

Binary file

data/lib/tomoto/3.2/tomoto.so CHANGED

Binary file

data/lib/tomoto/lda.rb CHANGED

@@ -24,7 +24,7 @@ module Tomoto
     # TODO support multiple docs
     def infer(doc, iter: 100, tolerance: -1, workers: 0, parallel: :default, together: 0)
-      raise "cannot infer with untrained model" unless defined?(@prepared)
+      raise "cannot infer with untrained model" unless trained?
       _infer(doc, iter, tolerance, workers, to_ps(parallel), together)
     end
@@ -86,6 +86,7 @@ module Tomoto
       end
     end
+    # TODO raise error if iterations < 1
     def train(iterations = 10, workers: 0, parallel: :default)
       prepare
       _train(iterations, workers, to_ps(parallel))
@@ -97,6 +98,10 @@ module Tomoto
     private
+    def trained?
+      global_step.positive?
+    end
     def prepare
       unless defined?(@prepared)
         _prepare(@min_cf, @min_df, @rm_top)
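
The `infer` guard above moves from `defined?(@prepared)` to `global_step.positive?`, which lines up with the changelog entry about `infer` on a loaded model. A minimal sketch of why the two checks can disagree, assuming that loading restores the step counter without running `prepare` in the current Ruby object. `SketchModel` and its `load` method are stand-ins written for this illustration, not the real `Tomoto::LDA` class:

```ruby
# Stand-in for a model; not the real Tomoto::LDA class.
class SketchModel
  attr_reader :global_step

  def initialize(global_step: 0)
    @global_step = global_step
  end

  # Hypothetical loader: restores the step count but never calls prepare,
  # so @prepared stays unset on the new object.
  def self.load(saved_step)
    new(global_step: saved_step)
  end

  def train(iterations)
    prepare
    @global_step += iterations
  end

  # Mirrors the 0.3.2-style check on the @prepared instance variable.
  def old_guard_passes?
    instance_variable_defined?(:@prepared)
  end

  # Mirrors the 0.3.3-style check on the training step counter.
  def new_guard_passes?
    global_step.positive?
  end

  private

  def prepare
    @prepared = true
  end
end

fresh = SketchModel.new
trained = SketchModel.new
trained.train(100)
loaded = SketchModel.load(100)

[fresh, trained, loaded].each do |m|
  puts "old=#{m.old_guard_passes?} new=#{m.new_guard_passes?}"
end
```

Under that assumption only the loaded model separates the two guards: the old check rejects it even though it carries training steps, while the new check accepts it.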

data/lib/tomoto/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Tomoto
-  VERSION = "0.3.2"
+  VERSION = "0.3.3"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: tomoto
 version: !ruby/object:Gem::Version
-  version: 0.3.2
+  version: 0.3.3
 platform: x86_64-linux
 authors:
 - Andrew Kane
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-01-23 00:00:00.000000000 Z
+date: 2023-02-02 00:00:00.000000000 Z
 dependencies: []
 description:
 email: andrew@ankane.org