RubyGems - ruby-spacy - Versions diffs - 0.2.2 → 0.2.3 - Mend

ruby-spacy 0.2.2 → 0.2.3

Files changed (11)

checksums.yaml +4 -4
data/CHANGELOG.md +4 -0
data/README.md +153 -41
data/examples/openai_integration/openai_completion.rb +1 -1
data/examples/openai_integration/openai_query_1.rb +1 -1
data/examples/openai_integration/openai_query_2.rb +1 -1
data/examples/openai_integration/openai_query_3.rb +72 -57
data/examples/openai_integration/openai_query_4.rb +3 -3
data/lib/ruby-spacy/version.rb +1 -1
data/lib/ruby-spacy.rb +32 -13
metadata +3 -3

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 9e9edb55398e8926b4fd9c06d65b49538129e34a960098b1ad20535d64a2b787
-  data.tar.gz: 639fa3186d563480d0eb268fa948d8b97428fcdf37887dfff64954fcfc86c1f0
+  metadata.gz: 9c9ca5b4cba8eb115192aa0b5a45216d12a9d9e4cdddc253ba55ace52e778afd
+  data.tar.gz: 197c61acfa742048fefff05b35d6045e17dd5cf212667c277537fb984a0ff926
 SHA512:
-  metadata.gz: 74367e0cd67a3537b20f73427baf626ada1f123d9c34da1a55795a905c3cfd8239c5cc1a04e6cf92c8312c6338a6300ce95b837d24c642c3dbb77733a25060ed
-  data.tar.gz: 4723555e09a6416ec8cb5727b3344756be36a26f65429759508aaee697b245960065cb74699b7f619d552a67b574dbc70da1b53a0ddc434ade45901b0ca72dd7
+  metadata.gz: 950daeb4f8ee140a15bacf18ea3228f2604a552df8aa12be52fb7a488c78e67b894b8678fbe6fbed74da54beb714e89d02ab1bd46d5c59a908b8ddfbc5c9e7c0
+  data.tar.gz: 84b183babd37f9120c0ac2332eec23dff30d3180da165aaf044bf72ef4be7af4efc2b339ad5ac5b489e3e3b9b44ba33d3df4fca287addbbed05cfa4201b79d75
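
The entries above are SHA256 and SHA512 digests of the `metadata.gz` and `data.tar.gz` members of the gem archive. As a minimal sketch of how such a digest could be checked locally, the snippet below uses Ruby's standard `Digest` library. The helper name `checksum_matches?` is hypothetical and not part of RubyGems or ruby-spacy, and the demonstration hashes a throwaway temp file rather than a real gem member.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper (not part of RubyGems or ruby-spacy): returns true when
# the file at `path` hashes to `expected_hex` under the given digest class.
def checksum_matches?(path, expected_hex, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end

# Demonstration with a throwaway file standing in for an extracted data.tar.gz.
Tempfile.create("data.tar.gz") do |f|
  f.write("example payload")
  f.flush
  expected = Digest::SHA256.hexdigest("example payload")
  puts checksum_matches?(f.path, expected)   # digest computed from the same bytes
  puts checksum_matches?(f.path, "0" * 64)   # deliberately wrong digest
end
```

To check against the SHA512 values, pass `algorithm: Digest::SHA512` together with the corresponding hex string.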

data/CHANGELOG.md CHANGED

@@ -1,5 +1,9 @@
 # Change Log
+## 0.2.3 - 2024-08-27
+- Timeout option added to `Spacy::Language.new`
+- Default OpenAI models updated to `gpt-4o-mini`
 ## 0.2.0 - 2022-10-02
 - spaCy 3.7.0 supported

data/README.md CHANGED

@@ -13,11 +13,12 @@
 | ✅ | Access to pre-trained word vectors                 |
 | ✅ | OpenAI Chat/Completion/Embeddings API integration  |
-Current Version: `0.2.2`
+Current Version: `0.2.3`
-- Addressed installation issues in some environments
+- spaCy 3.7.0 supported
+- OpenAI API integration
-## Installation of prerequisites
+## Installation of Prerequisites
 **IMPORTANT**: Make sure that the `enable-shared` option is enabled in your Python installation. You can use [pyenv](https://github.com/pyenv/pyenv) to install any version of Python you like. Install Python 3.10.6, for instance, using pyenv with `enable-shared` as follows:
@@ -109,7 +110,7 @@ Output:
 |:-----:|:--:|:-------:|:--:|:------:|:----:|:-------:|:---:|:-:|:--:|:-------:|
 | Apple | is | looking | at | buying | U.K. | startup | for | $ | 1  | billion |
-### Part-of-speech and dependency
+### Part-of-speech and Dependency
 → [spaCy: Part-of-speech tags and dependencies](https://spacy.io/usage/spacy-101#annotations-pos-deps)
@@ -149,7 +150,7 @@ Output:
 | 1       | 1       | NUM   | CD  | compound |
 | billion | billion | NUM   | CD  | pobj     |
-### Part-of-speech and dependency (Japanese)
+### Part-of-speech and Dependency (Japanese)
 Ruby code:
@@ -234,7 +235,7 @@ Output:
 | 1       | d     | false    | false   | NumType = Card                                                                      |
 | billion | xxxx  | true     | false   | NumType = Card                                                                      |
-### Visualizing dependency
+### Visualizing Dependency
 → [spaCy: Visualizers](https://spacy.io/usage/visualizers)
@@ -259,7 +260,7 @@ Output:
 ![](https://github.com/yohasebe/ruby-spacy/blob/main/examples/get_started/outputs/test_dep.svg)
-### Visualizing dependency (compact)
+### Visualizing Dependency (Compact)
 Ruby code:
@@ -282,7 +283,7 @@ Output:
 ![](https://github.com/yohasebe/ruby-spacy/blob/main/examples/get_started/outputs/test_dep_compact.svg)
-### Named entity recognition
+### Named Entity Recognition
 → [spaCy: Named entities](https://spacy.io/usage/spacy-101#annotations-ner)
@@ -314,7 +315,7 @@ Output:
 | U.K.       | 27         | 31       | GPE   |
 | $1 billion | 44         | 54       | MONEY |
-### Named entity recognition (Japanese)
+### Named Entity Recognition (Japanese)
 Ruby code:
@@ -347,7 +348,7 @@ Output:
 | ファミコン | 10    | 15  | PRODUCT |
 | 14,800円   | 16    | 23  | MONEY   |
-### Checking availability of word vectors
+### Checking Availability of Word Vectors
 → [spaCy: Word vectors and similarity](https://spacy.io/usage/spacy-101#vectors-similarity)
@@ -380,7 +381,7 @@ Output:
 | banana  | true       | 6.700014    | false  |
 | afskfsd | false      | 0.0         | true   |
-### Similarity calculation
+### Similarity Calculation
 Ruby code:
@@ -405,7 +406,7 @@ Doc 2: Fast food tastes very good.
 Similarity: 0.7687607012190486
 ```
-### Similarity calculation (Japanese)
+### Similarity Calculation (Japanese)
 Ruby code:
@@ -428,7 +429,7 @@ doc2: あいにくの悪天候で残念です。
 Similarity: 0.8684192637149641
 ```
-### Word vector calculation
+### Word Vector Calculation
 **Tokyo - Japan + France = Paris ?**
@@ -475,7 +476,7 @@ Output:
 | 10   | marseille   | 0.6370999813079834 |
-### Word vector calculation (Japanese)
+### Word Vector Calculation (Japanese)
 **東京 - 日本 + フランス = パリ ?**
@@ -524,7 +525,9 @@ Output:
 ## OpenAI API Integration
-Easily leverage GPT models within ruby-spacy by using an OpenAI API key. When constructing prompts for the `Doc::openai_query` method, you can incorporate various token properties from the document. These properties are retrieved through function calls and seamlessly integrated into your prompt (`gpt-3.5-turbo-0613` or greater is needed). The available properties include:
+> ⚠️ This feature is currently experimental. Details are subject to change. Please refer to OpenAI's [API reference](https://platform.openai.com/docs/api-reference) and [Ruby OpenAI](https://github.com/alexrudall/ruby-openai) for available parameters (`max_tokens`, `temperature`, etc.).
+Easily leverage GPT models within ruby-spacy by using an OpenAI API key. When constructing prompts for the `Doc::openai_query` method, you can incorporate the following token properties of the document. These properties are retrieved through function calls (made internally by GPT when necessary) and seamlessly integrated into your prompt. Note that function calls need `gpt-4o-mini` or greater. The available properties include:
 - `surface`
 - `lemma`
@@ -534,7 +537,7 @@ Easily leverage GPT models within ruby-spacy by using an OpenAI API key. When co
 - `ent_type` (entity type)
 - `morphology`
-### GPT Prompting 1
+### GPT Prompting (Translation)
 Ruby code:
@@ -549,7 +552,7 @@ doc = nlp.read("The Beatles released 12 studio albums")
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
+# model: "gpt-4o-mini"
 res1 = doc.openai_query(
   access_token: api_key,
   prompt: "Translate the text to Japanese."
@@ -561,7 +564,7 @@ Output:
 > ビートルズは12枚のスタジオアルバムをリリースしました。
-### GPT Prompting 2
+### GPT Prompting (Elaboration)
 Ruby code:
@@ -572,6 +575,10 @@ api_key = ENV["OPENAI_API_KEY"]
 nlp = Spacy::Language.new("en_core_web_sm")
 doc = nlp.read("The Beatles were an English rock band formed in Liverpool in 1960.")
+# default parameter values
+# max_tokens: 1000
+# temperature: 0.7
+# model: "gpt-4o-mini"
 res = doc.openai_query(
   access_token: api_key,
   prompt: "Extract the topic of the document and list 10 entities (names, concepts, locations, etc.) that are relevant to the topic."
@@ -580,21 +587,112 @@ res = doc.openai_query(
 Output:
-> Topic: The Beatles
+> **Topic:** The Beatles
+>
+> **Relevant Entities:**
 >
-> Entities:
-> 1. The Beatles (band)
-> 2. English (nationality)
-> 3. Rock band
-> 4. Liverpool (city)
-> 5. 1960 (year)
-> 6. John Lennon (member)
-> 7. Paul McCartney (member)
-> 8. George Harrison (member)
-> 9. Ringo Starr (member)
-> 10. Music
-### GPT Prompting 3
+> 1. The Beatles (PERSON)
+> 2. Liverpool (GPE - Geopolitical Entity)
+> 3. English (LANGUAGE)
+> 4. Rock (MUSIC GENRE)
+> 5. 1960 (DATE)
+> 6. Band (MUSIC GROUP)
+> 7. John Lennon (PERSON - key member)
+> 8. Paul McCartney (PERSON - key member)
+> 9. George Harrison (PERSON - key member)
+> 10. Ringo Starr (PERSON - key member)
+### GPT Prompting (JSON Output Using RAG with Token Properties)
+Ruby code:
+```ruby
+require "ruby-spacy"
+api_key = ENV["OPENAI_API_KEY"]
+nlp = Spacy::Language.new("en_core_web_sm")
+doc = nlp.read("The Beatles released 12 studio albums")
+# default parameter values
+# max_tokens: 1000
+# temperature: 0.7
+# model: "gpt-4o-mini"
+res = doc.openai_query(
+  access_token: api_key,
+  prompt: "List token data of each of the words used in the sentence. Add 'meaning' property and value (brief semantic definition) to each token data. Output as a JSON object."
+)
+```
+Output:
+```json
+{
+  "tokens": [
+    {
+      "surface": "The",
+      "lemma": "the",
+      "pos": "DET",
+      "tag": "DT",
+      "dep": "det",
+      "ent_type": "",
+      "morphology": "{'Definite': 'Def', 'PronType': 'Art'}",
+      "meaning": "A definite article used to specify a noun."
+    },
+    {
+      "surface": "Beatles",
+      "lemma": "beatle",
+      "pos": "NOUN",
+      "tag": "NNS",
+      "dep": "nsubj",
+      "ent_type": "GPE",
+      "morphology": "{'Number': 'Plur'}",
+      "meaning": "A British rock band formed in Liverpool in 1960."
+    },
+    {
+      "surface": "released",
+      "lemma": "release",
+      "pos": "VERB",
+      "tag": "VBD",
+      "dep": "ROOT",
+      "ent_type": "",
+      "morphology": "{'Tense': 'Past', 'VerbForm': 'Fin'}",
+      "meaning": "To make something available to the public."
+    },
+    {
+      "surface": "12",
+      "lemma": "12",
+      "pos": "NUM",
+      "tag": "CD",
+      "dep": "nummod",
+      "ent_type": "CARDINAL",
+      "morphology": "{'NumType': 'Card'}",
+      "meaning": "A cardinal number representing the quantity of twelve."
+    },
+    {
+      "surface": "studio",
+      "lemma": "studio",
+      "pos": "NOUN",
+      "tag": "NN",
+      "dep": "compound",
+      "ent_type": "",
+      "morphology": "{'Number': 'Sing'}",
+      "meaning": "A place where recording or filming takes place."
+    },
+    {
+      "surface": "albums",
+      "lemma": "album",
+      "pos": "NOUN",
+      "tag": "NNS",
+      "dep": "dobj",
+      "ent_type": "",
+      "morphology": "{'Number': 'Plur'}",
+      "meaning": "Collections of music tracks or recordings."
+    }
+  ]
+}
+```
+### GPT Prompting (Generate a Syntax Tree Using Token Properties)
 Ruby code:
@@ -603,11 +701,15 @@ require "ruby-spacy"
 api_key = ENV["OPENAI_API_KEY"]
 nlp = Spacy::Language.new("en_core_web_sm")
+doc = nlp.read("The Beatles released 12 studio albums")
+# default parameter values
+# max_tokens: 1000
+# temperature: 0.7
 res = doc.openai_query(
   access_token: api_key,
   model: "gpt-4",
-  prompt: "Generate a tree diagram from the text in the following style: [S [NP [Det the] [N cat]] [VP [V sat] [PP [P on] [NP the mat]]]"
+  prompt: "Generate a tree diagram from the text using given token data. Use the following bracketing style: [S [NP [Det the] [N cat]] [VP [V sat] [PP [P on] [NP the mat]]]"
 )
 puts res
 ```
@@ -647,14 +749,14 @@ doc = nlp.read("Vladimir Nabokov was a")
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
+# model: "gpt-4o-mini"
 res = doc.openai_completion(access_token: api_key)
 puts res
 ```
 Output:
-> Russian-American novelist and lepidopterist. He was born in 1899 in St. Petersburg, Russia, and later emigrated to the United States in 1940. Nabokov is best known for his novel "Lolita," which was published in 1955 and caused much controversy due to its controversial subject matter. Throughout his career, Nabokov wrote many other notable works, including "Pale Fire" and "Ada or Ardor: A Family Chronicle." In addition to his writing, Nabokov was also a passionate butterfly collector and taxonomist, publishing several scientific papers on the subject. He passed away in 1977, leaving behind a rich literary legacy.
+> Vladimir Nabokov was a Russian-American novelist, poet, and entomologist, best known for his intricate prose style and innovative narrative techniques. He is most famously recognized for his controversial novel "Lolita," which explores themes of obsession and manipulation. Nabokov's works often reflect his fascination with language, memory, and the nature of art. In addition to his literary accomplishments, he was also a passionate lepidopterist, contributing to the field of butterfly studies. His literary career spanned several decades, and his influence continues to be felt in contemporary literature.
 ### Text Embeddings
@@ -676,12 +778,22 @@ puts res
 Output:
 ```
--0.00208362
--0.01645165
- 0.0110955965
- 0.012802119
- 0.0012175755
- ...
+-0.0023891362
+-0.016671216
+0.010879759
+0.012918914
+0.0012281279
+...
+```
+## Advanced Usage
+### Setting a Timeout
+You can set a timeout for the `Spacy::Language.new` method:
+```ruby
+nlp = Spacy::Language.new("en_core_web_sm", timeout: 120) # Set timeout to 120 seconds
 ```
 ## Author

data/examples/openai_integration/openai_completion.rb CHANGED

@@ -12,7 +12,7 @@ doc = nlp.read("Vladimir Nabokov was a")
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
+# model: "gpt-4o-mini"
 res = doc.openai_completion(access_token: api_key)
 puts res

data/examples/openai_integration/openai_query_1.rb CHANGED

@@ -12,7 +12,7 @@ doc = nlp.read("The Beatles released 12 studio albums")
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
+# model: "gpt-4o-mini"
 res = doc.openai_query(access_token: api_key, prompt: "Translate the text to Japanese.")
 puts res

data/examples/openai_integration/openai_query_2.rb CHANGED

@@ -12,7 +12,7 @@ doc = nlp.read("The Beatles were an English rock band formed in Liverpool in 196
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
+# model: "gpt-4o-mini"
 res = doc.openai_query(access_token: api_key, prompt: "Extract the topic of the document and list 10 entities (names, concepts, locations, etc.) that are relevant to the topic.")
 puts res

data/examples/openai_integration/openai_query_3.rb CHANGED

@@ -12,63 +12,78 @@ doc = nlp.read("The Beatles released 12 studio albums")
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
-res = doc.openai_query(access_token: api_key, prompt: "List detailed morphology data of each of the word used in the sentence")
+# model: "gpt-4o-mini"
+res = doc.openai_query(
+  access_token: api_key,
+  prompt: "List token data of each of the words used in the sentence. Add 'meaning' property and value (brief semantic definition) to each token data. Output as a JSON object.",
+  max_tokens: 1000,
+  temperature: 0.7,
+  model: "gpt-4o-mini"
+)
 puts res
-# Here is the detailed morphology data for each word in the sentence:
-#
-# 1. Token: "The"
-#    - Surface: "The"
-#    - Lemma: "the"
-#    - Part-of-speech: Determiner (DET)
-#    - Tag: DT
-#    - Dependency: Determiner (det)
-#    - Entity type: None
-#    - Morphology: {'Definite': 'Def', 'PronType': 'Art'}
-#
-# 2. Token: "Beatles"
-#    - Surface: "Beatles"
-#    - Lemma: "beatle"
-#    - Part-of-speech: Noun (NOUN)
-#    - Tag: NNS
-#    - Dependency: Noun subject (nsubj)
-#    - Entity type: GPE (Geopolitical Entity)
-#    - Morphology: {'Number': 'Plur'}
-#
-# 3. Token: "released"
-#    - Surface: "released"
-#    - Lemma: "release"
-#    - Part-of-speech: Verb (VERB)
-#    - Tag: VBD
-#    - Dependency: Root
-#    - Entity type: None
-#    - Morphology: {'Tense': 'Past', 'VerbForm': 'Fin'}
-#
-# 4. Token: "12"
-#    - Surface: "12"
-#    - Lemma: "12"
-#    - Part-of-speech: Numeral (NUM)
-#    - Tag: CD
-#    - Dependency: Numeric modifier (nummod)
-#    - Entity type: Cardinal number (CARDINAL)
-#    - Morphology: {'NumType': 'Card'}
-#
-# 5. Token: "studio"
-#    - Surface: "studio"
-#    - Lemma: "studio"
-#    - Part-of-speech: Noun (NOUN)
-#    - Tag: NN
-#    - Dependency: Compound
-#    - Entity type: None
-#    - Morphology: {'Number': 'Sing'}
-#
-# 6. Token: "albums"
-#    - Surface: "albums"
-#    - Lemma: "album"
-#    - Part-of-speech: Noun (NOUN)
-#    - Tag: NNS
-#    - Dependency: Direct object (dobj)
-#    - Entity type: None
-#    - Morphology: {'Number': 'Plur'}
+# {
+#   "tokens": [
+#     {
+#       "surface": "The",
+#       "lemma": "the",
+#       "pos": "DET",
+#       "tag": "DT",
+#       "dep": "det",
+#       "ent_type": "",
+#       "morphology": "{'Definite': 'Def', 'PronType': 'Art'}",
+#       "meaning": "Used to refer to one or more people or things already mentioned or assumed to be common knowledge"
+#     },
+#     {
+#       "surface": "Beatles",
+#       "lemma": "beatle",
+#       "pos": "NOUN",
+#       "tag": "NNS",
+#       "dep": "nsubj",
+#       "ent_type": "GPE",
+#       "morphology": "{'Number': 'Plur'}",
+#       "meaning": "A British rock band formed in Liverpool in 1960"
+#     },
+#     {
+#       "surface": "released",
+#       "lemma": "release",
+#       "pos": "VERB",
+#       "tag": "VBD",
+#       "dep": "ROOT",
+#       "ent_type": "",
+#       "morphology": "{'Tense': 'Past', 'VerbForm': 'Fin'}",
+#       "meaning": "To make something available or known to the public"
+#     },
+#     {
+#       "surface": "12",
+#       "lemma": "12",
+#       "pos": "NUM",
+#       "tag": "CD",
+#       "dep": "nummod",
+#       "ent_type": "CARDINAL",
+#       "morphology": "{'NumType': 'Card'}",
+#       "meaning": "A number representing a quantity"
+#     },
+#     {
+#       "surface": "studio",
+#       "lemma": "studio",
+#       "pos": "NOUN",
+#       "tag": "NN",
+#       "dep": "compound",
+#       "ent_type": "",
+#       "morphology": "{'Number': 'Sing'}",
+#       "meaning": "A place where creative work is done"
+#     },
+#     {
+#       "surface": "albums",
+#       "lemma": "album",
+#       "pos": "NOUN",
+#       "tag": "NNS",
+#       "dep": "dobj",
+#       "ent_type": "",
+#       "morphology": "{'Number': 'Plur'}",
+#       "meaning": "A collection of musical or spoken recordings"
+#     }
+#   ]
+# }
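
The new example prints the model's reply as JSON text. As a rough sketch of how that text might be consumed, the snippet below parses an abridged copy of the sample output with Ruby's standard `json` library. It assumes the reply is a bare JSON object, which a model is not guaranteed to produce. The helper `parse_morphology` is hypothetical and only handles the simple `{'Key': 'Value'}` strings shown in the sample.

```ruby
require "json"

# Sample reply text, abridged from the output shown in the diff above.
res = <<~JSON
  {
    "tokens": [
      { "surface": "Beatles", "lemma": "beatle", "pos": "NOUN", "ent_type": "GPE",
        "morphology": "{'Number': 'Plur'}" },
      { "surface": "12", "lemma": "12", "pos": "NUM", "ent_type": "CARDINAL",
        "morphology": "{'NumType': 'Card'}" }
    ]
  }
JSON

# Hypothetical helper: turns "{'Number': 'Plur'}" into { "Number" => "Plur" }.
# It is a sketch for flat, single-quoted key/value strings only.
def parse_morphology(text)
  text.scan(/'([^']+)':\s*'([^']*)'/).to_h
end

# JSON.parse raises JSON::ParserError if the reply is not valid JSON.
tokens = JSON.parse(res).fetch("tokens")
tokens.each do |token|
  morph = parse_morphology(token["morphology"])
  puts "#{token['surface']}: #{token['pos']} #{morph}"
end
```

Because the model's output is free text, wrapping `JSON.parse` in a `rescue JSON::ParserError` is a reasonable precaution in real use.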

data/examples/openai_integration/openai_query_4.rb CHANGED

@@ -12,11 +12,11 @@ doc = nlp.read("The Beatles released 12 studio albums")
 # default parameter values
 # max_tokens: 1000
 # temperature: 0.7
-# model: "gpt-3.5-turbo-0613"
+# model: "gpt-4o-mini"
 res = doc.openai_query(
   access_token: api_key,
-  model: "gpt-4",
-  prompt: "Generate a tree diagram from the text in the following style: [S [NP [Det the] [N cat]] [VP [V sat] [PP [P on] [NP the mat]]]"
+  model: "gpt-4o",
+  prompt: "Generate a tree diagram from the text using given token data. Use the following bracketing style: [S [NP [Det the] [N cat]] [VP [V sat] [PP [P on] [NP the mat]]]"
 )
 puts res

data/lib/ruby-spacy/version.rb CHANGED

@@ -2,5 +2,5 @@
 module Spacy
   # The version number of the module
-  VERSION = "0.2.2"
+  VERSION = "0.2.3"
 end

data/lib/ruby-spacy.rb CHANGED

@@ -1,10 +1,21 @@
 # frozen_string_literal: true
 require_relative "ruby-spacy/version"
-require "strscan"
 require "numpy"
-require "pycall"
 require "openai"
+require "pycall"
+require "strscan"
+require "timeout"
+begin
+  PyCall.init
+  _spacy = PyCall.import_module("spacy")
+rescue PyCall::PyError => e
+  puts "Failed to initialize PyCall or import spacy: #{e.message}"
+  puts "Python traceback:"
+  puts e.traceback
+  raise
+end
 # This module covers the areas of spaCy functionality for _using_ many varieties of its language models, not for _building_ ones.
 module Spacy
@@ -216,7 +227,7 @@ module Spacy
     def openai_query(access_token: nil,
                      max_tokens: 1000,
                      temperature: 0.7,
-                     model: "gpt-3.5-turbo-0613",
+                     model: "gpt-4o-mini",
                      messages: [],
                      prompt: nil)
       if messages.empty?
@@ -291,7 +302,7 @@ module Spacy
       end
     end
-    def openai_completion(access_token: nil, max_tokens: 1000, temperature: 0.7, model: "gpt-3.5-turbo-0613")
+    def openai_completion(access_token: nil, max_tokens: 1000, temperature: 0.7, model: "gpt-4o-mini")
       messages = [
         { role: "system", content: "Complete the text input by the user." },
         { role: "user", content: @text }
@@ -355,16 +366,24 @@ module Spacy
     # Creates a language model instance, which is conventionally referred to by a variable named `nlp`.
     # @param model [String] A language model installed in the system
-    def initialize(model = "en_core_web_sm", max_retrial: MAX_RETRIAL, retrial: 0)
+    def initialize(model = "en_core_web_sm", max_retrial: MAX_RETRIAL, retrial: 0, timeout: 60)
       @spacy_nlp_id = "nlp_#{model.object_id}"
-      PyCall.exec("import spacy; #{@spacy_nlp_id} = spacy.load('#{model}')")
-      @py_nlp = PyCall.eval(@spacy_nlp_id)
-    rescue StandardError
-      retrial += 1
-      raise "Error: Pycall failed to load Spacy" unless retrial <= max_retrial
-      sleep 0.5
-      initialize(model, max_retrial: max_retrial, retrial: retrial)
+      begin
+        Timeout.timeout(timeout) do
+          PyCall.exec("import spacy; #{@spacy_nlp_id} = spacy.load('#{model}')")
+        end
+        @py_nlp = PyCall.eval(@spacy_nlp_id)
+      rescue Timeout::Error
+        raise "PyCall execution timed out after #{timeout} seconds"
+      rescue StandardError => e
+        retrial += 1
+        if retrial <= max_retrial
+          sleep 0.5
+          retry
+        else
+          raise "Failed to initialize Spacy after #{max_retrial} attempts: #{e.message}"
+        end
+      end
     end
     # Reads and analyzes the given text.
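
The rewritten `initialize` combines two behaviors: a timed-out attempt aborts immediately, while any other `StandardError` is retried up to `max_retrial` times. The sketch below isolates that control flow so it can be run without PyCall or spaCy. The helper name `with_timeout_and_retries`, its `wait` parameter, and the default of 5 retries are assumptions for illustration; the gem's actual `MAX_RETRIAL` value is not shown in this diff.

```ruby
require "timeout"

# Hypothetical helper mirroring the control flow of the new initialize:
# Timeout::Error is not retried, other StandardErrors are retried.
def with_timeout_and_retries(timeout: 60, max_retrial: 5, wait: 0.5)
  retrial = 0
  begin
    Timeout.timeout(timeout) { yield }
  rescue Timeout::Error
    # Raising here is not caught by the sibling rescue clause below.
    raise "Execution timed out after #{timeout} seconds"
  rescue StandardError => e
    retrial += 1
    if retrial <= max_retrial
      sleep wait
      retry # re-runs the begin block, so the caller's block is invoked again
    else
      raise "Failed after #{max_retrial} attempts: #{e.message}"
    end
  end
end

# A stand-in for a flaky load: fails twice, then succeeds.
attempts = 0
result = with_timeout_and_retries(timeout: 1, max_retrial: 3, wait: 0) do
  attempts += 1
  raise IOError, "transient failure" if attempts < 3
  "loaded"
end
puts "#{result} after #{attempts} attempts" # prints "loaded after 3 attempts"
```

The order of the `rescue` clauses matters: `Timeout::Error` is itself a `StandardError`, so it must be listed first or a timeout would be treated as a retryable failure.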

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-spacy
 version: !ruby/object:Gem::Version
-  version: 0.2.2
+  version: 0.2.3
 platform: ruby
 authors:
 - Yoichiro Hasebe
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-10-03 00:00:00.000000000 Z
+date: 2024-08-27 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -224,7 +224,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.4.12
+rubygems_version: 3.4.13
 signing_key:
 specification_version: 4
 summary: A wrapper module for using spaCy natural language processing library from