RubyGems - ML_Ruby - Versions diffs - 0.1.1 → 0.1.2 - Mend

ML_Ruby 0.1.1 → 0.1.2

Files changed (7)

checksums.yaml +4 -4
data/README.md +27 -4
data/lib/ML_Ruby/version.rb +1 -1
data/lib/ML_Ruby.rb +13 -0
data/lib/python/natural_language_processing/support_vector_machine.py +29 -0
metadata +3 -3
data/lib/python/natural_language_processing/text_summary.py +0 -60

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 64fda0d7d407a58ec1df7fa697db1e5a08d9fe8b4ca4b28e5533c50d985fa1c6
-  data.tar.gz: 26b26aac3b3260c77eda5aca6b4e04f3aa1320dd1c725a29d3d25b0e98a7c467
+  metadata.gz: 95869d52047e12b4e821f8955856b4dbe0d4dff4efa797a44300fed365103b7a
+  data.tar.gz: 6130da39a94674ee68cc5d7c2e834b104f664e82cc19d4d14f9b97b3af297f29
 SHA512:
-  metadata.gz: 55dc5ada8223599c639067eaf2cc301491b3c5fccf4d9a6d37ffeab286bec206cdd1154d9bd858f17a2002c9a31255e672660bf4e53f16b7214652796b8e2ca2
-  data.tar.gz: 1c9c36e421b9295ae086619c06fa333595d910128e0c0e4e4b4257360f2f9b84621742eb20bac257553fdbb31c1a273062e3a8ae955db92092778330b234160a
+  metadata.gz: 0bf46c64cd162b99fa821dd2e51c48ed4559b294d3a6d2ac33941f4fc7672b28eb20ac975085dbb7dfa2a278e633bca3760ce1f2d6d2811fdbb07b9372679a61
+  data.tar.gz: 63a6abaa30bc00fe04b2937188e7bf22ce67eff41878d77dcaef3e985c26bda05af1eaf4dd6a563a006f6f44d942c21a26354e90f6788b9a9287075cc7488851
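
The digests above can be checked against a downloaded copy of the gem. A `.gem` file is a plain tar archive that contains `metadata.gz` and `data.tar.gz`, so once it is unpacked each member can be hashed locally. A minimal sketch, assuming `data.tar.gz` has already been extracted to the working directory; the helper name `sha256_matches?` is hypothetical and is not part of RubyGems or ML_Ruby.

```ruby
require "digest"

# Hypothetical helper: returns true when the SHA256 hex digest of the file
# at `path` equals `expected_hex` (for example, a value from checksums.yaml).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end
```

For the 0.1.2 release, the expected value for `data.tar.gz` would be the `+` SHA256 line in the hunk above.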

data/README.md CHANGED

@@ -1,14 +1,14 @@
 # MLRuby
-This Ruby gem leverages Machine Learning(ML) techniques to make predictions(forecasts) and classifications in various applications. It provides capabilities such as predicting next month's billing, forecasting upcoming sales orders, determining user approval status, classifying text, generating similarity scores, and making recommendations. It uses Python3 under the hood, powered by popular machine learning techniques including NLP(Natural Language Processing), Decision Tree, K-Nearest Neighbors and Linear Regression algorithms.
+This Ruby gem leverages Machine Learning(ML) techniques to make predictions(forecasts) and classifications in various applications. It provides capabilities such as predicting next month's billing, generating new house/apartment prices, forecasting upcoming sales orders, determining user approval status, classifying text, generating similarity scores, and making recommendations. It uses Python3 under the hood, powered by popular machine learning techniques including NLP(Natural Language Processing), Decision Tree, K-Nearest Neighbors, Random Forest and Linear Regression algorithms.
 # Pre-requisite
-1. Please make sure you have Python3 installed in your Machine. The gem will run `which python3` to locate your installed python3 in your Machine. Usually it is installed at `/usr/bin/python3`
+1. Please make sure you have Python3 installed in your Machine. The gem will run `which python3` command internally in order to locate your installed python3 in your Machine. Usually it is installed at `/usr/bin/python3`
-2. Please make sure you have `scikit-learn` and `pandas` python libraries are installed in Machine.
+2. Please make sure you have `scikit-learn` and `pandas` python libraries are installed in your Machine.
-Here are examples of how to install these python libraries via the command line in MacOS. Install `nltk` if you really need to work with Natural Language Processing(NLP)
+Here are examples of how to install these python libraries via the command line in MacOS and Linux. Install `nltk` if you really need to work with Natural Language Processing(NLP)
 `/usr/bin/python3 -m pip install scikit-learn`
@@ -163,6 +163,29 @@ training_messages = [
     ]
   predictions = ml.predict(new_messages)
 ```
+ - ### Natural Language Processing(NLP): Support Vector Machine Algorithm - Categorize/Classify Comments/Texts/Documents/Articles
+    Imagine you're managing a customer feedback system, and you want to categorize customer comments effectively. With the capabilities of this gem, you can seamlessly classify comments/texts/documents/articles into their appropriate categories.
+```
+    training_documents = [
+      "Machine learning techniques include neural networks and decision trees.",
+      "Web development skills are essential for building modern websites.",
+      "Natural language processing (NLP) is a subfield of artificial intelligence.",
+      "Data science involves data analysis and statistical modeling.",
+      "Computer vision is used in image and video processing applications."
+    ]
+    categories = ["Machine Learning", "Web Development", "Artificial Intelligence", "Data Science", "Computer Vision"]
+    ml = MLRuby::NaturalLanguageProcessing::SupportVectorMachine::Model.new(training_documents, categories)
+    new_documents = [
+      "I am Ruby On Rails Expert",
+      "I am interested in understanding natural language processing.",
+      "I want to pursue an academic degree on neural networks.",
+      "I have more than 12 years of professional working experience in JavaScript stack"
+    ]
+    predictions = ml.predict(new_documents)
+```
 It's important to note that the size of your training dataset plays a significant role in enhancing the accuracy of the model's predictions. By incorporating real-world, authentic data and expanding the amount of training data for the model, it gains a better understanding of patterns and trends within the data which leads to more precise and reliable predictions.
 ## Contributing

data/lib/ML_Ruby/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module MLRuby
-  VERSION = "0.1.1"
+  VERSION = "0.1.2"
 end

data/lib/ML_Ruby.rb CHANGED

@@ -74,6 +74,19 @@ module MLRuby
         end
       end
     end
+    module SupportVectorMachine
+      class Model
+        def initialize(training_data=[], categories=[])
+          @training_data = training_data
+          @categories = categories
+        end
+        def predict(new_data=[])
+          script_path = "#{Gem.loaded_specs['ML_Ruby'].gem_dir}/lib/python/natural_language_processing/support_vector_machine.py"
+          result = `#{MLRuby::PYTHON_PATH} #{script_path} '#{@training_data}' '#{@categories}' '#{new_data}'`
+          result.gsub("\n", "").gsub(/[\[\]]/, '').split("' '").map { |element| element.gsub(/'/, '') }
+        end
+      end
+    end
   end
 end
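
The added `predict` method interpolates the three arrays into a backtick command, each wrapped in single quotes. One likely consequence is that an apostrophe inside any document (for example "you're") ends the shell argument early. A minimal sketch of an alternative, assuming the Python script keeps reading its inputs from `sys.argv[1..3]`: pass the arguments as a vector through `Open3.capture2`, which does not invoke a shell when given more than one argument. The helper names `build_svm_command` and `run_svm` are hypothetical and are not part of the gem.

```ruby
require "open3"

# Hypothetical helper: builds the argument vector for the SVM script.
# Each array is serialized with #inspect, the same text that the original
# string interpolation produces, so the Python side can stay unchanged.
def build_svm_command(python_path, script_path, training_data, categories, new_data)
  [python_path, script_path, training_data.inspect, categories.inspect, new_data.inspect]
end

# Hypothetical helper: runs the command without a shell, so quotes inside
# the documents reach the script as literal characters.
def run_svm(python_path, script_path, training_data, categories, new_data)
  command = build_svm_command(python_path, script_path, training_data, categories, new_data)
  stdout, status = Open3.capture2(*command)
  raise "SVM script failed with status #{status.exitstatus}" unless status.success?
  stdout
end
```

In the gem, `python_path` would correspond to `MLRuby::PYTHON_PATH` and `script_path` to the path built from `Gem.loaded_specs['ML_Ruby'].gem_dir`. The returned string would still need the same post-processing that `predict` applies.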

data/lib/python/natural_language_processing/support_vector_machine.py ADDED

@@ -0,0 +1,29 @@
+import numpy as np
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.svm import SVC
+from sklearn.metrics import classification_report
+import sys
+import ast
+class SupportVectorMachine:
+    def __init__(self, documents, categories):
+      self.documents = documents
+      self.categories = categories
+    def fit_and_transform(self):
+      self.vectorizer = TfidfVectorizer(max_features=1000, stop_words='english')
+      X = self.vectorizer.fit_transform(self.documents)
+      self.classifier = SVC(kernel='linear')
+      self.classifier.fit(X, self.categories)
+documents = ast.literal_eval(sys.argv[1])
+categories = ast.literal_eval(sys.argv[2])
+new_documents = ast.literal_eval(sys.argv[3])
+support_vector_machine_model = SupportVectorMachine(documents, categories)
+support_vector_machine_model.fit_and_transform()
+new_text_vectorized = support_vector_machine_model.vectorizer.transform(new_documents)
+predicted_topics = support_vector_machine_model.classifier.predict(new_text_vectorized)
+print(predicted_topics)
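
The script ends by printing a NumPy array of strings, which the Ruby `predict` method then turns back into a Ruby array with a chain of `gsub` and `split` calls. A minimal sketch of that parsing step in isolation, reusing the same expression as `lib/ML_Ruby.rb`; the function name `parse_numpy_string_array` and the sample text are assumptions, with the sample modeled on how NumPy typically prints string arrays (space-separated, single-quoted, with long arrays wrapped onto indented lines).

```ruby
# Hypothetical wrapper around the parsing expression used in predict.
def parse_numpy_string_array(raw)
  raw.gsub("\n", "")                      # join wrapped lines
     .gsub(/[\[\]]/, "")                  # drop the surrounding brackets
     .split("' '")                        # split between quoted elements
     .map { |element| element.gsub(/'/, "") } # strip the remaining quotes
end

# Assumed sample of the script's stdout, including a wrapped second line.
sample = "['Web Development' 'Artificial Intelligence'\n 'Machine Learning']\n"
p parse_numpy_string_array(sample)
# prints ["Web Development", "Artificial Intelligence", "Machine Learning"]
```

Because the expression removes every bracket and apostrophe, a category label that itself contains one of those characters would not survive the round trip unchanged.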

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ML_Ruby
 version: !ruby/object:Gem::Version
-  version: 0.1.1
+  version: 0.1.2
 platform: ruby
 authors:
 - Abdul Barek
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2023-09-18 00:00:00.000000000 Z
+date: 2023-09-19 00:00:00.000000000 Z
 dependencies: []
 description: This Ruby gem leverages Machine Learning(ML) techniques to make predictions(forecasts)
   and classifications in various applications. It provides capabilities such as predicting
@@ -32,8 +32,8 @@ files:
 - lib/python/decision_tree_classifier.py
 - lib/python/k_nearest_neighbors.py
 - lib/python/linear_regression.py
+- lib/python/natural_language_processing/support_vector_machine.py
 - lib/python/natural_language_processing/text_classifier.py
-- lib/python/natural_language_processing/text_summary.py
 - lib/python/random_forest.py
 - sig/ML_Ruby.rbs
 homepage: https://github.com/barek2k2/ML_Ruby

data/lib/python/natural_language_processing/text_summary.py DELETED

@@ -1,60 +0,0 @@
-import requests
-# Replace 'YOUR_API_KEY' with your actual Hugging Face API key
-api_key = 'YOUR_API_KEY'
-# Text to be summarized
-text = """
-Freedom of expression is a fundamental human right that holds immense importance in any democratic society. It serves as a platform for individuals to voice their opinions, share their thoughts and ideas, and engage in meaningful discussions and debates. However, the extent to which freedom of expression is upheld and practised varies from one country to another. In Bangladesh's case, although freedom of expression is enshrined in the constitution, its implementation faces significant challenges.
-At first glance, it may appear that people in Bangladesh enjoy the freedom to express themselves without fear of persecution. The media also seems relatively free to report on matters of public interest. However, a closer examination reveals the existence of substantial limitations to this freedom.
-One of the serious issues with freedom of expression in Bangladesh is its heavy bias towards certain individuals and groups. While some individuals, particularly those who are part of the ruling elite or possess political influence, can freely criticise the government and its policies, others do not have the same privilege. Those who lack political connections or are not part of the ruling elite face restrictions in expressing their opinions. They often become targets of persecution when they dare to speak out against the government or its policies.
-Furthermore, freedom of expression is confined to specific topics in the country. People are allowed to discuss issues that do not directly affect the larger population, but they are prohibited from being critical of the government's failures or the shortcomings of its institutions. To achieve this, the government employs various tactics such as intimidation, harassment, imprisonment, and even violence. The media is not immune to such suppression, as journalists are frequently targeted and attacked for reporting on sensitive issues. Such restrictions do not align with the principles of democracy.
-In Bangladesh, a unique style has emerged, favouring those who have political connections or are part of the ruling elite. This creates a significant power imbalance, silencing the voices of the majority while allowing the minority with access to power to dominate the conversation.
-How did the country arrive at this point? Is it solely due to the efforts of successive governments to silence dissidents, or have the institutions tasked with protecting the rights of the people also succumbed to vested interests?
-While it is valid to discuss the failure of institutions, a sizable portion of the blame falls on the media in Bangladesh as well. Impartial and unbiased journalism is increasingly limited in the country – not only due to informal embargoes and formal legal restrictions but also due to other factors. One crucial factor is the prevalence of media and journalists being aligned with partisan politics. While journalists having political alignments is not inherently wrong, it becomes problematic when they forget the fundamental principle of journalism: presenting the facts. Journalists with political biases or perceived biases often manipulate or distort information to support a specific political agenda or gain personal advantages. They tend to align themselves with the political party in power, compromising the freedom of expression that is crucial in a democratic society.
-Another concerning trend in Bangladesh is the influence of businesses or business conglomerates that own media outlets. These entities often leverage their media ownership to protect their business interests by eliminating competitors or evading accountability for crimes and wrongdoing. In such cases, the media outlets tend to align closely with the power structure, serving the interests of those in power, rather than speaking up for the people.
-Even in the presence of formal and informal restrictions imposed by the state, journalists should be able to report freely on matters of public interest. This freedom is essential for the press to remain relevant and create a platform for open dialogue among the people, their representatives, and public institutions. By doing so, the press can uphold the citizens' right to freedom of expression
-Journalists and media organisations in Bangladesh must reaffirm their commitment to impartiality, independence, and the presentation of facts. By adhering to these principles, they can counteract the influence of partisan politics and business interests on journalism and contribute to the preservation of press freedom in the country. Additionally, efforts should be made to foster a media environment that encourages diverse perspectives and safeguards the integrity of journalism.
-In a democracy, freedom of expression encompasses the right to criticise the government and its policies, and the press is expected to be free from bias. If Bangladesh aims to progress towards a more democratic society, it must ensure that freedom of expression is guaranteed to all its citizens, irrespective of their political affiliations or social status. But the question remains: how can the country do so?
-"""
-# Define the endpoint URL
-endpoint = 'https://api-inference.huggingface.co/models/facebook/bart-large-cnn'
-# Define the headers with your API key
-api_key = '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'
-headers = {
-    'Authorization': f'Bearer {api_key}',
-}
-# Define the payload with the text to be summarized
-payload = {
-    'inputs': text,
-    'options': {
-        'task': 'summarization',
-    },
-}
-# Make a POST request to the Hugging Face API
-response = requests.post(endpoint, json=payload, headers=headers)
-# Check if the request was successful
-if response.status_code == 200:
-    result = response.json()
-    print(result)
-    #summary = result['summary']
-    #print("Summary:")
-    #print(summary)
-else:
-    print("Error:", response.status_code, response.text)