pyrlutils 0.0.2__tar.gz → 0.0.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.


This version of pyrlutils might be problematic; see the registry's advisory page for more details.

Files changed (38)
  1. pyrlutils-0.0.4/.circleci/config.yml +76 -0
  2. pyrlutils-0.0.4/.gitignore +252 -0
  3. pyrlutils-0.0.4/.pyup.yml +5 -0
  4. pyrlutils-0.0.4/MANIFEST.in +3 -0
  5. {pyrlutils-0.0.2/pyrlutils.egg-info → pyrlutils-0.0.4}/PKG-INFO +17 -9
  6. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/README.md +3 -0
  7. pyrlutils-0.0.4/pyproject.toml +42 -0
  8. pyrlutils-0.0.4/pyrlutils/bandit/algo.py +128 -0
  9. pyrlutils-0.0.4/pyrlutils/bandit/reward.py +11 -0
  10. pyrlutils-0.0.4/pyrlutils/openai/__init__.py +0 -0
  11. pyrlutils-0.0.4/pyrlutils/openai/utils.py +31 -0
  12. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/reward.py +1 -1
  13. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/state.py +31 -2
  14. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/transition.py +0 -19
  15. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/valuefcns.py +1 -1
  16. {pyrlutils-0.0.2 → pyrlutils-0.0.4/pyrlutils.egg-info}/PKG-INFO +17 -9
  17. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils.egg-info/SOURCES.txt +11 -1
  18. pyrlutils-0.0.4/pyrlutils.egg-info/requires.txt +7 -0
  19. pyrlutils-0.0.4/test/__init__.py +0 -0
  20. pyrlutils-0.0.4/test/test_bandits.py +82 -0
  21. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_frozenlake.py +2 -1
  22. pyrlutils-0.0.2/pyrlutils.egg-info/requires.txt +0 -2
  23. pyrlutils-0.0.2/setup.py +0 -53
  24. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/LICENSE +0 -0
  25. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/__init__.py +0 -0
  26. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/action.py +0 -0
  27. /pyrlutils-0.0.2/MANIFEST.in → /pyrlutils-0.0.4/pyrlutils/bandit/__init__.py +0 -0
  28. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils/policy.py +0 -0
  29. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils.egg-info/dependency_links.txt +0 -0
  30. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils.egg-info/not-zip-safe +0 -0
  31. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/pyrlutils.egg-info/top_level.txt +0 -0
  32. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/setup.cfg +0 -0
  33. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_2ddiscrete.py +0 -0
  34. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_2dmaze.py +0 -0
  35. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_action.py +0 -0
  36. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_continous_state_actions.py +0 -0
  37. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_state.py +0 -0
  38. {pyrlutils-0.0.2 → pyrlutils-0.0.4}/test/test_transprobs.py +0 -0
@@ -0,0 +1,76 @@
1
+ version: 2
2
+
3
+ shared: &shared
4
+ working_directory: ~/pyrlutils
5
+
6
+ steps:
7
+ - checkout
8
+
9
+ - run:
10
+ name: Apt Install
11
+ command: |
12
+ sudo apt-get update
13
+ sudo apt-get install -y libc6
14
+ sudo apt-get install -y g++
15
+
16
+ - run:
17
+ name: Installing Packages
18
+ command: |
19
+ pip install --upgrade --user pip
20
+ pip install --upgrade --user .
21
+ pip install --upgrade --user .[openaigym]
22
+
23
+ - run:
24
+ name: Run Unit Tests
25
+ command: |
26
+ python -m unittest
27
+
28
+
29
+
30
+ jobs:
31
+ py37:
32
+ <<: *shared
33
+ docker:
34
+ - image: cimg/python:3.7
35
+
36
+ py38:
37
+ <<: *shared
38
+ docker:
39
+ - image: cimg/python:3.8
40
+
41
+ py39:
42
+ <<: *shared
43
+ docker:
44
+ - image: cimg/python:3.9
45
+
46
+ py310:
47
+ <<: *shared
48
+ docker:
49
+ - image: cimg/python:3.10
50
+
51
+ py311:
52
+ <<: *shared
53
+ docker:
54
+ - image: cimg/python:3.11
55
+
56
+ py312:
57
+ <<: *shared
58
+ docker:
59
+ - image: cimg/python:3.12
60
+
61
+ py313:
62
+ <<: *shared
63
+ docker:
64
+ - image: cimg/python:3.13
65
+
66
+ workflows:
67
+ version: 2
68
+ build:
69
+ jobs:
70
+ - py37
71
+ - py38
72
+ - py39
73
+ - py310
74
+ - py311
75
+ - py312
76
+ - py313
@@ -0,0 +1,252 @@
1
+ ### VisualStudioCode template
2
+ .vscode/*
3
+ !.vscode/settings.json
4
+ !.vscode/tasks.json
5
+ !.vscode/launch.json
6
+ !.vscode/extensions.json
7
+ *.code-workspace
8
+
9
+ # Local History for Visual Studio Code
10
+ .history/
11
+
12
+ ### JetBrains template
13
+ # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio, WebStorm and Rider
14
+ # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839
15
+
16
+ # User-specific stuff
17
+ .idea/**/workspace.xml
18
+ .idea/**/tasks.xml
19
+ .idea/**/usage.statistics.xml
20
+ .idea/**/dictionaries
21
+ .idea/**/shelf
22
+
23
+ # Generated files
24
+ .idea/**/contentModel.xml
25
+
26
+ # Sensitive or high-churn files
27
+ .idea/**/dataSources/
28
+ .idea/**/dataSources.ids
29
+ .idea/**/dataSources.local.xml
30
+ .idea/**/sqlDataSources.xml
31
+ .idea/**/dynamic.xml
32
+ .idea/**/uiDesigner.xml
33
+ .idea/**/dbnavigator.xml
34
+
35
+ # Gradle
36
+ .idea/**/gradle.xml
37
+ .idea/**/libraries
38
+
39
+ # Gradle and Maven with auto-import
40
+ # When using Gradle or Maven with auto-import, you should exclude module files,
41
+ # since they will be recreated, and may cause churn. Uncomment if using
42
+ # auto-import.
43
+ # .idea/artifacts
44
+ # .idea/compiler.xml
45
+ # .idea/jarRepositories.xml
46
+ # .idea/modules.xml
47
+ # .idea/*.iml
48
+ # .idea/modules
49
+ # *.iml
50
+ # *.ipr
51
+
52
+ # CMake
53
+ cmake-build-*/
54
+
55
+ # Mongo Explorer plugin
56
+ .idea/**/mongoSettings.xml
57
+
58
+ # File-based project format
59
+ *.iws
60
+
61
+ # IntelliJ
62
+ out/
63
+
64
+ # mpeltonen/sbt-idea plugin
65
+ .idea_modules/
66
+
67
+ # JIRA plugin
68
+ atlassian-ide-plugin.xml
69
+
70
+ # Cursive Clojure plugin
71
+ .idea/replstate.xml
72
+
73
+ # Crashlytics plugin (for Android Studio and IntelliJ)
74
+ com_crashlytics_export_strings.xml
75
+ crashlytics.properties
76
+ crashlytics-build.properties
77
+ fabric.properties
78
+
79
+ # Editor-based Rest Client
80
+ .idea/httpRequests
81
+
82
+ # Android studio 3.1+ serialized cache file
83
+ .idea/caches/build_file_checksums.ser
84
+
85
+ ### VirtualEnv template
86
+ # Virtualenv
87
+ # http://iamzed.com/2009/05/07/a-primer-on-virtualenv/
88
+ .Python
89
+ [Bb]in
90
+ [Ii]nclude
91
+ [Ll]ib
92
+ [Ll]ib64
93
+ [Ll]ocal
94
+ [Ss]cripts
95
+ pyvenv.cfg
96
+ .venv
97
+ pip-selfcheck.json
98
+
99
+ ### JupyterNotebooks template
100
+ # gitignore template for Jupyter Notebooks
101
+ # website: http://jupyter.org/
102
+
103
+ .ipynb_checkpoints
104
+ */.ipynb_checkpoints/*
105
+
106
+ # IPython
107
+ profile_default/
108
+ ipython_config.py
109
+
110
+ # Remove previous ipynb_checkpoints
111
+ # git rm -r .ipynb_checkpoints/
112
+
113
+ ### Python template
114
+ # Byte-compiled / optimized / DLL files
115
+ __pycache__/
116
+ *.py[cod]
117
+ *$py.class
118
+
119
+ # C extensions
120
+ *.so
121
+
122
+ # Distribution / packaging
123
+ .Python
124
+ build/
125
+ develop-eggs/
126
+ dist/
127
+ downloads/
128
+ eggs/
129
+ .eggs/
130
+ lib/
131
+ lib64/
132
+ parts/
133
+ sdist/
134
+ var/
135
+ wheels/
136
+ share/python-wheels/
137
+ *.egg-info/
138
+ .installed.cfg
139
+ *.egg
140
+ MANIFEST
141
+
142
+ # PyInstaller
143
+ # Usually these files are written by a python script from a template
144
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
145
+ *.manifest
146
+ *.spec
147
+
148
+ # Installer logs
149
+ pip-log.txt
150
+ pip-delete-this-directory.txt
151
+
152
+ # Unit test / coverage reports
153
+ htmlcov/
154
+ .tox/
155
+ .nox/
156
+ .coverage
157
+ .coverage.*
158
+ .cache
159
+ nosetests.xml
160
+ coverage.xml
161
+ *.cover
162
+ *.py,cover
163
+ .hypothesis/
164
+ .pytest_cache/
165
+ cover/
166
+
167
+ # Translations
168
+ *.mo
169
+ *.pot
170
+
171
+ # Django stuff:
172
+ *.log
173
+ local_settings.py
174
+ db.sqlite3
175
+ db.sqlite3-journal
176
+
177
+ # Flask stuff:
178
+ instance/
179
+ .webassets-cache
180
+
181
+ # Scrapy stuff:
182
+ .scrapy
183
+
184
+ # Sphinx documentation
185
+ docs/_build/
186
+
187
+ # PyBuilder
188
+ .pybuilder/
189
+ target/
190
+
191
+ # Jupyter Notebook
192
+ .ipynb_checkpoints
193
+
194
+ # IPython
195
+ profile_default/
196
+ ipython_config.py
197
+
198
+ # pyenv
199
+ # For a library or package, you might want to ignore these files since the code is
200
+ # intended to run in multiple environments; otherwise, check them in:
201
+ # .python-version
202
+
203
+ # pipenv
204
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
205
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
206
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
207
+ # install all needed dependencies.
208
+ #Pipfile.lock
209
+
210
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow
211
+ __pypackages__/
212
+
213
+ # Celery stuff
214
+ celerybeat-schedule
215
+ celerybeat.pid
216
+
217
+ # SageMath parsed files
218
+ *.sage.py
219
+
220
+ # Environments
221
+ .env
222
+ .venv
223
+ env/
224
+ venv/
225
+ ENV/
226
+ env.bak/
227
+ venv.bak/
228
+
229
+ # Spyder project settings
230
+ .spyderproject
231
+ .spyproject
232
+
233
+ # Rope project settings
234
+ .ropeproject
235
+
236
+ # mkdocs documentation
237
+ /site
238
+
239
+ # mypy
240
+ .mypy_cache/
241
+ .dmypy.json
242
+ dmypy.json
243
+
244
+ # Pyre type checker
245
+ .pyre/
246
+
247
+ # pytype static type analyzer
248
+ .pytype/
249
+
250
+ # Cython debug symbols
251
+ cython_debug/
252
+
@@ -0,0 +1,5 @@
1
+ # autogenerated pyup.io config file
2
+ # see https://pyup.io/docs/configuration/ for all available options
3
+
4
+ schedule: ''
5
+ update: false
@@ -0,0 +1,3 @@
1
+ include README.md
2
+ include pyproject.toml
3
+ include LICENSE
@@ -1,14 +1,14 @@
1
- Metadata-Version: 2.1
1
+ Metadata-Version: 2.2
2
2
  Name: pyrlutils
3
- Version: 0.0.2
3
+ Version: 0.0.4
4
4
  Summary: Utility and Helpers for Reinformcement Learning
5
- Home-page: https://github.com/stephenhky/PyRLUtils
6
- Author: Kwan-Yuet Ho
7
- Author-email: stephenhky@yahoo.com.hk
5
+ Author-email: Kwan Yuet Stephen Ho <stephenhky@yahoo.com.hk>
8
6
  License: MIT
9
- Keywords: machine learning,reinforcement leaning,artifiial intelligence
10
- Platform: UNKNOWN
7
+ Project-URL: Repository, https://github.com/stephenhky/PyRLUtils
8
+ Project-URL: Issues, https://github.com/stephenhky/PyRLUtils/issues
9
+ Keywords: machine learning,reinforcement leaning,artificial intelligence
11
10
  Classifier: Topic :: Scientific/Engineering :: Mathematics
11
+ Classifier: License :: OSI Approved :: MIT License
12
12
  Classifier: Topic :: Software Development :: Libraries :: Python Modules
13
13
  Classifier: Topic :: Software Development :: Version Control :: Git
14
14
  Classifier: Programming Language :: Python :: 3.7
@@ -16,10 +16,17 @@ Classifier: Programming Language :: Python :: 3.8
16
16
  Classifier: Programming Language :: Python :: 3.9
17
17
  Classifier: Programming Language :: Python :: 3.10
18
18
  Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
19
20
  Classifier: Intended Audience :: Science/Research
20
21
  Classifier: Intended Audience :: Developers
22
+ Requires-Python: >=3.7
21
23
  Description-Content-Type: text/markdown
22
24
  License-File: LICENSE
25
+ Requires-Dist: numpy
26
+ Provides-Extra: openaigym
27
+ Requires-Dist: gymnasium; extra == "openaigym"
28
+ Provides-Extra: test
29
+ Requires-Dist: unittest; extra == "test"
23
30
 
24
31
  # PyRLUtils
25
32
 
@@ -27,8 +34,9 @@ License-File: LICENSE
27
34
  [![GitHub release](https://img.shields.io/github/release/stephenhky/PyRLUtils.svg?maxAge=3600)](https://github.com/stephenhky/pyqentangle/PyRLUtils)
28
35
  [![pypi](https://img.shields.io/pypi/v/PyRLUtils.svg?maxAge=3600)](https://pypi.org/project/pyqentangle/)
29
36
  [![download](https://img.shields.io/pypi/dm/PyRLUtils.svg?maxAge=2592000&label=installs&color=%2327B1FF)](https://pypi.org/project/PyRLUtils/)
37
+ [![Updates](https://pyup.io/repos/github/stephenhky/PyRLUtils/shield.svg)](https://pyup.io/repos/github/stephenhky/PyRLUtils/)
38
+ [![Python 3](https://pyup.io/repos/github/stephenhky/PyRLUtils/python-3-shield.svg)](https://pyup.io/repos/github/stephenhky/PyRLUtils/)
39
+
30
40
 
31
41
  This is a Python package with utility classes and helper functions for
32
42
  that facilitates the development of any reinformecement learning projects.
33
-
34
-
@@ -4,6 +4,9 @@
4
4
  [![GitHub release](https://img.shields.io/github/release/stephenhky/PyRLUtils.svg?maxAge=3600)](https://github.com/stephenhky/pyqentangle/PyRLUtils)
5
5
  [![pypi](https://img.shields.io/pypi/v/PyRLUtils.svg?maxAge=3600)](https://pypi.org/project/pyqentangle/)
6
6
  [![download](https://img.shields.io/pypi/dm/PyRLUtils.svg?maxAge=2592000&label=installs&color=%2327B1FF)](https://pypi.org/project/PyRLUtils/)
7
+ [![Updates](https://pyup.io/repos/github/stephenhky/PyRLUtils/shield.svg)](https://pyup.io/repos/github/stephenhky/PyRLUtils/)
8
+ [![Python 3](https://pyup.io/repos/github/stephenhky/PyRLUtils/python-3-shield.svg)](https://pyup.io/repos/github/stephenhky/PyRLUtils/)
9
+
7
10
 
8
11
  This is a Python package with utility classes and helper functions for
9
12
  that facilitates the development of any reinformecement learning projects.
@@ -0,0 +1,42 @@
1
+ [build-system]
2
+ requires = ["setuptools", "setuptools-scm", "wheel"]
3
+ build-backend = "setuptools.build_meta"
4
+
5
+ [project]
6
+ name = "pyrlutils"
7
+ version = "0.0.4"
8
+ authors = [
9
+ {name = "Kwan Yuet Stephen Ho", email = "stephenhky@yahoo.com.hk"}
10
+ ]
11
+ description = "Utility and Helpers for Reinformcement Learning"
12
+ readme = {file = "README.md", content-type = "text/markdown"}
13
+ license = {text = "MIT"}
14
+ keywords = ["machine learning", "reinforcement leaning", "artificial intelligence"]
15
+ requires-python = ">=3.7"
16
+ classifiers = [
17
+ "Topic :: Scientific/Engineering :: Mathematics",
18
+ "License :: OSI Approved :: MIT License",
19
+ "Topic :: Software Development :: Libraries :: Python Modules",
20
+ "Topic :: Software Development :: Version Control :: Git",
21
+ "Programming Language :: Python :: 3.7",
22
+ "Programming Language :: Python :: 3.8",
23
+ "Programming Language :: Python :: 3.9",
24
+ "Programming Language :: Python :: 3.10",
25
+ "Programming Language :: Python :: 3.11",
26
+ "Programming Language :: Python :: 3.12",
27
+ "Intended Audience :: Science/Research",
28
+ "Intended Audience :: Developers",
29
+ ]
30
+ dependencies = ["numpy"]
31
+
32
+ [project.urls]
33
+ Repository = "https://github.com/stephenhky/PyRLUtils"
34
+ Issues = "https://github.com/stephenhky/PyRLUtils/issues"
35
+
36
+ [tool.setuptools]
37
+ packages = ["pyrlutils", "pyrlutils.bandit", "pyrlutils.openai"]
38
+ zip-safe = false
39
+
40
+ [project.optional-dependencies]
41
+ openaigym = ["gymnasium"]
42
+ test = ["unittest"]
@@ -0,0 +1,128 @@
1
+
2
+ from abc import ABC, abstractmethod
3
+
4
+ import numpy as np
5
+
6
+ from .reward import IndividualBanditRewardFunction
7
+
8
+
9
class BanditAlgorithm(ABC):
    """Abstract base class for multi-armed bandit algorithms.

    Stores the candidate action values and the reward function; concrete
    subclasses supply the per-iteration learning step and action selection.
    """

    def __init__(self, action_values: list, reward_function: IndividualBanditRewardFunction):
        self._action_values = action_values
        self._reward_function = reward_function

    @abstractmethod
    def _go_one_loop(self):
        """Perform a single learning iteration (subclass responsibility)."""

    def loop(self, nbiterations: int):
        """Run ``nbiterations`` learning iterations."""
        for _iteration in range(nbiterations):
            self._go_one_loop()

    def reward(self, action_value) -> float:
        """Evaluate the reward function for a single action value."""
        return self._reward_function(action_value)

    @abstractmethod
    def get_action(self):
        """Return the currently preferred action value (subclass responsibility)."""

    @property
    def action_values(self):
        # the candidate arms this bandit chooses among
        return self._action_values

    @property
    def reward_function(self) -> IndividualBanditRewardFunction:
        # the reward function used to score pulled arms
        return self._reward_function
36
+
37
+
38
class SimpleBandit(BanditAlgorithm):
    """Epsilon-greedy bandit with incremental sample-average value estimates."""

    def __init__(
            self,
            action_values: list,
            reward_function: IndividualBanditRewardFunction,
            epsilon: float=0.05
    ):
        super().__init__(action_values, reward_function)
        self._epsilon = epsilon
        self._initialize()

    def _initialize(self):
        # _Q: estimated value of each arm; _N: number of pulls per arm
        self._Q = np.zeros(len(self._action_values))
        self._N = np.zeros(len(self._action_values), dtype=np.int32)

    def _go_one_loop(self):
        """One epsilon-greedy step: pick an arm, observe reward, update Q.

        Bug fix: the original exploited (argmax) with probability epsilon and
        explored with probability 1 - epsilon, i.e. the inverse of
        epsilon-greedy.  We now explore a uniformly random arm with
        probability epsilon and exploit the greedy arm otherwise.
        """
        if np.random.uniform() < self.epsilon:
            # explore: uniformly random arm
            selected_action_idx = int(np.random.choice(len(self._action_values)))
        else:
            # exploit: greedy arm under the current estimates
            selected_action_idx = int(np.argmax(self._Q))
        reward = self._reward_function(self._action_values[selected_action_idx])
        self._N[selected_action_idx] += 1
        # incremental sample-average update: Q += (r - Q) / N
        self._Q[selected_action_idx] += (reward - self._Q[selected_action_idx]) / self._N[selected_action_idx]

    def get_action(self):
        """Return the greedy (highest-Q) action value."""
        return self._action_values[int(np.argmax(self._Q))]

    @property
    def epsilon(self) -> float:
        """Exploration probability."""
        return self._epsilon

    @epsilon.setter
    def epsilon(self, val: float):
        self._epsilon = val
74
+
75
+
76
class GradientBandit(BanditAlgorithm):
    """Gradient bandit: softmax over learned preferences with a reward baseline."""

    def __init__(self, action_values: list, reward_function: IndividualBanditRewardFunction, temperature: float=1.0, alpha: float=0.1):
        super().__init__(action_values, reward_function)
        self._T = temperature   # softmax temperature
        self._alpha = alpha     # preference step size
        self._initialize()

    def _initialize(self):
        self._preferences = np.zeros(len(self._action_values))
        self._rewards_over_time = []
        # running sum of rewards: the original recomputed np.mean over the
        # whole history each step (O(t) per step, O(t^2) overall)
        self._total_reward = 0.

    def _get_probs(self) -> np.ndarray:
        # action probabilities: softmax of preferences at temperature T
        exp_preferences = np.exp(self._preferences / self.T)
        return exp_preferences / np.sum(exp_preferences)

    def get_action(self):
        """Return the action value with the highest preference."""
        return self._action_values[int(np.argmax(self._preferences))]

    def _go_one_loop(self):
        """Sample an action from the softmax, then apply the gradient update."""
        probs = self._get_probs()
        selected_action_idx = np.random.choice(self._preferences.shape[0], p=probs)
        reward = self._reward_function(self._action_values[selected_action_idx])
        self._rewards_over_time.append(reward)
        self._total_reward += reward
        # baseline: average of all rewards observed so far (incl. the current one)
        average_reward = self._total_reward / len(self._rewards_over_time)
        step = self.alpha * (reward - average_reward)
        # vectorized update, equivalent to the per-arm loop:
        # every arm moves down by step*prob; the selected arm then gets +step,
        # for a net move of step*(1 - prob)
        self._preferences -= step * probs
        self._preferences[selected_action_idx] += step

    @property
    def alpha(self) -> float:
        """Preference step size."""
        return self._alpha

    @alpha.setter
    def alpha(self, val: float):
        self._alpha = val

    @property
    def T(self) -> float:
        """Softmax temperature."""
        return self._T

    @T.setter
    def T(self, val: float):
        self._T = val

    @property
    def temperature(self) -> float:
        """Softmax temperature (alias of ``T``)."""
        return self._T

    @temperature.setter
    def temperature(self, val: float):
        # setter added so the alias is symmetric with T (was read-only before)
        self._T = val
@@ -0,0 +1,11 @@
1
+
2
+ from abc import ABC, abstractmethod
3
+
4
+
5
class IndividualBanditRewardFunction(ABC):
    """Abstract reward function for a single bandit action.

    Subclasses implement :meth:`reward`; instances are also directly
    callable, which simply delegates to :meth:`reward`.
    """

    @abstractmethod
    def reward(self, action_value) -> float:
        """Return the reward obtained by taking ``action_value``."""

    def __call__(self, action_value) -> float:
        """Callable alias for :meth:`reward`."""
        return self.reward(action_value)
File without changes
@@ -0,0 +1,31 @@
1
+
2
+ import gymnasium as gym
3
+
4
+ from ..transition import TransitionProbabilityFactory, NextStateTuple
5
+
6
+
7
class OpenAIGymDiscreteEnvironmentTransitionProbabilityFactory(TransitionProbabilityFactory):
    """Build a TransitionProbabilityFactory from a Gymnasium discrete environment.

    Instantiates the named Gymnasium environment and copies its transition
    table ``P`` into this factory via ``add_state_transitions``.
    """

    def __init__(self, envname):
        super().__init__()
        self._envname = envname
        self._gymenv = gym.make(envname)
        self._convert_openai_gymenv_to_transprob()

    def _convert_openai_gymenv_to_transprob(self):
        # NOTE(review): the triple ``.env.env.env`` unwrap assumes exactly three
        # wrapper layers around the raw environment holding ``P`` — verify this
        # against the installed gymnasium version.
        P = self._gymenv.env.env.env.P
        for state_value, trans_dict in P.items():
            new_trans_dict = {}
            for action_value, next_state_list in trans_dict.items():
                # Reorders each gym transition tuple into NextStateTuple order;
                # presumably gym entries are (prob, next_state, reward, done),
                # giving (next_state, prob, reward, done) here — TODO confirm.
                new_trans_dict[action_value] = [
                    NextStateTuple(next_state[1], next_state[0], next_state[2], next_state[3])
                    for next_state in next_state_list
                ]
            self.add_state_transitions(state_value, new_trans_dict)

    @property
    def envname(self):
        # name string passed to gym.make
        return self._envname

    @property
    def gymenv(self):
        # the wrapped Gymnasium environment instance
        return self._gymenv
@@ -21,7 +21,7 @@ class RewardFunction(ABC):
21
21
  return self._discount_factor
22
22
 
23
23
  @discount_factor.setter
24
- def discount_factor(self, discount_factor):
24
+ def discount_factor(self, discount_factor: float):
25
25
  self._discount_factor = discount_factor
26
26
 
27
27
  def individual_reward(self, state_value, action_value, next_state_value) -> float:
@@ -1,10 +1,39 @@
1
1
 
2
2
  from abc import ABC, abstractmethod
3
+ from enum import Enum
4
+ from dataclasses import dataclass
3
5
  from typing import Tuple, List, Optional, Union
4
6
 
5
7
  import numpy as np
6
8
 
7
9
 
10
class StateValue(ABC):
    """Abstract wrapper for a state's value.

    Concrete subclasses must expose the wrapped value through the
    read-only :attr:`value` property.
    """

    @property
    @abstractmethod
    def value(self):
        """The wrapped state value."""
15
+
16
+
17
@dataclass
class DiscreteStateValue(StateValue):
    """State value backed by a single Enum member."""

    # the wrapped Enum member supplying both value and name
    enum: Enum

    @property
    def value(self):
        """Return the underlying enum member's value."""
        return self.enum.value

    def name(self):
        # NOTE(review): this is a method, unlike the `value` property —
        # callers must use name(); changing it to a property would break them.
        return self.enum.name
27
+
28
+
29
@dataclass
class ContinuousStateValue(StateValue):
    """State value holding a single scalar float.

    Bug fix: the original declared ``_value: float`` as a bare class
    annotation with no ``__init__`` and no ``@dataclass``, so ``_value``
    was never assigned and every access to ``.value`` raised
    AttributeError.  Making it a dataclass (mirroring DiscreteStateValue)
    generates a constructor that actually stores the value; the 0.0
    default keeps the previous zero-argument construction working.
    """

    _value: float = 0.0

    @property
    def value(self) -> float:
        """Return the stored scalar state value."""
        return self._value
35
+
36
+
8
37
  class State(ABC):
9
38
  @property
10
39
  def state_value(self):
@@ -23,7 +52,7 @@ class State(ABC):
23
52
  self.set_state_value(new_state_value)
24
53
 
25
54
 
26
- DiscreteStateValueType = Union[float, str, Tuple[int]]
55
+ DiscreteStateValueType = Union[float, str, Tuple[int], Enum]
27
56
 
28
57
 
29
58
  class DiscreteState(State):
@@ -182,7 +211,7 @@ class Discrete2DCartesianState(DiscreteState):
182
211
  self._county = self._y_hilim - self._y_lowlim + 1
183
212
  if initial_coordinate is None:
184
213
  initial_coordinate = [self._x_lowlim, self._y_lowlim]
185
- initial_value = (initial_coordinate[1] - self._y_lowlim) * self._countx + (initial_coordinate[0] - self._x_lowlim)
214
+ initial_value = (initial_coordinate[1] - self._y_lowlim) * self._countx + (initial_coordinate[0] - self._x_lowlim)
186
215
  super().__init__(list(range(self._countx*self._county)), initial_values=initial_value)
187
216
 
188
217
  def _encode_coordinates(self, x, y) -> int:
@@ -3,7 +3,6 @@ from types import LambdaType
3
3
  from typing import Tuple, Dict
4
4
 
5
5
  import numpy as np
6
- import gym
7
6
 
8
7
  from .state import DiscreteState, DiscreteStateValueType
9
8
  from .reward import IndividualRewardFunction
@@ -145,21 +144,3 @@ class TransitionProbabilityFactory:
145
144
  @property
146
145
  def objects_generated(self) -> bool:
147
146
  return self._objects_generated
148
-
149
-
150
- class OpenAIGymDiscreteEnvironmentTransitionProbabilityFactory(TransitionProbabilityFactory):
151
- def __init__(self, envname):
152
- super().__init__()
153
- self.gymenv = gym.make(envname)
154
- self._convert_openai_gymenv_to_transprob()
155
-
156
- def _convert_openai_gymenv_to_transprob(self):
157
- P = self.gymenv.env.P
158
- for state_value, trans_dict in P.items():
159
- new_trans_dict = {}
160
- for action_value, next_state_list in trans_dict.items():
161
- new_trans_dict[action_value] = [
162
- NextStateTuple(next_state[1], next_state[0], next_state[2], next_state[3])
163
- for next_state in next_state_list
164
- ]
165
- self.add_state_transitions(state_value, new_trans_dict)
@@ -14,7 +14,7 @@ from .policy import DiscreteDeterminsticPolicy
14
14
  class OptimalPolicyOnValueFunctions:
15
15
  def __init__(self, discount_factor: float, transprobfac: TransitionProbabilityFactory):
16
16
  try:
17
- assert discount_factor >= 0. and discount_factor <= 1.
17
+ assert 0. <= discount_factor <= 1.
18
18
  except AssertionError:
19
19
  raise ValueError('Discount factor must be between 0 and 1.')
20
20
  self._gamma = discount_factor
@@ -1,14 +1,14 @@
1
- Metadata-Version: 2.1
1
+ Metadata-Version: 2.2
2
2
  Name: pyrlutils
3
- Version: 0.0.2
3
+ Version: 0.0.4
4
4
  Summary: Utility and Helpers for Reinformcement Learning
5
- Home-page: https://github.com/stephenhky/PyRLUtils
6
- Author: Kwan-Yuet Ho
7
- Author-email: stephenhky@yahoo.com.hk
5
+ Author-email: Kwan Yuet Stephen Ho <stephenhky@yahoo.com.hk>
8
6
  License: MIT
9
- Keywords: machine learning,reinforcement leaning,artifiial intelligence
10
- Platform: UNKNOWN
7
+ Project-URL: Repository, https://github.com/stephenhky/PyRLUtils
8
+ Project-URL: Issues, https://github.com/stephenhky/PyRLUtils/issues
9
+ Keywords: machine learning,reinforcement leaning,artificial intelligence
11
10
  Classifier: Topic :: Scientific/Engineering :: Mathematics
11
+ Classifier: License :: OSI Approved :: MIT License
12
12
  Classifier: Topic :: Software Development :: Libraries :: Python Modules
13
13
  Classifier: Topic :: Software Development :: Version Control :: Git
14
14
  Classifier: Programming Language :: Python :: 3.7
@@ -16,10 +16,17 @@ Classifier: Programming Language :: Python :: 3.8
16
16
  Classifier: Programming Language :: Python :: 3.9
17
17
  Classifier: Programming Language :: Python :: 3.10
18
18
  Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
19
20
  Classifier: Intended Audience :: Science/Research
20
21
  Classifier: Intended Audience :: Developers
22
+ Requires-Python: >=3.7
21
23
  Description-Content-Type: text/markdown
22
24
  License-File: LICENSE
25
+ Requires-Dist: numpy
26
+ Provides-Extra: openaigym
27
+ Requires-Dist: gymnasium; extra == "openaigym"
28
+ Provides-Extra: test
29
+ Requires-Dist: unittest; extra == "test"
23
30
 
24
31
  # PyRLUtils
25
32
 
@@ -27,8 +34,9 @@ License-File: LICENSE
27
34
  [![GitHub release](https://img.shields.io/github/release/stephenhky/PyRLUtils.svg?maxAge=3600)](https://github.com/stephenhky/pyqentangle/PyRLUtils)
28
35
  [![pypi](https://img.shields.io/pypi/v/PyRLUtils.svg?maxAge=3600)](https://pypi.org/project/pyqentangle/)
29
36
  [![download](https://img.shields.io/pypi/dm/PyRLUtils.svg?maxAge=2592000&label=installs&color=%2327B1FF)](https://pypi.org/project/PyRLUtils/)
37
+ [![Updates](https://pyup.io/repos/github/stephenhky/PyRLUtils/shield.svg)](https://pyup.io/repos/github/stephenhky/PyRLUtils/)
38
+ [![Python 3](https://pyup.io/repos/github/stephenhky/PyRLUtils/python-3-shield.svg)](https://pyup.io/repos/github/stephenhky/PyRLUtils/)
39
+
30
40
 
31
41
  This is a Python package with utility classes and helper functions for
32
42
  that facilitates the development of any reinformecement learning projects.
33
-
34
-
@@ -1,7 +1,10 @@
1
+ .gitignore
2
+ .pyup.yml
1
3
  LICENSE
2
4
  MANIFEST.in
3
5
  README.md
4
- setup.py
6
+ pyproject.toml
7
+ .circleci/config.yml
5
8
  pyrlutils/__init__.py
6
9
  pyrlutils/action.py
7
10
  pyrlutils/policy.py
@@ -15,9 +18,16 @@ pyrlutils.egg-info/dependency_links.txt
15
18
  pyrlutils.egg-info/not-zip-safe
16
19
  pyrlutils.egg-info/requires.txt
17
20
  pyrlutils.egg-info/top_level.txt
21
+ pyrlutils/bandit/__init__.py
22
+ pyrlutils/bandit/algo.py
23
+ pyrlutils/bandit/reward.py
24
+ pyrlutils/openai/__init__.py
25
+ pyrlutils/openai/utils.py
26
+ test/__init__.py
18
27
  test/test_2ddiscrete.py
19
28
  test/test_2dmaze.py
20
29
  test/test_action.py
30
+ test/test_bandits.py
21
31
  test/test_continous_state_actions.py
22
32
  test/test_frozenlake.py
23
33
  test/test_state.py
@@ -0,0 +1,7 @@
1
+ numpy
2
+
3
+ [openaigym]
4
+ gymnasium
5
+
6
+ [test]
7
+ unittest
File without changes
@@ -0,0 +1,82 @@
1
+
2
+ import unittest
3
+ from enum import Enum
4
+ import random
5
+
6
+ import numpy as np
7
+
8
+ from pyrlutils.bandit.reward import IndividualBanditRewardFunction
9
+ from pyrlutils.bandit.algo import SimpleBandit, GradientBandit
10
+
11
+
12
class BanditWalk(Enum):
    """Two-armed action space used by the bandit unit tests."""
    LEFT = 0
    RIGHT = 1
15
+
16
+
17
class BanditWalkReward(IndividualBanditRewardFunction):
    """Deterministic reward: LEFT pays 0, anything else pays 1."""

    def reward(self, action_value: BanditWalk) -> float:
        if action_value == BanditWalk.LEFT:
            return 0.
        return 1.
20
+
21
+
22
class BanditSlipperyWalkReward(IndividualBanditRewardFunction):
    """Stochastic reward: LEFT pays 1 with prob 0.2, others pay 1 with prob 0.8."""

    def reward(self, action_value: BanditWalk) -> float:
        draw = random.uniform(0, 1)
        # threshold below which the pull yields no reward
        zero_threshold = 0.8 if action_value == BanditWalk.LEFT else 0.2
        return 0. if draw <= zero_threshold else 1.
29
+
30
+
31
class TestBandits(unittest.TestCase):
    """End-to-end checks that both bandit algorithms learn to pick RIGHT."""

    def test_simple_bandit_BW(self):
        bandit = SimpleBandit(list(BanditWalk), BanditWalkReward())

        nb_arms = len(list(BanditWalk))
        self.assertEqual(bandit._Q.shape[0], nb_arms)
        self.assertEqual(len(bandit.action_values), nb_arms)

        # run one hundred learning iterations
        bandit.loop(100)

        self.assertEqual(bandit.get_action(), BanditWalk.RIGHT)

    def test_simple_bandit_BSW(self):
        bandit = SimpleBandit(list(BanditWalk), BanditSlipperyWalkReward())

        nb_arms = len(list(BanditWalk))
        self.assertEqual(bandit._Q.shape[0], nb_arms)
        self.assertEqual(len(bandit.action_values), nb_arms)

        # run one hundred learning iterations
        bandit.loop(100)

        self.assertEqual(bandit.get_action(), BanditWalk.RIGHT)

    def test_gradient_bandit_BW(self):
        bandit = GradientBandit(list(BanditWalk), BanditWalkReward())

        self.assertEqual(bandit._preferences.shape[0], len(list(BanditWalk)))
        # with zero preferences the softmax must be uniform
        probs = bandit._get_probs()
        self.assertAlmostEqual(probs[0], 0.5)
        self.assertAlmostEqual(probs[1], 0.5)

        # run one hundred learning iterations
        bandit.loop(100)

        self.assertEqual(bandit.get_action(), BanditWalk.RIGHT)

    def test_gradient_bandit_BSW(self):
        bandit = GradientBandit(list(BanditWalk), BanditSlipperyWalkReward())

        self.assertEqual(bandit._preferences.shape[0], len(list(BanditWalk)))
        # with zero preferences the softmax must be uniform
        probs = bandit._get_probs()
        self.assertAlmostEqual(probs[0], 0.5)
        self.assertAlmostEqual(probs[1], 0.5)

        # run one hundred learning iterations
        bandit.loop(100)

        self.assertEqual(bandit.get_action(), BanditWalk.RIGHT)


if __name__ == '__main__':
    unittest.main()
@@ -1,7 +1,8 @@
1
1
 
2
2
  import unittest
3
3
 
4
- from pyrlutils.transition import OpenAIGymDiscreteEnvironmentTransitionProbabilityFactory
4
+ from pyrlutils.openai.utils import OpenAIGymDiscreteEnvironmentTransitionProbabilityFactory
5
+
5
6
 
6
7
  class TestFrozenLake(unittest.TestCase):
7
8
  def test_factory(self):
@@ -1,2 +0,0 @@
1
- numpy
2
- gym
pyrlutils-0.0.2/setup.py DELETED
@@ -1,53 +0,0 @@
1
-
2
- from setuptools import setup
3
-
4
-
5
- def readme():
6
- with open('README.md') as f:
7
- return f.read()
8
-
9
-
10
- def install_requirements():
11
- return [package_string.strip() for package_string in open('requirements.txt', 'r')]
12
-
13
-
14
- def package_description():
15
- text = open('README.md', 'r').read()
16
- return text
17
-
18
-
19
- setup(
20
- name='pyrlutils',
21
- version="0.0.2",
22
- description="Utility and Helpers for Reinformcement Learning",
23
- long_description=package_description(),
24
- long_description_content_type='text/markdown',
25
- classifiers=[
26
- "Topic :: Scientific/Engineering :: Mathematics",
27
- "Topic :: Software Development :: Libraries :: Python Modules",
28
- "Topic :: Software Development :: Version Control :: Git",
29
- "Programming Language :: Python :: 3.7",
30
- "Programming Language :: Python :: 3.8",
31
- "Programming Language :: Python :: 3.9",
32
- "Programming Language :: Python :: 3.10",
33
- "Programming Language :: Python :: 3.11",
34
- "Intended Audience :: Science/Research",
35
- "Intended Audience :: Developers",
36
- ],
37
- keywords="machine learning, reinforcement leaning, artifiial intelligence",
38
- url="https://github.com/stephenhky/PyRLUtils",
39
- author="Kwan-Yuet Ho",
40
- author_email="stephenhky@yahoo.com.hk",
41
- license='MIT',
42
- packages=[
43
- 'pyrlutils'
44
- ],
45
- install_requires=install_requirements(),
46
- tests_require=[
47
- 'unittest'
48
- ],
49
- # scripts=[],
50
- include_package_data=True,
51
- test_suite="test",
52
- zip_safe=False
53
- )
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes