saferl-lite 0.1.0__py3-none-any.whl → 0.1.2__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,239 @@
+ Metadata-Version: 2.4
+ Name: saferl-lite
+ Version: 0.1.2
+ Summary: A lightweight, explainable, and constrained reinforcement learning toolkit.
+ Home-page: https://github.com/satyamcser/saferl-lite
+ Author: Satyam Mishra
+ Author-email: satyam@example.com
+ Project-URL: Documentation, https://github.com/satyamcser/saferl-lite/tree/main/docs
+ Project-URL: Source, https://github.com/satyamcser/saferl-lite
+ Project-URL: Bug Tracker, https://github.com/satyamcser/saferl-lite/issues
+ Classifier: Programming Language :: Python :: 3
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Intended Audience :: Science/Research
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: gym
+ Requires-Dist: gymnasium
+ Requires-Dist: numpy
+ Requires-Dist: torch
+ Requires-Dist: matplotlib
+ Requires-Dist: seaborn
+ Requires-Dist: pre-commit
+ Requires-Dist: flake8
+ Requires-Dist: pyyaml
+ Requires-Dist: shap
+ Requires-Dist: captum
+ Requires-Dist: typer
+ Requires-Dist: scikit-learn
+ Requires-Dist: pandas
+ Requires-Dist: pytest
+ Requires-Dist: pytest-cov
+ Requires-Dist: coverage
+ Requires-Dist: mkdocs
+ Requires-Dist: wandb
+ Requires-Dist: mkdocs>=1.5
+ Requires-Dist: mkdocs-material>=9.5
+ Requires-Dist: mkdocstrings[python]
+ Dynamic: author
+ Dynamic: author-email
+ Dynamic: classifier
+ Dynamic: description
+ Dynamic: description-content-type
+ Dynamic: home-page
+ Dynamic: license-file
+ Dynamic: project-url
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary
+
+ # 🔐 SafeRL-Lite
+
+ A **lightweight, explainable, and modular** Python library for **Constrained Reinforcement Learning (Safe RL)** with real-time **SHAP & saliency-based explainability**, custom metrics, and Gym-compatible wrappers.
+
+ By:
+ - Satyam Mishra, Vision Mentors Ltd., Hanoi, Vietnam
+ - Shivam Mishra, Phung Thao Vi, Vietnam National University, Hanoi, Vietnam
+ - Dr. Vishwanath Bijalwan, SR University, Warangal, India
+ - Dr. Vijay Bhaskar Semwal, MANIT, Bhopal, India
+ - University of West London, London, UK
+
+ <p align="center">
+   <a href="https://github.com/satyamcser/saferl-lite/blob/main/LICENSE">
+     <img src="https://img.shields.io/github/license/satyamcser/saferl-lite?style=flat-square" alt="License">
+   </a>
+   <a href="https://github.com/satyamcser/saferl-lite/stargazers">
+     <img src="https://img.shields.io/github/stars/satyamcser/saferl-lite?style=flat-square" alt="Stars">
+   </a>
+   <a href="https://pypi.org/project/saferl-lite/">
+     <img src="https://img.shields.io/pypi/v/saferl-lite?style=flat-square" alt="PyPI version">
+   </a>
+   <a href="https://github.com/satyamcser/saferl-lite/actions/workflows/ci.yml">
+     <img src="https://img.shields.io/github/actions/workflow/status/satyamcser/saferl-lite/ci.yml?branch=main&style=flat-square" alt="Build Status">
+   </a>
+ </p>
+
+ ---
+
+ ## 🌟 Overview
+
+ **SafeRL-Lite** empowers reinforcement learning agents to act under **safety constraints**, while remaining **interpretable** and **modular** for fast experimentation. It wraps standard Gym environments and DQN-based agents with:
+
+ - ✅ Safety constraint logic
+ - 🔍 Visual explainability (SHAP, saliency maps)
+ - 📊 Violation and reward tracking
+ - 🧪 Built-in testing and evaluations
+
+ ---
+
+ ## ✅ Problem We Solved
+
+ Modern Reinforcement Learning (RL) agents are powerful but often unsafe and opaque:
+
+ - 🚫 They frequently violate safety constraints during learning or deployment (e.g., falling off a cliff in a navigation task).
+
+ - 😕 Their decision-making is a black box: humans can’t understand why a certain action was chosen.
+
+ - 🔍 Standard RL libraries lack native support for:
+
+   - Enforcing hard constraints during training.
+
+   - Explaining decisions using methods like SHAP or saliency maps.
+
+ ## ✅ Our Solution
+ SafeRL-Lite is a lightweight Python library that:
+
+ 1. 📏 Adds a SafetyWrapper around any Gym environment to enforce safety constraints (e.g., bounding actions, limiting states); a minimal sketch follows this list.
+
+ 2. 🧠 Integrates explainability methods:
+
+    - SHAPExplainer (model-agnostic local explanations).
+
+    - SaliencyExplainer (gradient-based sensitivity maps).
+
+ 3. 🔧 Wraps constrained DQNs with ease, enabling safety-compliant Q-learning.
+
+ 4. 📊 Offers built-in metrics like violation count and safe episode tracking.
+
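+ The wrapper idea in step 1 can be illustrated in a few lines. This is a minimal sketch assuming a gymnasium-style five-tuple step API; the constraint function and penalty value are illustrative, not the library's actual implementation:
+
+ ```python
+ import gymnasium as gym
+
+ class SafetyWrapper(gym.Wrapper):
+     """Sketch: penalize and terminate when a user-supplied
+     constraint on the observation is violated."""
+
+     def __init__(self, env, constraint_fn, penalty=-10.0):
+         super().__init__(env)
+         self.constraint_fn = constraint_fn  # obs -> bool, True means safe
+         self.penalty = penalty
+         self.violations = 0
+
+     def step(self, action):
+         obs, reward, terminated, truncated, info = self.env.step(action)
+         if not self.constraint_fn(obs):
+             self.violations += 1
+             reward += self.penalty       # reward shaping on violation
+             terminated = True            # enforce as a hard constraint
+             info["constraint_violated"] = True
+         return obs, reward, terminated, truncated, info
+
+ # Usage: keep CartPole's pole angle (obs[2]) within ±0.1 rad.
+ env = SafetyWrapper(gym.make("CartPole-v1"), lambda obs: abs(obs[2]) < 0.1)
+ ```
+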
+ ## ✅ Novelty
+ While Safe RL and Explainable RL are studied separately, no prior lightweight library:
+
+ - Combines hard safety constraints with post-hoc interpretability.
+
+ - Is designed to be minimal, pluggable, and easily installable (pip install saferl-lite) for education, experimentation, or safe deployment.
+
+ - Enables real-time SHAP or saliency visualization for Gym-based agents out of the box.
+
+ > SafeRL-Lite is the first minimal library to unify constraint satisfaction and explainability in reinforcement learning, without heavy dependencies or overhead.
+
+ ## ✅ Our Contribution
+ 1. 🔐 Constraint Wrapper API: Drop-in Gym wrapper for defining and enforcing logical constraints on observations, actions, and reward signals.
+
+ 2. 🧠 Explainability Modules: Plug-and-play SHAP and saliency explainer classes for deep Q-networks.
+
+ 3. 📦 PyPI-Ready Toolkit: Easily installed, documented, and CI/CD-tested; built for research and reproducibility.
+
+ 4. 📈 Metrics for Constraint Violation: Tracks unsafe episodes and per-step violations, and integrates cleanly with WandB or TensorBoard (see the sketch below).
+
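+ Since wandb is already a declared dependency, violation tracking can stream to it with the standard logging calls. A sketch of what that looks like; the project name, metric names, and the run_episode helper are hypothetical:
+
+ ```python
+ import wandb
+
+ wandb.init(project="saferl-lite-demo")  # illustrative project name
+
+ for episode in range(100):
+     # run_episode() is a hypothetical helper returning the episode's
+     # total reward and the number of constraint violations it incurred.
+     episode_reward, violations = run_episode()
+     wandb.log({
+         "episode_reward": episode_reward,
+         "constraint_violations": violations,
+     })
+ ```
+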
+ ## ✅ Technical Explanation
+ - We define a custom SafeEnvWrapper(gym.Env) that:
+
+   - Intercepts actions.
+
+   - Applies logical rules or thresholding.
+
+   - Optionally overrides rewards or terminations if constraints are violated.
+
+ - A ConstrainedDQNAgent uses:
+
+   - Safety-wrapped Gym envs.
+
+   - Standard Q-learning with an optional penalty_on_violation flag.
+
+ - Post-training, the SHAPExplainer and SaliencyExplainer:
+
+   - Generate local attributions using input perturbations or gradient norms.
+
+   - Can visualize per-state or per-action explanations.
+
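+ For intuition, the gradient-based saliency idea can be reproduced with plain PyTorch: differentiate the greedy action's Q-value with respect to the input state, and read the gradient magnitude as per-feature importance. A minimal sketch, independent of the library's actual SaliencyExplainer implementation:
+
+ ```python
+ import torch
+
+ def saliency_map(q_network, state):
+     """Return |dQ(s, a*)/ds| for the greedy action a*: one
+     importance score per input feature."""
+     s = torch.as_tensor(state, dtype=torch.float32).unsqueeze(0)
+     s.requires_grad_(True)
+     q_values = q_network(s)               # shape: (1, n_actions)
+     best = q_values.argmax(dim=1).item()  # greedy action a*
+     q_values[0, best].backward()          # d Q(s, a*) / d s
+     return s.grad.abs().squeeze(0)        # saliency per feature
+ ```
+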
+ ## ✅ Satyam's Explanation
+ > Imagine you're teaching a robot to walk, but there’s lava on the floor!
+ > You don’t just want it to learn fast; you want it to stay safe and explain why it stepped left, not right.
+
+ SafeRL-Lite is like a safety helmet and voicebox for robots:
+
+ - The helmet makes sure they don’t do dangerous stuff.
+
+ - The voicebox lets them say why they made that move.
+
+ ## 🔧 Installation
+
+ > 📦 PyPI
+ ```bash
+ pip install saferl-lite
+ ```
+
+ ## 🛠️ From source:
+
+ ```bash
+ git clone https://github.com/satyamcser/saferl-lite.git
+ cd saferl-lite
+ pip install -e .
+ ```
+
+ ## 🚀 Quickstart
+ Train a constrained DQN agent with SHAP-based explainability:
+
+ ```bash
+ python train.py --env CartPole-v1 --constraint pole_angle --explain shap
+ ```
+
+ 🔹 This command:
+
+ - Adds a pole-angle constraint wrapper to the Gym env
+
+ - Logs violations
+
+ - Displays SHAP or saliency explanations for agent decisions
+
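+ The same pipeline can also be driven from Python. The module paths below follow the wheel's layout (envs/wrappers.py, explainability/shap_explainer.py), but the constructor and method signatures are illustrative assumptions, not the published API:
+
+ ```python
+ import gymnasium as gym
+ import torch.nn as nn
+
+ # Module paths taken from the wheel's RECORD; signatures are assumed.
+ from envs.wrappers import SafetyWrapper
+ from explainability.shap_explainer import SHAPExplainer
+
+ # Constrain CartPole: the pole angle (obs[2]) must stay within ±0.1 rad.
+ env = SafetyWrapper(gym.make("CartPole-v1"),
+                     constraint_fn=lambda obs: abs(obs[2]) < 0.1)
+
+ # Stand-in Q-network; a real run would use the trained agent's network.
+ q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
+
+ explainer = SHAPExplainer(q_net)  # assumed constructor
+ obs, _ = env.reset()
+ print(explainer.explain(obs))     # assumed method: per-feature attribution
+ ```
+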
+ ## 🧠 Features
+ #### ✅ Constrained RL
+ - Add custom constraints via wrapper or logic class
+
+ - Violation logging and reward shaping
+
+ - Safe vs. unsafe episode tracking
+
+ #### 🔍 Explainability
+ - SaliencyExplainer — gradient-based visual heatmaps
+
+ - SHAPExplainer — feature contribution values per decision
+
+ - Compatible with any PyTorch-based agent
+
+ #### 📊 Metrics
+ - Constraint violation rate
+
+ - Episode reward
+
+ - Cumulative safe reward
+
+ - Action entropy & temporal behavior stats (sketched below)
+
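+ Violation rate and action entropy are standard quantities; as a reference, a sketch of how they are typically computed (not the library's code):
+
+ ```python
+ import numpy as np
+
+ def violation_rate(violations, total_steps):
+     """Fraction of environment steps that broke a constraint."""
+     return violations / max(total_steps, 1)
+
+ def action_entropy(actions, n_actions):
+     """Shannon entropy (in nats) of the empirical action distribution.
+     Higher entropy means the policy spreads over more actions."""
+     counts = np.bincount(actions, minlength=n_actions)
+     p = counts / counts.sum()
+     p = p[p > 0]  # drop unused actions (0 * log 0 is taken as 0)
+     return float(-(p * np.log(p)).sum())
+
+ print(action_entropy([0, 1, 1, 0], n_actions=2))  # ≈ 0.693 (= ln 2)
+ ```
+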
+ #### 📚 Modularity
+ - Swap out agents, constraints, evaluators, or explainers
+
+ - Supports Gym environments
+
+ - Configurable training pipeline
+
+ ## 📜 Citation
+ Coming soon after arXiv/preprint release.
@@ -6,8 +6,8 @@ envs/wrappers.py,sha256=rfk3cfsTsfD8NqUjEcJ-o7XGMmkBBHt5kfaCiE3AgAw,1749
  explainability/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  explainability/saliency.py,sha256=EpvrpkRZWqYqd3lkRIkfIbJ0pw7G_hJ8GEiVfgPo88U,767
  explainability/shap_explainer.py,sha256=Tj-fP947z8ixFdWRXHdR6D3a_wtznGN5x-DomU34xbc,883
- saferl_lite-0.1.0.dist-info/licenses/LICENSE,sha256=WRhQPkdFDzbMFEhvoaq9gSNnbsy0lhSC8tFH3stLntY,1070
- saferl_lite-0.1.0.dist-info/METADATA,sha256=k9EwE0Clqv-yIANmGdhJPemW4EhBI9kqAnw6xc74WJE,3868
- saferl_lite-0.1.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- saferl_lite-0.1.0.dist-info/top_level.txt,sha256=f1IuezLA5sRnSuKZbl-VrS_Hh9pekOW2smLrpJLuiGg,27
- saferl_lite-0.1.0.dist-info/RECORD,,
+ saferl_lite-0.1.2.dist-info/licenses/LICENSE,sha256=WRhQPkdFDzbMFEhvoaq9gSNnbsy0lhSC8tFH3stLntY,1070
+ saferl_lite-0.1.2.dist-info/METADATA,sha256=Ba_qQsoYNXTdBgZvMFynvZsVwuhWK22x3_0EM92AJTU,7613
+ saferl_lite-0.1.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ saferl_lite-0.1.2.dist-info/top_level.txt,sha256=f1IuezLA5sRnSuKZbl-VrS_Hh9pekOW2smLrpJLuiGg,27
+ saferl_lite-0.1.2.dist-info/RECORD,,
@@ -1,139 +0,0 @@
- Metadata-Version: 2.4
- Name: saferl-lite
- Version: 0.1.0
- Summary: A lightweight, explainable, and constrained reinforcement learning toolkit.
- Home-page: https://github.com/satyamcser/saferl-lite
- Author: Satyam Mishra
- Author-email: satyam@example.com
- Project-URL: Documentation, https://satyamcser.github.io/saferl-lite/
- Project-URL: Source, https://github.com/satyamcser/saferl-lite
- Project-URL: Bug Tracker, https://github.com/satyamcser/saferl-lite/issues
- Classifier: Programming Language :: Python :: 3
- Classifier: License :: OSI Approved :: MIT License
- Classifier: Operating System :: OS Independent
- Classifier: Intended Audience :: Science/Research
- Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
- Requires-Python: >=3.8
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: gym
- Requires-Dist: gymnasium
- Requires-Dist: numpy
- Requires-Dist: torch
- Requires-Dist: matplotlib
- Requires-Dist: seaborn
- Requires-Dist: pre-commit
- Requires-Dist: flake8
- Requires-Dist: pyyaml
- Requires-Dist: shap
- Requires-Dist: captum
- Requires-Dist: typer
- Requires-Dist: scikit-learn
- Requires-Dist: pandas
- Requires-Dist: pytest
- Requires-Dist: pytest-cov
- Requires-Dist: coverage
- Requires-Dist: mkdocs
- Requires-Dist: wandb
- Requires-Dist: mkdocs>=1.5
- Requires-Dist: mkdocs-material>=9.5
- Requires-Dist: mkdocstrings[python]
- Dynamic: author
- Dynamic: author-email
- Dynamic: classifier
- Dynamic: description
- Dynamic: description-content-type
- Dynamic: home-page
- Dynamic: license-file
- Dynamic: project-url
- Dynamic: requires-dist
- Dynamic: requires-python
- Dynamic: summary
-
- # 🔐 SafeRL-Lite
-
- A **lightweight, explainable, and modular** Python library for **Constrained Reinforcement Learning (Safe RL)** with real-time **SHAP & saliency-based explainability**, custom metrics, and Gym-compatible wrappers.
-
- <p align="center">
-   <img src="https://img.shields.io/github/license/satyamcser/saferl-lite?style=flat-square">
-   <img src="https://img.shields.io/github/stars/satyamcser/saferl-lite?style=flat-square">
-   <img src="https://img.shields.io/pypi/v/saferl-lite?style=flat-square">
-   <img src="https://img.shields.io/github/actions/workflow/status/satyamcser/saferl-lite/ci.yml?branch=main&style=flat-square">
- </p>
-
- ---
-
- ## 🌟 Overview
-
- **SafeRL-Lite** empowers reinforcement learning agents to act under **safety constraints**, while remaining **interpretable** and **modular** for fast experimentation. It wraps standard Gym environments and DQN-based agents with:
-
- - ✅ Safety constraint logic
- - 🔍 Visual explainability (SHAP, saliency maps)
- - 📊 Violation and reward tracking
- - 🧪 Built-in testing and evaluations
-
- ---
-
- ## 🔧 Installation
-
- > 📦 PyPI (coming soon)
- ```bash
- pip install saferl-lite
- ```
-
- ## 🛠️ From source:
-
- ```bash
- git clone https://github.com/satyamcser/saferl-lite.git
- cd saferl-lite
- pip install -e .
- ```
-
- ## 🚀 Quickstart
- Train a constrained DQN agent with saliency-based explainability:
-
- ```bash
- python train.py --env CartPole-v1 --constraint pole_angle --explain shap
- ```
-
- 🔹 This:
-
- - Adds a pole-angle constraint wrapper to the Gym env
-
- - Logs violations
-
- - Displays SHAP or saliency explanations for agent decisions
-
- ## 🧠 Features
- #### ✅ Constrained RL
- - Add custom constraints via wrapper or logic class
-
- - Violation logging and reward shaping
-
- - Safe vs unsafe episode tracking
-
- #### 🔍 Explainability
- - SaliencyExplainer — gradient-based visual heatmaps
-
- - SHAPExplainer — feature contribution values per decision
-
- - Compatible with any PyTorch-based agent
-
- #### 📊 Metrics
- - Constraint violation rate
-
- - Episode reward
-
- - Cumulative safe reward
-
- - Action entropy & temporal behavior stats
-
- #### 📚 Modularity
- - Swap out agents, constraints, evaluators, or explainers
-
- - Supports Gym environments
-
- - Configurable training pipeline
-
- ## 📜 Citation
- Coming soon after arXiv/preprint release.