saferl-lite 0.1.0__py3-none-any.whl → 0.1.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- saferl_lite-0.1.2.dist-info/METADATA +239 -0
- {saferl_lite-0.1.0.dist-info → saferl_lite-0.1.2.dist-info}/RECORD +5 -5
- saferl_lite-0.1.0.dist-info/METADATA +0 -139
- {saferl_lite-0.1.0.dist-info → saferl_lite-0.1.2.dist-info}/WHEEL +0 -0
- {saferl_lite-0.1.0.dist-info → saferl_lite-0.1.2.dist-info}/licenses/LICENSE +0 -0
- {saferl_lite-0.1.0.dist-info → saferl_lite-0.1.2.dist-info}/top_level.txt +0 -0
@@ -0,0 +1,239 @@
Metadata-Version: 2.4
Name: saferl-lite
Version: 0.1.2
Summary: A lightweight, explainable, and constrained reinforcement learning toolkit.
Home-page: https://github.com/satyamcser/saferl-lite
Author: Satyam Mishra
Author-email: satyam@example.com
Project-URL: Documentation, https://github.com/satyamcser/saferl-lite/tree/main/docs
Project-URL: Source, https://github.com/satyamcser/saferl-lite
Project-URL: Bug Tracker, https://github.com/satyamcser/saferl-lite/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gym
Requires-Dist: gymnasium
Requires-Dist: numpy
Requires-Dist: torch
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: pre-commit
Requires-Dist: flake8
Requires-Dist: pyyaml
Requires-Dist: shap
Requires-Dist: captum
Requires-Dist: typer
Requires-Dist: scikit-learn
Requires-Dist: pandas
Requires-Dist: pytest
Requires-Dist: pytest-cov
Requires-Dist: coverage
Requires-Dist: mkdocs
Requires-Dist: wandb
Requires-Dist: mkdocs>=1.5
Requires-Dist: mkdocs-material>=9.5
Requires-Dist: mkdocstrings[python]
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# 🔐 SafeRL-Lite

A **lightweight, explainable, and modular** Python library for **Constrained Reinforcement Learning (Safe RL)** with real-time **SHAP & saliency-based explainability**, custom metrics, and Gym-compatible wrappers.

By:

- Satyam Mishra, Vision Mentors Ltd., Hanoi, Vietnam
- Shivam Mishra, Phung Thao Vi, Vietnam National University, Hanoi, Vietnam
- Dr. Vishwanath Bijalwan, SR University, Warangal, India
- Dr. Vijay Bhaskar Semwal, MANIT, Bhopal, India
- University of West London, London, UK

<p align="center">
  <a href="https://github.com/satyamcser/saferl-lite/blob/main/LICENSE">
    <img src="https://img.shields.io/github/license/satyamcser/saferl-lite?style=flat-square" alt="License">
  </a>
  <a href="https://github.com/satyamcser/saferl-lite/stargazers">
    <img src="https://img.shields.io/github/stars/satyamcser/saferl-lite?style=flat-square" alt="Stars">
  </a>
  <a href="https://pypi.org/project/saferl-lite/">
    <img src="https://img.shields.io/pypi/v/saferl-lite?style=flat-square" alt="PyPI version">
  </a>
  <a href="https://github.com/satyamcser/saferl-lite/actions/workflows/ci.yml">
    <img src="https://img.shields.io/github/actions/workflow/status/satyamcser/saferl-lite/ci.yml?branch=main&style=flat-square" alt="Build Status">
  </a>
</p>

---

## 🌟 Overview

**SafeRL-Lite** empowers reinforcement learning agents to act under **safety constraints**, while remaining **interpretable** and **modular** for fast experimentation. It wraps standard Gym environments and DQN-based agents with:

- ✅ Safety constraint logic
- 🔍 Visual explainability (SHAP, saliency maps)
- 📊 Violation and reward tracking
- 🧪 Built-in testing and evaluations

---

## ✅ Problem We Solved

Modern reinforcement learning (RL) agents are powerful but often unsafe and opaque:

- 🚫 They frequently violate safety constraints during learning or deployment (e.g., falling off a cliff in a navigation task).
- 😕 Their decision-making is a black box: humans can't understand why a certain action was chosen.
- 🔍 Standard RL libraries lack native support for:
  - Enforcing hard constraints during training.
  - Explaining decisions with methods like SHAP or saliency maps.

## ✅ Our Solution

SafeRL-Lite is a lightweight Python library that:

1. 📏 Adds a SafetyWrapper around any Gym environment to enforce safety constraints (e.g., bounding actions, limiting states).
2. 🧠 Integrates explainability methods:
   - SHAPExplainer (model-agnostic local explanations).
   - SaliencyExplainer (gradient-based sensitivity maps).
3. 🔧 Wraps constrained DQNs with ease, enabling safety-compliant Q-learning.
4. 📊 Offers built-in metrics such as violation counts and safe-episode tracking (a usage sketch follows below).
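
A minimal end-to-end sketch of that workflow, for orientation only: the import paths, constructor arguments, and the `train`/`violation_count` members are assumptions based on the class names above, not the documented saferl-lite API.

```python
import gymnasium as gym

# Assumed module paths -- the real package layout may differ.
from saferl_lite.wrappers import SafetyWrapper
from saferl_lite.agents import ConstrainedDQNAgent

# 1. Enforce a pole-angle constraint on CartPole (obs[2] is the pole angle).
env = SafetyWrapper(
    gym.make("CartPole-v1"),
    constraint=lambda obs, action: abs(obs[2]) < 0.2,
    penalty_on_violation=True,
)

# 2. Train a safety-compliant DQN on the wrapped environment.
agent = ConstrainedDQNAgent(env)
agent.train(episodes=500)

# 3. Read back built-in safety metrics (hypothetical attribute).
print("constraint violations:", env.violation_count)
```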

## ✅ Novelty

While safe RL and explainable RL are each well studied, no prior lightweight library:

- Combines hard safety constraints with post-hoc interpretability.
- Is minimal, pluggable, and easily installable (`pip install saferl-lite`) for education, experimentation, or safe deployment.
- Enables real-time SHAP or saliency visualization for Gym-based agents out of the box.

> SafeRL-Lite is the first minimal library to unify constraint satisfaction and explainability in reinforcement learning — without heavy dependencies or overhead.

## ✅ Our Contribution

1. 🔐 Constraint Wrapper API: a drop-in Gym wrapper for defining and enforcing logical constraints on observations, actions, and reward signals.
2. 🧠 Explainability Modules: plug-and-play SHAP and saliency explainer classes for deep Q-networks.
3. 📦 PyPI-Ready Toolkit: easily installed, documented, and CI/CD tested; built for research and reproducibility.
4. 📈 Metrics for Constraint Violation: tracks unsafe episodes and per-step violations, and integrates cleanly with WandB or TensorBoard (a logging sketch follows below).
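
As a sketch of what that WandB integration could look like on the caller's side (the metric names and the stand-in rollout are illustrative, not part of saferl-lite's documented API):

```python
import random

import wandb

wandb.init(project="saferl-lite-demo")  # hypothetical project name

for episode in range(100):
    # Stand-in for a real rollout that returns a per-episode violation count.
    violations = random.randint(0, 3)
    wandb.log({
        "constraint_violations": violations,
        "safe_episode": int(violations == 0),
    })

wandb.finish()
```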

## ✅ Technical Explanation

- We define a custom SafeEnvWrapper(gym.Env) that (see the sketch after this list):
  - Intercepts actions.
  - Applies logical rules or thresholding.
  - Optionally overrides rewards or terminations when constraints are violated.
- A ConstrainedDQNAgent uses:
  - Safety-wrapped Gym envs.
  - Standard Q-learning with an optional penalty_on_violation flag.
- Post-training, the SHAPExplainer and SaliencyExplainer:
  - Generate local attributions using input perturbations or gradient norms.
  - Can visualize per-state or per-action explanations.
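
A compact sketch of such a wrapper, written here against gymnasium's `gym.Wrapper` base class for brevity (the constraint signature, penalty handling, and attribute names are illustrative, not the shipped class):

```python
import gymnasium as gym


class SafeEnvWrapper(gym.Wrapper):
    """Illustrative constraint-enforcing wrapper, not the shipped class."""

    def __init__(self, env, constraint, penalty=-1.0, terminate_on_violation=False):
        super().__init__(env)
        self.constraint = constraint  # callable: (obs, action) -> bool, True = safe
        self.penalty = penalty
        self.terminate_on_violation = terminate_on_violation
        self.violation_count = 0
        self._last_obs = None

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._last_obs = obs
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if not self.constraint(self._last_obs, action):
            self.violation_count += 1
            reward += self.penalty  # optional reward override
            terminated = terminated or self.terminate_on_violation
            info["constraint_violated"] = True
        self._last_obs = obs
        return obs, reward, terminated, truncated, info
```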

## ✅ Satyam's Explanation

> Imagine you're teaching a robot to walk — but there's lava on the floor!
> You don't just want it to learn fast; you want it to stay safe and explain why it stepped left, not right.

SafeRL-Lite is like a safety helmet and voicebox for robots:

- The helmet makes sure they don't do dangerous stuff.
- The voicebox lets them say why they made that move.

## 🔧 Installation

> 📦 PyPI

```bash
pip install saferl-lite
```

## 🛠️ From source

```bash
git clone https://github.com/satyamcser/saferl-lite.git
cd saferl-lite
pip install -e .
```

## 🚀 Quickstart

Train a constrained DQN agent with SHAP- or saliency-based explainability:

```bash
python train.py --env CartPole-v1 --constraint pole_angle --explain shap
```

🔹 This:

- Adds a pole-angle constraint wrapper to the Gym env
- Logs violations
- Displays SHAP or saliency explanations for agent decisions

## 🧠 Features

#### ✅ Constrained RL

- Add custom constraints via a wrapper or logic class
- Violation logging and reward shaping
- Safe vs. unsafe episode tracking

#### 🔍 Explainability

- SaliencyExplainer — gradient-based visual heatmaps
- SHAPExplainer — feature contribution values per decision
- Compatible with any PyTorch-based agent (a generic saliency sketch follows below)
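
As a rough illustration of the gradient-saliency idea in plain PyTorch (not necessarily the internals of SaliencyExplainer; the tiny Q-network is made up for the demo):

```python
import torch
import torch.nn as nn

# Toy Q-network standing in for a trained agent (illustrative only).
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))

state = torch.rand(1, 4, requires_grad=True)  # e.g. one CartPole observation
q_values = q_net(state)
q_values.max().backward()                     # gradient of the greedy Q-value

saliency = state.grad.abs().squeeze()         # per-feature sensitivity
print(saliency)                               # higher = more influential input
```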

#### 📊 Metrics

- Constraint violation rate
- Episode reward
- Cumulative safe reward
- Action entropy & temporal behavior stats (see the entropy sketch below)
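
For reference, action entropy here means the Shannon entropy of the empirical action distribution; a minimal sketch in plain NumPy (not a saferl-lite API):

```python
import numpy as np

def action_entropy(action_counts):
    """Shannon entropy (in nats) of an empirical action distribution."""
    p = np.asarray(action_counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # 0 * log(0) is treated as 0
    return float(-(p * np.log(p)).sum())

# An agent that picked action 0 seventy times and action 1 thirty times:
print(action_entropy([70, 30]))  # ~0.611 nats; 0 would mean fully deterministic
```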

#### 📚 Modularity

- Swap out agents, constraints, evaluators, or explainers
- Supports Gym environments
- Configurable training pipeline

## 📜 Citation

Coming soon after arXiv/preprint release.

@@ -6,8 +6,8 @@ envs/wrappers.py,sha256=rfk3cfsTsfD8NqUjEcJ-o7XGMmkBBHt5kfaCiE3AgAw,1749
  explainability/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  explainability/saliency.py,sha256=EpvrpkRZWqYqd3lkRIkfIbJ0pw7G_hJ8GEiVfgPo88U,767
  explainability/shap_explainer.py,sha256=Tj-fP947z8ixFdWRXHdR6D3a_wtznGN5x-DomU34xbc,883
- saferl_lite-0.1.
- saferl_lite-0.1.
- saferl_lite-0.1.
- saferl_lite-0.1.
- saferl_lite-0.1.
+ saferl_lite-0.1.2.dist-info/licenses/LICENSE,sha256=WRhQPkdFDzbMFEhvoaq9gSNnbsy0lhSC8tFH3stLntY,1070
+ saferl_lite-0.1.2.dist-info/METADATA,sha256=Ba_qQsoYNXTdBgZvMFynvZsVwuhWK22x3_0EM92AJTU,7613
+ saferl_lite-0.1.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ saferl_lite-0.1.2.dist-info/top_level.txt,sha256=f1IuezLA5sRnSuKZbl-VrS_Hh9pekOW2smLrpJLuiGg,27
+ saferl_lite-0.1.2.dist-info/RECORD,,

@@ -1,139 +0,0 @@
Metadata-Version: 2.4
Name: saferl-lite
Version: 0.1.0
Summary: A lightweight, explainable, and constrained reinforcement learning toolkit.
Home-page: https://github.com/satyamcser/saferl-lite
Author: Satyam Mishra
Author-email: satyam@example.com
Project-URL: Documentation, https://satyamcser.github.io/saferl-lite/
Project-URL: Source, https://github.com/satyamcser/saferl-lite
Project-URL: Bug Tracker, https://github.com/satyamcser/saferl-lite/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gym
Requires-Dist: gymnasium
Requires-Dist: numpy
Requires-Dist: torch
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: pre-commit
Requires-Dist: flake8
Requires-Dist: pyyaml
Requires-Dist: shap
Requires-Dist: captum
Requires-Dist: typer
Requires-Dist: scikit-learn
Requires-Dist: pandas
Requires-Dist: pytest
Requires-Dist: pytest-cov
Requires-Dist: coverage
Requires-Dist: mkdocs
Requires-Dist: wandb
Requires-Dist: mkdocs>=1.5
Requires-Dist: mkdocs-material>=9.5
Requires-Dist: mkdocstrings[python]
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# 🔐 SafeRL-Lite

A **lightweight, explainable, and modular** Python library for **Constrained Reinforcement Learning (Safe RL)** with real-time **SHAP & saliency-based explainability**, custom metrics, and Gym-compatible wrappers.

<p align="center">
  <img src="https://img.shields.io/github/license/satyamcser/saferl-lite?style=flat-square">
  <img src="https://img.shields.io/github/stars/satyamcser/saferl-lite?style=flat-square">
  <img src="https://img.shields.io/pypi/v/saferl-lite?style=flat-square">
  <img src="https://img.shields.io/github/actions/workflow/status/satyamcser/saferl-lite/ci.yml?branch=main&style=flat-square">
</p>

---

## 🌟 Overview

**SafeRL-Lite** empowers reinforcement learning agents to act under **safety constraints**, while remaining **interpretable** and **modular** for fast experimentation. It wraps standard Gym environments and DQN-based agents with:

- ✅ Safety constraint logic
- 🔍 Visual explainability (SHAP, saliency maps)
- 📊 Violation and reward tracking
- 🧪 Built-in testing and evaluations

---

## 🔧 Installation

> 📦 PyPI (coming soon)

```bash
pip install saferl-lite
```

## 🛠️ From source

```bash
git clone https://github.com/satyamcser/saferl-lite.git
cd saferl-lite
pip install -e .
```

## 🚀 Quickstart

Train a constrained DQN agent with saliency-based explainability:

```bash
python train.py --env CartPole-v1 --constraint pole_angle --explain shap
```

🔹 This:

- Adds a pole-angle constraint wrapper to the Gym env
- Logs violations
- Displays SHAP or saliency explanations for agent decisions

## 🧠 Features

#### ✅ Constrained RL

- Add custom constraints via wrapper or logic class
- Violation logging and reward shaping
- Safe vs unsafe episode tracking

#### 🔍 Explainability

- SaliencyExplainer — gradient-based visual heatmaps
- SHAPExplainer — feature contribution values per decision
- Compatible with any PyTorch-based agent

#### 📊 Metrics

- Constraint violation rate
- Episode reward
- Cumulative safe reward
- Action entropy & temporal behavior stats

#### 📚 Modularity

- Swap out agents, constraints, evaluators, or explainers
- Supports Gym environments
- Configurable training pipeline

## 📜 Citation

Coming soon after arXiv/preprint release.
File without changes
File without changes
File without changes