multi-agent-rlenv 3.7.2__py3-none-any.whl → 3.7.4__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- marlenv/__init__.py +88 -39
- marlenv/adapters/__init__.py +15 -0
- marlenv/adapters/smac_adapter.py +1 -1
- marlenv/catalog/__init__.py +21 -0
- marlenv/models/__init__.py +11 -0
- marlenv/models/step.py +9 -0
- multi_agent_rlenv-3.7.4.dist-info/METADATA +156 -0
- {multi_agent_rlenv-3.7.2.dist-info → multi_agent_rlenv-3.7.4.dist-info}/RECORD +10 -10
- multi_agent_rlenv-3.7.2.dist-info/METADATA +0 -144
- {multi_agent_rlenv-3.7.2.dist-info → multi_agent_rlenv-3.7.4.dist-info}/WHEEL +0 -0
- {multi_agent_rlenv-3.7.2.dist-info → multi_agent_rlenv-3.7.4.dist-info}/licenses/LICENSE +0 -0
marlenv/__init__.py
CHANGED
@@ -1,65 +1,114 @@
 """
 `marlenv` is a strongly typed library for multi-agent and multi-objective reinforcement learning.

+Install the library with
+```sh
+$ pip install multi-agent-rlenv # Basics
+$ pip install multi-agent-rlenv[all] # With all optional dependecies
+$ pip install multi-agent-rlenv[smac,overcooked] # Only SMACv2 & Overcooked
+```
+
 It aims to provide a simple and consistent interface for reinforcement learning environments by providing abstraction models such as `Observation`s or `Episode`s. `marlenv` provides adapters for popular libraries such as `gym` or `pettingzoo` and provides utility wrappers to add functionalities such as video recording or limiting the number of steps.

-Almost every class is a
+Almost every class is a dataclass to enable seemless serialiation with the `orjson` library.

-#
-
+# Fundamentals
+## States & Observations
+`MARLEnv.reset()` returns a pair of `(Observation, State)` and `MARLEnv.step()` returns a `Step`.

-
-
-
+- `Observation` contains:
+  - `data`: shape `[n_agents, *observation_shape]`
+  - `available_actions`: boolean mask `[n_agents, n_actions]`
+  - `extras`: extra features per agent (default shape `(n_agents, 0)`)
+- `State` represents the environment state and can also carry `extras`.
+- `Step` bundles `obs`, `state`, `reward`, `done`, `truncated`, and `info`.

-
-env2 = marlenv.make("CartPole-v1")
-env3 = PettingZoo("prospector_v4")
-env4 = SMAC("3m")
-env5 = Overcooked.from_layout("cramped_room")
-```
+Rewards are stored as `np.float32` arrays. Multi-objective envs use reward vectors with `reward_space.size > 1`.

-
-
+## Extras
+Extras are auxiliary features appended by wrappers (agent id, last action, time ratio, available actions, ...).
+Wrappers that add extras must update both `extras_shape` and `extras_meanings` so downstream users can interpret them.
+`State` extras should stay in sync with `Observation` extras when applicable.
+
+# Environment catalog
+`marlenv.catalog` exposes curated environments and lazily imports optional dependencies.

 ```python
-from marlenv import
+from marlenv import catalog

-
-
+env1 = catalog.overcooked().from_layout("scenario4")
+env2 = catalog.lle().level(6)
+env3 = catalog.DeepSea(mex_depth=5)
 ```

-
-
+Catalog entries require their corresponding extras at install time (e.g., `marlenv[overcooked]`, `marlenv[lle]`).
+
+# Wrappers & builders
+Wrappers are composable through `RLEnvWrapper` and can be chained via `Builder` for fluent configuration.

 ```python
-from marlenv import
-
-
-
-
-
-
-
-
-
+from marlenv import Builder
+from marlenv.adapters import SMAC
+
+env = (
+    Builder(SMAC("3m"))
+    .agent_id()
+    .time_limit(20)
+    .available_actions()
+    .build()
+)
 ```

-
-
+Common wrappers include time limits, delayed rewards, masking available actions, and video recording.
+
+# Using the library
+## Adapters for existing libraries
+Adapters normalize external APIs into `MARLEnv`:

 ```python
-
-
-
+import marlenv
+
+gym_env = marlenv.make("CartPole-v1", seed=25)

-
-
-
+from marlenv.adapters import SMAC
+smac_env = SMAC("3m", debug=True, difficulty="9")
+
+from pettingzoo.sisl import pursuit_v4
+from marlenv.adapters import PettingZoo
+env = PettingZoo(pursuit_v4.parallel_env())
 ```

-
-
+## Designing a custom environment
+Create a custom environment by inheriting from `MARLEnv` and implementing `reset`, `step`, `get_observation`, and `get_state`.
+
+```python
+import numpy as np
+from marlenv import MARLEnv, DiscreteSpace, Observation, State, Step
+
+class CustomEnv(MARLEnv[DiscreteSpace]):
+    def __init__(self):
+        super().__init__(
+            n_agents=3,
+            action_space=DiscreteSpace.action(5).repeat(3),
+            observation_shape=(4,),
+            state_shape=(2,),
+        )
+        self.t = 0
+
+    def reset(self):
+        self.t = 0
+        return self.get_observation(), self.get_state()
+
+    def step(self, action):
+        self.t += 1
+        return Step(self.get_observation(), self.get_state(), reward=0.0, done=False)
+
+    def get_observation(self):
+        return Observation(np.zeros((3, 4), dtype=np.float32), self.available_actions())
+
+    def get_state(self):
+        return State(np.array([self.t, 0], dtype=np.float32))
+```
 """

 from importlib.metadata import version, PackageNotFoundError
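The new docstring documents the `reset()`/`step()` contract but stops short of a full interaction loop. Below is a minimal rollout sketch against that documented API. It assumes the built-in `DeepSea` catalog entry needs no optional extra, that `step()` accepts a plain list of integer actions, and it reuses the `mex_depth` keyword exactly as spelled in the docstring; none of these details are confirmed by the diff itself.

```python
# Rollout sketch against the documented API: reset() -> (Observation, State),
# step() -> Step with obs / state / reward / done / truncated / info.
import numpy as np
from marlenv import catalog

env = catalog.DeepSea(mex_depth=5)
obs, state = env.reset()
done = truncated = False
while not (done or truncated):
    # Pick one available action per agent from the boolean mask documented on Observation.
    actions = [int(np.flatnonzero(mask)[0]) for mask in obs.available_actions]
    step = env.step(actions)
    obs, state = step.obs, step.state
    done, truncated = step.done, step.truncated
```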
marlenv/adapters/__init__.py
CHANGED
@@ -1,3 +1,18 @@
+"""
+Adapters for external RL libraries.
+
+This submodule provides optional wrappers that normalize third-party APIs into
+`MARLEnv`. Adapters are imported lazily via `try/except` so the base install
+remains lightweight. The availability flags (`HAS_GYM`, `HAS_PETTINGZOO`,
+`HAS_SMAC`) reflect whether the corresponding extra was installed.
+
+Install extras to enable adapters with `uv` or `pip`:
+- `multi-agent-rlenv[all]` for all optional dependencies
+- `multi-agent-rlenv[gym]` for Gymnasium
+- `multi-agent-rlenv[pettingzoo]` for PettingZoo
+- `multi-agent-rlenv[smac]` for SMAC
+"""
+
 from .pymarl_adapter import PymarlAdapter
 from marlenv.utils import dummy_function

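The availability flags named in this docstring suggest guarding before constructing an optional adapter. A small sketch follows; it assumes `HAS_SMAC` (and its siblings) are module-level booleans importable from `marlenv.adapters`, which the docstring names but the diff does not show being defined.

```python
# Guard on the availability flag described in the adapters docstring before
# constructing the SMAC adapter (map name "3m" is taken from the package README).
from marlenv import adapters

if adapters.HAS_SMAC:
    env = adapters.SMAC("3m")
else:
    raise RuntimeError("SMAC adapter unavailable; install multi-agent-rlenv[smac]")
```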
marlenv/adapters/smac_adapter.py
CHANGED
@@ -3,7 +3,7 @@ from typing import overload

 import numpy as np
 import numpy.typing as npt
-from
+from smacv2.env import StarCraft2Env

 from marlenv.models import MARLEnv, Observation, State, Step, MultiDiscreteSpace, DiscreteSpace

marlenv/catalog/__init__.py
CHANGED
@@ -1,3 +1,24 @@
+"""
+Environment catalog for `marlenv`.
+
+This submodule exposes curated environments and provides lazy imports for optional
+dependencies to keep the base install lightweight. Use the catalog to construct
+environments without importing their packages directly.
+
+Examples:
+```python
+from marlenv import catalog
+
+env1 = catalog.DeepSea(mex_depth=5)
+env2 = catalog.CoordinatedGrid()
+env3 = catalog.connect_n()(width=7, height=6, n_to_connect=4)
+env4 = catalog.smac()("3m")
+```
+
+Optional entries such as `smac`, `lle`, and `overcooked` require installing their
+corresponding extras (e.g., `marlenv[smac]`, `marlenv[lle]`, `marlenv[overcooked]`).
+"""
+
 from .deepsea import DeepSea
 from .matrix_game import MatrixGame
 from .coordinated_grid import CoordinatedGrid
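Because the catalog resolves optional packages lazily, a caller can fall back to a built-in entry when an extra is missing. A hedged sketch: the exact exception raised for a missing extra is an assumption (`ImportError` here), and the `mex_depth` keyword is spelled as it appears in the docstring.

```python
# Fall back to a built-in catalog entry when an optional extra is not installed.
from marlenv import catalog

try:
    env = catalog.overcooked().from_layout("scenario4")  # requires marlenv[overcooked]
except ImportError:
    env = catalog.DeepSea(mex_depth=5)  # ships with the base install
```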
marlenv/models/__init__.py
CHANGED
@@ -1,3 +1,14 @@
+"""
+Core data models for the `marlenv` API.
+
+This package defines the typed containers and interfaces shared across adapters,
+wrappers, and environments:
+- `MARLEnv`: the abstract environment contract.
+- `Observation` / `State`: structured inputs to agents and state tracking.
+- `Step` / `Transition` / `Episode`: execution results and replayable logs.
+- `Space` variants: action/reward space definitions.
+"""
+
 from .spaces import DiscreteSpace, ContinuousSpace, MultiDiscreteSpace, Space
 from .observation import Observation
 from .step import Step
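The containers listed above can be exercised directly from this module. A short sketch, reusing the `DiscreteSpace.action(5).repeat(3)` call and the `Observation(data, available_actions)` construction that appear verbatim in the 3.7.4 README; the availability mask shape follows the documented `[n_agents, n_actions]` layout.

```python
import numpy as np
from marlenv.models import DiscreteSpace, Observation

# Action space from the README's CustomEnv example: 3 agents, 5 actions each.
action_space = DiscreteSpace.action(5).repeat(3)

# An Observation as documented: data of shape [n_agents, *observation_shape]
# plus a boolean availability mask of shape [n_agents, n_actions].
obs = Observation(np.zeros((3, 4), dtype=np.float32), np.ones((3, 5), dtype=bool))
```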
marlenv/models/step.py
CHANGED
@@ -9,6 +9,15 @@ from .state import State

 @dataclass
 class Step:
+    """
+    The result of performing a step in the environment:
+    - the new observation
+    - the new state
+    - the reward received for the step performed
+    - whether the episode is done or truncated
+    - some info (mainly for logging purposes)
+    """
+
     obs: Observation
     """The new observation (1 per agent) of the environment resulting from the agent's action."""
     state: State
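The new docstring spells out the fields a consumer of `Step` can rely on. A tiny helper sketch using only those documented names (`done`, `truncated`):

```python
from marlenv.models import Step

def is_terminal(step: Step) -> bool:
    """An episode ends when the step reports either done or truncated."""
    return step.done or step.truncated
```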
multi_agent_rlenv-3.7.4.dist-info/METADATA
ADDED

@@ -0,0 +1,156 @@
+Metadata-Version: 2.4
+Name: multi-agent-rlenv
+Version: 3.7.4
+Summary: A strongly typed Multi-Agent Reinforcement Learning framework
+Project-URL: repository, https://github.com/yamoling/multi-agent-rlenv
+Author-email: Yannick Molinghen <yannick.molinghen@ulb.be>
+License-File: LICENSE
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Requires-Python: <4,>=3.12
+Requires-Dist: numpy>=2.0.0
+Requires-Dist: opencv-python>=4.0
+Requires-Dist: typing-extensions>=4.0
+Provides-Extra: all
+Requires-Dist: gymnasium>0.29.1; extra == 'all'
+Requires-Dist: laser-learning-environment>=2.6.1; extra == 'all'
+Requires-Dist: overcooked>=0.1.0; extra == 'all'
+Requires-Dist: pettingzoo>=1.20; extra == 'all'
+Requires-Dist: pymunk>=6.0; extra == 'all'
+Requires-Dist: scipy>=1.10; extra == 'all'
+Requires-Dist: smacv2; extra == 'all'
+Requires-Dist: torch>=2.0; extra == 'all'
+Provides-Extra: gym
+Requires-Dist: gymnasium>=0.29.1; extra == 'gym'
+Provides-Extra: lle
+Requires-Dist: laser-learning-environment>=2.6.1; extra == 'lle'
+Provides-Extra: overcooked
+Requires-Dist: overcooked>=0.1.0; extra == 'overcooked'
+Provides-Extra: pettingzoo
+Requires-Dist: pettingzoo>=1.20; extra == 'pettingzoo'
+Requires-Dist: pymunk>=6.0; extra == 'pettingzoo'
+Requires-Dist: scipy>=1.10; extra == 'pettingzoo'
+Provides-Extra: smac
+Requires-Dist: pysc2; extra == 'smac'
+Requires-Dist: smacv2; extra == 'smac'
+Provides-Extra: torch
+Requires-Dist: torch>=2.0; extra == 'torch'
+Description-Content-Type: text/markdown
+
+# `marlenv` - A unified framework for muti-agent reinforcement learning
+**Documentation: [https://yamoling.github.io/multi-agent-rlenv](https://yamoling.github.io/multi-agent-rlenv)**
+
+`marlenv` is a strongly typed library for multi-agent and multi-objective reinforcement learning.
+
+Install the library with
+```sh
+$ pip install multi-agent-rlenv # Basics
+$ pip install multi-agent-rlenv[all] # With all optional dependecies
+$ pip install multi-agent-rlenv[smac,overcooked] # Only SMACv2 & Overcooked
+```
+
+It aims to provide a simple and consistent interface for reinforcement learning environments by providing abstraction models such as `Observation`s or `Episode`s. `marlenv` provides adapters for popular libraries such as `gym` or `pettingzoo` and provides utility wrappers to add functionalities such as video recording or limiting the number of steps.
+
+Almost every class is a dataclass to enable seemless serialiation with the `orjson` library.
+
+# Fundamentals
+## States & Observations
+`MARLEnv.reset()` returns a pair of `(Observation, State)` and `MARLEnv.step()` returns a `Step`.
+
+- `Observation` contains:
+  - `data`: shape `[n_agents, *observation_shape]`
+  - `available_actions`: boolean mask `[n_agents, n_actions]`
+  - `extras`: extra features per agent (default shape `(n_agents, 0)`)
+- `State` represents the environment state and can also carry `extras`.
+- `Step` bundles `obs`, `state`, `reward`, `done`, `truncated`, and `info`.
+
+Rewards are stored as `np.float32` arrays. Multi-objective envs use reward vectors with `reward_space.size > 1`.
+
+## Extras
+Extras are auxiliary features appended by wrappers (agent id, last action, time ratio, available actions, ...).
+Wrappers that add extras must update both `extras_shape` and `extras_meanings` so downstream users can interpret them.
+`State` extras should stay in sync with `Observation` extras when applicable.
+
+# Environment catalog
+`marlenv.catalog` exposes curated environments and lazily imports optional dependencies.
+
+```python
+from marlenv import catalog
+
+env1 = catalog.overcooked().from_layout("scenario4")
+env2 = catalog.lle().level(6)
+env3 = catalog.DeepSea(mex_depth=5)
+```
+
+Catalog entries require their corresponding extras at install time (e.g., `marlenv[overcooked]`, `marlenv[lle]`).
+
+# Wrappers & builders
+Wrappers are composable through `RLEnvWrapper` and can be chained via `Builder` for fluent configuration.
+
+```python
+from marlenv import Builder
+from marlenv.adapters import SMAC
+
+env = (
+    Builder(SMAC("3m"))
+    .agent_id()
+    .time_limit(20)
+    .available_actions()
+    .build()
+)
+```
+
+Common wrappers include time limits, delayed rewards, masking available actions, and video recording.
+
+# Using the library
+## Adapters for existing libraries
+Adapters normalize external APIs into `MARLEnv`:
+
+```python
+import marlenv
+
+gym_env = marlenv.make("CartPole-v1", seed=25)
+
+from marlenv.adapters import SMAC
+smac_env = SMAC("3m", debug=True, difficulty="9")
+
+from pettingzoo.sisl import pursuit_v4
+from marlenv.adapters import PettingZoo
+env = PettingZoo(pursuit_v4.parallel_env())
+```
+
+## Designing a custom environment
+Create a custom environment by inheriting from `MARLEnv` and implementing `reset`, `step`, `get_observation`, and `get_state`.
+
+```python
+import numpy as np
+from marlenv import MARLEnv, DiscreteSpace, Observation, State, Step
+
+class CustomEnv(MARLEnv[DiscreteSpace]):
+    def __init__(self):
+        super().__init__(
+            n_agents=3,
+            action_space=DiscreteSpace.action(5).repeat(3),
+            observation_shape=(4,),
+            state_shape=(2,),
+        )
+        self.t = 0
+
+    def reset(self):
+        self.t = 0
+        return self.get_observation(), self.get_state()
+
+    def step(self, action):
+        self.t += 1
+        return Step(self.get_observation(), self.get_state(), reward=0.0, done=False)
+
+    def get_observation(self):
+        return Observation(np.zeros((3, 4), dtype=np.float32), self.available_actions())
+
+    def get_state(self):
+        return State(np.array([self.t, 0], dtype=np.float32))
+```
+
+# Related projects
+- MARL: Collection of multi-agent reinforcement learning algorithms based on `marlenv` [https://github.com/yamoling/marl](https://github.com/yamoling/marl)
+- Laser Learning Environment: a multi-agent gridworld that leverages `marlenv`'s capabilities [https://pypi.org/project/laser-learning-environment/](https://pypi.org/project/laser-learning-environment/)
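The Extras section of this README says wrappers must keep `extras_shape` and `extras_meanings` up to date; inspecting both after building is a quick sanity check. A sketch, assuming the two names are readable attributes of the built environment and that a local SMAC/StarCraft II setup is available (neither is shown in this diff):

```python
# Inspect how chained wrappers grow the per-agent extras described above.
from marlenv import Builder
from marlenv.adapters import SMAC

env = Builder(SMAC("3m")).agent_id().time_limit(20).available_actions().build()
print(env.extras_shape)     # grows with agent id, time ratio, available actions
print(env.extras_meanings)  # labels describing each extra feature (assumed attribute)
```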
{multi_agent_rlenv-3.7.2.dist-info → multi_agent_rlenv-3.7.4.dist-info}/RECORD
CHANGED

@@ -1,15 +1,15 @@
-marlenv/__init__.py,sha256=
+marlenv/__init__.py,sha256=5HWxgfUTA1l-uGpvwEt1e8KxRINteqXPKshY-PItlxI,4875
 marlenv/env_builder.py,sha256=RUMFvW7dAJtHMLm8-oPVpjBefDtNliZtjlHci97Xj-Q,3874
 marlenv/env_pool.py,sha256=mJhJUROX9k2A2njwnUOBl2EAuhotksQMugH_Zydg1IU,951
 marlenv/exceptions.py,sha256=gJUC_2rVAvOfK_ypVFc7Myh-pIfSU3To38VBVS_0rZA,1179
 marlenv/mock_env.py,sha256=rvl4QAn046HM79IMMiAj1Aoy3_GBSNBBR1_9fHPutR8,4682
 marlenv/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
-marlenv/adapters/__init__.py,sha256=
+marlenv/adapters/__init__.py,sha256=1APqbpC2JmVgMhexdb8FbTifxFs7_mjqrcEkQquug8k,1182
 marlenv/adapters/gym_adapter.py,sha256=DXQ1czcvRoL9hTwcVzfMyXArZeVIHP1gAKqZJO87y7Y,3065
 marlenv/adapters/pettingzoo_adapter.py,sha256=UzSUdP4EUJOt49AB7H45ToA8rUkGmPQgrJKegvK86og,2877
 marlenv/adapters/pymarl_adapter.py,sha256=2s7EY31s1hrml3q-BBaXo_eDMXTjkebozZPvzsgrb9c,3353
-marlenv/adapters/smac_adapter.py,sha256=
-marlenv/catalog/__init__.py,sha256=
+marlenv/adapters/smac_adapter.py,sha256=HXECjK4hs4c1zAr1qWUChrJZdcvsu-OhnDovS0_u9Z0,8240
+marlenv/catalog/__init__.py,sha256=YK8w6wUleIZkO85f_5e0Dj_7HEqX3X0u1CgXeTX6IE0,1215
 marlenv/catalog/coordinated_grid.py,sha256=Kq5UzG9rr5gYRO0QWFCmKmO56JIzgIR19an9_pvypJU,4997
 marlenv/catalog/deepsea.py,sha256=yTyvskWZiAZem11L8cZwHedBIDQ4EAxE2IaUKrjKL2U,2413
 marlenv/catalog/matrix_game.py,sha256=zkErnh6ZIa1kBryYMVLw-jeMCd2AJ-BlP2yROxpbb0w,1519

@@ -17,13 +17,13 @@ marlenv/catalog/two_steps.py,sha256=lI-q4-Q8283QZTjY0wk7OfXWB6Ln-lquYUjHyT4URi4,
 marlenv/catalog/connectn/__init__.py,sha256=BKfM0ZofMK6zqGURi2bzILyNFfYjfbZpKTs5ikKiJAk,195
 marlenv/catalog/connectn/board.py,sha256=GVcFA1OJgLUmQoTIfOO9M7nL9dFv-4T3tGrVsP15zyg,6124
 marlenv/catalog/connectn/env.py,sha256=Ot5vfAbzS6eRe3-nLW_AkhEH7F1WVvv4_odoxZU7HNg,1905
-marlenv/models/__init__.py,sha256=
+marlenv/models/__init__.py,sha256=M6nXAZJWpTdncWm-4wN5V05waUAp4KJ007efw-xbMDQ,854
 marlenv/models/env.py,sha256=BG1iVHxGD_p827mF0ewyOBn6wU2gtFsHLW1b4UtW-V0,7841
 marlenv/models/episode.py,sha256=zsyxsW4LIioPKyY4DZKn64A31e5ZvlwOf3HIGuRUzhs,13531
 marlenv/models/observation.py,sha256=6uY2h0zHBm6g1ECzD8jZLXuSzuuX-U60QW0E_b4qPuc,3569
 marlenv/models/spaces.py,sha256=d_aIPWwPdaOWZeNRUUdzSiDxs9XQb9itPnrE_EyhhfQ,7810
 marlenv/models/state.py,sha256=JvCXwf0l7L2UMHkvYp-WM_aDegJ-hePpQI2yiUw6X_g,2099
-marlenv/models/step.py,sha256=
+marlenv/models/step.py,sha256=xg_7iPyOvahsZ5k7L6On7E_j0dUDEu0h6eyqFsWGR-M,3337
 marlenv/models/transition.py,sha256=UkJVRNxZoyRkjE7YmKtUf_4xA7cOEh20O60dTldbvys,5070
 marlenv/utils/__init__.py,sha256=ky5mz_T7EF65YNaEN1UDCUYZVlz7hFyKResgIJlE_1Q,462
 marlenv/utils/cached_property_collector.py,sha256=IOjbr61f0DqLhcidXKrl7MhN1BOEGiTzCANIKQCxaF0,600

@@ -45,7 +45,7 @@ marlenv/wrappers/rlenv_wrapper.py,sha256=iFSQsDMkUUbQJKEO8l6SosNi-eOUVSh4pIJVu7a
 marlenv/wrappers/state_counter.py,sha256=QmEMb55vOnK-VJuvKsDIIBgcNRsHuovqgpK2pcCY7sA,1211
 marlenv/wrappers/time_limit.py,sha256=HctKeiepPQ2NAIa208SnvknioSkRIuUQ4X-Xhf_XTs0,3974
 marlenv/wrappers/video_recorder.py,sha256=mtWcqaYNCu-zjVXvpa8DJe3_062tpK_TChOu-Xyxs3s,2533
-multi_agent_rlenv-3.7.
-multi_agent_rlenv-3.7.
-multi_agent_rlenv-3.7.
-multi_agent_rlenv-3.7.
+multi_agent_rlenv-3.7.4.dist-info/METADATA,sha256=UKdalybcSN3nvAVbJtZXgQFr9R3AEjKq0E3OFvlM3ZE,5971
+multi_agent_rlenv-3.7.4.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
+multi_agent_rlenv-3.7.4.dist-info/licenses/LICENSE,sha256=_eeiGVoIJ7kYt6l1zbIvSBQppTnw0mjnYk1lQ4FxEjE,1074
+multi_agent_rlenv-3.7.4.dist-info/RECORD,,
multi_agent_rlenv-3.7.2.dist-info/METADATA
DELETED

@@ -1,144 +0,0 @@
-Metadata-Version: 2.4
-Name: multi-agent-rlenv
-Version: 3.7.2
-Summary: A strongly typed Multi-Agent Reinforcement Learning framework
-Project-URL: repository, https://github.com/yamoling/multi-agent-rlenv
-Author-email: Yannick Molinghen <yannick.molinghen@ulb.be>
-License-File: LICENSE
-Classifier: Operating System :: OS Independent
-Classifier: Programming Language :: Python :: 3
-Requires-Python: <4,>=3.12
-Requires-Dist: numpy>=2.0.0
-Requires-Dist: opencv-python>=4.0
-Requires-Dist: typing-extensions>=4.0
-Provides-Extra: all
-Requires-Dist: gymnasium>0.29.1; extra == 'all'
-Requires-Dist: laser-learning-environment>=2.6.1; extra == 'all'
-Requires-Dist: overcooked>=0.1.0; extra == 'all'
-Requires-Dist: pettingzoo>=1.20; extra == 'all'
-Requires-Dist: pymunk>=6.0; extra == 'all'
-Requires-Dist: pysc2; extra == 'all'
-Requires-Dist: scipy>=1.10; extra == 'all'
-Requires-Dist: smac; extra == 'all'
-Requires-Dist: torch>=2.0; extra == 'all'
-Provides-Extra: gym
-Requires-Dist: gymnasium>=0.29.1; extra == 'gym'
-Provides-Extra: lle
-Requires-Dist: laser-learning-environment>=2.6.1; extra == 'lle'
-Provides-Extra: overcooked
-Requires-Dist: overcooked>=0.1.0; extra == 'overcooked'
-Provides-Extra: pettingzoo
-Requires-Dist: pettingzoo>=1.20; extra == 'pettingzoo'
-Requires-Dist: pymunk>=6.0; extra == 'pettingzoo'
-Requires-Dist: scipy>=1.10; extra == 'pettingzoo'
-Provides-Extra: smac
-Requires-Dist: pysc2; extra == 'smac'
-Requires-Dist: smac; extra == 'smac'
-Provides-Extra: torch
-Requires-Dist: torch>=2.0; extra == 'torch'
-Description-Content-Type: text/markdown
-
-# `marlenv` - A unified framework for muti-agent reinforcement learning
-**Documentation: [https://yamoling.github.io/multi-agent-rlenv](https://yamoling.github.io/multi-agent-rlenv)**
-
-The objective of `marlenv` is to provide a common (typed) interface for many different reinforcement learning environments.
-
-As such, `marlenv` provides high level abstractions of RL concepts such as `Observation`s or `Transition`s that are commonly represented as mere (confusing) lists or tuples.
-
-## Installation
-Install with you preferred package manager (`uv`, `pip`, `poetry`, ...):
-```bash
-$ pip install marlenv[all] # Enable all features
-$ pip install marlenv # Basic installation
-```
-
-There are multiple optional dependencies if you want to support specific libraries and environments. Available options are:
-- `smac` for StarCraft II environments
-- `gym` for OpenAI Gym environments
-- `pettingzoo` for PettingZoo environments
-- `overcooked` for Overcooked environments
-
-Install them with:
-```bash
-$ pip install marlenv[smac] # Install SMAC
-$ pip install marlenv[gym,smac] # Install Gym & smac support
-```
-
-## Using the `marlenv` environment catalog
-Some environments are registered in the `marlenv` and can be easily instantiated via its catalog.
-
-```python
-from marlenv import catalog
-
-env1 = catalog.Overcooked.from_layout("scenario4")
-env2 = catalog.LLE.level(6)
-env3 = catalog.DeepSea(mex_depth=5)
-```
-Note that using the catalog requires the corresponding environment package to be installed. For instance you need to install the `laser-learning-environment` package to use `catalog.LLE`, which can be done by using the corresponding feature when at installation as shown below.
-```bash
-pip install multi-agent-rlenv[lle]
-```
-
-
-## Using `marlenv` with existing libraries
-`marlenv` provides adapters from most popular libraries to unify them under a single interface. Namely, `marlenv` supports `smac`, `gymnasium` and `pettingzoo`.
-
-```python
-import marlenv
-
-# You can instanciate gymnasium environments directly via their registry ID
-gym_env = marlenv.make("CartPole-v1", seed=25)
-
-# You can seemlessly instanciate a SMAC environment and directly pass your required arguments
-from marlenv.adapters import SMAC
-smac_env = SMAC("3m", debug=True, difficulty="9")
-
-# pettingzoo is also supported
-from pettingzoo.sisl import pursuit_v4
-from marlenv.adapters import PettingZoo
-pz_env = PettingZoo(pursuit_v4.parallel_env())
-```
-
-
-## Designing custom environments
-You can create your own custom environment by inheriting from the `RLEnv` class. The below example illustrates a gridworld with a discrete action space. Note that other methods such as `step` or `render` must also be implemented.
-```python
-import numpy as np
-from marlenv import RLEnv, DiscreteActionSpace, Observation
-
-N_AGENTS = 3
-N_ACTIONS = 5
-
-class CustomEnv(MARLEnv[DiscreteActionSpace]):
-    def __init__(self, width: int, height: int):
-        super().__init__(
-            action_space=DiscreteActionSpace(N_AGENTS, N_ACTIONS),
-            observation_shape=(height, width),
-            state_shape=(1,),
-        )
-        self.time = 0
-
-    def reset(self) -> Observation:
-        self.time = 0
-        ...
-        return obs
-
-    def get_state(self):
-        return np.array([self.time])
-```
-
-## Useful wrappers
-`marlenv` comes with multiple common environment wrappers, check the documentation for a complete list. The preferred way of using the wrappers is through a `marlenv.Builder`. The below example shows how to add a time limit (in number of steps) and an agent id to the observations of a SMAC environment.
-
-```python
-from marlenv import Builder
-from marlenv.adapters import SMAC
-
-env = Builder(SMAC("3m")).agent_id().time_limit(20).build()
-print(env.extras_shape) # -> (4, ) because there are 3 agents and the time counter
-```
-
-
-# Related projects
-- MARL: Collection of multi-agent reinforcement learning algorithms based on `marlenv` [https://github.com/yamoling/marl](https://github.com/yamoling/marl)
-- Laser Learning Environment: a multi-agent gridworld that leverages `marlenv`'s capabilities [https://pypi.org/project/laser-learning-environment/](https://pypi.org/project/laser-learning-environment/)
{multi_agent_rlenv-3.7.2.dist-info → multi_agent_rlenv-3.7.4.dist-info}/WHEEL
File without changes

{multi_agent_rlenv-3.7.2.dist-info → multi_agent_rlenv-3.7.4.dist-info}/licenses/LICENSE
File without changes