PyPI - alem-env - Versions diffs - 0.1.0__tar.gz - Mend

alem-env 0.1.0__tar.gz


Files changed (205)

alem_env-0.1.0/LICENSE ADDED

@@ -0,0 +1,19 @@
+Copyright (c) 2025
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

alem_env-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,580 @@
+Metadata-Version: 2.4
+Name: alem-env
+Version: 0.1.0
+Summary: A JAX environment for open-ended multi-agent coordination.
+License: MIT
+Project-URL: Homepage, https://github.com/alem-world/alem-env
+Project-URL: Repository, https://github.com/alem-world/alem-env
+Project-URL: Paper, https://arxiv.org/abs/2606.08340
+Keywords: reinforcement-learning,multi-agent,jax,llm-agents,benchmark
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Intended Audience :: Science/Research
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Requires-Python: <3.13,>=3.11
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: chex>=0.1.90
+Requires-Dist: flax==0.10.3
+Requires-Dist: imageio>=2.37.0
+Requires-Dist: jax==0.4.38
+Requires-Dist: jaxlib==0.4.38
+Requires-Dist: matplotlib>=3.8.0
+Requires-Dist: numpy>=1.26.0
+Requires-Dist: pillow>=10.0.0
+Requires-Dist: scipy>=1.12.0
+Requires-Dist: seaborn>=0.13.0
+Provides-Extra: gpu
+Requires-Dist: jax[cuda12]==0.4.38; extra == "gpu"
+Provides-Extra: cuda12
+Requires-Dist: jax[cuda12]==0.4.38; extra == "cuda12"
+Provides-Extra: llm
+Requires-Dist: openai>=1.0.0; extra == "llm"
+Provides-Extra: play
+Requires-Dist: pygame>=2.6.0; extra == "play"
+Provides-Extra: baselines-rl
+Requires-Dist: jaxmarl>=0.0.3; extra == "baselines-rl"
+Requires-Dist: distrax>=0.1.5; extra == "baselines-rl"
+Requires-Dist: optax>=0.2.0; extra == "baselines-rl"
+Requires-Dist: orbax-checkpoint>=0.5.0; extra == "baselines-rl"
+Requires-Dist: hydra-core>=1.3.0; extra == "baselines-rl"
+Requires-Dist: omegaconf>=2.3.0; extra == "baselines-rl"
+Requires-Dist: wandb>=0.16.0; extra == "baselines-rl"
+Requires-Dist: pyyaml>=6.0; extra == "baselines-rl"
+Provides-Extra: baselines-llm
+Requires-Dist: hydra-core>=1.3.0; extra == "baselines-llm"
+Requires-Dist: omegaconf>=2.3.0; extra == "baselines-llm"
+Requires-Dist: wandb>=0.16.0; extra == "baselines-llm"
+Requires-Dist: tqdm>=4.66.0; extra == "baselines-llm"
+Requires-Dist: openai>=1.0.0; extra == "baselines-llm"
+Requires-Dist: anthropic>=0.40.0; extra == "baselines-llm"
+Requires-Dist: google-genai>=0.3.0; extra == "baselines-llm"
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0.0; extra == "dev"
+Requires-Dist: pytest-cov>=5.0.0; extra == "dev"
+Requires-Dist: jaxtyping>=0.2.34; extra == "dev"
+Requires-Dist: beartype>=0.18.0; extra == "dev"
+Requires-Dist: ruff>=0.4.0; extra == "dev"
+Dynamic: license-file
+<h1 align="center">
+  <a href="https://arxiv.org/abs/2606.08340">
+    <img src="images/alem-logo-animated.gif" alt="Alem logo showing synchronous coordination" width="360" />
+  </a>
+</h1>
+<p align="center">
+  <a href="https://pypi.org/project/alem-env/"><img alt="PyPI" src="https://img.shields.io/pypi/v/alem-env.svg" /></a>
+  <a href="https://pypi.org/project/alem-env/"><img alt="Python versions" src="https://img.shields.io/pypi/pyversions/alem-env.svg" /></a>
+  <a href="https://github.com/alem-world/alem-env/actions/workflows/ci.yml"><img alt="CI" src="https://github.com/alem-world/alem-env/actions/workflows/ci.yml/badge.svg" /></a>
+  <a href="LICENSE"><img alt="License: MIT" src="https://img.shields.io/badge/License-MIT-yellow.svg" /></a>
+  <a href="https://arxiv.org/abs/2606.08340"><img alt="arXiv:2606.08340" src="https://img.shields.io/badge/arXiv-2606.08340-b31b1b.svg" /></a>
+  <a href="https://huggingface.co/alem-world/alem-rl-baselines"><img alt="Hugging Face Models" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-ffce1c.svg" /></a>
+</p>
+<p align="center">
+  <b><a href="https://alem-world.github.io/leaderboard">🏆 Leaderboard</a></b> · <a href="https://arxiv.org/abs/2606.08340">📄 Paper</a> · <a href="https://huggingface.co/alem-world/alem-rl-baselines">🤗 Models</a>
+  <!-- TODO: replace the leaderboard URL with the real reference-website link once it is live. -->
+</p>
+*Alem* is a JAX benchmark for open-ended multi-agent coordination. Building on [Craftax](https://github.com/MichaelTMatthews/Craftax) and [Multi-Agent Craftax / Craftax-Coop](https://github.com/BaselOmari/MA-Craftax), *Alem* introduces procedurally generated coordination tasks, soft specialisation, communication, and controllable coordination difficulty into a long-horizon survival world with exploration, crafting, trading, and combat. The same world is exposed through symbolic, pixel, and text interfaces, making it usable by MARL agents, language agents, and humans.
+*Alem* means *world* in Amharic.
+## Contents
+[RL Playing](#rl-agents-playing) · [LLM Playing](#llm-agents-playing) · [Install](#install) · [Quick Start](#quick-start) · [Configure](#configure) · [RL Agents](#rl-agents) · [LLM Agents](#llm-agents) · [Baselines](#baselines) · [Human Play](#human-play) · [Docker](#docker) · [Package Layout](#package-layout) · [Development](#development) · [RL vs LLM Interfaces](#rl-vs-llm-interfaces) · [Reproduce the Paper](#reproduce-the-paper) · [Contributing](#contributing) · [Citation](#citation) · [License](#license)
+## RL Agents Playing
+A team of MARL agents controlling the three players from symbolic observations, each acting from its own egocentric view.
+<p align="center">
+  <img src="images/sample_agents_playing.gif" alt="RL agents playing Alem" width="760" />
+</p>
+Agents are fast to train end-to-end in JAX. Full MARL training code and reference baselines live in [`baselines/`](baselines); see [Baselines](#baselines).
+## LLM Agents Playing
+The same world, viewed through the text interface. Each agent receives its own observation, can broadcast a free-form message to teammates every step, and stores important information in a private scratchpad.
+<p align="center">
+  <img src="images/llm_communication.gif" alt="LLM agents coordinating in Alem" width="820" />
+</p>
+<p align="center"><sub>Gemini 3.1 Pro (medium). <b>THINKING</b> = the agent's private plan; <b>MESSAGE</b> = what it broadcasts to the team.<br/><b>Each panel is held for several seconds so the reasoning is readable — this is not the agents' real decision speed.</b></sub></p>
+The warrior plans turns ahead **and predicts how a teammate will react** — then it happens:
+- **Plans ahead.** A turn-indexed plan `T87→T95`, with a fallback for the warrior's crafting penalty.
+- **Theory of mind.** *"A2 will get my message at T88, so they'll cancel their T90 action and wait for T95"* — and at T88, A2 does exactly that.
+- **Coordinates out loud.** Lines up all three agents for a synchronous mine to earn the *Coord Mine Stone Hard* bonus.
+See [LLM Agents](#llm-agents) to run it yourself.
+<details>
+<summary><b>👁️ Click to see what the agents actually see</b></summary>
+<br>
+A language agent receives a **system prompt** once (the rules) and, every step, a **text observation** (its current view). It must reply with an `<action>`, an optional `<communication>` broadcast, and an optional private `<scratchpad>`. The prompt uses **progressive disclosure**: it includes only the information relevant to the current level and adds more as agents reach new levels.
+**System prompt template** — placeholders in `{…}` are filled per agent/run (abridged; the full rules are sent verbatim):
+```text
+You are Agent {id} ({role}) in a {num_agents}-agent cooperative survival game. Your goal is to gather resources, craft gear, fight monsters, and descend through {num_levels} dungeon levels, while coordinating with teammates. You must survive — if your health reaches zero, you die, and if all agents die the game ends. Maximize achievements while alive.
+[ … full game rules: movement & facing, Do/interaction, crafting recipes, roles & specialist penalties, coordination (sync / handover / construction), the resource chain, and the {num_achievements} achievements …]
+<output_format>
+1. (Required) Exactly one action from the available list:
+   <action>YOUR_CHOSEN_ACTION</action>
+2. (Optional) Broadcast to teammates, up to {comm_char_limit} chars:
+   <communication>YOUR_MESSAGE</communication>
+3. (Optional) Private notes, up to {scratchpad_char_limit} chars — not shared; your only memory:
+   <scratchpad>YOUR_NOTES</scratchpad>
+Token budget: {token_budget} tokens for the full response (including reasoning).
+</output_format>
+```
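The reply format above can be parsed with a few regular expressions. The following is a minimal sketch, not the package's own parser (which lives under `alem/llm/`): the helper name `parse_reply`, the first-match strategy, and the truncation behaviour are assumptions made for illustration. The 400-character communication limit is the one quoted in the interface comparison later in this README; the scratchpad limit is a placeholder value.

```python
import re

# Tag names come from the <output_format> block; everything else here
# (helper name, first-match parsing, truncation) is an illustrative assumption.
_TAGS = ("action", "communication", "scratchpad")


def parse_reply(text, comm_char_limit=400, scratchpad_char_limit=1000):
    """Extract the tagged fields from a model reply (hypothetical helper).

    scratchpad_char_limit=1000 is a placeholder, not a documented value.
    """
    fields = {}
    for tag in _TAGS:
        # DOTALL lets a field span several lines; the first match wins.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        fields[tag] = match.group(1).strip() if match else None
    if fields["action"] is None:
        raise ValueError("reply has no <action> tag, which is required")
    # Optional fields are truncated to their character budgets.
    if fields["communication"] is not None:
        fields["communication"] = fields["communication"][:comm_char_limit]
    if fields["scratchpad"] is not None:
        fields["scratchpad"] = fields["scratchpad"][:scratchpad_char_limit]
    return fields


reply = (
    "I should join the synchronous mine.\n"
    "<action>Do</action>\n"
    "<communication>Mining stone at T95</communication>"
)
parsed = parse_reply(reply)
print(parsed)
```

Any reasoning text outside the tags is ignored, and a missing optional tag yields `None` rather than an empty string, so callers can tell "not sent" apart from "sent empty".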
+**[View the full system prompt, filled in](SYSTEM_PROMPT.md)** (a concrete 3-agent example on overworld).
+**Observation template** — the structure every agent receives each step:
+```text
+Step: {step}/{max_steps} ({steps_remaining} remaining, ends early if all agents die)
+Position: (x={x}, y={y})
+Role: {role}
+Location: {dungeon_level}
+Achievements: {unlocked}/{total} ({locked} unlock later)
+You see:
+- {object} {relative_position} (x={x}, y={y})
+  ...one line per visible object...
+Facing: {direction}. Do target: {object_in_front} (x={x}, y={y}).
+Coordination:
+- {object} (x={x}, y={y}): {how_this_object_must_be_coordinated}
+  ...one line per coordination-relevant object...
+Teammates:
+Agent {id} ({role}): {relative_position} (x={x}, y={y}), health={hp}
+  ...one line per teammate...
+Your status: health {hp}, food {food}, drink {drink}, energy {energy}, mana {mana}, xp {xp}
+Available actions: {legal_actions_this_step}
+```
+**Filled-in example** — what the warrior actually sees at step 0:
+```text
+Step: 0/10000 (10000 remaining, ends early if all agents die)
+Position: (x=24, y=24)
+Role: warrior
+Location: Overworld (surface)
+Achievements: 0/93 (39 unlock later)
+You see:
+- stone 5 steps east (x=29, y=24)
+- tree 1 step north and 2 steps west (x=22, y=23)
+- construction_site 2 steps north (x=24, y=22)
+- iron 3 steps south and 5 steps east (x=29, y=27)
+Facing: north. Do target: grass (x=24, y=23).
+Coordination:
+- construction_site (x=24, y=22): requires 3 agents to Build simultaneously (fails alone).
+- tree (x=26, y=23): one agent begins, another completes it within 6 steps (handover).
+- stone (x=20, y=21): works solo, but a bonus when 3 agents Do simultaneously.
+Teammates:
+Agent 1 (forager): 1 step east (x=25, y=24), health=9
+Agent 2 (miner): 1 step south (x=24, y=25), health=9
+Your status: health 9, food 9, drink 9, energy 9, mana 9, xp 0
+Available actions: Noop, Move {West,East,North,South}, Do, Sleep, Rest, Request {Food,Drink,Wood,Stone,Iron,Coal,Diamond,Ruby,Sapphire}
+```
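The relative-position phrases in this example follow from the coordinates: `y` grows southward (the teammate at `y=25` is "1 step south" of `y=24`) and `x` grows eastward. A small sketch of that conversion, assuming the vertical part is always listed first as it is above. The function name `describe_offset` and the `"here"` fallback for a zero offset are hypothetical, not taken from the package:

```python
def describe_offset(origin, target):
    """Describe target relative to origin, e.g. '3 steps south and 5 steps east'."""
    dx = target[0] - origin[0]  # positive x is east in the example observation
    dy = target[1] - origin[1]  # positive y is south in the example observation

    def steps(n, direction):
        # Singular "step" for a distance of exactly one.
        return f"{n} step{'s' if n != 1 else ''} {direction}"

    parts = []
    if dy != 0:
        parts.append(steps(abs(dy), "south" if dy > 0 else "north"))
    if dx != 0:
        parts.append(steps(abs(dx), "east" if dx > 0 else "west"))
    # "here" for a zero offset is an assumption; the example never shows that case.
    return " and ".join(parts) if parts else "here"


warrior = (24, 24)
print(describe_offset(warrior, (29, 27)))  # iron in the example
print(describe_offset(warrior, (22, 23)))  # tree in the example
```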
+</details>
+## Install
+```bash
+pip install alem-env          # latest release from PyPI
+```
+Or from source for development (editable install):
+```bash
+uv venv --python 3.12
+source .venv/bin/activate    # Linux / macOS
+# .venv\Scripts\activate     # Windows
+uv pip install -e .
+```
+Optional extras (these work with either the PyPI package, e.g. `pip install "alem-env[llm]"`, or the editable install shown below):
+```bash
+uv pip install -e ".[llm]"             # OpenAI-compatible LLM interface
+uv pip install -e ".[play]"            # pygame human-play example
+uv pip install -e ".[gpu]"             # NVIDIA CUDA 12 JAX wheels
+uv pip install -e ".[baselines-rl]"    # JAX MARL trainers (see Baselines)
+uv pip install -e ".[baselines-llm]"   # LLM-agent evaluation harness
+uv pip install -e ".[dev]"             # pytest, pytest-cov, jaxtyping, beartype, ruff
+```
+Plain `pip` also works (no uv required):
+```bash
+pip install -e .
+pip install -e ".[gpu]"
+```
+> **Running scripts with uv:** Commands below use `uv run python …`, which uses `.venv` without a prior `source activate` (activating once and calling `python …` also works). Note: it's `uv run python script.py` — `uv python script.py` is not valid.
+## Quick Start
+```python
+import jax
+from alem.alem_env import make_alem_env_from_name
+env = make_alem_env_from_name("Alem-Coop-Symbolic")
+# Reset with a PRNG key to get the initial observations and state.
+obs, state = env.reset(jax.random.PRNGKey(0))
+# Sample one random action per agent, each from its own key.
+rng_act = jax.random.split(jax.random.PRNGKey(1), env.num_agents)
+actions = {
+    agent: env.action_space(agent).sample(rng_act[i])
+    for i, agent in enumerate(env.agents)
+}
+# Step all agents at once; the step takes its own PRNG key.
+obs, state, rewards, dones, infos = env.step(jax.random.PRNGKey(2), state, actions)
+```
+Available environments:
+| Name                        | Description                                              |
+| --------------------------- | -------------------------------------------------------- |
+| `Alem-Coop-Symbolic`        | Full multi-agent environment, symbolic observations      |
+| `Alem-Coop-Pixels`          | Full multi-agent environment, pixel observations         |
+| `Alem-Coop-Symbolic-Debug`  | Smaller debug environment, overworld only (first floor)  |
+| `Alem-SingleAgent-Symbolic` | Single-agent variant (experimental)                      |
+## Configure
+```python
+from alem.alem_env import make_alem_env_from_name
+from alem.alem_coop.alem_state import EnvParams, StaticEnvParams, get_coordination_params
+env_params = EnvParams().replace(
+    **get_coordination_params("easy"),
+    soft_specialization=True,
+    shared_reward=False,
+)
+static_env_params = StaticEnvParams(player_count=3, num_comm_channels=4)
+env = make_alem_env_from_name(
+    "Alem-Coop-Symbolic",
+    env_params=env_params,
+    static_env_params=static_env_params,
+)
+```
+Coordination difficulty can be `"none"`, `"easy"`, `"medium"`, `"hard"`, or a numeric alpha in `[0, 1]`.
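When the difficulty comes from user input (a CLI flag or config file), it can help to check it against the accepted domain before building the environment. The sketch below encodes only what the sentence above states: four named levels, or a numeric alpha in `[0, 1]`. The helper `check_coordination_difficulty` is hypothetical and is not part of `alem`; how the named levels map to alpha values is not stated here, so no mapping is attempted.

```python
NAMED_LEVELS = ("none", "easy", "medium", "hard")  # names listed in the README


def check_coordination_difficulty(value):
    """Return value unchanged if it is a valid difficulty, else raise ValueError.

    Hypothetical helper: it only mirrors the documented domain.
    """
    if isinstance(value, str):
        if value not in NAMED_LEVELS:
            raise ValueError(f"unknown difficulty {value!r}; expected one of {NAMED_LEVELS}")
        return value
    # bool is a subclass of int, so reject it explicitly before the numeric check.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"difficulty must be a name or a number, got {value!r}")
    alpha = float(value)
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f"numeric alpha must lie in [0, 1], got {alpha}")
    return alpha


print(check_coordination_difficulty("easy"))
print(check_coordination_difficulty(0.25))
```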
+## RL Agents
+A minimal framework-free rollout using symbolic observations and legal action masks:
+```bash
+uv run python examples/random_rl_agent.py --coord easy --steps 100
+uv run python examples/random_rl_agent.py --players 2 --coord hard --steps 200
+```
+The example uses a jitted `lax.scan` loop and can serve as a template for custom policies. Full MARL training recipes (IPPO, HyperMARL-IPPO, MAPPO, PQN-VDN) live in [`baselines/`](baselines) — see [Baselines](#baselines).
+## LLM Agents
+Preview 3-agent text observations without any model calls:
+```bash
+uv run python examples/llm_text_smoke.py --coord easy --show-affordances
+```
+Run one 3-agent step with any OpenAI-compatible model:
+```bash
+export OPENAI_API_KEY=sk-...
+uv run python examples/llm_openai_smoke.py --model gpt-4o-mini --steps 1
+```
+Local vLLM server:
+```bash
+uv run python examples/llm_openai_smoke.py \
+    --base-url http://localhost:8000/v1 \
+    --api-key EMPTY \
+    --model meta-llama/Llama-3.2-1B-Instruct \
+    --steps 1
+```
+Full LLM evaluation runners live in [`baselines/llm/`](baselines/llm) — see [Baselines](#baselines).
+## Baselines
+Reference MARL training code and the LLM-agent evaluation harness live in this repo under [`baselines/`](baselines). Following the [CleanRL](https://github.com/vwxyzjn/cleanrl) philosophy — and [JaxMARL](https://github.com/FLAIROx/JaxMARL), which these are adapted from — each RL algorithm is a single self-contained file with a matching [Hydra](https://hydra.cc) config in `baselines/config/`.
+Install only the set you need:
+```bash
+uv pip install -e ".[baselines-rl]"    # JAX MARL trainers (IPPO / MAPPO / PQN-VDN / HyperMARL)
+uv pip install -e ".[baselines-llm]"   # LLM-agent evaluation harness
+```
+### RL training
+| Algorithm                       | Entry point                       | Reference                                              |
+| ------------------------------- | --------------------------------- | ------------------------------------------------------ |
+| IPPO (RNN, shared params)       | `baselines/ippo_rnn.py`           | [IPPO](https://arxiv.org/abs/2011.09533)    |
+| IPPO (RNN, no param sharing)    | `baselines/ippo_rnn_nops.py`      | [IPPO](https://arxiv.org/abs/2011.09533)    |
+| HyperMARL-IPPO (RNN)            | `baselines/ippo_hypermarl_rnn.py` | [HyperMARL](https://arxiv.org/abs/2412.04233) ([code](https://github.com/KaleabTessera/HyperMARL)) |
+| MAPPO (RNN)                     | `baselines/mappo_rnn.py`          | [MAPPO](https://arxiv.org/abs/2103.01955)    |
+| PQN-VDN (RNN)                   | `baselines/pqn_vdn_rnn.py`        | [PQN](https://arxiv.org/abs/2407.04811) ([code](https://github.com/mttga/purejaxql))  |
+Run the baselines from the `baselines/` directory and override config values on the command line:
+```bash
+cd baselines
+python ippo_rnn.py
+python mappo_rnn.py coordination_difficulty=hard   # override any config value
+```
+### Running stored policies
+Pretrained RL checkpoints from the paper are on the Hugging Face Hub at
+[**alem-world/alem-rl-baselines**](https://huggingface.co/alem-world/alem-rl-baselines):
+120 checkpoints = 2 training budgets (`100M`, `1B` env steps) × 4 algorithms × 3
+difficulties × 5 seeds, laid out as `<budget>/<algorithm>/<difficulty>/seed<N>/`.
+Each trainer can skip training and instead restore a saved checkpoint, then run the
+same final evaluation (and visualization) used after training. Download the checkpoints,
+then pass `LOAD_CHECKPOINT` pointing at the checkpoint directory:
+```bash
+# 1. Download the checkpoints (needs: pip install -U huggingface_hub)
+hf download alem-world/alem-rl-baselines --local-dir alem-rl-baselines
+# 2. Reload and evaluate an IPPO policy (note NUM_COMM_CHANNELS=4)
+cd baselines
+python ippo_rnn.py \
+    +LOAD_CHECKPOINT=../alem-rl-baselines/1B/ippo-rnn/hard/seed0/checkpoint \
+    NUM_COMM_CHANNELS=4 \
+    EVAL_DIFFICULTIES=[hard] \
+    +VISUALIZE=True
+```
+GIFs are saved in `./outputs/`. Set `VISUALIZE=False` to skip rendering and run only the numeric evaluation.
+> **Important — the config must match how the checkpoint was trained.** Checkpoint
+> shapes are fixed at training time, so the env config (number of agents, communication
+> channels, etc.) must match or the restore will fail with a shape mismatch. The released
+> checkpoints were all trained with **4 communication channels**, so load them with
+> `NUM_COMM_CHANNELS=4`. The exact overrides for any checkpoint are stored under
+> `reload_overrides` in its `config.json`.
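The `<budget>/<algorithm>/<difficulty>/seed<N>/` layout described above can be enumerated to script a sweep over checkpoints. This is a sketch under stated assumptions: only the budgets `100M` and `1B`, the algorithm directory `ippo-rnn`, the difficulty `hard`, and `seed0` appear in this README. The other algorithm directory names below are placeholders, and the `easy`/`medium` difficulties and the `seed0`..`seed4` numbering are guesses, so check the Hub repository for the real names.

```python
from itertools import product
from pathlib import PurePosixPath

BUDGETS = ("100M", "1B")  # stated in the README
# Only "ippo-rnn" is confirmed by the example command; the rest are placeholders.
ALGORITHMS = ("ippo-rnn", "algo-2", "algo-3", "algo-4")
DIFFICULTIES = ("easy", "medium", "hard")  # "hard" confirmed, others assumed
SEEDS = range(5)  # assumed to be seed0..seed4


def checkpoint_dirs(root="alem-rl-baselines"):
    """List every <budget>/<algorithm>/<difficulty>/seed<N> directory under root."""
    return [
        PurePosixPath(root) / budget / algo / difficulty / f"seed{seed}"
        for budget, algo, difficulty, seed in product(
            BUDGETS, ALGORITHMS, DIFFICULTIES, SEEDS
        )
    ]


dirs = checkpoint_dirs()
print(len(dirs))  # 2 budgets x 4 algorithms x 3 difficulties x 5 seeds
print(dirs[0] / "checkpoint")
```

Each resulting directory, with `checkpoint` appended, has the same shape as the `+LOAD_CHECKPOINT=` path in the command above.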
+### LLM-agent evaluation
+The harness (derived from [BALROG](https://github.com/balrog-ai/BALROG)) drives 3 language agents through the text interface and supports vLLM, OpenAI, Anthropic, Gemini, and other OpenAI-compatible providers. See [`baselines/llm/README.md`](baselines/llm/README.md) for full launch commands and configuration.
+```bash
+cd baselines/llm
+export OPENAI_API_KEY=sk-...
+python eval_alem.py \
+    clients.0.client_name=openai \
+    clients.1.client_name=openai \
+    clients.2.client_name=openai
+```
+## Human Play
+Install the optional play dependencies first:
+```bash
+uv pip install -e ".[play]"
+```
+```bash
+uv run python examples/play_alem.py
+uv run python examples/play_alem.py --players 3 --coord easy --seed 42
+uv run python examples/play_alem.py --players 2 --coord hard --god
+```
+| Key         | Action           |
+| ----------- | ---------------- |
+| `W A S D`   | Move             |
+| `Space`     | Do / interact    |
+| `Tab`       | Sleep            |
+| `E`         | Rest             |
+| `.` / `,`   | Descend / ascend |
+| `Backspace` | Give to teammate |
+| `Q`         | No-op            |
+The game advances after all players have chosen an action.
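The "advance only once everyone has chosen" rule can be sketched without pygame. Everything in this block is illustrative: the key names, the `W`/`A`/`S`/`D` to north/west/south/east assignment, and the `(player, key)` event stream are assumptions, and `examples/play_alem.py` may handle input differently. Only the keys whose action names appear in the text observation's action list are mapped.

```python
# Key-to-action table based on the README's key bindings; the WASD direction
# assignment is the usual convention and is assumed here, not confirmed.
KEY_TO_ACTION = {
    "w": "Move North",
    "a": "Move West",
    "s": "Move South",
    "d": "Move East",
    "space": "Do",
    "tab": "Sleep",
    "e": "Rest",
    "q": "Noop",
}


def collect_turn(key_events, num_players):
    """Consume (player, key) events until every player has one action.

    Returns {player: action} once all players have chosen, or None if the
    events run out first. Hypothetical helper for illustration only.
    """
    chosen = {}
    for player, key in key_events:
        action = KEY_TO_ACTION.get(key)
        if action is None or player in chosen:
            continue  # ignore unmapped keys and repeat presses by the same player
        chosen[player] = action
        if len(chosen) == num_players:
            return chosen
    return None


events = [(0, "w"), (1, "space"), (0, "q"), (2, "tab")]
turn = collect_turn(events, num_players=3)
print(turn)
```

In this sketch a player's first valid key press is final for the turn, which keeps the joint action well defined before the environment steps.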
+## Docker
+**Build:**
+```bash
+# CPU (default)
+docker build -f docker/Dockerfile.env -t alem-env .
+# GPU — NVIDIA CUDA 12
+docker build -f docker/Dockerfile.env --build-arg ALEM_ACCELERATOR=cuda12 -t alem-env:gpu .
+# With optional extras (e.g. LLM + play)
+docker build -f docker/Dockerfile.env --build-arg ALEM_EXTRAS=llm,play -t alem-env:extras .
+```
+**Run:**
+The image uses the system Python (`UV_SYSTEM_PYTHON=1`), so inside a container you call `python` directly — no `uv run` prefix needed.
+```bash
+# Smoke test — confirms the install works (default CMD)
+docker run --rm alem-env
+docker run --rm --gpus all alem-env:gpu
+# Run examples
+docker run --rm alem-env python examples/random_rl_agent.py --steps 20
+docker run --rm alem-env python examples/llm_text_smoke.py --coord easy
+# LLM smoke test (pass your API key)
+docker run --rm -e OPENAI_API_KEY=$OPENAI_API_KEY alem-env:extras \
+    python examples/llm_openai_smoke.py --model gpt-4o-mini --steps 1
+# Interactive shell
+docker run --rm -it alem-env bash
+```
+> **Human play is easiest natively** — `uv pip install -e ".[play]"` then `uv run python examples/play_alem.py`. Pygame opens a real window with no display plumbing.
+<details>
+<summary><b>Running human play inside Docker (X11 setup)</b></summary>
+<br>
+Inside Docker, human play needs an X11 display *and* an image built with the `play` extra (so the SDL/X11 libs are present):
+```bash
+# Build with the play extra (adds pygame + SDL/X11 runtime libs)
+docker build -f docker/Dockerfile.env --build-arg ALEM_EXTRAS=play -t alem-env:play .
+```
+- **Linux:** grant the container access to your X server first (this is the step that's usually missing when the window never appears), then run it:
+  ```bash
+  xhost +local:root
+  docker run --rm -it --network host -e DISPLAY=$DISPLAY \
+      -v /tmp/.X11-unix:/tmp/.X11-unix alem-env:play python examples/play_alem.py
+  xhost -local:root   # revoke when done
+  ```
+- **macOS / Windows:** start XQuartz (macOS) or VcXsrv (Windows), enable "allow connections from network clients", then set `DISPLAY` accordingly.
+</details>
+## Package Layout
+<details>
+<summary><b>Repository map — where each piece lives</b></summary>
+<br>
+| Path                           | Purpose                                                               |
+| ------------------------------ | --------------------------------------------------------------------- |
+| `alem/alem_env.py`             | Environment factory                                                   |
+| `alem/alem_coop/envs/`         | Symbolic, pixel, debug, and single-agent env classes                  |
+| `alem/alem_coop/alem_state.py` | State dataclasses and `EnvParams` / `StaticEnvParams`                 |
+| `alem/alem_coop/game_logic.py` | Step logic: movement, resources, combat, crafting, coordination       |
+| `alem/alem_coop/world_gen/`    | Procedural world generation                                           |
+| `alem/alem_coop/renderer/`     | Symbolic, pixel, and text renderers                                   |
+| `alem/alem_coop/constants.py`  | Actions, achievements, blocks, items, textures                        |
+| `alem/llm/`                    | Text observations, action parsing, ASCII maps, LLM evaluator adapters |
+| `examples/random_rl_agent.py`  | Masked-random symbolic rollout (RL template)                          |
+| `examples/llm_text_smoke.py`   | Preview text observations without model calls                         |
+| `examples/llm_openai_smoke.py` | One-step OpenAI-compatible LLM smoke test                             |
+| `examples/play_alem.py`        | Human pygame player                                                   |
+</details>
+## Development
+```bash
+uv pip install -e ".[dev]"   # pytest, pytest-cov, jaxtyping, beartype, ruff
+uv run pytest alem/tests/    # run the test suite
+```
+<details>
+<summary><b>Lint &amp; format (ruff)</b></summary>
+<br>
+Code style is enforced with [ruff](https://docs.astral.sh/ruff/) (config in `pyproject.toml`). CI runs these checks before the test suite, so run them locally first:
+```bash
+uv run ruff check .          # lint
+uv run ruff format --check . # verify formatting
+```
+To auto-fix before committing:
+```bash
+uv run ruff check --fix .    # apply safe lint fixes
+uv run ruff format .         # format the code
+```
+</details>
+## RL vs LLM Interfaces
+Both interfaces drive the **same** environment, but their results are **not directly comparable**: treat cross-paradigm scores as indicative, not head-to-head.
+|                     | MARL (symbolic)                                          | LLM (text)                                                                        |
+| ------------------- | -------------------------------------------------------- | --------------------------------------------------------------------------------- |
+| **Observation**     | Numeric vector                                           | Natural-language text                                                             |
+| **Communication**   | A discrete signal on one of `num_comm_channels` (e.g. 4) | Free-form text, ≤ 400 chars                                                       |
+| **Comms vs acting** | Costs your action that turn                              | Sent alongside the action                                                         |
+| **Memory**          | Recurrent hidden state                                   | Private `<scratchpad>` notes — not shared; the agent's only memory across steps.  |
+| **Learning**        | Trained from reward                                      | Zero-shot                                                                         |
+(`Request`/`Give` resource transfers are ordinary actions in both.)
+Text observations apply lightweight preprocessing, including compact local-state summaries (inspired by [BALROG](https://github.com/balrog-ai/BALROG/blob/b7afe79e3e4265811cfa985ed7c95c4d1a11e3f5/balrog/environments/crafter/env.py#L129)) and action affordances. See the [language wrapper](alem/llm/alem_language_wrapper.py) for details.
+## Reproduce the Paper
+The full experiments from the paper — the 13-LLM evaluation and the RL baselines (IPPO, HyperMARL-IPPO, MAPPO, PQN-VDN) — live in [`baselines/`](baselines); see [Baselines](#baselines) for launch commands and configs.
+The paper's numbers were produced against *Alem* [`v0.1.0`](https://github.com/alem-world/alem-env/releases/tag/v0.1.0). For the exact settings to use when reporting an Alem number — seeds, episode count, metrics — see the canonical [evaluation protocol](EVALUATION.md).
+## Contributing
+Contributions are welcome — new baselines, bug fixes, docs, and coordination tasks. See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the dev setup, lint/test workflow, and PR checklist, and [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md) for community expectations. To put a result on the [leaderboard](https://alem-world.github.io/leaderboard), follow the submission instructions there.
+## Citation
+```bibtex
+@article{tessera2026alem,
+  title   = {Benchmarking Open-Ended Multi-Agent Coordination in Language Agents},
+  author  = {Tessera, {Kale-ab} Abebe and Szecsenyi, Andras and Barker, Cameron and
+             Rutherford, Alexander and Paglieri, Davide and Scannell, Aidan and
+             Gouk, Henry and Crowley, Elliot J. and Rockt\"{a}schel, Tim and
+             Storkey, Amos},
+  year    = {2026},
+  url     = {https://arxiv.org/abs/2606.08340}
+}
+```
+## License
+MIT. See `LICENSE`.