PyPI - cogames-agents - Versions diffs - 0.0.0.7__cp312-cp312-macosx_11_0_arm64.whl - Mend

cogames-agents 0.0.0.7__cp312-cp312-macosx_11_0_arm64.whl


Files changed (128)

cogames_agents/policy/scripted_agent/cogsguard/README.md ADDED

@@ -0,0 +1,252 @@
+# CoGsGuard Scripted Agent
+A vibe-based multi-agent policy for the Cogs vs Clips arena game.
+## Vibe-Based Role System
+Agents use **vibes** to determine their behavior dynamically:
+| Vibe        | Behavior                                                        |
+| ----------- | --------------------------------------------------------------- |
+| `default`   | Do nothing (noop) - agent is idle                               |
+| `gear`      | Pick a role via the smart/evolutionary coordinator, change vibe |
+| `miner`     | Get miner gear if needed, then mine resources                   |
+| `scout`     | Get scout gear if needed, then explore the map                  |
+| `aligner`   | Get aligner gear if needed, then align junctions to cogs        |
+| `scrambler` | Get scrambler gear if needed, then scramble enemy junctions     |
+| `heart`     | Do nothing (noop)                                               |
+This allows external systems (like training policies) to control agent behavior by setting their vibe.
+## Game Rules
+### Overview
+CoGsGuard is a team-based resource management game where the **Cogs** team competes against the **Clips** team. The Cogs
+team uses this scripted policy while Clips can be controlled by another policy or bot.
+### Resources
+- **Elements**: carbon, oxygen, germanium, silicon (gathered from extractors)
+- **Energy**: Required for movement (auto-regenerates near aligned structures)
+- **Hearts**: Required for align/scramble actions
+- **Influence**: Required for aligning supply depots
+- **HP**: Health points
+### Key Structures
+| Structure                  | Owner             | Function                                                     |
+| -------------------------- | ----------------- | ------------------------------------------------------------ |
+| **Main Nexus**             | Cogs              | Energy AOE regeneration, resource deposits, heart withdrawal |
+| **Supply Depot (Charger)** | Clips (initially) | Can be scrambled (→neutral) then aligned (→cogs)             |
+| **Gear Stations**          | Cogs              | Dispense role-specific gear (costs commons resources)        |
+| **Extractors**             | Neutral           | Gather element resources (in map corners)                    |
+### Gear System
+Agents must acquire role-specific gear from gear stations before executing their role. Gear costs are paid from the
+**cogs commons** inventory:
+| Gear      | Cost                                       | Bonus                  |
+| --------- | ------------------------------------------ | ---------------------- |
+| Miner     | 3 carbon, 1 oxygen, 1 germanium, 1 silicon | +40 cargo capacity     |
+| Scout     | 1 carbon, 1 oxygen, 1 germanium, 3 silicon | +100 energy, +400 HP   |
+| Aligner   | 3 carbon, 1 oxygen, 1 germanium, 1 silicon | +20 influence capacity |
+| Scrambler | 1 carbon, 3 oxygen, 1 germanium, 1 silicon | +200 HP                |
+### Supply Depot Mechanics
+- **Align**: Convert neutral depot to cogs-aligned (requires aligner gear + 1 influence + 1 heart)
+- **Scramble**: Remove depot's alignment (requires scrambler gear + 1 heart)
+- Aligned depots provide energy AOE to their team
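The depot rules above amount to a small state machine. The following is a minimal sketch, not code from the package: `Alignment`, `Agent`, `try_scramble`, and `try_align` are hypothetical names, and the sketch assumes that each action consumes the heart (and influence) it requires and that Cogs agents only scramble Clips-aligned depots.

```python
from dataclasses import dataclass
from enum import Enum


class Alignment(Enum):
    CLIPS = "clips"
    NEUTRAL = "neutral"
    COGS = "cogs"


@dataclass
class Agent:
    gear: str           # e.g. "aligner" or "scrambler"
    hearts: int = 0
    influence: int = 0


def try_scramble(agent: Agent, depot: Alignment) -> Alignment:
    """Scramble: needs scrambler gear + 1 heart; clips-aligned -> neutral."""
    if agent.gear == "scrambler" and agent.hearts >= 1 and depot is Alignment.CLIPS:
        agent.hearts -= 1  # assumption: the heart is spent on success
        return Alignment.NEUTRAL
    return depot  # requirements not met: depot unchanged


def try_align(agent: Agent, depot: Alignment) -> Alignment:
    """Align: needs aligner gear + 1 influence + 1 heart; neutral -> cogs."""
    ready = agent.gear == "aligner" and agent.hearts >= 1 and agent.influence >= 1
    if ready and depot is Alignment.NEUTRAL:
        agent.hearts -= 1      # assumption: both resources are spent
        agent.influence -= 1
        return Alignment.COGS
    return depot


depot = Alignment.CLIPS
depot = try_scramble(Agent("scrambler", hearts=1), depot)
depot = try_align(Agent("aligner", hearts=1, influence=1), depot)
print(depot)
```

The sketch makes the ordering constraint explicit: an aligner can do nothing with a Clips-aligned depot until a scrambler has neutralized it.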
+## Agent Strategy
+### Vibe State Machine
+```
+┌─────────────┐                    ┌─────────────┐
+│   default   │ ◄──────────────────│    heart    │
+│   (noop)    │                    │   (noop)    │
+└─────────────┘                    └─────────────┘
+                    External vibe change
+                           │
+                           ▼
+┌─────────────┐   Pick smart role   ┌───────────────┐
+│    gear     │ ──────────────────► │  role vibe    │
+│             │    role vibe        │ (miner/scout/ │
+└─────────────┘                     │  aligner/     │
+                                    │  scrambler)   │
+                                    └───────────────┘
+                                           │
+                                           ▼
+                                    ┌─────────────┐     ┌──────────────┐
+                                    │  GET_GEAR   │ ──► │ EXECUTE_ROLE │
+                                    └─────────────┘     └──────────────┘
+```
+### Phase System (within role vibes)
+When an agent has a role vibe (miner/scout/aligner/scrambler):
+1. **GET_GEAR**: Find and bump the role-specific gear station
+2. **EXECUTE_ROLE**: Perform role-specific behavior
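The vibe table and the two phases can be summarized as one dispatch function. This is an illustrative sketch only: `next_step` and the returned labels `"noop"` and `"pick_role"` are hypothetical and are not taken from `policy.py`.

```python
ROLE_VIBES = {"miner", "scout", "aligner", "scrambler"}
IDLE_VIBES = {"default", "heart"}


def next_step(vibe: str, has_role_gear: bool) -> str:
    """Map an agent's current vibe to the kind of step it takes next."""
    if vibe in IDLE_VIBES:
        return "noop"
    if vibe == "gear":
        # The coordinator picks a role and the agent changes to that role's vibe.
        return "pick_role"
    if vibe in ROLE_VIBES:
        # Within a role vibe, gear acquisition always precedes role execution.
        return "EXECUTE_ROLE" if has_role_gear else "GET_GEAR"
    raise ValueError(f"unknown vibe: {vibe!r}")


print(next_step("miner", has_role_gear=False))
print(next_step("miner", has_role_gear=True))
```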
+### Role Behaviors
+#### 🔨 Miner
+```
+1. Find nearest extractor (carbon/oxygen/germanium/silicon chests)
+2. Navigate to extractor and extract resources
+3. When cargo full (40 capacity), return to supply depot to deposit
+4. Repeat
+```
+#### 🔭 Scout
+```
+1. Explore the map systematically (high energy allows long-range scouting)
+2. Discover structures and resources for team knowledge
+3. Patrol map edges to maximize coverage
+```
+#### 🔗 Aligner
+```
+1. Get influence from nexus AOE (stand nearby)
+2. Get hearts from nexus (bump to withdraw from commons)
+3. Find neutral supply depots (after scrambler has neutralized them)
+4. Bump depot to align it to cogs
+5. Repeat
+```
+#### 🌀 Scrambler
+```
+1. Get hearts from nexus or chest
+2. Find clips-aligned supply depots (junctions)
+3. Bump depot to scramble (remove alignment → neutral)
+4. Repeat
+```
+### Exploration Strategy
+Agents explore systematically by cycling through cardinal directions:
+```
+East (8 steps) → South (8 steps) → West (8 steps) → North (8 steps) → repeat
+```
+Starting direction is East (where gear stations are typically located in hub maps).
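As a rough illustration of the cycle above, the direction for a given exploration step can be derived from the step index. The function name `exploration_direction` is hypothetical, and the sketch assumes the step counter starts at 0 and is never reset.

```python
DIRECTIONS = ("east", "south", "west", "north")
STEPS_PER_LEG = 8


def exploration_direction(step: int) -> str:
    """Return the exploration direction for a 0-based step index."""
    leg = step // STEPS_PER_LEG           # which 8-step leg we are on
    return DIRECTIONS[leg % len(DIRECTIONS)]


# Steps 0-7 head east, 8-15 south, 16-23 west, 24-31 north, then repeat.
print([exploration_direction(s) for s in (0, 7, 8, 16, 24, 32)])
```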
+### Resource Flow
+```
+Extractors ──► Miners ──► Commons ──► Gear Stations ──► Agents
+                              │
+                              └──► Hearts ──► Aligners/Scramblers
+```
+## Known Limitations
+1. **Aligner Timing**: Aligners often take too long to find their gear stations. By then, the commons may be depleted of
+   resources needed for aligner gear.
+2. **No Communication**: Agents don't share discovered locations. Each agent must independently explore to find
+   structures.
+3. **Random Station Placement**: Gear stations are randomly placed around the hub perimeter, making exploration outcomes
+   variable.
+## Investigations
+- [aligned_junction_held_investigation](aligned_junction_held_investigation.md): AOE energy issues blocking junction
+  alignment in `aligned.junction.held`.
+## Usage
+```bash
+# Run with the role policy (default: 1 scrambler, 4 miners)
+./tools/run.py recipes.experiment.cogsguard.play policy_uri=metta://policy/role
+# With limited timesteps and log rendering
+./tools/run.py recipes.experiment.cogsguard.play policy_uri=metta://policy/role render=log max_steps=500
+```
+### Specifying Initial Vibe Counts
+You can control how many agents start with each role using URI query parameters:
+```bash
+# Custom distribution: 4 miners, 2 scramblers, 1 gear (smart role)
+./tools/run.py recipes.experiment.cogsguard.play \
+    policy_uri="metta://policy/role?miner=4&scrambler=2&gear=1"
+# All miners
+./tools/run.py recipes.experiment.cogsguard.play \
+    policy_uri="metta://policy/role?miner=10"
+# Balanced team
+./tools/run.py recipes.experiment.cogsguard.play \
+    policy_uri="metta://policy/role?miner=3&scout=2&aligner=2&scrambler=3"
+```
+**Supported vibe parameters:**
+| Parameter   | Description                                                                           |
+| ----------- | ------------------------------------------------------------------------------------- |
+| `miner`     | Number of agents starting as miners                                                   |
+| `scout`     | Number of agents starting as scouts                                                   |
+| `aligner`   | Number of agents starting as aligners                                                 |
+| `scrambler` | Number of agents starting as scramblers                                               |
+| `gear`      | Number of agents starting with the `gear` vibe (smart role selection)                 |
+| `evolution` | Use evolutionary role selection for `gear` agents (aliases: `evolutionary`, `evolve`) |
+**Assignment order:** `scrambler → aligner → miner → scout → gear`
+Agents are assigned vibes in order by agent ID. Agents beyond the total count specified get no initial target vibe and
+start with the `gear` vibe (smart role selection).
+**Default counts** (if no params specified): `scrambler=1, miner=4`, remainder `gear`
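The assignment rules above can be sketched as follows. This is not the package's parser: `initial_vibes` is a hypothetical helper, it ignores the `evolution` flag, and it assumes the defaults apply whenever no count parameter is present.

```python
from urllib.parse import parse_qs, urlparse

ASSIGNMENT_ORDER = ("scrambler", "aligner", "miner", "scout", "gear")
DEFAULT_COUNTS = {"scrambler": 1, "miner": 4}


def initial_vibes(policy_uri: str, num_agents: int) -> list[str]:
    """Return the initial vibe for each agent ID (the list index)."""
    query = parse_qs(urlparse(policy_uri).query)
    counts = {k: int(v[0]) for k, v in query.items() if k in ASSIGNMENT_ORDER}
    if not counts:
        counts = dict(DEFAULT_COUNTS)  # assumption: defaults when no counts given

    vibes: list[str] = []
    for role in ASSIGNMENT_ORDER:
        vibes.extend([role] * counts.get(role, 0))

    vibes = vibes[:num_agents]
    # Agents beyond the specified total start with the `gear` vibe.
    vibes.extend(["gear"] * (num_agents - len(vibes)))
    return vibes


print(initial_vibes("metta://policy/role?miner=4&scrambler=2&gear=1", 8))
print(initial_vibes("metta://policy/role", 6))
```

Note that the order of parameters in the URI does not matter in this sketch; only the fixed assignment order decides which agent IDs receive which vibe.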
+### Role Cycle / Fixed Mix
+If you want a fixed, repeating role pattern by agent index, use `role_cycle` (comma-separated). This is handy for
+hardcoding a mix like 3 aligners, 3 miners, 2 scramblers, 2 scouts when running 10 agents.
+```bash
+# 10 agents: aligner, miner, scrambler, scout repeating (3/3/2/2)
+./tools/run.py recipes.experiment.cogsguard.play \
+    policy_uri="metta://policy/role_py?role_cycle=aligner,miner,scrambler,scout" \
+    sim.env.game.num_agents=10 \
+    sim.env.game.map_builder.instance.spawn_count=10
+```
+For a one-off explicit ordering, use `role_order` (comma-separated) to list the exact vibe for each agent ID.
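To see why a four-role cycle over 10 agents yields the 3/3/2/2 mix mentioned above, the pattern can be expanded by agent index. `roles_from_cycle` is a hypothetical helper written for this illustration, not the package's implementation.

```python
from collections import Counter
from itertools import cycle, islice


def roles_from_cycle(role_cycle: str, num_agents: int) -> list[str]:
    """Repeat a comma-separated role pattern across agent indices."""
    pattern = [r.strip() for r in role_cycle.split(",") if r.strip()]
    return list(islice(cycle(pattern), num_agents))


roles = roles_from_cycle("aligner,miner,scrambler,scout", 10)
print(roles)
print(Counter(roles))
```

Because 10 is not a multiple of 4, the first two roles in the cycle (aligner and miner) each receive one extra agent.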
+## File Structure
+```
+cogsguard/
+├── __init__.py      # Exports CogsguardPolicy
+├── policy.py        # Base agent logic, vibe state machine, navigation
+├── types.py         # State definitions (CogsguardAgentState, Role, Phase)
+├── miner.py         # Miner role implementation
+├── scout.py         # Scout role implementation
+├── aligner.py       # Aligner role implementation
+├── scrambler.py     # Scrambler role implementation
+├── README.md        # This file
+└── CLAUDE.md        # AI debugging guide
+```
+## Debug Mode
+Set `DEBUG = True` in `policy.py` to enable detailed logging:
+```python
+DEBUG = True  # Enable debug logging
+```
+This will print agent vibe transitions, decisions, discoveries, and phase transitions.

cogames_agents/policy/scripted_agent/cogsguard/__init__.py ADDED

@@ -0,0 +1,74 @@
+"""CoGsGuard scripted agent with role-based behavior."""
+from cogames_agents.policy.evolution.cogsguard.evolution import (
+    BehaviorDef,
+    BehaviorSource,
+    EvolutionConfig,
+    RoleCatalog,
+    RoleDef,
+    RoleTier,
+    TierSelection,
+    materialize_role_behaviors,
+    mutate_role,
+    pick_role_id_weighted,
+    recombine_roles,
+    record_behavior_score,
+    record_role_score,
+    sample_role,
+)
+from cogames_agents.policy.evolution.cogsguard.evolutionary_coordinator import (
+    EvolutionaryRoleCoordinator,
+)
+from cogames_agents.policy.scripted_agent.cogsguard.behavior_hooks import build_cogsguard_behavior_hooks
+from cogames_agents.policy.scripted_agent.cogsguard.control_agent import CogsguardControlAgent
+from cogames_agents.policy.scripted_agent.cogsguard.policy import CogsguardPolicy, CogsguardWomboPolicy
+from cogames_agents.policy.scripted_agent.cogsguard.roles import (
+    AlignerPolicy,
+    MinerPolicy,
+    ScoutPolicy,
+    ScramblerPolicy,
+)
+from cogames_agents.policy.scripted_agent.cogsguard.targeted_agent import CogsguardTargetedAgent
+from cogames_agents.policy.scripted_agent.cogsguard.v2_agent import CogsguardV2Agent
+try:
+    from cogames_agents.policy.scripted_agent.cogsguard.teacher import CogsguardTeacherPolicy
+except ModuleNotFoundError as exc:  # pragma: no cover - optional for environments without nim agents
+    if exc.name and exc.name.startswith("cogames_agents.policy.nim_agents"):
+        CogsguardTeacherPolicy = None
+    else:
+        raise
+__all__ = [
+    "CogsguardControlAgent",
+    "CogsguardPolicy",
+    "CogsguardWomboPolicy",
+    "CogsguardTargetedAgent",
+    "CogsguardV2Agent",
+    "MinerPolicy",
+    "ScoutPolicy",
+    "AlignerPolicy",
+    "ScramblerPolicy",
+    # Evolution types
+    "BehaviorDef",
+    "BehaviorSource",
+    "EvolutionConfig",
+    "RoleCatalog",
+    "RoleDef",
+    "RoleTier",
+    "TierSelection",
+    # Evolution functions
+    "materialize_role_behaviors",
+    "mutate_role",
+    "pick_role_id_weighted",
+    "recombine_roles",
+    "record_behavior_score",
+    "record_role_score",
+    "sample_role",
+    # Coordinator + hooks
+    "EvolutionaryRoleCoordinator",
+    "build_cogsguard_behavior_hooks",
+]
+if CogsguardTeacherPolicy is not None:
+    __all__.append("CogsguardTeacherPolicy")

cogames_agents/policy/scripted_agent/cogsguard/aligned_junction_held_investigation.md ADDED

@@ -0,0 +1,152 @@
+# Investigation: aligned.junction.held Scoring Gap (300 vs 30k)
+## Summary
+The 100x scoring gap (PPO: ~300 vs expected: ~30,000) for `aligned.junction.held` is caused by a fundamental energy
+starvation problem that prevents agents from effectively navigating and completing their roles.
+## Key Findings
+### 1. Scripted Agents Are NOT Successfully Aligning Junctions
+After running `cogames play` with both the Nim (`metta://policy/role`) and Python (`metta://policy/role_py`) policies, I
+observed:
+- **clips.aligned.junction.held: ~135,000** (over 5000 steps)
+- **cogs.aligned.junction.held: 0** (zero junctions aligned to cogs)
+This means the scripted agents themselves are failing to align junctions, just like PPO.
+### 2. Root Cause: Energy Starvation
+Agents are experiencing severe energy starvation:
+| Metric                     | Expected | Observed                  |
+| -------------------------- | -------- | ------------------------- |
+| action.move.failed         | ~0%      | ~99%                      |
+| action.move.success        | ~99%     | ~1%                       |
+| max_steps_without_motion   | low      | ~2900 (out of 3000 steps) |
+| energy.gained (3000 steps) | ~300,000 | ~130-200                  |
+### 3. Why Energy is Depleted
+The game is designed with an energy-based economy:
+- **Move action costs**: 3 energy per move
+- **Agent initial energy**: 100
+- **Base energy regen**: +1 energy/tick
+- **Hub AOE (expected)**: +100 energy/tick to cogs agents within range 10
+**Problem**: The hub's AOE energy buff is NOT being applied to agents.
+In testing, I observed:
+- Agent spawns at (26, 26), hub at (29, 29) - Manhattan distance 6 (within AOE range 10)
+- Agent starts with 100 energy
+- After 1 step: energy = 10 (dropped 90!)
+- Expected: energy should increase to 200+ from AOE
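The numbers in this observation can be checked with a short calculation. The sketch below uses only the values quoted in this note. It makes two assumptions: the distance metric the engine uses for AOE range is unknown, so three common metrics are tried, and base regen, hub AOE, and move cost are each applied exactly once per tick with no cap reached.

```python
import math

AGENT_POS = (26, 26)   # spawn position reported above
HUB_POS = (29, 29)     # hub position reported above
AOE_RANGE = 10

dx = abs(HUB_POS[0] - AGENT_POS[0])
dy = abs(HUB_POS[1] - AGENT_POS[1])
distances = {
    "manhattan": dx + dy,
    "chebyshev": max(dx, dy),
    "euclidean": math.hypot(dx, dy),
}
# The agent is inside the hub's range under every metric tried here.
in_range = all(d <= AOE_RANGE for d in distances.values())

INITIAL_ENERGY = 100
MOVE_COST = 3
BASE_REGEN = 1
HUB_AOE_ENERGY = 100


def energy_after_one_move(in_hub_aoe: bool) -> int:
    """Energy after one tick that includes a single move."""
    aoe = HUB_AOE_ENERGY if in_hub_aoe else 0
    return INITIAL_ENERGY + BASE_REGEN + aoe - MOVE_COST


print(distances, in_range)
print("with AOE:", energy_after_one_move(True))
print("without AOE:", energy_after_one_move(False))
print("AOE energy over 3000 ticks:", 3000 * HUB_AOE_ENERGY)
```

Under these assumptions the agent should hold roughly 200 energy after one step with the AOE applied, and 98 without it. Neither case yields the observed value of 10, so a missing AOE bonus alone does not account for a 90-point drop.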
+### 4. The Junction Alignment Flow
+To align a junction, agents must:
+1. **Scrambler** scrambles clips-aligned junctions to neutral
+   - Requires: scrambler gear + 1 heart
+   - Must navigate to junction and bump into it
+2. **Aligner** aligns neutral junctions to cogs
+   - Requires: aligner gear + 1 influence + 1 heart
+   - Must navigate to junction and bump into it
+**Problem**: Agents can't navigate because they don't have energy to move.
+### 5. Chicken-and-Egg Problem
+The game design creates a catch-22:
+- Agents need energy to move to junctions
+- Junctions provide energy AOE when aligned to cogs
+- But junctions start aligned to clips (enemy)
+- The hub should provide energy, but its AOE isn't working
+## Configuration Details
+### Agent Energy Config
+```python
+inventory.limits = {'energy': ResourceLimitsConfig(min=10, max=65535, ...)}
+inventory.initial = {'energy': 100}
+inventory.regen_amounts = {'default': {'energy': 1, 'hp': -1, 'influence': -1}}
+```
+### Move Action Cost
+```python
+actions.move = MoveActionConfig(consumed_resources={'energy': 3})
+```
+### Hub AOE
+```python
+aoes = [
+    AOEEffectConfig(
+        range=10,
+        resource_deltas={'influence': 10, 'energy': 100, 'hp': 100},
+        filters=[isAlignedToActor()]  # Same collective
+    ),
+    AOEEffectConfig(
+        range=10,
+        resource_deltas={'hp': -1, 'influence': -100},
+        filters=[isEnemy()]  # Different collective
+    )
+]
+```
+### Junction AOE
+Same structure as hub, but junctions are clips-aligned, so they:
+- Give +100 energy to clips agents
+- Apply -1 hp and -100 influence to cogs agents
+## Recommendations
+1. **Investigate AOE Application Bug**: The hub's energy AOE is not being applied to cogs agents. Check if there's a bug
+   in the collective alignment matching for AOE effects.
+2. **Reduce Move Energy Cost**: Consider lowering from 3 to 1 or 2 to make agents more mobile.
+3. **Increase Base Energy Regen**: Increase from +1 to +5 or +10 per tick.
+4. **Give Agents Initial Hearts**: Currently agents start with 0 hearts and must get them from chests, but they can't
+   reach chests without energy.
+5. **Check Scripted Agent Logic**: The Nim agents may have bugs in their pathfinding or role execution that cause them
+   to get stuck even when they have energy.
+## Test Commands Used
+```bash
+# Run with teacher policy (uses Nim backend)
+uv run tools/run.py cogsguard.play "policy_uri=metta://policy/teacher" render=log max_steps=3000
+# Run with Python policy
+uv run tools/run.py cogsguard.play "policy_uri=metta://policy/role_py" render=log max_steps=3000
+# Check simulation state
+uv run python -c "
+from mettagrid.simulator.simulator import Simulator
+from recipes.experiment.cogsguard import make_env
+cfg = make_env(num_agents=10, max_steps=100)
+sim = Simulator().new_simulation(cfg, seed=42)
+# ... inspect state
+"
+```
+## Related Files
+- `packages/cogames/src/cogames/cogs_vs_clips/mission.py` - CogsGuard mission config
+- `packages/cogames/src/cogames/cogs_vs_clips/stations.py` - Junction and Hub configs
+- `packages/cogames-agents/src/cogames_agents/policy/nim_agents/cogsguard_agents.nim` - Nim scripted agents
+- `packages/cogames-agents/src/cogames_agents/policy/scripted_agent/cogsguard/` - Python scripted agents
+- `recipes/experiment/cogsguard.py` - Recipe for running CogsGuard