mini-swe-agent 1.17.1__tar.gz → 1.17.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (66)
  1. {mini_swe_agent-1.17.1/src/mini_swe_agent.egg-info → mini_swe_agent-1.17.3}/PKG-INFO +8 -8
  2. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/README.md +7 -7
  3. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3/src/mini_swe_agent.egg-info}/PKG-INFO +8 -8
  4. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/__init__.py +1 -1
  5. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/agents/default.py +9 -1
  6. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/litellm_model.py +2 -1
  7. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/litellm_response_api_model.py +4 -2
  8. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/openrouter_model.py +2 -1
  9. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/portkey_model.py +2 -1
  10. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/portkey_response_api_model.py +1 -0
  11. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/requesty_model.py +2 -1
  12. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/extra/utils/batch_progress.py +2 -2
  13. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/LICENSE.md +0 -0
  14. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/pyproject.toml +0 -0
  15. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/setup.cfg +0 -0
  16. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/mini_swe_agent.egg-info/SOURCES.txt +0 -0
  17. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/mini_swe_agent.egg-info/dependency_links.txt +0 -0
  18. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/mini_swe_agent.egg-info/entry_points.txt +0 -0
  19. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/mini_swe_agent.egg-info/requires.txt +0 -0
  20. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/mini_swe_agent.egg-info/top_level.txt +0 -0
  21. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/__main__.py +0 -0
  22. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/agents/__init__.py +0 -0
  23. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/agents/interactive.py +0 -0
  24. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/agents/interactive_textual.py +0 -0
  25. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/README.md +0 -0
  26. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/__init__.py +0 -0
  27. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/default.yaml +0 -0
  28. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/extra/__init__.py +0 -0
  29. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/extra/swebench.yaml +0 -0
  30. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/extra/swebench_roulette.yaml +0 -0
  31. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/extra/swebench_xml.yaml +0 -0
  32. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/github_issue.yaml +0 -0
  33. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/mini.tcss +0 -0
  34. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/config/mini.yaml +0 -0
  35. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/__init__.py +0 -0
  36. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/docker.py +0 -0
  37. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/extra/__init__.py +0 -0
  38. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/extra/bubblewrap.py +0 -0
  39. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/extra/swerex_docker.py +0 -0
  40. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/local.py +0 -0
  41. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/environments/singularity.py +0 -0
  42. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/__init__.py +0 -0
  43. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/anthropic.py +0 -0
  44. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/extra/__init__.py +0 -0
  45. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/extra/roulette.py +0 -0
  46. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/test_models.py +0 -0
  47. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/utils/__init__.py +0 -0
  48. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/utils/cache_control.py +0 -0
  49. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/utils/key_per_thread.py +0 -0
  50. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/models/utils/openai_utils.py +0 -0
  51. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/py.typed +0 -0
  52. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/__init__.py +0 -0
  53. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/extra/__init__.py +0 -0
  54. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/extra/config.py +0 -0
  55. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/extra/swebench.py +0 -0
  56. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/extra/swebench_single.py +0 -0
  57. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/extra/utils/__init__.py +0 -0
  58. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/github_issue.py +0 -0
  59. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/hello_world.py +0 -0
  60. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/inspector.py +0 -0
  61. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/mini.py +0 -0
  62. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/mini_extra.py +0 -0
  63. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/utils/__init__.py +0 -0
  64. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/run/utils/save.py +0 -0
  65. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/utils/__init__.py +0 -0
  66. {mini_swe_agent-1.17.1 → mini_swe_agent-1.17.3}/src/minisweagent/utils/log.py +0 -0

PKG-INFO (+8 -8)
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mini-swe-agent
-Version: 1.17.1
+Version: 1.17.3
 Summary: Nano SWE Agent - A simple AI software engineering agent
 Author-email: Kilian Lieret <kilian.lieret@posteo.de>, "Carlos E. Jimenez" <carlosej@princeton.edu>
 License: MIT License
@@ -86,21 +86,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https

 We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**

-`mini` is for
+The `mini` agent is for

 - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
-- **Developers** who like their tools like their scripts: **short, sharp, and readable**
+- **Developers** who like to **own, understand, and modify** their tools
 - **Engineers** who want something **trivial to sandbox & to deploy anywhere**

 Here's some details:

 - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
 [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
-- **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
-- **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
+- **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
 - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
-- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
+- **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
+- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)

 <details>

@@ -108,7 +108,7 @@ Here's some details:

 [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
 However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
-In fact, mini-SWE-agent
+In fact, the `mini` agent

 - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
 This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -131,7 +131,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)

 Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.

-`mini` wants to be a hackable tool, not a black box.
+The `mini` agent wants to be a hackable tool, not a black box.

 - **Simple** enough to understand at a glance
 - **Convenient** enough to use in daily workflows

README.md (+7 -7)
@@ -15,21 +15,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https

 We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**

-`mini` is for
+The `mini` agent is for

 - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
-- **Developers** who like their tools like their scripts: **short, sharp, and readable**
+- **Developers** who like to **own, understand, and modify** their tools
 - **Engineers** who want something **trivial to sandbox & to deploy anywhere**

 Here's some details:

 - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
 [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
-- **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
-- **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
+- **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
 - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
-- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
+- **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
+- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)

 <details>

@@ -37,7 +37,7 @@ Here's some details:

 [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
 However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
-In fact, mini-SWE-agent
+In fact, the `mini` agent

 - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
 This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -60,7 +60,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)

 Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.

-`mini` wants to be a hackable tool, not a black box.
+The `mini` agent wants to be a hackable tool, not a black box.

 - **Simple** enough to understand at a glance
 - **Convenient** enough to use in daily workflows

src/mini_swe_agent.egg-info/PKG-INFO (+8 -8)
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mini-swe-agent
-Version: 1.17.1
+Version: 1.17.3
 Summary: Nano SWE Agent - A simple AI software engineering agent
 Author-email: Kilian Lieret <kilian.lieret@posteo.de>, "Carlos E. Jimenez" <carlosej@princeton.edu>
 License: MIT License
@@ -86,21 +86,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https

 We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**

-`mini` is for
+The `mini` agent is for

 - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
-- **Developers** who like their tools like their scripts: **short, sharp, and readable**
+- **Developers** who like to **own, understand, and modify** their tools
 - **Engineers** who want something **trivial to sandbox & to deploy anywhere**

 Here's some details:

 - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
 [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
-- **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
-- **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
+- **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
 - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
-- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
+- **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
+- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)

 <details>

@@ -108,7 +108,7 @@ Here's some details:

 [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
 However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
-In fact, mini-SWE-agent
+In fact, the `mini` agent

 - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
 This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -131,7 +131,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)

 Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.

-`mini` wants to be a hackable tool, not a black box.
+The `mini` agent wants to be a hackable tool, not a black box.

 - **Simple** enough to understand at a glance
 - **Convenient** enough to use in daily workflows

src/minisweagent/__init__.py (+1 -1)
@@ -8,7 +8,7 @@ This file provides:
 unless you want the static type checking.
 """

-__version__ = "1.17.1"
+__version__ = "1.17.3"

 import os
 from pathlib import Path

src/minisweagent/agents/default.py (+9 -1)
@@ -20,7 +20,15 @@ class AgentConfig:
 )
 timeout_template: str = (
 "The last command <command>{{action['action']}}</command> timed out and has been killed.\n"
-"The output of the command was:\n <output>\n{{output}}\n</output>\n"
+"The output of the command was:\n"
+"{% if output | length < 10000 -%}\n"
+"<output>\n{{output}}\n</output>\n"
+"{%- else -%}\n"
+"<warning>Output was too long and has been truncated.</warning>\n"
+"<output_head>\n{{ output[:5000] }}\n</output_head>\n"
+"<elided_chars>{{ output | length - 10000 }} characters elided</elided_chars>\n"
+"<output_tail>\n{{ output[-5000:] }}\n</output_tail>\n"
+"{%- endif %}\n"
 "Please try another command and make sure to avoid those requiring interactive input."
 )
 format_error_template: str = "Please always provide EXACTLY ONE action in triple backticks."
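
The template change above (quoted verbatim from the diff) only alters how a timed-out command's output is reported: anything under 10,000 characters is passed through, while longer output keeps its first and last 5,000 characters. A minimal sketch of that behaviour, assuming the string is rendered as a plain Jinja2 template; the sample command and output below are invented for illustration:

# Sketch only, not the agent's code: render the new timeout_template from the
# diff above with Jinja2 and a deliberately long fake output.
from jinja2 import Template

timeout_template = (
    "The last command <command>{{action['action']}}</command> timed out and has been killed.\n"
    "The output of the command was:\n"
    "{% if output | length < 10000 -%}\n"
    "<output>\n{{output}}\n</output>\n"
    "{%- else -%}\n"
    "<warning>Output was too long and has been truncated.</warning>\n"
    "<output_head>\n{{ output[:5000] }}\n</output_head>\n"
    "<elided_chars>{{ output | length - 10000 }} characters elided</elided_chars>\n"
    "<output_tail>\n{{ output[-5000:] }}\n</output_tail>\n"
    "{%- endif %}\n"
    "Please try another command and make sure to avoid those requiring interactive input."
)

fake_output = "x" * 25_000  # >= 10,000 characters, so the truncated branch is taken
message = Template(timeout_template).render(action={"action": "cat huge.log"}, output=fake_output)
# The rendered message keeps the first and last 5,000 characters and reports
# "15000 characters elided" instead of echoing all 25,000 characters to the model.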

src/minisweagent/models/litellm_model.py (+2 -1)
@@ -41,6 +41,7 @@ class LitellmModel:
 litellm.utils.register_model(json.loads(Path(self.config.litellm_model_registry).read_text()))

 @retry(
+reraise=True,
 stop=stop_after_attempt(int(os.getenv("MSWEA_MODEL_RETRY_STOP_AFTER_ATTEMPT", "10"))),
 wait=wait_exponential(multiplier=1, min=4, max=60),
 before_sleep=before_sleep_log(logger, logging.WARNING),
@@ -68,7 +69,7 @@ class LitellmModel:
 def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
 if self.config.set_cache_control:
 messages = set_cache_control(messages, mode=self.config.set_cache_control)
-response = self._query(messages, **kwargs)
+response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
 try:
 cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
 if cost <= 0.0:
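
Two things change here: messages are reduced to their role and content keys before the API call, and tenacity's retry decorator gains reraise=True, so once the retry budget is exhausted the caller sees the provider's original exception rather than a tenacity.RetryError wrapper. A standalone sketch of the reraise difference (illustrative only; the always-failing functions are invented):

# Sketch of tenacity's reraise flag; not the agent's code.
from tenacity import RetryError, retry, stop_after_attempt

@retry(stop=stop_after_attempt(3))  # default: the final failure is wrapped in RetryError
def fails_wrapped():
    raise ValueError("upstream API error")

@retry(reraise=True, stop=stop_after_attempt(3))  # reraise=True: the original error surfaces
def fails_reraised():
    raise ValueError("upstream API error")

try:
    fails_wrapped()
except RetryError as err:
    print(type(err.last_attempt.exception()))  # <class 'ValueError'>, hidden inside the wrapper

try:
    fails_reraised()
except ValueError as err:
    print(err)  # "upstream API error" — callers can catch the real exception class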

src/minisweagent/models/litellm_response_api_model.py (+4 -2)
@@ -28,6 +28,7 @@ class LitellmResponseAPIModel(LitellmModel):
 self._previous_response_id: str | None = None

 @retry(
+reraise=True,
 stop=stop_after_attempt(10),
 wait=wait_exponential(multiplier=1, min=4, max=60),
 before_sleep=before_sleep_log(logger, logging.WARNING),
@@ -45,9 +46,11 @@ class LitellmResponseAPIModel(LitellmModel):
 )
 def _query(self, messages: list[dict[str, str]], **kwargs):
 try:
+# Remove 'timestamp' field added by agent - not supported by OpenAI responses API
+clean_messages = [{"role": msg["role"], "content": msg["content"]} for msg in messages]
 resp = litellm.responses(
 model=self.config.model_name,
-input=messages if self._previous_response_id is None else messages[-1:],
+input=clean_messages if self._previous_response_id is None else clean_messages[-1:],
 previous_response_id=self._previous_response_id,
 **(self.config.model_kwargs | kwargs),
 )
@@ -59,7 +62,6 @@ class LitellmResponseAPIModel(LitellmModel):

 def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
 response = self._query(messages, **kwargs)
-print(response)
 text = coerce_responses_text(response)
 try:
 cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
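
The same role/content filtering appears in this backend, together with a comment explaining why: the agent attaches a bookkeeping "timestamp" field to each message, which the OpenAI responses API does not accept. A minimal sketch of the pattern with invented message data (the helper name is made up):

# Sketch of the message-sanitizing pattern used in the diffs above; not library code.
def strip_to_role_and_content(messages: list[dict]) -> list[dict]:
    # Forward only the keys a chat-style API expects; drop extras such as "timestamp".
    return [{"role": m["role"], "content": m["content"]} for m in messages]

history = [
    {"role": "user", "content": "Fix the failing test.", "timestamp": 1700000000.0},
    {"role": "assistant", "content": "Running pytest -x now.", "timestamp": 1700000005.0},
]
print(strip_to_role_and_content(history))
# [{'role': 'user', 'content': 'Fix the failing test.'},
#  {'role': 'assistant', 'content': 'Running pytest -x now.'}]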

src/minisweagent/models/openrouter_model.py (+2 -1)
@@ -56,6 +56,7 @@ class OpenRouterModel:
 self._api_key = os.getenv("OPENROUTER_API_KEY", "")

 @retry(
+reraise=True,
 stop=stop_after_attempt(int(os.getenv("MSWEA_MODEL_RETRY_STOP_AFTER_ATTEMPT", "10"))),
 wait=wait_exponential(multiplier=1, min=4, max=60),
 before_sleep=before_sleep_log(logger, logging.WARNING),
@@ -97,7 +98,7 @@ class OpenRouterModel:
 def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
 if self.config.set_cache_control:
 messages = set_cache_control(messages, mode=self.config.set_cache_control)
-response = self._query(messages, **kwargs)
+response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)

 usage = response.get("usage", {})
 cost = usage.get("cost", 0.0)

src/minisweagent/models/portkey_model.py (+2 -1)
@@ -74,6 +74,7 @@ class PortkeyModel:
 self.client = Portkey(**client_kwargs)

 @retry(
+reraise=True,
 stop=stop_after_attempt(int(os.getenv("MSWEA_MODEL_RETRY_STOP_AFTER_ATTEMPT", "10"))),
 wait=wait_exponential(multiplier=1, min=4, max=60),
 before_sleep=before_sleep_log(logger, logging.WARNING),
@@ -90,7 +91,7 @@ class PortkeyModel:
 def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
 if self.config.set_cache_control:
 messages = set_cache_control(messages, mode=self.config.set_cache_control)
-response = self._query(messages, **kwargs)
+response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
 cost = self._calculate_cost(response)
 self.n_calls += 1
 self.cost += cost

src/minisweagent/models/portkey_response_api_model.py (+1 -0)
@@ -30,6 +30,7 @@ class PortkeyResponseAPIModel(PortkeyModel):
 self._previous_response_id: str | None = None

 @retry(
+reraise=True,
 stop=stop_after_attempt(int(os.getenv("MSWEA_MODEL_RETRY_STOP_AFTER_ATTEMPT", "10"))),
 wait=wait_exponential(multiplier=1, min=4, max=60),
 before_sleep=before_sleep_log(logger, logging.WARNING),

src/minisweagent/models/requesty_model.py (+2 -1)
@@ -51,6 +51,7 @@ class RequestyModel:
 self._api_key = os.getenv("REQUESTY_API_KEY", "")

 @retry(
+reraise=True,
 stop=stop_after_attempt(10),
 wait=wait_exponential(multiplier=1, min=4, max=60),
 before_sleep=before_sleep_log(logger, logging.WARNING),
@@ -91,7 +92,7 @@ class RequestyModel:
 raise RequestyAPIError(f"Request failed: {e}") from e

 def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
-response = self._query(messages, **kwargs)
+response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)

 # Extract cost from usage information
 usage = response.get("usage", {})

src/minisweagent/run/extra/utils/batch_progress.py (+2 -2)
@@ -79,7 +79,7 @@ class RunBatchProgressManager:
 "[cyan]Overall Progress", total=num_instances, total_cost="0.00", eta=""
 )

-self.render_group = Group(Table(), self._task_progress_bar, self._main_progress_bar)
+self.render_group = Group(self._main_progress_bar, Table(), self._task_progress_bar)
 self._yaml_report_path = yaml_report_path

 @property
@@ -112,7 +112,7 @@ class RunBatchProgressManager:
 instances_str = _shorten_str(", ".join(reversed(instances)), 55)
 t.add_row(status, str(len(instances)), instances_str)
 assert self.render_group is not None
-self.render_group.renderables[0] = t
+self.render_group.renderables[1] = t

 def _update_total_costs(self) -> None:
 with self._lock:
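
The batch-progress change only reorders the Rich renderables so the overall progress bar is drawn first; because the placeholder status table now sits at index 1, later refreshes assign to renderables[1] instead of renderables[0]. A self-contained sketch of that layout (the bar and table contents are invented):

# Sketch of the reordered Rich render group; not the agent's code.
from rich.console import Console, Group
from rich.progress import Progress
from rich.table import Table

main_progress_bar = Progress()  # overall progress, now rendered at the top
task_progress_bar = Progress()  # per-instance progress
render_group = Group(main_progress_bar, Table(), task_progress_bar)

# The placeholder Table() lives at index 1, so a status refresh replaces
# renderables[1] rather than renderables[0]:
status_table = Table("Status", "Count", "Instances")
status_table.add_row("Running", "2", "instance-a, instance-b")
render_group.renderables[1] = status_table

Console().print(render_group)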