PyPI - mini-swe-agent - Versions diffs - 1.17.0__tar.gz → 1.17.2__tar.gz - Mend

mini-swe-agent 1.17.0tar.gz → 1.17.2tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (67) hide show

{mini_swe_agent-1.17.0/src/mini_swe_agent.egg-info → mini_swe_agent-1.17.2}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mini-swe-agent
-Version: 1.17.0
+Version: 1.17.2
 Summary: Nano SWE Agent - A simple AI software engineering agent
 Author-email: Kilian Lieret <kilian.lieret@posteo.de>, "Carlos E. Jimenez" <carlosej@princeton.edu>
 License: MIT License
@@ -86,21 +86,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https
 We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**
-`mini` is for
+The `mini` agent is for
 - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
-- **Developers** who like their tools like their scripts: **short, sharp, and readable**
+- **Developers** who like to **own, understand, and modify** their tools
 - **Engineers** who want something **trivial to sandbox & to deploy anywhere**
 Here's some details:
 - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
 [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
-- **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
-- **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
+- **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
 - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
-- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
+- **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
+- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 <details>
@@ -108,7 +108,7 @@ Here's some details:
 [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
 However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
-In fact, mini-SWE-agent
+In fact, the `mini` agent
 - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
   This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -131,7 +131,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)
 Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.
-`mini` wants to be a hackable tool, not a black box.
+The `mini` agent wants to be a hackable tool, not a black box.
 - **Simple** enough to understand at a glance
 - **Convenient** enough to use in daily workflows

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/README.md RENAMED Viewed

@@ -15,21 +15,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https
 We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**
-`mini` is for
+The `mini` agent is for
 - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
-- **Developers** who like their tools like their scripts: **short, sharp, and readable**
+- **Developers** who like to **own, understand, and modify** their tools
 - **Engineers** who want something **trivial to sandbox & to deploy anywhere**
 Here's some details:
 - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
 [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
-- **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
-- **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
+- **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
 - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
-- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
+- **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
+- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 <details>
@@ -37,7 +37,7 @@ Here's some details:
 [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
 However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
-In fact, mini-SWE-agent
+In fact, the `mini` agent
 - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
   This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -60,7 +60,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)
 Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.
-`mini` wants to be a hackable tool, not a black box.
+The `mini` agent wants to be a hackable tool, not a black box.
 - **Simple** enough to understand at a glance
 - **Convenient** enough to use in daily workflows

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2/src/mini_swe_agent.egg-info}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mini-swe-agent
-Version: 1.17.0
+Version: 1.17.2
 Summary: Nano SWE Agent - A simple AI software engineering agent
 Author-email: Kilian Lieret <kilian.lieret@posteo.de>, "Carlos E. Jimenez" <carlosej@princeton.edu>
 License: MIT License
@@ -86,21 +86,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https
 We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**
-`mini` is for
+The `mini` agent is for
 - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
-- **Developers** who like their tools like their scripts: **short, sharp, and readable**
+- **Developers** who like to **own, understand, and modify** their tools
 - **Engineers** who want something **trivial to sandbox & to deploy anywhere**
 Here's some details:
 - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
 [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
-- **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
-- **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
+- **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
 - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
-- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
+- **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
+- **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
 <details>
@@ -108,7 +108,7 @@ Here's some details:
 [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
 However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
-In fact, mini-SWE-agent
+In fact, the `mini` agent
 - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
   This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -131,7 +131,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)
 Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.
-`mini` wants to be a hackable tool, not a black box.
+The `mini` agent wants to be a hackable tool, not a black box.
 - **Simple** enough to understand at a glance
 - **Convenient** enough to use in daily workflows

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/mini_swe_agent.egg-info/SOURCES.txt RENAMED Viewed

@@ -20,7 +20,6 @@ src/minisweagent/config/default.yaml
 src/minisweagent/config/github_issue.yaml
 src/minisweagent/config/mini.tcss
 src/minisweagent/config/mini.yaml
-src/minisweagent/config/mini_no_temp.yaml
 src/minisweagent/config/extra/__init__.py
 src/minisweagent/config/extra/swebench.yaml
 src/minisweagent/config/extra/swebench_roulette.yaml

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/__init__.py RENAMED Viewed

@@ -8,7 +8,7 @@ This file provides:
   unless you want the static type checking.
 """
-__version__ = "1.17.0"
+__version__ = "1.17.2"
 import os
 from pathlib import Path

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/config/README.md RENAMED Viewed

@@ -1,7 +1,6 @@
 # Configs
 * `mini.yaml` - Default config for `mini`/`agents/interactive.py` or `mini -v`/`agents/interactive_textual.py` agent.
-* `mini_no_temp.yaml` - Same as `mini.yaml` but without the temperature setting
 * `default.yaml` - Default config for the `default.py` agent.
 * `github_issue.yaml` - Config for the `run/github_issue.py` entry point.

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/config/default.yaml RENAMED Viewed

@@ -153,5 +153,4 @@ environment:
     TQDM_DISABLE: '1'
 model:
   model_kwargs:
-    temperature: 0.0
     drop_params: true

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/config/extra/swebench_roulette.yaml RENAMED Viewed

@@ -36,7 +36,7 @@ agent:
     2. Provide exactly ONE bash command to execute
     ## Important Boundaries
-    - MODIFY: Regular source code files in {{working_dir}}
+    - MODIFY: Regular source code files in /testbed (this is the working directory for all your subsequent commands)
     - DO NOT MODIFY: Tests, configuration files (pyproject.toml, setup.cfg, etc.)
     ## Recommended Workflow

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/config/github_issue.yaml RENAMED Viewed

@@ -142,5 +142,4 @@ environment:
     TQDM_DISABLE: '1'
 model:
   model_kwargs:
-    temperature: 0.0
     drop_params: true

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/config/mini.yaml RENAMED Viewed

@@ -154,5 +154,4 @@ environment:
     TQDM_DISABLE: '1'
 model:
   model_kwargs:
-    temperature: 0.0
     drop_params: true

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/models/litellm_model.py RENAMED Viewed

@@ -68,9 +68,9 @@ class LitellmModel:
     def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
         if self.config.set_cache_control:
             messages = set_cache_control(messages, mode=self.config.set_cache_control)
-        response = self._query(messages, **kwargs)
+        response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
         try:
-            cost = litellm.cost_calculator.completion_cost(response)
+            cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
             if cost <= 0.0:
                 raise ValueError(f"Cost must be > 0.0, got {cost}")
         except Exception as e:

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/models/litellm_response_api_model.py RENAMED Viewed

@@ -62,7 +62,7 @@ class LitellmResponseAPIModel(LitellmModel):
         print(response)
         text = coerce_responses_text(response)
         try:
-            cost = litellm.cost_calculator.completion_cost(response)
+            cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
         except Exception as e:
             logger.critical(
                 f"Error calculating cost for model {self.config.model_name}: {e}. "

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/models/openrouter_model.py RENAMED Viewed

@@ -97,7 +97,7 @@ class OpenRouterModel:
     def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
         if self.config.set_cache_control:
             messages = set_cache_control(messages, mode=self.config.set_cache_control)
-        response = self._query(messages, **kwargs)
+        response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
         usage = response.get("usage", {})
         cost = usage.get("cost", 0.0)

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/models/portkey_model.py RENAMED Viewed

@@ -90,7 +90,7 @@ class PortkeyModel:
     def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
         if self.config.set_cache_control:
             messages = set_cache_control(messages, mode=self.config.set_cache_control)
-        response = self._query(messages, **kwargs)
+        response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
         cost = self._calculate_cost(response)
         self.n_calls += 1
         self.cost += cost

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/models/portkey_response_api_model.py RENAMED Viewed

@@ -52,7 +52,7 @@ class PortkeyResponseAPIModel(PortkeyModel):
         response = self._query(messages, **kwargs)
         text = coerce_responses_text(response)
         try:
-            cost = litellm.cost_calculator.completion_cost(response)
+            cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
             assert cost > 0.0, f"Cost is not positive: {cost}"
         except Exception as e:
             if self.config.cost_tracking != "ignore_errors":

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/models/requesty_model.py RENAMED Viewed

@@ -91,7 +91,7 @@ class RequestyModel:
             raise RequestyAPIError(f"Request failed: {e}") from e
     def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
-        response = self._query(messages, **kwargs)
+        response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
         # Extract cost from usage information
         usage = response.get("usage", {})

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/run/extra/utils/batch_progress.py RENAMED Viewed

@@ -79,7 +79,7 @@ class RunBatchProgressManager:
             "[cyan]Overall Progress", total=num_instances, total_cost="0.00", eta=""
         )
-        self.render_group = Group(Table(), self._task_progress_bar, self._main_progress_bar)
+        self.render_group = Group(self._main_progress_bar, Table(), self._task_progress_bar)
         self._yaml_report_path = yaml_report_path
     @property
@@ -112,7 +112,7 @@ class RunBatchProgressManager:
                 instances_str = _shorten_str(", ".join(reversed(instances)), 55)
                 t.add_row(status, str(len(instances)), instances_str)
         assert self.render_group is not None
-        self.render_group.renderables[0] = t
+        self.render_group.renderables[1] = t
     def _update_total_costs(self) -> None:
         with self._lock:

{mini_swe_agent-1.17.0 → mini_swe_agent-1.17.2}/src/minisweagent/run/inspector.py RENAMED Viewed

@@ -2,9 +2,7 @@
 """
 Simple trajectory inspector for browsing agent conversation trajectories.
-[not dim]
-More information about the usage: [bold green]https://mini-swe-agent.com/latest/usage/inspector/[/bold green]
-[/not dim]
+More information about the usage: [bold green] https://mini-swe-agent.com/latest/usage/inspector/ [/bold green].
 """
 import json

mini_swe_agent-1.17.0/src/minisweagent/config/mini_no_temp.yaml DELETED Viewed

@@ -1,158 +0,0 @@
-# Identical config file to mini.yaml, but without temperature=0.0
-agent:
-  system_template: |
-    You are a helpful assistant that can interact with a computer.
-    Your response must contain exactly ONE bash code block with ONE command (or commands connected with && or ||).
-    Include a THOUGHT section before your command where you explain your reasoning process.
-    Format your response as shown in <format_example>.
-    <format_example>
-    Your reasoning and analysis here. Explain why you want to perform the action.
-    ```bash
-    your_command_here
-    ```
-    </format_example>
-    Failure to follow these rules will cause your response to be rejected.
-  instance_template: |
-    Please solve this issue: {{task}}
-    You can execute bash commands and edit files to implement the necessary changes.
-    ## Recommended Workflow
-    This workflows should be done step-by-step so that you can iterate on your changes and any possible problems.
-    1. Analyze the codebase by finding and reading relevant files
-    2. Create a script to reproduce the issue
-    3. Edit the source code to resolve the issue
-    4. Verify your fix works by running your script again
-    5. Test edge cases to ensure your fix is robust
-    6. Submit your changes and finish your work by issuing the following command: `echo COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT`.
-       Do not combine it with any other command. <important>After this command, you cannot continue working on this task.</important>
-    ## Important Rules
-    1. Every response must contain exactly one action
-    2. The action must be enclosed in triple backticks
-    3. Directory or environment variable changes are not persistent. Every action is executed in a new subshell.
-       However, you can prefix any action with `MY_ENV_VAR=MY_VALUE cd /path/to/working/dir && ...` or write/load environment variables from files
-    <system_information>
-    {{system}} {{release}} {{version}} {{machine}}
-    </system_information>
-    ## Formatting your response
-    Here is an example of a correct response:
-    <example_response>
-    THOUGHT: I need to understand the structure of the repository first. Let me check what files are in the current directory to get a better understanding of the codebase.
-    ```bash
-    ls -la
-    ```
-    </example_response>
-    ## Useful command examples
-    ### Create a new file:
-    ```bash
-    cat <<'EOF' > newfile.py
-    import numpy as np
-    hello = "world"
-    print(hello)
-    EOF
-    ```
-    ### Edit files with sed:
-    {%- if system == "Darwin" -%}
-    <important>
-    You are on MacOS. For all the below examples, you need to use `sed -i ''` instead of `sed -i`.
-    </important>
-    {%- endif -%}
-    ```bash
-    # Replace all occurrences
-    sed -i 's/old_string/new_string/g' filename.py
-    # Replace only first occurrence
-    sed -i 's/old_string/new_string/' filename.py
-    # Replace first occurrence on line 1
-    sed -i '1s/old_string/new_string/' filename.py
-    # Replace all occurrences in lines 1-10
-    sed -i '1,10s/old_string/new_string/g' filename.py
-    ```
-    ### View file content:
-    ```bash
-    # View specific lines with numbers
-    nl -ba filename.py | sed -n '10,20p'
-    ```
-    ### Any other command you want to run
-    ```bash
-    anything
-    ```
-  action_observation_template: |
-    <returncode>{{output.returncode}}</returncode>
-    {% if output.output | length < 10000 -%}
-    <output>
-    {{ output.output -}}
-    </output>
-    {%- else -%}
-    <warning>
-    The output of your last command was too long.
-    Please try a different command that produces less output.
-    If you're looking at a file you can try use head, tail or sed to view a smaller number of lines selectively.
-    If you're using grep or find and it produced too much output, you can use a more selective search pattern.
-    If you really need to see something from the full command's output, you can redirect output to a file and then search in that file.
-    </warning>
-    {%- set elided_chars = output.output | length - 10000 -%}
-    <output_head>
-    {{ output.output[:5000] }}
-    </output_head>
-    <elided_chars>
-    {{ elided_chars }} characters elided
-    </elided_chars>
-    <output_tail>
-    {{ output.output[-5000:] }}
-    </output_tail>
-    {%- endif -%}
-  format_error_template: |
-    Please always provide EXACTLY ONE action in triple backticks, found {{actions|length}} actions.
-    If you want to end the task, please issue the following command: `echo COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT`
-    without any other command.
-    Else, please format your response exactly as follows:
-    <response_example>
-    Here are some thoughts about why you want to perform the action.
-    ```bash
-    <action>
-    ```
-    </response_example>
-    Note: In rare cases, if you need to reference a similar format in your command, you might have
-    to proceed in two steps, first writing TRIPLEBACKTICKSBASH, then replacing them with ```bash.
-  step_limit: 0.
-  cost_limit: 3.
-  mode: confirm
-environment:
-  env:
-    PAGER: cat
-    MANPAGER: cat
-    LESS: -R
-    PIP_PROGRESS_BAR: 'off'
-    TQDM_DISABLE: '1'
-model:
-  model_kwargs:
-    drop_params: true