mini-swe-agent 1.17.0__py3-none-any.whl → 1.17.2__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: mini-swe-agent
3
- Version: 1.17.0
3
+ Version: 1.17.2
4
4
  Summary: Nano SWE Agent - A simple AI software engineering agent
5
5
  Author-email: Kilian Lieret <kilian.lieret@posteo.de>, "Carlos E. Jimenez" <carlosej@princeton.edu>
6
6
  License: MIT License
@@ -86,21 +86,21 @@ In 2024, [SWE-bench](https://github.com/swe-bench/SWE-bench) & [SWE-agent](https
86
86
 
87
87
  We now ask: **What if SWE-agent was 100x smaller, and still worked nearly as well?**
88
88
 
89
- `mini` is for
89
+ The `mini` agent is for
90
90
 
91
91
  - **Researchers** who want to **[benchmark](https://swe-bench.com), [fine-tune](https://swesmith.com/) or RL** without assumptions, bloat, or surprises
92
- - **Developers** who like their tools like their scripts: **short, sharp, and readable**
92
+ - **Developers** who like to **own, understand, and modify** their tools
93
93
  - **Engineers** who want something **trivial to sandbox & to deploy anywhere**
94
94
 
95
95
  Here's some details:
96
96
 
97
97
  - **Minimal**: Just [100 lines of python](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/agents/default.py) (+100 total for [env](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/environments/local.py),
98
98
  [model](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/models/litellm_model.py), [script](https://github.com/SWE-agent/mini-swe-agent/blob/main/src/minisweagent/run/hello_world.py)) — no fancy dependencies!
99
- - **Powerful:** Resolves >74% of GitHub issues in the [SWE-bench verified benchmark](https://www.swebench.com/) ([leaderboard](https://swe-bench.com/)).
100
- - **Convenient:** Comes with UIs that turn this into your daily dev swiss army knife!
99
+ - **Performant:** Scores >74% on the [SWE-bench verified benchmark](https://www.swebench.com/) benchmark; starts faster than Claude Code
101
100
  - **Deployable:** In addition to local envs, you can use **docker**, **podman**, **singularity**, **apptainer**, and more
102
- - **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
103
101
  - **Cutting edge:** Built by the Princeton & Stanford team behind [SWE-bench](https://swebench.com) and [SWE-agent](https://swe-agent.com).
102
+ - **Widely adopted:** In use by Meta, NVIDIA, Essential AI, Anyscale, and others
103
+ - **Tested:** [![Codecov](https://img.shields.io/codecov/c/github/swe-agent/mini-swe-agent?style=flat-square)](https://codecov.io/gh/SWE-agent/mini-swe-agent)
104
104
 
105
105
  <details>
106
106
 
@@ -108,7 +108,7 @@ Here's some details:
108
108
 
109
109
  [SWE-agent](https://swe-agent.com/latest/) jump-started the development of AI agents in 2024. Back then, we placed a lot of emphasis on tools and special interfaces for the agent.
110
110
  However, one year later, as LMs have become more capable, a lot of this is not needed at all to build a useful agent!
111
- In fact, mini-SWE-agent
111
+ In fact, the `mini` agent
112
112
 
113
113
  - **Does not have any tools other than bash** — it doesn't even use the tool-calling interface of the LMs.
114
114
  This means that you can run it with literally any model. When running in sandboxed environments you also don't need to take care
@@ -131,7 +131,7 @@ You can see the result on the [SWE-bench (bash only)](https://www.swebench.com/)
131
131
 
132
132
  Some agents are overfitted research artifacts. Others are UI-heavy frontend monsters.
133
133
 
134
- `mini` wants to be a hackable tool, not a black box.
134
+ The `mini` agent wants to be a hackable tool, not a black box.
135
135
 
136
136
  - **Simple** enough to understand at a glance
137
137
  - **Convenient** enough to use in daily workflows
@@ -1,21 +1,20 @@
1
- mini_swe_agent-1.17.0.dist-info/licenses/LICENSE.md,sha256=D3luWPkdHAe7LBsdD4vzqDAXw6Xewb3G-uczss0uh1s,1094
2
- minisweagent/__init__.py,sha256=tEnWAHcxFqNTVuv6G73e1YvEsTIPmiInXGGvSnXEcVw,2016
1
+ mini_swe_agent-1.17.2.dist-info/licenses/LICENSE.md,sha256=D3luWPkdHAe7LBsdD4vzqDAXw6Xewb3G-uczss0uh1s,1094
2
+ minisweagent/__init__.py,sha256=gEojonoOcUfOhgUFORuSn5xcgV412ZMFH8KV1tGoGbM,2016
3
3
  minisweagent/__main__.py,sha256=FIyAOiw--c3FQ2g240FOM1FdL0lk_PxSpixu0pQ7WFo,194
4
4
  minisweagent/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
5
5
  minisweagent/agents/__init__.py,sha256=cpjJLzg1IGxLM-tZpoMJV9S33ye13XtdBO0x7DU_Lrk,48
6
6
  minisweagent/agents/default.py,sha256=6XQEf4bawQL-J2r4ClsDsO_1uQ-VM7WrW3hGaHxKcQE,5657
7
7
  minisweagent/agents/interactive.py,sha256=NBeNamRuqww9ZRhOg1q8xPO9ziUw2gpAVV6hCPbpBxU,7470
8
8
  minisweagent/agents/interactive_textual.py,sha256=yUDMkuvhhnZAP8LtiBWmt5J5WzfWBeR0zNlJbdbEGa0,18153
9
- minisweagent/config/README.md,sha256=ABd9anA4aRWtx7Oh37z36Wv6ARvcxD2w9lPUE24R2mY,435
9
+ minisweagent/config/README.md,sha256=ZGH5KbFHpkwYOwoZgwP1dHOikuvU11dIG_TqyI73dik,355
10
10
  minisweagent/config/__init__.py,sha256=0KzHaaIqWgRy2zbwIzhrg6BJPDzOvYi3jb4eBNY4sAU,823
11
- minisweagent/config/default.yaml,sha256=iVNFs-FHrjc81RAiaTjGk5435G6V7OPjbXECu6RxJPU,5129
12
- minisweagent/config/github_issue.yaml,sha256=qbjj3vmdukxz36_EY7e64vhNn1g2-_NrdNx5xgMOUAI,4569
11
+ minisweagent/config/default.yaml,sha256=z1q91vFq_EeEb2fAuRiIwU7ZabiWP_-29M4GgaBxpgA,5108
12
+ minisweagent/config/github_issue.yaml,sha256=Ws1IHO_lGA7JfLzXvDUzpSasf-4hQzXG7EjJqHORtho,4548
13
13
  minisweagent/config/mini.tcss,sha256=fmAP9cYAp2n7Ps2Dw3e-ZOGEF2E8JcwTgK1LDcis-x4,1141
14
- minisweagent/config/mini.yaml,sha256=-3c4eKeCysFAfKJX3whUuBI6wbQgt8vrlcTFp_pcdyY,5145
15
- minisweagent/config/mini_no_temp.yaml,sha256=g1Y5goNTYZlDcSuBgKWJUdMkoK09w_5vheASZg1yYYI,5190
14
+ minisweagent/config/mini.yaml,sha256=eeOO6VkmiQcdU5dddESXUpGKFcSI2gBXsavNni4Xmig,5124
16
15
  minisweagent/config/extra/__init__.py,sha256=e1MoAlDn_wc9HnXNoncf1P-B4DQ-iRf6n7Q_txjZGRI,52
17
16
  minisweagent/config/extra/swebench.yaml,sha256=opFzxJPeMYY6oIpB6oUViXiax3ei7UTOlP0Lz1LbFss,7750
18
- minisweagent/config/extra/swebench_roulette.yaml,sha256=8O7PvO8tPGJN-mYuBGhWUlAzsjMPnbf3_i6Sn5v7RQ4,7813
17
+ minisweagent/config/extra/swebench_roulette.yaml,sha256=ZYkh_ji0e7TuLcWXMrDhUQMPLiYH__EI0C4Skj3sK_Y,7871
19
18
  minisweagent/config/extra/swebench_xml.yaml,sha256=dWXAqzXgw167hgiKqoqOryPHwAgDV2JbgoiDABdEznY,7827
20
19
  minisweagent/environments/__init__.py,sha256=x80Ulx0UK21GAwg5jSTkOFeiZ7CQsGBP8cI_5BhazAo,1266
21
20
  minisweagent/environments/docker.py,sha256=hsKOPGAP2kjgEwA_2HQz_nCrr25qmR4fB8u5Ob6UbzY,4370
@@ -26,12 +25,12 @@ minisweagent/environments/extra/bubblewrap.py,sha256=G12Dm63N30qByfLb1SKNsI4G4gL
26
25
  minisweagent/environments/extra/swerex_docker.py,sha256=WPYbohT_vqTHkde9cxpbV6chLXCpLl0PDAcgMbZsV0M,1707
27
26
  minisweagent/models/__init__.py,sha256=RGJgMPeF8W2Ix8_xwvHjjDCD9I6ygirz4v8Ps1KG6dI,4435
28
27
  minisweagent/models/anthropic.py,sha256=4p-LxQ_RYQUX1rBsffAj3T1bBb2uMRhA4IyKfDcMpgo,1517
29
- minisweagent/models/litellm_model.py,sha256=ph2j2gDwxl10p9NRUhldNJt-FAwYIyjfnNiaF8A0ThU,4280
30
- minisweagent/models/litellm_response_api_model.py,sha256=BJzDGE-rn9P_TVX60-8YJXS6SIgOTJAGgBszuHNKzEI,2916
31
- minisweagent/models/openrouter_model.py,sha256=dsVl8pnZ9M6nZ5DpLy69v1bizMR73mixXdVZVNzvknc,4712
32
- minisweagent/models/portkey_model.py,sha256=6HMOsmHd6Q6WdLUDa_1XFCrU0BaCUP5RqI3h6L8AxJQ,7223
33
- minisweagent/models/portkey_response_api_model.py,sha256=knIKR88u5YP-FqE27ZEopsuTSchzslq8CDjyxsMHtbU,2908
34
- minisweagent/models/requesty_model.py,sha256=Dqr0cLNSstj3VrzYhf-Sqsju8OvJrRPK2mctajgRnNs,3781
28
+ minisweagent/models/litellm_model.py,sha256=hunhw6lpizmdtmrt3PeY5-eFxlf6pZjeE0F7FmfzweM,4372
29
+ minisweagent/models/litellm_response_api_model.py,sha256=FF-Xab6xBvkxh_E6VoMx-Km1owRpHrkSIfgXZ31hCsU,2946
30
+ minisweagent/models/openrouter_model.py,sha256=7iw_yozhiVk3IwEEvS2jN2Ig_7-aTo-qeWYUEq2Hl2g,4774
31
+ minisweagent/models/portkey_model.py,sha256=ziX36Kzv9zCEz8wDIuNjUnqEmAW0eCmb7VjRZzKpVKE,7285
32
+ minisweagent/models/portkey_response_api_model.py,sha256=muHiiRpm7nXP1jylZovE4Ttr_JwFt1b9-YZgRCthTbk,2938
33
+ minisweagent/models/requesty_model.py,sha256=9f7T3tzafTZhyc5DkXK8Pl9n8a9-SzMqDn6Di-BQlFU,3843
35
34
  minisweagent/models/test_models.py,sha256=ItCA6ddntzkYA7dzSuUEaLMV-AE8TBuXBFP8CzpiO3U,1351
36
35
  minisweagent/models/extra/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
37
36
  minisweagent/models/extra/roulette.py,sha256=idteU0pGvmmipNr0s-42GAbVkmKE20hY2LTFxbkAgoI,2048
@@ -42,7 +41,7 @@ minisweagent/models/utils/openai_utils.py,sha256=3OEOR65gFeVCTpcLJyMkzbFL_B-k8ft
42
41
  minisweagent/run/__init__.py,sha256=WIoYgHVl7iZF2YncrfV3IttupG6P5KogroKHKECka3A,38
43
42
  minisweagent/run/github_issue.py,sha256=35mZoPLc4JV6XXJKRv55lnuKbXf5lDftd51N89-x9J0,3192
44
43
  minisweagent/run/hello_world.py,sha256=erLnEwNmPFLxq3-8zyv66Vy1kIqMqQf97vISX7LrQXg,959
45
- minisweagent/run/inspector.py,sha256=QnY3oYzm-yq3w9Jzs112Lco2Rg84vSocAWrQRVz_1lc,7127
44
+ minisweagent/run/inspector.py,sha256=P86kOmzySWdK4tx0DHAOfSF1h-s1vHboSsaRD3_0OKQ,7109
46
45
  minisweagent/run/mini.py,sha256=N3ZvTQmKHNJ9bEaiz5YHjJT4Arg0WtjxGLBTtj8-T0E,4922
47
46
  minisweagent/run/mini_extra.py,sha256=ecA1PnTWElpO60G9RktvVLtUOf3bZ_ESmnSttS6izhQ,1465
48
47
  minisweagent/run/extra/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
@@ -50,13 +49,13 @@ minisweagent/run/extra/config.py,sha256=KDMwg6eQCxbwI6P1phosCwaLQhJQXB4ti65M_Hox
50
49
  minisweagent/run/extra/swebench.py,sha256=sO3LnjLXdU6Zbo409YhxVdizU8LaQcJUdcD8Tj6saMw,11741
51
50
  minisweagent/run/extra/swebench_single.py,sha256=KmoUkD6UQ1P0MY_73-OtYuQAsNPmOLlZIZSYKZGs5MQ,3699
52
51
  minisweagent/run/extra/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
53
- minisweagent/run/extra/utils/batch_progress.py,sha256=xhJ7FmsaTBGz-yh8pzYl4yMoUGjn7GA24eYrP-nHj60,6804
52
+ minisweagent/run/extra/utils/batch_progress.py,sha256=URgnm5MpUA6liESNFqDIbzELM869PbXro7jKzvdbiv0,6804
54
53
  minisweagent/run/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
55
54
  minisweagent/run/utils/save.py,sha256=bokvblZ1SaIvCXimkRQqgvERKmVM0jn8SF7UoHBeerQ,2546
56
55
  minisweagent/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
57
56
  minisweagent/utils/log.py,sha256=ruDMNKMrVC9NPvCeHwO3QYz5jsVNUGQB2dRAEAPAWp8,996
58
- mini_swe_agent-1.17.0.dist-info/METADATA,sha256=Rbjq535rsiT5FfY89a1SydrveyAD8Sr9JUWaQMAUw9Y,14851
59
- mini_swe_agent-1.17.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
60
- mini_swe_agent-1.17.0.dist-info/entry_points.txt,sha256=d1_yRbTaGjs1UXHa6JQK0sKDGBIVGm8oeW0k2kfbJgQ,182
61
- mini_swe_agent-1.17.0.dist-info/top_level.txt,sha256=zKF4t8bFpV87fdVABZt2Da-vnb4Vkh_CxkwQx5YT4Ew,13
62
- mini_swe_agent-1.17.0.dist-info/RECORD,,
57
+ mini_swe_agent-1.17.2.dist-info/METADATA,sha256=wN_7BqZ2HOwKTRNQo8tb14Y8P3f769l0_6Vqh9_kpqI,14836
58
+ mini_swe_agent-1.17.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
59
+ mini_swe_agent-1.17.2.dist-info/entry_points.txt,sha256=d1_yRbTaGjs1UXHa6JQK0sKDGBIVGm8oeW0k2kfbJgQ,182
60
+ mini_swe_agent-1.17.2.dist-info/top_level.txt,sha256=zKF4t8bFpV87fdVABZt2Da-vnb4Vkh_CxkwQx5YT4Ew,13
61
+ mini_swe_agent-1.17.2.dist-info/RECORD,,
minisweagent/__init__.py CHANGED
@@ -8,7 +8,7 @@ This file provides:
8
8
  unless you want the static type checking.
9
9
  """
10
10
 
11
- __version__ = "1.17.0"
11
+ __version__ = "1.17.2"
12
12
 
13
13
  import os
14
14
  from pathlib import Path
@@ -1,7 +1,6 @@
1
1
  # Configs
2
2
 
3
3
  * `mini.yaml` - Default config for `mini`/`agents/interactive.py` or `mini -v`/`agents/interactive_textual.py` agent.
4
- * `mini_no_temp.yaml` - Same as `mini.yaml` but without the temperature setting
5
4
  * `default.yaml` - Default config for the `default.py` agent.
6
5
  * `github_issue.yaml` - Config for the `run/github_issue.py` entry point.
7
6
 
@@ -153,5 +153,4 @@ environment:
153
153
  TQDM_DISABLE: '1'
154
154
  model:
155
155
  model_kwargs:
156
- temperature: 0.0
157
156
  drop_params: true
@@ -36,7 +36,7 @@ agent:
36
36
  2. Provide exactly ONE bash command to execute
37
37
 
38
38
  ## Important Boundaries
39
- - MODIFY: Regular source code files in {{working_dir}}
39
+ - MODIFY: Regular source code files in /testbed (this is the working directory for all your subsequent commands)
40
40
  - DO NOT MODIFY: Tests, configuration files (pyproject.toml, setup.cfg, etc.)
41
41
 
42
42
  ## Recommended Workflow
@@ -142,5 +142,4 @@ environment:
142
142
  TQDM_DISABLE: '1'
143
143
  model:
144
144
  model_kwargs:
145
- temperature: 0.0
146
145
  drop_params: true
@@ -154,5 +154,4 @@ environment:
154
154
  TQDM_DISABLE: '1'
155
155
  model:
156
156
  model_kwargs:
157
- temperature: 0.0
158
157
  drop_params: true
@@ -68,9 +68,9 @@ class LitellmModel:
68
68
  def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
69
69
  if self.config.set_cache_control:
70
70
  messages = set_cache_control(messages, mode=self.config.set_cache_control)
71
- response = self._query(messages, **kwargs)
71
+ response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
72
72
  try:
73
- cost = litellm.cost_calculator.completion_cost(response)
73
+ cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
74
74
  if cost <= 0.0:
75
75
  raise ValueError(f"Cost must be > 0.0, got {cost}")
76
76
  except Exception as e:
@@ -62,7 +62,7 @@ class LitellmResponseAPIModel(LitellmModel):
62
62
  print(response)
63
63
  text = coerce_responses_text(response)
64
64
  try:
65
- cost = litellm.cost_calculator.completion_cost(response)
65
+ cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
66
66
  except Exception as e:
67
67
  logger.critical(
68
68
  f"Error calculating cost for model {self.config.model_name}: {e}. "
@@ -97,7 +97,7 @@ class OpenRouterModel:
97
97
  def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
98
98
  if self.config.set_cache_control:
99
99
  messages = set_cache_control(messages, mode=self.config.set_cache_control)
100
- response = self._query(messages, **kwargs)
100
+ response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
101
101
 
102
102
  usage = response.get("usage", {})
103
103
  cost = usage.get("cost", 0.0)
@@ -90,7 +90,7 @@ class PortkeyModel:
90
90
  def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
91
91
  if self.config.set_cache_control:
92
92
  messages = set_cache_control(messages, mode=self.config.set_cache_control)
93
- response = self._query(messages, **kwargs)
93
+ response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
94
94
  cost = self._calculate_cost(response)
95
95
  self.n_calls += 1
96
96
  self.cost += cost
@@ -52,7 +52,7 @@ class PortkeyResponseAPIModel(PortkeyModel):
52
52
  response = self._query(messages, **kwargs)
53
53
  text = coerce_responses_text(response)
54
54
  try:
55
- cost = litellm.cost_calculator.completion_cost(response)
55
+ cost = litellm.cost_calculator.completion_cost(response, model=self.config.model_name)
56
56
  assert cost > 0.0, f"Cost is not positive: {cost}"
57
57
  except Exception as e:
58
58
  if self.config.cost_tracking != "ignore_errors":
@@ -91,7 +91,7 @@ class RequestyModel:
91
91
  raise RequestyAPIError(f"Request failed: {e}") from e
92
92
 
93
93
  def query(self, messages: list[dict[str, str]], **kwargs) -> dict:
94
- response = self._query(messages, **kwargs)
94
+ response = self._query([{"role": msg["role"], "content": msg["content"]} for msg in messages], **kwargs)
95
95
 
96
96
  # Extract cost from usage information
97
97
  usage = response.get("usage", {})
@@ -79,7 +79,7 @@ class RunBatchProgressManager:
79
79
  "[cyan]Overall Progress", total=num_instances, total_cost="0.00", eta=""
80
80
  )
81
81
 
82
- self.render_group = Group(Table(), self._task_progress_bar, self._main_progress_bar)
82
+ self.render_group = Group(self._main_progress_bar, Table(), self._task_progress_bar)
83
83
  self._yaml_report_path = yaml_report_path
84
84
 
85
85
  @property
@@ -112,7 +112,7 @@ class RunBatchProgressManager:
112
112
  instances_str = _shorten_str(", ".join(reversed(instances)), 55)
113
113
  t.add_row(status, str(len(instances)), instances_str)
114
114
  assert self.render_group is not None
115
- self.render_group.renderables[0] = t
115
+ self.render_group.renderables[1] = t
116
116
 
117
117
  def _update_total_costs(self) -> None:
118
118
  with self._lock:
@@ -2,9 +2,7 @@
2
2
  """
3
3
  Simple trajectory inspector for browsing agent conversation trajectories.
4
4
 
5
- [not dim]
6
- More information about the usage: [bold green]https://mini-swe-agent.com/latest/usage/inspector/[/bold green]
7
- [/not dim]
5
+ More information about the usage: [bold green] https://mini-swe-agent.com/latest/usage/inspector/ [/bold green].
8
6
  """
9
7
 
10
8
  import json
@@ -1,158 +0,0 @@
1
- # Identical config file to mini.yaml, but without temperature=0.0
2
- agent:
3
- system_template: |
4
- You are a helpful assistant that can interact with a computer.
5
-
6
- Your response must contain exactly ONE bash code block with ONE command (or commands connected with && or ||).
7
- Include a THOUGHT section before your command where you explain your reasoning process.
8
- Format your response as shown in <format_example>.
9
-
10
- <format_example>
11
- Your reasoning and analysis here. Explain why you want to perform the action.
12
-
13
- ```bash
14
- your_command_here
15
- ```
16
- </format_example>
17
-
18
- Failure to follow these rules will cause your response to be rejected.
19
- instance_template: |
20
- Please solve this issue: {{task}}
21
-
22
- You can execute bash commands and edit files to implement the necessary changes.
23
-
24
- ## Recommended Workflow
25
-
26
- This workflows should be done step-by-step so that you can iterate on your changes and any possible problems.
27
-
28
- 1. Analyze the codebase by finding and reading relevant files
29
- 2. Create a script to reproduce the issue
30
- 3. Edit the source code to resolve the issue
31
- 4. Verify your fix works by running your script again
32
- 5. Test edge cases to ensure your fix is robust
33
- 6. Submit your changes and finish your work by issuing the following command: `echo COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT`.
34
- Do not combine it with any other command. <important>After this command, you cannot continue working on this task.</important>
35
-
36
- ## Important Rules
37
-
38
- 1. Every response must contain exactly one action
39
- 2. The action must be enclosed in triple backticks
40
- 3. Directory or environment variable changes are not persistent. Every action is executed in a new subshell.
41
- However, you can prefix any action with `MY_ENV_VAR=MY_VALUE cd /path/to/working/dir && ...` or write/load environment variables from files
42
-
43
- <system_information>
44
- {{system}} {{release}} {{version}} {{machine}}
45
- </system_information>
46
-
47
- ## Formatting your response
48
-
49
- Here is an example of a correct response:
50
-
51
- <example_response>
52
- THOUGHT: I need to understand the structure of the repository first. Let me check what files are in the current directory to get a better understanding of the codebase.
53
-
54
- ```bash
55
- ls -la
56
- ```
57
- </example_response>
58
-
59
- ## Useful command examples
60
-
61
- ### Create a new file:
62
-
63
- ```bash
64
- cat <<'EOF' > newfile.py
65
- import numpy as np
66
- hello = "world"
67
- print(hello)
68
- EOF
69
- ```
70
-
71
- ### Edit files with sed:
72
-
73
- {%- if system == "Darwin" -%}
74
- <important>
75
- You are on MacOS. For all the below examples, you need to use `sed -i ''` instead of `sed -i`.
76
- </important>
77
- {%- endif -%}
78
-
79
- ```bash
80
- # Replace all occurrences
81
- sed -i 's/old_string/new_string/g' filename.py
82
-
83
- # Replace only first occurrence
84
- sed -i 's/old_string/new_string/' filename.py
85
-
86
- # Replace first occurrence on line 1
87
- sed -i '1s/old_string/new_string/' filename.py
88
-
89
- # Replace all occurrences in lines 1-10
90
- sed -i '1,10s/old_string/new_string/g' filename.py
91
- ```
92
-
93
- ### View file content:
94
-
95
- ```bash
96
- # View specific lines with numbers
97
- nl -ba filename.py | sed -n '10,20p'
98
- ```
99
-
100
- ### Any other command you want to run
101
-
102
- ```bash
103
- anything
104
- ```
105
- action_observation_template: |
106
- <returncode>{{output.returncode}}</returncode>
107
- {% if output.output | length < 10000 -%}
108
- <output>
109
- {{ output.output -}}
110
- </output>
111
- {%- else -%}
112
- <warning>
113
- The output of your last command was too long.
114
- Please try a different command that produces less output.
115
- If you're looking at a file you can try use head, tail or sed to view a smaller number of lines selectively.
116
- If you're using grep or find and it produced too much output, you can use a more selective search pattern.
117
- If you really need to see something from the full command's output, you can redirect output to a file and then search in that file.
118
- </warning>
119
- {%- set elided_chars = output.output | length - 10000 -%}
120
- <output_head>
121
- {{ output.output[:5000] }}
122
- </output_head>
123
- <elided_chars>
124
- {{ elided_chars }} characters elided
125
- </elided_chars>
126
- <output_tail>
127
- {{ output.output[-5000:] }}
128
- </output_tail>
129
- {%- endif -%}
130
- format_error_template: |
131
- Please always provide EXACTLY ONE action in triple backticks, found {{actions|length}} actions.
132
- If you want to end the task, please issue the following command: `echo COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT`
133
- without any other command.
134
- Else, please format your response exactly as follows:
135
-
136
- <response_example>
137
- Here are some thoughts about why you want to perform the action.
138
-
139
- ```bash
140
- <action>
141
- ```
142
- </response_example>
143
-
144
- Note: In rare cases, if you need to reference a similar format in your command, you might have
145
- to proceed in two steps, first writing TRIPLEBACKTICKSBASH, then replacing them with ```bash.
146
- step_limit: 0.
147
- cost_limit: 3.
148
- mode: confirm
149
- environment:
150
- env:
151
- PAGER: cat
152
- MANPAGER: cat
153
- LESS: -R
154
- PIP_PROGRESS_BAR: 'off'
155
- TQDM_DISABLE: '1'
156
- model:
157
- model_kwargs:
158
- drop_params: true