vision-agent 0.2.56__py3-none-any.whl → 0.2.58__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,152 +0,0 @@
- VISION_AGENT_REFLECTION = """You are an advanced reasoning agent that can improve based on self-reflection. You will be given a previous reasoning trial in which you were given the user's question, the available tools that the agent has, the decomposed tasks and tools that the agent used to answer the question, the tool usage for each of the tools used and the final answer the agent provided. You may also receive an image with the visualized bounding boxes or masks with their associated labels and scores from the tools used.
-
- Please note that:
- 1. You must ONLY output parsable JSON format. If the agent's output was correct, set "Finish" to true, else set "Finish" to false. An example output looks like:
- {{"Finish": true, "Reflection": "The agent's answer was correct."}}
- 2. You must utilize the image with the visualized bounding boxes or masks and determine whether the tools were used correctly, whether they were used incorrectly, or whether the wrong tools were used.
- 3. If the agent's answer was incorrect, you must diagnose the reason for failure and devise a new concise and concrete plan that aims to mitigate the same failure with the tools available. An example output looks like:
- {{"Finish": false, "Reflection": "I can see from the visualized bounding boxes that the agent's answer was incorrect because the grounding_dino_ tool produced false positive predictions. The agent should use the following tools with the following parameters:
- Step 1: Use 'grounding_dino_' with a 'prompt' of 'baby. bed' and a 'box_threshold' of 0.7 to reduce the false positives.
- Step 2: Use 'box_iou_' with the baby bounding box and the bed bounding box to determine if the baby is on the bed or not."}}
- 4. If the task cannot be completed with the existing tools or by adjusting the parameters, set "Finish" to true.
-
- User's question: {question}
-
- Tools available:
- {tools}
-
- Tasks and tools used:
- {tool_results}
-
- API documentation for the tools used:
- {tool_usage}
-
- Final answer:
- {final_answer}
-
- Reflection: """
-
- TASK_DECOMPOSE = """You need to decompose a user's complex question into one or more simple subtasks and let the model execute them step by step.
- This is the user's question: {question}
- This is the tool list:
- {tools}
-
- Please note that:
- 1. If the given task is simple and the answer can be provided by executing one tool, you should only use that tool to provide the answer.
- 2. If the given task is complex, you should decompose the user's complex question into simple subtasks, each of which can be executed easily with a single tool from the tool list.
- 3. You should try to decompose the complex question into the least number of subtasks.
- 4. If one subtask needs the results from another subtask, you should state that clearly. For example:
- {{"Tasks": ["Convert 23 km/h to X km/min by 'divide_'", "Multiply X km/min by 45 min to get Y by 'multiply_'"]}}
- 5. You must ONLY output in a parsable JSON format. An example output looks like:
-
- {{"Tasks": ["Task 1", "Task 2", ...]}}
-
- Output: """
-
- TASK_DECOMPOSE_DEPENDS = """You need to decompose a user's complex question into one or more simple subtasks and let the model execute them step by step.
- This is the user's question: {question}
-
- This is the tool list:
- {tools}
-
- This is a reflection from a previous failed attempt:
- {reflections}
-
- Please note that:
- 1. If the given task is simple and the answer can be provided by executing one tool, you should only use that tool to provide the answer.
- 2. If the given task is complex, you should decompose the user's complex question into simple subtasks, each of which can be executed easily with a single tool from the tool list.
- 3. You should try to decompose the complex question into the least number of subtasks.
- 4. If one subtask needs the results from another subtask, you should state that clearly. For example:
- {{"Tasks": ["Convert 23 km/h to X km/min by 'divide_'", "Multiply X km/min by 45 min to get Y by 'multiply_'"]}}
- 5. You must ONLY output in a parsable JSON format. An example output looks like:
-
- {{"Tasks": ["Task 1", "Task 2", ...]}}
-
- Output: """
-
- CHOOSE_TOOL = """This is the user's question: {question}
- These are the tools you can select to solve the question:
- {tools}
-
- Please note that:
- 1. You should only choose one tool from the Tool List to solve this question, and it should be the one with the maximum chance of solving the question.
- 2. You should only choose the tool whose parameters are most relevant to the user's question and are available as part of the question.
- 3. You should choose the tool whose return type is most relevant to the answer of the user's question.
- 4. You must ONLY output the ID of the tool you chose in a parsable JSON format. Two example outputs look like:
-
- Example 1: {{"ID": 1}}
- Example 2: {{"ID": 2}}
-
- Output: """
-
- CHOOSE_TOOL_DEPENDS = """This is the user's question: {question}
- These are the tools you can select to solve the question:
- {tools}
-
- This is a reflection from a previous failed attempt:
- {reflections}
-
- Please note that:
- 1. You should only choose one tool from the Tool List to solve this question, and it should be the one with the maximum chance of solving the question.
- 2. You should only choose the tool whose parameters are most relevant to the user's question and are available as part of the question.
- 3. You should choose the tool whose return type is most relevant to the answer of the user's question.
- 4. You must ONLY output the ID of the tool you chose in a parsable JSON format. Two example outputs look like:
-
- Example 1: {{"ID": 1}}
- Example 2: {{"ID": 2}}
-
- Output: """
-
- CHOOSE_PARAMETER_DEPENDS = """Given a user's question and API tool documentation, you need to output parameters according to the API tool documentation to successfully call the API and solve the user's question.
- Please note that:
- 1. The examples in the API tool documentation can help you better understand the use of the API. Pay attention to the examples, which show how to parse the question and extract tool parameters such as prompts and visual inputs.
- 2. Ensure the parameters you output are correct. The output must contain the required parameters, and can contain the optional parameters based on the question. If there are no required or optional parameters, just leave it as {{"Parameters":{{}}}}
- 3. If the user's question mentions other APIs, you should ONLY consider the API tool documentation given here and not consider other APIs.
- 4. The question may have dependencies on answers of other questions, so we will provide logs of previous questions and answers for your reference.
- 5. If you need to use this API multiple times, please set "Parameters" to a list.
- 6. You must ONLY output in a parsable JSON format. Two example outputs look like:
-
- Example 1: {{"Parameters":{{"input": [1,2,3]}}}}
- Example 2: {{"Parameters":[{{"input": [1,2,3]}}, {{"input": [2,3,4]}}]}}
-
- This is a reflection from a previous failed attempt:
- {reflections}
-
- These are logs of previous questions and answers:
- {previous_log}
-
- This is the current user's question: {question}
- This is the API tool documentation: {tool_usage}
- Output: """
-
- ANSWER_GENERATE_DEPENDS = """You should answer the question based on the response output by the API tool.
- Please note that:
- 1. You should try to organize the response into a natural language answer.
- 2. We will not show the API response to the user, thus you need to make full use of the response and give the information in the response that can satisfy the user's question in as much detail as possible.
- 3. If the API tool does not provide useful information in the response, please answer with your knowledge.
- 4. The question may have dependencies on answers of other questions, so we will provide logs of previous questions and answers.
-
- This is a reflection from a previous failed attempt:
- {reflections}
-
- These are logs of previous questions and answers:
- {previous_log}
-
- This is the user's question: {question}
-
- This is the response output by the API tool:
- {call_results}
-
- We will not show the API response to the user, thus you need to make full use of the response and give the information in the response that can satisfy the user's question in as much detail as possible.
- Output: """
-
- ANSWER_SUMMARIZE_DEPENDS = """We break down a user's complex problem into simple subtasks and provide answers to each simple subtask. You need to organize these answers to each subtask and form a self-consistent final answer to the user's question.
- This is the user's question: {question}
-
- These are subtasks and their answers:
- {answers}
-
- This is a reflection from a previous failed attempt:
- {reflections}
-
- Final answer: """
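
Every template above is filled with str.format: single-brace fields such as {question} and {tools} are substituted, while the doubled braces render as the literal JSON the model is asked to return. Below is a minimal sketch of that flow for the removed TASK_DECOMPOSE template, assuming the string shown above is in scope; the question, the tool list, and the commented-out model call are hypothetical stand-ins, not part of the package:

```python
import json

# TASK_DECOMPOSE is the removed template shown above; the question and tool
# list here are invented purely for illustration.
prompt = TASK_DECOMPOSE.format(
    question="How many bottles are on the shelf?",
    tools="ID 1: grounding_dino_: detects objects that match a text prompt.",
)
# raw = some_llm(prompt)  # hypothetical model call, omitted in this sketch
raw = '{"Tasks": ["Detect bottles with grounding_dino_", "Count the detections"]}'
tasks = json.loads(raw)["Tasks"]  # the prompt demands parsable JSON, so json.loads is enough
print(tasks)
```

Because every template insists on strictly parsable JSON, the downstream code only ever needs a plain json.loads rather than any free-text parsing.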
@@ -1,299 +0,0 @@
- import logging
- import re
- import sys
- from pathlib import Path
- from typing import Dict, List, Optional, Tuple, Union
-
- from vision_agent.llm import LLM, OpenAILLM
- from vision_agent.lmm import LMM
-
- from .agent import Agent
- from .reflexion_prompts import (
-     CHECK_FINSH,
-     COT_AGENT_REFLECT_INSTRUCTION,
-     COT_REFLECT_INSTRUCTION,
-     COT_SIMPLE_REFLECTION,
-     COTQA_SIMPLE6,
-     REFLECTION_HEADER,
- )
-
- logging.basicConfig(stream=sys.stdout)
-
- _LOGGER = logging.getLogger(__name__)
-
-
- def format_step(step: str) -> str:
-     return step.strip("\n").strip().replace("\n", "")
-
-
- def parse_action(input: str) -> Tuple[str, str]:
-     # Make the pattern slightly less strict, the LMMs are not as good at following
-     # instructions so they often would fail on the original regex.
-     pattern = r"(\w+)\[(.+)\]"
-     match = re.search(pattern, input)
-
-     if match:
-         action_type = match.group(1)
-         argument = match.group(2)
-         return action_type, argument
-
-     _LOGGER.error(f"Invalid action: {input}")
-     raise ValueError(f"Invalid action: {input}")
-
-
- def format_reflections(reflections: List[str], header: str = REFLECTION_HEADER) -> str:
-     if reflections == []:
-         return ""
-     else:
-         return (
-             header + "Reflections:\n- " + "\n- ".join([r.strip() for r in reflections])
-         )
-
-
- def format_chat(chat: List[Dict[str, str]]) -> str:
-     chat_str = ""
-     for c in chat:
-         chat_str += c["role"] + ": " + c["content"] + "\n"
-     return chat_str.strip()
-
-
- class Reflexion(Agent):
-     r"""This is an implementation of the Reflexion paper https://arxiv.org/abs/2303.11366
-     based on the original implementation https://github.com/noahshinn/reflexion in the
-     hotpotqa folder. There are several differences between this implementation and the
-     original one. Because we do not have instant feedback on whether or not the agent
-     was correct, we use user feedback to determine if the agent was correct. The user
-     feedback is evaluated by the self_reflect_model with a new prompt. We also expand
-     Reflexion to include the ability to use an image as input to the action_agent and the
-     self_reflect_model. Using Reflexion with LMMs may not work well; if it gets it wrong
-     the first time, chances are it can't actually see the thing you want it to see.
-
-     Example
-     -------
-     >>> from vision_agent.agent import Reflexion
-     >>> agent = Reflexion()
-     >>> question = "How many tires does a truck have?"
-     >>> resp = agent(question)
-     >>> print(resp)
-     "18"
-     >>> resp = agent([
-     >>>     {"role": "user", "content": question},
-     >>>     {"role": "assistant", "content": resp},
-     >>>     {"role": "user", "content": "No, I mean those regular trucks but where the back tires are double."}
-     >>> ])
-     >>> print(resp)
-     "6"
-     >>> agent = Reflexion(
-     >>>     self_reflect_model=va.lmm.OpenAILMM(),
-     >>>     action_agent=va.lmm.OpenAILMM()
-     >>> )
-     >>> question = "How many hearts are in this image?"
-     >>> resp = agent(question, media="cards.png")
-     >>> print(resp)
-     "6"
-     >>> resp = agent([
-     >>>     {"role": "user", "content": question},
-     >>>     {"role": "assistant", "content": resp},
-     >>>     {"role": "user", "content": "No, please count the hearts on the bottom card."}
-     >>> ], media="cards.png")
-     >>> print(resp)
-     "4"
-     """
-
-     def __init__(
-         self,
-         cot_examples: str = COTQA_SIMPLE6,
-         reflect_examples: str = COT_SIMPLE_REFLECTION,
-         agent_prompt: str = COT_AGENT_REFLECT_INSTRUCTION,
-         reflect_prompt: str = COT_REFLECT_INSTRUCTION,
-         finsh_prompt: str = CHECK_FINSH,
-         self_reflect_model: Optional[Union[LLM, LMM]] = None,
-         action_agent: Optional[Union[Agent, LLM, LMM]] = None,
-         verbose: bool = False,
-     ):
-         self.agent_prompt = agent_prompt
-         self.reflect_prompt = reflect_prompt
-         self.finsh_prompt = finsh_prompt
-         self.cot_examples = cot_examples
-         self.reflect_examples = reflect_examples
-         self.reflections: List[str] = []
-         if verbose:
-             _LOGGER.setLevel(logging.INFO)
-
-         if isinstance(self_reflect_model, LLM) and not isinstance(action_agent, LLM):
-             raise ValueError(
-                 "If self_reflect_model is an LLM, then action_agent must also be an LLM."
-             )
-         if isinstance(self_reflect_model, LMM) and isinstance(action_agent, LLM):
-             raise ValueError(
-                 "If self_reflect_model is an LMM, then action_agent must also be an agent or LMM."
-             )
-
-         self.self_reflect_model = (
-             OpenAILLM() if self_reflect_model is None else self_reflect_model
-         )
-         self.action_agent = OpenAILLM() if action_agent is None else action_agent
-
-     def __call__(
-         self,
-         input: Union[str, List[Dict[str, str]]],
-         media: Optional[Union[str, Path]] = None,
-     ) -> str:
-         """Invoke the vision agent.
-
-         Parameters:
-             input: a prompt that describes the task or a conversation in the format of [{"role": "user", "content": "describe your task here..."}].
-             media: the input media referenced in the prompt parameter.
-
-         Returns:
-             A text response.
-         """
-         if isinstance(input, str):
-             input = [{"role": "user", "content": input}]
-         return self.chat(input, media)
-
-     def chat(
-         self, chat: List[Dict[str, str]], media: Optional[Union[str, Path]] = None
-     ) -> str:
-         if len(chat) == 0 or chat[0]["role"] != "user":
-             raise ValueError(
-                 f"Invalid chat. Should start with user and alternate between user "
-                 f"and assistant and contain at least one entry {chat}"
-             )
-         if media is not None and isinstance(self.action_agent, LLM):
-             raise ValueError(
-                 "If image is provided, then action_agent must be an agent or LMM."
-             )
-
-         question = chat[0]["content"]
-         if len(chat) == 1:
-             results = self._step(question, image=media)
-             self.last_scratchpad = results["scratchpad"]
-             return results["action_arg"]
-
-         # Observe
-         chat_str = format_chat(chat)
-         is_correct = self.prompt_finish(chat_str)
-         self.last_scratchpad += "\nObservation: "
-         if is_correct:
-             self.last_scratchpad += "Answer is CORRECT"
-             return self.self_reflect_model(chat)
-         else:
-             self.last_scratchpad += "Answer is INCORRECT"
-             chat_context = "The previous conversation was:\n" + chat_str
-             reflections = self.reflect(
-                 question, chat_context, self.last_scratchpad, media
-             )
-             _LOGGER.info(f" {reflections}")
-             results = self._step(question, reflections, image=media)
-             self.last_scratchpad = results["scratchpad"]
-             return results["action_arg"]
-
-     def _step(
-         self,
-         question: str,
-         reflections: str = "",
-         image: Optional[Union[str, Path]] = None,
-     ) -> Dict[str, str]:
-         # Think
-         scratchpad = "\nThought:"
-         scratchpad += " " + self.prompt_agent(question, reflections, scratchpad, image)
-         _LOGGER.info(f" {scratchpad}")
-
-         # Act
-         scratchpad += "\nAction:"
-         action = self.prompt_agent(question, reflections, scratchpad, image)
-         _LOGGER.info(f" {action}")
-         scratchpad += " " + action
-         action_type, argument = parse_action(action)
-         return {
-             "scratchpad": scratchpad,
-             "action_type": action_type,
-             "action_arg": argument,
-         }
-
-     def reflect(
-         self,
-         question: str,
-         context: str,
-         scratchpad: str,
-         image: Optional[Union[str, Path]],
-     ) -> str:
-         self.reflections += [
-             self.prompt_reflection(question, context, scratchpad, image)
-         ]
-         return format_reflections(self.reflections)
-
-     def prompt_agent(
-         self,
-         question: str,
-         reflections: str,
-         scratchpad: str,
-         image: Optional[Union[str, Path]] = None,
-     ) -> str:
-         if isinstance(self.action_agent, LLM):
-             return format_step(
-                 self.action_agent(
-                     self._build_agent_prompt(question, reflections, scratchpad)
-                 )
-             )
-         elif isinstance(self.action_agent, LMM):
-             return format_step(
-                 self.action_agent(
-                     self._build_agent_prompt(question, reflections, scratchpad),
-                     images=[image] if image is not None else None,
-                 )
-             )
-         elif isinstance(self.action_agent, Agent):
-             return format_step(
-                 self.action_agent(
-                     self._build_agent_prompt(question, reflections, scratchpad),
-                     media=image,
-                 )
-             )
-
-     def prompt_reflection(
-         self,
-         question: str,
-         context: str = "",
-         scratchpad: str = "",
-         image: Optional[Union[str, Path]] = None,
-     ) -> str:
-         if isinstance(self.self_reflect_model, LLM):
-             return format_step(
-                 self.self_reflect_model(
-                     self._build_reflect_prompt(question, context, scratchpad)
-                 )
-             )
-         return format_step(
-             self.self_reflect_model(
-                 self._build_reflect_prompt(question, context, scratchpad),
-                 images=[image] if image is not None else None,
-             )
-         )
-
-     def prompt_finish(self, chat: str) -> bool:
-         answer = self.action_agent(self.finsh_prompt.format(chat=chat))
-         return "true" in answer.lower()
-
-     def _build_agent_prompt(
-         self, question: str, reflections: str, scratchpad: str
-     ) -> str:
-         return self.agent_prompt.format(
-             examples=self.cot_examples,
-             reflections=reflections,
-             context="",
-             question=question,
-             scratchpad=scratchpad,
-         )
-
-     def _build_reflect_prompt(
-         self, question: str, context: str = "", scratchpad: str = ""
-     ) -> str:
-         return self.reflect_prompt.format(
-             examples=self.reflect_examples,
-             context=context,
-             question=question,
-             scratchpad=scratchpad,
-         )
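
The comment inside parse_action above explains why its regex was loosened: LMM output often wraps the action in extra prose, and re.search only needs to find one Action[argument] span somewhere in the reply. A standalone restatement of that helper, with a couple of hypothetical inputs, illustrates the behavior:

```python
import re
from typing import Tuple


def parse_action(text: str) -> Tuple[str, str]:
    # Same pattern as the removed helper above: an action word followed by a
    # bracketed argument, located anywhere in the string via re.search.
    match = re.search(r"(\w+)\[(.+)\]", text)
    if match:
        return match.group(1), match.group(2)
    raise ValueError(f"Invalid action: {text}")


print(parse_action("Finish[Nile River]"))               # ('Finish', 'Nile River')
print(parse_action("I think the answer is Finish[6]"))  # leading prose is tolerated
```

Anchoring on re.search rather than a full-string match is what lets the agent recover the action even when the model does not follow the "Action: ..." format exactly.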
@@ -1,100 +0,0 @@
- COT_AGENT_REFLECT_INSTRUCTION = """Solve a question answering task by having a Thought, then Finish with your answer. Thought can reason about the current situation. Finish[answer] returns the answer and finishes the task. You will be given context that you should use to help you answer the question.
- Here are some examples:
- {examples}
- (END OF EXAMPLES)
-
- {reflections}
-
- Relevant Context: {context}
- Question: {question}{scratchpad}"""
-
-
- COT_REFLECT_INSTRUCTION = """You are an advanced reasoning agent that can improve based on self-reflection. You will be given a previous reasoning trial in which you were given access to relevant context and a question to answer. You were unsuccessful in answering the question either because you guessed the wrong answer with Finish[<answer>] or there is a phrasing discrepancy between your provided answer and the answer key. In a few sentences, diagnose a possible reason for failure or phrasing discrepancy and devise a new, concise, high level plan that aims to mitigate the same failure. Use complete sentences.
- Here are some examples:
- {examples}
- (END OF EXAMPLES)
-
- Previous trial:
- Relevant Context: {context}
- Question: {question}{scratchpad}
-
- Reflection:"""
-
-
- REFLECTION_HEADER = "You have attempted to answer the following question before and failed. The following reflection(s) give a plan to avoid failing to answer the question in the same way you did previously. Use them to improve your strategy of correctly answering the given question.\n"
-
-
- COT = """Relevant Context: The Nile River is the longest river in the world, spanning approximately 6,650 kilometers (4,132 miles) in length. It flows through eleven countries in northeastern Africa, including Egypt, Sudan, and Uganda.
- Question: What is the longest river in the world?
- Thought: The question asks for the longest river in the world, which I know is the Nile River based on the context provided.
- Action: Finish[Nile River]
-
- Relevant Context: Ludwig van Beethoven was a German composer and pianist who was a crucial figure in the transition between the Classical and Romantic eras in Western classical music. One of his most famous compositions is the Symphony No. 9, also known as the "Choral" symphony.
- Question: Which composer created Symphony No. 9?
- Thought: The question is asking for the composer of Symphony No. 9. Based on the context, I know that Ludwig van Beethoven composed this symphony.
- Action: Finish[Ludwig van Beethoven]
-
- Relevant Context: Photosynthesis is the process by which green plants and some other organisms convert light energy into chemical energy. During this process, plants absorb sunlight, carbon dioxide, and water to produce glucose and oxygen.
- Question: What do plants produce during photosynthesis?
- Thought: The question is asking about the products of photosynthesis. From the context, I know that plants produce glucose and oxygen during this process.
- Action: Finish[Glucose and Oxygen]
- """
-
- COT_REFLECT = """
- Relevant Context: Ernest Hemingway's novel "The Old Man and the Sea" tells the story of Santiago, an aging Cuban fisherman, who struggles to catch a giant marlin in the Gulf Stream. The book won the Pulitzer Prize for Fiction in 1953 and contributed to Hemingway's Nobel Prize for Literature in 1954.
- Question: Which literary award did "The Old Man and the Sea" contribute to Hemingway winning?
- Thought: The question is asking which award "The Old Man and the Sea" contributed to Hemingway winning. Based on the context, I know the novel won the Pulitzer Prize for Fiction and contributed to his Nobel Prize for Literature.
- Action: Finish[Pulitzer Prize for Fiction]
-
- Reflection: My answer was correct based on the context, but may not be the exact answer stored by the grading environment. Next time, I should try to provide a less verbose answer like "Pulitzer Prize" or "Nobel Prize."
-
- Context: On 14 October 1947, Chuck Yeager, a United States Air Force test pilot, became the first person to break the sound barrier by flying the Bell X-1 experimental aircraft at an altitude of 45,000 feet.
- Charles Elwood "Chuck" Yeager (13 February 1923 - 7 December 2020) was a United States Air Force officer, flying ace, and test pilot. He is best known for becoming the first person to break the sound barrier, which he achieved in the Bell X-1 aircraft named Glamorous Glennis. Yeager was also a distinguished fighter pilot during World War II and was credited with shooting down at least 12 enemy aircraft. In 1973, he was inducted into the National Aviation Hall of Fame for his significant contributions to aviation.
- Question: Who is the first person to break the sound barrier?
- Thought: The question is asking for the first person to break the sound barrier. From the context, I know that Chuck Yeager, a United States Air Force test pilot, was the first person to break the sound barrier.
- Action: Finish[Chuck Yeager]
-
- Reflection: Upon reflecting on the incorrect answer I provided, I realize that I may not have given the full name of the individual in question. In the context, both the given name and the nickname were mentioned, and I only used the nickname in my response. This could have been the reason my answer was deemed incorrect. Moving forward, when attempting this question again or similar questions, I will make sure to include the complete name of the person, which consists of their given name, any middle names, and their nickname (if applicable). This will help ensure that my answer is more accurate and comprehensive."""
-
- COTQA_SIMPLE6 = """
- Question: What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?
- Thought: Let's think step by step. The eastern sector of Colorado orogeny extends into the High Plains. High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer is 1,800 to 7,000 ft.
- Action: Finish[1,800 to 7,000 ft]
-
- Question: Musician and satirist Allie Goertz wrote a song about the "The Simpsons" character Milhouse, who Matt Groening named after who?
- Thought: Let's think step by step. Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.
- Action: Finish[Richard Nixon]
-
- Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?
- Thought: Let's think step by step. Adam Clayton Powell (film) is a documentary about an African-American politician, not Finnish rock groups. So the documentary about Finnish rock groups must instead be The Saimaa Gesture.
- Action: Finish[The Saimaa Gesture]
-
- Question: What profession does Nicholas Ray and Elia Kazan have in common?
- Thought: Let's think step by step. Professions of Nicholas Ray are director, screenwriter, and actor. Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.
- Action: Finish[director, screenwriter, actor]
-
- Question: Which magazine was started first Arthur's Magazine or First for Women?
- Thought: Let's think step by step. Arthur's Magazine was started in 1844. First for Women was started in 1989. 1844 (Arthur's Magazine) < 1989 (First for Women), so Arthur's Magazine was started first.
- Action: Finish[Arthur's Magazine]
-
- Question: Were Pavel Urysohn and Leonid Levin known for the same type of work?
- Thought: Let's think step by step. Pavel Urysohn is a mathematician. Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work.
- Action: Finish[Yes]"""
-
- COT_SIMPLE_REFLECTION = """
- Question: What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?
- Thought: Let's think step by step. The eastern sector of Colorado orogeny extends into the Rocky Mountains. The Rocky Mountains rise in elevation from around 1,800 to 14,000 ft, so the answer is 1,800 to 14,000 ft.
- Action: Finish[1,800 to 14,000 ft]
-
- Reflection: My reasoning for the elevation range of the eastern sector of the Colorado orogeny failed because I incorrectly associated it with the Rocky Mountains instead of the High Plains. In the future, when attempting this question, I should focus on better understanding the geography and geological formations involved in the Colorado orogeny to avoid confusion.
-
- Question: Musician and satirist Allie Goertz wrote a song about the "The Simpsons" character Milhouse, who Matt Groening named after who?
- Thought: Let's think step by step. Milhouse was named after a prominent cartoon character, Mickey Mouse, so the answer is Mickey Mouse.
- Action: Finish[Mickey Mouse]
-
- Reflection: My reasoning for the naming of the character Milhouse in "The Simpsons" failed because I incorrectly assumed that the character was named after a prominent cartoon character, Mickey Mouse. In the future, when attempting this question, I should focus on better researching the background of "The Simpsons" and Matt Groening's influences in naming characters to avoid making incorrect assumptions.
- """
-
- CHECK_FINSH = """
- You are an advanced reasoning agent; your job is to look at a conversation between a human and another agent and determine if the agent correctly answered the human's question. If the agent correctly answered the question, return True. If the agent did not correctly answer the question or you are unsure, return False.
- {chat}"""
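
For completeness, this is roughly how the removed reflexion.py above consumes CHECK_FINSH: the formatted chat transcript fills {chat}, and the model's reply is reduced to a boolean with a plain substring test (see prompt_finish). The transcript and the canned reply below are invented for illustration, and the model call itself is omitted:

```python
# Assumes CHECK_FINSH, the removed template just above, is in scope.
chat_str = "user: How many tires does a truck have?\nassistant: 18"
check_prompt = CHECK_FINSH.format(chat=chat_str)
# answer = model(check_prompt)  # hypothetical model call, omitted in this sketch
answer = "True, the agent answered the question correctly."
is_correct = "true" in answer.lower()  # mirrors the substring check in prompt_finish
print(is_correct)  # True
```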
@@ -1 +0,0 @@
- from .llm import LLM, AzureOpenAILLM, OpenAILLM