vision-agent 0.2.229__py3-none-any.whl → 0.2.230__py3-none-any.whl

@@ -0,0 +1,156 @@
1
+ Metadata-Version: 2.1
2
+ Name: vision-agent
3
+ Version: 0.2.230
4
+ Summary: Toolset for Vision Agent
5
+ Author: Landing AI
6
+ Author-email: dev@landing.ai
7
+ Requires-Python: >=3.9,<4.0
8
+ Classifier: Programming Language :: Python :: 3
9
+ Classifier: Programming Language :: Python :: 3.9
10
+ Classifier: Programming Language :: Python :: 3.10
11
+ Classifier: Programming Language :: Python :: 3.11
12
+ Requires-Dist: anthropic (>=0.31.0,<0.32.0)
13
+ Requires-Dist: av (>=11.0.0,<12.0.0)
14
+ Requires-Dist: e2b (>=0.17.2a50,<0.18.0)
15
+ Requires-Dist: e2b-code-interpreter (==0.0.11a37)
16
+ Requires-Dist: flake8 (>=7.0.0,<8.0.0)
17
+ Requires-Dist: ipykernel (>=6.29.4,<7.0.0)
18
+ Requires-Dist: langsmith (>=0.1.58,<0.2.0)
19
+ Requires-Dist: libcst (>=1.5.0,<2.0.0)
20
+ Requires-Dist: matplotlib (>=3.9.2,<4.0.0)
21
+ Requires-Dist: nbclient (>=0.10.0,<0.11.0)
22
+ Requires-Dist: nbformat (>=5.10.4,<6.0.0)
23
+ Requires-Dist: numpy (>=1.21.0,<2.0.0)
24
+ Requires-Dist: openai (>=1.0.0,<2.0.0)
25
+ Requires-Dist: opencv-python (>=4.0.0,<5.0.0)
26
+ Requires-Dist: opentelemetry-api (>=1.29.0,<2.0.0)
27
+ Requires-Dist: pandas (>=2.0.0,<3.0.0)
28
+ Requires-Dist: pillow (>=10.0.0,<11.0.0)
29
+ Requires-Dist: pillow-heif (>=0.16.0,<0.17.0)
30
+ Requires-Dist: pydantic (==2.7.4)
31
+ Requires-Dist: pydantic-settings (>=2.2.1,<3.0.0)
32
+ Requires-Dist: pytube (==15.0.0)
33
+ Requires-Dist: requests (>=2.0.0,<3.0.0)
34
+ Requires-Dist: rich (>=13.7.1,<14.0.0)
35
+ Requires-Dist: scikit-learn (>=1.5.2,<2.0.0)
36
+ Requires-Dist: scipy (>=1.13.0,<1.14.0)
37
+ Requires-Dist: tabulate (>=0.9.0,<0.10.0)
38
+ Requires-Dist: tenacity (>=8.3.0,<9.0.0)
39
+ Requires-Dist: tqdm (>=4.64.0,<5.0.0)
40
+ Requires-Dist: typing_extensions (>=4.0.0,<5.0.0)
41
+ Project-URL: Homepage, https://landing.ai
42
+ Project-URL: documentation, https://github.com/landing-ai/vision-agent
43
+ Project-URL: repository, https://github.com/landing-ai/vision-agent
44
+ Description-Content-Type: text/markdown
45
+
46
+ <div align="center">
47
+ <picture>
48
+ <source media="(prefers-color-scheme: dark)" srcset="https://github.com/landing-ai/vision-agent/blob/main/assets/logo_light.svg?raw=true">
49
+ <source media="(prefers-color-scheme: light)" srcset="https://github.com/landing-ai/vision-agent/blob/main/assets/logo_dark.svg?raw=true">
50
+ <img alt="VisionAgent" height="200px" src="https://github.com/landing-ai/vision-agent/blob/main/assets/logo_light.svg?raw=true">
51
+ </picture>
52
+
53
+ [![](https://dcbadge.vercel.app/api/server/wPdN8RCYew?compact=true&style=flat)](https://discord.gg/wPdN8RCYew)
54
+ ![ci_status](https://github.com/landing-ai/vision-agent/actions/workflows/ci_cd.yml/badge.svg)
55
+ [![PyPI version](https://badge.fury.io/py/vision-agent.svg)](https://badge.fury.io/py/vision-agent)
56
+ ![version](https://img.shields.io/pypi/pyversions/vision-agent)
57
+ </div>
58
+
59
+ ## VisionAgent
60
+ VisionAgent is a library that helps you use agent frameworks to generate code that
61
+ solves your vision task. Check out our Discord for updates and roadmaps! The fastest
62
+ way to try VisionAgent is our [web application](https://va.landing.ai/).
63
+
64
+ ## Installation
65
+ ```bash
66
+ pip install vision-agent
67
+ ```
68
+
69
+ ```bash
70
+ export ANTHROPIC_API_KEY="your-api-key"
71
+ export OPENAI_API_KEY="your-api-key"
72
+ ```
73
+
74
+ ---
75
+ **NOTE**
76
+ We found that using both Anthropic Claude-3.5 and OpenAI o1 provides the best performance
77
+ for VisionAgent. If you want to use a different LLM provider or only one, see
78
+ 'Using Other LLM Providers' below.
79
+ ---
80
+
81
+ ## Documentation
82
+
83
+ [VisionAgent Library Docs](https://landing-ai.github.io/vision-agent/)
84
+
85
+ ## Examples
86
+ ### Counting cans in an image
87
+ You can run VisionAgent in a local Jupyter Notebook: [Counting cans in an image](https://github.com/landing-ai/vision-agent/blob/main/examples/notebooks/counting_cans.ipynb)
88
+
89
+ ### Generating code
90
+ You can use VisionAgent to generate code to count the number of people in an image:
91
+ ```python
92
+ from vision_agent.agent import VisionAgentCoderV2
93
+ from vision_agent.agent.types import AgentMessage
94
+
95
+ agent = VisionAgentCoderV2(verbose=True)
96
+ code_context = agent.generate_code(
97
+     [
98
+         AgentMessage(
99
+             role="user",
100
+             content="Count the number of people in this image",
101
+             media=["people.png"]
102
+         )
103
+     ]
104
+ )
105
+
106
+ with open("generated_code.py", "w") as f:
107
+     f.write(code_context.code + "\n" + code_context.test)
108
+ ```
109
+
110
+ ### Using the tools directly
111
+ VisionAgent produces code that utilizes our tools. You can also use the tools directly.
112
+ For example, to detect people in an image and visualize the results:
113
+ ```python
114
+ import vision_agent.tools as T
115
+ import matplotlib.pyplot as plt
116
+
117
+ image = T.load_image("people.png")
118
+ dets = T.countgd_object_detection("person", image)
119
+ # visualize the countgd bounding boxes on the image
120
+ viz = T.overlay_bounding_boxes(image, dets)
121
+
122
+ # save the visualization to a file
123
+ T.save_image(viz, "people_detected.png")
124
+
125
+ # display the visualization
126
+ plt.imshow(viz)
127
+ plt.show()
128
+ ```
129
+
130
+ You can also use the tools for running on video files:
131
+ ```python
132
+ import vision_agent.tools as T
133
+
134
+ frames_and_ts = T.extract_frames_and_timestamps("people.mp4")
135
+ # extract the frames from the frames_and_ts list
136
+ frames = [f["frame"] for f in frames_and_ts]
137
+
138
+ # run the countgd tracking on the frames
139
+ tracks = T.countgd_sam2_video_tracking("person", frames)
140
+ # visualize the countgd tracking results on the frames and save the video
141
+ viz = T.overlay_segmentation_masks(frames, tracks)
142
+ T.save_video(viz, "people_detected.mp4")
143
+ ```
144
+
145
+ ## Using Other LLM Providers
146
+ You can use other LLM providers by changing `config.py` in the `vision_agent/configs`
147
+ directory. For example, to switch to Anthropic, run:
148
+ ```bash
149
+ cp vision_agent/configs/anthropic_config.py vision_agent/configs/config.py
150
+ ```
151
+
152
+ **NOTE**
153
+ VisionAgent moves fast and we are constantly updating and changing the library. If you
154
+ have any questions or need help, please reach out to us on our Discord channel.
155
+ ---
156
+
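Review note: the METADATA hunk above pins a few dependencies to exact versions (for example `pydantic (==2.7.4)`), which is worth checking when upgrading. A minimal sketch of how those pins could be listed, assuming the METADATA text is available as a string. The `exact_pins` helper and the abbreviated `SAMPLE` text are illustrative and not part of the package; the sketch relies on the file using email-style headers, so the standard library `email` parser can read it.

```python
from email import message_from_string

# Abbreviated copy of the METADATA headers shown in the hunk above (illustrative subset).
SAMPLE = """\
Metadata-Version: 2.1
Name: vision-agent
Version: 0.2.230
Requires-Python: >=3.9,<4.0
Requires-Dist: pydantic (==2.7.4)
Requires-Dist: pytube (==15.0.0)
Requires-Dist: numpy (>=1.21.0,<2.0.0)

## VisionAgent
"""


def exact_pins(metadata_text: str) -> dict:
    """Return {name: version} for every Requires-Dist entry pinned with a single '=='."""
    msg = message_from_string(metadata_text)
    pins = {}
    for req in msg.get_all("Requires-Dist", []):
        # Entries look like "name (specifier)"; split on the first space.
        name, _, spec = req.partition(" ")
        spec = spec.strip("()")
        if spec.startswith("==") and "," not in spec:
            pins[name] = spec[2:]
    return pins


if __name__ == "__main__":
    print(exact_pins(SAMPLE))
```

Ranged requirements such as `numpy (>=1.21.0,<2.0.0)` are skipped on purpose; only single `==` specifiers are reported.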
@@ -1,37 +1,42 @@
1
- vision_agent/.sim_tools/df.csv,sha256=Vamicw8MiSGildK1r3-HXY4cKiq17GZxsgBsHbk7jpM,42158
1
+ vision_agent/.sim_tools/df.csv,sha256=XdcgkjC7CjF_CoJnXmFkYOPUBwHemiwsauh62b1eh1M,42472
2
2
  vision_agent/.sim_tools/embs.npy,sha256=YJe8EcKVNmeX_75CS2T1sbY-sUS_1HQAMT-34zc18a0,254080
3
3
  vision_agent/__init__.py,sha256=EAb4-f9iyuEYkBrX4ag1syM8Syx8118_t0R6_C34M9w,57
4
4
  vision_agent/agent/README.md,sha256=Q4w7FWw38qaWosQYAZ7NqWx8Q5XzuWrlv7nLhjUd1-8,5527
5
5
  vision_agent/agent/__init__.py,sha256=M8CffavdIh8Zh-skznLHIaQkYGCGK7vk4dq1FaVkbs4,617
6
6
  vision_agent/agent/agent.py,sha256=_1tHWAs7Jm5tqDzEcPfCRvJV3uRRveyh4n9_9pd6I1w,1565
7
- vision_agent/agent/agent_utils.py,sha256=pP4u5tiami7C3ChgjgYLqJITnmkTI1_GsUj6g5czSRk,13994
7
+ vision_agent/agent/agent_utils.py,sha256=IXxN9XruaeNTreUrdztb3kWJhimpsdH6hjv6xT4jg1Q,14062
8
8
  vision_agent/agent/types.py,sha256=DkFm3VMMrKlhYyfxEmZx4keppD72Ov3wmLCbM2J2o10,2437
9
- vision_agent/agent/vision_agent.py,sha256=I75bEU-os9Lf9OSICKfvQ_H_ftg-zOwgTwWnu41oIdo,23555
9
+ vision_agent/agent/vision_agent.py,sha256=fH9NOLk7twL1fPr9vLSqkaYhah-gfDWfTOVF2FfMyzI,23461
10
10
  vision_agent/agent/vision_agent_coder.py,sha256=flUxOibyGZK19BCSK5mhaD3HjCxHw6c6FtKom6N2q1E,27359
11
- vision_agent/agent/vision_agent_coder_prompts.py,sha256=gPLVXQMNSzYnQYpNm0wlH_5FPkOTaFDV24bqzK3jQ40,12221
11
+ vision_agent/agent/vision_agent_coder_prompts.py,sha256=_kkPLezUVnBXieNPlxMQab_6J6P7F-aa6ItF5NhZZsM,12281
12
12
  vision_agent/agent/vision_agent_coder_prompts_v2.py,sha256=idmSMfxebPULqqvllz3gqRzGDchEvS5dkGngvBs4PGo,4872
13
- vision_agent/agent/vision_agent_coder_v2.py,sha256=i1qgXp5YsWVRoA_qO429Ef-aKZBakveCl1F_2ZbSzk8,16287
13
+ vision_agent/agent/vision_agent_coder_v2.py,sha256=ZR2PQoMqNM6yK3vn_0rrCJf_EplRKye7t7bVjyl51ls,16476
14
14
  vision_agent/agent/vision_agent_planner.py,sha256=fFzjNkZBKkh8Y_oS06ATI4qz31xmIJvixb_tV1kX8KA,18590
15
- vision_agent/agent/vision_agent_planner_prompts.py,sha256=mn9NlZpRkW4XAvlNuMZwIs1ieHCFds5aYZJ55WXupZY,6733
16
- vision_agent/agent/vision_agent_planner_prompts_v2.py,sha256=YgemW2PRPYd8o8XpmwSJBUOJSxMUXMNr2DZNQnS4jEI,34988
17
- vision_agent/agent/vision_agent_planner_v2.py,sha256=vvxfmGydBIKB8CtNSAJyPvdEXkG7nIO5-Hs2SjNc48Y,20465
18
- vision_agent/agent/vision_agent_prompts.py,sha256=NtGdCfzzilCRtscKALC9FK55d1h4CBpMnbhLzg0PYlc,13772
19
- vision_agent/agent/vision_agent_prompts_v2.py,sha256=-vCWat-ARlCOOOeIDIFhg-kcwRRwjTXYEwsvvqPeaCs,1972
20
- vision_agent/agent/vision_agent_v2.py,sha256=1wu_vH_onic2kLYPKW2nAF2e6Zz5vmUt5Acv4Seq3sQ,10796
15
+ vision_agent/agent/vision_agent_planner_prompts.py,sha256=rYRdJthc-sQN57VgCBKrF09Sd73BSxcBdjNe6C4WNZ8,6837
16
+ vision_agent/agent/vision_agent_planner_prompts_v2.py,sha256=5xTx93lNpoyT4eAD9jicwDyDAkuW7eQqicr17zCjrQw,33337
17
+ vision_agent/agent/vision_agent_planner_v2.py,sha256=Vbfe_QrhHVViFoTYk8UTkXiZ6cptAiMexQClmRucQeA,20482
18
+ vision_agent/agent/vision_agent_prompts.py,sha256=KaJwYPUP7_GvQsCPPs6Fdawmi3AQWmWajBUuzj7gTG4,13812
19
+ vision_agent/agent/vision_agent_prompts_v2.py,sha256=AW_bW1boGiCLyLFd3h4GQenfDACttQagDHwpBkSW4Xo,2518
20
+ vision_agent/agent/vision_agent_v2.py,sha256=335VT0hk0jkB14y4W3cJo5ueEu1wY_jjN-R_m2xaQ30,10752
21
21
  vision_agent/clients/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
22
22
  vision_agent/clients/http.py,sha256=k883i6M_4nl7zwwHSI-yP5sAgQZIDPM1nrKD6YFJ3Xs,2009
23
23
  vision_agent/clients/landing_public_api.py,sha256=lU2ev6E8NICmR8DMUljuGcVFy5VNJQ4WQkWC8WnnJEc,1503
24
+ vision_agent/configs/__init__.py,sha256=Iu75-w9_nlPmnB_qKA7nYaaaHf7xtTrDmK8N4v2WV34,27
25
+ vision_agent/configs/anthropic_config.py,sha256=qYpl03wcM6wYcI24rl-9Y8Cyt8STbtUYR0IR4e5YFsU,4298
26
+ vision_agent/configs/anthropic_openai_config.py,sha256=YQjFxmlxppn5L55dJjK_v1myBJQ_V5J4q25pmUtwTOU,4310
27
+ vision_agent/configs/config.py,sha256=YQjFxmlxppn5L55dJjK_v1myBJQ_V5J4q25pmUtwTOU,4310
28
+ vision_agent/configs/openai_config.py,sha256=hCn-e8rRzg4I3OQV3zG0NTWVSULg-_9KAKx1D1IdKOU,4498
24
29
  vision_agent/fonts/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
25
30
  vision_agent/fonts/default_font_ch_en.ttf,sha256=1YM0Z3XqLDjSNbF7ihQFSAIUdjF9m1rtHiNC_6QosTE,1594400
26
- vision_agent/lmm/__init__.py,sha256=jyY1sJb_tYKg5-Wzs3p1lvwFkc-aUNZfMcLy3TOC4Zg,100
27
- vision_agent/lmm/lmm.py,sha256=x_nIyDNDZwq4-pfjnJTmcyyJZ2_B7TjkA5jZp88YVO8,17103
31
+ vision_agent/lmm/__init__.py,sha256=xk2Rn8Zgpy2xwYaOGHzy4tXxnxo2aj6SkpNjeJ8yxcY,111
32
+ vision_agent/lmm/lmm.py,sha256=arwfYPWme_RxCxSpEQ0ZkpHO22GFPCwVeoSvXqLPOAk,19288
28
33
  vision_agent/lmm/types.py,sha256=ZEXR_ptBL0ZwDMTDYkgxUCmSZFmBYPQd2jreNzr_8UY,221
29
34
  vision_agent/tools/__init__.py,sha256=8VpAC8zEk8OwcMLcTn7gEAfw6ihqlsEfzjEaW5yd5-4,2897
30
35
  vision_agent/tools/meta_tools.py,sha256=TPeS7QWnc_PmmU_ndiDT03dXbQ5yDSP33E7U8cSj7Ls,28660
31
- vision_agent/tools/planner_tools.py,sha256=qQvPuCif-KbFi7KsXKkTCfpgEQEJJ6oq6WB3gOuG2Xg,13686
36
+ vision_agent/tools/planner_tools.py,sha256=VL9bv7i2FS0v3wMi9eoqgWczoG8vJ-MW5de6JXnbcdA,14354
32
37
  vision_agent/tools/prompts.py,sha256=V1z4YJLXZuUl_iZ5rY0M5hHc_2tmMEUKr0WocXKGt4E,1430
33
38
  vision_agent/tools/tool_utils.py,sha256=kXB0F-HwmiChpQgKk7tMo-Acsl3UXxjaJV9mYo_q6n4,10076
34
- vision_agent/tools/tools.py,sha256=M_kk17Yr5c6ODKet26GcxZAlGDwl0AwMMD4wCrBhR6Y,105157
39
+ vision_agent/tools/tools.py,sha256=FgoWy2rHxo7z0Gj3gq7-seE7I5ss4i9qNiTuyCJZg-4,105471
35
40
  vision_agent/tools/tools_types.py,sha256=8hYf2OZhI58gvf65KGaeGkt4EQ56nwLFqIQDPHioOBc,2339
36
41
  vision_agent/utils/__init__.py,sha256=QKk4zVjMwGxQI0MQ-aZZA50N-qItxRY4EB9CwQkZ2HY,185
37
42
  vision_agent/utils/exceptions.py,sha256=booSPSuoULF7OXRr_YbC4dtKt6gM_HyiFQHBuaW86C4,2052
@@ -41,7 +46,7 @@ vision_agent/utils/sim.py,sha256=qr-6UWAxxGwtwIAKZjZCY_pu9VwBI_TTB8bfrGsaABg,928
41
46
  vision_agent/utils/type_defs.py,sha256=BE12s3JNQy36QvauXHjwyeffVh5enfcvd4vTzSwvEZI,1384
42
47
  vision_agent/utils/video.py,sha256=e1VwKhXzzlC5LcFMyrcQYrPnpnX4wxDpnQ-76sB4jgM,6001
43
48
  vision_agent/utils/video_tracking.py,sha256=wK5dOutqV2t2aeaxedstCBa7xy-NNQE0-QZqKu1QUds,9498
44
- vision_agent-0.2.229.dist-info/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
45
- vision_agent-0.2.229.dist-info/METADATA,sha256=ver5sB_NI_dkek1GxY9GsvktACS1Rl6-tgrr_B5p1Zc,20039
46
- vision_agent-0.2.229.dist-info/WHEEL,sha256=7Z8_27uaHI_UZAc4Uox4PpBhQ9Y5_modZXWMxtUi4NU,88
47
- vision_agent-0.2.229.dist-info/RECORD,,
49
+ vision_agent-0.2.230.dist-info/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
50
+ vision_agent-0.2.230.dist-info/METADATA,sha256=RU0v0XPXdyFA43SLZjk4sKgXB_peosxwCodQ1F9p4wA,5762
51
+ vision_agent-0.2.230.dist-info/WHEEL,sha256=7Z8_27uaHI_UZAc4Uox4PpBhQ9Y5_modZXWMxtUi4NU,88
52
+ vision_agent-0.2.230.dist-info/RECORD,,
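
Review note: each RECORD row above has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. A minimal sketch of how two RECORD files could be compared to list the changed paths, as this hunk does. The helper names `record_digest`, `parse_record`, and `changed_paths` are hypothetical and not part of vision-agent or pip.

```python
import base64
import csv
import hashlib
import io


def record_digest(data: bytes) -> str:
    """Return the RECORD-style hash: URL-safe base64 of SHA-256, '=' padding stripped."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_record(text: str) -> dict:
    """Map each path in a RECORD file to its (hash, size) pair."""
    entries = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        path, hash_, size = row
        # The RECORD file lists itself with an empty hash and size.
        entries[path] = (hash_, int(size) if size else None)
    return entries


def changed_paths(old: dict, new: dict) -> list:
    """Paths whose entry differs, or that appear in only one of the two RECORDs."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


if __name__ == "__main__":
    # A zero-byte file hashes to the value shown for the empty __init__.py entries above.
    print(record_digest(b""))
```

The digest of empty input matches the `sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0` rows listed for `vision_agent/clients/__init__.py` and `vision_agent/fonts/__init__.py`, which is a quick sanity check for the encoding.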