crewplus-0.2.89-py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- crewplus/__init__.py +10 -0
- crewplus/callbacks/__init__.py +1 -0
- crewplus/callbacks/async_langfuse_handler.py +166 -0
- crewplus/services/__init__.py +21 -0
- crewplus/services/azure_chat_model.py +145 -0
- crewplus/services/feedback.md +55 -0
- crewplus/services/feedback_manager.py +267 -0
- crewplus/services/gemini_chat_model.py +884 -0
- crewplus/services/init_services.py +57 -0
- crewplus/services/model_load_balancer.py +264 -0
- crewplus/services/schemas/feedback.py +61 -0
- crewplus/services/tracing_manager.py +182 -0
- crewplus/utils/__init__.py +4 -0
- crewplus/utils/schema_action.py +7 -0
- crewplus/utils/schema_document_updater.py +173 -0
- crewplus/utils/tracing_util.py +55 -0
- crewplus/vectorstores/milvus/__init__.py +5 -0
- crewplus/vectorstores/milvus/milvus_schema_manager.py +270 -0
- crewplus/vectorstores/milvus/schema_milvus.py +586 -0
- crewplus/vectorstores/milvus/vdb_service.py +917 -0
- crewplus-0.2.89.dist-info/METADATA +144 -0
- crewplus-0.2.89.dist-info/RECORD +29 -0
- crewplus-0.2.89.dist-info/WHEEL +4 -0
- crewplus-0.2.89.dist-info/entry_points.txt +4 -0
- crewplus-0.2.89.dist-info/licenses/LICENSE +21 -0
- docs/GeminiChatModel.md +247 -0
- docs/ModelLoadBalancer.md +134 -0
- docs/VDBService.md +238 -0
- docs/index.md +23 -0
crewplus-0.2.89.dist-info/METADATA
ADDED
@@ -0,0 +1,144 @@
Metadata-Version: 2.1
Name: crewplus
Version: 0.2.89
Summary: Base services for CrewPlus AI applications
Author-Email: Tim Liu <tim@opsmateai.com>
License: MIT
Project-URL: Homepage, https://github.com/your-org/crewplus-base
Project-URL: Documentation, https://crewplus.readthedocs.io
Project-URL: Repository, https://github.com/your-org/crewplus-base
Project-URL: Issues, https://github.com/your-org/crewplus-base/issues
Requires-Python: <4.0,>=3.11
Requires-Dist: langchain<1.0.0,>=0.3.25
Requires-Dist: langchain-openai>=0.3.24
Requires-Dist: google-genai>=1.21.1
Requires-Dist: langchain-milvus<0.3.0,>=0.2.1
Requires-Dist: langfuse<4.0.0,>=3.1.3
Description-Content-Type: text/markdown

# CrewPlus

[PyPI version](https://badge.fury.io/py/crewplus)
[License: MIT](https://opensource.org/licenses/MIT)
[Python versions](https://pypi.org/project/crewplus)
[Build status](https://travis-ci.com/your-org/crewplus-base)

**CrewPlus** provides the foundational services and core components for building advanced AI applications. It is the heart of the CrewPlus ecosystem, designed for scalability, extensibility, and seamless integration.

## Overview

This repository, `crewplus-base`, contains the core `crewplus` Python package. It includes essential building blocks for interacting with large language models, managing vector databases, and handling application configuration. Whether you are building a simple chatbot or a complex multi-agent system, CrewPlus offers the robust foundation you need.

## The CrewPlus Ecosystem

CrewPlus is designed as a modular and extensible ecosystem of packages. This allows you to adopt only the components you need for your specific use case.
- **`crewplus` (this package):** The core package containing foundational services for chat, model load balancing, and vector stores.
- **`crewplus-agent`:** The agent core: an agentic task planner and executor with context-aware memory.
- **`crewplus-ingestion`:** Provides robust pipelines for knowledge ingestion and data processing.
- **`crewplus-memory`:** Provides agent memory services for CrewPlus AI agents.
- **`crewplus-integrations`:** A collection of third-party integrations to connect CrewPlus with other services and platforms.

## Features

- **Chat Services:** A unified interface for interacting with various chat models (e.g., `GeminiChatModel`, `TracedAzureChatOpenAI`); see the sketch after this list.
- **Model Load Balancer:** Intelligently distributes requests across multiple LLM endpoints.
- **Vector DB Services:** Support for popular vector stores (e.g., Milvus, Zilliz Cloud) to power retrieval-augmented generation (RAG) and agent memory.
- **Observability & Tracing:** Automatic integration with tracing tools like Langfuse, with an extensible design for adding others (e.g., Helicone).
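
Both chat services implement LangChain's `BaseChatModel`, so call sites can stay model-agnostic. A minimal sketch, assuming `TracedAzureChatOpenAI` is importable from `crewplus.services` and accepts standard Azure OpenAI parameters (the commented-out arguments are illustrative, not confirmed by this README):

```python
from crewplus.services import GeminiChatModel, TracedAzureChatOpenAI

# Either model can sit behind the same variable, since both are LangChain chat models.
llm = GeminiChatModel(google_api_key="your-google-api-key")
# llm = TracedAzureChatOpenAI(azure_deployment="gpt-4.1")  # hypothetical parameters

print(llm.invoke("Ping?").content)
```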

## Documentation

For detailed guides and API references, please see the `docs/` folder.

- **[GeminiChatModel Documentation](./docs/GeminiChatModel.md)**: A comprehensive guide to using the `GeminiChatModel` for text, image, and video understanding.
- **[ModelLoadBalancer Documentation](./docs/ModelLoadBalancer.md)**: A guide to managing and retrieving models with the `ModelLoadBalancer`.
- **[VDBService Documentation](./docs/VDBService.md)**: A guide to the Milvus-backed vector database service.

## Installation

To install the core `crewplus` package, run the following command:

```bash
pip install crewplus
```

## Getting Started

Here is a simple example of how to use the `GeminiChatModel` to start a conversation with an AI model.

```python
# main.py
from crewplus.services import GeminiChatModel

# Initialize the model (API keys are typically handled by the configuration module)
llm = GeminiChatModel(google_api_key="your-google-api-key")

# Start a conversation; GeminiChatModel is a LangChain chat model, so use invoke()
response = llm.invoke("Hello, what is CrewPlus?")

print(response.content)
```

## Project Structure

The `crewplus-base` repository is organized to separate core logic, tests, and documentation.

```
crewplus-base/                  # GitHub repo name
├── pyproject.toml
├── README.md
├── LICENSE
├── CHANGELOG.md
├── crewplus/                   # PyPI package name
│   ├── __init__.py
│   ├── services/
│   │   ├── __init__.py
│   │   ├── gemini_chat_model.py
│   │   ├── azure_chat_model.py
│   │   ├── model_load_balancer.py
│   │   ├── tracing_manager.py
│   │   └── ...
│   ├── vectorstores/milvus/
│   │   ├── __init__.py
│   │   ├── schema_milvus.py
│   │   └── vdb_service.py
│   └── utils/
│       ├── __init__.py
│       ├── schema_action.py
│       └── ...
├── tests/
│   └── ...
├── docs/
│   └── ...
└── notebooks/
    └── ...
```
## Version Update

- **0.2.50**: Added an async `aget_vector_store` method to enable async vector search (see the sketch after this list).
- **0.2.80**: Added `FeedbackManager` to support LangSmith-style feedback backed by Langfuse scores.
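
A minimal sketch of the async search path that `aget_vector_store` enables. The `VDBService` constructor arguments and exact method signatures are assumptions based on the changelog entry, not a confirmed API:

```python
import asyncio

from crewplus.vectorstores.milvus.vdb_service import VDBService

async def search(query: str):
    service = VDBService()  # hypothetical: real constructor arguments may differ
    store = await service.aget_vector_store()  # async accessor added in 0.2.50
    # asimilarity_search is the standard LangChain async vector-store search
    return await store.asimilarity_search(query, k=3)

# asyncio.run(search("What is CrewPlus?"))
```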

## Deploy to PyPI

### Clean previous build artifacts

Remove the `dist/`, `build/`, and `*.egg-info/` directories to ensure that no old files are included in the new build.

```bash
rm -rf dist build *.egg-info
```

### Install the deployment tools

```bash
pip install build twine
```

### Build the package

```bash
python -m build
```

### Deploy to TestPyPI (test first)

```bash
python -m twine upload --repository testpypi dist/*
```

### Install from TestPyPI

```bash
pip install -i https://test.pypi.org/simple/ crewplus
```

### Deploy to official PyPI

```bash
python -m twine upload dist/*
```

crewplus-0.2.89.dist-info/RECORD
ADDED
@@ -0,0 +1,29 @@
crewplus-0.2.89.dist-info/METADATA,sha256=GxRIVQAVJF7Y8kHTa0leU5FA5imAcdPhKNYjH9hj1fU,5471
crewplus-0.2.89.dist-info/WHEEL,sha256=tsUv_t7BDeJeRHaSrczbGeuK-TtDpGsWi_JfpzD255I,90
crewplus-0.2.89.dist-info/entry_points.txt,sha256=6OYgBcLyFCUgeqLgnvMyOJxPCWzgy7se4rLPKtNonMs,34
crewplus-0.2.89.dist-info/licenses/LICENSE,sha256=2_NHSHRTKB_cTcT_GXgcenOCtIZku8j343mOgAguTfc,1087
crewplus/__init__.py,sha256=m46HkZL1Y4toD619NL47Sn2Qe084WFFSFD7e6VoYKZc,284
crewplus/callbacks/__init__.py,sha256=YG7ieeb91qEjp1zF0-inEN7mjZ7yT_D2yzdWFT8Z1Ws,63
crewplus/callbacks/async_langfuse_handler.py,sha256=8_p7ctgcmDNQgF5vOqA47I0x-3GWsm7zioZcZHgedZk,7163
crewplus/services/__init__.py,sha256=o6qtskHvwtAmdb4qiPNENW1ivlOgwLkPBhTVv-qeJ4s,702
crewplus/services/azure_chat_model.py,sha256=5ZQBr4Vvb518X_768EV2Ax4YnzVqTGLEdugOysLxL8k,6323
crewplus/services/feedback.md,sha256=4hMdSzQgG6qnprV0FkiyogSfMFj5eMwX4PoxN_xD9tU,3821
crewplus/services/feedback_manager.py,sha256=vpleLxGa9f0dQbBXqwQfT9SWu8k0BJO40DN5coGdR_k,11652
crewplus/services/gemini_chat_model.py,sha256=DYqz01H2TIHiCDQesSozVfOsMigno6QGwOtIweg7UHk,40103
crewplus/services/init_services.py,sha256=tc1ti8Yufo2ixlJpwg8uH0KmoyQ4EqxCOe4uTEWnlRM,2413
crewplus/services/model_load_balancer.py,sha256=HIx-k-FiizJSF4e88SFxfFVNS93vJR2zrOdU_fg26FU,12826
crewplus/services/schemas/feedback.py,sha256=M2AIwzW2MWtAbly2xfEFxDcGVk8kMFzZ7jT9swTmbHc,2787
crewplus/services/tracing_manager.py,sha256=pwNFeA77vnoZMh_AUOnK5TvAaPOOLg5oDnVOe1yUa9A,8502
crewplus/utils/__init__.py,sha256=2Gk1n5srFJQnFfBuYTxktdtKOVZyNrFcNaZKhXk35Pw,142
crewplus/utils/schema_action.py,sha256=GDaBoVFQD1rXqrLVSMTfXYW1xcUu7eDcHsn57XBSnIg,422
crewplus/utils/schema_document_updater.py,sha256=frvffxn2vbi71fHFPoGb9hq7gH2azmmdq17p-Fumnvg,7322
crewplus/utils/tracing_util.py,sha256=ew5VwjTKcY88P2sveIlGqmsNFR5OJ-DjKAHKQzBoTyE,2449
crewplus/vectorstores/milvus/__init__.py,sha256=OeYv2rdyG7tcREIjBJPyt2TbE54NvyeRoWMe7LwopRE,245
crewplus/vectorstores/milvus/milvus_schema_manager.py,sha256=-QRav-hzu-XWeJ_yKUMolal_EyMUspSg-nvh5sqlrlQ,11442
crewplus/vectorstores/milvus/schema_milvus.py,sha256=wwNpfqsKS0xeozZES40IvB0iNwUtpCall_7Hkg0dL1g,27223
crewplus/vectorstores/milvus/vdb_service.py,sha256=_jtJLEtURMYhKy_d7Hb6WoUiH_B1L2IbLC5TtGBZrzk,44270
docs/GeminiChatModel.md,sha256=zZYyl6RmjZTUsKxxMiC9O4yV70MC4TD-IGUmWhIDBKA,8677
docs/ModelLoadBalancer.md,sha256=aGHES1dcXPz4c7Y8kB5-vsCNJjriH2SWmjBkSGoYKiI,4398
docs/VDBService.md,sha256=Dw286Rrf_fsi13jyD3Bo4Sy7nZ_G7tYm7d8MZ2j9hxk,9375
docs/index.md,sha256=3tlc15uR8lzFNM5WjdoZLw0Y9o1P1gwgbEnOdIBspqc,1643
crewplus-0.2.89.dist-info/RECORD,,

crewplus-0.2.89.dist-info/licenses/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Opsmate AI, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

docs/GeminiChatModel.md
ADDED
@@ -0,0 +1,247 @@
# GeminiChatModel Documentation

## 1. Introduction

The `GeminiChatModel` is a custom LangChain-compatible chat model that provides a robust interface to Google's Gemini Pro and Flash models. It is designed to handle multimodal inputs, including text, images, and videos, making it a versatile tool for building advanced AI applications.

### Key Features:
- **LangChain Compatibility**: Seamlessly integrates into the LangChain ecosystem as a `BaseChatModel`.
- **Multimodal Support**: Natively processes text, images (from URLs, local paths, or base64), and videos (from local paths, Google Cloud URIs, or raw bytes).
- **Streaming**: Supports streaming for both standard and multimodal responses.
- **Advanced Configuration**: Allows fine-tuning of generation parameters like temperature, top-p, top-k, and max tokens (see the sketch at the end of Section 3).
- **Video Segment Analysis**: Can process specific time ranges within a video using start and end offsets.

## 2. Installation

To use the `GeminiChatModel`, you need to install the `crewplus` package:

```bash
pip install crewplus
```

If you are working within the project repository, you can instead install it in editable mode with `pip install -e .`.

## 3. Initialization

First, ensure you have set your Google API key as an environment variable:

```bash
# For Linux/macOS
export GOOGLE_API_KEY="YOUR_API_KEY"

# For Windows PowerShell
$env:GOOGLE_API_KEY = "YOUR_API_KEY"
```

Then, you can import and initialize the model in your Python code.

```python
import logging
from crewplus.services import GeminiChatModel
from langchain_core.messages import HumanMessage

# Optional: Configure a logger for detailed output
logging.basicConfig(level=logging.INFO)
test_logger = logging.getLogger(__name__)

# Initialize the model
# You can also pass the google_api_key directly as a parameter
model = GeminiChatModel(
    model_name="gemini-2.5-flash",  # Or "gemini-2.5-pro"
    logger=test_logger,
    temperature=0.0,
)
```
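
Beyond `temperature`, the Key Features section mentions top-p, top-k, and max-token controls. A minimal sketch; the keyword names below (`top_p`, `top_k`, `max_output_tokens`) are assumptions based on the underlying Gemini generation config and may differ in this wrapper:

```python
# Hypothetical parameter names mirroring the Gemini generation config.
tuned_model = GeminiChatModel(
    model_name="gemini-2.5-flash",
    temperature=0.7,         # sampling temperature
    top_p=0.95,              # nucleus sampling cutoff
    top_k=40,                # top-k sampling cutoff
    max_output_tokens=1024,  # cap on response length
)
```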

## 4. Basic Usage (Text-only)

The model can be used for simple text-based conversations using `.invoke()` or `.stream()`.

```python
# Using invoke for a single response
response = model.invoke("Hello, how are you?")
print(response.content)

# Using stream for a chunked response
print("\n--- Streaming Response ---")
for chunk in model.stream("Tell me a short story about a brave robot."):
    print(chunk.content, end="", flush=True)

# Using astream for an asynchronous chunked response
import asyncio

async def main():
    print("\n--- Async Streaming Response ---")
    async for chunk in model.astream("Tell me a short story about a brave robot."):
        print(chunk.content, end="", flush=True)

# To run the async function in a Jupyter Notebook or a script:
# await main()
# Or, if not in an async context:
# asyncio.run(main())
```

## 5. Image Understanding

`GeminiChatModel` can understand images provided via a URL or as base64 encoded data.

### Example 1: Image from a URL

You can provide a direct URL to an image.

```python
from langchain_core.messages import HumanMessage

url_message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image:"},
        {
            "type": "image_url",
            "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
        },
    ]
)
url_response = model.invoke([url_message])
print("Image response (URL):", url_response.content)
```

> **Sample Output:**
> The image shows a wooden boardwalk stretching into the distance through a field of tall, green grass... The overall impression is one of tranquility and natural beauty.

### Example 2: Local Image (Base64)

You can also send a local image file by encoding it in base64.

```python
import base64
from langchain_core.messages import HumanMessage

image_path = "./notebooks/test_image_202506191.jpg"
try:
    with open(image_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

    image_message = HumanMessage(
        content=[
            {"type": "text", "text": "Describe this photo and its background story."},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{encoded_string}"
                }
            },
        ]
    )
    image_response = model.invoke([image_message])
    print("Image response (base64):", image_response.content)
except FileNotFoundError:
    print(f"Image file not found at {image_path}, skipping base64 example.")
```

### Example 3: Streaming a Multimodal Response

Streaming also works with complex, multimodal inputs. This is useful for getting faster time-to-first-token while the model processes all the data.

```python
# The url_message is from the previous example
print("\n--- Streaming Multimodal Response ---")
for chunk in model.stream([url_message]):
    print(chunk.content, end="", flush=True)
```

## 6. Video Understanding

The model supports video analysis from uploaded files, URIs, and raw bytes.

**Important Note:** The Gemini API does **not** support common public video URLs (e.g., YouTube, Loom, or public MP4 links). Videos must be uploaded to Google's servers first to get a processable URI.

### Example 1: Large Video File (>20MB)

For large videos, you must first upload the file using the `google-genai` client to get a file object.

```python
from google import genai
import os
from langchain_core.messages import HumanMessage

# Initialize the Google GenAI client
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

# Upload the video file
video_path = "./notebooks/manufacturing_process_tutorial.mp4"
print("Uploading video... this may take a moment.")
video_file_obj = client.files.upload(file=video_path)
print(f"Video uploaded successfully. File name: {video_file_obj.name}")
# Note: uploaded videos are processed server-side; see the polling sketch below.

# Use the uploaded file object in the prompt
video_message = HumanMessage(
    content=[
        {"type": "text", "text": "Summarize this video and provide timestamps for key events."},
        {"type": "video_file", "file": video_file_obj},
    ]
)
video_response = model.invoke([video_message])
print("Video response:", video_response.content)
```

> **Sample Output:**
> This video provides a step-by-step guide on how to correct a mis-set sidewall during tire manufacturing...
> **Timestamps:**
> * **0:04:** Applying product package to some material
> * **0:12:** Splice product Together and Prepare some material
> ...
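
Uploaded videos are processed asynchronously on Google's side, so a file may not be immediately usable after `upload` returns. A hedged sketch of waiting for processing to finish with the `google-genai` client; the state names are assumptions based on that SDK and may vary by version:

```python
import time

# Poll the uploaded file until server-side processing completes.
while video_file_obj.state.name == "PROCESSING":
    time.sleep(2)
    video_file_obj = client.files.get(name=video_file_obj.name)

if video_file_obj.state.name == "FAILED":
    raise RuntimeError("Video processing failed.")
```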

### Example 2: Video with Time Offsets

You can analyze just a specific portion of a video by providing a `start_offset` and `end_offset`. This works with video URIs obtained after uploading.

```python
# Assuming 'video_file_obj' is available from the previous step
video_uri = video_file_obj.uri

offset_message = HumanMessage(
    content=[
        {"type": "text", "text": "Transcribe the events in this video segment."},
        {
            "type": "video_file",
            "url": video_uri,
            "start_offset": "5s",
            "end_offset": "30s"
        }
    ]
)

print("Streaming response for video segment:")
for chunk in model.stream([offset_message]):
    print(chunk.content, end="", flush=True)
```

> **Sample Output:**
> This video demonstrates the process of applying Component A/Component B material to an assembly drum in a manufacturing setting...
> **Transcription:**
> **0:05 - 0:12:** A worker is shown applying a material...
> **0:12 - 0:23:** The worker continues to prepare the material on the drum...

### Example 3: Small Video File (<20MB)

For small videos, you can pass the raw bytes directly without a separate upload step.

```python
from langchain_core.messages import HumanMessage

try:
    with open("./notebooks/product_demo_v1.mp4", "rb") as video_file:
        video_bytes = video_file.read()

    video_message = HumanMessage(
        content=[
            {"type": "text", "text": "What is happening in this video?"},
            {
                "type": "video_file",
                "data": video_bytes,
                "mime_type": "video/mp4"  # MIME type is required for raw data
            },
        ]
    )
    video_response = model.invoke([video_message])
    print("Video response (bytes):", video_response.content)
except FileNotFoundError:
    print("Video file not found.")
except Exception as e:
    print(f"Video processing with bytes failed: {e}")
```

docs/ModelLoadBalancer.md
ADDED
@@ -0,0 +1,134 @@
# ModelLoadBalancer Documentation

## 1. Introduction

The `ModelLoadBalancer` is a utility class designed to manage and provide access to various language models from different providers, such as Azure OpenAI and Google GenAI. It loads model configurations from a JSON file and allows you to retrieve specific models by their deployment name or a combination of provider and type.

### Key Features:
- **Centralized Model Management**: Manage all your model configurations in a single JSON file.
- **On-demand Model Loading**: Models are instantiated and loaded when requested.
- **Provider Agnostic**: Supports multiple model providers.
- **Flexible Retrieval**: Get models by a unique deployment name.

## 2. Initialization

To use the `ModelLoadBalancer`, you need to initialize it with the path to your model configuration file.

```python
from crewplus.services.model_load_balancer import ModelLoadBalancer

# Initialize the balancer with the path to your config file
config_path = "tests/models_config.json"  # Adjust the path as needed
balancer = ModelLoadBalancer(config_path=config_path)

# Load the configurations and instantiate the models
balancer.load_config()
```

## 3. Configuration File

The `ModelLoadBalancer` uses a JSON file to configure the available models. Here is an example of what the configuration file looks like. The `deployment_name` is used to retrieve a specific model.

```json
{
    "models": [
        {
            "id": 3,
            "provider": "azure-openai",
            "type": "inference",
            "deployment_name": "gpt-4.1",
            "api_version": "2025-01-01-preview",
            "api_base": "https://crewplus-eastus2.openai.azure.com",
            "api_key": "your-api-key"
        },
        {
            "id": 7,
            "provider": "google-genai",
            "type": "inference",
            "deployment_name": "gemini-2.5-flash",
            "api_key": "your-google-api-key"
        },
        {
            "id": 8,
            "provider": "google-genai",
            "type": "ingestion",
            "deployment_name": "gemini-2.5-pro",
            "api_key": "your-google-api-key"
        }
    ]
}
```

## 4. Getting a Model

You can retrieve a model instance using the `get_model` method and passing the `deployment_name`. (A sketch of retrieval by provider and type follows the examples below.)

### Get `gemini-2.5-flash`
```python
gemini_flash_model = balancer.get_model(deployment_name="gemini-2.5-flash")

# Now you can use the model
# from langchain_core.messages import HumanMessage
# response = gemini_flash_model.invoke([HumanMessage(content="Hello!")])
# print(response.content)
```

### Get `gemini-2.5-pro`
```python
gemini_pro_model = balancer.get_model(deployment_name="gemini-2.5-pro")
```

### Get `gpt-4.1`
```python
gpt41_model = balancer.get_model(deployment_name="gpt-4.1")
```

### Get `o3mini`
The `o3mini` model is identified by the deployment name `gpt-o3mini-eastus2-RPM25` (not shown in the sample configuration above).
```python
o3mini_model = balancer.get_model(deployment_name="gpt-o3mini-eastus2-RPM25")
```
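
The introduction also mentions retrieving models by a combination of provider and type, but no example of that call appears in this document, so the keyword names below are purely hypothetical:

```python
# Hypothetical retrieval by provider and type; the real keyword names may differ.
ingestion_model = balancer.get_model(
    provider="google-genai",
    model_type="ingestion",
)
```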

## 5. Global Access with `init_load_balancer`

The `init_load_balancer` function provides a convenient singleton pattern for accessing the `ModelLoadBalancer` throughout your application without passing the instance around.

First, you initialize the balancer once at the start of your application.

### Initialization

You can initialize it in several ways:

**1. Default Initialization**

This will look for the `MODEL_CONFIG_PATH` environment variable, or use the default path `_config/models_config.json`.

```python
from crewplus.services.init_services import init_load_balancer

init_load_balancer()
```

**2. Initialization with a Custom Path**

You can also provide a direct path to your configuration file.

```python
from crewplus.services.init_services import init_load_balancer

init_load_balancer(config_path="path/to/your/models_config.json")
```

### Getting the Balancer and Models

Once initialized, you can retrieve the `ModelLoadBalancer` instance from anywhere in your code using `get_model_balancer`.

```python
from crewplus.services.init_services import get_model_balancer

# Get the balancer instance
balancer = get_model_balancer()

# Get a model by deployment name
gemini_flash_model = balancer.get_model(deployment_name="gemini-2.5-flash")
```
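
Putting the pieces together, a minimal end-to-end sketch (the config path is an example; both functions are imported from `crewplus.services.init_services` as shown above):

```python
from langchain_core.messages import HumanMessage
from crewplus.services.init_services import init_load_balancer, get_model_balancer

# Initialize once at application startup...
init_load_balancer(config_path="tests/models_config.json")

# ...then fetch models anywhere in the codebase.
balancer = get_model_balancer()
llm = balancer.get_model(deployment_name="gemini-2.5-flash")
print(llm.invoke([HumanMessage(content="Hello!")]).content)
```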