lollms-client 0.28.0__py3-none-any.whl → 0.29.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of lollms-client might be problematic.
- examples/text_gen.py +1 -1
- lollms_client/__init__.py +1 -1
- lollms_client/llm_bindings/llamacpp/__init__.py +1 -0
- lollms_client/llm_bindings/lollms/__init__.py +411 -267
- lollms_client/llm_bindings/lollms_webui/__init__.py +428 -0
- lollms_client/lollms_core.py +157 -130
- lollms_client/lollms_discussion.py +343 -61
- lollms_client/lollms_personality.py +8 -0
- lollms_client/lollms_utilities.py +10 -2
- lollms_client-0.29.1.dist-info/METADATA +963 -0
- {lollms_client-0.28.0.dist-info → lollms_client-0.29.1.dist-info}/RECORD +14 -14
- lollms_client/llm_bindings/lollms_chat/__init__.py +0 -571
- lollms_client-0.28.0.dist-info/METADATA +0 -604
- {lollms_client-0.28.0.dist-info → lollms_client-0.29.1.dist-info}/WHEEL +0 -0
- {lollms_client-0.28.0.dist-info → lollms_client-0.29.1.dist-info}/licenses/LICENSE +0 -0
- {lollms_client-0.28.0.dist-info → lollms_client-0.29.1.dist-info}/top_level.txt +0 -0
@@ -1,604 +0,0 @@
Metadata-Version: 2.4
Name: lollms_client
Version: 0.28.0
Summary: A client library for LoLLMs generate endpoint
Author-email: ParisNeo <parisneoai@gmail.com>
License: Apache Software License
Project-URL: Homepage, https://github.com/ParisNeo/lollms_client
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests
Requires-Dist: ascii-colors
Requires-Dist: pipmaster
Requires-Dist: pyyaml
Requires-Dist: tiktoken
Requires-Dist: pydantic
Requires-Dist: numpy
Requires-Dist: pillow
Requires-Dist: sqlalchemy
Dynamic: license-file

# LoLLMs Client Library

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[PyPI package](https://badge.fury.io/py/lollms_client)
[Python versions](https://pypi.org/project/lollms_client/)
[Downloads](https://pepy.tech/project/lollms-client)
[Usage Guide](DOC_USE.md)
[Developer Guide](DOC_DEV.md)
[GitHub stars](https://github.com/ParisNeo/lollms_client/stargazers/)
[GitHub issues](https://github.com/ParisNeo/lollms_client/issues)

**`lollms_client`** is a powerful and flexible Python library designed to simplify interactions with the **LoLLMs (Lord of Large Language Models)** ecosystem and various other Large Language Model (LLM) backends. It provides a unified API for text generation, multimodal operations (text-to-image, text-to-speech, etc.), and robust function calling through the Model Context Protocol (MCP).

Whether you're connecting to a remote LoLLMs server, an Ollama instance, the OpenAI API, or running models locally using GGUF (via `llama-cpp-python` or a managed `llama.cpp` server), Hugging Face Transformers, or vLLM, `lollms-client` offers a consistent and developer-friendly experience.

## Key Features

* 🔌 **Versatile Binding System:** Seamlessly switch between different LLM backends (LoLLMs, Ollama, OpenAI, Llama.cpp, Transformers, vLLM, OpenLLM) without major code changes.
* 🗣️ **Multimodal Support:** Interact with models capable of processing images and generate various outputs like speech (TTS) and images (TTI).
* 🤖 **Function Calling with MCP:** Empowers LLMs to use external tools and functions through the Model Context Protocol (MCP), with built-in support for local Python tool execution via the `local_mcp` binding and its default tools (file I/O, internet search, Python interpreter, image generation).
* 🚀 **Streaming & Callbacks:** Efficiently handle real-time text generation with customizable callback functions, including during MCP interactions.
* 💬 **Discussion Management:** Utilities to easily manage and format conversation histories for chat applications.
* ⚙️ **Configuration Management:** Flexible ways to configure bindings and generation parameters.
* 🧩 **Extensible:** Designed to easily incorporate new LLM backends and modality services, including custom MCP toolsets.
* 📝 **High-Level Operations:** Includes convenience methods for complex tasks like sequential summarization and deep text analysis directly within `LollmsClient`.

## Installation

You can install `lollms_client` directly from PyPI:

```bash
pip install lollms-client
```

This will install the core library. Some bindings may require additional dependencies (e.g., `llama-cpp-python`, `torch`, `transformers`, `ollama`, `vllm`). The library attempts to manage these using `pipmaster`, but for complex dependencies (especially those requiring compilation, like `llama-cpp-python` with GPU support), manual installation might be preferred, as shown below.
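
For example, a manual install of two common optional backends might look like this. The CUDA build flag for `llama-cpp-python` is the one referenced later in this README; adjust it for your platform:

```bash
# Optional backend dependencies -- install only what your chosen binding needs
pip install ollama                 # client library for the "ollama" binding
# GPU-enabled llama-cpp-python build (CUDA example)
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```
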
## Quick Start

Here's a very basic example of how to use `LollmsClient` to generate text with a LoLLMs server (ensure one is running at `http://localhost:9600`):

```python
from lollms_client import LollmsClient, MSG_TYPE
from ascii_colors import ASCIIColors

# Callback for streaming output
def simple_streaming_callback(chunk: str, msg_type: MSG_TYPE, params=None, metadata=None) -> bool:
    if msg_type == MSG_TYPE.MSG_TYPE_CHUNK:
        print(chunk, end="", flush=True)
    elif msg_type == MSG_TYPE.MSG_TYPE_EXCEPTION:
        ASCIIColors.error(f"\nStreaming Error: {chunk}")
    return True  # True to continue streaming

try:
    # Initialize client to connect to a LoLLMs server.
    # For other backends, change 'binding_name' and provide necessary parameters.
    # See DOC_USE.md for detailed initialization examples.
    lc = LollmsClient(
        binding_name="lollms",
        host_address="http://localhost:9600"
    )

    prompt = "Tell me a fun fact about space."
    ASCIIColors.yellow(f"Prompt: {prompt}")

    # Generate text with streaming
    ASCIIColors.green("Streaming Response:")
    response_text = lc.generate_text(
        prompt,
        n_predict=100,
        stream=True,
        streaming_callback=simple_streaming_callback
    )
    print("\n--- End of Stream ---")

    # The 'response_text' variable will contain the full concatenated text
    # if streaming_callback returns True throughout.
    if isinstance(response_text, str):
        ASCIIColors.cyan(f"\nFull streamed text collected: {response_text[:100]}...")
    elif isinstance(response_text, dict) and "error" in response_text:
        ASCIIColors.error(f"Error during generation: {response_text['error']}")

except ValueError as ve:
    ASCIIColors.error(f"Initialization Error: {ve}")
    ASCIIColors.info("Ensure a LoLLMs server is running or configure another binding.")
except ConnectionRefusedError:
    ASCIIColors.error("Connection refused. Is the LoLLMs server running at http://localhost:9600?")
except Exception as e:
    ASCIIColors.error(f"An unexpected error occurred: {e}")
```
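
For contrast, the same request without streaming returns the full text in a single call:

```python
# Non-streaming variant: the full response is returned at once
response = lc.generate_text("Tell me a fun fact about space.", n_predict=100, stream=False)
print(response)
```
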
### Function Calling with MCP

`lollms-client` supports robust function calling via the Model Context Protocol (MCP), allowing LLMs to interact with your custom Python tools or pre-defined utilities.

```python
from lollms_client import LollmsClient, MSG_TYPE
from ascii_colors import ASCIIColors, trace_exception
import json  # For pretty printing results

# Example callback for MCP streaming
def mcp_stream_callback(chunk: str, msg_type: MSG_TYPE, metadata: dict = None, turn_history: list = None) -> bool:
    if msg_type == MSG_TYPE.MSG_TYPE_CHUNK:
        # LLM's final answer or thought process
        ASCIIColors.success(chunk, end="", flush=True)
    elif msg_type == MSG_TYPE.MSG_TYPE_STEP_START:
        ASCIIColors.info(f"\n>> MCP Step Start: {metadata.get('tool_name', chunk)}", flush=True)
    elif msg_type == MSG_TYPE.MSG_TYPE_STEP_END:
        ASCIIColors.success(f"\n<< MCP Step End: {metadata.get('tool_name', chunk)} -> Result: {json.dumps(metadata.get('result', ''))}", flush=True)
    elif msg_type == MSG_TYPE.MSG_TYPE_INFO and metadata and metadata.get("type") == "tool_call_request":
        ASCIIColors.info(f"\nAI requests: {metadata.get('name')}({metadata.get('params')})", flush=True)
    return True

try:
    # Initialize LollmsClient with an LLM binding and the local_mcp binding
    lc = LollmsClient(
        binding_name="ollama", model_name="mistral",  # Example LLM
        mcp_binding_name="local_mcp"  # Enables default tools (file_writer, internet_search, etc.)
        # or custom tools if mcp_binding_config.tools_folder_path is set.
    )

    user_query = "What were the main AI headlines last week and write a summary to 'ai_news.txt'?"
    ASCIIColors.blue(f"User Query: {user_query}")
    ASCIIColors.yellow("AI Processing with MCP (streaming):")

    mcp_result = lc.generate_with_mcp(
        prompt=user_query,
        streaming_callback=mcp_stream_callback
    )
    print("\n--- End of MCP Interaction ---")

    if mcp_result.get("error"):
        ASCIIColors.error(f"MCP Error: {mcp_result['error']}")
    else:
        ASCIIColors.cyan(f"\nFinal Answer from AI: {mcp_result.get('final_answer', 'N/A')}")
        ASCIIColors.magenta("\nTool Calls Made:")
        for tc in mcp_result.get("tool_calls", []):
            print(f"  - Tool: {tc.get('name')}, Params: {tc.get('params')}, Result (first 50 chars): {str(tc.get('result'))[:50]}...")

except Exception as e:
    ASCIIColors.error(f"An error occurred in MCP example: {e}")
    trace_exception(e)
```

For a comprehensive guide on function calling and setting up tools, please refer to the [Usage Guide (DOC_USE.md)](DOC_USE.md).
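
To give a feel for what a custom tool can look like, here is a minimal sketch of one entry in a `tools_folder_path` folder. The folder layout mirrors the description above (a `.py` script paired with a `.mcp.json` metadata file), but the entry-point name, parameter dict, and file naming shown here are assumptions for illustration; DOC_DEV.md documents the exact contract:

```python
# my_tools/get_time/get_time.py
# (paired with my_tools/get_time/get_time.mcp.json, which would describe the
#  tool's name, description, and input schema -- field names are assumptions)
from datetime import datetime

def execute(params: dict) -> dict:
    """Hypothetical entry point the local_mcp binding would invoke."""
    fmt = params.get("format", "%Y-%m-%d %H:%M:%S")
    return {"current_time": datetime.now().strftime(fmt)}
```
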
### 🤖 Advanced Agentic Generation with RAG: `generate_with_mcp_rag`

For more complex tasks, `generate_with_mcp_rag` provides a powerful, built-in agent that uses a ReAct-style (Reason, Act) loop. This agent can reason about a user's request, use tools (MCP), retrieve information from knowledge bases (RAG), and adapt its plan based on the results of its actions.

**Key Agent Capabilities:**

* **Observe-Think-Act Loop:** The agent iteratively reviews its progress, thinks about the next logical step, and takes an action (like calling a tool).
* **Tool Integration (MCP):** Can use any available MCP tools, such as searching the web or executing code.
* **Retrieval-Augmented Generation (RAG):** You can provide one or more "data stores" (knowledge bases). The agent gains a `research::{store_name}` tool to query these stores for relevant information.
* **In-Memory Code Generation:** The agent has a special `generate_code` tool. This allows it to first write a piece of code (e.g., a complex Python script) and then pass that code to another tool (e.g., `python_code_interpreter`) in a subsequent step.
* **Stateful Progress Tracking:** Designed for rich UI experiences, it emits `step_start` and `step_end` events with unique IDs via the streaming callback. This allows an application to track the agent's individual thoughts and long-running tool calls in real time.
* **Self-Correction:** Includes a `refactor_scratchpad` tool for the agent to clean up its own thought process if it becomes cluttered.

Here is an example of using the agent to answer a question by first performing RAG on a custom knowledge base and then using the retrieved information to generate and execute code.

```python
import json
from lollms_client import LollmsClient, MSG_TYPE
from ascii_colors import ASCIIColors

# 1. Define a mock RAG data store and retrieval function
project_notes = {
    "project_phoenix_details": "Project Phoenix has a current budget of $500,000 and an expected quarterly growth rate of 15%."
}

def retrieve_from_notes(query: str, top_k: int = 1, min_similarity: float = 0.5):
    """A simple keyword-based retriever for our mock data store."""
    results = []
    for key, text in project_notes.items():
        if query.lower() in text.lower():
            results.append({"source": key, "content": text})
    return results[:top_k]

# 2. Define a detailed streaming callback to visualize the agent's process
def agent_streaming_callback(chunk: str, msg_type: MSG_TYPE, params: dict = None, metadata: list = None) -> bool:
    if not params:
        params = {}
    msg_id = params.get("id", "")

    if msg_type == MSG_TYPE.MSG_TYPE_STEP_START:
        ASCIIColors.yellow(f"\n>> Agent Step Start [ID: {msg_id}]: {chunk}")
    elif msg_type == MSG_TYPE.MSG_TYPE_STEP_END:
        ASCIIColors.green(f"<< Agent Step End [ID: {msg_id}]: {chunk}")
        if params.get('result'):
            ASCIIColors.cyan(f"   Result: {json.dumps(params['result'], indent=2)}")
    elif msg_type == MSG_TYPE.MSG_TYPE_THOUGHT_CONTENT:
        ASCIIColors.magenta(f"\n🤔 Agent Thought: {chunk}")
    elif msg_type == MSG_TYPE.MSG_TYPE_TOOL_CALL:
        ASCIIColors.blue(f"\n🛠️ Agent Action: {chunk}")
    elif msg_type == MSG_TYPE.MSG_TYPE_OBSERVATION:
        ASCIIColors.cyan(f"\n👀 Agent Observation: {chunk}")
    elif msg_type == MSG_TYPE.MSG_TYPE_CHUNK:
        print(chunk, end="", flush=True)  # Final answer stream
    return True

try:
    # 3. Initialize LollmsClient with an LLM and local tools enabled
    lc = LollmsClient(
        binding_name="ollama",        # Use Ollama
        model_name="llama3",          # Or any capable model like mistral, gemma, etc.
        mcp_binding_name="local_mcp"  # Enable local tools like python_code_interpreter
    )

    # 4. Define the user prompt and the RAG data store
    prompt = "Based on my notes about Project Phoenix, write and run a Python script to calculate its projected budget after two quarters."

    rag_data_store = {
        "project_notes": {"callable": retrieve_from_notes}
    }

    ASCIIColors.yellow(f"User Prompt: {prompt}")
    print("\n" + "="*50 + "\nAgent is now running...\n" + "="*50)

    # 5. Run the agent
    agent_output = lc.generate_with_mcp_rag(
        prompt=prompt,
        use_data_store=rag_data_store,
        use_mcps=["python_code_interpreter"],  # Make specific tools available
        streaming_callback=agent_streaming_callback,
        max_reasoning_steps=5
    )

    print("\n" + "="*50 + "\nAgent finished.\n" + "="*50)

    # 6. Print the final results
    if agent_output.get("error"):
        ASCIIColors.error(f"\nAgent Error: {agent_output['error']}")
    else:
        ASCIIColors.green("\n--- Final Answer ---")
        print(agent_output.get("final_answer"))

        ASCIIColors.magenta("\n--- Tool Calls ---")
        print(json.dumps(agent_output.get("tool_calls", []), indent=2))

        ASCIIColors.cyan("\n--- RAG Sources ---")
        print(json.dumps(agent_output.get("sources", []), indent=2))

except Exception as e:
    ASCIIColors.red(f"\nAn unexpected error occurred: {e}")
```

## Documentation

For more in-depth information, please refer to:

* **[Usage Guide (DOC_USE.md)](DOC_USE.md):** Learn how to use `LollmsClient`, different bindings, modality features, function calling with MCP, and high-level operations.
* **[Developer Guide (DOC_DEV.md)](DOC_DEV.md):** Understand the architecture, how to create new bindings (LLM, modality, MCP), and contribute to the library.

## Core Concepts

```mermaid
graph LR
    A[Your Application] --> LC[LollmsClient];

    subgraph LollmsClient_Core
        LC -- Manages --> LLB[LLM Binding];
        LC -- Manages --> MCPB[MCP Binding];
        LC -- Orchestrates --> MCP_Interaction[generate_with_mcp];
        LC -- Provides --> HighLevelOps["High-Level Ops<br>(summarize, deep_analyze, etc.)"];
        LC -- Provides Access To --> DM[DiscussionManager];
        LC -- Provides Access To --> ModalityBindings[TTS, TTI, STT etc.];
    end

    subgraph LLM_Backends
        LLB --> LollmsServer[LoLLMs Server];
        LLB --> OllamaServer[Ollama];
        LLB --> OpenAPIServer[OpenAI API];
        LLB --> LocalGGUF["Local GGUF<br>(pythonllamacpp / llamacpp server)"];
        LLB --> LocalHF["Local HuggingFace<br>(transformers / vLLM)"];
    end

    MCP_Interaction --> MCPB;
    MCPB --> LocalTools["Local Python Tools<br>(via local_mcp)"];
    MCPB --> RemoteTools["Remote MCP Tool Servers<br>(Future Potential)"];

    ModalityBindings --> ModalityServices["Modality Services<br>(e.g., LoLLMs Server TTS/TTI, local Bark/XTTS)"];
```

* **`LollmsClient`**: The central class for all interactions. It holds the currently active LLM binding, an optional MCP binding, and provides access to modality bindings and high-level operations.
* **LLM Bindings**: These are plugins that allow `LollmsClient` to communicate with different LLM backends. You choose a binding (e.g., `"ollama"`, `"lollms"`, `"pythonllamacpp"`) when you initialize `LollmsClient`.
* **🔧 MCP Bindings**: Enable tool use and function calling. `lollms-client` includes `local_mcp` for executing Python tools. It discovers tools from a specified folder (or uses its default set), each defined by a `.py` script and a `.mcp.json` metadata file.
* **Modality Bindings**: Similar to LLM bindings, but for services like Text-to-Speech (`tts`), Text-to-Image (`tti`), etc.
* **High-Level Operations**: Methods directly on `LollmsClient` (e.g., `sequential_summarize`, `deep_analyze`, `generate_code`, `yes_no`) for performing complex, multi-step AI tasks; see the sketch after this list.
* **`LollmsDiscussion`**: Helps manage and format conversation histories for chat applications.
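
As a flavor of those high-level operations, here is a minimal, hedged sketch, given an initialized `lc`. Only the method names come from the list above; the parameters are assumptions, so consult DOC_USE.md for the authoritative signatures:

```python
# Hedged sketch of the high-level API -- parameter shapes are assumptions,
# only the method names are taken from the documentation above.
long_report = open("quarterly_report.txt").read()

summary = lc.sequential_summarize(long_report)  # chunk-by-chunk summarization
verdict = lc.yes_no("Is the overall tone of this report positive?")  # constrained boolean answer
print(summary, verdict)
```
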
## Examples

The `examples/` directory in this repository contains a rich set of scripts demonstrating various features:

* Basic text generation with different bindings.
* Streaming and non-streaming examples.
* Multimodal generation (text with images).
* Using built-in methods for summarization and Q&A.
* Implementing and using function calls with **`generate_with_mcp`** and the `local_mcp` binding (see `examples/function_calling_with_local_custom_mcp.py` and `examples/local_mcp.py`).
* Text-to-Speech and Text-to-Image generation.

Explore these examples to see `lollms-client` in action!

## Using LoLLMs Client with Different Bindings

`lollms-client` supports a wide range of LLM backends through its binding system. This section provides practical examples of how to initialize `LollmsClient` for each of the major supported bindings.

### A Note on Configuration

The recommended way to provide credentials and other binding-specific settings is through the `llm_binding_config` dictionary during `LollmsClient` initialization. While many bindings can fall back to reading environment variables (e.g., `OPENAI_API_KEY`), passing them explicitly in the config is clearer and less error-prone.

```python
# General configuration pattern
lc = LollmsClient(
    binding_name="your_binding_name",
    model_name="a_model_name",
    llm_binding_config={
        "specific_api_key_param": "your_api_key_here",
        "another_specific_param": "some_value"
    }
)
```
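
For instance, a key can be read from the environment and still passed explicitly, here using the `service_key` parameter the OpenAI binding expects (shown later in this section):

```python
import os
from lollms_client import LollmsClient

# Pull the key from the environment rather than hard-coding it
lc = LollmsClient(
    binding_name="openai",
    model_name="gpt-4o",
    llm_binding_config={"service_key": os.environ["OPENAI_API_KEY"]},
)
```
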
---

### 1. Local Bindings

These bindings run models directly on your local machine, giving you full control and privacy.

#### **Ollama**

The `ollama` binding connects to a running Ollama server instance on your machine or network.

**Prerequisites:**
* [Ollama installed and running](https://ollama.com/).
* Models pulled, e.g., `ollama pull llama3`.

**Usage:**

```python
from lollms_client import LollmsClient

# Configuration for a local Ollama server
lc = LollmsClient(
    binding_name="ollama",
    model_name="llama3",                   # Or any other model you have pulled
    host_address="http://localhost:11434"  # Default Ollama address
)

# Now you can use lc.generate_text(), lc.chat(), etc.
response = lc.generate_text("Why is the sky blue?")
print(response)
```

#### **PythonLlamaCpp (Local GGUF Models)**

The `pythonllamacpp` binding loads and runs GGUF model files directly using the powerful `llama-cpp-python` library. This is ideal for high-performance, local inference on CPU or GPU.

**Prerequisites:**
* A GGUF model file downloaded to your machine.
* `llama-cpp-python` installed. For GPU support, it must be compiled with the correct flags (e.g., `CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python`).

**Usage:**

```python
from lollms_client import LollmsClient

# --- Configuration for Llama.cpp ---
# Path to your GGUF model file
MODEL_PATH = "/path/to/your/model.gguf"

# Binding-specific configuration
LLAMACPP_CONFIG = {
    "n_gpu_layers": -1,      # -1 for all layers to GPU, 0 for CPU
    "n_ctx": 4096,           # Context size
    "seed": -1,              # -1 for random seed
    "chat_format": "chatml"  # Or another format like 'llama-2'
}

try:
    lc = LollmsClient(
        binding_name="pythonllamacpp",
        model_name=MODEL_PATH,  # For this binding, model_name is the file path
        llm_binding_config=LLAMACPP_CONFIG
    )

    response = lc.generate_text("Write a recipe for a great day.")
    print(response)

except Exception as e:
    print(f"Error initializing Llama.cpp binding: {e}")
    print("Please ensure llama-cpp-python is installed and the model path is correct.")
```

---

### 2. Cloud Service Bindings

These bindings connect to hosted LLM APIs from major providers.

#### **OpenAI**

Connects to the official OpenAI API to use models like GPT-4o, GPT-4, and GPT-3.5.

**Prerequisites:**
* An OpenAI API key.

**Usage:**

```python
from lollms_client import LollmsClient

OPENAI_CONFIG = {
    "service_key": "your_openai_api_key_here"  # sk-...
}

lc = LollmsClient(
    binding_name="openai",
    model_name="gpt-4o",
    llm_binding_config=OPENAI_CONFIG
)

response = lc.generate_text("What is the difference between AI and machine learning?")
print(response)
```

#### **Google Gemini**

Connects to Google's Gemini family of models via the Google AI Studio API.

**Prerequisites:**
* A Google AI Studio API key.

**Usage:**

```python
from lollms_client import LollmsClient

GEMINI_CONFIG = {
    "service_key": "your_google_api_key_here"
}

lc = LollmsClient(
    binding_name="gemini",
    model_name="gemini-1.5-pro-latest",
    llm_binding_config=GEMINI_CONFIG
)

response = lc.generate_text("Summarize the plot of 'Dune' in three sentences.")
print(response)
```

#### **Anthropic Claude**

Connects to Anthropic's API to use the Claude family of models, including Claude 3.5 Sonnet, Opus, and Haiku.

**Prerequisites:**
* An Anthropic API key.

**Usage:**

```python
from lollms_client import LollmsClient

CLAUDE_CONFIG = {
    "service_key": "your_anthropic_api_key_here"
}

lc = LollmsClient(
    binding_name="claude",
    model_name="claude-3-5-sonnet-20240620",
    llm_binding_config=CLAUDE_CONFIG
)

response = lc.generate_text("What are the core principles of constitutional AI?")
print(response)
```

---

### 3. API Aggregator Bindings

These bindings connect to services that provide access to many different models through a single API.

#### **OpenRouter**

OpenRouter provides a unified, OpenAI-compatible interface to access models from dozens of providers (Google, Anthropic, Mistral, Groq, etc.) with one API key.

**Prerequisites:**
* An OpenRouter API key (starts with `sk-or-...`).

**Usage:**
Model names must be specified in the format `provider/model-name`.

```python
from lollms_client import LollmsClient

OPENROUTER_CONFIG = {
    "open_router_api_key": "your_openrouter_api_key_here"
}

# Example using a Claude model through OpenRouter
lc = LollmsClient(
    binding_name="open_router",
    model_name="anthropic/claude-3-haiku-20240307",
    llm_binding_config=OPENROUTER_CONFIG
)

response = lc.generate_text("Explain what an API aggregator is, as if to a beginner.")
print(response)
```

#### **Groq**

Groq is a direct provider rather than a true aggregator, but it belongs in this list for its speed: it runs open-source models on custom LPU hardware for exceptionally fast inference.

**Prerequisites:**
* A Groq API key.

**Usage:**

```python
from lollms_client import LollmsClient

GROQ_CONFIG = {
    "groq_api_key": "your_groq_api_key_here"
}

lc = LollmsClient(
    binding_name="groq",
    model_name="llama3-8b-8192",
    llm_binding_config=GROQ_CONFIG
)

response = lc.generate_text("Write a 3-line poem about incredible speed.")
print(response)
```

#### **Hugging Face Inference API**

This binding connects to the serverless Hugging Face Inference API, allowing experimentation with thousands of open-source models without local hardware.

**Note:** This API can have "cold starts," so the first request might be slow.

**Prerequisites:**
* A Hugging Face User Access Token (starts with `hf_...`).

**Usage:**

```python
from lollms_client import LollmsClient

HF_CONFIG = {
    "hf_api_key": "your_hugging_face_token_here"
}

lc = LollmsClient(
    binding_name="hugging_face_inference_api",
    model_name="google/gemma-1.1-7b-it",
    llm_binding_config=HF_CONFIG
)

response = lc.generate_text("Write a short story about a robot who discovers music.")
print(response)
```
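
Because of those cold starts, the first request can fail before the model is warm. A naive retry loop like the one below can help; how failures surface (exception versus error payload) depends on the binding, so treat this as a hedged sketch:

```python
import time

# Hedged sketch: retry the first request a few times to ride out a cold start.
# Assumption: failures surface as exceptions; adapt if your binding returns error dicts.
response = None
for attempt in range(3):
    try:
        response = lc.generate_text("Write a short story about a robot who discovers music.")
        break
    except Exception as e:
        print(f"Attempt {attempt + 1} failed ({e}); waiting for the model to warm up...")
        time.sleep(10)
print(response)
```
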
## Contributing

Contributions are welcome! Whether it's bug reports, feature suggestions, documentation improvements, or new bindings, please feel free to open an issue or submit a pull request on our [GitHub repository](https://github.com/ParisNeo/lollms_client).

## License

This project is licensed under the **Apache 2.0 License**. See the [LICENSE](LICENSE) file for details.

## Changelog

For a list of changes and updates, please refer to the [CHANGELOG.md](CHANGELOG.md) file.

File without changes

File without changes

File without changes