PyPI - lollms-client - Versions diffs - 0.25.5__tar.gz → 0.26.0__tar.gz - Mend - Supply Chain Defender

lollms-client 0.25.5.tar.gz → 0.26.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of lollms-client might be problematic.

Files changed (100)

{lollms_client-0.25.5 → lollms_client-0.26.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: lollms_client
-Version: 0.25.5
+Version: 0.26.0
 Summary: A client library for LoLLMs generate endpoint
 Author-email: ParisNeo <parisneoai@gmail.com>
 License: Apache Software License
@@ -169,6 +169,107 @@ except Exception as e:
 ```
 For a comprehensive guide on function calling and setting up tools, please refer to the [Usage Guide (DOC_USE.md)](DOC_USE.md).
+### 🤖 Advanced Agentic Generation with RAG: `generate_with_mcp_rag`
+For more complex tasks, `generate_with_mcp_rag` provides a powerful, built-in agent that uses a ReAct-style (Reason, Act) loop. This agent can reason about a user's request, use tools (MCP), retrieve information from knowledge bases (RAG), and adapt its plan based on the results of its actions.
+**Key Agent Capabilities:**
+*   **Observe-Think-Act Loop:** The agent iteratively reviews its progress, thinks about the next logical step, and takes an action (like calling a tool).
+*   **Tool Integration (MCP):** Can use any available MCP tools, such as searching the web or executing code.
+*   **Retrieval-Augmented Generation (RAG):** You can provide one or more "data stores" (knowledge bases). The agent gains a `research::{store_name}` tool to query these stores for relevant information.
+*   **In-Memory Code Generation:** The agent has a special `generate_code` tool. This allows it to first write a piece of code (e.g., a complex Python script) and then pass that code to another tool (e.g., `python_code_interpreter`) in a subsequent step.
+*   **Stateful Progress Tracking:** Designed for rich UI experiences, it emits `step_start` and `step_end` events with unique IDs via the streaming callback. This allows an application to track the agent's individual thoughts and long-running tool calls in real-time.
+*   **Self-Correction:** Includes a `refactor_scratchpad` tool for the agent to clean up its own thought process if it becomes cluttered.
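The capabilities listed above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the implementation behind `generate_with_mcp_rag`: `run_agent`, `scripted_decide`, and the `$CODE` placeholder are hypothetical names introduced here, and the "LLM" is replaced by a scripted function so the control flow is easy to follow.

```python
from typing import Callable, Dict, List


def run_agent(decide: Callable[[List[str]], dict],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 5) -> dict:
    """Toy observe-think-act loop; `decide` stands in for the LLM."""
    scratchpad: List[str] = []   # running record of actions and observations
    tool_calls: List[dict] = []
    generated_code = ""          # in-memory buffer filled by `generate_code`

    for _ in range(max_steps):
        action = decide(scratchpad)                  # think
        name = action["tool"]
        arg = action.get("input", "")
        if name == "final_answer":
            return {"final_answer": arg, "tool_calls": tool_calls}
        if name == "generate_code":
            generated_code = arg                     # keep the code for a later step
            observation = "code stored"
        elif name not in tools:
            observation = f"unknown tool: {name}"
        else:
            # "$CODE" is a hypothetical placeholder meaning "use the stored code"
            payload = generated_code if arg == "$CODE" else arg
            observation = tools[name](payload)       # act
        tool_calls.append({"tool": name, "result": observation})
        scratchpad.append(f"{name} -> {observation}")  # observe
    return {"final_answer": None, "tool_calls": tool_calls,
            "error": "max steps reached"}


def scripted_decide(scratchpad: List[str]) -> dict:
    """Fixed plan used instead of a real model: research, write code, run it."""
    plan = [
        {"tool": "research::project_notes", "input": "budget"},
        {"tool": "generate_code", "input": "500000 * 1.15 ** 2"},
        {"tool": "python_code_interpreter", "input": "$CODE"},
    ]
    if len(scratchpad) < len(plan):
        return plan[len(scratchpad)]
    return {"tool": "final_answer", "input": scratchpad[-1]}


# Stub tools: neither one queries a real store or executes anything.
tools = {
    "research::project_notes": lambda q: "budget $500,000, growth 15% per quarter",
    "python_code_interpreter": lambda code: f"executed: {code}",
}

result = run_agent(scripted_decide, tools)
print(result["final_answer"])
```

The point of the sketch is the hand-off: the code produced in one step is held in memory and only passed to the interpreter tool in a later step, which is the behavior the `generate_code` bullet describes.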
+Here is an example of using the agent to answer a question by first performing RAG on a custom knowledge base and then using the retrieved information to generate and execute code.
+```python
+import json
+from lollms_client import LollmsClient, MSG_TYPE
+from ascii_colors import ASCIIColors
+# 1. Define a mock RAG data store and retrieval function
+project_notes = {
+    "project_phoenix_details": "Project Phoenix has a current budget of $500,000 and an expected quarterly growth rate of 15%."
+}
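+def retrieve_from_notes(query: str, top_k: int = 1, min_similarity: float = 0.5):
+    """A simple keyword-based retriever for our mock data store."""
+    # Note: min_similarity is unused in this mock; a real retriever would filter on it.
+    results = []
+    keywords = query.lower().split()
+    for key, text in project_notes.items():
+        # Match when any query word appears in the note, not only the whole query string
+        if any(word in text.lower() for word in keywords):
+            results.append({"source": key, "content": text})
+    return results[:top_k]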
+# 2. Define a detailed streaming callback to visualize the agent's process
+def agent_streaming_callback(chunk: str, msg_type: MSG_TYPE, params: dict = None, metadata: list = None) -> bool:
+    params = params or {}
+    msg_id = params.get("id", "")
+    if msg_type == MSG_TYPE.MSG_TYPE_STEP_START:
+        ASCIIColors.yellow(f"\n>> Agent Step Start [ID: {msg_id}]: {chunk}")
+    elif msg_type == MSG_TYPE.MSG_TYPE_STEP_END:
+        ASCIIColors.green(f"<< Agent Step End [ID: {msg_id}]: {chunk}")
+        if params.get('result'):
+            ASCIIColors.cyan(f"   Result: {json.dumps(params['result'], indent=2)}")
+    elif msg_type == MSG_TYPE.MSG_TYPE_THOUGHT_CONTENT:
+        ASCIIColors.magenta(f"\n🤔 Agent Thought: {chunk}")
+    elif msg_type == MSG_TYPE.MSG_TYPE_TOOL_CALL:
+        ASCIIColors.blue(f"\n🛠️  Agent Action: {chunk}")
+    elif msg_type == MSG_TYPE.MSG_TYPE_OBSERVATION:
+        ASCIIColors.cyan(f"\n👀 Agent Observation: {chunk}")
+    elif msg_type == MSG_TYPE.MSG_TYPE_CHUNK:
+        print(chunk, end="", flush=True) # Final answer stream
+    return True
+try:
+    # 3. Initialize LollmsClient with an LLM and local tools enabled
+    lc = LollmsClient(
+        binding_name="ollama",          # Use Ollama
+        model_name="llama3",            # Or any capable model like mistral, gemma, etc.
+        mcp_binding_name="local_mcp"    # Enable local tools like python_code_interpreter
+    )
+    # 4. Define the user prompt and the RAG data store
+    prompt = "Based on my notes about Project Phoenix, write and run a Python script to calculate its projected budget after two quarters."
+    rag_data_store = {
+        "project_notes": {"callable": retrieve_from_notes}
+    }
+    ASCIIColors.yellow(f"User Prompt: {prompt}")
+    print("\n" + "="*50 + "\nAgent is now running...\n" + "="*50)
+    # 5. Run the agent
+    agent_output = lc.generate_with_mcp_rag(
+        prompt=prompt,
+        use_data_store=rag_data_store,
+        use_mcps=["python_code_interpreter"], # Make specific tools available
+        streaming_callback=agent_streaming_callback,
+        max_reasoning_steps=5
+    )
+    print("\n" + "="*50 + "\nAgent finished.\n" + "="*50)
+    # 6. Print the final results
+    if agent_output.get("error"):
+        ASCIIColors.error(f"\nAgent Error: {agent_output['error']}")
+    else:
+        ASCIIColors.green("\n--- Final Answer ---")
+        print(agent_output.get("final_answer"))
+        ASCIIColors.magenta("\n--- Tool Calls ---")
+        print(json.dumps(agent_output.get("tool_calls", []), indent=2))
+        ASCIIColors.cyan("\n--- RAG Sources ---")
+        print(json.dumps(agent_output.get("sources", []), indent=2))
+except Exception as e:
+    ASCIIColors.red(f"\nAn unexpected error occurred: {e}")
+```
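The `step_start` / `step_end` events described earlier carry an ID so that a UI can pair them. The following is a minimal sketch of that pairing, assuming only what the callback above shows (an `id` entry in `params`). `EventType` and `StepTracker` are hypothetical names; `EventType` is a stand-in for the library's `MSG_TYPE` so the snippet runs without `lollms_client` installed.

```python
import time
from enum import Enum, auto
from typing import Dict, List, Optional


class EventType(Enum):
    """Hypothetical stand-in for the step-related MSG_TYPE values."""
    STEP_START = auto()
    STEP_END = auto()


class StepTracker:
    """Pairs start/end events by ID and records how long each step took."""

    def __init__(self) -> None:
        self._open: Dict[str, dict] = {}   # steps that started but have not ended
        self.finished: List[dict] = []     # completed steps with their duration

    def __call__(self, chunk: str, event: EventType,
                 params: Optional[dict] = None) -> bool:
        step_id = (params or {}).get("id", "")
        if event is EventType.STEP_START:
            self._open[step_id] = {"id": step_id, "label": chunk,
                                   "started": time.monotonic()}
        elif event is EventType.STEP_END:
            step = self._open.pop(step_id, None)
            if step is not None:           # ignore an end without a matching start
                step["seconds"] = time.monotonic() - step.pop("started")
                self.finished.append(step)
        return True                        # same return value as the callback above

    def pending(self) -> List[str]:
        """Labels of steps still running, e.g. to show spinners in a UI."""
        return [step["label"] for step in self._open.values()]


tracker = StepTracker()
tracker("Querying project_notes", EventType.STEP_START, {"id": "a1"})
tracker("Running interpreter", EventType.STEP_START, {"id": "b2"})
tracker("Querying project_notes", EventType.STEP_END, {"id": "a1"})
print(tracker.pending())
```

Keying on the ID rather than on arrival order is what lets overlapping or long-running tool calls be displayed independently.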
 ## Documentation
 For more in-depth information, please refer to:
@@ -226,6 +327,270 @@ The `examples/` directory in this repository contains a rich set of scripts demo
 Explore these examples to see `lollms-client` in action!
+## Using LoLLMs Client with Different Bindings
+`lollms-client` supports a wide range of LLM backends through its binding system. This section provides practical examples of how to initialize `LollmsClient` for each of the major supported bindings.
+### A Note on Configuration
+The recommended way to provide credentials and other binding-specific settings is through the `llm_binding_config` dictionary during `LollmsClient` initialization. While many bindings can fall back to reading environment variables (e.g., `OPENAI_API_KEY`), passing them explicitly in the config is clearer and less error-prone.
+```python
+# General configuration pattern
+lc = LollmsClient(
+    binding_name="your_binding_name",
+    model_name="a_model_name",
+    llm_binding_config={
+        "specific_api_key_param": "your_api_key_here",
+        "another_specific_param": "some_value"
+    }
+)
+```
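One way to combine both approaches is to prefer an explicitly passed value and only then fall back to the environment. The sketch below is an illustration of that precedence, not part of `lollms-client`: `resolve_setting`, `build_binding_config`, and the `EXAMPLE_API_KEY` variable are hypothetical names, and the resulting dictionary is simply what you would pass as `llm_binding_config`.

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[str], env_var: str) -> str:
    """Return the explicit value if given, otherwise the environment variable."""
    if explicit:
        return explicit
    value = os.environ.get(env_var)
    if not value:
        raise ValueError(f"No value passed and {env_var} is not set")
    return value


def build_binding_config(param_name: str, explicit: Optional[str],
                         env_var: str) -> dict:
    """Build a one-entry config dict suitable for `llm_binding_config`."""
    return {param_name: resolve_setting(explicit, env_var)}


# Demonstration with a throwaway variable name (hypothetical).
os.environ["EXAMPLE_API_KEY"] = "from-env"
from_env = build_binding_config("service_key", None, "EXAMPLE_API_KEY")
from_arg = build_binding_config("service_key", "explicit-key", "EXAMPLE_API_KEY")
print(from_env, from_arg)
```

Failing early with a clear message when neither source provides a value tends to be easier to debug than an authentication error returned later by the remote service.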
+---
+### 1. Local Bindings
+These bindings run models directly on your local machine, giving you full control and privacy.
+#### **Ollama**
+The `ollama` binding connects to a running Ollama server instance on your machine or network.
+**Prerequisites:**
+*   [Ollama installed and running](https://ollama.com/).
+*   Models pulled, e.g., `ollama pull llama3`.
+**Usage:**
+```python
+from lollms_client import LollmsClient
+# Configuration for a local Ollama server
+lc = LollmsClient(
+    binding_name="ollama",
+    model_name="llama3",  # Or any other model you have pulled
+    host_address="http://localhost:11434" # Default Ollama address
+)
+# Now you can use lc.generate_text(), lc.chat(), etc.
+response = lc.generate_text("Why is the sky blue?")
+print(response)
+```
+#### **PythonLlamaCpp (Local GGUF Models)**
+The `pythonllamacpp` binding loads and runs GGUF model files directly using the powerful `llama-cpp-python` library. This is ideal for high-performance, local inference on CPU or GPU.
+**Prerequisites:**
+*   A GGUF model file downloaded to your machine.
+*   `llama-cpp-python` installed. For GPU support, it must be compiled with the correct flags (e.g., `CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python` on recent versions; older releases used `-DLLAMA_CUBLAS=on`).
+**Usage:**
+```python
+from lollms_client import LollmsClient
+# --- Configuration for Llama.cpp ---
+# Path to your GGUF model file
+MODEL_PATH = "/path/to/your/model.gguf"
+# Binding-specific configuration
+LLAMACPP_CONFIG = {
+    "n_gpu_layers": -1,  # -1 for all layers to GPU, 0 for CPU
+    "n_ctx": 4096,       # Context size
+    "seed": -1,          # -1 for random seed
+    "chat_format": "chatml" # Or another format like 'llama-2'
+}
+try:
+    lc = LollmsClient(
+        binding_name="pythonllamacpp",
+        model_name=MODEL_PATH, # For this binding, model_name is the file path
+        llm_binding_config=LLAMACPP_CONFIG
+    )
+    response = lc.generate_text("Write a recipe for a great day.")
+    print(response)
+except Exception as e:
+    print(f"Error initializing Llama.cpp binding: {e}")
+    print("Please ensure llama-cpp-python is installed and the model path is correct.")
+```
+---
+### 2. Cloud Service Bindings
+These bindings connect to hosted LLM APIs from major providers.
+#### **OpenAI**
+Connects to the official OpenAI API to use models like GPT-4o, GPT-4, and GPT-3.5.
+**Prerequisites:**
+*   An OpenAI API key.
+**Usage:**
+```python
+from lollms_client import LollmsClient
+OPENAI_CONFIG = {
+    "service_key": "your_openai_api_key_here" # sk-...
+}
+lc = LollmsClient(
+    binding_name="openai",
+    model_name="gpt-4o",
+    llm_binding_config=OPENAI_CONFIG
+)
+response = lc.generate_text("What is the difference between AI and machine learning?")
+print(response)
+```
+#### **Google Gemini**
+Connects to Google's Gemini family of models via the Google AI Studio API.
+**Prerequisites:**
+*   A Google AI Studio API key.
+**Usage:**
+```python
+from lollms_client import LollmsClient
+GEMINI_CONFIG = {
+    "service_key": "your_google_api_key_here"
+}
+lc = LollmsClient(
+    binding_name="gemini",
+    model_name="gemini-1.5-pro-latest",
+    llm_binding_config=GEMINI_CONFIG
+)
+response = lc.generate_text("Summarize the plot of 'Dune' in three sentences.")
+print(response)
+```
+#### **Anthropic Claude**
+Connects to Anthropic's API to use the Claude family of models, including Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku.
+**Prerequisites:**
+*   An Anthropic API key.
+**Usage:**
+```python
+from lollms_client import LollmsClient
+CLAUDE_CONFIG = {
+    "service_key": "your_anthropic_api_key_here"
+}
+lc = LollmsClient(
+    binding_name="claude",
+    model_name="claude-3-5-sonnet-20240620",
+    llm_binding_config=CLAUDE_CONFIG
+)
+response = lc.generate_text("What are the core principles of constitutional AI?")
+print(response)
+```
+---
+### 3. API Aggregator Bindings
+These bindings connect to services that provide access to many different models through a single API.
+#### **OpenRouter**
+OpenRouter provides a unified, OpenAI-compatible interface to access models from dozens of providers (Google, Anthropic, Mistral, Groq, etc.) with one API key.
+**Prerequisites:**
+*   An OpenRouter API key (starts with `sk-or-...`).
+**Usage:**
+Model names must be specified in the format `provider/model-name`.
+```python
+from lollms_client import LollmsClient
+OPENROUTER_CONFIG = {
+    "open_router_api_key": "your_openrouter_api_key_here"
+}
+# Example using a Claude model through OpenRouter
+lc = LollmsClient(
+    binding_name="open_router",
+    model_name="anthropic/claude-3-haiku-20240307",
+    llm_binding_config=OPENROUTER_CONFIG
+)
+response = lc.generate_text("Explain what an API aggregator is, as if to a beginner.")
+print(response)
+```
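Because the `provider/model-name` format is easy to get wrong, a small check before constructing the client can catch mistakes early. This is a minimal sketch; `split_model_id` is a hypothetical helper, not a `lollms-client` function, and it only validates the shape of the string, not whether the model exists on OpenRouter.

```python
from typing import Tuple


def split_model_id(model_id: str) -> Tuple[str, str]:
    """Split 'provider/model-name' into its two parts, or raise ValueError."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(
            f"Expected 'provider/model-name', got {model_id!r}"
        )
    return provider, model


provider, model = split_model_id("anthropic/claude-3-haiku-20240307")
print(provider, model)
```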
+#### **Groq**
+Groq is a direct provider rather than an aggregator in the strict sense; it is grouped here because it serves a range of open-source models through one API. It runs them on custom LPU hardware for exceptionally fast inference.
+**Prerequisites:**
+*   A Groq API key.
+**Usage:**
+```python
+from lollms_client import LollmsClient
+GROQ_CONFIG = {
+    "groq_api_key": "your_groq_api_key_here"
+}
+lc = LollmsClient(
+    binding_name="groq",
+    model_name="llama3-8b-8192",
+    llm_binding_config=GROQ_CONFIG
+)
+response = lc.generate_text("Write a 3-line poem about incredible speed.")
+print(response)
+```
+#### **Hugging Face Inference API**
+This connects to the serverless Hugging Face Inference API, allowing experimentation with thousands of open-source models without local hardware.
+**Note:** This API can have "cold starts," so the first request might be slow.
+**Prerequisites:**
+*   A Hugging Face User Access Token (starts with `hf_...`).
+**Usage:**
+```python
+from lollms_client import LollmsClient
+HF_CONFIG = {
+    "hf_api_key": "your_hugging_face_token_here"
+}
+lc = LollmsClient(
+    binding_name="hugging_face_inference_api",
+    model_name="google/gemma-1.1-7b-it",
+    llm_binding_config=HF_CONFIG
+)
+response = lc.generate_text("Write a short story about a robot who discovers music.")
+print(response)
+```
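Given the cold starts mentioned above, wrapping the first request in a retry with exponential backoff can help. The sketch below is generic and makes no assumption about which exceptions the binding raises, so it retries on any `Exception`; `call_with_retry` is a hypothetical helper, and the flaky function only simulates a slow-to-wake endpoint.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(); on failure wait base_delay * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Simulated endpoint that fails twice before answering.
calls = {"count": 0}


def flaky_generate() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("model is loading")
    return "ok"


delays: List[float] = []
answer = call_with_retry(flaky_generate, attempts=4, base_delay=0.5,
                         sleep=delays.append)  # record delays instead of sleeping
print(answer, delays)
```

With a real client, the callable would be something like `lambda: lc.generate_text(prompt)`; in practice it is better to narrow the `except` clause to the errors that actually indicate a cold start.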
 ## Contributing
 Contributions are welcome! Whether it's bug reports, feature suggestions, documentation improvements, or new bindings, please feel free to open an issue or submit a pull request on our [GitHub repository](https://github.com/ParisNeo/lollms_client).