quantalogic 0.31.1__tar.gz → 0.33.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {quantalogic-0.31.1 → quantalogic-0.33.0}/PKG-INFO +14 -1
- {quantalogic-0.31.1 → quantalogic-0.33.0}/README.md +13 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/pyproject.toml +1 -1
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/agent.py +73 -53
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/generative_model.py +7 -3
- quantalogic-0.33.0/quantalogic/get_model_info.py +83 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/llm.py +47 -9
- quantalogic-0.33.0/quantalogic/model_info.py +12 -0
- quantalogic-0.33.0/quantalogic/model_info_list.py +60 -0
- quantalogic-0.33.0/quantalogic/model_info_litellm.py +70 -0
- quantalogic-0.33.0/quantalogic/prompts.py +116 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/task_runner.py +11 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/replace_in_file_tool.py +1 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/__init__.py +2 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/get_all_models.py +1 -1
- quantalogic-0.33.0/quantalogic/utils/lm_studio_model_info.py +48 -0
- quantalogic-0.31.1/quantalogic/get_model_info.py +0 -44
- quantalogic-0.31.1/quantalogic/prompts.py +0 -119
- {quantalogic-0.31.1 → quantalogic-0.33.0}/LICENSE +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/__init__.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/agent_config.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/agent_factory.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/coding_agent.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/config.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/console_print_events.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/console_print_token.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/docs_cli.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/event_emitter.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/interactive_text_editor.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/main.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/memory.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/model_names.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/search_agent.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/__init__.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/agent_server.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/models.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/routes.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/state.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/static/js/event_visualizer.js +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/static/js/quantalogic.js +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/server/templates/index.html +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/task_file_reader.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tool_manager.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/__init__.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/agent_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/dalle_e.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/download_http_file_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/duckduckgo_search_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/edit_whole_content_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/elixir_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/execute_bash_command_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/generate_database_report_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/grep_app_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/input_question_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/jinja_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/__init__.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/c_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/cpp_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/go_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/java_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/javascript_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/python_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/rust_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/scala_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/language_handlers/typescript_handler.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/list_directory_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/llm_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/llm_vision_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/markitdown_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/nodejs_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/python_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/read_file_block_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/read_file_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/read_html_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/ripgrep_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/search_definition_names.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/serpapi_search_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/sql_query_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/task_complete_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/unified_diff_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/utils/__init__.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/utils/create_sample_database.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/utils/generate_database_report.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/wikipedia_search_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/tools/write_file_tool.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/ask_user_validation.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/check_version.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/download_http_file.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/get_coding_environment.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/get_environment.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/get_quantalogic_rules_content.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/git_ls.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/read_file.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/utils/read_http_text_content.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/version.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/version_check.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/welcome_message.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/xml_parser.py +0 -0
- {quantalogic-0.31.1 → quantalogic-0.33.0}/quantalogic/xml_tool_parser.py +0 -0
@@ -1,6 +1,6 @@
|
|
1
1
|
Metadata-Version: 2.1
|
2
2
|
Name: quantalogic
|
3
|
-
Version: 0.
|
3
|
+
Version: 0.33.0
|
4
4
|
Summary: QuantaLogic ReAct Agents
|
5
5
|
Author: Raphaël MANSUY
|
6
6
|
Author-email: raphael.mansuy@gmail.com
|
@@ -184,12 +184,25 @@ See our [Release Notes](RELEASE_NOTES.MD) for detailed version history and chang
|
|
184
184
|
| openrouter/openai/gpt-4o | OPENROUTER_API_KEY | OpenAI's GPT-4o model accessible through OpenRouter platform. |
|
185
185
|
| openrouter/mistralai/mistral-large-2411 | OPENROUTER_API_KEY | Mistral's large model optimized for complex reasoning tasks, available through OpenRouter with enhanced multilingual capabilities. |
|
186
186
|
| mistral/mistral-large-2407 | MISTRAL_API_KEY | Mistral's high-performance model designed for enterprise-grade applications, offering advanced reasoning and multilingual support. |
|
187
|
+
| nvidia/deepseek-ai/deepseek-r1 | NVIDIA_API_KEY | NVIDIA's DeepSeek R1 model optimized for high-performance AI tasks and advanced reasoning capabilities. |
|
188
|
+
| lm_studio/mistral-small-24b-instruct-2501 | LM_STUDIO_API_KEY | LM Studio's Mistral Small model optimized for local inference with advanced reasoning capabilities. |
|
187
189
|
| dashscope/qwen-max | DASHSCOPE_API_KEY | Alibaba's Qwen-Max model optimized for maximum performance and extensive reasoning capabilities. |
|
188
190
|
| dashscope/qwen-plus | DASHSCOPE_API_KEY | Alibaba's Qwen-Plus model offering balanced performance and cost-efficiency for a variety of tasks. |
|
189
191
|
| dashscope/qwen-turbo | DASHSCOPE_API_KEY | Alibaba's Qwen-Turbo model designed for fast and efficient responses, ideal for high-throughput scenarios. |
|
190
192
|
|
191
193
|
To configure the environment API key for Quantalogic using LiteLLM, set the required environment variable for your chosen provider and any optional variables like `OPENAI_API_BASE` or `OPENROUTER_REFERRER`. Use a `.env` file or a secrets manager to securely store these keys, and load them in your code using `python-dotenv`. For advanced configurations, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/).
|
192
194
|
|
195
|
+
### LM Studio Local Setup
|
196
|
+
|
197
|
+
To use LM Studio with the Mistral model locally, set the following environment variables:
|
198
|
+
|
199
|
+
```bash
|
200
|
+
export LM_STUDIO_API_BASE="http://localhost:1234/v1"
|
201
|
+
export LM_STUDIO_API_KEY="your-api-key-here"
|
202
|
+
```
|
203
|
+
|
204
|
+
Replace `http://localhost:1234/v1` with your LM Studio server URL and `your-api-key-here` with your actual API key.
|
205
|
+
|
193
206
|
|
194
207
|
## 📦 Installation
|
195
208
|
|
@@ -124,12 +124,25 @@ See our [Release Notes](RELEASE_NOTES.MD) for detailed version history and chang
|
|
124
124
|
| openrouter/openai/gpt-4o | OPENROUTER_API_KEY | OpenAI's GPT-4o model accessible through OpenRouter platform. |
|
125
125
|
| openrouter/mistralai/mistral-large-2411 | OPENROUTER_API_KEY | Mistral's large model optimized for complex reasoning tasks, available through OpenRouter with enhanced multilingual capabilities. |
|
126
126
|
| mistral/mistral-large-2407 | MISTRAL_API_KEY | Mistral's high-performance model designed for enterprise-grade applications, offering advanced reasoning and multilingual support. |
|
127
|
+
| nvidia/deepseek-ai/deepseek-r1 | NVIDIA_API_KEY | NVIDIA's DeepSeek R1 model optimized for high-performance AI tasks and advanced reasoning capabilities. |
|
128
|
+
| lm_studio/mistral-small-24b-instruct-2501 | LM_STUDIO_API_KEY | LM Studio's Mistral Small model optimized for local inference with advanced reasoning capabilities. |
|
127
129
|
| dashscope/qwen-max | DASHSCOPE_API_KEY | Alibaba's Qwen-Max model optimized for maximum performance and extensive reasoning capabilities. |
|
128
130
|
| dashscope/qwen-plus | DASHSCOPE_API_KEY | Alibaba's Qwen-Plus model offering balanced performance and cost-efficiency for a variety of tasks. |
|
129
131
|
| dashscope/qwen-turbo | DASHSCOPE_API_KEY | Alibaba's Qwen-Turbo model designed for fast and efficient responses, ideal for high-throughput scenarios. |
|
130
132
|
|
131
133
|
To configure the environment API key for Quantalogic using LiteLLM, set the required environment variable for your chosen provider and any optional variables like `OPENAI_API_BASE` or `OPENROUTER_REFERRER`. Use a `.env` file or a secrets manager to securely store these keys, and load them in your code using `python-dotenv`. For advanced configurations, refer to the [LiteLLM documentation](https://docs.litellm.ai/docs/).
|
132
134
|
|
135
|
+
### LM Studio Local Setup
|
136
|
+
|
137
|
+
To use LM Studio with the Mistral model locally, set the following environment variables:
|
138
|
+
|
139
|
+
```bash
|
140
|
+
export LM_STUDIO_API_BASE="http://localhost:1234/v1"
|
141
|
+
export LM_STUDIO_API_KEY="your-api-key-here"
|
142
|
+
```
|
143
|
+
|
144
|
+
Replace `http://localhost:1234/v1` with your LM Studio server URL and `your-api-key-here` with your actual API key.
|
145
|
+
|
133
146
|
|
134
147
|
## 📦 Installation
|
135
148
|
|
@@ -52,11 +52,7 @@ class ObserveResponseResult(BaseModel):
|
|
52
52
|
class Agent(BaseModel):
|
53
53
|
"""Enhanced QuantaLogic agent implementing ReAct framework."""
|
54
54
|
|
55
|
-
model_config = ConfigDict(
|
56
|
-
arbitrary_types_allowed=True,
|
57
|
-
validate_assignment=True,
|
58
|
-
extra="forbid"
|
59
|
-
)
|
55
|
+
model_config = ConfigDict(arbitrary_types_allowed=True, validate_assignment=True, extra="forbid")
|
60
56
|
|
61
57
|
specific_expertise: str
|
62
58
|
model: GenerativeModel
|
@@ -95,7 +91,7 @@ class Agent(BaseModel):
|
|
95
91
|
"""Initialize the agent with model, memory, tools, and configurations."""
|
96
92
|
try:
|
97
93
|
logger.debug("Initializing agent...")
|
98
|
-
|
94
|
+
|
99
95
|
# Create event emitter
|
100
96
|
event_emitter = EventEmitter()
|
101
97
|
|
@@ -142,9 +138,9 @@ class Agent(BaseModel):
|
|
142
138
|
compact_every_n_iterations=compact_every_n_iterations or 30,
|
143
139
|
max_tokens_working_memory=max_tokens_working_memory,
|
144
140
|
)
|
145
|
-
|
141
|
+
|
146
142
|
self._model_name = model_name
|
147
|
-
|
143
|
+
|
148
144
|
logger.debug(f"Memory will be compacted every {self.compact_every_n_iterations} iterations")
|
149
145
|
logger.debug(f"Max tokens for working memory set to: {self.max_tokens_working_memory}")
|
150
146
|
logger.debug("Agent initialized successfully.")
|
@@ -168,7 +164,9 @@ class Agent(BaseModel):
|
|
168
164
|
"""Clear the memory and reset the session."""
|
169
165
|
self._reset_session(clear_memory=True)
|
170
166
|
|
171
|
-
def solve_task(
|
167
|
+
def solve_task(
|
168
|
+
self, task: str, max_iterations: int = 30, streaming: bool = False, clear_memory: bool = True
|
169
|
+
) -> str:
|
172
170
|
"""Solve the given task using the ReAct framework.
|
173
171
|
|
174
172
|
Args:
|
@@ -182,7 +180,7 @@ class Agent(BaseModel):
|
|
182
180
|
str: The final response after task completion.
|
183
181
|
"""
|
184
182
|
logger.debug(f"Solving task... {task}")
|
185
|
-
self._reset_session(task_to_solve=task, max_iterations=max_iterations,clear_memory=clear_memory)
|
183
|
+
self._reset_session(task_to_solve=task, max_iterations=max_iterations, clear_memory=clear_memory)
|
186
184
|
|
187
185
|
# Generate task summary
|
188
186
|
self.task_to_solve_summary = self._generate_task_summary(task)
|
@@ -228,7 +226,9 @@ class Agent(BaseModel):
|
|
228
226
|
# For streaming, collect the response chunks
|
229
227
|
content = ""
|
230
228
|
for chunk in self.model.generate_with_history(
|
231
|
-
messages_history=self.memory.memory,
|
229
|
+
messages_history=self.memory.memory,
|
230
|
+
prompt=current_prompt,
|
231
|
+
streaming=True,
|
232
232
|
):
|
233
233
|
content += chunk
|
234
234
|
|
@@ -245,7 +245,8 @@ class Agent(BaseModel):
|
|
245
245
|
)
|
246
246
|
else:
|
247
247
|
result = self.model.generate_with_history(
|
248
|
-
messages_history=self.memory.memory, prompt=current_prompt, streaming=False
|
248
|
+
messages_history=self.memory.memory, prompt=current_prompt, streaming=False,
|
249
|
+
stop_words=["thinking"]
|
249
250
|
)
|
250
251
|
|
251
252
|
content = result.response
|
@@ -296,7 +297,7 @@ class Agent(BaseModel):
|
|
296
297
|
|
297
298
|
return answer
|
298
299
|
|
299
|
-
def _reset_session(self, task_to_solve: str = "", max_iterations: int = 30,clear_memory: bool = True):
|
300
|
+
def _reset_session(self, task_to_solve: str = "", max_iterations: int = 30, clear_memory: bool = True):
|
300
301
|
"""Reset the agent's session."""
|
301
302
|
logger.debug("Resetting session...")
|
302
303
|
self.task_to_solve = task_to_solve
|
@@ -316,29 +317,30 @@ class Agent(BaseModel):
|
|
316
317
|
def _compact_memory_if_needed(self, current_prompt: str = ""):
|
317
318
|
"""Compacts the memory if it exceeds the maximum occupancy or token limit."""
|
318
319
|
ratio_occupied = self._calculate_context_occupancy()
|
319
|
-
|
320
|
+
|
320
321
|
# Compact memory if any of these conditions are met:
|
321
322
|
# 1. Memory occupancy exceeds MAX_OCCUPANCY, or
|
322
323
|
# 2. Current iteration is a multiple of compact_every_n_iterations, or
|
323
324
|
# 3. Working memory exceeds max_tokens_working_memory (if set)
|
324
325
|
should_compact_by_occupancy = ratio_occupied >= MAX_OCCUPANCY
|
325
326
|
should_compact_by_iteration = (
|
326
|
-
self.compact_every_n_iterations is not None
|
327
|
-
self.current_iteration > 0
|
328
|
-
self.current_iteration % self.compact_every_n_iterations == 0
|
327
|
+
self.compact_every_n_iterations is not None
|
328
|
+
and self.current_iteration > 0
|
329
|
+
and self.current_iteration % self.compact_every_n_iterations == 0
|
329
330
|
)
|
330
331
|
should_compact_by_token_limit = (
|
331
|
-
self.max_tokens_working_memory is not None and
|
332
|
-
self.total_tokens > self.max_tokens_working_memory
|
332
|
+
self.max_tokens_working_memory is not None and self.total_tokens > self.max_tokens_working_memory
|
333
333
|
)
|
334
|
-
|
334
|
+
|
335
335
|
if should_compact_by_occupancy or should_compact_by_iteration or should_compact_by_token_limit:
|
336
336
|
if should_compact_by_occupancy:
|
337
337
|
logger.debug(f"Memory compaction triggered: Occupancy {ratio_occupied}% exceeds {MAX_OCCUPANCY}%")
|
338
|
-
|
338
|
+
|
339
339
|
if should_compact_by_iteration:
|
340
|
-
logger.debug(
|
341
|
-
|
340
|
+
logger.debug(
|
341
|
+
f"Memory compaction triggered: Iteration {self.current_iteration} is a multiple of {self.compact_every_n_iterations}"
|
342
|
+
)
|
343
|
+
|
342
344
|
self._emit_event("memory_full")
|
343
345
|
self.memory.compact()
|
344
346
|
self.total_tokens = self.model.token_counter_with_history(self.memory.memory, current_prompt)
|
@@ -399,7 +401,7 @@ class Agent(BaseModel):
|
|
399
401
|
return self._handle_tool_execution_failure(response)
|
400
402
|
|
401
403
|
variable_name = self.variable_store.add(response)
|
402
|
-
new_prompt = self._format_observation_response(response, variable_name, iteration)
|
404
|
+
new_prompt = self._format_observation_response(response, executed_tool, variable_name, iteration)
|
403
405
|
|
404
406
|
return ObserveResponseResult(
|
405
407
|
next_prompt=new_prompt,
|
@@ -414,7 +416,7 @@ class Agent(BaseModel):
|
|
414
416
|
"""Extract tool usage from the response content."""
|
415
417
|
if not content or not isinstance(content, str):
|
416
418
|
return {}
|
417
|
-
|
419
|
+
|
418
420
|
xml_parser = ToleranceXMLParser()
|
419
421
|
tool_names = self.tools.tool_names()
|
420
422
|
return xml_parser.extract_elements(text=content, element_names=tool_names)
|
@@ -461,7 +463,7 @@ class Agent(BaseModel):
|
|
461
463
|
answer=None,
|
462
464
|
)
|
463
465
|
|
464
|
-
def _handle_repeated_tool_call(self, tool_name: str, arguments_with_values: dict) -> (str,str):
|
466
|
+
def _handle_repeated_tool_call(self, tool_name: str, arguments_with_values: dict) -> (str, str):
|
465
467
|
"""Handle the case where a tool call is repeated."""
|
466
468
|
repeat_count = self.last_tool_call.get("count", 0)
|
467
469
|
error_message = (
|
@@ -494,7 +496,9 @@ class Agent(BaseModel):
|
|
494
496
|
answer=None,
|
495
497
|
)
|
496
498
|
|
497
|
-
def _format_observation_response(
|
499
|
+
def _format_observation_response(
|
500
|
+
self, response: str, last_exectured_tool: str, variable_name: str, iteration: int
|
501
|
+
) -> str:
|
498
502
|
"""Format the observation response with the given response, variable name, and iteration."""
|
499
503
|
response_display = response
|
500
504
|
if len(response) > MAX_RESPONSE_LENGTH:
|
@@ -504,29 +508,45 @@ class Agent(BaseModel):
|
|
504
508
|
)
|
505
509
|
|
506
510
|
# Format the response message
|
507
|
-
formatted_response = (
|
508
|
-
|
509
|
-
f"
|
510
|
-
|
511
|
-
f"
|
512
|
-
f"
|
513
|
-
f"
|
514
|
-
"
|
515
|
-
|
516
|
-
"
|
517
|
-
"
|
518
|
-
"
|
511
|
+
formatted_response = formatted_response = (
|
512
|
+
"# Analysis and Next Action Decision Point\n\n"
|
513
|
+
f"📊 Progress: Iteration {iteration}/{self.max_iterations}\n\n"
|
514
|
+
"## Current Context\n"
|
515
|
+
f"```\n{self.task_to_solve_summary}```\n\n"
|
516
|
+
f"## Latest Tool {last_exectured_tool} Execution Result:\n"
|
517
|
+
f"Variable: ${variable_name}$\n"
|
518
|
+
f"```\n{response_display}```\n\n"
|
519
|
+
"## Available Resources\n"
|
520
|
+
f"🛠️ Tools:\n{self._get_tools_names_prompt()}\n\n"
|
521
|
+
f"📦 Variables:\n{self._get_variable_prompt()}\n\n"
|
522
|
+
"## Your Task\n"
|
523
|
+
"1. Analyze the execution result and progress, formalize if the current step is solved according to the task.\n"
|
524
|
+
"2. Determine the most effective next step\n"
|
525
|
+
"3. Select exactly ONE tool from the available list\n"
|
526
|
+
"4. Utilize variable interpolation where needed\n"
|
527
|
+
"## Response Requirements\n"
|
528
|
+
"Provide TWO markdown-formatted XML blocks:\n"
|
529
|
+
"1. Your analysis of the progression resulting from the execution of the tool in <thinking> tags, don't include <context_analysis/>\n"
|
530
|
+
"2. Your tool execution plan in <tool_name> tags\n\n"
|
531
|
+
"## Response Format\n"
|
519
532
|
"```xml\n"
|
520
533
|
"<thinking>\n"
|
521
|
-
"
|
534
|
+
"[Detailed analysis of progress, and reasoning for next step]\n"
|
522
535
|
"</thinking>\n"
|
523
536
|
"```\n"
|
524
537
|
"```xml\n"
|
525
|
-
"<
|
526
|
-
"
|
527
|
-
"
|
528
|
-
"
|
529
|
-
|
538
|
+
"<action>\n"
|
539
|
+
"<selected_tool_name>\n"
|
540
|
+
"[Precise instruction for tool execution]\n"
|
541
|
+
"</selected_tool_name>\n"
|
542
|
+
"</action>\n"
|
543
|
+
"```\n\n"
|
544
|
+
"⚠️ Important:\n"
|
545
|
+
"- Respond ONLY with the two XML blocks\n"
|
546
|
+
"- No additional commentary\n"
|
547
|
+
"- If previous step failed, revise approach\n"
|
548
|
+
"- Ensure variable interpolation syntax is correct\n"
|
549
|
+
"- Utilize the <task_complete> tool to indicate task completion, display the result or if the task is deemed unfeasible.")
|
530
550
|
|
531
551
|
return formatted_response
|
532
552
|
|
@@ -589,10 +609,10 @@ class Agent(BaseModel):
|
|
589
609
|
arguments_with_values_interpolated = {
|
590
610
|
key: self._interpolate_variables(value) for key, value in arguments_with_values.items()
|
591
611
|
}
|
592
|
-
|
612
|
+
|
593
613
|
arguments_with_values_interpolated = arguments_with_values_interpolated
|
594
614
|
|
595
|
-
# test if tool need variables in context
|
615
|
+
# test if tool need variables in context
|
596
616
|
if tool.need_variables:
|
597
617
|
# Inject variables into the tool if needed
|
598
618
|
arguments_with_values_interpolated["variables"] = self.variable_store
|
@@ -603,8 +623,7 @@ class Agent(BaseModel):
|
|
603
623
|
try:
|
604
624
|
# Convert arguments to proper types
|
605
625
|
converted_args = self.tools.validate_and_convert_arguments(
|
606
|
-
tool_name,
|
607
|
-
arguments_with_values_interpolated
|
626
|
+
tool_name, arguments_with_values_interpolated
|
608
627
|
)
|
609
628
|
except ValueError as e:
|
610
629
|
return "", f"Argument Error: {str(e)}"
|
@@ -637,9 +656,10 @@ class Agent(BaseModel):
|
|
637
656
|
"""Interpolate variables using $var$ syntax in the given text."""
|
638
657
|
try:
|
639
658
|
import re
|
659
|
+
|
640
660
|
for var in self.variable_store.keys():
|
641
661
|
# Escape the variable name for regex, but use raw value for replacement
|
642
|
-
pattern = rf
|
662
|
+
pattern = rf"\${re.escape(var)}\$"
|
643
663
|
replacement = self.variable_store[var]
|
644
664
|
text = re.sub(pattern, replacement, text)
|
645
665
|
return text
|
@@ -729,9 +749,7 @@ class Agent(BaseModel):
|
|
729
749
|
# Remove the last assistant / user message
|
730
750
|
user_message = memory_copy.pop()
|
731
751
|
assistant_message = memory_copy.pop()
|
732
|
-
summary = self.model.generate_with_history(
|
733
|
-
messages_history=memory_copy, prompt=prompt_summary
|
734
|
-
)
|
752
|
+
summary = self.model.generate_with_history(messages_history=memory_copy, prompt=prompt_summary)
|
735
753
|
# Remove user message
|
736
754
|
memory_copy.pop()
|
737
755
|
# Replace by summary
|
@@ -751,6 +769,8 @@ class Agent(BaseModel):
|
|
751
769
|
str: Generated task summary
|
752
770
|
"""
|
753
771
|
try:
|
772
|
+
if len(content) < 200:
|
773
|
+
return content
|
754
774
|
prompt = (
|
755
775
|
"Create an ultra-concise task summary that captures ONLY: \n"
|
756
776
|
"1. Primary objective/purpose\n"
|
@@ -123,7 +123,8 @@ class GenerativeModel:
|
|
123
123
|
|
124
124
|
# Generate a response with conversation history and optional streaming
|
125
125
|
def generate_with_history(
|
126
|
-
self, messages_history: list[Message], prompt: str, image_url: str | None = None, streaming: bool = False
|
126
|
+
self, messages_history: list[Message], prompt: str, image_url: str | None = None, streaming: bool = False,
|
127
|
+
stop_words: list[str] | None = None
|
127
128
|
) -> ResponseStats:
|
128
129
|
"""Generate a response with conversation history and optional image.
|
129
130
|
|
@@ -132,6 +133,7 @@ class GenerativeModel:
|
|
132
133
|
prompt: Current user prompt.
|
133
134
|
image_url: Optional image URL for visual queries.
|
134
135
|
streaming: Whether to stream the response.
|
136
|
+
stop_words: Optional list of stop words for streaming
|
135
137
|
|
136
138
|
Returns:
|
137
139
|
Detailed response statistics or a generator in streaming mode.
|
@@ -163,6 +165,7 @@ class GenerativeModel:
|
|
163
165
|
model=self.model,
|
164
166
|
messages=messages,
|
165
167
|
num_retries=MIN_RETRIES,
|
168
|
+
stop=stop_words,
|
166
169
|
)
|
167
170
|
|
168
171
|
token_usage = TokenUsage(
|
@@ -181,7 +184,7 @@ class GenerativeModel:
|
|
181
184
|
except Exception as e:
|
182
185
|
self._handle_generation_exception(e)
|
183
186
|
|
184
|
-
def _stream_response(self, messages):
|
187
|
+
def _stream_response(self, messages, stop_words: list[str] | None = None):
|
185
188
|
"""Private method to handle streaming responses."""
|
186
189
|
try:
|
187
190
|
for chunk in generate_completion(
|
@@ -189,7 +192,8 @@ class GenerativeModel:
|
|
189
192
|
model=self.model,
|
190
193
|
messages=messages,
|
191
194
|
num_retries=MIN_RETRIES,
|
192
|
-
stream=True, # Enable streaming
|
195
|
+
stream=True, # Enable streaming,
|
196
|
+
stop=stop_words,
|
193
197
|
):
|
194
198
|
if chunk.choices[0].delta.content is not None:
|
195
199
|
self.event_emitter.emit("stream_chunk", chunk.choices[0].delta.content)
|
@@ -0,0 +1,83 @@
|
|
1
|
+
import loguru
|
2
|
+
|
3
|
+
from quantalogic.model_info_list import model_info
|
4
|
+
from quantalogic.model_info_litellm import litellm_get_model_max_input_tokens, litellm_get_model_max_output_tokens
|
5
|
+
from quantalogic.utils.lm_studio_model_info import ModelInfo, get_model_list
|
6
|
+
|
7
|
+
DEFAULT_MAX_OUTPUT_TOKENS = 4 * 1024 # Reasonable default for most models
|
8
|
+
DEFAULT_MAX_INPUT_TOKENS = 32 * 1024 # Reasonable default for most models
|
9
|
+
|
10
|
+
|
11
|
+
def validate_model_name(model_name: str) -> None:
|
12
|
+
if not isinstance(model_name, str) or not model_name.strip():
|
13
|
+
raise ValueError(f"Invalid model name: {model_name}")
|
14
|
+
|
15
|
+
|
16
|
+
def print_model_info():
|
17
|
+
for info in model_info.values():
|
18
|
+
print(f"\n{info.model_name}:")
|
19
|
+
print(f" Max Input Tokens: {info.max_input_tokens:,}")
|
20
|
+
print(f" Max Output Tokens: {info.max_output_tokens:,}")
|
21
|
+
|
22
|
+
|
23
|
+
def get_max_output_tokens(model_name: str) -> int:
|
24
|
+
"""Get max output tokens with safe fallback"""
|
25
|
+
validate_model_name(model_name)
|
26
|
+
|
27
|
+
if model_name.startswith('lm_studio/'):
|
28
|
+
try:
|
29
|
+
models = get_model_list()
|
30
|
+
for model in models.data:
|
31
|
+
if model.id == model_name[len('lm_studio/'):]:
|
32
|
+
return model.max_context_length
|
33
|
+
except Exception:
|
34
|
+
loguru.logger.warning(f"Could not fetch LM Studio model info for {model_name}, using default")
|
35
|
+
|
36
|
+
if model_name in model_info:
|
37
|
+
return model_info[model_name].max_output_tokens
|
38
|
+
|
39
|
+
try:
|
40
|
+
return litellm_get_model_max_output_tokens(model_name)
|
41
|
+
except Exception as e:
|
42
|
+
loguru.logger.warning(f"Model {model_name} not found in LiteLLM registry, using default")
|
43
|
+
return DEFAULT_MAX_OUTPUT_TOKENS
|
44
|
+
|
45
|
+
|
46
|
+
def get_max_input_tokens(model_name: str) -> int:
|
47
|
+
"""Get max input tokens with safe fallback"""
|
48
|
+
validate_model_name(model_name)
|
49
|
+
|
50
|
+
if model_name.startswith('lm_studio/'):
|
51
|
+
try:
|
52
|
+
models = get_model_list()
|
53
|
+
for model in models.data:
|
54
|
+
if model.id == model_name[len('lm_studio/'):]:
|
55
|
+
return model.max_context_length
|
56
|
+
except Exception:
|
57
|
+
loguru.logger.warning(f"Could not fetch LM Studio model info for {model_name}, using default")
|
58
|
+
|
59
|
+
if model_name in model_info:
|
60
|
+
return model_info[model_name].max_input_tokens
|
61
|
+
|
62
|
+
try:
|
63
|
+
return litellm_get_model_max_input_tokens(model_name)
|
64
|
+
except Exception:
|
65
|
+
loguru.logger.warning(f"Model {model_name} not found in LiteLLM registry, using default")
|
66
|
+
return DEFAULT_MAX_INPUT_TOKENS
|
67
|
+
|
68
|
+
|
69
|
+
def get_max_tokens(model_name: str) -> int:
|
70
|
+
"""Get total maximum tokens (input + output)"""
|
71
|
+
validate_model_name(model_name)
|
72
|
+
|
73
|
+
# Get input and output tokens separately
|
74
|
+
input_tokens = get_max_input_tokens(model_name)
|
75
|
+
output_tokens = get_max_output_tokens(model_name)
|
76
|
+
|
77
|
+
return input_tokens + output_tokens
|
78
|
+
|
79
|
+
|
80
|
+
if __name__ == "__main__":
|
81
|
+
print_model_info()
|
82
|
+
print(get_max_input_tokens("gpt-4o-mini"))
|
83
|
+
print(get_max_output_tokens("openrouter/openai/gpt-4o-mini"))
|
@@ -30,18 +30,56 @@ def get_model_info(model_name: str) -> dict | None:
|
|
30
30
|
return model_info.get(model_name, None)
|
31
31
|
|
32
32
|
|
33
|
+
class ModelProviderConfig:
    """Routing configuration for a prefixed model name served through an
    OpenAI-compatible endpoint.

    Holds the model-name prefix, the litellm provider id, the endpoint base
    URL, and the name of the environment variable carrying the API key, and
    knows how to rewrite a litellm kwargs dict accordingly.
    """

    def __init__(self, prefix: str, provider: str, base_url: str, env_var: str):
        self.prefix = prefix      # e.g. "nvidia/" — stripped from the model name
        self.provider = provider  # value passed as litellm custom_llm_provider
        self.base_url = base_url  # OpenAI-compatible endpoint for this provider
        self.env_var = env_var    # environment variable holding the API key

    def configure(self, model: str, kwargs: Dict[str, Any]) -> None:
        """Populate *kwargs* in place for a litellm call against this provider.

        Raises:
            ValueError: if the API-key environment variable is not set.
        """
        # Strip only the LEADING prefix. The original str.replace() removed
        # every occurrence of the prefix substring, mangling model names that
        # contain it again later (e.g. "ai/" inside "mistralai/...").
        kwargs["model"] = model.removeprefix(self.prefix)
        kwargs["custom_llm_provider"] = self.provider
        kwargs["base_url"] = self.base_url
        api_key = os.getenv(self.env_var)
        if not api_key:
            raise ValueError(f"{self.env_var} is not set in the environment variables.")
        kwargs["api_key"] = api_key
|
48
|
+
|
49
|
+
|
50
|
+
# Default provider configurations.
# Keyed by provider name; a model whose name starts with a config's prefix is
# routed to that provider's endpoint. All three use provider="openai", i.e.
# they are treated as OpenAI-compatible endpoints by litellm.
# NOTE(review): the OVH base_url is pinned to a single model deployment
# (deepseek-r1-distill-llama-70b) — confirm it serves other ovh/ models.
PROVIDERS = {
    "dashscope": ModelProviderConfig(
        prefix="dashscope/",
        provider="openai",
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
        env_var="DASHSCOPE_API_KEY"
    ),
    "nvidia": ModelProviderConfig(
        prefix="nvidia/",
        provider="openai",
        base_url="https://integrate.api.nvidia.com/v1",
        env_var="NVIDIA_API_KEY"
    ),
    "ovh": ModelProviderConfig(
        prefix="ovh/",
        provider="openai",
        base_url="https://deepseek-r1-distill-llama-70b.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1",
        env_var="OVH_API_KEY"
    )
}
|
71
|
+
|
72
|
+
|
33
73
|
def generate_completion(**kwargs: Any) -> Any:
    """Wraps litellm completion with proper type hints.

    If the requested model name starts with a known provider prefix (see
    PROVIDERS), *kwargs* is rewritten in place — model, custom_llm_provider,
    base_url and api_key — before delegating to litellm's completion.

    Raises:
        ValueError: if the matched provider's API-key env var is unset.
    """
    # Note: **kwargs annotates the VALUE type of each keyword argument, so
    # the original `**kwargs: Dict[str, Any]` was incorrect; `Any` is right.
    model = kwargs.get("model", "")

    # Route to the first provider whose prefix matches the model name.
    # The dict keys are not needed here, so iterate values() directly.
    for provider_config in PROVIDERS.values():
        if model.startswith(provider_config.prefix):
            provider_config.configure(model, kwargs)
            break

    return completion(**kwargs)
|
46
84
|
|
47
85
|
|
@@ -0,0 +1,12 @@
|
|
1
|
+
from pydantic import BaseModel
|
2
|
+
|
3
|
+
|
4
|
+
class ModelInfo(BaseModel):
    """Token-limit metadata for a single LLM in the local registry."""

    # Fully-qualified model identifier, e.g. "dashscope/qwen-max"
    model_name: str
    # Maximum number of tokens accepted as prompt/context
    max_input_tokens: int
    # Maximum number of tokens the model may generate
    max_output_tokens: int
    # Optional chain-of-thought token budget (set only for reasoning models)
    max_cot_tokens: int | None = None
|
9
|
+
|
10
|
+
|
11
|
+
class ModelNotFoundError(Exception):
    """Signals that the requested model has no entry in the local registry."""
|
@@ -0,0 +1,60 @@
|
|
1
|
+
from quantalogic.model_info import ModelInfo
|
2
|
+
|
3
|
+
# Local token-limit registry, keyed by fully-qualified model name. Lookups in
# llm.py consult this dict BEFORE falling back to the LiteLLM registry, so it
# acts as an override/supplement for models LiteLLM lacks or misreports.
# Values mirror provider-published limits — TODO confirm against current docs.
model_info = {
    "dashscope/qwen-max": ModelInfo(
        model_name="dashscope/qwen-max",
        max_output_tokens=8 * 1024,
        max_input_tokens=32 * 1024,
    ),
    "dashscope/qwen-plus": ModelInfo(
        model_name="dashscope/qwen-plus",
        max_output_tokens=8 * 1024,
        max_input_tokens=131072,
    ),
    "dashscope/qwen-turbo": ModelInfo(
        model_name="dashscope/qwen-turbo",
        max_output_tokens=8 * 1024,
        max_input_tokens=1000000,
    ),
    "deepseek-reasoner": ModelInfo(
        model_name="deepseek-reasoner",
        max_output_tokens=8 * 1024,
        max_input_tokens=1024 * 128,
    ),
    "openrouter/deepseek/deepseek-r1": ModelInfo(
        model_name="openrouter/deepseek/deepseek-r1",
        max_output_tokens=8 * 1024,
        max_input_tokens=1024 * 128,
    ),
    # NOTE(review): 128K max_output_tokens on the two mistral-large entries is
    # unusually high compared to every other entry (8K) — confirm intended.
    "openrouter/mistralai/mistral-large-2411": ModelInfo(
        model_name="openrouter/mistralai/mistral-large-2411",
        max_output_tokens=128 * 1024,
        max_input_tokens=1024 * 128,
    ),
    "mistralai/mistral-large-2411": ModelInfo(
        model_name="mistralai/mistral-large-2411",
        max_output_tokens=128 * 1024,
        max_input_tokens=1024 * 128,
    ),
    "deepseek/deepseek-chat": ModelInfo(
        model_name="deepseek/deepseek-chat",
        max_output_tokens=8 * 1024,
        max_input_tokens=1024 * 64,
    ),
    # Only entry with a chain-of-thought budget (max_cot_tokens).
    "deepseek/deepseek-reasoner": ModelInfo(
        model_name="deepseek/deepseek-reasoner",
        max_output_tokens=8 * 1024,
        max_input_tokens=1024 * 64,
        max_cot_tokens=1024 * 32,
    ),
    "nvidia/deepseek-ai/deepseek-r1": ModelInfo(
        model_name="nvidia/deepseek-ai/deepseek-r1",
        max_output_tokens=8 * 1024,
        max_input_tokens=1024 * 64,
    ),
    "ovh/DeepSeek-R1-Distill-Llama-70B": ModelInfo(
        model_name="ovh/DeepSeek-R1-Distill-Llama-70B",
        max_output_tokens=8 * 1024,
        max_input_tokens=1024 * 64,
    ),
}
|