PyPI - dao-ai - Versions diffs - 0.1.19__tar.gz → 0.1.21__tar.gz - Mend

dao-ai 0.1.19tar.gz → 0.1.21tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (344)

{dao_ai-0.1.19 → dao_ai-0.1.21}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: dao-ai
-Version: 0.1.19
+Version: 0.1.21
 Summary: DAO AI: A modular, multi-agent orchestration framework for complex AI workflows. Supports agent handoff, tool integration, and dynamic configuration via YAML.
 Project-URL: Homepage, https://github.com/natefleming/dao-ai
 Project-URL: Documentation, https://natefleming.github.io/dao-ai

{dao_ai-0.1.19 → dao_ai-0.1.21}/config/examples/04_genie/README.md RENAMED

@@ -115,7 +115,7 @@ genie_tool:
         embedding_model: *embedding_model
         similarity_threshold: 0.85
         time_to_live_seconds: 3600
-        context_window_size: 3
+        context_window_size: 2  # default
 ```
 ### In-Memory Semantic Cache (Single-Instance)
@@ -141,7 +141,7 @@ genie_tool:
         similarity_threshold: 0.85
         time_to_live_seconds: 604800  # 1 week
         capacity: 1000                # LRU eviction when full
-        context_window_size: 3
+        context_window_size: 2  # default
 ```
 ## Cache Flow

dao_ai-0.1.21/config/examples/04_genie/cache_threshold_optimization.yaml ADDED

@@ -0,0 +1,168 @@
+# yaml-language-server: $schema=../../../schemas/model_config_schema.json
+#
+# Example configuration for context-aware cache threshold optimization.
+#
+# This configuration demonstrates how to:
+#   1. Define an evaluation dataset with question pairs
+#   2. Configure threshold optimization parameters
+#   3. Run Optuna Bayesian optimization to find optimal thresholds
+#
+# The optimizer tunes these parameters:
+#   - similarity_threshold: Minimum similarity for question matching (0.5-0.99)
+#   - context_similarity_threshold: Minimum similarity for context matching (0.5-0.99)
+#   - question_weight: Weight for question vs context in combined score (0.1-0.9)
+#
+# Usage:
+#   1. Update the evaluation dataset with your domain-specific question pairs
+#   2. Run: config.optimizations.optimize() or use the notebook
+#   3. Apply the optimized thresholds to your cache configuration
+schemas:
+  quick_serve_restaurant_schema: &quick_serve_restaurant_schema
+    catalog_name: retail_consumer_goods
+    schema_name: quick_serve_restaurant
+resources:
+  llms:
+    # Judge model for semantic equivalence evaluation
+    # Used when expected_match is not provided for an entry
+    judge_model: &judge_model
+      name: databricks-meta-llama-3-3-70b-instruct
+      temperature: 0.0  # Low temperature for consistent judgments
+      max_tokens: 10    # Only need "MATCH" or "NO_MATCH"
+    # Embedding model for generating embeddings
+    embedding_model: &embedding_model
+      name: databricks-gte-large-en
+  warehouses:
+    shared_endpoint_warehouse: &shared_endpoint_warehouse
+      name: "Shared Endpoint Warehouse"
+      warehouse_id: 148ccb90800933a1
+  databases:
+    semantic_cache_db: &semantic_cache_db
+      name: "Retail and Consumer Goods Database"
+      instance_name: "retail-consumer-goods"
+# =============================================================================
+# OPTIMIZATIONS CONFIGURATION
+# =============================================================================
+# Configure cache threshold optimizations using Optuna Bayesian optimization.
+optimizations:
+  cache_threshold_optimizations:
+    # ---------------------------------------------------------------------------
+    # Retail Cache Threshold Optimization
+    # ---------------------------------------------------------------------------
+    optimize_retail_cache_thresholds:
+      name: optimize_retail_cache_thresholds
+      # Current thresholds to improve (optional - uses defaults if not provided)
+      cache_parameters:
+        database: *semantic_cache_db
+        warehouse: *shared_endpoint_warehouse
+        embedding_model: *embedding_model
+        similarity_threshold: 0.85           # Question matching threshold
+        context_similarity_threshold: 0.80   # Context matching threshold
+        question_weight: 0.6                 # Weight for question (context = 1 - question)
+        time_to_live_seconds: 86400
+      # Evaluation dataset with question pairs
+      # Each entry contains:
+      #   - question/context: The incoming query
+      #   - cached_question/cached_context: The cached entry to compare against
+      #   - expected_match: true (should match), false (should not), or omit for LLM judge
+      #
+      # Note: Embeddings are normally pre-computed. Use the notebook or
+      # generate_eval_dataset_from_cache() to create a dataset from cache entries.
+      dataset:
+        name: retail_cache_eval_dataset
+        description: "Evaluation dataset for retail domain semantic cache tuning"
+        entries: []
+        # In practice, populate with real entries like:
+        #
+        # entries:
+        #   # Positive pair - paraphrases that should match
+        #   - question: "What are total sales for Q1?"
+        #     question_embedding: [0.1, 0.2, ...]  # Pre-computed embeddings
+        #     context: "Previous: Show me revenue breakdown"
+        #     context_embedding: [0.1, 0.2, ...]
+        #     cached_question: "Show me Q1 total sales"
+        #     cached_question_embedding: [0.1, 0.2, ...]
+        #     cached_context: "Previous: Show me revenue breakdown"
+        #     cached_context_embedding: [0.1, 0.2, ...]
+        #     expected_match: true
+        #
+        #   # Negative pair - different questions that should NOT match
+        #   - question: "What is inventory count by store?"
+        #     question_embedding: [0.3, 0.1, ...]
+        #     context: ""
+        #     context_embedding: [0.0, 0.0, ...]
+        #     cached_question: "Show revenue by region"
+        #     cached_question_embedding: [0.5, 0.6, ...]
+        #     cached_context: ""
+        #     cached_context_embedding: [0.0, 0.0, ...]
+        #     expected_match: false
+        #
+        #   # Unlabeled entry - LLM judge will determine
+        #   - question: "How many items sold last week?"
+        #     question_embedding: [0.2, 0.3, ...]
+        #     context: "Previous: Filter by electronics"
+        #     context_embedding: [0.1, 0.4, ...]
+        #     cached_question: "Total items sold in past 7 days"
+        #     cached_question_embedding: [0.2, 0.35, ...]
+        #     cached_context: "Previous: Filter by electronics"
+        #     cached_context_embedding: [0.1, 0.4, ...]
+        #     # expected_match omitted - will use LLM judge
+      # LLM for judging unlabeled entries
+      judge_model: *judge_model
+      # Optimization parameters
+      n_trials: 50                              # Number of Optuna trials (more = better results)
+      metric: f1                                # Metric to optimize: f1, precision, recall, fbeta
+      beta: 1.0                                 # Beta for fbeta metric (higher = favor recall)
+      seed: 42                                  # Random seed for reproducibility
+# =============================================================================
+# USAGE INSTRUCTIONS
+# =============================================================================
+#
+# 1. PREPARE EVALUATION DATA:
+#    Generate embeddings for your question pairs using the embedding model.
+#    You can use the notebook or the generate_eval_dataset_from_cache() function
+#    to create a dataset from existing cache entries.
+#
+# 2. RUN OPTIMIZATION:
+#    Use the notebook notebooks/11_optimize_context_aware_genie_cache.py,
+#    or run programmatically:
+#
+#    ```python
+#    from dao_ai.config import AppConfig
+#
+#    config = AppConfig.from_file("cache_threshold_optimization.yaml")
+#
+#    # Run all optimizations (prompts and cache thresholds)
+#    results = config.optimizations.optimize()
+#
+#    # Or run a specific cache threshold optimization
+#    optimization = config.optimizations.cache_threshold_optimizations["optimize_retail_cache_thresholds"]
+#    result = optimization.optimize()
+#
+#    print(f"Optimized thresholds: {result.optimized_thresholds}")
+#    print(f"Improvement: {result.improvement:.1%}")
+#    ```
+#
+# 3. APPLY RESULTS:
+#    Update your semantic cache configuration with the optimized values:
+#
+#    semantic_cache_parameters:
+#      similarity_threshold: <optimized_value>
+#      context_similarity_threshold: <optimized_value>
+#      question_weight: <optimized_value>
+#
+# 4. MONITOR:
+#    Track cache hit rates and accuracy in production to validate improvements.
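Reviewer note: the config above names a combined question/context score and an `fbeta` metric but does not show either calculation. The following is a minimal sketch of both, under stated assumptions. `LabeledPair`, `combined_score`, `predict_match`, and `fbeta` are hypothetical names for illustration and are not taken from the dao-ai package. The decision rule (a pair counts as a cache hit only when both per-signal thresholds pass) is an assumption, and the similarity values are invented. The weighting follows the YAML comment "context = 1 - question"; the F-beta formula is the standard one.

```python
from dataclasses import dataclass


@dataclass
class LabeledPair:
    """One evaluation entry, reduced to precomputed similarities (hypothetical)."""
    question_similarity: float  # similarity(question, cached_question)
    context_similarity: float   # similarity(context, cached_context)
    expected_match: bool


def combined_score(question_sim: float, context_sim: float, question_weight: float) -> float:
    """Weighted blend; the context weight is 1 - question_weight, per the YAML comment."""
    return question_weight * question_sim + (1.0 - question_weight) * context_sim


def predict_match(pair: LabeledPair, similarity_threshold: float,
                  context_similarity_threshold: float) -> bool:
    """ASSUMPTION: a hit requires both similarities to clear their thresholds."""
    return (pair.question_similarity >= similarity_threshold
            and pair.context_similarity >= context_similarity_threshold)


def fbeta(predictions: list[bool], labels: list[bool], beta: float = 1.0) -> float:
    """Standard F-beta: (1 + b^2) * P * R / (b^2 * P + R)."""
    tp = sum(p and l for p, l in zip(predictions, labels))
    fp = sum(p and not l for p, l in zip(predictions, labels))
    fn = sum((not p) and l for p, l in zip(predictions, labels))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Invented similarities, scored against the thresholds used in the example config.
pairs = [
    LabeledPair(0.92, 0.95, True),   # both pass: true positive
    LabeledPair(0.95, 0.90, True),   # both pass: true positive
    LabeledPair(0.88, 0.60, True),   # context too low: false negative
    LabeledPair(0.40, 0.30, False),  # both fail: true negative
]
predictions = [predict_match(p, 0.85, 0.80) for p in pairs]
labels = [p.expected_match for p in pairs]

print(f"combined score of first pair: {combined_score(0.92, 0.95, 0.6):.3f}")
print(f"F1: {fbeta(predictions, labels, beta=1.0):.3f}")
print(f"F2: {fbeta(predictions, labels, beta=2.0):.3f}")
```

In this toy dataset precision is 1.0 and recall is 2/3, so raising `beta` from 1 to 2 lowers the score: a larger `beta` weights recall more heavily, which matches the "higher = favor recall" comment in the config.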

{dao_ai-0.1.19 → dao_ai-0.1.21}/config/examples/04_genie/genie_basic.yaml RENAMED

@@ -16,8 +16,6 @@ resources:
       max_tokens: 8192                              # Maximum tokens per response
       on_behalf_of_user: False
   genie_rooms:
     # Genie space for retail data queries
     retail_genie_room: &retail_genie_room
@@ -27,6 +25,13 @@ resources:
         env: RETAIL_AI_GENIE_SPACE_ID
         default_value: 01f01c91f1f414d59daaefd2b7ec82ea
+  tables:
+    # The `retail_consumer_goods`.`quick_serve_restaurant`.`lookup_items_by_descriptions` function is used to look up the items description vector index.
+    # The function will be implicitly granted permission to the genie agent; however, we need to grant permission to the index for the genie agent to use it.
+    items_description_vs_index:
+      schema: *quick_serve_restaurant_schema
+      name: items_description_vs_index
 tools:
   genie_tool: &genie_tool
     name: genie

{dao_ai-0.1.19 → dao_ai-0.1.21}/config/examples/04_genie/genie_in_memory_semantic_cache.yaml RENAMED

@@ -51,6 +51,12 @@ resources:
         env: RETAIL_AI_GENIE_SPACE_ID
         default_value: 01f01c91f1f414d59daaefd2b7ec82ea
+  tables:
+    # The `retail_consumer_goods`.`quick_serve_restaurant`.`lookup_items_by_descriptions` function is used to look up the items description vector index.
+    # The function will be implicitly granted permission to the genie agent; however, we need to grant permission to the index for the genie agent to use it.
+    items_description_vs_index:
+      schema: *quick_serve_restaurant_schema
+      name: items_description_vs_index
 # =============================================================================
 # MEMORY CONFIGURATION

{dao_ai-0.1.19 → dao_ai-0.1.21}/config/examples/04_genie/genie_lru_cache.yaml RENAMED

@@ -33,7 +33,13 @@ resources:
         env: RETAIL_AI_GENIE_SPACE_ID
         default_value: 01f01c91f1f414d59daaefd2b7ec82ea
+  tables:
+    # The `retail_consumer_goods`.`quick_serve_restaurant`.`lookup_items_by_descriptions` function is used to look up the items description vector index.
+    # The function will be implicitly granted permission to the genie agent; however, we need to grant permission to the index for the genie agent to use it.
+    items_description_vs_index:
+      schema: *quick_serve_restaurant_schema
+      name: items_description_vs_index
 # =============================================================================
 # MEMORY CONFIGURATION
 # =============================================================================

{dao_ai-0.1.19 → dao_ai-0.1.21}/config/examples/04_genie/genie_semantic_cache.yaml RENAMED

@@ -53,7 +53,13 @@ resources:
         env: RETAIL_AI_GENIE_SPACE_ID
         default_value: 01f01c91f1f414d59daaefd2b7ec82ea
+  tables:
+    # The `retail_consumer_goods`.`quick_serve_restaurant`.`lookup_items_by_descriptions` function is used to look up the items description vector index.
+    # The function will be implicitly granted permission to the genie agent; however, we need to grant permission to the index for the genie agent to use it.
+    items_description_vs_index:
+      schema: *quick_serve_restaurant_schema
+      name: items_description_vs_index
 # =============================================================================
 # MEMORY CONFIGURATION
 # =============================================================================

{dao_ai-0.1.19 → dao_ai-0.1.21}/config/examples/04_genie/genie_with_conversation_id.yaml RENAMED

@@ -75,6 +75,13 @@ resources:
         default_value: 01f01c91f1f414d59daaefd2b7ec82ea
       on_behalf_of_user: false
+  tables:
+    # The `retail_consumer_goods`.`quick_serve_restaurant`.`lookup_items_by_descriptions` function is used to look up the items description vector index.
+    # The function will be implicitly granted permission to the genie agent; however, we need to grant permission to the index for the genie agent to use it.
+    items_description_vs_index:
+      schema: *quick_serve_restaurant_schema
+      name: items_description_vs_index
 memory: &memory
   # Conversation checkpointing for state persistence
   checkpointer:

dao_ai-0.1.21/config/examples/17_parallel_tools/README.md ADDED

@@ -0,0 +1,253 @@
+# 17. Parallel Tool Calls
+**Maximize agent performance with concurrent tool execution**
+Learn how to enable and observe parallel tool calling, where the LLM requests multiple tools in a single response and they execute concurrently.
+## Architecture Overview
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#7b1fa2'}}}%%
+flowchart TB
+    subgraph Sequential["Sequential Execution (Slow)"]
+        direction TB
+        S1["Tool A"] --> S2["Tool B"] --> S3["Tool C"]
+        ST["Total: 3 round trips"]
+    end
+    subgraph Parallel["Parallel Execution (Fast)"]
+        direction TB
+        P1["Tool A"]
+        P2["Tool B"]
+        P3["Tool C"]
+        PT["Total: 1 round trip"]
+    end
+    User["User Query"] --> LLM["LLM Decision"]
+    LLM -->|"One at a time"| Sequential
+    LLM -->|"All at once"| Parallel
+    style Sequential fill:#ffebee,stroke:#c62828
+    style Parallel fill:#e8f5e9,stroke:#2e7d32
+```
+## Examples
+| File | Description |
+|------|-------------|
+| [`parallel_tool_calls.yaml`](./parallel_tool_calls.yaml) | Complete example with inline tools and observability middleware |
+## Key Concepts
+### 1. Parallel Tool Calling
+When an LLM needs multiple pieces of independent information, it can request all tools in a single response:
+```mermaid
+%%{init: {'theme': 'base'}}%%
+sequenceDiagram
+    autonumber
+    participant User
+    participant LLM
+    participant ToolNode
+    participant Tools
+    User->>LLM: "Calculate 5+3, 10*2, and 100/4"
+    LLM->>ToolNode: [calc(5+3), calc(10*2), calc(100/4)]
+    Note over ToolNode: Single response with 3 tool calls
+    par Parallel Execution
+        ToolNode->>Tools: calc(5+3)
+        ToolNode->>Tools: calc(10*2)
+        ToolNode->>Tools: calc(100/4)
+    end
+    Tools-->>ToolNode: [8, 20, 25]
+    ToolNode-->>LLM: All results
+    LLM-->>User: "5+3=8, 10*2=20, 100/4=25"
+```
+### 2. Inline Tool Definitions
+Define simple tools directly in YAML without separate Python files:
+```yaml
+tools:
+  calculator:
+    name: calculator
+    function:
+      type: inline
+      code: |
+        from langchain.tools import tool
+        @tool
+        def calculator(expression: str) -> str:
+            """Evaluate a mathematical expression."""
+            return str(eval(expression))
+```
+### 3. Tool Call Observability
+Monitor parallel vs sequential tool calling patterns:
+```yaml
+middleware:
+  - name: dao_ai.middleware.tool_call_observability.create_tool_call_observability_middleware
+    args:
+      log_level: INFO
+      include_args: true
+```
+## Prompt Engineering for Parallel Calls
+The key to enabling parallel tool calls is **explicit instruction** in the system prompt:
+```yaml
+prompt: |
+  You are a helpful assistant with access to various tools.
+  ## CRITICAL: Parallel Tool Execution
+  **ALWAYS call multiple tools simultaneously when they are independent.**
+  When you need to perform multiple independent operations, you MUST call ALL
+  relevant tools in a SINGLE response. Do NOT call them one at a time.
+  Examples of CORRECT parallel behavior:
+  - User asks for time in 3 cities -> Call get_time 3 times IN ONE RESPONSE
+  - User asks for 3 calculations -> Call calculator 3 times IN ONE RESPONSE
+  - User asks to look up items 101, 102, 103 -> Call lookup 3 times IN ONE RESPONSE
+  Only call tools sequentially when one tool's output is needed as INPUT for another.
+```
+## Observability Output
+The observability middleware provides detailed logging:
+### Parallel Calls Detected
+```
+INFO | PARALLEL tool calls detected | num_tools=3 | tool_names=calculator,calculator,calculator
+INFO |   Tool: calculator | args={'expression': '5 + 3'}
+INFO |   Tool: calculator | args={'expression': '10 * 2'}
+INFO |   Tool: calculator | args={'expression': '100 / 4'}
+```
+### Summary Statistics
+```
+INFO | Tool Call Observability Summary
+     | total_model_calls=2
+     | total_tool_calls=3
+     | parallel_batches=1
+     | sequential_calls=0
+     | parallelism_ratio=100.0%
+SUCCESS | Parallel tool calling IS happening: 1 batches with multiple tools
+```
+### Sequential Calls Warning
+```
+WARNING | All tool calls are SEQUENTIAL: 5 single-tool responses.
+        | Consider prompt engineering to encourage parallel calls.
+```
+## Quick Start
+```bash
+# Run the parallel tool calls example
+dao-ai chat -c config/examples/17_parallel_tools/parallel_tool_calls.yaml
+# Test queries that should trigger parallel calls:
+> What is 5+3, 10*2, and 100/4?
+> Look up items 101, 102, and 201
+> Roll three dice for me
+```
+## Performance Benefits
+```mermaid
+%%{init: {'theme': 'base'}}%%
+graph LR
+    subgraph Before["Before: Sequential"]
+        B1["3 tools x 500ms each = 1500ms"]
+    end
+    subgraph After["After: Parallel"]
+        A1["3 tools concurrent = 500ms"]
+    end
+    Before -->|"3x faster"| After
+    style Before fill:#ffebee,stroke:#c62828
+    style After fill:#e8f5e9,stroke:#2e7d32
+```
+| Scenario | Sequential | Parallel | Speedup |
+|----------|------------|----------|---------|
+| 3 independent lookups | 1.5s | 0.5s | 3x |
+| 5 API calls | 2.5s | 0.5s | 5x |
+| 10 database queries | 5.0s | 0.5s | 10x |
+## Inline Tools Reference
+The `inline` function type allows defining tools directly in YAML:
+```yaml
+tools:
+  my_tool:
+    name: my_tool
+    function:
+      type: inline
+      code: |
+        from langchain.tools import tool
+        @tool
+        def my_tool(param: str) -> str:
+            """Tool description shown to the LLM."""
+            # Your tool logic here
+            return f"Result: {param}"
+```
+### Requirements
+- Must import `@tool` decorator from `langchain.tools`
+- Must define at least one function decorated with `@tool`
+- The function docstring becomes the tool description
+- Return type should be `str` for best compatibility
+### Use Cases
+- Prototyping and testing
+- Simple utility tools
+- Demo configurations
+- Learning and experimentation
+For production tools, consider using `type: python` or `type: factory` with proper module organization.
+## Troubleshooting
+| Issue | Solution |
+|-------|----------|
+| Tools called sequentially | Add explicit parallel instructions to prompt |
+| Model ignores parallel prompt | Try more emphatic wording, use examples |
+| Observability not logging | Ensure middleware is first in list |
+| Inline tool errors | Check imports and `@tool` decorator |
+## Best Practices
+1. **Prompt Engineering**: Explicitly instruct the model to batch independent operations
+2. **Observability**: Always add the observability middleware during development
+3. **Test Queries**: Use queries that naturally require multiple independent operations
+4. **Monitor Parallelism Ratio**: Aim for high parallelism ratio in your use cases
+## Next Steps
+- **12_middleware/** - Learn about other middleware options
+- **14_basic_tools/** - Explore tool definition patterns
+- **15_complete_applications/** - See parallel tools in production configs
+## Related Documentation
+- [Tool Configuration](../../../docs/key-capabilities.md#tools)
+- [Middleware Configuration](../../../docs/key-capabilities.md#middleware)
+- [Performance Optimization](../../../docs/architecture.md#performance)
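
Reviewer note: the README's speedup table follows from simple arithmetic. Sequential latency is the sum of the individual tool latencies, while concurrent latency approaches the slowest single call. The sketch below illustrates that with plain `asyncio`; it is not how dao-ai or its tool node schedules calls. `fake_tool`, `run_sequential`, and `run_parallel` are hypothetical names, and the 50 ms sleep stands in for the README's 500 ms I/O-bound tool.

```python
import asyncio
import time


async def fake_tool(name: str, latency_s: float) -> str:
    """Stand-in for an I/O-bound tool call (hypothetical, not a dao-ai API)."""
    await asyncio.sleep(latency_s)
    return f"{name}: done"


async def run_sequential(calls: list[tuple[str, float]]) -> list[str]:
    # Each call starts only after the previous one has finished.
    return [await fake_tool(name, latency) for name, latency in calls]


async def run_parallel(calls: list[tuple[str, float]]) -> list[str]:
    # All calls start together; gather returns results in input order.
    return list(await asyncio.gather(*(fake_tool(n, l) for n, l in calls)))


def timed(runner, calls):
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(runner(calls))
    return results, time.perf_counter() - start


CALLS = [("Tool A", 0.05), ("Tool B", 0.05), ("Tool C", 0.05)]

sequential_results, sequential_elapsed = timed(run_sequential, CALLS)
parallel_results, parallel_elapsed = timed(run_parallel, CALLS)

print(f"sequential: {sequential_elapsed:.3f}s")
print(f"parallel:   {parallel_elapsed:.3f}s")
```

The near-linear speedups in the table hold only when the tools are independent, I/O-bound, and of similar latency. With uneven latencies the concurrent time is bounded by the slowest call, and shared limits such as database connections or API rate limits can reduce the gain further.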

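Reviewer note: the inline `calculator` example in the README above passes the LLM-supplied expression straight to `eval`, which will execute arbitrary Python. For anything beyond a demo, a restricted evaluator is one possible alternative. The sketch below uses only the standard-library `ast` module; `safe_eval` is a hypothetical helper and not part of dao-ai, and it deliberately supports only numeric literals, unary plus and minus, and the four basic arithmetic operators.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str):
    """Evaluate a basic arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept int/float literals only (bool is a subclass of int, so exclude it).
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


for expr in ("5 + 3", "10 * 2", "100 / 4"):
    print(expr, "=", safe_eval(expr))
```

Inside an inline tool, the body of `calculator` could return `str(safe_eval(expression))` instead of `str(eval(expression))`. Division by zero still raises `ZeroDivisionError`, so a production tool would likely want to catch that and return an error string.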