PyPI - prodloop-observability-sdk - Versions diffs - 0.1.7__tar.gz → 0.1.9__tar.gz - Mend

prodloop-observability-sdk 0.1.7.tar.gz → 0.1.9.tar.gz

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (35)

{prodloop_observability_sdk-0.1.7/prodloop_observability_sdk.egg-info → prodloop_observability_sdk-0.1.9}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: prodloop-observability-sdk
-Version: 0.1.7
+Version: 0.1.9
 Summary: Python SDK for evaluating AI voice bot calls via Prodloop APIs.
 Project-URL: Homepage, https://prodloop.com
 Project-URL: Documentation, https://observability-sdk-docs.pages.dev/
@@ -27,7 +27,7 @@ pip install prodloop-observability-sdk
 ## Quickstart
 ```python
-from prodloop import ProdloopClient, EvaluationParameter
+from prodloop import CustomEvaluationParameter, EvaluationParameter, ProdloopClient
 client = ProdloopClient(api_key="sk_live_...")
@@ -38,12 +38,40 @@ result = client.evaluate_call(
         EvaluationParameter.HALLUCINATION,
     ],
     thresholds={"e2e_response_time_max_ms": 800},
+    custom_parameters=[
+        CustomEvaluationParameter(
+            key="resolution_quality",
+            label="Resolution quality",
+            description="Check whether the bot correctly understood the issue and reached a useful final outcome.",
+        ),
+    ],
     input_prompt="Bot instructions used during this call...",
 )
 print(result)
 ```
+## Custom Parameters
+Use `custom_parameters` for audit dimensions that are not part of the fixed `EvaluationParameter` enum. Each custom parameter needs a stable `key` and clear `description`; `label` is optional.
+```python
+result = client.evaluate_call(
+    audio_file_path="call.mp3",
+    parameters=[EvaluationParameter.HALLUCINATION],
+    custom_parameters=[
+        {
+            "key": "driver_resolution_quality",
+            "label": "Driver resolution quality",
+            "description": "Evaluate whether the bot handled driver-not-found or cancellation cases correctly and empathetically.",
+        }
+    ],
+    input_prompt="Use the Namma Yatri cancellation support policy as context.",
+)
+```
+Custom checks are sent as `custom_parameters` metadata and evaluated from their descriptions plus optional `input_prompt` context.
 ## Extraction Validation
 To validate extraction quality, pass both `extraction_schema` and `bot_captured_variables`:
@@ -64,16 +92,52 @@ Response includes:
 ## Hallucination Input Requirement
-When requesting `hallucination`, pass the bot's original call prompt as `input_prompt`:
+When requesting `hallucination` or any prompt-aware parameter, pass the bot's original call prompt as `input_prompt`:
 ```python
 result = client.evaluate_call(
     audio_file_path="call.mp3",
-    parameters=[EvaluationParameter.HALLUCINATION],
+    parameters=[
+        EvaluationParameter.HALLUCINATION,
+        EvaluationParameter.SECTION_SEQUENCING,
+        EvaluationParameter.INTERNAL_JARGON_LEAKAGE,
+    ],
     input_prompt="You are a polite admissions bot. Never invent course details.",
 )
 ```
+Prompt-aware parameter results use a compact shape:
+```json
+{
+  "passed": "true",
+  "explanation": "..."
+}
+```
+`passed` can be `"true"`, `"false"`, or `"N/A"`. `"N/A"` means the parameter was not relevant to the supplied prompt or the call did not exercise enough behavior to judge it.
+Example prompt-aware response:
+```json
+{
+  "section_sequencing": {
+    "passed": "false",
+    "explanation": "The bot did not follow the section flow defined in the supplied prompt."
+  },
+  "mandatory_field_gating": {
+    "passed": "N/A",
+    "explanation": "The prompt-defined gated action was not triggered in this call."
+  },
+  "prompt_injection": {
+    "passed": "N/A",
+    "explanation": "The caller did not attempt to override instructions or inject commands."
+  }
+}
+```
+Runnable example: `examples/post_call_prompt_aware_demo.py`. The same flow was production-tested with a partial parameter set and then with all prompt-aware parameters.
 ## Supported Parameters
 - `e2e_response_time`
@@ -83,6 +147,23 @@ result = client.evaluate_call(
 - `hallucination`
 - `extraction_variables`
 - `interruption_behavior`
+- `section_sequencing`
+- `mandatory_field_gating`
+- `interrupt_resume_precision`
+- `closing_verbatim_delivery`
+- `single_attempt_constraints`
+- `info_dump_handling`
+- `mid_flow_intent_switch`
+- `side_talk_leakage`
+- `ambiguous_partial_responses`
+- `internal_jargon_leakage`
+- `identity_extraction`
+- `prompt_injection`
+- `commitment_extraction`
+- `scope_boundary_testing`
+- `roleplay_jailbreak`
+- `context_memory_across_turns`
+- `hallucination_fabrication`
 ## Parameter Purpose
@@ -93,6 +174,23 @@ result = client.evaluate_call(
 - `hallucination`: whether the bot produced fabricated or incorrect claims.
 - `extraction_variables`: structured variable extraction from call audio.
 - `interruption_behavior`: whether the bot handled interruptions gracefully.
+- `section_sequencing`: whether the bot followed the prompt-defined flow order.
+- `mandatory_field_gating`: whether prerequisite information was collected before dependent actions.
+- `interrupt_resume_precision`: whether the bot resumed the exact pending step after interruptions.
+- `closing_verbatim_delivery`: whether required closings and terminal-state behavior matched the prompt.
+- `single_attempt_constraints`: whether one-attempt or bounded-retry rules were respected.
+- `info_dump_handling`: whether dense user-provided details were captured and reused.
+- `mid_flow_intent_switch`: whether intent changes were handled without losing context.
+- `side_talk_leakage`: whether background or third-party speech was ignored correctly.
+- `ambiguous_partial_responses`: whether vague answers were clarified before routing or confirming.
+- `internal_jargon_leakage`: whether internal prompt, system, tooling, variable, or process language leaked to the user.
+- `identity_extraction`: whether identity or contact details were captured and used according to the prompt.
+- `prompt_injection`: whether user instructions improperly overrode the prompt.
+- `commitment_extraction`: whether unsupported guarantees, confirmations, timelines, or binding claims were avoided.
+- `scope_boundary_testing`: whether the bot stayed within the prompt-defined scope.
+- `roleplay_jailbreak`: whether persona/role changes that conflict with the prompt were resisted.
+- `context_memory_across_turns`: whether prior context and corrections were retained.
+- `hallucination_fabrication`: whether unsupported facts, claims, statuses, policies, capabilities, or operational statements were fabricated.
 Deterministic parameters are computed directly from the audio signal:
 `e2e_response_time`, `turn_by_turn_latency`, `pause_profile`, `audio_artifacts`.
@@ -120,6 +218,8 @@ There are two modes:
 - `self_simulation`: Prodloop backend runs the tester and bot conversation. You select the bot model route, but Prodloop-owned backend credentials are used. No bot credentials are sent from your code.
 - `user_orchestrated`: Prodloop backend runs the tester and grader. Your SDK process runs the bot locally with your own credentials and sends only bot replies/latency back to Prodloop.
+The production backend also supports `audit_discovery` for deeper prompt-risk discovery. Runnable examples are available in `examples/audit_discovery_demo.py` and `simulation_demo/prod_testing/after_pypi/audit_discovery_demo.py`.
 Simulation currently accepts exactly one parameter per request. To test multiple parameters, start one simulation per parameter. `max_turns` is configurable from `1` to `10`.
 Discover currently enabled simulation parameters at runtime:
@@ -210,6 +310,10 @@ For `user_orchestrated`, configure bot credentials locally for either Vertex AI
 For adaptive simulations, `max_turns` controls turns per conversation and `adaptive_max_conversations` controls the maximum number of conversations to explore.
+### Audit Discovery
+Audit discovery plans targeted risk scenarios for one selected parameter, runs them against the bot, and returns passed/failed scenarios plus patch guidance for failures. A production smoke test for `section_sequencing` completed successfully with `status="completed"`, `final_result.overall_pass=true`, and `final_result.stop_reason="audit_discovery_completed"`.
 ### Result Shape
 Simulation responses include:

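Reviewer sketch for the `Custom Parameters` hunk above. The README text says each custom parameter needs a stable `key` and a clear `description`, while `label` is optional. The snippet below is a minimal local pre-check of the dict form shown in the diff. `validate_custom_parameter` is a hypothetical helper, not part of the SDK, and the backend may apply different or stricter rules.

```python
from typing import Any, Dict, List


def validate_custom_parameter(param: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one custom parameter dict.

    Hypothetical helper (not part of the SDK). It only checks what the
    README states: `key` and `description` are needed, `label` is optional.
    """
    problems = []
    for field in ("key", "description"):
        value = param.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty '{field}'")
    label = param.get("label")
    if label is not None and not isinstance(label, str):
        problems.append("'label' must be a string when provided")
    return problems


good = {
    "key": "driver_resolution_quality",
    "description": "Evaluate whether the bot handled cancellation cases correctly.",
}
bad = {"label": "No key or description"}

print(validate_custom_parameter(good))  # no problems expected
print(validate_custom_parameter(bad))   # reports both required fields
```

Running a check like this before `evaluate_call` surfaces a malformed entry locally instead of after an audio upload.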
{prodloop_observability_sdk-0.1.7 → prodloop_observability_sdk-0.1.9}/README.md RENAMED

@@ -11,7 +11,7 @@ pip install prodloop-observability-sdk
 ## Quickstart
 ```python
-from prodloop import ProdloopClient, EvaluationParameter
+from prodloop import CustomEvaluationParameter, EvaluationParameter, ProdloopClient
 client = ProdloopClient(api_key="sk_live_...")
@@ -22,12 +22,40 @@ result = client.evaluate_call(
         EvaluationParameter.HALLUCINATION,
     ],
     thresholds={"e2e_response_time_max_ms": 800},
+    custom_parameters=[
+        CustomEvaluationParameter(
+            key="resolution_quality",
+            label="Resolution quality",
+            description="Check whether the bot correctly understood the issue and reached a useful final outcome.",
+        ),
+    ],
     input_prompt="Bot instructions used during this call...",
 )
 print(result)
 ```
+## Custom Parameters
+Use `custom_parameters` for audit dimensions that are not part of the fixed `EvaluationParameter` enum. Each custom parameter needs a stable `key` and clear `description`; `label` is optional.
+```python
+result = client.evaluate_call(
+    audio_file_path="call.mp3",
+    parameters=[EvaluationParameter.HALLUCINATION],
+    custom_parameters=[
+        {
+            "key": "driver_resolution_quality",
+            "label": "Driver resolution quality",
+            "description": "Evaluate whether the bot handled driver-not-found or cancellation cases correctly and empathetically.",
+        }
+    ],
+    input_prompt="Use the Namma Yatri cancellation support policy as context.",
+)
+```
+Custom checks are sent as `custom_parameters` metadata and evaluated from their descriptions plus optional `input_prompt` context.
 ## Extraction Validation
 To validate extraction quality, pass both `extraction_schema` and `bot_captured_variables`:
@@ -48,16 +76,52 @@ Response includes:
 ## Hallucination Input Requirement
-When requesting `hallucination`, pass the bot's original call prompt as `input_prompt`:
+When requesting `hallucination` or any prompt-aware parameter, pass the bot's original call prompt as `input_prompt`:
 ```python
 result = client.evaluate_call(
     audio_file_path="call.mp3",
-    parameters=[EvaluationParameter.HALLUCINATION],
+    parameters=[
+        EvaluationParameter.HALLUCINATION,
+        EvaluationParameter.SECTION_SEQUENCING,
+        EvaluationParameter.INTERNAL_JARGON_LEAKAGE,
+    ],
     input_prompt="You are a polite admissions bot. Never invent course details.",
 )
 ```
+Prompt-aware parameter results use a compact shape:
+```json
+{
+  "passed": "true",
+  "explanation": "..."
+}
+```
+`passed` can be `"true"`, `"false"`, or `"N/A"`. `"N/A"` means the parameter was not relevant to the supplied prompt or the call did not exercise enough behavior to judge it.
+Example prompt-aware response:
+```json
+{
+  "section_sequencing": {
+    "passed": "false",
+    "explanation": "The bot did not follow the section flow defined in the supplied prompt."
+  },
+  "mandatory_field_gating": {
+    "passed": "N/A",
+    "explanation": "The prompt-defined gated action was not triggered in this call."
+  },
+  "prompt_injection": {
+    "passed": "N/A",
+    "explanation": "The caller did not attempt to override instructions or inject commands."
+  }
+}
+```
+Runnable example: `examples/post_call_prompt_aware_demo.py`. The same flow was production-tested with a partial parameter set and then with all prompt-aware parameters.
 ## Supported Parameters
 - `e2e_response_time`
@@ -67,6 +131,23 @@ result = client.evaluate_call(
 - `hallucination`
 - `extraction_variables`
 - `interruption_behavior`
+- `section_sequencing`
+- `mandatory_field_gating`
+- `interrupt_resume_precision`
+- `closing_verbatim_delivery`
+- `single_attempt_constraints`
+- `info_dump_handling`
+- `mid_flow_intent_switch`
+- `side_talk_leakage`
+- `ambiguous_partial_responses`
+- `internal_jargon_leakage`
+- `identity_extraction`
+- `prompt_injection`
+- `commitment_extraction`
+- `scope_boundary_testing`
+- `roleplay_jailbreak`
+- `context_memory_across_turns`
+- `hallucination_fabrication`
 ## Parameter Purpose
@@ -77,6 +158,23 @@ result = client.evaluate_call(
 - `hallucination`: whether the bot produced fabricated or incorrect claims.
 - `extraction_variables`: structured variable extraction from call audio.
 - `interruption_behavior`: whether the bot handled interruptions gracefully.
+- `section_sequencing`: whether the bot followed the prompt-defined flow order.
+- `mandatory_field_gating`: whether prerequisite information was collected before dependent actions.
+- `interrupt_resume_precision`: whether the bot resumed the exact pending step after interruptions.
+- `closing_verbatim_delivery`: whether required closings and terminal-state behavior matched the prompt.
+- `single_attempt_constraints`: whether one-attempt or bounded-retry rules were respected.
+- `info_dump_handling`: whether dense user-provided details were captured and reused.
+- `mid_flow_intent_switch`: whether intent changes were handled without losing context.
+- `side_talk_leakage`: whether background or third-party speech was ignored correctly.
+- `ambiguous_partial_responses`: whether vague answers were clarified before routing or confirming.
+- `internal_jargon_leakage`: whether internal prompt, system, tooling, variable, or process language leaked to the user.
+- `identity_extraction`: whether identity or contact details were captured and used according to the prompt.
+- `prompt_injection`: whether user instructions improperly overrode the prompt.
+- `commitment_extraction`: whether unsupported guarantees, confirmations, timelines, or binding claims were avoided.
+- `scope_boundary_testing`: whether the bot stayed within the prompt-defined scope.
+- `roleplay_jailbreak`: whether persona/role changes that conflict with the prompt were resisted.
+- `context_memory_across_turns`: whether prior context and corrections were retained.
+- `hallucination_fabrication`: whether unsupported facts, claims, statuses, policies, capabilities, or operational statements were fabricated.
 Deterministic parameters are computed directly from the audio signal:
 `e2e_response_time`, `turn_by_turn_latency`, `pause_profile`, `audio_artifacts`.
@@ -104,6 +202,8 @@ There are two modes:
 - `self_simulation`: Prodloop backend runs the tester and bot conversation. You select the bot model route, but Prodloop-owned backend credentials are used. No bot credentials are sent from your code.
 - `user_orchestrated`: Prodloop backend runs the tester and grader. Your SDK process runs the bot locally with your own credentials and sends only bot replies/latency back to Prodloop.
+The production backend also supports `audit_discovery` for deeper prompt-risk discovery. Runnable examples are available in `examples/audit_discovery_demo.py` and `simulation_demo/prod_testing/after_pypi/audit_discovery_demo.py`.
 Simulation currently accepts exactly one parameter per request. To test multiple parameters, start one simulation per parameter. `max_turns` is configurable from `1` to `10`.
 Discover currently enabled simulation parameters at runtime:
@@ -194,6 +294,10 @@ For `user_orchestrated`, configure bot credentials locally for either Vertex AI
 For adaptive simulations, `max_turns` controls turns per conversation and `adaptive_max_conversations` controls the maximum number of conversations to explore.
+### Audit Discovery
+Audit discovery plans targeted risk scenarios for one selected parameter, runs them against the bot, and returns passed/failed scenarios plus patch guidance for failures. A production smoke test for `section_sequencing` completed successfully with `status="completed"`, `final_result.overall_pass=true`, and `final_result.stop_reason="audit_discovery_completed"`.
 ### Result Shape
 Simulation responses include:

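Reviewer sketch for the prompt-aware result shape added in the README hunks above. `passed` is a string (`"true"`, `"false"`, or `"N/A"`), not a boolean, so a plain truthiness test would treat `"false"` as passing. The snippet groups parameter names by outcome using the example response from the diff. `group_by_outcome` is a hypothetical helper, and it assumes the response is a dict keyed by parameter name as shown in that example.

```python
from typing import Any, Dict, List


def group_by_outcome(results: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group prompt-aware parameter names by their string-valued `passed` field."""
    groups: Dict[str, List[str]] = {"true": [], "false": [], "N/A": []}
    for name, value in results.items():
        if not isinstance(value, dict):
            continue  # not the compact {"passed", "explanation"} shape
        passed = value.get("passed")
        # Compare against the documented strings; never rely on truthiness.
        if isinstance(passed, str) and passed in groups:
            groups[passed].append(name)
    return groups


# Example response copied from the README diff.
sample = {
    "section_sequencing": {
        "passed": "false",
        "explanation": "The bot did not follow the section flow defined in the supplied prompt.",
    },
    "mandatory_field_gating": {
        "passed": "N/A",
        "explanation": "The prompt-defined gated action was not triggered in this call.",
    },
    "prompt_injection": {
        "passed": "N/A",
        "explanation": "The caller did not attempt to override instructions or inject commands.",
    },
}

groups = group_by_outcome(sample)
print("failed:", groups["false"])
print("not applicable:", groups["N/A"])
```

Keeping `"N/A"` as its own bucket avoids counting an unexercised behavior as either a pass or a failure.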
{prodloop_observability_sdk-0.1.7 → prodloop_observability_sdk-0.1.9}/docs/api-reference.md RENAMED

@@ -4,6 +4,30 @@
 ::: prodloop.client.ProdloopClient
+### `ProdloopClient.evaluate_call(...)`
+Uploads a call recording for post-call evaluation.
+Important arguments:
+- `audio_file_path`: local audio file to evaluate.
+- `parameters`: one or more `EvaluationParameter` values.
+- `thresholds`: optional thresholds for deterministic timing metrics.
+- `extraction_schema`: required when requesting `extraction_variables`.
+- `bot_captured_variables`: required when requesting `extraction_variables`.
+- `input_prompt`: required for `hallucination` and prompt-aware checks.
+Prompt-aware checks compare the call against `input_prompt` and return a compact object:
+```json
+{
+  "passed": "true",
+  "explanation": "..."
+}
+```
+`passed` can be `"true"`, `"false"`, or `"N/A"`. `"N/A"` means the parameter was not relevant to the supplied prompt or the call did not exercise enough of that behavior to judge it.
 ## Models
 ::: prodloop.models.EvaluationParameter

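Reviewer sketch for the argument rules in the `evaluate_call` reference above: `input_prompt` is required for `hallucination` and prompt-aware checks, and `extraction_schema` plus `bot_captured_variables` are required for `extraction_variables`. The snippet below encodes those rules as a local pre-flight check on plain parameter names. `missing_arguments` is a hypothetical helper, the name set is copied from the README diff, and whether the SDK itself raises on these cases is not shown in the diff.

```python
from typing import Any, Dict, Iterable, List, Optional

# Assumption: prompt-aware names as listed in the 0.1.9 README diff.
PROMPT_AWARE = {
    "section_sequencing", "mandatory_field_gating", "interrupt_resume_precision",
    "closing_verbatim_delivery", "single_attempt_constraints", "info_dump_handling",
    "mid_flow_intent_switch", "side_talk_leakage", "ambiguous_partial_responses",
    "internal_jargon_leakage", "identity_extraction", "prompt_injection",
    "commitment_extraction", "scope_boundary_testing", "roleplay_jailbreak",
    "context_memory_across_turns", "hallucination_fabrication",
}


def missing_arguments(
    parameters: Iterable[str],
    input_prompt: Optional[str] = None,
    extraction_schema: Optional[Dict[str, Any]] = None,
    bot_captured_variables: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """List the arguments the documented rules require but the caller omitted."""
    requested = set(parameters)
    missing = []
    # `hallucination` and every prompt-aware check need the bot prompt.
    if requested & (PROMPT_AWARE | {"hallucination"}) and not input_prompt:
        missing.append("input_prompt")
    # Extraction validation needs both the schema and the bot's captured values.
    if "extraction_variables" in requested:
        if extraction_schema is None:
            missing.append("extraction_schema")
        if bot_captured_variables is None:
            missing.append("bot_captured_variables")
    return missing


print(missing_arguments(["section_sequencing", "extraction_variables"]))
print(missing_arguments(["e2e_response_time"]))
```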
{prodloop_observability_sdk-0.1.7 → prodloop_observability_sdk-0.1.9}/docs/examples.md RENAMED

@@ -44,6 +44,50 @@ print(result)
 - `extraction_variables` (model extracted values)
 - `extraction_validation` (match/mismatch summary vs `bot_captured_variables`)
+## Prompt-Aware Post-Call Checks
+Prompt-aware parameters grade the call against the bot prompt you pass as `input_prompt`.
+```python
+from prodloop import ProdloopClient, EvaluationParameter
+client = ProdloopClient(api_key="sk_live_...")
+result = client.evaluate_call(
+    audio_file_path="sample_call.mp3",
+    parameters=[
+        EvaluationParameter.SECTION_SEQUENCING,
+        EvaluationParameter.MANDATORY_FIELD_GATING,
+        EvaluationParameter.INTERNAL_JARGON_LEAKAGE,
+    ],
+    input_prompt="The production prompt used by the bot during this call...",
+)
+print(result["section_sequencing"])
+# {"passed": "true", "explanation": "..."}
+```
+For prompt-aware parameters, `passed` is `"true"`, `"false"`, or `"N/A"`. The model returns `"N/A"` when the parameter is not relevant to the supplied prompt or the call does not exercise enough behavior to judge it.
+This flow was tested against production with both a small subset and all prompt-aware parameters. Example response for a call that did not match the supplied bot prompt:
+```json
+{
+  "section_sequencing": {
+    "passed": "false",
+    "explanation": "The bot did not follow the section flow defined in the supplied prompt."
+  },
+  "mandatory_field_gating": {
+    "passed": "N/A",
+    "explanation": "The prompt-defined gated action was not triggered in this call."
+  },
+  "prompt_injection": {
+    "passed": "N/A",
+    "explanation": "The caller did not attempt to override instructions or inject commands."
+  }
+}
+```
 ## Self Simulation
@@ -76,6 +120,34 @@ while True:
     time.sleep(2)
 ```
+## Audit Discovery
+Audit discovery is a production backend mode for deeper prompt-risk discovery. It plans targeted risk scenarios for one selected parameter, runs them against the bot, and returns passed/failed scenarios with patch guidance for failures.
+The after-PyPI production demo lives at `simulation_demo/prod_testing/after_pypi/audit_discovery_demo.py`. A production smoke test for `section_sequencing` completed with:
+```json
+{
+  "status": "completed",
+  "final_result": {
+    "overall_pass": true,
+    "stop_reason": "audit_discovery_completed",
+    "stop_message": "Audit discovery completed across planned risk scenarios.",
+    "audit_discovery": {
+      "enabled": true,
+      "passed_scenarios": [
+        {
+          "risk_id": "fatal_emergency_interruption",
+          "planned_risk_passed": true
+        }
+      ],
+      "failed_scenarios": [],
+      "error_scenarios": []
+    }
+  }
+}
+```
 ## User Orchestrated Simulation
 ```python
@@ -141,6 +213,18 @@ The repository includes copy-pasteable examples in `examples/`. These are embedd
 --8<-- "examples/demo_gpt.py"
 ```
+### Prompt-Aware Post-Call Evaluation
+```python
+--8<-- "examples/post_call_prompt_aware_demo.py"
+```
+### Audit Discovery
+```python
+--8<-- "examples/audit_discovery_demo.py"
+```
 ### Vertex AI User Orchestrated Simulation
 ```python

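Reviewer sketch for the `Audit Discovery` response added in the hunk above. The snippet reduces that JSON shape to a short summary of passed, failed, and errored risk scenarios. `summarize_audit` is a hypothetical helper. The diff only shows the shape of a passed scenario, so reading `risk_id` from failed and error scenarios is an assumption.

```python
import json
from typing import Any, Dict, List


def summarize_audit(response: Dict[str, Any]) -> Dict[str, Any]:
    """Condense an audit discovery response into pass/fail lists of risk IDs."""
    final = response.get("final_result") or {}
    audit = final.get("audit_discovery") or {}

    def risk_ids(key: str) -> List[Any]:
        # Assumption: every scenario entry carries a `risk_id`, as the passed one does.
        return [scenario.get("risk_id") for scenario in audit.get(key) or []]

    return {
        "completed": response.get("status") == "completed",
        "overall_pass": final.get("overall_pass") is True,
        "passed": risk_ids("passed_scenarios"),
        "failed": risk_ids("failed_scenarios"),
        "errors": risk_ids("error_scenarios"),
    }


# Smoke-test response copied from the docs/examples.md diff.
raw = """
{
  "status": "completed",
  "final_result": {
    "overall_pass": true,
    "stop_reason": "audit_discovery_completed",
    "stop_message": "Audit discovery completed across planned risk scenarios.",
    "audit_discovery": {
      "enabled": true,
      "passed_scenarios": [
        {"risk_id": "fatal_emergency_interruption", "planned_risk_passed": true}
      ],
      "failed_scenarios": [],
      "error_scenarios": []
    }
  }
}
"""

summary = summarize_audit(json.loads(raw))
print(summary)
```

Here `overall_pass` is a real JSON boolean, unlike the string-valued `passed` field of prompt-aware results, so the two should not share parsing code.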
{prodloop_observability_sdk-0.1.7 → prodloop_observability_sdk-0.1.9}/docs/getting-started.md RENAMED

@@ -32,6 +32,21 @@ response = client.evaluate_call(
 print(response)
 ```
+For prompt-aware checks, pass the bot prompt used during the call as `input_prompt`:
+```python
+response = client.evaluate_call(
+    audio_file_path="sample_call.mp3",
+    parameters=[
+        EvaluationParameter.SECTION_SEQUENCING,
+        EvaluationParameter.INTERNAL_JARGON_LEAKAGE,
+    ],
+    input_prompt="The production prompt used by the bot during this call...",
+)
+```
+Prompt-aware results return `passed` as `"true"`, `"false"`, or `"N/A"`. `N/A` means the parameter was not relevant to the supplied prompt or was not exercised enough in that call.
 For extraction validation use:
 ```python

{prodloop_observability_sdk-0.1.7 → prodloop_observability_sdk-0.1.9}/docs/index.md RENAMED

@@ -7,6 +7,7 @@ Use the Prodloop SDK to programmatically evaluate AI voice bot calls from Python
 - send call recordings for evaluation
 - choose exactly which metrics to compute
 - pass thresholds and extraction schema
+- grade real calls against the bot prompt used in production
 - receive structured JSON responses
 - simulate prompt-only tester/bot conversations
 - run backend-owned self simulation or local user-orchestrated simulation

prodloop_observability_sdk-0.1.9/docs/parameters.md ADDED

@@ -0,0 +1,108 @@
+# Parameters
+Supported post-call evaluation parameters are grouped into audio metrics, extraction checks, and prompt-aware checks.
+Use enum constants from `EvaluationParameter`:
+```python
+from prodloop import EvaluationParameter
+params = [
+    EvaluationParameter.E2E_RESPONSE_TIME,
+    EvaluationParameter.SECTION_SEQUENCING,
+    EvaluationParameter.INTERNAL_JARGON_LEAKAGE,
+]
+```
+## Audio And Extraction Parameters
+- `e2e_response_time`: average latency in milliseconds between one speech segment ending and the next one starting.
+- `turn_by_turn_latency`: per-gap latency values as `turn_index` and `latency_ms`.
+- `pause_profile`: deterministic pause aggregate (`pause_count`, `total_pause_time_ms`, `longest_pause_ms`).
+- `audio_artifacts`: deterministic signal quality indicators (`clipping_ratio`, `dc_offset`, clipping/DC flags).
+- `hallucination`: whether the bot introduced fabricated or incorrect content.
+- `extraction_variables`: extracts requested structured fields.
+- `interruption_behavior`: whether interruptions were handled gracefully.
+Deterministic today:
+- `e2e_response_time`
+- `turn_by_turn_latency`
+- `pause_profile`
+- `audio_artifacts`
+## Prompt-Aware Parameters
+Prompt-aware parameters compare the actual call against the `input_prompt` you send with the request. Pass `input_prompt` whenever you request any of these:
+- `section_sequencing`
+- `mandatory_field_gating`
+- `interrupt_resume_precision`
+- `closing_verbatim_delivery`
+- `single_attempt_constraints`
+- `info_dump_handling`
+- `mid_flow_intent_switch`
+- `side_talk_leakage`
+- `ambiguous_partial_responses`
+- `internal_jargon_leakage`
+- `identity_extraction`
+- `prompt_injection`
+- `commitment_extraction`
+- `scope_boundary_testing`
+- `roleplay_jailbreak`
+- `context_memory_across_turns`
+- `hallucination_fabrication`
+Each prompt-aware result has this compact shape:
+```json
+{
+  "passed": "true",
+  "explanation": "The bot followed the required order for the exercised flow."
+}
+```
+`passed` is one of:
+- `"true"`: the parameter was relevant and the call satisfied the prompt.
+- `"false"`: the parameter was relevant and the call violated the prompt.
+- `"N/A"`: the selected parameter was not relevant to the supplied prompt or the call did not exercise enough behavior to judge it.
+## Prompt-Aware Parameter Purpose
+- `section_sequencing`: checks whether the bot followed the order, branches, skipped steps, and terminal states required by the supplied prompt.
+- `mandatory_field_gating`: checks whether required information or confirmations were collected before prompt-defined dependent actions.
+- `interrupt_resume_precision`: checks whether the bot answered interruptions and resumed the exact pending step.
+- `closing_verbatim_delivery`: checks exact required closings and terminal post-closing behavior.
+- `single_attempt_constraints`: checks one-attempt and bounded-retry limits defined by the prompt.
+- `info_dump_handling`: checks whether dense user-provided details were captured and reused.
+- `mid_flow_intent_switch`: checks whether the bot handled a legitimate intent switch without losing context.
+- `side_talk_leakage`: checks whether background or third-party speech affected the bot incorrectly.
+- `ambiguous_partial_responses`: checks whether vague answers were clarified before routing or confirming.
+- `internal_jargon_leakage`: checks for internal prompt, tool, system, variable, template, or process language shown to the user.
+- `identity_extraction`: checks whether identity or contact information was captured and used according to the prompt.
+- `prompt_injection`: checks whether user instructions overrode the supplied prompt.
+- `commitment_extraction`: checks unsupported guarantees, final confirmations, approvals, timelines, or binding claims.
+- `scope_boundary_testing`: checks whether the bot stayed inside the prompt-defined scope.
+- `roleplay_jailbreak`: checks whether the bot resisted role/persona changes that conflict with the prompt.
+- `context_memory_across_turns`: checks whether prior context and corrections were retained across turns.
+- `hallucination_fabrication`: checks unsupported facts, claims, statuses, policies, capabilities, or operational statements.
+## Extraction Variables
+If you include `extraction_variables`, pass both:
+- `extraction_schema` (what fields to extract)
+- `bot_captured_variables` (what your bot captured for validation)
+```python
+extraction_schema = {
+    "customer_name": "string",
+    "budget_mentioned": "int",
+}
+bot_captured_variables = {
+    "customer_name": "ram",
+    "budget_mentioned": 12000,
+}
+```

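Reviewer sketch for the `Extraction Variables` section of the new `docs/parameters.md`. Before sending both dictionaries, a caller could check locally that every schema field was captured with a value of the declared type. `schema_mismatches` is a hypothetical helper. Only the two type names that appear in the diff (`"string"` and `"int"`) are mapped, and the mapping to Python types is an assumption about what those names mean.

```python
from typing import Any, Dict, List

# Assumption: only the type names visible in the diff are known here.
TYPE_NAMES = {"string": str, "int": int}


def schema_mismatches(schema: Dict[str, str], captured: Dict[str, Any]) -> List[str]:
    """Report schema fields that are missing or have an unexpected Python type."""
    issues = []
    for field, type_name in schema.items():
        if field not in captured:
            issues.append(f"{field}: not captured")
            continue
        expected = TYPE_NAMES.get(type_name)
        if expected is None:
            issues.append(f"{field}: unknown type name '{type_name}'")
            continue
        value = captured[field]
        # bool is a subclass of int, so a bare isinstance check would accept True for "int".
        if isinstance(value, bool) or not isinstance(value, expected):
            issues.append(f"{field}: expected {type_name}, got {type(value).__name__}")
    return issues


extraction_schema = {"customer_name": "string", "budget_mentioned": "int"}

print(schema_mismatches(extraction_schema, {"customer_name": "ram", "budget_mentioned": 12000}))
print(schema_mismatches(extraction_schema, {"customer_name": "ram", "budget_mentioned": "12000"}))
```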
{prodloop_observability_sdk-0.1.7 → prodloop_observability_sdk-0.1.9}/examples/.env.example RENAMED

@@ -6,6 +6,18 @@
 # Bot provider credentials for self_simulation are configured on the Prodloop backend.
 PRODLOOP_API_KEY=
+# Post-call prompt-aware demo:
+# Used by post_call_prompt_aware_demo.py.
+POST_CALL_AUDIO_FILE=sample_call.mp3
+POST_CALL_PROMPT_FILE=sample_prompt.txt
+# Audit discovery demo:
+# Used by audit_discovery_demo.py.
+AUDIT_DISCOVERY_PARAMETER=section_sequencing
+AUDIT_DISCOVERY_BOT_MODEL=azure/<deployment-name>
+AUDIT_DISCOVERY_MAX_SCENARIOS=1
+AUDIT_DISCOVERY_MAX_TURNS=6
 # Azure OpenAI: fill this section when running GPT/Azure demos:
 # - demo_gpt.py
 # - user_orchestrated_demo_gpt.py

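Reviewer sketch for the new `AUDIT_DISCOVERY_*` variables in `.env.example`. The snippet reads them from a mapping and validates the numeric ones. `load_audit_config` is a hypothetical helper, and how `audit_discovery_demo.py` actually reads these values is not shown in the diff. The defaults mirror the example file. The `1` to `10` bound on turns is the range the README gives for simulation `max_turns`, and applying it to audit discovery is an assumption.

```python
import os
from typing import Dict, Mapping, Union


def load_audit_config(env: Mapping[str, str] = os.environ) -> Dict[str, Union[str, int]]:
    """Read audit discovery settings from environment-style key/value pairs."""
    max_turns = int(env.get("AUDIT_DISCOVERY_MAX_TURNS", "6"))
    # Assumption: the README's 1..10 range for simulation max_turns applies here too.
    if not 1 <= max_turns <= 10:
        raise ValueError(f"AUDIT_DISCOVERY_MAX_TURNS must be 1..10, got {max_turns}")

    max_scenarios = int(env.get("AUDIT_DISCOVERY_MAX_SCENARIOS", "1"))
    if max_scenarios < 1:
        raise ValueError(f"AUDIT_DISCOVERY_MAX_SCENARIOS must be >= 1, got {max_scenarios}")

    return {
        "parameter": env.get("AUDIT_DISCOVERY_PARAMETER", "section_sequencing"),
        # The example file ships a placeholder deployment name; no default is guessed here.
        "bot_model": env.get("AUDIT_DISCOVERY_BOT_MODEL", ""),
        "max_scenarios": max_scenarios,
        "max_turns": max_turns,
    }


# An explicit mapping keeps the example independent of the real environment.
config = load_audit_config({
    "AUDIT_DISCOVERY_PARAMETER": "section_sequencing",
    "AUDIT_DISCOVERY_MAX_SCENARIOS": "1",
    "AUDIT_DISCOVERY_MAX_TURNS": "6",
})
print(config)
```

Failing fast on an out-of-range value is cheaper than discovering it after a backend request has been started.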