lmnr 0.4.12b3__tar.gz → 0.4.12b4__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/PKG-INFO +17 -12
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/README.md +17 -10
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/pyproject.toml +2 -3
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/decorators.py +3 -2
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/evaluations.py +90 -58
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/laminar.py +32 -10
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/types.py +38 -5
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/utils.py +4 -5
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/__init__.py +3 -29
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/config/__init__.py +0 -4
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/decorators/base.py +16 -9
- lmnr-0.4.12b4/src/lmnr/traceloop_sdk/tracing/attributes.py +8 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tracing/tracing.py +31 -142
- lmnr-0.4.12b3/src/lmnr/traceloop_sdk/metrics/__init__.py +0 -0
- lmnr-0.4.12b3/src/lmnr/traceloop_sdk/metrics/metrics.py +0 -176
- lmnr-0.4.12b3/src/lmnr/traceloop_sdk/tracing/manual.py +0 -57
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/LICENSE +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/__init__.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/cli.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/__init__.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/sdk/log.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/.flake8 +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/.python-version +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/decorators/__init__.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/instruments.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/__init__.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_association_properties/test_langchain_and_external_association_properties.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_association_properties/test_langchain_association_properties.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_manual/test_manual_report.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_manual/test_resource_attributes.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_privacy_no_prompts/test_simple_workflow.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_prompt_management/test_prompt_management.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_sdk_initialization/test_resource_attributes.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_tasks/test_task_io_serialization_with_langchain.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_workflows/test_simple_aworkflow.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_workflows/test_simple_workflow.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/cassettes/test_workflows/test_streaming_workflow.yaml +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/conftest.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_association_properties.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_manual.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_nested_tasks.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_privacy_no_prompts.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_sdk_initialization.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_tasks.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tests/test_workflows.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tracing/__init__.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tracing/content_allow_list.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/tracing/context_manager.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/utils/__init__.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/utils/in_memory_span_exporter.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/utils/json_encoder.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/utils/package_check.py +0 -0
- {lmnr-0.4.12b3 → lmnr-0.4.12b4}/src/lmnr/traceloop_sdk/version.py +0 -0

--- lmnr-0.4.12b3/PKG-INFO
+++ lmnr-0.4.12b4/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: lmnr
-Version: 0.4.12b3
+Version: 0.4.12b4
 Summary: Python SDK for Laminar AI
 License: Apache-2.0
 Author: lmnr.ai
@@ -14,7 +14,6 @@ Classifier: Programming Language :: Python :: 3.12
 Requires-Dist: argparse (>=1.0,<2.0)
 Requires-Dist: asyncio (>=3.0,<4.0)
 Requires-Dist: backoff (>=2.0,<3.0)
-Requires-Dist: colorama (>=0.4,<0.5)
 Requires-Dist: deprecated (>=1.0,<2.0)
 Requires-Dist: jinja2 (>=3.0,<4.0)
 Requires-Dist: opentelemetry-api (>=1.27.0,<2.0.0)
@@ -197,7 +196,7 @@ L.initialize(project_api_key=os.environ["LMNR_PROJECT_API_KEY"], instruments={In
 
 If you want to fully disable any kind of autoinstrumentation, pass an empty set as `instruments=set()` to `.initialize()`.
 
-
+Autoinstrumentations are provided by Traceloop's [OpenLLMetry](https://github.com/traceloop/openllmetry).
 
 ## Sending events
 
@@ -267,13 +266,14 @@ Evaluation takes in the following parameters:
 - `name` – the name of your evaluation. If no such evaluation exists in the project, it will be created. Otherwise, data will be pushed to the existing evaluation
 - `data` – an array of `EvaluationDatapoint` objects, where each `EvaluationDatapoint` has two keys: `target` and `data`, each containing a key-value object. Alternatively, you can pass in dictionaries, and we will instantiate `EvaluationDatapoint`s with pydantic if possible
 - `executor` – the logic you want to evaluate. This function must take `data` as the first argument, and produce any output. *
-- `evaluators` – evaluaton logic.
+- `evaluators` – evaluaton logic. Functions that take output of executor as the first argument, `target` as the second argument and produce a numeric scores. Pass a dict from evaluator name to a function. Each function can produce either a single number or `dict[str, int|float]` of scores.
 
 \* If you already have the outputs of executors you want to evaluate, you can specify the executor as an identity function, that takes in `data` and returns only needed value(s) from it.
 
-### Example
+### Example code
 
 ```python
+from lmnr import evaluate
 from openai import AsyncOpenAI
 import asyncio
 import os
@@ -304,20 +304,25 @@ data = [
 ]
 
 
-def
+def correctness(output, target):
     return 1 if output == target["capital"] else 0
 
 
 # Create an Evaluation instance
-e =
-    name="
+e = evaluate(
+    name="my-evaluation",
     data=data,
     executor=get_capital,
-    evaluators=
+    evaluators={"correctness": correctness},
     project_api_key=os.environ["LMNR_PROJECT_API_KEY"],
 )
-
-# Run the evaluation
-asyncio.run(e.run())
 ```
 
+### Running from CLI.
+
+1. Make sure `lmnr` is installed in a venv. CLI does not work with a global env
+1. Run `lmnr path/to/my/eval.py`
+
+### Running from code
+
+Simply execute the function, e.g. `python3 path/to/my/eval.py`

--- lmnr-0.4.12b3/README.md
+++ lmnr-0.4.12b4/README.md
@@ -137,7 +137,7 @@ L.initialize(project_api_key=os.environ["LMNR_PROJECT_API_KEY"], instruments={In
 
 If you want to fully disable any kind of autoinstrumentation, pass an empty set as `instruments=set()` to `.initialize()`.
 
-
+Autoinstrumentations are provided by Traceloop's [OpenLLMetry](https://github.com/traceloop/openllmetry).
 
 ## Sending events
 
@@ -207,13 +207,14 @@ Evaluation takes in the following parameters:
 - `name` – the name of your evaluation. If no such evaluation exists in the project, it will be created. Otherwise, data will be pushed to the existing evaluation
 - `data` – an array of `EvaluationDatapoint` objects, where each `EvaluationDatapoint` has two keys: `target` and `data`, each containing a key-value object. Alternatively, you can pass in dictionaries, and we will instantiate `EvaluationDatapoint`s with pydantic if possible
 - `executor` – the logic you want to evaluate. This function must take `data` as the first argument, and produce any output. *
-- `evaluators` – evaluaton logic.
+- `evaluators` – evaluaton logic. Functions that take output of executor as the first argument, `target` as the second argument and produce a numeric scores. Pass a dict from evaluator name to a function. Each function can produce either a single number or `dict[str, int|float]` of scores.
 
 \* If you already have the outputs of executors you want to evaluate, you can specify the executor as an identity function, that takes in `data` and returns only needed value(s) from it.
 
-### Example
+### Example code
 
 ```python
+from lmnr import evaluate
 from openai import AsyncOpenAI
 import asyncio
 import os
@@ -244,19 +245,25 @@ data = [
 ]
 
 
-def
+def correctness(output, target):
     return 1 if output == target["capital"] else 0
 
 
 # Create an Evaluation instance
-e =
-    name="
+e = evaluate(
+    name="my-evaluation",
     data=data,
     executor=get_capital,
-    evaluators=
+    evaluators={"correctness": correctness},
     project_api_key=os.environ["LMNR_PROJECT_API_KEY"],
 )
-
-# Run the evaluation
-asyncio.run(e.run())
 ```
+
+### Running from CLI.
+
+1. Make sure `lmnr` is installed in a venv. CLI does not work with a global env
+1. Run `lmnr path/to/my/eval.py`
+
+### Running from code
+
+Simply execute the function, e.g. `python3 path/to/my/eval.py`
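
The README change above ties each evaluator to a name and lets an evaluator return either a single number or a `dict[str, int|float]` of scores. A minimal sketch of both shapes, closely following the README example in this diff; the `exactness`/`brevity` evaluators and the datapoint contents are illustrative, not part of the package:

```python
import os

from lmnr import evaluate


def exactness(output, target):
    # Single number: reported under this evaluator's dict key ("exactness").
    return 1 if output == target["capital"] else 0


def brevity(output, target):
    # Dict of scores: each key becomes its own score.
    return {"word_count": len(str(output).split()), "is_short": int(len(str(output)) < 40)}


evaluate(
    name="my-evaluation",
    data=[{"data": {"text": "Paris is the capital of France."}, "target": {"capital": "Paris"}}],
    # Identity-style executor over pre-computed text, per the README's footnote on executors.
    executor=lambda data: data["text"].split()[0],
    evaluators={"exactness": exactness, "brevity": brevity},
    project_api_key=os.environ["LMNR_PROJECT_API_KEY"],
)
```

Because evaluators are now keyed by name in a dict, lambdas no longer need the auto-generated `evaluator_{i+1}` names used in 0.4.12b3.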

--- lmnr-0.4.12b3/pyproject.toml
+++ lmnr-0.4.12b4/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "lmnr"
-version = "0.4.12b3"
+version = "0.4.12b4"
 description = "Python SDK for Laminar AI"
 authors = [
     { name = "lmnr.ai", email = "founders@lmnr.ai" }
@@ -11,7 +11,7 @@ license = "Apache-2.0"
 
 [tool.poetry]
 name = "lmnr"
-version = "0.4.12b3"
+version = "0.4.12b4"
 description = "Python SDK for Laminar AI"
 authors = ["lmnr.ai"]
 readme = "README.md"
@@ -33,7 +33,6 @@ opentelemetry-instrumentation-sqlalchemy = "^0.48b0"
 opentelemetry-instrumentation-urllib3 = "^0.48b0"
 opentelemetry-instrumentation-threading = "^0.48b0"
 opentelemetry-semantic-conventions-ai = "0.4.1"
-colorama = "^0.4"
 tenacity = "~=8.0"
 jinja2 = "~=3.0"
 deprecated = "~=1.0"

--- lmnr-0.4.12b3/src/lmnr/sdk/decorators.py
+++ lmnr-0.4.12b4/src/lmnr/sdk/decorators.py
@@ -6,6 +6,7 @@ from opentelemetry.trace import INVALID_SPAN, get_current_span
 
 from typing import Callable, Optional, cast
 
+from lmnr.traceloop_sdk.tracing.attributes import SESSION_ID, USER_ID
 from lmnr.traceloop_sdk.tracing.tracing import update_association_properties
 
 from .utils import is_async
@@ -43,11 +44,11 @@ def observe(
         if current_span != INVALID_SPAN:
             if session_id is not None:
                 current_span.set_attribute(
-
+                    SESSION_ID, session_id
                 )
             if user_id is not None:
                 current_span.set_attribute(
-
+                    USER_ID, user_id
                 )
         association_properties = {}
         if session_id is not None:
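
For reference, the change above swaps hard-coded attribute keys for the shared `SESSION_ID` / `USER_ID` constants introduced in `tracing/attributes.py`. A minimal sketch of how the decorator parameters visible in this hunk are exercised, assuming `observe` and `Laminar` are importable from the package root as in the README examples; the function and IDs are made up:

```python
import os

from lmnr import Laminar as L, observe

L.initialize(project_api_key=os.environ["LMNR_PROJECT_API_KEY"])


# session_id / user_id end up on the current span as the SESSION_ID / USER_ID
# attributes referenced in the hunk above.
@observe(session_id="session-123", user_id="user-456")
def handle_request(prompt: str) -> str:
    return prompt.upper()


handle_request("hello")
```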

--- lmnr-0.4.12b3/src/lmnr/sdk/evaluations.py
+++ lmnr-0.4.12b4/src/lmnr/sdk/evaluations.py
@@ -2,12 +2,26 @@ import asyncio
 import sys
 from abc import ABC, abstractmethod
 from contextlib import contextmanager
-from typing import Any, Awaitable, Optional, Union
+from typing import Any, Awaitable, Optional, Set, Union
+import uuid
 
 from tqdm import tqdm
 
+from ..traceloop_sdk.instruments import Instruments
+from ..traceloop_sdk.tracing.attributes import SPAN_TYPE
+
 from .laminar import Laminar as L
-from .types import
+from .types import (
+    CreateEvaluationResponse,
+    Datapoint,
+    EvaluationResultDatapoint,
+    EvaluatorFunction,
+    ExecutorFunction,
+    Numeric,
+    NumericTypes,
+    SpanType,
+    TraceType,
+)
 from .utils import is_async
 
 DEFAULT_BATCH_SIZE = 5
@@ -39,7 +53,11 @@ class EvaluationReporter:
     def start(self, name: str, project_id: str, id: str, length: int):
         print(f"Running evaluation {name}...\n")
         print(f"Check progress and results at {get_evaluation_url(project_id, id)}\n")
-        self.cli_progress = tqdm(
+        self.cli_progress = tqdm(
+            total=length,
+            bar_format="{bar} {percentage:3.0f}% | ETA: {remaining}s | {n_fmt}/{total_fmt}",
+            ncols=60,
+        )
 
     def update(self, batch_length: int):
         self.cli_progress.update(batch_length)
@@ -51,7 +69,7 @@ class EvaluationReporter:
     def stop(self, average_scores: dict[str, Numeric]):
         self.cli_progress.close()
         print("\nAverage scores:")
-        for
+        for name, score in average_scores.items():
             print(f"{name}: {score}")
         print("\n")
 
@@ -78,12 +96,14 @@ class Evaluation:
         self,
         data: Union[EvaluationDataset, list[Union[Datapoint, dict]]],
         executor: Any,
-        evaluators:
+        evaluators: dict[str, EvaluatorFunction],
         name: Optional[str] = None,
        batch_size: int = DEFAULT_BATCH_SIZE,
         project_api_key: Optional[str] = None,
         base_url: Optional[str] = None,
         http_port: Optional[int] = None,
+        grpc_port: Optional[int] = None,
+        instruments: Optional[Set[Instruments]] = None,
     ):
         """
         Initializes an instance of the Evaluations class.
@@ -114,33 +134,18 @@ class Evaluation:
                 Defaults to "https://api.lmnr.ai".
             http_port (Optional[int], optional): The port for the Laminar API HTTP service.
                 Defaults to 443.
+            instruments (Optional[Set[Instruments]], optional): Set of modules to auto-instrument.
+                Defaults to None. If None, all available instruments will be used.
         """
 
         self.is_finished = False
         self.name = name
         self.reporter = EvaluationReporter()
         self.executor = executor
-        self.evaluators =
-            zip(
-                [
-                    (
-                        e.__name__
-                        if e.__name__ and e.__name__ != "<lambda>"
-                        else f"evaluator_{i+1}"
-                    )
-                    for i, e in enumerate(evaluators)
-                ],
-                evaluators,
-            )
-        )
-        self.evaluator_names = list(self.evaluators.keys())
+        self.evaluators = evaluators
         if isinstance(data, list):
             self.data = [
-                (
-                    Datapoint.model_validate(point)
-                    if isinstance(point, dict)
-                    else point
-                )
+                (Datapoint.model_validate(point) if isinstance(point, dict) else point)
                 for point in data
             ]
         else:
@@ -150,7 +155,8 @@ class Evaluation:
             project_api_key=project_api_key,
             base_url=base_url,
             http_port=http_port,
-
+            grpc_port=grpc_port,
+            instruments=instruments,
         )
 
     def run(self) -> Union[None, Awaitable[None]]:
@@ -205,7 +211,7 @@ class Evaluation:
     async def evaluate_in_batches(self, evaluation: CreateEvaluationResponse):
         for i in range(0, len(self.data), self.batch_size):
             batch = (
-                self.data[i: i + self.batch_size]
+                self.data[i : i + self.batch_size]
                 if isinstance(self.data, list)
                 else self.data.slice(i, i + self.batch_size)
             )
@@ -217,52 +223,72 @@ class Evaluation:
             finally:
                 self.reporter.update(len(batch))
 
-    async def _evaluate_batch(
+    async def _evaluate_batch(
+        self, batch: list[Datapoint]
+    ) -> list[EvaluationResultDatapoint]:
         batch_promises = [self._evaluate_datapoint(datapoint) for datapoint in batch]
         results = await asyncio.gather(*batch_promises)
         return results
 
-    async def _evaluate_datapoint(
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+    async def _evaluate_datapoint(
+        self, datapoint: Datapoint
+    ) -> EvaluationResultDatapoint:
+        with L.start_as_current_span("evaluation") as evaluation_span:
+            L._set_trace_type(trace_type=TraceType.EVALUATION)
+            evaluation_span.set_attribute(SPAN_TYPE, SpanType.EVALUATION.value)
+            with L.start_as_current_span(
+                "executor", input={"data": datapoint.data}
+            ) as executor_span:
+                executor_span.set_attribute(SPAN_TYPE, SpanType.EXECUTOR.value)
+                output = (
+                    await self.executor(datapoint.data)
+                    if is_async(self.executor)
+                    else self.executor(datapoint.data)
+                )
+                L.set_span_output(output)
+            target = datapoint.target
+
+            # Iterate over evaluators
+            scores: dict[str, Numeric] = {}
+            for evaluator_name, evaluator in self.evaluators.items():
+                with L.start_as_current_span(
+                    "evaluator", input={"output": output, "target": target}
+                ) as evaluator_span:
+                    evaluator_span.set_attribute(SPAN_TYPE, SpanType.EVALUATOR.value)
+                    value = (
+                        await evaluator(output, target)
+                        if is_async(evaluator)
+                        else evaluator(output, target)
+                    )
+                    L.set_span_output(value)
+
+                # If evaluator returns a single number, use evaluator name as key
+                if isinstance(value, NumericTypes):
+                    scores[evaluator_name] = value
+                else:
+                    scores.update(value)
+
+            trace_id = uuid.UUID(int=evaluation_span.get_span_context().trace_id)
+            return EvaluationResultDatapoint(
+                data=datapoint.data,
+                target=target,
+                executor_output=output,
+                scores=scores,
+                trace_id=trace_id,
             )
 
-            # If evaluator returns a single number, use evaluator name as key
-            if isinstance(value, NumericTypes):
-                scores[evaluator_name] = value
-            else:
-                scores.update(value)
-
-        return EvaluationResultDatapoint(
-            data=datapoint.data,
-            target=target,
-            executorOutput=output,
-            scores=scores,
-        )
-
 
 def evaluate(
     data: Union[EvaluationDataset, list[Union[Datapoint, dict]]],
-    executor:
-    evaluators:
+    executor: ExecutorFunction,
+    evaluators: dict[str, EvaluatorFunction],
     name: Optional[str] = None,
     batch_size: int = DEFAULT_BATCH_SIZE,
    project_api_key: Optional[str] = None,
     base_url: Optional[str] = None,
     http_port: Optional[int] = None,
+    grpc_port: Optional[int] = None,
+    instruments: Optional[Set[Instruments]] = None,
 ) -> Optional[Awaitable[None]]:
     """
     If added to the file which is called through lmnr eval command, then simply registers the evaluation.
@@ -295,6 +321,10 @@ def evaluate(
             Defaults to "https://api.lmnr.ai".
         http_port (Optional[int], optional): The port for the Laminar API HTTP service.
             Defaults to 443.
+        grpc_port (Optional[int], optional): The port for the Laminar API gRPC service.
+            Defaults to 8443.
+        instruments (Optional[Set[Instruments]], optional): Set of modules to auto-instrument.
+            Defaults to None. If None, all available instruments will be used.
     """
 
     evaluation = Evaluation(
@@ -306,6 +336,8 @@ def evaluate(
         project_api_key=project_api_key,
         base_url=base_url,
         http_port=http_port,
+        grpc_port=grpc_port,
+        instruments=instruments,
     )
 
     global _evaluation
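
The evaluations.py changes above thread two new keyword arguments, `grpc_port` and `instruments`, from `evaluate()` through to `L.initialize()`. A hedged sketch of a call that uses them, with made-up datapoints and trivial executor/evaluator callables:

```python
import os

from lmnr import evaluate

evaluate(
    name="my-evaluation",
    data=[{"data": {"question": "2 + 2"}, "target": {"answer": "4"}}],
    executor=lambda data: "4",  # stand-in executor
    evaluators={"correct": lambda output, target: int(output == target["answer"])},
    project_api_key=os.environ["LMNR_PROJECT_API_KEY"],
    grpc_port=8443,      # new in 0.4.12b4; 8443 is the documented default
    instruments=set(),   # new in 0.4.12b4; an empty set disables autoinstrumentation
)
```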

--- lmnr-0.4.12b3/src/lmnr/sdk/laminar.py
+++ lmnr-0.4.12b4/src/lmnr/sdk/laminar.py
@@ -5,7 +5,6 @@ from opentelemetry.trace import (
     get_current_span,
     SpanKind,
 )
-from opentelemetry.semconv_ai import SpanAttributes
 from opentelemetry.util.types import AttributeValue
 from opentelemetry.context.context import Context
 from opentelemetry.util import types
@@ -26,7 +25,17 @@ import os
 import requests
 import uuid
 
-from lmnr.traceloop_sdk.tracing.
+from lmnr.traceloop_sdk.tracing.attributes import (
+    SESSION_ID,
+    SPAN_INPUT,
+    SPAN_OUTPUT,
+    TRACE_TYPE,
+    USER_ID,
+)
+from lmnr.traceloop_sdk.tracing.tracing import (
+    set_association_properties,
+    update_association_properties,
+)
 
 from .log import VerboseColorfulFormatter
 
@@ -37,6 +46,7 @@ from .types import (
     PipelineRunResponse,
     NodeInput,
     PipelineRunRequest,
+    TraceType,
     UpdateEvaluationResponse,
 )
 
@@ -356,8 +366,8 @@ class Laminar:
         ) as span:
             if input is not None:
                 span.set_attribute(
-
-                    json.dumps(
+                    SPAN_INPUT,
+                    json.dumps(input),
                 )
             yield span
 
@@ -371,9 +381,7 @@ class Laminar:
         """
         span = get_current_span()
         if output is not None and span != INVALID_SPAN:
-            span.set_attribute(
-                SpanAttributes.TRACELOOP_ENTITY_OUTPUT, json.dumps(output)
-            )
+            span.set_attribute(SPAN_OUTPUT, json.dumps(output))
 
     @classmethod
     def set_session(
@@ -396,9 +404,23 @@ class Laminar:
         """
         association_properties = {}
         if session_id is not None:
-            association_properties[
+            association_properties[SESSION_ID] = session_id
         if user_id is not None:
-            association_properties[
+            association_properties[USER_ID] = user_id
+        update_association_properties(association_properties)
+
+    @classmethod
+    def _set_trace_type(
+        cls,
+        trace_type: TraceType,
+    ):
+        """Set the trace_type for the current span and the context
+        Args:
+            trace_type (TraceType): Type of the trace
+        """
+        association_properties = {
+            TRACE_TYPE: trace_type.value,
+        }
         update_association_properties(association_properties)
 
     @classmethod
@@ -430,7 +452,7 @@ class Laminar:
     ) -> requests.Response:
         body = {
             "evaluationId": str(evaluation_id),
-            "points": [datapoint.
+            "points": [datapoint.to_dict() for datapoint in data],
         }
         response = requests.post(
             cls.__base_http_url + "/v1/evaluation-datapoints",
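
laminar.py now records span I/O under the `SPAN_INPUT` / `SPAN_OUTPUT` attributes and keeps session data under `SESSION_ID` / `USER_ID`. A small sketch of the public surface shown in these hunks (`set_session`, `start_as_current_span` with an `input` payload, and `set_span_output`); the span name and values are illustrative:

```python
import os

from lmnr import Laminar as L

L.initialize(project_api_key=os.environ["LMNR_PROJECT_API_KEY"])

# Associates subsequent spans with a session and user via the SESSION_ID / USER_ID
# association properties set in the hunk above.
L.set_session(session_id="session-123", user_id="user-456")

# start_as_current_span JSON-serializes `input` into SPAN_INPUT;
# set_span_output JSON-serializes its argument into SPAN_OUTPUT.
with L.start_as_current_span("summarize", input={"text": "hello world"}):
    result = "HELLO WORLD"  # stand-in for real work
    L.set_span_output(result)
```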

--- lmnr-0.4.12b3/src/lmnr/sdk/types.py
+++ lmnr-0.4.12b4/src/lmnr/sdk/types.py
@@ -1,10 +1,11 @@
 import datetime
-import
+from enum import Enum
 import pydantic
-import
+import requests
 from typing import Any, Awaitable, Callable, Literal, Optional, Union
+import uuid
 
-from .utils import
+from .utils import serialize
 
 
 class ChatMessage(pydantic.BaseModel):
@@ -37,7 +38,7 @@ class PipelineRunRequest(pydantic.BaseModel):
     def to_dict(self):
         return {
             "inputs": {
-                k: v.model_dump() if isinstance(v, pydantic.BaseModel) else
+                k: v.model_dump() if isinstance(v, pydantic.BaseModel) else serialize(v)
                 for k, v in self.inputs.items()
             },
             "pipeline": self.pipeline,
@@ -125,5 +126,37 @@ UpdateEvaluationResponse = CreateEvaluationResponse
 class EvaluationResultDatapoint(pydantic.BaseModel):
     data: EvaluationDatapointData
     target: EvaluationDatapointTarget
-
+    executor_output: ExecutorFunctionReturnType
     scores: dict[str, Numeric]
+    trace_id: uuid.UUID
+
+    # uuid is not serializable by default, so we need to convert it to a string
+    def to_dict(self):
+        return {
+            "data": {
+                k: v.model_dump() if isinstance(v, pydantic.BaseModel) else serialize(v)
+                for k, v in self.data.items()
+            },
+            "target": {
+                k: v.model_dump() if isinstance(v, pydantic.BaseModel) else serialize(v)
+                for k, v in self.target.items()
+            },
+            "executorOutput": serialize(self.executor_output),
+            "scores": self.scores,
+            "traceId": str(self.trace_id),
+        }
+
+
+class SpanType(Enum):
+    DEFAULT = "DEFAULT"
+    LLM = "LLM"
+    PIPELINE = "PIPELINE"  # must not be set manually
+    EXECUTOR = "EXECUTOR"
+    EVALUATOR = "EVALUATOR"
+    EVALUATION = "EVALUATION"
+
+
+class TraceType(Enum):
+    DEFAULT = "DEFAULT"
+    EVENT = "EVENT"  # must not be set manually
+    EVALUATION = "EVALUATION"
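
types.py adds a `trace_id` field and a camelCase `to_dict()` to `EvaluationResultDatapoint`, which laminar.py now uses when posting to `/v1/evaluation-datapoints`. A sketch of what that serialization looks like, assuming the datapoint fields accept plain dicts (they are iterated with `.items()` inside `to_dict()`) and that the class is importable from `lmnr.sdk.types`; the values are made up:

```python
import uuid

from lmnr.sdk.types import EvaluationResultDatapoint

point = EvaluationResultDatapoint(
    data={"country": "France"},
    target={"capital": "Paris"},
    executor_output="Paris",
    scores={"correctness": 1},
    trace_id=uuid.uuid4(),
)

# to_dict() stringifies the UUID and emits camelCase keys
# ("executorOutput", "traceId") for the API payload.
print(point.to_dict())
```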

--- lmnr-0.4.12b3/src/lmnr/sdk/utils.py
+++ lmnr-0.4.12b4/src/lmnr/sdk/utils.py
@@ -1,5 +1,4 @@
 import asyncio
-import copy
 import datetime
 import dataclasses
 import enum
@@ -50,7 +49,7 @@ def is_iterator(o: typing.Any) -> bool:
     return hasattr(o, "__iter__") and hasattr(o, "__next__")
 
 
-def
+def serialize(obj: typing.Any) -> dict[str, typing.Any]:
     def to_dict_inner(o: typing.Any):
         if isinstance(o, (datetime.datetime, datetime.date)):
             return o.strftime("%Y-%m-%dT%H:%M:%S.%f%z")
@@ -59,7 +58,7 @@ def to_dict(obj: typing.Any) -> dict[str, typing.Any]:
         elif isinstance(o, (int, float, str, bool)):
             return o
         elif isinstance(o, uuid.UUID):
-            return str(o)  # same as in return, but explicit
+            return str(o)  # same as in final return, but explicit
         elif isinstance(o, enum.Enum):
             return o.value
         elif dataclasses.is_dataclass(o):
@@ -90,11 +89,11 @@ def get_input_from_func_args(
 ) -> dict[str, typing.Any]:
     # Remove implicitly passed "self" or "cls" argument for
     # instance or class methods
-    res = copy
+    res = func_kwargs.copy()
     for i, k in enumerate(inspect.signature(func).parameters.keys()):
         if is_method and k in ["self", "cls"]:
             continue
         # If param has default value, then it's not present in func args
-        if len(func_args)
+        if i < len(func_args):
            res[k] = func_args[i]
     return res