netra-sdk 0.1.19__tar.gz → 0.1.20__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of netra-sdk might be problematic.

Files changed (47)
  1. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/PKG-INFO +116 -97
  2. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/README.md +113 -96
  3. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/pii.py +152 -4
  4. netra_sdk-0.1.20/netra/version.py +1 -0
  5. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/pyproject.toml +3 -1
  6. netra_sdk-0.1.19/netra/version.py +0 -1
  7. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/LICENCE +0 -0
  8. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/__init__.py +0 -0
  9. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/anonymizer/__init__.py +0 -0
  10. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/anonymizer/anonymizer.py +0 -0
  11. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/anonymizer/base.py +0 -0
  12. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/anonymizer/fp_anonymizer.py +0 -0
  13. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/config.py +0 -0
  14. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/decorators.py +0 -0
  15. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/exceptions/__init__.py +0 -0
  16. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/exceptions/injection.py +0 -0
  17. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/exceptions/pii.py +0 -0
  18. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/input_scanner.py +0 -0
  19. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/__init__.py +0 -0
  20. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/aiohttp/__init__.py +0 -0
  21. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/aiohttp/version.py +0 -0
  22. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/cohere/__init__.py +0 -0
  23. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/cohere/version.py +0 -0
  24. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/fastapi/__init__.py +0 -0
  25. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/fastapi/version.py +0 -0
  26. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/google_genai/__init__.py +0 -0
  27. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/google_genai/config.py +0 -0
  28. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/google_genai/utils.py +0 -0
  29. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/google_genai/version.py +0 -0
  30. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/httpx/__init__.py +0 -0
  31. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/httpx/version.py +0 -0
  32. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/instruments.py +0 -0
  33. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/mistralai/__init__.py +0 -0
  34. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/mistralai/config.py +0 -0
  35. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/mistralai/utils.py +0 -0
  36. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/mistralai/version.py +0 -0
  37. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/openai/__init__.py +0 -0
  38. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/openai/version.py +0 -0
  39. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/openai/wrappers.py +0 -0
  40. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/weaviate/__init__.py +0 -0
  41. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/instrumentation/weaviate/version.py +0 -0
  42. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/processors/__init__.py +0 -0
  43. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/processors/session_span_processor.py +0 -0
  44. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/scanner.py +0 -0
  45. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/session_manager.py +0 -0
  46. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/span_wrapper.py +0 -0
  47. {netra_sdk-0.1.19 → netra_sdk-0.1.20}/netra/tracer.py +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: netra-sdk
- Version: 0.1.19
+ Version: 0.1.20
  Summary: A Python SDK for AI application observability that provides OpenTelemetry-based monitoring, tracing, and PII protection for LLM and vector database applications. Enables easy instrumentation, session tracking, and privacy-focused data collection for AI systems in production environments.
  License: Apache-2.0
  Keywords: netra,tracing,observability,sdk,ai,llm,vector,database
@@ -69,7 +69,9 @@ Requires-Dist: opentelemetry-instrumentation-urllib3 (>=0.55b1,<1.0.0)
  Requires-Dist: opentelemetry-sdk (>=1.34.0,<2.0.0)
  Requires-Dist: presidio-analyzer (==2.2.358) ; extra == "presidio"
  Requires-Dist: presidio-anonymizer (==2.2.358) ; extra == "presidio"
+ Requires-Dist: stanza (>=1.10.1,<2.0.0) ; extra == "presidio"
  Requires-Dist: traceloop-sdk (>=0.40.7,<0.43.0)
+ Requires-Dist: transformers (==4.51.3) ; extra == "presidio"
  Project-URL: Bug Tracker, https://github.com/KeyValueSoftwareSystems/netra-sdk-py/issues
  Project-URL: Documentation, https://github.com/KeyValueSoftwareSystems/netra-sdk-py/blob/main/README.md
  Project-URL: Homepage, https://github.com/KeyValueSoftwareSystems/netra-sdk-py
@@ -331,6 +333,119 @@ print(f"Masked text: {result.masked_text}")
  print(f"PII entities: {result.pii_entities}")
  ```

+ #### Custom Models for PII Detection
+
+ The `PresidioPIIDetector` supports custom NLP models through the `nlp_configuration` parameter, allowing you to use specialized models for improved PII detection accuracy. You can configure custom spaCy, Stanza, or transformers models:
+
+ ##### NLP Configuration Example
+
+ Follow this configuration structure to provide your custom models.
+ ```python
+ nlp_configuration = {
+     "nlp_engine_name": "spacy|stanza|transformers",
+     "models": [
+         {
+             "lang_code": "en",                # Language code
+             "model_name": "model_identifier"  # Varies by engine type
+         }
+     ],
+     "ner_model_configuration": {  # Optional, mainly for transformers
+         # Additional configuration options
+     }
+ }
+ ```
+
+ ##### Using Custom spaCy Models
+
+ ```python
+ from netra.pii import PresidioPIIDetector
+
+ # Configure custom spaCy model
+ spacy_config = {
+     "nlp_engine_name": "spacy",
+     "models": [{"lang_code": "en", "model_name": "en_core_web_lg"}]
+ }
+
+ detector = PresidioPIIDetector(
+     nlp_configuration=spacy_config,
+     action_type="MASK",
+     score_threshold=0.8
+ )
+
+ text = "Dr. Sarah Wilson works at 123 Main St, New York"
+ result = detector.detect(text)
+ print(f"Detected entities: {result.pii_entities}")
+ ```
+
+ ##### Using Stanza Models
+
+ ```python
+ from netra.pii import PresidioPIIDetector
+
+ # Configure Stanza model
+ stanza_config = {
+     "nlp_engine_name": "stanza",
+     "models": [{"lang_code": "en", "model_name": "en"}]
+ }
+
+ detector = PresidioPIIDetector(
+     nlp_configuration=stanza_config,
+     action_type="FLAG"
+ )
+
+ text = "Contact Alice Smith at alice@company.com"
+ result = detector.detect(text)
+ print(f"PII detected: {result.has_pii}")
+ ```
+
+ ##### Using Transformers Models
+
+ For advanced NER capabilities, you can use transformer-based models:
+
+ ```python
+ from netra.pii import PresidioPIIDetector
+
+ # Configure transformers model with entity mapping
+ transformers_config = {
+     "nlp_engine_name": "transformers",
+     "models": [{
+         "lang_code": "en",
+         "model_name": {
+             "spacy": "en_core_web_sm",
+             "transformers": "dbmdz/bert-large-cased-finetuned-conll03-english"
+         }
+     }],
+     "ner_model_configuration": {
+         "labels_to_ignore": ["O"],
+         "model_to_presidio_entity_mapping": {
+             "PER": "PERSON",
+             "LOC": "LOCATION",
+             "ORG": "ORGANIZATION",
+             "MISC": "MISC"
+         },
+         "low_confidence_score_multiplier": 0.4,
+         "low_score_entity_names": ["ORG"]
+     }
+ }
+
+ detector = PresidioPIIDetector(
+     nlp_configuration=transformers_config,
+     action_type="MASK"
+ )
+
+ text = "Microsoft Corporation is located in Redmond, Washington"
+ result = detector.detect(text)
+ print(f"Masked text: {result.masked_text}")
+ ```
+
+ **Note**: Custom model configuration allows for:
+ - **Better accuracy** with domain-specific models
+ - **Multi-language support** by specifying different language codes
+ - **Fine-tuned models** trained on your specific data
+ - **Performance optimization** by choosing models suited to your use case
+
  #### Regex-based Detection
  ```python
  from netra.pii import RegexPIIDetector
@@ -555,102 +670,6 @@ Configuration values are resolved in the following order (highest to lowest prec
  4. **Default Values**: Fallback values defined in the SDK

  This allows you to:
- - Override any setting directly in code for maximum control
- - Use Netra-specific environment variables for Netra-specific settings
- - Fall back to standard OpenTelemetry variables for compatibility
- - Rely on sensible defaults when no other configuration is provided
-
- **Example**:
- ```bash
- export NETRA_APP_NAME="my-ai-service"
- export NETRA_OTLP_ENDPOINT="https://collector.example.com:4318"
- export NETRA_API_KEY="your-api-key-here"
- export NETRA_ENV="production"
- export NETRA_RESOURCE_ATTRS='{"team":"ai", "version":"1.0.0"}'
- ```
-
- ### Programmatic Configuration
-
- You can also configure the SDK programmatically when initializing:
-
- ```python
- from netra import Netra
- from netra.instrumentation.instruments import InstrumentSet
-
- Netra.init(
-     app_name="my-ai-service",
-     environment="production",
-     resource_attributes={"team": "ai", "version": "1.0.0"},
-     trace_content=True,
-     disable_batch=False,
-     instruments={InstrumentSet.OPENAI}
- )
- ```
-
- ### Custom Instrumentation Selection
-
- Control which instrumentations are enabled:
-
- ```python
- from netra import Netra
- from netra.instrumentation.instruments import InstrumentSet
-
- # Enable specific instruments
- Netra.init(
-     app_name="Selective App",
-     instruments={
-         InstrumentSet.OPENAI,
-         InstrumentSet.WEAVIATEDB,
-         InstrumentSet.FASTAPI
-     }
- )
-
- # Block specific instruments
- Netra.init(
-     app_name="Blocked App",
-     block_instruments={
-         InstrumentSet.HTTPX,  # Don't trace HTTPX calls
-         InstrumentSet.REDIS   # Don't trace Redis operations
-     }
- )
- ```
-
- ### 🌐 Custom Endpoint Integration
-
- Since Netra SDK follows the **OpenTelemetry standard**, you can integrate it with any OpenTelemetry-compatible observability backend:
-
- #### Popular OpenTelemetry Backends
- - **Jaeger** - Distributed tracing platform
- - **Zipkin** - Distributed tracing system
- - **Prometheus** - Monitoring and alerting toolkit
- - **Grafana** - Observability and data visualization
- - **New Relic** - Full-stack observability platform
- - **Datadog** - Monitoring and analytics platform
- - **Honeycomb** - Observability for complex systems
- - **Lightstep** - Distributed tracing and observability
- - **AWS X-Ray** - Distributed tracing service
- - **Google Cloud Trace** - Distributed tracing system
-
- #### Custom Endpoint Configuration
-
- **Recommended: Environment Variable Configuration (No Code Changes Required)**
- ```bash
- # Set custom OTLP endpoint via environment variables
- export NETRA_OTLP_ENDPOINT="https://your-custom-backend.com/v1/traces"
- export NETRA_HEADERS="authorization=Bearer your-token"
- ```
-
- ```python
- from netra import Netra
- from netra.instrumentation.instruments import InstrumentSet
-
- # Simple initialization - SDK automatically picks up environment variables
- Netra.init(app_name="Your App", instruments={InstrumentSet})
- # No endpoint configuration needed in code!
- ```
-
- #### Benefits of OpenTelemetry Compatibility
  - **🔄 Vendor Agnostic**: Switch between observability platforms without code changes
  - **📊 Standard Format**: Consistent telemetry data across all tools
  - **🔧 Flexible Integration**: Works with existing observability infrastructure
@@ -253,6 +253,119 @@ print(f"Masked text: {result.masked_text}")
  print(f"PII entities: {result.pii_entities}")
  ```

+ #### Custom Models for PII Detection
+
+ The `PresidioPIIDetector` supports custom NLP models through the `nlp_configuration` parameter, allowing you to use specialized models for improved PII detection accuracy. You can configure custom spaCy, Stanza, or transformers models:
+
+ ##### NLP Configuration Example
+
+ Follow this configuration structure to provide your custom models.
+ ```python
+ nlp_configuration = {
+     "nlp_engine_name": "spacy|stanza|transformers",
+     "models": [
+         {
+             "lang_code": "en",                # Language code
+             "model_name": "model_identifier"  # Varies by engine type
+         }
+     ],
+     "ner_model_configuration": {  # Optional, mainly for transformers
+         # Additional configuration options
+     }
+ }
+ ```
+
+ ##### Using Custom spaCy Models
+
+ ```python
+ from netra.pii import PresidioPIIDetector
+
+ # Configure custom spaCy model
+ spacy_config = {
+     "nlp_engine_name": "spacy",
+     "models": [{"lang_code": "en", "model_name": "en_core_web_lg"}]
+ }
+
+ detector = PresidioPIIDetector(
+     nlp_configuration=spacy_config,
+     action_type="MASK",
+     score_threshold=0.8
+ )
+
+ text = "Dr. Sarah Wilson works at 123 Main St, New York"
+ result = detector.detect(text)
+ print(f"Detected entities: {result.pii_entities}")
+ ```
+
+ ##### Using Stanza Models
+
+ ```python
+ from netra.pii import PresidioPIIDetector
+
+ # Configure Stanza model
+ stanza_config = {
+     "nlp_engine_name": "stanza",
+     "models": [{"lang_code": "en", "model_name": "en"}]
+ }
+
+ detector = PresidioPIIDetector(
+     nlp_configuration=stanza_config,
+     action_type="FLAG"
+ )
+
+ text = "Contact Alice Smith at alice@company.com"
+ result = detector.detect(text)
+ print(f"PII detected: {result.has_pii}")
+ ```
+
+ ##### Using Transformers Models
+
+ For advanced NER capabilities, you can use transformer-based models:
+
+ ```python
+ from netra.pii import PresidioPIIDetector
+
+ # Configure transformers model with entity mapping
+ transformers_config = {
+     "nlp_engine_name": "transformers",
+     "models": [{
+         "lang_code": "en",
+         "model_name": {
+             "spacy": "en_core_web_sm",
+             "transformers": "dbmdz/bert-large-cased-finetuned-conll03-english"
+         }
+     }],
+     "ner_model_configuration": {
+         "labels_to_ignore": ["O"],
+         "model_to_presidio_entity_mapping": {
+             "PER": "PERSON",
+             "LOC": "LOCATION",
+             "ORG": "ORGANIZATION",
+             "MISC": "MISC"
+         },
+         "low_confidence_score_multiplier": 0.4,
+         "low_score_entity_names": ["ORG"]
+     }
+ }
+
+ detector = PresidioPIIDetector(
+     nlp_configuration=transformers_config,
+     action_type="MASK"
+ )
+
+ text = "Microsoft Corporation is located in Redmond, Washington"
+ result = detector.detect(text)
+ print(f"Masked text: {result.masked_text}")
+ ```
+
+ **Note**: Custom model configuration allows for:
+ - **Better accuracy** with domain-specific models
+ - **Multi-language support** by specifying different language codes
+ - **Fine-tuned models** trained on your specific data
+ - **Performance optimization** by choosing models suited to your use case
+
  #### Regex-based Detection
  ```python
  from netra.pii import RegexPIIDetector
@@ -477,102 +590,6 @@ Configuration values are resolved in the following order (highest to lowest prec
  4. **Default Values**: Fallback values defined in the SDK

  This allows you to:
- - Override any setting directly in code for maximum control
- - Use Netra-specific environment variables for Netra-specific settings
- - Fall back to standard OpenTelemetry variables for compatibility
- - Rely on sensible defaults when no other configuration is provided
-
- **Example**:
- ```bash
- export NETRA_APP_NAME="my-ai-service"
- export NETRA_OTLP_ENDPOINT="https://collector.example.com:4318"
- export NETRA_API_KEY="your-api-key-here"
- export NETRA_ENV="production"
- export NETRA_RESOURCE_ATTRS='{"team":"ai", "version":"1.0.0"}'
- ```
-
- ### Programmatic Configuration
-
- You can also configure the SDK programmatically when initializing:
-
- ```python
- from netra import Netra
- from netra.instrumentation.instruments import InstrumentSet
-
- Netra.init(
-     app_name="my-ai-service",
-     environment="production",
-     resource_attributes={"team": "ai", "version": "1.0.0"},
-     trace_content=True,
-     disable_batch=False,
-     instruments={InstrumentSet.OPENAI}
- )
- ```
-
- ### Custom Instrumentation Selection
-
- Control which instrumentations are enabled:
-
- ```python
- from netra import Netra
- from netra.instrumentation.instruments import InstrumentSet
-
- # Enable specific instruments
- Netra.init(
-     app_name="Selective App",
-     instruments={
-         InstrumentSet.OPENAI,
-         InstrumentSet.WEAVIATEDB,
-         InstrumentSet.FASTAPI
-     }
- )
-
- # Block specific instruments
- Netra.init(
-     app_name="Blocked App",
-     block_instruments={
-         InstrumentSet.HTTPX,  # Don't trace HTTPX calls
-         InstrumentSet.REDIS   # Don't trace Redis operations
-     }
- )
- ```
-
- ### 🌐 Custom Endpoint Integration
-
- Since Netra SDK follows the **OpenTelemetry standard**, you can integrate it with any OpenTelemetry-compatible observability backend:
-
- #### Popular OpenTelemetry Backends
- - **Jaeger** - Distributed tracing platform
- - **Zipkin** - Distributed tracing system
- - **Prometheus** - Monitoring and alerting toolkit
- - **Grafana** - Observability and data visualization
- - **New Relic** - Full-stack observability platform
- - **Datadog** - Monitoring and analytics platform
- - **Honeycomb** - Observability for complex systems
- - **Lightstep** - Distributed tracing and observability
- - **AWS X-Ray** - Distributed tracing service
- - **Google Cloud Trace** - Distributed tracing system
-
- #### Custom Endpoint Configuration
-
- **Recommended: Environment Variable Configuration (No Code Changes Required)**
- ```bash
- # Set custom OTLP endpoint via environment variables
- export NETRA_OTLP_ENDPOINT="https://your-custom-backend.com/v1/traces"
- export NETRA_HEADERS="authorization=Bearer your-token"
- ```
-
- ```python
- from netra import Netra
- from netra.instrumentation.instruments import InstrumentSet
-
- # Simple initialization - SDK automatically picks up environment variables
- Netra.init(app_name="Your App", instruments={InstrumentSet})
- # No endpoint configuration needed in code!
- ```
-
- #### Benefits of OpenTelemetry Compatibility
  - **🔄 Vendor Agnostic**: Switch between observability platforms without code changes
  - **📊 Standard Format**: Consistent telemetry data across all tools
  - **🔧 Flexible Integration**: Works with existing observability infrastructure
@@ -577,7 +577,7 @@ class PresidioPIIDetector(PIIDetector):
      call Presidio's Analyzer + Anonymizer on a string.

      Examples:
-         # Using default hash function
+         # Using default configuration
          detector = PresidioPIIDetector()
          result = detector.detect("My email is john@example.com")

@@ -592,6 +592,41 @@ class PresidioPIIDetector(PIIDetector):
              action_type="MASK",
              score_threshold=0.8
          )
+
+         # Using custom spaCy model configuration
+         spacy_config = {
+             "nlp_engine_name": "spacy",
+             "models": [{"lang_code": "en", "model_name": "en_core_web_lg"}]
+         }
+         detector = PresidioPIIDetector(nlp_configuration=spacy_config)
+
+         # Using Stanza model configuration
+         stanza_config = {
+             "nlp_engine_name": "stanza",
+             "models": [{"lang_code": "en", "model_name": "en"}]
+         }
+         detector = PresidioPIIDetector(nlp_configuration=stanza_config)
+
+         # Using transformers model configuration
+         transformers_config = {
+             "nlp_engine_name": "transformers",
+             "models": [{
+                 "lang_code": "en",
+                 "model_name": {
+                     "spacy": "en_core_web_sm",
+                     "transformers": "dbmdz/bert-large-cased-finetuned-conll03-english"
+                 }
+             }],
+             "ner_model_configuration": {
+                 "labels_to_ignore": ["O"],
+                 "model_to_presidio_entity_mapping": {
+                     "PER": "PERSON",
+                     "LOC": "LOCATION",
+                     "ORG": "ORGANIZATION"
+                 }
+             }
+         }
+         detector = PresidioPIIDetector(nlp_configuration=transformers_config)
      """

      def __init__(
@@ -602,7 +637,35 @@ class PresidioPIIDetector(PIIDetector):
          action_type: Optional[Literal["BLOCK", "FLAG", "MASK"]] = None,
          anonymizer_cache_size: int = 1000,
          hash_function: Optional[Callable[[str], str]] = None,
+         nlp_configuration: Optional[Dict[str, Any]] = None,
      ) -> None:
+         """
+         Initialize the Presidio PII detector.
+
+         Args:
+             entities: List of entity types to detect. If None, uses DEFAULT_ENTITIES.
+             language: Language code for detection (default: "en").
+             score_threshold: Minimum confidence score for detections (default: 0.6).
+             action_type: Action to take when PII is detected ("BLOCK", "FLAG", "MASK").
+             anonymizer_cache_size: Size of the anonymizer cache (default: 1000).
+             hash_function: Custom hash function for anonymization.
+             nlp_configuration: Dictionary containing NLP engine configuration.
+                 Format: {
+                     "nlp_engine_name": "spacy|stanza|transformers",
+                     "models": [{"lang_code": "en", "model_name": "model_name"}],
+                     "ner_model_configuration": {...}  # Optional, for transformers
+                 }
+
+                 For spaCy and Stanza:
+                 - model_name should be a string (e.g., "en_core_web_lg", "en")
+
+                 For transformers:
+                 - model_name should be a dict with "spacy" and "transformers" keys
+                 - Example: {"spacy": "en_core_web_sm", "transformers": "model_path"}
+
+         Raises:
+             ImportError: If presidio-analyzer is not installed or required NLP library is missing.
+         """
          if action_type is None:
              action_type = "FLAG"
              env_action = os.getenv("NETRA_ACTION_TYPE", "FLAG")
@@ -610,18 +673,99 @@ class PresidioPIIDetector(PIIDetector):
              if env_action in ["BLOCK", "FLAG", "MASK"]:
                  action_type = cast(Literal["BLOCK", "FLAG", "MASK"], env_action)
          super().__init__(action_type=action_type)
+
+         # Import presidio-analyzer
          try:
              from presidio_analyzer import AnalyzerEngine  # noqa: F401
          except ImportError as exc:
-             raise ImportError("Presidio-based PII detection requires: presidio-analyzer. " "Install via pip.") from exc
+             raise ImportError("Presidio-based PII detection requires: presidio-analyzer. Install via pip.") from exc

          self.language: str = language
          self.entities: Optional[List[str]] = entities if entities else DEFAULT_ENTITIES
          self.score_threshold: float = score_threshold

-         self.analyzer = AnalyzerEngine()
+         # Initialize AnalyzerEngine with custom or default NLP engine
+         if nlp_configuration is not None:
+             self.analyzer = self._create_analyzer_with_custom_nlp(nlp_configuration)
+         else:
+             # Use default AnalyzerEngine
+             self.analyzer = AnalyzerEngine()
+
          self.anonymizer = Anonymizer(hash_function=hash_function, cache_size=anonymizer_cache_size)

+     def _create_analyzer_with_custom_nlp(self, nlp_configuration: Dict[str, Any]) -> Any:
+         """
+         Create an AnalyzerEngine with custom NLP configuration.
+
+         Args:
+             nlp_configuration: Dictionary containing NLP engine configuration.
+
+         Returns:
+             AnalyzerEngine instance with custom NLP engine.
+
+         Raises:
+             ImportError: If required NLP library is not available.
+         """
+         try:
+             from presidio_analyzer import AnalyzerEngine
+             from presidio_analyzer.nlp_engine import NlpEngineProvider
+         except ImportError as exc:
+             raise ImportError("Presidio-based PII detection requires: presidio-analyzer. Install via pip.") from exc
+
+         # Validate and prepare configuration
+         engine_name = nlp_configuration.get("nlp_engine_name", "").lower()
+
+         # Perform lazy imports based on engine type
+         if engine_name == "spacy":
+             self._ensure_spacy_available()
+         elif engine_name == "stanza":
+             self._ensure_stanza_available()
+         elif engine_name == "transformers":
+             self._ensure_transformers_available()
+         else:
+             # Default behavior - let Presidio handle it
+             pass
+
+         # Create NLP engine from configuration
+         provider = NlpEngineProvider(nlp_configuration=nlp_configuration)
+         custom_nlp_engine = provider.create_engine()
+
+         # Extract supported languages from configuration
+         supported_languages = [self.language]
+         if "models" in nlp_configuration:
+             supported_languages = [model["lang_code"] for model in nlp_configuration["models"]]
+
+         return AnalyzerEngine(nlp_engine=custom_nlp_engine, supported_languages=supported_languages)
+
+     def _ensure_spacy_available(self) -> None:
+         """Ensure spaCy is available when needed."""
+         try:
+             import spacy  # noqa: F401
+         except ImportError as exc:
+             raise ImportError(
+                 "spaCy is required for spaCy-based PII detection. Install via: pip install spacy"
+             ) from exc
+
+     def _ensure_stanza_available(self) -> None:
+         """Ensure Stanza is available when needed."""
+         try:
+             import stanza  # noqa: F401
+         except ImportError as exc:
+             raise ImportError(
+                 "Stanza is required for Stanza-based PII detection. Install via: pip install stanza"
+             ) from exc
+
+     def _ensure_transformers_available(self) -> None:
+         """Ensure transformers is available when needed."""
+         try:
+             import torch  # noqa: F401
+             import transformers  # noqa: F401
+         except ImportError as exc:
+             raise ImportError(
+                 "Transformers and PyTorch are required for transformers-based PII detection. "
+                 "Install via: pip install transformers torch"
+             ) from exc
+
      def _detect_pii(self, text: str) -> Tuple[bool, Counter[str], str, Dict[str, str]]:
          """
          Detect PII in a single message.
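This hunk routes `nlp_configuration` through `__init__` and lazily validates the chosen engine before building the analyzer. A minimal usage sketch of that path (the spaCy engine and the `en_core_web_lg` model are illustrative choices, not requirements):

```python
from netra.pii import PresidioPIIDetector

# Sketch only: assumes the "presidio" extra and a spaCy model such as
# en_core_web_lg are installed; otherwise the lazy checks above raise ImportError.
spacy_config = {
    "nlp_engine_name": "spacy",
    "models": [{"lang_code": "en", "model_name": "en_core_web_lg"}],
}

try:
    detector = PresidioPIIDetector(nlp_configuration=spacy_config, action_type="MASK")
    result = detector.detect("Reach Jane Doe at jane@example.com")
    print(result.masked_text)
except ImportError as exc:
    # Raised by the presidio import guard or _ensure_spacy_available()
    # when the optional dependencies are missing.
    print(f"Optional NLP dependency missing: {exc}")
```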
@@ -666,6 +810,7 @@ def get_default_detector(
      action_type: Optional[Literal["BLOCK", "FLAG", "MASK"]] = None,
      entities: Optional[List[str]] = None,
      hash_function: Optional[Callable[[str], str]] = None,
+     nlp_configuration: Optional[Dict[str, Any]] = None,
  ) -> PIIDetector:
      """
      Returns a default PII detector instance (Presidio-based by default).
@@ -678,8 +823,11 @@ def get_default_detector(
              - "MASK": Replace PII with mask tokens (default)
          entities: Optional list of entity types to detect. If None, uses Presidio's default entities
          hash_function: Optional custom hash function for anonymization. If None, uses default hash function.
+         nlp_configuration: Dictionary containing NLP engine configuration for custom models.
      """
-     return PresidioPIIDetector(action_type=action_type, entities=entities, hash_function=hash_function)
+     return PresidioPIIDetector(
+         action_type=action_type, entities=entities, hash_function=hash_function, nlp_configuration=nlp_configuration
+     )


  # ---------------------------------------------------------------------------- #
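The module-level helper forwards the same parameter, so callers using `get_default_detector` can opt into a custom engine without constructing the detector class directly. A short sketch, assuming the helper is importable from `netra.pii` alongside the detectors:

```python
from netra.pii import get_default_detector

# Sketch: the Stanza "en" model mirrors the example configuration in the README diff.
stanza_config = {
    "nlp_engine_name": "stanza",
    "models": [{"lang_code": "en", "model_name": "en"}],
}

detector = get_default_detector(action_type="FLAG", nlp_configuration=stanza_config)
result = detector.detect("Contact Alice Smith at alice@company.com")
print(result.has_pii)
```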
@@ -0,0 +1 @@
+ __version__ = "0.1.20"
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"

  [project]
  name = "netra-sdk"
- version = "0.1.19"
+ version = "0.1.20"
  description = "A Python SDK for AI application observability that provides OpenTelemetry-based monitoring, tracing, and PII protection for LLM and vector database applications. Enables easy instrumentation, session tracking, and privacy-focused data collection for AI systems in production environments."
  authors = [
      {name = "Sooraj Thomas",email = "sooraj@keyvalue.systems"}
@@ -95,6 +95,8 @@ llm_guard = [
  presidio = [
      "presidio-analyzer==2.2.358",
      "presidio-anonymizer==2.2.358",
+     "transformers==4.51.3",
+     "stanza>=1.10.1,<2.0.0"
  ]

  [tool.poetry.group.dev.dependencies]
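Because `transformers` and `stanza` ship only with the optional `presidio` extra, a quick way to confirm they are present at runtime is a standard-library check; a sketch (module names mirror the distributions pinned above):

```python
import importlib.util

# Optional packages pulled in by the "presidio" extra in 0.1.20.
for module in ("presidio_analyzer", "presidio_anonymizer", "transformers", "stanza"):
    status = "available" if importlib.util.find_spec(module) else "missing"
    print(f"{module}: {status}")
```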
@@ -1 +0,0 @@
- __version__ = "0.1.19"