PyPI - auris_tools - Versions diffs - 0.0.3__tar.gz - Mend

auris_tools 0.0.3__tar.gz


Files changed (11)

auris_tools-0.0.3/PKG-INFO +97 -0
auris_tools-0.0.3/README.md +77 -0
auris_tools-0.0.3/auris_tools/__init__.py +0 -0
auris_tools-0.0.3/auris_tools/configuration.py +81 -0
auris_tools-0.0.3/auris_tools/databaseHandlers.py +132 -0
auris_tools-0.0.3/auris_tools/geminiHandler.py +246 -0
auris_tools-0.0.3/auris_tools/officeWordHandler.py +271 -0
auris_tools-0.0.3/auris_tools/storageHandler.py +195 -0
auris_tools-0.0.3/auris_tools/textractHandler.py +169 -0
auris_tools-0.0.3/auris_tools/utils.py +120 -0
auris_tools-0.0.3/pyproject.toml +64 -0

auris_tools-0.0.3/PKG-INFO ADDED

@@ -0,0 +1,97 @@
+Metadata-Version: 2.3
+Name: auris_tools
+Version: 0.0.3
+Summary: A Swiss Army knife of tools for coordinating cloud frameworks with ease on Auris platforms
+Author: Antonio Senra
+Author-email: acsenrafilho@gmail.com
+Requires-Python: >=3.10,<4.0
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Requires-Dist: boto3 (>=1.40.29,<2.0.0)
+Requires-Dist: dotenv (>=0.9.9,<0.10.0)
+Requires-Dist: google-generativeai (>=0.8.5,<0.9.0)
+Requires-Dist: python-docx (>=1.2.0,<2.0.0)
+Requires-Dist: rich (>=14.1.0,<15.0.0)
+Description-Content-Type: text/markdown
+# auris-tools
+[![PyPI version](https://img.shields.io/pypi/v/auris-tools.svg)](https://pypi.org/project/auris-tools/)
+[![Documentation Status](https://readthedocs.org/projects/auris-tools/badge/?version=latest)](https://auris-tools.readthedocs.io/en/latest/?badge=latest)
+[![CI for Develop Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=develop)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+[![CI for Main Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=main)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+[![codecov](https://codecov.io/gh/AurisAASI/auris-tools/graph/badge.svg?token=08891W8HP2)](https://codecov.io/gh/AurisAASI/auris-tools)
+A Swiss Army knife of tools for coordinating cloud frameworks with ease on Auris platforms
+## Installation
+This project requires **Python 3.10 or later** and uses [Poetry](https://python-poetry.org/) for dependency management.
+1. **Clone the repository:**
+   ```bash
+   git clone https://github.com/AurisAASI/auris-tools.git
+   cd auris-tools
+   ```
+2. **Install Poetry (if not already installed):**
+   ```bash
+   pip install poetry
+   ```
+3. **Install dependencies:**
+   ```bash
+   poetry install
+   ```
+---
+## Project Structure
+The main classes and modules are organized as follows:
+```
+/auris_tools
+├── __init__.py
+├── configuration.py         # AWS configuration utilities
+├── databaseHandlers.py      # DynamoDB handler class
+├── officeWordHandler.py     # Office Word document handler
+├── storageHandler.py        # AWS S3 storage handler
+├── textractHandler.py       # AWS Textract handler
+├── utils.py                 # Utility functions
+├── geminiHandler.py         # Google Gemini AI handler
+```
+---
+## Testing & Linting
+- **Run all tests:**
+  ```bash
+  task test
+  ```
+- **Run linter (blue and isort):**
+  ```bash
+  task lint
+  ```
+Test coverage and linting are enforced in CI. Make sure all tests pass and code is linted before submitting a PR.
+## Documentation
+We use MkDocs with the Material theme for our documentation:
+- **Run documentation server locally:**
+  ```bash
+  task docs
+  ```
+- **Build documentation:**
+  ```bash
+  task docs-build
+  ```
+The documentation is automatically published to Read the Docs when changes are pushed to the main branch.
+---

auris_tools-0.0.3/README.md ADDED

@@ -0,0 +1,77 @@
+# auris-tools
+[![PyPI version](https://img.shields.io/pypi/v/auris-tools.svg)](https://pypi.org/project/auris-tools/)
+[![Documentation Status](https://readthedocs.org/projects/auris-tools/badge/?version=latest)](https://auris-tools.readthedocs.io/en/latest/?badge=latest)
+[![CI for Develop Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=develop)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+[![CI for Main Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=main)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+[![codecov](https://codecov.io/gh/AurisAASI/auris-tools/graph/badge.svg?token=08891W8HP2)](https://codecov.io/gh/AurisAASI/auris-tools)
+A Swiss Army knife of tools for coordinating cloud frameworks with ease on Auris platforms
+## Installation
+This project requires **Python 3.10 or later** and uses [Poetry](https://python-poetry.org/) for dependency management.
+1. **Clone the repository:**
+   ```bash
+   git clone https://github.com/AurisAASI/auris-tools.git
+   cd auris-tools
+   ```
+2. **Install Poetry (if not already installed):**
+   ```bash
+   pip install poetry
+   ```
+3. **Install dependencies:**
+   ```bash
+   poetry install
+   ```
+---
+## Project Structure
+The main classes and modules are organized as follows:
+```
+/auris_tools
+├── __init__.py
+├── configuration.py         # AWS configuration utilities
+├── databaseHandlers.py      # DynamoDB handler class
+├── officeWordHandler.py     # Office Word document handler
+├── storageHandler.py        # AWS S3 storage handler
+├── textractHandler.py       # AWS Textract handler
+├── utils.py                 # Utility functions
+├── geminiHandler.py         # Google Gemini AI handler
+```
+---
+## Testing & Linting
+- **Run all tests:**
+  ```bash
+  task test
+  ```
+- **Run linter (blue and isort):**
+  ```bash
+  task lint
+  ```
+Test coverage and linting are enforced in CI. Make sure all tests pass and code is linted before submitting a PR.
+## Documentation
+We use MkDocs with the Material theme for our documentation:
+- **Run documentation server locally:**
+  ```bash
+  task docs
+  ```
+- **Build documentation:**
+  ```bash
+  task docs-build
+  ```
+The documentation is automatically published to Read the Docs when changes are pushed to the main branch.
+---

auris_tools-0.0.3/auris_tools/__init__.py ADDED

File without changes

auris_tools-0.0.3/auris_tools/configuration.py ADDED

@@ -0,0 +1,81 @@
+import logging
+import os
+from dotenv import load_dotenv
+# Load environment variables from .env file
+load_dotenv()
+class AWSConfiguration:
+    """
+    AWS Configuration class that handles credentials and region settings.
+    Constructor parameters take precedence; environment variables are the fallback.
+    """
+    def __init__(
+        self,
+        access_key: str = None,
+        secret_key: str = None,
+        region: str = None,
+        profile: str = None,
+        endpoint_url: str = None,
+    ):
+        # Use explicit constructor arguments first, then fall back to environment variables
+        self.access_key = (
+            access_key if access_key else os.environ.get('AWS_ACCESS_KEY_ID')
+        )
+        self.secret_key = (
+            secret_key
+            if secret_key
+            else os.environ.get('AWS_SECRET_ACCESS_KEY')
+        )
+        self.region = (
+            region
+            if region
+            else os.environ.get('AWS_DEFAULT_REGION') or 'us-east-1'
+        )
+        self.profile = profile if profile else os.environ.get('AWS_PROFILE')
+        self.endpoint_url = (
+            endpoint_url
+            if endpoint_url
+            else os.environ.get('AWS_ENDPOINT_URL')
+        )
+        # Validate configuration
+        self._validate_config()
+    def _validate_config(self):
+        """Validate that we have enough configuration to proceed."""
+        if not ((self.access_key and self.secret_key) or self.profile):
+            logging.warning(
+                'No AWS credentials provided via environment variables or constructor. '
+                'AWS operations may fail unless credentials are configured via '
+                '~/.aws/credentials, IAM roles, or other AWS credential providers.'
+            )
+    def get_boto3_session_args(self):
+        """
+        Return a dictionary of arguments that can be passed to boto3.session.Session()
+        """
+        session_args = {'region_name': self.region}
+        if self.access_key and self.secret_key:
+            session_args['aws_access_key_id'] = self.access_key
+            session_args['aws_secret_access_key'] = self.secret_key
+        if self.profile:
+            session_args['profile_name'] = self.profile
+        return session_args
+    def get_client_args(self):
+        """
+        Return a dictionary of arguments that can be passed to boto3 client creation
+        """
+        client_args = {}
+        if self.endpoint_url:
+            client_args['endpoint_url'] = self.endpoint_url
+        return client_args
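
Note on `auris_tools/configuration.py`: the constructor above takes each explicit argument when it is truthy and only then falls back to the environment (and, for the region, to `'us-east-1'`). The following is a minimal sketch of that resolution order using only the standard library. `resolve_setting` and `fake_env` are hypothetical names introduced for illustration and are not part of auris_tools; the helper also treats an empty environment value as missing, which is a simplification.

```python
import os


def resolve_setting(explicit, env_var, default=None, environ=None):
    """Return the explicit value if truthy, else the environment value, else default.

    Hypothetical helper that mirrors the fallback order in AWSConfiguration.__init__.
    """
    environ = os.environ if environ is None else environ
    return explicit or environ.get(env_var) or default


# A fake environment keeps the example independent of the real process environment.
fake_env = {'AWS_DEFAULT_REGION': 'eu-west-1'}

# Explicit argument wins over the environment.
region_from_arg = resolve_setting('sa-east-1', 'AWS_DEFAULT_REGION', 'us-east-1', fake_env)
# No argument: the environment value is used.
region_from_env = resolve_setting(None, 'AWS_DEFAULT_REGION', 'us-east-1', fake_env)
# Neither argument nor environment: the default applies.
region_default = resolve_setting(None, 'AWS_DEFAULT_REGION', 'us-east-1', {})
```

Passing the environment as a parameter is a design choice made here for testability; the class in the diff reads `os.environ` directly.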

auris_tools-0.0.3/auris_tools/databaseHandlers.py ADDED

@@ -0,0 +1,132 @@
+import logging
+import boto3
+from boto3.dynamodb.types import TypeDeserializer, TypeSerializer
+from auris_tools.configuration import AWSConfiguration
+from auris_tools.utils import generate_uuid
+class DatabaseHandler:
+    def __init__(self, table_name, config=None):
+        """
+        Initialize the database handler.
+        Args:
+            table_name: Name of the DynamoDB table.
+            config: An AWSConfiguration object, or None to use environment variables.
+        """
+        self.table_name = table_name
+        if config is None:
+            config = AWSConfiguration()
+        # Create a boto3 session with the configuration
+        session = boto3.session.Session(**config.get_boto3_session_args())
+        # Create a DynamoDB client with additional configuration if needed
+        self.client = session.client('dynamodb', **config.get_client_args())
+        if not self._check_table_exists(table_name):
+            raise Exception(f'Table does not exist: {table_name}')
+        logging.info(f'Initialized DynamoDB client in region {config.region}')
+    def insert_item(self, item, primary_key: str = 'id'):
+        """Insert an item with automatic type conversion"""
+        if not isinstance(item, dict):
+            raise TypeError('Item must be a dictionary')
+        if primary_key not in item:
+            item[primary_key] = generate_uuid()
+        dynamo_item = self._serialize_item(item)
+        response = self.client.put_item(
+            TableName=self.table_name, Item=dynamo_item
+        )
+        return response
+    def get_item(self, key):
+        """
+        Retrieve an item from a DynamoDB table.
+        Args:
+            key: A dictionary representing the key of the item to retrieve.
+        Returns:
+            The retrieved item in DynamoDB attribute-value format, or None if
+            it is not found or the request fails.
+        """
+        if not isinstance(key, dict):
+            raise TypeError('Key must be a dictionary')
+        # Check if the key is in DynamoDB format (i.e., values are dicts with type keys)
+        if not self.item_is_serialized(key):
+            # Convert to DynamoDB format
+            key = self._serialize_item(key)
+        try:
+            response = self.client.get_item(TableName=self.table_name, Key=key)
+            return response.get('Item')
+        except Exception as e:
+            logging.error(
+                f'Error retrieving item from {self.table_name}: {str(e)}'
+            )
+            return None
+    def delete_item(self, key, primary_key='id'):
+        """
+        Delete an item from a DynamoDB table.
+        Args:
+            key (str or dict): Either a string identifier for the primary key,
+                              or a dictionary containing the complete key structure.
+            primary_key (str, optional): Name of the primary key field. Defaults to 'id'.
+        Returns:
+            bool: True if deletion was successful, False otherwise.
+        """
+        # Convert string key to a dictionary with the primary key
+        if isinstance(key, str):
+            key = {primary_key: key}
+        elif not isinstance(key, dict):
+            raise TypeError('Key must be a string identifier or a dictionary')
+        # Check if the key is in DynamoDB format
+        if not self.item_is_serialized(key):
+            key = self._serialize_item(key)
+        try:
+            self.client.delete_item(
+                TableName=self.table_name,
+                Key=key,
+                ReturnValues='ALL_OLD',  # Return the deleted item
+            )
+            logging.info(f'Deleted item from {self.table_name} with key {key}')
+            return True
+        except Exception as e:
+            logging.error(
+                f'Error deleting item from {self.table_name}: {str(e)}'
+            )
+            return False
+    def item_is_serialized(self, item):
+        """Check if an item is in DynamoDB serialized format"""
+        return all(isinstance(v, dict) and len(v) == 1 for v in item.values())
+    def _serialize_item(self, item):
+        """Convert Python types to DynamoDB format"""
+        serializer = TypeSerializer()
+        return {k: serializer.serialize(v) for k, v in item.items()}
+    def _deserialize_item(self, item):
+        """Convert DynamoDB format back to Python types"""
+        deserializer = TypeDeserializer()
+        return {k: deserializer.deserialize(v) for k, v in item.items()}
+    def _check_table_exists(self, table_name):
+        """Check if a DynamoDB table exists"""
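+        try:
+            # list_tables is paginated, so walk every page before concluding
+            paginator = self.client.get_paginator('list_tables')
+            for page in paginator.paginate():
+                if table_name in page.get('TableNames', []):
+                    return True
+            return False
+        except Exception as e:
+            logging.error(f'Error checking table existence: {str(e)}')
+            return False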
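
Note on `auris_tools/databaseHandlers.py`: `item_is_serialized` treats any item whose values are all one-key dictionaries as already being in DynamoDB attribute-value form. The sketch below copies that check into a standalone function and shows one input it cannot distinguish. It then adds a stricter variant, `looks_like_attribute_values`, which is a hypothetical alternative (not in the package) that also requires the single key to be one of DynamoDB's type descriptors. It is still a heuristic: a plain map whose only key happens to be `'S'` would pass.

```python
# DynamoDB attribute-value type descriptors.
DYNAMODB_TYPE_TAGS = {'S', 'N', 'B', 'SS', 'NS', 'BS', 'M', 'L', 'NULL', 'BOOL'}


def item_is_serialized(item):
    """Same check as DatabaseHandler.item_is_serialized: every value is a one-key dict."""
    return all(isinstance(v, dict) and len(v) == 1 for v in item.values())


def looks_like_attribute_values(item):
    """Hypothetical stricter check: the single key must also be a DynamoDB type tag."""
    return all(
        isinstance(v, dict) and len(v) == 1 and next(iter(v)) in DYNAMODB_TYPE_TAGS
        for v in item.values()
    )


# Attribute-value form, as the low-level DynamoDB client expects it.
serialized_key = {'id': {'S': 'abc-123'}}
# Plain Python form, which the handler would pass through TypeSerializer.
plain_key = {'id': 'abc-123'}
# A plain item whose only attribute is a one-key map (sample data for illustration).
ambiguous_item = {'meta': {'owner': 'alice'}}

original_results = {
    'serialized_key': item_is_serialized(serialized_key),
    'plain_key': item_is_serialized(plain_key),
    'ambiguous_item': item_is_serialized(ambiguous_item),
}
stricter_results = {
    'serialized_key': looks_like_attribute_values(serialized_key),
    'plain_key': looks_like_attribute_values(plain_key),
    'ambiguous_item': looks_like_attribute_values(ambiguous_item),
}
```

With the original check, `ambiguous_item` is reported as serialized and would be sent to DynamoDB without conversion; the stricter variant sends it through serialization instead.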

auris_tools-0.0.3/auris_tools/geminiHandler.py ADDED

@@ -0,0 +1,246 @@
+import logging
+import os
+import google.generativeai as genai
+from dotenv import load_dotenv
+# Load environment variables from .env file
+load_dotenv()
+logger = logging.getLogger(__name__)
+class GoogleGeminiHandler:
+    """A handler class for interacting with Google's Gemini AI models.
+    This class provides a convenient interface for generating content using Google's
+    Gemini generative AI models. It handles authentication, model configuration,
+    and content generation with automatic error handling and logging.
+    Attributes:
+        api_key (str): The Google AI API key used for authentication.
+        model_name (str): The name of the Gemini model to use.
+        temperature (float): Controls randomness in generation (0.0 to 1.0).
+        response_schema (dict): Optional schema for structured responses.
+        response_mime_type (str): MIME type for response format.
+        generation_config (genai.types.GenerationConfig): Configuration for content generation.
+        model (genai.GenerativeModel): The configured Gemini model instance.
+    Example:
+        Basic usage with environment variable API key:
+        >>> handler = GoogleGeminiHandler()
+        >>> response = handler.generate_output("What is artificial intelligence?")
+        >>> text = handler.get_text(response)
+        Usage with custom parameters:
+        >>> handler = GoogleGeminiHandler(
+        ...     api_key="your-api-key",
+        ...     model="gemini-2.0-flash-exp",
+        ...     temperature=0.7,
+        ...     response_mime_type="text/plain"
+        ... )
+    """
+    def __init__(
+        self, api_key: str = None, model: str = 'gemini-2.5-flash', **kwargs
+    ):
+        """Initialize the Google Gemini handler.
+        Args:
+            api_key (str, optional): Google AI API key. If not provided, will attempt
+                to load from GEMINI_API_KEY environment variable. Defaults to None.
+            model (str, optional): Name of the Gemini model to use.
+                Defaults to 'gemini-2.5-flash'.
+            **kwargs: Additional configuration parameters:
+                - temperature (float): Controls randomness (0.0-1.0). Defaults to 0.5.
+                - response_schema (dict): Schema for structured responses. Defaults to None.
+                - response_mime_type (str): Response MIME type. Defaults to 'application/json'.
+        Raises:
+            TypeError: If the specified model is not available.
+        Example:
+            >>> handler = GoogleGeminiHandler(
+            ...     api_key="your-api-key",
+            ...     model="gemini-2.0-flash-exp",
+            ...     temperature=0.7
+            ... )
+        """
+        self.api_key = api_key if api_key else os.getenv('GEMINI_API_KEY')
+        if self.api_key is None:
+            logger.error(
+                'Gemini API key not configured. Please define the GEMINI_API_KEY environment variable or pass the key via the api_key argument.'
+            )
+        self.model_name = model
+        self._check_model_availability()
+        # More configuration from input parameters
+        self.temperature = kwargs.get('temperature', 0.5)
+        self.response_schema = kwargs.get('response_schema', None)
+        self.response_mime_type = kwargs.get(
+            'response_mime_type', 'application/json'
+        )
+        self.generation_config = genai.types.GenerationConfig(
+            temperature=self.temperature,
+            response_schema=self.response_schema,
+            response_mime_type=self.response_mime_type,
+        )
+        self.model = genai.GenerativeModel(
+            generation_config=self.generation_config,
+            model_name=self.model_name,
+        )
+    def generate_output(
+        self, prompt: str, input_data: str = None, input_mime_type: str = None
+    ):
+        """Generate content using the configured Gemini model.
+        This method sends a prompt to the Gemini model and returns the generated response.
+        It supports both text-only prompts and multimodal inputs with additional data.
+        Args:
+            prompt (str): The text prompt to send to the model. This is the main
+                instruction or question for the AI to respond to.
+            input_data (str, optional): Additional input data to include with the prompt.
+                This could be text content, encoded media, or other data. Requires
+                input_mime_type to be specified. Defaults to None.
+            input_mime_type (str, optional): MIME type of the input_data. Required if
+                input_data is provided. Examples: 'text/plain', 'image/jpeg',
+                'application/pdf'. Defaults to None.
+        Returns:
+            genai.types.GenerateContentResponse or str: The response from the Gemini model
+            if successful, or an empty string if an error occurred.
+        Raises:
+            ValueError: If input_data is provided without input_mime_type or vice versa.
+        Example:
+            Text-only generation:
+            >>> response = handler.generate_output("Explain quantum computing")
+            Multimodal generation with additional data:
+            >>> response = handler.generate_output(
+            ...     prompt="Describe this image",
+            ...     input_data=base64_encoded_image,
+            ...     input_mime_type="image/jpeg"
+            ... )
+        """
+        if (input_data is not None and input_mime_type is None) or (
+            input_data is None and input_mime_type is not None
+        ):
+            raise ValueError(
+                'input_data and input_mime_type must be provided together, or both left as None.'
+            )
+        if input_data and input_mime_type:  # Add input data if provided
+            prompt = [
+                prompt,
+                # Inline data parts use the 'data' key alongside 'mime_type'
+                {'mime_type': input_mime_type, 'data': input_data},
+            ]
+        try:
+            response = self.model.generate_content(prompt)
+            return response
+        except Exception as e:
+            logger.error(f'Error generating LLM output: {str(e)}')
+            return ''
+    def get_text(self, response) -> str:
+        """Extract text content from a Gemini model response.
+        This method parses the response object returned by the Gemini model and
+        extracts the generated text content. It handles the response structure
+        safely and provides fallbacks for various response formats.
+        Args:
+            response (genai.types.GenerateContentResponse or dict): The response object
+                returned from the generate_output method. This can be either a
+                GenerateContentResponse object or a dictionary representation.
+        Returns:
+            str: The extracted text content from the response. Returns an empty
+            string if no content is found or if an error occurs during extraction.
+        Example:
+            >>> response = handler.generate_output("What is AI?")
+            >>> text_content = handler.get_text(response)
+            >>> print(text_content)
+            "Artificial Intelligence (AI) refers to..."
+            >>> # Handle case with no candidates
+            >>> empty_response = {'candidates': []}
+            >>> text = handler.get_text(empty_response)
+            >>> print(text)  # Returns empty string
+            ""
+        """
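+        try:
+            # Dictionary-style responses keep the original lookup
+            if isinstance(response, dict):
+                candidates = response.get('candidates') or []
+                if candidates:
+                    return candidates[0]['content']
+            # GenerateContentResponse objects expose the generated text via `.text`
+            elif getattr(response, 'candidates', None):
+                return response.text
+            logger.warning('No candidates found in the response.')
+            return ''
+        except Exception as e:
+            logger.error(f'Error extracting text from response: {str(e)}')
+            return ''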
+    def _check_model_availability(self):
+        """Check if the specified Gemini model is available.
+        This private method validates that the requested model name exists in the
+        list of available Google Gemini models. It queries the Google AI API to
+        get the current list of available models and compares against the requested
+        model name.
+        Raises:
+            TypeError: If the specified model is not found in the list of available
+                models from the Google AI API.
+        Note:
+            This method is called automatically during initialization and will
+            prevent the handler from being created if an invalid model is specified.
+            It also logs the availability check results for debugging purposes.
+        Example:
+            This method is called internally during initialization:
+            >>> # This will call _check_model_availability internally
+            >>> handler = GoogleGeminiHandler(model="gemini-2.5-flash")  # Success
+            >>> handler = GoogleGeminiHandler(model="invalid-model")     # Raises TypeError
+        """
+        try:
+            genai.configure(api_key=self.api_key)
+            available_models = genai.list_models()
+            # Extract model names without the 'models/' prefix
+            available_model_names = [
+                model.name.removeprefix('models/') for model in available_models
+            ]
+            if self.model_name not in available_model_names:
+                logger.error(
+                    f'Model {self.model_name} is not available. Please check the model name.'
+                )
+                logger.info(
+                    f'Available models: {", ".join(available_model_names)}'
+                )
+                raise TypeError(f'Invalid model name: {self.model_name}')
+            else:
+                logger.info(f'Model {self.model_name} is available.')
+        except Exception as e:
+            if 'Invalid model name' in str(e):
+                raise  # Re-raise our custom error
+            else:
+                logger.error(f'Error checking model availability: {str(e)}')
+                # Don't raise error for API connectivity issues, just log
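
Note on `auris_tools/geminiHandler.py`: `_check_model_availability` strips the `models/` prefix from each listed name and raises `TypeError` when the requested model is absent. The sketch below isolates that comparison so it can be exercised without an API key or network access. `normalize_model_name`, `check_model`, and the `example-model-*` entries are hypothetical names and sample data introduced for illustration; the real method obtains its list from `genai.list_models()`.

```python
def normalize_model_name(name: str) -> str:
    """Drop a leading 'models/' prefix, leaving other names untouched."""
    return name.removeprefix('models/')


def check_model(requested: str, listed_names: list[str]) -> list[str]:
    """Raise TypeError if `requested` is not among the normalized names.

    Returns the normalized list so callers can log or inspect it.
    """
    available = [normalize_model_name(n) for n in listed_names]
    if requested not in available:
        raise TypeError(f'Invalid model name: {requested}')
    return available


# Sample listing standing in for the names returned by the API (made-up values,
# apart from the package's default model name).
listed = ['models/gemini-2.5-flash', 'models/example-model-a', 'example-model-b']

available = check_model('gemini-2.5-flash', listed)

try:
    check_model('invalid-model', listed)
    invalid_rejected = False
except TypeError:
    invalid_rejected = True
```

Keeping the comparison in a pure function like this is one way to unit-test the validation separately from the connectivity handling, which the method in the diff deliberately only logs.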