auris_tools 0.0.3__tar.gz

@@ -0,0 +1,97 @@
+ Metadata-Version: 2.3
+ Name: auris_tools
+ Version: 0.0.3
+ Summary: A Swiss Army knife of tools for easily coordinating cloud frameworks on Auris platforms
+ Author: Antonio Senra
+ Author-email: acsenrafilho@gmail.com
+ Requires-Python: >=3.10,<4.0
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Requires-Dist: boto3 (>=1.40.29,<2.0.0)
+ Requires-Dist: dotenv (>=0.9.9,<0.10.0)
+ Requires-Dist: google-generativeai (>=0.8.5,<0.9.0)
+ Requires-Dist: python-docx (>=1.2.0,<2.0.0)
+ Requires-Dist: rich (>=14.1.0,<15.0.0)
+ Description-Content-Type: text/markdown
+
+ # auris-tools
+
+ [![PyPI version](https://img.shields.io/pypi/v/auris-tools.svg)](https://pypi.org/project/auris-tools/)
+ [![Documentation Status](https://readthedocs.org/projects/auris-tools/badge/?version=latest)](https://auris-tools.readthedocs.io/en/latest/?badge=latest)
+ [![CI for Develop Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=develop)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+ [![CI for Main Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=main)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+ [![codecov](https://codecov.io/gh/AurisAASI/auris-tools/graph/badge.svg?token=08891W8HP2)](https://codecov.io/gh/AurisAASI/auris-tools)
+
+ A Swiss Army knife of tools for easily coordinating cloud frameworks on Auris platforms.
+
+ ## Installation
+
+ This project requires **Python 3.10 or later** (below 4.0) and uses [Poetry](https://python-poetry.org/) for dependency management.
+
+ 1. **Clone the repository:**
+    ```bash
+    git clone https://github.com/AurisAASI/auris-tools.git
+    cd auris-tools
+    ```
+ 2. **Install Poetry (if not already installed):**
+    ```bash
+    pip install poetry
+    ```
+ 3. **Install dependencies:**
+    ```bash
+    poetry install
+    ```
+
+ ---
+
+ ## Project Structure
+
+ The main classes and modules are organized as follows:
+
+ ```
+ /auris_tools
+ ├── __init__.py
+ ├── configuration.py       # AWS configuration utilities
+ ├── databaseHandlers.py    # DynamoDB handler class
+ ├── officeWordHandler.py   # Office Word document handler
+ ├── storageHandler.py      # AWS S3 storage handler
+ ├── textractHandler.py     # AWS Textract handler
+ ├── utils.py               # Utility functions
+ └── geminiHandler.py       # Google Gemini AI handler
+ ```
+
+ ---
+
+ ## Testing & Linting
+
+ - **Run all tests:**
+   ```bash
+   task test
+   ```
+ - **Run linter (blue and isort):**
+   ```bash
+   task lint
+   ```
+
+ Test coverage and linting are enforced in CI. Make sure all tests pass and code is linted before submitting a PR.
+
+ ## Documentation
+
+ We use MkDocs with the Material theme for our documentation:
+
+ - **Run documentation server locally:**
+   ```bash
+   task docs
+   ```
+ - **Build documentation:**
+   ```bash
+   task docs-build
+   ```
+
+ The documentation is automatically published to Read the Docs when changes are pushed to the main branch.
+
+ ---
+
@@ -0,0 +1,77 @@
+ # auris-tools
+
+ [![PyPI version](https://img.shields.io/pypi/v/auris-tools.svg)](https://pypi.org/project/auris-tools/)
+ [![Documentation Status](https://readthedocs.org/projects/auris-tools/badge/?version=latest)](https://auris-tools.readthedocs.io/en/latest/?badge=latest)
+ [![CI for Develop Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=develop)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+ [![CI for Main Branch](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml/badge.svg?branch=main)](https://github.com/AurisAASI/auris-tools/actions/workflows/ci_develop.yml)
+ [![codecov](https://codecov.io/gh/AurisAASI/auris-tools/graph/badge.svg?token=08891W8HP2)](https://codecov.io/gh/AurisAASI/auris-tools)
+
+ A Swiss Army knife of tools for easily coordinating cloud frameworks on Auris platforms.
+
+ ## Installation
+
+ This project requires **Python 3.10 or later** (below 4.0) and uses [Poetry](https://python-poetry.org/) for dependency management.
+
+ 1. **Clone the repository:**
+    ```bash
+    git clone https://github.com/AurisAASI/auris-tools.git
+    cd auris-tools
+    ```
+ 2. **Install Poetry (if not already installed):**
+    ```bash
+    pip install poetry
+    ```
+ 3. **Install dependencies:**
+    ```bash
+    poetry install
+    ```
+
+ ---
+
+ ## Project Structure
+
+ The main classes and modules are organized as follows:
+
+ ```
+ /auris_tools
+ ├── __init__.py
+ ├── configuration.py       # AWS configuration utilities
+ ├── databaseHandlers.py    # DynamoDB handler class
+ ├── officeWordHandler.py   # Office Word document handler
+ ├── storageHandler.py      # AWS S3 storage handler
+ ├── textractHandler.py     # AWS Textract handler
+ ├── utils.py               # Utility functions
+ └── geminiHandler.py       # Google Gemini AI handler
+ ```
+
+ ---
+
+ ## Testing & Linting
+
+ - **Run all tests:**
+   ```bash
+   task test
+   ```
+ - **Run linter (blue and isort):**
+   ```bash
+   task lint
+   ```
+
+ Test coverage and linting are enforced in CI. Make sure all tests pass and code is linted before submitting a PR.
+
+ ## Documentation
+
+ We use MkDocs with the Material theme for our documentation:
+
+ - **Run documentation server locally:**
+   ```bash
+   task docs
+   ```
+ - **Build documentation:**
+   ```bash
+   task docs-build
+   ```
+
+ The documentation is automatically published to Read the Docs when changes are pushed to the main branch.
+
+ ---
File without changes
@@ -0,0 +1,81 @@
+ import logging
+ import os
+
+ from dotenv import load_dotenv
+
+ # Load environment variables from .env file
+ load_dotenv()
+
+
+ class AWSConfiguration:
+     """
+     AWS Configuration class that handles credentials and region settings.
+     Prioritizes constructor parameters over environment variables.
+     """
+
+     def __init__(
+         self,
+         access_key: str = None,
+         secret_key: str = None,
+         region: str = None,
+         profile: str = None,
+         endpoint_url: str = None,
+     ):
+         # Prefer explicit constructor arguments; fall back to environment variables
+         self.access_key = (
+             access_key if access_key else os.environ.get('AWS_ACCESS_KEY_ID')
+         )
+         self.secret_key = (
+             secret_key
+             if secret_key
+             else os.environ.get('AWS_SECRET_ACCESS_KEY')
+         )
+         self.region = (
+             region
+             if region
+             else os.environ.get('AWS_DEFAULT_REGION') or 'us-east-1'
+         )
+         self.profile = profile if profile else os.environ.get('AWS_PROFILE')
+         self.endpoint_url = (
+             endpoint_url
+             if endpoint_url
+             else os.environ.get('AWS_ENDPOINT_URL')
+         )
+
+         # Validate configuration
+         self._validate_config()
+
+     def _validate_config(self):
+         """Validate that we have enough configuration to proceed."""
+         if not ((self.access_key and self.secret_key) or self.profile):
+             logging.warning(
+                 'No AWS credentials provided via environment variables or constructor. '
+                 'AWS operations may fail unless credentials are configured via '
+                 '~/.aws/credentials, IAM roles, or other AWS credential providers.'
+             )
+
+     def get_boto3_session_args(self):
+         """
+         Return a dictionary of arguments that can be passed to boto3.session.Session()
+         """
+         session_args = {'region_name': self.region}
+
+         if self.access_key and self.secret_key:
+             session_args['aws_access_key_id'] = self.access_key
+             session_args['aws_secret_access_key'] = self.secret_key
+
+         if self.profile:
+             session_args['profile_name'] = self.profile
+
+         return session_args
+
+     def get_client_args(self):
+         """
+         Return a dictionary of arguments that can be passed to boto3 client creation
+         """
+         client_args = {}
+
+         if self.endpoint_url:
+             client_args['endpoint_url'] = self.endpoint_url
+
+         return client_args
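The constructor above resolves each setting in the same order: an explicit argument first, then an environment variable, then (for the region only) a hard-coded default. A minimal sketch of that pattern follows, using only the standard library. The helper name `resolve_setting` and its injectable `environ` parameter are illustrative assumptions, not part of `auris_tools`.

```python
import os


def resolve_setting(explicit, env_var, default=None, environ=None):
    """Return the explicit value if truthy, else the env var, else the default."""
    # `environ` is injectable so the example does not depend on the real environment.
    environ = os.environ if environ is None else environ
    return explicit or environ.get(env_var) or default


fake_env = {'AWS_DEFAULT_REGION': 'eu-west-1'}

# An explicit argument wins over the environment.
region_from_arg = resolve_setting('sa-east-1', 'AWS_DEFAULT_REGION', 'us-east-1', fake_env)
# Without an argument, the environment variable is used.
region_from_env = resolve_setting(None, 'AWS_DEFAULT_REGION', 'us-east-1', fake_env)
# With neither, the hard-coded default applies.
region_default = resolve_setting(None, 'AWS_DEFAULT_REGION', 'us-east-1', {})
```

Because the check is truthiness-based, an empty string passed as an argument falls through to the environment, which mirrors the `x if x else ...` expressions in the constructor.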
@@ -0,0 +1,132 @@
+ import logging
+
+ import boto3
+ from boto3.dynamodb.types import TypeDeserializer, TypeSerializer
+
+ from auris_tools.configuration import AWSConfiguration
+ from auris_tools.utils import generate_uuid
+
+
+ class DatabaseHandler:
+     def __init__(self, table_name, config=None):
+         """
+         Initialize the database handler.
+
+         Args:
+             table_name: Name of the DynamoDB table.
+             config: An AWSConfiguration object, or None to use environment variables.
+         """
+         self.table_name = table_name
+         if config is None:
+             config = AWSConfiguration()
+
+         # Create a boto3 session with the configuration
+         session = boto3.session.Session(**config.get_boto3_session_args())
+
+         # Create a DynamoDB client with additional configuration if needed
+         self.client = session.client('dynamodb', **config.get_client_args())
+
+         if not self._check_table_exists(table_name):
+             raise Exception(f'Table does not exist: {table_name}')
+
+         logging.info(f'Initialized DynamoDB client in region {config.region}')
+
+     def insert_item(self, item, primary_key: str = 'id'):
+         """Insert an item with automatic type conversion"""
+         if not isinstance(item, dict):
+             raise TypeError('Item must be a dictionary')
+
+         if primary_key not in item:
+             item[primary_key] = generate_uuid()
+
+         dynamo_item = self._serialize_item(item)
+         response = self.client.put_item(
+             TableName=self.table_name, Item=dynamo_item
+         )
+         return response
+
+     def get_item(self, key):
+         """
+         Retrieve an item from a DynamoDB table.
+
+         Args:
+             key: A dictionary representing the key of the item to retrieve.
+
+         Returns:
+             The retrieved item in DynamoDB serialized format, or None if the
+             item is not found or an error occurs.
+         """
+         if not isinstance(key, dict):
+             raise TypeError('Key must be a dictionary')
+
+         # Check if the key is in DynamoDB format (i.e., values are dicts with type keys)
+         if not self.item_is_serialized(key):
+             # Convert to DynamoDB format
+             key = self._serialize_item(key)
+
+         try:
+             response = self.client.get_item(TableName=self.table_name, Key=key)
+             return response.get('Item')
+         except Exception as e:
+             logging.error(
+                 f'Error retrieving item from {self.table_name}: {str(e)}'
+             )
+             return None
+
+     def delete_item(self, key, primary_key='id'):
+         """
+         Delete an item from a DynamoDB table.
+
+         Args:
+             key (str or dict): Either a string identifier for the primary key,
+                 or a dictionary containing the complete key structure.
+             primary_key (str, optional): Name of the primary key field. Defaults to 'id'.
+
+         Returns:
+             bool: True if deletion was successful, False otherwise.
+         """
+         # Convert string key to a dictionary with the primary key
+         if isinstance(key, str):
+             key = {primary_key: key}
+         elif not isinstance(key, dict):
+             raise TypeError('Key must be a string identifier or a dictionary')
+
+         # Check if the key is in DynamoDB format
+         if not self.item_is_serialized(key):
+             key = self._serialize_item(key)
+
+         try:
+             self.client.delete_item(
+                 TableName=self.table_name,
+                 Key=key,
+                 ReturnValues='ALL_OLD',  # Return the deleted item
+             )
+             logging.info(f'Deleted item from {self.table_name} with key {key}')
+             return True
+         except Exception as e:
+             logging.error(
+                 f'Error deleting item from {self.table_name}: {str(e)}'
+             )
+             return False
+
+     def item_is_serialized(self, item):
+         """Check if an item is in DynamoDB serialized format"""
+         return all(isinstance(v, dict) and len(v) == 1 for v in item.values())
+
+     def _serialize_item(self, item):
+         """Convert Python types to DynamoDB format"""
+         serializer = TypeSerializer()
+         return {k: serializer.serialize(v) for k, v in item.items()}
+
+     def _deserialize_item(self, item):
+         """Convert DynamoDB format back to Python types"""
+         deserializer = TypeDeserializer()
+         return {k: deserializer.deserialize(v) for k, v in item.items()}
+
+     def _check_table_exists(self, table_name):
+         """Check if a DynamoDB table exists"""
+         try:
+             # Note: list_tables returns at most 100 names per call, so only
+             # the first page of table names is checked here.
+             existing_tables = self.client.list_tables().get('TableNames', [])
+             return table_name in existing_tables
+         except Exception as e:
+             logging.error(f'Error checking table existence: {str(e)}')
+             return False
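The `item_is_serialized` check above treats any item whose values are all single-key dictionaries as already serialized. The sketch below, standard library only, shows one case where that heuristic misfires and a stricter variant that also requires the single key to be a DynamoDB type descriptor. The function names and the `DYNAMODB_TYPE_TAGS` constant are illustrative assumptions, not part of `auris_tools` or boto3.

```python
# DynamoDB attribute-value type descriptors.
DYNAMODB_TYPE_TAGS = {'S', 'N', 'B', 'SS', 'NS', 'BS', 'M', 'L', 'NULL', 'BOOL'}


def loose_is_serialized(item):
    """Same rule as item_is_serialized: every value is a one-key dict."""
    return all(isinstance(v, dict) and len(v) == 1 for v in item.values())


def strict_is_serialized(item):
    """Additionally require that the single key is a known type descriptor."""
    return all(
        isinstance(v, dict) and len(v) == 1 and next(iter(v)) in DYNAMODB_TYPE_TAGS
        for v in item.values()
    )


serialized_key = {'id': {'S': 'abc-123'}}   # genuine DynamoDB format
plain_item = {'meta': {'owner': 'alice'}}   # plain Python data with a nested dict

loose_on_plain = loose_is_serialized(plain_item)     # misclassified as serialized
strict_on_plain = strict_is_serialized(plain_item)   # correctly rejected
```

Even the stricter check remains a heuristic: a plain value such as `{'S': 'x'}` is indistinguishable from a serialized string, so callers who know the format may prefer to state it explicitly.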
@@ -0,0 +1,246 @@
+ import logging
+ import os
+
+ import google.generativeai as genai
+ from dotenv import load_dotenv
+
+ # Load environment variables from .env file
+ load_dotenv()
+
+ logger = logging.getLogger(__name__)
+
+
+ class GoogleGeminiHandler:
+     """A handler class for interacting with Google's Gemini AI models.
+
+     This class provides a convenient interface for generating content using Google's
+     Gemini generative AI models. It handles authentication, model configuration,
+     and content generation with automatic error handling and logging.
+
+     Attributes:
+         api_key (str): The Google AI API key used for authentication.
+         model_name (str): The name of the Gemini model to use.
+         temperature (float): Controls randomness in generation (0.0 to 1.0).
+         response_schema (dict): Optional schema for structured responses.
+         response_mime_type (str): MIME type for response format.
+         generation_config (genai.types.GenerationConfig): Configuration for content generation.
+         model (genai.GenerativeModel): The configured Gemini model instance.
+
+     Example:
+         Basic usage with environment variable API key:
+
+         >>> handler = GoogleGeminiHandler()
+         >>> response = handler.generate_output("What is artificial intelligence?")
+         >>> text = handler.get_text(response)
+
+         Usage with custom parameters:
+
+         >>> handler = GoogleGeminiHandler(
+         ...     api_key="your-api-key",
+         ...     model="gemini-2.0-flash-exp",
+         ...     temperature=0.7,
+         ...     response_mime_type="text/plain"
+         ... )
+     """
+
+     def __init__(
+         self, api_key: str = None, model: str = 'gemini-2.5-flash', **kwargs
+     ):
+         """Initialize the Google Gemini handler.
+
+         Args:
+             api_key (str, optional): Google AI API key. If not provided, will attempt
+                 to load from GEMINI_API_KEY environment variable. Defaults to None.
+             model (str, optional): Name of the Gemini model to use.
+                 Defaults to 'gemini-2.5-flash'.
+             **kwargs: Additional configuration parameters:
+                 - temperature (float): Controls randomness (0.0-1.0). Defaults to 0.5.
+                 - response_schema (dict): Schema for structured responses. Defaults to None.
+                 - response_mime_type (str): Response MIME type. Defaults to 'application/json'.
+
+         Raises:
+             TypeError: If the specified model is not available.
+
+         Example:
+             >>> handler = GoogleGeminiHandler(
+             ...     api_key="your-api-key",
+             ...     model="gemini-2.0-flash-exp",
+             ...     temperature=0.7
+             ... )
+         """
+
+         self.api_key = api_key if api_key else os.getenv('GEMINI_API_KEY')
+         if self.api_key is None:
+             logger.error(
+                 'Gemini API key not configured. Please define the GEMINI_API_KEY environment variable or pass your key directly to the constructor.'
+             )
+
+         self.model_name = model
+         self._check_model_availability()
+
+         # More configuration from input parameters
+         self.temperature = kwargs.get('temperature', 0.5)
+         self.response_schema = kwargs.get('response_schema', None)
+         self.response_mime_type = kwargs.get(
+             'response_mime_type', 'application/json'
+         )
+
+         self.generation_config = genai.types.GenerationConfig(
+             temperature=self.temperature,
+             response_schema=self.response_schema,
+             response_mime_type=self.response_mime_type,
+         )
+
+         self.model = genai.GenerativeModel(
+             generation_config=self.generation_config,
+             model_name=self.model_name,
+         )
+
+     def generate_output(
+         self, prompt: str, input_data: str = None, input_mime_type: str = None
+     ):
+         """Generate content using the configured Gemini model.
+
+         This method sends a prompt to the Gemini model and returns the generated response.
+         It supports both text-only prompts and multimodal inputs with additional data.
+
+         Args:
+             prompt (str): The text prompt to send to the model. This is the main
+                 instruction or question for the AI to respond to.
+             input_data (str, optional): Additional input data to include with the prompt.
+                 This could be text content, encoded media, or other data. Requires
+                 input_mime_type to be specified. Defaults to None.
+             input_mime_type (str, optional): MIME type of the input_data. Required if
+                 input_data is provided. Examples: 'text/plain', 'image/jpeg',
+                 'application/pdf'. Defaults to None.
+
+         Returns:
+             genai.types.GenerateContentResponse or str: The response from the Gemini model
+                 if successful, or an empty string if an error occurred.
+
+         Raises:
+             ValueError: If input_data is provided without input_mime_type or vice versa.
+
+         Example:
+             Text-only generation:
+
+             >>> response = handler.generate_output("Explain quantum computing")
+
+             Multimodal generation with additional data:
+
+             >>> response = handler.generate_output(
+             ...     prompt="Describe this image",
+             ...     input_data=base64_encoded_image,
+             ...     input_mime_type="image/jpeg"
+             ... )
+         """
+         if (input_data is not None and input_mime_type is None) or (
+             input_data is None and input_mime_type is not None
+         ):
+             raise ValueError(
+                 'input_data and input_mime_type must be provided together, or both must be None.'
+             )
+
+         if input_data and input_mime_type:  # Add input data if provided
+             # The SDK's inline-data dict uses the keys 'mime_type' and 'data'
+             prompt = [
+                 prompt,
+                 {'mime_type': input_mime_type, 'data': input_data},
+             ]
+
+         try:
+             response = self.model.generate_content(prompt)
+             return response
+         except Exception as e:
+             logger.error(f'Error generating LLM output: {str(e)}')
+             return ''
+
+     def get_text(self, response) -> str:
+         """Extract text content from a Gemini model response.
+
+         This method parses the response object returned by the Gemini model and
+         extracts the generated text content. It handles the response structure
+         safely and provides fallbacks for various response formats.
+
+         Args:
+             response (genai.types.GenerateContentResponse or dict): The response object
+                 returned from the generate_output method. This can be either a
+                 GenerateContentResponse object or a dictionary representation.
+
+         Returns:
+             str: The extracted text content from the response. Returns an empty
+                 string if no content is found or if an error occurs during extraction.
+
+         Example:
+             >>> response = handler.generate_output("What is AI?")
+             >>> text_content = handler.get_text(response)
+             >>> print(text_content)
+             "Artificial Intelligence (AI) refers to..."
+
+             >>> # Handle case with no candidates
+             >>> empty_response = {'candidates': []}
+             >>> text = handler.get_text(empty_response)
+             >>> print(text)  # Returns empty string
+             ""
+         """
+         try:
+             # GenerateContentResponse objects expose the generated text via `.text`;
+             # only dictionary representations support key lookup.
+             if not isinstance(response, dict):
+                 return response.text
+             if response.get('candidates'):
+                 return response['candidates'][0]['content']
+             logger.warning('No candidates found in the response.')
+             return ''
+         except Exception as e:
+             logger.error(f'Error extracting text from response: {str(e)}')
+             return ''
+
+     def _check_model_availability(self):
+         """Check if the specified Gemini model is available.
+
+         This private method validates that the requested model name exists in the
+         list of available Google Gemini models. It queries the Google AI API to
+         get the current list of available models and compares against the requested
+         model name.
+
+         Raises:
+             TypeError: If the specified model is not found in the list of available
+                 models from the Google AI API.
+
+         Note:
+             This method is called automatically during initialization and will
+             prevent the handler from being created if an invalid model is specified.
+             It also logs the availability check results for debugging purposes.
+
+         Example:
+             This method is called internally during initialization:
+
+             >>> # This will call _check_model_availability internally
+             >>> handler = GoogleGeminiHandler(model="gemini-2.5-flash")  # Success
+             >>> handler = GoogleGeminiHandler(model="invalid-model")  # Raises TypeError
+         """
+         try:
+             genai.configure(api_key=self.api_key)
+             available_models = genai.list_models()
+             # Extract model names and handle the 'models/' prefix
+             available_model_names = []
+             for model in available_models:
+                 model_name = model.name
+                 # Remove 'models/' prefix if present
+                 if model_name.startswith('models/'):
+                     model_name = model_name[7:]  # Remove 'models/' prefix
+                 available_model_names.append(model_name)
+
+             if self.model_name not in available_model_names:
+                 logger.error(
+                     f'Model {self.model_name} is not available. Please check the model name.'
+                 )
+                 logger.info(
+                     f'Available models: {", ".join(available_model_names)}'
+                 )
+                 raise TypeError(f'Invalid model name: {self.model_name}')
+             else:
+                 logger.info(f'Model {self.model_name} is available.')
+         except Exception as e:
+             if 'Invalid model name' in str(e):
+                 raise  # Re-raise our custom error
+             else:
+                 logger.error(f'Error checking model availability: {str(e)}')
+                 # Don't raise error for API connectivity issues, just log
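The availability check above strips the `models/` prefix from each listed name before comparing it with the requested model. A minimal standalone sketch of that normalization follows; it uses `str.removeprefix` (available since Python 3.9) in place of the slice. The helper names and the sample list are illustrative assumptions; no Google API is called.

```python
def normalize_model_names(raw_names):
    """Strip a leading 'models/' from each name; other names pass through unchanged."""
    return [name.removeprefix('models/') for name in raw_names]


def is_model_available(requested, raw_names):
    """Compare the requested name against the normalized listing."""
    return requested in normalize_model_names(raw_names)


# Hypothetical listing in the same shape the check above iterates over.
listed = ['models/gemini-2.5-flash', 'models/gemini-2.0-flash-exp', 'custom-model']

normalized = normalize_model_names(listed)
found = is_model_available('gemini-2.5-flash', listed)
missing = is_model_available('invalid-model', listed)
```

`removeprefix` only touches a leading occurrence, so a name that merely contains `models/` elsewhere is left intact, which matches the `startswith` guard in the method.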