sample_data_factory 0.7.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,17 @@
1
+ COPYRIGHT NOTICE
2
+ ----------------------------------------------------------------------------
3
+ Copyright (c) 2023, Bright Edge eServices. All rights reserved.
4
+
5
+ Unauthorized copying, distribution, modification, public display, or public
6
+ performance of this software, or any portion of it, is strictly prohibited. This
7
+ software is proprietary to Bright Edge eServices and is protected by South
8
+ African copyright laws and international treaty provisions.
9
+
10
+ No part of this software may be reproduced or transmitted in any form or by any
11
+ means, electronic or mechanical, including photocopying, recording, or by any
12
+ information storage and retrieval system, without the express written permission
13
+ of Bright Edge eServices.
14
+
15
+ Any use, copying, or distribution of this software not in accordance with this
16
+ notice is expressly prohibited and may result in severe civil and criminal
17
+ penalties.
@@ -0,0 +1,135 @@
1
+ Metadata-Version: 2.4
2
+ Name: sample_data_factory
3
+ Version: 0.7.0
4
+ Summary: Reusable test data factory for CSV and FIDE XML archives.
5
+ License-Expression: MIT
6
+ License-File: LICENSE.txt
7
+ Author: Hendrik du Toit
8
+ Author-email: hendrik@brightedge.co.za
9
+ Maintainer: Hendrik du Toit
10
+ Maintainer-email: hendrikdt@citiqprepaid.co.za
11
+ Requires-Python: >=3.12
12
+ Classifier: Development Status :: 4 - Beta
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: Intended Audience :: Information Technology
15
+ Classifier: Intended Audience :: System Administrators
16
+ Classifier: License :: OSI Approved :: MIT License
17
+ Classifier: Programming Language :: Python :: 3.10
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Programming Language :: Python :: 3.13
21
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
22
+ Classifier: Topic :: System :: Archiving :: Packaging
23
+ Project-URL: Homepage, https://github.com/RealTimeEvents/sample_data_factory
24
+ Project-URL: Issues, https://github.com/RealTimeEvents/sample_data_factory/issues
25
+ Project-URL: Repository, https://github.com/RealTimeEvents/sample_data_factory.git
26
+ Project-URL: changelog, https://github.com/RealTimeEvents/sample_data_factory/releases
27
+ Description-Content-Type: text/markdown
28
+
29
+ # sample_data_factory
30
+
31
+ ______________________________________________________________________
32
+
33
+ ## Short description
34
+
35
+ `sample_data_factory` is a reusable Python helper for generating deterministic ZIP archives used in test flows, including URS CSV exports and FIDE players XML exports.
36
+
37
+ ______________________________________________________________________
38
+
39
+ ## Module Overview
40
+
41
+ ### Key Features
42
+
43
+ - Builds URS publication ZIP archives containing CSV payloads.
44
+ - Builds FIDE players-list ZIP archives containing XML payloads.
45
+ - Supports either local byte output or upload to Google Drive.
46
+ - Can create nested Google Drive sub-folders when uploading files.
47
+ - Includes unit tests for archive generation and constructor validation.
48
+
49
+ ### Project Structure
50
+
51
+ - `src/sdf/`: Core package implementation (`sample_data_factory.py`).
52
+ - `tests/unit/`: Unit tests for archive builders and helper methods.
53
+ - `scripts/`: SQL/bootstrap assets (legacy resources have been removed).
54
+ - `legacy/`: Archived resources excluded from normal test runs.
55
+ - `*.ps1`: Environment and dependency setup scripts.
56
+
57
+ ______________________________________________________________________
58
+
59
+ ## Getting Started
60
+
61
+ ### Prerequisites
62
+
63
+ - Python 3.12+
64
+ - Poetry
65
+
66
+ ### Setup
67
+
68
+ ```powershell
69
+ # 1) Generate .env values from environment variables
70
+ .\SetupDotEnv.ps1
71
+
72
+ # 2) Configure private Poetry sources when required
73
+ .\SetupPrivateRepoAccess.ps1
74
+
75
+ # 3) Optional: configure GitHub CLI access
76
+ .\SetupGitHubAccess.ps1
77
+
78
+ # 4) Install and sync project dependencies
79
+ .\InstallDevEnv.ps1
80
+
81
+ # 5) Run tests
82
+ poetry run pytest
83
+ ```
84
+
85
+ ### Usage Example
86
+
87
+ ```python
88
+ from datetime import date
89
+
90
+ from sdf.sample_data_factory import SampleDataFactory
91
+
92
+ factory = SampleDataFactory(
93
+ data_structure={
94
+ "headers": ["PlayerID", "PlayerName"],
95
+ "rows": [["1", "Player One"], ["2", "Player Two"]],
96
+ },
97
+ drive=None,
98
+ file_prefix="Players",
99
+ out_file_date=date(2026, 1, 1),
100
+ target_folder_id=None,
101
+ sub_folder_name=None,
102
+ )
103
+
104
+ archive_bytes = factory.build_urs_rating_pub_zip()
105
+ ```
106
+
107
+ ### Common Commands
108
+
109
+ ```powershell
110
+ poetry install
111
+ poetry run pytest
112
+ poetry run pytest --cov=src --cov=tests --cov-report=term-missing
113
+ poetry run black src tests
114
+ poetry run isort src tests
115
+ poetry run flake8 src tests
116
+ poetry run pre-commit run --all-files
117
+ ```
118
+
119
+ ______________________________________________________________________
120
+
121
+ ## Automation Scripts
122
+
123
+ - `InstallPy.ps1`: Bootstraps Python/Poetry setup.
124
+ - `InstallDevEnv.ps1`: Installs development dependencies and pre-commit hooks.
125
+ - `SetupDotEnv.ps1`: Generates `.env` from required environment variables.
126
+ - `SetupPrivateRepoAccess.ps1`: Configures private package source credentials.
127
+ - `SetupGitHubAccess.ps1`: Configures GitHub authentication for local automation.
128
+
129
+ ______________________________________________________________________
130
+
131
+ ## Active Workflows
132
+
133
+ - `.github/workflows/py-temp-pr-pub-no_docker-def.yaml`: Pull request validation workflow.
134
+ - `.github/workflows/py-temp-publish-pub-build_release_notify_after_merge-def.yaml`: Post-merge release and publish workflow.
135
+
@@ -0,0 +1,106 @@
1
+ # sample_data_factory
2
+
3
+ ______________________________________________________________________
4
+
5
+ ## Short description
6
+
7
+ `sample_data_factory` is a reusable Python helper for generating deterministic ZIP archives used in test flows, including URS CSV exports and FIDE players XML exports.
8
+
9
+ ______________________________________________________________________
10
+
11
+ ## Module Overview
12
+
13
+ ### Key Features
14
+
15
+ - Builds URS publication ZIP archives containing CSV payloads.
16
+ - Builds FIDE players-list ZIP archives containing XML payloads.
17
+ - Supports either local byte output or upload to Google Drive.
18
+ - Can create nested Google Drive sub-folders when uploading files.
19
+ - Includes unit tests for archive generation and constructor validation.
20
+
21
+ ### Project Structure
22
+
23
+ - `src/sdf/`: Core package implementation (`sample_data_factory.py`).
24
+ - `tests/unit/`: Unit tests for archive builders and helper methods.
25
+ - `scripts/`: SQL/bootstrap assets (legacy resources have been removed).
26
+ - `legacy/`: Archived resources excluded from normal test runs.
27
+ - `*.ps1`: Environment and dependency setup scripts.
28
+
29
+ ______________________________________________________________________
30
+
31
+ ## Getting Started
32
+
33
+ ### Prerequisites
34
+
35
+ - Python 3.12+
36
+ - Poetry
37
+
38
+ ### Setup
39
+
40
+ ```powershell
41
+ # 1) Generate .env values from environment variables
42
+ .\SetupDotEnv.ps1
43
+
44
+ # 2) Configure private Poetry sources when required
45
+ .\SetupPrivateRepoAccess.ps1
46
+
47
+ # 3) Optional: configure GitHub CLI access
48
+ .\SetupGitHubAccess.ps1
49
+
50
+ # 4) Install and sync project dependencies
51
+ .\InstallDevEnv.ps1
52
+
53
+ # 5) Run tests
54
+ poetry run pytest
55
+ ```
56
+
57
+ ### Usage Example
58
+
59
+ ```python
60
+ from datetime import date
61
+
62
+ from sdf.sample_data_factory import SampleDataFactory
63
+
64
+ factory = SampleDataFactory(
65
+ data_structure={
66
+ "headers": ["PlayerID", "PlayerName"],
67
+ "rows": [["1", "Player One"], ["2", "Player Two"]],
68
+ },
69
+ drive=None,
70
+ file_prefix="Players",
71
+ out_file_date=date(2026, 1, 1),
72
+ target_folder_id=None,
73
+ sub_folder_name=None,
74
+ )
75
+
76
+ archive_bytes = factory.build_urs_rating_pub_zip()
77
+ ```
78
+
79
+ ### Common Commands
80
+
81
+ ```powershell
82
+ poetry install
83
+ poetry run pytest
84
+ poetry run pytest --cov=src --cov=tests --cov-report=term-missing
85
+ poetry run black src tests
86
+ poetry run isort src tests
87
+ poetry run flake8 src tests
88
+ poetry run pre-commit run --all-files
89
+ ```
90
+
91
+ ______________________________________________________________________
92
+
93
+ ## Automation Scripts
94
+
95
+ - `InstallPy.ps1`: Bootstraps Python/Poetry setup.
96
+ - `InstallDevEnv.ps1`: Installs development dependencies and pre-commit hooks.
97
+ - `SetupDotEnv.ps1`: Generates `.env` from required environment variables.
98
+ - `SetupPrivateRepoAccess.ps1`: Configures private package source credentials.
99
+ - `SetupGitHubAccess.ps1`: Configures GitHub authentication for local automation.
100
+
101
+ ______________________________________________________________________
102
+
103
+ ## Active Workflows
104
+
105
+ - `.github/workflows/py-temp-pr-pub-no_docker-def.yaml`: Pull request validation workflow.
106
+ - `.github/workflows/py-temp-publish-pub-build_release_notify_after_merge-def.yaml`: Post-merge release and publish workflow.
@@ -0,0 +1,122 @@
1
+ [build-system]
2
+ requires = [
3
+ "poetry-core>=2.0.0,<3.0.0",
4
+ ]
5
+ build-backend = "poetry.core.masonry.api"
6
+
7
+ [project]
8
+ name = "sample_data_factory"
9
+ version = "0.7.0"
10
+ description = "Reusable test data factory for CSV and FIDE XML archives."
11
+ authors = [
12
+ { name = "Hendrik du Toit", email = "hendrik@brightedge.co.za" }
13
+ ]
14
+ classifiers = [
15
+ "Development Status :: 4 - Beta",
16
+ "Intended Audience :: Developers",
17
+ "Intended Audience :: Information Technology",
18
+ "Intended Audience :: System Administrators",
19
+ "License :: OSI Approved :: MIT License",
20
+ "Programming Language :: Python :: 3.10",
21
+ "Programming Language :: Python :: 3.11",
22
+ "Programming Language :: Python :: 3.12",
23
+ "Programming Language :: Python :: 3.13",
24
+ "Topic :: Software Development :: Libraries :: Python Modules",
25
+ "Topic :: System :: Archiving :: Packaging",
26
+ ]
27
+ dependencies = [
28
+ ]
29
+ license = "MIT"
30
+ #license = "Proprietary"
31
+ license-files = ["LICENSE.txt"]  # NOTE(review): LICENSE.txt contains a proprietary all-rights-reserved notice, which conflicts with license = "MIT" above — reconcile before release
32
+ maintainers = [
33
+ { name = "Hendrik du Toit", email = "hendrikdt@citiqprepaid.co.za" },
34
+ { name = "Henru du Toit", email = "henru@brightedge.co.za" },
35
+ { name = "Dirk du Toit", email = "dirk@brightedge.co.za" },
36
+ ]
37
+ readme = { file = "README.md", content-type = "text/markdown" }
38
+ requires-python = ">=3.12"  # NOTE(review): classifiers above still advertise Python 3.10/3.11 — reconcile with this version floor
39
+ packages = [{include = "sdf", from = "src"}]
40
+
41
+ [project.urls]
42
+ # documentation = "https://readthedocs.org"
43
+ Issues = "https://github.com/RealTimeEvents/sample_data_factory/issues"
44
+ changelog = "https://github.com/RealTimeEvents/sample_data_factory/releases"
45
+ Homepage = "https://github.com/RealTimeEvents/sample_data_factory"
46
+ Repository = "https://github.com/RealTimeEvents/sample_data_factory.git"
47
+
48
+ [tool.black]
49
+ line-length = 120
50
+ target-version = [
51
+ "py313",
52
+ ]
53
+ extend-exclude = """
54
+ (
55
+ ^tests/testdata.py
56
+ )
57
+ """
58
+
59
+ [tool.codespell]
60
+ count = ""
61
+ quiet-level = 2
62
+ skip = "working/*,legacy/*"
63
+ ignore-words-list = "space-holder"
64
+ write-changes = ""
65
+
66
+ [tool.coverage.run]
67
+ source = [
68
+ "src",
69
+ "tests"
70
+ ]
71
+ omit = [
72
+ "./legacy/*",
73
+ "./tests/legacy/*"
74
+ ]
75
+
76
+ [tool.isort]
77
+ profile = "black"
78
+
79
+ [tool.poetry]
80
+ packages = [
81
+ { include = "sdf", from = "src" },
82
+ ]
83
+
84
+ [tool.poetry.dependencies]
85
+
86
+ [tool.poetry.group.dev]
87
+ optional = true
88
+
89
+ [tool.poetry.group.dev.dependencies]
90
+ black = ">=25.1.0"
91
+ codecov = ">=2.1.13"
92
+ flake8 = ">=7.1.1"
93
+ isort = "^5.13.2"
94
+ mdformat-gfm = ">=0.4.1"
95
+ mdformat-frontmatter = ">=2.0.8"
96
+ mdformat-footnote = ">=0.1.1"
97
+ pre-commit = ">=4.0.1"
98
+ pygments = "^2.19.1"
99
+ pytest = ">=8.3.4"
100
+ pytest-cov = ">=6.0.0"
101
+ sphinx = ">=8.1.3"
102
+ twine = ">=6.1.0"
103
+
104
+ [tool.pytest.ini_options]
105
+ norecursedirs = ["tests/legacy"]
106
+ addopts = [
107
+ "-vv",
108
+ "--ignore-glob=*/Archive",
109
+ "--ignore=/legacy",
110
+ "--ignore=tests/legacy",
111
+ ]
112
+ filterwarnings = [
113
+ # "ignore::DeprecationWarning",
114
+ ]
115
+ pythonpath = [
116
+ "src",
117
+ "tests",
118
+ ]
119
+ testpaths = "tests"
120
+ markers = [
121
+ "select: Run a selection of tests",
122
+ ]
File without changes
@@ -0,0 +1,152 @@
1
+ from __future__ import annotations
2
+
3
+ import csv
4
+ import io
5
+ import tempfile
6
+ import zipfile
7
+ from datetime import date
8
+ from pathlib import Path
9
+ from typing import Any
10
+ from typing import Protocol
11
+ from xml.etree import ElementTree as ET
12
+
13
+ GOOGLE_DRIVE_FOLDER_MIME = "application/vnd.google-apps.folder"
14
+
15
+
16
+ class GoogleDriveInterface(Protocol):
17
+ service: Any
18
+
19
+ def list_children(self, folder_id: str) -> list[dict[str, str]]:
20
+ pass
21
+
22
+ def upload_file(self, local_path: Path, folder_id: str | None) -> str:
23
+ pass
24
+
25
+
26
class SampleDataFactory:
    """Test helper that builds deterministic publication ZIP archives for URS and FIDE test flows.

    Two archive flavours are supported:

    - URS rating publication archives: a ZIP containing one CSV payload built
      from ``data_structure["headers"]`` and ``data_structure["rows"]``.
    - FIDE players-list archives: a ZIP containing one XML payload built from
      ``data_structure["players"]``.

    Archives are byte-for-byte reproducible: ZIP member timestamps are pinned
    to ``out_file_date`` (midnight) instead of the wall clock, so building the
    same input twice yields identical bytes.

    When ``drive`` is supplied, built archives are uploaded to Google Drive
    (optionally inside a freshly created ``sub_folder_name`` chain below
    ``target_folder_id``) and the uploaded file id is returned; otherwise the
    raw archive bytes are returned.
    """

    def __init__(
        self,
        data_structure: dict,
        drive: GoogleDriveInterface | None,
        file_prefix: str,
        out_file_date: date,
        target_folder_id: str | None,
        sub_folder_name: Path | None,
    ):
        """Validate and store the build configuration.

        Args:
            data_structure: Payload description; ``headers``/``rows`` for CSV
                archives, ``players`` for FIDE XML archives.
            drive: Optional Google Drive client used for uploads.
            file_prefix: Prefix used when naming the CSV/ZIP output files.
            out_file_date: Date embedded in output file names and ZIP member
                timestamps (must be 1980 or later, per the ZIP format).
            target_folder_id: Drive folder receiving uploads; required when
                ``drive`` is given, forbidden otherwise.
            sub_folder_name: Optional relative folder path created below
                ``target_folder_id`` before uploading.

        Raises:
            ValueError: If the drive/folder arguments are inconsistent.
            TypeError: If ``sub_folder_name`` is neither a ``Path`` nor ``None``.
        """
        self.data_structure = data_structure
        self.drive = drive
        self.file_date = out_file_date
        self.file_prefix = file_prefix
        self.sub_folder_name = sub_folder_name
        self.target_folder_id = target_folder_id

        # Folder-related options only make sense together with a drive client.
        if self.drive is None and self.sub_folder_name is not None:
            raise ValueError("sub_folder_name requires a drive instance")
        if self.drive is None and self.target_folder_id is not None:
            raise ValueError("target_folder_id requires a drive instance")
        if self.drive is not None and self.target_folder_id is None:
            raise ValueError("target_folder_id is required when drive is provided")
        if self.sub_folder_name is not None and not isinstance(self.sub_folder_name, Path):
            raise TypeError("sub_folder_name must be a Path or None")

    def _build_csv_bytes(self) -> bytes:
        """Render ``headers`` + ``rows`` from ``data_structure`` as UTF-8 CSV bytes."""
        headers = self.data_structure["headers"]
        rows = self.data_structure["rows"]
        # newline="" + explicit lineterminator keeps output identical across platforms.
        buffer = io.StringIO(newline="")
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(headers)
        writer.writerows(rows)
        return buffer.getvalue().encode("utf-8")

    def _build_fide_players_xml_bytes(self) -> bytes:
        """Build the FIDE players archive and return it as ZIP bytes.

        Despite the name (kept for backward compatibility with existing
        callers/tests), this returns a ZIP archive whose single member
        ``players_list_xml_foa.xml`` holds the rendered ``<playerslist>`` XML.
        Player fields are emitted in sorted key order so output does not
        depend on dict insertion order.
        """
        players = self.data_structure["players"]
        root = ET.Element("playerslist")
        xml_filename = "players_list_xml_foa.xml"

        for player_data in players:
            player_element = ET.SubElement(root, "player")
            for key in sorted(player_data.keys()):
                value = player_data[key]
                field = ET.SubElement(player_element, key)
                field.text = value

        xml_bytes = ET.tostring(root, encoding="utf-8")
        return self._zip_bytes(xml_filename, xml_bytes)

    def _zip_bytes(self, member_name: str, payload: bytes) -> bytes:
        """Return a one-member ZIP archive with its timestamp pinned to ``file_date``.

        ``ZipFile.writestr(name, data)`` stamps members with the current local
        time, which would make otherwise-identical builds differ byte-for-byte;
        pinning the timestamp keeps archives reproducible, as advertised.
        """
        member = zipfile.ZipInfo(
            member_name,
            date_time=(self.file_date.year, self.file_date.month, self.file_date.day, 0, 0, 0),
        )
        # ZipInfo defaults to ZIP_STORED; request compression explicitly.
        member.compress_type = zipfile.ZIP_DEFLATED
        output_buffer = io.BytesIO()
        with zipfile.ZipFile(output_buffer, "w", compression=zipfile.ZIP_DEFLATED) as output_zip:
            output_zip.writestr(member, payload)
        return output_buffer.getvalue()

    def _create_sub_folder_if_needed(self, parent_folder_id: str, sub_folder_name: Path) -> str:
        """Walk/create each component of ``sub_folder_name`` below ``parent_folder_id``.

        Existing folders are reused; missing ones are created through the raw
        Drive ``files().create`` API. Returns the id of the deepest folder.
        """
        current_parent_id = parent_folder_id
        folder_parts = [part for part in sub_folder_name.parts if part not in {"", "."}]

        for folder_name in folder_parts:
            subfolder_id = None
            for item in self.drive.list_children(current_parent_id):
                if item.get("name") == folder_name and item.get("mimeType") == GOOGLE_DRIVE_FOLDER_MIME:
                    subfolder_id = item.get("id")
                    break

            if not subfolder_id:
                folder_metadata = {
                    "mimeType": GOOGLE_DRIVE_FOLDER_MIME,
                    "name": folder_name,
                    "parents": [current_parent_id],
                }
                folder = (
                    self.drive.service.files()
                    .create(body=folder_metadata, fields="id", supportsAllDrives=True)
                    .execute()
                )
                subfolder_id = folder.get("id")

            current_parent_id = subfolder_id

        return current_parent_id

    def _csv_name(self) -> str:
        """Return the CSV member name, e.g. ``Players_260101.csv``."""
        return f"{self.file_prefix}_{self._yymmdd()}.csv"

    def _upload_or_return_bytes(self, archive_bytes: bytes, archive_filename: str) -> str | bytes:
        """Return ``archive_bytes`` directly, or upload them and return the Drive file id.

        When uploading, the archive is staged in a temporary directory under
        ``archive_filename`` and, if configured, placed inside the
        ``sub_folder_name`` chain created below ``target_folder_id``.
        """
        if self.drive is None:
            return archive_bytes

        with tempfile.TemporaryDirectory() as temp_dir:
            temp_path = Path(temp_dir)
            zip_path = temp_path / archive_filename
            zip_path.write_bytes(archive_bytes)
            upload_folder_id = self.target_folder_id

            if self.sub_folder_name is not None:
                upload_folder_id = self._create_sub_folder_if_needed(upload_folder_id, self.sub_folder_name)

            file_id = self.drive.upload_file(local_path=zip_path, folder_id=upload_folder_id)

        return file_id

    def _yymmdd(self) -> str:
        """Return ``file_date`` formatted as a six-digit YYMMDD string."""
        return self.file_date.strftime("%y%m%d")

    def _zip_name(self) -> str:
        """Return the archive name, e.g. ``Players_260101.zip``."""
        return f"{self.file_prefix}_{self._yymmdd()}.zip"

    def build(self) -> str | bytes:
        """Default build entry point; delegates to :meth:`build_urs_rating_pub_zip`."""
        return self.build_urs_rating_pub_zip()

    def build_fide_players_list_zip(self) -> str | bytes:
        """Build the FIDE players-list archive; return its bytes or the uploaded file id."""
        archive_bytes = self._build_fide_players_xml_bytes()
        archive_filename = "players_list_xml.zip"
        return self._upload_or_return_bytes(archive_bytes=archive_bytes, archive_filename=archive_filename)

    def build_urs_rating_pub_zip(self) -> str | bytes:
        """Build the URS rating publication archive; return its bytes or the uploaded file id."""
        archive_bytes = self._zip_bytes(self._csv_name(), self._build_csv_bytes())
        return self._upload_or_return_bytes(archive_bytes=archive_bytes, archive_filename=self._zip_name())