PyPI - swarmauri_parser_bertembedding - Versions diffs - 0.8.0.dev4__tar.gz → 0.8.0.dev21__tar.gz - Mend

swarmauri_parser_bertembedding 0.8.0.dev4__tar.gz → 0.8.0.dev21__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

swarmauri_parser_bertembedding-0.8.0.dev21/PKG-INFO ADDED

@@ -0,0 +1,135 @@
+Metadata-Version: 2.4
+Name: swarmauri_parser_bertembedding
+Version: 0.8.0.dev21
+Summary: Swarmauri Bert Embedding Parser
+License-Expression: Apache-2.0
+License-File: LICENSE
+Keywords: swarmauri,parser,bertembedding,bert,embedding
+Author: Jacob Stewart
+Author-email: jacob@swarmauri.com
+Requires-Python: >=3.10,<3.13
+Classifier: License :: OSI Approved :: Apache Software License
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Natural Language :: English
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
+Requires-Dist: swarmauri_base
+Requires-Dist: swarmauri_core
+Requires-Dist: swarmauri_standard
+Requires-Dist: torch
+Requires-Dist: transformers (>=4.45.0)
+Description-Content-Type: text/markdown
+![Swarmauri Logo](https://github.com/swarmauri/swarmauri-sdk/blob/3d4d1cfa949399d7019ae9d8f296afba773dfb7f/assets/swarmauri.brand.theme.svg)
+<p align="center">
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/dm/swarmauri_parser_bertembedding" alt="PyPI - Downloads"/></a>
+    <a href="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding/">
+        <img alt="Hits" src="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding.svg"/></a>
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/pyversions/swarmauri_parser_bertembedding" alt="PyPI - Python Version"/></a>
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/l/swarmauri_parser_bertembedding" alt="PyPI - License"/></a>
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/v/swarmauri_parser_bertembedding?label=swarmauri_parser_bertembedding&color=green" alt="PyPI - swarmauri_parser_bertembedding"/></a>
+</p>
+---
+# Swarmauri Parser Bert Embedding
+Parser that converts text into embeddings using a Hugging Face BERT encoder. Produces `Document` objects whose metadata carries the averaged token embedding so downstream Swarmauri pipelines can work with dense vectors.
+## Features
+- Uses `transformers.BertModel` + `BertTokenizer` (default `bert-base-uncased`).
+- Accepts single strings or lists of strings and emits `Document` instances with original text and embedding metadata.
+- Runs in inference (`eval`) mode with automatic `torch.no_grad()` handling.
+- Works on CPU by default; configure PyTorch device settings to leverage GPU.
+## Prerequisites
+- Python 3.10 or newer.
+- PyTorch compatible with your hardware (installed automatically via `transformers` if not present; install CUDA-enabled wheels manually when needed).
+- Internet access on first run so Hugging Face downloads tokenizer/model weights (or warm the cache ahead of time).
+## Installation
+```bash
+# pip
+pip install swarmauri_parser_bertembedding
+# poetry
+poetry add swarmauri_parser_bertembedding
+# uv (pyproject-based projects)
+uv add swarmauri_parser_bertembedding
+```
+## Quickstart
+```python
+from swarmauri_parser_bertembedding import BERTEmbeddingParser
+parser = BERTEmbeddingParser(parser_model_name="bert-base-uncased")
+documents = parser.parse([
+    "Swarmauri agents cooperate over shared memory.",
+    "Dense embeddings power semantic search.",
+])
+for doc in documents:
+    vector = doc.metadata["embedding"]
+    print(doc.content)
+    print(len(vector), vector[:5])
+```
+## Custom Models & Devices
+```python
+import torch
+from swarmauri_parser_bertembedding import BERTEmbeddingParser
+from transformers import BertModel
+class GPUParser(BERTEmbeddingParser):
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+        self._model = BertModel.from_pretrained(self.parser_model_name).to("cuda")
+parser = GPUParser(parser_model_name="bert-base-multilingual-cased")
+parser._model.eval()
+```
+## Batch Embeddings at Scale
+```python
+from tqdm import tqdm
+from swarmauri_parser_bertembedding import BERTEmbeddingParser
+texts = [f"Paragraph {i}" for i in range(1000)]
+parser = BERTEmbeddingParser()
+batched_docs = []
+batch_size = 32
+for start in tqdm(range(0, len(texts), batch_size)):
+    batch = texts[start:start + batch_size]
+    batched_docs.extend(parser.parse(batch))
+```
+Persist the resulting vectors into Swarmauri vector stores (Redis, Qdrant, etc.) via the metadata field.
+## Tips
+- Preprocess text to match model expectations (lowercase for uncased BERT, language-specific cleanup for multilingual models).
+- For extremely long documents, consider chunking before calling `parse` to respect the 512 token limit.
+- Use PyTorch's `to("cuda")` or `to("mps")` to execute on GPUs or Apple silicon accelerators.
+- Cache Hugging Face weights in CI/CD environments (`HF_HOME=/cache/hf`) to avoid repeated downloads.
+## Want to help?
+If you want to contribute to swarmauri-sdk, read up on our [guidelines for contributing](https://github.com/swarmauri/swarmauri-sdk/blob/master/contributing.md) that will help you get started.
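
The metadata in this file declares `Requires-Python: >=3.10,<3.13`, which is narrower than the README's "Python 3.10 or newer". The following is a minimal sketch of checking an interpreter version against that range. The helper name `supports_python` is hypothetical and not part of the package; installers such as pip enforce this constraint on their own, so the snippet is purely illustrative.

```python
import sys

# Bounds copied from the Requires-Python field: >=3.10,<3.13
MIN_VERSION = (3, 10)  # inclusive lower bound
MAX_VERSION = (3, 13)  # exclusive upper bound


def supports_python(version=None):
    """Return True if (major, minor) falls inside >=3.10,<3.13.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION


# Report whether the current interpreter is inside the declared range.
print(supports_python())
```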

swarmauri_parser_bertembedding-0.8.0.dev21/README.md ADDED

@@ -0,0 +1,109 @@
+![Swarmauri Logo](https://github.com/swarmauri/swarmauri-sdk/blob/3d4d1cfa949399d7019ae9d8f296afba773dfb7f/assets/swarmauri.brand.theme.svg)
+<p align="center">
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/dm/swarmauri_parser_bertembedding" alt="PyPI - Downloads"/></a>
+    <a href="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding/">
+        <img alt="Hits" src="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding.svg"/></a>
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/pyversions/swarmauri_parser_bertembedding" alt="PyPI - Python Version"/></a>
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/l/swarmauri_parser_bertembedding" alt="PyPI - License"/></a>
+    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
+        <img src="https://img.shields.io/pypi/v/swarmauri_parser_bertembedding?label=swarmauri_parser_bertembedding&color=green" alt="PyPI - swarmauri_parser_bertembedding"/></a>
+</p>
+---
+# Swarmauri Parser Bert Embedding
+Parser that converts text into embeddings using a Hugging Face BERT encoder. Produces `Document` objects whose metadata carries the averaged token embedding so downstream Swarmauri pipelines can work with dense vectors.
+## Features
+- Uses `transformers.BertModel` + `BertTokenizer` (default `bert-base-uncased`).
+- Accepts single strings or lists of strings and emits `Document` instances with original text and embedding metadata.
+- Runs in inference (`eval`) mode with automatic `torch.no_grad()` handling.
+- Works on CPU by default; configure PyTorch device settings to leverage GPU.
+## Prerequisites
+- Python 3.10 or newer.
+- PyTorch compatible with your hardware (installed automatically via `transformers` if not present; install CUDA-enabled wheels manually when needed).
+- Internet access on first run so Hugging Face downloads tokenizer/model weights (or warm the cache ahead of time).
+## Installation
+```bash
+# pip
+pip install swarmauri_parser_bertembedding
+# poetry
+poetry add swarmauri_parser_bertembedding
+# uv (pyproject-based projects)
+uv add swarmauri_parser_bertembedding
+```
+## Quickstart
+```python
+from swarmauri_parser_bertembedding import BERTEmbeddingParser
+parser = BERTEmbeddingParser(parser_model_name="bert-base-uncased")
+documents = parser.parse([
+    "Swarmauri agents cooperate over shared memory.",
+    "Dense embeddings power semantic search.",
+])
+for doc in documents:
+    vector = doc.metadata["embedding"]
+    print(doc.content)
+    print(len(vector), vector[:5])
+```
+## Custom Models & Devices
+```python
+import torch
+from swarmauri_parser_bertembedding import BERTEmbeddingParser
+from transformers import BertModel
+class GPUParser(BERTEmbeddingParser):
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+        self._model = BertModel.from_pretrained(self.parser_model_name).to("cuda")
+parser = GPUParser(parser_model_name="bert-base-multilingual-cased")
+parser._model.eval()
+```
+## Batch Embeddings at Scale
+```python
+from tqdm import tqdm
+from swarmauri_parser_bertembedding import BERTEmbeddingParser
+texts = [f"Paragraph {i}" for i in range(1000)]
+parser = BERTEmbeddingParser()
+batched_docs = []
+batch_size = 32
+for start in tqdm(range(0, len(texts), batch_size)):
+    batch = texts[start:start + batch_size]
+    batched_docs.extend(parser.parse(batch))
+```
+Persist the resulting vectors into Swarmauri vector stores (Redis, Qdrant, etc.) via the metadata field.
+## Tips
+- Preprocess text to match model expectations (lowercase for uncased BERT, language-specific cleanup for multilingual models).
+- For extremely long documents, consider chunking before calling `parse` to respect the 512 token limit.
+- Use PyTorch's `to("cuda")` or `to("mps")` to execute on GPUs or Apple silicon accelerators.
+- Cache Hugging Face weights in CI/CD environments (`HF_HOME=/cache/hf`) to avoid repeated downloads.
+## Want to help?
+If you want to contribute to swarmauri-sdk, read up on our [guidelines for contributing](https://github.com/swarmauri/swarmauri-sdk/blob/master/contributing.md) that will help you get started.
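
The Tips section of the added README recommends chunking long documents before calling `parse` so each input stays within BERT's 512-token limit. The following is a rough sketch of one way to do that, assuming whitespace-separated words as a stand-in for tokens. WordPiece tokenization usually yields more tokens than words, so the default budget is deliberately conservative. The function `chunk_words` and its default values are illustrative assumptions, not part of the package.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping chunks of at most max_words words.

    Words are a rough proxy for BERT tokens; this does not call a tokenizer.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

The resulting list of strings can then be handed to `parser.parse(...)`, which according to the README accepts a list of strings.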

{swarmauri_parser_bertembedding-0.8.0.dev4 → swarmauri_parser_bertembedding-0.8.0.dev21}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "swarmauri_parser_bertembedding"
-version = "0.8.0.dev4"
+version = "0.8.0.dev21"
 description = "Swarmauri Bert Embedding Parser"
 license = "Apache-2.0"
 readme = "README.md"
@@ -11,6 +11,10 @@ classifiers = [
     "Programming Language :: Python :: 3.10",
     "Programming Language :: Python :: 3.11",
     "Programming Language :: Python :: 3.12",
+    "Natural Language :: English",
+    "Development Status :: 3 - Alpha",
+    "Intended Audience :: Developers",
+    "Topic :: Software Development :: Libraries :: Application Frameworks",
 ]
 authors = [{ name = "Jacob Stewart", email = "jacob@swarmauri.com" }]
 dependencies = [
@@ -20,6 +24,13 @@ dependencies = [
     "swarmauri_base",
     "swarmauri_standard",
 ]
+keywords = [
+    "swarmauri",
+    "parser",
+    "bertembedding",
+    "bert",
+    "embedding",
+]
 [tool.uv.sources]
 swarmauri_core = { workspace = true }
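
The version bump in this file, from `0.8.0.dev4` to `0.8.0.dev21`, is a case where plain string comparison orders the two releases incorrectly. The following is a minimal sketch that handles only the `X.Y.Z.devN` shape seen in this diff. The helper `dev_version_key` is hypothetical; real tooling would more likely rely on a full PEP 440 implementation such as the `packaging` library.

```python
import re

# Matches only versions shaped like 0.8.0.dev4 (an assumption for this sketch).
_DEV_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.dev(\d+)$")


def dev_version_key(version):
    """Return a tuple of ints usable as a sort key for X.Y.Z.devN versions."""
    match = _DEV_VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    return tuple(int(part) for part in match.groups())


old, new = "0.8.0.dev4", "0.8.0.dev21"
print(old < new)                                    # False: "4" sorts after "2" as text
print(dev_version_key(old) < dev_version_key(new))  # True: 4 < 21 as integers
```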

swarmauri_parser_bertembedding-0.8.0.dev4/PKG-INFO DELETED

@@ -1,67 +0,0 @@
-Metadata-Version: 2.3
-Name: swarmauri_parser_bertembedding
-Version: 0.8.0.dev4
-Summary: Swarmauri Bert Embedding Parser
-License: Apache-2.0
-Author: Jacob Stewart
-Author-email: jacob@swarmauri.com
-Requires-Python: >=3.10,<3.13
-Classifier: License :: OSI Approved :: Apache Software License
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Requires-Dist: swarmauri_base
-Requires-Dist: swarmauri_core
-Requires-Dist: swarmauri_standard
-Requires-Dist: torch
-Requires-Dist: transformers (>=4.45.0)
-Description-Content-Type: text/markdown
-![Swamauri Logo](https://res.cloudinary.com/dbjmpekvl/image/upload/v1730099724/Swarmauri-logo-lockup-2048x757_hww01w.png)
-<p align="center">
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/dm/swarmauri_parser_bertembedding" alt="PyPI - Downloads"/></a>
-    <a href="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding/">
-        <img alt="Hits" src="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding.svg"/></a>
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/pyversions/swarmauri_parser_bertembedding" alt="PyPI - Python Version"/></a>
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/l/swarmauri_parser_bertembedding" alt="PyPI - License"/></a>
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/v/swarmauri_parser_bertembedding?label=swarmauri_parser_bertembedding&color=green" alt="PyPI - swarmauri_parser_bertembedding"/></a>
-</p>
----
-# Swarmauri Parser Bert Embedding
-A parser that transforms input text into document embeddings using BERT.
-## Installation
-```bash
-pip install swarmauri_parser_bertembedding
-```
-## Usage
-Basic usage examples with code snippets
-```python
-from swarmauri.parsers.BERTEmbeddingParser import BERTEmbeddingParser
-# Initialize the parser
-parser = BERTEmbeddingParser()
-# Parse some text data
-documents = parser.parse("Your text data here")
-# Access the embeddings
-for doc in documents:
-    print(doc.content)
-```
-## Want to help?
-If you want to contribute to swarmauri-sdk, read up on our [guidelines for contributing](https://github.com/swarmauri/swarmauri-sdk/blob/master/contributing.md) that will help you get started.

swarmauri_parser_bertembedding-0.8.0.dev4/README.md DELETED

@@ -1,47 +0,0 @@
-![Swamauri Logo](https://res.cloudinary.com/dbjmpekvl/image/upload/v1730099724/Swarmauri-logo-lockup-2048x757_hww01w.png)
-<p align="center">
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/dm/swarmauri_parser_bertembedding" alt="PyPI - Downloads"/></a>
-    <a href="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding/">
-        <img alt="Hits" src="https://hits.sh/github.com/swarmauri/swarmauri-sdk/tree/master/pkgs/community/swarmauri_parser_bertembedding.svg"/></a>
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/pyversions/swarmauri_parser_bertembedding" alt="PyPI - Python Version"/></a>
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/l/swarmauri_parser_bertembedding" alt="PyPI - License"/></a>
-    <a href="https://pypi.org/project/swarmauri_parser_bertembedding/">
-        <img src="https://img.shields.io/pypi/v/swarmauri_parser_bertembedding?label=swarmauri_parser_bertembedding&color=green" alt="PyPI - swarmauri_parser_bertembedding"/></a>
-</p>
----
-# Swarmauri Parser Bert Embedding
-A parser that transforms input text into document embeddings using BERT.
-## Installation
-```bash
-pip install swarmauri_parser_bertembedding
-```
-## Usage
-Basic usage examples with code snippets
-```python
-from swarmauri.parsers.BERTEmbeddingParser import BERTEmbeddingParser
-# Initialize the parser
-parser = BERTEmbeddingParser()
-# Parse some text data
-documents = parser.parse("Your text data here")
-# Access the embeddings
-for doc in documents:
-    print(doc.content)
-```
-## Want to help?
-If you want to contribute to swarmauri-sdk, read up on our [guidelines for contributing](https://github.com/swarmauri/swarmauri-sdk/blob/master/contributing.md) that will help you get started.

{swarmauri_parser_bertembedding-0.8.0.dev4 → swarmauri_parser_bertembedding-0.8.0.dev21}/LICENSE RENAMED

File without changes

{swarmauri_parser_bertembedding-0.8.0.dev4 → swarmauri_parser_bertembedding-0.8.0.dev21}/swarmauri_parser_bertembedding/BERTEmbeddingParser.py RENAMED

File without changes

{swarmauri_parser_bertembedding-0.8.0.dev4 → swarmauri_parser_bertembedding-0.8.0.dev21}/swarmauri_parser_bertembedding/__init__.py RENAMED

File without changes
