caption-flow 0.2.1__tar.gz → 0.2.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. {caption_flow-0.2.1/src/caption_flow.egg-info → caption_flow-0.2.3}/PKG-INFO +29 -27
  2. {caption_flow-0.2.1 → caption_flow-0.2.3}/README.md +28 -26
  3. {caption_flow-0.2.1 → caption_flow-0.2.3}/pyproject.toml +1 -1
  4. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/cli.py +2 -1
  5. caption_flow-0.2.3/src/caption_flow/models.py +191 -0
  6. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/monitor.py +1 -1
  7. caption_flow-0.2.3/src/caption_flow/orchestrator.py +914 -0
  8. caption_flow-0.2.3/src/caption_flow/processors/__init__.py +11 -0
  9. caption_flow-0.2.3/src/caption_flow/processors/base.py +219 -0
  10. caption_flow-0.2.3/src/caption_flow/processors/huggingface.py +832 -0
  11. caption_flow-0.2.3/src/caption_flow/processors/local_filesystem.py +683 -0
  12. caption_flow-0.2.3/src/caption_flow/processors/webdataset.py +782 -0
  13. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/storage.py +415 -406
  14. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/checkpoint_tracker.py +2 -2
  15. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/chunk_tracker.py +94 -35
  16. caption_flow-0.2.3/src/caption_flow/utils/dataset_loader.py +222 -0
  17. caption_flow-0.2.3/src/caption_flow/utils/dataset_metadata_cache.py +67 -0
  18. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/image_processor.py +1 -4
  19. caption_flow-0.2.3/src/caption_flow/utils/shard_processor.py +119 -0
  20. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/shard_tracker.py +1 -5
  21. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/workers/base.py +3 -3
  22. caption_flow-0.2.3/src/caption_flow/workers/caption.py +945 -0
  23. {caption_flow-0.2.1 → caption_flow-0.2.3/src/caption_flow.egg-info}/PKG-INFO +29 -27
  24. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow.egg-info/SOURCES.txt +6 -0
  25. caption_flow-0.2.1/src/caption_flow/models.py +0 -84
  26. caption_flow-0.2.1/src/caption_flow/orchestrator.py +0 -2086
  27. caption_flow-0.2.1/src/caption_flow/utils/dataset_loader.py +0 -680
  28. caption_flow-0.2.1/src/caption_flow/utils/shard_processor.py +0 -315
  29. caption_flow-0.2.1/src/caption_flow/workers/caption.py +0 -1321
  30. {caption_flow-0.2.1 → caption_flow-0.2.3}/LICENSE +0 -0
  31. {caption_flow-0.2.1 → caption_flow-0.2.3}/setup.cfg +0 -0
  32. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/__init__.py +0 -0
  33. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/__init__.py +0 -0
  34. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/auth.py +0 -0
  35. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/caption_utils.py +0 -0
  36. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/certificates.py +0 -0
  37. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/job_queue.py +0 -0
  38. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/json_utils.py +0 -0
  39. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/prompt_template.py +0 -0
  40. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/utils/vllm_config.py +0 -0
  41. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow/workers/data.py +0 -0
  42. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow.egg-info/dependency_links.txt +0 -0
  43. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow.egg-info/entry_points.txt +0 -0
  44. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow.egg-info/requires.txt +0 -0
  45. {caption_flow-0.2.1 → caption_flow-0.2.3}/src/caption_flow.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: caption-flow
3
- Version: 0.2.1
3
+ Version: 0.2.3
4
4
  Summary: Self-contained distributed community captioning system
5
5
  Author-email: bghira <bghira@users.github.com>
6
6
  License: MIT
@@ -69,12 +69,14 @@ pip install -e . # installs the `caption-flow` command
69
69
  1. copy + edit the sample configs
70
70
 
71
71
  ```bash
72
- cp orchestrator.yaml my-orchestrator.yaml
73
- cp worker.yaml my-worker.yaml
74
- cp monitor.yaml my-monitor.yaml # optional; requires a monitor module
72
+ cp examples/orchestrator/local_image_files.yaml my-orchestrator.yaml
73
+ cp examples/worker.yaml my-worker.yaml
74
+ cp examples/monitor.yaml my-monitor.yaml # optional terminal interface
75
75
  ```
76
76
 
77
- set a unique shared token in both `my-orchestrator.yaml` and `my-worker.yaml` (see `auth.worker_tokens` in the orchestrator config and `worker.token` in the worker config). if you use private hugging face datasets/models, export `HUGGINGFACE_HUB_TOKEN` before starting workers.
77
+ set a unique shared token in both `my-orchestrator.yaml` and `my-worker.yaml` (see `auth.worker_tokens` in the orchestrator config and `worker.token` in the worker config).
78
+
79
+ if you use private hugging face datasets/models, export `HUGGINGFACE_HUB_TOKEN` before starting anything.
78
80
 
79
81
  2. start the orchestrator
80
82
 
@@ -90,6 +92,9 @@ caption-flow worker --config my-worker.yaml --gpu-id 0
90
92
 
91
93
  # your second GPU
92
94
  caption-flow worker --config my-worker.yaml --gpu-id 1
95
+
96
+ # on a remote host
97
+ caption-flow worker --config my-worker.yaml --server ws://your.hostname.address:8765
93
98
  ```
94
99
 
95
100
  4. (optional) start the monitor
@@ -98,12 +103,6 @@ caption-flow worker --config my-worker.yaml --gpu-id 1
98
103
  caption-flow monitor --config my-monitor.yaml
99
104
  ```
100
105
 
101
- 5. (optional) scan/fix chunks on disk if you had crashes
102
-
103
- ```bash
104
- caption-flow scan_chunks --data-dir ./caption_data --checkpoint-dir ./checkpoints --fix
105
- ```
106
-
107
106
  ---
108
107
 
109
108
  ## how it’s wired
@@ -178,7 +177,7 @@ orchestrator:
178
177
  # key: /path/privkey.pem
179
178
 
180
179
  dataset:
181
- type: huggingface # or "local"
180
+ type: huggingface
182
181
  path: <hf-dataset-or-local-path>
183
182
  name: <logical-name>
184
183
  version: "1.0"
@@ -315,28 +314,31 @@ PRs welcome. keep it simple and fast.
315
314
  ## Storage Schema
316
315
 
317
316
  ### captions.parquet
317
+
318
318
  - `job_id`: Unique job identifier
319
- - `dataset`: Dataset name
320
- - `shard`: Shard identifier
321
- - `item_key`: Item within shard
322
- - `caption`: Generated caption text
323
- - `contributor_id`: Worker who generated it
324
- - `timestamp`: Generation time
325
- - `quality_score`: Optional quality metric
319
+ * `dataset`: Dataset name
320
+ * `shard`: Shard identifier
321
+ * `item_key`: Item within shard
322
+ * `caption`: Generated caption text
323
+ * `contributor_id`: Worker who generated it
324
+ * `timestamp`: Generation time
325
+ * `quality_score`: Optional quality metric
326
326
 
327
327
  ### jobs.parquet
328
+
328
329
  - `job_id`: Unique identifier
329
- - `dataset`: Dataset name
330
- - `shard`: Shard identifier
331
- - `status`: pending/processing/completed/failed
332
- - `assigned_to`: Worker ID
333
- - `timestamp`: Status change time
330
+ * `dataset`: Dataset name
331
+ * `shard`: Shard identifier
332
+ * `status`: pending/processing/completed/failed
333
+ * `assigned_to`: Worker ID
334
+ * `timestamp`: Status change time
334
335
 
335
336
  ### contributors.parquet
337
+
336
338
  - `contributor_id`: Unique identifier
337
- - `name`: Display name
338
- - `total_captions`: Lifetime count
339
- - `trust_level`: Quality tier (0-5)
339
+ * `name`: Display name
340
+ * `total_captions`: Lifetime count
341
+ * `trust_level`: Quality tier (0-5)
340
342
 
341
343
  ## Development
342
344
 
@@ -25,12 +25,14 @@ pip install -e . # installs the `caption-flow` command
25
25
  1. copy + edit the sample configs
26
26
 
27
27
  ```bash
28
- cp orchestrator.yaml my-orchestrator.yaml
29
- cp worker.yaml my-worker.yaml
30
- cp monitor.yaml my-monitor.yaml # optional; requires a monitor module
28
+ cp examples/orchestrator/local_image_files.yaml my-orchestrator.yaml
29
+ cp examples/worker.yaml my-worker.yaml
30
+ cp examples/monitor.yaml my-monitor.yaml # optional terminal interface
31
31
  ```
32
32
 
33
- set a unique shared token in both `my-orchestrator.yaml` and `my-worker.yaml` (see `auth.worker_tokens` in the orchestrator config and `worker.token` in the worker config). if you use private hugging face datasets/models, export `HUGGINGFACE_HUB_TOKEN` before starting workers.
33
+ set a unique shared token in both `my-orchestrator.yaml` and `my-worker.yaml` (see `auth.worker_tokens` in the orchestrator config and `worker.token` in the worker config).
34
+
35
+ if you use private hugging face datasets/models, export `HUGGINGFACE_HUB_TOKEN` before starting anything.
34
36
 
35
37
  2. start the orchestrator
36
38
 
@@ -46,6 +48,9 @@ caption-flow worker --config my-worker.yaml --gpu-id 0
46
48
 
47
49
  # your second GPU
48
50
  caption-flow worker --config my-worker.yaml --gpu-id 1
51
+
52
+ # on a remote host
53
+ caption-flow worker --config my-worker.yaml --server ws://your.hostname.address:8765
49
54
  ```
50
55
 
51
56
  4. (optional) start the monitor
@@ -54,12 +59,6 @@ caption-flow worker --config my-worker.yaml --gpu-id 1
54
59
  caption-flow monitor --config my-monitor.yaml
55
60
  ```
56
61
 
57
- 5. (optional) scan/fix chunks on disk if you had crashes
58
-
59
- ```bash
60
- caption-flow scan_chunks --data-dir ./caption_data --checkpoint-dir ./checkpoints --fix
61
- ```
62
-
63
62
  ---
64
63
 
65
64
  ## how it’s wired
@@ -134,7 +133,7 @@ orchestrator:
134
133
  # key: /path/privkey.pem
135
134
 
136
135
  dataset:
137
- type: huggingface # or "local"
136
+ type: huggingface
138
137
  path: <hf-dataset-or-local-path>
139
138
  name: <logical-name>
140
139
  version: "1.0"
@@ -271,28 +270,31 @@ PRs welcome. keep it simple and fast.
271
270
  ## Storage Schema
272
271
 
273
272
  ### captions.parquet
273
+
274
274
  - `job_id`: Unique job identifier
275
- - `dataset`: Dataset name
276
- - `shard`: Shard identifier
277
- - `item_key`: Item within shard
278
- - `caption`: Generated caption text
279
- - `contributor_id`: Worker who generated it
280
- - `timestamp`: Generation time
281
- - `quality_score`: Optional quality metric
275
+ * `dataset`: Dataset name
276
+ * `shard`: Shard identifier
277
+ * `item_key`: Item within shard
278
+ * `caption`: Generated caption text
279
+ * `contributor_id`: Worker who generated it
280
+ * `timestamp`: Generation time
281
+ * `quality_score`: Optional quality metric
282
282
 
283
283
  ### jobs.parquet
284
+
284
285
  - `job_id`: Unique identifier
285
- - `dataset`: Dataset name
286
- - `shard`: Shard identifier
287
- - `status`: pending/processing/completed/failed
288
- - `assigned_to`: Worker ID
289
- - `timestamp`: Status change time
286
+ * `dataset`: Dataset name
287
+ * `shard`: Shard identifier
288
+ * `status`: pending/processing/completed/failed
289
+ * `assigned_to`: Worker ID
290
+ * `timestamp`: Status change time
290
291
 
291
292
  ### contributors.parquet
293
+
292
294
  - `contributor_id`: Unique identifier
293
- - `name`: Display name
294
- - `total_captions`: Lifetime count
295
- - `trust_level`: Quality tier (0-5)
295
+ * `name`: Display name
296
+ * `total_captions`: Lifetime count
297
+ * `trust_level`: Quality tier (0-5)
296
298
 
297
299
  ## Development
298
300
 
@@ -1,6 +1,6 @@
1
1
  [project]
2
2
  name = "caption-flow"
3
- version = "0.2.1"
3
+ version = "0.2.3"
4
4
  description = "Self-contained distributed community captioning system"
5
5
  readme = "README.md"
6
6
  requires-python = ">=3.10,<3.13"
@@ -124,7 +124,7 @@ def setup_logging(verbose: bool = False):
124
124
  level = logging.DEBUG if verbose else logging.INFO
125
125
  logging.basicConfig(
126
126
  level=level,
127
- format="%(asctime)s %(message)s",
127
+ format="%(message)s",
128
128
  datefmt="[%Y-%m-%d %H:%M:%S]",
129
129
  handlers=[
130
130
  RichHandler(
@@ -161,6 +161,7 @@ def main(ctx, verbose: bool):
161
161
  @click.option("--key", help="SSL key path")
162
162
  @click.option("--no-ssl", is_flag=True, help="Disable SSL (development only)")
163
163
  @click.option("--vllm", is_flag=True, help="Use vLLM orchestrator for WebDataset/HF datasets")
164
+ @click.option("--verbose", is_flag=True, help="Enable verbose logging")
164
165
  @click.pass_context
165
166
  def orchestrator(ctx, config: Optional[str], **kwargs):
166
167
  """Start the orchestrator server."""
@@ -0,0 +1,191 @@
1
+ """Data models for CaptionFlow."""
2
+
3
+ import PIL
4
+ from dataclasses import dataclass, field
5
+ from datetime import datetime
6
+ from enum import Enum
7
+ from typing import Any, Dict, List, Optional, Tuple
8
+ from PIL import Image
9
+
10
+
11
+ class JobStatus(Enum):
12
+ """Job processing status."""
13
+
14
+ PENDING = "pending"
15
+ PROCESSING = "processing"
16
+ COMPLETED = "completed"
17
+ FAILED = "failed"
18
+
19
+ def __str__(self):
20
+ return self.value
21
+
22
+ def to_json(self):
23
+ return self.value
24
+
25
+
26
+ @dataclass
27
+ class Job:
28
+ """Captioning job."""
29
+
30
+ job_id: str
31
+ dataset: str
32
+ shard: str
33
+ item_key: str
34
+ status: JobStatus = JobStatus.PENDING
35
+ assigned_to: Optional[str] = None
36
+ created_at: datetime = None
37
+
38
+ def __post_init__(self):
39
+ if self.created_at is None:
40
+ self.created_at = datetime.utcnow()
41
+
42
+
43
+ @dataclass
44
+ class JobId:
45
+ shard_id: str
46
+ chunk_id: str
47
+ sample_id: str
48
+
49
+ def get_shard_str(self):
50
+ return f"{self.shard_id}"
51
+
52
+ def get_chunk_str(self):
53
+ return f"{self.shard_id}:chunk:{self.chunk_id}"
54
+
55
+ def get_sample_str(self):
56
+ return f"{self.shard_id}:chunk:{self.chunk_id}:idx:{self.sample_id}"
57
+
58
+ @staticmethod
59
+ def from_dict(job: dict) -> "JobId":
60
+ return JobId(shard_id=job["shard_id"], chunk_id=job["chunk_id"], sample_id=job["sample_id"])
61
+
62
+ @staticmethod
63
+ def from_values(shard_id: str, chunk_id: str, sample_id: str) -> "JobId":
64
+ return JobId(shard_id=shard_id, chunk_id=chunk_id, sample_id=sample_id)
65
+
66
+ @staticmethod
67
+ def from_str(job_id: str):
68
+ # from data-0000:chunk:0:idx:0
69
+ parts = job_id.split(":")
70
+ if len(parts) != 5:
71
+ raise ValueError(f"Invalid job_id format: {job_id}")
72
+ return JobId(shard_id=parts[0], chunk_id=parts[2], sample_id=parts[4])
73
+
74
+
75
+ @dataclass
76
+ class Caption:
77
+ """Generated caption with attribution and image metadata."""
78
+
79
+ # Core fields
80
+ job_id: str
81
+ dataset: str
82
+ shard: str
83
+ item_key: str
84
+ contributor_id: str
85
+ timestamp: datetime
86
+ caption_count: int = 1 # Number of captions generated for this item
87
+ caption: Optional[str] = None
88
+ captions: Optional[List[str]] = None
89
+ outputs: Dict[str, List[str]] = field(default_factory=dict)
90
+ quality_score: Optional[float] = None
91
+ quality_scores: Optional[List[float]] = None
92
+
93
+ # Image metadata
94
+ image_width: Optional[int] = None
95
+ image_height: Optional[int] = None
96
+ image_format: Optional[str] = None
97
+ file_size: Optional[int] = None
98
+ filename: Optional[str] = None
99
+ url: Optional[str] = None
100
+
101
+ # Processing metadata
102
+ caption_index: Optional[int] = None # Which caption this is (0, 1, 2...)
103
+ total_captions: Optional[int] = None # Total captions for this image
104
+ processing_time_ms: Optional[float] = None
105
+ chunk_id: Optional[str] = None
106
+ metadata: Dict[str, Any] = field(default_factory=dict)
107
+
108
+ def __post_init__(self):
109
+ if self.caption is None and self.captions is None:
110
+ raise ValueError("At least one of 'caption' or 'captions' must be provided")
111
+
112
+
113
+ @dataclass
114
+ class Contributor:
115
+ """Contributor information."""
116
+
117
+ contributor_id: str
118
+ name: str
119
+ total_captions: int = 0
120
+ trust_level: int = 1
121
+
122
+
123
+ @dataclass
124
+ class ProcessingStage:
125
+ """Configuration for a single processing stage."""
126
+
127
+ name: str
128
+ model: str
129
+ prompts: List[str]
130
+ output_field: str
131
+ requires: List[str] = field(default_factory=list)
132
+ sampling: Optional[Dict[str, Any]] = None
133
+
134
+ # Model-specific overrides
135
+ tensor_parallel_size: Optional[int] = None
136
+ max_model_len: Optional[int] = None
137
+ dtype: Optional[str] = None
138
+ gpu_memory_utilization: Optional[float] = None
139
+
140
+
141
+ @dataclass
142
+ class StageResult:
143
+ """Results from a single stage."""
144
+
145
+ stage_name: str
146
+ output_field: str
147
+ outputs: List[str] # Multiple outputs from multiple prompts
148
+ error: Optional[str] = None
149
+
150
+ def is_success(self) -> bool:
151
+ return self.error is None and bool(self.outputs)
152
+
153
+
154
+ @dataclass
155
+ class ShardChunk:
156
+ """Shard chunk assignment with unprocessed ranges."""
157
+
158
+ chunk_id: str
159
+ shard_url: str
160
+ shard_name: str
161
+ start_index: int
162
+ chunk_size: int
163
+ unprocessed_ranges: List[Tuple[int, int]] = field(default_factory=list)
164
+
165
+
166
+ @dataclass
167
+ class ProcessingItem:
168
+ """Item being processed."""
169
+
170
+ chunk_id: str
171
+ item_key: str
172
+ image: Image.Image
173
+ image_data: bytes
174
+ metadata: Dict[str, Any] = field(default_factory=dict)
175
+ stage_results: Dict[str, StageResult] = field(default_factory=dict) # Accumulated results
176
+
177
+
178
+ @dataclass
179
+ class ProcessedResult:
180
+ """Result with multi-stage outputs."""
181
+
182
+ chunk_id: str
183
+ shard_name: str
184
+ item_key: str
185
+ outputs: Dict[str, List[str]] # field_name -> list of outputs
186
+ image_width: int
187
+ image_height: int
188
+ image_format: str
189
+ file_size: int
190
+ processing_time_ms: float
191
+ metadata: Dict[str, Any] = field(default_factory=dict)
@@ -83,7 +83,7 @@ class Monitor:
83
83
  await self._handle_update(data)
84
84
 
85
85
  except Exception as e:
86
- logger.error(f"Connection error: {e}")
86
+ logger.error(f"Connection error: {e}", exc_info=True)
87
87
  await asyncio.sleep(5)
88
88
 
89
89
  async def _handle_update(self, data: Dict):