tumblrbot 1.9.5__tar.gz → 1.9.6__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/.gitignore +1 -1
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/PKG-INFO +23 -17
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/README.md +21 -15
- tumblrbot-1.9.6/build.ps1 +1 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/pyproject.toml +3 -3
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/__main__.py +6 -0
- tumblrbot-1.9.6/src/tumblrbot/flow/examples.py +97 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/fine_tune.py +6 -6
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/generate.py +4 -4
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/common.py +8 -5
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/models.py +46 -30
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/tumblr.py +14 -1
- tumblrbot-1.9.5/src/tumblrbot/flow/examples.py +0 -100
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/.github/FUNDING.yml +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/.github/dependabot.yml +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/UNLICENSE +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/sample_custom_prompts.jsonl +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/__init__.py +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/__init__.py +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/download.py +0 -0
- {tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/__init__.py +0 -0
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/PKG-INFO

@@ -1,15 +1,15 @@
 Metadata-Version: 2.4
 Name: tumblrbot
-Version: 1.9.5
+Version: 1.9.6
 Summary: An updated bot that posts to Tumblr, based on your very own blog!
 Requires-Python: >= 3.14
 Description-Content-Type: text/markdown
-Requires-Dist: click
 Requires-Dist: openai
 Requires-Dist: pydantic
 Requires-Dist: requests
 Requires-Dist: requests-oauthlib
 Requires-Dist: rich
+Requires-Dist: tenacity
 Requires-Dist: tiktoken
 Requires-Dist: tomlkit
 Project-URL: Funding, https://ko-fi.com/maidscientistizutsumimarin

@@ -84,10 +84,6 @@ Features:
 - Colorful output, progress bars, and post previews using [rich].
 - Automatically keeps the [config][configurable] file up-to-date and recreates it if missing (without overriding user settings).

-**To-Do:**
-
-- Add retry logic for rate limiting.
-
 **Known Issues:**

 - Fine-tuning can fail after the validation phase due to the examples file not passing [OpenAI] moderation checks. There are a few workarounds for this that can be tried in combination:

@@ -103,24 +99,33 @@ Features:

 **Please submit an issue or contact us for features you want added/reimplemented.**

-## Installation
+## Installation & Usage
+
+### Downloadable Binary
+
+| Pros | Cons |
+| --- | --- |
+| Easier to install | Harder to update |
+| No risk of dependencies breaking | Dependencies may be older |
+
+1. Download the latest release's [tumblrbot.exe].
+1. Launch `tumblrbot.exe` in the install location.
+
+### PyPi
+
+| Pros | Cons |
+| --- | --- |
+| Easier to update | Harder to install |
+| Dependencies may be newer | Dependencies may break |

 1. Install the latest version of [Python]:
 - Windows: `winget install python3`
 - Linux (apt): `apt install python-pip`
 - Linux (pacman): `pacman install python-pip`
 1. Install the [pip] package: `pip install tumblrbot`
-- Alternatively, you can install from this repository: `pip install git+https://github.com/
+- Alternatively, you can install from this repository: `pip install git+https://github.com/MaidScientistIzutsumiMarin/tumblrbot.git`
 - On Linux, you will have to make a virtual environment or use the flag to install packages system-wide.
-
-### Alternative Installation for Windows
-
-1. Download the latest release's [tumblrbot.exe].
-1. Run the file directly, or add it to your path, and use it as normal.
-
-## Usage
-
-Run `tumblrbot` from anywhere. Run `tumblrbot --help` for command-line options. Every command-line option corresponds to a value from the [config][configurable].
+1. Run `tumblrbot` from anywhere. Run `tumblrbot --help` for command-line options. Every command-line option corresponds to a value from the [config][configurable].

 ## Obtaining Tokens


@@ -177,6 +182,7 @@ Specific Options:
 To be specific, it should follow the [JSON Lines] file format with one collection of name/value pairs (a dictionary) per line. You can validate your file using the [JSON Lines Validator].

 - **`post_limit`** - At most, this many valid posts will be included in the training data. This effectively is a filter to select the `N` most recent valid posts from each blog. `0` will use every available valid post.
+- **`moderation_batch_size`** - This controls the batch size when submitting posts to the OpenAI moderation. There is no limit, but higher numbers will cause you to be rate-limited more, which can overall be slower. Low numbers reduce rate-limiting, but can sometimes take longer due to needing more requests. The best value will depend on your computer, internet connection, and any number of factors on OpenAI's side. The default value is just what worked best for our computer.
 - **`filtered_words`** - During training data generation, any posts with the specified words will be removed. Word boundaries are not checked by default, so “the” will also filter out posts with “them” or “thematic”. This setting supports regular expressions, so you can explicitly look for word boundaries by surrounding an entry with “\\\b”, i.e., “\\\bthe\\\b”. Regular expressions have to be escaped like so due to how JSON data is read in. If you are familiar with regular expressions, it could be useful for you to know that every entry is joined with a “|” which is then used to search the post content for any matches.
 - **`developer_message`** - This message is used in for fine-tuning the AI as well as generating prompts. If you change this, you will need to run the fine-tuning again with the new value before generating posts.
 - **`user_message`** - This setting is used and works in the same way as `developer_message`.
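To make the `moderation_batch_size` trade-off described above concrete, here is a minimal sketch of how the batch size maps onto the number of moderation requests, mirroring the `itertools.batched` and `math.ceil` logic in the new `examples.py` further down; the example counts are hypothetical and not taken from the package.

```python
from itertools import batched
from math import ceil

# Hypothetical numbers: 1,000 candidate examples, batches of 25 (the new default).
examples = [f"example {i}" for i in range(1_000)]
moderation_batch_size = 25

# Each batch becomes one call to the OpenAI moderation endpoint, so the
# request count is ceil(len(examples) / moderation_batch_size).
batches = list(batched(examples, moderation_batch_size, strict=False))
assert len(batches) == ceil(len(examples) / moderation_batch_size) == 40

# Larger batches mean fewer requests but a higher chance of tripping the rate
# limiter; smaller batches mean more round trips.
print(f"{len(batches)} moderation requests for {len(examples)} examples")
```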
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/README.md

@@ -67,10 +67,6 @@ Features:
 - Colorful output, progress bars, and post previews using [rich].
 - Automatically keeps the [config][configurable] file up-to-date and recreates it if missing (without overriding user settings).

-**To-Do:**
-
-- Add retry logic for rate limiting.
-
 **Known Issues:**

 - Fine-tuning can fail after the validation phase due to the examples file not passing [OpenAI] moderation checks. There are a few workarounds for this that can be tried in combination:

@@ -86,24 +82,33 @@ Features:

 **Please submit an issue or contact us for features you want added/reimplemented.**

-## Installation
+## Installation & Usage
+
+### Downloadable Binary
+
+| Pros | Cons |
+| --- | --- |
+| Easier to install | Harder to update |
+| No risk of dependencies breaking | Dependencies may be older |
+
+1. Download the latest release's [tumblrbot.exe].
+1. Launch `tumblrbot.exe` in the install location.
+
+### PyPi
+
+| Pros | Cons |
+| --- | --- |
+| Easier to update | Harder to install |
+| Dependencies may be newer | Dependencies may break |

 1. Install the latest version of [Python]:
 - Windows: `winget install python3`
 - Linux (apt): `apt install python-pip`
 - Linux (pacman): `pacman install python-pip`
 1. Install the [pip] package: `pip install tumblrbot`
-- Alternatively, you can install from this repository: `pip install git+https://github.com/
+- Alternatively, you can install from this repository: `pip install git+https://github.com/MaidScientistIzutsumiMarin/tumblrbot.git`
 - On Linux, you will have to make a virtual environment or use the flag to install packages system-wide.
-
-### Alternative Installation for Windows
-
-1. Download the latest release's [tumblrbot.exe].
-1. Run the file directly, or add it to your path, and use it as normal.
-
-## Usage
-
-Run `tumblrbot` from anywhere. Run `tumblrbot --help` for command-line options. Every command-line option corresponds to a value from the [config][configurable].
+1. Run `tumblrbot` from anywhere. Run `tumblrbot --help` for command-line options. Every command-line option corresponds to a value from the [config][configurable].

 ## Obtaining Tokens


@@ -160,6 +165,7 @@ Specific Options:
 To be specific, it should follow the [JSON Lines] file format with one collection of name/value pairs (a dictionary) per line. You can validate your file using the [JSON Lines Validator].

 - **`post_limit`** - At most, this many valid posts will be included in the training data. This effectively is a filter to select the `N` most recent valid posts from each blog. `0` will use every available valid post.
+- **`moderation_batch_size`** - This controls the batch size when submitting posts to the OpenAI moderation. There is no limit, but higher numbers will cause you to be rate-limited more, which can overall be slower. Low numbers reduce rate-limiting, but can sometimes take longer due to needing more requests. The best value will depend on your computer, internet connection, and any number of factors on OpenAI's side. The default value is just what worked best for our computer.
 - **`filtered_words`** - During training data generation, any posts with the specified words will be removed. Word boundaries are not checked by default, so “the” will also filter out posts with “them” or “thematic”. This setting supports regular expressions, so you can explicitly look for word boundaries by surrounding an entry with “\\\b”, i.e., “\\\bthe\\\b”. Regular expressions have to be escaped like so due to how JSON data is read in. If you are familiar with regular expressions, it could be useful for you to know that every entry is joined with a “|” which is then used to search the post content for any matches.
 - **`developer_message`** - This message is used in for fine-tuning the AI as well as generating prompts. If you change this, you will need to run the fine-tuning again with the new value before generating posts.
 - **`user_message`** - This setting is used and works in the same way as `developer_message`.
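The `filtered_words` behaviour described above can be seen in a minimal sketch that mirrors the pattern built in `examples.py` (entries joined with `|` and compiled case-insensitively); the sample words and posts are made up.

```python
from re import IGNORECASE
from re import compile as re_compile

# Without word boundaries, "the" also filters "them" and "thematic".
loose = re_compile("|".join(["the"]), IGNORECASE)
assert loose.search("I like them") and loose.search("Thematic analysis")

# With "\b" anchors (written as "\\b..." in the JSON config), only the whole word matches.
strict = re_compile("|".join([r"\bthe\b"]), IGNORECASE)
assert strict.search("over the moon")
assert not strict.search("them") and not strict.search("thematic")
```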
tumblrbot-1.9.6/build.ps1

@@ -0,0 +1 @@
+..\..\Powershell\build.ps1 -ExtraArgs '--collect-all tiktoken_ext'
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/pyproject.toml

@@ -1,18 +1,18 @@
 [project]
 name = "tumblrbot"
-version = "1.9.5"
+version = "1.9.6"
 description = "An updated bot that posts to Tumblr, based on your very own blog!"
 readme = "README.md"
 requires-python = ">= 3.14"
 dependencies = [
-    "click",
     "openai",
     "pydantic",
     "requests",
     "requests-oauthlib",
     "rich",
+    "tenacity",
     "tiktoken",
-    "tomlkit"
+    "tomlkit"
 ]

 [project.urls]
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/__main__.py

@@ -1,3 +1,5 @@
+from sys import exit as sys_exit
+
 from openai import OpenAI
 from rich.prompt import Confirm
 from rich.traceback import install

@@ -35,3 +37,7 @@ def main() -> None:

     if Confirm.ask("Generate drafts?", default=False):
         DraftGenerator(openai=openai, tumblr=tumblr).main()
+
+
+if __name__ == "__main__":
+    sys_exit(main())
tumblrbot-1.9.6/src/tumblrbot/flow/examples.py

@@ -0,0 +1,97 @@
+from collections.abc import Generator
+from itertools import batched
+from json import loads
+from math import ceil
+from re import IGNORECASE
+from re import compile as re_compile
+from typing import TYPE_CHECKING, override
+
+from openai import RateLimitError
+from rich import print as rich_print
+from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_random_exponential
+
+from tumblrbot.utils.common import FlowClass, PreviewLive
+from tumblrbot.utils.models import Example, Message, Post
+
+if TYPE_CHECKING:
+    from collections.abc import Generator, Iterable
+    from pathlib import Path
+
+    from openai._types import SequenceNotStr
+    from openai.types import ModerationCreateResponse, ModerationMultiModalInputParam
+
+
+class ExamplesWriter(FlowClass):
+    @override
+    def main(self) -> None:
+        self.config.examples_file.parent.mkdir(parents=True, exist_ok=True)
+
+        examples = [self.create_example(*prompt) for prompt in self.get_custom_prompts()]
+        examples.extend(self.create_example(self.config.user_message, str(post)) for post in self.get_valid_posts())
+        self.write_examples(examples)
+
+        rich_print(f"[bold]The examples file can be found at: '{self.config.examples_file}'\n")
+
+    def create_example(self, user_message: str, assistant_message: str) -> Example:
+        return Example(
+            messages=[
+                Message(role="developer", content=self.config.developer_message),
+                Message(role="user", content=user_message),
+                Message(role="assistant", content=assistant_message),
+            ],
+        )
+
+    def get_custom_prompts(self) -> Generator[tuple[str, str]]:
+        self.config.custom_prompts_file.parent.mkdir(parents=True, exist_ok=True)
+        self.config.custom_prompts_file.touch(exist_ok=True)
+
+        with self.config.custom_prompts_file.open("rb") as fp:
+            for line in fp:
+                data: dict[str, str] = loads(line)
+                yield from data.items()
+
+    # This function mostly exists to make writing examples atomic.
+    def write_examples(self, examples: Iterable[Example]) -> None:
+        with self.config.examples_file.open("w", encoding="utf_8") as fp:
+            for example in examples:
+                fp.write(f"{example.model_dump_json()}\n")
+
+    def get_valid_posts(self) -> Generator[Post]:
+        for path in self.get_data_paths():
+            posts = list(self.get_valid_posts_from_path(path))
+            yield from posts[-self.config.post_limit :]
+
+    def get_valid_posts_from_path(self, path: Path) -> Generator[Post]:
+        pattern = re_compile("|".join(self.config.filtered_words), IGNORECASE)
+        with path.open("rb") as fp:
+            for line in fp:
+                post = Post.model_validate_json(line)
+                if post.valid_text_post() and not (post.trail and self.config.filtered_words and pattern.search(str(post))):
+                    yield post
+
+    def filter_examples(self) -> None:
+        raw_examples = self.config.examples_file.read_bytes().splitlines()
+        old_examples = map(Example.model_validate_json, raw_examples)
+        new_examples: list[Example] = []
+        with PreviewLive() as live:
+            for batch in live.progress.track(
+                batched(old_examples, self.config.moderation_batch_size, strict=False),
+                ceil(len(raw_examples) / self.config.moderation_batch_size),
+                description="Removing flagged posts...",
+            ):
+                response = self.create_moderation_batch(tuple(map(Example.get_assistant_message, batch)))
+                new_examples.extend(example for example, moderation in zip(batch, response.results, strict=True) if not moderation.flagged)
+
+        self.write_examples(new_examples)
+
+        rich_print(f"[red]Removed {len(raw_examples) - len(new_examples)} posts.\n")
+
+    @retry(
+        stop=stop_after_attempt(10),
+        wait=wait_random_exponential(),
+        retry=retry_if_exception_type(RateLimitError),
+        before_sleep=lambda state: rich_print(f"[yellow]OpenAI rate limit exceeded. Waiting for {state.idle_for} seconds..."),
+        reraise=True,
+    )
+    def create_moderation_batch(self, api_input: str | SequenceNotStr[str] | Iterable[ModerationMultiModalInputParam]) -> ModerationCreateResponse:
+        return self.openai.moderations.create(input=api_input)
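For reference, here is a minimal standalone sketch of the JSON Lines records that `ExamplesWriter` reads and writes: `get_custom_prompts` expects one `{user message: assistant reply}` object per line of `custom_prompts.jsonl`, and `write_examples` emits one `Example` per line via `model_dump_json()`. The models below are simplified stand-ins (the package's `Message` and `Example` inherit from `FullyValidatedModel`), and all sample text is invented.

```python
from json import loads
from typing import Literal

from pydantic import BaseModel


class Message(BaseModel):
    role: Literal["developer", "user", "assistant"]
    content: str


class Example(BaseModel):
    messages: list[Message]


# One custom_prompts.jsonl line: a dict whose items() are (user message, assistant reply) pairs.
custom_prompt_line = '{"Write a post about tea.": "tea is just leaf soup and I will die on this hill"}'
user_message, assistant_message = next(iter(loads(custom_prompt_line).items()))

# The corresponding examples.jsonl line, in the chat format used for fine-tuning.
example = Example(
    messages=[
        Message(role="developer", content="You are a Tumblr poster."),  # hypothetical developer message
        Message(role="user", content=user_message),
        Message(role="assistant", content=assistant_message),
    ],
)
print(example.model_dump_json())
# {"messages":[{"role":"developer","content":"You are a Tumblr poster."}, ...]}
```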
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/fine_tune.py

@@ -3,9 +3,9 @@ from textwrap import dedent
 from time import sleep
 from typing import TYPE_CHECKING, override

-import
-from rich import progress
+from rich import print as rich_print
 from rich.console import Console
+from rich.progress import open as progress_open
 from rich.prompt import Confirm
 from tiktoken import encoding_for_model, get_encoding


@@ -21,7 +21,7 @@ if TYPE_CHECKING:
 class FineTuner(FlowClass):
     @staticmethod
     def dedent_print(text: str) -> None:
-
+        rich_print(dedent(text).lstrip())

     @override
     def main(self) -> None:

@@ -55,12 +55,12 @@ class FineTuner(FlowClass):
         if self.config.job_id:
             return self.poll_job_status()

-        with
+        with progress_open(self.config.examples_file, "rb", description=f"Uploading [purple]{self.config.examples_file}[/]...") as fp:
             file = self.openai.files.create(
                 file=fp,
                 purpose="fine-tune",
             )
-
+        rich_print()

         job = self.openai.fine_tuning.jobs.create(
             model=self.config.base_model,

@@ -96,7 +96,7 @@ class FineTuner(FlowClass):
         if job.status != "succeeded":
             if Confirm.ask("[gray62]Delete uploaded examples file?", default=False):
                 self.openai.files.delete(job.training_file)
-
+            rich_print()

         if job.status == "failed" and job.error is not None:
             raise RuntimeError(job.error.message)
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/generate.py

@@ -2,12 +2,12 @@ from functools import cache
 from random import choice, random, sample
 from typing import TYPE_CHECKING, override

-import rich
 from pydantic import ConfigDict
+from rich import print as rich_print
 from rich.prompt import IntPrompt

 from tumblrbot.utils.common import FlowClass, PreviewLive
-from tumblrbot.utils.models import Post
+from tumblrbot.utils.models import Block, Post

 if TYPE_CHECKING:
     from collections.abc import Iterable

@@ -32,7 +32,7 @@ class DraftGenerator(FlowClass):
                 exception.add_note(f"📉 An error occurred! Generated {i} draft(s) before failing. {message}")
                 raise

-
+        rich_print(f":chart_increasing: [bold green]Generated {self.config.draft_count} draft(s).[/] {message}")

     def generate_post(self) -> Post:
         if original := self.get_random_post():

@@ -48,7 +48,7 @@ class DraftGenerator(FlowClass):
             tags = tags.tags

         return Post(
-            content=[
+            content=[Block(type="text", text=text)],
             tags=tags or [],
             parent_tumblelog_uuid=original.blog.uuid,
             parent_post_id=original.id,
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/common.py

@@ -1,18 +1,21 @@
 from abc import abstractmethod
-from pathlib import Path
 from random import choice
-from typing import ClassVar, Self, override
+from typing import TYPE_CHECKING, ClassVar, Self, override

-from openai import OpenAI
+from openai import OpenAI  # noqa: TC002
 from pydantic import ConfigDict
 from rich._spinners import SPINNERS
-from rich.console import RenderableType
 from rich.live import Live
 from rich.progress import MofNCompleteColumn, Progress, SpinnerColumn, TimeElapsedColumn
 from rich.table import Table

 from tumblrbot.utils.models import Config, FullyValidatedModel
-from tumblrbot.utils.tumblr import TumblrSession
+from tumblrbot.utils.tumblr import TumblrSession  # noqa: TC001
+
+if TYPE_CHECKING:
+    from pathlib import Path
+
+    from rich.console import RenderableType


 class FlowClass(FullyValidatedModel):
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/models.py

@@ -1,17 +1,19 @@
-from collections.abc import Generator
 from getpass import getpass
 from pathlib import Path
-from
+from tomllib import loads
+from typing import TYPE_CHECKING, Annotated, Any, Literal, Self, override

-import
-from openai.types import ChatModel
+from openai.types import ChatModel  # noqa: TC002
 from pydantic import BaseModel, ConfigDict, Field, NonNegativeFloat, NonNegativeInt, PlainSerializer, PositiveFloat, PositiveInt, model_validator
-from pydantic.json_schema import SkipJsonSchema
+from pydantic.json_schema import SkipJsonSchema  # noqa: TC002
 from requests_oauthlib import OAuth1Session
+from rich import print as rich_print
 from rich.panel import Panel
 from rich.prompt import Prompt
 from tomlkit import comment, document, dumps  # pyright: ignore[reportUnknownVariableType]
-
+
+if TYPE_CHECKING:
+    from collections.abc import Generator


 class FullyValidatedModel(BaseModel):

@@ -58,7 +60,7 @@ class Config(FileSyncSettings):

     # Writing Examples
     post_limit: NonNegativeInt = Field(0, description="The number of the most recent posts from each blog that should be included in the training data.")
-
+    moderation_batch_size: PositiveInt = Field(25, description="The number of posts at a time to submit to the OpenAI moderation API.")
     custom_prompts_file: Path = Field(Path("custom_prompts.jsonl"), description="Where to read in custom prompts from.")
     filtered_words: list[str] = Field([], description="A case-insensitive list of disallowed words used to filter out training data. Regular expressions are allowed, but must be escaped.")


@@ -80,7 +82,7 @@ class Config(FileSyncSettings):

     # Generating
     upload_blog_identifier: str = Field("", description="The identifier of the blog which generated drafts will be uploaded to. This must be a blog associated with the same account as the configured Tumblr secret tokens.")
-    draft_count: PositiveInt = Field(
+    draft_count: PositiveInt = Field(100, description="The number of drafts to process. This will affect the number of tokens used with OpenAI")
     tags_chance: NonNegativeFloat = Field(0.1, description="The chance to generate tags for any given post. This will use more OpenAI tokens.")
     tags_developer_message: str = Field("You will be provided with a block of text, and your task is to extract a very short list of the most important subjects from it.", description="The developer message used to generate tags.")
     reblog_blog_identifiers: list[str] = Field([], description="The identifiers of blogs that can be reblogged from when generating drafts.")

@@ -88,13 +90,15 @@ class Config(FileSyncSettings):
     reblog_user_message: str = Field("Please write a comical Tumblr post in response to the following post:\n\n{}", description="The format string for the user message used to reblog posts.")

     @override
-    def model_post_init(self,
+    def model_post_init(self, context: object) -> None:
+        super().model_post_init(context)
+
         if not self.download_blog_identifiers:
-
+            rich_print("Enter the [cyan]identifiers of your blogs[/] that data should be [bold purple]downloaded[/] from, separated by commas.")
             self.download_blog_identifiers = list(map(str.strip, Prompt.ask("[bold][Example] [dim]staff.tumblr.com,changes").split(",")))

         if not self.upload_blog_identifier:
-
+            rich_print("Enter the [cyan]identifier of your blog[/] that drafts should be [bold purple]uploaded[/] to.")
             self.upload_blog_identifier = Prompt.ask("[bold][Example] [dim]staff.tumblr.com or changes").strip()



@@ -109,7 +113,9 @@ class Tokens(FileSyncSettings):
     tumblr: Tumblr = Tumblr()

     @override
-    def model_post_init(self,
+    def model_post_init(self, context: object) -> None:
+        super().model_post_init(context)
+
         # Check if any tokens are missing or if the user wants to reset them, then set tokens if necessary.
         if not self.openai_api_key:
             (self.openai_api_key,) = self.online_token_prompt("https://platform.openai.com/api-keys", "API key")

@@ -124,8 +130,8 @@ class Tokens(FileSyncSettings):
             self.tumblr.client_key,
             self.tumblr.client_secret,
         ) as oauth_session:
-            fetch_response = oauth_session.fetch_request_token("http://tumblr.com/oauth/request_token")
-            full_authorize_url = oauth_session.authorization_url("http://tumblr.com/oauth/authorize")
+            fetch_response = oauth_session.fetch_request_token("http://tumblr.com/oauth/request_token")  # pyright: ignore[reportUnknownMemberType]
+            full_authorize_url = oauth_session.authorization_url("http://tumblr.com/oauth/authorize")  # pyright: ignore[reportUnknownMemberType]
             (redirect_response,) = self.online_token_prompt(full_authorize_url, "full redirect URL")
             oauth_response = oauth_session.parse_authorization_response(redirect_response)


@@ -135,7 +141,7 @@ class Tokens(FileSyncSettings):
             *self.get_oauth_tokens(fetch_response),
             verifier=oauth_response["oauth_verifier"],
         ) as oauth_session:
-            oauth_tokens = oauth_session.fetch_access_token("http://tumblr.com/oauth/access_token")
+            oauth_tokens = oauth_session.fetch_access_token("http://tumblr.com/oauth/access_token")  # pyright: ignore[reportUnknownMemberType]

         self.tumblr.resource_owner_key, self.tumblr.resource_owner_secret = self.get_oauth_tokens(oauth_tokens)


@@ -143,11 +149,11 @@
     def online_token_prompt(url: str, *tokens: str) -> Generator[str]:
         formatted_token_string = " and ".join(f"[cyan]{token}[/]" for token in tokens)

-
+        rich_print(f"Retrieve your {formatted_token_string} from: {url}")
         for token in tokens:
             yield getpass(f"Enter your {token} (masked): ", echo_char="*").strip()

-
+        rich_print()

     @staticmethod
     def get_oauth_tokens(token: dict[str, str]) -> tuple[str, str]:

@@ -160,20 +166,22 @@ class Blog(FullyValidatedModel):
     uuid: str = ""


-class
-
-
-    posts: list[Any] = []
+class Response(FullyValidatedModel):
+    blog: Blog = Blog()
+    posts: list[Any] = []

+
+class ResponseModel(FullyValidatedModel):
     response: Response


-class
-
-
-
-
+class Block(FullyValidatedModel):
+    type: str = ""
+    text: str = ""
+    blocks: list[int] = []
+

+class Post(FullyValidatedModel):
     blog: SkipJsonSchema[Blog] = Blog()
     id: SkipJsonSchema[int] = 0
     parent_tumblelog_uuid: SkipJsonSchema[str] = ""

@@ -212,9 +220,17 @@
         return bool(self.content) and all(block.type == "text" for block in self.content) and not (self.is_submission or any(block.type == "ask" for block in self.layout))


-class
-
-
-    content: str
+class Message(FullyValidatedModel):
+    role: Literal["developer", "user", "assistant"]
+    content: str

+
+class Example(FullyValidatedModel):
     messages: list[Message]
+
+    def get_assistant_message(self) -> str:
+        for message in self.messages:
+            if message.role == "assistant":
+                return message.content
+        msg = "Assistant message not found!"
+        raise ValueError(msg)
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/tumblr.py

@@ -2,13 +2,23 @@ from typing import Self

 from requests import HTTPError, Response
 from requests_oauthlib import OAuth1Session
+from rich import print as rich_print
+from tenacity import retry, retry_if_exception_message, stop_after_attempt, wait_random_exponential

 from tumblrbot.utils.models import Post, ResponseModel, Tokens

+rate_limit_retry = retry(
+    stop=stop_after_attempt(10),
+    wait=wait_random_exponential(min=60),
+    retry=retry_if_exception_message(match="429 Client Error: Limit Exceeded for url: .+"),
+    before_sleep=lambda state: rich_print(f"[yellow]Tumblr rate limit exceeded. Waiting for {state.idle_for} seconds..."),
+    reraise=True,
+)
+

 class TumblrSession(OAuth1Session):
     def __init__(self, tokens: Tokens) -> None:
-        super().__init__(**tokens.tumblr.model_dump())
+        super().__init__(**tokens.tumblr.model_dump())  # pyright: ignore[reportUnknownMemberType]
         self.hooks["response"].append(self.response_hook)

     def __enter__(self) -> Self:

@@ -22,10 +32,12 @@ class TumblrSession(OAuth1Session):
             error.add_note(response.text)
             raise

+    @rate_limit_retry
     def retrieve_blog_info(self, blog_identifier: str) -> ResponseModel:
         response = self.get(f"https://api.tumblr.com/v2/blog/{blog_identifier}/info")
         return ResponseModel.model_validate_json(response.text)

+    @rate_limit_retry
     def retrieve_published_posts(
         self,
         blog_identifier: str,

@@ -43,6 +55,7 @@ class TumblrSession(OAuth1Session):
         )
         return ResponseModel.model_validate_json(response.text)

+    @rate_limit_retry
     def create_post(self, blog_identifier: str, post: Post) -> ResponseModel:
         response = self.post(
             f"https://api.tumblr.com/v2/blog/{blog_identifier}/posts",
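The new `rate_limit_retry` decorator above uses tenacity. The following is a minimal, self-contained sketch of the same retry pattern with a toy function and a shortened wait (not the package's code): calls whose exception message matches the 429 pattern are retried with randomized exponential backoff, while any other error propagates immediately.

```python
from tenacity import retry, retry_if_exception_message, stop_after_attempt, wait_random_exponential

attempts = 0


@retry(
    stop=stop_after_attempt(3),
    wait=wait_random_exponential(max=0.1),  # keep the demo fast; the real decorator waits at least 60s
    retry=retry_if_exception_message(match="429 Client Error: Limit Exceeded for url: .+"),
    reraise=True,
)
def flaky_request() -> str:
    global attempts
    attempts += 1
    if attempts < 3:
        # Simulate the HTTPError text Tumblr returns when rate-limited.
        raise RuntimeError("429 Client Error: Limit Exceeded for url: https://api.tumblr.com/v2/example")
    return "ok"


print(flaky_request(), "after", attempts, "attempts")  # ok after 3 attempts
```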
tumblrbot-1.9.5/src/tumblrbot/flow/examples.py

@@ -1,100 +0,0 @@
-import re
-from itertools import batched
-from json import loads
-from math import ceil
-from re import search
-from typing import IO, TYPE_CHECKING, override
-
-import rich
-from openai import BadRequestError
-
-from tumblrbot.utils.common import FlowClass, PreviewLive
-from tumblrbot.utils.models import Example, Post
-
-if TYPE_CHECKING:
-    from collections.abc import Generator
-    from pathlib import Path
-
-
-class ExamplesWriter(FlowClass):
-    @override
-    def main(self) -> None:
-        self.config.examples_file.parent.mkdir(parents=True, exist_ok=True)
-
-        with self.config.examples_file.open("w", encoding="utf_8") as fp:
-            for user_message, assistant_response in self.get_custom_prompts():
-                self.write_example(
-                    user_message,
-                    assistant_response,
-                    fp,
-                )
-
-            for post in self.get_valid_posts():
-                self.write_example(
-                    self.config.user_message,
-                    str(post),
-                    fp,
-                )
-
-        rich.print(f"[bold]The examples file can be found at: '{self.config.examples_file}'\n")
-
-    def write_example(self, user_message: str, assistant_message: str, fp: IO[str]) -> None:
-        example = Example(
-            messages=[
-                Example.Message(role="developer", content=self.config.developer_message),
-                Example.Message(role="user", content=user_message),
-                Example.Message(role="assistant", content=assistant_message),
-            ],
-        )
-        fp.write(f"{example.model_dump_json()}\n")
-
-    def get_custom_prompts(self) -> Generator[tuple[str, str]]:
-        self.config.custom_prompts_file.parent.mkdir(parents=True, exist_ok=True)
-        self.config.custom_prompts_file.touch(exist_ok=True)
-
-        with self.config.custom_prompts_file.open("rb") as fp:
-            for line in fp:
-                data: dict[str, str] = loads(line)
-                yield from data.items()
-
-    def get_valid_posts(self) -> Generator[Post]:
-        for path in self.get_data_paths():
-            posts = list(self.get_valid_posts_from_path(path))
-            yield from posts[-self.config.post_limit :]
-
-    def get_valid_posts_from_path(self, path: Path) -> Generator[Post]:
-        pattern = re.compile("|".join(self.config.filtered_words), re.IGNORECASE)
-        with path.open("rb") as fp:
-            for line in fp:
-                post = Post.model_validate_json(line)
-                if post.valid_text_post() and not (post.trail and self.config.filtered_words and pattern.search(str(post))):
-                    yield post
-
-    def filter_examples(self) -> None:
-        examples = self.config.examples_file.read_text("utf_8").splitlines()
-        with self.config.examples_file.open("w", encoding="utf_8") as fp:
-            batch_size = self.get_moderation_batch_size()
-            removed = 0
-
-            with PreviewLive() as live:
-                for batch in live.progress.track(
-                    batched(examples, batch_size, strict=False),
-                    ceil(len(examples) / batch_size),
-                    description="Removing flagged posts...",
-                ):
-                    response = self.openai.moderations.create(input=list(batch))
-                    for example, moderation in zip(batch, response.results, strict=True):
-                        if moderation.flagged:
-                            removed += 1
-                        else:
-                            fp.write(f"{example}\n")
-        rich.print(f"[red]Removed {removed} posts.\n")
-
-    def get_moderation_batch_size(self) -> int:
-        try:
-            self.openai.moderations.create(input=[""] * self.config.max_moderation_batch_size)
-        except BadRequestError as error:
-            message = error.response.json()["error"]["message"]
-            if match := search(r"(\d+)\.", message):
-                return int(match.group(1))
-        return self.config.max_moderation_batch_size
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/.github/FUNDING.yml: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/.github/dependabot.yml: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/UNLICENSE: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/sample_custom_prompts.jsonl: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/__init__.py: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/__init__.py: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/flow/download.py: File without changes
{tumblrbot-1.9.5 → tumblrbot-1.9.6}/src/tumblrbot/utils/__init__.py: File without changes