tumblrbot 1.6.0__tar.gz → 1.7.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: tumblrbot
- Version: 1.6.0
+ Version: 1.7.0
  Summary: An updated bot that posts to Tumblr, based on your very own blog!
  Requires-Python: >= 3.13
  Description-Content-Type: text/markdown
@@ -36,6 +36,8 @@ Project-URL: Source, https://github.com/MaidScientistIzutsumiMarin/tumblrbot

  [Tumblr]: https://tumblr.com
  [Tumblr Tokens]: https://tumblr.com/oauth/apps
+ [Tumblr API Documentation on Blog Identifiers]: https://tumblr.com/docs/en/api/v2#blog-identifiers
+ [Tumblr API Documentation on Rate Limits]: https://tumblr.com/docs/en/api/v2#rate-limits

  [Download]: src/tumblrbot/flow/download.py
  [Examples]: src/tumblrbot/flow/examples.py
@@ -58,7 +60,7 @@ Features:
  1. Asks for [OpenAI] and [Tumblr] tokens.
  - Stores API tokens using [keyring].
  1. Retrieves [Tumblr] [OAuth] tokens.
- 1. [Downloads posts][Download] from the [configured][config] [Tumblr] blogs.
+ 1. [Downloads posts][Download] from the [configured][config] blogs.
  - Skips redownloading already downloaded posts.
  - Shows progress and previews the current post.
  1. [Creates examples][Examples] to fine-tune the model from your posts.
@@ -70,22 +72,24 @@ Features:
  - Resumes monitoring the same fine-tuning process when restarted.
  - Deletes the uploaded examples file if fine-tuning does not succeed (optional).
  - Stores the output model automatically when fine-tuning is completed.
- 1. [Generates and uploads posts][Generate] to the [configured][config] [Tumblr] blog using the [configured][config] fine-tuned model.
+ 1. [Generates and uploads posts][Generate] to the [configured][config] blog using the [configured][config] fine-tuned model.
  - Creates tags by extracting keywords at the [configured][config] frequency using the [configured][config] model.
- - Uploads posts as drafts to the [configured][config] [Tumblr] blog.
- - Reblog posts at the [configured][config] frequency.
+ - Uploads posts as drafts to the [configured][config] blog.
+ - Reblogs posts from the [configured][config] blogs at the [configured][config] frequency.
  - Shows progress and previews the current post.
  - Colorful output, progress bars, and post previews using [rich].
  - Automatically keeps the [config] file up-to-date and recreates it if missing.

  **To-Do:**

- - ...
+ - Create training data from a sample of posts (possible).
+ - User-specified list of words that will filter out posts.

  **Known Issues:**

  - Sometimes, you will get an error about the training file not being found when starting fine-tuning. We do not currently have a fix or workaround for this. You should instead use the online portal for fine-tuning if this continues to happen. Read more in [fine-tuning].
  - Post counts are incorrect when downloading posts. We are not certain what the cause of this is, but our tests suggest this is a [Tumblr] API problem that is giving inaccurate numbers.
+ - During post downloading or post generation, you may receive a "Limit Exceeded" error message from the [Tumblr] API. This is caused by server-side rate-limiting by [Tumblr]. The only workaround is trying again or waiting for a period of time before retrying. In most cases, you either have to wait for a minute or an hour for the limits to reset. You can read more about the limits in the [Tumblr API documentation on rate limits].

  **Please submit an issue or contact us for features you want added/reimplemented.**

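The "Limit Exceeded" issue above has no fix other than waiting and retrying. The sketch below is not part of tumblrbot; it only illustrates that wait-and-retry pattern and assumes the Tumblr API signals the limit with an HTTP 429 status, with placeholder wait times based on the per-minute/per-hour limits in the linked documentation.

```python
# Illustrative wait-and-retry helper; NOT part of the tumblrbot package.
# Assumes an HTTP 429 status code indicates rate limiting by the Tumblr API.
import time

import requests


def get_with_retry(url: str, attempts: int = 3, wait_seconds: float = 60.0) -> requests.Response:
    """Retry a GET request with an increasing pause while the API reports a rate limit."""
    for attempt in range(1, attempts + 1):
        response = requests.get(url)
        if response.status_code != 429:  # not rate limited, return the response as-is
            return response
        time.sleep(wait_seconds * attempt)  # wait longer before each retry
    return response
```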
@@ -137,6 +141,10 @@ All config options can be found in `config.toml` after running the program once.

  All file options can include directories that will be created when the program is run.

+ All config options that involve *blog identifiers* expect any version of a blog URL, which is explained in more detail in the [Tumblr API documentation on blog identifiers].
+
+ Specific Options:
+
  - `custom_prompts_file` This file should follow the following file format:

  ```json
@@ -148,13 +156,15 @@ All file options can include directories that will be created when the program i
  To be specific, it should follow the [JSON Lines] file format with one collection of name/value pairs (a dictionary) per line. You can validate your file using the [JSON Lines Validator].

  - **`developer_message`** - This message is used in for fine-tuning the AI as well as generating prompts. If you change this, you will need to run the fine-tuning again with the new value before generating posts.
- - **`user_message`** - This message is used in the same way as `developer_message` and should be treated the same.
+ - **`user_message`** - This setting is used and works in the same way as `developer_message`.
  - **`expected_epochs`** - The default value here is the default number of epochs for `base_model`. You may have to change this value if you change `base_model`. After running fine-tuning once, you will see the number of epochs used in the [fine-tuning portal] under *Hyperparameters*. This value will also be updated automatically if you run fine-tuning through this program.
  - **`token_price`** - The default value here is the default token price for `base_model`. You can find the up-to-date value in [OpenAI Pricing], in the *Training* column.
  - **`job_id`** - If there is any value here, this program will resume monitoring the corresponding job, instead of starting a new one. This gets set when starting the fine-tuning and is cleared when it is completed. You can read more in [fine-tuning].
  - **`base_model`** - This value is used to choose the tokenizer for estimating fine-tuning costs. It is also the base model that will be fine-tuned and the model that is used to generate tags. You can find a list of options in the [fine-tuning portal] by pressing `+ Create` and opening the drop-down list for `Base Model`. Be sure to update `token_price` if you change this value.
  - **`fine_tuned_model`** - Set automatically after monitoring fine-tuning if the job has succeeded. You can read more in [fine-tuning].
  - **`tags_chance`** - This should be between 0 and 1. Setting it to 0 corresponds to a 0% chance (never) to add tags to a post. 1 corresponds to a 100% chance (always) to add tags to a post. Adding tags incurs a very small token cost.
+ - **`reblog_chance`** - This setting works the same way as `tags_chance`.
+ - **`reblog_user_message`** - This setting is a prefix that is directly prepended to the contents of the post being reblogged.

  ## Manual Fine-Tuning

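Both `tags_chance` and the new `reblog_chance` above are probabilities between 0 and 1, and the generate.py change later in this diff checks them with `random() < chance`. A minimal sketch of that behaviour, with placeholder values:

```python
# Minimal sketch of how a 0-1 "chance" option behaves; the values are placeholders.
from random import random

tags_chance = 0.1     # roughly a 10% chance that a draft gets generated tags
reblog_chance = 0.05  # roughly a 5% chance that a draft is a reblog

if random() < reblog_chance:  # 0 means never, 1 means always
    print("This draft would be a reblog.")
if random() < tags_chance:
    print("Tags would be generated for this draft.")
```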
@@ -18,6 +18,8 @@

  [Tumblr]: https://tumblr.com
  [Tumblr Tokens]: https://tumblr.com/oauth/apps
+ [Tumblr API Documentation on Blog Identifiers]: https://tumblr.com/docs/en/api/v2#blog-identifiers
+ [Tumblr API Documentation on Rate Limits]: https://tumblr.com/docs/en/api/v2#rate-limits

  [Download]: src/tumblrbot/flow/download.py
  [Examples]: src/tumblrbot/flow/examples.py
@@ -40,7 +42,7 @@ Features:
  1. Asks for [OpenAI] and [Tumblr] tokens.
  - Stores API tokens using [keyring].
  1. Retrieves [Tumblr] [OAuth] tokens.
- 1. [Downloads posts][Download] from the [configured][config] [Tumblr] blogs.
+ 1. [Downloads posts][Download] from the [configured][config] blogs.
  - Skips redownloading already downloaded posts.
  - Shows progress and previews the current post.
  1. [Creates examples][Examples] to fine-tune the model from your posts.
@@ -52,22 +54,24 @@ Features:
  - Resumes monitoring the same fine-tuning process when restarted.
  - Deletes the uploaded examples file if fine-tuning does not succeed (optional).
  - Stores the output model automatically when fine-tuning is completed.
- 1. [Generates and uploads posts][Generate] to the [configured][config] [Tumblr] blog using the [configured][config] fine-tuned model.
+ 1. [Generates and uploads posts][Generate] to the [configured][config] blog using the [configured][config] fine-tuned model.
  - Creates tags by extracting keywords at the [configured][config] frequency using the [configured][config] model.
- - Uploads posts as drafts to the [configured][config] [Tumblr] blog.
- - Reblog posts at the [configured][config] frequency.
+ - Uploads posts as drafts to the [configured][config] blog.
+ - Reblogs posts from the [configured][config] blogs at the [configured][config] frequency.
  - Shows progress and previews the current post.
  - Colorful output, progress bars, and post previews using [rich].
  - Automatically keeps the [config] file up-to-date and recreates it if missing.

  **To-Do:**

- - ...
+ - Create training data from a sample of posts (possible).
+ - User-specified list of words that will filter out posts.

  **Known Issues:**

  - Sometimes, you will get an error about the training file not being found when starting fine-tuning. We do not currently have a fix or workaround for this. You should instead use the online portal for fine-tuning if this continues to happen. Read more in [fine-tuning].
  - Post counts are incorrect when downloading posts. We are not certain what the cause of this is, but our tests suggest this is a [Tumblr] API problem that is giving inaccurate numbers.
+ - During post downloading or post generation, you may receive a "Limit Exceeded" error message from the [Tumblr] API. This is caused by server-side rate-limiting by [Tumblr]. The only workaround is trying again or waiting for a period of time before retrying. In most cases, you either have to wait for a minute or an hour for the limits to reset. You can read more about the limits in the [Tumblr API documentation on rate limits].

  **Please submit an issue or contact us for features you want added/reimplemented.**

@@ -119,6 +123,10 @@ All config options can be found in `config.toml` after running the program once.

  All file options can include directories that will be created when the program is run.

+ All config options that involve *blog identifiers* expect any version of a blog URL, which is explained in more detail in the [Tumblr API documentation on blog identifiers].
+
+ Specific Options:
+
  - `custom_prompts_file` This file should follow the following file format:

  ```json
@@ -130,13 +138,15 @@ All file options can include directories that will be created when the program i
  To be specific, it should follow the [JSON Lines] file format with one collection of name/value pairs (a dictionary) per line. You can validate your file using the [JSON Lines Validator].

  - **`developer_message`** - This message is used in for fine-tuning the AI as well as generating prompts. If you change this, you will need to run the fine-tuning again with the new value before generating posts.
- - **`user_message`** - This message is used in the same way as `developer_message` and should be treated the same.
+ - **`user_message`** - This setting is used and works in the same way as `developer_message`.
  - **`expected_epochs`** - The default value here is the default number of epochs for `base_model`. You may have to change this value if you change `base_model`. After running fine-tuning once, you will see the number of epochs used in the [fine-tuning portal] under *Hyperparameters*. This value will also be updated automatically if you run fine-tuning through this program.
  - **`token_price`** - The default value here is the default token price for `base_model`. You can find the up-to-date value in [OpenAI Pricing], in the *Training* column.
  - **`job_id`** - If there is any value here, this program will resume monitoring the corresponding job, instead of starting a new one. This gets set when starting the fine-tuning and is cleared when it is completed. You can read more in [fine-tuning].
  - **`base_model`** - This value is used to choose the tokenizer for estimating fine-tuning costs. It is also the base model that will be fine-tuned and the model that is used to generate tags. You can find a list of options in the [fine-tuning portal] by pressing `+ Create` and opening the drop-down list for `Base Model`. Be sure to update `token_price` if you change this value.
  - **`fine_tuned_model`** - Set automatically after monitoring fine-tuning if the job has succeeded. You can read more in [fine-tuning].
  - **`tags_chance`** - This should be between 0 and 1. Setting it to 0 corresponds to a 0% chance (never) to add tags to a post. 1 corresponds to a 100% chance (always) to add tags to a post. Adding tags incurs a very small token cost.
+ - **`reblog_chance`** - This setting works the same way as `tags_chance`.
+ - **`reblog_user_message`** - This setting is a prefix that is directly prepended to the contents of the post being reblogged.

  ## Manual Fine-Tuning

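The `custom_prompts_file` option above expects the [JSON Lines] format, one dictionary per line. A minimal reading sketch, where the file name is a placeholder and no particular keys are assumed because the expected fields are not shown in this diff:

```python
# Minimal sketch of reading a JSON Lines file such as custom_prompts_file;
# "custom_prompts.jsonl" is a placeholder path, not the package default.
import json
from pathlib import Path

for line in Path("custom_prompts.jsonl").read_text(encoding="utf-8").splitlines():
    if line.strip():  # ignore blank lines
        prompt = json.loads(line)  # each non-blank line must be one JSON dictionary
        print(prompt)
```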
@@ -1,6 +1,6 @@
  [project]
  name = "tumblrbot"
- version = "1.6.0"
+ version = "1.7.0"
  description = "An updated bot that posts to Tumblr, based on your very own blog!"
  readme = "README.md"
  requires-python = ">= 3.13"
@@ -1,4 +1,4 @@
- from random import random, randrange
+ from random import choice, random, randrange
  from typing import override

  import rich
@@ -28,7 +28,7 @@ class DraftGenerator(FlowClass):
          rich.print(f":chart_increasing: [bold green]Generated {self.config.draft_count} draft(s).[/] {message}")

      def generate_post(self) -> Post:
-         if random() < self.config.reblog_chance: # noqa: S311
+         if self.config.reblog_blog_identifiers and random() < self.config.reblog_chance: # noqa: S311
              original = self.get_random_post()
              user_message = f"{self.config.reblog_user_message}\n\n{original.get_content_text()}"
          else:
@@ -66,9 +66,14 @@ class DraftGenerator(FlowClass):
          return None

      def get_random_post(self) -> Post:
-         total = self.tumblr.retrieve_blog_info(self.config.upload_blog_identifier).response.blog.posts
-         post = self.tumblr.retrieve_published_posts(
-             self.config.upload_blog_identifier,
-             offset=randrange(total), # noqa: S311
-         ).response.posts[0]
-         return Post.model_validate(post)
+         blog_identifier = choice(self.config.reblog_blog_identifiers) # noqa: S311
+         while True:
+             total = self.tumblr.retrieve_blog_info(blog_identifier).response.blog.posts
+             for raw_post in self.tumblr.retrieve_published_posts(
+                 blog_identifier,
+                 "text",
+                 randrange(total), # noqa: S311
+             ).response.posts:
+                 post = Post.model_validate(raw_post)
+                 if post.valid_text_post():
+                     return post
@@ -46,7 +46,7 @@ class Config(FileSyncSettings):
      toml_file: ClassVar = Path("config.toml")

      # Downloading Posts & Writing Examples
-     download_blog_identifiers: list[str] = Field([], description="The identifiers of the blogs which post data will be downloaded from. These must be blogs associated with the same account as the configured Tumblr secret tokens.")
+     download_blog_identifiers: list[str] = Field([], description="The identifiers of the blogs which post data will be downloaded from.")
      data_directory: Path = Field(Path("data"), description="Where to store downloaded post data.")

      # Writing Examples
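The description change above drops the note that download blogs must belong to the token's account, and the README change earlier in this diff says blog-identifier options accept any version of a blog URL. The forms below are an illustration only; "example-blog" is a placeholder and the authoritative list is in the linked Tumblr API documentation on blog identifiers.

```python
# Illustrative blog identifier forms; "example-blog" is a placeholder blog name
# and the accepted variants are defined by the Tumblr API documentation.
download_blog_identifiers = [
    "example-blog.tumblr.com",              # standard blog hostname
    "https://example-blog.tumblr.com/",     # full blog URL
    "https://www.tumblr.com/example-blog",  # tumblr.com-style blog URL
]
```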
@@ -72,10 +72,11 @@ class Config(FileSyncSettings):
      # Generating
      upload_blog_identifier: str = Field("", description="The identifier of the blog which generated drafts will be uploaded to. This must be a blog associated with the same account as the configured Tumblr secret tokens.")
      draft_count: PositiveInt = Field(150, description="The number of drafts to process. This will affect the number of tokens used with OpenAI")
-     tags_chance: NonNegativeFloat = Field(0.1, description="The chance to generate tags for any given post. This will incur extra calls to OpenAI.")
+     tags_chance: NonNegativeFloat = Field(0.1, description="The chance to generate tags for any given post. This will use more OpenAI tokens.")
      tags_developer_message: str = Field("You will be provided with a block of text, and your task is to extract a very short list of the most important subjects from it.", description="The developer message used to generate tags.")
-     reblog_chance: NonNegativeFloat = Field(0.05, description="The chance to generate a reblog of a random post.")
-     reblog_user_message: str = Field("Please write a comical Tumblr post in response to the following Tumblr post:", description="The prefix for the user message used to reblog posts.")
+     reblog_blog_identifiers: list[str] = Field([], description="The identifiers of blogs that can be reblogged from when generating drafts.")
+     reblog_chance: NonNegativeFloat = Field(0.05, description="The chance to generate a reblog of a random post. This will use more OpenAI tokens.")
+     reblog_user_message: str = Field("Please write a comical Tumblr post in response to the following post from your blog:", description="The prefix for the user message used to reblog posts.")

      @classmethod
      @override
@@ -1,4 +1,4 @@
- from typing import Self
+ from typing import Literal, Self

  from requests import HTTPError, Response
  from requests_oauthlib import OAuth1Session
@@ -26,10 +26,17 @@ class TumblrSession(OAuth1Session):
          response = self.get(f"https://api.tumblr.com/v2/blog/{blog_identifier}/info")
          return ResponseModel.model_validate_json(response.text)

-     def retrieve_published_posts(self, blog_identifier: str, offset: int | None = None, after: int | None = None) -> ResponseModel:
+     def retrieve_published_posts(
+         self,
+         blog_identifier: str,
+         type_: Literal["text", "quote", "link", "answer", "video", "audio", "photo", "chat"] | None = None,
+         offset: int | None = None,
+         after: int | None = None,
+     ) -> ResponseModel:
          response = self.get(
              f"https://api.tumblr.com/v2/blog/{blog_identifier}/posts",
              params={
+                 "type": type_,
                  "offset": offset,
                  "after": after,
                  "sort": "asc",
File without changes
File without changes
File without changes
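For reference, a hedged usage sketch of the `retrieve_published_posts` signature added in this release. The `session` argument stands for an already-authenticated `TumblrSession` (construction is not shown here), and the blog identifier is a placeholder.

```python
# Hypothetical call against the new signature; `session` is assumed to be an
# authenticated TumblrSession and "example-blog.tumblr.com" a placeholder blog.
def fetch_some_text_posts(session, blog_identifier: str = "example-blog.tumblr.com"):
    """Sketch: fetch published text posts using the new type_ parameter."""
    response_model = session.retrieve_published_posts(
        blog_identifier,
        "text",     # the new type_ filter, as generate.py now passes for reblogs
        offset=25,  # skip the first 25 published posts
    )
    return response_model.response.posts
```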