textllm 0.0.3__py3-none-any.whl

@@ -0,0 +1,132 @@
1
+ Metadata-Version: 2.1
2
+ Name: textllm
3
+ Version: 0.0.3
4
+ Summary: Simple text file based interface to LLMs
5
+ Author: Justin Winokur
6
+ Author-email: Jwink3101@users.noreply.github.com
7
+ Classifier: Programming Language :: Python :: 3
8
+ Classifier: License :: OSI Approved :: MIT License
9
+ Classifier: Operating System :: OS Independent
10
+ Requires-Python: >=3.8
11
+ Description-Content-Type: text/markdown
12
+ Requires-Dist: python-dotenv
13
+ Requires-Dist: langchain
14
+
15
+ # textllm
16
+
17
+ This is a **SIMPLE** text-based interface to LLMs. It is not intended to be a general-purpose or overly featureful tool. It is just an easy way to call an LLM and save the results in a simple format (text/markdown).
18
+
19
+ textllm uses [LangChain][LangChain] to interact with many AI models.
20
+
21
+ [LangChain]:https://www.langchain.com/
22
+
23
+ ## Setup
24
+
25
+ Install from PyPI
26
+
27
+ $ pip install textllm
28
+
29
+ Then, depending on your needs, install the provider packages:
30
+
31
+ $ pip install langchain-openai
32
+ $ pip install langchain-anthropic
33
+ ...
34
+
35
+
36
+ ## Usage
37
+
38
+ Create a new text file by running textllm on a path that does not exist yet; the template is written there. See the format description below for more:
39
+
40
+ $ textllm mytitle.md
41
+
42
+ That will look something like:
43
+
44
+ # !!AUTO TITLE!!
45
+
46
+ ```toml
47
+ # Optional Settings
48
+ # TOML Format
49
+ temperature = 0.5
50
+
51
+ model = "openai:gpt-4o" # pip install langchain-openai
52
+ # model = "openai:gpt-4o-mini"
53
+ # model = "anthropic:claude-3-5-sonnet-latest" # pip install langchain-anthropic
54
+ # model = "anthropic:claude-3-5-haiku-latest"
55
+
56
+ # END Optional Settings
57
+ ```
58
+
59
+ --- System ---
60
+
61
+ You are a helpful assistant. Provide clear and thorough answers but be concise unless instructed otherwise.
62
+
63
+ --- User ---
64
+
65
+
66
+ Then (optionally) modify the System prompt and add your query under the user prompt. Then
67
+
68
+ $ textllm mytitle.md
69
+
70
+ and it will (a) update the title and (b) append the response, followed by a new user block ready to go. You will need to re-open the file in your text editor when it's done.
71
+
72
+ ## Titles and Names
73
+
74
+ As noted in "Format Description", the title is the first line. If "!!AUTO TITLE!!" is in the first line, textllm will generate a title for the document (using the same model). Title generation can be disabled with `--title off`, or you can set the title manually. To regenerate a title, reset the first line to `!!AUTO TITLE!!`.
75
+
76
+ If `--rename` is set, the document will also be renamed to match the title. Numbers will be added to the name as needed to avoid conflicts.
77
+
78
+ ## Environment Variables
79
+
80
+ Most behavior is governed by command-line flags but there are a few exceptions.
81
+
82
+ | Variable | Description |
83
+ |--|--|
84
+ |`$TEXTLLM_ENV_PATH` | Path to an environment file for API keys. |
85
+ |`$TEXTLLM_AUTO_RENAME` | Set to "true" to make `--rename` the *default*. Command-line settings will override. |
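
As a rough illustration of how the `$TEXTLLM_AUTO_RENAME` default is interpreted, the sketch below treats only the string "true" (in any letter case) as enabling the option, which mirrors the comparison made in `textllm.py`. The helper name `env_flag` and its `environ` parameter are hypothetical and not part of textllm.

```python
import os


def env_flag(name, environ=None):
    """Return True only when the variable is set to "true" (any letter case)."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "").lower() == "true"


# Passing a plain dict keeps the example independent of the real environment.
print(env_flag("TEXTLLM_AUTO_RENAME", {"TEXTLLM_AUTO_RENAME": "True"}))
print(env_flag("TEXTLLM_AUTO_RENAME", {"TEXTLLM_AUTO_RENAME": "1"}))
print(env_flag("TEXTLLM_AUTO_RENAME", {}))
```

Under this rule, values such as "1" or "yes" do not enable renaming; only "true" does.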
86
+
87
+ ## Format Description
88
+
89
+ The format is designed to be very simple. An input is broken up into three main parts:
90
+
91
+ 1. Title (optional).
92
+ 2. Settings (optional)
93
+ 3. Conversation
94
+
95
+ ### (1) Title:
96
+
97
+ First line of the document. If and only if it contains "!!AUTO TITLE!!", it will be replaced with an appropriate title based on the document (using the LLM).
98
+
99
+ Generally, this is only set once, but if "!!AUTO TITLE!!" is added back to the first line, the title will be regenerated.
100
+
101
+ ### (2) Settings
102
+
103
+ Optionally specify settings for the model in [TOML][toml] format inside a Markdown fenced code block. Do NOT modify the leading and trailing comments, as they are needed for correct parsing. All settings are passed directly to the model, including 'model'. The model should be in the "<provider>:<name>" format, where providers are those from LangChain. See the [`init_chat_model` docs][init_chat_model] for the naming scheme and the needed Python package, and [Chat Models][chat models] for more details.
104
+
105
+ Note that the models require an API key. It can be specified in the settings or set with an environment variable. Alternatively, an environment file that contains all API keys can be specified with `$TEXTLLM_ENV_PATH`.
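
To make the "<provider>:<name>" convention concrete, here is a minimal sketch of splitting a model string on the first colon only, falling back to no provider when the colon is absent. It follows the same rule as the `split(":", 1)` call in `textllm.py`, where the two parts are then handed to `init_chat_model`. The function name `split_model` is hypothetical, and the three-part model string is an invented example used only to show that later colons stay in the name.

```python
def split_model(model):
    """Split "<provider>:<name>" on the first colon only.

    Returns (None, model) when no provider prefix is present.
    """
    provider, sep, name = model.partition(":")
    if not sep:
        return None, model
    return provider, name


print(split_model("openai:gpt-4o"))
print(split_model("gpt-4o"))
print(split_model("someprovider:some-model:variant"))  # invented example string
```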
106
+
107
+ ### (3) Conversation
108
+
109
+ The conversation uses a simple format. There are three block types, as demonstrated below. General practice is to specify 'system' once at the top, but textllm will translate all blocks that are specified.
110
+
111
+ ```text
112
+ --- System ---
113
+
114
+ Enter your system prompt. These are like *super* user blocks.
115
+
116
+ --- User ---
117
+
118
+ The last "User" block is usually the question.
119
+
120
+ --- Assistant ---
121
+
122
+ The response.
123
+ ```
124
+
125
+ Generally, you want the final block to be the new "User" question but it doesn't have to be. Note that a new "--- User ---" heading will be added after the last response.
126
+
127
+ You can escape a block heading with a leading "\". textllm does this automatically if a response contains such a heading.
128
+
129
+
130
+ [toml]: https://toml.io/
131
+ [init_chat_model]: https://python.langchain.com/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html
132
+ [chat models]: https://python.langchain.com/docs/integrations/chat/
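
The conversation format described in the README above can be sketched with the standard library alone. This is a simplified illustration, not the package's implementation: the names `FLAGS`, `HEADER`, `parse_blocks`, and `escape_headers` are hypothetical, and the sketch returns plain `(role, content)` tuples, whereas `textllm.py` builds LangChain message objects and merges consecutive messages of the same role.

```python
import re

FLAGS = ("--- system ---", "--- user ---", "--- assistant ---")
# Match a heading only at the start of a line, ignoring letter case.
HEADER = re.compile(
    "(" + "|".join("^" + re.escape(flag) for flag in FLAGS) + ")",
    flags=re.MULTILINE | re.IGNORECASE,
)


def parse_blocks(text):
    """Return a list of (role, content) tuples, one per non-empty block."""
    # The capture group keeps the headings in the split result.
    parts = HEADER.split(text)[1:]  # drop the title/settings before the first heading
    blocks = []
    for flag, body in zip(parts[0::2], parts[1::2]):
        lines = []
        for line in body.strip().split("\n"):
            # A leading backslash escapes a heading inside a message.
            if any(line.lower().startswith("\\" + f) for f in FLAGS):
                line = line[1:]
            lines.append(line)
        content = "\n".join(lines)
        if content:
            blocks.append((flag.lower().strip("- "), content))
    return blocks


def escape_headers(content):
    """Prefix any heading-like line with a backslash before saving a response."""
    return HEADER.sub(r"\\\1", content)


reply = escape_headers("Example:\n--- User ---\nhi")
doc = (
    "# Title\n\n--- System ---\n\nBe brief.\n\n--- User ---\n\nShow a heading\n\n"
    "--- Assistant ---\n\n" + reply + "\n\n--- User ---\n\n"
)
for role, content in parse_blocks(doc):
    print(role, repr(content))
```

Escaping on write and unescaping on read means a response that happens to contain a heading line survives a round trip without being split into a new block.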
@@ -0,0 +1,6 @@
1
+ textllm.py,sha256=iofZBgRLLc5872-ILLBqZJZrJaGDLgCKFOs-LH_lQRg,13683
2
+ textllm-0.0.3.dist-info/METADATA,sha256=DaQIEvPt7h0WjPwjvzO2n3nbZJj2YttrMdhMZm7UIJk,4798
3
+ textllm-0.0.3.dist-info/WHEEL,sha256=PZUExdf71Ui_so67QXpySuHtCi3-J3wvF4ORK6k_S8U,91
4
+ textllm-0.0.3.dist-info/entry_points.txt,sha256=YEO8Q74UjIG9btPb1Kd5n_n7Mdz2opbsER5zuyxYd9o,40
5
+ textllm-0.0.3.dist-info/top_level.txt,sha256=iwgCn2waDKSHiOeF4Vu_g5EzFI3R5GtUCLrQAUyapYI,8
6
+ textllm-0.0.3.dist-info/RECORD,,
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: setuptools (75.6.0)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ textllm = textllm:cli
@@ -0,0 +1 @@
1
+ textllm
textllm.py ADDED
@@ -0,0 +1,427 @@
1
+ #!/usr/bin/env python
2
+ # -*- coding: utf-8 -*-
3
+
4
+ import argparse
5
+ import itertools
6
+ import json
7
+ import logging
8
+ import os
9
+ import re
10
+ import shutil
11
+ import sys
12
+ import tomllib  # Standard library since Python 3.11
13
+ from functools import cached_property
14
+ from pathlib import Path
15
+ from textwrap import dedent
16
+
17
+ from dotenv import load_dotenv # pip install python-dotenv
18
+ from langchain.chat_models import init_chat_model
19
+ from langchain_core.messages import (
20
+ AIMessage,
21
+ HumanMessage,
22
+ SystemMessage,
23
+ merge_message_runs,
24
+ )
25
+
26
+ __version__ = "0.0.3"
27
+
28
+ log = logging.getLogger("textllm")
29
+
30
+ TEXTLLM_ENV_PATH = os.environ.get("TEXTLLM_ENV_PATH", None)
31
+ TEXTLLM_AUTO_RENAME = os.environ.get("TEXTLLM_AUTO_RENAME", "").lower() == "true"
32
+
33
+ AUTO_TITLE = "!!AUTO TITLE!!"
34
+ TEMPLATE = f"""\
35
+ # {AUTO_TITLE}
36
+
37
+ ```toml
38
+ # Optional Settings
39
+ # TOML Format
40
+ temperature = 0.5
41
+
42
+ model = "openai:gpt-4o" # pip install langchain-openai
43
+ # model = "openai:gpt-4o-mini"
44
+ # model = "anthropic:claude-3-5-sonnet-latest" # pip install langchain-anthropic
45
+ # model = "anthropic:claude-3-5-haiku-latest"
46
+ # model = "google_genai:gemini-1.5-flash" # pip install langchain-google-genai
47
+ # model = "google_genai:gemini-1.5-pro"
48
+
49
+ # END Optional Settings
50
+ ```
51
+
52
+ --- System ---
53
+
54
+ You are a helpful assistant. Provide clear and thorough answers but be concise unless instructed otherwise.
55
+
56
+ --- User ---
57
+
58
+ """
59
+
60
+ TITLE_SYSTEM_PROMPT = """\
61
+ Provide an appropriate, concise title for this conversation. The conversation is in JSON form with roles 'system' (or 'developer'), 'human', and 'ai'.
62
+
63
+ - Aim for fewer than 5 words but absolutely no more than 10.
64
+ - Give more influence to earlier messages than later.
65
+ - Be as concise as possible without losing the context of the conversation.
66
+ - Your goal is to extract the key point of the conversation
67
+ - Make sure the title is also appropriate for a filename. Spaces are acceptable.
68
+ - Reply with ONLY the title and nothing else!
69
+ """
70
+
71
+ MAX_FILENAME_CHAR = 240
72
+
73
+ flag2role = {
74
+ "--- system ---": SystemMessage,
75
+ "--- user ---": HumanMessage,
76
+ "--- assistant ---": AIMessage,
77
+ }
78
+
79
+ RETURN_AFTER_CLI_FOR_DEVEL = False
80
+
81
+
82
+ class Conversation:
83
+ def __init__(self, filepath):
84
+
85
+ if load_dotenv(TEXTLLM_ENV_PATH):
86
+ # $TEXTLLM_ENV_PATH defaults to None to look in the parent dir.
87
+ log.debug(f"Loaded env. ${TEXTLLM_ENV_PATH = }")
88
+ else:
89
+ log.debug(f"Could not load env. ${TEXTLLM_ENV_PATH = }")
90
+
91
+ self.filepath = filepath
92
+
93
+ # Read and truncate file. Do it now in case the title is updated
94
+ with open(self.filepath, "rb+") as fp:
95
+ content = fp.read().rstrip()
96
+ self.text = content.decode("UTF-8")
97
+
98
+ fp.seek(len(content), 0)
99
+ fp.truncate()
100
+
101
+ self.messages = self.read_conversation()
102
+
103
+ def call_llm(self, messages, **new_settings):
104
+ settings = self.settings.copy() | new_settings
105
+ log.debug(f"Settings {settings}")
106
+
107
+ model = settings.pop("model") # Will KeyError if not set as expected
108
+ try:
109
+ model_provider, model_name = model.split(":", 1)
110
+ except ValueError:
111
+ model_provider = None
112
+ model_name = model
113
+ log.debug(f"{model!r} does not contain a provider. Will try to infer")
114
+
115
+ log.debug(f"{model_provider = } {model_name = }")
116
+
117
+ chat_model = init_chat_model(
118
+ model=model_name,
119
+ model_provider=model_provider,
120
+ **settings,
121
+ )
122
+
123
+ response = chat_model.invoke(messages)
124
+
125
+ try:
126
+ logtxt = (
127
+ f"tokens: "
128
+ f"prompt {response.usage_metadata['input_tokens']}, "
129
+ f"completion {response.usage_metadata['output_tokens']}, "
130
+ f"total {response.usage_metadata['total_tokens']}"
131
+ )
132
+ log.debug(logtxt)
133
+ except Exception:  # Avoid a bare except so KeyboardInterrupt/SystemExit still propagate
134
+ # The above seems to only work well with OpenAI.
135
+ # ToDO: Fix this
136
+ pass
137
+
138
+ return response
139
+
140
+ def chat(self, require_user_prompt=True):
141
+ if require_user_prompt and (
142
+ not self.messages or not isinstance(self.messages[-1], HumanMessage)
143
+ ):
144
+ raise NoHumanMessageError("Must have a new user message")
145
+
146
+ response = self.call_llm(messages=self.messages)
147
+
148
+ # Not really needed but in case I do more with it later
149
+ self.messages.append(response)
150
+
151
+ # Add escapes to the content
152
+ content = response.content
153
+ pattern = re.compile(
154
+ "(" + "|".join("^" + re.escape(flag) for flag in flag2role) + ")",
155
+ flags=re.DOTALL | re.MULTILINE | re.IGNORECASE,
156
+ )
157
+ content = pattern.sub(r"\\\1", content)
158
+
159
+ with open(self.filepath, "at") as fp:
160
+ fp.write("\n\n--- Assistant ---\n\n")
161
+ fp.write(content)
162
+ fp.write("\n\n--- User ---\n\n")
163
+
164
+ log.info(f"Updated {self.filepath!r}")
165
+
166
+ def set_title(self):
167
+ top, rest = self.text.split("\n", 1)
168
+ if AUTO_TITLE not in top:
169
+ log.debug(f"{AUTO_TITLE!r} not found in first line.")
170
+ return # This will happen nearly every time but the first
171
+
172
+ messages = [(m.type, m.content) for m in self.messages]
173
+ new = [
174
+ SystemMessage(content=TITLE_SYSTEM_PROMPT),
175
+ HumanMessage(content=json.dumps(messages)),
176
+ ]
177
+
178
+ response = self.call_llm(messages=new, temperature=0.1)
179
+ title = response.content
180
+
181
+ top = top.replace(AUTO_TITLE, title)
182
+ self.text = f"{top}\n{rest}"
183
+ with open(self.filepath, "wt") as fp:
184
+ fp.write(self.text)
185
+ log.info(f"Set title to {title!r}")
186
+
187
+ @cached_property
188
+ def settings(self):
189
+ defaults = Conversation.read_settings(TEMPLATE)
190
+ new = Conversation.read_settings(self.text)
191
+ final = defaults | new
192
+ return final
193
+
194
+ @staticmethod
195
+ def read_settings(text):
196
+
197
+ pattern = re.compile(
198
+ r"```toml\s*"
199
+ r"# Optional Settings\s*"
200
+ r"(.*?)"
201
+ r"^# END Optional Settings\s*"
202
+ r"```",
203
+ flags=re.DOTALL | re.MULTILINE,
204
+ )
205
+ match = pattern.search(text)
206
+ if match:
207
+ toml_content = match.group(1).strip()
208
+ return tomllib.loads(toml_content) # Parse as TOML
209
+
210
+ return {}
211
+
212
+ def read_conversation(self):
213
+ conversation = []
214
+
215
+ pattern = re.compile(
216
+ "(" + "|".join("^" + re.escape(flag) for flag in flag2role) + ")",
217
+ flags=re.DOTALL | re.MULTILINE | re.IGNORECASE,
218
+ )
219
+
220
+ split_text = pattern.split(self.text)
221
+
222
+ # Decide if the first item is a flag. It likely isn't but could be!
223
+ if split_text[0].lower() not in flag2role:
224
+ del split_text[0]
225
+
226
+ for flag, msg in grouper(split_text, 2):
227
+ msg = msg.strip()
228
+ if not msg:
229
+ continue # Empty or blank
230
+
231
+ # Clean up and unescape
232
+ msg_lines = []
233
+ for line in msg.strip().split("\n"):
234
+ if any(line.lower().startswith(rf"\{flag}") for flag in flag2role):
235
+ line = line[1:]
236
+ msg_lines.append(line)
237
+
238
+ conversation.append(flag2role[flag.lower()](content="\n".join(msg_lines)))
239
+
240
+ return merge_message_runs(conversation)
241
+
242
+ def rename_by_title(self):
243
+ dirname = os.path.dirname(self.filepath)
244
+
245
+ # Clean the current for possible "<name> (n).<ext>"
246
+ base, ext = os.path.splitext(self.filepath)
247
+ cleaned_filepath = re.sub(r" \(\d+\)$", "", base) + ext
248
+ cleaned_filename = os.path.basename(cleaned_filepath)
249
+ log.debug(f"{cleaned_filename = }")
250
+
251
+ # Compute the new name without worrying about duplicates
252
+ title, *_ = self.text.split("\n", 1)
253
+
254
+ if AUTO_TITLE in title: # BEFORE cleaning it
255
+ log.warning(f"{AUTO_TITLE!r} in title. Not renaming!")
256
+ return
257
+
258
+ # Sub unsafe or invalid characters
259
+ invalid_chars = set(
260
+ "\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13"
261
+ '\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"*/:<>?\\|'
262
+ )
263
+ title = title.strip().strip("#").strip()
264
+ title_based_filebase = "".join(c for c in title if c not in invalid_chars)
265
+ title_based_filebase = title_based_filebase[: (MAX_FILENAME_CHAR - len(ext))]
266
+ title_based_filename = title_based_filebase + ext
267
+ title_based_filepath = os.path.join(dirname, title_based_filename)
268
+ log.debug(f"{title_based_filename = }")
269
+ if cleaned_filename == title_based_filename:
270
+ log.debug("Already named by title")
+ return  # Already has the right name; do not bump it to "(n)"
271
+
272
+ # Ensure it is unique by adding " (n)" up to 99
273
+ c = 0
274
+ while os.path.exists(title_based_filepath):
275
+ c += 1
276
+ if c >= 100:
277
+ raise ValueError(f"Too many for {title_based_filebase + ext!r}")
278
+
279
+ new = f"{title_based_filebase} ({c}){ext}"
280
+ title_based_filepath = os.path.join(dirname, new)
281
+ log.debug(f"Required {c} iterations for unique name")
282
+
283
+ shutil.move(self.filepath, title_based_filepath)
284
+ log.info(f"Rename by title {self.filepath!r} --> {title_based_filepath!r}")
285
+ self.filepath = title_based_filepath
286
+
287
+
288
+ def grouper(iterable, n, *, fillvalue=""):
289
+ iterators = [iter(iterable)] * n
290
+ return itertools.zip_longest(*iterators, fillvalue=fillvalue)
291
+
292
+
293
+ class NoHumanMessageError(ValueError):
294
+ """Error when a conversation doesn't end with a HumanMessage"""
295
+
296
+
297
+ def cli(argv=None):
298
+
299
+ parser = argparse.ArgumentParser(
300
+ description="Simple LLM interface that reads and writes to a text file",
301
+ epilog="See readme.md for details on format description",
302
+ # formatter_class=argparse.RawDescriptionHelpFormatter,
303
+ )
304
+
305
+ parser.add_argument(
306
+ "conversation",
307
+ help="""
308
+ Input file in the noted format. If it does not exist, the template will
309
+ instead be written there (unless --no-create)
310
+ """,
311
+ )
312
+
313
+ parser.add_argument(
314
+ "--create",
315
+ action=argparse.BooleanOptionalAction,
316
+ default=True,
317
+ help="Whether or not to create a file with a template if no file exists",
318
+ )
319
+
320
+ parser.add_argument(
321
+ "--title",
322
+ choices=["auto", "only", "off"],
323
+ default="auto",
324
+ help=f"""
325
+ [%(default)s] How to set the title. If 'auto', will replace {AUTO_TITLE!r}
326
+ with the generated title. If 'only', will only replace the title and
327
+ not continue the chat. If 'off', will not update the title (or rename).
328
+ The title is the first line.
329
+ """,
330
+ )
331
+
332
+ parser.add_argument(
333
+ "--u", # To make --no-u an easy option
334
+ "--require-user-prompt",
335
+ dest="require_user_prompt",
336
+ action=argparse.BooleanOptionalAction,
337
+ default=True,
338
+ help="""
339
+ Whether or not to require there be a user prompt at the end of
340
+ the messages. Default %(default)s
341
+ """,
342
+ )
343
+
344
+ parser.add_argument(
345
+ "--rename",
346
+ action=argparse.BooleanOptionalAction,
347
+ default=TEXTLLM_AUTO_RENAME,
348
+ help=f"""
349
+ Rename the file based on the title. The title must NOT have {AUTO_TITLE!r}
350
+ in the title. Note that the automatic title generation will happen first if
351
+ set. Will increment the file if one already exists. Default is based
352
+ on environment variable whether $TEXTLLM_AUTO_RENAME == "true". Currently
353
+ %(default)s
354
+ """,
355
+ )
356
+
357
+ parser.add_argument(
358
+ "--version",
359
+ action="version",
360
+ version="%(prog)s-" + __version__,
361
+ )
362
+
363
+ verb = parser.add_argument_group("Verbosity Settings:")
364
+ verb.add_argument(
365
+ "-s", "--silent", action="count", default=0, help="Decrease Verbosity"
366
+ )
367
+ verb.add_argument(
368
+ "-v", "--verbose", action="count", default=0, help="Increase Verbosity"
369
+ )
370
+
371
+ args = parser.parse_args(argv)
372
+
373
+ # Define logging levels
374
+ levels = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]
375
+ level_index = args.verbose - args.silent + 2 # +1: WARNING, +2: INFO
376
+ level_index = max(0, min(level_index, len(levels) - 1)) # Always keep ERROR
377
+
378
+ log.setLevel(levels[level_index])
379
+
380
+ console_handler = logging.StreamHandler()
381
+ fmt = logging.Formatter(
382
+ "%(asctime)s:%(levelname)s: %(message)s",
383
+ datefmt="%Y-%m-%d %H:%M:%S",
384
+ )
385
+ console_handler.setFormatter(fmt)
386
+ log.addHandler(console_handler)
387
+
388
+ log.debug(f"argv: {sys.argv[1:]}")
389
+ log.debug(f"{args = }")
390
+
391
+ filepath = args.conversation
392
+
393
+ try:
394
+ if not os.path.exists(filepath):
395
+ if not args.create:
396
+ raise ValueError(f"{filepath!r} does not exist. Exit")
397
+ Path(filepath).parent.mkdir(parents=True, exist_ok=True)
398
+ with open(filepath, "xt") as fp:
399
+ fp.write(TEMPLATE)
400
+ log.info(f"{filepath!r} does not exist. Created template.")
401
+ sys.exit()
402
+ else:
403
+ log.debug(f"{filepath!r} exists")
404
+
405
+ convo = Conversation(filepath)
406
+
407
+ if args.title != "off":
408
+ convo.set_title() # Will do nothing if AUTO_TITLE not in the top line
409
+ if args.title == "only":
410
+ return convo
411
+
412
+ convo.chat(require_user_prompt=args.require_user_prompt)
413
+
414
+ if args.rename:
415
+ convo.rename_by_title()
416
+
417
+ if RETURN_AFTER_CLI_FOR_DEVEL:
418
+ return convo
419
+
420
+ except Exception as E:
421
+ log.error(E)
422
+ if levels[level_index] == logging.DEBUG:
423
+ raise
424
+
425
+
426
+ if __name__ == "__main__":
427
+ cli()
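
The duplicate-avoidance step in `rename_by_title` can be isolated as a small helper. The sketch below is an illustration under stated assumptions rather than the package's code: the function name `unique_path` and its `limit` parameter are hypothetical, and it only computes a free path without moving any file.

```python
import os
import tempfile


def unique_path(dirname, base, ext, limit=100):
    """Return a path in dirname that does not exist yet.

    Tries "<base><ext>" first, then "<base> (1)<ext>", "<base> (2)<ext>", ...
    """
    candidate = os.path.join(dirname, base + ext)
    count = 0
    while os.path.exists(candidate):
        count += 1
        if count >= limit:
            raise ValueError(f"Too many files named {base + ext!r}")
        candidate = os.path.join(dirname, f"{base} ({count}){ext}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    first = unique_path(tmp, "My Title", ".md")
    open(first, "w").close()  # occupy the plain name
    second = unique_path(tmp, "My Title", ".md")
    print(os.path.basename(first), "|", os.path.basename(second))
```

Checking existence and then creating the file is not atomic, so a sketch like this suits a single-user command-line tool rather than concurrent writers.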