chatterer 0.1.26__py3-none-any.whl → 0.1.28__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (42)
  1. chatterer/__init__.py +87 -87
  2. chatterer/common_types/__init__.py +21 -21
  3. chatterer/common_types/io.py +19 -19
  4. chatterer/constants.py +5 -0
  5. chatterer/examples/__main__.py +75 -75
  6. chatterer/examples/any2md.py +83 -85
  7. chatterer/examples/pdf2md.py +231 -338
  8. chatterer/examples/pdf2txt.py +52 -54
  9. chatterer/examples/ppt.py +487 -486
  10. chatterer/examples/pw.py +141 -143
  11. chatterer/examples/snippet.py +54 -56
  12. chatterer/examples/transcribe.py +192 -192
  13. chatterer/examples/upstage.py +87 -89
  14. chatterer/examples/web2md.py +80 -80
  15. chatterer/interactive.py +422 -354
  16. chatterer/language_model.py +530 -536
  17. chatterer/messages.py +21 -21
  18. chatterer/tools/__init__.py +46 -46
  19. chatterer/tools/caption_markdown_images.py +388 -384
  20. chatterer/tools/citation_chunking/__init__.py +3 -3
  21. chatterer/tools/citation_chunking/chunks.py +51 -53
  22. chatterer/tools/citation_chunking/citation_chunker.py +117 -118
  23. chatterer/tools/citation_chunking/citations.py +284 -285
  24. chatterer/tools/citation_chunking/prompt.py +157 -157
  25. chatterer/tools/citation_chunking/reference.py +26 -26
  26. chatterer/tools/citation_chunking/utils.py +138 -138
  27. chatterer/tools/convert_pdf_to_markdown.py +634 -645
  28. chatterer/tools/convert_to_text.py +446 -446
  29. chatterer/tools/upstage_document_parser.py +704 -705
  30. chatterer/tools/webpage_to_markdown.py +739 -739
  31. chatterer/tools/youtube.py +146 -147
  32. chatterer/utils/__init__.py +15 -15
  33. chatterer/utils/base64_image.py +349 -350
  34. chatterer/utils/bytesio.py +59 -59
  35. chatterer/utils/code_agent.py +237 -237
  36. chatterer/utils/imghdr.py +145 -145
  37. {chatterer-0.1.26.dist-info → chatterer-0.1.28.dist-info}/METADATA +377 -390
  38. chatterer-0.1.28.dist-info/RECORD +43 -0
  39. chatterer-0.1.26.dist-info/RECORD +0 -42
  40. {chatterer-0.1.26.dist-info → chatterer-0.1.28.dist-info}/WHEEL +0 -0
  41. {chatterer-0.1.26.dist-info → chatterer-0.1.28.dist-info}/entry_points.txt +0 -0
  42. {chatterer-0.1.26.dist-info → chatterer-0.1.28.dist-info}/top_level.txt +0 -0
@@ -1,390 +1,377 @@
 Metadata-Version: 2.4
 Name: chatterer
-Version: 0.1.26
+Version: 0.1.28
 Summary: The highest-level interface for various LLM APIs.
 Requires-Python: >=3.12
 Description-Content-Type: text/markdown
 Requires-Dist: instructor>=1.7.2
 Requires-Dist: langchain>=0.3.19
 Requires-Dist: langchain-openai>=0.3.11
 Requires-Dist: pillow>=11.1.0
 Requires-Dist: regex>=2024.11.6
 Requires-Dist: rich>=13.9.4
 Requires-Dist: colorama>=0.4.6
 Requires-Dist: spargear>=0.2.7
 Requires-Dist: dotenv>=0.9.9
+Requires-Dist: loguru>=0.7.3
 Provides-Extra: dev
 Requires-Dist: pyright>=1.1.401; extra == "dev"
 Provides-Extra: conversion
 Requires-Dist: youtube-transcript-api>=1.0.3; extra == "conversion"
 Requires-Dist: chatterer[browser]; extra == "conversion"
 Requires-Dist: chatterer[pdf]; extra == "conversion"
 Requires-Dist: chatterer[markdown]; extra == "conversion"
 Requires-Dist: chatterer[video]; extra == "conversion"
 Provides-Extra: browser
 Requires-Dist: playwright>=1.50.0; extra == "browser"
 Provides-Extra: pdf
 Requires-Dist: pymupdf>=1.25.4; extra == "pdf"
 Requires-Dist: pypdf>=5.4.0; extra == "pdf"
 Provides-Extra: markdown
 Requires-Dist: markitdown[all]>=0.1.1; extra == "markdown"
 Requires-Dist: markdownify>=1.1.0; extra == "markdown"
 Requires-Dist: commonmark>=0.9.1; extra == "markdown"
 Requires-Dist: mistune>=3.1.3; extra == "markdown"
 Provides-Extra: video
 Requires-Dist: pydub>=0.25.1; extra == "video"
 Provides-Extra: langchain
 Requires-Dist: langchain-anthropic>=0.3.10; extra == "langchain"
 Requires-Dist: langchain-google-genai>=2.1.1; extra == "langchain"
 Requires-Dist: langchain-ollama>=0.3.0; extra == "langchain"
 Requires-Dist: langchain-experimental>=0.3.4; extra == "langchain"
 Provides-Extra: all
 Requires-Dist: chatterer[dev]; extra == "all"
 Requires-Dist: chatterer[langchain]; extra == "all"
 Requires-Dist: chatterer[conversion]; extra == "all"
 
 # Chatterer
 
 **Simplified, Structured AI Assistant Framework**
 
 `chatterer` is a Python library designed as a type-safe LangChain wrapper for interacting with various language models (OpenAI, Anthropic, Google Gemini, Ollama, etc.). It supports structured outputs via Pydantic models, plain text responses, asynchronous calls, image description, code execution, and an interactive shell.
 
 The structured reasoning in `chatterer` is inspired by the [Atom-of-Thought](https://github.com/qixucen/atom) pipeline.
 
 ---
 
 ## Quick Install
 
 ```bash
 pip install chatterer
 ```
 
 ---
 
 ## Quickstart Example
 
 Generate text quickly using OpenAI.
 Messages can be input as plain strings or structured lists:
 
 ```python
 from chatterer import Chatterer, HumanMessage, AIMessage, SystemMessage
 
 # Initialize the Chatterer with `openai`, `anthropic`, `google`, or `ollama` models
 chatterer: Chatterer = Chatterer.openai("gpt-4.1")
 
 # Get direct response as str
 response: str = chatterer("What is the meaning of life?")
 # response = chatterer([{ "role": "user", "content": "What is the meaning of life?" }])
 # response = chatterer([("user", "What is the meaning of life?")])
 # response = chatterer([HumanMessage("What is the meaning of life?")])
 print(response)
 ```
 
 Image and text content can be sent together:
 
 ```python
 from chatterer import Base64Image, HumanMessage
 
 # Load an image from a file or URL, returning a Base64Image object or None
 image = Base64Image.from_url_or_path("example.jpg")
 # image = Base64Image.from_url_or_path("https://example.com/image.jpg")
 assert image is not None, "Failed to load image"
 
 # Alternatively, load an image from bytes
 # with open("example.jpg", "rb") as f:
 #     image = Base64Image.from_bytes(f.read(), ext="jpeg")
 
 message = HumanMessage(["Describe the image", image.data_uri_content])
 response: str = chatterer([message])
 print(response)
 ```
 
 ---
 
 ## Structured Output with Pydantic
 
 Define a Pydantic model and get typed responses:
 
 ```python
 from pydantic import BaseModel
 
 class AnswerModel(BaseModel):
     question: str
     answer: str
 
 # Call with response_model
 response: AnswerModel = chatterer("What's the capital of France?", response_model=AnswerModel)
 print(response.question, response.answer)
 ```
 
 ---
 
 ## Async Example
 
 Use asynchronous generation for non-blocking operations:
 
 ```python
 import asyncio
 
 async def main():
     response = await chatterer.agenerate("Explain async in Python briefly.")
     print(response)
 
 asyncio.run(main())
 ```
 
 ---
 
 ## Streaming Structured Outputs
 
 Stream structured responses in real-time:
 
 ```python
 from pydantic import BaseModel
 
 class AnswerModel(BaseModel):
     text: str
 
 chatterer = Chatterer.openai()
 for chunk in chatterer.generate_pydantic_stream(AnswerModel, "Tell me a story"):
     print(chunk.text)
 ```
 
 Asynchronous version:
 ```python
 import asyncio
 
 async def main():
     async for chunk in chatterer.agenerate_pydantic_stream(AnswerModel, "Tell me a story"):
         print(chunk.text)
 
 asyncio.run(main())
 ```
 
 ---
 
 ## Image Description
 
 Generate descriptions for images using the language model:
 
 ```python
 description = chatterer.describe_image("https://example.com/image.jpg")
 print(description)
 
 # Customize the instruction
 description = chatterer.describe_image("https://example.com/image.jpg", instruction="Describe the main objects in the image.")
 ```
 
 An asynchronous version is also available:
 
 ```python
 async def main():
     description = await chatterer.adescribe_image("https://example.com/image.jpg")
     print(description)
 
 asyncio.run(main())
 ```
 
 ---
 
 ## Code Execution
 
 Generate and execute Python code dynamically:
 
 ```python
 result = chatterer.invoke_code_execution("Write a function to calculate factorial.")
 print(result.code)
 print(result.output)
 ```
 
 An asynchronous version exists as well:
 
 ```python
 async def main():
     result = await chatterer.ainvoke_code_execution("Write a function to calculate factorial.")
     print(result.output)
 
 asyncio.run(main())
 ```
 
 ---
 
 ## Webpage to Markdown
 
 Convert webpages to Markdown, optionally filtering content with the language model:
 
 ```python
 from chatterer.tools.webpage_to_markdown import PlayWrightBot
 
 with PlayWrightBot() as bot:
     # Basic conversion
     markdown = bot.url_to_md("https://example.com")
     print(markdown)
 
     # With LLM filtering and image descriptions
     filtered_md = bot.url_to_md_with_llm("https://example.com", describe_images=True)
     print(filtered_md)
 ```
 
 Asynchronous version:
 ```python
 import asyncio
 
 async def main():
     async with PlayWrightBot() as bot:
         markdown = await bot.aurl_to_md_with_llm("https://example.com")
         print(markdown)
 
 asyncio.run(main())
 ```
 
 Extract specific elements:
 ```python
 with PlayWrightBot() as bot:
     headings = bot.select_and_extract("https://example.com", "h2")
     print(headings)
 ```
 
 ---
 
 ## Citation Chunking
 
 Chunk documents into semantic sections with citations:
 
 ```python
 from chatterer import Chatterer
 from chatterer.tools import citation_chunker
 
 chatterer = Chatterer.openai()
 document = "Long text about quantum computing..."
 chunks = citation_chunker(document, chatterer, global_coverage_threshold=0.9)
 for chunk in chunks:
     print(f"Subject: {chunk.name}")
     for source, matches in chunk.references.items():
         print(f"  Source: {source}, Matches: {matches}")
 ```
 
 ---
 
 ## Interactive Shell
 
 Engage in a conversational AI session with code execution support:
 
 ```python
 from chatterer import interactive_shell
 
 interactive_shell()
 ```
 
 This launches an interactive session where you can chat with the AI and execute code snippets. Type `quit` or `exit` to end the session.
 
 ---
 
 ## Atom-of-Thought Pipeline (AoT)
 
 `AoTPipeline` provides structured reasoning inspired by the [Atom-of-Thought](https://github.com/qixucen/atom) approach. It decomposes complex questions recursively, generates answers, and combines them via an ensemble process.
 
 ### AoT Usage Example
 
 ```python
 from chatterer import Chatterer
 from chatterer.strategies import AoTStrategy, AoTPipeline
 
 pipeline = AoTPipeline(chatterer=Chatterer.openai(), max_depth=2)
 strategy = AoTStrategy(pipeline=pipeline)
 
 question = "What would Newton discover if hit by an apple falling from 100 meters?"
 answer = strategy.invoke(question)
 print(answer)
 
 # Generate and inspect reasoning graph
 graph = strategy.get_reasoning_graph()
 print(f"Graph: {len(graph.nodes)} nodes, {len(graph.relationships)} relationships")
 ```
 
 **Note**: The AoT pipeline includes an optional feature to generate a reasoning graph, which can be stored in Neo4j for visualization and analysis. Install `neo4j_extension` and set up a Neo4j instance to use this feature:
 
 ```python
 from neo4j_extension import Neo4jConnection
 with Neo4jConnection() as conn:
     conn.upsert_graph(graph)
 ```
 
 ---
 
 ## Supported Models
 
 Chatterer supports multiple language models, easily initialized as follows:
 
 - **OpenAI**
 - **Anthropic**
 - **Google Gemini**
 - **Ollama** (local models)
 
 ```python
 openai_chatterer = Chatterer.openai("gpt-4o-mini")
 anthropic_chatterer = Chatterer.anthropic("claude-3-7-sonnet-20250219")
 gemini_chatterer = Chatterer.google("gemini-2.0-flash")
 ollama_chatterer = Chatterer.ollama("deepseek-r1:1.5b")
 ```
 
 ---
 
 ## Advanced Features
 
 - **Streaming Responses**: Use `generate_stream` or `agenerate_stream` for real-time output.
 - **Streaming Structured Outputs**: Stream Pydantic-typed responses with `generate_pydantic_stream` or `agenerate_pydantic_stream`.
 - **Async/Await Support**: All methods have asynchronous counterparts (e.g., `agenerate`, `adescribe_image`).
 - **Structured Outputs**: Leverage Pydantic models for typed responses.
 - **Image Description**: Generate descriptions for images with `describe_image`.
 - **Code Execution**: Dynamically generate and execute Python code with `invoke_code_execution`.
 - **Webpage to Markdown**: Convert webpages to Markdown with `PlayWrightBot`, including JavaScript rendering, element extraction, and LLM-based content filtering.
 - **Citation Chunking**: Semantically chunk documents and extract citations with `citation_chunker`, including coverage analysis.
 - **Interactive Shell**: Use `interactive_shell` for conversational AI with code execution.
 - **Token Counting**: Retrieve input/output token counts with `get_num_tokens_from_message`.
 - **Utilities**: Tools for content processing (e.g., `html_to_markdown`, `pdf_to_text`, `get_youtube_video_subtitle`, `citation_chunker`) are available in the `tools` module.
 
 ```python
 # Example: Convert PDF to text
 from chatterer.tools import pdf_to_text
 text = pdf_to_text("example.pdf")
 print(text)
 
 # Example: Get YouTube subtitles
 from chatterer.tools import get_youtube_video_subtitle
 subtitles = get_youtube_video_subtitle("https://www.youtube.com/watch?v=example")
 print(subtitles)
 
 # Example: Get token counts
 from chatterer.messages import HumanMessage
 msg = HumanMessage(content="Hello, world!")
 tokens = chatterer.get_num_tokens_from_message(msg)
 if tokens:
     input_tokens, output_tokens = tokens
     print(f"Input: {input_tokens}, Output: {output_tokens}")
 ```
-
----
-
-## Logging
-
-Enable debugging with basic logging:
-
-```python
-import logging
-logging.basicConfig(level=logging.DEBUG)
-```
-
-The AoT pipeline uses a custom color-coded logger for detailed step-by-step output.
-
 ---
 
 ## Contributing
 
 We welcome contributions! Feel free to open an issue or submit a pull request on the repository.
 
 ---
 
 ## License
 
 MIT License