chatterer 0.1.24__py3-none-any.whl → 0.1.25__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (44)
  1. chatterer/__init__.py +97 -93
  2. chatterer/common_types/__init__.py +21 -21
  3. chatterer/common_types/io.py +19 -19
  4. chatterer/examples/__main__.py +75 -75
  5. chatterer/examples/any2md.py +85 -85
  6. chatterer/examples/pdf2md.py +338 -338
  7. chatterer/examples/pdf2txt.py +54 -54
  8. chatterer/examples/ppt.py +486 -486
  9. chatterer/examples/pw.py +143 -137
  10. chatterer/examples/snippet.py +56 -55
  11. chatterer/examples/transcribe.py +192 -112
  12. chatterer/examples/upstage.py +89 -89
  13. chatterer/examples/web2md.py +80 -66
  14. chatterer/interactive.py +354 -354
  15. chatterer/language_model.py +536 -536
  16. chatterer/messages.py +21 -21
  17. chatterer/strategies/__init__.py +13 -13
  18. chatterer/strategies/atom_of_thoughts.py +975 -975
  19. chatterer/strategies/base.py +14 -14
  20. chatterer/tools/__init__.py +46 -46
  21. chatterer/tools/caption_markdown_images.py +384 -384
  22. chatterer/tools/citation_chunking/__init__.py +3 -3
  23. chatterer/tools/citation_chunking/chunks.py +53 -53
  24. chatterer/tools/citation_chunking/citation_chunker.py +118 -118
  25. chatterer/tools/citation_chunking/citations.py +285 -285
  26. chatterer/tools/citation_chunking/prompt.py +157 -157
  27. chatterer/tools/citation_chunking/reference.py +26 -26
  28. chatterer/tools/citation_chunking/utils.py +138 -138
  29. chatterer/tools/convert_pdf_to_markdown.py +645 -625
  30. chatterer/tools/convert_to_text.py +446 -446
  31. chatterer/tools/upstage_document_parser.py +705 -705
  32. chatterer/tools/webpage_to_markdown.py +739 -739
  33. chatterer/tools/youtube.py +146 -146
  34. chatterer/utils/__init__.py +15 -15
  35. chatterer/utils/base64_image.py +293 -285
  36. chatterer/utils/bytesio.py +59 -59
  37. chatterer/utils/code_agent.py +237 -237
  38. chatterer/utils/imghdr.py +148 -148
  39. {chatterer-0.1.24.dist-info → chatterer-0.1.25.dist-info}/METADATA +390 -389
  40. chatterer-0.1.25.dist-info/RECORD +45 -0
  41. chatterer-0.1.24.dist-info/RECORD +0 -45
  42. {chatterer-0.1.24.dist-info → chatterer-0.1.25.dist-info}/WHEEL +0 -0
  43. {chatterer-0.1.24.dist-info → chatterer-0.1.25.dist-info}/entry_points.txt +0 -0
  44. {chatterer-0.1.24.dist-info → chatterer-0.1.25.dist-info}/top_level.txt +0 -0
@@ -1,389 +1,390 @@
 Metadata-Version: 2.4
 Name: chatterer
-Version: 0.1.24
+Version: 0.1.25
 Summary: The highest-level interface for various LLM APIs.
 Requires-Python: >=3.12
 Description-Content-Type: text/markdown
 Requires-Dist: instructor>=1.7.2
 Requires-Dist: langchain>=0.3.19
 Requires-Dist: langchain-openai>=0.3.11
 Requires-Dist: pillow>=11.1.0
 Requires-Dist: regex>=2024.11.6
 Requires-Dist: rich>=13.9.4
 Requires-Dist: colorama>=0.4.6
 Requires-Dist: spargear>=0.2.7
+Requires-Dist: dotenv>=0.9.9
 Provides-Extra: dev
 Requires-Dist: pyright>=1.1.401; extra == "dev"
 Provides-Extra: conversion
 Requires-Dist: youtube-transcript-api>=1.0.3; extra == "conversion"
 Requires-Dist: chatterer[browser]; extra == "conversion"
 Requires-Dist: chatterer[pdf]; extra == "conversion"
 Requires-Dist: chatterer[markdown]; extra == "conversion"
 Requires-Dist: chatterer[video]; extra == "conversion"
 Provides-Extra: browser
 Requires-Dist: playwright>=1.50.0; extra == "browser"
 Provides-Extra: pdf
 Requires-Dist: pymupdf>=1.25.4; extra == "pdf"
 Requires-Dist: pypdf>=5.4.0; extra == "pdf"
 Provides-Extra: markdown
 Requires-Dist: markitdown[all]>=0.1.1; extra == "markdown"
 Requires-Dist: markdownify>=1.1.0; extra == "markdown"
 Requires-Dist: commonmark>=0.9.1; extra == "markdown"
 Requires-Dist: mistune>=3.1.3; extra == "markdown"
 Provides-Extra: video
 Requires-Dist: pydub>=0.25.1; extra == "video"
 Provides-Extra: langchain
 Requires-Dist: langchain-anthropic>=0.3.10; extra == "langchain"
 Requires-Dist: langchain-google-genai>=2.1.1; extra == "langchain"
 Requires-Dist: langchain-ollama>=0.3.0; extra == "langchain"
 Requires-Dist: langchain-experimental>=0.3.4; extra == "langchain"
 Provides-Extra: all
 Requires-Dist: chatterer[dev]; extra == "all"
 Requires-Dist: chatterer[langchain]; extra == "all"
 Requires-Dist: chatterer[conversion]; extra == "all"

The only metadata changes in 0.1.25 are the version bump and the new `dotenv` dependency; the remainder of METADATA is the package README, identical in both versions:

# Chatterer

**Simplified, Structured AI Assistant Framework**

`chatterer` is a Python library designed as a type-safe LangChain wrapper for interacting with various language models (OpenAI, Anthropic, Google Gemini, Ollama, etc.). It supports structured outputs via Pydantic models, plain text responses, asynchronous calls, image description, code execution, and an interactive shell.

The structured reasoning in `chatterer` is inspired by the [Atom-of-Thought](https://github.com/qixucen/atom) pipeline.

---

## Quick Install

```bash
pip install chatterer
```
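
The metadata above also declares optional extras. For example, `pip install "chatterer[conversion]"` pulls in the browser, PDF, markdown, and video dependencies used by the conversion tools, and `pip install "chatterer[all]"` installs everything, including the extra LangChain providers.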

---

## Quickstart Example

Generate text quickly using OpenAI. Messages can be passed as plain strings or as structured lists:

```python
from chatterer import Chatterer, HumanMessage, AIMessage, SystemMessage

# Initialize the Chatterer with `openai`, `anthropic`, `google`, or `ollama` models
chatterer: Chatterer = Chatterer.openai("gpt-4.1")

# Get the response directly as a str
response: str = chatterer("What is the meaning of life?")
# response = chatterer([{ "role": "user", "content": "What is the meaning of life?" }])
# response = chatterer([("user", "What is the meaning of life?")])
# response = chatterer([HumanMessage("What is the meaning of life?")])
print(response)
```

Image and text content can be sent together:

```python
from chatterer import Base64Image, HumanMessage

# Load an image from a file or URL; returns a Base64Image, or None on failure
image = Base64Image.from_url_or_path("example.jpg")
# image = Base64Image.from_url_or_path("https://example.com/image.jpg")
assert image is not None, "Failed to load image"

# Alternatively, load an image from bytes
# with open("example.jpg", "rb") as f:
#     image = Base64Image.from_bytes(f.read(), ext="jpeg")

message = HumanMessage(["Describe the image", image.data_uri_content])
response: str = chatterer([message])
print(response)
```

---

## Structured Output with Pydantic

Define a Pydantic model and get typed responses:

```python
from pydantic import BaseModel

class AnswerModel(BaseModel):
    question: str
    answer: str

# Call with response_model
response: AnswerModel = chatterer("What's the capital of France?", response_model=AnswerModel)
print(response.question, response.answer)
```

---

## Async Example

Use asynchronous generation for non-blocking operations:

```python
import asyncio

async def main():
    response = await chatterer.agenerate("Explain async in Python briefly.")
    print(response)

asyncio.run(main())
```

---

## Streaming Structured Outputs

Stream structured responses in real time:

```python
from pydantic import BaseModel

class AnswerModel(BaseModel):
    text: str

chatterer = Chatterer.openai()
for chunk in chatterer.generate_pydantic_stream(AnswerModel, "Tell me a story"):
    print(chunk.text)
```

Asynchronous version:

```python
import asyncio

async def main():
    async for chunk in chatterer.agenerate_pydantic_stream(AnswerModel, "Tell me a story"):
        print(chunk.text)

asyncio.run(main())
```

---

## Image Description

Generate descriptions for images using the language model:

```python
description = chatterer.describe_image("https://example.com/image.jpg")
print(description)

# Customize the instruction
description = chatterer.describe_image("https://example.com/image.jpg", instruction="Describe the main objects in the image.")
```

An asynchronous version is also available:

```python
import asyncio

async def main():
    description = await chatterer.adescribe_image("https://example.com/image.jpg")
    print(description)

asyncio.run(main())
```

---

## Code Execution

Generate and execute Python code dynamically:

```python
result = chatterer.invoke_code_execution("Write a function to calculate factorial.")
print(result.code)
print(result.output)
```

An asynchronous version exists as well:

```python
import asyncio

async def main():
    result = await chatterer.ainvoke_code_execution("Write a function to calculate factorial.")
    print(result.output)

asyncio.run(main())
```

---

## Webpage to Markdown

Convert webpages to Markdown, optionally filtering content with the language model:

```python
from chatterer.tools.webpage_to_markdown import PlayWrightBot

with PlayWrightBot() as bot:
    # Basic conversion
    markdown = bot.url_to_md("https://example.com")
    print(markdown)

    # With LLM filtering and image descriptions
    filtered_md = bot.url_to_md_with_llm("https://example.com", describe_images=True)
    print(filtered_md)
```

Asynchronous version:

```python
import asyncio

async def main():
    async with PlayWrightBot() as bot:
        markdown = await bot.aurl_to_md_with_llm("https://example.com")
        print(markdown)

asyncio.run(main())
```

Extract specific elements:

```python
with PlayWrightBot() as bot:
    headings = bot.select_and_extract("https://example.com", "h2")
    print(headings)
```

---

## Citation Chunking

Chunk documents into semantic sections with citations:

```python
from chatterer import Chatterer
from chatterer.tools import citation_chunker

chatterer = Chatterer.openai()
document = "Long text about quantum computing..."
chunks = citation_chunker(document, chatterer, global_coverage_threshold=0.9)
for chunk in chunks:
    print(f"Subject: {chunk.name}")
    for source, matches in chunk.references.items():
        print(f"  Source: {source}, Matches: {matches}")
```

---

## Interactive Shell

Engage in a conversational AI session with code execution support:

```python
from chatterer import interactive_shell

interactive_shell()
```

This launches an interactive session where you can chat with the AI and execute code snippets. Type `quit` or `exit` to end the session.

---

## Atom-of-Thought Pipeline (AoT)

`AoTPipeline` provides structured reasoning inspired by the [Atom-of-Thought](https://github.com/qixucen/atom) approach. It decomposes complex questions recursively, generates answers to the sub-questions, and combines them via an ensemble process.

### AoT Usage Example

```python
from chatterer import Chatterer
from chatterer.strategies import AoTStrategy, AoTPipeline

pipeline = AoTPipeline(chatterer=Chatterer.openai(), max_depth=2)
strategy = AoTStrategy(pipeline=pipeline)

question = "What would Newton discover if hit by an apple falling from 100 meters?"
answer = strategy.invoke(question)
print(answer)

# Generate and inspect the reasoning graph
graph = strategy.get_reasoning_graph()
print(f"Graph: {len(graph.nodes)} nodes, {len(graph.relationships)} relationships")
```

**Note**: The AoT pipeline can optionally emit a reasoning graph, which can be stored in Neo4j for visualization and analysis. Install `neo4j_extension` and set up a Neo4j instance to use this feature:

```python
from neo4j_extension import Neo4jConnection

with Neo4jConnection() as conn:
    conn.upsert_graph(graph)
```

---

## Supported Models

Chatterer supports multiple language-model providers, each initialized with a single constructor call:

- **OpenAI**
- **Anthropic**
- **Google Gemini**
- **Ollama** (local models)

```python
openai_chatterer = Chatterer.openai("gpt-4o-mini")
anthropic_chatterer = Chatterer.anthropic("claude-3-7-sonnet-20250219")
gemini_chatterer = Chatterer.google("gemini-2.0-flash")
ollama_chatterer = Chatterer.ollama("deepseek-r1:1.5b")
```
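
Version 0.1.25 adds a runtime dependency on `dotenv`, so provider credentials can be loaded from a `.env` file before any of these constructors run. A minimal sketch, assuming the conventional environment variables read by the underlying LangChain integrations (e.g. `OPENAI_API_KEY`):

```python
# Sketch: load API keys from a local .env file before constructing models.
# Assumes the usual variable names (OPENAI_API_KEY, ANTHROPIC_API_KEY, ...)
# expected by the underlying LangChain provider packages.
from dotenv import load_dotenv

from chatterer import Chatterer

load_dotenv()
chatterer = Chatterer.openai("gpt-4o-mini")
```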

---

## Advanced Features

- **Streaming Responses**: Use `generate_stream` or `agenerate_stream` for real-time output (see the sketch after the examples below).
- **Streaming Structured Outputs**: Stream Pydantic-typed responses with `generate_pydantic_stream` or `agenerate_pydantic_stream`.
- **Async/Await Support**: All methods have asynchronous counterparts (e.g., `agenerate`, `adescribe_image`).
- **Structured Outputs**: Leverage Pydantic models for typed responses.
- **Image Description**: Generate descriptions for images with `describe_image`.
- **Code Execution**: Dynamically generate and execute Python code with `invoke_code_execution`.
- **Webpage to Markdown**: Convert webpages to Markdown with `PlayWrightBot`, including JavaScript rendering, element extraction, and LLM-based content filtering.
- **Citation Chunking**: Semantically chunk documents and extract citations with `citation_chunker`, including coverage analysis.
- **Interactive Shell**: Use `interactive_shell` for conversational AI with code execution.
- **Token Counting**: Retrieve input/output token counts with `get_num_tokens_from_message`.
- **Utilities**: Tools for content processing (e.g., `html_to_markdown`, `pdf_to_text`, `get_youtube_video_subtitle`) are available in the `tools` module.

```python
# Example: Convert PDF to text
from chatterer.tools import pdf_to_text
text = pdf_to_text("example.pdf")
print(text)

# Example: Get YouTube subtitles
from chatterer.tools import get_youtube_video_subtitle
subtitles = get_youtube_video_subtitle("https://www.youtube.com/watch?v=example")
print(subtitles)

# Example: Get token counts
from chatterer.messages import HumanMessage
msg = HumanMessage(content="Hello, world!")
tokens = chatterer.get_num_tokens_from_message(msg)
if tokens:
    input_tokens, output_tokens = tokens
    print(f"Input: {input_tokens}, Output: {output_tokens}")
```
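
Plain-text streaming has no dedicated example above, so here is a minimal sketch of the `generate_stream` bullet, assuming it yields incremental text chunks the way `generate_pydantic_stream` yields partial models:

```python
# Sketch: stream a plain-text response chunk by chunk.
# Assumes generate_stream yields string deltas; adapt if the actual
# return type differs.
from chatterer import Chatterer

chatterer = Chatterer.openai()
for chunk in chatterer.generate_stream("Explain tokenization in one paragraph."):
    print(chunk, end="", flush=True)
print()
```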

---

## Logging

Enable debugging with basic logging:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
```

The AoT pipeline uses a custom color-coded logger for detailed step-by-step output.
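
If root-level `DEBUG` output is too noisy, the level can be scoped to this package instead. A sketch, assuming `chatterer` follows the usual convention of module-name loggers:

```python
import logging

# Keep third-party libraries at INFO while debugging chatterer itself.
logging.basicConfig(level=logging.INFO)
logging.getLogger("chatterer").setLevel(logging.DEBUG)  # assumed logger name
```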

---

## Contributing

We welcome contributions! Feel free to open an issue or submit a pull request on the repository.

---

## License

MIT License