thoughtflow-0.0.1-py3-none-any.whl → thoughtflow-0.0.3-py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- thoughtflow/__init__.py +97 -5
- thoughtflow/_util.py +752 -0
- thoughtflow/action.py +357 -0
- thoughtflow/agent.py +66 -0
- thoughtflow/eval/__init__.py +34 -0
- thoughtflow/eval/harness.py +200 -0
- thoughtflow/eval/replay.py +137 -0
- thoughtflow/llm.py +250 -0
- thoughtflow/memory/__init__.py +32 -0
- thoughtflow/memory/base.py +1658 -0
- thoughtflow/message.py +140 -0
- thoughtflow/py.typed +2 -0
- thoughtflow/thought.py +1102 -0
- thoughtflow/thoughtflow6.py +4180 -0
- thoughtflow/tools/__init__.py +27 -0
- thoughtflow/tools/base.py +145 -0
- thoughtflow/tools/registry.py +122 -0
- thoughtflow/trace/__init__.py +34 -0
- thoughtflow/trace/events.py +183 -0
- thoughtflow/trace/schema.py +111 -0
- thoughtflow/trace/session.py +141 -0
- thoughtflow-0.0.3.dist-info/METADATA +215 -0
- thoughtflow-0.0.3.dist-info/RECORD +25 -0
- {thoughtflow-0.0.1.dist-info → thoughtflow-0.0.3.dist-info}/WHEEL +1 -2
- {thoughtflow-0.0.1.dist-info → thoughtflow-0.0.3.dist-info/licenses}/LICENSE +1 -1
- thoughtflow/jtools1.py +0 -25
- thoughtflow/jtools2.py +0 -27
- thoughtflow-0.0.1.dist-info/METADATA +0 -17
- thoughtflow-0.0.1.dist-info/RECORD +0 -8
- thoughtflow-0.0.1.dist-info/top_level.txt +0 -1
@@ -0,0 +1,4180 @@

DeveloperContext = """

# The Zen of Thoughtflow

The **Zen of Thoughtflow** is a set of guiding principles for building a framework that prioritizes simplicity, clarity, and flexibility. Thoughtflow is not meant to be a rigid system but a tool that helps developers create and explore freely. It's designed to stay light, modular, and focused, with Python at its core. The goal is to reduce complexity, maintain transparency, and ensure that functionality endures over time. Thoughtflow isn't about trying to please everyone—it's about building a tool that serves its purpose well, allowing developers to focus on their own path.

---

### 1. First Principles First
Thoughtflow is built on fundamental, simple concepts. Each piece should start with core truths, avoiding the temptation to build on excessive abstractions.

### 2. Complexity is the Enemy
Keep it simple. Thoughtflow should be Pythonic, intuitive, and elegant. Let ease of use guide every decision, ensuring the library remains as light as possible.

### 3. Obvious Over Abstract
If the user has to dig deep to understand what's going on, the design has failed. Everything should naturally reveal its purpose and operation.

### 4. Transparency is Trust
Thoughtflow must operate transparently. Users should never have to guess what's happening under the hood—understanding empowers, while opacity frustrates.

### 5. Backward Compatibility is Sacred
Code should endure. Deprecation should be rare, and backward compatibility must be respected to protect users' investments in their existing work.

### 6. Flexibility Over Rigidity
Provide intelligent defaults, but allow users infinite possibilities. Thoughtflow should never micromanage the user's experience—give them the freedom to define their journey.

### 7. Minimize Dependencies, Pack Light
Thoughtflow should rely only on minimal, light libraries. Keep the dependency tree shallow, and ensure it's always feasible to deploy the library in serverless architectures.

### 8. Clarity Over Cleverness
Documentation, code, and design must be explicit and clear, not implicit or convoluted. Guide users, both beginners and experts, with straightforward tutorials and examples.

### 9. Modularity is Better than Monolith
Thoughtflow should be a collection of lightweight, composable pieces. Never force the user into an all-or-nothing approach—each component should be able to stand alone. Every builder loves legos.

### 10. Accommodate Both Beginners and Experts
Thoughtflow should grow with its users. Provide frictionless onboarding for beginners while offering flexibility for advanced users to scale and customize as needed.

### 11. Make a Vehicle, Not a Destination
Thoughtflow should focus on the structuring and intelligent sequencing of user-defined thoughts. Classes should be as generalizable as possible, and logic should be easily exported and imported via thought files.

### 12. Good Documentation Accelerates Usage
Documentation and tutorials must be clear, comprehensive, and always up-to-date. They should guide users at every turn, ensuring knowledge is readily available.

### 13. Don't Try to Please Everyone
Thoughtflow is focused and light. It isn't designed to accommodate every possible use case, and that's intentional. Greatness comes from focus, not from trying to do everything.

### 14. Python is King
Thoughtflow is built to be Pythonic. Python is the first-class citizen, and every integration, feature, and extension should honor Python's language and philosophy.

---

Thoughtflow is designed to be a sophisticated AI agent framework for building intelligent, memory-aware systems that can think, act, and maintain persistent state.

---

# Thoughtflow — Design Document (Plain-English Spec for a Single-File Base Implementation)

This document explains **exactly** how to engineer Thoughtflow in simple, idiomatic Python. It is meant to live at the top of a *single Python script* that defines the foundational classes and helper functions. It is written for a reader with **zero** prior exposure to Thoughtflow.

Thoughtflow is a **Pythonic cognitive engine**. You write ordinary Python—`for`/`while`, `if/elif/else`, `try/except`, and small classes—no graphs, no hidden DSLs. A *flow* is "just a function" that accepts a `MEMORY` object and returns that same `MEMORY` object, modified. Cognition is built from four primitives:

1. **LLM** — A tiny wrapper around a chat-style language model API.
2. **MEMORY** — The single state container that keeps messages, events, logs, reflections, and variables.
3. **THOUGHT** — The unit of cognition: Prompt + Context + LLM + Parsing + Validation (+ Retries + Logging).
4. **ACTION** — Anything the agent *does* (respond, call an HTTP API, write a file, query a vector store, etc.), with consistent logging.

The rest of this spec describes **design philosophy**, **object contracts**, **method/attribute lists**, **data conventions**, and **how everything fits together**—plus example usage that the finished library should support.

---

## Final Notes on Style

* Keep constructors short and forgiving; let users pass just a few arguments.
* Prefer small, pure helpers (parsers/validators) over big class hierarchies.
* Do not hide failures; always leave a visible trace in `logs` and `events`.
* Default behaviors should serve 90% of use cases; exotic needs belong in user code.

"""

#############################################################################
#############################################################################

### IMPORTS AND SETTINGS

import os, sys, time, pickle, json, uuid
import http, urllib, socket, ssl, gzip, copy
import urllib.request
import pprint
import random
import re, ast
import hashlib
from typing import Mapping, Any, Iterable, Optional, Tuple, Union

from random import randint
from functools import reduce

import datetime as dtt
from zoneinfo import ZoneInfo

tz_bog = ZoneInfo("America/Bogota")
tz_utc = ZoneInfo("UTC")


#############################################################################
#############################################################################

### EVENT STAMP LOGIC

class EventStamp:
    """
    Generates and decodes deterministic event stamps using Base62 encoding.

    Event stamps combine encoded time, document hash, and random components
    into a compact 16-character identifier.

    Usage:
        EventStamp.stamp()          # Generate a new stamp
        EventStamp.decode_time(s)   # Decode timestamp from stamp
        EventStamp.hashify("text")  # Generate deterministic hash
    """

    CHARSET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

    @staticmethod
    def sha256_hash(input_string):
        """Generate a SHA-256 hash and return it as an integer."""
        hash_bytes = hashlib.sha256(input_string.encode("utf-8")).digest()
        return int.from_bytes(hash_bytes, byteorder="big")

    @staticmethod
    def base62_encode(number, length):
        """Encode an integer into a fixed-length Base62 string."""
        base = len(EventStamp.CHARSET)
        encoded = []
        for _ in range(length):
            number, remainder = divmod(number, base)
            encoded.append(EventStamp.CHARSET[remainder])
        return ''.join(encoded[::-1])  # Reverse to get correct order

    @staticmethod
    def hashify(input_string, length=32):
        """Generate a deterministic hash using all uppercase/lowercase letters and digits."""
        hashed_int = EventStamp.sha256_hash(input_string)
        return EventStamp.base62_encode(hashed_int, length)

    @staticmethod
    def encode_num(num, charset=None):
        """Encode a number in the given base/charset."""
        if charset is None:
            charset = EventStamp.CHARSET
        base = len(charset)
        if num < base:
            return charset[num]
        return EventStamp.encode_num(num // base, charset) + charset[num % base]

    @staticmethod
    def decode_num(encoded_str, charset=None):
        """Decode a base-encoded string back to an integer."""
        if charset is None:
            charset = EventStamp.CHARSET
        base = len(charset)
        char_to_value = {c: i for i, c in enumerate(charset)}
        return reduce(lambda num, c: num * base + char_to_value[c], encoded_str, 0)

    @staticmethod
    def encode_time(unix_time=0):
        """Encode the current (or a given) unix time."""
        if unix_time == 0:
            t = int(time.time() * 10000)
        else:
            t = int(unix_time * 10000)
        return EventStamp.encode_num(t)

    @staticmethod
    def encode_doc(doc={}):
        """Encode a document/value to a 5-character hash."""
        return EventStamp.hashify(str(doc), 5)

    @staticmethod
    def encode_rando(length=3):
        """Generate a random code of specified length."""
        n = randint(300000, 900000)
        c = '000' + EventStamp.encode_num(n)
        return c[-length:]

    @staticmethod
    def stamp(doc={}):
        """
        Generate an event stamp.

        Combines encoded time, document hash, and random component
        into a 16-character identifier.
        """
        time_code = EventStamp.encode_time()
        rando_code = EventStamp.encode_rando()
        if len(str(doc)) > 2:
            doc_code = EventStamp.encode_doc(doc)
        else:
            arb = time_code + rando_code
            doc_code = EventStamp.encode_doc(arb)
        return (time_code + doc_code + rando_code)[:16]

    @staticmethod
    def decode_time(stamp, charset=None):
        """Decode the time component from an event stamp."""
        if charset is None:
            charset = EventStamp.CHARSET
        stamp_prefix = stamp[:8]
        scaled_time = EventStamp.decode_num(stamp_prefix, charset)
        unix_time_seconds = scaled_time / 10000
        return unix_time_seconds


# Backwards compatibility aliases
event_stamp = EventStamp.stamp
hashify = EventStamp.hashify
encode_num = EventStamp.encode_num
decode_num = EventStamp.decode_num
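The stamp's key property is that its first eight characters are a Base62 encoding of the creation time at 0.1 ms resolution, so stamps sort chronologically and the timestamp is recoverable. A minimal self-contained sketch of that roundtrip, restating the encoder rather than importing the class (the sample `unix_time` value is made up for illustration):

```python
from functools import reduce

CHARSET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

def encode_num(num):
    # Recursive Base62 encoding, most significant digit first
    base = len(CHARSET)
    if num < base:
        return CHARSET[num]
    return encode_num(num // base) + CHARSET[num % base]

def decode_num(encoded):
    # Inverse of encode_num
    base = len(CHARSET)
    value = {c: i for i, c in enumerate(CHARSET)}
    return reduce(lambda acc, c: acc * base + value[c], encoded, 0)

# Scale a unix time to 0.1 ms resolution, as encode_time does
unix_time = 1700000000.1234
code = encode_num(int(unix_time * 10000))
recovered = decode_num(code) / 10000
```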

#############################################################################
#############################################################################

### HELPER FUNCTIONS


default_header = '''
Markers like <start … l4zk> and </end … l4zk>
indicate where a text section begins and ends.
Never mix boundaries. Each block is separate.
This is to improve your ease-of-reading.
'''

def construct_prompt(
    prompt_obj={},
    order=[],
    header='',
):
    if order:
        sections = list(order)
    else:
        sections = list(prompt_obj)
    rnum = str(randint(1, 9))
    stamp = event_stamp()[-4:].lower()
    stamp = stamp[:2] + rnum + stamp[2:]
    L = []
    if header:
        if header == 'default':
            L.append(default_header + '\n')
        else:
            L.append(header + '\n\n')
    L.append('<start prompt stamp>\n\n')
    for s in sections:
        text = prompt_obj[s]
        s2 = s.strip().replace(' ', '_')
        label1 = "<start " + s2 + " stamp>\n"
        label2 = "\n</end " + s2 + " stamp>\n\n"
        L.append(label1 + text + label2)
    L.append('</end prompt stamp>')
    prompt = ''.join(L).replace(' stamp>', ' ' + stamp + '>')
    return prompt
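For illustration, a minimal sketch of the marker format that construct_prompt produces, with a hard-coded stamp (the value 'k9zq' is made up here) instead of a freshly generated random event stamp:

```python
def wrap_sections(prompt_obj, stamp):
    # Mirrors construct_prompt's wrapping, with the stamp fixed for clarity
    parts = ['<start prompt {}>\n\n'.format(stamp)]
    for name, text in prompt_obj.items():
        tag = name.strip().replace(' ', '_')
        parts.append('<start {0} {1}>\n{2}\n</end {0} {1}>\n\n'.format(tag, stamp, text))
    parts.append('</end prompt {}>'.format(stamp))
    return ''.join(parts)

prompt = wrap_sections({'task': 'Summarize the text.', 'input text': 'Hello.'}, 'k9zq')
```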

def construct_msgs(
    usr_prompt='',
    vars={},
    sys_prompt='',
    msgs=None,
):
    # Copy the incoming list: a shared default (or the caller's list) must not
    # accumulate messages across calls.
    msgs = list(msgs) if msgs else []
    if sys_prompt:
        if isinstance(sys_prompt, dict):
            sys_prompt = construct_prompt(sys_prompt)
        msgs.insert(0, {'role': 'system', 'content': sys_prompt})
    if usr_prompt:
        if isinstance(usr_prompt, dict):
            usr_prompt = construct_prompt(usr_prompt)
        msgs.append({'role': 'user', 'content': usr_prompt})
    msgs2 = []
    for m in msgs:
        m_copy = dict(m)
        if isinstance(m_copy.get("content"), str):
            for k, v in vars.items():
                m_copy["content"] = m_copy["content"].replace(k, str(v))
        msgs2.append(m_copy)
    return msgs2
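A self-contained sketch of the substitution pass above: each message is shallow-copied and placeholder tokens (here the made-up token {NAME}) are replaced in its content, leaving the original list untouched:

```python
def substitute(msgs, vars):
    # Shallow-copy each message and replace placeholder tokens in its content
    out = []
    for m in msgs:
        m = dict(m)
        if isinstance(m.get('content'), str):
            for k, v in vars.items():
                m['content'] = m['content'].replace(k, str(v))
        out.append(m)
    return out

msgs = [{'role': 'system', 'content': 'You help {NAME}.'},
        {'role': 'user', 'content': 'Hi, I am {NAME}.'}]
result = substitute(msgs, {'{NAME}': 'Ada'})
```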


#############################################################################


class ValidExtractError(ValueError):
    """Raised when extraction or validation fails."""

def valid_extract(raw_text: str, parsing_rules: Mapping[str, Any]) -> Any:
    """
    Extract and validate a target Python structure from noisy LLM text.

    Parameters
    ----------
    raw_text : str
        The original model output (may include extra prose, code fences, etc.).
    parsing_rules : dict
        Rules controlling extraction/validation. Required keys:
        - 'kind': currently supports 'python' (default). ('json' also supported.)
        - 'format': schema describing the expected structure, e.g.
          [], {}, {'name': ''}, {'num_list': [], 'info': {}}

        Schema language:
          * []           : list of anything
          * [schema]     : list of items matching 'schema'
          * {}           : dict of anything
          * {'k': sch}   : dict with required key 'k' matching 'sch'
          * {'k?': sch}  : OPTIONAL key 'k' (if present, must match 'sch')
          * '' or str    : str
          * 0 or int     : int
          * 0.0 or float : float
          * True/False or bool: bool
          * None         : NoneType

    Returns
    -------
    Any
        The parsed Python object that satisfies the schema.

    Raises
    ------
    ValidExtractError
        If extraction fails or the parsed object does not validate against the schema.

    Examples
    --------
    >>> rules = {'kind': 'python', 'format': []}
    >>> txt = "Here you go:\\n```python\\n[1, 2, 3]\\n```\\nLet me know!"
    >>> valid_extract(txt, rules)
    [1, 2, 3]

    >>> rules = {'kind': 'python', 'format': {'num_list': [], 'my_info': {}, 'name': ''}}
    >>> txt = "noise { 'num_list':[1,2], 'my_info':{'x':1}, 'name':'Ada' } trailing"
    >>> valid_extract(txt, rules)
    {'num_list': [1, 2], 'my_info': {'x': 1}, 'name': 'Ada'}
    """
    if not isinstance(parsing_rules, Mapping):
        raise ValidExtractError("parsing_rules must be a mapping.")

    kind = parsing_rules.get("kind", "python")
    schema = parsing_rules.get("format", None)
    if schema is None:
        raise ValidExtractError("parsing_rules['format'] is required.")

    # 1) Collect candidate text segments in a robust order.
    candidates: Iterable[str] = _candidate_segments(raw_text, schema, prefer_fences_first=True)

    last_err: Optional[Exception] = None
    for segment in candidates:
        try:
            obj = _parse_segment(segment, kind=kind)
        except Exception as e:
            last_err = e
            continue

        ok, msg = _validate_schema(obj, schema)
        if ok:
            return obj
        last_err = ValidExtractError("Validation failed for candidate: {}".format(msg))

    # If we got here, nothing parsed+validated.
    if last_err:
        raise ValidExtractError(str(last_err))
    raise ValidExtractError("No parseable candidates found.")

# ----------------------------
# Parsing helpers
# ----------------------------

# Fence regex: accepts inline fences (``` ... ``` on one line) as well as
# full-line fenced blocks with an optional language hint.
_FENCE_RE = re.compile(
    r"```(?P<lang>[a-zA-Z0-9_\-\.]*)\s*\n?(?P<body>.*?)```",
    re.DOTALL
)

def _candidate_segments(raw_text: str, schema: Any, prefer_fences_first: bool = True) -> Iterable[str]:
    """
    Yield candidate substrings likely to contain the target structure.

    Strategy:
      1) Fenced code blocks (```) first, in order, if requested.
      2) Balanced slice for the top-level delimiter suggested by the schema.
      3) As a fallback, return raw_text itself (last resort).
    """
    # 1) From code fences
    if prefer_fences_first:
        for m in _FENCE_RE.finditer(raw_text):
            lang = (m.group("lang") or "").strip().lower()
            body = m.group("body")
            # If the fence declares "python" or "json", prioritize; otherwise still try.
            yield body

    # 2) From a balanced slice based on the schema's top-level delimiter
    opener, closer = _delims_for_schema(schema)
    if opener and closer:
        slice_ = _balanced_slice(raw_text, opener, closer)
        if slice_ is not None:
            yield slice_

    # 3) Whole text (very last resort)
    yield raw_text
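A quick self-contained check of the fence pattern described above (the same regex, restated here; the triple backticks are built programmatically only so this example can live inside a fenced block itself), showing that it captures both a full-line fence with a language hint and an inline fence without one:

```python
import re

TICKS = "`" * 3  # literal ``` delimiter, built programmatically
FENCE_RE = re.compile(
    TICKS + r"(?P<lang>[a-zA-Z0-9_\-\.]*)\s*\n?(?P<body>.*?)" + TICKS,
    re.DOTALL,
)

text = ("intro " + TICKS + "python\n[1, 2]\n" + TICKS
        + " middle " + TICKS + "{'a': 1}" + TICKS + " end")
bodies = [m.group("body") for m in FENCE_RE.finditer(text)]
langs = [m.group("lang") for m in FENCE_RE.finditer(text)]
```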

def _parse_segment(segment: str, kind: str = "python") -> Any:
    """
    Parse a segment into a Python object according to 'kind'.
    - python: ast.literal_eval
    - json: json.loads (with fallback: try literal_eval if JSON fails, for LLM single-quote dicts)
    """
    text = segment.strip()

    if kind == "python":
        # Remove leading language hints often kept when copying from fences
        if text.startswith("python\n"):
            text = text[len("python\n"):].lstrip()
        return ast.literal_eval(text)

    if kind == "json":
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            # LLMs often return Python-style dicts (single quotes). Try literal_eval as a fallback.
            return ast.literal_eval(text)

    raise ValidExtractError("Unsupported kind: {!r}".format(kind))
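A self-contained sketch of the json-with-literal_eval fallback used above: strict JSON parses directly, while a Python-style single-quoted dict falls through to ast.literal_eval:

```python
import ast
import json

def parse_loose(text):
    # Try strict JSON first, then fall back to Python literal syntax
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)

strict = parse_loose('{"a": 1}')   # valid JSON
loose = parse_loose("{'a': 1}")    # single quotes: invalid JSON, valid Python
```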

def _delims_for_schema(schema: Any) -> Tuple[Optional[str], Optional[str]]:
    """
    Infer top-level delimiters from the schema.
    - list-like  → [ ]
    - dict-like  → { }
    - tuple-like (if used) → ( )
    - string/number/bool/None → no delimiters (None, None)
    """
    if isinstance(schema, list):
        return "[", "]"
    if isinstance(schema, dict):
        return "{", "}"
    # tuple schema (rare, but supported)
    if isinstance(schema, tuple):
        return "(", ")"
    # primitives: cannot infer a unique delimiter; return None
    return None, None


def _balanced_slice(text: str, open_ch: str, close_ch: str) -> Optional[str]:
    """
    Return the first balanced substring between open_ch and close_ch,
    scanning from the *first occurrence of open_ch* (so prose apostrophes
    before the opener don't confuse quote tracking).
    """
    start = text.find(open_ch)
    if start == -1:
        return None

    depth = 0
    in_str: Optional[str] = None  # quote char if inside ' or "
    escape = False
    i = start

    while i < len(text):
        ch = text[i]
        if in_str:
            if escape:
                escape = False
            elif ch == "\\":
                escape = True
            elif ch == in_str:
                in_str = None
        else:
            if ch in ("'", '"'):
                in_str = ch
            elif ch == open_ch:
                depth += 1
            elif ch == close_ch and depth > 0:
                depth -= 1
                if depth == 0:
                    return text[start : i + 1]
        i += 1
    return None
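A self-contained check of the quote-aware balanced scan (the logic above restated compactly), showing that braces inside string literals do not affect nesting depth:

```python
def balanced_slice(text, open_ch, close_ch):
    # Scan from the first opener, tracking quotes and escapes so that
    # delimiters inside string literals are ignored
    start = text.find(open_ch)
    if start == -1:
        return None
    depth = 0
    in_str = None
    escape = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_str:
            if escape:
                escape = False
            elif ch == "\\":
                escape = True
            elif ch == in_str:
                in_str = None
        else:
            if ch in ("'", '"'):
                in_str = ch
            elif ch == open_ch:
                depth += 1
            elif ch == close_ch and depth > 0:
                depth -= 1
                if depth == 0:
                    return text[start:i + 1]
    return None

# The '}}' inside the quoted value must not close the outer dict
got = balanced_slice("noise {'brace': '}}', 'n': {'x': 1}} tail", "{", "}")
```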


# ----------------------------
# Schema validation
# ----------------------------

def _is_optional_key(k: str) -> Tuple[str, bool]:
    """Return (base_key, optional_flag) for keys with a trailing '?'."""
    if isinstance(k, str) and k.endswith("?"):
        return k[:-1], True
    return k, False

def _schema_type(schema: Any) -> Union[type, Tuple[type, ...], None]:
    """
    Map schema exemplars to Python types.
    Accepts either exemplar values ('' -> str, 0 -> int, 0.0 -> float, True -> bool, None -> NoneType)
    OR actual types (str, int, float, bool).
    """
    if schema is None:
        return type(None)
    if schema is str or isinstance(schema, str):
        return str
    if schema is int or (isinstance(schema, int) and not isinstance(schema, bool)):
        return int
    if schema is float or isinstance(schema, float):
        return float
    if schema is bool or isinstance(schema, bool):
        return bool
    if schema is list:
        return list
    if schema is dict:
        return dict
    if schema is tuple:
        return tuple
    return None  # composite or unknown marker

def _validate_schema(obj: Any, schema: Any, path: str = "$") -> Tuple[bool, str]:
    """
    Recursively validate 'obj' against 'schema'. Returns (ok, message).
    """
    # 1) Primitive types via exemplar or type
    t = _schema_type(schema)
    if t is not None and t not in (list, dict, tuple):
        if isinstance(obj, t):
            return True, "ok"
        return False, "{}: expected {}, got {}".format(path, t.__name__, type(obj).__name__)

    # 2) List schemas
    if isinstance(schema, list):
        if not isinstance(obj, list):
            return False, "{}: expected list, got {}".format(path, type(obj).__name__)
        # If schema is [], any list passes
        if len(schema) == 0:
            return True, "ok"
        # If schema is [subschema], every element must match subschema
        if len(schema) == 1:
            subschema = schema[0]
            for i, el in enumerate(obj):
                ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
                if not ok:
                    return ok, msg
            return True, "ok"
        # Otherwise treat as "structure-by-position" (rare)
        if len(obj) != len(schema):
            return False, "{}: expected list length {}, got {}".format(path, len(schema), len(obj))
        for i, (el, subschema) in enumerate(zip(obj, schema)):
            ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
            if not ok:
                return ok, msg
        return True, "ok"

    # 3) Dict schemas
    if isinstance(schema, dict):
        if not isinstance(obj, dict):
            return False, "{}: expected dict, got {}".format(path, type(obj).__name__)

        # Check required/optional keys in schema
        for skey, subschema in schema.items():
            base_key, optional = _is_optional_key(skey)
            if base_key not in obj:
                if optional:
                    continue
                return False, "{}: missing required key '{}'".format(path, base_key)
            ok, msg = _validate_schema(obj[base_key], subschema, "{}.{}".format(path, base_key))
            if not ok:
                return ok, msg
        return True, "ok"

    # 4) Tuple schemas (optional)
    if isinstance(schema, tuple):
        if not isinstance(obj, tuple):
            return False, "{}: expected tuple, got {}".format(path, type(obj).__name__)
        if len(schema) == 0:
            return True, "ok"
        if len(schema) == 1:
            subschema = schema[0]
            for i, el in enumerate(obj):
                ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
                if not ok:
                    return ok, msg
            return True, "ok"
        if len(obj) != len(schema):
            return False, "{}: expected tuple length {}, got {}".format(path, len(schema), len(obj))
        for i, (el, subschema) in enumerate(zip(obj, schema)):
            ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
            if not ok:
                return ok, msg
        return True, "ok"

    # 5) Type objects (e.g., list, dict) were handled above; unknown markers:
    st = type(schema).__name__
    return False, "{}: unsupported schema marker of type {!r}".format(path, st)
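One detail in the exemplar mapping above deserves a sanity check: bool is a subclass of int in Python, so the int branch must exclude bools for the exemplar True to map to bool rather than int. A minimal restatement of just that ordering:

```python
def exemplar_type(schema):
    # Order matters: bool is a subclass of int, so test for bool-ness
    # before treating the exemplar as an int
    if isinstance(schema, bool):
        return bool
    if isinstance(schema, int):
        return int
    if isinstance(schema, float):
        return float
    if isinstance(schema, str):
        return str
    return None

t_true = exemplar_type(True)
t_zero = exemplar_type(0)
```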


ParsingExamples = """

# Examples showing how to use the valid_extract function
#------------------------------------------------------------------

# Basic list
txt = "Noise before ```python\n[1, 2, 3]\n``` noise after"
rules = {"kind": "python", "format": []}
assert valid_extract(txt, rules) == [1, 2, 3]

# Basic dict
txt2 = "Header\n{ 'a': 1, 'b': 2 }\nFooter"
rules2 = {"kind": "python", "format": {}}
assert valid_extract(txt2, rules2) == {"a": 1, "b": 2}

# Nested dict with types
txt3 = "reply: { 'num_list':[1,2,3], 'my_info':{'x':1}, 'name':'Ada' } ok."
rules3 = {"kind": "python",
          "format": {'num_list': [int], 'my_info': {}, 'name': ''}}
assert valid_extract(txt3, rules3)["name"] == "Ada"

# Optional key example (inline fence)
txt4 = " I think this is how I'd answer: ``` {'a': 1}``` is this good enough?"
rules4 = {"kind": "python", "format": {'a': int, 'b?': ''}}
assert valid_extract(txt4, rules4) == {'a': 1}

# Bare structure with surrounding prose (balanced-slice fallback)
txt5 = "noise before {'a': 1} and after"
assert valid_extract(txt5, rules4) == {'a': 1}

# Inline fence with a list-of-dicts schema
txt6 = "inline ```[{'k': 'v'}]```"
assert valid_extract(txt6, {"kind": "python", "format": [{"k": ""}]}) == [{"k": "v"}]

"""
|
|
644
|
+
|
|
645
|
+
|
|
646
|
+
#############################################################################
|
|
647
|
+
#############################################################################
|
|
648
|
+
#############################################################################
|
|
649
|
+
|
|
650
|
+
|
|
651
|
+
### LLM CLASS
|
|
652
|
+
|
|
653
|
+
class LLM:
    """
    The LLM class is designed to interface with various language model services.

    Attributes:
        service (str): The name of the service provider (e.g., 'openai', 'groq', 'anthropic').
        model (str): The specific model to be used within the service.
        api_key (str): The API key for authenticating requests.
        api_secret (str): The API secret for additional authentication.
        last_params (dict): Stores the parameters used in the last API call.

    Methods:
        __init__(model_id, key, secret):
            Initializes the LLM instance with a model ID, API key, and secret.

        call(msg_list, params):
            Calls the appropriate API based on the service with the given message list and parameters.

        _call_openai(msg_list, params):
            Sends a request to the OpenAI API with the specified messages and parameters.

        _call_groq(msg_list, params):
            Sends a request to the Groq API with the specified messages and parameters.

        _call_anthropic(msg_list, params):
            Sends a request to the Anthropic API with the specified messages and parameters.

        _send_request(url, data, headers):
            Helper function to send HTTP requests to the specified URL with data and headers.
    """
    def __init__(self, model_id='', key='API_KEY', secret='API_SECRET'):
        # Parse model ID and initialize service and model name
        if ':' not in model_id: model_id = 'openai:gpt-4-turbo'

        splitted = model_id.split(':')
        self.service = splitted[0]
        # Re-join with ':' so model names that themselves contain colons
        # (e.g. ollama tags like 'llama3:8b') are preserved
        self.model = ':'.join(splitted[1:])
        self.api_key = key
        self.api_secret = secret
        self.last_params = {}

    def __call__(self, msg_list, params=None):
        # Make the object directly callable. Note: assigning self.__call__
        # in __init__ would not work, since Python looks up special methods
        # on the class, not the instance.
        return self.call(msg_list, params or {})
    def _normalize_messages(self, msg_list):
        """
        Accepts either:
        - list[str] -> converts to [{'role':'user','content': str}, ...]
        - list[dict] with 'role' and 'content' -> passes through unchanged
        - list[dict] with only 'content' -> assumes role='user'
        Returns: list[{'role': str, 'content': str or list[...]}]
        """
        norm = []
        for m in msg_list:
            if isinstance(m, dict):
                role = m.get("role", "user")
                content = m.get("content", "")
                norm.append({"role": role, "content": content})
            else:
                # treat as plain user text
                norm.append({"role": "user", "content": str(m)})
        return norm
    def call(self, msg_list, params=None):
        # Avoid a shared mutable default argument
        params = dict(params) if params else {}
        self.last_params = dict(params)
        # General function to call the appropriate API with msg_list and optional parameters
        if self.service == 'openai':
            return self._call_openai(msg_list, params)
        elif self.service == 'groq':
            return self._call_groq(msg_list, params)
        elif self.service == 'anthropic':
            return self._call_anthropic(msg_list, params)
        elif self.service == 'ollama':
            return self._call_ollama(msg_list, params)
        elif self.service == 'gemini':
            return self._call_gemini(msg_list, params)
        elif self.service == 'openrouter':
            return self._call_openrouter(msg_list, params)
        else:
            raise ValueError("Unsupported service '{}'.".format(self.service))
    def _call_openai(self, msg_list, params):
        url = "https://api.openai.com/v1/chat/completions"
        data = json.dumps({
            "model": self.model,
            "messages": self._normalize_messages(msg_list),
            **params
        }).encode("utf-8")
        headers = {
            "Authorization": "Bearer " + self.api_key,
            "Content-Type": "application/json",
        }
        res = self._send_request(url, data, headers)
        choices = [a["message"]["content"] for a in res.get("choices", [])]
        return choices
    def _call_groq(self, msg_list, params):
        url = "https://api.groq.com/openai/v1/chat/completions"
        data = json.dumps({
            "model": self.model,
            "messages": self._normalize_messages(msg_list),
            **params
        }).encode("utf-8")
        headers = {
            "Authorization": "Bearer " + self.api_key,
            "Content-Type": "application/json",
            "User-Agent": "Groq/Python 0.9.0",
        }
        res = self._send_request(url, data, headers)
        choices = [a["message"]["content"] for a in res.get("choices", [])]
        return choices
    def _call_anthropic(self, msg_list, params):
        url = "https://api.anthropic.com/v1/messages"
        data = json.dumps({
            "model": self.model,
            "max_tokens": params.get("max_tokens", 1024),
            "messages": self._normalize_messages(msg_list),
        }).encode("utf-8")
        headers = {
            "x-api-key": self.api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        }
        res = self._send_request(url, data, headers)
        # Anthropic returns {"content":[{"type":"text","text":"..."}], ...}
        choices = [c.get("text", "") for c in res.get("content", [])]
        return choices
    def _call_gemini(self, msg_list, params):
        """
        Calls Google Gemini / Vertex AI chat-supported models via REST API.
        Requires self.api_key to be set.
        """
        url = "https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}".format(self.model, self.api_key)
        # Gemini expects a list of "contents" alternating user/assistant.
        # We collapse the messages into the sequence of dicts Gemini requires:
        # [{"role": "user" | "model", "parts": [{"text": ...}]}]
        gemini_msgs = []
        for m in self._normalize_messages(msg_list):
            # Google's role scheme: "user" or "model"
            g_role = {"user": "user", "assistant": "model", "system": "user"}.get(m["role"], "user")
            gemini_msgs.append({
                "role": g_role,
                "parts": [{"text": str(m["content"])}] if isinstance(m["content"], str) else m["content"]
            })
        payload = {
            "contents": gemini_msgs,
            **{k: v for k, v in params.items() if k != "model"}
        }
        data = json.dumps(payload).encode("utf-8")
        headers = {
            "Content-Type": "application/json",
        }
        res = self._send_request(url, data, headers)
        # Gemini returns { "candidates": [ { "content": { "parts": [ { "text": ... } ] } } ] }
        choices = []
        for cand in res.get("candidates", []):
            parts = cand.get("content", {}).get("parts", [])
            text = "".join([p.get("text", "") for p in parts])
            choices.append(text)
        return choices
    def _call_openrouter(self, msg_list, params):
        """
        Calls an LLM via the OpenRouter API. Requires self.api_key.
        API docs: https://openrouter.ai/docs
        Model list: https://openrouter.ai/docs#models
        """
        url = "https://openrouter.ai/api/v1/chat/completions"
        data = json.dumps({
            "model": self.model,
            "messages": self._normalize_messages(msg_list),
            **params
        }).encode("utf-8")
        headers = {
            "Authorization": "Bearer " + self.api_key,
            "Content-Type": "application/json",
            "HTTP-Referer": params.get("referer", "https://your-app.com"),
            "X-Title": params.get("title", "Thoughtflow"),
        }
        res = self._send_request(url, data, headers)
        choices = [a["message"]["content"] for a in res.get("choices", [])]
        return choices
    def _call_ollama(self, msg_list, params):
        """
        Calls a local model served via Ollama (http://localhost:11434 by default).
        Expects no authentication. Ollama messages format is like OpenAI's.
        """
        base_url = params.get("ollama_url", "http://localhost:11434")
        url = base_url.rstrip('/') + "/api/chat"
        payload = {
            "model": self.model,
            "messages": self._normalize_messages(msg_list),
            "stream": False,  # Disable streaming to get a single JSON response
            **{k: v for k, v in params.items() if k not in ("ollama_url", "model")}
        }
        data = json.dumps(payload).encode("utf-8")
        headers = {
            "Content-Type": "application/json",
        }
        res = self._send_request(url, data, headers)
        # Ollama returns {"message": {...}, ...} or {"choices": [{...}]}
        # Prefer OpenAI-style extraction if available, else fallback
        if "choices" in res:
            choices = [a["message"]["content"] for a in res.get("choices", [])]
        elif "message" in res:
            # single result
            msg = res["message"]
            choices = [msg.get("content", "")]
        elif "response" in res:
            # streaming/fallback
            choices = [res["response"]]
        else:
            choices = []
        return choices
    def _send_request(self, url, data, headers):
        # Sends the actual HTTP request and handles the response
        try:
            req = urllib.request.Request(url, data=data, headers=headers)
            with urllib.request.urlopen(req) as response:
                response_data = response.read().decode("utf-8")
                # Attempt to parse JSON response; handle plain-text responses
                try:
                    return json.loads(response_data)  # Parse JSON response
                except json.JSONDecodeError:
                    # If response is not JSON, return it as-is in a structured format
                    return {"error": "Non-JSON response", "response_data": response_data}

        except urllib.error.HTTPError as e:
            # Return the error details in case of an HTTP error
            error_msg = e.read().decode("utf-8")
            print("HTTP Error:", error_msg)  # Log HTTP error for debugging
            # Guard against error bodies that are not JSON themselves
            try:
                return {"error": json.loads(error_msg) if error_msg else "Unknown HTTP error"}
            except json.JSONDecodeError:
                return {"error": error_msg}
        except Exception as e:
            return {"error": str(e)}

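A quick, offline sanity check of the two parsing conventions the class relies on, with no network involved. This is a sketch mirroring the logic of `LLM.__init__` and `LLM._normalize_messages` above; the standalone helper names `parse_model_id` and `normalize_messages` are ours, not part of the class.

```python
def parse_model_id(model_id):
    # Mirrors LLM.__init__: apply the default, then split service from model.
    # Re-joining with ':' preserves model names that contain colons.
    if ':' not in model_id:
        model_id = 'openai:gpt-4-turbo'
    parts = model_id.split(':')
    return parts[0], ':'.join(parts[1:])

def normalize_messages(msg_list):
    # Mirrors LLM._normalize_messages: strings become user messages,
    # dicts pass through with a default role of 'user'.
    norm = []
    for m in msg_list:
        if isinstance(m, dict):
            norm.append({'role': m.get('role', 'user'), 'content': m.get('content', '')})
        else:
            norm.append({'role': 'user', 'content': str(m)})
    return norm

print(parse_model_id('ollama:llama3:8b'))  # service 'ollama', model 'llama3:8b'
print(normalize_messages(['hi', {'content': 'there'}]))
```

Because every provider method goes through `_normalize_messages`, callers can pass bare strings, partial dicts, or full role/content dicts interchangeably.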
#############################################################################
#############################################################################

### MEMORY CLASS

# Sentinel class to mark deleted variables
class _VarDeleted:
    """Sentinel value indicating a variable has been deleted."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return '<DELETED>'

    def __str__(self):
        return '<DELETED>'

# Singleton instance for deleted marker
VAR_DELETED = _VarDeleted()


#-----------------------------------------------------------
# Object Compression Utilities (JSON-serializable)
#-----------------------------------------------------------

import zlib
import base64
def compress_to_json(data, content_type='auto'):
    """
    Compress data to a JSON-serializable dict.

    Args:
        data: bytes, str, or JSON-serializable object
        content_type: 'bytes', 'text', 'json', 'pickle', or 'auto'

    Returns:
        dict with 'data' (base64 string), sizes, and content_type
    """
    # Convert to bytes based on type
    if content_type == 'auto':
        if isinstance(data, bytes):
            content_type = 'bytes'
            raw_bytes = data
        elif isinstance(data, str):
            content_type = 'text'
            raw_bytes = data.encode('utf-8')
        else:
            # Try JSON first, fall back to pickle
            try:
                content_type = 'json'
                raw_bytes = json.dumps(data).encode('utf-8')
            except (TypeError, ValueError):
                content_type = 'pickle'
                raw_bytes = pickle.dumps(data)
    elif content_type == 'bytes':
        raw_bytes = data
    elif content_type == 'text':
        raw_bytes = data.encode('utf-8')
    elif content_type == 'json':
        raw_bytes = json.dumps(data).encode('utf-8')
    elif content_type == 'pickle':
        raw_bytes = pickle.dumps(data)
    else:
        raise ValueError("Unknown content_type: {}".format(content_type))

    # Compress and base64 encode
    compressed = zlib.compress(raw_bytes, level=9)
    encoded = base64.b64encode(compressed).decode('ascii')

    return {
        'data': encoded,
        'size_original': len(raw_bytes),
        'size_compressed': len(compressed),
        'content_type': content_type,
    }
def decompress_from_json(obj_dict):
    """
    Decompress data from JSON-serializable dict.

    Args:
        obj_dict: dict from compress_to_json

    Returns:
        Original data in its original type
    """
    encoded = obj_dict['data']
    content_type = obj_dict['content_type']

    # Decode and decompress
    compressed = base64.b64decode(encoded)
    raw_bytes = zlib.decompress(compressed)

    # Convert back to original type
    if content_type == 'bytes':
        return raw_bytes
    elif content_type == 'text':
        return raw_bytes.decode('utf-8')
    elif content_type == 'json':
        return json.loads(raw_bytes.decode('utf-8'))
    elif content_type == 'pickle':
        return pickle.loads(raw_bytes)
    else:
        raise ValueError("Unknown content_type: {}".format(content_type))
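The pair above is a lossless round trip: serialize, deflate, base64-encode, and the resulting dict stays JSON-serializable end to end. A condensed, self-contained sketch covering just the 'json' branch (the helper names `pack` and `unpack` are ours, standing in for `compress_to_json`/`decompress_from_json`):

```python
import base64
import json
import zlib

def pack(data):
    # Condensed compress_to_json for the common JSON case:
    # serialize, deflate, then base64 so the blob itself is JSON-safe.
    raw = json.dumps(data).encode('utf-8')
    comp = zlib.compress(raw, level=9)
    return {'data': base64.b64encode(comp).decode('ascii'),
            'size_original': len(raw),
            'size_compressed': len(comp),
            'content_type': 'json'}

def unpack(obj_dict):
    # Condensed decompress_from_json (JSON branch only): reverse each step.
    raw = zlib.decompress(base64.b64decode(obj_dict['data']))
    return json.loads(raw.decode('utf-8'))

record = {'rows': [[i, i * i] for i in range(500)]}
packed = pack(record)
assert unpack(packed) == record  # lossless round trip
assert packed['size_compressed'] < packed['size_original']  # repetitive data shrinks
```

The size bookkeeping is what lets `get_obj_info` report compression ratios without decompressing anything.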
def estimate_size(value):
    """
    Estimate the serialized size of a value in bytes.

    Args:
        value: Any value

    Returns:
        int: Estimated size in bytes
    """
    if isinstance(value, bytes):
        return len(value)
    elif isinstance(value, str):
        return len(value.encode('utf-8'))
    else:
        try:
            return len(json.dumps(value).encode('utf-8'))
        except (TypeError, ValueError):
            return len(pickle.dumps(value))
def is_obj_ref(value):
    """
    Check if a value is an object reference.

    Args:
        value: Any value

    Returns:
        bool: True if value is an object reference dict
    """
    return isinstance(value, dict) and '_obj_ref' in value
def truncate_content(content, stamp, threshold=500, header_len=200, footer_len=200):
    """
    Truncate long content by keeping header and footer with an expandable marker.

    If content is shorter than threshold, returns content unchanged.
    Otherwise, keeps the first header_len chars and last footer_len chars,
    with a marker in between indicating truncation and providing the stamp
    for expansion.

    Args:
        content: The text content to potentially truncate
        stamp: The event stamp (ID) for the content, used in expansion marker
        threshold: Minimum length before truncation applies (default 500)
        header_len: Characters to keep from start (default 200)
        footer_len: Characters to keep from end (default 200)

    Returns:
        str: Original content if short enough, or truncated content with marker

    Example:
        truncated = truncate_content(long_text, 'ABC123', threshold=500)
        # Returns: "First 200 chars...\n\n[...TRUNCATED: 1,847 chars omitted. To expand, request stamp: ABC123...]\n\n...last 200 chars"
    """
    if len(content) <= threshold:
        return content

    # Calculate how much we're removing
    chars_omitted = len(content) - header_len - footer_len

    # Build the truncation marker
    marker = "\n\n[...TRUNCATED: {:,} chars omitted. To expand, request stamp: {}...]\n\n".format(chars_omitted, stamp)

    # Extract header and footer
    header = content[:header_len]
    footer = content[-footer_len:]

    return header + marker + footer
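The truncation arithmetic is easy to verify standalone: for content longer than the threshold, the output keeps exactly `header_len` + `footer_len` original characters around a marker reporting `len(content) - header_len - footer_len` omitted characters. This sketch re-inlines the same logic so the invariants can be checked in isolation:

```python
def truncate_content(content, stamp, threshold=500, header_len=200, footer_len=200):
    # Same logic as the helper above: short content passes through unchanged,
    # long content keeps the first header_len and last footer_len characters
    # around a marker that carries the stamp for later expansion.
    if len(content) <= threshold:
        return content
    chars_omitted = len(content) - header_len - footer_len
    marker = "\n\n[...TRUNCATED: {:,} chars omitted. To expand, request stamp: {}...]\n\n".format(chars_omitted, stamp)
    return content[:header_len] + marker + content[-footer_len:]

short_out = truncate_content('hello', 'S1')  # below threshold: unchanged
long_out = truncate_content('x' * 2000, 'S2')  # 2000 - 200 - 200 = 1,600 omitted
```

Keeping the stamp inside the marker is what allows an agent to request the full event later instead of losing the middle of the content permanently.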
class MEMORY:
    """
    The MEMORY class serves as an event-sourced state container for managing events,
    logs, messages, reflections, and variables within the Thoughtflow framework.

    All state changes are stored as events with sortable IDs (alphabetical = chronological).
    Events are stored in a dictionary for O(1) lookup, with separate sorted indexes for
    efficient retrieval. The memory can be fully reconstructed from its event list.

    Architecture:
        - DATA LAYER: events dict (stamp → event object) - single source of truth
        - INDEX LAYER: idx_* lists of [timestamp, stamp] pairs, sorted chronologically
        - VARIABLE LAYER: vars dict with full history as list of [stamp, value] pairs
        - OBJECT LAYER: objects dict for compressed large data storage

    Attributes:
        id (str): Unique identifier for this MEMORY instance (event_stamp).
        events (dict): Dictionary mapping event stamps to full event objects.
        idx_msgs (list): Sorted list of [timestamp, stamp] pairs for messages.
        idx_refs (list): Sorted list of [timestamp, stamp] pairs for reflections.
        idx_logs (list): Sorted list of [timestamp, stamp] pairs for logs.
        idx_vars (list): Sorted list of [timestamp, stamp] pairs for variable changes.
        idx_all (list): Master sorted list of all [timestamp, stamp] pairs.
        vars (dict): Dictionary mapping variable names to list of [stamp, value] pairs.
            Deleted variables have VAR_DELETED as the value in their last entry.
            Large values auto-convert to object references: {'_obj_ref': stamp}.
        var_desc_history (dict): Dictionary mapping variable names to list of [stamp, description] pairs.
            Tracks description evolution separately from value changes.
        objects (dict): Dictionary mapping stamps to compressed object dicts.
            Each object is JSON-serializable with base64-encoded compressed data.
        object_threshold (int): Size threshold (bytes) for auto-converting vars to objects.
        valid_roles (set): Set of valid roles for messages.
        valid_modes (set): Set of valid modes for messages.
        valid_channels (set): Set of valid communication channels.

    Methods:
        add_msg(role, content, mode='text', channel='unknown'): Add a message event with channel.
        add_log(message): Add a log event.
        add_ref(content): Add a reflection event.
        get_msgs(...): Retrieve messages with filtering (supports channel filter).
        get_events(...): Retrieve all events with filtering.
        get_logs(limit=-1): Get log events.
        get_refs(limit=-1): Get reflection events.
        last_user_msg(): Get the last user message content.
        last_asst_msg(): Get the last assistant message content.
        last_sys_msg(): Get the last system message content.
        last_log_msg(): Get the last log message content.
        prepare_context(...): Prepare messages for LLM with smart truncation of old messages.
        set_var(key, value, desc=''): Set a variable (appends to history, auto-converts large values to objects).
        del_var(key): Mark a variable as deleted (preserves history).
        get_var(key, resolve_refs=True): Get current value (auto-resolves object refs).
        get_all_vars(resolve_refs=True): Get dict of all current non-deleted values.
        get_var_history(key, resolve_refs=False): Get full history as list of [stamp, value].
        get_var_desc(key): Get the current description of a variable.
        get_var_desc_history(key): Get full description history as list of [stamp, description].
        is_var_deleted(key): Check if a variable is currently marked as deleted.
        set_obj(data, name=None, desc='', content_type='auto'): Store compressed object, optionally link to variable.
        get_obj(stamp): Retrieve and decompress an object by stamp.
        get_obj_info(stamp): Get object metadata without decompressing.
        snapshot(): Export memory state as dict (includes events and objects).
        save(filename, compressed=False): Save memory to file (pickle format).
        load(filename, compressed=False): Load memory from file (pickle format).
        to_json(filename=None, indent=2): Export memory to JSON file or string.
        from_json(source): Class method to load memory from JSON file or string.
        copy(): Return a deep copy of the MEMORY instance.
        from_events(event_list, memory_id=None, objects=None): Class method to rehydrate from events/objects.

    Example Usage:
        memory = MEMORY()

        # Messages have channel tracking (for omni-directional communication)
        memory.add_msg('user', 'Hello!', channel='webapp')
        memory.add_msg('assistant', 'Hi there!', channel='webapp')

        # Logs and reflections are internal (no channel)
        memory.add_log('User greeted the assistant')
        memory.add_ref('User seems friendly')

        # Variables maintain full history (no channel needed)
        memory.set_var('foo', 42, 'A test variable')
        memory.set_var('foo', 100)      # Appends to history
        memory.get_var('foo')           # Returns 100
        memory.get_var_history('foo')   # Returns [[stamp1, 42], [stamp2, 100]]

        # Deletion is a tombstone, not removal
        memory.del_var('foo')
        memory.get_var('foo')           # Returns None
        memory.is_var_deleted('foo')    # Returns True
        memory.set_var('foo', 200)      # Can re-set after deletion

        # Large values auto-convert to compressed objects
        large_data = 'x' * 20000                        # Exceeds default 10KB threshold
        memory.set_var('big_data', large_data)          # Auto-converts to object
        memory.get_var('big_data')                      # Returns decompressed data
        memory.get_var('big_data', resolve_refs=False)  # Returns {'_obj_ref': stamp}

        # Direct object storage
        stamp = memory.set_obj(image_bytes, name='avatar', desc='User avatar')
        memory.get_var('avatar')      # Returns decompressed image_bytes
        memory.get_obj(stamp)         # Direct access by stamp
        memory.get_obj_info(stamp)    # Metadata without decompressing

        # Inspect internal state (public attributes)
        print(memory.events)   # All events by stamp
        print(memory.objects)  # All objects by stamp
        print(memory.vars)     # Variable histories

        memory.save('memory.pkl')
        memory2 = MEMORY()
        memory2.load('memory.pkl')

        # Export to JSON (like DataFrame.to_csv)
        memory.to_json('memory_backup.json')
        memory4 = MEMORY.from_json('memory_backup.json')

        # Rehydrate from events and objects (preserves all history)
        snap = memory.snapshot()
        memory3 = MEMORY.from_events(snap['events'].values(), objects=snap['objects'])
    """
    def __init__(self):
        import bisect
        self._bisect = bisect  # Store for use in methods

        self.id = event_stamp()

        # DATA LAYER: Single source of truth for all events
        self.events = {}  # stamp → full event dict

        # INDEX LAYER: Sorted lists of [timestamp, stamp] pairs
        # Format: [[dt_utc, stamp], ...] - aligns with Redis sorted set structure
        # Sorted by timestamp (ISO string sorts chronologically)
        self.idx_msgs = []  # Message [timestamp, stamp] pairs
        self.idx_refs = []  # Reflection [timestamp, stamp] pairs
        self.idx_logs = []  # Log [timestamp, stamp] pairs
        self.idx_vars = []  # Variable-change [timestamp, stamp] pairs
        self.idx_all = []   # Master index (all [timestamp, stamp] pairs)

        # VARIABLE LAYER: Full history with timestamps
        # vars[key] = [[stamp1, value1], [stamp2, value2], ...]
        # Deleted variables have VAR_DELETED as value in their last entry
        self.vars = {}              # var_name → list of [stamp, value] pairs
        self.var_desc_history = {}  # var_name → list of [stamp, description] pairs

        # OBJECT LAYER: Compressed storage for large data
        # objects[stamp] = {
        #     'data': base64_encoded_compressed_string,
        #     'size_original': int,
        #     'size_compressed': int,
        #     'content_type': str,  # 'bytes', 'text', 'json', 'pickle'
        # }
        self.objects = {}  # stamp → compressed object dict

        # Threshold for auto-converting variables to objects (bytes)
        self.object_threshold = 10000  # 10KB default

        # Valid values
        self.valid_roles = {
            'system',
            'user',
            'assistant',
            'reflection',
            'action',
            'query',
            'result',
            'logger',
        }
        self.valid_modes = {
            'text',
            'audio',
            'voice',
        }
        self.valid_channels = {
            'webapp',
            'ios',
            'android',
            'telegram',
            'whatsapp',
            'slack',
            'api',
            'cli',
            'unknown',
        }

    #--- Internal Methods ---
    def _add_to_index(self, index_list, timestamp, stamp):
        """
        Insert [timestamp, stamp] pair maintaining sorted order by timestamp.

        Args:
            index_list: One of the idx_* lists
            timestamp: ISO timestamp string (dt_utc)
            stamp: Event stamp ID
        """
        # bisect.insort compares lists lexicographically: timestamp first,
        # with the stamp breaking ties between equal timestamps
        self._bisect.insort(index_list, [timestamp, stamp])
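The index layer above works because Python lists compare element by element, so `bisect.insort` keeps each `idx_*` list sorted by timestamp (with the stamp as tie-breaker) even when events are inserted out of arrival order. A minimal standalone sketch (the sample timestamps and stamps are invented for illustration):

```python
import bisect

idx = []  # a [timestamp, stamp] index, like MEMORY.idx_msgs
arrivals = [('2024-01-02T10:00', 'B'),
            ('2024-01-01T09:00', 'A'),    # arrives late, sorts first
            ('2024-01-03T11:00', 'C'),
            ('2024-01-02T10:00', 'A2')]   # same timestamp as 'B'
for ts, stamp in arrivals:
    # Lists compare lexicographically: timestamp first, then stamp on ties.
    bisect.insort(idx, [ts, stamp])

print([stamp for _, stamp in idx])  # chronological regardless of arrival order
```

Insertion cost is O(n) per event (insort shifts elements), but reads stay trivially sorted, which is what `_get_events_from_index` depends on for its `index[-limit:]` slicing.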
    def _store_event(self, event_type, obj):
        """
        Store event in data layer and add to appropriate indexes.
        This is the single entry point for all event creation.

        Args:
            event_type: One of 'msg', 'ref', 'log', 'var'
            obj: The full event dict (must contain 'stamp' and 'dt_utc' keys)
        """
        stamp = obj['stamp']
        timestamp = obj['dt_utc']

        # Store in data layer
        self.events[stamp] = obj

        # Add to type-specific index (with [timestamp, stamp] format)
        if event_type == 'msg':
            self._add_to_index(self.idx_msgs, timestamp, stamp)
        elif event_type == 'ref':
            self._add_to_index(self.idx_refs, timestamp, stamp)
        elif event_type == 'log':
            self._add_to_index(self.idx_logs, timestamp, stamp)
        elif event_type == 'var':
            self._add_to_index(self.idx_vars, timestamp, stamp)

        # Always add to master index
        self._add_to_index(self.idx_all, timestamp, stamp)
    def _get_events_from_index(self, index, limit=-1):
        """
        Get events from an index, optionally limited to last N.

        Args:
            index: One of the idx_* lists (format: [[timestamp, stamp], ...])
            limit: Max events to return (-1 = all)

        Returns:
            List of event dicts
        """
        pairs = index if limit <= 0 else index[-limit:]
        # Extract stamp (second element) from each [timestamp, stamp] pair
        return [self.events[ts_stamp[1]] for ts_stamp in pairs if ts_stamp[1] in self.events]
    def _get_latest_desc(self, key):
        """
        Get the latest description for a variable from its description history.

        Args:
            key: Variable name

        Returns:
            Latest description string, or empty string if none exists
        """
        history = self.var_desc_history.get(key)
        if not history:
            return ''
        return history[-1][1]  # Return description from last [stamp, desc] pair

    #--- Public Methods ---
    def add_msg(self, role, content, mode='text', channel='unknown'):
        """
        Add a message event with channel tracking.

        Args:
            role: Message role (user, assistant, system, etc.)
            content: Message content
            mode: Communication mode (text, audio, voice)
            channel: Communication channel (webapp, ios, telegram, etc.)
        """
        if role not in self.valid_roles:
            raise ValueError("Invalid role '{}'. Must be one of: {}".format(role, sorted(self.valid_roles)))
        if mode not in self.valid_modes:
            raise ValueError("Invalid mode '{}'. Must be one of: {}".format(mode, sorted(self.valid_modes)))
        if channel not in self.valid_channels:
            raise ValueError("Invalid channel '{}'. Must be one of: {}".format(channel, sorted(self.valid_channels)))

        stamp = event_stamp({'role': role, 'content': content})
        msg = {
            'stamp'   : stamp,
            'type'    : 'msg',
            'role'    : role,
            'content' : content,
            'mode'    : mode,
            'channel' : channel,
            'dt_bog'  : str(dtt.datetime.now(tz_bog))[:23],
            'dt_utc'  : str(dtt.datetime.now(tz_utc))[:23],
        }
        self._store_event('msg', msg)
    def add_log(self, message):
        """
        Add a log event.

        Args:
            message: Log message content
        """
        stamp = event_stamp({'content': message})
        log_entry = {
            'stamp'   : stamp,
            'type'    : 'log',
            'role'    : 'logger',
            'content' : message,
            'mode'    : 'text',
            'dt_bog'  : str(dtt.datetime.now(tz_bog))[:23],
            'dt_utc'  : str(dtt.datetime.now(tz_utc))[:23],
        }
        self._store_event('log', log_entry)
def add_ref(self, content):
|
|
1386
|
+
"""
|
|
1387
|
+
Add a reflection event.
|
|
1388
|
+
|
|
1389
|
+
Args:
|
|
1390
|
+
content: Reflection content
|
|
1391
|
+
"""
|
|
1392
|
+
stamp = event_stamp({'content': content})
|
|
1393
|
+
ref = {
|
|
1394
|
+
'stamp' : stamp,
|
|
1395
|
+
'type' : 'ref',
|
|
1396
|
+
'role' : 'reflection',
|
|
1397
|
+
'content' : content,
|
|
1398
|
+
'mode' : 'text',
|
|
1399
|
+
'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
|
|
1400
|
+
'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
|
|
1401
|
+
}
|
|
1402
|
+
self._store_event('ref', ref)
|
|
1403
|
+
|
|
1404
|
+
#---
|
|
1405
|
+
|
|
1406
|
+
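The three add-event methods above share one shape: validate, stamp, build a flat event dict, store. A minimal self-contained sketch of that pattern; the `VALID_ROLES` set and the `event_stamp` hash are stand-ins, not the package's real helpers:

```python
import hashlib
import datetime

VALID_ROLES = {'user', 'assistant', 'system'}  # stand-in for self.valid_roles

def event_stamp(payload):
    # Stand-in stamp: hash of payload plus current time (the real helper may differ)
    raw = repr(payload) + datetime.datetime.now(datetime.timezone.utc).isoformat()
    return hashlib.sha1(raw.encode()).hexdigest()[:12].upper()

def make_msg(role, content, mode='text', channel='unknown'):
    # Validate first, then stamp, then build the event dict
    if role not in VALID_ROLES:
        raise ValueError("Invalid role '{}'".format(role))
    return {
        'stamp'   : event_stamp({'role': role, 'content': content}),
        'type'    : 'msg',
        'role'    : role,
        'content' : content,
        'mode'    : mode,
        'channel' : channel,
        'dt_utc'  : str(datetime.datetime.now(datetime.timezone.utc))[:23],
    }

msg = make_msg('user', 'hello')
```

Every event carries its own stamp and timestamps, so the event dict alone is enough to rebuild indexes later.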
    def get_msgs(self,
                 limit=-1,
                 include=None,
                 exclude=None,
                 repr='list',
                 channel=None,
                 ):
        """
        Get messages with flexible filtering.

        Args:
            limit: Max messages to return (-1 = all)
            include: List of roles to include (None = all)
            exclude: List of roles to exclude (None = none)
            repr: Output format ('list', 'str', 'pprint1')
            channel: Filter by channel (None = all)

        Returns:
            Messages in the specified format
        """
        # Get all messages from index
        events = self._get_events_from_index(self.idx_msgs, -1)

        # Apply filters
        if include:
            events = [e for e in events if e.get('role') in include]
        if exclude:
            events = [e for e in events if e.get('role') not in exclude]
        if channel:
            events = [e for e in events if e.get('channel') == channel]

        if limit > 0:
            events = events[-limit:]

        if repr == 'list':
            return events
        elif repr == 'str':
            return '\n'.join(["{}: {}".format(e['role'], e['content']) for e in events])
        elif repr == 'pprint1':
            return pprint.pformat(events, indent=1)
        else:
            raise ValueError("Invalid repr option. Choose from 'list', 'str', or 'pprint1'.")
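The include/exclude/channel filters in get_msgs compose as successive list comprehensions, with the tail limit applied last. The same logic in isolation, over plain event dicts:

```python
def filter_msgs(events, include=None, exclude=None, channel=None, limit=-1):
    # Same filter order as get_msgs: include roles, exclude roles, channel, tail limit
    if include:
        events = [e for e in events if e.get('role') in include]
    if exclude:
        events = [e for e in events if e.get('role') not in exclude]
    if channel:
        events = [e for e in events if e.get('channel') == channel]
    if limit > 0:
        events = events[-limit:]
    return events

events = [
    {'role': 'user',      'content': 'hi',    'channel': 'webapp'},
    {'role': 'assistant', 'content': 'hello', 'channel': 'webapp'},
    {'role': 'user',      'content': 'bye',   'channel': 'ios'},
]
```

Because each filter rebuilds the list, order matters only for performance, not results; `limit` trims from the end so the most recent messages survive.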
    def get_events(self, limit=-1, event_types=None, channel=None):
        """
        Get all events, optionally filtered by type and channel.

        Args:
            limit: Max events (-1 = all)
            event_types: List like ['msg', 'log', 'ref', 'var'] (None = all)
            channel: Filter by channel (None = all)

        Returns:
            List of event dicts
        """
        events = self._get_events_from_index(self.idx_all, -1)

        if event_types:
            events = [e for e in events if e.get('type') in event_types]
        if channel:
            events = [e for e in events if e.get('channel') == channel]

        if limit > 0:
            events = events[-limit:]

        return events

    def get_logs(self, limit=-1):
        """
        Get log events.

        Args:
            limit: Max logs to return (-1 = all)

        Returns:
            List of log event dicts
        """
        events = self._get_events_from_index(self.idx_logs, -1)

        if limit > 0:
            events = events[-limit:]

        return events

    def get_refs(self, limit=-1):
        """
        Get reflection events.

        Args:
            limit: Max reflections to return (-1 = all)

        Returns:
            List of reflection event dicts
        """
        events = self._get_events_from_index(self.idx_refs, -1)

        if limit > 0:
            events = events[-limit:]

        return events

    def last_user_msg(self):
        """Get the content of the last user message."""
        msgs = self.get_msgs(include=['user'])
        return msgs[-1]['content'] if msgs else ''

    def last_asst_msg(self):
        """Get the content of the last assistant message."""
        msgs = self.get_msgs(include=['assistant'])
        return msgs[-1]['content'] if msgs else ''

    def last_sys_msg(self):
        """Get the content of the last system message."""
        msgs = self.get_msgs(include=['system'])
        return msgs[-1]['content'] if msgs else ''

    def last_log_msg(self):
        """Get the content of the last log message."""
        logs = self.get_logs()
        return logs[-1]['content'] if logs else ''
    def prepare_context(
            self,
            recent_count=6,
            truncate_threshold=500,
            header_len=200,
            footer_len=200,
            include_roles=('user', 'assistant'),
            format='list',
            ):
        """
        Prepare messages for LLM context with smart truncation of old messages.

        Messages within the most recent `recent_count` are returned unchanged.
        Older messages that exceed `truncate_threshold` chars have their middle
        content truncated, preserving a header and footer with an expandable marker.

        The truncation marker includes the message's stamp, allowing an LLM to
        request expansion of specific messages via memory.events[stamp].

        Args:
            recent_count: Number of recent messages to keep untruncated (default 6)
            truncate_threshold: Min chars before truncation applies (default 500)
            header_len: Characters to keep from start (default 200)
            footer_len: Characters to keep from end (default 200)
            include_roles: Tuple of roles to include (default ('user', 'assistant'))
            format: 'list' returns list of dicts, 'openai' returns OpenAI-compatible format

        Returns:
            List of message dicts with 'role' and 'content' keys.
            Older messages may have truncated content with expansion markers.

        Example:
            # Get context-ready messages for LLM
            context = memory.prepare_context(recent_count=6, truncate_threshold=500)

            # Use with OpenAI API
            context = memory.prepare_context(format='openai')
            response = client.chat.completions.create(
                model='gpt-4',
                messages=context
            )
        """
        # Get all messages for included roles
        msgs = self.get_msgs(include=list(include_roles))

        if not msgs:
            return []

        # Determine cutoff point for truncation
        # Messages at index < cutoff_idx are candidates for truncation
        cutoff_idx = max(0, len(msgs) - recent_count)

        result = []
        for i, msg in enumerate(msgs):
            stamp = msg.get('stamp', '')
            role = msg.get('role', 'user')
            content = msg.get('content', '')

            # Apply truncation to older messages
            if i < cutoff_idx:
                content = truncate_content(
                    content,
                    stamp,
                    threshold=truncate_threshold,
                    header_len=header_len,
                    footer_len=footer_len
                )

            if format == 'openai':
                # OpenAI expects 'user', 'assistant', 'system' roles
                result.append({'role': role, 'content': content})
            else:
                # List format includes more metadata
                result.append({
                    'role': role,
                    'content': content,
                    'stamp': stamp,
                    'truncated': i < cutoff_idx and len(msg.get('content', '')) > truncate_threshold,
                })

        return result

    #---
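prepare_context delegates the snipping to a truncate_content helper that is defined elsewhere in the package. A plausible sketch of what such a helper could look like, given the parameters it is called with here; the exact marker text is an assumption:

```python
def truncate_content(content, stamp, threshold=500, header_len=200, footer_len=200):
    # Short messages pass through untouched
    if len(content) <= threshold:
        return content
    # Keep head and tail; the marker carries the stamp so an LLM can
    # ask for the full message via memory.events[stamp]
    marker = "\n[... truncated, expand via memory.events['{}'] ...]\n".format(stamp)
    return content[:header_len] + marker + content[-footer_len:]

short = truncate_content('abc', 'S1')
long_out = truncate_content('x' * 1000, 'S2', threshold=500, header_len=10, footer_len=10)
```

The design keeps old messages addressable rather than dropping them: the context window shrinks, but every snipped message can still be recovered by stamp.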
    def set_var(self, key, value, desc=''):
        """
        Store a variable by appending to its history list.
        Variable changes are first-class events in the event stream.
        Each variable maintains a full history of [stamp, value] pairs.

        Large values (exceeding object_threshold) are automatically converted
        to compressed objects, with an object reference stored in the history.

        Descriptions are tracked separately in var_desc_history since they
        change less frequently than values.

        Args:
            key: Variable name
            value: Variable value (any type)
            desc: Optional description (appended to description history if provided)
        """
        # Check if value should be stored as object (auto-conversion)
        value_size = estimate_size(value)
        if value_size > self.object_threshold:
            # Store as object, use reference in history
            obj_stamp = event_stamp({'obj': str(value)[:50]})
            compressed_obj = compress_to_json(value)
            self.objects[obj_stamp] = compressed_obj
            stored_value = {'_obj_ref': obj_stamp}
        else:
            stored_value = value

        stamp = event_stamp({'var': key, 'value': str(value)[:100]})

        # Initialize history list if this is a new variable
        if key not in self.vars:
            self.vars[key] = []

        # Append new [stamp, stored_value] pair to history
        self.vars[key].append([stamp, stored_value])

        # Track description changes separately (only when provided)
        if desc:
            if key not in self.var_desc_history:
                self.var_desc_history[key] = []
            self.var_desc_history[key].append([stamp, desc])

        # Get latest description from history (or the one we just set)
        current_desc = desc if desc else self._get_latest_desc(key)

        # Create variable-change event
        var_event = {
            'stamp'    : stamp,
            'type'     : 'var',
            'role'     : 'system',
            'var_name' : key,
            'var_value': stored_value,  # Store reference if large, else value
            'var_desc' : current_desc,
            'content'  : "Variable '{}' set".format(key) + (' (as object ref)' if is_obj_ref(stored_value) else ''),
            'mode'     : 'text',
            'dt_bog'   : str(dtt.datetime.now(tz_bog))[:23],
            'dt_utc'   : str(dtt.datetime.now(tz_utc))[:23],
        }
        self._store_event('var', var_event)

    def del_var(self, key):
        """
        Mark a variable as deleted by appending a VAR_DELETED tombstone.
        The variable's history is preserved; it can be re-set later.

        Args:
            key: Variable name to delete

        Raises:
            KeyError: If the variable doesn't exist
        """
        if key not in self.vars:
            raise KeyError("Variable '{}' does not exist".format(key))

        stamp = event_stamp({'var': key, 'action': 'delete'})

        # Append deletion marker to history
        self.vars[key].append([stamp, VAR_DELETED])

        # Create variable-delete event
        var_event = {
            'stamp'      : stamp,
            'type'       : 'var',
            'role'       : 'system',
            'var_name'   : key,
            'var_value'  : None,
            'var_deleted': True,
            'var_desc'   : self._get_latest_desc(key),
            'content'    : "Variable '{}' deleted".format(key),
            'mode'       : 'text',
            'dt_bog'     : str(dtt.datetime.now(tz_bog))[:23],
            'dt_utc'     : str(dtt.datetime.now(tz_utc))[:23],
        }
        self._store_event('var', var_event)
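set_var, del_var, and get_var together implement an append-only history per key with a sentinel tombstone for deletion. The pattern in isolation, stripped of the event and object machinery:

```python
VAR_DELETED = object()  # unique sentinel: deletion is just another history entry

var_store = {}  # key -> list of [stamp, value] pairs

def set_var(key, value, stamp):
    var_store.setdefault(key, []).append([stamp, value])

def del_var(key, stamp):
    if key not in var_store:
        raise KeyError(key)
    var_store[key].append([stamp, VAR_DELETED])  # tombstone, history preserved

def get_var(key):
    history = var_store.get(key)
    if not history:
        return None
    last_value = history[-1][1]
    return None if last_value is VAR_DELETED else last_value

set_var('x', 1, 'S1')
set_var('x', 2, 'S2')
del_var('x', 'S3')
```

Because deletion only appends, the full history stays replayable, and re-setting a deleted key simply appends past the tombstone.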
    def get_var(self, key, resolve_refs=True):
        """
        Return the current value of a variable.

        If the value is an object reference, it is automatically resolved
        and the decompressed data is returned (unless resolve_refs=False).

        Args:
            key: Variable name
            resolve_refs: If True (default), resolve object references to actual data

        Returns:
            Current value, or None if not found or deleted
        """
        history = self.vars.get(key)
        if not history:
            return None

        # Get the last value
        last_stamp, last_value = history[-1]

        # Return None if deleted
        if last_value is VAR_DELETED:
            return None

        # Resolve object reference if applicable
        if resolve_refs and is_obj_ref(last_value):
            return self.get_obj(last_value['_obj_ref'])

        return last_value

    def is_var_deleted(self, key):
        """
        Check if a variable is currently marked as deleted.

        Args:
            key: Variable name

        Returns:
            True if the variable exists and is deleted, False otherwise
        """
        history = self.vars.get(key)
        if not history:
            return False

        last_stamp, last_value = history[-1]
        return last_value is VAR_DELETED

    def get_all_vars(self, resolve_refs=True):
        """
        Get a dictionary of all current non-deleted variable values.

        Args:
            resolve_refs: If True (default), resolve object references to actual data

        Returns:
            dict: Variable name → current value (excludes deleted variables)
        """
        result = {}
        for key, history in self.vars.items():
            if history:
                last_stamp, last_value = history[-1]
                if last_value is not VAR_DELETED:
                    # Resolve object reference if applicable
                    if resolve_refs and is_obj_ref(last_value):
                        result[key] = self.get_obj(last_value['_obj_ref'])
                    else:
                        result[key] = last_value
        return result

    def get_var_history(self, key, resolve_refs=False):
        """
        Get full history of a variable as list of [stamp, value] pairs.
        Includes all historical values and deletion markers.

        Args:
            key: Variable name
            resolve_refs: If True, resolve object references to actual data.
                Default False to preserve the raw history structure.

        Returns:
            List of [stamp, value] pairs, or empty list if variable doesn't exist.
            Deleted entries have VAR_DELETED as the value.
            Object references appear as {'_obj_ref': stamp} unless resolve_refs=True.
        """
        history = self.vars.get(key, [])
        if not resolve_refs:
            return list(history)

        # Resolve object references
        resolved = []
        for stamp, value in history:
            if is_obj_ref(value):
                resolved.append([stamp, self.get_obj(value['_obj_ref'])])
            else:
                resolved.append([stamp, value])
        return resolved

    def get_var_desc(self, key):
        """
        Get the current (latest) description of a variable.

        Args:
            key: Variable name

        Returns:
            Latest description string, or default message if no description exists
        """
        desc = self._get_latest_desc(key)
        return desc if desc else "No description found."

    def get_var_desc_history(self, key):
        """
        Get full history of a variable's descriptions as list of [stamp, description] pairs.

        Args:
            key: Variable name

        Returns:
            List of [stamp, description] pairs, or empty list if variable has no descriptions.
        """
        return list(self.var_desc_history.get(key, []))

    #--- Object Methods ---
    def set_obj(self, data, name=None, desc='', content_type='auto'):
        """
        Store a large object in compressed form.

        Objects are compressed using zlib and base64-encoded for JSON serialization.
        Optionally creates a variable reference to the stored object.

        Args:
            data: The data to store (bytes, str, or any JSON/pickle-serializable object)
            name: Optional variable name to create a reference
            desc: Description (used only if name is provided)
            content_type: 'bytes', 'text', 'json', 'pickle', or 'auto'

        Returns:
            str: The object stamp (ID)

        Example:
            # Store raw data, get stamp back
            stamp = memory.set_obj(large_text)

            # Store and create variable reference
            memory.set_obj(image_bytes, name='profile_pic', desc='User avatar')
            memory.get_var('profile_pic')  # Returns decompressed image_bytes
        """
        stamp = event_stamp({'obj': str(data)[:50]})

        # Compress and store
        compressed_obj = compress_to_json(data, content_type)
        self.objects[stamp] = compressed_obj

        # Optionally create a variable reference
        if name:
            obj_ref = {'_obj_ref': stamp}
            # Store reference directly in vars (bypassing size check)
            var_stamp = event_stamp({'var': name})

            # Initialize history if needed
            if name not in self.vars:
                self.vars[name] = []

            # Append [stamp, obj_ref] to history
            self.vars[name].append([var_stamp, obj_ref])

            # Track description changes separately (only when provided)
            if desc:
                if name not in self.var_desc_history:
                    self.var_desc_history[name] = []
                self.var_desc_history[name].append([var_stamp, desc])

            # Get latest description for the event
            current_desc = desc if desc else self._get_latest_desc(name)

            # Store the var event
            var_event = {
                'type'       : 'var',
                'stamp'      : var_stamp,
                'var_name'   : name,
                'var_value'  : obj_ref,  # Store the reference, not the data
                'var_deleted': False,
                'var_desc'   : current_desc,
                'content'    : "Variable '{}' set to object ref: {}".format(name, stamp),
                'mode'       : 'text',
                'dt_bog'     : str(dtt.datetime.now(tz_bog))[:23],
                'dt_utc'     : str(dtt.datetime.now(tz_utc))[:23],
            }
            self._store_event('var', var_event)

        return stamp

    def get_obj(self, stamp):
        """
        Retrieve and decompress an object by its stamp.

        Args:
            stamp: The object's event stamp

        Returns:
            The decompressed original data, or None if not found

        Example:
            data = memory.get_obj('A1B2C3...')
        """
        obj_dict = self.objects.get(stamp)
        if obj_dict is None:
            return None
        return decompress_from_json(obj_dict)

    def get_obj_info(self, stamp):
        """
        Get metadata about a stored object without decompressing it.

        Args:
            stamp: The object's event stamp

        Returns:
            dict with size_original, size_compressed, content_type, or None if not found
        """
        obj_dict = self.objects.get(stamp)
        if obj_dict is None:
            return None
        return {
            'stamp': stamp,
            'size_original': obj_dict['size_original'],
            'size_compressed': obj_dict['size_compressed'],
            'content_type': obj_dict['content_type'],
            'compression_ratio': obj_dict['size_compressed'] / obj_dict['size_original'] if obj_dict['size_original'] > 0 else 0,
        }

    #---
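compress_to_json and decompress_from_json are defined elsewhere in the package; this hunk only shows the fields get_obj_info reads (size_original, size_compressed, content_type). A plausible self-contained sketch of zlib + base64 compression producing a dict with those fields; the 'data' key name and the type-dispatch details are assumptions, not the package's actual implementation:

```python
import zlib
import base64
import json

def compress_to_json(data, content_type='auto'):
    # Normalize to bytes, remembering how to restore the original type
    if isinstance(data, bytes):
        raw, ctype = data, 'bytes'
    elif isinstance(data, str):
        raw, ctype = data.encode('utf-8'), 'text'
    else:
        raw, ctype = json.dumps(data).encode('utf-8'), 'json'
    if content_type != 'auto':
        ctype = content_type
    compressed = zlib.compress(raw)
    return {
        'data': base64.b64encode(compressed).decode('ascii'),  # JSON-safe payload
        'content_type': ctype,
        'size_original': len(raw),
        'size_compressed': len(compressed),
    }

def decompress_from_json(obj):
    raw = zlib.decompress(base64.b64decode(obj['data']))
    if obj['content_type'] == 'bytes':
        return raw
    if obj['content_type'] == 'text':
        return raw.decode('utf-8')
    return json.loads(raw.decode('utf-8'))

obj = compress_to_json('hello ' * 100)
```

The base64 step costs ~33% size but makes the compressed blob storable in the same JSON snapshot as the events.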
    def snapshot(self):
        """
        Export memory state as dict.
        Stores events and objects - indexes can be rehydrated from events.

        Returns:
            dict with 'id', 'events', and 'objects' keys
        """
        return {
            'id': self.id,
            'events': dict(self.events),    # All events by stamp
            'objects': dict(self.objects),  # All objects by stamp (already JSON-serializable)
        }

    def save(self, filename, compressed=False):
        """
        Save memory to file.

        Args:
            filename: Path to save file
            compressed: If True, use gzip compression
        """
        import gzip
        data = self.snapshot()
        if compressed:
            with gzip.open(filename, 'wb') as f:
                pickle.dump(data, f)
        else:
            with open(filename, 'wb') as f:
                pickle.dump(data, f)

    def load(self, filename, compressed=False):
        """
        Load memory from file by rehydrating from events.

        Args:
            filename: Path to load file
            compressed: If True, expect gzip compression
        """
        import gzip
        if compressed:
            with gzip.open(filename, 'rb') as f:
                data = pickle.load(f)
        else:
            with open(filename, 'rb') as f:
                data = pickle.load(f)

        # Rehydrate from events (pass objects if present)
        event_list = list(data.get('events', {}).values())
        objects = data.get('objects', {})
        mem = MEMORY.from_events(event_list, data.get('id'), objects=objects)

        # Copy state to self
        self.id = mem.id
        self.events = mem.events
        self.idx_msgs = mem.idx_msgs
        self.idx_refs = mem.idx_refs
        self.idx_logs = mem.idx_logs
        self.idx_vars = mem.idx_vars
        self.idx_all = mem.idx_all
        self.vars = mem.vars
        self.var_desc_history = mem.var_desc_history
        self.objects = mem.objects

    def copy(self):
        """Return a deep copy of the MEMORY instance."""
        return copy.deepcopy(self)
    def to_json(self, filename=None, indent=2):
        """
        Export memory to JSON format.

        Like DataFrame.to_csv(), this allows saving memory state to a portable
        JSON format that can be loaded later with from_json().

        Args:
            filename: If provided, write to file. Otherwise return JSON string.
            indent: JSON indentation level (default 2, use None for compact)

        Returns:
            JSON string if filename is None, else None

        Example:
            # Save to file
            memory.to_json('memory_backup.json')

            # Get JSON string
            json_str = memory.to_json()
        """
        # Prepare data for JSON serialization
        # Need to handle VAR_DELETED sentinel in vars history
        def serialize_var_history(var_dict):
            """Convert VAR_DELETED sentinel to JSON-safe marker."""
            result = {}
            for key, history in var_dict.items():
                serialized_history = []
                for stamp, value in history:
                    if value is VAR_DELETED:
                        serialized_history.append([stamp, '__VAR_DELETED__'])
                    else:
                        serialized_history.append([stamp, value])
                result[key] = serialized_history
            return result

        data = {
            'version': '1.0',
            'id': self.id,
            'events': self.events,
            'objects': self.objects,
            'vars': serialize_var_history(self.vars),
            'var_desc_history': self.var_desc_history,
            'idx_msgs': self.idx_msgs,
            'idx_refs': self.idx_refs,
            'idx_logs': self.idx_logs,
            'idx_vars': self.idx_vars,
            'idx_all': self.idx_all,
        }

        json_str = json.dumps(data, indent=indent, ensure_ascii=False)

        if filename:
            with open(filename, 'w', encoding='utf-8') as f:
                f.write(json_str)
            return None
        return json_str

    @classmethod
    def from_json(cls, source):
        """
        Create MEMORY instance from JSON.

        Like DataFrame.read_csv(), this loads a memory from a JSON file or string
        that was saved with to_json().

        Args:
            source: JSON string or filename path

        Returns:
            New MEMORY instance

        Example:
            # Load from file
            memory = MEMORY.from_json('memory_backup.json')

            # Load from JSON string
            memory = MEMORY.from_json(json_str)
        """
        import os

        # Determine if source is a file or JSON string
        if os.path.isfile(source):
            with open(source, 'r', encoding='utf-8') as f:
                data = json.load(f)
        else:
            data = json.loads(source)

        # Helper to restore VAR_DELETED sentinel
        def deserialize_var_history(var_dict):
            """Convert JSON marker back to VAR_DELETED sentinel."""
            result = {}
            for key, history in var_dict.items():
                deserialized_history = []
                for stamp, value in history:
                    if value == '__VAR_DELETED__':
                        deserialized_history.append([stamp, VAR_DELETED])
                    else:
                        deserialized_history.append([stamp, value])
                result[key] = deserialized_history
            return result

        # Create new instance
        mem = cls()
        mem.id = data.get('id', mem.id)
        mem.events = data.get('events', {})
        mem.objects = data.get('objects', {})
        mem.vars = deserialize_var_history(data.get('vars', {}))
        mem.var_desc_history = data.get('var_desc_history', {})
        mem.idx_msgs = data.get('idx_msgs', [])
        mem.idx_refs = data.get('idx_refs', [])
        mem.idx_logs = data.get('idx_logs', [])
        mem.idx_vars = data.get('idx_vars', [])
        mem.idx_all = data.get('idx_all', [])

        return mem
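The VAR_DELETED sentinel is an in-memory object with no JSON representation, so to_json and from_json swap it for the string marker `'__VAR_DELETED__'` on the way out and back. The round-trip in isolation:

```python
import json

VAR_DELETED = object()  # in-memory sentinel, not JSON-serializable

def serialize(history):
    # Sentinel -> string marker (identity check, not equality)
    return [[s, '__VAR_DELETED__' if v is VAR_DELETED else v] for s, v in history]

def deserialize(history):
    # String marker -> sentinel
    return [[s, VAR_DELETED if v == '__VAR_DELETED__' else v] for s, v in history]

hist = [['S1', 42], ['S2', VAR_DELETED]]
round_tripped = deserialize(json.loads(json.dumps(serialize(hist))))
```

One caveat of this scheme: a user value that happens to equal the literal string `'__VAR_DELETED__'` would be misread as a tombstone on load, which is the usual trade-off of string markers for sentinels.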
    @classmethod
    def from_events(cls, event_list, memory_id=None, objects=None):
        """
        Rehydrate a MEMORY instance from a list of events.
        This is the inverse of snapshot - enables cloud sync.

        Args:
            event_list: List of event dicts (order doesn't matter, will be sorted)
            memory_id: Optional ID for the memory instance
            objects: Optional dict of objects (stamp → compressed object dict)

        Returns:
            New MEMORY instance with all events loaded
        """
        mem = cls()
        if memory_id:
            mem.id = memory_id

        # Restore objects if provided
        if objects:
            mem.objects = dict(objects)

        # Sort events by timestamp (dt_utc) for chronological order
        sorted_events = sorted(event_list, key=lambda e: e.get('dt_utc', ''))

        for ev in sorted_events:
            stamp = ev.get('stamp')
            timestamp = ev.get('dt_utc', '')
            if not stamp:
                continue

            event_type = ev.get('type', 'msg')

            # Store in data layer
            mem.events[stamp] = ev

            # Create [timestamp, stamp] pair for indexes
            ts_pair = [timestamp, stamp]

            # Add to appropriate index (direct append since already sorted by timestamp)
            if event_type == 'msg':
                mem.idx_msgs.append(ts_pair)
            elif event_type == 'ref':
                mem.idx_refs.append(ts_pair)
            elif event_type == 'log':
                mem.idx_logs.append(ts_pair)
            elif event_type == 'var':
                mem.idx_vars.append(ts_pair)
                # Replay variable state into history list
                var_name = ev.get('var_name')
                if var_name:
                    # Initialize history list if needed
                    if var_name not in mem.vars:
                        mem.vars[var_name] = []

                    # Determine value (check for deletion marker)
                    if ev.get('var_deleted', False):
                        value = VAR_DELETED
                    else:
                        value = ev.get('var_value')

                    # Append to history
                    mem.vars[var_name].append([stamp, value])

                    # Rebuild description history if present
                    var_desc = ev.get('var_desc')
                    if var_desc:
                        if var_name not in mem.var_desc_history:
                            mem.var_desc_history[var_name] = []
                        # Only add if different from last description (avoid duplicates)
                        desc_hist = mem.var_desc_history[var_name]
                        if not desc_hist or desc_hist[-1][1] != var_desc:
                            desc_hist.append([stamp, var_desc])

            mem.idx_all.append(ts_pair)

        return mem

    #---
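from_events is a pure replay: sort the raw event list by dt_utc, then append [timestamp, stamp] pairs to per-type indexes and rebuild derived state. A minimal standalone sketch of that replay, covering just the event map and the indexes:

```python
def rehydrate(event_list):
    # Per-type indexes plus a combined one, as lists of [timestamp, stamp] pairs
    idx = {'msg': [], 'log': [], 'ref': [], 'var': [], 'all': []}
    events = {}
    # Sorting once up front means plain appends keep every index chronological
    for ev in sorted(event_list, key=lambda e: e.get('dt_utc', '')):
        stamp = ev.get('stamp')
        if not stamp:
            continue  # events without a stamp are unaddressable; skip them
        events[stamp] = ev
        pair = [ev.get('dt_utc', ''), stamp]
        idx.setdefault(ev.get('type', 'msg'), []).append(pair)
        idx['all'].append(pair)
    return events, idx

raw = [
    {'stamp': 'B', 'type': 'log', 'dt_utc': '2024-01-02'},
    {'stamp': 'A', 'type': 'msg', 'dt_utc': '2024-01-01'},
]
events, idx = rehydrate(raw)
```

Because the events themselves are the source of truth, this same replay works for cloud sync: any peer holding the event list can reconstruct identical indexes.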
    # The render method provides a flexible way to display or export the MEMORY's messages or events.
    # It supports event type selection, output format, advanced filtering, metadata inclusion, pretty-printing, and message condensing.
    def render(
        self,
        include=('msgs',),        # Tuple/list of event types to include: 'msgs', 'logs', 'refs', 'vars', 'events'
        output_format='plain',    # 'plain', 'markdown', 'json', 'table', 'conversation'
        role_filter=None,         # List of roles to include (None = all)
        mode_filter=None,         # List of modes to include (None = all)
        channel_filter=None,      # Channel to filter by (None = all)
        content_filter=None,      # String or list of keywords to filter content (None = all)
        include_metadata=True,    # Whether to include metadata (timestamps, roles, etc.)
        pretty=True,              # Pretty-print for human readability
        max_length=None,          # Max total length of output (int, None = unlimited)
        condense_msg=True,        # If True, snip/condense messages that exceed max_length
        time_range=None,          # Tuple (start_dt, end_dt) to filter by datetime (None = all)
        event_limit=None,         # Max number of events to include (None = all)
        # Conversation/LLM-optimized options:
        max_message_length=1000,  # Max length per individual message (for 'conversation' format)
        max_total_length=8000,    # Max total length of the entire conversation (for 'conversation' format)
        include_roles=('user', 'assistant'),  # Which roles to include (for 'conversation' format)
        message_separator="\n\n", # Separator between messages (for 'conversation' format)
        role_prefix=True,         # Whether to include role prefixes like "User:" and "Assistant:" (for 'conversation' format)
        truncate_indicator="...", # What to show when content is truncated (for 'conversation' format)
    ):
        """
        Render MEMORY contents with flexible filtering and formatting.

        This method unifies all rendering and export logic, including:
        - General event/message rendering (plain, markdown, table, json)
        - Advanced filtering (by role, mode, channel, content, time, event type)
        - Metadata inclusion and pretty-printing
        - Output length limiting and message condensing/snipping
        - LLM-optimized conversation export (via output_format='conversation'),
          which produces a clean text blob of user/assistant messages with
          configurable length and formatting options.

        Args:
            include: Which event types to include ('msgs', 'logs', 'refs', 'vars', 'events')
            output_format: 'plain', 'markdown', 'json', 'table', or 'conversation'
            role_filter: List of roles to include (None = all)
            mode_filter: List of modes to include (None = all)
            channel_filter: Channel to filter by (None = all)
            content_filter: String or list of keywords to filter content (None = all)
            include_metadata: Whether to include metadata (timestamps, roles, etc.)
            pretty: Pretty-print for human readability
            max_length: Max total length of output (for general formats)
            condense_msg: If True, snip/condense messages that exceed max_length
            time_range: Tuple (start_dt, end_dt) to filter by datetime (None = all)
            event_limit: Max number of events to include (None = all)
            max_message_length: Max length per message (for 'conversation' format)
            max_total_length: Max total length (for 'conversation' format)
            include_roles: Which roles to include (for 'conversation' format)
            message_separator: Separator between messages (for 'conversation' format)
            role_prefix: Whether to include role prefixes (for 'conversation' format)
            truncate_indicator: Indicator for truncated content (for 'conversation' format)

        Returns:
            str or dict: Rendered output in the specified format.

        Example usage:
            mem = MEMORY()
            mem.add_msg('user', 'Hello!')
            mem.add_msg('assistant', 'Hi there!')
            print(mem.render())  # Default: plain text, all messages

            # Render only user messages in markdown
            print(mem.render(role_filter=['user'], output_format='markdown'))

            # Render as a table, including logs and refs
            print(mem.render(include=('msgs', 'logs', 'refs'), output_format='table'))

            # Render with a content keyword filter and max length
            print(mem.render(content_filter='hello', max_length=50))

            # Export as LLM-optimized conversation
            print(mem.render(output_format='conversation', max_total_length=2000))

            # Filter by channel
            print(mem.render(channel_filter='telegram'))
        """
        import json
        from datetime import datetime

        # Helper: flatten include to set for fast lookup
        include_set = set(include)

        # Helper: filter events by type using the new index-based retrieval
        def filter_events():
            events = []
            if 'events' in include_set:
                # Include all events from master index
                events = self._get_events_from_index(self.idx_all, -1)
            else:
                # Selectively include types
                if 'msgs' in include_set:
                    events.extend(self._get_events_from_index(self.idx_msgs, -1))
                if 'logs' in include_set:
                    events.extend(self._get_events_from_index(self.idx_logs, -1))
                if 'refs' in include_set:
                    events.extend(self._get_events_from_index(self.idx_refs, -1))
                if 'vars' in include_set:
                    events.extend(self._get_events_from_index(self.idx_vars, -1))
            return events

        # Helper: filter by role, mode, channel, content, and time
        def advanced_filter(evlist):
            filtered = []
            for ev in evlist:
                # Role filter
                if role_filter:
                    ev_role = ev.get('role') or ev.get('type')
                    if ev_role not in role_filter:
                        continue
                # Mode filter
                if mode_filter and ev.get('mode') not in mode_filter:
                    continue
                # Channel filter
                if channel_filter and ev.get('channel') != channel_filter:
                    continue
                # Content filter
                if content_filter:
                    content = ev.get('content', '')
                    if isinstance(content_filter, str):
                        if content_filter.lower() not in content.lower():
                            continue
                    else:  # list of keywords
                        if not any(kw.lower() in content.lower() for kw in content_filter):
                            continue
                # Time filter
                if time_range:
                    # Try to get timestamp from event
                    dt_str = ev.get('dt_utc') or ev.get('dt_bog')
                    if dt_str:
                        try:
                            dt = datetime.fromisoformat(dt_str)
                            start, end = time_range
                            if (start and dt < start) or (end and dt > end):
                                continue
                        except Exception:
                            pass  # Ignore if can't parse
                filtered.append(ev)
            return filtered

        # Helper: sort events by stamp (alphabetical = chronological)
        def sort_events(evlist):
            return sorted(evlist, key=lambda ev: ev.get('stamp', ''))

        # Step 1: Gather and filter events
        events = filter_events()
        events = advanced_filter(events)
        events = sort_events(events)
        if event_limit:
            events = events[-event_limit:]  # Most recent N

        # --- Conversation/LLM-optimized format ---
        if output_format == 'conversation':
            # Only include messages and filter by include_roles
            conv_msgs = [ev for ev in events if ev.get('role') in include_roles]
            # Already sorted by stamp

            conversation_parts = []
            current_length = 0
            for msg in conv_msgs:
                role = msg.get('role', 'unknown')
                content = msg.get('content', '')

                # Truncate individual message if needed
                if len(content) > max_message_length:
                    content = content[:max_message_length - len(truncate_indicator)] + truncate_indicator

                # Format the message
                if role_prefix:
                    if role == 'user':
                        formatted_msg = "User: " + content
                    elif role == 'assistant':
                        formatted_msg = "Assistant: " + content
                    else:
                        formatted_msg = role.title() + ": " + content
                else:
                    formatted_msg = content

                # Check if adding this message would exceed total length
                message_length = len(formatted_msg) + len(message_separator)
                if current_length + message_length > max_total_length:
                    # If we can't fit the full message, try to fit a truncated version
                    remaining_space = max_total_length - current_length - len(truncate_indicator)
                    if remaining_space > 50:  # Only add if there's reasonable space
                        if role_prefix:
                            prefix_len = len(role.title() + ": ")
                            truncated_content = content[:remaining_space - prefix_len] + truncate_indicator
                            formatted_msg = role.title() + ": " + truncated_content
                        else:
                            formatted_msg = content[:remaining_space] + truncate_indicator
                        conversation_parts.append(formatted_msg)
                    break

                conversation_parts.append(formatted_msg)
                current_length += message_length

            return message_separator.join(conversation_parts)

        # --- JSON format ---
        output = None
        total_length = 0
        snip_notice = " [snipped]"  # For snipped messages

        if output_format == 'json':
            # Output as JSON (list of dicts)
            if not include_metadata:
                # Remove metadata fields
                def strip_meta(ev):
                    return {k: v for k, v in ev.items() if k in ('role', 'content', 'type', 'channel')}
                out_events = [strip_meta(ev) for ev in events]
            else:
                out_events = events
            output = json.dumps(out_events, indent=2 if pretty else None, default=str)
            if max_length and len(output) > max_length:
                output = output[:max_length] + snip_notice

        elif output_format in ('plain', 'markdown', 'table'):
            # Build lines for each event
            lines = []
            for ev in events:
                # Compose line based on event type
                event_type = ev.get('type', 'msg')
                if event_type == 'log' or ev.get('role') == 'logger':
                    prefix = "[LOG]"
                    content = ev.get('content', '')
                elif event_type == 'ref':
                    prefix = "[REF]"
                    content = ev.get('content', '')
                elif event_type == 'var':
                    prefix = "[VAR]"
                    content = "{} = {}".format(ev.get('var_name', '?'), ev.get('var_value', '?'))
                else:
                    prefix = "[{}]".format(ev.get('role', 'MSG').upper())
                    content = ev.get('content', '')

                # Optionally include metadata
                meta = ""
                if include_metadata:
                    dt = ev.get('dt_utc') or ev.get('dt_bog')
                    stamp = ev.get('stamp', '')
                    channel = ev.get('channel', '')
                    meta = " ({})".format(dt) if dt else ""
                    if output_format == 'table':
                        meta = "\t{}\t{}\t{}".format(dt or '', stamp or '', channel or '')

                # Condense message if needed
                line = "{} {}{}".format(prefix, content, meta)
                if max_length and total_length + len(line) > max_length:
                    if condense_msg:
                        # Snip the content to fit
                        allowed = max_length - total_length - len(snip_notice)
                        if allowed > 0:
                            line = line[:allowed] + snip_notice
                        else:
                            line = snip_notice
                        lines.append(line)
                        break
                    else:
                        break
                lines.append(line)
                total_length += len(line) + 1  # +1 for newline

            # Format as table if requested
            if output_format == 'table':
                # Table header
                header = "Type\tContent\tDatetime\tStamp\tChannel"
                table_lines = [header]
                for ev in events:
                    typ = ev.get('type', ev.get('role', ''))
                    if typ == 'var':
                        content = "{} = {}".format(ev.get('var_name', '?'), ev.get('var_value', '?'))
                    else:
                        content = ev.get('content', '')
                    dt = ev.get('dt_utc') or ev.get('dt_bog') or ''
                    stamp = ev.get('stamp', '')
                    channel = ev.get('channel', '')
                    row = "{}\t{}\t{}\t{}\t{}".format(typ, content, dt, stamp, channel)
                    table_lines.append(row)
                output = "\n".join(table_lines)
            else:
                sep = "\n" if pretty else " "
                output = sep.join(lines)

        else:
            raise ValueError("Unknown output_format: {}".format(output_format))

        return output

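The 'conversation' branch above enforces two budgets: a per-message cap and a running total cap, with the final message truncated when it would overflow. A reduced, standalone sketch of that budgeting (the helper name and defaults are illustrative, not part of the package):

```python
def build_conversation(msgs, max_message_length=1000, max_total_length=8000,
                       separator="\n\n", indicator="..."):
    """Join role-prefixed messages under per-message and total length budgets.

    msgs is a list of (role, content) tuples; an illustrative stand-in for
    the message events that render() pulls out of MEMORY.
    """
    parts, total = [], 0
    for role, content in msgs:
        # Per-message cap: truncate the content, keeping room for the indicator
        if len(content) > max_message_length:
            content = content[:max_message_length - len(indicator)] + indicator
        formatted = "{}: {}".format(role.title(), content)
        cost = len(formatted) + len(separator)
        # Total cap: squeeze in a truncated final message if there is room, then stop
        if total + cost > max_total_length:
            remaining = max_total_length - total - len(indicator)
            if remaining > 50:
                parts.append(formatted[:remaining] + indicator)
            break
        parts.append(formatted)
        total += cost
    return separator.join(parts)

convo = build_conversation([('user', 'Hello!'), ('assistant', 'Hi there!')])
# → "User: Hello!\n\nAssistant: Hi there!"
```

Counting the separator into each message's cost keeps the final blob at or under the budget even after joining.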
MemoryManipulationExamples = """

MEMORY Class Usage Tutorial
===========================

This tutorial demonstrates common workflows and transactions using the MEMORY class.
The MEMORY class is an event-sourced state container for managing messages, logs,
reflections, and variables in agentic or conversational systems.

Key Features:
- Everything is an event with a sortable ID (alphabetical = chronological)
- Events stored in a dictionary for O(1) lookup
- Channel tracking for messages (omni-directional communication)
- Full variable history with timestamps
- Memory can be rehydrated from event list for cloud sync

------------------------------------------------------------
1. Initialization
------------------------------------------------------------

>>> mem = MEMORY()

Creates a new MEMORY instance with empty event stores and indexes.

------------------------------------------------------------
2. Adding and Retrieving Messages with Channel Support
------------------------------------------------------------

# Add user and assistant messages with channel tracking
>>> mem.add_msg('user', 'Hello, assistant!', channel='webapp')
>>> mem.add_msg('assistant', 'Hello, user! How can I help you?', channel='webapp')

# Messages from different channels
>>> mem.add_msg('user', 'Quick question via phone', channel='ios')
>>> mem.add_msg('user', 'Following up on Telegram', channel='telegram')

# Retrieve all messages as a list of dicts
>>> mem.get_msgs()
[{'role': 'user', 'content': 'Hello, assistant!', 'channel': 'webapp', ...}, ...]

# Filter messages by channel
>>> mem.get_msgs(channel='telegram')

# Retrieve only user messages as a string
>>> mem.get_msgs(include=['user'], repr='str')
'user: Hello, assistant!'

# Get the last assistant message
>>> mem.last_asst_msg()
'Hello, user! How can I help you?'

------------------------------------------------------------
3. Logging and Reflections
------------------------------------------------------------

# Add a log entry
>>> mem.add_log('System initialized.')

# Add a reflection (agent's internal reasoning)
>>> mem.add_ref('User seems to be asking about weather patterns.')

# Retrieve the last log message
>>> mem.last_log_msg()
'System initialized.'

# Get all logs
>>> mem.get_logs()

# Get all reflections
>>> mem.get_refs()

------------------------------------------------------------
4. Managing Variables (Full History Tracking)
------------------------------------------------------------

# Set a variable with a description (logged as an event!)
>>> mem.set_var('session_id', 'abc123', desc='Current session identifier')

# Update the variable (appends to history, doesn't overwrite)
>>> mem.set_var('session_id', 'xyz789')

# Retrieve the current value of a variable
>>> mem.get_var('session_id')
'xyz789'

# Get all current non-deleted variables as a dict
>>> mem.get_all_vars()
{'session_id': 'xyz789'}

# Get full variable history as list of [stamp, value] pairs
>>> mem.get_var_history('session_id')
[['stamp1...', 'abc123'], ['stamp2...', 'xyz789']]

# Get variable description
>>> mem.get_var_desc('session_id')
'Current session identifier'

# Delete a variable (marks as deleted but preserves history)
>>> mem.del_var('session_id')

# After deletion, get_var returns None
>>> mem.get_var('session_id')
None

# Check if a variable is deleted
>>> mem.is_var_deleted('session_id')
True

# History still shows all changes including deletion
>>> mem.get_var_history('session_id')
[['stamp1...', 'abc123'], ['stamp2...', 'xyz789'], ['stamp3...', <DELETED>]]

# Variable can be re-set after deletion
>>> mem.set_var('session_id', 'new_value')
>>> mem.get_var('session_id')
'new_value'

------------------------------------------------------------
5. Saving, Loading, and Copying State
------------------------------------------------------------

# Save MEMORY state to a file
>>> mem.save('memory_state.pkl')

# Save with compression
>>> mem.save('memory_state.pkl.gz', compressed=True)

# Load MEMORY state from a file (rehydrates from events)
>>> mem2 = MEMORY()
>>> mem2.load('memory_state.pkl')

# Deep copy the MEMORY object
>>> mem3 = mem.copy()

------------------------------------------------------------
6. Rehydrating from Events (Cloud Sync Ready)
------------------------------------------------------------

# Export all events
>>> events = mem.get_events()

# Create a new memory from events (order doesn't matter, sorted by stamp)
>>> mem_copy = MEMORY.from_events(events)

# Export snapshot for cloud storage
>>> snapshot = mem.snapshot()
# snapshot = {'id': '...', 'events': {...}}

------------------------------------------------------------
7. Rendering and Exporting Memory Contents
------------------------------------------------------------

# Render all messages as plain text (default)
>>> print(mem.render())

# Render only user messages in markdown format
>>> print(mem.render(role_filter=['user'], output_format='markdown'))

# Render as a table, including logs and reflections
>>> print(mem.render(include=('msgs', 'logs', 'refs'), output_format='table'))

# Filter by channel
>>> print(mem.render(channel_filter='telegram'))

# Render with a content keyword filter and max length
>>> print(mem.render(content_filter='hello', max_length=50))

# Export as LLM-optimized conversation (for prompt construction)
>>> print(mem.render(output_format='conversation', max_total_length=2000))

------------------------------------------------------------
8. Advanced Filtering and Formatting
------------------------------------------------------------

# Filter by role, mode, and channel
>>> print(mem.render(role_filter=['assistant'], mode_filter=['text'], channel_filter='webapp'))

# Filter by time range (using datetime objects)
>>> from datetime import datetime, timedelta
>>> start = datetime.utcnow() - timedelta(hours=1)
>>> end = datetime.utcnow()
>>> print(mem.render(time_range=(start, end)))

# Limit number of events/messages
>>> print(mem.render(event_limit=5))

# Get all events of specific types
>>> mem.get_events(event_types=['msg', 'ref'])

------------------------------------------------------------
9. Example: Full Workflow
------------------------------------------------------------

>>> mem = MEMORY()
>>> mem.add_msg('user', 'What is the weather today?', channel='webapp')
>>> mem.add_msg('assistant', 'The weather is sunny and warm.', channel='webapp')
>>> mem.set_var('weather', 'sunny and warm', desc='Latest weather info')
>>> mem.add_ref('User is interested in outdoor activities.')
>>> mem.add_log('Weather query processed successfully.')
>>> print(mem.render(output_format='conversation'))

# Export all events and rehydrate
>>> all_events = mem.get_events()
>>> mem_restored = MEMORY.from_events(all_events, mem.id)

------------------------------------------------------------
For more details, see the MEMORY class docstring and method documentation.
------------------------------------------------------------
"""

#############################################################################
#############################################################################

### THOUGHT CLASS


class THOUGHT:
    """
    The THOUGHT class represents a single, modular reasoning or action step within an agentic
    workflow. It is designed to operate on MEMORY objects, orchestrating LLM calls, memory queries,
    and variable manipulations in a composable and traceable manner.
    THOUGHTs are the atomic units of reasoning, planning, and execution in the Thoughtflow framework,
    and can be chained or composed to build complex agent behaviors.

    CONCEPT:
    A thought is a self-contained, modular process of (1) creating a structured prompt for an LLM,
    (2) executing the LLM request, (3) cleaning / validating the LLM response, and (4) retrying
    execution if necessary. It is the discrete unit of cognition: the execution of a single
    cognitive task. In so doing, we have created the fundamental component for architecting
    multi-step cognitive systems.

    The Simple Equation of a Thought:
        Thought = Prompt + Context + LLM + Parsing + Validation

    COMPONENTS:

    1. PROMPT
    The Prompt() object is essentially a structured template which may contain parameters to fill
    out. This defines the structure and the rules for executing the LLM request.

    2. CONTEXT
    This is the relevant context, which comes from a MEMORY object. It is passed to a prompt object
    as a dictionary containing the required / optional variables. Any context that is given but
    does not exist as a variable in the prompt will be excluded.

    3. LLM REQUEST
    This is the simple transaction of submitting a structured Messages object to an LLM in order
    to receive a response. The messages object may include a system prompt and a series of
    historical user / assistant interactions. Parameters such as temperature are also passed
    with this request.

    4. PARSING
    LLMs often emit extra text even when told not to. For this reason, it is important to parse
    the response so that we are only handling the content that was requested, and nothing more.
    So if we are asking for a Python list, the parsed response should begin with "[" and end
    with "]".

    5. VALIDATION
    Even a successfully parsed response may not be valid, given the constraints of the Thought.
    For this reason, it is helpful to have a validation routine that stamps the response as valid
    according to a fixed list of rules. "max_retries" is a param that tells the Thought how many
    times it can retry the prompt before returning an error.

    Supported Operations:
    - llm_call: Execute an LLM request with prompt and context (default)
    - memory_query: Query memory state and return variables/data without LLM
    - variable_set: Set or compute memory variables from context
    - conditional: Execute logic based on memory conditions

    Key Features:
    - Callable interface: mem = thought(mem) or mem = thought(mem, vars)
    - Automatic retry with configurable attempts and repair prompts
    - Schema-based response parsing via valid_extract or custom parsers
    - Multiple validators: has_keys, list_min_len, custom callables
    - Pre/post hooks for custom processing
    - Full execution tracing and history
    - Serialization support via to_dict()/from_dict()
    - Channel support for message tracking

    Parameters:
        name (str): Unique identifier for this thought
        llm (LLM): LLM instance for execution (required for llm_call operation)
        prompt (str|dict): Prompt template with {variable} placeholders
        operation (str): Type of operation ('llm_call', 'memory_query', 'variable_set', 'conditional')
        system_prompt (str): Optional system prompt for LLM context (via config)
        parser (str|callable): Response parser ('text', 'json', 'list', or callable)
        parsing_rules (dict): Schema for valid_extract parsing (e.g., {'kind': 'python', 'format': []})
        validator (str|callable): Response validator ('any', 'has_keys:k1,k2', 'list_min_len:N', or callable)
        max_retries (int): Maximum retry attempts (default: 1)
        retry_delay (float): Delay between retries in seconds (default: 0)
        required_vars (list): Variables required from memory
        optional_vars (list): Optional variables from memory
        output_var (str): Variable name for storing result (default: '{name}_result')
        pre_hook (callable): Function called before execution: fn(thought, memory, vars, **kwargs)
        post_hook (callable): Function called after execution: fn(thought, memory, result, error)
        channel (str): Channel for message tracking (default: 'system')
        add_reflection (bool): Whether to add reflection on success (default: True)

    Example usage:
        # Basic LLM call with result storage
        mem = MEMORY()
        llm = LLM(model="openai:gpt-4o-mini", api_key="...")
        thought = THOUGHT(
            name="summarize",
            llm=llm,
            prompt="Summarize the last user message: {last_user_msg}",
            operation="llm_call"
        )
        mem = thought(mem)  # Executes the thought, updates memory with result
        result = mem.get_var("summarize_result")

        # Schema-based parsing example
        thought = THOUGHT(
            name="extract_info",
            llm=llm,
            prompt="Extract name and age from: {text}",
            parsing_rules={"kind": "python", "format": {"name": "", "age": 0}}
        )

        # Memory query example (no LLM)
        thought = THOUGHT(
            name="get_context",
            operation="memory_query",
            required_vars=["user_name", "session_id"]
        )

        # Variable set example
        thought = THOUGHT(
            name="init_session",
            operation="variable_set",
            prompt={"session_active": True, "start_time": None}  # dict of values to set
        )

    !!! IMPORTANT !!!
    The resulting functionality from this class must enable the following pattern:
        mem = thought(mem)        # where mem is a MEMORY object
    or
        mem = thought(mem, vars)  # where vars (optional) is a dictionary of variables to pass to the thought

    THOUGHT OPERATIONS MUST BE CALLABLE.
    """
# Valid operation types
|
|
2846
|
+
VALID_OPERATIONS = {'llm_call', 'memory_query', 'variable_set', 'conditional'}
|
|
2847
|
+
|
|
2848
|
+
def __init__(self, name=None, llm=None, prompt=None, operation=None, **kwargs):
|
|
2849
|
+
"""
|
|
2850
|
+
Initialize a THOUGHT instance.
|
|
2851
|
+
|
|
2852
|
+
Args:
|
|
2853
|
+
name (str): Name of the thought.
|
|
2854
|
+
llm: LLM interface or callable.
|
|
2855
|
+
prompt: Prompt template (str or dict).
|
|
2856
|
+
operation (str): Operation type (e.g., 'llm_call', 'memory_query', etc).
|
|
2857
|
+
**kwargs: Additional configuration parameters.
|
|
2858
|
+
"""
|
|
2859
|
+
self.name = name
|
|
2860
|
+
self.id = event_stamp()
|
|
2861
|
+
self.llm = llm
|
|
2862
|
+
self.prompt = prompt
|
|
2863
|
+
self.operation = operation
|
|
2864
|
+
|
|
2865
|
+
# Store any additional configuration parameters
|
|
2866
|
+
self.config = kwargs.copy()
|
|
2867
|
+
|
|
2868
|
+
# Optionally, store a description or docstring if provided
|
|
2869
|
+
self.description = kwargs.get("description", None)
|
|
2870
|
+
|
|
2871
|
+
# Optionally, store validation rules, parsing functions, etc.
|
|
2872
|
+
self.validation = kwargs.get("validation", None)
|
|
2873
|
+
self.parse_fn = kwargs.get("parse_fn", None)
|
|
2874
|
+
self.max_retries = kwargs.get("max_retries", 1)
|
|
2875
|
+
self.retry_delay = kwargs.get("retry_delay", 0)
|
|
2876
|
+
|
|
2877
|
+
# Optionally, store default context variables or requirements
|
|
2878
|
+
self.required_vars = kwargs.get("required_vars", [])
|
|
2879
|
+
self.optional_vars = kwargs.get("optional_vars", [])
|
|
2880
|
+
|
|
2881
|
+
# Optionally, store output variable name
|
|
2882
|
+
self.output_var = kwargs.get("output_var", "{}_result".format(self.name) if self.name else None)
|
|
2883
|
+
|
|
2884
|
+
# Internal state for tracking last result, errors, etc.
|
|
2885
|
+
self.last_result = None
|
|
2886
|
+
self.last_error = None
|
|
2887
|
+
self.last_prompt = None
|
|
2888
|
+
self.last_msgs = None
|
|
2889
|
+
self.last_response = None
|
|
2890
|
+
|
|
2891
|
+
# Allow for custom hooks (pre/post processing)
|
|
2892
|
+
self.pre_hook = kwargs.get("pre_hook", None)
|
|
2893
|
+
self.post_hook = kwargs.get("post_hook", None)
|
|
2894
|
+
|
|
2895
|
+
# Execution history tracking
|
|
2896
|
+
self.execution_history = []
|
|
2897
|
+
|
|
2898
|
+
|
|
2899
|
+
    def __call__(self, memory, vars=None, **kwargs):
        """
        Execute the thought on the given MEMORY object.

        Args:
            memory: MEMORY object.
            vars: Optional dictionary of variables to pass to the thought.
            **kwargs: Additional parameters for execution.

        Returns:
            Updated MEMORY object with result stored (if applicable).
        """
        import time as time_module

        start_time = time_module.time()

        # Allow vars to be None (a None default avoids the mutable-default-argument pitfall)
        if vars is None:
            vars = {}

        # Pre-hook
        if self.pre_hook and callable(self.pre_hook):
            self.pre_hook(self, memory, vars, **kwargs)

        # Determine operation type
        operation = self.operation or 'llm_call'

        # Dispatch to the appropriate handler based on operation type
        if operation == 'llm_call':
            result, last_error, attempts_made = self._execute_llm_call(memory, vars, **kwargs)
        elif operation == 'memory_query':
            result, last_error, attempts_made = self._execute_memory_query(memory, vars, **kwargs)
        elif operation == 'variable_set':
            result, last_error, attempts_made = self._execute_variable_set(memory, vars, **kwargs)
        elif operation == 'conditional':
            result, last_error, attempts_made = self._execute_conditional(memory, vars, **kwargs)
        else:
            raise ValueError("Unknown operation: {}. Valid operations: {}".format(operation, self.VALID_OPERATIONS))

        # Calculate execution duration
        duration_ms = (time_module.time() - start_time) * 1000

        # Build execution event for logging
        execution_event = {
            'thought_name': self.name,
            'thought_id': self.id,
            'operation': operation,
            'attempts': attempts_made,
            'success': result is not None,
            'duration_ms': round(duration_ms, 2),
            'output_var': self.output_var
        }

        # If failed after all retries
        if result is None and last_error is not None:
            execution_event['error'] = last_error
            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Thought execution failed: " + json.dumps(execution_event))
            # Store None as result
            self.update_memory(memory, None)
        else:
            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Thought execution complete: " + json.dumps(execution_event))
            self.update_memory(memory, result)

        # Track execution history on the THOUGHT instance
        self.execution_history.append({
            'stamp': event_stamp(),
            'memory_id': getattr(memory, 'id', None),
            'operation': operation,
            'duration_ms': duration_ms,
            'success': result is not None or last_error is None,
            'attempts': attempts_made,
            'error': self.last_error
        })

        # Post-hook
        if self.post_hook and callable(self.post_hook):
            self.post_hook(self, memory, self.last_result, self.last_error)

        return memory

    def _execute_llm_call(self, memory, vars, **kwargs):
        """
        Execute an LLM call operation with retry logic.

        Returns:
            tuple: (result, last_error, attempts_made)
        """
        import copy as copy_module
        import time as time_module

        retries_left = self.max_retries
        last_error = None
        result = None
        attempts_made = 0

        # Store the original prompt to avoid mutation - work with a copy
        original_prompt = copy_module.deepcopy(self.prompt)
        working_prompt = copy_module.deepcopy(self.prompt)

        while retries_left > 0:
            attempts_made += 1
            try:
                # Temporarily set the working prompt for this iteration
                self.prompt = working_prompt

                # Build context and prompt/messages
                ctx = self.get_context(memory)
                ctx.update(vars)
                msgs = self.build_msgs(memory, ctx)

                # Run LLM (copy llm_params so updates do not mutate the stored config)
                llm_kwargs = dict(self.config.get("llm_params", {}))
                llm_kwargs.update(kwargs)
                response = self.run_llm(msgs, **llm_kwargs)
                self.last_response = response

                # Get channel from config for message tracking
                channel = self.config.get("channel", "system")

                # Add assistant message to memory (if possible)
                if hasattr(memory, "add_msg") and callable(getattr(memory, "add_msg", None)):
                    memory.add_msg("assistant", response, channel=channel)

                # Parse
                parsed = self.parse_response(response)
                self.last_result = parsed

                # Validate
                valid, why = self.validate(parsed)
                if valid:
                    result = parsed
                    self.last_error = None
                    # Logging
                    if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                        memory.add_log("Thought '{}' completed successfully".format(self.name))
                    # Add reflection for reasoning trace (if configured)
                    if self.config.get("add_reflection", True):
                        if hasattr(memory, "add_ref") and callable(getattr(memory, "add_ref", None)):
                            # Truncate response for reflection if too long
                            response_preview = str(response)[:300]
                            if len(str(response)) > 300:
                                response_preview += "..."
                            memory.add_ref("Thought '{}': {}".format(self.name, response_preview))
                    break
                else:
                    last_error = why
                    self.last_error = why
                    if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                        memory.add_log("Thought '{}' validation failed: {}".format(self.name, why))
                    # Create repair suffix for the next retry (modify working_prompt, not the original)
                    repair_suffix = "\n(Please return only the requested format; your last answer failed: {}.)".format(why)
                    if isinstance(original_prompt, str):
                        working_prompt = original_prompt.rstrip() + repair_suffix
                    elif isinstance(original_prompt, dict):
                        working_prompt = copy_module.deepcopy(original_prompt)
                        last_key = list(working_prompt.keys())[-1]
                        working_prompt[last_key] = working_prompt[last_key].rstrip() + repair_suffix
            except Exception as e:
                last_error = str(e)
                self.last_error = last_error
                if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                    memory.add_log("Thought '{}' error: {}".format(self.name, last_error))
                # Create repair suffix for the next retry (modify working_prompt, not the original)
                repair_suffix = "\n(Please return only the requested format; your last answer failed: {}.)".format(last_error)
                if isinstance(original_prompt, str):
                    working_prompt = original_prompt.rstrip() + repair_suffix
                elif isinstance(original_prompt, dict):
                    working_prompt = copy_module.deepcopy(original_prompt)
                    last_key = list(working_prompt.keys())[-1]
                    working_prompt[last_key] = working_prompt[last_key].rstrip() + repair_suffix
            retries_left -= 1
            if self.retry_delay:
                time_module.sleep(self.retry_delay)

        # Restore the original prompt after execution (prevents permanent mutation)
        self.prompt = original_prompt

        return result, last_error, attempts_made

    def _execute_memory_query(self, memory, vars, **kwargs):
        """
        Execute a memory query operation (no LLM involved).
        Retrieves specified variables from memory and returns them as a dict.

        Returns:
            tuple: (result, last_error, attempts_made)
        """
        try:
            result = {}

            # Get required variables
            for var in self.required_vars:
                if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
                    val = memory.get_var(var)
                else:
                    val = getattr(memory, var, None)

                if val is None:
                    return None, "Required variable '{}' not found in memory".format(var), 1
                result[var] = val

            # Get optional variables
            for var in self.optional_vars:
                if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
                    val = memory.get_var(var)
                else:
                    val = getattr(memory, var, None)

                if val is not None:
                    result[var] = val

            # Include any vars passed directly
            result.update(vars)

            self.last_result = result
            self.last_error = None

            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Thought '{}' memory query completed".format(self.name))

            return result, None, 1

        except Exception as e:
            self.last_error = str(e)
            return None, str(e), 1

    def _execute_variable_set(self, memory, vars, **kwargs):
        """
        Execute a variable set operation.
        Sets variables in memory from the prompt (as dict) or vars parameter.

        Returns:
            tuple: (result, last_error, attempts_made)
        """
        try:
            values_to_set = {}

            # If prompt is a dict, use it as the values to set
            if isinstance(self.prompt, dict):
                values_to_set.update(self.prompt)

            # Override/add with vars parameter
            values_to_set.update(vars)

            # Set each variable in memory
            for key, value in values_to_set.items():
                if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
                    desc = self.config.get("var_descriptions", {}).get(key, "Set by thought: {}".format(self.name))
                    memory.set_var(key, value, desc=desc)
                elif hasattr(memory, "vars"):
                    if key not in memory.vars:
                        memory.vars[key] = []
                    stamp = event_stamp(value)
                    memory.vars[key].append([stamp, value])

            self.last_result = values_to_set
            self.last_error = None

            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Thought '{}' set {} variables".format(self.name, len(values_to_set)))

            return values_to_set, None, 1

        except Exception as e:
            self.last_error = str(e)
            return None, str(e), 1

    def _execute_conditional(self, memory, vars, **kwargs):
        """
        Execute a conditional operation.
        Evaluates a condition from config and returns the appropriate result.

        Config options:
            condition (callable): Function that takes (memory, vars) and returns bool
            if_true: Value/action if condition is true
            if_false: Value/action if condition is false

        Returns:
            tuple: (result, last_error, attempts_made)
        """
        try:
            condition_fn = self.config.get("condition")
            if_true = self.config.get("if_true")
            if_false = self.config.get("if_false")

            if condition_fn is None:
                return None, "No condition function provided for conditional operation", 1

            if not callable(condition_fn):
                return None, "Condition must be callable", 1

            # Evaluate condition
            ctx = self.get_context(memory)
            ctx.update(vars)
            condition_result = condition_fn(memory, ctx)

            # Return the appropriate value
            if condition_result:
                result = if_true
                if callable(if_true):
                    result = if_true(memory, ctx)
            else:
                result = if_false
                if callable(if_false):
                    result = if_false(memory, ctx)

            self.last_result = result
            self.last_error = None

            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Thought '{}' conditional evaluated to {}".format(self.name, bool(condition_result)))

            return result, None, 1

        except Exception as e:
            self.last_error = str(e)
            return None, str(e), 1

    def build_prompt(self, memory, context_vars=None):
        """
        Build the prompt for the LLM using construct_prompt.

        Args:
            memory: MEMORY object providing context.
            context_vars (dict): Optional context variables to fill the prompt.

        Returns:
            str: The constructed prompt string.
        """
        # Get context variables (merge get_context and context_vars)
        ctx = self.get_context(memory)
        if context_vars:
            ctx.update(context_vars)
        prompt_template = self.prompt
        # If prompt is a dict, use construct_prompt; else format as string
        if isinstance(prompt_template, dict):
            prompt = construct_prompt(prompt_template)
        elif isinstance(prompt_template, str):
            try:
                prompt = prompt_template.format(**ctx)
            except Exception:
                # Fallback: return the template as-is
                prompt = prompt_template
        else:
            prompt = str(prompt_template)
        self.last_prompt = prompt
        return prompt

    def build_msgs(self, memory, context_vars=None):
        """
        Build the messages list for the LLM using construct_msgs.

        Args:
            memory: MEMORY object providing context.
            context_vars (dict): Optional context variables to fill the prompt.

        Returns:
            list: List of message dicts for LLM input.
        """
        ctx = self.get_context(memory)
        if context_vars:
            ctx.update(context_vars)
        # Compose system and user prompts
        sys_prompt = self.config.get("system_prompt", "")
        usr_prompt = self.build_prompt(memory, ctx)
        # Optionally, allow for prior messages from memory
        msgs = []
        if hasattr(memory, "get_msgs"):
            # Optionally, get recent messages for context
            msgs = memory.get_msgs(repr="list") if callable(getattr(memory, "get_msgs", None)) else []
        # Build messages using construct_msgs
        msgs_out = construct_msgs(
            usr_prompt=usr_prompt,
            vars=ctx,
            sys_prompt=sys_prompt,
            msgs=msgs
        )
        self.last_msgs = msgs_out
        return msgs_out

    def get_context(self, memory):
        """
        Extract relevant context from the MEMORY object for this thought.

        Args:
            memory: MEMORY object.

        Returns:
            dict: Context variables for prompt filling.
        """
        ctx = {}
        # If required_vars is specified, try to get those from memory
        if hasattr(self, "required_vars") and self.required_vars:
            for var in self.required_vars:
                # Try to get from memory.get_var if available
                if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
                    val = memory.get_var(var)
                else:
                    val = getattr(memory, var, None)
                if val is not None:
                    ctx[var] = val
        # Optionally, add optional_vars if present in memory
        if hasattr(self, "optional_vars") and self.optional_vars:
            for var in self.optional_vars:
                if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
                    val = memory.get_var(var)
                else:
                    val = getattr(memory, var, None)
                if val is not None:
                    ctx[var] = val
        # Add some common context keys if available
        if hasattr(memory, "last_user_msg") and callable(getattr(memory, "last_user_msg", None)):
            ctx["last_user_msg"] = memory.last_user_msg()
        if hasattr(memory, "last_asst_msg") and callable(getattr(memory, "last_asst_msg", None)):
            ctx["last_asst_msg"] = memory.last_asst_msg()
        if hasattr(memory, "get_msgs") and callable(getattr(memory, "get_msgs", None)):
            ctx["messages"] = memory.get_msgs(repr="list")
        # Add all memory.vars if present
        if hasattr(memory, "vars"):
            ctx.update(getattr(memory, "vars", {}))
        return ctx

    def run_llm(self, msgs, **llm_kwargs):
        """
        Execute the LLM call with the given messages.
        !!! USE THE EXISTING LLM CLASS !!!

        Args:
            msgs (list): List of message dicts.
            **llm_kwargs: Additional LLM parameters.

        Returns:
            str: Raw LLM response.
        """
        if self.llm is None:
            raise ValueError("No LLM instance provided to this THOUGHT.")
        # The LLM class is expected to be callable: llm(msgs, **kwargs)
        # If LLM is a class with .call, use that (standard interface)
        if hasattr(self.llm, "call") and callable(getattr(self.llm, "call", None)):
            response = self.llm.call(msgs, llm_kwargs)
        elif hasattr(self.llm, "chat") and callable(getattr(self.llm, "chat", None)):
            response = self.llm.chat(msgs, **llm_kwargs)
        else:
            response = self.llm(msgs, **llm_kwargs)

        # Handle list response from LLM.call() - it returns a list of choices
        if isinstance(response, list):
            response = response[0] if response else ""

        # If response is a dict with 'content', extract it
        if isinstance(response, dict) and "content" in response:
            return response["content"]

        return response

    def parse_response(self, response):
        """
        Parse the LLM response to extract the desired content.

        Args:
            response (str): Raw LLM response.

        Returns:
            object: Parsed result (e.g., string, list, dict).

        Supports:
            - Custom parse_fn callable
            - Schema-based parsing via parsing_rules (uses valid_extract)
            - Built-in parsers: 'text', 'json', 'list'
        """
        # Use custom parse_fn if provided
        if self.parse_fn and callable(self.parse_fn):
            return self.parse_fn(response)

        # Check for schema-based parsing rules (using valid_extract)
        parsing_rules = self.config.get("parsing_rules")
        if parsing_rules:
            try:
                return valid_extract(response, parsing_rules)
            except ValidExtractError as e:
                raise ValueError("Schema-based parsing failed: {}".format(e))

        # Use built-in parser based on config
        parser = self.config.get("parser", None)
        if parser is None:
            # Default: return as string
            return response
        if parser == "text":
            return response
        elif parser == "json":
            import re
            # Remove code fences if present
            text = response.strip()
            text = re.sub(r"^```(?:json)?|```$", "", text, flags=re.MULTILINE).strip()
            # Find first JSON object or array
            match = re.search(r"(\{.*\}|\[.*\])", text, re.DOTALL)
            if match:
                json_str = match.group(1)
                return json.loads(json_str)
            else:
                raise ValueError("No JSON object or array found in response.")
        elif parser == "list":
            import ast, re
            # Find first list literal
            match = re.search(r"(\[.*\])", response, re.DOTALL)
            if match:
                list_str = match.group(1)
                return ast.literal_eval(list_str)
            else:
                raise ValueError("No list found in response.")
        elif callable(parser):
            return parser(response)
        else:
            # Unknown parser, return as is
            return response
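
# NOTE: The "json" parser above combines fence-stripping with a greedy regex
# search for the first JSON object or array. A minimal standalone sketch of
# that approach (illustrative only; `extract_json` is not part of this module):

# ```python
# import json
# import re
#
# def extract_json(text):
#     """Strip Markdown code fences, then parse the first JSON object/array found."""
#     text = re.sub(r"^```(?:json)?|```$", "", text.strip(), flags=re.MULTILINE).strip()
#     match = re.search(r"(\{.*\}|\[.*\])", text, re.DOTALL)
#     if not match:
#         raise ValueError("No JSON object or array found in response.")
#     return json.loads(match.group(1))
#
# raw = '```json\n{"answer": 42, "tags": ["a", "b"]}\n```'
# print(extract_json(raw))  # {'answer': 42, 'tags': ['a', 'b']}
# ```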

    def validate(self, parsed_result):
        """
        Validate the parsed result according to the thought's rules.

        Args:
            parsed_result: The parsed output from the LLM.

        Returns:
            (bool, why): True if valid, False otherwise, and reason string.
        """
        # Use custom validation if provided
        if self.validation and callable(self.validation):
            try:
                valid, why = self.validation(parsed_result)
                return bool(valid), why
            except Exception as e:
                return False, "Validation exception: {}".format(e)
        # Use built-in validator based on config
        validator = self.config.get("validator", None)
        if validator is None or validator == "any":
            return True, ""
        elif isinstance(validator, str):
            if validator.startswith("has_keys:"):
                keys = [k.strip() for k in validator.split(":", 1)[1].split(",")]
                if isinstance(parsed_result, dict):
                    missing = [k for k in keys if k not in parsed_result]
                    if not missing:
                        return True, ""
                    else:
                        return False, "Missing keys: {}".format(missing)
                else:
                    return False, "Result is not a dict"
            elif validator.startswith("list_min_len:"):
                try:
                    min_len = int(validator.split(":", 1)[1])
                except Exception:
                    min_len = 1
                if isinstance(parsed_result, list) and len(parsed_result) >= min_len:
                    return True, ""
                else:
                    return False, "List too short (min {})".format(min_len)
            elif validator == "summary_v1":
                # Example: summary must be a string of at least 10 chars
                if isinstance(parsed_result, str) and len(parsed_result.strip()) >= 10:
                    return True, ""
                else:
                    return False, "Summary too short"
            else:
                return True, ""
        elif callable(validator):
            try:
                valid, why = validator(parsed_result)
                return bool(valid), why
            except Exception as e:
                return False, "Validation exception: {}".format(e)
        else:
            return True, ""
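
# NOTE: The string validators above form a small "spec:args" mini-language
# ("has_keys:a,b", "list_min_len:3"). A standalone sketch of the has_keys
# branch (illustrative only; `check_has_keys` is not part of this module):

# ```python
# def check_has_keys(spec, result):
#     """Validate a dict result against a 'has_keys:a,b' spec string."""
#     keys = [k.strip() for k in spec.split(":", 1)[1].split(",")]
#     if not isinstance(result, dict):
#         return False, "Result is not a dict"
#     missing = [k for k in keys if k not in result]
#     return (not missing), ("Missing keys: {}".format(missing) if missing else "")
#
# print(check_has_keys("has_keys:name, score", {"name": "a", "score": 1}))  # (True, '')
# print(check_has_keys("has_keys:name, score", {"name": "a"}))              # (False, "Missing keys: ['score']")
# ```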

    def update_memory(self, memory, result):
        """
        Update the MEMORY object with the result of this thought.

        Args:
            memory: MEMORY object.
            result: The result to store.

        Returns:
            MEMORY: Updated memory object.
        """
        # Store result in vars or via set_var if available
        varname = self.output_var or ("{}_result".format(self.name) if self.name else "thought_result")
        if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
            memory.set_var(varname, result, desc="Result of thought: {}".format(self.name))
        elif hasattr(memory, "vars"):
            # Fallback: directly access vars dict if set_var not available
            if varname not in memory.vars:
                memory.vars[varname] = []
            stamp = event_stamp(result) if 'event_stamp' in globals() else 'no_stamp'
            memory.vars[varname].append({'object': result, 'stamp': stamp})
        else:
            setattr(memory, varname, result)
        return memory

    def to_dict(self):
        """
        Return a serializable dictionary representation of this THOUGHT.

        Note: The LLM instance, parse_fn, validation, and hooks cannot be serialized,
        so they are represented by type/name only. When deserializing, these must be
        provided separately.

        Returns:
            dict: Serializable representation of this thought.
        """
        return {
            "name": self.name,
            "id": self.id,
            "prompt": self.prompt,
            "operation": self.operation,
            "config": self.config,
            "description": self.description,
            "max_retries": self.max_retries,
            "retry_delay": self.retry_delay,
            "output_var": self.output_var,
            "required_vars": self.required_vars,
            "optional_vars": self.optional_vars,
            "execution_history": self.execution_history,
            # Store metadata about non-serializable items
            "llm_type": type(self.llm).__name__ if self.llm else None,
            "has_parse_fn": self.parse_fn is not None,
            "has_validation": self.validation is not None,
            "has_pre_hook": self.pre_hook is not None,
            "has_post_hook": self.post_hook is not None,
        }

    @classmethod
    def from_dict(cls, data, llm=None, parse_fn=None, validation=None, pre_hook=None, post_hook=None):
        """
        Reconstruct a THOUGHT from a dictionary representation.

        Args:
            data (dict): Dictionary representation of a THOUGHT.
            llm: LLM instance to use (required for execution).
            parse_fn: Optional custom parse function.
            validation: Optional custom validation function.
            pre_hook: Optional pre-execution hook.
            post_hook: Optional post-execution hook.

        Returns:
            THOUGHT: Reconstructed THOUGHT object.
        """
        # Extract config and merge with explicit kwargs
        config = data.get("config", {}).copy()

        thought = cls(
            name=data.get("name"),
            llm=llm,
            prompt=data.get("prompt"),
            operation=data.get("operation"),
            description=data.get("description"),
            max_retries=data.get("max_retries", 1),
            retry_delay=data.get("retry_delay", 0),
            output_var=data.get("output_var"),
            required_vars=data.get("required_vars", []),
            optional_vars=data.get("optional_vars", []),
            parse_fn=parse_fn,
            validation=validation,
            pre_hook=pre_hook,
            post_hook=post_hook,
            **config
        )

        # Restore ID if provided
        if data.get("id"):
            thought.id = data["id"]

        # Restore execution history
        thought.execution_history = data.get("execution_history", [])

        return thought

    def copy(self):
        """
        Return a deep copy of this THOUGHT.

        Note: The LLM instance is shallow-copied (same reference), as LLM
        instances typically should be shared. All other attributes are deep-copied.

        Returns:
            THOUGHT: A new THOUGHT instance with copied attributes.
        """
        import copy as copy_module

        new_thought = THOUGHT(
            name=self.name,
            llm=self.llm,  # Shallow copy - same LLM instance
            prompt=copy_module.deepcopy(self.prompt),
            operation=self.operation,
            description=self.description,
            max_retries=self.max_retries,
            retry_delay=self.retry_delay,
            output_var=self.output_var,
            required_vars=copy_module.deepcopy(self.required_vars),
            optional_vars=copy_module.deepcopy(self.optional_vars),
            parse_fn=self.parse_fn,
            validation=self.validation,
            pre_hook=self.pre_hook,
            post_hook=self.post_hook,
            **copy_module.deepcopy(self.config)
        )

        # Copy internal state
        new_thought.id = event_stamp()  # Generate a new ID for the copy
        new_thought.execution_history = copy_module.deepcopy(self.execution_history)
        new_thought.last_result = copy_module.deepcopy(self.last_result)
        new_thought.last_error = self.last_error
        new_thought.last_prompt = self.last_prompt
        new_thought.last_msgs = copy_module.deepcopy(self.last_msgs)
        new_thought.last_response = self.last_response

        return new_thought

    def __repr__(self):
        """
        Return a detailed string representation of this THOUGHT.

        Returns:
            str: Detailed representation including key attributes.
        """
        return ("THOUGHT(name='{}', operation='{}', "
                "max_retries={}, output_var='{}')".format(
                    self.name, self.operation, self.max_retries, self.output_var))

    def __str__(self):
        """
        Return a human-readable string representation of this THOUGHT.

        Returns:
            str: Simple description of the thought.
        """
        return "Thought: {}".format(self.name or 'unnamed')
|
|
3637
|
+
|
|
3638
|
+
|
|
3639
|
+
|
|
3640
|
+
|
|
3641
|
+
|
|
3642
|
+
|
|
3643
|
+
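The `copy()` method above shallow-copies the LLM handle while deep-copying the rest of the state. A minimal stand-alone sketch of that split, using hypothetical `_Shared`/`_Node` stand-ins rather than the real `THOUGHT`/`LLM` classes:

```python
import copy

class _Shared:
    """Stand-in for a shared resource such as an LLM client (hypothetical)."""
    pass

class _Node:
    def __init__(self, shared, history):
        self.shared = shared    # kept by reference, like THOUGHT.llm
        self.history = history  # deep-copied on copy(), like execution_history

    def copy(self):
        # Shallow-copy the shared resource, deep-copy the mutable state
        return _Node(self.shared, copy.deepcopy(self.history))

a = _Node(_Shared(), [{"success": True}])
b = a.copy()
b.history.append({"success": False})

assert a.shared is b.shared  # same resource instance in both copies
assert len(a.history) == 1   # original history unaffected by the copy's append
assert len(b.history) == 2
```

The design choice is the usual one for handles to external services: the client is expensive and stateless enough to share, while per-instance bookkeeping must stay independent.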
ThoughtClassTests = """
# --- THOUGHT Class Tests ---

# Test 1: Basic THOUGHT instantiation and attributes
>>> from thoughtflow6 import THOUGHT, MEMORY, event_stamp
>>> t = THOUGHT(name="test_thought", prompt="Hello {name}", max_retries=3)
>>> t.name
'test_thought'
>>> t.max_retries
3
>>> t.output_var
'test_thought_result'
>>> t.operation is None  # Defaults to None, which means 'llm_call'
True
>>> len(t.execution_history)
0

# Test 2: Serialization round-trip with to_dict/from_dict
>>> t1 = THOUGHT(name="serialize_test", prompt="test prompt", max_retries=3, output_var="my_output")
>>> data = t1.to_dict()
>>> data['name']
'serialize_test'
>>> data['max_retries']
3
>>> data['output_var']
'my_output'
>>> t2 = THOUGHT.from_dict(data)
>>> t2.name == t1.name
True
>>> t2.max_retries == t1.max_retries
True
>>> t2.output_var == t1.output_var
True

# Test 3: Copy creates independent instance
>>> t1 = THOUGHT(name="copy_test", prompt="original prompt")
>>> t2 = t1.copy()
>>> t2.name = "modified"
>>> t1.name
'copy_test'
>>> t2.name
'modified'
>>> t1.id != t2.id  # Copy gets new ID
True

# Test 4: __repr__ and __str__
>>> t = THOUGHT(name="repr_test", operation="llm_call", max_retries=2, output_var="result")
>>> "repr_test" in repr(t)
True
>>> "llm_call" in repr(t)
True
>>> str(t)
'Thought: repr_test'
>>> t2 = THOUGHT()  # unnamed
>>> str(t2)
'Thought: unnamed'

# Test 5: Memory query operation (no LLM)
>>> mem = MEMORY()
>>> mem.set_var("user_name", "Alice", desc="Test user")
>>> mem.set_var("session_id", "sess123", desc="Test session")
>>> t = THOUGHT(
...     name="query_test",
...     operation="memory_query",
...     required_vars=["user_name", "session_id"]
... )
>>> mem2 = t(mem)
>>> result = mem2.get_var("query_test_result")
>>> result['user_name']
'Alice'
>>> result['session_id']
'sess123'

# Test 6: Variable set operation
>>> mem = MEMORY()
>>> t = THOUGHT(
...     name="setvar_test",
...     operation="variable_set",
...     prompt={"status": "active", "count": 42}
... )
>>> mem2 = t(mem)
>>> mem2.get_var("status")
'active'
>>> mem2.get_var("count")
42

# Test 7: Execution history tracking
>>> mem = MEMORY()
>>> t = THOUGHT(name="history_test", operation="memory_query", required_vars=[])
>>> len(t.execution_history)
0
>>> mem = t(mem)
>>> len(t.execution_history)
1
>>> t.execution_history[0]['success']
True
>>> 'duration_ms' in t.execution_history[0]
True
>>> 'stamp' in t.execution_history[0]
True

# Test 8: Conditional operation
>>> mem = MEMORY()
>>> mem.set_var("threshold", 50)
>>> t = THOUGHT(
...     name="cond_test",
...     operation="conditional",
...     condition=lambda m, ctx: ctx.get('value', 0) > ctx.get('threshold', 0),
...     if_true="above",
...     if_false="below"
... )
>>> mem2 = t(mem, vars={'value': 75})
>>> mem2.get_var("cond_test_result")
'above'
>>> mem3 = t(mem, vars={'value': 25})
>>> mem3.get_var("cond_test_result")
'below'

# Test 9: VALID_OPERATIONS class attribute
>>> 'llm_call' in THOUGHT.VALID_OPERATIONS
True
>>> 'memory_query' in THOUGHT.VALID_OPERATIONS
True
>>> 'variable_set' in THOUGHT.VALID_OPERATIONS
True
>>> 'conditional' in THOUGHT.VALID_OPERATIONS
True

# Test 10: Parse response with parsing_rules (valid_extract integration)
>>> t = THOUGHT(name="parse_test", parsing_rules={"kind": "python", "format": []})
>>> t.parse_response("Here is the list: [1, 2, 3]")
[1, 2, 3]
>>> t2 = THOUGHT(name="parse_dict", parsing_rules={"kind": "python", "format": {"name": "", "count": 0}})
>>> t2.parse_response("Result: {'name': 'test', 'count': 5}")
{'name': 'test', 'count': 5}

# Test 11: Built-in parsers
>>> t = THOUGHT(name="json_test", parser="json")
>>> t.parse_response('Here is JSON: {"key": "value"}')
{'key': 'value'}
>>> t2 = THOUGHT(name="list_test", parser="list")
>>> t2.parse_response("Numbers: [1, 2, 3, 4, 5]")
[1, 2, 3, 4, 5]
>>> t3 = THOUGHT(name="text_test", parser="text")
>>> t3.parse_response("plain text")
'plain text'

# Test 12: Built-in validators
>>> t = THOUGHT(name="val_test", validator="has_keys:name,age")
>>> t.validate({"name": "Alice", "age": 30})
(True, '')
>>> t.validate({"name": "Bob"})
(False, "Missing keys: ['age']")
>>> t2 = THOUGHT(name="list_val", validator="list_min_len:3")
>>> t2.validate([1, 2, 3])
(True, '')
>>> t2.validate([1, 2])
(False, 'List too short (min 3)')
"""


#############################################################################
#############################################################################

### ACTION CLASS


class ACTION:
    """
    The ACTION class encapsulates an external or internal operation that can be invoked
    within a Thoughtflow agent: a single, named action (such as a tool call, API request,
    or function) whose result is stored in the agent's state for later inspection,
    branching, or retry. An ACTION is defined once and can be executed multiple times
    with different parameters; each execution handles logging, error management, and
    result storage in a consistent way.

    Attributes:
        name (str): Identifier for this action, used for logging and storing results.
        id (str): Unique identifier for this action instance (event_stamp).
        fn (callable): The function to execute when this action is called.
        config (dict): Default configuration parameters that will be passed to the function.
        result_key (str): Key where results are stored in memory (defaults to "{name}_result").
        description (str): Human-readable description of what this action does.
        last_result (Any): The most recent result from executing this action.
        last_error (Exception): The most recent error from executing this action, if any.
        execution_count (int): Number of times this action has been executed.
        execution_history (list): Full execution history with timing and success/error tracking.

    Methods:
        __init__(name, fn, config=None, result_key=None, description=None):
            Initializes an ACTION with a name, function, and optional configuration.

        __call__(memory, **kwargs):
            Executes the action function with the memory object and any override parameters.
            The function receives (memory, **merged_kwargs) where merged_kwargs combines
            self.config with any call-specific kwargs.

            Returns the memory object with results stored via set_var.
            Logs execution details with JSON-formatted event data.
            Tracks execution timing and history.

            Handles exceptions during execution by logging them rather than raising them,
            allowing the workflow to continue and decide how to handle failures.

        get_last_result():
            Returns the most recent result from executing this action.

        was_successful():
            Returns True if the last execution was successful, False otherwise.

        reset_stats():
            Resets execution statistics (count, last_result, last_error, execution_history).

        copy():
            Returns a copy of this ACTION with a new ID and reset statistics.

        to_dict():
            Returns a serializable dictionary representation of this action.

        from_dict(cls, data, fn_registry):
            Class method to reconstruct an ACTION from a dictionary representation.

    Example Usage:
        # Define a web search action
        def search_web(memory, query, max_results=3):
            # Implementation of web search
            results = web_api.search(query, limit=max_results)
            return {"status": "success", "hits": results}

        search_action = ACTION(
            name="web_search",
            fn=search_web,
            config={"max_results": 5},
            description="Searches the web for information"
        )

        # Execute the action
        memory = MEMORY()
        memory = search_action(memory, query="thoughtflow framework")

        # Access results
        result = memory.get_var("web_search_result")

        # Check execution history
        print(search_action.execution_history[-1]['duration_ms'])  # Execution time
        print(search_action.execution_history[-1]['success'])      # True/False

    Design Principles:
        1. Explicit and inspectable operations with consistent logging
        2. Predictable result storage via memory.set_var
        3. Error handling that doesn't interrupt workflow execution
        4. Composability with other Thoughtflow components (MEMORY, THOUGHT)
        5. Serialization support for reproducibility
        6. Full execution history with timing for debugging and optimization
    """

    def __init__(self, name, fn, config=None, result_key=None, description=None):
        """
        Initialize an ACTION with a name, function, and optional configuration.

        Args:
            name (str): Identifier for this action, used for logging and result storage.
            fn (callable): The function to execute when this action is called.
            config (dict, optional): Default configuration parameters passed to the function.
            result_key (str, optional): Key where results are stored in memory (defaults to "{name}_result").
            description (str, optional): Human-readable description of what this action does.
        """
        self.name = name
        self.id = event_stamp()  # Unique identifier for this action instance
        self.fn = fn
        self.config = config or {}
        self.result_key = result_key or "{}_result".format(name)
        self.description = description or "Action: {}".format(name)
        self.last_result = None
        self.last_error = None
        self.execution_count = 0
        self.execution_history = []  # Full execution tracking with timing

    def __call__(self, memory, **kwargs):
        """
        Execute the action function with the memory object and any override parameters.

        Args:
            memory (MEMORY): The memory object to update with results.
            **kwargs: Parameters that override the default config for this execution.

        Returns:
            MEMORY: The updated memory object with results stored in memory.vars[result_key].

        Note:
            The function receives (memory, **merged_kwargs) where merged_kwargs combines
            self.config with any call-specific kwargs.

            Exceptions during execution are logged rather than raised, allowing the
            workflow to continue and decide how to handle failures.
        """
        import time as time_module

        start_time = time_module.time()

        # Merge default config with call-specific kwargs
        merged_kwargs = {**self.config, **kwargs}
        self.execution_count += 1

        try:
            # Execute the function
            result = self.fn(memory, **merged_kwargs)
            self.last_result = result
            self.last_error = None

            # Calculate execution duration
            duration_ms = (time_module.time() - start_time) * 1000

            # Store result in memory using set_var (correct API)
            if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
                memory.set_var(self.result_key, result, desc="Result of action: {}".format(self.name))

            # Build execution event for logging (JSON format like THOUGHT)
            execution_event = {
                'action_name': self.name,
                'action_id': self.id,
                'status': 'success',
                'duration_ms': round(duration_ms, 2),
                'result_key': self.result_key
            }

            # Log successful execution (single message with JSON, no invalid details param)
            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Action execution complete: " + json.dumps(execution_event))

            # Track execution history
            self.execution_history.append({
                'stamp': event_stamp(),
                'memory_id': getattr(memory, 'id', None),
                'duration_ms': duration_ms,
                'success': True,
                'error': None
            })

        except Exception as e:
            # Handle and log exceptions
            self.last_error = e

            # Calculate execution duration
            duration_ms = (time_module.time() - start_time) * 1000

            # Build error event for logging
            error_event = {
                'action_name': self.name,
                'action_id': self.id,
                'status': 'error',
                'error': str(e),
                'duration_ms': round(duration_ms, 2),
                'result_key': self.result_key
            }

            # Log failed execution (single message with JSON)
            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
                memory.add_log("Action execution failed: " + json.dumps(error_event))

            # Store error info in memory using set_var
            if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
                memory.set_var(self.result_key, error_event, desc="Error in action: {}".format(self.name))

            # Track execution history
            self.execution_history.append({
                'stamp': event_stamp(),
                'memory_id': getattr(memory, 'id', None),
                'duration_ms': duration_ms,
                'success': False,
                'error': str(e)
            })

        return memory

    def get_last_result(self):
        """
        Returns the most recent result from executing this action.

        Returns:
            Any: The last result or None if the action hasn't been executed.
        """
        return self.last_result

    def was_successful(self):
        """
        Returns True if the last execution was successful, False otherwise.

        Returns:
            bool: True if the last execution completed without errors, False otherwise.
        """
        return self.last_error is None and self.execution_count > 0

    def reset_stats(self):
        """
        Resets execution statistics (count, last_result, last_error, execution_history).

        Returns:
            ACTION: Self for method chaining.
        """
        self.execution_count = 0
        self.last_result = None
        self.last_error = None
        self.execution_history = []
        return self

    def copy(self):
        """
        Return a copy of this ACTION with a new ID.

        The function reference is shared (same callable), but config is copied.
        Execution statistics are reset in the copy.

        Returns:
            ACTION: A new ACTION instance with copied attributes and new ID.
        """
        new_action = ACTION(
            name=self.name,
            fn=self.fn,  # Same function reference
            config=self.config.copy() if self.config else None,
            result_key=self.result_key,
            description=self.description
        )
        # New ID is already assigned in __init__, no need to set it
        return new_action

    def to_dict(self):
        """
        Returns a serializable dictionary representation of this action.

        Note: The function itself cannot be serialized, so it's represented by name.
        When deserializing, a function registry must be provided.

        Returns:
            dict: Serializable representation of this action.
        """
        return {
            "name": self.name,
            "id": self.id,
            "fn_name": self.fn.__name__,
            "config": self.config,
            "result_key": self.result_key,
            "description": self.description,
            "execution_count": self.execution_count,
            "execution_history": self.execution_history
        }

    @classmethod
    def from_dict(cls, data, fn_registry):
        """
        Reconstruct an ACTION from a dictionary representation.

        Args:
            data (dict): Dictionary representation of an ACTION.
            fn_registry (dict): Dictionary mapping function names to function objects.

        Returns:
            ACTION: Reconstructed ACTION object.

        Raises:
            KeyError: If the function name is not found in the registry.
        """
        if data["fn_name"] not in fn_registry:
            raise KeyError("Function '{}' not found in registry".format(data['fn_name']))

        action = cls(
            name=data["name"],
            fn=fn_registry[data["fn_name"]],
            config=data["config"],
            result_key=data["result_key"],
            description=data["description"]
        )
        # Restore ID if provided, otherwise keep the new one from __init__
        if data.get("id"):
            action.id = data["id"]
        action.execution_count = data.get("execution_count", 0)
        action.execution_history = data.get("execution_history", [])
        return action

    def __str__(self):
        """
        Returns a string representation of this action.

        Returns:
            str: String representation.
        """
        return "ACTION({}, desc='{}', executions={})".format(self.name, self.description, self.execution_count)

    def __repr__(self):
        """
        Returns a detailed string representation of this action.

        Returns:
            str: Detailed string representation.
        """
        return ("ACTION(name='{}', fn={}, "
                "config={}, result_key='{}', "
                "description='{}', execution_count={})".format(
                    self.name, self.fn.__name__, self.config,
                    self.result_key, self.description, self.execution_count))

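ACTION.__call__ merges self.config with call-site kwargs (call-site values win) and converts exceptions into an error record instead of raising. A self-contained sketch of that pattern — `run_action` and `greet` here are illustrative stand-ins, not part of the library:

```python
import time

def run_action(fn, defaults, memory, **overrides):
    """Sketch of ACTION.__call__'s core flow: merge config, time the call,
    and capture exceptions as an error record rather than raising them."""
    merged = {**defaults, **overrides}  # call-site kwargs override defaults
    start = time.time()
    try:
        result = fn(memory, **merged)
        record = {"status": "success", "result": result}
    except Exception as e:
        record = {"status": "error", "error": str(e)}
    record["duration_ms"] = round((time.time() - start) * 1000, 2)
    return record

def greet(memory, name, punctuation="!"):
    return "Hello, {}{}".format(name, punctuation)

ok = run_action(greet, {"punctuation": "?"}, {}, name="Ada")
assert ok["status"] == "success" and ok["result"] == "Hello, Ada?"

bad = run_action(greet, {}, {})  # missing required 'name' -> captured, not raised
assert bad["status"] == "error"
```

The swallow-and-record choice matches Design Principle 3: a failed tool call becomes data the workflow can branch on, rather than an exception that halts it.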
### ACTION CLASS TESTS

ActionClassTests = """
# --- ACTION Class Tests ---

"""

#############################################################################
#############################################################################