thoughtflow 0.0.2-py3-none-any.whl → 0.0.4-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,4180 @@
1
+
2
+ DeveloperContext = """
3
+
4
+ # The Zen of Thoughtflow
5
+
6
+ The **Zen of Thoughtflow** is a set of guiding principles for building a framework that prioritizes simplicity, clarity, and flexibility. Thoughtflow is not meant to be a rigid system but a tool that helps developers create and explore freely. It's designed to stay light, modular, and focused, with Python at its core. The goal is to reduce complexity, maintain transparency, and ensure that functionality endures over time. Thoughtflow isn't about trying to please everyone—it's about building a tool that serves its purpose well, allowing developers to focus on their own path.
7
+
8
+ ---
9
+
10
+ ### 1. First Principles First
11
+ Thoughtflow is built on fundamental, simple concepts. Each piece should start with core truths, avoiding the temptation to build on excessive abstractions.
12
+
13
+ ### 2. Complexity is the Enemy
14
+ Keep it simple. Thoughtflow should be Pythonic, intuitive, and elegant. Let ease of use guide every decision, ensuring the library remains as light as possible.
15
+
16
+ ### 3. Obvious Over Abstract
17
+ If the user has to dig deep to understand what's going on, the design has failed. Everything should naturally reveal its purpose and operation.
18
+
19
+ ### 4. Transparency is Trust
20
+ Thoughtflow must operate transparently. Users should never have to guess what's happening under the hood—understanding empowers, while opacity frustrates.
21
+
22
+ ### 5. Backward Compatibility is Sacred
23
+ Code should endure. Deprecation should be rare, and backward compatibility must be respected to protect users' investments in their existing work.
24
+
25
+ ### 6. Flexibility Over Rigidity
26
+ Provide intelligent defaults, but allow users infinite possibilities. Thoughtflow should never micromanage the user's experience—give them the freedom to define their journey.
27
+
28
+ ### 7. Minimize Dependencies, Pack Light
29
+ Thoughtflow should rely only on minimal, light libraries. Keep the dependency tree shallow, and ensure it's always feasible to deploy the library in serverless architectures.
30
+
31
+ ### 8. Clarity Over Cleverness
32
+ Documentation, code, and design must be explicit and clear, not implicit or convoluted. Guide users, both beginners and experts, with straightforward tutorials and examples.
33
+
34
+ ### 9. Modularity Over Monolith
35
+ Thoughtflow should be a collection of lightweight, composable pieces. Never force the user into an all-or-nothing approach—each component should be able to stand alone. Every builder loves Lego bricks.
36
+
37
+ ### 10. Accommodate Both Beginners and Experts
38
+ Thoughtflow should grow with its users. Provide frictionless onboarding for beginners while offering flexibility for advanced users to scale and customize as needed.
39
+
40
+ ### 11. Make a Vehicle, Not a Destination
41
+ Thoughtflow should focus on the structuring and intelligent sequencing of user-defined thoughts. Classes should be as generalizable as possible, and logic should be easily exported and imported via thought files.
42
+
43
+ ### 12. Good Documentation Accelerates Usage
44
+ Documentation and tutorials must be clear, comprehensive, and always up-to-date. They should guide users at every turn, ensuring knowledge is readily available.
45
+
46
+ ### 13. Don't Try to Please Everyone
47
+ Thoughtflow is focused and light. It isn't designed to accommodate every possible use case, and that's intentional. Greatness comes from focus, not from trying to do everything.
48
+
49
+ ### 14. Python is King
50
+ Thoughtflow is built to be Pythonic. Python is the first-class citizen, and every integration, feature, and extension should honor Python's language and philosophy.
51
+
52
+ ---
53
+
54
+ Thoughtflow is designed to be a sophisticated AI agent framework for building
55
+ intelligent, memory-aware systems that can think, act, and maintain persistent
56
+ state.
57
+
58
+ ---
59
+
60
+
61
+ # Thoughtflow — Design Document (Plain-English Spec for a Single-File Base Implementation)
62
+
63
+ This document explains **exactly** how to engineer Thoughtflow in simple, idiomatic Python. It is meant to live at the top of a *single Python script* that defines the foundational classes and helper functions. It is written for a reader with **zero** prior exposure to Thoughtflow.
64
+
65
+ Thoughtflow is a **Pythonic cognitive engine**. You write ordinary Python—`for`/`while`, `if/elif/else`, `try/except`, and small classes—no graphs, no hidden DSLs. A *flow* is "just a function" that accepts a `MEMORY` object and returns that same `MEMORY` object, modified. Cognition is built from four primitives:
66
+
67
+ 1. **LLM** — A tiny wrapper around a chat-style language model API.
68
+ 2. **MEMORY** — The single state container that keeps messages, events, logs, reflections, and variables.
69
+ 3. **THOUGHT** — The unit of cognition: Prompt + Context + LLM + Parsing + Validation (+ Retries + Logging).
70
+ 4. **ACTION** — Anything the agent *does* (respond, call an HTTP API, write a file, query a vector store, etc.), with consistent logging.
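The flow contract that ties these primitives together can be sketched in a few lines. This is a minimal illustration, not library code: a plain dict stands in for `MEMORY`, and `greet_flow` is a hypothetical user-defined flow.

```python
# A flow is just a function: MEMORY in, the same MEMORY out, modified.
# A plain dict stands in for MEMORY here; the real class adds messages,
# events, logs, and variables, but the calling convention is identical.

def greet_flow(memory):
    name = memory.get("vars", {}).get("name", "world")
    memory.setdefault("messages", []).append(
        {"role": "assistant", "content": "Hello, " + name + "!"}
    )
    return memory  # always hand back the same object

mem = {"vars": {"name": "Ada"}}
out = greet_flow(mem)
```

Ordinary `for`/`while`/`if` around calls like this is the whole "graph": control flow stays in plain Python.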
71
+
72
+ The rest of this spec describes **design philosophy**, **object contracts**, **method/attribute lists**, **data conventions**, and **how everything fits together**—plus example usage that the finished library should support.
73
+
74
+ ---
75
+
76
+ ## Final Notes on Style
77
+
78
+ * Keep constructors short and forgiving; let users pass just a few arguments.
79
+ * Prefer small, pure helpers (parsers/validators) over big class hierarchies.
80
+ * Do not hide failures; always leave a visible trace in `logs` and `events`.
81
+ * Default behaviors should serve 90% of use cases; exotic needs belong in user code.
82
+
83
+ """
84
+
85
+ #############################################################################
86
+ #############################################################################
87
+
88
+ ### IMPORTS AND SETTINGS
89
+
90
+ import os, sys, time, pickle, json, uuid
91
+ import http, urllib, socket, ssl, gzip, copy
92
+ import urllib.request
93
+ import pprint
94
+ import random
95
+ import re, ast
96
+ from typing import Mapping, Any, Iterable, Optional, Tuple, Union
97
+
98
+ import hashlib  # time and pickle are already imported above
99
+ from random import randint
100
+ from functools import reduce
101
+
102
+ import datetime as dtt
103
+ from zoneinfo import ZoneInfo
104
+
105
+ tz_bog = ZoneInfo("America/Bogota")
106
+ tz_utc = ZoneInfo("UTC")
107
+
108
+
109
+ #############################################################################
110
+ #############################################################################
111
+
112
+ ### EVENT STAMP LOGIC
113
+
114
+ class EventStamp:
115
+ """
116
+ Generates and decodes deterministic event stamps using Base62 encoding.
117
+
118
+ Event stamps combine encoded time, document hash, and random components
119
+ into a compact 16-character identifier.
120
+
121
+ Usage:
122
+ EventStamp.stamp() # Generate a new stamp
123
+ EventStamp.decode_time(s) # Decode timestamp from stamp
124
+ EventStamp.hashify("text") # Generate deterministic hash
125
+ """
126
+
127
+ CHARSET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
128
+
129
+ def sha256_hash(input_string):
130
+ """Generate a SHA-256 hash and return it as an integer."""
131
+ hash_bytes = hashlib.sha256(input_string.encode("utf-8")).digest()
132
+ return int.from_bytes(hash_bytes, byteorder="big")
133
+
134
+ def base62_encode(number, length):
135
+ """Encode an integer into a fixed-length Base62 string."""
136
+ base = len(EventStamp.CHARSET)
137
+ encoded = []
138
+ for _ in range(length):
139
+ number, remainder = divmod(number, base)
140
+ encoded.append(EventStamp.CHARSET[remainder])
141
+ return ''.join(encoded[::-1]) # Reverse to get correct order
142
+
143
+ def hashify(input_string, length=32):
144
+ """Generate a deterministic hash using all uppercase/lowercase letters and digits."""
145
+ hashed_int = EventStamp.sha256_hash(input_string)
146
+ return EventStamp.base62_encode(hashed_int, length)
147
+
148
+ def encode_num(num, charset=None):
149
+ """Encode a number in the given base/charset."""
150
+ if charset is None:
151
+ charset = EventStamp.CHARSET
152
+ base = len(charset)
153
+ if num < base:
154
+ return charset[num]
155
+ else:
156
+ return EventStamp.encode_num(num // base, charset) + charset[num % base]
157
+
158
+ def decode_num(encoded_str, charset=None):
159
+ """Decode a base-encoded string back to an integer."""
160
+ if charset is None:
161
+ charset = EventStamp.CHARSET
162
+ base = len(charset)
163
+ char_to_value = {c: i for i, c in enumerate(charset)}
164
+ return reduce(lambda num, c: num * base + char_to_value[c], encoded_str, 0)
165
+
166
+ def encode_time(unix_time=0):
167
+ """Encode current or given unix time."""
168
+ if unix_time == 0:
169
+ t = int(time.time() * 10000)
170
+ else:
171
+ t = int(unix_time * 10000)
172
+ return EventStamp.encode_num(t)
173
+
174
+ def encode_doc(doc={}):
175
+ """Encode a document/value to a 5-character hash."""
176
+ return EventStamp.hashify(str(doc), 5)
177
+
178
+ def encode_rando(length=3):
179
+ """Generate a random code of specified length."""
180
+ n = randint(300000, 900000)
181
+ c = '000' + EventStamp.encode_num(n)
182
+ return c[-length:]
183
+
184
+ def stamp(doc={}):
185
+ """
186
+ Generate an event stamp.
187
+
188
+ Combines encoded time, document hash, and random component
189
+ into a 16-character identifier.
190
+ """
191
+ time_code = EventStamp.encode_time()
192
+ rando_code = EventStamp.encode_rando()
193
+ if len(str(doc)) > 2:
194
+ doc_code = EventStamp.encode_doc(doc)
195
+ else:
196
+ arb = time_code + rando_code
197
+ doc_code = EventStamp.encode_doc(arb)
198
+ return (time_code + doc_code + rando_code)[:16]
199
+
200
+ def decode_time(stamp, charset=None):
201
+ """Decode the time component from an event stamp."""
202
+ if charset is None:
203
+ charset = EventStamp.CHARSET
204
+ stamp_prefix = stamp[:8]
205
+ scaled_time = EventStamp.decode_num(stamp_prefix, charset)
206
+ unix_time_seconds = scaled_time / 10000
207
+ return unix_time_seconds
208
+
209
+
210
+ # Backwards compatibility aliases
211
+ event_stamp = EventStamp.stamp
212
+ hashify = EventStamp.hashify
213
+ encode_num = EventStamp.encode_num
214
+ decode_num = EventStamp.decode_num
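As a standalone sanity check of the base-62 scheme, the encode/decode pair round-trips any non-negative integer. This snippet reimplements the two helpers with the same charset and the same recursion/fold so it runs on its own:

```python
from functools import reduce

CHARSET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

def b62_encode(num):
    # Same recursive scheme as EventStamp.encode_num.
    if num < 62:
        return CHARSET[num]
    return b62_encode(num // 62) + CHARSET[num % 62]

def b62_decode(s):
    # Same fold as EventStamp.decode_num.
    lookup = {c: i for i, c in enumerate(CHARSET)}
    return reduce(lambda n, c: n * 62 + lookup[c], s, 0)

# Round-trip: encode then decode recovers the number.
for n in (0, 61, 62, 1234567890):
    assert b62_decode(b62_encode(n)) == n
```

`decode_time` relies on exactly this inverse: the first 8 stamp characters decode back to unix time × 10000.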
215
+
216
+ #############################################################################
217
+ #############################################################################
218
+
219
+ ### HELPER FUNCTIONS
220
+
221
+
222
+ default_header = '''
223
+ Markers like <start … l4zk> and </end … l4zk>
224
+ indicate where a text section begins and ends.
225
+ Never mix boundaries. Each block is separate.
226
+ This is to improve your ease-of-reading.
227
+ '''
228
+
229
+ def construct_prompt(
230
+ prompt_obj = {},
231
+ order = [],
232
+ header = '',
233
+ ):
234
+ if order: sections = list(order)
235
+     else: sections = list(prompt_obj)
236
+ rnum = str(randint(1,9))
237
+ stamp = event_stamp()[-4:].lower()
238
+ stamp = stamp[:2]+rnum+stamp[2:]
239
+ L = []
240
+ if header:
241
+ if header=='default':
242
+ L.append(default_header+'\n')
243
+ else:
244
+ L.append(header+'\n\n')
245
+ L.append('<start prompt stamp>\n\n')
246
+ for s in sections:
247
+ text = prompt_obj[s]
248
+ s2 = s.strip().replace(' ','_')
249
+ label1 = "<start "+s2+" stamp>\n"
250
+ label2 = "\n</end "+s2+" stamp>\n\n"
251
+ block = label1 + text + label2
252
+ L.append(block)
253
+ L.append('</end prompt stamp>')
254
+ prompt = ''.join(L).replace(' stamp>',' '+stamp+'>')
255
+ return prompt
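To make the stamped-marker convention concrete, here is a deterministic miniature of the builder above (fixed stamp instead of the random one, and no header handling). `mini_construct_prompt` is illustrative only, not part of the library:

```python
def mini_construct_prompt(prompt_obj, stamp="ab3cd"):
    # Same block convention as construct_prompt, but with a fixed stamp
    # so the output is reproducible.
    parts = ["<start prompt {}>\n\n".format(stamp)]
    for section, text in prompt_obj.items():
        name = section.strip().replace(' ', '_')
        parts.append("<start {0} {1}>\n{2}\n</end {0} {1}>\n\n".format(name, stamp, text))
    parts.append("</end prompt {}>".format(stamp))
    return ''.join(parts)

p = mini_construct_prompt({"task": "Summarize the text.", "style guide": "Be brief."})
```

Section names are underscored, and every block (and the prompt as a whole) is wrapped in matching `<start …>`/`</end …>` markers carrying the same stamp.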
256
+
257
+ def construct_msgs(
258
+ usr_prompt = '',
259
+ vars = {},
260
+ sys_prompt = '',
261
+     msgs = None,
262
+ ):
263
+     # A mutable [] default would be shared across calls and silently
+     # accumulate messages, since Python evaluates default args only once.
+     if msgs is None: msgs = []
+     if sys_prompt:
264
+         if isinstance(sys_prompt, dict):
265
+ sys_prompt = construct_prompt(sys_prompt)
266
+ m = {'role':'system','content':sys_prompt}
267
+ msgs.insert(0,m)
268
+ if usr_prompt:
269
+         if isinstance(usr_prompt, dict):
270
+ usr_prompt = construct_prompt(usr_prompt)
271
+ m = {'role':'user','content':usr_prompt}
272
+ msgs.append(m)
281
+ msgs2 = []
282
+ for m in msgs:
283
+ m_copy = dict(m)
284
+ if isinstance(m_copy.get("content"), str):
285
+ for k, v in vars.items():
286
+ m_copy["content"] = m_copy["content"].replace(k, str(v))
287
+ msgs2.append(m_copy)
288
+ return msgs2
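Variable substitution here is plain `str.replace` over shallow copies of each message, so the placeholder syntax is whatever the caller chooses. A self-contained sketch of that step:

```python
msgs = [{"role": "user", "content": "Hi {NAME}, the total is {TOTAL}."}]
subs = {"{NAME}": "Ada", "{TOTAL}": 42}

out = []
for m in msgs:
    m2 = dict(m)  # shallow copy: the original messages stay untouched
    for k, v in subs.items():
        m2["content"] = m2["content"].replace(k, str(v))
    out.append(m2)
```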
289
+
290
+
291
+
292
+ #############################################################################
293
+
294
+
295
+ class ValidExtractError(ValueError):
296
+ """Raised when extraction or validation fails."""
297
+
298
+ def valid_extract(raw_text: str, parsing_rules: Mapping[str, Any]) -> Any:
299
+ """
300
+ Extract and validate a target Python structure from noisy LLM text.
301
+
302
+ Parameters
303
+ ----------
304
+ raw_text : str
305
+ The original model output (may include extra prose, code fences, etc.).
306
+ parsing_rules : dict
307
+ Rules controlling extraction/validation. Required keys:
308
+         - 'kind': 'python' (default) or 'json'.
309
+ - 'format': schema describing the expected structure, e.g. [], {}, {'name': ''}, {'num_list': [], 'info': {}}
310
+
311
+ Schema language:
312
+ * [] : list of anything
313
+ * [schema] : list of items matching 'schema'
314
+ * {} : dict of anything
315
+ * {'k': sch} : dict with required key 'k' matching 'sch'
316
+ * {'k?': sch} : OPTIONAL key 'k' (if present, must match 'sch')
317
+ * '' or str : str
318
+ * 0 or int : int
319
+ * 0.0 or float: float
320
+ * True/False or bool: bool
321
+ * None : NoneType
322
+
323
+ Returns
324
+ -------
325
+ Any
326
+ The parsed Python object that satisfies the schema.
327
+
328
+ Raises
329
+ ------
330
+ ValidExtractError
331
+ If extraction fails or the parsed object does not validate against the schema.
332
+
333
+ Examples
334
+ --------
335
+ >>> rules = {'kind': 'python', 'format': []}
336
+ >>> txt = "Here you go:\\n```python\\n[1, 2, 3]\\n```\\nLet me know!"
337
+ >>> valid_extract(txt, rules)
338
+ [1, 2, 3]
339
+
340
+ >>> rules = {'kind': 'python', 'format': {'num_list': [], 'my_info': {}, 'name': ''}}
341
+ >>> txt = "noise { 'num_list':[1,2], 'my_info':{'x':1}, 'name':'Ada' } trailing"
342
+ >>> valid_extract(txt, rules)
343
+ {'num_list': [1, 2], 'my_info': {'x': 1}, 'name': 'Ada'}
344
+ """
345
+ if not isinstance(parsing_rules, Mapping):
346
+ raise ValidExtractError("parsing_rules must be a mapping.")
347
+
348
+ kind = parsing_rules.get("kind", "python")
349
+ schema = parsing_rules.get("format", None)
350
+ if schema is None:
351
+ raise ValidExtractError("parsing_rules['format'] is required.")
352
+
353
+ # 1) Collect candidate text segments in a robust order.
354
+ candidates: Iterable[str] = _candidate_segments(raw_text, schema, prefer_fences_first=True)
355
+
356
+ last_err: Optional[Exception] = None
357
+ for segment in candidates:
358
+ try:
359
+ obj = _parse_segment(segment, kind=kind)
360
+ except Exception as e:
361
+ last_err = e
362
+ continue
363
+
364
+ ok, msg = _validate_schema(obj, schema)
365
+ if ok:
366
+ return obj
367
+ last_err = ValidExtractError("Validation failed for candidate: {}".format(msg))
368
+
369
+ # If we got here, nothing parsed+validated.
370
+ if last_err:
371
+ raise ValidExtractError(str(last_err))
372
+ raise ValidExtractError("No parseable candidates found.")
373
+
374
+ # ----------------------------
375
+ # Parsing helpers
376
+ # ----------------------------
377
+
378
+ # Fence regex: accepts inline fences and an optional language tag.
379
+ _FENCE_RE = re.compile(
380
+ r"```(?P<lang>[a-zA-Z0-9_\-\.]*)\s*\n?(?P<body>.*?)```",
381
+ re.DOTALL
382
+ )
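The pattern matches both multi-line and inline fences, with or without a language tag. A quick self-contained check using the same regex:

```python
import re

FENCE_RE = re.compile(
    r"```(?P<lang>[a-zA-Z0-9_\-\.]*)\s*\n?(?P<body>.*?)```",
    re.DOTALL
)

text = "intro ```python\n[1, 2]\n``` middle ```{'a': 1}``` end"
blocks = [m.group("body") for m in FENCE_RE.finditer(text)]
```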
383
+
384
+ def _candidate_segments(raw_text: str, schema: Any, prefer_fences_first: bool = True) -> Iterable[str]:
385
+ """
386
+ Yield candidate substrings likely to contain the target structure.
387
+
388
+ Strategy:
389
+ 1) Fenced code blocks (```) first, in order, if requested.
390
+ 2) Balanced slice for the top-level delimiter suggested by the schema.
391
+ 3) As a fallback, return raw_text itself (last resort).
392
+ """
393
+ # 1) From code fences
394
+ if prefer_fences_first:
395
+ for m in _FENCE_RE.finditer(raw_text):
396
+             body = m.group("body")
397
+             # Yield every fenced body in order; the declared language tag
398
+             # is not used to filter or rank candidates.
399
+             yield body
400
+
401
+ # 2) From balanced slice based on schema's top-level delimiter
402
+ opener, closer = _delims_for_schema(schema)
403
+ if opener and closer:
404
+ slice_ = _balanced_slice(raw_text, opener, closer)
405
+ if slice_ is not None:
406
+ yield slice_
407
+
408
+ # 3) Whole text (very last resort)
409
+ yield raw_text
410
+
411
+ def _parse_segment(segment: str, kind: str = "python") -> Any:
412
+ """
413
+ Parse a segment into a Python object according to 'kind'.
414
+ - python: ast.literal_eval
415
+ - json: json.loads (with fallback: try literal_eval if JSON fails, for LLM single-quote dicts)
416
+ """
417
+ text = segment.strip()
418
+
419
+ if kind == "python":
420
+ # Remove leading language hints often kept when copying from fences
421
+ if text.startswith("python\n"):
422
+ text = text[len("python\n") :].lstrip()
423
+ return ast.literal_eval(text)
424
+
425
+ if kind == "json":
426
+ try:
427
+ return json.loads(text)
428
+ except json.JSONDecodeError:
429
+ # LLMs often return Python-style dicts (single quotes). Try literal_eval as a fallback.
430
+ return ast.literal_eval(text)
431
+
432
+ raise ValidExtractError("Unsupported kind: {!r}".format(kind))
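The json branch above deliberately falls back to `ast.literal_eval`, since models often emit Python-style single-quoted dicts that strict JSON rejects. A standalone sketch of that fallback:

```python
import ast
import json

def parse_lenient_json(text):
    # Strict JSON first; fall back to literal_eval for single-quoted dicts.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)

strict = parse_lenient_json('{"a": 1}')
loose = parse_lenient_json("{'a': 1}")  # JSON fails, literal_eval succeeds
```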
433
+
434
+ def _delims_for_schema(schema: Any) -> Tuple[Optional[str], Optional[str]]:
435
+ """
436
+ Infer top-level delimiters from the schema.
437
+ - list-like → [ ]
438
+ - dict-like → { }
439
+ - tuple-like (if used) → ( )
440
+ - string/number/bool/None → no delimiters (None, None)
441
+ """
442
+ # list
443
+ if isinstance(schema, list):
444
+ return "[", "]"
445
+ # dict
446
+ if isinstance(schema, dict):
447
+ return "{", "}"
448
+ # tuple schema (rare, but supported)
449
+ if isinstance(schema, tuple):
450
+ return "(", ")"
451
+ # primitives: cannot infer a unique delimiter—return None
452
+ return None, None
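In other words, the schema's top-level container type decides which bracket pair `_balanced_slice` scans for. A compact standalone equivalent of the mapping:

```python
def delims(schema):
    # Mirrors _delims_for_schema: container type → bracket pair.
    if isinstance(schema, list):
        return "[", "]"
    if isinstance(schema, dict):
        return "{", "}"
    if isinstance(schema, tuple):
        return "(", ")"
    return None, None  # primitives: no unique delimiter to scan for
```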
453
+
454
+
455
+ def _balanced_slice(text: str, open_ch: str, close_ch: str) -> Optional[str]:
456
+ """
457
+ Return the first balanced substring between open_ch and close_ch,
458
+ scanning from the *first occurrence of open_ch* (so prose apostrophes
459
+ before the opener don't confuse quote tracking).
460
+ """
461
+ start = text.find(open_ch)
462
+ if start == -1:
463
+ return None
464
+
465
+ depth = 0
466
+ in_str: Optional[str] = None # quote char if inside ' or "
467
+ escape = False
468
+ i = start
469
+
470
+ while i < len(text):
471
+ ch = text[i]
472
+ if in_str:
473
+ if escape:
474
+ escape = False
475
+ elif ch == "\\":
476
+ escape = True
477
+ elif ch == in_str:
478
+ in_str = None
479
+ else:
480
+ if ch in ("'", '"'):
481
+ in_str = ch
482
+ elif ch == open_ch:
483
+ depth += 1
484
+ elif ch == close_ch and depth > 0:
485
+ depth -= 1
486
+ if depth == 0:
487
+ return text[start : i + 1]
488
+ i += 1
489
+ return None
490
+
491
+
492
+ # ----------------------------
493
+ # Schema validation
494
+ # ----------------------------
495
+
496
+ def _is_optional_key(k: str) -> Tuple[str, bool]:
497
+ """Return (base_key, optional_flag) for keys with a trailing '?'."""
498
+ if isinstance(k, str) and k.endswith("?"):
499
+ return k[:-1], True
500
+ return k, False
501
+
502
+ def _schema_type(schema: Any) -> Union[type, Tuple[type, ...], None]:
503
+ """
504
+ Map schema exemplars to Python types.
505
+ Accepts either exemplar values ('' -> str, 0 -> int, 0.0 -> float, True -> bool, None -> NoneType)
506
+ OR actual types (str, int, float, bool).
507
+ """
508
+ if schema is None:
509
+ return type(None)
510
+ if schema is str or isinstance(schema, str):
511
+ return str
512
+ if schema is int or (isinstance(schema, int) and not isinstance(schema, bool)):
513
+ return int
514
+ if schema is float or isinstance(schema, float):
515
+ return float
516
+ if schema is bool or isinstance(schema, bool):
517
+ return bool
518
+ if schema is list:
519
+ return list
520
+ if schema is dict:
521
+ return dict
522
+ if schema is tuple:
523
+ return tuple
524
+ return None # composite or unknown marker
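The subtle case is booleans: `True` is an `int` in Python, so the exemplar mapping must test bool explicitly, which is what the `not isinstance(schema, bool)` guard above does. A standalone version of the primitive mapping (checks reordered but equivalent):

```python
def exemplar_type(schema):
    # bool before int: isinstance(True, int) is True in Python.
    if schema is None:
        return type(None)
    if isinstance(schema, bool):
        return bool
    if isinstance(schema, int):
        return int
    if isinstance(schema, float):
        return float
    if isinstance(schema, str):
        return str
    return None  # composite schema ([], {}, ...)
```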
525
+
526
+ def _validate_schema(obj: Any, schema: Any, path: str = "$") -> Tuple[bool, str]:
527
+ """
528
+ Recursively validate 'obj' against 'schema'. Returns (ok, message).
529
+ """
530
+ # 1) Primitive types via exemplar or type
531
+ t = _schema_type(schema)
532
+ if t is not None and t not in (list, dict, tuple):
533
+ if isinstance(obj, t):
534
+ return True, "ok"
535
+ return False, "{}: expected {}, got {}".format(path, t.__name__, type(obj).__name__)
536
+
537
+ # 2) List schemas
538
+ if isinstance(schema, list):
539
+ if not isinstance(obj, list):
540
+ return False, "{}: expected list, got {}".format(path, type(obj).__name__)
541
+ # If schema is [], any list passes
542
+ if len(schema) == 0:
543
+ return True, "ok"
544
+ # If schema is [subschema], every element must match subschema
545
+ if len(schema) == 1:
546
+ subschema = schema[0]
547
+ for i, el in enumerate(obj):
548
+ ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
549
+ if not ok:
550
+ return ok, msg
551
+ return True, "ok"
552
+ # Otherwise treat as "structure-by-position" (rare)
553
+ if len(obj) != len(schema):
554
+ return False, "{}: expected list length {}, got {}".format(path, len(schema), len(obj))
555
+ for i, (el, subschema) in enumerate(zip(obj, schema)):
556
+ ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
557
+ if not ok:
558
+ return ok, msg
559
+ return True, "ok"
560
+
561
+ # 3) Dict schemas
562
+ if isinstance(schema, dict):
563
+ if not isinstance(obj, dict):
564
+ return False, "{}: expected dict, got {}".format(path, type(obj).__name__)
565
+
566
+ # Check required/optional keys in schema
567
+ for skey, subschema in schema.items():
568
+ base_key, optional = _is_optional_key(skey)
569
+ if base_key not in obj:
570
+ if optional:
571
+ continue
572
+ return False, "{}: missing required key '{}'".format(path, base_key)
573
+ ok, msg = _validate_schema(obj[base_key], subschema, "{}.{}".format(path, base_key))
574
+ if not ok:
575
+ return ok, msg
576
+ return True, "ok"
577
+
578
+ # 4) Tuple schemas (optional)
579
+ if isinstance(schema, tuple):
580
+ if not isinstance(obj, tuple):
581
+ return False, "{}: expected tuple, got {}".format(path, type(obj).__name__)
582
+ if len(schema) == 0:
583
+ return True, "ok"
584
+ if len(schema) == 1:
585
+ subschema = schema[0]
586
+ for i, el in enumerate(obj):
587
+ ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
588
+ if not ok:
589
+ return ok, msg
590
+ return True, "ok"
591
+ if len(obj) != len(schema):
592
+ return False, "{}: expected tuple length {}, got {}".format(path, len(schema), len(obj))
593
+ for i, (el, subschema) in enumerate(zip(obj, schema)):
594
+ ok, msg = _validate_schema(el, subschema, "{}[{}]".format(path, i))
595
+ if not ok:
596
+ return ok, msg
597
+ return True, "ok"
598
+
599
+ # 5) If schema is a type object (e.g., list, dict) we handled above; unknown markers:
600
+ st = type(schema).__name__
601
+ return False, "{}: unsupported schema marker of type {!r}".format(path, st)
602
+
603
+
604
+ ParsingExamples = """
605
+
606
+ # Examples showing how to use the valid_extract function
607
+ #------------------------------------------------------------------
608
+
609
+ # Basic list
610
+ txt = "Noise before ```python\n[1, 2, 3]\n``` noise after"
611
+ rules = {"kind": "python", "format": []}
612
+ assert valid_extract(txt, rules) == [1, 2, 3]
613
+
614
+ # Basic dict
615
+ txt2 = "Header\n{ 'a': 1, 'b': 2 }\nFooter"
616
+ rules2 = {"kind": "python", "format": {}}
617
+ assert valid_extract(txt2, rules2) == {"a": 1, "b": 2}
618
+
619
+ # Nested dict with types
620
+ txt3 = "reply: { 'num_list':[1,2,3], 'my_info':{'x':1}, 'name':'Ada' } ok."
621
+ rules3 = {"kind": "python",
622
+ "format": {'num_list': [int], 'my_info': {}, 'name': ''}}
623
+ assert valid_extract(txt3, rules3)["name"] == "Ada"
624
+
625
+ # Optional key example
626
+ txt4 = ''' I think this is how I'd answer: ``` {'a': 1}``` is this good enough?'''
627
+ rules4 = {"kind": "python", "format": {'a': int, 'b?': ''}}
628
+ assert valid_extract(txt4, rules4) == {'a': 1}
629
+
630
+ txt = " I think this is how I'd answer: ``` {'a': 1}``` is this good enough?"
631
+ rules = {"kind": "python", "format": {"a": int, "b?": ""}}
632
+ assert valid_extract(txt, rules) == {"a": 1}
633
+
634
+ txt2 = "noise before {'a': 1} and after"
635
+ assert valid_extract(txt2, rules) == {"a": 1}
636
+
637
+ txt3 = "ok ```python\n[1,2,3]\n``` end"
638
+ assert valid_extract(txt3, {"kind": "python", "format": []}) == [1,2,3]
639
+
640
+ txt4 = "inline ```[{'k': 'v'}]```"
641
+ assert valid_extract(txt4, {"kind": "python", "format": [{"k": ""}]}) == [{"k": "v"}]
642
+
643
+ """
644
+
645
+
646
+ #############################################################################
647
+ #############################################################################
648
+ #############################################################################
649
+
650
+
651
+ ### LLM CLASS
652
+
653
+ class LLM:
654
+ """
655
+ The LLM class is designed to interface with various language model services.
656
+
657
+ Attributes:
658
+         service (str): The name of the service provider (e.g., 'openai', 'groq', 'anthropic', 'gemini', 'openrouter', 'ollama').
659
+ model (str): The specific model to be used within the service.
660
+ api_key (str): The API key for authenticating requests.
661
+ api_secret (str): The API secret for additional authentication.
662
+ last_params (dict): Stores the parameters used in the last API call.
663
+
664
+ Methods:
665
+ __init__(model_id, key, secret):
666
+ Initializes the LLM instance with a model ID, API key, and secret.
667
+
668
+ call(msg_list, params):
669
+ Calls the appropriate API based on the service with the given message list and parameters.
670
+
671
+ _call_openai(msg_list, params):
672
+ Sends a request to the OpenAI API with the specified messages and parameters.
673
+
674
+ _call_groq(msg_list, params):
675
+ Sends a request to the Groq API with the specified messages and parameters.
676
+
677
+ _call_anthropic(msg_list, params):
678
+ Sends a request to the Anthropic API with the specified messages and parameters.
679
+
680
+ _send_request(url, data, headers):
681
+ Helper function to send HTTP requests to the specified URL with data and headers.
682
+ """
683
+ def __init__(self, model_id='', key='API_KEY', secret='API_SECRET'):
684
+ # Parse model ID and initialize service and model name
685
+         if ':' not in model_id: model_id = 'openai:' + (model_id or 'gpt-4-turbo')
686
+
687
+ splitted = model_id.split(':')
688
+ self.service = splitted[0]
689
+         self.model = ':'.join(splitted[1:])  # preserve colons in names like 'llama3:8b'
690
+ self.api_key = key
691
+ self.api_secret = secret
692
+ self.last_params = {}
693
+ 
694
+     def __call__(self, msg_list, params={}):
+         # Defined at class level: assigning self.__call__ inside __init__
+         # would not make instances callable, because Python looks up
+         # special methods on the type, not the instance.
+         return self.call(msg_list, params)
695
+
696
+ def _normalize_messages(self, msg_list):
697
+ """
698
+ Accepts either:
699
+ - list[str] -> converts to [{'role':'user','content': str}, ...]
700
+ - list[dict] with 'role' and 'content' -> passes through unchanged
701
+ - list[dict] with only 'content' -> assumes role='user'
702
+ Returns: list[{'role': str, 'content': str or list[...]}]
703
+ """
704
+ norm = []
705
+ for m in msg_list:
706
+ if isinstance(m, dict):
707
+ role = m.get("role", "user")
708
+ content = m.get("content", "")
709
+ norm.append({"role": role, "content": content})
710
+ else:
711
+ # treat as plain user text
712
+ norm.append({"role": "user", "content": str(m)})
713
+ return norm
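So a caller may pass bare strings, partial dicts, or full role/content dicts interchangeably. A self-contained copy of the normalization, for illustration:

```python
def normalize_messages(msg_list):
    # Strings become user messages; dicts default their role to 'user'.
    norm = []
    for m in msg_list:
        if isinstance(m, dict):
            norm.append({"role": m.get("role", "user"),
                         "content": m.get("content", "")})
        else:
            norm.append({"role": "user", "content": str(m)})
    return norm

out = normalize_messages(["hello",
                          {"content": "hi"},
                          {"role": "system", "content": "rules"}])
```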
714
+
715
+ def call(self, msg_list, params={}):
716
+ self.last_params = dict(params)
717
+ # General function to call the appropriate API with msg_list and optional parameters
718
+ if self.service == 'openai':
719
+ return self._call_openai(msg_list, params)
720
+ elif self.service == 'groq':
721
+ return self._call_groq(msg_list, params)
722
+ elif self.service == 'anthropic':
723
+ return self._call_anthropic(msg_list, params)
724
+ elif self.service == 'ollama':
725
+ return self._call_ollama(msg_list, params)
726
+ elif self.service == 'gemini':
727
+ return self._call_gemini(msg_list, params)
728
+ elif self.service == 'openrouter':
729
+ return self._call_openrouter(msg_list, params)
730
+ else:
731
+ raise ValueError("Unsupported service '{}'.".format(self.service))
732
+
733
+ def _call_openai(self, msg_list, params):
734
+ url = "https://api.openai.com/v1/chat/completions"
735
+ data = json.dumps({
736
+ "model": self.model,
737
+ "messages": self._normalize_messages(msg_list),
738
+ **params
739
+ }).encode("utf-8")
740
+ headers = {
741
+ "Authorization": "Bearer " + self.api_key,
742
+ "Content-Type": "application/json",
743
+ }
744
+ res = self._send_request(url, data, headers)
745
+ choices = [a["message"]["content"] for a in res.get("choices", [])]
746
+ return choices
747
+
748
+ def _call_groq(self, msg_list, params):
749
+ url = "https://api.groq.com/openai/v1/chat/completions"
750
+ data = json.dumps({
751
+ "model": self.model,
752
+ "messages": self._normalize_messages(msg_list),
753
+ **params
754
+ }).encode("utf-8")
755
+ headers = {
756
+ "Authorization": "Bearer " + self.api_key,
757
+ "Content-Type": "application/json",
758
+ "User-Agent": "Groq/Python 0.9.0",
759
+ }
760
+ res = self._send_request(url, data, headers)
761
+ choices = [a["message"]["content"] for a in res.get("choices", [])]
762
+ return choices
763
+
764
+ def _call_anthropic(self, msg_list, params):
765
+ url = "https://api.anthropic.com/v1/messages"
766
+ data = json.dumps({
767
+ "model": self.model,
768
+ "max_tokens": params.get("max_tokens", 1024),
769
+ "messages": self._normalize_messages(msg_list),
770
+ }).encode("utf-8")
771
+ headers = {
772
+ "x-api-key": self.api_key,
773
+ "anthropic-version": "2023-06-01",
774
+ "Content-Type": "application/json",
775
+ }
776
+ res = self._send_request(url, data, headers)
777
+ # Anthropic returns {"content":[{"type":"text","text":"..."}], ...}
778
+ choices = [c.get("text", "") for c in res.get("content", [])]
779
+ return choices
780
+
781
+ def _call_gemini(self, msg_list, params):
782
+ """
783
+         Calls Google Gemini chat models via the Generative Language REST API.
784
+ Requires self.api_key to be set.
785
+ """
786
+ url = "https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}".format(self.model, self.api_key)
787
+ # Gemini expects a list of "contents" alternating user/assistant
788
+ # We collapse the messages into a sequence of dicts as required by Gemini
789
+ # Gemini wants [{"role": "user/assistant", "parts": [{"text": ...}]}]
790
+ gemini_msgs = []
791
+ for m in self._normalize_messages(msg_list):
792
+ # Google's role scheme: "user" or "model"
793
+ g_role = {"user": "user", "assistant": "model", "system": "user"}.get(m["role"], "user")
794
+ gemini_msgs.append({
795
+ "role": g_role,
796
+ "parts": [{"text": str(m["content"])}] if isinstance(m["content"], str) else m["content"]
797
+ })
798
+ payload = {
799
+ "contents": gemini_msgs,
800
+ **{k: v for k, v in params.items() if k != "model"}
801
+ }
802
+ data = json.dumps(payload).encode("utf-8")
803
+ headers = {
804
+ "Content-Type": "application/json",
805
+ }
806
+ res = self._send_request(url, data, headers)
807
+ # Gemini returns { "candidates": [ { "content": { "parts": [ { "text": ... } ] } } ] }
808
+ choices = []
809
+ for cand in res.get("candidates", []):
810
+ parts = cand.get("content", {}).get("parts", [])
811
+ text = "".join([p.get("text", "") for p in parts])
812
+ choices.append(text)
813
+ return choices
814
+
815
+ def _call_openrouter(self, msg_list, params):
816
+ """
817
+ Calls an LLM via the OpenRouter API. Requires self.api_key.
818
+ API docs: https://openrouter.ai/docs
819
+ Model list: https://openrouter.ai/docs#models
820
+ """
821
+ url = "https://openrouter.ai/api/v1/chat/completions"
822
+ data = json.dumps({
823
+ "model": self.model,
824
+ "messages": self._normalize_messages(msg_list),
825
+ **{k: v for k, v in params.items() if k not in ("referer", "title")}  # header-only params stay out of the request body
826
+ }).encode("utf-8")
827
+ headers = {
828
+ "Authorization": "Bearer " + self.api_key,
829
+ "Content-Type": "application/json",
830
+ "HTTP-Referer": params.get("referer", "https://your-app.com"),
831
+ "X-Title": params.get("title", "Thoughtflow"),
832
+ }
833
+ res = self._send_request(url, data, headers)
834
+ choices = [a["message"]["content"] for a in res.get("choices", [])]
835
+ return choices
836
+
837
+ def _call_ollama(self, msg_list, params):
838
+ """
839
+ Calls a local model served via Ollama (http://localhost:11434 by default).
840
+ Expects no authentication. Ollama's chat message format matches OpenAI's.
841
+ """
842
+ base_url = params.get("ollama_url", "http://localhost:11434")
843
+ url = base_url.rstrip('/') + "/api/chat"
844
+ payload = {
845
+ "model": self.model,
846
+ "messages": self._normalize_messages(msg_list),
847
+ "stream": False, # Disable streaming to get a single JSON response
848
+ **{k: v for k, v in params.items() if k not in ("ollama_url", "model")}
849
+ }
850
+ data = json.dumps(payload).encode("utf-8")
851
+ headers = {
852
+ "Content-Type": "application/json",
853
+ }
854
+ res = self._send_request(url, data, headers)
855
+ # Ollama returns {"message": {...}, ...} or {"choices": [{...}]}
856
+ # Prefer OpenAI-style extraction if available, else fallback
857
+ if "choices" in res:
858
+ choices = [a["message"]["content"] for a in res.get("choices", [])]
859
+ elif "message" in res:
860
+ # single result
861
+ msg = res["message"]
862
+ choices = [msg.get("content", "")]
863
+ elif "response" in res:
864
+ # streaming/fallback
865
+ choices = [res["response"]]
866
+ else:
867
+ choices = []
868
+ return choices
869
+
870
+ def _send_request(self, url, data, headers):
871
+ # Sends the actual HTTP request and handles the response
872
+ try:
873
+ req = urllib.request.Request(url, data=data, headers=headers)
874
+ with urllib.request.urlopen(req) as response:
875
+ response_data = response.read().decode("utf-8")
876
+ # Attempt to parse JSON response; handle plain-text responses
877
+ try:
878
+ return json.loads(response_data) # Parse JSON response
879
+ except json.JSONDecodeError:
880
+ # If response is not JSON, return it as-is in a structured format
881
+ return {"error": "Non-JSON response", "response_data": response_data}
882
+
883
+ except urllib.error.HTTPError as e:
884
+ # Return the error details in case of an HTTP error
885
+ error_msg = e.read().decode("utf-8")
886
+ print("HTTP Error:", error_msg) # Log HTTP error for debugging
887
+ try:
+     return {"error": json.loads(error_msg) if error_msg else "Unknown HTTP error"}
+ except json.JSONDecodeError:
+     # Error bodies are not always JSON; return the raw text instead of crashing
+     return {"error": error_msg}
888
+ except Exception as e:
889
+ return {"error": str(e)}
890
+
891
+
892
+
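`_send_request` always hands back a dict: parsed JSON on success, or an `{'error': ...}` wrapper otherwise. The JSON-or-fallback parsing step can be sketched in isolation, with no network involved (`parse_body` is a hypothetical standalone helper, not part of the class):

```python
import json

def parse_body(response_data):
    # Mirror _send_request's parsing: try JSON first,
    # fall back to a structured wrapper for plain-text bodies.
    try:
        return json.loads(response_data)
    except json.JSONDecodeError:
        return {"error": "Non-JSON response", "response_data": response_data}

ok = parse_body('{"choices": []}')
bad = parse_body('rate limited')
```

Because the fallback is itself a dict, callers can treat every response uniformly and just check for an `'error'` key.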
893
+ #############################################################################
894
+ #############################################################################
895
+
896
+ ### MEMORY CLASS
897
+
898
+ # Sentinel class to mark deleted variables
899
+ class _VarDeleted:
900
+ """Sentinel value indicating a variable has been deleted."""
901
+ _instance = None
902
+
903
+ def __new__(cls):
904
+ if cls._instance is None:
905
+ cls._instance = super().__new__(cls)
906
+ return cls._instance
907
+
908
+ def __repr__(self):
909
+ return '<DELETED>'
910
+
911
+ def __str__(self):
912
+ return '<DELETED>'
913
+
914
+ # Singleton instance for deleted marker
915
+ VAR_DELETED = _VarDeleted()
916
+
917
+
918
+ #-----------------------------------------------------------
919
+ # Object Compression Utilities (JSON-serializable)
920
+ #-----------------------------------------------------------
921
+
922
+ import zlib
923
+ import base64
924
+
925
+ def compress_to_json(data, content_type='auto'):
926
+ """
927
+ Compress data to a JSON-serializable dict.
928
+
929
+ Args:
930
+ data: bytes, str, or JSON-serializable object
931
+ content_type: 'bytes', 'text', 'json', 'pickle', or 'auto'
932
+
933
+ Returns:
934
+ dict with 'data' (base64 string), sizes, and content_type
935
+ """
936
+ # Convert to bytes based on type
937
+ if content_type == 'auto':
938
+ if isinstance(data, bytes):
939
+ content_type = 'bytes'
940
+ raw_bytes = data
941
+ elif isinstance(data, str):
942
+ content_type = 'text'
943
+ raw_bytes = data.encode('utf-8')
944
+ else:
945
+ # Try JSON first, fall back to pickle
946
+ try:
947
+ content_type = 'json'
948
+ raw_bytes = json.dumps(data).encode('utf-8')
949
+ except (TypeError, ValueError):
950
+ content_type = 'pickle'
951
+ raw_bytes = pickle.dumps(data)
952
+ elif content_type == 'bytes':
953
+ raw_bytes = data
954
+ elif content_type == 'text':
955
+ raw_bytes = data.encode('utf-8')
956
+ elif content_type == 'json':
957
+ raw_bytes = json.dumps(data).encode('utf-8')
958
+ elif content_type == 'pickle':
959
+ raw_bytes = pickle.dumps(data)
960
+ else:
961
+ raise ValueError("Unknown content_type: {}".format(content_type))
962
+
963
+ # Compress and base64 encode
964
+ compressed = zlib.compress(raw_bytes, level=9)
965
+ encoded = base64.b64encode(compressed).decode('ascii')
966
+
967
+ return {
968
+ 'data': encoded,
969
+ 'size_original': len(raw_bytes),
970
+ 'size_compressed': len(compressed),
971
+ 'content_type': content_type,
972
+ }
973
+
974
+
975
+ def decompress_from_json(obj_dict):
976
+ """
977
+ Decompress data from JSON-serializable dict.
978
+
979
+ Args:
980
+ obj_dict: dict from compress_to_json
981
+
982
+ Returns:
983
+ Original data in its original type
984
+ """
985
+ encoded = obj_dict['data']
986
+ content_type = obj_dict['content_type']
987
+
988
+ # Decode and decompress
989
+ compressed = base64.b64decode(encoded)
990
+ raw_bytes = zlib.decompress(compressed)
991
+
992
+ # Convert back to original type
993
+ if content_type == 'bytes':
994
+ return raw_bytes
995
+ elif content_type == 'text':
996
+ return raw_bytes.decode('utf-8')
997
+ elif content_type == 'json':
998
+ return json.loads(raw_bytes.decode('utf-8'))
999
+ elif content_type == 'pickle':
1000
+ return pickle.loads(raw_bytes)
1001
+ else:
1002
+ raise ValueError("Unknown content_type: {}".format(content_type))
1003
+
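The compress/decompress pair is symmetric: serialize to bytes, zlib-compress, base64-encode so the result survives JSON, then reverse each step. A self-contained round trip for the `'json'` content type (`pack_json`/`unpack_json` are reimplemented here so the sketch runs standalone; the in-library functions handle more content types):

```python
import base64
import json
import zlib

def pack_json(data):
    # json -> utf-8 bytes -> zlib -> base64 (JSON-safe ASCII string)
    raw = json.dumps(data).encode('utf-8')
    comp = zlib.compress(raw, level=9)
    return {'data': base64.b64encode(comp).decode('ascii'),
            'size_original': len(raw),
            'size_compressed': len(comp),
            'content_type': 'json'}

def unpack_json(obj):
    # base64 -> zlib-decompress -> utf-8 -> json
    raw = zlib.decompress(base64.b64decode(obj['data']))
    return json.loads(raw.decode('utf-8'))

payload = {'rows': [{'id': i, 'tag': 'sample'} for i in range(200)]}
packed = pack_json(payload)
assert unpack_json(packed) == payload                        # lossless round trip
assert packed['size_compressed'] < packed['size_original']   # repetitive data shrinks
```

The base64 step costs about 33% size overhead but keeps the stored object a plain JSON string, which is what lets `objects` ship inside `to_json` exports.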
1004
+
1005
+ def estimate_size(value):
1006
+ """
1007
+ Estimate the serialized size of a value in bytes.
1008
+
1009
+ Args:
1010
+ value: Any value
1011
+
1012
+ Returns:
1013
+ int: Estimated size in bytes
1014
+ """
1015
+ if isinstance(value, bytes):
1016
+ return len(value)
1017
+ elif isinstance(value, str):
1018
+ return len(value.encode('utf-8'))
1019
+ else:
1020
+ try:
1021
+ return len(json.dumps(value).encode('utf-8'))
1022
+ except (TypeError, ValueError):
1023
+ return len(pickle.dumps(value))
1024
+
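`estimate_size` answers "how many bytes would this value occupy once serialized", which is exactly what the `object_threshold` comparison in `set_var` needs. The tiers are: bytes as-is, str as UTF-8, everything else via JSON with a pickle fallback. Worked examples (standalone copy of the function so it runs on its own):

```python
import json
import pickle

def estimate_size(value):
    # bytes: already a byte count; str: UTF-8 length; else: serialized length
    if isinstance(value, bytes):
        return len(value)
    elif isinstance(value, str):
        return len(value.encode('utf-8'))
    try:
        return len(json.dumps(value).encode('utf-8'))
    except (TypeError, ValueError):
        return len(pickle.dumps(value))

assert estimate_size(b'abcd') == 4
assert estimate_size('héllo') == 6       # 'é' is two bytes in UTF-8
assert estimate_size([1, 2, 3]) == 9     # len('[1, 2, 3]')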
1025
+
1026
+ def is_obj_ref(value):
1027
+ """
1028
+ Check if a value is an object reference.
1029
+
1030
+ Args:
1031
+ value: Any value
1032
+
1033
+ Returns:
1034
+ bool: True if value is an object reference dict
1035
+ """
1036
+ return isinstance(value, dict) and '_obj_ref' in value
1037
+
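Object references are ordinary dicts distinguished only by the `_obj_ref` key, so the check is a shape test rather than a type test:

```python
def is_obj_ref(value):
    # True only for dicts carrying the reserved '_obj_ref' key
    return isinstance(value, dict) and '_obj_ref' in value

assert is_obj_ref({'_obj_ref': 'STAMP01'}) is True
assert is_obj_ref({'data': 'x'}) is False
assert is_obj_ref('_obj_ref') is False   # plain strings never qualify
```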
1038
+
1039
+ def truncate_content(content, stamp, threshold=500, header_len=200, footer_len=200):
1040
+ """
1041
+ Truncate long content by keeping header and footer with an expandable marker.
1042
+
1043
+ If content is shorter than threshold, returns content unchanged.
1044
+ Otherwise, keeps the first header_len chars and last footer_len chars,
1045
+ with a marker in between indicating truncation and providing the stamp
1046
+ for expansion.
1047
+
1048
+ Args:
1049
+ content: The text content to potentially truncate
1050
+ stamp: The event stamp (ID) for the content, used in expansion marker
1051
+ threshold: Minimum length before truncation applies (default 500)
1052
+ header_len: Characters to keep from start (default 200)
1053
+ footer_len: Characters to keep from end (default 200)
1054
+
1055
+ Returns:
1056
+ str: Original content if short enough, or truncated content with marker
1057
+
1058
+ Example:
1059
+ truncated = truncate_content(long_text, 'ABC123', threshold=500)
1060
+ # Returns: "First 200 chars...\n\n[...TRUNCATED: 1,847 chars omitted. To expand, request stamp: ABC123...]\n\n...last 200 chars"
1061
+ """
1062
+ if len(content) <= threshold:
1063
+ return content
1064
+
1065
+ # Calculate how much we're removing
1066
+ chars_omitted = len(content) - header_len - footer_len
1067
+
1068
+ # Build the truncation marker
1069
+ marker = "\n\n[...TRUNCATED: {:,} chars omitted. To expand, request stamp: {}...]\n\n".format(chars_omitted, stamp)
1070
+
1071
+ # Extract header and footer
1072
+ header = content[:header_len]
1073
+ footer = content[-footer_len:]
1074
+
1075
+ return header + marker + footer
1076
+
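A quick check of the truncation arithmetic: for content longer than `threshold`, the result keeps `header_len + footer_len` characters of original text, and the marker names exactly the omitted count. A standalone copy of the function with smaller windows for readability:

```python
def truncate_content(content, stamp, threshold=500, header_len=200, footer_len=200):
    if len(content) <= threshold:
        return content
    chars_omitted = len(content) - header_len - footer_len
    marker = "\n\n[...TRUNCATED: {:,} chars omitted. To expand, request stamp: {}...]\n\n".format(chars_omitted, stamp)
    return content[:header_len] + marker + content[-footer_len:]

# Short content passes through untouched
assert truncate_content('short', 'S1') == 'short'

# 1000 chars with 100-char windows: 800 chars omitted
text = 'A' * 1000
out = truncate_content(text, 'ABC123', threshold=500, header_len=100, footer_len=100)
assert out.startswith('A' * 100) and out.endswith('A' * 100)
assert '800 chars omitted' in out and 'ABC123' in out
```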
1077
+
1078
+ class MEMORY:
1079
+ """
1080
+ The MEMORY class serves as an event-sourced state container for managing events,
1081
+ logs, messages, reflections, and variables within the Thoughtflow framework.
1082
+
1083
+ All state changes are stored as events with sortable IDs (alphabetical = chronological).
1084
+ Events are stored in a dictionary for O(1) lookup, with separate sorted indexes for
1085
+ efficient retrieval. The memory can be fully reconstructed from its event list.
1086
+
1087
+ Architecture:
1088
+ - DATA LAYER: events dict (stamp → event object) - single source of truth
1089
+ - INDEX LAYER: idx_* lists of [timestamp, stamp] pairs, sorted chronologically
1090
+ - VARIABLE LAYER: vars dict with full history as list of [stamp, value] pairs
1091
+ - OBJECT LAYER: objects dict for compressed large data storage
1092
+
1093
+ Attributes:
1094
+ id (str): Unique identifier for this MEMORY instance (event_stamp).
1095
+ events (dict): Dictionary mapping event stamps to full event objects.
1096
+ idx_msgs (list): Sorted list of [timestamp, stamp] pairs for messages.
1097
+ idx_refs (list): Sorted list of [timestamp, stamp] pairs for reflections.
1098
+ idx_logs (list): Sorted list of [timestamp, stamp] pairs for logs.
1099
+ idx_vars (list): Sorted list of [timestamp, stamp] pairs for variable changes.
1100
+ idx_all (list): Master sorted list of all [timestamp, stamp] pairs.
1101
+ vars (dict): Dictionary mapping variable names to list of [stamp, value] pairs.
1102
+ Deleted variables have VAR_DELETED as the value in their last entry.
1103
+ Large values auto-convert to object references: {'_obj_ref': stamp}.
1104
+ var_desc_history (dict): Dictionary mapping variable names to list of [stamp, description] pairs.
1105
+ Tracks description evolution separately from value changes.
1106
+ objects (dict): Dictionary mapping stamps to compressed object dicts.
1107
+ Each object is JSON-serializable with base64-encoded compressed data.
1108
+ object_threshold (int): Size threshold (bytes) for auto-converting vars to objects.
1109
+ valid_roles (set): Set of valid roles for messages.
1110
+ valid_modes (set): Set of valid modes for messages.
1111
+ valid_channels (set): Set of valid communication channels.
1112
+
1113
+ Methods:
1114
+ add_msg(role, content, mode='text', channel='unknown'): Add a message event with channel.
1115
+ add_log(message): Add a log event.
1116
+ add_ref(content): Add a reflection event.
1117
+ get_msgs(...): Retrieve messages with filtering (supports channel filter).
1118
+ get_events(...): Retrieve all events with filtering.
1119
+ get_logs(limit=-1): Get log events.
1120
+ get_refs(limit=-1): Get reflection events.
1121
+ last_user_msg(): Get the last user message content.
1122
+ last_asst_msg(): Get the last assistant message content.
1123
+ last_sys_msg(): Get the last system message content.
1124
+ last_log_msg(): Get the last log message content.
1125
+ prepare_context(...): Prepare messages for LLM with smart truncation of old messages.
1126
+ set_var(key, value, desc=''): Set a variable (appends to history, auto-converts large values to objects).
1127
+ del_var(key): Mark a variable as deleted (preserves history).
1128
+ get_var(key, resolve_refs=True): Get current value (auto-resolves object refs).
1129
+ get_all_vars(resolve_refs=True): Get dict of all current non-deleted values.
1130
+ get_var_history(key, resolve_refs=False): Get full history as list of [stamp, value].
1131
+ get_var_desc(key): Get the current description of a variable.
1132
+ get_var_desc_history(key): Get full description history as list of [stamp, description].
1133
+ is_var_deleted(key): Check if a variable is currently marked as deleted.
1134
+ set_obj(data, name=None, desc='', content_type='auto'): Store compressed object, optionally link to variable.
1135
+ get_obj(stamp): Retrieve and decompress an object by stamp.
1136
+ get_obj_info(stamp): Get object metadata without decompressing.
1137
+ snapshot(): Export memory state as dict (includes events and objects).
1138
+ save(filename, compressed=False): Save memory to file (pickle format).
1139
+ load(filename, compressed=False): Load memory from file (pickle format).
1140
+ to_json(filename=None, indent=2): Export memory to JSON file or string.
1141
+ from_json(source): Class method to load memory from JSON file or string.
1142
+ copy(): Return a deep copy of the MEMORY instance.
1143
+ from_events(event_list, memory_id=None, objects=None): Class method to rehydrate from events/objects.
1144
+
1145
+ Example Usage:
1146
+ memory = MEMORY()
1147
+
1148
+ # Messages have channel tracking (for omni-directional communication)
1149
+ memory.add_msg('user', 'Hello!', channel='webapp')
1150
+ memory.add_msg('assistant', 'Hi there!', channel='webapp')
1151
+
1152
+ # Logs and reflections are internal (no channel)
1153
+ memory.add_log('User greeted the assistant')
1154
+ memory.add_ref('User seems friendly')
1155
+
1156
+ # Variables maintain full history (no channel needed)
1157
+ memory.set_var('foo', 42, 'A test variable')
1158
+ memory.set_var('foo', 100) # Appends to history
1159
+ memory.get_var('foo') # Returns 100
1160
+ memory.get_var_history('foo') # Returns [[stamp1, 42], [stamp2, 100]]
1161
+
1162
+ # Deletion is a tombstone, not removal
1163
+ memory.del_var('foo')
1164
+ memory.get_var('foo') # Returns None
1165
+ memory.is_var_deleted('foo') # Returns True
1166
+ memory.set_var('foo', 200) # Can re-set after deletion
1167
+
1168
+ # Large values auto-convert to compressed objects
1169
+ large_data = 'x' * 20000 # Exceeds default 10KB threshold
1170
+ memory.set_var('big_data', large_data) # Auto-converts to object
1171
+ memory.get_var('big_data') # Returns decompressed data
1172
+ memory.get_var('big_data', resolve_refs=False) # Returns {'_obj_ref': stamp}
1173
+
1174
+ # Direct object storage
1175
+ stamp = memory.set_obj(image_bytes, name='avatar', desc='User avatar')
1176
+ memory.get_var('avatar') # Returns decompressed image_bytes
1177
+ memory.get_obj(stamp) # Direct access by stamp
1178
+ memory.get_obj_info(stamp) # Metadata without decompressing
1179
+
1180
+ # Inspect internal state (public attributes)
1181
+ print(memory.events) # All events by stamp
1182
+ print(memory.objects) # All objects by stamp
1183
+ print(memory.vars) # Variable histories
1184
+
1185
+ memory.save('memory.pkl')
1186
+ memory2 = MEMORY()
1187
+ memory2.load('memory.pkl')
1188
+
1189
+ # Export to JSON (like DataFrame.to_csv)
1190
+ memory.to_json('memory_backup.json')
1191
+ memory4 = MEMORY.from_json('memory_backup.json')
1192
+
1193
+ # Rehydrate from events and objects (preserves all history)
1194
+ snap = memory.snapshot()
1195
+ memory3 = MEMORY.from_events(snap['events'].values(), objects=snap['objects'])
1196
+ """
1197
+
1198
+ def __init__(self):
1199
+ import bisect
1200
+ self._bisect = bisect # Store for use in methods
1201
+
1202
+ self.id = event_stamp()
1203
+
1204
+ # DATA LAYER: Single source of truth for all events
1205
+ self.events = {} # stamp → full event dict
1206
+
1207
+ # INDEX LAYER: Sorted lists of [timestamp, stamp] pairs
1208
+ # Format: [[dt_utc, stamp], ...] - aligns with Redis sorted set structure
1209
+ # Sorted by timestamp (ISO string sorts chronologically)
1210
+ self.idx_msgs = [] # Message [timestamp, stamp] pairs
1211
+ self.idx_refs = [] # Reflection [timestamp, stamp] pairs
1212
+ self.idx_logs = [] # Log [timestamp, stamp] pairs
1213
+ self.idx_vars = [] # Variable-change [timestamp, stamp] pairs
1214
+ self.idx_all = [] # Master index (all [timestamp, stamp] pairs)
1215
+
1216
+ # VARIABLE LAYER: Full history with timestamps
1217
+ # vars[key] = [[stamp1, value1], [stamp2, value2], ...]
1218
+ # Deleted variables have VAR_DELETED as value in their last entry
1219
+ self.vars = {} # var_name → list of [stamp, value] pairs
1220
+ self.var_desc_history = {} # var_name → list of [stamp, description] pairs
1221
+
1222
+ # OBJECT LAYER: Compressed storage for large data
1223
+ # objects[stamp] = {
1224
+ # 'data': base64_encoded_compressed_string,
1225
+ # 'size_original': int,
1226
+ # 'size_compressed': int,
1227
+ # 'content_type': str, # 'bytes', 'text', 'json', 'pickle'
1228
+ # }
1229
+ self.objects = {} # stamp → compressed object dict
1230
+
1231
+ # Threshold for auto-converting variables to objects (bytes)
1232
+ self.object_threshold = 10000 # 10KB default
1233
+
1234
+ # Valid values
1235
+ self.valid_roles = {
1236
+ 'system',
1237
+ 'user',
1238
+ 'assistant',
1239
+ 'reflection',
1240
+ 'action',
1241
+ 'query',
1242
+ 'result',
1243
+ 'logger',
1244
+ }
1245
+ self.valid_modes = {
1246
+ 'text',
1247
+ 'audio',
1248
+ 'voice',
1249
+ }
1250
+ self.valid_channels = {
1251
+ 'webapp',
1252
+ 'ios',
1253
+ 'android',
1254
+ 'telegram',
1255
+ 'whatsapp',
1256
+ 'slack',
1257
+ 'api',
1258
+ 'cli',
1259
+ 'unknown',
1260
+ }
1261
+
1262
+ #--- Internal Methods ---
1263
+
1264
+ def _add_to_index(self, index_list, timestamp, stamp):
1265
+ """
1266
+ Insert [timestamp, stamp] pair maintaining sorted order by timestamp.
1267
+
1268
+ Args:
1269
+ index_list: One of the idx_* lists
1270
+ timestamp: ISO timestamp string (dt_utc)
1271
+ stamp: Event stamp ID
1272
+ """
1273
+ # bisect.insort sorts by first element of tuple/list (timestamp)
1274
+ self._bisect.insort(index_list, [timestamp, stamp])
1275
+
1276
+ def _store_event(self, event_type, obj):
1277
+ """
1278
+ Store event in data layer and add to appropriate indexes.
1279
+ This is the single entry point for all event creation.
1280
+
1281
+ Args:
1282
+ event_type: One of 'msg', 'ref', 'log', 'var'
1283
+ obj: The full event dict (must contain 'stamp' and 'dt_utc' keys)
1284
+ """
1285
+ stamp = obj['stamp']
1286
+ timestamp = obj['dt_utc']
1287
+
1288
+ # Store in data layer
1289
+ self.events[stamp] = obj
1290
+
1291
+ # Add to type-specific index (with [timestamp, stamp] format)
1292
+ if event_type == 'msg':
1293
+ self._add_to_index(self.idx_msgs, timestamp, stamp)
1294
+ elif event_type == 'ref':
1295
+ self._add_to_index(self.idx_refs, timestamp, stamp)
1296
+ elif event_type == 'log':
1297
+ self._add_to_index(self.idx_logs, timestamp, stamp)
1298
+ elif event_type == 'var':
1299
+ self._add_to_index(self.idx_vars, timestamp, stamp)
1300
+
1301
+ # Always add to master index
1302
+ self._add_to_index(self.idx_all, timestamp, stamp)
1303
+
1304
+ def _get_events_from_index(self, index, limit=-1):
1305
+ """
1306
+ Get events from an index, optionally limited to last N.
1307
+
1308
+ Args:
1309
+ index: One of the idx_* lists (format: [[timestamp, stamp], ...])
1310
+ limit: Max events to return (-1 = all)
1311
+
1312
+ Returns:
1313
+ List of event dicts
1314
+ """
1315
+ pairs = index if limit <= 0 else index[-limit:]
1316
+ # Extract stamp (second element) from each [timestamp, stamp] pair
1317
+ return [self.events[ts_stamp[1]] for ts_stamp in pairs if ts_stamp[1] in self.events]
1318
+
1319
+ def _get_latest_desc(self, key):
1320
+ """
1321
+ Get the latest description for a variable from its description history.
1322
+
1323
+ Args:
1324
+ key: Variable name
1325
+
1326
+ Returns:
1327
+ Latest description string, or empty string if none exists
1328
+ """
1329
+ history = self.var_desc_history.get(key)
1330
+ if not history:
1331
+ return ''
1332
+ return history[-1][1] # Return description from last [stamp, desc] pair
1333
+
1334
+ #--- Public Methods ---
1335
+
1336
+ def add_msg(self, role, content, mode='text', channel='unknown'):
1337
+ """
1338
+ Add a message event with channel tracking.
1339
+
1340
+ Args:
1341
+ role: Message role (user, assistant, system, etc.)
1342
+ content: Message content
1343
+ mode: Communication mode (text, audio, voice)
1344
+ channel: Communication channel (webapp, ios, telegram, etc.)
1345
+ """
1346
+ if role not in self.valid_roles:
1347
+ raise ValueError("Invalid role '{}'. Must be one of: {}".format(role, sorted(self.valid_roles)))
1348
+ if mode not in self.valid_modes:
1349
+ raise ValueError("Invalid mode '{}'. Must be one of: {}".format(mode, sorted(self.valid_modes)))
1350
+ if channel not in self.valid_channels:
1351
+ raise ValueError("Invalid channel '{}'. Must be one of: {}".format(channel, sorted(self.valid_channels)))
1352
+
1353
+ stamp = event_stamp({'role': role, 'content': content})
1354
+ msg = {
1355
+ 'stamp' : stamp,
1356
+ 'type' : 'msg',
1357
+ 'role' : role,
1358
+ 'content' : content,
1359
+ 'mode' : mode,
1360
+ 'channel' : channel,
1361
+ 'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
1362
+ 'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
1363
+ }
1364
+ self._store_event('msg', msg)
1365
+
1366
+ def add_log(self, message):
1367
+ """
1368
+ Add a log event.
1369
+
1370
+ Args:
1371
+ message: Log message content
1372
+ """
1373
+ stamp = event_stamp({'content': message})
1374
+ log_entry = {
1375
+ 'stamp' : stamp,
1376
+ 'type' : 'log',
1377
+ 'role' : 'logger',
1378
+ 'content' : message,
1379
+ 'mode' : 'text',
1380
+ 'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
1381
+ 'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
1382
+ }
1383
+ self._store_event('log', log_entry)
1384
+
1385
+ def add_ref(self, content):
1386
+ """
1387
+ Add a reflection event.
1388
+
1389
+ Args:
1390
+ content: Reflection content
1391
+ """
1392
+ stamp = event_stamp({'content': content})
1393
+ ref = {
1394
+ 'stamp' : stamp,
1395
+ 'type' : 'ref',
1396
+ 'role' : 'reflection',
1397
+ 'content' : content,
1398
+ 'mode' : 'text',
1399
+ 'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
1400
+ 'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
1401
+ }
1402
+ self._store_event('ref', ref)
1403
+
1404
+ #---
1405
+
1406
+ def get_msgs(self,
1407
+ limit=-1,
1408
+ include=None,
1409
+ exclude=None,
1410
+ repr='list',
1411
+ channel=None,
1412
+ ):
1413
+ """
1414
+ Get messages with flexible filtering.
1415
+
1416
+ Args:
1417
+ limit: Max messages to return (-1 = all)
1418
+ include: List of roles to include (None = all)
1419
+ exclude: List of roles to exclude (None = none)
1420
+ repr: Output format ('list', 'str', 'pprint1')
1421
+ channel: Filter by channel (None = all)
1422
+
1423
+ Returns:
1424
+ Messages in the specified format
1425
+ """
1426
+ # Get all messages from index
1427
+ events = self._get_events_from_index(self.idx_msgs, -1)
1428
+
1429
+ # Apply filters
1430
+ if include:
1431
+ events = [e for e in events if e.get('role') in include]
1432
+ if exclude:
1433
+ events = [e for e in events if e.get('role') not in exclude]
1435
+ if channel:
1436
+ events = [e for e in events if e.get('channel') == channel]
1437
+
1438
+ if limit > 0:
1439
+ events = events[-limit:]
1440
+
1441
+ if repr == 'list':
1442
+ return events
1443
+ elif repr == 'str':
1444
+ return '\n'.join(["{}: {}".format(e['role'], e['content']) for e in events])
1445
+ elif repr == 'pprint1':
1446
+ return pprint.pformat(events, indent=1)
1447
+ else:
1448
+ raise ValueError("Invalid repr option. Choose from 'list', 'str', or 'pprint1'.")
1449
+
1450
+ def get_events(self, limit=-1, event_types=None, channel=None):
1451
+ """
1452
+ Get all events, optionally filtered by type and channel.
1453
+
1454
+ Args:
1455
+ limit: Max events (-1 = all)
1456
+ event_types: List like ['msg', 'log', 'ref', 'var'] (None = all)
1457
+ channel: Filter by channel (None = all)
1458
+
1459
+ Returns:
1460
+ List of event dicts
1461
+ """
1462
+ events = self._get_events_from_index(self.idx_all, -1)
1463
+
1464
+ if event_types:
1465
+ events = [e for e in events if e.get('type') in event_types]
1466
+ if channel:
1467
+ events = [e for e in events if e.get('channel') == channel]
1468
+
1469
+ if limit > 0:
1470
+ events = events[-limit:]
1471
+
1472
+ return events
1473
+
1474
+ def get_logs(self, limit=-1):
1475
+ """
1476
+ Get log events.
1477
+
1478
+ Args:
1479
+ limit: Max logs to return (-1 = all)
1480
+
1481
+ Returns:
1482
+ List of log event dicts
1483
+ """
1484
+ events = self._get_events_from_index(self.idx_logs, -1)
1485
+
1486
+ if limit > 0:
1487
+ events = events[-limit:]
1488
+
1489
+ return events
1490
+
1491
+ def get_refs(self, limit=-1):
1492
+ """
1493
+ Get reflection events.
1494
+
1495
+ Args:
1496
+ limit: Max reflections to return (-1 = all)
1497
+
1498
+ Returns:
1499
+ List of reflection event dicts
1500
+ """
1501
+ events = self._get_events_from_index(self.idx_refs, -1)
1502
+
1503
+ if limit > 0:
1504
+ events = events[-limit:]
1505
+
1506
+ return events
1507
+
1508
+ def last_user_msg(self):
1509
+ """Get the content of the last user message."""
1510
+ msgs = self.get_msgs(include=['user'])
1511
+ return msgs[-1]['content'] if msgs else ''
1512
+
1513
+ def last_asst_msg(self):
1514
+ """Get the content of the last assistant message."""
1515
+ msgs = self.get_msgs(include=['assistant'])
1516
+ return msgs[-1]['content'] if msgs else ''
1517
+
1518
+ def last_sys_msg(self):
1519
+ """Get the content of the last system message."""
1520
+ msgs = self.get_msgs(include=['system'])
1521
+ return msgs[-1]['content'] if msgs else ''
1522
+
1523
+ def last_log_msg(self):
1524
+ """Get the content of the last log message."""
1525
+ logs = self.get_logs()
1526
+ return logs[-1]['content'] if logs else ''
1527
+
1528
+ def prepare_context(
1529
+ self,
1530
+ recent_count=6,
1531
+ truncate_threshold=500,
1532
+ header_len=200,
1533
+ footer_len=200,
1534
+ include_roles=('user', 'assistant'),
1535
+ format='list',
1536
+ ):
1537
+ """
1538
+ Prepare messages for LLM context with smart truncation of old messages.
1539
+
1540
+ Messages within the most recent `recent_count` are returned unchanged.
1541
+ Older messages that exceed `truncate_threshold` chars have their middle
1542
+ content truncated, preserving a header and footer with an expandable marker.
1543
+
1544
+ The truncation marker includes the message's stamp, allowing an LLM to
1545
+ request expansion of specific messages via memory.events[stamp].
1546
+
1547
+ Args:
1548
+ recent_count: Number of recent messages to keep untruncated (default 6)
1549
+ truncate_threshold: Min chars before truncation applies (default 500)
1550
+ header_len: Characters to keep from start (default 200)
1551
+ footer_len: Characters to keep from end (default 200)
1552
+ include_roles: Tuple of roles to include (default ('user', 'assistant'))
1553
+ format: 'list' returns list of dicts, 'openai' returns OpenAI-compatible format
1554
+
1555
+ Returns:
1556
+ List of message dicts with 'role' and 'content' keys.
1557
+ Older messages may have truncated content with expansion markers.
1558
+
1559
+ Example:
1560
+ # Get context-ready messages for LLM
1561
+ context = memory.prepare_context(recent_count=6, truncate_threshold=500)
1562
+
1563
+ # Use with OpenAI API
1564
+ context = memory.prepare_context(format='openai')
1565
+ response = client.chat.completions.create(
1566
+ model='gpt-4',
1567
+ messages=context
1568
+ )
1569
+ """
1570
+ # Get all messages for included roles
1571
+ msgs = self.get_msgs(include=list(include_roles))
1572
+
1573
+ if not msgs:
1574
+ return []
1575
+
1576
+ # Determine cutoff point for truncation
1577
+ # Messages at index < cutoff_idx are candidates for truncation
1578
+ cutoff_idx = max(0, len(msgs) - recent_count)
1579
+
1580
+ result = []
1581
+ for i, msg in enumerate(msgs):
1582
+ stamp = msg.get('stamp', '')
1583
+ role = msg.get('role', 'user')
1584
+ content = msg.get('content', '')
1585
+
1586
+ # Apply truncation to older messages
1587
+ if i < cutoff_idx:
1588
+ content = truncate_content(
1589
+ content,
1590
+ stamp,
1591
+ threshold=truncate_threshold,
1592
+ header_len=header_len,
1593
+ footer_len=footer_len
1594
+ )
1595
+
1596
+ if format == 'openai':
1597
+ # OpenAI expects 'user', 'assistant', 'system' roles
1598
+ result.append({'role': role, 'content': content})
1599
+ else:
1600
+ # List format includes more metadata
1601
+ result.append({
1602
+ 'role': role,
1603
+ 'content': content,
1604
+ 'stamp': stamp,
1605
+ 'truncated': i < cutoff_idx and len(msg.get('content', '')) > truncate_threshold,
1606
+ })
1607
+
1608
+ return result
1609
+
1610
+ #---
1611
+
1612
+ def set_var(self, key, value, desc=''):
1613
+ """
1614
+ Store a variable by appending to its history list.
1615
+ Variable changes are first-class events in the event stream.
1616
+ Each variable maintains a full history of [stamp, value] pairs.
1617
+
1618
+ Large values (exceeding object_threshold) are automatically converted
1619
+ to compressed objects, with an object reference stored in the history.
1620
+
1621
+ Descriptions are tracked separately in var_desc_history since they
1622
+ change less frequently than values.
1623
+
1624
+ Args:
1625
+ key: Variable name
1626
+ value: Variable value (any type)
1627
+ desc: Optional description (appended to description history if provided)
1628
+ """
1629
+ # Check if value should be stored as object (auto-conversion)
1630
+ value_size = estimate_size(value)
1631
+ if value_size > self.object_threshold:
1632
+ # Store as object, use reference in history
1633
+ obj_stamp = event_stamp({'obj': str(value)[:50]})
1634
+ compressed_obj = compress_to_json(value)
1635
+ self.objects[obj_stamp] = compressed_obj
1636
+ stored_value = {'_obj_ref': obj_stamp}
1637
+ else:
1638
+ stored_value = value
1639
+
1640
+ stamp = event_stamp({'var': key, 'value': str(value)[:100]})
1641
+
1642
+ # Initialize history list if this is a new variable
1643
+ if key not in self.vars:
1644
+ self.vars[key] = []
1645
+
1646
+ # Append new [stamp, stored_value] pair to history
1647
+ self.vars[key].append([stamp, stored_value])
1648
+
1649
+ # Track description changes separately (only when provided)
1650
+ if desc:
1651
+ if key not in self.var_desc_history:
1652
+ self.var_desc_history[key] = []
1653
+ self.var_desc_history[key].append([stamp, desc])
1654
+
1655
+ # Get latest description from history (or the one we just set)
1656
+ current_desc = desc if desc else self._get_latest_desc(key)
1657
+
1658
+ # Create variable-change event
1659
+ var_event = {
1660
+ 'stamp' : stamp,
1661
+ 'type' : 'var',
1662
+ 'role' : 'system',
1663
+ 'var_name' : key,
1664
+ 'var_value': stored_value, # Store reference if large, else value
1665
+ 'var_desc' : current_desc,
1666
+ 'content' : "Variable '{}' set".format(key) + (' (as object ref)' if is_obj_ref(stored_value) else ''),
1667
+ 'mode' : 'text',
1668
+ 'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
1669
+ 'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
1670
+ }
1671
+ self._store_event('var', var_event)
1672
+
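The size-threshold branch above can be exercised as a standalone toy. This is a minimal sketch, not the library's API: `TinyVarStore`, its `_stamp` counter, and the use of `sys.getsizeof` as a stand-in for `estimate_size` are illustrative assumptions; only the `[stamp, value]` history shape and the `{'_obj_ref': stamp}` reference format mirror the method above (compression is elided here).

```python
import sys

class TinyVarStore:
    """Illustrative sketch of set_var's append-only history with out-of-line large values."""
    def __init__(self, object_threshold=1024):
        self.object_threshold = object_threshold
        self.vars = {}      # name -> list of [stamp, value] pairs
        self.objects = {}   # stamp -> raw payload (compression elided in this sketch)
        self._counter = 0

    def _stamp(self):
        # Monotonic stand-in for event_stamp(); real stamps also sort chronologically
        self._counter += 1
        return "S{:06d}".format(self._counter)

    def set_var(self, key, value):
        stamp = self._stamp()
        if sys.getsizeof(value) > self.object_threshold:
            # Large values are stored out-of-line; history holds only a reference
            self.objects[stamp] = value
            stored = {'_obj_ref': stamp}
        else:
            stored = value
        self.vars.setdefault(key, []).append([stamp, stored])
        return stamp

store = TinyVarStore(object_threshold=100)
store.set_var('session_id', 'abc123')
store.set_var('session_id', 'xyz789')   # appends; does not overwrite
store.set_var('blob', 'x' * 500)        # exceeds threshold -> stored as object ref

print(len(store.vars['session_id']))    # 2
print(store.vars['blob'][-1][1])        # {'_obj_ref': 'S000003'}
```

Updating a variable grows its history rather than replacing it, which is what makes `get_var_history` possible later.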
1673
+ def del_var(self, key):
1674
+ """
1675
+ Mark a variable as deleted by appending a VAR_DELETED tombstone.
1676
+ The variable's history is preserved; it can be re-set later.
1677
+
1678
+ Args:
1679
+ key: Variable name to delete
1680
+
1681
+ Raises:
1682
+ KeyError: If the variable doesn't exist
1683
+ """
1684
+ if key not in self.vars:
1685
+ raise KeyError("Variable '{}' does not exist".format(key))
1686
+
1687
+ stamp = event_stamp({'var': key, 'action': 'delete'})
1688
+
1689
+ # Append deletion marker to history
1690
+ self.vars[key].append([stamp, VAR_DELETED])
1691
+
1692
+ # Create variable-delete event
1693
+ var_event = {
1694
+ 'stamp' : stamp,
1695
+ 'type' : 'var',
1696
+ 'role' : 'system',
1697
+ 'var_name' : key,
1698
+ 'var_value': None,
1699
+ 'var_deleted': True,
1700
+ 'var_desc' : self._get_latest_desc(key),
1701
+ 'content' : "Variable '{}' deleted".format(key),
1702
+ 'mode' : 'text',
1703
+ 'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
1704
+ 'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
1705
+ }
1706
+ self._store_event('var', var_event)
1707
+
1708
+ def get_var(self, key, resolve_refs=True):
1709
+ """
1710
+ Return the current value of a variable.
1711
+
1712
+ If the value is an object reference, it is automatically resolved
1713
+ and the decompressed data is returned (unless resolve_refs=False).
1714
+
1715
+ Args:
1716
+ key: Variable name
1717
+ resolve_refs: If True (default), resolve object references to actual data
1718
+
1719
+ Returns:
1720
+ Current value, or None if not found or deleted
1721
+ """
1722
+ history = self.vars.get(key)
1723
+ if not history:
1724
+ return None
1725
+
1726
+ # Get the last value
1727
+ last_stamp, last_value = history[-1]
1728
+
1729
+ # Return None if deleted
1730
+ if last_value is VAR_DELETED:
1731
+ return None
1732
+
1733
+ # Resolve object reference if applicable
1734
+ if resolve_refs and is_obj_ref(last_value):
1735
+ return self.get_obj(last_value['_obj_ref'])
1736
+
1737
+ return last_value
1738
+
1739
+ def is_var_deleted(self, key):
1740
+ """
1741
+ Check if a variable is currently marked as deleted.
1742
+
1743
+ Args:
1744
+ key: Variable name
1745
+
1746
+ Returns:
1747
+ True if the variable exists and is deleted, False otherwise
1748
+ """
1749
+ history = self.vars.get(key)
1750
+ if not history:
1751
+ return False
1752
+
1753
+ last_stamp, last_value = history[-1]
1754
+ return last_value is VAR_DELETED
1755
+
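The tombstone pattern used by `del_var`, `get_var`, and `is_var_deleted` hinges on identity comparison with a sentinel object. A minimal self-contained sketch, where `VAR_DELETED` is a local stand-in for the module-level constant:

```python
VAR_DELETED = object()  # identity-compared sentinel; distinct from every value, including None

history = [
    ['S1', 'abc123'],
    ['S2', None],         # a legitimate None value...
    ['S3', VAR_DELETED],  # ...is distinguishable from deletion, thanks to `is` comparison
]

def current_value(history):
    # Mirrors get_var: last entry wins; tombstone means "no value"
    if not history:
        return None
    _, last = history[-1]
    return None if last is VAR_DELETED else last

def is_deleted(history):
    # Mirrors is_var_deleted: only the most recent entry matters
    return bool(history) and history[-1][1] is VAR_DELETED

print(current_value(history))  # None (deleted)
print(is_deleted(history))     # True
history.append(['S4', 'revived'])
print(current_value(history))  # 'revived' -- re-setting after deletion works
```

Because `object()` compares by identity, a stored `None` (entry `S2`) never masquerades as a deletion, which is why the class uses `is VAR_DELETED` rather than equality checks.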
1756
+ def get_all_vars(self, resolve_refs=True):
1757
+ """
1758
+ Get a dictionary of all current non-deleted variable values.
1759
+
1760
+ Args:
1761
+ resolve_refs: If True (default), resolve object references to actual data
1762
+
1763
+ Returns:
1764
+ dict: Variable name → current value (excludes deleted variables)
1765
+ """
1766
+ result = {}
1767
+ for key, history in self.vars.items():
1768
+ if history:
1769
+ last_stamp, last_value = history[-1]
1770
+ if last_value is not VAR_DELETED:
1771
+ # Resolve object reference if applicable
1772
+ if resolve_refs and is_obj_ref(last_value):
1773
+ result[key] = self.get_obj(last_value['_obj_ref'])
1774
+ else:
1775
+ result[key] = last_value
1776
+ return result
1777
+
1778
+ def get_var_history(self, key, resolve_refs=False):
1779
+ """
1780
+ Get full history of a variable as list of [stamp, value] pairs.
1781
+ Includes all historical values and deletion markers.
1782
+
1783
+ Args:
1784
+ key: Variable name
1785
+ resolve_refs: If True, resolve object references to actual data.
1786
+ Default False to preserve the raw history structure.
1787
+
1788
+ Returns:
1789
+ List of [stamp, value] pairs, or empty list if variable doesn't exist.
1790
+ Deleted entries have VAR_DELETED as the value.
1791
+ Object references appear as {'_obj_ref': stamp} unless resolve_refs=True.
1792
+ """
1793
+ history = self.vars.get(key, [])
1794
+ if not resolve_refs:
1795
+ return list(history)
1796
+
1797
+ # Resolve object references
1798
+ resolved = []
1799
+ for stamp, value in history:
1800
+ if is_obj_ref(value):
1801
+ resolved.append([stamp, self.get_obj(value['_obj_ref'])])
1802
+ else:
1803
+ resolved.append([stamp, value])
1804
+ return resolved
1805
+
1806
+ def get_var_desc(self, key):
1807
+ """
1808
+ Get the current (latest) description of a variable.
1809
+
1810
+ Args:
1811
+ key: Variable name
1812
+
1813
+ Returns:
1814
+ Latest description string, or default message if no description exists
1815
+ """
1816
+ desc = self._get_latest_desc(key)
1817
+ return desc if desc else "No description found."
1818
+
1819
+ def get_var_desc_history(self, key):
1820
+ """
1821
+ Get full history of a variable's descriptions as list of [stamp, description] pairs.
1822
+
1823
+ Args:
1824
+ key: Variable name
1825
+
1826
+ Returns:
1827
+ List of [stamp, description] pairs, or empty list if variable has no descriptions.
1828
+ """
1829
+ return list(self.var_desc_history.get(key, []))
1830
+
1831
+ #--- Object Methods ---
1832
+
1833
+ def set_obj(self, data, name=None, desc='', content_type='auto'):
1834
+ """
1835
+ Store a large object in compressed form.
1836
+
1837
+ Objects are compressed using zlib and base64-encoded for JSON serialization.
1838
+ Optionally creates a variable reference to the stored object.
1839
+
1840
+ Args:
1841
+ data: The data to store (bytes, str, or any JSON/pickle-serializable object)
1842
+ name: Optional variable name to create a reference
1843
+ desc: Description (used only if name is provided)
1844
+ content_type: 'bytes', 'text', 'json', 'pickle', or 'auto'
1845
+
1846
+ Returns:
1847
+ str: The object stamp (ID)
1848
+
1849
+ Example:
1850
+ # Store raw data, get stamp back
1851
+ stamp = memory.set_obj(large_text)
1852
+
1853
+ # Store and create variable reference
1854
+ memory.set_obj(image_bytes, name='profile_pic', desc='User avatar')
1855
+ memory.get_var('profile_pic') # Returns decompressed image_bytes
1856
+ """
1857
+ stamp = event_stamp({'obj': str(data)[:50]})
1858
+
1859
+ # Compress and store
1860
+ compressed_obj = compress_to_json(data, content_type)
1861
+ self.objects[stamp] = compressed_obj
1862
+
1863
+ # Optionally create a variable reference
1864
+ if name:
1865
+ obj_ref = {'_obj_ref': stamp}
1866
+ # Store reference directly in vars (bypassing size check)
1867
+ var_stamp = event_stamp({'var': name})
1868
+
1869
+ # Initialize history if needed
1870
+ if name not in self.vars:
1871
+ self.vars[name] = []
1872
+
1873
+ # Append [stamp, obj_ref] to history
1874
+ self.vars[name].append([var_stamp, obj_ref])
1875
+
1876
+ # Track description changes separately (only when provided)
1877
+ if desc:
1878
+ if name not in self.var_desc_history:
1879
+ self.var_desc_history[name] = []
1880
+ self.var_desc_history[name].append([var_stamp, desc])
1881
+
1882
+ # Get latest description for the event
1883
+ current_desc = desc if desc else self._get_latest_desc(name)
1884
+
1885
+ # Store the var event
1886
+ var_event = {
1887
+ 'type' : 'var',
1888
+ 'stamp' : var_stamp,
+ 'role' : 'system', # Match set_var's var events so role-based filters behave consistently
1889
+ 'var_name' : name,
1890
+ 'var_value': obj_ref, # Store the reference, not the data
1891
+ 'var_deleted': False,
1892
+ 'var_desc' : current_desc,
1893
+ 'content' : "Variable '{}' set to object ref: {}".format(name, stamp),
1894
+ 'mode' : 'text',
1895
+ 'dt_bog' : str(dtt.datetime.now(tz_bog))[:23],
1896
+ 'dt_utc' : str(dtt.datetime.now(tz_utc))[:23],
1897
+ }
1898
+ self._store_event('var', var_event)
1899
+
1900
+ return stamp
1901
+
1902
+ def get_obj(self, stamp):
1903
+ """
1904
+ Retrieve and decompress an object by its stamp.
1905
+
1906
+ Args:
1907
+ stamp: The object's event stamp
1908
+
1909
+ Returns:
1910
+ The decompressed original data, or None if not found
1911
+
1912
+ Example:
1913
+ data = memory.get_obj('A1B2C3...')
1914
+ """
1915
+ obj_dict = self.objects.get(stamp)
1916
+ if obj_dict is None:
1917
+ return None
1918
+ return decompress_from_json(obj_dict)
1919
+
1920
+ def get_obj_info(self, stamp):
1921
+ """
1922
+ Get metadata about a stored object without decompressing it.
1923
+
1924
+ Args:
1925
+ stamp: The object's event stamp
1926
+
1927
+ Returns:
1928
+ dict with size_original, size_compressed, content_type, or None if not found
1929
+ """
1930
+ obj_dict = self.objects.get(stamp)
1931
+ if obj_dict is None:
1932
+ return None
1933
+ return {
1934
+ 'stamp': stamp,
1935
+ 'size_original': obj_dict['size_original'],
1936
+ 'size_compressed': obj_dict['size_compressed'],
1937
+ 'content_type': obj_dict['content_type'],
1938
+ 'compression_ratio': obj_dict['size_compressed'] / obj_dict['size_original'] if obj_dict['size_original'] > 0 else 0,
1939
+ }
1940
+
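The real `compress_to_json`/`decompress_from_json` helpers live elsewhere in the module; the sketch below only assumes the metadata fields that `get_obj_info` reads (`size_original`, `size_compressed`, `content_type`) and the zlib-plus-base64 recipe named in `set_obj`'s docstring. Treat the function bodies as illustrative, not the actual implementation:

```python
import base64
import json
import zlib

def compress_to_json_sketch(data):
    """Hypothetical shape of compress_to_json: zlib-compress, then base64 for JSON safety."""
    raw = data if isinstance(data, bytes) else str(data).encode('utf-8')
    compressed = zlib.compress(raw)
    return {
        'data': base64.b64encode(compressed).decode('ascii'),
        'size_original': len(raw),
        'size_compressed': len(compressed),
        'content_type': 'bytes' if isinstance(data, bytes) else 'text',
    }

def decompress_from_json_sketch(obj):
    raw = zlib.decompress(base64.b64decode(obj['data']))
    return raw if obj['content_type'] == 'bytes' else raw.decode('utf-8')

obj = compress_to_json_sketch('hello ' * 1000)
json.dumps(obj)  # the dict is JSON-serializable, so it can sit inside self.objects
restored = decompress_from_json_sketch(obj)
print(obj['size_compressed'] < obj['size_original'])  # True for repetitive text
print(restored[:6])  # 'hello '
```

The base64 step costs about 33% size overhead, but it is what makes the compressed payload safe to embed in the JSON snapshots produced by `to_json`.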
1941
+ #---
1942
+
1943
+ def snapshot(self):
1944
+ """
1945
+ Export memory state as dict.
1946
+ Stores events and objects - indexes can be rehydrated from events.
1947
+
1948
+ Returns:
1949
+ dict with 'id', 'events', and 'objects' keys
1950
+ """
1951
+ return {
1952
+ 'id': self.id,
1953
+ 'events': dict(self.events), # All events by stamp
1954
+ 'objects': dict(self.objects), # All objects by stamp (already JSON-serializable)
1955
+ }
1956
+
1957
+ def save(self, filename, compressed=False):
1958
+ """
1959
+ Save memory to file.
1960
+
1961
+ Args:
1962
+ filename: Path to save file
1963
+ compressed: If True, use gzip compression
1964
+ """
1965
+ import gzip
1966
+ data = self.snapshot()
1967
+ if compressed:
1968
+ with gzip.open(filename, 'wb') as f:
1969
+ pickle.dump(data, f)
1970
+ else:
1971
+ with open(filename, 'wb') as f:
1972
+ pickle.dump(data, f)
1973
+
1974
+ def load(self, filename, compressed=False):
1975
+ """
1976
+ Load memory from file by rehydrating from events.
1977
+
1978
+ Args:
1979
+ filename: Path to load file
1980
+ compressed: If True, expect gzip compression
1981
+ """
1982
+ import gzip
1983
+ if compressed:
1984
+ with gzip.open(filename, 'rb') as f:
1985
+ data = pickle.load(f)
1986
+ else:
1987
+ with open(filename, 'rb') as f:
1988
+ data = pickle.load(f)
1989
+
1990
+ # Rehydrate from events (pass objects if present)
1991
+ event_list = list(data.get('events', {}).values())
1992
+ objects = data.get('objects', {})
1993
+ mem = MEMORY.from_events(event_list, data.get('id'), objects=objects)
1994
+
1995
+ # Copy state to self
1996
+ self.id = mem.id
1997
+ self.events = mem.events
1998
+ self.idx_msgs = mem.idx_msgs
1999
+ self.idx_refs = mem.idx_refs
2000
+ self.idx_logs = mem.idx_logs
2001
+ self.idx_vars = mem.idx_vars
2002
+ self.idx_all = mem.idx_all
2003
+ self.vars = mem.vars
2004
+ self.var_desc_history = mem.var_desc_history
2005
+ self.objects = mem.objects
2006
+
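The two branches of `save`/`load` reduce to a pickle round-trip, optionally wrapped in gzip. A self-contained sketch of the compressed branch; the `data` dict here is a made-up stand-in for what `snapshot()` returns:

```python
import gzip
import os
import pickle
import tempfile

# Toy snapshot: same 'id' / 'events' / 'objects' shape as snapshot(), invented contents
data = {'id': 'mem-01', 'events': {'S1': {'type': 'msg', 'content': 'hi'}}, 'objects': {}}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'mem.pkl.gz')
    # save(..., compressed=True): gzip.open behaves like open, so pickle.dump works unchanged
    with gzip.open(path, 'wb') as f:
        pickle.dump(data, f)
    # load(..., compressed=True): reverse the wrapping
    with gzip.open(path, 'rb') as f:
        restored = pickle.load(f)

print(restored == data)  # True
```

Note that `load` does not simply assign the unpickled dict: it feeds the events through `from_events` so indexes and variable state are rebuilt rather than trusted from disk.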
2007
+ def copy(self):
2008
+ """Return a deep copy of the MEMORY instance."""
2009
+ return copy.deepcopy(self)
2010
+
2011
+ def to_json(self, filename=None, indent=2):
2012
+ """
2013
+ Export memory to JSON format.
2014
+
2015
+ Like DataFrame.to_csv(), this allows saving memory state to a portable
2016
+ JSON format that can be loaded later with from_json().
2017
+
2018
+ Args:
2019
+ filename: If provided, write to file. Otherwise return JSON string.
2020
+ indent: JSON indentation level (default 2, use None for compact)
2021
+
2022
+ Returns:
2023
+ JSON string if filename is None, else None
2024
+
2025
+ Example:
2026
+ # Save to file
2027
+ memory.to_json('memory_backup.json')
2028
+
2029
+ # Get JSON string
2030
+ json_str = memory.to_json()
2031
+ """
2032
+ # Prepare data for JSON serialization
2033
+ # Need to handle VAR_DELETED sentinel in vars history
2034
+ def serialize_var_history(var_dict):
2035
+ """Convert VAR_DELETED sentinel to JSON-safe marker."""
2036
+ result = {}
2037
+ for key, history in var_dict.items():
2038
+ serialized_history = []
2039
+ for stamp, value in history:
2040
+ if value is VAR_DELETED:
2041
+ serialized_history.append([stamp, '__VAR_DELETED__'])
2042
+ else:
2043
+ serialized_history.append([stamp, value])
2044
+ result[key] = serialized_history
2045
+ return result
2046
+
2047
+ data = {
2048
+ 'version': '1.0',
2049
+ 'id': self.id,
2050
+ 'events': self.events,
2051
+ 'objects': self.objects,
2052
+ 'vars': serialize_var_history(self.vars),
2053
+ 'var_desc_history': self.var_desc_history,
2054
+ 'idx_msgs': self.idx_msgs,
2055
+ 'idx_refs': self.idx_refs,
2056
+ 'idx_logs': self.idx_logs,
2057
+ 'idx_vars': self.idx_vars,
2058
+ 'idx_all': self.idx_all,
2059
+ }
2060
+
2061
+ # default=str guards against the odd non-JSON-serializable value, matching render()'s JSON path
+ json_str = json.dumps(data, indent=indent, ensure_ascii=False, default=str)

2062
+
2063
+ if filename:
2064
+ with open(filename, 'w', encoding='utf-8') as f:
2065
+ f.write(json_str)
2066
+ return None
2067
+ return json_str
2068
+
2069
+ @classmethod
2070
+ def from_json(cls, source):
2071
+ """
2072
+ Create MEMORY instance from JSON.
2073
+
2074
+ Like pandas.read_csv(), this loads a memory from a JSON file or string
2075
+ that was saved with to_json().
2076
+
2077
+ Args:
2078
+ source: JSON string or filename path
2079
+
2080
+ Returns:
2081
+ New MEMORY instance
2082
+
2083
+ Example:
2084
+ # Load from file
2085
+ memory = MEMORY.from_json('memory_backup.json')
2086
+
2087
+ # Load from JSON string
2088
+ memory = MEMORY.from_json(json_str)
2089
+ """
2090
+ import os
2091
+
2092
+ # Determine if source is a file or JSON string
2093
+ if os.path.isfile(source):
2094
+ with open(source, 'r', encoding='utf-8') as f:
2095
+ data = json.load(f)
2096
+ else:
2097
+ data = json.loads(source)
2098
+
2099
+ # Helper to restore VAR_DELETED sentinel
2100
+ def deserialize_var_history(var_dict):
2101
+ """Convert JSON marker back to VAR_DELETED sentinel."""
2102
+ result = {}
2103
+ for key, history in var_dict.items():
2104
+ deserialized_history = []
2105
+ for stamp, value in history:
2106
+ if value == '__VAR_DELETED__':
2107
+ deserialized_history.append([stamp, VAR_DELETED])
2108
+ else:
2109
+ deserialized_history.append([stamp, value])
2110
+ result[key] = deserialized_history
2111
+ return result
2112
+
2113
+ # Create new instance
2114
+ mem = cls()
2115
+ mem.id = data.get('id', mem.id)
2116
+ mem.events = data.get('events', {})
2117
+ mem.objects = data.get('objects', {})
2118
+ mem.vars = deserialize_var_history(data.get('vars', {}))
2119
+ mem.var_desc_history = data.get('var_desc_history', {})
2120
+ mem.idx_msgs = data.get('idx_msgs', [])
2121
+ mem.idx_refs = data.get('idx_refs', [])
2122
+ mem.idx_logs = data.get('idx_logs', [])
2123
+ mem.idx_vars = data.get('idx_vars', [])
2124
+ mem.idx_all = data.get('idx_all', [])
2125
+
2126
+ return mem
2127
+
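The `__VAR_DELETED__` marker round-trip that `to_json`/`from_json` perform can be isolated to a few lines. One assumption in this sketch: the sentinel is a plain `object()`, as the identity comparisons elsewhere in the class imply:

```python
import json

VAR_DELETED = object()  # in-memory sentinel; not JSON-serializable directly

def serialize(history):
    # Mirrors to_json's helper: swap the sentinel for a string marker on the way out
    return [[s, '__VAR_DELETED__' if v is VAR_DELETED else v] for s, v in history]

def deserialize(history):
    # Mirrors from_json's helper: swap the marker back for the sentinel on the way in
    return [[s, VAR_DELETED if v == '__VAR_DELETED__' else v] for s, v in history]

hist = [['S1', 'abc'], ['S2', VAR_DELETED]]
wire = json.dumps(serialize(hist))      # now plain JSON
back = deserialize(json.loads(wire))

print(back[0][1])                 # 'abc'
print(back[1][1] is VAR_DELETED)  # True
```

One consequence of a string marker worth knowing: a user value that happens to equal `'__VAR_DELETED__'` would round-trip as a tombstone. In practice that string is unlikely, but it is the trade-off of this encoding.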
2128
+ @classmethod
2129
+ def from_events(cls, event_list, memory_id=None, objects=None):
2130
+ """
2131
+ Rehydrate a MEMORY instance from a list of events.
2132
+ This is the inverse of snapshot() and enables cloud sync.
2133
+
2134
+ Args:
2135
+ event_list: List of event dicts (order doesn't matter, will be sorted)
2136
+ memory_id: Optional ID for the memory instance
2137
+ objects: Optional dict of objects (stamp → compressed object dict)
2138
+
2139
+ Returns:
2140
+ New MEMORY instance with all events loaded
2141
+ """
2142
+ mem = cls()
2143
+ if memory_id:
2144
+ mem.id = memory_id
2145
+
2146
+ # Restore objects if provided
2147
+ if objects:
2148
+ mem.objects = dict(objects)
2149
+
2150
+ # Sort events by timestamp (dt_utc) for chronological order
2151
+ sorted_events = sorted(event_list, key=lambda e: e.get('dt_utc', ''))
2152
+
2153
+ for ev in sorted_events:
2154
+ stamp = ev.get('stamp')
2155
+ timestamp = ev.get('dt_utc', '')
2156
+ if not stamp:
2157
+ continue
2158
+
2159
+ event_type = ev.get('type', 'msg')
2160
+
2161
+ # Store in data layer
2162
+ mem.events[stamp] = ev
2163
+
2164
+ # Create [timestamp, stamp] pair for indexes
2165
+ ts_pair = [timestamp, stamp]
2166
+
2167
+ # Add to appropriate index (direct append since already sorted by timestamp)
2168
+ if event_type == 'msg':
2169
+ mem.idx_msgs.append(ts_pair)
2170
+ elif event_type == 'ref':
2171
+ mem.idx_refs.append(ts_pair)
2172
+ elif event_type == 'log':
2173
+ mem.idx_logs.append(ts_pair)
2174
+ elif event_type == 'var':
2175
+ mem.idx_vars.append(ts_pair)
2176
+ # Replay variable state into history list
2177
+ var_name = ev.get('var_name')
2178
+ if var_name:
2179
+ # Initialize history list if needed
2180
+ if var_name not in mem.vars:
2181
+ mem.vars[var_name] = []
2182
+
2183
+ # Determine value (check for deletion marker)
2184
+ if ev.get('var_deleted', False):
2185
+ value = VAR_DELETED
2186
+ else:
2187
+ value = ev.get('var_value')
2188
+
2189
+ # Append to history
2190
+ mem.vars[var_name].append([stamp, value])
2191
+
2192
+ # Rebuild description history if present
2193
+ var_desc = ev.get('var_desc')
2194
+ if var_desc:
2195
+ if var_name not in mem.var_desc_history:
2196
+ mem.var_desc_history[var_name] = []
2197
+ # Only add if different from last description (avoid duplicates)
2198
+ desc_hist = mem.var_desc_history[var_name]
2199
+ if not desc_hist or desc_hist[-1][1] != var_desc:
2200
+ desc_hist.append([stamp, var_desc])
2201
+
2202
+ mem.idx_all.append(ts_pair)
2203
+
2204
+ return mem
2205
+
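The replay loop in `from_events` is a classic event-sourcing fold: sort by timestamp, then dispatch each event into indexes and state. A reduced sketch covering just the `msg` and `var` branches, with invented sample stamps and timestamps:

```python
events = [
    {'stamp': 'S2', 'type': 'var', 'var_name': 'x', 'var_value': 2, 'dt_utc': '2024-01-01 00:00:01'},
    {'stamp': 'S1', 'type': 'msg', 'role': 'user', 'content': 'hi', 'dt_utc': '2024-01-01 00:00:00'},
    {'stamp': 'S3', 'type': 'var', 'var_name': 'x', 'var_value': 3, 'dt_utc': '2024-01-01 00:00:02'},
]

# Sort chronologically first, so indexes can be appended to directly
idx_msgs, idx_vars, vars_state = [], [], {}
for ev in sorted(events, key=lambda e: e.get('dt_utc', '')):
    pair = [ev['dt_utc'], ev['stamp']]          # [timestamp, stamp] index entry
    if ev['type'] == 'msg':
        idx_msgs.append(pair)
    elif ev['type'] == 'var':
        idx_vars.append(pair)
        # Replay variable state exactly as from_events does: append to history
        vars_state.setdefault(ev['var_name'], []).append([ev['stamp'], ev['var_value']])

print(vars_state['x'][-1][1])       # 3 -- latest value wins after replay
print([p[1] for p in idx_vars])     # ['S2', 'S3'] in chronological order
```

Because state is derived purely from the event list, any two peers holding the same events converge on the same indexes and variable histories, which is what makes the cloud-sync use case work.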
2206
+ #---
2207
+
2208
+ # The render method provides a flexible way to display or export the MEMORY's messages or events.
2209
+ # It supports event type selection, output format, advanced filtering, metadata inclusion, pretty-printing, and message condensing.
2210
+ def render(
2211
+ self,
2212
+ include=('msgs',), # Tuple/list of event types to include: 'msgs', 'logs', 'refs', 'vars', 'events'
2213
+ output_format='plain', # 'plain', 'markdown', 'json', 'table', 'conversation'
2214
+ role_filter=None, # List of roles to include (None = all)
2215
+ mode_filter=None, # List of modes to include (None = all)
2216
+ channel_filter=None, # Channel to filter by (None = all)
2217
+ content_filter=None, # String or list of keywords to filter content (None = all)
2218
+ include_metadata=True, # Whether to include metadata (timestamps, roles, etc.)
2219
+ pretty=True, # Pretty-print for human readability
2220
+ max_length=None, # Max total length of output (int, None = unlimited)
2221
+ condense_msg=True, # If True, snip/condense messages that exceed max_length
2222
+ time_range=None, # Tuple (start_dt, end_dt) to filter by datetime (None = all)
2223
+ event_limit=None, # Max number of events to include (None = all)
2224
+ # Conversation/LLM-optimized options:
2225
+ max_message_length=1000, # Max length per individual message (for 'conversation' format)
2226
+ max_total_length=8000, # Max total length of the entire conversation (for 'conversation' format)
2227
+ include_roles=('user', 'assistant'), # Which roles to include (for 'conversation' format)
2228
+ message_separator="\n\n", # Separator between messages (for 'conversation' format)
2229
+ role_prefix=True, # Whether to include role prefixes like "User:" and "Assistant:" (for 'conversation' format)
2230
+ truncate_indicator="...", # What to show when content is truncated (for 'conversation' format)
2231
+ ):
2232
+ """
2233
+ Render MEMORY contents with flexible filtering and formatting.
2234
+
2235
+ This method unifies all rendering and export logic, including:
2236
+ - General event/message rendering (plain, markdown, table, json)
2237
+ - Advanced filtering (by role, mode, channel, content, time, event type)
2238
+ - Metadata inclusion and pretty-printing
2239
+ - Output length limiting and message condensing/snipping
2240
+ - LLM-optimized conversation export (via output_format='conversation'),
2241
+ which produces a clean text blob of user/assistant messages with
2242
+ configurable length and formatting options.
2243
+
2244
+ Args:
2245
+ include: Which event types to include ('msgs', 'logs', 'refs', 'vars', 'events')
2246
+ output_format: 'plain', 'markdown', 'json', 'table', or 'conversation'
2247
+ role_filter: List of roles to include (None = all)
2248
+ mode_filter: List of modes to include (None = all)
2249
+ channel_filter: Channel to filter by (None = all)
2250
+ content_filter: String or list of keywords to filter content (None = all)
2251
+ include_metadata: Whether to include metadata (timestamps, roles, etc.)
2252
+ pretty: Pretty-print for human readability
2253
+ max_length: Max total length of output (for general formats)
2254
+ condense_msg: If True, snip/condense messages that exceed max_length
2255
+ time_range: Tuple (start_dt, end_dt) to filter by datetime (None = all)
2256
+ event_limit: Max number of events to include (None = all)
2257
+ max_message_length: Max length per message (for 'conversation' format)
2258
+ max_total_length: Max total length (for 'conversation' format)
2259
+ include_roles: Which roles to include (for 'conversation' format)
2260
+ message_separator: Separator between messages (for 'conversation' format)
2261
+ role_prefix: Whether to include role prefixes (for 'conversation' format)
2262
+ truncate_indicator: Indicator for truncated content (for 'conversation' format)
2263
+
2264
+ Returns:
2265
+ str or dict: Rendered output in the specified format.
2266
+
2267
+ Example usage:
2268
+ mem = MEMORY()
2269
+ mem.add_msg('user', 'Hello!')
2270
+ mem.add_msg('assistant', 'Hi there!')
2271
+ print(mem.render()) # Default: plain text, all messages
2272
+
2273
+ # Render only user messages in markdown
2274
+ print(mem.render(role_filter=['user'], output_format='markdown'))
2275
+
2276
+ # Render as a table, including logs and refs
2277
+ print(mem.render(include=('msgs', 'logs', 'refs'), output_format='table'))
2278
+
2279
+ # Render with a content keyword filter and max length
2280
+ print(mem.render(content_filter='hello', max_length=50))
2281
+
2282
+ # Export as LLM-optimized conversation
2283
+ print(mem.render(output_format='conversation', max_total_length=2000))
2284
+
2285
+ # Filter by channel
2286
+ print(mem.render(channel_filter='telegram'))
2287
+ """
2288
+ import json
2289
+ from datetime import datetime
2290
+
2291
+ # Helper: flatten include to set for fast lookup
2292
+ include_set = set(include)
2293
+
2294
+ # Helper: filter events by type using the new index-based retrieval
2295
+ def filter_events():
2296
+ events = []
2297
+ if 'events' in include_set:
2298
+ # Include all events from master index
2299
+ events = self._get_events_from_index(self.idx_all, -1)
2300
+ else:
2301
+ # Selectively include types
2302
+ if 'msgs' in include_set:
2303
+ events.extend(self._get_events_from_index(self.idx_msgs, -1))
2304
+ if 'logs' in include_set:
2305
+ events.extend(self._get_events_from_index(self.idx_logs, -1))
2306
+ if 'refs' in include_set:
2307
+ events.extend(self._get_events_from_index(self.idx_refs, -1))
2308
+ if 'vars' in include_set:
2309
+ events.extend(self._get_events_from_index(self.idx_vars, -1))
2310
+ return events
2311
+
2312
+ # Helper: filter by role, mode, channel, content, and time
2313
+ def advanced_filter(evlist):
2314
+ filtered = []
2315
+ for ev in evlist:
2316
+ # Role filter
2317
+ if role_filter:
2318
+ ev_role = ev.get('role') or ev.get('type')
2319
+ if ev_role not in role_filter:
2320
+ continue
2321
+ # Mode filter
2322
+ if mode_filter and ev.get('mode') not in mode_filter:
2323
+ continue
2324
+ # Channel filter
2325
+ if channel_filter and ev.get('channel') != channel_filter:
2326
+ continue
2327
+ # Content filter
2328
+ if content_filter:
2329
+ content = ev.get('content', '')
2330
+ if isinstance(content_filter, str):
2331
+ if content_filter.lower() not in content.lower():
2332
+ continue
2333
+ else: # list of keywords
2334
+ if not any(kw.lower() in content.lower() for kw in content_filter):
2335
+ continue
2336
+ # Time filter
2337
+ if time_range:
2338
+ # Try to get timestamp from event
2339
+ dt_str = ev.get('dt_utc') or ev.get('dt_bog')
2340
+ if dt_str:
2341
+ try:
2342
+ dt = datetime.fromisoformat(dt_str)
2343
+ start, end = time_range
2344
+ if (start and dt < start) or (end and dt > end):
2345
+ continue
2346
+ except Exception:
2347
+ pass # Ignore if can't parse
2348
+ filtered.append(ev)
2349
+ return filtered
2350
+
2351
+ # Helper: sort events by stamp (alphabetical = chronological)
2352
+ def sort_events(evlist):
2353
+ return sorted(evlist, key=lambda ev: ev.get('stamp', ''))
2354
+
2355
+ # Step 1: Gather and filter events
2356
+ events = filter_events()
2357
+ events = advanced_filter(events)
2358
+ events = sort_events(events)
2359
+ if event_limit:
2360
+ events = events[-event_limit:] # Most recent N
2361
+
2362
+ # --- Conversation/LLM-optimized format ---
2363
+ if output_format == 'conversation':
2364
+ # Only include messages and filter by include_roles
2365
+ conv_msgs = [ev for ev in events if ev.get('role') in include_roles]
2366
+ # Already sorted by stamp
2367
+
2368
+ conversation_parts = []
2369
+ current_length = 0
2370
+ for msg in conv_msgs:
2371
+ role = msg.get('role', 'unknown')
2372
+ content = msg.get('content', '')
2373
+
2374
+ # Truncate individual message if needed
2375
+ if len(content) > max_message_length:
2376
+ content = content[:max_message_length - len(truncate_indicator)] + truncate_indicator
2377
+
2378
+ # Format the message
2379
+ if role_prefix:
2380
+ if role == 'user':
2381
+ formatted_msg = "User: " + content
2382
+ elif role == 'assistant':
2383
+ formatted_msg = "Assistant: " + content
2384
+ else:
2385
+ formatted_msg = role.title() + ": " + content
2386
+ else:
2387
+ formatted_msg = content
2388
+
2389
+ # Check if adding this message would exceed total length
2390
+ message_length = len(formatted_msg) + len(message_separator)
2391
+ if current_length + message_length > max_total_length:
2392
+ # If we can't fit the full message, try to fit a truncated version
2393
+ remaining_space = max_total_length - current_length - len(truncate_indicator)
2394
+ if remaining_space > 50: # Only add if there's reasonable space
2395
+ if role_prefix:
2396
+ prefix_len = len(role.title() + ": ")
2397
+ truncated_content = content[:remaining_space - prefix_len] + truncate_indicator
2398
+ formatted_msg = role.title() + ": " + truncated_content
2399
+ else:
2400
+ formatted_msg = content[:remaining_space] + truncate_indicator
2401
+ conversation_parts.append(formatted_msg)
2402
+ break
2403
+
2404
+ conversation_parts.append(formatted_msg)
2405
+ current_length += message_length
2406
+
2407
+ return message_separator.join(conversation_parts)
2408
+
2409
+ # --- JSON format ---
2410
+ output = None
2411
+ total_length = 0
2412
+ snip_notice = " [snipped]" # For snipped messages
2413
+
2414
+ if output_format == 'json':
2415
+ # Output as JSON (list of dicts)
2416
+ if not include_metadata:
2417
+ # Remove metadata fields
2418
+ def strip_meta(ev):
2419
+ return {k: v for k, v in ev.items() if k in ('role', 'content', 'type', 'channel')}
2420
+ out_events = [strip_meta(ev) for ev in events]
2421
+ else:
2422
+ out_events = events
2423
+ output = json.dumps(out_events, indent=2 if pretty else None, default=str)
2424
+ if max_length and len(output) > max_length:
2425
+ output = output[:max_length] + snip_notice
2426
+
2427
+ elif output_format in ('plain', 'markdown', 'table'):
2428
+ # Build lines for each event
2429
+ lines = []
2430
+ for ev in events:
2431
+ # Compose line based on event type
2432
+ event_type = ev.get('type', 'msg')
2433
+ if event_type == 'log' or ev.get('role') == 'logger':
2434
+ prefix = "[LOG]"
2435
+ content = ev.get('content', '')
2436
+ elif event_type == 'ref':
2437
+ prefix = "[REF]"
2438
+ content = ev.get('content', '')
2439
+ elif event_type == 'var':
2440
+ prefix = "[VAR]"
2441
+ content = "{} = {}".format(ev.get('var_name', '?'), ev.get('var_value', '?'))
2442
+ else:
2443
+ prefix = "[{}]".format(ev.get('role', 'MSG').upper())
2444
+ content = ev.get('content', '')
2445
+
2446
+ # Optionally include metadata
2447
+ meta = ""
2448
+ if include_metadata:
2449
+ dt = ev.get('dt_utc') or ev.get('dt_bog')
2450
+ stamp = ev.get('stamp', '')
2451
+ channel = ev.get('channel', '')
2452
+ meta = " ({})".format(dt) if dt else ""
2453
+ if output_format == 'table':
2454
+ meta = "\t{}\t{}\t{}".format(dt or '', stamp or '', channel or '')
2455
+
2456
+ # Condense message if needed
2457
+ line = "{} {}{}".format(prefix, content, meta)
2458
+ if max_length and total_length + len(line) > max_length:
2459
+ if condense_msg:
2460
+ # Snip the content to fit
2461
+ allowed = max_length - total_length - len(snip_notice)
2462
+ if allowed > 0:
2463
+ line = line[:allowed] + snip_notice
2464
+ else:
2465
+ line = snip_notice
2466
+ lines.append(line)
2467
+ break
2468
+ else:
2469
+ break
2470
+ lines.append(line)
2471
+ total_length += len(line) + 1 # +1 for newline
2472
+
2473
+ # Format as table if requested
2474
+ if output_format == 'table':
2475
+ # Table header
2476
+ header = "Type\tContent\tDatetime\tStamp\tChannel"
2477
+ table_lines = [header]
2478
+ for ev in events:
2479
+ typ = ev.get('type', ev.get('role', ''))
2480
+ if typ == 'var':
2481
+ content = "{} = {}".format(ev.get('var_name', '?'), ev.get('var_value', '?'))
2482
+ else:
2483
+ content = ev.get('content', '')
2484
+ dt = ev.get('dt_utc') or ev.get('dt_bog') or ''
2485
+ stamp = ev.get('stamp', '')
2486
+ channel = ev.get('channel', '')
2487
+ row = "{}\t{}\t{}\t{}\t{}".format(typ, content, dt, stamp, channel)
2488
+ table_lines.append(row)
2489
+ output = "\n".join(table_lines)
2490
+ else:
2491
+ sep = "\n" if pretty else " "
2492
+ output = sep.join(lines)
2493
+
2494
+ else:
2495
+ raise ValueError("Unknown output_format: {}".format(output_format))
2496
+
2497
+ return output
2498
+
2499
+
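The 'conversation' format's two-level truncation (a per-message cap, then a running total budget) can be sketched independently. `budget_join` and its defaults are illustrative names, not part of the class, and the real method additionally tries to fit one final truncated message before breaking:

```python
def budget_join(msgs, max_message_length=30, max_total_length=80, sep="\n\n", indicator="..."):
    """Sketch of render(output_format='conversation')'s two-level truncation."""
    parts, used = [], 0
    for role, content in msgs:
        # Level 1: cap each individual message
        if len(content) > max_message_length:
            content = content[:max_message_length - len(indicator)] + indicator
        line = "{}: {}".format(role.title(), content)
        # Level 2: stop once the running total would exceed the overall budget
        cost = len(line) + len(sep)
        if used + cost > max_total_length:
            break
        parts.append(line)
        used += cost
    return sep.join(parts)

msgs = [('user', 'a' * 100), ('assistant', 'short reply'), ('user', 'b' * 100)]
out = budget_join(msgs)
print(len(out) <= 80)  # True: output stays within the total budget
```

Counting `len(sep)` into each message's cost keeps the estimate conservative, so the joined string can never overshoot `max_total_length`.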
2500
+ MemoryManipulationExamples = """
2501
+
2502
+ MEMORY Class Usage Tutorial
2503
+ ===========================
2504
+
2505
+ This tutorial demonstrates common workflows and transactions using the MEMORY class.
2506
+ The MEMORY class is an event-sourced state container for managing messages, logs,
2507
+ reflections, and variables in agentic or conversational systems.
2508
+
2509
+ Key Features:
2510
+ - Everything is an event with a sortable ID (alphabetical = chronological)
2511
+ - Events stored in a dictionary for O(1) lookup
2512
+ - Channel tracking for messages (omni-directional communication)
2513
+ - Full variable history with timestamps
2514
+ - Memory can be rehydrated from event list for cloud sync
2515
+
2516
+ ------------------------------------------------------------
2517
+ 1. Initialization
2518
+ ------------------------------------------------------------
2519
+
2520
+ >>> mem = MEMORY()
2521
+
2522
+ Creates a new MEMORY instance with empty event stores and indexes.
2523
+
2524
+ ------------------------------------------------------------
2525
+ 2. Adding and Retrieving Messages with Channel Support
2526
+ ------------------------------------------------------------
2527
+
2528
+ # Add user and assistant messages with channel tracking
2529
+ >>> mem.add_msg('user', 'Hello, assistant!', channel='webapp')
2530
+ >>> mem.add_msg('assistant', 'Hello, user! How can I help you?', channel='webapp')
2531
+
2532
+ # Messages from different channels
2533
+ >>> mem.add_msg('user', 'Quick question via phone', channel='ios')
2534
+ >>> mem.add_msg('user', 'Following up on Telegram', channel='telegram')
2535
+
2536
+ # Retrieve all messages as a list of dicts
2537
+ >>> mem.get_msgs()
2538
+ [{'role': 'user', 'content': 'Hello, assistant!', 'channel': 'webapp', ...}, ...]
2539
+
2540
+ # Filter messages by channel
2541
+ >>> mem.get_msgs(channel='telegram')
2542
+
2543
+ # Retrieve only user messages as a string
2544
+ >>> mem.get_msgs(include=['user'], repr='str')
2545
+ 'user: Hello, assistant!'
2546
+
2547
+ # Get the last assistant message
2548
+ >>> mem.last_asst_msg()
2549
+ 'Hello, user! How can I help you?'
2550
+
2551
+ ------------------------------------------------------------
2552
+ 3. Logging and Reflections
2553
+ ------------------------------------------------------------
2554
+
2555
+ # Add a log entry
2556
+ >>> mem.add_log('System initialized.')
2557
+
2558
+ # Add a reflection (agent's internal reasoning)
2559
+ >>> mem.add_ref('User seems to be asking about weather patterns.')
2560
+
2561
+ # Retrieve the last log message
2562
+ >>> mem.last_log_msg()
2563
+ 'System initialized.'
2564
+
2565
+ # Get all logs
2566
+ >>> mem.get_logs()
2567
+
2568
+ # Get all reflections
2569
+ >>> mem.get_refs()
2570
+
2571
+ ------------------------------------------------------------
2572
+ 4. Managing Variables (Full History Tracking)
2573
+ ------------------------------------------------------------
2574
+
2575
+ # Set a variable with a description (logged as an event!)
2576
+ >>> mem.set_var('session_id', 'abc123', desc='Current session identifier')
2577
+
2578
+ # Update the variable (appends to history, doesn't overwrite)
2579
+ >>> mem.set_var('session_id', 'xyz789')
2580
+
2581
+ # Retrieve the current value of a variable
2582
+ >>> mem.get_var('session_id')
2583
+ 'xyz789'
2584
+
2585
+ # Get all current non-deleted variables as a dict
2586
+ >>> mem.get_all_vars()
2587
+ {'session_id': 'xyz789'}
2588
+
2589
+ # Get full variable history as list of [stamp, value] pairs
2590
+ >>> mem.get_var_history('session_id')
2591
+ [['stamp1...', 'abc123'], ['stamp2...', 'xyz789']]
2592
+
2593
+ # Get variable description
2594
+ >>> mem.get_var_desc('session_id')
2595
+ 'Current session identifier'
2596
+
2597
+ # Delete a variable (marks as deleted but preserves history)
2598
+ >>> mem.del_var('session_id')
2599
+
2600
+ # After deletion, get_var returns None
2601
+ >>> mem.get_var('session_id')
2602
+ None
2603
+
2604
+ # Check if a variable is deleted
2605
+ >>> mem.is_var_deleted('session_id')
2606
+ True
2607
+
2608
+ # History still shows all changes including deletion
2609
+ >>> mem.get_var_history('session_id')
2610
+ [['stamp1...', 'abc123'], ['stamp2...', 'xyz789'], ['stamp3...', <DELETED>]]
2611
+
2612
+ # Variable can be re-set after deletion
2613
+ >>> mem.set_var('session_id', 'new_value')
2614
+ >>> mem.get_var('session_id')
2615
+ 'new_value'
2616
+
2617
+ ------------------------------------------------------------
2618
+ 5. Saving, Loading, and Copying State
2619
+ ------------------------------------------------------------
2620
+
2621
+ # Save MEMORY state to a file
2622
+ >>> mem.save('memory_state.pkl')
2623
+
2624
+ # Save with compression
2625
+ >>> mem.save('memory_state.pkl.gz', compressed=True)
2626
+
2627
+ # Load MEMORY state from a file (rehydrates from events)
2628
+ >>> mem2 = MEMORY()
2629
+ >>> mem2.load('memory_state.pkl')
2630
+
2631
+ # Deep copy the MEMORY object
2632
+ >>> mem3 = mem.copy()
2633
+
2634
+ ------------------------------------------------------------
2635
+ 6. Rehydrating from Events (Cloud Sync Ready)
2636
+ ------------------------------------------------------------
2637
+
2638
+ # Export all events
2639
+ >>> events = mem.get_events()
2640
+
2641
+ # Create a new memory from events (order doesn't matter, sorted by stamp)
2642
+ >>> mem_copy = MEMORY.from_events(events)
2643
+
2644
+ # Export snapshot for cloud storage
2645
+ >>> snapshot = mem.snapshot()
2646
+ # snapshot = {'id': '...', 'events': {...}}
2647
+
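The rehydration idea can be sketched in isolation (the event shape here is hypothetical, not MEMORY's actual schema): replaying events in sorted-stamp order reconstructs the same state regardless of the order in which events arrive.

```python
# Hypothetical event dict keyed by sortable stamp; insertion order is scrambled
events = {
    "0002": {"type": "var", "key": "weather", "value": "sunny"},
    "0001": {"type": "msg", "role": "user", "content": "hi"},
}

def rehydrate(events):
    state = {"msgs": [], "vars": {}}
    for stamp in sorted(events):  # alphabetical stamp order == chronological
        ev = events[stamp]
        if ev["type"] == "msg":
            state["msgs"].append({"role": ev["role"], "content": ev["content"]})
        elif ev["type"] == "var":
            state["vars"][ev["key"]] = ev["value"]
    return state

state = rehydrate(events)
assert state["msgs"][0]["content"] == "hi"
assert state["vars"]["weather"] == "sunny"
```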
2648
+ ------------------------------------------------------------
2649
+ 7. Rendering and Exporting Memory Contents
2650
+ ------------------------------------------------------------
2651
+
2652
+ # Render all messages as plain text (default)
2653
+ >>> print(mem.render())
2654
+
2655
+ # Render only user messages in markdown format
2656
+ >>> print(mem.render(role_filter=['user'], output_format='markdown'))
2657
+
2658
+ # Render as a table, including logs and reflections
2659
+ >>> print(mem.render(include=('msgs', 'logs', 'refs'), output_format='table'))
2660
+
2661
+ # Filter by channel
2662
+ >>> print(mem.render(channel_filter='telegram'))
2663
+
2664
+ # Render with a content keyword filter and max length
2665
+ >>> print(mem.render(content_filter='hello', max_length=50))
2666
+
2667
+ # Export as LLM-optimized conversation (for prompt construction)
2668
+ >>> print(mem.render(output_format='conversation', max_total_length=2000))
2669
+
2670
+ ------------------------------------------------------------
2671
+ 8. Advanced Filtering and Formatting
2672
+ ------------------------------------------------------------
2673
+
2674
+ # Filter by role, mode, and channel
2675
+ >>> print(mem.render(role_filter=['assistant'], mode_filter=['text'], channel_filter='webapp'))
2676
+
2677
+ # Filter by time range (using datetime objects)
2678
+ >>> from datetime import datetime, timedelta
2679
+ >>> start = datetime.utcnow() - timedelta(hours=1)
2680
+ >>> end = datetime.utcnow()
2681
+ >>> print(mem.render(time_range=(start, end)))
2682
+
2683
+ # Limit number of events/messages
2684
+ >>> print(mem.render(event_limit=5))
2685
+
2686
+ # Get all events of specific types
2687
+ >>> mem.get_events(event_types=['msg', 'ref'])
2688
+
2689
+ ------------------------------------------------------------
2690
+ 9. Example: Full Workflow
2691
+ ------------------------------------------------------------
2692
+
2693
+ >>> mem = MEMORY()
2694
+ >>> mem.add_msg('user', 'What is the weather today?', channel='webapp')
2695
+ >>> mem.add_msg('assistant', 'The weather is sunny and warm.', channel='webapp')
2696
+ >>> mem.set_var('weather', 'sunny and warm', desc='Latest weather info')
2697
+ >>> mem.add_ref('User is interested in outdoor activities.')
2698
+ >>> mem.add_log('Weather query processed successfully.')
2699
+ >>> print(mem.render(output_format='conversation'))
2700
+
2701
+ # Export all events and rehydrate
2702
+ >>> all_events = mem.get_events()
2703
+ >>> mem_restored = MEMORY.from_events(all_events, mem.id)
2704
+
2705
+ ------------------------------------------------------------
2706
+ For more details, see the MEMORY class docstring and method documentation.
2707
+ ------------------------------------------------------------
2708
+ """
2709
+
2710
+
2711
+ #############################################################################
2712
+ #############################################################################
2713
+
2714
+ ### THOUGHT CLASS
2715
+
2716
+
2717
+
2718
+ class THOUGHT:
2719
+ """
2720
+ The THOUGHT class represents a single, modular reasoning or action step within an agentic
2721
+ workflow. It is designed to operate on MEMORY objects, orchestrating LLM calls, memory queries,
2722
+ and variable manipulations in a composable and traceable manner.
2723
+ THOUGHTs are the atomic units of reasoning, planning, and execution in the Thoughtflow framework,
2724
+ and can be chained or composed to build complex agent behaviors.
2725
+
2726
+ CONCEPT:
2727
+ A thought is a self-contained, modular process of (1) creating a structured prompt for an LLM,
2728
+ (2) executing the LLM request, (3) cleaning / validating the LLM response, and (4) retrying execution
2729
+ if necessary. It is the discrete unit of cognition: the execution of a single cognitive task.
2730
+ In doing so, we have created the fundamental component for architecting multi-step cognitive systems.
2731
+
2732
+ The Simple Equation of a Thought:
2733
+ Thoughts = Prompt + Context + LLM + Parsing + Validation
2734
+
2735
+
2736
+ COMPONENTS:
2737
+
2738
+ 1. PROMPT
2739
+ The Prompt() object is essentially a structured template, which may contain parameters to fill in.
2740
+ This defines the structure and the rules for executing the LLM request.
2741
+
2742
+ 2. CONTEXT
2743
+ This is the relevant context, which comes from a Memory() object. It is passed to the prompt object as a
2744
+ dictionary containing the required / optional variables. Any context variable that is given but does not
2745
+ exist in the prompt template will be excluded.
2746
+
2747
+ 3. LLM REQUEST
2748
+ This is the simple transaction of submitting a structured Messages object to an LLM in order to receive
2749
+ a response. The messages object may include a system prompt and a series of historical user / assistant
2750
+ interactions. The request also carries parameters such as temperature.
2751
+
2752
+ 4. PARSING
2753
+ LLMs often emit extra text even when told not to. For this reason, it is important
2754
+ to parse the response such that we are only handling the content that was requested, and nothing more.
2755
+ So if we are asking for a Python List, the parsed response should begin with "[" and end with "]".
2756
+
2757
+ 5. VALIDATION
2758
+ Even a successfully parsed response may be invalid given the constraints of the Thought. For this
2759
+ reason, it is helpful to have a validation routine that marks the response as valid according to a
2760
+ fixed list of rules. "max_retries" is a parameter that tells the Thought how many times it can
2761
+ retry the prompt before returning an error.
2762
+
2763
+
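The parse-then-validate steps described above can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the THOUGHT implementation itself:

```python
import ast
import re

def parse_list(response):
    # Keep only the bracketed span the prompt asked for, ignoring extra chatter
    match = re.search(r"(\[.*\])", response, re.DOTALL)
    if match is None:
        raise ValueError("No list found in response.")
    return ast.literal_eval(match.group(1))

def validate_min_len(parsed, n=1):
    # A fixed rule: the parsed value must be a list with at least n items
    return isinstance(parsed, list) and len(parsed) >= n

raw = "Sure! Here is your list:\n['alpha', 'beta']\nHope that helps."
items = parse_list(raw)
assert validate_min_len(items, 2)
```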
2764
+ Supported Operations:
2765
+ - llm_call: Execute an LLM request with prompt and context (default)
2766
+ - memory_query: Query memory state and return variables/data without LLM
2767
+ - variable_set: Set or compute memory variables from context
2768
+ - conditional: Execute logic based on memory conditions
2769
+
2770
+ Key Features:
2771
+ - Callable interface: mem = thought(mem) or mem = thought(mem, vars)
2772
+ - Automatic retry with configurable attempts and repair prompts
2773
+ - Schema-based response parsing via valid_extract or custom parsers
2774
+ - Multiple validators: has_keys, list_min_len, custom callables
2775
+ - Pre/post hooks for custom processing
2776
+ - Full execution tracing and history
2777
+ - Serialization support via to_dict()/from_dict()
2778
+ - Channel support for message tracking
2779
+
2780
+ Parameters:
2781
+ name (str): Unique identifier for this thought
2782
+ llm (LLM): LLM instance for execution (required for llm_call operation)
2783
+ prompt (str|dict): Prompt template with {variable} placeholders
2784
+ operation (str): Type of operation ('llm_call', 'memory_query', 'variable_set', 'conditional')
2785
+ system_prompt (str): Optional system prompt for LLM context (via config)
2786
+ parser (str|callable): Response parser ('text', 'json', 'list', or callable)
2787
+ parsing_rules (dict): Schema for valid_extract parsing (e.g., {'kind': 'python', 'format': []})
2788
+ validator (str|callable): Response validator ('any', 'has_keys:k1,k2', 'list_min_len:N', or callable)
2789
+ max_retries (int): Maximum retry attempts (default: 1)
2790
+ retry_delay (float): Delay between retries in seconds (default: 0)
2791
+ required_vars (list): Variables required from memory
2792
+ optional_vars (list): Optional variables from memory
2793
+ output_var (str): Variable name for storing result (default: '{name}_result')
2794
+ pre_hook (callable): Function called before execution: fn(thought, memory, vars, **kwargs)
2795
+ post_hook (callable): Function called after execution: fn(thought, memory, result, error)
2796
+ channel (str): Channel for message tracking (default: 'system')
2797
+ add_reflection (bool): Whether to add reflection on success (default: True)
2798
+
2799
+ Example usage:
2800
+ # Basic LLM call with result storage
2801
+ mem = MEMORY()
2802
+ llm = LLM(model="openai:gpt-4o-mini", api_key="...")
2803
+ thought = THOUGHT(
2804
+ name="summarize",
2805
+ llm=llm,
2806
+ prompt="Summarize the last user message: {last_user_msg}",
2807
+ operation="llm_call"
2808
+ )
2809
+ mem = thought(mem) # Executes the thought, updates memory with result
2810
+ result = mem.get_var("summarize_result")
2811
+
2812
+ # Schema-based parsing example
2813
+ thought = THOUGHT(
2814
+ name="extract_info",
2815
+ llm=llm,
2816
+ prompt="Extract name and age from: {text}",
2817
+ parsing_rules={"kind": "python", "format": {"name": "", "age": 0}}
2818
+ )
2819
+
2820
+ # Memory query example (no LLM)
2821
+ thought = THOUGHT(
2822
+ name="get_context",
2823
+ operation="memory_query",
2824
+ required_vars=["user_name", "session_id"]
2825
+ )
2826
+
2827
+ # Variable set example
2828
+ thought = THOUGHT(
2829
+ name="init_session",
2830
+ operation="variable_set",
2831
+ prompt={"session_active": True, "start_time": None} # dict of values to set
2832
+ )
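The conditional operation's dispatch can be sketched in isolation. run_conditional is a hypothetical stand-in, not part of the API; it mirrors the condition / if_true / if_false config keys, where the condition receives (memory, ctx) and either branch may be a plain value or a callable:

```python
def run_conditional(config, memory, ctx):
    # Evaluate the condition, pick a branch, and call it if it is callable
    condition = config["condition"]
    branch = config["if_true"] if condition(memory, ctx) else config["if_false"]
    return branch(memory, ctx) if callable(branch) else branch

config = {
    "condition": lambda mem, ctx: ctx.get("user_count", 0) > 0,
    "if_true": "greet_users",
    "if_false": "wait",
}
assert run_conditional(config, None, {"user_count": 3}) == "greet_users"
assert run_conditional(config, None, {}) == "wait"
```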
2833
+
2834
+
2835
+ !!! IMPORTANT !!!
2836
+ The resulting functionality from this class must enable the following pattern:
2837
+ mem = thought(mem) # where mem is a MEMORY object
2838
+ or
2839
+ mem = thought(mem, vars) # where vars (optional) is a dictionary of variables to pass to the thought
2840
+
2841
+ THOUGHT OPERATIONS MUST BE CALLABLE.
2842
+
2843
+ """
2844
+
2845
+ # Valid operation types
2846
+ VALID_OPERATIONS = {'llm_call', 'memory_query', 'variable_set', 'conditional'}
2847
+
2848
+ def __init__(self, name=None, llm=None, prompt=None, operation=None, **kwargs):
2849
+ """
2850
+ Initialize a THOUGHT instance.
2851
+
2852
+ Args:
2853
+ name (str): Name of the thought.
2854
+ llm: LLM interface or callable.
2855
+ prompt: Prompt template (str or dict).
2856
+ operation (str): Operation type (e.g., 'llm_call', 'memory_query', etc).
2857
+ **kwargs: Additional configuration parameters.
2858
+ """
2859
+ self.name = name
2860
+ self.id = event_stamp()
2861
+ self.llm = llm
2862
+ self.prompt = prompt
2863
+ self.operation = operation
2864
+
2865
+ # Store any additional configuration parameters
2866
+ self.config = kwargs.copy()
2867
+
2868
+ # Optionally, store a description or docstring if provided
2869
+ self.description = kwargs.get("description", None)
2870
+
2871
+ # Optionally, store validation rules, parsing functions, etc.
2872
+ self.validation = kwargs.get("validation", None)
2873
+ self.parse_fn = kwargs.get("parse_fn", None)
2874
+ self.max_retries = kwargs.get("max_retries", 1)
2875
+ self.retry_delay = kwargs.get("retry_delay", 0)
2876
+
2877
+ # Optionally, store default context variables or requirements
2878
+ self.required_vars = kwargs.get("required_vars", [])
2879
+ self.optional_vars = kwargs.get("optional_vars", [])
2880
+
2881
+ # Optionally, store output variable name
2882
+ self.output_var = kwargs.get("output_var", "{}_result".format(self.name) if self.name else None)
2883
+
2884
+ # Internal state for tracking last result, errors, etc.
2885
+ self.last_result = None
2886
+ self.last_error = None
2887
+ self.last_prompt = None
2888
+ self.last_msgs = None
2889
+ self.last_response = None
2890
+
2891
+ # Allow for custom hooks (pre/post processing)
2892
+ self.pre_hook = kwargs.get("pre_hook", None)
2893
+ self.post_hook = kwargs.get("post_hook", None)
2894
+
2895
+ # Execution history tracking
2896
+ self.execution_history = []
2897
+
2898
+
2899
+ def __call__(self, memory, vars=None, **kwargs):
2900
+ """
2901
+ Execute the thought on the given MEMORY object.
2902
+
2903
+ Args:
2904
+ memory: MEMORY object.
2905
+ vars: Optional dictionary of variables to pass to the thought.
2906
+ **kwargs: Additional parameters for execution.
2907
+ Returns:
2908
+ Updated MEMORY object with result stored (if applicable).
2909
+ """
2910
+ import time as time_module
2911
+
2912
+ start_time = time_module.time()
2913
+
2914
+ # Allow vars to be None
2915
+ if vars is None:
2916
+ vars = {}
2917
+
2918
+ # Pre-hook
2919
+ if self.pre_hook and callable(self.pre_hook):
2920
+ self.pre_hook(self, memory, vars, **kwargs)
2921
+
2922
+ # Determine operation type
2923
+ operation = self.operation or 'llm_call'
2924
+
2925
+ # Dispatch to appropriate handler based on operation type
2926
+ if operation == 'llm_call':
2927
+ result, last_error, attempts_made = self._execute_llm_call(memory, vars, **kwargs)
2928
+ elif operation == 'memory_query':
2929
+ result, last_error, attempts_made = self._execute_memory_query(memory, vars, **kwargs)
2930
+ elif operation == 'variable_set':
2931
+ result, last_error, attempts_made = self._execute_variable_set(memory, vars, **kwargs)
2932
+ elif operation == 'conditional':
2933
+ result, last_error, attempts_made = self._execute_conditional(memory, vars, **kwargs)
2934
+ else:
2935
+ raise ValueError("Unknown operation: {}. Valid operations: {}".format(operation, self.VALID_OPERATIONS))
2936
+
2937
+ # Calculate execution duration
2938
+ duration_ms = (time_module.time() - start_time) * 1000
2939
+
2940
+ # Build execution event for logging
2941
+ execution_event = {
2942
+ 'thought_name': self.name,
2943
+ 'thought_id': self.id,
2944
+ 'operation': operation,
2945
+ 'attempts': attempts_made,
2946
+ 'success': result is not None,
2947
+ 'duration_ms': round(duration_ms, 2),
2948
+ 'output_var': self.output_var
2949
+ }
2950
+
2951
+ # If failed after all retries
2952
+ if result is None and last_error is not None:
2953
+ execution_event['error'] = last_error
2954
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
2955
+ memory.add_log("Thought execution failed: " + json.dumps(execution_event))
2956
+ # Store None as result
2957
+ self.update_memory(memory, None)
2958
+ else:
2959
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
2960
+ memory.add_log("Thought execution complete: " + json.dumps(execution_event))
2961
+ self.update_memory(memory, result)
2962
+
2963
+ # Track execution history on the THOUGHT instance
2964
+ self.execution_history.append({
2965
+ 'stamp': event_stamp(),
2966
+ 'memory_id': getattr(memory, 'id', None),
2967
+ 'operation': operation,
2968
+ 'duration_ms': duration_ms,
2969
+ 'success': result is not None or last_error is None,
2970
+ 'attempts': attempts_made,
2971
+ 'error': self.last_error
2972
+ })
2973
+
2974
+ # Post-hook
2975
+ if self.post_hook and callable(self.post_hook):
2976
+ self.post_hook(self, memory, self.last_result, self.last_error)
2977
+
2978
+ return memory
2979
+
2980
+ def _execute_llm_call(self, memory, vars, **kwargs):
2981
+ """
2982
+ Execute an LLM call operation with retry logic.
2983
+
2984
+ Returns:
2985
+ tuple: (result, last_error, attempts_made)
2986
+ """
2987
+ import copy as copy_module
2988
+ import time as time_module
2989
+
2990
+ retries_left = self.max_retries
2991
+ last_error = None
2992
+ result = None
2993
+ attempts_made = 0
2994
+
2995
+ # Store original prompt to avoid mutation - work with a copy
2996
+ original_prompt = copy_module.deepcopy(self.prompt)
2997
+ working_prompt = copy_module.deepcopy(self.prompt)
2998
+
2999
+ while retries_left > 0:
3000
+ attempts_made += 1
3001
+ try:
3002
+ # Temporarily set working prompt for this iteration
3003
+ self.prompt = working_prompt
3004
+
3005
+ # Build context and prompt/messages
3006
+ ctx = self.get_context(memory)
3007
+ ctx.update(vars)
3008
+ msgs = self.build_msgs(memory, ctx)
3009
+
3010
+ # Run LLM
3011
+ llm_kwargs = dict(self.config.get("llm_params", {}))  # copy so config's llm_params is not mutated
3012
+ llm_kwargs.update(kwargs)
3013
+ response = self.run_llm(msgs, **llm_kwargs)
3014
+ self.last_response = response
3015
+
3016
+ # Get channel from config for message tracking
3017
+ channel = self.config.get("channel", "system")
3018
+
3019
+ # Add assistant message to memory (if possible)
3020
+ if hasattr(memory, "add_msg") and callable(getattr(memory, "add_msg", None)):
3021
+ memory.add_msg("assistant", response, channel=channel)
3022
+
3023
+ # Parse
3024
+ parsed = self.parse_response(response)
3025
+ self.last_result = parsed
3026
+
3027
+ # Validate
3028
+ valid, why = self.validate(parsed)
3029
+ if valid:
3030
+ result = parsed
3031
+ self.last_error = None
3032
+ # Logging
3033
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
3034
+ memory.add_log("Thought '{}' completed successfully".format(self.name))
3035
+ # Add reflection for reasoning trace (if configured)
3036
+ if self.config.get("add_reflection", True):
3037
+ if hasattr(memory, "add_ref") and callable(getattr(memory, "add_ref", None)):
3038
+ # Truncate response for reflection if too long
3039
+ response_preview = str(response)[:300]
3040
+ if len(str(response)) > 300:
3041
+ response_preview += "..."
3042
+ memory.add_ref("Thought '{}': {}".format(self.name, response_preview))
3043
+ break
3044
+ else:
3045
+ last_error = why
3046
+ self.last_error = why
3047
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
3048
+ memory.add_log("Thought '{}' validation failed: {}".format(self.name, why))
3049
+ # Create repair suffix for next retry (modify working_prompt, not original)
3050
+ repair_suffix = "\n(Please return only the requested format; your last answer failed: {}.)".format(why)
3051
+ if isinstance(original_prompt, str):
3052
+ working_prompt = original_prompt.rstrip() + repair_suffix
3053
+ elif isinstance(original_prompt, dict):
3054
+ working_prompt = copy_module.deepcopy(original_prompt)
3055
+ last_key = list(working_prompt.keys())[-1]
3056
+ working_prompt[last_key] = working_prompt[last_key].rstrip() + repair_suffix
3057
+ except Exception as e:
3058
+ last_error = str(e)
3059
+ self.last_error = last_error
3060
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
3061
+ memory.add_log("Thought '{}' error: {}".format(self.name, last_error))
3062
+ # Create repair suffix for next retry (modify working_prompt, not original)
3063
+ repair_suffix = "\n(Please return only the requested format; your last answer failed: {}.)".format(last_error)
3064
+ if isinstance(original_prompt, str):
3065
+ working_prompt = original_prompt.rstrip() + repair_suffix
3066
+ elif isinstance(original_prompt, dict):
3067
+ working_prompt = copy_module.deepcopy(original_prompt)
3068
+ last_key = list(working_prompt.keys())[-1]
3069
+ working_prompt[last_key] = working_prompt[last_key].rstrip() + repair_suffix
3070
+ retries_left -= 1
3071
+ if self.retry_delay:
3072
+ time_module.sleep(self.retry_delay)
3073
+
3074
+ # Restore original prompt after execution (prevents permanent mutation)
3075
+ self.prompt = original_prompt
3076
+
3077
+ return result, last_error, attempts_made
3078
+
3079
+ def _execute_memory_query(self, memory, vars, **kwargs):
3080
+ """
3081
+ Execute a memory query operation (no LLM involved).
3082
+ Retrieves specified variables from memory and returns them as a dict.
3083
+
3084
+ Returns:
3085
+ tuple: (result, last_error, attempts_made)
3086
+ """
3087
+ try:
3088
+ result = {}
3089
+
3090
+ # Get required variables
3091
+ for var in self.required_vars:
3092
+ if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
3093
+ val = memory.get_var(var)
3094
+ else:
3095
+ val = getattr(memory, var, None)
3096
+
3097
+ if val is None:
3098
+ return None, "Required variable '{}' not found in memory".format(var), 1
3099
+ result[var] = val
3100
+
3101
+ # Get optional variables
3102
+ for var in self.optional_vars:
3103
+ if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
3104
+ val = memory.get_var(var)
3105
+ else:
3106
+ val = getattr(memory, var, None)
3107
+
3108
+ if val is not None:
3109
+ result[var] = val
3110
+
3111
+ # Include any vars passed directly
3112
+ result.update(vars)
3113
+
3114
+ self.last_result = result
3115
+ self.last_error = None
3116
+
3117
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
3118
+ memory.add_log("Thought '{}' memory query completed".format(self.name))
3119
+
3120
+ return result, None, 1
3121
+
3122
+ except Exception as e:
3123
+ self.last_error = str(e)
3124
+ return None, str(e), 1
3125
+
3126
+ def _execute_variable_set(self, memory, vars, **kwargs):
3127
+ """
3128
+ Execute a variable set operation.
3129
+ Sets variables in memory from the prompt (as dict) or vars parameter.
3130
+
3131
+ Returns:
3132
+ tuple: (result, last_error, attempts_made)
3133
+ """
3134
+ try:
3135
+ values_to_set = {}
3136
+
3137
+ # If prompt is a dict, use it as the values to set
3138
+ if isinstance(self.prompt, dict):
3139
+ values_to_set.update(self.prompt)
3140
+
3141
+ # Override/add with vars parameter
3142
+ values_to_set.update(vars)
3143
+
3144
+ # Set each variable in memory
3145
+ for key, value in values_to_set.items():
3146
+ if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
3147
+ desc = self.config.get("var_descriptions", {}).get(key, "Set by thought: {}".format(self.name))
3148
+ memory.set_var(key, value, desc=desc)
3149
+ elif hasattr(memory, "vars"):
3150
+ if key not in memory.vars:
3151
+ memory.vars[key] = []
3152
+ stamp = event_stamp(value)
3153
+ memory.vars[key].append([stamp, value])
3154
+
3155
+ self.last_result = values_to_set
3156
+ self.last_error = None
3157
+
3158
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
3159
+ memory.add_log("Thought '{}' set {} variables".format(self.name, len(values_to_set)))
3160
+
3161
+ return values_to_set, None, 1
3162
+
3163
+ except Exception as e:
3164
+ self.last_error = str(e)
3165
+ return None, str(e), 1
3166
+
3167
+ def _execute_conditional(self, memory, vars, **kwargs):
3168
+ """
3169
+ Execute a conditional operation.
3170
+ Evaluates a condition from config and returns the appropriate result.
3171
+
3172
+ Config options:
3173
+ condition (callable): Function that takes (memory, vars) and returns bool
3174
+ if_true: Value/action if condition is true
3175
+ if_false: Value/action if condition is false
3176
+
3177
+ Returns:
3178
+ tuple: (result, last_error, attempts_made)
3179
+ """
3180
+ try:
3181
+ condition_fn = self.config.get("condition")
3182
+ if_true = self.config.get("if_true")
3183
+ if_false = self.config.get("if_false")
3184
+
3185
+ if condition_fn is None:
3186
+ return None, "No condition function provided for conditional operation", 1
3187
+
3188
+ if not callable(condition_fn):
3189
+ return None, "Condition must be callable", 1
3190
+
3191
+ # Evaluate condition
3192
+ ctx = self.get_context(memory)
3193
+ ctx.update(vars)
3194
+ condition_result = condition_fn(memory, ctx)
3195
+
3196
+ # Return appropriate value
3197
+ if condition_result:
3198
+ result = if_true
3199
+ if callable(if_true):
3200
+ result = if_true(memory, ctx)
3201
+ else:
3202
+ result = if_false
3203
+ if callable(if_false):
3204
+ result = if_false(memory, ctx)
3205
+
3206
+ self.last_result = result
3207
+ self.last_error = None
3208
+
3209
+ if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
3210
+ memory.add_log("Thought '{}' conditional evaluated to {}".format(self.name, bool(condition_result)))
3211
+
3212
+ return result, None, 1
3213
+
3214
+ except Exception as e:
3215
+ self.last_error = str(e)
3216
+ return None, str(e), 1
3217
+
3218
+ def build_prompt(self, memory, context_vars=None):
3219
+ """
3220
+ Build the prompt for the LLM using construct_prompt.
3221
+
3222
+ Args:
3223
+ memory: MEMORY object providing context.
3224
+ context_vars (dict): Optional context variables to fill the prompt.
3225
+
3226
+ Returns:
3227
+ str: The constructed prompt string.
3228
+ """
3229
+ # Get context variables (merge get_context and context_vars)
3230
+ ctx = self.get_context(memory)
3231
+ if context_vars:
3232
+ ctx.update(context_vars)
3233
+ prompt_template = self.prompt
3234
+ # If prompt is a dict, use construct_prompt, else format as string
3235
+ if isinstance(prompt_template, dict):
3236
+ prompt = construct_prompt(prompt_template)
3237
+ elif isinstance(prompt_template, str):
3238
+ try:
3239
+ prompt = prompt_template.format(**ctx)
3240
+ except Exception:
3241
+ # fallback: just return as is
3242
+ prompt = prompt_template
3243
+ else:
3244
+ prompt = str(prompt_template)
3245
+ self.last_prompt = prompt
3246
+ return prompt
3247
+
3248
+ def build_msgs(self, memory, context_vars=None):
3249
+ """
3250
+ Build the messages list for the LLM using construct_msgs.
3251
+
3252
+ Args:
3253
+ memory: MEMORY object providing context.
3254
+ context_vars (dict): Optional context variables to fill the prompt.
3255
+
3256
+ Returns:
3257
+ list: List of message dicts for LLM input.
3258
+ """
3259
+ ctx = self.get_context(memory)
3260
+ if context_vars:
3261
+ ctx.update(context_vars)
3262
+ # Compose system and user prompts
3263
+ sys_prompt = self.config.get("system_prompt", "")
3264
+ usr_prompt = self.build_prompt(memory, ctx)
3265
+ # Optionally, allow for prior messages from memory
3266
+ msgs = []
3267
+ if hasattr(memory, "get_msgs"):
3268
+ # Optionally, get recent messages for context
3269
+ msgs = memory.get_msgs(repr="list") if callable(getattr(memory, "get_msgs", None)) else []
3270
+ # Build messages using construct_msgs
3271
+ msgs_out = construct_msgs(
3272
+ usr_prompt=usr_prompt,
3273
+ vars=ctx,
3274
+ sys_prompt=sys_prompt,
3275
+ msgs=msgs
3276
+ )
3277
+ self.last_msgs = msgs_out
3278
+ return msgs_out
3279
+
3280
+ def get_context(self, memory):
3281
+ """
3282
+ Extract relevant context from the MEMORY object for this thought.
3283
+
3284
+ Args:
3285
+ memory: MEMORY object.
3286
+
3287
+ Returns:
3288
+ dict: Context variables for prompt filling.
3289
+ """
3290
+ ctx = {}
3291
+ # If required_vars is specified, try to get those from memory
3292
+ if hasattr(self, "required_vars") and self.required_vars:
3293
+ for var in self.required_vars:
3294
+ # Try to get from memory.get_var if available
3295
+ if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
3296
+ val = memory.get_var(var)
3297
+ else:
3298
+ val = getattr(memory, var, None)
3299
+ if val is not None:
3300
+ ctx[var] = val
3301
+ # Optionally, add optional_vars if present in memory
3302
+ if hasattr(self, "optional_vars") and self.optional_vars:
3303
+ for var in self.optional_vars:
3304
+ if hasattr(memory, "get_var") and callable(getattr(memory, "get_var", None)):
3305
+ val = memory.get_var(var)
3306
+ else:
3307
+ val = getattr(memory, var, None)
3308
+ if val is not None:
3309
+ ctx[var] = val
3310
+ # Add some common context keys if available
3311
+ if hasattr(memory, "last_user_msg") and callable(getattr(memory, "last_user_msg", None)):
3312
+ ctx["last_user_msg"] = memory.last_user_msg()
3313
+ if hasattr(memory, "last_asst_msg") and callable(getattr(memory, "last_asst_msg", None)):
3314
+ ctx["last_asst_msg"] = memory.last_asst_msg()
3315
+ if hasattr(memory, "get_msgs") and callable(getattr(memory, "get_msgs", None)):
3316
+ ctx["messages"] = memory.get_msgs(repr="list")
3317
+ # Add all memory.vars if present
3318
+ if hasattr(memory, "vars"):
3319
+ ctx.update(getattr(memory, "vars", {}))
3320
+ return ctx
3321
+
3322
+ def run_llm(self, msgs, **llm_kwargs):
3323
+ """
3324
+ Execute the LLM call with the given messages.
3325
+ !!! USE THE EXISTING LLM CLASS !!!
3326
+
3327
+ Args:
3328
+ msgs (list): List of message dicts.
3329
+ **llm_kwargs: Additional LLM parameters.
3330
+
3331
+ Returns:
3332
+ str: Raw LLM response.
3333
+ """
3334
+ if self.llm is None:
3335
+ raise ValueError("No LLM instance provided to this THOUGHT.")
3336
+ # The LLM class is expected to be callable: llm(msgs, **kwargs)
3337
+ # If LLM is a class with .call, use that (standard interface)
3338
+ if hasattr(self.llm, "call") and callable(getattr(self.llm, "call", None)):
3339
+ response = self.llm.call(msgs, llm_kwargs)
3340
+ elif hasattr(self.llm, "chat") and callable(getattr(self.llm, "chat", None)):
3341
+ response = self.llm.chat(msgs, **llm_kwargs)
3342
+ else:
3343
+ response = self.llm(msgs, **llm_kwargs)
3344
+
3345
+ # Handle list response from LLM.call() - it returns a list of choices
3346
+ if isinstance(response, list):
3347
+ response = response[0] if response else ""
3348
+
3349
+ # If response is a dict with 'content', extract it
3350
+ if isinstance(response, dict) and "content" in response:
3351
+ return response["content"]
3352
+
3353
+ return response
3354
+
+    def parse_response(self, response):
+        """
+        Parse the LLM response to extract the desired content.
+
+        Args:
+            response (str): Raw LLM response.
+
+        Returns:
+            object: Parsed result (e.g., string, list, dict).
+
+        Supports:
+            - Custom parse_fn callable
+            - Schema-based parsing via parsing_rules (uses valid_extract)
+            - Built-in parsers: 'text', 'json', 'list'
+        """
+        # Use custom parse_fn if provided
+        if self.parse_fn and callable(self.parse_fn):
+            return self.parse_fn(response)
+
+        # Check for schema-based parsing rules (using valid_extract)
+        parsing_rules = self.config.get("parsing_rules")
+        if parsing_rules:
+            try:
+                return valid_extract(response, parsing_rules)
+            except ValidExtractError as e:
+                raise ValueError("Schema-based parsing failed: {}".format(e))
+
+        # Use built-in parser based on config
+        parser = self.config.get("parser", None)
+        if parser is None:
+            # Default: return as string
+            return response
+        if parser == "text":
+            return response
+        elif parser == "json":
+            import re
+            # Remove code fences if present
+            text = response.strip()
+            text = re.sub(r"^```(?:json)?|```$", "", text, flags=re.MULTILINE).strip()
+            # Find first JSON object or array
+            match = re.search(r"(\{.*\}|\[.*\])", text, re.DOTALL)
+            if match:
+                json_str = match.group(1)
+                return json.loads(json_str)
+            else:
+                raise ValueError("No JSON object or array found in response.")
+        elif parser == "list":
+            import ast, re
+            # Find first list literal
+            match = re.search(r"(\[.*\])", response, re.DOTALL)
+            if match:
+                list_str = match.group(1)
+                return ast.literal_eval(list_str)
+            else:
+                raise ValueError("No list found in response.")
+        elif callable(parser):
+            return parser(response)
+        else:
+            # Unknown parser, return as is
+            return response
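The `json` branch above can be exercised on its own. Below is a minimal standalone sketch of the same fence-stripping and first-object extraction; the `extract_json` helper name is illustrative, not part of the library:

```python
import json
import re

FENCE = chr(96) * 3  # a literal ``` built indirectly so the example stays fence-safe

def extract_json(response):
    # Mirror of the built-in "json" parser: strip code fences, then parse the
    # first JSON object or array found in the text.
    text = response.strip()
    text = re.sub("^" + FENCE + "(?:json)?|" + FENCE + "$", "", text, flags=re.MULTILINE).strip()
    match = re.search(r"(\{.*\}|\[.*\])", text, re.DOTALL)
    if not match:
        raise ValueError("No JSON object or array found in response.")
    return json.loads(match.group(1))

print(extract_json('Here is JSON: {"key": "value"}'))  # {'key': 'value'}
```

Note the greedy `.*` with `re.DOTALL`: the regex grabs from the first `{` to the last `}`, which is fine for a single embedded object but will over-match if the response contains several.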
3415
+
+    def validate(self, parsed_result):
+        """
+        Validate the parsed result according to the thought's rules.
+
+        Args:
+            parsed_result: The parsed output from the LLM.
+
+        Returns:
+            (bool, why): True if valid, False otherwise, and reason string.
+        """
+        # Use custom validation if provided
+        if self.validation and callable(self.validation):
+            try:
+                valid, why = self.validation(parsed_result)
+                return bool(valid), why
+            except Exception as e:
+                return False, "Validation exception: {}".format(e)
+        # Use built-in validator based on config
+        validator = self.config.get("validator", None)
+        if validator is None or validator == "any":
+            return True, ""
+        elif isinstance(validator, str):
+            if validator.startswith("has_keys:"):
+                keys = [k.strip() for k in validator.split(":", 1)[1].split(",")]
+                if isinstance(parsed_result, dict):
+                    missing = [k for k in keys if k not in parsed_result]
+                    if not missing:
+                        return True, ""
+                    else:
+                        return False, "Missing keys: {}".format(missing)
+                else:
+                    return False, "Result is not a dict"
+            elif validator.startswith("list_min_len:"):
+                try:
+                    min_len = int(validator.split(":", 1)[1])
+                except Exception:
+                    min_len = 1
+                if isinstance(parsed_result, list) and len(parsed_result) >= min_len:
+                    return True, ""
+                else:
+                    return False, "List too short (min {})".format(min_len)
+            elif validator == "summary_v1":
+                # Example: summary must be a string of at least 10 chars
+                if isinstance(parsed_result, str) and len(parsed_result.strip()) >= 10:
+                    return True, ""
+                else:
+                    return False, "Summary too short"
+            else:
+                return True, ""
+        elif callable(validator):
+            try:
+                valid, why = validator(parsed_result)
+                return bool(valid), why
+            except Exception as e:
+                return False, "Validation exception: {}".format(e)
+        else:
+            return True, ""
3473
+
+    def update_memory(self, memory, result):
+        """
+        Update the MEMORY object with the result of this thought.
+
+        Args:
+            memory: MEMORY object.
+            result: The result to store.
+
+        Returns:
+            MEMORY: Updated memory object.
+        """
+        # Store result in vars or via set_var if available
+        varname = self.output_var or ("{}_result".format(self.name) if self.name else "thought_result")
+        if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
+            memory.set_var(varname, result, desc="Result of thought: {}".format(self.name))
+        elif hasattr(memory, "vars"):
+            # Fallback: directly access vars dict if set_var not available
+            if varname not in memory.vars:
+                memory.vars[varname] = []
+            stamp = event_stamp(result) if 'event_stamp' in globals() else 'no_stamp'
+            memory.vars[varname].append({'object': result, 'stamp': stamp})
+        else:
+            setattr(memory, varname, result)
+        return memory
3498
+
+    def to_dict(self):
+        """
+        Return a serializable dictionary representation of this THOUGHT.
+
+        Note: The LLM instance, parse_fn, validation, and hooks cannot be serialized,
+        so they are represented by type/name only. When deserializing, these must be
+        provided separately.
+
+        Returns:
+            dict: Serializable representation of this thought.
+        """
+        return {
+            "name": self.name,
+            "id": self.id,
+            "prompt": self.prompt,
+            "operation": self.operation,
+            "config": self.config,
+            "description": self.description,
+            "max_retries": self.max_retries,
+            "retry_delay": self.retry_delay,
+            "output_var": self.output_var,
+            "required_vars": self.required_vars,
+            "optional_vars": self.optional_vars,
+            "execution_history": self.execution_history,
+            # Store metadata about non-serializable items
+            "llm_type": type(self.llm).__name__ if self.llm else None,
+            "has_parse_fn": self.parse_fn is not None,
+            "has_validation": self.validation is not None,
+            "has_pre_hook": self.pre_hook is not None,
+            "has_post_hook": self.post_hook is not None,
+        }
3530
+
+    @classmethod
+    def from_dict(cls, data, llm=None, parse_fn=None, validation=None, pre_hook=None, post_hook=None):
+        """
+        Reconstruct a THOUGHT from a dictionary representation.
+
+        Args:
+            data (dict): Dictionary representation of a THOUGHT.
+            llm: LLM instance to use (required for execution).
+            parse_fn: Optional custom parse function.
+            validation: Optional custom validation function.
+            pre_hook: Optional pre-execution hook.
+            post_hook: Optional post-execution hook.
+
+        Returns:
+            THOUGHT: Reconstructed THOUGHT object.
+        """
+        # Extract config and merge with explicit kwargs
+        config = data.get("config", {}).copy()
+
+        thought = cls(
+            name=data.get("name"),
+            llm=llm,
+            prompt=data.get("prompt"),
+            operation=data.get("operation"),
+            description=data.get("description"),
+            max_retries=data.get("max_retries", 1),
+            retry_delay=data.get("retry_delay", 0),
+            output_var=data.get("output_var"),
+            required_vars=data.get("required_vars", []),
+            optional_vars=data.get("optional_vars", []),
+            parse_fn=parse_fn,
+            validation=validation,
+            pre_hook=pre_hook,
+            post_hook=post_hook,
+            **config
+        )
+
+        # Restore ID if provided
+        if data.get("id"):
+            thought.id = data["id"]
+
+        # Restore execution history
+        thought.execution_history = data.get("execution_history", [])
+
+        return thought
3576
+
+    def copy(self):
+        """
+        Return a deep copy of this THOUGHT.
+
+        Note: The LLM instance is shallow-copied (same reference), as LLM
+        instances typically should be shared. All other attributes are deep-copied.
+
+        Returns:
+            THOUGHT: A new THOUGHT instance with copied attributes.
+        """
+        import copy as copy_module
+
+        new_thought = THOUGHT(
+            name=self.name,
+            llm=self.llm,  # Shallow copy - same LLM instance
+            prompt=copy_module.deepcopy(self.prompt),
+            operation=self.operation,
+            description=self.description,
+            max_retries=self.max_retries,
+            retry_delay=self.retry_delay,
+            output_var=self.output_var,
+            required_vars=copy_module.deepcopy(self.required_vars),
+            optional_vars=copy_module.deepcopy(self.optional_vars),
+            parse_fn=self.parse_fn,
+            validation=self.validation,
+            pre_hook=self.pre_hook,
+            post_hook=self.post_hook,
+            **copy_module.deepcopy(self.config)
+        )
+
+        # Copy internal state
+        new_thought.id = event_stamp()  # Generate new ID for the copy
+        new_thought.execution_history = copy_module.deepcopy(self.execution_history)
+        new_thought.last_result = copy_module.deepcopy(self.last_result)
+        new_thought.last_error = self.last_error
+        new_thought.last_prompt = self.last_prompt
+        new_thought.last_msgs = copy_module.deepcopy(self.last_msgs)
+        new_thought.last_response = self.last_response
+
+        return new_thought
3617
+
+    def __repr__(self):
+        """
+        Return a detailed string representation of this THOUGHT.
+
+        Returns:
+            str: Detailed representation including key attributes.
+        """
+        return ("THOUGHT(name='{}', operation='{}', "
+                "max_retries={}, output_var='{}')".format(
+                    self.name, self.operation, self.max_retries, self.output_var))
+
+    def __str__(self):
+        """
+        Return a human-readable string representation of this THOUGHT.
+
+        Returns:
+            str: Simple description of the thought.
+        """
+        return "Thought: {}".format(self.name or 'unnamed')
+
+
3643
+ ThoughtClassTests = """
3644
+ # --- THOUGHT Class Tests ---
3645
+
3646
+ # Test 1: Basic THOUGHT instantiation and attributes
3647
+ >>> from thoughtflow6 import THOUGHT, MEMORY, event_stamp
3648
+ >>> t = THOUGHT(name="test_thought", prompt="Hello {name}", max_retries=3)
3649
+ >>> t.name
3650
+ 'test_thought'
3651
+ >>> t.max_retries
3652
+ 3
3653
+ >>> t.output_var
3654
+ 'test_thought_result'
3655
+ >>> t.operation is None # Defaults to None, which means 'llm_call'
3656
+ True
3657
+ >>> len(t.execution_history)
3658
+ 0
3659
+
3660
+ # Test 2: Serialization round-trip with to_dict/from_dict
3661
+ >>> t1 = THOUGHT(name="serialize_test", prompt="test prompt", max_retries=3, output_var="my_output")
3662
+ >>> data = t1.to_dict()
3663
+ >>> data['name']
3664
+ 'serialize_test'
3665
+ >>> data['max_retries']
3666
+ 3
3667
+ >>> data['output_var']
3668
+ 'my_output'
3669
+ >>> t2 = THOUGHT.from_dict(data)
3670
+ >>> t2.name == t1.name
3671
+ True
3672
+ >>> t2.max_retries == t1.max_retries
3673
+ True
3674
+ >>> t2.output_var == t1.output_var
3675
+ True
3676
+
3677
+ # Test 3: Copy creates independent instance
3678
+ >>> t1 = THOUGHT(name="copy_test", prompt="original prompt")
3679
+ >>> t2 = t1.copy()
3680
+ >>> t2.name = "modified"
3681
+ >>> t1.name
3682
+ 'copy_test'
3683
+ >>> t2.name
3684
+ 'modified'
3685
+ >>> t1.id != t2.id # Copy gets new ID
3686
+ True
3687
+
3688
+ # Test 4: __repr__ and __str__
3689
+ >>> t = THOUGHT(name="repr_test", operation="llm_call", max_retries=2, output_var="result")
3690
+ >>> "repr_test" in repr(t)
3691
+ True
3692
+ >>> "llm_call" in repr(t)
3693
+ True
3694
+ >>> str(t)
3695
+ 'Thought: repr_test'
3696
+ >>> t2 = THOUGHT() # unnamed
3697
+ >>> str(t2)
3698
+ 'Thought: unnamed'
3699
+
3700
+ # Test 5: Memory query operation (no LLM)
3701
+ >>> mem = MEMORY()
3702
+ >>> mem.set_var("user_name", "Alice", desc="Test user")
3703
+ >>> mem.set_var("session_id", "sess123", desc="Test session")
3704
+ >>> t = THOUGHT(
3705
+ ... name="query_test",
3706
+ ... operation="memory_query",
3707
+ ... required_vars=["user_name", "session_id"]
3708
+ ... )
3709
+ >>> mem2 = t(mem)
3710
+ >>> result = mem2.get_var("query_test_result")
3711
+ >>> result['user_name']
3712
+ 'Alice'
3713
+ >>> result['session_id']
3714
+ 'sess123'
3715
+
3716
+ # Test 6: Variable set operation
3717
+ >>> mem = MEMORY()
3718
+ >>> t = THOUGHT(
3719
+ ... name="setvar_test",
3720
+ ... operation="variable_set",
3721
+ ... prompt={"status": "active", "count": 42}
3722
+ ... )
3723
+ >>> mem2 = t(mem)
3724
+ >>> mem2.get_var("status")
3725
+ 'active'
3726
+ >>> mem2.get_var("count")
3727
+ 42
3728
+
3729
+ # Test 7: Execution history tracking
3730
+ >>> mem = MEMORY()
3731
+ >>> t = THOUGHT(name="history_test", operation="memory_query", required_vars=[])
3732
+ >>> len(t.execution_history)
3733
+ 0
3734
+ >>> mem = t(mem)
3735
+ >>> len(t.execution_history)
3736
+ 1
3737
+ >>> t.execution_history[0]['success']
3738
+ True
3739
+ >>> 'duration_ms' in t.execution_history[0]
3740
+ True
3741
+ >>> 'stamp' in t.execution_history[0]
3742
+ True
3743
+
3744
+ # Test 8: Conditional operation
3745
+ >>> mem = MEMORY()
3746
+ >>> mem.set_var("threshold", 50)
3747
+ >>> t = THOUGHT(
3748
+ ... name="cond_test",
3749
+ ... operation="conditional",
3750
+ ... condition=lambda m, ctx: ctx.get('value', 0) > ctx.get('threshold', 0),
3751
+ ... if_true="above",
3752
+ ... if_false="below"
3753
+ ... )
3754
+ >>> mem2 = t(mem, vars={'value': 75})
3755
+ >>> mem2.get_var("cond_test_result")
3756
+ 'above'
3757
+ >>> mem3 = t(mem, vars={'value': 25})
3758
+ >>> mem3.get_var("cond_test_result")
3759
+ 'below'
3760
+
3761
+ # Test 9: VALID_OPERATIONS class attribute
3762
+ >>> 'llm_call' in THOUGHT.VALID_OPERATIONS
3763
+ True
3764
+ >>> 'memory_query' in THOUGHT.VALID_OPERATIONS
3765
+ True
3766
+ >>> 'variable_set' in THOUGHT.VALID_OPERATIONS
3767
+ True
3768
+ >>> 'conditional' in THOUGHT.VALID_OPERATIONS
3769
+ True
3770
+
3771
+ # Test 10: Parse response with parsing_rules (valid_extract integration)
3772
+ >>> t = THOUGHT(name="parse_test", parsing_rules={"kind": "python", "format": []})
3773
+ >>> t.parse_response("Here is the list: [1, 2, 3]")
3774
+ [1, 2, 3]
3775
+ >>> t2 = THOUGHT(name="parse_dict", parsing_rules={"kind": "python", "format": {"name": "", "count": 0}})
3776
+ >>> t2.parse_response("Result: {'name': 'test', 'count': 5}")
3777
+ {'name': 'test', 'count': 5}
3778
+
3779
+ # Test 11: Built-in parsers
3780
+ >>> t = THOUGHT(name="json_test", parser="json")
3781
+ >>> t.parse_response('Here is JSON: {"key": "value"}')
3782
+ {'key': 'value'}
3783
+ >>> t2 = THOUGHT(name="list_test", parser="list")
3784
+ >>> t2.parse_response("Numbers: [1, 2, 3, 4, 5]")
3785
+ [1, 2, 3, 4, 5]
3786
+ >>> t3 = THOUGHT(name="text_test", parser="text")
3787
+ >>> t3.parse_response("plain text")
3788
+ 'plain text'
3789
+
3790
+ # Test 12: Built-in validators
3791
+ >>> t = THOUGHT(name="val_test", validator="has_keys:name,age")
3792
+ >>> t.validate({"name": "Alice", "age": 30})
3793
+ (True, '')
3794
+ >>> t.validate({"name": "Bob"})
3795
+ (False, 'Missing keys: [\\'age\\']')
3796
+ >>> t2 = THOUGHT(name="list_val", validator="list_min_len:3")
3797
+ >>> t2.validate([1, 2, 3])
3798
+ (True, '')
3799
+ >>> t2.validate([1, 2])
3800
+ (False, 'List too short (min 3)')
3801
+
3802
+ """
3803
+
+
+#############################################################################
+#############################################################################
+
+### ACTION CLASS
+
+
+class ACTION:
3812
+ """
3813
+ The ACTION class encapsulates an external or internal operation that can be invoked within a Thoughtflow agent.
3814
+ It is designed to represent a single, named action (such as a tool call, API request, or function) whose result
3815
+ is stored in the agent's state for later inspection, branching, or retry.
3816
+
3817
+ An ACTION represents a discrete, named operation (function, API call, tool invocation) that can be defined once
3818
+ and executed multiple times with different parameters. When executed, the ACTION handles logging, error management,
3819
+ and result storage in a consistent way.
3820
+
3821
+ Attributes:
3822
+ name (str): Identifier for this action, used for logging and storing results.
3823
+ id (str): Unique identifier for this action instance (event_stamp).
3824
+ fn (callable): The function to execute when this action is called.
3825
+ config (dict): Default configuration parameters that will be passed to the function.
3826
+ result_key (str): Key where results are stored in memory (defaults to "{name}_result").
3827
+ description (str): Human-readable description of what this action does.
3828
+ last_result (Any): The most recent result from executing this action.
3829
+ last_error (Exception): The most recent error from executing this action, if any.
3830
+ execution_count (int): Number of times this action has been executed.
3831
+ execution_history (list): Full execution history with timing and success/error tracking.
3832
+
3833
+ Methods:
3834
+ __init__(name, fn, config=None, result_key=None, description=None):
3835
+ Initializes an ACTION with a name, function, and optional configuration.
3836
+
3837
+ __call__(memory, **kwargs):
3838
+ Executes the action function with the memory object and any override parameters.
3839
+ The function receives (memory, **merged_kwargs) where merged_kwargs combines
3840
+ self.config with any call-specific kwargs.
3841
+
3842
+ Returns the memory object with results stored via set_var.
3843
+ Logs execution details with JSON-formatted event data.
3844
+ Tracks execution timing and history.
3845
+
3846
+ Handles exceptions during execution by logging them rather than raising them,
3847
+ allowing the workflow to continue and decide how to handle failures.
3848
+
3849
+ get_last_result():
3850
+ Returns the most recent result from executing this action.
3851
+
3852
+ was_successful():
3853
+ Returns True if the last execution was successful, False otherwise.
3854
+
3855
+ reset_stats():
3856
+ Resets execution statistics (count, last_result, last_error, execution_history).
3857
+
3858
+ copy():
3859
+ Returns a copy of this ACTION with a new ID and reset statistics.
3860
+
3861
+ to_dict():
3862
+ Returns a serializable dictionary representation of this action.
3863
+
3864
+ from_dict(cls, data, fn_registry):
3865
+ Class method to reconstruct an ACTION from a dictionary representation.
3866
+
3867
+ Example Usage:
3868
+ # Define a web search action
3869
+ def search_web(memory, query, max_results=3):
3870
+ # Implementation of web search
3871
+ results = web_api.search(query, limit=max_results)
3872
+ return {"status": "success", "hits": results}
3873
+
3874
+ search_action = ACTION(
3875
+ name="web_search",
3876
+ fn=search_web,
3877
+ config={"max_results": 5},
3878
+ description="Searches the web for information"
3879
+ )
3880
+
3881
+ # Execute the action
3882
+ memory = MEMORY()
3883
+ memory = search_action(memory, query="thoughtflow framework")
3884
+
3885
+ # Access results
3886
+ result = memory.get_var("web_search_result")
3887
+
3888
+ # Check execution history
3889
+ print(search_action.execution_history[-1]['duration_ms']) # Execution time
3890
+ print(search_action.execution_history[-1]['success']) # True/False
3891
+
3892
+ Design Principles:
3893
+ 1. Explicit and inspectable operations with consistent logging
3894
+ 2. Predictable result storage via memory.set_var
3895
+ 3. Error handling that doesn't interrupt workflow execution
3896
+ 4. Composability with other Thoughtflow components (MEMORY, THOUGHT)
3897
+ 5. Serialization support for reproducibility
3898
+ 6. Full execution history with timing for debugging and optimization
3899
+ """
3900
+
+    def __init__(self, name, fn, config=None, result_key=None, description=None):
+        """
+        Initialize an ACTION with a name, function, and optional configuration.
+
+        Args:
+            name (str): Identifier for this action, used for logging and result storage.
+            fn (callable): The function to execute when this action is called.
+            config (dict, optional): Default configuration parameters passed to the function.
+            result_key (str, optional): Key where results are stored in memory (defaults to "{name}_result").
+            description (str, optional): Human-readable description of what this action does.
+        """
+        self.name = name
+        self.id = event_stamp()  # Unique identifier for this action instance
+        self.fn = fn
+        self.config = config or {}
+        self.result_key = result_key or "{}_result".format(name)
+        self.description = description or "Action: {}".format(name)
+        self.last_result = None
+        self.last_error = None
+        self.execution_count = 0
+        self.execution_history = []  # Full execution tracking with timing
3922
+
+    def __call__(self, memory, **kwargs):
+        """
+        Execute the action function with the memory object and any override parameters.
+
+        Args:
+            memory (MEMORY): The memory object to update with results.
+            **kwargs: Parameters that override the default config for this execution.
+
+        Returns:
+            MEMORY: The updated memory object with results stored in memory.vars[result_key].
+
+        Note:
+            The function receives (memory, **merged_kwargs) where merged_kwargs combines
+            self.config with any call-specific kwargs.
+
+            Exceptions during execution are logged rather than raised, allowing the
+            workflow to continue and decide how to handle failures.
+        """
+        import time as time_module
+
+        start_time = time_module.time()
+
+        # Merge default config with call-specific kwargs
+        merged_kwargs = {**self.config, **kwargs}
+        self.execution_count += 1
+
+        try:
+            # Execute the function
+            result = self.fn(memory, **merged_kwargs)
+            self.last_result = result
+            self.last_error = None
+
+            # Calculate execution duration
+            duration_ms = (time_module.time() - start_time) * 1000
+
+            # Store result in memory using set_var (correct API)
+            if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
+                memory.set_var(self.result_key, result, desc="Result of action: {}".format(self.name))
+
+            # Build execution event for logging (JSON format like THOUGHT)
+            execution_event = {
+                'action_name': self.name,
+                'action_id': self.id,
+                'status': 'success',
+                'duration_ms': round(duration_ms, 2),
+                'result_key': self.result_key
+            }
+
+            # Log successful execution (single message with JSON, no invalid details param)
+            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
+                memory.add_log("Action execution complete: " + json.dumps(execution_event))
+
+            # Track execution history
+            self.execution_history.append({
+                'stamp': event_stamp(),
+                'memory_id': getattr(memory, 'id', None),
+                'duration_ms': duration_ms,
+                'success': True,
+                'error': None
+            })
+
+        except Exception as e:
+            # Handle and log exceptions
+            self.last_error = e
+
+            # Calculate execution duration
+            duration_ms = (time_module.time() - start_time) * 1000
+
+            # Build error event for logging
+            error_event = {
+                'action_name': self.name,
+                'action_id': self.id,
+                'status': 'error',
+                'error': str(e),
+                'duration_ms': round(duration_ms, 2),
+                'result_key': self.result_key
+            }
+
+            # Log failed execution (single message with JSON)
+            if hasattr(memory, "add_log") and callable(getattr(memory, "add_log", None)):
+                memory.add_log("Action execution failed: " + json.dumps(error_event))
+
+            # Store error info in memory using set_var
+            if hasattr(memory, "set_var") and callable(getattr(memory, "set_var", None)):
+                memory.set_var(self.result_key, error_event, desc="Error in action: {}".format(self.name))
+
+            # Track execution history
+            self.execution_history.append({
+                'stamp': event_stamp(),
+                'memory_id': getattr(memory, 'id', None),
+                'duration_ms': duration_ms,
+                'success': False,
+                'error': str(e)
+            })
+
+        return memory
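The override semantics in `__call__` come directly from dict-unpacking order: call-time kwargs are merged after the stored config, so they win on key collisions. A minimal sketch with hypothetical keys:

```python
# Defaults stored on the action vs. overrides supplied at call time,
# merged the same way as merged_kwargs = {**self.config, **kwargs}.
default_config = {"max_results": 5, "lang": "en"}
call_kwargs = {"max_results": 2}

merged = {**default_config, **call_kwargs}
print(merged)  # {'max_results': 2, 'lang': 'en'}
```

Keys absent from the call kwargs keep their configured defaults, which is what lets an ACTION be defined once and invoked with per-call tweaks.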
4019
+
+    def get_last_result(self):
+        """
+        Returns the most recent result from executing this action.
+
+        Returns:
+            Any: The last result or None if the action hasn't been executed.
+        """
+        return self.last_result
+
+    def was_successful(self):
+        """
+        Returns True if the last execution was successful, False otherwise.
+
+        Returns:
+            bool: True if the last execution completed without errors, False otherwise.
+        """
+        return self.last_error is None and self.execution_count > 0
+
+    def reset_stats(self):
+        """
+        Resets execution statistics (count, last_result, last_error, execution_history).
+
+        Returns:
+            ACTION: Self for method chaining.
+        """
+        self.execution_count = 0
+        self.last_result = None
+        self.last_error = None
+        self.execution_history = []
+        return self
4050
+
+    def copy(self):
+        """
+        Return a copy of this ACTION with a new ID.
+
+        The function reference is shared (same callable), but config is copied.
+        Execution statistics are reset in the copy.
+
+        Returns:
+            ACTION: A new ACTION instance with copied attributes and new ID.
+        """
+        new_action = ACTION(
+            name=self.name,
+            fn=self.fn,  # Same function reference
+            config=self.config.copy() if self.config else None,
+            result_key=self.result_key,
+            description=self.description
+        )
+        # New ID is already assigned in __init__, no need to set it
+        return new_action
4070
+
+    def to_dict(self):
+        """
+        Returns a serializable dictionary representation of this action.
+
+        Note: The function itself cannot be serialized, so it's represented by name.
+        When deserializing, a function registry must be provided.
+
+        Returns:
+            dict: Serializable representation of this action.
+        """
+        return {
+            "name": self.name,
+            "id": self.id,
+            "fn_name": self.fn.__name__,
+            "config": self.config,
+            "result_key": self.result_key,
+            "description": self.description,
+            "execution_count": self.execution_count,
+            "execution_history": self.execution_history
+        }
4091
+
+    @classmethod
+    def from_dict(cls, data, fn_registry):
+        """
+        Reconstruct an ACTION from a dictionary representation.
+
+        Args:
+            data (dict): Dictionary representation of an ACTION.
+            fn_registry (dict): Dictionary mapping function names to function objects.
+
+        Returns:
+            ACTION: Reconstructed ACTION object.
+
+        Raises:
+            KeyError: If the function name is not found in the registry.
+        """
+        if data["fn_name"] not in fn_registry:
+            raise KeyError("Function '{}' not found in registry".format(data['fn_name']))
+
+        action = cls(
+            name=data["name"],
+            fn=fn_registry[data["fn_name"]],
+            config=data["config"],
+            result_key=data["result_key"],
+            description=data["description"]
+        )
+        # Restore ID if provided, otherwise keep the new one from __init__
+        if data.get("id"):
+            action.id = data["id"]
+        action.execution_count = data.get("execution_count", 0)
+        action.execution_history = data.get("execution_history", [])
+        return action
4123
+
+    def __str__(self):
+        """
+        Returns a string representation of this action.
+
+        Returns:
+            str: String representation.
+        """
+        return "ACTION({}, desc='{}', executions={})".format(self.name, self.description, self.execution_count)
+
+    def __repr__(self):
+        """
+        Returns a detailed string representation of this action.
+
+        Returns:
+            str: Detailed string representation.
+        """
+        return ("ACTION(name='{}', fn={}, "
+                "config={}, result_key='{}', "
+                "description='{}', execution_count={})".format(
+                    self.name, self.fn.__name__, self.config,
+                    self.result_key, self.description, self.execution_count))
4145
+
+
+### ACTION CLASS TESTS
+
+ActionClassTests = """
+# --- ACTION Class Tests ---
+
+"""
4154
+
+#############################################################################
+#############################################################################
+