optichat 0.1.0__tar.gz

@@ -0,0 +1,138 @@
1
+ Metadata-Version: 2.4
2
+ Name: optichat
3
+ Version: 0.1.0
4
+ Summary: An advanced terminal-based chat application built with Python and Textual.
5
+ Author: OptiChat Contributors
6
+ License-Expression: MIT
7
+ Classifier: Programming Language :: Python :: 3
8
+ Classifier: Operating System :: OS Independent
9
+ Requires-Python: >=3.9
10
+ Description-Content-Type: text/markdown
11
+ Requires-Dist: textual
12
+ Requires-Dist: textual-dev
13
+ Requires-Dist: langchain
14
+ Requires-Dist: langchain-community
15
+ Requires-Dist: langchain-core
16
+ Requires-Dist: langchain-openai
17
+ Requires-Dist: langchain-anthropic
18
+ Requires-Dist: langchain-google-genai
19
+ Requires-Dist: langgraph
20
+ Requires-Dist: ollama
21
+ Requires-Dist: chromadb
22
+ Requires-Dist: sentence-transformers
23
+ Requires-Dist: ddgs
24
+
25
+ # OptiChat
26
+
27
+ OptiChat is an advanced terminal-based chat application built with Python and Textual. It features a robust multi-tier memory system, personalized memory tracking, dynamic model connectivity (including cloud and local Ollama models), and a sophisticated prompt construction pipeline for high-quality, contextual AI responses.
28
+
29
+ ## 🌟 Key Features
30
+
31
+ * **Terminal-based UI**: A beautiful, responsive interface built with Textual, featuring tabs, chat session sidebars, and customizable themes.
32
+ * **Multi-Tier Memory System**:
33
+   * **Short-Term Memory**: Token-budgeted rolling window for recent context.
34
+   * **LRU Memory**: Background-processed cache of frequently used messages.
35
+   * **Long-Term Memory**: Persistent vector store (ChromaDB) for semantic search across conversations.
36
+ * **Personalized Memory**: Automatically learns and updates user preferences, interests, and interaction styles with conflict resolution.
37
+ * **Dynamic Model Connectivity**: Support for OpenAI, Anthropic, Gemini, and local models via Ollama.
38
+ * **Prompt Construction Pipeline**: Utilizes LangGraph to dynamically classify queries, retrieve memory, apply personalization, and enforce structured output schemas.
39
+ * **Chat Trace Logs**: Every assistant response includes a collapsible section showing the model's chain-of-thought ToDo plan – what the model thought before responding.
40
+ * **Adaptive Response**: Response length and depth dynamically adapt to question complexity (simple → concise, complex → thorough and comprehensive).
41
+ * **Auto Chat Naming**: New chats are automatically renamed based on your first question (2-3 word title) via a background thread.
42
+ * **Secure Local Storage**: All data, including settings, API keys (via `.env`), SQLite databases for chats, and ChromaDB vectors, are stored securely in your local `~/.optichat/` directory.
43
+
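The LRU tier is described only at a high level. As a rough illustration of least-recently-used eviction, here is a minimal sketch; the class name `LRUMessageCache`, its methods, and the fixed capacity are assumptions for the example, not OptiChat's actual API.

```python
from collections import OrderedDict
from typing import Optional


class LRUMessageCache:
    """Hypothetical fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int = 3) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, message: str) -> None:
        self._items[key] = message
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry

    def keys(self) -> list:
        return list(self._items)


cache = LRUMessageCache(capacity=2)
cache.put("m1", "hello")
cache.put("m2", "how do I install Textual?")
cache.get("m1")            # touching m1 makes m2 the eviction candidate
cache.put("m3", "thanks")  # evicts m2
```

Reading an entry counts as a use, so a message that keeps being retrieved stays in the cache while untouched ones age out.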
44
+ ## 🏗️ Architecture
45
+
46
+ ### Storage
47
+ OptiChat stores its data locally in `~/.optichat/`. This includes:
48
+ - `config.json` for global settings.
49
+ - `optichat.db` (SQLite) for storing chats, messages, and session metadata.
50
+ - `chroma/` for ChromaDB vector embeddings.
51
+ - Flat files for chat-specific short-term and LRU caches.
52
+
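The layout above could be created on first launch roughly as follows. This is a sketch under stated assumptions: the helper name `init_storage`, the default config contents, and the one-table schema are invented for illustration, and the demo writes to a temporary directory rather than the real `~/.optichat/`.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def init_storage(base: Path) -> dict:
    """Create config.json, optichat.db and chroma/ under *base* if missing."""
    base.mkdir(parents=True, exist_ok=True)
    (base / "chroma").mkdir(exist_ok=True)

    config_path = base / "config.json"
    if not config_path.exists():
        # Assumed default; the real settings keys are not shown in the README.
        config_path.write_text(json.dumps({"theme": "default"}), encoding="utf-8")

    db_path = base / "optichat.db"
    conn = sqlite3.connect(db_path)
    try:
        # Assumed minimal schema, for illustration only.
        conn.execute("CREATE TABLE IF NOT EXISTS chats (id TEXT PRIMARY KEY, name TEXT)")
        conn.commit()
    finally:
        conn.close()

    return {"config": config_path, "db": db_path, "chroma": base / "chroma"}


# Demonstrated against a temporary directory instead of the user's home.
with tempfile.TemporaryDirectory() as tmp:
    paths = init_storage(Path(tmp) / ".optichat")
    created = sorted(p.name for p in (Path(tmp) / ".optichat").iterdir())
```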
53
+ ### Memory Pipeline
54
+ 1. **Short-term**: Retains the most recent 3-5 messages.
55
+ 2. **LRU Cache**: Frequently accessed context swapped in from long-term memory.
56
+ 3. **Long-term**: Chunks and embeds responses into ChromaDB for semantic retrieval.
57
+ 4. **Personalized**: Analyzes user behavior and explicitly stated preferences to tailor AI responses.
58
+
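The short-term tier is described both as token-budgeted and as holding the most recent few messages. A minimal sketch that enforces both limits is shown below; `ShortTermMemory`, the word-count token estimate, and the budget values are assumptions for the example, and a real implementation would use the model's tokenizer.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


class ShortTermMemory:
    """Hypothetical rolling window bounded by message count and token budget."""

    def __init__(self, token_budget: int = 50, max_messages: int = 5) -> None:
        self.token_budget = token_budget
        self.max_messages = max_messages
        self._messages: deque = deque()

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})
        self._trim()

    def _trim(self) -> None:
        # Drop the oldest messages until both limits hold, but always keep
        # the newest message even if it alone exceeds the budget.
        while len(self._messages) > 1 and (
            len(self._messages) > self.max_messages
            or self.total_tokens() > self.token_budget
        ):
            self._messages.popleft()

    def total_tokens(self) -> int:
        return sum(estimate_tokens(m["content"]) for m in self._messages)

    def window(self) -> list:
        return list(self._messages)


memory = ShortTermMemory(token_budget=8, max_messages=5)
memory.add("user", "what is Textual")
memory.add("assistant", "a Python TUI framework")
memory.add("user", "show me an example")  # pushes the total over budget
```

After the third message the oldest one is dropped, leaving the two newest messages that fit the budget.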
59
+ ### Prompt Construction
60
+ Using LangChain and LangGraph, the pipeline:
61
+ 1. Classifies the user input (type, complexity).
62
+ 2. Retrieves relevant context (Short-term, LRU, or Long-term via semantic search).
63
+ 3. Scores and orders the context.
64
+ 4. Injects personalized memory (tone, length, interests).
65
+ 5. Selects an appropriate output schema (e.g., factual, procedural, coding).
66
+ 6. Instructs the model to produce a **chain-of-thought ToDo plan** (`<TRACE>…</TRACE>`) before answering.
67
+ 7. Applies **adaptive response** instructions based on detected question complexity.
68
+ 8. Streams the final response and parses the trace log for display.
69
+
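Step 8 implies separating the `<TRACE>…</TRACE>` block from the visible answer. One way to do that with a regular expression is sketched here; the function name `split_trace` and the sample output are hypothetical, and the package's own parser may differ.

```python
import re

# Non-greedy match so only the first TRACE block is captured; DOTALL lets
# the plan span several lines.
TRACE_PATTERN = re.compile(r"<TRACE>(.*?)</TRACE>", re.DOTALL)


def split_trace(raw: str) -> tuple:
    """Return (trace_log, answer) extracted from raw model output."""
    match = TRACE_PATTERN.search(raw)
    if match is None:
        return "", raw.strip()
    trace = match.group(1).strip()
    answer = (raw[: match.start()] + raw[match.end():]).strip()
    return trace, answer


raw_output = (
    "<TRACE>\n1. Identify the question\n2. Answer briefly\n</TRACE>\n"
    "Textual is a TUI framework."
)
trace_log, answer = split_trace(raw_output)
```

If the model omits the tags, the whole output is treated as the answer and the trace is empty.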
70
+ ### Chat Trace Logs
71
+ Every assistant response includes a collapsible **Chat Trace Logs** widget at the bottom of the message bubble. This displays the numbered ToDo plan (chain-of-thought) that the model produced before generating its answer. Click to expand and inspect the model's reasoning process — useful for debugging, understanding responses, and evaluating quality.
72
+
73
+ ### Adaptive Response
74
+ Response length automatically adapts to question complexity:
75
+ | Complexity | Behaviour |
76
+ | :--- | :--- |
77
+ | **Simple** | Concise, focused answer — a few sentences. |
78
+ | **Moderate** | Well-structured with paragraphs, lists, and examples. |
79
+ | **Complex** | Comprehensive and thorough — covers all aspects, edge cases, and examples. |
80
+
81
+ Complexity is auto-detected from signal words (e.g., *"briefly"* → simple, *"in detail"* → complex).
82
+
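Signal-word detection of this kind can be sketched as a simple substring check. Only *"briefly"* and *"in detail"* come from the README; the other signal words, the function name `detect_complexity`, and the rule that complex signals win over simple ones are assumptions.

```python
# Only "briefly" and "in detail" are taken from the README; the rest are
# illustrative guesses.
SIMPLE_SIGNALS = ("briefly", "in short", "quick", "tl;dr")
COMPLEX_SIGNALS = ("in detail", "comprehensive", "step by step", "thoroughly")


def detect_complexity(question: str) -> str:
    """Classify a question as 'simple', 'moderate' or 'complex'."""
    text = question.lower()
    if any(signal in text for signal in COMPLEX_SIGNALS):
        return "complex"
    if any(signal in text for signal in SIMPLE_SIGNALS):
        return "simple"
    return "moderate"  # default when no signal word is present


labels = [
    detect_complexity("Briefly, what is LangGraph?"),
    detect_complexity("Explain vector stores in detail"),
    detect_complexity("How does ChromaDB store embeddings?"),
]
```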
83
+ ### Auto Chat Naming
84
+ New chats start with a generic "Chat N" name. After the first AI response, a background thread automatically renames the chat based on your first question, producing a short 2-3 word title.
85
+
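The README does not show how the title is produced. The sketch below illustrates only the background-thread part, using a keyword heuristic as a stand-in for title generation; `make_title`, `rename_in_background`, the stopword list, and the in-memory `chats` dict are all hypothetical.

```python
import threading

STOPWORDS = {
    "a", "an", "the", "is", "are", "do", "does", "how", "what", "why",
    "i", "to", "in", "on", "of", "can",
}


def make_title(question: str, max_words: int = 3) -> str:
    """Build a short title from the first few non-stopword terms."""
    words = [w.strip("?.,!:;\"'") for w in question.split()]
    keywords = [w for w in words if w and w.lower() not in STOPWORDS]
    return " ".join(w.capitalize() for w in keywords[:max_words]) or "New Chat"


def rename_in_background(chats: dict, chat_id: str, question: str) -> threading.Thread:
    """Rename the chat on a daemon thread so the UI is not blocked."""

    def worker() -> None:
        chats[chat_id] = make_title(question)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


chats = {"c1": "Chat 1"}
# join() is only here so the example can inspect the result deterministically.
rename_in_background(chats, "c1", "How do I install Textual on Windows?").join()
```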
86
+ ## 🛠️ Setup & Installation
87
+
88
+ 1. **Clone the repository:**
89
+ ```bash
90
+ git clone <repository_url>
91
+ cd OptiChat
92
+ ```
93
+
94
+ 2. **Create a virtual environment (optional but recommended):**
95
+ ```bash
96
+ python -m venv .venv
97
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
98
+ ```
99
+
100
+ 3. **Install dependencies:**
101
+ ```bash
102
+ pip install -r requirements.txt
103
+ ```
104
+
105
+ 4. **Run OptiChat:**
106
+ ```bash
107
+ python main.py # launches the app directly
108
+ textual run --dev main.py # launches with Textual dev mode enabled (slower startup)
109
+ ```
110
+ *Note: OptiChat will automatically create the `~/.optichat/` directory and necessary files upon first launch.*
111
+
112
+ 5. **Configure AI Models:**
113
+ - Launch the application and navigate to the **Settings** tab.
114
+ - Enter your API keys for Cloud Providers (OpenAI, Anthropic, Gemini).
115
+ - Alternatively, ensure [Ollama](https://ollama.com/) is running locally to auto-detect and use local models.
116
+ - **DISCLAIMER: Cloud API models consume many tokens per chat because each response involves multiple model calls. Consider local models for longer conversations.**
117
+
118
+ ## ⌨️ Keyboard Shortcuts
119
+
120
+ | Shortcut | Action |
121
+ | :--- | :--- |
122
+ | `Ctrl+Q` | Quit OptiChat and close the layout |
123
+ | `Ctrl+R` | Toggle streaming on/off |
124
+ | `Ctrl+C` | Cancel current streaming response mid-output |
125
+ | `↑ / ↓` | Scroll through input history (previous commands/messages) |
126
+ | `Page Up / Page Down` | Scroll the main panel content |
127
+
128
+ ## 🚀 Development Roadmap
129
+
130
+ OptiChat is developed in structured phases:
131
+
132
+ * **Phase 1: UI Design via Textual** - Building the responsive terminal interface, navigation, settings panels for API keys and themes, and chat windows.
133
+ * **Phase 2: Core Backend & Model Connectivity** - Initializing the `~/.optichat/` environment, implementing SQLite for chat history, and connecting to Cloud/Local AI models using LangChain.
134
+ * **Phase 3: Memory Storing Mechanism** - Implementing the background threads for Short-Term, LRU, and Long-Term (ChromaDB) memory handling, along with personalized memory updates.
135
+ * **Phase 4: Prompt Construction Pipeline** - Orchestrating the advanced LangGraph pipeline for query classification, semantic retrieval, schema enforcement, chain-of-thought trace logs, adaptive response, auto chat naming, and intelligent prompt assembly.
136
+
137
+ ---
138
+ *Developed using Textual, LangChain, and LangGraph.*
@@ -0,0 +1,114 @@
1
+ # OptiChat
2
+
3
+ OptiChat is an advanced terminal-based chat application built with Python and Textual. It features a robust multi-tier memory system, personalized memory tracking, dynamic model connectivity (including cloud and local Ollama models), and a sophisticated prompt construction pipeline for high-quality, contextual AI responses.
4
+
5
+ ## 🌟 Key Features
6
+
7
+ * **Terminal-based UI**: A beautiful, responsive interface built with Textual, featuring tabs, chat session sidebars, and customizable themes.
8
+ * **Multi-Tier Memory System**:
9
+   * **Short-Term Memory**: Token-budgeted rolling window for recent context.
10
+   * **LRU Memory**: Background-processed cache of frequently used messages.
11
+   * **Long-Term Memory**: Persistent vector store (ChromaDB) for semantic search across conversations.
12
+ * **Personalized Memory**: Automatically learns and updates user preferences, interests, and interaction styles with conflict resolution.
13
+ * **Dynamic Model Connectivity**: Support for OpenAI, Anthropic, Gemini, and local models via Ollama.
14
+ * **Prompt Construction Pipeline**: Utilizes LangGraph to dynamically classify queries, retrieve memory, apply personalization, and enforce structured output schemas.
15
+ * **Chat Trace Logs**: Every assistant response includes a collapsible section showing the model's chain-of-thought ToDo plan – what the model thought before responding.
16
+ * **Adaptive Response**: Response length and depth dynamically adapt to question complexity (simple → concise, complex → thorough and comprehensive).
17
+ * **Auto Chat Naming**: New chats are automatically renamed based on your first question (2-3 word title) via a background thread.
18
+ * **Secure Local Storage**: All data, including settings, API keys (via `.env`), SQLite databases for chats, and ChromaDB vectors, are stored securely in your local `~/.optichat/` directory.
19
+
20
+ ## 🏗️ Architecture
21
+
22
+ ### Storage
23
+ OptiChat stores its data locally in `~/.optichat/`. This includes:
24
+ - `config.json` for global settings.
25
+ - `optichat.db` (SQLite) for storing chats, messages, and session metadata.
26
+ - `chroma/` for ChromaDB vector embeddings.
27
+ - Flat files for chat-specific short-term and LRU caches.
28
+
29
+ ### Memory Pipeline
30
+ 1. **Short-term**: Retains the most recent 3-5 messages.
31
+ 2. **LRU Cache**: Frequently accessed context swapped in from long-term memory.
32
+ 3. **Long-term**: Chunks and embeds responses into ChromaDB for semantic retrieval.
33
+ 4. **Personalized**: Analyzes user behavior and explicitly stated preferences to tailor AI responses.
34
+
35
+ ### Prompt Construction
36
+ Using LangChain and LangGraph, the pipeline:
37
+ 1. Classifies the user input (type, complexity).
38
+ 2. Retrieves relevant context (Short-term, LRU, or Long-term via semantic search).
39
+ 3. Scores and orders the context.
40
+ 4. Injects personalized memory (tone, length, interests).
41
+ 5. Selects an appropriate output schema (e.g., factual, procedural, coding).
42
+ 6. Instructs the model to produce a **chain-of-thought ToDo plan** (`<TRACE>…</TRACE>`) before answering.
43
+ 7. Applies **adaptive response** instructions based on detected question complexity.
44
+ 8. Streams the final response and parses the trace log for display.
45
+
46
+ ### Chat Trace Logs
47
+ Every assistant response includes a collapsible **Chat Trace Logs** widget at the bottom of the message bubble. This displays the numbered ToDo plan (chain-of-thought) that the model produced before generating its answer. Click to expand and inspect the model's reasoning process — useful for debugging, understanding responses, and evaluating quality.
48
+
49
+ ### Adaptive Response
50
+ Response length automatically adapts to question complexity:
51
+ | Complexity | Behaviour |
52
+ | :--- | :--- |
53
+ | **Simple** | Concise, focused answer — a few sentences. |
54
+ | **Moderate** | Well-structured with paragraphs, lists, and examples. |
55
+ | **Complex** | Comprehensive and thorough — covers all aspects, edge cases, and examples. |
56
+
57
+ Complexity is auto-detected from signal words (e.g., *"briefly"* → simple, *"in detail"* → complex).
58
+
59
+ ### Auto Chat Naming
60
+ New chats start with a generic "Chat N" name. After the first AI response, a background thread automatically renames the chat based on your first question, producing a short 2-3 word title.
61
+
62
+ ## 🛠️ Setup & Installation
63
+
64
+ 1. **Clone the repository:**
65
+ ```bash
66
+ git clone <repository_url>
67
+ cd OptiChat
68
+ ```
69
+
70
+ 2. **Create a virtual environment (optional but recommended):**
71
+ ```bash
72
+ python -m venv .venv
73
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
74
+ ```
75
+
76
+ 3. **Install dependencies:**
77
+ ```bash
78
+ pip install -r requirements.txt
79
+ ```
80
+
81
+ 4. **Run OptiChat:**
82
+ ```bash
83
+ python main.py # launches the app directly
84
+ textual run --dev main.py # launches with Textual dev mode enabled (slower startup)
85
+ ```
86
+ *Note: OptiChat will automatically create the `~/.optichat/` directory and necessary files upon first launch.*
87
+
88
+ 5. **Configure AI Models:**
89
+ - Launch the application and navigate to the **Settings** tab.
90
+ - Enter your API keys for Cloud Providers (OpenAI, Anthropic, Gemini).
91
+ - Alternatively, ensure [Ollama](https://ollama.com/) is running locally to auto-detect and use local models.
92
+ - **DISCLAIMER: Cloud API models consume many tokens per chat because each response involves multiple model calls. Consider local models for longer conversations.**
93
+
94
+ ## ⌨️ Keyboard Shortcuts
95
+
96
+ | Shortcut | Action |
97
+ | :--- | :--- |
98
+ | `Ctrl+Q` | Quit OptiChat and close the layout |
99
+ | `Ctrl+R` | Toggle streaming on/off |
100
+ | `Ctrl+C` | Cancel current streaming response mid-output |
101
+ | `↑ / ↓` | Scroll through input history (previous commands/messages) |
102
+ | `Page Up / Page Down` | Scroll the main panel content |
103
+
104
+ ## 🚀 Development Roadmap
105
+
106
+ OptiChat is developed in structured phases:
107
+
108
+ * **Phase 1: UI Design via Textual** - Building the responsive terminal interface, navigation, settings panels for API keys and themes, and chat windows.
109
+ * **Phase 2: Core Backend & Model Connectivity** - Initializing the `~/.optichat/` environment, implementing SQLite for chat history, and connecting to Cloud/Local AI models using LangChain.
110
+ * **Phase 3: Memory Storing Mechanism** - Implementing the background threads for Short-Term, LRU, and Long-Term (ChromaDB) memory handling, along with personalized memory updates.
111
+ * **Phase 4: Prompt Construction Pipeline** - Orchestrating the advanced LangGraph pipeline for query classification, semantic retrieval, schema enforcement, chain-of-thought trace logs, adaptive response, auto chat naming, and intelligent prompt assembly.
112
+
113
+ ---
114
+ *Developed using Textual, LangChain, and LangGraph.*
@@ -0,0 +1 @@
1
+ # OptiChat App Package
@@ -0,0 +1,340 @@
1
+ """OptiChat – AI model connection layer.
2
+
3
+ Responsibilities
4
+ ────────────────
5
+ • Validate API keys against each provider.
6
+ • List available models from cloud providers (OpenAI / Anthropic / Gemini).
7
+ • Detect locally-installed Ollama models.
8
+ • Instantiate LangChain chat model objects for actual inference.
9
+ """
10
+
11
+ from __future__ import annotations
12
+
13
+ from typing import Any
14
+
15
+ from langchain_core.language_models.chat_models import BaseChatModel
16
+
17
+
18
+ # ══════════════════════════════════════════════
19
+ # Provider registry
20
+ # ══════════════════════════════════════════════
21
+ PROVIDERS = ("openai", "anthropic", "gemini")
22
+
23
+
24
+ # ══════════════════════════════════════════════
25
+ # API key validation
26
+ # ══════════════════════════════════════════════
27
+ def validate_api_key(provider: str, api_key: str) -> bool:
28
+     """Return True if *api_key* is accepted by *provider*.
29
+
30
+     Each provider check is a lightweight call (list models or a tiny request)
31
+     wrapped in a try/except, so a bad key (or any other failure) returns False.
32
+     """
33
+     try:
34
+         if provider == "openai":
35
+             return _validate_openai(api_key)
36
+         elif provider == "anthropic":
37
+             return _validate_anthropic(api_key)
38
+         elif provider == "gemini":
39
+             return _validate_gemini(api_key)
40
+         else:
41
+             return False
42
+     except Exception:
43
+         return False
44
+
45
+
46
+ def _validate_openai(api_key: str) -> bool:
47
+     from openai import OpenAI
48
+
49
+     client = OpenAI(api_key=api_key)
50
+     # A successful models.list() call proves the key is valid
51
+     models = client.models.list()
52
+     # Consume at least one item to confirm
53
+     _ = next(iter(models))
54
+     return True
55
+
56
+
57
+ def _validate_anthropic(api_key: str) -> bool:
58
+     from anthropic import Anthropic  # SDK client; ChatAnthropic has no .models
59
+
60
+     client = Anthropic(api_key=api_key)
61
+     models = client.models.list()
62
+     # Consume at least one item to confirm
63
+     _ = next(iter(models))
64
+     return True
65
+
66
+
67
+ def _validate_gemini(api_key: str) -> bool:
68
+     from google import genai
69
+
70
+     client = genai.Client(api_key=api_key)
71
+     models = list(client.models.list())
72
+     return len(models) > 0
73
+
74
+
75
+ # ══════════════════════════════════════════════
76
+ # List cloud models
77
+ # ══════════════════════════════════════════════
78
+ def list_cloud_models(provider: str, api_key: str) -> list[dict[str, str]]:
79
+     """Return a list of ``{id, name}`` dicts for available models.
80
+
81
+     Only returns chat/completion-capable models where possible.
82
+     """
83
+     try:
84
+         if provider == "openai":
85
+             return _list_openai(api_key)
86
+         elif provider == "anthropic":
87
+             return _list_anthropic(api_key)
88
+         elif provider == "gemini":
89
+             return _list_gemini(api_key)
90
+     except Exception:
91
+         pass
92
+     return []
93
+
94
+
95
+ def _list_openai(api_key: str) -> list[dict[str, str]]:
96
+     from openai import OpenAI
97
+
98
+     client = OpenAI(api_key=api_key)
99
+     models = client.models.list()
100
+     result: list[dict[str, str]] = []
101
+     for m in models:
102
+         mid = m.id
103
+         # Filter by ID prefix; the bare "o" is broad and can match non-chat IDs too
104
+         if mid.startswith(("gpt-", "o", "chatgpt")):
105
+             result.append({"id": f"openai/{mid}", "name": mid})
106
+     result.sort(key=lambda x: x["name"])
107
+     return result
108
+
109
+
110
+ def _list_anthropic(api_key: str) -> list[dict[str, str]]:
111
+     from anthropic import Anthropic  # SDK client; ChatAnthropic has no .models
112
+
113
+     client = Anthropic(api_key=api_key)
114
+     models = client.models.list()
115
+     result: list[dict[str, str]] = []
116
+     for m in models:
117
+         result.append({"id": f"anthropic/{m.id}", "name": m.display_name or m.id})
118
+     result.sort(key=lambda x: x["name"])
119
+     return result
120
+
121
+
122
+ def _list_gemini(api_key: str) -> list[dict[str, str]]:
123
+     from google import genai
124
+
125
+     client = genai.Client(api_key=api_key)
126
+     result: list[dict[str, str]] = []
127
+     for m in client.models.list():
128
+         name = getattr(m, "name", "")
129
+         display = getattr(m, "display_name", name)
130
+         # Only include generative models
131
+         if "gemini" in name.lower():
132
+             result.append({"id": f"gemini/{name}", "name": display})
133
+     result.sort(key=lambda x: x["name"])
134
+     return result
135
+
136
+
137
+ # ══════════════════════════════════════════════
138
+ # Ollama – local model detection
139
+ # ══════════════════════════════════════════════
140
+ def detect_ollama_models() -> list[dict[str, str]]:
141
+     """Detect locally installed Ollama models.
142
+
143
+     Returns a list of ``{id, name, size}`` dicts, or an empty list
144
+     if Ollama is not running / not installed.
145
+     """
146
+     try:
147
+         from ollama import Client
148
+         client = Client(host='http://127.0.0.1:11434')
149
+
150
+         response = client.list()
151
+         result: list[dict[str, str]] = []
152
+         for m in response.models:
153
+             model_name = m.model if hasattr(m, "model") else m.name
154
+             size_bytes = getattr(m, "size", 0)
155
+             size_gb = f"{size_bytes / (1024 ** 3):.1f} GB" if size_bytes else "?"
156
+             result.append({
157
+                 "id": f"ollama/{model_name}",
158
+                 "name": model_name,
159
+                 "size": size_gb,
160
+             })
161
+         return result
162
+     except Exception:
163
+         return []
164
+
165
+
166
+ # ══════════════════════════════════════════════
167
+ # Create a LangChain chat model instance
168
+ # ══════════════════════════════════════════════
169
+ def get_chat_model(model_id: str) -> BaseChatModel:
170
+     """Instantiate and return a LangChain chat model for *model_id*.
171
+
172
+     *model_id* format: ``provider/model_name``
173
+     e.g. ``openai/gpt-4o``, ``anthropic/claude-sonnet-4-20250514``,
174
+     ``gemini/gemini-2.0-flash``, ``ollama/llama3``.
175
+     """
176
+     if "/" not in model_id:
177
+         raise ValueError(f"Invalid model_id format: {model_id!r}. Expected 'provider/model'.")
178
+
179
+     provider, model_name = model_id.split("/", 1)
180
+
181
+     if provider == "openai":
182
+         from langchain_openai import ChatOpenAI
183
+
184
+         return ChatOpenAI(model=model_name, streaming=True)
185
+
186
+     elif provider == "anthropic":
187
+         from langchain_anthropic import ChatAnthropic
188
+
189
+         return ChatAnthropic(model=model_name, streaming=True)
190
+
191
+     elif provider == "gemini":
192
+         from langchain_google_genai import ChatGoogleGenerativeAI
193
+
194
+         return ChatGoogleGenerativeAI(model=model_name, streaming=True)
195
+
196
+     elif provider == "ollama":
197
+         from langchain_community.chat_models import ChatOllama
198
+
199
+         return ChatOllama(model=model_name)
200
+
201
+     else:
202
+         raise ValueError(f"Unknown provider: {provider!r}")
203
+
204
+
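The `provider/model_name` convention relies on splitting at the first slash only. The standalone sketch below isolates that step; `parse_model_id` is a hypothetical helper that mirrors the check and split in `get_chat_model`, and the sample IDs are illustrative.

```python
def parse_model_id(model_id: str) -> tuple:
    """Split 'provider/model' at the first slash (hypothetical helper)."""
    if "/" not in model_id:
        raise ValueError(
            f"Invalid model_id format: {model_id!r}. Expected 'provider/model'."
        )
    # maxsplit=1 keeps any further slashes inside the model name.
    provider, model_name = model_id.split("/", 1)
    return provider, model_name


simple = parse_model_id("openai/gpt-4o")
nested = parse_model_id("gemini/models/gemini-2.0-flash")
```

Limiting the split to one matters whenever a provider's own model name contains a slash, since the remainder is passed through unchanged.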
205
+ # ══════════════════════════════════════════════
206
+ # Pipeline-aware message sending (Phase 4)
207
+ # ══════════════════════════════════════════════
208
+ async def send_message_via_pipeline(
209
+     model_id: str,
210
+     user_input: str,
211
+     chat_name: str,
212
+     chat_id: str,
213
+     *,
214
+     websearch_enabled: bool = False,
215
+ ) -> dict[str, str]:
216
+     """Run the user's message through the full prompt construction pipeline.
217
+
218
+     The pipeline handles classification, memory retrieval, prompt assembly,
219
+     LLM invocation, and post-processing (DB + memory storage).
220
+
221
+     Parameters
222
+     ----------
223
+     websearch_enabled:
224
+         When True the pipeline's classifier node queries DuckDuckGo for
225
+         the top-2 results and injects them into the final prompt
226
+         (Phase 5 feature).
227
+
228
+     Returns a dict with keys ``response`` (the assistant reply) and
229
+     ``trace_log`` (the chain-of-thought trace extracted from the model output).
230
+     """
231
+     from app.pipeline import run_pipeline
232
+
233
+     result = await run_pipeline(
234
+         user_input=user_input,
235
+         chat_name=chat_name,
236
+         chat_id=chat_id,
237
+         model_id=model_id,
238
+         websearch_enabled=websearch_enabled,
239
+     )
240
+
241
+     error = result.get("error")
242
+     if error:
243
+         return {"response": f"*{error}*", "trace_log": ""}
244
+
245
+     return {
246
+         "response": result.get("response", ""),
247
+         "trace_log": result.get("trace_log", ""),
248
+     }
249
+
250
+
251
+ # ══════════════════════════════════════════════
252
+ # Legacy: direct send (no pipeline, for fallback)
253
+ # ══════════════════════════════════════════════
254
+ async def send_message(
255
+     model_id: str,
256
+     messages: list[dict[str, str]],
257
+     chat_name: str | None = None,
258
+     chat_id: str | None = None,
259
+ ) -> str:
260
+     """Send a list of {role, content} dicts and return the assistant reply.
261
+
262
+     If *chat_name* and *chat_id* are provided, the user message and AI
263
+     response are automatically fed through the memory pipeline.
264
+
265
+     NOTE: For Phase 4+, prefer ``send_message_via_pipeline()`` which runs
266
+     the full prompt construction pipeline.
267
+     """
268
+     from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
269
+
270
+     _type_map = {
271
+         "system": SystemMessage,
272
+         "user": HumanMessage,
273
+         "assistant": AIMessage,
274
+     }
275
+     lc_messages = [_type_map[m["role"]](content=m["content"]) for m in messages]
276
+
277
+     model = get_chat_model(model_id)
278
+     response = await model.ainvoke(lc_messages)
279
+     reply = str(response.content)
280
+
281
+     # ── Memory integration (Phase 3) ────────
282
+     if chat_name and chat_id:
283
+         try:
284
+             from app.memory import process_message
285
+
286
+             # Store the last user message in memory
287
+             user_msgs = [m for m in messages if m["role"] == "user"]
288
+             if user_msgs:
289
+                 await process_message(chat_name, chat_id, "user", user_msgs[-1]["content"])
290
+             # Store the assistant reply in memory
291
+             await process_message(chat_name, chat_id, "assistant", reply)
292
+         except Exception:
293
+             pass  # Memory errors must not block the response
294
+
295
+     return reply
296
+
297
+
298
+ async def stream_message(
299
+     model_id: str,
300
+     messages: list[dict[str, str]],
301
+     chat_name: str | None = None,
302
+     chat_id: str | None = None,
303
+ ):
304
+     """Yield token chunks as an async generator.
305
+
306
+     After streaming completes, the accumulated response is fed through
307
+     the memory pipeline if *chat_name* and *chat_id* are provided.
308
+     """
309
+     from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
310
+
311
+     _type_map = {
312
+         "system": SystemMessage,
313
+         "user": HumanMessage,
314
+         "assistant": AIMessage,
315
+     }
316
+     lc_messages = [_type_map[m["role"]](content=m["content"]) for m in messages]
317
+
318
+     model = get_chat_model(model_id)
319
+     full_response: list[str] = []
320
+     async for chunk in model.astream(lc_messages):
321
+         text = chunk.content if hasattr(chunk, "content") else str(chunk)
322
+         if text:
323
+             full_response.append(text)
324
+             yield text
325
+
326
+     # ── Memory integration (Phase 3) ────────
327
+     if chat_name and chat_id:
328
+         try:
329
+             from app.memory import process_message
330
+
331
+             # Store the last user message in memory
332
+             user_msgs = [m for m in messages if m["role"] == "user"]
333
+             if user_msgs:
334
+                 await process_message(chat_name, chat_id, "user", user_msgs[-1]["content"])
335
+             # Store the full accumulated assistant response in memory
336
+             accumulated = "".join(full_response)
337
+             if accumulated:
338
+                 await process_message(chat_name, chat_id, "assistant", accumulated)
339
+         except Exception:
340
+             pass  # Memory errors must not block the response
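The yield-then-store pattern used by `stream_message` can be tried without any model or LangChain install. In this sketch `fake_model_stream` stands in for `model.astream()` and a plain list stands in for `process_message`; both are assumptions for the example.

```python
import asyncio


async def fake_model_stream(chunks):
    """Stand-in for model.astream(); yields text chunks one at a time."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk


async def stream_and_store(chunks, store: list):
    """Yield non-empty chunks to the caller, then persist the joined text."""
    parts = []
    async for text in fake_model_stream(chunks):
        if text:
            parts.append(text)
            yield text
    # Runs only when the consumer exhausts the generator.
    accumulated = "".join(parts)
    if accumulated:
        store.append(accumulated)  # stand-in for process_message(...)


async def demo():
    store = []
    seen = [text async for text in stream_and_store(["Hel", "", "lo"], store)]
    return seen, store


seen, store = asyncio.run(demo())
```

One consequence of this design is that the code after the loop is skipped when the consumer stops iterating early, for example when a streaming response is cancelled, so a partial reply would not be stored.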