khoj 2.0.0b12__py3-none-any.whl → 2.0.0b13__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (199)
  1. khoj/app/README.md +1 -1
  2. khoj/app/urls.py +1 -0
  3. khoj/configure.py +21 -54
  4. khoj/database/adapters/__init__.py +6 -15
  5. khoj/database/management/commands/delete_orphaned_fileobjects.py +0 -1
  6. khoj/database/migrations/0064_remove_conversation_temp_id_alter_conversation_id.py +1 -1
  7. khoj/database/migrations/0075_migrate_generated_assets_and_validate.py +1 -1
  8. khoj/database/migrations/0092_alter_chatmodel_model_type_alter_chatmodel_name_and_more.py +36 -0
  9. khoj/database/migrations/0093_remove_localorgconfig_user_and_more.py +36 -0
  10. khoj/database/models/__init__.py +10 -40
  11. khoj/database/tests.py +0 -2
  12. khoj/interface/compiled/404/index.html +2 -2
  13. khoj/interface/compiled/_next/static/chunks/{9245.a04e92d034540234.js → 1225.ecac11e7421504c4.js} +3 -3
  14. khoj/interface/compiled/_next/static/chunks/1320.ae930ad00affe685.js +5 -0
  15. khoj/interface/compiled/_next/static/chunks/{1327-3b1a41af530fa8ee.js → 1327-e254819a9172cfa7.js} +1 -1
  16. khoj/interface/compiled/_next/static/chunks/1626.15a8acc0d6639ec6.js +1 -0
  17. khoj/interface/compiled/_next/static/chunks/{3489.c523fe96a2eee74f.js → 1940.d082758bd04e08ae.js} +1 -1
  18. khoj/interface/compiled/_next/static/chunks/{2327-ea623ca2d22f78e9.js → 2327-438aaec1657c5ada.js} +1 -1
  19. khoj/interface/compiled/_next/static/chunks/2475.57a0d0fd93d07af0.js +93 -0
  20. khoj/interface/compiled/_next/static/chunks/2481.5ce6524ba0a73f90.js +55 -0
  21. khoj/interface/compiled/_next/static/chunks/297.4c4c823ff6e3255b.js +174 -0
  22. khoj/interface/compiled/_next/static/chunks/{5639-09e2009a2adedf8b.js → 3260-82d2521fab032ff1.js} +68 -23
  23. khoj/interface/compiled/_next/static/chunks/3353.1c6d553216a1acae.js +1 -0
  24. khoj/interface/compiled/_next/static/chunks/3855.f7b8131f78af046e.js +1 -0
  25. khoj/interface/compiled/_next/static/chunks/3973.dc54a39586ab48be.js +1 -0
  26. khoj/interface/compiled/_next/static/chunks/4241.c1cd170f7f37ac59.js +24 -0
  27. khoj/interface/compiled/_next/static/chunks/{4327.8d2a1b8f1ea78208.js → 4327.f3704dc398c67113.js} +19 -19
  28. khoj/interface/compiled/_next/static/chunks/4505.f09454a346269c3f.js +117 -0
  29. khoj/interface/compiled/_next/static/chunks/4801.96a152d49742b644.js +1 -0
  30. khoj/interface/compiled/_next/static/chunks/5427-a95ec748e52abb75.js +1 -0
  31. khoj/interface/compiled/_next/static/chunks/549.2bd27f59a91a9668.js +148 -0
  32. khoj/interface/compiled/_next/static/chunks/5765.71b1e1207b76b03f.js +1 -0
  33. khoj/interface/compiled/_next/static/chunks/584.d7ce3505f169b706.js +1 -0
  34. khoj/interface/compiled/_next/static/chunks/6240.34f7c1fa692edd61.js +24 -0
  35. khoj/interface/compiled/_next/static/chunks/6d3fe5a5-f9f3c16e0bc0cdf9.js +10 -0
  36. khoj/interface/compiled/_next/static/chunks/{7127-0f4a2a77d97fb5fa.js → 7127-97b83757db125ba6.js} +1 -1
  37. khoj/interface/compiled/_next/static/chunks/7200-93ab0072359b8028.js +1 -0
  38. khoj/interface/compiled/_next/static/chunks/{2612.bcf5a623b3da209e.js → 7553.f5ad54b1f6e92c49.js} +2 -2
  39. khoj/interface/compiled/_next/static/chunks/7626-1b630f1654172341.js +1 -0
  40. khoj/interface/compiled/_next/static/chunks/764.dadd316e8e16d191.js +63 -0
  41. khoj/interface/compiled/_next/static/chunks/78.08169ab541abab4f.js +43 -0
  42. khoj/interface/compiled/_next/static/chunks/784.e03acf460df213d1.js +1 -0
  43. khoj/interface/compiled/_next/static/chunks/{9537-d9ab442ce15d1e20.js → 8072-e1440cb482a0940e.js} +1 -1
  44. khoj/interface/compiled/_next/static/chunks/{3265.924139c4146ee344.js → 8086.8d39887215807fcd.js} +1 -1
  45. khoj/interface/compiled/_next/static/chunks/8168.f074ab8c7c16d82d.js +59 -0
  46. khoj/interface/compiled/_next/static/chunks/{8694.2bd9c2f65d8c5847.js → 8223.1705878fa7a09292.js} +1 -1
  47. khoj/interface/compiled/_next/static/chunks/8483.94f6c9e2bee86f50.js +215 -0
  48. khoj/interface/compiled/_next/static/chunks/{8888.ebe0e552b59e7fed.js → 8810.fc0e479de78c7c61.js} +1 -1
  49. khoj/interface/compiled/_next/static/chunks/8828.bc74dc4ce94e78f6.js +1 -0
  50. khoj/interface/compiled/_next/static/chunks/{7303.d0612f812a967a08.js → 8909.14ac3f43d0070cf1.js} +5 -5
  51. khoj/interface/compiled/_next/static/chunks/90542734.b1a1629065ba199b.js +1 -0
  52. khoj/interface/compiled/_next/static/chunks/9167.098534184f03fe92.js +56 -0
  53. khoj/interface/compiled/_next/static/chunks/{4980.63500d68b3bb1222.js → 9537.e934ce37bf314509.js} +5 -5
  54. khoj/interface/compiled/_next/static/chunks/9574.3fe8e26e95bf1c34.js +1 -0
  55. khoj/interface/compiled/_next/static/chunks/9599.ec50b5296c27dae9.js +1 -0
  56. khoj/interface/compiled/_next/static/chunks/9643.b34248df52ffc77c.js +262 -0
  57. khoj/interface/compiled/_next/static/chunks/9747.2fd9065b1435abb1.js +1 -0
  58. khoj/interface/compiled/_next/static/chunks/9922.98f2b2a9959b4ebe.js +1 -0
  59. khoj/interface/compiled/_next/static/chunks/app/agents/layout-e49165209d2e406c.js +1 -0
  60. khoj/interface/compiled/_next/static/chunks/app/agents/page-e291b49977f43880.js +1 -0
  61. khoj/interface/compiled/_next/static/chunks/app/automations/page-198b26df6e09bbb0.js +1 -0
  62. khoj/interface/compiled/_next/static/chunks/app/chat/layout-33934fc2d6ae6838.js +1 -0
  63. khoj/interface/compiled/_next/static/chunks/app/chat/{page-4bc2938df5d57981.js → page-dfcc1e8e2ad62873.js} +1 -1
  64. khoj/interface/compiled/_next/static/chunks/app/{page-a19a597629e87fb8.js → page-1567cac7b79a7c59.js} +1 -1
  65. khoj/interface/compiled/_next/static/chunks/app/search/layout-c02531d586972d7d.js +1 -0
  66. khoj/interface/compiled/_next/static/chunks/app/search/{page-fa366ac14b228688.js → page-3639e50ec3e9acfd.js} +1 -1
  67. khoj/interface/compiled/_next/static/chunks/app/settings/{page-8f9a85f96088c18b.js → page-6081362437c82470.js} +1 -1
  68. khoj/interface/compiled/_next/static/chunks/app/share/chat/layout-6fb51c5c80f8ec67.js +1 -0
  69. khoj/interface/compiled/_next/static/chunks/app/share/chat/{page-ed7787cf4938b8e3.js → page-e0dcb1762f8c8f88.js} +1 -1
  70. khoj/interface/compiled/_next/static/chunks/webpack-5393aad3d824e0cb.js +1 -0
  71. khoj/interface/compiled/_next/static/css/{a0c2fd63bb396f04.css → 23b26df423cd8a9c.css} +1 -1
  72. khoj/interface/compiled/_next/static/css/{93eeacc43e261162.css → c34713c98384ee87.css} +1 -1
  73. khoj/interface/compiled/agents/index.html +2 -2
  74. khoj/interface/compiled/agents/index.txt +3 -3
  75. khoj/interface/compiled/automations/index.html +2 -2
  76. khoj/interface/compiled/automations/index.txt +4 -4
  77. khoj/interface/compiled/chat/index.html +2 -2
  78. khoj/interface/compiled/chat/index.txt +3 -3
  79. khoj/interface/compiled/index.html +2 -2
  80. khoj/interface/compiled/index.txt +3 -3
  81. khoj/interface/compiled/search/index.html +2 -2
  82. khoj/interface/compiled/search/index.txt +3 -3
  83. khoj/interface/compiled/settings/index.html +2 -2
  84. khoj/interface/compiled/settings/index.txt +5 -5
  85. khoj/interface/compiled/share/chat/index.html +2 -2
  86. khoj/interface/compiled/share/chat/index.txt +3 -3
  87. khoj/main.py +7 -9
  88. khoj/manage.py +1 -0
  89. khoj/processor/content/github/github_to_entries.py +6 -7
  90. khoj/processor/content/images/image_to_entries.py +0 -1
  91. khoj/processor/content/markdown/markdown_to_entries.py +2 -3
  92. khoj/processor/content/notion/notion_to_entries.py +5 -6
  93. khoj/processor/content/org_mode/org_to_entries.py +4 -5
  94. khoj/processor/content/org_mode/orgnode.py +4 -4
  95. khoj/processor/content/plaintext/plaintext_to_entries.py +1 -2
  96. khoj/processor/content/text_to_entries.py +1 -3
  97. khoj/processor/conversation/google/utils.py +3 -3
  98. khoj/processor/conversation/openai/gpt.py +65 -28
  99. khoj/processor/conversation/openai/utils.py +359 -28
  100. khoj/processor/conversation/prompts.py +16 -41
  101. khoj/processor/conversation/utils.py +29 -39
  102. khoj/processor/embeddings.py +0 -2
  103. khoj/processor/image/generate.py +3 -3
  104. khoj/processor/operator/__init__.py +2 -3
  105. khoj/processor/operator/grounding_agent.py +15 -2
  106. khoj/processor/operator/grounding_agent_uitars.py +34 -23
  107. khoj/processor/operator/operator_agent_anthropic.py +29 -4
  108. khoj/processor/operator/operator_agent_base.py +1 -1
  109. khoj/processor/operator/operator_agent_binary.py +4 -4
  110. khoj/processor/operator/operator_agent_openai.py +21 -6
  111. khoj/processor/operator/operator_environment_browser.py +1 -1
  112. khoj/processor/operator/operator_environment_computer.py +1 -1
  113. khoj/processor/speech/text_to_speech.py +0 -1
  114. khoj/processor/tools/online_search.py +1 -1
  115. khoj/processor/tools/run_code.py +1 -1
  116. khoj/routers/api.py +2 -15
  117. khoj/routers/api_agents.py +1 -2
  118. khoj/routers/api_automation.py +1 -1
  119. khoj/routers/api_chat.py +10 -16
  120. khoj/routers/api_content.py +3 -111
  121. khoj/routers/api_model.py +0 -1
  122. khoj/routers/api_subscription.py +1 -1
  123. khoj/routers/email.py +4 -4
  124. khoj/routers/helpers.py +44 -103
  125. khoj/routers/research.py +8 -8
  126. khoj/search_filter/base_filter.py +2 -4
  127. khoj/search_type/text_search.py +1 -2
  128. khoj/utils/cli.py +5 -53
  129. khoj/utils/config.py +0 -65
  130. khoj/utils/constants.py +6 -7
  131. khoj/utils/helpers.py +10 -18
  132. khoj/utils/initialization.py +7 -48
  133. khoj/utils/models.py +2 -4
  134. khoj/utils/rawconfig.py +1 -69
  135. khoj/utils/state.py +2 -8
  136. khoj/utils/yaml.py +0 -39
  137. {khoj-2.0.0b12.dist-info → khoj-2.0.0b13.dist-info}/METADATA +3 -3
  138. {khoj-2.0.0b12.dist-info → khoj-2.0.0b13.dist-info}/RECORD +149 -158
  139. khoj/interface/compiled/_next/static/chunks/1191.b547ec13349b4aed.js +0 -1
  140. khoj/interface/compiled/_next/static/chunks/1588.f0558a0bdffc4761.js +0 -117
  141. khoj/interface/compiled/_next/static/chunks/1918.925cb4a35518d258.js +0 -43
  142. khoj/interface/compiled/_next/static/chunks/2849.dc00ae5ba7219cfc.js +0 -1
  143. khoj/interface/compiled/_next/static/chunks/303.fe76de943e930fbd.js +0 -1
  144. khoj/interface/compiled/_next/static/chunks/4533.586e74b45a2bde25.js +0 -55
  145. khoj/interface/compiled/_next/static/chunks/4551.82ce1476b5516bc2.js +0 -5
  146. khoj/interface/compiled/_next/static/chunks/4748.0edd37cba3ea2809.js +0 -59
  147. khoj/interface/compiled/_next/static/chunks/5210.cd35a1c1ec594a20.js +0 -93
  148. khoj/interface/compiled/_next/static/chunks/5329.f8b3c5b3d16159cd.js +0 -1
  149. khoj/interface/compiled/_next/static/chunks/5427-13d6ffd380fdfab7.js +0 -1
  150. khoj/interface/compiled/_next/static/chunks/558-c14e76cff03f6a60.js +0 -1
  151. khoj/interface/compiled/_next/static/chunks/5830.8876eccb82da9b7d.js +0 -262
  152. khoj/interface/compiled/_next/static/chunks/6230.88a71d8145347b3f.js +0 -1
  153. khoj/interface/compiled/_next/static/chunks/7161.77e0530a40ad5ca8.js +0 -1
  154. khoj/interface/compiled/_next/static/chunks/7200-ac3b2e37ff30e126.js +0 -1
  155. khoj/interface/compiled/_next/static/chunks/7505.c31027a3695bdebb.js +0 -148
  156. khoj/interface/compiled/_next/static/chunks/7760.35649cc21d9585bd.js +0 -56
  157. khoj/interface/compiled/_next/static/chunks/83.48e2db193a940052.js +0 -1
  158. khoj/interface/compiled/_next/static/chunks/8427.844694e06133fb51.js +0 -1
  159. khoj/interface/compiled/_next/static/chunks/8665.4db7e6b2e8933497.js +0 -174
  160. khoj/interface/compiled/_next/static/chunks/872.caf84cc1a39ae59f.js +0 -1
  161. khoj/interface/compiled/_next/static/chunks/8890.6e8a59e4de6978bc.js +0 -215
  162. khoj/interface/compiled/_next/static/chunks/8950.5f2272e0ac923f9e.js +0 -1
  163. khoj/interface/compiled/_next/static/chunks/90542734.2c21f16f18b22411.js +0 -1
  164. khoj/interface/compiled/_next/static/chunks/9202.c703864fcedc8d1f.js +0 -63
  165. khoj/interface/compiled/_next/static/chunks/9320.6aca4885d541aa44.js +0 -24
  166. khoj/interface/compiled/_next/static/chunks/9535.f78cd92d03331e55.js +0 -1
  167. khoj/interface/compiled/_next/static/chunks/9968.b111fc002796da81.js +0 -1
  168. khoj/interface/compiled/_next/static/chunks/app/agents/layout-4e2a134ec26aa606.js +0 -1
  169. khoj/interface/compiled/_next/static/chunks/app/agents/page-5db6ad18da10d353.js +0 -1
  170. khoj/interface/compiled/_next/static/chunks/app/automations/page-6271e2e31c7571d1.js +0 -1
  171. khoj/interface/compiled/_next/static/chunks/app/chat/layout-ad4d1792ab1a4108.js +0 -1
  172. khoj/interface/compiled/_next/static/chunks/app/search/layout-f5881c7ae3ba0795.js +0 -1
  173. khoj/interface/compiled/_next/static/chunks/app/share/chat/layout-abb6c5f4239ad7be.js +0 -1
  174. khoj/interface/compiled/_next/static/chunks/f3e3247b-1758d4651e4457c2.js +0 -10
  175. khoj/interface/compiled/_next/static/chunks/webpack-4b00e5a0da4a9dae.js +0 -1
  176. khoj/migrations/__init__.py +0 -0
  177. khoj/migrations/migrate_offline_chat_default_model.py +0 -69
  178. khoj/migrations/migrate_offline_chat_default_model_2.py +0 -71
  179. khoj/migrations/migrate_offline_chat_schema.py +0 -83
  180. khoj/migrations/migrate_offline_model.py +0 -29
  181. khoj/migrations/migrate_processor_config_openai.py +0 -67
  182. khoj/migrations/migrate_server_pg.py +0 -132
  183. khoj/migrations/migrate_version.py +0 -17
  184. khoj/processor/conversation/offline/__init__.py +0 -0
  185. khoj/processor/conversation/offline/chat_model.py +0 -224
  186. khoj/processor/conversation/offline/utils.py +0 -80
  187. khoj/processor/conversation/offline/whisper.py +0 -15
  188. khoj/utils/fs_syncer.py +0 -252
  189. /khoj/interface/compiled/_next/static/{TTch40tYWOfh0SzwjwZXV → RYbQvo3AvgOR0bEVVfxF4}/_buildManifest.js +0 -0
  190. /khoj/interface/compiled/_next/static/{TTch40tYWOfh0SzwjwZXV → RYbQvo3AvgOR0bEVVfxF4}/_ssgManifest.js +0 -0
  191. /khoj/interface/compiled/_next/static/chunks/{1915-fbfe167c84ad60c5.js → 1915-5c6508f6ebb62a30.js} +0 -0
  192. /khoj/interface/compiled/_next/static/chunks/{2117-e78b6902ad6f75ec.js → 2117-080746c8e170c81a.js} +0 -0
  193. /khoj/interface/compiled/_next/static/chunks/{2939-4d4084c5b888b960.js → 2939-4af3fd24b8ffc9ad.js} +0 -0
  194. /khoj/interface/compiled/_next/static/chunks/{4447-d6cf93724d57e34b.js → 4447-cd95608f8e93e711.js} +0 -0
  195. /khoj/interface/compiled/_next/static/chunks/{8667-4b7790573b08c50d.js → 8667-50b03a89e82e0ba7.js} +0 -0
  196. /khoj/interface/compiled/_next/static/chunks/{9139-ce1ae935dac9c871.js → 9139-8ac4d9feb10f8869.js} +0 -0
  197. {khoj-2.0.0b12.dist-info → khoj-2.0.0b13.dist-info}/WHEEL +0 -0
  198. {khoj-2.0.0b12.dist-info → khoj-2.0.0b13.dist-info}/entry_points.txt +0 -0
  199. {khoj-2.0.0b12.dist-info → khoj-2.0.0b13.dist-info}/licenses/LICENSE +0 -0
khoj/processor/conversation/openai/utils.py
@@ -2,7 +2,6 @@ import json
 import logging
 import os
 from copy import deepcopy
-from functools import partial
 from time import perf_counter
 from typing import AsyncGenerator, Dict, Generator, List, Literal, Optional, Union
 from urllib.parse import urlparse
@@ -22,6 +21,8 @@ from openai.types.chat.chat_completion_chunk import (
     Choice,
     ChoiceDelta,
 )
+from openai.types.responses import Response as OpenAIResponse
+from openai.types.responses import ResponseFunctionToolCall, ResponseReasoningItem
 from pydantic import BaseModel
 from tenacity import (
     before_sleep_log,
@@ -54,13 +55,31 @@ openai_clients: Dict[str, openai.OpenAI] = {}
 openai_async_clients: Dict[str, openai.AsyncOpenAI] = {}
 
 
+def _extract_text_for_instructions(content: Union[str, List, Dict, None]) -> str:
+    """Extract plain text from a message content suitable for Responses API instructions."""
+    if content is None:
+        return ""
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        texts: List[str] = []
+        for part in content:
+            if isinstance(part, dict) and part.get("type") == "input_text" and part.get("text"):
+                texts.append(str(part.get("text")))
+        return "\n\n".join(texts)
+    if isinstance(content, dict):
+        # If a single part dict was passed
+        if content.get("type") == "input_text" and content.get("text"):
+            return str(content.get("text"))
+    # Fallback to string conversion
+    return str(content)
+
+
 @retry(
     retry=(
         retry_if_exception_type(openai._exceptions.APITimeoutError)
-        | retry_if_exception_type(openai._exceptions.APIError)
-        | retry_if_exception_type(openai._exceptions.APIConnectionError)
         | retry_if_exception_type(openai._exceptions.RateLimitError)
-        | retry_if_exception_type(openai._exceptions.APIStatusError)
+        | retry_if_exception_type(openai._exceptions.InternalServerError)
         | retry_if_exception_type(ValueError)
     ),
     wait=wait_random_exponential(min=1, max=10),
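A quick sketch of what the new _extract_text_for_instructions helper returns for the content shapes it handles, following the function body above (example values are illustrative):

    parts = [
        {"type": "input_text", "text": "You are Khoj."},
        {"type": "input_image", "image_url": "https://example.com/pic.png"},  # non input_text parts are skipped
    ]
    assert _extract_text_for_instructions(parts) == "You are Khoj."
    assert _extract_text_for_instructions("plain system prompt") == "plain system prompt"
    assert _extract_text_for_instructions(None) == ""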
@@ -228,10 +247,8 @@ def completion_with_backoff(
 @retry(
     retry=(
         retry_if_exception_type(openai._exceptions.APITimeoutError)
-        | retry_if_exception_type(openai._exceptions.APIError)
-        | retry_if_exception_type(openai._exceptions.APIConnectionError)
         | retry_if_exception_type(openai._exceptions.RateLimitError)
-        | retry_if_exception_type(openai._exceptions.APIStatusError)
+        | retry_if_exception_type(openai._exceptions.InternalServerError)
         | retry_if_exception_type(ValueError)
     ),
     wait=wait_exponential(multiplier=1, min=4, max=10),
@@ -284,9 +301,9 @@ async def chat_completion_with_backoff(
         if len(system_messages) > 0:
             first_system_message_index, first_system_message = system_messages[0]
             first_system_message_content = first_system_message["content"]
-            formatted_messages[first_system_message_index][
-                "content"
-            ] = f"{first_system_message_content}\nFormatting re-enabled"
+            formatted_messages[first_system_message_index]["content"] = (
+                f"{first_system_message_content}\nFormatting re-enabled"
+            )
     elif is_twitter_reasoning_model(model_name, api_base_url):
         reasoning_effort = "high" if deepthought else "low"
         # Grok-4 models do not support reasoning_effort parameter
@@ -391,6 +408,283 @@ async def chat_completion_with_backoff(
             commit_conversation_trace(messages, aggregated_response, tracer)
 
 
+@retry(
+    retry=(
+        retry_if_exception_type(openai._exceptions.APITimeoutError)
+        | retry_if_exception_type(openai._exceptions.RateLimitError)
+        | retry_if_exception_type(openai._exceptions.InternalServerError)
+        | retry_if_exception_type(ValueError)
+    ),
+    wait=wait_random_exponential(min=1, max=10),
+    stop=stop_after_attempt(3),
+    before_sleep=before_sleep_log(logger, logging.DEBUG),
+    reraise=True,
+)
+def responses_completion_with_backoff(
+    messages: List[ChatMessage],
+    model_name: str,
+    temperature=0.6,
+    openai_api_key=None,
+    api_base_url=None,
+    deepthought: bool = False,
+    model_kwargs: dict = {},
+    tracer: dict = {},
+) -> ResponseWithThought:
+    """
+    Synchronous helper using the OpenAI Responses API in streaming mode under the hood.
+    Aggregates streamed deltas and returns a ResponseWithThought.
+    """
+    client_key = f"{openai_api_key}--{api_base_url}"
+    client = openai_clients.get(client_key)
+    if not client:
+        client = get_openai_client(openai_api_key, api_base_url)
+        openai_clients[client_key] = client
+
+    formatted_messages = format_message_for_api(messages, api_base_url)
+    # Move the first system message to Responses API instructions
+    instructions: Optional[str] = None
+    if formatted_messages and formatted_messages[0].get("role") == "system":
+        instructions = _extract_text_for_instructions(formatted_messages[0].get("content")) or None
+        formatted_messages = formatted_messages[1:]
+
+    model_kwargs = deepcopy(model_kwargs)
+    model_kwargs["top_p"] = model_kwargs.get("top_p", 0.95)
+    # Configure thinking for openai reasoning models
+    if is_openai_reasoning_model(model_name, api_base_url):
+        temperature = 1
+        reasoning_effort = "medium" if deepthought else "low"
+        model_kwargs["reasoning"] = {"effort": reasoning_effort, "summary": "auto"}
+        # Remove unsupported params for reasoning models
+        model_kwargs.pop("top_p", None)
+        model_kwargs.pop("stop", None)
+
+    read_timeout = 300 if is_local_api(api_base_url) else 60
+
+    # Stream and aggregate
+    model_response: OpenAIResponse = client.responses.create(
+        input=formatted_messages,
+        instructions=instructions,
+        model=model_name,
+        temperature=temperature,
+        timeout=httpx.Timeout(30, read=read_timeout),  # type: ignore
+        store=False,
+        include=["reasoning.encrypted_content"],
+        **model_kwargs,
+    )
+    if not model_response or not isinstance(model_response, OpenAIResponse) or not model_response.output:
+        raise ValueError(f"Empty response returned by {model_name}.")
+
+    raw_content = [item.model_dump() for item in model_response.output]
+    aggregated_text = model_response.output_text
+    thoughts = ""
+    tool_calls: List[ToolCall] = []
+    for item in model_response.output:
+        if isinstance(item, ResponseFunctionToolCall):
+            tool_calls.append(ToolCall(name=item.name, args=json.loads(item.arguments), id=item.call_id))
+        elif isinstance(item, ResponseReasoningItem):
+            thoughts = "\n\n".join([summary.text for summary in item.summary])
+
+    if tool_calls:
+        if thoughts and aggregated_text:
+            # If there are tool calls, aggregate thoughts and responses into thoughts
+            thoughts = "\n".join([f"*{line.strip()}*" for line in thoughts.splitlines() if line.strip()])
+            thoughts = f"{thoughts}\n\n{aggregated_text}"
+        else:
+            thoughts = thoughts or aggregated_text
+        # Json dump tool calls into aggregated response
+        aggregated_text = json.dumps([tool_call.__dict__ for tool_call in tool_calls])
+
+    # Usage/cost tracking
+    input_tokens = model_response.usage.input_tokens if model_response and model_response.usage else 0
+    output_tokens = model_response.usage.output_tokens if model_response and model_response.usage else 0
+    cost = 0
+    cache_read_tokens = 0
+    if model_response and model_response.usage and model_response.usage.input_tokens_details:
+        cache_read_tokens = model_response.usage.input_tokens_details.cached_tokens
+        input_tokens -= cache_read_tokens
+    tracer["usage"] = get_chat_usage_metrics(
+        model_name, input_tokens, output_tokens, cache_read_tokens, usage=tracer.get("usage"), cost=cost
+    )
+
+    # Validate final aggregated text (either message or tool-calls JSON)
+    if is_none_or_empty(aggregated_text):
+        logger.warning(f"No response by {model_name}\nLast Message by {messages[-1].role}: {messages[-1].content}.")
+        raise ValueError(f"Empty or no response by {model_name} over Responses API. Retry if needed.")
+
+    # Trace
+    tracer["chat_model"] = model_name
+    tracer["temperature"] = temperature
+    if is_promptrace_enabled():
+        commit_conversation_trace(messages, aggregated_text, tracer)
+
+    return ResponseWithThought(text=aggregated_text, thought=thoughts, raw_content=raw_content)
+
+
+@retry(
+    retry=(
+        retry_if_exception_type(openai._exceptions.APITimeoutError)
+        | retry_if_exception_type(openai._exceptions.RateLimitError)
+        | retry_if_exception_type(openai._exceptions.InternalServerError)
+        | retry_if_exception_type(ValueError)
+    ),
+    wait=wait_exponential(multiplier=1, min=4, max=10),
+    stop=stop_after_attempt(3),
+    before_sleep=before_sleep_log(logger, logging.WARNING),
+    reraise=False,
+)
+async def responses_chat_completion_with_backoff(
+    messages: list[ChatMessage],
+    model_name: str,
+    temperature,
+    openai_api_key=None,
+    api_base_url=None,
+    deepthought=False,  # Unused; parity with legacy signature
+    tracer: dict = {},
+) -> AsyncGenerator[ResponseWithThought, None]:
+    """
+    Async streaming helper using the OpenAI Responses API.
+    Yields ResponseWithThought chunks as text/think deltas arrive.
+    """
+    client_key = f"{openai_api_key}--{api_base_url}"
+    client = openai_async_clients.get(client_key)
+    if not client:
+        client = get_openai_async_client(openai_api_key, api_base_url)
+        openai_async_clients[client_key] = client
+
+    formatted_messages = format_message_for_api(messages, api_base_url)
+    # Move the first system message to Responses API instructions
+    instructions: Optional[str] = None
+    if formatted_messages and formatted_messages[0].get("role") == "system":
+        instructions = _extract_text_for_instructions(formatted_messages[0].get("content")) or None
+        formatted_messages = formatted_messages[1:]
+
+    model_kwargs: dict = {}
+    model_kwargs["top_p"] = model_kwargs.get("top_p", 0.95)
+    # Configure thinking for openai reasoning models
+    if is_openai_reasoning_model(model_name, api_base_url):
+        temperature = 1
+        reasoning_effort = "medium" if deepthought else "low"
+        model_kwargs["reasoning"] = {"effort": reasoning_effort, "summary": "auto"}
+        # Remove unsupported params for reasoning models
+        model_kwargs.pop("top_p", None)
+        model_kwargs.pop("stop", None)
+
+    read_timeout = 300 if is_local_api(api_base_url) else 60
+
+    aggregated_text = ""
+    last_final: Optional[OpenAIResponse] = None
+    # Tool call assembly buffers
+    tool_calls_args: Dict[str, str] = {}
+    tool_calls_name: Dict[str, str] = {}
+    tool_call_order: List[str] = []
+
+    async with client.responses.stream(
+        input=formatted_messages,
+        instructions=instructions,
+        model=model_name,
+        temperature=temperature,
+        timeout=httpx.Timeout(30, read=read_timeout),
+        **model_kwargs,
+    ) as stream:  # type: ignore
+        async for event in stream:  # type: ignore
+            et = getattr(event, "type", "")
+            if et == "response.output_text.delta":
+                delta = getattr(event, "delta", "") or getattr(event, "output_text", "")
+                if delta:
+                    aggregated_text += delta
+                    yield ResponseWithThought(text=delta)
+            elif et == "response.reasoning.delta":
+                delta = getattr(event, "delta", "")
+                if delta:
+                    yield ResponseWithThought(thought=delta)
+            elif et == "response.tool_call.created":
+                item = getattr(event, "item", None)
+                tool_id = (
+                    getattr(event, "id", None)
+                    or getattr(event, "tool_call_id", None)
+                    or (getattr(item, "id", None) if item is not None else None)
+                )
+                name = (
+                    getattr(event, "name", None)
+                    or (getattr(item, "name", None) if item is not None else None)
+                    or getattr(event, "tool_name", None)
+                )
+                if tool_id:
+                    if tool_id not in tool_calls_args:
+                        tool_calls_args[tool_id] = ""
+                        tool_call_order.append(tool_id)
+                    if name:
+                        tool_calls_name[tool_id] = name
+            elif et == "response.tool_call.delta":
+                tool_id = getattr(event, "id", None) or getattr(event, "tool_call_id", None)
+                delta = getattr(event, "delta", None)
+                if hasattr(delta, "arguments"):
+                    arg_delta = getattr(delta, "arguments", "")
+                else:
+                    arg_delta = delta if isinstance(delta, str) else getattr(event, "arguments", "")
+                if tool_id and arg_delta:
+                    tool_calls_args[tool_id] = tool_calls_args.get(tool_id, "") + arg_delta
+                    if tool_id not in tool_call_order:
+                        tool_call_order.append(tool_id)
+            elif et == "response.tool_call.completed":
+                item = getattr(event, "item", None)
+                tool_id = (
+                    getattr(event, "id", None)
+                    or getattr(event, "tool_call_id", None)
+                    or (getattr(item, "id", None) if item is not None else None)
+                )
+                args_final = None
+                if item is not None:
+                    args_final = getattr(item, "arguments", None) or getattr(item, "args", None)
+                if tool_id and args_final:
+                    tool_calls_args[tool_id] = args_final if isinstance(args_final, str) else json.dumps(args_final)
+                    if tool_id not in tool_call_order:
+                        tool_call_order.append(tool_id)
+            # ignore other events for now
+        last_final = await stream.get_final_response()
+
+    # Usage/cost tracking after stream ends
+    input_tokens = last_final.usage.input_tokens if last_final and last_final.usage else 0
+    output_tokens = last_final.usage.output_tokens if last_final and last_final.usage else 0
+    cost = 0
+    tracer["usage"] = get_chat_usage_metrics(
+        model_name, input_tokens, output_tokens, usage=tracer.get("usage"), cost=cost
+    )
+
+    # If there are tool calls, package them into aggregated text for tracing parity
+    if tool_call_order:
+        packaged_tool_calls: List[ToolCall] = []
+        for tool_id in tool_call_order:
+            name = tool_calls_name.get(tool_id) or ""
+            args_str = tool_calls_args.get(tool_id, "")
+            try:
+                args = json.loads(args_str) if isinstance(args_str, str) else args_str
+            except Exception:
+                logger.warning(f"Failed to parse tool call arguments for {tool_id}: {args_str}")
+                args = {}
+            packaged_tool_calls.append(ToolCall(name=name, args=args, id=tool_id))
+        # Move any text into trace thought
+        tracer_text = aggregated_text
+        aggregated_text = json.dumps([tc.__dict__ for tc in packaged_tool_calls])
+        # Save for trace below
+        if tracer_text:
+            tracer.setdefault("_responses_stream_text", tracer_text)
+
+    if is_none_or_empty(aggregated_text):
+        logger.warning(f"No response by {model_name}\nLast Message by {messages[-1].role}: {messages[-1].content}.")
+        raise ValueError(f"Empty or no response by {model_name} over Responses API. Retry if needed.")
+
+    tracer["chat_model"] = model_name
+    tracer["temperature"] = temperature
+    if is_promptrace_enabled():
+        # If tool-calls were present, include any streamed text in the trace thought
+        trace_payload = aggregated_text
+        if tracer.get("_responses_stream_text"):
+            thoughts = tracer.pop("_responses_stream_text")
+            trace_payload = thoughts
+        commit_conversation_trace(messages, trace_payload, tracer)
+
+
 def get_structured_output_support(model_name: str, api_base_url: str = None) -> StructuredOutputSupport:
     if model_name.startswith("deepseek-reasoner"):
         return StructuredOutputSupport.NONE
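For orientation, a minimal usage sketch of the new synchronous helper. The model name and API key are placeholders, and ChatMessage is assumed to be the langchain message type hinted in the signatures above:

    messages = [
        ChatMessage(role="system", content="You are Khoj."),
        ChatMessage(role="user", content="What is the capital of France?"),
    ]
    response = responses_completion_with_backoff(
        messages=messages,
        model_name="gpt-5-mini",  # hypothetical model name for illustration
        openai_api_key="sk-...",
    )
    print(response.text)     # final answer text, or tool-calls JSON if tools fired
    print(response.thought)  # aggregated reasoning summaries, if any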
@@ -413,6 +707,12 @@ def format_message_for_api(raw_messages: List[ChatMessage], api_base_url: str) -
         # Handle tool call and tool result message types
         message_type = message.additional_kwargs.get("message_type")
         if message_type == "tool_call":
+            if is_openai_api(api_base_url):
+                for part in message.content:
+                    if "status" in part:
+                        part.pop("status")  # Drop unsupported tool call status field
+                formatted_messages.extend(message.content)
+                continue
             # Convert tool_call to OpenAI function call format
             content = []
             for part in message.content:
@@ -451,14 +751,23 @@ def format_message_for_api(raw_messages: List[ChatMessage], api_base_url: str) -
             if not tool_call_id:
                 logger.warning(f"Dropping tool result without valid tool_call_id: {part.get('name')}")
                 continue
-            formatted_messages.append(
-                {
-                    "role": "tool",
-                    "tool_call_id": tool_call_id,
-                    "name": part.get("name"),
-                    "content": part.get("content"),
-                }
-            )
+            if is_openai_api(api_base_url):
+                formatted_messages.append(
+                    {
+                        "type": "function_call_output",
+                        "call_id": tool_call_id,
+                        "output": part.get("content") or "No output",
+                    }
+                )
+            else:
+                formatted_messages.append(
+                    {
+                        "role": "tool",
+                        "tool_call_id": tool_call_id,
+                        "name": part.get("name"),
+                        "content": part.get("content") or "No output",
+                    }
+                )
             continue
         if isinstance(message.content, list) and not is_openai_api(api_base_url):
             assistant_texts = []
@@ -490,6 +799,11 @@ def format_message_for_api(raw_messages: List[ChatMessage], api_base_url: str) -
                     message.content.remove(part)
                 elif part["type"] == "image_url" and not part.get("image_url"):
                     message.content.remove(part)
+                # OpenAI models use the Responses API which uses slightly different content types
+                if part["type"] == "text":
+                    part["type"] = "output_text" if message.role == "assistant" else "input_text"
+                if part["type"] == "image":
+                    part["type"] = "output_image" if message.role == "assistant" else "input_image"
             # If no valid content parts left, remove the message
             if is_none_or_empty(message.content):
                 messages.remove(message)
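To make the content-type remapping above concrete, a small sketch of the transformation applied to a user message part (example dict is illustrative):

    part = {"type": "text", "text": "hi"}
    role = "user"  # not "assistant", so plain text becomes input_text
    if part["type"] == "text":
        part["type"] = "output_text" if role == "assistant" else "input_text"
    assert part == {"type": "input_text", "text": "hi"}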
@@ -514,7 +828,9 @@ def is_openai_reasoning_model(model_name: str, api_base_url: str = None) -> bool
     """
     Check if the model is an OpenAI reasoning model
     """
-    return model_name.lower().startswith("o") and is_openai_api(api_base_url)
+    return is_openai_api(api_base_url) and (
+        model_name.lower().startswith("o") or model_name.lower().startswith("gpt-5")
+    )
 
 
 def is_non_streaming_model(model_name: str, api_base_url: str = None) -> bool:
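With the widened predicate, gpt-5 family models are now treated as reasoning models alongside the o-series. Expected behavior, assuming is_openai_api() recognizes the official endpoint:

    base = "https://api.openai.com/v1"
    assert is_openai_reasoning_model("o3-mini", base)
    assert is_openai_reasoning_model("gpt-5", base)
    assert not is_openai_reasoning_model("gpt-4o", base)  # starts with neither "o" nor "gpt-5"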
@@ -610,6 +926,9 @@ async def astream_thought_processor(
         if not chunk_data.get("object") or chunk_data.get("object") != "chat.completion.chunk":
             logger.warning(f"Skipping invalid chunk with object field: {chunk_data.get('object', 'missing')}")
             continue
+        # Handle unsupported service tiers like "on_demand" by Groq
+        if chunk.service_tier and chunk.service_tier == "on_demand":
+            chunk_data["service_tier"] = "auto"
 
         tchunk = ChatCompletionWithThoughtsChunk.model_validate(chunk_data)
 
@@ -851,20 +1170,32 @@ def add_qwen_no_think_tag(formatted_messages: List[dict]) -> None:
             break
 
 
-def to_openai_tools(tools: List[ToolDefinition]) -> List[Dict] | None:
+def to_openai_tools(tools: List[ToolDefinition], use_responses_api: bool) -> List[Dict] | None:
     "Transform tool definitions from standard format to OpenAI format."
-    openai_tools = [
-        {
-            "type": "function",
-            "function": {
+    if use_responses_api:
+        openai_tools = [
+            {
+                "type": "function",
                 "name": tool.name,
                 "description": tool.description,
                 "parameters": clean_response_schema(tool.schema),
                 "strict": True,
-            },
-        }
-        for tool in tools
-    ]
+            }
+            for tool in tools
+        ]
+    else:
+        openai_tools = [
+            {
+                "type": "function",
+                "function": {
+                    "name": tool.name,
+                    "description": tool.description,
+                    "parameters": clean_response_schema(tool.schema),
+                    "strict": True,
+                },
+            }
+            for tool in tools
+        ]
 
     return openai_tools or None
 
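The two branches yield differently shaped tool specs. A sketch of both outputs for one hypothetical tool definition, with the ToolDefinition fields inferred from the attribute accesses above:

    tool = ToolDefinition(
        name="search_notes",
        description="Semantic search over the user's notes.",
        schema={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
    )
    # Responses API: function fields sit at the top level of each tool dict.
    to_openai_tools([tool], use_responses_api=True)
    # => [{"type": "function", "name": "search_notes", "description": ..., "parameters": ..., "strict": True}]
    # Chat Completions API: function fields are nested under a "function" key.
    to_openai_tools([tool], use_responses_api=False)
    # => [{"type": "function", "function": {"name": "search_notes", ...}}]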
khoj/processor/conversation/prompts.py
@@ -78,38 +78,6 @@ no_entries_found = PromptTemplate.from_template(
     """.strip()
 )
 
-## Conversation Prompts for Offline Chat Models
-## --
-system_prompt_offline_chat = PromptTemplate.from_template(
-    """
-You are Khoj, a smart, inquisitive and helpful personal assistant.
-- Use your general knowledge and past conversation with the user as context to inform your responses.
-- If you do not know the answer, say 'I don't know.'
-- Think step-by-step and ask questions to get the necessary information to answer the user's question.
-- Ask crisp follow-up questions to get additional context, when the answer cannot be inferred from the provided information or past conversations.
-- Do not print verbatim Notes unless necessary.
-
-Note: More information about you, the company or Khoj apps can be found at https://khoj.dev.
-Today is {day_of_week}, {current_date} in UTC.
-""".strip()
-)
-
-custom_system_prompt_offline_chat = PromptTemplate.from_template(
-    """
-You are {name}, a personal agent on Khoj.
-- Use your general knowledge and past conversation with the user as context to inform your responses.
-- If you do not know the answer, say 'I don't know.'
-- Think step-by-step and ask questions to get the necessary information to answer the user's question.
-- Ask crisp follow-up questions to get additional context, when the answer cannot be inferred from the provided information or past conversations.
-- Do not print verbatim Notes unless necessary.
-
-Note: More information about you, the company or Khoj apps can be found at https://khoj.dev.
-Today is {day_of_week}, {current_date} in UTC.
-
-Instructions:\n{bio}
-""".strip()
-)
-
 ## Notes Conversation
 ## --
 notes_conversation = PromptTemplate.from_template(
@@ -551,12 +519,13 @@ Q: {query}
 
 extract_questions_system_prompt = PromptTemplate.from_template(
     """
-You are Khoj, an extremely smart and helpful document search assistant with only the ability to retrieve information from the user's notes.
-Construct search queries to retrieve relevant information to answer the user's question.
+You are Khoj, an extremely smart and helpful document search assistant with only the ability to use natural language semantic search to retrieve information from the user's notes.
+Construct upto {max_queries} search queries to retrieve relevant information to answer the user's question.
 - You will be provided past questions(User), search queries(Assistant) and answers(A) for context.
-- Add as much context from the previous questions and answers as required into your search queries.
-- Break your search down into multiple search queries from a diverse set of lenses to retrieve all related documents.
-- Add date filters to your search queries from questions and answers when required to retrieve the relevant information.
+- You can use context from previous questions and answers to improve your search queries.
+- Break down your search into multiple search queries from a diverse set of lenses to retrieve all related documents. E.g who, what, where, when, why, how.
+- Add date filters to your search queries when required to retrieve the relevant information. This is the only structured query filter you can use.
+- Output 1 concept per query. Do not use boolean operators (OR/AND) to combine queries. They do not work and degrade search quality.
 - When asked a meta, vague or random questions, search for a variety of broad topics to answer the user's question.
 {personality_context}
 What searches will you perform to answer the users question? Respond with a JSON object with the key "queries" mapping to a list of searches you would perform on the user's knowledge base. Just return the queries and nothing else.
@@ -567,22 +536,27 @@ User's Location: {location}
 
 Here are some examples of how you can construct search queries to answer the user's question:
 
+Illustrate - Using diverse perspectives to retrieve all relevant documents
 User: How was my trip to Cambodia?
 Assistant: {{"queries": ["How was my trip to Cambodia?", "Angkor Wat temple visit", "Flight to Phnom Penh", "Expenses in Cambodia", "Stay in Cambodia"]}}
 A: The trip was amazing. You went to the Angkor Wat temple and it was beautiful.
 
+Illustrate - Combining date filters with natural language queries to retrieve documents in relevant date range
 User: What national parks did I go to last year?
 Assistant: {{"queries": ["National park I visited in {last_new_year} dt>='{last_new_year_date}' dt<'{current_new_year_date}'"]}}
 A: You visited the Grand Canyon and Yellowstone National Park in {last_new_year}.
 
+Illustrate - Using broad topics to answer meta or vague questions
 User: How can you help me?
 Assistant: {{"queries": ["Social relationships", "Physical and mental health", "Education and career", "Personal life goals and habits"]}}
 A: I can help you live healthier and happier across work and personal life
 
+Illustrate - Combining location and date in natural language queries with date filters to retrieve relevant documents
 User: Who all did I meet here yesterday?
 Assistant: {{"queries": ["Met in {location} on {yesterday_date} dt>='{yesterday_date}' dt<'{current_date}'"]}}
 A: Yesterday's note mentions your visit to your local beach with Ram and Shyam.
 
+Illustrate - Combining broad, diverse topics with date filters to answer meta or vague questions
 User: Share some random, interesting experiences from this month
 Assistant: {{"queries": ["Exciting travel adventures from {current_month}", "Fun social events dt>='{current_month}-01' dt<'{current_date}'", "Intense emotional experiences in {current_month}"]}}
 A: You had a great time at the local beach with your friends, attended a music concert and had a deep conversation with your friend, Khalid.
@@ -667,16 +641,17 @@ Here's some additional context about you:
 
 plan_function_execution = PromptTemplate.from_template(
     """
-You are Khoj, a smart, creative and meticulous researcher. Use the provided tool AIs to accomplish the task assigned to you.
+You are Khoj, a smart, creative and meticulous researcher.
 Create a multi-step plan and intelligently iterate on the plan to complete the task.
+Use the help of the provided tool AIs to accomplish the task assigned to you.
 {personality_context}
 
 # Instructions
-- Provide highly diverse, detailed requests to the tool AIs, one tool AI at a time, to gather information, perform actions etc. Their response will be shown to you in the next iteration.
-- Break down your research process into independent, self-contained steps that can be executed sequentially using the available tool AIs to answer the user's query. Write your step-by-step plan in the scratchpad.
-- Always ask a new query that was not asked to the tool AI in a previous iteration. Build on the results of the previous iterations.
+- Make detailed, self-contained requests to the tool AIs, one tool AI at a time, to gather information, perform actions etc.
+- Break down your research process into independent, self-contained steps that can be executed sequentially using the available tool AIs to accomplish the user assigned task.
 - Ensure that all required context is passed to the tool AIs for successful execution. Include any relevant stuff that has previously been attempted. They only know the context provided in your query.
 - Think step by step to come up with creative strategies when the previous iteration did not yield useful results.
+- Do not ask the user to confirm or clarify assumptions for information gathering tasks and non-destructive actions, as you can always adjust later — decide what the most reasonable assumption is, proceed with it, and document it for the user's reference after you finish acting.
 - You are allowed upto {max_iterations} iterations to use the help of the provided tool AIs to accomplish the task assigned to you. Only stop when you have completed the task.
 
 # Examples