langroid 0.31.1__py3-none-any.whl → 0.33.3__py3-none-any.whl

This diff represents the content of publicly available package versions released to a supported registry, and is provided for informational purposes only; it reflects the changes between the two versions as they appear in that registry.
Files changed (163)
  1. {langroid-0.31.1.dist-info → langroid-0.33.3.dist-info}/METADATA +150 -124
  2. langroid-0.33.3.dist-info/RECORD +7 -0
  3. {langroid-0.31.1.dist-info → langroid-0.33.3.dist-info}/WHEEL +1 -1
  4. langroid-0.33.3.dist-info/entry_points.txt +4 -0
  5. pyproject.toml +317 -212
  6. langroid/__init__.py +0 -106
  7. langroid/agent/.chainlit/config.toml +0 -121
  8. langroid/agent/.chainlit/translations/bn.json +0 -231
  9. langroid/agent/.chainlit/translations/en-US.json +0 -229
  10. langroid/agent/.chainlit/translations/gu.json +0 -231
  11. langroid/agent/.chainlit/translations/he-IL.json +0 -231
  12. langroid/agent/.chainlit/translations/hi.json +0 -231
  13. langroid/agent/.chainlit/translations/kn.json +0 -231
  14. langroid/agent/.chainlit/translations/ml.json +0 -231
  15. langroid/agent/.chainlit/translations/mr.json +0 -231
  16. langroid/agent/.chainlit/translations/ta.json +0 -231
  17. langroid/agent/.chainlit/translations/te.json +0 -231
  18. langroid/agent/.chainlit/translations/zh-CN.json +0 -229
  19. langroid/agent/__init__.py +0 -41
  20. langroid/agent/base.py +0 -1981
  21. langroid/agent/batch.py +0 -398
  22. langroid/agent/callbacks/__init__.py +0 -0
  23. langroid/agent/callbacks/chainlit.py +0 -598
  24. langroid/agent/chat_agent.py +0 -1899
  25. langroid/agent/chat_document.py +0 -454
  26. langroid/agent/helpers.py +0 -0
  27. langroid/agent/junk +0 -13
  28. langroid/agent/openai_assistant.py +0 -882
  29. langroid/agent/special/__init__.py +0 -59
  30. langroid/agent/special/arangodb/__init__.py +0 -0
  31. langroid/agent/special/arangodb/arangodb_agent.py +0 -656
  32. langroid/agent/special/arangodb/system_messages.py +0 -186
  33. langroid/agent/special/arangodb/tools.py +0 -107
  34. langroid/agent/special/arangodb/utils.py +0 -36
  35. langroid/agent/special/doc_chat_agent.py +0 -1466
  36. langroid/agent/special/lance_doc_chat_agent.py +0 -262
  37. langroid/agent/special/lance_rag/__init__.py +0 -9
  38. langroid/agent/special/lance_rag/critic_agent.py +0 -198
  39. langroid/agent/special/lance_rag/lance_rag_task.py +0 -82
  40. langroid/agent/special/lance_rag/query_planner_agent.py +0 -260
  41. langroid/agent/special/lance_tools.py +0 -61
  42. langroid/agent/special/neo4j/__init__.py +0 -0
  43. langroid/agent/special/neo4j/csv_kg_chat.py +0 -174
  44. langroid/agent/special/neo4j/neo4j_chat_agent.py +0 -433
  45. langroid/agent/special/neo4j/system_messages.py +0 -120
  46. langroid/agent/special/neo4j/tools.py +0 -32
  47. langroid/agent/special/relevance_extractor_agent.py +0 -127
  48. langroid/agent/special/retriever_agent.py +0 -56
  49. langroid/agent/special/sql/__init__.py +0 -17
  50. langroid/agent/special/sql/sql_chat_agent.py +0 -654
  51. langroid/agent/special/sql/utils/__init__.py +0 -21
  52. langroid/agent/special/sql/utils/description_extractors.py +0 -190
  53. langroid/agent/special/sql/utils/populate_metadata.py +0 -85
  54. langroid/agent/special/sql/utils/system_message.py +0 -35
  55. langroid/agent/special/sql/utils/tools.py +0 -64
  56. langroid/agent/special/table_chat_agent.py +0 -263
  57. langroid/agent/structured_message.py +0 -9
  58. langroid/agent/task.py +0 -2093
  59. langroid/agent/tool_message.py +0 -393
  60. langroid/agent/tools/__init__.py +0 -38
  61. langroid/agent/tools/duckduckgo_search_tool.py +0 -50
  62. langroid/agent/tools/file_tools.py +0 -234
  63. langroid/agent/tools/google_search_tool.py +0 -39
  64. langroid/agent/tools/metaphor_search_tool.py +0 -67
  65. langroid/agent/tools/orchestration.py +0 -303
  66. langroid/agent/tools/recipient_tool.py +0 -235
  67. langroid/agent/tools/retrieval_tool.py +0 -32
  68. langroid/agent/tools/rewind_tool.py +0 -137
  69. langroid/agent/tools/segment_extract_tool.py +0 -41
  70. langroid/agent/typed_task.py +0 -19
  71. langroid/agent/xml_tool_message.py +0 -382
  72. langroid/agent_config.py +0 -0
  73. langroid/cachedb/__init__.py +0 -17
  74. langroid/cachedb/base.py +0 -58
  75. langroid/cachedb/momento_cachedb.py +0 -108
  76. langroid/cachedb/redis_cachedb.py +0 -153
  77. langroid/embedding_models/__init__.py +0 -39
  78. langroid/embedding_models/base.py +0 -74
  79. langroid/embedding_models/clustering.py +0 -189
  80. langroid/embedding_models/models.py +0 -461
  81. langroid/embedding_models/protoc/__init__.py +0 -0
  82. langroid/embedding_models/protoc/embeddings.proto +0 -19
  83. langroid/embedding_models/protoc/embeddings_pb2.py +0 -33
  84. langroid/embedding_models/protoc/embeddings_pb2.pyi +0 -50
  85. langroid/embedding_models/protoc/embeddings_pb2_grpc.py +0 -79
  86. langroid/embedding_models/remote_embeds.py +0 -153
  87. langroid/exceptions.py +0 -65
  88. langroid/experimental/team-save.py +0 -391
  89. langroid/language_models/.chainlit/config.toml +0 -121
  90. langroid/language_models/.chainlit/translations/en-US.json +0 -231
  91. langroid/language_models/__init__.py +0 -53
  92. langroid/language_models/azure_openai.py +0 -153
  93. langroid/language_models/base.py +0 -678
  94. langroid/language_models/config.py +0 -18
  95. langroid/language_models/mock_lm.py +0 -124
  96. langroid/language_models/openai_gpt.py +0 -1923
  97. langroid/language_models/prompt_formatter/__init__.py +0 -16
  98. langroid/language_models/prompt_formatter/base.py +0 -40
  99. langroid/language_models/prompt_formatter/hf_formatter.py +0 -132
  100. langroid/language_models/prompt_formatter/llama2_formatter.py +0 -75
  101. langroid/language_models/utils.py +0 -147
  102. langroid/mytypes.py +0 -84
  103. langroid/parsing/__init__.py +0 -52
  104. langroid/parsing/agent_chats.py +0 -38
  105. langroid/parsing/code-parsing.md +0 -86
  106. langroid/parsing/code_parser.py +0 -121
  107. langroid/parsing/config.py +0 -0
  108. langroid/parsing/document_parser.py +0 -718
  109. langroid/parsing/image_text.py +0 -32
  110. langroid/parsing/para_sentence_split.py +0 -62
  111. langroid/parsing/parse_json.py +0 -155
  112. langroid/parsing/parser.py +0 -313
  113. langroid/parsing/repo_loader.py +0 -790
  114. langroid/parsing/routing.py +0 -36
  115. langroid/parsing/search.py +0 -275
  116. langroid/parsing/spider.py +0 -102
  117. langroid/parsing/table_loader.py +0 -94
  118. langroid/parsing/url_loader.py +0 -111
  119. langroid/parsing/url_loader_cookies.py +0 -73
  120. langroid/parsing/urls.py +0 -273
  121. langroid/parsing/utils.py +0 -373
  122. langroid/parsing/web_search.py +0 -155
  123. langroid/prompts/__init__.py +0 -9
  124. langroid/prompts/chat-gpt4-system-prompt.md +0 -68
  125. langroid/prompts/dialog.py +0 -17
  126. langroid/prompts/prompts_config.py +0 -5
  127. langroid/prompts/templates.py +0 -141
  128. langroid/pydantic_v1/__init__.py +0 -10
  129. langroid/pydantic_v1/main.py +0 -4
  130. langroid/utils/.chainlit/config.toml +0 -121
  131. langroid/utils/.chainlit/translations/en-US.json +0 -231
  132. langroid/utils/__init__.py +0 -19
  133. langroid/utils/algorithms/__init__.py +0 -3
  134. langroid/utils/algorithms/graph.py +0 -103
  135. langroid/utils/configuration.py +0 -98
  136. langroid/utils/constants.py +0 -30
  137. langroid/utils/docker.py +0 -37
  138. langroid/utils/git_utils.py +0 -252
  139. langroid/utils/globals.py +0 -49
  140. langroid/utils/llms/__init__.py +0 -0
  141. langroid/utils/llms/strings.py +0 -8
  142. langroid/utils/logging.py +0 -135
  143. langroid/utils/object_registry.py +0 -66
  144. langroid/utils/output/__init__.py +0 -20
  145. langroid/utils/output/citations.py +0 -41
  146. langroid/utils/output/printing.py +0 -99
  147. langroid/utils/output/status.py +0 -40
  148. langroid/utils/pandas_utils.py +0 -30
  149. langroid/utils/pydantic_utils.py +0 -602
  150. langroid/utils/system.py +0 -286
  151. langroid/utils/types.py +0 -93
  152. langroid/utils/web/__init__.py +0 -0
  153. langroid/utils/web/login.py +0 -83
  154. langroid/vector_store/__init__.py +0 -50
  155. langroid/vector_store/base.py +0 -357
  156. langroid/vector_store/chromadb.py +0 -214
  157. langroid/vector_store/lancedb.py +0 -401
  158. langroid/vector_store/meilisearch.py +0 -299
  159. langroid/vector_store/momento.py +0 -278
  160. langroid/vector_store/qdrant_cloud.py +0 -6
  161. langroid/vector_store/qdrantdb.py +0 -468
  162. langroid-0.31.1.dist-info/RECORD +0 -162
  163. {langroid-0.31.1.dist-info → langroid-0.33.3.dist-info/licenses}/LICENSE +0 -0
langroid/language_models/prompt_formatter/__init__.py DELETED
@@ -1,16 +0,0 @@
- from . import base
- from . import llama2_formatter
- from .base import PromptFormatter
- from .llama2_formatter import Llama2Formatter
- from ..config import PromptFormatterConfig
- from ..config import Llama2FormatterConfig
-
-
- __all__ = [
-     "PromptFormatter",
-     "Llama2Formatter",
-     "PromptFormatterConfig",
-     "Llama2FormatterConfig",
-     "base",
-     "llama2_formatter",
- ]
langroid/language_models/prompt_formatter/base.py DELETED
@@ -1,40 +0,0 @@
- import logging
- from abc import ABC, abstractmethod
- from typing import List
-
- from langroid.language_models.base import LLMMessage
- from langroid.language_models.config import PromptFormatterConfig
-
- logger = logging.getLogger(__name__)
-
-
- class PromptFormatter(ABC):
-     """
-     Abstract base class for a prompt formatter
-     """
-
-     def __init__(self, config: PromptFormatterConfig):
-         self.config = config
-
-     @staticmethod
-     def create(formatter: str) -> "PromptFormatter":
-         from langroid.language_models.config import HFPromptFormatterConfig
-         from langroid.language_models.prompt_formatter.hf_formatter import HFFormatter
-
-         return HFFormatter(HFPromptFormatterConfig(model_name=formatter))
-
-     @abstractmethod
-     def format(self, messages: List[LLMMessage]) -> str:
-         """
-         Convert sequence of messages (system, user, assistant, user, assistant...user)
-         to a single prompt formatted according to the specific format type,
-         to be used in a /completions endpoint.
-
-         Args:
-             messages (List[LLMMessage]): chat history as a sequence of messages
-
-         Returns:
-             (str): formatted version of chat history
-
-         """
-         pass
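The deleted `PromptFormatter` base class above is a small abstract interface: a config-carrying constructor, a `create()` factory, and a single `format()` method. Below is a minimal sketch of a concrete subclass, assuming langroid 0.31.x is installed and that `LLMMessage` exposes `role` and `content` as defined in `langroid.language_models.base`; the subclass name is hypothetical and this is illustration only, not part of the diff.

```python
from typing import List

from langroid.language_models.base import LLMMessage
from langroid.language_models.config import PromptFormatterConfig
from langroid.language_models.prompt_formatter.base import PromptFormatter


class PlainFormatter(PromptFormatter):
    """Hypothetical formatter: render each message as 'ROLE: content'."""

    def __init__(self, config: PromptFormatterConfig):
        super().__init__(config)

    def format(self, messages: List[LLMMessage]) -> str:
        # LLMMessage.role is assumed to be a string-valued enum, as in the base module.
        return "\n".join(f"{m.role.value.upper()}: {m.content}" for m in messages)


# prompt = PlainFormatter(PromptFormatterConfig()).format(history)  # history: List[LLMMessage]
```

The deleted implementations that follow (`HFFormatter`, `Llama2Formatter`) have the same shape, differing only in how `format()` renders the history.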
langroid/language_models/prompt_formatter/hf_formatter.py DELETED
@@ -1,132 +0,0 @@
- """
- Prompt formatter based on HuggingFace `AutoTokenizer.apply_chat_template` method
- from their Transformers library. It searches the hub for a model matching the
- specified name, and uses the first one it finds. We assume that all matching
- models will have the same tokenizer, so we just use the first one.
- """
-
- import logging
- import re
- from typing import Any, List, Set, Tuple, Type
-
- from jinja2.exceptions import TemplateError
-
- from langroid.language_models.base import LanguageModel, LLMMessage, Role
- from langroid.language_models.config import HFPromptFormatterConfig
- from langroid.language_models.prompt_formatter.base import PromptFormatter
-
- logger = logging.getLogger(__name__)
-
-
- def try_import_hf_modules() -> Tuple[Type[Any], Type[Any]]:
-     """
-     Attempts to import the AutoTokenizer class from the transformers package.
-     Returns:
-         The AutoTokenizer class if successful.
-     Raises:
-         ImportError: If the transformers package is not installed.
-     """
-     try:
-         from huggingface_hub import HfApi
-         from transformers import AutoTokenizer
-
-         return AutoTokenizer, HfApi
-     except ImportError:
-         raise ImportError(
-             """
-             You are trying to use some/all of:
-             HuggingFace transformers.AutoTokenizer,
-             huggingface_hub.HfApi,
-             but these are not not installed
-             by default with Langroid. Please install langroid using the
-             `transformers` extra, like so:
-             pip install "langroid[transformers]"
-             or equivalent.
-             """
-         )
-
-
- def find_hf_formatter(model_name: str) -> str:
-     AutoTokenizer, HfApi = try_import_hf_modules()
-     hf_api = HfApi()
-     # try to find a matching model, with progressivly shorter prefixes of model_name
-     model_name = model_name.lower().split("/")[-1]
-     parts = re.split("[:\\-_]", model_name)
-     parts = [p.lower() for p in parts if p != ""]
-     for i in range(len(parts), 0, -1):
-         prefix = "-".join(parts[:i])
-         models = hf_api.list_models(
-             task="text-generation",
-             model_name=prefix,
-         )
-         try:
-             mdl = next(models)
-             tokenizer = AutoTokenizer.from_pretrained(mdl.id)
-             if tokenizer.chat_template is not None:
-                 return str(mdl.id)
-             else:
-                 continue
-         except Exception:
-             continue
-
-     return ""
-
-
- class HFFormatter(PromptFormatter):
-     models: Set[str] = set() # which models have been used for formatting
-
-     def __init__(self, config: HFPromptFormatterConfig):
-         super().__init__(config)
-         AutoTokenizer, HfApi = try_import_hf_modules()
-         self.config: HFPromptFormatterConfig = config
-         hf_api = HfApi()
-         models = hf_api.list_models(
-             task="text-generation",
-             model_name=config.model_name,
-         )
-         try:
-             mdl = next(models)
-         except StopIteration:
-             raise ValueError(f"Model {config.model_name} not found on HuggingFace Hub")
-
-         self.tokenizer = AutoTokenizer.from_pretrained(mdl.id)
-         if self.tokenizer.chat_template is None:
-             raise ValueError(
-                 f"Model {config.model_name} does not support chat template"
-             )
-         elif mdl.id not in HFFormatter.models:
-             # only warn if this is the first time we've used this mdl.id
-             logger.warning(
-                 f"""
-                 Using HuggingFace {mdl.id} for prompt formatting:
-                 This is the CHAT TEMPLATE. If this is not what you intended,
-                 consider specifying a more complete model name for the formatter.
-
-                 {self.tokenizer.chat_template}
-                 """
-             )
-         HFFormatter.models.add(mdl.id)
-
-     def format(self, messages: List[LLMMessage]) -> str:
-         sys_msg, chat_msgs, user_msg = LanguageModel.get_chat_history_components(
-             messages
-         )
-         # build msg dicts expected by AutoTokenizer.apply_chat_template
-         sys_msg_dict = dict(role=Role.SYSTEM.value, content=sys_msg)
-         chat_dicts = []
-         for user, assistant in chat_msgs:
-             chat_dicts.append(dict(role=Role.USER.value, content=user))
-             chat_dicts.append(dict(role=Role.ASSISTANT.value, content=assistant))
-         chat_dicts.append(dict(role=Role.USER.value, content=user_msg))
-         all_dicts = [sys_msg_dict] + chat_dicts
-         try:
-             # apply chat template
-             result = self.tokenizer.apply_chat_template(all_dicts, tokenize=False)
-         except TemplateError:
-             # this likely means the model doesn't support a system msg,
-             # so combine it with the first user msg
-             first_user_msg = chat_msgs[0][0] if len(chat_msgs) > 0 else user_msg
-             first_user_msg = sys_msg + "\n\n" + first_user_msg
-             chat_dicts[0] = dict(role=Role.USER.value, content=first_user_msg)
-             result = self.tokenizer.apply_chat_template(chat_dicts, tokenize=False)
-         return str(result)
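The core of the deleted `HFFormatter.format()` is `AutoTokenizer.apply_chat_template`, a standard `transformers` API. A standalone sketch of that call outside langroid is shown below; the model name is only an example of a hub model that ships a chat template.

```python
from transformers import AutoTokenizer

# Any hub model whose tokenizer ships a chat_template will work here.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a neutron star?"},
]
# tokenize=False returns the rendered prompt string instead of token ids,
# which is what HFFormatter.format() ultimately returns.
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```

The `TemplateError` fallback in the deleted code handles templates that reject a system role by folding the system prompt into the first user message.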
langroid/language_models/prompt_formatter/llama2_formatter.py DELETED
@@ -1,75 +0,0 @@
- import logging
- from typing import List, Tuple
-
- from langroid.language_models.base import LanguageModel, LLMMessage
- from langroid.language_models.config import Llama2FormatterConfig
- from langroid.language_models.prompt_formatter.base import PromptFormatter
-
- logger = logging.getLogger(__name__)
-
-
- BOS: str = "<s>"
- EOS: str = "</s>"
- B_INST: str = "[INST]"
- E_INST: str = "[/INST]"
- B_SYS: str = "<<SYS>>\n"
- E_SYS: str = "\n<</SYS>>\n\n"
- SPECIAL_TAGS: List[str] = [B_INST, E_INST, BOS, EOS, "<<SYS>>", "<</SYS>>"]
-
-
- class Llama2Formatter(PromptFormatter):
-     def __int__(self, config: Llama2FormatterConfig) -> None:
-         super().__init__(config)
-         self.config: Llama2FormatterConfig = config
-
-     def format(self, messages: List[LLMMessage]) -> str:
-         sys_msg, chat_msgs, user_msg = LanguageModel.get_chat_history_components(
-             messages
-         )
-         return self._get_prompt_from_components(sys_msg, chat_msgs, user_msg)
-
-     def _get_prompt_from_components(
-         self,
-         system_prompt: str,
-         chat_history: List[Tuple[str, str]],
-         user_message: str,
-     ) -> str:
-         """
-         For llama2 models, convert chat history into a single
-         prompt for Llama2 models, for use in the /completions endpoint
-         (as opposed to the /chat/completions endpoint).
-         See:
-         https://www.reddit.com/r/LocalLLaMA/comments/155po2p/get_llama_2_prompt_format_right/
-         https://github.com/facebookresearch/llama/blob/main/llama/generation.py#L44
-
-         Args:
-             system_prompt (str): system prompt, typically specifying role/task.
-             chat_history (List[Tuple[str,str]]): List of (user, assistant) pairs
-             user_message (str): user message, at the end of the chat, i.e. the message
-                 for which we want to generate a response.
-
-         Returns:
-             str: Prompt for Llama2 models
-
-         Typical structure of the formatted prompt:
-         Note important that the first [INST], [/INST] surrounds the system prompt,
-         together with the first user message. A lot of libs seem to miss this detail.
-
-         <s>[INST] <<SYS>>
-         You are are a helpful... bla bla.. assistant
-         <</SYS>>
-
-         Hi there! [/INST] Hello! How can I help you today? </s><s>[INST]
-         What is a neutron star? [/INST] A neutron star is a ... </s><s>
-         [INST] Okay cool, thank you! [/INST] You're welcome! </s><s>
-         [INST] Ah, I have one more question.. [/INST]
-         """
-         bos = BOS if self.config.use_bos_eos else ""
-         eos = EOS if self.config.use_bos_eos else ""
-         text = f"{bos}{B_INST} {B_SYS}{system_prompt}{E_SYS}"
-         for user_input, response in chat_history:
-             text += (
-                 f"{user_input.strip()} {E_INST} {response.strip()} {eos}{bos} {B_INST} "
-             )
-         text += f"{user_message.strip()} {E_INST}"
-         return text
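The docstring above fully specifies the Llama-2 prompt layout; the following is a small self-contained sketch of the same assembly with BOS/EOS enabled, for illustration only (the example history strings are arbitrary).

```python
# Llama-2 special tokens, as defined in the deleted module above.
BOS, EOS = "<s>", "</s>"
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

system_prompt = "You are a helpful assistant."
history = [("Hi there!", "Hello! How can I help you today?")]
user_message = "What is a neutron star?"

# The first [INST] ... [/INST] wraps the system prompt together with the first
# user turn, which is the detail the docstring above calls out.
text = f"{BOS}{B_INST} {B_SYS}{system_prompt}{E_SYS}"
for user, assistant in history:
    text += f"{user.strip()} {E_INST} {assistant.strip()} {EOS}{BOS} {B_INST} "
text += f"{user_message.strip()} {E_INST}"
print(text)
```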
langroid/language_models/utils.py DELETED
@@ -1,147 +0,0 @@
- # from openai-cookbook
- import asyncio
- import logging
- import random
- import time
- from typing import Any, Callable, Dict, List
-
- import aiohttp
- import openai
- import requests
-
- logger = logging.getLogger(__name__)
- # setlevel to warning
- logger.setLevel(logging.WARNING)
-
-
- # define a retry decorator
- def retry_with_exponential_backoff(
-     func: Callable[..., Any],
-     initial_delay: float = 1,
-     exponential_base: float = 1.3,
-     jitter: bool = True,
-     max_retries: int = 5,
-     errors: tuple = ( # type: ignore
-         requests.exceptions.RequestException,
-         openai.APITimeoutError,
-         openai.RateLimitError,
-         openai.AuthenticationError,
-         openai.APIError,
-         aiohttp.ServerTimeoutError,
-         asyncio.TimeoutError,
-     ),
- ) -> Callable[..., Any]:
-     """Retry a function with exponential backoff."""
-
-     def wrapper(*args: List[Any], **kwargs: Dict[Any, Any]) -> Any:
-         # Initialize variables
-         num_retries = 0
-         delay = initial_delay
-
-         # Loop until a successful response or max_retries is hit or exception is raised
-         while True:
-             try:
-                 return func(*args, **kwargs)
-
-             except openai.BadRequestError as e:
-                 # do not retry when the request itself is invalid,
-                 # e.g. when context is too long
-                 logger.error(f"OpenAI API request failed with error: {e}.")
-                 raise e
-             except openai.AuthenticationError as e:
-                 # do not retry when there's an auth error
-                 logger.error(f"OpenAI API request failed with error: {e}.")
-                 raise e
-
-             # Retry on specified errors
-             except errors as e:
-                 # Increment retries
-                 num_retries += 1
-
-                 # Check if max retries has been reached
-                 if num_retries > max_retries:
-                     raise Exception(
-                         f"Maximum number of retries ({max_retries}) exceeded."
-                         f" Last error: {str(e)}."
-                     )
-
-                 # Increment the delay
-                 delay *= exponential_base * (1 + jitter * random.random())
-                 logger.warning(
-                     f"""OpenAI API request failed with error:
-                     {e}.
-                     Retrying in {delay} seconds..."""
-                 )
-                 # Sleep for the delay
-                 time.sleep(delay)
-
-             # Raise exceptions for any errors not specified
-             except Exception as e:
-                 raise e
-
-     return wrapper
-
-
- def async_retry_with_exponential_backoff(
-     func: Callable[..., Any],
-     initial_delay: float = 1,
-     exponential_base: float = 1.3,
-     jitter: bool = True,
-     max_retries: int = 5,
-     errors: tuple = ( # type: ignore
-         openai.APITimeoutError,
-         openai.RateLimitError,
-         openai.AuthenticationError,
-         openai.APIError,
-         aiohttp.ServerTimeoutError,
-         asyncio.TimeoutError,
-     ),
- ) -> Callable[..., Any]:
-     """Retry a function with exponential backoff."""
-
-     async def wrapper(*args: List[Any], **kwargs: Dict[Any, Any]) -> Any:
-         # Initialize variables
-         num_retries = 0
-         delay = initial_delay
-
-         # Loop until a successful response or max_retries is hit or exception is raised
-         while True:
-             try:
-                 result = await func(*args, **kwargs)
-                 return result
-
-             except openai.BadRequestError as e:
-                 # do not retry when the request itself is invalid,
-                 # e.g. when context is too long
-                 logger.error(f"OpenAI API request failed with error: {e}.")
-                 raise e
-             except openai.AuthenticationError as e:
-                 # do not retry when there's an auth error
-                 logger.error(f"OpenAI API request failed with error: {e}.")
-                 raise e
-             # Retry on specified errors
-             except errors as e:
-                 # Increment retries
-                 num_retries += 1
-
-                 # Check if max retries has been reached
-                 if num_retries > max_retries:
-                     raise Exception(
-                         f"Maximum number of retries ({max_retries}) exceeded."
-                         f" Last error: {str(e)}."
-                     )
-
-                 # Increment the delay
-                 delay *= exponential_base * (1 + jitter * random.random())
-                 logger.warning(
-                     f"""OpenAI API request failed with error{e}.
-                     Retrying in {delay} seconds..."""
-                 )
-                 # Sleep for the delay
-                 time.sleep(delay)
-
-             # Raise exceptions for any errors not specified
-             except Exception as e:
-                 raise e
-
-     return wrapper
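Both helpers above return a wrapping function, so they can be applied as plain decorators. A hedged usage sketch follows, assuming langroid 0.31.x (where this module still exists) and the `openai` 1.x client; the model name and helper function are examples, not part of the package.

```python
import openai

from langroid.language_models.utils import retry_with_exponential_backoff


@retry_with_exponential_backoff
def complete(prompt: str) -> str:
    # Assumes OPENAI_API_KEY is set in the environment.
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""


# Transient errors (RateLimitError, APIError, timeouts) are retried with
# exponential backoff; BadRequestError and AuthenticationError are raised immediately.
print(complete("Say hello in one word."))
```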
langroid/mytypes.py DELETED
@@ -1,84 +0,0 @@
- from enum import Enum
- from textwrap import dedent
- from typing import Any, Callable, Dict, List, Union
- from uuid import uuid4
-
- from langroid.pydantic_v1 import BaseModel, Extra, Field
-
- Number = Union[int, float]
- Embedding = List[Number]
- Embeddings = List[Embedding]
- EmbeddingFunction = Callable[[List[str]], Embeddings]
-
-
- class Entity(str, Enum):
-     """
-     Enum for the different types of entities that can respond to the current message.
-     """
-
-     AGENT = "Agent"
-     LLM = "LLM"
-     USER = "User"
-     SYSTEM = "System"
-
-     def __eq__(self, other: object) -> bool:
-         """Allow case-insensitive equality (==) comparison with strings."""
-         if other is None:
-             return False
-         if isinstance(other, str):
-             return self.value.lower() == other.lower()
-         return super().__eq__(other)
-
-     def __ne__(self, other: object) -> bool:
-         """Allow case-insensitive non-equality (!=) comparison with strings."""
-         return not self.__eq__(other)
-
-     def __hash__(self) -> int:
-         """Override this to ensure hashability of the enum,
-         so it can be used sets and dictionary keys.
-         """
-         return hash(self.value.lower())
-
-
- class DocMetaData(BaseModel):
-     """Metadata for a document."""
-
-     source: str = "context"
-     is_chunk: bool = False # if it is a chunk, don't split
-     id: str = Field(default_factory=lambda: str(uuid4()))
-     window_ids: List[str] = [] # for RAG: ids of chunks around this one
-
-     def dict_bool_int(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
-         """
-         Special dict method to convert bool fields to int, to appease some
-         downstream libraries, e.g. Chroma which complains about bool fields in
-         metadata.
-         """
-         original_dict = super().dict(*args, **kwargs)
-
-         for key, value in original_dict.items():
-             if isinstance(value, bool):
-                 original_dict[key] = 1 * value
-
-         return original_dict
-
-     class Config:
-         extra = Extra.allow
-
-
- class Document(BaseModel):
-     """Interface for interacting with a document."""
-
-     content: str
-     metadata: DocMetaData
-
-     def id(self) -> str:
-         return self.metadata.id
-
-     def __str__(self) -> str:
-         return dedent(
-             f"""
-             CONTENT: {self.content}
-             SOURCE:{self.metadata.source}
-             """
-         )
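`Document` and `DocMetaData` above are ordinary Pydantic v1 models, so constructing them is direct. A short sketch against langroid 0.31.x follows; the content string and source URL are placeholders.

```python
from langroid.mytypes import DocMetaData, Document

doc = Document(
    content="Neutron stars are the collapsed cores of massive stars.",
    metadata=DocMetaData(source="https://example.com/astro", is_chunk=False),
)

print(doc.id())                       # uuid4 string from the id field's default_factory
print(doc.metadata.dict_bool_int())   # bools converted to 0/1 for stores like Chroma
print(doc)                            # CONTENT / SOURCE rendering from __str__
```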
langroid/parsing/__init__.py DELETED
@@ -1,52 +0,0 @@
- from . import parser
- from . import agent_chats
- from . import code_parser
- from . import document_parser
- from . import parse_json
- from . import para_sentence_split
- from . import repo_loader
- from . import url_loader
- from . import table_loader
- from . import urls
- from . import utils
- from . import search
- from . import web_search
-
- from .parser import (
-     Splitter,
-     PdfParsingConfig,
-     DocxParsingConfig,
-     DocParsingConfig,
-     ParsingConfig,
-     Parser,
- )
-
- __all__ = [
-     "parser",
-     "agent_chats",
-     "code_parser",
-     "document_parser",
-     "parse_json",
-     "para_sentence_split",
-     "repo_loader",
-     "url_loader",
-     "table_loader",
-     "urls",
-     "utils",
-     "search",
-     "web_search",
-     "Splitter",
-     "PdfParsingConfig",
-     "DocxParsingConfig",
-     "DocParsingConfig",
-     "ParsingConfig",
-     "Parser",
- ]
-
- try:
-     from . import spider
-
-     spider
-     __all__.append("spider")
- except ImportError:
-     pass
langroid/parsing/agent_chats.py DELETED
@@ -1,38 +0,0 @@
- from typing import Tuple, no_type_check
-
- from pyparsing import Empty, Literal, ParseException, SkipTo, StringEnd, Word, alphanums
-
-
- @no_type_check
- def parse_message(msg: str) -> Tuple[str, str]:
-     """
-     Parse the intended recipient and content of a message.
-     Message format is assumed to be TO[<recipient>]:<message>.
-     The TO[<recipient>]: part is optional.
-
-     Args:
-         msg (str): message to parse
-
-     Returns:
-         str, str: task-name of intended recipient, and content of message
-             (if recipient is not specified, task-name is empty string)
-
-     """
-     if msg is None:
-         return "", ""
-
-     # Grammar definition
-     name = Word(alphanums)
-     to_start = Literal("TO[").suppress()
-     to_end = Literal("]:").suppress()
-     to_field = (to_start + name("name") + to_end) | Empty().suppress()
-     message = SkipTo(StringEnd())("text")
-
-     # Parser definition
-     parser = to_field + message
-
-     try:
-         parsed = parser.parseString(msg)
-         return parsed.name, parsed.text
-     except ParseException:
-         return "", msg
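`parse_message` implements the optional `TO[<recipient>]:` routing prefix described in its docstring. A usage sketch is shown below; the recipient name is an arbitrary example.

```python
from langroid.parsing.agent_chats import parse_message

# With an explicit recipient: returns the task name and the remaining text.
print(parse_message("TO[Planner]:draft a query plan"))  # ('Planner', 'draft a query plan')

# Without the TO[...]: prefix: the recipient is the empty string.
print(parse_message("just a plain message"))             # ('', 'just a plain message')
```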
langroid/parsing/code-parsing.md DELETED
@@ -1,86 +0,0 @@
- To split Python code files into meaningful chunks, you can use the `tree-sitter` library, which is a parser generator tool and an incremental parsing library. It can be used to parse source code into an abstract syntax tree (AST) and extract meaningful code blocks from it. Here's how you can use `tree-sitter` to achieve this:
-
- 1. Install the `tree-sitter` Python package:
- ```python
- pip install tree-sitter
- ```
-
- 2. Install the `tree-sitter-python` language grammar:
- ```bash
- git clone https://github.com/tree-sitter/tree-sitter-python
- ```
-
- 3. Use `tree-sitter` to parse Python code files and extract meaningful code
- blocks:
-
- ```python
- from tree_sitter import Language, Parser
-
- # Set the path to the tree-sitter-python language grammar
- TREE_SITTER_PYTHON_PATH = './tree-sitter-python'
-
- # Build the Python language
- Language.build_library(
-     'build/my-languages.so',
-     [TREE_SITTER_PYTHON_PATH]
- )
-
- PYTHON_LANGUAGE = Language('build/my-languages.so', 'python')
-
- # Create a parser
- parser = Parser()
- parser.set_language(PYTHON_LANGUAGE)
-
- # Parse the code
- code = """
- def foo():
-     return "Hello, World!"
-
- def bar():
-     return "Goodbye, World!"
- """
-
- tree = parser.parse(bytes(code, 'utf8'))
-
- # Extract meaningful code blocks (e.g., function definitions)
- def extract_functions(node):
-     functions = []
-     for child in node.children:
-         if child.type == 'function_definition':
-             start_byte = child.start_byte
-             end_byte = child.end_byte
-             functions.append(code[start_byte:end_byte])
-         functions.extend(extract_functions(child))
-     return functions
-
- functions = extract_functions(tree.root_node)
- print(functions)
- ```
-
- In the example provided, the `tree-sitter` library is used to parse the Python
- code into an abstract syntax tree (AST). The `extract_functions` function is
- then used to recursively traverse the AST and extract code blocks corresponding
- to function definitions. The extracted code blocks are stored in the `functions`
- list.
-
- The `extract_functions` function takes an AST node as input and returns a list
- of code blocks corresponding to function definitions. It checks whether the
- current node is of type `'function_definition'` (which corresponds to a function
- definition in Python code). If it is, the function extracts the corresponding
- code block from the original code using the `start_byte` and `end_byte`
- attributes of the node. The function then recursively processes the children of
- the current node to extract any nested function definitions.
-
- The resulting list `functions` contains the extracted code blocks, each
- representing a function definition from the original code. You can modify
- the `extract_functions` function to extract other types of code blocks (e.g.,
- class definitions, loops) by checking for different node types in the AST.
-
- Once you have extracted the code blocks, you can proceed with further
- processing, such as converting them into vectors and storing them in a vector
- database, as mentioned in the previous response.
-
- Note: The code provided in this response is a basic example to demonstrate the
- concept. Depending on your specific use case and requirements, you may need to
- extend or modify the code to handle more complex scenarios, such as handling
- comments, docstrings, and other code constructs.
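The deleted note above suggests extending `extract_functions` to other node types. A hedged sketch that also collects class definitions is given below; the helper name is hypothetical, node type names follow the `tree-sitter-python` grammar, and `code` is passed in rather than read from a global.

```python
def extract_blocks(node, code, types=("function_definition", "class_definition")):
    """Recursively collect source snippets for the given tree-sitter node types."""
    blocks = []
    for child in node.children:
        if child.type in types:
            # Slice the original source by the node's byte offsets.
            blocks.append(code[child.start_byte:child.end_byte])
        blocks.extend(extract_blocks(child, code, types))
    return blocks


# blocks = extract_blocks(tree.root_node, code)  # tree, code as in the example above
```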