autodocgenerator 0.9.0.1__py3-none-any.whl → 0.9.0.3__py3-none-any.whl

@@ -1,839 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: autodocgenerator
3
- Version: 0.9.0.1
4
- Summary: This Project helps you to create docs for your projects
5
- License: MIT
6
- Author: dima-on
7
- Author-email: sinica911@gmail.com
8
- Requires-Python: >=3.11,<4.0
9
- Classifier: License :: OSI Approved :: MIT License
10
- Classifier: Programming Language :: Python :: 3
11
- Classifier: Programming Language :: Python :: 3.11
12
- Classifier: Programming Language :: Python :: 3.12
13
- Classifier: Programming Language :: Python :: 3.13
14
- Classifier: Programming Language :: Python :: 3.14
15
- Requires-Dist: CacheControl (==0.14.4)
16
- Requires-Dist: Pygments (==2.19.2)
17
- Requires-Dist: RapidFuzz (==3.14.3)
18
- Requires-Dist: annotated-types (==0.7.0)
19
- Requires-Dist: anyio (==4.12.1)
20
- Requires-Dist: certifi (==2026.1.4)
21
- Requires-Dist: charset-normalizer (==3.4.4)
22
- Requires-Dist: cleo (==2.1.0)
23
- Requires-Dist: colorama (==0.4.6)
24
- Requires-Dist: crashtest (==0.4.1)
25
- Requires-Dist: distlib (==0.4.0)
26
- Requires-Dist: distro (==1.9.0)
27
- Requires-Dist: dulwich (==0.25.2)
28
- Requires-Dist: fastjsonschema (==2.21.2)
29
- Requires-Dist: filelock (==3.20.3)
30
- Requires-Dist: findpython (==0.7.1)
31
- Requires-Dist: google-auth (==2.47.0)
32
- Requires-Dist: google-genai (==1.56.0)
33
- Requires-Dist: groq (==1.0.0)
34
- Requires-Dist: h11 (==0.16.0)
35
- Requires-Dist: httpcore (==1.0.9)
36
- Requires-Dist: httpx (==0.28.1)
37
- Requires-Dist: idna (==3.11)
38
- Requires-Dist: installer (==0.7.0)
39
- Requires-Dist: jaraco.classes (==3.4.0)
40
- Requires-Dist: jaraco.context (==6.1.0)
41
- Requires-Dist: jaraco.functools (==4.4.0)
42
- Requires-Dist: jiter (==0.12.0)
43
- Requires-Dist: keyring (==25.7.0)
44
- Requires-Dist: markdown-it-py (==4.0.0)
45
- Requires-Dist: mdurl (==0.1.2)
46
- Requires-Dist: more-itertools (==10.8.0)
47
- Requires-Dist: msgpack (==1.1.2)
48
- Requires-Dist: openai (==2.14.0)
49
- Requires-Dist: packaging (==25.0)
50
- Requires-Dist: pbs-installer (==2026.1.14)
51
- Requires-Dist: pkginfo (==1.12.1.2)
52
- Requires-Dist: platformdirs (==4.5.1)
53
- Requires-Dist: pyasn1 (==0.6.1)
54
- Requires-Dist: pyasn1_modules (==0.4.2)
55
- Requires-Dist: pydantic (==2.12.5)
56
- Requires-Dist: pydantic_core (==2.41.5)
57
- Requires-Dist: pyproject_hooks (==1.2.0)
58
- Requires-Dist: python-dotenv (==1.2.1)
59
- Requires-Dist: pywin32-ctypes (==0.2.3)
60
- Requires-Dist: pyyaml (==6.0.3)
61
- Requires-Dist: requests (==2.32.5)
62
- Requires-Dist: requests-toolbelt (==1.0.0)
63
- Requires-Dist: rich (==14.2.0)
64
- Requires-Dist: rich_progress (==0.4.0)
65
- Requires-Dist: rsa (==4.9.1)
66
- Requires-Dist: shellingham (==1.5.4)
67
- Requires-Dist: sniffio (==1.3.1)
68
- Requires-Dist: tenacity (==9.1.2)
69
- Requires-Dist: tomlkit (==0.14.0)
70
- Requires-Dist: tqdm (==4.67.1)
71
- Requires-Dist: trove-classifiers (==2026.1.14.14)
72
- Requires-Dist: typing-inspection (==0.4.2)
73
- Requires-Dist: typing_extensions (==4.15.0)
74
- Requires-Dist: urllib3 (==2.6.2)
75
- Requires-Dist: virtualenv (==20.36.1)
76
- Requires-Dist: websockets (==15.0.1)
77
- Requires-Dist: zstandard (==0.25.0)
78
- Description-Content-Type: text/markdown
79
-
80
- ## Executive Navigation Tree
81
- - 📦 Installation & Workflow
82
- - [install-workflow-description](#install-workflow-description)
83
- - ⚙️ Configuration
84
- - [autodocconfig-options](#autodocconfig-options)
85
- - [config-reader-yaml-parsing](#config-reader-yaml-parsing)
86
- - [project-build-config-model](#project-build-config-model)
87
- - [projectsettings](#projectsettings)
88
- - 🏗️ Model & Architecture
89
- - [data-contract](#data-contract)
90
- - [logic-flow](#logic-flow)
91
- - [parent-model-hierarchy](#parent-model-hierarchy)
92
- - 📂 Modules & Management
93
- - [basemodule-definition](#basemodule-definition)
94
- - [docfactory-implementation](#docfactory-implementation)
95
- - [custommodule-implementation](#custommodule-implementation)
96
- - [intro-modules-implementation](#intro-modules-implementation)
97
- - [manager-class](#manager-class)
98
- - [manager-class-usage](#manager-class-usage)
99
- - [module-init-logger-setup](#module-init-logger-setup)
100
- - 📝 Document Generation
101
- - [asyncgptmodel-implementation](#asyncgptmodel-implementation)
102
- - [gptmodel-implementation](#gptmodel-implementation)
103
- - [document-generation-orchestrator](#document-generation-orchestrator)
104
- - [generate_descriptions_for_code](#generate_descriptions_for_code)
105
- - [gen_doc_parts](#gen_doc_parts)
106
- - [async_gen_doc_parts](#async_gen_doc_parts)
107
- - [write_docs_by_parts](#write_docs_by_parts)
108
- - [async_write_docs_by_parts](#async_write_docs_by_parts)
109
- - 📄 Content & Descriptions
110
- - [CONTENT_DESCRIPTION](#CONTENT_DESCRIPTION)
111
- - [generete_custom_discription](#generete_custom_discription)
112
- - [generete_custom_discription_without](#generete_custom_discription_without)
113
- - 🔗 Link & Text Processing
114
- - [extract_links_from_start](#extract_links_from_start)
115
- - [get_all_html_links](#get_all_html_links)
116
- - [RegexPattern](#["\\\']?(.*?)["\\\']?)
117
- - [get_introdaction](#get_introdaction)
118
- - [get_links_intro](#get_links_intro)
119
- - [split_text_by_anchors](#split_text_by_anchors)
120
- - [get_order](#get_order)
121
- - [split_data](#split_data)
122
- - 🛠️ Utilities
123
- - [code_mix](#code_mix)
124
- - 📦 Compression
125
- - [compress](#compress)
126
- - [compress_and_compare](#compress_and_compare)
127
- - [async_compress](#async_compress)
128
- - [async_compress_and_compare](#async_compress_and_compare)
129
- - [compress_to_one](#compress_to_one)
130
- - ❓ Miscellaneous
131
- - [missing-fragment](#missing-fragment)
132
-
133
-
134
-
135
- <a name="install-workflow-description"></a>
136
- **Installation workflow overview**
137
-
138
- 1. **Windows PowerShell execution**
139
- - Open a PowerShell terminal with administrative rights.
140
- - Run the following one‑liner, which downloads the PowerShell installer script directly from the project's repository and executes it in the same session:
141
- ```powershell
142
- irm <raw‑script‑url> | iex
143
- ```
144
- - The command uses `irm` (the alias for `Invoke-RestMethod`) to fetch the script content and pipes it to `iex` (`Invoke-Expression`) for immediate execution.
145
-
146
- 2. **Linux/macOS shell execution**
147
- - Open a terminal.
148
- - Execute the following command to retrieve the shell installer script from the repository and run it with `bash`:
149
- ```bash
150
- curl -sSL <raw‑script‑url> | bash
151
- ```
152
- - `curl` fetches the script silently (`-s`) while still reporting errors (`-S`) and following redirects (`-L`). The output is streamed to `bash` for execution.
153
-
154
- 3. **GitHub Actions secret configuration**
155
- - In the GitHub repository, navigate to **Settings → Secrets and variables → Actions**.
156
- - Add a new **secret** named `GROCK_API_KEY`.
157
- - Paste the API key you obtained from the Groq documentation into the value field.
158
- - Save the secret; the workflow will now have access to `GROCK_API_KEY` as an environment variable during runs.
159
-
160
- 4. **Workflow behavior**
161
- - When the GitHub Action triggers, it will reference the `GROCK_API_KEY` secret to authenticate calls to the Groq service.
162
- - The appropriate installer command (PowerShell on Windows runners, Bash on Linux/macOS runners) will be invoked, pulling the latest installer script from the repository and executing it automatically.
163
-
164
- **Key points to remember**
165
- - Use the raw file URL from the repository for both `irm` and `curl` commands.
166
- - Ensure the secret is correctly named and stored; GitHub masks its value in logs.
167
- - Run the commands in a clean environment to avoid conflicts with existing installations.
168
- <a name="autodocconfig-options"></a>
169
- The configuration file uses a top‑level mapping with several sections:
170
-
171
- **Project information**
172
- - `project_name`: a short title for the documentation generator.
173
- - `language`: the language code for the generated text (e.g., “en”).
174
-
175
- **Build section**
176
- - `save_logs`: set to `true` to keep generation logs, `false` to discard them.
177
- - `log_level`: numeric level controlling verbosity (higher values give more detail).
178
-
179
- **Structure section**
180
- - `include_intro_links`: `true` adds navigation links at the beginning.
181
- - `include_order`: `true` keeps the original order of the processed files.
182
- - `max_doc_part_size`: maximum size of each documentation chunk, expressed as an integer.
183
-
184
- **Additional information**
185
- - `global idea`: a free‑form description that will be inserted into the documentation as a project overview.
186
-
187
- **Custom descriptions**
188
- - A list of strings that define extra prompts for the generator. Each item can contain placeholders and URLs for installation instructions or other guidance.
189
-
190
- When creating the file, follow standard YAML syntax, using proper indentation for nested mappings and list items. Use boolean values (`true`/`false`) and integers where indicated. Write each custom description string on its own line, prefixed with a hyphen.
191
- <a name="config-reader-yaml-parsing"></a>
192
- ## Config Reader – YAML Parsing
193
-
194
- The **`read_config`** function deserialises a YAML string into three concrete objects used throughout the runner.
195
-
196
- | Entity | Type | Role | Notes |
197
- |--------|------|------|-------|
198
- | `file_data` | `str` | Raw YAML payload | Must be UTF‑8 encoded |
199
- | `config` | `Config` | Global project configuration | Populated via `Config` setters |
200
- | `custom_modules` | `list[CustomModule│CustomModuleWithOutContext]` | Extension points for documentation generators | Determined by leading “%” token |
201
- | `structure_settings_object` | `StructureSettings` | Controls output segmentation and linking | Loads arbitrary keys from `structure_settings` dict |
202
-
203
- **Logic flow**
204
- 1. `yaml.safe_load` → `data` (dict).
205
- 2. Instantiate `Config` & `ProjectBuildConfig`.
206
- 3. Pull `ignore_files`, `language`, `project_name`, `project_additional_info`, `build_settings` from `data`.
207
- 4. `pcs.load_settings(build_settings)`, then chain `config.set_language(...).set_project_name(...).set_pcs(pcs)`.
208
- 5. Iterate `ignore_files` → `config.add_ignore_file`.
209
- 6. Iterate `project_additional_info` → `config.add_project_additional_info`.
210
- 7. Build `custom_modules` list: **`%`** prefix → `CustomModuleWithOutContext`, else `CustomModule`.
211
- 8. Load `structure_settings` into a fresh `StructureSettings` via `load_settings`.
212
- 9. Return `(config, custom_modules, structure_settings_object)`.
213
-
214
- > **Deterministic**: No conditionals beyond data‑driven branches; identical input yields identical output.
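As an illustration of step 7, the prefix-based dispatch can be sketched as below. The two classes are minimal stand-ins rather than the package's real implementations, `build_custom_modules` is a hypothetical helper name, and stripping the leading `%` before storing the text is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CustomModule:
    """Stand-in for the context-aware module."""
    discription: str


@dataclass
class CustomModuleWithOutContext:
    """Stand-in for the module that ignores code context."""
    discription: str


def build_custom_modules(descriptions):
    """Pick the module class from the leading '%' token (hypothetical helper)."""
    modules = []
    for text in descriptions:
        if text.startswith("%"):
            # Assumption: the marker itself is not part of the description.
            modules.append(CustomModuleWithOutContext(text[1:]))
        else:
            modules.append(CustomModule(text))
    return modules


modules = build_custom_modules(["%How to install", "Explain the Manager class"])
```

Because the branch depends only on the first character of each entry, the same list always yields the same module types.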
215
-
216
- ---
217
- <a name="project-build-config-model"></a>
218
- ## Project Build Config Model (`ProjectBuildConfig`)
219
-
220
- A simple container for build‑time flags.
221
-
222
- | Entity | Type | Role | Notes |
223
- |--------|------|------|-------|
224
- | `save_logs` | `bool` | Enable persistent logging | Default `False` |
225
- | `log_level` | `int` | Verbosity selector | Default `-1` (unspecified) |
226
- | `load_settings` | `method` | Populate attributes from dict | Direct `setattr` loop |
227
-
228
- No methods beyond `load_settings`; the object is attached to `Config` via `set_pcs`.
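A minimal sketch of the "direct `setattr` loop" described in the table, assuming the defaults listed there. This is an illustration, not the package's source.

```python
class ProjectBuildConfig:
    """Sketch of a build-flag container with class-level defaults."""
    save_logs = False
    log_level = -1

    def load_settings(self, data: dict) -> None:
        # Copy every key from the dict onto the instance, without validation.
        for key, value in data.items():
            setattr(self, key, value)


pbc = ProjectBuildConfig()
pbc.load_settings({"save_logs": True, "log_level": 2})
```

One consequence of this design is that unknown keys are accepted silently, so a typo in the YAML would create a new attribute rather than raise an error.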
229
- <a name="projectsettings"></a>
230
- ## `ProjectSettings` – Prompt Builder
231
-
232
- | Entity | Type | Role |
233
- |--------|------|------|
234
- | `project_name` | `str` | Identifier inserted into prompt |
235
- | `info` | `dict` | Additional key‑value pairs |
236
- | `prompt` (property) | `str` | Concatenation of `BASE_SETTINGS_PROMPT`, project name, and each `info` entry (each on its own line) |
237
-
238
- **Logic**
239
- - `add_info` stores arbitrary metadata.
240
- - `prompt` assembles base prompt, project name, then iterates `self.info` to append `"{key}: {value}"` lines.
241
-
242
- > **Note**: All functions rely exclusively on the LLM interface (`get_answer_without_history`) and a progress‑bar abstraction; no file I/O occurs here.
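The `prompt` assembly described above can be sketched as follows. `BASE_SETTINGS_PROMPT` is given a placeholder value here, and the "Project name:" label is an assumption; the real constant and formatting may differ.

```python
BASE_SETTINGS_PROMPT = "Project settings:"  # placeholder, not the real constant


class ProjectSettings:
    def __init__(self, project_name: str):
        self.project_name = project_name
        self.info = {}

    def add_info(self, key: str, value: str) -> None:
        # Store arbitrary metadata for later inclusion in the prompt.
        self.info[key] = value

    @property
    def prompt(self) -> str:
        # Base prompt, then the project name, then one "key: value" line per entry.
        lines = [BASE_SETTINGS_PROMPT, f"Project name: {self.project_name}"]
        lines += [f"{key}: {value}" for key, value in self.info.items()]
        return "\n".join(lines)


settings = ProjectSettings("demo")
settings.add_info("global idea", "Generate docs automatically")
```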
243
- <a name="data-contract"></a>
244
- ### Data Contract
245
-
246
- | Entity | Type | Role | Notes |
247
- |--------|------|------|-------|
248
- | `print("ADG")` | side‑effect (stdout) | Simple identification signal emitted at import time. | No return value; executed once per interpreter session. |
249
- | `BaseLogger` | class (import) | Core logging facility used throughout the package. | Imported but not instantiated elsewhere in this file. |
250
- | `BaseLoggerTemplate` | class (import) | Provides the default formatting/handler configuration for the logger. | Passed to `logger.set_logger`. |
251
- | `logger` | `BaseLogger` instance | Shared logger instance exposed as a module‑level variable. | Other modules can `from autodocgenerator import logger`. |
252
- | `InfoLog`, `ErrorLog`, `WarningLog` | classes (import) | Specialized log record types. | Imported for external use; not instantiated here. |
253
-
254
- > **⚠️ Note** – The module does **not** perform file I/O, network calls, or alter global state beyond the stdout side‑effect and logger creation.
255
-
256
- ---
257
- <a name="logic-flow"></a>
258
- ### Execution Flow (Step‑by‑Step)
259
-
260
- 1. **Import phase** – Python evaluates the file linearly.
261
- 2. **`print` execution** – Immediately writes `"ADG"` to the console.
262
- 3. **Symbol import** – Retrieves logger‑related classes from `autodocgenerator.ui.logging`.
263
- 4. **Logger instantiation** – Calls `BaseLogger()` → creates a logger object.
264
- 5. **Template binding** – Calls `logger.set_logger(BaseLoggerTemplate())` → attaches the default template to the logger.
265
- 6. **Export** – The module’s namespace now contains the ready‑to‑use `logger` and the imported log‑type classes.
266
-
267
- No additional functions or conditional branches are present; the module’s behavior is fully deterministic and repeatable on each import.
268
- <a name="parent-model-hierarchy"></a>
269
- ## Core Model Hierarchy (`ParentModel`, `Model`, `AsyncModel`)
270
-
271
- **Responsibility** – Supplies shared state (API key, history, model rotation) for concrete generators.
272
- **Visible interactions** – Other modules import `Model`/`AsyncModel` via `gpt_model.py`; they receive a pre‑configured instance from the orchestrator.
273
-
274
- | Entity | Type | Role | Notes |
275
- |--------|------|------|-------|
276
- | `api_key` | `str` | Authentication token | Defaulted to `API_KEY` from config |
277
- | `history` | `History` | Conversational buffer | Injected or created lazily |
278
- | `use_random` | `bool` | Controls shuffling of `MODELS_NAME` | Randomised on each instantiation |
279
- | `current_model_index` | `int` | Index of the active model | Starts at 0 |
280
- | `regen_models_name` | `list[str]` | Rotation list of model identifiers | Shuffled when `use_random=True` |
281
-
282
- **Logic flow**
283
- 1. `ParentModel.__init__` stores `api_key` & `history`.
284
- 2. Copies global `MODELS_NAME`; shuffles if `use_random`.
285
- 3. Exposes `regen_models_name` & `current_model_index` for child classes.
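The three steps can be sketched as below. `MODELS_NAME` holds placeholder identifiers, a plain list stands in for `History`, and the constructor signature is an assumption.

```python
import random

MODELS_NAME = ["model-a", "model-b", "model-c"]  # placeholder identifiers


class ParentModel:
    def __init__(self, api_key: str, history=None, use_random: bool = True):
        self.api_key = api_key
        # A plain list stands in for the package's History object.
        self.history = history if history is not None else []
        self.current_model_index = 0
        # Copy first so shuffling never mutates the global list.
        self.regen_models_name = list(MODELS_NAME)
        if use_random:
            random.shuffle(self.regen_models_name)


model = ParentModel(api_key="example-key")
```

Copying before shuffling matters: each instance gets its own rotation order while the shared `MODELS_NAME` keeps its original order.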
286
-
287
- ---
288
- <a name="basemodule-definition"></a>
289
- ## Abstract Base Module (`BaseModule`)
290
-
291
- | Entity | Type | Role | Notes |
292
- |--------|------|------|-------|
293
- | `BaseModule` | `ABC` | Contract for all doc‑generation blocks | Requires `generate(info: dict, model: Model)` |
294
- | `__init__` | `method` | No‑op constructor | Allows subclass‑specific init |
295
- | `generate` | `abstractmethod` | Core payload generator | Must return a **string** fragment |
296
-
297
- > **Assumption** – Sub‑classes provide concrete logic; the base class itself does not produce output.
298
-
299
- ---
300
- <a name="docfactory-implementation"></a>
301
- ## Documentation Orchestrator (`DocFactory`)
302
-
303
- | Entity | Type | Role | Notes |
304
- |--------|------|------|-------|
305
- | `modules` | `list[BaseModule]` | Ordered generators supplied at construction | Stored as‑is |
306
- | `logger` | `BaseLogger` | Centralised logging | Uses `InfoLog` |
307
- | `generate_doc` | `method` | Executes each module, aggregates results, updates progress | Returns the full markdown document |
308
-
309
- **Logic flow**
310
- 1. Initialise `output = ""`.
311
- 2. Call `progress.create_new_subtask("Generate parts", len(self.modules))`.
312
- 3. Iterate `module` in `self.modules`:
313
- - `module_result = module.generate(info, model)`
314
- - Append `module_result` and two newlines to `output`.
315
- - Log module completion (`InfoLog`).
316
- - Log raw module output at level 2.
317
- - `progress.update_task()`.
318
- 4. After loop, `progress.remove_subtask()` and return `output`.
319
-
320
- > **Warning** – The `__main__` guard instantiates `BaseModule()` directly, which is abstract and would raise `TypeError` if executed.
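A simplified sketch of the aggregation loop and of the `TypeError` mentioned in the warning. Logging and progress reporting are omitted, `EchoModule` is a hypothetical subclass, and the constructor taking `*modules` is inferred from the `DocFactory(*custom_modules)` call described elsewhere in this document.

```python
from abc import ABC, abstractmethod


class BaseModule(ABC):
    @abstractmethod
    def generate(self, info: dict, model) -> str:
        """Return one markdown fragment."""


class EchoModule(BaseModule):
    """Hypothetical module used only for this illustration."""

    def __init__(self, title: str):
        self.title = title

    def generate(self, info: dict, model) -> str:
        return f"## {self.title} ({info['language']})"


class DocFactory:
    def __init__(self, *modules: BaseModule):
        self.modules = modules

    def generate_doc(self, info: dict, model) -> str:
        output = ""
        for module in self.modules:
            # Each fragment is followed by two newlines, as in the logic flow.
            output += module.generate(info, model) + "\n\n"
        return output


doc = DocFactory(EchoModule("Intro"), EchoModule("Usage")).generate_doc(
    {"language": "en"}, model=None
)

# Instantiating the abstract base directly fails, as the warning notes.
try:
    BaseModule()
    abstract_error = None
except TypeError as exc:
    abstract_error = exc
```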
321
-
322
- ---
323
- <a name="custommodule-implementation"></a>
324
- ## Custom Content Modules (`CustomModule`, `CustomModuleWithOutContext`)
325
-
326
- | Entity | Type | Role | Notes |
327
- |--------|------|------|-------|
328
- | `discription` | `str` | User‑provided header for the custom block | Set in ctor |
329
- | `generate` (both) | `method` | Calls post‑processor to build a custom description | Returns a **string** |
330
-
331
- **CustomModule** –
332
- 1. Split `info["code_mix"]` into ≤ 5000‑symbol chunks via `split_data`.
333
- 2. Invoke `generete_custom_discription` with the chunks, model, description, and language.
334
-
335
- **CustomModuleWithOutContext** –
336
- 1. Directly call `generete_custom_discription_without` with model, description, and language (no code context).
337
-
338
- Both rely exclusively on the imported post‑processor functions; no side effects beyond the returned string.
339
-
340
- ---
341
- <a name="intro-modules-implementation"></a>
342
- ## Intro Extraction Modules (`IntroLinks`, `IntroText`)
343
-
344
- | Entity | Type | Role | Notes |
345
- |--------|------|------|-------|
346
- | `generate` | `method` | Produces introductory material | Returns a **string** |
347
- | `links` / `intro` | `str` | Intermediate data from helpers | Obtained from `info` dict |
348
-
349
- **IntroLinks** –
350
- 1. `get_all_html_links(info["full_data"])` → `links`.
351
- 2. `get_links_intro(links, model, info["language"])` → `intro_links`.
352
-
353
- **IntroText** –
354
- 1. `get_introdaction(info["global_data"], model, info["language"])` → `intro`.
355
-
356
- Both modules delegate all heavy lifting to the imported `custom_intro` helpers and simply forward the resulting markdown snippet.
357
- <a name="manager-class"></a>
358
- ## `Manager` – Orchestrator of Project‑wide Documentation Pipeline
359
-
360
- | Entity | Type | Role | Notes |
361
- |--------|------|------|-------|
362
- | `CACHE_FOLDER_NAME` | `str` | Fixed cache directory name | `".auto_doc_cache"` |
363
- | `FILE_NAMES` | `dict[str,str]` | Maps logical keys to cache filenames | Used by `get_file_path` |
364
- | `__init__` | `method` | Sets configuration, logger, progress UI, creates cache folder | `progress_bar` defaults to a fresh `BaseProgress()` instance |
365
- | `read_file_by_file_key` | `method` | Returns raw text of a cached file | Reads UTF‑8, key resolved via `FILE_NAMES` |
366
- | `get_file_path` | `method` | Constructs absolute cache path for a given key | Combines `project_directory`, `CACHE_FOLDER_NAME`, and `FILE_NAMES` |
367
- | `generate_code_file` | `method` | Builds a **code‑mix** file from the repository | Uses `CodeMix.build_repo_content` |
368
- | `generete_doc_parts` | `method` | Splits `code_mix` into ≤ 5 000‑symbol chunks and generates markdown via `gen_doc_parts` | Writes result to `output_doc` |
369
- | `factory_generate_doc` | `method` | Invokes a `DocFactory` to prepend additional modules to the existing doc | Merges new fragments with current output |
370
- | `order_doc` | `method` | Re‑orders markdown sections by anchor using `split_text_by_anchors` & `get_order` | Overwrites `output_doc` |
371
- | `clear_cache` | `method` | Optionally removes the log file based on `config.pbc.save_logs` | No other side‑effects |
372
-
373
- > **Warning** – The default argument `progress_bar: BaseProgress = BaseProgress()` creates a mutable instance at import time; repeated `Manager` constructions share the same progress object.
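The shared-default behavior in the warning can be reproduced in isolation. The classes below are stand-ins, not the real `Manager` or `BaseProgress`; the `None` sentinel variant is a common alternative rather than something the package is known to use.

```python
from typing import Optional


class BaseProgress:
    """Stand-in progress tracker that just counts updates."""

    def __init__(self):
        self.steps = 0

    def update_task(self) -> None:
        self.steps += 1


class SharedDefault:
    # The default is evaluated once, when the function is defined.
    def __init__(self, progress_bar: BaseProgress = BaseProgress()):
        self.progress_bar = progress_bar


class FreshDefault:
    # A None sentinel defers creation until each call.
    def __init__(self, progress_bar: Optional[BaseProgress] = None):
        self.progress_bar = progress_bar if progress_bar is not None else BaseProgress()


a, b = SharedDefault(), SharedDefault()
a.progress_bar.update_task()  # also visible through b

c, d = FreshDefault(), FreshDefault()
c.progress_bar.update_task()  # d is unaffected
```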
374
-
375
- ### Initialization Flow
376
- 1. Store `project_directory`, `config`, optional models, and `progress_bar`.
377
- 2. Initialise `BaseLogger` and attach a `FileLoggerTemplate` targeting the cache `logs` file.
378
- 3. Ensure the cache folder exists (`os.mkdir` if absent).
379
-
380
- ### Core Operations
381
-
382
- #### 1. `generate_code_file`
383
- 1. Log start (`InfoLog`).
384
- 2. Instantiate `CodeMix` with `project_directory` and `config.ignore_files`.
385
- 3. Call `cm.build_repo_content` → writes `code_mix.txt`.
386
- 4. Log completion and advance the progress bar.
387
-
388
- #### 2. `generete_doc_parts`
389
- 1. Load `code_mix.txt`.
390
- 2. Log start, invoke `gen_doc_parts(full_code_mix, max_symbols, sync_model, config.language, progress_bar)`.
391
- 3. Persist returned markdown to `output_doc.md`.
392
- 4. Log finish and update progress.
393
-
394
- #### 3. `factory_generate_doc`
395
- 1. Load current `output_doc.md` and `code_mix.txt`.
396
- 2. Assemble `info` dict (`language`, `full_data`, `code_mix`).
397
- 3. Log detailed start message including module names and input sizes.
398
- 4. Call `doc_factory.generate_doc(info, sync_model, progress_bar)`.
399
- 5. Prepend new fragments to the existing doc and write back.
400
- 6. Update progress.
401
-
402
- #### 4. `order_doc`
403
- 1. Read current `output_doc.md`.
404
- 2. Split by markdown anchors (`split_text_by_anchors`).
405
- 3. If split succeeded, reorder sections via `get_order(sync_model, parts)`.
406
- 4. Overwrite `output_doc.md` with ordered content.
407
-
408
- #### 5. `clear_cache`
409
- 1. If `config.pbc.save_logs` is `False`, delete the `report.txt` log file.
410
-
411
- All side‑effects are confined to file system writes within the hidden cache directory and logger emissions; no network or external state is accessed beyond the injected `Model` instances.
412
- <a name="manager-class-usage"></a>!noinfo
413
- <a name="module-init-logger-setup"></a>
414
- ## Module Initialization & Logger Configuration
415
-
416
- The **`autodocgenerator/__init__.py`** module performs three concrete actions when the package is imported:
417
-
418
- 1. Emits a literal string **`"ADG"`** to *stdout* via `print`.
419
- 2. Imports the public logger classes from `autodocgenerator.ui.logging`:
420
- ```python
421
- from .ui.logging import BaseLogger, BaseLoggerTemplate, InfoLog, ErrorLog, WarningLog
422
- ```
423
- 3. Instantiates a **singleton‑style logger** and binds a default template:
424
- ```python
425
- logger = BaseLogger()
426
- logger.set_logger(BaseLoggerTemplate())
427
- ```
428
-
429
- These steps make a ready‑to‑use `logger` object available to any sub‑module that imports `autodocgenerator`.
430
-
431
- ---
432
- <a name="asyncgptmodel-implementation"></a>
433
- ## Asynchronous Generator (`AsyncGPTModel`)
434
-
435
- | Entity | Type | Role | Notes |
436
- |--------|------|------|-------|
437
- | `client` | `AsyncGroq` | Async LLM client | Instantiated with `api_key` |
438
- | `logger` | `BaseLogger` | Async‑compatible logger | Same log classes as sync version |
439
- | `generate_answer` | `async method` | Async request/response loop | Returns `awaitable str` |
440
-
441
- **Logic flow** (mirrors `GPTModel` but using `await`):
442
- 1. Log async start.
443
- 2. Resolve `messages` from history or `prompt`.
444
- 3. `while True` loop with the same exhaustion check and model rotation.
445
- 4. `await self.client.chat.completions.create(...)`.
446
- 5. On failure: log warning, rotate index, continue.
447
- 6. After success, extract `result`, log both model used and answer, then `return result`.
448
-
449
- **Interaction pattern** – Consumed by the orchestrator (`gen_doc`) via `await model.generate_answer(...)`; shares the same rotation logic as the sync counterpart.
450
- <a name="gptmodel-implementation"></a>
451
- ## Synchronous Generator (`GPTModel`)
452
-
453
- | Entity | Type | Role | Notes |
454
- |--------|------|------|-------|
455
- | `client` | `Groq` | Remote LLM client | Created with `api_key` |
456
- | `logger` | `BaseLogger` | Structured logging | Uses `InfoLog`, `ErrorLog`, `WarningLog` |
457
- | `generate_answer` | `method` | Core request/response loop | Returns `str` |
458
-
459
- **Logic flow**
460
- 1. Log start of generation.
461
- 2. Choose `messages` from `history` or supplied `prompt`.
462
- 3. Loop:
463
- - If `regen_models_name` empty → log error & raise `ModelExhaustedException`.
464
- - Pick `model_name` at `current_model_index`.
465
- - Attempt `self.client.chat.completions.create(messages=messages, model=model_name)`.
466
- - On exception: log warning, advance index (wrap‑around), retry.
467
- 4. Extract `result` from `chat_completion.choices[0].message.content`.
468
- 5. Log success & result (level 2).
469
- 6. Return `result`.
470
-
471
- > **Determinism** – Outcome depends only on input data and external API responses; no hidden branches.
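The retry loop can be sketched without a real client. The source does not say how `regen_models_name` becomes empty, so this sketch assumes a failing model is dropped from the rotation; `generate_with_fallback` and `fake_call` are hypothetical names, and `call_model` stands in for `client.chat.completions.create`.

```python
class ModelExhaustedException(Exception):
    """Raised when no model in the rotation can answer."""


def generate_with_fallback(call_model, models):
    remaining = list(models)
    index = 0
    while True:
        if not remaining:
            raise ModelExhaustedException("no models left to try")
        model_name = remaining[index]
        try:
            return call_model(model_name)
        except Exception:
            # Assumption: drop the failing model, then wrap the index around.
            remaining.pop(index)
            if remaining:
                index %= len(remaining)


def fake_call(model_name: str) -> str:
    """Pretend that only the last model is available."""
    if model_name != "model-c":
        raise RuntimeError(f"{model_name} unavailable")
    return f"answer from {model_name}"


answer = generate_with_fallback(fake_call, ["model-a", "model-b", "model-c"])
```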
472
-
473
- ---
474
- <a name="document-generation-orchestrator"></a>
475
- ## Document Generation Orchestrator (`gen_doc`)
476
-
477
- Coordinates model instantiation, manager setup, and final document retrieval.
478
-
479
- | Entity | Type | Role | Notes |
480
- |--------|------|------|-------|
481
- | `project_path` | `str` | Root of source tree | Passed to `Manager` |
482
- | `config` | `Config` | Project‑wide settings | From `read_config` |
483
- | `custom_modules` | `list[CustomModule│CustomModuleWithOutContext]` | Doc factories | Forwarded to `DocFactory` |
484
- | `structure_settings` | `StructureSettings` | Output segmentation flags | Controls ordering & intro links |
485
-
486
- **Step‑by‑step**
487
- 1. Instantiate `GPTModel` (sync) & `AsyncGPTModel` (async) with global `API_KEY`.
488
- 2. Build `Manager` with path, config, models, and a `ConsoleGtiHubProgress` bar.
489
- 3. Call `manager.generate_code_file()`.
490
- 4. Split docs via `manager.generete_doc_parts(max_symbols=structure_settings.max_doc_part_size)`.
491
- 5. Feed custom factories: `manager.factory_generate_doc(DocFactory(*custom_modules))`.
492
- 6. If `include_order` → `manager.order_doc()`.
493
- 7. If `include_intro_links` → `manager.factory_generate_doc(DocFactory(IntroLinks()))`.
494
- 8. Clear the temporary cache, then return `manager.read_file_by_file_key("output_doc")`.
495
-
496
- ---
497
- <a name="generate_descriptions_for_code"></a>
498
- ## `generate_descriptions_for_code` – LLM‑driven Doc Generation
499
-
500
- | Entity | Type | Role |
501
- |--------|------|------|
502
- | `data` | `list[str]` | Code snippets |
503
- | `model` | `Model` | LLM |
504
- | `project_settings` | `ProjectSettings` | Unused (present for signature) |
505
- | `progress_bar` | `BaseProgress` | Progress |
506
- | return | `list[str]` | Model answers (descriptions) |
507
-
508
- **Logic**
509
- - For each `code` create a two‑message prompt (instruction block + `CONTEXT: {code}`), call `model.get_answer_without_history`, append answer, update progress.
510
- <a name="gen_doc_parts"></a>
511
- ## `gen_doc_parts` – Synchronous Batch Documentation
512
-
513
- | Entity | Type | Role | Notes |
514
- |--------|------|------|-------|
515
- | `full_code_mix` | `str` | Complete source to split | |
516
- | `max_symbols` | `int` | Chunk size for `split_data` | |
517
- | `model` | `Model` | LLM used for each part | |
518
- | `language` | `str` | Output language | |
519
- | `progress_bar` | `BaseProgress` | Sub‑task progress tracker | |
520
- | **return** | `str` | Concatenated documentation of all parts | |
521
-
522
- **Logic**
523
- 1. Call `split_data` → list of parts.
524
- 2. Create a sub‑task in `progress_bar` with total length = number of parts.
525
- 3. Iterate parts: invoke `write_docs_by_parts`, append result to `all_result`, keep last 3000 characters of the current result for next iteration (`prev_info`). Update progress bar each loop.
526
- 4. Remove sub‑task, log final length, and return the assembled document.
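Step 3 carries a tail of each result into the next call. A minimal sketch of that hand-off follows, with a hypothetical `write_part` callable in place of `write_docs_by_parts` and a double-newline separator that is assumed rather than taken from the source.

```python
def gen_doc_parts_sketch(parts, write_part, tail_size=3000):
    """Document parts in order, passing the tail of each result forward."""
    all_result = ""
    prev_info = None
    for part in parts:
        result = write_part(part, prev_info)
        all_result += result + "\n\n"  # separator is an assumption
        # Only the last `tail_size` characters are kept as context.
        prev_info = result[-tail_size:]
    return all_result


seen_context = []


def fake_writer(part, prev_info):
    """Record the context received and return a dummy document."""
    seen_context.append(prev_info)
    return f"doc for {part}"


document = gen_doc_parts_sketch(["part-1", "part-2"], fake_writer, tail_size=5)
```

Limiting the carried context keeps the prompt size bounded regardless of how long earlier answers were.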
527
- <a name="async_gen_doc_parts"></a>
528
- ## `async_gen_doc_parts` – Asynchronous Batch Documentation
529
-
530
- | Entity | Type | Role | Notes |
531
- |--------|------|------|-------|
532
- | `full_code_mix` | `str` | Source code | |
533
- | `global_info` | `str` | Passed to each async task (unused in prompt) | |
534
- | `max_symbols` | `int` | Chunk size | |
535
- | `model` | `AsyncModel` | Async LLM | |
536
- | `language` | `str` | Output language | |
537
- | `progress_bar` | `BaseProgress` | Sub‑task progress manager | |
538
- | **return** | `str` | Full documentation assembled from async tasks | |
539
-
540
- **Logic**
541
- 1. Split source via `split_data`.
542
- 2. Initialise a sub‑task in `progress_bar`.
543
- 3. Create a semaphore (`4` permits).
544
- 4. Build a list of `async_write_docs_by_parts` tasks, each receiving the shared semaphore and a lambda that updates the progress bar.
545
- 5. `await asyncio.gather(*tasks)` → list of part documents.
546
- 6. Concatenate results with double newlines, clean up sub‑task, log final length, and return.
547
-
548
- > **Critical assumption**: All logging is performed through `BaseLogger`; no file I/O occurs in this module.
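The semaphore-and-gather pattern from steps 3 to 6 can be sketched with the standard library alone. `write_part` and `gen_parts` are hypothetical stand-ins, and `asyncio.sleep(0)` takes the place of the awaited model call.

```python
import asyncio


async def write_part(part, semaphore, update_progress):
    # At most `max_concurrency` coroutines run this section at once.
    async with semaphore:
        await asyncio.sleep(0)  # stand-in for the awaited LLM request
        result = f"doc for {part}"
    update_progress()
    return result


async def gen_parts(parts, max_concurrency=4):
    semaphore = asyncio.Semaphore(max_concurrency)
    done = []
    tasks = [write_part(p, semaphore, lambda: done.append(1)) for p in parts]
    # gather returns results in the order the tasks were passed in.
    results = await asyncio.gather(*tasks)
    return "\n\n".join(results), len(done)


document, completed = asyncio.run(gen_parts(["a", "b", "c"]))
```

Because `asyncio.gather` preserves input order, the assembled document keeps the original part order even when requests finish out of order.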
549
- <a name="write_docs_by_parts"></a>
550
- ## `write_docs_by_parts` – Synchronous Part‑wise Doc Generation
551
-
552
- | Entity | Type | Role | Notes |
553
- |--------|------|------|-------|
554
- | `part` | `str` | Code fragment to document | |
555
- | `model` | `Model` | Synchronous LLM interface | Provides `get_answer_without_history` |
556
- | `prev_info` | `str` | Optional prior output | Inserted into prompt when present |
557
- | `language` | `str` | Target language for docs | Default `"en"` |
558
- | **return** | `str` | Generated documentation for the part | May be trimmed of surrounding ```` ``` ```` markers |
559
-
560
- **Logic**
561
- 1. Build a system‑message list: language hint, `BASE_PART_COMPLITE_TEXT`, optional previous info, then the user message containing `part`.
562
- 2. Call `model.get_answer_without_history(prompt)`.
563
- 3. Strip leading/trailing markdown fences (```` ``` ````), log length and content, and return the cleaned answer.
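One way the fence stripping in step 3 could work is sketched below. `strip_fences` is a hypothetical helper, and dropping the language tag that follows an opening fence is an assumption about the desired behavior.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def strip_fences(answer: str) -> str:
    """Remove one leading and one trailing markdown fence, if present."""
    text = answer.strip()
    if text.startswith(FENCE):
        # Drop the whole opening line, including any language tag.
        newline = text.find("\n")
        text = text[newline + 1:] if newline != -1 else text[len(FENCE):]
    if text.endswith(FENCE):
        text = text[:-len(FENCE)]
    return text.strip()


cleaned = strip_fences(FENCE + "markdown\n# Title\n" + FENCE)
```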
564
- <a name="async_write_docs_by_parts"></a>
565
- ## `async_write_docs_by_parts` – Async Part‑wise Doc Generation
566
-
567
- | Entity | Type | Role | Notes |
568
- |--------|------|------|-------|
569
- | `part` | `str` | Code fragment | |
570
- | `async_model` | `AsyncModel` | Async LLM interface | Provides `await get_answer_without_history` |
571
- | `global_info` | `str` | Unused in prompt construction | Present for signature compatibility |
572
- | `semaphore` | `asyncio.Semaphore` | Concurrency limiter | Acquired via `async with` |
573
- | `prev_info` | `str` | Optional prior output | |
574
- | `language` | `str` | Target language | |
575
- | `update_progress` | `callable` | Optional progress callback | Invoked after answer received |
576
- | **return** | `str` | Documentation for the part | Fence‑stripped like the sync version |
577
-
578
- **Logic** mirrors the synchronous variant, wrapped in `async with semaphore:` and awaiting the model call. Progress is reported if `update_progress` is supplied.
579
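- > **Note**: The logic below belongs to `generete_custom_discription_without`; its answer must start with a single `<a name="CONTENT_DESCRIPTION"></a>` tag.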
580
-
581
- **Logic**
582
- 1. Create a prompt with three system messages: language, analyst role, and a rule‑enforced template demanding a single anchor tag with no filenames, extensions, generic terms, or URLs.
583
- 2. Append a user message containing the task.
584
- 3. Call `model.get_answer_without_history`.
585
- 4. Return the raw answer.
586
-
587
- ---
588
-
589
- **Cross‑Component Interaction**
590
- All functions rely on `BaseLogger` for internal diagnostics and on a `Model` implementation (e.g., `GPTModel`) to obtain LLM responses. No other modules are referenced; constants are imported from `engine.config.config`. The module therefore acts as a **post‑processing helper** that extracts navigation anchors and orchestrates LLM‑driven intro and custom description creation.
591
- <a name="generete_custom_discription"></a>
592
- ## `generete_custom_discription` – Context‑Sensitive Custom Description
593
-
594
- | Entity | Type | Role | Notes |
595
- |--------|------|------|-------|
596
- | `splited_data` | `str` (iterable) | Chunked documentation pieces | Iterated until a satisfactory result |
597
- | `model` | `Model` | LLM interface | |
598
- | `custom_description` | `str` | User‑specified description task | |
599
- | `language` | `str` | Prompt language | Default `"en"` |
600
- | return | `str` | First LLM answer that passes filters | Empty string if none succeed |
601
-
602
- **Logic**
603
- 1. Loop over each `sp_data` in `splited_data`.
604
- 2. Build a multi‑system‑message prompt: language, analyst role, context (`sp_data`), constant `BASE_CUSTOM_DISCRIPTIONS`, and the task.
605
- 3. Invoke `model.get_answer_without_history`.
606
- 4. If the result contains neither `"!noinfo"` nor `"No information found"` (or a marker appears only after position 30), break and keep the answer.
607
- 5. Otherwise reset `result` and continue.
608
- 6. Return the final `result`.
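The retry loop above can be illustrated with a small sketch. The helper names `is_usable` and `first_usable_answer` are hypothetical, the `ask` callable stands in for the prompt construction plus the model call, and the exact boundary handling at position 30 is an assumption.

```python
NO_INFO_MARKERS = ("!noinfo", "No information found")


def is_usable(answer, cutoff=30):
    """Reject an answer only when a marker occurs at or before the cutoff."""
    for marker in NO_INFO_MARKERS:
        position = answer.find(marker)
        if position != -1 and position <= cutoff:
            return False
    return True


def first_usable_answer(chunks, ask):
    """Try each chunk in turn and keep the first answer that passes."""
    result = ""
    for chunk in chunks:
        result = ask(chunk)
        if is_usable(result):
            break
        result = ""  # reset so a failed last attempt returns an empty string
    return result
```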
609
-
610
- ---
611
- <a name="generete_custom_discription_without"></a>
612
- ## `generete_custom_discription_without` – Stand‑Alone Description Generation
613
-
614
- | Entity | Type | Role | Notes |
615
- |--------|------|------|-------|
616
- | `model` | `Model` | LLM interface | |
617
- | `custom_description` | `str` | Desired description task | |
618
- | `language` | `str` | Prompt language | Default `"en"` |
619
- | return | `str` | LLM answer that obeys strict tag rules | Must start with a single `<a name="CONTENT_DESCRIPTION"></a>` tag |
620
- <a name="extract_links_from_start"></a>
621
- ## `extract_links_from_start` – Anchor Extraction
622
-
623
- | Entity | Type | Role | Notes |
624
- |--------|------|------|-------|
625
- | `chunks` | `list[str]` | Text blocks to scan | Expected to start with an `<a name=…>` tag |
626
- | `links` | `list[str]` | Collected anchors | Prefixed with “#” |
627
- | `pattern` | `str` | Regex `^<a name=["']?(.*?)["']?></a>` | Captures the name attribute at the very start of a chunk |
628
- | return | `list[str]` | Anchor list (only names > 5 chars) | Empty list if none match |
629
-
630
- **Logic**
631
- 1. Initialise empty `links`.
632
- 2. For each `chunk` → `chunk.strip()` → `re.search(pattern)`.
633
- 3. If a match and `len(anchor_name) > 5` → append `"#"+anchor_name`.
634
- 4. Return `links`.
635
-
636
- > **Assumption**: Only leading anchors are considered; embedded anchors are ignored.
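A minimal sketch of this extraction follows, assuming anchors are written as `<a name="…"></a>` and that the pattern includes the closing `>` of the opening tag. The constant name `ANCHOR_AT_START` is introduced here for illustration only.

```python
import re

# Assumed pattern: an anchor tag at the very start of the chunk.
ANCHOR_AT_START = r'^<a name=["\']?(.*?)["\']?></a>'


def extract_links_from_start(chunks):
    links = []
    for chunk in chunks:
        match = re.search(ANCHOR_AT_START, chunk.strip())
        # Names of five characters or fewer are skipped.
        if match and len(match.group(1)) > 5:
            links.append("#" + match.group(1))
    return links
```

Because the pattern is anchored with `^`, a tag that appears later in a chunk is never reported.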
637
- <a name="get_all_html_links"></a>
638
- ## `get_all_html_links` – HTML Anchor Extraction
639
-
640
- | Entity | Type | Role | Notes |
641
- |--------|------|------|-------|
642
- | `data` | `str` | Source markdown/HTML text | Expected to contain `<a name="…"></a>` anchors |
643
- | return | `list[str]` | Collected link identifiers | Each returned as `#anchor_name` (anchors longer than 5 chars) |
644
-
645
- **Logic**
646
- 1. Instantiate a fresh `BaseLogger`.
647
- 2. Log start message.
648
- 3. Compile regex `r'<a name=["\']?(.*?)["\']?></a>'`.
649
- 4. Iterate over `re.finditer`; for each match, capture group 1.
650
- 5. If captured name length > 5, prepend `#` and append to `links`.
651
- 6. Log count and list of links (debug level 1).
652
- 7. Return the list.
653
-
654
- > **Note** – No filesystem or network access; pure string processing.
655
-
656
- ---
657
- <a name="get_introdaction"></a>
658
- ## `get_introdaction` – Global Introduction Generation
659
-
660
- | Entity | Type | Role | Notes |
661
- |--------|------|------|-------|
662
- | `global_data` | `str` | Full documentation content | Sent as user prompt |
663
- | `model` | `Model` | LLM interface | Same contract as above |
664
- | `language` | `str` | Prompt language | Default `"en"` |
665
- | return | `str` | Generated introduction text | No logging performed in this fragment |
666
-
667
- **Logic**
668
- 1. Assemble prompt: language system message, constant `BASE_INTRO_CREATE`, and `global_data`.
669
- 2. Call `model.get_answer_without_history`.
670
- 3. Return the answer.
671
-
672
- ---
673
- <a name="get_links_intro"></a>
674
- ## `get_links_intro` – Intro Generation with Links
675
-
676
- | Entity | Type | Role | Notes |
677
- |--------|------|------|-------|
678
- | `links` | `list[str]` | Anchor list from `get_all_html_links` | Serialized via `str()` for prompt |
679
- | `model` | `Model` | LLM interface | Must implement `get_answer_without_history` |
680
- | `language` | `str` | Prompt language selector | Default `"en"` |
681
- | return | `str` | Generated introductory markdown | Contains the supplied links |
682
-
683
- **Logic**
684
- 1. Create `BaseLogger`.
685
- 2. Build a system‑user prompt array: set language, inject constant `BASE_INTRODACTION_CREATE_LINKS`, and pass stringified `links`.
686
- 3. Log generation start.
687
- 4. Call `model.get_answer_without_history(prompt=prompt)`.
688
- 5. Log completion and raw result (debug level 1).
689
- 6. Return the LLM’s answer.
690
-
691
- ---
692
- <a name="split_text_by_anchors"></a>
693
- ## `split_text_by_anchors` – Chunk Segmentation
694
-
695
- | Entity | Type | Role | Notes |
696
- |--------|------|------|-------|
697
- | `text` | `str` | Full markdown source | Contains `<a name=…>` anchors |
698
- | `pattern` | `str` | Look‑ahead regex `(?=<a name=["']?[^"'>\s]{6,200}["']?></a>)` | Splits **before** each valid anchor |
699
- | `result_chanks` | `list[str]` | Trimmed non‑empty chunks | One per anchor |
700
- | `all_links` | `list[str]` | Output of `extract_links_from_start` | Must align with `result_chanks` |
701
- | return | `dict[str,str]` or `None` | Mapping `#anchor → chunk` | `None` if counts differ |
702
-
703
- **Logic**
704
- 1. `re.split` on `pattern` → raw `chunks`.
705
- 2. Strip and filter empty entries → `result_chanks`.
706
- 3. Call `extract_links_from_start(result_chanks)` → `all_links`.
707
- 4. If `len(all_links) != len(result_chanks)` → `return None`.
708
- 5. Build dict pairing each link with its corresponding chunk.
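The segmentation steps above can be sketched as below. This assumes anchors of the form `<a name="…"></a>`; the constant names are illustrative, and the link extraction is inlined rather than delegated to `extract_links_from_start`.

```python
import re

# Zero-width look-ahead: split before each anchor whose name has 6-200 chars.
SPLIT_BEFORE_ANCHOR = r'(?=<a name=["\']?[^"\'>\s]{6,200}["\']?></a>)'
ANCHOR_AT_START = r'^<a name=["\']?(.*?)["\']?></a>'


def split_text_by_anchors(text):
    chunks = [c.strip() for c in re.split(SPLIT_BEFORE_ANCHOR, text) if c.strip()]
    links = []
    for chunk in chunks:
        match = re.search(ANCHOR_AT_START, chunk)
        if match and len(match.group(1)) > 5:
            links.append("#" + match.group(1))
    # Any chunk without a leading anchor makes the counts differ.
    if len(links) != len(chunks):
        return None
    return dict(zip(links, chunks))
```

One consequence of the count check is that text placed before the first anchor causes the whole call to return `None`.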
709
- <a name="get_order"></a>
710
- ## `get_order` – Semantic Title Ordering
711
-
712
- | Entity | Type | Role | Notes |
713
- |--------|------|------|-------|
714
- | `model` | `Model` | LLM interface | Provides `get_answer_without_history` |
715
- | `chanks` | `dict[str,str]` | Anchor‑to‑content map | Keys are `#anchor` strings |
716
- | `logger` | `BaseLogger` | Diagnostic output | Uses `InfoLog` at various levels |
717
- | return | `str` | Concatenated content in LLM‑suggested order | Ends with newline after each chunk |
718
-
719
- **Logic**
720
- 1. Log start and input keys/values.
721
- 2. Build single‑message prompt asking the model to **return a comma‑separated list** of the titles (keys) sorted semantically, preserving the leading “#”.
722
- 3. Call `model.get_answer_without_history(prompt)`.
723
- 4. Split result on commas, strip whitespace → `new_result`.
724
- 5. Iterate `new_result`; for each key `el` append `chanks[el]` and a newline to `order_output`, logging each addition.
725
- 6. Return `order_output`.
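Steps 4 to 6 amount to parsing the model's comma-separated answer and concatenating chunks in that order. A minimal sketch, with the hypothetical helper name `reorder_chunks` and the model answer passed in as a plain string:

```python
def reorder_chunks(chunks_by_anchor, model_answer):
    """Concatenate chunks in the order given by a comma-separated key list."""
    ordered_keys = [key.strip() for key in model_answer.split(",")]
    output = ""
    for key in ordered_keys:
        # A key the model invented would raise KeyError here.
        output += chunks_by_anchor[key] + "\n"
    return output
```

Since the lookup is a plain dictionary access, an answer that contains a title not present in the input map fails loudly rather than being skipped.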
726
- <a name="split_data"></a>
727
- ## `split_data` – Text Chunking Engine
728
-
729
- | Entity | Type | Role | Notes |
730
- |--------|------|------|-------|
731
- | `data` | `str` | Raw source text | May contain newline separators |
732
- | `max_symbols` | `int` | Target size for a chunk (symbols) | Compared against 1.5 × `max_symbols` when halving and 1.25 × `max_symbols` when accumulating |
733
- | **return** | `list[str]` | List of chunk strings | Each roughly `max_symbols` long; a new chunk starts once the current one would exceed 1.25 × `max_symbols` |
734
-
735
- **Logic**
736
- 1. Split `data` on newline (`"\n"`).
737
- 2. Repeatedly scan the list; any element longer than `1.5 × max_symbols` is cut in half (first half kept, second half inserted after). Loop until no element exceeds the threshold.
738
- 3. Accumulate elements into `split_objects`, starting a new chunk when the current one would exceed `1.25 × max_symbols`. Newlines are inserted between concatenated parts.
739
- 4. Log start and completion via `BaseLogger`.
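The two-phase heuristic above can be sketched as follows. Logging is omitted, and the exact halving point (the midpoint of the element) and the accounting for the joining newline are assumptions.

```python
def split_data(data, max_symbols):
    if max_symbols <= 0:
        raise ValueError("max_symbols must be positive")
    pieces = data.split("\n")

    # Phase 1: halve any piece longer than 1.5 x max_symbols, then rescan.
    changed = True
    while changed:
        changed = False
        for i, piece in enumerate(pieces):
            if len(piece) > 1.5 * max_symbols:
                half = len(piece) // 2
                pieces[i:i + 1] = [piece[:half], piece[half:]]
                changed = True
                break

    # Phase 2: merge pieces until the chunk would pass 1.25 x max_symbols.
    chunks = []
    for piece in pieces:
        if chunks and len(chunks[-1]) + len(piece) + 1 <= 1.25 * max_symbols:
            chunks[-1] += "\n" + piece
        else:
            chunks.append(piece)
    return chunks
```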
740
- <a name="code_mix"></a>
741
- ## `CodeMix` – Repository Snapshot Builder
742
-
743
- | Entity | Type | Role | Notes |
744
- |--------|------|------|-------|
745
- | `root_dir` | `Path` | Base directory for scanning | Resolved at init |
746
- | `ignore_patterns` | `list[str]` | Glob patterns to exclude | Defaults to empty list |
747
- | `logger` | `BaseLogger` | Progress logger | Uses `InfoLog` |
748
- | `should_ignore(path)` | `bool` | Determines exclusion | Checks path, basename, and each part against patterns |
749
- | `build_repo_content(output_file)` | `None` | Writes repository tree and file contents to `output_file` | Inserts `<file path="…">` tags before each file block |
750
- | return | `None` | Side‑effect: file creation | Prints a completion message in `__main__` |
751
-
752
- **Logic**
753
- 1. Open `output_file` for writing.
754
- 2. Write “Repository Structure:” header.
755
- 3. Walk `root_dir.rglob("*")` sorted; for each `path` not ignored, compute depth → indentation → write directory or file line.
756
- 4. Write separator line (`"="*20`).
757
- 5. Walk again; for each non‑ignored file, write `<file path="relative_path">`, then the file’s raw text, then two newlines. Errors are caught and written as `"Error reading …"`.
758
-
759
- > **Warning**: Files matching any pattern in `ignore_patterns` (e.g., `*.pyc`, `venv`, `.git`) are silently skipped.
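The exclusion check could be sketched as below. Matching with `fnmatch` is an assumption (the description only says "glob patterns"), and this free function takes `root_dir` and `ignore_patterns` as arguments instead of reading them from a `CodeMix` instance.

```python
import fnmatch
from pathlib import PurePosixPath


def should_ignore(path, root_dir, ignore_patterns):
    """Return True if the relative path, its basename, or any part matches."""
    relative = path.relative_to(root_dir)
    candidates = [relative.as_posix(), relative.name, *relative.parts]
    return any(
        fnmatch.fnmatch(candidate, pattern)
        for candidate in candidates
        for pattern in ignore_patterns
    )
```

Checking every path part is what lets a bare directory name such as `venv` exclude everything beneath it.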
760
- <a name="compress"></a>
761
- ## `compress` – Single‑File LLM Compression
762
-
763
- | Entity | Type | Role | Notes |
764
- |--------|------|------|-------|
765
- | `data` | `str` | Raw source text | – |
766
- | `project_settings` | `ProjectSettings` | Supplies system prompt via `project_settings.prompt` | – |
767
- | `model` | `Model` | LLM interface, provides `get_answer_without_history` | – |
768
- | `compress_power` | `int` | Controls token budget for `BASE_COMPRESS_TEXT` | – |
769
- | return | `str` | LLM‑generated compressed text | – |
770
-
771
- **Logic**
772
- 1. Build `prompt` list: system prompt from settings, token‑budget prompt from `get_BASE_COMPRESS_TEXT(10000, compress_power)`, then user content `data`.
773
- 2. Call `model.get_answer_without_history(prompt=prompt)`.
774
- 3. Return the answer unchanged.
775
- <a name="compress_and_compare"></a>
776
- ## `compress_and_compare` – Sync Batch Compression
777
-
778
- | Entity | Type | Role | Notes |
779
- |--------|------|------|-------|
780
- | `data` | `list[str]` | Files to compress | – |
781
- | `model` | `Model` | LLM instance | – |
782
- | `project_settings` | `ProjectSettings` | Prompt source | – |
783
- | `compress_power` | `int` | Chunk size (default 4) | – |
784
- | `progress_bar` | `BaseProgress` | Visual progress | Default instance |
785
- | return | `list[str]` | Concatenated chunks, one per `compress_power` files | – |
786
-
787
- **Logic**
788
- 1. Allocate result list sized `ceil(len(data)/compress_power)`.
789
- 2. Initialise sub‑task on `progress_bar`.
790
- 3. For each element `el` at index `i`: compute `curr_index = i // compress_power`; append `compress(el, …)` + newline to that slot; update progress.
791
- 4. Remove sub‑task and return the list.
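The slot arithmetic in steps 1 and 3 is easy to check in isolation. In this sketch the `compress` argument is a stand-in callable (upper-casing by default) instead of an LLM call, the progress bar is omitted, and the function name `group_compressed` is hypothetical.

```python
import math


def group_compressed(data, compress_power=4, compress=str.upper):
    # One result slot per group of `compress_power` inputs.
    result = ["" for _ in range(math.ceil(len(data) / compress_power))]
    for i, element in enumerate(data):
        curr_index = i // compress_power
        result[curr_index] += compress(element) + "\n"
    return result
```

With five inputs and `compress_power=2`, the last slot holds a single element, which is why the result list is sized with `ceil`.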
792
- <a name="async_compress"></a>
793
- ## `async_compress` – Async Single Compression
794
-
795
- | Entity | Type | Role |
796
- |--------|------|------|
797
- | `data` | `str` | Source text |
798
- | `project_settings` | `ProjectSettings` | Prompt source |
799
- | `model` | `AsyncModel` | Async LLM |
800
- | `compress_power` | `int` | Token budget |
801
- | `semaphore` | `asyncio.Semaphore` | Concurrency guard |
802
- | `progress_bar` | `BaseProgress` | Progress update |
803
- | return | `str` | Compressed result |
804
-
805
- **Logic**
806
- - Acquire semaphore, build identical prompt as `compress`, await `model.get_answer_without_history`, update progress, release semaphore, return answer.
807
- <a name="async_compress_and_compare"></a>
808
- ## `async_compress_and_compare` – Async Batch
809
-
810
- | Entity | Type | Role |
811
- |--------|------|------|
812
- | `data` | `list[str]` | Files |
813
- | `model` | `AsyncModel` | LLM |
814
- | `project_settings` | `ProjectSettings` | Prompt |
815
- | `compress_power` | `int` | Chunk size |
816
- | `progress_bar` | `BaseProgress` | Sub‑task |
817
- | return | `list[str]` | Chunked concatenations |
818
-
819
- **Logic**
820
- 1. Create a semaphore with a limit of 4 and spawn one `async_compress` task per file.
821
- 2. `await asyncio.gather` → `compressed_elements`.
822
- 3. Group results by `compress_power`, join with newlines, add trailing newline.
823
- <a name="compress_to_one"></a>
824
- ## `compress_to_one` – Iterative Reduction
825
-
826
- | Entity | Type | Role |
827
- |--------|------|------|
828
- | `data` | `list[str]` | Initial chunks |
829
- | `model` | `Model` | LLM |
830
- | `project_settings` | `ProjectSettings` | Prompt |
831
- | `compress_power` | `int` | Base chunk size |
832
- | `use_async` | `bool` | Switch between sync/async |
833
- | `progress_bar` | `BaseProgress` | Progress |
834
- | return | `str` | Single aggregated compressed block |
835
-
836
- **Logic**
837
- - Loop while `len(data) > 1`: adjust `compress_power` (minimum 2), call `compress_and_compare` or, when `use_async` is set, `async_compress_and_compare` via `asyncio.run`, and increment the iteration counter. The single remaining element is returned.
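The reduction loop can be sketched as follows. The rule used to adjust `compress_power` (fall back to 2 once the list is no longer than the base power) is an assumption, and `merge_groups` is a hypothetical stand-in for `compress_and_compare` that simply joins strings.

```python
def merge_groups(data, power):
    """Stand-in for a batch compression pass: join each group of `power` items."""
    return ["".join(data[i:i + power]) for i in range(0, len(data), power)]


def compress_to_one(data, compress_power=4, merge=merge_groups):
    iteration = 0
    while len(data) > 1:
        # Assumed adjustment: never group by fewer than 2, so the list shrinks.
        if len(data) > compress_power:
            power = max(2, compress_power)
        else:
            power = 2
        data = merge(data, power)
        iteration += 1
    return data[0]
```

Keeping the group size at 2 or more guarantees that each pass shortens the list, so the loop terminates for any non-empty input.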
838
-
839
-