autodocgenerator 0.8.9.6-py3-none-any.whl → 0.8.9.7-py3-none-any.whl

This diff compares two publicly released versions of the package as they appear in their public registry. It is provided for informational purposes only.
@@ -11,7 +11,7 @@ Do NOT skip details; analyze everything that appears in the snippet.
11
11
 
12
12
  BASE_PART_COMPLITE_TEXT = """Revised Documentation Prompt
13
13
  Role: You are a senior technical writer. Input: You will receive a specific code snippet representing a fragment of a larger system.
14
- Task: Write clear, structured, and hierarchical documentation for this fragment. Length: 0.30.7k characters (keep it tight).
14
+ Task: Write clear, structured, and hierarchical documentation for this fragment. Length: 0.71k characters (keep it tight).
15
15
 
16
16
  Content Requirements:
17
17
  Component Responsibility: Define exactly what this specific fragment does.
@@ -32,7 +32,7 @@ Formatting:
32
32
  Use Markdown for structure.
33
33
  Include HTML anchors near titles: <a name="specific-title"></a> \n ## Specific Title."""
34
34
 
35
- BASE_INTRODACTION_CREATE_TEXT = """
35
+ BASE_INTRODACTION_CREATE_LINKS = """
36
36
  Role: Senior Technical Solutions Architect.
37
37
  Context: You are processing technical documentation structure for an automated system (AutoDoc).
38
38
 
@@ -86,11 +86,8 @@ all information take from the following data
86
86
 
87
87
  BASE_SETTINGS_PROMPT = """
88
88
  Role & Context: "Act as a Project Knowledge Base. I will provide you with a structured project profile. Your goal is to memorize these parameters and use them as the foundational context for all our future interactions regarding this project."
89
-
90
89
  Input Format: "The input will follow this structure:
91
-
92
90
  Project Name: [Name] (This is the unique identifier).
93
-
94
91
  Project Parameters: A list of key: value pairs defining the project scope, such as global_idea, target_audience, tech_stack, etc."
95
92
 
96
93
  Instructions:
@@ -102,6 +99,22 @@ Project Data to Process:
102
99
 
103
100
  """
104
101
 
102
+ BASE_CUSTOM_DISCRIPTIONS = """
103
+ ### Strict Rules:
104
+ 1. Use ONLY the provided Context to answer.
105
+ 2. If the requested information is not explicitly mentioned in the Context, or if you don't know the answer based on the provided data, respond with an empty string ("") or simply say "No information found".
106
+ 3. DO NOT use external knowledge or invent any logic that is not present in the text.
107
+ 4. Do not provide any introductory or concluding remarks. If there is no info, output must be empty.
108
+ 5. If you dont have any info about it return just !noinfo
109
+ 6. Every response must start with exactly one <a name="CONTENT_DESCRIPTION"></a> tag. The CONTENT_DESCRIPTION must be a short, hyphenated summary of the actual information you are providing (e.g., "user-authentication-logic" instead of "auth.yml"). STRICT RULES:
110
+
111
+ NO filenames or paths (e.g., forbidden: "autodocconfig.yml", "src/config").
112
+ NO file extensions (e.g., forbidden: ".yml", ".md").
113
+ NO generic terms (e.g., forbidden: "config", "settings", "run", "docs").
114
+ NO protocols (http/https).
115
+ This tag must appear ONLY ONCE at the very beginning. Never repeat it or use other links
116
+ """
117
+
105
118
  def get_BASE_COMPRESS_TEXT(start, power):
106
119
  return f"""
107
120
  You will receive a large code snippet (up to ~{start} characters).
@@ -1,6 +1,6 @@
1
1
  from ..engine.models.gpt_model import GPTModel
2
2
  from ..engine.models.model import Model
3
- from ..engine.config.config import BASE_INTRODACTION_CREATE_TEXT, BASE_INTRO_CREATE
3
+ from ..engine.config.config import BASE_INTRODACTION_CREATE_LINKS, BASE_INTRO_CREATE, BASE_CUSTOM_DISCRIPTIONS
4
4
  from ..ui.logging import InfoLog, BaseLogger
5
5
  import re
6
6
 
@@ -35,7 +35,7 @@ def get_links_intro(links: list[str], model: Model, language: str = "en"):
35
35
  },
36
36
  {
37
37
  "role": "system",
38
- "content": BASE_INTRODACTION_CREATE_TEXT
38
+ "content": BASE_INTRODACTION_CREATE_LINKS
39
39
  },
40
40
  {
41
41
  "role": "user",
@@ -79,30 +79,15 @@ def generete_custom_discription(splited_data: str, model: Model, custom_descript
79
79
  },
80
80
  {
81
81
  "role": "system",
82
- "content": f"Act as a precise Technical Analyst. You will be provided with specific code or documentation. Your task is to describe or extract information based ONLY on the provided context. And make title and link <a name='your_title'> </a> format"
82
+ "content": f"Act as a precise Technical Analyst. You will be provided with specific code or documentation. Your task is to describe or extract information based ONLY on the provided context"
83
83
  },
84
84
  {
85
85
  "role": "system",
86
86
  "content": f"### Context: {sp_data}"
87
87
  },
88
- {"role": "system",
89
- "content": """### Strict Rules:
90
- 1. Use ONLY the provided Context to answer.
91
- 2. If the requested information is not explicitly mentioned in the Context, or if you don't know the answer based on the provided data, respond with an empty string ("") or simply say "No information found".
92
- 3. DO NOT use external knowledge or invent any logic that is not present in the text.
93
- 4. Do not provide any introductory or concluding remarks. If there is no info, output must be empty.
94
- 5. If you dont have any info about it return just !noinfo
95
- 6. Every response must start with exactly one <a name="CONTENT_DESCRIPTION"></a> tag. The CONTENT_DESCRIPTION must be a short, hyphenated summary of the actual information you are providing (e.g., "user-authentication-logic" instead of "auth.yml"). STRICT RULES:
96
-
97
- NO filenames or paths (e.g., forbidden: "autodocconfig.yml", "src/config").
98
-
99
- NO file extensions (e.g., forbidden: ".yml", ".md").
100
-
101
- NO generic terms (e.g., forbidden: "config", "settings", "run", "docs").
102
-
103
- NO protocols (http/https).
104
-
105
- This tag must appear ONLY ONCE at the very beginning. Never repeat it or use other links"""
88
+ {
89
+ "role": "system",
90
+ "content": BASE_CUSTOM_DISCRIPTIONS
106
91
  },
107
92
  {
108
93
  "role": "user",
@@ -0,0 +1,577 @@
1
+ Metadata-Version: 2.4
2
+ Name: autodocgenerator
3
+ Version: 0.8.9.7
4
+ Summary: This Project helps you to create docs for your projects
5
+ License: MIT
6
+ Author: dima-on
7
+ Author-email: sinica911@gmail.com
8
+ Requires-Python: >=3.11,<4.0
9
+ Classifier: License :: OSI Approved :: MIT License
10
+ Classifier: Programming Language :: Python :: 3
11
+ Classifier: Programming Language :: Python :: 3.11
12
+ Classifier: Programming Language :: Python :: 3.12
13
+ Classifier: Programming Language :: Python :: 3.13
14
+ Classifier: Programming Language :: Python :: 3.14
15
+ Requires-Dist: CacheControl (==0.14.4)
16
+ Requires-Dist: Pygments (==2.19.2)
17
+ Requires-Dist: RapidFuzz (==3.14.3)
18
+ Requires-Dist: annotated-types (==0.7.0)
19
+ Requires-Dist: anyio (==4.12.1)
20
+ Requires-Dist: certifi (==2026.1.4)
21
+ Requires-Dist: charset-normalizer (==3.4.4)
22
+ Requires-Dist: cleo (==2.1.0)
23
+ Requires-Dist: colorama (==0.4.6)
24
+ Requires-Dist: crashtest (==0.4.1)
25
+ Requires-Dist: distlib (==0.4.0)
26
+ Requires-Dist: distro (==1.9.0)
27
+ Requires-Dist: dulwich (==0.25.2)
28
+ Requires-Dist: fastjsonschema (==2.21.2)
29
+ Requires-Dist: filelock (==3.20.3)
30
+ Requires-Dist: findpython (==0.7.1)
31
+ Requires-Dist: google-auth (==2.47.0)
32
+ Requires-Dist: google-genai (==1.56.0)
33
+ Requires-Dist: groq (==1.0.0)
34
+ Requires-Dist: h11 (==0.16.0)
35
+ Requires-Dist: httpcore (==1.0.9)
36
+ Requires-Dist: httpx (==0.28.1)
37
+ Requires-Dist: idna (==3.11)
38
+ Requires-Dist: installer (==0.7.0)
39
+ Requires-Dist: jaraco.classes (==3.4.0)
40
+ Requires-Dist: jaraco.context (==6.1.0)
41
+ Requires-Dist: jaraco.functools (==4.4.0)
42
+ Requires-Dist: jiter (==0.12.0)
43
+ Requires-Dist: keyring (==25.7.0)
44
+ Requires-Dist: markdown-it-py (==4.0.0)
45
+ Requires-Dist: mdurl (==0.1.2)
46
+ Requires-Dist: more-itertools (==10.8.0)
47
+ Requires-Dist: msgpack (==1.1.2)
48
+ Requires-Dist: openai (==2.14.0)
49
+ Requires-Dist: packaging (==25.0)
50
+ Requires-Dist: pbs-installer (==2026.1.14)
51
+ Requires-Dist: pkginfo (==1.12.1.2)
52
+ Requires-Dist: platformdirs (==4.5.1)
53
+ Requires-Dist: pyasn1 (==0.6.1)
54
+ Requires-Dist: pyasn1_modules (==0.4.2)
55
+ Requires-Dist: pydantic (==2.12.5)
56
+ Requires-Dist: pydantic_core (==2.41.5)
57
+ Requires-Dist: pyproject_hooks (==1.2.0)
58
+ Requires-Dist: python-dotenv (==1.2.1)
59
+ Requires-Dist: pywin32-ctypes (==0.2.3)
60
+ Requires-Dist: pyyaml (==6.0.3)
61
+ Requires-Dist: requests (==2.32.5)
62
+ Requires-Dist: requests-toolbelt (==1.0.0)
63
+ Requires-Dist: rich (==14.2.0)
64
+ Requires-Dist: rich_progress (==0.4.0)
65
+ Requires-Dist: rsa (==4.9.1)
66
+ Requires-Dist: shellingham (==1.5.4)
67
+ Requires-Dist: sniffio (==1.3.1)
68
+ Requires-Dist: tenacity (==9.1.2)
69
+ Requires-Dist: tomlkit (==0.14.0)
70
+ Requires-Dist: tqdm (==4.67.1)
71
+ Requires-Dist: trove-classifiers (==2026.1.14.14)
72
+ Requires-Dist: typing-inspection (==0.4.2)
73
+ Requires-Dist: typing_extensions (==4.15.0)
74
+ Requires-Dist: urllib3 (==2.6.2)
75
+ Requires-Dist: virtualenv (==20.36.1)
76
+ Requires-Dist: websockets (==15.0.1)
77
+ Requires-Dist: zstandard (==0.25.0)
78
+ Description-Content-Type: text/markdown
79
+
80
+ ## Executive Navigation Tree
81
+ - 📂 Installation & Setup
82
+ - [Install Workflow Guide](#install-workflow-guide)
83
+ - [PowerShell Setup Script](#powershell-setup-script)
84
+ - [Bash Setup Script](#bash-setup-script)
85
+ - ⚙️ Configuration & Core
86
+ - [Autodocconfig File Options](#autodocconfig-file-options)
87
+ - [Purpose Of Config Reader](#purpose-of-config-reader)
88
+ - [Key Function Read Config](#key-function-read-config)
89
+ - [Cli Bootstrap And Configuration Loading](#cli‑bootstrap‑and‑configuration‑loading)
90
+ - [Configuration Constants](#configuration-constants)
91
+ - [Environment Variable Loading](#environment-variable-loading)
92
+ - [Projectsettings Class](#projectsettings-class)
93
+ - [Project Metadata Declaration](#project-metadata-declaration)
94
+ - [Dependency Specification](#dependency-specification)
95
+ - [Build System Configuration](#build-system-configuration)
96
+ - 🏗️ Integration & Modules
97
+ - [Integration With Factory Modules](#integration-with-factory-modules)
98
+ - [Integration Points And Assumptions](#integration‑points‑and‑assumptions)
99
+ - [Integration Points](#integration‑points)
100
+ - [Assumptions And Side‑Effects](#assumptions-and-side-effects)
101
+ - [Module Purpose](#module-purpose)
102
+ - [Basemodule Abstract Contract](#basemodule‑abstract‑contract)
103
+ - [Manager Orchestration Role](#manager‑orchestration‑role)
104
+ - [Docfactory Module Orchestrator](#docfactory‑module‑orchestrator)
105
+ - [Factory Doc Augmentation](#factory‑doc‑augmentation)
106
+ - [Model Instantiation And Manager Setup](#model-instantiation-and-manager-setup)
107
+ - 🔄 Processing & Generation Pipeline
108
+ - [Processing Steps](#processing-steps)
109
+ - [Generation Pipeline Steps](#generation‑pipeline‑steps)
110
+ - [Document Ordering Step](#document‑ordering‑step)
111
+ - [Anchor Extraction And Chunk Splitting](#anchor-extraction-and-chunk-splitting)
112
+ - [Semantic Ordering Of Chunks](#semantic-ordering-of-chunks)
113
+ - [Data Splitting Logic](#data-splitting-logic)
114
+ - [History Buffer Management](#history-buffer-management)
115
+ - [Parentmodel Selection Logic](#parentmodel‑selection‑logic)
116
+ - [Gptmodel Synchronous Wrapper](#gptmodel‑synchronous‑wrapper)
117
+ - [Asyncgptmodel Asynchronous Wrapper](#asyncgptmodel‑asynchronous‑wrapper)
118
+ - [Model Exhausted Exception](#model-exhausted-exception)
119
+ - 🗂️ Compression & Optimization
120
+ - [Compress Function](#compress-function)
121
+ - [Compress And Compare Sync](#compress-and-compare-sync)
122
+ - [Compress To One Loop](#compress-to-one-loop)
123
+ - [Async Compress Function](#async-compress-function)
124
+ - [Async Compress And Compare](#async-compress-and-compare)
125
+ - [Compression Prompt Generator](#compression-prompt-generator)
126
+ - 📄 Documentation Generation
127
+ - [Custommodule Custom Description Generator](#custommodule‑custom‑description‑generator)
128
+ - [Custom Description Generation](#custom‑description‑generation)
129
+ - [Generate Descriptions](#generate-descriptions)
130
+ - [Code Mix Generation Method](#code‑mix‑generation‑method)
131
+ - [Repository Content Aggregation Codemix](#repository-content-aggregation-codemix)
132
+ - [Sync Part Doc Generation](#sync-part-doc-generation)
133
+ - [Synchronous Doc Parts Generation](#synchronous‑doc‑parts‑generation)
134
+ - [Async Part Doc Generation](#async-part-doc-generation)
135
+ - [Batch Doc Generation Sync](#batch-doc-generation-sync)
136
+ - [Batch Doc Generation Async](#batch-doc-generation-async)
137
+ - 🌐 HTML Extraction & Intro Links
138
+ - [Introlinks Html Link Extraction Intro](#introlinks‑html‑link‑extraction‑intro)
139
+ - [Html Link Extraction](#html-link-extraction)
140
+ - 🧩 Misc Tools
141
+ - [Regex Pattern](#["\\\']?(.*?)["\\\']?)
142
+ - 📄 Intro Generation
143
+ - [Introtext Global Introduction Assembly](#introtext‑global‑introduction‑assembly)
144
+ - [Global Intro Generation](#global‑intro‑generation)
145
+ - [Link Driven Intro Generation](#link‑driven‑intro‑generation)
146
+ - 🗃️ Cache & Logging
147
+ - [Cache Initialisation Paths](#cache‑initialisation‑paths)
148
+ - [Cache Cleanup Routine](#cache‑cleanup‑routine)
149
+ - [Logging Infrastructure](#logging-infrastructure)
150
+ - [Rich Console Progress](#rich-console-progress)
151
+ - [Console Progress](#console-progress)
152
+
153
+
154
+
155
+ <a name="install-workflow-guide"></a>To set up the documentation generation workflow, fetch the Windows installer script from raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 and execute it in PowerShell using `| iex`. For Linux systems, retrieve the installer from raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh and run it with `| bash`. After installing, add a secret called **GROCK_API_KEY** to your repository’s GitHub Actions secrets, inserting the API key obtained from the Grock documentation site (grockdocs.com) to enable the workflow.
156
+ <a name="powershell-setup-script"></a>
157
+ ## PowerShell Setup Script (`install.ps1`)
158
+ **Responsibility**: Generates GitHub workflow files and a minimal `autodocconfig.yml` for the current repository.
159
+ **Interactions**: Uses PowerShell here‑strings to write `.github/workflows/autodoc.yml` and `autodocconfig.yml`; reads the folder name via `Get-Item .`.
160
+ **Technical Details**: Creates target directory (`New-Item -Force`), writes static YAML content with embedded secret reference, and prints a success message.
161
+ **Data Flow**: Filesystem paths → created/overwritten YAML files.
162
+ <a name="bash-setup-script"></a>
163
+ ## Bash Setup Script (`install.sh`)
164
+ **Responsibility**: Mirrors `install.ps1` for Unix‑like shells, creating the same workflow and config files.
165
+ **Interactions**: Uses `mkdir -p` for directory creation, `cat <<EOF` redirection to write YAML, and `$(basename "$PWD")` to insert the project name.
166
+ **Technical Details**: Escapes the `${{…}}` placeholder to avoid shell interpolation, then echoes a confirmation.
167
+ **Data Flow**: Filesystem operations → generated `.github/workflows/autodoc.yml` and `autodocconfig.yml`.
168
+ <a name="autodocconfig-file-options"></a>
169
+ The configuration file is written in YAML and may contain the following top‑level keys:
170
+
171
+ * **project_name** – a string that defines the name of the project.
172
+ * **language** – a string indicating the documentation language (default “en”).
173
+ * **ignore_files** – an optional list of glob patterns for files that should be excluded from processing.
174
+ * **project_settings** – a map with optional settings:
175
+ * **save_logs** – boolean, when true the generation logs are persisted.
176
+ * **log_level** – integer specifying the verbosity of logging.
177
+ * **project_additional_info** – a map where any custom key‑value pairs can be added to enrich the project description (e.g., a “global idea” entry).
178
+ * **custom_descriptions** – a list of strings; each string is passed to a custom module and can contain arbitrary explanatory text, commands, or references.
179
+
180
+ When writing the file, ensure proper indentation and use plain YAML syntax. Include only the keys you need; omitted keys will fall back to defaults defined in the generator.
181
+ <a name="purpose-of-config-reader"></a>
182
+ ## Purpose of ConfigReader
183
+
184
+ `read_config` translates a raw YAML string into a fully‑populated `Config` instance and a list of `CustomModule` objects. It centralises all project‑wide settings, language choice, ignore patterns and custom description handling for the Auto‑Doc Generator.
185
+ <a name="key-function-read-config"></a>
186
+ ## Key Function `read_config`
187
+
188
+ ```python
189
+ def read_config(file_data: str) -> tuple[Config, list[CustomModule]]:
190
+ ```
191
+
192
+ * **Parameters** – `file_data`: a YAML‑formatted string (typically the contents of `autodocconfig.yml`).
193
+ * **Returns** – a tuple:
194
+ 1. `Config` – holds project metadata, language, ignore patterns, and `ProjectConfigSettings`.
195
+ 2. `list[CustomModule]` – one module per custom description.
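To make the contract concrete, here is a minimal usage sketch: a config string with the keys listed above is fed straight into `read_config`. The YAML values and the printed attribute names are illustrative assumptions, not fixed requirements.

```python
# Hypothetical usage sketch; the attribute names printed at the end are assumptions
# based on the "Processing Steps" section, not a guaranteed public API.
from autodocgenerator.auto_runner.config_reader import read_config

SAMPLE_CONFIG = """
project_name: demo-project
language: en
ignore_files:
  - "*.md"
  - ".auto_doc_cache/*"
project_settings:
  save_logs: true
  log_level: 2
project_additional_info:
  global_idea: "CLI tool that generates documentation from source code"
custom_descriptions:
  - "Describe how configuration is loaded"
"""

config, custom_modules = read_config(SAMPLE_CONFIG)
print(config.project_name, config.language)      # assumed attribute names
print(len(custom_modules), "custom module(s)")   # one CustomModule per description
```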
196
+ <a name="cli‑bootstrap‑and‑configuration‑loading"></a>
197
+ ## CLI bootstrap and configuration loading
198
+
199
+ The `if __name__ == "__main__":` block acts as a tiny command‑line driver:
200
+
201
+ 1. Reads `autodocconfig.yml` into a string.
202
+ 2. Calls `read_config` (from `auto_runner.config_reader`) to obtain a `Config` instance and a list of custom module objects.
203
+ 3. Invokes `gen_doc(".", config, custom_modules)` and stores the result in `output_doc`.
204
+
205
+ No external I/O occurs inside `gen_doc`; all file interactions are confined to the `Manager`’s internal cache and the final `read_file_by_file_key` call.
206
+ <a name="configuration-constants"></a>
207
+ ## Configuration constants and prompts
208
+
209
+ The module defines a set of multi‑line string constants (`BASE_SYSTEM_TEXT`, `BASE_PART_COMPLITE_TEXT`, `BASE_INTRODACTION_CREATE_TEXT`, `BASE_INTRO_CREATE`, `BASE_SETTINGS_PROMPT`). Each constant supplies a reusable prompt fragment for the AutoDoc pipeline (system instruction, documentation style, navigation‑tree generation, project‑overview template, and persistent‑memory instruction). These literals are imported by the runner to build the full prompt passed to the LLM.
210
+ <a name="environment-variable-loading"></a>
211
+ ## Environment variable loading and API key validation
212
+
213
+ ```python
214
+ load_dotenv()
215
+ API_KEY = os.getenv("API_KEY")
216
+ if API_KEY is None:
217
+ raise Exception("API_KEY is not set in environment variables.")
218
+ ```
219
+
220
+ The code pulls the OpenAI key from a `.env` file at runtime. Absence of the key aborts execution, guaranteeing that downstream `GPTModel` instances always receive valid credentials.
221
+ <a name="projectsettings-class"></a>
222
+ ## ProjectSettings – Prompt Builder
223
+ **Responsibility**: Holds project‑level metadata and produces a composite system prompt.
224
+ **Interactions**: Accessed by all compression functions via the `prompt` property.
225
+ **Technical Details**: Starts with `BASE_SETTINGS_PROMPT`, appends project name and any key/value pairs added via `add_info`.
226
+ **Data Flow**: `ProjectSettings` → `str` prompt used in LLM calls.
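A minimal sketch of a prompt builder that behaves as described; the attribute names, the `add_info` signature, and the placeholder `BASE_SETTINGS_PROMPT` text are assumptions rather than the package's actual code.

```python
# Illustrative sketch only; BASE_SETTINGS_PROMPT stands in for the real constant.
BASE_SETTINGS_PROMPT = "Act as a Project Knowledge Base. Memorize these parameters:"

class ProjectSettings:
    def __init__(self, project_name: str) -> None:
        self.project_name = project_name
        self.info: dict[str, str] = {}

    def add_info(self, key: str, value: str) -> None:
        # Collects arbitrary key/value pairs (global_idea, target_audience, ...).
        self.info[key] = value

    @property
    def prompt(self) -> str:
        # Base prompt + project name + every registered key/value pair.
        lines = [BASE_SETTINGS_PROMPT, f"Project Name: {self.project_name}"]
        lines += [f"{key}: {value}" for key, value in self.info.items()]
        return "\n".join(lines)

settings = ProjectSettings("demo-project")
settings.add_info("global_idea", "automatic documentation generator")
print(settings.prompt)
```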
227
+ <a name="project-metadata-declaration"></a>
228
+ ## Project Metadata Declaration
229
+
230
+ The `pyproject.toml` fragment declares the **autodocgenerator** package’s identity: name, version, description, authors, license, README, and supported Python range. This information is consumed by packaging tools (Poetry, pip, build back‑ends) to generate distribution metadata (`PKG‑INFO`, wheel tags) and to surface project details on PyPI.
231
+ <a name="dependency-specification"></a>
232
+ ## Dependency Specification
233
+
234
+ Under `[project]` the `dependencies` array enumerates exact version pins for every runtime library (e.g., `openai==2.14.0`, `pydantic==2.12.5`). The list drives `poetry install` and `pip install .` to resolve a reproducible environment. No optional or development groups are defined here; they would be placed in separate sections (`[tool.poetry.dev-dependencies]`) if needed.
235
+ <a name="build-system-configuration"></a>
236
+ ## Build System Configuration
237
+
238
+ The `[build-system]` table tells the Python build frontend to use **poetry-core** (`requires = ["poetry-core>=2.0.0"]`) with the entry point `poetry.core.masonry.api`. During `python -m build` or `pip install .`, this config triggers Poetry’s PEP‑517 builder, which reads the above metadata and assembles the source distribution and wheel. No custom build steps or hooks are declared, so the process is deterministic and isolated from external scripts.
239
+ <a name="entry-point-for-doc-generation"></a>
240
+ ## Entry point for documentation generation (`gen_doc`)
241
+
242
+ The `gen_doc` function is the orchestrator that ties together configuration, language models, and the `Manager` to produce a complete documentation file. It receives a filesystem root (`project_path`), a validated `Config` object, and a list of instantiated custom module objects.
243
+
244
+ **Data flow**
245
+ - **Inputs**: `project_path` (str), `config` (Config), `custom_modules` (list[CustomModule])
246
+ - **Outputs**: Raw markdown string returned by `manager.read_file_by_file_key("output_doc")`
247
+
248
+ **Side effects**: Initializes two GPT model instances, creates a `Manager`, triggers a series of generation steps, and clears the internal cache.
249
+ <a name="integration-with-factory-modules"></a>
250
+ ## Integration with Factory Modules
251
+
252
+ The function imports `CustomModule` from `autodocgenerator.factory.modules.general_modules`. Each entry in the `custom_descriptions` YAML array is wrapped in a `CustomModule`, allowing the downstream factory pipeline to treat user‑supplied snippets uniformly with built‑in modules.
253
+ <a name="integration‑points‑and‑assumptions"></a>
254
+ ## Integration points and assumptions
255
+
256
+ - **Config object** must conform to the schema defined in `autodocgenerator.auto_runner.config_reader`; malformed YAML raises `yaml.YAMLError`.
257
+ - **Custom modules** are expected to inherit from `CustomModule` and be instantiable without arguments.
258
+ - The global `API_KEY` is imported from `autodocgenerator.engine.config.config`; absence of a valid key will cause runtime authentication errors.
259
+ - The function is pure from the caller’s perspective – it returns the assembled markdown and leaves the filesystem untouched after execution.
260
+ <a name="integration‑points"></a>
261
+ ## Integration with the documentation pipeline
262
+
263
+ 1. After `order_doc` produces the final markdown, `custom_intro` is imported by the post‑processor stage.
264
+ 2. `get_all_html_links` extracts navigation anchors → fed to `get_links_intro`.
265
+ 3. `get_introdaction` receives the whole document for a high‑level intro.
266
+ 4. `generete_custom_discription` may be invoked with user‑specified topics to prepend targeted sections.
267
+ 5. The returned strings are concatenated and written back to `output_doc.md`.
268
+
269
+ All functions are pure apart from logging; they rely solely on the provided `Model` instance, making them trivially mockable for unit testing.
270
+ <a name="assumptions-and-side-effects"></a>
271
+ ## Assumptions and Side Effects
272
+
273
+ * The YAML must be syntactically valid; malformed input raises `yaml.YAMLError`.
274
+ * Missing optional keys default to empty collections or sensible defaults (`language` → `"en"`).
275
+ * No external I/O occurs; the function purely transforms in‑memory data, leaving the filesystem untouched.
276
+
277
+ This fragment is the entry point for configuration loading, feeding the rest of the ADG pipeline with a consistent, typed configuration object.
278
+ <a name="module-purpose"></a>
279
+ ## Purpose of **custom_intro** post‑processor
280
+
281
+ The module supplies a lightweight post‑processing pipeline that enriches the automatically generated documentation with anchor‑based navigation and optional introductory sections. It operates on the final markdown produced by the core generation flow and prepares ready‑to‑display HTML‑compatible fragments.
282
+ <a name="basemodule‑abstract‑contract"></a>
283
+ ## `BaseModule` – abstract generation contract
284
+ `BaseModule` defines the required interface for any documentation fragment generator. It inherits from `ABC` and mandates a `generate(info: dict, model: Model)` method, ensuring uniformity across plug‑in modules. Sub‑classes implement their own logic while receiving the raw `info` payload and a concrete `Model` instance.
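The contract can be pictured roughly as below; the stubbed `Model` type and the exact `generate` signature are assumptions drawn from the description above.

```python
from abc import ABC, abstractmethod
from typing import Any

class Model:  # stand-in for the package's Model base class
    def get_answer_without_history(self, prompt: list[dict[str, str]]) -> str: ...

class BaseModule(ABC):
    """Abstract contract every documentation fragment generator must satisfy."""

    @abstractmethod
    def generate(self, info: dict[str, Any], model: Model) -> str:
        """Return a markdown fragment built from `info` using `model`."""

class GreetingModule(BaseModule):
    # Toy subclass: real modules (IntroLinks, IntroText, CustomModule) call the LLM.
    def generate(self, info: dict[str, Any], model: Model) -> str:
        return f"## Fragment generated for language {info.get('language', 'en')}"
```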
285
+ <a name="manager‑orchestration‑role"></a>
286
+ ## `Manager` – orchestrating preprocessing, documentation generation, and post‑processing
287
+
288
+ **Responsibility** – Coordinates the end‑to‑end documentation pipeline: builds a *code‑mix* snapshot, splits it into manageable chunks, runs factory modules (e.g., `IntroLinks`, `IntroText`), orders the final markdown, and handles cache/log housekeeping.
289
+
290
+ **Interactions** –
291
+ - **Pre‑processors**: `CodeMix` (repo scanning), `gen_doc_parts` / `async_gen_doc_parts` (chunk‑wise generation).
292
+ - **Post‑processors**: `split_text_by_anchors`, `get_order` (re‑ordering).
293
+ - **Factories**: any `DocFactory` subclass supplying a list of modules that implement `generate_doc`.
294
+ - **Models**: synchronous `Model` or asynchronous `AsyncModel` supplied at construction.
295
+ - **UI**: `BaseProgress` updates progress bars; `BaseLogger` writes to the cache‑log file.
296
+ <a name="docfactory‑module‑orchestrator"></a>
297
+ ## `DocFactory` – orchestrator of documentation modules
298
+ `DocFactory` aggregates a sequence of `BaseModule` objects. Its `generate_doc` method creates a progress sub‑task, iterates through each module, concatenates their outputs, logs success and module content (verbosity level 2), updates progress, and finally returns the assembled documentation string. Errors propagate from individual modules; the factory itself does not alter content.
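A condensed sketch of that aggregation loop, with progress handling reduced to a single optional call; the constructor and method signatures are assumptions.

```python
# Illustrative only: concatenates each module's output in order.
class DocFactory:
    def __init__(self, *modules):
        self.modules = list(modules)

    def generate_doc(self, info: dict, model, progress_bar=None) -> str:
        # A fuller implementation would open a progress sub-task and log each module.
        parts = []
        for module in self.modules:
            parts.append(module.generate(info, model))
            if progress_bar is not None:
                progress_bar.update_task()
        return "\n\n".join(parts)
```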
299
+ <a name="factory‑doc‑augmentation"></a>
300
+ ## `factory_generate_doc` – applying modular enrichments
301
+
302
+ 1. Loads current `output_doc` and the original `code_mix`.
303
+ 2. Builds `info` dict (`language`, `full_data`, `code_mix`).
304
+ 3. Calls `doc_factory.generate_doc(info, sync_model, progress_bar)`.
305
+ 4. Prepends the factory result to the existing doc (`new_data = f"{result}\\n\\n{curr_doc}"`) and writes back.
306
+
307
+ The method is model‑agnostic; any `DocFactory` with a `modules` attribute (e.g., `IntroLinks`, `IntroText`) can contribute additional sections.
308
+ <a name="model-instantiation-and-manager-setup"></a>
309
+ ## Model instantiation and manager setup
310
+
311
+ ```python
312
+ sync_model = GPTModel(API_KEY, use_random=False)
313
+ async_model = AsyncGPTModel(API_KEY)
314
+
315
+ manager = Manager(
316
+ project_path,
317
+ config=config,
318
+ sync_model=sync_model,
319
+ async_model=async_model,
320
+ progress_bar=ConsoleGtiHubProgress(),
321
+ )
322
+ ```
323
+
324
+ - **GPTModel / AsyncGPTModel** – provide synchronous and asynchronous OpenAI API access, respectively.
325
+ - **ConsoleGtiHubProgress** – concrete progress‑bar implementation displayed in the terminal.
326
+ - **Manager** – core engine that holds state, coordinates factories, and writes the final document.
327
+ <a name="processing-steps"></a>
328
+ ## Processing Steps
329
+
330
+ 1. **Parse YAML** – `yaml.safe_load` yields a Python dict.
331
+ 2. **Instantiate Config** – default `Config()` created.
332
+ 3. **Populate core fields** – `language`, `project_name`, `project_additional_info`.
333
+ 4. **Load project settings** – `ProjectConfigSettings().load_settings(...)` then attached via `config.set_pcs`.
334
+ 5. **Register ignore patterns** – each pattern from `ignore_files` added with `config.add_ignore_file`.
335
+ 6. **Add supplemental info** – key/value pairs from `project_additional_info` stored via `config.add_project_additional_info`.
336
+ 7. **Create custom modules** – each string in `custom_descriptions` wrapped in `CustomModule`.
337
+ <a name="generation‑pipeline‑steps"></a>
338
+ ## Generation pipeline steps
339
+
340
+ | Call | Purpose |
341
+ |------|---------|
342
+ | `manager.generate_code_file()` | Scans the project, extracts source files, and stores a normalized code representation. |
343
+ | `manager.generete_doc_parts(max_symbols=5000)` | Produces raw documentation fragments (function signatures, docstrings, etc.) limited to `max_symbols` characters per chunk. |
344
+ | `manager.factory_generate_doc(DocFactory(*custom_modules))` | Runs a `DocFactory` built from user‑supplied `custom_modules` to inject bespoke sections (e.g., custom tutorials). |
345
+ | `manager.order_doc()` | Reorders fragments into a logical sequence (intro → modules → API reference). |
346
+ | `manager.factory_generate_doc(DocFactory(IntroLinks()))` | Adds a generated introductory links section using the built‑in `IntroLinks` module. |
347
+ | `manager.clear_cache()` | Purges temporary files and in‑memory caches to keep the workspace clean. |
348
+ <a name="document‑ordering‑step"></a>
349
+ ## `order_doc` – anchor‑based re‑ordering
350
+
351
+ - Splits `output_doc` into sections via `split_text_by_anchors`.
352
+ - Sends the list to `get_order(sync_model, sections)` which uses the LLM to compute the optimal sequence.
353
+ - Persists the reordered markdown.
354
+ <a name="anchor-extraction-and-chunk-splitting"></a>
355
+ ## Anchor Extraction & Chunk Splitting (`extract_links_from_start`, `split_text_by_anchors`)
356
+
357
+ **Responsibility** – Isolate markdown sections that begin with an HTML anchor (`<a name="…"></a>`) and build a mapping {anchor → section text}.
358
+ **Interactions** – Consumes raw markdown supplied by the post‑processor, emits a `dict[str,str]` used later by `get_order`. No external services; only `re` and the internal logger for debugging.
359
+ **Technical Details** –
360
+ * `extract_links_from_start` scans each pre‑split chunk with `^<a name=["']?(.*?)["']?</a>`; anchors shorter than six characters are discarded and a leading “#” is prefixed.
361
+ * `split_text_by_anchors` uses a positive‑lookahead split (`(?=<a name=["']?[^"\'>\s]{6,200}["']?</a>)`) to produce clean chunks, strips whitespace, validates a one‑to‑one count between anchors and chunks, and finally assembles the result dictionary.
362
+ **Data Flow** – Input: full markdown string. Output: `{ "#anchorName": "section markdown …" }` or `None` on mismatch. Side‑effects: optional `InfoLog` messages (not shown here).
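The splitting behaviour can be reproduced with the regexes quoted above. The following standalone sketch follows the same lookahead idea but is a simplification, not the package's exact code.

```python
import re

ANCHOR_RE = re.compile(r'<a name=["\']?([^"\'>\s]{6,200})["\']?></a>')

def split_text_by_anchors(markdown: str) -> dict[str, str] | None:
    # Split on a lookahead so each chunk keeps its leading anchor tag.
    pattern = r'(?=<a name=["\']?[^"\'>\s]{6,200}["\']?></a>)'
    chunks = [c.strip() for c in re.split(pattern, markdown) if c.strip()]
    anchors = []
    for chunk in chunks:
        match = ANCHOR_RE.match(chunk)
        if not match:
            return None          # anchor/chunk counts must line up one-to-one
        anchors.append("#" + match.group(1))
    return dict(zip(anchors, chunks))

doc = '<a name="install-guide"></a>\n## Install\n...\n<a name="config-options"></a>\n## Config\n...'
print(list(split_text_by_anchors(doc)))   # ['#install-guide', '#config-options']
```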
363
+ <a name="semantic-ordering-of-chunks"></a>
364
+ ## Semantic Ordering of Documentation Chunks (`get_order`)
365
+
366
+ **Responsibility** – Ask the LLM (`model`) to reorder the extracted sections so related topics are grouped logically.
367
+ **Interactions** – Receives the anchor‑to‑chunk map from the splitter, builds a single‑turn user prompt, calls `model.get_answer_without_history`, parses the comma‑separated title list, and concatenates the corresponding markdown blocks. Logging via `BaseLogger` records the input map, the raw LLM reply, and each block addition.
368
+ **Technical Details** –
369
+ * Prompt explicitly requests only a CSV list, preserving the leading “#” in titles.
370
+ * Result string split → `new_result` list, then ordered markdown assembled in `order_output`.
371
+ **Data Flow** – Input: `Model` instance, `dict[str,str]`. Output: a single ordered markdown string. No file I/O; side‑effects limited to logger entries.
372
+ <a name="data-splitting-logic"></a>
373
+ ## Data Splitting Logic (`split_data`)
374
+ **Responsibility**: Breaks a large source‑code string into chunks whose length does not exceed `max_symbols`.
375
+ **Interactions**: Relies on `BaseLogger` for progress messages; no external state.
376
+ **Technical Details**:
377
+ - Splits on line breaks, then iteratively halves any segment > 1.5 × `max_symbols`.
378
+ - Packs the refined segments into `split_objects`, starting a new chunk when the current one would exceed 1.25 × `max_symbols`.
379
+ **Data Flow**: `str` → list of `str` (chunks).
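A simplified stand-in for that chunking strategy, reusing the 1.5x and 1.25x thresholds mentioned above; the real `split_data` may differ in detail.

```python
def split_data(text: str, max_symbols: int) -> list[str]:
    # 1) Split on line breaks, halving any segment larger than 1.5 * max_symbols.
    segments: list[str] = []
    pending = text.split("\n")
    while pending:
        seg = pending.pop(0)
        if len(seg) > 1.5 * max_symbols:
            mid = len(seg) // 2
            pending[:0] = [seg[:mid], seg[mid:]]
        else:
            segments.append(seg)

    # 2) Pack segments into chunks, starting a new one past 1.25 * max_symbols.
    chunks, current = [], ""
    for seg in segments:
        if current and len(current) + len(seg) > 1.25 * max_symbols:
            chunks.append(current)
            current = ""
        current += seg + "\n"
    if current:
        chunks.append(current)
    return chunks

print([len(c) for c in split_data("x" * 40000, max_symbols=5000)])
```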
380
+ <a name="history-buffer-management"></a>
381
+ ## `History` – accumulating system‑ and conversation‑level messages
382
+
383
+ - **Inputs:** optional `system_prompt` (defaults to `BASE_SYSTEM_TEXT`).
384
+ - **State:** `self.history` – ordered list of `{role, content}` dicts.
385
+ - **Side‑effects:** `add_to_history` appends new entries, used by `Model`/`AsyncModel` to build the chat payload.
386
+ - **Assumptions:** callers respect role strings (`"system"`, `"user"`, `"assistant"`).
387
+ <a name="parentmodel‑selection‑logic"></a>
388
+ ## `ParentModel` – randomized model list preparation
389
+
390
+ During initialization it copies `MODELS_NAME`, optionally shuffles it (`use_random`), and stores the sequence in `self.regen_models_name`. `self.current_model_index` tracks the active model. This structure enables fail‑over cycling when a model call fails.
391
+ <a name="gptmodel‑synchronous‑wrapper"></a>
392
+ ## `GPTModel` – synchronous Groq client integration
393
+
394
+ - Constructs a `Groq` client with the supplied `api_key`.
395
+ - `generate_answer` selects the current model, attempts `client.chat.completions.create(messages, model)`, and on exception logs a warning, advances `current_model_index`, and retries until a model succeeds or the list is exhausted (raising `ModelExhaustedException`).
396
+ - Returns the content of the first choice and logs the result.
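The fail-over loop might look roughly like this; the chat-completions calls follow the public Groq SDK, while the model names, class name, and logging are placeholders.

```python
# Sketch of the fail-over loop described above; not the package's actual class.
from groq import Groq

class ModelExhaustedException(Exception):
    """Raised when every model in the rotation has failed."""

MODELS_NAME = ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"]  # illustrative names

class GPTModelSketch:
    def __init__(self, api_key: str):
        self.client = Groq(api_key=api_key)
        self.regen_models_name = list(MODELS_NAME)
        self.current_model_index = 0

    def generate_answer(self, messages: list[dict]) -> str:
        while self.current_model_index < len(self.regen_models_name):
            model_name = self.regen_models_name[self.current_model_index]
            try:
                chat_completion = self.client.chat.completions.create(
                    messages=messages, model=model_name
                )
                return chat_completion.choices[0].message.content
            except Exception:
                # On any API failure, warn (omitted here) and rotate to the next model.
                self.current_model_index += 1
        raise ModelExhaustedException("all configured models failed")
```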
397
+ <a name="asyncgptmodel‑asynchronous‑wrapper"></a>
398
+ ## `AsyncGPTModel` – async counterpart using `AsyncGroq`
399
+
400
+ Mirrors `GPTModel` logic but with `await` on `client.chat.completions.create`. Logging is identical, and the method signature is `async`. It enables non‑blocking generation in event‑driven workflows.
401
+
402
+ **Data Flow Summary**
403
+ Prompt (either full history or raw `prompt` arg) → `History`/caller → selected model name → Groq API call → `chat_completion` object → extracted `content` → logger → returned string. All errors funnel through the retry loop or raise `ModelExhaustedException`.
404
+ <a name="model-exhausted-exception"></a>
405
+ ## `ModelExhaustedException` – signaling depletion of model pool
406
+
407
+ `ModelExhaustedException` derives from `Exception` and is raised when `regen_models_name` becomes empty. It bubbles up to the caller, forcing upstream logic (e.g., factories or UI) to abort or retry with a different configuration.
408
+ <a name="compress-function"></a>
409
+ ## Compress – Single‑Pass Summarization
410
+ **Responsibility**: Sends a raw text chunk to the LLM with a system prompt built from `ProjectSettings` and a configurable compression baseline.
411
+ **Interactions**: Calls `model.get_answer_without_history`; reads `project_settings.prompt` and `get_BASE_COMPRESS_TEXT`.
412
+ **Technical Details**: Constructs a three‑message list (`system`, `system`, `user`) and returns the LLM’s answer verbatim.
413
+ **Data Flow**: `data: str` → LLM request → `str` answer.
414
+ <a name="compress-and-compare-sync"></a>
415
+ ## Compress & Compare (synchronous)
416
+ **Responsibility**: Groups input strings into `compress_power`‑sized batches, compresses each element, and concatenates results per batch.
417
+ **Interactions**: Uses `compress`; updates a `BaseProgress` sub‑task.
418
+ **Technical Details**: Pre‑allocates a result list sized `ceil(len(data)/compress_power)`, iterates with index division, appends compressed text plus newline.
419
+ **Data Flow**: `list[str]` → list of combined batch strings.
420
+ <a name="compress-to-one-loop"></a>
421
+ ## Compress to One – Iterative Reduction
422
+ **Responsibility**: Repeatedly compresses the list until a single aggregated summary remains.
423
+ **Interactions**: Switches between sync/async paths based on `use_async`; each iteration calls the appropriate batch function.
424
+ **Technical Details**: Dynamically lowers `compress_power` when remaining items < `compress_power+1`; counts iterations for diagnostics.
425
+ **Data Flow**: `list[str]` → final `str` summary.
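A sketch of the reduction loop with the LLM call stubbed out so only the control flow remains; batch sizing is an approximation of the behaviour described above.

```python
# Reduction loop sketch: fake_compress stands in for a real LLM summarization call.
def fake_compress(batch: list[str]) -> str:
    return " ".join(batch)[:200]

def compress_to_one(data: list[str], compress_power: int = 3) -> str:
    iterations = 0
    while len(data) > 1:
        # Shrink the batch size once fewer than compress_power + 1 items remain.
        power = min(compress_power, max(2, len(data)))
        data = [
            fake_compress(data[i:i + power])
            for i in range(0, len(data), power)
        ]
        iterations += 1
    return data[0]

print(compress_to_one([f"chunk {i} " * 50 for i in range(10)]))
```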
426
+ <a name="async-compress-function"></a>
427
+ ## Async Compress – Concurrency‑Safe Summarization
428
+ **Responsibility**: Same as `compress` but respects an `asyncio.Semaphore` to limit parallel LLM calls and updates progress.
429
+ **Interactions**: Awaits `model.get_answer_without_history`; shares the same prompt structure.
430
+ **Technical Details**: Wrapped in `async with semaphore`; returns the answer after `progress_bar.update_task()`.
431
+ **Data Flow**: `str` → async LLM request → `str`.
432
+ <a name="async-compress-and-compare"></a>
433
+ ## Async Compress & Compare
434
+ **Responsibility**: Parallel version of batch compression.
435
+ **Interactions**: Spawns one `async_compress` task per element, gathers results, then re‑chunks them into batches of `compress_power`.
436
+ **Technical Details**: Uses a fixed 4‑slot semaphore, creates a progress sub‑task, joins with `asyncio.gather`.
437
+ **Data Flow**: `list[str]` → list of batch strings.
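The concurrency pattern (a four-slot semaphore plus `asyncio.gather`) can be sketched as follows; the stubbed compression call stands in for the awaited LLM request.

```python
import asyncio

async def fake_async_compress(text: str, semaphore: asyncio.Semaphore) -> str:
    # The semaphore caps how many requests run concurrently (4 in the description above).
    async with semaphore:
        await asyncio.sleep(0.01)          # stands in for the awaited LLM call
        return text[:100]

async def async_compress_and_compare(data: list[str], compress_power: int = 3) -> list[str]:
    semaphore = asyncio.Semaphore(4)
    compressed = await asyncio.gather(
        *(fake_async_compress(item, semaphore) for item in data)
    )
    # Re-chunk the individual summaries into batches of `compress_power`.
    return [
        "\n".join(compressed[i:i + compress_power])
        for i in range(0, len(compressed), compress_power)
    ]

print(asyncio.run(async_compress_and_compare([f"part {i} " * 40 for i in range(7)])))
```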
438
+ <a name="compression-prompt-generator"></a>
439
+ ## `get_BASE_COMPRESS_TEXT` – dynamic prompt builder
440
+
441
+ ```python
442
+ def get_BASE_COMPRESS_TEXT(start, power):
443
+ return f\"\"\"
444
+ You will receive a large code snippet (up to ~{start} characters).
445
+ ...
446
+ ```
447
+ This helper creates a size‑aware instruction block for summarising large code fragments. Parameters:
448
+ - **start** – approximate maximum character count of the incoming snippet.
449
+ - **power** – divisor controlling the length of the summary (`~start/power`).
450
+
451
+ The function interpolates these values into a template that directs the model to extract architecture, produce a concise summary, and emit a strict usage example. It returns the formatted string for later concatenation with other prompt pieces.
452
+ <a name="custommodule‑custom‑description‑generator"></a>
453
+ ## `CustomModule` – custom description generator
454
+ Initialised with a `discription` string, `CustomModule.generate` splits the mixed code (`info["code_mix"]`) to ≤ 7000 symbols via `split_data`, then calls `generete_custom_discription` (post‑processor) with the split data, the provided `model`, the stored description, and the target language. The returned text becomes the module’s contribution.
455
+ <a name="custom‑description‑generation"></a>
456
+ ## `generete_custom_discription(splited_data: str, model: Model, custom_description: str, language: str = "en") → str`
457
+
458
+ *Responsibility* – Iterates over pre‑split documentation fragments, asking the LLM to produce a concise, anchor‑prefixed description for a user‑defined topic (`custom_description`).
459
+ *Logic Flow*
460
+
461
+ 1. For each `sp_data` in `splited_data` construct a multi‑system‑message prompt:
462
+ * language directive,
463
+ * role description (“Technical Analyst”),
464
+ * strict rule block enforcing *zero‑hallucination* and mandatory single `<a name="…"></a>` tag,
465
+ * the fragment context,
466
+ * the task description.
467
+ 2. Call `model.get_answer_without_history`.
468
+ 3. If the result does **not** contain the sentinel `!noinfo` / "No information found" (or the sentinel appears only after position 30), break the loop and return the answer; otherwise continue with the next fragment.
469
+
470
+ *Side‑effects* – None; all I/O is through the LLM and logging performed implicitly by the model or caller.
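A control-flow sketch of that loop, assuming the described sentinel handling; the prompt strings and the toy model are placeholders, and the real rule text lives in `BASE_CUSTOM_DISCRIPTIONS`.

```python
# Control-flow sketch; the model object is whatever Model implementation the pipeline passes in.
def generete_custom_discription_sketch(splited_data, model, custom_description, language="en"):
    answer = ""
    for sp_data in splited_data:
        prompt = [
            {"role": "system", "content": f"Answer in {language}."},
            {"role": "system", "content": "Act as a precise Technical Analyst."},
            {"role": "system", "content": f"### Context: {sp_data}"},
            {"role": "system", "content": "Strict rules: use only the context, else return !noinfo."},
            {"role": "user", "content": custom_description},
        ]
        answer = model.get_answer_without_history(prompt)
        # Accept the answer unless a no-information sentinel appears near the start.
        sentinel_pos = max(answer.find("!noinfo"), answer.find("No information found"))
        if sentinel_pos == -1 or sentinel_pos > 30:
            break
    return answer

class EchoModel:
    # Trivial stand-in that "finds" information only in the fragment mentioning auth.
    def get_answer_without_history(self, prompt):
        context = prompt[2]["content"]
        return '<a name="auth-flow"></a> Auth is token based.' if "auth" in context else "!noinfo"

print(generete_custom_discription_sketch(
    ["logging setup code", "auth middleware code"],
    EchoModel(), "How does authentication work?",
))
```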
471
+ <a name="generate-descriptions"></a>
472
+ ## Generate Descriptions for Code
473
+ **Responsibility**: Queries the LLM for a structured developer‑facing description of each source file.
474
+ **Interactions**: Sends a fixed instructional system prompt plus the code snippet; logs progress.
475
+ **Technical Details**: Iterates over `data`, builds a two‑message prompt, collects answers in order.
476
+ **Data Flow**: `list[str]` (code) → list of LLM‑generated markdown descriptions.
477
+ <a name="code‑mix‑generation‑method"></a>
478
+ ## `generate_code_file` – building the repository snapshot
479
+
480
+ 1. Logs start.
481
+ 2. Instantiates `CodeMix(project_directory, config.ignore_files)`.
482
+ 3. Calls `cm.build_repo_content` → writes the mixed source to `code_mix.txt`.
483
+ 4. Logs completion and advances the progress bar.
484
+ <a name="repository-content-aggregation-codemix"></a>
485
+ ## Repository Content Aggregation (`CodeMix` class)
486
+
487
+ **Responsibility** – Produce a linear textual representation of a repository’s directory tree and file contents, while respecting an ignore list.
488
+ **Interactions** – Called by the pre‑processor stage; writes to a user‑specified output file. Relies on `BaseLogger` for ignored‑path notices; does not invoke the LLM.
489
+ **Technical Details** –
490
+ * `should_ignore` evaluates a `Path` against `ignore_patterns` using `fnmatch` on the full relative path, basename, and each path component.
491
+ * `build_repo_content` iterates twice over `root_dir.rglob("*")`: first to emit the hierarchical tree (indentation based on depth), second to embed each non‑ignored file inside `<file path="…">` tags. Errors while reading files are captured and written inline.
492
+ **Data Flow** – Input: root directory path, ignore pattern list, optional output filename. Output: side‑effect – a text file (`repomix-output.txt` by default) containing the structured dump. Logging side‑effects report ignored entries and read errors.
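The ignore check can be approximated with `fnmatch` as described; this sketch matches the relative path, the basename, and each path component, though the exact precedence in `CodeMix` is an assumption.

```python
import fnmatch
from pathlib import Path

def should_ignore(path: Path, root: Path, ignore_patterns: list[str]) -> bool:
    # Match against the full relative path, the basename, and every path component.
    rel = path.relative_to(root).as_posix()
    candidates = [rel, path.name, *rel.split("/")]
    return any(
        fnmatch.fnmatch(candidate, pattern)
        for candidate in candidates
        for pattern in ignore_patterns
    )

root = Path("project")
target = root / ".auto_doc_cache" / "report.txt"
print(should_ignore(target, root, [".auto_doc_cache", "*.pyc"]))   # True
```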
493
+ <a name="sync-part-doc-generation"></a>
494
+ ## Synchronous Part Documentation Generation (`write_docs_by_parts`)
495
+ **Responsibility**: Sends a single chunk to the LLM and returns the raw markdown response.
496
+ **Interactions**: Uses `BASE_PART_COMPLITE_TEXT`, optional `prev_info`, and a `Model` instance; logs via `BaseLogger`.
497
+ **Technical Details**: Builds a 2‑ or 3‑message prompt (system → language/id, system → base prompt, optional system → prior info, user → code). Calls `model.get_answer_without_history`. Strips surrounding triple back‑ticks if present.
498
+ **Data Flow**: `(part_id, part, Model, prev_info?)` → `str` (LLM answer).
499
+ <a name="synchronous‑doc‑parts‑generation"></a>
500
+ ## `generete_doc_parts` – synchronous chunked documentation
501
+
502
+ - Reads the full code‑mix.
503
+ - Calls `gen_doc_parts(full_code_mix, max_symbols, sync_model, config.language, progress_bar)`.
504
+ - Writes the resulting markdown to `output_doc.md` and updates progress.
505
+ - Provides a clear input‑output contract: **input** – raw code text; **output** – partially generated documentation limited by `max_symbols`.
506
+ <a name="async-part-doc-generation"></a>
507
+ ## Asynchronous Part Documentation Generation (`async_write_docs_by_parts`)
508
+ **Responsibility**: Same as the sync variant but runs under an `asyncio.Semaphore` to limit concurrent LLM calls.
509
+ **Interactions**: Accepts `AsyncModel`, optional `prev_info`, optional `update_progress` callback, and a shared `semaphore`.
510
+ **Technical Details**: `async with semaphore` guards the request; prompt composition mirrors the sync version; result trimming identical; invokes `update_progress` after the LLM call.
511
+ **Data Flow**: `(part, AsyncModel, semaphore, …)` → `await` → `str`.
512
+ <a name="batch-doc-generation-sync"></a>
513
+ ## Batch Documentation Generation (Synchronous) (`gen_doc_parts`)
514
+ **Responsibility**: Orchestrates full‑code documentation by splitting the input, iterating over chunks, and concatenating the LLM outputs.
515
+ **Interactions**: Calls `split_data`, `write_docs_by_parts`, and updates a `BaseProgress` sub‑task.
516
+ **Technical Details**: After each part, retains the last 3000 characters as context for the next call (`prev_info`). Progress bar is incremented per chunk.
517
+ **Data Flow**: `(full_code_mix, max_symbols, Model, language)` → `str` (complete documentation).
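A sketch of the sequential orchestration with the LLM call stubbed out; it uses naive fixed-size slicing instead of `split_data` so the 3000-character context chaining stands out.

```python
# Sequential orchestration sketch; the stub replaces the real write_docs_by_parts call.
def write_docs_by_parts_stub(part_id, part, model, prev_info=""):
    return f'<a name="part-{part_id}"></a>\n## Part {part_id}\n{part[:80]}'

def gen_doc_parts(full_code_mix: str, max_symbols: int, model=None) -> str:
    parts = [full_code_mix[i:i + max_symbols] for i in range(0, len(full_code_mix), max_symbols)]
    output, prev_info = [], ""
    for part_id, part in enumerate(parts):
        answer = write_docs_by_parts_stub(part_id, part, model, prev_info)
        output.append(answer)
        prev_info = answer[-3000:]      # carry the tail of the previous answer as context
    return "\n\n".join(output)

print(gen_doc_parts("def f():\n    return 1\n" * 800, max_symbols=5000)[:200])
```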
518
+ <a name="batch-doc-generation-async"></a>
519
+ ## Batch Documentation Generation (Asynchronous) (`async_gen_doc_parts`)
520
+ **Responsibility**: Parallel version of `gen_doc_parts` using `asyncio.gather`.
521
+ **Interactions**: Shares the same splitter, creates a semaphore (max 4 parallel calls), and updates `BaseProgress` via a lambda.
522
+ **Technical Details**: Builds a list of `async_write_docs_by_parts` tasks, gathers results, and concatenates them with double newlines.
523
+ **Data Flow**: `(full_code_mix, global_info, max_symbols, AsyncModel, language)` → `await` → `str` (full documentation).
524
+ <a name="introlinks‑html‑link‑extraction‑intro"></a>
525
+ ## `IntroLinks` – HTML link extraction and intro generation
526
+ `IntroLinks.generate` extracts all HTML links from `info["full_data"]` using `get_all_html_links`, then produces a links‑focused introduction via `get_links_intro`, passing the link list, `model`, and language. The resulting markdown/HTML snippet is returned.
527
+ <a name="html-link-extraction"></a>
528
+ ## `get_all_html_links(data: str) → list[str]`
529
+
530
+ *Responsibility* – Scans the supplied markdown for `<a name="…"></a>` anchors and returns a list of fragment identifiers prefixed with `#`.
531
+ *Interactions* – Uses **BaseLogger** to emit progress messages; no external services.
532
+ *Logic* – Compiles a regex `r'<a name=["\']?(.*?)["\']?></a>'`, iterates over `re.finditer`, keeps anchors longer than five characters, logs count and content, returns the collected list.
533
+ <a name="introtext‑global‑introduction‑assembly"></a>
534
+ ## `IntroText` – global introduction assembly
535
+ `IntroText.generate` retrieves a high‑level description from `info["global_data"]` and creates a narrative introduction with `get_introdaction`, again using the supplied `model` and language. The final intro text is emitted for later concatenation.
536
+ <a name="global‑intro‑generation"></a>
537
+ ## `get_introdaction(global_data: str, model: Model, language: str = "en") → str`
538
+
539
+ *Responsibility* – Generates a generic project overview based on the complete documentation text (`global_data`).
540
+ *Interactions* – Prompt comprises `BASE_INTRO_CREATE` plus the full markdown as user content; result is obtained from the same LLM endpoint as above. No logging inside the function (caller may wrap).
541
+ <a name="link‑driven‑intro-generation"></a>
542
+ ## `get_links_intro(links: list[str], model: Model, language: str = "en") → str`
543
+
544
+ *Responsibility* – Calls the supplied LLM (`model`) to synthesize a short introductory paragraph that references the provided link list.
545
+ *Interactions* – Builds a system‑message prompt containing `BASE_INTRODACTION_CREATE_TEXT`, adds the link list as user content, forwards the prompt to `model.get_answer_without_history`. Logs before/after invocation.
546
+ *Output* – Raw LLM response string intended for insertion at the top of the documentation.
547
+ <a name="cache‑initialisation‑paths"></a>
548
+ ## Cache folder and file‑path helpers
549
+
550
+ - `CACHE_FOLDER_NAME = ".auto_doc_cache"` and `FILE_NAMES` map logical keys to filenames (`code_mix.txt`, `global_info.md`, etc.).
551
+ - `__init__` creates the cache directory if missing, configures a file logger (`FileLoggerTemplate`) and stores injected `config`, `project_directory`, models, and progress bar.
552
+ - `get_file_path(key)` builds an absolute path inside the cache; `read_file_by_file_key(key)` returns its UTF‑8 contents.
553
+ <a name="cache‑cleanup‑routine"></a>
554
+ ## `clear_cache` – optional log removal
555
+
556
+ If `config.pcs.save_logs` is `False`, deletes the `report.txt` file, leaving other cached artifacts untouched.
557
+
558
+ **Data flow summary** – Input files → `CodeMix` → chunk generation → factory enrichment → ordering → final `output_doc.md`. All steps log progress, respect the user‑provided language setting, and optionally clean up temporary logs.
559
+ <a name="logging-infrastructure"></a>
560
+ ## Logging Infrastructure (`BaseLog`, `BaseLoggerTemplate`, `FileLoggerTemplate`, `BaseLogger`)
561
+ **Responsibility**: Provides typed log objects (Error/Warning/Info) and a singleton logger that forwards messages to a configurable template (console or file).
562
+ **Interactions**: `BaseLogger.set_logger()` injects a `BaseLoggerTemplate`; all calls route through `global_log` respecting the global `log_level`.
563
+ **Technical Details**: `BaseLog.format()` yields the raw message; subclasses prepend a timestamp and severity. `BaseLogger.__new__` guarantees a single instance.
564
+ **Data Flow**: `BaseLog` → `str` (formatted line) → `print` or file append.
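A minimal sketch of the singleton-plus-template pattern described here; the method names `set_logger` and `global_log` come from the text above, while the timestamp format and default `log_level` are assumptions.

```python
import datetime

class BaseLoggerTemplate:
    def write(self, line: str) -> None:
        print(line)                       # console template; a file template would append to report.txt

class BaseLogger:
    _instance = None

    def __new__(cls):
        # Guarantee a single shared logger instance.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.template = BaseLoggerTemplate()
            cls._instance.log_level = 1
        return cls._instance

    def set_logger(self, template: BaseLoggerTemplate) -> None:
        self.template = template

    def global_log(self, message: str, level: int = 1) -> None:
        if level <= self.log_level:
            self.template.write(f"[{datetime.datetime.now():%H:%M:%S}] INFO {message}")

BaseLogger().global_log("pipeline started")
print(BaseLogger() is BaseLogger())       # True: same instance
```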
565
+ <a name="rich-console-progress"></a>
566
+ ## Rich‑Console Progress (`LibProgress`)
567
+ **Responsibility**: Wraps *rich*’s `Progress` to expose a generic sub‑task API used by the documentation pipeline.
568
+ **Interactions**: Created with a shared `Progress` object; `create_new_subtask` registers a child task, `update_task` advances either the sub‑task or the main task, `remove_subtask` discards the current child.
569
+ **Technical Details**: Maintains `_base_task` and `_cur_sub_task` IDs; advances are atomic calls to `Progress.update`.
570
+ **Data Flow**: Calls → `Progress.update` → visual progress bar.
571
+ <a name="console-progress"></a>
572
+ ## Console‑Based Progress (`ConsoleGtiHubProgress` & `ConsoleTask`)
573
+ **Responsibility**: Emits simple stdout progress for environments without *rich*.
574
+ **Interactions**: `ConsoleGtiHubProgress.create_new_subtask` spawns a `ConsoleTask`; `update_task` increments either the active sub‑task or a generic “General Progress” task.
575
+ **Technical Details**: `ConsoleTask.progress()` computes percentage and prints a line; removal clears the reference.
576
+ **Data Flow**: Update call → printed percentage line.
577
+
@@ -3,7 +3,7 @@ autodocgenerator/auto_runner/config_reader.py,sha256=tma7PdzdRxrlaDXs8ucBWawS0QE
3
3
  autodocgenerator/auto_runner/run_file.py,sha256=WouFAUXCH5lCD5IQA88BwYMeTzbBsl7kX97aXxj5Ufk,1389
4
4
  autodocgenerator/config/config.py,sha256=M6q999CHKh91hKwxjgBKA6q816XO5BEGj53m1t17On8,1782
5
5
  autodocgenerator/engine/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
6
- autodocgenerator/engine/config/config.py,sha256=V2FtHeCxbDRsnVzeTPG2TVqGcD6zG5x4s1t01a191pM,6782
6
+ autodocgenerator/engine/config/config.py,sha256=Ov2SJlJR_9n7aOZk_6jZzHAsyMNFisseAUUJgi6RGSQ,8009
7
7
  autodocgenerator/engine/exceptions.py,sha256=pvohRlWSVBdZ8FkdafHBIVKxjbOBX6A0x4yZGHyuc2I,118
8
8
  autodocgenerator/engine/models/gpt_model.py,sha256=SSXj1o1GV2Y3hQZ7B7WOWTjuLE_iSYqltWL0qHP_yOY,3541
9
9
  autodocgenerator/engine/models/model.py,sha256=LaN3andzG3czmy0jFU60P9ji33BAi1rmqWM2titVZ8w,1901
@@ -12,7 +12,7 @@ autodocgenerator/factory/base_factory.py,sha256=NsAWFoTO14XjJ2w9WseoBUb75ooYHEVV
12
12
  autodocgenerator/factory/modules/general_modules.py,sha256=7Ukebf-7JOgOQeeIkRfmcCFwxBoJgZN-mUEE6ZYo3jc,556
13
13
  autodocgenerator/factory/modules/intro.py,sha256=0pPz9pL0WpD098J7CFvLC6g144LWQAHVLNHuxETY25c,606
14
14
  autodocgenerator/manage.py,sha256=y9XFyfTUZQgMmm91vfG2RQiPGIdFcyp3UcFz68qhhks,4930
15
- autodocgenerator/postprocessor/custom_intro.py,sha256=EaF-rTCmKvCdHJ3o8Af3t5i_uLAFaZpw5Wcvkv1dVoU,4396
15
+ autodocgenerator/postprocessor/custom_intro.py,sha256=Poq1JBaDbrH5HOQZlVDoR9JOlmDIdcP-Jn_q5hUqKoI,3206
16
16
  autodocgenerator/postprocessor/sorting.py,sha256=KAPSdwUtxmYBMSajI7V7DVzpiFKpBspHLYMXTOXeZxM,2320
17
17
  autodocgenerator/preprocessor/code_mix.py,sha256=KdBXqHtYkoi-g2f6Kq0IsT9jtkGvt2LWm7w9d-2I1AY,2502
18
18
  autodocgenerator/preprocessor/compressor.py,sha256=VQejX55aD5gwC_ov7JV0vW9DNhFP0G3_zYEp38pV4FY,5173
@@ -21,6 +21,6 @@ autodocgenerator/preprocessor/spliter.py,sha256=kjU4WRBepLI0JrW7BXSmHb3VxzjzBwVc
21
21
  autodocgenerator/ui/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
22
22
  autodocgenerator/ui/logging.py,sha256=r0dWxYvShJjbJVM5P4NYmEvZybrwSGx8Th2ybLOCZuE,1684
23
23
  autodocgenerator/ui/progress_base.py,sha256=LCa-Df_JWvqWRgzglZyu-rM4X2q8pJ1OlXTAbc2v3Ls,1947
24
- autodocgenerator-0.8.9.6.dist-info/METADATA,sha256=xD9988W2ywLdXXVi406ETr9OJYHmP3ipQc0LULVoMqk,26959
25
- autodocgenerator-0.8.9.6.dist-info/WHEEL,sha256=3ny-bZhpXrU6vSQ1UPG34FoxZBp3lVcvK0LkgUz6VLk,88
26
- autodocgenerator-0.8.9.6.dist-info/RECORD,,
24
+ autodocgenerator-0.8.9.7.dist-info/METADATA,sha256=FvFiyz2lJcps1MmUdWE2aKpEXAhO4YNs1RPqYmGeVbY,42485
25
+ autodocgenerator-0.8.9.7.dist-info/WHEEL,sha256=3ny-bZhpXrU6vSQ1UPG34FoxZBp3lVcvK0LkgUz6VLk,88
26
+ autodocgenerator-0.8.9.7.dist-info/RECORD,,
@@ -1,419 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: autodocgenerator
3
- Version: 0.8.9.6
4
- Summary: This Project helps you to create docs for your projects
5
- License: MIT
6
- Author: dima-on
7
- Author-email: sinica911@gmail.com
8
- Requires-Python: >=3.11,<4.0
9
- Classifier: License :: OSI Approved :: MIT License
10
- Classifier: Programming Language :: Python :: 3
11
- Classifier: Programming Language :: Python :: 3.11
12
- Classifier: Programming Language :: Python :: 3.12
13
- Classifier: Programming Language :: Python :: 3.13
14
- Classifier: Programming Language :: Python :: 3.14
15
- Requires-Dist: CacheControl (==0.14.4)
16
- Requires-Dist: Pygments (==2.19.2)
17
- Requires-Dist: RapidFuzz (==3.14.3)
18
- Requires-Dist: annotated-types (==0.7.0)
19
- Requires-Dist: anyio (==4.12.1)
20
- Requires-Dist: certifi (==2026.1.4)
21
- Requires-Dist: charset-normalizer (==3.4.4)
22
- Requires-Dist: cleo (==2.1.0)
23
- Requires-Dist: colorama (==0.4.6)
24
- Requires-Dist: crashtest (==0.4.1)
25
- Requires-Dist: distlib (==0.4.0)
26
- Requires-Dist: distro (==1.9.0)
27
- Requires-Dist: dulwich (==0.25.2)
28
- Requires-Dist: fastjsonschema (==2.21.2)
29
- Requires-Dist: filelock (==3.20.3)
30
- Requires-Dist: findpython (==0.7.1)
31
- Requires-Dist: google-auth (==2.47.0)
32
- Requires-Dist: google-genai (==1.56.0)
33
- Requires-Dist: groq (==1.0.0)
34
- Requires-Dist: h11 (==0.16.0)
35
- Requires-Dist: httpcore (==1.0.9)
36
- Requires-Dist: httpx (==0.28.1)
37
- Requires-Dist: idna (==3.11)
38
- Requires-Dist: installer (==0.7.0)
39
- Requires-Dist: jaraco.classes (==3.4.0)
40
- Requires-Dist: jaraco.context (==6.1.0)
41
- Requires-Dist: jaraco.functools (==4.4.0)
42
- Requires-Dist: jiter (==0.12.0)
43
- Requires-Dist: keyring (==25.7.0)
44
- Requires-Dist: markdown-it-py (==4.0.0)
45
- Requires-Dist: mdurl (==0.1.2)
46
- Requires-Dist: more-itertools (==10.8.0)
47
- Requires-Dist: msgpack (==1.1.2)
48
- Requires-Dist: openai (==2.14.0)
49
- Requires-Dist: packaging (==25.0)
50
- Requires-Dist: pbs-installer (==2026.1.14)
51
- Requires-Dist: pkginfo (==1.12.1.2)
52
- Requires-Dist: platformdirs (==4.5.1)
53
- Requires-Dist: pyasn1 (==0.6.1)
54
- Requires-Dist: pyasn1_modules (==0.4.2)
55
- Requires-Dist: pydantic (==2.12.5)
56
- Requires-Dist: pydantic_core (==2.41.5)
57
- Requires-Dist: pyproject_hooks (==1.2.0)
58
- Requires-Dist: python-dotenv (==1.2.1)
59
- Requires-Dist: pywin32-ctypes (==0.2.3)
60
- Requires-Dist: pyyaml (==6.0.3)
61
- Requires-Dist: requests (==2.32.5)
62
- Requires-Dist: requests-toolbelt (==1.0.0)
63
- Requires-Dist: rich (==14.2.0)
64
- Requires-Dist: rich_progress (==0.4.0)
65
- Requires-Dist: rsa (==4.9.1)
66
- Requires-Dist: shellingham (==1.5.4)
67
- Requires-Dist: sniffio (==1.3.1)
68
- Requires-Dist: tenacity (==9.1.2)
69
- Requires-Dist: tomlkit (==0.14.0)
70
- Requires-Dist: tqdm (==4.67.1)
71
- Requires-Dist: trove-classifiers (==2026.1.14.14)
72
- Requires-Dist: typing-inspection (==0.4.2)
73
- Requires-Dist: typing_extensions (==4.15.0)
74
- Requires-Dist: urllib3 (==2.6.2)
75
- Requires-Dist: virtualenv (==20.36.1)
76
- Requires-Dist: websockets (==15.0.1)
77
- Requires-Dist: zstandard (==0.25.0)
78
- Description-Content-Type: text/markdown
79
-
80
- ## Executive Navigation Tree
81
- - 📂 Setup & Configuration
82
- - [Install Workflow Setup](#install-workflow-setup)
83
- - [Configuration Loading And Validation](#configuration-loading-and-validation)
84
- - [Manager Configuration](#manager-configuration)
85
- - [Projectsettings Prompt Builder](#projectsettings-prompt-builder)
86
- - [Command Line Invocation Logic](#command-line-invocation-logic)
87
-
88
- - ⚙️ Documentation Generation
89
- - [Documentation Pipeline Trigger](#documentation-pipeline-trigger)
90
- - [Execution Flow Summary](#execution-flow-summary)
91
- - [Documentation Generation Workflow](#documentation-generation-workflow)
92
- - [Autodocfile Parameters](#autodocfile-parameters)
93
- - [Docfactory Orchestration](#docfactory-orchestration)
94
- - [Global Introduction Generation](#global-introduction-generation)
95
- - [Intro With Links Generation](#intro-with-links-generation)
96
- - [Custom Description Generation](#custom-description-generation)
97
- - [Anchor‑Ordering‑Cleanup](#anchor‑ordering‑cleanup)
98
- - [Anchor‑Chunk‑Splitting](#anchor‑chunk‑splitting)
99
- - [Semantic‑Ordering](#semantic‑ordering)
100
- - [Html Link Extraction](#html-link-extraction)
101
- - [Factory‑Doc‑Assembly](#factory‑doc‑assembly)
102
- - [Doc‑Parts‑Generation](#doc‑parts‑generation)
103
- - [Custommodule Intro Modules](#custommodule-intro-modules)
104
-
105
- - 🤖 Model Orchestration
106
- - [Asyncgptmodel Implementation](#asyncgptmodel-implementation)
107
- - [Gptmodel Synchronous Flow](#gptmodel-synchronous-flow)
108
- - [Parentmodel Setup And Rotation](#parentmodel-setup-and-rotation)
109
- - [Synchronous‑Part‑Doc‑Generator](#synchronous‑part‑doc‑generator)
110
- - [Asynchronous‑Part‑Doc‑Generator](#asynchronous‑part‑doc‑generator)
111
- - [Synchronous‑Multi‑Part‑Orchestrator](#synchronous‑multi‑part‑orchestrator)
112
- - [Asynchronous‑Multi‑Part‑Orchestrator](#asynchronous‑multi‑part‑orchestrator)
113
-
114
- - 🔀 Data Splitting & Repository
115
- - [Spliter Entry Point](#spliter-entry-point)
116
- - [Data‑Splitting Loop](#data‑splitting‑loop)
117
- - [Repository‑Mix Builder](#repository‑mix‑builder)
118
- - [Code‑Mix Generation](#code‑mix‑generation)
119
-
120
- - 🗄️ Caching & Compression
121
- - [Cache‑File Access](#cache‑file‑access)
122
- - [Compress Function](#compress-function)
123
- - [Batch Compression Sync](#batch-compression-sync)
124
- - [Batch Compression Async](#batch-compression-async)
125
-
126
- - 📊 Logging & Progress
127
- - [Singleton‑Logger‑Implementation](#singleton‑logger‑implementation)
128
- - [Log‑Message‑Hierarchy](#log‑message‑hierarchy)
129
- - [Progress‑Abstraction](#progress‑abstraction)
130
- - [Rich‑Implementation](#rich‑implementation)
131
- - [Console‑Task‑Helper](#console‑task‑helper)
132
- - [Fallback‑Console‑Progress](#fallback‑console‑progress)
133
-
134
-
135
-
136
-
137
-
138
- <a name="install-workflow-setup"></a>
139
- To set up the installation workflow, run the PowerShell script on Windows using:
140
- `irm raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 | iex`
141
-
142
- On Linux‑based systems, execute the shell script with:
143
- `curl -sSL raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash`
144
-
145
- Additionally, add a secret variable named **GROCK_API_KEY** to your GitHub Actions configuration, containing the API key obtained from the Grock documentation site (grockdocs.com).
146
- <a name="configuration-loading-and-validation"></a>
147
- ## Configuration Loading and Validation
148
-
149
- - **`load_config()`** reads the global settings file, merges environment overrides, and returns a structured config object.
150
- - Immediate checks ensure required keys (e.g., `source_path`, `output_dir`, `templates`) exist; missing keys raise `ConfigurationError`.
151
- - The validated config is the sole input to the pipeline, guaranteeing deterministic behavior.
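-
- A minimal sketch of the loading-and-validation behaviour described above (the file name, the environment-override scheme, and the `AUTODOC_*` variables are assumptions, not the package's actual API):
-
- ```python
- # Hypothetical sketch, not the package's real load_config().
- import os
- import yaml
-
- REQUIRED_KEYS = ("source_path", "output_dir", "templates")
-
- class ConfigurationError(Exception):
-     """Raised when a required configuration key is missing."""
-
- def load_config(path: str = "autodocconfig.yml") -> dict:
-     with open(path, encoding="utf-8") as fh:
-         config = yaml.safe_load(fh) or {}
-     # Environment variables may override file values (assumed convention).
-     for key in REQUIRED_KEYS:
-         env_value = os.environ.get(f"AUTODOC_{key.upper()}")
-         if env_value:
-             config[key] = env_value
-     missing = [key for key in REQUIRED_KEYS if not config.get(key)]
-     if missing:
-         raise ConfigurationError(f"Missing required keys: {missing}")
-     return config
- ```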
152
- <a name="manager-configuration"></a>
153
- ## Manager Class – Configuration & State
154
-
155
- `Manager` orchestrates the end‑to‑end documentation pipeline. It receives the project root, a `Config` object, optional LLM model instances (`Model` / `AsyncModel`), and a progress UI (`BaseProgress`). During construction it:
156
-
157
- * Stores configuration and progress objects.
158
- * Instantiates a file‑based logger that writes to **report.txt** inside a hidden cache folder (`.auto_doc_cache`).
159
- * Guarantees the cache directory exists, creating it if necessary.
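-
- A hedged sketch of that constructor behaviour; parameter and attribute names follow this section and may differ from the real source:
-
- ```python
- # Illustrative only; the real class lives in autodocgenerator.manage.Manager.
- import os
-
- class Manager:
-     CACHE_DIR = ".auto_doc_cache"
-
-     def __init__(self, project_path: str, config, sync_model=None,
-                  async_model=None, progress_bar=None):
-         self.project_path = project_path
-         self.config = config
-         self.sync_model = sync_model
-         self.async_model = async_model
-         self.progress_bar = progress_bar
-         # Guarantee the hidden cache folder exists.
-         cache_path = os.path.join(project_path, self.CACHE_DIR)
-         os.makedirs(cache_path, exist_ok=True)
-         # File-based logger target inside the cache folder.
-         self.report_path = os.path.join(cache_path, "report.txt")
- ```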
160
- <a name="projectsettings-prompt-builder"></a>ProjectSettings Prompt Builder
161
- `ProjectSettings.prompt` concatenates the global `BASE_SETTINGS_PROMPT` with the project name and any key/value pairs added via `add_info`. The resulting string feeds the system role of every compression request.
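-
- A small illustration of that concatenation (attribute names and the joining format are assumptions):
-
- ```python
- # Sketch only; the real class may store and join its data differently.
- BASE_SETTINGS_PROMPT = "Act as a Project Knowledge Base. ..."  # abridged
-
- class ProjectSettings:
-     def __init__(self, project_name: str):
-         self.project_name = project_name
-         self.info: dict[str, str] = {}
-
-     def add_info(self, key: str, value: str) -> None:
-         self.info[key] = value
-
-     @property
-     def prompt(self) -> str:
-         pairs = "\n".join(f"{key}: {value}" for key, value in self.info.items())
-         return f"{BASE_SETTINGS_PROMPT}\nProject Name: {self.project_name}\n{pairs}"
- ```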
162
-
163
- ##
164
- <a name="command-line-invocation-logic"></a>
165
- ## Command‑Line Invocation Logic
166
-
167
- The `if __name__ == "__main__":` block parses CLI arguments (if any), invokes `load_config()` from `autodocgenerator.engine.config.config`, validates the returned dictionary, and passes it to `run_documentation_pipeline()` which orchestrates the end‑to‑end doc generation.
168
- <a name="documentation-pipeline-trigger"></a>
169
- ## Documentation Pipeline Trigger
170
-
171
- `run_documentation_pipeline(config)` sequentially executes:
172
-
173
- 1. **Source Discovery** – walks `config["source_path"]` to collect parsable modules.
174
- 2. **Parsing Engine** – feeds each file to the parser subsystem, producing intermediate ASTs.
175
- 3. **Renderer** – transforms ASTs into Markdown/HTML using the selected template set.
176
- 4. **Writer** – writes rendered files into `config["output_dir"]`, optionally cleaning stale artifacts.
177
-
178
- Side effects include filesystem writes, optional logging to `config["log_file"]`, and temporary cache creation.
179
- <a name="execution-flow-summary"></a>
180
- ## Execution Flow Summary
181
-
182
- 1. CLI start → load & validate config.
183
- 2. Valid config → `run_documentation_pipeline`.
184
- 3. Pipeline → generated documentation persisted on disk.
185
-
186
- All exceptions propagate to the top level, where a concise error message is printed and the process exits with a non‑zero status.
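-
- A minimal sketch of that top-level flow, with the pipeline body stubbed out; only the load, validate, run, and exit sequence is illustrated, and `load_config` refers to the sketch under "Configuration Loading and Validation":
-
- ```python
- # Sketch of the top-level flow; the pipeline body is a stub.
- import sys
-
- def run_documentation_pipeline(config: dict) -> None:
-     # Placeholder for discovery -> parsing -> rendering -> writing.
-     print(f"Generating docs from {config['source_path']} into {config['output_dir']}")
-
- if __name__ == "__main__":
-     try:
-         config = load_config()  # sketched earlier; validates required keys
-         run_documentation_pipeline(config)
-     except Exception as exc:
-         print(f"autodocgenerator failed: {exc}", file=sys.stderr)
-         sys.exit(1)
- ```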
187
- <a name="documentation-generation-workflow"></a>
188
- ## Documentation Generation Workflow
189
-
190
- **Responsibility**
191
- `autodocgenerator/auto_runner/run_file.py` orchestrates the end‑to‑end generation of project documentation. It loads the user configuration, instantiates synchronous and asynchronous GPT models, and creates a `Manager` that drives file parsing, doc‑part creation, factory‑based rendering, ordering, and cache cleanup before returning the assembled Markdown.
192
-
193
- **Interactions**
194
- - **Config Reader** – imports `read_config` from `config_reader.py` to parse *autodocconfig.yml*.
195
- - **Model Layer** – creates `GPTModel` (blocking) and `AsyncGPTModel` (non‑blocking) using the API key from `engine/config/config.py`.
196
- - **Manager** – `autodocgenerator.manage.Manager` receives the project path, configuration, models, and a progress bar (`ConsoleGtiHubProgress`).
197
- - **Factories** – `DocFactory` builds doc sections from custom modules (`CustomModule`) and from the built‑in `IntroLinks`.
198
- - **UI** – progress feedback is sent to a GitHub‑style console via `ConsoleGtiHubProgress`.
199
-
200
- **Technical Details**
201
- ```python
202
- def gen_doc(project_path: str, config: Config, custom_modules):
203
-     sync_model = GPTModel(API_KEY, use_random=False)
204
-     async_model = AsyncGPTModel(API_KEY)
205
-
206
-     manager = Manager(
207
-         project_path,
208
-         config=config,
209
-         sync_model=sync_model,
210
-         async_model=async_model,
211
-         progress_bar=ConsoleGtiHubProgress(),
212
-     )
213
-     # Core pipeline
214
-     manager.generate_code_file()
215
-     manager.generete_doc_parts(max_symbols=7000)
216
-     manager.factory_generate_doc(DocFactory(*custom_modules))
217
-     manager.order_doc()
218
-     manager.factory_generate_doc(DocFactory(IntroLinks()))
219
-     manager.clear_cache()
220
-     return manager.read_file_by_file_key("output_doc")
221
- ```
222
-
223
- - `generate_code_file()` scans the repository, respecting `Config.ignore_files`.
224
- - `generete_doc_parts()` chunks source code (≤ 7000 symbols) and queries the GPT models.
225
- - Two `factory_generate_doc` calls render custom user‑supplied modules first, then prepend standard intro links.
226
- - `order_doc()` ensures a logical section order before writing the final file.
227
-
228
- **Data Flow**
229
- 1. **Input** – `project_path` (root of the repo) and a fully populated `Config` object (language, ignore patterns, project info).
230
- 2. **Processing** – Files → code extraction → GPT prompts → text fragments → module factories → ordered markdown.
231
- 3. **Outputs** – Cached files under `.auto_doc_cache/`, and the final document string returned by `read_file_by_file_key("output_doc")`.
232
- 4. **Side Effects** – Writes intermediate cache, updates progress UI, and may raise exceptions from model calls or file I/O.
233
-
234
- This module acts as the command‑line entry point (`if __name__ == "__main__":`) that ties configuration loading to the documentation pipeline.
235
- <a name="autodocfile-parameters"></a>
236
- The file is a YAML document that defines the behavior of the documentation generator.
237
- Key sections and available options:
238
-
239
- * **project_name** – a short title for the project.
240
- * **language** – language code for the generated text, default “en”.
241
- * **project_options** – a map of boolean and numeric controls:
242
- * **save_logs** – true to keep generation logs, false otherwise.
243
- * **log_level** – integer indicating the verbosity of log output.
244
- * **project_additional_info** – free‑form key/value pairs that are inserted into the generated material (e.g., a global idea or description).
245
- * **ignore_files** – list of glob patterns for files that should be skipped during analysis.
246
- * **custom_descriptions** – an array of strings; each string is interpreted as a custom instruction for the generator, allowing you to request specific sections or explanations.
247
-
248
- When writing the file, list each option under its heading with proper YAML indentation. Only the sections you actually need have to be present; omitted sections fall back to their defaults (language “en”, an empty project name, and the default project_options).
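-
- For reference, an illustrative `autodocconfig.yml` exercising the options above (all values are placeholders, not taken from a real project):
-
- ```yaml
- # Placeholder values; adapt to your project.
- project_name: "My Service"
- language: "en"
- project_options:
-   save_logs: true
-   log_level: 2
- project_additional_info:
-   global_idea: "REST API for order management"
- ignore_files:
-   - "*.lock"
-   - "tests/*"
- custom_descriptions:
-   - "Explain how authentication tokens are issued"
- ```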
249
- <a name="docfactory-orchestration"></a>
250
- ## DocFactory Orchestration
251
-
252
- `DocFactory` receives an ordered list of `BaseModule` subclasses. `generate_doc` creates a progress sub‑task, invokes each module’s `generate(info, model)`, concatenates their outputs, and logs module completion. It returns the assembled documentation string.
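-
- A hedged sketch of that orchestration loop (signatures may differ from the real source; modules are only assumed to expose `generate(info, model)`):
-
- ```python
- # Sketch only, not the package's DocFactory implementation.
- class DocFactory:
-     def __init__(self, *modules):
-         self.modules = modules
-
-     def generate_doc(self, info: dict, model, progress_bar) -> str:
-         progress_bar.create_new_subtask("factory modules", len(self.modules))
-         parts = []
-         for module in self.modules:
-             parts.append(module.generate(info, model))  # each module returns a text fragment
-             progress_bar.update_task()
-         progress_bar.remove_subtask()
-         return "".join(parts)
- ```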
253
- <a name="global-introduction-generation"></a>Global Introduction Generation
254
- `get_introdaction` (note spelling) builds a similar prompt using `BASE_INTRO_CREATE` and the whole documentation body, then forwards it to the model. The raw intro string is returned unchanged.
255
-
256
- ##
257
- <a name="intro-with-links-generation"></a>Introduction‑With‑Links Generation
258
- `get_links_intro` receives the link list and a **Model** implementation. It composes a three‑message prompt (system language directive, `BASE_INTRODACTION_CREATE_TEXT`, and the link payload) and calls `model.get_answer_without_history`. Logging surrounds the call, and the generated introduction text is returned.
259
-
260
- ##
261
- <a name="custom-description-generation"></a>Custom Description Generation
262
- `generete_custom_discription` iterates over split document chunks, sending each to the model with a strict system prompt that forces a single‑anchor output (`<a name="…"></a>`). It stops at the first non‑error response and returns it; if no chunk yields one, it returns an empty string. The prompt rules prevent filename, extension, or external‑URL leakage.
263
-
264
- ##
265
- <a name="anchor‑ordering‑cleanup"></a>
266
- ## Anchor‑Based Ordering & Cache Cleanup
267
-
268
- `order_doc` extracts anchor‑segmented sections via `split_text_by_anchors`, asks the LLM (`self.sync_model`) to compute the correct order (`get_order`), and rewrites the file.
269
-
270
- `clear_cache` removes **report.txt** unless `config.pcs.save_logs` is true, ensuring optional log retention.
271
-
272
- **Data Flow Summary** – Input files → `CodeMix` → raw mix → `gen_doc_parts` → partial doc → `DocFactory` → enriched doc → `split_text_by_anchors`/`get_order` → final ordered doc. Side effects include file writes, logger entries, and progress UI updates.
273
-
274
- ##
275
- <a name="anchor‑chunk‑splitting"></a>Anchor‑Based Chunk Splitting
276
- `split_text_by_anchors` uses a look‑ahead regex to split a full doc into sections that start with a valid anchor (`<a name="…"></a>`). It validates that each chunk yields a corresponding link via `extract_links_from_start`; mismatches return `None`. The result is a dict mapping “#anchor” keys to their text blocks.
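-
- A simplified reconstruction of that splitting logic, assuming the anchor format shown above; the validation via `extract_links_from_start` is reduced here to a plain regex check:
-
- ```python
- import re
-
- ANCHOR_SPLIT = re.compile(r'(?=<a name="[^"]+"></a>)')   # look-ahead: split before each anchor
- ANCHOR_NAME = re.compile(r'<a name="([^"]+)"></a>')
-
- def split_text_by_anchors(doc: str) -> dict[str, str] | None:
-     result: dict[str, str] = {}
-     for chunk in ANCHOR_SPLIT.split(doc):
-         if not chunk.strip():
-             continue
-         match = ANCHOR_NAME.match(chunk.lstrip())
-         if match is None:          # chunk does not start with a valid anchor
-             return None
-         result[f"#{match.group(1)}"] = chunk
-     return result
- ```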
277
-
278
- ##
279
- <a name="semantic‑ordering"></a>Semantic Ordering of Documentation Sections
280
- `get_order` receives the chunk dict and a **Model**. It logs the incoming keys, prompts the model to return a comma‑separated, semantically sorted list of titles (preserving the leading “#”). The function reassembles the final ordered document by concatenating the chunks in the returned order, logging each addition.
281
-
282
- ##
283
- <a name="html-link-extraction"></a>HTML Link Extraction Logic
284
- `get_all_html_links` scans a documentation string for anchor tags (`<a name=…></a>`). It logs start/end messages via **BaseLogger**, builds a regex pattern, iterates with `re.finditer`, and appends up to five links, each prefixed with “#”. The function returns the collected list, providing the first‑stage data for downstream ordering.
285
-
286
- ##
287
- <a name="factory‑doc‑assembly"></a>
288
- ## Factory‑Driven Documentation Assembly
289
-
290
- `factory_generate_doc` loads the current output and code mix, builds an `info` dict (`language`, `full_data`, `code_mix`), and logs the module chain of the supplied `DocFactory`. It then calls `doc_factory.generate_doc(info, self.sync_model, self.progress_bar)`. The factory‑produced fragment is prepended to the existing doc and persisted.
291
- <a name="doc‑parts‑generation"></a>
292
- ## Synchronous Documentation Chunking
293
-
294
- `generete_doc_parts` (typo retained) reads the code‑mix, then invokes `gen_doc_parts` with:
295
-
296
- * raw code mix,
297
- * `max_symbols` limit (default 5 000),
298
- * the synchronous LLM (`self.sync_model`),
299
- * target language (`self.config.language`),
300
- * the progress UI.
301
-
302
- The resulting Markdown is written to **output_doc.md** and progress updated.
303
- <a name="custommodule-intro-modules"></a>
304
- ## CustomModule & Intro Modules
305
-
306
- * `CustomModule` injects a user‑provided description into a generated custom intro by preprocessing code via `split_data` and delegating to `generete_custom_discription`.
307
- * `IntroLinks` extracts HTML links from `info["full_data"]` and builds a linked introduction using `get_links_intro`.
308
- * `IntroText` produces a plain introduction via `get_introdaction`.
309
-
310
- All modules depend on a `Model` instance for LLM calls and output plain‑text Markdown/HTML fragments.
311
- <a name="asyncgptmodel-implementation"></a>
312
- ## AsyncGPTModel Implementation
313
-
314
- `AsyncGPTModel` extends `AsyncModel` to call Groq’s async client. It builds a shuffled list of candidate model names (`regen_models_name`) from the global config, logs each step via `BaseLogger`, and retries on failure, cycling through the list. Input: optional prompt or full history; output: generated answer string. Side‑effects: async HTTP request, log entries, possible `ModelExhaustedException` if no models remain.
315
- <a name="gptmodel-synchronous-flow"></a>
316
- ## GPTModel Synchronous Flow
317
-
318
- `GPTModel` mirrors `AsyncGPTModel` but uses the synchronous `Groq` client. It follows the same retry loop, logs progress, and returns the answer. It also respects the `with_history` flag to select either the stored conversation (`self.history.history`) or a raw prompt.
319
- <a name="parentmodel-setup-and-rotation"></a>
320
- ## ParentModel Setup & Model Rotation
321
-
322
- `ParentModel` (base for both sync/async) stores the API key, a `History` instance, and prepares `regen_models_name`—a shuffled copy of `MODELS_NAME` when `use_random=True`. It tracks `current_model_index` to rotate through candidates after each failure.
323
- <a name="synchronous‑part‑doc‑generator"></a>Synchronous Part Documentation Generator (`write_docs_by_parts`)
324
- Builds a two‑message system prompt (language + part‑ID, then `BASE_PART_COMPLITE_TEXT`). If a previous part’s summary exists, it is appended as an additional system message. The user message contains the raw source fragment. The `Model`’s `get_answer_without_history` call returns a markdown block; surrounding triple‑backticks are stripped before the final string is returned. Logs start, length, and raw answer (level 2).
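-
- A sketch of that message assembly; the `Model` interface is assumed to take a list of chat messages, and the base prompt is abridged:
-
- ```python
- BASE_PART_COMPLITE_TEXT = "Write clear, structured, hierarchical documentation ..."  # abridged
-
- def write_docs_by_parts(part: str, part_id: int, language: str, model,
-                         prev_summary: str = "") -> str:
-     messages = [
-         {"role": "system", "content": f"Answer in {language}. Part id: {part_id}."},
-         {"role": "system", "content": BASE_PART_COMPLITE_TEXT},
-     ]
-     if prev_summary:
-         messages.append({"role": "system", "content": prev_summary})
-     messages.append({"role": "user", "content": part})
-     answer = model.get_answer_without_history(messages)
-     return answer.strip().strip("`")   # strip surrounding triple-backticks
- ```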
325
-
326
- ##
327
- <a name="asynchronous‑part‑doc‑generator"></a>Asynchronous Part Documentation Generator (`async_write_docs_by_parts`)
328
- Mirrors the synchronous flow but runs inside an `async with semaphore` to cap concurrency (default 4). Uses an `AsyncModel` for non‑blocking `await get_answer_without_history`. An optional `update_progress` callback is invoked after each answer. Logging mirrors the sync variant.
329
-
330
- ##
331
- <a name="synchronous‑multi‑part‑orchestrator"></a>Synchronous Multi‑Part Documentation Orchestrator (`gen_doc_parts`)
332
- 1. Calls `split_data` to obtain `splited_data`.
333
- 2. Creates a progress sub‑task.
334
- 3. Sequentially invokes `write_docs_by_parts` for each chunk, concatenating results into `all_result`.
335
- 4. Keeps a 3000‑character tail of the previous answer to provide context for the next part.
336
- 5. Updates the progress bar after each iteration and logs final output length.
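-
- Put together, the loop looks roughly like this; it reuses the `write_docs_by_parts` sketch above and assumes `split_data` from the splitter section:
-
- ```python
- def gen_doc_parts(data: str, max_symbols: int, model, language: str, progress_bar) -> str:
-     chunks = split_data(data, max_symbols)
-     progress_bar.create_new_subtask("doc parts", len(chunks))
-     all_result, prev_tail = "", ""
-     for part_id, chunk in enumerate(chunks):
-         answer = write_docs_by_parts(chunk, part_id, language, model, prev_summary=prev_tail)
-         all_result += answer
-         prev_tail = answer[-3000:]   # context carried into the next part
-         progress_bar.update_task()
-     progress_bar.remove_subtask()
-     return all_result
- ```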
337
-
338
- ##
339
- <a name="asynchronous‑multi‑part‑orchestrator"></a>Asynchronous Multi‑Part Documentation Orchestrator (`async_gen_doc_parts`)
340
- Splits the input, creates a semaphore (max 4), and dispatches `async_write_docs_by_parts` for every chunk via `asyncio.gather`. Progress updates are wired through a lambda calling `progress_bar.update_task()`. Results are concatenated in order, progress sub‑task is removed, and the assembled documentation is logged.
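-
- The concurrency pattern described above, sketched with a semaphore-capped worker and order-preserving `asyncio.gather` (an illustration, not the package source):
-
- ```python
- import asyncio
-
- async def async_gen_doc_parts(chunks: list[str], model, language: str) -> str:
-     semaphore = asyncio.Semaphore(4)            # cap concurrency at four
-
-     async def worker(part_id: int, part: str) -> str:
-         async with semaphore:
-             return await model.get_answer_without_history(
-                 [{"role": "system", "content": f"Answer in {language}. Part {part_id}."},
-                  {"role": "user", "content": part}]
-             )
-
-     results = await asyncio.gather(*(worker(i, p) for i, p in enumerate(chunks)))
-     return "".join(results)                     # gather preserves submission order
- ```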
341
-
342
- ##
343
- <a name="spliter-entry-point"></a>Spliter Entry Point (`split_data`) – Partial
344
- `split_data(data, max_symbols)` prepares to split a large document into sub‑strings respecting a symbol limit. The implementation continues beyond the shown snippet, but its purpose is to feed the compressor pipeline with appropriately sized chunks.
345
-
346
- ##
347
- <a name="data‑splitting‑loop"></a>Data Splitting Loop (`split_data`)
348
- Iteratively refines a list of file‑derived chunks (`splited_by_files`) so that no element exceeds 1.5 × `max_symbols`. Oversized entries are bisected at the halfway point, inserted back, and the process repeats until stability. The second phase packs these normalized fragments into `split_objects`, inserting line breaks and respecting a 1.25 × `max_symbols` buffer. Returns a list of size‑constrained text blocks ready for downstream processing.
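-
- A hedged reconstruction of that two-phase loop, simplified to a single input string rather than per-file chunks:
-
- ```python
- def split_data(data: str, max_symbols: int) -> list[str]:
-     # Phase 1: bisect oversized pieces until none exceeds 1.5 * max_symbols.
-     pieces = [data]
-     changed = True
-     while changed:
-         changed = False
-         refined: list[str] = []
-         for piece in pieces:
-             if len(piece) > 1.5 * max_symbols:
-                 mid = len(piece) // 2
-                 refined.extend((piece[:mid], piece[mid:]))
-                 changed = True
-             else:
-                 refined.append(piece)
-         pieces = refined
-     # Phase 2: pack fragments into blocks bounded by a 1.25 * max_symbols buffer.
-     blocks: list[str] = []
-     current = ""
-     for piece in pieces:
-         if current and len(current) + len(piece) > 1.25 * max_symbols:
-             blocks.append(current)
-             current = ""
-         current += piece + "\n"
-     if current:
-         blocks.append(current)
-     return blocks
- ```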
349
-
350
- ##
351
- <a name="repository‑mix‑builder"></a>Repository Content Aggregation (`CodeMix`)
352
- `CodeMix` walks a repository rooted at `root_dir`, respecting `ignore_patterns`. `should_ignore` checks each path against glob patterns, file basenames, and any path component. `build_repo_content` writes a hierarchical tree view to `repomix-output.txt`, then appends each non‑ignored file’s relative path (`<file path="…">`) followed by its raw content. Progress is logged throughout.
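-
- The ignore matching can be pictured like this (a sketch using `fnmatch`-style globs; the real `CodeMix.should_ignore` may differ in detail):
-
- ```python
- import fnmatch
- import os
-
- def should_ignore(path: str, ignore_patterns: list[str]) -> bool:
-     basename = os.path.basename(path)
-     components = path.replace("\\", "/").split("/")
-     for pattern in ignore_patterns:
-         if fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(basename, pattern):
-             return True                       # glob match on full path or file name
-         if pattern in components:
-             return True                       # match on any path component
-     return False
- ```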
353
-
354
- **Data Flow:**
355
- - Source files → `CodeMix` → mixed text file → `split_text_by_anchors` → chunk dict → `get_order` → ordered doc → optional `get_links_intro`/`get_introdaction`/`generete_custom_discription` → final documentation output.
356
- - Side effects: file creation (`repomix-output.txt`), logger entries, and UI progress updates.
357
-
358
- ##
359
- <a name="code‑mix‑generation"></a>
360
- ## Code Mix Generation Workflow
361
-
362
- `generate_code_file` creates a `CodeMix` instance (respecting `config.ignore_files`) and calls `build_repo_content` to serialize the entire repository into **code_mix.txt**. Logging marks start/end, and the progress bar task is advanced.
363
- <a name="cache‑file‑access"></a>
364
- ## Cached File Access Helpers
365
-
366
- * `get_file_path(file_key)` builds an absolute path inside the cache using the static `FILE_NAMES` map.
367
- * `read_file_by_file_key(file_key)` opens the derived path, reads UTF‑8 content and returns it. These utilities centralise path handling for all subsequent steps.
368
- <a name="compress-function"></a>Compress Function Logic
369
- `compress(data, project_settings, model, compress_power)` builds a three‑message prompt: the project‑specific system prompt, a compression directive from `get_BASE_COMPRESS_TEXT`, and the raw `data`. It forwards this prompt to `model.get_answer_without_history` and returns the model’s summary.
370
- *Input*: plain text, `ProjectSettings` instance, `Model`, integer power.
371
- *Output*: compressed string.
372
- *Side‑effects*: none.
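-
- A sketch of that prompt construction; the arguments passed to `get_BASE_COMPRESS_TEXT` and its wording here are assumptions:
-
- ```python
- def get_BASE_COMPRESS_TEXT(start: int, power: int) -> str:   # abridged placeholder
-     return (f"You will receive a large code snippet (up to ~{start} characters). "
-             f"Compress it by roughly a factor of {power}.")
-
- def compress(data: str, project_settings, model, compress_power: int) -> str:
-     messages = [
-         {"role": "system", "content": project_settings.prompt},
-         {"role": "system", "content": get_BASE_COMPRESS_TEXT(len(data), compress_power)},
-         {"role": "user", "content": data},
-     ]
-     return model.get_answer_without_history(messages)
- ```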
373
-
374
- ##
375
- <a name="batch-compression-sync"></a>Synchronous Batch Compression (`compress_and_compare`)
376
- Partitions a list of strings into groups of `compress_power`. For each element it calls `compress`, concatenates results per group, and updates a `BaseProgress` sub‑task. Returns a list where each entry aggregates the compressed texts of one group.
377
-
378
- ##
379
- <a name="batch-compression-async"></a>Asynchronous Batch Compression (`async_compress_and_compare`)
380
- Creates a semaphore (max 4 concurrent calls) and launches `async_compress` for every element. Each coroutine builds the same three‑message prompt, awaits `model.get_answer_without_history`, and updates progress. After `asyncio.gather`, groups results into chunks of size `compress_power` and returns the aggregated list.
381
-
382
- ##
383
- <a name="singleton‑logger‑implementation"></a>Singleton Logger Implementation (`BaseLogger`)
384
- `BaseLogger.__new__` ensures a single shared instance. Clients set a concrete `BaseLoggerTemplate` (e.g., `FileLoggerTemplate`) via `set_logger`. Calls to `log` forward to the template’s `global_log`, which respects the configured log‑level filter.
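-
- The singleton shape described above, in miniature (an illustration, not the package's exact class):
-
- ```python
- class BaseLogger:
-     _instance = None
-
-     def __new__(cls):
-         if cls._instance is None:
-             cls._instance = super().__new__(cls)
-             cls._instance._template = None
-         return cls._instance
-
-     def set_logger(self, template) -> None:
-         self._template = template
-
-     def log(self, log_message) -> None:
-         if self._template is not None:
-             self._template.global_log(log_message)   # template applies the level filter
- ```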
385
-
386
- ##
387
- <a name="log‑message‑hierarchy"></a>Log Message Hierarchy (`BaseLog` & subclasses)
388
- `BaseLog` stores a message and level; subclasses (`InfoLog`, `WarningLog`, `ErrorLog`) override `format()` to prepend a timestamp and severity tag. The hierarchy enables uniform, level‑aware console or file output across the documentation pipeline.
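-
- An illustrative version of that hierarchy (tags and timestamp format are assumptions):
-
- ```python
- from datetime import datetime
-
- class BaseLog:
-     level, tag = 0, "LOG"
-
-     def __init__(self, message: str):
-         self.message = message
-
-     def format(self) -> str:
-         return f"[{datetime.now():%Y-%m-%d %H:%M:%S}] {self.tag}: {self.message}"
-
- class InfoLog(BaseLog):
-     level, tag = 1, "INFO"
-
- class WarningLog(BaseLog):
-     level, tag = 2, "WARNING"
-
- class ErrorLog(BaseLog):
-     level, tag = 3, "ERROR"
- ```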
389
-
390
- ##
391
- <a name="progress‑abstraction"></a>Progress Abstraction (`BaseProgress`)
392
- Defines the minimal interface for creating, updating, and removing sub‑tasks. Concrete classes implement these hooks so the documentation pipeline can switch between rich‑based UI or plain console output without code changes.
393
-
394
- ##
395
- <a name="rich‑implementation"></a>Rich‑Library Implementation (`LibProgress`)
396
- Wraps **rich.Progress**:
397
- * `__init__` registers a base task (`General progress`) with a configurable total (default 4).
398
- * `create_new_subtask(name, total_len)` adds a child task and stores its ID.
399
- * `update_task()` advances the current sub‑task if present, otherwise the base task.
400
- * `remove_subtask()` clears the reference, allowing the next chunk to start fresh.
401
- All calls forward to `rich.Progress.update`, guaranteeing thread‑safe visual feedback.
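-
- A sketch of such a wrapper over `rich.progress.Progress`, illustrating the hooks above (not the package's `LibProgress` source):
-
- ```python
- from rich.progress import Progress
-
- class LibProgress:
-     def __init__(self, total: int = 4):
-         self.progress = Progress()
-         self.progress.start()
-         self.base_task = self.progress.add_task("General progress", total=total)
-         self.sub_task = None
-
-     def create_new_subtask(self, name: str, total_len: int) -> None:
-         self.sub_task = self.progress.add_task(name, total=total_len)
-
-     def update_task(self) -> None:
-         task_id = self.sub_task if self.sub_task is not None else self.base_task
-         self.progress.update(task_id, advance=1)
-
-     def remove_subtask(self) -> None:
-         self.sub_task = None
- ```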
402
-
403
- ##
404
- <a name="console‑task-helper"></a>Console Task Helper (`ConsoleTask`)
405
- Utility that prints a simple progress line.
406
- * `start_task()` emits the start banner.
407
- * `progress()` increments an internal counter, computes a percentage, and prints it.
408
- Used by the fallback console progress class.
409
-
410
- ##
411
- <a name="fallback‑console‑progress"></a>Fallback Console Progress (`ConsoleGtiHubProgress`)
412
- Implements the same API as `BaseProgress` for environments lacking Rich:
413
- * Holds a persistent “General Progress” `ConsoleTask`.
414
- * `create_new_subtask` spawns a dedicated `ConsoleTask`.
415
- * `update_task` delegates to the active task or the general one.
416
- * `remove_subtask` discards the current sub‑task.
417
-
418
- **Data Flow** – Caller (e.g., `gen_doc_parts`) invokes `create_new_subtask` → progress updates via `update_task` → optional `remove_subtask`. No side effects beyond console/rich output. The abstraction keeps the rest of the system agnostic to the UI backend.
419
-