autodocgenerator 0.9.0.0__py3-none-any.whl → 0.9.0.1__py3-none-any.whl
- autodocgenerator/engine/config/config.py +28 -22
- autodocgenerator/preprocessor/spliter.py +5 -8
- autodocgenerator-0.9.0.1.dist-info/METADATA +839 -0
- {autodocgenerator-0.9.0.0.dist-info → autodocgenerator-0.9.0.1.dist-info}/RECORD +5 -5
- autodocgenerator-0.9.0.0.dist-info/METADATA +0 -699
- {autodocgenerator-0.9.0.0.dist-info → autodocgenerator-0.9.0.1.dist-info}/WHEEL +0 -0
|
@@ -1,699 +0,0 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: autodocgenerator
|
|
3
|
-
Version: 0.9.0.0
|
|
4
|
-
Summary: This Project helps you to create docs for your projects
|
|
5
|
-
License: MIT
|
|
6
|
-
Author: dima-on
|
|
7
|
-
Author-email: sinica911@gmail.com
|
|
8
|
-
Requires-Python: >=3.11,<4.0
|
|
9
|
-
Classifier: License :: OSI Approved :: MIT License
|
|
10
|
-
Classifier: Programming Language :: Python :: 3
|
|
11
|
-
Classifier: Programming Language :: Python :: 3.11
|
|
12
|
-
Classifier: Programming Language :: Python :: 3.12
|
|
13
|
-
Classifier: Programming Language :: Python :: 3.13
|
|
14
|
-
Classifier: Programming Language :: Python :: 3.14
|
|
15
|
-
Requires-Dist: CacheControl (==0.14.4)
|
|
16
|
-
Requires-Dist: Pygments (==2.19.2)
|
|
17
|
-
Requires-Dist: RapidFuzz (==3.14.3)
|
|
18
|
-
Requires-Dist: annotated-types (==0.7.0)
|
|
19
|
-
Requires-Dist: anyio (==4.12.1)
|
|
20
|
-
Requires-Dist: certifi (==2026.1.4)
|
|
21
|
-
Requires-Dist: charset-normalizer (==3.4.4)
|
|
22
|
-
Requires-Dist: cleo (==2.1.0)
|
|
23
|
-
Requires-Dist: colorama (==0.4.6)
|
|
24
|
-
Requires-Dist: crashtest (==0.4.1)
|
|
25
|
-
Requires-Dist: distlib (==0.4.0)
|
|
26
|
-
Requires-Dist: distro (==1.9.0)
|
|
27
|
-
Requires-Dist: dulwich (==0.25.2)
|
|
28
|
-
Requires-Dist: fastjsonschema (==2.21.2)
|
|
29
|
-
Requires-Dist: filelock (==3.20.3)
|
|
30
|
-
Requires-Dist: findpython (==0.7.1)
|
|
31
|
-
Requires-Dist: google-auth (==2.47.0)
|
|
32
|
-
Requires-Dist: google-genai (==1.56.0)
|
|
33
|
-
Requires-Dist: groq (==1.0.0)
|
|
34
|
-
Requires-Dist: h11 (==0.16.0)
|
|
35
|
-
Requires-Dist: httpcore (==1.0.9)
|
|
36
|
-
Requires-Dist: httpx (==0.28.1)
|
|
37
|
-
Requires-Dist: idna (==3.11)
|
|
38
|
-
Requires-Dist: installer (==0.7.0)
|
|
39
|
-
Requires-Dist: jaraco.classes (==3.4.0)
|
|
40
|
-
Requires-Dist: jaraco.context (==6.1.0)
|
|
41
|
-
Requires-Dist: jaraco.functools (==4.4.0)
|
|
42
|
-
Requires-Dist: jiter (==0.12.0)
|
|
43
|
-
Requires-Dist: keyring (==25.7.0)
|
|
44
|
-
Requires-Dist: markdown-it-py (==4.0.0)
|
|
45
|
-
Requires-Dist: mdurl (==0.1.2)
|
|
46
|
-
Requires-Dist: more-itertools (==10.8.0)
|
|
47
|
-
Requires-Dist: msgpack (==1.1.2)
|
|
48
|
-
Requires-Dist: openai (==2.14.0)
|
|
49
|
-
Requires-Dist: packaging (==25.0)
|
|
50
|
-
Requires-Dist: pbs-installer (==2026.1.14)
|
|
51
|
-
Requires-Dist: pkginfo (==1.12.1.2)
|
|
52
|
-
Requires-Dist: platformdirs (==4.5.1)
|
|
53
|
-
Requires-Dist: pyasn1 (==0.6.1)
|
|
54
|
-
Requires-Dist: pyasn1_modules (==0.4.2)
|
|
55
|
-
Requires-Dist: pydantic (==2.12.5)
|
|
56
|
-
Requires-Dist: pydantic_core (==2.41.5)
|
|
57
|
-
Requires-Dist: pyproject_hooks (==1.2.0)
|
|
58
|
-
Requires-Dist: python-dotenv (==1.2.1)
|
|
59
|
-
Requires-Dist: pywin32-ctypes (==0.2.3)
|
|
60
|
-
Requires-Dist: pyyaml (==6.0.3)
|
|
61
|
-
Requires-Dist: requests (==2.32.5)
|
|
62
|
-
Requires-Dist: requests-toolbelt (==1.0.0)
|
|
63
|
-
Requires-Dist: rich (==14.2.0)
|
|
64
|
-
Requires-Dist: rich_progress (==0.4.0)
|
|
65
|
-
Requires-Dist: rsa (==4.9.1)
|
|
66
|
-
Requires-Dist: shellingham (==1.5.4)
|
|
67
|
-
Requires-Dist: sniffio (==1.3.1)
|
|
68
|
-
Requires-Dist: tenacity (==9.1.2)
|
|
69
|
-
Requires-Dist: tomlkit (==0.14.0)
|
|
70
|
-
Requires-Dist: tqdm (==4.67.1)
|
|
71
|
-
Requires-Dist: trove-classifiers (==2026.1.14.14)
|
|
72
|
-
Requires-Dist: typing-inspection (==0.4.2)
|
|
73
|
-
Requires-Dist: typing_extensions (==4.15.0)
|
|
74
|
-
Requires-Dist: urllib3 (==2.6.2)
|
|
75
|
-
Requires-Dist: virtualenv (==20.36.1)
|
|
76
|
-
Requires-Dist: websockets (==15.0.1)
|
|
77
|
-
Requires-Dist: zstandard (==0.25.0)
|
|
78
|
-
Description-Content-Type: text/markdown
|
|
79
|
-
|
|
80
|
-
## Executive Navigation Tree
|
|
81
|
-
- 📦 **Build & Packaging**
|
|
82
|
-
- [Build System Configuration](#build-system-configuration)
|
|
83
|
-
- [Package Initialization](#package-initialization)
|
|
84
|
-
- [Package Metadata](#package-metadata)
|
|
85
|
-
- 📦 **Dependency Management**
|
|
86
|
-
- [Dependency Declarations](#dependency-declarations)
|
|
87
|
-
- ⚙️ **CI/CD**
|
|
88
|
-
- [GitHub Action Setup](#github-action-setup)
|
|
89
|
-
- 📁 **Repository & Structure**
|
|
90
|
-
- [ProjectSettings Class](#projectsettings-class)
|
|
91
|
-
- [Repository Structure Builder](#repository-structure-builder)
|
|
92
|
-
- 🪵 **Logging**
|
|
93
|
-
- [Logger Instantiation Flow](#logger-instantiation-flow)
|
|
94
|
-
- [Logging and Model Interaction](#logging-and-model-interaction)
|
|
95
|
-
- [Logging Hierarchy](#logging-hierarchy)
|
|
96
|
-
- 🏭 **Factory & Orchestration**
|
|
97
|
-
- [BaseFactory‑Orchestrator](#basefactory‑orchestrator)
|
|
98
|
-
- [Custom‑Module‑Generation](#custom‑module‑generation)
|
|
99
|
-
- [Doc‑Generation‑Orchestrator](#doc‑generation‑orchestrator)
|
|
100
|
-
- [Factory Doc Assembly](#factory-doc-assembly)
|
|
101
|
-
- 📦 **Modules**
|
|
102
|
-
- [Intro Modules Link and Text](#intro‑modules‑link‑and‑text)
|
|
103
|
-
- [Inter‑Module Interactions](#inter‑module‑interactions)
|
|
104
|
-
- [Manager Class Usage](#manager-class-usage)
|
|
105
|
-
- [Manager Orchestration](#manager-orchestration)
|
|
106
|
-
- 📊 **Assumptions & Constraints**
|
|
107
|
-
- [Assumptions and Constraints](#assumptions-and‑constraints)
|
|
108
|
-
- 🔧 **Cache & Path Resolution**
|
|
109
|
-
- [Cache and Path Resolution](#cache-and-path-resolution)
|
|
110
|
-
- 📊 **Data Flow & Side Effects**
|
|
111
|
-
- [Data Flow and Side Effects](#data‑flow-and‑side‑effects)
|
|
112
|
-
- [Split Data](#split-data)
|
|
113
|
-
- 📄 **YAML & Runtime Objects**
|
|
114
|
-
- [YAML to Runtime Objects](#yaml‑to‑runtime‑objects)
|
|
115
|
-
- 🤖 **Model Wrappers**
|
|
116
|
-
- [AsyncGPTModel Async Wrapper](#asyncgptmodel‑async‑wrapper)
|
|
117
|
-
- [GPTModel Sync Wrapper](#gptmodel‑sync‑wrapper)
|
|
118
|
-
- 🗂️ **History & Content**
|
|
119
|
-
- [History Prompt Buffer](#history‑prompt‑buffer)
|
|
120
|
-
- [CONTENT_DESCRIPTION](#CONTENT_DESCRIPTION)
|
|
121
|
-
- 📚 **Chunk Extraction & Description**
|
|
122
|
-
- [Anchor‑Based Chunk Extraction](#anchor-based-chunk-extraction)
|
|
123
|
-
- [Custom Description Synthesis](#custom-description-synthesis)
|
|
124
|
-
- [Doc Chunk Behaviour](#doc‑chunk‑behaviour)
|
|
125
|
-
- [Document Ordering](#document-ordering)
|
|
126
|
-
- [Generate Descriptions](#generate-descriptions)
|
|
127
|
-
- [Gen Doc Parts](#gen-doc-parts)
|
|
128
|
-
- 🔗 **Link & Introduction Generation**
|
|
129
|
-
- [HTML Link Extraction Logic](#html-link-extraction-logic)
|
|
130
|
-
- [Link‑Based Introduction Generation](#link-based-introduction-generation)
|
|
131
|
-
- [Plain Introduction Synthesis](#plain-introduction-synthesis)
|
|
132
|
-
- 🌐 **Parent Model Shared State**
|
|
133
|
-
- [ParentModel Shared State](#parentmodel‑shared‑state)
|
|
134
|
-
- 📈 **Progress & Semantic Ordering**
|
|
135
|
-
- [Progress Implementations](#progress-implementations)
|
|
136
|
-
- [Semantic Title Ordering](#semantic-title-ordering)
|
|
137
|
-
- 🔄 **Sync / Async Generation**
|
|
138
|
-
- [Sync Doc Part Generation](#sync-doc-part-generation)
|
|
139
|
-
- [Write Autodocfile Options](#write-autodocfile-options)
|
|
140
|
-
- [Write Docs by Parts](#write-docs-by-parts)
|
|
141
|
-
- [Async Gen Doc Parts](#async-gen-doc-parts)
|
|
142
|
-
- [Async Write Docs by Parts](#async-write-docs-by-parts)
|
|
143
|
-
- 🗜️ **Compression**
|
|
144
|
-
- [Async Compress](#async-compress)
|
|
145
|
-
- [Async Compress and Compare](#async-compress-and-compare)
|
|
146
|
-
- [Compress and Compare Sync](#compress-and-compare-sync)
|
|
147
|
-
- [Compress Single Pass](#compress-single-pass)
|
|
148
|
-
- [Compress to One](#compress-to-one)
|
|
149
|
-
- 🧩 **Code Mix Generation**
|
|
150
|
-
- [Code Mix Generation](#code-mix-generation)
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
<a name="build-system-configuration"></a>
|
|
155
|
-
## Build‑system configuration
|
|
156
|
-
|
|
157
|
-
**Responsibility** – Tells PEP 517 build front‑ends how to build the project by naming the build backend and the minimum version of it that must be installed.
|
|
158
|
-
|
|
159
|
-
**Interactions** – When `python -m build` or `pip install .` is invoked, the build front‑end imports `poetry.core.masonry.api` and calls its `build_wheel` / `build_sdist` APIs.
|
|
160
|
-
|
|
161
|
-
**Technical details**
|
|
162
|
-
- `requires = ["poetry-core>=2.0.0"]` ensures the build backend is present.
|
|
163
|
-
- `build-backend = "poetry.core.masonry.api"` points to Poetry’s PEP 517 implementation.
|
|
164
|
-
|
|
165
|
-
**Data flow** – Input: the `pyproject.toml` file itself. Output: built distribution archives (`.whl`, `.tar.gz`) placed in `dist/`.
|
|
166
|
-
<a name="dependency-declarations"></a>
|
|
167
|
-
## Dependency declarations
|
|
168
|
-
|
|
169
|
-
**Responsibility** – Enumerates the exact third‑party packages required at runtime, with pinned versions to guarantee reproducible builds.
|
|
170
|
-
|
|
171
|
-
**Interactions** – Poetry resolves these constraints, creates a lockfile, and installs the packages into a virtual environment. The application imports the listed libraries (e.g., `rich`, `pydantic`, `openai`).
|
|
172
|
-
|
|
173
|
-
**Technical details**
|
|
174
|
-
- `requires-python = ">=3.11,<4.0"` restricts interpreter compatibility.
|
|
175
|
-
- Each entry follows the format `package==exact.version`.
|
|
176
|
-
- The pinned packages span utilities (e.g., `requests`), AI SDKs (`openai`, `google-genai`), and UI helpers (`rich`, `tqdm`).
|
|
177
|
-
|
|
178
|
-
**Data flow** – Input: developer‑specified version strings. Output: a resolved dependency graph written to `poetry.lock`; at install time, the resolved packages are materialised on disk.
|
|
179
|
-
<a name="github-action-setup"></a>
|
|
180
|
-
To set up the automation workflow, follow these steps:
|
|
181
|
-
|
|
182
|
-
1. **PowerShell‑based environment (Windows)**
|
|
183
|
-
- Use `Invoke-RestMethod` (`irm`) to fetch the remote installer script and *Invoke‑Expression* (`iex`) to execute it in a single command:
|
|
184
|
-
```powershell
|
|
185
|
-
irm raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.ps1 | iex
|
|
186
|
-
```
|
|
187
|
-
- This command downloads the script directly from the project's raw content location and evaluates it on the host machine.
|
|
188
|
-
|
|
189
|
-
2. **POSIX‑compatible environment (Linux/macOS)**
|
|
190
|
-
- Retrieve and execute the installer script with a single pipeline using *curl* and *bash*:
|
|
191
|
-
```bash
|
|
192
|
-
curl -sSL raw.githubusercontent.com/Drag-GameStudio/ADG/main/install.sh | bash
|
|
193
|
-
```
|
|
194
|
-
- In `-sSL`, `-s` silences the progress output, `-S` still reports errors, and `-L` follows redirects; piping to *bash* runs the script immediately.
|
|
195
|
-
|
|
196
|
-
3. **GitHub Action secret**
|
|
197
|
-
- In the repository’s Actions settings, create a secret named `GROCK_API_KEY`.
|
|
198
|
-
- Obtain the required key from the documentation hosted at `grockdocs.com` and paste it into the secret field.
|
|
199
|
-
- The workflow will reference this secret to authenticate calls to the external Grock service.
|
|
200
|
-
|
|
201
|
-
Once the command for your platform has been run and the secret is configured, the workflow is fully operational on Windows and Linux‑based runners.
|
|
202
|
-
<a name="package-initialization"></a>
|
|
203
|
-
## Package initialization and logger bootstrap
|
|
204
|
-
|
|
205
|
-
The **autodocgenerator** package executes a short bootstrap when imported. It prints a literal *“ADG”* to standard output, then creates a singleton‑style logger instance (`logger`) based on the UI logging abstraction. This enables every sub‑module to emit structured logs without additional setup.
|
|
206
|
-
<a name="package-metadata"></a>
|
|
207
|
-
## Package metadata
|
|
208
|
-
|
|
209
|
-
**Responsibility** – Declares the distributable identity of the *autodocgenerator* library: name, version, brief description, author contact, licensing, and the location of the long‑form README.
|
|
210
|
-
|
|
211
|
-
**Interactions** – Consumed by packaging tools (Poetry, pip, build) to generate `dist/` artifacts and to populate the project’s *metadata* section on PyPI. Runtime code does not import this file.
|
|
212
|
-
|
|
213
|
-
**Technical details**
|
|
214
|
-
- `name`, `version`, `description` are mandatory PEP 621 fields.
|
|
215
|
-
- `authors` is a list of mappings, enabling tools to render proper author credits.
|
|
216
|
-
- `license` uses a free‑text block (`MIT`).
|
|
217
|
-
- `readme` points to `README.md`, allowing automatic inclusion in the wheel’s `METADATA`.
|
|
218
|
-
|
|
219
|
-
**Data flow** – Input: static values maintained by the maintainer. Output: a TOML document parsed by build back‑ends; resulting values flow into the generated wheel/sdist metadata files.
|
|
220
|
-
|
|
221
|
-
<a name="projectsettings-class"></a>
|
|
222
|
-
## `ProjectSettings` – project‑wide prompt composer
|
|
223
|
-
|
|
224
|
-
**Responsibility** – Holds the project name and an arbitrary key/value info map; renders a full system prompt by concatenating `BASE_SETTINGS_PROMPT`, the project name, and each `info` entry.
|
|
225
|
-
|
|
226
|
-
**Key members** –
|
|
227
|
-
* `__init__(self, project_name: str)` – stores the name and initialises an empty `info` map.
|
|
228
|
-
* `add_info(self, key, value)` – mutates `info`.
|
|
229
|
-
* `prompt` property – builds and returns the composite prompt string.
|
|
230
|
-
|
|
231
|
-
**Data flow** – No side‑effects beyond internal state mutation; `prompt` is read‑only.
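
A minimal sketch of how such a prompt composer could look. The value of `BASE_SETTINGS_PROMPT` and the exact line format of the rendered prompt are assumptions made for illustration; only the class name, `add_info`, and the `prompt` property come from the description above.

```python
# Hypothetical placeholder: the real BASE_SETTINGS_PROMPT text is not shown here.
BASE_SETTINGS_PROMPT = "Project settings:"


class ProjectSettings:
    def __init__(self, project_name: str):
        self.project_name = project_name
        self.info: dict[str, str] = {}  # arbitrary key/value facts about the project

    def add_info(self, key: str, value: str) -> None:
        # Mutates internal state only; nothing is written elsewhere.
        self.info[key] = value

    @property
    def prompt(self) -> str:
        # Assumed layout: base prompt, project name, then one line per info entry.
        lines = [BASE_SETTINGS_PROMPT, f"Project name: {self.project_name}"]
        lines += [f"{key}: {value}" for key, value in self.info.items()]
        return "\n".join(lines)


settings = ProjectSettings("demo")
settings.add_info("language", "en")
print(settings.prompt)
```

Exposing `prompt` as a property keeps it read‑only from the caller's side while still reflecting every later `add_info` call.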
|
|
232
|
-
|
|
233
|
-
---
|
|
234
|
-
|
|
235
|
-
**Mapping anchor → raw section text**
|
|
236
|
-
|
|
237
|
-
```python
|
|
238
|
-
{
|
|
239
|
-
"#compress-single-pass": "## `compress` – single‑pass LLM compression\n\n**Responsibility** … (text of first section)",
|
|
240
|
-
"#compress-and-compare-sync": "## `compress_and_compare` – batch sync compression\n\n**Responsibility** …",
|
|
241
|
-
"#async-compress": "## `async_compress` – semaphore‑protected async compression\n\n**Responsibility** …",
|
|
242
|
-
"#async-compress-and-compare": "## `async_compress_and_compare` – parallel batch compression\n\n**Responsibility** …",
|
|
243
|
-
"#compress-to-one": "## `compress_to_one` – iterative reduction to a single summary\n\n**Responsibility** …",
|
|
244
|
-
"#generate-descriptions": "## `generate_describtions_for_code` – LLM‑driven API documentation generator\n\n**Responsibility** …",
|
|
245
|
-
"#projectsettings-class": "## `ProjectSettings` – project‑wide prompt composer\n\n**Responsibility** …"
|
|
246
|
-
}
|
|
247
|
-
```
|
|
248
|
-
<a name="repository-structure-builder"></a>
|
|
249
|
-
## Repository Structure Builder (`CodeMix`)
|
|
250
|
-
|
|
251
|
-
**Responsibility** – Walks a source tree, writes a concise directory tree followed by the raw content of each non‑ignored file into a single output file (`repomix-output.txt`).
|
|
252
|
-
|
|
253
|
-
**Interactions** – Operates on the filesystem rooted at `root_dir`; uses `BaseLogger` for informational messages. It does **not** depend on other project modules.
|
|
254
|
-
|
|
255
|
-
**Technical details** –
|
|
256
|
-
* `should_ignore` applies `ignore_patterns` (glob‑style) to the relative path, filename, and any path component.
|
|
257
|
-
* `build_repo_content` writes a “Repository Structure” header, then iterates twice over `Path.rglob("*")`: first to emit the indented tree, second to embed file contents wrapped in `<file path="…">` tags. Errors while reading files are captured and logged inline.
|
|
258
|
-
|
|
259
|
-
**Data flow** – Input: optional `output_file` name, ignore list. Output: a UTF‑8 text file containing the tree and file bodies. Side‑effects: filesystem reads and a single write operation.
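
The ignore check described above can be approximated with the standard‑library `fnmatch` module. This is a sketch under assumptions: the free‑function signature and the use of `fnmatch` are illustrative choices, not the package's confirmed implementation.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def should_ignore(relative_path: str, ignore_patterns: list[str]) -> bool:
    """Return True if any glob pattern matches the path, its filename, or a path component."""
    path = PurePosixPath(relative_path)
    # Candidates: the whole relative path, the bare filename, and every component.
    candidates = [path.as_posix(), path.name, *path.parts]
    return any(
        fnmatch(candidate, pattern)
        for pattern in ignore_patterns
        for candidate in candidates
    )


print(should_ignore("src/__pycache__/mod.pyc", ["__pycache__"]))
print(should_ignore("src/app.py", ["*.pyc", ".git"]))
```

Matching against individual path components is what lets a bare directory name such as `__pycache__` exclude everything beneath it.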
|
|
260
|
-
<a name="logger-instantiation-flow"></a>
|
|
261
|
-
## Logger instantiation flow
|
|
262
|
-
|
|
263
|
-
1. **Import logging symbols** – `BaseLogger`, `BaseLoggerTemplate`, `InfoLog`, `ErrorLog`, `WarningLog` are pulled from `autodocgenerator.ui.logging`.
|
|
264
|
-
2. **Create BaseLogger** – `logger = BaseLogger()` constructs the core logger object, initializing internal handlers (e.g., Rich console).
|
|
265
|
-
3. **Attach concrete template** – `logger.set_logger(BaseLoggerTemplate())` injects a concrete formatting template, defining how messages are rendered (color, level prefixes).
|
|
266
|
-
|
|
267
|
-
The instantiated `logger` is a module‑level variable, exported as part of the package’s public API, so downstream code can simply `from autodocgenerator import logger` and start logging.
|
|
268
|
-
<a name="logging-and-model-interaction"></a>## Logging and Model Interaction
|
|
269
|
-
All public functions instantiate **`BaseLogger`** locally, emitting `InfoLog` messages at start and completion and, optionally, more detailed messages at log level 1. Model calls are performed via `model.get_answer_without_history`, guaranteeing stateless queries. Input assumptions include well‑formed Markdown anchors, valid `Model` instances, and non‑empty `custom_description` strings. Outputs are plain strings (links list, introductions, or custom descriptions) ready for downstream concatenation or insertion into the final `output_doc.md`.
|
|
270
|
-
<a name="logging-hierarchy"></a>
|
|
271
|
-
## `BaseLogger` & log‑record classes – unified logging façade
|
|
272
|
-
|
|
273
|
-
**Responsibility** – Provides a process‑wide singleton (`BaseLogger`) that forwards `BaseLog`‑derived messages to a configurable `BaseLoggerTemplate`. The hierarchy (`ErrorLog`, `WarningLog`, `InfoLog`) supplies level‑aware formatting with a timestamp prefix.
|
|
274
|
-
|
|
275
|
-
**Interactions** – Other UI components (e.g. doc‑generation orchestrators) call `BaseLogger().log(<Log>)`. The logger delegates to the attached template (`FileLoggerTemplate` for file output or the default `BaseLoggerTemplate` for console). No external state is read; only the optional file is written.
|
|
276
|
-
|
|
277
|
-
**Technical Details**
|
|
278
|
-
* `BaseLog` stores `message` and `level`; `format()` returns the raw text.
|
|
279
|
-
* Sub‑classes override `format()` to prepend `[_log_prefix] [LEVEL]`.
|
|
280
|
-
* `_log_prefix` builds a human‑readable timestamp from `time.time()`.
|
|
281
|
-
* `BaseLoggerTemplate` implements `global_log()` that respects the instance’s `log_level` filter.
|
|
282
|
-
* `FileLoggerTemplate` overrides `log()` to append formatted lines to a file.
|
|
283
|
-
* `BaseLogger.__new__` enforces a single shared instance (`cls.instance`).
|
|
284
|
-
|
|
285
|
-
**Data Flow** – *Input*: a `BaseLog` instance (message + level). *Output*: side‑effect → `print()` or file write via the selected template. The method returns `None`.
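
The pieces listed above can be sketched as follows. The class and method names follow the description; the exact level‑filter rule (a negative `log_level` disables filtering) and the timestamp format are assumptions for illustration.

```python
import time


class BaseLog:
    def __init__(self, message: str, level: int = 0):
        self.message = message
        self.level = level

    @property
    def _log_prefix(self) -> str:
        # Human-readable timestamp derived from time.time(); the format is an assumption.
        return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(time.time()))

    def format(self) -> str:
        return self.message


class InfoLog(BaseLog):
    def format(self) -> str:
        return f"[{self._log_prefix}] [INFO] {self.message}"


class BaseLoggerTemplate:
    def __init__(self, log_level: int = -1):
        self.log_level = log_level

    def log(self, text: str) -> None:
        print(text)

    def global_log(self, log: BaseLog) -> None:
        # Assumed rule: a negative log_level means "no filtering".
        if self.log_level < 0 or log.level <= self.log_level:
            self.log(log.format())


class BaseLogger:
    def __new__(cls):
        # Create the shared instance once and reuse it on every later call.
        if not hasattr(cls, "instance"):
            cls.instance = super().__new__(cls)
        return cls.instance

    def set_logger(self, template: BaseLoggerTemplate) -> None:
        self.logger_template = template

    def log(self, log: BaseLog) -> None:
        self.logger_template.global_log(log)


logger = BaseLogger()
logger.set_logger(BaseLoggerTemplate())
logger.log(InfoLog("pipeline started"))
```

Because `__new__` always returns the same object, a template attached once is visible to every later `BaseLogger()` call in the process.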
|
|
286
|
-
|
|
287
|
-
---
|
|
288
|
-
<a name="basefactory‑orchestrator"></a>
|
|
289
|
-
## BaseFactory – Document Assembly Orchestrator
|
|
290
|
-
|
|
291
|
-
**Responsibility** – Collects a configurable list of `BaseModule` subclasses, invokes each module’s `generate` method, concatenates their outputs, and reports progress.
|
|
292
|
-
|
|
293
|
-
**Interactions** – Instantiated by the UI layer (e.g., `run_file.py`). Receives a concrete `Model` (sync or async) and a `BaseProgress` implementation. Each module may call the model’s `get_answer` APIs, while `BaseFactory` logs every step through a shared `BaseLogger`.
|
|
294
|
-
|
|
295
|
-
**Technical Details**
|
|
296
|
-
- `BaseModule` is an abstract base class defining `generate(info: dict, model: Model)`.
|
|
297
|
-
- `DocFactory.__init__(*modules)` stores the supplied modules in `self.modules`.
|
|
298
|
-
- `generate_doc(info, model, progress)` creates a sub‑task, iterates over `self.modules`, concatenates `module_result` with double line breaks, logs success (`InfoLog`) and verbose output (`level=2`), updates the progress bar, then removes the sub‑task.
|
|
299
|
-
|
|
300
|
-
**Data Flow**
|
|
301
|
-
```
|
|
302
|
-
info dict → each module.generate → model.get_answer (optional) → string fragment
|
|
303
|
-
→ DocFactory aggregates → final documentation string
|
|
304
|
-
```
|
|
305
|
-
Side effects: history updates inside the model, log file writes, and UI progress updates.
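
A stripped‑down sketch of the assembly loop, leaving out logging and progress reporting. The two example modules are hypothetical, and appending a double line break after every fragment (including the last) is an assumption about how the concatenation is done.

```python
from abc import ABC, abstractmethod


class BaseModule(ABC):
    @abstractmethod
    def generate(self, info: dict, model) -> str:
        """Return one documentation fragment."""


class DocFactory:
    def __init__(self, *modules: BaseModule):
        self.modules = list(modules)

    def generate_doc(self, info: dict, model) -> str:
        output = ""
        for module in self.modules:
            module_result = module.generate(info, model)
            output += module_result + "\n\n"  # fragments separated by a blank line
        return output


class TitleModule(BaseModule):  # hypothetical module, for illustration only
    def generate(self, info: dict, model) -> str:
        return f"# Docs ({info['language']})"


class FooterModule(BaseModule):  # hypothetical module, for illustration only
    def generate(self, info: dict, model) -> str:
        return "Generated automatically."


doc = DocFactory(TitleModule(), FooterModule()).generate_doc({"language": "en"}, model=None)
print(doc)
```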
|
|
306
|
-
<a name="custom‑module‑generation"></a>
|
|
307
|
-
## CustomModule & CustomModuleWithOutContext – Tailored Intro Generation
|
|
308
|
-
|
|
309
|
-
**Responsibility** – Produce a custom description block either with or without the source‑code context.
|
|
310
|
-
|
|
311
|
-
**Interactions** – Both classes inherit `BaseModule`; they call post‑processor helpers `generete_custom_discription` / `generete_custom_discription_without`, passing the split source (`split_data`) and the shared `Model`.
|
|
312
|
-
|
|
313
|
-
**Technical Details**
|
|
314
|
-
- Constructor stores `self.discription`.
|
|
315
|
-
- `generate` extracts `info["code_mix"]` (or skips it) and `info["language"]`, then delegates to the appropriate helper.
|
|
316
|
-
- Returns the raw string produced by the helper.
|
|
317
|
-
|
|
318
|
-
**Data Flow**
|
|
319
|
-
```
|
|
320
|
-
info → split_data (max 5000 symbols) → generete_custom_discription → model.get_answer → description string
|
|
321
|
-
```
|
|
322
|
-
or, without context, directly `generete_custom_discription_without`.
|
|
323
|
-
<a name="doc‑generation‑orchestrator"></a>
|
|
324
|
-
## Orchestrator (`gen_doc` in run_file.py)
|
|
325
|
-
|
|
326
|
-
`gen_doc` wires together the core engine:
|
|
327
|
-
|
|
328
|
-
1. **Model layer** – creates a synchronous `GPTModel` and an asynchronous `AsyncGPTModel` using the global `API_KEY`.
|
|
329
|
-
2. **Manager** – instantiated with the target `project_path`, the parsed `Config`, both model instances, and a console‑based progress bar (`ConsoleGtiHubProgress`). The manager centralises file scanning, caching, and doc assembly.
|
|
330
|
-
3. **Pipeline** –
|
|
331
|
-
* `manager.generate_code_file()` – extracts source files respecting `Config.ignore_files`.
|
|
332
|
-
* `manager.generete_doc_parts(max_symbols=structure_settings.max_doc_part_size)` – splits generated text into manageable chunks.
|
|
333
|
-
* `manager.factory_generate_doc(DocFactory(*custom_modules))` – runs user‑defined modules through the `DocFactory`.
|
|
334
|
-
* Conditional steps based on `StructureSettings`: ordering (`manager.order_doc()`) and intro links (`DocFactory(IntroLinks())`).
|
|
335
|
-
* `manager.clear_cache()` – removes temporary artifacts, keeping the bootstrap lightweight.
|
|
336
|
-
4. Returns the final document via `manager.read_file_by_file_key("output_doc")`.
|
|
337
|
-
|
|
338
|
-
---
|
|
339
|
-
<a name="factory-doc-assembly"></a>
|
|
340
|
-
## Factory‑Based Document Assembly
|
|
341
|
-
|
|
342
|
-
`factory_generate_doc(doc_factory)`
|
|
343
|
-
- Loads current `output_doc` and `code_mix`.
|
|
344
|
-
- Builds `info` dict (`language`, `full_data`, `code_mix`).
|
|
345
|
-
- Logs the module list and input sizes.
|
|
346
|
-
- Executes `doc_factory.generate_doc(info, sync_model, progress_bar)` – the orchestrator that runs each `BaseModule`.
|
|
347
|
-
- Prepends the new fragment to the existing document and writes back.
|
|
348
|
-
<a name="intro‑modules‑link‑and‑text"></a>
|
|
349
|
-
## IntroLinks & IntroText – Automatic Intro Construction
|
|
350
|
-
|
|
351
|
-
**Responsibility** – Create introductory sections: a list of HTML links (`IntroLinks`) and a natural‑language overview (`IntroText`).
|
|
352
|
-
|
|
353
|
-
**Interactions** – Both modules pull data from `info` (`full_data` or `global_data`), invoke post‑processor utilities (`get_all_html_links`, `get_links_intro`, `get_introdaction`) which internally query the provided `Model`.
|
|
354
|
-
|
|
355
|
-
**Technical Details**
|
|
356
|
-
- `IntroLinks.generate` → `get_all_html_links` → `get_links_intro(model, language)`.
|
|
357
|
-
- `IntroText.generate` → `get_introdaction(model, language)`.
|
|
358
|
-
- Each returns a ready‑to‑insert markdown/HTML fragment.
|
|
359
|
-
|
|
360
|
-
**Data Flow**
|
|
361
|
-
```
|
|
362
|
-
info → extractor (links or global data) → post‑processor → model.get_answer → intro fragment
|
|
363
|
-
```
|
|
364
|
-
All fragments are later concatenated by `BaseFactory`.
|
|
365
|
-
|
|
366
|
-
<a name="inter‑module‑interactions"></a>
|
|
367
|
-
## Inter‑module Interactions
|
|
368
|
-
|
|
369
|
-
* Both wrappers import `ModelExhaustedException` (raised when no fallback model remains).
|
|
370
|
-
* Logging relies on `BaseLogger` and concrete `InfoLog/WarningLog/ErrorLog` objects from the UI layer.
|
|
371
|
-
* The concrete models are consumed by the **engine** (e.g., `run_file.py`’s `gen_doc`) which injects them into the `Manager` for document generation.
|
|
372
|
-
|
|
373
|
-
**Data flow**
|
|
374
|
-
```
|
|
375
|
-
User prompt → Model.get_answer / AsyncModel.get_answer
|
|
376
|
-
↳ History updated (user → assistant)
|
|
377
|
-
↳ generate_answer → Groq/AsyncGroq → chat_completion
|
|
378
|
-
↳ result returned → History records assistant reply
|
|
379
|
-
```
|
|
380
|
-
|
|
381
|
-
All side effects are confined to `self.history` and log file emission (controlled by `ProjectBuildConfig`). This fragment therefore provides the core LLM request/response loop, with deterministic fallback and full traceability for both synchronous and asynchronous execution paths.
|
|
382
|
-
<a name="manager-class-usage"></a>
|
|
383
|
-
<a name="manager-orchestration"></a>
|
|
384
|
-
## Manager – Orchestration of Project Pipeline
|
|
385
|
-
|
|
386
|
-
**Responsibility** – Coordinates the end‑to‑end doc‑generation workflow for a given project directory: prepares cache, creates a mixed source view, drives part‑wise or factory‑based documentation, and finally orders the sections.
|
|
387
|
-
|
|
388
|
-
**Interactions** – Instantiated by the CLI (`run_file.py`). Receives a `Config`, optional `Model`/`AsyncModel`, and a `BaseProgress` implementation. It logs via `BaseLogger`, writes files to `<project>/.auto_doc_cache`, and updates the UI progress bar after each step.
|
|
389
|
-
<a name="assumptions-and‑constraints"></a>
|
|
390
|
-
## Assumptions and constraints
|
|
391
|
-
|
|
392
|
-
- The logging classes must be importable; any failure in `autodocgenerator.ui.logging` will raise an `ImportError` and abort package import.
|
|
393
|
-
- The environment must support Rich’s terminal capabilities; otherwise, fallback rendering may occur but the “ADG” banner will still print.
|
|
394
|
-
|
|
395
|
-
By centralising logger creation here, the package guarantees a uniform logging experience across all components while keeping the bootstrap lightweight.
|
|
396
|
-
<a name="cache-and-path-resolution"></a>
|
|
397
|
-
## Cache Management & File Path Resolution
|
|
398
|
-
|
|
399
|
-
- `CACHE_FOLDER_NAME` designates the hidden cache folder.
|
|
400
|
-
- `FILE_NAMES` maps logical keys (`code_mix`, `global_info`, `logs`, `output_doc`) to concrete filenames.
|
|
401
|
-
- `get_file_path(key)` builds an absolute path inside the cache; `read_file_by_file_key` returns its contents.
|
|
402
|
-
- Constructor creates the cache directory if absent and attaches a `FileLoggerTemplate` to `BaseLogger`.
|
|
403
|
-
<a name="data‑flow-and‑side‑effects"></a>
|
|
404
|
-
## Data flow, inputs, and side effects
|
|
405
|
-
|
|
406
|
-
- **Input**: No external parameters; the only implicit input is the environment’s standard output stream.
|
|
407
|
-
- **Output**: The literal string “ADG” is written to stdout; log records are emitted to the console (or configured Rich handlers).
|
|
408
|
-
- **Side effects**: Global state mutation – the module‑level `logger` object becomes available for import, and the console receives the “ADG” banner. This side effect is intentional to give immediate visual feedback that the Auto Doc Generator package has been loaded.
|
|
409
|
-
<a name="split-data"></a>
|
|
410
|
-
## `split_data` – token‑aware chunking of source text
|
|
411
|
-
|
|
412
|
-
**Responsibility** – Breaks a single string into a list of fragments sized around `max_symbols`. It first splits on newlines, repeatedly bisects any piece longer than 1.5 × `max_symbols`, then recombines adjacent pieces up to a 1.25 × `max_symbols` limit to keep chunks balanced.
|
|
413
|
-
|
|
414
|
-
**Interactions** – Uses `BaseLogger` to emit start/finish messages; no external state is read or written.
|
|
415
|
-
|
|
416
|
-
**Data Flow** – *Input*: `data: str`, `max_symbols: int`. *Output*: `list[str]` of ready‑for‑LLM chunks. Side‑effects are limited to logged messages.
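
The three stages (split on newlines, bisect oversized pieces, recombine) can be sketched like this. It is an approximation under assumptions: logging is omitted, recombination is greedy, and bisected halves are rejoined with a newline, which the real function may handle differently.

```python
def split_data(data: str, max_symbols: int) -> list[str]:
    if max_symbols < 1:
        raise ValueError("max_symbols must be at least 1")

    pieces = data.split("\n")

    # Stage 1: repeatedly bisect any piece longer than 1.5 * max_symbols.
    bisect_limit = 1.5 * max_symbols
    i = 0
    while i < len(pieces):
        piece = pieces[i]
        if len(piece) > bisect_limit:
            middle = len(piece) // 2
            pieces[i:i + 1] = [piece[:middle], piece[middle:]]  # re-check both halves
        else:
            i += 1

    # Stage 2: greedily recombine pieces while staying within 1.25 * max_symbols.
    join_limit = 1.25 * max_symbols
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for piece in pieces:
        added = len(piece) + (1 if current else 0)  # +1 for the joining newline
        if current and current_len + added > join_limit:
            chunks.append("\n".join(current))
            current, current_len = [piece], len(piece)
        else:
            current.append(piece)
            current_len += added
    if current:
        chunks.append("\n".join(current))
    return chunks


sample = "\n".join(["x" * 40] * 10)
print([len(chunk) for chunk in split_data(sample, 100)])
```

Using two different thresholds means a line is only cut when it is far too long, while ordinary lines are packed together whole.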
|
|
417
|
-
<a name="yaml‑to‑runtime‑objects"></a>
|
|
418
|
-
## YAML → Runtime Objects (ConfigReader)
|
|
419
|
-
|
|
420
|
-
The `read_config` function is the entry point for translating a user‑supplied `autodocconfig.yml` into three ready‑to‑use objects:
|
|
421
|
-
|
|
422
|
-
* **`Config`** – holds global ignore patterns, language, project name and a `ProjectBuildConfig` instance.
|
|
423
|
-
* **`custom_modules`** – a list of `CustomModule` or `CustomModuleWithOutContext` built from the `custom_descriptions` section; the leading “%” marker selects the context‑less variant.
|
|
424
|
-
* **`StructureSettings`** – controls how the final documentation is sliced (`max_doc_part_size`) and whether intro links or ordering are injected.
|
|
425
|
-
|
|
426
|
-
The function:
|
|
427
|
-
1. Loads YAML via `yaml.safe_load`.
|
|
428
|
-
2. Instantiates `Config` and populates ignore patterns, language, and project metadata.
|
|
429
|
-
3. Creates a `ProjectBuildConfig`, applies any build‑time flags (e.g., `save_logs`, `log_level` – the package’s uniform logging knobs), and attaches it to `Config`.
|
|
430
|
-
4. Iterates over `ignore_files` and `project_additional_info` to extend the `Config` state.
|
|
431
|
-
5. Maps `custom_descriptions` to the appropriate module class.
|
|
432
|
-
6. Initializes `StructureSettings` with defaults (`include_intro_links=True`, `include_order=True`, `max_doc_part_size=5_000`) and overrides them from the YAML block.
|
|
433
|
-
|
|
434
|
-
Outputs are returned as a tuple, ready for the runner.
|
|
435
|
-
|
|
436
|
-
---
|
|
437
|
-
<a name="asyncgptmodel‑async‑wrapper"></a>
|
|
438
|
-
## AsyncGPTModel – Asynchronous LLM Wrapper
|
|
439
|
-
|
|
440
|
-
Mirrors `GPTModel` but uses **AsyncGroq** (`self.client = AsyncGroq(...)`) and async‑compatible methods.
|
|
441
|
-
|
|
442
|
-
* `generate_answer` is declared `async`; all internal steps are identical to the sync version, except the `await` on `self.client.chat.completions.create`.
|
|
443
|
-
* Logging, fallback handling, and result extraction are unchanged, ensuring parity between sync and async pipelines.
|
|
444
|
-
<a name="gptmodel‑sync‑wrapper"></a>
|
|
445
|
-
## GPTModel – Synchronous LLM Wrapper
|
|
446
|
-
|
|
447
|
-
Derived from `Model`, `GPTModel` binds a **Groq** client (`self.client = Groq(api_key=self.api_key)`) and a `BaseLogger`.
|
|
448
|
-
|
|
449
|
-
**Key flow (`generate_answer`)**
|
|
450
|
-
1. Log start (`InfoLog`).
|
|
451
|
-
2. Resolve `messages` from `self.history.history` or an explicit `prompt`.
|
|
452
|
-
3. Loop over `self.regen_models_name` until a successful completion:
|
|
453
|
-
* Attempt `self.client.chat.completions.create(messages=messages, model=model_name)`.
|
|
454
|
-
* On exception, log a warning and advance `self.current_model_index` (wrap‑around).
|
|
455
|
-
* If the list is empty, raise `ModelExhaustedException`.
|
|
456
|
-
4. Extract `result = chat_completion.choices[0].message.content`.
|
|
457
|
-
5. Log the model used and the raw answer (verbosity level 2).
|
|
458
|
-
6. Return `result`.
|
|
459
|
-
|
|
460
|
-
`get_answer` enriches the history before and after the call, providing a convenient one‑step query interface.
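The fallback loop in step 3 might look roughly like the sketch below. It does not use the Groq client: `call_model` is a hypothetical callable standing in for `self.client.chat.completions.create`, and stopping after one full pass over the list is an assumption made so the example always terminates.

```python
class ModelExhaustedException(Exception):
    """Stand-in for the exception named in the text."""


def generate_with_fallback(call_model, model_names, start_index=0):
    """Try models in order, moving to the next candidate after each failure."""
    if not model_names:
        raise ModelExhaustedException("no models available")
    index = start_index % len(model_names)
    for _ in range(len(model_names)):  # assumption: at most one full pass
        name = model_names[index]
        try:
            return name, call_model(name)
        except Exception as exc:
            print(f"warning: {name} failed ({exc}); trying next model")
            index = (index + 1) % len(model_names)  # wrap-around
    raise ModelExhaustedException("all models failed")


def fake_call(name):
    # Hypothetical client call: only the second model answers.
    if name != "model-b":
        raise RuntimeError("unavailable")
    return "hello"


used, answer = generate_with_fallback(fake_call, ["model-a", "model-b", "model-c"])
```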
|
|
461
|
-
<a name="history‑prompt‑buffer"></a>
|
|
462
|
-
## History – Prompt Context Buffer
|
|
463
|
-
|
|
464
|
-
`History` initialises with the system prompt (`BASE_SYSTEM_TEXT`).
|
|
465
|
-
* `self.history` holds a list of `{role, content}` dicts.
|
|
466
|
-
* `add_to_history` appends new entries, used by the higher‑level `Model`/`AsyncModel` APIs to record user and assistant turns.
|
|
467
|
-
* Exposed to callers via `Model.get_answer` and `AsyncModel.get_answer`, enabling multi‑turn interactions.
|
|
468
|
-
The `<a name="CONTENT_DESCRIPTION"></a>` tag follows strict content rules (no filenames, extensions, generic terms, or URLs). This ensures a clean, tag‑driven snippet suitable for anchor‑based navigation.
|
|
469
|
-
<a name="anchor-based-chunk-extraction"></a>
|
|
470
|
-
## Anchor‑Based Chunk Extraction
|
|
471
|
-
|
|
472
|
-
**Responsibility** – Parses a Markdown document, isolates sections that begin with a well‑formed `<a name="…"></a>` anchor, and returns a mapping **anchor → raw section text**.
|
|
473
|
-
|
|
474
|
-
**Interactions** – Consumes raw Markdown (e.g., the generated `output_doc.md`) and supplies the dictionary to the *semantic sorter* (`get_order`). No external state is touched; the function is pure.
|
|
475
|
-
|
|
476
|
-
**Technical details** –
|
|
477
|
-
* `extract_links_from_start` scans each chunk’s first line with regex `^<a name=["']?(.*?)["']?</a>`; anchors longer than five characters become `#anchor`.
|
|
478
|
-
* `split_text_by_anchors` uses a look‑ahead split (`(?=<a name=…>)`) to create chunks, trims whitespace, validates a 1‑to‑1 anchor‑chunk relationship, and builds the result dict.
|
|
479
|
-
* Returns `None` on mismatch, allowing callers to abort safely.
|
|
480
|
-
|
|
481
|
-
**Data flow** – Input: full Markdown string. Output: `dict[str, str]` where keys are `#anchor` strings and values are the associated section bodies. Side‑effects: none.
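A minimal sketch of the look‑ahead split and the 1‑to‑1 validation described above. The function name and both patterns are assumptions written for this example (note the `>` before `</a>`, which the quoted regex omits), and each value here keeps the anchor line as part of the chunk.

```python
import re

# Assumed patterns for illustration; the package's own regexes may differ.
ANCHOR_SPLIT = re.compile(r'(?=<a name=["\']?[^"\'>]+["\']?></a>)')
ANCHOR_AT_START = re.compile(r'^<a name=["\']?(.*?)["\']?></a>')


def split_by_anchors(markdown):
    """Return {"#anchor": chunk} or None when chunks and anchors do not match up."""
    chunks = [c.strip() for c in ANCHOR_SPLIT.split(markdown) if c.strip()]
    result = {}
    for chunk in chunks:
        match = ANCHOR_AT_START.match(chunk)
        # Anchors must be longer than five characters to count.
        if match is None or len(match.group(1)) <= 5:
            return None
        result["#" + match.group(1)] = chunk
    # Duplicate anchors would collapse keys and break the 1-to-1 relationship.
    if len(result) != len(chunks):
        return None
    return result


doc = (
    '<a name="intro-section"></a>\n## Intro\nHello\n'
    '<a name="usage-notes"></a>\n## Usage\nBye\n'
)
sections = split_by_anchors(doc)
```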
|
|
482
|
-
<a name="custom-description-synthesis"></a>
## Custom Description Synthesis
|
|
483
|
-
`generete_custom_discription` iterates over pre‑split documentation chunks. For each chunk it assembles a detailed system‑role prompt, embeds the chunk as context, adds **`BASE_CUSTOM_DISCRIPTIONS`**, and asks the model to describe a user‑provided `custom_description`. It stops on the first non‑error response (absence of “!noinfo”/“No information found”).
|
|
484
|
-
`generete_custom_discription_without` skips the context step and forces the model to prepend a single `<a name="CONTENT_DESCRIPTION"></a>` tag.
|
|
485
|
-
<a name="doc‑chunk‑behaviour"></a>
|
|
486
|
-
## Documentation Chunk Behaviour (`StructureSettings`)
|
|
487
|
-
|
|
488
|
-
* `include_intro_links` – injects a generated table‑of‑contents block (`IntroLinks`) at the document head.
|
|
489
|
-
* `include_order` – sorts generated parts to respect source file order, improving readability.
|
|
490
|
-
* `max_doc_part_size` – caps the character count of each generated segment; the manager respects this when calling `generete_doc_parts`.
|
|
491
|
-
|
|
492
|
-
Together these settings give developers fine‑grained control over the size, ordering, and navigation of the auto‑generated documentation while the underlying logging remains consistent across all modules.
|
|
493
|
-
<a name="document-ordering"></a>
|
|
494
|
-
## Document Ordering & Cleanup
|
|
495
|
-
|
|
496
|
-
`order_doc()` splits the final markdown by anchor tags (`split_text_by_anchors`), asks the model for the correct sequence via `get_order`, and overwrites `output_doc.md`.
|
|
497
|
-
|
|
498
|
-
`clear_cache()` removes the log file unless `config.pbc.save_logs` is true.
|
|
499
|
-
|
|
500
|
-
**Data Flow Summary**
|
|
501
|
-
|
|
502
|
-
```
|
|
503
|
-
project_dir → Manager init → cache files
|
|
504
|
-
→ CodeMix → code_mix.txt
|
|
505
|
-
→ gen_doc_parts / DocFactory → output_doc.md
|
|
506
|
-
→ split_text_by_anchors → get_order → reordered output_doc.md
|
|
507
|
-
```
|
|
508
|
-
<a name="generate-descriptions"></a>
|
|
509
|
-
## `generate_describtions_for_code` – LLM‑driven API documentation generator
|
|
510
|
-
|
|
511
|
-
**Responsibility** – For each code snippet, asks the LLM to produce a markdown‑formatted description following strict guidelines (components, parameters, usage example).
|
|
512
|
-
|
|
513
|
-
**Interactions** – Builds a fixed system prompt, adds the code as user content, calls `model.get_answer_without_history`, and tracks progress via `BaseProgress`.
|
|
514
|
-
|
|
515
|
-
**Output** – List of description strings matching the order of `data`.
|
|
516
|
-
|
|
517
|
-
---
|
|
518
|
-
<a name="gen-doc-parts"></a>
|
|
519
|
-
## `gen_doc_parts` – orchestrator for synchronous batch documentation
|
|
520
|
-
|
|
521
|
-
**Responsibility** – Splits the full source code via `split_data`, then iteratively calls `write_docs_by_parts` for each chunk, concatenating results. Keeps a sliding window of the last 3000 characters as context for the next call.
|
|
522
|
-
|
|
523
|
-
**Interactions** – Creates a `BaseProgress` sub‑task, updates it per chunk, and logs overall progress.
|
|
524
|
-
|
|
525
|
-
**Data Flow** – *Inputs*: `full_code_mix`, `max_symbols`, `model`, `language`, `progress_bar`. *Output*: single markdown string containing the entire documentation. Side‑effects: progress‑bar mutation and logging.
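The sliding‑window idea can be illustrated with a stub in place of the LLM call. In this sketch `write_part` is a hypothetical callable standing in for `write_docs_by_parts`, and taking the window from the most recent answer is an assumption about how the context is cut.

```python
def gen_doc_parts_sketch(chunks, write_part, window=3000):
    """Generate parts one by one, passing the tail of the last answer as context."""
    parts = []
    prev_info = None
    for part_id, chunk in enumerate(chunks):
        answer = write_part(part_id, chunk, prev_info)
        parts.append(answer)
        prev_info = answer[-window:]  # keep only the last `window` characters
    return "\n\n".join(parts)


seen_context = []


def fake_writer(part_id, chunk, prev_info):
    # Stub for the model call: records the context it was given.
    seen_context.append(prev_info)
    return f"## Part {part_id}\n{chunk.upper()}"


doc = gen_doc_parts_sketch(["alpha", "beta"], fake_writer, window=5)
```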
|
|
526
|
-
<a name="html-link-extraction-logic"></a>
## HTML Link Extraction Logic
|
|
527
|
-
The function **`get_all_html_links`** scans a documentation string for HTML anchor tags (`<a name="…"></a>`). Using a compiled regex, it captures each anchor name, prefixes it with “#”, and returns a list of link fragments. Logging via **`BaseLogger`** reports the start, the count, and the raw list (verbosity 1). The routine treats only anchors longer than five characters as meaningful and ignores shorter matches.
|
|
528
|
-
<a name="link-based-introduction-generation"></a>
## Link‑Based Introduction Generation
|
|
529
|
-
**`get_links_intro`** builds a three‑message prompt for a **`Model`** (typically a `GPTModel`). System messages enforce language and inject the constant **`BASE_INTRODACTION_CREATE_LINKS`**; the user message supplies the extracted links. The model’s response (a prose introduction containing those links) is returned after logging the operation. The function is language‑agnostic, defaulting to English.
|
|
530
|
-
<a name="plain-introduction-synthesis"></a>
## Plain Introduction Synthesis
|
|
531
|
-
**`get_introdaction`** (note the historic typo) follows the same prompt pattern but uses **`BASE_INTRO_CREATE`** and feeds the full documentation (`global_data`). The model returns a standalone introductory paragraph, which the caller integrates elsewhere.
|
|
532
|
-
<a name="parentmodel‑shared‑state"></a>
|
|
533
|
-
## ParentModel – Shared Model State
|
|
534
|
-
|
|
535
|
-
`ParentModel` centralises configuration for every LLM client.
|
|
536
|
-
* **Constructor arguments** – `api_key`, optional `History` object, `use_random` flag.
|
|
537
|
-
* **State built** –
|
|
538
|
-
* `self.history` stores the rolling conversation.
|
|
539
|
-
* `self.api_key` propagates to concrete clients.
|
|
540
|
-
* `self.regen_models_name` is a shuffled (if `use_random`) copy of the global `MODELS_NAME` list, defining the fallback order when a model fails.
|
|
541
|
-
* `self.current_model_index` tracks the active candidate.
|
|
542
|
-
|
|
543
|
-
The class is inherited by both sync (`Model`) and async (`AsyncModel`) wrappers, guaranteeing identical fallback logic across execution modes.
|
|
544
|
-
<a name="progress-implementations"></a>
|
|
545
|
-
## `BaseProgress` & concrete progress reporters – task progress visualisation
|
|
546
|
-
|
|
547
|
-
**Responsibility** – Defines a minimal progress‑tracking contract (`create_new_subtask`, `update_task`, `remove_subtask`). Concrete classes implement visual feedback for either Rich‑based terminal bars (`LibProgress`) or plain console prints (`ConsoleGtiHubProgress`).
|
|
548
|
-
|
|
549
|
-
**Interactions** – Documentation generators instantiate a progress object and invoke the three methods around each chunk‑processing step. No shared mutable state beyond the Rich `Progress` instance or internal `ConsoleTask` objects.
|
|
550
|
-
|
|
551
|
-
**Technical Details**
|
|
552
|
-
* `LibProgress` wraps `rich.progress.Progress`; maintains a base task (`_base_task`) and a current sub‑task (`_cur_sub_task`). Updating advances the appropriate task.
|
|
553
|
-
* `ConsoleTask` prints a start banner and a percentage on each `progress()` call.
|
|
554
|
-
* `ConsoleGtiHubProgress` composes a permanent “General Progress” `ConsoleTask` and creates per‑chunk `ConsoleTask`s on demand. `update_task()` delegates to the active sub‑task or the general one.
|
|
555
|
-
* Both concrete classes inherit from the empty `BaseProgress` stub, ensuring a consistent API.
|
|
556
|
-
|
|
557
|
-
**Data Flow** – *Inputs*: `name` (string) and `total_len` (int) for sub‑task creation; subsequent calls receive no arguments. *Outputs*: visual side‑effects on stdout (Rich bar or console prints). Internally, task identifiers are stored to allow incremental updates and later removal.
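The three‑method contract could be sketched like this. The class names are hypothetical, and the recorder below collects events in a list instead of printing, which is a simplification made so the behaviour is easy to inspect.

```python
class BaseProgressSketch:
    """Empty contract: subclasses decide how progress is shown."""

    def create_new_subtask(self, name, total_len):
        pass

    def update_task(self):
        pass

    def remove_subtask(self):
        pass


class RecordingProgress(BaseProgressSketch):
    """Hypothetical reporter that records events instead of drawing a bar."""

    def __init__(self):
        self.events = []
        self._sub = None

    def create_new_subtask(self, name, total_len):
        self._sub = {"name": name, "done": 0, "total": total_len}
        self.events.append(f"start {name}")

    def update_task(self):
        if self._sub is None:
            # No active sub-task: the update goes to the general task.
            self.events.append("general +1")
            return
        self._sub["done"] += 1
        percent = 100 * self._sub["done"] // self._sub["total"]
        self.events.append(f"{self._sub['name']} {percent}%")

    def remove_subtask(self):
        self._sub = None


progress = RecordingProgress()
progress.create_new_subtask("chunks", 2)
progress.update_task()
progress.update_task()
progress.remove_subtask()
progress.update_task()
```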
|
|
558
|
-
|
|
559
|
-
---
|
|
560
|
-
|
|
561
|
-
These two modules together supply the UI layer’s logging and progress reporting facilities, enabling the higher‑level documentation pipeline to emit timestamped diagnostics and user‑visible progress without coupling to a specific output medium.
|
|
562
|
-
<a name="semantic-title-ordering"></a>
|
|
563
|
-
## Semantic Title Ordering
|
|
564
|
-
|
|
565
|
-
**Responsibility** – Sends the extracted anchor titles to the LLM (`Model.get_answer_without_history`) and reassembles the document in the LLM‑proposed order.
|
|
566
|
-
|
|
567
|
-
**Interactions** – Receives the anchor‑to‑section map from the extractor, logs progress via `BaseLogger`, and calls the **stateless** LLM endpoint (`model.get_answer_without_history`).
|
|
568
|
-
|
|
569
|
-
**Technical details** –
|
|
570
|
-
* Constructs a user‑role prompt asking the model to “Sort the following titles semantically … Return ONLY a comma‑separated list … leave # in title”.
|
|
571
|
-
* Parses the comma‑separated response, trims entries, and iterates over the ordered list, concatenating the corresponding chunk text (`order_output`).
|
|
572
|
-
* Detailed logging at three verbosity levels records input keys, raw chunk dict, and per‑chunk inclusion.
|
|
573
|
-
|
|
574
|
-
**Data flow** – Input: `Model` instance, `dict[anchor, chunk]`. Output: single string containing the reordered document sections, ready for final concatenation. No file I/O occurs here.
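The parse‑and‑reassemble step might be sketched as below, with the LLM reply supplied as a plain string. `reassemble` is a hypothetical name, and skipping titles that are not in the chunk map is an assumption about how unexpected replies are handled.

```python
def reassemble(chunks, llm_reply):
    """Concatenate chunk texts in the order given by a comma-separated reply."""
    ordered_titles = [title.strip() for title in llm_reply.split(",")]
    order_output = ""
    for title in ordered_titles:
        if title in chunks:  # assumption: unknown titles are ignored
            order_output += chunks[title] + "\n"
    return order_output


chunks = {"#usage": "Usage text", "#intro": "Intro text"}
ordered_doc = reassemble(chunks, "#intro, #usage , #ghost")
```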
|
|
575
|
-
<a name="sync-doc-part-generation"></a>
|
|
576
|
-
## Synchronous Documentation Part Generation
|
|
577
|
-
|
|
578
|
-
`generete_doc_parts(max_symbols=5_000)`
|
|
579
|
-
- Reads the previously stored `code_mix`.
|
|
580
|
-
- Calls `gen_doc_parts(full_code_mix, max_symbols, sync_model, config.language, progress_bar)` which splits the mix, queries the model, and streams partial docs.
|
|
581
|
-
- Persists the assembled output to `output_doc.md` and updates progress.
|
|
582
|
-
<a name="write-autodocfile-options"></a>
|
|
583
|
-
The file is a simple key‑value list written in YAML.
|
|
584
|
-
Top‑level keys define the project and its behavior:
|
|
585
|
-
|
|
586
|
-
- **project_name** – a short title for the documentation generator.
|
|
587
|
-
- **language** – language code for generated text (e.g., “en”).
|
|
588
|
-
|
|
589
|
-
A **build** block controls execution details:
|
|
590
|
-
- **save_logs** – true/false to keep the generation log.
|
|
591
|
-
- **log_level** – numeric level (higher means more detail).
|
|
592
|
-
|
|
593
|
-
A **structure** block influences how the output is organized:
|
|
594
|
-
- **include_intro_links** – include navigation links at the beginning.
|
|
595
|
-
- **include_order** – keep sections in the order they appear in the source.
|
|
596
|
-
- **max_doc_part_size** – maximum character count for each generated piece.
|
|
597
|
-
|
|
598
|
-
An **additional_info** block can hold free‑form data, such as a global description of the project.
|
|
599
|
-
|
|
600
|
-
A **custom_descriptions** list allows you to add specific prompts that the generator will answer, for example instructions on installation, how to write this file, or how to use particular classes.
|
|
601
|
-
<a name="write-docs-by-parts"></a>
|
|
602
|
-
## `write_docs_by_parts` – synchronous single‑part documentation generation
|
|
603
|
-
|
|
604
|
-
**Responsibility** – Constructs a system‑prompt (including language, part ID, optional previous output) and calls `model.get_answer_without_history` to obtain a markdown description for one code fragment. Trims surrounding markdown fences before returning.
|
|
605
|
-
|
|
606
|
-
**Interactions** – Relies on `BASE_PART_COMPLITE_TEXT`, `BaseLogger`, and the synchronous `Model` interface.
|
|
607
|
-
|
|
608
|
-
**Data Flow** – *Inputs*: `part_id`, `part`, `model`, optional `prev_info`, `language`. *Output*: cleaned LLM answer (`str`). Logs request/response sizes.
|
|
609
|
-
<a name="async-gen-doc-parts"></a>
|
|
610
|
-
## `async_gen_doc_parts` – parallel orchestrator for asynchronous batch documentation
|
|
611
|
-
|
|
612
|
-
**Responsibility** – Mirrors `gen_doc_parts` but launches an `async_write_docs_by_parts` task per chunk, limited by a semaphore of size 4, and aggregates the async results with `asyncio.gather`.
|
|
613
|
-
|
|
614
|
-
**Interactions** – Uses the same `BaseProgress` API (sub‑task creation, per‑task updates, removal), the shared semaphore, and the asynchronous `AsyncModel`.
|
|
615
|
-
|
|
616
|
-
**Data Flow** – *Inputs*: `full_code_mix`, `global_info`, `max_symbols`, `model`, `language`, `progress_bar`. *Output*: concatenated markdown documentation (`str`). Logs start/end and total length.
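The semaphore‑plus‑`gather` pattern described above can be sketched with the standard library alone. `write_part` and `gen_all` are hypothetical stand‑ins, and `asyncio.sleep` takes the place of the LLM request; the `tracker` dict exists only to show that no more than four tasks run at once.

```python
import asyncio


async def write_part(part, semaphore, tracker):
    async with semaphore:  # at most `limit` coroutines pass this point at once
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
        await asyncio.sleep(0.01)  # stands in for the LLM request
        tracker["active"] -= 1
        return f"doc for {part}"


async def gen_all(parts, limit=4):
    semaphore = asyncio.Semaphore(limit)
    tracker = {"active": 0, "peak": 0}
    # gather returns results in the order the awaitables were passed in.
    results = await asyncio.gather(
        *(write_part(p, semaphore, tracker) for p in parts)
    )
    return "\n".join(results), tracker["peak"]


document, peak = asyncio.run(gen_all([f"chunk{i}" for i in range(10)]))
```

Because `asyncio.gather` preserves input order, the joined document keeps the chunk order even though the requests overlap.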
|
|
617
|
-
<a name="async-write-docs-by-parts"></a>
|
|
618
|
-
## `async_write_docs_by_parts` – concurrent part processing
|
|
619
|
-
|
|
620
|
-
**Responsibility** – Same logical work as `write_docs_by_parts` but runs inside an `asyncio.Semaphore` to limit parallel requests. Calls `async_model.get_answer_without_history` and optionally invokes `update_progress`.
|
|
621
|
-
|
|
622
|
-
**Interactions** – Accepts a shared `semaphore` object, uses `BaseLogger`, and expects an `AsyncModel` that implements an async `get_answer_without_history`.
|
|
623
|
-
|
|
624
|
-
**Data Flow** – *Inputs*: `part`, `async_model`, `global_info`, `semaphore`, optional `prev_info`, `language`, `update_progress`. *Output*: trimmed answer (`str`). Emits progress callbacks and logs.
|
|
625
|
-
<a name="async-compress"></a>
|
|
626
|
-
## `async_compress` – semaphore‑protected async compression
|
|
627
|
-
|
|
628
|
-
**Responsibility** – Performs the same LLM call as `compress` but within an `asyncio.Semaphore` to limit concurrency and updates the async progress bar.
|
|
629
|
-
|
|
630
|
-
**Interactions** – Awaits `AsyncModel.get_answer_without_history`; updates `BaseProgress`.
|
|
631
|
-
|
|
632
|
-
**Flow** – Build prompt (identical to `compress`), `await model.get_answer_without_history`, then `progress_bar.update_task()`.
|
|
633
|
-
|
|
634
|
-
**Data** – Input: `data: str`, `project_settings`, `model: AsyncModel`, `compress_power`, `semaphore`, `progress_bar`.
|
|
635
|
-
Output: `str` compressed answer.
|
|
636
|
-
|
|
637
|
-
---
|
|
638
|
-
<a name="async-compress-and-compare"></a>
|
|
639
|
-
## `async_compress_and_compare` – parallel batch compression
|
|
640
|
-
|
|
641
|
-
**Responsibility** – Dispatches `async_compress` for every element in `data` (default concurrency = 4), gathers results, then re‑chunks them into groups of `compress_power`.
|
|
642
|
-
|
|
643
|
-
**Interactions** – Creates an `asyncio.Semaphore(4)`, populates `tasks`, awaits `asyncio.gather`, uses `BaseProgress` for a sub‑task.
|
|
644
|
-
|
|
645
|
-
**Result** – Returns `list[str]` where each entry is a newline‑joined group of compressed chunks.
|
|
646
|
-
|
|
647
|
-
---
|
|
648
|
-
<a name="compress-and-compare-sync"></a>
|
|
649
|
-
## `compress_and_compare` – batch sync compression
|
|
650
|
-
|
|
651
|
-
**Responsibility** – Groups input files into chunks of size `compress_power`, compresses each file with `compress`, concatenates results per chunk, and reports progress.
|
|
652
|
-
|
|
653
|
-
**Interactions** – Uses `BaseProgress` to create/update a sub‑task; repeatedly calls `compress`.
|
|
654
|
-
|
|
655
|
-
**Logic** –
|
|
656
|
-
* Allocate result list sized to `ceil(len(data)/compress_power)`.
|
|
657
|
-
* Loop over `data`, compute `curr_index = i // compress_power`, append each `compress` result plus newline.
|
|
658
|
-
* Update progress bar each iteration, then remove sub‑task.
|
|
659
|
-
|
|
660
|
-
**Data** – Input: `data: list[str]`, `model`, `project_settings`, `compress_power`, `progress_bar`.
|
|
661
|
-
Output: `list[str]` where each element contains the concatenated compressed chunk.
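The grouping logic above reduces to an integer division on the index. In this sketch `compress` is a hypothetical callable replacing the LLM‑backed function, and progress reporting is left out.

```python
import math


def group_compressed(data, compress, compress_power):
    """Compress each item and append it to the bucket at i // compress_power."""
    result = [""] * math.ceil(len(data) / compress_power)
    for i, item in enumerate(data):
        curr_index = i // compress_power
        result[curr_index] += compress(item) + "\n"
    return result


# Stub compressor: keeps only the first character of each input.
groups = group_compressed(["aa", "bb", "cc", "dd", "ee"], lambda s: s[0], 2)
```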
|
|
662
|
-
|
|
663
|
-
---
|
|
664
|
-
<a name="compress-single-pass"></a>
|
|
665
|
-
## `compress` – single‑pass LLM compression
|
|
666
|
-
|
|
667
|
-
**Responsibility** – Sends a raw code string to the LLM together with the project‑wide system prompt and a size‑adjusted compression prompt, returning the model’s answer.
|
|
668
|
-
|
|
669
|
-
**Interactions** – Calls `Model.get_answer_without_history`. No filesystem or network side‑effects other than the LLM request.
|
|
670
|
-
|
|
671
|
-
**Technical flow** –
|
|
672
|
-
1. Build `prompt` list (`system` → project prompt, `system` → base compress text, `user` → `data`).
|
|
673
|
-
2. Invoke `model.get_answer_without_history(prompt=prompt)`.
|
|
674
|
-
3. Return the answer string.
|
|
675
|
-
|
|
676
|
-
**Data** – Input: `data: str`, `project_settings: ProjectSettings`, `model: Model`, `compress_power: int`.
|
|
677
|
-
Output: compressed `str`. No mutation of arguments.
|
|
678
|
-
|
|
679
|
-
---
|
|
680
|
-
<a name="compress-to-one"></a>
|
|
681
|
-
## `compress_to_one` – iterative reduction to a single summary
|
|
682
|
-
|
|
683
|
-
**Responsibility** – Repeatedly compresses the list of strings until only one element remains, optionally using the async path.
|
|
684
|
-
|
|
685
|
-
**Interactions** – Calls either `compress_and_compare` or `async_compress_and_compare` inside a `while len(data) > 1` loop; updates a `BaseProgress` sub‑task each iteration.
|
|
686
|
-
|
|
687
|
-
**Data** – Input: `data: list[str]`, `model`, `project_settings`, `compress_power`, `use_async`, `progress_bar`.
|
|
688
|
-
Output: final aggregated string (`data[0]`).
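A rough sketch of the reduction loop, assuming a hypothetical `compress_group` callable that merges one group of strings into one. The guard on `compress_power` is an addition for this example: with a value below 2 the list would never shrink.

```python
def compress_to_one_sketch(data, compress_group, compress_power=2):
    """Repeatedly merge groups of `compress_power` items until one string is left."""
    if compress_power < 2:
        raise ValueError("compress_power must be at least 2 for the loop to end")
    data = list(data)
    if not data:
        raise ValueError("data must contain at least one string")
    while len(data) > 1:
        data = [
            compress_group(data[i:i + compress_power])
            for i in range(0, len(data), compress_power)
        ]
    return data[0]


# Stub merger: joins the group so the merge order stays visible.
summary = compress_to_one_sketch(["a", "b", "c", "d", "e"], "+".join)
```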
|
|
689
|
-
|
|
690
|
-
---
|
|
691
|
-
<a name="code-mix-generation"></a>
|
|
692
|
-
## Code Mix Generation Workflow
|
|
693
|
-
|
|
694
|
-
`generate_code_file()`
|
|
695
|
-
1. Logs start.
|
|
696
|
-
2. Instantiates `CodeMix(project_directory, config.ignore_files)`.
|
|
697
|
-
3. Calls `build_repo_content` → writes a concatenated source snapshot to `code_mix.txt`.
|
|
698
|
-
4. Logs completion and advances the progress bar.
|
|
699
|
-
|