@redpanda-data/docs-extensions-and-macros 4.12.1 → 4.12.3

@@ -1,937 +1,1289 @@
1
1
  = Redpanda Property Extractor
2
2
 
3
- The Redpanda Property Extractor automatically extracts configuration properties and type definitions from Redpanda's C++ source code and generates JSON schemas and AsciiDoc documentation.
3
+ Automatically generates Redpanda configuration property documentation from C++ source code.
4
4
 
5
- == Prerequisites
5
+ == What is this?
6
6
 
7
- Ensure the following prerequisites are installed:
7
+ The Property Extractor is part of the https://github.com/redpanda-data/docs-extensions-and-macros[`docs-extensions-and-macros`] package. It analyzes Redpanda's C++ source code to automatically generate accurate, up-to-date configuration property documentation in AsciiDoc format.
8
8
 
9
- - https://www.python.org/downloads/[Python 3.10 or higher]
10
- - A C++ compiler (such as `gcc` or `clang`)
11
- - https://www.gnu.org/software/make/[`make` utility]
9
+ **Why it exists:** Redpanda has hundreds of configuration properties defined in C++ code. Rather than manually maintaining documentation that can drift out of sync, this tool automatically extracts property definitions, default values, types, and descriptions directly from the source of truth—the code itself.
12
10
 
13
- Verify `make` installation:
11
+ **What it generates:**
12
+
13
+ * *Property reference documentation* (AsciiDoc files)
14
+ * *Topic property documentation* including defaults inherited from cluster properties
15
+ * *JSON schemas* with complete property metadata
16
+ * *Version-specific property files* for tracking changes across releases
17
+ * *Property change diffs* highlighting what changed between versions
18
+
19
+ == For documentation writers
20
+
21
+ === Quick start
22
+
23
+ The Property Extractor is installed as part of the `docs-extensions-and-macros` npm package. You interact with it through the `doc-tools` CLI from within your documentation repository.
24
+
25
+ *Prerequisites:*
26
+
27
+ * Node.js 18+ and npm
28
+ * Python 3.10+
29
+ * A C++ compiler (gcc/clang)
30
+ * Git
31
+
32
+ *Installation:*
14
33
 
15
34
  [,bash]
16
35
  ----
17
- make --version
36
+ # In your docs repository
37
+ npm install @redpanda-data/docs-extensions-and-macros
18
38
  ----
19
39
 
20
- == Quick start
40
+ *Generate property documentation:*
21
41
 
22
- . Clone the repository:
23
- +
24
42
  [,bash]
25
43
  ----
26
- git clone https://github.com/redpanda-data/docs-extensions-and-macros.git
27
- cd docs-extensions-and-macros/tools/property-extractor
44
+ # Generate docs for a specific Redpanda version
45
+ npx doc-tools generate property-docs --tag v25.3.1
46
+
47
+ # Generate docs with custom overrides
48
+ npx doc-tools generate property-docs \
49
+ --tag v25.3.1 \
50
+ --overrides path/to/property-overrides.json
51
+
52
+ # Generate docs and create consolidated partials
53
+ npx doc-tools generate property-docs \
54
+ --tag v25.3.1 \
55
+ --generate-partials
56
+
57
+ # Generate with diff from previous version
58
+ npx doc-tools generate property-docs \
59
+ --tag v25.3.1 \
60
+ --diff v25.2.1
28
61
  ----
29
62
 
30
- . Build and generate documentation:
31
- +
32
- [,bash]
63
+ *Automatic version detection and diff generation:*
64
+
65
+ If your repository has an `antora.yml` file at the root with a `latest-redpanda-tag` attribute, the tool provides automatic version management:
66
+
67
+ [,yaml]
33
68
  ----
34
- make build
69
+ # antora.yml
70
+ name: redpanda
71
+ title: Redpanda Documentation
72
+ asciidoc:
73
+   attributes:
74
+     latest-redpanda-tag: v25.2.1 # Current documented version
35
75
  ----
36
- +
37
- This command:
38
- +
39
- * Sets up a Python virtual environment
40
- * Clones the Redpanda source code to the specified branch or tag
41
- * Extracts properties and type definitions to `gen/properties-output.json`
42
- * Generates AsciiDoc documentation files in `output/`
43
76
 
44
- . View generated files:
45
- +
77
+ When you specify a new `--tag` without `--diff`, the tool automatically:
78
+
79
+ 1. Creates a diff between the current `latest-redpanda-tag` (v25.2.1) and your new tag (v25.3.1)
80
+ 2. Generates property documentation for the new version
81
+ 3. Updates `antora.yml` with the new tag
82
+
46
83
  [,bash]
47
84
  ----
48
- ls gen/properties-output.json
49
- ls output/pages/
85
+ # Current antora.yml has v25.2.1
86
+ # This command will:
87
+ # - Generate diff from v25.2.1 to v25.3.1
88
+ # - Generate docs for v25.3.1
89
+ # - Update antora.yml to v25.3.1
90
+ npx doc-tools generate property-docs --tag v25.3.1 --generate-partials
91
+
92
+ # Explicit --diff prevents automatic diff and antora.yml update
93
+ npx doc-tools generate property-docs --tag v25.3.1 --diff v25.1.1
50
94
  ----
51
95
 
52
- To clean generated files:
96
+ This workflow makes version updates seamless: specify the new version, and the tool handles diffing and updating your Antora configuration automatically.
53
97
 
54
- [,bash]
98
+ === Generated output
99
+
100
+ Running `doc-tools generate property-docs` creates:
101
+
102
+ [,text]
55
103
  ----
56
- make clean
104
+ modules/reference/
105
+ ├── pages/
106
+ │   └── (individual property pages - not commonly used)
107
+ ├── partials/
108
+ │   ├── properties/
109
+ │   │   ├── cluster-properties.adoc         # All cluster properties
110
+ │   │   ├── broker-properties.adoc          # All broker properties
111
+ │   │   ├── topic-properties.adoc           # All topic properties
112
+ │   │   ├── topic-property-mappings.adoc    # Topic→Cluster mappings
113
+ │   │   └── object-storage-properties.adoc
114
+ │   └── deprecated/
115
+ │       └── deprecated-properties.adoc
116
+ └── attachments/
117
+     ├── redpanda-properties-v25.3.1.json    # Versioned snapshot
118
+     └── redpanda-property-changes-*.json    # Change logs
57
119
  ----
58
120
 
59
- == How it works
121
+ *Include in your documentation:*
60
122
 
61
- === Architecture overview
123
+ [,asciidoc]
124
+ ----
125
+ // In your cluster properties reference page
126
+ \include::reference:partial$properties/cluster-properties.adoc[]
62
127
 
63
- The property extractor uses a multi-stage pipeline:
128
+ // In your topic properties reference page
129
+ \include::reference:partial$properties/topic-properties.adoc[]
130
+ ----
131
+
132
+ === Overriding property documentation
133
+
134
+ Create a `property-overrides.json` file to customize extracted documentation:
64
135
 
65
- [source,text]
136
+ [,json]
66
137
  ----
67
- C++ Source Code
68
-
69
- [Tree-sitter Parser] → AST
70
-
71
- [Property Extractor] Raw properties
72
-
73
- [Type Definition Extractor] → Auto-discovered types
74
-
75
- [Transformers Pipeline] → Enriched properties
76
-
77
- [Type Resolver] → Resolved types & defaults
78
-
79
- [Enum Default Mapper] → User-facing enum values
80
-
81
- [Chrono Evaluator] → Numeric values & human-readable times
82
-
83
- [Overrides Applier] → Final properties
84
-
85
- JSON Schema Output
138
+ {
139
+ "$comment": "Override descriptions, add examples, specify versions",
140
+ "properties": {
141
+ "kafka_api": {
142
+ "description": "Network address and port for Kafka API clients to connect to Redpanda brokers.",
143
+ "example": ".Example\n[,yaml]\n----\nredpanda:\n  kafka_api:\n    - name: internal\n      address: 0.0.0.0\n      port: 9092\n    - name: external\n      address: redpanda.example.com\n      port: 19092\n----",
144
+ "version": "v21.4.1",
145
+ "related_topics": [
146
+ "xref:manage:kubernetes/networking/k-networking-and-connectivity.adoc[]"
147
+ ]
148
+ }
149
+ }
150
+ }
86
151
  ----
87
152
 
88
- === Stage 1: Source code parsing
153
+ *Override fields:*
89
154
 
90
- The extractor uses https://tree-sitter.github.io/tree-sitter/[Tree-sitter] to parse C++ source code into Abstract Syntax Trees (ASTs). It identifies property declarations in:
155
+ * `description` - Replace auto-extracted description with custom text
156
+ * `example` - Add usage example in AsciiDoc format
157
+ * `example_file` - Load example from external file
158
+ * `version` - Document when property was introduced
159
+ * `related_topics` - Array of AsciiDoc xrefs to related content
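
As a rough illustration of how override fields could be layered on top of extracted metadata, here is a minimal Python sketch. The function name `apply_overrides` and the rule that overrides for unknown properties are skipped are assumptions made for this example; the tool's actual merge logic may differ.

```python
import copy


def apply_overrides(properties, overrides):
    """Return a copy of `properties` with per-property override fields merged in.

    Hypothetical helper for illustration only, not the tool's real API.
    """
    merged = copy.deepcopy(properties)
    for name, fields in overrides.get("properties", {}).items():
        if name not in merged:
            # Assumption: overrides for properties that were not extracted are ignored.
            continue
        # Override fields replace extracted fields with the same key.
        merged[name].update(fields)
    return merged


extracted = {"kafka_api": {"type": "array", "description": "IP address and port."}}
overrides = {
    "properties": {
        "kafka_api": {
            "description": "Network address and port for Kafka API clients.",
            "version": "v21.4.1",
        }
    }
}
result = apply_overrides(extracted, overrides)
```

In this sketch the extracted `type` is kept, the `description` is replaced, and `version` is added, while the input dictionary is left untouched.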
91
160
 
92
- * `src/v/config/configuration.cc` - Broker and cluster properties
93
- * `src/v/kafka/client/configuration.cc` - Kafka client properties
94
- * Other configuration files
161
+ === Understanding property changes
95
162
 
96
- Properties are declared using Redpanda's property template classes:
163
+ When you generate docs with `--diff`, a change report is created:
97
164
 
98
- [,cpp]
165
+ [,bash]
99
166
  ----
100
- property<std::optional<int>>("property_name", "Description")
101
- .default_value(42)
102
- .visibility(visibility::tunable);
167
+ npx doc-tools generate property-docs --tag v25.3.1 --diff v25.2.1
103
168
  ----
104
169
 
105
- === Stage 2: Type definition extraction
170
+ The generated `redpanda-property-changes-v25.2.1-to-v25.3.1.json` contains:
106
171
 
107
- The extractor automatically discovers type definitions from C++ headers:
172
+ [,json]
173
+ ----
174
+ {
175
+ "new_properties": [
176
+ {"name": "new_feature_enabled", "type": "boolean", "default": false}
177
+ ],
178
+ "changed_defaults": [
179
+ {"name": "log_segment_size", "old": 536870912, "new": 1073741824}
180
+ ],
181
+ "deprecated_properties": [
182
+ {"name": "legacy_setting", "reason": "Use new_setting instead"}
183
+ ]
184
+ }
185
+ ----
108
186
 
109
- ==== Automatically extracted types
187
+ Use this to:
110
188
 
111
- [cols="1,2,2"]
112
- |===
113
- | Type category | Example | Extraction method
189
+ * Update release notes
190
+ * Identify breaking changes
191
+ * Track configuration evolution
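
To make the shape of the change report concrete, the following Python sketch compares two property maps and produces the three lists shown above. It is an illustration only: the package ships its own comparison script, and both the `compare_properties` name and the `is_deprecated` flag used to detect deprecations are assumptions for this example.

```python
def compare_properties(old, new):
    """Build a change report from two {name: metadata} property maps.

    Hypothetical sketch; field names other than those in the report above are assumed.
    """
    report = {"new_properties": [], "changed_defaults": [], "deprecated_properties": []}

    # Properties present only in the newer version.
    for name in sorted(set(new) - set(old)):
        report["new_properties"].append(
            {"name": name, "type": new[name].get("type"), "default": new[name].get("default")}
        )

    # Properties present in both versions: look for changed defaults and new deprecations.
    for name in sorted(set(old) & set(new)):
        before, after = old[name], new[name]
        if before.get("default") != after.get("default"):
            report["changed_defaults"].append(
                {"name": name, "old": before.get("default"), "new": after.get("default")}
            )
        if after.get("is_deprecated") and not before.get("is_deprecated"):
            report["deprecated_properties"].append({"name": name})

    return report


old_props = {
    "log_segment_size": {"type": "integer", "default": 536870912},
    "legacy_setting": {"type": "boolean", "default": False},
}
new_props = {
    "log_segment_size": {"type": "integer", "default": 1073741824},
    "legacy_setting": {"type": "boolean", "default": False, "is_deprecated": True},
    "new_feature_enabled": {"type": "boolean", "default": False},
}
report = compare_properties(old_props, new_props)
```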
114
192
 
115
- | *Structs and classes*
116
- | `model::broker_endpoint`, `config::tls_config`
117
- | Brace-counting algorithm extracts complete struct bodies including nested types and methods
193
+ === Common workflows
118
194
 
119
- | *Enumerations*
120
- | `model::compression`, `config::tls_version`
121
- | Regex pattern matching with support for four conversion function patterns: `_to_string()`, `operator<<`, `string_switch`, and `to_string_view()`
195
+ ==== Documenting a new Redpanda release
122
196
 
123
- | *Type aliases*
124
- | `using node_id = named_type<int32_t, ...>`
125
- | Pattern matching for `using` declarations with underlying type resolution
197
+ [,bash]
198
+ ----
199
+ # 1. Generate docs for the new version
200
+ npx doc-tools generate property-docs \
201
+ --tag v25.3.1 \
202
+ --diff v25.2.1 \
203
+ --generate-partials \
204
+ --overrides property-overrides.json
126
205
 
127
- | *Enum string mappings*
128
- | `write_caching_mode::default_false` → `"false"`
129
- | Extracted from enum-to-string conversion functions using four pattern-matching strategies
130
- |===
206
+ # 2. Review the generated change report
207
+ cat modules/reference/attachments/redpanda-property-changes-*.json
131
208
 
132
- ==== Enum string mapping patterns
209
+ # 3. Update release notes with new/changed properties
133
210
 
134
- The extractor supports four C++ patterns for mapping enum values to user-facing strings:
211
+ # 4. Commit the generated files
212
+ git add modules/reference/partials/properties/
213
+ git add modules/reference/attachments/redpanda-properties-v25.3.1.json
214
+ git commit -m "docs: Update property docs for v25.3.1"
215
+ ----
135
216
 
136
- [cols="1,2"]
137
- |===
138
- | Pattern | C++ Code Example
217
+ ==== Updating descriptions for existing properties
139
218
 
140
- | *Pattern 1: `_to_string()` method*
141
- |
142
- [,cpp]
219
+ [,bash]
143
220
  ----
144
- std::string_view write_caching_mode_to_string(write_caching_mode s) {
145
- switch(s) {
146
- case write_caching_mode::default_false:
147
- return "false";
221
+ # 1. Edit property-overrides.json so that it contains the following JSON (file content, not a shell command):
222
+ {
223
+ "properties": {
224
+ "my_property": {
225
+ "description": "Updated description",
226
+ "example": ".Example\n..."
148
227
  }
228
+ }
149
229
  }
230
+
231
+ # 2. Regenerate docs
232
+ npx doc-tools generate property-docs \
233
+ --tag v25.3.1 \
234
+ --overrides property-overrides.json \
235
+ --generate-partials
236
+
237
+ # 3. Review changes
238
+ git diff modules/reference/partials/properties/
150
239
  ----
151
240
 
152
- | *Pattern 2: `operator<<` overload*
153
- |
154
- [,cpp]
241
+ ==== Checking if properties exist in a version
242
+
243
+ [,bash]
155
244
  ----
156
- std::ostream& operator<<(std::ostream& os, compression c) {
157
- switch(c) {
158
- case compression::gzip:
159
- os << "gzip";
160
- }
161
- }
245
+ # List all properties in a version
246
+ cat modules/reference/attachments/redpanda-properties-v25.3.1.json | \
247
+ jq '.properties | keys'
248
+
249
+ # Check specific property
250
+ cat modules/reference/attachments/redpanda-properties-v25.3.1.json | \
251
+ jq '.properties.kafka_api'
162
252
  ----
163
253
 
164
- | *Pattern 3: `string_switch` reverse lookup*
165
- |
254
+ == How it works
255
+
256
+ === Architecture overview
257
+
258
+ [,text]
259
+ ----
260
+ ┌─────────────────────────────────────────────────────────────┐
261
+ │ doc-tools CLI │
262
+ │ └─ generate property-docs command │
263
+ └────────────────────┬────────────────────────────────────────┘
264
+
265
+
266
+ ┌─────────────────────────────────────────────────────────────┐
267
+ │ Property Extractor Pipeline │
268
+ ├─────────────────────────────────────────────────────────────┤
269
+ │ │
270
+ │ 1. Clone Redpanda source code (specific version/tag) │
271
+ │ │
272
+ │ 2. Parse C++ with Tree-sitter │
273
+ │ ├─ Extract property declarations │
274
+ │ ├─ Extract topic properties │
275
+ │ └─ Extract type definitions (structs, enums) │
276
+ │ │
277
+ │ 3. Enrich with transformers │
278
+ │ ├─ Resolve types and defaults │
279
+ │ ├─ Map enum values to strings │
280
+ │ ├─ Evaluate chrono expressions (24h → milliseconds) │
281
+ │ ├─ Detect deprecated/experimental properties │
282
+ │ └─ Link topic properties to cluster defaults │
283
+ │ │
284
+ │ 4. Apply overrides from JSON │
285
+ │ └─ Merge custom descriptions, examples, versions │
286
+ │ │
287
+ │ 5. Generate outputs │
288
+ │ ├─ JSON schema (complete property metadata) │
289
+ │ ├─ AsciiDoc partials (via Handlebars templates) │
290
+ │ └─ Change reports (diffs between versions) │
291
+ │ │
292
+ └─────────────────────────────────────────────────────────────┘
293
+
294
+
295
+ ┌─────────────────────────────────────────────────────────────┐
296
+ │ Generated Documentation │
297
+ ├─────────────────────────────────────────────────────────────┤
298
+ │ • cluster-properties.adoc │
299
+ │ • topic-properties.adoc │
300
+ │ • broker-properties.adoc │
301
+ │ • redpanda-properties-v25.3.1.json │
302
+ │ • redpanda-property-changes-v25.2.1-to-v25.3.1.json │
303
+ └─────────────────────────────────────────────────────────────┘
304
+ ----
305
+
306
+ === What gets extracted
307
+
308
+ ==== 1. Cluster and broker properties
309
+
310
+ Redpanda properties are declared in C++ using a template pattern:
311
+
166
312
  [,cpp]
167
313
  ----
168
- compression from_string(std::string_view s) {
169
- return string_switch<compression>(s)
170
- .match("gzip", compression::gzip)
171
- .match("snappy", compression::snappy);
172
- }
314
+ // src/v/config/configuration.cc
315
+ property<int64_t>(
316
+ *this,
317
+ "log_segment_size",
318
+ "Maximum log segment size in bytes",
319
+ {.needs_restart = config::needs_restart::no,
320
+ .example = "536870912",
321
+ .visibility = visibility::user},
322
+ 1_GiB)
323
+ .with_validator(validate_log_segment_size);
173
324
  ----
174
325
 
175
- | *Pattern 4: `to_string_view()` function*
176
- |
177
- [,cpp]
326
+ The extractor parses this to generate:
327
+
328
+ [,json]
178
329
  ----
179
- constexpr std::string_view to_string_view(tls_version v) {
180
- switch(v) {
181
- case tls_version::v1_0:
182
- return "v1.0";
183
- case tls_version::v1_2:
184
- return "v1.2";
185
- }
330
+ {
331
+ "log_segment_size": {
332
+ "name": "log_segment_size",
333
+ "type": "integer",
334
+ "description": "Maximum log segment size in bytes",
335
+ "default": 1073741824,
336
+ "default_human_readable": "1 GiB",
337
+ "needs_restart": false,
338
+ "visibility": "user",
339
+ "example": "536870912",
340
+ "config_scope": "cluster"
341
+ }
186
342
  }
187
343
  ----
188
- |===
189
344
 
190
- The extractor searches for these patterns in `.cc` files related to the enum's `.h` header file.
345
+ ==== 2. Topic properties
191
346
 
192
- ==== Type namespace resolution
347
+ Topic properties are simpler string constants:
193
348
 
194
- The extractor resolves unqualified type names by trying common namespace prefixes:
349
+ [,cpp]
350
+ ----
351
+ // src/v/kafka/server/handlers/topics/types.h
352
+ inline constexpr std::string_view topic_property_retention_ms = "retention.ms";
353
+ ----
195
354
 
196
- * `config::` - Configuration types
197
- * `model::` - Core data model types
198
- * `security::` - Security and authentication types
199
- * `net::` - Network types
200
- * `kafka::` - Kafka protocol types
201
- * `pandaproxy::` - Schema registry types
355
+ The extractor:
202
356
 
203
- Example: An unqualified type `tls_version` automatically resolves to `config::tls_version` if found in the `config` namespace.
357
+ . Finds all `topic_property_*` constants
358
+ . Discovers cluster property mappings in `config_response_utils.cc`
359
+ . Inherits default values from corresponding cluster properties
204
360
 
205
- The extractor scans these source directories:
361
+ [,json]
362
+ ----
363
+ {
364
+ "retention.ms": {
365
+ "name": "retention.ms",
366
+ "type": "integer",
367
+ "config_scope": "topic",
368
+ "corresponding_cluster_property": "log_retention_ms",
369
+ "default": 604800000,
370
+ "default_human_readable": "7 days"
371
+ }
372
+ }
373
+ ----
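
The first step, finding the `topic_property_*` constants, can be approximated with a regular expression. The sketch below is an assumption about one way to do it in Python; the pattern and the `find_topic_properties` helper are hypothetical and only cover declarations written exactly like the example above.

```python
import re

# Matches declarations such as:
#   inline constexpr std::string_view topic_property_retention_ms = "retention.ms";
TOPIC_PROPERTY_RE = re.compile(
    r'inline\s+constexpr\s+std::string_view\s+(topic_property_\w+)\s*=\s*"([^"]+)"\s*;'
)


def find_topic_properties(source):
    """Map each user-facing topic property name to its C++ constant name."""
    return {m.group(2): m.group(1) for m in TOPIC_PROPERTY_RE.finditer(source)}


sample = '''
inline constexpr std::string_view topic_property_retention_ms = "retention.ms";
inline constexpr std::string_view topic_property_cleanup_policy = "cleanup.policy";
inline constexpr int unrelated_constant = 3;
'''
found = find_topic_properties(sample)
```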
206
374
 
207
- * `model/` - Core data model types
208
- * `config/` - Configuration types
209
- * `net/` - Network types
210
- * `kafka/` - Kafka protocol types
211
- * `pandaproxy/` - Schema registry types
212
- * `security/` - Security and audit types
213
- * `utils/` - Utility types
375
+ ==== 3. Type definitions
214
376
 
215
- === Stage 3: Property enrichment
377
+ The extractor automatically discovers types from C++ headers:
216
378
 
217
- A series of transformers processes extracted properties:
379
+ *Structs and classes:*
218
380
 
219
- [cols="1,2"]
220
- |===
221
- | Transformer | Function
381
+ [,cpp]
382
+ ----
383
+ struct broker_endpoint {
384
+ ss::sstring name;
385
+ ss::sstring address;
386
+ uint16_t port;
387
+ };
388
+ ----
222
389
 
223
- | `BasicInfoTransformer`
224
- | Extracts property names, types, and descriptions
390
+ Becomes:
225
391
 
226
- | `VisibilityTransformer`
227
- | Determines visibility (public, tunable, deprecated)
392
+ [,json]
393
+ ----
394
+ {
395
+ "model::broker_endpoint": {
396
+ "type": "object",
397
+ "properties": {
398
+ "name": {"type": "string"},
399
+ "address": {"type": "string"},
400
+ "port": {"type": "integer", "minimum": 0, "maximum": 65535}
401
+ }
402
+ }
403
+ }
404
+ ----
228
405
 
229
- | `IsNullableTransformer`
230
- | Detects optional properties
406
+ *Enumerations:*
407
+
408
+ [,cpp]
409
+ ----
410
+ enum class compression {
411
+ none,
412
+ gzip,
413
+ snappy,
414
+ lz4,
415
+ zstd
416
+ };
417
+
418
+ // String conversion function
419
+ std::ostream& operator<<(std::ostream& os, compression c) {
420
+ switch(c) {
421
+ case compression::none: os << "none"; break;
422
+ case compression::gzip: os << "gzip"; break;
423
+ // ...
424
+ }
+ return os;
425
+ }
426
+ ----
231
427
 
232
- | `DefaultValueTransformer`
233
- | Extracts and resolves default values
428
+ Becomes:
234
429
 
235
- | `UnitsTransformer`
236
- | Identifies units (bytes, milliseconds, etc.)
430
+ [,json]
431
+ ----
432
+ {
433
+ "model::compression": {
434
+ "type": "string",
435
+ "enum": ["none", "gzip", "snappy", "lz4", "zstd"]
436
+ }
437
+ }
438
+ ----
237
439
 
238
- | `RequiresRestartTransformer`
239
- | Determines if changes require restart
440
+ === Key transformations
240
441
 
241
- | `IsSecretTransformer`
242
- | Marks sensitive properties
243
- |===
442
+ ==== Chrono expression evaluation
244
443
 
245
- ==== Deprecated property detection
444
+ C++ time expressions are converted to numeric values with human-readable formats:
246
445
 
247
- The extractor identifies deprecated properties using three methods:
446
+ [,cpp]
447
+ ----
448
+ property<std::chrono::milliseconds>("log_retention_ms")
449
+ .default_value(7 * 24h); // 7 days
450
+ ----
248
451
 
249
- [cols="1,2,2"]
250
- |===
251
- | Detection method | C++ pattern | Result
452
+ Evaluates to:
252
453
 
253
- | *Type-based*
254
- | `deprecated_property<T>("name", ...)`
255
- | Sets `is_deprecated: true` in JSON output
454
+ [,json]
455
+ ----
456
+ {
457
+ "log_retention_ms": {
458
+ "type": "integer",
459
+ "default": 604800000,
460
+ "default_human_readable": "7 days"
461
+ }
462
+ }
463
+ ----
256
464
 
257
- | *Metadata-based*
258
- | `meta{.deprecated = "reason"}` +
259
- `meta{.deprecated = yes}`
260
- | Sets `is_deprecated: true` and optionally captures `deprecated_reason`
465
+ Supported units: `h` (hours), `min` (minutes), `s` (seconds), `ms` (milliseconds), `d` (days)
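
The evaluation step can be sketched in a few lines of Python. This is a simplified illustration, not the extractor's implementation: the `evaluate_chrono_ms` and `literal_to_ms` names are hypothetical, only `*` and `+` are handled, and every result is expressed in milliseconds regardless of the C++ duration type.

```python
import re

# Milliseconds per unit, for the unit suffixes listed above.
UNIT_MS = {"ms": 1, "s": 1_000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}
LITERAL_RE = re.compile(r"(\d+)(ms|min|s|h|d)?")


def literal_to_ms(token):
    """Convert a literal such as '24h' to milliseconds; bare numbers are plain factors."""
    m = LITERAL_RE.fullmatch(token.strip())
    if m is None:
        raise ValueError(f"unsupported token: {token!r}")
    number, unit = int(m.group(1)), m.group(2)
    return number * UNIT_MS[unit] if unit else number


def evaluate_chrono_ms(expression):
    """Evaluate sums of products, for example '7 * 24h' or '60s + 30s'."""
    total = 0
    for term in expression.split("+"):
        product = 1
        for factor in term.split("*"):
            product *= literal_to_ms(factor)
        total += product
    return total


week_ms = evaluate_chrono_ms("7 * 24h")  # 604800000, matching the JSON above
```

The sketch does no unit checking, so a product of two durations (`24h * 24h`) is accepted even though it has no meaningful result.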
261
466
 
262
- | *Visibility-based*
263
- | `meta{.visibility = visibility::deprecated}`
264
- | Sets `is_deprecated: true` and marks for migration documentation only
265
- |===
467
+ ==== Enum default mapping
266
468
 
267
- Example C++ declarations:
469
+ Enum identifiers are mapped to user-facing strings:
268
470
 
269
471
  [,cpp]
270
472
  ----
271
- // Type-based deprecation
272
- deprecated_property<int>("old_setting", "Legacy configuration")
273
- .default_value(42);
473
+ enum class write_caching_mode {
474
+ default_true,
475
+ default_false,
476
+ disabled
477
+ };
274
478
 
275
- // Metadata-based deprecation with reason
276
- property<bool>("legacy_mode", "Old behavior flag")
277
- .default_value(false)
278
- .visibility(visibility::user)
279
- .meta{.deprecated = "Use new_mode instead"};
479
+ const char* write_caching_mode_to_string(write_caching_mode m) {
480
+ switch(m) {
481
+ case write_caching_mode::default_false: return "false";
482
+ case write_caching_mode::default_true: return "true";
483
+ case write_caching_mode::disabled: return "disabled";
484
+ }
485
+ }
280
486
 
281
- // Visibility-based deprecation
282
- property<std::string>("obsolete_path", "Deprecated file path")
283
- .default_value("/old/location")
284
- .visibility(visibility::deprecated);
487
+ property<write_caching_mode>("write_caching")
488
+ .default_value(write_caching_mode::default_false);
285
489
  ----
286
490
 
287
- Generated JSON output:
491
+ Maps to:
288
492
 
289
493
  [,json]
290
494
  ----
291
495
  {
292
- "old_setting": {
293
- "type": "integer",
294
- "default": 42,
295
- "is_deprecated": true
296
- },
297
- "legacy_mode": {
298
- "type": "boolean",
299
- "default": false,
300
- "is_deprecated": true,
301
- "deprecated_reason": "Use new_mode instead"
302
- },
303
- "obsolete_path": {
496
+ "write_caching": {
304
497
  "type": "string",
305
- "default": "/old/location",
306
- "is_deprecated": true,
307
- "visibility": "deprecated"
498
+ "enum": ["false", "true", "disabled"],
499
+ "default": "false"
308
500
  }
309
501
  }
310
502
  ----
311
503
 
312
- Deprecated properties appear in migration guides but are excluded from standard user documentation.
504
+ ==== Topic property defaults
313
505
 
314
- ==== Experimental property detection
506
+ Topic properties inherit defaults from cluster properties:
315
507
 
316
- The extractor identifies experimental properties that are in development or testing:
508
+ . Extract topic property `retention.ms`
509
+ . Find cluster mapping: `retention.ms` → `log_retention_ms`
510
+ . Look up `log_retention_ms` default: `604800000` (7 days)
511
+ . Copy to topic property JSON
317
512
 
318
- [cols="1,2,2"]
319
- |===
320
- | Detection method | C++ pattern | Result
513
+ This ensures topic property documentation always shows current defaults.
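
The four steps above can be sketched as follows. The function name `inherit_topic_defaults` and the plain-dictionary data shapes are assumptions for illustration; the field names mirror the JSON examples in this document.

```python
def inherit_topic_defaults(topic_props, cluster_props, mappings):
    """Copy defaults from mapped cluster properties onto topic properties in place.

    Hypothetical sketch: `mappings` is {topic_property_name: cluster_property_name}.
    """
    for topic_name, cluster_name in mappings.items():
        topic = topic_props.get(topic_name)
        cluster = cluster_props.get(cluster_name)
        if topic is None or cluster is None:
            # Skip mappings that point at properties missing from either side.
            continue
        topic["corresponding_cluster_property"] = cluster_name
        for key in ("default", "default_human_readable"):
            if key in cluster:
                topic[key] = cluster[key]
    return topic_props


topics = {"retention.ms": {"name": "retention.ms", "type": "integer", "config_scope": "topic"}}
cluster = {"log_retention_ms": {"default": 604800000, "default_human_readable": "7 days"}}
resolved = inherit_topic_defaults(topics, cluster, {"retention.ms": "log_retention_ms"})
```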
321
514
 
322
- | *Type-based*
323
- | `experimental_property<T>("name", ...)`
324
- | Sets `is_experimental_property: true` in JSON output
515
+ === Template rendering
325
516
 
326
- | *Metadata-based*
327
- | `meta{.experimental = true}` +
328
- `meta{.experimental = "description"}`
329
- | Sets `is_experimental_property: true` and optionally captures experimental notes
330
- |===
517
+ Property data flows through Handlebars templates:
331
518
 
332
- Example C++ declarations:
333
-
334
- [,cpp]
519
+ [,text]
335
520
  ----
336
- // Type-based experimental
337
- experimental_property<int>("new_feature", "Feature in development")
338
- .default_value(0);
521
+ JSON Data → Handlebars Template → AsciiDoc Output
339
522
 
340
- // Metadata-based experimental
341
- property<bool>("beta_mode", "Experimental feature flag")
342
- .default_value(false)
343
- .visibility(visibility::tunable)
344
- .meta{.experimental = true};
523
+ {                              {{#each properties}}          === retention.ms
524
+   "retention.ms": {            === {{name}}
525
+     "type": "integer",         *Type:* {{type}}              *Type:* integer
526
+     "default": 604800000,
527
+                                *Default:*                    *Default:* 604800000
528
+     "default_human_readable":                                (7 days)
529
+       "7 days"                 {{default}}
530
+   }                            ({{default_human_readable}})
531
+ }                              {{/each}}
345
532
  ----
346
533
 
347
- Generated JSON output:
534
+ Templates live in `tools/property-extractor/templates/`:
348
535
 
349
- [,json]
350
- ----
351
- {
352
- "new_feature": {
353
- "type": "integer",
354
- "default": 0,
355
- "is_experimental_property": true
356
- },
357
- "beta_mode": {
358
- "type": "boolean",
359
- "default": false,
360
- "is_experimental_property": true
361
- }
362
- }
363
- ----
536
+ * `property.hbs` - Cluster/broker property template
537
+ * `topic-property.hbs` - Topic property template
538
+ * `deprecated-property.hbs` - Deprecated property template
539
+
540
+ Handlebars helpers in `tools/property-extractor/helpers.js` format values:
541
+
542
+ * `formatPropertyValue` - Formats defaults based on type
543
+ * `join` - Joins arrays with separators
544
+ * `parseRelatedTopic` - Processes xref links
545
+ * `eq`, `ne`, `gt`, `and`, `or` - Logic helpers
364
546
 
365
- Experimental properties are excluded from the documentation.
547
+ == Understanding the codebase
366
548
 
367
- === Stage 4: Type resolution
549
+ === Project structure
550
+
551
+ [,text]
552
+ ----
553
+ tools/property-extractor/
554
+ ├── property_extractor.py         # Main extraction pipeline
555
+ ├── topic_property_extractor.py   # Topic property extraction
556
+ ├── type_definition_extractor.py  # Type discovery from headers
557
+ ├── transformers.py               # Property enrichment transformers
558
+ ├── property_bag.py               # Auto-expanding dict structure
559
+ ├── helpers.js                    # Handlebars template helpers
560
+ ├── generate-handlebars-docs.js   # AsciiDoc generation
561
+ ├── compare-properties.js         # Version diff generation
562
+ ├── Makefile                      # Build automation
563
+ ├── requirements.txt              # Python dependencies
564
+ ├── templates/
565
+ │   ├── property.hbs              # Cluster/broker template
566
+ │   ├── topic-property.hbs        # Topic property template
567
+ │   └── deprecated-property.hbs   # Deprecated template
568
+ └── tree-sitter/
569
+     └── tree-sitter-cpp/          # C++ parser (git submodule)
570
+ ----
368
571
 
369
- The type resolver:
572
+ === Extraction pipeline deep dive
370
573
 
371
- . Resolves `$ref` pointers to actual type definitions
372
- . Expands C++ constructors into JSON-compatible default values
373
- . Maps C++ types to JSON Schema types
374
- . Applies enum constraints to properties
574
+ ==== Stage 1: Tree-sitter parsing
375
575
 
376
- Example transformation:
576
+ Tree-sitter converts C++ code into Abstract Syntax Trees:
377
577
 
378
578
  [,cpp]
379
579
  ----
380
- // C++ source
381
- property<std::vector<model::broker_endpoint>>("kafka_api")
382
- .default_value({model::broker_endpoint{"internal", "127.0.0.1", 9092}})
580
+ property<int>("my_property", "Description").default_value(42);
383
581
  ----
384
582
 
385
- Becomes:
583
+ Becomes an AST:
386
584
 
387
- [,json]
585
+ [,text]
388
586
  ----
389
- {
390
- "kafka_api": {
391
- "type": "array",
392
- "items": {"$ref": "#/definitions/model::broker_endpoint"},
393
- "default": [{"name": "internal", "address": "127.0.0.1", "port": 9092}]
394
- }
395
- }
587
+ declaration
588
+ ├─ template_function (property<int>)
589
+ ├─ argument_list
590
+ │  ├─ string_literal ("my_property")
591
+ │  ├─ string_literal ("Description")
592
+ └─ call_expression (.default_value)
593
+    └─ argument_list
594
+       └─ number_literal (42)
396
595
  ----
397
596
 
398
- === Stage 5: Chrono expression evaluation and human-readable formatting
597
+ The extractor walks this tree to identify:
399
598
 
400
- The extractor automatically evaluates C++ chrono expressions in default values and provides human-readable time representations:
599
+ * Property names (first argument)
600
+ * Descriptions (second argument)
601
+ * Template types (`<int>`, `<std::optional<string>>`)
602
+ * Method calls (`.default_value()`, `.visibility()`)
603
+ * Metadata structs (`{.needs_restart = no}`)
401
604
 
402
- ==== Chrono expression evaluation
605
+ **Key function:** `get_properties()` in `property_extractor.py`
403
606
 
404
- Mathematical time expressions are converted to numeric values:
607
+ ==== Stage 2: Type definition extraction
405
608
 
406
- [,cpp]
407
- ----
408
- // C++ source with chrono expressions
409
- property<std::chrono::milliseconds>("log_segment_ms_max")
410
- .default_value(24h * 365); // One year in hours
609
+ The `TypeDefinitionExtractor` class scans header files for types:
610
+
611
+ **Struct extraction:**
411
612
 
412
- property<std::chrono::seconds>("connection_timeout")
413
- .default_value(7 * 24h); // One week
613
+ Uses a brace-counting algorithm to extract complete struct bodies:
614
+
615
+ [,python]
616
+ ----
617
+ def _extract_structs_with_brace_counting(content, file_path):
618
+     """
619
+     Find struct/class declarations and extract their complete body
620
+     by counting braces.
621
+     """
622
+     for match in STRUCT_OR_CLASS_PATTERN.finditer(content):
623
+         # Find opening brace
624
+         # Count braces to find matching closing brace
625
+         # Extract complete struct body
626
+         # Parse fields with regex
+         ...  # outline only: the steps above are not implemented in this snippet
414
627
  ----
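
To show what brace counting looks like in practice, here is a small self-contained Python sketch. It is not the code from `type_definition_extractor.py`: the `STRUCT_RE` pattern and the `extract_struct_bodies` helper are assumptions for this example, and the sketch ignores braces that appear inside comments or string literals.

```python
import re

# Matches "struct name {" or "class name : base {", but not "enum class name {".
STRUCT_RE = re.compile(r"(?<!enum\s)\b(?:struct|class)\s+(\w+)[^;{]*\{")


def extract_struct_bodies(content):
    """Return {type_name: body_text} by counting braces from each opening '{'."""
    bodies = {}
    for match in STRUCT_RE.finditer(content):
        depth = 1            # the opening brace has already been consumed by the regex
        pos = match.end()
        while pos < len(content) and depth > 0:
            if content[pos] == "{":
                depth += 1
            elif content[pos] == "}":
                depth -= 1
            pos += 1
        if depth == 0:
            # Exclude the closing brace itself from the captured body.
            bodies[match.group(1)] = content[match.end():pos - 1]
    return bodies


header = """
struct broker_endpoint {
    ss::sstring name;
    struct inner { int x; };
    uint16_t port;
};
enum class compression { none, gzip };
"""
bodies = extract_struct_bodies(header)
```

Because the depth counter tracks nested braces, the body of `broker_endpoint` includes the nested `inner` struct and still ends at the correct closing brace.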
415
628
 
416
- The extractor:
629
+ **Enum extraction:**
630
+
631
+ Finds enums and their string conversion functions:
417
632
 
418
- 1. Parses time literals: `24h`, `365d`, `5min`, `30s`, `100ms`
419
- 2. Evaluates arithmetic: `24h * 365`, `7 * 24h`, `60s + 30s`
420
- 3. Converts to appropriate unit based on C++ type
421
- 4. Adds human-readable representation for documentation
633
+ 1. Locate `enum class name { ... };`
634
+ 2. Search for conversion function: `name_to_string()` or `operator<<`
635
+ 3. Extract string mappings: `case value: return "string";`
636
+ 4. Build enum string map
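
The four steps above can be approximated with two regular expressions. The sketch below is illustrative only: `extract_enum` and both patterns are hypothetical, and they cover just the `return "string";` and `os << "string";` forms shown in this document.

```python
import re

ENUM_RE = re.compile(r"enum\s+class\s+(\w+)\s*\{([^}]*)\}\s*;")


def extract_enum(source):
    """Find the first enum class in `source` and map its values to user-facing strings."""
    m = ENUM_RE.search(source)
    if m is None:
        return None
    name = m.group(1)
    # Enumerator names in declaration order, ignoring explicit "= value" initializers.
    values = [v.split("=")[0].strip() for v in m.group(2).split(",") if v.strip()]
    # Only accept case labels qualified with this enum's name.
    case_re = re.compile(
        r"case\s+(?:\w+::)*" + re.escape(name)
        + r'::(\w+)\s*:\s*(?:return|os\s*<<)\s*"([^"]*)"'
    )
    mapping = dict(case_re.findall(source))
    # Assumption: values without a string mapping fall back to the identifier itself.
    return {"name": name, "values": values, "strings": [mapping.get(v, v) for v in values]}


cpp = """
enum class write_caching_mode { default_true, default_false, disabled };

const char* write_caching_mode_to_string(write_caching_mode m) {
    switch (m) {
    case write_caching_mode::default_false: return "false";
    case write_caching_mode::default_true: return "true";
    case write_caching_mode::disabled: return "disabled";
    }
}
"""
enum_info = extract_enum(cpp)
```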
422
637
 
423
- Example transformation:
638
+ **Key class:** `TypeDefinitionExtractor` in `type_definition_extractor.py`
424
639
 
425
- [cols="1,1,1,1"]
426
- |===
427
- | C++ Expression | C++ Type | Numeric Value | Human-Readable
640
+ ==== Stage 3: Transformation pipeline
428
641
 
429
- | `24h * 365`
430
- | `std::chrono::milliseconds`
431
- | `31536000000`
432
- | "1 year"
642
+ Transformers enrich raw extracted data:
433
643
 
434
- | `7 * 24h`
435
- | `std::chrono::seconds`
436
- | `604800`
437
- | "1 week"
644
+ [,python]
645
+ ----
646
+ # In property_extractor.py
647
+ transformers = [
648
+ TypeTransformer(), # C++ → JSON type mapping
649
+ NeedsRestartTransformer(), # Extract needs_restart metadata
650
+ VisibilityTransformer(), # Extract visibility level
651
+ DeprecatedTransformer(), # Mark deprecated properties
652
+ ExperimentalTransformer(), # Mark experimental properties
653
+ IsSecretTransformer(), # Mark sensitive properties
654
+ SimpleDefaultValuesTransformer(), # Extract default values
655
+ FriendlyDefaultTransformer(), # Format defaults nicely
656
+ EnterpriseTransformer() # Mark enterprise features
657
+ ]
438
658
 
439
- | `5min`
440
- | `std::chrono::seconds`
441
- | `300`
442
- | "5 minutes"
659
+ for transformer in transformers:
660
+     if transformer.accepts(property_definition, file_path):
661
+         transformer.parse(property_definition, properties, file_path)
662
+ ----
443
663
 
444
- | `24h`
445
- | `std::chrono::milliseconds`
446
- | `86400000`
447
- | "1 day"
448
- |===
664
+ Each transformer:
449
665
 
450
- Generated JSON output:
666
+ . Checks if it applies (`accepts()`)
667
+ . Extracts specific metadata (`parse()`)
668
+ . Updates the property definition in place
451
669
 
452
- [,json]
670
+ **Example - NeedsRestartTransformer:**
671
+
672
+ [,python]
453
673
  ----
454
- {
455
- "log_segment_ms_max": {
456
- "type": "integer",
457
- "default": 31536000000,
458
- "default_human_readable": "1 year",
459
- "c_type": "std::chrono::milliseconds"
460
- },
461
- "connection_timeout": {
462
- "type": "integer",
463
- "default": 604800,
464
- "default_human_readable": "1 week",
465
- "c_type": "std::chrono::seconds"
466
- }
467
- }
674
+ class NeedsRestartTransformer:
675
+     def accepts(self, prop, file_path):
676
+         return 'needs_restart' in str(prop)
677
+
678
+     def parse(self, prop, all_props, file_path):
679
+         # Extract from metadata: {.needs_restart = no}
680
+         if match := re.search(r'needs_restart\s*=\s*(yes|no)', str(prop)):
681
+             prop['needs_restart'] = (match.group(1) == 'yes')
468
682
  ----
469
683
 
470
- ==== Human-readable time formatting
684
+ **Key file:** `transformers.py`
685
+
686
+ ==== Stage 4: Type and default resolution
687
+
688
+ The `resolve_type_and_default()` function:
689
+
690
+ 1. Resolves template types: `property<std::optional<int>>` → `integer`, `nullable: true`
691
+ 2. Expands C++ constructors: `{field1, field2}` → `{"field1": val, "field2": val}`
692
+ 3. Evaluates expressions: `24h * 365` → `31536000000` milliseconds
693
+ 4. Maps enums: `write_caching_mode::default_false` → `"false"`
694
+ 5. Formats human-readable: `604800000ms` → `"7 days"`
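Steps 3 and 5 can be illustrated with a small evaluator. This is a sketch, not the code behind `evaluate_chrono_expressions()`: the helper names `evaluate_chrono` and `format_human_readable`, the supported suffixes, and the choice of days as the largest unit (to match the `"7 days"` example above) are all assumptions.

```python
import ast
import operator
import re

# Milliseconds per chrono literal suffix (assumed set of suffixes).
UNIT_MS = {'ms': 1, 's': 1_000, 'min': 60_000, 'h': 3_600_000, 'd': 86_400_000}
# Milliseconds per tick of the target C++ type.
TARGET_MS = {'std::chrono::milliseconds': 1, 'std::chrono::seconds': 1_000}
LITERAL = re.compile(r'(\d+)\s*(ms|min|s|h|d)\b')
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _evaluate(node):
    """Evaluate an AST made only of integers and +, -, * operators."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError('unsupported expression')


def evaluate_chrono(expression, cpp_type):
    """Evaluate e.g. '7 * 24h' and return the count in cpp_type units."""
    # Rewrite every time literal as its millisecond count, then do the math.
    as_ms = LITERAL.sub(
        lambda m: str(int(m.group(1)) * UNIT_MS[m.group(2)]), expression)
    total_ms = _evaluate(ast.parse(as_ms.strip(), mode='eval').body)
    return total_ms // TARGET_MS[cpp_type]


def format_human_readable(total_ms):
    """Pick the largest unit that divides the value evenly."""
    units = (('day', 86_400_000), ('hour', 3_600_000), ('minute', 60_000),
             ('second', 1_000), ('millisecond', 1))
    for name, size in units:
        if total_ms >= size and total_ms % size == 0:
            count = total_ms // size
            return f"{count} {name}{'' if count == 1 else 's'}"
    return '0 milliseconds'


print(evaluate_chrono('24h * 365', 'std::chrono::milliseconds'))
print(format_human_readable(604800000))
```

Walking the parsed expression with an operator whitelist avoids calling `eval()` on text taken from source files.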
471
695
 
472
- The `format_time_human_readable()` function automatically selects the most appropriate time unit:
696
+ **Key functions:**
473
697
 
474
- * Prefers larger units (years > weeks > days > hours > minutes > seconds > milliseconds)
475
- * Only uses a unit if the value divides evenly
476
- * Example: 604800 seconds becomes "1 week" instead of "7 days"
698
+ * `resolve_type_and_default()` - Main resolution logic
699
+ * `evaluate_chrono_expressions()` - Time expression evaluation
700
+ * `map_enum_defaults()` - Enum value mapping
701
+ * `expand_constructor_syntax()` - C++ initializer expansion
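As an illustration of the template-type step (`property<std::optional<int>>` to `integer` with `nullable: true`), the wrappers can be unwrapped recursively. This is a sketch under assumptions: the `PRIMITIVES` table, the `WRAPPER` regex, and the `resolve_type` helper are hypothetical and cover only a few types. The real `resolve_type_and_default()` handles far more cases.

```python
import re

# Assumed C++ to JSON type mapping, limited to a handful of common types.
PRIMITIVES = {
    'int': 'integer',
    'int32_t': 'integer',
    'int64_t': 'integer',
    'bool': 'boolean',
    'double': 'number',
    'std::string': 'string',
    'ss::sstring': 'string',
    'std::chrono::milliseconds': 'integer',
    'std::chrono::seconds': 'integer',
}
# Matches `wrapper<inner>` and captures both parts.
WRAPPER = re.compile(r'^\s*([\w:]+)\s*<(.*)>\s*$', re.S)


def resolve_type(cpp_type):
    """Unwrap property<>, std::optional<>, and std::vector<> recursively."""
    cpp_type = cpp_type.strip()
    match = WRAPPER.match(cpp_type)
    if match:
        wrapper, inner = match.group(1), match.group(2)
        if wrapper == 'property':
            return resolve_type(inner)
        if wrapper == 'std::optional':
            return {**resolve_type(inner), 'nullable': True}
        if wrapper == 'std::vector':
            return {'type': 'array', 'items': resolve_type(inner)}
    # Unknown types fall back to 'object' in this sketch.
    return {'type': PRIMITIVES.get(cpp_type, 'object')}


print(resolve_type('property<std::optional<int>>'))
print(resolve_type('property<std::vector<ss::sstring>>'))
```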
477
702
 
478
- This human-readable format appears in documentation templates alongside the numeric value:
703
+ **Location:** `property_extractor.py` lines 1445-2250
479
704
 
480
- [source,asciidoc]
705
+ ==== Stage 5: Topic property extraction
706
+
707
+ The `TopicPropertyExtractor` class:
708
+
709
+ . Scans `types.h` for `topic_property_*` constants
710
+ . Reads `config_response_utils.cc` for cluster mappings:
711
+ +
712
+ [,cpp]
481
713
  ----
482
- | Default
483
- | `604800` (1 week)
714
+ add_topic_config_if_requested(
715
+ topic_property_retention_ms, // Topic property
716
+ config::shard_local_cfg().log_retention_ms.name(), // Cluster property
717
+ config::shard_local_cfg().log_retention_ms.desc()
718
+ );
484
719
  ----
485
720
 
486
- === Stage 6: Enum default mapping
721
+ . Looks up cluster property defaults from main properties dict
722
+ . Copies `default` and `default_human_readable` to topic property
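The last two steps amount to a dictionary lookup and copy. A minimal sketch, assuming the mapping from topic property to cluster property has already been parsed into a plain dict; the function name `inherit_cluster_defaults` and the sample values are illustrative, not taken from `topic_property_extractor.py`.

```python
def inherit_cluster_defaults(topic_properties, cluster_mappings, cluster_properties):
    """Return topic properties with defaults copied from mapped cluster properties."""
    inherited = {}
    for topic_name, topic_prop in topic_properties.items():
        merged = dict(topic_prop)  # leave the input untouched
        cluster_name = cluster_mappings.get(topic_name)
        cluster_prop = cluster_properties.get(cluster_name, {})
        for field in ('default', 'default_human_readable'):
            if field in cluster_prop:
                merged[field] = cluster_prop[field]
        inherited[topic_name] = merged
    return inherited


# Illustrative data only.
topic_properties = {
    'retention.ms': {'type': 'integer'},
    'unmapped.property': {'type': 'string'},
}
cluster_mappings = {'retention.ms': 'log_retention_ms'}
cluster_properties = {
    'log_retention_ms': {'default': 604800000, 'default_human_readable': '7 days'},
}

result = inherit_cluster_defaults(topic_properties, cluster_mappings, cluster_properties)
print(result['retention.ms'])
```

Topic properties without a mapping pass through unchanged, which is one way a missing mapping shows up as a missing default in the output.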
487
723
 
488
- Raw C++ enum values are mapped to user-facing strings:
724
+ **Key class:** `TopicPropertyExtractor` in `topic_property_extractor.py`
489
725
 
490
- [,cpp]
726
+ ==== Stage 6: Override application
727
+
728
+ Overrides are applied after extraction but before output:
729
+
730
+ [,python]
491
731
  ----
492
- enum class write_caching_mode {
493
- default_true,
494
- default_false,
495
- disabled
496
- };
732
+ def apply_overrides(properties, overrides, overrides_file_path):
733
+     """
734
+     Apply manual overrides from JSON file.
497
735
 
498
- const char* write_caching_mode_to_string(write_caching_mode s) {
499
- case write_caching_mode::default_false: return "false";
500
- // ...
501
- }
736
+     For each property in overrides:
737
+     1. Find matching property (by key or by name field)
738
+     2. Deep merge override fields into property
739
+     3. Resolve example_file references to actual content
740
+     """
741
+     for override_key, override_data in overrides['properties'].items():
742
+         if override_key in properties:
743
+             properties[override_key].update(override_data)
744
+         else:
745
+             # Create new property from override
746
+             properties[override_key] = override_data
502
747
  ----
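The docstring above describes a deep merge, while the excerpt calls `dict.update()`, which replaces nested values wholesale. The difference matters for object-typed defaults. The following sketch shows what a recursive merge could look like. `deep_merge` is a hypothetical helper written for this illustration, not the function used in `property_extractor.py`.

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which nested dicts are merged key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


extracted = {
    'description': 'Extracted description',
    'default': {'field': 'a', 'other': 1},
}
override = {'default': {'field': 'b'}}

shallow = dict(extracted)
shallow.update(override)          # replaces the whole 'default' object
deep = deep_merge(extracted, override)  # keeps 'other', changes 'field'

print(shallow['default'])
print(deep['default'])
```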
503
748
 
504
- Properties using this enum automatically map:
749
+ **Key function:** `apply_overrides()` in `property_extractor.py`
505
750
 
506
- * Default: `default_false` → `"false"`
507
- * Enum values: `["true", "false", "disabled"]`
751
+ ==== Stage 7: AsciiDoc generation
508
752
 
509
- === Stage 7: Override application
753
+ The `generate-handlebars-docs.js` script:
510
754
 
511
- The `overrides.json` file allows customization of both properties and type definitions:
755
+ 1. Loads property JSON
756
+ 2. Groups by config_scope and category
757
+ 3. Renders each property through Handlebars template
758
+ 4. Writes AsciiDoc files
512
759
 
513
- [,json]
760
+ [,javascript]
514
761
  ----
515
- {
516
- "properties": {
517
- "kafka_api": {
518
- "description": "Custom description",
519
- "example": "kafka_api:\n - name: internal\n address: 0.0.0.0\n port: 9092"
520
- }
521
- },
522
- "definitions": {
523
- "model::compression": {
524
- "enum": ["none", "gzip", "snappy", "lz4", "zstd", "producer"]
525
- }
526
- }
527
- }
762
+ // In generate-handlebars-docs.js
763
+ const properties = JSON.parse(fs.readFileSync(inputFile));
764
+
765
+ // Group properties
766
+ const clusterProps = Object.values(properties.properties)
767
+ .filter(p => p.config_scope === 'cluster');
768
+
769
+ // Render template
770
+ const template = handlebars.compile(templateSource);
771
+ const output = template({properties: clusterProps});
772
+
773
+ // Write file
774
+ fs.writeFileSync('cluster-properties.adoc', output);
528
775
  ----
529
776
 
530
- == Command-line reference
777
+ **Key files:**
531
778
 
532
- === Basic usage
779
+ * `generate-handlebars-docs.js` - Main generation script
780
+ * `helpers.js` - Template helper functions
781
+ * `templates/*.hbs` - Handlebars templates
533
782
 
534
- [,bash]
783
+ ==== Stage 8: Diff generation
784
+
785
+ The `compare-properties.js` script compares two JSON files:
786
+
787
+ [,javascript]
535
788
  ----
536
- ./property_extractor.py --path <redpanda-source-path> [options]
789
+ function compareProperties(oldData, newData) {
790
+ return {
791
+ newProperties: findNew(oldData, newData),
792
+ changedDefaults: findDefaultChanges(oldData, newData),
793
+ changedDescriptions: findDescriptionChanges(oldData, newData),
794
+ deprecatedProperties: findNewlyDeprecated(oldData, newData),
795
+ removedProperties: findRemoved(oldData, newData)
796
+ };
797
+ }
537
798
  ----
538
799
 
539
- === Options
800
+ Uses deep equality checking to detect:
540
801
 
541
- [cols="1,2,1"]
542
- |===
543
- | Option | Description | Default
802
+ * New properties in new version
803
+ * Properties removed from old version
804
+ * Changed default values
805
+ * Changed descriptions/types
806
+ * Newly deprecated properties
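For readers more at home in the Python half of the tool, the same comparison can be sketched with set operations. This is not a port of `compare-properties.js`: the function `compare_properties` and the `is_deprecated` field name are assumptions. Python's `!=` on dicts and lists already compares nested values, which stands in for the deep equality check.

```python
def compare_properties(old_props, new_props):
    """Summarize differences between two {name: property} dicts."""
    old_keys, new_keys = set(old_props), set(new_props)
    shared = sorted(old_keys & new_keys)
    return {
        'newProperties': sorted(new_keys - old_keys),
        'removedProperties': sorted(old_keys - new_keys),
        'changedDefaults': [
            k for k in shared
            if old_props[k].get('default') != new_props[k].get('default')
        ],
        'changedDescriptions': [
            k for k in shared
            if old_props[k].get('description') != new_props[k].get('description')
        ],
        # 'is_deprecated' is an assumed field name.
        'deprecatedProperties': [
            k for k in shared
            if new_props[k].get('is_deprecated')
            and not old_props[k].get('is_deprecated')
        ],
    }


old = {
    'a': {'default': 1, 'description': 'A'},
    'b': {'default': {'x': [1, 2]}, 'description': 'B'},
    'c': {'default': 3, 'description': 'C'},
}
new = {
    'a': {'default': 2, 'description': 'A'},
    'b': {'default': {'x': [1, 2]}, 'description': 'B', 'is_deprecated': True},
    'd': {'default': 4, 'description': 'D'},
}
report = compare_properties(old, new)
print(report)
```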
544
807
 
545
- | `--path <path>`
546
- | Path to Redpanda source directory (required)
547
- | None
808
+ **Key file:** `compare-properties.js`
548
809
 
549
- | `--recursive`
550
- | Recursively scan for header/implementation file pairs
551
- | False
810
+ === Adding new transformers
552
811
 
553
- | `--output <file>`
554
- | Output JSON file path
555
- | stdout
812
+ To add custom property metadata extraction:
556
813
 
557
- | `--overrides <file>`
558
- | JSON file with property and definition overrides
559
- | `overrides.json`
814
+ . **Create transformer class** in `transformers.py`:
815
+ +
816
+ [,python]
817
+ ----
818
+ class MyCustomTransformer:
819
+     """Extract custom metadata from properties."""
560
820
 
561
- | `-v`, `--verbose`
562
- | Enable verbose logging
563
- | False
564
- |===
821
+     def accepts(self, property_definition, file_path):
822
+         """Check if this transformer applies to this property."""
823
+         return 'custom_metadata' in str(property_definition)
565
824
 
566
- === Examples
825
+     def parse(self, property_definition, all_properties, file_path):
826
+         """Extract metadata and update property_definition."""
827
+         # Extract from C++ source
828
+         if match := re.search(r'custom_field\s*=\s*(\w+)', str(property_definition)):
829
+             property_definition['custom_field'] = match.group(1)
830
+ ----
567
831
 
568
- Extract properties from Redpanda source:
832
+ . **Register transformer** in `property_extractor.py`:
833
+ +
834
+ [,python]
835
+ ----
836
+ # In transform_files_with_properties()
837
+ transformers = [
838
+ # ... existing transformers ...
839
+ MyCustomTransformer(), # Add here
840
+ ]
841
+ ----
569
842
 
843
+ . **Test on sample property:**
844
+ +
570
845
  [,bash]
571
846
  ----
572
- ./property_extractor.py --path ./tmp/redpanda/src/v --output properties.json
847
+ ./property_extractor.py --path tmp/redpanda/src/v --verbose | \
848
+ jq '.properties.test_property'
573
849
  ----
574
850
 
575
- Use custom overrides:
851
+ === Extending type extraction
576
852
 
577
- [,bash]
853
+ To support new C++ patterns:
854
+
855
+ . **Add extraction method** in `type_definition_extractor.py`:
856
+ +
857
+ [,python]
578
858
  ----
579
- ./property_extractor.py \
580
- --path ./tmp/redpanda/src/v \
581
- --overrides custom-overrides.json \
582
- --output properties.json
859
+ def _extract_my_new_pattern(self, content, file_path):
860
+     """Extract custom C++ pattern."""
861
+     pattern = re.compile(r'my_pattern\s+(\w+)\s*{([^}]+)}')
862
+
863
+     for match in pattern.finditer(content):
864
+         name = match.group(1)
865
+         body = match.group(2)
866
+
867
+         # Parse and build definition
868
+         self.definitions[name] = {
869
+             'type': 'custom',
870
+             'body': body,
871
+             'defined_in': file_path
872
+         }
873
+ ----
874
+
875
+ . **Call from `_extract_from_file()`**:
876
+ +
877
+ [,python]
583
878
  ----
879
+ def _extract_from_file(self, file_path):
880
+     content = file_path.read_text()
584
881
 
585
- Enable verbose logging for debugging:
882
+     # Existing extraction calls
883
+     self._extract_structs(content, file_path)
884
+     self._extract_enums(content, file_path)
586
885
 
886
+     # Add your new extraction
887
+     self._extract_my_new_pattern(content, file_path)
888
+ ----
889
+
890
+ . **Test extraction:**
891
+ +
587
892
  [,bash]
588
893
  ----
589
- ./property_extractor.py --path ./tmp/redpanda/src/v --verbose
894
+ python3 -c "
895
+ from type_definition_extractor import TypeDefinitionExtractor
896
+ from pathlib import Path
897
+
898
+ extractor = TypeDefinitionExtractor(Path('tmp/redpanda/src/v'))
899
+ extractor.extract()
900
+ print(extractor.definitions['my_type'])
901
+ "
590
902
  ----
591
903
 
592
- == Customization
904
+ === Adding template helpers
593
905
 
594
- === When to add manual definitions
906
+ To add new Handlebars formatting helpers:
595
907
 
596
- You need manual definitions in `overrides.json` only for:
908
+ . **Add function** to `helpers.js`:
909
+ +
910
+ [,javascript]
911
+ ----
912
+ /**
913
+ * Format byte values with units
914
+ */
915
+ function formatBytes(value) {
916
+ const units = ['B', 'KB', 'MB', 'GB', 'TB'];
917
+ let size = value;
918
+ let unitIndex = 0;
919
+
920
+ while (size >= 1024 && unitIndex < units.length - 1) {
921
+ size /= 1024;
922
+ unitIndex++;
923
+ }
597
924
 
598
- ==== 1. Types removed from codebase
925
+ return `${size.toFixed(2)} ${units[unitIndex]}`;
926
+ }
599
927
 
600
- If a type was removed from Redpanda source but properties still reference it:
928
+ module.exports = {
929
+ // ... existing helpers ...
930
+ formatBytes: formatBytes
931
+ };
932
+ ----
601
933
 
602
- [,json]
934
+ . **Use in template** (`templates/property.hbs`):
935
+ +
936
+ [,handlebars]
603
937
  ----
604
- {
605
- "definitions": {
606
- "legacy_type": {
607
- "type": "string",
608
- "description": "Maintained for backward compatibility"
609
- }
610
- }
611
- }
938
+ {{#if (eq type "integer")}}
939
+ | Default
940
+ | `{{formatBytes default}}`
941
+ {{/if}}
942
+ ----
943
+
944
+ . **Test rendering:**
945
+ +
946
+ [,bash]
947
+ ----
948
+ node generate-handlebars-docs.js test-input.json test-output
949
+ cat test-output/property.adoc
612
950
  ----
613
951
 
614
- ==== 2. Complex types not auto-extractable
952
+ === Debugging tips
615
953
 
616
- Property classes inheriting from template base classes:
954
+ ==== Enable verbose logging
617
955
 
618
- [,cpp]
956
+ [,bash]
619
957
  ----
620
- class retention_duration_property final
621
- : public property<std::optional<std::chrono::milliseconds>> {
622
- // Complex logic, no simple fields to extract
623
- };
958
+ # Python extractor
959
+ ./property_extractor.py --path tmp/redpanda/src/v --verbose
960
+
961
+ # Makefile with debug
962
+ make build TAG=v25.3.1 VERBOSE=1
624
963
  ----
625
964
 
626
- Define manually:
965
+ ==== Inspect AST for property
627
966
 
628
- [,json]
967
+ [,python]
629
968
  ----
630
- {
631
- "definitions": {
632
- "retention_duration_property": {
633
- "type": "integer",
634
- "minimum": -2147483648,
635
- "maximum": 2147483647
636
- }
637
- }
638
- }
969
+ # In property_extractor.py, add temporary debug:
970
+ def get_properties(node, source_bytes):
971
+     if property_name == "debug_this_property":
972
+         print(f"AST: {node}")
973
+         print(f"Source: {source_bytes[node.start_byte:node.end_byte]}")
974
+         import pdb; pdb.set_trace() # Drop into debugger
639
975
  ----
640
976
 
641
- ==== 3. Override auto-extracted definitions
977
+ ==== Check type extraction
642
978
 
643
- Provide cleaner enum values or simplified field lists:
979
+ [,bash]
980
+ ----
981
+ # See all extracted types
982
+ ./property_extractor.py --path tmp/redpanda/src/v --output test.json
983
+ jq '.definitions | keys' test.json
644
984
 
645
- [,json]
985
+ # Check specific type
986
+ jq '.definitions."model::compression"' test.json
646
987
  ----
647
- {
648
- "definitions": {
649
- "model::compression": {
650
- "$comment": "Overrides auto-extracted enum to exclude internal values",
651
- "enum": ["none", "gzip", "snappy", "lz4", "zstd", "producer"]
652
- }
653
- }
654
- }
988
+
989
+ ==== Test transformer in isolation
990
+
991
+ [,python]
655
992
  ----
993
+ # test_transformer.py
994
+ from transformers import NeedsRestartTransformer
995
+ from property_bag import PropertyBag
656
996
 
657
- ==== 4. Documentation-only types
997
+ prop = PropertyBag()
998
+ prop['raw'] = 'property<int>("test").needs_restart(no)'
658
999
 
659
- Types needed for documentation but not in C++ source:
1000
+ transformer = NeedsRestartTransformer()
1001
+ if transformer.accepts(prop, "test.cc"):
1002
+     transformer.parse(prop, {}, "test.cc")
1003
+ print(f"Result: {prop}")
1004
+ ----
660
1005
 
661
- [,json]
1006
+ ==== Compare JSON outputs
1007
+
1008
+ [,bash]
662
1009
  ----
663
- {
664
- "definitions": {
665
- "custom_config_type": {
666
- "type": "object",
667
- "properties": {
668
- "host": {"type": "string"},
669
- "port": {"type": "integer"}
670
- }
671
- }
672
- }
673
- }
1010
+ # Diff two versions
1011
+ diff <(jq -S . old.json) <(jq -S . new.json)
1012
+
1013
+ # Check if property exists in version
1014
+ jq '.properties | has("property_name")' redpanda-properties-v25.3.1.json
674
1015
  ----
675
1016
 
676
- === Override precedence
1017
+ == Limitations and known issues
677
1018
 
678
- Definitions are applied in this order (later overrides earlier):
1019
+ === What works well
679
1020
 
680
- . Auto-extracted from C++ source
681
- . `overrides.json` definitions
1021
+ * Standard property declarations using `property<T>` template
1022
+ * Common C++ types (int, string, bool, chrono types)
1023
+ * Struct and class types with public fields
1024
+ * Enums with string conversion functions
1025
+ * Chrono expression evaluation (`24h`, `7 * 24h`)
1026
+ * Optional types (`std::optional<T>`)
1027
+ * Array types (`std::vector<T>`)
1028
+ * Metadata extraction (`.visibility()`, `.needs_restart()`)
1029
+ * Topic property cluster mapping
682
1030
 
683
- === Overrides file format
1031
+ === Current limitations
684
1032
 
685
- The `overrides.json` file supports two top-level keys:
1033
+ * *Complex template types*: Highly nested templates may not resolve correctly
1034
+ * *Constexpr evaluation*: Complex compile-time expressions beyond chrono may not evaluate
1035
+ * *Private fields*: Struct fields marked private are not extracted
1036
+ * *Inheritance*: Properties in derived classes may not be fully captured
1037
+ * *Preprocessor macros*: Properties defined via macros may be missed
1038
+ * *Runtime defaults*: Defaults computed at runtime cannot be extracted
686
1039
 
687
- [,json]
688
- ----
689
- {
690
- "$comment": "Property and definition overrides for Redpanda property extraction",
1040
+ === Alternative approach: Admin API
691
1041
 
692
- "properties": {
693
- "property_name": {
694
- "description": "Custom description text",
695
- "example": ".Example\n[,yaml]\n----\nredpanda:\n property_name: value\n----",
696
- "version": "24.3",
697
- "related_topics": ["xref:topic.adoc[Link]"],
698
- "default": "custom_default",
699
- "config_scope": "broker",
700
- "type": "string"
701
- }
702
- },
1042
+ An alternative to C++ source extraction is using the Redpanda Admin API to fetch configuration properties at runtime. This approach would provide:
703
1043
 
704
- "definitions": {
705
- "type::name": {
706
- "$comment": "Overrides or adds type definition",
707
- "type": "enum",
708
- "enum": ["value1", "value2", "value3"],
709
- "defined_in": "https://github.com/.../file.h#L123"
710
- }
711
- }
712
- }
713
- ----
1044
+ * Runtime-accurate default values
1045
+ * Dynamically computed defaults
1046
+ * No C++ parsing complexity
714
1047
 
715
- Property override fields:
1048
+ However, this approach has drawbacks:
716
1049
 
717
- * `description` - Override auto-extracted description
718
- * `example` - Add AsciiDoc example block
719
- * `example_file` - Load example from external file
720
- * `version` - Version when property was introduced
721
- * `related_topics` - Array of cross-reference links
722
- * `default` - Override default value
723
- * `config_scope` - Specify scope for new properties (broker/cluster/topic)
724
- * `type` - Specify type for new properties
1050
+ * Requires a running Redpanda cluster
1051
+ * Current endpoint limitations:
1052
+ ** Only cluster properties available (no topic properties yet)
1053
+ ** Schema doesn't include enterprise feature flags
1054
+
1055
+ See https://redpandadata.atlassian.net/browse/DOC-1828[DOC-1828] for proposed enhancements to the Admin API that would make this approach more viable.
725
1056
 
726
- == JSON output format
1057
+ === Workarounds
727
1058
 
728
- The extractor generates a JSON Schema-like document:
1059
+ For unsupported patterns, use `overrides.json`:
729
1060
 
730
1061
  [,json]
731
1062
  ----
732
1063
  {
733
1064
  "properties": {
734
- "property_name": {
735
- "type": "string",
736
- "description": "Property description",
737
- "default": "default_value",
738
- "required": false,
739
- "visibility": "tunable",
740
- "requires_restart": false,
741
- "config_scope": "broker",
742
- "units": "bytes",
743
- "minimum": 0,
744
- "maximum": 1000,
745
- "enum": ["option1", "option2"],
746
- "example": ".Example\n[,yaml]\n----\nredpanda:\n property_name: value\n----"
1065
+ "complex_property": {
1066
+ "type": "object",
1067
+ "default": {"field": "value"},
1068
+ "description": "Manual override for complex property"
747
1069
  }
748
1070
  },
749
-
750
1071
  "definitions": {
751
- "model::broker_endpoint": {
1072
+ "ComplexType": {
752
1073
  "type": "object",
753
1074
  "properties": {
754
- "name": {"type": "string"},
755
- "address": {"type": "string"},
756
- "port": {"type": "integer", "minimum": 0, "maximum": 65535}
757
- },
758
- "defined_in": "model/metadata.h"
759
- },
760
-
761
- "model::compression": {
762
- "type": "enum",
763
- "enum": ["none", "gzip", "snappy", "lz4", "zstd", "producer"],
764
- "enum_string_mappings": {
765
- "compression_type_none": "none",
766
- "compression_type_gzip": "gzip"
767
- },
768
- "defined_in": "model/compression.h"
769
- },
770
-
771
- "model::node_id": {
772
- "type": "integer",
773
- "minimum": -2147483648,
774
- "maximum": 2147483647,
775
- "alias_for": "named_type<int32_t, struct node_id_model_type>",
776
- "defined_in": "model/fundamental.h"
1075
+ "field": {"type": "string"}
1076
+ }
777
1077
  }
778
1078
  }
779
1079
  }
780
1080
  ----
781
1081
 
782
- == Documentation generation
1082
+ == Contributing
783
1083
 
784
- To generate AsciiDoc documentation from the JSON:
1084
+ === Development setup
785
1085
 
786
1086
  [,bash]
787
1087
  ----
788
- python3 generate_docs.py
789
- ----
790
-
791
- This creates:
1088
+ # Clone repository
1089
+ git clone https://github.com/redpanda-data/docs-extensions-and-macros.git
1090
+ cd docs-extensions-and-macros/tools/property-extractor
792
1091
 
793
- * `output/pages/broker-properties.adoc` - Broker configuration
794
- * `output/pages/cluster-properties.adoc` - Cluster configuration
795
- * `output/pages/object-storage-properties.adoc` - Cloud storage configuration
796
- * `output/pages/deprecated/partials/deprecated-properties.adoc` - Deprecated properties
1092
+ # Set up Python environment
1093
+ make venv
797
1094
 
1095
+ # Install Node dependencies
1096
+ npm install
798
1097
 
799
- == Troubleshooting
1098
+ # Clone Redpanda source for testing
1099
+ make redpanda-git TAG=v25.3.1
800
1100
 
801
- === Type not found
1101
+ # Build tree-sitter parser
1102
+ make treesitter
1103
+ ----
802
1104
 
803
- If a property references a type that isn't extracted:
1105
+ === Running tests
804
1106
 
805
- . Check if the type exists in Redpanda source:
806
- +
807
1107
  [,bash]
808
1108
  ----
809
- find tmp/redpanda/src/v -name "*.h" -exec grep -l "your_type_name" {} \;
810
- ----
1109
+ # Run test suite
1110
+ npm test
811
1111
 
812
- . If found, check extraction:
813
- +
814
- [,bash]
815
- ----
816
- ./property_extractor.py --path tmp/redpanda/src/v --verbose 2>&1 | grep "your_type_name"
1112
+ # Run specific test file
1113
+ npx jest __tests__/tools/topic_property_extractor.test.js
1114
+
1115
+ # Run with coverage
1116
+ npm test -- --coverage
817
1117
  ----
818
1118
 
819
- . If not extracted, add manual definition to `overrides.json`
1119
+ === Making changes
820
1120
 
821
- === Enum values incorrect
1121
+ 1. **Before you start:**
1122
+ - Open an issue describing the problem/enhancement
1123
+ - Discuss approach with maintainers
822
1124
 
823
- If enum values don't match user-facing strings:
1125
+ 2. **Make your changes:**
1126
+ - Follow existing code style
1127
+ - Add tests for new functionality
1128
+ - Update this README for significant changes
824
1129
 
825
- . Check for `_to_string()` function in source
826
- . If missing or incorrect, override in `overrides.json`:
827
- +
828
- [,json]
829
- ----
830
- {
831
- "definitions": {
832
- "model::your_enum": {
833
- "enum": ["user_value1", "user_value2"]
834
- }
835
- }
836
- }
837
- ----
1130
+ 3. **Test thoroughly:**
1131
+ - Run existing tests: `npm test`
1132
+ - Test on multiple Redpanda versions
1133
+ - Generate docs locally and verify output
838
1134
 
839
- === Missing property fields
1135
+ 4. **Submit PR:**
1136
+ - Include issue reference
1137
+ - Describe what changed and why
1138
+ - Show before/after examples
840
1139
 
841
- If extracted properties lack descriptions or defaults:
1140
+ === Testing checklist
842
1141
 
843
- . Check C++ source for property declaration
844
- . Add override in `overrides.json`:
845
- +
846
- [,json]
847
- ----
848
- {
849
- "properties": {
850
- "property_name": {
851
- "description": "Detailed description",
852
- "example": "..."
853
- }
854
- }
855
- }
856
- ----
1142
+ Before submitting changes, verify:
1143
+
1144
+ * [ ] All tests pass: `npm test`
1145
+ * [ ] Generated docs look correct: `npx doc-tools generate property-docs --tag v25.3.1`
1146
+ * [ ] No regression on older versions: Test with v24.x and v25.x
1147
+ * [ ] Override handling still works
1148
+ * [ ] Topic properties get correct defaults
1149
+ * [ ] Diff generation works between versions
1150
+ * [ ] Templates render correctly
1151
+ * [ ] New transformers don't break existing extraction
1152
+
1153
+ === Code style guidelines
1154
+
1155
+ *Python:*
1156
+
1157
+ * Follow PEP 8
1158
+ * Use type hints where possible
1159
+ * Document complex functions with docstrings
1160
+ * Keep functions focused and testable
1161
+
1162
+ *JavaScript:*
1163
+
1164
+ * Use modern ES6+ features
1165
+ * Prefer const/let over var
1166
+ * Document helper functions with JSDoc
1167
+ * Use async/await over callbacks
1168
+
1169
+ *Documentation:*
1170
+
1171
+ * Use AsciiDoc format
1172
+ * Include code examples
1173
+ * Explain the "why" not just the "what"
1174
+ * Keep README in sync with code
1175
+
1176
+ == Troubleshooting
857
1177
 
858
1178
  === Build failures
859
1179
 
860
- Tree-sitter compilation errors:
1180
+ **Tree-sitter won't compile:**
861
1181
 
862
1182
  [,bash]
863
1183
  ----
1184
+ # Update submodule
864
1185
  cd tree-sitter/tree-sitter-cpp
865
1186
  git submodule update --init --recursive
1187
+
1188
+ # Clean and rebuild
1189
+ cd ../..
1190
+ make clean
1191
+ make treesitter
866
1192
  ----
867
1193
 
868
- Python dependency errors:
1194
+ **Python dependencies fail:**
869
1195
 
870
1196
  [,bash]
871
1197
  ----
1198
+ # Remove and recreate venv
872
1199
  make clean
1200
+ rm -rf tmp/redpanda-property-extractor-venv
873
1201
  make venv
874
1202
  ----
875
1203
 
876
- == Advanced usage
1204
+ **Node modules missing:**
877
1205
 
878
- === Adding new transformers
1206
+ [,bash]
1207
+ ----
1208
+ npm install
1209
+ ----
1210
+
1211
+ === Extraction issues
879
1212
 
880
- To add custom property transformations:
1213
+ **Property not found:**
881
1214
 
882
- . Create a transformer function in `transformers.py`:
1215
+ 1. Verify property exists in Redpanda source:
883
1216
  +
884
- [,python]
1217
+ [,bash]
885
1218
  ----
886
- def my_custom_transformer(properties):
887
- """Add custom metadata to properties."""
888
- for prop_name, prop in properties.items():
889
- # Add custom logic
890
- prop['custom_field'] = compute_value(prop)
891
- return properties
1219
+ grep -r "property_name" tmp/redpanda/src/v/config/
892
1220
  ----
893
1221
 
894
- . Register in transformer pipeline in `property_extractor.py`:
1222
+ 2. Check if property uses unsupported pattern
1223
+
1224
+ 3. Add override to `property-overrides.json`
1225
+
1226
+ **Type not resolved:**
1227
+
1228
+ 1. Check type definition exists:
895
1229
  +
896
- [,python]
1230
+ [,bash]
897
1231
  ----
898
- properties = transform_files_with_properties(files_with_properties)
899
- properties = my_custom_transformer(properties) # Add here
1232
+ find tmp/redpanda/src/v -name "*.h" -exec grep -l "TypeName" {} \;
900
1233
  ----
901
1234
 
902
- === Extending type extraction
1235
+ 2. Verify type is in scanned directories
903
1236
 
904
- To support additional C++ patterns:
1237
+ 3. Add manual definition to `overrides.json`
905
1238
 
906
- . Add extraction method to `type_definition_extractor.py`
907
- . Register in `_extract_from_file()` method
908
- . Test extraction on sample files
1239
+ **Wrong default value:**
909
1240
 
910
- === Custom output formats
1241
+ 1. Check C++ source for actual default
1242
+ 2. Verify chrono evaluation is correct
1243
+ 3. Override in `property-overrides.json` if needed
911
1244
 
912
- To generate additional output formats:
1245
+ **Missing topic property defaults:**
913
1246
 
914
- . Load the JSON output:
1247
+ 1. Verify cluster property has a default:
915
1248
  +
916
- [,python]
1249
+ [,bash]
917
1250
  ----
918
- import json
919
- with open('gen/properties-output.json') as f:
920
- data = json.load(f)
1251
+ jq '.properties.log_retention_ms.default' output.json
921
1252
  ----
922
1253
 
923
- . Transform to desired format (YAML, XML, etc.)
1254
+ 2. Check topic cluster mapping exists
924
1255
 
925
- == Contributing
1256
+ 3. Ensure `extract_topic_properties()` receives cluster properties
1257
+
1258
+ === Template rendering issues
1259
+
1260
+ **Handlebars error:**
1261
+
1262
+ [,bash]
1263
+ ----
1264
+ # Validate JSON
1265
+ jq . generated.json
1266
+
1267
+ # Test template manually
1268
+ node generate-handlebars-docs.js test-input.json test-output
1269
+ ----
1270
+
1271
+ **Missing values in output:**
1272
+
1273
+ 1. Check property has required field in JSON
1274
+ 2. Verify template references correct field name
1275
+ 3. Test with minimal property to isolate issue
926
1276
 
927
- When modifying the extractor:
1277
+ **Formatting looks wrong:**
928
1278
 
929
- . Test on multiple Redpanda versions
930
- . Update `overrides.json` for new types
931
- . Run validation: `make test`
932
- . Document changes in this README
1279
+ 1. Check helper function in `helpers.js`
1280
+ 2. Verify AsciiDoc syntax in template
1281
+ 3. Test rendering with sample property
933
1282
 
934
1283
  == Additional resources
935
1284
 
936
1285
  * https://github.com/redpanda-data/redpanda[Redpanda GitHub Repository]
1286
+ * https://github.com/redpanda-data/docs-extensions-and-macros[docs-extensions-and-macros Repository]
937
1287
  * https://tree-sitter.github.io/tree-sitter/[Tree-sitter Documentation]
1288
+ * https://handlebarsjs.com/[Handlebars.js Guide]
1289
+ * https://docs.asciidoctor.org/asciidoc/latest/[AsciiDoc Language Documentation]