harness-auto-docs 0.1.0
- package/.nvmrc +1 -0
- package/AGENTS.md +69 -0
- package/ARCHITECTURE.md +123 -0
- package/README.md +52 -0
- package/dist/ai/anthropic.d.ts +7 -0
- package/dist/ai/anthropic.js +20 -0
- package/dist/ai/interface.d.ts +3 -0
- package/dist/ai/interface.js +1 -0
- package/dist/ai/minimax.d.ts +7 -0
- package/dist/ai/minimax.js +21 -0
- package/dist/ai/openai.d.ts +7 -0
- package/dist/ai/openai.js +16 -0
- package/dist/cli.d.ts +2 -0
- package/dist/cli.js +103 -0
- package/dist/core/diff.d.ts +17 -0
- package/dist/core/diff.js +46 -0
- package/dist/core/generator.d.ts +10 -0
- package/dist/core/generator.js +238 -0
- package/dist/core/relevance.d.ts +3 -0
- package/dist/core/relevance.js +29 -0
- package/dist/core/writer.d.ts +2 -0
- package/dist/core/writer.js +23 -0
- package/dist/providers/github.d.ts +13 -0
- package/dist/providers/github.js +43 -0
- package/dist/providers/gitlab.d.ts +9 -0
- package/dist/providers/gitlab.js +6 -0
- package/dist/providers/interface.d.ts +8 -0
- package/dist/providers/interface.js +1 -0
- package/docs/DESIGN.md +94 -0
- package/docs/QUALITY_SCORE.md +74 -0
- package/docs/design-docs/core-beliefs.md +71 -0
- package/docs/design-docs/index.md +32 -0
- package/docs/exec-plans/tech-debt-tracker.md +26 -0
- package/docs/product-specs/index.md +39 -0
- package/docs/references/anthropic-sdk-llms.txt +40 -0
- package/docs/references/octokit-rest-llms.txt +44 -0
- package/docs/references/openai-sdk-llms.txt +38 -0
- package/docs/superpowers/plans/2026-04-03-harness-engineering-auto-docs.md +1863 -0
- package/docs/superpowers/specs/2026-04-03-harness-engineering-auto-docs-design.md +169 -0
- package/examples/github-workflow.yml +32 -0
- package/markdown/harness-engineering-codex-agent-first-world.md +215 -0
- package/package.json +30 -0
- package/src/ai/anthropic.ts +23 -0
- package/src/ai/interface.ts +3 -0
- package/src/ai/minimax.ts +25 -0
- package/src/ai/openai.ts +20 -0
- package/src/cli.ts +122 -0
- package/src/core/diff.ts +77 -0
- package/src/core/generator.ts +294 -0
- package/src/core/relevance.ts +53 -0
- package/src/core/writer.ts +25 -0
- package/src/providers/github.ts +53 -0
- package/src/providers/gitlab.ts +16 -0
- package/src/providers/interface.ts +9 -0
- package/tests/core/anthropic.test.ts +33 -0
- package/tests/core/diff.test.ts +49 -0
- package/tests/core/generator.test.ts +93 -0
- package/tests/core/openai.test.ts +38 -0
- package/tests/core/relevance.test.ts +62 -0
- package/tests/core/writer.test.ts +56 -0
- package/tests/fixtures/diff-frontend.txt +11 -0
- package/tests/fixtures/diff-schema.txt +12 -0
- package/tests/fixtures/diff-small.txt +16 -0
- package/tests/integration/generate.test.ts +49 -0
- package/tests/providers/github.test.ts +69 -0
- package/tsconfig.json +15 -0
- package/vitest.config.ts +7 -0
@@ -0,0 +1,238 @@
+import { appendSection, createFile } from './writer.js';
+const PROMPTS = {
+    'AGENTS.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a section for AGENTS.md.
+Describe what AI coding agents need to know: new modules, interfaces, APIs, navigation patterns, new conventions.
+Write in present tense. Be specific and actionable. 2-4 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'ARCHITECTURE.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a section for ARCHITECTURE.md.
+Focus on: new layers, modules, dependency changes, new abstractions, removed or restructured components.
+Write in present tense. 2-3 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'DESIGN.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a section for docs/DESIGN.md.
+Focus on key design decisions, why certain approaches were chosen, trade-offs made.
+Write in present tense. 2-3 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'FRONTEND.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a section for FRONTEND.md.
+Focus on new components, UI patterns, styling changes, frontend architecture changes.
+Write in present tense. 2-3 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'SECURITY.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a section for SECURITY.md.
+Focus on auth changes, permission model updates, new security boundaries, data handling changes.
+Write in present tense. 2-3 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'RELIABILITY.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a section for RELIABILITY.md.
+Focus on error handling improvements, retry logic, circuit breakers, observability changes.
+Write in present tense. 2-3 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'QUALITY_SCORE.md': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, write a quality assessment section.
+Assess test coverage changes, code complexity, technical debt introduced or resolved.
+Write in present tense. 1-2 paragraphs max.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'changelog': (diff) => `Write a changelog entry in Markdown for changes from ${diff.prevTag} to ${diff.currentTag}.
+
+Format:
+# Changelog: ${diff.currentTag}
+
+## Added
+- ...
+
+## Changed
+- ...
+
+## Fixed
+- ...
+
+## Removed
+- ...
+
+Be specific and engineer-focused. Only include sections with actual content.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'design-doc': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Write a design document for changes from ${diff.prevTag} to ${diff.currentTag}.
+
+Format:
+# Design: ${diff.currentTag}
+
+## Summary
+[What changed and why]
+
+## Design Decisions
+[Key decisions with rationale]
+
+## Agent Legibility Notes
+[What an AI coding agent needs to know to work in the updated codebase]
+
+## Technical Debt
+[Any shortcuts taken, what should be cleaned up later — or "None identified"]
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'design-doc-index': (diff) => `Return only a single Markdown list item for a docs index. Format:
+- [${diff.currentTag}](${diff.currentTag}.md) — One-sentence summary of changes.
+
+Changed files: ${diff.changedFiles.slice(0, 20).join(', ')}
+
+Return only the list item, nothing else.`,
+    'tech-debt-tracker': (diff) => `You are a technical writer following Harness Engineering documentation style.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, identify technical debt.
+
+Write this section:
+## ${diff.currentTag}
+
+### New debt
+- [Shortcuts, hacks, or deferred work visible in the diff — or "None identified"]
+
+### Resolved debt
+- [Cleanup or refactoring that resolves known issues — or "None identified"]
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'db-schema': (diff) => `You are a technical writer.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, document database schema changes.
+Focus only on schema changes visible in the diff: new tables, columns, indexes, constraints.
+Format as Markdown with clear table descriptions.
+If no schema changes: write "No schema changes in this release."
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+    'product-specs-index': (diff) => `Return only a single Markdown list item for a product specs index, or an empty string if no clear new feature.
+Format (if applicable): - [Feature Name](feature-name.md) — One-sentence description.
+
+Changed files: ${diff.changedFiles.slice(0, 20).join(', ')}
+
+Return only the list item or empty string, nothing else.`,
+    'references': (diff) => `You are a technical writer.
+
+Based on the git diff between ${diff.prevTag} and ${diff.currentTag}, identify new external libraries introduced.
+For each new library write:
+
+# library-name
+
+One paragraph: what it does and how it is used in this codebase.
+
+If no new external libraries: return an empty string.
+
+Git diff:
+${diff.raw.slice(0, 8000)}`,
+};
+export async function generateDocs(ai, diff, targets) {
+    return Promise.all(targets.map(async (target) => {
+        try {
+            const prompt = PROMPTS[target](diff);
+            const content = await ai.generate(prompt);
+            return { target, content };
+        }
+        catch (err) {
+            return { target, content: '', error: String(err) };
+        }
+    }));
+}
+export function writeResults(results, diff, cwd) {
+    const written = [];
+    for (const result of results) {
+        if (result.error || !result.content.trim())
+            continue;
+        const path = writeResult(result, diff, cwd);
+        if (path)
+            written.push(path);
+    }
+    return written;
+}
+function writeResult(result, diff, cwd) {
+    const tag = diff.currentTag;
+    const heading = `Changes in ${tag}`;
+    const appendTargets = [
+        'AGENTS.md', 'ARCHITECTURE.md', 'DESIGN.md', 'FRONTEND.md',
+        'SECURITY.md', 'RELIABILITY.md', 'QUALITY_SCORE.md',
+    ];
+    const rootTargets = ['AGENTS.md', 'ARCHITECTURE.md'];
+    if (appendTargets.includes(result.target)) {
+        const dir = rootTargets.includes(result.target) ? '' : 'docs/';
+        const path = `${cwd}/${dir}${result.target}`;
+        appendSection(path, heading, result.content);
+        return path;
+    }
+    switch (result.target) {
+        case 'changelog': {
+            const path = `${cwd}/changelog/${tag}.md`;
+            createFile(path, result.content);
+            return path;
+        }
+        case 'design-doc': {
+            const path = `${cwd}/docs/design-docs/${tag}.md`;
+            createFile(path, result.content);
+            return path;
+        }
+        case 'design-doc-index': {
+            const path = `${cwd}/docs/design-docs/index.md`;
+            appendSection(path, heading, result.content);
+            return path;
+        }
+        case 'tech-debt-tracker': {
+            const path = `${cwd}/docs/exec-plans/tech-debt-tracker.md`;
+            appendSection(path, heading, result.content);
+            return path;
+        }
+        case 'db-schema': {
+            const path = `${cwd}/docs/generated/db-schema.md`;
+            appendSection(path, heading, result.content);
+            return path;
+        }
+        case 'product-specs-index': {
+            if (!result.content.trim())
+                return null;
+            const path = `${cwd}/docs/product-specs/index.md`;
+            appendSection(path, heading, result.content);
+            return path;
+        }
+        case 'references': {
+            if (!result.content.trim())
+                return null;
+            const libName = extractLibName(result.content);
+            if (!libName)
+                return null;
+            const path = `${cwd}/docs/references/${libName}-llms.txt`;
+            createFile(path, result.content);
+            return path;
+        }
+        default:
+            return null;
+    }
+}
+function extractLibName(content) {
+    const match = content.match(/^#\s+(.+)$/m);
+    return match ? match[1].toLowerCase().replace(/[^a-z0-9-]/g, '-') : null;
+}
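The `references` branch names its output file after the first Markdown heading in the model's reply. The sketch below restates `extractLibName` from the hunk above, with type annotations added for illustration. The sample inputs are made up for this note and are not fixtures from the package.

```typescript
// Restates extractLibName from generator.js; the type annotations are an addition.
function extractLibName(content: string): string | null {
  // The `m` flag lets ^ and $ match at line boundaries; without `g`, only the first match is returned.
  const match = content.match(/^#\s+(.+)$/m);
  return match ? match[1].toLowerCase().replace(/[^a-z0-9-]/g, '-') : null;
}

console.log(extractLibName('# zod\n\nSchema validation.'));       // "zod"
console.log(extractLibName('# @octokit/rest\n\nGitHub client.')); // "-octokit-rest"
console.log(extractLibName('# Zod\n\n...\n\n# Execa\n\n...'));     // "zod"
console.log(extractLibName('No heading here.'));                   // null
```

With these inputs, a reply that describes several libraries is written to one file named after the first heading, and a scoped package name produces a slug with a leading hyphen.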
@@ -0,0 +1,3 @@
+import type { FileGroups } from './diff.js';
+export type DocTarget = 'AGENTS.md' | 'ARCHITECTURE.md' | 'DESIGN.md' | 'FRONTEND.md' | 'SECURITY.md' | 'RELIABILITY.md' | 'QUALITY_SCORE.md' | 'changelog' | 'design-doc' | 'design-doc-index' | 'tech-debt-tracker' | 'db-schema' | 'product-specs-index' | 'references';
+export declare function selectTargets(fileGroups: FileGroups, changedFiles: string[]): DocTarget[];
@@ -0,0 +1,29 @@
+const CORE_TARGETS = [
+    'AGENTS.md',
+    'ARCHITECTURE.md',
+    'DESIGN.md',
+    'QUALITY_SCORE.md',
+    'changelog',
+    'design-doc',
+    'design-doc-index',
+    'tech-debt-tracker',
+];
+export function selectTargets(fileGroups, changedFiles) {
+    const targets = [...CORE_TARGETS];
+    if (fileGroups.frontend.length > 0)
+        targets.push('FRONTEND.md');
+    if (fileGroups.auth.length > 0)
+        targets.push('SECURITY.md');
+    if (fileGroups.infra.length > 0)
+        targets.push('RELIABILITY.md');
+    if (fileGroups.schema.length > 0)
+        targets.push('db-schema');
+    const packageFiles = ['package.json', 'package-lock.json', 'yarn.lock', 'pnpm-lock.yaml'];
+    if (changedFiles.some(f => packageFiles.includes(f))) {
+        targets.push('references');
+    }
+    const looksLikeNewFeature = changedFiles.some(f => /\b(feature|feat|new)\b/i.test(f) || f.startsWith('src/features/'));
+    if (looksLikeNewFeature)
+        targets.push('product-specs-index');
+    return [...new Set(targets)];
+}
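The new-feature check in `selectTargets` depends on regex word boundaries, which is easy to misread. The sketch below restates that one condition as a standalone function. The function wrapper and the sample paths are hypothetical and exist only to show which paths trigger `product-specs-index`.

```typescript
// Restates the looksLikeNewFeature condition from selectTargets; the sample paths are hypothetical.
function looksLikeNewFeature(changedFiles: string[]): boolean {
  return changedFiles.some(
    (f) => /\b(feature|feat|new)\b/i.test(f) || f.startsWith('src/features/'),
  );
}

console.log(looksLikeNewFeature(['docs/feat-login.md']));   // true: "feat" sits between "/" and "-"
console.log(looksLikeNewFeature(['src/features/cart.ts'])); // true: prefix check ("features" fails the \b test)
console.log(looksLikeNewFeature(['src/newsletter.ts']));    // false: "new" is followed by a word character
console.log(looksLikeNewFeature(['lib/new/parser.ts']));    // true: "new" is a whole path segment
```

Because `_` counts as a word character, a path such as `new_parser.ts` would not match the regex either. The plural directory name is caught only by the explicit `src/features/` prefix check.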
@@ -0,0 +1,23 @@
+import { writeFileSync, readFileSync, existsSync, mkdirSync } from 'fs';
+import { dirname } from 'path';
+export function appendSection(filePath, heading, content) {
+    ensureDir(filePath);
+    const section = `\n## ${heading}\n\n${content}\n`;
+    if (existsSync(filePath)) {
+        const existing = readFileSync(filePath, 'utf-8');
+        writeFileSync(filePath, existing + section, 'utf-8');
+    }
+    else {
+        writeFileSync(filePath, section.trimStart(), 'utf-8');
+    }
+}
+export function createFile(filePath, content) {
+    ensureDir(filePath);
+    writeFileSync(filePath, content, 'utf-8');
+}
+function ensureDir(filePath) {
+    const dir = dirname(filePath);
+    if (!existsSync(dir)) {
+        mkdirSync(dir, { recursive: true });
+    }
+}
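To make the append behaviour concrete, the sketch below restates the string handling of `appendSection` as a pure function with no filesystem access. `nextFileContent` is a hypothetical helper introduced only for this illustration, and the tags and notes are placeholder values.

```typescript
// Hypothetical pure helper mirroring appendSection's string logic (no fs calls).
// `existing` is the current file content, or null when the file does not exist yet.
function nextFileContent(existing: string | null, heading: string, content: string): string {
  const section = `\n## ${heading}\n\n${content}\n`;
  // A new file drops the leading newline; an existing file keeps it as a separator.
  return existing === null ? section.trimStart() : existing + section;
}

const first = nextFileContent(null, 'Changes in v0.1.0', 'Initial notes.');
const second = nextFileContent(first, 'Changes in v0.2.0', 'More notes.');
console.log(JSON.stringify(first));
console.log(JSON.stringify(second));
```

Each release therefore adds one `##` section at the end of the file, separated from the previous section by a blank line, and earlier sections are never rewritten.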
@@ -0,0 +1,13 @@
+import type { PlatformProvider } from './interface.js';
+export declare class GitHubProvider implements PlatformProvider {
+    private octokit;
+    private owner;
+    private repo;
+    constructor(token: string);
+    createOrUpdatePR(opts: {
+        branch: string;
+        title: string;
+        body: string;
+        baseBranch: string;
+    }): Promise<string>;
+}
@@ -0,0 +1,43 @@
+import { Octokit } from '@octokit/rest';
+import { execSync } from 'child_process';
+export class GitHubProvider {
+    octokit;
+    owner;
+    repo;
+    constructor(token) {
+        this.octokit = new Octokit({ auth: token });
+        const remoteUrl = execSync('git remote get-url origin').toString().trim();
+        const match = remoteUrl.match(/github\.com[/:](.+?)\/(.+?)(?:\.git)?$/);
+        if (!match)
+            throw new Error(`Cannot parse GitHub remote URL: ${remoteUrl}`);
+        this.owner = match[1];
+        this.repo = match[2];
+    }
+    async createOrUpdatePR(opts) {
+        const { data: existingPRs } = await this.octokit.pulls.list({
+            owner: this.owner,
+            repo: this.repo,
+            head: `${this.owner}:${opts.branch}`,
+            state: 'open',
+        });
+        if (existingPRs.length > 0) {
+            const { data: pr } = await this.octokit.pulls.update({
+                owner: this.owner,
+                repo: this.repo,
+                pull_number: existingPRs[0].number,
+                title: opts.title,
+                body: opts.body,
+            });
+            return pr.html_url;
+        }
+        const { data: pr } = await this.octokit.pulls.create({
+            owner: this.owner,
+            repo: this.repo,
+            title: opts.title,
+            body: opts.body,
+            head: opts.branch,
+            base: opts.baseBranch,
+        });
+        return pr.html_url;
+    }
+}
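The constructor above derives `owner` and `repo` from the `origin` remote with a single regex. The sketch below isolates that regex so its behaviour on common remote formats can be checked without running git. `parseGitHubRemote` is a hypothetical helper name and the URLs are made-up examples.

```typescript
// Restates the remote-URL regex from the GitHubProvider constructor.
// parseGitHubRemote is a hypothetical helper; the package does this inline.
function parseGitHubRemote(remoteUrl: string): { owner: string; repo: string } | null {
  const match = remoteUrl.match(/github\.com[/:](.+?)\/(.+?)(?:\.git)?$/);
  return match ? { owner: match[1], repo: match[2] } : null;
}

console.log(parseGitHubRemote('https://github.com/acme/widgets.git')); // owner "acme", repo "widgets"
console.log(parseGitHubRemote('git@github.com:acme/widgets.git'));     // owner "acme", repo "widgets"
console.log(parseGitHubRemote('https://github.com/acme/widgets'));     // owner "acme", repo "widgets"
console.log(parseGitHubRemote('https://gitlab.com/acme/widgets.git')); // null
```

Under this regex, HTTPS and scp-style remotes on `github.com` both resolve. A remote on any other host yields `null`, which the constructor turns into a thrown error.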
@@ -0,0 +1 @@
+export {};
package/docs/DESIGN.md
ADDED
@@ -0,0 +1,94 @@
+# DESIGN.md
+
+Design philosophy and key decisions for `harness-engineering-auto-docs`.
+See `docs/design-docs/` for per-release design documents.
+
+## Core philosophy
+
+This tool exists to automate the most expensive human bottleneck in agent-first development:
+keeping repository knowledge current. Based on lessons from the Harness Engineering experiment
+(OpenAI, Feb 2026), the single biggest drag on agent productivity is **stale or missing context**.
+
+> "From the agent's point of view, anything it can't access in-context while running effectively
+> doesn't exist." — Ryan Lopopolo, Harness Engineering
+
+`harness-engineering-auto-docs` makes repository knowledge self-maintaining: every git tag triggers a
+documentation pass, so the knowledge store reflects the codebase as it actually exists.
+
+## Key design decisions
+
+### 1. Tags as the documentation trigger (not commits)
+
+**Decision**: Generate docs on `git tag`, not on every commit or PR merge.
+
+**Rationale**: Tags represent meaningful semantic versions. Generating docs per-commit would
+flood the repository with micro-updates and overwhelm agents with noise. Tags mark intentional
+release points where a holistic documentation pass is appropriate.
+
+### 2. Diff-driven content (not full-file scans)
+
+**Decision**: Extract a `git diff` between adjacent tags and feed only that diff to the LLM.
+
+**Rationale**: Full-file analysis requires enormous context windows and is expensive. The diff
+is the minimal surface that conveys *what changed*. Prompts cap the diff at 8 000 characters
+to stay within reliable model context limits.
+
+**Trade-off**: Changes in very large diffs are truncated. Complex releases may miss some nuance.
+This is acceptable — the goal is progressive accumulation, not exhaustive single-pass coverage.
+
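The 8 000-character cap described in decision 2 is a plain `slice`, so it can end in the middle of a diff line. The sketch below contrasts that hard cut with a line-boundary variant. `hardCut` and `cutAtLineBoundary` are hypothetical names, and the second function is an illustrative alternative that is not part of the package.

```typescript
// Hard cut as used in the prompts: keeps the first `limit` characters, even mid-line.
function hardCut(raw: string, limit: number): string {
  return raw.slice(0, limit);
}

// Hypothetical alternative (not in the package): back up to the last complete line.
function cutAtLineBoundary(raw: string, limit: number): string {
  if (raw.length <= limit) return raw;
  const head = raw.slice(0, limit);
  const lastNewline = head.lastIndexOf('\n');
  // Fall back to the hard cut when the first line alone exceeds the limit.
  return lastNewline === -1 ? head : head.slice(0, lastNewline + 1);
}

// A tiny made-up diff and a small limit stand in for a real diff and the 8000 cap.
const sample = '@@ -1,2 +1,2 @@\n-const a = 1;\n+const a = 2;\n';
console.log(JSON.stringify(hardCut(sample, 25)));
console.log(JSON.stringify(cutAtLineBoundary(sample, 25)));
```

With the sample input, the hard cut ends partway through the removed line, while the line-boundary variant keeps only the complete hunk header. Whether the smaller but cleaner prompt is worth the lost characters is a judgment call the source does not settle.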
+### 3. Append-only for living documents, create-new for point-in-time docs
+
+**Decision**: `AGENTS.md`, `ARCHITECTURE.md`, `docs/DESIGN.md`, etc. are appended to with each
+release. `changelog/vX.Y.Z.md` and `docs/design-docs/vX.Y.Z.md` are created fresh.
+
+**Rationale**: Living documents accumulate knowledge over time. Per-release documents provide
+an immutable point-in-time record. This mirrors how architecture evolves alongside changelogs.
+
+### 4. File-group heuristics drive conditional doc targets
+
+**Decision**: Classify changed files into groups (`frontend`, `schema`, `auth`, `infra`) and
+only generate conditional docs when the relevant group is non-empty.
+
+**Rationale**: Generating `SECURITY.md` for a release that only touched CSS would produce
+meaningless filler. Conditional generation keeps each document signal-rich.
+
+**Trade-off**: Heuristics in `diff.ts:groupFiles` are regex-based and imprecise. False negatives
+are acceptable (some releases skip conditional docs they could have generated). False positives
+are more harmful — generating irrelevant content pollutes the knowledge store.
+
+### 5. Single AIProvider interface with prefix-based dispatch
+
+**Decision**: Model name prefix (`claude-*` / `gpt-*`) determines which provider is instantiated.
+
+**Rationale**: Users already know their model name. Adding a separate `AI_PROVIDER` env var
+would be redundant and error-prone. The prefix convention is unambiguous and self-documenting.
+
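Decision 5 can be pictured as a small lookup on the model name. The sketch below is an assumption-heavy illustration: the dispatch code itself is not shown in this diff, so `providerFor`, the `ProviderName` type, the error behaviour for unknown prefixes, and the placeholder model names are all hypothetical.

```typescript
// Hypothetical sketch of prefix-based dispatch; the package's own dispatch code is not shown in this diff.
type ProviderName = 'anthropic' | 'openai';

function providerFor(model: string): ProviderName {
  if (model.startsWith('claude-')) return 'anthropic';
  if (model.startsWith('gpt-')) return 'openai';
  // Assumption: unknown prefixes are rejected rather than silently defaulted.
  throw new Error(`Unsupported model prefix: ${model}`);
}

console.log(providerFor('claude-example')); // "anthropic"
console.log(providerFor('gpt-example'));    // "openai"
```

The appeal of this shape is that the model name is the only setting a user supplies. The cost is that every new provider needs its own distinct prefix.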
+### 6. No database, no server, no long-lived state
+
+**Decision**: The tool is a stateless CLI that runs to completion and exits.
+
+**Rationale**: Simplicity is a force multiplier for agents. A stateless tool is trivially
+reproducible, testable, and debuggable. All persistent state lives in the repository (git history,
+Markdown files). This is the Harness Engineering principle: repository-local, versioned artifacts
+are the system of record.
+
+### 7. AGENTS.md as table of contents, not encyclopedia
+
+**Decision**: This `AGENTS.md` is ~100 lines. It is a map with pointers, not a manual.
+
+**Rationale**: Directly from Harness Engineering lessons:
+- Large instruction files crowd out task context
+- "Too much guidance becomes non-guidance"
+- Monolithic files rot and become stale attractors
+
+The detailed knowledge lives in `ARCHITECTURE.md`, `docs/DESIGN.md`, `docs/design-docs/`, etc.,
+and agents navigate there via `AGENTS.md`.
+
+## What we intentionally did NOT do
+
+- **No semantic chunking or embeddings** — the tool targets small-to-medium diffs; retrieval
+augmentation adds complexity without proportional benefit at this scale
+- **No multi-repo support** — each invocation operates on a single git repository
+- **No rollback mechanism** — doc updates are PR-based; rollback happens via normal git revert
+- **No GitLab MR** — stub exists in `src/providers/gitlab.ts` but is not implemented; see
+`docs/exec-plans/tech-debt-tracker.md` for this as tracked debt
@@ -0,0 +1,74 @@
+# QUALITY_SCORE.md
+
+Quality grades for each domain and architectural layer.
+Updated on each release by `harness-engineering-auto-docs` itself.
+
+## Grading legend
+
+| Grade | Meaning |
+|-------|---------|
+| ✅ A | Solid — well-tested, clear interfaces, minimal debt |
+| 🟡 B | Adequate — works but has known gaps or rough edges |
+| 🟠 C | Needs attention — debt is accumulating or tests are sparse |
+| 🔴 D | Problematic — blocking issues or significant risk |
+
+---
+
+## v0.1.0 baseline (2026-04-03)
+
+### Core pipeline
+
+| Module | Grade | Notes |
+|--------|-------|-------|
+| `core/diff.ts` | ✅ A | Well-tested; `groupFiles` heuristics are simple and readable |
+| `core/relevance.ts` | ✅ A | Clean, pure function; full test coverage |
+| `core/generator.ts` | 🟡 B | Prompts are functional but not versioned; no prompt regression tests |
+| `core/writer.ts` | ✅ A | Minimal helpers; appendSection and createFile are straightforward |
+
+### AI layer
+
+| Module | Grade | Notes |
+|--------|-------|-------|
+| `ai/anthropic.ts` | ✅ A | Thin wrapper; tested with mocks |
+| `ai/openai.ts` | ✅ A | Thin wrapper; tested with mocks |
+| `ai/interface.ts` | ✅ A | Clean single-method interface |
+
+### Platform layer
+
+| Module | Grade | Notes |
+|--------|-------|-------|
+| `providers/github.ts` | 🟡 B | Creates/updates PRs correctly; `owner/repo` parsed from remote URL (fragile for non-standard remotes) |
+| `providers/gitlab.ts` | 🔴 D | **Not implemented** — throws on any call; this is tracked debt |
+
+### CLI orchestration
+
+| Module | Grade | Notes |
+|--------|-------|-------|
+| `cli.ts` | 🟡 B | Correct but dense; git operations are inline shell commands with no error recovery |
+
+### Test suite
+
+| Area | Grade | Notes |
+|------|-------|-------|
+| Unit coverage | 🟡 B | Core modules well covered; provider tests are mock-only |
+| Integration tests | 🟡 B | `tests/integration/generate.test.ts` covers happy path; no failure-path integration tests |
+| Fixtures | ✅ A | Diff fixtures in `tests/fixtures/` allow deterministic diff parsing tests |
+
+### Documentation
+
+| Area | Grade | Notes |
+|------|-------|-------|
+| README | ✅ A | Accurate and complete for users |
+| Knowledge store | 🟡 B | Baseline being established; will improve with each release |
+
+---
+
+## Gap tracking
+
+| Gap | Priority | Owner |
+|-----|----------|-------|
+| GitLab MR support | Medium | see `docs/exec-plans/tech-debt-tracker.md` |
+| Prompt regression tests | Low | no mechanism to detect prompt quality degradation |
+| `owner/repo` URL parsing robustness | Low | SSH remotes and non-GitHub hosts may fail |
+| Git operations error handling in `cli.ts` | Low | `execSync` throws on any git failure with no recovery |
+| Diff truncation strategy | Low | 8 000-char hard cut may slice mid-hunk |
@@ -0,0 +1,71 @@
+# Core Beliefs
+
+Agent-first operating principles for `harness-engineering-auto-docs`.
+These beliefs inform every design decision and should be consulted when making trade-offs.
+
+Source: [Harness Engineering: Leveraging Codex in an Agent-First World](https://openai.com/index/harness-engineering/) (OpenAI, Feb 2026)
+
+---
+
+## 1. Humans steer. Agents execute.
+
+The human role is to define intent, set direction, and validate outcomes — not to write code
+line by line. This tool exists to reduce the human cost of keeping knowledge current so that
+agents can do more of the execution work.
+
+## 2. Repository knowledge is the system of record
+
+Anything not in the repository does not exist from the agent's perspective. Decisions made in
+chat threads, Slack, or people's heads are invisible to the system. All meaningful context must
+be committed as versioned, repository-local Markdown.
+
+**Implication for this tool**: Every documentation output is committed to the target repository.
+Nothing is written to an external system.
+
+## 3. AGENTS.md is a map, not a manual
+
+A short (`~100 lines`) `AGENTS.md` serves as an entry point with pointers to deeper documents.
+Long instruction files crowd out task context, rot quickly, and become non-guidance.
+
+**Implication**: This tool generates `AGENTS.md` by *appending* short sections — not by replacing
+or expanding it into an encyclopedia.
+
+## 4. Progressive disclosure over front-loading
+
+Agents start with a small, stable entry point (`AGENTS.md`) and navigate to specifics as needed.
+The knowledge store is organized so each layer reveals more detail than the last.
+
+## 5. Enforce invariants mechanically, not verbally
+
+Documentation alone cannot keep a codebase coherent. Rules that matter must be enforced via
+linters, type system constraints, or structural tests — not just written down and hoped about.
+
+**Implication**: `harness-engineering-auto-docs` is a mechanical enforcement tool — it runs automatically
+on every tag, not on request.
+
+## 6. Continuous small increments beat periodic big fixes
+
+Technical debt, like financial debt, compounds. The right strategy is continuous small paydowns
+(per-release doc updates, automated quality grading) rather than infrequent large efforts.
+
+## 7. Boring technology is better for agents
+
+Technologies with stable APIs, strong training data representation, and high composability are
+easier for agents to model correctly. Prefer well-established libraries over cutting-edge ones.
+Prefer reimplementing small subsets of functionality (with full test coverage and telemetry
+integration) over depending on opaque upstream behaviour.
+
+**Implication for this tool**: We use `@octokit/rest` (stable, well-documented) for GitHub and
+`@anthropic-ai/sdk` / `openai` (official SDKs with strong training data) for AI.
+
+## 8. Conditional generation over unconditional noise
+
+Generating a `SECURITY.md` section for a release that only touched CSS creates noise that
+degrades the signal-to-noise ratio of the knowledge store. Only generate documentation where
+there is real signal. Guard with file-group heuristics.
+
+## 9. Agent legibility is a first-class output
+
+Every feature should be evaluated not just for correctness but for whether it makes the
+codebase more legible to a future agent run. "Does this change make the repository easier
+for Codex to navigate and reason about?" is a valid and important design criterion.
@@ -0,0 +1,32 @@
+# Design Docs Index
+
+Catalogued design documents for `harness-engineering-auto-docs`.
+Each entry links to the per-release design document generated at tag time.
+
+See `docs/design-docs/core-beliefs.md` for the foundational agent-first operating principles
+that underpin every design decision in this project.
+
+## Releases
+
+<!-- harness-engineering-auto-docs appends entries here on each release -->
+
+_No releases yet. The first tagged release will appear here._
+
+---
+
+## Document structure
+
+Each design doc follows this template:
+
+```
+# Design: vX.Y.Z
+
+## Summary
+## Design Decisions
+## Agent Legibility Notes
+## Technical Debt
+```
+
+The **Agent Legibility Notes** section is specific to this project's Harness Engineering style:
+it explicitly calls out what an AI coding agent needs to understand to work correctly in the
+updated codebase after the release.