@joshski/dust 0.1.111 → 0.1.113

@@ -1,58 +1,39 @@
1
1
  // lib/bundled-core-principles.ts
2
2
  var BUNDLED_PRINCIPLES = [
3
3
  {
4
- slug: "batteries-included",
5
- content: `# Batteries Included
6
-
7
- Dust should provide everything that is required (within reason) for an agent to be productive in an arbitrary codebase.
4
+ slug: "design-for-testability",
5
+ content: `# Design for Testability
8
6
 
9
- An agent working autonomously should not be blocked because a tool or configuration is missing. For example, dust should ship custom lint rules for different linters, even though those linters are not dependencies of dust itself. If an agent needs a capability to do its job well in a typical codebase, dust should provide it out of the box.
7
+ Design code to be testable first; good structure follows naturally.
10
8
 
11
- This means accepting some breadth of scope (bundling configs, rules, and utilities that target external tools) in exchange for agents that can start producing useful work immediately without manual setup.
9
+ Testability should be a primary design driver, not a quality to be retrofitted. When code is designed to be testable from the start, it naturally becomes decoupled, explicit in its dependencies, and clear in its interfaces.
12
10
 
13
- ## Applicability
11
+ The discipline of testability forces good design: functions become pure, dependencies become explicit, side effects become isolated. Rather than viewing testability as a tax on production code, recognize it as a compass that points toward better architecture.
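As a rough illustration of "dependencies become explicit", here is a minimal sketch, assuming a hypothetical expiry check. The names `isExpiredImplicit`, `isExpired`, and `nowMs` are invented for this example and are not part of dust.

```javascript
// Hypothetical example: these function names are illustrative, not dust APIs.

// Harder to test: reads the system clock directly, so the result drifts over time.
function isExpiredImplicit(expiresAtMs) {
  return Date.now() > expiresAtMs;
}

// Easier to test: the current time is an explicit input, so the function is pure.
function isExpired(expiresAtMs, nowMs) {
  return nowMs > expiresAtMs;
}

// A test can pin the time without mocking any globals.
console.log(isExpired(1000, 1001)); // true
console.log(isExpired(1000, 999)); // false
```

The pure version needs no clock stubbing at all; the caller decides where the time comes from.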
14
12
 
15
- Internal
13
+ This is particularly important in agent-driven development. Agents cannot manually verify their changes—they rely entirely on tests. Code that resists testing resists autonomous modification.
16
14
 
17
15
  ## Parent Principle
18
16
 
19
- - [Agent Autonomy](agent-autonomy.md)
17
+ - [Decoupled Code](decoupled-code.md)
20
18
 
21
19
  ## Sub-Principles
20
+
21
+ - (none)
22
22
  `
23
23
  },
24
24
  {
25
- slug: "some-big-design-up-front",
26
- content: `# Some Big Design Up Front
27
-
28
- AI agents lower the cost of architectural exploration, making heavier upfront investment rational during the idea phase.
29
-
30
- Agile's rejection of "big design up front" (BDUF) was largely economic: detailed architecture was expensive to produce and often wrong. AI agents change that equation — they can explore multiple variants, prototype them, and measure trade-offs cheaply. When evaluating alternatives costs less, the expected value of avoiding large structural mistakes increases.
31
-
32
- This doesn't mean returning to traditional BDUF. Uncertainty about future requirements still limits what prediction can achieve. The insight is that the optimal amount of upfront work has shifted, not that prediction became reliable.
33
-
34
- The model is hybrid: thorough AI-assisted exploration during ideas, followed by straightforward execution during tasks. "Lightweight" refers to task-level planning, not idea-level exploration. Invest heavily in understanding alternatives during the idea phase, then decompose into atomic tasks once the direction is clear.
35
-
36
- ## Convergence Criteria
37
-
38
- Exploration should continue until clear trade-offs are identified and the chosen approach can be articulated against alternatives. This is convergence-based, not time-boxed — simple ideas converge quickly, complex architectural decisions require more exploration.
39
-
40
- When exploration feels "done":
41
-
42
- - Multiple approaches have been considered
43
- - Trade-offs between approaches are understood
44
- - The chosen direction has clear justification
45
- - Remaining uncertainty is about requirements, not design
25
+ slug: "fast-feedback-loops",
26
+ content: `# Fast Feedback Loops
46
27
 
47
- If a task requires significant design decisions during execution, it wasn't ready to be a task.
28
+ The primary feedback loop (write code, run checks, see results) should be as fast as possible.
48
29
 
49
- ## Documenting Alternatives
30
+ Fast feedback is the foundation of productive development, for both humans and agents. When tests, linters, and type checks run in seconds rather than minutes, developers iterate more frequently and catch problems earlier. Agents especially benefit because they operate in tight loops of change-and-verify; slow feedback wastes tokens and context window space on waiting rather than working.
50
31
 
51
- Ideas should document the alternatives considered and why they were ruled out. This creates a decision log that helps future agents and humans understand context. Include alternatives in the idea body or Open Questions sections.
32
+ Dust should help projects measure the speed of their feedback loops, identify bottlenecks, and keep them fast as the codebase grows. This includes promoting practices like unit tests over integration tests for speed, incremental compilation, and check parallelisation.
52
33
 
53
34
  ## Parent Principle
54
35
 
55
- - [Lightweight Planning](lightweight-planning.md)
36
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
56
37
 
57
38
  ## Sub-Principles
58
39
 
@@ -60,72 +41,41 @@ Ideas should document the alternatives considered and why they were ruled out. T
60
41
  `
61
42
  },
62
43
  {
63
- slug: "design-for-testability",
64
- content: `# Design for Testability
65
-
66
- Design code to be testable first; good structure follows naturally.
44
+ slug: "test-isolation",
45
+ content: `# Test Isolation
67
46
 
68
- Testability should be a primary design driver, not a quality to be retrofitted. When code is designed to be testable from the start, it naturally becomes decoupled, explicit in its dependencies, and clear in its interfaces.
47
+ Tests should not interfere with one another. Each test must be independently runnable and produce the same result regardless of execution order or which other tests run alongside it.
69
48
 
70
- The discipline of testability forces good design: functions become pure, dependencies become explicit, side effects become isolated. Rather than viewing testability as a tax on production code, recognize it as a compass that points toward better architecture.
49
+ This means:
50
+ - No shared mutable state between tests
51
+ - No reliance on test execution order
52
+ - No file system or environment pollution
53
+ - Each test sets up its own dependencies
71
54
 
72
- This is particularly important in agent-driven development. Agents cannot manually verify their changes—they rely entirely on tests. Code that resists testing resists autonomous modification.
55
+ Test isolation enables parallel execution, makes failures easier to diagnose, and prevents cascading false failures when one test breaks.
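The points above can be sketched without any test framework. This is a minimal illustration under stated assumptions: `createCart`, `addItem`, and the two test functions are hypothetical names, not part of dust.

```javascript
// Hypothetical example: a tiny cart module and two framework-free "tests".

// Factory: every caller gets a fresh object, so no state is shared between tests.
function createCart() {
  return { items: [] };
}

function addItem(cart, item) {
  cart.items.push(item);
  return cart;
}

// Each test sets up its own dependencies instead of reusing a module-level cart.
function testStartsEmpty() {
  const cart = createCart();
  return cart.items;
}

function testAddsOneItem() {
  const cart = createCart();
  addItem(cart, "apple");
  return cart.items;
}

// Running the tests in either order gives the same results.
console.log(testAddsOneItem(), testStartsEmpty());
console.log(testStartsEmpty(), testAddsOneItem());
```

Had both tests shared one module-level cart, `testStartsEmpty` would pass or fail depending on whether it ran first.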
73
56
 
74
57
  ## Parent Principle
75
58
 
76
- - [Decoupled Code](decoupled-code.md)
59
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
77
60
 
78
61
  ## Sub-Principles
79
62
 
80
- - (none)
63
+ - [Environment-Independent Tests](environment-independent-tests.md)
81
64
  `
82
65
  },
83
66
  {
84
- slug: "readable-test-data",
85
- content: `# Readable Test Data
86
-
87
- Test data setup should use natural structures that mirror what they represent.
88
-
89
- ## Why it matters
90
-
91
- When test data is easy to read, tests become self-documenting. A file system hierarchy expressed as a nested object immediately conveys structure, while a flat Map with path strings requires mental parsing to understand the relationships.
92
-
93
- ## In practice
94
-
95
- Prefer literal structures that visually match the domain:
96
-
97
- \`\`\`javascript
98
- // Avoid: flat paths that obscure hierarchy
99
- const fs = createFileSystemEmulator({
100
- files: new Map([['/project/.dust/principles/my-goal.md', '# My Goal']]),
101
- existingPaths: new Set(['/project/.dust/ideas']),
102
- })
103
-
104
- // Prefer: nested object that mirrors file system structure
105
- const fs = createFileSystemEmulator({
106
- project: {
107
- '.dust': {
108
- principles: {
109
- 'my-goal.md': '# My Goal'
110
- },
111
- ideas: {}
112
- }
113
- }
114
- })
115
- \`\`\`
67
+ slug: "boy-scout-rule",
68
+ content: `# Boy Scout Rule
116
69
 
117
- The nested form:
118
- - Shows parent-child relationships through indentation
119
- - Makes empty directories explicit with empty objects
120
- - Requires no mental path concatenation to understand structure
70
+ Always leave the code better than you found it.
121
71
 
122
- ## How to evaluate
72
+ When working in any area of the codebase, take the opportunity to make small improvements — clearer names, removed dead code, better structure — even if they're not directly related to the task at hand. These incremental improvements compound over time, preventing gradual decay and keeping the codebase healthy without requiring dedicated cleanup efforts.
123
73
 
124
- Work supports this principle when test setup data uses structures that visually resemble what they represent, reducing cognitive load for readers.
74
+ The Boy Scout Rule is not a license for large-scale refactoring during unrelated work. Improvements should be small, obvious, and low-risk. If a cleanup is too large to include alongside the current task, capture it as a separate task instead.
125
75
 
126
76
  ## Parent Principle
127
77
 
128
- - [Make Changes with Confidence](make-changes-with-confidence.md)
78
+ - [Maintainable Codebase](maintainable-codebase.md)
129
79
 
130
80
  ## Sub-Principles
131
81
 
@@ -133,45 +83,35 @@ Work supports this principle when test setup data uses structures that visually
133
83
  `
134
84
  },
135
85
  {
136
- slug: "agent-specific-enhancement",
137
- content: `# Agent-Specific Enhancement
138
-
139
- Dust should detect and enhance the experience for specific agents while remaining agnostic at its core.
140
-
141
- While Dust has [Agent-Agnostic Design](agent-agnostic-design.md) and works with any capable agent, it can still optimize the "agent DX" (developer experience) when it detects a specific agent is being used. This means:
142
-
143
- - **Detection** - Dust may detect which agent is running (e.g., Claude Code, Aider, Cursor) through environment variables, configuration, or other signals
144
- - **Enhancement** - Once detected, Dust can tailor its output format, prompts, or context to leverage that agent's specific strengths
145
- - **Graceful fallback** - When no specific agent is detected, Dust provides a generic experience that works with any agent
86
+ slug: "atomic-commits",
87
+ content: `# Atomic Commits
146
88
 
147
- This principle complements Agent-Agnostic Design: the core functionality never requires a specific agent, but the experience improves when one is recognized.
89
+ Each commit should tell a complete story, bundling implementation changes with their corresponding documentation updates.
148
90
 
149
- ## Applicability
91
+ When a task is completed, the commit deletes the task file, updates relevant facts to reflect the new reality, and removes any ideas that have been realized. This discipline ensures that any point in the commit history represents a coherent, self-documenting state of the project.
150
92
 
151
- Internal
93
+ Clean commit history is essential because archaeology depends on it. Future humans and AI agents will traverse history to understand why decisions were made and how the system evolved.
152
94
 
153
95
  ## Parent Principle
154
96
 
155
- - [Agent Autonomy](agent-autonomy.md)
97
+ - [Repository Hygiene](repository-hygiene.md)
156
98
 
157
99
  ## Sub-Principles
158
100
 
159
- - (none)
101
+ - [Traceable Decisions](traceable-decisions.md)
160
102
  `
161
103
  },
162
104
  {
163
- slug: "context-optimised-code",
164
- content: `# Context-Optimised Code
165
-
166
- Code should be structured so that agents can understand and modify it within their context window constraints.
105
+ slug: "co-located-tests",
106
+ content: `# Co-located Tests
167
107
 
168
- Large files, deeply nested abstractions, and sprawling dependency chains all work against agents. A 3,000-line file cannot be fully loaded into context. A function that requires understanding six levels of indirection demands more context than one that is self-contained. Context-optimised code favours small files, shallow abstractions, explicit dependencies, and co-located related logic.
108
+ Test files should live next to the code they test.
169
109
 
170
- Dust should help projects identify files that are too large, modules that are too tangled, and patterns that make agent comprehension harder than it needs to be. This is not just about file size; it is about ensuring that the unit of code an agent needs to understand fits comfortably within the window available.
110
+ When tests are co-located with their source files, developers can immediately see what's tested and what isn't. Finding the test for a module becomes trivial—it's right there in the same directory. This proximity encourages writing tests as part of the development flow rather than as an afterthought, and makes it natural to update tests when modifying code.
171
111
 
172
112
  ## Parent Principle
173
113
 
174
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
114
+ - [Intuitive Directory Structure](intuitive-directory-structure.md)
175
115
 
176
116
  ## Sub-Principles
177
117
 
@@ -179,57 +119,20 @@ Dust should help projects identify files that are too large, modules that are to
179
119
  `
180
120
  },
181
121
  {
182
- slug: "self-diagnosing-tests",
183
- content: `# Self-Diagnosing Tests
184
-
185
- When a big test fails, it should be self-evident how to diagnose and fix the failure.
186
-
187
- The more moving parts a test has — end-to-end, system, integration — the more critical this becomes. A test that fails with \`expected true, received false\` forces the developer (or agent) to re-run, add logging, and guess. A test that fails with a rich diff showing the actual state versus the expected state turns diagnosis into reading.
188
-
189
- ## Anti-patterns
190
-
191
- **Boolean flattening** — collapsing a rich value into true/false before asserting:
192
- \`\`\`javascript
193
- // Bad: "expected true, received false" — what events arrived?
194
- expect(events.some(e => e.type === 'check-passed')).toBe(true)
195
-
196
- // Good: shows the actual event types on failure
197
- expect(events.map(e => e.type)).toContain('check-passed')
198
- \`\`\`
199
-
200
- **Length-only assertions** — checking count without showing contents:
201
- \`\`\`javascript
202
- // Bad: "expected 2, received 0" — what requests were captured?
203
- expect(requests.length).toBe(2)
204
-
205
- // Good: shows the actual requests on failure
206
- expect(requests).toHaveLength(2) // vitest shows the array
207
- \`\`\`
208
-
209
- **Silent guards** — using \`if\` where an assertion belongs:
210
- \`\`\`javascript
211
- // Bad: silently passes when settings is undefined
212
- if (settings) {
213
- expect(JSON.parse(settings).key).toBeDefined()
214
- }
215
-
216
- // Good: fails explicitly if settings is missing
217
- expect(settings).toBeDefined()
218
- const parsed = JSON.parse(settings!)
219
- expect(parsed.key).toBeDefined()
220
- \`\`\`
122
+ slug: "broken-windows",
123
+ content: `# Broken Windows
221
124
 
222
- ## The test
125
+ Don't leave broken windows unrepaired.
223
126
 
224
- If a test fails, can a developer who has never seen the code identify the problem from the failure output alone, without re-running, adding console.logs, or reading the test source? The closer to "yes", the better.
127
+ A broken window (a bad name, a hack, a TODO that lingers, a test that's been skipped) signals that nobody cares. That signal invites more neglect. One shortcut becomes two, then ten, and the codebase quietly rots from the inside.
225
128
 
226
- ## How to evaluate
129
+ When you spot a broken window, fix it immediately if the fix is small. If it's too large, capture it as a task so it doesn't get forgotten. The key is to never normalise the damage. Even a comment acknowledging the problem ("this needs fixing because...") is better than silent acceptance.
227
130
 
228
- Work supports this principle when every assertion in a system or integration test would, on failure, reveal the actual state richly enough to guide a fix. Bare boolean checks, length-only assertions, and silent conditional guards are violations.
131
+ This principle complements the [Boy Scout Rule](boy-scout-rule.md): the Boy Scout Rule encourages proactive improvement, while Broken Windows warns against tolerating known problems. Together they keep entropy at bay.
229
132
 
230
133
  ## Parent Principle
231
134
 
232
- - [Make Changes with Confidence](make-changes-with-confidence.md)
135
+ - [Maintainable Codebase](maintainable-codebase.md)
233
136
 
234
137
  ## Sub-Principles
235
138
 
@@ -237,48 +140,45 @@ Work supports this principle when every assertion in a system or integration tes
237
140
  `
238
141
  },
239
142
  {
240
- slug: "ideal-agent-developer-experience",
241
- content: `# Ideal Agent Developer Experience
143
+ slug: "trunk-based-development",
144
+ content: `# Trunk-Based Development
242
145
 
243
- The agent is the developer. The human is the CEO. Dust is the PM.
146
+ Dust is designed to support a non-branching workflow where developers commit directly to a single main branch.
244
147
 
245
- With today's AI coding assistants, the human is stuck in a tight loop with agents: constantly directing, reviewing, and course-correcting. Dust is designed to relieve humans from this tight loop. Like an assistant to a CEO, dust predominantly brings fully-researched questions and well-prepared work to the human, rather than expecting the human to drive every decision. The human checks in less frequently, and when they do, they make high-leverage strategic calls rather than micromanaging implementation.
148
+ In trunk-based development, teams collaborate on code in one primary branch rather than maintaining multiple long-lived feature branches. This eliminates merge conflicts, enables continuous integration, and keeps the codebase continuously releasable.
246
149
 
247
- For this to work, the agent's development environment must be excellent. The agent reads the code, writes changes, runs the checks, and iterates until the task is done. Everything about the codebase and its tooling either helps or hinders that process. Comprehensive tests are the agent's only way to verify correctness. Fast feedback loops are the agent's iteration speed. Structured logs are the agent's eyes into runtime behaviour. Small, well-organised files are what fit in the agent's context window. Exploratory and debugging tools are how the agent navigates and diagnoses without trial and error.
150
+ The \`dust loop claude\` command embodies this philosophy: agents pull from main, implement a task, and push directly back to main. There are no feature branches, no pull requests, no merge queues. Each commit is atomic and complete.
248
151
 
249
- Each sub-principle represents a different aspect of the ideal agent developer setup. The better these are, the less the human needs to be in the loop.
152
+ This approach scales through discipline rather than isolation. Feature flags and incremental changes replace long-running branches. The repository history becomes a linear sequence of working states.
153
+
154
+ See: https://trunkbaseddevelopment.com/
250
155
 
251
156
  ## Parent Principle
252
157
 
253
- - [Human-AI Collaboration](human-ai-collaboration.md)
158
+ - [Repository Hygiene](repository-hygiene.md)
254
159
 
255
160
  ## Sub-Principles
256
161
 
257
- - [Comprehensive Test Coverage](comprehensive-test-coverage.md)
258
- - [Fast Feedback Loops](fast-feedback-loops.md)
259
- - [Slow Feedback Coping](slow-feedback-coping.md)
260
- - [Development Traceability](development-traceability.md)
261
- - [Context-Optimised Code](context-optimised-code.md)
262
- - [Exploratory Tooling](exploratory-tooling.md)
263
- - [Debugging Tooling](debugging-tooling.md)
264
- - [Self-Contained Repository](self-contained-repository.md)
162
+ (none)
265
163
  `
266
164
  },
267
165
  {
268
- slug: "broken-windows",
269
- content: `# Broken Windows
270
-
271
- Don't leave broken windows unrepaired.
166
+ slug: "environment-independent-tests",
167
+ content: `# Environment-Independent Tests
272
168
 
273
- A broken window (a bad name, a hack, a TODO that lingers, a test that's been skipped) signals that nobody cares. That signal invites more neglect. One shortcut becomes two, then ten, and the codebase quietly rots from the inside.
169
+ Tests must produce the same result regardless of where they run. A test that passes locally but fails in CI (or vice versa) is a broken test.
274
170
 
275
- When you spot a broken window, fix it immediately if the fix is small. If it's too large, capture it as a task so it doesn't get forgotten. The key is to never normalise the damage. Even a comment acknowledging the problem ("this needs fixing because...") is better than silent acceptance.
171
+ Concretely, tests should never depend on:
172
+ - Ambient environment variables (e.g. \`CLAUDECODE\`, \`CI\`, \`HOME\`)
173
+ - The current working directory or filesystem layout of the host machine
174
+ - Network availability or external services
175
+ - The identity of the user or agent running the tests
276
176
 
277
- This principle complements the [Boy Scout Rule](boy-scout-rule.md): the Boy Scout Rule encourages proactive improvement, while Broken Windows warns against tolerating known problems. Together they keep entropy at bay.
177
+ When a function's behavior depends on environment variables, the test must explicitly control those variables (via \`stubEnv\`, dependency injection, or passing an \`env\` parameter) rather than relying on whatever happens to be set in the current shell.
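One of the options named above, passing an `env` parameter, might look like the following minimal sketch. `detectAgent` and its return values are hypothetical and only illustrate the pattern. Treating any non-empty `CLAUDECODE` value as "set" is an assumption made for this example.

```javascript
// Hypothetical example: `detectAgent` is an illustrative function, not a dust API.
// The environment is a parameter that defaults to the real one, so production
// callers pass nothing while tests pass an explicit object.
function detectAgent(env = process.env) {
  // Assumption: any non-empty CLAUDECODE value counts as "set".
  return env.CLAUDECODE ? "claude-code" : "unknown";
}

// Tests control the environment explicitly instead of inheriting the shell's.
console.log(detectAgent({ CLAUDECODE: "1" })); // claude-code
console.log(detectAgent({})); // unknown
```

Because the tests never read the ambient environment, they give the same result locally, in CI, and inside an agent session.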
278
178
 
279
179
  ## Parent Principle
280
180
 
281
- - [Maintainable Codebase](maintainable-codebase.md)
181
+ - [Test Isolation](test-isolation.md)
282
182
 
283
183
  ## Sub-Principles
284
184
 
@@ -286,112 +186,138 @@ This principle complements the [Boy Scout Rule](boy-scout-rule.md): the Boy Scou
286
186
  `
287
187
  },
288
188
  {
289
- slug: "progressive-disclosure",
290
- content: `# Progressive Disclosure
189
+ slug: "comprehensive-assertions",
190
+ content: `# Comprehensive Assertions
291
191
 
292
- Dust should reveal details progressively as a way of achieving context window efficiency.
192
+ Assert the whole, not the parts.
293
193
 
294
- Not all information is needed at once. A task list showing just titles is sufficient for choosing what to work on. Full task details are only needed when actively implementing. Linked principles and facts can be followed when deeper context is required.
194
+ When you break a complex object into many small assertions, a failure tells you *one thing that's wrong*. When you assert against the whole expected value, the diff tells you *what actually happened versus what you expected*: the full picture, in one glance.
295
195
 
296
- This layered approach keeps initial reads lightweight while preserving access to complete information when needed.
196
+ Small assertions are like yes/no questions to a witness. A whole-object assertion is like asking "tell me what you saw."
297
197
 
298
- ## Parent Principle
198
+ ## In practice
299
199
 
300
- - [Context Window Efficiency](context-window-efficiency.md)
200
+ Collapse multiple partial assertions into one comprehensive assertion:
301
201
 
302
- ## Sub-Principles
202
+ \`\`\`javascript
203
+ // Fragmented — each failure is a narrow keyhole
204
+ expect(result.name).toBe("Alice");
205
+ expect(result.age).toBe(30);
206
+ expect(result.role).toBe("admin");
303
207
 
304
- - (none)
305
- `
306
- },
307
- {
308
- slug: "lightweight-planning",
309
- content: `# Lightweight Planning
208
+ // Whole — a failure diff tells the full story
209
+ expect(result).toEqual({
210
+ name: "Alice",
211
+ age: 30,
212
+ role: "admin",
213
+ });
214
+ \`\`\`
310
215
 
311
- Dust aims to be a minimal, low-overhead planning system that stays relevant over time.
216
+ If \`role\` is \`"user"\` and \`age\` is \`29\`, the fragmented version stops at the first failure. The whole-object assertion shows both discrepancies at once, in context.
312
217
 
313
- Planning artifacts are simple markdown files that live alongside code. Ideas are intentionally vague until implementation is imminent. Tasks are small and completable in single commits. Facts document current reality rather than aspirational states.
218
+ The same applies to arrays:
314
219
 
315
- The system avoids the staleness problem by deferring detail until the last responsible moment and deleting completed work rather than archiving it.
220
+ \`\`\`javascript
221
+ // Avoid: partial assertions that hide the actual state
222
+ expect(array).toContain('apples')
223
+ expect(array).toContain('oranges')
224
+
225
+ // Prefer: one assertion that reveals the full picture on failure
226
+ expect(array).toEqual(['apples', 'oranges'])
227
+ \`\`\`
228
+
229
+ ## How to evaluate
230
+
231
+ Work supports this principle when test failures tell a rich story — showing the complete actual value alongside the complete expected value, so the reader can understand what happened without re-running anything.
316
232
 
317
233
  ## Parent Principle
318
234
 
319
- - [Human-AI Collaboration](human-ai-collaboration.md)
235
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
320
236
 
321
237
  ## Sub-Principles
322
238
 
323
- - [Task-First Workflow](task-first-workflow.md)
324
- - [Some Big Design Up Front](some-big-design-up-front.md)
239
+ - (none)
325
240
  `
326
241
  },
327
242
  {
328
- slug: "comprehensive-test-coverage",
329
- content: `# Comprehensive Test Coverage
330
-
331
- A project's test suite is its primary safety net, and agents depend on it even more than humans do.
243
+ slug: "maintainable-codebase",
244
+ content: `# Maintainable Codebase
332
245
 
333
- Agents cannot manually verify that their changes work. They rely entirely on automated tests to confirm correctness. Gaps in test coverage become gaps in agent capability — areas where changes are risky and feedback is absent. Comprehensive coverage means every meaningful behaviour is tested, so agents can make changes anywhere in the codebase with confidence.
246
+ The dust codebase should be easy to understand, modify, and extend.
334
247
 
335
- Dust should help projects measure and improve their test coverage, flag untested areas, and encourage a culture where new code comes with new tests.
248
+ This principle governs how we develop and maintain dust itself, separate from the principles that describe what dust offers its users. A well-maintained codebase enables rapid iteration, reduces bugs, and makes contributions easier.
336
249
 
337
250
  ## Parent Principle
338
251
 
339
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
252
+ - [Agentic Flow State](agentic-flow-state.md)
340
253
 
341
254
  ## Sub-Principles
342
255
 
343
- - (none)
256
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
257
+ - [Minimal Dependencies](minimal-dependencies.md)
258
+ - [Intuitive Directory Structure](intuitive-directory-structure.md)
259
+ - [Repository Hygiene](repository-hygiene.md)
260
+ - [Naming Matters](naming-matters.md)
261
+ - [Reasonably DRY](reasonably-dry.md)
262
+ - [Make the Change Easy](make-the-change-easy.md)
263
+ - [Boy Scout Rule](boy-scout-rule.md)
264
+ - [Broken Windows](broken-windows.md)
344
265
  `
345
266
  },
346
267
  {
347
- slug: "intuitive-directory-structure",
348
- content: `# Intuitive Directory Structure
268
+ slug: "context-window-efficiency",
269
+ content: `# Context Window Efficiency
349
270
 
350
- Code should be organized around related concerns in clearly named directories.
271
+ Dust should be designed with short attention spans in mind.
351
272
 
352
- When files that serve similar purposes are grouped together, the codebase becomes easier to navigate and understand. A developer looking for "commands" should find them in a \`commands\` directory. Utilities should live with utilities. This organization reduces cognitive load and makes the project structure self-documenting.
273
+ AI agents operate within limited context windows. Every token consumed by planning artifacts is a token unavailable for reasoning about code. Dust keeps artifacts concise and scannable so agents can quickly understand what needs to be done without wading through verbose documentation.
274
+
275
+ This means favoring brevity over completeness, using consistent structures that are fast to parse, and avoiding redundant information across files.
353
276
 
354
277
  ## Parent Principle
355
278
 
356
- - [Maintainable Codebase](maintainable-codebase.md)
279
+ - [Agent Autonomy](agent-autonomy.md)
357
280
 
358
281
  ## Sub-Principles
359
282
 
360
- - [Co-located Tests](co-located-tests.md)
283
+ - [Progressive Disclosure](progressive-disclosure.md)
361
284
  `
362
285
  },
363
286
  {
364
- slug: "small-units",
365
- content: `# Small Units
366
-
367
- Ideas, principles, facts, and tasks should each be as discrete and fine-grained as possible.
287
+ slug: "human-ai-collaboration",
288
+ content: `# Human-AI Collaboration
368
289
 
369
- Small, focused documents enable precise relationships between them. A task can link to exactly the principles it serves. A fact can describe one specific aspect of the system. This granularity reduces ambiguity.
290
+ Dust exists to enable effective collaboration between humans and AI agents on complex projects.
370
291
 
371
- Tasks especially benefit from being small. A narrowly scoped task gives agents or humans the best chance of delivering exactly what was intended, in a single atomic commit.
292
+ The human is the CEO: they set direction, make strategic decisions, and check in when it matters. Dust is the PM: it manages the work, prepares context, and brings fully-researched questions to the human rather than expecting them to drive every detail. Agents are the developers: they read code, write changes, and iterate autonomously.
372
293
 
373
- Note: This principle directly supports [Lightweight Planning](lightweight-planning.md), which explicitly mentions that "Tasks are small and completable in single commits."
294
+ Today's AI coding tools keep humans in a tight loop with agents. Dust is designed to loosen that loop, so humans spend less time directing and more time deciding.
374
295
 
375
296
  ## Parent Principle
376
297
 
377
- - [Agent Autonomy](agent-autonomy.md)
298
+ - [Agentic Flow State](agentic-flow-state.md)
378
299
 
379
300
  ## Sub-Principles
380
301
 
381
- - (none)
302
+ - [Agent Autonomy](agent-autonomy.md)
303
+ - [Easy Adoption](easy-adoption.md)
304
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
305
+ - [Lightweight Planning](lightweight-planning.md)
382
306
  `
383
307
  },
384
308
  {
385
- slug: "fast-feedback",
386
- content: `# Fast Feedback
309
+ slug: "functional-core-imperative-shell",
310
+ content: `# Functional Core, Imperative Shell
387
311
 
388
- Dust should provide fast feedback loops for developers.
312
+ Separate code into a pure "functional core" and a thin "imperative shell." The core takes values in and returns values out, with no side effects. The shell handles I/O and wires things together.
389
313
 
390
- Scripts and tooling should execute quickly so developers can iterate rapidly. Slow feedback discourages frequent validation and leads to larger, riskier changes. Fast feedback enables small, confident steps.
314
+ Purely functional code makes some things easier to understand: because values don't change, you can call functions and know that only their return value matters—they don't change anything outside themselves.
315
+
316
+ The functional core contains business logic as pure functions that take values and return values. The imperative shell sits at the boundary, reading input, calling into the core, and performing side effects with the results. This keeps the majority of code easy to test (no mocks or stubs needed for pure functions) and makes the I/O surface area small and explicit.
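A minimal sketch of the split described above. The names `summarizeTasks` and `reportTasks` and the task shape are hypothetical, chosen only for illustration; they are not taken from dust's codebase.

```javascript
// Functional core: a pure function. Values in, values out, no side effects.
// `summarizeTasks` and the { title, done } task shape are hypothetical.
function summarizeTasks(tasks) {
  const open = tasks.filter((task) => !task.done);
  return {
    total: tasks.length,
    open: open.length,
    nextTitle: open.length > 0 ? open[0].title : null,
  };
}

// Imperative shell: does the I/O through injected functions and
// delegates every decision to the core.
function reportTasks(readTasks, writeLine) {
  const summary = summarizeTasks(readTasks());
  writeLine(summary.open + " of " + summary.total + " tasks open");
  return summary;
}
```

The core can be tested by calling `summarizeTasks` with plain arrays. Only `reportTasks` needs a real reader and writer, for example `reportTasks(loadFromDisk, console.log)`, where `loadFromDisk` is whatever reader the caller supplies.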
391
317
 
392
318
  ## Parent Principle
393
319
 
394
- - [Make Changes with Confidence](make-changes-with-confidence.md)
320
+ - [Decoupled Code](decoupled-code.md)
395
321
 
396
322
  ## Sub-Principles
397
323
 
@@ -399,18 +325,28 @@ Scripts and tooling should execute quickly so developers can iterate rapidly. Sl
399
325
  `
400
326
  },
401
327
  {
402
- slug: "dependency-injection",
403
- content: `# Dependency Injection
328
+ slug: "keep-unit-tests-pure",
329
+ content: `# Keep Unit Tests Pure
404
330
 
405
- Avoid global mocks. Dependency injection is almost always preferable to testing code that depends directly on globals.
331
+ Unit tests (those run very frequently as part of a tight feedback loop) should be pure and side-effect free. A test is **not** a unit test if it:
406
332
 
407
- When code depends on global state or singletons, testing requires mocking those globals—which introduces hidden coupling, complicates test setup, and risks interference between tests. Dependency injection makes dependencies explicit: they're passed in as arguments, making the code's requirements visible and enabling tests to supply controlled implementations.
333
+ - Accesses a database
334
+ - Communicates over a network
335
+ - Touches the file system
336
+ - Cannot run concurrently with other tests
337
+ - Requires special environment setup
408
338
 
409
- This approach improves testability (each test controls its own dependencies), readability (dependencies are declared upfront), and flexibility (swapping implementations doesn't require changing the consuming code). It also makes refactoring safer since dependencies are explicit rather than implicit.
339
+ "Unit tests" here means tests run frequently during development, not system tests, which intentionally exercise the full stack including I/O. Pure unit tests exercise only business logic, not infrastructure.
340
+
341
+ The value of pure unit tests is that they are fast, deterministic, and isolate business logic from infrastructure concerns. When unit tests pass but integration or system tests fail, developers can immediately narrow the problem to the boundary layer — a diagnostic "binary chop" that accelerates debugging.
342
+
343
+ ## Migration Guidance
344
+
345
+ Where existing tests are impure (e.g. they spawn processes, write temporary files, or make network calls), prefer converting them to use in-memory alternatives — stubs, fakes, or dependency-injected doubles — rather than leaving them as-is. Opportunistic migration is fine; a big-bang rewrite is not required.
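One way such a migration might look, as a sketch: a test that used to write a temporary file instead passes an in-memory fake reader to the code under test. `loadSettings` and `createFakeFileSystem` are hypothetical names used only for this example.

```javascript
// Hypothetical code under test: the file reader is passed in, not imported.
function loadSettings(readFile, filePath) {
  return JSON.parse(readFile(filePath));
}

// In-memory fake standing in for the file system.
function createFakeFileSystem(files) {
  return (filePath) => {
    if (!Object.prototype.hasOwnProperty.call(files, filePath)) {
      throw new Error("File not found: " + filePath);
    }
    return files[filePath];
  };
}

// The test touches no disk, so it can run concurrently with other tests.
const readFile = createFakeFileSystem({
  "settings.json": '{"checks": ["lint"]}',
});
const settings = loadSettings(readFile, "settings.json");
```

In production the caller would pass a real reader with the same shape; the fake only has to honour the same contract (return the contents, or throw when the file is missing).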
410
346
 
411
347
  ## Parent Principle
412
348
 
413
- - [Decoupled Code](decoupled-code.md)
349
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
414
350
 
415
351
  ## Sub-Principles
416
352
 
@@ -418,16 +354,16 @@ This approach improves testability (each test controls its own dependencies), re
418
354
  `
419
355
  },
420
356
  {
421
- slug: "reproducible-checks",
422
- content: `# Reproducible Checks
357
+ slug: "runtime-agnostic-tests",
358
+ content: `# Runtime Agnostic Tests
423
359
 
424
- Every check must produce the same result regardless of who runs it, when, or on what machine. If a check passes for one developer but fails for another, the check is broken.
360
+ Dust's test suite should work across JavaScript runtimes.
425
361
 
426
- Concretely, checks should pin their tool versions via the project's dependency manager (e.g. \`devDependencies\`) rather than relying on \`npx\`/\`bunx\` to fetch the latest version at runtime. Unpinned versions introduce non-determinism: a check that passed yesterday may fail today due to a tool upgrade that nobody chose to adopt.
362
+ Tests should use standard JavaScript testing patterns that work across Node.js, Bun, and other runtimes. Avoiding runtime-specific test APIs ensures the project can leverage different runtimes' advantages while maintaining broad compatibility.
427
363
 
428
364
  ## Parent Principle
429
365
 
430
- - [Make Changes with Confidence](make-changes-with-confidence.md)
366
+ - [Minimal Dependencies](minimal-dependencies.md)
431
367
 
432
368
  ## Sub-Principles
433
369
 
@@ -435,18 +371,18 @@ Concretely, checks should pin their tool versions via the project's dependency m
435
371
  `
436
372
  },
437
373
  {
438
- slug: "slow-feedback-coping",
439
- content: `# Slow Feedback Coping
374
+ slug: "unsurprising-ux",
375
+ content: `# Unsurprising UX
440
376
 
441
- Some feedback is unavoidably slow — dust should offer coping strategies rather than pretending it can be eliminated.
377
+ The user interface should be as "guessable" as possible.
442
378
 
443
- Integration tests, end-to-end tests, deployment pipelines, and external API calls all take time. Pretending they can be made instant is unrealistic. Instead, dust should help developers and agents cope with slow feedback effectively: by structuring work so that fast checks catch most problems early, by batching slow checks intelligently, by providing clear progress indicators, and by ensuring that when slow feedback does arrive, it is actionable and specific.
379
+ Following the [Principle of Least Astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment), users form expectations about how a tool will behave based on conventions, prior experience, and intuition. Dust's interface (including the CLI) should match those expectations wherever possible. If users are observed trying to use the interface in ways we didn't anticipate, the interface should be adjusted to meet their expectations, even if that means supporting many ways of achieving the same result.
444
380
 
445
- Strategies include separating fast and slow test suites, running slow checks asynchronously or in CI, caching expensive operations, and designing workflows that minimise how often slow feedback is needed.
381
+ Surprising behavior erodes trust and slows people down. Unsurprising behavior lets users stay in flow.
446
382
 
447
383
  ## Parent Principle
448
384
 
449
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
385
+ - [Easy Adoption](easy-adoption.md)
450
386
 
451
387
  ## Sub-Principles
452
388
 
@@ -454,127 +390,113 @@ Strategies include separating fast and slow test suites, running slow checks asy
454
390
  `
455
391
  },
456
392
  {
457
- slug: "make-changes-with-confidence",
458
- content: `# Make Changes with Confidence
393
+ slug: "unit-test-coverage",
394
+ content: `# Unit Test Coverage
459
395
 
460
- Developers should be able to modify code without fear of breaking existing behavior.
396
+ Complete unit test coverage ensures low-level tests give users direct feedback as they change the code.
461
397
 
462
- Tests, type checking, and other automated verification enable safe refactoring and evolution of the codebase. When changes break something, fast feedback identifies the problem before it spreads. This confidence encourages continuous improvement rather than fragile, stagnant code.
398
+ Excluding system tests from coverage reporting focuses attention on unit tests - the tests that provide the fastest, most specific feedback. When coverage tools only measure unit tests, developers can quickly identify which parts of the codebase lack fine-grained test protection.
463
399
 
464
400
  ## Parent Principle
465
401
 
466
- - [Maintainable Codebase](maintainable-codebase.md)
402
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
467
403
 
468
404
  ## Sub-Principles
469
405
 
470
- - [Comprehensive Assertions](comprehensive-assertions.md)
471
- - [Decoupled Code](decoupled-code.md)
472
- - [Fast Feedback](fast-feedback.md)
473
- - [Lint Everything](lint-everything.md)
474
- - [Readable Test Data](readable-test-data.md)
475
- - [Reproducible Checks](reproducible-checks.md)
476
- - [Stop the Line](stop-the-line.md)
477
- - [Keep Unit Tests Pure](keep-unit-tests-pure.md)
478
- - [Test Isolation](test-isolation.md)
479
- - [Self-Diagnosing Tests](self-diagnosing-tests.md)
480
- - [Unit Test Coverage](unit-test-coverage.md)
406
+ - (none)
481
407
  `
482
408
  },
483
409
  {
484
- slug: "test-isolation",
485
- content: `# Test Isolation
410
+ slug: "cross-platform-compatibility",
411
+ content: `# Cross-Platform Compatibility
486
412
 
487
- Tests should not interfere with one another. Each test must be independently runnable and produce the same result regardless of execution order or which other tests run alongside it.
413
+ Dust should work consistently across operating systems: Linux, macOS, and Windows.
488
414
 
489
415
  This means:
490
- - No shared mutable state between tests
491
- - No reliance on test execution order
492
- - No file system or environment pollution
493
- - Each test sets up its own dependencies
416
+ - Avoiding platform-specific shell commands or syntax
417
+ - Using cross-platform path handling
418
+ - Testing on multiple platforms when possible
419
+ - Documenting any platform-specific limitations
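As an illustration of the path-handling point, the sketch below builds a file location with Node's built-in `path` module rather than concatenating strings with a hard-coded separator. The helper `taskFilePath` and the `tasks` directory name are assumptions made for this example, not dust's actual layout.

```javascript
const path = require("node:path");

// Hypothetical helper: joins segments using the platform's own separator.
// `pathApi` defaults to the current platform but can be overridden, which
// lets a test exercise POSIX and Windows behaviour on any machine.
function taskFilePath(rootDir, slug, pathApi = path) {
  return pathApi.join(rootDir, "tasks", slug + ".md");
}

const posixPath = taskFilePath("repo", "fix-login", path.posix);
const windowsPath = taskFilePath("repo", "fix-login", path.win32);
// posixPath uses "/" separators; windowsPath uses "\" separators.
```

Passing `path.posix` or `path.win32` explicitly is only for demonstration and testing; ordinary callers would rely on the default.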
494
420
 
495
- Test isolation enables parallel execution, makes failures easier to diagnose, and prevents cascading false failures when one test breaks.
421
+ Cross-platform support broadens adoption and ensures teams with mixed environments can collaborate effectively.
496
422
 
497
423
  ## Parent Principle
498
424
 
499
- - [Make Changes with Confidence](make-changes-with-confidence.md)
425
+ - [Easy Adoption](easy-adoption.md)
500
426
 
501
427
  ## Sub-Principles
502
428
 
503
- - [Environment-Independent Tests](environment-independent-tests.md)
429
+ - (none)
504
430
  `
505
431
  },
506
432
  {
507
- slug: "repository-hygiene",
508
- content: `# Repository Hygiene
433
+ slug: "vcs-independence",
434
+ content: `# VCS Independence
509
435
 
510
- Dust repositories should maintain a clean, organized state with minimal noise.
436
+ Dust should work independently of any specific version control system.
511
437
 
512
- This includes proper gitignore configuration to exclude build artifacts, dependencies, editor files, and other generated content from version control. A well-maintained repository makes it easier for both humans and AI to navigate and understand the codebase.
438
+ While git is common, dust's core functionality should not require git. This enables use in repositories using other VCS (Mercurial, SVN, Perforce) or in non-VCS workflows.
513
439
 
514
440
  ## Parent Principle
515
441
 
516
- - [Maintainable Codebase](maintainable-codebase.md)
442
+ - [Easy Adoption](easy-adoption.md)
517
443
 
518
444
  ## Sub-Principles
519
445
 
520
- - [Atomic Commits](atomic-commits.md)
521
- - [Trunk-Based Development](trunk-based-development.md)
446
+ - (none)
522
447
  `
523
448
  },
524
449
  {
525
- slug: "agentic-flow-state",
526
- content: `# Agentic Flow State
527
-
528
- Flow is the mental state where work becomes effortless - where you're fully immersed, losing track of time, operating at peak performance. Psychologist Mihaly Csikszentmihalyi identified three conditions that create flow: clear goals, immediate feedback, and challenge-skill balance.
450
+ slug: "self-contained-repository",
451
+ content: `# Self-Contained Repository
529
452
 
530
- For AI agents, achieving flow state means staying engaged and productive without interruption. Agents enter flow when they have optimal context, comprehensive guard rails, and minimal friction. Context window optimization ensures agents have exactly what they need without cognitive overload. In-session guard rails prevent agents from straying off course or making mistakes that break their momentum.
453
+ Where possible, developers and agents should have everything they need to be productive, within the repository.
531
454
 
532
- Dust's design targets these conditions directly:
455
+ No third-party tools should be required beyond those that can be installed with a single command defined in the repository. Setup instructions, scripts, configuration, and dependencies should all live in version control so that cloning the repo and running a single install command is sufficient to start working. This eliminates onboarding friction, reduces "works on my machine" issues, and is especially important for agents — who cannot browse the web to find missing tools or ask colleagues how to set things up.
533
456
 
534
- - **Clear goals**: Task files and lightweight planning give you a concrete target. You know exactly what you're building next.
535
- - **Immediate feedback**: Fast feedback loops let you see results quickly. Each change confirms you're on track or shows you what to adjust.
536
- - **Challenge-skill balance**: Small units of work and agent autonomy keep you in the zone - challenged enough to stay engaged, supported enough to succeed.
537
- - **Context window efficiency**: Progressive disclosure and artifact summarization ensure agents have the right context without overflow.
538
- - **Comprehensive guard rails**: Lint rules, type checks, and automated validation catch mistakes before they compound.
457
+ ## Applicability
539
458
 
540
- Everything dust does serves flow. When agents stay in flow, they produce better work, sustain their momentum, and complete tasks autonomously.
459
+ Internal
541
460
 
542
461
  ## Parent Principle
543
462
 
544
- - (none)
463
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
545
464
 
546
465
  ## Sub-Principles
547
466
 
548
- - [Human-AI Collaboration](human-ai-collaboration.md)
549
- - [Maintainable Codebase](maintainable-codebase.md)
467
+ - (none)
550
468
  `
551
469
  },
552
470
  {
553
- slug: "stop-the-line",
554
- content: `# Stop the Line
471
+ slug: "minimal-dependencies",
472
+ content: `# Minimal Dependencies
555
473
 
556
- Any worker, human or agent, should halt and fix a problem the moment they detect it, rather than letting defects propagate downstream.
474
+ Dust should avoid coupling to specific tools so we can switch to better alternatives as they emerge.
557
475
 
558
- Originating from the Toyota production system, "Stop the Line" empowers every participant to pause work immediately upon identifying a defect, failing check, or safety hazard. Problems are cheaper to fix at their source than after they've compounded through later stages. In the context of dust, this means agents and humans alike should treat broken checks, test failures, and lint errors as blockers that demand immediate attention — not warnings to be deferred.
476
+ By keeping dependencies minimal and using standard APIs where possible, we maintain the freedom to adopt new tools without major rewrites. This applies to runtimes, test frameworks, build tools, and other infrastructure choices.
559
477
 
560
478
  ## Parent Principle
561
479
 
562
- - [Make Changes with Confidence](make-changes-with-confidence.md)
480
+ - [Maintainable Codebase](maintainable-codebase.md)
563
481
 
564
482
  ## Sub-Principles
565
483
 
566
- - (none)
484
+ - [Runtime Agnostic Tests](runtime-agnostic-tests.md)
567
485
  `
568
486
  },
569
487
  {
570
- slug: "agent-context-inference",
571
- content: `# Agent Context Inference
488
+ slug: "agent-specific-enhancement",
489
+ content: `# Agent-Specific Enhancement
572
490
 
573
- Terse human prompts should trigger the correct agent action.
491
+ Dust should detect and enhance the experience for specific agents while remaining agnostic at its core.
574
492
 
575
- When a human gives a brief instruction like "the button should be green", the agent should be able to infer what to do. The agent shouldn't require the human to specify file paths, component names, or implementation details that can be discovered from the repository.
493
+ While Dust has [Agent-Agnostic Design](agent-agnostic-design.md) and works with any capable agent, it can still optimize the "agent DX" (developer experience) when it detects a specific agent is being used. This means:
576
494
 
577
- This reduces friction for humans and makes agent interactions feel more natural. The burden of context discovery shifts to the agent, which can use dust's CLI and repository structure to find what it needs.
495
+ - **Detection** - Dust may detect which agent is running (e.g., Claude Code, Aider, Cursor) through environment variables, configuration, or other signals
496
+ - **Enhancement** - Once detected, Dust can tailor its output format, prompts, or context to leverage that agent's specific strengths
497
+ - **Graceful fallback** - When no specific agent is detected, Dust provides a generic experience that works with any agent
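The three points above could be sketched roughly as follows. The environment variable names in `AGENT_SIGNALS` are placeholders invented for this example, not variables that any real agent is known to set, and `detectAgent` and `formatOutput` are hypothetical function names.

```javascript
// Hypothetical signal table. These variable names are placeholders only.
const AGENT_SIGNALS = [
  { agent: "claude-code", envVar: "EXAMPLE_CLAUDE_CODE" },
  { agent: "aider", envVar: "EXAMPLE_AIDER" },
];

// Detection: return the first agent whose signal is present, else "generic".
function detectAgent(env, signals = AGENT_SIGNALS) {
  const match = signals.find((signal) => Boolean(env[signal.envVar]));
  return match ? match.agent : "generic";
}

// Enhancement with graceful fallback: a recognized agent gets tailored
// output; an unrecognized one gets the plain message unchanged.
function formatOutput(agent, message) {
  return agent === "generic" ? message : "[" + agent + "] " + message;
}
```

Because `detectAgent` takes the environment as an argument (for example `detectAgent(process.env)`), the core logic stays pure and works without any specific agent being present.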
498
+
499
+ This principle complements Agent-Agnostic Design: the core functionality never requires a specific agent, but the experience improves when one is recognized.
578
500
 
579
501
  ## Applicability
580
502
 
@@ -590,40 +512,57 @@ Internal
590
512
  `
591
513
  },
592
514
  {
593
- slug: "naming-matters",
594
- content: `# Naming Matters
515
+ slug: "self-diagnosing-tests",
516
+ content: `# Self-Diagnosing Tests
595
517
 
596
- Good naming reduces waste by eliminating confusion and making code self-documenting.
518
+ When a big test fails, it should be self-evident how to diagnose and fix the failure.
597
519
 
598
- Poor names cause rework, bugs, and communication overhead. When names don't clearly convey meaning, developers waste time deciphering code, misunderstand intentions, and introduce defects. Well-chosen names serve as documentation that never goes stale, reducing the need for explanatory comments and enabling both humans and AI agents to navigate the codebase efficiently.
520
+ The more moving parts a test has (end-to-end, system, integration), the more critical this becomes. A test that fails with \`expected true, received false\` forces the developer (or agent) to re-run, add logging, and guess. A test that fails with a rich diff showing the actual state versus the expected state turns diagnosis into reading.
599
521
 
600
- ## Parent Principle
522
+ ## Anti-patterns
601
523
 
602
- - [Maintainable Codebase](maintainable-codebase.md)
524
+ **Boolean flattening** — collapsing a rich value into true/false before asserting:
525
+ \`\`\`javascript
526
+ // Bad: "expected true, received false" — what events arrived?
527
+ expect(events.some(e => e.type === 'check-passed')).toBe(true)
603
528
 
604
- ## Sub-Principles
529
+ // Good: shows the actual event types on failure
530
+ expect(events.map(e => e.type)).toContain('check-passed')
531
+ \`\`\`
605
532
 
606
- - [Consistent Naming](consistent-naming.md)
607
- - [Clarity Over Brevity](clarity-over-brevity.md)
608
- `
609
- },
610
- {
611
- slug: "stubs-over-mocks",
612
- content: `# Stubs Over Mocks
533
+ **Length-only assertions** — checking count without showing contents:
534
+ \`\`\`javascript
535
+ // Bad: "expected 2, received 0" — what requests were captured?
536
+ expect(requests.length).toBe(2)
613
537
 
614
- Prefer hand-rolled stubs over mocks, in unit tests. Stubs keep tests focused on observable behavior instead of implementation details.
538
+ // Good: shows the actual requests on failure
539
+ expect(requests).toHaveLength(2) // vitest shows the array
540
+ \`\`\`
615
541
 
616
- Mocks tend to encode a script of “expected calls” (what was invoked, in what order, with what arguments). That makes tests brittle: harmless refactors (changing internal decomposition, adding caching, batching calls, reordering operations) can break tests even when the externally visible behavior is unchanged. You end up maintaining tests that police how the code works rather than what it does.
542
+ **Silent guards** - using \`if\` where an assertion belongs:
543
+ \`\`\`javascript
544
+ // Bad: silently passes when settings is undefined
545
+ if (settings) {
546
+ expect(JSON.parse(settings).key).toBeDefined()
547
+ }
617
548
 
618
- Stubs (and especially in-memory emulators) push tests toward the contract: provide inputs, run the code, assert outputs and side effects. When a test fails, it’s usually because a behavior changed, not because the internal call choreography shifted. That improves signal-to-noise, reduces rewrites during refactors, and makes it easier to evolve the implementation.
549
+ // Good: fails explicitly if settings is missing
550
+ expect(settings).toBeDefined()
551
+ const parsed = JSON.parse(settings!)
552
+ expect(parsed.key).toBeDefined()
553
+ \`\`\`
619
554
 
620
- For external dependencies (databases, queues, object stores, HTTP services), the default choice should be an in-memory emulator: a drop-in replacement that is faithful enough to the real interface/semantics but runs entirely in-process. It gives most of the benefits of integration testing—realistic state transitions, error modes, concurrency behavior where relevant—without the cost, flakiness, and setup burden of booting real infrastructure. It also keeps the test environment hermetic (no network, no shared state), which improves determinism and makes tests fast.
555
+ ## The test
621
556
 
622
- Still use mocks selectively—mainly to assert something is called (e.g., telemetry emission, "at most once" notifications, payment capture guarded by a feature flag) or when a dependency is impossible to emulate. But for most cases, stubs and in-memory emulators produce tests that are clearer, more resilient to refactoring, and better aligned with the system's actual contracts.
557
+ If a test fails, can a developer who has never seen the code identify the problem from the failure output alone, without re-running, adding console.logs, or reading the test source? The closer to "yes", the better.
558
+
559
+ ## How to evaluate
560
+
561
+ Work supports this principle when every assertion in a system or integration test would, on failure, reveal the actual state richly enough to guide a fix. Bare boolean checks, length-only assertions, and silent conditional guards are violations.
623
562
 
624
563
  ## Parent Principle
625
564
 
626
- - [Decoupled Code](decoupled-code.md)
565
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
627
566
 
628
567
  ## Sub-Principles
629
568
 
@@ -631,18 +570,18 @@ Still use mocks selectively—mainly to assert something is called (e.g., teleme
631
570
  `
632
571
  },
633
572
  {
634
- slug: "functional-core-imperative-shell",
635
- content: `# Functional Core, Imperative Shell
573
+ slug: "slow-feedback-coping",
574
+ content: `# Slow Feedback Coping
636
575
 
637
- Separate code into a pure "functional core" and a thin "imperative shell." The core takes values in and returns values out, with no side effects. The shell handles I/O and wires things together.
576
+ Some feedback is unavoidably slow; dust should offer coping strategies rather than pretending it can be eliminated.
638
577
 
639
- Purely functional code makes some things easier to understand: because values don't change, you can call functions and know that only their return value matters—they don't change anything outside themselves.
578
+ Integration tests, end-to-end tests, deployment pipelines, and external API calls all take time. Pretending they can be made instant is unrealistic. Instead, dust should help developers and agents cope with slow feedback effectively: by structuring work so that fast checks catch most problems early, by batching slow checks intelligently, by providing clear progress indicators, and by ensuring that when slow feedback does arrive, it is actionable and specific.
640
579
 
641
- The functional core contains business logic as pure functions that take values and return values. The imperative shell sits at the boundary, reading input, calling into the core, and performing side effects with the results. This keeps the majority of code easy to test (no mocks or stubs needed for pure functions) and makes the I/O surface area small and explicit.
580
+ Strategies include separating fast and slow test suites, running slow checks asynchronously or in CI, caching expensive operations, and designing workflows that minimise how often slow feedback is needed.
642
581
 
643
582
  ## Parent Principle
644
583
 
645
- - [Decoupled Code](decoupled-code.md)
584
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
646
585
 
647
586
  ## Sub-Principles
648
587
 
@@ -650,51 +589,61 @@ The functional core contains business logic as pure functions that take values a
650
589
  `
651
590
  },
652
591
  {
653
- slug: "development-traceability",
654
- content: `# Development Traceability
592
+ slug: "agentic-flow-state",
593
+ content: `# Agentic Flow State
655
594
 
656
- Structured logging and tracing help agents understand system behaviour without resorting to ad-hoc testing cycles.
595
+ Flow is the mental state where work becomes effortless - where you're fully immersed, losing track of time, operating at peak performance. Psychologist Mihaly Csikszentmihalyi identified three conditions that create flow: clear goals, immediate feedback, and challenge-skill balance.
657
596
 
658
- When something goes wrong, agents often resort to adding temporary log statements, running the code, reading the output, and repeating: a slow and wasteful debugging loop. Good traceability means the system already records what happened and why, through structured logs, trace IDs, and observable state. This lets agents diagnose issues by reading existing output rather than generating new experiments.
597
+ For AI agents, achieving flow state means staying engaged and productive without interruption. Agents enter flow when they have optimal context, comprehensive guard rails, and minimal friction. Context window optimization ensures agents have exactly what they need without cognitive overload. In-session guard rails prevent agents from straying off course or making mistakes that break their momentum.
659
598
 
660
- Dust should encourage projects to adopt structured logging, promote traceability as a first-class concern, and provide tools that surface relevant trace information when agents need it.
599
+ Dust's design targets these conditions directly:
661
600
 
662
- ## Applicability
601
+ - **Clear goals**: Task files and lightweight planning give you a concrete target. You know exactly what you're building next.
602
+ - **Immediate feedback**: Fast feedback loops let you see results quickly. Each change confirms you're on track or shows you what to adjust.
603
+ - **Challenge-skill balance**: Small units of work and agent autonomy keep you in the zone - challenged enough to stay engaged, supported enough to succeed.
604
+ - **Context window efficiency**: Progressive disclosure and artifact summarization ensure agents have the right context without overflow.
605
+ - **Comprehensive guard rails**: Lint rules, type checks, and automated validation catch mistakes before they compound.
663
606
 
664
- Internal
607
+ Everything dust does serves flow. When agents stay in flow, they produce better work, sustain their momentum, and complete tasks autonomously.
665
608
 
666
609
  ## Parent Principle
667
610
 
668
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
611
+ - (none)
669
612
 
670
613
  ## Sub-Principles
671
614
 
672
- - (none)
615
+ - [Human-AI Collaboration](human-ai-collaboration.md)
616
+ - [Maintainable Codebase](maintainable-codebase.md)
673
617
  `
674
618
  },
675
619
  {
676
- slug: "keep-unit-tests-pure",
677
- content: `# Keep Unit Tests Pure
620
+ slug: "reproducible-checks",
621
+ content: `# Reproducible Checks
678
622
 
679
- Unit tests (those run very frequently as part of a tight feedback loop) should be pure and side-effect free. A test is **not** a unit test if it:
623
+ Every check must produce the same result regardless of who runs it, when, or on what machine. If a check passes for one developer but fails for another, the check is broken.
680
624
 
681
- - Accesses a database
682
- - Communicates over a network
683
- - Touches the file system
684
- - Cannot run concurrently with other tests
685
- - Requires special environment setup
625
+ Concretely, checks should pin their tool versions via the project's dependency manager (e.g. \`devDependencies\`) rather than relying on \`npx\`/\`bunx\` to fetch the latest version at runtime. Unpinned versions introduce non-determinism: a check that passed yesterday may fail today due to a tool upgrade that nobody chose to adopt.
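As a rough sketch of how a project might audit this, the hypothetical helper below lists `devDependencies` entries that are not exact versions. Treating only exact versions as pinned is an assumption of this example and a stricter reading than the text requires, since a committed lockfile can also fix the resolved version of a range.

```javascript
// Hypothetical helper: returns the names of devDependencies whose version
// string is a range or tag (e.g. "^1.2.3", "latest") instead of an exact pin.
function findUnpinned(packageJson) {
  const exactVersion = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;
  return Object.entries(packageJson.devDependencies || {})
    .filter(([, version]) => !exactVersion.test(version))
    .map(([name]) => name);
}

// "tool-a", "tool-b" and "tool-c" are made-up package names.
const unpinned = findUnpinned({
  devDependencies: { "tool-a": "1.2.3", "tool-b": "^1.2.3", "tool-c": "latest" },
});
```

Here `unpinned` contains `"tool-b"` and `"tool-c"`. A check built on this idea is itself reproducible, because it only inspects the manifest it is given.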
686
626
 
687
- "Unit tests" here means tests run frequently during development — not system tests, which intentionally exercise the full stack including I/O. Pure unit tests exercise only business logic, not infrastructure.
627
+ ## Parent Principle
688
628
 
689
- The value of pure unit tests is that they are fast, deterministic, and isolate business logic from infrastructure concerns. When unit tests pass but integration or system tests fail, developers can immediately narrow the problem to the boundary layer — a diagnostic "binary chop" that accelerates debugging.
629
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
690
630
 
691
- ## Migration Guidance
631
+ ## Sub-Principles
692
632
 
693
- Where existing tests are impure (e.g. they spawn processes, write temporary files, or make network calls), prefer converting them to use in-memory alternatives — stubs, fakes, or dependency-injected doubles — rather than leaving them as-is. Opportunistic migration is fine; a big-bang rewrite is not required.
633
+ - (none)
634
+ `
635
+ },
636
+ {
637
+ slug: "task-first-workflow",
638
+ content: `# Task-First Workflow
639
+
640
+ Work should be captured as a task before implementation begins, creating traceability between intent and outcome.
641
+
642
+ This discipline ensures that every change has a documented purpose. The commit history shows pairs of "Add task" followed by implementation, making it easy to understand why each change was made. It also prevents scope creep by defining boundaries before work starts.
694
643
 
695
644
  ## Parent Principle
696
645
 
697
- - [Make Changes with Confidence](make-changes-with-confidence.md)
646
+ - [Lightweight Planning](lightweight-planning.md)
698
647
 
699
648
  ## Sub-Principles
700
649
 
@@ -702,78 +651,96 @@ Where existing tests are impure (e.g. they spawn processes, write temporary file
702
651
  `
703
652
  },
704
653
  {
705
- slug: "co-located-tests",
706
- content: `# Co-located Tests
654
+ slug: "ideal-agent-developer-experience",
655
+ content: `# Ideal Agent Developer Experience
707
656
 
708
- Test files should live next to the code they test.
657
+ The agent is the developer. The human is the CEO. Dust is the PM.
709
658
 
710
- When tests are co-located with their source files, developers can immediately see what's tested and what isn't. Finding the test for a module becomes trivial—it's right there in the same directory. This proximity encourages writing tests as part of the development flow rather than as an afterthought, and makes it natural to update tests when modifying code.
659
+ With today's AI coding assistants, the human is stuck in a tight loop with agents, constantly directing, reviewing, and course-correcting. Dust is designed to relieve humans from this tight loop. Like an assistant to a CEO, dust predominantly brings fully-researched questions and well-prepared work to the human, rather than expecting the human to drive every decision. The human checks in less frequently, and when they do, they make high-leverage strategic calls rather than micromanaging implementation.
660
+
661
+ For this to work, the agent's development environment must be excellent. The agent reads the code, writes changes, runs the checks, and iterates until the task is done. Everything about the codebase and its tooling either helps or hinders that process. Comprehensive tests are the agent's only way to verify correctness. Fast feedback loops are the agent's iteration speed. Structured logs are the agent's eyes into runtime behaviour. Small, well-organised files are what fit in the agent's context window. Exploratory and debugging tools are how the agent navigates and diagnoses without trial and error.
662
+
663
+ Each sub-principle represents a different aspect of the ideal agent developer setup. The better these are, the less the human needs to be in the loop.
711
664
 
712
665
  ## Parent Principle
713
666
 
714
- - [Intuitive Directory Structure](intuitive-directory-structure.md)
667
+ - [Human-AI Collaboration](human-ai-collaboration.md)
715
668
 
716
669
  ## Sub-Principles
717
670
 
718
- - (none)
671
+ - [Comprehensive Test Coverage](comprehensive-test-coverage.md)
672
+ - [Fast Feedback Loops](fast-feedback-loops.md)
673
+ - [Slow Feedback Coping](slow-feedback-coping.md)
674
+ - [Development Traceability](development-traceability.md)
675
+ - [Context-Optimised Code](context-optimised-code.md)
676
+ - [Exploratory Tooling](exploratory-tooling.md)
677
+ - [Debugging Tooling](debugging-tooling.md)
678
+ - [Self-Contained Repository](self-contained-repository.md)
719
679
  `
720
680
  },
721
681
  {
722
- slug: "human-ai-collaboration",
723
- content: `# Human-AI Collaboration
682
+ slug: "agent-context-inference",
683
+ content: `# Agent Context Inference
724
684
 
725
- Dust exists to enable effective collaboration between humans and AI agents on complex projects.
685
+ Terse human prompts should trigger the correct agent action.
726
686
 
727
- The human is the CEO: they set direction, make strategic decisions, and check in when it matters. Dust is the PM: it manages the work, prepares context, and brings fully-researched questions to the human rather than expecting them to drive every detail. Agents are the developers: they read code, write changes, and iterate autonomously.
687
+ When a human gives a brief instruction like "the button should be green", the agent should be able to infer what to do. The agent shouldn't require the human to specify file paths, component names, or implementation details that can be discovered from the repository.
728
688
 
729
- Today's AI coding tools keep humans in a tight loop with agents. Dust is designed to loosen that loop, so humans spend less time directing and more time deciding.
689
+ This reduces friction for humans and makes agent interactions feel more natural. The burden of context discovery shifts to the agent, which can use dust's CLI and repository structure to find what it needs.
690
+
691
+ ## Applicability
692
+
693
+ Internal
730
694
 
731
695
  ## Parent Principle
732
696
 
733
- - [Agentic Flow State](agentic-flow-state.md)
697
+ - [Agent Autonomy](agent-autonomy.md)
734
698
 
735
699
  ## Sub-Principles
736
700
 
737
- - [Agent Autonomy](agent-autonomy.md)
738
- - [Easy Adoption](easy-adoption.md)
739
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
740
- - [Lightweight Planning](lightweight-planning.md)
701
+ - (none)
741
702
  `
742
703
  },
743
704
  {
744
- slug: "vcs-independence",
745
- content: `# VCS Independence
705
+ slug: "agent-autonomy",
706
+ content: `# Agent Autonomy
746
707
 
747
- Dust should work independently of any specific version control system.
708
+ Dust exists to enable AI agents to produce work autonomously.
748
709
 
749
- While git is common, dust's core functionality should not require git. This enables use in repositories using other VCS (Mercurial, SVN, Perforce) or in non-VCS workflows.
710
+ With sufficient planning and small enough units, this works much better in practice.
750
711
 
751
712
  ## Parent Principle
752
713
 
753
- - [Easy Adoption](easy-adoption.md)
714
+ - [Human-AI Collaboration](human-ai-collaboration.md)
754
715
 
755
716
  ## Sub-Principles
756
717
 
757
- - (none)
718
+ - [Actionable Errors](actionable-errors.md)
719
+ - [Batteries Included](batteries-included.md)
720
+ - [Agent-Agnostic Design](agent-agnostic-design.md)
721
+ - [Agent Context Inference](agent-context-inference.md)
722
+ - [Agent-Specific Enhancement](agent-specific-enhancement.md)
723
+ - [Context Window Efficiency](context-window-efficiency.md)
724
+ - [Small Units](small-units.md)
758
725
  `
759
726
  },
760
727
  {
761
- slug: "environment-independent-tests",
762
- content: `# Environment-Independent Tests
728
+ slug: "stubs-over-mocks",
729
+ content: `# Stubs Over Mocks
730
+
731
+ Prefer hand-rolled stubs over mocks, in unit tests. Stubs keep tests focused on observable behavior instead of implementation details.
732
+
733
+ Mocks tend to encode a script of “expected calls” (what was invoked, in what order, with what arguments). That makes tests brittle: harmless refactors (changing internal decomposition, adding caching, batching calls, reordering operations) can break tests even when the externally visible behavior is unchanged. You end up maintaining tests that police how the code works rather than what it does.
763
734
 
764
- Tests must produce the same result regardless of where they run. A test that passes locally but fails in CI (or vice versa) is a broken test.
735
+ Stubs (and especially in-memory emulators) push tests toward the contract: provide inputs, run the code, assert outputs and side effects. When a test fails, it’s usually because a behavior changed, not because the internal call choreography shifted. That improves signal-to-noise, reduces rewrites during refactors, and makes it easier to evolve the implementation.
765
736
 
766
- Concretely, tests should never depend on:
767
- - Ambient environment variables (e.g. \`CLAUDECODE\`, \`CI\`, \`HOME\`)
768
- - The current working directory or filesystem layout of the host machine
769
- - Network availability or external services
770
- - The identity of the user or agent running the tests
737
+ For external dependencies (databases, queues, object stores, HTTP services), the default choice should be an in-memory emulator: a drop-in replacement that is faithful enough to the real interface/semantics but runs entirely in-process. It gives most of the benefits of integration testing—realistic state transitions, error modes, concurrency behavior where relevant—without the cost, flakiness, and setup burden of booting real infrastructure. It also keeps the test environment hermetic (no network, no shared state), which improves determinism and makes tests fast.
771
738
 
772
- When a function's behavior depends on environment variables, the test must explicitly control those variables (via \`stubEnv\`, dependency injection, or passing an \`env\` parameter) rather than relying on whatever happens to be set in the current shell.
739
+ Still use mocks selectively—mainly to assert something is called (e.g., telemetry emission, "at most once" notifications, payment capture guarded by a feature flag) or when a dependency is impossible to emulate. But for most cases, stubs and in-memory emulators produce tests that are clearer, more resilient to refactoring, and better aligned with the system's actual contracts.
773
740
 
774
741
  ## Parent Principle
775
742
 
776
- - [Test Isolation](test-isolation.md)
743
+ - [Decoupled Code](decoupled-code.md)
777
744
 
778
745
  ## Sub-Principles
779
746
 
@@ -804,95 +771,116 @@ Internal
804
771
  `
805
772
  },
806
773
  {
807
- slug: "atomic-commits",
808
- content: `# Atomic Commits
809
-
810
- Each commit should tell a complete story, bundling implementation changes with their corresponding documentation updates.
774
+ slug: "consistent-naming",
775
+ content: `# Consistent Naming
811
776
 
812
- When a task is completed, the commit deletes the task file, updates relevant facts to reflect the new reality, and removes any ideas that have been realized. This discipline ensures that any point in the commit history represents a coherent, self-documenting state of the project.
777
+ Names should follow established conventions within each category to reduce cognitive load.
813
778
 
814
- Clean commit history is essential because archaeology depends on it. Future humans and AI agents will traverse history to understand why decisions were made and how the system evolved.
779
+ Principles use Title Case. File names use kebab-case. Commands use lowercase with hyphens. When naming conventions exist, follow them. When they don't, establish one and apply it consistently. Inconsistent naming creates friction for both humans and AI agents trying to predict or recall identifiers.
815
780
 
816
781
  ## Parent Principle
817
782
 
818
- - [Repository Hygiene](repository-hygiene.md)
783
+ - [Naming Matters](naming-matters.md)
819
784
 
820
785
  ## Sub-Principles
821
786
 
822
- - [Traceable Decisions](traceable-decisions.md)
787
+ - (none)
823
788
  `
824
789
  },
825
790
  {
826
- slug: "trunk-based-development",
827
- content: `# Trunk-Based Development
791
+ slug: "lightweight-planning",
792
+ content: `# Lightweight Planning
828
793
 
829
- Dust is designed to support a non-branching workflow where developers commit directly to a single main branch.
794
+ Dust aims to be a minimal, low-overhead planning system that stays relevant over time.
830
795
 
831
- In trunk-based development, teams collaborate on code in one primary branch rather than maintaining multiple long-lived feature branches. This eliminates merge conflicts, enables continuous integration, and keeps the codebase continuously releasable.
796
+ Planning artifacts are simple markdown files that live alongside code. Ideas are intentionally vague until implementation is imminent. Tasks are small and completable in single commits. Facts document current reality rather than aspirational states.
832
797
 
833
- The \`dust loop claude\` command embodies this philosophy: agents pull from main, implement a task, and push directly back to main. There are no feature branches, no pull requests, no merge queues. Each commit is atomic and complete.
798
+ The system avoids the staleness problem by deferring detail until the last responsible moment and deleting completed work rather than archiving it.
834
799
 
835
- This approach scales through discipline rather than isolation. Feature flags and incremental changes replace long-running branches. The repository history becomes a linear sequence of working states.
800
+ ## Parent Principle
836
801
 
837
- See: https://trunkbaseddevelopment.com/
802
+ - [Human-AI Collaboration](human-ai-collaboration.md)
803
+
804
+ ## Sub-Principles
805
+
806
+ - [Task-First Workflow](task-first-workflow.md)
807
+ - [Some Big Design Up Front](some-big-design-up-front.md)
808
+ `
809
+ },
810
+ {
811
+ slug: "easy-adoption",
812
+ content: `# Easy Adoption
813
+
814
+ Dust should be trivially easy to adopt in any repository.
815
+
816
+ Getting started with Dust should require minimal friction. A developer should be able to bootstrap Dust in their repository with a single command, without needing to install dependencies, configure build tools, or understand the internals.
817
+
818
+ This lowers the barrier to entry and encourages experimentation.
838
819
 
839
820
  ## Parent Principle
840
821
 
841
- - [Repository Hygiene](repository-hygiene.md)
822
+ - [Human-AI Collaboration](human-ai-collaboration.md)
842
823
 
843
824
  ## Sub-Principles
844
825
 
845
- (none)
826
+ - [Cross-Platform Compatibility](cross-platform-compatibility.md)
827
+ - [Unsurprising UX](unsurprising-ux.md)
828
+ - [VCS Independence](vcs-independence.md)
846
829
  `
847
830
  },
848
831
  {
849
- slug: "comprehensive-assertions",
850
- content: `# Comprehensive Assertions
832
+ slug: "intuitive-directory-structure",
833
+ content: `# Intuitive Directory Structure
851
834
 
852
- Assert the whole, not the parts.
835
+ Code should be organized around related concerns in clearly named directories.
853
836
 
854
- When you break a complex object into many small assertions, a failure tells you *one thing that's wrong*. When you assert against the whole expected value, the diff tells you *what actually happened versus what you expected*: the full picture, in one glance.
837
+ When files that serve similar purposes are grouped together, the codebase becomes easier to navigate and understand. A developer looking for "commands" should find them in a \`commands\` directory. Utilities should live with utilities. This organization reduces cognitive load and makes the project structure self-documenting.
855
838
 
856
- Small assertions are like yes/no questions to a witness. A whole-object assertion is like asking "tell me what you saw."
839
+ ## Parent Principle
857
840
 
858
- ## In practice
841
+ - [Maintainable Codebase](maintainable-codebase.md)
859
842
 
860
- Collapse multiple partial assertions into one comprehensive assertion:
843
+ ## Sub-Principles
861
844
 
862
- \`\`\`javascript
863
- // Fragmented — each failure is a narrow keyhole
864
- expect(result.name).toBe("Alice");
865
- expect(result.age).toBe(30);
866
- expect(result.role).toBe("admin");
845
+ - [Co-located Tests](co-located-tests.md)
846
+ `
847
+ },
848
+ {
849
+ slug: "lint-everything",
850
+ content: `# Lint Everything
867
851
 
868
- // Whole: a failure diff tells the full story
869
- expect(result).toEqual({
870
- name: "Alice",
871
- age: 30,
872
- role: "admin",
873
- });
874
- \`\`\`
852
+ Prefer static analysis over runtime checks. Every error caught by a linter is an error that never reaches tests, and every error caught by tests is an error that never reaches production.
875
853
 
876
- If \`role\` is \`"user"\` and \`age\` is \`29\`, the fragmented version stops at the first failure. The whole-object assertion shows both discrepancies at once, in context.
854
+ Lint markdown, lint types, lint formatting. If it can be checked statically, check it. Linters are fast, deterministic, and catch entire categories of bugs before code even runs.
877
855
 
878
- The same applies to arrays:
856
+ This project lints:
857
+ - TypeScript (type checking and style)
858
+ - Markdown (broken links, required sections)
859
+ - Task files (structure validation)
860
+ - Principle hierarchy (parent/child consistency)
879
861
 
880
- \`\`\`javascript
881
- // Avoid: partial assertions that hide the actual state
882
- expect(array).toContain('apples')
883
- expect(array).toContain('oranges')
862
+ ## Parent Principle
884
863
 
885
- // Prefer: one assertion that reveals the full picture on failure
886
- expect(array).toEqual(['apples', 'oranges'])
887
- \`\`\`
864
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
888
865
 
889
- ## How to evaluate
866
+ ## Sub-Principles
890
867
 
891
- Work supports this principle when test failures tell a rich story — showing the complete actual value alongside the complete expected value, so the reader can understand what happened without re-running anything.
868
+ (none)
869
+ `
870
+ },
871
+ {
872
+ slug: "progressive-disclosure",
873
+ content: `# Progressive Disclosure
874
+
875
+ Dust should reveal details progressively as a way of achieving context window efficiency.
876
+
877
+ Not all information is needed at once. A task list showing just titles is sufficient for choosing what to work on. Full task details are only needed when actively implementing. Linked principles and facts can be followed when deeper context is required.
878
+
879
+ This layered approach keeps initial reads lightweight while preserving access to complete information when needed.
892
880
 
893
881
  ## Parent Principle
894
882
 
895
- - [Make Changes with Confidence](make-changes-with-confidence.md)
883
+ - [Context Window Efficiency](context-window-efficiency.md)
896
884
 
897
885
  ## Sub-Principles
898
886
 
@@ -900,22 +888,18 @@ Work supports this principle when test failures tell a rich story — showing th
900
888
  `
901
889
  },
902
890
  {
903
- slug: "cross-platform-compatibility",
904
- content: `# Cross-Platform Compatibility
891
+ slug: "context-optimised-code",
892
+ content: `# Context-Optimised Code
905
893
 
906
- Dust should work consistently across operating systems: Linux, macOS, and Windows.
894
+ Code should be structured so that agents can understand and modify it within their context window constraints.
907
895
 
908
- This means:
909
- - Avoiding platform-specific shell commands or syntax
910
- - Using cross-platform path handling
911
- - Testing on multiple platforms when possible
912
- - Documenting any platform-specific limitations
896
+ Large files, deeply nested abstractions, and sprawling dependency chains all work against agents. A 3,000-line file cannot be fully loaded into context. A function that requires understanding six levels of indirection demands more context than one that is self-contained. Context-optimised code favours small files, shallow abstractions, explicit dependencies, and co-located related logic.
913
897
 
914
- Cross-platform support broadens adoption and ensures teams with mixed environments can collaborate effectively.
898
+ Dust should help projects identify files that are too large, modules that are too tangled, and patterns that make agent comprehension harder than it needs to be. This is not just about file size — it is about ensuring that the unit of code an agent needs to understand fits comfortably within the window available.
915
899
 
916
900
  ## Parent Principle
917
901
 
918
- - [Easy Adoption](easy-adoption.md)
902
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
919
903
 
920
904
  ## Sub-Principles
921
905
 
@@ -923,22 +907,37 @@ Cross-platform support broadens adoption and ensures teams with mixed environmen
923
907
  `
924
908
  },
925
909
  {
926
- slug: "exploratory-tooling",
927
- content: `# Exploratory Tooling
910
+ slug: "some-big-design-up-front",
911
+ content: `# Some Big Design Up Front
928
912
 
929
- Agents need tools to efficiently explore and understand unfamiliar codebases.
913
+ AI agents lower the cost of architectural exploration, making heavier upfront investment rational during the idea phase.
930
914
 
931
- When an agent encounters a new codebase, or an unfamiliar corner of a familiar one, it needs to quickly build a mental model: what exists, how it fits together, and where to make changes. Without good exploratory tools, agents waste context on trial-and-error searches, reading irrelevant files, and forming incorrect assumptions.
915
+ Agile's rejection of "big design up front" (BDUF) was largely economic: detailed architecture was expensive to produce and often wrong. AI agents change that equation: they can explore multiple variants, prototype them, and measure trade-offs cheaply. When evaluating alternatives costs less, the expected value of avoiding large structural mistakes increases.
932
916
 
933
- Dust should promote and integrate tools that help agents explore: dependency graphs, module overviews, search utilities tuned for code navigation, and summaries of project structure. The goal is to make the "orientation" phase of any task as short and reliable as possible.
917
+ This doesn't mean returning to traditional BDUF. Uncertainty about future requirements still limits what prediction can achieve. The insight is that the optimal amount of upfront work has shifted, not that prediction became reliable.
934
918
 
935
- ## Applicability
919
+ The model is hybrid: thorough AI-assisted exploration during ideas, followed by straightforward execution during tasks. "Lightweight" refers to task-level planning, not idea-level exploration. Invest heavily in understanding alternatives during the idea phase, then decompose into atomic tasks once the direction is clear.
936
920
 
937
- Internal
921
+ ## Convergence Criteria
922
+
923
+ Exploration should continue until clear trade-offs are identified and the chosen approach can be articulated against alternatives. This is convergence-based, not time-boxed — simple ideas converge quickly, complex architectural decisions require more exploration.
924
+
925
+ When exploration feels "done":
926
+
927
+ - Multiple approaches have been considered
928
+ - Trade-offs between approaches are understood
929
+ - The chosen direction has clear justification
930
+ - Remaining uncertainty is about requirements, not design
931
+
932
+ If a task requires significant design decisions during execution, it wasn't ready to be a task.
933
+
934
+ ## Documenting Alternatives
935
+
936
+ Ideas should document the alternatives considered and why they were ruled out. This creates a decision log that helps future agents and humans understand context. Include alternatives in the idea body or Open Questions sections.
938
937
 
939
938
  ## Parent Principle
940
939
 
941
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
940
+ - [Lightweight Planning](lightweight-planning.md)
942
941
 
943
942
  ## Sub-Principles
944
943
 
@@ -946,16 +945,16 @@ Internal
946
945
  `
947
946
  },
948
947
  {
949
- slug: "reasonably-dry",
950
- content: `# Reasonably DRY
948
+ slug: "traceable-decisions",
949
+ content: `# Traceable Decisions
951
950
 
952
- Don't repeat yourself is a good principle, but don't overdo it.
951
+ The commit history should explain why changes were made, not just what changed.
953
952
 
954
- Extracting shared code too eagerly can create tight coupling, obscure intent, and make changes harder. When two pieces of code look similar but serve different purposes or are likely to evolve independently, duplication is the better choice. The cost of a wrong abstraction is higher than the cost of a little repetition. Extract shared code when the duplication is truly about the same concept and has proven stable, not just because two things happen to look alike right now.
953
+ Commit messages should capture intent and context that would otherwise be lost. Future maintainers (human or AI) will traverse history to understand the reasoning behind decisions. A commit that says "Fix bug" is less valuable than one that explains what was broken and why the fix is correct.
955
954
 
956
955
  ## Parent Principle
957
956
 
958
- - [Maintainable Codebase](maintainable-codebase.md)
957
+ - [Atomic Commits](atomic-commits.md)
959
958
 
960
959
  ## Sub-Principles
961
960
 
@@ -963,16 +962,16 @@ Extracting shared code too eagerly can create tight coupling, obscure intent, an
963
962
  `
964
963
  },
965
964
  {
966
- slug: "runtime-agnostic-tests",
967
- content: `# Runtime Agnostic Tests
965
+ slug: "fast-feedback",
966
+ content: `# Fast Feedback
968
967
 
969
- Dust's test suite should work across JavaScript runtimes.
968
+ Dust should provide fast feedback loops for developers.
970
969
 
971
- Tests should use standard JavaScript testing patterns that work across Node.js, Bun, and other runtimes. Avoiding runtime-specific test APIs ensures the project can leverage different runtimes' advantages while maintaining broad compatibility.
970
+ Scripts and tooling should execute quickly so developers can iterate rapidly. Slow feedback discourages frequent validation and leads to larger, riskier changes. Fast feedback enables small, confident steps.
972
971
 
973
972
  ## Parent Principle
974
973
 
975
- - [Minimal Dependencies](minimal-dependencies.md)
974
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
976
975
 
977
976
  ## Sub-Principles
978
977
 
@@ -980,43 +979,50 @@ Tests should use standard JavaScript testing patterns that work across Node.js,
980
979
  `
981
980
  },
982
981
  {
983
- slug: "task-first-workflow",
984
- content: `# Task-First Workflow
982
+ slug: "decoupled-code",
983
+ content: `# Decoupled Code
985
984
 
986
- Work should be captured as a task before implementation begins, creating traceability between intent and outcome.
985
+ Code should be organized into independent units with explicit dependencies.
987
986
 
988
- This discipline ensures that every change has a documented purpose. The commit history shows pairs of "Add task" followed by implementation, making it easy to understand why each change was made. It also prevents scope creep by defining boundaries before work starts.
987
+ Decoupled code is easier to test, understand, and modify. Dependencies are passed in rather than hard-coded, enabling units to be tested in isolation and composed flexibly. This reduces the blast radius of changes and makes the system more maintainable.
989
988
 
990
989
  ## Parent Principle
991
990
 
992
- - [Lightweight Planning](lightweight-planning.md)
991
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
993
992
 
994
993
  ## Sub-Principles
995
994
 
996
- - (none)
995
+ - [Dependency Injection](dependency-injection.md)
996
+ - [Stubs Over Mocks](stubs-over-mocks.md)
997
+ - [Functional Core, Imperative Shell](functional-core-imperative-shell.md)
998
+ - [Design for Testability](design-for-testability.md)
997
999
  `
998
1000
  },
999
1001
  {
1000
- slug: "agent-autonomy",
1001
- content: `# Agent Autonomy
1002
+ slug: "make-changes-with-confidence",
1003
+ content: `# Make Changes with Confidence
1002
1004
 
1003
- Dust exists to enable AI agents to produce work autonomously.
1005
+ Developers should be able to modify code without fear of breaking existing behavior.
1004
1006
 
1005
- With sufficient planning and small enough units, this works much better in practice.
1007
+ Tests, type checking, and other automated verification enable safe refactoring and evolution of the codebase. When changes break something, fast feedback identifies the problem before it spreads. This confidence encourages continuous improvement rather than fragile, stagnant code.
1006
1008
 
1007
1009
  ## Parent Principle
1008
1010
 
1009
- - [Human-AI Collaboration](human-ai-collaboration.md)
1011
+ - [Maintainable Codebase](maintainable-codebase.md)
1010
1012
 
1011
1013
  ## Sub-Principles
1012
1014
 
1013
- - [Actionable Errors](actionable-errors.md)
1014
- - [Batteries Included](batteries-included.md)
1015
- - [Agent-Agnostic Design](agent-agnostic-design.md)
1016
- - [Agent Context Inference](agent-context-inference.md)
1017
- - [Agent-Specific Enhancement](agent-specific-enhancement.md)
1018
- - [Context Window Efficiency](context-window-efficiency.md)
1019
- - [Small Units](small-units.md)
1015
+ - [Comprehensive Assertions](comprehensive-assertions.md)
1016
+ - [Decoupled Code](decoupled-code.md)
1017
+ - [Fast Feedback](fast-feedback.md)
1018
+ - [Lint Everything](lint-everything.md)
1019
+ - [Readable Test Data](readable-test-data.md)
1020
+ - [Reproducible Checks](reproducible-checks.md)
1021
+ - [Stop the Line](stop-the-line.md)
1022
+ - [Keep Unit Tests Pure](keep-unit-tests-pure.md)
1023
+ - [Test Isolation](test-isolation.md)
1024
+ - [Self-Diagnosing Tests](self-diagnosing-tests.md)
1025
+ - [Unit Test Coverage](unit-test-coverage.md)
1020
1026
  `
1021
1027
  },
1022
1028
  {
@@ -1037,18 +1043,24 @@ Abbreviated names like \`ctx\`, \`deps\`, \`fs\`, or \`args\` save a few keystro
1037
1043
  `
1038
1044
  },
1039
1045
  {
1040
- slug: "fast-feedback-loops",
1041
- content: `# Fast Feedback Loops
1046
+ slug: "agent-agnostic-design",
1047
+ content: `# Agent-Agnostic Design
1042
1048
 
1043
- The primary feedback loop (write code, run checks, see results) should be as fast as possible.
1049
+ Dust should work with multiple agents without favoring one.
1044
1050
 
1045
- Fast feedback is the foundation of productive development, for both humans and agents. When tests, linters, and type checks run in seconds rather than minutes, developers iterate more frequently and catch problems earlier. Agents especially benefit because they operate in tight loops of change-and-verify; slow feedback wastes tokens and context window space on waiting rather than working.
1051
+ Rather than implementing agents, Dust generates prompts and context that can be passed to any capable agent. This keeps Dust lightweight and allows teams to use whatever agent tooling they prefer.
1046
1052
 
1047
- Dust should help projects measure the speed of their feedback loops, identify bottlenecks, and keep them fast as the codebase grows. This includes promoting practices like unit tests over integration tests for speed, incremental compilation, and check parallelisation.
1053
+ Dust may have built-in support for invoking popular agents (Claude, Aider, Codex, etc.), but the choice of agent should always be made by the user at runtime - never hard-coded into repository configuration.
1054
+
1055
+ Note: Supporting multiple agents directly contributes to [Easy Adoption](easy-adoption.md), since teams can use their preferred agent tools without being locked into a specific platform.
1056
+
1057
+ ## Applicability
1058
+
1059
+ Internal
1048
1060
 
1049
1061
  ## Parent Principle
1050
1062
 
1051
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
1063
+ - [Agent Autonomy](agent-autonomy.md)
1052
1064
 
1053
1065
  ## Sub-Principles
1054
1066
 
@@ -1056,39 +1068,51 @@ Dust should help projects measure the speed of their feedback loops, identify bo
1056
1068
  `
1057
1069
  },
1058
1070
  {
1059
- slug: "make-the-change-easy",
1060
- content: `# Make the Change Easy
1061
-
1062
- For each desired change, make the change easy, then make the easy change.
1071
+ slug: "readable-test-data",
1072
+ content: `# Readable Test Data
1063
1073
 
1064
- This principle, articulated by Kent Beck, recognizes that the hardest part of a change is often not the change itself but the state of the code receiving it. When code resists a change, the right response is to first refactor until the change becomes straightforward, and only then make it. The warning - "this may be hard" - acknowledges that preparing the ground takes real effort, but the result is a change that fits naturally rather than one forced in against the grain.
1074
+ Test data setup should use natural structures that mirror what they represent.
1065
1075
 
1066
- Work that supports this principle includes refactoring before feature work, improving abstractions that make a category of changes simpler, and resisting the urge to bolt changes onto code that isn't ready for them.
1076
+ ## Why it matters
1067
1077
 
1068
- ## Parent Principle
1078
+ When test data is easy to read, tests become self-documenting. A file system hierarchy expressed as a nested object immediately conveys structure, while a flat Map with path strings requires mental parsing to understand the relationships.
1069
1079
 
1070
- - [Maintainable Codebase](maintainable-codebase.md)
1080
+ ## In practice
1071
1081
 
1072
- ## Sub-Principles
1082
+ Prefer literal structures that visually match the domain:
1073
1083
 
1074
- - (none)
1075
- `
1076
- },
1077
- {
1078
- slug: "self-contained-repository",
1079
- content: `# Self-Contained Repository
1084
+ \`\`\`javascript
1085
+ // Avoid: flat paths that obscure hierarchy
1086
+ const fs = createFileSystemEmulator({
1087
+ files: new Map([['/project/.dust/principles/my-goal.md', '# My Goal']]),
1088
+ existingPaths: new Set(['/project/.dust/ideas']),
1089
+ })
1080
1090
 
1081
- Where possible, developers and agents should have everything they need to be productive, within the repository.
1091
+ // Prefer: nested object that mirrors file system structure
1092
+ const fs = createFileSystemEmulator({
1093
+ project: {
1094
+ '.dust': {
1095
+ principles: {
1096
+ 'my-goal.md': '# My Goal'
1097
+ },
1098
+ ideas: {}
1099
+ }
1100
+ }
1101
+ })
1102
+ \`\`\`
1082
1103
 
1083
- No third-party tools should be required beyond those that can be installed with a single command defined in the repository. Setup instructions, scripts, configuration, and dependencies should all live in version control so that cloning the repo and running a single install command is sufficient to start working. This eliminates onboarding friction, reduces "works on my machine" issues, and is especially important for agents — who cannot browse the web to find missing tools or ask colleagues how to set things up.
1104
+ The nested form:
1105
+ - Shows parent-child relationships through indentation
1106
+ - Makes empty directories explicit with empty objects
1107
+ - Requires no mental path concatenation to understand structure
1084
1108
 
1085
- ## Applicability
1109
+ ## How to evaluate
1086
1110
 
1087
- Internal
1111
+ Work supports this principle when test setup data uses structures that visually resemble what they represent, reducing cognitive load for readers.
1088
1112
 
1089
1113
  ## Parent Principle
1090
1114
 
1091
- - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
1115
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
1092
1116
 
1093
1117
  ## Sub-Principles
1094
1118
 
@@ -1096,16 +1120,16 @@ Internal
1096
1120
  `
1097
1121
  },
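Annotation on the entry above (not part of the package diff): the nested literal in the "Prefer" example has to be turned into paths somewhere. The sketch below shows one way that conversion could work. `FileTree`, `FlatFileSystem`, and `flattenTree` are hypothetical names chosen for illustration; the diff does not show how `createFileSystemEmulator` is actually implemented.

```typescript
// Hypothetical types for illustration; not taken from the package.
type FileTree = { [name: string]: string | FileTree };

interface FlatFileSystem {
  files: Map<string, string>; // absolute path -> file content
  directories: Set<string>; // absolute paths of directories
}

// Walk the nested object: strings are files, objects are directories.
function flattenTree(tree: FileTree, base = ""): FlatFileSystem {
  const flat: FlatFileSystem = { files: new Map(), directories: new Set() };
  for (const name of Object.keys(tree)) {
    const entry = tree[name];
    const path = `${base}/${name}`;
    if (typeof entry === "string") {
      flat.files.set(path, entry);
    } else {
      // An empty object still registers as a directory.
      flat.directories.add(path);
      const nested = flattenTree(entry, path);
      nested.files.forEach((content, filePath) => flat.files.set(filePath, content));
      nested.directories.forEach((dir) => flat.directories.add(dir));
    }
  }
  return flat;
}

const flat = flattenTree({
  project: {
    ".dust": {
      principles: { "my-goal.md": "# My Goal" },
      ideas: {},
    },
  },
});
```

With a helper along these lines, tests can keep the readable nested form while the code under test still receives flat paths.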
1098
1122
  {
1099
- slug: "traceable-decisions",
1100
- content: `# Traceable Decisions
1123
+ slug: "reasonably-dry",
1124
+ content: `# Reasonably DRY
1101
1125
 
1102
- The commit history should explain why changes were made, not just what changed.
1126
+ Don't repeat yourself is a good principle, but don't overdo it.
1103
1127
 
1104
- Commit messages should capture intent and context that would otherwise be lost. Future maintainers (human or AI) will traverse history to understand the reasoning behind decisions. A commit that says "Fix bug" is less valuable than one that explains what was broken and why the fix is correct.
1128
+ Extracting shared code too eagerly can create tight coupling, obscure intent, and make changes harder. When two pieces of code look similar but serve different purposes or are likely to evolve independently, duplication is the better choice. The cost of a wrong abstraction is higher than the cost of a little repetition. Extract shared code when the duplication is truly about the same concept and has proven stable, not just because two things happen to look alike right now.
1105
1129
 
1106
1130
  ## Parent Principle
1107
1131
 
1108
- - [Atomic Commits](atomic-commits.md)
1132
+ - [Maintainable Codebase](maintainable-codebase.md)
1109
1133
 
1110
1134
  ## Sub-Principles
1111
1135
 
@@ -1113,16 +1137,21 @@ Commit messages should capture intent and context that would otherwise be lost.
1113
1137
  `
1114
1138
  },
1115
1139
  {
1116
- slug: "unit-test-coverage",
1117
- content: `# Unit Test Coverage
1140
+ slug: "actionable-errors",
1141
+ content: `# Actionable Errors
1118
1142
 
1119
- Complete unit test coverage ensures low-level tests give users direct feedback as they change the code.
1143
+ Error messages should tell you what to do next, not just what went wrong.
1120
1144
 
1121
- Excluding system tests from coverage reporting focuses attention on unit tests - the tests that provide the fastest, most specific feedback. When coverage tools only measure unit tests, developers can quickly identify which parts of the codebase lack fine-grained test protection.
1145
+ When something fails, the message should provide:
1146
+ - A clear description of the problem
1147
+ - Specific guidance on how to fix it
1148
+ - Context needed to take the next step
1149
+
1150
+ This is especially important for AI agents, who need concrete instructions to recover autonomously. A good error message turns a dead end into a signpost.
1122
1151
 
1123
1152
  ## Parent Principle
1124
1153
 
1125
- - [Make Changes with Confidence](make-changes-with-confidence.md)
1154
+ - [Agent Autonomy](agent-autonomy.md)
1126
1155
 
1127
1156
  ## Sub-Principles
1128
1157
 
@@ -1130,84 +1159,70 @@ Excluding system tests from coverage reporting focuses attention on unit tests -
1130
1159
  `
1131
1160
  },
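Annotation on the entry above (not part of the package diff): the three items an actionable message should provide can be sketched as a small formatter. The interface, function names, and example message are assumptions made for illustration, not an API that dust is known to expose.

```typescript
// Hypothetical shape mirroring the three bullets: problem, fix, context.
interface ActionableErrorDetails {
  problem: string;
  fix: string;
  context?: Record<string, string>;
}

// Render the details as a message that says what to do next.
function formatActionableError(details: ActionableErrorDetails): string {
  const lines = [`Problem: ${details.problem}`, `Fix: ${details.fix}`];
  const context = details.context ?? {};
  const keys = Object.keys(context);
  if (keys.length > 0) {
    lines.push("Context:");
    for (const key of keys) {
      lines.push(`  ${key}: ${context[key]}`);
    }
  }
  return lines.join("\n");
}

// Wrap the formatted message in a standard Error so it can be thrown.
function actionableError(details: ActionableErrorDetails): Error {
  return new Error(formatActionableError(details));
}

const example = actionableError({
  problem: 'Principle file has no "## Parent Principle" section',
  fix: 'Add a "## Parent Principle" heading followed by a link to the parent',
  context: { file: ".dust/principles/my-goal.md" },
});
```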
1132
1161
  {
1133
- slug: "decoupled-code",
1134
- content: `# Decoupled Code
1162
+ slug: "make-the-change-easy",
1163
+ content: `# Make the Change Easy
1135
1164
 
1136
- Code should be organized into independent units with explicit dependencies.
1165
+ For each desired change, make the change easy, then make the easy change.
1137
1166
 
1138
- Decoupled code is easier to test, understand, and modify. Dependencies are passed in rather than hard-coded, enabling units to be tested in isolation and composed flexibly. This reduces the blast radius of changes and makes the system more maintainable.
1167
+ This principle, articulated by Kent Beck, recognizes that the hardest part of a change is often not the change itself but the state of the code receiving it. When code resists a change, the right response is to first refactor until the change becomes straightforward, and only then make it. The warning - "this may be hard" - acknowledges that preparing the ground takes real effort, but the result is a change that fits naturally rather than one forced in against the grain.
1168
+
1169
+ Work that supports this principle includes refactoring before feature work, improving abstractions that make a category of changes simpler, and resisting the urge to bolt changes onto code that isn't ready for them.
1139
1170
 
1140
1171
  ## Parent Principle
1141
1172
 
1142
- - [Make Changes with Confidence](make-changes-with-confidence.md)
1173
+ - [Maintainable Codebase](maintainable-codebase.md)
1143
1174
 
1144
1175
  ## Sub-Principles
1145
1176
 
1146
- - [Dependency Injection](dependency-injection.md)
1147
- - [Stubs Over Mocks](stubs-over-mocks.md)
1148
- - [Functional Core, Imperative Shell](functional-core-imperative-shell.md)
1149
- - [Design for Testability](design-for-testability.md)
1177
+ - (none)
1150
1178
  `
1151
1179
  },
1152
1180
  {
1153
- slug: "lint-everything",
1154
- content: `# Lint Everything
1181
+ slug: "dependency-injection",
1182
+ content: `# Dependency Injection
1155
1183
 
1156
- Prefer static analysis over runtime checks. Every error caught by a linter is an error that never reaches tests, and every error caught by tests is an error that never reaches production.
1184
+ Avoid global mocks. Dependency injection is almost always preferable to testing code that depends directly on globals.
1157
1185
 
1158
- Lint markdown, lint types, lint formatting. If it can be checked statically, check it. Linters are fast, deterministic, and catch entire categories of bugs before code even runs.
1186
+ When code depends on global state or singletons, testing requires mocking those globals—which introduces hidden coupling, complicates test setup, and risks interference between tests. Dependency injection makes dependencies explicit: they're passed in as arguments, making the code's requirements visible and enabling tests to supply controlled implementations.
1159
1187
 
1160
- This project lints:
1161
- - TypeScript (type checking and style)
1162
- - Markdown (broken links, required sections)
1163
- - Task files (structure validation)
1164
- - Principle hierarchy (parent/child consistency)
1188
+ This approach improves testability (each test controls its own dependencies), readability (dependencies are declared upfront), and flexibility (swapping implementations doesn't require changing the consuming code). It also makes refactoring safer since dependencies are explicit rather than implicit.
1165
1189
 
1166
1190
  ## Parent Principle
1167
1191
 
1168
- - [Make Changes with Confidence](make-changes-with-confidence.md)
1192
+ - [Decoupled Code](decoupled-code.md)
1169
1193
 
1170
1194
  ## Sub-Principles
1171
1195
 
1172
- (none)
1196
+ - (none)
1173
1197
  `
1174
1198
  },
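Annotation on the entry above (not part of the package diff): a minimal sketch of passing a dependency in as an argument instead of reading a global. `Clock`, `isExpired`, and both clock objects are hypothetical names used only to illustrate the idea.

```typescript
// The dependency is declared as an interface, so callers see what is required.
interface Clock {
  now(): Date;
}

// The clock is passed in rather than calling the global Date constructor directly.
function isExpired(expiresAt: Date, clock: Clock): boolean {
  return clock.now().getTime() >= expiresAt.getTime();
}

// Production code would supply the real clock.
const systemClock: Clock = { now: () => new Date() };

// A test supplies a controlled implementation; no global mocking is needed.
const fixedClock: Clock = { now: () => new Date("2024-01-01T00:00:00Z") };

const alreadyExpired = isExpired(new Date("2023-12-31T00:00:00Z"), fixedClock);
const stillValid = isExpired(new Date("2024-06-01T00:00:00Z"), fixedClock);
```

Because each test builds its own `Clock`, tests cannot interfere with one another through shared global state.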
1175
1199
  {
1176
- slug: "maintainable-codebase",
1177
- content: `# Maintainable Codebase
1200
+ slug: "repository-hygiene",
1201
+ content: `# Repository Hygiene
1178
1202
 
1179
- The dust codebase should be easy to understand, modify, and extend.
1203
+ Dust repositories should maintain a clean, organized state with minimal noise.
1180
1204
 
1181
- This principle governs how we develop and maintain dust itself, separate from the principles that describe what dust offers its users. A well-maintained codebase enables rapid iteration, reduces bugs, and makes contributions easier.
1205
+ This includes proper gitignore configuration to exclude build artifacts, dependencies, editor files, and other generated content from version control. A well-maintained repository makes it easier for both humans and AI to navigate and understand the codebase.
1182
1206
 
1183
1207
  ## Parent Principle
1184
1208
 
1185
- - [Agentic Flow State](agentic-flow-state.md)
1209
+ - [Maintainable Codebase](maintainable-codebase.md)
1186
1210
 
1187
1211
  ## Sub-Principles
1188
1212
 
1189
- - [Make Changes with Confidence](make-changes-with-confidence.md)
1190
- - [Minimal Dependencies](minimal-dependencies.md)
1191
- - [Intuitive Directory Structure](intuitive-directory-structure.md)
1192
- - [Repository Hygiene](repository-hygiene.md)
1193
- - [Naming Matters](naming-matters.md)
1194
- - [Reasonably DRY](reasonably-dry.md)
1195
- - [Make the Change Easy](make-the-change-easy.md)
1196
- - [Boy Scout Rule](boy-scout-rule.md)
1197
- - [Broken Windows](broken-windows.md)
1213
+ - [Atomic Commits](atomic-commits.md)
1214
+ - [Trunk-Based Development](trunk-based-development.md)
1198
1215
  `
1199
1216
  },
1200
1217
  {
1201
- slug: "agent-agnostic-design",
1202
- content: `# Agent-Agnostic Design
1203
-
1204
- Dust should work with multiple agents without favoring one.
1218
+ slug: "batteries-included",
1219
+ content: `# Batteries Included
1205
1220
 
1206
- Rather than implementing agents, Dust generates prompts and context that can be passed to any capable agent. This keeps Dust lightweight and allows teams to use whatever agent tooling they prefer.
1221
+ Dust should provide everything that is required (within reason) for an agent to be productive in an arbitrary codebase.
1207
1222
 
1208
- Dust may have built-in support for invoking popular agents (Claude, Aider, Codex, etc.), but the choice of agent should always be made by the user at runtime - never hard-coded into repository configuration.
1223
+ An agent working autonomously should not be blocked because a tool or configuration is missing. For example, dust should ship custom lint rules for different linters, even though those linters are not dependencies of dust itself. If an agent needs a capability to do its job well in a typical codebase, dust should provide it out of the box.
1209
1224
 
1210
- Note: Supporting multiple agents directly contributes to [Easy Adoption](easy-adoption.md), since teams can use their preferred agent tools without being locked into a specific platform.
1225
+ This means accepting some breadth of scope (bundling configs, rules, and utilities that target external tools) in exchange for agents that can start producing useful work immediately without manual setup.
1211
1226
 
1212
1227
  ## Applicability
1213
1228
 
@@ -1218,47 +1233,25 @@ Internal
1218
1233
  - [Agent Autonomy](agent-autonomy.md)
1219
1234
 
1220
1235
  ## Sub-Principles
1221
-
1222
- - (none)
1223
1236
  `
1224
1237
  },
1225
1238
  {
1226
- slug: "easy-adoption",
1227
- content: `# Easy Adoption
1228
-
1229
- Dust should be trivially easy to adopt in any repository.
1230
-
1231
- Getting started with Dust should require minimal friction. A developer should be able to bootstrap Dust in their repository with a single command, without needing to install dependencies, configure build tools, or understand the internals.
1232
-
1233
- This lowers the barrier to entry and encourages experimentation.
1234
-
1235
- ## Parent Principle
1236
-
1237
- - [Human-AI Collaboration](human-ai-collaboration.md)
1239
+ slug: "development-traceability",
1240
+ content: `# Development Traceability
1238
1241
 
1239
- ## Sub-Principles
1242
+ Structured logging and tracing help agents understand system behaviour without resorting to ad-hoc testing cycles.
1240
1243
 
1241
- - [Cross-Platform Compatibility](cross-platform-compatibility.md)
1242
- - [Unsurprising UX](unsurprising-ux.md)
1243
- - [VCS Independence](vcs-independence.md)
1244
- `
1245
- },
1246
- {
1247
- slug: "actionable-errors",
1248
- content: `# Actionable Errors
1244
+ When something goes wrong, agents often resort to adding temporary log statements, running the code, reading the output, and repeating — a slow and wasteful debugging loop. Good traceability means the system already records what happened and why, through structured logs, trace IDs, and observable state. This lets agents diagnose issues by reading existing output rather than generating new experiments.
1249
1245
 
1250
- Error messages should tell you what to do next, not just what went wrong.
1246
+ Dust should encourage projects to adopt structured logging, promote traceability as a first-class concern, and provide tools that surface relevant trace information when agents need it.
1251
1247
 
1252
- When something fails, the message should provide:
1253
- - A clear description of the problem
1254
- - Specific guidance on how to fix it
1255
- - Context needed to take the next step
1248
+ ## Applicability
1256
1249
 
1257
- This is especially important for AI agents, who need concrete instructions to recover autonomously. A good error message turns a dead end into a signpost.
1250
+ Internal
1258
1251
 
1259
1252
  ## Parent Principle
1260
1253
 
1261
- - [Agent Autonomy](agent-autonomy.md)
1254
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
1262
1255
 
1263
1256
  ## Sub-Principles
1264
1257
 
@@ -1266,16 +1259,22 @@ This is especially important for AI agents, who need concrete instructions to re
1266
1259
  `
1267
1260
  },
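Annotation on the entry above (not part of the package diff): one possible shape for structured logs carrying a trace ID. `createLogger`, `Sink`, and the field names are illustrative assumptions; the principle does not prescribe a log format, and timestamps are left out here to keep the output deterministic.

```typescript
// A sink receives one serialized log line; injected so tests can capture output.
type Sink = (line: string) => void;

// Returns a logging function that stamps every event with the same trace ID.
function createLogger(traceId: string, sink: Sink) {
  return (event: string, fields: Record<string, unknown> = {}): void => {
    // One JSON object per line so the output can be filtered by machine.
    sink(JSON.stringify({ traceId, event, ...fields }));
  };
}

const captured: string[] = [];
const log = createLogger("trace-1", (line) => captured.push(line));

log("task.started", { task: "add-lint-rule" });
log("check.failed", { check: "lint", exitCode: 1 });
```

An agent reading this output can select every line for a given `traceId` instead of adding temporary log statements and rerunning the code.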
1268
1261
  {
1269
- slug: "consistent-naming",
1270
- content: `# Consistent Naming
1262
+ slug: "exploratory-tooling",
1263
+ content: `# Exploratory Tooling
1271
1264
 
1272
- Names should follow established conventions within each category to reduce cognitive load.
1265
+ Agents need tools to efficiently explore and understand unfamiliar codebases.
1273
1266
 
1274
- Principles use Title Case. File names use kebab-case. Commands use lowercase with hyphens. When naming conventions exist, follow them. When they don't, establish one and apply it consistently. Inconsistent naming creates friction for both humans and AI agents trying to predict or recall identifiers.
1267
+ When an agent encounters a new codebase, or an unfamiliar corner of a familiar one, it needs to quickly build a mental model: what exists, how it fits together, and where to make changes. Without good exploratory tools, agents waste context on trial-and-error searches, reading irrelevant files, and forming incorrect assumptions.
1268
+
1269
+ Dust should promote and integrate tools that help agents explore: dependency graphs, module overviews, search utilities tuned for code navigation, and summaries of project structure. The goal is to make the "orientation" phase of any task as short and reliable as possible.
1270
+
1271
+ ## Applicability
1272
+
1273
+ Internal
1275
1274
 
1276
1275
  ## Parent Principle
1277
1276
 
1278
- - [Naming Matters](naming-matters.md)
1277
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
1279
1278
 
1280
1279
  ## Sub-Principles
1281
1280
 
@@ -1283,54 +1282,57 @@ Principles use Title Case. File names use kebab-case. Commands use lowercase wit
1283
1282
  `
1284
1283
  },
1285
1284
  {
1286
- slug: "minimal-dependencies",
1287
- content: `# Minimal Dependencies
1285
+ slug: "small-units",
1286
+ content: `# Small Units
1288
1287
 
1289
- Dust should avoid coupling to specific tools so we can switch to better alternatives as they emerge.
1288
+ Ideas, principles, facts, and tasks should each be as discrete and fine-grained as possible.
1290
1289
 
1291
- By keeping dependencies minimal and using standard APIs where possible, we maintain the freedom to adopt new tools without major rewrites. This applies to runtimes, test frameworks, build tools, and other infrastructure choices.
1290
+ Small, focused documents enable precise relationships between them. A task can link to exactly the principles it serves. A fact can describe one specific aspect of the system. This granularity reduces ambiguity.
1291
+
1292
+ Tasks especially benefit from being small. A narrowly scoped task gives agents or humans the best chance of delivering exactly what was intended, in a single atomic commit.
1293
+
1294
+ Note: This principle directly supports [Lightweight Planning](lightweight-planning.md), which explicitly mentions that "Tasks are small and completable in single commits."
1292
1295
 
1293
1296
  ## Parent Principle
1294
1297
 
1295
- - [Maintainable Codebase](maintainable-codebase.md)
1298
+ - [Agent Autonomy](agent-autonomy.md)
1296
1299
 
1297
1300
  ## Sub-Principles
1298
1301
 
1299
- - [Runtime Agnostic Tests](runtime-agnostic-tests.md)
1302
+ - (none)
1300
1303
  `
1301
1304
  },
1302
1305
  {
1303
- slug: "context-window-efficiency",
1304
- content: `# Context Window Efficiency
1305
-
1306
- Dust should be designed with short attention spans in mind.
1306
+ slug: "naming-matters",
1307
+ content: `# Naming Matters
1307
1308
 
1308
- AI agents operate within limited context windows. Every token consumed by planning artifacts is a token unavailable for reasoning about code. Dust keeps artifacts concise and scannable so agents can quickly understand what needs to be done without wading through verbose documentation.
1309
+ Good naming reduces waste by eliminating confusion and making code self-documenting.
1309
1310
 
1310
- This means favoring brevity over completeness, using consistent structures that are fast to parse, and avoiding redundant information across files.
1311
+ Poor names cause rework, bugs, and communication overhead. When names don't clearly convey meaning, developers waste time deciphering code, misunderstand intentions, and introduce defects. Well-chosen names serve as documentation that never goes stale, reducing the need for explanatory comments and enabling both humans and AI agents to navigate the codebase efficiently.
1311
1312
 
1312
1313
  ## Parent Principle
1313
1314
 
1314
- - [Agent Autonomy](agent-autonomy.md)
1315
+ - [Maintainable Codebase](maintainable-codebase.md)
1315
1316
 
1316
1317
  ## Sub-Principles
1317
1318
 
1318
- - [Progressive Disclosure](progressive-disclosure.md)
1319
+ - [Consistent Naming](consistent-naming.md)
1320
+ - [Clarity Over Brevity](clarity-over-brevity.md)
1319
1321
  `
1320
1322
  },
1321
1323
  {
1322
- slug: "boy-scout-rule",
1323
- content: `# Boy Scout Rule
1324
+ slug: "comprehensive-test-coverage",
1325
+ content: `# Comprehensive Test Coverage
1324
1326
 
1325
- Always leave the code better than you found it.
1327
+ A project's test suite is its primary safety net, and agents depend on it even more than humans do.
1326
1328
 
1327
- When working in any area of the codebase, take the opportunity to make small improvements (clearer names, removed dead code, better structure) even if they're not directly related to the task at hand. These incremental improvements compound over time, preventing gradual decay and keeping the codebase healthy without requiring dedicated cleanup efforts.
1329
+ Agents cannot manually verify that their changes work. They rely entirely on automated tests to confirm correctness. Gaps in test coverage become gaps in agent capability: areas where changes are risky and feedback is absent. Comprehensive coverage means every meaningful behaviour is tested, so agents can make changes anywhere in the codebase with confidence.
1328
1330
 
1329
- The Boy Scout Rule is not a license for large-scale refactoring during unrelated work. Improvements should be small, obvious, and low-risk. If a cleanup is too large to include alongside the current task, capture it as a separate task instead.
1331
+ Dust should help projects measure and improve their test coverage, flag untested areas, and encourage a culture where new code comes with new tests.
1330
1332
 
1331
1333
  ## Parent Principle
1332
1334
 
1333
- - [Maintainable Codebase](maintainable-codebase.md)
1335
+ - [Ideal Agent Developer Experience](ideal-agent-developer-experience.md)
1334
1336
 
1335
1337
  ## Sub-Principles
1336
1338
 
@@ -1338,18 +1340,16 @@ The Boy Scout Rule is not a license for large-scale refactoring during unrelated
1338
1340
  `
1339
1341
  },
1340
1342
  {
1341
- slug: "unsurprising-ux",
1342
- content: `# Unsurprising UX
1343
-
1344
- The user interface should be as "guessable" as possible.
1343
+ slug: "stop-the-line",
1344
+ content: `# Stop the Line
1345
1345
 
1346
- Following the [Principle of Least Astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment), users form expectations about how a tool will behave based on conventions, prior experience, and intuition. Dust's interface (including the CLI) should match those expectations wherever possible. If users are observed trying to use the interface in ways we didn't anticipate, the interface should be adjusted to meet their expectations — even if that means supporting many ways of achieving the same result.
1346
+ Any worker, human or agent, should halt and fix a problem the moment they detect it, rather than letting defects propagate downstream.
1347
1347
 
1348
- Surprising behavior erodes trust and slows people down. Unsurprising behavior lets users stay in flow.
1348
+ Originating from the Toyota production system, "Stop the Line" empowers every participant to pause work immediately upon identifying a defect, failing check, or safety hazard. Problems are cheaper to fix at their source than after they've compounded through later stages. In the context of dust, this means agents and humans alike should treat broken checks, test failures, and lint errors as blockers that demand immediate attention — not warnings to be deferred.
1349
1349
 
1350
1350
  ## Parent Principle
1351
1351
 
1352
- - [Easy Adoption](easy-adoption.md)
1352
+ - [Make Changes with Confidence](make-changes-with-confidence.md)
1353
1353
 
1354
1354
  ## Sub-Principles
1355
1355