arscontexta 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/.claude-plugin/marketplace.json +11 -0
  2. package/.claude-plugin/plugin.json +22 -0
  3. package/README.md +683 -0
  4. package/agents/knowledge-guide.md +49 -0
  5. package/bin/cli.mjs +66 -0
  6. package/generators/agents-md.md +240 -0
  7. package/generators/claude-md.md +379 -0
  8. package/generators/features/atomic-notes.md +124 -0
  9. package/generators/features/ethical-guardrails.md +58 -0
  10. package/generators/features/graph-analysis.md +188 -0
  11. package/generators/features/helper-functions.md +92 -0
  12. package/generators/features/maintenance.md +164 -0
  13. package/generators/features/methodology-knowledge.md +70 -0
  14. package/generators/features/mocs.md +144 -0
  15. package/generators/features/multi-domain.md +61 -0
  16. package/generators/features/personality.md +71 -0
  17. package/generators/features/processing-pipeline.md +428 -0
  18. package/generators/features/schema.md +149 -0
  19. package/generators/features/self-evolution.md +229 -0
  20. package/generators/features/self-space.md +78 -0
  21. package/generators/features/semantic-search.md +99 -0
  22. package/generators/features/session-rhythm.md +85 -0
  23. package/generators/features/templates.md +85 -0
  24. package/generators/features/wiki-links.md +88 -0
  25. package/generators/soul-md.md +121 -0
  26. package/hooks/hooks.json +45 -0
  27. package/hooks/scripts/auto-commit.sh +44 -0
  28. package/hooks/scripts/session-capture.sh +35 -0
  29. package/hooks/scripts/session-orient.sh +86 -0
  30. package/hooks/scripts/write-validate.sh +42 -0
  31. package/methodology/AI shifts knowledge systems from externalizing memory to externalizing attention.md +59 -0
  32. package/methodology/BM25 retrieval fails on full-length descriptions because query term dilution reduces match scores.md +39 -0
  33. package/methodology/IBIS framework maps claim-based architecture to structured argumentation.md +58 -0
  34. package/methodology/LLM attention degrades as context fills.md +49 -0
  35. package/methodology/MOC construction forces synthesis that automated generation from metadata cannot replicate.md +49 -0
  36. package/methodology/MOC maintenance investment compounds because orientation savings multiply across every future session.md +41 -0
  37. package/methodology/MOCs are attention management devices not just organizational tools.md +51 -0
  38. package/methodology/PKM failure follows a predictable cycle.md +50 -0
  39. package/methodology/ThreadMode to DocumentMode transformation is the core value creation step.md +52 -0
  40. package/methodology/WIP limits force processing over accumulation.md +53 -0
  41. package/methodology/Zeigarnik effect validates capture-first philosophy because open loops drain attention.md +42 -0
  42. package/methodology/academic research uses structured extraction with cross-source synthesis.md +566 -0
  43. package/methodology/adapt the four-phase processing pipeline to domain-specific throughput needs.md +197 -0
  44. package/methodology/agent notes externalize navigation intuition that search cannot discover and traversal cannot reconstruct.md +48 -0
  45. package/methodology/agent self-memory should be architecturally separate from user knowledge systems.md +48 -0
  46. package/methodology/agent session boundaries create natural automation checkpoints that human-operated systems lack.md +56 -0
  47. package/methodology/agent-cognition.md +107 -0
  48. package/methodology/agents are simultaneously methodology executors and subjects creating a unique trust asymmetry.md +66 -0
  49. package/methodology/aspect-oriented programming solved the same cross-cutting concern problem that hooks solve.md +39 -0
  50. package/methodology/associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles.md +53 -0
  51. package/methodology/attention residue may have a minimum granularity that cannot be subdivided.md +46 -0
  52. package/methodology/auto-commit hooks eliminate prospective memory failures by converting remember-to-act into guaranteed execution.md +47 -0
  53. package/methodology/automated detection is always safe because it only reads state while automated remediation risks content corruption.md +42 -0
  54. package/methodology/automation should be retired when its false positive rate exceeds its true positive rate or it catches zero issues.md +56 -0
  55. package/methodology/backlinks implicitly define notes by revealing usage context.md +35 -0
  56. package/methodology/backward maintenance asks what would be different if written today.md +62 -0
  57. package/methodology/balance onboarding enforcement and questions to prevent premature complexity.md +229 -0
  58. package/methodology/basic level categorization determines optimal MOC granularity.md +51 -0
  59. package/methodology/batching by context similarity reduces switching costs in agent processing.md +43 -0
  60. package/methodology/behavioral anti-patterns matter more than tool selection.md +42 -0
  61. package/methodology/betweenness centrality identifies bridge notes connecting disparate knowledge domains.md +57 -0
  62. package/methodology/blueprints that teach construction outperform downloads that provide pre-built code for platform-dependent modules.md +42 -0
  63. package/methodology/bootstrapping principle enables self-improving systems.md +62 -0
  64. package/methodology/build automatic memory through cognitive offloading and session handoffs.md +285 -0
  65. package/methodology/capture the reaction to content not just the content itself.md +41 -0
  66. package/methodology/claims must be specific enough to be wrong.md +36 -0
  67. package/methodology/closure rituals create clean breaks that prevent attention residue bleed.md +44 -0
  68. package/methodology/cognitive offloading is the architectural foundation for vault design.md +46 -0
  69. package/methodology/cognitive outsourcing risk in agent-operated systems.md +55 -0
  70. package/methodology/coherence maintains consistency despite inconsistent inputs.md +96 -0
  71. package/methodology/coherent architecture emerges from wiki links spreading activation and small-world topology.md +48 -0
  72. package/methodology/community detection algorithms can inform when MOCs should split or merge.md +52 -0
  73. package/methodology/complete navigation requires four complementary types that no single mechanism provides.md +43 -0
  74. package/methodology/complex systems evolve from simple working systems.md +59 -0
  75. package/methodology/composable knowledge architecture builds systems from independent toggleable modules not monolithic templates.md +61 -0
  76. package/methodology/compose multi-domain systems through separate templates and shared graph.md +372 -0
  77. package/methodology/concept-orientation beats source-orientation for cross-domain connections.md +51 -0
  78. package/methodology/confidence thresholds gate automated action between the mechanical and judgment zones.md +50 -0
  79. package/methodology/configuration dimensions interact so choices in one create pressure on others.md +58 -0
  80. package/methodology/configuration paralysis emerges when derivation surfaces too many decisions.md +44 -0
  81. package/methodology/context files function as agent operating systems through self-referential self-extension.md +46 -0
  82. package/methodology/context phrase clarity determines how deep a navigation hierarchy can scale.md +46 -0
  83. package/methodology/continuous small-batch processing eliminates review dread.md +48 -0
  84. package/methodology/controlled disorder engineers serendipity through semantic rather than topical linking.md +51 -0
  85. package/methodology/creative writing uses worldbuilding consistency with character tracking.md +672 -0
  86. package/methodology/cross-links between MOC territories indicate creative leaps and integration depth.md +43 -0
  87. package/methodology/dangling links reveal which notes want to exist.md +62 -0
  88. package/methodology/data exit velocity measures how quickly content escapes vendor lock-in.md +74 -0
  89. package/methodology/decontextualization risk means atomicity may strip meaning that cannot be recovered.md +48 -0
  90. package/methodology/dense interlinked research claims enable derivation while sparse references only enable templating.md +47 -0
  91. package/methodology/dependency resolution through topological sort makes module composition transparent and verifiable.md +56 -0
  92. package/methodology/derivation generates knowledge systems from composable research claims not template customization.md +63 -0
  93. package/methodology/derivation-engine.md +27 -0
  94. package/methodology/derived systems follow a seed-evolve-reseed lifecycle.md +56 -0
  95. package/methodology/description quality for humans diverges from description quality for keyword search.md +73 -0
  96. package/methodology/descriptions are retrieval filters not summaries.md +112 -0
  97. package/methodology/design MOCs as attention management devices with lifecycle governance.md +318 -0
  98. package/methodology/design-dimensions.md +66 -0
  99. package/methodology/digital mutability enables note evolution that physical permanence forbids.md +54 -0
  100. package/methodology/discovery-retrieval.md +48 -0
  101. package/methodology/distinctiveness scoring treats description quality as measurable.md +69 -0
  102. package/methodology/does agent processing recover what fast capture loses.md +43 -0
  103. package/methodology/domain-compositions.md +37 -0
  104. package/methodology/dual-coding with visual elements could enhance agent traversal.md +55 -0
  105. package/methodology/each module must be describable in one sentence under 200 characters or it does too many things.md +45 -0
  106. package/methodology/each new note compounds value by creating traversal paths.md +55 -0
  107. package/methodology/eight configuration dimensions parameterize the space of possible knowledge systems.md +56 -0
  108. package/methodology/elaborative encoding is the quality gate for new notes.md +55 -0
  109. package/methodology/enforce schema with graduated strictness across capture processing and query zones.md +221 -0
  110. package/methodology/enforcing atomicity can create paralysis when ideas resist decomposition.md +43 -0
  111. package/methodology/engineering uses technical decision tracking with architectural memory.md +766 -0
  112. package/methodology/every knowledge domain shares a four-phase processing skeleton that diverges only in the process step.md +53 -0
  113. package/methodology/evolution observations provide actionable signals for system adaptation.md +67 -0
  114. package/methodology/external memory shapes cognition more than base model.md +60 -0
  115. package/methodology/faceted classification treats notes as multi-dimensional objects rather than folder contents.md +65 -0
  116. package/methodology/failure-modes.md +27 -0
  117. package/methodology/false universalism applies same processing logic regardless of domain.md +49 -0
  118. package/methodology/federated wiki pattern enables multi-agent divergence as feature not bug.md +59 -0
  119. package/methodology/flat files break at retrieval scale.md +75 -0
  120. package/methodology/forced engagement produces weak connections.md +48 -0
  121. package/methodology/four abstraction layers separate platform-agnostic from platform-dependent knowledge system features.md +47 -0
  122. package/methodology/fresh context per task preserves quality better than chaining phases.md +44 -0
  123. package/methodology/friction reveals architecture.md +63 -0
  124. package/methodology/friction-driven module adoption prevents configuration debt by adding complexity only at pain points.md +48 -0
  125. package/methodology/gardening cycle implements tend prune fertilize operations.md +41 -0
  126. package/methodology/generation effect gate blocks processing without transformation.md +40 -0
  127. package/methodology/goal-driven memory orchestration enables autonomous domain learning through directed compute allocation.md +41 -0
  128. package/methodology/good descriptions layer heuristic then mechanism then implication.md +57 -0
  129. package/methodology/graph-structure.md +65 -0
  130. package/methodology/guided notes might outperform post-hoc structuring for high-volume capture.md +37 -0
  131. package/methodology/health wellness uses symptom-trigger correlation with multi-dimensional tracking.md +819 -0
  132. package/methodology/hook composition creates emergent methodology from independent single-concern components.md +47 -0
  133. package/methodology/hook enforcement guarantees quality while instruction enforcement merely suggests it.md +51 -0
  134. package/methodology/hook-driven learning loops create self-improving methodology through observation accumulation.md +62 -0
  135. package/methodology/hooks are the agent habit system that replaces the missing basal ganglia.md +40 -0
  136. package/methodology/hooks cannot replace genuine cognitive engagement yet more automation is always tempting.md +87 -0
  137. package/methodology/hooks enable context window efficiency by delegating deterministic checks to external processes.md +47 -0
  138. package/methodology/idempotent maintenance operations are safe to automate because running them twice produces the same result as running them once.md +44 -0
  139. package/methodology/implement condition-based maintenance triggers for derived systems.md +255 -0
  140. package/methodology/implicit dependencies create distributed monoliths that fail silently across configurations.md +58 -0
  141. package/methodology/implicit knowledge emerges from traversal.md +55 -0
  142. package/methodology/incremental formalization happens through repeated touching of old notes.md +60 -0
  143. package/methodology/incremental reading enables cross-source connection finding.md +39 -0
  144. package/methodology/index.md +32 -0
  145. package/methodology/inline links carry richer relationship data than metadata fields.md +91 -0
  146. package/methodology/insight accretion differs from productivity in knowledge systems.md +41 -0
  147. package/methodology/intermediate packets enable assembly over creation.md +52 -0
  148. package/methodology/intermediate representation pattern enables reliable vault operations beyond regex.md +62 -0
  149. package/methodology/justification chains enable forward backward and evolution reasoning about configuration decisions.md +46 -0
  150. package/methodology/knowledge system architecture is parameterized by platform capabilities not fixed by methodology.md +51 -0
  151. package/methodology/knowledge systems become communication partners through complexity and memory humans cannot sustain.md +47 -0
  152. package/methodology/knowledge systems share universal operations and structural components across all methodology traditions.md +46 -0
  153. package/methodology/legal case management uses precedent chains with regulatory change propagation.md +892 -0
  154. package/methodology/live index via periodic regeneration keeps discovery current.md +58 -0
  155. package/methodology/local-first file formats are inherently agent-native.md +69 -0
  156. package/methodology/logic column pattern separates reasoning from procedure.md +35 -0
  157. package/methodology/maintenance operations are more universal than creative pipelines because structural health is domain-invariant.md +47 -0
  158. package/methodology/maintenance scheduling frequency should match consequence speed not detection capability.md +50 -0
  159. package/methodology/maintenance targeting should prioritize mechanism and theory notes.md +26 -0
  160. package/methodology/maintenance-patterns.md +72 -0
  161. package/methodology/markdown plus YAML plus ripgrep implements a queryable graph database without infrastructure.md +55 -0
  162. package/methodology/maturity field enables agent context prioritization.md +33 -0
  163. package/methodology/memory-architecture.md +27 -0
  164. package/methodology/metacognitive confidence can diverge from retrieval capability.md +42 -0
  165. package/methodology/metadata reduces entropy enabling precision over recall.md +91 -0
  166. package/methodology/methodology development should follow the trajectory from documentation to skill to hook as understanding hardens.md +80 -0
  167. package/methodology/methodology traditions are named points in a shared configuration space not competing paradigms.md +64 -0
  168. package/methodology/mnemonic medium embeds verification into navigation.md +46 -0
  169. package/methodology/module communication through shared YAML fields creates loose coupling without direct dependencies.md +44 -0
  170. package/methodology/module deactivation must account for structural artifacts that survive the toggle.md +49 -0
  171. package/methodology/multi-domain systems compose through separate templates and shared graph.md +61 -0
  172. package/methodology/multi-domain-composition.md +27 -0
  173. package/methodology/narrow folksonomy optimizes for single-operator retrieval unlike broad consensus tagging.md +53 -0
  174. package/methodology/navigation infrastructure passes through distinct scaling regimes that require qualitative strategy shifts.md +48 -0
  175. package/methodology/navigational vertigo emerges in pure association systems without local hierarchy.md +54 -0
  176. package/methodology/note titles should function as APIs enabling sentence transclusion.md +51 -0
  177. package/methodology/note-design.md +57 -0
  178. package/methodology/notes are skills — curated knowledge injected when relevant.md +62 -0
  179. package/methodology/notes function as cognitive anchors that stabilize attention during complex tasks.md +41 -0
  180. package/methodology/novel domains derive by mapping knowledge type to closest reference domain then adapting.md +50 -0
  181. package/methodology/nudge theory explains graduated hook enforcement as choice architecture for agents.md +59 -0
  182. package/methodology/observation and tension logs function as dead-letter queues for failed automation.md +51 -0
  183. package/methodology/operational memory and knowledge memory serve different functions in agent architecture.md +48 -0
  184. package/methodology/operational wisdom requires contextual observation.md +52 -0
  185. package/methodology/orchestrated vault creation transforms arscontexta from tool to autonomous knowledge factory.md +40 -0
  186. package/methodology/organic emergence versus active curation creates a fundamental vault governance tension.md +68 -0
  187. package/methodology/orphan notes are seeds not failures.md +38 -0
  188. package/methodology/over-automation corrupts quality when hooks encode judgment rather than verification.md +62 -0
  189. package/methodology/people relationships uses Dunbar-layered graphs with interaction tracking.md +659 -0
  190. package/methodology/personal assistant uses life area management with review automation.md +610 -0
  191. package/methodology/platform adapter translation is semantic not mechanical because hook event meanings differ.md +40 -0
  192. package/methodology/platform capability tiers determine which knowledge system features can be implemented.md +48 -0
  193. package/methodology/platform fragmentation means identical conceptual operations require different implementations across agent environments.md +44 -0
  194. package/methodology/premature complexity is the most common derivation failure mode.md +45 -0
  195. package/methodology/prevent domain-specific failure modes through the vulnerability matrix.md +336 -0
  196. package/methodology/processing effort should follow retrieval demand.md +57 -0
  197. package/methodology/processing-workflows.md +75 -0
  198. package/methodology/product management uses feedback pipelines with experiment tracking.md +789 -0
  199. package/methodology/productivity porn risk in meta-system building.md +30 -0
  200. package/methodology/programmable notes could enable property-triggered workflows.md +64 -0
  201. package/methodology/progressive disclosure means reading right not reading less.md +69 -0
  202. package/methodology/progressive schema validates only what active modules require not the full system schema.md +49 -0
  203. package/methodology/project management uses decision tracking with stakeholder context.md +776 -0
  204. package/methodology/propositional link semantics transform wiki links from associative to reasoned.md +87 -0
  205. package/methodology/prospective memory requires externalization.md +53 -0
  206. package/methodology/provenance tracks where beliefs come from.md +62 -0
  207. package/methodology/queries evolve during search so agents should checkpoint.md +35 -0
  208. package/methodology/question-answer metadata enables inverted search patterns.md +39 -0
  209. package/methodology/random note resurfacing prevents write-only memory.md +33 -0
  210. package/methodology/reconciliation loops that compare desired state to actual state enable drift correction without continuous monitoring.md +59 -0
  211. package/methodology/reflection synthesizes existing notes into new insight.md +100 -0
  212. package/methodology/retrieval utility should drive design over capture completeness.md +69 -0
  213. package/methodology/retrieval verification loop tests description quality at scale.md +81 -0
  214. package/methodology/role field makes graph structure explicit.md +94 -0
  215. package/methodology/scaffolding enables divergence that fine-tuning cannot.md +67 -0
  216. package/methodology/schema enforcement via validation agents enables soft consistency.md +60 -0
  217. package/methodology/schema evolution follows observe-then-formalize not design-then-enforce.md +65 -0
  218. package/methodology/schema field names are the only domain specific element in the universal note pattern.md +46 -0
  219. package/methodology/schema fields should use domain-native vocabulary not abstract terminology.md +47 -0
  220. package/methodology/schema templates reduce cognitive overhead at capture time.md +55 -0
  221. package/methodology/schema validation hooks externalize inhibitory control that degrades under cognitive load.md +48 -0
  222. package/methodology/schema-enforcement.md +27 -0
  223. package/methodology/self-extension requires context files to contain platform operations knowledge not just methodology.md +47 -0
  224. package/methodology/sense-making vs storage does compression lose essential nuance.md +73 -0
  225. package/methodology/session boundary hooks implement cognitive bookends for orientation and reflection.md +60 -0
  226. package/methodology/session handoff creates continuity without persistent memory.md +43 -0
  227. package/methodology/session outputs are packets for future selves.md +43 -0
  228. package/methodology/session transcript mining enables experiential validation that structural tests cannot provide.md +38 -0
  229. package/methodology/skill context budgets constrain knowledge system complexity on agent platforms.md +52 -0
  230. package/methodology/skills encode methodology so manual execution bypasses quality gates.md +50 -0
  231. package/methodology/small-world topology requires hubs and dense local links.md +99 -0
  232. package/methodology/source attribution enables tracing claims to foundations.md +38 -0
  233. package/methodology/spaced repetition scheduling could optimize vault maintenance.md +44 -0
  234. package/methodology/spreading activation models how agents should traverse.md +79 -0
  235. package/methodology/stale navigation actively misleads because agents trust curated maps completely.md +43 -0
  236. package/methodology/stigmergy coordinates agents through environmental traces without direct communication.md +62 -0
  237. package/methodology/storage versus thinking distinction determines which tool patterns apply.md +56 -0
  238. package/methodology/structure enables navigation without reading everything.md +52 -0
  239. package/methodology/structure without processing provides no value.md +56 -0
  240. package/methodology/student learning uses prerequisite graphs with spaced retrieval.md +770 -0
  241. package/methodology/summary coherence tests composability before filing.md +37 -0
  242. package/methodology/tag rot applies to wiki links because titles serve as both identifier and display text.md +50 -0
  243. package/methodology/temporal media must convert to spatial text for agent traversal.md +43 -0
  244. package/methodology/temporal processing priority creates age-based inbox urgency.md +45 -0
  245. package/methodology/temporal separation of capture and processing preserves context freshness.md +39 -0
  246. package/methodology/ten universal primitives form the kernel of every viable agent knowledge system.md +162 -0
  247. package/methodology/testing effect could enable agent knowledge verification.md +38 -0
  248. package/methodology/the AgentSkills standard embodies progressive disclosure at the skill level.md +40 -0
  249. package/methodology/the derivation engine improves recursively as deployed systems generate observations.md +49 -0
  250. package/methodology/the determinism boundary separates hook methodology from skill methodology.md +46 -0
  251. package/methodology/the fix-versus-report decision depends on determinism reversibility and accumulated trust.md +45 -0
  252. package/methodology/the generation effect requires active transformation not just storage.md +57 -0
  253. package/methodology/the no wrong patches guarantee ensures any valid module combination produces a valid system.md +58 -0
  254. package/methodology/the system is the argument.md +46 -0
  255. package/methodology/the vault constitutes identity for agents.md +86 -0
  256. package/methodology/the vault methodology transfers because it encodes cognitive science not domain specifics.md +47 -0
  257. package/methodology/therapy journal uses warm personality with pattern detection for emotional processing.md +584 -0
  258. package/methodology/three capture schools converge through agent-mediated synthesis.md +55 -0
  259. package/methodology/three concurrent maintenance loops operate at different timescales to catch different classes of problems.md +56 -0
  260. package/methodology/throughput matters more than accumulation.md +58 -0
  261. package/methodology/title as claim enables traversal as reasoning.md +50 -0
  262. package/methodology/topological organization beats temporal for knowledge work.md +52 -0
  263. package/methodology/trading uses conviction tracking with thesis-outcome correlation.md +699 -0
  264. package/methodology/trails transform ephemeral navigation into persistent artifacts.md +39 -0
  265. package/methodology/transform universal vocabulary to domain-native language through six levels.md +259 -0
  266. package/methodology/type field enables structured queries without folder hierarchies.md +53 -0
  267. package/methodology/use-case presets dissolve the tension between composability and simplicity.md +44 -0
  268. package/methodology/vault conventions may impose hidden rigidity on thinking.md +44 -0
  269. package/methodology/verbatim risk applies to agents too.md +31 -0
  270. package/methodology/vibe notetaking is the emerging industry consensus for AI-native self-organization.md +56 -0
  271. package/methodology/vivid memories need verification.md +45 -0
  272. package/methodology/vocabulary-transformation.md +27 -0
  273. package/methodology/voice capture is the highest-bandwidth channel for agent-delegated knowledge systems.md +45 -0
  274. package/methodology/wiki links are the digital evolution of analog indexing.md +73 -0
  275. package/methodology/wiki links as social contract transforms agents into stewards of incomplete references.md +52 -0
  276. package/methodology/wiki links create navigation paths that shape retrieval.md +63 -0
  277. package/methodology/wiki links implement GraphRAG without the infrastructure.md +101 -0
  278. package/methodology/writing for audience blocks authentic creation.md +22 -0
  279. package/methodology/you operate a system that takes notes.md +79 -0
  280. package/openclaw/SKILL.md +110 -0
  281. package/package.json +45 -0
  282. package/platforms/README.md +51 -0
  283. package/platforms/claude-code/generator.md +61 -0
  284. package/platforms/claude-code/hooks/README.md +186 -0
  285. package/platforms/claude-code/hooks/auto-commit.sh.template +38 -0
  286. package/platforms/claude-code/hooks/session-capture.sh.template +72 -0
  287. package/platforms/claude-code/hooks/session-orient.sh.template +189 -0
  288. package/platforms/claude-code/hooks/write-validate.sh.template +106 -0
  289. package/platforms/openclaw/generator.md +82 -0
  290. package/platforms/openclaw/hooks/README.md +89 -0
  291. package/platforms/openclaw/hooks/bootstrap.ts.template +224 -0
  292. package/platforms/openclaw/hooks/command-new.ts.template +165 -0
  293. package/platforms/openclaw/hooks/heartbeat.ts.template +214 -0
  294. package/platforms/shared/features/README.md +70 -0
  295. package/platforms/shared/skill-blocks/graph.md +145 -0
  296. package/platforms/shared/skill-blocks/learn.md +119 -0
  297. package/platforms/shared/skill-blocks/next.md +131 -0
  298. package/platforms/shared/skill-blocks/pipeline.md +326 -0
  299. package/platforms/shared/skill-blocks/ralph.md +616 -0
  300. package/platforms/shared/skill-blocks/reduce.md +1142 -0
  301. package/platforms/shared/skill-blocks/refactor.md +129 -0
  302. package/platforms/shared/skill-blocks/reflect.md +780 -0
  303. package/platforms/shared/skill-blocks/remember.md +524 -0
  304. package/platforms/shared/skill-blocks/rethink.md +574 -0
  305. package/platforms/shared/skill-blocks/reweave.md +680 -0
  306. package/platforms/shared/skill-blocks/seed.md +320 -0
  307. package/platforms/shared/skill-blocks/stats.md +145 -0
  308. package/platforms/shared/skill-blocks/tasks.md +171 -0
  309. package/platforms/shared/skill-blocks/validate.md +323 -0
  310. package/platforms/shared/skill-blocks/verify.md +562 -0
  311. package/platforms/shared/templates/README.md +35 -0
  312. package/presets/experimental/categories.yaml +1 -0
  313. package/presets/experimental/preset.yaml +38 -0
  314. package/presets/experimental/starter/README.md +7 -0
  315. package/presets/experimental/vocabulary.yaml +7 -0
  316. package/presets/personal/categories.yaml +7 -0
  317. package/presets/personal/preset.yaml +41 -0
  318. package/presets/personal/starter/goals.md +21 -0
  319. package/presets/personal/starter/index.md +17 -0
  320. package/presets/personal/starter/life-areas.md +21 -0
  321. package/presets/personal/starter/people.md +21 -0
  322. package/presets/personal/vocabulary.yaml +32 -0
  323. package/presets/research/categories.yaml +8 -0
  324. package/presets/research/preset.yaml +41 -0
  325. package/presets/research/starter/index.md +17 -0
  326. package/presets/research/starter/methods.md +21 -0
  327. package/presets/research/starter/open-questions.md +21 -0
  328. package/presets/research/vocabulary.yaml +33 -0
  329. package/reference/AUDIT-REPORT.md +238 -0
  330. package/reference/claim-map.md +172 -0
  331. package/reference/components.md +327 -0
  332. package/reference/conversation-patterns.md +542 -0
  333. package/reference/derivation-validation.md +649 -0
  334. package/reference/dimension-claim-map.md +134 -0
  335. package/reference/evolution-lifecycle.md +297 -0
  336. package/reference/failure-modes.md +235 -0
  337. package/reference/interaction-constraints.md +204 -0
  338. package/reference/kernel.yaml +242 -0
  339. package/reference/methodology.md +283 -0
  340. package/reference/open-questions.md +279 -0
  341. package/reference/personality-layer.md +302 -0
  342. package/reference/self-space.md +299 -0
  343. package/reference/semantic-vs-keyword.md +288 -0
  344. package/reference/session-lifecycle.md +298 -0
  345. package/reference/templates/base-note.md +16 -0
  346. package/reference/templates/companion-note.md +70 -0
  347. package/reference/templates/creative-note.md +16 -0
  348. package/reference/templates/learning-note.md +16 -0
  349. package/reference/templates/life-note.md +16 -0
  350. package/reference/templates/moc.md +26 -0
  351. package/reference/templates/relationship-note.md +17 -0
  352. package/reference/templates/research-note.md +19 -0
  353. package/reference/templates/session-log.md +24 -0
  354. package/reference/templates/therapy-note.md +16 -0
  355. package/reference/test-fixtures/edge-case-constraints.md +148 -0
  356. package/reference/test-fixtures/multi-domain.md +164 -0
  357. package/reference/test-fixtures/novel-domain-gaming.md +138 -0
  358. package/reference/test-fixtures/research-minimal.md +102 -0
  359. package/reference/test-fixtures/therapy-full.md +155 -0
  360. package/reference/testing-milestones.md +1087 -0
  361. package/reference/three-spaces.md +363 -0
  362. package/reference/tradition-presets.md +203 -0
  363. package/reference/use-case-presets.md +341 -0
  364. package/reference/validate-kernel.sh +432 -0
  365. package/reference/vocabulary-transforms.md +85 -0
  366. package/scripts/sync-thinking.sh +147 -0
  367. package/skill-sources/graph/SKILL.md +567 -0
  368. package/skill-sources/graph/skill.json +17 -0
  369. package/skill-sources/learn/SKILL.md +254 -0
  370. package/skill-sources/learn/skill.json +17 -0
  371. package/skill-sources/next/SKILL.md +407 -0
  372. package/skill-sources/next/skill.json +17 -0
  373. package/skill-sources/pipeline/SKILL.md +314 -0
  374. package/skill-sources/pipeline/skill.json +17 -0
  375. package/skill-sources/ralph/SKILL.md +604 -0
  376. package/skill-sources/ralph/skill.json +17 -0
  377. package/skill-sources/reduce/SKILL.md +1113 -0
  378. package/skill-sources/reduce/skill.json +17 -0
  379. package/skill-sources/refactor/SKILL.md +448 -0
  380. package/skill-sources/refactor/skill.json +17 -0
  381. package/skill-sources/reflect/SKILL.md +747 -0
  382. package/skill-sources/reflect/skill.json +17 -0
  383. package/skill-sources/remember/SKILL.md +534 -0
  384. package/skill-sources/remember/skill.json +17 -0
  385. package/skill-sources/rethink/SKILL.md +658 -0
  386. package/skill-sources/rethink/skill.json +17 -0
  387. package/skill-sources/reweave/SKILL.md +657 -0
  388. package/skill-sources/reweave/skill.json +17 -0
  389. package/skill-sources/seed/SKILL.md +303 -0
  390. package/skill-sources/seed/skill.json +17 -0
  391. package/skill-sources/stats/SKILL.md +371 -0
  392. package/skill-sources/stats/skill.json +17 -0
  393. package/skill-sources/tasks/SKILL.md +402 -0
  394. package/skill-sources/tasks/skill.json +17 -0
  395. package/skill-sources/validate/SKILL.md +310 -0
  396. package/skill-sources/validate/skill.json +17 -0
  397. package/skill-sources/verify/SKILL.md +532 -0
  398. package/skill-sources/verify/skill.json +17 -0
  399. package/skills/add-domain/SKILL.md +441 -0
  400. package/skills/add-domain/skill.json +17 -0
  401. package/skills/architect/SKILL.md +568 -0
  402. package/skills/architect/skill.json +17 -0
  403. package/skills/ask/SKILL.md +388 -0
  404. package/skills/ask/skill.json +17 -0
  405. package/skills/health/SKILL.md +760 -0
  406. package/skills/health/skill.json +17 -0
  407. package/skills/help/SKILL.md +348 -0
  408. package/skills/help/skill.json +17 -0
  409. package/skills/recommend/SKILL.md +553 -0
  410. package/skills/recommend/skill.json +17 -0
  411. package/skills/reseed/SKILL.md +385 -0
  412. package/skills/reseed/skill.json +17 -0
  413. package/skills/setup/SKILL.md +1688 -0
  414. package/skills/setup/skill.json +17 -0
  415. package/skills/tutorial/SKILL.md +496 -0
  416. package/skills/tutorial/skill.json +17 -0
  417. package/skills/upgrade/SKILL.md +395 -0
  418. package/skills/upgrade/skill.json +17 -0
package/methodology/cross-links between MOC territories indicate creative leaps and integration depth.md
@@ -0,0 +1,43 @@
---
description: Notes that appear in multiple distant MOCs are integration points where ideas from separate domains combine — tracking cross-MOC membership reveals synthesis quality
kind: research
topics: ["[[graph-structure]]"]
methodology: ["Concept Mapping"]
source: [[tft-research-part2]]
---

# cross-links between MOC territories indicate creative leaps and integration depth

Novak's concept mapping research identifies cross-links — connections between different domains or map segments — as "crucial for showing creative leaps and deep understanding of how systems integrate." In a vault organized by MOCs, each MOC defines a territory: notes that belong to [[graph-structure]] are one neighborhood, notes that belong to [[agent-cognition]] are another. A note that appears in both territories is a cross-link, and these cross-links reveal something specific about thinking quality.

Cross-links are hard to create. They require understanding two domains well enough to see where they connect. A note about "context window constraints" that appears only in agent-cognition shows local knowledge. The same note appearing in both agent-cognition AND processing-workflow shows the author understood that context limits shape how processing should be structured — a genuine integration insight rather than mere domain expertise.

This matters for vault analysis because cross-link density is measurable. Track which MOC(s) each note belongs to via its Topics footer. Notes that appear in a single MOC are local claims. Notes that appear in multiple MOCs — especially MOCs that aren't obviously adjacent — are integration points worth attention. If you're looking for synthesis opportunities, start with the notes that already bridge domains.
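A minimal sketch of that measurement, assuming notes are markdown files whose closing "Topics:" footer lists MOC wiki links; the layout, regexes, and function names are assumptions for illustration, not package conventions:

```python
import re
from pathlib import Path

# Assumed layout: each note ends with a "Topics:" section whose lines
# contain [[moc-name]] wiki links declaring MOC membership.
TOPICS_RE = re.compile(r"^Topics:\s*$", re.MULTILINE)
LINK_RE = re.compile(r"\[\[([^\]]+)\]\]")

def moc_memberships(vault: Path) -> dict[str, list[str]]:
    """Map each note to the MOCs its Topics footer declares."""
    memberships = {}
    for note in vault.rglob("*.md"):
        text = note.read_text(encoding="utf-8")
        match = TOPICS_RE.search(text)
        if match:
            memberships[note.stem] = LINK_RE.findall(text[match.end():])
    return memberships

# Integration points: notes declaring two or more MOCs, widest first.
if __name__ == "__main__":
    for name, mocs in sorted(moc_memberships(Path(".")).items(),
                             key=lambda kv: -len(kv[1])):
        if len(mocs) >= 2:
            print(f"{len(mocs)}  {name}: {', '.join(mocs)}")
```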
The mechanism connects to why [[concept-orientation beats source-orientation for cross-domain connections]]. Source-oriented notes are trapped in their origin: "notes on Book X" can only live in Book X's territory. Concept-oriented notes can participate in any territory where the concept is relevant. The extraction step that creates concept nodes is what makes cross-MOC membership possible at all. Cross-links are evidence that extraction happened correctly.

For agents, cross-link density becomes a synthesis quality metric. When running vault health checks, surface notes with high cross-MOC membership as integration hubs. These notes are structurally valuable — they create shortcuts between otherwise distant domains, exactly what [[small-world topology requires hubs and dense local links]] describes as essential for efficient navigation. Cross-MOC membership also addresses the problem that [[navigational vertigo emerges in pure association systems without local hierarchy]]: a note appearing in multiple MOCs becomes findable from multiple starting points, reducing the "lost between topics" problem that pure association creates. A vault with many cross-links is a vault where ideas flow freely between domains. A vault where every note lives in exactly one MOC is a collection of silos.

The practical implication: when creating notes, actively ask whether the claim belongs to more than one topic. The question "does this also relate to X?" isn't organizational housekeeping — it's testing whether you've found a genuine integration point or just a local fact. Notes that resist multi-MOC classification are fine as atomic claims. Notes that clearly span multiple topics are the creative leaps that advance understanding.

In multi-domain systems, cross-link analysis scales to a new level. Since [[multi-domain systems compose through separate templates and shared graph]], the four cross-domain connection patterns — temporal correlation, entity sharing, causal chains, and goal alignment — systematically produce notes that bridge domain-specific MOCs. Cross-MOC membership density becomes the quality metric for whether multi-domain composition is producing genuine integration or merely cohabitation. A vault where research, health, and project domains share a graph but produce zero cross-domain MOC memberships has composed structurally but failed to integrate intellectually.

There is a productive tension between cross-link analysis and boundary detection. Since [[community detection algorithms can inform when MOCs should split or merge]], algorithmic community detection identifies where topic boundaries naturally fall based on link density. Cross-links are the notes that straddle those boundaries. The two analyses work in concert: community detection draws the map of territories, cross-link analysis identifies the bridges between them. A note that sits at a community boundary with high cross-MOC membership is precisely the kind of integration point worth developing — it connects what the algorithm recognizes as structurally distinct domains.
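Both halves of that concert are computable with standard graph tooling. A sketch using networkx; the graph here is a bundled stand-in, since real edges would be parsed from wiki links:

```python
import networkx as nx
from networkx.algorithms import community

# Stand-in graph; in a real vault, nodes are notes and edges are wiki links.
G = nx.karate_club_graph()

# Community detection draws the map of territories.
communities = community.greedy_modularity_communities(G)
territory = {node: i for i, comm in enumerate(communities) for node in comm}

# Cross-link analysis finds the bridges: nodes with neighbors in foreign territories.
bridges = [n for n in G if any(territory[m] != territory[n] for m in G[n])]
print(f"{len(communities)} territories; boundary notes: {sorted(bridges)}")
```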
Cross-linking has a deeper dimension beyond topic boundaries. Since [[federated wiki pattern enables multi-agent divergence as feature not bug]], when multiple agents process the same concept, they may produce parallel notes that bridge interpretive boundaries rather than topic boundaries. Cross-MOC membership asks "does this concept span multiple domains?" Federation asks "does this concept support multiple valid interpretations?" Both are forms of integration: one across subject territories, the other across perspectives on the same subject. A note that appears in multiple MOCs AND has a federated counterpart with a different interpretive angle is an especially rich integration point — it bridges both domain and perspective simultaneously.

The quantitative complement comes from [[betweenness centrality identifies bridge notes connecting disparate knowledge domains]], which measures bridging by path structure rather than topic membership. The two approaches sometimes converge — a note in both graph-structure and agent-cognition MOCs might also have high betweenness — but they can diverge. A note might appear in only one MOC but still serve as the primary bridge between two clusters within that MOC's territory. Cross-MOC membership answers "did the author think across domains?" while betweenness centrality answers "can the agent traverse between domains?" Both are needed for comprehensive graph health assessment.
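The path-structure half is equally mechanical. A sketch reusing the same stand-in graph; betweenness scores flag bridges even when a note's Topics footer names only one MOC:

```python
import networkx as nx

# Same assumption as above: a graph whose edges come from parsed wiki links.
G = nx.karate_club_graph()

# Betweenness centrality: the fraction of shortest paths through each node.
# High scores identify traversal bridges, independent of MOC membership.
scores = nx.betweenness_centrality(G)
for note, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{score:.3f}  {note}")
```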
---

Relevant Notes:
- [[concept-orientation beats source-orientation for cross-domain connections]] — explains WHY cross-domain connections are architecturally possible: concept-orientation creates nodes that can form edges across domain boundaries
- [[small-world topology requires hubs and dense local links]] — the structural foundation that makes cross-MOC linking meaningful: hubs define domains while cross-links connect them
- [[each new note compounds value by creating traversal paths]] — cross-MOC links are especially high-value paths because they connect previously isolated domains
- [[navigational vertigo emerges in pure association systems without local hierarchy]] — cross-MOC membership solves vertigo by making notes findable from multiple starting points: a note in two MOCs provides paths from both territories
- [[community detection algorithms can inform when MOCs should split or merge]] — productive tension: cross-links identify where boundaries should be deliberately crossed, community detection identifies where boundaries should be drawn; notes at community boundaries are the most important integration points to preserve
- [[betweenness centrality identifies bridge notes connecting disparate knowledge domains]] — quantitative complement: cross-MOC membership identifies bridges by topic diversity, betweenness centrality identifies them by path structure; the two measures sometimes converge but can diverge when a note bridges clusters within a single MOC territory
- [[federated wiki pattern enables multi-agent divergence as feature not bug]] — extends cross-linking from topic boundaries to interpretive boundaries: federation creates parallel notes that bridge different perspectives on the same concept, complementing how cross-MOC membership bridges different topics
- [[multi-domain systems compose through separate templates and shared graph]] — domain-scale integration metric: the four cross-domain connection patterns (temporal, entity, causal, goal) systematically produce cross-MOC memberships, and their density measures whether composition produces genuine integration or mere cohabitation

Topics:
- [[graph-structure]]
package/methodology/dangling links reveal which notes want to exist.md
@@ -0,0 +1,62 @@
---
description: Wiki links to non-existent notes accumulate as organic signals of concept demand, and frequency analysis identifies which placeholders deserve their own pages
kind: research
topics: ["[[graph-structure]]"]
---

# dangling links reveal which notes want to exist

The insight: dangling links are not errors to fix immediately. They are signals to monitor. Each dangling link represents a concept someone thought was important enough to reference. When multiple notes independently reference the same non-existent concept, that frequency reveals demand.

## The mechanism

Link freely during capture. Don't worry about whether the target note exists yet. The act of linking expresses: "this concept matters here, and deserves its own treatment." But since [[wiki links as social contract transforms agents into stewards of incomplete references]], this act carries more weight than simple signal generation — each link is a commitment that the target concept will eventually be elaborated, and the vault's integrity depends on those commitments being honored through the maintenance pipeline.

Dangling links accumulate independently across different notes. A concept referenced from three different contexts has more demand than one referenced once. Frequency analysis reveals which placeholder concepts deserve their own notes.
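The frequency scan itself is small. A sketch under two assumptions about vault layout: wiki links use [[...]] syntax, and a link target resolves to a same-named .md file anywhere in the vault:

```python
import re
from collections import Counter
from pathlib import Path

# Capture the target of [[target]], [[target|alias]], and [[target#heading]].
LINK_RE = re.compile(r"\[\[([^\]|#]+)")

def dangling_link_demand(vault: Path) -> Counter:
    """Count references to wiki-link targets that have no note file yet."""
    existing = {p.stem for p in vault.rglob("*.md")}
    demand = Counter()
    for note in vault.rglob("*.md"):
        for target in LINK_RE.findall(note.read_text(encoding="utf-8")):
            if target.strip() not in existing:
                demand[target.strip()] += 1
    return demand

# Highest-demand placeholders first: the notes that most want to exist.
if __name__ == "__main__":
    for target, count in dangling_link_demand(Path(".")).most_common(20):
        print(f"{count:3d}  [[{target}]]")
```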
When the demanded note is finally created, all dangling links become live connections instantly. The graph weaves itself retroactively.

Since [[orphan notes are seeds not failures]], the inverse pattern complements this one. Dangling links are future inbound connections waiting for their target note to exist. Orphan notes are future outbound connections waiting for maintenance passes to discover where they belong. Both are intermediate states in incremental graph construction — the graph doesn't need to be complete at any moment, only growing toward completeness.

## Structural prediction

High-frequency dangling links predict future MOC candidates. If ten notes link to [concept X] before it exists, when you finally create [concept X], it immediately has ten incoming connections. That's hub-level link density. The concept was already functioning as a hub before it had a body.

This is how notes enter the graph with pre-accumulated value. Since [[each new note compounds value by creating traversal paths]], a note with ten incoming links on day one doesn't just add one node — it creates ten new traversal paths that make the existing graph more navigable. High-frequency dangling links predict not just future hubs but future value multipliers.

This is organic graph growth: structure emerges from use patterns rather than imposed categories. Since [[complex systems evolve from simple working systems]], the graph that grows through dangling link resolution has been pressure-tested by actual reference patterns — unlike a designed taxonomy that guesses what concepts matter before they prove their value. This matters for how [[spreading activation models how agents should traverse]] will function in the future. A note that enters the graph with ten incoming links immediately becomes a high-activation node that primes many related concepts simultaneously.

## How dangling links extend wiki link addressing

Since [[wiki links implement GraphRAG without the infrastructure]] through explicit edges, dangling links add a complementary demand signal. Together they create a system where:

1. You link to concepts as if they exist
2. Frequency reveals which concepts should exist
3. Creation of demanded notes instantly weaves the graph
4. Hub topology emerges from genuine reference patterns

The graph grows through use, not through planning. This is retrieval-first architecture applied to graph growth: since [[retrieval utility should drive design over capture completeness]], demand signals (dangling link frequency) determine structure rather than upfront planning.

## Predicting future hubs from link frequency

High-frequency dangling links predict which notes will become hubs. Because [[small-world topology requires hubs and dense local links]], a note that has ten incoming links on creation day starts with hub-level connectivity. The topology self-organizes around concepts that multiple contexts reference. This forward-looking prediction complements what [[betweenness centrality identifies bridge notes connecting disparate knowledge domains]] measures for existing nodes — dangling link frequency predicts where new structural importance will emerge, while betweenness centrality identifies where it already exists. Together they provide a complete temporal picture: what the graph wants to become and what it currently is.

The "notes that want" framing extends beyond existence. Since [[programmable notes could enable property-triggered workflows]], notes can want attention (staleness thresholds), want review (due dates), want connection (link density below threshold). Dangling links wanting resolution is one instance of a broader pattern: metadata conditions that trigger surfacing behavior.
---

Relevant Notes:
- [[wiki links implement GraphRAG without the infrastructure]] — provides the addressing mechanism this relies on
- [[small-world topology requires hubs and dense local links]] — high-frequency dangling links predict hub candidates
- [[spreading activation models how agents should traverse]] — notes born with high link counts immediately become high-activation nodes
- [[processing effort should follow retrieval demand]] — dangling link frequency provides the demand signal that determines where processing investment pays off
- [[throughput matters more than accumulation]] — dangling link frequency operationalizes throughput metrics: process what has demonstrated demand
- [[each new note compounds value by creating traversal paths]] — explains WHY pre-accumulated links matter: notes with many incoming connections multiply graph value on creation
- [[retrieval utility should drive design over capture completeness]] — demand-driven structure (dangling link frequency) is retrieval-first design: let use patterns reveal what deserves notes, not upfront planning
- [[orphan notes are seeds not failures]] — the inverse pattern: dangling links are future inbound connections, orphan notes are future outbound connections; both are intermediate states in incremental graph growth
- [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]] — grounds the theory: organic structure from link patterns is heterarchy in action, avoiding the brittleness of upfront classification
- [[complex systems evolve from simple working systems]] — theoretical foundation: Gall's Law explains WHY organic graph growth works — structure that emerges from use patterns has been tested by use, unlike designed-from-scratch systems that fail before validation
- [[betweenness centrality identifies bridge notes connecting disparate knowledge domains]] — demand-side complement: dangling link frequency predicts future hubs by reference frequency, betweenness centrality identifies existing hubs by structural position; together they provide forward-looking and backward-looking graph analysis
- [[wiki links as social contract transforms agents into stewards of incomplete references]] — adds the ethical dimension this note's demand-signal framing omits: creating a dangling link is not just flagging statistical demand but making a deliberate commitment to future elaboration, reframing frequency-based prioritization as obligation-based stewardship

Topics:
- [[graph-structure]]
package/methodology/data exit velocity measures how quickly content escapes vendor lock-in.md
@@ -0,0 +1,74 @@
1
+ ---
2
+ description: Three-tier framework (high/medium/low velocity) turns abstract portability into an auditable metric where every feature gets evaluated by whether it traps content or frees it
3
+ kind: research
4
+ topics: ["[[agent-cognition]]", "[[graph-structure]]"]
5
+ methodology: ["PKM Research"]
6
+ source: [[tft-research-part3]]
7
+ ---
8
+
9
+ # data exit velocity measures how quickly content escapes vendor lock-in
10
+
11
+ The portability research introduces a concept that turns a vague intuition into something you can actually audit: Data Exit Velocity. The framework categorizes tools by how quickly content can leave:
12
+
13
+ - **High Velocity (low risk):** plain text, markdown, YAML. "Export" means copying a folder. Any tool reads it tomorrow.
14
+ - **Medium Velocity (medium risk):** proprietary formats with export capabilities. OPML preserves hierarchy but conversion costs exist.
15
+ - **Low Velocity (high risk):** sharded databases like Notion or Evernote. Export requires conversion that typically loses relationships, metadata, or layout fidelity.
16
+
17
+ The insight isn't just classification — it's that exit velocity becomes a design metric. Every feature decision should pass the test: does this lower exit velocity? Wiki links in markdown maintain high velocity because the link syntax is human-readable even without resolution software. YAML frontmatter maintains high velocity because any text parser reads it. But agent state living outside markdown files would lower exit velocity, because it introduces dependencies that aren't portable. Since [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]], exit velocity decreases monotonically through the layers: foundation features (files, wiki links) have maximum velocity, convention features (context file instructions) remain high because they are still just text, automation features (hooks, skills, MCP) introduce platform dependencies that lower velocity, and orchestration features (pipelines, teams) have minimum velocity because they require the most platform-specific infrastructure. The layer hierarchy is literally an exit velocity gradient. And the automation layer's low velocity is not merely a matter of file format differences -- since [[platform adapter translation is semantic not mechanical because hook event meanings differ]], the event semantics themselves resist mechanical porting, which means adapting automation features requires decomposing each hook into its quality guarantee properties and reconstructing them independently on the target platform.
+
+ ## Why agents care about this metric
+
+ Since [[local-first file formats are inherently agent-native]], the portability argument already exists in principle. But principles are hard to audit. Exit velocity makes the abstract concrete: can everything in this vault be fully utilized by any markdown-compatible tool tomorrow?
+
+ For agent-operated systems, exit velocity has a specific edge. Agents don't just need to read content — they need to traverse structure, parse metadata, and follow connections. A database-backed system with API access has low exit velocity not only because migration is hard, but because every agent needs credentials, API knowledge, and format-specific parsers. High exit velocity means any LLM with filesystem access can operate the system immediately. This extends [[retrieval utility should drive design over capture completeness]] to the format layer: the retrieval-first question "how will I find this later" becomes "how will any future agent read this in any tool."
+
+ High exit velocity also enables the [[bootstrapping principle enables self-improving systems]]: the recursive improvement loop requires the system to read and modify its own files without external coordination. If those files live in proprietary formats requiring API authentication, bootstrapping stalls at the boundary of the tool. The exit velocity metric makes this dependency auditable — every feature that lowers velocity is a potential bootstrapping bottleneck.
+
+ The research categorizes PKM tools along this axis: "file-first" tools (Obsidian, Logseq) prioritize local plain text and have high exit velocity. "Database-first" tools (Notion, Anytype) prioritize structured queries but trap content. Since [[complex systems evolve from simple working systems]], this isn't surprising — the simplest substrate proves the most durable because it accumulates fewer dependencies that could break.
+
+ ## The evaluation practice
+
+ The metric suggests a concrete audit: walk through every custom feature and ask "does this require specific tooling to function?" For this vault:
+
+ - Wiki links: need conversion for standard markdown, but the content remains readable. Moderate exit cost.
+ - YAML frontmatter: fully portable, any parser reads it. Zero exit cost.
+ - Canvas files and plugins: locked to Obsidian. High exit cost if relied upon.
+ - qmd semantic search: external tool, but the notes it indexes are plain text. The search layer has low velocity but the content it operates on has high velocity.
+
+ The target the research suggests: >95% of content should be portable. The remaining 5% is acceptable tooling convenience, not structural dependency. Since [[wiki links implement GraphRAG without the infrastructure]], the graph structure itself lives in the portable layer — the most valuable structural feature has high exit velocity.
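+
+ The audit itself is scriptable. A minimal sketch (assuming a local vault folder; which extensions count as portable is a judgment call, and these sets are illustrative) computes the fraction directly:
+
+ ```python
+ # Sketch of the >95% portability audit over a local vault folder.
+ # The PORTABLE set is an illustrative assumption, not a standard.
+ from pathlib import Path
+
+ PORTABLE = {".md", ".txt", ".yaml", ".yml", ".csv"}
+
+ def exit_velocity_ratio(vault):
+     files = [p for p in Path(vault).rglob("*") if p.is_file()]
+     if not files:
+         return 1.0
+     portable = sum(p.suffix.lower() in PORTABLE for p in files)
+     return portable / len(files)
+
+ ratio = exit_velocity_ratio("my-vault")
+ print(f"portable content: {ratio:.1%} (target: >95%)")
+ ```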
+
+ ## The shadow side
+
+ The exit velocity framework extends beyond tool lock-in to format lock-in. Since [[temporal media must convert to spatial text for agent traversal]], knowledge trapped in audio, video, and podcast formats has effectively zero knowledge exit velocity — agents cannot search, link, or synthesize temporal content. Transcription is the exit operation: it converts temporally locked knowledge into high-velocity text that participates in the graph. The conversion is lossy (tone, emphasis, gesture are lost), but the alternative is knowledge that never exits its temporal container at all. This suggests a fourth tier beyond the three vendor-based categories: content that exists in formats agents cannot traverse represents the lowest possible exit velocity, regardless of the tool that created it.
+
+ The exit velocity gradient also maps to platform capability. Since [[platform capability tiers determine which knowledge system features can be implemented]], features that work at every tier (markdown conventions, wiki links, YAML schemas) are precisely the high-velocity features, while features that require tier-one infrastructure (hooks, pipelines, semantic search) are the low-velocity ones. This is not a coincidence — portability and tier-universality measure the same property from different angles. A feature that requires platform-specific infrastructure both lowers exit velocity and raises the tier floor.
+
+ Exit velocity as a metric could become overly conservative. Some low-velocity features genuinely improve capability — database queries are powerful, real-time collaboration requires servers, vector search needs embeddings. The question isn't "maximize exit velocity at all costs" but rather "is the capability worth the lock-in?" For a single-operator vault optimized for agent traversal, the answer usually favors high velocity. For a team knowledge base needing structured queries, the tradeoff might flip. The metric clarifies the decision; it doesn't make it for you.
+
+ There is also a multi-agent dimension. Since [[federated wiki pattern enables multi-agent divergence as feature not bug]], federation requires content that can exist independently across different systems. Low exit velocity makes federation structurally impossible — if interpretations are locked into one platform, parallel versions cannot coexist across sites. Exit velocity is therefore a prerequisite for the divergence-as-feature pattern: the content must be free before it can be federated.
+
+ Since [[digital mutability enables note evolution that physical permanence forbids]], high exit velocity extends mutability beyond a single tool's lifetime. Notes locked in a dying tool are as immutable as Luhmann's paper cards — not because the medium forbids editing, but because the tool forbids leaving. Exit velocity ensures that the evolutionary potential of digital notes survives tool transitions.
+
+ And since [[the system is the argument]], this vault can be audited against its own metric. The >95% portability target applies here: the vast majority of content is markdown with YAML and wiki links, readable by any text parser. The audit practice the note describes is self-referential — the vault demonstrates what it claims.
+
+ ---
+
+ Relevant Notes:
+ - [[local-first file formats are inherently agent-native]] — foundation: the portability principle this note makes measurable; that note argues for plain text, this note provides the yardstick
+ - [[complex systems evolve from simple working systems]] — explains why high-velocity formats survive: fewer dependencies mean fewer failure modes, which is the mechanism behind exit velocity
+ - [[wiki links implement GraphRAG without the infrastructure]] — example of high exit velocity: wiki links encode graph structure in plain text, requiring no infrastructure to read or migrate
+ - [[federated wiki pattern enables multi-agent divergence as feature not bug]] — exit velocity is prerequisite for federation: parallel interpretations across sites only work when content can exist independently of its creating tool
+ - [[retrieval utility should drive design over capture completeness]] — exit velocity operationalizes retrieval-first thinking at the format level: 'how will any future agent read this' extends 'how will I find this later'
+ - [[bootstrapping principle enables self-improving systems]] — bootstrapping depends on high exit velocity: the recursive improvement loop requires filesystem-level read/write that proprietary formats block
+ - [[the system is the argument]] — this vault demonstrates exit velocity: >95% plain text content makes the portability claim verifiable against its own architecture
+ - [[digital mutability enables note evolution that physical permanence forbids]] — high exit velocity extends mutability across tools, not just within one: notes can evolve regardless of which software reads them
+ - [[intermediate representation pattern enables reliable vault operations beyond regex]] — the tension in practice: an IR layer adds infrastructure dependency that lowers exit velocity at the tooling layer even though the files it operates on remain high-velocity plain text; the tradeoff is between operational reliability (IR wins) and infrastructure independence (raw files win)
+ - [[temporal media must convert to spatial text for agent traversal]] — exit velocity applied to media format: temporal media has effectively zero knowledge exit velocity because agents cannot traverse it; transcription is the exit operation that converts trapped temporal knowledge into high-velocity text
+ - [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]] — formalizes exit velocity as a gradient: foundation has maximum velocity (copy a folder), convention still high (just instructions), automation introduces platform dependencies that lower velocity, orchestration has minimum velocity; the layer hierarchy IS an exit velocity gradient
+ - [[platform adapter translation is semantic not mechanical because hook event meanings differ]] — mechanism: explains WHY automation-layer features have low exit velocity; hook semantics do not transfer mechanically because event meanings differ across platforms, making semantic adapter translation the concrete obstacle behind low portability scores at the automation layer
+ - [[platform capability tiers determine which knowledge system features can be implemented]] — exit velocity and tier-universality measure the same property from different angles: high-velocity features work at every tier, low-velocity features require tier-specific infrastructure
+
+ Topics:
+ - [[agent-cognition]]
+ - [[graph-structure]]
@@ -0,0 +1,48 @@
+ ---
+ description: Extracting claims from source discourse strips argumentative context, and Source footers plus wiki links may not reconstruct enough when retrieved years later
+ kind: research
+ topics: ["[[note-design]]", "[[design-dimensions]]"]
+ confidence: speculative
+ methodology: ["Zettelkasten", "Cognitive Science"]
+ source: [[tft-research-part3]]
+ ---
+
+ # decontextualization risk means atomicity may strip meaning that cannot be recovered
+
+ Atomic note-taking depends on an assumption: that ideas can be extracted from their source discourse and stand alone without losing essential meaning. The claim-as-title pattern takes this further — the idea must compress into a sentence that works as prose when linked from other contexts. But critics of strict atomicity argue that stripping ideas of their original context can lead to superficial understanding or misinterpretation when the note is retrieved years later, after the extractor has forgotten the argumentative landscape the claim inhabited.
+
+ The risk is distinct from atomicity paralysis. Since [[enforcing atomicity can create paralysis when ideas resist decomposition]], that tension concerns difficulty during creation — when ideas resist being split. Decontextualization risk concerns a different failure: meaning that survives creation intact but degrades during retrieval, because the context that made the claim meaningful has been deliberately removed in service of composability.
+
+ Consider what gets stripped during extraction. A claim like "spaced repetition builds durable memory" makes sense in isolation. But a claim like "incremental formalization outperforms upfront design" carries hidden assumptions about what counts as formalization, what scale of system is being discussed, and what trade-offs are acceptable. These assumptions are clear in the source discourse — the surrounding argument, the examples, the counterarguments addressed — but may vanish when the claim stands alone. Years later, the reader encounters the claim without the argumentative scaffolding that originally supported it, and may either accept it uncritically (losing the nuance of when it applies) or dismiss it (unable to reconstruct why it seemed compelling).
+
+ This is not merely a description-quality problem, though it connects to one. Since [[sense-making vs storage does compression lose essential nuance]], the vault's progressive disclosure architecture assumes that having title + description + full content available preserves enough. But decontextualization operates at a different layer: even when the full note content is available, the meaning may still be impoverished because the source context — the conversation the claim was extracted from, the counterarguments it was responding to, the specific examples that grounded it — lives in the source document, not the note. Since [[source attribution enables tracing claims to foundations]], the provenance chain technically allows reconstruction. But provenance is breadcrumbs, not the meal. Knowing a claim came from a specific article is not the same as having the argumentative context that gave the claim its force.
+
+ The ThreadMode-to-DocumentMode transformation names this stripping explicitly. Since [[ThreadMode to DocumentMode transformation is the core value creation step]], the entire pipeline depends on removing chronological scaffolding to produce timeless claims. Decontextualization is not a bug in this process — it is the process. The question is whether the transformation preserves enough meaning or whether certain kinds of claims lose essential features when made timeless.
+
+ The strongest counterargument comes from the generation effect. Since [[the generation effect requires active transformation not just storage]], the act of extraction forces the extractor to re-encode meaning in the note's own reasoning. A well-written note doesn't borrow meaning from its source — it generates its own through inline reasoning, wiki link connections, and considered counterarguments. Since [[elaborative encoding is the quality gate for new notes]], the specific form of generation that matters most is connecting the new claim to existing knowledge through articulated relationships — and those articulated relationships create a new contextual web that partially substitutes for the stripped source context. On this view, a note that loses meaning when decontextualized was never well-extracted: the extractor copied the claim without doing the generative work to make it self-supporting. The decontextualization risk is really an extraction quality problem, not an atomicity problem.
+
+ But this counterargument has limits. Some claims derive their meaning partly from what they argue against, and the opposing positions may not have their own notes in the vault. A claim about why hierarchical taxonomies fail gains force from specific examples of failure that the source provided. The extraction may capture the claim and even the reasoning, but the vivid examples — the particular classification system that collapsed, the user story that illustrated the problem — often get left behind as too specific for an atomic note. What remains is the skeleton of an argument that was originally fleshed out by context.
+
+ For agent-operated systems, the risk compounds. Since [[cognitive outsourcing risk in agent-operated systems]], when agents handle extraction, even the human's memory of the original context may not persist. In human-operated Zettelkasten, the note-maker's personal memory of the source serves as implicit context — encountering their own note years later triggers recall of the original discourse, at least partially. When an agent extracts and the human never deeply engaged with the source, no one retains the context. The note stands truly alone. Since [[concept-orientation beats source-orientation for cross-domain connections]], the vault deliberately trades source context for composability. The architectural bet is that cross-domain edges are worth more than preserved source context. But the bet may not hold equally for all claim types — highly contextual claims (about when patterns apply, about trade-offs in specific situations, about edge cases) may lose more than general claims (about mechanisms, about cognitive science findings, about architectural principles).
+
+ The vault has partial mitigations. Source footers trace provenance. Inline wiki links embed the claim in a web of related reasoning that partially replaces the original context with a new context — the graph itself. Since [[backlinks implicitly define notes by revealing usage context]], as a note accumulates incoming links from other arguments, it gains an implicit definition through usage that was never present in the original source. The argumentative scaffolding that extraction stripped gets partially rebuilt — not as the original discourse, but as the graph's record of every context where the claim proved useful. This does not recover the source context, but it creates a substitute context that may actually be richer for retrieval purposes, because it reflects how the claim functions across the vault rather than how it was originally justified. Since [[capture the reaction to content not just the content itself]], reactions captured during initial reading preserve the contextual spark that later extraction might strip. Well-written notes with rich inline reasoning and considered counterarguments are self-contextualizing.
+
+ But these mitigations may not be sufficient for all claim types. The open question is whether certain categories of knowledge — contextual heuristics, trade-off judgments, pattern-matching wisdom that depends on knowing when NOT to apply it — resist decontextualization fundamentally, not just due to extraction quality. If so, the claim-as-title pattern may need escape hatches for ideas that work as prose when linked but lose essential meaning without their argumentative scaffolding.
+
+ ---
+
+ Relevant Notes:
+ - [[enforcing atomicity can create paralysis when ideas resist decomposition]] — sibling risk at a different timescale: that note addresses friction during creation, this note addresses meaning loss during later retrieval
+ - [[sense-making vs storage does compression lose essential nuance]] — parallel tension at the description layer: both ask whether vault compression strips features that make ideas valuable, but descriptions compress for filtering while atomicity compresses for composability
+ - [[concept-orientation beats source-orientation for cross-domain connections]] — the architectural argument for decontextualization: extracting concepts from sources enables cross-domain edges, but this note names the cost that extraction incurs
+ - [[ThreadMode to DocumentMode transformation is the core value creation step]] — names the transformation that decontextualization performs: stripping chronological scaffolding to produce timeless claims, which is exactly the process that risks losing contextual meaning
+ - [[source attribution enables tracing claims to foundations]] — the primary mitigation: Source footers preserve provenance chains, but provenance is not the same as the argumentative context that gave a claim its original force
+ - [[the generation effect requires active transformation not just storage]] — the counterargument: if extraction forces genuine generative work, the meaning gets re-encoded in the note's own reasoning rather than borrowed from the source context
+ - [[capture the reaction to content not just the content itself]] — reactions preserve the contextual spark that pure extraction might lose, functioning as proto-context that survives decontextualization
+ - [[cognitive outsourcing risk in agent-operated systems]] — compounds decontextualization risk: when agents extract claims, even the human's contextual understanding may not survive to inform later retrieval
+ - [[elaborative encoding is the quality gate for new notes]] — specifies the mechanism behind the generation effect counterargument: connecting new claims to existing knowledge through articulated relationships creates contextual anchoring that partially substitutes for stripped source context
+ - [[backlinks implicitly define notes by revealing usage context]] — the graph-level mitigation: as a note accumulates incoming links, it gains implicit definition through usage that rebuilds contextual scaffolding from the network rather than from the original source
+ - [[organic emergence versus active curation creates a fundamental vault governance tension]] — the governance tension shapes decontextualization risk: aggressive curation (schema enforcement, atomicity standards) amplifies extraction pressure that strips context, while the emergence pole's looser handling preserves more contextual scaffolding but accumulates structural debt
+
+ Topics:
+ - [[note-design]]
+ - [[design-dimensions]]
@@ -0,0 +1,47 @@
+ ---
+ description: Four structural properties of TFT research — atomic composability, dense interlinking, methodology provenance, and semantic queryability — are prerequisites that separate principled derivation from template customization
+ kind: research
+ topics: ["[[design-dimensions]]"]
+ methodology: ["Original"]
+ source: [[arscontexta-notes]]
+ ---
+
+ # dense interlinked research claims enable derivation while sparse references only enable templating
+
+ Derivation — the ability to generate a custom knowledge system from principles rather than copying a template — is not a process you can run against any collection of notes about knowledge management. It requires the research substrate itself to have specific structural properties. Without these properties, the agent falls back to the only thing sparse research supports: selecting and customizing pre-built templates. The difference between derivation and templating is not a philosophical choice about how to distribute systems. It is a structural consequence of how the underlying research is organized.
+
+ Four properties separate a derivation-ready research graph from a reference library that can only support templating. These properties transfer across domains: since [[the vault methodology transfers because it encodes cognitive science not domain specifics]], each structural property maps to a cognitive operation rather than a domain-specific convention. And they operate on a fixed architectural inventory: since [[knowledge systems share universal operations and structural components across all methodology traditions]], the nodes that need to be densely interlinked are the eight universal operations and nine structural components that every viable system implements.
+
+ **Atomic composability** means each claim stands alone as a linkable unit. Because [[concept-orientation beats source-orientation for cross-domain connections]], the extraction step that produces these independent claims is what makes composability possible — source-bundled research resists composition because its internal structure assumes a specific reading order. When a derivation agent needs to reason about note granularity for a creative writing use case, it should be able to pull [[enforcing atomicity can create paralysis when ideas resist decomposition]] independently from [[decontextualization risk means atomicity may strip meaning that cannot be recovered]] and compose them into a trade-off analysis specific to that domain. If these insights were buried inside a monolithic "guide to atomic notes," the agent would have to load the entire guide and extract the relevant portions — a process that degrades under context window constraints and loses the composability that makes derivation tractable. Atomic claims compose because they can be assembled in novel combinations the original author never imagined.
+
+ **Dense interlinking** means the claims explain their relationships to each other. Since [[configuration dimensions interact so choices in one create pressure on others]], derivation must understand how choosing atomic granularity creates pressure toward explicit linking and heavy processing. These interaction effects are not derivable from individual claims in isolation — they require link context that articulates why one claim constrains or enables another. A research graph where claims exist as isolated statements forces the derivation agent to infer interactions, which is unreliable because the interactions often reflect empirical discoveries rather than logical necessities. Dense links with context phrases ("extends," "constrains," "contradicts") encode the interaction knowledge that derivation depends on but cannot reconstruct from scratch. This is why [[propositional link semantics transform wiki links from associative to reasoned]] — the move from "these are related" to "this constrains that" is what makes the link graph a reasoning substrate rather than an address book.
+
+ **Methodology provenance** means each claim traces to the tradition or discipline that produced it. Since [[methodology traditions are named points in a shared configuration space not competing paradigms]], the derivation agent needs to know which tradition validated a claim — because a claim validated by Zettelkasten practice may not transfer to a GTD-style system. Provenance tagging turns the research graph from a flat collection of assertions into a map of which traditions have explored which regions of the configuration space. Since [[source attribution enables tracing claims to foundations]], the vault implements this through YAML provenance fields (`methodology`, `adapted_from`) and Source footers that create a verification graph parallel to the conceptual graph. Without provenance, the derivation agent treats all claims as equally applicable everywhere, which is exactly the false universalism that produces systems technically correct but contextually inappropriate.
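+
+ In this vault, the provenance property is nothing more than frontmatter. A hypothetical example (the claim and all values are invented for illustration, reusing the fields named above):
+
+ ```yaml
+ # Hypothetical frontmatter for a provenance-tagged claim note.
+ description: Spaced repetition builds durable memory through retrieval practice
+ kind: research
+ methodology: ["Zettelkasten", "Cognitive Science"]
+ adapted_from: [[incremental-reading-tradition]]
+ source: [[tft-research-part2]]
+ ```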
+
+ **Semantic queryability** means the agent can find relevant claims through meaning, not just keywords. A user building a therapy practice management system would not know to search for "atomic granularity" or "processing cadence." They would describe their needs in domain language — "tracking client progress across sessions" — and the research graph must surface the relevant claims about temporal dynamics, maintenance cadence, and privacy-constrained linking. Because [[spreading activation models how agents should traverse]], semantic search is the entry point that starts the traversal: finding one relevant claim and then following its links to discover the cluster of related claims that together inform the derivation. Keyword search alone cannot bridge the vocabulary gap between domain practitioners and knowledge system researchers.
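+
+ The traversal half of that entry pattern is easy to state precisely. A sketch (the graph fragment and seed are illustrative; the semantic search step that picks the seed is assumed, not shown):
+
+ ```python
+ # Sketch of search-then-traverse: a semantically matched seed claim
+ # starts a breadth-limited walk over the link graph, collecting the
+ # cluster of related claims that inform a derivation.
+ from collections import deque
+
+ def activate(graph, seed, max_hops=2):
+     seen, frontier = {seed}, deque([(seed, 0)])
+     while frontier:
+         note, depth = frontier.popleft()
+         if depth == max_hops:
+             continue
+         for neighbor in graph.get(note, []):
+             if neighbor not in seen:
+                 seen.add(neighbor)
+                 frontier.append((neighbor, depth + 1))
+     return seen
+
+ graph = {  # illustrative fragment of a claim graph
+     "retrieval utility should drive design": ["descriptions compress for filtering"],
+     "descriptions compress for filtering": ["BM25 retrieval fails on full-length descriptions"],
+ }
+ cluster = activate(graph, "retrieval utility should drive design")
+ ```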
+
+ The structural consequence is that templates and derivation sit on opposite sides of a research quality threshold. Below the threshold — where research is sparse, disconnected, lacking provenance tags, and searchable only by keyword — the best an agent can do is pattern-match the user's needs to the closest available template and customize. Above the threshold, the agent can traverse the claim graph, compose dimension-specific trade-off analyses, and produce a novel configuration justified at every step — and since [[justification chains enable forward backward and evolution reasoning about configuration decisions]], every configuration choice traces back through the dense claim graph to the specific research claims and user constraints that produced it. The threshold is not binary but the transition is sharp: partial density produces derivations that look principled but have gaps where the agent reverts to guessing because claims run out.
+
+ This creates a strategic imperative for the research graph: growth should prioritize density and interlinking over breadth. Adding a new claim that connects to five existing claims strengthens derivation more than adding five disconnected claims about separate topics. Since [[each new note compounds value by creating traversal paths]] and [[small-world topology requires hubs and dense local links]], the graph's derivation capacity grows with connection density, not note count — and that density must exhibit hub structure where synthesis notes and MOCs create the shortcuts that make the claim graph navigable at scale. A hundred deeply interlinked claims about processing patterns support richer derivation than a thousand isolated observations covering the full design space but with no connection structure.
+
+ The shadow side is that these structural requirements create a chicken-and-egg problem. You cannot derive systems until the research graph is dense enough, but the graph gets dense through operational experience with derived systems. The resolution is that derivation capacity emerges gradually: early derivations are partial (some dimensions are well-supported, others fall back to templates), and each deployment cycle enriches the graph through operational observations. The research graph is not a prerequisite to be completed before derivation begins — it is a substrate that makes derivation progressively more capable as it grows.
+
+ ---
+
+ Relevant Notes:
+ - [[derivation generates knowledge systems from composable research claims not template customization]] — describes the derivation process itself; this note identifies what the research substrate must look like for that process to function
+ - [[eight configuration dimensions parameterize the space of possible knowledge systems]] — the dimensions derivation navigates, but navigating them requires claims dense enough to inform each dimension's trade-offs
+ - [[the derivation engine improves recursively as deployed systems generate observations]] — recursive improvement enriches the research graph with deployment evidence, directly strengthening the four properties this note identifies as prerequisites
+ - [[bootstrapping principle enables self-improving systems]] — the self-referential loop where the research graph improves itself by processing its own operational evidence, which is exactly how claim density and interlinking grow over time
+ - [[composable knowledge architecture builds systems from independent toggleable modules not monolithic templates]] — composable architecture is the engineering output; this note argues that composable research INPUT is equally necessary
+ - [[source attribution enables tracing claims to foundations]] — implements the methodology provenance property: provenance tagging requires attribution infrastructure that traces each claim to its tradition and source
+ - [[propositional link semantics transform wiki links from associative to reasoned]] — operationalizes the dense interlinking property: context phrases like 'extends' and 'constrains' are propositional semantics that encode the interaction knowledge derivation depends on
+ - [[concept-orientation beats source-orientation for cross-domain connections]] — enables the atomic composability property: extracting concept-oriented claims from source material is the act that produces independently composable units
+ - [[justification chains enable forward backward and evolution reasoning about configuration decisions]] — downstream consumer of all four properties: chain quality is a trailing indicator of claim graph quality because chains can only trace what the graph's density and provenance make traceable
+ - [[small-world topology requires hubs and dense local links]] — the graph topology that makes dense interlinking navigable: without power-law distribution creating shortcuts through hubs, dense interlinks would exist but remain opaque to traversal
+ - [[knowledge systems share universal operations and structural components across all methodology traditions]] — the universal operations and components are the nodes that dense interlinking connects; derivation-readiness requires density across this fixed inventory rather than breadth into novel categories
+ - [[the vault methodology transfers because it encodes cognitive science not domain specifics]] — grounds WHY the four substrate properties work across domains: atomic composability, dense interlinking, provenance, and queryability transfer because they encode cognitive operations not domain content
+
+ Topics:
+ - [[design-dimensions]]
@@ -0,0 +1,56 @@
+ ---
+ description: Topological sort on a module DAG resolves dependencies automatically while producing human-readable explanations that teach users why modules relate — turning opaque automation into education
+ kind: research
+ topics: ["[[design-dimensions]]"]
+ methodology: ["Original", "Systems Theory"]
+ source: [[composable-knowledge-architecture-blueprint]]
+ ---
+
+ # dependency resolution through topological sort makes module composition transparent and verifiable
+
+ When a knowledge system offers independently toggleable modules, the user faces a composition problem: enabling a module may require other modules that provide capabilities it depends on, and understanding those transitive dependencies requires tracing chains that quickly exceed what manual reasoning can track. Dependency resolution via topological sort solves the mechanical problem — given a directed acyclic graph of module dependencies, it produces a valid initialization order and identifies all transitive requirements automatically. But the more interesting contribution is that resolution can be made transparent, turning what is normally opaque automation into an explanation of why the system is structured as it is.
+
+ ## The layered DAG guarantees acyclicity
+
+ The dependency graph for a composable knowledge system is a DAG aligned with the abstraction layers. Since [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]], the layer hierarchy guarantees acyclicity because dependencies always point downward — foundation modules like yaml-schema and wiki-links have no dependencies, convention modules like atomic-notes depend on foundation modules, and automation modules like processing-pipeline depend on convention modules. This layered structure means topological sort always succeeds (no cycles to detect), and the resolution order reflects the architectural logic: foundations activate first, then conventions that build on them, then automation that assumes both. The sort does not merely find AN order — it reveals THE order that the architecture implies.
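+
+ Because the DAG is guaranteed acyclic, the mechanical half of resolution fits in a few lines. A sketch using Python's standard-library graphlib (the module names and dependency edges are illustrative, echoing the examples in this note):
+
+ ```python
+ # Initial resolution: sort selected modules into a valid activation
+ # order. DEPS maps each module to the modules it requires; edges
+ # always point down the layer hierarchy, so no cycles can occur.
+ from graphlib import TopologicalSorter
+
+ DEPS = {
+     "yaml-schema": set(),                           # foundation
+     "wiki-links": set(),                            # foundation
+     "atomic-notes": {"yaml-schema", "wiki-links"},  # convention
+     "mocs": {"atomic-notes"},                       # convention
+     "processing-pipeline": {"atomic-notes"},        # automation
+ }
+
+ order = list(TopologicalSorter(DEPS).static_order())
+ # Foundations first, then conventions, then automation, e.g.:
+ # ['yaml-schema', 'wiki-links', 'atomic-notes', 'mocs', 'processing-pipeline']
+ ```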
+
+ ## Initial versus incremental resolution
+
+ Two resolution modes serve different moments in the system's lifecycle. Initial resolution runs at setup, sorting all selected modules into a valid activation sequence — and because [[use-case presets dissolve the tension between composability and simplicity]], the most common trigger for initial resolution is a user selecting a preset, which translates a use-case label into an ordered activation sequence without exposing the module-level complexity. Incremental resolution runs when a user adds a module to an already-running system — it computes the transitive dependencies of the new module, filters those already active, and presents only what needs to be newly enabled. This incremental pattern is why dependency resolution enables Gall's Law at the module level: since [[complex systems evolve from simple working systems]], a user starts with yaml-schema and wiki-links, then adds mocs a month later when fifty notes make navigation painful. The resolver calculates that mocs requires atomic-notes, checks that atomic-notes requires yaml-schema and wiki-links (already active), and presents a single addition — atomic-notes — for confirmation. Each evolution step is minimal, legible, and safe: since [[the no wrong patches guarantee ensures any valid module combination produces a valid system]], the resolved combination cannot corrupt what already works.
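+
+ A sketch of the incremental mode, continuing the DEPS example above (the rationale strings are invented; a real module would declare its own):
+
+ ```python
+ # Incremental resolution: compute the new module's transitive
+ # dependencies, subtract what is already active, and present only
+ # the delta together with a human-readable reason for each addition.
+ def transitive_deps(deps, module):
+     needed, stack = set(), [module]
+     while stack:
+         for dep in deps[stack.pop()]:
+             if dep not in needed:
+                 needed.add(dep)
+                 stack.append(dep)
+     return needed
+
+ def add_module(deps, active, module, rationale):
+     delta = (transitive_deps(deps, module) | {module}) - active
+     for m in sorted(delta):
+         print(f"enabling {m}: {rationale.get(m, 'transitive dependency')}")
+     return active | delta
+
+ active = {"yaml-schema", "wiki-links"}
+ active = add_module(DEPS, active, "mocs", {
+     "mocs": "requested by user",
+     "atomic-notes": "mocs indexes single-claim notes with explicit links",
+ })
+ # -> enabling atomic-notes: mocs indexes single-claim notes with explicit links
+ # -> enabling mocs: requested by user
+ ```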
+
+ ## Resolution explanations as architectural education
+
+ The transparency dimension is what separates this from standard package management. Every resolution produces a human-readable explanation showing what was resolved, what was already active, and what needed to be newly enabled. When a user requests processing-pipeline, the explanation reads: "Enabling processing-pipeline also enabled atomic-notes because processing-pipeline requires structured notes with explicit linking. yaml-schema and wiki-links were already active." This transforms dependency resolution from a black box that silently enables things into an architectural tutorial that teaches users why components relate as they do. Resolution explanations are a specific instance of how [[justification chains enable forward backward and evolution reasoning about configuration decisions]] — the backward reasoning mode lets a user trace from any enabled module to the claims and constraints that required it, and the resolution explanation IS a justification chain rendered at module activation time. The user learns that processing-pipeline needs atomic-notes because pipeline phases (extract, reflect, reweave, verify) assume each note is a single claim with YAML frontmatter — not because of an arbitrary technical constraint, but because the processing methodology requires composable units to operate on.
+
+ This transparency matters because it addresses a specific failure mode of composable systems. Since [[configuration dimensions interact so choices in one create pressure on others]], module selection has design-level consequences beyond the mechanical dependencies the resolver checks. The resolver ensures all declared dependencies are satisfied, but it cannot verify that the selected modules form a coherent configuration — atomic granularity with shallow navigation is structurally valid but practically unnavigable. By explaining WHY modules depend on each other, the resolution explanation gives users the conceptual framework to evaluate coherence themselves. A user who understands that processing-pipeline needs atomic-notes because pipeline phases operate on single-claim units can reason about whether their intended workflow actually needs per-claim granularity, rather than blindly accepting whatever the resolver enables.
+
+ ## Verifiability through auditable dependency contracts
+
+ The verifiability property follows from transparency. Since [[module communication through shared YAML fields creates loose coupling without direct dependencies]], what the resolver actually verifies is that the field-providing modules are active for every field that consuming modules expect to read. If mocs reads `topics` and `topics` is contributed by the atomic-notes module, the resolver ensures atomic-notes is active before enabling mocs. The dependency graph IS the communication contract: it formalizes which modules write which shared fields and which modules read them. This makes the dependency graph auditable — an architect can inspect it to verify that no module reads fields that no active module writes, that no circular dependency has crept in through an undocumented convention, and that the activation order respects the layered architecture. But since [[implicit dependencies create distributed monoliths that fail silently across configurations]], the resolver can only verify what is declared: a module that reads a field without declaring a dependency on its provider passes resolution but fails silently when the provider is absent, which means the dependency graph is trustworthy only to the extent that dependency declarations are complete. And since [[progressive schema validates only what active modules require not the full system schema]], the resolved module set determines the validation scope — what the resolver enables is exactly what the validator checks, creating consistency between the composition decision and the enforcement surface.
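+
+ The audit is mechanical once the contract is declared. A sketch (the PROVIDES and READS declarations are illustrative assumptions about which modules write and read which shared fields):
+
+ ```python
+ # Audit the communication contract: every field an active module
+ # reads must be written by some active module. A non-empty result
+ # names the silent failures that undeclared dependencies would hide.
+ PROVIDES = {"yaml-schema": {"kind"}, "atomic-notes": {"topics", "description"}}
+ READS = {"mocs": {"topics"}, "processing-pipeline": {"topics", "kind"}}
+
+ def audit_fields(active):
+     written = {f for m in active for f in PROVIDES.get(m, set())}
+     gaps = {m: READS.get(m, set()) - written for m in active}
+     return {m: fields for m, fields in gaps.items() if fields}
+
+ print(audit_fields({"mocs"}))                  # {'mocs': {'topics'}} - provider missing
+ print(audit_fields({"mocs", "atomic-notes"}))  # {} - contract satisfied
+ ```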
+
+ ## Shadow sides: NP-completeness and incomplete explanations
+
+ There is a shadow side that the theoretical literature makes concrete. Dependency resolution in its general form is NP-complete — a result demonstrated through reduction from 3-SAT. For knowledge system modules the practical constraint structure is far simpler than software package repositories (tens of modules rather than millions, no version conflicts, just capability requirements), so topological sort suffices. But as a module ecosystem grows, richer constraint types may emerge — mutual exclusions between modules, version-specific capabilities, platform-conditional dependencies — that push resolution beyond what topological sort handles. The knowledge system equivalent of PubGrub-style conflict-driven solving may eventually become necessary, with its key advantage being human-readable error explanations when resolution fails. For now, the DAG structure and small module count keep resolution trivially fast, but the architecture should not assume this simplicity is permanent.
+
+ A second shadow side is that transparency can mislead when the explanation is technically correct but architecturally incomplete. Telling a user "processing-pipeline also enabled atomic-notes" explains the module dependency but not the dimension interaction — it does not explain that enabling atomic notes with a shallow navigation configuration produces friction that will eventually force enabling mocs and deep navigation too. The resolution explanation addresses module-level composition; dimension-level coherence requires additional reasoning that the resolver does not perform. The gap between what the resolver verifies (structural completeness) and what the user needs to understand (design coherence) is the same gap that [[the no wrong patches guarantee ensures any valid module combination produces a valid system]] acknowledges — structural validity is the floor, not the ceiling. A further asymmetry emerges because [[module deactivation must account for structural artifacts that survive the toggle]]: dependency resolution handles the forward direction (what to enable when adding a module) but has no corresponding protocol for the reverse direction (what to clean up when removing a module). Resolution explanations teach users why modules were enabled; nothing teaches them what structural artifacts remain after modules are disabled.
+
+ ---
+
+ Relevant Notes:
+ - [[composable knowledge architecture builds systems from independent toggleable modules not monolithic templates]] — parent architecture: dependency resolution is the computational engine that makes composable module selection practical rather than manual
+ - [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]] — the layer hierarchy guarantees acyclicity: dependencies always point downward through abstraction layers, so the DAG structure is a consequence of the layer design rather than an independent constraint
+ - [[the no wrong patches guarantee ensures any valid module combination produces a valid system]] — the safety property that dependency resolution preserves: resolution ensures all declared dependencies are satisfied, and the no wrong patches guarantee ensures that any resolved combination produces a valid system
+ - [[configuration dimensions interact so choices in one create pressure on others]] — dimension coupling is the design-level analog: dependency resolution handles module-level composition while dimension interaction handles configuration-level coherence; both constrain the valid space but at different abstraction levels
+ - [[complex systems evolve from simple working systems]] — incremental resolution enables Gall's Law: adding one module months later triggers resolution of only the new transitive dependencies, so the system grows at friction points without requiring upfront planning of the full dependency graph
+ - [[module communication through shared YAML fields creates loose coupling without direct dependencies]] — the communication substrate that dependencies point to: modules depend on shared field conventions rather than on each other directly, so what the resolver actually checks is whether the convention-providing modules are active
+ - [[progressive schema validates only what active modules require not the full system schema]] — extends transparency to validation: just as resolution explains what was enabled and why, progressive schema ensures only enabled modules generate validation requirements, so the user encounters consistent behavior between what resolution showed and what validation enforces
+ - [[implicit dependencies create distributed monoliths that fail silently across configurations]] — the anti-pattern that bypasses dependency resolution: undeclared dependencies cannot be resolved by topological sort, so the resolver approves combinations that fail silently when implicit providers are absent
+ - [[justification chains enable forward backward and evolution reasoning about configuration decisions]] — resolution explanations are a specific instance of justification chains: each resolution explanation is a backward-reasoning chain from enabled module to the architectural claims that require it
+ - [[use-case presets dissolve the tension between composability and simplicity]] — presets are the primary trigger for initial resolution: selecting a use-case label translates to a module list that the resolver sorts into activation order, giving template-level simplicity backed by composable resolution
+ - [[module deactivation must account for structural artifacts that survive the toggle]] — the lifecycle gap: dependency resolution handles the forward direction (what to enable) but has no corresponding protocol for the reverse direction (what to clean up when a module is removed)
+ - [[each module must be describable in one sentence under 200 characters or it does too many things]] — enables explanation legibility: focused modules produce dependency rationale expressible in one sentence, while unfocused modules require compound explanations that obscure the architectural teaching function resolution is designed to provide
+ - [[friction-driven module adoption prevents configuration debt by adding complexity only at pain points]] — the primary context for incremental resolution: when friction triggers a module addition, the resolver explains what transitive dependencies need enabling and why, turning each pain-point-driven addition into architectural education
+
+ Topics:
+ - [[design-dimensions]]
@@ -0,0 +1,63 @@
+ ---
+ description: Templates only permit deviation from fixed starting points while derivation traverses a claim graph to compose justified configurations, producing justification chains that improve recursively
+ kind: research
+ topics: ["[[design-dimensions]]"]
+ methodology: ["Original"]
+ source: [[knowledge-system-derivation-blueprint]]
+ ---
+
+ # derivation generates knowledge systems from composable research claims not template customization
+
+ The tempting approach to producing knowledge systems for new use cases is to start with a template — a working system like this vault — strip the domain-specific parts, and offer what remains as a starting point. Users customize by removing what they don't need and adding what they want. This feels practical because templates provide immediate structure. But it fails at scale because the space of possible knowledge systems is vast, and templates can only cover the configurations their authors imagined.
+
+ Derivation works differently. Instead of starting with a fixed system and carving away, it starts with a graph of research claims about how knowledge systems work — since [[eight configuration dimensions parameterize the space of possible knowledge systems]], these claims span note design, linking philosophy, processing intensity, navigation structure, maintenance cadence, schema density, and automation level — and composes a new system by traversing relevant claims and making configuration decisions that each trace back to specific evidence. The output is not a simplified copy of an existing system but a novel configuration justified by the research that informs each choice.
+
+ Three properties make derivation fundamentally superior to templating. First, the configuration space is combinatorially large. Since [[knowledge system architecture is parameterized by platform capabilities not fixed by methodology]], the same conceptual system manifests differently depending on what the platform can support, and within each platform tier the eight configuration dimensions (granularity, organization, linking, processing intensity, navigation depth, maintenance cadence, schema density, automation level) interact to create millions of valid configurations. No template catalog covers this space because the combinations are multiplicative, not additive. Derivation can explore the space because it composes individual dimension choices rather than selecting from pre-composed packages — though since [[configuration dimensions interact so choices in one create pressure on others]], the valid region is far smaller than the raw combinatorial product, which actually helps by constraining derivation to coherent configurations.
+
+ Vocabulary is where this difference surfaces first: since [[schema fields should use domain-native vocabulary not abstract terminology]], vocabulary adaptation is the litmus test for genuine derivation — a template might rename folders and call itself "customized," but real derivation reshapes every linguistic surface to speak the target domain's language, because vocabulary carries the ontology the domain has developed through practice.
+
+ Second, derivation produces justification chains that templates cannot provide. When a derived system specifies atomic granularity with heavy processing and deep navigation, the justification chain explains why: atomic notes enable composability (research claim), composability requires explicit linking (interaction pressure), explicit linking demands processing capacity to maintain (constraint propagation), and deep navigation prevents the navigational vertigo that atomic systems create without local hierarchy (failure mode avoidance). Because [[methodology traditions are named points in a shared configuration space not competing paradigms]], derivation can reference these traditions as validated coherence points — Zettelkasten coheres at the atomic-explicit-deep-heavy end, PARA at the coarse-hierarchical-shallow-manual end — while composing novel combinations that no single tradition discovered. A template user gets the same configuration but not the reasoning. Without understanding why the system is shaped a certain way, the user cannot evolve it intelligently — they either drift from the template or preserve features they don't understand.
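+
+ The chain is small enough to show as data. A sketch (the structure and the claim phrasings are illustrative; they paraphrase the example chain above):
+
+ ```python
+ # Sketch of a justification chain as data: each configuration choice
+ # records the claims and pressures that produced it, so the reasoning
+ # travels with the configuration instead of staying in the author's head.
+ from dataclasses import dataclass, field
+
+ @dataclass
+ class Choice:
+     dimension: str
+     value: str
+     because: list[str] = field(default_factory=list)
+
+ chain = [
+     Choice("granularity", "atomic",
+            ["atomic notes enable composability"]),
+     Choice("linking", "explicit",
+            ["composability requires explicit linking"]),
+     Choice("processing", "heavy",
+            ["explicit linking demands processing capacity to maintain"]),
+     Choice("navigation", "deep",
+            ["deep navigation prevents navigational vertigo in atomic systems"]),
+ ]
+ # Backward reasoning reads a choice's `because` list; forward reasoning
+ # traces which later choices cite an earlier one.
+ ```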
20
+
21
+ Third, derivation improves recursively. Since [[bootstrapping principle enables self-improving systems]], every system derived and deployed generates operational observations — what worked, what caused friction, what the users actually needed versus what the derivation predicted. Since [[evolution observations provide actionable signals for system adaptation]], these observations feed back into the claim graph through a diagnostic protocol that maps operational symptoms to structural causes, sharpening existing claims and generating new ones. The next derivation benefits from everything the previous ones learned. Templates improve too, but only through the template author's revisions. Derivation distributes the improvement across the entire claim graph, so discoveries in one deployment context improve derivations for all contexts.
22
+
23
+ But derivation has constraints that prevent it from being unconstrained generation. Since [[complex systems evolve from simple working systems]], Gall's Law requires that even a perfectly derived configuration start simple and evolve through use. The derivation process must target the minimum viable configuration for the user's platform and use case, then embed enough context for the system to evolve where friction emerges — and since [[context files function as agent operating systems through self-referential self-extension]], that context file is not just configuration but an operating system capable of teaching the agent how to extend the system it describes. A derivation that outputs a maximally complex system optimized for the user's stated needs violates evolutionary design even if every individual choice is research-justified. The right derivation produces a simple working system with a clear evolutionary path, not a complete system that needs no evolution.
24
+
25
+ The enabling condition for derivation is the claim graph itself — a dense, interlinked network where claims about note design connect to claims about linking strategy, which connect to claims about processing workflows, which connect to claims about maintenance cadence. Because [[dense interlinked research claims enable derivation while sparse references only enable templating]], this is not a quality preference but a structural threshold: four specific properties (atomic composability, dense interlinking, methodology provenance, semantic queryability) must be present before derivation can function at all. Since [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]], the derivation process is itself layered: foundation-layer choices (files, markdown, wiki links) are universal and require no user input, convention-layer choices (quality standards, naming patterns) depend on use case, automation-layer choices (hooks, skills, pipelines) depend on platform tier, and orchestration-layer choices (multi-phase processing, team coordination) depend on both platform tier and scale. Each layer narrows the audience while deepening the customization.
26
+
27
+ The relationship to existing parameterization work is complementary at multiple levels. Since [[knowledge system architecture is parameterized by platform capabilities not fixed by methodology]], parameterization describes what varies — the dimensions and their ranges. Derivation is the process of traversing those dimensions and composing specific values into a coherent system. And since [[composable knowledge architecture builds systems from independent toggleable modules not monolithic templates]], the composable architecture is the engineering pattern that makes derivation outputs implementable as independent toggles — derivation decides which dimension values to choose, composable architecture ensures those choices manifest as modules with explicit dependencies that can be adopted incrementally. Parameterization is the map of the space, derivation is the journey through it, and composable architecture is the vehicle that carries the chosen configuration to the agent as independently activatable modules rather than a monolithic package. All three are necessary: without parameterization, no decisions to make; without derivation, no principled way to make them; without composable architecture, no way to deliver them as incremental adoptions.
28
+
29
+ The practical value extends further: since [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]], derivation for a new domain inherits capture, connect, and verify as structural constants, focusing design effort entirely on the process step — what transformation does this domain's content require? The pipeline structure is not derived but inherited, which simplifies derivation considerably. And since [[storage versus thinking distinction determines which tool patterns apply]], the generator must resolve an upstream classification before traversing configuration dimensions at all: a storage system and a thinking system occupy fundamentally different regions of the configuration space, making this the first decision that narrows the derivation search. When the target domain does not match any reference domain directly, the derivation agent needs an entry procedure — and since [[novel domains derive by mapping knowledge type to closest reference domain then adapting]], that entry procedure works by classifying what kind of knowledge the domain produces (factual, experiential, competency, outcome, social, creative), mapping to the reference domain that handles that type, then adapting temporal dynamics, ethical requirements, collaboration patterns, and retrieval needs. The analogy-based approach still produces justification chains because each mapping step — knowledge type classification, reference domain selection, adaptation rationale — is traceable and debatable.
30
+
31
+ Derivation also has a temporal dimension that extends beyond the initial generation event. Since [[derived systems follow a seed-evolve-reseed lifecycle]], the same claim graph that produces the initial derivation later enables principled re-derivation. During the evolution phase, a derived system accumulates observations about what works and what causes friction. When accumulated drift produces systemic incoherence — schemas that conflict, navigation that degrades, processing that no longer matches content — the system can be reseeded: re-derived using the original constraints enriched by accumulated operational evidence. This is where derivation's superiority over templating becomes most visible. A template-based system that has drifted can only be restructured by intuition, because the reasoning behind the original choices was never captured. A derived system retains its justification chains, and those chains combined with operational observations enable principled restructuring rather than ad hoc fixes.
32
+
33
+ The distribution problem is the final piece: derivation produces configuration choices, but those choices must reach the agent's platform in a form the agent can implement. Since [[blueprints that teach construction outperform downloads that provide pre-built code for platform-dependent modules]], the automation and orchestration layers cannot ship as pre-built artifacts because platform fragmentation makes any fixed implementation embed assumptions about where it runs. Derivation handles the intellectual problem of composing the right configuration; blueprints handle the distribution problem of getting that configuration implemented on each platform. Without blueprints, derivation would still produce monolithic downloads that resist cross-platform deployment — the intellectual work would be sound but the delivery would collapse back into the template distribution pattern that derivation was designed to replace.
+
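+ One way to picture the difference, with invented field names rather than the package's actual blueprint format: a blueprint defers platform answers to construction time, while a pre-built download has already committed to them.
+
+ ```ts
+ // Invented structure, not the real blueprint format. A download ships
+ // finished code; a blueprint ships steps the agent executes against
+ // whatever capabilities its platform actually has.
+
+ interface BlueprintStep {
+   instruction: string;         // what to construct, as prose the agent follows
+   requiresCapability?: string; // e.g. "hooks", "scheduler", "shell"
+   fallback?: string;           // what to build when the capability is absent
+ }
+
+ // Example: an auto-commit behavior expressed as construction steps.
+ const autoCommit: BlueprintStep[] = [
+   {
+     instruction: "add a post-write hook that stages and commits changed notes",
+     requiresCapability: "hooks",
+     fallback: "add a commit step to the session-end checklist instead",
+   },
+ ];
+ ```
+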
+ The shadow side is that derivation quality depends entirely on claim graph quality. If the research claims are shallow, contradictory, or disconnected, the derived systems will be incoherent. This creates a chicken-and-egg problem: you need a good claim graph to derive good systems, but you need operational feedback from derived systems to build a good claim graph. The resolution is bootstrapping — start with the claims this vault has already validated through its own operation, derive initial systems, collect feedback, and iterate. The first derivations will be rough; the recursive improvement loop is what makes each subsequent one better.
+
+ ---
+
+ Relevant Notes:
+ - [[knowledge system architecture is parameterized by platform capabilities not fixed by methodology]] — provides the parameterization frame that derivation operates within: derivation decides configuration values, platform capabilities constrain which values are reachable
+ - [[bootstrapping principle enables self-improving systems]] — the recursive improvement loop that makes derivation compound: each derived system generates observations that feed back into the claim graph, making the next derivation better
+ - [[complex systems evolve from simple working systems]] — constrains derivation to produce minimum viable configurations rather than theoretically complete ones: Gall's Law says even a perfectly derived system should start simple
+ - [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]] — derivation must respect the layer hierarchy: foundation choices are universal, orchestration choices are platform-specific, so the derivation process itself is layered
+ - [[eight configuration dimensions parameterize the space of possible knowledge systems]] — defines the specific dimensions derivation navigates: granularity, organization, linking, processing intensity, navigation depth, maintenance cadence, schema density, and automation level are the knobs the derivation agent turns
+ - [[configuration dimensions interact so choices in one create pressure on others]] — constrains derivation to produce coherent configurations: the valid configuration space is far smaller than the combinatorial product because dimension choices cascade, so derivation must satisfy interaction constraints not just individual dimension preferences
+ - [[methodology traditions are named points in a shared configuration space not competing paradigms]] — reframes what derivation has to work with: existing traditions are pre-validated coherence points in configuration space, so derivation can use them as seeds for novel combinations rather than starting from raw dimensions
+ - [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]] — simplifies derivation for new domains: the pipeline structure is inherited rather than invented, with only the process step requiring domain-specific design
+ - [[storage versus thinking distinction determines which tool patterns apply]] — the upstream classification derivation must resolve first: before traversing configuration dimensions, the generator must identify whether the target system is storage or thinking, because the two types occupy fundamentally different regions of the configuration space
+ - [[context files function as agent operating systems through self-referential self-extension]] — identifies the output format that makes derived systems evolvable: the context file carries both methodology and self-extension instructions, so derivation produces not just a configuration but an operating system capable of adapting through use
+ - [[novel domains derive by mapping knowledge type to closest reference domain then adapting]] — the entry procedure when no reference domain matches directly: six knowledge type categories identify which reference domain's processing patterns transfer, then four adaptation dimensions customize the configuration
+ - [[schema fields should use domain-native vocabulary not abstract terminology]] — the linguistic adaptation constraint: derivation must reshape every surface label to speak the target domain's language, and this vocabulary adaptation is what distinguishes genuine derivation from template distribution wearing a different label
+ - [[multi-domain systems compose through separate templates and shared graph]] — the composition output: when derivation targets a use case spanning multiple knowledge types, it must compose separate domain templates within a shared graph, with five composition rules and domain coupling strength as the decision factor
+ - [[derived systems follow a seed-evolve-reseed lifecycle]] — temporal extension: derivation is not a one-time event but the mechanism that makes reseeding principled; the claim graph and justification chains enable re-derivation when accumulated evolution produces systemic incoherence
+ - [[evolution observations provide actionable signals for system adaptation]] — the feedback mechanism that makes recursive improvement concrete: six diagnostic patterns convert operational symptoms into specific structural corrections, testing whether derivation choices were correct
+ - [[premature complexity is the most common derivation failure mode]] — derivation's structural incentive toward thoroughness is both its strength and its primary risk: a richer claim graph produces more justified choices, but the composed system can be locally justified at every step while globally unsustainable, requiring a complexity budget as an external constraint
+ - [[configuration paralysis emerges when derivation surfaces too many decisions]] — the UX constraint on derivation: presenting every dimension as a question creates analysis paralysis; derivation must infer secondary choices from primary constraints and surface only genuine choice points, using justification chains to make defaults overrideable without requiring upfront comprehension
+ - [[blueprints that teach construction outperform downloads that provide pre-built code for platform-dependent modules]] — the distribution mechanism that completes derivation's output: derivation composes the right configuration choices, blueprints ship those choices to agents as construction instructions that adapt to each platform rather than pre-built code that embeds single-platform assumptions
+ - [[composable knowledge architecture builds systems from independent toggleable modules not monolithic templates]] — the engineering complement: derivation decides WHAT configuration to produce by traversing the claim graph, composable architecture decides HOW to implement that configuration as independently toggleable modules with explicit dependencies, making derivation outputs incrementally adoptable via addition rather than template subtraction
+ - [[dense interlinked research claims enable derivation while sparse references only enable templating]] — identifies the four structural prerequisites (atomic composability, dense interlinking, methodology provenance, semantic queryability) the research substrate must satisfy before derivation can function; expands the enabling condition into a full claim with a quality threshold below which agents fall back to templating
+
+ Topics:
+ - [[design-dimensions]]
@@ -0,0 +1,27 @@
+ ---
+ description: How the init wizard derives configurations from conversation -- signal extraction, cascade resolution, coherence validation
+ type: moc
+ ---
+
+ # derivation-engine
+
+ The methodology of the /setup wizard: signal extraction from conversation, dimension resolution, cascade effects, coherence validation, personality derivation, and vocabulary transformation.
+
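+ As a rough sketch of the pipeline shape these stages imply -- every function below is an illustrative stub, not the actual /setup implementation:
+
+ ```ts
+ // Illustrative stubs for the stages named above, not the real wizard.
+
+ interface Signal {
+   dimension: string;  // e.g. "maintenance cadence"
+   value: string;      // e.g. "weekly"
+   confidence: number; // 0..1, from how directly the conversation supports it
+ }
+
+ // Signal extraction: mine the onboarding conversation for evidence.
+ function extractSignals(conversation: string[]): Signal[] {
+   return conversation.flatMap((turn) =>
+     /every week/i.test(turn)
+       ? [{ dimension: "maintenance cadence", value: "weekly", confidence: 0.8 }]
+       : []
+   );
+ }
+
+ // Dimension resolution: the highest-confidence signal per dimension wins,
+ // because later (higher-confidence) entries overwrite earlier ones.
+ function resolveDimensions(signals: Signal[]): Map<string, string> {
+   const config = new Map<string, string>();
+   for (const s of [...signals].sort((a, b) => a.confidence - b.confidence)) {
+     config.set(s.dimension, s.value);
+   }
+   return config;
+ }
+
+ // Cascade effects and coherence validation would then adjust dependent
+ // dimensions and reject configurations that conflict as a whole.
+ ```
+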
+ ## Core Ideas
+
+ ### Guidance
+ - [[balance onboarding enforcement and questions to prevent premature complexity]] -- What to enforce, what to explain, and what to ask during /setup -- the decision framework for vault generation that prevents premature complexity
+
+ ## Tensions
+
+ (Capture conflicts as they emerge)
+
+ ## Open Questions
+
+ - What signal confidence threshold balances accuracy against conversation length?
+ - How should anti-signals be weighted in the derivation?
+
+ ---
+
+ Topics:
+ - [[index]]