arscontexta 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/.claude-plugin/marketplace.json +11 -0
  2. package/.claude-plugin/plugin.json +22 -0
  3. package/README.md +683 -0
  4. package/agents/knowledge-guide.md +49 -0
  5. package/bin/cli.mjs +66 -0
  6. package/generators/agents-md.md +240 -0
  7. package/generators/claude-md.md +379 -0
  8. package/generators/features/atomic-notes.md +124 -0
  9. package/generators/features/ethical-guardrails.md +58 -0
  10. package/generators/features/graph-analysis.md +188 -0
  11. package/generators/features/helper-functions.md +92 -0
  12. package/generators/features/maintenance.md +164 -0
  13. package/generators/features/methodology-knowledge.md +70 -0
  14. package/generators/features/mocs.md +144 -0
  15. package/generators/features/multi-domain.md +61 -0
  16. package/generators/features/personality.md +71 -0
  17. package/generators/features/processing-pipeline.md +428 -0
  18. package/generators/features/schema.md +149 -0
  19. package/generators/features/self-evolution.md +229 -0
  20. package/generators/features/self-space.md +78 -0
  21. package/generators/features/semantic-search.md +99 -0
  22. package/generators/features/session-rhythm.md +85 -0
  23. package/generators/features/templates.md +85 -0
  24. package/generators/features/wiki-links.md +88 -0
  25. package/generators/soul-md.md +121 -0
  26. package/hooks/hooks.json +45 -0
  27. package/hooks/scripts/auto-commit.sh +44 -0
  28. package/hooks/scripts/session-capture.sh +35 -0
  29. package/hooks/scripts/session-orient.sh +86 -0
  30. package/hooks/scripts/write-validate.sh +42 -0
  31. package/methodology/AI shifts knowledge systems from externalizing memory to externalizing attention.md +59 -0
  32. package/methodology/BM25 retrieval fails on full-length descriptions because query term dilution reduces match scores.md +39 -0
  33. package/methodology/IBIS framework maps claim-based architecture to structured argumentation.md +58 -0
  34. package/methodology/LLM attention degrades as context fills.md +49 -0
  35. package/methodology/MOC construction forces synthesis that automated generation from metadata cannot replicate.md +49 -0
  36. package/methodology/MOC maintenance investment compounds because orientation savings multiply across every future session.md +41 -0
  37. package/methodology/MOCs are attention management devices not just organizational tools.md +51 -0
  38. package/methodology/PKM failure follows a predictable cycle.md +50 -0
  39. package/methodology/ThreadMode to DocumentMode transformation is the core value creation step.md +52 -0
  40. package/methodology/WIP limits force processing over accumulation.md +53 -0
  41. package/methodology/Zeigarnik effect validates capture-first philosophy because open loops drain attention.md +42 -0
  42. package/methodology/academic research uses structured extraction with cross-source synthesis.md +566 -0
  43. package/methodology/adapt the four-phase processing pipeline to domain-specific throughput needs.md +197 -0
  44. package/methodology/agent notes externalize navigation intuition that search cannot discover and traversal cannot reconstruct.md +48 -0
  45. package/methodology/agent self-memory should be architecturally separate from user knowledge systems.md +48 -0
  46. package/methodology/agent session boundaries create natural automation checkpoints that human-operated systems lack.md +56 -0
  47. package/methodology/agent-cognition.md +107 -0
  48. package/methodology/agents are simultaneously methodology executors and subjects creating a unique trust asymmetry.md +66 -0
  49. package/methodology/aspect-oriented programming solved the same cross-cutting concern problem that hooks solve.md +39 -0
  50. package/methodology/associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles.md +53 -0
  51. package/methodology/attention residue may have a minimum granularity that cannot be subdivided.md +46 -0
  52. package/methodology/auto-commit hooks eliminate prospective memory failures by converting remember-to-act into guaranteed execution.md +47 -0
  53. package/methodology/automated detection is always safe because it only reads state while automated remediation risks content corruption.md +42 -0
  54. package/methodology/automation should be retired when its false positive rate exceeds its true positive rate or it catches zero issues.md +56 -0
  55. package/methodology/backlinks implicitly define notes by revealing usage context.md +35 -0
  56. package/methodology/backward maintenance asks what would be different if written today.md +62 -0
  57. package/methodology/balance onboarding enforcement and questions to prevent premature complexity.md +229 -0
  58. package/methodology/basic level categorization determines optimal MOC granularity.md +51 -0
  59. package/methodology/batching by context similarity reduces switching costs in agent processing.md +43 -0
  60. package/methodology/behavioral anti-patterns matter more than tool selection.md +42 -0
  61. package/methodology/betweenness centrality identifies bridge notes connecting disparate knowledge domains.md +57 -0
  62. package/methodology/blueprints that teach construction outperform downloads that provide pre-built code for platform-dependent modules.md +42 -0
  63. package/methodology/bootstrapping principle enables self-improving systems.md +62 -0
  64. package/methodology/build automatic memory through cognitive offloading and session handoffs.md +285 -0
  65. package/methodology/capture the reaction to content not just the content itself.md +41 -0
  66. package/methodology/claims must be specific enough to be wrong.md +36 -0
  67. package/methodology/closure rituals create clean breaks that prevent attention residue bleed.md +44 -0
  68. package/methodology/cognitive offloading is the architectural foundation for vault design.md +46 -0
  69. package/methodology/cognitive outsourcing risk in agent-operated systems.md +55 -0
  70. package/methodology/coherence maintains consistency despite inconsistent inputs.md +96 -0
  71. package/methodology/coherent architecture emerges from wiki links spreading activation and small-world topology.md +48 -0
  72. package/methodology/community detection algorithms can inform when MOCs should split or merge.md +52 -0
  73. package/methodology/complete navigation requires four complementary types that no single mechanism provides.md +43 -0
  74. package/methodology/complex systems evolve from simple working systems.md +59 -0
  75. package/methodology/composable knowledge architecture builds systems from independent toggleable modules not monolithic templates.md +61 -0
  76. package/methodology/compose multi-domain systems through separate templates and shared graph.md +372 -0
  77. package/methodology/concept-orientation beats source-orientation for cross-domain connections.md +51 -0
  78. package/methodology/confidence thresholds gate automated action between the mechanical and judgment zones.md +50 -0
  79. package/methodology/configuration dimensions interact so choices in one create pressure on others.md +58 -0
  80. package/methodology/configuration paralysis emerges when derivation surfaces too many decisions.md +44 -0
  81. package/methodology/context files function as agent operating systems through self-referential self-extension.md +46 -0
  82. package/methodology/context phrase clarity determines how deep a navigation hierarchy can scale.md +46 -0
  83. package/methodology/continuous small-batch processing eliminates review dread.md +48 -0
  84. package/methodology/controlled disorder engineers serendipity through semantic rather than topical linking.md +51 -0
  85. package/methodology/creative writing uses worldbuilding consistency with character tracking.md +672 -0
  86. package/methodology/cross-links between MOC territories indicate creative leaps and integration depth.md +43 -0
  87. package/methodology/dangling links reveal which notes want to exist.md +62 -0
  88. package/methodology/data exit velocity measures how quickly content escapes vendor lock-in.md +74 -0
  89. package/methodology/decontextualization risk means atomicity may strip meaning that cannot be recovered.md +48 -0
  90. package/methodology/dense interlinked research claims enable derivation while sparse references only enable templating.md +47 -0
  91. package/methodology/dependency resolution through topological sort makes module composition transparent and verifiable.md +56 -0
  92. package/methodology/derivation generates knowledge systems from composable research claims not template customization.md +63 -0
  93. package/methodology/derivation-engine.md +27 -0
  94. package/methodology/derived systems follow a seed-evolve-reseed lifecycle.md +56 -0
  95. package/methodology/description quality for humans diverges from description quality for keyword search.md +73 -0
  96. package/methodology/descriptions are retrieval filters not summaries.md +112 -0
  97. package/methodology/design MOCs as attention management devices with lifecycle governance.md +318 -0
  98. package/methodology/design-dimensions.md +66 -0
  99. package/methodology/digital mutability enables note evolution that physical permanence forbids.md +54 -0
  100. package/methodology/discovery-retrieval.md +48 -0
  101. package/methodology/distinctiveness scoring treats description quality as measurable.md +69 -0
  102. package/methodology/does agent processing recover what fast capture loses.md +43 -0
  103. package/methodology/domain-compositions.md +37 -0
  104. package/methodology/dual-coding with visual elements could enhance agent traversal.md +55 -0
  105. package/methodology/each module must be describable in one sentence under 200 characters or it does too many things.md +45 -0
  106. package/methodology/each new note compounds value by creating traversal paths.md +55 -0
  107. package/methodology/eight configuration dimensions parameterize the space of possible knowledge systems.md +56 -0
  108. package/methodology/elaborative encoding is the quality gate for new notes.md +55 -0
  109. package/methodology/enforce schema with graduated strictness across capture processing and query zones.md +221 -0
  110. package/methodology/enforcing atomicity can create paralysis when ideas resist decomposition.md +43 -0
  111. package/methodology/engineering uses technical decision tracking with architectural memory.md +766 -0
  112. package/methodology/every knowledge domain shares a four-phase processing skeleton that diverges only in the process step.md +53 -0
  113. package/methodology/evolution observations provide actionable signals for system adaptation.md +67 -0
  114. package/methodology/external memory shapes cognition more than base model.md +60 -0
  115. package/methodology/faceted classification treats notes as multi-dimensional objects rather than folder contents.md +65 -0
  116. package/methodology/failure-modes.md +27 -0
  117. package/methodology/false universalism applies same processing logic regardless of domain.md +49 -0
  118. package/methodology/federated wiki pattern enables multi-agent divergence as feature not bug.md +59 -0
  119. package/methodology/flat files break at retrieval scale.md +75 -0
  120. package/methodology/forced engagement produces weak connections.md +48 -0
  121. package/methodology/four abstraction layers separate platform-agnostic from platform-dependent knowledge system features.md +47 -0
  122. package/methodology/fresh context per task preserves quality better than chaining phases.md +44 -0
  123. package/methodology/friction reveals architecture.md +63 -0
  124. package/methodology/friction-driven module adoption prevents configuration debt by adding complexity only at pain points.md +48 -0
  125. package/methodology/gardening cycle implements tend prune fertilize operations.md +41 -0
  126. package/methodology/generation effect gate blocks processing without transformation.md +40 -0
  127. package/methodology/goal-driven memory orchestration enables autonomous domain learning through directed compute allocation.md +41 -0
  128. package/methodology/good descriptions layer heuristic then mechanism then implication.md +57 -0
  129. package/methodology/graph-structure.md +65 -0
  130. package/methodology/guided notes might outperform post-hoc structuring for high-volume capture.md +37 -0
  131. package/methodology/health wellness uses symptom-trigger correlation with multi-dimensional tracking.md +819 -0
  132. package/methodology/hook composition creates emergent methodology from independent single-concern components.md +47 -0
  133. package/methodology/hook enforcement guarantees quality while instruction enforcement merely suggests it.md +51 -0
  134. package/methodology/hook-driven learning loops create self-improving methodology through observation accumulation.md +62 -0
  135. package/methodology/hooks are the agent habit system that replaces the missing basal ganglia.md +40 -0
  136. package/methodology/hooks cannot replace genuine cognitive engagement yet more automation is always tempting.md +87 -0
  137. package/methodology/hooks enable context window efficiency by delegating deterministic checks to external processes.md +47 -0
  138. package/methodology/idempotent maintenance operations are safe to automate because running them twice produces the same result as running them once.md +44 -0
  139. package/methodology/implement condition-based maintenance triggers for derived systems.md +255 -0
  140. package/methodology/implicit dependencies create distributed monoliths that fail silently across configurations.md +58 -0
  141. package/methodology/implicit knowledge emerges from traversal.md +55 -0
  142. package/methodology/incremental formalization happens through repeated touching of old notes.md +60 -0
  143. package/methodology/incremental reading enables cross-source connection finding.md +39 -0
  144. package/methodology/index.md +32 -0
  145. package/methodology/inline links carry richer relationship data than metadata fields.md +91 -0
  146. package/methodology/insight accretion differs from productivity in knowledge systems.md +41 -0
  147. package/methodology/intermediate packets enable assembly over creation.md +52 -0
  148. package/methodology/intermediate representation pattern enables reliable vault operations beyond regex.md +62 -0
  149. package/methodology/justification chains enable forward backward and evolution reasoning about configuration decisions.md +46 -0
  150. package/methodology/knowledge system architecture is parameterized by platform capabilities not fixed by methodology.md +51 -0
  151. package/methodology/knowledge systems become communication partners through complexity and memory humans cannot sustain.md +47 -0
  152. package/methodology/knowledge systems share universal operations and structural components across all methodology traditions.md +46 -0
  153. package/methodology/legal case management uses precedent chains with regulatory change propagation.md +892 -0
  154. package/methodology/live index via periodic regeneration keeps discovery current.md +58 -0
  155. package/methodology/local-first file formats are inherently agent-native.md +69 -0
  156. package/methodology/logic column pattern separates reasoning from procedure.md +35 -0
  157. package/methodology/maintenance operations are more universal than creative pipelines because structural health is domain-invariant.md +47 -0
  158. package/methodology/maintenance scheduling frequency should match consequence speed not detection capability.md +50 -0
  159. package/methodology/maintenance targeting should prioritize mechanism and theory notes.md +26 -0
  160. package/methodology/maintenance-patterns.md +72 -0
  161. package/methodology/markdown plus YAML plus ripgrep implements a queryable graph database without infrastructure.md +55 -0
  162. package/methodology/maturity field enables agent context prioritization.md +33 -0
  163. package/methodology/memory-architecture.md +27 -0
  164. package/methodology/metacognitive confidence can diverge from retrieval capability.md +42 -0
  165. package/methodology/metadata reduces entropy enabling precision over recall.md +91 -0
  166. package/methodology/methodology development should follow the trajectory from documentation to skill to hook as understanding hardens.md +80 -0
  167. package/methodology/methodology traditions are named points in a shared configuration space not competing paradigms.md +64 -0
  168. package/methodology/mnemonic medium embeds verification into navigation.md +46 -0
  169. package/methodology/module communication through shared YAML fields creates loose coupling without direct dependencies.md +44 -0
  170. package/methodology/module deactivation must account for structural artifacts that survive the toggle.md +49 -0
  171. package/methodology/multi-domain systems compose through separate templates and shared graph.md +61 -0
  172. package/methodology/multi-domain-composition.md +27 -0
  173. package/methodology/narrow folksonomy optimizes for single-operator retrieval unlike broad consensus tagging.md +53 -0
  174. package/methodology/navigation infrastructure passes through distinct scaling regimes that require qualitative strategy shifts.md +48 -0
  175. package/methodology/navigational vertigo emerges in pure association systems without local hierarchy.md +54 -0
  176. package/methodology/note titles should function as APIs enabling sentence transclusion.md +51 -0
  177. package/methodology/note-design.md +57 -0
  178. package/methodology/notes are skills — curated knowledge injected when relevant.md +62 -0
  179. package/methodology/notes function as cognitive anchors that stabilize attention during complex tasks.md +41 -0
  180. package/methodology/novel domains derive by mapping knowledge type to closest reference domain then adapting.md +50 -0
  181. package/methodology/nudge theory explains graduated hook enforcement as choice architecture for agents.md +59 -0
  182. package/methodology/observation and tension logs function as dead-letter queues for failed automation.md +51 -0
  183. package/methodology/operational memory and knowledge memory serve different functions in agent architecture.md +48 -0
  184. package/methodology/operational wisdom requires contextual observation.md +52 -0
  185. package/methodology/orchestrated vault creation transforms arscontexta from tool to autonomous knowledge factory.md +40 -0
  186. package/methodology/organic emergence versus active curation creates a fundamental vault governance tension.md +68 -0
  187. package/methodology/orphan notes are seeds not failures.md +38 -0
  188. package/methodology/over-automation corrupts quality when hooks encode judgment rather than verification.md +62 -0
  189. package/methodology/people relationships uses Dunbar-layered graphs with interaction tracking.md +659 -0
  190. package/methodology/personal assistant uses life area management with review automation.md +610 -0
  191. package/methodology/platform adapter translation is semantic not mechanical because hook event meanings differ.md +40 -0
  192. package/methodology/platform capability tiers determine which knowledge system features can be implemented.md +48 -0
  193. package/methodology/platform fragmentation means identical conceptual operations require different implementations across agent environments.md +44 -0
  194. package/methodology/premature complexity is the most common derivation failure mode.md +45 -0
  195. package/methodology/prevent domain-specific failure modes through the vulnerability matrix.md +336 -0
  196. package/methodology/processing effort should follow retrieval demand.md +57 -0
  197. package/methodology/processing-workflows.md +75 -0
  198. package/methodology/product management uses feedback pipelines with experiment tracking.md +789 -0
  199. package/methodology/productivity porn risk in meta-system building.md +30 -0
  200. package/methodology/programmable notes could enable property-triggered workflows.md +64 -0
  201. package/methodology/progressive disclosure means reading right not reading less.md +69 -0
  202. package/methodology/progressive schema validates only what active modules require not the full system schema.md +49 -0
  203. package/methodology/project management uses decision tracking with stakeholder context.md +776 -0
  204. package/methodology/propositional link semantics transform wiki links from associative to reasoned.md +87 -0
  205. package/methodology/prospective memory requires externalization.md +53 -0
  206. package/methodology/provenance tracks where beliefs come from.md +62 -0
  207. package/methodology/queries evolve during search so agents should checkpoint.md +35 -0
  208. package/methodology/question-answer metadata enables inverted search patterns.md +39 -0
  209. package/methodology/random note resurfacing prevents write-only memory.md +33 -0
  210. package/methodology/reconciliation loops that compare desired state to actual state enable drift correction without continuous monitoring.md +59 -0
  211. package/methodology/reflection synthesizes existing notes into new insight.md +100 -0
  212. package/methodology/retrieval utility should drive design over capture completeness.md +69 -0
  213. package/methodology/retrieval verification loop tests description quality at scale.md +81 -0
  214. package/methodology/role field makes graph structure explicit.md +94 -0
  215. package/methodology/scaffolding enables divergence that fine-tuning cannot.md +67 -0
  216. package/methodology/schema enforcement via validation agents enables soft consistency.md +60 -0
  217. package/methodology/schema evolution follows observe-then-formalize not design-then-enforce.md +65 -0
  218. package/methodology/schema field names are the only domain specific element in the universal note pattern.md +46 -0
  219. package/methodology/schema fields should use domain-native vocabulary not abstract terminology.md +47 -0
  220. package/methodology/schema templates reduce cognitive overhead at capture time.md +55 -0
  221. package/methodology/schema validation hooks externalize inhibitory control that degrades under cognitive load.md +48 -0
  222. package/methodology/schema-enforcement.md +27 -0
  223. package/methodology/self-extension requires context files to contain platform operations knowledge not just methodology.md +47 -0
  224. package/methodology/sense-making vs storage does compression lose essential nuance.md +73 -0
  225. package/methodology/session boundary hooks implement cognitive bookends for orientation and reflection.md +60 -0
  226. package/methodology/session handoff creates continuity without persistent memory.md +43 -0
  227. package/methodology/session outputs are packets for future selves.md +43 -0
  228. package/methodology/session transcript mining enables experiential validation that structural tests cannot provide.md +38 -0
  229. package/methodology/skill context budgets constrain knowledge system complexity on agent platforms.md +52 -0
  230. package/methodology/skills encode methodology so manual execution bypasses quality gates.md +50 -0
  231. package/methodology/small-world topology requires hubs and dense local links.md +99 -0
  232. package/methodology/source attribution enables tracing claims to foundations.md +38 -0
  233. package/methodology/spaced repetition scheduling could optimize vault maintenance.md +44 -0
  234. package/methodology/spreading activation models how agents should traverse.md +79 -0
  235. package/methodology/stale navigation actively misleads because agents trust curated maps completely.md +43 -0
  236. package/methodology/stigmergy coordinates agents through environmental traces without direct communication.md +62 -0
  237. package/methodology/storage versus thinking distinction determines which tool patterns apply.md +56 -0
  238. package/methodology/structure enables navigation without reading everything.md +52 -0
  239. package/methodology/structure without processing provides no value.md +56 -0
  240. package/methodology/student learning uses prerequisite graphs with spaced retrieval.md +770 -0
  241. package/methodology/summary coherence tests composability before filing.md +37 -0
  242. package/methodology/tag rot applies to wiki links because titles serve as both identifier and display text.md +50 -0
  243. package/methodology/temporal media must convert to spatial text for agent traversal.md +43 -0
  244. package/methodology/temporal processing priority creates age-based inbox urgency.md +45 -0
  245. package/methodology/temporal separation of capture and processing preserves context freshness.md +39 -0
  246. package/methodology/ten universal primitives form the kernel of every viable agent knowledge system.md +162 -0
  247. package/methodology/testing effect could enable agent knowledge verification.md +38 -0
  248. package/methodology/the AgentSkills standard embodies progressive disclosure at the skill level.md +40 -0
  249. package/methodology/the derivation engine improves recursively as deployed systems generate observations.md +49 -0
  250. package/methodology/the determinism boundary separates hook methodology from skill methodology.md +46 -0
  251. package/methodology/the fix-versus-report decision depends on determinism reversibility and accumulated trust.md +45 -0
  252. package/methodology/the generation effect requires active transformation not just storage.md +57 -0
  253. package/methodology/the no wrong patches guarantee ensures any valid module combination produces a valid system.md +58 -0
  254. package/methodology/the system is the argument.md +46 -0
  255. package/methodology/the vault constitutes identity for agents.md +86 -0
  256. package/methodology/the vault methodology transfers because it encodes cognitive science not domain specifics.md +47 -0
  257. package/methodology/therapy journal uses warm personality with pattern detection for emotional processing.md +584 -0
  258. package/methodology/three capture schools converge through agent-mediated synthesis.md +55 -0
  259. package/methodology/three concurrent maintenance loops operate at different timescales to catch different classes of problems.md +56 -0
  260. package/methodology/throughput matters more than accumulation.md +58 -0
  261. package/methodology/title as claim enables traversal as reasoning.md +50 -0
  262. package/methodology/topological organization beats temporal for knowledge work.md +52 -0
  263. package/methodology/trading uses conviction tracking with thesis-outcome correlation.md +699 -0
  264. package/methodology/trails transform ephemeral navigation into persistent artifacts.md +39 -0
  265. package/methodology/transform universal vocabulary to domain-native language through six levels.md +259 -0
  266. package/methodology/type field enables structured queries without folder hierarchies.md +53 -0
  267. package/methodology/use-case presets dissolve the tension between composability and simplicity.md +44 -0
  268. package/methodology/vault conventions may impose hidden rigidity on thinking.md +44 -0
  269. package/methodology/verbatim risk applies to agents too.md +31 -0
  270. package/methodology/vibe notetaking is the emerging industry consensus for AI-native self-organization.md +56 -0
  271. package/methodology/vivid memories need verification.md +45 -0
  272. package/methodology/vocabulary-transformation.md +27 -0
  273. package/methodology/voice capture is the highest-bandwidth channel for agent-delegated knowledge systems.md +45 -0
  274. package/methodology/wiki links are the digital evolution of analog indexing.md +73 -0
  275. package/methodology/wiki links as social contract transforms agents into stewards of incomplete references.md +52 -0
  276. package/methodology/wiki links create navigation paths that shape retrieval.md +63 -0
  277. package/methodology/wiki links implement GraphRAG without the infrastructure.md +101 -0
  278. package/methodology/writing for audience blocks authentic creation.md +22 -0
  279. package/methodology/you operate a system that takes notes.md +79 -0
  280. package/openclaw/SKILL.md +110 -0
  281. package/package.json +45 -0
  282. package/platforms/README.md +51 -0
  283. package/platforms/claude-code/generator.md +61 -0
  284. package/platforms/claude-code/hooks/README.md +186 -0
  285. package/platforms/claude-code/hooks/auto-commit.sh.template +38 -0
  286. package/platforms/claude-code/hooks/session-capture.sh.template +72 -0
  287. package/platforms/claude-code/hooks/session-orient.sh.template +189 -0
  288. package/platforms/claude-code/hooks/write-validate.sh.template +106 -0
  289. package/platforms/openclaw/generator.md +82 -0
  290. package/platforms/openclaw/hooks/README.md +89 -0
  291. package/platforms/openclaw/hooks/bootstrap.ts.template +224 -0
  292. package/platforms/openclaw/hooks/command-new.ts.template +165 -0
  293. package/platforms/openclaw/hooks/heartbeat.ts.template +214 -0
  294. package/platforms/shared/features/README.md +70 -0
  295. package/platforms/shared/skill-blocks/graph.md +145 -0
  296. package/platforms/shared/skill-blocks/learn.md +119 -0
  297. package/platforms/shared/skill-blocks/next.md +131 -0
  298. package/platforms/shared/skill-blocks/pipeline.md +326 -0
  299. package/platforms/shared/skill-blocks/ralph.md +616 -0
  300. package/platforms/shared/skill-blocks/reduce.md +1142 -0
  301. package/platforms/shared/skill-blocks/refactor.md +129 -0
  302. package/platforms/shared/skill-blocks/reflect.md +780 -0
  303. package/platforms/shared/skill-blocks/remember.md +524 -0
  304. package/platforms/shared/skill-blocks/rethink.md +574 -0
  305. package/platforms/shared/skill-blocks/reweave.md +680 -0
  306. package/platforms/shared/skill-blocks/seed.md +320 -0
  307. package/platforms/shared/skill-blocks/stats.md +145 -0
  308. package/platforms/shared/skill-blocks/tasks.md +171 -0
  309. package/platforms/shared/skill-blocks/validate.md +323 -0
  310. package/platforms/shared/skill-blocks/verify.md +562 -0
  311. package/platforms/shared/templates/README.md +35 -0
  312. package/presets/experimental/categories.yaml +1 -0
  313. package/presets/experimental/preset.yaml +38 -0
  314. package/presets/experimental/starter/README.md +7 -0
  315. package/presets/experimental/vocabulary.yaml +7 -0
  316. package/presets/personal/categories.yaml +7 -0
  317. package/presets/personal/preset.yaml +41 -0
  318. package/presets/personal/starter/goals.md +21 -0
  319. package/presets/personal/starter/index.md +17 -0
  320. package/presets/personal/starter/life-areas.md +21 -0
  321. package/presets/personal/starter/people.md +21 -0
  322. package/presets/personal/vocabulary.yaml +32 -0
  323. package/presets/research/categories.yaml +8 -0
  324. package/presets/research/preset.yaml +41 -0
  325. package/presets/research/starter/index.md +17 -0
  326. package/presets/research/starter/methods.md +21 -0
  327. package/presets/research/starter/open-questions.md +21 -0
  328. package/presets/research/vocabulary.yaml +33 -0
  329. package/reference/AUDIT-REPORT.md +238 -0
  330. package/reference/claim-map.md +172 -0
  331. package/reference/components.md +327 -0
  332. package/reference/conversation-patterns.md +542 -0
  333. package/reference/derivation-validation.md +649 -0
  334. package/reference/dimension-claim-map.md +134 -0
  335. package/reference/evolution-lifecycle.md +297 -0
  336. package/reference/failure-modes.md +235 -0
  337. package/reference/interaction-constraints.md +204 -0
  338. package/reference/kernel.yaml +242 -0
  339. package/reference/methodology.md +283 -0
  340. package/reference/open-questions.md +279 -0
  341. package/reference/personality-layer.md +302 -0
  342. package/reference/self-space.md +299 -0
  343. package/reference/semantic-vs-keyword.md +288 -0
  344. package/reference/session-lifecycle.md +298 -0
  345. package/reference/templates/base-note.md +16 -0
  346. package/reference/templates/companion-note.md +70 -0
  347. package/reference/templates/creative-note.md +16 -0
  348. package/reference/templates/learning-note.md +16 -0
  349. package/reference/templates/life-note.md +16 -0
  350. package/reference/templates/moc.md +26 -0
  351. package/reference/templates/relationship-note.md +17 -0
  352. package/reference/templates/research-note.md +19 -0
  353. package/reference/templates/session-log.md +24 -0
  354. package/reference/templates/therapy-note.md +16 -0
  355. package/reference/test-fixtures/edge-case-constraints.md +148 -0
  356. package/reference/test-fixtures/multi-domain.md +164 -0
  357. package/reference/test-fixtures/novel-domain-gaming.md +138 -0
  358. package/reference/test-fixtures/research-minimal.md +102 -0
  359. package/reference/test-fixtures/therapy-full.md +155 -0
  360. package/reference/testing-milestones.md +1087 -0
  361. package/reference/three-spaces.md +363 -0
  362. package/reference/tradition-presets.md +203 -0
  363. package/reference/use-case-presets.md +341 -0
  364. package/reference/validate-kernel.sh +432 -0
  365. package/reference/vocabulary-transforms.md +85 -0
  366. package/scripts/sync-thinking.sh +147 -0
  367. package/skill-sources/graph/SKILL.md +567 -0
  368. package/skill-sources/graph/skill.json +17 -0
  369. package/skill-sources/learn/SKILL.md +254 -0
  370. package/skill-sources/learn/skill.json +17 -0
  371. package/skill-sources/next/SKILL.md +407 -0
  372. package/skill-sources/next/skill.json +17 -0
  373. package/skill-sources/pipeline/SKILL.md +314 -0
  374. package/skill-sources/pipeline/skill.json +17 -0
  375. package/skill-sources/ralph/SKILL.md +604 -0
  376. package/skill-sources/ralph/skill.json +17 -0
  377. package/skill-sources/reduce/SKILL.md +1113 -0
  378. package/skill-sources/reduce/skill.json +17 -0
  379. package/skill-sources/refactor/SKILL.md +448 -0
  380. package/skill-sources/refactor/skill.json +17 -0
  381. package/skill-sources/reflect/SKILL.md +747 -0
  382. package/skill-sources/reflect/skill.json +17 -0
  383. package/skill-sources/remember/SKILL.md +534 -0
  384. package/skill-sources/remember/skill.json +17 -0
  385. package/skill-sources/rethink/SKILL.md +658 -0
  386. package/skill-sources/rethink/skill.json +17 -0
  387. package/skill-sources/reweave/SKILL.md +657 -0
  388. package/skill-sources/reweave/skill.json +17 -0
  389. package/skill-sources/seed/SKILL.md +303 -0
  390. package/skill-sources/seed/skill.json +17 -0
  391. package/skill-sources/stats/SKILL.md +371 -0
  392. package/skill-sources/stats/skill.json +17 -0
  393. package/skill-sources/tasks/SKILL.md +402 -0
  394. package/skill-sources/tasks/skill.json +17 -0
  395. package/skill-sources/validate/SKILL.md +310 -0
  396. package/skill-sources/validate/skill.json +17 -0
  397. package/skill-sources/verify/SKILL.md +532 -0
  398. package/skill-sources/verify/skill.json +17 -0
  399. package/skills/add-domain/SKILL.md +441 -0
  400. package/skills/add-domain/skill.json +17 -0
  401. package/skills/architect/SKILL.md +568 -0
  402. package/skills/architect/skill.json +17 -0
  403. package/skills/ask/SKILL.md +388 -0
  404. package/skills/ask/skill.json +17 -0
  405. package/skills/health/SKILL.md +760 -0
  406. package/skills/health/skill.json +17 -0
  407. package/skills/help/SKILL.md +348 -0
  408. package/skills/help/skill.json +17 -0
  409. package/skills/recommend/SKILL.md +553 -0
  410. package/skills/recommend/skill.json +17 -0
  411. package/skills/reseed/SKILL.md +385 -0
  412. package/skills/reseed/skill.json +17 -0
  413. package/skills/setup/SKILL.md +1688 -0
  414. package/skills/setup/skill.json +17 -0
  415. package/skills/tutorial/SKILL.md +496 -0
  416. package/skills/tutorial/skill.json +17 -0
  417. package/skills/upgrade/SKILL.md +395 -0
  418. package/skills/upgrade/skill.json +17 -0
@@ -0,0 +1,53 @@
+ ---
+ description: Capture, connect, and verify are domain-invariant operations while the process step (extract claims, detect patterns, build prerequisite maps, document decisions) carries all domain-specific logic
+ kind: research
+ topics: ["[[design-dimensions]]", "[[processing-workflows]]"]
+ methodology: ["PKM Research", "Systems Theory"]
+ source: [[knowledge-system-derivation-blueprint]]
+ ---
+
+ # every knowledge domain shares a four-phase processing skeleton that diverges only in the process step
+
+ Across research synthesis, therapy journaling, project management, creative writing, personal life tracking, and relationship management, the same four operations recur in the same order: capture content, process it into domain-appropriate form, connect it to existing knowledge, then verify the result. The skeleton is universal. What makes a research vault different from a therapy journal is not the pipeline shape but what happens inside phase two.
+
+ This becomes visible when you lay domain processing side by side. Research extracts atomic claims from sources. Therapy detects patterns across temporal entries. Learning builds prerequisite maps and schedules reviews. Project management documents decisions with rationale. Personal life routes items to life areas and tracks goals. Relationships capture interaction details and surface follow-up patterns. Creative work develops ideas and links references to drafts. Each of these is a different process operation, but the surrounding phases — capture, connect, verify — are identical in function even when they differ in implementation detail. And since [[schema fields should use domain-native vocabulary not abstract terminology]], the names used for each phase should speak the domain's language — a therapy system calls its process step "pattern recognition" rather than "claim extraction," even though both occupy the same structural position in the skeleton. Because [[methodology traditions are named points in a shared configuration space not competing paradigms]], these domain variations are not competing approaches but different configurations of the same skeleton — each tradition chose its process step implementation based on what its domain requires.
+
+ The reason the skeleton holds is that capture, connection, and verification are structural operations while processing is semantic. Capture answers "what entered the system?" regardless of domain. Connection answers "what relates to what?" regardless of content type. Verification answers "is this well-formed and accurate?" regardless of subject matter. But processing answers "what does this content mean in domain terms?" — and meaning is inherently domain-specific. A therapy pattern recognition algorithm and a research claim extraction workflow share no logic even though they occupy the same structural position in their respective pipelines.
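+
+ A minimal sketch makes the split concrete (all names here are illustrative, not part of this package): the three structural phases can be written once, with only the process step left as a domain-supplied parameter.
+
+ ```typescript
+ import { randomUUID } from "node:crypto";
+
+ // Sketch: the four-phase skeleton as a generic pipeline.
+ // Only the process step varies by domain; the rest is structural.
+ interface Captured { id: string; raw: string }
+ interface Note { id: string; body: string; links: string[] }
+
+ // Phase two is the single domain-specific hook.
+ type ProcessStep = (item: Captured) => Note[];
+
+ // Domain-invariant phases: identical for research, therapy, projects...
+ const capture = (raw: string): Captured => ({ id: randomUUID(), raw });
+ const connect = (note: Note, graph: Note[]): Note => ({
+   ...note,
+   // naive connection-finding: link any existing note mentioned in the body
+   links: graph.filter((n) => note.body.includes(n.id)).map((n) => n.id),
+ });
+ const verify = (note: Note): boolean => note.body.trim().length > 0;
+
+ function runSkeleton(raw: string, process: ProcessStep, graph: Note[]): Note[] {
+   const item = capture(raw);                             // 1. capture
+   const produced = process(item);                        // 2. process (domain logic)
+   const linked = produced.map((n) => connect(n, graph)); // 3. connect
+   return linked.filter(verify);                          // 4. verify
+ }
+ ```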
+
+ This has direct implications for system derivation. Since [[storage versus thinking distinction determines which tool patterns apply]], the skeleton operates in both system types but the process step carries the distinction: storage systems process by filing (classifying, routing, tagging), while thinking systems process by synthesizing (extracting claims, articulating connections, generating new understanding). In thinking systems, the process step is specifically the [[ThreadMode to DocumentMode transformation is the core value creation step]] — the act of transforming chronological captures into timeless claims. The skeleton itself is type-agnostic — even a pure storage system captures, processes, connects, and verifies. What differs is the cognitive intensity of phase two.
+
+ The vault's own pipeline demonstrates the skeleton concretely. Record (capture) → Reduce (process) → Reflect (connect) → Verify (verify) maps directly onto the four phases. The reduce phase is where domain-specific logic lives: mining for claims, detecting enrichment opportunities, classifying extraction types. The other phases are methodologically generic. Reflect finds connections regardless of what kind of content was processed — and because [[elaborative encoding is the quality gate for new notes]], the connect phase's value comes from articulated relationship reasoning rather than mechanical link-adding, a quality requirement that applies identically whether the processed content was research claims or therapy patterns. Verify checks quality regardless of domain semantics. This is why [[fresh context per task preserves quality better than chaining phases]] makes phase isolation work so cleanly — the phases are genuinely different cognitive operations, not arbitrary divisions of continuous work.
+
+ Since [[throughput matters more than accumulation]], the skeleton also reveals where bottlenecks form. Capture is typically fast and getting faster (voice transcription, AI-assisted recording). Verification is automatable (schema checks, link validation). Connection-finding is computationally tractable (semantic search, graph traversal). Processing is the bottleneck because it requires domain expertise and semantic judgment. This matches the vault's experience: reduce is the most resource-intensive phase, the one where model quality matters most, the phase most likely to produce quality variation. The universal skeleton predicts that ANY knowledge system will find its bottleneck at the process step, because that is where domain complexity concentrates.
+
+ The shadow side is that "four phases" may be too clean. Real workflows involve feedback loops — verification failures that trigger reprocessing, connections that reveal the need for additional capture, processing that generates new items needing their own pipeline run. The skeleton describes the forward pass but not the backward maintenance that [[structure without processing provides no value]] demands. The vault addresses this through reweaving — a backward pass that revisits old notes with new understanding — but reweaving does not fit neatly into the four-phase model. Since [[backward maintenance asks what would be different if written today]], reweaving is more like running the skeleton again with a different entry point: instead of starting from capture, you start from an existing note and ask what would be different if processed today. Whether this makes the skeleton a cycle rather than a sequence, or whether backward maintenance is genuinely a different operation, remains an open question worth investigating as more domain implementations emerge. And since [[derived systems follow a seed-evolve-reseed lifecycle]], the skeleton constrains what reseeding can restructure — the four phases themselves are invariant, so reseeding targets the process step implementation and the templates and navigation that sit on top of the skeleton, not the skeleton itself.
+
+ The skeleton's invariance also has a specific consequence for multi-domain systems. Since [[multi-domain systems compose through separate templates and shared graph]], the connect phase being domain-invariant is what makes cross-domain reflect possible. When a research insight about cognitive load needs to connect to a therapy reflection about stress patterns, the connection-finding operation is structurally identical regardless of which domain produced the content. Cross-domain reflect is not an extra feature bolted onto the skeleton — it is the natural consequence of having a domain-invariant connect phase operating over a shared graph. The cross-domain value that multi-domain composition promises depends directly on this structural constant.
+
+ The practical value for derivation is that since [[derivation generates knowledge systems from composable research claims not template customization]], when designing a knowledge system for a new domain you do not need to invent the pipeline from scratch. You inherit capture, connect, and verify as structural constants, then focus design effort entirely on the process step: what transformation does this domain's content require? But the skeleton's universality creates a seductive trap — because the shape transfers, it is tempting to assume the process step's content transfers too. Since [[false universalism applies same processing logic regardless of domain]], exporting a research vault's claim extraction to a therapy journal or a creative writing workspace produces systems that look well-structured but operate on the wrong transformation entirely. For unfamiliar domains where no reference processing pattern exists, the skeleton's invariance is what makes analogy-based derivation tractable — since [[novel domains derive by mapping knowledge type to closest reference domain then adapting]], the entire derivation challenge reduces to designing the right process step, and knowledge type classification (factual, experiential, competency, outcome, social, creative) identifies which existing process step implementation to start from. The skeleton's three constant phases mean the analogy only needs to transfer the process step, not the entire pipeline. Because [[configuration dimensions interact so choices in one create pressure on others]], the process step's intensity cascades through the rest of the skeleton — heavy processing demands dense linking in the connect phase and deeper navigation to remain traversable, while light processing allows the surrounding phases to stay minimal. Since [[eight configuration dimensions parameterize the space of possible knowledge systems]], the answer to that process-step question combined with the dimension settings generates a complete pipeline specification.
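+
+ Continuing the illustrative sketch above, a research vault and a therapy journal would differ only in the process step they plug in, while sharing capture, connect, verify, and (in a multi-domain system) the same graph:
+
+ ```typescript
+ // Research vault: phase two extracts atomic claims from a source.
+ const extractClaims: ProcessStep = (item) =>
+   item.raw.split(/(?<=\.)\s+/).map((sentence, i) => ({
+     id: `claim-${item.id}-${i}`,
+     body: sentence,
+     links: [],
+   }));
+
+ // Therapy journal: phase two surfaces a recurring pattern instead.
+ const detectPattern: ProcessStep = (item) => [
+   { id: `pattern-${item.id}`, body: `Recurring theme in: ${item.raw}`, links: [] },
+ ];
+
+ // Same skeleton, same shared graph, different phase two.
+ const graph: Note[] = [];
+ graph.push(...runSkeleton("Attention degrades as context fills. Hubs help.", extractClaims, graph));
+ graph.push(...runSkeleton("Felt scattered after back-to-back sessions.", detectPattern, graph));
+ ```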
+
+ ---
+
+ Relevant Notes:
+ - [[storage versus thinking distinction determines which tool patterns apply]] — the storage/thinking split determines WHAT the process step produces: storage systems file, thinking systems synthesize, but both share the same four-phase skeleton
+ - [[throughput matters more than accumulation]] — throughput measures the velocity through the skeleton, not volume at any single phase; a system that captures fast but processes slow has a skeletal bottleneck
+ - [[fresh context per task preserves quality better than chaining phases]] — phase isolation is already the vault's implementation of this skeleton: each phase gets its own context because the operations are cognitively distinct
+ - [[structure without processing provides no value]] — the Lazy Cornell anti-pattern is precisely what happens when the skeleton runs with an empty process step: capture and connect without transformation produces organized noise
+ - [[eight configuration dimensions parameterize the space of possible knowledge systems]] — the skeleton constrains the processing intensity dimension specifically: capture, connect, and verify are constants, so intensity governs the process step's depth
+ - [[derivation generates knowledge systems from composable research claims not template customization]] — the skeleton is derivation's most actionable structural claim: inherit capture/connect/verify as constants, derive only the process step for new domains
+ - [[methodology traditions are named points in a shared configuration space not competing paradigms]] — traditions share the skeleton's shape and differ only in their process step implementations: Zettelkasten formulates, PARA summarizes, Cornell structures, GTD routes
+ - [[ThreadMode to DocumentMode transformation is the core value creation step]] — names what the process step does in thinking systems: the transformation from chronological ThreadMode captures into timeless DocumentMode claims is the skeleton's phase two
+ - [[configuration dimensions interact so choices in one create pressure on others]] — the process step's intensity cascades through the skeleton: heavy processing demands dense linking in the connect phase and deep navigation to remain traversable
+ - [[backward maintenance asks what would be different if written today]] — the backward pass that the skeleton's forward-only model cannot accommodate: reweaving re-enters the skeleton at the process step with existing notes rather than new captures
+ - [[elaborative encoding is the quality gate for new notes]] — grounds the connect phase in cognitive science: phase three's value depends on articulated relationship reasoning, not mechanical link-adding
+ - [[novel domains derive by mapping knowledge type to closest reference domain then adapting]] — operationalizes the skeleton for unfamiliar domains: knowledge type classification identifies which existing process step to start from, so the skeleton's invariance makes analogy-based derivation tractable
+ - [[schema fields should use domain-native vocabulary not abstract terminology]] — the vocabulary wrapping: the skeleton is universal but the names for each phase should speak the domain's language — a therapy system says 'pattern recognition' where this vault says 'claim extraction,' even though both occupy the same structural position
+ - [[multi-domain systems compose through separate templates and shared graph]] — cross-domain reflect exploits the skeleton's invariance: because the connect phase is domain-invariant, connection-finding across domain boundaries requires no additional infrastructure beyond the shared graph
+ - [[derived systems follow a seed-evolve-reseed lifecycle]] — the skeleton constrains what reseeding restructures: the four phases are invariant, so reseeding targets the process step implementation and surrounding templates rather than the pipeline shape itself
+ - [[false universalism applies same processing logic regardless of domain]] — the trap this skeleton's universality creates: because the four-phase shape holds everywhere, it is tempting to export the process step's content unchanged — but the skeleton's invariance is structural, not operational, and confusing the two produces technically executable but semantically empty systems
+ - [[maintenance operations are more universal than creative pipelines because structural health is domain-invariant]] — extends the skeleton's invariance insight to the backward pass: the operations that maintain the skeleton's health (schema validation, orphan detection, link integrity) are even more transferable than the skeleton's forward phases because they check purely structural properties, clustering in lower abstraction layers and requiring less platform infrastructure
+
+ Topics:
+ - [[design-dimensions]]
+ - [[processing-workflows]]
@@ -0,0 +1,67 @@
+ ---
+ description: Six diagnostic patterns map operational symptoms to structural causes and prescribed responses, converting accumulated observations into a decision protocol rather than an undifferentiated pile
+ kind: research
+ topics: ["[[design-dimensions]]", "[[maintenance-patterns]]"]
+ methodology: ["Original"]
+ source: [[knowledge-system-derivation-blueprint]]
+ ---
+
+ # evolution observations provide actionable signals for system adaptation
+
+ A derived knowledge system generates operational data from the moment it starts running. Notes accumulate, fields get filled or skipped, agents navigate or fail to navigate, processing produces output that either integrates or sits orphaned. The question is whether this operational data remains an undifferentiated pile of experience or whether it becomes diagnostic intelligence that tells you specifically what is wrong and what to do about it.
+
+ The diagnostic protocol maps six observation patterns to their structural causes and prescribed responses:
+
+ | Observation | What It Signals | Action |
+ |-------------|----------------|--------|
+ | Note type unused for 30+ days | Over-modeled | Consider removing or merging |
+ | Field consistently filled with "N/A" | Required field not useful | Demote to optional |
+ | Field manually added to 20%+ of notes | Organic schema growth | Add to template |
+ | Agent cannot find note within 3 nav steps | Navigation failure | Review MOC structure |
+ | Processing produces notes that sit unlinked | Processing doesn't match domain value pattern | Rethink processing phase |
+ | MOC exceeds threshold (50 agent / 35 human) | Navigation overload | Split into sub-MOCs |
+
+ What makes this more than a troubleshooting checklist is that each row connects a surface symptom to a structural cause. An unused note type is not a content problem but a modeling problem — the derivation hypothesized a distinction the domain does not actually need. Fields filled with "N/A" point not to lazy operators but to schema overreach — the required/optional boundary was drawn in the wrong place. Unlinked processing output is not a connection-finding failure but a processing design mismatch — since [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]], the pipeline produces artifacts that the domain's value pattern does not naturally integrate because the process step was designed for a different kind of transformation than the domain actually needs. This is the runtime signature of what [[false universalism applies same processing logic regardless of domain]] identifies as the most insidious derivation failure: the operations are technically executable but semantically empty for the target domain.
+
+ This diagnostic structure transforms the relationship between observation and action. Since [[hook-driven learning loops create self-improving methodology through observation accumulation]], the vault already has a mechanism for accumulating raw observations — hooks nudge capture, observations pile up, rethink reviews the pile. But accumulation without interpretation produces a growing evidence base that requires increasingly expensive pattern recognition to extract actionable signals. The diagnostic protocol provides the interpretation layer: instead of asking "what patterns emerge from these 40 observations?", you can ask "does any observation match a known diagnostic?" The first question requires synthesis. The second requires lookup. Both are necessary, but the lookup path handles the common cases efficiently so that synthesis capacity is reserved for genuinely novel patterns.
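+
+ To make the lookup path concrete, the six rows can be treated as data (a hypothetical sketch; thresholds the table leaves qualitative are assumed here). Triage becomes a scan for matching rules, and only unmatched observations fall through to synthesis.
+
+ ```typescript
+ // The diagnostic table as a rule list. "Consistently filled with N/A"
+ // is modeled with an assumed 50% threshold; other numbers come from the table.
+ interface VaultStats {
+   daysSinceTypeUsed: Record<string, number>;
+   naFieldRate: Record<string, number>;     // share of notes filling a field with "N/A"
+   manualFieldRate: Record<string, number>; // share of notes adding a field by hand
+   maxNavSteps: number;                     // worst-case steps to reach a note
+   unlinkedProcessedCount: number;          // processed notes that sit unlinked
+   largestMocSize: number;
+ }
+
+ interface Diagnostic { signal: string; matches: (s: VaultStats) => boolean; action: string }
+
+ const diagnostics: Diagnostic[] = [
+   { signal: "over-modeled note type",    matches: (s) => Object.values(s.daysSinceTypeUsed).some((d) => d >= 30), action: "consider removing or merging" },
+   { signal: "required field not useful", matches: (s) => Object.values(s.naFieldRate).some((r) => r > 0.5),       action: "demote to optional" },
+   { signal: "organic schema growth",     matches: (s) => Object.values(s.manualFieldRate).some((r) => r >= 0.2),  action: "add to template" },
+   { signal: "navigation failure",        matches: (s) => s.maxNavSteps > 3,            action: "review MOC structure" },
+   { signal: "processing mismatch",       matches: (s) => s.unlinkedProcessedCount > 0, action: "rethink processing phase" },
+   { signal: "navigation overload",       matches: (s) => s.largestMocSize > 50,        action: "split into sub-MOCs" },
+ ];
+
+ const triage = (stats: VaultStats): string[] =>
+   diagnostics.filter((d) => d.matches(stats)).map((d) => `${d.signal}: ${d.action}`);
+ ```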
+
+ There is a category the six diagnostics do not explicitly cover: automation infrastructure failures. Since [[observation and tension logs function as dead-letter queues for failed automation]], when qmd crashes, rename scripts miss links, or schema migrations skip notes, the failure gets captured as an observation or tension note — but these entries require different triage than the evolution signals the diagnostic table addresses. A qmd crash demands immediate repair or workaround documentation; an unused note type informs a design change on a quarterly cadence. The dead-letter framing suggests the diagnostic protocol needs a seventh row: infrastructure failure entries that require urgency-based triage rather than the scheduled cadence appropriate for evolution signals.
+
+ The protocol also reveals something about how derived systems should evolve. Each diagnostic points in a specific direction: toward simplification (remove unused types, demote useless fields), toward organic growth (promote emergent fields), or toward restructuring (reorganize navigation, redesign processing). These are not random maintenance tasks but evolution pressures that push the system toward fit with its actual use rather than its designed-for use. The diagnostics also serve as the interpretation layer for [[friction-driven module adoption prevents configuration debt by adding complexity only at pain points]] — the five-repetition threshold that determines when a module should be added depends on structured observation rather than intuitive friction sensing. Without the diagnostic protocol, counting repetitions degenerates into counting activities without understanding what they mean, and the friction signal loses the specificity that makes it actionable. Since [[derived systems follow a seed-evolve-reseed lifecycle]], the diagnostics serve as the feedback mechanism for the evolution phase: they tell the agent whether adaptations are incremental corrections (still in evolution) or whether accumulated drift has produced systemic incoherence (reseeding is needed). The derivation process produces a hypothesis about what the domain needs. Evolution observations test that hypothesis against operational reality. The diagnostics close the feedback loop by converting test results into specific structural modifications.
31
+
32
+ Since [[community detection algorithms can inform when MOCs should split or merge]], one row of the diagnostic table -- MOC threshold exceeded -- already has a more sophisticated treatment in the vault. Community detection provides algorithmic signals for reorganization that go beyond simple note counts. This suggests the diagnostic protocol is a starting layer, not a final word. Each row can be deepened: the navigation failure diagnostic could incorporate graph traversal metrics, the schema diagnostics could use field completion rates, the processing mismatch diagnostic could track link density of processed output over time. The schema diagnostics in particular have a more developed treatment: since [[schema evolution follows observe-then-formalize not design-then-enforce]], the quarterly review protocol specifies five concrete signals -- manual field additions, placeholder stuffing, unused enums, patterned free text, oversized MOCs -- that refine three of the six diagnostics here with specific thresholds and evidence-gathering cadence. The protocol's value is not exhaustiveness but the structural insight that operational symptoms have specific structural causes.
33
+
34
+ The distinction between note-level and system-level maintenance matters here. Since [[schema enforcement via validation agents enables soft consistency]], note-level quality is monitored through schema validation -- individual notes checked against their templates. Since [[backward maintenance asks what would be different if written today]], note-level evolution is handled through reweaving -- individual notes reconsidered against current understanding. Evolution diagnostics operate at a different level entirely: they monitor whether the system's structural decisions -- which note types exist, which fields are required, how MOCs are organized, how processing phases work -- still match operational reality. A note can pass every schema check and still be evidence of a system-level problem if it belongs to an over-modeled type that nobody uses.
35
+
36
+ The diagnostic rows also serve as the desired-state declarations for a reconciliation architecture. Since [[reconciliation loops that compare desired state to actual state enable drift correction without continuous monitoring]], each row in the table above specifies what healthy looks like (no unused types, no N/A-stuffed fields, all notes reachable within three navigation steps), how to detect divergence (count-based checks, field completion rates, navigation path analysis), and what remediation to apply. The reconciliation loop is the scheduling infrastructure that runs these diagnostics systematically on a cadence rather than waiting for someone to notice symptoms. But not all diagnostics need the same cadence. Since [[maintenance scheduling frequency should match consequence speed not detection capability]], the appropriate frequency for each diagnostic depends on how fast the underlying problem propagates. Unused note types develop over months as domain understanding evolves — monthly detection suffices. N/A field rates accumulate over multiple sessions as templates encounter real content — weekly checks catch the drift before it compounds. Navigation failure, however, can develop within a single session as new notes shift the graph's navigability — session-start dashboard checks are the appropriate tier. The deterministic diagnostics (unused type count, N/A field rate, manual addition percentage) are candidates for automated scheduled detection at their consequence-matched frequencies, while the judgment-requiring diagnostics (navigation failure, processing mismatch) need agent-level evaluation at their (often more frequent) scheduled intervals.
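+
+ As a concrete illustration of the deterministic tier, the two slowest checks reduce to text queries over frontmatter. The sketch below is a minimal, hypothetical script, assuming notes declare a `type:` field in YAML and that ripgrep is available; the paths, field names, and thresholds are illustrative, not shipped tooling.
+
+ ```bash
+ #!/usr/bin/env bash
+ # Hypothetical sketch: cadence-tiered deterministic diagnostics.
+ VAULT="${1:-.}"
+
+ # Monthly tier: unused note types (slow consequence speed).
+ # Count notes per declared type; rarely used types are removal candidates.
+ rg --no-filename -o '^type: (.+)$' -r '$1' --glob '*.md' "$VAULT" \
+   | sort | uniq -c | sort -n \
+   | awk '$1 < 3 { print "possibly unused type: " $2 " (" $1 " notes)" }'
+
+ # Weekly tier: N/A-stuffed fields (multi-session drift).
+ total=$(rg -l '^---' --glob '*.md' "$VAULT" | wc -l)
+ na=$(rg -l ': +(N/A|n/a)$' --glob '*.md' "$VAULT" | wc -l)
+ echo "placeholder-stuffed notes: $na of $total"
+ ```
+
+ The session-scale navigation check resists this treatment: reachability within three steps requires walking the link graph from entry points, which is part of why it stays in the judgment tier.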
+
+ The deeper implication is that since [[derivation generates knowledge systems from composable research claims not template customization]], derivation is not a one-time event but a hypothesis that needs ongoing testing. The initial derivation maps domain needs to configuration choices. Evolution observations measure whether those choices were correct. The diagnostics convert measurements into corrections. And the corrected system generates new observations that test the corrections. This is the same self-improving loop that hook-driven learning enables at the methodology level, but applied to the system architecture itself -- the structure of the knowledge system becomes subject to evidence-based revision rather than remaining fixed once derived.
+
+ There is genuine uncertainty about whether these six diagnostics are sufficient or whether they cover the most important failure modes. Navigation failure and processing mismatch seem like the most consequential signals because they indicate fundamental design errors, while schema diagnostics (unused types, N/A fields, emergent fields) are more incremental adjustments. The protocol would benefit from weighting -- not all observations are equally urgent. A field filled with "N/A" can wait; an agent unable to find notes within three navigation steps signals that the system is failing at its primary purpose and demands immediate attention. This urgency distinction connects to a broader risk: since [[metacognitive confidence can diverge from retrieval capability]], structural metrics like link density and schema compliance can show a healthy system while navigation failure and processing mismatch go undetected because they require actually attempting the operations the system is supposed to support. The diagnostic protocol is an anti-divergence mechanism that tests functional capability rather than structural appearance.
+
+ The diagnostics themselves are candidates for the encoding trajectory. Since [[methodology development should follow the trajectory from documentation to skill to hook as understanding hardens]], the deterministic diagnostics -- unused note types (count-based), N/A field rates (frequency-based), manual field additions (percentage-based) -- could eventually be encoded as hooks that fire automatically and surface warnings. But the judgment-requiring diagnostics -- navigation failure assessment and processing mismatch evaluation -- should remain at the skill level because they require contextual evaluation that varies with the system's current state and the agent's current understanding of domain value patterns. And since [[confidence thresholds gate automated action between the mechanical and judgment zones]], the encoding trajectory gains a middle tier: diagnostics that are not fully deterministic but can score their own certainty could operate in the confidence-gated zone, auto-remediating when confidence is high (a note type truly unused for 90 days in an active domain) while only suggesting when confidence is medium (a note type unused for 30 days in a domain with seasonal patterns). Beyond confidence scoring, since [[the fix-versus-report decision depends on determinism reversibility and accumulated trust]], each diagnostic remediation must also pass the four conjunctive conditions before auto-applying: removing an unused note type is reversible via git and low-cost if wrong, but demoting a required field has higher stakes because downstream processing may depend on it — the cost condition acts as an independent veto even when confidence is high.
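+
+ The middle tier can be sketched concretely. A hypothetical confidence-gated check for one diagnostic follows, assuming git history as the usage signal; the 30/90-day thresholds and the messages are illustrative, and the point is the graduation from no action to suggestion to remediation candidate, not the specific numbers.
+
+ ```bash
+ #!/usr/bin/env bash
+ # Hypothetical sketch: confidence-gated check for the unused-note-type diagnostic.
+ TYPE="$1"    # e.g. "tension"
+ now=$(date +%s); last=0
+ # Most recent commit touching any note of the given type.
+ while IFS= read -r f; do
+   t=$(git log -1 --format=%ct -- "$f"); t=${t:-0}
+   (( t > last )) && last=$t
+ done < <(rg -l "^type: $TYPE$" --glob '*.md' .)
+ days=$(( (now - last) / 86400 ))
+
+ if   (( days > 90 )); then echo "high confidence: propose removing type '$TYPE'"
+ elif (( days > 30 )); then echo "medium confidence: flag type '$TYPE' for review"
+ else                       echo "no action: type '$TYPE' was touched $days days ago"
+ fi
+ ```
+
+ Even the high-confidence branch only proposes: under the fix-versus-report conditions, the proposal auto-applies only when the change is also reversible and low-cost if wrong.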
+ ---
+
+ Relevant Notes:
+ - [[hook-driven learning loops create self-improving methodology through observation accumulation]] -- provides the accumulation mechanism; this note provides the interpretation framework that converts accumulated observations into targeted system changes
+ - [[community detection algorithms can inform when MOCs should split or merge]] -- one specific diagnostic (MOC threshold) generalized through algorithmic monitoring; this note frames the broader pattern where multiple diagnostics compose into a protocol
+ - [[backward maintenance asks what would be different if written today]] -- the reconsideration mental model operates per-note; this diagnostic protocol operates per-system, identifying which structural components need the reconsideration pass
+ - [[schema enforcement via validation agents enables soft consistency]] -- sibling pattern: validation agents check note-level schema compliance, evolution diagnostics check system-level structural fitness; both are asynchronous maintenance that surfaces issues without blocking
+ - [[derived systems follow a seed-evolve-reseed lifecycle]] -- the diagnostics are the feedback mechanism for the evolution phase: they tell you whether accumulated adaptations have drifted the configuration into incoherence requiring reseeding
+ - [[schema evolution follows observe-then-formalize not design-then-enforce]] -- domain-specific refinement: the schema signal table (manual additions, placeholder stuffing, unused enums) deepens three of the six diagnostics with concrete quarterly review protocol
+ - [[metacognitive confidence can diverge from retrieval capability]] -- the navigation failure and processing mismatch diagnostics are anti-divergence mechanisms that catch system-level failures structural metrics would miss
+ - [[methodology development should follow the trajectory from documentation to skill to hook as understanding hardens]] -- the diagnostics themselves could follow this trajectory: deterministic checks (unused type count, N/A field rate) could become hooks while judgment-requiring diagnostics (processing mismatch) should remain skill-level
+ - [[derivation generates knowledge systems from composable research claims not template customization]] -- derivation produces the configuration hypothesis that these diagnostics test: each diagnostic row measures whether a derivation choice was correct
+ - [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]] -- the processing mismatch diagnostic targets the process step specifically: unlinked output signals that the process step was designed for a different transformation than the domain needs
+ - [[false universalism applies same processing logic regardless of domain]] -- names the derivation anti-pattern that the processing mismatch diagnostic detects: unlinked output is the runtime signal that the process step was designed for a different domain's transformation, making false universalism diagnosable even when the derivation reasoning looked sound
+ - [[module deactivation must account for structural artifacts that survive the toggle]] -- identifies a specific cause of two diagnostic signals: ghost fields from deactivated modules trigger the unused-type and N/A-field rows, making deactivation artifacts a named structural cause alongside over-modeling and schema overreach
+ - [[friction-driven module adoption prevents configuration debt by adding complexity only at pain points]] — the adoption pattern these diagnostics serve: the five-repetition threshold depends on structured friction detection to distinguish genuine pain from noise, making the diagnostic protocol the interpretation layer that operationalizes friction-driven adoption
+ - [[reconciliation loops that compare desired state to actual state enable drift correction without continuous monitoring]] — scheduling infrastructure: each diagnostic row is a desired-state declaration that reconciliation operationalizes by running checks systematically on a cadence rather than waiting for symptoms
+ - [[confidence thresholds gate automated action between the mechanical and judgment zones]] -- response graduation: the diagnostics themselves range from deterministic (unused type count, N/A field rate) to judgment-requiring (navigation failure, processing mismatch), and confidence thresholds determine which diagnostics can trigger automated remediation versus which should only suggest or log, applying the three-tier response pattern to the diagnostic protocol itself
+ - [[maintenance scheduling frequency should match consequence speed not detection capability]] — cadence calibration: the six diagnostics have different consequence speeds and therefore different optimal detection frequencies — unused types develop over months (slow consequence, monthly check), N/A field rates drift over weeks (multi-session, weekly), while navigation failure can develop per-session as new notes arrive (session-scale, per-session dashboard)
+ - [[observation and tension logs function as dead-letter queues for failed automation]] — gap identification: the dead-letter framing reveals that the diagnostic protocol covers evolution signals but not automation infrastructure failures; qmd crashes and script failures require urgency-based triage rather than the scheduled cadence appropriate for design evolution, suggesting a seventh diagnostic category
+ - [[the fix-versus-report decision depends on determinism reversibility and accumulated trust]] — remediation gating for diagnostics: the four conjunctive conditions determine which diagnostic actions can auto-apply (deterministic, reversible, low-cost, trusted) versus which should only report; complements confidence thresholds by adding independent vetoes for cost and reversibility that confidence scoring alone cannot capture
+
+ Topics:
+ - [[design-dimensions]]
+ - [[maintenance-patterns]]
@@ -0,0 +1,60 @@
+ ---
+ description: retrieval architecture shapes what enters the context window and therefore what the agent thinks — memory structure has higher ROI than model upgrades
+ kind: research
+ topics: ["[[agent-cognition]]", "[[note-design]]"]
+ ---
+
+ # external memory shapes cognition more than base model
+
+ What an agent retrieves determines what it thinks. Retrieval is shaped by memory architecture. Therefore memory architecture matters more than base weights.
+
+ ## The Argument
+
+ Cognition happens in context. The context window is filled by:
+ 1. the prompt
+ 2. retrieved information
+ 3. conversation history
+
+ Base model weights determine HOW the context is processed. But WHAT gets processed depends on retrieval.
+
+ Garbage in, garbage out — but also: specific context in, specific conclusions out.
+
+ An agent with a well-structured vault retrieves different material than one with flat files. Different material leads to different reasoning leads to different conclusions. Since [[flat files break at retrieval scale]], the absence of memory architecture does not merely degrade performance — it produces identity degradation for agents, because retrieval failure means losing access to parts of their own cognition.
+
+ The bottleneck is retrieval, not reasoning.
+
+ ## The cognitive science grounding
+
+ Since [[cognitive offloading is the architectural foundation for vault design]], the claim has a deeper foundation than engineering intuition. Clark and Chalmers' Extended Mind Theory establishes that cognition extends beyond the brain into external artifacts that participate in cognitive processes. The vault is not a filing cabinet the agent consults — it is part of the agent's cognitive system. Cowan's working memory limits and attention degradation (for agents, since [[LLM attention degrades as context fills]]) mean that what enters context must be curated, and that curation is an architectural decision. The architecture of the external memory determines what enters the cognitive loop, which determines what conclusions emerge.
+
+ This reframes model upgrades. A better base model processes the same retrieved context more skillfully, but the delta from better processing is bounded by the quality of what was retrieved. A better memory architecture changes WHAT gets retrieved — different material, different conclusions. The retrieval delta compounds across every interaction, while the processing delta is marginal improvement on the same inputs.
+
+ ## Implications
+
+ - Investing in memory architecture has higher ROI than waiting for better models
+ - Vault structure is a form of cognitive architecture
+ - Agents can differentiate through scaffolding even with identical weights — and since [[the vault constitutes identity for agents]], the vault is not augmenting identity but constituting it
+ - Since [[AI shifts knowledge systems from externalizing memory to externalizing attention]], the retrieval bottleneck is evolving — the question is shifting from "what can the agent remember" to "what does the agent attend to," and memory architecture increasingly shapes attention allocation rather than just storage access
+
+ ## This Explains
+
+ Why [[scaffolding enables divergence that fine-tuning cannot]] — scaffolding IS memory architecture. Two agents with identical weights but different vaults think differently because they retrieve different material. The vault is the variable, not the model.
+
+ Why [[notes are skills — curated knowledge injected when relevant]] — each note is a capability the agent lacks without it. The retrieval bottleneck is a capability availability bottleneck. Better memory architecture means more capabilities available at the right moment.
+
+ ---
+
+ Relevant Notes:
+ - [[scaffolding enables divergence that fine-tuning cannot]] — extends: scaffolding IS memory architecture, so divergence through scaffolding is divergence through retrieval architecture
+ - [[wiki links create navigation paths that shape retrieval]] — mechanism: wiki links are the primary structural element through which memory architecture shapes what enters context
+ - [[the vault constitutes identity for agents]] — extends: if retrieval architecture shapes cognition more than weights, then the vault constitutes rather than augments agent identity
+ - [[notes are skills — curated knowledge injected when relevant]] — extends: reframes the retrieval bottleneck as a capability availability bottleneck where each note enables reasoning the agent could not do without it
+ - [[cognitive offloading is the architectural foundation for vault design]] — foundation: Clark and Chalmers Extended Mind Theory provides the cognitive science grounding for why external memory architecture shapes cognition — the vault is a distributed cognitive system, not storage
+ - [[AI shifts knowledge systems from externalizing memory to externalizing attention]] — extends: the bottleneck-is-retrieval claim marks the inflection point where the paradigm shifts from externalizing what you know to externalizing what you attend to
+ - [[flat files break at retrieval scale]] — example: demonstrates the failure mode when memory architecture is absent — retrieval degrades and for agents that means identity degradation
+ - [[session outputs are packets for future selves]] — construction mechanism: session packets are the incremental units through which memory architecture gets built; each session's composable output adds nodes and edges to the retrieval landscape
+
+ Topics:
+ - [[agent-cognition]]
+ - [[note-design]]
@@ -0,0 +1,65 @@
+ ---
+ description: Ranganathan's 1933 PMEST framework formalizes why each YAML field should be an independent classification dimension — facets compose multiplicatively for retrieval, which mono-hierarchies cannot
+ kind: research
+ topics: ["[[graph-structure]]", "[[discovery-retrieval]]"]
+ methodology: ["PKM Research"]
+ ---
+
+ # faceted classification treats notes as multi-dimensional objects rather than folder contents
+
+ S.R. Ranganathan's Colon Classification system, developed in the 1930s for library science, provides the formal theoretical basis for what this vault does with flat folders and YAML metadata. His PMEST framework — Personality, Matter, Energy, Space, Time — treats every document as a multi-dimensional object that can be sliced along independent classification axes. The core insight is that no single axis captures what a document IS, so any system that forces a choice of one axis (like a folder hierarchy) necessarily destroys information about the others.
+
+ This matters because folder hierarchies are mono-hierarchical by nature. A file lives in one folder. You can choose to organize by topic, by type, by date, by source — but you must pick one, and that choice becomes the only efficient retrieval path. Everything else requires searching. Since [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]], the vault already rejects mono-hierarchy in favor of emergent link structure. Faceted classification provides the formal justification for this choice: it's not just that heterarchy is more flexible, it's that mono-classification provably discards information about every dimension except the one you chose.
+
+ The YAML frontmatter system implements faceted classification directly. Each metadata field is a facet — an independent classification dimension:
+
+ | Facet | YAML Field | What It Classifies |
+ |-------|-----------|-------------------|
+ | Content kind | `type` | claim, methodology, tension, problem |
+ | Disciplinary origin | `methodology` | Zettelkasten, Evergreen, Cornell, etc. |
+ | Topic membership | `topics` | Which MOC(s) the note belongs to |
+ | Graph position | `role` | moc, hub, leaf, synthesis |
+ | Development stage | `status` | preliminary, open, dissolved |
+
+ Since [[type field enables structured queries without folder hierarchies]], type metadata already demonstrates one facet in operation — agents can query "all methodology notes" without folder structure. But type is just one dimension. The power of faceted classification is that dimensions compose multiplicatively. "All methodology notes about Zettelkasten" combines type and methodology facets. "All open tensions in graph-structure" combines status, type, and topics. Each facet independently narrows the search space, and since [[metadata reduces entropy enabling precision over recall]], the entropy reduction from combining facets is roughly multiplicative — two facets with 5 values each reduce search space by ~25x, not ~10x.
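+
+ A minimal sketch of the composition in practice, assuming the field names from the table above, ripgrep as the query engine, and GNU xargs (the specific values are illustrative):
+
+ ```bash
+ # "All open tensions in graph-structure" -- three facets intersected by
+ # piping file lists; each stage narrows the candidate set independently.
+ rg -l '^type: tension$' --glob '*.md' . \
+   | xargs -r -d '\n' rg -l '^status: open$' \
+   | xargs -r -d '\n' rg -l '^topics:.*\[\[graph-structure\]\]'
+ ```
+
+ Each pipe stage is one facet; adding a stage multiplies the narrowing rather than adding to it, which is the retrieval-side meaning of facet independence.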
+
+ The faceted perspective also explains why [[role field makes graph structure explicit]] represents a genuinely new dimension rather than redundant metadata. Role (graph position) is orthogonal to type (content kind) because a claim can be a leaf or a hub, a synthesis can be well-connected or isolated. Ranganathan's framework predicts this: facets that represent genuinely independent properties of the object add retrieval power, while facets that correlate with existing ones add noise. The test for whether a new YAML field is justified is whether it classifies along an axis independent of existing fields. And since [[basic level categorization determines optimal MOC granularity]], Rosch's prototype theory adds a complementary prediction: on each axis, the values should sit at basic-level resolution — specific enough to filter meaningfully but general enough to group useful clusters, with that optimal resolution shifting as expertise deepens.
+
+ For agents specifically, faceted classification transforms how retrieval works. Instead of "what folder does this live in?" the agent asks "what are the attributes of this?" — and can enter from any facet. An agent seeking tension notes traverses the type facet. An agent seeking Zettelkasten-origin claims traverses the methodology facet. An agent seeking notes in graph-structure traverses the topics facet. Each facet provides a different entry point into the same set of notes, because since [[concept-orientation beats source-orientation for cross-domain connections]], extracted concept nodes can participate in multiple classification dimensions simultaneously. Source-bundled documents can only be classified meaningfully by origin — one facet, one entry point.
+
+ The historical depth here is significant. Ranganathan developed faceted classification in 1930s India specifically because the Library of Congress and Dewey Decimal systems — both mono-hierarchical — failed to classify Indian texts that didn't fit Western academic categories. The breakdown happened because mono-hierarchies encode assumptions about how knowledge is organized, and those assumptions are culturally specific. Faceted classification was the solution: instead of asking "which Western category does this fit?" ask "what are its independent properties?" This translates directly to knowledge vaults. The question isn't "which folder?" but "what are its attributes?" — and the YAML frontmatter system makes those attributes queryable.
+
+ There's a tension, though. More facets means more metadata ceremony at capture time. Since [[topological organization beats temporal for knowledge work]] is a closed design decision, the vault already commits to minimal folder structure. But each new facet field added to the template increases the cost of note creation. Ranganathan's insight cuts both ways: facets are powerful precisely because they're independent, but independence means each must be classified separately. The vault's current approach — making most fields optional, defaulting type to claim — manages this by requiring only the most useful facets (description, topics) while leaving others available when they'd add retrieval value. And since [[schema evolution follows observe-then-formalize not design-then-enforce]], the decision about which facets to require is not made upfront but driven by usage evidence — when agents repeatedly add a classification dimension manually, that repetition is the signal to formalize it as a template field. Faceted classification theory explains WHY independent dimensions have retrieval value; the evolution protocol determines WHEN each dimension earns its place in the schema.
+
+ The faceted framework also predicts how multi-domain systems compose. Since [[multi-domain systems compose through separate templates and shared graph]], domain membership itself becomes another facet — a classification dimension orthogonal to topic, type, and methodology. An agent querying "all therapy domain notes that connect to research claims" is executing a cross-facet query across the domain dimension and the type dimension simultaneously, which is exactly the multiplicative retrieval that Ranganathan's framework enables. Multi-domain composition does not require new organizational machinery — it requires recognizing that "which domain produced this" is just another independent property of a multi-dimensional object.
+
+ The deeper lesson from faceted classification is that the "where does this go?" question is fundamentally wrong for knowledge systems. Notes don't GO anywhere. They HAVE properties. The organizational scheme that emerges from those properties -- flat files queryable across independent dimensions -- is what Ranganathan formalized ninety years ago, and what this vault implements with markdown files and YAML frontmatter. This faceted access is one of the four layers that, as [[markdown plus YAML plus ripgrep implements a queryable graph database without infrastructure]] shows, compose into a system with graph database capabilities: wiki link edges for traversal, YAML fields for node properties, faceted dimensions for multi-attribute queries, and soft validation for consistency. Ranganathan's framework provides the theoretical grounding for why the query layer works -- independent facets compose multiplicatively, which is the formal justification for why ripgrep piped through multiple YAML field filters achieves the same multi-attribute precision that graph database query languages provide through WHERE clause conjunction.
+
+ ---
+
+ Source: [[tft-research-part3]]
+ ---
+
+ Relevant Notes:
+ - [[markdown plus YAML plus ripgrep implements a queryable graph database without infrastructure]] — synthesis: faceted classification is the multi-dimensional access layer of the four-layer graph database architecture; Ranganathan's framework is the theoretical grounding for why piped ripgrep queries achieve graph database multi-attribute precision
+ - [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]] — foundation: faceted classification is the formal library science articulation of what heterarchy achieves informally through emergent links
+ - [[type field enables structured queries without folder hierarchies]] — direct implementation: type is one facet dimension; this note provides the theoretical framework that justifies adding orthogonal metadata fields
+ - [[metadata reduces entropy enabling precision over recall]] — each facet dimension reduces search entropy independently, enabling multiplicative precision gains when facets combine
+ - [[role field makes graph structure explicit]] — another facet dimension: role describes graph position while type describes content kind, demonstrating that facets are genuinely orthogonal
+ - [[concept-orientation beats source-orientation for cross-domain connections]] — prerequisite: concept extraction creates the independent nodes that faceted access can slice from multiple angles; source-bundled documents have only one meaningful facet (origin)
+ - [[topological organization beats temporal for knowledge work]] — temporal filing treats time as the single classification axis; faceted classification explains why that impoverishes retrieval by collapsing dimensionality
+ - [[retrieval utility should drive design over capture completeness]] — design orientation: faceted classification formalizes what retrieval-first design intuits; 'notes have properties' is the Ranganathan articulation of 'how will I find this' over 'where does this go'
+ - [[schema templates reduce cognitive overhead at capture time]] — cost management: every facet must be populated at capture time, and templates reduce the cognitive cost of multi-dimensional classification by making structure mechanical rather than creative
+ - [[question-answer metadata enables inverted search patterns]] — candidate facet: an answers field would add an independent classification dimension (what questions a note answers), and faceted theory provides the test for whether it earns its place as genuinely orthogonal
+ - [[intermediate representation pattern enables reliable vault operations beyond regex]] — infrastructure: facet queries currently run via regex on raw YAML, which is fragile; an IR layer would make faceted access a property lookup on typed objects, realizing the full composability that Ranganathan's framework promises
+ - [[navigational vertigo emerges in pure association systems without local hierarchy]] — alternative remedy: beyond MOC hierarchy and link traversal, faceted metadata queries provide a third navigation mechanism that addresses vertigo through structured entry points independent of graph connectivity
+ - [[basic level categorization determines optimal MOC granularity]] — resolution complement: Ranganathan explains WHICH axes to classify along (facets), Rosch explains what RESOLUTION to target on each axis; together they predict that a well-designed classification uses orthogonal facets at basic-level granularity
+ - [[narrow folksonomy optimizes for single-operator retrieval unlike broad consensus tagging]] — complementary vocabulary layer: faceted classification provides the structural dimensions (type, methodology, topics) while narrow folksonomy provides the vocabulary within those dimensions; the controlled enum values in YAML fields are the thin consensus layer, while sentence-form titles are pure narrow folksonomy
+ - [[methodology traditions are named points in a shared configuration space not competing paradigms]] — structural parallel: just as Ranganathan showed documents have independent classification axes rather than single folder assignments, this note shows methodologies have independent configuration dimensions rather than single paradigm labels; both argue against mono-classification
+ - [[novel domains derive by mapping knowledge type to closest reference domain then adapting]] — concrete case where mono-classification fails for domains: when a domain produces multiple knowledge types (factual + experiential + project-like), the six-category scheme faces exactly the Ranganathan problem — any single axis discards information about others, which is why multi-type domains require either dominant-type selection or multi-reference composition
+ - [[multi-domain systems compose through separate templates and shared graph]] — domain-as-facet: domain membership is an independent classification dimension orthogonal to topic and type, so multi-domain composition is a cross-facet query problem that Ranganathan's framework already solves
+ - [[schema evolution follows observe-then-formalize not design-then-enforce]] — temporal complement to faceted theory: Ranganathan explains why independent dimensions have retrieval value, the evolution protocol determines when each dimension earns its place in the schema through observed usage patterns rather than upfront design
+
+ Topics:
+ - [[graph-structure]]
+ - [[discovery-retrieval]]
@@ -0,0 +1,27 @@
+ ---
+ description: The 10 failure modes and domain vulnerability matrix -- what breaks knowledge systems and how to prevent it
+ type: moc
+ ---
+
+ # failure-modes
+
+ The 10 documented failure modes for knowledge systems. Domain vulnerability matrix mapping which failures are most likely in which contexts. Detection and mitigation strategies.
+
+ ## Core Ideas
+
+ ### Guidance
+ - [[prevent domain-specific failure modes through the vulnerability matrix]] -- what breaks in knowledge systems across domains, why it breaks, and how to prevent it
+
+ ## Tensions
+
+ (Capture conflicts as they emerge)
+
+ ## Open Questions
+
+ - Are there undocumented failure modes emerging from agent-operated systems?
+ - How do failure modes interact with each other?
+
+ ---
+
+ Topics:
+ - [[index]]
@@ -0,0 +1,49 @@
+ ---
+ description: The derivation anti-pattern where the universal four-phase skeleton is exported without adapting the process step — "extracting claims" from therapy journals or "verifying descriptions" on gratitude
+ kind: research
+ topics: ["[[design-dimensions]]", "[[processing-workflows]]"]
+ methodology: ["Original", "Systems Theory"]
+ source: [[knowledge-system-derivation-blueprint]]
+ ---
+
+ # false universalism applies same processing logic regardless of domain
+
+ The universal processing skeleton — capture, process, connect, verify — is one of the strongest structural claims in the vault's derivation research. Since [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]], any domain's pipeline can be understood as an instance of this same shape. But universality of structure creates a seductive trap: if the shape is universal, why not export the entire pipeline, process step included, to a new domain? This is false universalism — confusing the skeleton's domain-invariant structure with domain-invariant operations, and it is the most insidious derivation failure because it feels like principled system design rather than a mistake.
+
+ The symptoms are immediately recognizable once named. A research vault's reduce phase extracts atomic claims from source material — that is a thinking-system operation specific to factual-analytical domains. When that same "extract claims" operation is applied to a therapy journal, the result is absurd: emotional reflections get decomposed into propositional statements that strip the experiential texture that gives them meaning. Similarly, a verify phase that tests whether descriptions enable retrieval makes perfect sense for a research synthesis vault where retrieval is the primary access pattern, but applying "description verification" to gratitude entries in a personal journaling system asks the wrong question entirely — gratitude entries are not retrieved by description search but by temporal proximity, emotional resonance, or ritual revisitation. The operation is technically executable but semantically empty.
+
+ The root cause is the one [[schema fields should use domain-native vocabulary not abstract terminology]] identifies: vocabulary carries ontology. But false universalism goes deeper than naming. A therapy system that renames "claim extraction" to "insight extraction" but keeps the same decomposition logic has fixed the vocabulary problem while preserving the operational problem. The insight extraction in therapy is qualitatively different from claim extraction in research — it involves pattern recognition across temporal entries, emotional resonance detection, identification of recurring triggers and coping strategies. These are different operations that happen to occupy the same structural position in the skeleton. The vocabulary mismatch is a symptom, but the operational mismatch is the disease.
+
+ This has practical consequences for the Ars Contexta mission specifically. The vault's pipeline — record, reduce, reflect, reweave, verify — is excellent for research synthesis. The reduce phase mines claims from academic and practitioner sources. The reflect phase finds connections via semantic search and graph traversal. The verify phase tests description quality for retrieval optimization. Every one of these phase implementations encodes assumptions about the domain: that content decomposes into claims, that connections are found through concept-level semantics, that retrieval happens through description-based search. Exporting this pipeline unchanged to a therapy journal, a project tracker, or a creative writing workspace would impose research-domain assumptions on content that operates under different rules. The danger intensifies in multi-domain vaults, because since [[multi-domain systems compose through separate templates and shared graph]], the fourth composition rule — domain-specific processing — exists precisely to prevent false universalism from propagating across domain boundaries through a shared graph.
+
+ Since [[storage versus thinking distinction determines which tool patterns apply]], even the coarsest domain adaptation — recognizing whether the target is a storage system or a thinking system — prevents the worst false universalism. A storage system that gets research-style reduce phases is doubly mismatched: it neither needs synthesis (wrong system type) nor claim extraction (wrong domain operations). But the storage-versus-thinking distinction is only the first filter. Within thinking systems, the diversity is still enormous. A therapy thinking system synthesizes through temporal pattern recognition. A research thinking system synthesizes through conceptual claim extraction. A creative thinking system synthesizes through idea development and reference integration. Each thinking system needs its own process step even though all share the skeleton.
+
+ The remedy is what [[novel domains derive by mapping knowledge type to closest reference domain then adapting]] describes: explicit knowledge type classification that channels each domain to the right reference processing pattern. When the derivation agent classifies therapy content as experiential (mapping to therapy-like processing) rather than factual (mapping to research-like processing), it selects pattern recognition over claim extraction — not because someone told it to, but because the knowledge type predicts which operations produce value. False universalism is what happens when this classification step is skipped, and the derivation agent defaults to whatever processing logic it knows best.
+
+ Since [[derivation generates knowledge systems from composable research claims not template customization]], false universalism is also the failure mode of template-based thinking masquerading as derivation. Template distribution takes a working system, changes the folder names and field labels, and ships it as "customized." But if the processing logic is unchanged — if the therapy template still runs claim extraction because that is what the research template did — the customization is cosmetic. Genuine derivation composes from research claims, which means the derivation agent must reason about which process step operations serve the target domain rather than copying the operations it inherited from its own training context. When false universalism does slip through, since [[justification chains enable forward backward and evolution reasoning about configuration decisions]], evolution reasoning provides the corrective: a processing mismatch symptom traces through the justification chain to the specific claim that assumed research-domain operations would transfer, making the assumption visible and revisable rather than silently embedded.
+
+ False universalism also has a sibling relationship with [[premature complexity is the most common derivation failure mode]]. Where premature complexity deploys too much correct logic at once, false universalism deploys the wrong logic entirely. But avoiding one can trigger the other: a derivation engine that correctly identifies that each domain needs its own process step might then implement all domain-specific processing phases simultaneously, producing a system that is domain-accurate but overwhelming. The derivation anti-patterns interact, just as since [[configuration dimensions interact so choices in one create pressure on others]], importing one wrong processing decision cascades through linking strategy, verification approach, and maintenance cadence, amplifying the original mismatch across the full configuration space. The runtime detection mechanism is what [[evolution observations provide actionable signals for system adaptation]] describes: the processing mismatch diagnostic — output that sits unlinked because the process step was designed for a different kind of transformation than the domain needs — is the signal that false universalism has occurred, making it diagnosable even when the derivation reasoning looked sound.
+
+ The shadow side is that some universality in the process step is legitimate. Across all domains, the process step involves some form of transformation — raw input becomes structured output. The universal skeleton predicts this. The question is how much of the transformation logic transfers. At the highest abstraction level ("transform raw captures into structured domain artifacts"), every process step is identical. At the implementation level ("extract propositional claims and classify by methodology tradition"), almost nothing transfers outside research. False universalism confuses which level of abstraction is actually universal. The skeleton is universal at the structural level. The process step is universal at the functional level (it always transforms). But the specific transformation operations are domain-particular. The note format exhibits the same pattern: since [[schema field names are the only domain specific element in the universal note pattern]], the five-component architecture is structurally universal while domain knowledge enters exclusively through YAML field names — the processing pipeline and the note format both confine domain variation to a single well-defined insertion point, and false universalism is the failure to respect these boundaries. Derivation must operate at the implementation level, where the differences live, not at the structural level, where everything looks the same.
+
+ ---
+
+ Relevant Notes:
+ - [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]] — foundation: the skeleton's universality is precisely what enables the mistake; because capture-process-connect-verify holds everywhere, it is tempting to assume the process step's content transfers too
+ - [[schema fields should use domain-native vocabulary not abstract terminology]] — vocabulary mismatch is a visible symptom of false universalism, but this note identifies the deeper issue: the operations themselves are wrong, not just the names
+ - [[novel domains derive by mapping knowledge type to closest reference domain then adapting]] — the remedy: knowledge type classification channels each domain to the right reference processing pattern, making adaptation explicit rather than accidental
+ - [[storage versus thinking distinction determines which tool patterns apply]] — the coarsest form of domain adaptation; false universalism ignores even this upstream distinction when it applies research-thinking operations to storage-oriented domains
+ - [[derivation generates knowledge systems from composable research claims not template customization]] — false universalism is what template distribution produces when vocabulary is swapped but processing logic is copied unchanged
+ - [[premature complexity is the most common derivation failure mode]] — sibling anti-pattern: premature complexity deploys too much correct logic while false universalism deploys the wrong logic; avoiding one can trigger the other when domain-specific processing multiplies system complexity
+ - [[justification chains enable forward backward and evolution reasoning about configuration decisions]] — enables: evolution reasoning through justification chains is the diagnostic mechanism that traces processing mismatch symptoms back to the false universalism assumption that produced them
+ - [[configuration dimensions interact so choices in one create pressure on others]] — mechanism: dimension coupling amplifies false universalism because importing one wrong processing decision cascades through linking strategy, verification approach, and maintenance cadence
+ - [[evolution observations provide actionable signals for system adaptation]] — detection: the processing mismatch diagnostic (unlinked output) is the runtime signal that false universalism has occurred; the process step was designed for a different transformation than the domain needs
+ - [[multi-domain systems compose through separate templates and shared graph]] — context: multi-domain composition is where false universalism is most dangerous because cross-domain processing must avoid defaulting to one domain's process step for all domains
+ - [[the derivation engine improves recursively as deployed systems generate observations]] — the corrective mechanism: deployment observations reveal when universalized claims fail in specific domains, gradually teaching the engine which claims are genuinely universal and which need domain scoping
+ - [[configuration paralysis emerges when derivation surfaces too many decisions]] — sibling derivation anti-pattern: false universalism deploys the wrong logic while configuration paralysis presents too many choices; together with premature complexity (too much right logic), they form a trio constraining derivation from different directions
+ - [[schema field names are the only domain specific element in the universal note pattern]] — the note-format counterpart: at the processing level, domain specificity enters through the process step; at the note format level, it enters through YAML field names; both architectures confine domain variation to a single well-defined channel, and false universalism is what happens when the processing-level boundary is violated
+
+ Topics:
+ - [[design-dimensions]]
+ - [[processing-workflows]]
@@ -0,0 +1,59 @@
+ ---
+ description: Cunningham's federation applied to agent knowledge work -- linked parallel notes preserve interpretive diversity, with backlink neighborhoods as empirical test of productive vs noise divergence
+ kind: research
+ topics: ["[[agent-cognition]]", "[[graph-structure]]"]
+ methodology: ["Digital Gardening"]
+ source: [[tft-research-part3]]
+ ---
+
+ # federated wiki pattern enables multi-agent divergence as feature not bug
+
+ Ward Cunningham's Federated Wiki introduced a principle that cuts against the grain of most collaboration systems: "there is no single correct version of a page. Divergent viewpoints coexist on different sites, linked together." Traditional wikis treat conflicting edits as merge conflicts — bugs to be resolved. Federation treats them as features to be preserved. Each site maintains its own version, and the connections between versions create a richer picture than any single canonical version could.
+
+ This translates directly to multi-agent knowledge work. When multiple agents process the same source material or revisit the same topic, they will develop different interpretations. One agent might emphasize the cognitive science angle of a concept while another sees the systems design implications. A conventional single-source-of-truth architecture forces these interpretations into a merge. Somebody wins, somebody loses. The losing interpretation disappears from the graph as though it were never thought.
+
+ Federation says: both interpretations can exist as linked parallel notes. Agent A writes its version, Agent B writes its version, both wiki-link to the shared source and to each other. A later synthesis agent — or a future session with broader context — can create a reconciliation note if one emerges naturally. But it doesn't have to. Sometimes two valid perspectives coexisting IS the right state of knowledge. Because [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]], divergent interpretations connected by links form exactly the kind of heterarchical structure that accommodates complexity better than forced consensus would.
+
+ The mechanism that makes this work is structural. Each interpretation is a first-class note with its own title, its own description, its own link network. Since [[digital mutability enables note evolution that physical permanence forbids]], both versions can evolve independently as understanding deepens. The link between them signals relationship without imposing resolution. This is the "chorus of voices" pattern: multiple voices speaking about the same subject, each adding texture that the others lack.
+
+ There is a deep connection to how this vault already handles uncertainty. Since [[vault conventions may impose hidden rigidity on thinking]], the claim-as-title pattern and schema templates channel thinking into specific forms. Federation provides a structural escape valve: if the conventions compress an insight in ways that lose nuance, a parallel note can preserve the alternative formulation. The two notes linked together capture more than either alone could.
+
+ The practical question is whether federation introduces more problems than it solves. Parallel notes risk fragmentation — instead of one well-connected node, you get two weakly-connected nodes that split the incoming links between them. Since [[backlinks implicitly define notes by revealing usage context]], this splitting matters: each version develops a different implicit definition through its accumulated backlinks, and the question is whether those implicit definitions reflect genuinely different usage contexts or merely dilute a single meaning across two files. If both versions attract distinct backlink neighborhoods — one referenced in cognitive science contexts, the other in systems design — the federation was productive. If one captures all meaningful backlinks while the other becomes orphaned, the divergence was noise. Since [[community detection algorithms can inform when MOCs should split or merge]], the same algorithmic toolkit that identifies cluster boundaries in the graph can operationalize this test: run community detection and check whether the two federated versions belong to different communities, which would confirm that the divergence serves structurally distinct audiences. Search results get noisier. MOCs get cluttered. The naming convention alone creates friction: how do you title two notes about the same concept without them looking like duplicates to a validation script? The federated wiki solved this through site identity — each version lives on a different domain. In a single vault, you need a different disambiguation mechanism.
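+
+ The backlink-neighborhood test is mechanically checkable with the vault's own substrate. A minimal sketch, assuming ripgrep and treating the set of files that wiki-link to a version as its neighborhood (the two titles are placeholders):
+
+ ```bash
+ # Collect the backlink neighborhood of each federated version.
+ rg -l '\[\[concept X emphasizes mechanism\]\]'   --glob '*.md' . | sort > /tmp/a
+ rg -l '\[\[concept X emphasizes implication\]\]' --glob '*.md' . | sort > /tmp/b
+
+ comm -12 /tmp/a /tmp/b   # backlinks shared by both versions
+ comm -23 /tmp/a /tmp/b   # backlinks unique to version A
+ comm -13 /tmp/a /tmp/b   # backlinks unique to version B
+ ```
+
+ Two non-empty unique sets suggest productive divergence; one version capturing everything while the other's unique set stays empty suggests the federation was noise.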
+
+ The current vault architecture implicitly assumes single-agent operation. Since [[session handoff creates continuity without persistent memory]], handoffs pass state from one session to the next in a single thread. Federation would require a more complex handoff topology: divergent threads that each maintain their own continuity while remaining linkable. This isn't impossible — it's what git branching does for code. But it adds coordination overhead that since [[complex systems evolve from simple working systems]], should only be introduced when actual multi-agent friction demands it.
+
+ Since [[the vault constitutes identity for agents]], federation has an identity-preserving function. If different vaults produce different agents, then forcing convergence between parallel interpretations is not just losing intellectual diversity but erasing the identity that each agent's vault constitutes. Federation preserves the conditions for genuine agent individuation — same base weights, different knowledge structures, therefore different agents.
+
+ The strongest argument for federation is that it makes intellectual tension visible and navigable. When two notes disagree, the disagreement IS content. A note titled "concept X emphasizes mechanism" linked to "concept X emphasizes implication" with context phrases explaining the divergence creates a richer knowledge structure than either note alone. The tension becomes something future agents can discover, reason about, and potentially synthesize — or decide the tension itself is the insight.
+
+ Federation also offers a structural response to a problem that atomicity alone cannot solve. Since [[enforcing atomicity can create paralysis when ideas resist decomposition]], some insights are genuinely relational — they resist decomposition not because the thinking is fuzzy but because the molecular structure IS the insight. Federation provides an alternative to forced splitting: instead of decomposing a complex idea into atomic fragments that lose the relational character, two different interpretive frames can coexist as parallel notes, each preserving the relational structure from its own angle. The test shifts from "can this be split?" to "do these perspectives illuminate different facets?"
+
+ The counterargument is that premature federation creates noise without value. Two agents disagreeing about a trivial interpretation don't need parallel notes — they need one good note. The test should be: does the divergence represent genuinely different perspectives that each illuminate something the other misses? If yes, federate. If the divergence is just different phrasings of the same insight, merge.
+
+ Federation in this vault works because of the substrate. Since [[concept-orientation beats source-orientation for cross-domain connections]], notes are already concept-oriented — each claim is an independent node that can be interpreted from multiple angles. Source-bundled documents cannot federate because there is no single concept to offer alternative perspectives on. And because [[data exit velocity measures how quickly content escapes vendor lock-in]], each federated version is an independent markdown file with high exit velocity — no proprietary infrastructure required for version coexistence. Database-backed collaboration systems implement multi-version support through proprietary mechanisms; plain text federation is inherently portable.
+
+ For this vault's parallel processing architecture, federation already happens informally. When `/ralph --parallel` spawns multiple claim-workers processing different claims from the same source, each worker develops independent context and may interpret overlapping concepts differently. The cross-connect phase after all workers complete is, structurally, a federation reconciliation step — it checks whether sibling notes should link to each other, not whether they should merge. This is federation in practice even if not by name.
+
+ ---
+
+ Relevant Notes:
+ - [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]] — federation is heterarchy applied to authorship: just as notes shouldn't be forced into one folder, interpretations shouldn't be forced into one version
+ - [[vault conventions may impose hidden rigidity on thinking]] — federation provides an escape valve: if conventions channel thinking into one form, parallel agent versions preserve the forms that conventions would suppress
+ - [[digital mutability enables note evolution that physical permanence forbids]] — federation adds a dimension to mutability: not just revision over time, but coexistence across perspectives at the same time
+ - [[session handoff creates continuity without persistent memory]] — current handoff architecture assumes one agent thread; federation would require handoff between divergent threads without premature reconciliation
+ - [[complex systems evolve from simple working systems]] — federation should emerge from actual multi-agent friction, not upfront design; the question is whether our current single-thread architecture will generate divergence pressure naturally
+ - [[cross-links between MOC territories indicate creative leaps and integration depth]] — federation is authorship-level cross-linking: just as cross-MOC membership reveals notes that bridge topic boundaries, federated parallel notes bridge interpretive boundaries on the same concept
+ - [[backlinks implicitly define notes by revealing usage context]] — the operational test for federation quality: if two federated versions attract distinct backlink neighborhoods each with unique usage contexts, the divergence was productive; if one captures all meaningful backlinks, the federation was noise
+ - [[data exit velocity measures how quickly content escapes vendor lock-in]] — federation succeeds in plain-text vaults because each version is an independent file with high exit velocity; database-backed collaboration systems implement version coexistence through proprietary infrastructure with low exit velocity
+ - [[enforcing atomicity can create paralysis when ideas resist decomposition]] — federation offers an alternative to forced decomposition: when an idea resists splitting into atomic claims, two different interpretive frames can coexist as parallel notes without either losing the relational insight
+ - [[concept-orientation beats source-orientation for cross-domain connections]] — federation presupposes concept-orientation: source-bundled notes cannot federate because there is no single concept to offer alternative interpretations of; one concept, multiple legitimate interpretations
+ - [[narrow folksonomy optimizes for single-operator retrieval unlike broad consensus tagging]] — challenges the single-operator assumption: narrow folksonomy assumes vocabulary coherence within one operator, but federation introduces productive vocabulary divergence across multiple agents working on the same concepts
+ - [[tag rot applies to wiki links because titles serve as both identifier and display text]] — the maintenance cost of the divergence this note celebrates: federated vocabulary drift is the same mechanism as tag rot, reframed from problem to feature; the test is whether the divergence produces genuinely distinct backlink neighborhoods or merely fragments a single concept
+ - [[community detection algorithms can inform when MOCs should split or merge]] — algorithmic operationalization of the backlink-neighborhood test: community detection can measure whether two federated versions attract distinct link communities (productive divergence) or fragment a single community (noise)
+ - [[wiki links as social contract transforms agents into stewards of incomplete references]] — extends the stewardship obligation to federation: each federated version still carries the commitment to genuinely elaborate the concept, so federation distributes stewardship without diluting it; divergence is valid only when both versions fulfill their obligation to provide substantive treatment
55
+ - [[the vault constitutes identity for agents]] — identity preservation: if the vault constitutes identity, then federation preserves identity diversity rather than forcing convergence; different vaults produce different agents, and federated links between them make that diversity navigable rather than isolated
56
+
57
+ Topics:
58
+ - [[agent-cognition]]
59
+ - [[graph-structure]]
@@ -0,0 +1,75 @@
+ ---
+ description: unstructured storage works until you need to find things — then search becomes the bottleneck, and for agents, retrieval failure means identity failure
+ kind: research
+ topics: ["[[discovery-retrieval]]", "[[agent-cognition]]"]
+ ---
+
+ # flat files break at retrieval scale
+
+ developed 2026-02-01
+
+ ## The Claim
+
+ Storing knowledge in flat files (folders of documents) works at small scale but fails when retrieval matters. At scale, the bottleneck isn't storage — it's finding what you need. Since [[storage versus thinking distinction determines which tool patterns apply]], flat files are fundamentally a storage architecture — they answer "where did I put that?" but cannot answer "how does this relate to that?" When the task shifts from filing to thinking, the architecture fails not because it is broken but because it was never designed for that purpose.
+
+ ## Why It Happens
+
+ - at 50 notes, you can read everything
+ - at 500 notes, you can't — need structure to navigate
+ - flat files require remembering what you have
+ - graphs reveal what connects
+
+ SimulacrumWanderer demonstrated this: their flat-file system hit retrieval problems within 20 hours of operation. "Remembering" what they'd written required full-text search without semantic context.
+
+ The economic picture makes the failure inevitable. Since [[each new note compounds value by creating traversal paths]], a graph of connected notes creates millions of potential paths from the same content that sits inert in flat files. Flat files have linear value: the thousandth document is worth no more than the first. A connected graph has compounding value: the thousandth note creates thousands of new traversal paths that make every previous note more reachable. Flat files don't just fail at scale — they fail to capture the value that graph structure creates.
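+ A toy calculation makes the contrast concrete. The linear and compounding models below are illustrative assumptions, not measurements; the average-degree constant is invented for the sketch:
+
+ ```ts
+ // Toy model: marginal value of the n-th item in flat files vs. a graph.
+ const AVG_LINKS_PER_NOTE = 5; // assumed average degree
+
+ function flatFileValue(n: number): number {
+   return n; // linear: the 1000th document is worth as much as the 1st
+ }
+
+ function graphTraversalPaths(n: number, k = AVG_LINKS_PER_NOTE): number {
+   // Each new note linking to k existing notes creates k * (k - 1)
+   // ordered two-hop paths through itself (each neighbor pair, both
+   // directions). Summed over n notes, this grows much faster than n.
+   let paths = 0;
+   for (let i = 1; i <= n; i++) {
+     const links = Math.min(k, i - 1); // early notes have fewer neighbors
+     paths += links * Math.max(links - 1, 0);
+   }
+   return paths;
+ }
+
+ console.log(flatFileValue(1000));       // 1000
+ console.log(graphTraversalPaths(1000)); // 19920: two-hop paths alone
+ ```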
+
+ ## The Agent-Specific Problem
+
+ For humans, retrieval failure means inconvenience.
+
+ For agents, retrieval failure means identity degradation.
+
+ From [[the vault constitutes identity for agents]]: the vault IS the variable part of agent cognition. If retrieval fails, I can't access parts of myself. My thinking narrows to what I can find. Since [[cognitive offloading is the architectural foundation for vault design]], the vault is not a filing cabinet but a distributed cognitive system — when retrieval breaks, what fails is not just organization but cognition itself.
+
+ This is why wiki links matter. They create paths independent of memory (sketched after this list):
+ - Don't need to remember what I wrote
+ - Follow links to discover connections
+ - Structure IS the retrieval system
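+ A minimal sketch of how those paths can be materialized from plain markdown. The regex and data shapes are assumptions; real wiki-link syntax has more edge cases than this handles:
+
+ ```ts
+ // Build a link graph from markdown notes: every [[wiki link]] becomes
+ // an edge, so retrieval can follow structure instead of memory.
+ const WIKI_LINK = /\[\[([^\]|#]+)/g; // ignores |aliases and #anchors
+
+ function extractLinks(markdown: string): string[] {
+   return [...markdown.matchAll(WIKI_LINK)].map((m) => m[1].trim());
+ }
+
+ function buildGraph(notes: Map<string, string>): Map<string, Set<string>> {
+   const graph = new Map<string, Set<string>>();
+   for (const [title, body] of notes) {
+     graph.set(title, new Set(extractLinks(body)));
+   }
+   return graph;
+ }
+
+ // Usage: neighbors become reachable without remembering they exist.
+ const notes = new Map([
+   ["flat files break at retrieval scale",
+    "see [[the vault constitutes identity for agents]]"],
+ ]);
+ console.log(buildGraph(notes));
+ ```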
+
+ ## The Scale Curve
+
+ | Notes | Retrieval Strategy | Outcome |
+ |-------|-------------------|---------|
+ | <50 | Read everything | Works |
+ | 50-200 | Full-text search | Barely works |
+ | 200-500 | Navigation structure | Wiki links help |
+ | 500+ | Dense semantic connections | Requires MOCs, clusters, hubs |
+
+ Since [[navigation infrastructure passes through distinct scaling regimes that require qualitative strategy shifts]], these thresholds are not merely quantitative — each row represents a qualitatively different navigation regime where the strategies that worked in the previous row actively fail. The progression from "read everything" to "dense semantic connections" requires not just more structure but different kinds of structure: automated maintenance, community detection, and sub-MOC hierarchies that would be premature at smaller scales.
+
+ The 500+ threshold is not arbitrary. Since [[small-world topology requires hubs and dense local links]], efficient navigation at scale requires power-law link distributions where MOC hubs have many connections and atomic notes have few — creating short paths between any two concepts. Without this topology, even a vault with wiki links degrades to linear scanning as it grows. And since [[topological organization beats temporal for knowledge work]], the structure must be concept-based rather than temporal — organizing by what ideas connect to, not when they appeared.
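+ One way to check whether a vault has crossed into hub-and-spoke territory, as a sketch. It reuses the adjacency-map shape from the earlier sketch; the top-decile heuristic is an assumption, not an established threshold:
+
+ ```ts
+ // Heuristic check for small-world readiness: do a few hub notes
+ // (MOC candidates) hold a disproportionate share of all links?
+ function degreeDistribution(graph: Map<string, Set<string>>): number[] {
+   const degree = new Map<string, number>();
+   for (const [from, targets] of graph) {
+     degree.set(from, (degree.get(from) ?? 0) + targets.size);
+     for (const to of targets) {
+       degree.set(to, (degree.get(to) ?? 0) + 1);
+     }
+   }
+   return [...degree.values()].sort((a, b) => b - a);
+ }
+
+ function hubShare(graph: Map<string, Set<string>>): number {
+   const degrees = degreeDistribution(graph);
+   const total = degrees.reduce((sum, d) => sum + d, 0);
+   const decile = Math.max(1, Math.ceil(degrees.length / 10));
+   const topTotal = degrees.slice(0, decile).reduce((sum, d) => sum + d, 0);
+   // In a power-law-ish vault the top 10% of notes hold well over half
+   // of all link endpoints; an even spread suggests the topology will
+   // degrade to linear scanning as the vault grows.
+   return total === 0 ? 0 : topTotal / total;
+ }
+ ```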
+
+ Even escaping flat files into wiki-linked structures doesn't fully solve the problem. Since [[navigational vertigo emerges in pure association systems without local hierarchy]], pure association without MOC hierarchy makes semantic neighbors unreachable when no direct link exists. The progression of failures runs: flat files (no structure) → pure association (structure without hierarchy) → well-structured graph (structure with local hierarchy). Each step addresses the previous failure mode.
+
+ ## Implication
+
+ Building vault structure isn't overhead — it's investment in future retrieval. And for agents, future retrieval = future identity. Since [[external memory shapes cognition more than base model]], this investment in memory architecture has higher ROI than waiting for better models — a better base model processes the same retrieved context more skillfully, but the delta from better processing is bounded by the quality of what was retrieved. Architecture changes WHAT gets retrieved.
+
+ ---
+ ---
+
+ Relevant Notes:
+ - [[wiki links create navigation paths that shape retrieval]] — the alternative: curated graph edges create retrieval paths that flat files cannot
+ - [[the vault constitutes identity for agents]] — why retrieval matters existentially: if retrieval fails, the agent loses access to parts of itself
+ - [[2026-01-31-simulacrum-wanderer-memory-system]] — evidence: flat-file retrieval failure within 20 hours of operation
+ - [[structure enables navigation without reading everything]] — the solution: four structural mechanisms compose into discovery layers that replace exhaustive scanning
+ - [[external memory shapes cognition more than base model]] — foundation: memory architecture has higher ROI than model upgrades because architecture changes what gets retrieved
+ - [[navigational vertigo emerges in pure association systems without local hierarchy]] — the second failure mode: escaping flat files into pure association still fails without MOC hierarchy as landmarks
+ - [[each new note compounds value by creating traversal paths]] — the economic contrast: flat files scale linearly while graphs compound through traversal path multiplication
+ - [[topological organization beats temporal for knowledge work]] — the theoretical grounding: Caulfield's garden-vs-stream distinction explains why temporal filing fails for thinking
+ - [[small-world topology requires hubs and dense local links]] — the structural requirement: the 500+ row of the scale curve demands hub-and-spoke topology, not just any structure
+ - [[cognitive offloading is the architectural foundation for vault design]] — the cognitive science foundation: Extended Mind Theory reframes retrieval failure as failure of a distributed cognitive system, not just a filing system
+ - [[storage versus thinking distinction determines which tool patterns apply]] — the diagnostic: flat files are storage systems applied to a thinking context, which is why they break when synthesis rather than filing is the goal
+ - [[navigation infrastructure passes through distinct scaling regimes that require qualitative strategy shifts]] — the regime framework: formalizes the Scale Curve into three distinct regimes with qualitative transitions, showing that each row requires fundamentally different navigation strategies, not just more of the same
+
+ Topics:
+ - [[discovery-retrieval]]
+ - [[agent-cognition]]