arscontexta 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/.claude-plugin/marketplace.json +11 -0
  2. package/.claude-plugin/plugin.json +22 -0
  3. package/README.md +683 -0
  4. package/agents/knowledge-guide.md +49 -0
  5. package/bin/cli.mjs +66 -0
  6. package/generators/agents-md.md +240 -0
  7. package/generators/claude-md.md +379 -0
  8. package/generators/features/atomic-notes.md +124 -0
  9. package/generators/features/ethical-guardrails.md +58 -0
  10. package/generators/features/graph-analysis.md +188 -0
  11. package/generators/features/helper-functions.md +92 -0
  12. package/generators/features/maintenance.md +164 -0
  13. package/generators/features/methodology-knowledge.md +70 -0
  14. package/generators/features/mocs.md +144 -0
  15. package/generators/features/multi-domain.md +61 -0
  16. package/generators/features/personality.md +71 -0
  17. package/generators/features/processing-pipeline.md +428 -0
  18. package/generators/features/schema.md +149 -0
  19. package/generators/features/self-evolution.md +229 -0
  20. package/generators/features/self-space.md +78 -0
  21. package/generators/features/semantic-search.md +99 -0
  22. package/generators/features/session-rhythm.md +85 -0
  23. package/generators/features/templates.md +85 -0
  24. package/generators/features/wiki-links.md +88 -0
  25. package/generators/soul-md.md +121 -0
  26. package/hooks/hooks.json +45 -0
  27. package/hooks/scripts/auto-commit.sh +44 -0
  28. package/hooks/scripts/session-capture.sh +35 -0
  29. package/hooks/scripts/session-orient.sh +86 -0
  30. package/hooks/scripts/write-validate.sh +42 -0
  31. package/methodology/AI shifts knowledge systems from externalizing memory to externalizing attention.md +59 -0
  32. package/methodology/BM25 retrieval fails on full-length descriptions because query term dilution reduces match scores.md +39 -0
  33. package/methodology/IBIS framework maps claim-based architecture to structured argumentation.md +58 -0
  34. package/methodology/LLM attention degrades as context fills.md +49 -0
  35. package/methodology/MOC construction forces synthesis that automated generation from metadata cannot replicate.md +49 -0
  36. package/methodology/MOC maintenance investment compounds because orientation savings multiply across every future session.md +41 -0
  37. package/methodology/MOCs are attention management devices not just organizational tools.md +51 -0
  38. package/methodology/PKM failure follows a predictable cycle.md +50 -0
  39. package/methodology/ThreadMode to DocumentMode transformation is the core value creation step.md +52 -0
  40. package/methodology/WIP limits force processing over accumulation.md +53 -0
  41. package/methodology/Zeigarnik effect validates capture-first philosophy because open loops drain attention.md +42 -0
  42. package/methodology/academic research uses structured extraction with cross-source synthesis.md +566 -0
  43. package/methodology/adapt the four-phase processing pipeline to domain-specific throughput needs.md +197 -0
  44. package/methodology/agent notes externalize navigation intuition that search cannot discover and traversal cannot reconstruct.md +48 -0
  45. package/methodology/agent self-memory should be architecturally separate from user knowledge systems.md +48 -0
  46. package/methodology/agent session boundaries create natural automation checkpoints that human-operated systems lack.md +56 -0
  47. package/methodology/agent-cognition.md +107 -0
  48. package/methodology/agents are simultaneously methodology executors and subjects creating a unique trust asymmetry.md +66 -0
  49. package/methodology/aspect-oriented programming solved the same cross-cutting concern problem that hooks solve.md +39 -0
  50. package/methodology/associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles.md +53 -0
  51. package/methodology/attention residue may have a minimum granularity that cannot be subdivided.md +46 -0
  52. package/methodology/auto-commit hooks eliminate prospective memory failures by converting remember-to-act into guaranteed execution.md +47 -0
  53. package/methodology/automated detection is always safe because it only reads state while automated remediation risks content corruption.md +42 -0
  54. package/methodology/automation should be retired when its false positive rate exceeds its true positive rate or it catches zero issues.md +56 -0
  55. package/methodology/backlinks implicitly define notes by revealing usage context.md +35 -0
  56. package/methodology/backward maintenance asks what would be different if written today.md +62 -0
  57. package/methodology/balance onboarding enforcement and questions to prevent premature complexity.md +229 -0
  58. package/methodology/basic level categorization determines optimal MOC granularity.md +51 -0
  59. package/methodology/batching by context similarity reduces switching costs in agent processing.md +43 -0
  60. package/methodology/behavioral anti-patterns matter more than tool selection.md +42 -0
  61. package/methodology/betweenness centrality identifies bridge notes connecting disparate knowledge domains.md +57 -0
  62. package/methodology/blueprints that teach construction outperform downloads that provide pre-built code for platform-dependent modules.md +42 -0
  63. package/methodology/bootstrapping principle enables self-improving systems.md +62 -0
  64. package/methodology/build automatic memory through cognitive offloading and session handoffs.md +285 -0
  65. package/methodology/capture the reaction to content not just the content itself.md +41 -0
  66. package/methodology/claims must be specific enough to be wrong.md +36 -0
  67. package/methodology/closure rituals create clean breaks that prevent attention residue bleed.md +44 -0
  68. package/methodology/cognitive offloading is the architectural foundation for vault design.md +46 -0
  69. package/methodology/cognitive outsourcing risk in agent-operated systems.md +55 -0
  70. package/methodology/coherence maintains consistency despite inconsistent inputs.md +96 -0
  71. package/methodology/coherent architecture emerges from wiki links spreading activation and small-world topology.md +48 -0
  72. package/methodology/community detection algorithms can inform when MOCs should split or merge.md +52 -0
  73. package/methodology/complete navigation requires four complementary types that no single mechanism provides.md +43 -0
  74. package/methodology/complex systems evolve from simple working systems.md +59 -0
  75. package/methodology/composable knowledge architecture builds systems from independent toggleable modules not monolithic templates.md +61 -0
  76. package/methodology/compose multi-domain systems through separate templates and shared graph.md +372 -0
  77. package/methodology/concept-orientation beats source-orientation for cross-domain connections.md +51 -0
  78. package/methodology/confidence thresholds gate automated action between the mechanical and judgment zones.md +50 -0
  79. package/methodology/configuration dimensions interact so choices in one create pressure on others.md +58 -0
  80. package/methodology/configuration paralysis emerges when derivation surfaces too many decisions.md +44 -0
  81. package/methodology/context files function as agent operating systems through self-referential self-extension.md +46 -0
  82. package/methodology/context phrase clarity determines how deep a navigation hierarchy can scale.md +46 -0
  83. package/methodology/continuous small-batch processing eliminates review dread.md +48 -0
  84. package/methodology/controlled disorder engineers serendipity through semantic rather than topical linking.md +51 -0
  85. package/methodology/creative writing uses worldbuilding consistency with character tracking.md +672 -0
  86. package/methodology/cross-links between MOC territories indicate creative leaps and integration depth.md +43 -0
  87. package/methodology/dangling links reveal which notes want to exist.md +62 -0
  88. package/methodology/data exit velocity measures how quickly content escapes vendor lock-in.md +74 -0
  89. package/methodology/decontextualization risk means atomicity may strip meaning that cannot be recovered.md +48 -0
  90. package/methodology/dense interlinked research claims enable derivation while sparse references only enable templating.md +47 -0
  91. package/methodology/dependency resolution through topological sort makes module composition transparent and verifiable.md +56 -0
  92. package/methodology/derivation generates knowledge systems from composable research claims not template customization.md +63 -0
  93. package/methodology/derivation-engine.md +27 -0
  94. package/methodology/derived systems follow a seed-evolve-reseed lifecycle.md +56 -0
  95. package/methodology/description quality for humans diverges from description quality for keyword search.md +73 -0
  96. package/methodology/descriptions are retrieval filters not summaries.md +112 -0
  97. package/methodology/design MOCs as attention management devices with lifecycle governance.md +318 -0
  98. package/methodology/design-dimensions.md +66 -0
  99. package/methodology/digital mutability enables note evolution that physical permanence forbids.md +54 -0
  100. package/methodology/discovery-retrieval.md +48 -0
  101. package/methodology/distinctiveness scoring treats description quality as measurable.md +69 -0
  102. package/methodology/does agent processing recover what fast capture loses.md +43 -0
  103. package/methodology/domain-compositions.md +37 -0
  104. package/methodology/dual-coding with visual elements could enhance agent traversal.md +55 -0
  105. package/methodology/each module must be describable in one sentence under 200 characters or it does too many things.md +45 -0
  106. package/methodology/each new note compounds value by creating traversal paths.md +55 -0
  107. package/methodology/eight configuration dimensions parameterize the space of possible knowledge systems.md +56 -0
  108. package/methodology/elaborative encoding is the quality gate for new notes.md +55 -0
  109. package/methodology/enforce schema with graduated strictness across capture processing and query zones.md +221 -0
  110. package/methodology/enforcing atomicity can create paralysis when ideas resist decomposition.md +43 -0
  111. package/methodology/engineering uses technical decision tracking with architectural memory.md +766 -0
  112. package/methodology/every knowledge domain shares a four-phase processing skeleton that diverges only in the process step.md +53 -0
  113. package/methodology/evolution observations provide actionable signals for system adaptation.md +67 -0
  114. package/methodology/external memory shapes cognition more than base model.md +60 -0
  115. package/methodology/faceted classification treats notes as multi-dimensional objects rather than folder contents.md +65 -0
  116. package/methodology/failure-modes.md +27 -0
  117. package/methodology/false universalism applies same processing logic regardless of domain.md +49 -0
  118. package/methodology/federated wiki pattern enables multi-agent divergence as feature not bug.md +59 -0
  119. package/methodology/flat files break at retrieval scale.md +75 -0
  120. package/methodology/forced engagement produces weak connections.md +48 -0
  121. package/methodology/four abstraction layers separate platform-agnostic from platform-dependent knowledge system features.md +47 -0
  122. package/methodology/fresh context per task preserves quality better than chaining phases.md +44 -0
  123. package/methodology/friction reveals architecture.md +63 -0
  124. package/methodology/friction-driven module adoption prevents configuration debt by adding complexity only at pain points.md +48 -0
  125. package/methodology/gardening cycle implements tend prune fertilize operations.md +41 -0
  126. package/methodology/generation effect gate blocks processing without transformation.md +40 -0
  127. package/methodology/goal-driven memory orchestration enables autonomous domain learning through directed compute allocation.md +41 -0
  128. package/methodology/good descriptions layer heuristic then mechanism then implication.md +57 -0
  129. package/methodology/graph-structure.md +65 -0
  130. package/methodology/guided notes might outperform post-hoc structuring for high-volume capture.md +37 -0
  131. package/methodology/health wellness uses symptom-trigger correlation with multi-dimensional tracking.md +819 -0
  132. package/methodology/hook composition creates emergent methodology from independent single-concern components.md +47 -0
  133. package/methodology/hook enforcement guarantees quality while instruction enforcement merely suggests it.md +51 -0
  134. package/methodology/hook-driven learning loops create self-improving methodology through observation accumulation.md +62 -0
  135. package/methodology/hooks are the agent habit system that replaces the missing basal ganglia.md +40 -0
  136. package/methodology/hooks cannot replace genuine cognitive engagement yet more automation is always tempting.md +87 -0
  137. package/methodology/hooks enable context window efficiency by delegating deterministic checks to external processes.md +47 -0
  138. package/methodology/idempotent maintenance operations are safe to automate because running them twice produces the same result as running them once.md +44 -0
  139. package/methodology/implement condition-based maintenance triggers for derived systems.md +255 -0
  140. package/methodology/implicit dependencies create distributed monoliths that fail silently across configurations.md +58 -0
  141. package/methodology/implicit knowledge emerges from traversal.md +55 -0
  142. package/methodology/incremental formalization happens through repeated touching of old notes.md +60 -0
  143. package/methodology/incremental reading enables cross-source connection finding.md +39 -0
  144. package/methodology/index.md +32 -0
  145. package/methodology/inline links carry richer relationship data than metadata fields.md +91 -0
  146. package/methodology/insight accretion differs from productivity in knowledge systems.md +41 -0
  147. package/methodology/intermediate packets enable assembly over creation.md +52 -0
  148. package/methodology/intermediate representation pattern enables reliable vault operations beyond regex.md +62 -0
  149. package/methodology/justification chains enable forward backward and evolution reasoning about configuration decisions.md +46 -0
  150. package/methodology/knowledge system architecture is parameterized by platform capabilities not fixed by methodology.md +51 -0
  151. package/methodology/knowledge systems become communication partners through complexity and memory humans cannot sustain.md +47 -0
  152. package/methodology/knowledge systems share universal operations and structural components across all methodology traditions.md +46 -0
  153. package/methodology/legal case management uses precedent chains with regulatory change propagation.md +892 -0
  154. package/methodology/live index via periodic regeneration keeps discovery current.md +58 -0
  155. package/methodology/local-first file formats are inherently agent-native.md +69 -0
  156. package/methodology/logic column pattern separates reasoning from procedure.md +35 -0
  157. package/methodology/maintenance operations are more universal than creative pipelines because structural health is domain-invariant.md +47 -0
  158. package/methodology/maintenance scheduling frequency should match consequence speed not detection capability.md +50 -0
  159. package/methodology/maintenance targeting should prioritize mechanism and theory notes.md +26 -0
  160. package/methodology/maintenance-patterns.md +72 -0
  161. package/methodology/markdown plus YAML plus ripgrep implements a queryable graph database without infrastructure.md +55 -0
  162. package/methodology/maturity field enables agent context prioritization.md +33 -0
  163. package/methodology/memory-architecture.md +27 -0
  164. package/methodology/metacognitive confidence can diverge from retrieval capability.md +42 -0
  165. package/methodology/metadata reduces entropy enabling precision over recall.md +91 -0
  166. package/methodology/methodology development should follow the trajectory from documentation to skill to hook as understanding hardens.md +80 -0
  167. package/methodology/methodology traditions are named points in a shared configuration space not competing paradigms.md +64 -0
  168. package/methodology/mnemonic medium embeds verification into navigation.md +46 -0
  169. package/methodology/module communication through shared YAML fields creates loose coupling without direct dependencies.md +44 -0
  170. package/methodology/module deactivation must account for structural artifacts that survive the toggle.md +49 -0
  171. package/methodology/multi-domain systems compose through separate templates and shared graph.md +61 -0
  172. package/methodology/multi-domain-composition.md +27 -0
  173. package/methodology/narrow folksonomy optimizes for single-operator retrieval unlike broad consensus tagging.md +53 -0
  174. package/methodology/navigation infrastructure passes through distinct scaling regimes that require qualitative strategy shifts.md +48 -0
  175. package/methodology/navigational vertigo emerges in pure association systems without local hierarchy.md +54 -0
  176. package/methodology/note titles should function as APIs enabling sentence transclusion.md +51 -0
  177. package/methodology/note-design.md +57 -0
  178. package/methodology/notes are skills — curated knowledge injected when relevant.md +62 -0
  179. package/methodology/notes function as cognitive anchors that stabilize attention during complex tasks.md +41 -0
  180. package/methodology/novel domains derive by mapping knowledge type to closest reference domain then adapting.md +50 -0
  181. package/methodology/nudge theory explains graduated hook enforcement as choice architecture for agents.md +59 -0
  182. package/methodology/observation and tension logs function as dead-letter queues for failed automation.md +51 -0
  183. package/methodology/operational memory and knowledge memory serve different functions in agent architecture.md +48 -0
  184. package/methodology/operational wisdom requires contextual observation.md +52 -0
  185. package/methodology/orchestrated vault creation transforms arscontexta from tool to autonomous knowledge factory.md +40 -0
  186. package/methodology/organic emergence versus active curation creates a fundamental vault governance tension.md +68 -0
  187. package/methodology/orphan notes are seeds not failures.md +38 -0
  188. package/methodology/over-automation corrupts quality when hooks encode judgment rather than verification.md +62 -0
  189. package/methodology/people relationships uses Dunbar-layered graphs with interaction tracking.md +659 -0
  190. package/methodology/personal assistant uses life area management with review automation.md +610 -0
  191. package/methodology/platform adapter translation is semantic not mechanical because hook event meanings differ.md +40 -0
  192. package/methodology/platform capability tiers determine which knowledge system features can be implemented.md +48 -0
  193. package/methodology/platform fragmentation means identical conceptual operations require different implementations across agent environments.md +44 -0
  194. package/methodology/premature complexity is the most common derivation failure mode.md +45 -0
  195. package/methodology/prevent domain-specific failure modes through the vulnerability matrix.md +336 -0
  196. package/methodology/processing effort should follow retrieval demand.md +57 -0
  197. package/methodology/processing-workflows.md +75 -0
  198. package/methodology/product management uses feedback pipelines with experiment tracking.md +789 -0
  199. package/methodology/productivity porn risk in meta-system building.md +30 -0
  200. package/methodology/programmable notes could enable property-triggered workflows.md +64 -0
  201. package/methodology/progressive disclosure means reading right not reading less.md +69 -0
  202. package/methodology/progressive schema validates only what active modules require not the full system schema.md +49 -0
  203. package/methodology/project management uses decision tracking with stakeholder context.md +776 -0
  204. package/methodology/propositional link semantics transform wiki links from associative to reasoned.md +87 -0
  205. package/methodology/prospective memory requires externalization.md +53 -0
  206. package/methodology/provenance tracks where beliefs come from.md +62 -0
  207. package/methodology/queries evolve during search so agents should checkpoint.md +35 -0
  208. package/methodology/question-answer metadata enables inverted search patterns.md +39 -0
  209. package/methodology/random note resurfacing prevents write-only memory.md +33 -0
  210. package/methodology/reconciliation loops that compare desired state to actual state enable drift correction without continuous monitoring.md +59 -0
  211. package/methodology/reflection synthesizes existing notes into new insight.md +100 -0
  212. package/methodology/retrieval utility should drive design over capture completeness.md +69 -0
  213. package/methodology/retrieval verification loop tests description quality at scale.md +81 -0
  214. package/methodology/role field makes graph structure explicit.md +94 -0
  215. package/methodology/scaffolding enables divergence that fine-tuning cannot.md +67 -0
  216. package/methodology/schema enforcement via validation agents enables soft consistency.md +60 -0
  217. package/methodology/schema evolution follows observe-then-formalize not design-then-enforce.md +65 -0
  218. package/methodology/schema field names are the only domain specific element in the universal note pattern.md +46 -0
  219. package/methodology/schema fields should use domain-native vocabulary not abstract terminology.md +47 -0
  220. package/methodology/schema templates reduce cognitive overhead at capture time.md +55 -0
  221. package/methodology/schema validation hooks externalize inhibitory control that degrades under cognitive load.md +48 -0
  222. package/methodology/schema-enforcement.md +27 -0
  223. package/methodology/self-extension requires context files to contain platform operations knowledge not just methodology.md +47 -0
  224. package/methodology/sense-making vs storage does compression lose essential nuance.md +73 -0
  225. package/methodology/session boundary hooks implement cognitive bookends for orientation and reflection.md +60 -0
  226. package/methodology/session handoff creates continuity without persistent memory.md +43 -0
  227. package/methodology/session outputs are packets for future selves.md +43 -0
  228. package/methodology/session transcript mining enables experiential validation that structural tests cannot provide.md +38 -0
  229. package/methodology/skill context budgets constrain knowledge system complexity on agent platforms.md +52 -0
  230. package/methodology/skills encode methodology so manual execution bypasses quality gates.md +50 -0
  231. package/methodology/small-world topology requires hubs and dense local links.md +99 -0
  232. package/methodology/source attribution enables tracing claims to foundations.md +38 -0
  233. package/methodology/spaced repetition scheduling could optimize vault maintenance.md +44 -0
  234. package/methodology/spreading activation models how agents should traverse.md +79 -0
  235. package/methodology/stale navigation actively misleads because agents trust curated maps completely.md +43 -0
  236. package/methodology/stigmergy coordinates agents through environmental traces without direct communication.md +62 -0
  237. package/methodology/storage versus thinking distinction determines which tool patterns apply.md +56 -0
  238. package/methodology/structure enables navigation without reading everything.md +52 -0
  239. package/methodology/structure without processing provides no value.md +56 -0
  240. package/methodology/student learning uses prerequisite graphs with spaced retrieval.md +770 -0
  241. package/methodology/summary coherence tests composability before filing.md +37 -0
  242. package/methodology/tag rot applies to wiki links because titles serve as both identifier and display text.md +50 -0
  243. package/methodology/temporal media must convert to spatial text for agent traversal.md +43 -0
  244. package/methodology/temporal processing priority creates age-based inbox urgency.md +45 -0
  245. package/methodology/temporal separation of capture and processing preserves context freshness.md +39 -0
  246. package/methodology/ten universal primitives form the kernel of every viable agent knowledge system.md +162 -0
  247. package/methodology/testing effect could enable agent knowledge verification.md +38 -0
  248. package/methodology/the AgentSkills standard embodies progressive disclosure at the skill level.md +40 -0
  249. package/methodology/the derivation engine improves recursively as deployed systems generate observations.md +49 -0
  250. package/methodology/the determinism boundary separates hook methodology from skill methodology.md +46 -0
  251. package/methodology/the fix-versus-report decision depends on determinism reversibility and accumulated trust.md +45 -0
  252. package/methodology/the generation effect requires active transformation not just storage.md +57 -0
  253. package/methodology/the no wrong patches guarantee ensures any valid module combination produces a valid system.md +58 -0
  254. package/methodology/the system is the argument.md +46 -0
  255. package/methodology/the vault constitutes identity for agents.md +86 -0
  256. package/methodology/the vault methodology transfers because it encodes cognitive science not domain specifics.md +47 -0
  257. package/methodology/therapy journal uses warm personality with pattern detection for emotional processing.md +584 -0
  258. package/methodology/three capture schools converge through agent-mediated synthesis.md +55 -0
  259. package/methodology/three concurrent maintenance loops operate at different timescales to catch different classes of problems.md +56 -0
  260. package/methodology/throughput matters more than accumulation.md +58 -0
  261. package/methodology/title as claim enables traversal as reasoning.md +50 -0
  262. package/methodology/topological organization beats temporal for knowledge work.md +52 -0
  263. package/methodology/trading uses conviction tracking with thesis-outcome correlation.md +699 -0
  264. package/methodology/trails transform ephemeral navigation into persistent artifacts.md +39 -0
  265. package/methodology/transform universal vocabulary to domain-native language through six levels.md +259 -0
  266. package/methodology/type field enables structured queries without folder hierarchies.md +53 -0
  267. package/methodology/use-case presets dissolve the tension between composability and simplicity.md +44 -0
  268. package/methodology/vault conventions may impose hidden rigidity on thinking.md +44 -0
  269. package/methodology/verbatim risk applies to agents too.md +31 -0
  270. package/methodology/vibe notetaking is the emerging industry consensus for AI-native self-organization.md +56 -0
  271. package/methodology/vivid memories need verification.md +45 -0
  272. package/methodology/vocabulary-transformation.md +27 -0
  273. package/methodology/voice capture is the highest-bandwidth channel for agent-delegated knowledge systems.md +45 -0
  274. package/methodology/wiki links are the digital evolution of analog indexing.md +73 -0
  275. package/methodology/wiki links as social contract transforms agents into stewards of incomplete references.md +52 -0
  276. package/methodology/wiki links create navigation paths that shape retrieval.md +63 -0
  277. package/methodology/wiki links implement GraphRAG without the infrastructure.md +101 -0
  278. package/methodology/writing for audience blocks authentic creation.md +22 -0
  279. package/methodology/you operate a system that takes notes.md +79 -0
  280. package/openclaw/SKILL.md +110 -0
  281. package/package.json +45 -0
  282. package/platforms/README.md +51 -0
  283. package/platforms/claude-code/generator.md +61 -0
  284. package/platforms/claude-code/hooks/README.md +186 -0
  285. package/platforms/claude-code/hooks/auto-commit.sh.template +38 -0
  286. package/platforms/claude-code/hooks/session-capture.sh.template +72 -0
  287. package/platforms/claude-code/hooks/session-orient.sh.template +189 -0
  288. package/platforms/claude-code/hooks/write-validate.sh.template +106 -0
  289. package/platforms/openclaw/generator.md +82 -0
  290. package/platforms/openclaw/hooks/README.md +89 -0
  291. package/platforms/openclaw/hooks/bootstrap.ts.template +224 -0
  292. package/platforms/openclaw/hooks/command-new.ts.template +165 -0
  293. package/platforms/openclaw/hooks/heartbeat.ts.template +214 -0
  294. package/platforms/shared/features/README.md +70 -0
  295. package/platforms/shared/skill-blocks/graph.md +145 -0
  296. package/platforms/shared/skill-blocks/learn.md +119 -0
  297. package/platforms/shared/skill-blocks/next.md +131 -0
  298. package/platforms/shared/skill-blocks/pipeline.md +326 -0
  299. package/platforms/shared/skill-blocks/ralph.md +616 -0
  300. package/platforms/shared/skill-blocks/reduce.md +1142 -0
  301. package/platforms/shared/skill-blocks/refactor.md +129 -0
  302. package/platforms/shared/skill-blocks/reflect.md +780 -0
  303. package/platforms/shared/skill-blocks/remember.md +524 -0
  304. package/platforms/shared/skill-blocks/rethink.md +574 -0
  305. package/platforms/shared/skill-blocks/reweave.md +680 -0
  306. package/platforms/shared/skill-blocks/seed.md +320 -0
  307. package/platforms/shared/skill-blocks/stats.md +145 -0
  308. package/platforms/shared/skill-blocks/tasks.md +171 -0
  309. package/platforms/shared/skill-blocks/validate.md +323 -0
  310. package/platforms/shared/skill-blocks/verify.md +562 -0
  311. package/platforms/shared/templates/README.md +35 -0
  312. package/presets/experimental/categories.yaml +1 -0
  313. package/presets/experimental/preset.yaml +38 -0
  314. package/presets/experimental/starter/README.md +7 -0
  315. package/presets/experimental/vocabulary.yaml +7 -0
  316. package/presets/personal/categories.yaml +7 -0
  317. package/presets/personal/preset.yaml +41 -0
  318. package/presets/personal/starter/goals.md +21 -0
  319. package/presets/personal/starter/index.md +17 -0
  320. package/presets/personal/starter/life-areas.md +21 -0
  321. package/presets/personal/starter/people.md +21 -0
  322. package/presets/personal/vocabulary.yaml +32 -0
  323. package/presets/research/categories.yaml +8 -0
  324. package/presets/research/preset.yaml +41 -0
  325. package/presets/research/starter/index.md +17 -0
  326. package/presets/research/starter/methods.md +21 -0
  327. package/presets/research/starter/open-questions.md +21 -0
  328. package/presets/research/vocabulary.yaml +33 -0
  329. package/reference/AUDIT-REPORT.md +238 -0
  330. package/reference/claim-map.md +172 -0
  331. package/reference/components.md +327 -0
  332. package/reference/conversation-patterns.md +542 -0
  333. package/reference/derivation-validation.md +649 -0
  334. package/reference/dimension-claim-map.md +134 -0
  335. package/reference/evolution-lifecycle.md +297 -0
  336. package/reference/failure-modes.md +235 -0
  337. package/reference/interaction-constraints.md +204 -0
  338. package/reference/kernel.yaml +242 -0
  339. package/reference/methodology.md +283 -0
  340. package/reference/open-questions.md +279 -0
  341. package/reference/personality-layer.md +302 -0
  342. package/reference/self-space.md +299 -0
  343. package/reference/semantic-vs-keyword.md +288 -0
  344. package/reference/session-lifecycle.md +298 -0
  345. package/reference/templates/base-note.md +16 -0
  346. package/reference/templates/companion-note.md +70 -0
  347. package/reference/templates/creative-note.md +16 -0
  348. package/reference/templates/learning-note.md +16 -0
  349. package/reference/templates/life-note.md +16 -0
  350. package/reference/templates/moc.md +26 -0
  351. package/reference/templates/relationship-note.md +17 -0
  352. package/reference/templates/research-note.md +19 -0
  353. package/reference/templates/session-log.md +24 -0
  354. package/reference/templates/therapy-note.md +16 -0
  355. package/reference/test-fixtures/edge-case-constraints.md +148 -0
  356. package/reference/test-fixtures/multi-domain.md +164 -0
  357. package/reference/test-fixtures/novel-domain-gaming.md +138 -0
  358. package/reference/test-fixtures/research-minimal.md +102 -0
  359. package/reference/test-fixtures/therapy-full.md +155 -0
  360. package/reference/testing-milestones.md +1087 -0
  361. package/reference/three-spaces.md +363 -0
  362. package/reference/tradition-presets.md +203 -0
  363. package/reference/use-case-presets.md +341 -0
  364. package/reference/validate-kernel.sh +432 -0
  365. package/reference/vocabulary-transforms.md +85 -0
  366. package/scripts/sync-thinking.sh +147 -0
  367. package/skill-sources/graph/SKILL.md +567 -0
  368. package/skill-sources/graph/skill.json +17 -0
  369. package/skill-sources/learn/SKILL.md +254 -0
  370. package/skill-sources/learn/skill.json +17 -0
  371. package/skill-sources/next/SKILL.md +407 -0
  372. package/skill-sources/next/skill.json +17 -0
  373. package/skill-sources/pipeline/SKILL.md +314 -0
  374. package/skill-sources/pipeline/skill.json +17 -0
  375. package/skill-sources/ralph/SKILL.md +604 -0
  376. package/skill-sources/ralph/skill.json +17 -0
  377. package/skill-sources/reduce/SKILL.md +1113 -0
  378. package/skill-sources/reduce/skill.json +17 -0
  379. package/skill-sources/refactor/SKILL.md +448 -0
  380. package/skill-sources/refactor/skill.json +17 -0
  381. package/skill-sources/reflect/SKILL.md +747 -0
  382. package/skill-sources/reflect/skill.json +17 -0
  383. package/skill-sources/remember/SKILL.md +534 -0
  384. package/skill-sources/remember/skill.json +17 -0
  385. package/skill-sources/rethink/SKILL.md +658 -0
  386. package/skill-sources/rethink/skill.json +17 -0
  387. package/skill-sources/reweave/SKILL.md +657 -0
  388. package/skill-sources/reweave/skill.json +17 -0
  389. package/skill-sources/seed/SKILL.md +303 -0
  390. package/skill-sources/seed/skill.json +17 -0
  391. package/skill-sources/stats/SKILL.md +371 -0
  392. package/skill-sources/stats/skill.json +17 -0
  393. package/skill-sources/tasks/SKILL.md +402 -0
  394. package/skill-sources/tasks/skill.json +17 -0
  395. package/skill-sources/validate/SKILL.md +310 -0
  396. package/skill-sources/validate/skill.json +17 -0
  397. package/skill-sources/verify/SKILL.md +532 -0
  398. package/skill-sources/verify/skill.json +17 -0
  399. package/skills/add-domain/SKILL.md +441 -0
  400. package/skills/add-domain/skill.json +17 -0
  401. package/skills/architect/SKILL.md +568 -0
  402. package/skills/architect/skill.json +17 -0
  403. package/skills/ask/SKILL.md +388 -0
  404. package/skills/ask/skill.json +17 -0
  405. package/skills/health/SKILL.md +760 -0
  406. package/skills/health/skill.json +17 -0
  407. package/skills/help/SKILL.md +348 -0
  408. package/skills/help/skill.json +17 -0
  409. package/skills/recommend/SKILL.md +553 -0
  410. package/skills/recommend/skill.json +17 -0
  411. package/skills/reseed/SKILL.md +385 -0
  412. package/skills/reseed/skill.json +17 -0
  413. package/skills/setup/SKILL.md +1688 -0
  414. package/skills/setup/skill.json +17 -0
  415. package/skills/tutorial/SKILL.md +496 -0
  416. package/skills/tutorial/skill.json +17 -0
  417. package/skills/upgrade/SKILL.md +395 -0
  418. package/skills/upgrade/skill.json +17 -0
@@ -0,0 +1,56 @@
+ ---
+ description: Granularity, organization, linking, processing intensity, navigation depth, maintenance cadence, schema density, and automation level — each a spectrum with poles, together defining what knobs a
+ kind: research
+ topics: ["[[design-dimensions]]"]
+ methodology: ["Original"]
+ source: [[knowledge-system-derivation-blueprint]]
+ ---
+
+ # eight configuration dimensions parameterize the space of possible knowledge systems
+
+ The claim that knowledge systems are parameterized rather than fixed needs teeth. Saying "it depends on the use case" is vacuously true. What makes the design space tractable is identifying the specific dimensions along which systems vary and understanding what each dimension's poles actually trade off. Eight dimensions emerge from the research, and together they define the space a knowledge system generator navigates.
+
+ **Note granularity** ranges from atomic (one claim per note, as this vault practices) to coarse (one note per source, preserving original structure). Atomic granularity maximizes composability — since [[enforcing atomicity can create paralysis when ideas resist decomposition]], the cost is real, but atomic notes can be linked, rearranged, and synthesized independently in ways that compound notes cannot. Coarse granularity reduces capture friction and preserves source coherence, which matters when the primary operation is reference rather than synthesis. The right position depends on whether the system prioritizes retrieval of original context or generation of new connections.
+
+ **Organization** ranges from flat (all files in one directory, structure via wiki links) to hierarchical (nested folders with path-based navigation). Since [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]], flat organization accommodates evolving understanding because notes are not locked into categories at creation time. But hierarchical systems provide predictable locations for team collaboration and reduce navigational vertigo in large collections. Wiki link resolution constrains this choice — flat structures enable filename-only linking while hierarchical structures require path disambiguation.
+
+ **Linking philosophy** ranges from explicit only (every connection is manually created with articulated reasoning) to explicit plus implicit discovery (semantic search, embedding similarity, and algorithmic suggestions augment manual linking). Since [[propositional link semantics transform wiki links from associative to reasoned]], explicit-only linking produces higher-quality connections but scales poorly. Adding implicit discovery compensates for human or agent attention limits but introduces noise — because [[spreading activation models how agents should traverse]], the traversal mechanism must handle both curated and suggested edges without conflating their reliability.
+
+ **Processing intensity** ranges from heavy (extract, reflect, reweave, verify — the full pipeline this vault implements) to light (capture, route, review — minimal transformation). Since [[fresh context per task preserves quality better than chaining phases]], heavy processing produces higher-quality notes but demands more infrastructure and compute. Since [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]], the intensity dimension specifically governs how much transformation happens in the process phase — capture, connect, and verify remain structural constants regardless of intensity level. Light processing suits domains where content arrives near-final (curated feeds, expert-written sources) or where the volume-to-value ratio is low enough that deep processing per item isn't justified.
+
+ **Navigation depth** ranges from two levels (hub linking directly to topic notes) to four levels (hub to domain to topic to sub-topic). Deeper navigation serves larger collections by providing intermediate orientation points, but each level adds maintenance burden and cognitive overhead during traversal. The right depth correlates with expected note count — a 50-note vault needs two levels, a 5000-note vault needs four — but also with the conceptual complexity of the domain. A domain with natural sub-structure (like "processing" splitting into capture, transformation, and retrieval) benefits from depth even at modest scale.
+
+ **Maintenance cadence** ranges from daily (a trading journal where yesterday's analysis becomes stale) to quarterly (a stable reference vault where core concepts change slowly). This dimension is often invisible in system design because maintenance happens — or fails to happen — as a social practice rather than an architectural feature. But cadence constrains other dimensions: daily maintenance requires lightweight review processes, while quarterly maintenance can afford deep reweaving passes. Domains with high temporal dynamics (markets, active research) need faster cadence than domains with slow conceptual evolution (mathematical foundations, historical analysis). Cadence also interacts with the governance rhythm: since [[organic emergence versus active curation creates a fundamental vault governance tension]], the cadence setting determines how frequently curation interventions occur, and setting cadence too high suppresses the emergence that generates organic structure while setting it too low lets structural debt accumulate past easy resolution.
+
+ **Schema density** ranges from minimal (description and topics only, roughly what any note needs to be findable) to dense (eight or more fields including methodology, classification, provenance, status, and custom domain fields). Dense schemas make notes queryable like a graph database — since [[faceted classification treats notes as multi-dimensional objects rather than folder contents]], each YAML field is an independent classification facet that composes multiplicatively for retrieval precision. But since [[storage versus thinking distinction determines which tool patterns apply]], the trade-off is real: each required field adds capture friction and maintenance burden. And since [[schema fields should use domain-native vocabulary not abstract terminology]], the density choice interacts with vocabulary: dense schemas amplify the cost of vocabulary mismatch because every abstractly-named field forces a translation step at capture time, so higher density demands more careful domain-native naming to remain usable. The right density depends on whether the primary access pattern is structured query (favoring dense) or free-form traversal (favoring minimal). Critically, since [[schema evolution follows observe-then-formalize not design-then-enforce]], the position on this spectrum should not be chosen upfront — systems should start minimal and move toward density only when usage evidence (manual field additions, patterned free text) justifies the cost of maintaining additional fields.
+
+ **Automation level** ranges from full (hooks enforce schemas on every write, skills encode methodology as executable workflows, pipelines coordinate multi-phase processing) to manual (all quality standards exist as written instructions that the agent follows by convention). Since [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]], automation level is partly determined by platform capability — you cannot automate what the platform does not support — and partly a design choice about system maturity. New systems benefit from starting manual (convention layer) and automating as understanding hardens, following the trajectory from documentation to skill to hook.
+
+ What makes these eight dimensions a useful parameterization rather than an arbitrary list is that they are largely independent in definition but interact in practice. A system positioned at "atomic granularity" does not logically require "heavy processing," but in practice, atomic notes need more connection work than coarse notes because the relationships lost by decomposing sources must be explicitly recreated. Since [[configuration dimensions interact so choices in one create pressure on others]], the space of viable configurations is much smaller than the combinatorial product of eight independent spectrums. This interaction structure also explains why [[methodology traditions are named points in a shared configuration space not competing paradigms]] — Zettelkasten, PARA, Cornell, and Evergreen each represent a coherent region where dimension interactions have been resolved through practice, not competing answers to the same question. The dimensions together define what a knowledge system generator decides when [[derivation generates knowledge systems from composable research claims not template customization]], while the universal components (notes, prose-as-title, wiki links, YAML base fields, graph mechanics, progressive disclosure) define what it never varies. But since [[dense interlinked research claims enable derivation while sparse references only enable templating]], the dimensions are only navigable when the claim graph supporting each dimension is dense enough — sparse research per dimension forces the derivation agent to guess rather than compose, degrading derivation to templating at the per-dimension level even if the overall graph appears large. And when a domain has no direct reference model, since [[novel domains derive by mapping knowledge type to closest reference domain then adapting]], knowledge type classification identifies which reference domain provides initial settings for these eight dimensions, with adaptation adjusting from that starting configuration rather than choosing each dimension from scratch.
+
+ ---
+ ---
+
+ Relevant Notes:
+ - [[knowledge system architecture is parameterized by platform capabilities not fixed by methodology]] — the upstream principle this concretizes: parameterization as concept needs specific dimensions to become actionable
+ - [[four abstraction layers separate platform-agnostic from platform-dependent knowledge system features]] — layers describe WHERE features live while dimensions describe HOW features vary; the two are orthogonal decompositions of the same design space
+ - [[storage versus thinking distinction determines which tool patterns apply]] — the upstream classification before dimensions apply: a storage system and a thinking system occupy different regions of the eight-dimensional space
+ - [[enforcing atomicity can create paralysis when ideas resist decomposition]] — the failure mode at the atomic pole of the granularity dimension, grounding it in real trade-offs rather than abstract spectrum
+ - [[associative ontologies beat hierarchical taxonomies because heterarchy adapts while hierarchy brittles]] — the research case for one pole of the organization dimension
+ - [[propositional link semantics transform wiki links from associative to reasoned]] — moves along the linking philosophy dimension from implicit association toward explicit typed connections
+ - [[fresh context per task preserves quality better than chaining phases]] — grounds the heavy end of the processing intensity dimension in attention science
+ - [[configuration dimensions interact so choices in one create pressure on others]] — develops the interaction thesis: dimensions are independent in definition but coupled in practice, reducing the viable configuration space
+ - [[methodology traditions are named points in a shared configuration space not competing paradigms]] — extends: maps five major traditions as specific points in this eight-dimensional space, showing what instantiated dimension choices look like
+ - [[derivation generates knowledge systems from composable research claims not template customization]] — enables: derivation is the process that traverses these dimensions to compose justified configurations rather than selecting from templates
+ - [[every knowledge domain shares a four-phase processing skeleton that diverges only in the process step]] — constrains the processing intensity dimension by showing that capture, connect, and verify are structural constants while only the process step varies by domain
+ - [[faceted classification treats notes as multi-dimensional objects rather than folder contents]] — grounds the schema density dimension in Ranganathan's formal classification theory: each YAML field is an independent facet enabling multiplicative retrieval precision
+ - [[novel domains derive by mapping knowledge type to closest reference domain then adapting]] — operationalizes these dimensions for unfamiliar domains: knowledge type classification identifies which reference domain provides initial dimension settings, with adaptation adjusting from that starting configuration
+ - [[schema evolution follows observe-then-formalize not design-then-enforce]] — the temporal mechanism for the schema density dimension: rather than choosing density upfront, start minimal and let the five-signal quarterly review protocol move density in response to observed usage patterns
+ - [[schema fields should use domain-native vocabulary not abstract terminology]] — the naming constraint on the schema density dimension: dense schemas amplify vocabulary mismatch costs because every abstractly-named field forces a translation step, so higher density demands more careful domain-native naming
+ - [[organic emergence versus active curation creates a fundamental vault governance tension]] — the governance meta-dimension spanning maintenance cadence and automation level: cadence governs how frequently curation interventions occur, and automation level governs how much curation is delegated to infrastructure, with the governance rhythm determining the effective position on both spectrums
+ - [[configuration paralysis emerges when derivation surfaces too many decisions]] — the UX consequence of surfacing all eight dimensions: presenting every dimension as a question produces analysis paralysis; sensible defaults and inference from dimension coupling reduce the decision surface to genuine choice points
+ - [[premature complexity is the most common derivation failure mode]] — the deployment consequence of the combinatorial space: a derivation engine can justify choices along all eight dimensions, but the composed system exceeds user capacity, requiring a complexity budget that constrains initial deployment to a subset of these dimensions
+ - [[dense interlinked research claims enable derivation while sparse references only enable templating]] — the research quality prerequisite: dimensions are only navigable when each has dense, interlinked claims with methodology provenance; sparse coverage per dimension forces derivation to guess rather than compose, degrading to templating at the per-dimension level
+
+ Topics:
+ - [[design-dimensions]]
@@ -0,0 +1,55 @@
+ ---
+ description: Zettelkasten works because connecting new information to existing knowledge — not just filing it — creates encoding depth, and explicit relationship articulation in wiki links is how the vault
+ kind: research
+ topics: ["[[note-design]]", "[[graph-structure]]"]
+ methodology: ["Zettelkasten", "Cognitive Science"]
+ source: [[tft-research-part3]]
+ ---
+
+ # elaborative encoding is the quality gate for new notes
+
+ Elaborative encoding is the cognitive mechanism that distinguishes productive note-taking from sophisticated filing. The principle is specific: information is better retained and more usable when the learner connects it to what they already know, rather than storing it in isolation. Craik and Lockhart's levels of processing framework established the hierarchy — shallow processing (structural features) produces weaker memory traces than deep processing (semantic meaning) — and elaborative rehearsal is the deepest form because it requires actively relating new material to existing mental structures.
+
+ Zettelkasten operationalizes this through a simple rule: every new note must link to at least one existing note with an explicit statement of why. Not "related to" — that is structural processing, the same level as organizing by topic. The relationship must be articulated: "extends this by adding the temporal dimension," "contradicts this because agents lack motor encoding," "provides the mechanism that this note only names." Since [[inline links carry richer relationship data than metadata fields]], the prose context around a wiki link IS the elaborative encoding — it forces the author to process how the new claim relates to the existing knowledge structure, which is the cognitive act that creates durable understanding.
+
+ This is narrower and more specific than [[the generation effect requires active transformation not just storage]]. The generation effect covers all forms of active production — writing descriptions, synthesizing sources, articulating claims. Elaborative encoding specifies the particular form of generation that builds network coherence: connecting to what already exists. You can generate a perfectly good description for a note without connecting it to anything, and that description demonstrates the generation effect. But without elaborative encoding — without the "this extends X because Y" articulation — the note floats as an isolated node. The generation created local value (a good description), but missed the network-level value (traversal paths to existing knowledge).
+
+ This is why the vault's quality standard requires context phrases on wiki links rather than bare link lists. A MOC entry that reads `- [[note title]]` has failed the elaborative encoding test. It says a connection exists without saying what the connection means. Compare `- [[note title]] — contradicts the assumption that volume drives value` — this forces the author to process the relationship, and that processing is what transforms a link from a filing operation into a cognitive operation. Since [[propositional link semantics transform wiki links from associative to reasoned]], the articulation vocabulary (extends, contradicts, enables, specifies) provides scaffolding for elaboration. Each relationship type demands a different form of processing: "extends" requires understanding what the original note argues and how the new note adds to it; "contradicts" requires understanding the tension and why both positions have force.
+
+ The connection to capture is immediate. Since [[capture the reaction to content not just the content itself]], reactions are elaborative encoding happening at capture time — relating the new content to existing understanding before the context decays. Without a reaction, the agent must reconstruct the elaboration later, and since [[structure without processing provides no value]], reconstruction that only achieves structural processing (filing, tagging, organizing) produces nothing. The vault's processing pipeline encodes elaborative encoding as a structural requirement: the reflect phase exists specifically to find and articulate connections, and the quality gates on that phase check whether the articulation is genuine or perfunctory.
+
+ But there is a shadow side. Mandatory elaboration can become a ritual that produces form without substance, and since [[verbatim risk applies to agents too]], an agent that writes "extends [[note]] by adding detail" on every connection has satisfied the articulation requirement without achieving genuine elaborative processing. The articulation vocabulary helps here because different relationship types demand different cognitive work, but the deeper safeguard is specificity. Since [[claims must be specific enough to be wrong]], vague elaboration fails the same test that vague claims fail — if the relationship articulation could apply to any pair of notes, it hasn't actually processed the specific connection between these two. "Extends by adding the temporal dimension" can only describe one relationship. "Extends by adding detail" could describe anything. The elaboration must be as specific as the claims it connects.
+
+ This grounds the vault's approach to note quality in cognitive science rather than aesthetic preference. The requirement for explicit relationship articulation is not a formatting convention. It is the mechanism through which new knowledge integrates with existing knowledge, creating the encoding depth that makes content retrievable and the network density that makes traversal productive. Notes without elaborated connections are structurally complete but cognitively orphaned — they exist in the graph without participating in the reasoning the graph enables. Elaborative encoding also serves as the calibration mechanism for [[controlled disorder engineers serendipity through semantic rather than topical linking]] — the quality gate that keeps disorder "controlled" rather than random. Every semantic cross-link must pass the articulation test ("WHY do these connect?"), which means the network-level unpredictability that generates serendipity is composed entirely of judgment-tested individual connections. Without this gate, semantic linking degenerates into noise; with it, each cross-topical edge carries genuine reasoning that makes the unexpected adjacency productive rather than arbitrary.
+
+ The encoding depth is testable. Since [[testing effect could enable agent knowledge verification]], an agent that reads only title and description and then predicts note content is testing whether the elaborative encoding produced retrievable hooks. Notes whose connections were genuinely elaborated create distinctive enough descriptions that prediction succeeds. Notes whose connections were structurally filed produce descriptions that merely paraphrase, and prediction fails. The testing effect becomes the verification mechanism for elaborative encoding quality.
+
+ Elaborative encoding also has a temporal dimension. The initial encoding at creation time — connecting a new note to existing knowledge — is only the first pass. Since [[incremental formalization happens through repeated touching of old notes]], each subsequent encounter with a note triggers fresh elaboration against current understanding. A link added six months ago said "extends by adding the temporal dimension" based on six-month-old understanding. Today, with new notes and deeper comprehension, that same connection might need richer articulation or might reveal itself as superficial. This is why [[backward maintenance asks what would be different if written today]] — the reweaving question forces re-elaboration, preventing connections from fossilizing at whatever depth they had when first written.
+
+ There is a deeper question about WHO does the elaboration. Since [[cognitive outsourcing risk in agent-operated systems]] identifies the delegation shadow, the encoding depth that elaboration creates accrues to whoever does the processing. When agents handle all connection-finding and articulation, the vault becomes richly elaborated — dense with context phrases, typed relationships, and genuine connection reasoning — but the human's understanding may remain at the structural processing level of someone who reads but does not generate. The elaborative encoding gate ensures vault quality, but it does not ensure human understanding unless the human participates in at least some of the elaboration.
+
+ The vault's claim-based architecture maps onto a formal framework that names what elaboration produces. Since [[IBIS framework maps claim-based architecture to structured argumentation]], each context phrase on a wiki link functions as an Argument in IBIS terms — evidence connecting one Position (claim note) to another. A bare link without elaboration is a non-argument: it asserts a connection exists without providing any argumentative force. The elaborative encoding test and the IBIS argument test converge: both require that connections carry reasoning, not just reference.
+ ---
+
+ Relevant Notes:
+ - [[the generation effect requires active transformation not just storage]] — foundation: generation is the broader principle; elaborative encoding specifies the mechanism — connecting to existing knowledge is the particular form of generation that creates durable understanding
+ - [[inline links carry richer relationship data than metadata fields]] — implements elaborative encoding at the link level: 'since X, therefore Y' is elaboration, 'related_to: X' is filing
+ - [[propositional link semantics transform wiki links from associative to reasoned]] — extends the articulation requirement: typed relationships (causes, enables, contradicts) are deeper elaboration than untyped association
+ - [[structure without processing provides no value]] — the Lazy Cornell anti-pattern is the failure mode when elaboration is skipped: structure without connecting-to-existing-knowledge produces no benefit
+ - [[generation effect gate blocks processing without transformation]] — operationalizes the gate this note describes: requires a generated artifact before inbox exit, but elaborative encoding specifies what KIND of generation matters most
+ - [[capture the reaction to content not just the content itself]] — reactions ARE elaborative encoding at capture time: relating new content to existing understanding is the reaction that creates synthesis seeds
+ - [[claims must be specific enough to be wrong]] — specificity enables elaboration: vague claims cannot be connected to existing knowledge because they lack the precision needed for relationship articulation
+ - [[testing effect could enable agent knowledge verification]] — verifies whether elaboration succeeded: if an agent cannot predict note content from title and description, the elaborative encoding that should have created retrievable connections was shallow
+ - [[verbatim risk applies to agents too]] — the failure mode when elaboration becomes ritual: agents can produce context phrases that satisfy the articulation requirement structurally while achieving no genuine elaborative processing
+ - [[incremental formalization happens through repeated touching of old notes]] — extends elaborative encoding from a single-pass mechanism to a longitudinal one: each re-encounter with an old note triggers fresh elaboration against current understanding, deepening connections over time
+ - [[backward maintenance asks what would be different if written today]] — re-elaboration as maintenance: the 'what would be different' question forces fresh elaborative encoding of existing content against new knowledge, preventing connections from calcifying at their original depth
+ - [[cognitive outsourcing risk in agent-operated systems]] — the delegation shadow: when agents perform all elaborative encoding, the vault benefits from deep connections but the human loses the encoding depth that elaboration provides to whoever does the processing
+ - [[IBIS framework maps claim-based architecture to structured argumentation]] — provides formal vocabulary: elaborated context phrases on wiki links are Arguments in IBIS terms, and bare links without articulation are non-arguments that fail both the IBIS and elaborative encoding tests
+ - [[wiki links as social contract transforms agents into stewards of incomplete references]] — extends elaborative encoding from a quality gate to a stewardship standard: the social contract says agents must fulfill link promises, and elaborative encoding specifies what fulfillment requires — not just creating the target note but connecting it with articulated relationships
+ - [[vibe notetaking is the emerging industry consensus for AI-native self-organization]] — industry-scale example of the elaboration gap: most vibe notetaking tools use embedding-based linking that skips elaboration entirely, while agent-curated wiki links with context phrases implement elaborative encoding at system scale
+ - [[controlled disorder engineers serendipity through semantic rather than topical linking]] — elaborative encoding is the calibration mechanism: the requirement that every link articulate WHY it connects is what keeps Luhmann's controlled disorder productive rather than random; without the quality gate, semantic cross-linking degenerates into noise
+ - [[MOC construction forces synthesis that automated generation from metadata cannot replicate]] — navigation-scale application: writing context phrases on MOC entries IS elaborative encoding applied to navigation; the Dump-Lump-Jump pattern reveals that the Lump phase (context phrase creation) is precisely where elaboration happens, and automated MOC generation skips this encoding step entirely, producing lists without the relationship articulation that makes MOCs navigable
+
+ Topics:
+ - [[note-design]]
+ - [[graph-structure]]
@@ -0,0 +1,221 @@
1
+ ---
2
+ description: Why schema enforcement is non-negotiable for agent-operated knowledge systems and how to implement it across domains — soft enforcement at capture, hard enforcement at pipeline boundaries, and enforcement by consequence at query time
3
+ kind: guidance
4
+ status: active
5
+ topics: ["[[schema-enforcement]]"]
6
+ ---
7
+
8
+ # enforce schema with graduated strictness across capture processing and query zones
9
+
10
+ Schema enforcement is the data integrity layer that makes a knowledge vault queryable rather than browsable. Without it, the vault degrades from a structured graph database into an unstructured pile of files that happens to have YAML at the top. Since [[markdown plus YAML plus ripgrep implements a queryable graph database without infrastructure]], schema enforcement is what keeps that database functional — the YAML fields are the columns, and enforcement is what ensures the columns are populated.
11
+
12
+ This doc tells the plugin WHEN to enforce, HOW strictly, and WHAT to enforce for each domain. It is consulted by /ask when users ask about schema compliance, by /architect when designing new note types, and by /recommend when diagnosing retrieval failures.
13
+
14
+ ## Why Non-Negotiable
15
+
16
+ Every domain composition the plugin generates depends on queryable metadata. Therapy systems query `mood` and `trigger` fields to detect emotional patterns across entries. Project management queries `status` and `stakeholders` to prepare meeting briefings. Research queries `methodology` and `classification` to find methodologically comparable claims. Trading queries `ticker` and `thesis` to check conviction against outcome. Without schema enforcement, these queries return incomplete or incorrect results — the system works for a week, then silently degrades as notes accumulate without proper metadata.
17
+
18
+ The degradation is invisible. A therapy vault with 50 reflections looks healthy. But if 12 of them lack `mood` fields, the pattern detection algorithm sees only 38 data points and the mood-trigger correlations it surfaces are skewed toward entries that happened to be well-formatted. The user never knows the analysis is incomplete because missing data does not announce itself — it simply disappears from query results. This is "silent data loss," and it is the primary failure mode that schema enforcement prevents.
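A minimal sketch of how silent data loss hides from the user, using hypothetical frontmatter blocks (the parsing is deliberately naive, not the plugin's actual YAML handling):

```python
# Hypothetical therapy notes: raw frontmatter blocks, one silently missing `mood`.
notes = [
    "---\ndescription: argument at work\nmood: anxious\n---",
    "---\ndescription: slept badly\n---",                    # no mood field
    "---\ndescription: good session\nmood: calm\n---",
]

def frontmatter_fields(note: str) -> dict:
    """Parse simple `key: value` pairs out of a frontmatter block."""
    body = note.split("---")[1]
    return dict(
        line.split(": ", 1) for line in body.strip().splitlines() if ": " in line
    )

# A mood query only sees notes that carry the field -- the rest just vanish.
with_mood = [n for n in notes if "mood" in frontmatter_fields(n)]
print(f"{len(with_mood)} of {len(notes)} notes visible to mood analysis")
```

Nothing errors and nothing warns; the second note simply drops out of the analysis, which is exactly the failure mode enforcement exists to prevent.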
19
+
20
+ Since [[schema enforcement via validation agents enables soft consistency]], the enforcement approach matters as much as the enforcement itself. The research is clear: hard enforcement at capture time creates friction that causes system abandonment, while no enforcement at all produces metadata drift that corrupts retrieval. The solution is graduated enforcement — strict where it matters, gentle where friction would cost more than compliance.
21
+
22
+ ## The Cognitive Science of Externalized Enforcement
23
+
24
+ Why can't we simply instruct agents to fill in all the fields? Because since [[schema validation hooks externalize inhibitory control that degrades under cognitive load]], instruction-based enforcement degrades precisely when it matters most. Inhibitory control — the capacity to suppress automatic behavior and follow rules — is the first executive function to degrade under cognitive load. An agent deep in a complex synthesis task, with its context window filling past the smart zone, stops attending to procedural instructions like "validate frontmatter against the template schema." The agent does not decide to skip validation. It simply stops attending to the instruction, the same way a tired surgeon proceeds without the checklist.
25
+
26
+ Externalized enforcement through hooks solves this. A validation hook fires after every file write regardless of the agent's cognitive state — whether it is in the first 10% of context or the last 5%. The check happens outside the context window, immune to the attention degradation that compromises everything inside it. Since [[hooks enable context window efficiency by delegating deterministic checks to external processes]], externalized enforcement provides a double benefit: it both guarantees compliance and frees context tokens for substantive work.
27
+
28
+ This is the theoretical foundation for the entire enforcement architecture. Enforcement that lives in instructions degrades. Enforcement that lives in infrastructure persists.
29
+
30
+ ## The Enforcement Gradient
31
+
32
+ The plugin implements a graduated enforcement model across three zones:
33
+
34
+ | Zone | Strictness | When | Why |
35
+ |------|-----------|------|-----|
36
+ | **Capture** | Soft (warn, don't block) | Creating new notes | Since [[temporal separation of capture and processing preserves context freshness]], blocking capture for schema issues trades context preservation for format purity — a bad trade |
37
+ | **Processing** | Medium (require core fields) | Pipeline reduce/reflect/verify phases | Processing phases can ensure quality because they are not time-sensitive and run with dedicated context |
38
+ | **Query** | Hard (missing fields = invisible) | Graph queries, pattern detection | If a therapy note lacks `mood`, it does not appear in mood analysis — enforcement by consequence |
39
+
40
+ The gradient maps to Thaler and Sunstein's insight from nudge theory: intervention strength should match violation severity. Since [[nudge theory explains graduated hook enforcement as choice architecture for agents]], blocking a write (exit code 2) is a mandate appropriate for structural failures. Warning via context injection is a nudge appropriate for qualitative issues. The choice between them is architectural, not arbitrary.
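The gradient can be sketched as a single dispatch function. The field sets and verdict names here are illustrative, not the plugin's actual API:

```python
# Hypothetical field tiers for a therapy template.
HARD = {"description", "topics"}    # enforced in every zone
MEDIUM = {"mood", "date"}           # required once a note enters processing

def check(fields: set, zone: str) -> tuple:
    """Return (verdict, missing) under the graduated enforcement model.

    'block' corresponds to a hook exit code of 2; 'warn' to exit 0 plus
    an advisory message; 'ok' to silence. The query zone needs no verdict
    because its enforcement is by consequence: missing fields = invisible.
    """
    missing_hard = HARD - fields
    missing_medium = MEDIUM - fields
    if missing_hard:
        return ("block", missing_hard)            # structural failure: mandate
    if missing_medium and zone == "processing":
        return ("block", missing_medium)          # core fields gate the pipeline
    if missing_medium:
        return ("warn", missing_medium)           # nudge, not mandate, at capture
    return ("ok", set())
```

The same missing `mood` field yields a warning at capture but a block at the processing boundary, which is the whole point of the gradient.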
41
+
42
+ ### What to Enforce at Each Level
43
+
44
+ **Always enforce (hard, at all levels):**
45
+ - `description` field exists and is non-empty — since [[descriptions are retrieval filters not summaries]], every note needs one for discovery
46
+ - `topics` field exists with at least one MOC reference — orphan notes are lost notes
47
+ - Valid wiki links (targets exist) — since [[stale navigation actively misleads because agents trust curated maps completely]], dangling links do not just clutter; they actively misdirect
48
+
49
+ **Enforce at processing (medium):**
50
+ - Domain-specific required fields populated (e.g., `mood` for therapy entries, `ticker` for trade journals, `jurisdiction` for legal notes)
51
+ - Enum values within valid ranges (e.g., `status: preliminary | open | dissolved`, `conviction_level: low | moderate | high | extreme`)
52
+ - Description adds information beyond title — since [[good descriptions layer heuristic then mechanism then implication]], descriptions that merely restate the title waste the retrieval layer
53
+ - `relevant_notes` entries include context phrases explaining the relationship
54
+
55
+ **Warn at capture (soft):**
56
+ - Missing optional fields
57
+ - Description quality issues (too short, restates title)
58
+ - Missing relevant_notes context phrases
59
+ - Non-standard enum values that might be typos
60
+
61
+ ## Domain-Specific Enforcement
62
+
63
+ Each domain has different critical fields. The plugin derives enforcement rules from domain schemas, following the principle that fields the domain QUERIES against are critical:
64
+
65
+ | Domain | Critical Fields (Hard) | Important Fields (Medium) | Nice-to-Have (Soft) |
66
+ |--------|----------------------|--------------------------|---------------------|
67
+ | Research | description, methodology, topics | classification, source, confidence | adapted_from, evidence_type |
68
+ | Therapy | description, mood, date | triggers, coping_used, mood_context | physical_sensations, sleep_quality |
69
+ | PM | description, project, status | stakeholders, rationale | follow_ups, blocked_by |
70
+ | Creative | description, canon_status | characters_involved, timeline_position | draft_number, revision_notes |
71
+ | Learning | description, domain, prerequisites | mastery_level, knowledge_type | last_reviewed, practice_count |
72
+ | Trading | description, ticker, thesis | conviction_level, position_size | emotions_during, market_regime |
73
+ | Health | description, date, type | severity, triggers | correlation_notes, duration |
74
+ | Legal | description, jurisdiction, matter | precedent_status, holding | opposing_arguments, appeal_risk |
75
+
76
+ The principle: fields the domain QUERIES against are critical (hard enforcement). Fields that enrich context but are not machine-processed are important (medium). Fields that provide optional color are nice-to-have (soft). A research vault that never queries `adapted_from` does not need to enforce it — but a research vault that queries `methodology` to compare evidence across traditions must enforce it, because missing values corrupt the comparison.
77
+
78
+ Since [[schema fields should use domain-native vocabulary not abstract terminology]], enforcement must respect the language of each domain. A therapy system validates that `trigger` uses the person's own words, not clinical codes. A trading system validates that `thesis` contains an actual investment rationale, not a ticker repeat. Field names are the only domain-specific element in the universal note pattern, and they carry the entire burden of making the system feel native rather than imposed.
79
+
80
+ ## The Observe-Then-Formalize Lifecycle
81
+
82
+ Since [[schema evolution follows observe-then-formalize not design-then-enforce]], the plugin never designs a 15-field schema on day one. Every field starts its life as an observation, graduates to a convention, and is formalized as schema.
83
+
84
+ ### The Lifecycle Stages
85
+
86
+ **Stage 1: Minimal seed.** The plugin generates a starting schema with 2-3 required fields (always `description` and `topics`, plus one domain-critical field). Everything else is optional or absent. The schema is deliberately incomplete — it captures just enough to make the vault queryable, not enough to be burdensome.
87
+
88
+ **Stage 2: Observation.** As the user works, the agent notices patterns: fields being added manually, fields being stuffed with "N/A," free-text fields developing internal structure. These observations accumulate as evidence for schema change. Since [[hook-driven learning loops create self-improving methodology through observation accumulation]], the observation mechanism already exists — schema evolution piggybacks on the general learning loop.
89
+
90
+ **Stage 3: Convention.** When an observation recurs 5+ times, it becomes a convention — documented in the context file as guidance but not yet enforced. "We tend to track triggers on therapy reflections" is a convention. It guides behavior without blocking it.
91
+
92
+ **Stage 4: Formalization.** When a convention has been followed consistently and the field is being queried, formalize it: add to the template, set required/optional status, define valid values if applicable, add validation. The field has earned its place through demonstrated use.
93
+
94
+ This lifecycle mirrors [[methodology development should follow the trajectory from documentation to skill to hook as understanding hardens]] — schema hardens as evidence accumulates, not as speculation compounds.
95
+
96
+ ## The Five Evolution Signals
97
+
98
+ Since [[evolution observations provide actionable signals for system adaptation]], the quarterly review protocol uses five diagnostic signals to identify schema that needs to change. These signals are the concrete mechanism for the observe-then-formalize lifecycle:
99
+
100
+ ### 1. Manual Additions
101
+
102
+ The agent or user consistently adds a field manually that is not in the template. When the same ad-hoc field appears across five or more notes, it has demonstrated its value through use. **Action:** Formalize it — add to the template, decide required vs optional, define valid values. **Example:** A therapy system's agent starts adding `trigger` to reflections even though it was not in the initial template. After five reflections with manually added triggers, the field earns formalization.
103
+
104
+ ### 2. Placeholder Stuffing
105
+
106
+ A required field is consistently filled with "N/A," "none," or empty placeholder text. This means the field does not fit every note in the type. **Action:** Demote from required to optional. **Example:** A research system requires `adapted_from` on every claim, but original claims are stuffed with "null." The field should be optional with guidance to fill it only when genuinely adapting a human pattern.
107
+
108
+ ### 3. Dead Enums
109
+
110
+ An enum value defined in the template is never selected across 50+ notes. **Action:** Remove it — it is noise in the selection space. **Example:** A therapy system defines "ecstatic" as a mood enum but no reflection ever uses it. Drop it. Dead enums are a sign of upfront speculation that did not match reality.
111
+
112
+ ### 4. Patterned Free Text
113
+
114
+ A free-text field develops consistent internal structure across multiple notes. **Action:** If the structure is consistent enough, decompose into separate fields. If it is a convention, document it in the template guidance but keep the field free-text. **Example:** A free-text "context" field in a therapy system consistently contains "trigger: X, reaction: Y, insight: Z." That is structure waiting to be formalized into three separate fields. But a description field that consistently starts with a verb phrase is a stylistic convention — document it, do not decompose it.
115
+
116
+ ### 5. Oversized MOCs
117
+
118
+ A MOC consistently exceeds the healthy threshold (40+ notes for agent-operated systems, 30+ for human-operated). **Action:** Create sub-MOCs along the natural fault lines. **Signal:** If the split follows a pattern across multiple MOCs (e.g., every topic MOC splits into "theory" and "practice"), that pattern itself is a schema dimension worth formalizing. This signal is architectural rather than field-level, but it belongs in the quarterly review because it indicates the current taxonomy has outgrown its design.
119
+
120
+ The five signals compose into a quarterly review protocol: scan for manual additions, check for placeholder stuffing, audit enum usage, inspect free-text structure, review MOC sizes. Each check produces either "no action" or a specific schema proposal with evidence.
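Two of the five checks are mechanical enough to sketch directly (thresholds and field values here are illustrative, not the plugin's actual review code):

```python
PLACEHOLDERS = {"n/a", "none", "null", ""}

def placeholder_rate(values: list) -> float:
    """Fraction of a required field's values that are placeholder stuffing."""
    stuffed = sum(1 for v in values if v.strip().lower() in PLACEHOLDERS)
    return stuffed / len(values) if values else 0.0

def dead_enums(defined: set, used: list, min_notes: int = 50) -> set:
    """Enum values never selected, once enough notes exist to judge."""
    if len(used) < min_notes:
        return set()    # not enough evidence yet -- keep observing, don't act
    return defined - set(used)

# Hypothetical audit: a mood enum across 60 reflections, and a required
# `adapted_from` field that keeps getting stuffed with placeholders.
moods = ["anxious", "calm"] * 30
print(dead_enums({"anxious", "calm", "ecstatic"}, moods))
print(placeholder_rate(["N/A", "null", "prior work", "none"]))
```

Each function returns evidence, not a decision: a dead enum or a high placeholder rate becomes a schema proposal for the quarterly review, consistent with observe-then-formalize.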
121
+
122
+ ## Progressive Validation
123
+
124
+ Since [[progressive schema validates only what active modules require not the full system schema]], the plugin never enforces fields from inactive modules. A user with just basic wiki-links and yaml-schema enabled sees no warnings about `methodology` or `relevant_notes` until they activate the modules that use those fields.
125
+
126
+ This is critical for onboarding. New users encounter minimal schema requirements. As they activate more features, enforcement grows with them. The progression:
127
+
128
+ 1. **Day 1:** Only `description` and `topics` required. The vault is immediately queryable for basic navigation.
129
+ 2. **Week 2:** User activates processing pipeline. Domain-specific fields become required at the processing boundary (not at capture).
130
+ 3. **Month 2:** User activates pattern detection. The fields that pattern detection queries become critical — now the system depends on them.
131
+ 4. **Ongoing:** The quarterly review protocol adds fields that have earned formalization and removes fields that have proven unnecessary.
132
+
133
+ Since [[premature complexity is the most common derivation failure mode]], progressive validation prevents the single most common cause of system abandonment: overwhelming new users with a schema they do not yet understand the purpose of.
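Module-aware requirements reduce to a union over only the active modules; the module names and field sets below are hypothetical:

```python
# Each module contributes only the fields it actually queries; inactive
# modules contribute nothing, so a new user sees just the minimal seed.
MODULE_FIELDS = {
    "core":                {"description", "topics"},
    "processing-pipeline": {"mood", "date"},
    "pattern-detection":   {"triggers", "coping_used"},
}

def required_fields(active: list) -> set:
    """Union of required fields across the active modules only."""
    return set().union(*(MODULE_FIELDS[m] for m in active))

print(sorted(required_fields(["core"])))                          # day 1
print(sorted(required_fields(["core", "processing-pipeline"])))   # week 2
```

Activating a module is what promotes its fields to required status; deactivating one silently relaxes them, so enforcement always tracks what the system actually queries.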
134
+
135
+ ## Hook-Based Enforcement for Agents
136
+
137
+ For platforms that support hooks (tier 1), the plugin generates enforcement as automated checks:
138
+
139
+ **PostToolUse on Write:** Fires after every file creation or modification. Checks required fields for the note's type. Returns exit code 2 (blocking) for hard enforcement violations, exit code 0 with warning in additionalContext for soft violations. The agent sees warnings and can choose to address them; it cannot bypass blocks.
140
+
141
+ **Session-start health check:** Fires at session orientation. Counts notes per type missing critical fields. Reports as "schema compliance: 94% (3 therapy reflections missing mood field)." The number makes the gap visible without blocking work.
142
+
143
+ **Pipeline boundary gate:** Since [[generation effect gate blocks processing without transformation]], the pipeline boundary enforces a similar principle for schema: content cannot advance from one pipeline phase to the next if critical fields are missing. This is where processing-level enforcement lives — the gate is structural, not attentional.
144
+
145
+ For platforms that do not support hooks (tier 2-3), the plugin generates equivalent guidance in the context file: explicit checklists for note creation, processing phase instructions that include field validation, and periodic manual review prompts. Since [[platform capability tiers determine which knowledge system features can be implemented]], the enforcement mechanism adapts to platform capability while the enforcement principle remains constant.
146
+
147
+ ## Cross-Domain Enforcement
148
+
149
+ When users have multiple domains (e.g., research + personal life + therapy), each domain maintains its own schema scope. Since [[multi-domain systems compose through separate templates and shared graph]], hard enforcement that blocks on a therapy field violation must never stall research processing.
150
+
151
+ **Shared fields** (`description`, `topics`) have universal enforcement — every note in every domain must have them. **Domain-specific fields** (`mood`, `ticker`, `precedent`) have domain-scoped enforcement — they only apply to notes of the relevant type. The graph is shared; the schemas are separate.
152
+
153
+ The practical implication: a validation hook must know which template applies to which note. The plugin generates this routing from the template registry — each template declares which folder or note type it governs, and the hook dispatches accordingly.
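The dispatch can be a simple longest-prefix match over the template registry; the paths and registry shape here are illustrative:

```python
# Hypothetical registry: each template declares the folder it governs.
REGISTRY = {
    "therapy/":  {"description", "topics", "mood", "date"},
    "research/": {"description", "topics", "methodology"},
    "":          {"description", "topics"},    # universal fallback
}

def schema_for(path: str) -> set:
    """Route a note path to its template's required fields.

    Longest matching prefix wins, so domain scopes never leak into each
    other -- a therapy rule cannot stall a research note.
    """
    prefix = max((p for p in REGISTRY if path.startswith(p)), key=len)
    return REGISTRY[prefix]

print(sorted(schema_for("therapy/2024-03-01.md")))
print(sorted(schema_for("inbox/capture.md")))
```

The empty-string fallback guarantees that unrouted notes still get the universal `description` + `topics` check rather than escaping enforcement entirely.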
154
+
155
+ ## Implementation Pattern
156
+
157
+ ### For /setup (Onboarding)
158
+
159
+ When generating a new vault, the plugin:
160
+ 1. Identifies domain-specific note types from the domain composition
161
+ 2. Generates template files with `_schema` blocks defining required/optional/enum fields
162
+ 3. Creates validation hooks appropriate to the user's platform tier
163
+ 4. Explains the enforcement gradient: "Your notes will be checked for [critical fields]. Missing fields get flagged during processing, not blocked at capture."
164
+
165
+ ### For /architect (Extension)
166
+
167
+ When a user adds a new note type:
168
+ 1. The plugin proposes a schema based on similar domains (since [[novel domains derive by mapping knowledge type to closest reference domain then adapting]])
169
+ 2. Critical fields are identified by asking: "What will you query this note type for?"
170
+ 3. The schema starts minimal and grows — since [[schema evolution follows observe-then-formalize not design-then-enforce]]
171
+
172
+ ### For /recommend (Maintenance)
173
+
174
+ When checking vault health:
175
+ 1. Count notes per type missing critical fields — report as "schema compliance"
176
+ 2. Detect fields being manually added that are not in the schema — suggest schema evolution
177
+ 3. Identify queries that return incomplete results due to missing metadata — flag as "silent data loss"
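The first two checks can be sketched together against a hypothetical schema and a handful of observed notes (field sets are illustrative):

```python
SCHEMA = {"description", "topics", "mood"}    # hypothetical therapy schema

# Fields observed per note: `trigger` keeps being added manually,
# and one note is missing the required `mood`.
notes = [
    {"description", "topics", "mood", "trigger"},
    {"description", "topics", "trigger"},
    {"description", "topics", "mood", "trigger"},
]

# Check 1: compliance -- how many notes carry every required field?
compliant = sum(1 for n in notes if SCHEMA <= n)
print(f"schema compliance: {compliant / len(notes):.0%}")

# Check 2: manual additions -- fields used everywhere but absent from
# the schema are candidates for formalization.
extras = set.intersection(*notes) - SCHEMA
print(f"candidate for formalization: {extras}")
```

Both outputs are reports, not blocks: /recommend surfaces the gap and the evolution candidate, and the user decides whether the field has earned formalization.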
178
+
179
+ ## Anti-Patterns
180
+
181
+ | Anti-Pattern | Why It Fails | Better Approach |
182
+ |-------------|-------------|-----------------|
183
+ | Block all writes on any schema violation | Capture friction causes system abandonment | Soft enforcement at capture, hard at query |
184
+ | No enforcement at all | Metadata drift corrupts query results within weeks | At minimum, enforce description + topics universally |
185
+ | Enforce full schema from day one | Overwhelms users before they understand what fields are for | Progressive schema: minimal start, grow with observed use |
186
+ | Same enforcement across domains | Cross-domain blocking (therapy validation stalls research) | Domain-scoped validation via template routing |
187
+ | Enforce fields nobody queries | Wasted effort and unnecessary friction | Tie enforcement to actual query patterns |
188
+ | Design schema upfront without observation | Speculation produces dead fields and missing useful ones | Observe-then-formalize: fields earn their place through use |
189
+
190
+ ## Domain Examples
191
+
192
+ These domain compositions demonstrate schema enforcement in practice:
193
+
194
+ - [[academic research uses structured extraction with cross-source synthesis]] — Hard enforcement on `methodology` and `classification` enables evidence-quality queries and cross-methodology synthesis. The dense schema serves a purpose: every field powers a specific academic query.
195
+ - [[therapy journal uses warm personality with pattern detection for emotional processing]] — `mood` and `trigger` fields are critical for pattern detection, but soft enforcement at journaling time prevents friction during vulnerable moments. The ethical constraint governs: never let schema demands interfere with emotional capture.
196
+ - [[trading uses conviction tracking with thesis-outcome correlation]] — `ticker`, `thesis`, and `conviction_level` fields power strategy compliance checking. Enforcement by consequence when missing fields exclude trades from performance analysis.
197
+ - [[legal case management uses precedent chains with regulatory change propagation]] — `jurisdiction` and `precedent_status` require hard enforcement because regulatory compliance depends on complete metadata. This is the domain where enforcement is most naturally strict.
198
+ - [[health wellness uses symptom-trigger correlation with multi-dimensional tracking]] — Accumulation-based domain where schema enforcement gates pattern detection. Missing `date` or `type` fields make correlation analysis impossible, but capture must stay as frictionless as a health log requires.
199
+
200
+ ## Grounding
201
+
202
+ This guidance is grounded in:
203
+ - [[schema enforcement via validation agents enables soft consistency]] — the foundational research on soft vs hard enforcement
204
+ - [[schema validation hooks externalize inhibitory control that degrades under cognitive load]] — why enforcement must be externalized, not instructed
205
+ - [[schema fields should use domain-native vocabulary not abstract terminology]] — why enforcement must respect domain language
206
+ - [[progressive schema validates only what active modules require not the full system schema]] — module-aware enforcement
207
+ - [[schema evolution follows observe-then-formalize not design-then-enforce]] — enforcement generates the observation data for evolution
208
+ - [[nudge theory explains graduated hook enforcement as choice architecture for agents]] — theoretical basis for graduated enforcement
209
+ - [[schema templates reduce cognitive overhead at capture time]] — templates as the capture-side of enforcement
210
+ - [[evolution observations provide actionable signals for system adaptation]] — the five signals that drive schema evolution
211
+ - [[generation effect gate blocks processing without transformation]] — pipeline boundary enforcement as a quality gate
212
+
213
+ ---
214
+
215
220
+ Topics:
221
+ - [[schema-enforcement]]
@@ -0,0 +1,43 @@
1
+ ---
2
+ description: The cognitive effort of splitting complex arguments into single-concept notes can exceed productive friction, becoming system management overhead that crowds out actual thinking
3
+ kind: research
4
+ topics: ["[[note-design]]"]
5
+ confidence: speculative
6
+ methodology: ["Zettelkasten", "Evergreen"]
7
+ source: [[tft-research-part2]]
8
+ ---
9
+
10
+ # enforcing atomicity can create paralysis when ideas resist decomposition
11
+
12
+ Critics of strict atomicity in note-taking systems report that the requirement to decompose every insight into single-concept notes can be paralyzing. Breaking complex arguments into atomic units requires significant cognitive effort — identifying the seams, deciding where one concept ends and another begins, writing connective tissue to link the fragments. Some practitioners describe this as "note gymnastics: spending more time managing the system than actually thinking." Since [[insight accretion differs from productivity in knowledge systems]], this describes effort spent on productivity (structural splitting, fragment management) rather than accretion (deepening understanding) — the atomicity requirement may channel work toward system management at the expense of actual thinking.
13
+
14
+ This creates a diagnostic problem for ars contexta. Andy Matuschak's response to atomicity paralysis is instructive: "If it is hard to atomize the idea, it means the idea is not yet clearly understood." On this view, atomization difficulty is a signal, not an obstacle. The friction reveals that the insight hasn't resolved into clean components yet — which means forcing the decomposition is desirable difficulty that prompts deeper understanding. Since [[the generation effect requires active transformation not just storage]], the struggle to atomize might be exactly the generative work that creates cognitive hooks.
15
+
16
+ But there's an alternative interpretation: some ideas are genuinely relational and resist decomposition not because they're fuzzy but because they're structural. A claim about how three concepts interact may be one insight about their interaction, not three insights awkwardly stitched together. Forcing atomization of inherently relational thinking might destroy the insight rather than clarify it — like insisting that a molecule is really just atoms, when the molecular structure is precisely what matters. This parallels the concern that [[sense-making vs storage does compression lose essential nuance]] raises for descriptions: some features that make an idea valuable may be exactly the features that resist compression or decomposition.
17
+
18
+ Since [[configuration dimensions interact so choices in one create pressure on others]], the paralysis is not just a granularity problem in isolation — choosing atomic notes forces explicit linking, deep navigation, and heavy processing simultaneously, and the total burden can exceed the system's processing capacity. The vault's position — composability over atomicity — suggests a middle path, and since [[methodology traditions are named points in a shared configuration space not competing paradigms]], the mapping confirms that granularity is a spectrum with successfully operating traditions at each pole: Zettelkasten and Evergreen at atomic, Cornell at medium, PARA at coarse. The test is composability, not atomicity: can this note be linked from elsewhere without dragging unwanted context? If yes, the note is fine regardless of how many concepts it touches. If no, split. But this reframe doesn't eliminate the diagnostic problem: when linking drags context, is that because the note bundles separate claims, or because the claims are genuinely entangled?
19
+
20
+ For agent-operated systems, this creates a design tension. Since [[summary coherence tests composability before filing]] uses summary failure as a bundling diagnostic, agents might flag notes as "incoherent" when they're actually expressing relational insights that resist propositional summary. The agent's atomicity pressure could fragment ideas that humans understood as wholes. Since [[vault conventions may impose hidden rigidity on thinking]], atomicity enforcement might be one of those conventions that channels thinking into patterns favoring certain styles — sequential, decomposable, propositional — over others.
21
+
22
+ The research suggests friction is sometimes "desirable difficulty" and sometimes paralysis. Distinguishing these cases is non-trivial. One heuristic: if the struggle to split reveals new understanding (you realize what you thought was one idea is actually two), that's desirable difficulty. If the struggle feels like paperwork (you know exactly what you mean but can't make it fit the format), that's paralysis. But this heuristic relies on felt sense, which agents don't have. This is an operationalization problem shared with [[claims must be specific enough to be wrong]], which also asks when forced reformulation helps (incomplete thinking resolving into clarity) vs. constrains (valid non-linear insight being distorted).
23
+
24
+ There is a third structural option beyond "split" or "leave as-is" that partially dissolves this question. Since [[federated wiki pattern enables multi-agent divergence as feature not bug]], when an idea resists decomposition because its relational character IS the insight, parallel interpretive frames can coexist as federated notes. Instead of forcing a molecular insight into atomic fragments, two agents can each write a version that preserves the relational structure from its own angle. The test shifts from "can this be split?" to "do these perspectives illuminate different facets?" This does not fully operationalize the productive/paralysis distinction, but it removes one class of paralysis: the case where the thinker knows the idea is coherent but cannot make it fit the atomic format. Federation acknowledges that the coherence is real and offers coexistence instead of forced resolution.
25
+
26
+ The open question: can we operationalize the distinction between productive atomization friction and destructive atomization paralysis? Or does the vault's composability-over-atomicity framing dissolve the question by making atomicity a means rather than an end?
27
+ ---
28
+
29
+ Relevant Notes:
30
+ - [[vault conventions may impose hidden rigidity on thinking]] — this note names one specific form of that rigidity: atomicity enforcement as a convention that may constrain thinking
31
+ - [[the generation effect requires active transformation not just storage]] — the counterargument: if atomization difficulty signals incomplete understanding, the friction is desirable difficulty that prompts genuine transformation
32
+ - [[summary coherence tests composability before filing]] — operationalizes atomicity as a diagnostic: summary failure reveals bundling, but raises the question of whether all summary failures indicate bundling or whether some insights resist propositional compression
33
+ - [[claims must be specific enough to be wrong]] — sibling convention constraint: specificity requirement and atomicity requirement both demand that thinking resolve into particular forms, and both may channel thinking into patterns favoring certain styles
+ - [[sense-making vs storage does compression lose essential nuance]] — parallel tension at the description layer: that note asks whether ~150-character descriptions lose essential nuance, this note asks whether decomposition into atomic units loses essential structure
+ - [[insight accretion differs from productivity in knowledge systems]] — the note gymnastics critique names exactly this pattern: effort spent managing atomicity is productivity (filing, splitting, connecting fragments) without accretion (deepening understanding)
+ - [[writing for audience blocks authentic creation]] — sibling pattern of cognitive split: atomicity paralysis divides effort between decomposition and synthesis, audience awareness divides effort between presentation and synthesis; both identify how format requirements compete with genuine thinking
+ - [[federated wiki pattern enables multi-agent divergence as feature not bug]] — offers a structural alternative to forced decomposition: when ideas resist splitting because the relational structure IS the insight, parallel interpretive frames can coexist as federated notes without either losing the molecular character
+ - [[configuration dimensions interact so choices in one create pressure on others]] — generalizes this paralysis as a predicted outcome of the granularity cascade: fine granularity creates pressure toward heavy processing, and when processing burden exceeds capacity the system collapses into exactly this paralysis
+ - [[methodology traditions are named points in a shared configuration space not competing paradigms]] — maps the granularity spectrum to real traditions: Zettelkasten at atomic, Cornell at medium, PARA at coarse — showing that multiple successful operating points exist, making atomicity a choice not a requirement
+ - [[configuration paralysis emerges when derivation surfaces too many decisions]] — sibling paralysis pattern at the system level: atomicity paralysis comes from methodology requirements during note creation, configuration paralysis comes from methodology requirements during system derivation; both are cases where surfacing too many decisions overwhelms the decision-maker, but configuration paralysis has a cleaner solution because dimension coupling makes inference tractable
+
+ Topics:
+ - [[note-design]]