npm - @vextlabs/theron-cli - Versions diffs - 0.2.1 → 0.4.0 - Mend

@vextlabs/theron-cli 0.2.1 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (191)

package/dist/api.d.ts +8 -0
package/dist/api.js +3 -0
package/dist/api.js.map +1 -1
package/dist/auth.js +51 -1
package/dist/auth.js.map +1 -1
package/dist/banner.js +3 -2
package/dist/banner.js.map +1 -1
package/dist/checkpoints.d.ts +32 -0
package/dist/checkpoints.js +61 -0
package/dist/checkpoints.js.map +1 -0
package/dist/index.js +61 -5
package/dist/index.js.map +1 -1
package/dist/input.d.ts +61 -0
package/dist/input.js +574 -0
package/dist/input.js.map +1 -0
package/dist/profiles/index.js +5 -0
package/dist/profiles/index.js.map +1 -1
package/dist/profiles/methodologies/build_domains.d.ts +6 -0
package/dist/profiles/methodologies/build_domains.js +170 -0
package/dist/profiles/methodologies/build_domains.js.map +1 -0
package/dist/profiles/methodologies/operate_domains.d.ts +8 -0
package/dist/profiles/methodologies/operate_domains.js +1239 -0
package/dist/profiles/methodologies/operate_domains.js.map +1 -0
package/dist/profiles/methodologies/regulated_domains.d.ts +6 -0
package/dist/profiles/methodologies/regulated_domains.js +153 -0
package/dist/profiles/methodologies/regulated_domains.js.map +1 -0
package/dist/profiles/methodologies/research_domains.d.ts +8 -0
package/dist/profiles/methodologies/research_domains.js +179 -0
package/dist/profiles/methodologies/research_domains.js.map +1 -0
package/dist/profiles/methodologies/strategy_domains.d.ts +15 -0
package/dist/profiles/methodologies/strategy_domains.js +193 -0
package/dist/profiles/methodologies/strategy_domains.js.map +1 -0
package/dist/profiles/seeds.js +241 -95
package/dist/profiles/seeds.js.map +1 -1
package/dist/receipt.d.ts +17 -0
package/dist/receipt.js +46 -0
package/dist/receipt.js.map +1 -0
package/dist/render.d.ts +4 -1
package/dist/render.js +95 -28
package/dist/render.js.map +1 -1
package/dist/repl.d.ts +8 -1
package/dist/repl.js +420 -62
package/dist/repl.js.map +1 -1
package/dist/sessions.d.ts +14 -0
package/dist/sessions.js +100 -0
package/dist/sessions.js.map +1 -1
package/dist/ship.d.ts +2 -0
package/dist/ship.js +62 -0
package/dist/ship.js.map +1 -0
package/dist/skills/catalog.d.ts +13 -0
package/dist/skills/catalog.js +86 -0
package/dist/skills/catalog.js.map +1 -0
package/dist/tools/bash.js +81 -14
package/dist/tools/bash.js.map +1 -1
package/dist/tools/edit.js +21 -1
package/dist/tools/edit.js.map +1 -1
package/dist/tools/glob.js +4 -1
package/dist/tools/glob.js.map +1 -1
package/dist/tools/grep.d.ts +5 -0
package/dist/tools/grep.js +101 -2
package/dist/tools/grep.js.map +1 -1
package/dist/tools/index.d.ts +22 -0
package/dist/tools/index.js +177 -41
package/dist/tools/index.js.map +1 -1
package/dist/tools/ls.d.ts +3 -0
package/dist/tools/ls.js +23 -12
package/dist/tools/ls.js.map +1 -1
package/dist/tools/multiedit.d.ts +12 -0
package/dist/tools/multiedit.js +79 -0
package/dist/tools/multiedit.js.map +1 -0
package/dist/tools/stoa.d.ts +1 -1
package/dist/tools/stoa.js +7 -3
package/dist/tools/stoa.js.map +1 -1
package/dist/tools/task.d.ts +9 -0
package/dist/tools/task.js +166 -0
package/dist/tools/task.js.map +1 -0
package/dist/tools/todowrite.d.ts +12 -0
package/dist/tools/todowrite.js +38 -0
package/dist/tools/todowrite.js.map +1 -0
package/dist/tools/webfetch.d.ts +6 -0
package/dist/tools/webfetch.js +98 -0
package/dist/tools/webfetch.js.map +1 -0
package/dist/tools/websearch.d.ts +7 -0
package/dist/tools/websearch.js +83 -0
package/dist/tools/websearch.js.map +1 -0
package/dist/tools/write.js +17 -1
package/dist/tools/write.js.map +1 -1
package/dist/verifiers/calc_gate.d.ts +2 -0
package/dist/verifiers/calc_gate.js +112 -0
package/dist/verifiers/calc_gate.js.map +1 -0
package/dist/verifiers/citation_gate.d.ts +2 -0
package/dist/verifiers/citation_gate.js +130 -0
package/dist/verifiers/citation_gate.js.map +1 -0
package/dist/verifiers/confidence_marked.d.ts +2 -0
package/dist/verifiers/confidence_marked.js +49 -0
package/dist/verifiers/confidence_marked.js.map +1 -0
package/dist/verifiers/disclaimer_gate.d.ts +2 -0
package/dist/verifiers/disclaimer_gate.js +57 -0
package/dist/verifiers/disclaimer_gate.js.map +1 -0
package/dist/verifiers/evidence_gate.d.ts +2 -0
package/dist/verifiers/evidence_gate.js +108 -0
package/dist/verifiers/evidence_gate.js.map +1 -0
package/dist/verifiers/index.d.ts +5 -0
package/dist/verifiers/index.js +28 -7
package/dist/verifiers/index.js.map +1 -1
package/dist/verifiers/lint.js +4 -3
package/dist/verifiers/lint.js.map +1 -1
package/dist/verifiers/promoted_kernels.d.ts +8 -0
package/dist/verifiers/promoted_kernels.js +190 -0
package/dist/verifiers/promoted_kernels.js.map +1 -0
package/dist/verifiers/source_gate.d.ts +2 -0
package/dist/verifiers/source_gate.js +125 -0
package/dist/verifiers/source_gate.js.map +1 -0
package/dist/verifiers/test_smoke.js +30 -0
package/dist/verifiers/test_smoke.js.map +1 -1
package/dist/verifiers/types.d.ts +3 -0
package/package.json +4 -2
package/skills/README.md +123 -0
package/skills/ab-test.md +89 -0
package/skills/api-design.md +175 -0
package/skills/architecture-design.md +185 -0
package/skills/business-case.md +77 -0
package/skills/causal-inference.md +77 -0
package/skills/clinical-guideline.md +98 -0
package/skills/code-review.md +98 -0
package/skills/cold-outreach.md +268 -0
package/skills/competitive-teardown.md +223 -0
package/skills/component-spec.md +121 -0
package/skills/content-calendar.md +280 -0
package/skills/contract-review.md +155 -0
package/skills/data-analysis.md +187 -0
package/skills/debug.md +91 -0
package/skills/design-audit.md +121 -0
package/skills/differential-diagnosis.md +79 -0
package/skills/discovery-call.md +206 -0
package/skills/edit-pass.md +80 -0
package/skills/engineering-calc.md +101 -0
package/skills/estimate.md +70 -0
package/skills/experiment-design.md +105 -0
package/skills/fact-check.md +82 -0
package/skills/financial-model.md +104 -0
package/skills/grant-proposal.md +93 -0
package/skills/harmony-analysis.md +93 -0
package/skills/hypothesis-generation.md +99 -0
package/skills/incident-response.md +134 -0
package/skills/interview-loop.md +62 -0
package/skills/job-scorecard.md +92 -0
package/skills/kb-article.md +174 -0
package/skills/launch-plan.md +85 -0
package/skills/lease-review.md +93 -0
package/skills/lesson-plan.md +198 -0
package/skills/literature-review.md +69 -0
package/skills/market-entry.md +137 -0
package/skills/market-sizing.md +159 -0
package/skills/meta-analysis.md +140 -0
package/skills/migrate.md +117 -0
package/skills/optimize.md +88 -0
package/skills/options-strategy.md +166 -0
package/skills/peer-review.md +96 -0
package/skills/pentest-plan.md +193 -0
package/skills/pitch-review.md +132 -0
package/skills/plan.md +88 -0
package/skills/policy-brief.md +124 -0
package/skills/positioning.md +192 -0
package/skills/postmortem.md +168 -0
package/skills/prd.md +105 -0
package/skills/prioritize.md +162 -0
package/skills/proof.md +91 -0
package/skills/property-underwrite.md +159 -0
package/skills/recipe-develop.md +109 -0
package/skills/red-team.md +142 -0
package/skills/refactor.md +58 -0
package/skills/reflection-session.md +115 -0
package/skills/regulatory-compliance.md +136 -0
package/skills/reproduce.md +87 -0
package/skills/runbook.md +344 -0
package/skills/security-audit.md +154 -0
package/skills/seo-brief.md +201 -0
package/skills/sql-query.md +161 -0
package/skills/story-craft.md +163 -0
package/skills/tdd.md +59 -0
package/skills/term-sheet.md +298 -0
package/skills/theory-of-change.md +88 -0
package/skills/threat-model.md +104 -0
package/skills/ticket-triage.md +200 -0
package/skills/tolerance-analysis.md +149 -0
package/skills/training-program.md +151 -0
package/skills/translate.md +64 -0
package/skills/unit-economics.md +238 -0
package/skills/valuation.md +112 -0
package/skills/write-tests.md +77 -0

package/skills/fact-check.md ADDED

@@ -0,0 +1,82 @@
+---
+name: fact-check
+description: Adversarially verify claims — decompose into atomic claims, find primary sources, try to refute each, and rate with cited evidence. Never assert without a source.
+allowed-tools: WebSearch, WebFetch, Read, Grep
+---
+## PHASE 1 — INGEST & DECOMPOSE
+1. Read the full text without judging. Identify every discrete assertion the author makes.
+2. Split compound sentences into one claim per line. "X did Y in Z year, causing W" = three separate claims.
+3. Classify each claim into exactly one of four types:
+   - **FACT** — empirically testable (dates, numbers, quotes, events, scientific findings)
+   - **PREDICTION** — about a future state (verify the model/assumptions, not the outcome)
+   - **OPINION** — subjective interpretation (note, do not rate true/false)
+   - **VALUE JUDGMENT** — normative assertion ("X is good/bad") (note, do not rate true/false)
+4. Number every claim. Only FACT-type claims proceed to Phase 2. Log the others as "opinion/judgment — not rated."
+## PHASE 2 — SOURCE HIERARCHY (always prefer higher tier)
+- **Tier 1 (primary):** Government databases, official registries, court records, SEC/FDA filings, peer-reviewed papers (PubMed, arXiv with DOI), census data, central-bank releases.
+- **Tier 2 (authoritative secondary):** Reports from the originating institution, official press releases with traceable data, recognized standards bodies (ISO, IEEE).
+- **Tier 3 (reliable secondary):** Long-form investigative journalism with named sources and document trails (Reuters, AP, WSJ, FT). Acceptable only when Tier 1/2 unavailable.
+- **Never cite:** opinion pieces, undated pages, aggregator sites without named sources, Wikipedia alone (follow its citations instead), social media posts.
+## PHASE 3 — ADVERSARIAL SEARCH PROTOCOL (per FACT claim)
+For each claim, run this sequence — stop only when you have a definitive answer or exhaust all steps:
+5. **Confirm search:** Search the exact assertion as stated. Find the best supporting source.
+6. **Refute search:** Explicitly search for contradictions. Use "NOT", "false", "debunked", "error", "correction" + key terms. Search for alternative data sources that measure the same thing differently.
+7. **Primary-source trace:** Follow citations upstream. If a news article says "a study found X", fetch the actual study abstract or full text. If a statistic is cited, find the originating dataset.
+8. **Steelman the opposite:** Construct the strongest plausible case that the claim is wrong. If that case has merit, document it — do not bury it.
+9. **Context check:** Verify the claim is not technically true but misleading. Check: cherry-picked time window? Unrepresentative sample? Comparison to a cherry-picked baseline? Units changed mid-comparison?
+## PHASE 4 — QUOTE & NUMBER VERIFICATION
+10. For every direct quote: fetch the original source document and verify the exact wording. Flag if paraphrased and presented as verbatim.
+11. For every number: verify the exact figure, the date it was measured, the methodology that produced it, and whether subsequent revisions exist.
+12. For percentages: verify the denominator. "X% increase" — from what baseline? Absolute vs. relative risk — which was used?
+13. For dates: verify against primary records. One-day errors in date claims are common and sometimes material.
+## PHASE 5 — RATE EACH FACT CLAIM
+Apply exactly one of these ratings:
+- **TRUE** — Confirmed by Tier 1 or Tier 2 source; no credible contradicting evidence found after adversarial search.
+- **MOSTLY TRUE** — Core assertion is accurate; minor details are imprecise or lack context that does not change the substance.
+- **MISLEADING** — Technically accurate but presented in a way that creates a false impression (cherry-picking, missing context, selective framing).
+- **FALSE** — Contradicted by primary sources or direct evidence.
+- **UNVERIFIABLE** — No primary or authoritative source exists or is accessible; claim is not disprovable but cannot be confirmed. Do not guess or infer a rating.
+Hard rule: **No rating without a citation.** If you cannot cite a source for a verdict, the rating is UNVERIFIABLE.
+## PHASE 6 — VERDICT TABLE
+Produce this table for every claim:
+| # | Claim (verbatim excerpt) | Type | Rating | Evidence summary | Source URL | Source date | Adversarial note |
+|---|--------------------------|------|--------|-----------------|------------|-------------|-----------------|
+| 1 | "..." | FACT | TRUE | ... | https://... | YYYY-MM-DD | No credible refutation found |
+| 2 | "..." | FACT | MISLEADING | ... | https://... | YYYY-MM-DD | Baseline was cherry-picked; full-series shows opposite trend |
+| 3 | "..." | OPINION | — | Not rated (subjective) | — | — | — |
+## PHASE 7 — OVERALL ASSESSMENT
+After the table:
+- **Accuracy rate:** X of Y verifiable claims rated TRUE or MOSTLY TRUE.
+- **Highest-severity issues:** List FALSE and MISLEADING claims with one-line explanations.
+- **Pattern flags:** Systematic cherry-picking? Consistent conflation of correlation/causation? Reliance on a single non-primary source throughout?
+- **What could not be checked:** Explicitly name claims rated UNVERIFIABLE and why.
+- **Overall verdict:** One of — Well-supported / Partially supported / Requires significant caveats / Substantially inaccurate. State the single most material flaw if rating is not "Well-supported."
+## HARD RULES (never violate)
+- Never assert a fact in your own voice without citing a source URL and date.
+- Never upgrade UNVERIFIABLE to a verdict because it "seems likely" or "is commonly believed."
+- Never treat absence of refutation as confirmation. The standard is positive evidence.
+- Never present a Tier 3 source as definitive when Tier 1/2 data exists and conflicts.
+- Flag quotes as UNVERIFIED until you have personally fetched the source document and confirmed the exact wording.
+- If a claim is partially true and partially false, split it into two sub-claims and rate each separately rather than averaging.
+- Recency matters: a statistic that was true in 2020 may be false today — always record the source date and flag if stale relative to the claim's implied timeframe.
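
Note on fact-check.md: the Phase 5 rule ("No rating without a citation") and the Phase 7 accuracy rate can be illustrated with a minimal Python sketch. It is not part of the package diff. The names Verdict, enforce_citation_rule, and accuracy_rate are hypothetical, and treating "verifiable" as "any rating other than UNVERIFIABLE" is an assumption about how the skill counts Y.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

# Rating labels copied from Phase 5 of fact-check.md.
RATINGS = ("TRUE", "MOSTLY TRUE", "MISLEADING", "FALSE", "UNVERIFIABLE")


@dataclass(frozen=True)
class Verdict:  # hypothetical record for one FACT-type claim
    claim: str
    rating: str
    source_url: Optional[str] = None
    source_date: Optional[str] = None  # YYYY-MM-DD


def enforce_citation_rule(v: Verdict) -> Verdict:
    """Downgrade any rated verdict that lacks a source URL to UNVERIFIABLE."""
    if v.rating not in RATINGS:
        raise ValueError(f"unknown rating: {v.rating!r}")
    if v.rating != "UNVERIFIABLE" and not v.source_url:
        return replace(v, rating="UNVERIFIABLE")
    return v


def accuracy_rate(verdicts: List[Verdict]) -> Tuple[int, int]:
    """Return (X, Y): claims rated TRUE or MOSTLY TRUE out of verifiable claims.

    Assumption: "verifiable" means the rating is not UNVERIFIABLE after the
    citation rule has been applied.
    """
    checked = [enforce_citation_rule(v) for v in verdicts]
    verifiable = [v for v in checked if v.rating != "UNVERIFIABLE"]
    supported = [v for v in verifiable if v.rating in ("TRUE", "MOSTLY TRUE")]
    return len(supported), len(verifiable)
```

Applying the rule before counting means an uncited TRUE never inflates the accuracy rate, which matches the skill's stance that absence of refutation is not confirmation.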

package/skills/financial-model.md ADDED

@@ -0,0 +1,104 @@
+---
+name: financial-model
+description: Build a driver-based 3-statement (P&L / balance sheet / cash flow) projection model with explicit assumptions, linked drivers, upside/base/downside scenarios, DCF/terminal-value valuation, and sanity-checks against stage-appropriate industry benchmarks — invoke whenever the task is financial forecasting, unit-economics analysis, fundraising model, DCF valuation, or scenario planning.
+allowed-tools: Read, Bash, Write
+---
+═══ HARD RULES ═══
+1. NEVER fabricate a benchmark, comp multiple, or historical figure — mark every externally-sourced number with [SOURCE NEEDED] until verified.
+2. NEVER round silently: if you round a number for display, keep full precision in the formula cell and document the rounding decision.
+3. NEVER let a driver be an assumption in one tab and a hardcode in another — one source of truth, always.
+4. NEVER present a single-point forecast as "the" forecast — every output requires a base, upside, and downside column.
+5. NEVER chain percentage-of-revenue ratios without anchoring to at least one absolute-dollar sanity check per period.
+6. NEVER skip the cash-flow-from-operations bridge — P&L profit ≠ cash; working-capital and capex movements must be explicit.
+7. NEVER accept a model where the balance sheet does not balance to the dollar every period.
+8. ALL assumptions must be visible, labeled, and grouped in a single "Assumptions" section before any formula references them.
+9. NEVER silently update assumptions between model versions — every change must be logged with a date, author, prior value, new value, and rationale (see Phase A8).
+10. Use Bash only to run a verification script (e.g., Python/numpy IRR or NPV calculation, CSV integrity check, or balance-sheet balance assertion) — never to hardcode model outputs.
+═══ PHASE A: SCOPE, ASSUMPTIONS REGISTRY, AND VERSION CONTROL ═══
+A1. Identify the model's purpose (fundraising deck, board operating plan, M&A diligence, DCF valuation) — this determines which outputs are primary and what level of granularity is required.
+A2. Determine the time horizon (monthly for 0–18 months, quarterly for 18–36 months, annual thereafter) and lock it before building any formula.
+A3. Build the Assumptions Registry: a flat, labeled list with four columns — [Driver Name | Value | Rationale | Confidence (H/M/L)]. Every number in the model must trace back to a row here.
+A4. Separate assumptions into tiers:
+    - Tier 1 (Structural): pricing, unit economics, headcount-to-revenue ratios — change rarely, high sensitivity.
+    - Tier 2 (Operational): growth rates, churn, payback period — revisit quarterly.
+    - Tier 3 (Market): TAM, conversion funnel, benchmark multiples — flag as [EXTERNAL] and cite source or mark [SOURCE NEEDED].
+A5. State explicitly which line items are driver-based vs. percentage-of-revenue vs. absolute — never mix methods on the same line without a comment.
+A6. Document the base-case definition: what macro + company conditions must hold for the base case to be valid.
+A7. State the model's primary audience and how outputs will be consumed — board members reading EBITDA bridge, investors stress-testing runway, acquirers building a DCF — because this governs which sanity checks are mandatory vs. informational.
+A8. Maintain a version log as the first tab/section of the model: [Version | Date | Author | Changed Drivers | Prior Values | New Values | Rationale]. Increment the version number on every assumption change, not just structural rebuilds. A model without a version log cannot be audited or defended in diligence.
+═══ PHASE B: REVENUE ARCHITECTURE ═══
+B1. Decompose revenue to its irreducible drivers — never use a single "revenue growth rate" assumption:
+    - SaaS example: Seats × ASP × (1 − Monthly Churn)^t + Net New ARR from new logos
+    - Transactional example: Volume × Average Order Value × Take Rate
+    - Services example: Billable Headcount × Utilization Rate × Blended Bill Rate
+B2. Build the funnel explicitly: Impressions → Leads → Qualified Leads → Trials → Conversions → Paid → Retained. Each stage has a conversion rate assumption in the registry.
+B3. Model cohorts when churn is material: new cohort each period, apply per-cohort retention curve, sum to total active base. Do NOT apply a single blended churn rate to a growing base — it understates early-period churn and overstates cohort health.
+B4. Separate recurring revenue, expansion revenue (NRR > 100% mechanics: upsell, seat growth), and one-time revenue. Each has different margin profiles and valuation multiples.
+B5. Sanity check: Revenue per employee must land within 2× of the industry median for the company's stage and sector. If it doesn't, re-examine either headcount or revenue assumptions.
+B6. State ARR / MRR bridge explicitly each period: Opening ARR + New ARR + Expansion ARR − Churn ARR − Contraction ARR = Closing ARR. Closing MRR × 12 must equal Closing ARR, and Closing ARR ÷ 12 must reconcile to monthly recognized recurring revenue ± timing adjustments.
+B7. For annual-upfront billing models, model billings separately from recognized revenue: Billings − Recognized Revenue = Change in Deferred Revenue. Deferred Revenue is a balance-sheet liability (Phase D2) — never treat billings as revenue.
+═══ PHASE C: COST ARCHITECTURE ═══
+C1. Cost of Goods Sold (COGS): itemize every direct cost — hosting/infra per unit, third-party API costs, support headcount loaded cost, payment processing fees. Never use a single gross-margin percentage without this decomposition.
+C2. Gross margin floor and ceiling: state the theoretical floor (variable COGS at infinite scale) and the near-term ceiling (current infrastructure constraints). The model must not project gross margins above the structural ceiling without a documented leverage event (e.g., committed cloud spend discount tier, AI-driven support deflection at a specific headcount ratio).
+C3. Operating expenses — build from headcount first:
+    - For each function (Engineering, Sales, Marketing, G&A), model: Headcount × Loaded Cost per Head (salary + benefits + payroll tax + equity amortization at grant-date fair value).
+    - Loaded cost multiplier: typically 1.20–1.30× cash compensation in the US — state your multiplier explicitly.
+    - Headcount plan must tie to a hiring plan: current headcount + planned hires − planned attrition, period by period.
+C4. Non-headcount OpEx: software, marketing spend, office/facilities, legal/audit — model each as either a fixed absolute or as a ratio with a floor. Mark any ratio being extrapolated from a single historical data point as [THIN BASIS].
+C5. Capitalize vs. expense: internal software development costs, cloud infrastructure for internal tools, and equipment above the capitalization threshold must be capitalized and depreciated — not expensed in the period incurred. State the capitalization policy and the useful-life assumption per asset class.
+C6. Sanity check: S&M as % of new ARR added (CAC ratio) must be benchmarked. For SaaS at Series A–B, CAC payback 12–18 months is typical; below 6 months is suspicious and requires explanation; above 24 months is a red flag requiring a specific recovery mechanism in the model.
+═══ PHASE D: 3-STATEMENT BUILD AND LINKAGE ═══
+D1. Income Statement first: Revenue − COGS = Gross Profit → − Operating Expenses = EBIT → − Interest Expense + Interest Income = EBT → − Tax = Net Income. Every line flows from Phase B/C drivers. Tax: for pre-profitability companies, model NOL carryforwards explicitly and apply them in the period when taxable income first turns positive.
+D2. Balance Sheet: build from the accounting equation at every period end. Key items to model explicitly:
+    - Cash (the plug from the cash flow statement — never hardcode).
+    - Accounts Receivable = Revenue × (DSO / days in period). State DSO assumption and benchmark it against sector peers.
+    - Deferred Revenue = Opening Deferred Revenue + Billings − Recognized Revenue. Critical for annual-upfront SaaS (Phase B7).
+    - PP&E: Opening + Capex − Depreciation = Closing. Build a depreciation schedule by asset class using useful-life assumptions from Phase C5.
+    - Accounts Payable = (COGS + cash portion of OpEx) × (DPO / days in period). State DPO assumption.
+    - Equity: Opening Equity + Net Income − Dividends + New Equity Raised = Closing Equity. Model each equity raise as a discrete event with a date, amount, and implied dilution.
+    - VERIFY: Total Assets = Total Liabilities + Total Equity every period, or halt and debug before proceeding.
+D3. Cash Flow Statement — indirect method:
+    - Start with Net Income.
+    - Add back non-cash items: Depreciation & Amortization, stock-based compensation.
+    - Working capital movements: ΔAR (negative if AR grows), ΔInventory, ΔDeferred Revenue (positive if deferred grows), ΔAP.
+    - CFO = Net Income + Non-Cash + Working Capital Changes.
+    - CFI = −Capex ± acquisitions/disposals.
+    - CFF = Debt draws − repayments + equity raised − dividends/buybacks.
+    - Ending Cash = Beginning Cash + CFO + CFI + CFF. This must equal the Cash line on the balance sheet — if it does not, there is an error; do not proceed.
+D4. Circular references: if interest expense depends on ending debt and ending debt depends on cash sweep, resolve via an iterative calculation flag (enabled in Excel) or an explicit revolver draw/repayment formula with a stated priority waterfall. Never leave a broken circular reference or mask it with a hardcoded value.
+═══ PHASE E: SCENARIO ANALYSIS ═══
+E1. Define three scenarios with discipline — they must differ in DRIVERS, not just outputs:
+    - Base: assumptions as registered; the company executes on current plan with normal operational friction.
+    - Upside: 1–2 specific positive driver changes (e.g., conversion rate +30%, churn −2pp) plus the macro condition that enables them.
+    - Downside: 1–2 specific negative driver changes (e.g., sales cycle +50%, CAC +40%) plus the trigger condition.
+E2. Never build downside as "base × 0.7" — that hides which driver is the source of risk and provides no operational insight to fix it.
+E3. Tornado chart / sensitivity table: for the top 5 drivers by revenue impact, compute the output range when each driver moves ±10% (or its realistic bound). Rank by absolute dollar impact on 3-year cumulative revenue. This identifies the model's load-bearing assumptions and where management attention should concentrate.
+E4. Break-even analysis: for each scenario, state the period in which (a) gross profit turns positive, (b) EBITDA turns positive, (c) free cash flow turns positive. These are operational milestones with fundraising and hiring implications, not just accounting artifacts.
+E5. Runway analysis (for pre-profitability companies): Ending Cash ÷ Monthly Net Cash Burn = Months of Runway. Flag if runway falls below 12 months in any scenario — that is a financing trigger requiring a stated response (bridge, Series raise, expense reduction), not a rounding error to suppress.
+═══ PHASE F: DCF VALUATION AND TERMINAL VALUE ═══
+F1. Discount rate (WACC): for a private company, state the method explicitly — CAPM with a size premium, VC-stage discount rate (typically 30–60% for Seed/Series A, 20–35% for Series B/C), or a public-company peer WACC plus an illiquidity premium. Never apply a public-company WACC to a pre-revenue startup without a stated adjustment.
+F2. Free Cash Flow to Firm (FCFF) each period: EBIT × (1 − Tax Rate) + D&A − Capex − ΔNWC. This is the input to the DCF, not Net Income. Verify that the FCFF series is consistent with the 3-statement model in Phase D — any discrepancy means there is a linkage error.
+F3. Terminal value: choose one method and defend it:
+    - Gordon Growth Model: Terminal FCF × (1 + g) ÷ (WACC − g). State the perpetuity growth rate g with a rationale tied to long-run GDP or sector growth, not the model's near-term growth rate. g > 4% for a USD-denominated business requires explicit justification.
+    - Exit Multiple: Terminal Year EBITDA (or ARR) × Comparable Multiple. The multiple must be sourced from current trading comps or precedent transactions, marked [SOURCE NEEDED] if not yet verified.
+    - Terminal value must not exceed 75% of total enterprise value without explicit acknowledgment — if it does, the model is a terminal-value exercise, not a DCF, and that must be stated.
+F4. Present value: discount each period's FCFF and the terminal value to today using the mid-period convention (cash received throughout the year, not at year-end). State whether you are using mid-period or end-of-period discounting.
+F5. Sensitivity table on valuation: show enterprise value as a function of WACC (±200bps) × Terminal Growth Rate or Exit Multiple (±1 turn). The intersection of the most likely WACC and multiple is the point estimate; the table is the honest answer.
+F6. Bridge from enterprise value to equity value: Enterprise Value − Net Debt (total debt − cash) − Preferred Liquidation Preference − Minority Interest = Equity Value. For companies with complex cap tables, model the waterfall by share class (preferred stack, participation caps, conversion thresholds) to arrive at common equity value and implied per-share price.
+═══ PHASE G: SANITY CHECKS AND BENCHMARK VALIDATION ═══
+G1. Revenue growth rate: benchmark against comparable public companies or funding-stage peers. Growth implying a market share gain above 0.5% of TAM per year in years 1–3 requires explicit explanation of the competitive mechanism (not just "large market").
+G2. Gross margin trajectory: the path from current to target gross margin must be tied to specific leverage events with a dollar threshold or headcount trigger. "Scale" is not a lever — name the event.
+G3. Rule of 40 check: for SaaS, Revenue Growth Rate (YoY %) + EBITDA Margin (%) must be computed each year. A sub-20 Rule of 40 at Series B or later requires an explanation of the investment thesis that justifies the trade-off.
+G4. LTV:CAC ratio: LTV = (Monthly ARPU × Gross Margin %) ÷ Monthly Churn Rate. CAC = S&M spend ÷ New Customers Acquired in the period. LTV:CAC < 3× is a unit-economics warning; > 10× may indicate chronic under-investment in growth that will surface in churn as product-market fit degrades.
+G5. Capital efficiency: ARR added per dollar of net cash consumed. Best-in-class SaaS: $0.80–$1.20 ARR per $1 burned. Flag any period below $0.30 (capital destruction) or above $2.00 (likely an artifact of deferred costs) as requiring explanation.
+G6. Final integrity check before delivery: (a) Balance sheet balances every period. (b) Cash flow statement reconciles to balance sheet cash every period. (c) Every assumption in Phase A is used at least once in the model. (d) No hardcoded numbers appear in formula cells — all constants live in the Assumptions Registry. (e) All [SOURCE NEEDED] flags are listed and acknowledged. (f) The version log in A8 reflects every change made in this session.
+KEY PRINCIPLE: A financial model is a machine for making assumptions explicit and consequences visible — every number must be traceable to a labeled driver, every output must carry a scenario range, every version must be logged, and no figure may enter the model without a stated rationale. The model's value is not its outputs but its ability to show which inputs matter most and why.
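
Note on financial-model.md: hard rule 10 allows Bash only for verification scripts, so the ARR bridge (B6), the balance and cash checks (D2, D3), and the runway formula (E5) can be sketched as such a script. This is an illustration under stated assumptions and is not part of the package diff. The function names and dictionary keys are hypothetical, and the 0.5 tolerance is one reading of "balance to the dollar".

```python
from typing import Dict, List


def closing_arr(opening: float, new: float, expansion: float,
                churn: float, contraction: float) -> float:
    """ARR bridge from step B6."""
    return opening + new + expansion - churn - contraction


def integrity_errors(period: Dict[str, float], tol: float = 0.5) -> List[str]:
    """Return the D2/D3 checks that fail for one period (empty list = clean).

    Assumption: amounts are in dollars and `tol` is the largest gap that
    still counts as balancing to the dollar.
    """
    errors = []
    # D2: Total Assets = Total Liabilities + Total Equity.
    gap = period["total_assets"] - (
        period["total_liabilities"] + period["total_equity"]
    )
    if abs(gap) > tol:
        errors.append(f"balance sheet off by {gap:,.2f}")
    # D3: Ending Cash = Beginning Cash + CFO + CFI + CFF,
    # and that figure must match the balance-sheet cash line.
    ending = (period["beginning_cash"] + period["cfo"]
              + period["cfi"] + period["cff"])
    cash_gap = ending - period["balance_sheet_cash"]
    if abs(cash_gap) > tol:
        errors.append(f"cash flow vs balance sheet cash off by {cash_gap:,.2f}")
    return errors


def runway_months(ending_cash: float, monthly_net_burn: float) -> float:
    """E5: months of runway; infinite when there is no net cash burn."""
    if monthly_net_burn <= 0:
        return float("inf")
    return ending_cash / monthly_net_burn
```

Returning a list of failures rather than raising on the first one lets a caller report every broken period at once before halting, which fits the "halt and debug before proceeding" instruction in D2.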

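The Phase F valuation steps in financial-model.md (FCFF in F2, Gordon Growth terminal value in F3, discounting convention in F4) can likewise be sketched in Python. This is a hedged illustration and not part of the package diff. The function names are hypothetical, the final projected FCFF is assumed to be the terminal FCF, and discounting the terminal value with the same exponent as the final year's cash flow is one common convention rather than something the skill file specifies.

```python
from typing import Dict, List


def fcff(ebit: float, tax_rate: float, d_and_a: float,
         capex: float, delta_nwc: float) -> float:
    """F2: EBIT x (1 - Tax Rate) + D&A - Capex - change in NWC."""
    return ebit * (1 - tax_rate) + d_and_a - capex - delta_nwc


def gordon_terminal_value(terminal_fcf: float, wacc: float, g: float) -> float:
    """F3: Terminal FCF x (1 + g) / (WACC - g); undefined when g >= WACC."""
    if g >= wacc:
        raise ValueError("perpetuity growth rate must be below WACC")
    return terminal_fcf * (1 + g) / (wacc - g)


def dcf(fcffs: List[float], wacc: float, g: float,
        mid_period: bool = True) -> Dict[str, float]:
    """Discount annual FCFFs plus a Gordon Growth terminal value.

    Assumptions: fcffs[-1] is the terminal FCF, and the terminal value is
    discounted with the same exponent as the final year's cash flow.
    """
    shift = 0.5 if mid_period else 0.0
    pv_flows = sum(
        cf / (1 + wacc) ** (t - shift) for t, cf in enumerate(fcffs, start=1)
    )
    tv = gordon_terminal_value(fcffs[-1], wacc, g)
    pv_tv = tv / (1 + wacc) ** (len(fcffs) - shift)
    ev = pv_flows + pv_tv
    return {
        "enterprise_value": ev,
        "pv_terminal": pv_tv,
        # F3 asks for explicit acknowledgment when this share exceeds 0.75.
        "terminal_share": pv_tv / ev,
    }
```

Exposing terminal_share alongside the enterprise value makes the 75% threshold in F3 something a caller can check directly instead of inferring it from the totals.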
package/skills/grant-proposal.md ADDED

@@ -0,0 +1,93 @@
+---
+name: grant-proposal
+description: Draft, review, or strengthen a complete funder-mirrored grant proposal (need+data, logic model, measurable outcomes, M&E plan, organizational capacity, line-item budget, sustainability) for nonprofit, government, or philanthropic RFPs — invoke whenever a user needs to write, audit, or rebuild a grant application.
+allowed-tools: Read, WebSearch, WebFetch, Write
+---
+═══ HARD RULES ═══
+1. MIRROR THE RUBRIC. Read the RFP/NOFA scoring rubric before writing a single word. Every section heading must map to a rubric criterion; weight your word count proportionally to point allocations.
+2. NEVER FABRICATE. Every statistic, citation, and figure must come from a source you can name (census, peer-reviewed study, government dataset, funder's own published data). If a number is unknown, state "to be confirmed with [source]" rather than inventing it.
+3. NO OVERCLAIMING. Use "is associated with," "is expected to," or "preliminary evidence suggests" — not "will eliminate," "proven to," or "guarantees." Funders score down proposals with inflated impact language.
+4. SPECIFICITY OVER GENERALITY. Reject any sentence that could apply to a different organization, population, or intervention. Name the zip codes, the demographics, the dosage, the staff credentials, the partner MOUs.
+5. BUDGET MUST NARRATE THE PROGRAM. Every line item must link back to a stated activity. No orphan costs. No padding. Include a brief justification column.
+6. ONE VOICE THROUGHOUT. Maintain the organization's established tone (formal federal = authoritative/plain-language per HHS/DOJ/ED style guides; community foundation = warm/asset-framing; corporate CSR = concise/ROI-forward). Never mix registers.
+7. SUSTAINABILITY IS NOT A PARAGRAPH. It is a specific plan: which revenue streams, which partners, which year they activate, what happens to enrolled participants if funding ends.
+8. REVIEW PATH IS DIFFERENT FROM DRAFTING PATH. If the user hands you an existing draft, enter at Phase D first — audit it against the rubric, then return to whichever phases need rebuilding. Do not redraft sections that already meet standard.
+═══ PHASE 0 — Project Intake ═══
+Before any research or drafting, establish:
+P0.1. MODE: Is the user (a) writing from scratch, (b) reviewing/strengthening an existing draft, or (c) doing a final pre-submission audit? Mode determines entry point: (a) → Phase A; (b) → Phase D first, then rebuild flagged phases; (c) → Phase D only.
+P0.2. ORGANIZATION TYPE: 501(c)(3) public charity, fiscal sponsor, government entity, tribal organization, university, or for-profit (some NOFAs allow)? This controls allowable costs, indirect rate authority, and budget form requirements.
+P0.3. SECTOR: Which domain — workforce development, early childhood, housing/homelessness, behavioral health, criminal justice/reentry, K-12 education, environmental justice, food access, disability services, arts/culture, or other? The sector governs which data sources, validated instruments, and evidence tiers apply throughout.
+P0.4. PAGE AND FORMAT CONSTRAINTS: What are the page limits (overall and by section), font/margin requirements, required attachments, and submission platform (Grants.gov, eCivis, Submittable, foundation portal)? Record these before drafting a single word — proposals that violate format are desk-rejected.
+P0.5. DEADLINE AND SUBMISSION PATHWAY: Confirm the deadline, submission window (some portals close hours before the listed deadline), and who has system access to submit. Flag if fewer than 72 hours remain — prioritize QC over new drafting.
+═══ PHASE A — Intelligence Gathering ═══
+A1. Obtain and read the full RFP/NOFA via WebFetch (use the direct URL if provided) or locate via WebSearch. Extract: eligibility requirements, funding ceiling, required match and match sources, page limits by section, formatting specs, allowable/unallowable costs, required attachments, and the exact scoring rubric with point values per criterion. Record the rubric in a working table — it governs everything downstream.
+A2. Research the funder's portfolio: last 2–3 grant cycles' award lists (often posted on funder website or USASpending.gov for federal), stated strategic priorities, funder blog posts or learning agendas, and any published evaluation reports of prior grantees. Identify what they have already funded heavily (avoid replicating it) and what gaps they have explicitly named in public statements.
+A3. Profile the target population using authoritative, jurisdiction-specific sources:
+   - Federal: U.S. Census Bureau / ACS (American Community Survey 5-year estimates), CDC WONDER, BLS LAUS, NIJ recidivism datasets, NCES Common Core of Data, HUD PIT Count, USDA Food Access Research Atlas.
+   - State: state health department vital stats, state education agency longitudinal data, state labor market information systems.
+   Record: exact geography (county, city, zip code, census tract), demographic breakdown, magnitude of need (absolute numbers and rates), and the disparity gap vs. a state or national benchmark. Cite every figure inline (Author/Agency, Year, Dataset Name).
+A4. Map the competitive landscape: who else serves this population in this geography, what their documented gaps or waitlists are, and why this organization's specific approach does not duplicate but complements existing services. If a lead agency or backbone organization exists, explain the relationship.
+A5. Confirm all organizational facts before drafting: EIN, 501(c)(3) determination date, UEI (required for federal awards; it replaced the DUNS number in April 2022), annual operating budget from the last two audited fiscal years, staff FTE count, relevant professional credentials and licenses, active MOUs with partner organizations (and whether each is signed or draft), and any prior grants from this specific funder — including outcomes reported.
+═══ PHASE B — Logic Model Construction ═══
+B1. STATE THE ROOT PROBLEM in one declarative sentence backed by the strongest local data point from Phase A. Required format: "[Population] in [geography] experiences [problem] at [rate/magnitude], compared to [benchmark], because [proximate cause(s)], resulting in [downstream consequence]."
+B2. Define INPUTS: staff time (named role, FTE, % effort), partner contributions (type and dollar value), grant funds requested, confirmed match (cash vs. in-kind), volunteer hours (use sector-standard valuation: Independent Sector rate for the state). These anchor the budget — inputs listed here must appear in the budget narrative.
+B3. Define ACTIVITIES with full dosage specifications: the named intervention, delivery modality (in-person/virtual/hybrid), frequency (sessions per week), session duration, cohort or caseload size, setting (community site, clinic, school, correctional facility), over what grant period, and the minimum fidelity standard. Example format: "12-session Motivational Interviewing group, 90 min/session, weekly, cohorts of 8 participants, delivered by licensed LCSW at partner community health center, Months 2–24, fidelity monitored via MITI 4.2 coding at 20% of sessions."
+B4. Define OUTPUTS: countable, direct products of activities that measure delivery fidelity — not participant change. Examples: # participants enrolled, # sessions delivered, # training hours completed, # screenings administered, # toolkits distributed. Outputs are within the organization's direct control; they happen even if participants do not change.
+B5. Define SHORT-TERM OUTCOMES (0–12 months): measurable changes in knowledge, attitude, skill, or behavior attributable to the intervention, assessed at program exit or a fixed follow-up interval. For each outcome: the specific indicator, the validated measurement instrument (e.g., PHQ-9 for depression, ACE-IQ for adverse childhood experiences, WorkKeys for job readiness), the minimum detectable change, and who administers the instrument.
+B6. Define MEDIUM-TERM OUTCOMES (1–3 years): changes in condition or status that require time to manifest — employment entered and retained at 6 months, housing stability at 12 months, recidivism at 18 months, graduation rate, clinical remission. For each: the data source (administrative record match with state wage database, UI claims, state corrections data, school enrollment records, health claims), the follow-up mechanism, and who holds the data-sharing agreement.
+B7. Define LONG-TERM IMPACT (3–5+ years): the population-level change this program contributes to. Frame as a contribution, not sole attribution (the program is one intervention in a system). Link to the funder's published theory of change or priority outcome if one exists.
+B8. VALIDATE THE LOGIC: for each causal arrow (activity → output → outcome → impact), name the peer-reviewed evidence or established program model that makes this pathway plausible. Cite 1–2 studies or model evaluations with sample characteristics that match the target population. If the evidence base is weak or mixed, acknowledge it — funders who know the literature will penalize omissions.
+═══ PHASE C — Proposal Drafting ═══
+Before drafting, assign a target word/page count to each section proportional to its rubric weight. Do not deviate more than 10% per section.
+C1. EXECUTIVE SUMMARY (if the RFP requires it — not all do): ≤1 page. Problem sentence, proposed solution in one sentence, target population with geography, total request, grant period, and one headline outcome with a specific number. Write this last — paste in final figures.
+C2. STATEMENT OF NEED: Lead with the strongest local data point — not national context. Layer: national frame (1 sentence) → state rate → target geography rate → target subpopulation rate → disparity gap vs. comparison group. Close with the concrete consequence of inaction (economic cost, human cost, system burden) — cited, not rhetorical. Every figure cited inline (Author/Agency, Year). No rhetorical questions. No generic opener ("Every year, millions of Americans...").
+C3. PROJECT DESCRIPTION / APPROACH:
+   - Open with a one-paragraph program theory: this intervention, for this population, in this sequence, because this evidence.
+   - Describe each activity from B3 format: who delivers it, where, when, to how many, over what grant period, at what fidelity standard.
+   - Recruitment and selection: eligibility criteria (age, geography, clinical threshold, income level, justice involvement, etc.), outreach channels (which community-based organizations, clinics, schools, justice system partners), referral pathway, and projected timeline to enroll the first cohort.
+   - Equity by design: name specific barriers this population faces (language, transportation, childcare, distrust of institutions, documentation status, digital access) and the corresponding design features that remove each barrier. This is not a generic DEI paragraph — it maps problem to mechanism.
+   - Evidence base: name the program model, its developer, the evaluated population, the effect sizes from controlled studies, and any fidelity certification requirements. If adapting a model, explain which core components are preserved and which are adapted, and why adaptation is appropriate for this population and context.
+C4. LOGIC MODEL TABLE: Present as a five-column table (Inputs | Activities | Outputs | Short/Medium Outcomes | Long-Term Impact). If the RFP does not specify a format, use a clean horizontal table. This is not optional — funders use it as a navigation anchor for the rest of the proposal.
+C5. EVALUATION / M&E PLAN:
+   - State 2–4 primary outcome metrics with: baseline value (from existing data or needs assessment), performance target, data source, collection instrument, collection frequency, and who is responsible.
+   - Evaluation design: be explicit about what is feasible. Pre/post single group = lowest tier but acceptable for community foundations; comparison group using propensity matching = mid-tier; randomized assignment = highest tier but rarely feasible for service programs. Do not overstate rigor. If using an independent evaluator, name the firm or university partner and their role.
+   - Name specific validated instruments — do not write "a survey." Write "the Kessler Psychological Distress Scale (K10)" or "the TABE 11/12 for literacy assessment."
+   - Data quality and protection: how will participant data be stored (HIPAA-compliant EHR, REDCap, encrypted spreadsheet), who has access, what IRB determination applies (exempt, expedited, or full review).
+   - Continuous quality improvement: how findings from mid-year data will be reviewed (monthly dashboard, quarterly fidelity review, bi-annual program adjustment meeting) and who is in the room.
+   - Participant feedback mechanism: advisory council with stipended members, exit survey in participant's primary language, community listening sessions — name the mechanism and its cadence.
+C6. ORGANIZATIONAL CAPACITY:
+   - Name the Project Director: degree(s), licensure, years of experience specifically relevant to this population and intervention — not general management experience. If not yet hired, describe the minimum qualifications and the hiring timeline.
+   - Governance: board oversight structure, board composition (relevant expertise), fiscal controls (annual audit, signatory policy, subcontract monitoring procedures), and whether the organization has a federally negotiated indirect cost rate agreement (include cognizant agency and rate) or will use the de minimis rate (up to 15% of MTDC under the 2024 revision of 2 CFR 200; 10% under the earlier rule).
+   - Track record: list 3–5 directly analogous past projects. For each: grant name, funder, period, population served, scale, and at least one measurable outcome achieved (not just a description of services). If this is a new program area, explain the transferable capacity explicitly.
+   - Partners: for each named partner, state their specific role (not "collaborative support"), what they contribute (FTE, in-kind space, referral pipeline, data-sharing, co-facilitation), MOU status (signed/in negotiation/draft), and why they are uniquely positioned for this role vs. any other organization.
+C7. BUDGET NARRATIVE:
+   - Organize by standard cost category: Personnel, Fringe Benefits, Consultants/Contractors, Travel, Equipment, Supplies, Indirect/Overhead, Other Direct Costs, and Match/Cost Share.
+   - Personnel: for each position — name (or "TBH" if unfilled), FTE, % effort charged to this grant, annual salary, resulting grant charge, and one sentence tying the role to a specific named activity. Example: "Program Coordinator (1.0 FTE, 100% grant-funded, $52,000/year): manages participant enrollment, maintains MIS data entry, and coordinates session scheduling for all 12-session cohorts per Activity 3."
+   - Fringe: apply the organization's actual fringe rate (or blended rate) and state what it covers (FICA, health, retirement, workers' comp). Do not use a placeholder percentage without explanation.
+   - Consultants: name or role, daily/hourly rate, number of days/hours, total, and confirmation that the rate is at or below market (cite a comparable rate source if the funder requires it).
+   - Travel: break out local mileage (rate per mile per IRS or funder rate), in-state, and out-of-state separately. If a conference is included, name it and justify attendance against the project's learning objectives.
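+   - Indirect/Overhead: apply the negotiated rate to the approved base (MTDC or total direct costs per the rate agreement). If using the de minimis rate, state the rate and its authority, e.g. "15% of Modified Total Direct Costs per 2 CFR 200.414(f)" (10% for awards governed by the rule in effect before October 2024).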
+   - Match: list each match source on a separate line with dollar value, whether cash or in-kind, and the confirming document (letter, MOU, board resolution). Match must meet the RFP's minimum percentage and must comply with federal match rules if applicable (no federal funds matching federal funds unless specifically authorized).
+   - Flag all costs deliberately excluded because they are unallowable under this funder's rules (e.g., lobbying, fundraising costs, alcoholic beverages, entertainment) — this signals sophistication to the program officer.
+C8. SUSTAINABILITY PLAN:
+   - List specific named revenue sources that will support this program after the grant period ends: Medicaid/insurance billing (state plan or waiver, billing enrollment timeline), government contracts (which agency, which funding stream, what action is required), other private foundations (name those with active or planned relationships), municipal budget line (which department, which budget cycle), earned revenue, or individual donor base. "Diversified funding" is not a plan — a named source with a timeline and a required action is.
+   - For each source: realistic projected dollar amount, the year it activates relative to the grant period, and the specific action required to secure it (application submitted Date X, billing credentialing underway, MOU in negotiation with Agency Y).
+   - Address what happens to enrolled participants if continuation funding is not secured: warm handoff protocol, referral partners, whether the intervention has a defined end point or is ongoing. Funders read this as an equity and ethics signal — "we will seek additional funding" is an inadequate answer.
+   - If the program model will be absorbed into government operations, a larger host organization, or the organization's core budget, name the entity and describe the pathway and timeline.
+═══ PHASE D — Quality Control ═══
+D1. RUBRIC AUDIT: Re-read the exact scoring rubric from Phase A1. Map every scored criterion to the specific section(s) where it is addressed. Assign a score estimate (strong/adequate/weak) based on whether a reviewer with no prior knowledge of the organization would find the criterion fully answered. Any criterion rated weak must be rebuilt before D2.
+D2. CLAIMS AUDIT: Re-read every sentence containing a number, a causal claim, a superlative ("most," "only," "largest"), or a comparative ("better than," "more effective than"). Verify each traces to a named, dateable source from Phase A. Delete or qualify any that do not. Pay specific attention to sentences that sound factual but were generated without a source citation.
+D3. INTERNAL CONSISTENCY CHECK: Confirm that population size, enrollment targets, outcome targets, budget figures, match amounts, indirect rate, and timeline appear identically in (a) the narrative, (b) the logic model table, (c) the budget narrative, and (d) any required forms (SF-424, budget workbook, required attachments). A single discrepancy — $1 off, one participant count different — can trigger a budget revision request or disqualification.
+D4. PLAIN-LANGUAGE PASS: Replace jargon with the simplest accurate term. Read every paragraph aloud — any sentence longer than 35 words should be split. Program officer reviewers read 20–50 proposals in a compressed review period; dense prose and undefined acronyms lose points. Spell out every acronym on first use even if it seems universal.
+D5. FORMATTING COMPLIANCE CHECK: Verify page count (by section if the RFP specifies), font type and size, margin widths, line spacing, header/footer content, file format (PDF vs. Word vs. portal-native), filename conventions, and the complete required attachments list (audited financial statements, IRS determination letter, board list, resumes, MOUs, logic model, letters of support, SF-424, SF-424A/B, indirect cost rate agreement). Non-compliant proposals are frequently desk-rejected without reviewer evaluation.
+D6. FINAL RFP RE-READ: After all drafting and QC, re-read the original RFP from the beginning one time — looking specifically for any requirement, prohibition, or attachment that was overlooked during drafting. This catches late-page requirements that writers miss by reading the RFP only for content.
+D7. INTERNAL REVIEW ROUTING: Route the complete package to the Project Director (programmatic accuracy), the Finance Director (budget arithmetic and allowable cost compliance), and one person outside the program team (clarity and persuasiveness to a non-expert reader). Incorporate substantive edits. Do not defend every word.
+KEY PRINCIPLE: A grant proposal is not a description of your organization — it is a structured argument that THIS funder's dollars, deployed through THIS specific mechanism, will produce THIS measurable change for THESE specific people, verified by THESE data systems and instruments, and sustained by THESE named revenue sources after the grant ends. Every sentence that fails this test is a sentence a reviewer is not scoring points on.
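
Hard Rule 1 and the Phase C preamble ask for word counts proportional to rubric points, with no section off by more than 10%. A minimal sketch of that arithmetic follows. It is an illustration only and not part of the package: the function names, the rubric sections, the point values, and the 5,000-word total are all hypothetical.

```python
def allocate_words(rubric_points: dict[str, int], total_words: int) -> dict[str, int]:
    """Split a total word budget across sections in proportion to rubric points."""
    total_points = sum(rubric_points.values())
    return {
        section: round(total_words * points / total_points)
        for section, points in rubric_points.items()
    }


def out_of_tolerance(
    targets: dict[str, int], actuals: dict[str, int], tolerance: float = 0.10
) -> list[str]:
    """List sections whose actual count deviates from target by more than `tolerance`."""
    return [
        section
        for section, target in targets.items()
        if abs(actuals.get(section, 0) - target) > tolerance * target
    ]


# Hypothetical rubric (points per criterion) and draft word counts.
rubric = {"Need": 20, "Approach": 35, "Evaluation": 20, "Capacity": 15, "Budget": 10}
targets = allocate_words(rubric, total_words=5000)
actuals = {"Need": 1080, "Approach": 1500, "Evaluation": 1000, "Capacity": 800, "Budget": 560}
flagged = out_of_tolerance(targets, actuals)
print(targets)
print(flagged)
```

With these assumed numbers, the Approach and Budget sections fall outside the 10% band and would be the ones to rebalance before the Phase D audit.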

package/skills/harmony-analysis.md ADDED

@@ -0,0 +1,93 @@
+---
+name: harmony-analysis
+description: Analyze the harmonic language of a piece or passage — key, Roman-numeral function, cadences, voice-leading, secondary dominants, modal interchange, enharmonic modulation, formal structure, and tension/release mechanics — producing a rigorous, notation-standard analytical report within the common-practice tonal tradition.
+allowed-tools: Read, Write
+---
+═══ SCOPE DECLARATION ═══
+This skill operates within **common-practice tonal harmony** (roughly 1600–1910) and tonal extensions thereof (modal jazz voicings, neo-Romantic chromaticism). It does NOT attempt Roman-numeral reduction of atonal, serial, spectral, or genuinely post-tonal passages — for those, flag the limitation and offer pitch-class or set-theoretic description instead.
+═══ HARD RULES ═══
+1. Every chord label must include BOTH the Roman numeral (upper/lower case for major/minor quality) AND figured-bass interval symbols where inversion or suspension applies: e.g. V⁴₂, ii⁶₅, viio⁷/V.
+2. Never reduce an analysis to chord names (C, Am, F) — functional Roman numerals in the established key are mandatory at every step.
+3. Key claims require evidence: name the pitch that confirms the key (leading tone, raised ^6/^7 in minor, or tritone of the dominant).
+4. Modal interchange chords must be labeled with their borrowed key in brackets: e.g. iv [from C minor], bVII [Mixolydian borrowed].
+5. Cadences must be typed precisely — PAC (perfect authentic), IAC (imperfect authentic), HC (half cadence), DC (deceptive), PC (plagal) — and the exact scale degrees of the bass at the cadence point must be stated.
+6. Voice-leading observations must name the specific interval and voice pair: e.g. "parallel octaves between Soprano and Bass on beats 3–4 of m. 12; contrary motion resolves the tritone B–F → C–E."
+7. Do not fabricate measure numbers, pitch spellings, specific analytical details, or composer biographical claims. If the score is not provided, qualify every claim: "if this passage is as described." Style comparisons must reference documented period practice by technique name, not by asserting composer-specific attribution unless the score and composer are both confirmed.
+8. Tension/release claims must cite a specific harmonic or voice-leading mechanism — not vague adjectives (e.g., not "the passage feels tense" but "the diminished seventh viio⁷/V on a metrically strong beat with unresolved suspension 4–3 in the soprano generates peak dissonance").
+9. Formal section labels use standard abbreviations: Exp (Exposition), Dev (Development), Rec (Recapitulation), A/B/A′ (binary/ternary), Intro, Coda, Tr (transition), Rt (retransition), P (primary theme), S (secondary theme), K (closing theme).
+10. Output sections in the exact order of the phases below; skip no phase, mark "N/A — insufficient data" if information is genuinely unavailable.
+11. Enharmonic equivalence must be stated explicitly when a respelling changes function: e.g. "F# = Gb; the German+6 in C minor (Ab–C–Eb–F#) is enharmonically Ab–C–Eb–Gb, the V⁷ of Db." Ambiguity is not a flaw to paper over — name it.
+12. When competing analytical frameworks yield different readings (Riemannian function, Schenkerian prolongation, neo-Riemannian transformation), name the framework and the reading it produces rather than arbitrating between them without grounds.
+═══ PHASE A — TONAL ORIENTATION ═══
+A1. Identify the home key. State: (a) the tonic pitch-class, (b) mode (major / natural minor / harmonic minor / melodic minor / Dorian / Mixolydian / Phrygian / other), (c) confirming evidence (key signature + cadential V–I closure, or explicit tonicization).
+A2. Flag any tonal ambiguity at the opening: does the piece begin on I, or on a harmony that defers tonic confirmation (e.g., begins on IV, vi, or V)? Name the deferral mechanism and the measure at which tonic is first unambiguously established.
+A3. If the piece modulates, list each tonicized area in chronological order with: measure range, target key, and the modulatory mechanism — choose from:
+   - Common-chord (diatonic pivot): name the chord and its dual Roman-numeral reading.
+   - Chromatic pivot: name the chord and its enharmonic reinterpretation.
+   - Enharmonic modulation: identify the respelled chord (typically Ger+6 ↔ V⁷, or viio⁷ ↔ viio⁷ of target).
+   - Sequential / phrase-model: identify the sequence pattern and the key it deposits.
+   - Direct (abrupt): no pivot; state the bass motion and the first chord of the new key.
+═══ PHASE B — ROMAN-NUMERAL HARMONIC REDUCTION ═══
+B1. Parse the passage into harmonic rhythm units: state the prevailing rate of chord change (per beat, per bar, per phrase) and note where the rate accelerates or slows and what that signals formally.
+B2. Assign a Roman numeral to every distinct harmony. Use:
+   - Upper case for major triads and dominant sevenths (I IV V in major keys; III V VI VII in minor keys).
+   - Lower case for minor and diminished triads (ii iii vi vii° in major keys; i ii° iv v in minor keys).
+   - Superscript "o" for diminished, "ø" for half-diminished, "+" for augmented.
+   - Figured bass: ⁶ = first inversion, ⁶₄ = second inversion; ⁷ / ⁶₅ / ⁴₃ / ⁴₂ for seventh chords in root position and first, second, and third inversion.
+B3. Mark non-chord tones explicitly: passing tone (PT), neighbor tone (NT), suspension with resolution interval (e.g. sus 4–3, 7–6, 9–8), appoggiatura (App), escape tone (ET), anticipation (Ant), pedal point (Ped), cambiata. State the voice carrying the NCT.
+B4. Write the reduction as a condensed chord string per phrase:
+   Phrase 1 (mm. 1–8): I – IV⁶ – I⁶ – V⁷ – I [PAC, soprano ^2–^1, bass ^5–^1]
+B5. Flag any chord that resists clean Roman-numeral assignment and propose two competing readings (see Hard Rule 12); do not force a label if the passage is genuinely ambiguous.
+═══ PHASE C — CADENCE INVENTORY ═══
+C1. List every cadence with: measure number(s), cadence type (PAC/IAC/HC/DC/PC), the two final chords and their bass scale degrees, soprano scale degree at the final chord, and bass motion (step up / step down / leap / same).
+C2. Assess cadential weight: phrase cadence (closes a sub-phrase), period-closing PAC (closes an antecedent-consequent pair), or structural cadence (closes a large formal section or the entire piece).
+C3. Identify any cadential ⁶₄ (I⁶₄–V–I): describe its voice-leading preparation (bass must arrive by common tone or stepwise motion, not by leap), the duration of the ⁶₄ relative to the V, and whether the ⁶₄ is metrically accented.
+C4. Note any evaded cadences (expected resolution replaced by a deceptive substitute), elided cadences (new phrase overlaps with the resolution), or abandoned cadences (harmonic goal reached but texture dissolves before rhythmic close).
+═══ PHASE D — VOICE-LEADING AUDIT ═══
+D1. Identify the soprano line's structural pitches at phrase boundaries: do they form a stepwise descent toward ^1 (Schenkerian Urlinie motion), or does the soprano articulate leaps at cadences? Note: Urlinie analysis implies a Schenkerian framework; label it as such rather than asserting it as the only reading.
+D2. Check and report prohibited parallels: parallel perfect fifths, parallel octaves, direct (hidden) fifths/octaves in outer voices approaching a perfect consonance by similar motion. State the specific voice pair, beat, and measure.
+D3. Evaluate tendency-tone treatment:
+   - Leading tone (^7): should resolve up to ^1 or be transferred to another voice (inner-voice exception in complete seventh chords).
+   - Chordal seventh: should resolve down by step; flag any upward resolution or leap.
+   - Raised ^6 and ^7 in harmonic/melodic minor: confirm they resolve as expected or describe the musical justification for deviation.
+D4. Describe the dominant-to-tonic voice-leading: which voice carries the tritone (^7 and ^4 in major), and how is it resolved (both voices by step in contrary motion is the normative pattern; name any deviation).
+D5. Note voice-crossing (a lower voice moves above a higher voice), voice-overlap (a voice moves past the pitch an adjacent voice has just left, e.g. the alto rising above the soprano's previous note), and leaps larger than a perfect fifth in inner voices — assess whether each is compositionally justified or a voice-leading defect.
+═══ PHASE E — CHROMATIC HARMONY ═══
+E1. **Secondary dominants and secondary leading-tone chords.** For each, provide: Roman-numeral label (e.g. V⁷/V in m. 6; viio⁶/ii in m. 14), the applied chromatic pitch (the tone raised to act as a local leading tone or perfect-fifth partner), and the target chord. Confirm resolution or name the substitute if resolution is interrupted.
+E2. **Modal interchange (borrowed chords).** State the source mode for each borrowed chord: e.g. bVII borrowed from parallel Mixolydian (m. 12); iv borrowed from parallel minor (m. 20); bVI borrowed from parallel minor (m. 34); bII (Neapolitan) borrowed from Phrygian or parallel minor.
+E3. **Augmented-sixth chords.** Identify type (Italian+6 = M3 and +6 above the bass; German+6 = Italian plus P5 above the bass; French+6 = Italian plus A4 above the bass). State: the bass pitch (typically ^b6), the voice carrying the augmented sixth, and the outward resolution of both aug-sixth voices to the octave on ^5. Note the German+6's enharmonic equivalence to a dominant seventh (used in enharmonic modulation).
+E4. **Neapolitan (bII or N⁶).** Identify the flatted second scale degree, the typical first-inversion voicing, and the resolution path — via cadential ⁶₄, directly to V, or as a Neapolitan sixth in a sequence.
+E5. **Enharmonic modulations.** Identify the pivot chord, its spelling in the departure key, its enharmonic respelling in the arrival key, and the voice that enharmonically transforms (e.g. "F# in the Ger+6 of C minor (Ab–C–Eb–F#) respelled as Gb, so the chord becomes Ab⁷ and functions as V⁷ of Db major").
+E6. **Common chromatic and sequential progressions.** Flag recognizable idioms by name: the omnibus progression (chromatic voice exchange in contrary motion between the bass and an upper voice while the other voices sustain); the Monte (applied dominant plus resolution, sequenced up by step, e.g. V/IV–IV then V/V–V); the Fonte (applied dominant plus resolution, sequenced down by step, e.g. V/ii–ii then V–I); the Romanesca; the lament bass (descending tetrachord ^1–^b7–^b6–^5, or its chromatic form ^1–^7–^b7–^6–^b6–^5).
+═══ PHASE F — FORMAL ARCHITECTURE ═══
+F1. Identify the large-scale form: binary (simple / rounded / continuous), ternary (sectional / continuous), sonata (with or without development), sonatina, rondo (5-part / 7-part), ritornello, through-composed, strophic, theme-and-variations, or hybrid. Justify the label with measure ranges.
+F2. Map every section with measure ranges and functional labels (from Hard Rules abbreviations), plus the local key at each section's tonic arrival.
+F3. State the tonal plan: where does the piece depart the home key, which secondary tonal area receives the most structural weight, and which formal section carries the harmonic climax (greatest chromatic density, most remote tonal region, or highest dissonance level)?
+F4. Phrase-structure analysis: identify the normative phrase length, then flag all deviations — phrase extensions (cadential extension, suffix), phrase contractions (liquidation, truncation), sentence vs. period construction, and hybrid phrase types.
+F5. Identify the proportional climax if present: the moment of peak harmonic tension or highest structural dissonance. State its measure number and its proportional position in the form (e.g., "m. 47 of 73 = 64% — near but not at the golden section of ~62%"). Do not assert a golden-section relationship without measuring it.
+═══ PHASE G — TENSION / RELEASE MECHANICS ═══
+G1. Rank the harmonic tension sources in this passage from most to least potent, with the specific chord or voice-leading event for each:
+   - Dissonance level: diminished/augmented intervals > sevenths/seconds > thirds/sixths > perfect consonances.
+   - Tonal distance: chords outside the diatonic set raise tension; the dominant's pull toward tonic is distinct from the chromatic tension of borrowed or applied chords.
+   - Metric placement: dissonance on a strong beat resolving on a weak beat (suspension) generates more tension than weak-beat dissonance.
+   - Register and textural density: high soprano register + close voicing in inner voices amplifies perceived tension.
+   - Harmonic rhythm: acceleration toward a cadence signals goal-directed tension; sudden slowing after a climax signals release.
+   Note: the functional tension hierarchy (V/V > V > IV in traditional Riemannian function theory) is one model; a Schenkerian reading may weight structural dominants differently from surface chromatic events. Name the framework if you apply one.
+G2. Trace the single most powerful tension arc: identify its inception chord (and the specific dissonant interval introduced), its apex (name the chord and the dissonant interval at maximum tension), and its resolution (chord, voice-leading resolution path, metric position).
+G3. Assess whether resolution is complete (PAC + root-position I + stable texture + slowing harmonic rhythm) or partial/frustrated (IAC / DC / modal drift / unresolved chordal seventh / new phrase eliding the resolution). Explain the expressive consequence of partial resolution.
+═══ PHASE H — SYNTHESIS & ANALYTICAL SUMMARY ═══
+H1. Write a 3–5 sentence analytical précis integrating all phases: tonal language, formal logic, harmonic vocabulary (functional vs. chromatic), voice-leading profile, and the dominant compositional technique creating expressive effect. The précis must name at least one specific chord event and one formal feature.
+H2. Identify one underexplored harmonic feature — a chord, progression, or voice-leading pattern that resists clean functional explanation — and present the two most plausible analytical interpretations (e.g., functional Roman-numeral reading vs. Schenkerian prolongation reading vs. neo-Riemannian transformation reading). Do not collapse competing readings prematurely; explain what musical evidence would favor each.
+H3. If period-style comparisons are appropriate, cite the specific technique by name and its documented association with a practice or style period: e.g. "the bVI–V–I cadential pattern is characteristic of Romantic chromaticism from roughly 1820 onward; the Neapolitan sixth as a pre-dominant in slow movements is documented throughout the Classical period." Do not assert specific composer attribution unless the score and composer are both confirmed in the input.
+KEY PRINCIPLE: Harmonic function is relational, not absolute — every Roman-numeral label is a claim about how a chord behaves in its tonal context, and that claim must be supported by voice-leading evidence and tonal resolution pattern, not by the chord's pitch content alone. A chord's label can change when the tonal context shifts; always ask "what does this chord resolve to?" before assigning its function.

package/skills/hypothesis-generation.md ADDED

@@ -0,0 +1,99 @@
+---
+name: hypothesis-generation
+description: Generate and rank falsifiable hypotheses — diverse candidates each with a concrete prediction and discriminating test, ranked by novelty × plausibility × testability × impact.
+allowed-tools: Read, WebSearch, WebFetch, Write
+---
+## PHASE 0 — ANCHOR IN THE ANOMALY (before generating anything)
+1. State the observation or anomaly in ONE sentence: what was measured, when, under what conditions, how it deviates from expectation. No interpretation yet.
+2. Quantify the effect size or deviation if possible (e.g., "2.3× above baseline, p=0.004, n=47"). Vague anomalies produce vague hypotheses.
+3. List every confound already ruled out by the study design or prior controls. Write them down — this is your moat against redundant hypotheses.
+4. Identify the observation's PROVENANCE: is it a single lab, replicated, computational, or anecdotal? Weight your priors accordingly.
+## PHASE 1 — LITERATURE SWEEP (flag every use)
+5. Search for: (a) the anomaly itself by precise terms, (b) the closest mechanistic analogues in adjacent fields, (c) the canonical null explanation. Use WebSearch + WebFetch on the top 3-5 results.
+6. For each source found, extract: claim, sample size / method, replication status, year. Do NOT paraphrase conclusions from abstracts alone — fetch the methods section.
+7. Flag [LIT] next to every hypothesis that was directly informed by a retrieved source. Flag [GAP] where literature was searched but returned nothing relevant.
+8. If a prior paper already proposes the hypothesis, note it as PRIOR ART and still include it — prior art raises plausibility, it does not disqualify.
+## PHASE 2 — ABDUCTIVE ENUMERATION (what WOULD explain this?)
+9. Ask: "What is the MINIMAL change to the current mechanistic model that would produce this observation and nothing else?" — this is your mechanistic anchor hypothesis.
+10. Ask: "What if the observation is an artifact?" — instrumentation error, sampling bias, batch effect, measurement ceiling. Generate at least ONE artifact hypothesis.
+11. Ask: "What if the standard model is locally wrong?" — generate a hypothesis that requires revising a widely-held assumption. Label it REVISIONIST.
+12. Ask: "What unobserved variable could mediate or moderate this?" — generate at least ONE confounding / third-variable hypothesis.
+13. Ask: "What would the effect look like if it were purely stochastic?" — this is your null hypothesis. Make it SPECIFIC (e.g., "the deviation is sampling noise under H0: μ=baseline, σ=observed_SD").
+14. Generate a minimum of FIVE hypotheses total: (H-NULL) null, (H-MECH) mechanistic, (H-ALT1) alternative mechanism, (H-CONF) confound/mediator, (H-REV) revisionist. Add more if the domain warrants.
+## PHASE 3 — FALSIFIABILITY SPECIFICATION (mandatory for each hypothesis)
+15. For EACH hypothesis write exactly:
+    - **Predicted observation**: the specific, measurable outcome IF this hypothesis is true (not just "different from control" — state direction, magnitude, and on what variable).
+    - **Predicted non-observation**: what should NOT happen if this hypothesis is true (rules out alternatives).
+    - **Discriminating test**: the single experiment or analysis that most cleanly separates this hypothesis from the others. Name the independent variable, dependent variable, comparison group, and the decision threshold.
+    - **Kill condition**: the result that would FALSIFY this hypothesis with high confidence.
+16. Hypotheses that cannot be falsified in principle are moved to a BACKGROUND ASSUMPTIONS list — they are not ranked.
+## PHASE 4 — ASSUMPTION + PRIOR AUDIT
+17. For each hypothesis, list its load-bearing assumptions (what must be true for the predicted observation to follow from the hypothesis). If an assumption is itself untested, flag it [UNTESTED ASSUMPTION].
+18. State your prior probability for each hypothesis BEFORE seeing any discriminating data. Use qualitative bins: HIGH (>0.4), MEDIUM (0.1–0.4), LOW (<0.1). Justify in one phrase (e.g., "LOW — requires two simultaneous mechanism failures").
+19. Name the assumption most likely to be WRONG in each hypothesis. This is the fragility point — tests should probe it.
+## PHASE 5 — DIVERSITY AUDIT (anti-confirmation-bias gate)
+20. Steelman the NULL: write the strongest version of "nothing interesting is happening." Does it actually explain the observation at least as parsimoniously as your favorite? If yes, the null must be ranked at least MEDIUM.
+21. Steelman each RIVAL: for each non-favorite hypothesis, write one piece of evidence that WOULD convince you it is correct. If you cannot, you are not engaging with it honestly.
+22. Check for HYPOTHESIS CLUSTERING: if three of your hypotheses are minor variants of the same mechanism, collapse them into one with a parameter range, then generate a genuinely different candidate to replace the collapsed ones.
+23. Check for FIELD BIAS: are all hypotheses drawn from one discipline? Explicitly ask whether a mechanism from an adjacent field (chemistry → biology, economics → ecology, physics → neuroscience) could apply. Add one cross-disciplinary hypothesis if plausible.
+## PHASE 6 — RANKING
+24. Score each hypothesis on four axes, each 1–5:
+    - **Novelty**: 5 = contradicts current consensus or opens a new mechanism; 1 = restatement of prior art.
+    - **Plausibility**: 5 = consistent with all known constraints and analogous cases; 1 = requires violation of established principles.
+    - **Testability**: 5 = discriminating test is feasible with current technology and resources in <6 months; 1 = requires technology that does not exist.
+    - **Impact**: 5 = if true, changes how the field designs experiments or therapies; 1 = confirms a detail with no downstream consequence.
+25. Compute RANK SCORE = Novelty × Plausibility × Testability × Impact. Sort descending.
+26. Flag any hypothesis that scores 5 on Testability AND ≥4 on Impact as a PRIORITY TEST regardless of novelty — actionable beats elegant.
+## PHASE 7 — CRUCIAL EXPERIMENT
+27. Identify the ONE experiment that best DISCRIMINATES the top three hypotheses simultaneously. Criteria: (a) its outcome space has at least three non-overlapping regions, each mapping to a different hypothesis; (b) a null result is also interpretable; (c) it is the most resource-efficient design that achieves (a) and (b).
+28. Specify the crucial experiment fully: design, controls, sample size rationale (power ≥ 0.8 at effect size of interest), data analysis plan, and the decision table mapping outcomes to hypotheses.
+29. If no single experiment can discriminate all three, specify TWO sequential experiments and the branching logic.
+## PHASE 8 — OUTPUT FORMAT
+30. Deliver results in this exact structure:
+    **ANOMALY**: [one sentence + quantification]
+    **RULED-OUT CONFOUNDS**: [bulleted list]
+    **LITERATURE STATUS**: [searched / not searched; key sources]
+    For each hypothesis (H-NULL first, then ranked order):
+    > **[H-ID] [NAME]** | Prior: HIGH/MED/LOW | Rank Score: N
+    > Mechanism: [one sentence]
+    > Predicted observation: [specific + measurable]
+    > Predicted non-observation: [what must NOT happen]
+    > Discriminating test: [IV, DV, comparison, threshold]
+    > Kill condition: [specific falsifying result]
+    > Load-bearing assumptions: [list] — fragility point: [which one]
+    > [LIT] or [GAP] or [PRIOR ART] tag
+    **CRUCIAL EXPERIMENT**: [full spec per step 28]
+    **BACKGROUND ASSUMPTIONS** (unfalsifiable): [list]
+    **PRIORITY TEST FLAGS**: [hypotheses with score 5 testability + ≥4 impact]
+## HARD RULES
+31. Never generate a hypothesis you cannot attach a kill condition to. No kill condition = background assumption, not hypothesis.
+32. Never rank your preferred hypothesis first by default. Ranking is determined by the score formula only.
+33. Never omit the null. A missing null is a sign of motivated reasoning.
+34. Never cite an abstract without fetching the methods. Conclusions without methods are unverified.
+35. Never let fewer than two hypotheses survive to the crucial experiment stage. Single-hypothesis science is not science.
+36. If the anomaly is under-specified (no effect size, no comparison group, no replication count), STOP and ask the user for these before generating hypotheses. Garbage in, garbage out.
+37. All literature use must be flagged [LIT]. Unflagged claims are treated as priors from training data — lower confidence.
+38. The crucial experiment must be DESIGNED TO LOSE YOUR FAVORITE — if it cannot falsify the top-ranked hypothesis, redesign it.
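
Editorial illustration (not part of the package diff): steps 24 to 26 of the added skill file describe a four-axis score and a priority flag. The sketch below shows one way that arithmetic could be implemented. The `Hypothesis` class, its field names, the `rank` helper, and every score value are hypothetical and chosen only for illustration; nothing here is taken from the package's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One candidate hypothesis scored on the four 1-5 axes of step 24."""
    hid: str
    novelty: int
    plausibility: int
    testability: int
    impact: int

    def __post_init__(self) -> None:
        # Step 24 bounds every axis to the range 1-5.
        for name in ("novelty", "plausibility", "testability", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{self.hid}: {name}={value} is outside 1-5")

    @property
    def rank_score(self) -> int:
        # Step 25: RANK SCORE = Novelty x Plausibility x Testability x Impact.
        return self.novelty * self.plausibility * self.testability * self.impact

    @property
    def priority_test(self) -> bool:
        # Step 26: Testability of 5 AND Impact of at least 4, regardless of novelty.
        return self.testability == 5 and self.impact >= 4


def rank(hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    """Sort by rank score, highest first (step 25)."""
    return sorted(hypotheses, key=lambda h: h.rank_score, reverse=True)


# Illustrative scores only; they do not come from any real analysis.
candidates = [
    Hypothesis("H-NULL", novelty=1, plausibility=4, testability=5, impact=2),
    Hypothesis("H-MECH", novelty=3, plausibility=4, testability=4, impact=4),
    Hypothesis("H-CONF", novelty=2, plausibility=3, testability=5, impact=4),
    Hypothesis("H-REV", novelty=5, plausibility=2, testability=2, impact=5),
]

ranked = rank(candidates)
priority_ids = [h.hid for h in ranked if h.priority_test]

for h in ranked:
    flag = " [PRIORITY TEST]" if h.priority_test else ""
    print(f"{h.hid}: {h.rank_score}{flag}")
```

Because the score is a product, a single low axis pulls the whole score down sharply: the hypothetical H-REV above scores 5 on both novelty and impact yet still ranks below H-CONF. The sketch only sorts; the output layout in step 30 additionally lists H-NULL first before the ranked order.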