npm - @vextlabs/theron-cli - Versions diffs - 0.3.0 → 0.4.0 - Mend

@vextlabs/theron-cli 0.3.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (169) hide show

package/dist/api.d.ts +8 -0
package/dist/api.js +3 -0
package/dist/api.js.map +1 -1
package/dist/auth.js +51 -1
package/dist/auth.js.map +1 -1
package/dist/banner.js +3 -2
package/dist/banner.js.map +1 -1
package/dist/checkpoints.d.ts +32 -0
package/dist/checkpoints.js +61 -0
package/dist/checkpoints.js.map +1 -0
package/dist/index.js +59 -4
package/dist/index.js.map +1 -1
package/dist/input.d.ts +61 -0
package/dist/input.js +574 -0
package/dist/input.js.map +1 -0
package/dist/profiles/index.js +5 -0
package/dist/profiles/index.js.map +1 -1
package/dist/profiles/methodologies/operate_domains.d.ts +8 -0
package/dist/profiles/methodologies/operate_domains.js +1239 -0
package/dist/profiles/methodologies/operate_domains.js.map +1 -0
package/dist/profiles/seeds.js +57 -36
package/dist/profiles/seeds.js.map +1 -1
package/dist/receipt.d.ts +17 -0
package/dist/receipt.js +46 -0
package/dist/receipt.js.map +1 -0
package/dist/render.d.ts +4 -1
package/dist/render.js +95 -28
package/dist/render.js.map +1 -1
package/dist/repl.d.ts +8 -1
package/dist/repl.js +420 -62
package/dist/repl.js.map +1 -1
package/dist/sessions.d.ts +14 -0
package/dist/sessions.js +100 -0
package/dist/sessions.js.map +1 -1
package/dist/ship.d.ts +2 -0
package/dist/ship.js +62 -0
package/dist/ship.js.map +1 -0
package/dist/skills/catalog.d.ts +13 -0
package/dist/skills/catalog.js +86 -0
package/dist/skills/catalog.js.map +1 -0
package/dist/tools/bash.js +81 -14
package/dist/tools/bash.js.map +1 -1
package/dist/tools/edit.js +21 -1
package/dist/tools/edit.js.map +1 -1
package/dist/tools/glob.js +4 -1
package/dist/tools/glob.js.map +1 -1
package/dist/tools/grep.d.ts +5 -0
package/dist/tools/grep.js +101 -2
package/dist/tools/grep.js.map +1 -1
package/dist/tools/index.d.ts +22 -0
package/dist/tools/index.js +177 -41
package/dist/tools/index.js.map +1 -1
package/dist/tools/ls.d.ts +3 -0
package/dist/tools/ls.js +23 -12
package/dist/tools/ls.js.map +1 -1
package/dist/tools/multiedit.d.ts +12 -0
package/dist/tools/multiedit.js +79 -0
package/dist/tools/multiedit.js.map +1 -0
package/dist/tools/stoa.d.ts +1 -1
package/dist/tools/stoa.js +7 -3
package/dist/tools/stoa.js.map +1 -1
package/dist/tools/task.d.ts +9 -0
package/dist/tools/task.js +166 -0
package/dist/tools/task.js.map +1 -0
package/dist/tools/todowrite.d.ts +12 -0
package/dist/tools/todowrite.js +38 -0
package/dist/tools/todowrite.js.map +1 -0
package/dist/tools/webfetch.d.ts +6 -0
package/dist/tools/webfetch.js +98 -0
package/dist/tools/webfetch.js.map +1 -0
package/dist/tools/websearch.d.ts +7 -0
package/dist/tools/websearch.js +83 -0
package/dist/tools/websearch.js.map +1 -0
package/dist/tools/write.js +17 -1
package/dist/tools/write.js.map +1 -1
package/dist/verifiers/confidence_marked.d.ts +2 -0
package/dist/verifiers/confidence_marked.js +49 -0
package/dist/verifiers/confidence_marked.js.map +1 -0
package/dist/verifiers/disclaimer_gate.d.ts +2 -0
package/dist/verifiers/disclaimer_gate.js +57 -0
package/dist/verifiers/disclaimer_gate.js.map +1 -0
package/dist/verifiers/index.d.ts +5 -0
package/dist/verifiers/index.js +20 -7
package/dist/verifiers/index.js.map +1 -1
package/dist/verifiers/lint.js +4 -3
package/dist/verifiers/lint.js.map +1 -1
package/dist/verifiers/promoted_kernels.d.ts +8 -0
package/dist/verifiers/promoted_kernels.js +190 -0
package/dist/verifiers/promoted_kernels.js.map +1 -0
package/dist/verifiers/source_gate.js +2 -3
package/dist/verifiers/source_gate.js.map +1 -1
package/dist/verifiers/test_smoke.js +30 -0
package/dist/verifiers/test_smoke.js.map +1 -1
package/dist/verifiers/types.d.ts +3 -0
package/package.json +4 -2
package/skills/README.md +123 -0
package/skills/ab-test.md +89 -0
package/skills/api-design.md +175 -0
package/skills/architecture-design.md +185 -0
package/skills/business-case.md +77 -0
package/skills/causal-inference.md +77 -0
package/skills/clinical-guideline.md +98 -0
package/skills/code-review.md +98 -0
package/skills/cold-outreach.md +268 -0
package/skills/competitive-teardown.md +223 -0
package/skills/component-spec.md +121 -0
package/skills/content-calendar.md +280 -0
package/skills/contract-review.md +155 -0
package/skills/data-analysis.md +187 -0
package/skills/debug.md +91 -0
package/skills/design-audit.md +121 -0
package/skills/differential-diagnosis.md +79 -0
package/skills/discovery-call.md +206 -0
package/skills/edit-pass.md +80 -0
package/skills/engineering-calc.md +101 -0
package/skills/estimate.md +70 -0
package/skills/experiment-design.md +105 -0
package/skills/fact-check.md +82 -0
package/skills/financial-model.md +104 -0
package/skills/grant-proposal.md +93 -0
package/skills/harmony-analysis.md +93 -0
package/skills/hypothesis-generation.md +99 -0
package/skills/incident-response.md +134 -0
package/skills/interview-loop.md +62 -0
package/skills/job-scorecard.md +92 -0
package/skills/kb-article.md +174 -0
package/skills/launch-plan.md +85 -0
package/skills/lease-review.md +93 -0
package/skills/lesson-plan.md +198 -0
package/skills/literature-review.md +69 -0
package/skills/market-entry.md +137 -0
package/skills/market-sizing.md +159 -0
package/skills/meta-analysis.md +140 -0
package/skills/migrate.md +117 -0
package/skills/optimize.md +88 -0
package/skills/options-strategy.md +166 -0
package/skills/peer-review.md +96 -0
package/skills/pentest-plan.md +193 -0
package/skills/pitch-review.md +132 -0
package/skills/plan.md +88 -0
package/skills/policy-brief.md +124 -0
package/skills/positioning.md +192 -0
package/skills/postmortem.md +168 -0
package/skills/prd.md +105 -0
package/skills/prioritize.md +162 -0
package/skills/proof.md +91 -0
package/skills/property-underwrite.md +159 -0
package/skills/recipe-develop.md +109 -0
package/skills/red-team.md +142 -0
package/skills/refactor.md +58 -0
package/skills/reflection-session.md +115 -0
package/skills/regulatory-compliance.md +136 -0
package/skills/reproduce.md +87 -0
package/skills/runbook.md +344 -0
package/skills/security-audit.md +154 -0
package/skills/seo-brief.md +201 -0
package/skills/sql-query.md +161 -0
package/skills/story-craft.md +163 -0
package/skills/tdd.md +59 -0
package/skills/term-sheet.md +298 -0
package/skills/theory-of-change.md +88 -0
package/skills/threat-model.md +104 -0
package/skills/ticket-triage.md +200 -0
package/skills/tolerance-analysis.md +149 -0
package/skills/training-program.md +151 -0
package/skills/translate.md +64 -0
package/skills/unit-economics.md +238 -0
package/skills/valuation.md +112 -0
package/skills/write-tests.md +77 -0

package/skills/unit-economics.md ADDED Viewed

@@ -0,0 +1,238 @@
+---
+name: unit-economics
+description: Compute and stress-test SaaS/subscription unit economics — contribution margin, CAC, LTV (with churn and discount rate), LTV/CAC ratio, payback period, and cohort-level retention curves — from first principles; flags when assumptions break the model.
+allowed-tools: Read, Bash, Write
+---
+═══ HARD RULES ═══
+1. NEVER state a derived figure (LTV, LTV/CAC, payback) without first surfacing every input assumption explicitly.
+2. NEVER use a blended CAC when paid and organic channels have meaningfully different conversion costs — split them.
+3. NEVER use gross revenue as the revenue base for contribution margin; always subtract payment processing, hosting/serving cost per user, and support cost per user first.
+4. NEVER present a single-point LTV as if it were robust — always show the sensitivity table (churn ±2pp, ARPU ±10%, discount ±2pp).
+5. NEVER conflate logo churn with revenue churn; report both when expansion revenue exists.
+6. NEVER accept "our churn is X%" without asking: monthly or annual? logo or net revenue? cohort-measured or self-reported?
+7. Flag loudly when the LTV/CAC payback period exceeds the median observed customer lifetime — the model is lying to you.
+8. Flag loudly when < 3 full cohort months of data exist — all LTV figures are then projections, not measurements.
+9. DO NOT state benchmark thresholds (e.g., "LTV/CAC of 3x is healthy") as ground truth; label every heuristic as a heuristic and cite its origin or context.
+10. All monetary figures carry currency and period (monthly vs. annual); all rates carry the compounding period explicitly.
+11. When the product offers both monthly and annual billing, NEVER compute a single ARPU across the mixed subscriber base — split by billing cadence, compute ARPU per tier, then weight by subscriber count to get blended ARPU. Annual plan subscribers paying $200/yr contribute $16.67/mo to ARPU, not $200.
+═══ PHASE A — DATA COLLECTION & AUDIT ═══
+A1. Collect raw inputs — ask for or locate:
+    - MRR/ARR at the cohort level (not just aggregate), broken out by billing tier
+      (monthly subscribers × monthly price) + (annual subscribers × annual price / 12)
+    - Monthly subscriber counts by cohort start month, by billing tier
+    - Churned subscribers per cohort per month (logo churn) and churned MRR (revenue churn)
+    - New subscriber counts per channel (paid, organic, referral, outbound)
+    - Total sales & marketing spend per period, broken out by channel where possible
+    - COGS line items: payment processing rate (%), hosting/infra cost per active user/month,
+      customer support cost per ticket × avg tickets/user/month, onboarding/implementation cost per new user
+    - Expansion revenue (upsells, seat adds, plan upgrades) per cohort if applicable
+    - Discount rate / WACC for DCF-style LTV (ask founder; default to 10% annually = 0.797%/mo
+      [(1.10)^(1/12) − 1] if unknown — LABEL IT AS AN ASSUMED DEFAULT)
+    - Payment processor fee schedule (Stripe standard: 2.9% + $0.30/transaction; varies by volume tier)
+A2. Audit data quality:
+    - Confirm churn is cohort-measured: Retention_t = subscribers still active at month t / cohort starting size.
+      "1 − (end period subs / start period subs)" is WRONG — it double-counts re-acquisitions and new-to-churned ratios.
+    - Confirm revenue figures are net of refunds and chargebacks.
+    - Flag any periods with pricing changes, promotional discounts, or one-time enterprise deals that distort ARPU.
+      Segment those cohorts separately — do not blend distorted ARPU into the base case.
+    - Note the number of complete cohort months available; label any extrapolation beyond observed data
+      as PROJECTED (± confidence interval if sample size allows).
+    - Check for annual plan cliffs: annual subscribers who cancel at month 12 show zero logo churn for 11 months
+      then a spike — this distorts monthly churn curves. Track annual subscribers separately.
+═══ PHASE B — CONTRIBUTION MARGIN ═══
+B1. Compute ARPU by billing tier first:
+    ARPU_monthly_tier  = total MRR from monthly subscribers / count of monthly subscribers
+    ARPU_annual_tier   = (total ARR from annual subscribers / 12) / count of annual subscribers
+    ARPU_blended       = total MRR / total active subscribers   [use only for blended CM; never mix tiers in churn math]
+B2. Compute Gross Margin per unit (use ARPU_blended or per-tier as appropriate):
+    COGS_per_user = (payment_processing_rate × ARPU_gross) + per_transaction_fee_amortized
+                  + hosting_cost_per_user_per_month
+                  + support_cost_per_user_per_month
+                  + amortized_onboarding_cost_per_user_per_month
+    Gross_Margin_per_user = ARPU_gross − COGS_per_user
+    Gross_Margin_%        = Gross_Margin_per_user / ARPU_gross
+    Note on hosting cost: for AI/LLM-backed products, serving cost per user can dwarf payment processing.
+    Measure actual inference cost per active user per month; do not estimate.
+B3. Compute Contribution Margin (subtract per-unit variable sales costs if any):
+    CM_per_user = Gross_Margin_per_user − variable_sales_cost_per_user
+    Fixed S&M spend (brand advertising, salaries, tooling) goes into CAC calculation, not CM.
+B4. BREAK FLAG: if CM_per_user < 0, the unit is structurally unprofitable at any scale.
+    Stop. Fix pricing or COGS before any LTV or payback calculation — those numbers are meaningless on a negative CM base.
+═══ PHASE C — CAC BY CHANNEL ═══
+C1. For each acquisition channel c:
+    CAC_c = total_spend_c_in_window / new_paying_customers_acquired_via_c_in_window
+    Use the same time window for spend and customers. Lag-adjust when sales cycles exceed 1 month:
+    if avg sales cycle = L months, attribute spend from month (t − L) to customers acquired in month t.
+C2. Blended CAC (compute only when channel mix has been stable for ≥ 3 months):
+    CAC_blended = total_S&M_spend / total_new_paying_customers
+C3. Fully-loaded CAC vs. acquisition-only CAC:
+    CAC_acquisition = S&M spend only (as above)
+    CAC_fully_loaded = CAC_acquisition + amortized_account_management_cost_per_customer_lifetime
+                     + amortized_renewal_cost_per_customer
+    Use CAC_fully_loaded when comparing to LTV; use CAC_acquisition for channel efficiency comparisons.
+C4. Payback Period (months to recover CAC from contribution margin):
+    Payback_months_c = CAC_c / CM_per_user
+    Context-dependent thresholds (heuristics from VC-era SaaS 2015-2022; PLG and consumer products differ):
+      < 12 months  → strong; typically fundable at Series A/B
+      12–18 months → acceptable for B2B SaaS with annual contracts
+      18–24 months → capital-intensive; scrutinize before scaling
+      > 24 months  → structurally risky at most bootstrap or early VC scales
+    LABEL THESE AS HEURISTICS. High-touch enterprise or infrastructure SaaS can sustain longer payback
+    with high NRR; consumer subscription cannot.
+C5. BREAK FLAG: if Payback_months > median observed customer lifetime from cohort data, the CAC model
+    does not close — you are paying to acquire customers who leave before you recover the acquisition cost.
+    No amount of LTV projection fixes this; it requires CAC reduction or churn reduction, not remodeling.
+═══ PHASE D — LTV (COHORT + DCF) ═══
+D1. Cohort retention curve — for each cohort, compute:
+    Retention_t = surviving_subscribers_at_month_t / cohort_starting_size
+    Tabulate through the last observed month. Watch for the "smile curve" (high early churn,
+    then flattening) vs. linear decay — these imply different long-run LTV.
+D2. Derive monthly churn rate per cohort:
+    churn_monthly_t = 1 − (Retention_t / Retention_{t-1})   for each observed month t
+    Average across observed months for a summary figure.
+    If data is sparse (< 6 months), fit a geometric decay: Retention_t = R0 × (1 − churn_monthly)^t
+    Solve for churn_monthly using least-squares on ln(Retention_t) vs. t — this is the geometric fit.
+    Do not use a single-period churn rate from a high-activity month as the long-run estimate.
+D3. Simple LTV (constant churn, no expansion, no discounting):
+    LTV_simple = CM_per_user / churn_monthly
+    This is the infinite-horizon expected discounted value at discount rate = 0.
+    Label it as such. It overstates LTV whenever the discount rate is material (> 5% annually).
+D4. DCF-adjusted LTV (preferred formula):
+    discount_rate_monthly = (1 + annual_discount_rate)^(1/12) − 1
+    LTV_dcf = CM_per_user / (churn_monthly + discount_rate_monthly)
+    This is valid when churn is stationary and CM is constant. Both conditions must be audited (Phase F).
+D5. LTV with expansion revenue (when NRR > 100%):
+    expansion_rate_monthly = net monthly MRR growth rate among surviving customers
+                           = (expansion_MRR − contraction_MRR) / MRR_from_surviving_customers
+    Note: use NET expansion (gross upsells minus downgrades on surviving accounts); do not use gross upsell rate.
+    LTV_expansion = CM_per_user × (1 + expansion_rate_monthly) / (churn_monthly + discount_rate_monthly − expansion_rate_monthly)
+    This closed-form applies when expansion is proportional to remaining MRR (geometric growth per survivor).
+    Cap projection at 60 months unless > 48 months of cohort data are observed.
+    BREAK FLAG: if expansion_rate_monthly ≥ (churn_monthly + discount_rate_monthly), the formula diverges —
+    the model implies infinite LTV, which is not a result; it means the growth assumption exceeds the discount+churn
+    floor. Cap at 60-month finite sum in this case and state the cap explicitly.
+D6. BREAK FLAG — LTV_dcf is not meaningful when:
+    churn_monthly is so high that (churn_monthly + discount_rate_monthly) >> CM_per_user / ARPU_gross,
+    meaning customers leave so fast that no realistic CM accumulates before departure.
+    Operationally: if LTV_dcf < 2 × CAC, treat the model as a capital-destruction signal, not a valuation input.
+    The structural fix is churn reduction, not formula adjustment.
+═══ PHASE E — LTV/CAC & SENSITIVITY ═══
+E1. Core ratio:
+    LTV_CAC = LTV_dcf / CAC_blended   (also compute per-channel: LTV_dcf / CAC_c)
+    Heuristic thresholds (VC SaaS convention, 2015-2022 era — label as such; consumer and PLG differ):
+      < 1x  → destroying value on every customer acquired
+      1–3x  → marginal; scrutinize payback and burn rate
+      3–5x  → healthy range for B2B SaaS with moderate growth spend
+      > 5x  → either underinvesting in growth, CAC is understated, or the channel is not scalable at current volume
+E2. Sensitivity table — vary the three highest-uncertainty inputs simultaneously:
+    | Scenario           | Churn_mo | ARPU_gross | Discount_ann | LTV_dcf | LTV/CAC | Payback_mo |
+    |--------------------|----------|------------|--------------|---------|---------|------------|
+    | Base case          |  X%      |  $Y        |  Z%          |  $...   |  ...x   |  ...       |
+    | Churn +2pp         |  X+2%    |  $Y        |  Z%          |  $...   |  ...x   |  ...       |
+    | Churn −2pp         |  X−2%    |  $Y        |  Z%          |  $...   |  ...x   |  ...       |
+    | ARPU −10%          |  X%      |  $Y×0.9    |  Z%          |  $...   |  ...x   |  ...       |
+    | Discount +2pp ann  |  X%      |  $Y        |  Z+2%        |  $...   |  ...x   |  ...       |
+    | Bear (all adverse) |  X+2%    |  $Y×0.9    |  Z+2%        |  $...   |  ...x   |  ...       |
+E3. Identify the single input that moves LTV/CAC the most (the dominant lever).
+    This is the one variable the operator should instrument and optimize first — state it explicitly.
+═══ PHASE F — COHORT VIEW & MODEL HEALTH ═══
+F1. Cohort aging view — tabulate per-cohort LTV_dcf at months 1, 3, 6, 12, 24:
+    Younger cohorts showing materially worse retention than older ones at the same age signals either
+    product-market fit regression (engagement is degrading) or channel quality decay (you exhausted
+    the high-intent early adopters and are now acquiring lower-intent users). Do not average across cohorts
+    to hide this trend.
+F2. Check NRR (Net Revenue Retention) if expansion revenue exists:
+    NRR = MRR from cohort at month T (including expansion, minus churned and contracted MRR)
+          / MRR from same cohort at month 0
+    NRR > 100%: LTV grows even without new customer acquisition — structurally different and more durable
+    than NRR < 100% businesses. An NRR > 100% with healthy CAC payback is the condition under which
+    aggressive growth spend is unambiguously justified.
+F3. Check for churn stationarity (required assumption for LTV_dcf):
+    Compute churn_monthly per cohort per month. If the average churn rate in months 1-3 is more than
+    2× the average in months 7-12, churn is non-stationary (high early attrition that stabilizes).
+    In this case, LTV_simple and LTV_dcf both overstate LTV — use a finite-horizon cohort sum instead.
+F4. Magic Number (sales efficiency; growth-stage diagnostic):
+    Magic_Number = (MRR_this_quarter − MRR_last_quarter) × 4 / S&M_spend_last_quarter
+    Thresholds (heuristic, B2B SaaS baseline — PLG and consumer models differ significantly):
+      > 0.75  → efficient; generally safe to increase S&M spend
+      0.5–0.75 → review channel mix before scaling
+      < 0.5   → re-examine offer, ICP, or channel before accelerating spend
+    Label as a heuristic. High-velocity PLG can show Magic Numbers > 2 at small scale that compress
+    as the addressable high-intent pool depletes.
+F5. Burn Multiple (capital efficiency):
+    Burn_Multiple = net_cash_burned_in_period / net_new_ARR_added_in_period
+    < 1x   → excellent capital efficiency
+    1–1.5x → good
+    > 2x   → concerning at early stage; investigate whether S&M spend is ahead of retention infrastructure
+F6. BREAK FLAGS — model validity conditions:
+    - Cohort churn is non-stationary (accelerating or high-then-stable) → use finite cohort sum, not LTV_dcf
+    - < 3 full cohort months available → label ALL LTV as projections with wide uncertainty; do not present as measurements
+    - Pricing changed mid-cohort → segment cohorts by pricing tier before computing ARPU; do not blend
+    - Mixed billing cadences (monthly + annual) not separated → ARPU and churn curves are both distorted
+    - CAC attribution lag > 3 months (long enterprise sales cycle) → payback arithmetic must shift attribution window
+═══ PHASE G — OUTPUT & RECOMMENDATIONS ═══
+G1. Deliver a one-page summary table:
+    | Metric              | Value      | Data Quality       | Key Assumption                      |
+    |---------------------|------------|--------------------|-------------------------------------|
+    | ARPU (gross, blend) |            |                    |                                     |
+    | ARPU (monthly tier) |            |                    |                                     |
+    | ARPU (annual tier)  |            |                    |                                     |
+    | COGS / user / mo    |            |                    |                                     |
+    | CM / user / mo      |            |                    |                                     |
+    | GM%                 |            |                    |                                     |
+    | Blended CAC         |            |                    |                                     |
+    | Payback (months)    |            |                    |                                     |
+    | Monthly churn       |            |                    |                                     |
+    | LTV (DCF)           |            |                    |                                     |
+    | LTV/CAC             |            |                    |                                     |
+    | NRR                 |            | (if available)     |                                     |
+    | Magic Number        |            | (if available)     |                                     |
+    | Burn Multiple       |            | (if available)     |                                     |
+G2. State explicitly:
+    (a) which figures are measured from observed cohort data vs. projected beyond observed data
+    (b) the dominant sensitivity lever (from Phase E3)
+    (c) whether churn is stationary (from Phase F3) — if not, note which LTV formula was used and why
+    (d) the single highest-priority operational fix implied by the numbers
+G3. Never end without naming the break condition most likely to invalidate this model for this specific business.
+    The break condition must be specific to the inputs provided — not a generic disclaimer.
+KEY PRINCIPLE: Unit economics are only as honest as the cohort data and assumption transparency behind them — a beautiful LTV/CAC ratio built on blended CAC, self-reported monthly churn from an annual-plan-heavy base, and zero COGS allocation is a fiction that delays the reckoning, not prevents it. The model's job is to surface the reckoning early.

package/skills/valuation.md ADDED Viewed

@@ -0,0 +1,112 @@
+---
+name: valuation
+description: Value a private or public company via DCF (WACC build-up, explicit FCF, terminal value) and trading/transaction comparable multiples, then reconcile the two into a single defensible range with a sensitivity table on the two highest-impact drivers; invoked when the task is to price a business, assess a deal, build a pitch-book range, or stress-test an offer price.
+allowed-tools: Read, Bash, Write
+---
+═══ HARD RULES ═══
+1. NEVER present a point estimate as fact — every output number is LABELED [ESTIMATE] or [SOURCED: <origin>].
+2. DCF and comps are two INDEPENDENT lenses; reconcile them explicitly; never silently average.
+3. WACC must be built bottom-up — no plugging in a "felt" discount rate.
+4. Terminal value is the largest single line in any DCF; its assumptions must be shown, not buried.
+5. Comp multiples must be sourced (public filings, terminal, CapIQ, etc.) — fabricated peer data is grounds to abort.
+6. Sensitivity table must test the two drivers with the HIGHEST partial derivative on equity value; choosing low-impact drivers to make the range look tight is prohibited.
+7. All FCF projections must be grounded in stated assumptions about revenue growth, margin expansion, capex intensity, and working-capital dynamics — no black-box plug figures.
+8. If a required input (e.g., risk-free rate, peer EBITDA multiples, target capital structure) cannot be sourced, STOP and request it rather than fabricate it.
+9. NOLs and deferred tax assets must be identified in Phase A and modeled explicitly in Phase C — do not silently assume a full statutory tax rate for a loss-making or recently turned-profitable company.
+═══ PHASE A — BUSINESS CHARACTERIZATION ═══
+A1. Identify the company: stage (early/growth/mature/distressed), sector, geography, and primary revenue model (recurring SaaS, transactional, project, product). Stage affects which valuation method dominates: pre-revenue → VC multiples or revenue-based; pre-EBITDA → EV/Revenue or EV/ARR; FCF-mature → DCF-primary.
+A2. Clarify the valuation PURPOSE: strategic M&A, minority investment, fairness opinion, IPO pricing, or internal budget — the purpose determines which standard of value (FMV, strategic, intrinsic) and whether a control premium or minority discount applies.
+A3. Collect the minimum viable data set:
+    — 3–5 years of historicals (revenue, gross profit, EBITDA, D&A, capex, ΔWC, net debt, share count)
+    — Management projections if available (label as [MGMT CASE — not independently verified])
+    — Capital structure (debt tranches with maturity and coupon, preferred stock with liquidation preference, options/warrants with strike prices for treasury-stock method)
+    — NOL carryforwards and deferred tax asset balance (if any) — these reduce effective future tax rates and inflate FCF; must be modeled not assumed away
+    — Any known synergies or dis-synergies if strategic context
+A4. State what is MISSING and what assumption fills the gap; do not proceed silently.
+═══ PHASE B — DCF: WACC BUILD-UP ═══
+B1. RISK-FREE RATE: Use the current on-the-run 10-year sovereign yield (USD = US Treasury; EUR = German Bund; etc.) [SOURCED: specify date and source]. Do not use a stale rate from memory.
+B2. EQUITY RISK PREMIUM (ERP): Use a current market-implied or survey-based ERP (e.g., Damodaran's annual implied ERP for the relevant market, updated each January) [SOURCED]. Do not use a static "5%" from memory.
+B3. BETA:
+    — For public companies: 2-year weekly OLS beta vs. local broad index, then re-lever to target capital structure.
+    — For private companies: identify 5–8 pure-play public comps → unlevered betas via Hamada: βu = βl / [1 + (1−T)(D/E)] → take median βu → re-lever at subject's target D/E → [ESTIMATE]. If the private company has a meaningfully different revenue model than any public comp, note the limitation.
+B4. SIZE / SPECIFIC RISK PREMIUM:
+    — Size premium: for subject equity value < ~$2B, add a size premium from Duff & Phelps / Kroll Cost of Capital Navigator decile data (based on NYSE/AMEX/NASDAQ market cap deciles). NOTE: these are derived from public-market data; for a private company, add a further illiquidity premium of 1–3% (state the basis for your choice) or use a total-beta approach.
+    — Company-specific risk: add explicit bps for each of: customer concentration (single customer >20% of revenue), key-man dependency, IP / regulatory risk, geographic risk. Document each add-on separately.
+B5. COST OF EQUITY: Ke = Rf + β × ERP + size premium + specific risk [ESTIMATE, show full arithmetic line by line].
+B6. COST OF DEBT: Use current marginal borrowing rate (yield on existing bonds or comparable credit-rated new issuance), after-tax: Kd × (1 − marginal tax rate). If the company has NOLs that make the tax shield non-current, use a lower effective rate.
+B7. TARGET CAPITAL STRUCTURE: Use industry-median D/(D+E) from the comp set (market-value weights, not book) or a stated post-deal target; document rationale; avoid using current structure if it is transitory (e.g., post-LBO leverage burning down).
+B8. WACC = Ke × E/(D+E) + Kd(1−T) × D/(D+E) [ESTIMATE, show full arithmetic].
+═══ PHASE C — DCF: EXPLICIT FORECAST PERIOD ═══
+C1. Build a 5–10 year explicit FCF model (5 years minimum; extend to 10 if the company is pre-peak-margin, in a long investment cycle such as infrastructure or pharma, or if the terminal year would otherwise have an anomalous FCF that distorts terminal value).
+C2. For each year state:
+    — Revenue growth rate [ESTIMATE with assumption: e.g., "15% YoY driven by stated contract backlog of $X plus assumed market-share gain of Y bps in a $Z TAM"]
+    — EBITDA margin (bridge from current to terminal: explain each margin lever — volume-driven scale, pricing power, product mix shift, opex leverage; do not simply interpolate silently)
+    — D&A (tie to the opening PP&E balance and the capex schedule; D&A cannot exceed the gross asset base)
+    — Capex (split maintenance capex [typically 1–3% of revenue for asset-light businesses; higher for industrials] from growth capex; tie growth capex to the revenue growth assumption)
+    — Change in net working capital (NWC as % of incremental revenue, benchmarked to the comp set; flag if the company has negative NWC like SaaS deferred revenue — this is a source of cash, not a use)
+C3. TAX MODELING: If the company has NOL carryforwards, model the tax shield explicitly year by year (NOL usage limited to taxable income; TCJA caps federal usage at 80% of taxable income per year; state-level treatment varies). Effective cash tax rate may be 0–15% in early years even at full profitability. Do not apply a flat 25–28% statutory rate and move on.
+C4. Unlevered FCF = EBIT × (1−effective cash tax rate) + D&A − Capex − ΔNWC.
+C5. Discount each year's FCF at WACC using mid-year convention (year 1 discount factor = 1/(1+WACC)^0.5) unless year-end cash flows are explicitly justified (e.g., a single annual contract payment).
+═══ PHASE D — TERMINAL VALUE ═══
+D1. GORDON GROWTH MODEL (primary): TV = FCF_n+1 / (WACC − g), where g = long-run nominal GDP growth of the primary operating market [ESTIMATE; state source for GDP growth — e.g., IMF WEO, CBO]. For a US-domiciled business, g is typically 2.0–2.5%. Never set g ≥ WACC (mathematical singularity). Never set g above the long-run expected GDP growth of the subject's primary market — this implies the company eventually becomes larger than the economy.
+D2. EXIT MULTIPLE CHECK (sanity): TV_alt = EBITDA_n × median EV/EBITDA of mature comps (not current high-multiple growth comps — use where the subject will be in steady state). Compare to GGM TV; if they diverge >25%, investigate the driver (g too high, WACC too low, or the peer set is incorrectly specified at terminal year).
+D3. Show terminal value as a % of total enterprise value — if >75%, flag it explicitly and either expand the explicit period or revise g downward. A TV >75% of EV means the analysis is essentially a one-number bet on g and WACC, not a cash-flow model.
+D4. Enterprise Value (DCF) = PV of FCFs + PV of TV + Non-operating assets (excess cash, investments, real estate held separately) − Non-operating liabilities.
+D5. Equity Value (DCF) = EV − Net Debt − Preferred (at liquidation value) − Minority Interest; divide by fully diluted share count (treasury-stock method for all in-the-money options and warrants using the current share price or midpoint valuation — iterate if necessary).
+═══ PHASE E — COMPARABLE COMPANY MULTIPLES ═══
+E1. COMP SET SELECTION — criteria (all must be met):
+    — Same primary revenue model (recurring vs. transactional vs. product vs. usage-based)
+    — Same or adjacent sector/sub-sector (2-digit GICS level minimum; 4-digit preferred)
+    — Comparable scale (within 0.3×–3× of subject on revenue) — outliers may be retained if noted and excluded from the applied range
+    — At least 4 comps; 6–10 preferred; if fewer than 4 exist, state this explicitly and explain why (niche sector, no pure-plays), and weight the DCF more heavily in Phase F
+E2. TIMING CONVENTIONS — use NTM multiples when comps have different fiscal year-ends (NTM normalizes to a common 12-month forward window and avoids mixing LTM periods that span different macro environments). Use LTM only when forward estimates are unavailable or unreliable (e.g., distressed companies with wide analyst dispersion). State which convention is used and apply it consistently across the entire comp set.
+E3. MULTIPLES TO COLLECT per comp [SOURCED]:
+    — EV/Revenue (use for pre-profit or high-growth companies where EBITDA is negative or non-comparable)
+    — EV/EBITDA (use for FCF-mature companies; adjust for stock-based compensation if material — SBC-adjusted EBITDA is preferred for tech comps)
+    — EV/EBIT or P/E (use if capex and D&A distort EBITDA meaningfully, e.g., capital-intensive industrials)
+    — EV/ARR or EV/NTM Revenue (for SaaS/subscription businesses; EV/ARR is preferable when NTM estimates have high dispersion)
+E4. Compute: median, 25th pct, 75th pct for each multiple across the comp set.
+E5. Apply median (base case), 25th pct (bear), 75th pct (bull) to the subject's corresponding metric to yield a comp-implied EV range.
+E6. COMPARABILITY ADJUSTMENTS: If the subject has materially different growth or margin vs. the comp median, apply a premium/discount — use a regression of the anchor multiple vs. NTM growth rate across the comp set (the slope gives the $/% premium per point of growth differential) rather than an intuition-based "we deserve a premium." Document the regression or the explicit adjustment factor.
+E7. TRANSACTION COMPS (if M&A context): layer in precedent transaction multiples (same sector/model criteria + last 3–5 years; exclude transactions during extreme market dislocations unless they are directly relevant). Transaction comps embed a control premium (typically 20–40% over unaffected trading price in public M&A) and a specific deal structure; distinguish them from trading comps and explain why they are or are not appropriate for this context.
+═══ PHASE F — RECONCILIATION ═══
+F1. Present all methods side by side:
+    | Method                          | Bear EV | Base EV | Bull EV |
+    | DCF                             | [E]     | [E]     | [E]     |
+    | Trading Comps                   | [E]     | [E]     | [E]     |
+    | Transaction Comps (if applicable)| [E]    | [E]     | [E]     |
+F2. WEIGHTING RATIONALE — state explicitly why you weight each method:
+    — DCF is primary for companies with visible, stable FCF or where no liquid comp market exists (weight 50–60%).
+    — Trading comps are primary when the market for similar companies is active and the subject is near maturity (weight 40–50%).
+    — Transaction comps are additive in M&A contexts; do not double-count the control premium if it is also reflected in the DCF synergy assumptions.
+    — If DCF and trading comps diverge by >30% at the midpoint, DO NOT average — diagnose the gap: are market multiples embedding a growth trajectory faster than your explicit forecast? Is your WACC out of step with the implied discount rate in market multiples? Resolve or explicitly document the unresolved divergence.
+F3. Output a SINGLE CONCLUDED RANGE with a stated midpoint [ESTIMATE] and the weighting scheme used.
+═══ PHASE G — SENSITIVITY TABLE ═══
+G1. Identify the TWO inputs with the highest impact on equity value (run partial derivatives or a quick tornado by flexing each input ±10% while holding others fixed):
+    — For most mature businesses: WACC and terminal growth rate (g) dominate DCF equity value.
+    — For high-growth businesses: Year 3–5 revenue CAGR or peak EBITDA margin may dominate over g.
+    — For comp-heavy analyses: the anchor multiple (EV/EBITDA or EV/NTM Revenue) and the subject's NTM margin or growth rate.
+    — For NOL-heavy companies: the pace of NOL utilization (taxable income ramp) may rank among the top two drivers — model it if so.
+G2. Build a 5×5 sensitivity table for each driver pair:
+    Rows = Driver 1 (e.g., WACC: base −100bps to base +100bps in 50bps steps)
+    Cols = Driver 2 (e.g., terminal g: 1.5% to 3.5% in 0.5% steps)
+    Cells = implied equity value per share or EV [ESTIMATE]
+G3. Highlight the cell corresponding to your base case. Mark cells representing the bear and bull scenario. The range of the sensitivity table should span at least the bear-to-bull range from Phase F — if it does not, the sensitivity table is not testing the right drivers.
+G4. Label every cell [ESTIMATE] or embed a single label at the table header.
+═══ PHASE H — OUTPUT & DELIVERABLE ═══
+H1. Write a 3–5 sentence executive summary: concluded range, primary method and its weight, the single biggest assumption risk (the one input where being wrong by 1 standard deviation moves equity value by the most), and one watch item specific to this deal (e.g., covenant headroom, customer concentration cliff, pending regulatory decision, earnout structure).
+H2. Provide a structured appendix: WACC build arithmetic (every line), FCF bridge (year-by-year with assumptions stated inline), NOL utilization schedule (if applicable), comp table with sources and the NTM/LTM convention used, sensitivity table.
+H3. If the analysis is for a live deal: state which end of the range each party should anchor in an LOI and why (buyer anchors bear; seller anchors bull; fairness opinion uses midpoint with a stated range; board approving a sale uses the range to assess adequacy).
+H4. Caveat block (mandatory, cannot be omitted):
+    "All figures herein are estimates based on stated assumptions and sourced market data as of [DATE]. They do not constitute a fairness opinion, audited valuation, or investment advice. Material changes in market conditions, company performance, or deal structure may render them obsolete. Users should engage a qualified financial advisor before making investment or transaction decisions."
+KEY PRINCIPLE: A valuation is a structured argument, not a calculation — every number must be traceable to an assumption, every assumption must be defensible, and every defensible assumption must be stress-tested. The goal is not to produce the answer the client wants; it is to produce the answer that would survive scrutiny from the other side of the table.

package/skills/write-tests.md ADDED Viewed

@@ -0,0 +1,77 @@
+---
+name: write-tests
+description: Author comprehensive tests — cover happy path, boundaries, error paths, and invariants; test behavior not implementation; deterministic and isolated; verify the test can actually fail.
+allowed-tools: Read, Write, Edit, Bash, Grep, Glob
+---
+## Phase 0 — Understand what you are testing (do this before opening a new file)
+1. Read the module under test in full. Extract its public surface: exported functions, classes, methods, event emitters, thrown errors. Ignore private helpers — they are implementation details.
+2. State the behavioral contract in plain language: given X input, produce Y output or side-effect; under Z condition, throw W error. Write this down as a comment block at the top of the test file.
+3. Find the test framework: `grep -rE "\"(vitest|jest|mocha|ava|bun)\"|\"test\":" package.json` across the repo. Read one existing test file for import style, assertion library, `beforeEach`/`afterEach` conventions, and file co-location rules.
+4. Find the test runner: check `scripts.test` in `package.json` and any `vitest.config.*` / `jest.config.*`. Run the existing suite once — `npm test` or `npx vitest run` — to confirm it is green before you add anything.
+5. List every input class and boundary you can identify: normal values, zero, one, max, empty string/array/object, null/undefined, wrong type, negative, very large, duplicate, out-of-order. Each class gets at least one test.
+## Phase 1 — Map every case before writing code
+6. Enumerate: (a) happy path(s), (b) boundary values per parameter, (c) error/exception paths, (d) empty/null/undefined inputs, (e) large or adversarial inputs, (f) concurrent or out-of-order calls, (g) idempotency (call twice → same result), (h) invalid or malformed input that must be rejected.
+7. Mark which cases are candidate property-based tests: any function with an algebraic invariant (sort is idempotent, encode→decode = identity, merge is commutative) belongs in a `fc.property` / `fast-check` block instead of a single example.
+8. Flag every external dependency: network, filesystem, clock, randomness, database, message queue. Each one MUST be replaced with a fake/stub at the test boundary — not deep inside the implementation.
+9. Note which tests should be unit (pure logic, < 10 ms), integration (two real modules wired together), or end-to-end. Do not mix levels in the same file.
+## Phase 2 — Author tests
+10. **Arrange-Act-Assert**: every test body follows three clearly separated sections. Arrange: set up all inputs and fakes. Act: call the unit under test exactly once. Assert: check the output or side-effect. No assertion in the Arrange section; no setup in the Assert section.
+11. **One logical assertion per test**: a test may have multiple `.expect()` calls only if they all validate the same behavior. Two unrelated assertions = two tests.
+12. **Descriptive names**: `it("throws RangeError when pageSize is 0", ...)` not `it("test error")`. The name must serve as documentation that passes without being read with the source.
+13. **Happy path first**: write the canonical success case, run it, confirm it passes. This is your baseline.
+14. **Boundaries next**: for every numeric parameter write tests at 0, 1, max, max+1. For strings: empty string, single char, max length, max+1 length, unicode. For arrays: [], [item], [max items].
+15. **Error paths**: assert the exact error type and message fragment thrown. Use `expect(() => fn()).toThrow(RangeError)` not a try/catch that swallows the error class.
+16. **Rejection and async**: for Promises, test both resolution and rejection with `await expect(promise).rejects.toThrow(...)`. Never leave a floating unhandled-rejection.
+17. **Null/undefined/missing**: call the function with missing optional args, null, undefined, and empty objects. Assert it either handles gracefully or throws a typed error — never silently corrupts state.
+18. **Idempotency**: if the function claims to be idempotent, call it twice with the same input and assert the output is identical and any side-effects are not doubled.
+19. **Concurrency**: if the module manages shared state, write a test that fires N calls simultaneously (Promise.all) and asserts no race corruption.
+20. **Property-based** (where algebraic invariant exists): wrap in a fast-check `fc.assert(fc.property(...))` block. Combine with an explicit fixed example so the failure message stays readable.
+## Phase 3 — Mocking discipline
+21. Mock ONLY at true external boundaries: HTTP calls, filesystem reads/writes, clock (`Date.now`, `setTimeout`), randomness (`Math.random`), message queues. Never mock a module in the same repo that you could simply call.
+22. Prefer dependency injection over module-level monkey-patching. If the unit under test accepts a `fetch` parameter or a `clock` interface, pass a fake object in tests.
+23. Never mock the return value of the function under test. That tests the mock, not the code.
+24. Reset all mocks in `beforeEach` / `afterEach`. Never let mock call counts or return values bleed between tests.
+25. If a mock is more than ~10 lines, extract it to a `__fixtures__/` or `_helpers/` file co-located with the tests.
+## Phase 4 — Determinism and isolation
+26. No `Math.random()`, `Date.now()`, or real `setTimeout` in test code. Stub them or use a fake clock (e.g. `vi.useFakeTimers()`).
+27. No shared mutable state at module scope between tests. Use `beforeEach` to re-initialize, not a `let` assigned once in `describe`.
+28. Tests must not depend on execution order. Run the suite with `--shuffle` at least once to verify.
+29. No real network, real DB, or real filesystem in unit tests. Use an in-memory adapter or `memfs`.
+30. Each test must be runnable in under 200 ms in isolation. Anything slower belongs in an integration or e2e suite with its own tag.
+## Phase 5 — Verify the test can actually fail
+31. After writing each test, **break the implementation**: comment out the line of production code the test targets, or invert the condition. Run the test. It MUST go red. If it stays green, it is testing nothing — rewrite it.
+32. Restore the production code, confirm green, then move on. This step is non-negotiable.
+33. For property-based tests: insert a known-bad input into the generator and verify the test catches it before enabling shrinking.
+## Phase 6 — Regression tests
+34. Every bug fixed gets a regression test first, before the fix. The test must reproduce the exact reported behavior and go red against the unfixed code.
+35. Name regression tests with the issue/PR reference: `it("does not hang when header is missing (fixes #412)", ...)`.
+36. Never close a bug without a test that would have caught it.
+## Phase 7 — Run and validate coverage
+37. Run the full suite: `npm test` (or `npx vitest run --reporter=verbose`). All tests green before declaring done.
+38. If the project has a coverage tool, run it: `npx vitest run --coverage`. Aim for 100% branch coverage on the public surface; flag any uncovered branch with a `// TODO(test):` comment and a reason.
+39. A test that always passes — even when you delete the code under test — is worse than no test. Delete it or fix it.
+## Hard rules (never break)
+- Test behavior, not implementation. Assert outputs and side-effects, not which internal helpers were called.
+- Never comment out or delete a failing test to make CI green. Fix the code, or mark `todo` with a dated justification.
+- No snapshot tests for logic — snapshot only stable, deterministic rendered output, and only when the team has a workflow to review snapshot diffs.
+- Mocks must be minimal. If your mock is smarter than your implementation, you are testing the wrong thing.
+- If a test is flaky (intermittently red), stop everything and fix it. Flaky tests are trust-destroyers that corrupt the entire suite's signal.
+- Run the suite in CI mode (`CI=true npm test`) at least once to catch environment-sensitive failures before pushing.