crudecode-valuation 0.2.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,146 @@
1
+ # crude_valuation
2
+
3
+ In-package cookbook the inner valuation_agent reads at runtime. Walks
4
+ through the three-phase workflow and the diagnostic helpers.
5
+
6
+ The package has two subsystems:
7
+
8
+ - **`forecast/`** — produce `(forecasts_oil, forecasts_gas)`. Three
9
+ diagnostic helpers (`cohort_summary`, `fit_quality`,
10
+ `target_vs_cohort`) let you inspect the cohort and target wells before
11
+ committing to forecast parameters.
12
+ - **`econ/`** — turn forecasts into money. Revenue → cashflow → NPV.
13
+
14
+ Forecast and econ are deliberately separate. The forecast step's
15
+ deliverable is `(forecasts_oil, forecasts_gas)` — that tuple is the input
16
+ to the econ step. There is no fused entry point.
17
+
18
+ ## Imports (flat)
19
+
20
+ ```python
21
+ from crude_valuation import (
22
+ # Forecast types and primitives
23
+ DeclineCurve, Forecast, ForecastProvenance,
24
+ fit_curve, percentile_curves, blend_curves, curve_rate,
25
+ project, aggregate,
26
+ peek_well, load_production, find_analogs,
27
+ # Diagnostics
28
+ cohort_summary, fit_quality, target_vs_cohort,
29
+ CohortSummary, FitQuality, TargetComparison, ParamStats,
30
+ # Econ
31
+ compute_gross_revenue, compute_net_cashflow, npv,
32
+ # Briefing assembly
33
+ build_minimal_briefing,
34
+ )
35
+ ```
36
+
37
+ ## Workflow — EXPLORE → DECIDE → EXECUTE
38
+
39
+ Every `/tmp/valuation.py` you write is structured into three phases marked
40
+ by comment headers. You think like a petroleum engineer: inspect, decide,
41
+ execute. Document the calls you made in plain English so a reader of the
42
+ briefing can audit them.
43
+
44
+ ### EXPLORE — load data, fit analogs, print diagnostics
45
+
46
+ ```python
47
+ import numpy as np
48
+ from crude_valuation import (
49
+ load_production, find_analogs, fit_curve, peek_well,
50
+ percentile_curves,
51
+ cohort_summary, fit_quality, target_vs_cohort,
52
+ )
53
+
54
+ prod = load_production(WELL_APIS, run_sql)
55
+ analog_apis = find_analogs(ANALOG_FILTER, run_sql, exclude=WELL_APIS)
56
+
57
+ analog_curves = []
58
+ for a in analog_apis:
59
+ try:
60
+ analog_prod = load_production([a], run_sql)
61
+ q = np.array(analog_prod["oil_bbl"])
62
+ months = np.arange(len(q), dtype=float)
63
+ meta = peek_well(a, run_sql)
64
+ analog_curves.append(
65
+ fit_curve(months, q, stream="oil", lateral_norm_ft=meta.lateral_ft, b_fixed=0.8)
66
+ )
67
+ except ValueError:
68
+ continue # skip thin / un-fittable analogs
69
+
70
+ print(cohort_summary(analog_curves)) # AGENT READS THIS
71
+ cohort_median = percentile_curves(analog_curves, pct=0.5, norm_to_lateral_ft=10_000.0)
72
+
73
+ for api in WELL_APIS:
74
+ meta = peek_well(api, run_sql)
75
+ target_prod = load_production([api], run_sql)
76
+ target_q = np.array(target_prod["oil_bbl"])
77
+ target_months = np.arange(len(target_q), dtype=float)
78
+ print(api, target_vs_cohort(cohort_median, target_months, target_q, meta.lateral_ft))
79
+ ```
80
+
81
+ For PDP candidates (wells you'll fit on their own history), also print
82
+ `fit_quality(own_curve, months, q)` so you can gate fits that rode the
83
+ b bound.
84
+
85
+ ### DECIDE — write a comment block per well
86
+
87
+ ```python
88
+ # === DECIDE ===
89
+ # Cohort: 15 analogs after dropping 3 b-upper-bound riders. b median 0.92
90
+ # (IQR 0.31). qi median 14,200 bbl/mo at 9,800 ft lateral.
91
+ #
92
+ # Target 42-329-12345 (Wolfcamp A, 9,400 ft):
93
+ # 36 months observed, qi_ratio 1.04 -> in line with cohort
94
+ # Choice: pdp; fit own decline with b=0.9 (cohort median, dropped outliers)
95
+ #
96
+ # Target 42-329-12999 (PERMITTED, 10,200 ft):
97
+ # no history; planned spud 2026-08
98
+ # Choice: pure_analog with cohort curve at target lateral
99
+ ```
100
+
101
+ ### EXECUTE — inline the per-well calls, then run econ
102
+
103
+ No router constants. No `THIN_HISTORY_THRESHOLD`. No `B_FIXED`. Each
104
+ well's strategy is whatever the DECIDE block said.
105
+
106
+ ```python
107
+ forecasts_oil, forecasts_gas = [], []
108
+ for api, choice in CHOICES.items():
109
+ if choice["strategy"] == "pdp":
110
+ forecasts_oil.append(_build_pdp(api, b=choice["b"], stream="oil", ...))
111
+ elif choice["strategy"] == "thin_blend":
112
+ forecasts_oil.append(_build_thin(api, cohort_median_oil, ...))
113
+ else: # pure_analog
114
+ forecasts_oil.append(_build_pure_analog(api, cohort_median_oil, ...))
115
+ # same for gas
116
+
117
+ revenue = compute_gross_revenue(forecasts_oil, forecasts_gas, price_deck=...)
118
+ cashflow = compute_net_cashflow(revenue, interest_type=INTEREST_TYPE, ...)
119
+ npv_dict = npv(cashflow, discount_rates=[0.08, 0.10, 0.12, 0.15])
120
+
121
+ spec = build_minimal_briefing(headline=..., commentary=..., npv=npv_dict, ...)
122
+ persist(spec, persist_url="$persist_url")
123
+ ```
124
+
125
+ Commentary must reference at least one concrete diagnostic you looked at
126
+ (e.g. "cohort of 14 analogs, b median 0.92 with IQR 0.31; target's first
127
+ 4 months at 71% of cohort qi suggests below-average completion").
128
+
129
+ ## Playbooks
130
+
131
+ `crude_valuation/forecast/examples/` ships annotated playbooks. Today
132
+ only `forecast_mixed.py` is in the new playbook form; the other two are
133
+ still recipe-style and will be migrated.
134
+
135
+ ## Diagnostic helpers in detail
136
+
137
+ - `cohort_summary(curves)` — distribution stats per fitted parameter
138
+ (qi, di, b, terminal, lateral). Plus `notes` flagging fits that rode
139
+ the b bound.
140
+ - `fit_quality(curve, months, q)` — R^2, RMSE, bound-riding flags, max
141
+ residual %, early/late signed-% bias windows for a single fit.
142
+ - `target_vs_cohort(cohort, target_months, target_q, target_lateral_ft)` —
143
+ qi_ratio of target to cohort, peak detection, per-month signed-%
144
+ residual vs cohort placed at target lateral.
145
+
146
+ None of these decide anything. You read them and choose.
@@ -0,0 +1,51 @@
1
+ """crude-valuation — valuation toolkit for the inner valuation_agent.
2
+
3
+ Wraps crude_analyst (the base briefing platform) and adds domain math.
4
+ The agent only ever imports from crude_valuation; crude_analyst is invisible.
5
+ """
6
+ from crude_analyst import * # noqa: F401, F403
7
+
8
+ # Econ subsystem
9
+ from crude_valuation.econ import ( # noqa: F401
10
+ compute_gross_revenue,
11
+ compute_net_cashflow,
12
+ npv,
13
+ )
14
+
15
+ # Shared utilities
16
+ from crude_valuation.compose import build_minimal_briefing # noqa: F401
17
+
18
+ # Forecast subsystem — flat re-exports for the agent
19
+ from crude_valuation.forecast import ( # noqa: F401
20
+ # Types
21
+ CohortSummary,
22
+ CurveProvenance,
23
+ DeclineCurve,
24
+ FitQuality,
25
+ Forecast,
26
+ ForecastProvenance,
27
+ ParamStats,
28
+ TargetComparison,
29
+ WellMeta,
30
+ # Primitives
31
+ fit_curve,
32
+ percentile_curves,
33
+ blend_curves,
34
+ curve_rate,
35
+ # Diagnostics
36
+ cohort_summary,
37
+ fit_quality,
38
+ target_vs_cohort,
39
+ # Projection + aggregation
40
+ project,
41
+ aggregate,
42
+ # DB helpers
43
+ peek_well,
44
+ load_production,
45
+ load_well_status,
46
+ resolve_asset_list,
47
+ find_analogs,
48
+ # Errors
49
+ AnalogError,
50
+ WellsError,
51
+ )
@@ -0,0 +1,104 @@
1
+ """Case file parsing for the valuation agent.
2
+
3
+ Validates the JSON contract the MCP tool sends, returns a typed CaseFile
4
+ the rest of the package can read without re-checking shapes.
5
+ """
6
+ from dataclasses import dataclass, field
7
+ from typing import Any
8
+
9
+
10
+ class CaseFileError(ValueError):
11
+ """Raised when the case file fails schema validation."""
12
+
13
+
14
+ @dataclass
15
+ class CaseFile:
16
+ interest_type: str # "wi" | "minerals"
17
+ interest: dict # {wi_pct, nri_pct} or {decimal}
18
+ asset_list: dict # {well_apis: [...]} XOR {filter_sql: "..."}
19
+ economics_overrides: dict = field(default_factory=dict)
20
+ transcript: list = field(default_factory=list)
21
+ queries_run: list = field(default_factory=list)
22
+ handoff: str = ""
23
+ source_documents: list = field(default_factory=list)
24
+
25
+
26
+ def parse_case_file(body: dict) -> CaseFile:
27
+ """Validate + parse the case file body. Raises CaseFileError on any failure."""
28
+ if not isinstance(body, dict):
29
+ raise CaseFileError(f"case file must be an object, got {type(body).__name__}")
30
+
31
+ interest_type = body.get("interest_type")
32
+ if interest_type not in ("wi", "minerals"):
33
+ raise CaseFileError(
34
+ f"interest_type must be 'wi' or 'minerals', got {interest_type!r}"
35
+ )
36
+
37
+ interest = body.get("interest")
38
+ if not isinstance(interest, dict):
39
+ raise CaseFileError("interest must be an object")
40
+ if interest_type == "wi":
41
+ for k in ("wi_pct", "nri_pct"):
42
+ if k not in interest:
43
+ raise CaseFileError(f"interest.{k} is required for interest_type='wi'")
44
+ v = interest[k]
45
+ if not isinstance(v, (int, float)) or not (0.0 <= v <= 1.0):
46
+ raise CaseFileError(f"interest.{k} must be in [0, 1], got {v!r}")
47
+ else: # minerals
48
+ if "decimal" not in interest:
49
+ raise CaseFileError("interest.decimal is required for interest_type='minerals'")
50
+ v = interest["decimal"]
51
+ if not isinstance(v, (int, float)) or not (0.0 <= v <= 1.0):
52
+ raise CaseFileError(f"interest.decimal must be in [0, 1], got {v!r}")
53
+
54
+ asset_list = body.get("asset_list")
55
+ if not isinstance(asset_list, dict):
56
+ raise CaseFileError("asset_list must be an object")
57
+ has_apis = bool(asset_list.get("well_apis"))
58
+ has_sql = bool(asset_list.get("filter_sql"))
59
+ if has_apis and has_sql:
60
+ raise CaseFileError("asset_list must have exactly one of well_apis or filter_sql, not both")
61
+ if not has_apis and not has_sql:
62
+ raise CaseFileError("asset_list must have exactly one of well_apis or filter_sql")
63
+ if has_apis and not all(isinstance(a, str) for a in asset_list["well_apis"]):
64
+ raise CaseFileError("asset_list.well_apis must be a list of strings")
65
+ if has_sql and not isinstance(asset_list["filter_sql"], str):
66
+ raise CaseFileError("asset_list.filter_sql must be a string")
67
+
68
+ handoff = body.get("handoff")
69
+ if not isinstance(handoff, str) or not handoff.strip():
70
+ raise CaseFileError("handoff is required and must be a non-empty string")
71
+
72
+ transcript = body.get("transcript")
73
+ if not isinstance(transcript, list) or len(transcript) == 0:
74
+ raise CaseFileError("transcript is required and must be a non-empty list")
75
+ for i, turn in enumerate(transcript):
76
+ if not isinstance(turn, dict):
77
+ raise CaseFileError(f"transcript[{i}] must be an object")
78
+ if turn.get("role") not in ("user", "assistant"):
79
+ raise CaseFileError(f"transcript[{i}].role must be 'user' or 'assistant'")
80
+ if not isinstance(turn.get("content"), str):
81
+ raise CaseFileError(f"transcript[{i}].content must be a string")
82
+
83
+ queries_run = body.get("queries_run", [])
84
+ if not isinstance(queries_run, list):
85
+ raise CaseFileError("queries_run must be a list")
86
+
87
+ economics_overrides = body.get("economics_overrides", {}) or {}
88
+ if not isinstance(economics_overrides, dict):
89
+ raise CaseFileError("economics_overrides must be an object")
90
+
91
+ source_documents = body.get("source_documents", []) or []
92
+ if not isinstance(source_documents, list):
93
+ raise CaseFileError("source_documents must be a list")
94
+
95
+ return CaseFile(
96
+ interest_type=interest_type,
97
+ interest=interest,
98
+ asset_list=asset_list,
99
+ economics_overrides=economics_overrides,
100
+ transcript=transcript,
101
+ queries_run=queries_run,
102
+ handoff=handoff.strip(),
103
+ source_documents=source_documents,
104
+ )
@@ -0,0 +1,79 @@
1
+ """Build a minimal Slice-1 briefing spec from valuation results.
2
+
3
+ Slice 1 is PV-only and commentary-only:
4
+
5
+ - PV-only — no IRR, no payout, no MOIC, no sensitivity. Those land in a later
6
+ slice; this slice ships discount-rate NPV values and the methodology footer.
7
+ - Commentary-only — no callout widget. The base `callout` is designed to render
8
+ a value from a SQL row, not a Python-side literal; using it for a static
9
+ headline NPV would require a crude_analyst widget change we're deferring.
10
+ Instead the headline NPV lives in the commentary text body.
11
+ """
12
+ import math
13
+ from crude_analyst import briefing, section, commentary
14
+
15
+
16
+ def _fmt_money(x: float) -> str:
17
+ """Render dollar values as $X.YM, $XK, etc. Negative shown as -$X.YM."""
18
+ if not math.isfinite(x):
19
+ return "n/a"
20
+ sign = "-" if x < 0 else ""
21
+ a = abs(x)
22
+ if a >= 1e9:
23
+ return f"{sign}${a/1e9:.2f}B"
24
+ if a >= 1e6:
25
+ return f"{sign}${a/1e6:.2f}M"
26
+ if a >= 1e3:
27
+ return f"{sign}${a/1e3:.1f}K"
28
+ return f"{sign}${a:.0f}"
29
+
30
+
31
+ def _headline_rate(npv_by_rate: dict[float, float]) -> float:
32
+ """Pick the headline discount rate — PV10 if present, else the lowest rate."""
33
+ if 0.10 in npv_by_rate:
34
+ return 0.10
35
+ return min(npv_by_rate.keys())
36
+
37
+
38
+ def build_minimal_briefing(
39
+ *,
40
+ npv_by_rate: dict[float, float],
41
+ n_wells: int,
42
+ interest_type: str,
43
+ methodology_notes: str,
44
+ ) -> dict:
45
+ """Assemble the Slice 1 briefing: one commentary widget with PV results."""
46
+ headline_rate = _headline_rate(npv_by_rate)
47
+ headline_npv = npv_by_rate[headline_rate]
48
+ headline_label = f"PV{int(headline_rate*100)}"
49
+ interest_label = "minerals" if interest_type == "minerals" else "working-interest"
50
+
51
+ lines = [
52
+ f"{interest_label.capitalize()} valuation across {n_wells} producing well(s).",
53
+ f"{headline_label}: {_fmt_money(headline_npv)}.",
54
+ ]
55
+ if len(npv_by_rate) > 1:
56
+ other = ", ".join(
57
+ f"PV{int(r*100)}: {_fmt_money(v)}"
58
+ for r, v in sorted(npv_by_rate.items())
59
+ if r != headline_rate
60
+ )
61
+ lines.append(f"Other rates — {other}.")
62
+ lines.append("")
63
+ lines.append(f"Methodology: {methodology_notes}")
64
+
65
+ spec = briefing(
66
+ headline=f"{headline_label} {_fmt_money(headline_npv)} on {n_wells} {interest_label} well(s).",
67
+ tldr=f"{n_wells}-well {interest_type} valuation; headline {headline_label} = {_fmt_money(headline_npv)}.",
68
+ sections=[
69
+ section(
70
+ label="01 · Result",
71
+ layout="full-width",
72
+ widgets=[commentary(text="\n".join(lines))],
73
+ ),
74
+ ],
75
+ )
76
+ spec["headline_npv"] = {
77
+ str(round(rate * 100)): value for rate, value in npv_by_rate.items()
78
+ }
79
+ return spec
@@ -0,0 +1,4 @@
1
+ """Economics: revenue → cashflow → NPV."""
2
+ from crude_valuation.econ.revenue import compute_gross_revenue # noqa: F401
3
+ from crude_valuation.econ.cashflow import compute_net_cashflow # noqa: F401
4
+ from crude_valuation.econ.npv import npv # noqa: F401
@@ -0,0 +1,75 @@
1
+ """Cashflow layer: branches on interest_type.
2
+
3
+ For all deals: net_of_deducts = revenue × (1 - tax_pct) × (1 - gpt_pct)
4
+ For minerals: cashflow = net_of_deducts × decimal
5
+ For WI: cashflow = net_of_deducts × NRI − (opex + capex + p_and_a) × WI%
6
+
7
+ Defaults applied when economics fields are missing:
8
+ tax_pct = 0.075, gpt_pct = 0.05, opex = $3000/well/month,
9
+ p_and_a = $75000/well at end of stream.
10
+ """
11
+ import numpy as np
12
+
13
+
14
+ _DEFAULTS = {
15
+ "tax_pct": 0.075,
16
+ "gpt_pct": 0.05,
17
+ "opex_per_well_per_month_usd": 3000.0,
18
+ "p_and_a_per_well_usd": 75000.0,
19
+ }
20
+
21
+
22
+ def _resolve(economics: dict, key: str) -> float:
23
+ return float(economics.get(key, _DEFAULTS[key]))
24
+
25
+
26
+ def compute_net_cashflow(
27
+ *,
28
+ revenue_monthly: np.ndarray,
29
+ interest_type: str,
30
+ interest: dict,
31
+ economics: dict,
32
+ n_wells: int,
33
+ oil_bbl: np.ndarray | None = None,
34
+ ) -> np.ndarray:
35
+ """Apply interest_type branch and return monthly net cashflow.
36
+
37
+ `oil_bbl` is required when `economics["opex_per_bbl_usd"]` is set (the
38
+ per-bbl opex code path); otherwise ignored.
39
+ """
40
+ revenue_monthly = np.asarray(revenue_monthly, dtype=float)
41
+
42
+ tax_pct = _resolve(economics, "tax_pct")
43
+ gpt_pct = _resolve(economics, "gpt_pct")
44
+ net_of_deducts = revenue_monthly * (1.0 - tax_pct) * (1.0 - gpt_pct)
45
+
46
+ if interest_type == "minerals":
47
+ return net_of_deducts * float(interest["decimal"])
48
+
49
+ # WI branch
50
+ wi_pct = float(interest["wi_pct"])
51
+ nri_pct = float(interest["nri_pct"])
52
+ revenue_net = net_of_deducts * nri_pct
53
+
54
+ n_months = len(revenue_monthly)
55
+
56
+ # Opex: either per-bbl or per-well-per-month (caller picks at most one)
57
+ if "opex_per_bbl_usd" in economics:
58
+ if oil_bbl is None:
59
+ raise ValueError("oil_bbl is required when economics.opex_per_bbl_usd is set")
60
+ opex_monthly = float(economics["opex_per_bbl_usd"]) * np.asarray(oil_bbl, dtype=float)
61
+ else:
62
+ per_well = _resolve(economics, "opex_per_well_per_month_usd")
63
+ opex_monthly = np.full(n_months, per_well * n_wells, dtype=float)
64
+
65
+ # P&A — one-time at end of stream
66
+ p_and_a_per_well = _resolve(economics, "p_and_a_per_well_usd")
67
+ p_and_a = np.zeros(n_months, dtype=float)
68
+ if n_months > 0:
69
+ p_and_a[-1] = p_and_a_per_well * n_wells
70
+
71
+ # CapEx — unused in Slice 1 (non-producing wells rejected at run time)
72
+ capex = np.zeros(n_months, dtype=float)
73
+
74
+ cost_monthly = opex_monthly + capex + p_and_a
75
+ return revenue_net - cost_monthly * wi_pct
@@ -0,0 +1,14 @@
1
+ """Monthly NPV — pure numpy. Slice 1 ships PV only; IRR / payout / MOIC deferred."""
2
+ import numpy as np
3
+
4
+
5
+ def npv(cashflow_monthly: np.ndarray, rate: float) -> float:
6
+ """Discount a monthly cashflow stream at annual nominal rate.
7
+
8
+ Month 0 is undiscounted. `rate` is annual; converted to monthly internally.
9
+ """
10
+ cf = np.asarray(cashflow_monthly, dtype=float)
11
+ monthly_rate = rate / 12.0
12
+ n = len(cf)
13
+ discount = 1.0 / np.power(1.0 + monthly_rate, np.arange(n, dtype=float))
14
+ return float(np.sum(cf * discount))
@@ -0,0 +1,39 @@
1
+ """Revenue layer: oil + gas volumes → gross revenue per month."""
2
+ import numpy as np
3
+
4
+
5
+ _DEFAULT_OIL_USD = 80.0
6
+ _DEFAULT_GAS_USD = 3.0
7
+
8
+
9
+ def compute_gross_revenue(
10
+ oil_bbl: np.ndarray,
11
+ gas_mcf: np.ndarray,
12
+ price_deck: dict | None,
13
+ differential: dict | None,
14
+ ) -> np.ndarray:
15
+ """Apply price × volumes to produce monthly gross revenue.
16
+
17
+ Slice 1: only `price_deck.type == "flat"` is supported. `nymex_strip` and
18
+ `escalated` raise NotImplementedError until a later slice.
19
+
20
+ 1 MCF ≈ 1 MMBtu assumption in Slice 1 (real BTU content varies by basin,
21
+ deferred).
22
+ """
23
+ if price_deck is None:
24
+ oil_price = _DEFAULT_OIL_USD
25
+ gas_price = _DEFAULT_GAS_USD
26
+ else:
27
+ ptype = price_deck.get("type")
28
+ if ptype != "flat":
29
+ raise NotImplementedError(
30
+ f"price_deck type {ptype!r} not supported in Slice 1 (only 'flat')"
31
+ )
32
+ oil_price = float(price_deck.get("oil_usd_bbl", _DEFAULT_OIL_USD))
33
+ gas_price = float(price_deck.get("gas_usd_mmbtu", _DEFAULT_GAS_USD))
34
+
35
+ if differential is not None:
36
+ oil_price += float(differential.get("oil_usd_bbl", 0.0))
37
+ gas_price += float(differential.get("gas_usd_mmbtu", 0.0))
38
+
39
+ return np.asarray(oil_bbl, dtype=float) * oil_price + np.asarray(gas_mcf, dtype=float) * gas_price
@@ -0,0 +1,142 @@
1
+ # crude_valuation.forecast
2
+
3
+ Build per-well `Forecast`s from production history, analog cohorts, or both.
4
+ This subpackage's deliverable is exactly:
5
+
6
+ ```python
7
+ forecasts_oil: list[Forecast]
8
+ forecasts_gas: list[Forecast]
9
+ ```
10
+
11
+ Economics lives in `crude_valuation.econ` and consumes the tuple above.
12
+
13
+ ## Conceptual model
14
+
15
+ Two value types do disjoint jobs:
16
+
17
+ - `DeclineCurve` — pure decline shape. Carries `qi_peak`, `di`, `b`,
18
+ terminal decline, switch month, the stream, and the lateral the curve
19
+ is normalized to. No calendar time. No well identity. A curve fit from
20
+ Well A can be applied to many other wells.
21
+ - `Forecast` — a `DeclineCurve` placed against a target well. Carries
22
+ `peak_date` (where `qi_peak` applies in calendar time), `start_date`
23
+ (where the projection's month-0 sits), `lateral_scale`, and provenance.
24
+
25
+ Together `peak_date` and `start_date` pin down "how far past peak is this
26
+ projection starting." The peak-vs-current bug from Phase 1.5 — anchoring
27
+ projection at peak rate but labeling it as "today" — is structurally
28
+ impossible because `project` always indexes from `peak_date`, and both
29
+ fields are required.
30
+
31
+ ## Three strategies, one shape
32
+
33
+ Every forecast goes through three steps:
34
+
35
+ 1. **Build a curve.** Either `fit_curve(months, q, ...)` for own
36
+ production, or `percentile_curves([curve, ...], ...)` for an analog
37
+ cohort, or `blend_curves(qi_from=..., decline_from=...)` for the
38
+ thin-history case.
39
+ 2. **Wrap in a Forecast** with `peak_date`, `start_date`, and
40
+ `lateral_scale`.
41
+ 3. **`aggregate(forecasts, horizon_months=...)`** to get the per-stream
42
+ monthly volume.
43
+
44
+ | Strategy | Curve source | peak_date | start_date |
45
+ |---|---|---|---|
46
+ | PDP (own history) | `fit_curve` on own prod | prod_date at argmax(q) | last prod_date |
47
+ | Pure analog (PERMITTED/DUC) | `percentile_curves` on N analogs | planned first-prod date | planned first-prod date |
48
+ | Thin blend | `blend_curves`(own qi, cohort decline) | prod_date at argmax(own q) | last prod_date |
49
+
50
+ ## Templates
51
+
52
+ Three runnable templates in `examples/`, organized by strategy:
53
+
54
+ - `forecast_historical.py` — all wells have own history (PDP)
55
+ - `forecast_analogs.py` — no wells have own history (PERMITTED/DUC)
56
+ - `forecast_mixed.py` — mix of PDP / thin / not-yet-drilled
57
+
58
+ Each template exposes `build(run_sql) -> (forecasts_oil, forecasts_gas)`.
59
+ Edit `WELL_APIS` (and `ANALOG_FILTER` where applicable) before running.
60
+
61
+ ## Atomic snippets
62
+
63
+ ### Fit one well's curve
64
+ ```python
65
+ from crude_valuation.forecast import fit_curve, peek_well, load_production
66
+ import numpy as np
67
+
68
+ meta = peek_well(api, run_sql)
69
+ prod = load_production([api], run_sql)
70
+ months = np.arange(len(prod["months"]), dtype=float)
71
+ q = np.array(prod["oil_bbl"])
72
+ curve = fit_curve(months, q, stream="oil", lateral_norm_ft=meta.lateral_ft)
73
+ ```
74
+
75
+ ### Build a cohort curve
76
+ ```python
77
+ from crude_valuation.forecast import (
78
+ find_analogs, fit_curve, load_production, peek_well, percentile_curves,
79
+ )
80
+ import numpy as np
81
+
82
+ analog_apis = find_analogs("basin = 'Permian' AND ...", run_sql, exclude=[target])
83
+ curves = []
84
+ for a in analog_apis:
85
+ meta = peek_well(a, run_sql)
86
+ prod = load_production([a], run_sql)
87
+ q = np.array(prod["oil_bbl"])
88
+ months = np.arange(len(prod["months"]), dtype=float)
89
+ try:
90
+ curves.append(fit_curve(months, q, stream="oil", lateral_norm_ft=meta.lateral_ft))
91
+ except ValueError:
92
+ pass # skip thin-history analogs
93
+ cohort = percentile_curves(curves, pct=0.5, norm_to_lateral_ft=10_000.0)
94
+ ```
95
+
96
+ ### Place a PDP well
97
+ ```python
98
+ from datetime import date
99
+ from crude_valuation.forecast import Forecast, ForecastProvenance
100
+
101
+ # (continuing from the fit_curve snippet above — `prod` and `curve` in scope)
102
+ peak_idx = int(np.argmax(q))
103
+ fc = Forecast(
104
+ curve=curve,
105
+ peak_date=date.fromisoformat(str(prod["months"][peak_idx])),
106
+ start_date=date.fromisoformat(str(prod["months"][-1])), # last reported month
107
+ lateral_scale=1.0,
108
+ provenance=ForecastProvenance(target_well_api=api, strategy="pdp"),
109
+ )
110
+ ```
111
+
112
+ ### Place a synthetic well
113
+ ```python
114
+ from datetime import date
115
+ from crude_valuation.forecast import Forecast, ForecastProvenance
116
+
117
+ fc = Forecast(
118
+ curve=cohort,
119
+ peak_date=meta.planned_first_prod_date,
120
+ start_date=meta.planned_first_prod_date,
121
+ lateral_scale=(meta.lateral_ft or 10_000.0) / cohort.lateral_norm_ft,
122
+ provenance=ForecastProvenance(target_well_api=api, strategy="pure_analog"),
123
+ )
124
+ ```
125
+
126
+ ### Blend for a thin-history well
127
+ ```python
128
+ from crude_valuation.forecast import blend_curves
129
+ blended = blend_curves(qi_from=own_curve, decline_from=cohort_curve)
130
+ ```
131
+
132
+ ## Common mistakes
133
+
134
+ - Setting `peak_date == start_date` for a PDP well. That throws away the
135
+ decline history — the projection starts at peak rate. For PDP, peak_date
136
+ is the historical peak month and start_date is `last_prod`.
137
+ - Mixing analog curves at different laterals without normalizing. Use
138
+ `percentile_curves(..., norm_to_lateral_ft=X)` to put them on the same
139
+ basis before percentiling.
140
+ - Calling `fit_curve` on a well with only a few months of data. It will
141
+ raise. Route thin-history wells through `blend_curves(qi_from=own_fit,
142
+ decline_from=cohort_curve)` (see `examples/forecast_mixed.py`).