PyPI - stata-code - Versions diffs - 0.8.1__tar.gz → 0.9.0__tar.gz - Mend

stata-code 0.8.1__tar.gz → 0.9.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (81)

{stata_code-0.8.1 → stata_code-0.9.0}/CHANGELOG.md RENAMED

@@ -4,6 +4,72 @@ All notable changes to `stata-code` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project adheres
 to semver-major.minor for the result schema (see `SCHEMA.md` §6).
+## 0.9.0 — 2026-06-23
+### Fixed
+- **Error-taxonomy correctness.** Audited the `_rc` → `ErrorKind` table against
+  StataCorp's `[P] error` manual (Stata 19) and corrected several
+  misclassifications: `not_sorted` is now `r(5)` (was the unrelated `r(119)`
+  "statement out of context" / `r(459)` "data is not…"); numlist errors
+  `r(122)`/`r(123)` are now `syntax` (were `invalid_name`); `r(322)` and
+  `r(1400)` map to `estimation_failure` (was `file_not_found` /
+  `estimation_sample_empty`); `r(480)` maps to `infeasible` (was
+  `out_of_memory`); local I/O `r(691)`–`r(693)` map to `file_io` (were
+  `network`). Misleading mappings for `r(9)`/`r(604)`/`r(615)`/`r(616)` were
+  removed (they fall through to `unknown` rather than assert a wrong kind).
+- **Command "did you mean?" now fires.** The `command_not_found` (rc 199) name
+  extractor expected `"<X> unrecognized command"`, but Stata's actual message is
+  `"command <X> is unrecognized"` — so the fuzzy suggestion never matched in
+  practice (synthetic unit tests passed the name in directly and hid it). Fixed
+  the regex and added a real-Stata integration test so a typo like `regresss`
+  now surfaces "Did you mean `regress`?".
+### Added
+- **Typed estimation contract.** `RunResult.results.estimation` now exposes a
+  frontend-neutral coefficient table derived from verified `r(table)` when
+  possible, or from inline `e(b)` / `e(V)` as a clearly marked fallback. New
+  public helpers `build_estimation_result()` and
+  `build_estimation_from_returns()` keep the contract unit-testable without
+  Stata. The contract also carries a coarse `command_family`
+  (ols/iv/gmm/panel/count/did/…) and command-aware `diagnostics` — identification
+  and specification tests surfaced from `e()` for the commands economists must
+  report (`ivreg2`/`ivreghdfe` weak-ID F and Hansen J, `xtabond2` AR(2)/Hansen,
+  `reghdfe` within-R²/absorbed FE, `xtreg` rho). Only scalars actually present in
+  `e()` are surfaced — never fabricated.
+- **Machine-readable recovery contract.** `error.recovery` now classifies each
+  `ErrorKind` by failure domain and tells agents whether an unchanged retry,
+  code edit, or user/out-of-band action is likely needed. Synthetic timeout,
+  cancellation, and adapter-crash errors carry the same recovery metadata as
+  ordinary Stata errors.
+- **Reproducibility provenance helpers.** New `Provenance`,
+  `build_provenance()`, and `build_reproducible_do()` helpers turn a completed
+  `RunResult` plus original code into a runtime provenance envelope and a
+  re-runnable `.do` script preamble with Stata `version`, `set more off`, and an
+  optional `set seed`. Provenance now also records **per-package dependencies**
+  parsed from the script (`extract_package_installs()` →
+  `Provenance.packages`: `ssc`/`net install` name, source, and `from()` URL),
+  and `build_submission_package()` assembles a self-contained
+  replication/journal-submission bundle (`analysis.do` + `PROVENANCE.json` +
+  a `README.md` manifest listing runtime, seed, and required community packages).
+- **Data-MCP handoff verifier.** New `verify_dataset()` and `DatasetCheck`
+  helpers validate imported datasets against provider metadata such as expected
+  row count, variable count, observation bounds, and required variables.
+- **`error.rc_label` is now populated for real Stata errors.** New
+  `RC_LABEL` table and `label_for_rc()` (public API) supply Stata's canonical
+  short message (e.g. `r(111)` → "variable not found") so agents have a stable,
+  transcript-independent descriptor to branch and group on. Unverified codes
+  yield an empty label rather than a guess.
+- **More return codes classified** (shrinking `unknown`): real network codes
+  `r(2)`/`r(631)`/`r(672)`/`r(677)` → `network`; `r(688)` → `file_corrupt`;
+  `r(907)` → `stata_limit`; `r(950)` → `out_of_memory`; numlist `r(124)`–`r(127)`
+  → `syntax`.
+- **Remediation suggestions for more error kinds.** `suggestions_for()` now
+  emits actionable hints for `network`, `infeasible`, `type_mismatch`,
+  `file_io`, `file_corrupt`, `permission`, `estimation_failure`, and
+  `matrix_missing`, so nearly every common failure ships a recovery hint.
 ## 0.8.1 — 2026-06-20
 ### Changed
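The rc 199 fix described in the 0.9.0 changelog can be pictured with a minimal sketch. It assumes only the message shape quoted there (`"command <X> is unrecognized"`); the pattern, the helper names, and the list of known commands are hypothetical, since the package's actual regex and fuzzy matcher are not shown in this diff.

```python
import difflib
import re

# Message shape quoted in the changelog: "command <X> is unrecognized".
# This pattern is an assumption, not the package's real regex.
_RC199 = re.compile(r"command\s+(\S+)\s+is unrecognized")


def extract_unrecognized_command(message):
    """Return the command name from an rc 199 message, or None if it does not match."""
    match = _RC199.search(message)
    return match.group(1) if match else None


def did_you_mean(name, known_commands):
    """Return the closest known command to `name`, or None when nothing is close."""
    close = difflib.get_close_matches(name, known_commands, n=1, cutoff=0.8)
    return close[0] if close else None


# Hypothetical command list, for illustration only.
name = extract_unrecognized_command("command regresss is unrecognized")
suggestion = did_you_mean(name, ["regress", "summarize", "tabulate"])
```

Feeding the old, wrongly assumed shape (`"regresss unrecognized command"`) into `extract_unrecognized_command` yields `None`, which is the failure mode the changelog says the synthetic unit tests hid.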

{stata_code-0.8.1 → stata_code-0.9.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: stata-code
-Version: 0.8.1
+Version: 0.9.0
 Summary: Agent-native Stata bridge — one core, multiple frontends (MCP, Jupyter, VSCode)
 Project-URL: Homepage, https://github.com/brycewang-stanford/stata-code
 Project-URL: Repository, https://github.com/brycewang-stanford/stata-code
@@ -184,12 +184,14 @@ Verify the local setup with the read-only doctor:
 stata-code doctor
 stata-code doctor --json          # machine-readable output
 stata-code doctor --no-stata-probe # skip live Stata initialization
+stata-code doctor --workspace /path/to/project --no-user-config-scan
 ```
 The doctor reports the package/Python version, MCP and Jupyter extras, `pystata`
-discovery, console scripts on `PATH`, client/VS Code configuration hints, and a
-best-effort Stata version/edition probe. It never edits shell, Stata, Claude, or
-VS Code config.
+discovery, console scripts on `PATH`, common project/user MCP client config
+files, client/VS Code configuration hints, and a best-effort Stata
+version/edition probe. It never edits shell, Stata, Claude, Cursor, or VS Code
+config.
 ---
@@ -418,7 +420,9 @@ If the extension or an MCP client cannot find the server, run
 `stata-code doctor --no-stata-probe` in the same Python environment. It reports
 whether `stata-code-mcp` is on `PATH` and suggests absolute-path or
 `python -m stata_code.mcp` fallbacks for GUI clients whose `PATH` differs from
-your shell.
+your shell. It also reads common MCP config files in the current workspace and
+user config directories so you can see whether a client is already wired to
+`stata-code`.
 #### Cell and section conventions

{stata_code-0.8.1 → stata_code-0.9.0}/README.md RENAMED

@@ -145,12 +145,14 @@ Verify the local setup with the read-only doctor:
 stata-code doctor
 stata-code doctor --json          # machine-readable output
 stata-code doctor --no-stata-probe # skip live Stata initialization
+stata-code doctor --workspace /path/to/project --no-user-config-scan
 ```
 The doctor reports the package/Python version, MCP and Jupyter extras, `pystata`
-discovery, console scripts on `PATH`, client/VS Code configuration hints, and a
-best-effort Stata version/edition probe. It never edits shell, Stata, Claude, or
-VS Code config.
+discovery, console scripts on `PATH`, common project/user MCP client config
+files, client/VS Code configuration hints, and a best-effort Stata
+version/edition probe. It never edits shell, Stata, Claude, Cursor, or VS Code
+config.
 ---
@@ -379,7 +381,9 @@ If the extension or an MCP client cannot find the server, run
 `stata-code doctor --no-stata-probe` in the same Python environment. It reports
 whether `stata-code-mcp` is on `PATH` and suggests absolute-path or
 `python -m stata_code.mcp` fallbacks for GUI clients whose `PATH` differs from
-your shell.
+your shell. It also reads common MCP config files in the current workspace and
+user config directories so you can see whether a client is already wired to
+`stata-code`.
 #### Cell and section conventions

{stata_code-0.8.1 → stata_code-0.9.0}/SCHEMA.md RENAMED

@@ -77,7 +77,38 @@ Every successful or failed Stata execution returns one result object:
         }
       }
     },
-    "last_estimation_cmd": "regress"
+    "last_estimation_cmd": "regress",
+    "estimation": {
+      "command": "regress",
+      "depvar": "mpg",
+      "n_obs": 74,
+      "df_model": 1,
+      "df_resid": null,
+      "statistic_kind": "z",
+      "source": "e_b_v",
+      "ci_level": 95.0,
+      "coefficients": [
+        {
+          "term": "weight",
+          "b": -0.006,
+          "se": null,
+          "statistic": null,
+          "p_value": null,
+          "ci_low": null,
+          "ci_high": null
+        },
+        {
+          "term": "_cons",
+          "b": 39.44,
+          "se": null,
+          "statistic": null,
+          "p_value": null,
+          "ci_low": null,
+          "ci_high": null
+        }
+      ],
+      "model_stats": {"N": 74, "df_m": 1, "r2": 0.219}
+    }
   },
   "dataset": {
@@ -140,7 +171,8 @@ A failed execution sets `ok: false`, `rc != 0`, and populates `error`:
   "results": { "r": {"scalars": {}, "macros": {}, "matrices": {}},
                "e": {"scalars": {}, "macros": {}, "matrices": {}},
-               "last_estimation_cmd": null },
+               "last_estimation_cmd": null,
+               "estimation": null },
   "dataset": { "frame": "default", "n_obs": 74, "n_vars": 12, "changed": false,
                "filename": "auto.dta", "variables": null },
@@ -167,7 +199,13 @@ A failed execution sets `ok: false`, `rc != 0`, and populates `error`:
     "suggestions": [
       {"action": "Check the variable name. Did you mean `mpg`?",
        "command": "describe"}
-    ]
+    ],
+    "recovery": {
+      "category": "user_code",
+      "retriable": false,
+      "needs_code_change": true,
+      "needs_user_input": false
+    }
   },
   "schema_version": "1.0",
@@ -315,6 +353,34 @@ Stata's `r()` and `e()` return dictionaries, structurally separated. Each follow
 | Field | Type | Notes |
 | --- | --- | --- |
 | `last_estimation_cmd` | `string \| null` | Mirrors `e(cmd)` for callers who don't want to dig into `e.macros`. After multi-command code, this reflects the *last* command that wrote to `e()`. `null` if no estimation has been performed. |
+| `estimation` | `EstimationResult \| null` | Typed coefficient table derived from `r(table)` or `e(b)` / `e(V)`. `null` when no inline `e(b)` is available. |
+**`EstimationResult` shape:**
+| Field | Type | Notes |
+| --- | --- | --- |
+| `command` | `string \| null` | Mirrors `e(cmd)` when available; falls back to `last_estimation_cmd`. |
+| `depvar` | `string \| null` | Mirrors `e(depvar)`. |
+| `n_obs` | `int \| null` | Integer form of `e(N)` when available. |
+| `df_model` | `number \| null` | Mirrors `e(df_m)`. |
+| `df_resid` | `number \| null` | Mirrors `e(df_r)`. |
+| `statistic_kind` | `"t" \| "z"` | Which statistic fills each coefficient's `statistic` field. |
+| `source` | `"r_table" \| "e_b_v"` | `r_table` means values were copied from Stata's displayed `r(table)` after verifying its columns and `b` row match `e(b)`; `e_b_v` means point estimates come from `e(b)` and inference, when present, is computed from `e(V)` with a normal approximation. |
+| `ci_level` | `number` | Confidence level used for `ci_low` / `ci_high`; currently `95.0`. |
+| `coefficients` | `array<Coefficient>` | One row per term in `e(b)`. |
+| `model_stats` | `dict<str, number \| null>` | High-signal subset of `e()` scalars such as `N`, `df_m`, `df_r`, `r2`, `F`, `chi2`, `ll`, and `rmse`. Full scalars remain under `results.e.scalars`. |
+**`Coefficient` shape:**
+| Field | Type | Notes |
+| --- | --- | --- |
+| `term` | `string` | Term / coefficient column name. |
+| `b` | `number \| null` | Point estimate. |
+| `se` | `number \| null` | Standard error when available. |
+| `statistic` | `number \| null` | `t` or `z`, per `EstimationResult.statistic_kind`. |
+| `p_value` | `number \| null` | Two-sided p-value when available. |
+| `ci_low` | `number \| null` | Lower confidence interval bound when available. |
+| `ci_high` | `number \| null` | Upper confidence interval bound when available. |
 **Empty is empty.** Sub-dicts are `{}` when Stata returned nothing — never absent, never `null`.
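The `e_b_v` fallback says inference is computed from `e(V)` with a normal approximation. A minimal sketch of that arithmetic for one coefficient follows, assuming the variance is the matching diagonal element of `e(V)`. The function name and the handling of missing variances are assumptions, not the package's implementation.

```python
import math
from statistics import NormalDist


def normal_inference(b, variance, ci_level=95.0):
    """Compute se, z, two-sided p-value, and CI bounds from a point estimate and its variance."""
    empty = {"b": b, "se": None, "statistic": None,
             "p_value": None, "ci_low": None, "ci_high": None}
    if variance is None or math.isnan(variance) or variance <= 0:
        # No usable variance: leave the inference fields null.
        return empty
    std_normal = NormalDist()
    se = math.sqrt(variance)
    z = b / se
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # Two-sided critical value, e.g. about 1.96 for ci_level = 95.0.
    crit = std_normal.inv_cdf(0.5 + ci_level / 200.0)
    return {"b": b, "se": se, "statistic": z, "p_value": p_value,
            "ci_low": b - crit * se, "ci_high": b + crit * se}


row = normal_inference(1.96, 1.0)
```

Because the approximation uses the standard normal rather than a `t` distribution with `e(df_r)` degrees of freedom, the resulting statistic is a `z`, which matches `statistic_kind: "z"` in the `e_b_v` example object.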
@@ -377,6 +443,7 @@ Populated iff `ok: false`. The schema's most important contribution to agent UX:
 | `varname` | `string \| null` | For `varname_not_found` and related, the variable name at issue. |
 | `name` | `string \| null` | For `name_conflict` and `invalid_name`, the conflicting/invalid name. |
 | `suggestions` | `array<Suggestion>` | Producer-supplied remediation hints. Empty when none apply. See below. |
+| `recovery` | `Recovery \| null` | Machine-readable recovery contract for agents. Present on current producers; old or third-party producers may omit it, so consumers should handle `null`. |
 **`context` shape:**
@@ -399,36 +466,47 @@ Populated iff `ok: false`. The schema's most important contribution to agent UX:
 Suggestions are best-effort; agents should treat them as hints, not directives. A suggestion is not consent to mutate source files or silently retry changed code; consumers should apply fixes automatically only in workflows where the user requested repair or approved iteration. The `kind` enum below documents what suggestions are typically populated.
+**`Recovery` shape:**
+| Field | Type | Notes |
+| --- | --- | --- |
+| `category` | `"user_code" \| "data" \| "model" \| "resource" \| "environment" \| "internal" \| "unknown"` | Broad failure domain for routing. |
+| `retriable` | `bool` | Whether re-running the exact same code may succeed. True mainly for transient environment or producer-side failures. |
+| `needs_code_change` | `bool` | Whether the submitted Stata code must change to succeed. |
+| `needs_user_input` | `bool` | Whether resolution likely requires a human or out-of-band action such as permissions, license/edition limits, or re-acquiring a corrupt file. |
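The `Recovery` fields above can be turned into a routing decision on the consumer side. The sketch below is one possible reading, not the package's logic: the action labels (`"escalate"`, `"edit_code"`, `"retry"`, `"inspect"`) and the precedence between flags are assumptions made for illustration.

```python
def next_action(error):
    """Map an `error` object to a coarse next step for an agent."""
    recovery = error.get("recovery")
    if recovery is None:
        # Old or third-party producers may omit `recovery`; fall back to manual inspection.
        return "inspect"
    if recovery["needs_user_input"]:
        return "escalate"
    if recovery["needs_code_change"]:
        return "edit_code"
    if recovery["retriable"]:
        return "retry"
    return "inspect"


# The recovery block from the failed-execution example in this schema.
example_error = {
    "recovery": {
        "category": "user_code",
        "retriable": False,
        "needs_code_change": True,
        "needs_user_input": False,
    }
}
action = next_action(example_error)
```

Checking `needs_user_input` first is a conservative choice: it keeps an agent from editing or retrying code when the schema says a human or out-of-band step is likely required.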
 **`kind` enum (v1.0):**
+rc(s) below cite StataCorp `[P] error` (Stata 19, 2025). The code is authoritative; this table is a readable mirror.
 | `kind` | Typical rc(s) | Notes / suggestion seed |
 | --- | --- | --- |
-| `syntax` | 9, 100, 101, 102, 103, 121, 130, 132, 197, 198 | Generic parser failure. No automatic suggestion. |
+| `syntax` | 100, 101, 102, 103, 121–127, 130, 132, 197, 198 | Generic parser failure (incl. numlist errors 121–127). No automatic suggestion. |
 | `command_not_found` | 199 | Often resolved by `ssc install` or `net install`; suggestions populated when Stata reports a likely package name. |
 | `varname_not_found` | 111 | `varname` populated. Suggestions may include similar varnames from `dataset.variables`. |
-| `invalid_name` | 122, 123 | `name` populated. |
-| `type_mismatch` | 109, 408 | |
+| `invalid_name` | (no dedicated rc) | Stata folds "invalid name" into r(198). `name` populated when constructed by a producer. |
+| `type_mismatch` | 109, 408 | Suggestion: `destring`/`tostring`. |
 | `name_conflict` | 110 | `name` populated. Suggestion typically: `replace`. |
-| `not_sorted` | 119, 459 | Suggestion: `sort <varlist>`. |
+| `not_sorted` | 5 | Suggestion: `sort <varlist>`. |
 | `convergence` | 430 | |
-| `infeasible` | 491 | Distinct from convergence: starting values not feasible. |
-| `estimation_sample_empty` | 1400, 2000 (in estimation context) | |
-| `estimation_failure` | 1401, 1402 | |
+| `infeasible` | 480, 491 | Distinct from convergence: starting values not feasible (e.g. `nl`, `ml`). |
+| `estimation_sample_empty` | (no dedicated rc) | Empty estimation samples surface as r(2000); producer-set otherwise. |
+| `estimation_failure` | 322, 1400, 1401, 1402 | Postestimation/prefix saw an unexpected result, or numerical overflow. |
 | `no_estimation_results` | 301 | Common when calling `predict`/`margins` without prior estimation. |
 | `no_observations` | 2000, 2001 | |
 | `data_in_memory` | 4 | Suggestion: `clear`. |
 | `matrix_singular` | 506, 508 | Matrix not positive definite / not invertible. |
-| `matrix_conformability` | 503, 507 | Dimension mismatch. |
+| `matrix_conformability` | 503, 507 | Dimension mismatch; 507 is a `matrix post` row/col name conflict kept in the matrix bucket. |
 | `matrix_missing` | 504 | Matrix has missing values. |
-| `file_not_found` | 322, 601 | `path` populated. |
+| `file_not_found` | 601 | `path` populated. |
 | `file_exists` | 602 | `path` populated. Suggestion: pass `replace` option. |
-| `file_corrupt` | 604, 610 | `path` populated. Often "not a Stata file." |
-| `file_io` | 603, 691 (local) | `path` populated. Catch-all for open/read/write failures not otherwise classified. |
-| `network` | 691 (network), 692, 693 | URL fetches, network reads. |
-| `permission` | 608 | `path` populated. Includes Stata-license-limit errors (615/616 family that surface as permission denials). |
-| `encoding` | 615, 616 | Unicode / encoding-conversion failures. |
-| `stata_limit` | 901, 902, 903 | Edition / matsize / similar Stata-imposed caps. Distinct from OS OOM. Suggestion: `set maxvar` or upgrade edition. |
-| `out_of_memory` | 480, 909 | OS-level memory exhaustion. |
+| `file_corrupt` | 610, 688 | `path` populated. "Not a Stata file" (610) or genuinely corrupt (688). |
+| `file_io` | 603, 691, 692, 693 | `path` populated. Catch-all for open/read/write failures (691–693 are local filesystem I/O). |
+| `network` | 2, 631, 672, 677 | Connection timed out / host not found / server refused / remote connection failed. |
+| `permission` | 608 | `path` populated. File is read-only / not writable. |
+| `encoding` | (no dedicated rc) | Unicode / encoding-conversion failures; producer-set. |
+| `stata_limit` | 901, 902, 903, 907 | Edition / maxvar / width caps. Distinct from OS OOM. Suggestion: `set maxvar` or upgrade edition. |
+| `out_of_memory` | 909, 950 | OS-level memory exhaustion. Suggestion: `compress`. |
 | `interrupt` | 1 | User Break / Ctrl-C from a frontend. |
 | `cancelled` | (synthetic `rc: -3`) | Cancellation was requested. Subprocess-backed producers may terminate an in-flight worker; the direct in-process runner only short-circuits before Stata receives code. |
 | `timeout` | (synthetic `rc: -2`) | Adapter-imposed time limit exceeded. |
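A small lookup illustrates how the revised table behaves, including the fall-through to `unknown` for codes whose mappings were removed. It mirrors only a subset of the rows above and is not the package's `RC_TO_KIND`; the names `RC_TO_KIND_SKETCH` and `kind_for_rc` are hypothetical.

```python
# Illustrative subset of the rc -> kind rows in the 0.9.0 table.
RC_TO_KIND_SKETCH = {
    5: "not_sorted",
    111: "varname_not_found",
    199: "command_not_found",
    480: "infeasible",
    491: "infeasible",
    601: "file_not_found",
    # Numlist errors 121-127 are classified as syntax.
    **{rc: "syntax" for rc in range(121, 128)},
    # 691-693 are local filesystem I/O, no longer network.
    **{rc: "file_io" for rc in (603, 691, 692, 693)},
    **{rc: "network" for rc in (2, 631, 672, 677)},
}


def kind_for_rc(rc):
    """Return the kind for a return code, or "unknown" rather than a guess."""
    return RC_TO_KIND_SKETCH.get(rc, "unknown")


kinds = [kind_for_rc(rc) for rc in (5, 122, 691, 604)]
```

Here `r(604)` resolves to `unknown`, consistent with the changelog's note that misleading mappings were removed rather than replaced with another asserted kind.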
@@ -607,6 +685,8 @@ This section tracks how much of the schema is wired up in code. Not normative
   emit a `matrix://<request_id>/<r|e>/<name>` ref instead, retrievable
   via `get_matrix(ref)`.
 - `results.last_estimation_cmd` (mirrors `e(cmd)`).
+- `results.estimation` typed coefficient table, copied from verified
+  `r(table)` when possible and otherwise derived from inline `e(b)` / `e(V)`.
 - `dataset` block — `n_obs`, `n_vars`, `frame`, `changed`, `filename`,
   and `variables` (capped at 200 entries).
 - `graphs[]` with `ref` + on-disk capture pipeline; format restricted to
@@ -617,7 +697,8 @@ This section tracks how much of the schema is wired up in code. Not normative
   extracted from Stata's English error text by regex, structured
   `context` (`{before, failing, after}`), `commands_executed` parsed
   from pystata's multi-line transcript, `suggestions` generated by
-  `core.errors.suggestions_for`.
+  `core.errors.suggestions_for`, and `recovery` generated by
+  `core.errors.recovery_for`.
 - `request_id` (uuid4 hex), `started_at` (ISO 8601 UTC ms),
   `stata_elapsed_ms`, `capabilities`.
 - Multi-session via Stata frames — `session_id="main"` ↔ `default`

stata_code-0.9.0/docs/competitive-landscape.md ADDED

@@ -0,0 +1,163 @@
+# Competitive Landscape & Long-Term Goals
+Last updated: 2026-06-23
+This document is the **evidence base** behind
+[industry-leader-roadmap.md](industry-leader-roadmap.md). The roadmap says *what
+we will build*; this file says *who else is in the market, where the open lane
+is, and which long-term bets follow from that*. Keep the two in sync: when the
+landscape shifts, update this file first, then re-derive the roadmap.
+Star counts, install counts, and versions below are point-in-time reads from
+June 2026 and will drift. Treat them as relative signal, not live data.
+## North-star positioning
+> `stata-code` should be the most **reliable, agent-native, typed** way to run,
+> inspect, repair, and audit Stata — winning on *fidelity to the authoritative
+> Stata runtime* and *referee-grade reproducibility*, not on method count or
+> raw breadth.
+The single fact that defines our lane: **no competitor ships all three of**
+1. a typed **error taxonomy** (stable `error.kind` values an agent branches on,
+   not return codes + red text it must string-match);
+2. **typed `r()` / `e()` result contracts** (not a generic results dump); and
+3. **token-efficient by-reference artifacts** for logs, graphs, *and* matrices.
+Everything else is either AGPL/GPL-licensed, editor-bound, raw-log-only, or a
+Python/R reimplementation that does not touch the real Stata runtime.
+## The field (June 2026)
+| Tool | License | Exec | Structured out | Typed error kinds | Frontends | Maturity |
+| --- | --- | --- | --- | --- | --- | --- |
+| **stata-code** (this) | MIT | pystata | typed `RunResult` + `r()/e()` | **yes** | MCP + kernel + VS Code | pre-1.0 |
+| tmonk/mcp-stata + workbench | AGPL-3.0 | pystata | JSON + `r()/e()/s()` | no (rc + red text) | MCP + ext | ~69★, very active |
+| hanlulong/stata-mcp | MIT | pystata | no (filtered raw log) | no | VS Code + MCP | ~440★, ~15.6k installs |
+| SepineTam/mcp-for-stata | AGPL-3.0 | do-file subprocess | no (raw SMCL) | no | MCP + CLI (7+ agents) | ~204★, very active |
+| haoyu-haoyu/stata-ai-fusion | MIT | pexpect | partial (`r()/e()/c()`) | no (text-flagged) | MCP + Skill + ext | ~32★, new |
+| stata_kernel (kylebarron) | GPL-3.0 | automation/console | no (raw log) | no | Jupyter | ~278★ |
+| nbstata (hugetim) | GPL-3.0 | pystata | no (raw log + widgets) | no | Jupyter | ~59★ |
+| pystata (StataCorp) | proprietary | native | `r()/e()`→dict, mat→NumPy | no (Py exceptions) | IPython magics | ships w/ Stata |
+| StatsPAI (sibling, Python) | MIT | Python reimpl. | yes (agent cards) | partial (validation) | MCP | ~244★, daily |
+### Reading the table
+- **The genuine head-to-head is `tmonk/mcp-stata`** — pystata-backed, MCP-native,
+  returns `r()/e()`, ships a skills catalog, and has StataCorp-newsletter
+  visibility. Its two gaps are our two wedges: **(a) no typed error taxonomy**
+  (agents still parse return codes + preserved red text) and **(b) AGPL-3.0**,
+  a hard blocker for commercial/embeddable adoption. MIT + typed kinds is the
+  clean answer to both.
+- **The adoption leader is `hanlulong/stata-mcp`** (~15.6k installs). Despite
+  using pystata — which *could* expose stored results — it surfaces **filtered
+  raw log text**, not a schema. Distribution, not schema quality, is its moat;
+  we must not assume a better contract auto-wins installs.
+- **Jupyter kernels and pystata itself are human-facing**, not agent-native:
+  raw log + images, Python exceptions, no MCP, no by-reference economy.
+- **The real *category* threat is the Python/R reimplementation wave**
+  (StatsPAI, rmcp): they took the "first agent-native econometrics" framing.
+  We do not beat them on method count — we beat them by being the authoritative
+  Stata leg they themselves reach for when cross-validating.
+## Why the error taxonomy is the defensible identity
+A typed error taxonomy is the one capability **no competitor has** and the
+**hardest to retrofit** onto a raw-log design. It is also cheap for us to lead
+on because we already have the architecture (`error.kind`, `error.suggestions`,
+`error.rc_label`, pinpoint context). The 2026-06-23 core pass made this concrete:
+- audited `RC_TO_KIND` against StataCorp `[P] error` (Stata 19) and corrected
+  multiple misclassifications (e.g. `not_sorted` is `r(5)`, not the unrelated
+  `r(119)`/`r(459)`);
+- populated `error.rc_label` with Stata's canonical short message via
+  `label_for_rc()` (it was silently empty for every real error before);
+- expanded `suggestions_for()` so nearly every common failure ships an
+  actionable recovery hint.
+The moat is not "we have error kinds" — it is "**our error kinds are correct,
+labeled, and paired with a recovery action**, verified against the manual." That
+is the kind of trust an empirical economist and a referee both need.
+## Long-term goals (6–12 months), by leverage
+Ranked. Each ties back to a roadmap pillar and is phrased as a durable outcome,
+not a feature list. Status legend: ✅ shipped · 🟡 foundation shipped, stretch
+remaining · ⬜ not started.
+1. ✅ **Own "agent-native typed Stata errors" as the headline.** *Shipped
+   2026-06-23.* `error.kind` is a stable, manual-verified contract (audited
+   against `[P] error`); every classified rc ships a canonical `rc_label`
+   (`label_for_rc`) and, where actionable, a remediation `suggestion`. The
+   **agent recovery contract** (`error.recovery` / `recovery_for`) gives a
+   defined next action per kind: retry-as-is, change-code, or escalate.
+   → Roadmap pillar 1 (reliable execution contract).
+2. ✅ **Per-command typed `r()/e()` result contracts for mandated commands.**
+   *Shipped 2026-06-23.* `RunResult.results.estimation` is a typed coefficient
+   table (term/b/se/statistic/p/CI + model_stats) from referee-grade `r(table)`
+   (or `e(b)`/`e(V)` fallback). It now also carries `command_family`
+   (ols/iv/gmm/panel/count/…) and command-aware `diagnostics` — the
+   identification/spec tests economists must report (`ivreg2`/`ivreghdfe`
+   weak-ID F + Hansen J, `xtabond2` AR(2)/Hansen, `reghdfe` within-R²/absorbed
+   FE, `xtreg` rho), surfaced only when present in `e()`. This is the
+   StatsPAI-defense: referee-grade numbers from the exact mandated command.
+   → Roadmap pillar 2.
+3. ✅ **Reproducibility / provenance envelope.** *Shipped 2026-06-23.*
+   `build_provenance()` captures Stata version/edition, `e(cmd)`, stata-code +
+   schema versions, timestamp, seed, and **per-package dependencies** parsed
+   from the script (`ssc`/`net install` → `Provenance.packages`);
+   `build_reproducible_do()` renders a `version`-pinned, seed-set re-runnable
+   `.do`; and `build_submission_package()` assembles a replication/journal
+   bundle (do + `PROVENANCE.json` + README manifest). *Stretch:* package
+   *version* pinning (vs. name only) and a journal-specific layout.
+   → Roadmap pillar 1 + 3.
+4. ✅ **Data-MCP integration bridge** (FRED / World Bank / Census).
+   *Shipped 2026-06-23.* `verify_dataset()` enforces the handoff's key checks
+   (row/var counts, observation bounds, required columns) on the captured
+   `DatasetInfo` — the executable companion to the `data-mcp-handoff` protocol,
+   documented in `references/structured-results.md`. *Stretch:* first-class
+   adapters that ferry source metadata (row hash, series ids) into the check
+   automatically. **No Stata MCP has done this composition.** → Roadmap pillar 4.
+5. 🟡 **Typed-schema-anchored skills catalog** — replication audits, robustness
+   sweeps, publication QA, legacy `.do` modernization — each anchored to typed
+   results + provenance, under MIT. *Foundation shipped:*
+   `references/structured-results.md` teaches agents to consume the typed
+   contracts (`results.estimation`/`diagnostics`, `error.recovery`/`rc_label`,
+   reproducible-do / submission bundles, `verify_dataset`). *Stretch:* the
+   audit/robustness/QA recipe set in the skills lane
+   (`skills/stata-code/references/recipes/**`). → Roadmap pillar 2.
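Goal 3's `version`-pinned, seed-set `.do` preamble can be pictured with a short sketch. It assumes only what the changelog states (Stata `version`, `set more off`, and an optional `set seed`); the function name and signature are hypothetical and are not those of `build_reproducible_do()`.

```python
def sketch_do_preamble(stata_version, seed=None):
    """Render a .do preamble that pins the Stata version and optionally sets a seed."""
    lines = [f"version {stata_version}", "set more off"]
    if seed is not None:
        # The seed line is emitted only when a seed was recorded for the run.
        lines.append(f"set seed {seed}")
    return "\n".join(lines) + "\n"


preamble = sketch_do_preamble("19", seed=12345)
```

Prepending a preamble like this to the original code is what makes a re-run independent of the interactive session's `version` and paging settings.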
+## Risks & threats
+- **StataCorp native AI — LOW near-term, monitor.** Stata 19 shipped classical
+  H2O ML only; no LLM/copilot/agent feature is shipped or announced, and
+  StataCorp frames AI as community-built. Watch *New in Stata* / StataNow for any
+  shift from tutorials to a shipped feature. Our durable hedge: we *require* a
+  genuine Stata license, so our incentives align with StataCorp's rather than
+  competing with them.
+- **Distribution gap.** hanlulong already has ~15.6k installs and tmonk has
+  newsletter visibility. A superior contract does not auto-win adoption; the
+  typed-error / reproducibility story must reach economists where they are
+  (SSC, Statalist, replication and referee communities).
+- **Category framing already taken by Python/R.** "Agent-native econometrics
+  with structured results + MCP" converged across StatsPAI and rmcp. Defend by
+  positioning on **authoritative Stata-runtime fidelity**, not breadth.
+- **Generic code-exec substitution.** A Jupyter-MCP + statsmodels sandbox can
+  "do econometrics" with zero domain tooling. Defense: *mandated Stata commands
+  + verifiable typed `e()` contracts* a generic sandbox cannot provide.
+- **License contagion.** The two structured/MCP-native competitors (tmonk,
+  SepineTam) are AGPL; the kernels are GPL. To preserve our MIT clean-room
+  wedge we must never vendor their code paths (see `LICENSE-POLICY.md`).
+- **Naming/trademark.** "Stata" is a StataCorp trademark and the `*stata-mcp`
+  namespace is crowded. Keep the "not affiliated with StataCorp" disclaimer
+  prominent and avoid implying endorsement.
+## Sources
+Competitor repos and listings (June 2026): tmonk/mcp-stata, tmonk/stata-workbench;
+hanlulong/stata-mcp (VS Code Marketplace: DeepEcon.stata-mcp); SepineTam/mcp-for-stata
+(PyPI `stata-mcp`); haoyu-haoyu/stata-ai-fusion; kylebarron/stata_kernel; hugetim/nbstata;
+StataCorp pystata docs and *New in Stata 19*; brycewang-stanford/StatsPAI; finite-sample/rmcp;
+data MCPs (datacommonsorg/agent-toolkit, stefanoamorelli/fred-mcp-server, worldbank/data360-mcp).
+StataCorp `[P] error` (Stata 19, 2025) is the authoritative source for the error-code audit.

{stata_code-0.8.1 → stata_code-0.9.0}/docs/industry-leader-roadmap.md RENAMED

@@ -46,6 +46,21 @@ causal libraries, or paid services. Those are separate tools. The durable
 boundary is: external data/model tools produce files or results; `stata-code`
 executes and audits the Stata side with traceable artifacts.
+## Market Refresh (2026-06-23)
+| Adjacent tool | Current strength | Implication for `stata-code` |
+| --- | --- | --- |
+| Official Stata PyStata + Jupyter support | Official Python-side Stata API, IPython magics, and notebook workflow | Keep `pystata` discovery reliable and treat official Stata as the execution source of truth |
+| `nbstata` / `stata_kernel` | Stata-first notebooks, autocomplete, graphs, data browsing, and rich notebook interaction | Win on the shared execution contract across notebooks, MCP, and VS Code rather than duplicating every notebook UI feature |
+| Stata Workbench / `mcp-stata` | Agent-facing IDE workflow with Stata execution, variables, graphs, logs, and multi-session framing | Compete through structured `RunResult`, token-economic artifacts, and clear audit trails |
+| `stata-mcp` / DeepEcon Stata MCP | One-command multi-agent install story, doctor diagnostics, and broad client messaging | Close setup-confidence gaps with read-only client-config visibility and explicit fallback commands |
+| Stata All in One / Stata Enhanced | Human editor polish: syntax, outline, hints, execution, and data viewing | Keep VS Code ergonomics practical, but route execution and artifacts through the same MCP/core schema |
+Near-term priority: keep the project boringly dependable for agents. That means
+read-only setup diagnostics, visible MCP client wiring, stable schema contracts,
+and artifact discovery should outrank broad new integrations unless a testable
+workflow needs them.
 ## One-Month Execution Plan
 ### Week 1: Workflow Layer

{stata_code-0.8.1 → stata_code-0.9.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "stata-code"
-version = "0.8.1"
+version = "0.9.0"
 description = "Agent-native Stata bridge — one core, multiple frontends (MCP, Jupyter, VSCode)"
 readme = "README.md"
 license = "MIT"
