PyPI - stata-code - Versions diffs - 0.3.1__tar.gz → 0.5.0__tar.gz - Mend

stata-code 0.3.1.tar.gz → 0.5.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47)

{stata_code-0.3.1 → stata_code-0.5.0}/.gitignore RENAMED

@@ -218,7 +218,21 @@ __marimo__/
 .streamlit/secrets.toml
 # Stata-specific
+log-files/
 *.gph
 *.smcl
 *.dta
 !tests/fixtures/*.dta
+# macOS
+.DS_Store
+**/.DS_Store
+# VS Code workspace settings (contain user-machine absolute paths)
+.vscode/
+# Claude Code scratch (worktrees, transcripts)
+.claude/
+# Demo / scratch notebooks (real tests live under tests/)
+demo-tests/*.ipynb

{stata_code-0.3.1 → stata_code-0.5.0}/CHANGELOG.md RENAMED

@@ -4,7 +4,106 @@ All notable changes to `stata-code` are documented here. The format follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project adheres
 to semver-major.minor for the result schema (see `SCHEMA.md` §6).
-## [Unreleased]
+## [0.5.0] — 2026-05-08
+### Added
+- **Bundled Jupyter kernel logos.** `stata-code-kernel install --user`
+  now copies `stata_code/kernel/assets/{logo-32x32.png,logo-64x64.png,
+  logo-svg.svg}` into the kernelspec source dir before
+  `KernelSpecManager().install_kernel_spec` runs. VS Code's Jupyter
+  extension filters out kernelspecs that lack logo files, so prior
+  releases were invisible in its kernel picker; v0.5 fixes that without
+  affecting JupyterLab or classic Jupyter (which both already worked).
+- **TestPyPI publishing step in `release.yml`.** Tag `v*` now publishes
+  to TestPyPI (via OIDC trusted publishing, environment `testpypi`)
+  before publishing to PyPI proper. `continue-on-error: true` keeps
+  PyPI + GitHub Release on the happy path even when TestPyPI is
+  misconfigured. Setup mirrors the PyPI trusted publisher and is
+  documented in [CLAUDE.md](CLAUDE.md).
+### Changed
+- **`stata_run` tool description and README** clarify the boundary
+  between non-mutating execution and the optional agent "fix and
+  rerun" repair loop. The tool itself never rewrites your `.do` file
+  — but the submitted Stata code can still produce logs, graphs, and
+  output files as usual. Repair loops require explicit user opt-in;
+  failed runs are diagnostics first, not automatic rewrite permission.
+- **VSCode MCP-client handshake version aligned to 0.5.0** (was a
+  stale 0.3.2 since the v0.3.2 release).
+### Fixed
+- **`install_kernel` no longer `.resolve()`s `sys.executable`.** On
+  macOS Homebrew venvs (and other layouts that use a `python` symlink
+  outside the venv's `bin/` to a Cellar-style real interpreter),
+  resolving the symlink pointed Jupyter at an interpreter that
+  couldn't import `stata_code`. The kernelspec now keeps the
+  unresolved `sys.executable`, so the venv's `python` (with
+  `stata_code` on its `sys.path`) launches the kernel.
+## [0.4.0] — 2026-05-07
+### Added
+- **Persistent per-run log bundles.** When a `.do` file path is supplied as
+  `origin_path`, the runner writes an immutable `log-files/<run>/` directory
+  next to the source file containing:
+  - `<run>.log` and `<run>.smcl` — Stata's textual and SMCL logs
+  - `manifest.json` — run metadata (elapsed_ms, rc, session, Stata edition)
+  - `submitted.do` — a snapshot of the code that was executed
+  - `graphs/` — captured graph files materialized from graph refs
+  - `outputs/` — newly created or modified table/export files copied from
+    the run's working directory
+  The directory name encodes UTC timestamp, session, and request IDs so
+  parallel runs and reruns are never ambiguous.
+- **Working-directory defaults from `origin_path`.** Before running,
+  Stata `cd`s to the `.do` file's parent so relative `graph export`,
+  `putexcel`, `esttab using`, `collect export`, etc. output next to the
+  source. Toggle with `use_origin_workdir` / `useDoFileDirectory` setting.
+  Explicit `working_dir` overrides this.
+- **Schema extensions.** `LogInfo.files` (`LogFileInfo`) carries the
+  bundle paths and derived `graphs_dir`/`outputs_dir`; `GraphInfo.file_path`
+  records where a graph was materialized; two new capabilities
+  `log_files` and `run_artifacts` signal support.
+- **MCP tool options.** `stata_run` gains `persist_log_files`,
+  `persist_generated_files`, `origin_path`, `origin_kind`,
+  `origin_label`, `use_origin_workdir`, `working_dir`.
+- **VS Code settings.** Three new configuration options:
+  `stataCode.persistLogFiles` (default `true`),
+  `stataCode.persistGeneratedFiles` (default `true`),
+  `stataCode.useDoFileDirectory` (default `true`).
+- **VS Code tree views.** The Last Result tree now shows "saved" and
+  "N outputs" badges on the log node when artifacts are present; the
+  output log header prints `working_dir:`, `log_file:`, `smcl_file:`,
+  `graphs_dir:`, `outputs_dir:` for each run.
+### Changed
+- **VSCode MCP startup.** The extension now expands common macOS Python
+  script directories before spawning `stata-code-mcp`, tries workspace
+  `.venv` and `python -m stata_code.mcp` fallbacks for the default command,
+  and writes child-process stderr to the `stata-code` output channel so
+  missing PATH / missing dependency failures are actionable.
+- **VSCode toolbar ordering.** Run-all and run-selection now share the same
+  ordinary `editor/title` toolbar sequence, with ordering moved later in the
+  `navigation` group to reduce interleaving from other extensions.
+## [0.3.2] — 2026-05-08
+### Changed
+- **VSCode toolbar ordering.** Editor title-bar actions now live in one
+  contiguous `navigation` group so `stata-code` buttons stay together. The
+  order prioritizes run commands first, then data/output views, session
+  controls, cancellation/reset, and working-directory actions.
 ## [0.3.1] — 2026-05-07
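
The `install_kernel` fix in the 0.5.0 entry above turns on the difference between a symlinked interpreter path and its resolved target. The following is a minimal sketch of that difference, not the package's actual code: `build_kernel_argv` is a hypothetical helper, and the launch arguments (`-m stata_code.kernel -f {connection_file}`) are an assumption modelled on the usual kernelspec `argv` layout.

```python
import os
import tempfile
from pathlib import Path


def build_kernel_argv(executable: str, resolve: bool = False) -> list[str]:
    """Hypothetical helper: build a kernelspec argv for an interpreter path.

    With resolve=True the symlink is followed (the pre-0.5.0 behaviour the
    changelog describes); with resolve=False the path is kept as given.
    The launch arguments are an assumption, not taken from stata-code.
    """
    path = Path(executable)
    if resolve:
        path = path.resolve()
    return [str(path), "-m", "stata_code.kernel", "-f", "{connection_file}"]


def demo() -> tuple[str, str]:
    """Build a fake venv whose `python` is a symlink to a Cellar-style binary."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        real = root / "Cellar" / "python3"
        real.parent.mkdir()
        real.touch()
        link = root / "venv" / "bin" / "python"
        link.parent.mkdir(parents=True)
        os.symlink(real, link)
        kept = build_kernel_argv(str(link))[0]
        resolved = build_kernel_argv(str(link), resolve=True)[0]
        return kept, resolved


if __name__ == "__main__":
    kept, resolved = demo()
    print("unresolved:", kept)      # path inside the venv
    print("resolved:  ", resolved)  # path outside the venv
```

Keeping the unresolved path matters because the venv's `python` is what carries the venv's `sys.path`; the resolved interpreter may not be able to import packages installed in the venv.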

{stata_code-0.3.1 → stata_code-0.5.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: stata-code
-Version: 0.3.1
+Version: 0.5.0
 Summary: Agent-native Stata bridge — one core, multiple frontends (MCP, Jupyter, VSCode)
 Project-URL: Homepage, https://github.com/brycewang-stanford/stata-code
 Project-URL: Repository, https://github.com/brycewang-stanford/stata-code
@@ -59,7 +59,12 @@ Description-Content-Type: text/markdown
               └─────────────┘  └────────────┘  └─────────────────┘
 ```
-**Status: v0.2 (May 2026)** — the core, MCP server, and Jupyter kernel work end-to-end against Stata 18 MP. Current test suite: 144 passing tests (88 no-Stata unit tests + 56 real-Stata integration tests). License: **MIT**.
+**Status: v0.5 (May 2026)** — the core, MCP server, Jupyter kernel, and VS Code extension work end-to-end against Stata 18 MP. Current test suite: 218 passing tests across schema, runner, MCP, kernel, and ref-store modules. License: **MIT**.
+Two workflows v0.5 explicitly supports for end users:
+- **Run Stata code from a Jupyter notebook.** `pip install "stata-code[kernel]"` + `stata-code-kernel install --user` registers a **Stata** kernel that the Jupyter Notebook UI, JupyterLab, and the VS Code Jupyter extension all pick up by name. Cells render Stata logs, graphs, and warnings inline (the kernel logo bundled in v0.5 makes it appear in VS Code's kernel picker too). See [As a Jupyter Kernel](#as-a-jupyter-kernel).
+- **Optional agent "fix and rerun" loop.** `stata_run` returns typed `error.kind/line/context` plus `suggestions` on every failure. By default Claude Code only reports diagnostics — but if you explicitly say "fix this and rerun until it passes", the agent uses the same fields to edit your `.do` file and re-call `stata_run` until the run is green. The repair loop is **opt-in**: failed runs are diagnostics first, not automatic rewrite permission. See [Error Recovery in Agent Workflows](#error-recovery-in-agent-workflows).
 ---
@@ -135,7 +140,7 @@ else:
 ### As an MCP Server
-After `pip install stata-code`, the `stata-code-mcp` binary is on your `PATH`. You can wire it into Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
+After `pip install "stata-code[mcp]"`, the `stata-code-mcp` binary is on your `PATH`. You can wire it into Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
 #### Claude Code via `claude mcp add` (recommended)
@@ -156,6 +161,15 @@ claude mcp add stata-code --scope project -- stata-code-mcp
 Then launch `claude` and type `/mcp` to confirm `stata-code` shows up with its 8 tools (`stata_run`, `stata_info`, `get_log`, `get_graph`, `get_matrix`, `list_sessions`, `cancel_session`, `reset_session`).
+#### Error Recovery in Agent Workflows
+`stata_run` does not rewrite the source `.do` file or change code on its own. It executes the submitted Stata code, so that code may still create logs, graphs, tables, or other outputs as usual. When Stata fails, `stata_run` returns typed diagnostics (`error.kind`, `error.message`, `error.line`, `error.context`) plus best-effort `suggestions`. That supports two distinct Claude Code workflows:
+- For "run this do-file" or "verify this code", Claude can report the failure and suggested next steps without changing source files.
+- For "fix this and rerun until it passes", Claude can use the same structured error fields to edit the `.do` file, call `stata_run` again, and iterate.
+If you want the repair loop, say so explicitly. Otherwise, treat failed runs as diagnostics first, not as automatic permission to rewrite code.
 #### `uvx` (no global pip install)
 If you prefer not to `pip install stata-code` globally, run it ephemerally through [`uv`](https://github.com/astral-sh/uv):
@@ -201,17 +215,30 @@ The MCP server registers 8 tools:
 ### As a Jupyter Kernel
+`stata-code` ships a Jupyter kernel as part of the Python package — there is no separate "Jupyter plugin" in the JupyterLab extension marketplace. Installation is two steps: `pip install` the package with the `kernel` extra, then register the kernelspec with Jupyter.
+**Prerequisites**: Stata 17+ installed locally with a valid license (the kernel calls Stata via `pystata`), and Python 3.10+ with `jupyter`/`jupyterlab` already on the same environment.
 ```bash
+# 1. Install stata-code with the kernel extra (pulls in ipykernel)
+pip install "stata-code[kernel]"
+# 2. Register the kernelspec into Jupyter's user data dir
 stata-code-kernel install --user
+# Or, equivalently:
+# python -m stata_code.kernel install --user
 ```
-Or install it as a module:
+Verify the kernel is registered:
 ```bash
-python -m stata_code.kernel install --user
+jupyter kernelspec list
+# should include an entry named `stata`
 ```
-Then open a notebook and select the **Stata** kernel. Stata commands run in cells; logs, graphs, and warnings render inline.
+Then open Jupyter Notebook / JupyterLab (or a `.ipynb` in VS Code), pick **Stata** in the kernel selector, and run Stata commands in cells. Logs, graphs, and warnings render inline.
+> JupyterLab's Extension Manager only installs front-end JS extensions, so it cannot install a kernel — `pip install` plus the `install --user` step above is the only supported path.
 ### As a VS Code Extension
@@ -224,7 +251,7 @@ code --install-extension brycewang-stanford.stata-code-vscode
 Or open the **Extensions** sidebar in VS Code and search `stata-code`.
-The extension still requires `stata-code` itself to be importable on your system Python (`pip install stata-code`), so that `stata-code-mcp` resolves on `PATH`. Stata 17+ and a valid Stata license are required as for any other frontend.
+The extension still requires the MCP extra on your system Python (`pip install "stata-code[mcp]"`), so that `stata-code-mcp` resolves on `PATH` and can import the MCP SDK. Stata 17+ and a valid Stata license are required as for any other frontend.
 ---
@@ -301,7 +328,7 @@ stata_code/
 ## Roadmap
-### Done (v0.2 — May 2026)
+### Done (through v0.5 — May 2026)
 - v1.0 result schema ([SCHEMA.md](SCHEMA.md))
 - `pystata`-based runner with native-typed `r()`, `e()`, and matrices
@@ -334,7 +361,7 @@ See [SCHEMA.md §7](SCHEMA.md) for explicitly out-of-scope items.
 ```bash
 pip install -e ".[dev,mcp,kernel]"
-pytest                              # full suite (144 tests)
+pytest                              # full suite (218 tests)
 pytest -m "not stata_required"      # CI subset; no Stata needed
 pytest -m "stata_required" -v       # Stata-only integration tests
 ```
@@ -389,7 +416,12 @@ The Stata tooling landscape that this project builds on and learns from is surve
               └─────────────┘  └────────────┘  └─────────────────┘
 ```
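-**Status: v0.2 (May 2026)**: the core, MCP server, and Jupyter kernel already run end-to-end on Stata 18 MP. Current tests: 144 passing (88 unit tests that do not need Stata + 56 real-Stata integration tests). License: **MIT**.
+**Status: v0.5 (May 2026)**: the core, MCP server, Jupyter kernel, and VS Code extension all run end-to-end on Stata 18 MP. Test suite: 218 passing tests covering schema, runner, MCP, kernel, and ref-store. License: **MIT**.
+Two user workflows that v0.5 explicitly supports:
+- **Run Stata code in a Jupyter notebook.** `pip install "stata-code[kernel]"` + `stata-code-kernel install --user` registers a kernel named **Stata** that Jupyter Notebook, JupyterLab, and the VS Code Jupyter extension all show in their kernel selectors. Write Stata commands directly in cells; logs, graphs, and warnings render inline (v0.5 bundles the kernel logo into the PyPI wheel, so the VS Code Jupyter kernel picker displays it correctly too). See [As a Jupyter kernel](#作为-jupyter-kernel) below.
+- **Optional agent "fix and rerun" loop.** `stata_run` returns structured `error.kind/line/context` and `suggestions` on every failure. By default Claude Code only reports them as diagnostics; but if you explicitly say "fix it until it runs" or "fix and rerun until it succeeds", the agent uses the same fields to edit the `.do` file and call `stata_run` again until the code passes. This repair loop is **opt-in**: by default a failure means diagnostics, not permission to rewrite automatically. See [Error recovery in agent workflows](#agent-工作流里的报错恢复) below.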
 ---
@@ -464,7 +496,7 @@ else:
 ### As an MCP server
-After `pip install stata-code`, `stata-code-mcp` appears on your `PATH`. You can connect it to Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
+After `pip install "stata-code[mcp]"`, `stata-code-mcp` appears on your `PATH`. You can connect it to Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
 #### Connect to Claude Code with `claude mcp add` (recommended)
@@ -485,6 +517,15 @@ claude mcp add stata-code --scope project -- stata-code-mcp
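 Then run `claude` and type `/mcp` to confirm that `stata-code` appears with its 8 tools (`stata_run`, `stata_info`, `get_log`, `get_graph`, `get_matrix`, `list_sessions`, `cancel_session`, `reset_session`).
+#### Error recovery in agent workflows
+`stata_run` does not rewrite the source `.do` file or change code for you on its own. It executes the submitted Stata code, so the code itself may still produce logs, graphs, tables, or other outputs as usual. When Stata reports an error, `stata_run` returns structured diagnostics (`error.kind`, `error.message`, `error.line`, `error.context`) and best-effort `suggestions`. This supports two distinct Claude Code workflows:
+- If you say "run this do-file" or "verify this code", Claude can report just the cause of the failure and suggested next steps, without modifying source files.
+- If you explicitly say "fix it until it runs" or "fix and rerun until it succeeds", Claude can edit the `.do` file based on the same structured error fields, then call `stata_run` again and keep iterating.
+If you want the automatic repair loop, say so explicitly. Otherwise, a failed run should be treated first as a diagnostic result, not as permission to rewrite code automatically.
 #### With `uvx` (no global pip install needed)
 If you would rather not `pip install stata-code` globally, you can run it ephemerally with [`uv`](https://github.com/astral-sh/uv):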
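@@ -530,17 +571,30 @@ The MCP server registers 8 tools:
 ### As a Jupyter kernel
+`stata-code`'s Jupyter support is packaged as a **kernel** inside the Python package; there is **no** standalone "stata-code plugin" in the JupyterLab extension marketplace. Installation takes two steps: first `pip install` the package with the `kernel` extra, then register the kernelspec with Jupyter.
+**Prerequisites**: Stata 17+ installed locally with a valid license (the kernel calls the local Stata through `pystata`), `jupyter`/`jupyterlab` already installed in the same Python environment, and Python ≥ 3.10.
 ```bash
+# 1. Install stata-code with the kernel extra (also installs ipykernel)
+pip install "stata-code[kernel]"
+# 2. Register the kernelspec into the current user's Jupyter data dir
 stata-code-kernel install --user
+# Equivalent command:
+# python -m stata_code.kernel install --user
 ```
-Or install it directly as a module:
+Check that the kernel registered successfully:
 ```bash
-python -m stata_code.kernel install --user
+jupyter kernelspec list
+# the output should include an entry named `stata`
 ```
-Then open a notebook and select the **Stata** kernel. Stata commands run in cells, and logs, graphs, and warnings are displayed inline.
+Then open Jupyter Notebook / JupyterLab (or a `.ipynb` in VS Code), pick **Stata** in the kernel selector, and write Stata commands directly in cells; logs, graphs, and warnings are displayed inline.
+> JupyterLab's Extension Manager can only install front-end JS extensions; it **cannot install a kernel**. So the `pip install` + `install --user` steps above are the only supported installation path.
 ### As a VS Code extension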
@@ -553,7 +607,7 @@ code --install-extension brycewang-stanford.stata-code-vscode
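 Or open the **Extensions** sidebar in VS Code and search for `stata-code`.
-The extension still depends on `stata-code` being importable on your system Python (`pip install stata-code`), which ensures `stata-code-mcp` is available on `PATH`. As with the other frontends, Stata 17+ and a valid Stata license are required.
+The extension still depends on the MCP extra being installed on your system Python (`pip install "stata-code[mcp]"`), which ensures `stata-code-mcp` is available on `PATH` and can import the MCP SDK. As with the other frontends, Stata 17+ and a valid Stata license are required.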
 ---

{stata_code-0.3.1 → stata_code-0.5.0}/README.md RENAMED

@@ -21,7 +21,12 @@
               └─────────────┘  └────────────┘  └─────────────────┘
 ```
-**Status: v0.2 (May 2026)** — the core, MCP server, and Jupyter kernel work end-to-end against Stata 18 MP. Current test suite: 144 passing tests (88 no-Stata unit tests + 56 real-Stata integration tests). License: **MIT**.
+**Status: v0.5 (May 2026)** — the core, MCP server, Jupyter kernel, and VS Code extension work end-to-end against Stata 18 MP. Current test suite: 218 passing tests across schema, runner, MCP, kernel, and ref-store modules. License: **MIT**.
+Two workflows v0.5 explicitly supports for end users:
+- **Run Stata code from a Jupyter notebook.** `pip install "stata-code[kernel]"` + `stata-code-kernel install --user` registers a **Stata** kernel that the Jupyter Notebook UI, JupyterLab, and the VS Code Jupyter extension all pick up by name. Cells render Stata logs, graphs, and warnings inline (the kernel logo bundled in v0.5 makes it appear in VS Code's kernel picker too). See [As a Jupyter Kernel](#as-a-jupyter-kernel).
+- **Optional agent "fix and rerun" loop.** `stata_run` returns typed `error.kind/line/context` plus `suggestions` on every failure. By default Claude Code only reports diagnostics — but if you explicitly say "fix this and rerun until it passes", the agent uses the same fields to edit your `.do` file and re-call `stata_run` until the run is green. The repair loop is **opt-in**: failed runs are diagnostics first, not automatic rewrite permission. See [Error Recovery in Agent Workflows](#error-recovery-in-agent-workflows).
 ---
@@ -97,7 +102,7 @@ else:
 ### As an MCP Server
-After `pip install stata-code`, the `stata-code-mcp` binary is on your `PATH`. You can wire it into Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
+After `pip install "stata-code[mcp]"`, the `stata-code-mcp` binary is on your `PATH`. You can wire it into Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
 #### Claude Code via `claude mcp add` (recommended)
@@ -118,6 +123,15 @@ claude mcp add stata-code --scope project -- stata-code-mcp
 Then launch `claude` and type `/mcp` to confirm `stata-code` shows up with its 8 tools (`stata_run`, `stata_info`, `get_log`, `get_graph`, `get_matrix`, `list_sessions`, `cancel_session`, `reset_session`).
+#### Error Recovery in Agent Workflows
+`stata_run` does not rewrite the source `.do` file or change code on its own. It executes the submitted Stata code, so that code may still create logs, graphs, tables, or other outputs as usual. When Stata fails, `stata_run` returns typed diagnostics (`error.kind`, `error.message`, `error.line`, `error.context`) plus best-effort `suggestions`. That supports two distinct Claude Code workflows:
+- For "run this do-file" or "verify this code", Claude can report the failure and suggested next steps without changing source files.
+- For "fix this and rerun until it passes", Claude can use the same structured error fields to edit the `.do` file, call `stata_run` again, and iterate.
+If you want the repair loop, say so explicitly. Otherwise, treat failed runs as diagnostics first, not as automatic permission to rewrite code.
 #### `uvx` (no global pip install)
 If you prefer not to `pip install stata-code` globally, run it ephemerally through [`uv`](https://github.com/astral-sh/uv):
@@ -163,17 +177,30 @@ The MCP server registers 8 tools:
 ### As a Jupyter Kernel
+`stata-code` ships a Jupyter kernel as part of the Python package — there is no separate "Jupyter plugin" in the JupyterLab extension marketplace. Installation is two steps: `pip install` the package with the `kernel` extra, then register the kernelspec with Jupyter.
+**Prerequisites**: Stata 17+ installed locally with a valid license (the kernel calls Stata via `pystata`), and Python 3.10+ with `jupyter`/`jupyterlab` already on the same environment.
 ```bash
+# 1. Install stata-code with the kernel extra (pulls in ipykernel)
+pip install "stata-code[kernel]"
+# 2. Register the kernelspec into Jupyter's user data dir
 stata-code-kernel install --user
+# Or, equivalently:
+# python -m stata_code.kernel install --user
 ```
-Or install it as a module:
+Verify the kernel is registered:
 ```bash
-python -m stata_code.kernel install --user
+jupyter kernelspec list
+# should include an entry named `stata`
 ```
-Then open a notebook and select the **Stata** kernel. Stata commands run in cells; logs, graphs, and warnings render inline.
+Then open Jupyter Notebook / JupyterLab (or a `.ipynb` in VS Code), pick **Stata** in the kernel selector, and run Stata commands in cells. Logs, graphs, and warnings render inline.
+> JupyterLab's Extension Manager only installs front-end JS extensions, so it cannot install a kernel — `pip install` plus the `install --user` step above is the only supported path.
 ### As a VS Code Extension
@@ -186,7 +213,7 @@ code --install-extension brycewang-stanford.stata-code-vscode
 Or open the **Extensions** sidebar in VS Code and search `stata-code`.
-The extension still requires `stata-code` itself to be importable on your system Python (`pip install stata-code`), so that `stata-code-mcp` resolves on `PATH`. Stata 17+ and a valid Stata license are required as for any other frontend.
+The extension still requires the MCP extra on your system Python (`pip install "stata-code[mcp]"`), so that `stata-code-mcp` resolves on `PATH` and can import the MCP SDK. Stata 17+ and a valid Stata license are required as for any other frontend.
 ---
@@ -263,7 +290,7 @@ stata_code/
 ## Roadmap
-### Done (v0.2 — May 2026)
+### Done (through v0.5 — May 2026)
 - v1.0 result schema ([SCHEMA.md](SCHEMA.md))
 - `pystata`-based runner with native-typed `r()`, `e()`, and matrices
@@ -296,7 +323,7 @@ See [SCHEMA.md §7](SCHEMA.md) for explicitly out-of-scope items.
 ```bash
 pip install -e ".[dev,mcp,kernel]"
-pytest                              # full suite (144 tests)
+pytest                              # full suite (218 tests)
 pytest -m "not stata_required"      # CI subset; no Stata needed
 pytest -m "stata_required" -v       # Stata-only integration tests
 ```
@@ -351,7 +378,12 @@ The Stata tooling landscape that this project builds on and learns from is surve
               └─────────────┘  └────────────┘  └─────────────────┘
 ```
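-**Status: v0.2 (May 2026)**: the core, MCP server, and Jupyter kernel already run end-to-end on Stata 18 MP. Current tests: 144 passing (88 unit tests that do not need Stata + 56 real-Stata integration tests). License: **MIT**.
+**Status: v0.5 (May 2026)**: the core, MCP server, Jupyter kernel, and VS Code extension all run end-to-end on Stata 18 MP. Test suite: 218 passing tests covering schema, runner, MCP, kernel, and ref-store. License: **MIT**.
+Two user workflows that v0.5 explicitly supports:
+- **Run Stata code in a Jupyter notebook.** `pip install "stata-code[kernel]"` + `stata-code-kernel install --user` registers a kernel named **Stata** that Jupyter Notebook, JupyterLab, and the VS Code Jupyter extension all show in their kernel selectors. Write Stata commands directly in cells; logs, graphs, and warnings render inline (v0.5 bundles the kernel logo into the PyPI wheel, so the VS Code Jupyter kernel picker displays it correctly too). See [As a Jupyter kernel](#作为-jupyter-kernel) below.
+- **Optional agent "fix and rerun" loop.** `stata_run` returns structured `error.kind/line/context` and `suggestions` on every failure. By default Claude Code only reports them as diagnostics; but if you explicitly say "fix it until it runs" or "fix and rerun until it succeeds", the agent uses the same fields to edit the `.do` file and call `stata_run` again until the code passes. This repair loop is **opt-in**: by default a failure means diagnostics, not permission to rewrite automatically. See [Error recovery in agent workflows](#agent-工作流里的报错恢复) below.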
 ---
@@ -426,7 +458,7 @@ else:
 ### As an MCP server
-After `pip install stata-code`, `stata-code-mcp` appears on your `PATH`. You can connect it to Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
+After `pip install "stata-code[mcp]"`, `stata-code-mcp` appears on your `PATH`. You can connect it to Claude Code, Cursor, Claude Desktop, or any other MCP-compatible client.
 #### Connect to Claude Code with `claude mcp add` (recommended)
@@ -447,6 +479,15 @@ claude mcp add stata-code --scope project -- stata-code-mcp
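 Then run `claude` and type `/mcp` to confirm that `stata-code` appears with its 8 tools (`stata_run`, `stata_info`, `get_log`, `get_graph`, `get_matrix`, `list_sessions`, `cancel_session`, `reset_session`).
+#### Error recovery in agent workflows
+`stata_run` does not rewrite the source `.do` file or change code for you on its own. It executes the submitted Stata code, so the code itself may still produce logs, graphs, tables, or other outputs as usual. When Stata reports an error, `stata_run` returns structured diagnostics (`error.kind`, `error.message`, `error.line`, `error.context`) and best-effort `suggestions`. This supports two distinct Claude Code workflows:
+- If you say "run this do-file" or "verify this code", Claude can report just the cause of the failure and suggested next steps, without modifying source files.
+- If you explicitly say "fix it until it runs" or "fix and rerun until it succeeds", Claude can edit the `.do` file based on the same structured error fields, then call `stata_run` again and keep iterating.
+If you want the automatic repair loop, say so explicitly. Otherwise, a failed run should be treated first as a diagnostic result, not as permission to rewrite code automatically.
 #### With `uvx` (no global pip install needed)
 If you would rather not `pip install stata-code` globally, you can run it ephemerally with [`uv`](https://github.com/astral-sh/uv):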
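@@ -492,17 +533,30 @@ The MCP server registers 8 tools:
 ### As a Jupyter kernel
+`stata-code`'s Jupyter support is packaged as a **kernel** inside the Python package; there is **no** standalone "stata-code plugin" in the JupyterLab extension marketplace. Installation takes two steps: first `pip install` the package with the `kernel` extra, then register the kernelspec with Jupyter.
+**Prerequisites**: Stata 17+ installed locally with a valid license (the kernel calls the local Stata through `pystata`), `jupyter`/`jupyterlab` already installed in the same Python environment, and Python ≥ 3.10.
 ```bash
+# 1. Install stata-code with the kernel extra (also installs ipykernel)
+pip install "stata-code[kernel]"
+# 2. Register the kernelspec into the current user's Jupyter data dir
 stata-code-kernel install --user
+# Equivalent command:
+# python -m stata_code.kernel install --user
 ```
-Or install it directly as a module:
+Check that the kernel registered successfully:
 ```bash
-python -m stata_code.kernel install --user
+jupyter kernelspec list
+# the output should include an entry named `stata`
 ```
-Then open a notebook and select the **Stata** kernel. Stata commands run in cells, and logs, graphs, and warnings are displayed inline.
+Then open Jupyter Notebook / JupyterLab (or a `.ipynb` in VS Code), pick **Stata** in the kernel selector, and write Stata commands directly in cells; logs, graphs, and warnings are displayed inline.
+> JupyterLab's Extension Manager can only install front-end JS extensions; it **cannot install a kernel**. So the `pip install` + `install --user` steps above are the only supported installation path.
 ### As a VS Code extension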
@@ -515,7 +569,7 @@ code --install-extension brycewang-stanford.stata-code-vscode
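 Or open the **Extensions** sidebar in VS Code and search for `stata-code`.
-The extension still depends on `stata-code` being importable on your system Python (`pip install stata-code`), which ensures `stata-code-mcp` is available on `PATH`. As with the other frontends, Stata 17+ and a valid Stata license are required.
+The extension still depends on the MCP extra being installed on your system Python (`pip install "stata-code[mcp]"`), which ensures `stata-code-mcp` is available on `PATH` and can import the MCP SDK. As with the other frontends, Stata 17+ and a valid Stata license are required.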
 ---

{stata_code-0.3.1 → stata_code-0.5.0}/SCHEMA.md RENAMED

@@ -228,6 +228,7 @@ The single biggest token-economy decision in the schema. Default response carrie
 | `complete` | `bool` | Reserved for v2 streaming. Always `true` in v1. v2 may emit interim results with `complete: false`. |
 | `error_window` | `string \| null` | When `error` is non-null, the ~10 log lines immediately surrounding the failing emission (regardless of `head`/`tail` window). Cheap for the producer to compute; saves agents from bumping `log_lines` or fetching the full log just to see "what did Stata say right when it broke." `null` on success or when not computable. |
 | `ref` | `string \| null` | Opaque reference for `get_log`. Required when `truncated: true`; may be set when `truncated: false` for caller convenience; `null` is allowed when full log is in `head`. |
+| `files` | `object \| null` | Persistent `.log` / `.smcl` artifacts written for file-backed runs when requested. `null` when no files were written. See "Persistent log files" below. |
 **ANSI handling.** All log views (`head`, `tail`, `error_window`, the payload returned by `get_log(ref)`) are ANSI-escape-stripped, consistently.
@@ -237,6 +238,37 @@ The single biggest token-economy decision in the schema. Default response carrie
 **Defaults.** `head=20`, `tail=20`. Configurable per call via `log_lines_head` / `log_lines_tail` (see §4). If `lines_total ≤ head+tail`, the producer MUST set `truncated: false`, place the full log in `head`, set `tail: ""`, and set `ref: null`.
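
The windowing rule in the **Defaults** paragraph can be sketched as a small function. This is an illustrative sketch only: `window_log` and its dictionary layout are hypothetical, chosen to mirror the `head`, `tail`, `truncated`, `ref`, and `lines_total` names used in the schema text, and the `ref` value is supplied by the caller rather than produced by any real ref store.

```python
from typing import Optional


def window_log(log_text: str, head: int = 20, tail: int = 20,
               ref: Optional[str] = None) -> dict:
    """Hypothetical helper applying the head/tail rule described above."""
    lines = log_text.splitlines()
    total = len(lines)
    if total <= head + tail:
        # Short log: everything goes in `head`, nothing is truncated.
        return {"head": "\n".join(lines), "tail": "", "truncated": False,
                "ref": None, "lines_total": total}
    if ref is None:
        # The schema requires a ref whenever the log is truncated.
        raise ValueError("a ref is required when the log is truncated")
    return {"head": "\n".join(lines[:head]),
            "tail": "\n".join(lines[total - tail:]),
            "truncated": True, "ref": ref, "lines_total": total}


if __name__ == "__main__":
    short = window_log("\n".join(f"line {i}" for i in range(40)))
    long = window_log("\n".join(f"line {i}" for i in range(50)), ref="log:demo")
    print(short["truncated"], long["truncated"])
```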
+**Persistent log files.** When a frontend passes a source `.do` path and requests `persist_log_files`, producers write immutable run artifacts under:
+```text
+<do-file-dir>/log-files/<do-stem>__<UTC timestamp>__<session_id>__<request_id>/
+```
+`log.files` then has:
+```json
+{
+  "directory": "/abs/path/log-files/test1__20260508T012233123Z__main__abc123",
+  "log_path": "/abs/path/.../test1__20260508T012233123Z__main__abc123.log",
+  "smcl_path": "/abs/path/.../test1__20260508T012233123Z__main__abc123.smcl",
+  "manifest_path": "/abs/path/.../manifest.json",
+  "code_path": "/abs/path/.../submitted.do",
+  "working_dir": "/abs/path",
+  "graphs_dir": "/abs/path/.../graphs",
+  "outputs_dir": "/abs/path/.../outputs",
+  "graph_paths": ["/abs/path/.../graphs/01-Graph.png"],
+  "output_paths": ["/abs/path/.../outputs/table.xlsx"],
+  "policy": "per_run_directory",
+  "append": false
+}
+```
+The stable folder name is `log-files`; timestamps belong on child run directories, not on the root. Producers SHOULD NOT append different executions into one log file, because parallel sessions, reruns after a pause, and selection/cell executions become ambiguous. Each run directory SHOULD include a manifest and submitted-code snapshot so the log is attributable without relying on editor history.
+When `origin_path` is supplied, producers SHOULD default Stata's working directory to the source `.do` file's directory before running. This mirrors how users organize project-relative `graph export`, `putexcel`, `esttab using`, `collect export`, and similar output commands. Frontends may disable this with `use_origin_workdir: false` or override it with `working_dir`.
+When `persist_generated_files` is true, producers SHOULD copy newly created or modified common output files from the run working directory into `outputs/`, preserving relative paths where practical. Captured graph refs SHOULD also be materialized into `graphs/`, with the corresponding `GraphInfo.file_path` set.
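
The run-directory naming scheme above can be sketched as follows. The helper names `run_dir_name` and `run_dir` are hypothetical, and the timestamp layout (date, `T`, time, three millisecond digits, `Z`) is an assumption inferred from the `20260508T012233123Z` example rather than a documented format.

```python
from datetime import datetime, timezone
from pathlib import Path


def run_dir_name(do_path: str, session_id: str, request_id: str,
                 now: datetime) -> str:
    """Hypothetical helper: <do-stem>__<UTC timestamp>__<session_id>__<request_id>."""
    stamp = now.astimezone(timezone.utc)
    # Assumed layout: YYYYMMDDTHHMMSS + milliseconds + "Z".
    ts = stamp.strftime("%Y%m%dT%H%M%S") + f"{stamp.microsecond // 1000:03d}Z"
    return f"{Path(do_path).stem}__{ts}__{session_id}__{request_id}"


def run_dir(do_path: str, session_id: str, request_id: str,
            now: datetime) -> Path:
    """Place the run directory under a stable `log-files/` next to the .do file."""
    name = run_dir_name(do_path, session_id, request_id, now)
    return Path(do_path).parent / "log-files" / name


if __name__ == "__main__":
    when = datetime(2026, 5, 8, 1, 22, 33, 123000, tzinfo=timezone.utc)
    print(run_dir("/abs/path/test1.do", "main", "abc123", when))
```

Because the timestamp, session, and request IDs all appear in the child directory name, two runs of the same `.do` file get distinct directories while the `log-files` root stays stable.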
 ### 3.4 `results`
 Stata's `r()` and `e()` return dictionaries, structurally separated. Each follows the same shape:
@@ -317,6 +349,7 @@ Each entry describes one captured graph. By default the bytes are **not** inline
 | `source_command` | `string \| null` | The user-submitted command line that produced this graph, when isolatable. |
 | `source_line` | `int \| null` | 1-indexed line within the submitted code that produced this graph. |
 | `inline` | `string \| null` | Base64-encoded bytes when the caller explicitly asked for inline (`include_graphs: "inline"`); else `null`. |
+| `file_path` | `string \| null` | Persistent graph file path when the run bundle materialized captured graphs under `log.files.graphs_dir`; else `null`. |
 ### 3.7 `error`
@@ -356,7 +389,7 @@ Populated iff `ok: false`. The schema's most important contribution to agent UX:
 }
 ```
-Suggestions are best-effort; agents should treat them as hints, not directives. The `kind` enum below documents what suggestions are typically populated.
+Suggestions are best-effort; agents should treat them as hints, not directives. A suggestion is not consent to mutate source files or silently retry changed code; consumers should apply fixes automatically only in workflows where the user requested repair or approved iteration. The `kind` enum below documents what suggestions are typically populated.
 **`kind` enum (v1.0):**
@@ -426,6 +459,11 @@ The schema also dictates what callers may *ask for*. Every frontend exposes the
 | `graph_format` | `"png" \| "svg" \| "pdf"` | `"png"` | Render format. |
 | `include_dataset_variables` | `bool` | `true` | Set `false` to omit `dataset.variables`. |
 | `timeout_ms` | `int \| null` | `600000` (10 min) | Hard timeout. `null` disables. On expiry, returns `ok: false`, `error.kind: "timeout"`, `rc: -2`. Frontends MAY override the default if their use case demands. |
+| `persist_log_files` | `bool` | `false` | With `origin_path`, writes immutable `.log` / `.smcl` / manifest files under the source `.do` file's `log-files/` directory. |
+| `persist_generated_files` | `bool` | `true` | When log files are persisted, also copies newly created or modified table/export files into `outputs/` and captured graphs into `graphs/`. |
+| `origin_path` | `string \| null` | `null` | Absolute source `.do` path used for working-directory defaults and run-bundle placement. |
+| `use_origin_workdir` | `bool` | `true` | With `origin_path`, `cd` Stata to the source `.do` directory before running. |
+| `working_dir` | `string \| null` | `null` | Explicit Stata working directory; overrides the source `.do` directory. |
 Frontends translate their native idiom (MCP `inputSchema`, Jupyter kernel options, VSCode commands) into these names without renaming.
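
The precedence among `working_dir`, `use_origin_workdir`, and `origin_path` in the table above can be written out explicitly. This is a minimal sketch under the assumption that "no change" is represented by `None`; `resolve_workdir` is a hypothetical name, not a function from the package.

```python
from pathlib import Path
from typing import Optional


def resolve_workdir(origin_path: Optional[str] = None,
                    use_origin_workdir: bool = True,
                    working_dir: Optional[str] = None) -> Optional[Path]:
    """Hypothetical helper: pick the directory Stata should `cd` to, if any."""
    if working_dir is not None:
        # An explicit working_dir overrides the source .do directory.
        return Path(working_dir)
    if origin_path is not None and use_origin_workdir:
        # Default: run next to the source .do file.
        return Path(origin_path).parent
    # Assumption: None means "leave Stata's current directory unchanged".
    return None


if __name__ == "__main__":
    print(resolve_workdir(origin_path="/proj/analysis/main.do"))
    print(resolve_workdir(origin_path="/proj/analysis/main.do",
                          working_dir="/proj/out"))
    print(resolve_workdir(origin_path="/proj/analysis/main.do",
                          use_origin_workdir=False))
```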
@@ -481,6 +519,8 @@ These are *additions* to `run()`. A minimal client only needs `run()` plus which
 | `matrix_ref` | Producer can emit large matrices as refs and supports `get_matrix`. |
 | `multi_session` | Producer supports `session_id != "main"` and `list_sessions`. |
 | `inline_graphs` | Producer supports `include_graphs: "inline"`. |
+| `log_files` | Producer can persist immutable per-run `.log` / `.smcl` bundles. |
+| `run_artifacts` | Producer can materialize captured graphs and copied table/export outputs into the run bundle. |
 Consumers detect optional features via `capabilities`, not by parsing `schema_version`. Producers may add entries; agents MUST treat unknown capability names as opaque.
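
A consumer-side check for the two new capabilities might look like the sketch below. It assumes `capabilities` arrives as an iterable of capability-name strings, which the diff does not spell out, and `plan_run_options` is a hypothetical helper; only the option and capability names come from the schema text.

```python
from typing import Iterable


def supports(capabilities: Iterable[str], name: str) -> bool:
    """Membership test only; unknown capability names are simply ignored."""
    return name in set(capabilities)


def plan_run_options(capabilities: Iterable[str], do_path: str) -> dict:
    """Hypothetical helper: request persistence only when the producer offers it."""
    caps = set(capabilities)
    options = {"origin_path": do_path}
    if supports(caps, "log_files"):
        options["persist_log_files"] = True
        # Copying graphs/outputs into the bundle needs `run_artifacts` too.
        options["persist_generated_files"] = supports(caps, "run_artifacts")
    return options


if __name__ == "__main__":
    print(plan_run_options(["multi_session", "log_files"], "/proj/main.do"))
    print(plan_run_options(["inline_graphs", "future_feature"], "/proj/main.do"))
```

Checking names rather than parsing `schema_version` keeps the consumer working when a producer adds capabilities it has never heard of.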

{stata_code-0.3.1 → stata_code-0.5.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "stata-code"
-version = "0.3.1"
+version = "0.5.0"
 description = "Agent-native Stata bridge — one core, multiple frontends (MCP, Jupyter, VSCode)"
 readme = "README.md"
 license = "MIT"
