PyPI - stata-code - Versions diffs - 0.7.0__tar.gz → 0.7.2__tar.gz - Mend

stata-code 0.7.0tar.gz → 0.7.2tar.gz

Files changed (61)

{stata_code-0.7.0 → stata_code-0.7.2}/.gitignore RENAMED

@@ -223,6 +223,8 @@ log-files/
 *.smcl
 *.dta
 !tests/fixtures/*.dta
+# Graph-export artifact written by the runner graph-capture tests
+stata_code_test_export.png
 # macOS
 .DS_Store

{stata_code-0.7.0 → stata_code-0.7.2}/CHANGELOG.md RENAMED

@@ -6,6 +6,66 @@ to semver-major.minor for the result schema (see `SCHEMA.md` §6).
 ## Unreleased
+## 0.7.2 — 2026-06-20
+### Added
+- **Three convenience MCP tools** raise the tool surface from 15 to 18:
+  - `install_package(name, source?, url?, replace?, session_id?)` — installs a
+    community package via `ssc install` / `net install` without the agent
+    having to remember the syntax, then verifies it resolves with `which`.
+    Package names and URLs are validated to keep them out of the generated
+    command line; failures surface the typed `error` block (e.g. `network`).
+  - `search_log(ref, pattern, is_regex?, ignore_case?, context?, max_matches?)`
+    — greps within a truncated `log://` payload and returns only the matching
+    lines (with optional context), so a long log can be inspected without
+    pulling the whole transcript back through `get_log`.
+  - `inspect_data(varlist?, detail?, session_id?)` — runs `describe` +
+    `codebook` and returns the structured `dataset` block plus the codebook
+    log: a one-call "what's in this dataset" the agent doesn't have to spell out.
+- **On-demand Stata reference library** under `skills/stata-code/references/`
+  (~4,200 lines): topic files for core syntax, data management, econometrics,
+  causal inference, panel/time series, graphics, and table export; load-bearing
+  `error-codes.md` (the full `rc → kind → fix` table + self-repair loop, aligned
+  with the typed-error taxonomy) and `defensive-coding.md`; and per-package notes
+  for `reghdfe`, `coefplot`, `estout`, and `gtools`. `SKILL.md` gained a routing
+  table (read 1–3 files on demand) and a live-vs-offline execution-mode section.
+- **`scripts/build_skill_zip.py`** packages the skill into a deterministic
+  `build/stata-code-skill.zip` for upload as Claude.ai project knowledge.
+## 0.7.1 — 2026-06-19
+### Fixed
+- **Jupyter kernel: graphs after the first cell now display.** Graph capture
+  detected new graphs by diffing in-memory graph names before/after a run.
+  Because Stata keeps only one graph per name and every unnamed graph command
+  overwrites the default `Graph` in place, the second and later cells of a
+  persistent session produced no name delta and their graphs were silently
+  dropped — only the first cell's graph ever rendered. Capture now also
+  re-exports any graph the cell's own source shows it (re)drew (every
+  `name(...)` target, plus the default `Graph` for any unnamed graph command),
+  so in-place redraws surface every time. The same fix covers repeated MCP
+  `stata_run` calls in one session. The graph-command detector was tightened
+  to distinguish drawing commands from `graph` utility subcommands (`export`,
+  `display`, `dir`, `drop`, …) so a utility-only cell no longer re-surfaces a
+  stale graph.
+- **Jupyter kernel: no more duplicated code echo in cell output.** pystata
+  runs a multi-line cell as a temporary do-file, and Stata echoes every
+  submitted command (`. cmd` / `> continuation`) regardless of `echo=False`
+  (which only suppresses echo for a single inline command). For a cell with no
+  textual output (e.g. a graph) that echo was the *only* thing shown — a
+  useless repeat of the source already visible in the input cell. The kernel
+  now strips command-echo lines before streaming, keeping genuine command
+  output. The full log (with echo) is unchanged in `RunResult.log` for MCP /
+  agent consumers.
+### Changed
+- **VS Code extension now ships a Marketplace icon** (coef-plot mark, Anthropic
+  palette on white) so the listing and Extensions sidebar render branded
+  artwork instead of the default placeholder.
 ## 0.7.0 — 2026-05-30
 ### Added

{stata_code-0.7.0 → stata_code-0.7.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: stata-code
-Version: 0.7.0
+Version: 0.7.2
 Summary: Agent-native Stata bridge — one core, multiple frontends (MCP, Jupyter, VSCode)
 Project-URL: Homepage, https://github.com/brycewang-stanford/stata-code
 Project-URL: Repository, https://github.com/brycewang-stanford/stata-code
@@ -84,9 +84,9 @@ Description-Content-Type: text/markdown
               └─────────────┘  └────────────┘  └─────────────────┘
 ```
-**Status: v0.6 (May 2026)** — the core, MCP server, Jupyter kernel, and VS Code extension work end-to-end against Stata 18 MP. The test suite covers schema, runner, MCP, kernel, notebook, run-index, subprocess-pool, and VS Code modules; CI also checks linting, type safety, schema generation, package metadata, and VSIX packaging. License: **MIT**.
+**Status: v0.7 (May 2026)** — the core, MCP server, Jupyter kernel, and VS Code extension work end-to-end against Stata 18 MP. The test suite covers schema, runner, MCP, kernel, notebook, run-index, subprocess-pool, and VS Code modules; CI also checks linting, type safety, schema generation, package metadata, and VSIX packaging. License: **MIT**.
-Two workflows v0.6 explicitly supports for end users:
+Two workflows the current release explicitly supports for end users:
 - **Run Stata code from a Jupyter notebook.** `pip install "stata-code[kernel]"` + `stata-code-kernel install --user` registers a **Stata** kernel that the Jupyter Notebook UI, JupyterLab, and the VS Code Jupyter extension all pick up by name. Cells render Stata logs, graphs, and warnings inline (the kernel logo bundled since v0.5 makes it appear in VS Code's kernel picker too). See [As a Jupyter Kernel](#as-a-jupyter-kernel).
 - **Optional agent "fix and rerun" loop.** `stata_run` returns typed `error.kind/line/context` plus `suggestions` on every failure. By default Claude Code only reports diagnostics — but if you explicitly say "fix this and rerun until it passes", the agent uses the same fields to edit your `.do` file and re-call `stata_run` until the run is green. The repair loop is **opt-in**: failed runs are diagnostics first, not automatic rewrite permission. See [Error Recovery in Agent Workflows](#error-recovery-in-agent-workflows).
@@ -188,7 +188,7 @@ claude mcp add stata-code --scope local -- stata-code-mcp
 claude mcp add stata-code --scope project -- stata-code-mcp
 ```
-Then launch `claude` and type `/mcp` to confirm `stata-code` shows up with its 15 tools (`stata_run`, `stata_info`, `get_log`, `get_graph`, `get_matrix`, `list_sessions`, `cancel_session`, `reset_session`, `notebook_outline`, `notebook_get_cell`, `notebook_locate`, `notebook_edit_cell`, `notebook_insert_cell`, `notebook_delete_cell`, `list_runs`).
+Then launch `claude` and type `/mcp` to confirm `stata-code` shows up with its 18 tools (`stata_run`, `stata_info`, `get_log`, `search_log`, `get_graph`, `get_matrix`, `inspect_data`, `install_package`, `list_sessions`, `cancel_session`, `reset_session`, `notebook_outline`, `notebook_get_cell`, `notebook_locate`, `notebook_edit_cell`, `notebook_insert_cell`, `notebook_delete_cell`, `list_runs`).
 #### Error Recovery in Agent Workflows
@@ -276,15 +276,18 @@ If an OpenAI-backed client reports `API Error: 400 Invalid schema for function
 upgrade to `stata-code>=0.6.5`, then restart the MCP client. Older server
 processes keep advertising the stale schema until they are restarted.
-The MCP server registers 15 tools:
+The MCP server registers 18 tools:
 | Tool | Purpose |
 | --- | --- |
 | `stata_run` | Execute Stata code and return a v1.0 RunResult JSON |
 | `stata_info` | Report Stata edition, version, and capabilities |
 | `get_log` | Fetch the full log behind a `log://` ref |
+| `search_log` | Search matching lines inside a stored `log://` payload |
 | `get_graph` | Fetch graph bytes behind a `graph://` ref (`ImageContent`) |
 | `get_matrix` | Fetch matrix payloads behind a `matrix://` ref |
+| `inspect_data` | Run `describe` + `codebook` and return compact dataset metadata |
+| `install_package` | Install an SSC or explicit `net install` package and verify it resolves |
 | `list_sessions` | Enumerate live sessions |
 | `cancel_session` | Cancel a session; the subprocess-backed path terminates in-flight runs and short-circuits pending ones |
 | `reset_session` | Drop a session's data |
@@ -416,7 +419,7 @@ stata_code/
 │   ├── runner.py      # in-process execute(); collects everything via sfi
 │   └── _pool.py       # subprocess workers for public API / MCP hard timeouts
 ├── mcp/
-│   └── server.py      # MCP server (15 tools)
+│   └── server.py      # MCP server (18 tools)
 └── kernel/
     └── kernel.py      # Jupyter kernel
 ```
@@ -444,7 +447,7 @@ stata_code/
 ## Roadmap
-### Done (through v0.6 — May 2026)
+### Done (through v0.7 — May 2026)
 - v1.0 result schema ([SCHEMA.md](SCHEMA.md))
 - `pystata`-based runner with native-typed `r()`, `e()`, and matrices
@@ -454,7 +457,7 @@ stata_code/
 - Log truncation with ref store
 - Warning extraction: 5 categories + generic notes
 - 32-kind error taxonomy with canonical suggestions
-- MCP server: 15 tools, including notebook navigation / search / atomic edits and the run-bundle index (`list_runs`)
+- MCP server: 18 tools, including notebook navigation / search / atomic edits, the run-bundle index (`list_runs`), log grep (`search_log`), dataset inspection (`inspect_data`), and package installation (`install_package`)
 - Jupyter kernel: rewired to the v1.0 pipeline, kernel logos bundled
 - Matrix size cap + `get_matrix(ref)` for large matrices (>10k cells)
 - Subprocess-backed hard timeout and cancellation for the public Python API and MCP server: `timeout_ms`, `cancel(session_id)`, and MCP `cancel_session`

{stata_code-0.7.0 → stata_code-0.7.2}/README.md RENAMED

@@ -45,9 +45,9 @@
               └─────────────┘  └────────────┘  └─────────────────┘
 ```
-**Status: v0.6 (May 2026)** — the core, MCP server, Jupyter kernel, and VS Code extension work end-to-end against Stata 18 MP. The test suite covers schema, runner, MCP, kernel, notebook, run-index, subprocess-pool, and VS Code modules; CI also checks linting, type safety, schema generation, package metadata, and VSIX packaging. License: **MIT**.
+**Status: v0.7 (May 2026)** — the core, MCP server, Jupyter kernel, and VS Code extension work end-to-end against Stata 18 MP. The test suite covers schema, runner, MCP, kernel, notebook, run-index, subprocess-pool, and VS Code modules; CI also checks linting, type safety, schema generation, package metadata, and VSIX packaging. License: **MIT**.
-Two workflows v0.6 explicitly supports for end users:
+Two workflows the current release explicitly supports for end users:
 - **Run Stata code from a Jupyter notebook.** `pip install "stata-code[kernel]"` + `stata-code-kernel install --user` registers a **Stata** kernel that the Jupyter Notebook UI, JupyterLab, and the VS Code Jupyter extension all pick up by name. Cells render Stata logs, graphs, and warnings inline (the kernel logo bundled since v0.5 makes it appear in VS Code's kernel picker too). See [As a Jupyter Kernel](#as-a-jupyter-kernel).
 - **Optional agent "fix and rerun" loop.** `stata_run` returns typed `error.kind/line/context` plus `suggestions` on every failure. By default Claude Code only reports diagnostics — but if you explicitly say "fix this and rerun until it passes", the agent uses the same fields to edit your `.do` file and re-call `stata_run` until the run is green. The repair loop is **opt-in**: failed runs are diagnostics first, not automatic rewrite permission. See [Error Recovery in Agent Workflows](#error-recovery-in-agent-workflows).
@@ -149,7 +149,7 @@ claude mcp add stata-code --scope local -- stata-code-mcp
 claude mcp add stata-code --scope project -- stata-code-mcp
 ```
-Then launch `claude` and type `/mcp` to confirm `stata-code` shows up with its 15 tools (`stata_run`, `stata_info`, `get_log`, `get_graph`, `get_matrix`, `list_sessions`, `cancel_session`, `reset_session`, `notebook_outline`, `notebook_get_cell`, `notebook_locate`, `notebook_edit_cell`, `notebook_insert_cell`, `notebook_delete_cell`, `list_runs`).
+Then launch `claude` and type `/mcp` to confirm `stata-code` shows up with its 18 tools (`stata_run`, `stata_info`, `get_log`, `search_log`, `get_graph`, `get_matrix`, `inspect_data`, `install_package`, `list_sessions`, `cancel_session`, `reset_session`, `notebook_outline`, `notebook_get_cell`, `notebook_locate`, `notebook_edit_cell`, `notebook_insert_cell`, `notebook_delete_cell`, `list_runs`).
 #### Error Recovery in Agent Workflows
@@ -237,15 +237,18 @@ If an OpenAI-backed client reports `API Error: 400 Invalid schema for function
 upgrade to `stata-code>=0.6.5`, then restart the MCP client. Older server
 processes keep advertising the stale schema until they are restarted.
-The MCP server registers 15 tools:
+The MCP server registers 18 tools:
 | Tool | Purpose |
 | --- | --- |
 | `stata_run` | Execute Stata code and return a v1.0 RunResult JSON |
 | `stata_info` | Report Stata edition, version, and capabilities |
 | `get_log` | Fetch the full log behind a `log://` ref |
+| `search_log` | Search matching lines inside a stored `log://` payload |
 | `get_graph` | Fetch graph bytes behind a `graph://` ref (`ImageContent`) |
 | `get_matrix` | Fetch matrix payloads behind a `matrix://` ref |
+| `inspect_data` | Run `describe` + `codebook` and return compact dataset metadata |
+| `install_package` | Install an SSC or explicit `net install` package and verify it resolves |
 | `list_sessions` | Enumerate live sessions |
 | `cancel_session` | Cancel a session; the subprocess-backed path terminates in-flight runs and short-circuits pending ones |
 | `reset_session` | Drop a session's data |
@@ -377,7 +380,7 @@ stata_code/
 │   ├── runner.py      # in-process execute(); collects everything via sfi
 │   └── _pool.py       # subprocess workers for public API / MCP hard timeouts
 ├── mcp/
-│   └── server.py      # MCP server (15 tools)
+│   └── server.py      # MCP server (18 tools)
 └── kernel/
     └── kernel.py      # Jupyter kernel
 ```
@@ -405,7 +408,7 @@ stata_code/
 ## Roadmap
-### Done (through v0.6 — May 2026)
+### Done (through v0.7 — May 2026)
 - v1.0 result schema ([SCHEMA.md](SCHEMA.md))
 - `pystata`-based runner with native-typed `r()`, `e()`, and matrices
@@ -415,7 +418,7 @@ stata_code/
 - Log truncation with ref store
 - Warning extraction: 5 categories + generic notes
 - 32-kind error taxonomy with canonical suggestions
-- MCP server: 15 tools, including notebook navigation / search / atomic edits and the run-bundle index (`list_runs`)
+- MCP server: 18 tools, including notebook navigation / search / atomic edits, the run-bundle index (`list_runs`), log grep (`search_log`), dataset inspection (`inspect_data`), and package installation (`install_package`)
 - Jupyter kernel: rewired to the v1.0 pipeline, kernel logos bundled
 - Matrix size cap + `get_matrix(ref)` for large matrices (>10k cells)
 - Subprocess-backed hard timeout and cancellation for the public Python API and MCP server: `timeout_ms`, `cancel(session_id)`, and MCP `cancel_session`

{stata_code-0.7.0 → stata_code-0.7.2}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "stata-code"
-version = "0.7.0"
+version = "0.7.2"
 description = "Agent-native Stata bridge — one core, multiple frontends (MCP, Jupyter, VSCode)"
 readme = "README.md"
 license = "MIT"

stata_code-0.7.2/scripts/build_skill_zip.py ADDED

@@ -0,0 +1,105 @@
+"""Package the ``stata-code`` skill into a single uploadable ``.zip``.
+The skill (``skills/stata-code/SKILL.md`` + the ``references/`` library) is
+consumed two ways:
+* In-repo / Claude Code — read straight from ``skills/stata-code/``.
+* Claude.ai project knowledge — uploaded as a ``.zip``. This script builds
+  that archive.
+The archive contains a single top-level ``stata-code/`` folder so it extracts
+cleanly::
+    stata-code/SKILL.md
+    stata-code/references/econometrics.md
+    stata-code/references/packages/reghdfe.md
+    ...
+Run::
+    python scripts/build_skill_zip.py                 # -> build/stata-code-skill.zip
+    python scripts/build_skill_zip.py -o /tmp/out.zip  # custom destination
+The build is deterministic (sorted entries, fixed timestamps) so re-running it
+on unchanged inputs produces a byte-identical archive.
+"""
+from __future__ import annotations
+import argparse
+import sys
+import zipfile
+from pathlib import Path
+REPO_ROOT = Path(__file__).resolve().parent.parent
+SKILL_DIR = REPO_ROOT / "skills" / "stata-code"
+DEFAULT_OUTPUT = REPO_ROOT / "build" / "stata-code-skill.zip"
+ARCHIVE_PREFIX = "stata-code"
+# Fixed timestamp for reproducible archives (zip epoch starts at 1980).
+_FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)
+def collect_files(skill_dir: Path = SKILL_DIR) -> list[Path]:
+    """Return every shippable skill file, sorted, relative-stable.
+    Excludes editor/OS cruft so the archive is clean.
+    """
+    if not skill_dir.is_dir():
+        raise FileNotFoundError(f"skill directory not found: {skill_dir}")
+    skip = {".DS_Store"}
+    files = [
+        p
+        for p in skill_dir.rglob("*")
+        if p.is_file() and p.name not in skip and "__pycache__" not in p.parts
+    ]
+    return sorted(files)
+def build_zip(
+    dest: Path = DEFAULT_OUTPUT,
+    skill_dir: Path = SKILL_DIR,
+) -> list[str]:
+    """Write the skill archive to ``dest``; return the arcnames included."""
+    files = collect_files(skill_dir)
+    if not files:
+        raise FileNotFoundError(f"no skill files under {skill_dir}")
+    dest.parent.mkdir(parents=True, exist_ok=True)
+    arcnames: list[str] = []
+    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as zf:
+        for path in files:
+            rel = path.relative_to(skill_dir).as_posix()
+            arcname = f"{ARCHIVE_PREFIX}/{rel}"
+            info = zipfile.ZipInfo(arcname, date_time=_FIXED_DATE_TIME)
+            info.compress_type = zipfile.ZIP_DEFLATED
+            info.external_attr = 0o644 << 16  # regular file, rw-r--r--
+            zf.writestr(info, path.read_bytes())
+            arcnames.append(arcname)
+    return arcnames
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "-o",
+        "--output",
+        type=Path,
+        default=DEFAULT_OUTPUT,
+        help=f"Destination .zip (default: {DEFAULT_OUTPUT.relative_to(REPO_ROOT)}).",
+    )
+    args = parser.parse_args()
+    try:
+        arcnames = build_zip(args.output)
+    except FileNotFoundError as exc:
+        print(f"error: {exc}", file=sys.stderr)
+        return 1
+    size = args.output.stat().st_size
+    print(f"wrote: {args.output}  ({len(arcnames)} files, {size:,} bytes)")
+    return 0
+if __name__ == "__main__":
+    sys.exit(main())
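
The script's docstring claims the build is deterministic because entries are sorted and timestamps are fixed. A small self-contained check of that property might look like the sketch below. It reuses the same `zipfile.ZipInfo` technique as the script, but `write_deterministic_zip`, `demo`, and the throwaway file names are illustrative assumptions, not part of the package.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path

FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)  # earliest timestamp the zip format allows


def write_deterministic_zip(src_dir: Path, dest: Path, prefix: str) -> None:
    """Zip every file under ``src_dir`` with sorted order and fixed metadata."""
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            arcname = f"{prefix}/{path.relative_to(src_dir).as_posix()}"
            info = zipfile.ZipInfo(arcname, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions, not the file's own
            zf.writestr(info, path.read_bytes())


def demo() -> tuple[str, str]:
    """Build the same tree twice and return the SHA-256 of each archive."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        skill = root / "skill"
        (skill / "references").mkdir(parents=True)
        (skill / "SKILL.md").write_text("# demo skill\n", encoding="utf-8")
        (skill / "references" / "a.md").write_text("alpha\n", encoding="utf-8")

        first, second = root / "first.zip", root / "second.zip"
        write_deterministic_zip(skill, first, "demo")
        write_deterministic_zip(skill, second, "demo")
        return (
            hashlib.sha256(first.read_bytes()).hexdigest(),
            hashlib.sha256(second.read_bytes()).hexdigest(),
        )


digest_a, digest_b = demo()
print(digest_a == digest_b)
```

Writing through `ZipInfo` rather than `ZipFile.write` is what removes the filesystem's modification times and permission bits from the archive.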

{stata_code-0.7.0 → stata_code-0.7.2}/stata_code/__init__.py RENAMED

@@ -174,7 +174,7 @@ def is_available() -> bool:
     return True
-__version__ = "0.7.0"
+__version__ = "0.7.2"
 __all__ = [
     # Primary entry points

{stata_code-0.7.0 → stata_code-0.7.2}/stata_code/core/runner.py RENAMED

@@ -218,6 +218,107 @@ def get_log(ref: str) -> dict[str, Any]:
     }
+def search_log(
+    ref: str,
+    pattern: str,
+    *,
+    is_regex: bool = False,
+    ignore_case: bool = True,
+    context: int = 0,
+    max_matches: int = 50,
+) -> dict[str, Any]:
+    """Auxiliary tool: grep within a stored ``log://`` payload.
+    Pairs with the token-economy default of returning long logs by
+    reference: instead of pulling the whole log back with
+    :func:`get_log`, the agent can find just the lines it cares about.
+    Parameters
+    ----------
+    ref : str
+        A ``log://<request_id>`` ref produced by a truncated ``stata_run``.
+    pattern : str
+        Substring (default) or regular expression (``is_regex=True``) to
+        match against each line.
+    is_regex : bool
+        Treat ``pattern`` as a Python regular expression. A malformed
+        regex raises :class:`ValueError` (surfaced as ``invalid_request``).
+    ignore_case : bool
+        Case-insensitive matching (default ``True``).
+    context : int
+        Lines of surrounding context to include on each side of a match
+        (capped at 10). ``before`` / ``after`` are omitted when 0.
+    max_matches : int
+        Stop after this many matches; ``truncated`` reports whether more
+        existed (capped at 1000).
+    Returns
+    -------
+    dict
+        ``{ref, pattern, is_regex, lines_total, match_count, truncated,
+        matches: [{line_no, text, before?, after?}]}``. ``line_no`` is
+        1-based. Raises :class:`RefNotFound` for an unknown ref.
+    """
+    payload = _refs.get(ref)
+    if (
+        not isinstance(payload, dict)
+        or not isinstance(payload.get("text"), str)
+        or "lines_total" not in payload
+    ):
+        raise RefNotFound(ref, kind="unknown_log_ref")
+    if not pattern:
+        raise ValueError("pattern must be a non-empty string")
+    context = max(0, min(int(context), 10))
+    max_matches = max(1, min(int(max_matches), 1000))
+    flags = re.IGNORECASE if ignore_case else 0
+    if is_regex:
+        try:
+            matcher = re.compile(pattern, flags)
+        except re.error as exc:
+            raise ValueError(f"invalid regex: {exc}") from exc
+        def _hit(line: str) -> bool:
+            return matcher.search(line) is not None
+    else:
+        needle = pattern.lower() if ignore_case else pattern
+        def _hit(line: str) -> bool:
+            hay = line.lower() if ignore_case else line
+            return needle in hay
+    text: str = payload["text"]
+    lines = text.split("\n")
+    matches: list[dict[str, Any]] = []
+    truncated = False
+    for idx, line in enumerate(lines):
+        if not _hit(line):
+            continue
+        if len(matches) >= max_matches:
+            truncated = True
+            break
+        entry: dict[str, Any] = {"line_no": idx + 1, "text": line}
+        if context:
+            before = lines[max(0, idx - context):idx]
+            after = lines[idx + 1:idx + 1 + context]
+            if before:
+                entry["before"] = before
+            if after:
+                entry["after"] = after
+        matches.append(entry)
+    return {
+        "ref": ref,
+        "pattern": pattern,
+        "is_regex": is_regex,
+        "lines_total": payload["lines_total"],
+        "match_count": len(matches),
+        "truncated": truncated,
+        "matches": matches,
+    }
 def cancel(session_id: str = "main") -> bool:
     """Request cancellation of the next ``execute()`` call for ``session_id``.
@@ -1195,11 +1296,20 @@ def _extract_warnings(log: str) -> list:  # list[StataWarning]
 _GRAPH_NAME_RE = re.compile(r"\bname\(\s*([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)
+# Stata's default in-memory graph name, (re)used by any graph command that
+# omits an explicit `name(...)` option. Capture/redraw detection keys off this.
+_DEFAULT_GRAPH_NAME = "Graph"
+# Commands that actually *draw* a graph (and thus create/overwrite an
+# in-memory graph). Deliberately excludes the `graph` utility subcommands
+# (export, display, dir, drop, describe, save, use, rename, set, copy, query,
+# replay) — those operate on existing graphs and must not be mistaken for a
+# redraw, or a bare `graph export` cell would spuriously re-surface a stale
+# graph.
 _GRAPH_COMMAND_RE = re.compile(
     r"^\s*(?:"
-    r"graph\s+\w+|"
-    r"twoway|scatter|line|connected|histogram|kdensity|lowess|lfit|qfit|"
-    r"coefplot|binscatter"
+    r"graph\s+(?:bar|hbar|box|hbox|dot|pie|twoway|matrix|combine)\b|"
+    r"twoway|scatter|line|connected|histogram|hist|kdensity|lpoly|lowess|"
+    r"lfit|qfit|coefplot|binscatter|marginsplot"
     r")\b",
     re.IGNORECASE,
 )
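
To see what the tightened detector accepts, the new pattern can be copied out and run against a few lines. The pattern below is taken verbatim from the added lines above. The wrapper `is_drawing_command` and the sample commands are assumptions for illustration, and the sketch checks one line at a time without claiming anything about how the runner feeds it input.

```python
import re

# Pattern copied from the 0.7.2 side of the hunk.
GRAPH_COMMAND_RE = re.compile(
    r"^\s*(?:"
    r"graph\s+(?:bar|hbar|box|hbox|dot|pie|twoway|matrix|combine)\b|"
    r"twoway|scatter|line|connected|histogram|hist|kdensity|lpoly|lowess|"
    r"lfit|qfit|coefplot|binscatter|marginsplot"
    r")\b",
    re.IGNORECASE,
)


def is_drawing_command(line: str) -> bool:
    """True when ``line`` starts with a command the pattern treats as drawing."""
    return GRAPH_COMMAND_RE.match(line) is not None


samples = [
    "scatter price mpg",
    "hist price",
    "graph bar (mean) price, over(foreign)",
    "graph export out.png, replace",
    "graph drop _all",
    "linear_thing x",
]
for command in samples:
    print(f"{command!r}: {is_drawing_command(command)}")
```

Under the 0.7.0 pattern, `graph\s+\w+` would have matched `graph export` and `graph drop` as well, which is the stale-graph case the changelog describes.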
@@ -1262,19 +1372,40 @@ def _collect_graphs(
     source_hints: dict[str, tuple[str, int]] | None = None,
     unnamed_source_hints: list[tuple[str, int]] | None = None,
 ) -> list[GraphInfo]:
-    """Capture graphs that user code newly created.
+    """Capture graphs that user code newly created or redrew.
     Strategy: snapshot graph names before user code (`pre_existing`), call
-    after to find the post-existing list, take the set difference. For each
-    new graph: `graph display <name>` (makes it active), `graph export` to a
-    tmpfile, read bytes, store under a ref. Tmpfile is deleted after.
+    after to find the post-existing list. Capture a graph when its name is
+    genuinely new *or* when this cell's source shows it (re)drew that name.
+    The redraw case matters because Stata keeps only one in-memory graph per
+    name, so a command that overwrites an existing name (most commonly the
+    default ``Graph``, produced by any unnamed graph command) leaves the
+    ``graph dir`` name set unchanged. A pure set-difference against
+    `pre_existing` therefore misses it — which is why, in a persistent session
+    (Jupyter cell 2+, repeated MCP runs), only the first graph ever surfaced.
+    For each captured graph: `graph display <name>` (makes it active),
+    `graph export` to a tmpfile, read bytes, store under a ref. Tmpfile is
+    deleted after.
     """
     after_names = _list_graph_names(rt)
-    new_names = [n for n in after_names if n not in pre_existing]
-    if not new_names:
-        return []
     source_hints = source_hints or {}
     unnamed_source_hints = unnamed_source_hints or []
+    # Names this cell explicitly drew, inferred from its source: every
+    # `name(...)` option, plus the default graph when any unnamed graph
+    # command ran. These are re-captured even if they already existed, so an
+    # in-place redraw is not dropped.
+    redrawn = set(source_hints)
+    if unnamed_source_hints:
+        redrawn.add(_DEFAULT_GRAPH_NAME)
+    new_names = [
+        n for n in after_names if n not in pre_existing or n in redrawn
+    ]
+    if not new_names:
+        return []
     unattributed_names = [n for n in new_names if n not in source_hints]
     unnamed_by_graph: dict[str, tuple[str, int]] = {}
     if len(unattributed_names) == len(unnamed_source_hints):
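
The selection rule introduced in this hunk is small enough to isolate. The sketch below restates it as a pure function under simplifying assumptions: `select_graphs_to_capture` is a hypothetical name, graph names are plain strings, and the source hints are reduced to a set of `name(...)` targets plus a flag for unnamed graph commands.

```python
DEFAULT_GRAPH_NAME = "Graph"  # Stata's default in-memory graph name


def select_graphs_to_capture(
    pre_existing: set[str],
    after_names: list[str],
    named_targets: set[str],
    has_unnamed_draw: bool,
) -> list[str]:
    """Pick graphs that are new, or that this cell's source shows it redrew."""
    redrawn = set(named_targets)
    if has_unnamed_draw:
        # Any unnamed graph command overwrites the default graph in place.
        redrawn.add(DEFAULT_GRAPH_NAME)
    return [n for n in after_names if n not in pre_existing or n in redrawn]


# Second cell of a session: "Graph" already exists and is redrawn in place.
second_cell = select_graphs_to_capture({"Graph"}, ["Graph"], set(), True)
# Utility-only cell (for example a bare `graph export`): nothing is redrawn.
utility_cell = select_graphs_to_capture({"Graph"}, ["Graph"], set(), False)
print(second_cell, utility_cell)
```

With the 0.7.0 rule (set difference only), the first call would have returned an empty list, which matches the "only the first cell's graph ever rendered" symptom in the changelog.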

{stata_code-0.7.0 → stata_code-0.7.2}/stata_code/kernel/kernel.py RENAMED

@@ -102,6 +102,35 @@ def _word_at_cursor(code: str, cursor_pos: int) -> tuple[str, int, int]:
     return code[start:end], start, end
+def _strip_command_echo(log_text: str) -> str:
+    """Drop Stata's do-file command echo from a captured cell log.
+    pystata runs a multi-line cell as a temporary do-file, and Stata echoes
+    every submitted command — ``. cmd`` for the first line of each command and
+    ``> ...`` for wrapped/continued lines — regardless of the ``echo=False``
+    flag (which only suppresses echo for a single inline command). In a
+    notebook the input cell already shows the source, so the echo is pure
+    duplication; for a cell with no textual output (e.g. a graph) the echo is
+    the *only* thing shown, which reads as a useless repeat of the code.
+    Strip the echoed command/continuation lines, keep genuine command output,
+    and collapse the blank-line runs the removal leaves behind. Echoed lines
+    always start at column 0 with ``. `` (dot-space) or ``> `` (continuation);
+    real Stata output never begins that way, so this is safe.
+    """
+    kept: list[str] = []
+    for line in log_text.split("\n"):
+        if line.startswith(". ") or line.startswith("> "):
+            continue
+        # Collapse leading and consecutive blank lines left by removed echoes.
+        if not line.strip() and (not kept or not kept[-1].strip()):
+            continue
+        kept.append(line)
+    while kept and not kept[-1].strip():
+        kept.pop()
+    return "\n".join(kept)
 # ─────────────────────────────────────────────────────────────────────────────
 # Kernel
 # ─────────────────────────────────────────────────────────────────────────────
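
A short run makes the effect of `_strip_command_echo` concrete. The function body below is copied from the added lines above under a public name, `strip_command_echo`. The sample log is invented for illustration and is only assumed to resemble what a do-file run echoes.

```python
def strip_command_echo(log_text: str) -> str:
    """Drop `. cmd` / `> continuation` echo lines and collapse blank runs."""
    kept: list[str] = []
    for line in log_text.split("\n"):
        if line.startswith(". ") or line.startswith("> "):
            continue
        # Collapse leading and consecutive blank lines left by removed echoes.
        if not line.strip() and (not kept or not kept[-1].strip()):
            continue
        kept.append(line)
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept)


sample = (
    ". sysuse auto\n"
    "(1978 automobile data)\n"
    "\n"
    ". scatter price mpg\n"
    "\n"
    ". display 2+2\n"
    "4\n"
)
cleaned = strip_command_echo(sample)
print(cleaned)

graph_only = strip_command_echo(". scatter price mpg\n")
print(repr(graph_only))
```

The graph-only case returns an empty string, so the `if log_text:` guard in the kernel streams nothing for a cell whose only output is a figure.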
@@ -155,8 +184,9 @@ class StataKernel(_KernelBase):
         self._last_result = result
         if not silent:
-            if result.log.head:
-                self._stream("stdout", result.log.head + "\n")
+            log_text = _strip_command_echo(result.log.head) if result.log.head else ""
+            if log_text:
+                self._stream("stdout", log_text + "\n")
             if result.warnings:
                 for w in result.warnings:
                     self._stream("stderr", f"[{w.kind}] {w.message}\n")
