research-git 0.0.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- research_git-0.0.1.dist-info/METADATA +215 -0
- research_git-0.0.1.dist-info/RECORD +44 -0
- research_git-0.0.1.dist-info/WHEEL +5 -0
- research_git-0.0.1.dist-info/entry_points.txt +2 -0
- research_git-0.0.1.dist-info/licenses/LICENSE +21 -0
- research_git-0.0.1.dist-info/top_level.txt +1 -0
- rgit/__init__.py +1 -0
- rgit/_plugin/.claude-plugin/marketplace.json +15 -0
- rgit/_plugin/.claude-plugin/plugin.json +14 -0
- rgit/_plugin/agents/capsule-regenerator.md +70 -0
- rgit/_plugin/agents/capsule-segmenter.md +58 -0
- rgit/_plugin/agents/edge-judge.md +59 -0
- rgit/_plugin/skills/rgit-capture/SKILL.md +95 -0
- rgit/_plugin/skills/rgit-recall/SKILL.md +54 -0
- rgit/ablation.py +70 -0
- rgit/agent_guidance.py +169 -0
- rgit/agent_platforms.py +49 -0
- rgit/astmap.py +106 -0
- rgit/cli.py +417 -0
- rgit/compare.py +125 -0
- rgit/compose.py +48 -0
- rgit/curation.py +48 -0
- rgit/edges.py +75 -0
- rgit/gitutil.py +112 -0
- rgit/graphview.py +238 -0
- rgit/hooks.py +64 -0
- rgit/installer.py +248 -0
- rgit/mcp_server.py +75 -0
- rgit/metricdir.py +43 -0
- rgit/metrics.py +28 -0
- rgit/provenance.py +95 -0
- rgit/ranking.py +55 -0
- rgit/recall.py +41 -0
- rgit/runner.py +48 -0
- rgit/segmenter.py +103 -0
- rgit/store/__init__.py +0 -0
- rgit/store/db.py +75 -0
- rgit/store/ids.py +5 -0
- rgit/store/models.py +92 -0
- rgit/store/objects.py +32 -0
- rgit/store/store.py +216 -0
- rgit/tables.py +31 -0
- rgit/toggles.py +84 -0
- rgit/watch.py +51 -0
@@ -0,0 +1,54 @@
---
name: rgit-recall
description: |
  Bring a past idea back onto today's codebase. Use when the user wants to "recall", "resurrect", "bring back", or "re-apply" a feature/idea they captured before (e.g. "bring back the re-ranking retrieval step"). Orchestrates: recall the capsule(s) → compose a regeneration brief against current code → dispatch the capsule-regenerator subagent to re-implement it → human review + `rgit run`. No paid API.
---

# rgit-recall

Drives the **recall → compose → regenerate** half of the research-git loop. The stored capsule is a spec; the regenerator rebuilds it onto *today's* code.

**Prerequisites:** the repo is `rgit init`-ed and the `research-git` MCP server is connected (it exposes `recall` and `compose`).

**Locating the agent definitions.** On Claude Code the plugin runtime resolves agent paths for you. On other CLIs (Codex, Gemini, opencode) this skill is symlinked into `~/.agents/skills/rgit-recall`, so resolve the plugin root once and reference the agent from there:

```bash
SKILL_REAL=$(realpath ~/.agents/skills/rgit-recall 2>/dev/null || readlink -f ~/.agents/skills/rgit-recall)
PLUGIN_ROOT=$(dirname "$(dirname "$SKILL_REAL")")   # the bundled _plugin/ directory
```

The `agents/capsule-regenerator.md` reference below lives at `$PLUGIN_ROOT/agents/capsule-regenerator.md`.

## Process

### 1. Recall the capsule(s)

Take the user's natural-language ask and call the MCP tool **`recall(query)`**. It returns matches, each with its `depends_on` subgraph. Show the user a short list (name + intent) and confirm which feature(s) to bring back. Default to the top match if unambiguous. If nothing matches, tell the user and stop (suggest `list_features` to browse).

### 2. Resolve the full feature set

Include each chosen capsule **plus its `depends_on` dependencies** (a feature often needs its prerequisites). Collect the final list of `feature_id`s.

### 3. Compose the regeneration brief (against current code)

Call the MCP tool **`compose(feature_ids)`**. It returns, per feature: `intent`, `knobs`, `data_assumptions`, `resurrection_guide`, the reference `code_slices`, the **live `current_source`** of each touched symbol, and any `conflicts` (symbols touched by more than one chosen feature).

### 4. Dispatch the capsule-regenerator subagent (on subscription)

Dispatch a subagent using the **`capsule-regenerator`** agent definition (`agents/capsule-regenerator.md`). Pass the full brief verbatim plus `repo_root`. The subagent edits the working tree to re-implement the feature(s) onto today's code, resolves conflicts, sanity-checks syntax, and returns an `applied` report with `provenance` (clean vs adapted) per feature.

### 5. Review + close the loop

Show the user the resulting working-tree diff (`git diff`) and the subagent's provenance/adaptation notes. **Do not commit or freeze for them.** Tell the user to test + freeze it, linking the new run back to the source capsule:

```
rgit run --from <source_capsule_id> -- <their command>
```

That records a new `run` node, freezes a byte-exact artifact, links a `produced` edge from the source capsule, and (on approving the resulting proposal) establishes `variant_of` back to the original. If the subagent returned an `updated_resurrection_guide`, write it to a file and pass `--refresh-guide-file <path>` so the source capsule learns.

## Notes

- **Reproducibility stays intact.** The subagent only *authors*; the human runs `rgit run`, which is the only thing that freezes the reproducible artifact. The agent is never in the replay path.
- **No paid API.** The regenerator is a dispatched subagent on this session's subscription. MCP only served read-only graph snippets (`recall`, `compose`).
- **Sibling flow:** capture/segmentation is `rgit-capture` + `capsule-segmenter`.
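Step 2 of the process above (chosen capsules plus their `depends_on` dependencies) can be sketched as a small graph walk. This is an illustrative sketch only: the function name `resolve_feature_set`, the plain-dict graph shape, and the assumption that dependencies are followed transitively are not taken from the package.

```python
from collections import deque


def resolve_feature_set(chosen, depends_on):
    """Return the chosen ids plus every dependency reachable via depends_on.

    `depends_on` is a hypothetical mapping {feature_id: [dependency ids]}.
    Ids are returned in first-seen (breadth-first) order without duplicates.
    """
    ordered = []
    seen = set()
    queue = deque(chosen)
    while queue:
        fid = queue.popleft()
        if fid in seen:  # already collected; also guards against cycles
            continue
        seen.add(fid)
        ordered.append(fid)
        queue.extend(depends_on.get(fid, []))
    return ordered


# Hypothetical graph: "rerank" needs "retriever", which needs "embedder".
graph = {"rerank": ["retriever"], "retriever": ["embedder"]}
feature_ids = resolve_feature_set(["rerank"], graph)
```

The resulting list is what would be handed to `compose(feature_ids)` in step 3.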
rgit/ablation.py
ADDED
@@ -0,0 +1,70 @@
# src/rgit/ablation.py
from __future__ import annotations
from itertools import chain, combinations

from .metricdir import best_index
from .store.store import Store


def _powerset(items: list[str]):
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def _active_set(store: Store, run_id: str) -> frozenset[str]:
    """A run's active capsules; fall back to produced capsules when none declared."""
    active = store.active_features(run_id)
    if active:
        return frozenset(active)
    produced = [r["src"] for r in store.conn.execute(
        "SELECT src FROM edges WHERE dst=? AND type=?", (run_id, "produced"))]
    return frozenset(produced)


def ablation(store: Store, capsule_ids: list[str], metric: str | None = None) -> dict:
    """Build a base/+A/+A+B grid over the powerset of `capsule_ids`.

    `capsule_ids` accepts capsule ids or names (resolved to ids). Each subset cell
    is the latest run whose active set equals that subset *exactly* — a run that
    also had some capsule active outside the requested sweep is dropped, not
    folded into a smaller cell, so cells never compare confounded measurements.
    Returns {"rows": [{subset(names), run, cells{metric: value}}], "winners": {metric: subset}}.
    """
    capsule_ids = [store.resolve_feature(t) for t in capsule_ids]  # names -> ids
    caps = {c.id: c for c in store.list_features()}
    name = {cid: caps[cid].name for cid in capsule_ids}
    target = frozenset(capsule_ids)

    # All runs, newest first, bucketed by their EXACT active set (subsets of the
    # sweep only; a run with an extra active capsule outside `target` is skipped).
    rows_all = store.conn.execute("SELECT id FROM runs").fetchall()
    runs = sorted((store.get_run(r["id"]) for r in rows_all),
                  key=lambda r: r.created_at, reverse=True)
    latest_for: dict[frozenset, object] = {}
    for run in runs:
        aset = _active_set(store, run.id)
        if not aset <= target:  # confounded by a feature outside the sweep
            continue
        latest_for.setdefault(aset, run)  # newest wins (we iterate desc)

    metric_names: list[str] = []
    rows = []
    for subset in _powerset(capsule_ids):
        run = latest_for.get(frozenset(subset))
        cells = dict(run.metrics) if (run and run.metrics) else {}
        for k in cells:
            if k not in metric_names:
                metric_names.append(k)
        rows.append({"subset": tuple(sorted(name[c] for c in subset)),
                     "run": run.id if run else None, "cells": cells})

    cols = [metric] if metric else metric_names
    for row in rows:  # ensure every row has every column
        for m in cols:
            row["cells"].setdefault(m, None)

    winners: dict[str, tuple] = {}
    for m in cols:
        idx = best_index(store, m, [row["cells"].get(m) for row in rows])
        if idx is not None:
            winners[m] = rows[idx]["subset"]
    return {"rows": rows, "winners": winners}
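The exact-set bucketing in `ablation` can be tried without a store. The sketch below is a standalone toy, assuming runs are plain `(run_id, created_at, active_capsules)` tuples; `latest_exact_runs` is a hypothetical helper, not part of the package.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of `items` as tuples, smallest first."""
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def latest_exact_runs(runs, sweep):
    """Map each subset of `sweep` to the newest run whose active set equals it.

    Runs with any active capsule outside `sweep` are skipped rather than
    folded into a smaller cell, mirroring the confound rule described above.
    """
    target = frozenset(sweep)
    latest = {}
    for run_id, _created_at, active in sorted(runs, key=lambda r: r[1], reverse=True):
        aset = frozenset(active)
        if not aset <= target:  # has a capsule outside the sweep
            continue
        latest.setdefault(aset, run_id)  # first seen is newest
    return {subset: latest.get(frozenset(subset)) for subset in powerset(sweep)}


runs = [
    ("r1", 1, []),
    ("r2", 2, ["A"]),
    ("r3", 3, ["A"]),
    ("r4", 4, ["A", "C"]),  # "C" is outside the sweep, so r4 is dropped
]
grid = latest_exact_runs(runs, ["A", "B"])
```

Here the `("A",)` cell resolves to `r3` (the newer of the two exact matches), and cells with no exact match stay `None`.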
rgit/agent_guidance.py
ADDED
@@ -0,0 +1,169 @@
"""Managed global guidance block for agent clients."""
from __future__ import annotations

import os
import re
from pathlib import Path


START = "<!-- research-git:start -->"
END = "<!-- research-git:end -->"

# Modes a user may pin in the managed block; carried across reinstalls so an
# upgrade does not silently reset a deliberate choice back to `default`.
KNOWN_MODES = ("default", "manual-only", "custom")
_MODE_RE = re.compile(r"^Current mode:[ \t]*(.+)$", re.MULTILINE)


def render_global_block(mode: str = "default") -> str:
    if mode not in KNOWN_MODES:
        mode = "default"
    return (
        f"{START}\n"
        "## research-git\n"
        "\n"
        "research-git is installed as a default agent capability.\n"
        "\n"
        f"Current mode: {mode}\n"
        "\n"
        "Mode options:\n"
        "- `default`: After code changes, consider capture. Capture meaningful "
        "research/code ideas; skip mechanical changes.\n"
        "- `manual-only`: Use research-git only when the user explicitly asks to "
        "capture, save, recall, resurrect, or bring back an idea.\n"
        "- `custom`: Inherit `default`, then apply repo-specific rules from this "
        "repo.\n"
        "\n"
        "Priority:\n"
        "- Session/user instruction overrides repo and global guidance.\n"
        "- Repo-level research-git preferences override global guidance.\n"
        "\n"
        "Repo preference recording:\n"
        "- If the user clearly asks to remember a stable repo-level research-git "
        "preference, update this repo's guidance file.\n"
        "- For unclear preferences, ask first.\n"
        "- Do not write repo preferences for one-off session instructions.\n"
        "\n"
        "Use:\n"
        "- After meaningful code/research changes, consider "
        "`rgit capture --trigger manual` and the `rgit-capture` skill.\n"
        "- For recall/resurrection requests, use the `rgit-recall` skill.\n"
        "- If `.rgit/` is missing in a git repo: when operating autonomously "
        "(no human to ask), bootstrap the store with `rgit capture --init "
        "--trigger manual` (store only — never install hooks unless asked); in "
        "an interactive session, tell the user to run `rgit init` rather than "
        "initializing silently.\n"
        "- In final feedback, mention any capsules created, approved, applied, "
        "or skipped, plus important graph relations.\n"
        f"{END}\n"
    )


def manual_status(mode: str = "default") -> dict:
    return {
        "action": "manual",
        "block": render_global_block(mode),
        "instructions": "Add this managed block to your agent's global guidance file.",
    }


def manual_uninstall_status() -> dict:
    return {
        "action": "manual",
        "instructions": (
            "remove the research-git managed block from your agent's global "
            "guidance file if you added it manually."
        ),
    }


def upsert_managed_block(path: Path, *, mode: str | None = None,
                         dry_run: bool = False) -> dict:
    # mode=None means "no explicit choice this run": render the default block but
    # preserve any mode the user previously pinned. An explicit mode overrides.
    explicit = mode is not None
    block = render_global_block(mode or "default")
    exists = path.exists()
    text = path.read_text(encoding="utf-8") if exists else ""
    new_text, action = _upsert_text(text, block, exists, carry=not explicit)
    if dry_run:
        return {"action": _dry_action(action), "path": str(path), "block": block}
    if new_text != text:
        path.parent.mkdir(parents=True, exist_ok=True)
        _atomic_write(path, new_text)
    return {"action": action if new_text != text else "unchanged",
            "path": str(path)}


def remove_managed_block(path: Path, *, dry_run: bool = False) -> dict:
    if not path.exists():
        return {"action": "absent", "path": str(path)}
    text = path.read_text(encoding="utf-8")
    span = _managed_span(text)
    if span is None:
        return {"action": "absent", "path": str(path)}
    start, end = span
    new_text = text[:start] + text[end:]
    if dry_run:
        return {"action": "would_remove", "path": str(path)}
    _atomic_write(path, new_text)
    return {"action": "removed", "path": str(path)}


def _upsert_text(text: str, block: str, exists: bool,
                 carry: bool = True) -> tuple[str, str]:
    span = _managed_span(text)
    if span is not None:
        start, end = span
        if carry:
            block = _carry_mode(text[start:end], block)
        new_text = text[:start] + block + text[end:]
        return new_text, "updated"
    if not exists or not text:
        return block, "created"
    sep = "" if text.endswith("\n") else "\n"
    return text + sep + "\n" + block, "appended"


def _carry_mode(old_block: str, new_block: str) -> str:
    """Preserve a user-pinned `Current mode:` from the existing block.

    Re-rendering always emits `Current mode: default`. If the user had pinned a
    different recognized mode, keep it so an upgrade does not reset their choice.
    """
    m = _MODE_RE.search(old_block)
    if not m:
        return new_block
    mode = m.group(1).strip()
    if mode == "default" or mode not in KNOWN_MODES:
        return new_block
    return _MODE_RE.sub(f"Current mode: {mode}", new_block, count=1)


def _managed_span(text: str) -> tuple[int, int] | None:
    start = text.find(START)
    if start < 0:
        return None
    end = text.find(END, start)
    if end < 0:
        return None
    after_end = end + len(END)
    if text[after_end:after_end + 1] == "\n":
        after_end += 1
    return start, after_end


def _dry_action(action: str) -> str:
    return {
        "created": "would_create",
        "appended": "would_append",
        "updated": "would_update",
    }[action]


def _atomic_write(path: Path, text: str) -> None:
    tmp = path.with_name(f".{path.name}.research-git.tmp")
    tmp.write_text(text, encoding="utf-8")
    if path.exists():
        os.chmod(tmp, path.stat().st_mode)
    tmp.replace(path)
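The marker-delimited upsert used by this module is idempotent: applying it twice leaves the file text unchanged. The sketch below shows that property on in-memory strings. It is a simplified illustration, assuming demo markers and a hypothetical `upsert_block` helper; it omits the mode-carrying and atomic-write steps of the real code.

```python
START = "<!-- demo:start -->"  # hypothetical markers for this sketch
END = "<!-- demo:end -->"


def upsert_block(text, body):
    """Insert or replace the single marker-delimited block in `text`."""
    block = f"{START}\n{body}\n{END}\n"
    start = text.find(START)
    end = text.find(END, start) if start >= 0 else -1
    if start >= 0 and end >= 0:
        after = end + len(END)
        if text[after:after + 1] == "\n":  # swallow the block's trailing newline
            after += 1
        return text[:start] + block + text[after:]
    if not text:
        return block
    sep = "" if text.endswith("\n") else "\n"
    return text + sep + "\n" + block  # blank line between user text and block


once = upsert_block("# My notes\n", "mode: default")
twice = upsert_block(once, "mode: default")
changed = upsert_block(once, "mode: custom")
```

Because the span swallows the newline after the end marker, re-rendering does not accumulate blank lines, and text outside the markers is never touched.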
rgit/agent_platforms.py
ADDED
@@ -0,0 +1,49 @@
"""Platform facts for installing research-git agent guidance."""
from __future__ import annotations

import os
from pathlib import Path


def home_dir() -> Path:
    return Path.home()


def _env_dir(var: str) -> Path | None:
    """Return $var as a Path if it is set and non-empty, else None."""
    val = os.environ.get(var)
    return Path(val) if val else None


def _config_home() -> Path:
    """XDG base dir for per-user config, honoring $XDG_CONFIG_HOME."""
    return _env_dir("XDG_CONFIG_HOME") or home_dir() / ".config"


def guidance_target(platform: str) -> dict | None:
    home = home_dir()
    if platform == "codex":
        root = _env_dir("CODEX_HOME") or home / ".codex"
        return {
            "path": root / "AGENTS.md",
            "reload": "Start a new Codex session after install.",
        }
    if platform == "claude-code":
        root = _env_dir("CLAUDE_CONFIG_DIR") or home / ".claude"
        return {
            "path": root / "CLAUDE.md",
            "reload": "Restart Claude Code or run /reload-plugins after install.",
        }
    if platform == "gemini":
        return {
            "path": home / ".gemini" / "GEMINI.md",
            "reload": "Start a new Gemini CLI session after install.",
        }
    if platform == "opencode":
        root = _config_home() / "opencode"
        if root.exists():
            return {
                "path": root / "AGENTS.md",
                "reload": "Start a new opencode session after install.",
            }
    return None
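The environment-override pattern in `guidance_target` (use the variable if set and non-empty, otherwise a dotted directory under home) can be exercised in isolation. This is a minimal sketch, assuming a hypothetical `resolve_root` helper that takes the environment as a plain mapping so it can be tested without touching `os.environ`.

```python
from pathlib import Path


def resolve_root(environ, var, default):
    """Return Path(environ[var]) when set and non-empty, else `default`."""
    val = environ.get(var)
    return Path(val) if val else default


home = Path("/home/demo")  # hypothetical home directory for the example
overridden = resolve_root({"CODEX_HOME": "/opt/codex"}, "CODEX_HOME", home / ".codex")
fallback = resolve_root({"CODEX_HOME": ""}, "CODEX_HOME", home / ".codex")
guidance_file = overridden / "AGENTS.md"
```

An empty value is treated the same as an unset one, which avoids resolving guidance paths relative to the current directory.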
rgit/astmap.py
ADDED
@@ -0,0 +1,106 @@
from __future__ import annotations
import re
from pathlib import Path
from typing import Optional

import libcst as cst
from libcst.metadata import MetadataWrapper, PositionProvider

_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@", re.M)
_FILE = re.compile(r"^\+\+\+ b/(.+)$", re.M)


def _read_python_source(path: Path) -> str:
    """Read a .py file for parsing. ``utf-8-sig`` strips a UTF-8 BOM (common on
    Windows-authored files) that would otherwise make libcst miss the first
    symbol — and it also reads plain UTF-8 unchanged."""
    return path.read_text(encoding="utf-8-sig")


def _changed_line_ranges(diff: str) -> dict[str, list[tuple[int, int]]]:
    """file -> list of (start, end) line ranges touched on the new side."""
    result: dict[str, list[tuple[int, int]]] = {}
    current: Optional[str] = None
    for line in diff.splitlines():
        m = _FILE.match(line)
        if m:
            current = m.group(1)
            result.setdefault(current, [])
            continue
        h = _HUNK.match(line)
        if h and current:
            start = int(h.group(1))
            length = int(h.group(2) or "1")
            result[current].append((start, start + max(length, 1) - 1))
    return result


class _SymbolFinder(cst.CSTVisitor):
    METADATA_DEPENDENCIES = (PositionProvider,)

    def __init__(self, ranges: list[tuple[int, int]]):
        self.ranges = ranges
        self.found: set[str] = set()

    def _overlaps(self, node) -> bool:
        pos = self.get_metadata(PositionProvider, node)
        for s, e in self.ranges:
            if pos.start.line <= e and pos.end.line >= s:
                return True
        return False

    def visit_FunctionDef(self, node: cst.FunctionDef) -> None:
        if self._overlaps(node):
            self.found.add(node.name.value)

    def visit_ClassDef(self, node: cst.ClassDef) -> None:
        if self._overlaps(node):
            self.found.add(node.name.value)


def changed_symbols(diff: str, repo: Path) -> list[dict]:
    """[{file, symbol}] for each top-level def/class overlapping a diff hunk."""
    out: list[dict] = []
    for file, ranges in _changed_line_ranges(diff).items():
        path = repo / file
        if not path.suffix == ".py" or not path.exists() or not ranges:
            continue
        try:
            wrapper = MetadataWrapper(cst.parse_module(_read_python_source(path)))
        except cst.ParserSyntaxError:
            continue
        finder = _SymbolFinder(ranges)
        wrapper.visit(finder)
        for sym in sorted(finder.found):
            out.append({"file": file, "symbol": sym})
    return out


def read_symbol_source(repo: Path, file: str, symbol: str) -> Optional[str]:
    """Current source text of a top-level def/class, or None if absent."""
    path = repo / file
    if not path.exists():
        return None
    try:
        module = cst.parse_module(_read_python_source(path))
    except cst.ParserSyntaxError:
        return None
    for stmt in module.body:
        if isinstance(stmt, (cst.FunctionDef, cst.ClassDef)) and stmt.name.value == symbol:
            return module.code_for_node(stmt)
    return None


def symbol_at_line(repo: Path, file: str, line: int) -> Optional[str]:
    """Name of the top-level def/class enclosing `line` (1-based), or None."""
    path = repo / file
    if path.suffix != ".py" or not path.exists():
        return None
    try:
        wrapper = MetadataWrapper(cst.parse_module(_read_python_source(path)))
    except cst.ParserSyntaxError:
        return None
    finder = _SymbolFinder([(line, line)])
    wrapper.visit(finder)
    found = sorted(finder.found)
    return found[0] if found else None
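To see what the hunk-header parsing yields, the range extraction can be run on a small unified diff with only the standard library. The sketch below restates the same two regular expressions in a standalone function; the name `changed_line_ranges` and the sample diff are illustrative assumptions, not package API.

```python
import re

HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")
FILE = re.compile(r"^\+\+\+ b/(.+)$")


def changed_line_ranges(diff):
    """Map each new-side file to the (start, end) line ranges its hunks cover."""
    result = {}
    current = None
    for line in diff.splitlines():
        m = FILE.match(line)
        if m:
            current = m.group(1)
            result.setdefault(current, [])
            continue
        h = HUNK.match(line)
        if h and current:
            start = int(h.group(1))
            length = int(h.group(2) or "1")  # an omitted length means one line
            result[current].append((start, start + max(length, 1) - 1))
    return result


# Hypothetical two-hunk diff for a single file.
sample = """\
--- a/pkg/mod.py
+++ b/pkg/mod.py
@@ -10,2 +10,4 @@ def f():
+    x = 1
@@ -30 +32 @@
-old
+new
"""
ranges = changed_line_ranges(sample)
```

The first hunk covers new-side lines 10 through 13 and the second covers line 32 only; these are the ranges that get intersected with symbol positions.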