whycode-cli 0.2.5.tar.gz → 0.3.0.tar.gz

Files changed (33)

{whycode_cli-0.2.5/src/whycode_cli.egg-info → whycode_cli-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: whycode-cli
-Version: 0.2.5
+Version: 0.3.0
 Summary: Tells you what to be afraid of before you touch a file.
 Author: Kevin
 License-Expression: MIT
@@ -19,6 +19,8 @@ Requires-Dist: typer>=0.12
 Requires-Dist: rich>=13.7
 Provides-Extra: mcp
 Requires-Dist: mcp>=1.0; extra == "mcp"
+Provides-Extra: llm
+Requires-Dist: anthropic>=0.40; extra == "llm"
 Provides-Extra: dev
 Requires-Dist: pytest>=8; extra == "dev"
 Requires-Dist: pytest-cov>=5; extra == "dev"
@@ -87,8 +89,9 @@ Requires Python 3.11+.
 ```bash
 cd /path/to/your/repo
+whycode tour                        # the one command to run first
 whycode init                        # one-command setup: CI workflow + pre-commit gate
-whycode highlights                  # first-run treasure map: top decisions + incidents
+whycode highlights                  # repo-wide treasure map: top decisions + incidents
 whycode why src/some/file.py        # the Risk Card for one file
 whycode why src/some/file.py -b     # one-line summary (for triage / scripts)
 whycode why src/some/file.py --at <sha>     # risk as of a past commit
@@ -196,11 +199,32 @@ Tune the thresholds inside those two files for your repo. Re-run with
 | ----- | ------------------------------------------------------------------------ | -------- | -------- |
 | 1     | Deterministic git facts (log, diffstat, revert pairs, author activity)   | no       | no       |
 | 2     | Heuristic signals (reverts, incidents, silence, ghost keeper, coupling, invariants, churn, newborn) | no | no |
-| 3     | LLM polish (optional, opt-in, never on by default)                       | yes      | yes      |
+| 3     | LLM-extracted structured decisions (optional, opt-in, never on by default) | yes      | yes      |
-**Layer 1 + Layer 2 produce the Risk Card you saw above. No model calls, no
-data leaving your machine.** Layer 3 is reserved for natural-language
-summarisation of decisions and is strictly opt-in.
+**Layer 1 + Layer 2 produce the Risk Card by default. No model calls, no
+data leaving your machine.** Layer 3 lifts the keyword fragments L1 + L2
+extract ("do not switch to async") into structured decisions with the
+*why* drawn from the surrounding commit body — but only when you ask for
+it with `--llm`.
+### Optional L3 — LLM-enriched decisions
+Install the optional extras and configure the env vars:
+```bash
+pip install 'whycode-cli[llm]'
+export WHYCODE_LLM_API_KEY="…"
+export WHYCODE_LLM_MODEL="<your-provider's-model-identifier>"
+whycode why src/some/file.py --llm        # full card + structured decisions
+whycode why src/some/file.py --llm-dry-run  # see exactly what would be sent
+```
+Privacy contract: configuration is entirely environment-driven (no
+hardcoded provider in the source tree); the SDK is lazy-imported (no
+import cost unless you opt in); only L2-filtered high-signal commits
+are sent (capped at 10 per call); a malformed model response degrades
+to "no decisions" rather than crashing.
 ## What this is NOT

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/README.md RENAMED

@@ -59,8 +59,9 @@ Requires Python 3.11+.
 ```bash
 cd /path/to/your/repo
+whycode tour                        # the one command to run first
 whycode init                        # one-command setup: CI workflow + pre-commit gate
-whycode highlights                  # first-run treasure map: top decisions + incidents
+whycode highlights                  # repo-wide treasure map: top decisions + incidents
 whycode why src/some/file.py        # the Risk Card for one file
 whycode why src/some/file.py -b     # one-line summary (for triage / scripts)
 whycode why src/some/file.py --at <sha>     # risk as of a past commit
@@ -168,11 +169,32 @@ Tune the thresholds inside those two files for your repo. Re-run with
 | ----- | ------------------------------------------------------------------------ | -------- | -------- |
 | 1     | Deterministic git facts (log, diffstat, revert pairs, author activity)   | no       | no       |
 | 2     | Heuristic signals (reverts, incidents, silence, ghost keeper, coupling, invariants, churn, newborn) | no | no |
-| 3     | LLM polish (optional, opt-in, never on by default)                       | yes      | yes      |
+| 3     | LLM-extracted structured decisions (optional, opt-in, never on by default) | yes      | yes      |
-**Layer 1 + Layer 2 produce the Risk Card you saw above. No model calls, no
-data leaving your machine.** Layer 3 is reserved for natural-language
-summarisation of decisions and is strictly opt-in.
+**Layer 1 + Layer 2 produce the Risk Card by default. No model calls, no
+data leaving your machine.** Layer 3 lifts the keyword fragments L1 + L2
+extract ("do not switch to async") into structured decisions with the
+*why* drawn from the surrounding commit body — but only when you ask for
+it with `--llm`.
+### Optional L3 — LLM-enriched decisions
+Install the optional extras and configure the env vars:
+```bash
+pip install 'whycode-cli[llm]'
+export WHYCODE_LLM_API_KEY="…"
+export WHYCODE_LLM_MODEL="<your-provider's-model-identifier>"
+whycode why src/some/file.py --llm        # full card + structured decisions
+whycode why src/some/file.py --llm-dry-run  # see exactly what would be sent
+```
+Privacy contract: configuration is entirely environment-driven (no
+hardcoded provider in the source tree); the SDK is lazy-imported (no
+import cost unless you opt in); only L2-filtered high-signal commits
+are sent (capped at 10 per call); a malformed model response degrades
+to "no decisions" rather than crashing.
 ## What this is NOT

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "whycode-cli"
-version = "0.2.5"
+version = "0.3.0"
 description = "Tells you what to be afraid of before you touch a file."
 readme = "README.md"
 license = "MIT"
@@ -26,6 +26,7 @@ dependencies = [
 [project.optional-dependencies]
 mcp = ["mcp>=1.0"]
+llm = ["anthropic>=0.40"]
 dev = [
     "pytest>=8",
     "pytest-cov>=5",

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/src/whycode/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """WhyCode — tells you what to be afraid of before touching a file."""
-__version__ = "0.2.5"
+__version__ = "0.3.0"

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/src/whycode/cli.py RENAMED

@@ -2,9 +2,11 @@
 Commands
 --------
+- ``whycode tour``              — first-run walkthrough: highlights + top risk + MCP setup.
 - ``whycode why <path>``        — print the Risk Card for a single file.
 - ``whycode why <path> --at SHA`` — risk card as of a past commit.
 - ``whycode why <path> --mute KIND`` — locally suppress a noisy signal kind.
+- ``whycode why <path> --llm`` — opt-in L3: LLM-extracted structured decisions.
 - ``whycode highlights``        — repo-wide treasure map of decisions and incidents.
 - ``whycode diff [--base REF]`` — risk-rank files changed against a base ref.
 - ``whycode show <sha>``        — risk-flavored summary for one commit.
@@ -154,6 +156,20 @@ def why(
         "--no-mutes",
         help="Bypass the local suppression list — show all signals.",
     ),
+    llm: bool = typer.Option(
+        False,
+        "--llm",
+        help=(
+            "Enrich the card with LLM-extracted structured decisions "
+            "(L3, opt-in, requires WHYCODE_LLM_API_KEY + WHYCODE_LLM_MODEL). "
+            "Sends only commits already filtered by L2 — see --llm-dry-run."
+        ),
+    ),
+    llm_dry_run: bool = typer.Option(
+        False,
+        "--llm-dry-run",
+        help="Show exactly what would be sent to the LLM without making the call.",
+    ),
     max_commits: int | None = typer.Option(
         None, "--max-commits", help="Cap the number of commits scanned (debug)."
     ),
@@ -194,6 +210,51 @@ def why(
         ref=resolved_ref,
         apply_suppressions=not no_mutes,
     )
+    if llm or llm_dry_run:
+        from whycode import decisions as dec
+        # Pick high-signal commits for L3: incidents take priority, plus
+        # any commit with a substantial body. Cap to keep the prompt small.
+        facts = gf.gather(repo_root, rel, max_commits=max_commits, ref=resolved_ref)
+        candidates = list(facts.incident_commits)
+        for c in facts.commits:
+            if c not in candidates and len(c.body) >= 100:
+                candidates.append(c)
+            if len(candidates) >= dec.DEFAULT_MAX_COMMITS:
+                break
+        candidates = candidates[: dec.DEFAULT_MAX_COMMITS]
+        n_commits, prompt_chars = dec.estimate_payload(candidates)
+        if llm_dry_run:
+            err.print(
+                f"[bold]LLM dry-run:[/bold] would send "
+                f"[bold]{n_commits}[/bold] commit(s), "
+                f"[bold]~{prompt_chars}[/bold] chars to the configured LLM provider.\n"
+                f"  [dim]Provider, model, and key all read from "
+                f"WHYCODE_LLM_* environment variables.[/dim]"
+            )
+            if not json_out:
+                console.print(rc.render_text(card))
+            else:
+                console.print_json(json.dumps(card.to_dict()))
+            return
+        if n_commits == 0:
+            err.print(
+                "[yellow]--llm:[/yellow] no high-signal commits to enrich on this file."
+            )
+        else:
+            try:
+                decisions = dec.extract_decisions(candidates)
+            except dec.LLMConfigError as exc:
+                err.print(f"[red]--llm config error:[/red] {exc}")
+                raise typer.Exit(2) from exc
+            except dec.LLMCallError as exc:
+                err.print(f"[red]--llm call failed:[/red] {exc}")
+                raise typer.Exit(2) from exc
+            card = card.with_decisions(tuple(decisions))
     if json_out:
         console.print_json(json.dumps(card.to_dict()))
         return
@@ -845,6 +906,131 @@ def _install_template(
     return f"[green]wrote:[/green]   {rel_label}"
+_MCP_SNIPPET = '''    {
+      "mcpServers": {
+        "whycode": {"command": "whycode", "args": ["mcp"]}
+      }
+    }'''
+@app.command()
+def tour(
+    repo: Path = typer.Option(Path("."), "--repo", help="Path inside the repo."),
+) -> None:
+    """First-run walkthrough: highlights + top risky files + MCP setup snippet.
+    The single command to run after installing WhyCode. Skips straight to
+    the most concrete things in the repo (verbatim invariants and
+    incident-flagged commits) and ends with the one snippet you'll need to
+    wire WhyCode into an MCP-aware editor.
+    """
+    try:
+        repo_root = gf.discover_repo_root(repo.resolve())
+    except gf.GitError as exc:
+        err.print(f"[red]error:[/red] {exc}")
+        raise typer.Exit(2) from exc
+    console.print("[bold]Welcome to WhyCode.[/bold]")
+    console.print(f"[dim]Reading the history of {repo_root.name}…[/dim]\n")
+    # Section 1 — invariants and incidents (cheap; one git log call).
+    with console.status("Looking for stated decisions…", spinner="dots"):
+        commits = gf.all_commits(repo_root, max_count=2000)
+    if not commits:
+        console.print("[yellow]This repo has no commits yet — nothing to learn from.[/yellow]")
+        return
+    inv_pairs = gf.extract_invariant_quotes(commits)
+    sha_to_commit = {c.sha: c for c in commits}
+    seen_lines: dict[str, str] = {}
+    for sha, line in inv_pairs:
+        seen_lines.setdefault(line, sha)
+    invariants_top = [
+        (line, sha_to_commit[sha])
+        for line, sha in seen_lines.items()
+        if sha in sha_to_commit
+    ][:3]
+    incidents_top = gf.find_incidents(commits)[:3]
+    if invariants_top or incidents_top:
+        console.print("[bold yellow]Decisions and incidents[/bold yellow]")
+        for line, c in invariants_top:
+            console.print(f"  [italic]{line}[/italic]")
+            console.print(
+                f"  [dim]{c.sha[:7]}  {c.authored_at.date()}  {c.author_name}[/dim]\n"
+            )
+        for c in incidents_top:
+            subj = c.subject if len(c.subject) <= 70 else c.subject[:69] + "…"
+            console.print(f"  [red]{subj}[/red]")
+            console.print(
+                f"  [dim]{c.sha[:7]}  {c.authored_at.date()}  {c.author_name}[/dim]\n"
+            )
+    else:
+        console.print(
+            "[dim]No headline decisions or incidents in recent history.[/dim]"
+        )
+        console.print(
+            "[dim]Commit messages may be too terse — describing 'why' in commit "
+            "bodies (or using `hotfix:` / `BREAKING CHANGE:` prefixes) makes WhyCode "
+            "much more useful.[/dim]\n"
+        )
+    # Section 2 — top risky files. Slimmer scan: 100 files, depth 50 commits.
+    raw = gf.run_git(repo_root, "ls-files")
+    patterns = ign.effective_patterns(repo_root)
+    paths = [p for p in raw.splitlines() if p.strip() and not ign.is_ignored(p, patterns)][
+        :100
+    ]
+    cards: list[rc.RiskCard] = []
+    if paths:
+        with console.status(
+            f"Risk-ranking {len(paths)} files (slim scan)…", spinner="dots"
+        ):
+            for p in paths:
+                try:
+                    card = rc.build(repo_root, p, max_commits=50)
+                except gf.GitError:
+                    continue
+                useful = [s for s in card.signals if s.kind is not sig.SignalKind.NEWBORN]
+                if useful:
+                    cards.append(card)
+        cards.sort(key=lambda c: -c.score.value)
+    if cards:
+        console.print("[bold red]Top 3 risky files[/bold red]")
+        for top in cards[:3]:
+            console.print(
+                f"  [bold]{top.score.value:>3}[/bold]  "
+                f"{top.score.band.value:<20}  [cyan]{top.path}[/cyan]"
+            )
+            console.print(f"       [dim]{top.signals[0].headline}[/dim]")
+        console.print()
+    # Section 3 — MCP setup snippet (vendor-neutral phrasing).
+    console.print("[bold magenta]Wire WhyCode into your AI editor[/bold magenta]")
+    console.print(
+        "  WhyCode ships an MCP server. Any MCP-aware editor or assistant\n"
+        "  can call it — just add this snippet to your editor's MCP config:\n"
+    )
+    console.print(_MCP_SNIPPET)
+    console.print(
+        "\n  [dim](See your editor's docs for the exact config-file location.)[/dim]\n"
+    )
+    # Section 4 — what to do next.
+    console.print("[bold]Next:[/bold]")
+    if cards:
+        console.print(
+            f"  [dim]·[/dim] [bold]whycode why {cards[0].path}[/bold]   the full Risk Card"
+        )
+    console.print(
+        "  [dim]·[/dim] [bold]whycode init[/bold]                     install CI + pre-commit"
+    )
+    console.print(
+        "  [dim]·[/dim] [bold]whycode highlights[/bold]                more invariants and incidents"
+    )
 @app.command()
 def init(
     force: bool = typer.Option(
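
The `--llm` branch added to `why` above selects its payload in two passes: incident commits first, then any commit whose body is at least 100 characters, stopping at the cap. The following is a minimal standalone sketch of that selection, not the package's code. The `Commit` class here is a simplified stand-in for `whycode.git_facts.Commit`, and the helper name `pick_candidates` is hypothetical.

```python
from dataclasses import dataclass

MAX_COMMITS = 10       # same value as dec.DEFAULT_MAX_COMMITS in the diff
MIN_BODY_CHARS = 100   # same threshold as the len(c.body) >= 100 check


@dataclass(frozen=True)
class Commit:
    """Simplified stand-in for whycode.git_facts.Commit (assumption)."""
    sha: str
    body: str = ""


def pick_candidates(incidents, commits, cap=MAX_COMMITS, min_body=MIN_BODY_CHARS):
    """Incidents first, then long-bodied commits, never more than `cap`."""
    candidates = list(incidents)
    for c in commits:
        if len(candidates) >= cap:
            break
        # Skip commits already present as incidents, and terse commits.
        if c not in candidates and len(c.body) >= min_body:
            candidates.append(c)
    # The final slice also caps the case where incidents alone exceed `cap`.
    return candidates[:cap]
```

Because incidents are placed first and the list is sliced at the end, a file with more than ten incident commits sends only incidents, and long-bodied commits fill the remaining slots otherwise.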

whycode_cli-0.3.0/src/whycode/decisions.py ADDED

@@ -0,0 +1,219 @@
+"""L3 — LLM-enriched decision extraction.
+What L1+L2 give: a regex-level harvest of single lines like
+``"Do not switch to async"``. What L3 adds: structured decisions with
+the full *why* drawn from the surrounding commit body.
+Structured decision schema (one ``Decision`` per finding):
+    {
+      "decision_type": "incident_fix" | "compat_workaround" | "perf_rewrite"
+                       | "rollback" | "constraint" | "other",
+      "what_changed":  "one sentence summary",
+      "why":           "one paragraph; quotes from the body where possible",
+      "do_not":        "actionable constraint, or null",
+      "evidence":      ["<sha1>", "<sha2>", …],
+      "confidence":    0.0 - 1.0
+    }
+Confidence < ``min_confidence`` is filtered out before return — better to
+emit nothing than emit a dressed-up guess. Privacy: this module makes a
+network call only if ``call_llm`` is invoked, which only happens when the
+caller passed commits in. Layer 1 and Layer 2 never reach this module.
+"""
+from __future__ import annotations
+import json
+import re
+from collections.abc import Sequence
+from dataclasses import dataclass
+from whycode.git_facts import Commit
+from whycode.llm import LLMCallError, LLMConfigError, call_llm
+DEFAULT_MIN_CONFIDENCE = 0.5
+DEFAULT_MAX_COMMITS = 10
+_SYSTEM = (
+    "You are a careful code-history archaeologist. You read commit messages "
+    "and surface the engineering decisions that future readers will need to "
+    "respect. You never invent facts; if a commit body does not state a "
+    "decision worth carrying forward, you emit nothing for that commit. "
+    "All quotes you produce must be drawn from the commit body itself; "
+    "summarise rather than paraphrase when you cannot quote."
+)
+_PROMPT_TEMPLATE = """Below are commits from a Git repository. For each commit, extract a structured Decision **only when the commit body genuinely states one**. Otherwise emit nothing for that commit.
+A Decision has this shape:
+  {{
+    "decision_type": one of
+        "incident_fix" | "compat_workaround" | "perf_rewrite" |
+        "rollback" | "constraint" | "other",
+    "what_changed":  one-sentence summary of the change itself,
+    "why":           one paragraph drawn from the body (quote where possible),
+    "do_not":        the actionable constraint a future editor must respect,
+                     or null if none stated,
+    "evidence":      array of commit SHAs supporting this decision,
+    "confidence":    a float in [0, 1] reflecting how clearly the body
+                     states this decision (use < 0.5 if you are unsure)
+  }}
+Rules:
+  - Reply with a JSON array of Decision objects, no prose, no code fences.
+  - Empty array if nothing qualifies.
+  - Quote rather than rephrase when stating "why".
+  - Do not infer constraints that are not in the body.
+  - Skip commits whose body is just a release note, dependency bump, or
+    one-line fix without explanation.
+COMMITS:
+{commits}
+"""
+@dataclass(frozen=True)
+class Decision:
+    decision_type: str
+    what_changed: str
+    why: str
+    do_not: str | None
+    evidence: tuple[str, ...]
+    confidence: float
+    def to_dict(self) -> dict[str, object]:
+        return {
+            "decision_type": self.decision_type,
+            "what_changed": self.what_changed,
+            "why": self.why,
+            "do_not": self.do_not,
+            "evidence": list(self.evidence),
+            "confidence": round(self.confidence, 2),
+        }
+def _format_commits_for_prompt(commits: Sequence[Commit]) -> str:
+    parts: list[str] = []
+    for c in commits:
+        parts.append(f"COMMIT {c.sha[:12]}  ({c.author_name}, {c.authored_at.date()})")
+        parts.append(f"Subject: {c.subject}")
+        if c.body:
+            parts.append(f"Body:\n{c.body}")
+        parts.append("---")
+    return "\n".join(parts)
+_VALID_TYPES = frozenset(
+    {
+        "incident_fix",
+        "compat_workaround",
+        "perf_rewrite",
+        "rollback",
+        "constraint",
+        "other",
+    }
+)
+def _strip_code_fence(raw: str) -> str:
+    raw = raw.strip()
+    raw = re.sub(r"^```(?:json)?\s*", "", raw)
+    raw = re.sub(r"\s*```\s*$", "", raw)
+    return raw.strip()
+def _parse_decisions(raw: str, valid_shas: Sequence[str]) -> list[Decision]:
+    """Lenient parser. Bad JSON → empty list (we do not crash on a bad model
+    response). Missing fields default to empty/zero. Invalid evidence SHAs
+    are dropped silently."""
+    text = _strip_code_fence(raw)
+    try:
+        data = json.loads(text)
+    except json.JSONDecodeError:
+        return []
+    if not isinstance(data, list):
+        return []
+    short_lookup = {s[:12]: s for s in valid_shas}
+    out: list[Decision] = []
+    for item in data:
+        if not isinstance(item, dict):
+            continue
+        try:
+            decision_type = str(item.get("decision_type", "other"))
+            if decision_type not in _VALID_TYPES:
+                decision_type = "other"
+            what_changed = str(item.get("what_changed", "")).strip()
+            why = str(item.get("why", "")).strip()
+            do_not_raw = item.get("do_not")
+            do_not = str(do_not_raw).strip() if do_not_raw else None
+            raw_evidence = item.get("evidence", []) or []
+            evidence: list[str] = []
+            for token in raw_evidence:
+                t = str(token).strip()
+                # Accept full or 12-char prefix SHAs that match what we sent.
+                if t in short_lookup:
+                    evidence.append(short_lookup[t])
+                elif len(t) >= 12 and t[:12] in short_lookup:
+                    evidence.append(short_lookup[t[:12]])
+            if not evidence and valid_shas:
+                evidence = [valid_shas[0]]
+            confidence = float(item.get("confidence", 0.0))
+            confidence = max(0.0, min(1.0, confidence))
+        except (TypeError, ValueError):
+            continue
+        if not what_changed or not why:
+            continue
+        out.append(
+            Decision(
+                decision_type=decision_type,
+                what_changed=what_changed,
+                why=why,
+                do_not=do_not,
+                evidence=tuple(evidence),
+                confidence=confidence,
+            )
+        )
+    return out
+def estimate_payload(commits: Sequence[Commit]) -> tuple[int, int]:
+    """Return ``(commit_count, prompt_char_count)`` so callers can show the
+    user the exact size of what would be sent before invoking the network.
+    """
+    if not commits:
+        return 0, 0
+    prompt = _PROMPT_TEMPLATE.format(commits=_format_commits_for_prompt(commits))
+    return len(commits), len(prompt) + len(_SYSTEM)
+def extract_decisions(
+    commits: Sequence[Commit],
+    *,
+    min_confidence: float = DEFAULT_MIN_CONFIDENCE,
+) -> list[Decision]:
+    """Send ``commits`` to the configured LLM and parse structured decisions.
+    Raises ``LLMConfigError`` when the environment is not set up; raises
+    ``LLMCallError`` on transport / API failure. Returns ``[]`` on empty
+    input or a malformed model response.
+    """
+    if not commits:
+        return []
+    prompt = _PROMPT_TEMPLATE.format(commits=_format_commits_for_prompt(commits))
+    raw = call_llm(prompt, _SYSTEM)
+    decisions = _parse_decisions(raw, [c.sha for c in commits])
+    return [d for d in decisions if d.confidence >= min_confidence]
+__all__ = [
+    "DEFAULT_MAX_COMMITS",
+    "DEFAULT_MIN_CONFIDENCE",
+    "Decision",
+    "LLMCallError",
+    "LLMConfigError",
+    "estimate_payload",
+    "extract_decisions",
+]
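
The parsing strategy in `decisions.py` is "degrade, do not crash": strip an optional Markdown fence, parse JSON, skip anything malformed, clamp the confidence into [0, 1], and drop low-confidence items. A reduced sketch of that idea follows, assuming plain dicts instead of the `Decision` dataclass. The names `strip_code_fence` and `parse_confident` are illustrative, not the module's API.

```python
import json
import re

# Three backticks, built this way so the example does not contain a literal fence.
FENCE = "`" * 3


def strip_code_fence(raw: str) -> str:
    """Remove a leading Markdown fence (optionally tagged json) and a trailing one."""
    raw = raw.strip()
    raw = re.sub(rf"^{FENCE}(?:json)?\s*", "", raw)
    raw = re.sub(rf"\s*{FENCE}\s*$", "", raw)
    return raw.strip()


def parse_confident(raw: str, min_confidence: float = 0.5) -> list[dict]:
    """Return decision dicts at or above `min_confidence`; [] on any bad input."""
    try:
        data = json.loads(strip_code_fence(raw))
    except json.JSONDecodeError:
        return []                      # bad JSON degrades to "no decisions"
    if not isinstance(data, list):
        return []                      # the prompt asks for a JSON array
    kept = []
    for item in data:
        if not isinstance(item, dict):
            continue
        try:
            conf = float(item.get("confidence", 0.0))
        except (TypeError, ValueError):
            continue                   # unparseable confidence: skip the item
        conf = max(0.0, min(1.0, conf))  # clamp into [0, 1]
        if conf >= min_confidence:
            kept.append({**item, "confidence": conf})
    return kept
```

A missing `confidence` defaults to 0.0 here, as in the diff, so an item without one is filtered out at the default threshold of 0.5.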

whycode_cli-0.3.0/src/whycode/llm.py ADDED

@@ -0,0 +1,112 @@
+"""Provider-neutral LLM client wrapper for the optional L3 layer.
+L3 is opt-in. Off by default. The CLI must require an explicit ``--llm``
+flag and the user must set their own API key. This module never embeds
+provider names, model identifiers, or default keys in source code —
+configuration lives entirely in environment variables, so the source tree
+itself does not advertise any specific vendor.
+Required:
+  ``WHYCODE_LLM_API_KEY``    Your provider's API key.
+  ``WHYCODE_LLM_MODEL``      Your provider's model identifier (string).
+Optional:
+  ``WHYCODE_LLM_MAX_TOKENS`` Output cap (default 2000).
+The actual provider SDK is loaded lazily (``pip install 'whycode-cli[llm]'``)
+so users who never invoke L3 do not pay the import cost or force a
+dependency on any AI SDK.
+"""
+from __future__ import annotations
+import os
+from dataclasses import dataclass
+class LLMConfigError(RuntimeError):
+    """Raised when L3 is invoked without sufficient configuration."""
+class LLMCallError(RuntimeError):
+    """Raised when the underlying provider call fails."""
+@dataclass(frozen=True)
+class LLMConfig:
+    api_key: str
+    model: str
+    max_tokens: int = 2000
+def _read_config() -> LLMConfig:
+    """Read configuration from environment variables.
+    No defaults for ``api_key`` or ``model`` — both must be set explicitly.
+    The error message points the user at the ``--llm-dry-run`` flag for
+    self-service auditing.
+    """
+    api_key = os.environ.get("WHYCODE_LLM_API_KEY", "").strip()
+    model = os.environ.get("WHYCODE_LLM_MODEL", "").strip()
+    if not api_key:
+        raise LLMConfigError(
+            "WHYCODE_LLM_API_KEY is not set. To use --llm:\n"
+            "  1. Get an API key from your LLM provider.\n"
+            "  2. export WHYCODE_LLM_API_KEY=…\n"
+            "  3. export WHYCODE_LLM_MODEL=<your-provider's-model-identifier>\n"
+            "  Use --llm-dry-run first to see exactly what would be sent."
+        )
+    if not model:
+        raise LLMConfigError(
+            "WHYCODE_LLM_MODEL is not set. Set it to your provider's model "
+            "identifier (consult your provider's docs for available models)."
+        )
+    raw_max = os.environ.get("WHYCODE_LLM_MAX_TOKENS", "2000").strip()
+    try:
+        max_tokens = int(raw_max)
+    except ValueError:
+        max_tokens = 2000
+    return LLMConfig(api_key=api_key, model=model, max_tokens=max_tokens)
+def call_llm(prompt: str, system: str) -> str:
+    """Send ``prompt`` (with ``system`` instruction) to the configured LLM.
+    Returns the assistant's text response. Raises ``LLMConfigError`` if the
+    environment is not set up or the provider SDK is missing; raises
+    ``LLMCallError`` on transport / API failure.
+    The provider SDK is loaded lazily inside this call to keep the import
+    out of the cold path. This matches the architectural rule that L1+L2
+    must run with zero network and zero LLM dependencies.
+    """
+    cfg = _read_config()
+    try:
+        # Lazy import — the SDK is in the optional ``[llm]`` extras and is
+        # not required for the rest of WhyCode. Keep the package name out
+        # of any user-facing strings.
+        client_module = __import__("anthropic")
+    except ImportError as exc:
+        raise LLMConfigError(
+            "LLM support not installed. Run: pip install 'whycode-cli[llm]'"
+        ) from exc
+    try:
+        client = client_module.Anthropic(api_key=cfg.api_key)
+        msg = client.messages.create(
+            model=cfg.model,
+            max_tokens=cfg.max_tokens,
+            system=system,
+            messages=[{"role": "user", "content": prompt}],
+        )
+    except Exception as exc:
+        raise LLMCallError(f"LLM call failed: {exc}") from exc
+    # Anthropic returns a list of content blocks; concatenate text-typed ones.
+    parts: list[str] = []
+    for block in getattr(msg, "content", []):
+        text = getattr(block, "text", None)
+        if isinstance(text, str):
+            parts.append(text)
+    return "".join(parts)
+__all__ = ["LLMCallError", "LLMConfig", "LLMConfigError", "call_llm"]
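
Two patterns carry `llm.py`: configuration read strictly from environment variables, with a silent fallback only for the optional token cap, and an SDK import deferred until the call is actually made. The sketch below isolates both under stated assumptions: `ConfigError` stands in for `LLMConfigError`, `read_config` takes the environment as a mapping so it can be exercised without touching the real process environment, and `load_sdk` is a hypothetical helper that uses `importlib.import_module` where the diff uses `__import__`.

```python
import importlib
import os
from collections.abc import Mapping


class ConfigError(RuntimeError):
    """Stand-in for LLMConfigError in the diff (assumption)."""


def read_config(env: Mapping[str, str] = os.environ) -> dict:
    """Require key and model; fall back to 2000 for an unparseable token cap."""
    api_key = env.get("WHYCODE_LLM_API_KEY", "").strip()
    model = env.get("WHYCODE_LLM_MODEL", "").strip()
    if not api_key:
        raise ConfigError("WHYCODE_LLM_API_KEY is not set")
    if not model:
        raise ConfigError("WHYCODE_LLM_MODEL is not set")
    try:
        max_tokens = int(env.get("WHYCODE_LLM_MAX_TOKENS", "2000").strip())
    except ValueError:
        max_tokens = 2000
    return {"api_key": api_key, "model": model, "max_tokens": max_tokens}


def load_sdk(module_name: str):
    """Import an optional dependency only when it is needed."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ConfigError(
            "LLM support not installed. Run: pip install 'whycode-cli[llm]'"
        ) from exc
```

Reporting a missing SDK as a configuration error, rather than letting `ImportError` escape, is what lets the CLI print one consistent `--llm config error` message and exit with status 2.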

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/src/whycode/risk_card.py RENAMED

@@ -24,6 +24,8 @@ from whycode.scorer import Band, Score, score
 if TYPE_CHECKING:
     from pathlib import Path
+    from whycode.decisions import Decision
 @dataclass(frozen=True)
 class RiskCard:
@@ -38,6 +40,15 @@ class RiskCard:
     as_of_sha: str | None = None
     """When set, the card was computed *as of* this commit (historical view)."""
+    decisions: tuple[Decision, ...] = ()
+    """L3 — LLM-extracted structured decisions. Empty unless ``--llm`` was on."""
+    def with_decisions(self, decisions: tuple[Decision, ...]) -> RiskCard:
+        """Return a copy with the L3 ``decisions`` field populated."""
+        from dataclasses import replace
+        return replace(self, decisions=decisions)
     def to_dict(self) -> dict[str, Any]:
         return {
             "path": self.path,
@@ -65,6 +76,7 @@ class RiskCard:
                 }
                 for s in self.signals
             ],
+            "decisions": [d.to_dict() for d in self.decisions],
         }
@@ -190,11 +202,40 @@ def _next_step_hint(signals: tuple[sig.Signal, ...]) -> Text | None:
     return None
+def _decisions_block(decisions: tuple[Decision, ...]) -> Padding:
+    """Render the L3 decisions section inside a labelled panel."""
+    body = Text()
+    for i, d in enumerate(decisions):
+        if i:
+            body.append("\n\n")
+        # Header: type + confidence badge.
+        body.append(f"{d.decision_type.replace('_', ' ').upper()}", style="bold cyan")
+        body.append(f"   confidence {int(d.confidence * 100)}%\n", style="dim")
+        body.append(d.what_changed + "\n", style="bold")
+        body.append("Why: ", style="dim")
+        body.append(d.why + "\n", style="italic")
+        if d.do_not:
+            body.append("Don't: ", style="bold red")
+            body.append(d.do_not + "\n", style="")
+        if d.evidence:
+            short = ", ".join(s[:7] for s in d.evidence)
+            body.append(f"evidence: {short}", style="dim")
+    panel = Panel(
+        body,
+        title=Text(" DECISIONS (L3) ", style="bold white on magenta"),
+        title_align="left",
+        border_style="grey50",
+    )
+    return Padding(panel, (1, 1, 0, 1))
 def render_text(card: RiskCard) -> Group:
     pieces: list[Any] = [
         _header(card),
         Padding(_signals_table(card.signals), (0, 1, 0, 1)),
     ]
+    if card.decisions:
+        pieces.append(_decisions_block(card.decisions))
     hint = _next_step_hint(card.signals)
     if hint is not None:
         pieces.append(Padding(hint, (0, 1, 1, 2)))
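
`RiskCard` is a frozen dataclass, so `with_decisions` cannot assign to `self.decisions`. It returns a modified copy via `dataclasses.replace` instead. A small sketch of that pattern, using a hypothetical two-field `Card` class in place of the real `RiskCard`:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Card:
    """Hypothetical two-field stand-in for RiskCard."""
    path: str
    decisions: tuple[str, ...] = ()

    def with_decisions(self, decisions: tuple[str, ...]) -> "Card":
        # replace() builds a new instance; the original stays untouched.
        return replace(self, decisions=decisions)


base = Card("src/some/file.py")
enriched = base.with_decisions(("do not switch to async",))

try:
    base.decisions = ("mutated",)  # direct assignment is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The copy-on-write style means the card built by L1 + L2 is never altered by the L3 step, and the default empty tuple keeps `render_text` and `to_dict` behaving the same when `--llm` is off.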

{whycode_cli-0.2.5 → whycode_cli-0.3.0/src/whycode_cli.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: whycode-cli
-Version: 0.2.5
+Version: 0.3.0
 Summary: Tells you what to be afraid of before you touch a file.
 Author: Kevin
 License-Expression: MIT
@@ -19,6 +19,8 @@ Requires-Dist: typer>=0.12
 Requires-Dist: rich>=13.7
 Provides-Extra: mcp
 Requires-Dist: mcp>=1.0; extra == "mcp"
+Provides-Extra: llm
+Requires-Dist: anthropic>=0.40; extra == "llm"
 Provides-Extra: dev
 Requires-Dist: pytest>=8; extra == "dev"
 Requires-Dist: pytest-cov>=5; extra == "dev"
@@ -87,8 +89,9 @@ Requires Python 3.11+.
 ```bash
 cd /path/to/your/repo
+whycode tour                        # the one command to run first
 whycode init                        # one-command setup: CI workflow + pre-commit gate
-whycode highlights                  # first-run treasure map: top decisions + incidents
+whycode highlights                  # repo-wide treasure map: top decisions + incidents
 whycode why src/some/file.py        # the Risk Card for one file
 whycode why src/some/file.py -b     # one-line summary (for triage / scripts)
 whycode why src/some/file.py --at <sha>     # risk as of a past commit
@@ -196,11 +199,32 @@ Tune the thresholds inside those two files for your repo. Re-run with
 | ----- | ------------------------------------------------------------------------ | -------- | -------- |
 | 1     | Deterministic git facts (log, diffstat, revert pairs, author activity)   | no       | no       |
 | 2     | Heuristic signals (reverts, incidents, silence, ghost keeper, coupling, invariants, churn, newborn) | no | no |
-| 3     | LLM polish (optional, opt-in, never on by default)                       | yes      | yes      |
+| 3     | LLM-extracted structured decisions (optional, opt-in, never on by default) | yes      | yes      |
-**Layer 1 + Layer 2 produce the Risk Card you saw above. No model calls, no
-data leaving your machine.** Layer 3 is reserved for natural-language
-summarisation of decisions and is strictly opt-in.
+**Layer 1 + Layer 2 produce the Risk Card by default. No model calls, no
+data leaving your machine.** Layer 3 lifts the keyword fragments L1 + L2
+extract ("do not switch to async") into structured decisions with the
+*why* drawn from the surrounding commit body — but only when you ask for
+it with `--llm`.
+### Optional L3 — LLM-enriched decisions
+Install the optional extras and configure the env vars:
+```bash
+pip install 'whycode-cli[llm]'
+export WHYCODE_LLM_API_KEY="…"
+export WHYCODE_LLM_MODEL="<your-provider's-model-identifier>"
+whycode why src/some/file.py --llm        # full card + structured decisions
+whycode why src/some/file.py --llm-dry-run  # see exactly what would be sent
+```
+Privacy contract: configuration is entirely environment-driven (no
+hardcoded provider in the source tree); the SDK is lazy-imported (no
+import cost unless you opt in); only L2-filtered high-signal commits
+are sent (capped at 10 per call); a malformed model response degrades
+to "no decisions" rather than crashing.
 ## What this is NOT
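
Note on the privacy contract in the README hunk above: `src/whycode/llm.py` is not shown in this part of the diff, so the following is only a minimal sketch of how such a contract could be honoured, not the package's code. The environment variable names and the cap of 10 commits come from the README text. `LLMConfig`, `load_llm_config`, `cap_commits`, and `load_sdk` are hypothetical names, and `LLMConfigError` here is a local stand-in for the class the tests import from `whycode.llm`.

```python
import os
from collections.abc import Iterable, Mapping
from dataclasses import dataclass
from typing import Optional

MAX_COMMITS_PER_CALL = 10  # cap stated in the README text


class LLMConfigError(RuntimeError):
    """Local stand-in; the real whycode.llm.LLMConfigError may differ."""


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical container for the two documented settings."""

    api_key: str
    model: str


def load_llm_config(env: Optional[Mapping[str, str]] = None) -> LLMConfig:
    """Read configuration from the environment only; nothing is hardcoded."""
    env = os.environ if env is None else env
    api_key = env.get("WHYCODE_LLM_API_KEY", "").strip()
    model = env.get("WHYCODE_LLM_MODEL", "").strip()
    if not api_key or not model:
        raise LLMConfigError("set WHYCODE_LLM_API_KEY and WHYCODE_LLM_MODEL")
    return LLMConfig(api_key=api_key, model=model)


def cap_commits(commits: Iterable[object]) -> list:
    """Keep at most MAX_COMMITS_PER_CALL items before anything is sent."""
    return list(commits)[:MAX_COMMITS_PER_CALL]


def load_sdk():
    """Import the optional SDK lazily, so the default path pays no import cost."""
    try:
        import anthropic  # provided by the optional [llm] extra
    except ImportError as exc:
        raise LLMConfigError("install 'whycode-cli[llm]' to use --llm") from exc
    return anthropic
```

In this sketch a missing variable fails fast with a configuration error, while `load_sdk` is only called once the user has opted in, which is one way to keep the default Risk Card free of model calls.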

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/src/whycode_cli.egg-info/SOURCES.txt RENAMED

@@ -4,8 +4,10 @@ pyproject.toml
 src/whycode/__init__.py
 src/whycode/__main__.py
 src/whycode/cli.py
+src/whycode/decisions.py
 src/whycode/git_facts.py
 src/whycode/ignore.py
+src/whycode/llm.py
 src/whycode/mcp_server.py
 src/whycode/risk_card.py
 src/whycode/scorer.py
@@ -21,6 +23,7 @@ src/whycode_cli.egg-info/entry_points.txt
 src/whycode_cli.egg-info/requires.txt
 src/whycode_cli.egg-info/top_level.txt
 tests/test_cli.py
+tests/test_decisions.py
 tests/test_git_facts.py
 tests/test_ignore.py
 tests/test_scorer.py

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/src/whycode_cli.egg-info/requires.txt RENAMED

@@ -7,5 +7,8 @@ pytest-cov>=5
 ruff>=0.6
 mypy>=1.10
+[llm]
+anthropic>=0.40
 [mcp]
 mcp>=1.0

{whycode_cli-0.2.5 → whycode_cli-0.3.0}/tests/test_cli.py RENAMED

@@ -589,3 +589,51 @@ def test_scan_lists_top_files(repo, days_ago) -> None:  # type: ignore[no-untype
     result = _invoke(repo.root, "scan", "--top", "3")
     assert result.exit_code == 0
     assert "a.py" in result.output
+def test_tour_runs_and_emits_all_sections(repo, days_ago) -> None:  # type: ignore[no-untyped-def]
+    repo.commit(
+        "compat: keep sync path",
+        {"a.py": "1"},
+        body="Do not switch to async — v1 clients break.",
+        when=days_ago(60),
+    )
+    repo.commit(
+        "hotfix: refund regression",
+        {"b.py": "1"},
+        body="See INC-447.",
+        when=days_ago(20),
+    )
+    sha = repo.commit("feat: A", {"a.py": "2"}, when=days_ago(40))
+    repo.revert(sha, when=days_ago(15))
+    result = _invoke(repo.root, "tour")
+    assert result.exit_code == 0
+    out = result.output
+    assert "Welcome to WhyCode" in out
+    assert "Decisions and incidents" in out
+    assert "Do not switch to async" in out
+    assert "hotfix: refund regression" in out
+    assert "Wire WhyCode into your AI editor" in out
+    # MCP snippet appears verbatim so users can copy-paste.
+    assert '"command": "whycode"' in out
+def test_tour_quiet_repo_explains_why(repo) -> None:  # type: ignore[no-untyped-def]
+    repo.commit("init", {"a.py": "1"})
+    result = _invoke(repo.root, "tour")
+    assert result.exit_code == 0
+    out = result.output
+    # MCP section appears regardless — most useful next step.
+    assert "Wire WhyCode into your AI editor" in out
+    # And the empty-state explanation should mention why nothing fires.
+    assert "terse" in out.lower() or "no headline" in out.lower()
+def test_tour_outside_repo_errors(tmp_path) -> None:  # type: ignore[no-untyped-def]
+    cwd = os.getcwd()
+    os.chdir(tmp_path)
+    try:
+        result = runner.invoke(app, ["tour"], catch_exceptions=False)
+    finally:
+        os.chdir(cwd)
+    assert result.exit_code != 0

whycode_cli-0.3.0/tests/test_decisions.py ADDED

@@ -0,0 +1,214 @@
+"""Tests for the L3 decision-extraction layer.
+LLM calls are mocked; no real network is touched.
+"""
+from __future__ import annotations
+import json
+from datetime import UTC, datetime
+from unittest.mock import patch
+import pytest
+from whycode.decisions import (
+    Decision,
+    _parse_decisions,
+    estimate_payload,
+    extract_decisions,
+)
+from whycode.git_facts import Commit
+from whycode.llm import LLMCallError, LLMConfigError
+def _commit(
+    sha: str = "a" * 40,
+    subject: str = "compat: keep sync path",
+    body: str = "Do not switch to async — v1 clients break.",
+    author: str = "Mei",
+) -> Commit:
+    return Commit(
+        sha=sha,
+        author_name=author,
+        author_email=f"{author.lower()}@example.com",
+        authored_at=datetime(2025, 9, 14, tzinfo=UTC),
+        subject=subject,
+        body=body,
+    )
+def test_estimate_payload_zero_for_empty() -> None:
+    assert estimate_payload([]) == (0, 0)
+def test_estimate_payload_grows_with_commits() -> None:
+    one = estimate_payload([_commit()])
+    two = estimate_payload([_commit(), _commit(sha="b" * 40)])
+    assert one[0] == 1
+    assert two[0] == 2
+    assert two[1] > one[1]
+def test_parse_decisions_well_formed() -> None:
+    raw = json.dumps(
+        [
+            {
+                "decision_type": "compat_workaround",
+                "what_changed": "Kept synchronous HTTP for refund flow.",
+                "why": "v1 clients break under async.",
+                "do_not": "Don't switch this to async.",
+                "evidence": ["a" * 12],
+                "confidence": 0.9,
+            }
+        ]
+    )
+    out = _parse_decisions(raw, ["a" * 40])
+    assert len(out) == 1
+    d = out[0]
+    assert d.decision_type == "compat_workaround"
+    assert d.confidence == 0.9
+    assert d.evidence == ("a" * 40,)
+    assert d.do_not is not None and "async" in d.do_not
+def test_parse_decisions_unknown_type_normalised_to_other() -> None:
+    raw = json.dumps(
+        [
+            {
+                "decision_type": "made-up-category",
+                "what_changed": "x",
+                "why": "y",
+                "do_not": None,
+                "evidence": ["a" * 12],
+                "confidence": 0.7,
+            }
+        ]
+    )
+    out = _parse_decisions(raw, ["a" * 40])
+    assert len(out) == 1
+    assert out[0].decision_type == "other"
+def test_parse_decisions_strips_code_fence() -> None:
+    raw = "```json\n[]\n```"
+    assert _parse_decisions(raw, ["a" * 40]) == []
+    raw_filled = '```json\n[{"decision_type":"other","what_changed":"x","why":"y","confidence":0.6,"evidence":[]}]\n```'
+    out = _parse_decisions(raw_filled, ["a" * 40])
+    assert len(out) == 1
+def test_parse_decisions_garbage_returns_empty() -> None:
+    assert _parse_decisions("not json", ["a" * 40]) == []
+    assert _parse_decisions("", ["a" * 40]) == []
+    assert _parse_decisions("{}", ["a" * 40]) == []  # not a list
+    assert _parse_decisions("null", ["a" * 40]) == []
+def test_parse_decisions_drops_invalid_evidence() -> None:
+    raw = json.dumps(
+        [
+            {
+                "decision_type": "constraint",
+                "what_changed": "x",
+                "why": "y",
+                "evidence": ["zz" * 6, "a" * 12],  # first invalid, second valid
+                "confidence": 0.8,
+            }
+        ]
+    )
+    out = _parse_decisions(raw, ["a" * 40])
+    assert len(out) == 1
+    assert out[0].evidence == ("a" * 40,)
+def test_parse_decisions_drops_when_required_fields_empty() -> None:
+    raw = json.dumps(
+        [
+            {
+                "decision_type": "constraint",
+                "what_changed": "",  # empty
+                "why": "y",
+                "confidence": 0.9,
+                "evidence": ["a" * 12],
+            }
+        ]
+    )
+    assert _parse_decisions(raw, ["a" * 40]) == []
+def test_extract_filters_below_min_confidence() -> None:
+    raw = json.dumps(
+        [
+            {
+                "decision_type": "constraint",
+                "what_changed": "x",
+                "why": "y",
+                "confidence": 0.3,  # below default 0.5
+                "evidence": ["a" * 12],
+            }
+        ]
+    )
+    with patch("whycode.decisions.call_llm", return_value=raw):
+        out = extract_decisions([_commit()])
+    assert out == []
+def test_extract_keeps_above_min_confidence() -> None:
+    raw = json.dumps(
+        [
+            {
+                "decision_type": "compat_workaround",
+                "what_changed": "Kept sync HTTP",
+                "why": "v1 clients break",
+                "confidence": 0.85,
+                "evidence": ["a" * 12],
+            }
+        ]
+    )
+    with patch("whycode.decisions.call_llm", return_value=raw):
+        out = extract_decisions([_commit()])
+    assert len(out) == 1
+    assert out[0].confidence == 0.85
+def test_extract_returns_empty_on_no_commits() -> None:
+    out = extract_decisions([])
+    assert out == []
+def test_extract_propagates_llm_config_error() -> None:
+    def boom(*_a, **_kw) -> str:
+        raise LLMConfigError("not configured")
+    with (
+        patch("whycode.decisions.call_llm", side_effect=boom),
+        pytest.raises(LLMConfigError),
+    ):
+        extract_decisions([_commit()])
+def test_extract_propagates_llm_call_error() -> None:
+    def boom(*_a, **_kw) -> str:
+        raise LLMCallError("network down")
+    with (
+        patch("whycode.decisions.call_llm", side_effect=boom),
+        pytest.raises(LLMCallError),
+    ):
+        extract_decisions([_commit()])
+def test_decision_to_dict_round_trips() -> None:
+    d = Decision(
+        decision_type="rollback",
+        what_changed="Reverted async refactor",
+        why="broke v1",
+        do_not="don't try async again",
+        evidence=("a" * 40,),
+        confidence=0.92,
+    )
+    payload = d.to_dict()
+    assert payload["decision_type"] == "rollback"
+    assert payload["confidence"] == 0.92
+    assert payload["do_not"] == "don't try async again"
+    assert payload["evidence"] == ["a" * 40]
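
The tests above pin down the observable behaviour of `_parse_decisions`: code fences are stripped, non-list or non-JSON input yields an empty list, unknown decision types become `"other"`, short SHAs in `evidence` are expanded to full SHAs and unmatched ones are dropped, and entries with empty required fields are discarded. The implementation in `src/whycode/decisions.py` is not shown in this part of the diff, so what follows is a hedged sketch of a parser that would satisfy those assertions. `parse_decisions_sketch` and `DecisionSketch` are hypothetical names, `KNOWN_TYPES` lists only the values that appear in the tests, and resolving evidence by unique prefix match is an assumption.

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Only the decision types that appear in the tests; the real set may differ.
KNOWN_TYPES = {"compat_workaround", "constraint", "rollback", "other"}
FENCE = "`" * 3  # Markdown code-fence marker


@dataclass(frozen=True)
class DecisionSketch:
    decision_type: str
    what_changed: str
    why: str
    do_not: Optional[str]
    evidence: Tuple[str, ...]
    confidence: float


def _strip_fence(raw: str) -> str:
    """Drop a surrounding Markdown code fence, if the model added one."""
    text = raw.strip()
    if not text.startswith(FENCE):
        return text
    lines = text.splitlines()[1:]  # discard the opening fence line
    if lines and lines[-1].strip().startswith(FENCE):
        lines = lines[:-1]  # discard the closing fence line
    return "\n".join(lines)


def _resolve_evidence(refs: object, known_shas: Sequence[str]) -> Tuple[str, ...]:
    """Expand SHA prefixes to full SHAs; drop refs without a unique match."""
    if not isinstance(refs, list):
        return ()
    resolved = []
    for ref in refs:
        if not isinstance(ref, str) or not ref:
            continue
        matches = [sha for sha in known_shas if sha.startswith(ref)]
        if len(matches) == 1:
            resolved.append(matches[0])
    return tuple(resolved)


def parse_decisions_sketch(raw: str, known_shas: Sequence[str]) -> List[DecisionSketch]:
    """Parse a model response; malformed input degrades to an empty list."""
    try:
        data = json.loads(_strip_fence(raw))
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return []
    if not isinstance(data, list):
        return []
    decisions = []
    for item in data:
        if not isinstance(item, dict):
            continue
        what, why = item.get("what_changed"), item.get("why")
        if not (isinstance(what, str) and what.strip()):
            continue  # required field missing or empty
        if not (isinstance(why, str) and why.strip()):
            continue
        dtype = item.get("decision_type")
        if not isinstance(dtype, str) or dtype not in KNOWN_TYPES:
            dtype = "other"
        do_not = item.get("do_not")
        do_not = do_not.strip() if isinstance(do_not, str) and do_not.strip() else None
        try:
            confidence = float(item.get("confidence", 0.0))
        except (TypeError, ValueError):
            continue
        evidence = _resolve_evidence(item.get("evidence"), known_shas)
        decisions.append(
            DecisionSketch(dtype, what.strip(), why.strip(), do_not, evidence, confidence)
        )
    return decisions
```

A caller could then filter the result on `confidence`, as the `extract_decisions` tests do with a default threshold of 0.5.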