PyPI - falsegreen - Versions diffs - 0.2.0__tar.gz → 0.2.2__tar.gz - Mend

falsegreen-0.2.0.tar.gz → falsegreen-0.2.2.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37)

{falsegreen-0.2.0 → falsegreen-0.2.2}/.github/workflows/ci.yml

@@ -24,9 +24,5 @@ jobs:
         run: ruff check src tests
       - name: Test
         run: pytest -q
-      - name: Bundled skill scanner must match the package
-        run: diff -u src/falsegreen/scanner.py skills/falsegreen/scripts/scan.py
-      - name: Self-scan (must flag the demo, must not flag itself)
-        run: |
-          python -m falsegreen skills/falsegreen/examples/bad_tests_sample.py || true
-          python -m falsegreen src tests
+      - name: Self-scan (must not flag itself)
+        run: python -m falsegreen src tests

{falsegreen-0.2.0 → falsegreen-0.2.2}/.github/workflows/release-drafter.yml

@@ -16,6 +16,6 @@ jobs:
       pull-requests: write
     runs-on: ubuntu-latest
     steps:
-      - uses: release-drafter/release-drafter@v6
+      - uses: release-drafter/release-drafter@v7
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

{falsegreen-0.2.0 → falsegreen-0.2.2}/.github/workflows/release.yml

@@ -31,7 +31,7 @@ jobs:
           python -m build
           python -m twine check dist/*
       - name: Upload dist artifact
-        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02  # v4
+        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a  # v7.0.1
         with:
           name: dist
           path: dist/
@@ -45,7 +45,7 @@ jobs:
       id-token: write  # OIDC: the only credential the publish step needs
     steps:
       - name: Download dist artifact
-        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093  # v4
+        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c  # v8.0.1
         with:
           name: dist
           path: dist/

{falsegreen-0.2.0 → falsegreen-0.2.2}/CHANGELOG.md

@@ -6,6 +6,21 @@ to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [Unreleased]
+## [0.2.2] - 2026-06-08
+### Changed
+- Skill and Claude plugin removed from this repo — the LLM semantic pass, the
+  detection reference, and multi-language support now live in
+  [falsegreen-skill](https://github.com/vinicq/falsegreen-skill).
+- README, CONTRIBUTING, and CREDITS updated to reflect the split.
+## [0.2.1] - 2026-06-08
+### Fixed
+- C2 (HIGH) no longer flags an empty body under sympy's `@SKIP` decorator
+  (`from sympy.testing.pytest import SKIP`), which raises `Skipped` at runtime —
+  same semantics as `@pytest.mark.skip`. Found validating sympy.
 ## [0.2.0] - 2026-06-05
 ### Fixed
@@ -72,16 +87,17 @@ First release.
 - C20 (HIGH): assertion in dead code after `return`/`raise`/`fail()`. C21 (LOW):
   every assertion conditional, none runs unconditionally. Both from the rotten-
   green-test line of work (Soares 2023).
-- Claude Code skill (`/falsegreen`) for the semantic pass: judges a test's
-  expected value against intended behavior using an oracle hierarchy and a
-  test-intent classification step (catches cases 12 and 18).
-- Distribution as a pip package, a `pre-commit` hook, and a Claude plugin.
-- Plain-language guide (`docs/guide.md`), detection reference, and a demo file.
+- Distribution as a pip package and a `pre-commit` hook.
+- Plain-language guide (`docs/guide.md`); the detection reference and LLM semantic
+  pass live in [falsegreen-skill](https://github.com/vinicq/falsegreen-skill).
 ### Validated
 - Two real-project passes (bailiff, md-bridge) settled the rules and fixed three
   false positives: C6 on called boolean predicates, C1 on literal-collection
   loops, and C7 on `f() is f()` (the lru_cache / singleton identity test).
-[Unreleased]: https://github.com/vinicq/falsegreen/compare/v0.1.0...HEAD
+[Unreleased]: https://github.com/vinicq/falsegreen/compare/v0.2.2...HEAD
+[0.2.2]: https://github.com/vinicq/falsegreen/compare/v0.2.1...v0.2.2
+[0.2.1]: https://github.com/vinicq/falsegreen/compare/v0.2.0...v0.2.1
+[0.2.0]: https://github.com/vinicq/falsegreen/compare/v0.1.0...v0.2.0
 [0.1.0]: https://github.com/vinicq/falsegreen/releases/tag/v0.1.0

{falsegreen-0.2.0 → falsegreen-0.2.2}/CONTRIBUTING.md

@@ -19,49 +19,35 @@ Then branch, change, add a test, and open a pull request.
 ## How the project is built
-Two layers, one repo:
+One module, one job: `src/falsegreen/scanner.py` is a zero-dependency AST pass.
+It parses test files, never imports or runs them. Each pattern is a case code
+(`C1`, `C5`, `C13`, ...). HIGH-confidence codes block a commit; LOW only warn.
-- **Scanner** (`src/falsegreen/scanner.py`): a zero-dependency AST pass. It parses
-  test files, it never imports or runs them. Each pattern is a case code
-  (`C1`, `C5`, `C13`, ...). HIGH-confidence codes block a commit; LOW only warn.
-- **Skill** (`skills/falsegreen/`): the Claude Code semantic pass. It bundles a
-  byte-identical copy of the scanner at `skills/falsegreen/scripts/scan.py`; CI
-  fails if it drifts from `src/falsegreen/scanner.py`.
-The plain-language rubric is `docs/guide.md`; the detection reference is
-`skills/falsegreen/reference.md`.
+The plain-language rubric is `docs/guide.md`. The LLM semantic pass and the
+multi-language detection reference live in
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill).
 ## Filing an issue
 A useful bug report for a false positive includes the smallest test snippet that
 gets wrongly flagged, the code falsegreen emitted, and what you expected. For a
-false negative, show the bad test that slipped through. Use the demo file
-`skills/falsegreen/examples/bad_tests_sample.py` as a format reference.
+false negative, show the bad test that slipped through.
 ## Adding or changing a detection rule
-This is the most common contribution. A rule touches up to five places, and the
+This is the most common contribution. A rule touches up to three places, and the
 pull request needs all that apply:
 1. **Logic** in `src/falsegreen/scanner.py`. Decide HIGH vs LOW. The rule of
    thumb: HIGH only if a legitimate test can almost never trigger it, because
    HIGH blocks commits. When in doubt, ship it LOW.
-2. **Reference** entry in `skills/falsegreen/reference.md` (what it looks like,
-   why it fools you, confidence, the tool it maps to).
-3. **Guide** entry in `docs/guide.md` if it is a new case, in the same
+2. **Guide** entry in `docs/guide.md` if it is a new case, in the same
    real-world-analogy style as the others.
-4. **Tests** in `tests/test_scanner.py`: one test proving the rule fires on the
+3. **Tests** in `tests/test_scanner.py`: one test proving the rule fires on the
    bad pattern, and at least one proving it does NOT fire on the legitimate
    look-alike. The second test matters more than the first.
-5. **Skill prose** in `skills/falsegreen/SKILL.md`, *only if* the change alters a
-   confidence level, an exemption, a flag, or the operator's mental model. CI
-   byte-checks `scripts/scan.py` against the scanner, so detector *logic* is
-   mirrored automatically; the SKILL.md prose and its flag list are NOT, so they
-   must be kept consistent with `reference.md` and the README CLI section by hand.
-Then run `pytest`, `python -m falsegreen src tests` (must stay clean), and
-`diff src/falsegreen/scanner.py skills/falsegreen/scripts/scan.py` (must be
-identical, copy the file if you changed the scanner).
+Then run `pytest` and `python -m falsegreen src tests` (must stay clean).
 ### Off-by-default codes
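
The rule-contribution steps in this hunk ask for one test proving a rule fires and at least one proving it stays quiet on the legitimate look-alike. A small sketch of that pairing follows, under stated assumptions: `flags_self_comparison` is a toy detector written for this example, loosely modeled on the `f() is f()` false positive mentioned in the changelog, and is not the logic or API of `src/falsegreen/scanner.py`.

```python
import ast


def flags_self_comparison(source):
    """Toy rule: True if an assert compares a non-call expression with itself."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assert) or not isinstance(node.test, ast.Compare):
            continue
        compare = node.test
        if len(compare.comparators) != 1:
            continue
        left, right = compare.left, compare.comparators[0]
        # Calls are exempt: `f() is f()` can be a real identity test
        # (for example a cached factory or a singleton).
        if isinstance(left, ast.Call) or isinstance(right, ast.Call):
            continue
        if ast.dump(left) == ast.dump(right):
            return True
    return False


def test_fires_on_self_comparison():
    bad = "def test_total():\n    total = 3\n    assert total == total\n"
    assert flags_self_comparison(bad)


def test_stays_quiet_on_identity_of_calls():
    # The legitimate look-alike: both sides read the same, but each is a fresh call.
    ok = "def test_cache():\n    assert get_config() is get_config()\n"
    assert not flags_self_comparison(ok)
```

The second test is the one that protects users from a blocking false positive, which is why the contributing notes weight it more heavily than the first.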

{falsegreen-0.2.0 → falsegreen-0.2.2}/CREDITS.md

@@ -1,8 +1,9 @@
 # Credits and academic references
 falsegreen builds on published research in test smells and rotten green tests. The
-work below shaped its concepts, its rule catalog, and the design of its two layers
-(deterministic scanner plus an LLM semantic pass). Credit to the authors.
+work below shaped its concepts, its rule catalog, and the design of the deterministic
+scanner. The LLM semantic pass and multi-language support live in
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill). Credit to the authors.
 ## Conceptual foundation
@@ -42,8 +43,9 @@ Marcelo d'Amorim, Márcio Ribeiro, Gustavo Soares
 ([@gustavoasoares](https://github.com/gustavoasoares)), Eduardo Almeida, Elvys
 Soares ([@elvyssoares](https://github.com/elvyssoares)). SBES 2025. arXiv:2504.07277. Empirical evidence that small local models in
 agent-based workflows detect and refactor test smells (Phi-4-14B, pass@5 of 75.3%;
-six generated pull requests merged into open-source projects). Backs falsegreen's
-LLM semantic pass and the AI-applies-the-fix path of the dual-use report.
+six generated pull requests merged into open-source projects). Backs
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill)'s LLM semantic pass
+and the AI-applies-the-fix path of the dual-use report.
 **Evaluating LLMs Effectiveness in Detecting and Correcting Test Smells: An
 Empirical Study.** E. G. Santana Jr., Jander Pereira Santos Junior, Erlon P.
@@ -58,8 +60,9 @@ and its multi-agent verify idea.
 **Evaluating Large Language Models in Detecting Test Smells.** Keila Lucas, Rohit
 Gheyi, Elvys Soares, Márcio Ribeiro, Ivan Machado. SBES 2024. arXiv:2407.19261.
 LLMs detected 21 of 30 test smell types across seven languages (ChatGPT-4 best).
-Backs falsegreen's choice to handle cross-language coverage in the language-agnostic
-semantic pass rather than in the Python-only scanner.
+Backs [falsegreen-skill](https://github.com/vinicq/falsegreen-skill)'s choice to handle
+cross-language coverage in the language-agnostic semantic pass rather than in the
+Python-only scanner.
 **Test smells in LLM-Generated Unit Tests.** Wendkûuni C. Ouédraogo, Yinghua Li,
 Xueqi Dang, Xunzhu Tang, Anil Koyuncu, Jacques Klein, David Lo, Tegawendé F.
@@ -82,7 +85,7 @@ Dalton Nicodemos Jorge ([@daltonjorge](https://github.com/daltonjorge)). PhD the
 UFCG, 2023. Advisors Patrícia D. L. Machado, Wilkerson L. Andrade. Tool STEEL:
 <https://github.com/daltonjorge/steel>. Its JavaScript Exception Test smell (a
 `try/catch` that swallows the thrown error) and assertion-in-`forEach`-over-empty
-sharpened the skill's "Frontend cues by language" with two J1 cues for Jest/Vitest.
+sharpened falsegreen-skill's "Frontend cues by language" with two J1 cues for Jest/Vitest.
 **Detecção de smells em testes automatizados em diferentes linguagens de
 programação.** Gustavo Augusto Calazans Lopes. TCC, UFAL, 2023. Advisor Márcio de

{falsegreen-0.2.0 → falsegreen-0.2.2}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: falsegreen
-Version: 0.2.0
+Version: 0.2.2
 Summary: Find unit tests that give false positives: green tests that protect nothing, and tests that pass while asserting the wrong expected value.
 Project-URL: Homepage, https://github.com/vinicq/falsegreen
 Project-URL: Issues, https://github.com/vinicq/falsegreen/issues
@@ -39,17 +39,18 @@ each test against more than twenty mechanical smells, the ones a parser can prov
 an assertion that never runs, a check that is empty or always true, a swallowed
 exception, a mock of the unit under test, an assertion stranded in dead code, a
 weak truthiness check, an async test that never awaits. High-confidence findings
-block the commit; the rest warn. The Claude Code skill then does the part a parser
-cannot: it reads the production code and judges whether each test asserts the
-*right* value, measured against the intended behavior rather than the current
-(possibly buggy) output.
+block the commit; the rest warn. The semantic layer — judging whether each test
+asserts the *right* value against intended behavior — lives in
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill), the companion
+LLM-based tool that covers Python and other languages.
 The checks are grounded in the rotten-green-test research (Soares 2023; Delplanque
 et al., ICSE 2019) and cross-walked against the published test-smell catalog. See
 [CREDITS.md](CREDITS.md).
-> Live on PyPI: `pip install falsegreen`. Also a pre-commit hook and a Claude Code
-> plugin (see the three install paths below).
+> Live on PyPI: `pip install falsegreen`. Also available as a pre-commit hook
+> (see install paths below). For the LLM semantic pass, see
+> [falsegreen-skill](https://github.com/vinicq/falsegreen-skill).
 ---
@@ -60,10 +61,8 @@ et al., ICSE 2019) and cross-walked against the published test-smell catalog. Se
 - [What it validates, how, and why](#what-it-validates-how-and-why)
 - [The two layers](#the-two-layers)
 - [Download and use: the three ways](#download-and-use-the-three-ways)
-  - [1. As a Python package (CLI, no skill needed)](#1-as-a-python-package-cli-no-skill-needed)
+  - [1. As a Python package (CLI)](#1-as-a-python-package-cli)
   - [2. As a pre-commit hook](#2-as-a-pre-commit-hook)
-  - [3. As a Claude Code skill (the semantic pass)](#3-as-a-claude-code-skill-the-semantic-pass)
-  - [With the skill vs without the skill](#with-the-skill-vs-without-the-skill)
 - [Configuration](#configuration)
 - [Technologies used](#technologies-used)
 - [How it compares](#how-it-compares)
@@ -131,9 +130,9 @@ positive, and a labeled characterization snapshot is not a frozen bug. That
 classification step keeps the tool from flagging legitimate styles.
 The plain-language guide behind every case, with a real-world analogy and a
-before/after for each, is in [`docs/guide.md`](docs/guide.md). The detection
-reference that maps each code to its scanner code and to established tooling is in
-[`skills/falsegreen/reference.md`](skills/falsegreen/reference.md).
+before/after for each, is in [`docs/guide.md`](docs/guide.md). The full detection
+reference (code-to-tooling mapping, J1–J6 judgment index) lives in
+[`falsegreen-skill`](https://github.com/vinicq/falsegreen-skill).
 The basis is the rotten-green-test research: a passing test that holds an
 assertion which never runs (Elvys Soares, *A Multimethod Study of Test Smells*,
@@ -147,12 +146,10 @@ and the specific thing falsegreen took from each one, is in [CREDITS.md](CREDITS
 ## What it validates, how, and why
-The catalog has 18 named cases across the five families, and the scanner now ships
-21 codes (the five families are the scanner-facing view; the semantic pass asks the
-same questions as six judgments, J1 to J6, which is the LLM-facing view of the same
-thing, mapped code by code in [`reference.md`](skills/falsegreen/reference.md)). A
-case is caught either by the deterministic **scanner** (a code like `C5`) or only
-by the **semantic** pass (it needs to read the production code). HIGH-confidence
+The catalog has 18 named cases across the five families. The scanner ships 21 codes
+covering all mechanically-detectable patterns. Cases that require reading production
+intent (10, 11, 12, 15, 18) are handled by
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill). HIGH-confidence
 scanner findings block a commit; LOW ones warn.
 | # | Case | Why it fools you | Detected by | Conf |
@@ -203,11 +200,9 @@ and stays quiet on them.
 by structure. A parser sees a mock but cannot tell whether it replaced an edge
 (network, disk, clock) or the thing under test. It sees an arithmetic expression
 but cannot tell whether the expected value was derived independently or copied
-from the code. The `/falsegreen` skill reads the production code, derives the
-intended behavior from the oracle hierarchy, compares it against what the test
-asserts, and when they disagree, names which side is wrong. It is told to favor
-precision over recall and to ground a verdict in a cited contract line, never in
-the code's current output alone.
+from the code. That judgment requires reading the production code against an
+independent oracle — that is what
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill) does.
 **Why two confidence levels.** A blocking gate that cries wolf gets disabled. So
 only near-certain, mechanically-unambiguous patterns are HIGH (they block). The
@@ -235,26 +230,10 @@ different natures.
   stays-clean regression test, and a re-scan brought the HIGH count to 0 across all
   8 projects. Each false positive is recorded as it is fixed, with its regression
   tests, in the commit history and the CHANGELOG.
-- **The semantic pass (LLM, any language).** Cross-language coverage runs through
-  this pass, so its reliability is measured, not assumed. The validation is a
-  benchmark corpus: tests planted with a known ground truth, a test that mocks the
-  unit under test, one that copies the expected value from current output, one that
-  re-implements the production formula, in Python and in other languages, scored for
-  precision and recall with precision held above recall. Because the pass runs on an
-  LLM it is non-deterministic, so this is a periodic skill-validation artifact, not
-  a CI gate. The first labeled corpus has 24 Python cases (10 rotten, 14 sound)
-  across cases 10, 11, 12, and 18, with sound look-alikes and plain controls. Run
-  blind on a small model (Claude Haiku), the pass scored precision 1.00 (no false
-  alarms on the 14 sound tests), recall 0.70, and 1.00 recall on the clear-cut
-  smells; the only misses were borderline cases (a pure-delegation passthrough, a
-  trivial one-operator formula) where the precision-first guardrail defers to
-  "sound". That is the evidence behind the design claim that a small model is
-  enough for a precision-first semantic pass. The number to grow is recall: a
-  larger corpus, a second annotator, and multi-vote runs are the next step. A
-  second corpus of 20 TypeScript cases (Jest/Vitest) reproduced the pattern:
-  precision 1.00, recall 0.625, with the only misses being the same boundary
-  cases, evidence that the pass carries across languages and frameworks, not just
-  Python.
+- **The semantic pass (LLM).** Validation for the LLM-based semantic layer is
+  tracked in [falsegreen-skill](https://github.com/vinicq/falsegreen-skill), where
+  benchmark corpora for Python and TypeScript are maintained with precision/recall
+  measurements.
 ---
@@ -292,28 +271,19 @@ that maintainability layer well; run them alongside falsegreen.
 | Layer | What it is | When it runs | Catches |
 |---|---|---|---|
-| **Scanner** | Zero-dependency AST analysis (Python/pytest), one self-contained module | CLI, CI, pre-commit | the mechanical patterns (21 codes) |
-| **Semantic pass** | A Claude Code skill (`/falsegreen`) that reads the code | on demand, in Claude Code | the bug-freezing patterns no static tool can see (cases 10/11/12/15/18) |
+| **Scanner** (this repo) | Zero-dependency AST analysis (Python/pytest) | CLI, CI, pre-commit | 21 mechanical codes |
+| **Semantic pass** ([falsegreen-skill](https://github.com/vinicq/falsegreen-skill)) | LLM-based analysis, Python + other languages | on demand | bug-freezing patterns no static tool can see (cases 10/11/12/15/18) |
 The scanner is the fast, deterministic pre-filter. It overlaps in part with
 `ruff`'s `PT` rules and with research tools like PyNose, and that overlap is fine:
-run them together. The semantic pass is the part nobody else ships, and it is the
-reason the project exists.
-The semantic pass runs on whatever Claude model your Claude Code session uses. It
-is not pinned to one model, and it does not need a frontier one: the research it
-draws on (Agentic LMs, SBES 2025; Santana Jr. et al., 2025) shows that small,
-locally-runnable models detect and refactor these patterns well. The value is in
-the protocol, not in any single model.
+run them together. For the semantic layer — and for TypeScript, JavaScript, Java,
+and other languages — use [falsegreen-skill](https://github.com/vinicq/falsegreen-skill).
 ---
-## Download and use: the three ways
-Pick one or combine them. The CLI and pre-commit need no Claude Code; the skill
-adds the semantic pass on top.
+## Download and use
-### 1. As a Python package (CLI, no skill needed)
+### 1. As a Python package (CLI)
 Install from PyPI:
@@ -343,12 +313,6 @@ code scanning / PR annotations; `--format junit` emits JUnit XML (HIGH ->
 finding. Wire those into any CI step. No third-party runtime dependencies; Python
 3.8+.
-Try it on the bundled demo (one bad test per case):
-```bash
-pipx run falsegreen skills/falsegreen/examples/bad_tests_sample.py
-```
 ### 2. As a pre-commit hook
 This is the standard, version-pinned way to gate every commit. Add to your
@@ -373,42 +337,13 @@ python -m falsegreen.hook_install --repo .      # install
 python -m falsegreen.hook_install --uninstall   # remove
 ```
-### 3. As a Claude Code skill (the semantic pass)
+### 3. With the semantic pass (multi-language)
-Install the plugin:
-```
-/plugin marketplace add vinicq/falsegreen
-```
-Then, in a Claude Code session, run:
-```
-/falsegreen
-```
-against a diff or a module. The skill triages the scanner output first, then does
-the semantic work: for each test it finds the unit under test, derives the
-intended behavior from the oracle hierarchy, and reports tests that pass while
-asserting the wrong thing, with the cited evidence and a concrete fix. It is
-read-only by default (it proposes fixes, it does not edit your tests unless you
-ask).
-The scanner is bundled inside the skill, so the plugin works on its own. On
-another Agent Skills client that does not define `${CLAUDE_SKILL_DIR}`, install
-the package (`pip install falsegreen`) and the skill falls back to the CLI.
-### With the skill vs without the skill
-- **Without the skill** (CLI / pre-commit / CI): you get the deterministic
-  scanner. It catches the 16 mechanical codes and blocks commits on the
-  high-confidence ones. This is everything a non-Claude-Code user needs and runs
-  anywhere Python runs.
-- **With the skill** (`/falsegreen` in Claude Code): you additionally get the
-  semantic pass, which catches the five code-aware cases (10, 11, 12, 15, 18),
-  including the headline one: a test that is green while its expected value
-  contradicts the spec. No static tool, this one included, can find that on its
-  own.
+For cases that require reading production intent — mocking the unit under test,
+copying expected from current output, re-implementing the formula — use
+[falsegreen-skill](https://github.com/vinicq/falsegreen-skill). It covers Python,
+TypeScript, JavaScript, Java, and other languages via an LLM-based analysis using
+the same case catalog.
 ---
@@ -468,12 +403,9 @@ override.
 - **Packaging:** `hatchling` build backend, SPDX license metadata (PEP 639),
   console entry point, distributed on PyPI.
 - **Distribution:** a [pre-commit](https://pre-commit.com) hook
-  (`.pre-commit-hooks.yaml`) and a Claude Code plugin following the
-  [Agent Skills](https://agentskills.io) open standard (`SKILL.md` plus a
-  `.claude-plugin/` marketplace manifest).
+  (`.pre-commit-hooks.yaml`), distributed on PyPI.
 - **CI:** GitHub Actions across Python 3.8 / 3.11 / 3.13, running `ruff`,
-  `pytest`, a self-scan (the tool must stay clean on its own code), and a
-  drift-check that the bundled scanner copy matches the package byte for byte.
+  `pytest`, and a self-scan (the tool must stay clean on its own code).
 ---
@@ -481,17 +413,20 @@ override.
 - **ruff / flake8-pytest-style** - mature, fast lint rules. Overlaps on broad
   `raises` (PT011) and assert-in-except (PT017). Run both. falsegreen adds
-  uncollected tests, always-true asserts, self-comparison, mock typos, and the
-  semantic pass.
+  uncollected tests, always-true asserts, self-comparison, and mock typos.
 - **PyNose / pytest-smell / TEMPY** - test-smell catalogs from research. Broader
   taxonomy, but no commit gate and no oracle-correctness check.
 - **mutmut / cosmic-ray** - mutation testing, the most honest measure of whether a
   green suite fails when the code is wrong. Complementary and heavier. falsegreen
   is the cheap pre-filter you run on every commit; mutation testing is the deep
   audit you run on the suites that matter.
+- **[falsegreen-skill](https://github.com/vinicq/falsegreen-skill)** - the LLM
+  companion for the semantic pass (cases 10/11/12/15/18) and for TypeScript,
+  JavaScript, Java, and other languages.
-The defensible gap: nobody else combines a deterministic commit gate with a
-code-as-evidence semantic pass aimed at oracle correctness (cases 12 and 18).
+The defensible gap: a deterministic commit gate that catches the mechanical
+false-positive patterns with zero runtime dependencies, paired with an LLM
+semantic layer that catches the oracle-correctness cases no static tool can see.
 ---
@@ -499,20 +434,17 @@ code-as-evidence semantic pass aimed at oracle correctness (cases 12 and 18).
 ```
 falsegreen/
-  src/falsegreen/scanner.py        the deterministic scanner (canonical)
+  src/falsegreen/scanner.py        the deterministic scanner
   src/falsegreen/hook_install.py   raw git-hook installer
-  skills/falsegreen/
-    SKILL.md                       the semantic-pass protocol
-    reference.md                   the 18-case detection rubric
-    scripts/scan.py                bundled scanner (kept identical to the package)
-    examples/bad_tests_sample.py   one bad test per case (demo + regression)
   docs/guide.md                    plain-language guide to every case
   tests/test_scanner.py            the scanner's own tests
   .pre-commit-hooks.yaml           pre-commit integration
-  .claude-plugin/                  plugin + marketplace manifests
   pyproject.toml                   packaging
 ```
+The LLM skill, the semantic-pass protocol, and the multi-language case reference
+live in [falsegreen-skill](https://github.com/vinicq/falsegreen-skill).
 ---
 ## Contributing, security, license
