docassert 0.1.0.tar.gz → 0.2.1.tar.gz (PyPI version diff, via Mend)

Files changed (105)

{docassert-0.1.0/docassert.egg-info → docassert-0.2.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: docassert
-Version: 0.1.0
+Version: 0.2.1
 Summary: Unit testing for business documents — validate structured Markdown docs against a configurable audit standard.
 Author: C4G Enterprises Inc.
 License: Apache-2.0
@@ -38,6 +38,10 @@ Dynamic: license-file
 # docassert
+[![PyPI](https://img.shields.io/pypi/v/docassert)](https://pypi.org/project/docassert/)
+[![Python](https://img.shields.io/pypi/pyversions/docassert)](https://pypi.org/project/docassert/)
+[![License](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE)
 **Unit testing for business documents.** Validate structured Markdown documents
 (charters, BRDs, PRDs, risk registers, …) against a configurable audit standard:
 deterministic structural checks that gate a merge, plus optional AI-graded
@@ -50,9 +54,11 @@ a vendor-neutral standard for running a PMO from version-controlled, declarative
 ## Install
 ```bash
-pip install "docassert @ git+https://github.com/c4g-john/docassert"   # PyPI release coming
+pipx install docassert          # recommended — installs the CLI in its own isolated env
+# or:
+pip install docassert
 # with the AI advisory extra:
-pip install "docassert[ai] @ git+https://github.com/c4g-john/docassert"
+pip install "docassert[ai]"
 ```
 ## Quickstart
@@ -74,13 +80,18 @@ you can customize them.
 | Command | What it does |
 |---|---|
-| `docassert validate <globs>` | Validate documents against their kind's criteria. Exit code = number of blocking failures. |
+| `docassert validate <globs>` | Validate documents against their kind's criteria. Exit code = number of blocking failures (capped at 125). |
 | `docassert consistency` | Cross-document checks: referential integrity, coverage, required links, profile completeness. |
 | `docassert rtm [--project ID]` | Requirements traceability matrix (Markdown or CSV). |
 | `docassert status [--project ID] [--index]` | Derived project status (md / json / html). |
 | `docassert pages --out DIR` | Build the portfolio site (index + a page per project). |
 | `docassert projects [--out] [--check]` | Generate / verify the project registry. |
 | `docassert init [DIR]` | Scaffold the default config into a repo. |
+| `docassert extract <file>` | Extract plain text from a source `.docx` / `.pdf` / `.md` / `.txt` (the first step of doc-to-pmo conversion). Needs the `convert` extra: `pip install "docassert[convert]"`. |
+Every document-reading command accepts `--documents-dir` (default `documents/`).
+AI alignment grades at most `alignment_limit` links per run (default 25; set it
+in `consistency.yaml`, `0` = no cap) so API cost stays bounded on large graphs.
 ## Document kinds

{docassert-0.1.0 → docassert-0.2.1}/README.md RENAMED

@@ -1,5 +1,9 @@
 # docassert
+[![PyPI](https://img.shields.io/pypi/v/docassert)](https://pypi.org/project/docassert/)
+[![Python](https://img.shields.io/pypi/pyversions/docassert)](https://pypi.org/project/docassert/)
+[![License](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE)
 **Unit testing for business documents.** Validate structured Markdown documents
 (charters, BRDs, PRDs, risk registers, …) against a configurable audit standard:
 deterministic structural checks that gate a merge, plus optional AI-graded
@@ -12,9 +16,11 @@ a vendor-neutral standard for running a PMO from version-controlled, declarative
 ## Install
 ```bash
-pip install "docassert @ git+https://github.com/c4g-john/docassert"   # PyPI release coming
+pipx install docassert          # recommended — installs the CLI in its own isolated env
+# or:
+pip install docassert
 # with the AI advisory extra:
-pip install "docassert[ai] @ git+https://github.com/c4g-john/docassert"
+pip install "docassert[ai]"
 ```
 ## Quickstart
@@ -36,13 +42,18 @@ you can customize them.
 | Command | What it does |
 |---|---|
-| `docassert validate <globs>` | Validate documents against their kind's criteria. Exit code = number of blocking failures. |
+| `docassert validate <globs>` | Validate documents against their kind's criteria. Exit code = number of blocking failures (capped at 125). |
 | `docassert consistency` | Cross-document checks: referential integrity, coverage, required links, profile completeness. |
 | `docassert rtm [--project ID]` | Requirements traceability matrix (Markdown or CSV). |
 | `docassert status [--project ID] [--index]` | Derived project status (md / json / html). |
 | `docassert pages --out DIR` | Build the portfolio site (index + a page per project). |
 | `docassert projects [--out] [--check]` | Generate / verify the project registry. |
 | `docassert init [DIR]` | Scaffold the default config into a repo. |
+| `docassert extract <file>` | Extract plain text from a source `.docx` / `.pdf` / `.md` / `.txt` (the first step of doc-to-pmo conversion). Needs the `convert` extra: `pip install "docassert[convert]"`. |
+Every document-reading command accepts `--documents-dir` (default `documents/`).
+AI alignment grades at most `alignment_limit` links per run (default 25; set it
+in `consistency.yaml`, `0` = no cap) so API cost stays bounded on large graphs.
 ## Document kinds
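
The README change above says every document-reading command accepts `--documents-dir`. A minimal sketch of how one shared helper can attach the same option to several `argparse` subcommands follows. The parser name `demo` and the two-subcommand list are illustrative assumptions, not the package's full CLI.

```python
import argparse

# Assumed default, mirroring DEFAULT_DOCUMENTS_DIR in the cli.py diff.
DEFAULT_DOCUMENTS_DIR = "documents"


def docs_dir_opt(sp: argparse.ArgumentParser) -> None:
    """Attach the shared --documents-dir option to one subparser."""
    sp.add_argument("--documents-dir", default=DEFAULT_DOCUMENTS_DIR,
                    help=f"Documents tree to read (default: {DEFAULT_DOCUMENTS_DIR}/).")


def build_parser() -> argparse.ArgumentParser:
    # "demo" and the subcommand names are placeholders for this sketch.
    parser = argparse.ArgumentParser(prog="demo")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("rtm", "status"):
        docs_dir_opt(sub.add_parser(name))
    return parser


parser = build_parser()
# argparse exposes --documents-dir as the attribute documents_dir.
print(parser.parse_args(["rtm"]).documents_dir)
print(parser.parse_args(["status", "--documents-dir", "elsewhere"]).documents_dir)
```

Defining the option once keeps the default and help text identical across subcommands, which is presumably why the diff introduces a helper rather than repeating `add_argument`.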

{docassert-0.1.0 → docassert-0.2.1}/docassert/__init__.py RENAMED

@@ -5,4 +5,4 @@ standard: deterministic structural checks that gate a merge, plus optional
 AI-graded semantic checks that advise.
 """
-__version__ = "0.1.0"
+__version__ = "0.2.1"

{docassert-0.1.0 → docassert-0.2.1}/docassert/_data/consistency.yaml RENAMED

@@ -35,6 +35,10 @@ coverage:
 # Advisory AI alignment: for each relation, judge whether the child genuinely
 # fulfils the parent it links to. Never blocks.
+# Each graded link costs one API call; `alignment_limit` caps calls per run
+# (0 = no cap).
+alignment_limit: 25
 alignment:
   - relation: traces
     prompt: >

{docassert-0.1.0 → docassert-0.2.1}/docassert/cli.py RENAMED

@@ -3,8 +3,10 @@
     docassert validate documents/charters/aurora.md
     docassert validate documents/**/*.md --junit out.xml --markdown comment.md
-Exit code = number of BLOCKING (structural) failures. Advisory (AI) failures
-never affect the exit code, so CI is gated only by deterministic checks.
+Exit code = number of BLOCKING (structural) failures, capped at 125 so large
+counts can't wrap around the 8-bit exit-status space (256 failures must never
+read as success). Advisory (AI) failures never affect the exit code, so CI is
+gated only by deterministic checks.
 """
 from __future__ import annotations
@@ -23,15 +25,24 @@ from .models import CheckResult
 from .semantic import run_semantic
 from .structural import run_structural
-# The user's documents live here; criteria / schema / consistency.yaml / profiles
-# resolve via `config` (local override → packaged default).
-DOCUMENTS_DIR = Path("documents")
+# Default documents location; every document-reading command accepts
+# --documents-dir to override it. Criteria / schema / consistency.yaml /
+# profiles resolve via `config` (local override → packaged default).
+DEFAULT_DOCUMENTS_DIR = "documents"
+# POSIX exit statuses are 8-bit; 126+ carry shell meanings. Cap so a failure
+# count can never wrap to 0.
+_EXIT_CAP = 125
-def _build_id_index() -> dict[str, list[str]]:
-    """Map document id -> [paths] across all documents/, for uniqueness checks."""
+def _capped(failures: int) -> int:
+    return min(failures, _EXIT_CAP)
+def _build_id_index(documents_dir: Path) -> dict[str, list[str]]:
+    """Map document id -> [paths] across the documents tree, for uniqueness checks."""
     index: dict[str, list[str]] = defaultdict(list)
-    for path in DOCUMENTS_DIR.rglob("*.md"):
+    for path in documents_dir.rglob("*.md"):
         try:
             doc = load(path)
         except ValueError:
@@ -86,7 +97,7 @@ def cmd_validate(args: argparse.Namespace) -> int:
         print("docassert: no markdown documents matched.", file=sys.stderr)
         return 0
-    id_index = _build_id_index()
+    id_index = _build_id_index(Path(args.documents_dir))
     results_by_doc: dict[str, list[CheckResult]] = {}
     for path in files:
         try:
@@ -106,12 +117,12 @@ def cmd_validate(args: argparse.Namespace) -> int:
     if args.markdown:
         Path(args.markdown).write_text(report.markdown(results_by_doc))
-    return sum(1 for rs in results_by_doc.values()
-               for r in rs if r.is_blocking_failure)
+    return _capped(sum(1 for rs in results_by_doc.values()
+                       for r in rs if r.is_blocking_failure))
 def cmd_consistency(args: argparse.Namespace) -> int:
-    results = run_consistency(DOCUMENTS_DIR, with_semantic=not args.no_semantic)
+    results = run_consistency(args.documents_dir, with_semantic=not args.no_semantic)
     results_by_doc = {"consistency (cross-document)": results}
     print(report.console(results_by_doc))
@@ -123,7 +134,7 @@ def cmd_consistency(args: argparse.Namespace) -> int:
         Path(args.markdown).write_text(
             report.markdown(results_by_doc, title="docassert consistency"))
-    return sum(1 for r in results if r.is_blocking_failure)
+    return _capped(sum(1 for r in results if r.is_blocking_failure))
 def _project_code(value: str | None) -> str | None:
@@ -132,7 +143,7 @@ def _project_code(value: str | None) -> str | None:
 def cmd_rtm(args: argparse.Namespace) -> int:
-    graph = build_graph(DOCUMENTS_DIR)
+    graph = build_graph(args.documents_dir)
     code = _project_code(args.project)
     text = rtm.render_csv(graph, code) if args.csv else rtm.render_markdown(graph, code)
     if args.out:
@@ -145,7 +156,7 @@ def cmd_rtm(args: argparse.Namespace) -> int:
 def cmd_projects(args: argparse.Namespace) -> int:
     from . import projects as proj
-    plist = proj.load_projects(DOCUMENTS_DIR)
+    plist = proj.load_projects(args.documents_dir)
     issues = proj.registry_issues(plist)
     for issue in issues:
         print(f"docassert: {issue}", file=sys.stderr)
@@ -172,7 +183,7 @@ def cmd_projects(args: argparse.Namespace) -> int:
 def cmd_status(args: argparse.Namespace) -> int:
     from . import status as status_mod
     if args.index:
-        index = status_mod.build_index(DOCUMENTS_DIR)
+        index = status_mod.build_index(args.documents_dir)
         if args.format == "json":
             text = status_mod.render_json(index)
         elif args.format == "html":
@@ -181,7 +192,7 @@ def cmd_status(args: argparse.Namespace) -> int:
             text = status_mod.render_index_markdown(index)
         tag = index["overall"]["rag"]
     else:
-        model = status_mod.build_status(DOCUMENTS_DIR, project=args.project)
+        model = status_mod.build_status(args.documents_dir, project=args.project)
         if args.project and not model["documents"]:
             print(f"docassert: no documents for project {args.project!r}", file=sys.stderr)
             return 2
@@ -206,16 +217,17 @@ def cmd_pages(args: argparse.Namespace) -> int:
     from . import status as status_mod
     out = Path(args.out)
     out.mkdir(parents=True, exist_ok=True)
+    docs_dir = args.documents_dir
-    index = status_mod.build_index(DOCUMENTS_DIR)
+    index = status_mod.build_index(docs_dir)
     (out / "index.html").write_text(status_mod.render_index_html(index))
-    plist = projects_mod.load_projects(DOCUMENTS_DIR)
+    plist = projects_mod.load_projects(docs_dir)
     for p in plist:
-        model = status_mod.build_status(DOCUMENTS_DIR, project=p["id"])
+        model = status_mod.build_status(docs_dir, project=p["id"])
         (out / f"{p['id']}.html").write_text(status_mod.render_html(model))
-    (out / "RTM.md").write_text(rtm.render_markdown(build_graph(DOCUMENTS_DIR)))
+    (out / "RTM.md").write_text(rtm.render_markdown(build_graph(docs_dir)))
     print(f"docassert: wrote {out}/ — index + {len(plist)} project page(s) + RTM.md "
           f"(portfolio: {index['overall']['rag']})")
     return 0
@@ -232,6 +244,23 @@ def cmd_init(args: argparse.Namespace) -> int:
     return 0
+def cmd_extract(args: argparse.Namespace) -> int:
+    """Extract plain text from a source document (.docx/.pdf/.md/.txt) — the
+    deterministic first step of doc-to-pmo conversion."""
+    from . import extract as extract_mod
+    try:
+        text = extract_mod.extract(args.file)
+    except (FileNotFoundError, ValueError, ImportError) as exc:
+        print(f"docassert: {exc}", file=sys.stderr)
+        return 2
+    if args.out:
+        Path(args.out).write_text(text, encoding="utf-8")
+        print(f"docassert: wrote {args.out} ({len(text)} chars)")
+    else:
+        sys.stdout.write(text)
+    return 0
 def main(argv: list[str] | None = None) -> int:
     from . import __version__
     parser = argparse.ArgumentParser(prog="docassert",
@@ -239,10 +268,15 @@ def main(argv: list[str] | None = None) -> int:
     parser.add_argument("--version", action="version", version=f"docassert {__version__}")
     sub = parser.add_subparsers(dest="command", required=True)
+    def docs_dir_opt(sp: argparse.ArgumentParser) -> None:
+        sp.add_argument("--documents-dir", default=DEFAULT_DOCUMENTS_DIR,
+                        help=f"Documents tree to read (default: {DEFAULT_DOCUMENTS_DIR}/).")
     v = sub.add_parser("validate", help="Validate documents against their criteria.")
     v.add_argument("paths", nargs="+", help="Markdown files or globs.")
     v.add_argument("--junit", help="Write a JUnit XML report to this path.")
     v.add_argument("--markdown", help="Write a PR-comment markdown report to this path.")
+    docs_dir_opt(v)
     v.set_defaults(func=cmd_validate)
     c = sub.add_parser("consistency", help="Check cross-document traceability.")
@@ -250,12 +284,14 @@ def main(argv: list[str] | None = None) -> int:
     c.add_argument("--markdown", help="Write a PR-comment markdown report to this path.")
     c.add_argument("--no-semantic", action="store_true",
                    help="Skip AI alignment (structural consistency only).")
+    docs_dir_opt(c)
     c.set_defaults(func=cmd_consistency)
     r = sub.add_parser("rtm", help="Generate the requirements traceability matrix.")
     r.add_argument("--out", help="Write to this path instead of stdout.")
     r.add_argument("--csv", action="store_true", help="Emit CSV instead of Markdown.")
     r.add_argument("--project", help="Scope to one project (PRJ-NNN-CODE id or CODE).")
+    docs_dir_opt(r)
     r.set_defaults(func=cmd_rtm)
     s = sub.add_parser("status", help="Derive a project status page from the documents.")
@@ -267,22 +303,30 @@ def main(argv: list[str] | None = None) -> int:
     s.add_argument("--index", action="store_true",
                    help="Render the multi-project portfolio index instead of one status.")
     s.add_argument("--out", help="Write to this path instead of stdout.")
+    docs_dir_opt(s)
     s.set_defaults(func=cmd_status)
     pg = sub.add_parser("pages", help="Build the full Pages site (portfolio index + a page per project).")
     pg.add_argument("--out", default="_site", help="Output directory (default: _site).")
+    docs_dir_opt(pg)
     pg.set_defaults(func=cmd_pages)
     p = sub.add_parser("projects", help="Generate the project registry from the project.md anchors.")
     p.add_argument("--out", help="Write to this path instead of stdout (e.g. projects.yaml).")
     p.add_argument("--check", action="store_true",
                    help="Exit non-zero if the registry file is stale (CI freshness gate).")
+    docs_dir_opt(p)
     p.set_defaults(func=cmd_projects)
     ini = sub.add_parser("init", help="Scaffold the default criteria/schema/profiles/templates into a repo.")
     ini.add_argument("dir", nargs="?", default=".", help="Target directory (default: current).")
     ini.set_defaults(func=cmd_init)
+    ex = sub.add_parser("extract", help="Extract plain text from a source doc (.docx/.pdf/.md/.txt) for conversion.")
+    ex.add_argument("file", help="Source document (.docx / .pdf / .md / .txt).")
+    ex.add_argument("--out", help="Write to this path instead of stdout.")
+    ex.set_defaults(func=cmd_extract)
     args = parser.parse_args(argv)
     return args.func(args)
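
The `_capped` helper in this diff exists because a parent process sees only the low 8 bits of an exit status. The standalone sketch below reimplements the clamp (it does not import docassert) and models the truncation with a hypothetical `observed` function, to show why an uncapped count of 256 would read as success.

```python
# Assumed constant, mirroring _EXIT_CAP in the diff above.
EXIT_CAP = 125


def capped(failures: int) -> int:
    """Clamp a failure count so it stays a meaningful exit status."""
    return min(failures, EXIT_CAP)


def observed(status: int) -> int:
    """Model what a POSIX parent sees: only the low 8 bits of the status."""
    return status & 0xFF


for failures in (3, 125, 256, 1000):
    # Columns: raw count, status seen without the cap, status seen with the cap.
    print(failures, observed(failures), observed(capped(failures)))
```

Without the cap, 256 failures are observed as 0. With it, any count above 125 is reported as 125, which stays below the 126+ range that the diff's comment notes carries shell meanings.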

{docassert-0.1.0 → docassert-0.2.1}/docassert/consistency.py RENAMED

@@ -130,6 +130,12 @@ def check_profile_completeness(documents_dir: str | Path = "documents") -> Check
 # ── semantic (advisory) ────────────────────────────────────────────────────
+# Each alignment edge costs one API call, so a large graph could otherwise run
+# away on cost. Cap per run; tune with `alignment_limit` in consistency.yaml
+# (0 disables the cap).
+DEFAULT_ALIGNMENT_LIMIT = 25
 def run_alignment_checks(graph, config) -> list[CheckResult]:
     edges = []  # (prompt, parent, child, relation)
     for rule in config.get("alignment", []):
@@ -142,12 +148,26 @@ def run_alignment_checks(graph, config) -> list[CheckResult]:
     if not edges:
         return []
+    limit = int(config.get("alignment_limit", DEFAULT_ALIGNMENT_LIMIT) or 0)
+    note: CheckResult | None = None
+    if limit and len(edges) > limit:
+        note = CheckResult(
+            "alignment-limit", True, False,
+            f"graded {limit} of {len(edges)} link(s) — raise `alignment_limit` "
+            f"in consistency.yaml to grade more per run",
+            kind="semantic", score=None)
+        edges = edges[:limit]
     if not os.environ.get("ANTHROPIC_API_KEY"):
         return [CheckResult("alignment", True, False,
                             f"skipped — no ANTHROPIC_API_KEY ({len(edges)} link(s) to grade)",
                             kind="semantic", score=None)]
-    return [run_alignment(f"align:{c.id}-{rel}-{p.id}", prompt, p.text, c.text)
-            for prompt, p, c, rel in edges]
+    results = [run_alignment(f"align:{c.id}-{rel}-{p.id}", prompt, p.text, c.text)
+               for prompt, p, c, rel in edges]
+    if note is not None:
+        results.append(note)
+    return results
 def run_consistency(documents_dir: str | Path = "documents",
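
The limit handling in `run_alignment_checks` can be isolated as a small pure function. This is a sketch under assumptions: `apply_limit` is a hypothetical name, edges are stand-in integers, and the note is reduced to a plain string instead of a `CheckResult`.

```python
from __future__ import annotations

# Assumed default, mirroring DEFAULT_ALIGNMENT_LIMIT in the diff above.
DEFAULT_ALIGNMENT_LIMIT = 25


def apply_limit(edges: list, config: dict) -> tuple[list, str | None]:
    """Return (edges to grade, optional note), following the diff's logic."""
    # A missing key falls back to the default; 0 or None both mean "no cap".
    limit = int(config.get("alignment_limit", DEFAULT_ALIGNMENT_LIMIT) or 0)
    if limit and len(edges) > limit:
        note = f"graded {limit} of {len(edges)} link(s)"
        return edges[:limit], note
    return edges, None


print(apply_limit(list(range(4)), {"alignment_limit": 2}))
print(apply_limit(list(range(4)), {"alignment_limit": 0}))
print(len(apply_limit(list(range(30)), {})[0]))
```

One consequence of the `or 0` fallback is that an explicit null `alignment_limit` in `consistency.yaml` disables the cap, the same as `0`, while omitting the key applies the default of 25.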

docassert-0.2.1/docassert/extract.py ADDED

@@ -0,0 +1,55 @@
+"""Extract plain text from a source document, for doc-to-pmo conversion.
+The deterministic first step of the conversion front-door: turn an arbitrary
+source file (.docx / .pdf / .md / .txt) into plain text that the doc-to-pmo
+skill then maps into a standard template. It does not interpret or reshape the
+content — that is the skill's job.
+.docx / .pdf support needs the optional `convert` extra:
+    pip install "docassert[convert]"
+"""
+from __future__ import annotations
+from pathlib import Path
+_NEED_CONVERT = 'extract needs the "convert" extra: pip install "docassert[convert]"'
+def extract(path: str | Path) -> str:
+    """Return the plain text of a source document.
+    Raises FileNotFoundError (missing file), ValueError (unsupported type), or
+    ImportError (a .docx/.pdf without the `convert` extra installed).
+    """
+    p = Path(path)
+    if not p.is_file():
+        raise FileNotFoundError(f"no such file: {p}")
+    ext = p.suffix.lower()
+    if ext in {".md", ".txt"}:
+        return p.read_text(encoding="utf-8")
+    if ext == ".docx":
+        try:
+            import docx  # python-docx
+        except ImportError as exc:
+            raise ImportError(_NEED_CONVERT) from exc
+        document = docx.Document(str(p))
+        blocks: list[str] = [para.text for para in document.paragraphs]
+        # include table cell text, which charters often use for milestones/risks
+        for table in document.tables:
+            for row in table.rows:
+                cells = [cell.text.strip() for cell in row.cells]
+                if any(cells):
+                    blocks.append(" | ".join(cells))
+        return "\n".join(blocks)
+    if ext == ".pdf":
+        try:
+            from pypdf import PdfReader
+        except ImportError as exc:
+            raise ImportError(_NEED_CONVERT) from exc
+        reader = PdfReader(str(p))
+        return "\n".join((page.extract_text() or "") for page in reader.pages)
+    raise ValueError(f"unsupported source type '{ext}' (supported: .docx, .pdf, .md, .txt)")
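
The `.docx` branch flattens each table row into one pipe-separated line and skips rows whose cells are all empty. Below is a minimal sketch of that step on plain lists of strings, so it runs without `python-docx`. The name `flatten_rows` is hypothetical, and the input stands in for `cell.text` values.

```python
def flatten_rows(rows: list[list[str]]) -> list[str]:
    """Join each row's stripped cell text with ' | ', dropping all-empty rows."""
    blocks: list[str] = []
    for row in rows:
        cells = [cell.strip() for cell in row]
        if any(cells):  # at least one non-empty cell
            blocks.append(" | ".join(cells))
    return blocks


rows = [
    ["Milestone ", "2026-09-30"],  # trailing space is stripped
    ["", "   "],                   # whitespace-only row is skipped
    ["Risk", ""],                  # partially empty row is kept
]
print("\n".join(flatten_rows(rows)))
```

A partially empty row keeps its separator, so `["Risk", ""]` becomes `Risk | ` with a trailing space. Downstream consumers of the extracted text may want to account for that.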

{docassert-0.1.0 → docassert-0.2.1/docassert.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: docassert
-Version: 0.1.0
+Version: 0.2.1
 Summary: Unit testing for business documents — validate structured Markdown docs against a configurable audit standard.
 Author: C4G Enterprises Inc.
 License: Apache-2.0
@@ -38,6 +38,10 @@ Dynamic: license-file
 # docassert
+[![PyPI](https://img.shields.io/pypi/v/docassert)](https://pypi.org/project/docassert/)
+[![Python](https://img.shields.io/pypi/pyversions/docassert)](https://pypi.org/project/docassert/)
+[![License](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE)
 **Unit testing for business documents.** Validate structured Markdown documents
 (charters, BRDs, PRDs, risk registers, …) against a configurable audit standard:
 deterministic structural checks that gate a merge, plus optional AI-graded
@@ -50,9 +54,11 @@ a vendor-neutral standard for running a PMO from version-controlled, declarative
 ## Install
 ```bash
-pip install "docassert @ git+https://github.com/c4g-john/docassert"   # PyPI release coming
+pipx install docassert          # recommended — installs the CLI in its own isolated env
+# or:
+pip install docassert
 # with the AI advisory extra:
-pip install "docassert[ai] @ git+https://github.com/c4g-john/docassert"
+pip install "docassert[ai]"
 ```
 ## Quickstart
@@ -74,13 +80,18 @@ you can customize them.
 | Command | What it does |
 |---|---|
-| `docassert validate <globs>` | Validate documents against their kind's criteria. Exit code = number of blocking failures. |
+| `docassert validate <globs>` | Validate documents against their kind's criteria. Exit code = number of blocking failures (capped at 125). |
 | `docassert consistency` | Cross-document checks: referential integrity, coverage, required links, profile completeness. |
 | `docassert rtm [--project ID]` | Requirements traceability matrix (Markdown or CSV). |
 | `docassert status [--project ID] [--index]` | Derived project status (md / json / html). |
 | `docassert pages --out DIR` | Build the portfolio site (index + a page per project). |
 | `docassert projects [--out] [--check]` | Generate / verify the project registry. |
 | `docassert init [DIR]` | Scaffold the default config into a repo. |
+| `docassert extract <file>` | Extract plain text from a source `.docx` / `.pdf` / `.md` / `.txt` (the first step of doc-to-pmo conversion). Needs the `convert` extra: `pip install "docassert[convert]"`. |
+Every document-reading command accepts `--documents-dir` (default `documents/`).
+AI alignment grades at most `alignment_limit` links per run (default 25; set it
+in `consistency.yaml`, `0` = no cap) so API cost stays bounded on large graphs.
 ## Document kinds

{docassert-0.1.0 → docassert-0.2.1}/docassert.egg-info/SOURCES.txt RENAMED

@@ -7,6 +7,7 @@ docassert/__main__.py
 docassert/cli.py
 docassert/config.py
 docassert/consistency.py
+docassert/extract.py
 docassert/graph.py
 docassert/loader.py
 docassert/models.py
@@ -89,6 +90,8 @@ docassert/_data/templates/test-cases.template.md
 docassert/_data/templates/user-story.template.md
 tests/test_config.py
 tests/test_consistency.py
+tests/test_defects.py
+tests/test_extract.py
 tests/test_graph.py
 tests/test_kinds_delivery.py
 tests/test_kinds_governance.py

docassert-0.2.1/tests/test_defects.py ADDED

@@ -0,0 +1,85 @@
+"""Tests for the 0.2.1 defect fixes: exit-code cap, --documents-dir, alignment cap."""
+from docassert import consistency as C
+from docassert.cli import _capped, main
+from docassert.graph import Graph
+from docassert.models import CheckResult, Item
+PROJECT_MD = """---
+kind: project
+id: PRJ-009-TST
+code: TST
+name: Test Project
+sponsor: jane.doe
+status: proposed
+---
+## Overview
+A test project.
+## Scope
+Everything.
+"""
+# ── exit-code cap ────────────────────────────────────────────────────────────
+def test_exit_code_capped_below_wraparound():
+    assert _capped(0) == 0
+    assert _capped(3) == 3
+    assert _capped(125) == 125
+    assert _capped(256) == 125   # would otherwise wrap to exit status 0
+    assert _capped(1000) == 125
+# ── --documents-dir ──────────────────────────────────────────────────────────
+def test_projects_reads_documents_dir_flag(tmp_path, monkeypatch, capsys):
+    docs = tmp_path / "elsewhere"
+    (docs / "PRJ-009-TST").mkdir(parents=True)
+    (docs / "PRJ-009-TST" / "project.md").write_text(PROJECT_MD, encoding="utf-8")
+    monkeypatch.chdir(tmp_path)   # cwd has no documents/ at all
+    assert main(["projects", "--documents-dir", str(docs)]) == 0
+    assert "PRJ-009-TST" in capsys.readouterr().out
+def test_status_reads_documents_dir_flag(tmp_path, monkeypatch, capsys):
+    docs = tmp_path / "elsewhere"
+    (docs / "PRJ-009-TST").mkdir(parents=True)
+    (docs / "PRJ-009-TST" / "project.md").write_text(PROJECT_MD, encoding="utf-8")
+    monkeypatch.chdir(tmp_path)
+    assert main(["status", "--documents-dir", str(docs), "--summary"]) == 0
+    assert "Derived from 1 documents" in capsys.readouterr().out
+# ── alignment call cap ───────────────────────────────────────────────────────
+def _graph_with_edges(n):
+    g = Graph()
+    g.add(Item("TST-BR-001", "TST", "BR", "parent", {}, "d.md", "k", "approved", "S"))
+    for i in range(n):
+        g.add(Item(f"TST-PR-{i:03d}", "TST", "PR", "child",
+                   {"traces": ["TST-BR-001"]}, "d.md", "k", "approved", "S"))
+    return g
+def _stub_calls(monkeypatch):
+    calls = []
+    monkeypatch.setenv("ANTHROPIC_API_KEY", "test-key")
+    monkeypatch.setattr(C, "run_alignment",
+                        lambda cid, *a: calls.append(cid) or CheckResult(
+                            cid, True, False, "ok", kind="semantic", score=1.0))
+    return calls
+def test_alignment_capped(monkeypatch):
+    calls = _stub_calls(monkeypatch)
+    cfg = {"alignment": [{"relation": "traces", "prompt": "judge"}], "alignment_limit": 2}
+    results = C.run_alignment_checks(_graph_with_edges(4), cfg)
+    assert len(calls) == 2
+    note = next(r for r in results if r.check_id == "alignment-limit")
+    assert "graded 2 of 4" in note.detail and not note.blocking
+def test_alignment_cap_disabled_with_zero(monkeypatch):
+    calls = _stub_calls(monkeypatch)
+    cfg = {"alignment": [{"relation": "traces", "prompt": "judge"}], "alignment_limit": 0}
+    results = C.run_alignment_checks(_graph_with_edges(4), cfg)
+    assert len(calls) == 4
+    assert not any(r.check_id == "alignment-limit" for r in results)

docassert-0.2.1/tests/test_extract.py ADDED

@@ -0,0 +1,65 @@
+"""Tests for the extract module and the `docassert extract` command."""
+import pytest
+from docassert import extract as E
+from docassert.cli import main
+# ── the extract() function ──────────────────────────────────────────────────
+def test_extract_md(tmp_path):
+    f = tmp_path / "s.md"
+    f.write_text("# Hello\nworld", encoding="utf-8")
+    assert E.extract(f) == "# Hello\nworld"
+def test_extract_txt(tmp_path):
+    f = tmp_path / "s.txt"
+    f.write_text("plain text", encoding="utf-8")
+    assert E.extract(str(f)) == "plain text"
+def test_missing_file_raises(tmp_path):
+    with pytest.raises(FileNotFoundError):
+        E.extract(tmp_path / "nope.md")
+def test_unsupported_type_raises(tmp_path):
+    f = tmp_path / "s.rtf"
+    f.write_text("x", encoding="utf-8")
+    with pytest.raises(ValueError):
+        E.extract(f)
+def test_extract_docx_paragraphs_and_tables(tmp_path):
+    docx = pytest.importorskip("docx")  # needs the 'convert' extra
+    d = docx.Document()
+    d.add_paragraph("First para.")
+    table = d.add_table(rows=1, cols=2)
+    table.rows[0].cells[0].text = "Milestone"
+    table.rows[0].cells[1].text = "2026-09-30"
+    path = tmp_path / "s.docx"
+    d.save(str(path))
+    text = E.extract(path)
+    assert "First para." in text
+    assert "Milestone | 2026-09-30" in text  # table cells joined
+# ── the CLI command ─────────────────────────────────────────────────────────
+def test_cli_extract_stdout(tmp_path, capsys):
+    f = tmp_path / "s.md"
+    f.write_text("hello cli", encoding="utf-8")
+    assert main(["extract", str(f)]) == 0
+    assert "hello cli" in capsys.readouterr().out
+def test_cli_extract_out_file(tmp_path):
+    src = tmp_path / "s.txt"
+    src.write_text("abc", encoding="utf-8")
+    out = tmp_path / "out.txt"
+    assert main(["extract", str(src), "--out", str(out)]) == 0
+    assert out.read_text() == "abc"
+def test_cli_extract_missing_returns_2(tmp_path, capsys):
+    assert main(["extract", str(tmp_path / "nope.md")]) == 2
+    assert "no such file" in capsys.readouterr().err