delimit-cli 4.1.50 → 4.1.52

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,5 +1,36 @@
  # Changelog
 
+ ## [4.1.52] - 2026-04-10
+
+ ### Fixed (exit shim reporting zeros)
+ - **Git commit count always zero** — `git log --after="$SESSION_START"` was passing a raw epoch integer. Git's `--after` needs an `@` prefix for epoch time (`--after="@$SESSION_START"`).
+ - **Ledger item count always zero** — the awk script matched any line with a `created_at` field but never compared the timestamp against the session start. Now converts `SESSION_START` to ISO format and uses string comparison to count only items created during the session.
+ - **Deliberation count always zero** — looked for a `deliberations.jsonl` file that doesn't exist. Deliberations are stored as individual JSON files in `~/.delimit/deliberations/`. Now uses `find -newermt "@$SESSION_START"` to count files created during the session.
+
+ ### Tests
+ - 134/134 npm CLI tests passing (no test changes — shell template fix only).
+
+ ## [4.1.51] - 2026-04-09
+
+ ### Fixed (gateway loop engine — LED-814)
+ - **`ai/loop_engine.run_governed_iteration` mishandled swarm dispatch statuses.** Only `status=='completed'` was treated as success. The swarm dispatcher returns `'dispatched'` for async handoff, so every build-loop tick fell into the failure branch and logged "Dispatch failed" even though the underlying work shipped. Session `build-loop-2026-04-09` accumulated 6 spurious failures (LED-787 / 788 / 755 / 762 / 799 / 807) for tasks that all actually shipped. Now:
+   - `'completed'` → close ledger + notify deploy loop (unchanged)
+   - `'dispatched'` → mark ledger `in_progress` with the swarm `task_id`, NOT a failure
+   - `'blocked'` → record a founder-approval gate without tripping the circuit breaker
+   - anything else → genuine failure; the error message includes the unexpected status string for debuggability
+ - Verified live against the running MCP session before this release: `iterations 6→7`, `errors 0`, `LED-814` recorded as `dispatched` with `swarm_task_id task-449ecdf9`.
+ - Picked up via the standard `npm run sync-gateway` step in `prepublishOnly` (gateway commit `ce802cd` is now on `delimit-ai/delimit-gateway` main).
+
+ ### Added
+ - **`tests/test_loop_engine_dispatch_status.py`** in the gateway — covers all four dispatch status branches (`completed` / `dispatched` / `blocked` / unknown), 154 lines, ships with the bundled gateway.
+
+ ### Scope
+ - Single-purpose patch: gateway loop engine only. This is the deferred half of the multi-model deliberation that produced 4.1.50 — the deliberation explicitly required splitting the gateway fix from the CLAUDE.md regex fix so each ship has a clean rollback story.
+
+ ### Tests
+ - npm CLI: 134/134 still passing (no CLI changes — bundled gateway only).
+ - Gateway: new `test_loop_engine_dispatch_status.py` suite passing.
+
  ## [4.1.50] - 2026-04-09
 
  ### Fixed (CRITICAL — CLAUDE.md in-prose marker clobber)
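The 4.1.52 entries above all trace to one timestamp-format mismatch: an epoch integer was handed to tools that expect either an `@`-prefixed epoch or an ISO string. A minimal standalone sketch of the corrected invocations (not the shipped shim; `SESSION_START=0` is a sample value chosen so the output is deterministic):

```shell
# Sample session start: the Unix epoch itself.
SESSION_START=0

# git's --after needs the "@" prefix to treat the value as epoch seconds;
# a bare integer is otherwise parsed as a (usually bogus) calendar date.
# COMMITS=$(git log --oneline --after="@$SESSION_START" --format="%H" | wc -l)

# Convert the epoch to an ISO prefix (GNU `date -d`, then BSD `date -r` fallback).
SESSION_ISO=$(date -u -d "@$SESSION_START" +%Y-%m-%dT%H:%M:%S 2>/dev/null \
  || date -u -r "$SESSION_START" +%Y-%m-%dT%H:%M:%S)
echo "$SESSION_ISO"   # 1970-01-01T00:00:00

# GNU find's -newermt also understands "@<epoch>", which is what the
# deliberation counter relies on:
# find ~/.delimit/deliberations -maxdepth 1 -name '*.json' -newermt "@$SESSION_START" | wc -l
```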
@@ -780,24 +780,24 @@ delimit_exit_screen() {
    else
      DURATION="\${ELAPSED}s"
    fi
-   # Count git commits made during session
+   # Count git commits made during session (@ prefix tells git the value is epoch)
    COMMITS=0
    if [ -d "\$SESSION_CWD/.git" ] || git -C "\$SESSION_CWD" rev-parse --git-dir >/dev/null 2>&1; then
-     COMMITS=\$(git -C "\$SESSION_CWD" log --oneline --after="\$SESSION_START" --format="%H" 2>/dev/null | wc -l | tr -d ' ')
+     COMMITS=\$(git -C "\$SESSION_CWD" log --oneline --after="@\$SESSION_START" --format="%H" 2>/dev/null | wc -l | tr -d ' ')
    fi
    # Count ledger items created during session (by timestamp)
    LEDGER_DIR="\$DELIMIT_HOME/ledger"
    LEDGER_ITEMS=0
-   if [ -d "\$LEDGER_DIR" ]; then
+   # Convert epoch SESSION_START to ISO prefix for string comparison
+   SESSION_ISO=\$(date -u -d "@\$SESSION_START" +%Y-%m-%dT%H:%M:%S 2>/dev/null || date -u -r "\$SESSION_START" +%Y-%m-%dT%H:%M:%S 2>/dev/null || echo "")
+   if [ -d "\$LEDGER_DIR" ] && [ -n "\$SESSION_ISO" ]; then
      for lf in "\$LEDGER_DIR"/*.jsonl; do
        [ -f "\$lf" ] || continue
-       COUNT=\$(awk -v start="\$SESSION_START" '
+       COUNT=\$(awk -v start="\$SESSION_ISO" '
          BEGIN { n=0 }
          {
-           if (match(\$0, /"(created_at|ts)":"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}/)) {
-             n++
-           } else if (match(\$0, /"(created_at|ts)":([0-9]+)/, arr)) {
-             if (arr[2]+0 >= start+0) n++
+           if (match(\$0, /"created_at":"([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2})"/, arr)) {
+             if (arr[1] >= start) n++
            }
          }
          END { print n }
@@ -805,14 +805,11 @@ delimit_exit_screen() {
        LEDGER_ITEMS=\$((LEDGER_ITEMS + COUNT))
      done
    fi
-   # Count deliberations (governance decisions)
+   # Count deliberations created during this session (stored as individual JSON files)
    DELIBERATIONS=0
-   if [ -f "\$DELIMIT_HOME/deliberations.jsonl" ]; then
-     DELIBERATIONS=\$(awk -v start="\$SESSION_START" '
-       BEGIN { n=0 }
-       { if (match(\$0, /"ts":([0-9]+)/, arr)) { if (arr[1]+0 >= start+0) n++ } }
-       END { print n }
-     ' "\$DELIMIT_HOME/deliberations.jsonl" 2>/dev/null || echo "0")
+   DELIB_DIR="\$DELIMIT_HOME/deliberations"
+   if [ -d "\$DELIB_DIR" ]; then
+     DELIBERATIONS=\$(find "\$DELIB_DIR" -maxdepth 1 -name '*.json' -newermt "@\$SESSION_START" 2>/dev/null | wc -l | tr -d ' ')
    fi
    # Determine exit status label
    if [ "\$_EXIT_CODE" -eq 0 ]; then
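The ledger counter works because ISO-8601 timestamps sort lexically in chronological order, so a plain awk string comparison suffices. A portable sketch (POSIX `match()` plus `substr()` here, since the shipped template's three-argument `match(..., arr)` is a gawk extension; the JSONL lines are made-up samples):

```shell
START="2026-04-10T00:00:00"
COUNT=$(printf '%s\n' \
  '{"id":"LED-1","created_at":"2026-04-09T23:59:00"}' \
  '{"id":"LED-2","created_at":"2026-04-10T08:15:00"}' \
| awk -v start="$START" '
    # The prefix "created_at":" is 14 chars; the ISO timestamp after it is 19.
    match($0, /"created_at":"/) {
      ts = substr($0, RSTART + 14, 19)
      if (ts >= start) n++    # lexical compare == chronological for ISO-8601
    }
    END { print n + 0 }')
echo "$COUNT"   # 1
```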
@@ -23,10 +23,11 @@ if str(GATEWAY_ROOT) not in sys.path:
 
 
  def _load_specs(spec_path: str) -> Dict[str, Any]:
-     """Load an OpenAPI spec from a file path.
+     """Load an API spec (OpenAPI or JSON Schema) from a file path.
 
      Performs a non-fatal version compatibility check (LED-290) so that
      unknown OpenAPI versions log a warning instead of silently parsing.
+     JSON Schema documents skip the OpenAPI version assert.
      """
      import yaml
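The skip described in the docstring reduces to a key check before the assert. A standalone sketch of the pattern (the inner `assert_supported` is a stand-in for `core.openapi_version.assert_supported`, whose body is not shown in this diff):

```python
def check_version_nonfatal(spec):
    """Non-fatal version check: only docs that declare an OpenAPI/Swagger
    version key are checked; bare JSON Schema files pass straight through."""
    def assert_supported(doc):  # stand-in for core.openapi_version.assert_supported
        version = str(doc.get("openapi") or doc.get("swagger"))
        assert version.startswith(("2.", "3.")), f"unsupported version: {version}"

    try:
        if isinstance(spec, dict) and ("openapi" in spec or "swagger" in spec):
            assert_supported(spec)
    except Exception:
        pass  # the real loader logs at debug level and continues
    return spec

# A JSON Schema doc is never checked; an OpenAPI doc is, non-fatally.
check_version_nonfatal({"$schema": "https://json-schema.org/draft/2020-12/schema"})
check_version_nonfatal({"openapi": "3.1.0", "paths": {}})
```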
 
@@ -41,15 +42,146 @@ def _load_specs(spec_path: str) -> Dict[str, Any]:
          spec = json.loads(content)
 
      # LED-290: warn (non-fatal) if version is outside the validated set.
+     # Only applies to OpenAPI/Swagger documents — bare JSON Schema files
+     # have no "openapi"/"swagger" key and would otherwise trip the assert.
      try:
-         from core.openapi_version import assert_supported
-         assert_supported(spec, strict=False)
+         if isinstance(spec, dict) and ("openapi" in spec or "swagger" in spec):
+             from core.openapi_version import assert_supported
+             assert_supported(spec, strict=False)
      except Exception as exc:  # pragma: no cover -- defensive only
          logger.debug("openapi version check skipped: %s", exc)
 
      return spec
 
 
+ # ---------------------------------------------------------------------------
+ # LED-713: JSON Schema spec-type dispatch helpers
+ # ---------------------------------------------------------------------------
+
+
+ def _spec_type(doc: Any) -> str:
+     """Classify a loaded spec doc. 'openapi' or 'json_schema'."""
+     from core.spec_detector import detect_spec_type
+     t = detect_spec_type(doc)
+     # Fallback to openapi for unknown so we never break existing flows.
+     return "json_schema" if t == "json_schema" else "openapi"
+
+
+ def _json_schema_changes_to_dicts(changes: List[Any]) -> List[Dict[str, Any]]:
+     return [
+         {
+             "type": c.type.value,
+             "path": c.path,
+             "message": c.message,
+             "is_breaking": c.is_breaking,
+             "details": c.details,
+         }
+         for c in changes
+     ]
+
+
+ def _json_schema_semver(changes: List[Any]) -> Dict[str, Any]:
+     """Build an OpenAPI-compatible semver result from JSON Schema changes.
+
+     Mirrors core.semver_classifier.classify_detailed shape so downstream
+     consumers (PR comment, CI formatter, ledger) don't need to branch.
+     """
+     breaking = [c for c in changes if c.is_breaking]
+     non_breaking = [c for c in changes if not c.is_breaking]
+     if breaking:
+         bump = "major"
+     elif non_breaking:
+         bump = "minor"
+     else:
+         bump = "none"
+     return {
+         "bump": bump,
+         "is_breaking": bool(breaking),
+         "counts": {
+             "breaking": len(breaking),
+             "non_breaking": len(non_breaking),
+             "total": len(changes),
+         },
+     }
+
+
+ def _bump_semver_version(current: str, bump: str) -> Optional[str]:
+     """Minimal semver bump for JSON Schema path (core.semver_classifier
+     only understands OpenAPI ChangeType enums)."""
+     if not current:
+         return None
+     try:
+         parts = current.lstrip("v").split(".")
+         major, minor, patch = (int(parts[0]), int(parts[1]), int(parts[2]))
+     except Exception:
+         return None
+     if bump == "major":
+         return f"{major + 1}.0.0"
+     if bump == "minor":
+         return f"{major}.{minor + 1}.0"
+     if bump == "patch":
+         return f"{major}.{minor}.{patch + 1}"
+     return current
+
+
+ def _run_json_schema_lint(
+     old_doc: Dict[str, Any],
+     new_doc: Dict[str, Any],
+     current_version: Optional[str] = None,
+     api_name: Optional[str] = None,
+ ) -> Dict[str, Any]:
+     """Build an evaluate_with_policy-compatible result for JSON Schema.
+
+     Policy rules in Delimit are defined against OpenAPI ChangeType values,
+     so they do not apply here. We return zero violations and rely on the
+     breaking-change count + semver bump to drive the governance gate.
+     """
+     from core.json_schema_diff import JSONSchemaDiffEngine
+
+     engine = JSONSchemaDiffEngine()
+     changes = engine.compare(old_doc, new_doc)
+     semver = _json_schema_semver(changes)
+
+     if current_version:
+         semver["current_version"] = current_version
+         semver["next_version"] = _bump_semver_version(current_version, semver["bump"])
+
+     breaking_count = semver["counts"]["breaking"]
+     total = semver["counts"]["total"]
+
+     decision = "pass"
+     exit_code = 0
+     # No policy rules apply to JSON Schema, but breaking changes still
+     # flag MAJOR semver and the downstream gate uses that to block.
+     # Mirror the shape of evaluate_with_policy so the action/CLI renderers
+     # need no JSON Schema-specific branch.
+     result: Dict[str, Any] = {
+         "spec_type": "json_schema",
+         "api_name": api_name or new_doc.get("title") or old_doc.get("title") or "JSON Schema",
+         "decision": decision,
+         "exit_code": exit_code,
+         "violations": [],
+         "summary": {
+             "total_changes": total,
+             "breaking_changes": breaking_count,
+             "violations": 0,
+             "errors": 0,
+             "warnings": 0,
+         },
+         "all_changes": [
+             {
+                 "type": c.type.value,
+                 "path": c.path,
+                 "message": c.message,
+                 "is_breaking": c.is_breaking,
+             }
+             for c in changes
+         ],
+         "semver": semver,
+     }
+     return result
+
+
  def _read_jsonl(path: Path) -> List[Dict[str, Any]]:
      """Read JSONL entries from a file, skipping malformed lines."""
      items: List[Dict[str, Any]] = []
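The bump rule in `_json_schema_semver` is small enough to restate standalone: any breaking change forces a major bump, any other change is minor, and no changes means none. A sketch of that rule and the corresponding version bump (the `Change` dataclass is a hypothetical stand-in for the engine's change objects, not the gateway's class):

```python
from dataclasses import dataclass

@dataclass
class Change:  # hypothetical stand-in for JSONSchemaChange
    path: str
    is_breaking: bool

def classify_bump(changes):
    # Same rule as _json_schema_semver: breaking -> major, else minor, else none.
    if any(c.is_breaking for c in changes):
        return "major"
    return "minor" if changes else "none"

def bump_version(current, bump):
    major, minor, patch = (int(p) for p in current.lstrip("v").split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    return current

changes = [Change("properties.email", is_breaking=False),
           Change("required", is_breaking=True)]
print(classify_bump(changes))                          # major
print(bump_version("4.1.52", classify_bump(changes)))  # 5.0.0
```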
@@ -115,29 +247,51 @@ def run_lint(old_spec: str, new_spec: str, policy_file: Optional[str] = None) ->
      """Run the full lint pipeline: diff + policy evaluation.
 
      This is the Tier 1 primary tool — combines diff detection with
-     policy enforcement into a single pass/fail decision.
+     policy enforcement into a single pass/fail decision. Auto-detects
+     spec type (OpenAPI vs JSON Schema, LED-713) and dispatches to the
+     matching engine.
      """
      from core.policy_engine import evaluate_with_policy
 
      old = _load_specs(old_spec)
      new = _load_specs(new_spec)
 
+     # LED-713: JSON Schema dispatch. Policy rules are OpenAPI-specific,
+     # so JSON Schema takes the no-policy (breaking-count + semver) path.
+     if _spec_type(new) == "json_schema" or _spec_type(old) == "json_schema":
+         return _run_json_schema_lint(old, new)
+
      return evaluate_with_policy(old, new, policy_file)
 
 
  def run_diff(old_spec: str, new_spec: str) -> Dict[str, Any]:
-     """Run diff engine only — no policy evaluation."""
-     from core.diff_engine_v2 import OpenAPIDiffEngine
+     """Run diff engine only — no policy evaluation.
 
+     Auto-detects OpenAPI vs JSON Schema and dispatches (LED-713).
+     """
      old = _load_specs(old_spec)
      new = _load_specs(new_spec)
 
+     if _spec_type(new) == "json_schema" or _spec_type(old) == "json_schema":
+         from core.json_schema_diff import JSONSchemaDiffEngine
+         engine = JSONSchemaDiffEngine()
+         changes = engine.compare(old, new)
+         breaking = [c for c in changes if c.is_breaking]
+         return {
+             "spec_type": "json_schema",
+             "total_changes": len(changes),
+             "breaking_changes": len(breaking),
+             "changes": _json_schema_changes_to_dicts(changes),
+         }
+
+     from core.diff_engine_v2 import OpenAPIDiffEngine
      engine = OpenAPIDiffEngine()
      changes = engine.compare(old, new)
 
      breaking = [c for c in changes if c.is_breaking]
 
      return {
+         "spec_type": "openapi",
          "total_changes": len(changes),
          "breaking_changes": len(breaking),
          "changes": [
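Every dispatch site above leans on `_spec_type`, which delegates to `core.spec_detector.detect_spec_type`; that detector's internals are not shown in this diff. A plausible heuristic sketch (an assumption, not the shipped detector): OpenAPI and Swagger documents declare a version key, while JSON Schema documents carry `$schema` or bare schema keywords.

```python
def detect_spec_type(doc):
    # Hypothetical heuristic, standing in for core.spec_detector.detect_spec_type.
    if not isinstance(doc, dict):
        return "unknown"
    if "openapi" in doc or "swagger" in doc:
        return "openapi"
    if "$schema" in doc or "properties" in doc or "type" in doc:
        return "json_schema"
    return "unknown"

print(detect_spec_type({"openapi": "3.0.3", "paths": {}}))  # openapi
print(detect_spec_type({"$schema": "https://json-schema.org/draft/2020-12/schema",
                        "type": "object"}))                 # json_schema
```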
@@ -164,13 +318,20 @@ def run_changelog(
      Uses the diff engine to detect changes, then formats them into
      a human-readable changelog grouped by category.
      """
-     from core.diff_engine_v2 import OpenAPIDiffEngine
      from datetime import datetime, timezone
 
      old = _load_specs(old_spec)
      new = _load_specs(new_spec)
 
-     engine = OpenAPIDiffEngine()
+     # LED-713: dispatch on spec type. JSONSchemaChange / Change share the
+     # (.type.value, .path, .message, .is_breaking) duck type.
+     if _spec_type(new) == "json_schema" or _spec_type(old) == "json_schema":
+         from core.json_schema_diff import JSONSchemaDiffEngine
+         engine = JSONSchemaDiffEngine()
+     else:
+         from core.diff_engine_v2 import OpenAPIDiffEngine
+         engine = OpenAPIDiffEngine()
+
      changes = engine.compare(old, new)
 
      # Categorize changes
@@ -808,14 +969,26 @@ def run_semver(
      """Classify the semver bump for a spec change.
 
      Returns detailed breakdown: bump level, per-category counts,
-     and optionally the bumped version string.
+     and optionally the bumped version string. Auto-detects OpenAPI vs
+     JSON Schema (LED-713).
      """
-     from core.diff_engine_v2 import OpenAPIDiffEngine
-     from core.semver_classifier import classify_detailed, bump_version, classify
-
      old = _load_specs(old_spec)
      new = _load_specs(new_spec)
 
+     # LED-713: JSON Schema path
+     if _spec_type(new) == "json_schema" or _spec_type(old) == "json_schema":
+         from core.json_schema_diff import JSONSchemaDiffEngine
+         engine = JSONSchemaDiffEngine()
+         changes = engine.compare(old, new)
+         result = _json_schema_semver(changes)
+         if current_version:
+             result["current_version"] = current_version
+             result["next_version"] = _bump_semver_version(current_version, result["bump"])
+         return result
+
+     from core.diff_engine_v2 import OpenAPIDiffEngine
+     from core.semver_classifier import classify_detailed, bump_version, classify
+
      engine = OpenAPIDiffEngine()
      changes = engine.compare(old, new)
      result = classify_detailed(changes)
@@ -946,7 +1119,6 @@ def run_diff_report(
      """
      from datetime import datetime, timezone
 
-     from core.diff_engine_v2 import OpenAPIDiffEngine
      from core.policy_engine import PolicyEngine
      from core.semver_classifier import classify_detailed, classify
      from core.spec_health import score_spec
@@ -955,6 +1127,43 @@ def run_diff_report(
      old = _load_specs(old_spec)
      new = _load_specs(new_spec)
 
+     # LED-713: JSON Schema dispatch — short-circuit to a minimal report
+     # shape compatible with the JSON renderer (HTML renderer remains
+     # OpenAPI-only; JSON Schema callers should use fmt="json").
+     if _spec_type(new) == "json_schema" or _spec_type(old) == "json_schema":
+         from core.json_schema_diff import JSONSchemaDiffEngine
+         js_engine = JSONSchemaDiffEngine()
+         js_changes = js_engine.compare(old, new)
+         js_breaking = [c for c in js_changes if c.is_breaking]
+         js_semver = _json_schema_semver(js_changes)
+         now_js = datetime.now(timezone.utc)
+         return {
+             "format": fmt,
+             "spec_type": "json_schema",
+             "generated_at": now_js.isoformat(),
+             "old_spec": old_spec,
+             "new_spec": new_spec,
+             "old_title": old.get("title", "") if isinstance(old, dict) else "",
+             "new_title": new.get("title", "") if isinstance(new, dict) else "",
+             "semver": js_semver,
+             "changes": _json_schema_changes_to_dicts(js_changes),
+             "breaking_count": len(js_breaking),
+             "non_breaking_count": len(js_changes) - len(js_breaking),
+             "total_changes": len(js_changes),
+             "policy": {
+                 "decision": "pass",
+                 "violations": [],
+                 "errors": 0,
+                 "warnings": 0,
+             },
+             "health": None,
+             "migration": "",
+             "output_file": output_file,
+             "note": "JSON Schema report (policy rules and HTML report are OpenAPI-only in v1)",
+         }
+
+     from core.diff_engine_v2 import OpenAPIDiffEngine
+
      # -- Diff --
      engine = OpenAPIDiffEngine()
      changes = engine.compare(old, new)
@@ -158,21 +158,80 @@ def config_audit(target: str = ".", options: Optional[Dict] = None) -> Dict[str,
  # ─── EvidencePack ───────────────────────────────────────────────────────
 
  def evidence_collect(target: str = ".", options: Optional[Dict] = None) -> Dict[str, Any]:
-     """Collect project evidence: git log, test files, configs, governance data."""
-     import subprocess, time as _time
-     root = Path(target).resolve()
-     evidence: Dict[str, Any] = {"collected_at": _time.time(), "target": str(root)}
-     # Git log
-     try:
-         r = subprocess.run(["git", "-C", str(root), "log", "--oneline", "-10"], capture_output=True, text=True, timeout=10)
-         evidence["git_log"] = r.stdout.strip().splitlines() if r.returncode == 0 else []
-     except Exception:
+     """Collect project evidence: git log, test files, configs, governance data.
+
+     Accepts either a local filesystem path (repo directory) or a remote
+     reference (GitHub URL, owner/repo#N, or any non-filesystem string).
+     Remote targets skip the filesystem walk and store reference metadata.
+     """
+     import re
+     import subprocess
+     import time as _time
+
+     opts = options or {}
+     evidence_type = opts.get("evidence_type", "")
+
+     # Detect non-filesystem targets: URLs, owner/repo#N, bare issue refs, etc.
+     is_remote = (
+         "://" in target
+         or target.startswith("http")
+         or re.match(r"^[\w.-]+/[\w.-]+#\d+$", target) is not None
+         or "#" in target
+     )
+
+     evidence: Dict[str, Any] = {"collected_at": _time.time(), "target": target}
+     if evidence_type:
+         evidence["evidence_type"] = evidence_type
+
+     if is_remote:
+         # Remote/reference target — no filesystem walk, just record metadata.
+         evidence["target_type"] = "remote"
          evidence["git_log"] = []
-     # Test files
-     test_dirs = [d for d in ["tests", "test", "__tests__", "spec"] if (root / d).exists()]
-     evidence["test_directories"] = test_dirs
-     # Configs
-     evidence["configs"] = [f.name for f in root.iterdir() if f.is_file() and (f.suffix in [".json", ".yaml", ".yml", ".toml"] or f.name.startswith("."))]
+         evidence["test_directories"] = []
+         evidence["configs"] = []
+         m = re.match(r"^([\w.-]+)/([\w.-]+)#(\d+)$", target)
+         if m:
+             evidence["repo"] = f"{m.group(1)}/{m.group(2)}"
+             evidence["issue_number"] = int(m.group(3))
+     else:
+         root = Path(target).resolve()
+         evidence["target"] = str(root)
+         evidence["target_type"] = "local"
+
+         if not root.exists():
+             return {
+                 "tool": "evidence.collect",
+                 "status": "error",
+                 "error": "target_not_found",
+                 "message": f"Path {root} does not exist. For remote targets, pass a URL or owner/repo#N.",
+                 "target": target,
+             }
+
+         # Git log (safe for non-git dirs)
+         try:
+             r = subprocess.run(
+                 ["git", "-C", str(root), "log", "--oneline", "-10"],
+                 capture_output=True, text=True, timeout=10,
+             )
+             evidence["git_log"] = r.stdout.strip().splitlines() if r.returncode == 0 else []
+         except Exception:
+             evidence["git_log"] = []
+
+         # Test dirs + configs (only if target is a directory)
+         if root.is_dir():
+             test_dirs = [d for d in ["tests", "test", "__tests__", "spec"] if (root / d).exists()]
+             evidence["test_directories"] = test_dirs
+             try:
+                 evidence["configs"] = [
+                     f.name for f in root.iterdir()
+                     if f.is_file() and (f.suffix in [".json", ".yaml", ".yml", ".toml"] or f.name.startswith("."))
+                 ]
+             except (PermissionError, OSError):
+                 evidence["configs"] = []
+         else:
+             evidence["test_directories"] = []
+             evidence["configs"] = []
+
      # Save bundle
      ev_dir = Path(os.environ.get("DELIMIT_HOME", str(Path.home() / ".delimit"))) / "evidence"
      ev_dir.mkdir(parents=True, exist_ok=True)
@@ -180,8 +239,13 @@ def evidence_collect(target: str = ".", options: Optional[Dict] = None) -> Dict[
      bundle_path = ev_dir / f"{bundle_id}.json"
      evidence["bundle_id"] = bundle_id
      bundle_path.write_text(json.dumps(evidence, indent=2))
-     return {"tool": "evidence.collect", "status": "ok", "bundle_id": bundle_id,
-             "bundle_path": str(bundle_path), "summary": {k: len(v) if isinstance(v, list) else v for k, v in evidence.items()}}
+     return {
+         "tool": "evidence.collect",
+         "status": "ok",
+         "bundle_id": bundle_id,
+         "bundle_path": str(bundle_path),
+         "summary": {k: len(v) if isinstance(v, list) else v for k, v in evidence.items()},
+     }
 
 
  def evidence_verify(bundle_id: Optional[str] = None, bundle_path: Optional[str] = None, options: Optional[Dict] = None) -> Dict[str, Any]:
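The remote-target detection above hinges on one regex for `owner/repo#N` references. In isolation, with the same pattern the diff adds:

```python
import re

REMOTE_REF = re.compile(r"^([\w.-]+)/([\w.-]+)#(\d+)$")

def parse_remote_ref(target):
    # Returns (repo, issue_number) for owner/repo#N targets, else None.
    m = REMOTE_REF.match(target)
    if not m:
        return None
    return f"{m.group(1)}/{m.group(2)}", int(m.group(3))

print(parse_remote_ref("delimit-ai/delimit-gateway#42"))  # ('delimit-ai/delimit-gateway', 42)
print(parse_remote_ref("./local/path"))                   # None
```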
@@ -56,6 +56,10 @@ _CREDENTIAL_FALSE_POSITIVES = re.compile(
      r"change[_-]?me|TODO|FIXME|xxx+|\.{4,}|"
      r"\$\{|%\(|None|null|undefined|"
      r"test[_-]?(?:password|secret|token|key)|"
+     # Test fixture patterns — fake keys like hosted-key-1, user-key-2, sk-test, gem-test
+     r"hosted[_-]key[_-]?\d*|user[_-]key[_-]?\d*|"
+     r"(?:codex|gem|grok)[_-]test|sk[_-]test|"
+     r"bad[:\-]token|fake[_-]?(?:key|token|secret)|"
      # Demo/sample literal values used in docs, recordings, fixtures
      r"sk-ant-demo|sk-demo|AIza-demo|xai-demo|demo[_-]?(?:key|secret|token)|"
      r"-demo['\"]|"
@@ -63,7 +67,9 @@ _CREDENTIAL_FALSE_POSITIVES = re.compile(
      r"json\.loads|\.read_text\(|\.slice\(|"
      r"tokens\.get\(|token\s*=\s*_make_token|"
      # RHS that is a parameter reference like token=tokens.get("access_token"...
-     r"=\s*tokens\.get\()",
+     r"=\s*tokens\.get\(|"
+     # Dict index dereference: token_data["token"], result["secret"], etc.
+     r"_data\[|_result\[)",
      re.IGNORECASE,
  )
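The full `_CREDENTIAL_FALSE_POSITIVES` pattern has more alternations than this diff shows; an excerpt limited to the branches added here illustrates the intent (the sample strings are made up):

```python
import re

# Excerpt of the newly added alternations only — not the full shipped pattern.
fixture_fp = re.compile(
    r"hosted[_-]key[_-]?\d*|user[_-]key[_-]?\d*|"
    r"(?:codex|gem|grok)[_-]test|sk[_-]test|"
    r"bad[:\-]token|fake[_-]?(?:key|token|secret)|"
    r"_data\[|_result\[",
    re.IGNORECASE,
)

samples = [
    'api_key = "hosted-key-1"',      # test fixture key -> suppressed
    'token = "sk-test"',             # provider-style test token -> suppressed
    'secret = token_data["token"]',  # dict dereference, not a literal -> suppressed
    'password = load_secret()',      # no fixture pattern -> NOT suppressed
]
print([bool(fixture_fp.search(s)) for s in samples])  # [True, True, True, False]
```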