elliot-stack 1.0.29 → 1.0.33

Files changed (128)
  1. package/LICENSE +21 -21
  2. package/README.md +5 -0
  3. package/bin/install.cjs +981 -950
  4. package/hooks/repo-search-nudge.js +32 -32
  5. package/package.json +1 -1
  6. package/skills/estack-active-learning-tutor/SKILL.md +339 -339
  7. package/skills/estack-better-title/SKILL.md +64 -64
  8. package/skills/estack-better-title/scripts/rename.sh +55 -55
  9. package/skills/estack-chris-voss/SKILL.md +80 -80
  10. package/skills/estack-chris-voss/references/elliot-notes.md +120 -120
  11. package/skills/estack-chris-voss/references/voss-principles.md +210 -210
  12. package/skills/estack-customer-discovery/SKILL.md +60 -60
  13. package/skills/estack-flight-planner/SKILL.md +332 -332
  14. package/skills/estack-flight-planner/references/config_schema.md +156 -156
  15. package/skills/estack-flight-planner/references/flight_history_schema.md +97 -97
  16. package/skills/estack-flight-planner/references/shuttle_schedules.md +98 -98
  17. package/skills/estack-flight-planner/scripts/check_setup.sh +89 -89
  18. package/skills/estack-flight-planner/scripts/fetch_flights.py +99 -99
  19. package/skills/estack-flight-planner/scripts/filter_flights.py +265 -265
  20. package/skills/estack-flight-planner/scripts/pair_shuttles.py +173 -173
  21. package/skills/estack-github-issue-tracker/SKILL.md +322 -322
  22. package/skills/estack-github-issue-tracker/bin/tracker-tools.cjs +1358 -1358
  23. package/skills/estack-github-issue-tracker/references/gh-cli-patterns.md +124 -124
  24. package/skills/estack-github-issue-tracker/references/result-file-schema.md +156 -156
  25. package/skills/estack-github-issue-tracker/references/tracker-schema.md +96 -96
  26. package/skills/estack-github-issue-tracker/tracker-template.md +58 -58
  27. package/skills/estack-leadership-coach/SKILL.md +235 -0
  28. package/skills/estack-leadership-coach/adding-references.md +280 -0
  29. package/skills/estack-leadership-coach/frameworks/delegation/flows/post-mortem.md +120 -0
  30. package/skills/estack-leadership-coach/frameworks/delegation/flows/pre-delegation.md +138 -0
  31. package/skills/estack-leadership-coach/frameworks/delegation/phases/1-intake.md +145 -0
  32. package/skills/estack-leadership-coach/frameworks/delegation/phases/2-trm-assessment.md +119 -0
  33. package/skills/estack-leadership-coach/frameworks/delegation/phases/3-enrollment.md +132 -0
  34. package/skills/estack-leadership-coach/frameworks/delegation/phases/4-build-brief.md +171 -0
  35. package/skills/estack-leadership-coach/frameworks/delegation/phases/5-monitoring.md +134 -0
  36. package/skills/estack-leadership-coach/frameworks/delegation/phases/6-reverse-delegation.md +118 -0
  37. package/skills/estack-leadership-coach/frameworks/delegation/phases/7-diagnose.md +200 -0
  38. package/skills/estack-leadership-coach/references/.source-files/deci-ryan_self-determination-theory__deci-olafsen-ryan-2017-self-determination-theory-in-work-organizations.md +1881 -0
  39. package/skills/estack-leadership-coach/references/.source-files/deci-ryan_self-determination-theory__gagne-deci-2005-self-determination-theory-and-work-motivation.md +2058 -0
  40. package/skills/estack-leadership-coach/references/.source-files/deci-ryan_self-determination-theory__selfdeterminationtheory-org-theory-overview-page.md +61 -0
  41. package/skills/estack-leadership-coach/references/.source-files/gallup_engagement-research__gallup-3-key-insights-into-the-global-workplace-2024.md +57 -0
  42. package/skills/estack-leadership-coach/references/.source-files/gallup_engagement-research__gallup-managers-account-for-70-percent-of-variance-in-employee-engagement-2015.md +40 -0
  43. package/skills/estack-leadership-coach/references/.source-files/gallup_engagement-research__gallup-state-of-the-global-workplace-2026-global-data-summary.md +73 -0
  44. package/skills/estack-leadership-coach/references/.source-files/gallup_engagement-research__gallup-state-of-the-global-workplace-2026-report-landing.md +42 -0
  45. package/skills/estack-leadership-coach/references/.source-files/hormozi-leila_4-stages__leila-hormozi-the-art-of-delegation-blog-post.md +91 -0
  46. package/skills/estack-leadership-coach/references/.source-files/oncken-wass_monkeys-hbr-1974__oncken-wass-management-time-whos-got-the-monkey-hbr-classic-1974.md +969 -0
  47. package/skills/estack-leadership-coach/references/.source-files/sanchez_main-street-millionaire__codie-sanchez-afford-anything-podcast-ep-565-show-notes.md +89 -0
  48. package/skills/estack-leadership-coach/references/.source-files/sullivan_who-not-how__dan-sullivan-impact-filter-tool-and-guide-booklet.md +565 -0
  49. package/skills/estack-leadership-coach/references/.source-files/van-edwards_cues__vanessa-van-edwards-lewis-howes-school-of-greatness-ep-1231-show-notes.md +122 -0
  50. package/skills/estack-leadership-coach/references/.source-files/van-edwards_cues__vanessa-van-edwards-roger-dooley-cues-interview.md +194 -0
  51. package/skills/estack-leadership-coach/references/deci-ryan_self-determination-theory.md +166 -0
  52. package/skills/estack-leadership-coach/references/doerr_measure-what-matters.md +154 -0
  53. package/skills/estack-leadership-coach/references/ferriss_4hww.md +189 -0
  54. package/skills/estack-leadership-coach/references/gallup_engagement-research.md +105 -0
  55. package/skills/estack-leadership-coach/references/gerber_e-myth-revisited.md +118 -0
  56. package/skills/estack-leadership-coach/references/grove_high-output-management.md +95 -0
  57. package/skills/estack-leadership-coach/references/hormozi-alex_followthrough.md +152 -0
  58. package/skills/estack-leadership-coach/references/hormozi-leila_4-stages.md +146 -0
  59. package/skills/estack-leadership-coach/references/oncken-wass_monkeys-hbr-1974.md +128 -0
  60. package/skills/estack-leadership-coach/references/sanchez_main-street-millionaire.md +196 -0
  61. package/skills/estack-leadership-coach/references/sullivan_who-not-how.md +137 -0
  62. package/skills/estack-leadership-coach/references/van-edwards_cues.md +189 -0
  63. package/skills/estack-migrate-claude-session-history/SKILL.md +226 -0
  64. package/skills/estack-migrate-claude-session-history/references/path-encoding.md +55 -0
  65. package/skills/estack-migrate-claude-session-history/references/troubleshooting.md +96 -0
  66. package/skills/estack-migrate-claude-session-history/scripts/migrate-claude-history.js +1123 -0
  67. package/skills/estack-migrate-claude-session-history/scripts/test-append-note.js +48 -0
  68. package/skills/estack-migrate-claude-session-history/scripts/test-validate-migration.py +326 -0
  69. package/skills/estack-migrate-claude-session-history/scripts/validate-migration.py +493 -0
  70. package/skills/estack-pdf-to-md/SKILL.md +180 -0
  71. package/skills/estack-pdf-to-md/scripts/pdf_to_md.py +596 -0
  72. package/skills/estack-productivity-prioritization-coach/SKILL.md +124 -0
  73. package/skills/estack-productivity-prioritization-coach/sources/01-tony-robbins-rpm.md +39 -0
  74. package/skills/estack-productivity-prioritization-coach/sources/02-justin-sung-task-prioritization.md +34 -0
  75. package/skills/estack-prompt-builder-coach/SKILL.md +81 -81
  76. package/skills/estack-prompt-builder-coach/definition-of-done-generator.md +42 -42
  77. package/skills/estack-prompt-builder-coach/prompt-builder.md +37 -37
  78. package/skills/estack-prompt-builder-coach/task-shaper.md +36 -36
  79. package/skills/estack-prompt-builder-coach/vague-ask-auditor.md +37 -37
  80. package/skills/estack-read-claude-session-history/SKILL.md +204 -204
  81. package/skills/estack-read-claude-session-history/references/jsonl-schema.md +126 -126
  82. package/skills/estack-read-claude-session-history/references/modes.md +423 -423
  83. package/skills/estack-read-claude-session-history/references/recipes.md +271 -271
  84. package/skills/estack-read-claude-session-history/scripts/lib/__init__.py +1 -1
  85. package/skills/estack-read-claude-session-history/scripts/lib/parser.py +460 -460
  86. package/skills/estack-read-claude-session-history/scripts/lib/paths.py +234 -234
  87. package/skills/estack-read-claude-session-history/scripts/lib/search.py +179 -179
  88. package/skills/estack-read-claude-session-history/scripts/lib/subagents.py +88 -88
  89. package/skills/estack-read-claude-session-history/scripts/lib/tools.py +144 -144
  90. package/skills/estack-read-claude-session-history/scripts/read_transcript.py +1776 -1776
  91. package/skills/estack-read-claude-session-history/scripts/tests/conftest.py +40 -40
  92. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/README.md +20 -20
  93. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/all-noise.jsonl +4 -4
  94. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/basic-session.jsonl +2 -2
  95. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/engagement-gaps.jsonl +9 -9
  96. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/engagement-noise.jsonl +7 -7
  97. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/engagement-parallel-a.jsonl +3 -3
  98. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/engagement-parallel-b.jsonl +3 -3
  99. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/engagement-waiting.jsonl +5 -5
  100. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/interrupted.jsonl +2 -2
  101. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/multi-compact.jsonl +8 -8
  102. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/pending-user.jsonl +2 -2
  103. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/subagent-no-meta/subagents/agent-aaa.jsonl +2 -2
  104. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/subagent-no-meta.jsonl +2 -2
  105. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/subagent-parent/subagents/agent-xyz123.jsonl +2 -2
  106. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/subagent-parent/subagents/agent-xyz123.meta.json +1 -1
  107. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/subagent-parent.jsonl +4 -4
  108. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/time-spread.jsonl +6 -6
  109. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/timeline-day-test.jsonl +5 -5
  110. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/tool-zoo.jsonl +10 -10
  111. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/truncated.jsonl +2 -2
  112. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/unicode.jsonl +2 -2
  113. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/with-advisor.jsonl +3 -3
  114. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/with-compact.jsonl +5 -5
  115. package/skills/estack-read-claude-session-history/scripts/tests/fixtures/with-thinking.jsonl +2 -2
  116. package/skills/estack-read-claude-session-history/scripts/tests/test_backup_roots.py +56 -56
  117. package/skills/estack-read-claude-session-history/scripts/tests/test_engagement.py +239 -239
  118. package/skills/estack-read-claude-session-history/scripts/tests/test_json_format.py +201 -201
  119. package/skills/estack-read-claude-session-history/scripts/tests/test_modes.py +199 -199
  120. package/skills/estack-read-claude-session-history/scripts/tests/test_parser.py +195 -195
  121. package/skills/estack-read-claude-session-history/scripts/tests/test_paths.py +133 -133
  122. package/skills/estack-read-claude-session-history/scripts/tests/test_search.py +78 -78
  123. package/skills/estack-read-claude-session-history/scripts/tests/test_subagents.py +43 -43
  124. package/skills/estack-read-claude-session-history/scripts/tests/test_timeline.py +179 -179
  125. package/skills/estack-read-claude-session-history/scripts/tests/test_timezone_and_project.py +212 -212
  126. package/skills/estack-read-claude-session-history/scripts/tests/test_tools.py +80 -80
  127. package/skills/estack-repo-search/SKILL.md +65 -65
  128. package/skills/estack-vscode-file-recovery/SKILL.md +188 -0
@@ -0,0 +1,493 @@
+ """Validate a migrated Claude Code session transcript.
+
+ Runs structural, schema, and path-consistency checks on the migrated .jsonl
+ (plus its subagent sidecar dir if present). With --source-backup, also
+ cross-validates against the pre-migration backup to prove no entries were
+ lost, no uuids reordered, no encoding variant missed.
+
+ Exit code: 0 if every check passes, 1 if any check fails, 2 if the file is not found.
+
+ Examples:
+
+     # Minimum: schema + self-consistency checks on the migrated file alone
+     python validate-migration.py <migrated.jsonl>
+
+     # Add path-replacement check (catches truly-stale old-path references)
+     python validate-migration.py <migrated.jsonl> \\
+         --old-repo "C:\\Users\\me\\old" \\
+         --new-repo "C:\\Users\\me\\new"
+
+     # Full cross-validation against the backup
+     python validate-migration.py <migrated.jsonl> \\
+         --old-repo "C:\\Users\\me\\old" \\
+         --new-repo "C:\\Users\\me\\new" \\
+         --source-backup "<backup>/old-project/<uuid>.jsonl" \\
+         --target-backup-dir "<backup>/new-project"
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import os
+ import re
+ import sys
+ from dataclasses import dataclass, field
+ from pathlib import Path
+ from typing import Any
+
+ # Standard UUID shape, any version (case-insensitive). Claude Code uses standard UUIDs throughout.
+ UUID_RE = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.IGNORECASE)
+
+ # Entry types that carry conversation content and a cwd field. Other types are
+ # session-metadata markers (permission-mode, ai-title, last-prompt) and don't
+ # need to match all the schema rules.
+ CONVERSATION_TYPES = {"user", "assistant", "attachment", "tool_result", "system"}
+
+
+ @dataclass
+ class CheckResult:
+     name: str
+     passed: bool
+     detail: str
+     sub_results: list["CheckResult"] = field(default_factory=list)
+
+
+ def load_jsonl(path: Path) -> tuple[list[dict[str, Any]], list[tuple[int, str]]]:
+     """Return (parsed_entries, parse_errors). Skips blank lines."""
+     entries: list[dict[str, Any]] = []
+     errors: list[tuple[int, str]] = []
+     with path.open(encoding="utf-8") as f:
+         for i, raw in enumerate(f, start=1):
+             if not raw.strip():
+                 continue
+             try:
+                 entries.append(json.loads(raw))
+             except json.JSONDecodeError as exc:
+                 errors.append((i, str(exc)))
+     return entries, errors
+
+
+ def check_parse_integrity(file_path: Path) -> CheckResult:
+     entries, errors = load_jsonl(file_path)
+     if errors:
+         detail = f"{len(entries)} parseable + {len(errors)} bad line(s); first error: line {errors[0][0]}: {errors[0][1]}"
+         return CheckResult("JSONL parse integrity", False, detail)
+     return CheckResult("JSONL parse integrity", True, f"{len(entries)} entries, all parseable")
+
+
+ def check_schema(entries: list[dict[str, Any]]) -> CheckResult:
+     """Every entry needs `type`. Conversation entries need a usable shape."""
+     problems: list[str] = []
+     for idx, entry in enumerate(entries):
+         if not isinstance(entry, dict):
+             problems.append(f"entry {idx}: not an object")
+             continue
+         if "type" not in entry:
+             problems.append(f"entry {idx}: missing 'type'")
+             continue
+         etype = entry["type"]
+         if etype in {"user", "assistant"}:
+             msg = entry.get("message")
+             if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
+                 problems.append(f"entry {idx} (type={etype}): malformed 'message' (need role + content)")
+         if "uuid" in entry and isinstance(entry["uuid"], str) and entry["uuid"] != "":
+             if not UUID_RE.match(entry["uuid"]):
+                 problems.append(f"entry {idx}: uuid not UUID-shaped: {entry['uuid']!r}")
+         if "parentUuid" in entry and entry["parentUuid"] is not None:
+             if not isinstance(entry["parentUuid"], str) or (entry["parentUuid"] != "" and not UUID_RE.match(entry["parentUuid"])):
+                 problems.append(f"entry {idx}: parentUuid not UUID-shaped: {entry['parentUuid']!r}")
+     if problems:
+         sample = "; ".join(problems[:3]) + (f" (+{len(problems) - 3} more)" if len(problems) > 3 else "")
+         return CheckResult("Schema consistency", False, sample)
+     return CheckResult("Schema consistency", True, f"all {len(entries)} entries have required fields")
+
+
+ def check_session_id_consistency(entries: list[dict[str, Any]], expected_session_id: str) -> CheckResult:
+     seen: set[str] = set()
+     for entry in entries:
+         sid = entry.get("sessionId")
+         if isinstance(sid, str) and sid:
+             seen.add(sid)
+     if not seen:
+         return CheckResult("Session ID consistency", False, "no sessionId field found on any entry")
+     if seen != {expected_session_id}:
+         return CheckResult(
+             "Session ID consistency",
+             False,
+             f"expected only {expected_session_id!r}, found {sorted(seen)!r}",
+         )
+     return CheckResult("Session ID consistency", True, f"all entries reference session {expected_session_id[:8]}")
+
+
+ def check_parent_uuid_chains(entries: list[dict[str, Any]]) -> CheckResult:
+     known_uuids = {e["uuid"] for e in entries if isinstance(e.get("uuid"), str) and e["uuid"]}
+     broken: list[str] = []
+     for entry in entries:
+         parent = entry.get("parentUuid")
+         if parent and parent not in known_uuids:
+             uid = str(entry.get("uuid") or "?")[:8]  # tolerate a null or non-string uuid
+             broken.append(f"entry {uid} references missing parent {str(parent)[:8]}")
+     if broken:
+         sample = "; ".join(broken[:3]) + (f" (+{len(broken) - 3} more)" if len(broken) > 3 else "")
+         return CheckResult("Parent UUID chains", False, sample)
+     return CheckResult("Parent UUID chains", True, "all parent references resolve within the file")
+
+
+ def check_cwd_consistency(entries: list[dict[str, Any]], new_repo: str | None) -> CheckResult:
+     """Every non-empty cwd should equal new_repo. If new_repo wasn't passed,
+     just check that all non-empty cwds agree with each other (one value)."""
+     cwds = {e["cwd"] for e in entries if isinstance(e.get("cwd"), str) and e["cwd"]}
+     if not cwds:
+         return CheckResult("CWD field consistency", True, "no non-empty cwd values to check")
+     if len(cwds) > 1:
+         return CheckResult(
+             "CWD field consistency",
+             False,
+             f"multiple distinct cwd values found: {sorted(cwds)!r}",
+         )
+     (only,) = cwds
+     if new_repo is not None and only != new_repo:
+         return CheckResult(
+             "CWD field consistency",
+             False,
+             f"cwd is {only!r} but --new-repo says it should be {new_repo!r}",
+         )
+     return CheckResult("CWD field consistency", True, f"all entries use cwd={only}")
+
+
+ def _iter_strings(value: Any):
+     """Walk a parsed JSON value and yield every string it contains."""
+     if isinstance(value, str):
+         yield value
+     elif isinstance(value, dict):
+         for v in value.values():
+             yield from _iter_strings(v)
+     elif isinstance(value, list):
+         for v in value:
+             yield from _iter_strings(v)
+
+
+ def check_stale_path_references(
+     entries: list[dict[str, Any]],
+     old_repo: str,
+     new_repo: str,
+ ) -> CheckResult:
+     """A "truly stale" reference is an occurrence of the old path that is NOT
+     followed by the added segment of the new path. When new_repo is a subdir
+     of old_repo, a naive substring search matches itself; the negative
+     lookahead filters those false positives.
+
+     Operates on parsed entry values (Python strings, single-backslash form),
+     not on raw file text. This sidesteps JSON-escape mismatches between CLI
+     argument processing and on-disk encoding.
+
+     The migration-note entry is excluded from the search because it
+     legitimately contains old-path references in its explanatory text."""
+
+     # Normalize CLI inputs: if either path was passed with doubled backslashes,
+     # collapse them to single backslashes so we always work with the
+     # in-memory (parsed) form.
+     if "\\\\" in old_repo:
+         old_repo = old_repo.replace("\\\\", "\\")
+     if "\\\\" in new_repo:
+         new_repo = new_repo.replace("\\\\", "\\")
+
+     note_idx = None
+     for i, e in enumerate(entries):
+         msg = e.get("message")
+         if isinstance(msg, dict) and isinstance(msg.get("content"), str) and "<session-migration-note>" in msg["content"]:
+             note_idx = i
+             break
+
+     if new_repo.startswith(old_repo) and new_repo != old_repo:
+         added_segment = new_repo[len(old_repo):]
+         pattern = re.compile(re.escape(old_repo) + "(?!" + re.escape(added_segment) + ")")
+     else:
+         added_segment = None
+         pattern = re.compile(re.escape(old_repo))
+
+     # Only check entries that existed at migration time, i.e., up to and
+     # including the note. Anything after the note is post-migration activity
+     # that may legitimately reference the old path (e.g. a tool listing that
+     # touched files still living at the old location). Those references are
+     # outside the migration's scope and not a defect.
+     if note_idx is not None:
+         in_scope_entries = entries[: note_idx + 1]
+         scope_detail = f"checked entries 0..{note_idx} (pre/at migration-note); {len(entries) - note_idx - 1} post-note entries skipped"
+     else:
+         in_scope_entries = entries
+         scope_detail = f"checked all {len(entries)} entries (no migration-note found)"
+
+     hits_by_entry: list[tuple[int, int]] = []
+     for i, entry in enumerate(in_scope_entries):
+         if i == note_idx:
+             continue
+         count = 0
+         for s in _iter_strings(entry):
+             count += len(pattern.findall(s))
+         if count:
+             hits_by_entry.append((i, count))
+
+     total = sum(c for _, c in hits_by_entry)
+     if total == 0:
+         if added_segment:
+             detail = f"0 truly-stale old-path occurrences ({scope_detail})"
+         else:
+             detail = f"0 old-path occurrences in any entry value ({scope_detail})"
+         return CheckResult("Stale path references", True, detail)
+
+     sample = ", ".join(f"entry {i}: {c}" for i, c in hits_by_entry[:3])
+     return CheckResult(
+         "Stale path references",
+         False,
+         f"{total} truly-stale old-path occurrences across {len(hits_by_entry)} entries ({sample}); {scope_detail}",
+     )
+
+
+ def find_migration_note(entries: list[dict[str, Any]]) -> tuple[int, dict[str, Any]] | None:
+     for idx, entry in enumerate(entries):
+         msg = entry.get("message")
+         if isinstance(msg, dict) and isinstance(msg.get("content"), str) and "<session-migration-note>" in msg["content"]:
+             return idx, entry
+     return None
+
+
+ def check_migration_note(entries: list[dict[str, Any]]) -> CheckResult:
+     matches = [(i, e) for i, e in enumerate(entries) if (
+         isinstance(e.get("message"), dict)
+         and isinstance(e["message"].get("content"), str)
+         and "<session-migration-note>" in e["message"]["content"]
+     )]
+     if not matches:
+         return CheckResult(
+             "Migration note present",
+             False,
+             "no <session-migration-note> entry found; was this file actually migrated?",
+         )
+     if len(matches) > 1:
+         return CheckResult(
+             "Migration note present",
+             False,
+             f"{len(matches)} migration-note entries (should be exactly 1); duplicate append",
+         )
+     idx, entry = matches[0]
+     problems = []
+     if entry.get("type") != "user":
+         problems.append(f"type={entry.get('type')!r} (expected 'user')")
+     if entry.get("isMeta") is True:
+         problems.append("isMeta=true (should not be meta; both user and AI need to see it)")
+     if problems:
+         return CheckResult("Migration note present", False, f"found at entry {idx}, but: {'; '.join(problems)}")
+     return CheckResult(
+         "Migration note present",
+         True,
+         f"found at entry {idx}, type=user, isMeta unset",
+     )
+
+
+ def check_sidecar_integrity(file_path: Path, expected_session_id: str) -> CheckResult:
+     sidecar_dir = file_path.with_suffix("")  # strip .jsonl → <uuid>/ dir
+     if not sidecar_dir.is_dir():
+         return CheckResult("Sidecar integrity", True, "no sidecar dir (no subagents were spawned)")
+     sidecar_files = list(sidecar_dir.rglob("*.jsonl"))
+     if not sidecar_files:
+         return CheckResult("Sidecar integrity", True, "sidecar dir exists but contains no .jsonl files")
+     problems: list[str] = []
+     for sf in sidecar_files:
+         entries, errors = load_jsonl(sf)
+         if errors:
+             problems.append(f"{sf.name}: {len(errors)} parse errors")
+             continue
+         # Subagent files share the parent session's sessionId
+         seen_sids = {e.get("sessionId") for e in entries if e.get("sessionId")}
+         unexpected = seen_sids - {expected_session_id}
+         if unexpected:
+             problems.append(f"{sf.name}: unexpected sessionId(s) {unexpected!r}")
+     if problems:
+         return CheckResult("Sidecar integrity", False, "; ".join(problems[:3]))
+     return CheckResult("Sidecar integrity", True, f"{len(sidecar_files)} subagent files, all parseable, sessionId matches")
+
+
+ def check_backup_cross_validation(
+     migrated_entries: list[dict[str, Any]],
+     source_backup_path: Path,
+     sidecar_live: Path,
+     sidecar_backup: Path | None,
+     target_backup_dir: Path | None,
+     target_live_dir: Path,
+ ) -> CheckResult:
+     """Cross-validate against the pre-migration backup. This is what proves
+     the migration was complete and non-destructive."""
+     sub: list[CheckResult] = []
+
+     # Source-backup parse
+     src_entries, src_errors = load_jsonl(source_backup_path)
+     if src_errors:
+         sub.append(CheckResult(
+             "Source backup parses",
+             False,
+             f"{len(src_errors)} parse error(s) in backup; backup may be corrupt",
+         ))
+         return CheckResult("Backup cross-validation", False, "source backup unreadable", sub)
+     sub.append(CheckResult("Source backup parses", True, f"{len(src_entries)} entries in backup"))
+
+     # Entry count: migrated = source + 1 (the migration note)
+     expected = len(src_entries) + 1
+     actual = len(migrated_entries)
+     sub.append(CheckResult(
+         "Entry count = source + 1",
+         actual == expected,
+         f"source={len(src_entries)}, migrated={actual}, expected={expected}",
+     ))
+
+     # UUID order preserved: every source uuid present in migrated, in the same order, as a prefix
+     src_uuids = [e["uuid"] for e in src_entries if isinstance(e.get("uuid"), str) and e["uuid"]]
+     new_uuids = [e["uuid"] for e in migrated_entries if isinstance(e.get("uuid"), str) and e["uuid"]]
+     prefix_ok = new_uuids[: len(src_uuids)] == src_uuids
+     sub.append(CheckResult(
+         "UUID order preserved",
+         prefix_ok,
+         f"source has {len(src_uuids)} uuids, migrated has {len(new_uuids)} (extras at tail: {len(new_uuids) - len(src_uuids)})",
+     ))
+
+     # Sidecar file count: source backup sidecar vs live sidecar
+     src_sidecar_files = list(sidecar_backup.rglob("*.jsonl")) if (sidecar_backup and sidecar_backup.is_dir()) else []
+     new_sidecar_files = list(sidecar_live.rglob("*.jsonl")) if sidecar_live.is_dir() else []
+     sub.append(CheckResult(
+         "Sidecar count matches backup",
+         len(src_sidecar_files) == len(new_sidecar_files),
+         f"backup={len(src_sidecar_files)}, live={len(new_sidecar_files)}",
+     ))
+
+     # Target dir untouched: every file in target backup should still exist in live target with same size
+     if target_backup_dir and target_backup_dir.is_dir():
+         issues: list[str] = []
+         for backup_file in target_backup_dir.rglob("*"):
+             if not backup_file.is_file():
+                 continue
+             rel = backup_file.relative_to(target_backup_dir)
+             live_file = target_live_dir / rel
+             if not live_file.exists():
+                 issues.append(f"missing in live: {rel}")
+             elif live_file.stat().st_size != backup_file.stat().st_size:
+                 issues.append(f"size changed: {rel}")
+         sub.append(CheckResult(
+             "Target dir pre-existing files unchanged",
+             not issues,
+             f"{len(issues)} issue(s)" + (f"; first: {issues[0]}" if issues else ""),
+         ))
+     else:
+         sub.append(CheckResult(
+             "Target dir pre-existing files unchanged",
+             True,
+             "skipped (no --target-backup-dir provided)",
+         ))
+
+     overall = all(r.passed for r in sub)
+     return CheckResult(
+         "Backup cross-validation",
+         overall,
+         f"{sum(1 for r in sub if r.passed)}/{len(sub)} sub-checks passed",
+         sub,
+     )
+
+
+ def print_result(result: CheckResult, indent: int = 0) -> None:
+     label = "PASS" if result.passed else "FAIL"
+     pad = " " * indent
+     print(f"{pad}[{label}] {result.name:<42s} {result.detail}")
+     for sub in result.sub_results:
+         print_result(sub, indent + 1)
+
+
+ def main(argv: list[str] | None = None) -> int:
+     parser = argparse.ArgumentParser(
+         description="Validate a migrated Claude Code session transcript.",
+         formatter_class=argparse.RawDescriptionHelpFormatter,
+         epilog=__doc__,
+     )
+     parser.add_argument("file", type=Path, help="Path to migrated .jsonl file")
+     parser.add_argument("--old-repo", help="Old project root path (enables stale-reference check)")
+     parser.add_argument("--new-repo", help="New project root path (enables cwd-value check)")
+     parser.add_argument("--source-backup", type=Path, help="Path to source backup .jsonl (enables cross-validation)")
+     parser.add_argument("--source-backup-sidecar", type=Path, help="Path to source backup sidecar dir; defaults to <source-backup>/ minus .jsonl")
+     parser.add_argument("--target-backup-dir", type=Path, help="Path to the entire backup of the target project dir (enables 'target untouched' check)")
+     args = parser.parse_args(argv)
+
+     file_path: Path = args.file
+     if not file_path.is_file():
+         print(f"ERROR: file not found: {file_path}", file=sys.stderr)
+         return 2
+
+     expected_session_id = file_path.stem  # filename without .jsonl extension
+
+     print(f"=== Validating {file_path} ===\n")
+
+     results: list[CheckResult] = []
+
+     # Parse + load entries once for downstream checks
+     parse_result = check_parse_integrity(file_path)
+     results.append(parse_result)
+     if not parse_result.passed:
+         print_result(parse_result)
+         print("\nFile cannot be parsed; downstream checks skipped.")
+         return 1
+
+     entries, _ = load_jsonl(file_path)
+
+     results.append(check_schema(entries))
+     results.append(check_session_id_consistency(entries, expected_session_id))
+     results.append(check_parent_uuid_chains(entries))
+     results.append(check_cwd_consistency(entries, args.new_repo))
+     results.append(check_migration_note(entries))
+
+     if args.old_repo and args.new_repo:
+         results.append(check_stale_path_references(entries, args.old_repo, args.new_repo))
+     else:
+         results.append(CheckResult(
+             "Stale path references",
+             True,
+             "skipped (pass --old-repo and --new-repo to enable)",
+         ))
+
+     results.append(check_sidecar_integrity(file_path, expected_session_id))
+
+     if args.source_backup:
+         if not args.source_backup.is_file():
+             results.append(CheckResult(
+                 "Backup cross-validation",
+                 False,
+                 f"--source-backup not found: {args.source_backup}",
+             ))
+         else:
+             sidecar_live = file_path.with_suffix("")
+             sidecar_backup = args.source_backup_sidecar or args.source_backup.with_suffix("")
+             target_live_dir = file_path.parent
+             results.append(check_backup_cross_validation(
+                 entries,
+                 args.source_backup,
+                 sidecar_live,
+                 sidecar_backup if sidecar_backup.is_dir() else None,
+                 args.target_backup_dir,
+                 target_live_dir,
+             ))
+     else:
+         results.append(CheckResult(
+             "Backup cross-validation",
+             True,
+             "skipped (pass --source-backup to enable)",
+         ))
+
+     for r in results:
+         print_result(r)
+
+     passed = sum(1 for r in results if r.passed)
+     total = len(results)
+     overall = all(r.passed for r in results)
+     print(f"\n=== Result: {'PASS' if overall else 'FAIL'} ({passed}/{total} checks) ===")
+     return 0 if overall else 1
+
+
492
+ if __name__ == "__main__":
493
+ sys.exit(main())
@@ -0,0 +1,180 @@
+ ---
+ name: estack-pdf-to-md
+ version: 1.0.0
+ description: (pdf-to-md) Convert a PDF file to Markdown or plain text using the RunPulse API. Use this skill whenever the user wants to extract text from a PDF, convert a PDF to .md or .txt, OCR a PDF, "turn this PDF into text/markdown", drops a .pdf path into chat asking for its contents, or asks to run the RunPulse / Pulse converter. Trigger even when the user only says "convert this PDF" without naming the tool.
+ ---
+
+ # pdf-to-md
+
+ Convert a PDF (or several PDFs) to Markdown or plain text using the RunPulse API. The underlying script splits the PDF into page batches, fires all batches in parallel against the RunPulse `/extract` endpoint, polls each async job, and reassembles the markdown in correct page order.
+
+ ## API key check (runs on skill load)
+
+ ```!
+ SKILL_DIR="$HOME/.claude/skills/estack-pdf-to-md"
+ ENV_FILE="$SKILL_DIR/.env"
+ echo "=== PULSE_API_KEY status ==="
+
+ ENV_KEY=""
+ if [ -f "$ENV_FILE" ]; then
+   ENV_KEY=$(grep -E '^PULSE_API_KEY=' "$ENV_FILE" 2>/dev/null | head -1 | cut -d= -f2- | tr -d '"' | tr -d "'" | tr -d '\r' | xargs)
+ fi
+
+ USER_VAR=""
+ if command -v powershell.exe >/dev/null 2>&1; then
+   USER_VAR=$(powershell.exe -NoProfile -Command "[System.Environment]::GetEnvironmentVariable('PULSE_API_KEY','User')" 2>/dev/null | tr -d '\r\n')
+ fi
+
+ if [ -n "$ENV_KEY" ]; then
+   masked="${ENV_KEY:0:6}...${ENV_KEY: -4}"
+   echo "[OK] Key found in skill .env ($ENV_FILE) -> $masked"
+   [ -n "$USER_VAR" ] && echo " (also present in Windows user env var; .env wins)"
+ elif [ -n "$USER_VAR" ]; then
+   masked="${USER_VAR:0:6}...${USER_VAR: -4}"
+   echo "[OK] Key found in Windows user env var PULSE_API_KEY -> $masked"
+   echo " Note: skill .env is not set. Default storage is $ENV_FILE -- consider mirroring there."
+ else
+   echo "[MISSING] No PULSE_API_KEY configured."
+   echo "ACTION: Do not run the script yet. Walk the user through 'First-time setup' below."
+ fi
+ ```
+
+ ## First-time setup (only if the startup check reports [MISSING])
+
+ If the check above said `[MISSING]`, the user has not configured a RunPulse API key yet. Walk them through it before doing anything else:
+
+ 1. **Open** https://www.runpulse.com in a browser and create an account (Google/email signup).
+ 2. **Find the API keys section** in the RunPulse dashboard (typically under Settings → API Keys or Developers).
+ 3. **Generate a new key** and copy it. Keys look like a 40-ish character random string (e.g. `kwMLkDai0V7Q...`).
+ 4. **Store it** by creating `~/.claude/skills/estack-pdf-to-md/.env` with one line:
+    ```
+    PULSE_API_KEY=<paste-the-key-here>
+    ```
+    Offer to do this for them via the Write tool once they paste the key in chat. Default storage is the skill-local `.env` at `~/.claude/skills/estack-pdf-to-md/.env`; only fall back to setting the Windows user env var if the user explicitly prefers that.
+ 5. **Re-run the startup check** by re-invoking the skill, and confirm it now reports `[OK]`.
+
+ **Never echo a real key back to the user in chat.** Confirm with a masked form (first 6 + last 4 chars) like the startup check does.
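The masked form mentioned above (first 6 plus last 4 characters) could be produced in Python roughly as follows. This is a minimal sketch: `mask_key` is a hypothetical helper, not a function from the bundled script, and hiding very short values entirely is an added assumption so that a tiny input is never shown in full.

```python
def mask_key(key: str, head: int = 6, tail: int = 4) -> str:
    """Return a display-safe form of an API key: first `head` + last `tail` chars."""
    key = key.strip()
    # Assumption (not from the skill): a key too short to mask meaningfully
    # is replaced by asterisks instead of being partially revealed.
    if len(key) <= head + tail:
        return "*" * len(key)
    return f"{key[:head]}...{key[-tail:]}"


if __name__ == "__main__":
    # Placeholder value, not a real key.
    print(mask_key("abcdefghijklmnopqrstuvwxyz"))
```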
+
+ ## Required inputs
+
+ Always confirm these two before running:
+
+ 1. **Input PDF path** — e.g. `C:\Users\2supe\Downloads\foo.pdf`
+ 2. **Output directory** — where the resulting `.md` / `.txt` should be saved
+
+ If the user gave one but not the other, ask. If they gave only a PDF path, default the output directory to the same folder as the PDF and confirm in one short sentence rather than assuming silently. The user explicitly asked that input and output be settable per run — do not skip the confirmation just because there's a sensible default.
+
+ ## Optional inputs
+
+ Mention these only if the user's request implies them — don't ask up front:
+
+ | Flag | Default | When to use |
+ |------|---------|-------------|
+ | `--format md\|txt` | `md` | User wants a `.txt` file instead of `.md` |
+ | `--batch-size N` | `10` | Large PDFs (100+ pages) → bump to 20+ to reduce API calls; flaky runs → drop to 5 to shrink the blast radius of a failed batch |
+ | `--no-separator` | off | User wants clean output with no `<!-- pages N-M -->` HTML comments between batches |
+ | `--min-chars N` | `20` | Threshold of locally-extractable text below which a page is skipped (not sent to RunPulse). Tune up if too many decoration pages are slipping through; tune down if real content pages are being skipped. |
+ | `--no-skip` | off | Send every page to RunPulse. **Use this for scanned PDFs** where every page is an image and RunPulse's OCR is the whole point — otherwise the default filter would skip everything. |
+ | `--quality fast\|high` | `fast` | `fast` = RunPulse `default` model, full parallelism, cheap. `high` = `pulse-ultra-2` vision-language model + full refinement pass (tables, text, formatting), figure extraction, footnote linking. Use `high` for **tables, math, charts, scanned pages, or sloppy formatting**. Ultra 2 is throttled by RunPulse to 2 concurrent / 5 per minute / 20 per hour, so the script caps the worker pool at 2 in this mode. |
+ | `--pages RANGE` | off | Restrict to a 1-indexed page range like `5`, `5-10`, or `1-2,5`. Useful for spot-testing on a single page before committing to a full run. When set, the blank/image-only filter is bypassed for explicitly requested pages. |
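To make the `--pages` syntax in the table concrete, here is a hedged sketch of how a 1-indexed spec such as `5`, `5-10`, or `1-2,5` might be expanded into page numbers. `parse_page_range` and its validation rules are assumptions for illustration; the bundled script may parse and validate differently.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Expand a 1-indexed spec like "5", "5-10", or "1-2,5" into sorted page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty segment in page range: {spec!r}")
        # "5-10" -> ("5", "-", "10"); "5" -> ("5", "", "")
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start
        # Assumption: out-of-bounds or reversed segments are rejected, not clamped.
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"invalid page range segment: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_range("1-2,5", page_count=37))
```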
+
+ ## Cost-saving page filter (on by default)
+
+ RunPulse is expensive, so the script filters pages *before* sending anything to the API:
+
+ 1. Uses `pypdf` locally to extract text from each page.
+ 2. Counts non-whitespace characters.
+ 3. Drops any page with fewer than `--min-chars` (default 20) — this catches blank pages and pages whose entire content is a rasterized image, since `pypdf` can't read the text out of either.
+ 4. Surviving pages get grouped into consecutive ranges and sent in parallel batches.
+
+ The script prints exactly which pages it's skipping (e.g. `Skipping 3 page(s): 4, 17, 92`) so the user can sanity-check it. If the user complains that real content got skipped, lower `--min-chars` (e.g. `--min-chars 5`). If the user has a fully-scanned PDF and the script exits with "No pages contain extractable text", run again with `--no-skip` to force every page through OCR.
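The four filter steps above can be sketched in plain Python. The function names and the batching rule (each consecutive run is cut into chunks of at most `batch_size` pages) are assumptions for illustration, not the bundled script's actual code. In practice the per-page strings would come from something like `pypdf.PdfReader(path).pages[i].extract_text()`; the sketch takes them as a list so it runs without a PDF.

```python
def pages_to_send(page_texts: list[str], min_chars: int = 20) -> tuple[list[int], list[int]]:
    """Split 1-indexed page numbers into (kept, skipped) by non-whitespace character count."""
    kept: list[int] = []
    skipped: list[int] = []
    for number, text in enumerate(page_texts, start=1):
        non_ws = sum(1 for ch in text if not ch.isspace())
        # Pages with fewer than min_chars non-whitespace characters are dropped.
        (kept if non_ws >= min_chars else skipped).append(number)
    return kept, skipped


def group_consecutive(pages: list[int]) -> list[tuple[int, int]]:
    """Collapse sorted page numbers into inclusive (first, last) runs."""
    runs: list[tuple[int, int]] = []
    for page in pages:
        if runs and page == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], page)  # extend the current run
        else:
            runs.append((page, page))  # start a new run
    return runs


def split_batches(runs: list[tuple[int, int]], batch_size: int = 10) -> list[tuple[int, int]]:
    """Assumed batching rule: cut each run into chunks of at most batch_size pages."""
    batches: list[tuple[int, int]] = []
    for first, last in runs:
        for start in range(first, last + 1, batch_size):
            batches.append((start, min(start + batch_size - 1, last)))
    return batches


if __name__ == "__main__":
    texts = ["x" * 30, "   \n", "y" * 25, "z" * 40, "ab"]
    kept, skipped = pages_to_send(texts)
    print("Skipping", skipped, "sending", split_batches(group_consecutive(kept)))
```

Under this sketch, a lower `min_chars` keeps more pages, which matches the advice to reduce the threshold when real content is being skipped.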
91
+
92
+ ## How to run
93
+
94
+ The script auto-loads `PULSE_API_KEY` from these sources, in order:
95
+ 1. The current shell's `PULSE_API_KEY` env var (PowerShell picks up Windows user env vars automatically; Bash does not).
96
+ 2. `~/.claude/skills/estack-pdf-to-md/.env` (the default storage for this skill).
97
+
98
+ So in either shell, just invoke directly — no need to pass the key explicitly:
99
+
100
+ ```powershell
101
+ python "$env:USERPROFILE\.claude\skills\estack-pdf-to-md\scripts\pdf_to_md.py" "<input-pdf>" --output-dir "<output-dir>"
102
+ ```
103
+
104
+ ```bash
105
+ python "$HOME/.claude/skills/estack-pdf-to-md/scripts/pdf_to_md.py" "<input-pdf>" --output-dir "<output-dir>"
106
+ ```
107
+
108
+ If the script exits with `PULSE_API_KEY is not set`, the startup check missed something — re-run the skill to re-trigger the check, or inspect `<skill_dir>/.env` directly. Never echo the key value back to the user.
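The lookup order listed under "How to run" (process environment first, then the skill-local `.env`) might look like the sketch below. `read_env_file` and `resolve_api_key` are hypothetical names, and the `.env` parsing (first matching `KEY=VALUE` line, surrounding quotes stripped) is an assumption modelled on the startup check, not a copy of the script's loader.

```python
import os
from collections.abc import Mapping
from pathlib import Path


def read_env_file(path: Path, name: str = "PULSE_API_KEY") -> str:
    """Return the value of `name` from a KEY=VALUE file, or "" if absent."""
    if not path.is_file():
        return ""
    for line in path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            # Strip whitespace first, then any surrounding quotes.
            return value.strip().strip("\"'")
    return ""


def resolve_api_key(env_file: Path, environ: Mapping[str, str] = os.environ) -> str:
    """Prefer the process environment, then fall back to the skill-local .env."""
    return environ.get("PULSE_API_KEY", "").strip() or read_env_file(env_file)


if __name__ == "__main__":
    default_env = Path.home() / ".claude" / "skills" / "estack-pdf-to-md" / ".env"
    # Report only whether a key was found; never print the value itself.
    print("configured" if resolve_api_key(default_env) else "missing")
```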
109
+
110
+ ## Dependencies
111
+
112
+ The script imports `requests` and `pypdf`. If you hit `ModuleNotFoundError`, install once and retry:
113
+
114
+ ```powershell
115
+ pip install requests pypdf
116
+ ```
117
+
118
+ ## Multiple PDFs
119
+
120
+ If the user passes a folder or a list of paths, loop sequentially — one script invocation per PDF. The script already parallelizes page batches within a single PDF; running multiple PDFs in parallel on top of that risks hammering the API and obscures which file failed when something breaks.
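A sequential loop of the kind described above could be sketched as follows. `build_command` and `convert_all` are hypothetical helpers. Stopping at the first non-zero exit code is an assumption chosen so the failing file is obvious, and the script path mirrors the Bash invocation shown under "How to run".

```python
import subprocess
import sys
from pathlib import Path

# Path taken from the "How to run" section; adjust if the skill lives elsewhere.
SCRIPT = Path.home() / ".claude" / "skills" / "estack-pdf-to-md" / "scripts" / "pdf_to_md.py"


def build_command(pdf: Path, output_dir: Path) -> list[str]:
    """One script invocation per PDF, using the documented --output-dir flag."""
    return [sys.executable, str(SCRIPT), str(pdf), "--output-dir", str(output_dir)]


def convert_all(pdfs, output_dir, run=subprocess.run):
    """Convert PDFs one at a time; return (converted, first_failed_or_None)."""
    converted: list[Path] = []
    for pdf in pdfs:
        result = run(build_command(Path(pdf), Path(output_dir)))
        if result.returncode != 0:
            # Assumption: stop here so the caller can report exactly which file failed.
            return converted, Path(pdf)
        converted.append(Path(pdf))
    return converted, None
```

The `run` parameter exists only so the loop can be exercised with a stand-in runner instead of launching real conversions.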
121
+
122
+ ## Reporting back
123
+
124
+ When done, report tersely:
125
+ - Output file path(s)
126
+ - Page count converted (the script prints `Sending N page(s) in M batch(es)...` once it knows what's being sent)
127
+
128
+ Don't paste the full markdown into chat unless the user asks — the file path is enough.
129
+
130
+ ## Failure handling
131
+
132
+ The script raises and exits non-zero on any batch error. Don't silently retry the whole run. Instead:
133
+
134
+ 1. Show the error to the user.
135
+ 2. If it looks like a transient timeout, offer to rerun the same command.
136
+ 3. If a specific batch repeatedly fails, suggest `--batch-size 5` so the failure scope shrinks and successful batches can still be salvaged on a future run.
137
+
138
+ ### Encrypted PDFs
139
+
140
+ The script auto-handles publisher-restricted PDFs that are *owner-locked* but have no user password (very common — most "protected" PDFs from publishers fall in this bucket). It silently `decrypt('')`s them to a temp file, runs the conversion, then deletes the temp file. You'll see a one-line note like `<file> was owner-locked; decrypted with empty password to temp copy.`
141
+
142
+ If the PDF actually has a user password, the script exits with both workarounds spelled out:
143
+ 1. **Chrome print-to-PDF** — open in Chrome, Ctrl+P → Save as PDF. This re-renders the visible content and produces a clean, unencrypted file. Easiest for the user, no installs.
144
+ 2. **`qpdf --decrypt --password=<pwd> in.pdf out.pdf`** — requires `qpdf` installed (`winget install qpdf`) and the actual password.
145
+
146
+ Don't try to bypass real password protection yourself — surface the message and let the user decide.
147
+
148
+ ## Why this skill exists (context for judgment calls)
149
+
150
+ This was built on 2026-05-20 as a wrapper around a hand-written script, now bundled at `scripts/pdf_to_md.py`. The script was validated on `the-4-hour-workweek-expanded-and-updated-by-timothy-ferriss.pdf` (37 pages, 4 parallel batches). The batching + parallel design is for throughput and to make error messages name the specific page range that failed — but note that **one failed batch currently aborts the whole run** (no partial-result salvage today). Surface the failed range to the user so they can rerun just that span with `--pages`.
151
+ ---
152
+
153
+ ## Skill Feedback
154
+
155
+ If the user shares feedback about this skill — a bug, something confusing, a missing feature, or a suggestion — ask them to describe it in a bit more detail (what they expected, what happened, and any relevant context). Then file the issue using whichever method is available:
156
+
157
+ **If `gh` is installed** (`gh --version` succeeds), create the issue directly:
158
+
159
+ ```bash
160
+ gh issue create \
161
+ --repo ElliotDrel/e-stack \
162
+ --title "estack-pdf-to-md: <concise summary>" \
163
+ --body "<description from user feedback — expected vs. actual behavior and context>"
164
+ ```
165
+
166
+ **If `gh` is not installed**, build a pre-filled URL:
167
+
168
+ ```bash
169
+ python3 -c "
170
+ import urllib.parse
171
+ title = 'estack-pdf-to-md: <concise summary>'
172
+ body = '<description from user feedback — expected vs. actual behavior and context>'
173
+ base = 'https://github.com/ElliotDrel/e-stack/issues/new'
174
+ print(base + '?title=' + urllib.parse.quote(title) + '&body=' + urllib.parse.quote(body))
175
+ "
176
+ ```
177
+
178
+ Share the printed URL with the user and offer to open it in their browser.
179
+
180
+ They can also click it directly, review the pre-filled title and body, and click **Submit new issue**.