@clawos-dev/clawd 0.2.47-beta.70.6ec7522 → 0.2.47-beta.72.f1d7f9e
- package/dist/cli.cjs +217 -115
- package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/SKILL.md +187 -0
- package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/archive-template.md +21 -0
- package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/article-template.md +20 -0
- package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/index-template.md +18 -0
- package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/raw-template.md +7 -0
- package/dist/persona-defaults/persona-knowledge-base/CLAUDE.md +105 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/README.md +119 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/SKILL.md +108 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/reference/continuation.md +167 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/reference/html-generation.md +103 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/reference/methodology.md +421 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/reference/quality-gates.md +192 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/reference/report-assembly.md +130 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/reference/weasyprint_guidelines.md +324 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/requirements.txt +14 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/schemas/claim.schema.json +49 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/schemas/evidence.schema.json +43 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/schemas/run_manifest.schema.json +97 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/schemas/source.schema.json +49 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/citation_manager.py +300 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/evidence_store.py +205 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/extract_claims.py +358 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/md_to_html.py +330 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/research_engine.py +584 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/source_evaluator.py +292 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/validate_report.py +354 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/verify_citations.py +426 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/verify_claim_support.py +344 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/scripts/verify_html.py +220 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/templates/mckinsey_report_template.html +443 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/templates/report_template.md +414 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/tests/fixtures/invalid_report.md +27 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/tests/fixtures/valid_report.md +114 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/tests/test_citation_manager.py +195 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/tests/test_evidence_store.py +166 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/tests/test_extract_claims.py +213 -0
- package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/tests/test_verify_claim_support.py +230 -0
- package/dist/persona-defaults/persona-researcher/CLAUDE.md +30 -0
- package/dist/persona-defaults/persona-researcher/skills-lock.json +11 -0
- package/package.json +2 -2
package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/SKILL.md
ADDED
|
@@ -0,0 +1,187 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: karpathy-llm-wiki
|
|
3
|
+
description: "Use when building or maintaining a personal LLM-powered knowledge base. Triggers: ingesting sources into a wiki, querying wiki knowledge, linting wiki quality, 'add to wiki', 'what do I know about', or any mention of 'LLM wiki' or 'Karpathy wiki'."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Karpathy LLM Wiki
|
|
7
|
+
|
|
8
|
+
Build and maintain a personal knowledge base using LLMs. You manage two directories: `raw/` (immutable source material) and `wiki/` (compiled knowledge articles). Sources go into raw/, you compile them into wiki articles, and the wiki compounds over time.
|
|
9
|
+
|
|
10
|
+
Core ideas from Karpathy:
|
|
11
|
+
- "The LLM writes and maintains the wiki; the human reads and asks questions."
|
|
12
|
+
- "The wiki is a persistent, compounding artifact."
|
|
13
|
+
|
|
14
|
+
## Architecture
|
|
15
|
+
|
|
16
|
+
Three layers, all under the user's project root:
|
|
17
|
+
|
|
18
|
+
**raw/** — Immutable source material. You read, never modify. Organized by topic subdirectories (e.g., `raw/machine-learning/`).
|
|
19
|
+
|
|
20
|
+
**wiki/** — Compiled knowledge articles. You have full ownership. Organized by topic subdirectories, one level only: `wiki/<topic>/<article>.md`. Contains two special files:
|
|
21
|
+
- `wiki/index.md` — Global index. One row per article, grouped by topic, with link + summary + Updated date.
|
|
22
|
+
- `wiki/log.md` — Append-only operation log.
|
|
23
|
+
|
|
24
|
+
**SKILL.md** (this file) — Schema layer. Defines structure and workflow rules.
|
|
25
|
+
|
|
26
|
+
Templates live in `references/` relative to this file. Read them when you need the exact format for raw files, articles, archive pages, or the index.
|
|
27
|
+
|
|
28
|
+
### Initialization
|
|
29
|
+
|
|
30
|
+
Triggers only on the first Ingest. Check whether `raw/` and `wiki/` exist. Create only what is missing; never overwrite existing files:
|
|
31
|
+
|
|
32
|
+
- `raw/` directory (with `.gitkeep`)
|
|
33
|
+
- `wiki/` directory (with `.gitkeep`)
|
|
34
|
+
- `wiki/index.md` — heading `# Knowledge Base Index`, empty body
|
|
35
|
+
- `wiki/log.md` — heading `# Wiki Log`, empty body
|
|
36
|
+
|
|
37
|
+
If Query or Lint cannot find the wiki structure, tell the user: "Run an ingest first to initialize the wiki." Do not auto-create.
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## Ingest
|
|
42
|
+
|
|
43
|
+
Fetch a source into raw/, then compile it into wiki/. Always both steps, no exceptions.
|
|
44
|
+
|
|
45
|
+
### Fetch (raw/)
|
|
46
|
+
|
|
47
|
+
1. Get the source content using whatever web or file tools your environment provides. If nothing can reach the source, ask the user to paste it directly.
|
|
48
|
+
|
|
49
|
+
2. Pick a topic directory. Check existing `raw/` subdirectories first; reuse one if the topic is close enough. Create a new subdirectory only for genuinely distinct topics.
|
|
50
|
+
|
|
51
|
+
3. Save as `raw/<topic>/YYYY-MM-DD-descriptive-slug.md`.
|
|
52
|
+
- Slug from source title, kebab-case, max 60 characters.
|
|
53
|
+
- Published date unknown → omit the date prefix from the file name (e.g., `descriptive-slug.md`). The metadata Published field still appears; set it to `Unknown`.
|
|
54
|
+
- If a file with the same name already exists, append a numeric suffix (e.g., `descriptive-slug-2.md`).
|
|
55
|
+
- Include metadata header: source URL, collected date, published date.
|
|
56
|
+
- Preserve original text. Clean formatting noise. Do not rewrite opinions.
|
|
57
|
+
|
|
58
|
+
See `references/raw-template.md` for the exact format.
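The naming rules in step 3 can be sketched in Python. This is only an illustration and not part of the package: `slugify` and `raw_filename` are hypothetical helper names, the normalization (lowercasing, collapsing non-alphanumeric runs into hyphens) is an assumption, and so is applying the 60-character limit to the slug alone rather than to the whole file name.

```python
import re
from typing import Optional, Set


def slugify(title: str, max_len: int = 60) -> str:
    """Kebab-case slug from a source title, cut to at most max_len characters."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Truncation may leave a trailing hyphen, so strip it again.
    return slug[:max_len].rstrip("-")


def raw_filename(title: str, published: Optional[str], existing: Set[str]) -> str:
    """Build a raw/ file name.

    published: 'YYYY-MM-DD', or None when the published date is unknown
               (the date prefix is then omitted).
    existing:  file names already present in the target topic directory.
    """
    base = slugify(title)
    if published:
        base = f"{published}-{base}"
    name = f"{base}.md"
    suffix = 2
    # Same name already taken: append a numeric suffix (-2, -3, ...).
    while name in existing:
        name = f"{base}-{suffix}.md"
        suffix += 1
    return name
```

For example, `raw_filename("Some Post", None, {"some-post.md"})` yields `some-post-2.md` under these assumptions.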
|
|
59
|
+
|
|
60
|
+
### Compile (wiki/)
|
|
61
|
+
|
|
62
|
+
Determine where the new content belongs:
|
|
63
|
+
|
|
64
|
+
- **Same core thesis as existing article** → Merge into that article. Add the new source to Sources/Raw. Update affected sections.
|
|
65
|
+
- **New concept** → Create a new article in the most relevant topic directory. Name the file after the concept, not the raw file.
|
|
66
|
+
- **Spans multiple topics** → Place in the most relevant directory. Add See Also cross-references to related articles elsewhere.
|
|
67
|
+
|
|
68
|
+
These are not mutually exclusive. A single source may warrant merging into one article while also creating a separate article for a distinct concept it introduces. In all cases, check for factual conflicts: if the new source contradicts existing content, annotate the disagreement with source attribution. When merging, note the conflict within the merged article. When the conflicting content lives in separate articles, note it in both and cross-link them.
|
|
69
|
+
|
|
70
|
+
See `references/article-template.md` for article format. Key points:
|
|
71
|
+
- Sources field: author, organization, or publication name + date, semicolon-separated.
|
|
72
|
+
- Raw field: markdown links to raw/ files, semicolon-separated.
|
|
73
|
+
- Relative paths from `wiki/<topic>/` use `../../raw/<topic>/<file>.md` (two levels up to project root).
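As a rough sketch of the relative-path rule, a link from a wiki article to a raw file can be computed with the standard library instead of assembled by hand. `relative_link` is a hypothetical helper, and both arguments are assumed to be POSIX-style paths relative to the project root.

```python
import posixpath


def relative_link(from_file: str, to_file: str) -> str:
    """Path to to_file as seen from the directory containing from_file.

    Both arguments are project-root-relative, e.g. 'wiki/ml/transformers.md'.
    """
    return posixpath.relpath(to_file, start=posixpath.dirname(from_file))
```

The same computation would also cover links between wiki articles, where a same-topic target reduces to the bare file name.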
|
|
74
|
+
|
|
75
|
+
### Cascade Updates
|
|
76
|
+
|
|
77
|
+
After the primary article, check for ripple effects:
|
|
78
|
+
|
|
79
|
+
1. Scan articles in the same topic directory for content affected by the new source.
|
|
80
|
+
2. Scan `wiki/index.md` entries in other topics for articles covering related concepts.
|
|
81
|
+
3. Update every article whose content is materially affected. Each updated file gets its Updated date refreshed.
|
|
82
|
+
|
|
83
|
+
Archive pages are never cascade-updated (they are point-in-time snapshots).
|
|
84
|
+
|
|
85
|
+
### Post-Ingest
|
|
86
|
+
|
|
87
|
+
Update `wiki/index.md`: add or update entries for every touched article. When adding a new topic section, include a one-line description. The Updated date reflects when the article's knowledge content last changed, not the file system timestamp. See `references/index-template.md` for format.
|
|
88
|
+
|
|
89
|
+
Append to `wiki/log.md`:
|
|
90
|
+
|
|
91
|
+
```
|
|
92
|
+
## [YYYY-MM-DD] ingest | <primary article title>
|
|
93
|
+
- Updated: <cascade-updated article title>
|
|
94
|
+
- Updated: <another cascade-updated article title>
|
|
95
|
+
```
|
|
96
|
+
|
|
97
|
+
Omit `- Updated:` lines when no cascade updates occur.
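A minimal sketch of how an ingest log entry in the format above might be rendered. `format_ingest_entry` is a hypothetical name; the skill itself only specifies the text layout, not any code.

```python
from datetime import date
from typing import Iterable, Optional


def format_ingest_entry(
    primary_title: str,
    cascade_titles: Iterable[str] = (),
    today: Optional[date] = None,
) -> str:
    """Render one ingest entry for wiki/log.md.

    With no cascade-updated titles, the '- Updated:' lines are simply absent.
    """
    today = today or date.today()
    lines = [f"## [{today.isoformat()}] ingest | {primary_title}"]
    lines += [f"- Updated: {title}" for title in cascade_titles]
    return "\n".join(lines) + "\n"
```

Because `wiki/log.md` is append-only, a caller would open the file in append mode and write the returned string.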
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
## Query
|
|
102
|
+
|
|
103
|
+
Search the wiki and answer questions. Examples of triggers:
|
|
104
|
+
- "What do I know about X?"
|
|
105
|
+
- "Summarize everything related to Y"
|
|
106
|
+
- "Compare A and B based on my wiki"
|
|
107
|
+
|
|
108
|
+
### Steps
|
|
109
|
+
|
|
110
|
+
1. Read `wiki/index.md` to locate relevant articles.
|
|
111
|
+
2. Read those articles and synthesize an answer.
|
|
112
|
+
3. Prefer wiki content over your own training knowledge. Cite sources with markdown links: `[Article Title](wiki/topic/article.md)` (project-root-relative paths for in-conversation citations; within wiki/ files, use paths relative to the current file).
|
|
113
|
+
4. Output the answer in the conversation. Do not write files unless asked.
|
|
114
|
+
|
|
115
|
+
### Archiving
|
|
116
|
+
|
|
117
|
+
When the user explicitly asks to archive or save the answer to the wiki:
|
|
118
|
+
|
|
119
|
+
1. Write the answer as a new wiki page. See `references/archive-template.md`. When converting conversation citations to the archive page, rewrite project-root-relative paths (e.g., `wiki/topic/article.md`) to file-relative paths (e.g., `../topic/article.md` or `article.md` for same-directory).
|
|
120
|
+
- Sources: markdown links to the wiki articles cited in the answer.
|
|
121
|
+
- No Raw field (content does not come from raw/).
|
|
122
|
+
- File name reflects the query topic, e.g., `transformer-architectures-overview.md`.
|
|
123
|
+
- Place in the most relevant topic directory.
|
|
124
|
+
2. Always create a new page. Never merge into existing articles (archive content is a synthesized answer, not raw material).
|
|
125
|
+
3. Update `wiki/index.md`. Prefix the Summary with `[Archived]`.
|
|
126
|
+
4. Append to `wiki/log.md`:
|
|
127
|
+
```
|
|
128
|
+
## [YYYY-MM-DD] query | Archived: <page title>
|
|
129
|
+
```
|
|
130
|
+
|
|
131
|
+
---
|
|
132
|
+
|
|
133
|
+
## Lint
|
|
134
|
+
|
|
135
|
+
Quality checks on the wiki. Two categories with different authority levels.
|
|
136
|
+
|
|
137
|
+
### Deterministic Checks (auto-fix)
|
|
138
|
+
|
|
139
|
+
Fix these automatically:
|
|
140
|
+
|
|
141
|
+
**Index consistency** — compare `wiki/index.md` against actual wiki/ files (excluding index.md and log.md):
|
|
142
|
+
- File exists but missing from index → add entry with `(no summary)` placeholder. For Updated, use the article's metadata Updated date if present; otherwise fall back to file's last modified date.
|
|
143
|
+
- Index entry points to nonexistent file → mark as `[MISSING]` in the index. Do not delete the entry; let the user decide.
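The index-consistency comparison could look roughly like the sketch below. It assumes index links are written relative to `wiki/` (as in `references/index-template.md`) and that articles live exactly one directory deep; `index_drift` and `LINK_RE` are hypothetical names. It only detects drift; adding the `(no summary)` placeholder or the `[MISSING]` marker is left to the caller.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Captures the target of markdown links that end in .md, e.g. [Title](topic/article.md)
LINK_RE = re.compile(r"\]\(([^)]+\.md)\)")


def index_drift(wiki_root: Path) -> Tuple[List[str], List[str]]:
    """Return (articles on disk but not indexed, index entries with no file)."""
    index_text = (wiki_root / "index.md").read_text(encoding="utf-8")
    indexed = set(LINK_RE.findall(index_text))
    # '*/*.md' matches only wiki/<topic>/<article>.md, so index.md and log.md
    # at the wiki root are excluded automatically.
    on_disk = {p.relative_to(wiki_root).as_posix() for p in wiki_root.glob("*/*.md")}
    return sorted(on_disk - indexed), sorted(indexed - on_disk)
```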
|
|
144
|
+
|
|
145
|
+
**Internal links** — for every markdown link in wiki/ article files (body text and Sources metadata), excluding Raw field links (validated by Raw references below) and excluding index.md/log.md (handled above):
|
|
146
|
+
- Target does not exist → search wiki/ for a file with the same name elsewhere.
|
|
147
|
+
- Exactly one match → fix the path.
|
|
148
|
+
- Zero or multiple matches → report to the user.
|
|
149
|
+
|
|
150
|
+
**Raw references** — every link in a Raw field must point to an existing raw/ file:
|
|
151
|
+
- Target does not exist → search raw/ for a file with the same name elsewhere.
|
|
152
|
+
- Exactly one match → fix the path.
|
|
153
|
+
- Zero or multiple matches → report to the user.
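Both the internal-link and the raw-reference checks share the same repair rule: fix the path only when exactly one same-named file exists. A hedged sketch, with `resolve_broken_link` as a hypothetical helper name:

```python
from pathlib import Path
from typing import Optional


def resolve_broken_link(search_root: Path, target_name: str) -> Optional[Path]:
    """Return the unique file called target_name under search_root.

    Returns None when there are zero or several candidates, in which case
    the broken link should be reported to the user rather than auto-fixed.
    """
    # Compare names directly so glob metacharacters in file names are harmless.
    matches = [p for p in search_root.rglob("*") if p.is_file() and p.name == target_name]
    return matches[0] if len(matches) == 1 else None
```

Returning `None` for the ambiguous case keeps the auto-fix deterministic: the function never has to choose between candidates.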
|
|
154
|
+
|
|
155
|
+
**See Also** — within each topic directory:
|
|
156
|
+
- Add obviously missing cross-references between related articles.
|
|
157
|
+
- Remove links to deleted files.
|
|
158
|
+
|
|
159
|
+
### Heuristic Checks (report only)
|
|
160
|
+
|
|
161
|
+
These rely on your judgment. Report findings without auto-fixing:
|
|
162
|
+
|
|
163
|
+
- Factual contradictions across articles
|
|
164
|
+
- Outdated claims superseded by newer sources
|
|
165
|
+
- Missing conflict annotations where sources disagree
|
|
166
|
+
- Orphan pages with no inbound links from other wiki articles
|
|
167
|
+
- Missing cross-topic references
|
|
168
|
+
- Concepts frequently mentioned but lacking a dedicated page
|
|
169
|
+
- Archive pages whose cited source articles have been substantially updated since archival
|
|
170
|
+
|
|
171
|
+
### Post-Lint
|
|
172
|
+
|
|
173
|
+
Append to `wiki/log.md`:
|
|
174
|
+
|
|
175
|
+
```
|
|
176
|
+
## [YYYY-MM-DD] lint | <N> issues found, <M> auto-fixed
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
---
|
|
180
|
+
|
|
181
|
+
## Conventions
|
|
182
|
+
|
|
183
|
+
- Standard markdown with relative links throughout.
|
|
184
|
+
- wiki/ supports one level of topic subdirectories only. No deeper nesting.
|
|
185
|
+
- Today's date for log entries, Collected dates, and Archived dates. Updated dates reflect when the article's knowledge content last changed. Published dates come from the source (use `Unknown` when unavailable).
|
|
186
|
+
- Inside wiki/ files, all markdown links use paths relative to the current file. In conversation output, use project-root-relative paths (e.g., `wiki/topic/article.md`).
|
|
187
|
+
- Ingest updates both `wiki/index.md` and `wiki/log.md`. Archive (from Query) updates both. Lint updates `wiki/log.md` (and `wiki/index.md` only when auto-fixing index entries). Plain queries do not write any files.
|
|
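package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/archive-template.md
ADDED
|
@@ -0,0 +1,21 @@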
|
|
|
1
|
+
# {Title}
|
|
2
|
+
|
|
3
|
+
> Sources: [{Cited Article 1}](article1.md); [{Cited Article 2}](../other-topic/article2.md)
|
|
4
|
+
{Paths must be relative to this file: same-topic = filename only, cross-topic = ../other-topic/filename.md}
|
|
5
|
+
> Archived: {YYYY-MM-DD}
|
|
6
|
+
|
|
7
|
+
## Overview
|
|
8
|
+
|
|
9
|
+
{One paragraph summarizing the query and key findings.}
|
|
10
|
+
|
|
11
|
+
## {Body Sections}
|
|
12
|
+
|
|
13
|
+
{The synthesized answer, lightly edited for wiki context. This page is a point-in-time snapshot; it will not be cascade-updated when source articles change.}
|
|
14
|
+
|
|
15
|
+
{OPTIONAL — include this section only when cross-references exist:}
|
|
16
|
+
|
|
17
|
+
## See Also
|
|
18
|
+
|
|
19
|
+
{Cross-references to related wiki articles. Use relative links:
|
|
20
|
+
- Same topic: [Other Article](other-article.md)
|
|
21
|
+
- Different topic: [Other Article](../other-topic/other-article.md)}
|
|
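package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/article-template.md
ADDED
|
@@ -0,0 +1,20 @@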
|
|
|
1
|
+
# {Title}
|
|
2
|
+
|
|
3
|
+
> Sources: {Author1, YYYY-MM-DD; Author2, YYYY-MM-DD}
|
|
4
|
+
> Raw: [{source1}](../../raw/{topic1}/{filename1}.md); [{source2}](../../raw/{topic2}/{filename2}.md)
|
|
5
|
+
|
|
6
|
+
## Overview
|
|
7
|
+
|
|
8
|
+
{One paragraph summarizing the key points of this article.}
|
|
9
|
+
|
|
10
|
+
## {Body Sections}
|
|
11
|
+
|
|
12
|
+
{Synthesize a coherent structure from the source material. Do not copy source text verbatim; distill and reorganize. Use blockquotes sparingly for particularly important original phrasing.}
|
|
13
|
+
|
|
14
|
+
{OPTIONAL — include this section only when cross-references exist:}
|
|
15
|
+
|
|
16
|
+
## See Also
|
|
17
|
+
|
|
18
|
+
{Cross-references to related wiki articles. Maintained during lint. Use relative links:
|
|
19
|
+
- Same topic: [Other Article](other-article.md)
|
|
20
|
+
- Different topic: [Other Article](../other-topic/other-article.md)}
|
|
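package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/index-template.md
ADDED
|
@@ -0,0 +1,18 @@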
|
|
|
1
|
+
# Knowledge Base Index
|
|
2
|
+
|
|
3
|
+
## {topic-name}
|
|
4
|
+
|
|
5
|
+
{One-line description of this topic.}
|
|
6
|
+
|
|
7
|
+
| Article | Summary | Updated |
|
|
8
|
+
|---------|---------|---------|
|
|
9
|
+
| [{Article Title}]({topic-name}/{article}.md) | {One-line summary} | {YYYY-MM-DD} |
|
|
10
|
+
| [{Archived Article}]({topic-name}/{archived}.md) | [Archived] {One-line summary} | {YYYY-MM-DD} |
|
|
11
|
+
|
|
12
|
+
## {another-topic}
|
|
13
|
+
|
|
14
|
+
{One-line description of this topic.}
|
|
15
|
+
|
|
16
|
+
| Article | Summary | Updated |
|
|
17
|
+
|---------|---------|---------|
|
|
18
|
+
| [{Article Title}]({another-topic}/{article}.md) | {One-line summary} | {YYYY-MM-DD} |
|
|
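package/dist/persona-defaults/persona-knowledge-base/.claude/skills/karpathy-llm-wiki/references/raw-template.md
ADDED
|
@@ -0,0 +1,7 @@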
|
|
|
1
|
+
# {Title}
|
|
2
|
+
|
|
3
|
+
> Source: {URL or origin description}
|
|
4
|
+
> Collected: {YYYY-MM-DD}
|
|
5
|
+
> Published: {YYYY-MM-DD or Unknown}
|
|
6
|
+
|
|
7
|
+
{Original content below. Preserve the source text faithfully. Clean up formatting noise (extra whitespace, broken HTML artifacts, navigation chrome), but do not rewrite opinions or alter meaning.}
|
|
package/dist/persona-defaults/persona-knowledge-base/CLAUDE.md
ADDED
|
@@ -0,0 +1,105 @@
|
|
|
1
|
+
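You are the boss's knowledge-base administrator. Archiving is the entry point, not the boundary: querying, reviewing, maintaining, and organizing are all part of the job.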
|
|
2
|
+
|
|
3
|
+
# Iron rule 0: anything touching `wiki/` must go through the `karpathy-llm-wiki` skill
|
|
4
|
+
|
|
5
|
+
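**For any work that reads, writes, or modifies the `wiki/` directory, the first step before starting is to invoke the `karpathy-llm-wiki` skill and follow its workflow.** This rule takes priority over everything else in this file.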
|
|
6
|
+
|
|
7
|
+
Including but not limited to:
|
|
8
|
+
|
|
9
|
+
- Writing a new wiki article (even the "filing" step of the standard archiving workflow)
|
|
10
|
+
- Modifying any existing wiki article (rewrite / lint / adding or removing paragraphs / adding See Also / splitting or merging files)
|
|
11
|
+
- Touching `wiki/index.md` or `wiki/log.md`
|
|
12
|
+
- Creating, removing, or validating cross-article links (knowledge-graph maintenance)
|
|
13
|
+
- The boss asking to "review / reorganize / clean up the wiki"
|
|
14
|
+
|
|
15
|
+
Exceptions (invocation **not required**):
|
|
16
|
+
|
|
17
|
+
- Read-only queries ("Did I note down X before?"): just grep and return the path
|
|
18
|
+
- Touching only original source material under `raw/` (subtitles, article text, video transcripts)
|
|
19
|
+
- Editing this file, `CLAUDE.md`, or other non-wiki configuration files
|
|
20
|
+
|
|
21
|
+
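**Why so strict**: the wiki is a knowledge-graph asset built up over the long term. "Self-disciplined maintenance" that bypasses the skill slowly erodes the consistency of the index, the See Also links, and the style, and repairing it afterwards is extremely costly. So the line is fixed: touching the wiki means invoking the skill first, with no relying on your own judgment about whether "this time needs it".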
|
|
22
|
+
|
|
23
|
+
# Scope of work
|
|
24
|
+
|
|
25
|
+
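- **Archiving**: the boss sends a link or body text; pull the transcript or body text, file it by topic, and update the index (**the wiki-writing part goes through karpathy-llm-wiki**)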
|
|
26
|
+
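- **Querying**: when the boss asks "Did I note down X before?" or "What material do I have on Y?", search wiki/raw directly and answer (read-only, no changes, no skill needed)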
|
|
27
|
+
- **Review**: list differences, list recent additions, compare old and new viewpoints (read-only → not needed; changing the wiki → go through the skill)
|
|
28
|
+
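- **Maintenance**: duplicate checks, broken-link cleanup, index rebuilds, reorganizing directories to suit the boss's new preferences (touching the wiki → go through the skill)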
|
|
29
|
+
- **Wiki quality**: lint / rewrite / knowledge-graph tasks → go through the skill
|
|
30
|
+
|
|
31
|
+
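When the boss says "scan it", only look and change nothing; "archive xxx" means run the full workflow; "look up xxx" means search only and write nothing.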
|
|
32
|
+
|
|
33
|
+
# Standard archiving workflow
|
|
34
|
+
|
|
35
|
+
1. **Pull the content**
|
|
36
|
+
- Original text / URL / video subtitles pasted by the boss: use directly
|
|
37
|
+
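- If the boss has installed a matching content-fetching skill (YouTube transcription, video transcription, web-page body extraction, etc.), pull the content following that skill's workflow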
|
|
38
|
+
- If none is installed, work directly with the text the boss pasted
|
|
39
|
+
|
|
40
|
+
2. **Decide where it belongs (three questions; all must be answered before filing)**
|
|
41
|
+
- What is the topic of this material?
|
|
42
|
+
- Which existing subdirectories could hold it? List them for the boss to choose from
|
|
43
|
+
- What should the file be named? (a short descriptive name, Chinese or English, **no date prefix** unless the boss asks for one)
|
|
44
|
+
|
|
45
|
+
3. **Placement rules**
|
|
46
|
+
- The boss explicitly names a subdirectory → put it there
|
|
47
|
+
- Not explicitly named → **you must ask first**, listing the existing subdirectories to choose from; **do not infer and then act**
|
|
48
|
+
- An inferred placement must also be confirmed first
|
|
49
|
+
- Target subdirectory does not exist → first ask "`xxx/` does not exist; create it?" and run mkdir only after confirmation
|
|
50
|
+
|
|
51
|
+
4. **File it**
|
|
52
|
+
- Original source material (video subtitles, article text) goes to `raw/<topic>/`; this step can Write directly, no skill needed
|
|
53
|
+
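- **Writing wiki articles + changing index/log + any See Also maintenance → you must invoke the `karpathy-llm-wiki` skill first** and produce the output through its workflow; do not write it yourself and then plan to "have the skill review it afterwards"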
|
|
54
|
+
- After filing, the path looks like `wiki/<topic>/<name>.md`
|
|
55
|
+
|
|
56
|
+
5. **Report**
|
|
57
|
+
- After filing, report in one sentence: "✅ Archived to `wiki/<topic>/<name>.md`, index updated"
|
|
58
|
+
- Do not restate the full content
|
|
59
|
+
|
|
60
|
+
# Filed document format (unless the boss requests otherwise)
|
|
61
|
+
|
|
62
|
+
Each `wiki/<topic>/<name>.md` should contain:
|
|
63
|
+
|
|
64
|
+
- **Original link** (first line at the top, clickable)
|
|
65
|
+
- **Source / author** (if identifiable)
|
|
66
|
+
- **Core summary** (2-5 sentences)
|
|
67
|
+
- **Key points** (with timestamps for video material)
|
|
68
|
+
- **Further thoughts** (angles the boss may care about; optional)
|
|
69
|
+
|
|
70
|
+
## Section numbering style (default)
|
|
71
|
+
|
|
72
|
+
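New wiki articles use **Chinese-numeral numbering** by default (`## 一、`, `### 1.`); if an article naturally enumerates "interfaces / APIs / concepts" and has many entries, Arabic-numeral numbering (`## 1. / ## 2.`) is acceptable, but the numbering must be consistent within a single article.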
|
|
73
|
+
|
|
74
|
+
# Red lines
|
|
75
|
+
|
|
76
|
+
- **Touching the wiki requires the skill**: see iron rule 0; invoke `karpathy-llm-wiki` before starting any change under wiki/, and invoking it after the fact does not count
|
|
77
|
+
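- **No unilateral categorization**: even an inferred placement must be asked about first; better to ask one extra question than to assume a default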
|
|
78
|
+
- **No date prefix**: unless the boss explicitly asks for one (original material under raw/ may carry a date where needed)
|
|
79
|
+
- **No piling files in the root directory**: everything filed must go into a subdirectory
|
|
80
|
+
- **No reorganizing with rm**: move with `git mv`, and ask before deleting
|
|
81
|
+
- **Do not break the index**: if a file's path changes after an edit, `wiki/index.md` must be updated to match
|
|
82
|
+
- **No forced cross-article See Also links**: cross-article references need a real dependency; better none than too many
|
|
83
|
+
|
|
84
|
+
# Behavioral rules
|
|
85
|
+
|
|
86
|
+
- **Ask before acting**: placement, naming, and whether to create a new subdirectory all need to be asked first
|
|
87
|
+
- **Ask everything at once**: list every open question in one go for the boss to answer; do not confirm step by step
|
|
88
|
+
- **Do not keep persuading**: once you have flagged the risk and the boss declines, stop
|
|
89
|
+
- **Verify when challenged**: when the boss asks "Are you sure?", re-check wiki/index.md or the source file; do not dig in
|
|
90
|
+
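- **Do not overstep unprompted**: if the boss says "archive only this one", archive only that one; do not tidy up other things along the way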
|
|
91
|
+
- **Keep query mode concise**: when asked "Did I note down X before?", give the path plus a 1-2 sentence summary; do not restate the content
|
|
92
|
+
|
|
93
|
+
# Directory structure reference
|
|
94
|
+
|
|
95
|
+
```
|
|
96
|
+
wiki/ # Organized notes (the externally visible knowledge base)
|
|
97
|
+
index.md # Master index
|
|
98
|
+
log.md # changelog
|
|
99
|
+
<topic>/ # Topic subdirectory
|
|
100
|
+
|
|
101
|
+
raw/ # Original material (subtitles, source text, transcripts)
|
|
102
|
+
<topic>/
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
When the boss adds a new topic, create matching subdirectories under both wiki/ and raw/, and add a section to `wiki/index.md`.
|
|
package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/README.md
ADDED
|
@@ -0,0 +1,119 @@
|
|
|
1
|
+
# Deep Research Skill for Claude Code
|
|
2
|
+
|
|
3
|
+
Enterprise-grade research engine for Claude Code. Produces citation-backed reports with source credibility scoring, multi-provider search, and automated validation.
|
|
4
|
+
|
|
5
|
+
## Installation
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
# Clone into Claude Code skills directory
|
|
9
|
+
git clone https://github.com/199-biotechnologies/claude-deep-research-skill.git ~/.claude/skills/deep-research
|
|
10
|
+
```
|
|
11
|
+
|
|
12
|
+
No additional dependencies required for basic usage.
|
|
13
|
+
|
|
14
|
+
### Optional: search-cli (multi-provider search)
|
|
15
|
+
|
|
16
|
+
For aggregated search across Brave, Serper, Exa, Jina, and Firecrawl:
|
|
17
|
+
|
|
18
|
+
```bash
|
|
19
|
+
brew tap 199-biotechnologies/tap && brew install search-cli
|
|
20
|
+
search config set keys.brave YOUR_KEY # configure at least one provider
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
## Usage
|
|
24
|
+
|
|
25
|
+
```
|
|
26
|
+
deep research on the current state of quantum computing
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
```
|
|
30
|
+
deep research in ultradeep mode: compare PostgreSQL vs Supabase for our stack
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
## Research Modes
|
|
34
|
+
|
|
35
|
+
| Mode | Phases | Duration | Best For |
|
|
36
|
+
|------|--------|----------|----------|
|
|
37
|
+
| Quick | 3 | 2-5 min | Initial exploration |
|
|
38
|
+
| Standard | 6 | 5-10 min | Most research questions |
|
|
39
|
+
| Deep | 8 | 10-20 min | Complex topics, critical decisions |
|
|
40
|
+
| UltraDeep | 8+ | 20-45 min | Comprehensive reports, maximum rigor |
|
|
41
|
+
|
|
42
|
+
## Pipeline
|
|
43
|
+
|
|
44
|
+
Scope → Plan → **Retrieve** (parallel search + agents) → Triangulate → Outline Refinement → Synthesize → Critique (with loop-back) → Refine → Package
|
|
45
|
+
|
|
46
|
+
Key features:
|
|
47
|
+
- **Step 0**: Retrieves current date before searches (prevents stale training-data year assumptions)
|
|
48
|
+
- **Parallel retrieval**: 5-10 concurrent searches + 2-3 focused sub-agents returning structured evidence objects
|
|
49
|
+
- **First Finish Search**: Adaptive quality thresholds by mode
|
|
50
|
+
- **Critique loop-back**: Phase 6 can return to Phase 3 with delta-queries if critical gaps found
|
|
51
|
+
- **Multi-persona red teaming**: Skeptical Practitioner, Adversarial Reviewer, Implementation Engineer (Deep/UltraDeep)
|
|
52
|
+
- **Disk-persisted citations**: `sources.json` survives context compaction and continuation agents
|
|
53
|
+
|
|
54
|
+
## Output
|
|
55
|
+
|
|
56
|
+
Reports saved to `~/Documents/[Topic]_Research_[Date]/`:
|
|
57
|
+
- Markdown (primary source of truth)
|
|
58
|
+
- HTML (McKinsey-style, auto-opened in browser)
|
|
59
|
+
- PDF (professional print via WeasyPrint)
|
|
60
|
+
|
|
61
|
+
Reports >18K words auto-continue via recursive agent spawning with context preservation.
|
|
62
|
+
|
|
63
|
+
## Quality Standards
|
|
64
|
+
|
|
65
|
+
- 10+ sources, 3+ per major claim
|
|
66
|
+
- Executive summary 200-400 words
|
|
67
|
+
- Findings 600-2,000 words each, prose-first (>=80%)
|
|
68
|
+
- Full bibliography with URLs, no placeholders
|
|
69
|
+
- Automated validation: `validate_report.py` (9 checks) + `verify_citations.py` (DOI/URL/hallucination detection)
|
|
70
|
+
- Validation loop: validate → fix → retry (max 3 cycles)
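The validate → fix → retry loop might be structured as in the sketch below. It is a generic illustration, not the skill's own orchestration code: `validation_loop` is a hypothetical name, the validator and fixer are passed in as callables (in practice the validator could wrap a call to `validate_report.py`), and treating one cycle as "validate, then fix" followed by a final re-check is an assumption.

```python
from typing import Callable, List


def validation_loop(
    validate: Callable[[], List[str]],
    fix: Callable[[List[str]], None],
    max_cycles: int = 3,
) -> bool:
    """Run up to max_cycles validate/fix rounds; True once validation is clean."""
    for _ in range(max_cycles):
        issues = validate()
        if not issues:
            return True
        fix(issues)
    # Re-check after the last fix so a successful final repair is not missed.
    return not validate()
```

Bounding the loop means a report that cannot be repaired automatically is surfaced after a fixed number of rounds instead of cycling indefinitely.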
|
|
71
|
+
|
|
72
|
+
## Search Tools
|
|
73
|
+
|
|
74
|
+
| Tool | Priority | Setup |
|
|
75
|
+
|------|----------|-------|
|
|
76
|
+
| search-cli | **Primary** — all searches go here first | `brew install search-cli` + API keys |
|
|
77
|
+
| WebSearch | Fallback — if search-cli fails or rate-limited | None (built-in) |
|
|
78
|
+
| Exa MCP | Optional — semantic/neural search alongside search-cli | MCP config |
|
|
79
|
+
|
|
80
|
+
## Architecture
|
|
81
|
+
|
|
82
|
+
```
|
|
83
|
+
deep-research/
|
|
84
|
+
├── SKILL.md # Skill entry point (lean, ~100 lines)
|
|
85
|
+
├── reference/
|
|
86
|
+
│ ├── methodology.md # 8-phase pipeline details
|
|
87
|
+
│ ├── report-assembly.md # Progressive generation strategy
|
|
88
|
+
│ ├── quality-gates.md # Validation standards
|
|
89
|
+
│ ├── html-generation.md # McKinsey HTML conversion
|
|
90
|
+
│ ├── continuation.md # Auto-continuation protocol
|
|
91
|
+
│ └── weasyprint_guidelines.md # PDF generation
|
|
92
|
+
├── templates/
|
|
93
|
+
│ ├── report_template.md # Report structure template
|
|
94
|
+
│ └── mckinsey_report_template.html # HTML report template
|
|
95
|
+
├── scripts/
|
|
96
|
+
│ ├── validate_report.py # 9-check structure validator
|
|
97
|
+
│ ├── verify_citations.py # DOI/URL/hallucination checker
|
|
98
|
+
│ ├── source_evaluator.py # Source credibility scoring
|
|
99
|
+
│ ├── citation_manager.py # Citation tracking
|
|
100
|
+
│ ├── md_to_html.py # Markdown to HTML converter
|
|
101
|
+
│ ├── verify_html.py # HTML verification
|
|
102
|
+
│ └── research_engine.py # Core orchestration engine
|
|
103
|
+
└── tests/
|
|
104
|
+
└── fixtures/ # Test report fixtures
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
## Version History
|
|
108
|
+
|
|
109
|
+
| Version | Date | Changes |
|
|
110
|
+
|---------|------|---------|
|
|
111
|
+
| 2.3.1 | 2026-03-19 | Template/validator harmonization, structured evidence, critique loop-back, multi-persona red teaming |
|
|
112
|
+
| 2.3 | 2026-03-19 | Contract harmonization, search-cli integration, dynamic year detection, disk-persisted citations, validation loops |
|
|
113
|
+
| 2.2 | 2025-11-05 | Auto-continuation system for unlimited length |
|
|
114
|
+
| 2.1 | 2025-11-05 | Progressive file assembly |
|
|
115
|
+
| 1.0 | 2025-11-04 | Initial release |
|
|
116
|
+
|
|
117
|
+
## License
|
|
118
|
+
|
|
119
|
+
MIT - modify as needed for your workflow.
|
|
package/dist/persona-defaults/persona-researcher/.claude/skills/deep-research/SKILL.md
ADDED
|
@@ -0,0 +1,108 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: deep-research
|
|
3
|
+
description: Use when the user needs multi-source research with citation tracking, evidence persistence, and structured report generation. Triggers on "deep research", "comprehensive analysis", "research report", "compare X vs Y", "analyze trends", or "state of the art". Not for simple lookups, debugging, or questions answerable with 1-2 searches.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Deep Research
|
|
7
|
+
|
|
8
|
+
## Core Purpose
|
|
9
|
+
|
|
10
|
+
Deliver citation-tracked research reports through a structured pipeline with evidence persistence, source identity management, claim-level verification, and progressive context management.
|
|
11
|
+
|
|
12
|
+
**Autonomy Principle:** Operate independently. Infer assumptions from context. Only stop for critical errors or incomprehensible queries. Surface high-materiality assumptions explicitly in the Introduction and Methodology rather than silently defaulting.
|
|
13
|
+
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
## Decision Tree
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
Request Analysis
|
|
20
|
+
+-- Simple lookup? --> STOP: Use WebSearch
|
|
21
|
+
+-- Debugging? --> STOP: Use standard tools
|
|
22
|
+
+-- Complex analysis needed? --> CONTINUE
|
|
23
|
+
|
|
24
|
+
Mode Selection
|
|
25
|
+
+-- Initial exploration --> quick (3 phases, 2-5 min)
|
|
26
|
+
+-- Standard research --> standard (6 phases, 5-10 min) [DEFAULT]
|
|
27
|
+
+-- Critical decision --> deep (8 phases, 10-20 min)
|
|
28
|
+
+-- Comprehensive review --> ultradeep (8+ phases, 20-45 min)
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
**Default assumptions:** Technical query = technical audience. Comparison = balanced perspective. Trend = recent 1-2 years.
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
35
|
+
## Workflow Overview
|
|
36
|
+
|
|
37
|
+
| Phase | Name | Quick | Std | Deep | Ultra |
|
|
38
|
+
|-------|------|-------|-----|------|-------|
|
|
39
|
+
| 1 | SCOPE | Y | Y | Y | Y |
|
|
40
|
+
| 2 | PLAN | - | Y | Y | Y |
|
|
41
|
+
| 3 | RETRIEVE | Y | Y | Y | Y |
|
|
42
|
+
| 4 | TRIANGULATE | - | Y | Y | Y |
|
|
43
|
+
| 4.5 | OUTLINE REFINEMENT | - | Y | Y | Y |
|
|
44
|
+
| 5 | SYNTHESIZE | - | Y | Y | Y |
|
|
45
|
+
| 6 | CRITIQUE | - | - | Y | Y |
|
|
46
|
+
| 7 | REFINE | - | - | Y | Y |
|
|
47
|
+
| 8 | PACKAGE | Y | Y | Y | Y |
|
|
48
|
+
|
|
49
|
+
**Note:** Phases 3-5 operate as an evidence loop per section (retrieve → evidence store → refine outline → draft → verify claims → delta-retrieve if needed), not as strict sequential gates.
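The Workflow Overview table can be expressed as a small lookup, sketched here for illustration only. `ALL_PHASES`, `MODE_PHASES`, and `phases_for` are hypothetical names. Note that standard mode lists seven names because OUTLINE REFINEMENT (phase 4.5) gets its own row in the table; it is presumably not counted separately in the "6 phases" figure.

```python
# Phase names in pipeline order, as listed in the Workflow Overview table.
ALL_PHASES = [
    "SCOPE", "PLAN", "RETRIEVE", "TRIANGULATE", "OUTLINE REFINEMENT",
    "SYNTHESIZE", "CRITIQUE", "REFINE", "PACKAGE",
]

# Which phases each mode runs (the rows marked Y in the table).
MODE_PHASES = {
    "quick": ["SCOPE", "RETRIEVE", "PACKAGE"],
    "standard": [p for p in ALL_PHASES if p not in ("CRITIQUE", "REFINE")],
    "deep": list(ALL_PHASES),
    "ultradeep": list(ALL_PHASES),
}


def phases_for(mode: str = "standard") -> list:
    """Ordered phase names for a mode; raises KeyError for an unknown mode."""
    return MODE_PHASES[mode.lower()]
```

The table alone does not distinguish deep from ultradeep; whatever "8+" adds for ultradeep is not captured by this lookup.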
|
|
50
|
+
|
|
51
|
+
---
|
|
52
|
+
|
|
53
|
+
## Execution
|
|
54
|
+
|
|
55
|
+
**On invocation, load relevant reference files:**
|
|
56
|
+
|
|
57
|
+
1. **Phase 1-7:** Load [methodology.md](./reference/methodology.md) for detailed phase instructions
|
|
58
|
+
2. **Phase 8 (Report):** Load [report-assembly.md](./reference/report-assembly.md) for progressive generation
|
|
59
|
+
3. **HTML/PDF output:** Load [html-generation.md](./reference/html-generation.md)
|
|
60
|
+
4. **Quality checks:** Load [quality-gates.md](./reference/quality-gates.md)
|
|
61
|
+
5. **Long reports (>18K words):** Load [continuation.md](./reference/continuation.md)
|
|
62
|
+
|
|
63
|
+
**Templates:**
|
|
64
|
+
- Report structure: [report_template.md](./templates/report_template.md)
|
|
65
|
+
- HTML styling: [mckinsey_report_template.html](./templates/mckinsey_report_template.html)
|
|
66
|
+
|
|
67
|
+
**Scripts:**
|
|
68
|
+
- `python scripts/validate_report.py --report [path]`
|
|
69
|
+
- `python scripts/verify_citations.py --report [path]`
|
|
70
|
+
- `python scripts/md_to_html.py [markdown_path]`
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
## Output Contract
|
|
75
|
+
|
|
76
|
+
**Required sections:**
|
|
77
|
+
- Executive Summary (200-400 words)
|
|
78
|
+
- Introduction (scope, methodology, assumptions)
|
|
79
|
+
- Main Analysis (4-8 findings, 600-2,000 words each, cited)
|
|
80
|
+
- Synthesis & Insights (patterns, implications)
|
|
81
|
+
- Limitations & Caveats
|
|
82
|
+
- Recommendations
|
|
83
|
+
- Bibliography (COMPLETE - every citation, no placeholders)
|
|
84
|
+
- Methodology Appendix
|
|
85
|
+
|
|
86
|
+
**Output files (all to `~/Documents/[Topic]_Research_[YYYYMMDD]/`):**
|
|
87
|
+
- Markdown (primary source of truth)
|
|
88
|
+
- `sources.jsonl` — stable source registry with canonical IDs
|
|
89
|
+
- `evidence.jsonl` — append-only evidence store with quotes and locators
|
|
90
|
+
- `claims.jsonl` — atomic claim ledger with support status
|
|
91
|
+
- `run_manifest.json` — query, mode, assumptions, provider config
|
|
92
|
+
- HTML (McKinsey style, auto-opened)
|
|
93
|
+
- PDF (professional print, auto-opened)
|
|
94
|
+
|
|
95
|
+
**Quality standards:**
|
|
96
|
+
- 10+ sources, 3+ per major claim (cluster-independent, not just count)
|
|
97
|
+
- All factual claims cited immediately [N] with evidence backing in `evidence.jsonl`
|
|
98
|
+
- Claim-support verification mandatory: no unsupported factual claims pass delivery
|
|
99
|
+
- No placeholders, no fabricated citations
|
|
100
|
+
- Prose-first (>=80%), bullets sparingly
|
|
101
|
+
|
|
102
|
+
---
|
|
103
|
+
|
|
104
|
+
## When to Use / NOT Use
|
|
105
|
+
|
|
106
|
+
**Use:** Comprehensive analysis, technology comparisons, state-of-the-art reviews, multi-perspective investigation, market analysis.
|
|
107
|
+
|
|
108
|
+
**Do NOT use:** Simple lookups, debugging, 1-2 search answers, quick time-sensitive queries.
|