clawbooks 0.1.1 → 0.1.3

package/README.md CHANGED
@@ -2,22 +2,116 @@
2
2
 
3
3
  Accounting by inference, not by engine.
4
4
 
5
- An append-only ledger + plain english policy + CLI.
6
- Your LLM agent reads the data, reads the policy, does the accounting.
5
+ Financial memory for agents.
6
+
7
+ Clawbooks is an append-only ledger, a plain-English accounting policy, and a CLI.
8
+ Your agent reads the data, reads the policy, and does the accounting.
9
+
7
10
  No rules engine. No SDK. No framework.
8
11
 
9
12
  **Two source files. Zero runtime dependencies.**
10
13
 
11
- ## Setup
14
+ Bring CSVs, Stripe exports, exchange fills, receipts, PDFs, or copied transaction text.
15
+ Your agent reads the source, applies `policy.md`, writes normalized ledger events into clawbooks, and produces statements, summaries, and audit packs from the same record.
16
+
17
+ ## The loop
18
+
19
+ ```text
20
+ Raw inputs
21
+ bank CSVs / Stripe exports / receipts / PDFs / exchange fills / copied text
22
+ ->
23
+ Agent ingestion
24
+ reads the source + applies policy.md + writes normalized ledger events
25
+ ->
26
+ Clawbooks ledger
27
+ append-only records + snapshots + verification + context + packs
28
+ ->
29
+ Agent outputs
30
+ P&L / balance sheet / cash flow / tax views / asset register / audit-ready working files
31
+ ->
32
+ Policy improvement
33
+ you refine policy.md and the next ingestion/reporting cycle gets better
34
+ ```
12
35
 
13
- ```bash
14
- git clone https://github.com/rev1ck/clawbooks.git
15
- cd clawbooks
16
- npm install
17
- npm run build
18
- cp policy.md.example policy.md # edit with your own accounting rules
36
+ ## Why
37
+
38
+ Most accounting software assumes the product should contain the accounting logic.
39
+ Clawbooks takes the opposite view:
40
+
41
+ - the ledger stores facts
42
+ - the policy states the rules in plain English
43
+ - the agent does the reasoning
44
+
45
+ That makes clawbooks useful anywhere an agent can read files and run shell commands.
46
+
47
+ ## What you get
48
+
49
+ - Append-only JSONL ledger with hash chaining
50
+ - Plain-English policy file instead of embedded bookkeeping logic
51
+ - CLI commands for recording, reviewing, reconciling, compacting, and packaging records
52
+ - Structured `context` output designed for agent reasoning
53
+ - Zero runtime dependencies
54
+
55
+ ## How ingestion works
56
+
57
+ Clawbooks does not ship source-specific import logic.
58
+ That is deliberate.
59
+
60
+ Your agent is the importer:
61
+
62
+ - bring raw inputs in whatever form you already have
63
+ - the agent reads them and applies `policy.md`
64
+ - the agent converts them into normalized ledger events
65
+ - clawbooks stores the canonical record
66
+
67
+ This keeps ingestion programmable by policy instead of hardcoded per integration.
68
+
69
+ ## What the agent can produce
70
+
71
+ With `context`, `summary`, `verify`, `reconcile`, `assets`, and `pack`, your agent can prepare:
72
+
73
+ - profit and loss statements
74
+ - balance sheets
75
+ - cash flow summaries
76
+ - categorized tax views
77
+ - asset registers and depreciation views
78
+ - audit-ready working packs
79
+
80
+ Clawbooks supplies durable memory, verification, and repeatable tooling.
81
+ The agent does the accounting work on top of that foundation.
82
+
83
+ ## Boundaries
84
+
85
+ You and your agent:
86
+
87
+ - write and refine `policy.md`
88
+ - ingest source documents and convert them into ledger events
89
+ - interpret edge cases
90
+ - review outputs and improve the policy over time
91
+
92
+ clawbooks:
93
+
94
+ - stores append-only financial records
95
+ - preserves snapshots and audit history
96
+ - provides structured context for the agent
97
+ - verifies integrity and reconciliation surfaces
98
+ - packages records for downstream review and reporting
99
+
100
+ As `policy.md` gets better, your ingestion, classification, and reporting get better too.
101
+
102
+ ## Example
103
+
104
+ ```text
105
+ You: "What's my P&L for March?"
106
+
107
+ Agent runs: clawbooks context 2026-03
108
+ Agent reads: policy + snapshot + events
109
+ Agent reasons: applies the policy to the records
110
+ Agent replies: "Revenue: $1,700. Expenses: $475. Net: $1,225."
19
111
  ```
20
112
 
113
+ There is no accounting engine. In clawbooks, the agent is the engine.
114
+
21
115
  ## Install
22
116
 
23
117
  ```bash
@@ -26,32 +120,20 @@ clawbooks --help
26
120
  cp policy.md.example policy.md
27
121
  ```
28
122
 
29
- ## Scoped Package Readiness
30
-
31
- The primary package should stay `clawbooks` for the clean install path.
32
- If you later want a brand-owned scoped companion package, the repo can stage `@clawbooks/cli` without renaming the live package:
123
+ ## Local setup
33
124
 
34
125
  ```bash
35
- npm run scoped:prepare
36
- npm run scoped:pack:dry-run
126
+ git clone https://github.com/rev1ck/clawbooks.git
127
+ cd clawbooks
128
+ npm install
129
+ npm run build
130
+ cp policy.md.example policy.md # edit with your own accounting rules
37
131
  ```
38
132
 
39
- This writes a temporary scoped package into `.dist/scoped-cli` for inspection or future publish work.
40
-
41
133
  ## How it works
42
134
 
43
- Clawbooks stores financial events and outputs context. The LLM you're already talking to does the accounting.
44
-
45
- ```
46
- You: "What's my P&L for March?"
47
-
48
- Agent runs: clawbooks context 2026-03
49
- Agent reads: policy + events
50
- Agent thinks: *applies policy to events*
51
- Agent responds: "Revenue: $1,700. Expenses: $475. Net: $1,225."
52
- ```
53
-
54
- There is no accounting engine. The LLM *is* the engine.
135
+ Clawbooks stores financial events and outputs accounting context.
136
+ The important command is `clawbooks context`: it prints a structured context envelope with metadata, instructions, policy, summary, snapshot, and raw events so an agent can reason from both overview and detail.
55
137
 
56
138
  ## Commands
57
139
 
@@ -70,13 +152,17 @@ clawbooks context 2026-03
70
152
  clawbooks context --after 2026-01-01
71
153
 
72
154
  # Analysis
73
- clawbooks verify 2026-03 # integrity + chain + duplicates
74
- clawbooks verify --balance 50000 --currency USD # cross-check closing balance
155
+ clawbooks verify 2026-03 # integrity + chain + duplicates
156
+ clawbooks verify --balance 50000 --currency USD # cross-check closing balance
75
157
  clawbooks reconcile 2026-03 --source bank --count 50 --debits -12000 --gaps
76
- clawbooks review --source bank # items needing classification
77
- clawbooks summary 2026-03 # aggregates for reports
78
- clawbooks snapshot 2026-03 --save # persist period snapshot
79
- clawbooks assets --as-of 2026-03-31 # asset register + depreciation
158
+ clawbooks review --source bank # items needing classification
159
+ clawbooks summary 2026-03 # aggregates for reports
160
+ clawbooks snapshot 2026-03 --save # persist period snapshot
161
+ clawbooks assets --as-of 2026-03-31 # asset register + depreciation
162
+
163
+ # Maintenance
164
+ clawbooks compact 2025-12 # archive old events, shrink ledger
165
+ clawbooks pack 2026-03 --out ./march-pack # generate audit pack (CSVs + JSON)
80
166
 
81
167
  # Print the policy
82
168
  clawbooks policy
@@ -84,37 +170,72 @@ clawbooks policy
84
170
 
85
171
  ## The context command
86
172
 
87
- This is the important one. It outputs your accounting policy + the latest snapshot + all events in a period, wrapped in XML tags. The agent reads this output and reasons over it.
173
+ This is the core command. It prints a `context` envelope for the requested period:
174
+
175
+ - `metadata` explains the requested and effective window, whether a snapshot was used, and what kinds of records are present
176
+ - `instructions` tells the agent how to interpret snapshot plus events
177
+ - `policy` is your plain-English accounting policy
178
+ - `summary` provides orientation before the raw records
179
+ - `snapshot` is the starting state, when available
180
+ - `events` contains the raw append-only records the agent should reason from
88
181
 
89
182
  ```bash
90
183
  $ clawbooks context 2026-03
91
184
 
185
+ <context schema="clawbooks.context.v2">
186
+ <metadata>
187
+ {
188
+ "requested_window": {"after":"2026-03-01T00:00:00.000Z","before":"2026-03-31T23:59:59.999Z"},
189
+ "effective_window": {"after":"2026-03-01T00:00:00.000Z","before":"2026-03-31T23:59:59.999Z"},
190
+ "snapshot": {"used": true, "ts":"2026-03-01T00:00:00.000Z"},
191
+ "event_count": 47,
192
+ "sources": ["bank", "stripe"],
193
+ "currencies": ["USD"]
194
+ }
195
+ </metadata>
196
+
197
+ <instructions>
198
+ Read the policy first.
199
+ Treat the snapshot as the starting state.
200
+ Apply the events block on top of that snapshot.
201
+ </instructions>
202
+
92
203
  <policy>
93
204
  # Accounting policy
94
205
  Cash basis. Crypto trades are revenue income...
95
206
  </policy>
96
207
 
97
- <snapshot as_of="2026-03-01">
98
- {"balances":{"USDC":45000},"ytd_pnl":18450}
208
+ <summary>
209
+ {
210
+ "by_type": {"income":{"count":12,"total":1700},"fee":{"count":3,"total":-55}},
211
+ "by_currency": {"USD":{"count":15,"total":1645}},
212
+ "cash_flow": {"inflows":1700,"outflows":-55,"net":1645}
213
+ }
214
+ </summary>
215
+
216
+ <snapshot as_of="2026-03-01T00:00:00.000Z">
217
+ {"balances":{"USD":45000},"ytd_pnl":18450}
99
218
  </snapshot>
100
219
 
101
- <events count="47" after="2026-03-01" before="2026-03-31">
220
+ <events count="47" after="2026-03-01T00:00:00.000Z" before="2026-03-31T23:59:59.999Z">
102
221
  {"ts":"...","source":"stripe","type":"payment","data":{"amount":500,...}}
103
222
  {"ts":"...","source":"bank","type":"fee","data":{"amount":-55,...}}
104
223
  ...
105
224
  </events>
225
+ </context>
106
226
  ```
107
227
 
108
228
  ## Importing data
109
229
 
110
- There is no import command. Your agent IS the importer.
230
+ There is no import command. The agent is the importer.
111
231
 
112
- ```
232
+ ```text
113
233
  You: [paste CSV] "Import this bank statement"
114
234
 
115
- Agent: *reads CSV, reads policy via `clawbooks policy`*
116
- *classifies each row per the policy*
117
- *outputs JSONL, pipes to `clawbooks batch`*
235
+ Agent: reads the CSV
236
+ reads policy via `clawbooks policy`
237
+ classifies each row per the policy
238
+ outputs JSONL and pipes it to `clawbooks batch`
118
239
 
119
240
  Agent: "Recorded 47 events from Chase March statement."
120
241
  ```
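Note on the `context` hunk above: the v2 envelope wraps each section in its own tag, so a consumer can split the output with a small parser before reasoning over it. The following is a minimal sketch, assuming the tag layout shown in the README example. `parseContextEnvelope` is a hypothetical helper, not part of clawbooks, and it assumes the policy text does not itself contain any of the closing tags.

```javascript
// Hypothetical helper (not part of clawbooks): split the text printed by
// `clawbooks context` into its sections, assuming the tag layout shown above.
function parseContextEnvelope(text) {
  // Return the trimmed body between <name ...> and </name>, or null if absent.
  const section = (name) => {
    const match = text.match(new RegExp(`<${name}(?:\\s[^>]*)?>([\\s\\S]*?)</${name}>`));
    return match ? match[1].trim() : null;
  };
  // Parse a section body as JSON, keeping null for a missing section.
  const json = (name) => {
    const body = section(name);
    return body === null ? null : JSON.parse(body);
  };
  const eventsBody = section("events") ?? "";
  return {
    metadata: json("metadata"),
    instructions: section("instructions"),
    policy: section("policy"),
    summary: json("summary"),
    snapshot: json("snapshot"), // null when the envelope has no snapshot block
    // Events are printed one JSON object per line.
    events: eventsBody
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line)),
  };
}

// Invented sample envelope, trimmed to a few fields for illustration.
const sample = [
  '<context schema="clawbooks.context.v2">',
  "<metadata>",
  '{"event_count": 2, "snapshot": {"used": false}}',
  "</metadata>",
  "<instructions>",
  "Read the policy first.",
  "</instructions>",
  "<policy>",
  "Cash basis.",
  "</policy>",
  "<summary>",
  '{"cash_flow": {"inflows": 500, "outflows": -55, "net": 445}}',
  "</summary>",
  '<events count="2" after="all" before="now">',
  '{"ts":"2026-03-02T00:00:00.000Z","source":"stripe","type":"payment","data":{"amount":500}}',
  '{"ts":"2026-03-03T00:00:00.000Z","source":"bank","type":"fee","data":{"amount":-55}}',
  "</events>",
  "</context>",
].join("\n");

const parsed = parseContextEnvelope(sample);
console.log(parsed.summary.cash_flow.net, parsed.events.length); // prints: 445 2
```

A regex split is enough here because the envelope is flat. A full XML parser would be the safer choice if the policy text may contain angle-bracket tags.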
@@ -134,22 +255,57 @@ clawbooks assets --as-of 2026-03-31
134
255
  clawbooks record '{"source":"manual","type":"disposal","data":{"asset_id":"<id>","proceeds":5000,"currency":"USD"}}'
135
256
  ```
136
257
 
258
+ ## Scaling
259
+
260
+ When the ledger grows large, compact old periods into an archive:
261
+
262
+ ```bash
263
+ clawbooks compact 2025-12
264
+ # -> archives old events to ledger-archive-2025-12-31.jsonl
265
+ # -> rewrites the main ledger as: 1 snapshot + newer events
266
+ ```
267
+
268
+ The archive remains a complete hash-chained ledger for audits. The main ledger stays small enough for agent context windows.
269
+
270
+ ## Audit packs
271
+
272
+ Generate a folder of standard-format files for accountants or auditors:
273
+
274
+ ```bash
275
+ clawbooks pack 2026-01/2026-12-31 --out ./annual-pack
276
+ ```
277
+
278
+ This produces `general_ledger.csv`, `summary.json`, `asset_register.csv`, `reclassifications.csv`, `verify.json`, and a copy of `policy.md`.
279
+ The output is assistive. It gives an accountant structured working material, not a pretend finished report.
280
+
137
281
  ## Agent setup
138
282
 
139
- Point your agent at `program.md` for instructions on how to use clawbooks. For example:
283
+ Point your agent at `program.md` for instructions on how to use clawbooks.
140
284
 
141
- - **Claude Code** add to your `CLAUDE.md`: `Read program.md in the clawbooks directory for financial record-keeping instructions.`
142
- - **Codex** add to your `AGENTS.md` or system prompt with the same pointer
143
- - **Any agent** — any agent that can shell out can use clawbooks. The CLI outputs structured text. The agent reads it and reasons.
285
+ - **Claude Code**: add `Read program.md in the clawbooks directory for financial record-keeping instructions.`
286
+ - **Codex**: add the same pointer in `AGENTS.md` or your system prompt
287
+ - **Any shell-capable agent**: clawbooks prints structured text for the agent to read and reason over
144
288
 
145
- The npm package includes `program.md` plus all policy examples, so this workflow also works from a global install.
289
+ The npm package includes `program.md` and the policy examples, so this workflow also works from a global install.
146
290
 
147
- ## Files
291
+ ## Packaging
292
+
293
+ The primary package should stay `clawbooks` for the clean install path.
294
+ If you later want a brand-owned scoped companion package, the repo can stage `@clawbooks/cli` without renaming the live package:
148
295
 
296
+ ```bash
297
+ npm run scoped:prepare
298
+ npm run scoped:pack:dry-run
149
299
  ```
300
+
301
+ This writes a temporary scoped package into `.dist/scoped-cli` for inspection or future publish work.
302
+
303
+ ## Files
304
+
305
+ ```text
150
306
  cli.ts CLI commands
151
307
  ledger.ts JSONL read/write/filter
152
- program.md Agent instructions (how to use clawbooks)
308
+ program.md Agent instructions
153
309
  policy.md Your accounting rules (you write this, gitignored)
154
310
  policy.md.example Example policy to start from
155
311
  ledger.jsonl Your financial events (append-only, gitignored)
@@ -162,7 +318,7 @@ ledger.jsonl Your financial events (append-only, gitignored)
162
318
  | `CLAWBOOKS_LEDGER` | `./ledger.jsonl` | Path to ledger |
163
319
  | `CLAWBOOKS_POLICY` | `./policy.md` | Path to policy |
164
320
 
165
- No API key needed. The agent brings its own LLM.
321
+ No API key needed. Bring your own agent.
166
322
 
167
323
  ## License
168
324
 
package/build/cli.js CHANGED
@@ -1,7 +1,7 @@
1
1
  #!/usr/bin/env node
2
2
  import { createHash } from "node:crypto";
3
- import { readFileSync, existsSync } from "node:fs";
4
- import { computeId, readAll, filter, append, hashLine, latestSnapshot, } from "./ledger.js";
3
+ import { readFileSync, writeFileSync, existsSync, mkdirSync } from "node:fs";
4
+ import { computeId, readAll, filter, append, hashLine, rewrite, latestSnapshot, } from "./ledger.js";
5
5
  const LEDGER = process.env.CLAWBOOKS_LEDGER ?? "./ledger.jsonl";
6
6
  const POLICY = process.env.CLAWBOOKS_POLICY ?? "./policy.md";
7
7
  const OUTFLOW_TYPES = new Set([
@@ -106,6 +106,109 @@ function periodFromArgs(args) {
106
106
  function round2(n) {
107
107
  return Math.round(n * 100) / 100;
108
108
  }
109
+ function buildReclassifyMap(events) {
110
+ const reclassifyMap = {};
111
+ for (const e of events) {
112
+ if (e.type === "reclassify" && e.data.original_id && e.data.new_category) {
113
+ reclassifyMap[String(e.data.original_id)] = String(e.data.new_category);
114
+ }
115
+ }
116
+ return reclassifyMap;
117
+ }
118
+ function reviewCounts(events, all) {
119
+ const reclassified = new Set(all.filter((e) => e.type === "reclassify").map((e) => String(e.data.original_id)));
120
+ const counts = { unclear: 0, inferred: 0, unset: 0, clear: 0 };
121
+ for (const e of events) {
122
+ if (e.type === "reclassify" || e.type === "snapshot" || reclassified.has(e.id))
123
+ continue;
124
+ const confidence = String(e.data.confidence ?? "unset");
125
+ if (confidence === "clear")
126
+ counts.clear++;
127
+ else if (confidence === "unclear")
128
+ counts.unclear++;
129
+ else if (confidence === "inferred")
130
+ counts.inferred++;
131
+ else
132
+ counts.unset++;
133
+ }
134
+ return counts;
135
+ }
136
+ function buildContextSummary(events, all) {
137
+ const reclassifyMap = buildReclassifyMap(all);
138
+ const byType = {};
139
+ const bySource = {};
140
+ const byCurrency = {};
141
+ const byCategory = {};
142
+ const eventTypes = new Set();
143
+ const sources = new Set();
144
+ const currencies = new Set();
145
+ let inflows = 0;
146
+ let outflows = 0;
147
+ let nonMetaEvents = 0;
148
+ let rawReclassifications = 0;
149
+ for (const e of events) {
150
+ eventTypes.add(e.type);
151
+ sources.add(e.source);
152
+ if (e.type === "reclassify")
153
+ rawReclassifications++;
154
+ if (META_TYPES.has(e.type))
155
+ continue;
156
+ nonMetaEvents++;
157
+ const amount = Number(e.data.amount);
158
+ const currency = String(e.data.currency ?? "UNKNOWN");
159
+ const category = reclassifyMap[e.id] ?? String(e.data.category ?? e.type);
160
+ currencies.add(currency);
161
+ if (!byType[e.type])
162
+ byType[e.type] = { count: 0, total: 0 };
163
+ byType[e.type].count++;
164
+ if (!bySource[e.source])
165
+ bySource[e.source] = { count: 0, total: 0 };
166
+ bySource[e.source].count++;
167
+ if (!byCurrency[currency])
168
+ byCurrency[currency] = { count: 0, total: 0 };
169
+ byCurrency[currency].count++;
170
+ if (!byCategory[category])
171
+ byCategory[category] = { count: 0, total: 0 };
172
+ byCategory[category].count++;
173
+ if (isNaN(amount))
174
+ continue;
175
+ byType[e.type].total = round2(byType[e.type].total + amount);
176
+ bySource[e.source].total = round2(bySource[e.source].total + amount);
177
+ byCurrency[currency].total = round2(byCurrency[currency].total + amount);
178
+ byCategory[category].total = round2(byCategory[category].total + amount);
179
+ if (amount > 0)
180
+ inflows = round2(inflows + amount);
181
+ else
182
+ outflows = round2(outflows + amount);
183
+ }
184
+ const confidence = reviewCounts(events, all);
185
+ const needsReview = confidence.unclear + confidence.inferred + confidence.unset;
186
+ const reclassifiedEventCount = events.filter((e) => reclassifyMap[e.id] !== undefined).length;
187
+ return {
188
+ event_count: events.length,
189
+ non_meta_event_count: nonMetaEvents,
190
+ event_types: [...eventTypes].sort(),
191
+ sources: [...sources].sort(),
192
+ currencies: [...currencies].sort(),
193
+ by_type: byType,
194
+ by_source: bySource,
195
+ by_currency: byCurrency,
196
+ by_category: byCategory,
197
+ cash_flow: {
198
+ inflows: round2(inflows),
199
+ outflows: round2(outflows),
200
+ net: round2(inflows + outflows),
201
+ },
202
+ reclassifications: {
203
+ raw_events_in_window: rawReclassifications,
204
+ applied_to_events_in_window: reclassifiedEventCount,
205
+ },
206
+ review: {
207
+ needs_review: needsReview,
208
+ by_confidence: confidence,
209
+ },
210
+ };
211
+ }
109
212
  function enforceSign(type, data) {
110
213
  if (data.amount === undefined)
111
214
  return;
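Note on the hunk above: `buildContextSummary` resolves each event's category through `buildReclassifyMap`, so a later `reclassify` event overrides the category on the original record without mutating it. The sketch below isolates that lookup. The sample events and ids are invented, `categoryTotals` is a hypothetical name, and treating `reclassify` and `snapshot` as the only meta types is an assumption, since the contents of `META_TYPES` are not visible in this diff.

```javascript
// Mirrors buildReclassifyMap from the diff: the last reclassify for an id wins.
function buildReclassifyMap(events) {
  const map = {};
  for (const e of events) {
    if (e.type === "reclassify" && e.data.original_id && e.data.new_category) {
      map[String(e.data.original_id)] = String(e.data.new_category);
    }
  }
  return map;
}

// Hypothetical helper: sum amounts per effective (post-reclassification) category.
function categoryTotals(events) {
  const map = buildReclassifyMap(events);
  const totals = {};
  for (const e of events) {
    // Assumption: these two types stand in for META_TYPES.
    if (e.type === "reclassify" || e.type === "snapshot") continue;
    const amount = Number(e.data.amount);
    if (Number.isNaN(amount)) continue;
    const category = map[e.id] ?? String(e.data.category ?? e.type);
    // Same rounding as round2 in the diff.
    totals[category] = Math.round(((totals[category] ?? 0) + amount) * 100) / 100;
  }
  return totals;
}

// Invented sample ledger: a2 is recorded as "software" and later corrected to "meals".
const ledger = [
  { id: "a1", type: "expense", data: { amount: -40, category: "software" } },
  { id: "a2", type: "expense", data: { amount: -15.5, category: "software" } },
  { id: "r1", type: "reclassify", data: { original_id: "a2", new_category: "meals" } },
];
console.log(categoryTotals(ledger));
```

The original event `a2` still says `software` in the ledger. Only the derived totals change, which is what keeps the record append-only.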
@@ -228,11 +331,62 @@ function cmdContext(args) {
228
331
  const snapshot = latestSnapshot(all, after);
229
332
  const effectiveAfter = snapshot?.ts ?? after;
230
333
  const events = filter(all, { after: effectiveAfter, before }).filter((e) => e.type !== "snapshot");
334
+ const summary = buildContextSummary(events, all);
335
+ const metadata = {
336
+ schema_version: "clawbooks.context.v2",
337
+ generated_at: new Date().toISOString(),
338
+ ledger_path: LEDGER,
339
+ policy_path: POLICY,
340
+ requested_window: {
341
+ after: after ?? "all",
342
+ before: before ?? "now",
343
+ },
344
+ effective_window: {
345
+ after: effectiveAfter ?? "all",
346
+ before: before ?? "now",
347
+ },
348
+ snapshot: snapshot ? {
349
+ used: true,
350
+ ts: snapshot.ts,
351
+ source: snapshot.source,
352
+ id: snapshot.id,
353
+ event_count: Number(snapshot.data.event_count ?? 0),
354
+ } : {
355
+ used: false,
356
+ },
357
+ event_count: events.length,
358
+ sources: summary.sources,
359
+ event_types: summary.event_types,
360
+ currencies: summary.currencies,
361
+ };
231
362
  // Output structured context for the agent
363
+ console.log(`<context schema="clawbooks.context.v2">`);
364
+ console.log(`<metadata>`);
365
+ console.log(JSON.stringify(metadata, null, 2));
366
+ console.log(`</metadata>`);
367
+ console.log();
368
+ console.log(`<instructions>`);
369
+ console.log(`Read the policy first.`);
370
+ if (snapshot) {
371
+ console.log(`Treat the snapshot as the starting state up to its as_of timestamp.`);
372
+ console.log(`Apply the events block on top of that snapshot to answer the user's question.`);
373
+ }
374
+ else {
375
+ console.log(`No snapshot is present for this window, so reason directly from the events block.`);
376
+ }
377
+ console.log(`Prefer the summary block for orientation, but use raw events for final reasoning and edge cases.`);
378
+ console.log(`Reclassify events are append-only corrections; use them when interpreting categories.`);
379
+ console.log(`Amounts are signed: inflows are positive, outflows are negative for known flow types.`);
380
+ console.log(`</instructions>`);
381
+ console.log();
232
382
  console.log(`<policy>`);
233
383
  console.log(policyText());
234
384
  console.log(`</policy>`);
235
385
  console.log();
386
+ console.log(`<summary>`);
387
+ console.log(JSON.stringify(summary, null, 2));
388
+ console.log(`</summary>`);
389
+ console.log();
236
390
  if (snapshot) {
237
391
  console.log(`<snapshot as_of="${snapshot.ts}">`);
238
392
  console.log(JSON.stringify(snapshot.data, null, 2));
@@ -243,6 +397,7 @@ function cmdContext(args) {
243
397
  for (const e of events)
244
398
  console.log(JSON.stringify(e));
245
399
  console.log(`</events>`);
400
+ console.log(`</context>`);
246
401
  }
247
402
  function cmdPolicy() {
248
403
  console.log(policyText());
@@ -762,6 +917,279 @@ function cmdAssets(args) {
762
917
  },
763
918
  }, null, 2));
764
919
  }
920
+ function cmdCompact(args) {
921
+ const f = flags(args);
922
+ const { before } = periodFromArgs(args);
923
+ if (!before) {
924
+ console.error("Usage: clawbooks compact <period> or --before <date>");
925
+ console.error(" Moves events before the cutoff to an archive file and saves a snapshot.");
926
+ console.error(" Example: clawbooks compact 2025-12");
927
+ process.exit(1);
928
+ }
929
+ const all = readAll(LEDGER);
930
+ const keep = all.filter((e) => e.ts > before);
931
+ const archive = all.filter((e) => e.ts <= before);
932
+ if (archive.length === 0) {
933
+ console.log(JSON.stringify({ compacted: false, reason: "no events before cutoff" }));
934
+ return;
935
+ }
936
+ // Build snapshot of archived events
937
+ const balances = {};
938
+ const byCategory = {};
939
+ const pnl = {};
940
+ let eventCount = 0;
941
+ for (const e of archive) {
942
+ if (META_TYPES.has(e.type))
943
+ continue;
944
+ const amount = Number(e.data.amount);
945
+ if (isNaN(amount))
946
+ continue;
947
+ eventCount++;
948
+ const currency = String(e.data.currency ?? "UNKNOWN");
949
+ const category = String(e.data.category ?? e.type);
950
+ balances[currency] = round2((balances[currency] ?? 0) + amount);
951
+ byCategory[category] = round2((byCategory[category] ?? 0) + amount);
952
+ if (!pnl[currency])
953
+ pnl[currency] = { income: 0, expenses: 0, tax: 0, net: 0 };
954
+ if (e.type === "income")
955
+ pnl[currency].income = round2(pnl[currency].income + amount);
956
+ else if (e.type === "tax_payment")
957
+ pnl[currency].tax = round2(pnl[currency].tax + amount);
958
+ else if (OUTFLOW_TYPES.has(e.type))
959
+ pnl[currency].expenses = round2(pnl[currency].expenses + amount);
960
+ pnl[currency].net = round2(pnl[currency].net + amount);
961
+ }
962
+ const snapshotData = {
963
+ period: { after: "all", before },
964
+ event_count: eventCount,
965
+ balances,
966
+ by_category: byCategory,
967
+ pnl,
968
+ compacted_from: archive.length,
969
+ };
970
+ const ts = before;
971
+ const snapshotEvent = {
972
+ ts,
973
+ source: "clawbooks:compact",
974
+ type: "snapshot",
975
+ data: snapshotData,
976
+ id: computeId(snapshotData, {
977
+ source: "clawbooks:compact", type: "snapshot", ts,
978
+ }),
979
+ prev: "",
980
+ };
981
+ // Write archive
982
+ const archivePath = f.archive ?? LEDGER.replace(".jsonl", `-archive-${before.slice(0, 10)}.jsonl`);
983
+ rewrite(archivePath, archive);
984
+ // Rewrite main ledger: snapshot + remaining events
985
+ rewrite(LEDGER, [snapshotEvent, ...keep]);
986
+ console.log(JSON.stringify({
987
+ compacted: true,
988
+ archived: archive.length,
989
+ archive_path: archivePath,
990
+ snapshot_id: snapshotEvent.id,
991
+ remaining: keep.length + 1,
992
+ }, null, 2));
993
+ }
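Note on `cmdCompact`: the core of the command is a partition of the ledger at the cutoff, plus a roll-up of the archived side into a single snapshot event. Below is a minimal in-memory sketch of that idea with no file I/O. `compactInMemory` is a hypothetical name. The real command also tracks categories and P&L, filters with `META_TYPES` (approximated here by two literal types), computes an id, and rewrites both files.

```javascript
// Hypothetical in-memory version of the compact step shown in the diff.
function compactInMemory(events, before) {
  // ISO-8601 UTC timestamps in the same format compare correctly as strings.
  const archive = events.filter((e) => e.ts <= before);
  const keep = events.filter((e) => e.ts > before);

  // Roll archived amounts up into per-currency balances.
  const balances = {};
  for (const e of archive) {
    if (e.type === "snapshot" || e.type === "reclassify") continue; // stand-in for META_TYPES
    const amount = Number(e.data.amount);
    if (Number.isNaN(amount)) continue;
    const currency = String(e.data.currency ?? "UNKNOWN");
    balances[currency] = Math.round(((balances[currency] ?? 0) + amount) * 100) / 100;
  }

  const snapshot = {
    ts: before,
    source: "clawbooks:compact",
    type: "snapshot",
    data: { period: { after: "all", before }, balances, compacted_from: archive.length },
  };
  // The main ledger becomes one snapshot followed by the newer events.
  return { archive, ledger: [snapshot, ...keep] };
}

// Invented sample events around a 2025 year-end cutoff.
const sampleEvents = [
  { ts: "2025-11-03T10:00:00.000Z", type: "income", data: { amount: 1000, currency: "USD" } },
  { ts: "2025-12-20T09:00:00.000Z", type: "fee", data: { amount: -25, currency: "USD" } },
  { ts: "2026-01-05T12:00:00.000Z", type: "income", data: { amount: 300, currency: "USD" } },
];
const result = compactInMemory(sampleEvents, "2025-12-31T23:59:59.999Z");
console.log(result.archive.length, result.ledger.length, result.ledger[0].data.balances.USD);
```

Note that, like the diff, this sketch uses each archived event's recorded category and amount as-is. It does not apply later `reclassify` events to the snapshot.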
994
+ function csvEscape(val) {
995
+ if (val.includes(",") || val.includes('"') || val.includes("\n")) {
996
+ return '"' + val.replace(/"/g, '""') + '"';
997
+ }
998
+ return val;
999
+ }
1000
+ function cmdPack(args) {
1001
+ const f = flags(args);
1002
+ const { after, before } = periodFromArgs(args);
1003
+ const outDir = f.out ?? `./audit-pack-${(before ?? new Date().toISOString()).slice(0, 10)}`;
1004
+ const all = readAll(LEDGER);
1005
+ const events = filter(all, { after, before, source: f.source });
1006
+ mkdirSync(outDir, { recursive: true });
1007
+ // --- general_ledger.csv ---
1008
+ const glHeader = "date,source,type,category,description,amount,currency,confidence,id";
1009
+ const glRows = events
1010
+ .filter((e) => !META_TYPES.has(e.type))
1011
+ .map((e) => [
1012
+ e.ts.slice(0, 10),
1013
+ csvEscape(e.source),
1014
+ e.type,
1015
+ csvEscape(String(e.data.category ?? "")),
1016
+ csvEscape(String(e.data.description ?? "")),
1017
+ String(e.data.amount ?? ""),
1018
+ String(e.data.currency ?? ""),
1019
+ String(e.data.confidence ?? ""),
1020
+ e.id,
1021
+ ].join(","));
1022
+ writeFileSync(`${outDir}/general_ledger.csv`, [glHeader, ...glRows].join("\n") + "\n", "utf-8");
1023
+ // --- reclassifications.csv ---
1024
+ const reclassEvents = all.filter((e) => e.type === "reclassify");
1025
+ if (reclassEvents.length > 0) {
1026
+ const rcHeader = "date,original_id,new_category,new_type,reason";
1027
+ const rcRows = reclassEvents.map((e) => [
1028
+ e.ts.slice(0, 10),
1029
+ String(e.data.original_id ?? ""),
1030
+ csvEscape(String(e.data.new_category ?? "")),
1031
+ csvEscape(String(e.data.new_type ?? "")),
1032
+ csvEscape(String(e.data.reason ?? "")),
1033
+ ].join(","));
1034
+ writeFileSync(`${outDir}/reclassifications.csv`, [rcHeader, ...rcRows].join("\n") + "\n", "utf-8");
1035
+ }
1036
+ // --- summary.json ---
1037
+ // Build reclassification map
1038
+ const reclassifyMap = {};
1039
+ for (const e of all) {
1040
+ if (e.type === "reclassify" && e.data.original_id && e.data.new_category) {
1041
+ reclassifyMap[String(e.data.original_id)] = String(e.data.new_category);
1042
+ }
1043
+ }
1044
+ const byType = {};
1045
+ const byCategory = {};
1046
+ const byCurrency = {};
1047
+ let inflows = 0, outflows = 0;
1048
+ for (const e of events) {
1049
+ if (META_TYPES.has(e.type))
1050
+ continue;
1051
+ const amount = Number(e.data.amount);
1052
+ if (isNaN(amount))
1053
+ continue;
1054
+ const type = e.type;
1055
+ const category = reclassifyMap[e.id] ?? String(e.data.category ?? e.type);
1056
+ const currency = String(e.data.currency ?? "UNKNOWN");
1057
+ if (!byType[type])
1058
+ byType[type] = { count: 0, total: 0 };
1059
+ byType[type].count++;
1060
+ byType[type].total = round2(byType[type].total + amount);
1061
+ if (!byCategory[category])
1062
+ byCategory[category] = { count: 0, total: 0 };
1063
+ byCategory[category].count++;
1064
+ byCategory[category].total = round2(byCategory[category].total + amount);
1065
+ if (!byCurrency[currency])
1066
+ byCurrency[currency] = { count: 0, total: 0 };
1067
+ byCurrency[currency].count++;
1068
+ byCurrency[currency].total = round2(byCurrency[currency].total + amount);
1069
+ if (amount > 0)
1070
+ inflows = round2(inflows + amount);
1071
+ else
1072
+ outflows = round2(outflows + amount);
1073
+ }
1074
+ writeFileSync(`${outDir}/summary.json`, JSON.stringify({
1075
+ period: { after: after ?? "all", before: before ?? "now" },
1076
+ by_type: byType,
1077
+ by_category: byCategory,
1078
+ by_currency: byCurrency,
1079
+ cash_flow: { inflows, outflows, net: round2(inflows + outflows) },
1080
+ }, null, 2) + "\n", "utf-8");
1081
+ // --- asset_register.csv ---
1082
+ const capitalizedEvents = all.filter((e) => e.data.capitalize === true);
1083
+ if (capitalizedEvents.length > 0) {
1084
+ const disposals = {};
1085
+ const writeOffsMap = {};
1086
+ const impairmentsMap = {};
1087
+ for (const e of all) {
1088
+ const aid = String(e.data.asset_id ?? "");
1089
+ if (!aid)
1090
+ continue;
1091
+ if (e.type === "disposal")
1092
+ disposals[aid] = e;
1093
+ else if (e.type === "write_off")
1094
+ writeOffsMap[aid] = e;
1095
+ else if (e.type === "impairment") {
1096
+ if (!impairmentsMap[aid])
1097
+ impairmentsMap[aid] = [];
1098
+ impairmentsMap[aid].push(e);
1099
+ }
1100
+ }
1101
+ const asOf = before ?? new Date().toISOString();
1102
+ const defaultLife = 36;
1103
+ const arHeader = "date,description,category,cost,currency,useful_life,monthly_dep,months_elapsed,acc_dep,impairment,nbv,status,proceeds,gain_loss,id";
1104
+ const arRows = capitalizedEvents.map((e) => {
1105
+ const amount = Math.abs(Number(e.data.amount));
1106
+ const lifeMonths = Number(e.data.useful_life_months) || defaultLife;
1107
+ const purchaseDate = new Date(e.ts);
1108
+ const reportDate = new Date(asOf);
1109
+ const monthsElapsed = Math.max(0, (reportDate.getFullYear() - purchaseDate.getFullYear()) * 12 +
1110
+ (reportDate.getMonth() - purchaseDate.getMonth()));
1111
+ const monthlyDep = round2(amount / lifeMonths);
1112
+ const accDep = round2(Math.min(amount, monthlyDep * monthsElapsed));
1113
+ let impTotal = 0;
1114
+ if (impairmentsMap[e.id]) {
1115
+ for (const imp of impairmentsMap[e.id]) {
1116
+ impTotal = round2(impTotal + Math.abs(Number(imp.data.impairment_amount) || 0));
1117
+ }
1118
+ }
1119
+             const nbv = round2(Math.max(0, amount - accDep - impTotal));
+             let status = "active";
+             let proceeds = "";
+             let gainLoss = "";
+             if (disposals[e.id]) {
+                 status = "disposed";
+                 const p = Number(disposals[e.id].data.proceeds) || 0;
+                 proceeds = String(p);
+                 gainLoss = String(round2(p - nbv));
+             }
+             else if (writeOffsMap[e.id]) {
+                 status = "written_off";
+                 gainLoss = String(round2(-nbv));
+             }
+             return [
+                 e.ts.slice(0, 10),
+                 csvEscape(String(e.data.description ?? "")),
+                 csvEscape(String(e.data.category ?? "")),
+                 String(amount),
+                 String(e.data.currency ?? ""),
+                 String(lifeMonths),
+                 String(monthlyDep),
+                 String(Math.min(monthsElapsed, lifeMonths)),
+                 String(accDep),
+                 String(impTotal),
+                 status === "active" ? String(nbv) : "0",
+                 status,
+                 proceeds,
+                 gainLoss,
+                 e.id,
+             ].join(",");
+         });
+         writeFileSync(`${outDir}/asset_register.csv`, [arHeader, ...arRows].join("\n") + "\n", "utf-8");
+     }
+     // --- verify.json ---
+     const hash = createHash("sha256").update(events.map((e) => e.id).join(",")).digest("hex");
+     let debits = 0, credits = 0;
+     const issues = [];
+     for (const e of events) {
+         const amount = Number(e.data.amount);
+         if (e.data.amount !== undefined && !isNaN(amount)) {
+             if (amount < 0)
+                 debits = round2(debits + amount);
+             else
+                 credits = round2(credits + amount);
+             if (OUTFLOW_TYPES.has(e.type) && amount > 0)
+                 issues.push(`${e.id}: outflow "${e.type}" positive ${amount}`);
+             if (INFLOW_TYPES.has(e.type) && amount < 0)
+                 issues.push(`${e.id}: inflow "${e.type}" negative ${amount}`);
+         }
+     }
+     writeFileSync(`${outDir}/verify.json`, JSON.stringify({
+         event_count: events.length, debits, credits, hash, issues,
+         generated: new Date().toISOString(),
+     }, null, 2) + "\n", "utf-8");
+     // --- policy.md ---
+     if (existsSync(POLICY)) {
+         writeFileSync(`${outDir}/policy.md`, readFileSync(POLICY, "utf-8"), "utf-8");
+     }
+     // Summary output
+     const files = ["general_ledger.csv", "summary.json", "verify.json"];
+     if (reclassEvents.length > 0)
+         files.push("reclassifications.csv");
+     if (capitalizedEvents.length > 0)
+         files.push("asset_register.csv");
+     if (existsSync(POLICY))
+         files.push("policy.md");
+     console.log(JSON.stringify({
+         pack: outDir,
+         period: { after: after ?? "all", before: before ?? "now" },
+         events: events.length,
+         files,
+     }, null, 2));
+ }
  // --- Help ---
  const HELP = `clawbooks — accounting by inference, not by engine.

@@ -782,6 +1210,9 @@ Analysis commands:
  snapshot [period] [--save] Compute period snapshot (balances, P&L)
  assets [--category C] [--life N] [--as-of DATE]
  Asset register (capitalize-flag based) with depreciation
+ compact <period> [--archive PATH] Archive old events, save snapshot, shrink ledger
+ pack [period] [--source S] [--out DIR]
+ Generate audit pack (CSVs + JSON + policy)

  Common flags:
  --after <ISO date> Events after this date
@@ -828,6 +1259,8 @@ Examples:
  clawbooks summary 2026-03
  clawbooks snapshot 2026-03 --save
  clawbooks assets --as-of 2026-03-31
+ clawbooks compact 2025-12
+ clawbooks pack 2026-03 --out ./march-pack

  Agent workflow:
  1. Agent runs: clawbooks context 2026-03
@@ -877,5 +1310,11 @@ switch (cmd) {
      case "assets":
          cmdAssets(args);
          break;
+     case "compact":
+         cmdCompact(args);
+         break;
+     case "pack":
+         cmdPack(args);
+         break;
      default: console.log(HELP);
  }
package/build/ledger.js CHANGED
@@ -52,6 +52,17 @@ export function append(path, event) {
      appendFileSync(path, JSON.stringify(event) + "\n", "utf-8");
      return true;
  }
+ export function rewrite(path, events) {
+     let prev = "genesis";
+     const lines = [];
+     for (const e of events) {
+         e.prev = prev;
+         const line = JSON.stringify(e);
+         prev = hashLine(line);
+         lines.push(line);
+     }
+     writeFileSync(path, lines.join("\n") + (lines.length ? "\n" : ""), "utf-8");
+ }
  export function latestSnapshot(events, before) {
      let snapshots = events.filter((e) => e.type === "snapshot");
      if (before)
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "clawbooks",
-   "version": "0.1.1",
+   "version": "0.1.3",
    "description": "Accounting by inference, not by engine. Zero dependencies.",
    "type": "module",
    "repository": {
package/program.md CHANGED
@@ -113,12 +113,13 @@ Reduces carrying value by the impairment amount. Multiple impairments can accumu

  When asked for a P&L, tax summary, balance, etc.:

- 1. Run `clawbooks summary <period>` for pre-computed aggregates
+ 1. **Start with `summary`**, not `context`. `clawbooks summary <period>` gives you pre-computed aggregates without loading every event into context.
  2. Map the output to the requested report:
     - **P&L**: `by_type` + `by_category` → Revenue - OpEx = Gross Profit - Tax = Net
     - **Balance Sheet**: `cash_flow.net` + opening balance (from snapshot or opening_balance events) → Assets. Capitalized assets from `clawbooks assets`. Equity = Assets - Liabilities
     - **Cash Flow Statement**: Map categories to Operating/Investing/Financing per policy
- 3. For details or edge cases, also run `clawbooks context <period>` and reason over raw events
+ 3. Only use `clawbooks context <period>` when you need to drill into individual events — e.g., investigating a specific transaction, answering "what was that $500 charge?", or debugging a reconciliation mismatch.
+ 4. For large ledgers, use `clawbooks pack <period>` to generate a full audit pack (CSVs + JSON) that you or an accountant can review outside the agent.

  ## Reconciliation workflow

@@ -159,6 +160,42 @@ clawbooks snapshot 2026-03 # compute and print (no save)

  The snapshot includes balances by currency, totals by category, and P&L by currency.

+ ## Compacting the ledger
+
+ When the ledger grows large (thousands of events), compact old periods into an archive:
+
+ ```bash
+ clawbooks compact 2025-12
+ ```
+
+ This:
+ 1. Saves a snapshot summarizing all events up to the cutoff
+ 2. Moves those events to `ledger-archive-2025-12-31.jsonl`
+ 3. Rewrites the main ledger with just the snapshot + newer events
+
+ The archive file is a complete, hash-chained ledger — it can be re-read for audits. The main ledger stays small for fast context loading.
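
A minimal sketch of what "hash-chained" means here, based on the `rewrite` function added to `build/ledger.js` in this diff: each serialized event carries a `prev` field holding the hash of the previous line, starting from `"genesis"`. The package's `hashLine` is not shown in this diff, so `toyHashLine` below is a non-cryptographic stand-in (FNV-1a), and `firstBrokenLink` and the sample event types are hypothetical names used only for illustration.

```javascript
// Toy stand-in for the package's hashLine(), which this diff does not show.
// FNV-1a (32-bit) is NOT cryptographic; it only keeps the sketch self-contained.
function toyHashLine(line) {
    let h = 0x811c9dc5;
    for (let i = 0; i < line.length; i++) {
        h ^= line.charCodeAt(i);
        h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h.toString(16).padStart(8, "0");
}

// Serialize events the way rewrite() does: set prev, stringify, hash the line.
function chainLines(events, hashLine) {
    let prev = "genesis";
    const lines = [];
    for (const e of events) {
        const line = JSON.stringify({ ...e, prev });
        prev = hashLine(line);
        lines.push(line);
    }
    return lines;
}

// Walk the lines and return the index of the first broken link, or -1 if intact.
function firstBrokenLink(lines, hashLine) {
    let expected = "genesis";
    for (let i = 0; i < lines.length; i++) {
        if (JSON.parse(lines[i]).prev !== expected) return i;
        expected = hashLine(lines[i]);
    }
    return -1;
}

const lines = chainLines(
    [{ id: "a", type: "income" }, { id: "b", type: "expense" }, { id: "c", type: "fee" }],
    toyHashLine,
);
const intact = firstBrokenLink(lines, toyHashLine);
// Tamper with the second line: the third line's prev no longer matches its hash.
const tampered = [lines[0], lines[1].replace("expense", "income"), lines[2]];
const broken = firstBrokenLink(tampered, toyHashLine);
console.log({ intact, broken });
```

Editing a line is detected at the line that follows it, because that is where the stored `prev` stops matching the recomputed hash.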
+
+ Compact aggressively for busy ledgers. Monthly or quarterly compaction keeps context manageable.
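
The three compaction steps can be sketched as a pure function over an in-memory event list. This is not a reproduction of `cmdCompact`: the function name `planCompaction`, the snapshot payload shape, and the sample events are all assumptions, and the sketch only assumes that events carry an ISO `ts` string and a `data.amount`, as other code in this diff suggests.

```javascript
// Split a ledger at a cutoff: events at or before it go to the archive,
// and the live ledger keeps one snapshot event plus everything newer.
function planCompaction(events, cutoffIso) {
    const archived = events.filter((e) => e.ts <= cutoffIso);
    const newer = events.filter((e) => e.ts > cutoffIso);
    // Hypothetical snapshot payload: net amount per currency over archived events.
    const balances = {};
    for (const e of archived) {
        const amount = Number(e.data?.amount);
        if (Number.isNaN(amount)) continue;
        const cur = e.data.currency ?? "UNKNOWN";
        balances[cur] = Math.round(((balances[cur] ?? 0) + amount) * 100) / 100;
    }
    const snapshot = {
        type: "snapshot",
        ts: cutoffIso,
        data: { balances, event_count: archived.length },
    };
    return { archived, live: [snapshot, ...newer] };
}

const events = [
    { id: "e1", ts: "2025-11-03T10:00:00.000Z", type: "income", data: { amount: 1200, currency: "USD" } },
    { id: "e2", ts: "2025-12-15T09:30:00.000Z", type: "expense", data: { amount: -200.5, currency: "USD" } },
    { id: "e3", ts: "2026-01-04T08:00:00.000Z", type: "expense", data: { amount: -50, currency: "USD" } },
];
const plan = planCompaction(events, "2025-12-31T23:59:59.999Z");
console.log(plan.archived.length, plan.live.map((e) => e.id ?? e.type), plan.live[0].data.balances);
```

Comparing `ts` as strings works only because every timestamp uses the same ISO 8601 UTC format.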
+
+ ## Audit packs
+
+ Generate a folder of CSVs and JSON for accountants, auditors, or your own review:
+
+ ```bash
+ clawbooks pack 2026-03 # pack a single month
+ clawbooks pack 2026-01/2026-12-31 --out ./annual-pack # pack a full year
+ ```
+
+ The pack includes:
+ - `general_ledger.csv` — every transaction with date, source, type, category, description, amount, currency, confidence, id
+ - `summary.json` — aggregates by type, category, currency, and cash flow
+ - `asset_register.csv` — capitalized assets with depreciation, disposal, write-off status (if any)
+ - `reclassifications.csv` — all reclassification events (if any)
+ - `verify.json` — integrity hash, debit/credit totals, issues
+ - `policy.md` — copy of the accounting policy applied
+
+ These files are assistive — they give the accountant standard-format data to work with. The agent can also read them back to answer questions.
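
The `debits`, `credits`, and `issues` fields of `verify.json` follow the `cmdPack` code in this diff: negative amounts accumulate into `debits`, non-negative amounts into `credits`, and an amount whose sign contradicts its event type is reported as an issue. The sketch below mirrors that loop. The contents of `OUTFLOW_TYPES` and `INFLOW_TYPES` and the body of `round2` are not visible in this diff, so the versions here are assumptions.

```javascript
// Hypothetical type sets; the real OUTFLOW_TYPES / INFLOW_TYPES live elsewhere in the package.
const OUTFLOW_TYPES = new Set(["expense", "fee"]);
const INFLOW_TYPES = new Set(["income"]);

// Assumed behaviour of round2: round to two decimal places.
const round2 = (n) => Math.round(n * 100) / 100;

function verifyTotals(events) {
    let debits = 0, credits = 0;
    const issues = [];
    for (const e of events) {
        const amount = Number(e.data.amount);
        // Events without a numeric amount are counted but not totalled.
        if (e.data.amount === undefined || Number.isNaN(amount)) continue;
        if (amount < 0) debits = round2(debits + amount);
        else credits = round2(credits + amount);
        if (OUTFLOW_TYPES.has(e.type) && amount > 0)
            issues.push(`${e.id}: outflow "${e.type}" positive ${amount}`);
        if (INFLOW_TYPES.has(e.type) && amount < 0)
            issues.push(`${e.id}: inflow "${e.type}" negative ${amount}`);
    }
    return { event_count: events.length, debits, credits, issues };
}

const report = verifyTotals([
    { id: "e1", type: "income", data: { amount: 100.1 } },
    { id: "e2", type: "expense", data: { amount: -40.05 } },
    { id: "e3", type: "expense", data: { amount: 12 } }, // wrong sign for an outflow
    { id: "e4", type: "note", data: {} },                // no amount: skipped in totals
]);
console.log(JSON.stringify(report, null, 2));
```

Note that `debits` is a negative total under this convention, so an accountant reading `verify.json` should compare magnitudes rather than expect the two figures to be equal.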
+
  ## Quick reference

  ```
@@ -177,8 +214,23 @@ clawbooks summary [period] # pre-computed aggregates for reports
  clawbooks snapshot [period] [--save] # compute period snapshot (balances, P&L)
  clawbooks assets [--category C] [--life N] [--as-of DATE]
  # asset register (capitalize-flag based)
+ clawbooks compact <period> # archive old events, shrink ledger
+ clawbooks pack [period] [--out DIR] # generate audit pack (CSVs + JSON)
  ```

+ ## Improving the policy
+
+ The accounting policy (`policy.md`) should improve over time as you process more data. After classification review cycles:
+
+ 1. Run `clawbooks review` to see reclassifications and patterns
+ 2. If you notice repeated corrections (e.g., "GITHUB" always gets reclassified from `office_supplies` to `software`), update `policy.md` with the new rule
+ 3. Add the rule to the appropriate section (expense classification, source-specific rules, etc.)
+ 4. Be specific — "GitHub charges are software subscriptions" is better than "tech charges are software"
+
+ The goal is that each import gets more accurate as the policy captures learned patterns. The agent should proactively suggest policy updates when it sees recurring reclassifications, but should not update the policy without the user's awareness.
+
+ When updating the policy, keep it plain english. The policy is read by the agent on every `context` call — it should be clear, concise, and actionable.
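
One way an agent might spot "repeated corrections" is to count identical reclassifications and surface those that recur. The sketch below is illustrative only: the record shape (`vendor`, `from`, `to`), the function name `suggestPolicyRules`, and the threshold are hypothetical and are not the schema of clawbooks reclassification events. It returns suggestions for the user to approve and does not write to `policy.md`.

```javascript
// Group reclassifications by (vendor, from, to) and surface the ones that repeat.
function suggestPolicyRules(reclassifications, minCount = 2) {
    const counts = new Map();
    for (const r of reclassifications) {
        // Normalize vendor case so "GitHub" and "GITHUB" count together.
        const key = [r.vendor.toUpperCase(), r.from, r.to].join("|");
        counts.set(key, (counts.get(key) ?? 0) + 1);
    }
    return [...counts.entries()]
        .filter(([, n]) => n >= minCount)
        .map(([key, n]) => {
            const [vendor, from, to] = key.split("|");
            return {
                vendor, from, to, count: n,
                suggestion: `"${vendor}" charges: classify as ${to} (was ${from}, corrected ${n} times)`,
            };
        });
}

const suggestions = suggestPolicyRules([
    { vendor: "GitHub", from: "office_supplies", to: "software" },
    { vendor: "GITHUB", from: "office_supplies", to: "software" },
    { vendor: "Uber", from: "meals", to: "travel" },
]);
console.log(suggestions);
```

A single correction stays below the threshold, which keeps one-off fixes from turning into overly broad policy rules.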
+
  ## Idempotent imports

  When importing from a source (CSV, statement), include a stable `data.ref` field derived