@robhowley/pi-structured-return 0.1.0 → 1.0.2

package/README.md CHANGED
@@ -34,15 +34,65 @@ Tokens counted with `cl100k_base` (tiktoken). Linter output is more compact than
34
34
  - `minitest-text` (parses default minitest output — no flags or reporters needed)
35
35
  - `junit-xml` (JUnit XML — covers pytest `--junitxml`, Gradle, Maven, Jest with `jest-junit`, Go with `go-junit-report`, and any other tool that emits the JUnit XML schema)
36
36
 
37
+ ## Before / after
38
+
39
+ **Raw pytest output (446 tokens):** *(pytest does have a JSON mode — but piping raw `--json-report` output to the model is noisier than the default formatter, not cleaner. The parser reads the JSON and extracts only what matters.)*
40
+ ```
41
+ ============================= test session starts ==============================
42
+ platform darwin -- Python 3.14.2, pytest-9.0.2
43
+ collecting ... collected 3 items
44
+
45
+ test_math.py::test_adds_two_numbers_correctly PASSED [ 33%]
46
+ test_math.py::test_multiplies_two_numbers_correctly FAILED [ 66%]
47
+ test_math.py::test_does_not_divide_by_zero FAILED [100%]
48
+
49
+ =================================== FAILURES ===================================
50
+ ____________________ test_multiplies_two_numbers_correctly _____________________
51
+
52
+ def test_multiplies_two_numbers_correctly():
53
+ > assert 3 * 4 == 99
54
+ E assert (3 * 4) == 99
55
+
56
+ test_math.py:5: AssertionError
57
+ _________________________ test_does_not_divide_by_zero _________________________
58
+
59
+ def test_does_not_divide_by_zero():
60
+ > result = 1 / 0
61
+ ^^^^^
62
+ E ZeroDivisionError: division by zero
63
+
64
+ test_math.py:8: ZeroDivisionError
65
+ =========================== short test summary info ============================
66
+ FAILED test_math.py::test_multiplies_two_numbers_correctly
67
+ FAILED test_math.py::test_does_not_divide_by_zero - ZeroDivisionError: ...
68
+ ========================= 2 failed, 1 passed in 0.01s ==========================
69
+ ```
70
+
71
+ **Structured result returned to the model (59 tokens):**
72
+ ```
73
+ pytest test_math.py --json-report ... → cwd: project
74
+ 2 failed, 1 passed
75
+ test_math.py AssertionError: assert (3 * 4) == 99
76
+ test_math.py ZeroDivisionError: division by zero
77
+ ```
78
+
79
+ ## How it works
80
+
81
+ 1. The agent runs commands through `structured_return` instead of `bash`.
82
+ 2. Full output is captured and stored as a log.
83
+ 3. A parser converts noisy CLI output into a compact structured result. If no parser matches, the last 200 lines and the log path are returned as a fallback.
84
+ 4. The agent receives the structured result in context — signal only, no noise.
85
+ 5. The full log is always available on disk for both the agent and humans to inspect.
86
+
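The fallback in step 3 can be sketched in a few lines. This is a minimal illustration, not the package's implementation: `tailFallback`, `FallbackResult`, and the `tail` field are hypothetical names. Only the 200-line bound, the `error` status, and the log path in the summary come from the README and the test names in this diff.

```typescript
// Hypothetical result shape; only `status`, `summary`, and `logPath`
// are suggested by the source, and `tail` is an assumed field name.
type FallbackResult = {
  status: "error";
  summary: string;
  tail: string[];
  logPath: string;
};

const MAX_TAIL_LINES = 200;

// Keep only the last `maxLines` lines of captured output and point at the full log.
function tailFallback(
  output: string,
  logPath: string,
  maxLines: number = MAX_TAIL_LINES,
): FallbackResult {
  const lines = output.split(/\r?\n/);
  // A trailing newline leaves one empty element at the end; drop it.
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  // slice(-0) would return the whole array, so guard the zero case.
  const tail = maxLines > 0 ? lines.slice(-maxLines) : [];
  return {
    status: "error",
    summary: `no parser matched; last ${tail.length} lines shown, full log at ${logPath}`,
    tail,
    logPath,
  };
}
```

With 500 lines of output, the result holds lines 301 through 500, and the summary still tells the agent where the full log is.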
37
87
  ## Agentic loops
38
88
 
39
89
  The token table above measures a single run. In an agentic loop the cost compounds — every tool result accumulates in context for the life of the task.
40
90
 
41
- This applies to any loop: fixing a failing test suite, implementing a feature end-to-end, working through a migration, performance tuning execution times. The agent runs a command, reads the result, makes a change, runs it again. Each iteration adds another tool result to the window. With a noisy CLI that means paying for the same verbose boilerplate on every pass, and the agent has to hold all of it to reason about what changed.
91
+ This applies to any loop: fixing a failing test suite, implementing a feature end-to-end, working through a migration, performance tuning execution times. The agent runs a command, reads the result, makes a change, runs it again. Each iteration adds another tool result to the context window. With a noisy CLI that means paying for the same verbose boilerplate every time.
42
92
 
43
- A parser converts each result to a one- or two-line signal. Over 15 iterations, the difference isn't 80 tokens vs 15 tokens — it's 1,200 tokens vs 225, on a single command, for a single task.
93
+ A parser reduces each run to a one- or two-line signal. Over 15 iterations the difference isn't 80 tokens vs 15 tokens — it's 1,200 tokens vs 225 for a single command in a single task.
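The arithmetic behind those figures, as a small sketch. It assumes a simple model in which every tool result stays in context for the rest of the task and each one costs the same number of tokens; `cumulativeTokens` is a hypothetical helper, not part of the package.

```typescript
// Total tokens held in context after `iterations` runs of the same command,
// assuming each result costs `tokensPerResult` and none are evicted.
function cumulativeTokens(tokensPerResult: number, iterations: number): number {
  return tokensPerResult * iterations;
}

const iterations = 15;
const raw = cumulativeTokens(80, iterations); // noisy CLI output
const parsed = cumulativeTokens(15, iterations); // parser output

console.log(`raw: ${raw}, parsed: ${parsed}, saved: ${raw - parsed}`);
// prints "raw: 1200, parsed: 225, saved: 975"
```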
44
94
 
45
- ## Project-local extension
95
+ ## Extending with project-local parsers
46
96
 
47
97
  Built-in parsers cover common tools. For everything else — internal CLIs, custom test runners, proprietary lint tools — add a `.pi/structured-return.json` to your project root.
48
98
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@robhowley/pi-structured-return",
3
- "version": "0.1.0",
3
+ "version": "1.0.2",
4
4
  "description": "Structured command execution for Pi agents: compact results for the model, full logs for humans.",
5
5
  "license": "MIT",
6
6
  "repository": {
@@ -43,7 +43,7 @@
43
43
  "lint": "eslint 'extensions/structured-return/src/**/*.ts'",
44
44
  "format:check": "prettier --check 'extensions/structured-return/src/**/*.ts'",
45
45
  "format": "prettier --write 'extensions/structured-return/src/**/*.ts'",
46
- "prepare": "husky"
46
+ "prepare": "node -e \"if (process.env.CI || process.env.GITHUB_ACTIONS) process.exit(0)\" || husky"
47
47
  },
48
48
  "dependencies": {
49
49
  "fast-xml-parser": "^5.4.1"
@@ -1,18 +0,0 @@
1
- {
2
- "parsers": [
3
- {
4
- "id": "acme-junit",
5
- "match": {
6
- "argvIncludes": ["acme", "test"]
7
- },
8
- "parseAs": "junit-xml"
9
- },
10
- {
11
- "id": "foo-json",
12
- "match": {
13
- "argvIncludes": ["foo-cli", "check"]
14
- },
15
- "module": "parsers/foo-cli.js"
16
- }
17
- ]
18
- }
@@ -1,25 +0,0 @@
1
- import fs from "node:fs";
2
- import type { RunContext } from "../../../src/types.js";
3
-
4
- type FooError = { id?: string; file?: string; line?: number; message?: string };
5
- type FooResult = { ok: boolean; errors?: FooError[] };
6
-
7
- export default {
8
- id: "foo-json",
9
- async parse(ctx: RunContext) {
10
- const stdout = fs.readFileSync(ctx.stdoutPath, "utf8");
11
- const data = JSON.parse(stdout) as FooResult;
12
- return {
13
- tool: "foo-cli",
14
- status: data.ok ? "pass" : "fail",
15
- summary: data.ok ? "foo-cli passed" : `${data.errors?.length ?? 0} foo-cli errors`,
16
- failures: (data.errors ?? []).map((e: FooError, i: number) => ({
17
- id: e.id ?? `error-${i + 1}`,
18
- file: e.file,
19
- line: e.line,
20
- message: e.message,
21
- })),
22
- logPath: ctx.logPath,
23
- };
24
- },
25
- };
@@ -1,55 +0,0 @@
1
- <?xml version="1.0" encoding="UTF-8" ?>
2
- <testsuites name="vitest tests" tests="21" failures="0" errors="0" time="0.014556541">
3
- <testsuite name="index.test.ts" timestamp="2026-03-16T17:34:52.759Z" hostname="Roberts-MBP" tests="9" failures="0" errors="0" skipped="0" time="0.002309">
4
- <testcase classname="index.test.ts" name="stripCdPrefix &gt; strips cd /path &amp;&amp; prefix" time="0.000854833">
5
- </testcase>
6
- <testcase classname="index.test.ts" name="stripCdPrefix &gt; leaves commands without cd unchanged" time="0.000108208">
7
- </testcase>
8
- <testcase classname="index.test.ts" name="stripCdPrefix &gt; handles paths with no trailing space variations" time="0.000063708">
9
- </testcase>
10
- <testcase classname="index.test.ts" name="formatResult &gt; includes cwd when set" time="0.000125458">
11
- </testcase>
12
- <testcase classname="index.test.ts" name="formatResult &gt; omits cwd line when not set" time="0.000084875">
13
- </testcase>
14
- <testcase classname="index.test.ts" name="formatResult &gt; renders relative paths in failure lines" time="0.000090792">
15
- </testcase>
16
- <testcase classname="index.test.ts" name="finalizeResult &gt; status error with exit code 0 flips to pass" time="0.000076667">
17
- </testcase>
18
- <testcase classname="index.test.ts" name="finalizeResult &gt; status error with non-zero exit code stays error" time="0.000056917">
19
- </testcase>
20
- <testcase classname="index.test.ts" name="finalizeResult &gt; attaches cwd to result" time="0.000058875">
21
- </testcase>
22
- </testsuite>
23
- <testsuite name="parsers/eslint-json.test.ts" timestamp="2026-03-16T17:34:52.759Z" hostname="Roberts-MBP" tests="3" failures="0" errors="0" skipped="0" time="0.003122">
24
- <testcase classname="parsers/eslint-json.test.ts" name="eslint-json parser &gt; multiple errors across multiple files → correct relative paths, correct failure count, status fail" time="0.001832041">
25
- </testcase>
26
- <testcase classname="parsers/eslint-json.test.ts" name="eslint-json parser &gt; no errors → empty failures, status pass" time="0.000366459">
27
- </testcase>
28
- <testcase classname="parsers/eslint-json.test.ts" name="eslint-json parser &gt; empty stdout → no crash, status pass" time="0.000289">
29
- </testcase>
30
- </testsuite>
31
- <testsuite name="parsers/pytest-json-report.test.ts" timestamp="2026-03-16T17:34:52.760Z" hostname="Roberts-MBP" tests="4" failures="0" errors="0" skipped="0" time="0.003528333">
32
- <testcase classname="parsers/pytest-json-report.test.ts" name="pytest-json-report parser &gt; mix of passed and failed tests → correct counts, status fail" time="0.001658708">
33
- </testcase>
34
- <testcase classname="parsers/pytest-json-report.test.ts" name="pytest-json-report parser &gt; all passing → status pass, summary reflects passed count" time="0.00033575">
35
- </testcase>
36
- <testcase classname="parsers/pytest-json-report.test.ts" name="pytest-json-report parser &gt; failed test with longrepr → first line surfaced as message" time="0.000256959">
37
- </testcase>
38
- <testcase classname="parsers/pytest-json-report.test.ts" name="pytest-json-report parser &gt; missing artifact file → throws" time="0.000615667">
39
- </testcase>
40
- </testsuite>
41
- <testsuite name="parsers/ruff-json.test.ts" timestamp="2026-03-16T17:34:52.760Z" hostname="Roberts-MBP" tests="3" failures="0" errors="0" skipped="0" time="0.003317375">
42
- <testcase classname="parsers/ruff-json.test.ts" name="ruff-json parser &gt; multiple errors across multiple files → correct relative paths, rule code mapped to rule, status fail" time="0.00188625">
43
- </testcase>
44
- <testcase classname="parsers/ruff-json.test.ts" name="ruff-json parser &gt; no errors → empty failures, status pass" time="0.000399625">
45
- </testcase>
46
- <testcase classname="parsers/ruff-json.test.ts" name="ruff-json parser &gt; empty stdout → no crash, status pass" time="0.00028175">
47
- </testcase>
48
- </testsuite>
49
- <testsuite name="parsers/tail-fallback.test.ts" timestamp="2026-03-16T17:34:52.760Z" hostname="Roberts-MBP" tests="2" failures="0" errors="0" skipped="0" time="0.002279833">
50
- <testcase classname="parsers/tail-fallback.test.ts" name="tail-fallback parser &gt; any command → status error, summary contains log path" time="0.001251">
51
- </testcase>
52
- <testcase classname="parsers/tail-fallback.test.ts" name="tail-fallback parser &gt; long stdout → tail is bounded to last 200 lines" time="0.000405417">
53
- </testcase>
54
- </testsuite>
55
- </testsuites>
@@ -1 +0,0 @@
1
- {"version":"4.1.0","results":[[":parsers/pytest-json-report.test.ts",{"duration":8.454000000000008,"failed":false}],[":parsers/ruff-json.test.ts",{"duration":2.925207999999998,"failed":false}],[":index.test.ts",{"duration":2.2829590000000053,"failed":false}],[":parsers/eslint-json.test.ts",{"duration":3.558458999999999,"failed":false}],[":parsers/tail-fallback.test.ts",{"duration":3.911292000000003,"failed":false}],[":parsers/vitest-json.test.ts",{"duration":7.248625000000004,"failed":false}],[":parsers/rspec-json.test.ts",{"duration":7.745584000000008,"failed":false}],[":parsers/junit-xml.test.ts",{"duration":6.386042000000003,"failed":false}],[":parsers/minitest-text.test.ts",{"duration":6.155249999999995,"failed":false}]]}