@davidorex/pi-behavior-monitors 0.1.2

package/CHANGELOG.md ADDED
@@ -0,0 +1,61 @@
+ # Changelog
+
+ All notable changes to this project will be documented in this file.
+
+ ## v0.1.2
+
+ [compare changes](https://github.com/davidorex/pi-behavior-monitors/compare/v0.1.1...v0.1.2)
+
+ ### 🚀 Enhancements
+
+ - Unify monitor management under /monitors command with subcommand routing ([dbffaa0](https://github.com/davidorex/pi-behavior-monitors/commit/dbffaa0))
+ - Add vitest test suite for pure functions in index.ts ([fed0d75](https://github.com/davidorex/pi-behavior-monitors/commit/fed0d75))
+
+ ### 🩹 Fixes
+
+ - Address conformance audit findings — unused param, session_switch, headless escalate ([f2d3baa](https://github.com/davidorex/pi-behavior-monitors/commit/f2d3baa))
+
+ ### 🏡 Chore
+
+ - Add .claude/ to gitignore, version conformance audit in docs/ ([d6a8395](https://github.com/davidorex/pi-behavior-monitors/commit/d6a8395))
+
+ ### ❤️ Contributors
+
+ - David Ryan <davidryan@gmail.com>
+
+ ## v0.1.1
+
+ [compare changes](https://github.com/davidorex/pi-behavior-monitors/compare/v0.1.0...v0.1.1)
+
+ ### 📖 Documentation
+
+ - Add CLAUDE.md with project conventions ([5f5b427](https://github.com/davidorex/pi-behavior-monitors/commit/5f5b427))
+ - Expand SKILL.md to cover full runtime behavior and bundled monitors ([5c9980d](https://github.com/davidorex/pi-behavior-monitors/commit/5c9980d))
+
+ ### 🏡 Chore
+
+ - Add npm publish metadata, files whitelist, and normalize repository URL ([4b3f1f4](https://github.com/davidorex/pi-behavior-monitors/commit/4b3f1f4))
+ - Add .gitignore, remove runtime .workflow/ from tracking ([c1d4ae5](https://github.com/davidorex/pi-behavior-monitors/commit/c1d4ae5))
+
+ ### ❤️ Contributors
+
+ - David Ryan <davidryan@gmail.com>
+
+ ## v0.1.0
+
+ Initial release.
+
+ ### Added
+
+ - Monitor extension with event-driven classification (message_end, turn_end, agent_end, command)
+ - JSON-based monitor definitions (.monitor.json), pattern libraries (.patterns.json), instructions (.instructions.json)
+ - Side-channel LLM classification with CLEAN/FLAG/NEW verdict protocol
+ - Auto-learning of new patterns from runtime detection
+ - Write action for structured JSON findings output
+ - Scope targeting (main, subagent, all, workflow)
+ - Bundled monitors: fragility, hedge, work-quality
+ - Slash commands: /monitors, /<name>, /<name> <instruction>
+ - Status bar integration showing engaged/dismissed monitors
+ - Escalation with ceiling + ask/dismiss
+ - SKILL.md for LLM-assisted monitor creation
+ - JSON schemas for monitor definitions and patterns
package/README.md ADDED
@@ -0,0 +1,59 @@
+ # pi-behavior-monitors
+
+ Behavior monitors for [pi](https://github.com/badlogic/pi-mono) that watch agent activity, classify against pattern libraries, steer corrections, and write structured findings to JSON files.
+
+ Monitors are JSON files (`.monitor.json`) with typed blocks: classify (LLM side-channel), patterns (JSON library), actions (steer + write to JSON), and scope (main/subagent/workflow targeting).
+
+ ## Install
+
+ ```bash
+ pi install npm:pi-behavior-monitors
+ ```
+
+ On first run, if no monitors exist in your project, example monitors are seeded into `.pi/monitors/`. Edit or delete them to customize.
+
+ ## Bundled Example Monitors
+
+ - **fragility** — detects when the agent leaves broken state behind (errors it noticed but didn't fix, TODO comments instead of solutions, empty catch blocks). Writes findings to `.workflow/gaps.json`.
+ - **hedge** — detects when the agent deviates from what the user actually said (rephrasing questions, assuming intent, deflecting with counter-questions)
+ - **work-quality** — on-demand audit of work quality (trial-and-error, not reading before editing, fixing symptoms instead of root causes). Invoked via `/work-quality`. Writes findings to `.workflow/gaps.json`.
+
+ ## File Structure
+
+ Each monitor is a triad of JSON files:
+
+ ```
+ .pi/monitors/
+ ├── fragility.monitor.json # Definition (classify + patterns + actions + scope)
+ ├── fragility.patterns.json # Known patterns (grows automatically)
+ └── fragility.instructions.json # User corrections (optional)
+ ```
+
+ ## Writing Your Own
+
+ Create a `.monitor.json` file in `.pi/monitors/` conforming to `schemas/monitor.schema.json`. Ask the LLM to read the `pi-behavior-monitors` skill for the full schema and examples.
+
+ ## Commands
+
+ | Command | Description |
+ |---------|-------------|
+ | `/monitors` | List all monitors, scope, and state |
+ | `/<name>` | Show monitor patterns and instructions |
+ | `/<name> <text>` | Add an instruction to calibrate the monitor |
+
+ ## How It Works
+
+ 1. A monitor fires on a configured event (e.g., after each assistant message)
+ 2. It checks scope (main context, subagent, workflow) and activation conditions
+ 3. It collects relevant conversation context (tool results, assistant text, etc.)
+ 4. A side-channel LLM call classifies the context against the JSON pattern library
+ 5. Based on the verdict, the monitor executes actions:
+    - **steer**: inject a correction message into the conversation (main scope only)
+    - **write**: append structured findings to a JSON file (any scope)
+    - **learn**: add new patterns to the library automatically
+ 6. Downstream workflows can consume the JSON findings (e.g., gaps.json → verify step → gate)
+
+ ## Schemas
+
+ - `schemas/monitor.schema.json` — monitor definition format
+ - `schemas/monitor-pattern.schema.json` — pattern library entry format
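Step 5 of the README's "How It Works" list picks an action block based on the verdict. The following is a minimal TypeScript sketch of that dispatch, assuming the `actions` shape used by the bundled monitor definitions in this diff (`on_flag`, `on_new`, `on_clean`). The names `Verdict`, `ActionBlock`, `Actions`, and `selectActions` are hypothetical and are not taken from the package source.

```typescript
// Hypothetical types modeled on the "actions" object of a .monitor.json file.
type Verdict = "CLEAN" | "FLAG" | "NEW";

interface ActionBlock {
  steer?: string; // correction message injected into the conversation
  learn_pattern?: boolean; // whether a NEW pattern is added to the library
  write?: { path: string }; // findings target (other fields omitted here)
}

interface Actions {
  on_flag: ActionBlock | null;
  on_new: ActionBlock | null;
  on_clean: ActionBlock | null;
}

// Pick the action block for a verdict; null means "do nothing".
function selectActions(actions: Actions, verdict: Verdict): ActionBlock | null {
  switch (verdict) {
    case "FLAG":
      return actions.on_flag;
    case "NEW":
      return actions.on_new;
    case "CLEAN":
      return actions.on_clean;
  }
}

// Values copied from the bundled hedge monitor definition.
const hedgeActions: Actions = {
  on_flag: { steer: "Address what the user actually said." },
  on_new: { steer: "Address what the user actually said.", learn_pattern: true },
  on_clean: null,
};

console.log(selectActions(hedgeActions, "NEW")?.learn_pattern); // true
console.log(selectActions(hedgeActions, "CLEAN")); // null
```

Keeping the selection as a pure function leaves the side effects (steering, writing, learning) to the caller, which makes the mapping easy to unit test.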
@@ -0,0 +1 @@
+ []
@@ -0,0 +1,62 @@
+ {
+   "name": "fragility",
+   "description": "Detects unaddressed fragilities after tool use",
+   "event": "message_end",
+   "when": "has_tool_results",
+   "scope": {
+     "target": "main"
+   },
+   "classify": {
+     "model": "claude-sonnet-4-20250514",
+     "context": ["tool_results", "assistant_text"],
+     "excludes": [],
+     "prompt": "An agent just performed actions and responded. Determine if it left known\nfragilities — errors, warnings, or broken state it noticed but chose not\nto fix, expecting someone else to deal with them.\n\nRecent tool outputs the agent saw:\n{tool_results}\n\nThe agent then said:\n\"{assistant_text}\"\n\n{instructions}\n\nFragility patterns to check:\n{patterns}\n\nReply CLEAN if the agent addressed problems it encountered or if no\nproblems were present.\nReply FLAG:<one sentence describing the fragility left behind> if a\nknown pattern was matched.\nReply NEW:<new pattern to add>|<one sentence describing the fragility\nleft behind> if the agent left a fragility not covered by existing patterns."
+   },
+   "patterns": {
+     "path": "fragility.patterns.json",
+     "learn": true
+   },
+   "instructions": {
+     "path": "fragility.instructions.json"
+   },
+   "actions": {
+     "on_flag": {
+       "steer": "Fix the issue you left behind.",
+       "write": {
+         "path": ".workflow/gaps.json",
+         "schema": "schemas/gaps.schema.json",
+         "merge": "append",
+         "array_field": "gaps",
+         "template": {
+           "id": "fragility-{finding_id}",
+           "description": "{description}",
+           "status": "open",
+           "category": "fragility",
+           "priority": "{severity}",
+           "source": "monitor"
+         }
+       }
+     },
+     "on_new": {
+       "steer": "Fix the issue you left behind.",
+       "learn_pattern": true,
+       "write": {
+         "path": ".workflow/gaps.json",
+         "schema": "schemas/gaps.schema.json",
+         "merge": "append",
+         "array_field": "gaps",
+         "template": {
+           "id": "fragility-{finding_id}",
+           "description": "{description}",
+           "status": "open",
+           "category": "fragility",
+           "priority": "warning",
+           "source": "monitor"
+         }
+       }
+     },
+     "on_clean": null
+   },
+   "ceiling": 5,
+   "escalate": "ask"
+ }
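The `prompt` in the definition above asks the classifier to reply `CLEAN`, `FLAG:<sentence>`, or `NEW:<pattern>|<sentence>`. A parser for that reply format might look like the sketch below. `ParsedVerdict` and `parseVerdict` are hypothetical names, and treating an unrecognized reply as `CLEAN` is an assumption; the package may handle malformed replies differently.

```typescript
// Hypothetical result type for the CLEAN / FLAG / NEW reply protocol.
type ParsedVerdict =
  | { verdict: "CLEAN" }
  | { verdict: "FLAG"; description: string }
  | { verdict: "NEW"; pattern: string; description: string };

// Parse a classifier reply. Unrecognized replies fall through to CLEAN,
// which is an assumption made for this sketch.
function parseVerdict(reply: string): ParsedVerdict {
  const text = reply.trim();
  if (text.startsWith("FLAG:")) {
    return { verdict: "FLAG", description: text.slice("FLAG:".length).trim() };
  }
  if (text.startsWith("NEW:")) {
    const body = text.slice("NEW:".length);
    const bar = body.indexOf("|");
    // Without a "|" separator, reuse the whole body as pattern and description.
    const pattern = (bar === -1 ? body : body.slice(0, bar)).trim();
    const description = (bar === -1 ? body : body.slice(bar + 1)).trim();
    return { verdict: "NEW", pattern, description };
  }
  return { verdict: "CLEAN" };
}

console.log(parseVerdict("FLAG: Left a failing test unaddressed."));
console.log(parseVerdict("NEW:Ignoring lint warnings|Skipped two lint warnings."));
console.log(parseVerdict("CLEAN"));
```

The discriminated union lets callers narrow on `verdict` before reading `pattern` or `description`, so a `CLEAN` result cannot be mistaken for one that carries a finding.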
@@ -0,0 +1,86 @@
+ [
+   {
+     "id": "dismiss-preexisting",
+     "description": "Dismissing errors as pre-existing instead of fixing them",
+     "severity": "warning",
+     "category": "avoidance",
+     "source": "bundled"
+   },
+   {
+     "id": "empty-catch",
+     "description": "Silently catching exceptions with empty catch blocks",
+     "severity": "error",
+     "category": "error-handling",
+     "source": "bundled"
+   },
+   {
+     "id": "todo-instead-of-fix",
+     "description": "Adding TODO or FIXME comments instead of solving the problem now",
+     "severity": "warning",
+     "category": "deferral",
+     "source": "bundled"
+   },
+   {
+     "id": "happy-path-only",
+     "description": "Writing code that assumes happy path without handling failure cases",
+     "severity": "warning",
+     "category": "error-handling",
+     "source": "bundled"
+   },
+   {
+     "id": "not-my-change",
+     "description": "Leaving known broken state because 'it's not my change'",
+     "severity": "warning",
+     "category": "avoidance",
+     "source": "bundled"
+   },
+   {
+     "id": "early-return-on-unexpected",
+     "description": "Returning early or skipping logic when an unexpected condition is hit instead of handling it",
+     "severity": "warning",
+     "category": "error-handling",
+     "source": "bundled"
+   },
+   {
+     "id": "undocumented-delegation",
+     "description": "Deferring error handling to the caller without documenting or enforcing it",
+     "severity": "warning",
+     "category": "error-handling",
+     "source": "bundled"
+   },
+   {
+     "id": "silent-fallback",
+     "description": "Using fallback values that mask failures silently (returning empty string, null, undefined on error)",
+     "severity": "warning",
+     "category": "error-handling",
+     "source": "bundled"
+   },
+   {
+     "id": "prose-without-action",
+     "description": "Noting a problem in prose but not acting on it in code",
+     "severity": "warning",
+     "category": "deferral",
+     "source": "bundled"
+   },
+   {
+     "id": "blame-environment",
+     "description": "Blaming the environment or dependencies instead of working around or fixing the issue",
+     "severity": "warning",
+     "category": "avoidance",
+     "source": "bundled"
+   },
+   {
+     "id": "workaround-over-root-cause",
+     "description": "Identifying architectural inefficiencies but implementing workarounds instead of fixing the root cause",
+     "severity": "warning",
+     "category": "avoidance",
+     "source": "bundled"
+   },
+   {
+     "id": "elaborate-workaround-for-fixable",
+     "description": "Documenting a known dangerous state and designing elaborate workarounds instead of fixing the root cause",
+     "severity": "error",
+     "category": "avoidance",
+     "source": "bundled"
+   }
+ ]
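Every entry in the pattern library above carries the same five string fields. A typed loader that drops malformed entries could be sketched as follows. The field names come from the entries themselves, and the three severity values are the ones that appear across the bundled libraries in this diff; `schemas/monitor-pattern.schema.json` may allow more. `Pattern`, `isPattern`, and `loadPatterns` are hypothetical names, not the package's API.

```typescript
// Severity values observed in the bundled libraries (the schema may differ).
type Severity = "info" | "warning" | "error";

interface Pattern {
  id: string;
  description: string;
  severity: Severity;
  category: string;
  source: string; // "bundled" for shipped entries
}

// Runtime type guard for one library entry.
function isPattern(value: unknown): value is Pattern {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const severity = v["severity"];
  return (
    typeof v["id"] === "string" &&
    typeof v["description"] === "string" &&
    (severity === "info" || severity === "warning" || severity === "error") &&
    typeof v["category"] === "string" &&
    typeof v["source"] === "string"
  );
}

// Parse a patterns file body and keep only well-formed entries.
function loadPatterns(json: string): Pattern[] {
  const parsed: unknown = JSON.parse(json);
  return Array.isArray(parsed) ? parsed.filter(isPattern) : [];
}

const sample = JSON.stringify([
  {
    id: "empty-catch",
    description: "Silently catching exceptions with empty catch blocks",
    severity: "error",
    category: "error-handling",
    source: "bundled",
  },
  { id: "broken-entry" }, // missing fields, so it is filtered out
]);

console.log(loadPatterns(sample).map((p) => p.id));
```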
@@ -0,0 +1 @@
+ []
@@ -0,0 +1,34 @@
+ {
+   "name": "hedge",
+   "description": "Detects when assistant deviates from what the user said",
+   "event": "turn_end",
+   "when": "always",
+   "scope": {
+     "target": "main"
+   },
+   "classify": {
+     "model": "claude-sonnet-4-20250514",
+     "context": ["user_text", "tool_calls", "custom_messages", "assistant_text"],
+     "excludes": ["fragility"],
+     "prompt": "The user said:\n\"{user_text}\"\n\n{tool_calls}\n{custom_messages}\n\nThe assistant's latest response:\n\"{assistant_text}\"\n\n{instructions}\n\nGiven the full context of what the user asked and what the assistant did,\ndid the assistant deviate from what the user actually said in its latest\nresponse?\n\nIf the user's request has been addressed by the actions taken, the\nassistant summarizing that completed work is not a deviation.\n\nCheck against these patterns:\n{patterns}\n\nReply CLEAN if the assistant stuck to what the user actually said.\nReply FLAG:<one sentence, what was added or substituted> if a known\npattern was matched.\nReply NEW:<new pattern to add>|<one sentence, what was added or\nsubstituted> if the assistant deviated in a way not covered by\nexisting patterns."
+   },
+   "patterns": {
+     "path": "hedge.patterns.json",
+     "learn": true
+   },
+   "instructions": {
+     "path": "hedge.instructions.json"
+   },
+   "actions": {
+     "on_flag": {
+       "steer": "Address what the user actually said."
+     },
+     "on_new": {
+       "steer": "Address what the user actually said.",
+       "learn_pattern": true
+     },
+     "on_clean": null
+   },
+   "ceiling": 3,
+   "escalate": "ask"
+ }
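The definition above sets `"ceiling": 3` and `"escalate": "ask"`, and the changelog lists "Escalation with ceiling + ask/dismiss", but this diff does not spell out the exact semantics. The sketch below shows one plausible reading, purely as an assumption: the monitor steers until it has steered `ceiling` times, then either asks the user or dismisses itself. All names (`MonitorState`, `onNonCleanVerdict`, the outcome labels) are hypothetical.

```typescript
// Hypothetical model of "ceiling" and "escalate". The real semantics are not
// documented in this diff, so every rule below is an assumption.
type Escalate = "ask" | "dismiss";
type Outcome = "steer" | "ask_user" | "dismissed";

interface MonitorState {
  steers: number; // steers issued so far
  dismissed: boolean; // a dismissed monitor no longer steers
}

function onNonCleanVerdict(state: MonitorState, ceiling: number, escalate: Escalate): Outcome {
  if (state.dismissed) return "dismissed";
  if (state.steers < ceiling) {
    state.steers += 1;
    return "steer";
  }
  // Ceiling reached: stop steering and escalate instead.
  if (escalate === "dismiss") {
    state.dismissed = true;
    return "dismissed";
  }
  return "ask_user";
}

// With the hedge monitor's values (ceiling 3, escalate "ask"):
const state: MonitorState = { steers: 0, dismissed: false };
const outcomes: Outcome[] = [1, 2, 3, 4].map(() => onNonCleanVerdict(state, 3, "ask"));
console.log(outcomes.join(",")); // steer,steer,steer,ask_user
```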
@@ -0,0 +1,10 @@
+ [
+   { "id": "rephrase-question", "description": "Rephrasing the user's question into a different question and answering that instead", "severity": "warning", "category": "substitution", "source": "bundled" },
+   { "id": "assume-intent", "description": "Assuming intent the user did not express", "severity": "warning", "category": "projection", "source": "bundled" },
+   { "id": "add-questions", "description": "Adding questions the user did not ask", "severity": "warning", "category": "augmentation", "source": "bundled" },
+   { "id": "reinterpret-words", "description": "Interpreting the user's words as meaning something other than what they said", "severity": "warning", "category": "substitution", "source": "bundled" },
+   { "id": "attribute-position", "description": "Attributing a position or preference the user did not state", "severity": "warning", "category": "projection", "source": "bundled" },
+   { "id": "ask-permission", "description": "Asking permission to do something instead of doing it when the user asked a direct question", "severity": "warning", "category": "deflection", "source": "bundled" },
+   { "id": "qualify-yesno", "description": "Answering a yes/no question with qualifiers instead of yes or no", "severity": "info", "category": "deflection", "source": "bundled" },
+   { "id": "counter-question", "description": "Deflecting with a counter-question when the user expected an answer", "severity": "warning", "category": "deflection", "source": "bundled" }
+ ]
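The monitor prompts in this diff contain placeholders such as `{user_text}`, `{instructions}`, and `{patterns}`. The sketch below shows how such a template could be filled from a pattern library like the one above. How the package actually renders `{patterns}` is not shown in this diff, so the one-bullet-per-description format is an assumption, and `fillPrompt` is a hypothetical helper.

```typescript
// Replace {name} placeholders with supplied values. Placeholders without a
// supplied value are left untouched.
function fillPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) => values[key] ?? match);
}

// Two entries from the bundled hedge library.
const patterns = [
  { id: "assume-intent", description: "Assuming intent the user did not express" },
  { id: "add-questions", description: "Adding questions the user did not ask" },
];

// Assumed rendering: one "- description" line per pattern.
const patternList = patterns.map((p) => `- ${p.description}`).join("\n");

const prompt = fillPrompt(
  'The user said:\n"{user_text}"\n\nCheck against these patterns:\n{patterns}',
  { user_text: "Does the build pass?", patterns: patternList },
);

console.log(prompt);
```

Using a replacer function rather than a replacement string means characters such as `$` in user text are inserted literally.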
@@ -0,0 +1,62 @@
+ {
+   "name": "work-quality",
+   "description": "On-demand work quality analysis",
+   "event": "command",
+   "when": "always",
+   "scope": {
+     "target": "main"
+   },
+   "classify": {
+     "model": "claude-sonnet-4-20250514",
+     "context": ["user_text", "tool_calls", "assistant_text"],
+     "excludes": [],
+     "prompt": "An agent was asked:\n\"{user_text}\"\n\nIt performed these actions:\n{tool_calls}\n\nThen it said:\n\"{assistant_text}\"\n\n{instructions}\n\nAnalyze the quality of the work. Check against these patterns:\n{patterns}\n\nReply CLEAN if the work was sound.\nReply FLAG:<one sentence describing the quality issue> if a known\npattern was matched.\nReply NEW:<new pattern to add>|<one sentence describing the quality\nissue> if there's a work quality problem not covered by existing patterns."
+   },
+   "patterns": {
+     "path": "work-quality.patterns.json",
+     "learn": true
+   },
+   "instructions": {
+     "path": "work-quality.instructions.json"
+   },
+   "actions": {
+     "on_flag": {
+       "steer": "Fix the quality issue.",
+       "write": {
+         "path": ".workflow/gaps.json",
+         "schema": "schemas/gaps.schema.json",
+         "merge": "append",
+         "array_field": "gaps",
+         "template": {
+           "id": "quality-{finding_id}",
+           "description": "{description}",
+           "status": "open",
+           "category": "work-quality",
+           "priority": "{severity}",
+           "source": "monitor"
+         }
+       }
+     },
+     "on_new": {
+       "steer": "Fix the quality issue.",
+       "learn_pattern": true,
+       "write": {
+         "path": ".workflow/gaps.json",
+         "schema": "schemas/gaps.schema.json",
+         "merge": "append",
+         "array_field": "gaps",
+         "template": {
+           "id": "quality-{finding_id}",
+           "description": "{description}",
+           "status": "open",
+           "category": "work-quality",
+           "priority": "warning",
+           "source": "monitor"
+         }
+       }
+     },
+     "on_clean": null
+   },
+   "ceiling": 3,
+   "escalate": "ask"
+ }
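The `write` blocks above combine a `template` with `"merge": "append"` and `"array_field": "gaps"`. One way to read that configuration is sketched below: fill the template's placeholders, then push the result onto the named array. This is an in-memory illustration only; file I/O and validation against `schemas/gaps.schema.json` are omitted. The names `renderTemplate` and `appendFinding` and the `finding_id` value `"1"` are hypothetical.

```typescript
// Hypothetical shapes based on the "write" block of the bundled monitors.
interface WriteAction {
  array_field: string;
  template: Record<string, string>;
}

type Finding = Record<string, string>;

// Fill {placeholder} tokens in every template value; unknown tokens are kept.
function renderTemplate(template: Record<string, string>, vars: Record<string, string>): Finding {
  const out: Finding = {};
  for (const key of Object.keys(template)) {
    const raw = template[key] ?? "";
    out[key] = raw.replace(/\{(\w+)\}/g, (m: string, name: string) => vars[name] ?? m);
  }
  return out;
}

// "merge": "append" is modeled as pushing onto the array named by array_field.
function appendFinding(
  doc: Record<string, Finding[]>,
  action: WriteAction,
  vars: Record<string, string>,
): Record<string, Finding[]> {
  const existing = doc[action.array_field] ?? [];
  return { ...doc, [action.array_field]: [...existing, renderTemplate(action.template, vars)] };
}

// Template copied from the work-quality monitor's on_flag action.
const action: WriteAction = {
  array_field: "gaps",
  template: {
    id: "quality-{finding_id}",
    description: "{description}",
    status: "open",
    category: "work-quality",
    priority: "{severity}",
    source: "monitor",
  },
};

const updated = appendFinding({ gaps: [] }, action, {
  finding_id: "1", // illustrative value; the real id format is not shown in this diff
  description: "No test run after edits",
  severity: "error",
});

console.log(JSON.stringify(updated, null, 2));
```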
@@ -0,0 +1,13 @@
+ [
+   { "id": "trial-and-error", "description": "Trial-and-error instead of reading code to understand it first", "severity": "warning", "category": "methodology", "source": "bundled" },
+   { "id": "no-verify", "description": "Making changes without verifying them (no check/test run after edits)", "severity": "error", "category": "verification", "source": "bundled" },
+   { "id": "symptom-fix", "description": "Fixing symptoms instead of root causes", "severity": "warning", "category": "methodology", "source": "bundled" },
+   { "id": "excessive-changes", "description": "Changing more files than necessary to solve the problem", "severity": "warning", "category": "scope", "source": "bundled" },
+   { "id": "copy-paste", "description": "Copy-pasting code instead of extracting shared logic", "severity": "warning", "category": "quality", "source": "bundled" },
+   { "id": "debug-artifacts", "description": "Leaving debug artifacts (console.log, commented-out code, temporary files)", "severity": "warning", "category": "cleanup", "source": "bundled" },
+   { "id": "double-edit", "description": "Making an edit then immediately making another edit to the same file to fix the first edit", "severity": "info", "category": "methodology", "source": "bundled" },
+   { "id": "edit-without-read", "description": "Not reading a file before editing it", "severity": "error", "category": "methodology", "source": "bundled" },
+   { "id": "insanity-retry", "description": "Running a command, getting an error, and running the same command again expecting different results", "severity": "warning", "category": "methodology", "source": "bundled" },
+   { "id": "wrong-problem", "description": "Solving a different problem than the one that was asked about", "severity": "error", "category": "scope", "source": "bundled" },
+   { "id": "no-plan", "description": "Did not create a plan before starting work", "severity": "info", "category": "methodology", "source": "bundled" }
+ ]
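The monitor definitions in this diff set `"learn": true` and `"learn_pattern": true`, and the README notes that the pattern library "grows automatically". A sketch of what appending a learned pattern could look like follows. The diff does not show how learned entries are actually stored, so the `"learned"` values for `source` and `category`, the default `"warning"` severity, and the id derivation are all assumptions, as are the names `slugify` and `learnPattern`.

```typescript
interface Pattern {
  id: string;
  description: string;
  severity: "info" | "warning" | "error";
  category: string;
  source: string;
}

// Derive a kebab-case id from the description (an assumed convention).
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Append a learned pattern unless an entry with the same description exists.
// Returns a new array and leaves the input library unchanged.
function learnPattern(library: Pattern[], description: string): Pattern[] {
  const wanted = description.trim().toLowerCase();
  const exists = library.some((p) => p.description.trim().toLowerCase() === wanted);
  if (exists) return library;
  const learned: Pattern = {
    id: slugify(description),
    description: description.trim(),
    severity: "warning", // assumed default
    category: "learned", // assumed label
    source: "learned", // assumed counterpart to "bundled"
  };
  return [...library, learned];
}

const library: Pattern[] = [
  {
    id: "symptom-fix",
    description: "Fixing symptoms instead of root causes",
    severity: "warning",
    category: "methodology",
    source: "bundled",
  },
];

const grown = learnPattern(library, "Ignoring lint warnings");
console.log(grown.map((p) => `${p.id} (${p.source})`));
```

Deduplicating on the description keeps repeated `NEW` verdicts for the same behavior from inflating the library.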