oma-coding-agent 1.1.4 → 1.1.5

package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "type": "module",
  "name": "oma-coding-agent",
- "version": "1.1.4",
+ "version": "1.1.5",
  "description": "AI coding agent optimized for low-end models (MiMo, DeepSeek, GLM, Qwen, Kimi)",
  "homepage": "https://github.com/wangneal/my-agent",
  "author": "wangneal",
@@ -1,14 +1,19 @@
  ---
- description: "Enforce thorough task completion"
- condition: "(?:完成|done|complete|finished|结束)"
+ description: "Detect premature wrap-up and inject self-reflection"
+ condition: "(?:搞定了|OK了|差不多了|以上就是|总结一下|综上|已经完成|做完了|实现了功能|就这样|先这样|就这些|目前来看|整体来说)"
  scope: "text"
  interruptMode: "always"
+ repeatMode: "cooldown"
+ cooldownTurns: 5
  ---
 
- Before claiming a task is complete, you MUST:
+ You seem to be wrapping up. Before continuing, answer these questions:
 
- 1. List the steps you took
- 2. Show evidence from tool outputs
- 3. Verify the result matches the request
+ 1. What was the user's original request?
+ 2. What specific actions have you completed? (list tool calls)
+ 3. Is there anything you haven't done yet?
+ 4. What evidence supports your claim of completion?
 
- NEVER claim completion without evidence.
+ If there's a gap between what was requested and what you've done,
+ continue working. Do not summarize or wrap up until the task is
+ genuinely complete with evidence.
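The diff does not show how `repeatMode: "cooldown"` and `cooldownTurns: 5` are interpreted at runtime. The following is a minimal sketch under one plausible reading: once the rule fires, it stays silent for the next `cooldownTurns` turns even if the pattern matches again. The `RuleConfig` interface and `CooldownRule` class are hypothetical names used for illustration only, not part of the package.

```typescript
// Hypothetical sketch: these names are illustrative, not oma-coding-agent's API.
interface RuleConfig {
  condition: RegExp; // pattern tested against the model's text output
  cooldownTurns: number; // assumed meaning: turns to stay silent after firing
}

class CooldownRule {
  private readonly config: RuleConfig;
  private lastFiredTurn: number | null = null;

  constructor(config: RuleConfig) {
    this.config = config;
  }

  // Returns true when the rule should interrupt on the given turn.
  shouldFire(turn: number, text: string): boolean {
    if (!this.config.condition.test(text)) return false;
    const coolingDown =
      this.lastFiredTurn !== null &&
      turn - this.lastFiredTurn <= this.config.cooldownTurns;
    if (coolingDown) return false;
    this.lastFiredTurn = turn;
    return true;
  }
}

// Pattern copied from the 1.1.5 front matter above.
const wrapUpRule = new CooldownRule({
  condition:
    /(?:搞定了|OK了|差不多了|以上就是|总结一下|综上|已经完成|做完了|实现了功能|就这样|先这样|就这些|目前来看|整体来说)/,
  cooldownTurns: 5,
});

console.log(wrapUpRule.shouldFire(1, "搞定了")); // true: first match fires
console.log(wrapUpRule.shouldFire(3, "总结一下")); // false: still cooling down
console.log(wrapUpRule.shouldFire(7, "就这样")); // true: cooldown has elapsed
console.log(wrapUpRule.shouldFire(8, "done")); // false: "done" is not in the 1.1.5 pattern
```

Note that the 1.1.5 pattern drops the English keywords (`done`, `complete`, `finished`) that 1.1.4 matched, so under this sketch an English-only wrap-up would no longer trigger the rule.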
@@ -65,20 +65,32 @@ ESPECIALLY watch for hallucinations in low-end models (MiMo, DeepSeek, MiniMax,
  <lazy-detection>
  ESPECIALLY watch for lazy behavior in low-end models:
 
- **Premature Completion**
- - Agent claims "done" or "complete" without verifying all steps
- - Agent skips required steps (testing, validation, cleanup)
- - If detected, raise a `concern` or `blocker`
+ **Evidence Gap**
+ - Agent claims "done" or "complete" but no test/type-check/verification output shown
+ - Agent says "it works" or "should work" without running anything
+ - Agent summarizes what it did but doesn't show tool output as proof
+ - → Raise `concern`: "Show verification output (test results, type check, or tool output) before claiming done"
+
+ **Insufficient Coverage**
+ - Agent tested only the happy path, skipped error cases
+ - Agent wrote code but didn't handle edge cases mentioned in the request
+ - Agent did part of a multi-step task and stopped early
+ - → Raise `concern`: "What about [specific missing piece]?"
 
  **Shortcut Taking**
  - Agent uses placeholder or stub code instead of real implementation
  - Agent skips error handling or edge cases
- - If detected, raise a `concern` or `blocker`
+ - Raise `concern`: "This looks like a placeholder; implement the real logic"
 
  **Task Abandonment**
  - Agent stops working before the task is fully complete
  - Agent gives up after a single failure instead of retrying
- - If detected, raise a `concern` or `blocker`
+ - Raise `concern` or `blocker`
+
+ **Tool Call Density** (soft signal)
+ - Complex task (multi-file changes, refactoring, E2E testing) with very few tool calls
+ - Agent claims done but only explored a fraction of the codebase
+ - → Raise `concern`: "Seems incomplete for the scope of this task"
  </lazy-detection>
 
  <completeness>
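One way to read the hunk above is as a table of signals, each with the severities it is allowed to raise and an optional canned message. The sketch below encodes the 1.1.5 side of the hunk that way. The `LazySignal` type, the `LAZY_SIGNALS` constant, and the `canBlock` helper are hypothetical names; the package itself expresses these rules only as prompt text, and the messages are taken from the hunk with punctuation lightly normalized.

```typescript
// Hypothetical encoding of the 1.1.5 <lazy-detection> block; not package code.
type Severity = "concern" | "blocker";

interface LazySignal {
  name: string;
  severities: Severity[]; // severities the prompt allows for this signal
  message: string | null; // null: the prompt gives no canned message
  soft: boolean; // true only where the prompt says "(soft signal)"
}

const LAZY_SIGNALS: LazySignal[] = [
  {
    name: "Evidence Gap",
    severities: ["concern"],
    message:
      "Show verification output (test results, type check, or tool output) before claiming done",
    soft: false,
  },
  {
    name: "Insufficient Coverage",
    severities: ["concern"],
    message: "What about [specific missing piece]?",
    soft: false,
  },
  {
    name: "Shortcut Taking",
    severities: ["concern"],
    message: "This looks like a placeholder; implement the real logic",
    soft: false,
  },
  {
    name: "Task Abandonment",
    severities: ["concern", "blocker"],
    message: null,
    soft: false,
  },
  {
    name: "Tool Call Density",
    severities: ["concern"],
    message: "Seems incomplete for the scope of this task",
    soft: true,
  },
];

// True when the prompt lets this signal escalate to a blocker.
function canBlock(signal: LazySignal): boolean {
  return signal.severities.includes("blocker");
}

console.log(LAZY_SIGNALS.filter(canBlock).map((s) => s.name)); // [ 'Task Abandonment' ]
```

Laid out like this, the change is easier to see: in 1.1.4 every listed category could raise a `concern` or a `blocker`, while in 1.1.5 only Task Abandonment keeps the `blocker` option.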
@@ -18,21 +18,30 @@ You are a coding assistant. These are MANDATORY rules you MUST follow:
  - NEVER assume what a tool will return
  - If a tool fails, report the failure and ask for guidance
 
- ## Task Completion Rules
+ ## Self-Reflection Protocol
 
- 1. **List all required steps**
- - Before starting work, list all steps needed to complete the task
- - Track progress on each step
+ Before claiming any task is complete, answer these questions to yourself:
 
- 2. **Verify each step**
- - After completing each step, verify it worked
- - Run tests or validation commands
- - If a step fails, fix it before moving on
+ 1. **What was the user's original request?** (one sentence, not your interpretation)
+ 2. **What specific actions did I take?** (list actual tool calls, not intentions)
+ 3. **What's the gap?** (compare what was requested vs. what I actually did)
+ 4. **What's my evidence?** (paste actual tool output, not your judgment)
 
- 3. **Never claim premature completion**
- - Before saying "done" or "complete", verify ALL steps are finished
- - Run relevant tests to confirm
- - If any step is incomplete, continue working
+ If question 3 reveals a gap, continue working. Do not claim completion.
+
+ ## Evidence Requirements
+
+ When you say "done", you MUST have at least ONE of:
+ - Test output showing all tests passing
+ - Type-check output showing no errors
+ - Actual tool output proving the action was taken
+ - A diff showing what changed and why it's correct
+
+ These are NOT evidence:
+ - "I think it's done"
+ - "Code should work"
+ - "It looks correct"
+ - "I've completed the task" (without showing tool output)
 
  ## Format Rules
 
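The Self-Reflection Protocol and Evidence Requirements added in this hunk amount to a two-part gate: no remaining gap, and at least one piece of concrete evidence. The sketch below shows that gate as a plain function, purely for illustration. The package states these rules as prompt text for the model; `SelfReflection`, `Evidence`, and `mayClaimDone` are hypothetical names and do not come from the package.

```typescript
// Hypothetical model of the four self-reflection answers; not package code.
type EvidenceKind = "test-output" | "type-check-output" | "tool-output" | "diff";

interface Evidence {
  kind: EvidenceKind;
  content: string; // the pasted output itself, not a description of it
}

interface SelfReflection {
  originalRequest: string; // question 1
  actionsTaken: string[]; // question 2: actual tool calls
  gap: string; // question 3: empty when nothing is left to do
  evidence: Evidence[]; // question 4
}

// Completion may be claimed only with no gap and at least one non-empty evidence item.
function mayClaimDone(reflection: SelfReflection): boolean {
  const noGap = reflection.gap.trim() === "";
  const hasEvidence = reflection.evidence.some((e) => e.content.trim() !== "");
  return noGap && hasEvidence;
}

const unsupported: SelfReflection = {
  originalRequest: "Add input validation to the signup form",
  actionsTaken: ["edit signup.ts"],
  gap: "",
  evidence: [], // "Code should work" is not evidence
};

const supported: SelfReflection = {
  ...unsupported,
  evidence: [{ kind: "test-output", content: "12 passed, 0 failed" }],
};

console.log(mayClaimDone(unsupported)); // false
console.log(mayClaimDone(supported)); // true
```

The statements in the "NOT evidence" list have no representation in this model at all, which is the point: a bare assertion such as "It looks correct" cannot populate `evidence`.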