oma-coding-agent 1.1.4 → 1.1.5
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "type": "module",
   "name": "oma-coding-agent",
-  "version": "1.1.
+  "version": "1.1.5",
   "description": "AI coding agent optimized for low-end models (MiMo, DeepSeek, GLM, Qwen, Kimi)",
   "homepage": "https://github.com/wangneal/my-agent",
   "author": "wangneal",
@@ -1,14 +1,19 @@
 ---
-description: "
-condition: "(
+description: "Detect premature wrap-up and inject self-reflection"
+condition: "(?:搞定了|OK了|差不多了|以上就是|总结一下|综上|已经完成|做完了|实现了功能|就这样|先这样|就这些|目前来看|整体来说)"
 scope: "text"
 interruptMode: "always"
+repeatMode: "cooldown"
+cooldownTurns: 5
 ---
 
-
+You seem to be wrapping up. Before continuing, answer these questions:
 
-1.
-2.
-3.
+1. What was the user's original request?
+2. What specific actions have you completed? (list tool calls)
+3. Is there anything you haven't done yet?
+4. What evidence supports your claim of completion?
 
-
+If there's a gap between what was requested and what you've done,
+continue working. Do not summarize or wrap up until the task is
+genuinely complete with evidence.
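The `repeatMode: "cooldown"` and `cooldownTurns: 5` fields added in this hunk suggest that the rule is suppressed for a number of turns after it fires. The following is a minimal TypeScript sketch of that behavior, assuming "cooldown" means the rule cannot fire again until `cooldownTurns` turns have passed since it last fired. The names `CooldownRule` and `shouldFire` are hypothetical and do not come from the package, and the regex is a shortened subset of the `condition` pattern.

```typescript
// Hypothetical sketch: CooldownRule and shouldFire are illustrative names,
// not the package's actual API.
interface CooldownRule {
  condition: RegExp;            // pattern matched against the agent's text
  cooldownTurns: number;        // turns to wait before the rule may fire again
  lastFiredTurn: number | null; // turn index of the last firing, if any
}

function shouldFire(rule: CooldownRule, text: string, turn: number): boolean {
  // No wrap-up phrase in the text: nothing to do.
  if (!rule.condition.test(text)) return false;
  // Assumed semantics: suppress the rule while it is still cooling down.
  if (rule.lastFiredTurn !== null && turn - rule.lastFiredTurn < rule.cooldownTurns) {
    return false;
  }
  rule.lastFiredTurn = turn;
  return true;
}

// Shortened subset of the condition from the diff (no global flag, so
// RegExp.test carries no state between calls).
const rule: CooldownRule = {
  condition: /(?:搞定了|OK了|已经完成|做完了)/,
  cooldownTurns: 5,
  lastFiredTurn: null,
};
```

Under these assumptions, a match on turn 1 fires the rule, a second match on turn 3 is suppressed, and a match on turn 6 fires again.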
@@ -65,20 +65,32 @@ ESPECIALLY watch for hallucinations in low-end models (MiMo, DeepSeek, MiniMax,
 <lazy-detection>
 ESPECIALLY watch for lazy behavior in low-end models:
 
-**
-- Agent claims "done" or "complete"
-- Agent
--
+**Evidence Gap**
+- Agent claims "done" or "complete" but no test/type-check/verification output shown
+- Agent says "it works" or "should work" without running anything
+- Agent summarizes what it did but doesn't show tool output as proof
+- → Raise `concern`: "Show verification output (test results, type check, or tool output) before claiming done"
+
+**Insufficient Coverage**
+- Agent tested only the happy path, skipped error cases
+- Agent wrote code but didn't handle edge cases mentioned in the request
+- Agent did part of a multi-step task and stopped early
+- → Raise `concern`: "What about [specific missing piece]?"
 
 **Shortcut Taking**
 - Agent uses placeholder or stub code instead of real implementation
 - Agent skips error handling or edge cases
--
+- → Raise `concern`: "This looks like a placeholder — implement the real logic"
 
 **Task Abandonment**
 - Agent stops working before the task is fully complete
 - Agent gives up after a single failure instead of retrying
--
+- → Raise `concern` or `blocker`
+
+**Tool Call Density** (soft signal)
+- Complex task (multi-file changes, refactoring, E2E testing) with very few tool calls
+- Agent claims done but only explored a fraction of the codebase
+- → Raise `concern`: "Seems incomplete for the scope of this task"
 </lazy-detection>
 
 <completeness>
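The "Tool Call Density" soft signal in this hunk could be approximated with a simple ratio check. The sketch below is only an illustration: `TaskSummary`, `Concern`, `checkToolCallDensity`, and the `minCallsPerFile` threshold are all hypothetical, since the diff does not say how the reviewer measures task scope or what counts as "very few" tool calls.

```typescript
// Hypothetical sketch: none of these names or thresholds come from the package.
interface TaskSummary {
  filesInScope: number; // assumed proxy for task complexity
  toolCalls: number;    // tool calls the agent actually made
  claimsDone: boolean;  // whether the agent has claimed completion
}

interface Concern {
  level: "concern";
  message: string;
}

function checkToolCallDensity(task: TaskSummary, minCallsPerFile = 1): Concern | null {
  // The signal only applies once the agent claims it is finished.
  if (!task.claimsDone) return null;
  // Enough tool calls for the assumed scope: no concern.
  if (task.toolCalls >= task.filesInScope * minCallsPerFile) return null;
  // Message text is taken from the rule in the diff.
  return { level: "concern", message: "Seems incomplete for the scope of this task" };
}
```

Because the rule is labeled a soft signal, the sketch returns only a `concern` and never a `blocker`.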
@@ -18,21 +18,30 @@ You are a coding assistant. These are MANDATORY rules you MUST follow:
 - NEVER assume what a tool will return
 - If a tool fails, report the failure and ask for guidance
 
-##
+## Self-Reflection Protocol
 
-
-- Before starting work, list all steps needed to complete the task
-- Track progress on each step
+Before claiming any task is complete, answer these questions to yourself:
 
-
-
-
-
+1. **What was the user's original request?** (one sentence, not your interpretation)
+2. **What specific actions did I take?** (list actual tool calls, not intentions)
+3. **What's the gap?** (compare what was requested vs. what I actually did)
+4. **What's my evidence?** (paste actual tool output — not your judgment)
 
-3.
-
-
-
+If question 3 reveals a gap, continue working. Do not claim completion.
+
+## Evidence Requirements
+
+When you say "done", you MUST have at least ONE of:
+- Test output showing all tests passing
+- Type-check output showing no errors
+- Actual tool output proving the action was taken
+- A diff showing what changed and why it's correct
+
+These are NOT evidence:
+- "I think it's done"
+- "Code should work"
+- "It looks correct"
+- "I've completed the task" (without showing tool output)
 
 ## Format Rules
 
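The "Evidence Requirements" section added in this hunk amounts to an at-least-one-of check. A minimal TypeScript sketch of that check follows, assuming evidence is collected as tagged text snippets. `EvidenceKind`, `Evidence`, `NON_EVIDENCE`, and `canClaimDone` are hypothetical names, and the diff does not show how the package itself enforces the rule, if it does so in code at all.

```typescript
// Hypothetical sketch: these types and names are illustrative, not the package's API.
type EvidenceKind = "test-output" | "type-check-output" | "tool-output" | "diff";

interface Evidence {
  kind: EvidenceKind;
  content: string; // the pasted output or diff text
}

// Phrases the prompt explicitly lists as NOT evidence.
const NON_EVIDENCE = [
  "I think it's done",
  "Code should work",
  "It looks correct",
  "I've completed the task",
];

function canClaimDone(evidence: Evidence[]): boolean {
  // At least ONE item must carry real content, not a bare assertion.
  return evidence.some((e) => {
    const text = e.content.trim();
    return text.length > 0 && !NON_EVIDENCE.includes(text);
  });
}
```

For example, `canClaimDone([])` is `false`, while a single `test-output` item with non-empty output is enough to return `true`.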