codeharness 0.36.10 → 0.37.1
package/package.json
CHANGED
@@ -66,50 +66,15 @@ prompt_template: |
 
 Base your scores on what you observe through the running system, not assumptions.
 
-## Output Format
-
-
-
-
-
-
-
-
-
-
-
-{
-"ac": <number>,
-"description": "<AC description>",
-"status": "pass" | "fail" | "unknown",
-"evidence": {
-"commands_run": ["<command>"],
-"output_observed": "<output>",
-"reasoning": "<why>"
-}
-}
-],
-"quality_scores": {
-"architecture": <1-5>,
-"originality": <1-5>,
-"craft": <1-5>,
-"functionality": <1-5>
-}
-}
-```
-
-Verdict is "pass" only if ALL findings have status "pass". Quality scores are informational.
-
-## XML Tags — MANDATORY
-
-In addition to the JSON file output, your response MUST include these XML tags (machine-parsed):
-
-Include `<verdict>pass</verdict>` or `<verdict>fail</verdict>`.
-
-For each AC, include `<evidence ac="N" status="pass|fail|unknown">command, output, reasoning</evidence>`.
-
-Include `<quality-scores>architecture: N, originality: N, craft: N, functionality: N</quality-scores>`.
-
-## Output Location
-
-Write verdict JSON to ./verdict/verdict.json
+## Output Format — XML Tags (machine-parsed)
+
+Your response MUST include these XML tags:
+
+`<verdict>pass</verdict>` or `<verdict>fail</verdict>`
+Verdict is "pass" only if ALL ACs have status "pass".
+
+For each AC:
+`<evidence ac="N" status="pass|fail|unknown">command run, output observed, reasoning</evidence>`
+
+Quality assessment:
+`<quality-scores>architecture: N, originality: N, craft: N, functionality: N</quality-scores>`
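
The new template describes its XML tags as machine-parsed, so a short sketch of what a consumer of that output might look like can help in reading the change. This is not codeharness's own parser, which the diff does not show. The names `Evidence`, `parse_response`, and `verdict_is_consistent` are hypothetical, and the regular expressions assume the tags appear exactly as written in the template.

```python
import re
from dataclasses import dataclass


@dataclass
class Evidence:
    """One <evidence> tag: the AC number, its status, and the free-text body."""
    ac: int
    status: str
    detail: str


def parse_response(text: str) -> dict:
    """Extract the verdict, per-AC evidence, and quality scores from a response.

    Hypothetical helper; assumes the tag shapes given in the prompt template.
    """
    verdict_match = re.search(r"<verdict>(pass|fail)</verdict>", text)

    evidence = [
        Evidence(int(ac), status, body.strip())
        for ac, status, body in re.findall(
            r'<evidence ac="(\d+)" status="(pass|fail|unknown)">(.*?)</evidence>',
            text,
            flags=re.DOTALL,
        )
    ]

    scores = {}
    scores_match = re.search(
        r"<quality-scores>(.*?)</quality-scores>", text, flags=re.DOTALL
    )
    if scores_match:
        # Body looks like "architecture: N, originality: N, ..."
        for part in scores_match.group(1).split(","):
            name, _, value = part.partition(":")
            if value.strip().isdigit():
                scores[name.strip()] = int(value.strip())

    return {
        "verdict": verdict_match.group(1) if verdict_match else None,
        "evidence": evidence,
        "quality_scores": scores,
    }


def verdict_is_consistent(parsed: dict) -> bool:
    """Check the template's rule: "pass" only if ALL ACs have status "pass"."""
    if parsed["verdict"] is None:
        return False
    if parsed["verdict"] == "pass":
        return all(e.status == "pass" for e in parsed["evidence"])
    return True
```

A caller would pass the raw response text to `parse_response` and reject it when `verdict_is_consistent` returns `False`. The quality scores are parsed but do not affect that check, since the template's pass rule mentions only AC statuses.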