
ccqa 0.8.2 → 0.8.3


Files changed (3)

package/dist/bin/ccqa.mjs CHANGED

@@ -4419,7 +4419,8 @@ ${stepsText}
 1. Take a fresh \`snapshot\` to see the current page.
 2. Carry out the instruction. Use whichever agent-browser subcommand and selector style works. If the first attempt fails, take another snapshot and try a different approach — you are not being recorded.
 3. After the instruction is performed, take another \`snapshot\` (and optionally a \`get count\` / \`wait --text\` probe) to verify the expected outcome.
-4. Decide: did the **Expected** condition hold? Be honest. If the page is in an unexpected state, that is a fail, not something to work around.
+4. **Before emitting STEP_RESULT, make the judgement target visible in the page** so the auto-captured "after" screenshot proves your verdict. Use \`agent-browser eval "<elementRef>.scrollIntoView({block:'center'})"\` or similar to bring the asserted row / banner / URL bar / bot reply into view. A correct verdict with no on-screen evidence is still a weak artifact.
+5. Decide: did the **Expected** condition hold? Be honest. If the page is in an unexpected state, that is a fail, not something to work around.
 ### Judgement rules
@@ -4428,6 +4429,7 @@ ${stepsText}
 - If the expected outcome is partially satisfied (e.g. the page loaded but the asserted element is missing) — fail, and say which part is missing.
 - Pass only when you have *positive* evidence (a successful snapshot, a verified URL, a wait that resolved). "No error shown" is not enough on its own.
 - Do not invent success when blocked: fail honestly with a short reason.
+- **Evidence discipline**: when the assertion target is a specific row / message / banner / URL, scroll it into view (or focus the relevant pane) before letting the step end. The "after" screenshot is captured for you automatically — your job is to make sure that screenshot shows the thing your STEP_RESULT line is talking about.
 ### Output contract (STRICT)
@@ -4508,13 +4510,15 @@ function findLastStepResult(text) {
 * and continues. We never throw, because a missing screenshot is a degraded
 * artifact, not a reason to abort the test step.
 */
-function takeScreenshot(sessionName, outPath) {
-	const res = spawnAB([
+function takeScreenshot(sessionName, outPath, options) {
+	const args = [
 		"--session",
 		sessionName,
-		"screenshot",
-		outPath
-	]);
+		"screenshot"
+	];
+	if (options?.fullPage) args.push("--full");
+	args.push(outPath);
+	const res = spawnAB(args);
 	if (res.status === 0) return {
 		ok: true,
 		path: outPath
@@ -4630,7 +4634,7 @@ async function runNdExecutor(input) {
 			transcriptParts.push(`[ccqa] invokeClaudeStreaming threw: ${err instanceof Error ? err.message : String(err)}`);
 		}
 		const transcript = transcriptParts.join("\n");
-		const after = takeScreenshot(input.sessionName, paths.afterPng);
+		const after = takeScreenshot(input.sessionName, paths.afterPng, { fullPage: true });
 		if (!after.ok) warn(`screenshot (after, ${step.id}) failed: ${after.error}`);
 		await writeFile(paths.logTxt, transcript || "(no assistant text captured)", "utf-8");
 		const { status, reasoning } = judgeStepOutcome({
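
The `takeScreenshot` hunk above changes only how the argument list is assembled: `--full` is appended when `options.fullPage` is truthy, and the output path always comes last. A minimal sketch of that logic in isolation follows. `buildScreenshotArgs` is a hypothetical helper name, not part of ccqa, and the sketch assumes nothing about `spawnAB` or agent-browser beyond the argument order shown in the diff.

```javascript
// Hypothetical helper (not in ccqa): mirrors the argument assembly
// from the 0.8.3 version of takeScreenshot, without spawning anything.
function buildScreenshotArgs(sessionName, outPath, options) {
	const args = ["--session", sessionName, "screenshot"];
	// Optional chaining keeps 0.8.2-style two-argument calls working:
	// when options is undefined, no flag is added.
	if (options?.fullPage) args.push("--full");
	// The output path stays the final argument, as in the diff.
	args.push(outPath);
	return args;
}

// 0.8.2-style call: no options, no --full flag.
const viewportArgs = buildScreenshotArgs("s1", "before.png");
// 0.8.3 "after" call: full-page capture requested.
const fullPageArgs = buildScreenshotArgs("s1", "after.png", { fullPage: true });

console.log(viewportArgs.join(" ")); // --session s1 screenshot before.png
console.log(fullPageArgs.join(" ")); // --session s1 screenshot --full after.png
```

Because the third parameter is optional, any call site that still passes two arguments behaves as it did in 0.8.2. In this diff, only the "after" screenshot in `runNdExecutor` opts in to `fullPage`.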

package/dist/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ccqa",
-  "version": "0.8.2",
+  "version": "0.8.3",
   "type": "module",
   "description": "Browser test recorder powered by Claude Code and agent-browser",
   "repository": {

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ccqa",
-  "version": "0.8.2",
+  "version": "0.8.3",
   "type": "module",
   "description": "Browser test recorder powered by Claude Code and agent-browser",
   "repository": {