npm - mcp-lab-agent - Versions diffs - 2.1.10 → 2.3.1 - Mend

mcp-lab-agent 2.1.10 → 2.3.1

Files changed (7)

package/README.md +103 -23
package/dist/index.js +209 -10
package/dist/index.js.map +1 -1
package/learning-hub/src/server.js +6 -6
package/package.json +3 -2
package/slack-bot/check-config.js +11 -3
package/slack-bot/src/config.js +22 -5

package/README.md CHANGED

@@ -4,48 +4,127 @@
 [![Node.js](https://img.shields.io/badge/node-%3E%3D18-green)](https://nodejs.org)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
-**Assistente de teste que aprende com falhas.** Reduz tempo de debug, elimina flaky e mantém seletores estáveis. Executa testes, analisa causas de falha, corrige automaticamente e aprende padrões que melhoram as próximas gerações. Integra ao Cursor, Cline, Windsurf ou Slack.
+**PT-BR** | [English](#english)
+---
+## Português (PT-BR)
+**Sistema de QA autônomo com IA.** Reduz tempo de debug de testes, elimina flaky e mantém seletores estáveis — com um sistema de aprendizado que melhora a cada correção.
+> **TL;DR para recrutadores:** QA autônomo que explica *por que* os testes falharam em linguagem clara e aplica correções automaticamente. Testes que se autocorrigem e aprendem a cada fix. Integra com IDE (Cursor) e Slack. Feito para QA Engineers, SDETs e roles de Automação/IA.
+### Por que isso importa
+| Problema real | Impacto no mercado | O que o mcp-lab-agent faz |
+|---------------|--------------------|---------------------------|
+| **Testes flaky** | Times gastam 5–10h/semana. Microsoft: ~25% das falhas em CI são flaky; Slack tinha 56% antes de remediar. | Detecta padrões flaky, sugere correções, retry automático com fixes |
+| **"Por que falhou?"** | QAs e devs perdem horas lendo stack traces e logs. "Teste falhou" genérico não ajuda. | **Causa + correção em 30 segundos.** Diagnóstico em linguagem clara: o que aconteceu, por que e como corrigir |
+| **Seletores quebrados** | Refactors de UI quebram testes. Seletores frágeis (classes CSS, XPath longo) exigem manutenção manual. | Auto-fix de seletores, sugere `data-testid`, aplica correções e tenta de novo |
+### O WOW: Testes que se autocorrigem e aprendem
+**Quando um teste falha, você recebe a causa e a correção em 30 segundos. Sem cavar em stack traces.**
+Cada correção bem-sucedida é salva e reutilizada. Na próxima falha similar, o agente aplica o padrão aprendido automaticamente. **A taxa de sucesso na primeira tentativa melhora ao longo do tempo** — mensurável via `mcp-lab-agent stats`.
 ```bash
 npx mcp-lab-agent auto "login flow" --max-retries 5
 ```
-**1 comando. Análise completa.**
+*Um comando. Análise completa. Autocorreção. Aprendizado.*
-> Teste falhou? Em 30 segundos: o que aconteceu, por que e como corrigir. O mcp-lab-agent analisa causas, corrige e acumula conhecimento.
+### Principais resultados
-**Foco:** [Top 3 problemas de QA](docs/TOP3_QA_PROBLEMAS_E_ROADMAP.md) — flaky, "por que falhou?", manutenção de seletores.
+- **Reduz tempo de debug** — "Por que falhou?" em linguagem clara, não stack traces
+- **Corta manutenção de flaky** — Detecção, diagnóstico e sugestões de correção
+- **Escala QA sem escalar headcount** — Agente no IDE + Slack bot; funciona com Cypress, Playwright, Appium, Jest e 11+ frameworks
+- **Pronto para enterprise** — Socket Mode (sem URL pública), Ollama (offline), Learning Hub para times
----
+### Como funciona
-## O que é
+**🤖 Agente no IDE (Cursor, Cline, Windsurf)** — Pergunte no chat: *"Gere teste para login"*, *"Por que o teste falhou?"*, *"Roda o teste X"*. O agente detecta o projeto, executa testes, analisa falhas, aplica correções e aprende.
-O **mcp-lab-agent** é um sistema de inteligência em qualidade de software — não uma ferramenta de teste isolada. Ele entende o seu projeto, identifica frameworks (Cypress, Playwright, Jest, Appium, Robot, pytest e outros), gera testes com base em contexto e memória, executa, analisa falhas e aplica correções automaticamente. O valor central está no **learning**: cada correção bem-sucedida é salva e usada nas próximas gerações, aumentando a taxa de sucesso na primeira tentativa.
+**💬 Slack Bot** — Mencione o bot em qualquer canal — ele executa testes e posta o relatório. Funciona em ambiente corporativo (Socket Mode, sem ngrok). QA no fluxo da conversa.
-Com o **Learning Hub**, os aprendizados são centralizados e agregados entre projetos e — em deploy compartilhado — entre times e empresas, formando uma base de conhecimento em qualidade que escala além do repositório.
----
-## Para quem
+### Para quem é
 | Perfil | Benefício |
 |--------|-----------|
-| **QAs e SDETs** | Geração assistida de testes, análise de falhas com sugestões de correção, detecção de flakiness |
-| **Desenvolvedores** | "Por que falhou?", análise de arquivos e métodos, integração direta no IDE |
+| **QAs e SDETs** | Geração assistida de testes, análise de falhas com sugestões de correção, detecção de flaky |
+| **Desenvolvedores** | "Por que falhou?" em segundos, análise de arquivos/métodos, integração direta no IDE |
 | **Tech leads** | Visão de risco por área, métricas de estabilidade, relatórios para decisão |
-| **Empresas** | Learning Hub centralizado, escala entre squads e organizações, CI/CD, Ollama (offline), Slack para QA via chat |
----
+| **Times** | Learning Hub, Slack bot para QA no chat, CI/CD, Ollama (offline) |
-## Comparação
+### Como é diferente
 | Outras ferramentas | mcp-lab-agent |
 |--------------------|---------------|
-| Só executam testes | Executa, analisa causa da falha e sugere correção |
-| Saída genérica "teste falhou" | Diagnóstico: "login falha 30% das vezes (timing)" |
-| Sem visão de risco | Identifica áreas sem testes e classifica risco (alto/médio/baixo) |
-| Sem memória entre execuções | Learning system: cada padrão de falha vira correção aplicada nas próximas gerações |
-| Uma ferramenta por tarefa | Sistema de inteligência: geração, execução, análise, relatórios, predição, learning |
+| Só executam testes | Executa, analisa causa, sugere fix, aplica correção |
+| "Teste falhou" genérico | Linguagem clara: "Login falha 30% das vezes (timing). Adicione waitForDisplayed." |
+| Sem memória entre execuções | Learning system: cada fix melhora as próximas gerações |
+| Uma ferramenta por tarefa | End-to-end: gera, executa, analisa, reporta, aprende |
+---
+<a name="english"></a>
+## English
+**AI-powered autonomous QA system.** Reduces test debugging time, eliminates flaky tests, and keeps selectors stable — with a learning system that gets smarter with every fix.
+> **TL;DR for recruiters:** Autonomous QA that explains *why* tests fail in plain language and applies fixes automatically. Self-healing tests that learn from each fix. Integrates with IDE (Cursor) and Slack. Built for QA Engineers, SDETs, and AI/Automation roles.
+### Why this matters
+| Real problem | Industry impact | What mcp-lab-agent does |
+|--------------|-----------------|-------------------------|
+| **Flaky tests** | Teams spend 5–10h/week. Microsoft: ~25% of CI failures are flaky; Slack had 56% before remediation. | Detects flaky patterns, suggests fixes, auto-retries with corrections |
+| **"Why did it fail?"** | QAs and devs lose hours reading stack traces and logs. Generic "test failed" doesn't help. | **Cause + fix in 30 seconds.** Plain-language diagnosis: what happened, why, and how to fix |
+| **Broken selectors** | UI refactors break tests. Fragile selectors (CSS classes, long XPath) require manual maintenance. | Auto-fix selectors, suggests `data-testid`, applies corrections and retries |
+### The WOW: Self-healing tests that learn
+**When a test fails, you get the cause and fix in 30 seconds. No more digging through stack traces.**
+Each successful fix is saved and reused. The next time a similar failure happens, the agent applies the learned pattern automatically. **First-attempt success rate improves over time** — measurable via `mcp-lab-agent stats`.
+```bash
+npx mcp-lab-agent auto "login flow" --max-retries 5
+```
+*One command. Full analysis. Self-correction. Learning.*
+### Key outcomes
+- **Reduce debugging time** — "Why did it fail?" in plain language, not stack traces
+- **Cut flaky test maintenance** — Detection, diagnosis, and suggested fixes
+- **Scale QA without scaling headcount** — IDE agent + Slack bot; works with Cypress, Playwright, Appium, Jest, and 11+ frameworks
+- **Enterprise-ready** — Socket Mode (no public URL), Ollama (offline), Learning Hub for teams
+### How it works
+**🤖 IDE Agent (Cursor, Cline, Windsurf)** — Ask in chat: *"Generate a test for login"*, *"Why did the test fail?"*, *"Run test X"*. The agent detects your project, runs tests, analyzes failures, applies fixes, and learns.
+**💬 Slack Bot** — Mention the bot in any channel — it runs tests and posts the report. Works in corporate environments (Socket Mode, no ngrok). QA in the flow of conversation.
+### Who it's for
+| Role | Benefit |
+|------|---------|
+| **QAs & SDETs** | Assisted test generation, failure analysis with fix suggestions, flaky detection |
+| **Developers** | "Why did it fail?" in seconds, file/method analysis, direct IDE integration |
+| **Tech leads** | Risk visibility by area, stability metrics, decision-ready reports |
+| **Teams** | Learning Hub, Slack bot for QA in chat, CI/CD integration, Ollama (offline) |
+### How it's different
+| Other tools | mcp-lab-agent |
+|-------------|---------------|
+| Run tests only | Run, analyze cause, suggest fix, apply correction |
+| Generic "test failed" | Plain-language: "Login fails 30% of the time (timing). Add waitForDisplayed." |
+| No memory between runs | Learning system: each fix improves future generations |
+| One tool per task | End-to-end: generate, run, analyze, report, learn |
 ---
@@ -299,6 +378,7 @@ npm install playwright
 - [CHANGELOG.md](CHANGELOG.md) — Histórico de versões
 - [slack-bot/README.md](slack-bot/README.md) — Slack Bot
 - [learning-hub/README.md](learning-hub/README.md) — Learning Hub
+- [docs/PORTFOLIO_COPY_PT-BR.md](docs/PORTFOLIO_COPY_PT-BR.md) — Copy em PT-BR para portfólio (Vercel)
 ---

package/dist/index.js CHANGED

@@ -1506,15 +1506,99 @@ function runTestsOnce(cmd, args, cwd, env = process.env) {
     });
   });
 }
+async function handleMultipleAutoTests(testRequests, maxRetries) {
+  console.log(`\u{1F4CB} Testes a executar:`);
+  testRequests.forEach((req, i) => console.log(`   ${i + 1}. ${req}`));
+  console.log("");
+  const results = [];
+  const startTime = Date.now();
+  const promises = testRequests.map(async (testRequest, index) => {
+    const testNum = index + 1;
+    const prefix = `[Teste ${testNum}/${testRequests.length}]`;
+    console.log(`${prefix} \u{1F680} Iniciando: "${testRequest}"
+`);
+    try {
+      const { spawn: spawn3 } = await import("child_process");
+      const args = ["auto", testRequest, "--max-retries", maxRetries.toString()];
+      return new Promise((resolve) => {
+        const child = spawn3("mcp-lab-agent", args, {
+          cwd: process.cwd(),
+          stdio: "pipe",
+          shell: process.platform === "win32"
+        });
+        let output = "";
+        if (child.stdout) child.stdout.on("data", (d) => {
+          const text = d.toString();
+          output += text;
+          process.stdout.write(text.split("\n").map((line) => line ? `${prefix} ${line}` : "").join("\n"));
+        });
+        if (child.stderr) child.stderr.on("data", (d) => {
+          output += d.toString();
+        });
+        child.on("close", (code) => {
+          const success = code === 0;
+          const duration = Math.round((Date.now() - startTime) / 1e3);
+          resolve({
+            testRequest,
+            success,
+            exitCode: code,
+            output,
+            duration
+          });
+        });
+      });
+    } catch (err) {
+      return {
+        testRequest,
+        success: false,
+        exitCode: 1,
+        error: err.message,
+        duration: 0
+      };
+    }
+  });
+  const allResults = await Promise.all(promises);
+  const totalDuration = Math.round((Date.now() - startTime) / 1e3);
+  const passed = allResults.filter((r) => r.success).length;
+  const failed = allResults.length - passed;
+  console.log("\n" + "=".repeat(60));
+  console.log("\u{1F4CA} RESUMO DOS TESTES");
+  console.log("=".repeat(60) + "\n");
+  allResults.forEach((result, i) => {
+    const icon = result.success ? "\u2705" : "\u274C";
+    const status = result.success ? "PASSOU" : "FALHOU";
+    console.log(`${icon} ${i + 1}. ${result.testRequest} - ${status}`);
+  });
+  console.log("");
+  console.log(`Total: ${allResults.length} testes`);
+  console.log(`\u2705 Passou: ${passed}`);
+  console.log(`\u274C Falhou: ${failed}`);
+  console.log(`\u23F1\uFE0F  Tempo total: ${totalDuration}s`);
+  console.log("");
+  if (failed > 0) {
+    process.exit(1);
+  }
+}
 async function handleAutoCommand() {
   const request = process.argv.slice(3).join(" ");
   if (!request) {
     console.error("\u274C Uso: mcp-lab-agent auto <descri\xE7\xE3o do teste> [--max-retries N]");
+    console.error("   Exemplos:");
+    console.error('     mcp-lab-agent auto "login"');
+    console.error('     mcp-lab-agent auto "login, cadastro, buscar" --max-retries 3');
     process.exit(1);
   }
   const maxRetriesIdx = process.argv.indexOf("--max-retries");
   const maxRetries = maxRetriesIdx !== -1 && process.argv[maxRetriesIdx + 1] ? parseInt(process.argv[maxRetriesIdx + 1], 10) : 3;
   const cleanRequest = request.replace(/--max-retries\s+\d+/g, "").trim();
+  const testRequests = cleanRequest.split(",").map((r) => r.trim()).filter(Boolean);
+  if (testRequests.length > 1) {
+    console.log(`
+\u{1F916} Modo aut\xF4nomo iniciado: ${testRequests.length} testes em paralelo
+`);
+    await handleMultipleAutoTests(testRequests, maxRetries);
+    return;
+  }
   console.log(`
 \u{1F916} Modo aut\xF4nomo iniciado: "${cleanRequest}"
 `);
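Review note: the hunk above makes `auto` split its request on commas and run each entry as a separate child process. The parsing can be traced in isolation with the sketch below. `parseAutoArgs` is a hypothetical helper name, not a function in the package; the expressions inside it are copied from `handleAutoCommand` as shown in the diff.

```javascript
// Hypothetical helper (not part of mcp-lab-agent) that mirrors the argument
// parsing in handleAutoCommand. argv is expected in process.argv shape:
// [node, script, "auto", ...rest].
function parseAutoArgs(argv) {
  const request = argv.slice(3).join(" ");
  const idx = argv.indexOf("--max-retries");
  // Falls back to 3 retries when the flag or its value is missing.
  const maxRetries = idx !== -1 && argv[idx + 1] ? parseInt(argv[idx + 1], 10) : 3;
  // The flag is stripped from the joined request before splitting.
  const cleanRequest = request.replace(/--max-retries\s+\d+/g, "").trim();
  // Comma-separated entries become individual test requests.
  const testRequests = cleanRequest.split(",").map((r) => r.trim()).filter(Boolean);
  return { maxRetries, testRequests };
}

const parsed = parseAutoArgs([
  "node", "cli.js", "auto", "login, cadastro, buscar", "--max-retries", "5",
]);
console.log(parsed.maxRetries);   // 5
console.log(parsed.testRequests); // [ 'login', 'cadastro', 'buscar' ]
```

One consequence worth noting: any comma in a description now acts as a separator, so a single request such as "login, then logout" would be treated as two tests and routed to the parallel path.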
@@ -1542,16 +1626,19 @@ async function handleAutoCommand() {
 [Tentativa ${attempt}/${maxRetries}] Gerando teste...`);
     const { provider, apiKey, baseUrl, model } = llm;
     const memoryHints = memory.learnings?.filter((l) => l.success).slice(-10).map((l) => l.fix).join("\n") || "";
+    const packageInfo = structure.packageJson || {};
+    const isESM = packageInfo.type === "module";
     const systemPrompt = `Voc\xEA \xE9 um engenheiro de QA especializado em ${fw}. Gere APENAS o c\xF3digo do spec, sem explica\xE7\xF5es.
 ${memoryHints ? `
 Aprendizados anteriores (use como refer\xEAncia):
 ${memoryHints.slice(0, 1e3)}` : ""}
+${isESM ? "\nIMPORTANTE: Use sintaxe ESM (import/export), N\xC3O use require()." : ""}
 Retorne SOMENTE o c\xF3digo, sem markdown.`;
     const userPrompt = `Contexto:
 ${contextLines}
 Gere teste para: ${cleanRequest}
-Framework: ${fw}`;
+Framework: ${fw}${isESM ? "\nUse import { test, expect } from '@playwright/test';" : ""}`;
     try {
       let specContent = "";
       if (provider === "gemini") {
@@ -1565,6 +1652,9 @@ Framework: ${fw}`;
           })
         });
         const data = await res.json();
+        if (data.error) {
+          throw new Error(`Gemini API Error: ${data.error.message || JSON.stringify(data.error)}`);
+        }
         specContent = data.candidates?.[0]?.content?.parts?.[0]?.text || "";
       } else {
         const res = await fetch(`${baseUrl}/chat/completions`, {
@@ -1578,10 +1668,20 @@ Framework: ${fw}`;
           })
         });
         const data = await res.json();
+        if (data.error) {
+          throw new Error(`API Error: ${data.error.message || JSON.stringify(data.error)}`);
+        }
         specContent = data.choices?.[0]?.message?.content || "";
       }
+      console.log(`[DEBUG] Resposta do LLM recebida: ${specContent.length} caracteres`);
+      if (!specContent || specContent.trim().length === 0) {
+        throw new Error("LLM retornou conte\xFAdo vazio. Verifique sua API key e conex\xE3o.");
+      }
       specContent = specContent.replace(/^```(?:js|javascript|typescript)?\n?/i, "").replace(/\n?```\s*$/i, "").trim();
       testContent = specContent;
+      if (!testContent || testContent.trim().length === 0) {
+        throw new Error("Ap\xF3s parsing, o c\xF3digo ficou vazio. Resposta do LLM pode estar em formato inesperado.");
+      }
       if (!testFilePath) {
         const fileName = cleanRequest.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "").slice(0, 30);
         const { ext, baseDir } = getExtensionAndBaseDir(fw, structure);
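Review note: this hunk adds two emptiness checks around the step that strips Markdown code fences from the LLM reply. A minimal sketch of that sequence follows. The names `stripCodeFence` and `requireNonEmptySpec` are hypothetical, and the patterns are built with `RegExp` only so the example does not contain literal fences; they are intended to be equivalent to the regex literals in the diff.

```javascript
// Three backticks, built indirectly so this example can sit inside a fenced block.
const FENCE = "`".repeat(3);
// Equivalent to the two regex literals used in the diff.
const OPENING = new RegExp("^" + FENCE + "(?:js|javascript|typescript)?\\n?", "i");
const CLOSING = new RegExp("\\n?" + FENCE + "\\s*$", "i");

// Hypothetical helper: removes one leading and one trailing fence, then trims.
function stripCodeFence(text) {
  return text.replace(OPENING, "").replace(CLOSING, "").trim();
}

// Hypothetical helper mirroring the two checks added in this hunk:
// one on the raw reply, one on the code left after stripping.
function requireNonEmptySpec(raw) {
  if (!raw || raw.trim().length === 0) {
    throw new Error("empty LLM response");
  }
  const code = stripCodeFence(raw);
  if (code.length === 0) {
    throw new Error("empty after stripping fences");
  }
  return code;
}

const reply = FENCE + "js\nconst a = 1;\n" + FENCE;
console.log(requireNonEmptySpec(reply)); // const a = 1;
```

The second check matters because a reply that contains only a fence pair passes the first check and is reduced to an empty string by the stripping step. Note also that the alternation lists `js`, `javascript`, and `typescript` but not `ts`, so a reply fenced with a `ts` tag would keep `ts` as its first line.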
@@ -1590,11 +1690,16 @@ Framework: ${fw}`;
         if (!fs5.existsSync(baseDir)) fs5.mkdirSync(baseDir, { recursive: true });
       }
       fs5.writeFileSync(testFilePath, testContent, "utf8");
-      console.log(`\u2705 Teste gravado: ${testFilePath}`);
+      const fileSize = fs5.statSync(testFilePath).size;
+      if (fileSize === 0) {
+        throw new Error("Arquivo gravado mas est\xE1 vazio. Problema na escrita do arquivo.");
+      }
+      console.log(`\u2705 Teste gravado: ${testFilePath} (${fileSize} bytes)`);
       console.log(`
 [Tentativa ${attempt}/${maxRetries}] Executando teste...`);
+      const runArg = fw === "playwright" ? path5.relative(PROJECT_ROOT5, testFilePath).replace(/\\/g, "/") : testFilePath;
       const runResult = await new Promise((resolve) => {
-        const child = spawn("npx", [fw === "cypress" ? "cypress" : fw === "playwright" ? "playwright" : fw, fw === "cypress" ? "run" : fw === "playwright" ? "test" : "run", testFilePath], {
+        const child = spawn("npx", [fw === "cypress" ? "cypress" : fw === "playwright" ? "playwright" : fw, fw === "cypress" ? "run" : fw === "playwright" ? "test" : "run", runArg], {
           cwd: PROJECT_ROOT5,
           stdio: ["inherit", "pipe", "pipe"],
           shell: process.platform === "win32"
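Review note: `runArg` changes what is passed to the Playwright CLI from the stored test path to a project-relative path with forward slashes. The diff does not state the motivation; a plausible reading is that backslashes in Windows paths interfere with how Playwright matches its file filter, but that is an assumption. The sketch below reproduces the expression. `toRunArg` and its `pathImpl` parameter are hypothetical, added only so both platform flavours can be shown on any machine.

```javascript
const path = require("node:path");

// Hypothetical helper mirroring the runArg expression in the diff.
// pathImpl is injectable purely for demonstration (path.win32 / path.posix).
function toRunArg(fw, projectRoot, testFilePath, pathImpl = path) {
  return fw === "playwright"
    ? pathImpl.relative(projectRoot, testFilePath).replace(/\\/g, "/")
    : testFilePath;
}

// Windows-style input: separators are normalised to forward slashes.
console.log(toRunArg("playwright", "C:\\proj", "C:\\proj\\tests\\login.spec.js", path.win32));
// tests/login.spec.js

// POSIX input: already uses forward slashes, only made relative.
console.log(toRunArg("playwright", "/proj", "/proj/tests/login.spec.js", path.posix));
// tests/login.spec.js

// Other frameworks keep the path unchanged.
console.log(toRunArg("cypress", "/proj", "/proj/cypress/e2e/login.cy.js", path.posix));
// /proj/cypress/e2e/login.cy.js
```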
@@ -1641,8 +1746,59 @@ ${runResult.output.slice(0, 800)}
         console.log(`\u26A0\uFE0F Flaky detectado (${flakyAnalysis.confidence.toFixed(2)}): ${flakyAnalysis.patterns.map((p) => p.pattern).join(", ")}`);
       }
       console.log(`
-[Tentativa ${attempt}/${maxRetries}] Aplicando corre\xE7\xE3o (simulada)...`);
-      console.log(`\u26A0\uFE0F Corre\xE7\xE3o autom\xE1tica ainda n\xE3o implementada nesta vers\xE3o CLI. Tentando novamente...`);
+[Tentativa ${attempt}/${maxRetries}] Aplicando corre\xE7\xE3o...`);
+      try {
+        const fixPrompt = `Voc\xEA \xE9 um engenheiro de QA. O teste falhou com o seguinte erro:
+${runResult.output.slice(0, 1e3)}
+C\xF3digo atual do teste:
+${testContent}
+Analise o erro e corrija o teste. Considere:
+- Seletores podem estar errados (verifique se os elementos existem)
+- Pode precisar de waits (waitForSelector, waitForLoadState)
+- Rotas podem estar erradas (/buscar vs /busca)
+- Elementos podem ter nomes diferentes
+Retorne APENAS o c\xF3digo corrigido, sem explica\xE7\xF5es.${isESM ? "\nUse import { test, expect } from '@playwright/test';" : ""}`;
+        let fixedContent = "";
+        if (provider === "gemini") {
+          const url = `${baseUrl}/models/${model}:generateContent?key=${apiKey}`;
+          const res = await fetch(url, {
+            method: "POST",
+            headers: { "Content-Type": "application/json" },
+            body: JSON.stringify({
+              contents: [{ parts: [{ text: fixPrompt }] }],
+              generationConfig: { temperature: 0.3, maxOutputTokens: 4096 }
+            })
+          });
+          const data = await res.json();
+          fixedContent = data.candidates?.[0]?.content?.parts?.[0]?.text || "";
+        } else {
+          const res = await fetch(`${baseUrl}/chat/completions`, {
+            method: "POST",
+            headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
+            body: JSON.stringify({
+              model,
+              messages: [{ role: "user", content: fixPrompt }],
+              temperature: 0.3,
+              max_tokens: 4096
+            })
+          });
+          const data = await res.json();
+          fixedContent = data.choices?.[0]?.message?.content || "";
+        }
+        fixedContent = fixedContent.replace(/^```(?:js|javascript|typescript)?\n?/i, "").replace(/\n?```\s*$/i, "").trim();
+        if (fixedContent && fixedContent.length > 50) {
+          testContent = fixedContent;
+          console.log(`\u2705 Corre\xE7\xE3o gerada pelo LLM.`);
+        } else {
+          console.log(`\u26A0\uFE0F Corre\xE7\xE3o vazia, tentando novamente...`);
+        }
+      } catch (fixErr) {
+        console.log(`\u26A0\uFE0F Erro ao gerar corre\xE7\xE3o: ${fixErr.message}. Tentando novamente...`);
+      }
     } catch (err) {
       console.error(`
 \u274C Erro na tentativa ${attempt}: ${err.message}
@@ -2394,8 +2550,18 @@ Framework alvo: ${fw}${referenceBlock}`;
       }
       specContent = specContent.replace(/^```(?:js|javascript)?\n?/i, "").replace(/\n?```\s*$/i, "").trim();
       const fileName = request.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "").slice(0, 40);
+      if (!specContent) {
+        return {
+          content: [{ type: "text", text: "Erro: LLM retornou conte\xFAdo vazio. Verifique API key (GROQ_API_KEY, GEMINI_API_KEY) e tente novamente." }],
+          structuredContent: { ok: false, error: "Empty LLM response" }
+        };
+      }
+      const textWithCode = `Spec gerado (${specContent.length} chars). Use write_test para gravar com name="${fileName}" e content abaixo:
+--- C\xF3digo (passe em content para write_test) ---
+${specContent}`;
       return {
-        content: [{ type: "text", text: `Spec gerado (${specContent.length} chars). Use write_test para gravar.` }],
+        content: [{ type: "text", text: textWithCode }],
         structuredContent: {
           ok: true,
           specContent,
@@ -2477,6 +2643,12 @@ server.registerTool(
         structuredContent: { ok: false, error: "No test framework" }
       };
     }
+    if (!content || !String(content).trim()) {
+      return {
+        content: [{ type: "text", text: "Erro: content n\xE3o pode ser vazio. Chame generate_tests primeiro e passe o specContent retornado em content." }],
+        structuredContent: { ok: false, error: "Empty content" }
+      };
+    }
     const { ext, baseDir } = getExtensionAndBaseDir2(fw, structure);
     const safeName = name.replace(/[^a-z0-9-_]/gi, "-").replace(/-+/g, "-").replace(/_+/g, "_").replace(/\.(cy|spec|test|robot|feature|py)\.?(js|ts|py)?$/i, "").replace(/^[-_]+|[-_]+$/g, "");
     const fileName = ext.startsWith("_") ? `${safeName}${ext}` : `${safeName}${ext}`;
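Review note: the new guard rejects empty `content` before the file name is derived. The sanitization in the unchanged context lines that follow it can be traced with the sketch below. `toSafeName` is a hypothetical wrapper around the same chain of `replace` calls.

```javascript
// Hypothetical wrapper around the replace chain used to build safeName.
function toSafeName(name) {
  return name
    .replace(/[^a-z0-9-_]/gi, "-")   // anything outside letters, digits, "-", "_" becomes "-"
    .replace(/-+/g, "-")             // collapse runs of "-"
    .replace(/_+/g, "_")             // collapse runs of "_"
    .replace(/\.(cy|spec|test|robot|feature|py)\.?(js|ts|py)?$/i, "") // intended suffix strip
    .replace(/^[-_]+|[-_]+$/g, "");  // trim leading/trailing "-" and "_"
}

console.log(toSafeName("__my test__"));         // my-test
console.log(toSafeName("Login Flow!.spec.ts")); // Login-Flow-spec-ts
console.log(toSafeName("login.spec.js"));       // login-spec-js
```

As the last two calls suggest, dots are replaced by hyphens in the first step, so the suffix pattern (which requires a literal dot) does not appear to match by the time it runs. A name passed with its extension would therefore keep it in hyphenated form before `ext` is appended.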
@@ -4367,17 +4539,20 @@ server.registerTool(
       learnings.push({ attempt, action: "generate_tests", result: "gerando..." });
       const { provider, apiKey, baseUrl, model } = llm;
       const memoryHints = memory.learnings?.filter((l) => l.fix).slice(-10).map((l) => l.fix).join("\n") || "";
+      const packageInfo = structure.packageJson || {};
+      const isESM = packageInfo.type === "module";
       const systemPrompt = `Voc\xEA \xE9 um engenheiro de QA especializado em ${fw}. Gere APENAS o c\xF3digo do spec, sem explica\xE7\xF5es.
 ${UNIVERSAL_TEST_PRACTICES}
 ${memoryHints ? `Aprendizados anteriores (use como refer\xEAncia):
 ${memoryHints.slice(0, 1e3)}` : ""}
+${isESM ? "\nIMPORTANTE: Use sintaxe ESM (import/export), N\xC3O use require()." : ""}
 Retorne SOMENTE o c\xF3digo, sem markdown.`;
       const userPrompt = `Contexto:
 ${contextLines}
 Gere teste para: ${request}
-Framework: ${fw}`;
+Framework: ${fw}${isESM ? "\nUse import { test, expect } from '@playwright/test';" : ""}`;
       try {
         let specContent = "";
         if (provider === "gemini") {
@@ -4404,10 +4579,23 @@ Framework: ${fw}`;
             })
           });
           const data = await res.json();
+          if (data.error) {
+            learnings.push({ attempt, action: "llm_call", result: `\u274C API error: ${data.error.message}` });
+            throw new Error(`API Error: ${data.error.message || JSON.stringify(data.error)}`);
+          }
           specContent = data.choices?.[0]?.message?.content || "";
         }
+        if (!specContent || specContent.trim().length === 0) {
+          learnings.push({ attempt, action: "generate_test", result: "\u274C LLM retornou vazio" });
+          throw new Error("LLM retornou conte\xFAdo vazio. Verifique sua API key e conex\xE3o.");
+        }
+        learnings.push({ attempt, action: "generate_test", result: `\u2705 recebido ${specContent.length} chars` });
         specContent = specContent.replace(/^```(?:js|javascript|typescript)?\n?/i, "").replace(/\n?```\s*$/i, "").trim();
         testContent = specContent;
+        if (!testContent || testContent.trim().length === 0) {
+          learnings.push({ attempt, action: "parse_code", result: "\u274C c\xF3digo vazio ap\xF3s parsing" });
+          throw new Error("Ap\xF3s parsing, o c\xF3digo ficou vazio. Resposta do LLM pode estar em formato inesperado.");
+        }
         if (!testFilePath) {
           const fileName = request.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "").slice(0, 30);
           const { ext, baseDir } = getExtensionAndBaseDir2(fw, structure);
@@ -4416,10 +4604,16 @@ Framework: ${fw}`;
           if (!fs6.existsSync(baseDir)) fs6.mkdirSync(baseDir, { recursive: true });
         }
         fs6.writeFileSync(testFilePath, testContent, "utf8");
-        learnings.push({ attempt, action: "write_test", result: `gravado: ${testFilePath}` });
+        const writtenFileSize = fs6.statSync(testFilePath).size;
+        if (writtenFileSize === 0) {
+          learnings.push({ attempt, action: "write_test", result: "\u274C arquivo vazio ap\xF3s gravar" });
+          throw new Error("Arquivo gravado mas est\xE1 vazio. Problema na escrita do arquivo.");
+        }
+        learnings.push({ attempt, action: "write_test", result: `gravado: ${testFilePath} (${writtenFileSize} bytes)` });
         learnings.push({ attempt, action: "run_tests", result: "executando..." });
+        const runArg = fw === "playwright" ? path6.relative(PROJECT_ROOT6, testFilePath).replace(/\\/g, "/") : testFilePath;
         const runResult = await new Promise((resolve) => {
-          const child = spawn2("npx", [fw === "cypress" ? "cypress" : fw === "playwright" ? "playwright" : fw, fw === "cypress" ? "run" : fw === "playwright" ? "test" : "run", testFilePath], {
+          const child = spawn2("npx", [fw === "cypress" ? "cypress" : fw === "playwright" ? "playwright" : fw, fw === "cypress" ? "run" : fw === "playwright" ? "test" : "run", runArg], {
             cwd: PROJECT_ROOT6,
             stdio: ["inherit", "pipe", "pipe"],
             shell: process.platform === "win32"
@@ -4477,9 +4671,14 @@ ${runResult.output.slice(0, 500)}${learnedAppendix2}` }],
         }
         learnings.push({ attempt, action: "apply_fix", result: "aplicando corre\xE7\xE3o..." });
         const fixedCode = explainResult.structuredContent.sugestaoCorrecao;
+        if (!fixedCode || fixedCode.trim().length === 0) {
+          learnings.push({ attempt, action: "apply_fix", result: "\u274C corre\xE7\xE3o vazia" });
+          continue;
+        }
         testContent = fixedCode;
         fs6.writeFileSync(testFilePath, testContent, "utf8");
-        learnings.push({ attempt, action: "apply_fix", result: "corre\xE7\xE3o aplicada" });
+        const fixedFileSize = fs6.statSync(testFilePath).size;
+        learnings.push({ attempt, action: "apply_fix", result: `corre\xE7\xE3o aplicada (${fixedFileSize} bytes)` });
         if (flakyAnalysis.isLikelyFlaky) {
           const inferredPattern = inferFailurePattern(runResult.output, fw);
           const learningType = inferredPattern?.learningType || (flakyAnalysis.patterns[0]?.pattern === "selector" ? "selector_fix" : "timing_fix");