npm - @hasna/testers - Versions diffs - 0.0.35 → 0.0.37 - Mend

@hasna/testers 0.0.35 → 0.0.37

Files changed (26)

package/dist/cli/index.js +35695 -35422
package/dist/index.d.ts +2 -0
package/dist/index.d.ts.map +1 -1
package/dist/index.js +1178 -24
package/dist/lib/ai-client.d.ts +8 -2
package/dist/lib/ai-client.d.ts.map +1 -1
package/dist/lib/config.d.ts.map +1 -1
package/dist/lib/crawl-and-generate.d.ts +1 -1
package/dist/lib/crawl-and-generate.d.ts.map +1 -1
package/dist/lib/generator.d.ts.map +1 -1
package/dist/lib/healer.d.ts.map +1 -1
package/dist/lib/hybrid-runner.d.ts.map +1 -1
package/dist/lib/judge.d.ts +3 -3
package/dist/lib/judge.d.ts.map +1 -1
package/dist/lib/quick-qa.d.ts +61 -0
package/dist/lib/quick-qa.d.ts.map +1 -0
package/dist/lib/runner.d.ts +1 -0
package/dist/lib/runner.d.ts.map +1 -1
package/dist/lib/session-converter.d.ts.map +1 -1
package/dist/mcp/index.js +73 -33
package/dist/server/index.js +47 -25
package/package.json +2 -1
package/skills/skill-debug-prod/SKILL.md +97 -0
package/skills/skill-quick-qa/SKILL.md +81 -0
package/skills/skill-testers-qa/SKILL.md +89 -0
package/skills/skill-testers-workflow/SKILL.md +126 -0

package/skills/skill-debug-prod/SKILL.md ADDED

@@ -0,0 +1,97 @@
+---
+name: skill-debug-prod
+description: "Create a safe testers-powered production debug plan for a prod URL, request ID, session ID, project ID, org/user identifier, or login-as/check-prod request without leaking secrets or crossing tenant boundaries."
+argument-hint: "<prod-url|session-id|project-id|user-email|request-id> [--browser] [--messages] [--jobs] [--blocks] [--logs] [--full]"
+user_invocable: true
+---
+# skill-debug-prod
+Use this skill to investigate production issues while preserving customer
+privacy, tenant boundaries, and auditability. The execution surface is
+`testers prod-debug`; this skill is the safety policy and follow-through loop.
+## Safety Rules
+1. Never print secrets, cookies, bearer tokens, password reset links, magic
+   links, OAuth codes, private keys, raw headers, or full auth state.
+2. Never ask for or use a customer's password.
+3. Never query by a user-controlled identifier alone. Resolve the target org,
+   user, project, session, or request and verify scope before reading data.
+4. Never export bulk production data. Read only the minimum records needed.
+5. Never perform a production write unless the user explicitly approves that
+   exact write and the audit trail records the reason.
+6. Browser reproduction must use an audited support URL/session/grant. If none
+   exists, do read-only log/API checks and report the missing support tool.
+## Start With Testers
+Create the safe plan before touching app-specific tools:
+```bash
+testers prod-debug "<target>" --reason "<why you are debugging>" --json
+testers prod-debug "<prod-url>" --profile "<configured-app-profile>" --reason "<why>" --json
+testers prod-debug "<prod-url>" --support-url "<audited-support-url>" --support-grant "<grant-id>" --reason "<why>"
+```
+The command redacts sensitive URL parameters, parses likely org/project/session
+identifiers, proposes safe log/API/browser checks, and blocks user-scoped
+browser reproduction unless audited support access is present.
+If the CLI cannot produce the needed plan, add the missing capability to
+`open-testers`, test it, publish it, reinstall it, and then rerun the production
+debug plan.
+## Evidence To Capture
+Accept any of:
+- Prod URL
+- Request ID
+- Session ID
+- Project ID/reference
+- Org slug or user email
+- Browser/login/OAuth/connector symptom
+Capture sanitized evidence only:
+- Target org/project/session/user identifiers after scoping checks
+- Request IDs, job IDs, block IDs, timestamps, routes, status codes
+- Error names/codes and short redacted snippets
+- Support access grant/audit ID, scope, and TTL when used
+## Debugging Order
+1. Run `testers prod-debug` and follow its safe checks.
+2. Use app-specific audited wrappers only after the target and plan are clear.
+3. For login or browser repro, mint/use a support session with a short TTL.
+4. For logs, filter by request/session/project/org and redact before posting.
+5. For database reads, keep queries read-only and org-scoped.
+6. For connector/OAuth bugs, record provider, sanitized callback URL, request
+   ID, error code, and redirect URI mismatch. Never reveal OAuth codes or tokens.
+## Output
+Return a concise sanitized report:
+```text
+Target
+- org/project/session/user: ...
+- support access: read-only/browser-debug, TTL, audit id if available
+Findings
+- ...
+Evidence
+- request IDs, job IDs, block IDs, timestamps, statuses
+Likely cause
+- ...
+Fix/next action
+- code/config path to patch; approval needed if a prod write is required
+```
+## Done
+Done means the production target was scoped, `testers prod-debug` was run, the
+safe checks were followed, evidence was recorded in the active task, and any
+fix or missing tool has a verified follow-up.
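
The `skill-debug-prod` file above says the command redacts sensitive URL parameters before anything is reported. A minimal sketch of that idea follows. It is an illustration only and not how `testers prod-debug` is implemented: the function name `redact_url` and the list of parameter names are assumptions, since the source does not say which parameters the CLI treats as sensitive.

```bash
# Illustrative only: redact_url and the parameter names below are assumptions,
# not the list that `testers prod-debug` uses.
redact_url() {
  # Keep each sensitive parameter name and replace its value,
  # stopping at the next & or # so other parameters and the fragment survive.
  printf '%s\n' "$1" |
    sed -E 's/([?&](code|token|access_token|id_token|refresh_token|password|secret|signature)=)[^&#]*/\1REDACTED/g'
}

redact_url 'https://app.example.com/oauth/callback?code=abc123&state=xyz&token=t0k'
# prints: https://app.example.com/oauth/callback?code=REDACTED&state=xyz&token=REDACTED
```

The sketch only covers query parameters. Secrets embedded in path segments or fragments would need separate handling, which is one reason to prefer the CLI's own redaction over an ad hoc filter.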

package/skills/skill-quick-qa/SKILL.md ADDED

@@ -0,0 +1,81 @@
+---
+name: skill-quick-qa
+description: "Run a quick testers-powered QA pass, fix every bug found when the user asked for fixes, and verify the final app behavior. Trigger for quick QA, smoke test this, check the app, run browser QA, or find and fix product bugs."
+user_invocable: true
+---
+# skill-quick-qa
+Use `testers` as the primary execution surface. This skill is for a fast but
+real QA pass: server health, console/runtime errors, broken links, performance,
+optional accessibility, and optional autonomous smoke exploration.
+This is not a report-only skill when the user asks for fixes. Find issues, turn
+them into tracked tasks, fix the root cause, rerun the failing check, then rerun
+the quick QA pass.
+## Start
+1. Create or update the active `todos` task and add a progress comment.
+2. Determine the app URL:
+   - Use the URL from the user when provided.
+   - Otherwise inspect the repo for a dev command and port, start the server
+     yourself, and use `http://<machine>:<port>` for remote machine access.
+   - Bind local dev servers to `0.0.0.0` when another machine needs to reach
+     them.
+3. Confirm the server is answering before running browser checks:
+   ```bash
+   curl -fsS "<url>" >/tmp/testers-health.html
+   testers doctor
+   ```
+## Run The Quick Pass
+Default:
+```bash
+testers quick-qa "<url>" --json --output /tmp/testers-quick-qa.json
+```
+Use these variants when they fit:
+```bash
+testers quick-qa "<url>" --no-smoke --json --output /tmp/testers-quick-qa.json
+testers quick-qa "<url>" --a11y AA --json --output /tmp/testers-quick-qa.json
+testers quick-qa "<url>" --page / --page /login --page /dashboard --json
+testers quick-qa "<url>" --skip perf --skip smoke --json
+```
+Use `testers quick-check` only as an alias for `testers quick-qa`.
+If `testers quick-qa` is not available in the installed CLI, update/publish
+`@hasna/testers` from `open-testers` instead of falling back to unrelated
+browser tools.
+## Fix Loop
+For each failing issue:
+1. Record the failing URL, check name, severity, message, screenshot/report ID,
+   and command in the task comment.
+2. Classify the failure:
+   - App bug: broken route, UI state, console/network/runtime failure, bad auth.
+   - Test setup bug: stale scenario, missing auth, missing seed data.
+   - Environment bug: server down, migrations missing, provider key unavailable.
+3. Fix the smallest root cause and add a regression test at the repo's natural
+   test layer.
+4. Rerun the narrow failing command.
+5. Rerun `testers quick-qa`.
+For deeper flows that quick QA cannot cover, switch to `skill-testers-qa` or
+`skill-testers-workflow` and use saved scenarios/workflows rather than ad hoc
+manual clicking.
+## Done
+Only report completion when:
+- `testers quick-qa` has run against the target app.
+- Bugs found in a fix request are fixed and reverified.
+- The final command, output file/report, and remaining tracked issues are posted
+  to the active task.
+- Any remaining failures have explicit follow-up tasks with evidence.
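
The `skill-quick-qa` file above asks for confirmation that the server is answering before browser checks run. A freshly started dev server may need a few seconds, so a single `curl` can fail spuriously. The sketch below shows one way to wait for it. The helper name `retry` and its argument order are hypothetical and are not part of `testers`.

```bash
# retry <attempts> <delay-seconds> <command> [args...]
# Hypothetical helper: runs the command until it succeeds or attempts run out.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

With this in place, the health check from the skill could be written as `retry 10 2 curl -fsS "<url>" -o /dev/null` and followed by `testers quick-qa "<url>" --json --output /tmp/testers-quick-qa.json` only when it succeeds. The attempt count and delay here are arbitrary example values.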

package/skills/skill-testers-qa/SKILL.md ADDED

@@ -0,0 +1,89 @@
+---
+name: skill-testers-qa
+description: "Use @hasna/testers for a serious AI-native QA pass on a web app or repo. Trigger for requests like test this app, QA this feature, run testers, check the preview, validate auth/pages, run local or sandbox browser tests, or find and fix product bugs."
+user_invocable: true
+---
+# skill-testers-qa
+Use `testers` as the execution surface for app QA. This is broader than unit
+testing: it checks real pages, browser behavior, generated scenarios, repo-native
+tests, screenshots, console/network failures, personas, accessibility, and
+regressions. If bugs are found and the user asked for fixes, fix them and rerun.
+## Start
+1. Create or update a `todos` task and post a short start message:
+   ```bash
+   todos add "QA <app or feature>" --project "$(pwd)" --priority high --tags qa,testers
+   conversations send --space "<project-or-testers>" "Starting QA: <scope>"
+   ```
+2. Identify the target:
+   - If the user gave a URL, use it.
+   - If the app is local, discover the dev command and port from `package.json`,
+     `.env`, server docs, or existing process state. Start/restart it yourself.
+   - On multi-machine work, bind servers to `0.0.0.0` and use
+     `http://<machine>:<port>`.
+3. Run setup checks:
+   ```bash
+   testers doctor
+   testers project list --json || true
+   testers list --json || true
+   testers repo discover . --json || true
+   ```
+   Do not print API keys or secrets. If no provider key is available, either use
+   deterministic/repo-native tests or fix the key setup through the approved
+   secrets workflow.
+## Choose The Run
+- Fast default pass: `testers quick-qa <url> --json --output /tmp/testers-quick-qa.json`
+- Fast default without AI smoke: `testers quick-qa <url> --no-smoke --json`
+- Fast default with accessibility: `testers quick-qa <url> --a11y AA --json`
+- Existing scenarios: `testers run <url> --json --output /tmp/testers-run.json`
+- No scenarios yet: `testers run <url> --auto-generate --json --output /tmp/testers-run.json`
+- Focused feature: `testers generate <url> --focus "<area>" --save`, then run by
+  tag or scenario.
+- Fast CI smoke: `testers run <url> --smoke --minimal --json`
+- Accessibility: `testers run <url> --a11y AA --json`
+- Selector churn: add `--self-heal` when the goal is to repair flaky selectors.
+- Changed files only: `testers run-affected <url>` or `testers run <url> --diff`.
+- Repo-native Playwright: `testers repo prepare .` then `testers repo run .`.
+- Larger or risky workflow: create/run a sandbox workflow with
+  `skill-testers-workflow`.
+Prefer provider-specific model IDs when useful:
+- Cerebras: `--model qwen-*` or `--model llama-*`
+- Z.AI GLM: `--model glm-5.1`
+- OpenAI: `--model gpt-*`
+- Google: `--model gemini-*`
+- Anthropic/default: Claude model IDs or presets
+## Investigate Failures
+After a run:
+```bash
+testers runs --json
+testers results <run-id> --json
+testers screenshots <run-or-result-id> --json
+testers report <run-id>
+```
+Classify each failure before editing:
+- App bug: user-visible error, broken route, console/network failure, bad UI state.
+- Test bug: stale selector, wrong assumption, missing auth/persona/setup.
+- Environment bug: server down, database not migrated, missing provider key.
+If it is an app bug, reproduce with the smallest scenario or browser step,
+write a regression test where the repo has an appropriate test layer, fix the
+root cause, rerun the failing scenario, then rerun the relevant suite.
+## Done
+The task is done only when:
+- The target URL/app was actually exercised.
+- Results, screenshots or report IDs are recorded in the task/comment.
+- Bugs found during a fix request are fixed and reverified.
+- The final run is green or remaining failures are scoped, reproduced, and
+  intentionally tracked as follow-up tasks.
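
The setup checks in `skill-testers-qa` above forbid printing API keys, yet the agent still has to know whether a provider key is available. A minimal sketch of checking presence without echoing the value is shown below. The function name `key_status` is hypothetical, and the variable names in the loop are assumptions: the source does not state which environment variables `testers` reads.

```bash
# Hypothetical helper: report whether an environment variable is set
# without ever printing its value.
key_status() {
  # printenv only sees exported variables, which matches what a child
  # process such as the CLI would actually receive.
  if [ -n "$(printenv "$1")" ]; then
    printf '%s: set\n' "$1"
  else
    printf '%s: missing\n' "$1"
  fi
}

# Assumed variable names, for illustration only.
for name in ANTHROPIC_API_KEY OPENAI_API_KEY; do
  key_status "$name"
done
```

If every key is reported missing, the skill's guidance applies: fall back to deterministic or repo-native tests, or fix the key setup through the approved secrets workflow.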

package/skills/skill-testers-workflow/SKILL.md ADDED

@@ -0,0 +1,126 @@
+---
+name: skill-testers-workflow
+description: "Create, run, and maintain reusable @hasna/testers workflows for deterministic scripts, agentic goal loops, personas, local execution, and sandbox execution. Trigger when asked to map workflows, test a user journey, run a script, use sandboxes, or make repeatable QA flows."
+user_invocable: true
+---
+# skill-testers-workflow
+Use this when a QA request is more than a one-off page check: auth flows,
+project creation, chat prompts, connector setup, billing, admin actions,
+multi-persona behavior, non-deterministic AI interactions, or any flow that
+should be saved and rerun.
+## Model The Workflow First
+1. Name the user-visible journey, not the implementation detail.
+2. Split deterministic checks from agentic/non-deterministic steps.
+3. Decide the execution target:
+   - `local`: fast, cheap, good for simple flows and local dev servers.
+   - `sandbox`: bigger, slower, better for isolated repo setup, long-running
+     workflows, destructive tests, or tests that need a clean machine.
+4. Decide whether this should be:
+   - Scenarios: stored steps run by `testers run`.
+   - A workflow: reusable saved bundle with tags/personas/goal/sandbox config.
+   - A hybrid script: TypeScript file run by `testers run-script`.
+   - A goal loop: `testers workflow agent`, which can create open-todos next
+     actions from observed failures.
+## Create Scenarios
+For manual scenario steps:
+```bash
+testers add "User can create a project" \
+  --description "Creates a project from the dashboard and verifies it appears" \
+  --steps "Open the dashboard" \
+  --steps "Click New project" \
+  --steps "Enter a unique project name" \
+  --steps "Save the project" \
+  --steps "Verify the project appears in the list" \
+  --tag projects --tag smoke --priority high
+```
+For AI-generated coverage:
+```bash
+testers generate "<url>" --focus "<journey or area>" --save --json
+testers list --tag "<tag>" --json
+```
+For recorded sessions:
+```bash
+testers record "<url>"
+testers convert "<recording-or-har-file>" --model "<model>" --json
+```
+## Save A Workflow
+Local workflow:
+```bash
+testers workflow create "<name>" \
+  --description "<what the journey proves>" \
+  --tag "<tag>" \
+  --goal "<agentic testing goal if needed>" \
+  --success "<observable success criterion>" \
+  --target local \
+  --json
+```
+Sandbox workflow:
+```bash
+testers workflow create "<name>" \
+  --description "<what the journey proves>" \
+  --tag "<tag>" \
+  --goal "<agentic testing goal if needed>" \
+  --success "<observable success criterion>" \
+  --target sandbox \
+  --sandbox-provider e2b \
+  --sandbox-package @hasna/testers \
+  --sandbox-setup-command "<repo setup command>" \
+  --sandbox-cleanup delete \
+  --json
+```
+Run or inspect before launching:
+```bash
+testers workflow show <id> --json
+testers workflow run <id> --url "<url>" --dry-run --json
+testers workflow run <id> --url "<url>" --model "<model>" --json
+testers workflow agent <id> --url "<url>" --model "<model>" --json
+```
+## Hybrid Scripts
+Use `testers run-script` when part of the flow is deterministic Playwright-like
+automation and part needs AI judgment. Keep scripts in the app repo near other
+tests, not in global config.
+```bash
+testers run-script tests/qa/<workflow>.ts --url "<url>" --json
+```
+Hybrid scripts should export `HybridScenario[]` and keep selectors stable
+through roles, labels, or `data-testid`.
+## Maintenance Rules
+- Store reusable workflows/scenarios in `testers`; do not leave them only in
+  chat history.
+- Prefer tags that map to product areas: `auth`, `projects`, `billing`,
+  `connectors`, `admin`, `chat`, `smoke`, `regression`.
+- Use personas for role-sensitive behavior instead of hardcoding user state.
+- Never store secrets in workflow descriptions, steps, scripts, or generated
+  JSON. Use env vars or the approved secrets workflow.
+- If a workflow fails because the app is wrong, fix the app and rerun. If it
+  fails because the workflow is stale, update the workflow and record why.
+## Done
+Done means the workflow is saved or the script exists, a dry-run plan was
+checked, at least one real run was executed, and the result/report is attached
+or summarized in the active `todos` task.
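
The "Done" criteria in `skill-testers-workflow` above require a checked dry-run plan before a real run. The sketch below chains the two `testers workflow run` invocations shown in that file. It assumes the dry run exits non-zero when the plan is unusable, which the source does not state. The function name `run_workflow_checked` and the `TESTERS_BIN` variable are hypothetical.

```bash
# TESTERS_BIN is a hypothetical override so the sketch can be exercised
# without the real CLI; it defaults to `testers`.
run_workflow_checked() {
  local id="$1" url="$2" model="$3"
  local bin="${TESTERS_BIN:-testers}"
  # Inspect the plan first and stop if the dry run reports a failure.
  "$bin" workflow run "$id" --url "$url" --dry-run --json || return 1
  # Only then launch the real run.
  "$bin" workflow run "$id" --url "$url" --model "$model" --json
}
```

A call would look like `run_workflow_checked <id> "<url>" "<model>"`. Setting `TESTERS_BIN=echo` prints the two commands instead of running them, which is a cheap way to review what would be executed.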