@codyswann/lisa 1.89.0 → 1.90.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/plugins/lisa/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa/rules/base-rules.md +1 -0
- package/plugins/lisa-cdk/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-expo/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-expo/skills/playwright-ci-debugging/SKILL.md +140 -0
- package/plugins/lisa-nestjs/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-rails/.claude-plugin/plugin.json +1 -1
- package/plugins/lisa-typescript/.claude-plugin/plugin.json +1 -1
- package/plugins/src/base/rules/base-rules.md +1 -0
- package/plugins/src/expo/skills/playwright-ci-debugging/SKILL.md +140 -0
package/package.json
CHANGED

@@ -78,7 +78,7 @@
     "lodash": ">=4.18.1"
   },
   "name": "@codyswann/lisa",
-  "version": "1.89.0",
+  "version": "1.90.0",
   "description": "Claude Code governance framework that applies guardrails, guidance, and automated enforcement to projects",
   "main": "dist/index.js",
   "exports": {
package/plugins/lisa/rules/base-rules.md
CHANGED

@@ -50,6 +50,7 @@ Git Discipline:
 - Prefix git push with `GIT_SSH_COMMAND="ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=5"`.
 - Never commit directly to an environment branch (dev, staging, main).
 - Never use --no-verify or attempt to bypass a git hook.
+- Never bypass branch protection. Never use `--admin`, `--force`, or any other flag to merge a PR that has failing CI checks. If CI fails, fix it. If you cannot fix it, escalate to the human. There are zero exceptions. "Green in CI" is the definition of done — not "green locally." A PR is not complete until CI passes on the actual PR branch.
 - Never stash changes you cannot commit. Either fix whatever is preventing the commit or fail out and let the human know why.
 - Never add "BREAKING CHANGE" to a commit message unless there is actually a breaking change.
 - When opening a PR, watch the PR. If any status checks fail, fix them. For all bot code reviews, if the feedback is valid, implement it and push the change to the PR. Then resolve the feedback. If the feedback is not valid, reply to the feedback explaining why it's not valid and then resolve the feedback. Do this in a loop until the PR is able to be merged and then merge it.
package/plugins/lisa-expo/skills/playwright-ci-debugging/SKILL.md
ADDED

@@ -0,0 +1,140 @@
---
name: playwright-ci-debugging
description: Debug Playwright E2E tests that pass locally but fail in CI (or vice versa) in Expo web projects. Covers local reproduction, network interception, CI environment discovery, commit SHA verification, and robust interaction patterns that eliminate flake. Use this skill when a Playwright test is failing in CI, a test is flaky, a PR is blocked by E2E checks, or you need to investigate CI-specific test behavior. Trigger on mentions of CI failure, failing Playwright test, flaky E2E test, or debugging E2E in CI.
---

# Debugging Playwright E2E Failures in CI

The authoring-side rules (selectors, testID forwarding, naming) are in the `playwright-selectors` skill. This skill is for the other half of the job: when a test fails in CI and you need to find out why.

## Debugging Order

When a Playwright E2E test fails in CI, follow this exact order.

### 1. Reproduce locally first — before anything else

Start the project's dev server, then run the failing test in isolation:

```bash
# Start the dev server (discover the script from package.json — commonly `start`, `dev`, `start:dev`, or `web`)
<pkg-manager> run <dev-script>

# In another terminal, run the exact failing test with no retries
BASE_URL=http://localhost:8081/ npx playwright test <file> --grep "<test name>" --retries=0
```

`--retries=0` is critical — retries mask flake. `--grep` isolates the single failing test so you're not waiting on a full suite.

**Do NOT read source code, CI logs, or theorize until you can reproduce locally.** Most CI failures reproduce locally if you run the same test against the same served build. Guessing from logs is slow and usually wrong.

### 2. Intercept the network request

When a test involves an API call, set up a Playwright response listener and inspect the status code and response body. A 400/500 response tells you everything.

```typescript
page.on("response", async (response) => {
  if (response.url().includes("/api/")) {
    console.log(response.status(), response.url(), await response.text());
  }
});
```

UI failures are often downstream of a failed API call. The network log is cheaper evidence than the DOM.

### 3. If it doesn't reproduce locally, understand the CI environment first

CI runs against a different environment than your laptop. Before changing anything:

- Find the CI job that runs Playwright (usually `.github/workflows/*.yml`).
- Identify which `.env.*` file it loads — this varies by target branch (e.g., dev → `.env.development`, staging → `.env.staging`).
- Note that CI typically builds a static web bundle (`expo export --platform web`) then serves it on `localhost:8081` via `serve dist`. It is NOT hitting your dev server. Timing, bundling, and env vars all differ.
- Reproduce the CI setup locally: build the static bundle, serve it the same way, then run the test. If it now fails locally, you've isolated an env/build difference.

### 4. Verify commit SHA before trusting CI results

After pushing, confirm the CI run's `headSha` matches your latest commit:

```bash
gh run list --branch "$(git branch --show-current)" --limit 1 --json headSha,status,conclusion
git rev-parse HEAD
```

Bots (review-response, auto-update, dependabot) can push commits between your push and the CI run, overwriting your changes. A green check on a stale SHA tells you nothing about your fix.

## Patterns That Eliminate Flake

### Never use fixed waits before interactions

`waitForTimeout()` as the sole wait before a click is a silent failure waiting to happen — animations and rendering take variable time, especially on slower CI runners. Poll for the expected state, then act.

```typescript
// BAD — fixed wait; click silently fails if element not yet visible
await clickVisibleText(page, "Translate");
await page.waitForTimeout(1000);
await clickVisibleText(page, "Spanish");

// GOOD — poll for visibility, then click
await clickVisibleText(page, "Translate");
await expect
  .poll(() => hasVisibleText(page, "Spanish"), { timeout: TIMEOUT.expect })
  .toBe(true);
await clickVisibleText(page, "Spanish");
```

### Never silently swallow click return values

Any helper that can return `false` on a missed click (e.g., `clickVisibleText`) should either have its return value asserted or be preceded by a visibility poll. A click that silently returns `false` is a hidden test bug — the test proceeds as if the click happened and fails downstream with a confusing error.

```typescript
// BAD — return value ignored; test continues on failed click
await clickVisibleText(page, "Submit");

// GOOD — assert the click happened
expect(await clickVisibleText(page, "Submit")).toBe(true);

// GOOD — precede with visibility poll
await expect
  .poll(() => hasVisibleText(page, "Submit"), { timeout: TIMEOUT.expect })
  .toBe(true);
await clickVisibleText(page, "Submit");
```

### E2E tests must not depend on external API success

Any test that calls an external service (AWS Bedrock, third-party APIs, rate-limited providers) must handle the failure case. Tests should verify UI behavior, not external service uptime. If an external call might fail, the test must accept both outcomes or skip the dependent assertion.

```typescript
// BAD — assumes translation always succeeds
await expect
  .poll(() => hasVisibleText(page, "Show Original"), { timeout: TIMEOUT.expect })
  .toBe(true);

// GOOD — handle both success and failure
await expect
  .poll(
    async () =>
      (await hasVisibleText(page, "Show Original")) ||
      (await hasVisibleText(page, "Translate")),
    { timeout: TIMEOUT.expect }
  )
  .toBe(true);

const translated = await hasVisibleText(page, "Show Original");
if (!translated) return; // External API failed; skip downstream assertions
```

The alternative — mocking the external call — is a valid approach when the goal is to test the UI's handling of a successful response. Pick one strategy per test and commit to it.

### E2E assertions must not depend on specific test data

Do not assert specific text (e.g., "No lists detected") when the test user's data state varies across environments. If a zero-row state could have different empty-state messages depending on the user's data, either check for multiple possible states or skip the assertion entirely.

See the `playwright-selectors` skill's "Data independence" section for authoring patterns that avoid this class of bug in the first place.

## Escalation

If you've worked through steps 1–4 and still cannot explain the failure:

- Do NOT disable, skip, or `.fixme` the test to unblock the PR.
- Do NOT use `--admin` or force-merge to bypass the check.
- Escalate to a human with: the failing test name, your local reproduction attempt, the network trace, and the commit SHA verification. A real flake that cannot be root-caused is a legitimate reason to pause; it is never a reason to merge red.
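Editor's note: the step-4 check in the skill above compares two commit SHAs by eye. A minimal sketch of wrapping that comparison in a guard function; the `gh` invocation in the comments is an assumption (it relies on the `--jq` flag available in gh 2.x), and the SHAs passed at the bottom are placeholder values:

```shell
#!/bin/sh
# Compare a CI run's headSha against the local HEAD before trusting the run.
# In a real repo the inputs would come from (per the skill text; assumed flags):
#   ci_sha=$(gh run list --branch "$(git branch --show-current)" --limit 1 --json headSha --jq '.[0].headSha')
#   local_sha=$(git rev-parse HEAD)
check_sha() {
  if [ "$1" = "$2" ]; then
    echo "match: CI ran your commit ($2)"
  else
    echo "stale: CI ran $1 but HEAD is $2; do not trust this run"
  fi
}

check_sha "abc123" "abc123"
check_sha "abc123" "def456"
```

The function is pure string comparison, so it behaves the same whether a bot pushed over your commit or the run simply has not picked up your push yet; either way, a mismatch means the green or red check says nothing about your change.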
package/plugins/src/base/rules/base-rules.md
CHANGED

@@ -50,6 +50,7 @@ Git Discipline:
 - Prefix git push with `GIT_SSH_COMMAND="ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=5"`.
 - Never commit directly to an environment branch (dev, staging, main).
 - Never use --no-verify or attempt to bypass a git hook.
+- Never bypass branch protection. Never use `--admin`, `--force`, or any other flag to merge a PR that has failing CI checks. If CI fails, fix it. If you cannot fix it, escalate to the human. There are zero exceptions. "Green in CI" is the definition of done — not "green locally." A PR is not complete until CI passes on the actual PR branch.
 - Never stash changes you cannot commit. Either fix whatever is preventing the commit or fail out and let the human know why.
 - Never add "BREAKING CHANGE" to a commit message unless there is actually a breaking change.
 - When opening a PR, watch the PR. If any status checks fail, fix them. For all bot code reviews, if the feedback is valid, implement it and push the change to the PR. Then resolve the feedback. If the feedback is not valid, reply to the feedback explaining why it's not valid and then resolve the feedback. Do this in a loop until the PR is able to be merged and then merge it.
package/plugins/src/expo/skills/playwright-ci-debugging/SKILL.md
ADDED

@@ -0,0 +1,140 @@
(identical in content to package/plugins/lisa-expo/skills/playwright-ci-debugging/SKILL.md above)