@allenpan2026/harshjudge 0.4.1 → 0.4.2

@@ -1,6 +1,6 @@
  {
    "name": "harshjudge",
-   "version": "0.4.1",
+   "version": "0.4.2",
    "description": "AI-native E2E testing orchestration for Claude Code",
    "author": {
      "name": "Allen Pan"
package/README.md CHANGED
@@ -34,7 +34,10 @@ harshjudge init my-app
  
  - **Node.js**: 18+ LTS
  - **Claude Code**: Latest version
- - **Playwright MCP Server**: For browser automation (screenshots, navigation, clicks)
+ - **A browser automation tool** (any one of):
+   - Playwright MCP (default, most common)
+   - browser-use MCP (token efficient alternative)
+   - Chrome DevTools MCP
  
  ## Installation
  
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@allenpan2026/harshjudge",
-   "version": "0.4.1",
+   "version": "0.4.2",
    "description": "AI-native E2E testing orchestration CLI for Claude Code.",
    "type": "module",
    "main": "./dist/index.js",
@@ -0,0 +1,63 @@
+ # Browser Tool Reference
+
+ Used during step execution in [[run]].
+
+ HarshJudge is **browser-tool-agnostic**. Use whatever browser automation tool is available in your environment. The step agent needs these capabilities:
+
+ ## Required Capabilities
+
+ | Action | What to do |
+ |--------|-----------|
+ | Navigate | Go to a URL |
+ | Inspect page | Get current page state (DOM, accessibility tree) before interacting |
+ | Click | Click an element by text, role, or reference |
+ | Type | Enter text into an input field |
+ | Select | Choose an option from a dropdown |
+ | Wait | Wait for text to appear/disappear, or for a timeout |
+ | Screenshot | Capture the current page as an image file |
+ | Console logs | Read browser console output |
+ | Network logs | Read network requests/responses |
+
+ ## Supported Browser Tools
+
+ ### Playwright MCP (Default)
+
+ Most common. Available as a Claude Code plugin.
+
+ ```json
+ {
+   "playwright": {
+     "command": "npx",
+     "args": ["@playwright/mcp@latest"]
+   }
+ }
+ ```
+
+ Tools: `browser_navigate`, `browser_click`, `browser_type`, `browser_snapshot`, `browser_take_screenshot`, `browser_wait_for`, `browser_console_messages`, `browser_network_requests`
+
+ ### browser-use MCP (Token Efficient Alternative)
+
+ Compresses DOM before sending to LLM — significantly fewer tokens per interaction. Python-based.
+
+ Setup: See [browser-use MCP docs](https://docs.browser-use.com/customize/integrations/mcp-server)
+
+ ### Chrome DevTools MCP
+
+ Connects to an already-running Chrome instance via remote debugging.
+
+ ```json
+ {
+   "chrome-devtools": {
+     "command": "npx",
+     "args": ["chrome-devtools-mcp"]
+   }
+ }
+ ```
+
+ ## Best Practices
+
+ - Always inspect the page before clicking or typing to get current element state
+ - Take a screenshot **before** and **after** each significant action
+ - Wait after navigation to confirm the page loaded
+ - Capture console errors on unexpected behavior
+ - Save screenshots to a temp path, then record via `harshjudge evidence`
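Reviewer note: the capability table in the new reference can be read as a lookup from each abstract action to a concrete tool name. The sketch below is illustrative only and is not part of the package. `BrowserAction`, `PLAYWRIGHT_TOOLS`, and `missingCapabilities` are hypothetical names; the tool names are the Playwright MCP ones that appear in this diff, with `browser_select_option` taken from the removed Playwright reference because the new tool list does not mention it.

```typescript
// Hypothetical names for illustration; HarshJudge does not export these.
type BrowserAction =
  | "navigate"
  | "inspect"
  | "click"
  | "type"
  | "select"
  | "wait"
  | "screenshot"
  | "consoleLogs"
  | "networkLogs";

// Playwright MCP tool names as they appear in this diff.
const PLAYWRIGHT_TOOLS: Record<BrowserAction, string> = {
  navigate: "browser_navigate",
  inspect: "browser_snapshot",
  click: "browser_click",
  type: "browser_type",
  select: "browser_select_option",
  wait: "browser_wait_for",
  screenshot: "browser_take_screenshot",
  consoleLogs: "browser_console_messages",
  networkLogs: "browser_network_requests",
};

// Returns the required actions that a given set of tool names cannot cover.
function missingCapabilities(available: ReadonlySet<string>): BrowserAction[] {
  return (Object.keys(PLAYWRIGHT_TOOLS) as BrowserAction[]).filter(
    (action) => !available.has(PLAYWRIGHT_TOOLS[action]),
  );
}
```

A similar map could be written for browser-use or Chrome DevTools MCP, but their tool names are not listed in this diff, so they are left out here.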
@@ -20,8 +20,8 @@ Status: {pass|fail|first step}
  ## Your Task
  1. Navigate to the base URL if not already there
  2. Execute the actions described in the step content
- 3. Use browser_snapshot before clicking to get element refs
- 4. Capture before/after screenshots using browser_take_screenshot
+ 3. Use the available browser tool to inspect the page before interacting
+ 4. Take before/after screenshots using the browser tool
  5. Record evidence:
     harshjudge evidence {runId} --step {stepNumber} --type screenshot --name before --data /path/to/screenshot.png
  6. Verify the expected outcome
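Reviewer note: step 5 of the template records a screenshot through the CLI. A minimal sketch of assembling that command's arguments is shown below, using only the flags visible in the template (`--step`, `--type`, `--name`, `--data`). The `ScreenshotEvidence` interface, the `evidenceArgs` helper, and the sample run ID and path are hypothetical; `--name after` is an assumption based on the "before/after" wording, since the template only shows `before`.

```typescript
// Hypothetical helper; only the flags shown in the step template are used.
interface ScreenshotEvidence {
  runId: string;
  step: number;
  name: "before" | "after"; // "after" is assumed; the template shows "before"
  path: string; // where the browser tool saved the image
}

// Builds the argument list for `harshjudge evidence` as an array,
// so no shell quoting is needed when it is passed to a process spawner.
function evidenceArgs(e: ScreenshotEvidence): string[] {
  return [
    "evidence",
    e.runId,
    "--step",
    String(e.step),
    "--type",
    "screenshot",
    "--name",
    e.name,
    "--data",
    e.path,
  ];
}

const args = evidenceArgs({
  runId: "run-123", // placeholder run ID
  step: 1,
  name: "before",
  path: "/tmp/step-01-before.png", // placeholder path
});
console.log(["harshjudge", ...args].join(" "));
```

Keeping the arguments as an array means they can be handed to something like Node's `execFile` without worrying about spaces in the screenshot path.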
@@ -15,7 +15,7 @@ Use this workflow when user wants to:
  3. `harshjudge complete-step <runId>` — Complete each step, get next step
  4. `harshjudge complete-run <runId>` — Finalize with pass/fail status
  
- See [[run-playwright]] for Playwright tool reference.
+ See [[run-browser]] for browser tool reference (Playwright MCP, browser-use, Chrome DevTools).
  
  > **TOKEN OPTIMIZATION**: Each step executes in its own spawned agent. This isolates context and prevents token accumulation.
  
@@ -1,41 +0,0 @@
- # Playwright Tools Reference
-
- Used during step execution in [[run]].
-
- ## Navigation & State
-
- | Tool | Usage |
- |------|-------|
- | `browser_navigate` | `{ "url": "http://localhost:3000" }` |
- | `browser_snapshot` | `{}` → Returns accessibility tree with refs |
- | `browser_take_screenshot` | `{ "filename": "step-01-before.png" }` |
-
- ## Interactions
-
- | Tool | Usage |
- |------|-------|
- | `browser_click` | `{ "element": "Login button", "ref": "e5" }` |
- | `browser_type` | `{ "element": "Email input", "ref": "e4", "text": "test@example.com" }` |
- | `browser_select_option` | `{ "element": "Country", "ref": "e7", "values": ["USA"] }` |
-
- ## Waiting
-
- | Tool | Usage |
- |------|-------|
- | `browser_wait_for` | `{ "text": "Welcome" }` |
- | `browser_wait_for` | `{ "textGone": "Loading..." }` |
- | `browser_wait_for` | `{ "time": 2 }` |
-
- ## Debugging
-
- | Tool | Usage |
- |------|-------|
- | `browser_console_messages` | `{ "level": "error" }` |
- | `browser_network_requests` | `{}` |
-
- ## Best Practices
-
- - Always call `browser_snapshot` before `browser_click` or `browser_type` to get current element refs
- - Take a screenshot **before** and **after** each significant action
- - Use `browser_wait_for` after navigation to confirm page loaded
- - Capture console errors on any unexpected behavior
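
Reviewer note: the removed reference listed three payload shapes for `browser_wait_for`, which the new reference summarizes as "wait for text to appear/disappear, or for a timeout". One way to keep those shapes explicit is a small union type, sketched below. `WaitFor` and `describeWait` are hypothetical names, the field names mirror the removed table, and the unit of `time` is assumed to be seconds.

```typescript
// Hypothetical type; field names mirror the removed Playwright table.
type WaitFor =
  | { text: string } // wait until the text appears
  | { textGone: string } // wait until the text disappears
  | { time: number }; // fixed wait; unit assumed to be seconds

// Produces a human-readable description of a wait payload.
function describeWait(w: WaitFor): string {
  if ("text" in w) return `wait for "${w.text}" to appear`;
  if ("textGone" in w) return `wait for "${w.textGone}" to disappear`;
  return `wait ${w.time}s`;
}
```

For example, `describeWait({ textGone: "Loading..." })` describes the second row of the removed Waiting table.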