opencode-webfetch-plugin 0.1.0 → 0.1.2
- package/AGENTS.md +189 -0
- package/bun.lock +327 -11
- package/c1.ts +57 -0
- package/c2.js +28 -0
- package/package.json +13 -5
- package/src/BrowserManager.ts +36 -99
- package/src/BrowserServer.ts +226 -0
- package/src/BrowserWorkerManager.ts +184 -0
- package/src/HumanInteractor.ts +11 -6
- package/src/browser-worker.ts +136 -0
- package/src/index.ts +10 -31
- package/test.md +1 -0
- package/tests/BrowserWorkerManager.test.ts +49 -0
- package/tests/browser.test.ts +84 -0
- package/tests/helpers/cookie-worker.ts +51 -0
- package/vitest.config.ts +17 -0
- package/tsconfig.json +0 -17
package/AGENTS.md
ADDED
|
@@ -0,0 +1,189 @@
|
|
|
1
|
+
# Agent Guidelines for Opencode Google AI Search Plugin
|
|
2
|
+
|
|
3
|
+
This document provides coding agents with essential information about the project structure, build commands, and code style conventions.
|
|
4
|
+
|
|
5
|
+
## Project Overview
|
|
6
|
+
|
|
7
|
+
This is an Opencode plugin that provides a `webfetch` tool for fetching webpage content in markdown format. It uses Playwright for browser automation, handles captchas and login screens through human-in-the-loop interaction, and converts HTML to markdown using Readability and Turndown.
|
|
8
|
+
|
|
9
|
+
**Key Technologies:**
|
|
10
|
+
- TypeScript (ES2022, ESM modules)
|
|
11
|
+
- Playwright (peer dependency)
|
|
12
|
+
- @opencode-ai/plugin SDK
|
|
13
|
+
- JSDOM + Readability for content extraction
|
|
14
|
+
- Turndown for HTML-to-Markdown conversion
|
|
15
|
+
|
|
16
|
+
## Build Commands
|
|
17
|
+
|
|
18
|
+
### Installation
|
|
19
|
+
```bash
|
|
20
|
+
bun install
|
|
21
|
+
# or
|
|
22
|
+
npm install
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
### Build
|
|
26
|
+
```bash
|
|
27
|
+
bun run build
|
|
28
|
+
# or
|
|
29
|
+
npm run build
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
This compiles TypeScript from `src/` to `dist/` with ESM output, type declarations, and source maps.
|
|
33
|
+
|
|
34
|
+
### Clean
|
|
35
|
+
```bash
|
|
36
|
+
bun run clean
|
|
37
|
+
# or
|
|
38
|
+
npm run clean
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
Removes the `dist/` folder.
|
|
42
|
+
|
|
43
|
+
### Playwright Setup
|
|
44
|
+
```bash
|
|
45
|
+
bun install playwright
|
|
46
|
+
npx playwright install chromium
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
**Note:** There are currently no test or lint scripts defined in package.json. If adding tests, use a test runner compatible with ESM modules.
|
|
50
|
+
|
|
51
|
+
## Project Structure
|
|
52
|
+
|
|
53
|
+
```
|
|
54
|
+
src/
|
|
55
|
+
├── index.ts # Plugin entry point, tool registration
|
|
56
|
+
├── BrowserManager.ts # Browser lifecycle, navigation, blocker detection
|
|
57
|
+
├── Extractor.ts # Content extraction and markdown conversion
|
|
58
|
+
└── HumanInteractor.ts # Human-in-the-loop interaction for captchas/logins
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
## Code Style Guidelines
|
|
62
|
+
|
|
63
|
+
### Module System
|
|
64
|
+
- **Type:** ES2022 modules (`"type": "module"` in package.json)
|
|
65
|
+
- **Imports:** Always use `.js` extension for local imports (TypeScript ESM requirement)
|
|
66
|
+
```typescript
|
|
67
|
+
import { BrowserManager } from "./BrowserManager.js";
|
|
68
|
+
import { Extractor } from './Extractor.js';
|
|
69
|
+
```
|
|
70
|
+
- **External imports:** No extension needed
|
|
71
|
+
```typescript
|
|
72
|
+
import { type Plugin, tool } from "@opencode-ai/plugin";
|
|
73
|
+
import type { Page } from 'playwright';
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
### TypeScript Configuration
|
|
77
|
+
- **Target:** ES2022
|
|
78
|
+
- **Module Resolution:** Bundler
|
|
79
|
+
- **Strict mode:** Enabled
|
|
80
|
+
- **Lib:** ES2022 + DOM
|
|
81
|
+
- Always generate declaration files and source maps
|
|
82
|
+
|
|
83
|
+
### Naming Conventions
|
|
84
|
+
- **Classes:** PascalCase (e.g., `BrowserManager`, `Extractor`, `HumanInteractor`)
|
|
85
|
+
- **Functions/Methods:** camelCase (e.g., `fetchWebpage`, `extractMarkdown`, `askForHumanHelp`)
|
|
86
|
+
- **Constants:** UPPER_SNAKE_CASE (e.g., `DEFAULT_TIMEOUT`, `MAX_TIMEOUT`)
|
|
87
|
+
- **Private fields:** Use `private` keyword, camelCase (e.g., `private context`, `private page`)
|
|
88
|
+
- **Type imports:** Use `type` keyword when importing types only
|
|
89
|
+
```typescript
|
|
90
|
+
import type { ToolContext } from "@opencode-ai/plugin";
|
|
91
|
+
import type { BrowserContext, Page } from 'playwright';
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
### Class Structure
|
|
95
|
+
- Static methods for utility classes (e.g., `Extractor.extractMarkdown()`, `HumanInteractor.askForHumanHelp()`)
|
|
96
|
+
- Instance methods for stateful classes (e.g., `BrowserManager`)
|
|
97
|
+
- Constructor dependency injection pattern (see `BrowserManager` constructor)
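
The split between static utility classes and stateful classes with constructor injection can be sketched as follows. `TextUtil`, `VisitTracker`, and `Logger` are hypothetical names chosen for illustration; they are not part of the plugin, and the real `BrowserManager` constructor may take different dependencies.

```typescript
// Hypothetical dependency type, used only for this sketch.
interface Logger {
  warn(message: string): void;
}

// Utility class: no state, so every method is static.
class TextUtil {
  static collapseWhitespace(input: string): string {
    return input.replace(/\s+/g, ' ').trim();
  }
}

// Stateful class: dependencies arrive through the constructor.
class VisitTracker {
  private readonly logger: Logger;
  private visited: string[] = [];

  constructor(logger: Logger) {
    this.logger = logger;
  }

  /** Records a visit and returns the total number of visits so far. */
  visit(url: string): number {
    if (this.visited.includes(url)) {
      this.logger.warn(`Already visited: ${url}`);
    }
    this.visited.push(url);
    return this.visited.length;
  }
}
```

Injecting the logger keeps the stateful class easy to test, because a test can pass in a stub that records warnings.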
|
|
98
|
+
|
|
99
|
+
### Error Handling
|
|
100
|
+
- Use try-catch blocks for async operations
|
|
101
|
+
- Provide fallback behavior when possible (see `Extractor.ts:24-28`)
|
|
102
|
+
- Log warnings for non-critical failures (see `BrowserManager.ts:115`)
|
|
103
|
+
- Clean up resources in finally blocks (see `index.ts:58`)
|
|
104
|
+
- Ignore errors on cleanup operations (see `BrowserManager.ts:214`)
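
A minimal sketch of how these rules might combine in one helper. `runWithCleanup` and `DisposableResource` are hypothetical names, not code from the referenced files; the sketch only assumes that a resource exposes an async `dispose()` method.

```typescript
// Hypothetical shape of a resource that needs cleanup.
interface DisposableResource {
  dispose(): Promise<void>;
}

/**
 * Runs the primary step, falls back on failure, and always disposes
 * of the resource afterwards.
 */
async function runWithCleanup<T>(
  resource: DisposableResource,
  primary: () => Promise<T>,
  fallback: () => T,
): Promise<T> {
  try {
    return await primary();
  } catch (e) {
    // Non-critical failure: log a warning and use the fallback value.
    const message = e instanceof Error ? e.message : String(e);
    console.warn(`Primary step failed, using fallback: ${message}`);
    return fallback();
  } finally {
    // Errors raised during cleanup are deliberately ignored.
    await resource.dispose().catch(() => undefined);
  }
}
```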
|
|
105
|
+
|
|
106
|
+
### Async/Await Patterns
|
|
107
|
+
- Always use async/await for asynchronous operations
|
|
108
|
+
- Use `.catch()` for non-critical operations that shouldn't block execution
|
|
109
|
+
```typescript
|
|
110
|
+
await this.page.goto(url, { waitUntil: 'domcontentloaded', timeout }).catch((e) => {
|
|
111
|
+
console.warn(`Navigation might have timed out: ${e.message}`);
|
|
112
|
+
});
|
|
113
|
+
```
|
|
114
|
+
- Chain cleanup operations: `manager.dispose().catch(() => undefined)`
|
|
115
|
+
|
|
116
|
+
### Type Safety
|
|
117
|
+
- Use TypeScript's strict mode
|
|
118
|
+
- Prefer explicit types for function parameters and return values
|
|
119
|
+
- Use type imports for external types
|
|
120
|
+
- Use `any` sparingly and only when necessary (e.g., plugin client object)
|
|
121
|
+
- Null checks before using potentially null values (see `BrowserManager.ts:151`)
|
|
122
|
+
|
|
123
|
+
### Comments and Documentation
|
|
124
|
+
- Use JSDoc comments for public methods
|
|
125
|
+
- Include `@param` and `@returns` tags
|
|
126
|
+
- Add inline comments for complex logic or heuristics
|
|
127
|
+
- Example:
|
|
128
|
+
```typescript
|
|
129
|
+
/**
|
|
130
|
+
* Extracts the main content of a Playwright page and converts it to Markdown.
|
|
131
|
+
* @param page The Playwright Page object.
|
|
132
|
+
* @param url The current URL of the page.
|
|
133
|
+
* @returns The main content formatted as Markdown.
|
|
134
|
+
*/
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
### Constants and Configuration
|
|
138
|
+
- Define timeout values as constants at the top of files
|
|
139
|
+
- Use milliseconds for internal timeout handling
|
|
140
|
+
- Convert user-facing timeouts from seconds to milliseconds
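
The timeout conventions above could look roughly like this. The constant values and the `resolveTimeoutMs` helper are assumptions made for illustration; the plugin's actual defaults and limits are not shown in this document.

```typescript
// Hypothetical values; the plugin's real defaults may differ.
const DEFAULT_TIMEOUT = 30_000; // milliseconds
const MAX_TIMEOUT = 120_000; // milliseconds

/**
 * Converts a user-facing timeout in seconds to milliseconds,
 * using the default for missing or invalid input and enforcing the maximum.
 */
function resolveTimeoutMs(timeoutSeconds?: number): number {
  if (
    timeoutSeconds === undefined ||
    !Number.isFinite(timeoutSeconds) ||
    timeoutSeconds <= 0
  ) {
    return DEFAULT_TIMEOUT;
  }
  return Math.min(Math.round(timeoutSeconds * 1000), MAX_TIMEOUT);
}
```

Doing the conversion once at the tool boundary means all internal code can assume milliseconds.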
|
|
141
|
+
|
|
142
|
+
### Browser Automation Best Practices
|
|
143
|
+
- Use persistent browser contexts for session reuse
|
|
144
|
+
- Mask webdriver properties (see `BrowserManager.ts:84-93`)
|
|
145
|
+
- Add reasonable wait times for dynamic content (`waitForTimeout`)
|
|
146
|
+
- Handle navigation failures gracefully
|
|
147
|
+
- Implement blocker detection (captchas, login walls, Cloudflare)
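
As a rough illustration of blocker detection, the sketch below classifies a page from its title and visible text. The `detectBlocker` function and its marker strings are hypothetical heuristics, not the logic in `BrowserManager.ts`, which is not reproduced here.

```typescript
type Blocker = 'captcha' | 'login' | 'cloudflare' | null;

/**
 * Classifies a page as blocked based on simple text markers.
 * The marker strings are illustrative assumptions only.
 */
function detectBlocker(title: string, bodyText: string): Blocker {
  const text = `${title}\n${bodyText}`.toLowerCase();
  if (text.includes('just a moment') || text.includes('checking your browser')) {
    return 'cloudflare';
  }
  if (text.includes('captcha') || text.includes('verify you are human')) {
    return 'captcha';
  }
  if (text.includes('sign in to continue') || text.includes('log in to continue')) {
    return 'login';
  }
  return null;
}
```

Keeping the heuristic a pure function of strings makes it testable without launching a browser.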
|
|
148
|
+
|
|
149
|
+
### Plugin Development
|
|
150
|
+
- Export plugin as default export
|
|
151
|
+
- Use `tool()` helper from `@opencode-ai/plugin`
|
|
152
|
+
- Provide clear tool descriptions and argument schemas
|
|
153
|
+
- Use `ctx.metadata()` to provide rich metadata to the LLM
|
|
154
|
+
- Handle abort signals properly (see `index.ts:32-35`)
|
|
155
|
+
|
|
156
|
+
### File Operations
|
|
157
|
+
- Use Node.js built-in modules (`os`, `path`, `fs`)
|
|
158
|
+
- Create directories recursively: `fs.mkdirSync(dir, { recursive: true })`
|
|
159
|
+
- Check file existence before operations: `fs.existsSync()`
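
A minimal sketch of these file operations, using only Node.js built-ins. The `ensureDir` helper and the `webfetch-example` directory name are hypothetical; the plugin's real storage location is not stated in this document.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/**
 * Creates the directory (and any missing parents) if it does not exist,
 * then returns its path.
 */
function ensureDir(dir: string): string {
  if (!fs.existsSync(dir)) {
    fs.mkdirSync(dir, { recursive: true });
  }
  return dir;
}

// Hypothetical location under the OS temp directory, for illustration only.
const profileDir = ensureDir(
  path.join(os.tmpdir(), 'webfetch-example', 'profile'),
);
```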
|
|
160
|
+
|
|
161
|
+
### Formatting Preferences
|
|
162
|
+
- Single quotes for strings (except when double quotes avoid escaping)
|
|
163
|
+
- Semicolons at end of statements
|
|
164
|
+
- 2-space indentation
|
|
165
|
+
- Trailing commas in multi-line objects/arrays
|
|
166
|
+
|
|
167
|
+
## Testing Guidelines
|
|
168
|
+
|
|
169
|
+
Currently no tests are defined. When adding tests:
|
|
170
|
+
- Use a test runner compatible with ESM (Vitest recommended)
|
|
171
|
+
- Mock Playwright browser interactions
|
|
172
|
+
- Test blocker detection logic
|
|
173
|
+
- Test markdown extraction with sample HTML
|
|
174
|
+
|
|
175
|
+
## Common Pitfalls
|
|
176
|
+
|
|
177
|
+
1. **Import extensions:** Always use `.js` for local imports in ESM TypeScript
|
|
178
|
+
2. **Playwright peer dependency:** Ensure Playwright is installed in the host project
|
|
179
|
+
3. **Absolute paths:** Use `file:///` URLs for local plugin paths in opencode.json
|
|
180
|
+
4. **Browser cleanup:** Always dispose of browser contexts to avoid resource leaks
|
|
181
|
+
5. **Timeout handling:** Convert seconds to milliseconds and enforce max limits
|
|
182
|
+
|
|
183
|
+
## Publishing
|
|
184
|
+
|
|
185
|
+
Before publishing to npm:
|
|
186
|
+
1. Update version in `package.json`
|
|
187
|
+
2. Run `bun run build` to compile
|
|
188
|
+
3. Test the plugin in a real Opencode environment
|
|
189
|
+
4. Ensure peer dependencies are documented in README
|