opencode-webfetch-plugin 0.1.1 → 0.1.2

package/AGENTS.md ADDED
@@ -0,0 +1,189 @@
1
+ # Agent Guidelines for Opencode Webfetch Plugin
2
+
3
+ This document provides coding agents with essential information about the project structure, build commands, and code style conventions.
4
+
5
+ ## Project Overview
6
+
7
+ This is an Opencode plugin that provides a `webfetch` tool for fetching webpage content in markdown format. It uses Playwright for browser automation, handles captchas and login screens through human-in-the-loop interaction, and converts HTML to markdown using Readability and Turndown.
8
+
9
+ **Key Technologies:**
10
+ - TypeScript (ES2022, ESM modules)
11
+ - Playwright (peer dependency)
12
+ - @opencode-ai/plugin SDK
13
+ - JSDOM + Readability for content extraction
14
+ - Turndown for HTML-to-Markdown conversion
15
+
16
+ ## Build Commands
17
+
18
+ ### Installation
19
+ ```bash
20
+ bun install
21
+ # or
22
+ npm install
23
+ ```
24
+
25
+ ### Build
26
+ ```bash
27
+ bun run build
28
+ # or
29
+ npm run build
30
+ ```
31
+
32
+ This compiles TypeScript from `src/` to `dist/` with ESM output, type declarations, and source maps.
33
+
34
+ ### Clean
35
+ ```bash
36
+ bun run clean
37
+ # or
38
+ npm run clean
39
+ ```
40
+
41
+ Removes the `dist/` folder.
42
+
43
+ ### Playwright Setup
44
+ ```bash
45
+ bun install playwright
46
+ npx playwright install chromium
47
+ ```
48
+
49
+ **Note:** There are currently no test or lint scripts defined in `package.json`. If adding tests, use a test runner compatible with ESM modules.
50
+
51
+ ## Project Structure
52
+
53
+ ```
54
+ src/
55
+ ├── index.ts # Plugin entry point, tool registration
56
+ ├── BrowserManager.ts # Browser lifecycle, navigation, blocker detection
57
+ ├── Extractor.ts # Content extraction and markdown conversion
58
+ └── HumanInteractor.ts # Human-in-the-loop interaction for captchas/logins
59
+ ```
60
+
61
+ ## Code Style Guidelines
62
+
63
+ ### Module System
64
+ - **Type:** ES2022 modules (`"type": "module"` in package.json)
65
+ - **Imports:** Always use the `.js` extension for local imports (Node.js ESM resolves the compiled output, so the extension is required at runtime)
66
+ ```typescript
67
+ import { BrowserManager } from './BrowserManager.js';
68
+ import { Extractor } from './Extractor.js';
69
+ ```
70
+ - **External imports:** No extension needed
71
+ ```typescript
72
+ import { type Plugin, tool } from '@opencode-ai/plugin';
73
+ import type { Page } from 'playwright';
74
+ ```
75
+
76
+ ### TypeScript Configuration
77
+ - **Target:** ES2022
78
+ - **Module Resolution:** Bundler
79
+ - **Strict mode:** Enabled
80
+ - **Lib:** ES2022 + DOM
81
+ - Always generate declaration files and source maps
82
+
83
+ ### Naming Conventions
84
+ - **Classes:** PascalCase (e.g., `BrowserManager`, `Extractor`, `HumanInteractor`)
85
+ - **Functions/Methods:** camelCase (e.g., `fetchWebpage`, `extractMarkdown`, `askForHumanHelp`)
86
+ - **Constants:** UPPER_SNAKE_CASE (e.g., `DEFAULT_TIMEOUT`, `MAX_TIMEOUT`)
87
+ - **Private fields:** Use `private` keyword, camelCase (e.g., `private context`, `private page`)
88
+ - **Type imports:** Use `type` keyword when importing types only
89
+ ```typescript
90
+ import type { ToolContext } from '@opencode-ai/plugin';
91
+ import type { BrowserContext, Page } from 'playwright';
92
+ ```
93
+
94
+ ### Class Structure
95
+ - Static methods for utility classes (e.g., `Extractor.extractMarkdown()`, `HumanInteractor.askForHumanHelp()`)
96
+ - Instance methods for stateful classes (e.g., `BrowserManager`)
97
+ - Constructor dependency injection pattern (see `BrowserManager` constructor)
98
+
99
+ ### Error Handling
100
+ - Use try-catch blocks for async operations
101
+ - Provide fallback behavior when possible (see `Extractor.ts:24-28`)
102
+ - Log warnings for non-critical failures (see `BrowserManager.ts:115`)
103
+ - Clean up resources in finally blocks (see `index.ts:58`)
104
+ - Ignore errors on cleanup operations (see `BrowserManager.ts:214`)
105
+
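The error-handling rules above can be combined into one small pattern. This is a minimal sketch, not code from the plugin: `Disposable`, `withFallback`, and both callbacks are hypothetical names chosen for illustration.

```typescript
// Hypothetical stand-in for a stateful resource such as a browser manager.
interface Disposable {
  dispose(): Promise<void>;
}

/**
 * Runs `primary`, falls back to `fallback` on failure, and always disposes
 * the resource. All names are illustrative, not the plugin's real API.
 */
async function withFallback<T>(
  resource: Disposable,
  primary: () => Promise<T>,
  fallback: () => T,
): Promise<T> {
  try {
    return await primary();
  } catch (e) {
    // Non-critical failure: warn and degrade instead of rethrowing.
    const message = e instanceof Error ? e.message : String(e);
    console.warn(`Primary path failed, using fallback: ${message}`);
    return fallback();
  } finally {
    // Errors raised during cleanup are deliberately ignored.
    await resource.dispose().catch(() => undefined);
  }
}
```

Because the cleanup sits in `finally`, the resource is released on both the success path and the fallback path.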
106
+ ### Async/Await Patterns
107
+ - Always use async/await for asynchronous operations
108
+ - Use `.catch()` for non-critical operations that shouldn't block execution
109
+ ```typescript
110
+ await this.page.goto(url, { waitUntil: 'domcontentloaded', timeout }).catch((e) => {
111
+ console.warn(`Navigation might have timed out: ${e.message}`);
112
+ });
113
+ ```
114
+ - Chain cleanup operations: `manager.dispose().catch(() => undefined)`
115
+
116
+ ### Type Safety
117
+ - Use TypeScript's strict mode
118
+ - Prefer explicit types for function parameters and return values
119
+ - Use type imports for external types
120
+ - Use `any` sparingly and only when necessary (e.g., plugin client object)
121
+ - Add null checks before using potentially null values (see `BrowserManager.ts:151`)
122
+
123
+ ### Comments and Documentation
124
+ - Use JSDoc comments for public methods
125
+ - Include `@param` and `@returns` tags
126
+ - Add inline comments for complex logic or heuristics
127
+ - Example:
128
+ ```typescript
129
+ /**
130
+ * Extracts the main content of a Playwright page and converts it to Markdown.
131
+ * @param page The Playwright Page object.
132
+ * @param url The current URL of the page.
133
+ * @returns The main content formatted as Markdown.
134
+ */
135
+ ```
136
+
137
+ ### Constants and Configuration
138
+ - Define timeout values as constants at the top of files
139
+ - Use milliseconds for internal timeout handling
140
+ - Convert user-facing timeouts from seconds to milliseconds
141
+
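The timeout rules above could look roughly like the sketch below. The constant names come from the naming conventions in this document, but their values and the `resolveTimeoutMs` helper are assumptions made for illustration.

```typescript
// Assumed values for illustration; the real constants may differ.
const DEFAULT_TIMEOUT = 30_000; // milliseconds
const MAX_TIMEOUT = 120_000; // milliseconds

/** Converts a user-facing timeout in seconds to a clamped millisecond value. */
function resolveTimeoutMs(timeoutSeconds?: number): number {
  if (
    timeoutSeconds === undefined ||
    !Number.isFinite(timeoutSeconds) ||
    timeoutSeconds <= 0
  ) {
    // Missing or invalid input falls back to the default.
    return DEFAULT_TIMEOUT;
  }
  // Convert to milliseconds, then enforce the upper limit.
  return Math.min(Math.round(timeoutSeconds * 1000), MAX_TIMEOUT);
}
```

Keeping the conversion in one helper means the rest of the code only ever handles milliseconds.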
142
+ ### Browser Automation Best Practices
143
+ - Use persistent browser contexts for session reuse
144
+ - Mask webdriver properties (see `BrowserManager.ts:84-93`)
145
+ - Add reasonable wait times for dynamic content (`waitForTimeout`)
146
+ - Handle navigation failures gracefully
147
+ - Implement blocker detection (captchas, login walls, Cloudflare)
148
+
149
+ ### Plugin Development
150
+ - Export plugin as default export
151
+ - Use `tool()` helper from `@opencode-ai/plugin`
152
+ - Provide clear tool descriptions and argument schemas
153
+ - Use `ctx.metadata()` to provide rich metadata to the LLM
154
+ - Handle abort signals properly (see `index.ts:32-35`)
155
+
156
+ ### File Operations
157
+ - Use Node.js built-in modules (`os`, `path`, `fs`)
158
+ - Create directories recursively: `fs.mkdirSync(dir, { recursive: true })`
159
+ - Check file existence before operations: `fs.existsSync()`
160
+
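A minimal sketch of the file-operation conventions above, using only Node.js built-ins. The `ensureDir` helper and the example directory are hypothetical; the plugin's actual data location is not stated in this document.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

/** Creates `dir` (and any missing parents) if it does not already exist. */
function ensureDir(dir: string): string {
  if (!fs.existsSync(dir)) {
    fs.mkdirSync(dir, { recursive: true });
  }
  return dir;
}

// Hypothetical location used only for this example.
const exampleDir = path.join(os.tmpdir(), 'webfetch-example', 'profile');
ensureDir(exampleDir);
```

With `recursive: true`, `fs.mkdirSync` does not throw when the directory already exists, so the `existsSync` guard mainly documents intent.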
161
+ ### Formatting Preferences
162
+ - Single quotes for strings (except when double quotes avoid escaping)
163
+ - Semicolons at end of statements
164
+ - 2-space indentation
165
+ - Trailing commas in multi-line objects/arrays
166
+
167
+ ## Testing Guidelines
168
+
169
+ Currently no tests are defined. When adding tests:
170
+ - Use a test runner compatible with ESM (Vitest recommended)
171
+ - Mock Playwright browser interactions
172
+ - Test blocker detection logic
173
+ - Test markdown extraction with sample HTML
174
+
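Blocker detection is easiest to test when the decision is a pure function of page data, so no browser is needed. The sketch below is a hypothetical heuristic, not the logic in `BrowserManager.ts`: the `Blocker` type, `detectBlocker`, and the keyword patterns are all assumptions made for illustration.

```typescript
// Hypothetical classification of what is blocking a page, or null if nothing is.
type Blocker = 'captcha' | 'login' | 'cloudflare' | null;

/** Classifies a page from its title and visible text using simple keyword checks. */
function detectBlocker(title: string, bodyText: string): Blocker {
  const text = `${title}\n${bodyText}`.toLowerCase();
  // Illustrative phrases only; real pages vary widely.
  if (text.includes('just a moment') || text.includes('checking your browser')) {
    return 'cloudflare';
  }
  if (text.includes('captcha') || text.includes('verify you are human')) {
    return 'captcha';
  }
  if (text.includes('sign in to continue') || text.includes('log in to continue')) {
    return 'login';
  }
  return null;
}
```

A function shaped like this can be exercised with plain strings in any ESM-compatible test runner, while the Playwright calls that collect the title and text are mocked separately.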
175
+ ## Common Pitfalls
176
+
177
+ 1. **Import extensions:** Always use `.js` for local imports in ESM TypeScript
178
+ 2. **Playwright peer dependency:** Ensure Playwright is installed in the host project
179
+ 3. **Absolute paths:** Use `file:///` URLs for local plugin paths in `opencode.json`
180
+ 4. **Browser cleanup:** Always dispose of browser contexts to avoid resource leaks
181
+ 5. **Timeout handling:** Convert seconds to milliseconds and enforce max limits
182
+
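For the absolute-path pitfall, Node.js can build the `file:///` URL from a filesystem path. This is a sketch under assumptions: `toFileUrl` and the example path are hypothetical, and how the resulting URL is referenced in `opencode.json` is not shown here.

```typescript
import * as path from 'node:path';
import { pathToFileURL } from 'node:url';

/** Converts a (possibly relative) filesystem path to an absolute file URL string. */
function toFileUrl(pluginPath: string): string {
  return pathToFileURL(path.resolve(pluginPath)).href;
}

// Hypothetical plugin location, resolved against the current working directory.
const pluginUrl = toFileUrl('./dist/index.js');
console.log(pluginUrl);
```

Using `pathToFileURL` rather than string concatenation handles platform-specific separators and characters that need percent-encoding.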
183
+ ## Publishing
184
+
185
+ Before publishing to npm:
186
+ 1. Update version in `package.json`
187
+ 2. Run `bun run build` to compile
188
+ 3. Test the plugin in a real Opencode environment
189
+ 4. Ensure peer dependencies are documented in README