@antodevs/groundtruth 0.1.5 → 0.2.1
- package/README.md +46 -51
- package/package.json +1 -1
- package/src/circuit-breaker.js +1 -0
- package/src/cli.js +38 -3
- package/src/config.js +67 -0
- package/src/proxy.js +8 -2
- package/src/registry.js +62 -0
- package/src/search.js +111 -62
- package/src/watcher.js +87 -18
- package/assets/banner.svg +0 -106
package/README.md
CHANGED
|
@@ -1,5 +1,3 @@
|
|
|
1
|
-

|
|
2
|
-
|
|
3
1
|
# GroundTruth
|
|
4
2
|
|
|
5
3
|
> Zero-configuration context injection layer for LLM-based coding agents.
|
|
@@ -43,6 +41,13 @@ Current-generation AI coding assistants (Claude Code, Antigravity, Cursor) suffe
|
|
|
43
41
|
|
|
44
42
|
**GroundTruth** acts as a transparent middleware layer that resolves this by dynamically injecting real-time, stack-specific documentation directly into the agent's context window prior to inference.
|
|
45
43
|
|
|
44
|
+
### The v0.2.0 Engine: Jina Reader & Source Registry
|
|
45
|
+
|
|
46
|
+
GroundTruth v0.2.0 introduces a massive upgrade to content quality:
|
|
47
|
+
- **Jina Reader API Integration**: Parses dynamic, JavaScript-rendered SPAs (like Vercel AI SDK, Next.js, and Svelte docs) into clean, LLM-optimized Markdown.
|
|
48
|
+
- **Smart Source Registry**: Automatically bypasses search engines for the top 20+ frameworks (React, Svelte, Vue, Astro, etc.) and fetches their official documentation directly.
|
|
49
|
+
- **Readability Fallback**: Ensures reliable extraction even if the primary engine fails.
|
|
50
|
+
|
|
46
51
|
---
|
|
47
52
|
|
|
48
53
|
## Architecture & Operational Mechanics
|
|
@@ -56,22 +61,18 @@ In this mode, GroundTruth provisions a local HTTP proxy that intercepts outbound
|
|
|
56
61
|
```mermaid
|
|
57
62
|
sequenceDiagram
|
|
58
63
|
participant Agent as Claude Code
|
|
59
|
-
participant Proxy as GroundTruth
|
|
60
|
-
participant
|
|
64
|
+
participant Proxy as GroundTruth
|
|
65
|
+
participant Jina as Jina Reader API
|
|
61
66
|
participant API as Anthropic API
|
|
62
67
|
|
|
63
|
-
Agent->>Proxy: Send Prompt
|
|
64
|
-
Proxy->>
|
|
65
|
-
|
|
68
|
+
Agent->>Proxy: Send Prompt
|
|
69
|
+
Proxy->>Jina: Fetch docs (Direct Registry / DDG)
|
|
70
|
+
Jina-->>Proxy: Return clean Markdown
|
|
66
71
|
Note over Proxy: Injects live context<br/>into System Prompt
|
|
67
72
|
Proxy->>API: Forward mutated request
|
|
68
|
-
API-->>Agent: Return
|
|
73
|
+
API-->>Agent: Return response
|
|
69
74
|
```
|
|
70
75
|
|
|
71
|
-
- **Query Extraction**: Parses the user prompt to identify context dependencies.
|
|
72
|
-
- **Data Hydration**: Orchestrates an automated DuckDuckGo search to fetch the most recent documentation. It relies on a deterministic `LRUCache`, TCP keep-alive Pool configurations, and a 429-aware `CircuitBreaker` pattern to safeguard network operations safely.
|
|
73
|
-
- **Payload Mutation**: Mutates the outgoing system prompt to inject the scraped live context before forwarding the request to the Anthropic completion endpoint. (It includes type-guard structures making it safe from undocumented Gemini system changes).
|
|
74
|
-
|
|
75
76
|
### 2. File Watcher Mode (Designed for `antigravity` / `gemini`)
|
|
76
77
|
|
|
77
78
|
For agents that support side-channel context ingestion via dotfiles (like Antigravity Rules), GroundTruth runs as a background daemon.
|
|
@@ -79,42 +80,33 @@ For agents that support side-channel context ingestion via dotfiles (like Antigr
|
|
|
79
80
|
```mermaid
|
|
80
81
|
flowchart TD
|
|
81
82
|
pkg([package.json]) -->|Parse Dependencies| GT{GroundTruth Watcher}
|
|
82
|
-
GT -->|
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
classDef core fill:#3B82F6,stroke:#fff,stroke-width:2px,color:#fff;
|
|
89
|
-
class GT,Agent core;
|
|
83
|
+
GT -->|Smart Routing| Map{Registry?}
|
|
84
|
+
Map -->|Yes| Jina[Jina Reader API]
|
|
85
|
+
Map -->|No| DDG[DuckDuckGo Search] --> Jina
|
|
86
|
+
Jina -->|Clean Markdown| Gen[Write to ~/.gemini/GEMINI.md]
|
|
87
|
+
Gen --> Agent(Coding Assistant)
|
|
90
88
|
```
|
|
91
89
|
|
|
92
|
-
- **Stack Introspection**: Analyzes the local `package.json` to infer the project's dependency graph.
|
|
93
|
-
- **Intelligent Chunking**: Groups the filtered dependencies in configurable size batches (default 3) and uniquely hashes them to avoid redundant context-fetching loops unless changes are detected.
|
|
94
|
-
- **Automated Polling**: Periodically fetches updated documentation for the detected stack chunks in parallel.
|
|
95
|
-
- **State Persistence**: Hashes are serialized persistently avoiding redundant DuckDuckGo scraping operations across application crashes.
|
|
96
|
-
- **Block-Based Synchronization**: Writes the parsed context discretely into hash-oriented blocks inside `~/.gemini/GEMINI.md`. Native POSIX bindings and intra-device temporary files are leveraged ensuring `Atomic Writes` without EXDEV link errors. Stale contexts are efficiently garbage-collected via regex matching over tracked batch hashes.
|
|
97
|
-
|
|
98
90
|
---
|
|
99
91
|
|
|
100
|
-
|
|
101
|
-
```bash
|
|
102
|
-
# Initialize GroundTruth in proxy mode (auto-exports ANTHROPIC_BASE_URL)
|
|
103
|
-
npx @antodevs/groundtruth --claude-code
|
|
104
|
-
|
|
105
|
-
# Execute your agent in a separate TTY
|
|
106
|
-
claude
|
|
107
|
-
```
|
|
108
|
-
> **Note:** The daemon automatically mutates your shell environment (`~/.zshrc`, `~/.bashrc`, `~/.bash_profile`, `~/.config/fish/config.fish`) to route traffic through the localhost proxy.
|
|
92
|
+
## Configuration (`.groundtruth.json`)
|
|
109
93
|
|
|
110
|
-
|
|
111
|
-
```bash
|
|
112
|
-
cd /workspace/your-project
|
|
94
|
+
You can globally or locally configure GroundTruth by creating a `.groundtruth.json` file in your directory:
|
|
113
95
|
|
|
114
|
-
|
|
115
|
-
|
|
96
|
+
```json
|
|
97
|
+
{
|
|
98
|
+
"maxTokens": 4000,
|
|
99
|
+
"quality": "high",
|
|
100
|
+
"verbose": true,
|
|
101
|
+
"sources": [
|
|
102
|
+
{ "url": "https://svelte.dev/docs/kit/introduction", "label": "SvelteKit Docs" }
|
|
103
|
+
]
|
|
104
|
+
}
|
|
116
105
|
```
|
|
117
|
-
|
|
106
|
+
|
|
107
|
+
- **`maxTokens`**: The maximum number of characters injected for a single page.
|
|
108
|
+
- **`quality`**: `low`, `medium`, or `high`. Controls how many search results to retrieve and the timeout budget.
|
|
109
|
+
- **`sources`**: Useful for custom, internal, or highly specific documentation that GroundTruth should always inject.
|
|
118
110
|
|
|
119
111
|
---
|
|
120
112
|
|
|
@@ -124,24 +116,27 @@ npx @antodevs/groundtruth --antigravity
|
|
|
124
116
|
|------|------|-------------|
|
|
125
117
|
| `--claude-code` | Proxy | Initializes HTTP interceptor for Anthropic API payloads. |
|
|
126
118
|
| `--antigravity` | Rules | Initializes background daemon for dotfile synchronization. |
|
|
127
|
-
| `--
|
|
119
|
+
| `--uninstall` | Cleanup | Removes `ANTHROPIC_BASE_URL` from all shell config files. |
|
|
128
120
|
| `--port <n>` | Proxy | Overrides default proxy listener port (Default: `8080`). |
|
|
121
|
+
| `--quality <level>`| Both | `low`, `medium`, or `high` quality preset (Default: `medium`). |
|
|
122
|
+
| `--max-tokens <n>` | Both | Modifies the character limit per injected context block (Default: `4000`). |
|
|
129
123
|
| `--interval <n>` | Rules | Overrides the polling interval for documentation refresh in minutes (Default: `5`). |
|
|
130
|
-
| `--batch-size <n>` | Rules | Changes the amount of dependencies per query chunk for block fetching
|
|
124
|
+
| `--batch-size <n>` | Rules | Changes the amount of dependencies per query chunk for block fetching. |
|
|
125
|
+
| `--verbose` | Both | Enables verbose logging output. |
|
|
131
126
|
|
|
132
127
|
---
|
|
133
128
|
|
|
134
129
|
## Benchmark & Comparison
|
|
135
130
|
|
|
136
|
-
GroundTruth is
|
|
131
|
+
GroundTruth is optimized for zero-configuration deployments and minimal token overhead compared to existing MCP solutions.
|
|
137
132
|
|
|
138
|
-
| Feature | GroundTruth |
|
|
139
|
-
|
|
140
|
-
| **
|
|
141
|
-
| **
|
|
142
|
-
| **
|
|
143
|
-
| **
|
|
144
|
-
| **
|
|
133
|
+
| Feature | GroundTruth | Jina Reader (Direct) | Crawl4AI / Playwright | Firecrawl |
|
|
134
|
+
|---------|-------------|----------------------|-----------------------|-----------|
|
|
135
|
+
| **Setup Required** | None (1 command) | Scripting needed | High (Docker/Deps) | High (API Key) |
|
|
136
|
+
| **JS Rendering** | ✅ Yes (via Jina) | ✅ Yes | ✅ Yes | ✅ Yes |
|
|
137
|
+
| **Agent Injection** | ✅ Auto (Proxy/File) | ❌ Manual integration | ❌ Manual integration | ❌ Manual integration |
|
|
138
|
+
| **Cost** | Free | Rate limits apply | Free | Paid |
|
|
139
|
+
| **Runtime Footprint** | < 1MB | N/A | ~200MB | N/A |
|
|
145
140
|
|
|
146
141
|
---
|
|
147
142
|
|
package/package.json
CHANGED
package/src/circuit-breaker.js
CHANGED
package/src/cli.js
CHANGED
|
@@ -4,6 +4,7 @@
|
|
|
4
4
|
*/
|
|
5
5
|
import { chalk } from './logger.js';
|
|
6
6
|
import { createRequire } from 'module';
|
|
7
|
+
import { loadConfig, resolveQuality } from './config.js';
|
|
7
8
|
|
|
8
9
|
const { version } = createRequire(import.meta.url)('../package.json');
|
|
9
10
|
|
|
@@ -30,6 +31,12 @@ if (!antigravityMode && !claudeCodeMode && !uninstallMode) {
|
|
|
30
31
|
console.log(` --port <n> custom port, default 8080 (claude-code only)`);
|
|
31
32
|
console.log(` --interval <n> refresh in minutes, default 5 (antigravity only)`);
|
|
32
33
|
console.log(` --batch-size <n> deps per search batch (default: 3)`);
|
|
34
|
+
console.log(` --max-tokens <n> max tokens per context block (default: 4000)`);
|
|
35
|
+
console.log(` --quality <level> low | medium | high (default: medium)`);
|
|
36
|
+
console.log(` --verbose enable detailed extraction logging`);
|
|
37
|
+
console.log();
|
|
38
|
+
console.log(` Config:`);
|
|
39
|
+
console.log(` Place a .groundtruth.json in your project root for persistent settings.`);
|
|
33
40
|
console.log();
|
|
34
41
|
console.log(` Docs:`);
|
|
35
42
|
console.log(` Claude Code → export ANTHROPIC_BASE_URL=http://localhost:8080`);
|
|
@@ -40,13 +47,13 @@ if (!antigravityMode && !claudeCodeMode && !uninstallMode) {
|
|
|
40
47
|
|
|
41
48
|
// ─── Default params override ─────────────────────────
|
|
42
49
|
|
|
43
|
-
let port = 8080;
|
|
50
|
+
let port = 8080;
|
|
44
51
|
const portArgIndex = args.indexOf('--port');
|
|
45
52
|
if (portArgIndex !== -1 && args[portArgIndex + 1]) {
|
|
46
53
|
port = parseInt(args[portArgIndex + 1], 10);
|
|
47
54
|
}
|
|
48
55
|
|
|
49
|
-
let intervalMinutes = 5;
|
|
56
|
+
let intervalMinutes = 5;
|
|
50
57
|
const intervalArgIndex = args.indexOf('--interval');
|
|
51
58
|
if (intervalArgIndex !== -1 && args[intervalArgIndex + 1]) {
|
|
52
59
|
intervalMinutes = parseInt(args[intervalArgIndex + 1], 10) || 5;
|
|
@@ -57,4 +64,32 @@ const batchSize = batchSizeIndex !== -1
|
|
|
57
64
|
? Math.max(2, Math.min(parseInt(args[batchSizeIndex + 1]) || 3, 5))
|
|
58
65
|
: 3;
|
|
59
66
|
|
|
60
|
-
|
|
67
|
+
// ─── New v1.2 flags ──────────────────────────────────
|
|
68
|
+
|
|
69
|
+
const maxTokensIndex = args.indexOf('--max-tokens');
|
|
70
|
+
const cliMaxTokens = maxTokensIndex !== -1
|
|
71
|
+
? Math.max(500, Math.min(parseInt(args[maxTokensIndex + 1]) || 4000, 8000))
|
|
72
|
+
: null;
|
|
73
|
+
|
|
74
|
+
const qualityIndex = args.indexOf('--quality');
|
|
75
|
+
const cliQuality = qualityIndex !== -1 && ['low', 'medium', 'high'].includes(args[qualityIndex + 1])
|
|
76
|
+
? args[qualityIndex + 1]
|
|
77
|
+
: null;
|
|
78
|
+
|
|
79
|
+
const cliVerbose = args.includes('--verbose');
|
|
80
|
+
|
|
81
|
+
// ─── Merge CLI + .groundtruth.json ───────────────────
|
|
82
|
+
|
|
83
|
+
const fileConfig = await loadConfig();
|
|
84
|
+
|
|
85
|
+
const maxTokens = cliMaxTokens ?? fileConfig.maxTokens;
|
|
86
|
+
const quality = cliQuality ?? fileConfig.quality;
|
|
87
|
+
const verbose = cliVerbose || fileConfig.verbose;
|
|
88
|
+
const qualitySettings = resolveQuality(quality);
|
|
89
|
+
const customSources = fileConfig.sources;
|
|
90
|
+
|
|
91
|
+
export {
|
|
92
|
+
args, usePackageJson, antigravityMode, claudeCodeMode, uninstallMode,
|
|
93
|
+
port, intervalMinutes, batchSize, version,
|
|
94
|
+
maxTokens, quality, qualitySettings, verbose, customSources
|
|
95
|
+
};
|
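The merge at the end of `cli.js` gives CLI flags priority over `.groundtruth.json`, whose values `loadConfig()` has already merged with the defaults. Below is a minimal sketch of that precedence. The helper name `mergeSettings` and its plain-object arguments are assumptions made for illustration; the package performs the same merge inline with `??` and `||`.

```javascript
// Hypothetical helper (not in the package): mirrors the inline merge in cli.js.
// `cli` holds values parsed from flags, with null meaning "flag not given";
// `file` holds the result of loadConfig(), already filled in with defaults.
function mergeSettings(cli, file) {
  return {
    // ?? only falls through on null/undefined, so an explicit flag always wins.
    maxTokens: cli.maxTokens ?? file.maxTokens,
    quality: cli.quality ?? file.quality,
    // || lets either side switch verbose on, but the CLI cannot switch it
    // off once the file sets it to true.
    verbose: cli.verbose || file.verbose,
  };
}

const merged = mergeSettings(
  { maxTokens: null, quality: 'high', verbose: false },
  { maxTokens: 4000, quality: 'medium', verbose: true },
);
console.log(merged);
```

In this run `maxTokens` comes from the file, `quality` from the flag, and `verbose` stays enabled because the file turned it on.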
package/src/config.js
ADDED
|
@@ -0,0 +1,67 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* @module config
|
|
3
|
+
* @description Loads optional configuration from .groundtruth.json in the cwd.
|
|
4
|
+
*/
|
|
5
|
+
import { readFile } from 'fs/promises';
|
|
6
|
+
import { existsSync } from 'fs';
|
|
7
|
+
import path from 'path';
|
|
8
|
+
|
|
9
|
+
// ─── Quality Presets ─────────────────────────────────
|
|
10
|
+
|
|
11
|
+
const QUALITY_PRESETS = {
|
|
12
|
+
low: { ddgResults: 1, charsPerPage: 2000, jinaTimeout: 5000 },
|
|
13
|
+
medium: { ddgResults: 3, charsPerPage: 4000, jinaTimeout: 8000 },
|
|
14
|
+
high: { ddgResults: 5, charsPerPage: 8000, jinaTimeout: 12000 },
|
|
15
|
+
};
|
|
16
|
+
|
|
17
|
+
/**
|
|
18
|
+
* @description Resolves a quality preset from its string name to operational parameters.
|
|
19
|
+
* @param {string} level - "low" | "medium" | "high"
|
|
20
|
+
* @returns {Object} { ddgResults, charsPerPage, jinaTimeout }
|
|
21
|
+
*/
|
|
22
|
+
export function resolveQuality(level) {
|
|
23
|
+
return QUALITY_PRESETS[level] || QUALITY_PRESETS.medium;
|
|
24
|
+
}
|
|
25
|
+
|
|
26
|
+
// ─── Config Defaults ─────────────────────────────────
|
|
27
|
+
|
|
28
|
+
const DEFAULTS = {
|
|
29
|
+
maxTokens: 4000,
|
|
30
|
+
quality: 'medium',
|
|
31
|
+
verbose: false,
|
|
32
|
+
sources: [],
|
|
33
|
+
};
|
|
34
|
+
|
|
35
|
+
/**
|
|
36
|
+
* @description Loads .groundtruth.json from the cwd and merges it with the defaults.
|
|
37
|
+
* @returns {Promise<Object>} Final merged configuration
|
|
38
|
+
*/
|
|
39
|
+
export async function loadConfig() {
|
|
40
|
+
const configPath = path.resolve(process.cwd(), '.groundtruth.json');
|
|
41
|
+
if (!existsSync(configPath)) return { ...DEFAULTS };
|
|
42
|
+
|
|
43
|
+
try {
|
|
44
|
+
const raw = await readFile(configPath, 'utf8');
|
|
45
|
+
const parsed = JSON.parse(raw);
|
|
46
|
+
|
|
47
|
+
return {
|
|
48
|
+
maxTokens: clamp(parsed.maxTokens ?? DEFAULTS.maxTokens, 500, 8000),
|
|
49
|
+
quality: ['low', 'medium', 'high'].includes(parsed.quality) ? parsed.quality : DEFAULTS.quality,
|
|
50
|
+
verbose: typeof parsed.verbose === 'boolean' ? parsed.verbose : DEFAULTS.verbose,
|
|
51
|
+
sources: Array.isArray(parsed.sources) ? parsed.sources.filter(s => s && s.url) : DEFAULTS.sources,
|
|
52
|
+
};
|
|
53
|
+
} catch {
|
|
54
|
+
return { ...DEFAULTS };
|
|
55
|
+
}
|
|
56
|
+
}
|
|
57
|
+
|
|
58
|
+
/**
|
|
59
|
+
* @description Numeric clamp with min/max bounds.
|
|
60
|
+
*/
|
|
61
|
+
function clamp(val, min, max) {
|
|
62
|
+
const n = parseInt(val, 10);
|
|
63
|
+
if (isNaN(n)) return min;
|
|
64
|
+
return Math.max(min, Math.min(n, max));
|
|
65
|
+
}
|
|
66
|
+
|
|
67
|
+
export { QUALITY_PRESETS };
|
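The bounds handling in `config.js` is easiest to see by running the two helpers on a few inputs. The sketch below copies `clamp`, `resolveQuality`, and the preset table from the diff above and adds nothing else; the sample inputs are illustrative.

```javascript
// Copied from the diff above so the snippet runs on its own.
const QUALITY_PRESETS = {
  low: { ddgResults: 1, charsPerPage: 2000, jinaTimeout: 5000 },
  medium: { ddgResults: 3, charsPerPage: 4000, jinaTimeout: 8000 },
  high: { ddgResults: 5, charsPerPage: 8000, jinaTimeout: 12000 },
};

// Unknown levels fall back to the medium preset.
function resolveQuality(level) {
  return QUALITY_PRESETS[level] || QUALITY_PRESETS.medium;
}

// Non-numeric input resolves to the lower bound, not to a default value.
function clamp(val, min, max) {
  const n = parseInt(val, 10);
  if (isNaN(n)) return min;
  return Math.max(min, Math.min(n, max));
}

console.log(clamp(99999, 500, 8000));            // above range: upper bound
console.log(clamp('abc', 500, 8000));            // not a number: lower bound
console.log(clamp('1200', 500, 8000));           // numeric string: parsed as-is
console.log(resolveQuality('ultra').ddgResults); // unknown level: medium preset
```

One consequence worth noting: a non-numeric `maxTokens` in `.groundtruth.json` resolves to the lower bound (500), whereas the `--max-tokens` path in `cli.js` falls back to 4000 through `|| 4000`.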
package/src/proxy.js
CHANGED
|
@@ -9,6 +9,7 @@ import { readPackageDeps, buildQuery } from './packages.js';
|
|
|
9
9
|
import { chalk, log, LOG_WARN, LOG_BOLT } from './logger.js';
|
|
10
10
|
import { httpsAgent } from './http-agent.js';
|
|
11
11
|
import { sanitizeWebContent } from './sanitize.js';
|
|
12
|
+
import { maxTokens, qualitySettings, verbose } from './cli.js';
|
|
12
13
|
|
|
13
14
|
// ─── HTTP Node server daemon ─────────────────────────
|
|
14
15
|
|
|
@@ -94,14 +95,19 @@ export async function createServer(usePackageJson) {
|
|
|
94
95
|
try {
|
|
95
96
|
if (!query || query.trim() === String(new Date().getFullYear())) throw new Error('Empty query');
|
|
96
97
|
// parallel load in proxy app process to boost response load
|
|
97
|
-
const { results, pageText } = await webSearch(query, true
|
|
98
|
+
const { results, pageText } = await webSearch(query, true, {
|
|
99
|
+
ddgResults: qualitySettings.ddgResults,
|
|
100
|
+
maxLen: qualitySettings.charsPerPage,
|
|
101
|
+
jinaTimeout: qualitySettings.jinaTimeout,
|
|
102
|
+
verbose,
|
|
103
|
+
});
|
|
98
104
|
resultsCount = results.length;
|
|
99
105
|
|
|
100
106
|
contextBlock = `\n\n--- WEB CONTEXT (live, ${new Date().toISOString()}) ---\n`;
|
|
101
107
|
results.forEach((r, i) => {
|
|
102
108
|
contextBlock += `${i + 1}. ${r.title}: ${sanitizeWebContent(r.snippet, 500)} (${r.url})\n`;
|
|
103
109
|
});
|
|
104
|
-
if (pageText) contextBlock += `\nFULL TEXT:\n${sanitizeWebContent(pageText)}\n`;
|
|
110
|
+
if (pageText) contextBlock += `\nFULL TEXT:\n${sanitizeWebContent(pageText, maxTokens)}\n`;
|
|
105
111
|
contextBlock += `--- END WEB CONTEXT ---\n`;
|
|
106
112
|
didInject = true;
|
|
107
113
|
} catch (_) {
|
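In the proxy path the quality preset bounds each fetched page (`maxLen`), while `maxTokens` is passed as the second argument when the joined `pageText` is sanitized. The sketch below estimates how much full text can reach the context block. It assumes that `sanitizeWebContent(text, n)` caps its output at `n` characters; that function's implementation is not part of this diff, so treat the result as an estimate. The helper name `estimateInjectedChars` is hypothetical.

```javascript
// Hypothetical estimate, not package code. Assumes sanitizeWebContent(text, n)
// truncates to at most n characters, which this diff does not show.
function estimateInjectedChars(qualitySettings, maxTokens) {
  const { ddgResults, charsPerPage } = qualitySettings;
  // Each page is capped at charsPerPage; pages are joined with '\n\n'.
  const separators = Math.max(0, ddgResults - 1) * 2;
  const fetchedUpperBound = ddgResults * charsPerPage + separators;
  // The joined text is then capped again by maxTokens.
  return Math.min(fetchedUpperBound, maxTokens);
}

const high = { ddgResults: 5, charsPerPage: 8000, jinaTimeout: 12000 };
console.log(estimateInjectedChars(high, 4000));
```

Under that assumption, `--quality high` with the default `maxTokens` may fetch up to five pages while only the first 4000 characters of the joined text are injected, so raising the quality preset without raising `--max-tokens` could add latency without adding context.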
package/src/registry.js
ADDED
|
@@ -0,0 +1,62 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* @module registry
|
|
3
|
+
* @description Hardcoded map from dependency to official docs URL, used to bypass DDG for well-known frameworks.
|
|
4
|
+
*/
|
|
5
|
+
|
|
6
|
+
// ─── Docs URL Registry ──────────────────────────────
|
|
7
|
+
|
|
8
|
+
const DOCS_REGISTRY = {
|
|
9
|
+
'svelte': 'https://svelte.dev/docs/svelte/overview',
|
|
10
|
+
'sveltekit': 'https://svelte.dev/docs/kit/introduction',
|
|
11
|
+
'react': 'https://react.dev/reference/react',
|
|
12
|
+
'react-dom': 'https://react.dev/reference/react-dom',
|
|
13
|
+
'next': 'https://nextjs.org/docs',
|
|
14
|
+
'nextjs': 'https://nextjs.org/docs',
|
|
15
|
+
'vue': 'https://vuejs.org/api/',
|
|
16
|
+
'nuxt': 'https://nuxt.com/docs/api',
|
|
17
|
+
'angular': 'https://angular.dev/overview',
|
|
18
|
+
'astro': 'https://docs.astro.build/en/reference/configuration-reference/',
|
|
19
|
+
'tailwindcss': 'https://tailwindcss.com/docs',
|
|
20
|
+
'typescript': 'https://www.typescriptlang.org/docs/',
|
|
21
|
+
'express': 'https://expressjs.com/en/5x/api.html',
|
|
22
|
+
'fastify': 'https://fastify.dev/docs/latest/',
|
|
23
|
+
'hono': 'https://hono.dev/docs/',
|
|
24
|
+
'solid-js': 'https://docs.solidjs.com/',
|
|
25
|
+
'qwik': 'https://qwik.dev/docs/',
|
|
26
|
+
'remix': 'https://remix.run/docs/en/main',
|
|
27
|
+
'prisma': 'https://www.prisma.io/docs',
|
|
28
|
+
'drizzle-orm': 'https://orm.drizzle.team/docs/overview',
|
|
29
|
+
'three': 'https://threejs.org/docs/',
|
|
30
|
+
'zod': 'https://zod.dev/',
|
|
31
|
+
'trpc': 'https://trpc.io/docs',
|
|
32
|
+
'tanstack-query': 'https://tanstack.com/query/latest/docs/overview',
|
|
33
|
+
};
|
|
34
|
+
|
|
35
|
+
/**
|
|
36
|
+
* @description Normalizes the dependency name and looks up its docs URL in the registry.
|
|
37
|
+
* @param {string} depName - Dependency name from package.json (e.g. "svelte 5.51" or "@sveltejs/kit")
|
|
38
|
+
* @returns {string|null} Direct docs URL, or null if not found
|
|
39
|
+
*/
|
|
40
|
+
export function lookupRegistryUrl(depName) {
|
|
41
|
+
// Take only the name, without the version ("svelte 5.51" → "svelte")
|
|
42
|
+
const name = depName.split(' ')[0].toLowerCase();
|
|
43
|
+
|
|
44
|
+
// Direct match
|
|
45
|
+
if (DOCS_REGISTRY[name]) return DOCS_REGISTRY[name];
|
|
46
|
+
|
|
47
|
+
// Strip @scope/ prefix ("@sveltejs/kit" → "kit", but we use special mappings)
|
|
48
|
+
if (name === '@sveltejs/kit') return DOCS_REGISTRY['sveltekit'];
|
|
49
|
+
if (name === 'next' || name === '@next/core') return DOCS_REGISTRY['next'];
|
|
50
|
+
|
|
51
|
+
// Generic scope strip
|
|
52
|
+
const stripped = name.startsWith('@') ? name.split('/')[1] : name;
|
|
53
|
+
if (DOCS_REGISTRY[stripped]) return DOCS_REGISTRY[stripped];
|
|
54
|
+
|
|
55
|
+
// Strip -js suffix ("solid-js" → "solid")
|
|
56
|
+
const noJs = stripped.replace(/-js$/, '');
|
|
57
|
+
if (noJs !== stripped && DOCS_REGISTRY[noJs]) return DOCS_REGISTRY[noJs];
|
|
58
|
+
|
|
59
|
+
return null;
|
|
60
|
+
}
|
|
61
|
+
|
|
62
|
+
export { DOCS_REGISTRY };
|
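To see which dependency names the lookup actually resolves, the steps of `lookupRegistryUrl` can be run against a trimmed table. The sketch below keeps four registry entries from the diff and reproduces the lookup order (direct match, `@sveltejs/kit` special case, scope strip, `-js` suffix strip). The `next` special case is left out for brevity, and the sample inputs are illustrative.

```javascript
// Trimmed copy of the registry from the diff; only a few entries are kept.
const DOCS_REGISTRY = {
  'svelte': 'https://svelte.dev/docs/svelte/overview',
  'sveltekit': 'https://svelte.dev/docs/kit/introduction',
  'vue': 'https://vuejs.org/api/',
  'angular': 'https://angular.dev/overview',
};

// Same lookup steps as lookupRegistryUrl in the diff (minus the `next` case).
function lookupRegistryUrl(depName) {
  // Drop the version and normalize case: "Vue 3.4" → "vue".
  const name = depName.split(' ')[0].toLowerCase();
  if (DOCS_REGISTRY[name]) return DOCS_REGISTRY[name];

  // Special mapping for a scoped package whose unscoped part is not a key.
  if (name === '@sveltejs/kit') return DOCS_REGISTRY['sveltekit'];

  // Generic scope strip: "@angular/core" → "core".
  const stripped = name.startsWith('@') ? name.split('/')[1] : name;
  if (DOCS_REGISTRY[stripped]) return DOCS_REGISTRY[stripped];

  // Suffix strip: "vue-js" → "vue".
  const noJs = stripped.replace(/-js$/, '');
  if (noJs !== stripped && DOCS_REGISTRY[noJs]) return DOCS_REGISTRY[noJs];

  return null;
}

for (const dep of ['svelte 5.51', '@sveltejs/kit 2.50', 'vue-js', '@angular/core 17.0']) {
  console.log(dep, '->', lookupRegistryUrl(dep));
}
```

The last input shows a limitation of the generic scope strip: `@angular/core` reduces to `core`, which is not a registry key, so the lookup returns `null` even though `angular` is listed. Such dependencies would presumably take the DuckDuckGo path instead.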
package/src/search.js
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
/**
|
|
2
2
|
* @module search
|
|
3
|
-
* @description Web scraping logic
|
|
3
|
+
* @description Web scraping logic: Jina Reader → Readability fallback, registry bypass, DDG search.
|
|
4
4
|
*/
|
|
5
5
|
import fetch from 'node-fetch';
|
|
6
6
|
import * as cheerio from 'cheerio';
|
|
@@ -10,26 +10,113 @@ import { searchCache } from './cache.js';
|
|
|
10
10
|
import { CircuitBreaker } from './circuit-breaker.js';
|
|
11
11
|
import { httpAgent, httpsAgent } from './http-agent.js';
|
|
12
12
|
import { sanitizeWebContent } from './sanitize.js';
|
|
13
|
+
import { lookupRegistryUrl } from './registry.js';
|
|
13
14
|
|
|
14
15
|
// ─── Config & Cache ──────────────────────────────────
|
|
15
16
|
|
|
16
|
-
// Avoid IP bans by rotating common desktop Chrome UAs
|
|
17
17
|
const USER_AGENTS = [
|
|
18
18
|
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36',
|
|
19
19
|
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36',
|
|
20
20
|
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
|
|
21
21
|
];
|
|
22
22
|
|
|
23
|
-
/**
|
|
24
|
-
* @description Selects a random User-Agent from the available array
|
|
25
|
-
* @returns {string} A User-Agent string
|
|
26
|
-
*/
|
|
27
23
|
function getRandomUA() {
|
|
28
24
|
return USER_AGENTS[Math.floor(Math.random() * USER_AGENTS.length)];
|
|
29
25
|
}
|
|
30
26
|
|
|
31
27
|
const ddgCircuit = new CircuitBreaker({ failureThreshold: 3, resetTimeout: 30000 });
|
|
32
28
|
|
|
29
|
+
// ─── Jina Reader + Readability Fallback ──────────────
|
|
30
|
+
|
|
31
|
+
/**
|
|
32
|
+
* @description Fetches page content: Jina Reader first (JS rendering + markdown), then Readability as a fallback.
|
|
33
|
+
* @param {string} url - URL of the page
|
|
34
|
+
* @param {string} userAgent - UA for the fallback fetch
|
|
35
|
+
* @param {Object} opts - { jinaTimeout, maxLen, verbose }
|
|
36
|
+
* @returns {Promise<string>} Extracted markdown/text content
|
|
37
|
+
*/
|
|
38
|
+
export async function fetchPageContent(url, userAgent, opts = {}) {
|
|
39
|
+
const { jinaTimeout = 8000, maxLen = 4000, verbose = false } = opts;
|
|
40
|
+
|
|
41
|
+
// ── Try Jina Reader API first ──
|
|
42
|
+
try {
|
|
43
|
+
const jinaRes = await fetch(`https://r.jina.ai/${url}`, {
|
|
44
|
+
signal: AbortSignal.timeout(jinaTimeout),
|
|
45
|
+
headers: { 'Accept': 'text/markdown', 'X-No-Cache': 'true' }
|
|
46
|
+
});
|
|
47
|
+
if (jinaRes.ok) {
|
|
48
|
+
const text = await jinaRes.text();
|
|
49
|
+
if (text && text.length > 200) {
|
|
50
|
+
if (verbose) console.log(` [jina] ✓ ${url} → ${text.length} chars`);
|
|
51
|
+
return sanitizeWebContent(text.replace(/\s+/g, ' '), maxLen);
|
|
52
|
+
}
|
|
53
|
+
}
|
|
54
|
+
} catch (_) {
|
|
55
|
+
if (verbose) console.log(` [jina] ✗ ${url} → fallback readability`);
|
|
56
|
+
}
|
|
57
|
+
|
|
58
|
+
// ── Fallback: fetch + Readability ──
|
|
59
|
+
try {
|
|
60
|
+
const pageRes = await fetch(url, {
|
|
61
|
+
signal: AbortSignal.timeout(5000),
|
|
62
|
+
headers: { 'User-Agent': userAgent },
|
|
63
|
+
agent: url.startsWith('https:') ? httpsAgent : httpAgent
|
|
64
|
+
});
|
|
65
|
+
if (pageRes.ok) {
|
|
66
|
+
const document = new DOMParser().parseFromString(await pageRes.text(), 'text/html');
|
|
67
|
+
let text = '';
|
|
68
|
+
try {
|
|
69
|
+
const article = new Readability(document).parse();
|
|
70
|
+
text = article?.textContent || '';
|
|
71
|
+
} catch (_) {
|
|
72
|
+
text = document.body?.textContent || '';
|
|
73
|
+
}
|
|
74
|
+
if (text) {
|
|
75
|
+
if (verbose) console.log(` [readability] ✓ ${url} → ${text.length} chars`);
|
|
76
|
+
return sanitizeWebContent(text.replace(/\s+/g, ' '), maxLen);
|
|
77
|
+
}
|
|
78
|
+
}
|
|
79
|
+
} catch (_) { }
|
|
80
|
+
|
|
81
|
+
return '';
|
|
82
|
+
}
|
|
83
|
+
|
|
84
|
+
// ─── Registry Direct Fetch ───────────────────────────
|
|
85
|
+
|
|
86
|
+
/**
|
|
87
|
+
* @description Direct fetch from the official docs for dependencies found in the registry.
|
|
88
|
+
* @param {Array} deps - Array of dependencies ("svelte 5.51", "sveltekit 2.50")
|
|
89
|
+
* @param {Object} opts - { jinaTimeout, maxLen, verbose }
|
|
90
|
+
* @returns {Promise<Object>} { registryText, coveredDeps }
|
|
91
|
+
*/
|
|
92
|
+
export async function registryFetch(deps, opts = {}) {
|
|
93
|
+
const { verbose = false } = opts;
|
|
94
|
+
const userAgent = getRandomUA();
|
|
95
|
+
let registryText = '';
|
|
96
|
+
const coveredDeps = new Set();
|
|
97
|
+
|
|
98
|
+
for (const dep of deps) {
|
|
99
|
+
const docUrl = lookupRegistryUrl(dep);
|
|
100
|
+
if (!docUrl) continue;
|
|
101
|
+
|
|
102
|
+
const depName = dep.split(' ')[0];
|
|
103
|
+
try {
|
|
104
|
+
const text = await fetchPageContent(docUrl, userAgent, opts);
|
|
105
|
+
if (text && text.length > 100) {
|
|
106
|
+
registryText += `\n### ${depName} (official docs)\n${text}\n`;
|
|
107
|
+
coveredDeps.add(dep);
|
|
108
|
+
if (verbose) console.log(` [registry] ✓ ${depName} → ${docUrl}`);
|
|
109
|
+
}
|
|
110
|
+
} catch (_) {
|
|
111
|
+
if (verbose) console.log(` [registry] ✗ ${depName} → fetch failed`);
|
|
112
|
+
}
|
|
113
|
+
}
|
|
114
|
+
|
|
115
|
+
return { registryText, coveredDeps };
|
|
116
|
+
}
|
|
117
|
+
|
|
118
|
+
// ─── DDG Search ──────────────────────────────────────
|
|
119
|
+
|
|
33
120
|
/**
|
|
34
121
|
* @description Decodes DuckDuckGo's masked links by extracting the `uddg` query-string parameter.
|
|
35
122
|
* @param {string} href - Wrapped URL coming from the DDG node
|
|
@@ -47,13 +134,12 @@ export function resolveDDGUrl(href) {
|
|
|
47
134
|
|
|
48
135
|
/**
|
|
49
136
|
* @description Performs the actual HTTP request to DDG.
|
|
50
|
-
* @param {string} query
|
|
137
|
+
* @param {string} query - Formatted DDG search query
|
|
138
|
+
* @param {number} resultsLimit - Maximum number of results to return
|
|
51
139
|
* @returns {Promise<Object>} { results, userAgent }
|
|
52
|
-
* @throws {Error} On DDG HTTP request failure
|
|
53
140
|
*/
|
|
54
|
-
async function doSearch(query) {
|
|
141
|
+
async function doSearch(query, resultsLimit = 3) {
|
|
55
142
|
const userAgent = getRandomUA();
|
|
56
|
-
// Fetch DDG raw HTML search endpoint ignoring CSS/JS payloads
|
|
57
143
|
const searchRes = await fetch(
|
|
58
144
|
`https://html.duckduckgo.com/html/?q=${encodeURIComponent(query)}`,
|
|
59
145
|
{ signal: AbortSignal.timeout(5000), headers: { 'User-Agent': userAgent }, agent: httpsAgent }
|
|
@@ -71,21 +157,24 @@ async function doSearch(query) {
|
|
|
71
157
|
});
|
|
72
158
|
|
|
73
159
|
const seen = new Set();
|
|
74
|
-
results = results.filter(r => r.url && !seen.has(r.url) && seen.add(r.url)).slice(0,
|
|
160
|
+
results = results.filter(r => r.url && !seen.has(r.url) && seen.add(r.url)).slice(0, resultsLimit);
|
|
75
161
|
|
|
76
162
|
if (results.length === 0) throw new Error('No DDG results');
|
|
77
163
|
return { results, userAgent };
|
|
78
164
|
}
|
|
79
165
|
|
|
166
|
+
// ─── Main Web Search ─────────────────────────────────
|
|
167
|
+
|
|
80
168
|
/**
|
|
81
169
|
* @description Entry point: web orchestrator with caching + retry.
|
|
82
|
-
* @param {string} query
|
|
83
|
-
* @param {boolean} parallel
|
|
170
|
+
* @param {string} query - User search input, convertible to a web query
|
|
171
|
+
* @param {boolean} parallel - Use Promise.all for fast scraping of multiple pages
|
|
172
|
+
* @param {Object} opts - { ddgResults, maxLen, jinaTimeout, verbose }
|
|
84
173
|
* @returns {Promise<Object>} Results object + pageText as a formatted string
|
|
85
174
|
*/
|
|
86
|
-
export async function webSearch(query, parallel = false) {
|
|
87
|
-
const
|
|
88
|
-
|
|
175
|
+
export async function webSearch(query, parallel = false, opts = {}) {
|
|
176
|
+
const { ddgResults = 3, maxLen = 4000, jinaTimeout = 8000, verbose = false } = opts;
|
|
177
|
+
|
|
89
178
|
const cached = searchCache.get(query);
|
|
90
179
|
if (cached) {
|
|
91
180
|
return { results: cached.results, pageText: cached.pageText };
|
|
@@ -93,62 +182,22 @@ export async function webSearch(query, parallel = false) {
|
|
|
93
182
|
|
|
94
183
|
let results, userAgent;
|
|
95
184
|
try {
|
|
96
|
-
const res = await ddgCircuit.execute(() => doSearch(query));
|
|
185
|
+
const res = await ddgCircuit.execute(() => doSearch(query, ddgResults));
|
|
97
186
|
results = res.results;
|
|
98
187
|
userAgent = res.userAgent;
|
|
99
188
|
} catch (err) {
|
|
100
189
|
throw err;
|
|
101
190
|
}
|
|
102
191
|
|
|
192
|
+
const fetchOpts = { jinaTimeout, maxLen, verbose };
|
|
103
193
|
let pageText = '';
|
|
104
|
-
|
|
194
|
+
|
|
105
195
|
if (parallel) {
|
|
106
|
-
const pages = await Promise.all(results.map(
|
|
107
|
-
try {
|
|
108
|
-
const pageRes = await fetch(r.url, {
|
|
109
|
-
signal: AbortSignal.timeout(5000),
|
|
110
|
-
headers: { 'User-Agent': userAgent },
|
|
111
|
-
agent: r.url.startsWith('https:') ? httpsAgent : httpAgent
|
|
112
|
-
});
|
|
113
|
-
if (pageRes.ok) {
|
|
114
|
-
const document = new DOMParser().parseFromString(await pageRes.text(), 'text/html');
|
|
115
|
-
let text = '';
|
|
116
|
-
try {
|
|
117
|
-
const article = new Readability(document).parse();
|
|
118
|
-
text = article?.textContent || '';
|
|
119
|
-
} catch (_) {
|
|
120
|
-
text = document.body?.textContent || '';
|
|
121
|
-
}
|
|
122
|
-
if (text) return sanitizeWebContent(text.replace(/\s+/g, ' '), 4000);
|
|
123
|
-
}
|
|
124
|
-
} catch (_) { // silent failure tolerated in parallel mode for third-party link timeouts
|
|
125
|
-
}
|
|
126
|
-
return '';
|
|
127
|
-
}));
|
|
196
|
+
const pages = await Promise.all(results.map(r => fetchPageContent(r.url, userAgent, fetchOpts)));
|
|
128
197
|
pageText = pages.filter(Boolean).join('\n\n');
|
|
129
198
|
} else {
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
const pageRes = await fetch(results[0].url, {
|
|
133
|
-
signal: AbortSignal.timeout(5000), // node-fetch hang timeout catch
|
|
134
|
-
headers: { 'User-Agent': userAgent },
|
|
135
|
-
agent: results[0].url.startsWith('https:') ? httpsAgent : httpAgent
|
|
136
|
-
});
|
|
137
|
-
if (pageRes.ok) {
|
|
138
|
-
const document = new DOMParser().parseFromString(await pageRes.text(), 'text/html');
|
|
139
|
-
let text = '';
|
|
140
|
-
try {
|
|
141
|
-
const article = new Readability(document).parse();
|
|
142
|
-
text = article?.textContent || '';
|
|
143
|
-
} catch (_) {
|
|
144
|
-
text = document.body?.textContent || '';
|
|
145
|
-
}
|
|
146
|
-
if (text) {
|
|
147
|
-
pageText = sanitizeWebContent(text.replace(/\s+/g, ' '), 4000);
|
|
148
|
-
}
|
|
149
|
-
}
|
|
150
|
-
}
|
|
151
|
-
} catch (_) { // swallow target-URL error: fall back to empty context
|
|
199
|
+
if (results[0]) {
|
|
200
|
+
pageText = await fetchPageContent(results[0].url, userAgent, fetchOpts);
|
|
152
201
|
}
|
|
153
202
|
}
|
|
154
203
|
|
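The one-line filter in `doSearch` does two jobs at once: it removes duplicate URLs and applies the new `resultsLimit`. The sketch below isolates that expression so its behaviour can be checked on sample data. The wrapper name `dedupeByUrl` and the sample results are hypothetical; the filter itself is taken from the diff.

```javascript
// Illustrative stand-in for the filter in doSearch; `dedupeByUrl` is a
// hypothetical name, the filter expression itself comes from the diff.
function dedupeByUrl(results, resultsLimit = 3) {
  const seen = new Set();
  return results
    // Set.prototype.add returns the Set itself (truthy), so the chain both
    // records the URL and keeps the first result that carries it.
    .filter(r => r.url && !seen.has(r.url) && seen.add(r.url))
    // The limit is applied after de-duplication, as in the diff.
    .slice(0, resultsLimit);
}

const sample = [
  { title: 'A', url: 'https://example.com/a' },
  { title: 'B', url: 'https://example.com/b' },
  { title: 'A again', url: 'https://example.com/a' },
  { title: 'No URL', url: '' },
  { title: 'C', url: 'https://example.com/c' },
];
console.log(dedupeByUrl(sample, 2).map(r => r.title));
```

Because the slice runs after the filter, duplicates and empty URLs never count against the limit passed in from the quality preset.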
package/src/watcher.js
CHANGED
|
@@ -1,15 +1,15 @@
|
|
|
1
1
|
/**
|
|
2
2
|
* @module watcher
|
|
3
|
-
* @description Antigravity polling timer: updates the local skill and injects doc rules,
|
|
3
|
+
* @description Antigravity polling timer: updates the local skill and injects doc rules, with registry bypass and quality settings.
|
|
4
4
|
*/
|
|
5
5
|
import os from 'os';
|
|
6
6
|
import path from 'path';
|
|
7
|
-
import { webSearch } from './search.js';
|
|
7
|
+
import { webSearch, registryFetch, fetchPageContent } from './search.js';
|
|
8
8
|
import { readPackageDeps, buildQuery, groupIntoBatches, batchHash } from './packages.js';
|
|
9
9
|
import { sanitizeWebContent } from './sanitize.js';
|
|
10
10
|
import { updateGeminiFiles, removeStaleBlocks } from './inject.js';
|
|
11
11
|
import { chalk, label, log, LOG_WARN, LOG_REFRESH } from './logger.js';
|
|
12
|
-
import { version } from './cli.js';
|
|
12
|
+
import { version, maxTokens, quality, qualitySettings, verbose, customSources } from './cli.js';
|
|
13
13
|
import { loadBatchState, saveBatchState } from './state.js';
|
|
14
14
|
import { httpsAgent } from './http-agent.js';
|
|
15
15
|
|
|
@@ -30,20 +30,33 @@ export function startWatcher({ intervalMinutes, usePackageJson, batchSize }) {
|
|
|
30
30
|
console.log(label('◆', 'workspace', skillFilePretty));
|
|
31
31
|
console.log(label('◆', 'interval', `every ${intervalMinutes} min`));
|
|
32
32
|
console.log(label('◆', 'batch_size', `chunk limit ${batchSize}`));
|
|
33
|
-
console.log(label('◆', '
|
|
33
|
+
console.log(label('◆', 'engine', 'Jina Reader → Readability fallback'));
|
|
34
|
+
console.log(label('◆', 'quality', `${quality} (${qualitySettings.ddgResults} results, ${qualitySettings.charsPerPage} chars)`));
|
|
35
|
+
console.log(label('◆', 'max_tokens', `${maxTokens}`));
|
|
36
|
+
if (customSources.length > 0) {
|
|
37
|
+
console.log(label('◆', 'sources', `${customSources.length} custom URL(s)`));
|
|
38
|
+
}
|
|
39
|
+
if (verbose) console.log(label('◆', 'verbose', 'enabled'));
|
|
34
40
|
console.log();
|
|
35
41
|
console.log(` ${chalk.cyan('✻')} Running. Antigravity will load context automatically.`);
|
|
36
42
|
console.log();
|
|
37
43
|
|
|
38
44
|
let previousBatchHashes = new Map();
|
|
39
45
|
|
|
46
|
+
const searchOpts = {
|
|
47
|
+
ddgResults: qualitySettings.ddgResults,
|
|
48
|
+
maxLen: qualitySettings.charsPerPage,
|
|
49
|
+
jinaTimeout: qualitySettings.jinaTimeout,
|
|
50
|
+
verbose,
|
|
51
|
+
};
|
|
52
|
+
|
|
40
53
|
async function updateSkill() {
|
|
41
54
|
if (previousBatchHashes.size === 0) {
|
|
42
55
|
previousBatchHashes = await loadBatchState();
|
|
43
56
|
}
|
|
44
|
-
const deps = await readPackageDeps();
|
|
57
|
+
const deps = await readPackageDeps();
|
|
45
58
|
if (!deps || deps.length === 0) {
|
|
46
|
-
return;
|
|
59
|
+
return;
|
|
47
60
|
}
|
|
48
61
|
|
|
49
62
|
const batches = groupIntoBatches(deps, batchSize);
|
|
@@ -66,11 +79,30 @@ export function startWatcher({ intervalMinutes, usePackageJson, batchSize }) {
|
|
|
66
79
|
return;
|
|
67
80
|
}
|
|
68
81
|
|
|
69
|
-
const query = buildQuery(batch);
|
|
70
82
|
try {
|
|
71
|
-
|
|
83
|
+
// ── Registry fetch for known dependencies ──
|
|
84
|
+
const { registryText, coveredDeps } = await registryFetch(batch, searchOpts);
|
|
85
|
+
|
|
86
|
+
// ── DDG search for dependencies not covered by the registry ──
|
|
87
|
+
const uncoveredBatch = batch.filter(d => !coveredDeps.has(d));
|
|
88
|
+
let ddgText = '';
|
|
89
|
+
let results = [];
|
|
90
|
+
|
|
91
|
+
if (uncoveredBatch.length > 0) {
|
|
92
|
+
const query = buildQuery(uncoveredBatch);
|
|
93
|
+
try {
|
|
94
|
+
const res = await webSearch(query, false, searchOpts);
|
|
95
|
+
results = res.results;
|
|
96
|
+
ddgText = res.pageText;
|
|
97
|
+
} catch (_) {
|
|
98
|
+
if (verbose) log(LOG_WARN, chalk.yellow, `DDG search failed for: ${uncoveredBatch.join(', ')}`);
|
|
99
|
+
}
|
|
100
|
+
}
|
|
101
|
+
|
|
102
|
+
const combinedText = registryText + (ddgText || '');
|
|
72
103
|
const badSignals = ['403', 'captcha', 'blocked', 'access denied', 'forbidden'];
|
|
73
|
-
const isBad = !
|
|
104
|
+
const isBad = !combinedText || combinedText.length < 200 || badSignals.some(s => combinedText.toLowerCase().includes(s));
|
|
105
|
+
|
|
74
106
|
if (isBad && previousBatchHashes.has(blockId)) {
|
|
75
107
|
log(LOG_WARN, chalk.yellow, `low quality result for block ${blockId} → keeping previous context`);
|
|
76
108
|
failedCount++;
|
|
@@ -82,19 +114,22 @@ export function startWatcher({ intervalMinutes, usePackageJson, batchSize }) {
|
|
|
82
114
|
const batchTitle = batch.map(b => b.split(' ')[0]).join(', ');
|
|
83
115
|
|
|
84
116
|
let globalMd = `## Live Context — ${batchTitle} (${nowStr})\n`;
|
|
85
|
-
|
|
86
|
-
|
|
117
|
+
if (registryText) {
|
|
118
|
+
globalMd += sanitizeWebContent(registryText, 500) + '\n';
|
|
119
|
+
} else if (results.length > 0) {
|
|
87
120
|
globalMd += `### ${results[0].title}\n`;
|
|
88
121
|
globalMd += `${sanitizeWebContent(results[0].snippet, 300)} — ${results[0].url}\n`;
|
|
89
122
|
}
|
|
90
123
|
|
|
91
124
|
let md = `## Live Context — ${batchTitle} (${nowStr})\n`;
|
|
92
|
-
|
|
125
|
+
if (registryText) {
|
|
126
|
+
md += sanitizeWebContent(registryText, maxTokens) + '\n\n';
|
|
127
|
+
}
|
|
93
128
|
for (const r of results) {
|
|
94
129
|
md += `### ${r.title}\n${sanitizeWebContent(r.snippet, 500)} — ${r.url}\n\n`;
|
|
95
130
|
}
|
|
96
|
-
if (
|
|
97
|
-
md += `FULL TEXT: ${sanitizeWebContent(
|
|
131
|
+
if (ddgText) {
|
|
132
|
+
md += `FULL TEXT: ${sanitizeWebContent(ddgText, maxTokens)}\n`;
|
|
98
133
|
}
|
|
99
134
|
|
|
100
135
|
await updateGeminiFiles([{
|
|
@@ -105,7 +140,11 @@ export function startWatcher({ intervalMinutes, usePackageJson, batchSize }) {
|
|
|
105
140
|
|
|
106
141
|
previousBatchHashes.set(blockId, currentHash);
|
|
107
142
|
updatedCount++;
|
|
108
|
-
|
|
143
|
+
|
|
144
|
+
const sources = [];
|
|
145
|
+
if (coveredDeps.size > 0) sources.push(`registry:${coveredDeps.size}`);
|
|
146
|
+
if (results.length > 0) sources.push(`ddg:${results.length}`);
|
|
147
|
+
log(LOG_REFRESH, chalk.cyan, `block ${blockId} updated → ${batch.join(', ')} [${sources.join(', ')}]`);
|
|
109
148
|
} catch (e) {
|
|
110
149
|
failedCount++;
|
|
111
150
|
log(LOG_WARN, chalk.yellow, `block ${blockId} fetch failed → keeping previous`);
|
|
@@ -119,6 +158,38 @@ export function startWatcher({ intervalMinutes, usePackageJson, batchSize }) {
|
|
|
119
158
|
}
|
|
120
159
|
await Promise.all(executing);
|
|
121
160
|
|
|
161
|
+
// ── Custom sources from .groundtruth.json ──
|
|
162
|
+
if (customSources.length > 0) {
|
|
163
|
+
for (const src of customSources) {
|
|
164
|
+
const blockId = 'src_' + Buffer.from(src.url).toString('base64url').slice(0, 8);
|
|
165
|
+
activeBlockIds.add(blockId);
|
|
166
|
+
|
|
167
|
+
if (previousBatchHashes.has(blockId)) {
|
|
168
|
+
skippedCount++;
|
|
169
|
+
continue;
|
|
170
|
+
}
|
|
171
|
+
|
|
172
|
+
try {
|
|
173
|
+
const text = await fetchPageContent(src.url, '', searchOpts);
|
|
174
|
+
if (text && text.length > 100) {
|
|
175
|
+
const srcLabel = src.label || new URL(src.url).hostname;
|
|
176
|
+
const md = `## Custom Source — ${srcLabel}\n${sanitizeWebContent(text, maxTokens)}\n`;
|
|
177
|
+
|
|
178
|
+
await updateGeminiFiles([{
|
|
179
|
+
blockId,
|
|
180
|
+
globalContent: `## ${srcLabel}\n${sanitizeWebContent(text, 500)}\n`,
|
|
181
|
+
workspaceContent: md
|
|
182
|
+
}]);
|
|
183
|
+
previousBatchHashes.set(blockId, blockId);
|
|
184
|
+
updatedCount++;
|
|
185
|
+
log(LOG_REFRESH, chalk.cyan, `custom source updated → ${srcLabel}`);
|
|
186
|
+
}
|
|
187
|
+
} catch (_) {
|
|
188
|
+
failedCount++;
|
|
189
|
+
}
|
|
190
|
+
}
|
|
191
|
+
}
|
|
192
|
+
|
|
122
193
|
await removeStaleBlocks(globalPath, activeBlockIds);
|
|
123
194
|
await removeStaleBlocks(workspacePath, activeBlockIds);
|
|
124
195
|
|
|
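Review note on the custom-source block IDs in the hunk above: the ID is built from the first 8 base64url characters of the URL, which encode only its first 6 bytes. For any `https://` URL those bytes are always `https:`, so every HTTPS source appears to map to the same block ID, and the `previousBatchHashes.has(blockId)` check would then skip all but the first one. The sketch below reproduces the prefix scheme from the diff and contrasts it with a hypothetical alternative (`hashedBlockId`, a 32-bit FNV-1a hash over the whole URL). The alternative is an illustration only and is not part of the package.

```javascript
// Scheme used in the diff: first 8 base64url characters of the URL.
function prefixBlockId(url) {
  return 'src_' + Buffer.from(url).toString('base64url').slice(0, 8);
}

// Hypothetical alternative (not in the package): 32-bit FNV-1a over every byte of the URL.
function hashedBlockId(url) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (const byte of Buffer.from(url)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32-bit
  }
  return 'src_' + hash.toString(16).padStart(8, '0');
}

// Example URLs chosen for illustration.
const first = 'https://react.dev/reference';
const second = 'https://svelte.dev/docs';

console.log(prefixBlockId(first));                            // src_aHR0cHM6
console.log(prefixBlockId(first) === prefixBlockId(second));  // true
console.log(hashedBlockId(first), hashedBlockId(second));
```

A 32-bit hash can still collide in principle, so a longer digest may be preferable if many sources are configured.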
@@ -129,18 +200,16 @@ export function startWatcher({ intervalMinutes, usePackageJson, batchSize }) {
|
|
|
129
200
|
|
|
130
201
|
let cycleCount = 0;
|
|
131
202
|
|
|
132
|
-
// Periodical state persistence on process exit to avoid total crash data loss
|
|
133
203
|
process.on('SIGINT', async () => {
|
|
134
204
|
await saveBatchState(previousBatchHashes);
|
|
135
205
|
process.exit(0);
|
|
136
206
|
});
|
|
137
207
|
|
|
138
|
-
// Run immediately at startup
|
|
139
208
|
updateSkill();
|
|
140
209
|
setInterval(() => {
|
|
141
210
|
cycleCount++;
|
|
142
211
|
if (cycleCount % 10 === 0) {
|
|
143
|
-
httpsAgent.destroy();
|
|
212
|
+
httpsAgent.destroy();
|
|
144
213
|
}
|
|
145
214
|
updateSkill();
|
|
146
215
|
}, intervalMinutes * 60 * 1000);
|
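The watcher.js changes above amount to a registry-first lookup with a search fallback and a quality gate. A minimal sketch of that control flow follows, assuming hypothetical stand-ins (`fakeRegistryFetch`, `fakeWebSearch`) for the real `registryFetch` and `webSearch` exports, whose internals are not shown in this hunk. Only the return shapes (`registryText`, `coveredDeps`, `results`, `pageText`) and the bad-signal list are taken from the diff.

```javascript
// Hypothetical stand-in for registryFetch: covers only dependencies it knows about.
async function fakeRegistryFetch(batch) {
  const known = new Map([['react', 'React docs: ' + 'x'.repeat(300)]]);
  const coveredDeps = new Set(batch.filter(dep => known.has(dep)));
  const registryText = [...coveredDeps].map(dep => known.get(dep)).join('\n');
  return { registryText, coveredDeps };
}

// Hypothetical stand-in for webSearch: returns one result plus page text.
async function fakeWebSearch(query) {
  return {
    results: [{ title: query, url: 'https://example.com', snippet: '' }],
    pageText: 'search text for ' + query,
  };
}

// Same signals as the diff; the check is a plain case-insensitive substring match.
const BAD_SIGNALS = ['403', 'captcha', 'blocked', 'access denied', 'forbidden'];

function isLowQuality(text) {
  const lower = (text || '').toLowerCase();
  return lower.length < 200 || BAD_SIGNALS.some(signal => lower.includes(signal));
}

async function collectContext(batch) {
  // 1. Try the registry first.
  const { registryText, coveredDeps } = await fakeRegistryFetch(batch);

  // 2. Search only for what the registry did not cover.
  const uncovered = batch.filter(dep => !coveredDeps.has(dep));
  let ddgText = '';
  let results = [];
  if (uncovered.length > 0) {
    try {
      const res = await fakeWebSearch(uncovered.join(' '));
      results = res.results;
      ddgText = res.pageText;
    } catch (_) {
      // A failed search is non-fatal: registry text may still be usable.
    }
  }

  // 3. Gate on the combined text, as the diff does with combinedText.
  const combinedText = registryText + (ddgText || '');
  return { combinedText, results, coveredDeps, isBad: isLowQuality(combinedText) };
}

collectContext(['react', 'left-pad']).then(ctx => {
  console.log(ctx.coveredDeps.size, ctx.results.length, ctx.isBad);
});
```

Because the gate is a substring match, any page that legitimately mentions "403" or "blocked" would also be flagged as low quality; that trade-off is inherited from the diff, not introduced here.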
package/assets/banner.svg
DELETED
|
@@ -1,106 +0,0 @@
|
|
|
1
|
-
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 300" width="100%">
|
|
2
|
-
<defs>
|
|
3
|
-
<!-- Background Gradient -->
|
|
4
|
-
<linearGradient id="bg" x1="0%" y1="0%" x2="100%" y2="100%">
|
|
5
|
-
<stop offset="0%" stop-color="#090a0f" />
|
|
6
|
-
<stop offset="50%" stop-color="#121521" />
|
|
7
|
-
<stop offset="100%" stop-color="#040508" />
|
|
8
|
-
</linearGradient>
|
|
9
|
-
|
|
10
|
-
<!-- Glowing Text Gradient -->
|
|
11
|
-
<linearGradient id="textGlow" x1="0%" y1="0%" x2="100%" y2="0%">
|
|
12
|
-
<stop offset="0%" stop-color="#4F46E5" />
|
|
13
|
-
<stop offset="50%" stop-color="#0ea5e9" />
|
|
14
|
-
<stop offset="100%" stop-color="#8b5cf6" />
|
|
15
|
-
</linearGradient>
|
|
16
|
-
|
|
17
|
-
<!-- Node Gradient -->
|
|
18
|
-
<linearGradient id="nodeGrad" x1="0%" y1="0%" x2="100%" y2="100%">
|
|
19
|
-
<stop offset="0%" stop-color="#38bdf8" />
|
|
20
|
-
<stop offset="100%" stop-color="#818cf8" />
|
|
21
|
-
</linearGradient>
|
|
22
|
-
|
|
23
|
-
<!-- Grid Pattern -->
|
|
24
|
-
<pattern id="grid" width="40" height="40" patternUnits="userSpaceOnUse">
|
|
25
|
-
<path d="M 40 0 L 0 0 0 40" fill="none" stroke="rgba(255, 255, 255, 0.03)" stroke-width="1"/>
|
|
26
|
-
</pattern>
|
|
27
|
-
|
|
28
|
-
<filter id="glow" x="-20%" y="-20%" width="140%" height="140%">
|
|
29
|
-
<feGaussianBlur stdDeviation="6" result="blur" />
|
|
30
|
-
<feMerge>
|
|
31
|
-
<feMergeNode in="blur" />
|
|
32
|
-
<feMergeNode in="SourceGraphic" />
|
|
33
|
-
</feMerge>
|
|
34
|
-
</filter>
|
|
35
|
-
</defs>
|
|
36
|
-
|
|
37
|
-
<!-- Deep Dark Solid Background -->
|
|
38
|
-
<rect width="100%" height="100%" fill="url(#bg)" />
|
|
39
|
-
|
|
40
|
-
<!-- Overlay Subtle Grid -->
|
|
41
|
-
<rect width="100%" height="100%" fill="url(#grid)" />
|
|
42
|
-
|
|
43
|
-
<!-- Connecting Lines representing Context Injection (Left) -->
|
|
44
|
-
<g opacity="0.6" stroke-width="2" fill="none" filter="url(#glow)">
|
|
45
|
-
<!-- Primary flow -->
|
|
46
|
-
<path d="M0 150 C 200 150, 300 100, 450 150" stroke="#0ea5e9" opacity="0.8"/>
|
|
47
|
-
<path d="M0 80 C 150 80, 250 150, 400 130" stroke="#4F46E5" opacity="0.5"/>
|
|
48
|
-
<path d="M0 220 C 180 220, 280 150, 420 170" stroke="#8b5cf6" opacity="0.5"/>
|
|
49
|
-
|
|
50
|
-
<!-- Minor branches -->
|
|
51
|
-
<path d="M100 80 L 150 110" stroke="#0ea5e9"/>
|
|
52
|
-
<path d="M200 220 L 250 185" stroke="#8b5cf6"/>
|
|
53
|
-
<path d="M300 100 L 320 130" stroke="#4F46E5"/>
|
|
54
|
-
</g>
|
|
55
|
-
|
|
56
|
-
<!-- Connecting Lines representing Context Injection (Right) -->
|
|
57
|
-
<g opacity="0.6" stroke-width="2" fill="none" filter="url(#glow)">
|
|
58
|
-
<!-- Primary flow -->
|
|
59
|
-
<path d="M1200 150 C 1000 150, 900 200, 750 150" stroke="#0ea5e9" opacity="0.8"/>
|
|
60
|
-
<path d="M1200 80 C 1050 80, 950 150, 800 130" stroke="#8b5cf6" opacity="0.5"/>
|
|
61
|
-
<path d="M1200 220 C 1020 220, 920 150, 780 170" stroke="#4F46E5" opacity="0.5"/>
|
|
62
|
-
|
|
63
|
-
<!-- Minor branches -->
|
|
64
|
-
<path d="M1100 80 L 1050 110" stroke="#0ea5e9"/>
|
|
65
|
-
<path d="M1000 220 L 950 185" stroke="#4F46E5"/>
|
|
66
|
-
<path d="M900 200 L 880 170" stroke="#8b5cf6"/>
|
|
67
|
-
</g>
|
|
68
|
-
|
|
69
|
-
<!-- Data Nodes (Left) -->
|
|
70
|
-
<g fill="url(#nodeGrad)" filter="url(#glow)">
|
|
71
|
-
<circle cx="100" cy="80" r="5" />
|
|
72
|
-
<circle cx="150" cy="110" r="4" />
|
|
73
|
-
<circle cx="200" cy="220" r="6" />
|
|
74
|
-
<circle cx="250" cy="185" r="4" />
|
|
75
|
-
<circle cx="300" cy="100" r="7" />
|
|
76
|
-
<circle cx="320" cy="130" r="3" />
|
|
77
|
-
<circle cx="450" cy="150" r="8" fill="#ffffff"/>
|
|
78
|
-
</g>
|
|
79
|
-
|
|
80
|
-
<!-- Data Nodes (Right) -->
|
|
81
|
-
<g fill="url(#nodeGrad)" filter="url(#glow)">
|
|
82
|
-
<circle cx="1100" cy="80" r="5" />
|
|
83
|
-
<circle cx="1050" cy="110" r="4" />
|
|
84
|
-
<circle cx="1000" cy="220" r="6" />
|
|
85
|
-
<circle cx="950" cy="185" r="4" />
|
|
86
|
-
<circle cx="900" cy="200" r="7" />
|
|
87
|
-
<circle cx="880" cy="170" r="3" />
|
|
88
|
-
<circle cx="750" cy="150" r="8" fill="#ffffff"/>
|
|
89
|
-
</g>
|
|
90
|
-
|
|
91
|
-
<!-- Typography -->
|
|
92
|
-
<text x="50%" y="48%" text-anchor="middle" font-family="-apple-system, system-ui, BlinkMacSystemFont, Segoe UI, Roboto, sans-serif" font-size="94" font-weight="900" fill="url(#textGlow)" filter="url(#glow)" letter-spacing="2">
|
|
93
|
-
GROUNDTRUTH
|
|
94
|
-
</text>
|
|
95
|
-
|
|
96
|
-
<text x="50%" y="48%" text-anchor="middle" font-family="-apple-system, system-ui, BlinkMacSystemFont, Segoe UI, Roboto, sans-serif" font-size="94" font-weight="900" fill="#ffffff" letter-spacing="2">
|
|
97
|
-
GROUNDTRUTH
|
|
98
|
-
</text>
|
|
99
|
-
|
|
100
|
-
<text x="50%" y="68%" text-anchor="middle" font-family="-apple-system, system-ui, BlinkMacSystemFont, Segoe UI, Roboto, sans-serif" font-size="28" font-weight="500" fill="#94a3b8" letter-spacing="4">
|
|
101
|
-
ZERO ZERO-CONFIGURATION CONTEXT INJECTION
|
|
102
|
-
</text>
|
|
103
|
-
|
|
104
|
-
<rect x="350" y="72%" width="500" height="2" fill="rgba(255,255,255,0.1)" />
|
|
105
|
-
<rect x="450" y="72%" width="300" height="2" fill="#0ea5e9" filter="url(#glow)" />
|
|
106
|
-
</svg>
|