pi-oracle 0.1.10 → 0.1.12

Files changed (5)

package/README.md +4 -1
package/extensions/oracle/lib/tools.ts +370 -63
package/extensions/oracle/worker/run-job.mjs +3 -0
package/package.json +1 -1
package/prompts/oracle.md +7 -3

package/README.md CHANGED

@@ -35,12 +35,15 @@ An oracle job:
 4. waits in the background
 5. persists the response and any artifacts under `/tmp/oracle-<job-id>/`
    - old terminal jobs are later pruned according to cleanup retention settings
+   - when directory inputs are expanded, project archives automatically skip common bulky generated caches and top-level build outputs such as `node_modules/`, `target/`, virtualenv caches, coverage outputs, and `dist/`/`build/`/`out/`, unless you explicitly pass those directories
+   - whole-repo archive defaults also skip obvious credentials/private data such as `.env` files, key material, credential dotfiles, local database files, and root `secrets/` directories unless you explicitly pass them
+   - if a whole-repo archive is still too large after default exclusions, submit automatically prunes the largest nested directories with generic generated-output names like `build/`, `dist/`, `out/`, `coverage/`, and `tmp/` outside obvious source roots like `src/` and `lib/`, and successful submissions report what was pruned
 6. wakes the originating `pi` session on completion
 ## Example
 ```text
-/oracle Invoke the Oracle to have it generate a thorough code review of the current pending changes. Include all modified files, and adjacent files, in the archive. Use the Pro Model with Extended effort.
+/oracle Invoke the Oracle to have it generate a thorough code review of the current pending changes. By default include the whole repo archive unless the request clearly needs a narrower scope. Use the Pro Model with Extended effort.
 ```
 ## Why this exists
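
The README bullets added in this hunk describe layered default exclusions: some directory names are skipped anywhere in the tree, some only at the repo root, and credential-like files are skipped unless allowlisted. The following is a minimal sketch of how such rules could be evaluated. It assumes trimmed-down name sets copied from the `tools.ts` diff in the next section; the helper names `isExcludedDir` and `isExcludedFile` are hypothetical and are not part of the package.

```typescript
// Hypothetical, shortened name sets; the real lists in tools.ts are longer.
const EXCLUDED_DIRS_ANYWHERE = new Set(["node_modules", "target", ".venv"]);
const EXCLUDED_DIRS_AT_ROOT = new Set(["dist", "build", "out", "coverage", "secrets"]);
const EXCLUDED_FILES = new Set([".env", ".netrc", ".npmrc", "id_rsa"]);
const EXCLUDED_SUFFIXES = [".pem", ".key", ".sqlite"];
const ENV_ALLOWLIST = new Set([".env.dist", ".env.example", ".env.sample", ".env.template"]);

// Directories: some names are skipped at any depth, others only at depth 1.
// An explicitly passed directory bypasses the defaults.
function isExcludedDir(relativePath: string, explicitlyPassed = false): boolean {
  if (explicitlyPassed) return false;
  const segments = relativePath.split("/").filter(Boolean);
  const name = segments[segments.length - 1];
  if (name === undefined) return false;
  if (EXCLUDED_DIRS_ANYWHERE.has(name)) return true;
  return segments.length === 1 && EXCLUDED_DIRS_AT_ROOT.has(name);
}

// Files: exact names, `.env.*` variants outside the allowlist, and suffixes.
function isExcludedFile(fileName: string): boolean {
  if (EXCLUDED_FILES.has(fileName)) return true;
  if (fileName.startsWith(".env.") && !ENV_ALLOWLIST.has(fileName)) return true;
  return EXCLUDED_SUFFIXES.some((suffix) => fileName.endsWith(suffix));
}

console.log(isExcludedDir("packages/app/node_modules")); // true
console.log(isExcludedDir("dist"));                      // true
console.log(isExcludedDir("packages/app/dist"));         // false
console.log(isExcludedDir("dist", true));                // false
console.log(isExcludedFile(".env.production"));          // true
console.log(isExcludedFile(".env.example"));             // false
```

In this sketch a nested `dist/` survives the default pass, which is why the third README bullet describes a separate size-driven pruning step for such directories.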

package/extensions/oracle/lib/tools.ts CHANGED

@@ -1,7 +1,7 @@
 import { randomUUID } from "node:crypto";
-import { mkdtemp, rename, rm, stat, writeFile } from "node:fs/promises";
+import { lstat, mkdtemp, readdir, rename, rm, stat, writeFile } from "node:fs/promises";
 import { tmpdir } from "node:os";
-import { join } from "node:path";
+import { basename, join, posix } from "node:path";
 import { StringEnum } from "@mariozechner/pi-ai";
 import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";
 import { Type } from "@sinclair/typebox";
@@ -60,74 +60,366 @@ const VALID_EFFORTS: Record<OracleModelFamily, readonly OracleEffort[]> = {
   pro: ["standard", "extended"],
 };
-async function createArchive(cwd: string, files: string[], archivePath: string): Promise<string> {
-  const entries = resolveArchiveInputs(cwd, files);
-  const listDir = await mkdtemp(join(tmpdir(), "oracle-filelist-"));
-  const listPath = join(listDir, "files.list");
-  await writeFile(listPath, Buffer.from(`${entries.map((entry) => entry.relative).join("\0")}\0`), { mode: 0o600 });
+const MAX_ARCHIVE_BYTES = 250 * 1024 * 1024;
+const DEFAULT_ARCHIVE_EXCLUDED_DIR_NAMES_ANYWHERE = new Set([
+  ".git",
+  ".hg",
+  ".svn",
+  "node_modules",
+  "target",
+  ".venv",
+  "venv",
+  "__pycache__",
+  ".pytest_cache",
+  ".mypy_cache",
+  ".ruff_cache",
+  ".tox",
+  ".nox",
+  ".hypothesis",
+  ".next",
+  ".nuxt",
+  ".svelte-kit",
+  ".turbo",
+  ".parcel-cache",
+  ".cache",
+  ".gradle",
+  ".terraform",
+  "DerivedData",
+  ".build",
+  ".pnpm-store",
+  ".serverless",
+  ".aws-sam",
+]);
+const DEFAULT_ARCHIVE_EXCLUDED_DIR_NAMES_AT_REPO_ROOT = new Set(["coverage", "htmlcov", "tmp", "temp", ".tmp", "dist", "build", "out", "secrets", ".secrets"]);
+const DEFAULT_ARCHIVE_EXCLUDED_FILES = new Set([
+  ".coverage",
+  ".DS_Store",
+  ".env",
+  ".netrc",
+  ".npmrc",
+  ".pypirc",
+  "Thumbs.db",
+  "id_dsa",
+  "id_ecdsa",
+  "id_ed25519",
+  "id_rsa",
+]);
+const DEFAULT_ARCHIVE_EXCLUDED_SUFFIXES = [".db", ".key", ".p12", ".pfx", ".pyc", ".pyd", ".pyo", ".pem", ".sqlite", ".sqlite3", ".tsbuildinfo", ".tfstate"];
+const DEFAULT_ARCHIVE_EXCLUDED_SUBSTRINGS = [".tfstate."];
+const DEFAULT_ARCHIVE_EXCLUDED_ENV_ALLOWLIST = new Set([".env.dist", ".env.example", ".env.sample", ".env.template"]);
+const DEFAULT_ARCHIVE_EXCLUDED_PATH_SEQUENCES = [[".yarn", "cache"]] as const;
+const ADAPTIVE_ARCHIVE_PRUNE_DIR_NAMES_ANYWHERE = new Set(["build", "dist", "out", "coverage", "htmlcov", "tmp", "temp", ".tmp"]);
+const ADAPTIVE_ARCHIVE_PRUNE_PROTECTED_ANCESTOR_DIR_NAMES = new Set(["src", "source", "sources", "lib"]);
+type ArchiveSizeBreakdownRow = { relativePath: string; bytes: number };
+type ArchiveCreationResult = {
+  sha256: string;
+  archiveBytes: number;
+  initialArchiveBytes?: number;
+  autoPrunedPrefixes: ArchiveSizeBreakdownRow[];
+  includedEntries: string[];
+};
+function pathContainsSequence(relativePath: string, sequence: readonly string[]): boolean {
+  const segments = relativePath.split("/").filter(Boolean);
+  if (sequence.length === 0 || segments.length < sequence.length) return false;
+  for (let index = 0; index <= segments.length - sequence.length; index += 1) {
+    if (sequence.every((segment, offset) => segments[index + offset] === segment)) return true;
+  }
+  return false;
+}
+function getRelativeDepth(relativePath: string): number {
+  return relativePath.split("/").filter(Boolean).length;
+}
+function formatBytes(bytes: number): string {
+  return `${(bytes / (1024 * 1024)).toFixed(2)} MiB`;
+}
+function formatDirectoryLabel(relativePath: string): string {
+  return relativePath.endsWith("/") ? relativePath : `${relativePath}/`;
+}
+function summarizeByKey(
+  entrySizes: ArchiveSizeBreakdownRow[],
+  keyForEntry: (relativePath: string) => string | undefined,
+  limit = 7,
+): ArchiveSizeBreakdownRow[] {
+  const totals = new Map<string, number>();
+  for (const entry of entrySizes) {
+    const key = keyForEntry(entry.relativePath);
+    if (!key) continue;
+    totals.set(key, (totals.get(key) ?? 0) + entry.bytes);
+  }
+  return [...totals.entries()]
+    .map(([relativePath, bytes]) => ({ relativePath, bytes }))
+    .sort((left, right) => right.bytes - left.bytes || left.relativePath.localeCompare(right.relativePath))
+    .slice(0, limit);
+}
+function summarizeTopLevelIncludedPaths(entrySizes: ArchiveSizeBreakdownRow[]): ArchiveSizeBreakdownRow[] {
+  return summarizeByKey(entrySizes, (relativePath) => {
+    const [topLevel, ...rest] = relativePath.split("/").filter(Boolean);
+    if (!topLevel) return undefined;
+    return rest.length > 0 ? `${topLevel}/` : topLevel;
+  });
+}
+function getAdaptivePrunePrefix(relativePath: string): string | undefined {
+  const segments = relativePath.split("/").filter(Boolean);
+  for (let index = 0; index < segments.length - 1; index += 1) {
+    const name = segments[index];
+    if (!ADAPTIVE_ARCHIVE_PRUNE_DIR_NAMES_ANYWHERE.has(name)) continue;
+    const ancestors = segments.slice(0, index);
+    if (ancestors.some((segment) => ADAPTIVE_ARCHIVE_PRUNE_PROTECTED_ANCESTOR_DIR_NAMES.has(segment))) continue;
+    return segments.slice(0, index + 1).join("/");
+  }
+  return undefined;
+}
+function summarizeAdaptivePruneCandidates(
+  entrySizes: ArchiveSizeBreakdownRow[],
+  minimumBytes = 0,
+): ArchiveSizeBreakdownRow[] {
+  return summarizeByKey(entrySizes, getAdaptivePrunePrefix, Number.POSITIVE_INFINITY).filter((entry) => entry.bytes >= minimumBytes);
+}
+function pruneEntriesByPrefix(entries: string[], prefix: string): string[] {
+  return entries.filter((entry) => entry !== prefix && !entry.startsWith(`${prefix}/`));
+}
+function shouldExcludeArchivePath(relativePath: string, isDirectory: boolean, options?: { forceInclude?: boolean }): boolean {
+  const normalized = posix.normalize(relativePath).replace(/^\.\//, "");
+  if (!normalized || normalized === ".") return false;
+  if (options?.forceInclude) return false;
+  const name = basename(normalized);
+  if (DEFAULT_ARCHIVE_EXCLUDED_PATH_SEQUENCES.some((sequence) => pathContainsSequence(normalized, sequence))) return true;
+  if (isDirectory) {
+    if (DEFAULT_ARCHIVE_EXCLUDED_DIR_NAMES_ANYWHERE.has(name)) return true;
+    if (getRelativeDepth(normalized) === 1 && DEFAULT_ARCHIVE_EXCLUDED_DIR_NAMES_AT_REPO_ROOT.has(name)) return true;
+    return false;
+  }
+  if (DEFAULT_ARCHIVE_EXCLUDED_FILES.has(name)) return true;
+  if (name.startsWith(".env.") && !DEFAULT_ARCHIVE_EXCLUDED_ENV_ALLOWLIST.has(name)) return true;
+  if (DEFAULT_ARCHIVE_EXCLUDED_SUFFIXES.some((suffix) => name.endsWith(suffix))) return true;
+  if (DEFAULT_ARCHIVE_EXCLUDED_SUBSTRINGS.some((needle) => name.includes(needle))) return true;
+  return false;
+}
+async function isSymlinkToDirectory(path: string): Promise<boolean> {
   try {
-    const { spawn } = await import("node:child_process");
-    await new Promise<void>((resolvePromise, rejectPromise) => {
-      const tar = spawn("tar", ["--null", "-cf", "-", "-T", listPath], {
-        cwd,
-        stdio: ["ignore", "pipe", "pipe"],
-      });
-      const zstd = spawn("zstd", ["-19", "-T0", "-o", archivePath], {
-        stdio: ["pipe", "ignore", "pipe"],
-      });
-      let stderr = "";
-      let settled = false;
-      let tarCode: number | null | undefined;
-      let zstdCode: number | null | undefined;
-      const finish = (error?: Error) => {
-        if (settled) return;
-        if (error) {
-          settled = true;
-          tar.kill("SIGTERM");
-          zstd.kill("SIGTERM");
-          rejectPromise(error);
-          return;
-        }
-        if (tarCode === undefined || zstdCode === undefined) return;
+    return (await stat(path)).isDirectory();
+  } catch {
+    return false;
+  }
+}
+async function shouldExcludeArchiveChild(
+  absolutePath: string,
+  relativePath: string,
+  child: { isDirectory(): boolean; isSymbolicLink(): boolean },
+  options?: { forceInclude?: boolean },
+): Promise<boolean> {
+  const isDirectoryLike = child.isDirectory() || (child.isSymbolicLink() && await isSymlinkToDirectory(absolutePath));
+  return shouldExcludeArchivePath(relativePath, isDirectoryLike, options);
+}
+async function expandArchiveEntries(cwd: string, relativePath: string, options?: { forceIncludeSubtree?: boolean }): Promise<string[]> {
+  const normalized = posix.normalize(relativePath).replace(/^\.\//, "");
+  if (normalized === ".") {
+    const children = await readdir(cwd, { withFileTypes: true });
+    const results: string[] = [];
+    for (const child of children.sort((a, b) => a.name.localeCompare(b.name))) {
+      const childRelative = child.name;
+      if (await shouldExcludeArchiveChild(join(cwd, childRelative), childRelative, child)) continue;
+      if (child.isDirectory()) results.push(...await expandArchiveEntries(cwd, childRelative));
+      else results.push(childRelative);
+    }
+    return results;
+  }
+  const absolute = join(cwd, normalized);
+  const entry = await lstat(absolute);
+  if (!entry.isDirectory()) return [normalized];
+  if (shouldExcludeArchivePath(normalized, true, { forceInclude: options?.forceIncludeSubtree })) return [];
+  const children = await readdir(absolute, { withFileTypes: true });
+  const results: string[] = [];
+  for (const child of children.sort((a, b) => a.name.localeCompare(b.name))) {
+    const childRelative = posix.join(normalized, child.name);
+    if (await shouldExcludeArchiveChild(join(cwd, childRelative), childRelative, child, { forceInclude: options?.forceIncludeSubtree })) continue;
+    if (child.isDirectory()) results.push(...await expandArchiveEntries(cwd, childRelative, { forceIncludeSubtree: options?.forceIncludeSubtree }));
+    else results.push(childRelative);
+  }
+  return results;
+}
+async function resolveExpandedArchiveEntriesFromInputs(
+  cwd: string,
+  entries: Array<{ absolute: string; relative: string }>,
+): Promise<string[]> {
+  return Array.from(new Set((await Promise.all(entries.map(async (entry) => {
+    const statEntry = await lstat(entry.absolute);
+    const forceIncludeSubtree = statEntry.isDirectory() && entry.relative !== "." && shouldExcludeArchivePath(entry.relative, true);
+    return expandArchiveEntries(cwd, entry.relative, { forceIncludeSubtree });
+  }))).flat())).sort();
+}
+export async function resolveExpandedArchiveEntries(cwd: string, files: string[]): Promise<string[]> {
+  return resolveExpandedArchiveEntriesFromInputs(cwd, resolveArchiveInputs(cwd, files));
+}
+function isWholeRepoArchiveSelection(entries: Array<{ absolute: string; relative: string }>): boolean {
+  return entries.length === 1 && entries[0]?.relative === ".";
+}
+async function measureArchiveEntrySizes(cwd: string, entries: string[]): Promise<ArchiveSizeBreakdownRow[]> {
+  return Promise.all(entries.map(async (relativePath) => ({ relativePath, bytes: (await lstat(join(cwd, relativePath))).size })));
+}
+function formatArchiveOversizeError(args: {
+  archiveBytes: number;
+  maxBytes: number;
+  entrySizes: ArchiveSizeBreakdownRow[];
+  autoPrunedPrefixes: ArchiveSizeBreakdownRow[];
+  adaptivePruneMinBytes?: number;
+}): string {
+  const topLevel = summarizeTopLevelIncludedPaths(args.entrySizes);
+  const adaptiveCandidates = summarizeAdaptivePruneCandidates(args.entrySizes, args.adaptivePruneMinBytes).slice(0, 7);
+  return [
+    `Oracle archive exceeds ChatGPT upload limit after default exclusions${args.autoPrunedPrefixes.length > 0 ? " and automatic generic generated-output-dir pruning" : ""}: ${args.archiveBytes} bytes >= ${args.maxBytes} bytes`,
+    args.autoPrunedPrefixes.length > 0 ? "Automatically pruned generic generated-output paths before failing:" : undefined,
+    ...args.autoPrunedPrefixes.map((entry) => `- ${formatDirectoryLabel(entry.relativePath)} — ${formatBytes(entry.bytes)}`),
+    topLevel.length > 0 ? "Approx top-level included sizes:" : undefined,
+    ...topLevel.map((entry) => `- ${entry.relativePath} — ${formatBytes(entry.bytes)}`),
+    adaptiveCandidates.length > 0 ? "Largest remaining generic generated-output-dir candidates:" : undefined,
+    ...adaptiveCandidates.map((entry) => `- ${formatDirectoryLabel(entry.relativePath)} — ${formatBytes(entry.bytes)}`),
+    "Retry with narrower archive inputs, starting with modified files plus adjacent files plus directly relevant subtrees.",
+  ]
+    .filter(Boolean)
+    .join("\n");
+}
+async function writeArchiveFile(cwd: string, entries: string[], archivePath: string, listPath: string): Promise<number> {
+  await writeFile(listPath, Buffer.from(`${entries.join("\0")}\0`), { mode: 0o600 });
+  await rm(archivePath, { force: true }).catch(() => undefined);
+  const { spawn } = await import("node:child_process");
+  await new Promise<void>((resolvePromise, rejectPromise) => {
+    const tar = spawn("tar", ["--null", "-cf", "-", "-T", listPath], {
+      cwd,
+      stdio: ["ignore", "pipe", "pipe"],
+    });
+    const zstd = spawn("zstd", ["-19", "-T0", "-f", "-o", archivePath], {
+      stdio: ["pipe", "ignore", "pipe"],
+    });
+    let stderr = "";
+    let settled = false;
+    let tarCode: number | null | undefined;
+    let zstdCode: number | null | undefined;
+    const finish = (error?: Error) => {
+      if (settled) return;
+      if (error) {
         settled = true;
-        if (tarCode === 0 && zstdCode === 0) resolvePromise();
-        else rejectPromise(new Error(stderr || `archive command failed (tar=${tarCode}, zstd=${zstdCode})`));
-      };
+        tar.kill("SIGTERM");
+        zstd.kill("SIGTERM");
+        rejectPromise(error);
+        return;
+      }
+      if (tarCode === undefined || zstdCode === undefined) return;
+      settled = true;
+      if (tarCode === 0 && zstdCode === 0) resolvePromise();
+      else rejectPromise(new Error(stderr || `archive command failed (tar=${tarCode}, zstd=${zstdCode})`));
+    };
-      tar.stderr.on("data", (data) => {
-        stderr += String(data);
-      });
-      zstd.stderr.on("data", (data) => {
-        stderr += String(data);
-      });
-      tar.on("error", (error) => finish(error instanceof Error ? error : new Error(String(error))));
-      zstd.on("error", (error) => finish(error instanceof Error ? error : new Error(String(error))));
-      tar.on("close", (code) => {
-        tarCode = code;
-        finish();
-      });
-      zstd.on("close", (code) => {
-        zstdCode = code;
-        finish();
-      });
-      tar.stdout.pipe(zstd.stdin);
+    tar.stderr.on("data", (data) => {
+      stderr += String(data);
     });
+    zstd.stderr.on("data", (data) => {
+      stderr += String(data);
+    });
+    tar.on("error", (error) => finish(error instanceof Error ? error : new Error(String(error))));
+    zstd.on("error", (error) => finish(error instanceof Error ? error : new Error(String(error))));
+    tar.on("close", (code) => {
+      tarCode = code;
+      finish();
+    });
+    zstd.on("close", (code) => {
+      zstdCode = code;
+      finish();
+    });
+    tar.stdout.pipe(zstd.stdin);
+  });
-    const archiveStat = await stat(archivePath);
-    const maxBytes = 250 * 1024 * 1024;
-    if (archiveStat.size >= maxBytes) {
-      throw new Error(`Oracle archive exceeds ChatGPT upload limit: ${archiveStat.size} bytes`);
-    }
+  return (await stat(archivePath)).size;
+}
+export async function createArchiveForTesting(
+  cwd: string,
+  files: string[],
+  archivePath: string,
+  options?: { maxBytes?: number; adaptivePruneMinBytes?: number },
+): Promise<ArchiveCreationResult> {
+  const archiveInputs = resolveArchiveInputs(cwd, files);
+  const wholeRepoSelection = isWholeRepoArchiveSelection(archiveInputs);
+  let expandedEntries = await resolveExpandedArchiveEntriesFromInputs(cwd, archiveInputs);
+  if (expandedEntries.length === 0) {
+    throw new Error("Oracle archive inputs are empty after default exclusions");
+  }
+  const listDir = await mkdtemp(join(tmpdir(), "oracle-filelist-"));
+  const listPath = join(listDir, "files.list");
+  const maxBytes = options?.maxBytes ?? MAX_ARCHIVE_BYTES;
+  const adaptivePruneMinBytes = options?.adaptivePruneMinBytes ?? 0;
+  const autoPrunedPrefixes: ArchiveSizeBreakdownRow[] = [];
+  let initialArchiveBytes: number | undefined;
-    return sha256File(archivePath);
+  try {
+    while (true) {
+      if (expandedEntries.length === 0) {
+        throw new Error("Oracle archive inputs are empty after default exclusions and automatic size pruning");
+      }
+      const archiveBytes = await writeArchiveFile(cwd, expandedEntries, archivePath, listPath);
+      if (archiveBytes < maxBytes) {
+        return {
+          sha256: await sha256File(archivePath),
+          archiveBytes,
+          initialArchiveBytes,
+          autoPrunedPrefixes,
+          includedEntries: [...expandedEntries],
+        };
+      }
+      if (initialArchiveBytes === undefined) initialArchiveBytes = archiveBytes;
+      const entrySizes = await measureArchiveEntrySizes(cwd, expandedEntries);
+      if (!wholeRepoSelection) {
+        throw new Error(formatArchiveOversizeError({ archiveBytes, maxBytes, entrySizes, autoPrunedPrefixes, adaptivePruneMinBytes }));
+      }
+      const nextCandidate = summarizeAdaptivePruneCandidates(entrySizes, adaptivePruneMinBytes).find(
+        (entry) => !autoPrunedPrefixes.some((pruned) => pruned.relativePath === entry.relativePath),
+      );
+      if (!nextCandidate) {
+        throw new Error(formatArchiveOversizeError({ archiveBytes, maxBytes, entrySizes, autoPrunedPrefixes, adaptivePruneMinBytes }));
+      }
+      autoPrunedPrefixes.push(nextCandidate);
+      expandedEntries = pruneEntriesByPrefix(expandedEntries, nextCandidate.relativePath);
+    }
   } finally {
     await rm(listDir, { recursive: true, force: true }).catch(() => undefined);
   }
 }
+async function createArchive(cwd: string, files: string[], archivePath: string): Promise<ArchiveCreationResult> {
+  return createArchiveForTesting(cwd, files, archivePath);
+}
 function validateSubmissionOptions(
   params: { effort?: OracleEffort; autoSwitchToThinking?: boolean },
   modelFamily: OracleModelFamily,
@@ -212,9 +504,13 @@ export function registerOracleTools(pi: ExtensionAPI, workerPath: string): void
     promptSnippet: "Dispatch a background ChatGPT web oracle job after gathering repo context.",
     promptGuidelines: [
       "Gather context before calling oracle_submit.",
-      "Always include a narrowly scoped archive of exact relevant files/directories.",
+      "By default, archive the whole repo by passing '.'; default archive exclusions apply automatically, including common bulky outputs and obvious credentials/private data like .env files, key material, credential dotfiles, local database files, and root secrets directories.",
+      "Only narrow file selection when the user explicitly asks, the task is clearly scoped smaller, or privacy/sensitivity requires it.",
+      "For very targeted asks like a single function or stack trace, a smaller archive is preferable.",
+      "When files='.' and the post-exclusion archive is still too large, submit automatically prunes the largest nested directories matching generic generated-output names like build/, dist/, out/, coverage/, and tmp/ outside obvious source roots like src/ and lib/ until the archive fits or no candidate remains; successful submissions report what was pruned.",
+      "If a submitted oracle job later fails because upload is rejected, retry smaller: remove the largest obviously irrelevant/generated content first, then narrow to modified files plus adjacent files plus directly relevant subtrees, then explain the cut or ask the user if still needed.",
+      "If oracle_submit itself fails because the local archive still exceeds the upload limit after default exclusions and automatic generic generated-output-dir pruning, or for any other submit-time error, stop and report the error instead of retrying automatically.",
       "Stop after dispatching oracle_submit; do not continue the task while the oracle job is running.",
-      "If oracle_submit fails, stop and report the error instead of retrying automatically.",
       "Only use autoSwitchToThinking with modelFamily=instant.",
     ],
     parameters: ORACLE_SUBMIT_PARAMS,
@@ -246,7 +542,7 @@ export function registerOracleTools(pi: ExtensionAPI, workerPath: string): void
       let job;
       try {
-        const archiveSha256 = await createArchive(ctx.cwd, params.files, tempArchivePath);
+        const archive = await createArchive(ctx.cwd, params.files, tempArchivePath);
         await withLock("admission", "global", { jobId, processPid: process.pid }, async () => {
           await acquireRuntimeLease(config, {
             jobId,
@@ -288,7 +584,7 @@ export function registerOracleTools(pi: ExtensionAPI, workerPath: string): void
         const worker = await spawnWorker(workerPath, job.id);
         await updateJob(job.id, (current) => ({
           ...current,
-          archiveSha256,
+          archiveSha256: archive.sha256,
           workerPid: worker.pid,
           workerNonce: worker.nonce,
           workerStartedAt: worker.startedAt,
@@ -304,6 +600,9 @@ export function registerOracleTools(pi: ExtensionAPI, workerPath: string): void
                 followUp.followUpToJobId ? `Follow-up to: ${followUp.followUpToJobId}` : undefined,
                 `Prompt: ${job.promptPath}`,
                 `Archive: ${job.archivePath}`,
+                archive.autoPrunedPrefixes.length > 0
+                  ? `Archive auto-pruned generic generated-output-name dirs to fit size limit: ${archive.autoPrunedPrefixes.map((entry) => `${entry.relativePath}/ (${formatBytes(entry.bytes)})`).join(", ")}`
+                  : undefined,
                 `Response will be written to: ${job.responsePath}`,
                 "Stop now and wait for the oracle completion wake-up.",
               ]
@@ -311,7 +610,15 @@ export function registerOracleTools(pi: ExtensionAPI, workerPath: string): void
                 .join("\n"),
             },
           ],
-          details: { jobId: job.id, archiveSha256, runtimeId: job.runtimeId, followUpToJobId: followUp.followUpToJobId },
+          details: {
+            jobId: job.id,
+            archiveSha256: archive.sha256,
+            archiveBytes: archive.archiveBytes,
+            initialArchiveBytes: archive.initialArchiveBytes,
+            autoPrunedArchivePaths: archive.autoPrunedPrefixes,
+            runtimeId: job.runtimeId,
+            followUpToJobId: followUp.followUpToJobId,
+          },
         };
       } catch (error) {
         const message = error instanceof Error ? error.message : String(error);
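
The adaptive pruning loop in `createArchiveForTesting` above can be hard to follow inside the diff. The sketch below illustrates the same greedy idea on in-memory data: while the total is at or over the limit, drop the largest directory whose name looks like generated output and that does not sit under a protected source root. It is a simplification under stated assumptions: the summed file sizes stand in for the compressed archive size (the real code rebuilds the `tar | zstd` archive on every iteration), the name sets are shortened, and `prunePrefix` and `pruneToFit` are hypothetical names.

```typescript
type Row = { relativePath: string; bytes: number };

// Hypothetical, shortened versions of the sets used in tools.ts.
const PRUNE_DIR_NAMES = new Set(["build", "dist", "out", "coverage", "tmp"]);
const PROTECTED_ANCESTORS = new Set(["src", "lib"]);

// Returns the shallowest prunable directory prefix for a file path, if any.
function prunePrefix(relativePath: string): string | undefined {
  const segments = relativePath.split("/").filter(Boolean);
  // The last segment is the file name itself, so it is never a candidate.
  for (let index = 0; index < segments.length - 1; index += 1) {
    const name = segments[index];
    if (name === undefined || !PRUNE_DIR_NAMES.has(name)) continue;
    const ancestors = segments.slice(0, index);
    if (ancestors.some((segment) => PROTECTED_ANCESTORS.has(segment))) continue;
    return segments.slice(0, index + 1).join("/");
  }
  return undefined;
}

// Greedily removes the largest candidate directory until the total fits.
function pruneToFit(files: Row[], maxBytes: number): { kept: Row[]; pruned: Row[] } {
  let kept = [...files];
  const pruned: Row[] = [];
  const total = (rows: Row[]) => rows.reduce((sum, row) => sum + row.bytes, 0);
  while (total(kept) >= maxBytes) {
    const totals = new Map<string, number>();
    for (const file of kept) {
      const prefix = prunePrefix(file.relativePath);
      if (prefix !== undefined) totals.set(prefix, (totals.get(prefix) ?? 0) + file.bytes);
    }
    const [largest] = [...totals.entries()].sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
    if (largest === undefined) throw new Error("still too large and no prune candidate remains");
    const [prefix, bytes] = largest;
    pruned.push({ relativePath: prefix, bytes });
    kept = kept.filter((file) => !file.relativePath.startsWith(`${prefix}/`));
  }
  return { kept, pruned };
}

const sample: Row[] = [
  { relativePath: "src/index.ts", bytes: 10 },
  { relativePath: "src/build/gen.ts", bytes: 50 }, // protected by the src/ ancestor
  { relativePath: "packages/a/dist/bundle.js", bytes: 200 },
  { relativePath: "packages/b/coverage/lcov.info", bytes: 80 },
  { relativePath: "README.md", bytes: 5 },
];
const result = pruneToFit(sample, 150);
console.log(result.pruned); // [ { relativePath: 'packages/a/dist', bytes: 200 } ]
```

With a limit of 150, removing `packages/a/dist` brings the total from 345 to 145, so the loop stops without touching `packages/b/coverage` or `src/build`.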

package/extensions/oracle/worker/run-job.mjs CHANGED

@@ -44,6 +44,7 @@ const AGENT_BROWSER_CLOSE_TIMEOUT_MS = 10_000;
 const MODEL_CONFIGURATION_SETTLE_TIMEOUT_MS = 20_000;
 const MODEL_CONFIGURATION_SETTLE_POLL_MS = 250;
 const MODEL_CONFIGURATION_CLOSE_RETRY_MS = 1_000;
+const POST_SEND_SETTLE_MS = 15_000;
 const AGENT_BROWSER_BIN = [process.env.AGENT_BROWSER_PATH, "/opt/homebrew/bin/agent-browser", "/usr/local/bin/agent-browser"].find(
   (candidate) => typeof candidate === "string" && candidate && existsSync(candidate),
 ) || "agent-browser";
@@ -1510,6 +1511,8 @@ async function run() {
     const baselineAssistantCount = (await assistantMessages(currentJob)).length;
     await log(`Assistant response count before send: ${baselineAssistantCount}`);
     await clickSend(currentJob);
+    await log(`Waiting ${POST_SEND_SETTLE_MS}ms after send to avoid streaming interruption`);
+    await sleep(POST_SEND_SETTLE_MS);
     const chatUrl = await waitForStableChatUrl(currentJob, currentJob.chatUrl);
     const conversationId = parseConversationId(chatUrl) || currentJob.conversationId;

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-oracle",
-  "version": "0.1.10",
+  "version": "0.1.12",
   "description": "ChatGPT web-oracle extension for pi with isolated browser auth, async jobs, and project-context archives.",
   "private": false,
   "license": "MIT",

package/prompts/oracle.md CHANGED

@@ -8,17 +8,21 @@ Do not answer the user's request directly yet.
 Required workflow:
 1. Understand the request.
 2. Gather repo context first by reading files and searching the codebase.
-3. Select the exact relevant files/directories for the oracle archive.
+3. Choose archive inputs for the oracle job.
 4. Craft a concise but complete oracle prompt for ChatGPT web.
 5. Call oracle_submit with the prompt and exact archive inputs.
 6. Stop immediately after dispatching the oracle job.
 Rules:
 - Always include an archive. Do not submit without context files.
-- Keep the archive narrowly scoped and relevant.
+- By default, include the whole repository by passing `.`. Default archive exclusions apply automatically, including common bulky outputs and obvious credentials/private data like `.env` files, key material, credential dotfiles, local database files, and root `secrets/` directories.
+- Only limit file selection if the user explicitly requests it, if the task is clearly scoped to a smaller area, or if privacy/sensitivity requires it.
+- For very targeted asks like reviewing one function or explaining one stack trace, a smaller archive is preferable.
+- When `files=["."]` and the post-exclusion archive is still too large, submit automatically prunes the largest nested directories matching generic generated-output names like `build/`, `dist/`, `out/`, `coverage/`, and `tmp/` outside obvious source roots like `src/` and `lib/` until the archive fits or no candidate remains. Successful submissions report what was pruned.
+- If a submitted oracle job later fails because upload is rejected, retry with a smaller archive in this order: (1) remove the largest obviously irrelevant/generated content, (2) if still too large, include modified files plus adjacent files plus directly relevant subtrees, (3) if still too large, explain the cut or ask the user.
 - Prefer the configured default model/effort unless the task clearly needs something else.
 - Only use autoSwitchToThinking with the instant model family.
-- If oracle_submit fails, stop and report the error. Do not retry automatically.
+- If `oracle_submit` itself fails because the local archive still exceeds the upload limit after default exclusions and automatic generic generated-output-dir pruning, or for any other submit-time error, stop and report the error. Do not retry automatically.
 - After oracle_submit returns, end your turn. Do not keep working while the oracle runs.
 User request: