@matheuskrumenauer/tanya 0.4.0-beta.0 → 0.5.0-beta.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -163,7 +163,7 @@ through the permission engine before treating them as safe extension points.
163
163
 
164
164
  `--verify` adds required verification commands to the run context. Tanya must run and report each exact command before finishing the coding task.
165
165
 
166
- `tanya benchmark run --all` currently exercises 26 executable low-to-medium regression fixtures: targeted edits, new files, dependency/lockfile updates, framework-style migrations, failing-test repair, frontend smoke checks, artifact/context reuse, streaming long-tool execution, run-history logging, dirty worktrees, report repair, and the CosmoHQ mobile/backend smoke profiles.
166
+ `tanya benchmark run --all` currently exercises 27 executable low-to-medium regression fixtures: targeted edits, new files, dependency/lockfile updates, framework-style migrations, failing-test repair, frontend smoke checks, artifact/context reuse, streaming long-tool execution, compaction-boundary recovery, run-history logging, dirty worktrees, report repair, and the CosmoHQ mobile/backend smoke profiles.
167
167
 
168
168
  By default, `tanya run` also performs an independent post-check after the agent finishes. If the workspace has a `typecheck` script, Tanya reruns that exact script with the local package manager (`npm`, `pnpm`, `yarn`, or `bun`). If not, it falls back to `npx tsc --noEmit --pretty false` when a `tsconfig` is present. If the workspace has a `test` script, Tanya reruns that as well unless the run already reported a passing test verification.
169
169
 
@@ -201,6 +201,19 @@ Long-running `run_shell` calls stream throttled stdout/stderr chunks to the acti
201
201
  tanya run "Run npm test" # emits tool_progress while the command runs; Ctrl-C cancels the active shell and returns partial_output
202
202
  ```
203
203
 
204
+ ## Long sessions
205
+
206
+ Tanya handles context pressure as a cascade instead of truncating abruptly:
207
+
208
+ 1. Microcompact folds empty/no-op tool-call pairs in place.
209
+ 2. Snip removes low-signal history such as duplicate file reads and empty read-only tool results.
210
+ 3. Auto-compact reacts to provider `413` / context-window errors by summarizing older turns into a `[compaction summary: ...]` system message and retrying once normally, then once more aggressively.
211
+ 4. Archive writes compacted messages to `.tanya/runs/<runId>/archive.jsonl` before they leave live history, so verifier scans and future memory tools can still inspect them.
212
+
213
+ Runs are capped at three total auto-compactions. If the provider still rejects the context, Tanya raises `CompactionExhaustedError` and asks the user to narrow the task, clear the session, or split the work.
214
+
215
+ See [docs/long-sessions.md](./docs/long-sessions.md) for details.
216
+
204
217
  Context files are generic JSON envelopes for caller-supplied task metadata, artifacts, instructions, and verification commands.
205
218
 
206
219
  ## Current Tools