@tachu/extensions 1.0.0-alpha.6 → 1.0.0-rc.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (144) hide show
  1. package/CHANGELOG.md +25 -236
  2. package/README.md +53 -998
  3. package/README_ZH.md +52 -973
  4. package/dist/backends/file.d.ts +6 -6
  5. package/dist/backends/file.d.ts.map +1 -1
  6. package/dist/backends/file.js +12 -12
  7. package/dist/backends/file.js.map +1 -1
  8. package/dist/backends/terminal.d.ts +6 -6
  9. package/dist/backends/terminal.d.ts.map +1 -1
  10. package/dist/backends/terminal.js +9 -9
  11. package/dist/backends/terminal.js.map +1 -1
  12. package/dist/backends/web.d.ts +6 -6
  13. package/dist/backends/web.d.ts.map +1 -1
  14. package/dist/backends/web.js +8 -8
  15. package/dist/backends/web.js.map +1 -1
  16. package/dist/common/path.d.ts +14 -14
  17. package/dist/common/path.d.ts.map +1 -1
  18. package/dist/common/path.js +3 -3
  19. package/dist/common/process.js.map +1 -1
  20. package/dist/index.d.ts +2 -2
  21. package/dist/index.d.ts.map +1 -1
  22. package/dist/index.js +2 -2
  23. package/dist/index.js.map +1 -1
  24. package/dist/mcp/sse-adapter.d.ts +38 -38
  25. package/dist/mcp/sse-adapter.d.ts.map +1 -1
  26. package/dist/mcp/sse-adapter.js +34 -34
  27. package/dist/mcp/sse-adapter.js.map +1 -1
  28. package/dist/mcp/stdio-adapter.d.ts +38 -38
  29. package/dist/mcp/stdio-adapter.d.ts.map +1 -1
  30. package/dist/mcp/stdio-adapter.js +34 -34
  31. package/dist/mcp/stdio-adapter.js.map +1 -1
  32. package/dist/memory/fs-memory-system.d.ts +199 -58
  33. package/dist/memory/fs-memory-system.d.ts.map +1 -1
  34. package/dist/memory/fs-memory-system.js +290 -74
  35. package/dist/memory/fs-memory-system.js.map +1 -1
  36. package/dist/memory/index.d.ts +3 -0
  37. package/dist/memory/index.d.ts.map +1 -1
  38. package/dist/memory/index.js +3 -0
  39. package/dist/memory/index.js.map +1 -1
  40. package/dist/memory/projection-outbox.d.ts +69 -0
  41. package/dist/memory/projection-outbox.d.ts.map +1 -0
  42. package/dist/memory/projection-outbox.js +187 -0
  43. package/dist/memory/projection-outbox.js.map +1 -0
  44. package/dist/memory/projection-projector.d.ts +16 -0
  45. package/dist/memory/projection-projector.d.ts.map +1 -0
  46. package/dist/memory/projection-projector.js +56 -0
  47. package/dist/memory/projection-projector.js.map +1 -0
  48. package/dist/memory/projection-worker.d.ts +28 -0
  49. package/dist/memory/projection-worker.d.ts.map +1 -0
  50. package/dist/memory/projection-worker.js +84 -0
  51. package/dist/memory/projection-worker.js.map +1 -0
  52. package/dist/observability/jsonl-emitter.d.ts +25 -25
  53. package/dist/observability/jsonl-emitter.d.ts.map +1 -1
  54. package/dist/observability/jsonl-emitter.js +25 -25
  55. package/dist/observability/jsonl-emitter.js.map +1 -1
  56. package/dist/observability/otel-emitter.d.ts +23 -23
  57. package/dist/observability/otel-emitter.d.ts.map +1 -1
  58. package/dist/observability/otel-emitter.js +37 -28
  59. package/dist/observability/otel-emitter.js.map +1 -1
  60. package/dist/providers/anthropic.d.ts +39 -39
  61. package/dist/providers/anthropic.d.ts.map +1 -1
  62. package/dist/providers/anthropic.js +35 -35
  63. package/dist/providers/anthropic.js.map +1 -1
  64. package/dist/providers/mock.d.ts +33 -33
  65. package/dist/providers/mock.d.ts.map +1 -1
  66. package/dist/providers/mock.js +24 -24
  67. package/dist/providers/mock.js.map +1 -1
  68. package/dist/providers/openai.d.ts +51 -50
  69. package/dist/providers/openai.d.ts.map +1 -1
  70. package/dist/providers/openai.js +81 -46
  71. package/dist/providers/openai.js.map +1 -1
  72. package/dist/providers/qwen.d.ts +39 -38
  73. package/dist/providers/qwen.d.ts.map +1 -1
  74. package/dist/providers/qwen.js +27 -24
  75. package/dist/providers/qwen.js.map +1 -1
  76. package/dist/safety/default-gate.d.ts +35 -35
  77. package/dist/safety/default-gate.d.ts.map +1 -1
  78. package/dist/safety/default-gate.js +3 -3
  79. package/dist/safety/default-gate.js.map +1 -1
  80. package/dist/tools/_shared/web-client.d.ts +1 -1
  81. package/dist/tools/_shared/web-client.js +1 -1
  82. package/dist/tools/apply-patch/executor.js.map +1 -1
  83. package/dist/tools/fetch-url/executor.d.ts +2 -2
  84. package/dist/tools/fetch-url/executor.js +7 -7
  85. package/dist/tools/git-blame/executor.js.map +1 -1
  86. package/dist/tools/git-branch/executor.js +2 -2
  87. package/dist/tools/git-branch/executor.js.map +1 -1
  88. package/dist/tools/git-diff/executor.js.map +1 -1
  89. package/dist/tools/git-show/executor.js.map +1 -1
  90. package/dist/tools/git-status/executor.js.map +1 -1
  91. package/dist/tools/glob/executor.js.map +1 -1
  92. package/dist/tools/index.d.ts.map +1 -1
  93. package/dist/tools/index.js +3 -0
  94. package/dist/tools/index.js.map +1 -1
  95. package/dist/tools/list-dir/executor.js +1 -1
  96. package/dist/tools/list-dir/executor.js.map +1 -1
  97. package/dist/tools/read-file/executor.js +1 -1
  98. package/dist/tools/read-file/executor.js.map +1 -1
  99. package/dist/tools/run-shell/executor.js.map +1 -1
  100. package/dist/tools/run-tests/executor.js.map +1 -1
  101. package/dist/tools/run-typecheck/executor.js.map +1 -1
  102. package/dist/tools/search-code/executor.js +1 -1
  103. package/dist/tools/search-code/executor.js.map +1 -1
  104. package/dist/tools/shared.d.ts +9 -9
  105. package/dist/tools/web-fetch/errors.d.ts +2 -2
  106. package/dist/tools/web-fetch/errors.js +3 -3
  107. package/dist/tools/web-fetch/errors.js.map +1 -1
  108. package/dist/tools/web-fetch/executor.d.ts +2 -2
  109. package/dist/tools/web-fetch/executor.js +2 -2
  110. package/dist/tools/web-fetch/types.d.ts +7 -7
  111. package/dist/tools/web-fetch/types.d.ts.map +1 -1
  112. package/dist/tools/web-fetch/types.js +1 -1
  113. package/dist/tools/web-search/errors.d.ts +3 -3
  114. package/dist/tools/web-search/errors.js +3 -3
  115. package/dist/tools/web-search/executor.d.ts +2 -2
  116. package/dist/tools/web-search/executor.js +2 -2
  117. package/dist/tools/web-search/types.d.ts +4 -4
  118. package/dist/tools/web-search/types.d.ts.map +1 -1
  119. package/dist/tools/web-search/types.js +1 -1
  120. package/dist/transformers/document-to-text.d.ts +11 -11
  121. package/dist/transformers/document-to-text.d.ts.map +1 -1
  122. package/dist/transformers/document-to-text.js +11 -11
  123. package/dist/transformers/document-to-text.js.map +1 -1
  124. package/dist/transformers/image-to-text.d.ts +15 -15
  125. package/dist/transformers/image-to-text.d.ts.map +1 -1
  126. package/dist/transformers/image-to-text.js +15 -15
  127. package/dist/transformers/image-to-text.js.map +1 -1
  128. package/dist/vector/index.d.ts +4 -2
  129. package/dist/vector/index.d.ts.map +1 -1
  130. package/dist/vector/index.js +2 -2
  131. package/dist/vector/index.js.map +1 -1
  132. package/dist/vector/local-fs-index.d.ts +59 -0
  133. package/dist/vector/local-fs-index.d.ts.map +1 -0
  134. package/dist/vector/local-fs-index.js +216 -0
  135. package/dist/vector/local-fs-index.js.map +1 -0
  136. package/dist/vector/qdrant.d.ts +11 -73
  137. package/dist/vector/qdrant.d.ts.map +1 -1
  138. package/dist/vector/qdrant.js +29 -164
  139. package/dist/vector/qdrant.js.map +1 -1
  140. package/package.json +19 -4
  141. package/dist/vector/local-fs.d.ts +0 -81
  142. package/dist/vector/local-fs.d.ts.map +0 -1
  143. package/dist/vector/local-fs.js +0 -161
  144. package/dist/vector/local-fs.js.map +0 -1
package/README.md CHANGED
@@ -3,12 +3,12 @@
3
3
  **An agentic engine under active development — the *Harness* that aims to turn any LLM into a reliable, observable Agent.**
4
4
 
5
5
  [![npm version](https://img.shields.io/npm/v/@tachu/core?label=%40tachu%2Fcore)](https://www.npmjs.com/package/@tachu/core)
6
- [![status: alpha](https://img.shields.io/badge/status-alpha-orange)](#project-status)
6
+ [![status: rc](https://img.shields.io/badge/status-rc-blue)](#project-status)
7
7
  [![license: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue)](#license)
8
8
  [![bun](https://img.shields.io/badge/runtime-bun-orange)](https://bun.sh)
9
9
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue)](https://www.typescriptlang.org)
10
10
 
11
- > **⚠️ Project Status — Alpha.** The 9-phase pipeline, registry, prompt assembler, CLI, OpenAI / Anthropic / Qwen / Gemini adapters, MCP adapters, vector stores and observability emitters are wired up and individually tested. **Phase 3 (Intent Analysis) is a real LLM call**, so `tachu chat` / `tachu run` produce real conversational replies. **Phase 5 (Planning) and Phase 8 (Result Validation) are still stubbed** requests classified as *complex* will not yet receive an LLM-generated multi-step plan or semantic validation. See [Project Status](#project-status) and [Roadmap](#roadmap) for the per-feature breakdown. Do **not** use this in production yet. Install via the `@alpha` dist-tag.
11
+ > **Project Status — Release Candidate.** The 9-phase pipeline, registry, prompt assembler, CLI, OpenAI / Anthropic / Qwen / Gemini adapters, MCP adapters, vector stores and observability emitters are wired up and tested. Phase 3 (Intent Analysis) is a real LLM call, Phase 5 routes complex tool-capable requests to the built-in `tool-use` loop, and Phase 8 runs deterministic validation rules with optional semantic judge. Runtime provider fallback and semantic judge production hardening remain post-rc work. Install via the `@rc` dist-tag.
12
12
 
13
13
  ---
14
14
 
@@ -18,38 +18,41 @@ Tachu aims to be an **agentic engine you can build a real product on** — not a
18
18
 
19
19
  The engine is intentionally **domain-agnostic**: it knows nothing about your business logic, your users, or your domain vocabulary. Instead, it defines a small set of core abstractions (Rules, Skills, Tools, Agents) through which your business fills in all the intelligence. Tachu is designed to handle the hard parts — 9-phase execution pipeline, dual-plane semantic matching, context window management, token-precise prompt assembly, structured retry/fallback, cancellation propagation, and end-to-end observability.
20
20
 
21
- Tachu ships as a Bun-native TypeScript monorepo with three packages: the zero-dependency engine core (`@tachu/core`), an official extensions library (`@tachu/extensions`), and a fully-featured CLI program (`@tachu/cli`) that doubles as the reference implementation.
21
+ Tachu ships as a Bun-native TypeScript monorepo with three published packages the zero-dependency engine core (`@tachu/core`), an official extensions library (`@tachu/extensions`), and a fully-featured CLI program (`@tachu/cli`) that doubles as the reference implementation — plus `@tachu/host-defaults` for shared CLI/embedded host wiring, and an optional private sidecar package (`@tachu/web-fetch-server`) for remote browser-backed web tools.
22
22
 
23
23
  ---
24
24
 
25
25
  ## Project Status
26
26
 
27
- **Current release:** `1.0.0-alpha.6` on the `alpha` dist-tag.
27
+ **Current release:** `1.0.0-rc.0` on the `rc` dist-tag.
28
28
 
29
- This is the **first public alpha** most infrastructure is in place, but several LLM-backed stages are still stubbed. The table below is the single source of truth; every claim made elsewhere in this README must be cross-checked against it.
29
+ **Version terminology:** The product line is **Tachu v1**. Release candidates are stabilization builds for `1.0.0`, not a separate framework generation. HTTP paths like `/v1/extract` are API versions only. See [detailed-design § version glossary](docs/detailed-design.md#版本与发布术语必读).
30
+
31
+ This is the **first release candidate**. The table below is a readability index only; runtime behavior, defaults, and edge cases are authoritative in the cited source files and tests.
30
32
 
31
33
  | Capability | Status | Notes |
32
34
  |-----------|--------|-------|
33
35
  | 9-phase pipeline skeleton (types, orchestrator, state machine, hooks) | ✅ Implemented | `packages/core/src/engine` |
34
36
  | Descriptor Registry (Rules / Skills / Tools / Agents) | ✅ Implemented | Markdown + YAML frontmatter loader, semantic indexing, startup validation |
35
37
  | Prompt assembler (tiktoken, KV-cache-friendly ordering) | ✅ Implemented | `packages/core/src/prompt` |
36
- | Task scheduler, DAG validator, retry/fallback bookkeeping | ✅ Implemented | `packages/core/src/engine/scheduler.ts` |
38
+ | Task scheduler, DAG validator, turn/task retry bookkeeping | ✅ Implemented | `packages/core/src/engine/scheduler.ts`; **runtime provider fallback on LLM errors is not wired** (see [Providers guide](./docs/guides/providers-and-integrations.md)) |
37
39
  | Session / Memory / Runtime-state / Safety / Model-router / Provider / Observability / Hooks modules | ✅ Implemented | `packages/core/src/modules` |
38
- | OpenAI / Anthropic / Qwen / Mock / Gemini Provider adapters | ✅ Implemented | streaming, function calling, tool schemas |
40
+ | OpenAI / Anthropic / Qwen / Mock Provider adapters | ✅ Implemented | CLI auto-wires via `@tachu/host-defaults`; streaming, function calling, tool schemas |
41
+ | Gemini Provider adapter | ✅ Implemented (manual wiring) | `GeminiProviderAdapter` in `@tachu/extensions` with unit tests; **not** registered by default CLI / `buildProviderAdapter` — inject via `createEngine(..., { providers: [new GeminiProviderAdapter(...)] })` (see [Providers guide](./docs/guides/providers-and-integrations.md)) |
39
42
  | `apiKey` / `baseURL` / `organization` / `timeoutMs` configuration (env var / `tachu.config.ts` / CLI flags) | ✅ Implemented | Azure OpenAI / LiteLLM / OpenRouter / self-hosted gateways supported |
40
- | 7 built-in tools + Terminal / File / Web backends | ✅ Implemented | `packages/extensions/src/{tools,backends}` |
43
+ | 22 built-in tools + Terminal / File / Web backends | ✅ Implemented | `packages/extensions/src/tools/index.ts` |
41
44
  | MCP stdio + SSE adapters | ✅ Implemented | `packages/extensions/src/mcp` |
42
- | `LocalFsVectorStore` (file-backed) + `QdrantVectorStore` (REST) | ✅ Implemented | |
43
- | OTel / JSONL / Console emitters | ✅ Implemented | |
45
+ | `LocalFsVectorIndexAdapter` (file-backed) + `QdrantVectorIndexAdapter` (REST) | ✅ Implemented | |
46
+ | OTel / JSONL emitters | ✅ Implemented | |
44
47
  | `tachu init` / `tachu run` / `tachu chat` CLI surface, streaming renderer, session persistence, Ctrl+C semantics | ✅ Implemented | |
45
- | **CLI terminal Markdown rendering** | ✅ **Implemented** | `marked` + `marked-terminal` + `cli-highlight` stack. Applied to the final assistant reply in `tachu chat` / `tachu run --output text` when stdout is a TTY; automatically disables under `NO_COLOR` / non-TTY / `--no-color`. Explicit control via `--markdown` / `--no-markdown` on `tachu run`. Dedicated `renderMarkdownToAnsi` wrapper (`packages/cli/src/renderer/markdown.ts`) with 11 unit tests in `markdown.test.ts`. |
46
- | **Phase 3 — Intent Analysis (LLM call, pure classification)** | ✅ **Implemented** | Routes through `ModelRouter.resolve("intent")` → registered `ProviderAdapter`. This phase is **pure classification** `IntentResult = { complexity, intent, contextRelevance, relevantContext? }` only; the system prompt explicitly forbids `directAnswer` / `answer` / `reply` fields and the final user-facing answer is delegated to the `direct-answer` Sub-flow (Phase 7). Ships **5 few-shot examples** (greeting / creative short / write-code / write-lesson-plan / real multi-tool complex); complexity is anchored on *"does this need real tools / external resources?"*, not output length — single-turn long-form creative asks (write code, TDD lesson plans, essays, translations) stay `simple`. Bounded history window (up to 10 recent `MemorySystem` entries) and 30 s per-call timeout composed with the phase-level abort signal. Robust JSON extraction (plain / fenced / embedded); when the LLM ignores the JSON protocol but returns meaningful text, the text is accepted as the `intent` summary and the request still flows to `direct-answer`. Heuristic `intent = input.slice(0, 200)` fallback is used only on *no usable content* (provider unregistered, network / timeout error, cancellation, empty response). 10 dedicated tests in `intent.test.ts`. |
47
- | **Phase 5 — Task Planning (fallback contract)** | ✅ **Implemented** | Enforces `plans[0].tasks.length >= 1`. Rules: (1) `simple` intent → single `direct-answer` sub-flow task; (2) `complex` + matched tools → first-N tool tasks (unchanged); (3) `complex` + no matching tool → single `direct-answer` sub-flow task with `warn: true` (the sub-flow honestly discloses the lack of tool matching); (4) defensive post-guard catches upstream regressions that leave `tasks` empty. Real LLM-backed planner (producing ranked multi-step plans for tool-chain complex requests) is still scheduled for a later alpha. |
48
- | **`direct-answer` built-in Sub-flow (Phase 7)** | ✅ **Implemented** | `packages/core/src/engine/subflows/direct-answer.ts`. Resolves `capabilityMapping.intent` (fallback to `fast-cheap`), composes system + ≤10 memory-history entries + user prompt, calls `ProviderAdapter.chat()` with a 60 s per-call timeout merged with the phase abort signal. System prompt mandates **natural-language Markdown**, forbids JSON wrappers / `"已识别请求:…"` templates / 4-space indented code blocks, and supports a `warn: true` flag for honest tool-missing disclaimers. Emits `llm_call_start` / `llm_call_end` observability events under `phase: "direct-answer"`. Non-overridable: `DescriptorRegistry` registers `direct-answer` as a reserved name and rejects business registration / unregistration with `RegistryError.reservedName`. See [ADR 0001](docs/adr/decisions/0001-direct-answer-as-builtin-subflow.md). |
49
- | **Phase 8 — Result Validation (LLM call)** | 🟡 **Stub** | Only checks whether any step `failed`; no semantic validation. Scheduled for a follow-up alpha. |
50
- | **Phase 9 — Output Assembly** | ✅ **Implemented** | Content selector: `taskResults['task-direct-answer']` → `{intent, taskResults}` structured JSON (tool-chain success, reshape pending real Phase 5/8) → honest-fallback plain-language message with recognized intent + internal diagnosis + *"rephrase as simple"* suggestion (validation failed). Internal state JSON is never leaked to end users. 6 dedicated tests in `output.test.ts`. |
51
- | Real-world smoke tests against OpenAI / Anthropic / Azure | 🔴 Not yet | Adapters are unit-tested with mocks; we have not yet published a signed-off end-to-end record. |
52
- | Production hardening (SLO, error budgets, failure injection, signed provenance) | 🔴 Not yet | v1 target. |
48
+ | **CLI terminal Markdown rendering** | ✅ **Implemented** | `marked` + `marked-terminal` + `cli-highlight` stack. Applied to the final assistant reply in `tachu chat` / `tachu run --output text` when stdout is a TTY; automatically disables under `NO_COLOR` / non-TTY / `--no-color`. Explicit control via `--markdown` / `--no-markdown` on `tachu run`. Dedicated `renderMarkdownToAnsi` wrapper (`packages/cli/src/renderer/markdown.ts`) with 12 unit tests in `markdown.test.ts`. |
49
+ | **Phase 3 — Intent Analysis (LLM call, pure classification)** | ✅ **Implemented** | Pure classification only (`IntentResult`); final user reply is Phase 7 `direct-answer`. **Implementation:** `packages/core/src/engine/phases/intent.ts` (`INTENT_SYSTEM_PROMPT_BASE`, fast paths, JSON parse, heuristic fallback); tests: `intent.test.ts`. Hosts may **`config.intent.systemPromptBase`** to replace the base wholesale; optional extra few-shots: `config.intent.fewShotExamples` (Agent Context / explicit selections still appended by core). |
50
+ | **Phase 5 — Task Planning (planning router)** | ✅ **Implemented** | Enforces `plans[0].tasks.length >= 1`. Rules: (1) `simple` intent → single `direct-answer` sub-flow task; (2) `complex` + visible tools → single `tool-use` sub-flow task; (3) `complex` + no visible tool → single `direct-answer` sub-flow task with `warn: true`; (4) defensive post-guard catches upstream regressions that leave `tasks` empty. Multi-step behavior lives inside `tool-use`; optional plan preview / human review may still evolve, but there is no separate default LLM pre-planner on the main path. |
51
+ | **`direct-answer` built-in Sub-flow (Phase 7)** | ✅ **Implemented** | `packages/core/src/engine/subflows/direct-answer.ts`. Resolves `capabilityMapping.intent` (fallback to `fast-cheap`), composes system + ≤10 memory-history entries + user prompt, calls `ProviderAdapter.chat()` with a 60 s per-call timeout merged with the phase abort signal. System prompt mandates **natural-language Markdown**, forbids JSON wrappers / `"已识别请求:…"` templates / 4-space indented code blocks, and supports a `warn: true` flag for honest tool-missing disclaimers. Emits `llm_call_start` / `llm_call_end` observability events under `phase: "direct-answer"`. Non-overridable: `DescriptorRegistry` registers `direct-answer` as a reserved name and rejects business registration / unregistration with `RegistryError.reservedName`. |
52
+ | **Phase 8 — Result Validation Outcome** | 🟡 **Partially wired** | `ValidationOutcome` union + `ValidationRuleRegistry` with **5 deterministic rules** via `buildDefaultValidationRuleRegistry()` (`packages/core/src/engine/phases/validation/index.ts`). Optional `ProviderSemanticJudgeAdapter` / `BudgetedSemanticJudgeAdapter`. Engine consumes `retry` (turn loop via `decideTurnRetry`), `degrade` / `handoff` (exit to Output). Gaps: no standalone `ExecutionPolicy` type; runtime provider fallback is not implemented and semantic judge is not production-complete. |
53
+ | **Phase 9 — Output Assembly** | ✅ **Implemented** | Content selector: `taskResults['task-direct-answer']` → `{intent, taskResults}` structured JSON (tool-chain success path; semantic polish still depends on real Phase 8) → honest-fallback plain-language message with recognized intent + internal diagnosis + *"rephrase as simple"* suggestion (validation failed). Internal state JSON is never leaked to end users. Covered by `output.test.ts`. |
54
+ | Real-world smoke tests against OpenAI / Anthropic / Azure | 🟡 **Manually verified; opt-in automated** | Mock unit tests cover adapters in CI. Maintainers have **hand-run** real LLM paths (custom gateways included). An **opt-in** scripted e2e exists — set `TACHU_REAL_E2E=1` plus `TACHU_E2E_API_KEY` / `TACHU_E2E_API_BASE` / `TACHU_E2E_PROVIDER` (see [Contributing](./CONTRIBUTING.md)) — but default CI does not publish signed recordings. |
55
+ | Production hardening (SLO, error budgets, failure injection, signed provenance) | 🔴 Not yet | Target for `1.0.0` (Tachu v1). |
53
56
 
54
57
  Legend: ✅ implemented and tested · 🟡 stub / placeholder present, real implementation in progress · 🔴 not yet started.
55
58
 
@@ -57,10 +60,9 @@ Legend: ✅ implemented and tested · 🟡 stub / placeholder present, real impl
57
60
 
58
61
  ## Key Features
59
62
 
60
- > Features marked **(stub)** are wired end-to-end but do not yet call an LLMsee the [Project Status](#project-status) table.
61
-
62
- - **9-Phase Execution Pipeline** — session management safety → intent analysis (pure classification) pre-check planning (fallback contract) DAG validation execution result validation *(stub)* output normalization; each phase is a typed, hookable stage, and every request — simple or complex — flows through all nine phases with uniform Rules / Hooks / Observability / budget accounting
63
- - **`direct-answer` built-in Sub-flow** — answers to simple requests (and to complex requests with no matching tool) are produced by a first-class engine-internal Sub-flow running in Phase 7, not baked into the intent prompt. See [ADR 0001](docs/adr/decisions/0001-direct-answer-as-builtin-subflow.md).
63
+ - **9-Phase Execution Pipeline** — session management → safety → intent analysis (pure classification) pre-check planning router DAG validation execution → result validation outcome → output normalization; each phase is a typed, hookable stage, and every request simple or complex flows through all nine phases with uniform Rules / Hooks / Observability / budget accounting
64
+ - **Task Planning + Tool-use Loop** — Phase 5 does not precompute a ranked multi-step plan. It routes `simple` requests to `direct-answer`, routes `complex + visible tools` into the built-in `tool-use` Agentic Loop, and lets that loop perform iterative LLM tool selection → gated tool execution → tool-result feedback → final answer.
65
+ - **`direct-answer` built-in Sub-flow** — answers to simple requests (and to complex requests with no matching tool) are produced by a first-class engine-internal Sub-flow running in Phase 7, not baked into the intent prompt.
64
66
  - **Dual-Plane Matching** — semantic discovery (vector similarity) + deterministic execution gate (scopes, whitelist, approval) for every Rule, Skill, Tool, and Agent
65
67
  - **Four Core Abstractions** — declare Rules, Skills, Tools, and Agents as Markdown + YAML frontmatter descriptors; the engine resolves, activates, and orchestrates them automatically
66
68
  - **OpenAI & Anthropic Adapters** — streaming, function calling, configurable `baseURL` / `organization` / `timeoutMs`; works with Azure OpenAI, LiteLLM, OpenRouter, or any self-hosted gateway
@@ -92,119 +94,17 @@ Tachu is built around three convictions:
92
94
 
93
95
  ## Core Abstractions
94
96
 
95
- Tachu's four core abstractions are **co-equal and orthogonal** each independently registered, independently activated, and composable across all engine phases.
96
-
97
- | Abstraction | Nature | Activation Gate | Effect |
98
- |-------------|--------|-----------------|--------|
99
- | **Rules** | Constraints & guidance | Semantic discovery → direct activation | Injected into LLM System Prompt at each scoped phase |
100
- | **Skills** | Knowledge & instructions | Semantic discovery → direct activation | Injected into LLM context when activated |
101
- | **Tools** | Atomic executable operations | Semantic discovery → **mandatory gate** (scopes → whitelist → approval) | Executed with full side-effect tracking |
102
- | **Agents** | Natural-language-driven execution units | Semantic discovery → activatable | Recursively use engine capabilities; all Tool calls pass through the Tool gate |
103
-
104
- All four share a **common descriptor schema** (Markdown + YAML frontmatter):
105
-
106
- ```yaml
107
- name: unique-name # required, unique within type
108
- description: ... # natural language (used for semantic discovery)
109
- tags: [tag1, tag2] # for filtering and classification
110
- trigger: { type: always } # activation condition
111
- requires:
112
- - { kind: tool, name: read-file } # explicit dependency references
113
- ```
114
-
115
- ### Dual-Plane Matching Model
116
-
117
- Every core abstraction is activated through a two-phase process:
97
+ Rules, Skills, Tools, and Agents are Markdown + YAML descriptors. Semantic discovery proposes candidates; deterministic gates authorize execution (especially Tools).
118
98
 
119
- ```mermaid
120
- graph LR
121
- Input[Context Input] --> Discovery[Semantic Discovery Plane]
122
- Discovery --> Index[(Vector Index)]
123
- Index --> Candidates[Candidate Set]
124
- Candidates --> Gate[Deterministic Execution Gate]
125
- Gate -- scopes / whitelist / approval --> Execution[Execution Plane]
126
- ```
127
-
128
- - **Semantic discovery plane**: `description` is vectorized on registration; at runtime, the current context is matched against the index to produce a candidate set
129
- - **Deterministic execution gate**: final activation requires passing a deterministic gate (explicit references, whitelist checks, permission scopes, approval checks)
130
-
131
- Rules and Skills pass through the gate freely (no side effects). Tools always pass through the full gate. Agents activate freely but all Tool calls they trigger still pass through the Tool gate.
99
+ Details: [Overview · abstractions](./docs/overview-design.md#三四大核心抽象) · [Dual-plane matching](./docs/overview-design.md#共享特性).
132
100
 
133
101
  ---
134
102
 
135
- ## Architecture Overview
136
-
137
- ### Three-Layer Structure
138
-
139
- Tachu is published as three layers:
140
-
141
- ```mermaid
142
- graph TD
143
- subgraph "Business Layer"
144
- A[Business Rules / Domain Tools / Custom Adapters / Domain Skills / Agents / Plan Templates]
145
- end
146
- subgraph "Extensions Library — @tachu/extensions"
147
- B[OpenAI & Anthropic Adapters / 7 Common Tools / Terminal+File+Web Backends / Qdrant+LocalFs VectorStore / MCP Adapter / OTel+JSONL Emitters / 4 Common Rules]
148
- end
149
- subgraph "Engine Core — @tachu/core"
150
- C[Protocol Definitions / 9-Phase Pipeline / Lifecycle Hooks / Session / Memory / Safety / Model Router / Runtime State]
151
- end
152
- A --> B
153
- B --> C
154
- ```
155
-
156
- | Layer | Package | Responsibility |
157
- |-------|---------|----------------|
158
- | Engine Core | `@tachu/core` | Protocol interfaces, 9-phase pipeline skeleton, 8 core modules, Registry, Prompt assembler, VectorStore interface + built-in light implementation |
159
- | Extensions Library | `@tachu/extensions` | Official concrete implementations: Provider adapters, Tools, Backends, VectorStore adapters, OTel/JSONL emitters, common Rules |
160
- | Business / CLI | `@tachu/cli` or your code | Assembles core + extensions into a working Agent; provides domain Rules/Skills/Tools/Agents |
103
+ ## Architecture (summary)
161
104
 
162
- ### 9-Phase Execution Pipeline
105
+ Requests flow through a **9-phase pipeline**. Complex tool work enters the built-in `tool-use` loop in Phase 7.
163
106
 
164
- Every request processed by the engine flows through exactly 9 phases:
165
-
166
- ```mermaid
167
- graph TD
168
- Start([Business Request]) --> S1[Phase 1: Session Management]
169
- S1 --> S2[Phase 2: Minimum Safety Check]
170
- S2 --> S3[Phase 3: Intent Analysis — LLM]
171
- S3 -- simple --> S9[Phase 9: Output Normalization]
172
- S3 -- complex --> S4[Phase 4: Pre-Check]
173
- S4 --> S5[Phase 5: Task Planning]
174
- S5 -- Plan mode --> PlanLoop{Plan Review Loop}
175
- S5 -- Template match --> S6
176
- S5 -- Dynamic split --> S6[Phase 6: DAG Validation]
177
- PlanLoop -- confirmed --> S6
178
- S6 --> S7[Phase 7: Sub-task Execution]
179
- S7 --> S8[Phase 8: Result Validation — LLM]
180
- S8 -- pass --> S9
181
- S8 -- fail --> Retry{Retry / Re-plan}
182
- Retry -- within limits --> S5
183
- Retry -- exhausted --> S9
184
- S9 --> End([Output])
185
-
186
- style S2 fill:#ffeaa7,stroke:#fdcb6e
187
- style S7 fill:#dfe6e9,stroke:#b2bec3
188
- ```
189
-
190
- | # | Phase | LLM Call | Key Output |
191
- |---|-------|----------|------------|
192
- | 1 | Session Management | No | Session context loaded |
193
- | 2 | Minimum Safety Check | No | Pass / reject |
194
- | 3 | Intent Analysis | **Yes** | `IntentResult` (simple/complex, context relevance) |
195
- | 4 | Pre-Check | No | Resource availability, deep safety validation |
196
- | 5 | Task Planning | **Yes** | `PlanningResult` (ranked plans + DAG) |
197
- | 6 | DAG Validation | No | Cycle detection, node integrity (deterministic) |
198
- | 7 | Sub-task Execution | Per sub-task | `TaskResult[]` (parallel where possible) |
199
- | 8 | Result Validation | **Yes** | `ValidationResult` (pass / execution_issue / planning_issue) |
200
- | 9 | Output Normalization | No | `EngineOutput` (typed, with steps, metadata, artifacts) |
201
-
202
- **Key pipeline properties:**
203
-
204
- - **Full-path safety gating** — Phase 2 runs on every request, including fast-path simple responses
205
- - **Context guard** — Phase 3 decides whether session history is relevant; irrelevant history is not forwarded
206
- - **Three-strikes limit** — Task-level retries are bounded at 3 (configurable); system-level retries at 2
207
- - **Last-message-wins** — A new request on the same session cancels the current execution via `AbortController`
107
+ Details: [Pipeline phases](./docs/architecture/pipeline-phases.md) · [Overview design](./docs/overview-design.md).
208
108
 
209
109
  ---
210
110
 
@@ -212,23 +112,23 @@ graph TD
212
112
 
213
113
  Tachu requires [Bun](https://bun.sh) as the runtime.
214
114
 
215
- > **Install via the `@alpha` dist-tag** (or an exact version) until Tachu reaches stable.
115
+ > **Install via the `@rc` dist-tag** (or an exact version) until Tachu reaches stable.
216
116
 
217
117
  ```bash
218
- # Install the engine core (alpha)
219
- bun add @tachu/core@alpha
118
+ # Install the engine core
119
+ bun add @tachu/core@rc
220
120
 
221
121
  # Install the extensions library (providers, tools, backends, vector stores)
222
- bun add @tachu/extensions@alpha
122
+ bun add @tachu/extensions@rc
223
123
 
224
124
  # Install and use the CLI globally
225
- bun add -g @tachu/cli@alpha
125
+ bun add -g @tachu/cli@rc
226
126
  ```
227
127
 
228
128
  After global installation, verify with:
229
129
 
230
130
  ```bash
231
- tachu --version # expect 1.0.0-alpha.6 or newer
131
+ tachu --version # expect 1.0.0-rc.0 or newer
232
132
  ```
233
133
 
234
134
  ---
@@ -254,854 +154,8 @@ tachu chat
254
154
  tachu chat --resume
255
155
  ```
256
156
 
257
- ### Programmatic (TypeScript)
258
-
259
- ```typescript
260
- import { Engine } from '@tachu/core';
261
- import { OpenAIProviderAdapter } from '@tachu/extensions/providers';
262
- import { FileBackend, TerminalBackend } from '@tachu/extensions/backends';
263
- import type { EngineConfig, InputEnvelope, ExecutionContext } from '@tachu/core';
264
-
265
- const config: EngineConfig = {
266
- registry: {
267
- descriptorPaths: ['.tachu'],
268
- enableVectorIndexing: false,
269
- },
270
- runtime: {
271
- planMode: false,
272
- maxConcurrency: 4,
273
- defaultTaskTimeoutMs: 120_000,
274
- failFast: false,
275
- },
276
- memory: {
277
- contextTokenLimit: 8000,
278
- compressionThreshold: 0.8,
279
- headKeep: 4,
280
- tailKeep: 12,
281
- archivePath: '.tachu/archive.jsonl',
282
- vectorIndexLimit: 10_000,
283
- },
284
- budget: {
285
- maxTokens: 50_000,
286
- maxToolCalls: 50,
287
- maxWallTimeMs: 300_000,
288
- },
289
- safety: {
290
- maxInputSizeBytes: 1_000_000,
291
- maxRecursionDepth: 10,
292
- workspaceRoot: process.cwd(),
293
- promptInjectionPatterns: [],
294
- },
295
- models: {
296
- capabilityMapping: {
297
- 'high-reasoning': { provider: 'openai', model: 'gpt-4o' },
298
- 'fast-cheap': { provider: 'openai', model: 'gpt-4o-mini' },
299
- },
300
- providerFallbackOrder: ['openai'],
301
- },
302
- observability: { enabled: true, maskSensitiveData: true },
303
- hooks: { writeHookTimeout: 5000, failureBehavior: 'continue' },
304
- };
305
-
306
- const engine = new Engine(config);
307
-
308
- // Register a provider
309
- engine.useProvider(new OpenAIProviderAdapter({ apiKey: process.env.OPENAI_API_KEY! }));
310
-
311
- // Stream results
312
- const input: InputEnvelope = {
313
- content: 'Write a TypeScript function that debounces async operations',
314
- metadata: { modality: 'text' },
315
- };
316
-
317
- const context: ExecutionContext = {
318
- requestId: crypto.randomUUID(),
319
- sessionId: 'session-001',
320
- traceId: crypto.randomUUID(),
321
- principal: { userId: 'user-001' },
322
- budget: { maxTokens: 20_000, maxDurationMs: 60_000 },
323
- scopes: ['read', 'write'],
324
- };
325
-
326
- for await (const chunk of engine.runStream(input, context)) {
327
- if (chunk.type === 'delta') process.stdout.write(chunk.content);
328
- if (chunk.type === 'done') console.log('\n\nCompleted:', chunk.output.status);
329
- }
330
- ```
331
-
332
- ---
333
-
334
- ## Package Layout
335
-
336
- ### Package Summary
337
-
338
- | Package | Description | Key Exports |
339
- |---------|-------------|-------------|
340
- | `@tachu/core` | Zero-dependency engine core | `Engine`, `Registry`, `PromptAssembler`, all interfaces and types |
341
- | `@tachu/extensions` | Official implementations | `OpenAIProviderAdapter`, `AnthropicProviderAdapter`, `McpToolAdapter`, `QdrantVectorStore`, `OtelEmitter`, backends, tools, rules |
342
- | `@tachu/cli` | Production CLI program | `tachu chat`, `tachu run`, `tachu init` |
343
-
344
- ### Dependency Relationship
345
-
346
- ```mermaid
347
- graph LR
348
- cli["@tachu/cli"]
349
- extensions["@tachu/extensions"]
350
- core["@tachu/core"]
351
-
352
- cli --> extensions
353
- cli --> core
354
- extensions --> core
355
-
356
- style core fill:#74b9ff,stroke:#0984e3
357
- style extensions fill:#a29bfe,stroke:#6c5ce7
358
- style cli fill:#fd79a8,stroke:#e84393
359
- ```
360
-
361
- ### Core Package Internal Structure
362
-
363
- ```
364
- @tachu/core / src/
365
- ├── types/ # All TypeScript interfaces: descriptors, context, I/O, config
366
- ├── engine/ # Engine entry class, phase handlers, orchestrator, scheduler
367
- ├── registry/ # Registry: register/lookup/startup validation for all 4 abstractions
368
- ├── modules/ # 8 core modules (session, memory, runtime-state, model-router,
369
- │ # provider, safety, observability, hooks)
370
- ├── prompt/ # PromptAssembler: token budgeting, KV-cache-friendly ordering
371
- └── vector/ # VectorStore interface + built-in lightweight implementation
372
- ```
373
-
374
- ---
375
-
376
- ## Providers & Integrations
377
-
378
- ### LLM Providers
379
-
380
- | Provider | Package | Streaming | Function Calling | Notes |
381
- |----------|---------|-----------|-----------------|-------|
382
- | OpenAI | `@tachu/extensions/providers` | ✅ | ✅ | GPT-4o, GPT-4o-mini, and all listable models |
383
- | Anthropic | `@tachu/extensions/providers` | ✅ | ✅ | Claude 3.5 Sonnet and all listable models |
384
- | Mock | `@tachu/extensions/providers` | ✅ | ✅ | For testing; configurable responses |
385
-
386
- Provider fallback is configured via `models.providerFallbackOrder`. When a system-level error occurs (timeout, API error), the engine automatically switches to the next provider in the list without re-planning.
387
-
388
- ### Provider Connection Configuration
389
-
390
- Every built-in provider accepts `apiKey`, `baseURL`, `organization` (OpenAI-only), `project` (OpenAI-only), and `timeoutMs`. Supply them at any of three levels (later wins):
391
-
392
- 1. **Environment variables** (recommended for secrets):
393
-
394
- | Variable | Provider | Purpose |
395
- |----------|----------|---------|
396
- | `OPENAI_API_KEY` | OpenAI | Credential fallback when `apiKey` is not set |
397
- | `OPENAI_BASE_URL` | OpenAI | SDK-level baseURL override (honored by `openai` SDK) |
398
- | `ANTHROPIC_API_KEY` | Anthropic | Credential fallback when `apiKey` is not set |
399
- | `ANTHROPIC_BASE_URL` | Anthropic | SDK-level baseURL override (honored by `@anthropic-ai/sdk`) |
400
-
401
- 2. **`tachu.config.ts` — `providers` block** (recommended for non-secret connection metadata):
402
-
403
- ```typescript
404
- const config: EngineConfig = {
405
- // ...other fields
406
- providers: {
407
- openai: {
408
- // apiKey stays in env; keep this file commit-safe
409
- baseURL: 'https://your-gateway.example.com/v1',
410
- organization: 'org-xxxx',
411
- timeoutMs: 60_000,
412
- },
413
- anthropic: {
414
- baseURL: 'https://your-gateway.example.com/anthropic',
415
- timeoutMs: 60_000,
416
- },
417
- },
418
- };
419
- ```
420
-
421
- 3. **CLI flags** (highest priority; great for one-off overrides):
422
-
423
- ```bash
424
- tachu run "..." --provider openai \
425
- --api-base https://gateway.example.com/v1 \
426
- --api-key sk-dev \
427
- --organization org-xxxx
428
-
429
- tachu chat --provider anthropic \
430
- --api-base https://gateway.example.com/anthropic
431
- ```
432
-
433
- Flags apply to whichever provider the request ends up using — either the one supplied via `--provider`, or the one resolved from your `capabilityMapping`. CLI flags never touch the `mock` provider.
434
-
435
- Typical use cases: Azure OpenAI, self-hosted LiteLLM / OpenRouter / Kong gateways, organization-wide egress proxies, and air-gapped environments.
436
-
437
- ### MCP (Model Context Protocol)
438
-
439
- Tachu ships two official adapters for MCP (`McpStdioAdapter` / `McpSseAdapter`, built on `@modelcontextprotocol/sdk`) and the CLI wires them into `DescriptorRegistry` and the `TaskExecutor`—declare an `mcpServers` block in `tachu.config.ts`, and the CLI auto-discovers tools, routes calls, and disconnects on exit.
440
-
441
- **Declarative config (recommended; field names align with the OpenAI Agents SDK and the common MCP client convention)**
442
-
443
- ```typescript
444
- // tachu.config.ts
445
- const config: EngineConfig = {
446
- // ... other fields
447
- mcpServers: {
448
- // Local stdio process (standard MCP stdio transport)
449
- fs: {
450
- command: 'npx',
451
- args: ['-y', '@modelcontextprotocol/server-filesystem', process.cwd()],
452
- env: { ...process.env },
453
- },
454
- // Remote SSE service (standard MCP SSE transport)
455
- remoteKb: {
456
- url: 'https://mcp.example.com/sse/',
457
- headers: { Authorization: `Bearer ${process.env.MCP_TOKEN ?? ''}` },
458
- timeoutMs: 50_000,
459
- connectTimeoutMs: 10_000,
460
- // Optional tachu extensions
461
- // description: 'Project documentation search (example)',
462
- // keywords: ['docs', '文档'],
463
- // expandOnKeywordMatch: true,
464
- // allowTools: ['getStatus'],
465
- // denyTools: ['dangerousOp'],
466
- // requiresApproval: true,
467
- // disabled: false,
468
- // tags: ['example'],
469
- },
470
- },
471
- };
472
- ```
473
-
474
- Semantics:
475
-
476
- - **Namespacing** — remote tools are registered as `<serverId>__<originalName>` (e.g. `remoteKb__getStatus`) so multiple servers coexist without name clashes
477
- - **Fault isolation** — any server failing to connect / list tools only emits a single stderr warning; other servers and the main flow keep running
478
- - **Timeouts & cancellation** — `adapter.connect()` is wrapped by `connectTimeoutMs`; `ToolExecutionContext.abortSignal` is forwarded to `adapter.executeTool({ signal })`, so Ctrl+C / budget breaches propagate through the protocol layer
479
- - **Approval gating** — MCP tool `requiresApproval` is OR-ed with the per-server `requiresApproval` and the tool-loop global flag; the CLI's default `y/N` prompt handles it uniformly
480
- - **Lifecycle** — `tachu run` / `tachu chat` always call `engine.dispose()` then `mounted.disconnectAll()` in a `finally` branch; disconnect failures only emit warnings
481
- - **LLM-facing `description`** — when provided, the per-server `description` is prefixed to every tool's description as `[<serverId>: <description>] <original>`, so the planner can route more accurately even without reading the full JSON schema
482
- - **Keyword-gated tools (prompt-size optimization)** — a server with `expandOnKeywordMatch: true` and non-empty `keywords` is *not* registered at startup. `tachu run <prompt>` and each `you>` turn in `tachu chat` evaluate the user input against each server's keywords (case-insensitive substring match; structured input is `JSON.stringify`'d first) and only register tools from servers whose keywords hit. Use this to keep dozens of niche tools out of the default prompt while still making them reachable on demand — the schema validator refuses `expandOnKeywordMatch: true` without keywords
483
-
484
- **SDK usage (when you bypass the CLI and assemble the engine yourself)**
485
-
486
- ```typescript
487
- import { McpSseAdapter, McpStdioAdapter } from '@tachu/extensions';
488
-
489
- const sse = new McpSseAdapter({
490
- url: 'https://mcp.example.com/sse/',
491
- serverId: 'remoteKb',
492
- headers: { Authorization: 'Bearer ...' },
493
- defaultTimeoutMs: 50_000,
494
- });
495
- await sse.connect('https://mcp.example.com/sse/');
496
- const tools = await sse.listTools();
497
- for (const tool of tools) await engine.registry.register(tool);
498
-
499
- const stdio = new McpStdioAdapter({
500
- command: 'npx',
501
- args: ['-y', '@modelcontextprotocol/server-filesystem', process.cwd()],
502
- serverId: 'fs',
503
- });
504
- await stdio.connect('');
505
- ```
506
-
507
- If you want the same "one block of config, auto-wired" experience inside a custom host, reuse `@tachu/cli`'s `mountMcpServers(config.mcpServers, { cwd })` / `setupMcpServersFromConfig(config, registry, { cwd })`—they return `{ descriptors, executors, disconnectAll }` that you can feed into `createEngine({ extraToolExecutors })`.
508
-
509
- ### Vector Stores
510
-
511
- | Adapter | Package | Use Case |
512
- |---------|---------|----------|
513
- | `InMemoryVectorStore` | `@tachu/core` | Development / testing; built-in, zero dependencies |
514
- | `LocalFsVectorStore` | `@tachu/extensions/vector` | Single-process production; file-backed persistence |
515
- | `QdrantVectorStore` | `@tachu/extensions/vector` | Multi-process production; full Qdrant REST API support |
516
-
517
- ```typescript
518
- import { QdrantVectorStore } from '@tachu/extensions/vector';
519
-
520
- const vectorStore = new QdrantVectorStore({
521
- url: 'http://localhost:6333',
522
- collectionName: 'tachu-descriptors',
523
- });
524
-
525
- engine.useVectorStore(vectorStore);
526
- ```
527
-
528
- ### Observability Emitters
529
-
530
- | Emitter | Package | Output |
531
- |---------|---------|--------|
532
- | `OtelEmitter` | `@tachu/extensions/emitters` | OpenTelemetry spans via `@opentelemetry/api` |
533
- | `JsonlEmitter` | `@tachu/extensions/emitters` | Append-only JSONL file |
534
- | `ConsoleEmitter` | `@tachu/extensions/emitters` | Structured console output (development) |
535
-
536
- ### Execution Backends
537
-
538
- | Backend | Package | Description |
539
- |---------|---------|-------------|
540
- | `TerminalBackend` | `@tachu/extensions/backends` | Shell command execution in a sandboxed terminal |
541
- | `FileBackend` | `@tachu/extensions/backends` | File system read/write operations |
542
- | `WebBackend` | `@tachu/extensions/backends` | HTTP requests to external APIs / web resources |
543
-
544
- ---
545
-
546
- ## Design Principles
547
-
548
- Tachu is built on seven core engineering principles drawn from its architecture:
549
-
550
- 1. **Dual-Plane Matching** — All four core abstractions are discovered semantically (vector similarity) but activated deterministically (scopes, whitelist, approval). Semantic discovery is advisory; execution gates are authoritative.
551
-
552
- 2. **Full-Path Safety Gating** — The minimum safety check (Phase 2) runs on *every* request path, including the fast path for simple questions. Safety is never traded for performance.
553
-
554
- 3. **Three-Strikes Retry Limit** — Both the task-level retry loop and the system-level retry loop are strictly bounded. Unlimited retry is not allowed. When limits are exhausted, the engine outputs step-level completion status rather than a generic failure.
555
-
556
- 4. **KV-Cache-Friendly Prompt Assembly** — The System Prompt is assembled in a stable order (hard rules → soft preferences → skills → tool definitions) so that the prefix remains unchanged across turns, maximizing KV cache reuse and reducing LLM cost.
557
-
558
- 5. **Last-Message-Wins Cancellation** — When a new message arrives in the same session, the current execution is cancelled via `AbortController` and the new input is processed in the existing context. This guarantees coherent context while avoiding stale work.
559
-
560
- 6. **Engine Agnostic of Business Permissions** — The engine only evaluates coarse-grained `scopes` from the execution context at the Tool gate. Fine-grained business authorization is the responsibility of Tool implementations or dedicated authorization Tools.
561
-
562
- 7. **Fail-Closed Safety Baseline** — Loop protection, budget circuit-breaking, and basic input validation are hardwired into the engine core and *cannot* be disabled by configuration. Even with a completely empty business configuration, the engine does not run unconstrained.
563
-
564
- ---
565
-
566
- ## Configuration
567
-
568
- The engine is configured via a `tachu.config.ts` file at the project root (generated by `tachu init`):
569
-
570
- ```typescript
571
- import type { EngineConfig } from '@tachu/core';
572
-
573
- const config: EngineConfig = {
574
- // Descriptor registry: where Rules/Skills/Tools/Agents are loaded from
575
- registry: {
576
- descriptorPaths: ['.tachu'],
577
- enableVectorIndexing: false, // set true to auto-index descriptors at startup
578
- },
579
-
580
- // Runtime behaviour
581
- runtime: {
582
- planMode: false, // when true, only plan but never execute tasks
583
- maxConcurrency: 4, // max parallel sub-tasks
584
- defaultTaskTimeoutMs: 120_000, // single-task default timeout (ms)
585
- failFast: false, // any sub-task failure aborts the run
586
- },
587
-
588
- // Context window & memory
589
- memory: {
590
- contextTokenLimit: 8000, // context window token limit
591
- compressionThreshold: 0.8, // trigger compression at 80% capacity
592
- headKeep: 4, // earliest messages preserved during compression
593
- tailKeep: 12, // latest messages preserved during compression
594
- archivePath: '.tachu/archive.jsonl',
595
- vectorIndexLimit: 10_000, // max entries in the built-in vector index
596
- },
597
-
598
- // Budget constraints (per execution)
599
- budget: {
600
- maxTokens: 50_000, // total token budget per execution
601
- maxToolCalls: 50, // max tool calls per execution
602
- maxWallTimeMs: 300_000, // 5-minute wall-time limit
603
- },
604
-
605
- // Safety baseline (hardwired minimum; add business policies via SafetyModule.registerPolicy)
606
- safety: {
607
- maxInputSizeBytes: 1_000_000,
608
- maxRecursionDepth: 10,
609
- workspaceRoot: process.cwd(), // file backend root (path-traversal guard)
610
- promptInjectionPatterns: [], // optional regex strings; matches emit warnings only
611
- },
612
-
613
- // Model routing
614
- models: {
615
- capabilityMapping: {
616
- 'high-reasoning': { provider: 'openai', model: 'gpt-4o' },
617
- 'fast-cheap': { provider: 'openai', model: 'gpt-4o-mini' },
618
- 'vision': { provider: 'openai', model: 'gpt-4o' },
619
- },
620
- providerFallbackOrder: ['openai', 'anthropic'],
621
- },
622
-
623
- // Observability (events emitted to ObservabilityEmitter)
624
- observability: {
625
- enabled: true,
626
- maskSensitiveData: true, // auto-mask PII in event payloads
627
- },
628
-
629
- // Hooks
630
- hooks: {
631
- writeHookTimeout: 5_000, // ms; mutating hooks exceeding this are skipped
632
- failureBehavior: 'continue', // 'abort' to fail the run on any hook error
633
- },
634
- };
635
-
636
- export default config;
637
- ```
638
-
639
- All fields have sensible defaults. `tachu init` generates this file pre-filled for your chosen provider.
640
-
641
- > **Schema reference**: see detailed-design §14.1 for the full `EngineConfig` interface (and the historical-vs-v1 changelog). Earlier drafts that used `retry / planning / agent / context / execution / storage` keys are deprecated and will fail `validateConfig` with `VALIDATION_INVALID_CONFIG`.
642
-
643
- ---
644
-
645
- ## CLI Reference
646
-
647
- ### `tachu init`
648
-
649
- Initialize a new Tachu project workspace.
650
-
651
- ```
652
- tachu init [options]
653
-
654
- Options:
655
- --template <name> Scaffold template: minimal | full (default: minimal)
656
- --force Overwrite existing files without prompting
657
- --path <dir> Target directory (default: CWD)
658
- --provider <name> Default provider: openai | anthropic | mock (default: mock)
659
- --no-examples Skip generating example rule/tool descriptors
660
- -h, --help Show help
661
- ```
662
-
663
- Generates `.tachu/` directory skeleton + `tachu.config.ts` + `.gitignore` entries.
664
-
665
- ---
666
-
667
- ### `tachu run <prompt>`
668
-
669
- Execute a single prompt and stream the result to stdout.
670
-
671
- ```
672
- tachu run <prompt> [options]
673
-
674
- Arguments:
675
- <prompt> The prompt text (or pipe via stdin)
676
-
677
- Options:
678
- --session <id> Use a specific session ID
679
- --resume Resume the most recent session
680
- --model <name> Override the high-reasoning model
681
- --provider <name> Override the default provider
682
- --api-base <url> Override provider baseURL (gateway / Azure / LiteLLM)
683
- --api-key <key> Override provider apiKey (env var still recommended)
684
- --organization <id> Override OpenAI organization ID
685
- --input <file> Read prompt from a file
686
- --json Parse prompt as JSON (structured input)
687
- --output <fmt> Output format: text | json | markdown (default: text)
688
- --markdown Enable terminal Markdown rendering for --output text
689
- (default: on when stdout is a TTY and NO_COLOR is unset)
690
- --no-markdown Disable terminal Markdown rendering (force raw text)
691
- --no-validation Skip Phase 8 result validation
692
- --plan-mode Enable Plan mode (pause after Phase 5 for approval)
693
- --verbose, -v Verbose output (phase transitions, each phase line has `(Nms)` duration appended)
694
- --debug Debug mode: implies --verbose and streams every engine observability
695
- event (phase / llm / tool / MCP) to stderr, color-coded.
696
- Safe for `-o json` pipelines (stdout is not polluted).
697
- --no-color Disable ANSI color output (also respects NO_COLOR env var;
698
- implies --no-markdown since Markdown rendering is color-based)
699
- --timeout <ms> Wall-time limit (overrides budget.maxWallTimeMs)
700
- -h, --help Show help
701
- ```
702
-
703
- ---
704
-
705
- ### `tachu chat`
706
-
707
- Start an interactive multi-turn chat session.
708
-
709
- ```
710
- tachu chat [options]
711
-
712
- Options:
713
- --session <id> Use a specific session ID
714
- --resume Resume the most recent session
715
- --history List all sessions and exit (no interactive prompt)
716
- --export <file> Export a session to Markdown and exit
717
- --model <name> Override the high-reasoning model
718
- --provider <name> Override the default provider
719
- --api-base <url> Override provider baseURL (gateway / Azure / LiteLLM)
720
- --api-key <key> Override provider apiKey (env var still recommended)
721
- --organization <id> Override OpenAI organization ID
722
- --plan-mode Enable Plan mode
723
- --verbose, -v Verbose output (phase lines carry `(Nms)` duration)
724
- --debug Debug mode: implies --verbose and streams observability events to stderr.
725
- Also prints per-turn MCP gated-group activation summary.
726
- --no-color Disable color output
727
- -h, --help Show help
728
- ```
729
-
730
- **Built-in interactive commands** (prefix with `/`):
731
-
732
- | Command | Description |
733
- |---------|-------------|
734
- | `/exit` | Save session and quit |
735
- | `/reset` | Clear the current session's memory |
736
- | `/new` | Start a new session |
737
- | `/list` | List all saved sessions |
738
- | `/load <id>` | Switch to a specific session |
739
- | `/save` | Manually persist the current session |
740
- | `/export <path>` | Export session to a Markdown file |
741
- | `/history` | Show this session's message history |
742
- | `/stats` | Show token usage, tool calls, and remaining budget |
743
- | `/help` | Show all commands |
744
157
 
745
- **Ctrl+C behaviour:**
746
- - First press: cancel the current LLM/Tool call (return to prompt, session intact)
747
- - Second press within 1 second: save session and exit gracefully
748
- - Third press: force exit
749
-
750
- **Session persistence contract:**
751
-
752
- `tachu chat` uses the `FsMemorySystem` from `@tachu/extensions` by default. Each conversation is written to `<cwd>/.tachu/memory/<session-id>.jsonl` on every `append` (append-only for crash safety). `--resume` and `--session <id>` hydrate the full history from that file on startup, then the engine continues inside the very same `MemorySystem` — so the LLM sees the complete prior context.
753
-
754
- - `persistence` is controlled via `memory.persistence` in `tachu.config.ts` (`"fs"` default / `"memory"` for SDK-embedded use)
755
- - `persistDir` defaults to `.tachu/memory`
756
- - Legacy `tachu.config.ts` sessions that still embedded `messages` inside the session JSON are auto-migrated into the new `jsonl` layout on first resume (one-time, idempotent)
757
- - `/history`, `/export <path>`, `/stats`, `/reset`, `/clear`, `/new`, `/load <id>` all operate against this single source of truth
758
-
759
- ---
760
-
761
- ## Extension Guide
762
-
763
- Tachu is extended by creating Markdown descriptor files in the `.tachu/` directory. No code changes are required for Rules, Skills, and Tools — only Agents need executable functions registered separately.
764
-
765
- ### Custom Rule
766
-
767
- ```markdown
768
- <!-- .tachu/rules/no-external-calls.md -->
769
- ---
770
- name: no-external-calls
771
- description: Prevent the agent from making external network calls without explicit approval
772
- type: rule
773
- scope: [execution]
774
- tags: [security, network]
775
- ---
776
-
777
- Do not make HTTP requests, DNS lookups, or any other external network calls unless
778
- the tool being invoked has `requiresApproval: true` and the user has confirmed.
779
- ```
780
-
781
- ### Custom Skill
782
-
783
- ```markdown
784
- <!-- .tachu/skills/git-workflow/SKILL.md -->
785
- ---
786
- name: git-workflow
787
- description: Git branching, commit, and PR workflow knowledge for this repository
788
- tags: [development, git]
789
- requires:
790
- - { kind: tool, name: run-command }
791
- ---
792
-
793
- ## Git Workflow
794
-
795
- This repository follows trunk-based development with short-lived feature branches.
796
-
797
- ### Branch Naming
798
- - Feature: `feat/<ticket>-<short-desc>`
799
- - Fix: `fix/<ticket>-<short-desc>`
800
-
801
- ### Commit Convention
802
- Use Conventional Commits: `type(scope): subject`
803
- ...
804
- ```
805
-
806
- ### Custom Tool
807
-
808
- ```markdown
809
- <!-- .tachu/tools/query-db.md -->
810
- ---
811
- name: query-db
812
- description: Execute a read-only SQL query against the application database
813
- sideEffect: readonly
814
- idempotent: true
815
- requiresApproval: false
816
- timeout: 10000
817
- inputSchema:
818
- type: object
819
- properties:
820
- sql: { type: string, description: "SQL SELECT statement" }
821
- limit: { type: number, description: "Max rows to return", default: 100 }
822
- required: [sql]
823
- execute: queryDatabase
824
- ---
825
-
826
- Executes a parameterized read-only SQL query. Results are returned as a JSON array.
827
- ```
828
-
829
- Register the execution function in your `engine-factory.ts`:
830
-
831
- ```typescript
832
- engine.registry.registerExecutor('queryDatabase', async (input, ctx) => {
833
- const { sql, limit = 100 } = input as { sql: string; limit?: number };
834
- return db.query(sql).limit(limit).execute();
835
- });
836
- ```
837
-
838
- ### Custom Agent
839
-
840
- ```markdown
841
- <!-- .tachu/agents/code-reviewer.md -->
842
- ---
843
- name: code-reviewer
844
- description: Reviews pull request diffs and produces structured code review feedback
845
- sideEffect: readonly
846
- idempotent: true
847
- requiresApproval: false
848
- timeout: 180000
849
- maxDepth: 1
850
- availableTools: [read-file, search-code, run-command]
851
- ---
852
-
853
- You are a careful code reviewer. When given a diff or a set of files:
854
- 1. Understand the intent of the change
855
- 2. Review for correctness, clarity, security, and performance
856
- 3. Produce a structured review with severity levels: critical / major / minor / nit
857
- ```
858
-
859
- ---
860
-
861
- ## Observability & Safety
862
-
863
- ### OpenTelemetry Integration
864
-
865
- Every engine event maps to an OTel span, enabling full distributed tracing:
866
-
867
- ```typescript
868
- import { OtelEmitter } from '@tachu/extensions/emitters';
869
- import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
870
- import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';
871
- import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
872
-
873
- const provider = new NodeTracerProvider();
874
- provider.addSpanProcessor(
875
- new SimpleSpanProcessor(new OTLPTraceExporter({ url: 'http://localhost:4318/v1/traces' }))
876
- );
877
- provider.register();
878
-
879
- const engine = new Engine({
880
- ...config,
881
- // The OtelEmitter consumes EngineEvents and creates OTel spans
882
- });
883
- engine.useEmitter(new OtelEmitter());
884
- ```
885
-
886
- **Events emitted for every request:**
887
-
888
- | Event Type | When |
889
- |-----------|------|
890
- | `phase_enter` / `phase_exit` | Every pipeline phase |
891
- | `llm_call_start` / `llm_call_end` | Every LLM invocation |
892
- | `tool_call_start` / `tool_call_end` | Every Tool execution |
893
- | `retry` | Task-level or system-level retry triggered |
894
- | `provider_fallback` | Provider downgrade initiated |
895
- | `budget_warning` | Budget at 80% of limit |
896
- | `budget_exhausted` | Budget circuit-breaker activated |
897
- | `error` | Any `EngineError` subclass |
898
-
899
- ### Safety Module
900
-
901
- The Safety module operates in two independent layers:
902
-
903
- **Engine baseline (non-disableable):**
904
- - Input size enforcement (`maxInputSize` bytes)
905
- - Recursion depth limit (`maxRecursionDepth`)
906
- - Budget circuit-breaker (terminates immediately when token/time budget is exhausted)
907
-
908
- **Business-injectable policies** (via hooks or configuration):
909
- - Prompt injection detection (`enablePromptInjectionCheck: true`)
910
- - Sensitive operation interception (register via `engine.registerSafetyPolicy()`)
911
- - Output content compliance checks
912
-
913
- ```typescript
914
- // Register a custom safety policy
915
- engine.registerSafetyPolicy(async (input, ctx) => {
916
- if (containsPersonalData(input.content)) {
917
- return { passed: false, violations: [{ type: 'pii', message: 'PII detected in input' }] };
918
- }
919
- return { passed: true, violations: [] };
920
- });
921
- ```
922
-
923
- ### Graceful Degradation Policy
924
-
925
- Tachu guarantees that **every response the user sees is a usable natural-language answer** — the engine never returns a bare "failed" status or leaks internal step IDs / phase numbers / sub-flow names. Three defensive layers enforce this:
926
-
927
- 1. **Origin** — every `EngineError` ships with a `userMessage` field resolved from a Chinese template table (46 codes covered); `toUserFacing()` projects only `{ code, userMessage, retryable }` to the UI layer, hiding `message` / `stack` / `cause` / `context`.
928
- 2. **Aggregation** — when `validation.passed === false` and the built-in `direct-answer` sub-flow also produced nothing, Phase 9's `ensureFallbackText()` first attempts a best-effort LLM summary (5 s timeout, no retry) and silently falls back to a deterministic local template. The returned text is always **≥ 30 characters**, contains a concrete next-step hint, and is sanitized of internal terminology.
929
- 3. **Final shield** — the CLI `StreamRenderer` runs a last-pass regex filter (`sanitizeUserText`) over every user-visible string (`finalize(text|markdown)` + `error` chunks), catching any internal term that might have slipped through upstream.
930
-
931
- The contract is enforced by `packages/core/src/engine/phases/fallback-contract.test.ts` (55 assertions). Any regression that leaks `task-tool-N` / `Phase N` / `direct-answer 子流程` / `capability 路由` / `Tool / Agent 描述符` to a user-facing path fails CI.
932
-
933
- ---
934
-
935
- ## Roadmap
936
-
937
- Tachu follows a `1.0.0-alpha.n` → `1.0.0-beta.n` → `1.0.0` release lane. Each
938
- cut-line below is a real, shippable deliverable with tests, not a wish list.
939
-
940
- ### 1.0.0-alpha.6 — Streaming phases, Gemini, and provider-protocol hardening (current)
941
-
942
- - [x] Engine streaming extensions: `phase-enter` / `phase-exit` plus `reasoning-delta` chunks for richer CLI/SSE consumers
943
- - [x] `SessionScope` + `modelOverride` plumbing through the model router for scoped routing decisions
944
- - [x] LLM usage telemetry with `stepId` attribution for phase-level observability
945
- - [x] Tool-use loop: structured tool results vs final-answer streaming split, parallel approval budgeting, active tool-loop timers, streamed `providerMetadata` merged into assistant history, and streaming robustness fixes
946
- - [x] Multimodal protocol: `GeneratedMedia`, `metadata.generatedMedia`, `Message` `file` parts, optional `embed` / `rerank` hooks, structured-output metadata, and opaque `providerMetadata` for adapter-specific replay
947
- - [x] `GeminiProviderAdapter` (`@google/genai`) with unit tests and package exports
948
- - [x] `.gitignore` — ignore `docs/superpowers/` for local draft notes
949
-
950
- ### 1.0.0-alpha.5 — Descriptor governance + timeout budgeting hardening
951
-
952
- - [x] `BaseDescriptor` governance metadata: `version`, `displayName`, `deprecated`, `deprecatedMessage`
953
- - [x] `DescriptorRegistry` version-aware resolution: same-name multi-version coexistence, `get(kind,name,version)`, `getLatest`, `listVersions`, and stable-first latest selection
954
- - [x] Descriptor passthrough contract: unknown top-level fields are preserved through loader + registry round-trip
955
- - [x] Registry validation: `deprecated=true` now requires `deprecatedMessage`; invalid semver is rejected with explicit registry errors
956
- - [x] Runtime timeout split: `llmWaitFirstTokenMs`, `llmStreamingMs`, `maxToolLoopActiveMs` (excluding user-blocking waits), plus phase-level timeout overrides
957
- - [x] `search-code` resilience: invalid regex patterns auto-downgrade to fixed-string matching instead of failing the turn
958
-
959
- ### 1.0.0-alpha.4 — Code editing capabilities + architecture boundary fix
960
-
961
- - [x] 14 new tools in `@tachu/extensions`: `edit-file`, `multi-edit`, `glob`, `todo-write/read`, `git-status/diff/log/blame/show/branch`, `run-typecheck`, `run-tests`
962
- - [x] `read-file` — `offset`/`limit` pagination, `withLineNumbers` (default on), `totalLines`/`hasMore`
963
- - [x] `apply-patch` — fuzzy context matching (trim + ±3-line offset tolerance)
964
- - [x] `run-shell` — extended env allowlist, session-persistent cwd, built-in deny patterns
965
- - [x] Persistent tool approval — `ApprovalStore` JSONL (project + user scope), `y/a/p/s/N` prompt, `tachu approval` subcommand group
966
- - [x] `EngineConfig.intent.additionalComplexPatterns` / `fewShotExamples` — business layer injects domain-specific complex signals; core carries no domain knowledge
967
- - [x] `EngineConfig.toolUse.systemPromptSuffix` — domain workflow instructions (e.g. code editing guide) injected at CLI layer, not hard-coded in core
968
- - [x] `toolLoop.maxSteps` default raised 8 → 25
969
-
970
- ### 1.0.0-alpha.3 — Shell auto-approve, shortTaskRoute, prompt optimisations
971
-
972
- - [x] `safety.shellAutoApprovePatterns` auto-approve matching shell commands (config-driven)
973
- - [x] `toolLoop.shortTaskRoute` routes short single-tool loops to `fast-cheap` capability
974
- - [x] Intent phase fast-paths for strong simple / strong complex heuristics
975
- - [x] Internal system prompts English-first; prompt assembler tool listing drops JSON Schema
976
-
977
- ### 1.0.0-alpha.2 — Adapter call context
978
-
979
- - [x] `AdapterCallContext` on `ProviderAdapter` / `VectorStore` / `MemorySystem`; engine and phases pass execution-derived context (trace id; optional tenant / scope identifiers).
980
-
981
- ### 1.0.0-alpha.1 — First public alpha
982
-
983
- - [x] 9-phase pipeline, descriptor registry, prompt assembler, scheduler and 8 core modules
984
- - [x] OpenAI / Anthropic / Qwen / Mock provider adapters with
985
- `apiKey` / `baseURL` / `organization` / `timeoutMs` configurable via env,
986
- `tachu.config.ts` or CLI flags
987
- - [x] CLI (`tachu init` / `run` / `chat`) with streaming renderer, session
988
- persistence, double-Ctrl+C exit semantics and terminal Markdown rendering
989
- - [x] MCP stdio + SSE adapters, auto-mounted from `tachu.config.ts`
990
- - [x] Vector stores (`LocalFsVectorStore`, `QdrantVectorStore`) and
991
- observability emitters (OTel / JSONL / Console)
992
- - [x] `direct-answer` built-in Sub-flow — reserved in the registry, runs the
993
- user-facing LLM reply inside Phase 7 with the same safety and
994
- observability hooks as any other sub-flow
995
- - [x] `tool-use` built-in Sub-flow — full agentic loop with tool selection,
996
- approval, execution, feedback and termination
997
- - [x] Phase 3 (Intent Analysis) as a real LLM call with structured JSON schema,
998
- few-shot examples, bounded history window and composed timeouts
999
- - [x] Phase 5 fallback contract — guarantees `plans[0].tasks.length >= 1` for
1000
- every request path; LLM-backed ranked planner slated for next alpha
1001
- - [x] Phase 9 Output Assembly — internal state JSON never leaks to end users
1002
- - [x] Optional `@tachu/web-fetch-server` sidecar powering the `web-fetch` and
1003
- `web-search` tools without pulling browser dependencies into the SDK
1004
- - [x] Structured text-to-image contract (`ChatResponse.images` /
1005
- `EngineOutput.metadata.generatedImages`) and `tachu run --save-image`
1006
-
1007
- ### Next alpha iterations
1008
-
1009
- - [ ] LLM-backed Phase 5 planner producing ranked multi-step plans for
1010
- tool-chain complex requests
1011
- - [ ] Phase 8 Result Validation as a real LLM call with structured
1012
- `ValidationResult` driving the retry / re-plan loop
1013
- - [ ] `tachu run --plan-mode` real plan preview before execution
1014
- - [ ] Engine-level `delta` streaming so CLI can render token-by-token output
1015
- during Phase 3 / 7 / 8
1016
- - [ ] `tachu run --json` schema lock-down
1017
- - [ ] Failure-injection test harness
1018
- - [ ] Published end-to-end smoke test recordings under `docs/smoke/`
1019
-
1020
- ### 1.0.0-beta — Graduation criteria
1021
-
1022
- - [ ] Two consecutive alpha releases without regressions
1023
- - [ ] ≥1 third-party user has run Tachu end-to-end against a real LLM and reported back
1024
- - [ ] Published coverage + benchmark baselines
1025
- - [ ] Public upgrade guide covering every breaking change since `1.0.0-alpha.1`
1026
- - [ ] Additional provider adapters (e.g. Mistral) land behind the stable protocol
1027
-
1028
- ### 1.0.0 — Stable
1029
-
1030
- - [ ] SLO / error-budget documentation
1031
- - [ ] Signed release provenance
1032
- - [ ] Backwards-compatibility policy
1033
- - [ ] Production deployments documented
1034
-
1035
- ### Beyond 1.0 — Vision
1036
-
1037
- - Multi-agent collaboration (agent-to-agent communication protocol)
1038
- - Persistent long-term memory across deployment restarts
1039
- - Fine-grained budget allocation per sub-task
1040
- - Additional VectorStore adapters (Pinecone, pgvector)
1041
- - Plan template library
1042
- - Additional compression strategies
1043
-
1044
- ---
1045
-
1046
- ## Contributing
1047
-
1048
- ### Requirements
1049
-
1050
- - [Bun](https://bun.sh) >= 1.1.0
1051
- - TypeScript 5.x (provided via dev dependencies)
1052
-
1053
- ### Development Workflow
1054
-
1055
- ```bash
1056
- # Clone and install
1057
- git clone https://github.com/dangaogit/tachu.git
1058
- cd tachu
1059
- bun install
1060
-
1061
- # Run all tests
1062
- bun test
1063
-
1064
- # Type check
1065
- bun run typecheck
1066
-
1067
- # Build all packages
1068
- bun run build
1069
-
1070
- # Run a specific package's tests
1071
- bun test --filter packages/core
1072
- ```
1073
-
1074
- ### Project Conventions
1075
-
1076
- - File names: `kebab-case`
1077
- - Classes and types: `PascalCase`
1078
- - Functions and variables: `camelCase`
1079
- - Constants: `SCREAMING_SNAKE_CASE`
1080
- - All public APIs must have TSDoc comments (`@param`, `@returns`, `@throws`, `@example`)
1081
- - Test files co-located with source: `*.test.ts`
1082
- - Integration tests under `__tests__/`
1083
-
1084
- Pull requests require:
1085
- - All tests passing (`bun test`)
1086
- - Zero TypeScript errors (`bun run typecheck`)
1087
- - Coverage thresholds met (≥80% line, ≥70% branch)
1088
- - TSDoc on any new public API
1089
-
1090
- See `CONTRIBUTING.md` for full guidelines.
1091
-
1092
- ---
1093
-
1094
- ## Benchmarks
1095
-
1096
- Performance baselines are established in `packages/core/benchmarks/` and run with `bun test`:
1097
-
1098
- | Benchmark | Metric | Baseline |
1099
- |-----------|--------|----------|
1100
- | `scheduler.bench.ts` — 100 parallel tasks | Scheduling throughput | *Populated by verifier phase* |
1101
- | `vector-store.bench.ts` — 10,000 entries, topK=10 | Search QPS | *Populated by verifier phase* |
1102
- | `prompt-assembler.bench.ts` — 4K token window assembly | Assembly latency (p99) | *Populated by verifier phase* |
1103
-
1104
- Benchmarks serve as regression baselines; there are no minimum performance requirements for v1.
158
+ Programmatic embedding: see [Configuration](./docs/guides/configuration.md) and `@tachu/host-defaults`.
1105
159
 
1106
160
  ---
1107
161
 
@@ -1109,31 +163,32 @@ Benchmarks serve as regression baselines; there are no minimum performance requi
1109
163
 
1110
164
  | Document | Description |
1111
165
  |----------|-------------|
1112
- | [Architecture Design](./docs/adr/architecture-design.md) | Vision, three-layer structure, four core abstractions, 9-phase pipeline design |
1113
- | [Detailed Design](./docs/adr/detailed-design.md) | TypeScript interfaces, module specs, configuration schema |
1114
- | [Technical Design](./docs/adr/technical-design.md) | Technology choices, engineering structure, implementation guide |
166
+ | [Overview Design](./docs/overview-design.md) | Vision, layers, abstractions, pipeline concepts |
167
+ | [Detailed Design](./docs/detailed-design.md) | Types, modules, configuration schema |
168
+ | [Technical Design](./docs/technical-design.md) | Engineering structure and implementation guide |
169
+ | [Pipeline phases](./docs/architecture/pipeline-phases.md) | 9-phase pipeline and tool-use loop |
170
+ | [Package layout](./docs/architecture/package-layout.md) | Monorepo packages and dependencies |
171
+ | [Design principles](./docs/architecture/design-principles.md) | Core engineering principles |
172
+ | [CLI reference](./docs/guides/cli.md) | All commands and flags |
173
+ | [Configuration](./docs/guides/configuration.md) | `tachu.config.ts` / `EngineConfig` |
174
+ | [Providers & integrations](./docs/guides/providers-and-integrations.md) | LLM, MCP, vector stores |
175
+ | [Extension guide](./docs/guides/extension-guide.md) | Rules, Skills, Tools, Agents |
176
+ | [Observability & safety](./docs/guides/observability-and-safety.md) | OTel, events, safety |
177
+ | [CONTEXT.md](./CONTEXT.md) | Product vocabulary |
178
+ | [CONTRIBUTING.md](./CONTRIBUTING.md) | Development workflow |
179
+ | [Web Fetch Server](./packages/web-fetch-server/README.md) | Optional browser sidecar |
1115
180
 
1116
181
  ---
1117
182
 
1118
- ## Web Fetch Server (Optional)
1119
-
1120
- The **Web Fetch Server** (`@tachu/web-fetch-server`) is an optional HTTP sidecar that performs remote browser rendering and structured extraction for the `web-fetch` and `web-search` tools in `@tachu/extensions`. It does **not** run automatically with the engine or CLI—start it only when you need those tools against live pages.
183
+ ## Web Fetch Server (optional)
1121
184
 
1122
- ### Quick start (repo root)
185
+ For browser-backed `web-fetch` / `web-search`, run the private sidecar — see [packages/web-fetch-server/README.md](./packages/web-fetch-server/README.md).
1123
186
 
1124
187
  ```bash
1125
- bun install
1126
- bun run dev:server:install-browser # first-time: Chromium for Playwright
188
+ bun run dev:server:install-browser
1127
189
  bun run dev:server
1128
190
  ```
1129
191
 
1130
- ### Tools
1131
-
1132
- - **`web-fetch`** — Calls the server to retrieve a URL and return AI-friendly Markdown (Readability + Turndown).
1133
- - **`web-search`** — In v0.1 this is a **stub**; real search providers are not wired yet.
1134
-
1135
- For full configuration, env vars, and production/Docker notes, see [packages/web-fetch-server/README.md](./packages/web-fetch-server/README.md).
1136
-
1137
192
  ---
1138
193
 
1139
194
  ## License