agentix-toolkit 0.1.0.tar.gz → 0.2.0.tar.gz

Files changed (97)

agentix_toolkit-0.2.0/.editorconfig ADDED

@@ -0,0 +1,18 @@
+root = true
+[*]
+charset = utf-8
+end_of_line = lf
+insert_final_newline = true
+trim_trailing_whitespace = true
+indent_style = space
+[*.py]
+indent_size = 4
+max_line_length = 100
+[*.{yml,yaml,toml,json}]
+indent_size = 2
+[*.md]
+trim_trailing_whitespace = false

agentix_toolkit-0.2.0/.github/ISSUE_TEMPLATE/bug_report.yml ADDED

@@ -0,0 +1,43 @@
+name: Bug report
+description: Something isn't working as documented.
+labels: ["bug"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for the report! For **security vulnerabilities**, do not file a
+        public issue — see [SECURITY.md](../blob/main/SECURITY.md).
+  - type: textarea
+    id: what-happened
+    attributes:
+      label: What happened?
+      description: What did you expect, and what happened instead?
+    validations:
+      required: true
+  - type: textarea
+    id: repro
+    attributes:
+      label: Minimal reproduction
+      description: The smallest code snippet that reproduces it. Prefer `MockModel` so it runs without an API key.
+      render: python
+    validations:
+      required: true
+  - type: input
+    id: version
+    attributes:
+      label: agentix-toolkit version
+      placeholder: "0.1.0"
+    validations:
+      required: true
+  - type: input
+    id: python
+    attributes:
+      label: Python version
+      placeholder: "3.12"
+    validations:
+      required: true
+  - type: textarea
+    id: extra
+    attributes:
+      label: Anything else?
+      description: Traceback, environment, relevant config (model/provider, guards, etc.).

agentix_toolkit-0.2.0/.github/ISSUE_TEMPLATE/config.yml ADDED

@@ -0,0 +1,8 @@
+blank_issues_enabled: false
+contact_links:
+  - name: Security vulnerability
+    url: https://github.com/skwijeratne/agentix-toolkit/security/advisories/new
+    about: Report security issues privately — please do not open a public issue.
+  - name: Question / usage help
+    url: https://github.com/skwijeratne/agentix-toolkit/discussions
+    about: Ask questions and discuss ideas (if Discussions is enabled).

agentix_toolkit-0.2.0/.github/ISSUE_TEMPLATE/feature_request.yml ADDED

@@ -0,0 +1,32 @@
+name: Feature request
+description: Suggest a capability or improvement.
+labels: ["enhancement"]
+body:
+  - type: textarea
+    id: problem
+    attributes:
+      label: What problem does this solve?
+      description: The use case or pain point. What are you trying to build?
+    validations:
+      required: true
+  - type: textarea
+    id: proposal
+    attributes:
+      label: Proposed solution
+      description: |
+        What would the API look like? Remember agentix's design: small shared
+        core, with capabilities injected (a guard / tool / strategy / adapter)
+        rather than baked into the loop.
+    validations:
+      required: true
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Alternatives considered
+  - type: checkboxes
+    id: scope
+    attributes:
+      label: Fit
+      options:
+        - label: This keeps the core provider-agnostic (no coupling to one model vendor).
+        - label: I'm willing to help implement it.

agentix_toolkit-0.2.0/.github/PULL_REQUEST_TEMPLATE.md ADDED

@@ -0,0 +1,21 @@
+<!-- Thanks for contributing! Keep PRs focused. -->
+## What & why
+<!-- What does this change, and what problem does it solve? Link the issue. -->
+Closes #
+## Checklist
+- [ ] Tests added/updated (new behavior covered; bug fixes have a regression test)
+- [ ] `uv run pytest` passes
+- [ ] `uv run ruff check src tests` passes
+- [ ] `uv run mypy` passes
+- [ ] Docs updated where relevant (docstrings / README / an `examples/` script)
+- [ ] `CHANGELOG.md` updated under `[Unreleased]`
+- [ ] Change is opt-in / composable and keeps the core provider-agnostic
+## Notes for reviewers
+<!-- Anything that needs context: design tradeoffs, follow-ups, etc. -->

agentix_toolkit-0.2.0/.github/dependabot.yml ADDED

@@ -0,0 +1,20 @@
+version: 2
+updates:
+  # Python dependencies (resolved via uv.lock).
+  - package-ecosystem: "uv"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 5
+    groups:
+      python-deps:
+        patterns: ["*"]
+  # Keep GitHub Actions current (also resolves the Node-version action drift).
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    groups:
+      actions:
+        patterns: ["*"]

{agentix_toolkit-0.1.0 → agentix_toolkit-0.2.0}/.github/workflows/ci.yml RENAMED

@@ -14,9 +14,9 @@ jobs:
       matrix:
         python-version: ["3.10", "3.11", "3.12", "3.13"]
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v7
       - name: Install uv
-        uses: astral-sh/setup-uv@v6
+        uses: astral-sh/setup-uv@v7
         with:
           enable-cache: true
       - run: uv python install ${{ matrix.python-version }}
@@ -27,9 +27,9 @@ jobs:
     name: lint & types
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v7
       - name: Install uv
-        uses: astral-sh/setup-uv@v6
+        uses: astral-sh/setup-uv@v7
         with:
           enable-cache: true
       # --all-extras so mypy can resolve the optional anthropic/mcp imports.

{agentix_toolkit-0.1.0 → agentix_toolkit-0.2.0}/.github/workflows/release.yml RENAMED

@@ -12,13 +12,13 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v7
       - name: Install uv
-        uses: astral-sh/setup-uv@v6
+        uses: astral-sh/setup-uv@v7
       - run: uv build
       - name: Check the built artifacts
         run: uvx twine check dist/*
-      - uses: actions/upload-artifact@v4
+      - uses: actions/upload-artifact@v7
         with:
           name: dist
           path: dist/
@@ -30,7 +30,7 @@ jobs:
     permissions:
       id-token: write # required for Trusted Publishing
     steps:
-      - uses: actions/download-artifact@v4
+      - uses: actions/download-artifact@v8
         with:
           name: dist
           path: dist/

agentix_toolkit-0.2.0/.pre-commit-config.yaml ADDED

@@ -0,0 +1,27 @@
+# Run `uv run pre-commit install` once to enable. Mirrors the CI gates.
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-toml
+      - id: check-merge-conflict
+      - id: check-added-large-files
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.15.18
+    hooks:
+      - id: ruff
+        args: [--fix]
+  # mypy --strict, matching CI. Uses the project's own env via uv.
+  - repo: local
+    hooks:
+      - id: mypy
+        name: mypy (strict)
+        entry: uv run mypy
+        language: system
+        types: [python]
+        pass_filenames: false

{agentix_toolkit-0.1.0 → agentix_toolkit-0.2.0}/CHANGELOG.md RENAMED

@@ -6,6 +6,38 @@ adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [Unreleased]
+## [0.2.0] - 2026-06-23
+### Added
+- Subagents: `subagent_tool(agent, ...)` exposes a child agent as a delegable
+  tool (its own model/system prompt/tools/guards); composes with the loop and
+  `bounded_gather`.
+- Cost & control: USD cost tracking (`pricing` module, `cost_usd`, and
+  `cost_usd` on `ModelResponse`/`AgentOutcome`; the Anthropic adapter fills
+  `input_tokens`/`output_tokens`/`cost_usd`); `AgentPolicy.max_budget_usd`; and
+  `Interrupt` to stop a run/stream at a safe boundary.
+- Dynamic permissions: `CallbackGuard` (a `can_use_tool`-style per-call callback
+  returning allow/deny/confirm) and `ToolAllowlistGuard` (scope a run to a
+  subset of tools).
+- Output validation + retry: `Agent(output_validator=, max_output_retries=)`
+  re-prompts on a failed validation and exposes `AgentOutcome.parsed`. Ships
+  `json_output`, `pydantic_output`, `regex_output`.
+- Resilient model wrappers: `RetryModel` (backoff) and `FallbackModel`
+  (try-next-on-error), composable and drop-in.
+- Eval harness (`agentix.evals`): `evaluate(...)` runs an agent over `Case`s and
+  returns an `EvalReport` with `pass_rate` / `format_success_rate` /
+  `assert_pass_rate()` (gate CI on regressions). Scorers: `exact_match`,
+  `contains`, `regex_match`, `predicate`, `llm_judge`.
+- `SelfConsistencyModel`: sample a model N times per turn and return the majority
+  vote (drop-in `ModelFn`).
+- `JudgeGuard`: an LLM reviews the final answer against a rubric and replaces it
+  on failure (an `on_answer` safety/on-brand/format gate).
+- Anthropic adapter: structured-output passthrough documented
+  (`output_config={"format": ...}`) and `strict` tool schemas forwarded.
+- OpenTelemetry tracing (`agentix[otel]`): `TracingModel`, `tracing_events`, and
+  `trace_run` produce a span tree (run → model/tool spans) for your observability
+  stack.
 ## [0.1.0] - 2026-06-22
 Initial release.
@@ -43,5 +75,6 @@ Initial release.
   `cost_usd`; `AgentPolicy.max_budget_usd` aborts a run over budget.
 - `Interrupt` stops a run or stream at the next safe boundary.
-[Unreleased]: https://github.com/skwijeratne/agentix-toolkit/compare/v0.1.0...HEAD
+[Unreleased]: https://github.com/skwijeratne/agentix-toolkit/compare/v0.2.0...HEAD
+[0.2.0]: https://github.com/skwijeratne/agentix-toolkit/compare/v0.1.0...v0.2.0
 [0.1.0]: https://github.com/skwijeratne/agentix-toolkit/releases/tag/v0.1.0
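
Reviewer note on the changelog entry for `RetryModel` (backoff) and `FallbackModel` (try-next-on-error): their source is not part of the hunks shown here, so the following is only a minimal sketch of the general wrapper pattern, not the package's implementation. It assumes a model is any async callable from a prompt string to a reply string; `with_retry`, `with_fallback`, `flaky`, `down`, and `backup` are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# Assumption: a "model" is any async callable from prompt to reply.
Model = Callable[[str], Awaitable[str]]


def with_retry(model: Model, attempts: int = 3, base_delay: float = 0.01) -> Model:
    """Wrap a model so failed calls are retried with exponential backoff."""
    async def wrapped(prompt: str) -> str:
        for attempt in range(attempts):
            try:
                return await model(prompt)
            except Exception:
                if attempt == attempts - 1:
                    raise  # out of attempts: surface the last error
                await asyncio.sleep(base_delay * 2 ** attempt)
        raise RuntimeError("attempts must be >= 1")
    return wrapped


def with_fallback(models: Sequence[Model]) -> Model:
    """Wrap several models so each is tried in order until one succeeds."""
    async def wrapped(prompt: str) -> str:
        last_error: Optional[Exception] = None
        for model in models:
            try:
                return await model(prompt)
            except Exception as exc:
                last_error = exc  # remember the failure, try the next model
        raise RuntimeError("all models failed") from last_error
    return wrapped


# Hypothetical scripted models, standing in for real providers.
calls = {"flaky": 0}


async def flaky(prompt: str) -> str:
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("transient failure")
    return f"primary: {prompt}"


async def down(prompt: str) -> str:
    raise ConnectionError("provider outage")


async def backup(prompt: str) -> str:
    return f"backup: {prompt}"


retry_result = asyncio.run(with_retry(flaky)("hi"))
fallback_result = asyncio.run(with_fallback([down, backup])("hi"))
```

Because both wrappers return the same callable shape they accept, they nest in either order, which is one plausible reading of "composable and drop-in" in the changelog.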

agentix_toolkit-0.2.0/CODE_OF_CONDUCT.md ADDED

@@ -0,0 +1,64 @@
+# Contributor Covenant Code of Conduct
+## Our Pledge
+We as members, contributors, and leaders pledge to make participation in our
+community a harassment-free experience for everyone, regardless of age, body
+size, visible or invisible disability, ethnicity, sex characteristics, gender
+identity and expression, level of experience, education, socio-economic status,
+nationality, personal appearance, race, religion, or sexual identity and
+orientation.
+We pledge to act and interact in ways that contribute to an open, welcoming,
+diverse, inclusive, and healthy community.
+## Our Standards
+Examples of behavior that contributes to a positive environment include:
+- Demonstrating empathy and kindness toward other people
+- Being respectful of differing opinions, viewpoints, and experiences
+- Giving and gracefully accepting constructive feedback
+- Accepting responsibility and apologizing to those affected by our mistakes,
+  and learning from the experience
+- Focusing on what is best not just for us as individuals, but for the overall
+  community
+Examples of unacceptable behavior include:
+- The use of sexualized language or imagery, and sexual attention or advances of
+  any kind
+- Trolling, insulting or derogatory comments, and personal or political attacks
+- Public or private harassment
+- Publishing others' private information, such as a physical or email address,
+  without their explicit permission
+- Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+## Enforcement Responsibilities
+Community leaders are responsible for clarifying and enforcing our standards of
+acceptable behavior and will take appropriate and fair corrective action in
+response to any behavior that they deem inappropriate, threatening, offensive,
+or harmful.
+## Scope
+This Code of Conduct applies within all community spaces, and also applies when
+an individual is officially representing the community in public spaces.
+## Enforcement
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported to the community leaders responsible for enforcement at
+**skwijeratne@gmail.com**. All complaints will be reviewed and investigated
+promptly and fairly. Community leaders are obligated to respect the privacy and
+security of the reporter of any incident.
+## Attribution
+This Code of Conduct is adapted from the [Contributor Covenant][homepage],
+version 2.1, available at
+https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.
+[homepage]: https://www.contributor-covenant.org

agentix_toolkit-0.2.0/CONTRIBUTING.md ADDED

@@ -0,0 +1,67 @@
+# Contributing to agentix
+Thanks for your interest in improving agentix! This guide gets you set up and
+explains what we look for in a contribution.
+> The distribution is **`agentix-toolkit`** on PyPI; you import it as **`agentix`**.
+## Development setup
+This project uses [uv](https://docs.astral.sh/uv/).
+```bash
+git clone https://github.com/skwijeratne/agentix-toolkit
+cd agentix-toolkit
+uv sync --all-extras        # create the venv, install deps + dev tools + extras
+```
+## The checks (all three must pass)
+CI runs these on every PR across Python 3.10–3.13, and they are **blocking**:
+```bash
+uv run pytest                # tests
+uv run ruff check src tests  # lint
+uv run mypy                  # type-check (strict)
+```
+Run them locally before pushing. Optionally enable the pre-commit hooks so
+lint runs automatically:
+```bash
+uv run pre-commit install
+```
+## Making a change
+1. **Open an issue first** for anything non-trivial, so we can agree on the
+   approach before you invest time.
+2. Branch off `main`.
+3. Keep the change focused. Match the surrounding style — small, shared core;
+   load-bearing behavior is injected and configurable, not baked into the loop.
+4. **Add tests.** New behavior needs coverage; bug fixes need a regression test.
+   Tests are plain `def` / `async def test_*` functions (pytest, `asyncio_mode`
+   is `auto`).
+5. Update docs where relevant: docstrings, the README, an `examples/` script,
+   and a `CHANGELOG.md` entry under `[Unreleased]`.
+6. Make sure all three checks pass.
+7. Open a PR using the template; describe the change and link the issue.
+## Design principles
+- **Provider-agnostic core.** Don't couple the loop to a specific model
+  provider; provider code lives behind adapters (`providers/`).
+- **Inject, don't bake in.** New capabilities should be opt-in and composable
+  (a guard, a tool, a strategy, an executor), not hard-coded into `agent.py`.
+- **Security defaults are conservative.** When a guard is ambiguous, fail
+  closed. See `SECURITY.md`.
+- **Typed and tested.** Public APIs are typed (`mypy --strict`) and exercised by
+  tests.
+## Reporting bugs / requesting features
+Use the issue templates. For **security vulnerabilities, do not open a public
+issue** — see [`SECURITY.md`](./SECURITY.md).
+By contributing, you agree that your contributions are licensed under the
+project's [MIT License](./LICENSE).

{agentix_toolkit-0.1.0 → agentix_toolkit-0.2.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: agentix-toolkit
-Version: 0.1.0
+Version: 0.2.0
 Summary: A generic, batteries-included agent toolkit: configure the loop, tools, guards, and observability instead of rewriting them.
 Project-URL: Homepage, https://github.com/skwijeratne/agentix-toolkit
 Project-URL: Repository, https://github.com/skwijeratne/agentix-toolkit
@@ -22,6 +22,8 @@ Provides-Extra: anthropic
 Requires-Dist: anthropic>=0.40; extra == 'anthropic'
 Provides-Extra: mcp
 Requires-Dist: mcp>=1.0; extra == 'mcp'
+Provides-Extra: otel
+Requires-Dist: opentelemetry-api>=1.20; extra == 'otel'
 Description-Content-Type: text/markdown
 # agentix
@@ -53,11 +55,17 @@ outcome = await agent.run("What's the weather in Lisbon?")
 - **Async-first** core loop (`run` / `stream` / `resume`) with a sync wrapper.
 - **Provider-agnostic** — bring any model; a real **Anthropic** adapter is included.
-- **Tools from type hints** — one `@tool` decorator generates the JSON schema.
-- **Security as a first-class, opt-in subsystem** — trust boundary, permission
-  tiers, confirmation, PII/injection guards, audit events.
-- **Scales** — streaming, checkpoint/resume, MCP tools, context trimming, and
-  fleet backpressure.
+- **Tools from type hints** — one `@tool` decorator generates the JSON schema;
+  **MCP** servers and **subagents** plug in as tools too.
+- **Security, opt-in** — trust boundary, permission tiers + dynamic
+  `can_use_tool` callbacks, PII/injection guards, human confirmation, audit events.
+- **Cost & control** — token **and USD** cost tracking, step/token/USD budgets,
+  cooperative `Interrupt`.
+- **Reliability** — output **validation + retry** (`outcome.parsed`), model
+  **fallback/retry**, self-consistency, and LLM-as-judge.
+- **Scale & ops** — streaming, checkpoint/resume, context trimming, fleet
+  backpressure, an **eval harness** (gate CI on quality), and **OpenTelemetry**
+  tracing.
 > Status: **alpha**, under active development. APIs may change before `1.0`.
@@ -72,9 +80,9 @@ The distribution is **`agentix-toolkit`**; you import it as **`agentix`**.
 With [uv](https://docs.astral.sh/uv/) (recommended):
 ```bash
-uv add agentix-toolkit                      # core
-uv add "agentix-toolkit[anthropic]"         # + Anthropic adapter
-uv add "agentix-toolkit[anthropic,mcp]"     # + MCP client support
+uv add agentix-toolkit                       # core (no required deps)
+uv add "agentix-toolkit[anthropic]"          # + Anthropic adapter
+uv add "agentix-toolkit[anthropic,mcp,otel]" # + MCP client + OpenTelemetry tracing
 ```
 Or with pip:
@@ -83,6 +91,9 @@ Or with pip:
 pip install "agentix-toolkit[anthropic]"
 ```
+Extras are opt-in: `anthropic` (the model adapter), `mcp` (MCP client),
+`otel` (OpenTelemetry tracing). The core has **no required dependencies**.
 ### 2. Run an agent with no API key
 `MockModel` is a scripted, dependency-free model — perfect for trying the loop
@@ -169,6 +180,30 @@ async for event in agent.stream("Tell me about Lisbon."):
         print("\n", event.outcome.status)
 ```
+### 6. Make it production-safe (validate output, fall back, cap cost)
+Stop malformed output from crashing downstream code: validate the final answer
+and re-prompt on failure. Add a fallback model and a USD budget for resilience.
+```python
+from agentix import Agent, AgentPolicy, FallbackModel, json_output
+agent = Agent(
+    model=FallbackModel([primary_model, backup_model]),  # survive a provider blip
+    system_prompt="Reply with a JSON object.",
+    tools=[...],
+    output_validator=json_output,        # or pydantic_output(MyModel)
+    max_output_retries=2,                # re-prompt the model on bad output
+    policy=AgentPolicy(max_budget_usd=0.50),  # abort if it gets expensive
+)
+outcome = await agent.run("...")
+outcome.parsed     # a validated object — safe to use; outcome.cost_usd is tracked
+```
+Then **gate quality in CI** with the eval harness — `evaluate(...)` runs your
+agent over golden cases and `assert_pass_rate(...)` fails the build on a
+regression (see `examples/17_eval.py`).
 ---
 ## Feature tour
@@ -184,6 +219,13 @@ Each links to a runnable example in [`examples/`](./examples):
 | Concurrency | `Limiter` + `bounded_gather` for fleets | `10_concurrency.py` |
 | MCP | use any MCP server's tools | `11_mcp.py` |
 | Context | bound the transcript (`TrimRounds`, …) | `12_context.py` |
+| Subagents | delegate a subtask to a child agent | `13_subagents.py` |
+| Cost & interrupt | USD budgets + stop a run mid-flight | `14_cost_and_interrupt.py` |
+| Permissions | dynamic `can_use_tool` + tool allowlist | `15_permissions.py` |
+| Reliability | output validation + retry, fallback/retry models | `16_reliability.py` |
+| Eval | score golden cases, gate CI on pass rate | `17_eval.py` |
+| Verify | self-consistency + LLM-as-judge | `18_verification.py` |
+| Tracing | OpenTelemetry model/tool/run spans | `19_tracing.py` |
 ---
@@ -202,6 +244,12 @@ Run an example: `uv run python examples/01_hello_agent.py`.
 See [`RELEASING.md`](./RELEASING.md) for the publish process and
 [`PLAN.md`](./PLAN.md) for the roadmap.
+## Contributing
+Contributions are welcome! See [`CONTRIBUTING.md`](./CONTRIBUTING.md) for setup
+and the PR checklist, [`CODE_OF_CONDUCT.md`](./CODE_OF_CONDUCT.md), and
+[`SECURITY.md`](./SECURITY.md) for reporting vulnerabilities privately.
 ## License
 MIT — see [`LICENSE`](./LICENSE).
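
Reviewer note on the new "Make it production-safe" section: it describes validating the final answer and re-prompting on failure, bounded by `max_output_retries`. The loop below is a minimal sketch of that idea under stated assumptions, not the package's code. It uses only the standard library; `run_with_validation`, `ask`, and the scripted replies are hypothetical, and the model is simplified to a synchronous callable.

```python
import json
from typing import Any, Callable, Optional, Tuple


def run_with_validation(
    ask: Callable[[str], str],
    prompt: str,
    max_output_retries: int = 2,
) -> Tuple[Optional[Any], int]:
    """Return (parsed, attempts_used); parsed is None if every attempt fails."""
    current = prompt
    total_attempts = 1 + max_output_retries  # first try plus the retries
    for attempt in range(total_attempts):
        reply = ask(current)
        try:
            return json.loads(reply), attempt + 1
        except json.JSONDecodeError as exc:
            # Re-prompt with the validation error so the model can correct itself.
            current = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc.msg}). Reply with a JSON object only."
            )
    return None, total_attempts


# Hypothetical scripted model: the first reply is malformed, the second is valid.
replies = iter(["Sure! Here you go", '{"city": "Lisbon"}'])
parsed, attempts_used = run_with_validation(lambda _prompt: next(replies), "Where?")
```

Returning `None` after the last attempt is one design choice; raising an error or returning a failure status would work equally well for a caller that must not proceed on unvalidated output.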

{agentix_toolkit-0.1.0 → agentix_toolkit-0.2.0}/PLAN.md RENAMED

@@ -109,8 +109,26 @@ agentix/
 - **P10 — Cost + interrupt.** ✅ `pricing` (per-model table + `cost_usd`);
   `ModelResponse`/`AgentOutcome` carry `cost_usd` (Anthropic adapter fills it);
   `AgentPolicy.max_budget_usd` aborts; `Interrupt` stops a run/stream at a safe
-  boundary. Tests + example 14. (P8 — permission callbacks — still open; see
-  `PLAN.gaps.md`.)
+  boundary. Tests + example 14.
+- **P8 — Dynamic permissions.** ✅ `CallbackGuard(check)` (`can_use_tool`: a
+  per-call callback returning allow/deny/confirm or a bool) and
+  `ToolAllowlistGuard` (scope a run to a tool subset). Compose with the guard
+  pipeline. Tests + example 15.
+- **P11 — Reliability & correctness.** ✅ Output validation + retry
+  (`Agent(output_validator=, max_output_retries=)` → `outcome.parsed`;
+  `json_output`/`pydantic_output`/`regex_output`); resilient model wrappers
+  (`RetryModel`, `FallbackModel`); `SelfConsistencyModel` (majority vote);
+  `JudgeGuard` (LLM answer gate); structured-output passthrough on the Anthropic
+  adapter. Examples 16 + 18.
+- **P12 — Eval harness.** ✅ `agentix.evals`: `evaluate(dataset, agent, scorer=)`
+  → `EvalReport` (`pass_rate`, `format_success_rate`, `assert_pass_rate()` to
+  gate CI). Scorers: `exact_match`/`contains`/`regex_match`/`predicate`/
+  `llm_judge`. Tests + example 17.
+- **P13 — OpenTelemetry tracing.** ✅ `agentix.tracing` (`agentix[otel]`):
+  `TracingModel` (model spans), `tracing_events()` (tool spans + guard/confirm),
+  `trace_run()` (root span). Tests + example 19 (verified vs the real OTel SDK).
+  Roadmap remainders (prompt versioning, citation guard, eval loaders) in
+  `PLAN.gaps.md`.
 > ⚠️ Streaming caveat: `on_answer` egress guards (PII redaction) can't un-send
 > already-streamed deltas — deltas are raw; `Done.outcome.answer` is redacted.
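
Reviewer note on P11's `SelfConsistencyModel` (majority vote): the plan says it samples a model N times per turn and returns the most common answer. A minimal sketch of that voting step follows, assuming answers are compared as whitespace-stripped strings; `majority_vote` and the scripted sampler are hypothetical and do not reflect how the package normalizes or compares answers.

```python
from collections import Counter
from typing import Callable


def majority_vote(sample: Callable[[], str], n: int = 5) -> str:
    """Draw n samples and return the most frequent answer."""
    answers = [sample().strip() for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Hypothetical scripted sampler standing in for repeated model calls.
scripted = iter(["42", "41", "42 ", "42", "40"])
voted = majority_vote(lambda: next(scripted), n=5)
```

Exact string matching only suits short, canonical answers such as numbers or labels; free-form text would need a normalization or clustering step before voting.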

{agentix_toolkit-0.1.0 → agentix_toolkit-0.2.0}/README.md RENAMED

@@ -27,11 +27,17 @@ outcome = await agent.run("What's the weather in Lisbon?")
 - **Async-first** core loop (`run` / `stream` / `resume`) with a sync wrapper.
 - **Provider-agnostic** — bring any model; a real **Anthropic** adapter is included.
-- **Tools from type hints** — one `@tool` decorator generates the JSON schema.
-- **Security as a first-class, opt-in subsystem** — trust boundary, permission
-  tiers, confirmation, PII/injection guards, audit events.
-- **Scales** — streaming, checkpoint/resume, MCP tools, context trimming, and
-  fleet backpressure.
+- **Tools from type hints** — one `@tool` decorator generates the JSON schema;
+  **MCP** servers and **subagents** plug in as tools too.
+- **Security, opt-in** — trust boundary, permission tiers + dynamic
+  `can_use_tool` callbacks, PII/injection guards, human confirmation, audit events.
+- **Cost & control** — token **and USD** cost tracking, step/token/USD budgets,
+  cooperative `Interrupt`.
+- **Reliability** — output **validation + retry** (`outcome.parsed`), model
+  **fallback/retry**, self-consistency, and LLM-as-judge.
+- **Scale & ops** — streaming, checkpoint/resume, context trimming, fleet
+  backpressure, an **eval harness** (gate CI on quality), and **OpenTelemetry**
+  tracing.
 > Status: **alpha**, under active development. APIs may change before `1.0`.
@@ -46,9 +52,9 @@ The distribution is **`agentix-toolkit`**; you import it as **`agentix`**.
 With [uv](https://docs.astral.sh/uv/) (recommended):
 ```bash
-uv add agentix-toolkit                      # core
-uv add "agentix-toolkit[anthropic]"         # + Anthropic adapter
-uv add "agentix-toolkit[anthropic,mcp]"     # + MCP client support
+uv add agentix-toolkit                       # core (no required deps)
+uv add "agentix-toolkit[anthropic]"          # + Anthropic adapter
+uv add "agentix-toolkit[anthropic,mcp,otel]" # + MCP client + OpenTelemetry tracing
 ```
 Or with pip:
@@ -57,6 +63,9 @@ Or with pip:
 pip install "agentix-toolkit[anthropic]"
 ```
+Extras are opt-in: `anthropic` (the model adapter), `mcp` (MCP client),
+`otel` (OpenTelemetry tracing). The core has **no required dependencies**.
 ### 2. Run an agent with no API key
 `MockModel` is a scripted, dependency-free model — perfect for trying the loop
@@ -143,6 +152,30 @@ async for event in agent.stream("Tell me about Lisbon."):
         print("\n", event.outcome.status)
 ```
+### 6. Make it production-safe (validate output, fall back, cap cost)
+Stop malformed output from crashing downstream code: validate the final answer
+and re-prompt on failure. Add a fallback model and a USD budget for resilience.
+```python
+from agentix import Agent, AgentPolicy, FallbackModel, json_output
+agent = Agent(
+    model=FallbackModel([primary_model, backup_model]),  # survive a provider blip
+    system_prompt="Reply with a JSON object.",
+    tools=[...],
+    output_validator=json_output,        # or pydantic_output(MyModel)
+    max_output_retries=2,                # re-prompt the model on bad output
+    policy=AgentPolicy(max_budget_usd=0.50),  # abort if it gets expensive
+)
+outcome = await agent.run("...")
+outcome.parsed     # a validated object — safe to use; outcome.cost_usd is tracked
+```
+Then **gate quality in CI** with the eval harness — `evaluate(...)` runs your
+agent over golden cases and `assert_pass_rate(...)` fails the build on a
+regression (see `examples/17_eval.py`).
 ---
 ## Feature tour
@@ -158,6 +191,13 @@ Each links to a runnable example in [`examples/`](./examples):
 | Concurrency | `Limiter` + `bounded_gather` for fleets | `10_concurrency.py` |
 | MCP | use any MCP server's tools | `11_mcp.py` |
 | Context | bound the transcript (`TrimRounds`, …) | `12_context.py` |
+| Subagents | delegate a subtask to a child agent | `13_subagents.py` |
+| Cost & interrupt | USD budgets + stop a run mid-flight | `14_cost_and_interrupt.py` |
+| Permissions | dynamic `can_use_tool` + tool allowlist | `15_permissions.py` |
+| Reliability | output validation + retry, fallback/retry models | `16_reliability.py` |
+| Eval | score golden cases, gate CI on pass rate | `17_eval.py` |
+| Verify | self-consistency + LLM-as-judge | `18_verification.py` |
+| Tracing | OpenTelemetry model/tool/run spans | `19_tracing.py` |
 ---
@@ -176,6 +216,12 @@ Run an example: `uv run python examples/01_hello_agent.py`.
 See [`RELEASING.md`](./RELEASING.md) for the publish process and
 [`PLAN.md`](./PLAN.md) for the roadmap.
+## Contributing
+Contributions are welcome! See [`CONTRIBUTING.md`](./CONTRIBUTING.md) for setup
+and the PR checklist, [`CODE_OF_CONDUCT.md`](./CODE_OF_CONDUCT.md), and
+[`SECURITY.md`](./SECURITY.md) for reporting vulnerabilities privately.
 ## License
 MIT — see [`LICENSE`](./LICENSE).
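
Reviewer note on the eval harness mentioned in the README: the text says `evaluate(...)` runs an agent over golden cases and `assert_pass_rate(...)` fails the build on a regression, but the signatures are not visible in this diff. The sketch below illustrates the gating idea with standard-library code only. `GoldenCase`, `pass_rate`, `assert_min_pass_rate`, and `toy_agent` are hypothetical names chosen to avoid implying anything about the real API, and exact string match is an assumed scorer.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GoldenCase:
    prompt: str
    expected: str


def pass_rate(agent: Callable[[str], str], cases: Sequence[GoldenCase]) -> float:
    """Fraction of cases whose stripped answer exactly matches the expectation."""
    passed = sum(1 for case in cases if agent(case.prompt).strip() == case.expected)
    return passed / len(cases)


def assert_min_pass_rate(rate: float, threshold: float) -> None:
    """Raise AssertionError (failing a CI test) when quality regresses."""
    if rate < threshold:
        raise AssertionError(f"pass rate {rate:.2f} is below threshold {threshold:.2f}")


# Hypothetical agent that answers three of four golden cases correctly.
canned = {"2+2": "4", "capital of Portugal": "Lisbon", "3*3": "9", "10/4": "2"}


def toy_agent(prompt: str) -> str:
    return canned[prompt]


cases = [
    GoldenCase("2+2", "4"),
    GoldenCase("capital of Portugal", "Lisbon"),
    GoldenCase("3*3", "9"),
    GoldenCase("10/4", "2.5"),
]
rate = pass_rate(toy_agent, cases)
assert_min_pass_rate(rate, threshold=0.7)  # passes: 0.75 >= 0.7
```

Called from a pytest test, a gate like this turns a drop in answer quality into an ordinary test failure.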
