nighthawk-python 0.3.1__tar.gz → 0.4.0__tar.gz
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.claude/rules/docs.md +4 -2
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/CHANGELOG.md +10 -1
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/PKG-INFO +1 -1
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/api.md +4 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/coding-agent-backends.md +6 -6
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/for-coding-agents.md +139 -117
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/index.md +1 -13
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/providers.md +6 -6
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/quickstart.md +3 -9
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/roadmap.md +1 -1
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/tutorial.md +144 -34
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/pyproject.toml +1 -1
- nighthawk_python-0.4.0/src/nighthawk/testing.py +212 -0
- nighthawk_python-0.4.0/tests/docs/test_coding_agent_examples.py +263 -0
- nighthawk_python-0.4.0/tests/test_testing.py +556 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/uv.lock +4 -4
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.claude/rules/coding.md +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.claude/settings.json +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.claude/unset_envs.sh +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.devcontainer/Dockerfile.devcontainer +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.devcontainer/Dockerfile.litellm +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.devcontainer/devcontainer.json +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.devcontainer/docker-compose.yaml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.devcontainer/litellm-config.yaml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.github/dependabot.yml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.github/workflows/ci.yml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.github/workflows/docs.yml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.github/workflows/publish.yml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.gitignore +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/.python-version +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/AGENTS.md +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/CLAUDE.md +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/CONTRIBUTING.md +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/LICENSE +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/README.md +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/assets/nighthawk_logo-128x128.png +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/docs/design.md +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/mkdocs.yml +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/pyrightconfig.json +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/base.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/claude_code_cli.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/claude_code_sdk.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/codex.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/mcp_boundary.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/mcp_server.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/backends/tool_bridge.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/configuration.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/errors.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/identifier_path.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/json_renderer.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/natural/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/natural/blocks.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/natural/decorator.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/natural/transform.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/async_bridge.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/prompt.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/runner.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/scoping.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/step_context.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/step_contract.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/step_executor.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/runtime/tool_calls.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/tools/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/tools/assignment.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/tools/contracts.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/tools/execution.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/tools/provided.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/tools/registry.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/src/nighthawk/ulid.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/backends/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/backends/test_claude_code_cli.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/backends/test_claude_code_sdk.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/backends/test_codex.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/conftest.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/docs/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/docs/test_prompt_examples.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/prompt_test_helpers.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/stub_executor.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/test_execution_outcome_prompt_fragment.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/test_globals_prompt.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/test_natural_block_ordering.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/test_natural_traceback.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/test_runtime.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/execution/test_variables_prompt.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/skip_helpers.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/test_carry_pattern.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/test_claude_code_cli_integration.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/test_claude_code_sdk_integration.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/test_codex_integration.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/integration/test_llm_integration.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/natural/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/natural/test_blocks.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/public/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/public/test_public_api.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/public/test_readme_example.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/test_renderer.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/tools/__init__.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/tools/test_assignment_async.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/tools/test_contracts.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/tools/test_registry.py +0 -0
- {nighthawk_python-0.3.1 → nighthawk_python-0.4.0}/tests/tools/test_tool_boundary.py +0 -0
@@ -13,7 +13,7 @@ Each file has a distinct audience and scope. Content belongs in exactly one file
 |---|---|---|---|
 | `index.md` | First-time visitors | Project overview, motivation, workflow styles | What Nighthawk is and why. No API details, no how-to. |
 | `quickstart.md` | New users | Shortest path to running a Natural block | Setup, first example, backends table, credentials, troubleshooting. No deep explanations. |
-| `tutorial.md` | Users learning the system | Build understanding from first principles | Bindings,
+| `tutorial.md` | Users learning the system | Build understanding from first principles | Bindings, functions and discoverability, control flow, composition, configuration, guidelines. Assumes quickstart is done. |
 | `design.md` | Implementors and advanced users | Canonical specification (target behavior) | Full technical detail: syntax rules, state layers, prompt rendering, tool contracts, outcome schema, frontmatter. |
 | `providers.md` | Users choosing and configuring models | Provider selection, Pydantic AI setup, custom backends | Provider categories, capability matrix, model identifiers, Pydantic AI model settings, step executor protocols. No coding-agent-backend-specific content. |
 | `coding-agent-backends.md` | Users of Claude Code or Codex backends | Coding agent backend configuration and features | Backend-specific settings, skills, MCP tool exposure, working directory, project-scoped files. |
@@ -43,6 +43,8 @@ Each file has a distinct audience and scope. Content belongs in exactly one file
 - When tutorial.md and design.md cover the same concept, tutorial.md shows the "what and how" with examples; design.md specifies the "exact rules and edge cases".
 - Keep code examples self-contained: a reader should understand the example without reading surrounding prose.
 - Built-in tool names (`nh_eval`, `nh_exec`, `nh_assign`) are implementation details. Only `design.md` may expose them. All other files describe behavior instead (e.g., "the LLM can set a new value" rather than "use `nh_assign`").
+- `@nh.tool` is discouraged. Binding functions are the preferred callable exposure mechanism. `design.md` documents `@nh.tool` as part of the specification. `tutorial.md` may mention it with a "prefer binding functions" note. All other files should not add examples, recommendations, or references to `@nh.tool`.
+- The PyPI package name is `nighthawk-python`. Always use `nighthawk-python` (not `nighthawk`) in `pip install` commands and extras references (e.g., `nighthawk-python[claude-code-sdk]`).
 
 ### index.md specifics
 
@@ -91,7 +93,7 @@ Each file has a distinct audience and scope. Content belongs in exactly one file
 - This file should be self-contained: a coding agent reading only this file should be able to write correct Nighthawk code without consulting other docs.
 - This file is consumed standalone (`@docs/for-coding-agents.md` in CLAUDE.md/AGENTS.md, GitHub raw URL, etc.). Do not assume sibling files exist at relative paths.
 - All external references to other docs use absolute URLs based on `site_url` from `mkdocs.yml` (currently `https://kurusugawa-computer.github.io/nighthawk-python/`). If `site_url` changes, update the URLs in this file.
-- `@nh.tool`
+- `@nh.tool` must not appear in this file (see General rule on `@nh.tool`). Binding functions are the only callable exposure mechanism presented here.
 - Filter content for coding-agent relevance. Omit infrastructure-level concerns (scoped overrides parameter lists, exception hierarchy beyond `ExecutionError`, observability/tracing) that do not affect how an agent writes Natural blocks or binding functions. Mention existence and link to Tutorial or Design for details.
 
 ### api.md specifics
@@ -7,6 +7,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.4.0] - 2026-03-20
+
+### Added
+- `nighthawk.testing` module with test executors and convenience factories for deterministic Natural function testing without LLM API calls.
+
+### Changed
+- Rewrote testing documentation in `tutorial.md` (Section 8) and `for-coding-agents.md` (Section 8): replaced incorrect `TestModel` usage with `nighthawk.testing` utilities, added testing strategy guidance distinguishing mock tests (Python logic) from integration tests (Natural block judgment).
+
 ## [0.3.1] - 2026-03-19
 
 ### Changed
@@ -49,7 +57,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Step executor abstraction and provider integration foundation.
 - Core documentation and project scaffolding.
 
-[Unreleased]: https://github.com/kurusugawa-computer/nighthawk-python/compare/v0.
+[Unreleased]: https://github.com/kurusugawa-computer/nighthawk-python/compare/v0.4.0...HEAD
+[0.4.0]: https://github.com/kurusugawa-computer/nighthawk-python/compare/v0.3.1...v0.4.0
 [0.3.1]: https://github.com/kurusugawa-computer/nighthawk-python/compare/v0.3.0...v0.3.1
 [0.3.0]: https://github.com/kurusugawa-computer/nighthawk-python/compare/v0.2.0...v0.3.0
 [0.2.0]: https://github.com/kurusugawa-computer/nighthawk-python/compare/v0.1.0...v0.2.0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: nighthawk-python
-Version: 0.
+Version: 0.4.0
 Summary: An experimental Python library that embeds Natural blocks inside Python functions and executes them using an LLM.
 Project-URL: Repository, https://github.com/kurusugawa-computer/nighthawk-python
 Project-URL: Documentation, https://kurusugawa-computer.github.io/nighthawk-python/
@@ -4,7 +4,7 @@ The `claude-code-sdk`, `claude-code-cli`, and `codex` backends delegate Natural
 
 Minimal configuration:
 
-```
+```py
 from nighthawk.configuration import StepExecutorConfiguration
 
 # Claude Code (SDK)
@@ -62,7 +62,7 @@ pip install nighthawk-python[claude-code-sdk]
 
 ### Settings
 
-```
+```py
 from nighthawk.backends.claude_code_sdk import ClaudeCodeSdkModelSettings
 
 configuration = StepExecutorConfiguration(
@@ -79,7 +79,7 @@ configuration = StepExecutorConfiguration(
 
 | Field | Type | Default | Description |
 |---|---|---|---|
-| `permission_mode` | `"default"` \| `"acceptEdits"` \| `"plan"` \| `"bypassPermissions"` | `"default"` | Claude Code permission mode |
+| `permission_mode` | `"default"` \| `"acceptEdits"` \| `"plan"` \| `"bypassPermissions"` | `"default"` | Claude Code permission mode (always passed to the SDK) |
 | `setting_sources` | `list[SettingSource]` \| `None` | `None` | Setting source scopes to load (`SettingSource` is `"user"`, `"project"`, or `"local"`) |
 | `allowed_tool_names` | `tuple[str, ...]` \| `None` | `None` | Nighthawk tool names exposed to the model |
 | `claude_allowed_tool_names` | `tuple[str, ...]` \| `None` | `None` | Additional Claude Code native tool names to allow (SDK only; CLI does not support this field) |
@@ -108,7 +108,7 @@ The `claude` CLI must be installed separately (it is a system tool, not a Python
 
 ### Settings
 
-```
+```py
 from nighthawk.backends.claude_code_cli import ClaudeCodeCliModelSettings
 
 configuration = StepExecutorConfiguration(
@@ -153,7 +153,7 @@ pip install nighthawk-python[codex]
 
 ### Settings
 
-```
+```py
 from nighthawk.backends.codex import CodexModelSettings
 
 configuration = StepExecutorConfiguration(
@@ -213,7 +213,7 @@ Use the returned groups to set <:summary_markdown> as exactly 3 bullet points.
 
 Example Natural function that invokes the skill:
 
-```
+```py
 import nighthawk as nh
 
 @nh.natural_function
@@ -14,8 +14,6 @@ Key invariants:
 - Each Natural block executes independently. There is no implicit message history between blocks. Cross-block context must be explicit.
 - Write bindings (`<:name>`) are the only way the LLM commits values back into Python locals. The LLM is physically constrained to operate on interpreter-visible objects.
 
-## 2. When to use Natural blocks
-
 **Use Natural when the task requires LLM judgment** -- decisions that depend on interpretation, world knowledge, or subjective evaluation:
 
 - Classification and routing (e.g., categorize a support ticket).
@@ -32,7 +30,7 @@ Key invariants:
 
 **Decision rule:** if the correct output can be computed without an LLM, use Python. Natural blocks add latency, cost, and non-determinism.
 
-##
+## 2. Writing Natural blocks
 
 ### Anatomy
 
@@ -60,7 +58,11 @@ Each Natural block should make exactly one independent judgment. If a block make
 - Use f-string injection for static config, pre-formatted context, computed values.
 - Use `<name>` bindings for mutable state and objects the LLM needs to inspect or modify.
 
-
+### Async
+
+Async natural functions work identically to sync ones, with two additions: expressions evaluated by tools may use `await`, and return values that are awaitable are automatically awaited before validation.
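The second addition ("awaitable return values are awaited before validation") can be pictured with a small standalone sketch. This is not Nighthawk's internal code: `resolve_return_value` and `fetch_summary` are hypothetical names, and the sketch only illustrates the general "await if awaitable" idea using the standard library.

```python
import asyncio
import inspect


async def resolve_return_value(value: object) -> object:
    """Hypothetical helper: await the value only when it is awaitable."""
    if inspect.isawaitable(value):
        return await value
    return value


async def fetch_summary() -> str:
    # Stand-in for an async binding function (an assumption for this sketch).
    await asyncio.sleep(0)
    return "summary"


async def main() -> tuple[object, object]:
    # A plain value passes through unchanged.
    plain = await resolve_return_value("already a value")
    # A coroutine object is awaited before it is handed back.
    awaited = await resolve_return_value(fetch_summary())
    return plain, awaited


plain_result, awaited_result = asyncio.run(main())
```

In both cases the caller ends up with a concrete value, which is what makes a later validation step meaningful.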
+
+## 3. Designing binding functions
 
 Binding functions (local or module-level callables) are the preferred way to expose functions to the LLM. The LLM discovers them from the LOCALS/GLOBALS sections of the prompt, rendered as their signature with the first docstring line as `# intent:`.
 
@@ -68,21 +70,11 @@ Binding functions (local or module-level callables) are the preferred way to exp
 
 Module-level names that are stable across invocations (constants, classes, utility functions) should stay in GLOBALS via `<name>` read bindings. Reserve function parameters for data that genuinely varies per call.
 
-
-
-
-@nh.natural_function
-async def summarize(query: str, fetch_data: object) -> str:
-    result = ""
-    """natural
-    Use <fetch_data> to get data for <query> and set <:result>.
-    """
-    return result
-```
-
-Correct -- `fetch_data` keeps its full signature in GLOBALS:
+```py
+# Wrong -- fetch_data loses its signature in LOCALS:
+async def summarize(query: str, fetch_data: object) -> str: ...
 
-
+# Correct -- fetch_data keeps its full signature in GLOBALS:
 @nh.natural_function
 async def summarize(query: str) -> str:
     result = ""
@@ -96,7 +88,7 @@ async def summarize(query: str) -> str:
 
 Each parameter in a binding function signature is a decision point the LLM must evaluate. Compose complex operations in Python and expose simple binding functions:
 
-```
+```py
 # Wrong -- too many parameters
 def find_items(category: str, min_score: float, max_score: float,
                tags: list[str], created_after: str, sort_by: str) -> list[dict]:
@@ -114,7 +106,7 @@ def find_top_items(category: str) -> list[dict]:
 
 Write short docstrings explaining intent and boundaries. The first line appears as `# intent:` in the prompt. Clear function names and accurate type annotations complete discoverability.
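To see why the first docstring line and the type annotations matter, here is a rough sketch of how a signature and a one-line intent could be extracted with the standard library. The exact prompt layout is defined by Nighthawk and is not shown in this diff; `render_binding` and its output format are assumptions made for illustration, and the docstring text is invented.

```python
import inspect
from typing import Callable


def render_binding(function: Callable[..., object]) -> str:
    """Hypothetical renderer: signature plus the first docstring line."""
    signature = inspect.signature(function)
    docstring = inspect.getdoc(function) or ""
    first_line = docstring.splitlines()[0] if docstring else ""
    rendered = f"{function.__name__}{signature}"
    if first_line:
        # Only the first line is kept, so it has to carry the intent.
        rendered += f"  # intent: {first_line}"
    return rendered


def find_top_items(category: str) -> list[dict]:
    """Return the highest-scoring items in a category.

    Longer notes like this paragraph would not reach the one-line summary.
    """
    return []


rendered_binding = render_binding(find_top_items)
```

Anything beyond the first docstring line is dropped in this sketch, which is the practical reason to lead with intent and boundaries.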
 
-##
+## 4. Control flow and error handling
 
 ### Outcomes
 
@@ -132,7 +124,7 @@ Each Natural block returns exactly one outcome:
 
 Restrict allowed outcomes with YAML frontmatter:
 
-```
+```py
 """natural
 ---
 deny: [raise, return]
@@ -145,7 +137,7 @@ Read <text> and set <:result> to a summary.
 
 The LLM signals errors via the `raise` outcome. Catch with standard Python:
 
-```
+```py
 try:
     validate(data)
 except nh.ExecutionError as e:
@@ -154,13 +146,13 @@ except nh.ExecutionError as e:
 
 Custom exception types referenced in step locals or globals are available as raise targets. Catch `nh.ExecutionError` for Natural block failures; all Nighthawk exceptions inherit from `nh.NighthawkError`.
 
-##
+## 5. Cross-block composition
 
 ### The carry pattern
 
 Pass a mutable object as a read binding (`<carry>`, not `<:carry>`) and instruct the LLM to mutate it in-place:
 
-```
+```py
 @nh.natural_function
 def step_1(carry: list[str]) -> int:
     result = 0
@@ -177,35 +169,16 @@ r2 = step_2(carry) # carry now has 2 entries
 
 Critical: use `<carry>` (read binding), not `<:carry>` (write binding). Read bindings prevent rebinding, preserving the caller's reference.
 
-
+- Branch by copying the carry (`carry_a = carry.copy()`). Each copy continues independently.
+- When the carry's token footprint is too large, inject context via f-string instead ([Section 2](#interpolation)).
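The difference between mutating and rebinding, and the effect of branching by copy, can be reproduced in plain Python. This is an analogy only; no Nighthawk code is involved, and the function names are hypothetical.

```python
def append_in_place(carry: list[str]) -> None:
    # Mutation: the caller's list object itself changes.
    carry.append("note from step")


def rebind_locally(carry: list[str]) -> None:
    # Rebinding: only the local name points at the new list,
    # so the caller never sees "lost note".
    carry = carry + ["lost note"]


carry = ["initial context"]
append_in_place(carry)
rebind_locally(carry)

# Branching: each copy continues independently of the other.
carry_a = carry.copy()
carry_b = carry.copy()
carry_a.append("branch a")
carry_b.append("branch b")
```

The caller's `carry` keeps the in-place addition but not the rebound list, which is the behavior the read-binding rule is meant to preserve.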
 
-
-
-```python
-carry_a = carry.copy()
-carry_b = carry.copy()
-result_a = branch_add(carry_a)
-result_b = branch_multiply(carry_b)
-```
-
-### f-string injection as alternative
-
-When the carry's locals summary footprint is too large, inject pre-formatted context via f-string:
-
-```python
-f"""natural
-Prior context: {context_text}
-Set <:result> based on the context.
-"""
-```
-
-## 7. Execution configuration
+## 6. Execution configuration
 
 ### Run context
 
 Natural functions must be called inside `with nh.run(step_executor):`. For backend-specific settings, see [Coding agent backends](https://kurusugawa-computer.github.io/nighthawk-python/coding-agent-backends/).
 
-```
+```py
 step_executor = nh.AgentStepExecutor.from_configuration(
     configuration=nh.StepExecutorConfiguration(model="openai-responses:gpt-5-mini"),
 )
@@ -217,7 +190,7 @@ Use `nh.scope()` to override model, prompts, or context limits within an existin
 
 LOCALS and GLOBALS sections are bounded by `StepContextLimits`. When bindings are missing or truncated (`<snipped>`), adjust the limits:
 
-```
+```py
 configuration = nh.StepExecutorConfiguration(
     model="openai-responses:gpt-5-mini",
     context_limits=nh.StepContextLimits(
@@ -227,113 +200,162 @@ configuration = nh.StepExecutorConfiguration(
 )
 ```
 
-##
+## 7. Testing
+
+### Testing strategy
+
+Mock tests exercise the Python logic around Natural blocks -- control flow, error handling, composition, binding wiring. They do **not** exercise the Natural blocks themselves. Since Natural blocks are the core of a Nighthawk application, mock tests alone are insufficient.
+
+| Layer | What it tests | What it cannot test |
+|---|---|---|
+| **Mock tests** (`nighthawk.testing`) | Python logic: control flow, error handling, composition, binding wiring | Natural block effectiveness, prompt quality, LLM behavior |
+| **Integration tests** (real LLM) | Whether the Natural block text actually produces correct judgments | Deterministic reproducibility (LLMs are non-deterministic) |
+
+**Guideline:** use mock tests to lock down the deterministic Python shell, then use integration tests to validate that each Natural block's prompt elicits the intended judgment. Do not rely on mock tests as the primary quality gate -- a mock test passes even when the Natural block text is completely wrong.
 
-
+### Mock tests
 
-
-from nighthawk.runtime.step_executor import AgentStepExecutor
-from nighthawk.configuration import StepExecutorConfiguration
-from pydantic_ai.models.test import TestModel
+`ScriptedExecutor` returns scripted responses and records every call. Use it for Python logic that surrounds Natural blocks.
 
-
-
+```py
+from nighthawk.testing import ScriptedExecutor, pass_response, raise_response
 
+executor = ScriptedExecutor(responses=[
+    pass_response(result="Three key points: ..."),
+])
 with nh.run(executor):
-
-
+    output = summarize("long document")
+
+assert output == "Three key points: ..."
+
+# Inspect what was passed to the executor
+call = executor.calls[0]
+assert "result" in call.binding_names # write binding registered
+assert call.step_locals["text"] == "long document" # locals visible
 ```
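The record-and-replay idea behind a scripted executor can be sketched without Nighthawk. The class below is a hypothetical stand-in, not the `nighthawk.testing` implementation: `FakeScriptedExecutor`, `RecordedCall`, and `run_step` are invented names, and the explicit `executor` parameter replaces the `with nh.run(...)` context purely to keep the sketch self-contained.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RecordedCall:
    """What the fake executor saw for one step (hypothetical shape)."""
    step_locals: dict[str, Any]
    binding_names: tuple[str, ...]


@dataclass
class FakeScriptedExecutor:
    """Replays canned binding values in order and records every call."""
    responses: list[dict[str, Any]]
    calls: list[RecordedCall] = field(default_factory=list)

    def run_step(self, step_locals: dict[str, Any], binding_names: list[str]) -> dict[str, Any]:
        # Record first, so a failing step is still visible in `calls`.
        self.calls.append(RecordedCall(dict(step_locals), tuple(binding_names)))
        if not self.responses:
            raise RuntimeError("no scripted response left")
        return self.responses.pop(0)


def summarize(text: str, executor: FakeScriptedExecutor) -> str:
    # The "judgment" is replaced by whatever the script says.
    bindings = executor.run_step({"text": text}, ["result"])
    return bindings["result"]


executor = FakeScriptedExecutor(responses=[{"result": "Three key points: ..."}])
output = summarize("long document", executor)
```

Because the response is canned, a test built this way checks only the surrounding Python wiring, which matches the limitation stated in the testing strategy.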
|
|
246
236
|
|
|
247
|
-
|
|
237
|
+
For multi-step functions, pass `default_response` to avoid enumerating every response:
|
|
248
238
|
|
|
249
|
-
|
|
239
|
+
```py
|
|
240
|
+
executor = ScriptedExecutor(default_response=pass_response(result=""))
|
|
241
|
+
```
|
|
250
242
|
|
|
251
|
-
|
|
243
|
+
#### Outcome factories
|
|
252
244
|
|
|
253
|
-
|
|
254
|
-
|
|
245
|
+
| Factory | Outcome | Use case |
|
|
246
|
+
|---|---|---|
|
|
247
|
+
| `pass_response(**bindings)` | pass | Normal completion with binding values |
|
|
248
|
+
| `raise_response(message, *, error_type=None)` | raise | Test error handling paths |
|
|
249
|
+
| `return_response(reference_path, **bindings)` | return | Early return from Natural function |
|
|
250
|
+
| `break_response()` | break | Exit enclosing loop |
|
|
251
|
+
| `continue_response()` | continue | Skip to next iteration |
|
|
252
|
+
|
|
253
|
+
```py
|
|
254
|
+
# Error handling:
|
|
255
|
+
executor = ScriptedExecutor(responses=[
|
|
256
|
+
raise_response("invalid input", error_type="ValueError"),
|
|
257
|
+
])
|
|
258
|
+
|
|
259
|
+
# Early return:
|
|
260
|
+
executor = ScriptedExecutor(responses=[
|
|
261
|
+
return_response("result", result="early exit"),
|
|
262
|
+
])
|
|
263
|
+
```
|
|
255
264
|
|
|
256
|
-
|
|
257
|
-
approved: bool
|
|
258
|
-
reason: str
|
|
259
|
-
risk_level: str
|
|
265
|
+
#### Callback executor
|
|
260
266
|
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
|
|
264
|
-
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
267
|
+
`CallbackExecutor` delegates to a callback when response logic depends on input. Like `ScriptedExecutor`, it records calls in `executor.calls`:
|
|
268
|
+
|
|
269
|
+
```py
|
|
270
|
+
from nighthawk.testing import CallbackExecutor, StepCall, StepResponse
|
|
271
|
+
|
|
272
|
+
def handler(call: StepCall) -> StepResponse:
|
|
273
|
+
text = call.step_locals.get("text", "")
|
|
274
|
+
if isinstance(text, str) and "urgent" in text:
|
|
275
|
+
return pass_response(priority="high")
|
|
276
|
+
return pass_response(priority="normal")
|
|
277
|
+
|
|
278
|
+
executor = CallbackExecutor(handler)
|
|
279
|
+
with nh.run(executor):
|
|
280
|
+
assert triage("urgent outage") == "high"
|
|
268
281
|
```
|
|
269
282
|
|
|
270
|
-
|
|
283
|
+
#### Binding wiring verification
|
|
271
284
|
|
|
272
|
-
|
|
273
|
-
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
| Forget type annotations on write bindings | No validation or coercion at commit time | Always annotate `<:name>` bindings |
|
|
279
|
-
| Duplicate module-level constants as function parameters | Moves stable values from GLOBALS to LOCALS, wastes tokens | Reference via `<name>` read binding |
|
|
285
|
+
Use recorded calls to verify that the right data is visible to the LLM:
|
|
286
|
+
|
|
287
|
+
```py
|
|
288
|
+
executor = ScriptedExecutor(responses=[pass_response(result="")])
|
|
289
|
+
with nh.run(executor):
|
|
290
|
+
process(query="test")
|
|
280
291
|
|
|
281
|
-
|
|
292
|
+
call = executor.calls[0]
|
|
293
|
+
assert "helper" in call.step_globals # binding function visible in GLOBALS
|
|
294
|
+
assert "query" in call.step_locals # parameter visible in LOCALS
|
|
295
|
+
assert "result" in call.binding_names # write binding registered
|
|
296
|
+
```
|
|
282
297
|
|
|
283
|
-
###
|
|
298
|
+
### Integration tests
|
|
284
299
|
|
|
285
|
-
|
|
286
|
-
import nighthawk as nh
|
|
300
|
+
Integration tests call a real LLM and validate the judgment. This is where Natural block quality is actually tested.
|
|
287
301
|
|
|
302
|
+
```py
|
|
288
303
|
step_executor = nh.AgentStepExecutor.from_configuration(
|
|
289
304
|
configuration=nh.StepExecutorConfiguration(model="openai-responses:gpt-5-mini"),
|
|
290
305
|
)
|
|
291
306
|
with nh.run(step_executor):
|
|
292
|
-
|
|
307
|
+
verdict = judge_review("The code has no error handling and uses eval().")
|
|
308
|
+
|
|
309
|
+
assert not verdict.approved
|
|
310
|
+
assert verdict.risk_level in ("high", "critical")
|
|
293
311
|
```
|
|
294
312
|
|
|
295
|
-
|
|
313
|
+
For structured outputs, assert on type, value range, and semantic consistency rather than exact string matches. LLMs are non-deterministic; brittle equality checks cause flaky tests.
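As a rough illustration of the advice above, the sketch below checks a structured result by type, value range, and internal consistency instead of comparing text. The `Verdict` dataclass, the `ALLOWED_RISK_LEVELS` vocabulary, and the rule that a high-risk verdict should not be approved are all assumptions made for this example; none of them come from Nighthawk.

```py
from dataclasses import dataclass


# Hypothetical stand-in for a structured LLM output; not part of Nighthawk.
@dataclass
class Verdict:
    approved: bool
    reason: str
    risk_level: str


# Assumed vocabulary of risk levels, for illustration only.
ALLOWED_RISK_LEVELS = {"low", "medium", "high", "critical"}


def check_verdict(verdict: Verdict) -> None:
    """Assert on type, value range, and consistency, never on exact wording."""
    # Type checks: the fields have the expected Python types.
    assert isinstance(verdict.approved, bool)
    assert isinstance(verdict.reason, str) and verdict.reason.strip()
    # Value range: the level is one of the allowed labels.
    assert verdict.risk_level in ALLOWED_RISK_LEVELS
    # Semantic consistency (assumed rule): high risk should not be approved.
    if verdict.risk_level in ("high", "critical"):
        assert not verdict.approved


# Two differently worded outputs satisfy the same checks.
check_verdict(Verdict(False, "Uses eval() on untrusted input.", "critical"))
check_verdict(Verdict(False, "No error handling; eval() is unsafe.", "high"))
```

Because the reason text is only required to be non-empty, the test keeps passing when the model rephrases its explanation between runs.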
|
|
296
314
|
|
|
297
|
-
|
|
298
|
-
|
|
299
|
-
|
|
300
|
-
|
|
301
|
-
|
|
302
|
-
|
|
303
|
-
|
|
304
|
-
|
|
315
|
+
Gate integration tests behind an environment variable so they do not run in every CI job:
|
|
316
|
+
|
|
317
|
+
```py
|
|
318
|
+
import os
|
|
319
|
+
import pytest
|
|
320
|
+
|
|
321
|
+
if os.getenv("NIGHTHAWK_RUN_INTEGRATION_TESTS") != "1":
|
|
322
|
+
    pytest.skip("Integration tests disabled", allow_module_level=True)
|
|
305
323
|
```
|
|
306
324
|
|
|
307
|
-
|
|
325
|
+
## 8. Type boundary placement
|
|
308
326
|
|
|
309
|
-
|
|
327
|
+
For deterministic functions (no Natural blocks), the type boundary is at the function entry point -- use typed inputs.
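A minimal sketch of the deterministic case, using plain Python only. `LineItem` and `order_total_cents` are hypothetical names chosen for this example; the point is that the caller converts raw data into typed values before the call, so the function body needs no interpretation step.

```py
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:  # hypothetical type, for illustration only
    unit_price_cents: int
    quantity: int


def order_total_cents(items: list[LineItem]) -> int:
    """Deterministic: the type boundary sits at the function signature."""
    return sum(item.unit_price_cents * item.quantity for item in items)


total = order_total_cents([LineItem(250, 2), LineItem(100, 3)])
```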
|
|
310
328
|
|
|
311
|
-
|
|
312
|
-
@nh.natural_function
|
|
313
|
-
async def my_async_function(text: str) -> str:
|
|
314
|
-
result: str = ""
|
|
315
|
-
"""natural
|
|
316
|
-
Summarize <text> and set <:result>.
|
|
317
|
-
"""
|
|
318
|
-
return result
|
|
319
|
-
```
|
|
329
|
+
For judgment-heavy functions (containing Natural blocks), the type boundary moves inside the function. Accept flexible inputs at the entry point and let the Natural block interpret them into typed intermediates via write bindings:
|
|
320
330
|
|
|
321
|
-
|
|
331
|
+
```py
|
|
332
|
+
from pydantic import BaseModel
|
|
322
333
|
|
|
323
|
-
|
|
324
|
-
|
|
325
|
-
|
|
326
|
-
|
|
334
|
+
class ReviewVerdict(BaseModel):
|
|
335
|
+
    approved: bool
|
|
336
|
+
    reason: str
|
|
337
|
+
    risk_level: str
|
|
327
338
|
|
|
328
339
|
@nh.natural_function
|
|
329
|
-
def
|
|
330
|
-
|
|
340
|
+
def judge_review(review_data: str | nh.JsonableValue) -> ReviewVerdict:
|
|
341
|
+
verdict: ReviewVerdict
|
|
331
342
|
"""natural
|
|
332
|
-
|
|
343
|
+
Analyze <review_data> and produce a structured <:verdict>.
|
|
333
344
|
"""
|
|
334
|
-
return
|
|
345
|
+
return verdict
|
|
335
346
|
```
|
|
336
347
|
|
|
348
|
+
## 9. Common mistakes to avoid
|
|
349
|
+
|
|
350
|
+
| Mistake | Why it breaks | Fix |
|
|
351
|
+
|---|---|---|
|
|
352
|
+
| Pass a callable as a parameter with generic type (`object`, `Any`) | Signature erased in LOCALS; LLM cannot discover arguments | Reference via `<name>` read binding so it appears in GLOBALS with full signature |
|
|
353
|
+
| Use `<:carry>` (write binding) for mutable context | Rebinding breaks the caller's reference | Use `<carry>` (read binding); mutate in-place |
|
|
354
|
+
| Put two independent judgments in one block | Non-deterministic, hard to test, unclear contract | Split into two blocks connected by Python |
|
|
355
|
+
| Use Natural for deterministic computation | Wastes latency/cost, adds non-determinism | Use Python |
|
|
356
|
+
| Forget type annotations on write bindings | No validation or coercion at commit time | Always annotate `<:name>` bindings |
|
|
357
|
+
| Duplicate module-level constants as function parameters | Moves stable values from GLOBALS to LOCALS, wastes tokens | Reference via `<name>` read binding |
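
The `<:carry>` row in the table above appears to rest on ordinary Python reference semantics. The sketch below shows that behaviour in plain Python, without Nighthawk; the function names and the `context` list are hypothetical and exist only for this example.

```py
def rebind(carry: list[str]) -> None:
    # Rebinding: the local name now points at a new list object.
    # The caller's list is untouched, so the update is lost.
    carry = carry + ["step 1 done"]


def mutate_in_place(carry: list[str]) -> None:
    # In-place mutation: the caller's own list object changes.
    carry.append("step 1 done")


context: list[str] = []

rebind(context)
after_rebind = list(context)  # snapshot taken after the rebinding call

mutate_in_place(context)
after_mutation = list(context)  # snapshot taken after the in-place call
```

Comparing the two snapshots shows why a shared mutable context is safer to mutate than to reassign: only the in-place version is visible to the caller.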
|
|
358
|
+
|
|
337
359
|
## References
|
|
338
360
|
|
|
339
361
|
- [Tutorial](https://kurusugawa-computer.github.io/nighthawk-python/tutorial/) -- learn Nighthawk from first principles (human-oriented).
|
|
@@ -127,19 +127,7 @@ calculate_average([1, "2", "three", "cuatro", "五"]) # 3.0
|
|
|
127
127
|
|
|
128
128
|
## Natural blocks
|
|
129
129
|
|
|
130
|
-
A Natural block is a Python docstring or a standalone string literal
|
|
131
|
-
|
|
132
|
-
Bindings:
|
|
133
|
-
|
|
134
|
-
- `<name>` is a read binding.
|
|
135
|
-
- `<:name>` is a write binding.
|
|
136
|
-
|
|
137
|
-
Write bindings control which values are committed back into Python locals at Natural block boundaries.
|
|
138
|
-
|
|
139
|
-
Interpolation:
|
|
140
|
-
|
|
141
|
-
- Natural blocks are literal by default. Interpolation is opt-in via f-string syntax.
|
|
142
|
-
- See [Tutorial Section 2](tutorial.md#2-providing-data-to-a-block) for details.
|
|
130
|
+
A Natural block is a Python docstring or a standalone string literal beginning with `natural\n`. Inside the block, `<name>` read bindings expose Python values to the LLM, and `<:name>` write bindings let the LLM commit values back into Python locals. Natural blocks are literal by default; interpolation is opt-in via f-string syntax. See the [Tutorial](tutorial.md#2-providing-data-to-a-block) for details.
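To make the binding syntax concrete, here is a small sketch that lists the read and write bindings in a block's text. It is an illustration only and not Nighthawk's actual parser: the regular expression assumes binding names are simple Python identifiers, and `list_bindings` is a hypothetical helper.

```py
import re

# Assumption: binding names are simple Python identifiers.
# This pattern is illustrative only, not Nighthawk's real parser.
BINDING_PATTERN = re.compile(r"<(:?)([A-Za-z_][A-Za-z0-9_]*)>")


def list_bindings(block: str) -> tuple[list[str], list[str]]:
    """Return (read_names, write_names) found in a Natural block's text."""
    reads: list[str] = []
    writes: list[str] = []
    for colon, name in BINDING_PATTERN.findall(block):
        # A leading colon marks a write binding; otherwise it is a read.
        (writes if colon else reads).append(name)
    return reads, writes


block = "natural\nSummarize <text> and set <:result>.\n"
reads, writes = list_bindings(block)
```

Here `text` is collected as a read binding and `result` as a write binding, matching the `<name>` and `<:name>` forms described above.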
|
|
143
131
|
|
|
144
132
|
## References
|
|
145
133
|
|
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
Nighthawk delegates Natural block execution to an LLM. The model is selected through the `model` field of `StepExecutorConfiguration` using the `provider:model` format:
|
|
4
4
|
|
|
5
|
-
```
|
|
5
|
+
```py
|
|
6
6
|
from nighthawk.configuration import StepExecutorConfiguration
|
|
7
7
|
|
|
8
8
|
configuration = StepExecutorConfiguration(model="openai-responses:gpt-5-nano")
|
|
@@ -36,7 +36,7 @@ Any provider that [Pydantic AI supports](https://ai.pydantic.dev/models/overview
|
|
|
36
36
|
|
|
37
37
|
Examples:
|
|
38
38
|
|
|
39
|
-
```
|
|
39
|
+
```py
|
|
40
40
|
# OpenAI
|
|
41
41
|
configuration = StepExecutorConfiguration(model="openai-responses:gpt-5-nano")
|
|
42
42
|
|
|
@@ -71,7 +71,7 @@ See the [Pydantic AI documentation](https://ai.pydantic.dev/models/overview/) fo
|
|
|
71
71
|
|
|
72
72
|
Pydantic AI providers accept standard Pydantic AI model settings via the `model_settings` field:
|
|
73
73
|
|
|
74
|
-
```
|
|
74
|
+
```py
|
|
75
75
|
configuration = StepExecutorConfiguration(
|
|
76
76
|
    model="openai-responses:gpt-5-nano",
|
|
77
77
|
    model_settings={"temperature": 0.5},
|
|
@@ -80,7 +80,7 @@ configuration = StepExecutorConfiguration(
|
|
|
80
80
|
|
|
81
81
|
## Coding agent backends
|
|
82
82
|
|
|
83
|
-
The `claude-code-sdk`, `claude-code-cli`, and `codex` backends implement the Pydantic AI `Model` protocol internally but delegate inference to a coding agent CLI rather than a Pydantic AI provider. Install with `nighthawk[claude-code-sdk]`, `nighthawk[claude-code-cli]`, or `nighthawk[codex]`. See [Coding agent backends](coding-agent-backends.md) for configuration, skill behavior, and backend-specific settings.
|
|
83
|
+
The `claude-code-sdk`, `claude-code-cli`, and `codex` backends implement the Pydantic AI `Model` protocol internally but delegate inference to a coding agent CLI rather than a Pydantic AI provider. Install with `nighthawk-python[claude-code-sdk]`, `nighthawk-python[claude-code-cli]`, or `nighthawk-python[codex]`. See [Coding agent backends](coding-agent-backends.md) for configuration, skill behavior, and backend-specific settings.
|
|
84
84
|
|
|
85
85
|
## Custom backends
|
|
86
86
|
|
|
@@ -88,7 +88,7 @@ Nighthawk's `SyncStepExecutor` and `AsyncStepExecutor` protocols define the step
|
|
|
88
88
|
|
|
89
89
|
For most cases, wrap a Pydantic AI `Agent` using `AgentStepExecutor`:
|
|
90
90
|
|
|
91
|
-
```
|
|
91
|
+
```py
|
|
92
92
|
from pydantic_ai import Agent
|
|
93
93
|
from nighthawk.runtime.step_executor import AgentStepExecutor
|
|
94
94
|
|
|
@@ -98,7 +98,7 @@ executor = AgentStepExecutor.from_agent(agent=agent)
|
|
|
98
98
|
|
|
99
99
|
For full control, implement `AsyncStepExecutor` (or `SyncStepExecutor` for synchronous use) directly:
|
|
100
100
|
|
|
101
|
-
```
|
|
101
|
+
```py
|
|
102
102
|
from nighthawk.runtime.step_executor import AsyncStepExecutor
|
|
103
103
|
from nighthawk.runtime.step_context import StepContext
|
|
104
104
|
from nighthawk.runtime.step_contract import StepOutcome
|
|
@@ -78,8 +78,11 @@ See [Providers](providers.md) for the default and recommended models.
|
|
|
78
78
|
Credential configuration for Pydantic AI providers follows [Pydantic AI conventions](https://ai.pydantic.dev/models/overview/). Common environment variables:
|
|
79
79
|
|
|
80
80
|
- `OPENAI_API_KEY` — required for OpenAI models ([details](https://ai.pydantic.dev/models/openai/))
|
|
81
|
+
- `ANTHROPIC_API_KEY` — required for Anthropic models ([details](https://ai.pydantic.dev/models/anthropic/))
|
|
81
82
|
- `GOOGLE_API_KEY` — required for Google AI (Gemini API) models ([details](https://ai.pydantic.dev/models/gemini/))
|
|
82
83
|
- Google Vertex AI uses Application Default Credentials, not an API key ([details](https://ai.pydantic.dev/models/gemini/#vertex-ai))
|
|
84
|
+
- AWS Bedrock uses AWS credentials, not an API key ([details](https://ai.pydantic.dev/models/bedrock/))
|
|
85
|
+
- `GROQ_API_KEY` — required for Groq models ([details](https://ai.pydantic.dev/models/groq/))
|
|
83
86
|
|
|
84
87
|
## Safety model
|
|
85
88
|
|
|
@@ -105,12 +108,3 @@ Set the environment variable before running: `export OPENAI_API_KEY=sk-xxxxxxxxx
|
|
|
105
108
|
|
|
106
109
|
Install the required provider package. For Pydantic AI providers: `pip install pydantic-ai-slim[openai]`. For coding agent backends: `pip install nighthawk-python[claude-code-sdk]`.
|
|
107
110
|
|
|
108
|
-
## Next Steps
|
|
109
|
-
|
|
110
|
-
- **[Tutorial](tutorial.md)** — Learn Nighthawk from first principles.
|
|
111
|
-
- **[Providers](providers.md)** — LLM providers and configuration.
|
|
112
|
-
- **[Coding agent backends](coding-agent-backends.md)** — Claude Code and Codex backend configuration.
|
|
113
|
-
- **[Design](design.md)** — Canonical specification.
|
|
114
|
-
- **[API Reference](api.md)** — Auto-generated API documentation.
|
|
115
|
-
- **[Roadmap](roadmap.md)** — Future directions.
|
|
116
|
-
- **[For coding agents](for-coding-agents.md)** — Nighthawk development guide for coding agents (LLM reference).
|
|
@@ -70,4 +70,4 @@ The f-string binding span validation uses a NUL byte (`\x00`) as a placeholder f
|
|
|
70
70
|
## Open questions
|
|
71
71
|
|
|
72
72
|
- How to best represent tool results in the prompt for robust reasoning.
|
|
73
|
-
- How to debug Natural blocks deterministically (unit testing is addressed via `
|
|
73
|
+
- How to debug Natural blocks deterministically (unit testing is addressed via `nighthawk.testing`; debugging the LLM's reasoning path remains open).
|