agentme 0.9.0 → 0.11.0
- package/.filedist-package.yml +2 -1
- package/.xdrs/agentme/edrs/application/003-javascript-project-tooling.md +41 -5
- package/.xdrs/agentme/edrs/application/010-golang-project-tooling.md +39 -15
- package/.xdrs/agentme/edrs/application/014-python-project-tooling.md +63 -5
- package/.xdrs/agentme/edrs/application/015-cli-tool-standards.md +25 -24
- package/.xdrs/agentme/edrs/application/018-ai-agent-development-standards.md +57 -64
- package/.xdrs/agentme/edrs/application/019-ml-dataset-structure.md +12 -12
- package/.xdrs/agentme/edrs/application/020-ai-agent-xdrs-knowledge-layer.md +99 -0
- package/.xdrs/agentme/edrs/application/021-pragmatic-hexagonal-architecture.md +112 -0
- package/.xdrs/agentme/edrs/application/skills/001-create-javascript-project/SKILL.md +26 -11
- package/.xdrs/agentme/edrs/application/skills/003-create-golang-project/SKILL.md +31 -14
- package/.xdrs/agentme/edrs/application/skills/005-create-python-project/SKILL.md +56 -23
- package/.xdrs/agentme/edrs/devops/005-monorepo-structure.md +1 -1
- package/.xdrs/agentme/edrs/devops/006-github-pipelines.md +1 -1
- package/.xdrs/agentme/edrs/devops/008-common-targets.md +2 -1
- package/.xdrs/agentme/edrs/devops/017-tool-execution-and-scripting.md +1 -1
- package/.xdrs/agentme/edrs/governance/013-contributing-guide-requirements.md +1 -1
- package/.xdrs/agentme/edrs/index.md +3 -1
- package/.xdrs/agentme/edrs/observability/011-service-health-check-endpoint.md +1 -1
- package/.xdrs/agentme/edrs/principles/002-coding-best-practices.md +1 -1
- package/.xdrs/agentme/edrs/principles/004-unit-test-requirements.md +23 -3
- package/.xdrs/agentme/edrs/principles/007-project-quality-standards.md +31 -3
- package/.xdrs/agentme/edrs/principles/009-error-handling.md +1 -1
- package/.xdrs/agentme/edrs/principles/012-continuous-xdr-enrichment.md +2 -2
- package/.xdrs/agentme/edrs/principles/016-cross-language-module-structure.md +1 -1
- package/.xdrs/agentme/edrs/principles/articles/001-continuous-xdr-improvement.md +5 -5
- package/package.json +2 -2
package/.filedist-package.yml
CHANGED
@@ -19,7 +19,7 @@ What tooling and project structure should JavaScript/TypeScript projects follow
 
 Clear, consistent tooling and layout enable fast onboarding, reliable CI pipelines, and a predictable developer experience across projects.
 
-###
+### Details
 
 #### Tooling
 
@@ -46,6 +46,23 @@ Use a single `lib/tsconfig.json` for both build and type-aware linting. Keep co-
 
 When `tsconfig.json` extends `@tsconfig/node24/tsconfig.json`, the default `module` is `nodenext`. `ts-jest` still runs in CommonJS mode by default, so `lib/jest.config.js` MUST configure the `ts-jest` transform with an inline `tsconfig` override that sets `module: 'commonjs'`. Do not use the deprecated `globals['ts-jest']` configuration style.
 
+#### Coverage
+
+Jest must enforce 80% line and branch coverage, following [agentme-edr-004](../principles/004-unit-test-requirements.md). Configure thresholds in `lib/jest.config.js`:
+
+```js
+coverageThreshold: {
+  global: {
+    lines: 80,
+    branches: 80,
+  },
+},
+coverageProvider: 'v8',
+coverageDirectory: '.cache/coverage',
+```
+
+Builds that miss the threshold must not be merged.
+
 #### Project structure
 
 ```
@@ -64,8 +81,14 @@ When `tsconfig.json` extends `@tsconfig/node24/tsconfig.json`, the default `modu
 │   ├── .cache/  # eslint, jest, tsc incremental state, coverage
 │   ├── dist/  # compiled files and packed .tgz artifacts
 │   └── src/  # all TypeScript source files
-│       ├── index.ts  # public API re-exports
-│
+│       ├── index.ts  # public API re-exports from app/
+│       ├── adapters/  # I/O boundary layer (following agentme-edr-021)
+│       │   ├── cli/  # inbound: CLI bootstrap and entry point
+│       │   ├── http/  # inbound: HTTP server bootstrap and handlers
+│       │   └── connectors/  # outbound: one folder per external resource
+│       ├── app/  # core business logic
+│       │   └── *.test.ts  # test files co-located with source
+│       └── shared/  # infrastructure-agnostic utilities
 ├── examples/  # runnable usage examples outside the module root
 │   ├── Makefile  # build + test all examples in sequence
 │   ├── usage-x/  # first example
@@ -78,9 +101,20 @@ When `tsconfig.json` extends `@tsconfig/node24/tsconfig.json`, the default `modu
 
 The root `Makefile` delegates every target to `/lib` then `/examples` in sequence. Parent Makefiles should call child Makefiles directly, and each module Makefile is responsible for running its actual tool commands through `mise exec --`.
 
+Internal source code MUST be organized following [agentme-edr-021](021-pragmatic-hexagonal-architecture.md): `adapters/` (inbound and outbound I/O boundaries), `app/` (business logic), and `shared/` (infrastructure-agnostic utilities). The public API entry point (`index.ts`) re-exports from `app/`.
+
 When a repository contains multiple JavaScript/TypeScript packages, each package MUST live in its own module folder such as `lib/my-package/` or `services/my-service/`, each with its own `Makefile`, `README.md`, `dist/`, and `.cache/`.
 
-
+All tool caches, incremental state files, and workspace-local config outputs MUST be written under `.cache/`. This applies to every tool without exception. Cache and state paths MUST be declared in the tool's own configuration file — never on the command line — so that the location is enforced regardless of how the tool is invoked:
+
+| Tool | Config file | Setting | Value |
+|------|------------|---------|-------|
+| **Jest** | `jest.config.js` | `cacheDirectory` | `.cache/jest` |
+| **ESLint** | `eslint.config.mjs` | `cache: true, cacheLocation: '.cache/eslint'` | (set in config object) |
+| **TypeScript** | `tsconfig.json` | `tsBuildInfoFile` | `.cache/tsbuildinfo` |
+| **Jest coverage** | `jest.config.js` | `coverageDirectory` | `.cache/coverage` |
+
+Tools MUST NOT write cache or state files to the project root, `src/`, or any other directory outside `.cache/`. Passing cache paths as Makefile or CLI flags instead of config-file settings is not allowed.
 
 Contributors and CI MUST invoke the commands below as `make <target>`. The Makefile recipes themselves MUST call the underlying tools through `mise exec -- <tool> ...`.
 
@@ -93,7 +127,7 @@ Contributors and CI MUST invoke the commands below as `make <target>`. The Makef
 | `build-module` | `mise exec -- pnpm exec tsc ...` only (no pack) |
 | `lint` | `mise exec -- pnpm exec eslint ./src` |
 | `lint-fix` | `mise exec -- pnpm exec eslint ./src --fix` |
-| `test` | `mise exec -- pnpm exec jest --verbose` |
+| `test` | `mise exec -- pnpm exec jest --verbose --coverage` |
 | `test-watch` | `mise exec -- pnpm exec jest --watch` |
 | `clean` | remove `node_modules/`, `dist/`, and `.cache/` |
 | `all` | `build lint test` |
@@ -120,5 +154,7 @@ The examples folder MUST exist for any libraries and utilities that are publishe
 
 ## References
 
+- [agentme-edr-004](../principles/004-unit-test-requirements.md) — Coverage and unit-test baseline
+- [agentme-edr-021](021-pragmatic-hexagonal-architecture.md) — Internal adapter/application layer separation for applications
 - [001-create-javascript-project](skills/001-create-javascript-project/SKILL.md) — scaffolds a new project following this structure
 
@@ -19,7 +19,7 @@ What tooling and project structure should Go projects follow to ensure consisten
 
 A predictable layout and minimal external tooling keep Go projects approachable, fast to build, and easy to distribute as cross-platform binaries.
 
-###
+### Details
 
 #### Tooling
 
@@ -47,16 +47,25 @@ Direct installation of project-required Go CLIs with `go install ...@latest` as
 ├── main.go  # binary entry point — argument dispatch only, no logic
 ├── .cache/  # GOCACHE, GOMODCACHE, golangci-lint cache, coverage
 ├── dist/  # built binaries and packaged outputs
-├──
-│   ├──
-│   └──
-├──
-│   └── ...
-├── cli/  # CLI wiring — ties flags to domain packages
-│   ├── <feature-a>/
+├── adapters/  # I/O boundary layer (following agentme-edr-021)
+│   ├── cli/  # inbound: CLI wiring — flag parsing, output formatting
+│   │   └── *.go  # subfolders per feature only when complexity warrants it
+│   ├── http/  # inbound: HTTP server bootstrap and handlers
 │   │   └── *.go
+│   └── connectors/  # outbound: one folder per external resource
+│       ├── postgres/
+│       │   └── *.go
+│       └── stripe-api/
+│           └── *.go
+├── app/  # core business logic packages
+│   ├── <feature-a>/
+│   │   ├── *.go
+│   │   └── *_test.go
 │   └── <feature-b>/
-│
+│       ├── *.go
+│       └── *_test.go
+├── shared/  # infrastructure-agnostic utilities shared across adapters and app
+│   └── *.go
 ├── tests_integration/  # optional integration tests for this module
 ├── tests_benchmark/  # optional benchmark harnesses and datasets
 └── examples/  # optional sibling consumer examples for libraries
@@ -64,12 +73,16 @@ Direct installation of project-required Go CLIs with `go install ...@latest` as
 
 **Key layout rules:**
 
+- Internal source code is organized following [agentme-edr-021](021-pragmatic-hexagonal-architecture.md): `adapters/` (inbound and outbound I/O boundaries), `app/` (business logic), and `shared/` (infrastructure-agnostic utilities).
 - One Go module per project (`go.mod` at the project root). In a monorepo, each Go project has its own `go.mod` in its subdirectory. No nested modules within a single project unless explicitly justified.
 - In a multi-module repository, each Go module MUST live in its own folder root with its own `Makefile`, `README.md`, `dist/`, and `.cache/`.
-- `main.go` is solely an argument dispatcher — it reads `os.Args[1]` and delegates to
-- Business logic lives in named feature packages
-- `cli/` packages own flag parsing, output formatting, and the wiring between flags and
+- `main.go` is solely an argument dispatcher — it reads `os.Args[1]` and delegates to an `adapters/cli/<feature>/Run*()` function. No domain logic lives in `main.go`.
+- Business logic lives in named feature packages under `app/` (e.g., `app/ownership/`, `app/changes/`). These packages are importable and testable without any CLI or adapter concerns.
+- `adapters/cli/` packages own flag parsing, output formatting, and the wiring between flags and `app/` functions. No business logic lives in adapter packages.
+- Outbound adapters live under `adapters/connectors/` with one subfolder per external resource, named descriptively (e.g., `postgres/`, `stripe-api/`, `redis-cache/`).
+- `shared/` must contain only infrastructure-agnostic utilities — not business rules or domain logic.
 - Packages are flat by default; sub-packages are only introduced when a feature package itself exceeds ~400 lines or has clearly separable sub-concerns.
+- Application MAY import from Adapters when it simplifies the design (pragmatic coupling per edr-021 rule 05).
 - Consumer examples for reusable libraries belong in a sibling `examples/` folder and MUST import the public module path rather than reaching into internal source paths. Because Go libraries are not typically consumed from a local packaged artifact, local example validation may use a temporary module replacement for resolution, but the import path MUST remain the public module path.
 
 #### go.mod
@@ -94,7 +107,8 @@ Direct installation of project-required Go CLIs with `go install ...@latest` as
 | `test-unit` | `mise exec -- go test -cover ./...` — alias for unit tests only (same here; integration tests get a separate tag) |
 | `coverage` | `mise exec -- go tool cover -func .cache/coverage.out` — displays coverage summary |
 | `clean` | Remove `dist/` and `.cache/` |
-| `
+| `run` | `mise exec -- go run ./ <default-args>` — launch the binary locally |
+| `run-http` | `mise exec -- go run ./ http` — launch the HTTP inbound adapter |
 | `publish` | Tag with `mise exec -- npx -y monotag ...`, then push tag + binaries to GitHub Releases |
 
 The required invocation pattern is:
@@ -125,7 +139,16 @@ When the project produces a CLI binary for end-users:
 - Benchmarks: keep simple `Benchmark*` functions co-located in `*_test.go`; use `tests_benchmark/` when the benchmark needs dedicated harnesses or datasets.
 - Integration or slow tests: guard with `//go:build integration` and keep them in `tests_integration/` when they are not naturally co-located with one package.
 
-
+All tool caches, incremental state files, and build outputs MUST be written under `.cache/`. Neither `go` nor `golangci-lint` supports a project-level config file for cache paths, so environment variables are the only available mechanism. These MUST be declared as top-level exports at the top of the module `Makefile` (not passed as per-recipe CLI flags or inline env overrides) so they apply to every recipe consistently:
+
+| Tool | Mechanism | Makefile export |
+|------|-----------|------------------|
+| **Go build cache** | `GOCACHE` env var | `export GOCACHE := $(CURDIR)/.cache/go-build` |
+| **Go module cache** | `GOMODCACHE` env var | `export GOMODCACHE := $(CURDIR)/.cache/go-mod` |
+| **golangci-lint cache** | `GOLANGCI_LINT_CACHE` env var | `export GOLANGCI_LINT_CACHE := $(CURDIR)/.cache/golangci-lint` |
+| **Test coverage output** | `-coverprofile` flag in `test` target | `.cache/coverage.out` |
+
+Tools MUST NOT write cache or state files to the project root or any directory outside `.cache/`. Passing cache paths as per-recipe environment overrides instead of top-level Makefile exports is not allowed.
 
 #### Linting
 
@@ -151,8 +174,9 @@ Use `github.com/sirupsen/logrus` for structured logging. Set the log level from
 
 #### CLI flag parsing
 
-Use the standard library `flag` package for CLI flags. Each `cli/<feature>` package defines its own `FlagSet`, parses it from `os.Args[2:]`, and calls the corresponding
+Use the standard library `flag` package for CLI flags. Each `adapters/cli/<feature>` package defines its own `FlagSet`, parses it from `os.Args[2:]`, and calls the corresponding `app/` function.
 
 ## References
 
+- [agentme-edr-021](021-pragmatic-hexagonal-architecture.md) — Defines the adapter/application separation that this layout follows
 - [003-create-golang-project](skills/003-create-golang-project/SKILL.md) — scaffolds a new Go project following this structure
@@ -19,7 +19,7 @@ What tooling and project structure should Python projects follow to ensure consi
 
 A single dependency manager, isolated package internals under `lib/`, and a standard Makefile contract keep Python projects predictable for contributors and CI while keeping the repository root clean.
 
-###
+### Details
 
 #### Tooling
 
@@ -40,14 +40,24 @@ The repository root MUST define a `.mise.toml` that pins Python and uv. Contribu
 
 The root `.venv/` is the canonical environment location for both the library and all examples. Subdirectory commands must set `UV_PROJECT_ENVIRONMENT` to the workspace root `.venv/` instead of creating nested virtual environments.
 
-
+All tool caches, incremental state files, and workspace-local outputs MUST be written under `.cache/`. Cache paths MUST be declared in the tool's own configuration file — never on the command line or as Makefile CLI flags — so the location is enforced regardless of how the tool is invoked. Configure the following in `lib/pyproject.toml`:
+
+| Tool | Config section | Setting | Value |
+|------|---------------|---------|-------|
+| **Ruff** | `[tool.ruff]` | `cache-dir` | `".cache/ruff"` |
+| **pytest** | `[tool.pytest.ini_options]` | `cache_dir` | `".cache/pytest"` |
+| **coverage** | `[tool.coverage.run]` | `data_file` | `".cache/.coverage"` |
+| **coverage HTML** | `[tool.coverage.html]` | `directory` | `".cache/coverage-html"` |
+| **uv** | `[tool.uv]` in `lib/pyproject.toml` | `cache-dir` | `".cache/uv"` |
+
+Tools MUST NOT write cache or state files to the project root, `src/`, `tests/`, or any directory outside `.cache/`. Passing cache paths as CLI flags or Makefile recipe-level env overrides instead of `pyproject.toml` settings is not allowed.
 
 #### Project structure
 
 ```text
 /
 ├── .mise.toml  # required; pins Python and uv
-├── .gitignore
+├── .gitignore  # MUST ignore .venv/, dist/, .cache/, __pycache__/
 ├── .cache/  # optional shared uv cache at repo level
 ├── .venv/  # shared uv environment for lib/ and examples/
 ├── Makefile  # root entry point; delegates to lib/ and runs examples/
@@ -61,8 +71,12 @@ Persistent caches must live under `.cache/`, preferably the module `lib/.cache/`
 │   ├── src/
 │   │   └── <package_name>/
 │   │       ├── __init__.py
-│   │       ├──
-│   │
+│   │       ├── adapters/  # I/O boundary layer (following agentme-edr-021)
+│   │       │   ├── cli/  # inbound: CLI bootstrap and entry point
+│   │       │   ├── http/  # inbound: HTTP server bootstrap
+│   │       │   └── connectors/  # outbound: one folder per external resource
+│   │       ├── app/  # core business logic
+│   │       └── shared/  # infrastructure-agnostic utilities
 │   ├── tests/
 │   │   ├── conftest.py  # shared fixtures when needed
 │   │   └── test_*.py
@@ -82,6 +96,8 @@ Keep the repository root clean: source code, tests, distribution artifacts, and
 
 Use the `lib/src/` layout for import safety and packaging clarity. Keep tests under `lib/tests/` and shared test setup in `lib/tests/conftest.py`. Do not introduce `requirements.txt`, `setup.py`, `setup.cfg`, `tox.ini`, `ruff.toml`, or `pyrightconfig.json` by default; keep project metadata and tool configuration in `lib/pyproject.toml`.
 
+Internal source code MUST be organized following [agentme-edr-021](021-pragmatic-hexagonal-architecture.md): `adapters/` (inbound and outbound I/O boundaries), `app/` (business logic), and `shared/` (infrastructure-agnostic utilities).
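A minimal sketch of how the three folders named in this rule might divide one feature in Python. Everything here is illustrative: `normalize_email`, `UserStore`, `register_user`, and `InMemoryUserStore` are hypothetical names, and the `Protocol`-based port is an assumption rather than something agentme-edr-021 is known to prescribe. The layers share one file only so the example runs as written.

```python
from dataclasses import dataclass
from typing import Protocol, Set


# shared/ (hypothetical): infrastructure-agnostic helper, no business rules.
def normalize_email(raw: str) -> str:
    return raw.strip().lower()


# app/ (hypothetical): business logic that depends only on an abstract port.
class UserStore(Protocol):
    def exists(self, email: str) -> bool: ...
    def add(self, email: str) -> None: ...


@dataclass(frozen=True)
class RegisterResult:
    email: str
    created: bool


def register_user(store: UserStore, raw_email: str) -> RegisterResult:
    """Register an email once; report whether a new record was created."""
    email = normalize_email(raw_email)
    if store.exists(email):
        return RegisterResult(email=email, created=False)
    store.add(email)
    return RegisterResult(email=email, created=True)


# adapters/connectors/ (hypothetical): outbound adapter. An in-memory set
# stands in for a real external resource such as a database.
class InMemoryUserStore:
    def __init__(self) -> None:
        self._emails: Set[str] = set()

    def exists(self, email: str) -> bool:
        return email in self._emails

    def add(self, email: str) -> None:
        self._emails.add(email)
```

Because `register_user` only needs something shaped like `UserStore`, it can be exercised in unit tests with the in-memory adapter and wired to a real connector elsewhere.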
+
 Libraries and shared utilities must include an `examples/` folder and wire example execution into the root `test` flow, following [agentme-edr-007](../principles/007-project-quality-standards.md). Each example directory is its own Python project with its own `pyproject.toml`, and examples must import the library as a consumer would rather than reaching back into `lib/src/` with relative imports. Local example verification must install the wheel built into `lib/dist/`; do not use editable or path-based dependencies back to `lib/`.
 
 Python keeps unit tests under `lib/tests/` by default because that remains the more common and maintainable convention for typed/package-based projects than co-locating tests beside every source file. Integration tests belong in `lib/tests_integration/`, and benchmark harnesses belong in `lib/tests_benchmark/` when they are more than a single micro-benchmark helper.
@@ -98,6 +114,48 @@ When Pyright runs from `lib/`, configure it to discover the shared root virtual
 
 Ruff is the default formatter and linter. Do not add Black, isort, or Flake8 unless another XDR for that repository explicitly requires them.
 
+All Python projects must configure the following sections in `lib/pyproject.toml`. The cache-related settings are mandatory per the `.cache/` policy above:
+
+```toml
+[tool.pytest.ini_options]
+cache_dir = ".cache/pytest"
+
+[tool.coverage.run]
+data_file = ".cache/.coverage"
+
+[tool.coverage.html]
+directory = ".cache/coverage-html"
+
+[tool.uv]
+cache-dir = ".cache/uv"
+
+[tool.ruff]
+cache-dir = ".cache/ruff"
+output-format = "grouped"
+line-length = 120
+target-version = "py311"
+src = ["src", "tests", "tests_integration"]
+
+[tool.ruff.format]
+docstring-code-format = true
+line-ending = "lf"
+
+[tool.ruff.lint]
+task-tags = ["TODO"]
+select = ["ERA", "FAST", "ANN", "ASYNC", "S", "BLE", "FBT", "B", "A", "COM",
+"C4", "DTZ", "T10", "DJ", "EM", "EXE", "FIX", "INT", "ISC", "ICN", "LOG", "G",
+"INP", "PIE", "T20", "PYI", "PT", "Q", "RSE", "RET", "SLF", "SIM", "SLOT", "TID",
+"TC", "ARG", "PTH", "FLY", "I", "C90", "NPY", "PD", "N", "PERF", "E", "W",
+"D", "F", "PGH", "PL", "UP", "FURB", "RUF", "TRY"]
+ignore = ["ANN002", "ANN003", "ANN401", "D100", "D101", "D102", "D103", "D104",
+"D105", "D106", "D107", "COM812", "D203", "D213", "D400", "D401", "D404", "D415", "FIX002"]
+
+[tool.ruff.lint.pycodestyle]
+ignore-overlong-task-comments = true
+```
+
+Adjust `target-version` to match the project's minimum supported Python version. The `cache-dir` keeps Ruff's cache under `.cache/ruff` alongside other tool caches. The `src` list must include every directory that contains importable Python code. The `select` list enables a broad set of rules covering style, correctness, performance, security, and documentation. The `ignore` list suppresses rules that are either too noisy or conflict with the chosen docstring style.
+
 Pyright must run on every lint pass. `typeCheckingMode = "standard"` is the minimum baseline; projects may raise this to `strict` when the codebase is ready.
 
 Pytest coverage must fail below 80% line and branch coverage, following [agentme-edr-004](../principles/004-unit-test-requirements.md).
@@ -19,7 +19,7 @@ What structure and interface rules should distributable CLI tools follow so they
 
 This keeps the user-facing command predictable while preserving a clean library API for embedding, testing, and automation.
 
-###
+### Details
 
 #### CLI command surface
 
@@ -32,34 +32,34 @@ This keeps the user-facing command predictable while preserving a clean library
 - `--verbose` on the root command and on subcommands when flags are parsed per command
 - Root `--help` output must list all available commands, key options, and usage examples. Command-specific help must describe that command's arguments and options.
 
-#### CLI to
+#### CLI to application separation
 
-- Structure the software as `cli ->
-- The CLI layer must only parse arguments, load config, call the
-- Domain logic must live in the
-- Every feature available through the CLI must also be available through the
-- Organize the
-  - `extract` command -> `extract(...)`
-  - `validate` command -> `validate(...)`
-- Avoid one generic
+- Structure the software as `cli -> app` — the CLI adapter delegates to the application layer, following [agentme-edr-021](021-pragmatic-hexagonal-architecture.md).
+- The CLI layer must only parse arguments, load config, call the application layer, and format output.
+- Domain logic must live in the application layer and be usable without CLI globals such as `argv`, `stdout`, or process exit handlers.
+- Every feature available through the CLI must also be available through the application API.
+- Organize the application layer by action so the mapping stays direct and obvious.
+  - `extract` command -> `app/extract(...)`
+  - `validate` command -> `app/validate(...)`
+- Avoid one generic `run()` entry point that hides action-specific contracts behind switches or string commands.
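The `cli -> app` mapping in this list can be sketched as follows. This is a hedged illustration, not the package's own code: the tool name `mytool`, the behavior given to `extract` and `validate`, and the `ExtractResult` type are all assumptions made up for the example. Only standard-library `argparse` calls are used.

```python
import argparse
from dataclasses import dataclass
from typing import List


# --- app layer (hypothetical): no argv, stdout, or exit handling ----------
@dataclass(frozen=True)
class ExtractResult:
    items: List[str]


def extract(text: str, separator: str = ",") -> ExtractResult:
    """Split text on the separator and drop empty entries."""
    parts = [part.strip() for part in text.split(separator)]
    return ExtractResult(items=[part for part in parts if part])


def validate(items: List[str]) -> List[str]:
    """Return one problem message per duplicated item; empty means valid."""
    duplicates = sorted({item for item in items if items.count(item) > 1})
    return [f"duplicate item: {item}" for item in duplicates]


# --- cli adapter (hypothetical): parse, delegate, format output -----------
def _cmd_extract(args: argparse.Namespace) -> int:
    result = extract(args.text, args.separator)
    print("\n".join(result.items))
    return 0


def _cmd_validate(args: argparse.Namespace) -> int:
    problems = validate(args.items)
    for problem in problems:
        print(problem)
    return 1 if problems else 0


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mytool")
    commands = parser.add_subparsers(dest="command", required=True)

    extract_cmd = commands.add_parser("extract")
    extract_cmd.add_argument("text")
    extract_cmd.add_argument("--separator", default=",")
    extract_cmd.set_defaults(handler=_cmd_extract)

    validate_cmd = commands.add_parser("validate")
    validate_cmd.add_argument("items", nargs="+")
    validate_cmd.set_defaults(handler=_cmd_validate)
    return parser


def main(argv: List[str]) -> int:
    """Dispatch to the handler bound to the chosen subcommand."""
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

Each subcommand binds to its own handler and its own application function, so there is no generic `run()` switch, and `extract` and `validate` stay callable from tests or other programs without going through the parser.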
 
-####
+#### Application API shape
 
-- Each CLI action should map to a dedicated exported
--
-- The CLI layer is responsible for translating flags, positional arguments, and config-file contents into
-- The
+- Each CLI action should map to a dedicated exported application function with typed inputs and outputs appropriate for the language.
+- Application APIs should accept in-memory options objects or typed parameters, not require config files or environment variables unless application-level config-file support is an explicit requirement.
+- The CLI layer is responsible for translating flags, positional arguments, and config-file contents into application inputs.
+- The application layer should return explicit results and errors so the CLI can decide what to print and which exit code to use.
 
 #### Configuration
 
 - Prefer flags and positional arguments for simple inputs.
 - When configuration becomes long, nested, or repetitive, support a config file instead of pushing all values into flags.
-- By default, config-file discovery and loading must happen in the CLI layer, not in the
+- By default, config-file discovery and loading must happen in the CLI layer, not in the application layer.
 - When a config file is supported, the CLI should try to load a JSON config file from `[cwd]/.[cli-name]rc` by default.
 - The CLI should also support an explicit config path flag such as `--config`.
 - For JavaScript tools, `cosmiconfig` is an acceptable implementation. Equivalent discovery libraries are acceptable in other ecosystems.
-- The
-- The
+- The application layer must not depend on the presence of the config file; it should receive parsed configuration values from the CLI layer.
+- The application layer may load or parse config files only when that behavior is an explicit requirement of the application contract for non-CLI consumers as well.
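A minimal sketch of config discovery kept entirely in the CLI layer, under stated assumptions: the tool name `mytool`, the single `separator` option, and the precedence (flag over config file over built-in default) are invented for illustration and are not specified by the rules in this list.

```python
import argparse
import json
from pathlib import Path
from typing import Any, Dict, List, Optional

CLI_NAME = "mytool"  # hypothetical tool name


def load_config(cwd: Path, explicit_path: Optional[str] = None) -> Dict[str, Any]:
    """Load JSON config from --config if given, else from [cwd]/.[cli-name]rc."""
    path = Path(explicit_path) if explicit_path else cwd / f".{CLI_NAME}rc"
    if not path.is_file():
        if explicit_path:
            # An explicitly requested file that is missing is an error.
            raise FileNotFoundError(f"config file not found: {path}")
        return {}  # The default rc file is optional.
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"config file must contain a JSON object: {path}")
    return data


def resolve_options(argv: List[str], cwd: Path) -> Dict[str, Any]:
    """Merge flags and config into plain values for the application layer."""
    parser = argparse.ArgumentParser(prog=CLI_NAME)
    parser.add_argument("--config")
    parser.add_argument("--separator")
    args = parser.parse_args(argv)

    config = load_config(cwd, args.config)
    # Assumed precedence: explicit flag, then config file, then default.
    if args.separator is not None:
        separator = args.separator
    else:
        separator = config.get("separator", ",")
    return {"separator": separator}
```

The application layer would receive only the resolved dictionary (or a typed options object built from it) and never learns whether a value came from a flag, a file, or a default.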
 
 #### Output and progress
 
@@ -73,14 +73,14 @@ This keeps the user-facing command predictable while preserving a clean library
 
 - Exit with `0` only when the requested action completed successfully.
 - Exit with `1` when the requested action could not be completed.
-- The
+- The application layer should surface failure as return values, result objects, or language-idiomatic errors; the CLI is responsible for converting that outcome into user-facing messages and process exit codes.
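One way to read the exit-code rules, sketched with a language-idiomatic exception. The names `ValidationError`, `ValidateOptions`, and `run_validate`, and the duplicate-detection rule itself, are hypothetical and chosen only to make the example concrete.

```python
import sys
from dataclasses import dataclass
from typing import List, Tuple


class ValidationError(Exception):
    """Hypothetical application-level error; carries no exit-code knowledge."""


@dataclass(frozen=True)
class ValidateOptions:
    items: Tuple[str, ...]


def validate(options: ValidateOptions) -> int:
    """Application function: return the item count or raise ValidationError."""
    if not options.items:
        raise ValidationError("no items to validate")
    seen = set()
    for item in options.items:
        if item in seen:
            raise ValidationError(f"duplicate item: {item}")
        seen.add(item)
    return len(options.items)


def run_validate(items: List[str]) -> int:
    """CLI-side wrapper: turn the application outcome into an exit code."""
    try:
        count = validate(ValidateOptions(items=tuple(items)))
    except ValidationError as err:
        print(f"error: {err}", file=sys.stderr)
        return 1
    print(f"validated {count} item(s)")
    return 0
```

A process entry point would then call `sys.exit(run_validate(...))`, keeping `sys.exit` and message formatting out of the application function.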
 
 #### Documentation
 
 - `README.md` must include at least 4 CLI usage examples.
-- `README.md` must include at least 2
+- `README.md` must include at least 2 application API examples for the same operation also available through the CLI.
 - If the tool supports config files, at least 1 README example should show config-file usage.
-- Examples must use the public command and public
+- Examples must use the public command and public application API, not internal modules or private files.
 
 #### Distribution and versioning
 
@@ -93,12 +93,13 @@ This keeps the user-facing command predictable while preserving a clean library
 ## Considered Options
 
 * (REJECTED) **Ad hoc CLIs with embedded business logic** - Keep parsing, processing, config loading, and output formatting inside a single entry point.
-  * Reason: Makes the tool hard to test, hard to reuse
-* (CHOSEN) **Thin CLI adapter over action-oriented
-  * Reason: Preserves a clean programmatic API, keeps command behavior discoverable, and makes the CLI-to-
+  * Reason: Makes the tool hard to test, hard to reuse programmatically, and inconsistent across commands.
+* (CHOSEN) **Thin CLI adapter over action-oriented application APIs** - Keep the CLI responsible for user interaction and the application layer responsible for the actual behavior.
+  * Reason: Preserves a clean programmatic API, keeps command behavior discoverable, and makes the CLI-to-application mapping easy to maintain.
 
 ## References
 
+- [agentme-edr-021](021-pragmatic-hexagonal-architecture.md) - Defines the adapter/application separation that the CLI layer follows
 - [agentme-edr-003](003-javascript-project-tooling.md) - JavaScript project packaging and structure
 - [agentme-edr-007](../principles/007-project-quality-standards.md) - README and examples baseline
 - [agentme-edr-008](../devops/008-common-targets.md) - Standard command names for project entry points
@@ -1,6 +1,6 @@
 ---
 name: agentme-edr-policy-018-ai-agent-development-standards
-description: Defines the standard toolchain, framework, evaluation approach, and
+description: Defines the standard toolchain, framework, evaluation approach, and workflow patterns for building AI agents with Python and LangGraph. Use when scaffolding, reviewing, or extending AI agent projects.
 apply-to: AI agent projects built with Python
 valid-from: 2026-05-26
 ---
@@ -9,13 +9,13 @@ valid-from: 2026-05-26
 
 ## Context and Problem Statement
 
-AI agent projects vary widely in how they choose frameworks, manage context, evaluate outputs, and
+AI agent projects vary widely in how they choose frameworks, manage context, evaluate outputs, and structure workflows. Without a shared baseline, projects accumulate incompatible patterns for LLM provider abstraction, flow design, and dataset-driven testing.
 
 Which tools, frameworks, and design patterns should AI agent projects follow to ensure reproducibility, testability, and maintainability?
 
 ## Decision Outcome
 
-**Use Python with LangGraph for flow orchestration
+**Use Python with LangGraph for flow orchestration and MLflow for experiment tracking and local evaluation.**
 
 ### Details
 
@@ -54,7 +54,7 @@ Use **MLflow** for all agent observability and evaluation:
|
|
|
54
54
|
|
|
55
55
|
#### 04-dataset-driven-accuracy-measurement
|
|
56
56
|
|
|
57
|
-
Every agent pipeline MUST have a companion evaluation dataset and an MLflow experiment that measures accuracy against it. Datasets and evals are organized per-workflow following rule `
|
|
57
|
+
Every agent pipeline MUST have a companion evaluation dataset and an MLflow experiment that measures accuracy against it. Datasets and evals are organized per-workflow following rule `07-workflow-structure` and rule `08-workflow-evals`.
|
|
58
58
|
|
|
59
59
|
- Store evaluation datasets under `evals/<workflow>/` (sibling of `lib/` and `examples/`), following [agentme-edr-019](019-ml-dataset-structure.md) for structure and format. For MLflow input/output pairs, use the JSONL format described in `agentme-edr-019.04-complex-structured-datasets-must-use-jsonl`.
|
|
60
60
|
- Write evaluation scripts under `evals/<workflow>/` that load the dataset, run each input through the live agent (against real LLMs, not mocks), compare outputs to expected values, and log per-sample and aggregate metrics to an MLflow experiment.
|
|
@@ -80,50 +80,7 @@ graph TD
|
|
|
80
80
|
C -->|fail| B
|
|
81
81
|
```
|
|
82
82
|
|
|
83
|
-
#### 06-
|
|
84
|
-
|
|
85
|
-
When an agent must follow elaborate procedures, decision frameworks, or domain rules:
|
|
86
|
-
|
|
87
|
-
**Static files distributed with the library**
|
|
88
|
-
|
|
89
|
-
- All static files accessed by agents at runtime (XDRS documents, reference tables, domain dictionaries, lookup files) MUST live under a `data/` folder inside the library source tree (`lib/data/`) and be embedded in the package data manifest (e.g. `pyproject.toml` `[tool.hatch.build] include` or equivalent).
|
|
90
|
-
- XDRS Policy and Skill documents MUST be placed at `lib/data/.xdrs/`, using the standard XDRS scope/type/subject folder structure (following `_core-adr-policy-001`).
|
|
91
|
-
- Other static context data (reference tables, domain dictionaries, structured lookup files) MUST be placed under `lib/data/` in an appropriate sub-folder (e.g. `lib/data/context/`).
|
|
92
|
-
- The agent system prompt MUST NOT inline procedure text. It MUST instruct the agent to read specific paths and follow the instructions found there. Example:
|
|
93
|
-
|
|
94
|
-
```
|
|
95
|
-
Before answering, read and follow the instructions in data/.xdrs/_local/edrs/procedures/triage.md.
|
|
96
|
-
```
|
|
97
|
-
|
|
98
|
-
**Dynamic context generated per workflow instantiation**
|
|
99
|
-
|
|
100
|
-
- Context files that are generated at runtime per workflow run (unpacked archives, fetched documents, intermediate outputs) MUST be written to a temporary directory created via the OS temp API (`tempfile.mkdtemp()` in Python).
|
|
101
|
-
- The temporary directory MUST be created at the start of the workflow run and passed into the workflow state so all nodes share the same path.
|
|
102
|
-
- The temporary directory MUST be deleted (including all contents) when the workflow run finishes, whether it succeeds or fails, using a `try/finally` block or a context manager.
|
|
103
|
-
- The agent file tools MUST be configured with the temporary directory path at workflow startup so the agent can read from it during the run.
|
|
104
|
-
|
|
105
|
-
- The agent file tools MUST expose `data/` (for static files) and the temporary directory (for dynamic files) as sandboxed readable roots (see rule `07-agent-file-tools`).
|
|
106
|
-
|
|
107
|
-
#### 07-agent-file-tools
|
|
108
|
-
|
|
109
|
-
Every agent that uses the XDRS knowledge layer or file-based context MUST be equipped with at least the following tools:
|
|
110
|
-
|
|
111
|
-
| Tool | Purpose |
|
|
112
|
-
|---|---|
|
|
113
|
-
| `read_file(path)` | Read the full content of a file by path |
|
|
114
|
-
| `search_files(directory, pattern)` | Glob-search for files matching a pattern under a directory |
|
|
115
|
-
| `grep_file(path, query)` | Search for lines matching a string or regex within a file |
|
|
116
|
-
|
|
117
|
-
Implement these tools as LangChain `@tool`-decorated functions with explicit path sandboxing. Two sandboxed roots MUST be configured:
|
|
118
|
-
|
|
119
|
-
| Root | Content | Source |
|
|
120
|
-
|---|---|---|
|
|
121
|
-
| `DATA_ROOT` | Static files shipped with the library (`lib/data/`) | Package data; resolved via `importlib.resources` or a path relative to the installed package |
|
|
122
|
-
| `TEMP_ROOT` | Dynamic files generated for the current workflow run | Temporary directory created by `tempfile.mkdtemp()` at workflow startup |
|
|
123
|
-
|
|
124
|
-
Resolve all paths against the appropriate root. Reject any path that would escape its root (no `../` traversal). `TEMP_ROOT` MUST be passed into the tool factory at workflow startup, not read from a global variable.
|
|
125
|
-
|
|
126
|
-
#### 08-verification-steps
|
|
83
|
+
#### 06-verification-steps
|
|
127
84
|
|
|
128
85
|
Agent flows MUST include at least one explicit verification node before producing final output:
|
|
129
86
|
|
|
@@ -132,27 +89,40 @@ Agent flows MUST include at least one explicit verification node before producin
|
|
|
132
89
|
- On failure, the verification node MUST route back to the relevant generation node, not silently pass through.
|
|
133
90
|
- Log verification results (pass/fail, score, reason) as MLflow metrics on the current run.
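
As a rough illustration of the routing requirement above, the sketch below shows a verification node and a conditional edge function written as plain Python. The state fields, the node names (`generate`, `finalize`, `fail`), the non-empty check, and the `MAX_ATTEMPTS` retry budget are all hypothetical and not part of the rule. A function of this shape is what would typically be passed to LangGraph's `add_conditional_edges`.

```python
from typing import TypedDict

MAX_ATTEMPTS = 3  # hypothetical retry budget; the rule itself does not define one


class DraftState(TypedDict):
    draft: str
    verified: bool
    attempts: int


def verify(state: DraftState) -> DraftState:
    """Hypothetical verification node: a non-empty check stands in for a real verifier."""
    passed = bool(state["draft"].strip())
    return {**state, "verified": passed, "attempts": state["attempts"] + 1}


def route_after_verify(state: DraftState) -> str:
    """Conditional edge function: returns the name of the next node."""
    if state["verified"]:
        return "finalize"
    if state["attempts"] >= MAX_ATTEMPTS:
        return "fail"  # assumed terminal node so a bad draft cannot loop forever
    return "generate"  # failure routes back to generation, never silently through
```

The retry cap is a design choice added here for safety; without it, a draft that never passes would cycle between generation and verification indefinitely.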
|
|
134
91
|
|
|
135
|
-
####
|
|
92
|
+
#### 07-workflow-structure
|
|
136
93
|
|
|
137
|
-
Agent logic MUST be organized as named workflows. Each workflow is an independent LangGraph `StateGraph` with a defined start node and end node, connecting agents, states, routes, and decision nodes.
|
|
94
|
+
Agent logic MUST be organized as named workflows following [agentme-edr-021](021-pragmatic-hexagonal-architecture.md). Each workflow is an independent LangGraph `StateGraph` with a defined start node and end node, connecting agents, states, routes, and decision nodes.
|
|
138
95
|
|
|
139
|
-
|
|
96
|
+
Workflows live inside `app/workflows/` (the application layer), while external integrations such as LLM providers, vector stores, and third-party APIs live under `adapters/connectors/` (the outbound adapter layer). Inbound interfaces (HTTP API, CLI) live under `adapters/` as inbound adapters.
|
|
97
|
+
|
|
98
|
+
For each workflow named `<workflow>`, the full project layout is:
|
|
140
99
|
|
|
141
100
|
```text
|
|
142
|
-
lib/
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
101
|
+
lib/src/<package_name>/
|
|
102
|
+
adapters/
|
|
103
|
+
http/ # inbound: API server that triggers workflows
|
|
104
|
+
cli/ # inbound: CLI entry point (if applicable)
|
|
105
|
+
connectors/ # outbound: external resource integrations
|
|
106
|
+
openai/ # LLM provider connector
|
|
107
|
+
azure-openai/ # alternative LLM provider connector
|
|
108
|
+
postgres/ # database connector (if applicable)
|
|
109
|
+
vector-store/ # vector DB connector (if applicable)
|
|
110
|
+
app/
|
|
111
|
+
workflows/
|
|
112
|
+
<workflow>/
|
|
113
|
+
graph.py # StateGraph definition; entry point for the workflow
|
|
114
|
+
agents.py # LangChain agent definitions used by this workflow
|
|
115
|
+
states.py # Typed state dataclasses / TypedDicts
|
|
116
|
+
routes.py # Conditional edge functions
|
|
117
|
+
shared/ # infrastructure-agnostic utilities
|
|
149
118
|
```
|
|
150
119
|
|
|
151
|
-
- `graph.py` MUST define and compile the `StateGraph` and expose a `graph` object that callers invoke.
|
|
152
|
-
-
|
|
120
|
+
- `app/workflows/<workflow>/graph.py` MUST define and compile the `StateGraph` and expose a `graph` object that callers invoke.
|
|
121
|
+
- Tool calls within workflow nodes that interact with external systems MUST use connectors from `adapters/connectors/`, not inline API calls.
|
|
122
|
+
- Additional modules (prompts, schemas) MAY be added inside `app/workflows/<workflow>/` when they are specific to that workflow. Shared utilities belong in `shared/`.
|
|
153
123
|
- Each workflow MUST be documented with a Mermaid diagram in the project `README.md` following rule `05-flow-documentation`.
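
To make the connector rule above more concrete, here is a minimal sketch of a workflow node that receives its LLM connector as a dependency instead of calling a provider API inline. `LLMConnector`, `EchoConnector`, `make_summarize_node`, and the `text`/`summary` state keys are hypothetical names chosen for illustration; a real connector would live under `adapters/connectors/` and wrap an actual provider client.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class LLMConnector(Protocol):
    """Assumed outbound port: the only surface the workflow sees."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoConnector:
    """Stand-in connector for illustration; it makes no network calls."""

    prefix: str = "echo: "

    def complete(self, prompt: str) -> str:
        return self.prefix + prompt


def make_summarize_node(llm: LLMConnector) -> Callable[[dict], dict]:
    """Build a node function bound to a connector instead of an inline API call."""

    def summarize(state: dict) -> dict:
        summary = llm.complete(f"Summarize: {state['text']}")
        return {**state, "summary": summary}

    return summarize
```

Because the node depends only on the protocol, swapping one provider connector for another (or for a test double) should not require touching files under `app/workflows/`.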
|
|
154
124
|
|
|
155
|
-
####
|
|
125
|
+
#### 08-workflow-evals
|
|
156
126
|
|
|
157
127
|
For each workflow `<workflow>` there MUST be a corresponding eval directory:
|
|
158
128
|
|
|
@@ -168,8 +138,8 @@ The `evals/<workflow>/Makefile` MUST define:
|
|
|
168
138
|
|
|
169
139
|
| Target | Behaviour |
|
|
170
140
|
|---|---|
|
|
171
|
-
| `
|
|
172
|
-
| `
|
|
141
|
+
| `eval` | Runs all eval slices for the workflow |
|
|
142
|
+
| `eval-<slice>` | Runs one named slice (e.g. `eval-simple`, `eval-complex`) |
|
|
173
143
|
|
|
174
144
|
Each `eval_<slice>.py` script MUST:
|
|
175
145
|
|
|
@@ -177,5 +147,28 @@ Each `eval_<slice>.py` script MUST:
|
|
|
177
147
|
- Run every input through the live workflow against real LLMs.
|
|
178
148
|
- Log per-sample and aggregate metrics to an MLflow experiment that runs locally.
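
The script requirements above could be sketched roughly as follows. The JSONL field names `input` and `expected`, the exact-match comparison, and the `log_metric(name, value, step)` callback are assumptions made for this example. In a real eval script the callback could be `mlflow.log_metric`, and `workflow` would invoke the compiled graph against a live LLM rather than a plain function.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_jsonl(path: Path) -> list[dict]:
    """Read one JSON object per non-empty line."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def run_eval(
    samples: list[dict],
    workflow: Callable[[str], str],
    log_metric: Callable[[str, float, Optional[int]], None],
) -> float:
    """Run every sample through the workflow and log per-sample and aggregate accuracy."""
    correct = 0.0
    for step, sample in enumerate(samples):
        output = workflow(sample["input"])
        hit = 1.0 if output == sample["expected"] else 0.0
        correct += hit
        log_metric("sample_correct", hit, step)  # per-sample metric
    accuracy = correct / len(samples) if samples else 0.0
    log_metric("accuracy", accuracy, None)  # aggregate metric
    return accuracy
```

Exact-match scoring is only a placeholder; most agent outputs need a task-specific comparison or a scoring function.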
|
|
179
149
|
|
|
180
|
-
The module root Makefile `make eval` target MUST delegate to `
|
|
150
|
+
The module root Makefile `make eval` target MUST delegate to `eval` in every `evals/<workflow>/Makefile`.
|
|
151
|
+
|
|
152
|
+
#### 09-local-sandbox
|
|
153
|
+
|
|
154
|
+
When a workflow node or tool requires a **local sandbox** — an isolated environment where the agent can read files, glob-search directories, and execute shell commands — use the **[deepagents](https://github.com/deepagents/deepagents) framework** to provide that sandbox.
|
|
155
|
+
|
|
156
|
+
**When to apply this rule**
|
|
157
|
+
|
|
158
|
+
Use deepagents whenever ANY of the following is true for a workflow or tool:
|
|
159
|
+
- The agent needs to execute shell commands or scripts in a controlled environment.
|
|
160
|
+
- The agent needs to list, read, or search files across multiple directories at runtime.
|
|
161
|
+
- The agent operates on user-supplied or generated file trees that must not escape a sandboxed boundary.
|
|
162
|
+
|
|
163
|
+
**Integration requirements**
|
|
164
|
+
|
|
165
|
+
- Initialize the sandbox at the start of the workflow run and shut it down in a `try/finally` block so it is released whether the run succeeds or fails.
|
|
166
|
+
- Pass the sandbox handle into the LangGraph workflow state so all nodes share the same sandbox instance.
|
|
167
|
+
- If the host-side code needs to pass files into the sandbox (e.g. generated config or input data), create a temporary directory with `tempfile.mkdtemp()`, write the files there, and mount it into the sandbox. Clean it up in the `finally` block.
|
|
168
|
+
- Replace hand-rolled `read_file`, `search_files`, and `grep_file` tool implementations with the equivalent tools provided by deepagents.
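
The temporary-directory part of the integration requirements above can be sketched with the standard library alone. Sandbox startup, mounting, and shutdown are deliberately left out because this example makes no assumption about the deepagents API. `run_with_temp_inputs` and its `run` callback are hypothetical names, with `run` standing in for whatever starts the sandbox and invokes the workflow.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_with_temp_inputs(files: dict[str, str], run: Callable[[Path], str]) -> str:
    """Write host-side input files to a temp dir, run, and always clean up."""
    temp_root = Path(tempfile.mkdtemp(prefix="workflow-"))
    try:
        for name, content in files.items():
            (temp_root / name).write_text(content, encoding="utf-8")
        # `run` is assumed to mount temp_root into the sandbox and execute the workflow.
        return run(temp_root)
    finally:
        # Runs on success and on failure, removing the directory and its contents.
        shutil.rmtree(temp_root, ignore_errors=True)
```

Passing `temp_root` explicitly into the callback, rather than reading it from a global, keeps each workflow run isolated from concurrent runs.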
|
|
169
|
+
|
|
170
|
+
## References
|
|
181
171
|
|
|
172
|
+
- [agentme-edr-021](021-pragmatic-hexagonal-architecture.md) — Adapter/application layer separation that defines the project layout
|
|
173
|
+
- [agentme-edr-014](014-python-project-tooling.md) — Python project tooling and structure
|
|
174
|
+
- [agentme-edr-019](019-ml-dataset-structure.md) — ML dataset structure for eval datasets
|