agentme 0.22.0 → 0.23.0

@@ -134,6 +134,23 @@ Names MUST NOT use generic labels such as `node1`, `process`, or `run`. Each nam
 
  Judge nodes use a **prefix** convention instead of a suffix: the name MUST start with `evaluate_` followed by the subject being judged (e.g. `evaluate_progress`, `evaluate_quality`, `evaluate_completeness`, `evaluate_relevance`). This makes judge nodes immediately distinguishable from all other node types at a glance.
 
+ **Grouping prefix for related nodes:** When multiple nodes deal with the same subject, entity, or workflow region, their names SHOULD use a shared grouping word as a prefix, followed by a verb and the role suffix. The pattern is `<group>_<verb>_<role_suffix>`. This makes the graph topology scannable and clusters related nodes together alphabetically in logs, traces, and code.
+
+ ```python
+ # Nodes grouped under the "invoice" subject
+ def invoice_fetch_tool(state): ...  # fetches invoice data from an API
+ def invoice_validate_step(state): ...  # validates invoice fields deterministically
+ def invoice_summarize_llm(state): ...  # summarizes invoice content with an LLM
+ def invoice_review_agent(state): ...  # runs an agent loop to review the invoice
+
+ graph.add_node("invoice_fetch_tool", invoice_fetch_tool)
+ graph.add_node("invoice_validate_step", invoice_validate_step)
+ graph.add_node("invoice_summarize_llm", invoice_summarize_llm)
+ graph.add_node("invoice_review_agent", invoice_review_agent)
+ ```
+
+ The grouping prefix is optional for workflows where all nodes clearly belong to a single domain. It MUST be used when a workflow spans multiple subjects or regions (e.g. `invoice_*`, `payment_*`, `notification_*`) to prevent name collisions and to make the graph structure self-documenting.
+
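As an illustration only (not part of the package), the `<group>_<verb>_<role_suffix>` pattern could be checked with a small helper like the sketch below. It assumes the role suffixes are just the four that appear in the example (`tool`, `step`, `llm`, `agent`), since the full suffix list is defined elsewhere in the standard, and it assumes single-word groups and verbs. The names `ROLE_SUFFIXES`, `parse_grouped_name`, and `cluster_by_group` are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional

# Assumption: only the role suffixes visible in the example; the real
# standard may define more.
ROLE_SUFFIXES = ("tool", "step", "llm", "agent")

# <group>_<verb>_<role_suffix>, with single-word group and verb (assumption).
GROUPED_NAME = re.compile(
    r"^(?P<group>[a-z][a-z0-9]*)_(?P<verb>[a-z][a-z0-9]*)_(?P<role>%s)$"
    % "|".join(ROLE_SUFFIXES)
)


def parse_grouped_name(name: str) -> Optional[Dict[str, str]]:
    """Split a grouped node name into group, verb, and role, or return None."""
    match = GROUPED_NAME.match(name)
    return match.groupdict() if match else None


def cluster_by_group(names: List[str]) -> Dict[str, List[str]]:
    """Cluster matching node names by grouping prefix, alphabetically."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(names):
        parsed = parse_grouped_name(name)
        if parsed is not None:
            clusters[parsed["group"]].append(name)
    return dict(clusters)
```

Judge nodes such as `evaluate_progress` follow the prefix convention instead, so they do not match this pattern and are skipped by `cluster_by_group`.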
  #### 10-workflow-unit-testing
 
  All LLM calls within workflow nodes are external API calls and MUST be mocked in unit tests per [agentme-edr-018](018-ai-llm-development-standards.md) rule `04-unit-test-mocking`. Workflow unit tests must run fully offline with no real LLM provider calls.
@@ -182,6 +182,12 @@ Where $\hat{p}$ is observed accuracy and $n$ is sample count. Accuracy and F1 ar
  - MLflow run: experiment `workflow-document-review/eval-basic` — view with `mlflow ui`
  ```
 
+ #### 04-eval-mlflow-unique-port
+
+ Each `evals/<component>/eval-<name>/Makefile` MUST start its MLflow tracking server on a **unique port** to prevent conflicts when multiple eval Makefiles run concurrently (e.g., in CI or across multiple terminal sessions).
+
+ Ports MUST be statically assigned per eval scenario and MUST NOT reuse the default `5000` port (reserved for `dev-mlflow` per [agentme-edr-008](../devops/008-common-targets.md) rule `09-ai-project-dev-targets`). Assign ports starting at `5100` and incrementing by 1 for each additional eval scenario across the entire project.
+
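As an illustration only (not part of the package), a project could guard this rule with a small check over its statically assigned ports. The sketch below assumes the assignments have already been collected into a mapping from eval scenario path to port. The function name `check_eval_ports` and the scenario paths in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, List

DEV_MLFLOW_PORT = 5000  # reserved for dev-mlflow
FIRST_EVAL_PORT = 5100  # eval scenarios are assigned ports from here upward


def check_eval_ports(ports: Dict[str, int]) -> List[str]:
    """Return one message per violation; an empty list means the ports are valid."""
    problems: List[str] = []
    counts = Counter(ports.values())
    for path, port in sorted(ports.items()):
        if port == DEV_MLFLOW_PORT:
            problems.append(f"{path}: port {port} is reserved for dev-mlflow")
        elif port < FIRST_EVAL_PORT:
            problems.append(f"{path}: port {port} is below {FIRST_EVAL_PORT}")
        if counts[port] > 1:
            problems.append(f"{path}: port {port} is used by more than one eval scenario")
    return problems
```

For example, `check_eval_ports({"evals/workflow-document-review/eval-basic": 5100, "evals/workflow-document-review/eval-extra": 5101})` returns an empty list, while giving both scenarios port `5100` reports one violation per scenario.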
  ## References
 
  - [agentme-edr-007](../principles/007-project-quality-standards.md) — Project quality standards: when evals are required per AI tier (rule `09-ai-project-testing-requirements`) and statistical model eval targets (rule `07-statistical-models-must-have-eval-targets`)
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "agentme",
- "version": "0.22.0",
+ "version": "0.23.0",
  "description": "",
  "dependencies": {
  "filedist": "^0.36.0"