afferent 0.1.0__tar.gz
- afferent-0.1.0/.github/workflows/publish.yml +66 -0
- afferent-0.1.0/.gitignore +20 -0
- afferent-0.1.0/CLAUDE.md +106 -0
- afferent-0.1.0/LICENSE +21 -0
- afferent-0.1.0/PKG-INFO +181 -0
- afferent-0.1.0/README.md +152 -0
- afferent-0.1.0/afferent/__init__.py +43 -0
- afferent-0.1.0/afferent/backends/__init__.py +12 -0
- afferent-0.1.0/afferent/backends/base.py +83 -0
- afferent-0.1.0/afferent/backends/fake.py +125 -0
- afferent-0.1.0/afferent/embodiment.py +148 -0
- afferent-0.1.0/afferent/safety.py +59 -0
- afferent-0.1.0/afferent/types.py +133 -0
- afferent-0.1.0/pyproject.toml +54 -0
- afferent-0.1.0/scripts/release.sh +97 -0
- afferent-0.1.0/tests/__init__.py +0 -0
- afferent-0.1.0/tests/test_embodiment.py +121 -0
- afferent-0.1.0/tests/test_fake_backend.py +65 -0
- afferent-0.1.0/tests/test_types.py +70 -0
@@ -0,0 +1,66 @@
+name: Publish to PyPI
+
+# Publishes afferent automatically on every push to main — but only when the
+# version in afferent/__init__.py is one that isn't already on PyPI. So:
+# - push without bumping __version__ → build/publish steps skip (green no-op)
+# - push that bumps __version__ → build, test, and publish to PyPI
+#
+# Uses PyPI Trusted Publishing (OIDC) — NO API token stored anywhere.
+# ONE-TIME setup on PyPI (https://pypi.org/manage/account/publishing/):
+# "Add a pending publisher"
+# PyPI Project Name: afferent
+# Owner: andrasfe
+# Repository name: spinalcord # the GitHub repo (unchanged)
+# Workflow name: publish.yml
+# Environment: (leave blank)
+# That's the whole setup. (If you'd rather use a token instead of trusted
+# publishing, add a repo secret PYPI_API_TOKEN and give the publish step
+# `with: { password: ${{ secrets.PYPI_API_TOKEN }} }`.)
+
+on:
+  push:
+    branches: [main]
+  workflow_dispatch: # allow manual runs from the Actions tab
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    permissions:
+      id-token: write # required for Trusted Publishing (OIDC)
+      contents: read
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Read package version
+        id: ver
+        run: |
+          V=$(grep -oE '__version__ = "[^"]+"' afferent/__init__.py | cut -d'"' -f2)
+          echo "version=$V" >> "$GITHUB_OUTPUT"
+          echo "afferent version: $V"
+
+      - name: Is this version already on PyPI?
+        id: gate
+        run: |
+          if curl -sf "https://pypi.org/pypi/afferent/${{ steps.ver.outputs.version }}/json" > /dev/null; then
+            echo "publish=false" >> "$GITHUB_OUTPUT"
+            echo "::notice::afferent ${{ steps.ver.outputs.version }} already on PyPI — nothing to publish."
+          else
+            echo "publish=true" >> "$GITHUB_OUTPUT"
+            echo "::notice::afferent ${{ steps.ver.outputs.version }} is new — building and publishing."
+          fi
+
+      - name: Build + test
+        if: steps.gate.outputs.publish == 'true'
+        run: |
+          python -m pip install --upgrade build twine
+          python -m unittest discover -s tests -p 'test_*.py' -t .
+          python -m build
+          python -m twine check dist/*
+
+      - name: Publish to PyPI
+        if: steps.gate.outputs.publish == 'true'
+        uses: pypa/gh-action-pypi-publish@release/v1
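Note on the workflow above: the release gate is two shell steps, a `grep` for `__version__` followed by a `curl` against the PyPI JSON API. The Python sketch below restates that decision so its edge cases are easier to reason about. It is not part of the package; the names `read_version`, `on_pypi`, and `should_publish`, and the injectable `exists` callback, are assumptions made for illustration.

```python
import re
import urllib.error
import urllib.request

# Same pattern the workflow's grep uses.
_VERSION_RE = re.compile(r'__version__ = "([^"]+)"')


def read_version(init_source: str) -> str:
    """Extract the version string from the text of a package __init__.py."""
    match = _VERSION_RE.search(init_source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def on_pypi(project: str, version: str) -> bool:
    """Return True if PyPI's JSON API knows this release (needs network)."""
    url = f"https://pypi.org/pypi/{project}/{version}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # e.g. 404: this release has not been published


def should_publish(init_source: str, project: str, exists=on_pypi) -> bool:
    """Publish only when the declared version is not already released."""
    return not exists(project, read_version(init_source))
```

One behavioral difference is worth noting: `curl -sf` fails on a network error exactly as it does on a 404, so the workflow would proceed to build and publish in both cases, whereas this sketch lets a `URLError` propagate instead of treating an outage as "version is new".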
afferent-0.1.0/CLAUDE.md
ADDED
@@ -0,0 +1,106 @@
+# CLAUDE.md
+
+Guidance for AI agents (and humans) working in this repository.
+**This file is yours to edit** — keep it current as the code changes. If you
+add a backend, change the protocol, or alter the release flow, update the
+relevant section here in the same change.
+
+## What this is
+
+`afferent` is a **backend-agnostic sensorimotor protocol**: the typed,
+safety-gated seam between a cognitive layer ("brain", the planner) and an
+embodiment layer ("body", the backend). Afferent signals go up (eyes:
+`observe` / `locate` / `verify` / `read_text`); efferent signals go down
+(hands: `click` / `type_text` / `key` / `scroll`).
+
+It is a standalone library — no ties to any particular body or planner. The
+package is **stdlib-only** (zero runtime dependencies). It is published to
+PyPI as `afferent`.
+
+## Design rules (load-bearing — don't break these)
+
+1. **Core stays dependency-free.** `afferent/types.py`, `safety.py`,
+   `embodiment.py`, `backends/base.py`, `backends/fake.py`, and the package
+   `__init__` must import with **stdlib only**. A real-body backend may need
+   third-party packages (httpx, a CV stack, pyautogui, …) — put those behind
+   an optional extra and import them lazily/inside the backend module, never
+   at package import time.
+2. **Backends answer "how", the gate answers "should".** A `Backend`
+   implements raw primitives (`observe`, `do_click_at`, …) and never gates
+   itself for policy reasons. `Embodiment` applies the `SafetyGate` and the
+   post-action observation. Keep that split.
+3. **Eyes are never gated.** Observation has no blast radius; only efferent
+   (`do_*`) actions pass through the gate.
+4. **`read_only=True` is the default.** Hands refuse until a consumer opts in.
+   Don't change the default.
+5. **Coordinates are `pct`** — fractions in `[0,1]`, top-left origin,
+   resolution-independent. All backends and results use this.
+6. **`Observation.render_text()` must stay deterministic** — same observation
+   → byte-identical string. Consumers embed it and use it as a world-model
+   key. If you change the format, keep it stable and update the test.
+7. **`do_*` methods don't raise for ordinary failures** — return
+   `ActionResult(ok=False, reason=...)`. They may raise `BackendUnavailable`
+   when the transport/device is unreachable.
+
+## Layout
+
+```
+afferent/
+  __init__.py      # public surface + __version__ (single source of truth)
+  types.py         # Frame, VisualElement, Observation, LocateResult,
+                   # VerifyResult, ActionResult (the protocol)
+  safety.py        # SafetyGate (read_only, confirm, allowed_apps, rate, panic)
+  embodiment.py    # Embodiment facade — wraps a Backend with the gate
+  backends/
+    base.py        # Backend ABC + BackendUnavailable
+    fake.py        # FakeBackend — scripted, hardware-free reference impl
+tests/             # unittest, fully offline (no deps, no network)
+scripts/release.sh # build + publish to PyPI / TestPyPI
+.github/workflows/publish.yml # Trusted-Publishing release workflow
+pyproject.toml     # hatchling; version is dynamic from __init__.py
+```
+
+## Adding a backend
+
+Subclass `Backend`. Required: `capabilities()`, `observe()`, `do_click_at`,
+`do_type_text`, `do_key`. Optional (sensible defaults provided): `health`,
+`frontmost_app`, `screenshot`, `locate`, `verify`, `read_text`, `do_move_to`,
+`do_scroll`, `close`. `FakeBackend` is the worked example. If the backend
+needs third-party deps, add an extra in `pyproject.toml`
+(`[project.optional-dependencies]`) and import them inside the backend module
+so the core import stays clean. Add it to `backends/__init__.py` only if it's
+dependency-free; otherwise document the import path.
+
+## Tests
+
+```bash
+python -m unittest discover -s tests -p 'test_*.py' -t . # offline, no deps
+# or, with dev extras: pytest
+```
+
+Every test must run offline with zero third-party deps. Drive new behavior
+through `FakeBackend`. When you add a public method or a protocol field, add a
+test for it.
+
+## Releasing
+
+Version is single-sourced from `__version__` in `afferent/__init__.py`
+(hatchling reads it; `pyproject.toml` declares `dynamic = ["version"]`).
+
+- **CI (primary):** bump `__version__`, commit, push to `main`.
+  `.github/workflows/publish.yml` runs on every push to main, but only
+  builds+publishes when the version isn't already on PyPI (it queries the PyPI
+  JSON API and skips otherwise). Publishes via Trusted Publishing — no tokens.
+  One-time PyPI "pending publisher" config is documented in the workflow
+  header. **So: a version bump is the release trigger; ordinary pushes are
+  no-ops.**
+- **Local / TestPyPI:** `scripts/release.sh --test` then `scripts/release.sh`.
+  `--tag` pushes a `vX.Y.Z` git tag after a real upload.
+
+## Conventions
+
+- Keep public surface in `afferent/__init__.py:__all__` current.
+- Docstrings explain *why*; code says *what*. No emojis in code.
+- `from __future__ import annotations` in every module (supports Python 3.9).
+- Target Python ≥ 3.9; avoid 3.10+ runtime syntax (no `match`, no runtime
+  `X | Y` in `isinstance`, etc.).
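Note on design rule 1 and the "Adding a backend" section of CLAUDE.md: both ask that third-party imports happen inside the backend, never at package import time. One way that could look is sketched below. `LazyDepBackend`, its `_load` helper, and the extra name passed to it are hypothetical; a real backend would subclass `afferent.Backend`, which is left out here so the sketch runs without the package installed.

```python
import importlib


class LazyDepBackend:
    """Hypothetical backend skeleton: the third-party import is deferred."""

    def __init__(self, module_name: str, extra: str) -> None:
        self._module_name = module_name
        self._extra = extra
        self._driver = None  # nothing third-party is imported at construction

    def _load(self):
        """Import the driver module on first use and cache it."""
        if self._driver is None:
            try:
                self._driver = importlib.import_module(self._module_name)
            except ImportError as exc:
                # Point the user at the (hypothetical) optional extra.
                raise ImportError(
                    f"{self._module_name} is not installed; "
                    f"try: pip install 'afferent[{self._extra}]'"
                ) from exc
        return self._driver
```

With this shape, constructing the backend costs nothing extra, and a missing dependency surfaces only when a primitive first needs the driver, with a message that names the extra to install.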
afferent-0.1.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Andras Ferenczi
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
afferent-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,181 @@
+Metadata-Version: 2.4
+Name: afferent
+Version: 0.1.0
+Summary: A backend-agnostic sensorimotor protocol — eyes and hands for cognitive agents driving a computer.
+Project-URL: Homepage, https://github.com/andrasfe/spinalcord
+Project-URL: Repository, https://github.com/andrasfe/spinalcord
+Project-URL: Issues, https://github.com/andrasfe/spinalcord/issues
+Author-email: Andras Ferenczi <andrasf94@gmail.com>
+License: MIT
+License-File: LICENSE
+Keywords: agent,automation,computer-use,embodiment,gui-automation,llm,sensorimotor
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Topic :: Software Development :: Libraries
+Requires-Python: >=3.9
+Provides-Extra: dev
+Requires-Dist: build>=1.2; extra == 'dev'
+Requires-Dist: pytest>=8; extra == 'dev'
+Requires-Dist: ruff>=0.6; extra == 'dev'
+Requires-Dist: twine>=5; extra == 'dev'
+Description-Content-Type: text/markdown
+
+# afferent
+
+**A backend-agnostic sensorimotor protocol — eyes and hands for cognitive agents driving a computer.**
+
+A cognitive layer (a *brain*) plans; an embodiment layer (a *body*) acts.
+`afferent` is the conduit between them. It carries **afferent** signals up
+(eyes — `observe` / `locate` / `verify` / `read_text`) and **efferent**
+signals down (hands — `click` / `type_text` / `key` / `scroll`), as typed,
+safety-gated calls over a **pluggable backend**.
+
+```
+┌─────────┐   afferent (eyes)  ↑   ┌────────────┐   actions   ┌──────────┐
+│  brain  │ ◀───────────────────── │  afferent  │ ──────────▶ │   body   │
+│ (plans) │ ──────────────────────▶│ (protocol) │ ◀────────── │ (backend)│
+└─────────┘   efferent (hands) ↓   └────────────┘ observations└──────────┘
+```
+
+The package is **dependency-free** (stdlib only). It ships one working
+backend — `FakeBackend` (scripted, hardware-free) — and a `Backend` ABC you
+subclass to drive a real body: browser automation (Playwright/Selenium), OS
+automation (pyautogui / accessibility APIs), a VM driver, a remote HID bridge,
+or a test harness. The protocol doesn't care which.
+
+## Why it exists
+
+Most computer-use agents fuse perception, planning, and action into one
+monolith. `afferent` deliberately splits the *body* from the *mind* with a
+narrow, typed seam, so:
+
+- the planner stays free to be anything (an LLM loop, a cognitive
+  architecture, a script);
+- the body stays free to be anything (a real desktop, a browser, a VM, a
+  fake);
+- and the whole loop is **unit-testable offline** via the scripted fake
+  backend — no hardware, no network, no API keys.
+
+## Install
+
+```bash
+pip install afferent
+```
+
+That's it — no dependencies. (Dev tooling: `pip install afferent[dev]`.)
+
+## Quickstart — offline, scripted body (works immediately)
+
+```python
+from afferent import Embodiment
+from afferent.types import Observation, VisualElement
+
+screen0 = Observation(
+    ts=0.0, frontmost_app="Firefox",
+    elements=[VisualElement("Run", (0.80, 0.20, 0.10, 0.04), kind="button")],
+)
+screen1 = Observation(ts=1.0, frontmost_app="Firefox", ocr_text="running…")
+
+em = Embodiment.fake(script=[screen0, screen1]) # read_only=False for the demo
+
+print(em.observe().render_text()) # afferent: see the screen
+res = em.click("Run") # efferent: locate + click
+print(res.ok, res.steps, res.state_after.ocr_text) # grounded outcome
+```
+
+## The protocol
+
+All coordinates are `pct` — fractions in `[0, 1]`, top-left origin,
+resolution-independent (so they're stable world-model keys across machines).
+
+Typed results (`afferent.types`): `Frame`, `VisualElement`, `Observation`,
+`LocateResult`, `VerifyResult`, `ActionResult`.
+
+`Observation.render_text()` is a **stable, compact, embeddable** one-screen
+string — feed it to an embedding model and use it as a key in a learned world
+model. Determinism is guaranteed (same observation → byte-identical string).
+
+`ActionResult` carries **grounding** for predictive-coding / world-model
+consumers: `steps` (e.g. visual-servo iterations), `duration_ms`,
+`final_cursor_pct`, `frame_before` / `frame_after`, and a `state_after`
+observation bracketing the action.
+
+## Safety
+
+`SafetyGate` sits in front of every efferent action (eyes are never gated):
+
+- `read_only=True` is the **default** — hands refuse until you opt in.
+- `confirm(desc) -> bool` — a per-action veto your planner drives.
+- `allowed_apps` — refuse when the frontmost app isn't allowed.
+- `max_actions_per_min` — rate limit against runaway loops.
+- `panic()` — latch into a permanent refusing state.
+
+This is *additive* to whatever gates a backend enforces internally. Both must
+pass.
+
+## Writing a backend
+
+Subclass `afferent.Backend`, implement the eyes (`observe`, optionally
+`locate` / `verify` / `read_text`) and the raw hands (`do_click_at`,
+`do_type_text`, `do_key`, optionally `do_move_to` / `do_scroll`), and declare
+`capabilities()`. `Embodiment` applies the `SafetyGate` and the post-action
+observation for you — a backend only answers "how do I see / move", never
+"should I".
+
+```python
+from afferent import Backend, Embodiment
+from afferent.types import Observation, ActionResult
+
+class MyBackend(Backend):
+    name = "mybody"
+    def capabilities(self):
+        return {"pixels", "click", "type", "key"}
+    def observe(self, *, ocr=False, locate=None) -> Observation:
+        ... # capture your screen → Observation
+    def do_click_at(self, x_pct, y_pct, button, count) -> ActionResult:
+        ... # drive your mouse; return ActionResult(ok=True, ...)
+    def do_type_text(self, text, secret, append_enter) -> ActionResult:
+        ...
+    def do_key(self, combo) -> ActionResult:
+        ...
+
+em = Embodiment(MyBackend(), read_only=False)
+```
+
+`FakeBackend` (in `afferent/backends/fake.py`) is a complete, readable
+reference implementation of the contract.
+
+## Develop
+
+```bash
+pip install -e ".[dev]"
+python -m unittest discover -s tests -v # fully offline, no deps
+# or: pytest
+```
+
+## Releasing
+
+Publishing is automatic. Bump `__version__` in `afferent/__init__.py`,
+commit, and **push to `main`** — `.github/workflows/publish.yml` builds, tests,
+and publishes to PyPI via Trusted Publishing (no tokens). Pushes that don't
+change the version are a no-op (the workflow checks PyPI and skips).
+
+One-time setup is in the workflow header (add a "pending publisher" on PyPI).
+
+For a manual / TestPyPI publish, use the local script:
+
+```bash
+scripts/release.sh --test # TestPyPI
+scripts/release.sh # PyPI
+```
+
+## License
+
+MIT.
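Note on "The protocol" section embedded in PKG-INFO: it makes two checkable promises, that coordinates are `pct` fractions with a top-left origin and that rendering an observation is byte-for-byte deterministic. The sketch below shows one way to get both properties. It is not the package's implementation: `Element`, `render`, `pct_to_pixels`, the output format, and the reading of the 4-tuple as `(x, y, w, h)` are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical visual element; box is assumed to be (x, y, w, h) in pct."""
    label: str
    box_pct: Tuple[float, float, float, float]
    kind: str = ""


def render(frontmost_app: str, elements: List[Element]) -> str:
    """Render a screen to a string that depends only on its content."""
    lines = [f"app={frontmost_app}"]
    # Sort so the order in which elements were detected cannot change the output.
    for el in sorted(elements, key=lambda e: (e.box_pct, e.label, e.kind)):
        x, y, w, h = el.box_pct
        # Fixed precision keeps float repr noise out of the rendered key.
        lines.append(
            f"{el.kind or '?'} {el.label!r} @ {x:.3f},{y:.3f} {w:.3f}x{h:.3f}"
        )
    return "\n".join(lines)


def pct_to_pixels(
    x_pct: float, y_pct: float, width: int, height: int
) -> Tuple[int, int]:
    """Map a top-left-origin fraction in [0, 1] to a pixel inside the screen."""
    if not (0.0 <= x_pct <= 1.0 and 0.0 <= y_pct <= 1.0):
        raise ValueError("pct coordinates must lie in [0, 1]")
    # Clamp so that 1.0 maps to the last pixel rather than one past the edge.
    return (min(int(x_pct * width), width - 1), min(int(y_pct * height), height - 1))
```

The two ingredients that matter for determinism are a total ordering over elements and fixed-precision number formatting; anything that leaks detection order, timestamps, or raw float reprs into the string would make it unusable as a stable key.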
afferent-0.1.0/README.md
ADDED
@@ -0,0 +1,152 @@
+# afferent
+
+**A backend-agnostic sensorimotor protocol — eyes and hands for cognitive agents driving a computer.**
+
+A cognitive layer (a *brain*) plans; an embodiment layer (a *body*) acts.
+`afferent` is the conduit between them. It carries **afferent** signals up
+(eyes — `observe` / `locate` / `verify` / `read_text`) and **efferent**
+signals down (hands — `click` / `type_text` / `key` / `scroll`), as typed,
+safety-gated calls over a **pluggable backend**.
+
+```
+┌─────────┐   afferent (eyes)  ↑   ┌────────────┐   actions   ┌──────────┐
+│  brain  │ ◀───────────────────── │  afferent  │ ──────────▶ │   body   │
+│ (plans) │ ──────────────────────▶│ (protocol) │ ◀────────── │ (backend)│
+└─────────┘   efferent (hands) ↓   └────────────┘ observations└──────────┘
+```
+
+The package is **dependency-free** (stdlib only). It ships one working
+backend — `FakeBackend` (scripted, hardware-free) — and a `Backend` ABC you
+subclass to drive a real body: browser automation (Playwright/Selenium), OS
+automation (pyautogui / accessibility APIs), a VM driver, a remote HID bridge,
+or a test harness. The protocol doesn't care which.
+
+## Why it exists
+
+Most computer-use agents fuse perception, planning, and action into one
+monolith. `afferent` deliberately splits the *body* from the *mind* with a
+narrow, typed seam, so:
+
+- the planner stays free to be anything (an LLM loop, a cognitive
+  architecture, a script);
+- the body stays free to be anything (a real desktop, a browser, a VM, a
+  fake);
+- and the whole loop is **unit-testable offline** via the scripted fake
+  backend — no hardware, no network, no API keys.
+
+## Install
+
+```bash
+pip install afferent
+```
+
+That's it — no dependencies. (Dev tooling: `pip install afferent[dev]`.)
+
+## Quickstart — offline, scripted body (works immediately)
+
+```python
+from afferent import Embodiment
+from afferent.types import Observation, VisualElement
+
+screen0 = Observation(
+    ts=0.0, frontmost_app="Firefox",
+    elements=[VisualElement("Run", (0.80, 0.20, 0.10, 0.04), kind="button")],
+)
+screen1 = Observation(ts=1.0, frontmost_app="Firefox", ocr_text="running…")
+
+em = Embodiment.fake(script=[screen0, screen1]) # read_only=False for the demo
+
+print(em.observe().render_text()) # afferent: see the screen
+res = em.click("Run") # efferent: locate + click
+print(res.ok, res.steps, res.state_after.ocr_text) # grounded outcome
+```
+
+## The protocol
+
+All coordinates are `pct` — fractions in `[0, 1]`, top-left origin,
+resolution-independent (so they're stable world-model keys across machines).
+
+Typed results (`afferent.types`): `Frame`, `VisualElement`, `Observation`,
+`LocateResult`, `VerifyResult`, `ActionResult`.
+
+`Observation.render_text()` is a **stable, compact, embeddable** one-screen
+string — feed it to an embedding model and use it as a key in a learned world
+model. Determinism is guaranteed (same observation → byte-identical string).
+
+`ActionResult` carries **grounding** for predictive-coding / world-model
+consumers: `steps` (e.g. visual-servo iterations), `duration_ms`,
+`final_cursor_pct`, `frame_before` / `frame_after`, and a `state_after`
+observation bracketing the action.
+
+## Safety
+
+`SafetyGate` sits in front of every efferent action (eyes are never gated):
+
+- `read_only=True` is the **default** — hands refuse until you opt in.
+- `confirm(desc) -> bool` — a per-action veto your planner drives.
+- `allowed_apps` — refuse when the frontmost app isn't allowed.
+- `max_actions_per_min` — rate limit against runaway loops.
+- `panic()` — latch into a permanent refusing state.
+
+This is *additive* to whatever gates a backend enforces internally. Both must
+pass.
+
+## Writing a backend
+
+Subclass `afferent.Backend`, implement the eyes (`observe`, optionally
+`locate` / `verify` / `read_text`) and the raw hands (`do_click_at`,
+`do_type_text`, `do_key`, optionally `do_move_to` / `do_scroll`), and declare
+`capabilities()`. `Embodiment` applies the `SafetyGate` and the post-action
+observation for you — a backend only answers "how do I see / move", never
+"should I".
+
+```python
+from afferent import Backend, Embodiment
+from afferent.types import Observation, ActionResult
+
+class MyBackend(Backend):
+    name = "mybody"
+    def capabilities(self):
+        return {"pixels", "click", "type", "key"}
+    def observe(self, *, ocr=False, locate=None) -> Observation:
+        ... # capture your screen → Observation
+    def do_click_at(self, x_pct, y_pct, button, count) -> ActionResult:
+        ... # drive your mouse; return ActionResult(ok=True, ...)
+    def do_type_text(self, text, secret, append_enter) -> ActionResult:
+        ...
+    def do_key(self, combo) -> ActionResult:
+        ...
+
+em = Embodiment(MyBackend(), read_only=False)
+```
+
+`FakeBackend` (in `afferent/backends/fake.py`) is a complete, readable
+reference implementation of the contract.
+
+## Develop
+
+```bash
+pip install -e ".[dev]"
+python -m unittest discover -s tests -v # fully offline, no deps
+# or: pytest
+```
+
+## Releasing
+
+Publishing is automatic. Bump `__version__` in `afferent/__init__.py`,
+commit, and **push to `main`** — `.github/workflows/publish.yml` builds, tests,
+and publishes to PyPI via Trusted Publishing (no tokens). Pushes that don't
+change the version are a no-op (the workflow checks PyPI and skips).
+
+One-time setup is in the workflow header (add a "pending publisher" on PyPI).
+
+For a manual / TestPyPI publish, use the local script:
+
+```bash
+scripts/release.sh --test # TestPyPI
+scripts/release.sh # PyPI
+```
+
+## License
+
+MIT.
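Note on the README's Safety section: it lists five checks that sit in front of every efferent action. The sketch below shows how such checks could compose into a single decision. It is not `afferent.safety.SafetyGate`, whose source is not shown in this part of the diff; the class name `GateSketch`, the `check` method, the refusal strings, the order of the checks, and the injectable `clock` are all assumptions for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional, Set, Tuple


class GateSketch:
    """Hypothetical gate illustrating the five checks the README lists."""

    def __init__(
        self,
        read_only: bool = True,  # hands refuse until the consumer opts in
        confirm: Optional[Callable[[str], bool]] = None,
        allowed_apps: Optional[Set[str]] = None,
        max_actions_per_min: Optional[int] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.read_only = read_only
        self.confirm = confirm
        self.allowed_apps = allowed_apps
        self.max_actions_per_min = max_actions_per_min
        self._clock = clock
        self._panicked = False
        self._recent: Deque[float] = deque()  # timestamps of allowed actions

    def panic(self) -> None:
        """Latch into a refusing state; this sketch offers no way back."""
        self._panicked = True

    def check(self, desc: str, frontmost_app: str) -> Tuple[bool, str]:
        """Return (allowed, reason); reason is empty when allowed."""
        now = self._clock()
        if self._panicked:
            return (False, "panic")
        if self.read_only:
            return (False, "read_only")
        if self.allowed_apps is not None and frontmost_app not in self.allowed_apps:
            return (False, "app not allowed")
        if self.max_actions_per_min is not None:
            # Drop timestamps that have left the 60-second sliding window.
            while self._recent and now - self._recent[0] >= 60.0:
                self._recent.popleft()
            if len(self._recent) >= self.max_actions_per_min:
                return (False, "rate limit")
        if self.confirm is not None and not self.confirm(desc):
            return (False, "vetoed")
        self._recent.append(now)
        return (True, "")
```

Returning a reason alongside the boolean mirrors the document's rule that ordinary refusals are reported as results instead of being raised, and taking the clock as a parameter keeps the rate limit testable offline.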
@@ -0,0 +1,43 @@
+"""afferent — a backend-agnostic sensorimotor protocol for cognitive agents.
+
+A cognitive layer (a "brain") plans; an embodiment layer (a "body") acts.
+``afferent`` is the conduit between them: it carries **afferent** signals
+up (eyes — observe / locate / verify / read) and **efferent** signals down
+(hands — click / type / key / scroll), as typed, safety-gated calls over a
+pluggable `Backend`.
+
+The package is **dependency-free** (stdlib only). It ships one working
+backend — `FakeBackend` (scripted, hardware-free) — and a `Backend` ABC you
+subclass to drive a real body (browser automation, OS automation, a VM
+driver, a remote HID bridge, a test harness, …). `Embodiment` wraps any
+backend with a `SafetyGate` and post-action observation.
+"""
+from __future__ import annotations
+
+from .backends.base import Backend
+from .backends.fake import FakeBackend
+from .embodiment import Embodiment
+from .safety import SafetyGate
+from .types import (
+    ActionResult,
+    Frame,
+    LocateResult,
+    Observation,
+    VerifyResult,
+    VisualElement,
+)
+
+__version__ = "0.1.0"
+
+__all__ = [
+    "Embodiment",
+    "Backend",
+    "FakeBackend",
+    "SafetyGate",
+    "Frame",
+    "VisualElement",
+    "Observation",
+    "LocateResult",
+    "VerifyResult",
+    "ActionResult",
+]
@@ -0,0 +1,12 @@
+"""Backends — pluggable embodiment implementations.
+
+`Backend` is the ABC you subclass to drive a real body. `FakeBackend` is a
+scripted, hardware-free reference implementation (stdlib only) used for tests
+and as a worked example of the contract.
+"""
+from __future__ import annotations
+
+from .base import Backend
+from .fake import FakeBackend
+
+__all__ = ["Backend", "FakeBackend"]