ezscreen 0.1.0__tar.gz
- ezscreen-0.1.0/.ai_context/00_PROJECT_VIBE.md +72 -0
- ezscreen-0.1.0/.ai_context/01_TECH_ARCHITECTURE.md +240 -0
- ezscreen-0.1.0/.ai_context/02_CODING_RULES.md +143 -0
- ezscreen-0.1.0/.ai_context/03_TASK_TRACKER.md +148 -0
- ezscreen-0.1.0/.ai_context/04_DECISION_LOG.md +196 -0
- ezscreen-0.1.0/.gitignore +50 -0
- ezscreen-0.1.0/PKG-INFO +121 -0
- ezscreen-0.1.0/README.md +85 -0
- ezscreen-0.1.0/SESSION_2026-04-14.md +55 -0
- ezscreen-0.1.0/ezscreen/__init__.py +1 -0
- ezscreen-0.1.0/ezscreen/admet/__init__.py +0 -0
- ezscreen-0.1.0/ezscreen/admet/filter.py +175 -0
- ezscreen-0.1.0/ezscreen/auth.py +238 -0
- ezscreen-0.1.0/ezscreen/backends/__init__.py +0 -0
- ezscreen-0.1.0/ezscreen/backends/kaggle/__init__.py +0 -0
- ezscreen-0.1.0/ezscreen/backends/kaggle/dataset.py +124 -0
- ezscreen-0.1.0/ezscreen/backends/kaggle/kernel.py +91 -0
- ezscreen-0.1.0/ezscreen/backends/kaggle/poller.py +126 -0
- ezscreen-0.1.0/ezscreen/backends/kaggle/runner.py +265 -0
- ezscreen-0.1.0/ezscreen/backends/kaggle/templates/vina_shard.ipynb.j2 +395 -0
- ezscreen-0.1.0/ezscreen/checkpoint.py +187 -0
- ezscreen-0.1.0/ezscreen/cli.py +173 -0
- ezscreen-0.1.0/ezscreen/commands/admet.py +63 -0
- ezscreen-0.1.0/ezscreen/commands/auth.py +8 -0
- ezscreen-0.1.0/ezscreen/commands/run.py +563 -0
- ezscreen-0.1.0/ezscreen/commands/status.py +97 -0
- ezscreen-0.1.0/ezscreen/commands/validate.py +67 -0
- ezscreen-0.1.0/ezscreen/commands/view.py +119 -0
- ezscreen-0.1.0/ezscreen/config.py +77 -0
- ezscreen-0.1.0/ezscreen/errors.py +133 -0
- ezscreen-0.1.0/ezscreen/pocket/__init__.py +0 -0
- ezscreen-0.1.0/ezscreen/pocket/detect.py +226 -0
- ezscreen-0.1.0/ezscreen/prep/__init__.py +0 -0
- ezscreen-0.1.0/ezscreen/prep/ligands.py +248 -0
- ezscreen-0.1.0/ezscreen/prep/receptor.py +305 -0
- ezscreen-0.1.0/ezscreen/report.py +179 -0
- ezscreen-0.1.0/ezscreen/results/__init__.py +0 -0
- ezscreen-0.1.0/ezscreen/results/merger.py +103 -0
- ezscreen-0.1.0/ezscreen/state.py +9 -0
- ezscreen-0.1.0/ezscreen/vendor/__init__.py +1 -0
- ezscreen-0.1.0/ezscreen/vendor/scrubber/__init__.py +38 -0
- ezscreen-0.1.0/ezscreen/version_check.py +77 -0
- ezscreen-0.1.0/pyproject.toml +76 -0
- ezscreen-0.1.0/tests/__init__.py +1 -0
- ezscreen-0.1.0/tests/test_smoke.py +176 -0
|
@@ -0,0 +1,72 @@
|
|
|
1
|
+
# 00 — Project Vibe
|
|
2
|
+
|
|
3
|
+
> **STATUS: INITIALIZED FROM HANDOFF DOCUMENT**
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Core Vision
|
|
8
|
+
|
|
9
|
+
**ezscreen** is a command-line tool that makes GPU-accelerated virtual screening accessible to researchers who do not have access to high-performance computing hardware. It automates the entire molecular docking pipeline — from receptor and ligand preparation through docking, hit filtering, and pose validation — using Kaggle's free T4 GPU infrastructure as the compute backend.
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## The Problem We Are Solving
|
|
14
|
+
|
|
15
|
+
Computational chemists and pharmacologists at under-resourced institutions spend hours configuring docking tools, managing file formats, and debugging prep pipelines before any science gets done. ezscreen collapses this into an interactive CLI with sensible defaults and a guided decision tree.
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## Target Users
|
|
20
|
+
|
|
21
|
+
Researchers and students in molecular biology, pharmacology, and computational chemistry who are:
|
|
22
|
+
- Comfortable with a terminal
|
|
23
|
+
- Not necessarily expert computational chemists
|
|
24
|
+
- Using Kaggle or Google Colab for GPU access
|
|
25
|
+
- At institutions without dedicated HPC infrastructure
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
## The North Star Statement
|
|
30
|
+
|
|
31
|
+
> *"A researcher with a PDB code and a ligand library should be able to run GPU-accelerated virtual screening in under 10 minutes, with no HPC access and no configuration expertise."*
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
35
|
+
## Non-Negotiable Core Features (v1 MVP)
|
|
36
|
+
|
|
37
|
+
1. **`ezscreen run`** — Full interactive guided pipeline: receptor prep → binding site detection → ligand prep → ADMET pre-filter → docking on Kaggle T4 → results
|
|
38
|
+
2. **`ezscreen auth`** — Credential wizard for Kaggle API + NVIDIA NIM (optional)
|
|
39
|
+
3. **`ezscreen status`** — Live job monitor with auto-refresh, attach/resume/delete actions
|
|
40
|
+
4. **`ezscreen resume [run-id]`** — Checkpoint-based resume for interrupted runs (SQLite state)
|
|
41
|
+
5. **`ezscreen validate`** — Stage 2 DiffDock-L hit validation via NVIDIA NIM API
|
|
42
|
+
6. **`ezscreen admet`** — Standalone ADMET filtering on any CSV or SDF
|
|
43
|
+
7. **`ezscreen view`** — Rich table + py3Dmol self-contained HTML results viewer
|
|
44
|
+
8. **`ezscreen clean [run-id]`** — Remove Kaggle dataset + kernel artifacts for a run
|
|
45
|
+
|
|
46
|
+
---
|
|
47
|
+
|
|
48
|
+
## UX Non-Negotiables
|
|
49
|
+
|
|
50
|
+
- Every screen has a `← Back` option
|
|
51
|
+
- Breadcrumb trail always shown at top of every screen
|
|
52
|
+
- Never exit to a bare terminal prompt — always offer next logical action
|
|
53
|
+
- Full raw parameter set always shown on confirmation screen (for citation)
|
|
54
|
+
- No silent defaults on consequential decisions
|
|
55
|
+
- Ctrl+C detaches gracefully — the Kaggle kernel continues running
|
|
56
|
+
- No emoji in terminal output — color-coded with specific hex palette (see `01_TECH_ARCHITECTURE.md`)
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## Scope — v1 Only
|
|
61
|
+
|
|
62
|
+
> ⚠️ Features marked v2 in the handoff are **explicitly deferred** and must not be implemented in v1.
|
|
63
|
+
|
|
64
|
+
**Out of scope for v1:**
|
|
65
|
+
- Local compute backend (no Kaggle)
|
|
66
|
+
- Google Colab backend
|
|
67
|
+
- `ezscreen refine` — MD / MMGBSA post-docking
|
|
68
|
+
- Web dashboard (Gradio / Streamlit)
|
|
69
|
+
- Multi-target screening (one library vs N receptors)
|
|
70
|
+
- Deep ADMET (ADMET-AI, pkCSM)
|
|
71
|
+
- Collaboration / team hit lists
|
|
72
|
+
- Consensus docking (UniDock-Pro + Gnina)
|
|
@@ -0,0 +1,240 @@
|
|
|
1
|
+
# 01 — Tech Architecture
|
|
2
|
+
|
|
3
|
+
> **STATUS: ALL DECISIONS FINAL — from handoff document**
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## Tech Stack
|
|
8
|
+
|
|
9
|
+
| Layer | Technology | Rationale |
|---|---|---|
| CLI framework | **Typer** | Cleaner nested subcommands than Click; type-hint driven; less boilerplate |
| Terminal output | **Rich** | Progress bars, styled tables, panels, syntax highlighting |
| Interactive prompts | **Questionary** | Styled select menus, checkboxes, text inputs via prompt-toolkit |
| Config format | **TOML** | No YAML Norway-problem bugs; `tomllib` in stdlib (3.11+); `tomli-w` for writing |
| Python floor | **3.10 minimum** | 3.11+ preferred; compatibility matrix must be verified before finalising |
| Package distribution | **pip first** | `pyproject.toml`; conda-forge recipe filed after stable v1 release |
| Checkpointing | **SQLite** | Run state, shard status, resume by run-id |
| Notebook templating | **Jinja2** | Renders complete `.ipynb` from template variables |
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## Docking Engines
|
|
23
|
+
|
|
24
|
+
| Stage | Engine | Notes |
|---|---|---|
| Stage 1 primary | **UniDock-Pro** | SBVS + LBVS + Hybrid; Apache 2.0; built from source on Kaggle (CUDA ≥ 11.8 + Boost) |
| Stage 1 fallback | **UniDock** | `pip install unidock-tools`; used when no reference ligand is available |
| Stage 2 validation | **DiffDock-L** | Via NVIDIA NIM API; free; called locally (not on Kaggle) |
| v2 (deferred) | Local backend, Colab backend, MD/MMGBSA | Do not implement |
|
|
30
|
+
|
|
31
|
+
> ⚠️ UniDock-Pro build from source happens in notebook template Cell 2 on Kaggle.
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
35
|
+
## Prep Pipeline
|
|
36
|
+
|
|
37
|
+
| Step | Tool | Notes |
|---|---|---|
| Receptor prep | pdbfixer → Meeko `mk_prepare_receptor` | Missing residues, waters, alternates, hydrogens |
| Ligand prep | scrub.py → RDKit ETKDG → Meeko | Protonation, 3D conformers, PDBQT output |
| Pocket detection | Tiered (see below) | co-crystal → residues → P2Rank → blind |
| ADMET filtering | RDKit | Ro5, PAINS, toxicophores, Veber, Egan BBB |
| Molscrub install | `git+https://github.com/forlilab/scrubber` | Not on PyPI — pinned in pyproject.toml |
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
## Package Structure
|
|
48
|
+
|
|
49
|
+
```
ezscreen/
├── cli.py                  # Typer entry points — all subcommands
├── state.py                # State machine — BACK sentinel, context dict
├── config.py               # TOML loader/writer — ~/.ezscreen/config.toml
├── auth.py                 # Credential management — ~/.ezscreen/credentials
├── backends/
│   └── kaggle/
│       ├── runner.py       # Orchestrates dataset push + kernel + poll + download
│       ├── dataset.py      # Kaggle datasets API wrapper
│       ├── kernel.py       # Kaggle kernels API wrapper
│       ├── poller.py       # Status polling — Rich progress, retry logic
│       └── templates/
│           └── vina_shard.ipynb.j2  # Jinja2 notebook template
├── prep/
│   ├── receptor.py         # pdbfixer + Meeko receptor prep
│   └── ligands.py          # scrub.py + RDKit + Meeko ligand prep
├── pocket/
│   └── detect.py           # Tiered pocket detection
├── admet/
│   └── filter.py           # RDKit ADMET filters
├── results/
│   ├── merger.py           # Merge shard CSVs, global re-rank
│   └── viewer.py           # Rich table + py3Dmol HTML generator
├── checkpoint.py           # SQLite run state — create, update, resume
├── report.py               # Prep report — .txt + .json writer
├── errors.py               # Error taxonomy — all error classes
└── vendor/
    └── scrubber/           # Vendored molscrub (if vendoring decided)
```
|
|
79
|
+
|
|
80
|
+
---
|
|
81
|
+
|
|
82
|
+
## Config Files
|
|
83
|
+
|
|
84
|
+
### `~/.ezscreen/config.toml` — preferences only, never secrets
|
|
85
|
+
|
|
86
|
+
```toml
[run]
auto_resume_threshold = 10 # auto-resume if failure % >= this
shard_retry_limit = 3 # retries before prompting user
default_search_depth = "Balanced"
default_ph = 7.4
admet_pre_filter = true

[kaggle]
default_dataset = "username/ezscreen-workspace"

[defaults]
box_padding = 5.0
enumerate_tautomers = false
```
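
A minimal sketch of how these preferences might be read, assuming built-in defaults are overlaid with whatever the user file provides. `load_config` and `DEFAULTS` are hypothetical names and are not taken from `ezscreen/config.py`; the `tomllib`/`tomli` import fallback reflects the 3.10 floor stated in the tech stack.

```python
import copy
from pathlib import Path

try:
    import tomllib  # stdlib on Python 3.11+
except ModuleNotFoundError:  # Python 3.10: fall back to the tomli backport
    import tomli as tomllib

# Hypothetical defaults mirroring the example file above.
DEFAULTS = {
    "run": {
        "auto_resume_threshold": 10,
        "shard_retry_limit": 3,
        "default_search_depth": "Balanced",
        "default_ph": 7.4,
        "admet_pre_filter": True,
    },
    "defaults": {"box_padding": 5.0, "enumerate_tautomers": False},
}


def load_config(path: Path) -> dict:
    """Return DEFAULTS overlaid with any values found in the TOML file at path."""
    config = copy.deepcopy(DEFAULTS)
    if path.exists():
        with path.open("rb") as handle:  # tomllib requires a binary file handle
            user = tomllib.load(handle)
        for section, values in user.items():
            if isinstance(values, dict):
                config.setdefault(section, {}).update(values)
            else:
                config[section] = values
    return config
```

Calling `load_config(Path.home() / ".ezscreen" / "config.toml")` would return the defaults unchanged when the file does not exist yet.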
|
|
101
|
+
|
|
102
|
+
### `~/.ezscreen/credentials` — chmod 600 enforced — NEVER in config.toml
|
|
103
|
+
|
|
104
|
+
```
kaggle_json_path = "~/.kaggle/kaggle.json"
nim_api_key = "nvapi-xxxxxxxxxxxx"
```
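
One way the "chmod 600 enforced" rule could be applied at write time is sketched below. `write_credentials` is a hypothetical helper, not the function in `ezscreen/auth.py`, and it assumes values contain no double quotes. The mode bits are only meaningful on POSIX systems.

```python
import os
import stat
from pathlib import Path


def write_credentials(path: Path, lines: dict[str, str]) -> None:
    """Write key = "value" pairs to path, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    body = "".join(f'{key} = "{value}"\n' for key, value in lines.items())
    # Create the file with mode 0o600 from the start so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(body)
    # A pre-existing file keeps its old mode, so tighten it explicitly as well.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Passing the mode to `os.open` and then calling `os.chmod` covers both a fresh file and one that already existed with looser permissions.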
|
|
108
|
+
|
|
109
|
+
---
|
|
110
|
+
|
|
111
|
+
## Kaggle Dataset Strategy
|
|
112
|
+
|
|
113
|
+
- Single persistent dataset: `ezscreen-workspace` per user
|
|
114
|
+
- Local manifest at `~/.ezscreen/manifest.json` — SHA-256 hash dedup
|
|
115
|
+
- Receptor: uploaded once, reused across runs if hash matches
|
|
116
|
+
- Ligand shards: always new per run
|
|
117
|
+
- Dataset structure: `receptor_{hash8}.pdb`, `shard_{run_id}_{n}.sdf`
|
|
118
|
+
- `ezscreen clean [run-id]` → deletes entire dataset + kernel artifacts
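
The hash-based dedup described in this list can be sketched as follows. The manifest layout (`{"receptors": {sha256: filename}}`) and the function names are assumptions for illustration; only the `receptor_{hash8}.pdb` naming comes from the list above.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large receptors do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def receptor_upload_name(receptor: Path, manifest_path: Path) -> tuple[str, bool]:
    """Return (dataset filename, needs_upload) for a receptor file.

    The manifest layout {"receptors": {sha256: filename}} is an assumption.
    """
    manifest = {"receptors": {}}
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    file_hash = sha256_of(receptor)
    known = manifest.setdefault("receptors", {})
    if file_hash in known:
        return known[file_hash], False  # identical content was recorded before
    name = f"receptor_{file_hash[:8]}.pdb"
    known[file_hash] = name
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return name, True
```

This sketch records the hash before any upload happens; a real implementation would probably update the manifest only after the upload succeeds.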
|
|
119
|
+
|
|
120
|
+
---
|
|
121
|
+
|
|
122
|
+
## Notebook Template Structure (Jinja2 → .ipynb)
|
|
123
|
+
|
|
124
|
+
| Cell | Purpose |
|---|---|
| Cell 1 | Auto-generated header — do not edit |
| Cell 2 | Install deps — UniDock-Pro build or UniDock pip, Meeko, RDKit |
| Cell 3 | Environment check — GPU? CUDA version? Disk space? |
| Cell 4 | Load inputs from `/kaggle/input/ezscreen-{run_id}/` |
| Cell 5 | Receptor prep — pdbfixer → Meeko |
| Cell 6 | Ligand prep — scrub.py → Meeko, batch processing |
| Cell 7 | Docking loop — UniDock-Pro, writes checkpoint per shard |
| Cell 8 | Collect results — write `scores.csv` + `poses.sdf` to `/kaggle/working/` |
| Cell 9 | Write `done.flag` — signals clean completion to poller |
|
|
135
|
+
|
|
136
|
+
> Every cell writes a structured `error.json` before raising. Poller distinguishes: timeout / prep failure / GPU OOM / clean completion.
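
The "write `error.json`, then raise" behaviour could be wrapped in a small context manager, as sketched here. The field names and category labels are hypothetical; the source specifies only that the file is structured and written before the exception propagates.

```python
import json
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def cell_guard(cell: str, category: str, out_dir: Path):
    """Write error.json describing a failure in this cell, then re-raise it."""
    try:
        yield
    except Exception as exc:
        payload = {
            "cell": cell,
            "category": category,  # e.g. "prep_failure" or "gpu_oom" (hypothetical labels)
            "error_type": type(exc).__name__,
            "message": str(exc),
        }
        (out_dir / "error.json").write_text(json.dumps(payload, indent=2), encoding="utf-8")
        raise
```

A notebook cell would then run its body inside `with cell_guard("cell_5", "prep_failure", Path("/kaggle/working")):` so the poller can read the category after the kernel stops.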
|
|
137
|
+
|
|
138
|
+
### Jinja2 Template Variables
|
|
139
|
+
|
|
140
|
+
```python
{
    "run_id": "ezs-4f2a8c",
    "engine": "unidock-pro",  # or "unidock"
    "mode": "hybrid",  # sbvs | lbvs | hybrid
    "box_center": [x, y, z],  # placeholders for the box center coordinates
    "box_size": [sx, sy, sz],  # placeholders for the box edge lengths
    "exhaustiveness": 8,
    "num_modes": 3,
    "refine_step": 5,
    "max_step": 0,
    "ph": 7.4,
    "shard_index": 2,
    "total_shards": 10,
    "enumerate_tautomers": False,
    "receptor_dataset_path": "/kaggle/input/ezscreen-abc123/receptor.pdb",
    "ligand_dataset_path": "/kaggle/input/ezscreen-abc123/shard_02.sdf"
}
```
|
|
159
|
+
|
|
160
|
+
---
|
|
161
|
+
|
|
162
|
+
## State Machine (Back Button)
|
|
163
|
+
|
|
164
|
+
```python
import questionary

BACK = object()  # sentinel returned when the user picks "← Back"


def screen_binding_site(ctx):
    # The elided arguments stand for the prompt message and this screen's real choices.
    choice = questionary.select(..., choices=[..., "← Back"]).ask()
    if choice == "← Back":
        return BACK
    ctx["binding_site"] = choice
    return ctx
```
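
A driver for screens written in this style might look like the sketch below. `run_screens` is a hypothetical name, and the real loop in `commands/run.py` may differ; the sketch only shows how a `BACK` return value can step the wizard back by one screen.

```python
from typing import Callable, Union

BACK = object()  # same sentinel idea as in the snippet above

Screen = Callable[[dict], Union[dict, object]]


def run_screens(screens: list[Screen], ctx: dict) -> Union[dict, object]:
    """Walk screens in order; a screen returning BACK moves one step back.

    Returns the final context, or BACK if the user backs out of the first screen.
    """
    index = 0
    while index < len(screens):
        result = screens[index](ctx)
        if result is BACK:
            if index == 0:
                return BACK
            index -= 1
        else:
            ctx = result
            index += 1
    return ctx
```

Comparing with `is BACK` rather than `==` keeps the sentinel distinct from any legitimate answer a screen could return.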
|
|
174
|
+
|
|
175
|
+
---
|
|
176
|
+
|
|
177
|
+
## Pocket Detection — Box Size Rules
|
|
178
|
+
|
|
179
|
+
| Method | Box Definition |
|---|---|
| Co-crystal ligand | Envelope ligand atoms + 5Å padding per side |
| Residue-defined | Convex hull of residue Cα atoms + 8Å padding |
| P2Rank | Predicted pocket volume + 6Å padding |
| Blind | Whole-protein bounding box + 4Å padding |
|
|
185
|
+
|
|
186
|
+
⚠️ Warn if box volume > 30,000 ų (too large) or < 1,500 ų (too small).
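
The co-crystal rule and the volume check can be illustrated with a short sketch. It assumes an axis-aligned box and coordinates in Å; `box_from_atoms` and `volume_warning` are hypothetical names, not functions from `pocket/detect.py`.

```python
from typing import Optional, Sequence

Point = tuple[float, float, float]


def box_from_atoms(coords: Sequence[Point], padding: float = 5.0) -> tuple[Point, Point]:
    """Return (center, size) of the axis-aligned box around coords plus padding per side."""
    if not coords:
        raise ValueError("at least one atom coordinate is required")
    mins = [min(p[i] for p in coords) for i in range(3)]
    maxs = [max(p[i] for p in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    # Padding is applied on both sides of each axis, hence the factor of two.
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def volume_warning(size: Point, low: float = 1_500.0, high: float = 30_000.0) -> Optional[str]:
    """Return a warning string if the box volume is outside [low, high], else None."""
    volume = size[0] * size[1] * size[2]
    if volume > high:
        return f"box volume {volume:.0f} A^3 is too large (> {high:.0f})"
    if volume < low:
        return f"box volume {volume:.0f} A^3 is too small (< {low:.0f})"
    return None
```

For two atoms at (0, 0, 0) and (10, 4, 2) with 5 Å padding, the box is 20 × 14 × 12 Å, a volume of 3,360 Å³, which falls inside the accepted range.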
|
|
187
|
+
|
|
188
|
+
---
|
|
189
|
+
|
|
190
|
+
## Search Depth Presets
|
|
191
|
+
|
|
192
|
+
| Preset | Exhaustiveness | Poses | Description |
|---|---|---|---|
| Fast | 4 | 1 | triage only · misses ~25% best poses |
| Balanced ★ | 8 | 3 | standard VS · good for rigid pockets |
| Thorough | 16 | 5 | flexible ligands · induced-fit targets |
| Exhaustive | 32 | 9 | allosteric/cryptic pockets · pub quality |
| Expert | custom | custom | all params set manually |
|
|
199
|
+
|
|
200
|
+
In LBVS mode, the Fast preset sets `--no_refine` automatically; the flag is not exposed to the user.
|
|
201
|
+
|
|
202
|
+
---
|
|
203
|
+
|
|
204
|
+
## Terminal Color Palette
|
|
205
|
+
|
|
206
|
+
| Color | Hex | Used for |
|---|---|---|
| Cyan | `#79c0ff` | Prompts, input fields, highlights |
| Amber | `#e3b341` | Warnings, ETA, quota usage |
| Green | `#3fb950` | Success states |
| Red | `#f85149` | Errors, failures |
| Gray | `#8b949e` | Secondary text, unselected options |
| White bold | `#f0f6fc` | Primary text, headings |
| Muted | `#6e7681` | Descriptions, supplementary info |
| Dim | `#484f58` | Dividers, structural chrome |
|
|
216
|
+
|
|
217
|
+
---
|
|
218
|
+
|
|
219
|
+
## Shard & Retry Logic
|
|
220
|
+
|
|
221
|
+
- Shard size: ~2,000 compounds per Kaggle kernel
|
|
222
|
+
- `shard_retry_limit`: 3 (configurable)
|
|
223
|
+
- `auto_resume_threshold`: 10% (configurable)
|
|
224
|
+
- If `failure_pct < threshold`: prompt user (continue partial / resume / change threshold)
|
|
225
|
+
- If `failure_pct >= threshold`: auto-resume without prompt
|
|
226
|
+
- If auto-resume also fails: stop retrying and prompt the user (continue with partial results / try later from the checkpoint)
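
The sharding and threshold rules in this list can be sketched as two small functions. The names and the returned action labels are hypothetical, and treating zero failures as "done" is an assumption the list does not state.

```python
from typing import Iterator, Sequence


def make_shards(compounds: Sequence[str], shard_size: int = 2000) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most shard_size compounds."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    for start in range(0, len(compounds), shard_size):
        yield compounds[start:start + shard_size]


def next_action(failed: int, total: int, threshold_pct: float = 10.0) -> str:
    """Apply the documented rule: below the threshold prompt, otherwise auto-resume."""
    if total <= 0:
        raise ValueError("total must be positive")
    if failed == 0:
        return "done"  # assumption: nothing to resume when no shard failed
    failure_pct = 100.0 * failed / total
    return "prompt" if failure_pct < threshold_pct else "auto_resume"
```

With the default threshold, 1 failed shard out of 20 (5%) leads to a prompt, while 2 out of 20 (10%) triggers an automatic resume.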
|
|
227
|
+
|
|
228
|
+
---
|
|
229
|
+
|
|
230
|
+
## Environment Variables (`.env.example`)
|
|
231
|
+
|
|
232
|
+
```env
# ezscreen does NOT use .env files for runtime config.
# All user preferences → ~/.ezscreen/config.toml
# All credentials → ~/.ezscreen/credentials (chmod 600)
#
# For development/testing only:
EZSCREEN_DEV_MODE=false
EZSCREEN_LOG_LEVEL=INFO
```
|
|
@@ -0,0 +1,143 @@
|
|
|
1
|
+
# 02 — Coding Rules
|
|
2
|
+
|
|
3
|
+
> **These rules are the non-negotiable law of this project. Every line of code must comply.**
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## The 9 Operating Rules
|
|
8
|
+
|
|
9
|
+
### Rule 1 — The Context Vault
|
|
10
|
+
All project meta-knowledge lives in `/.ai_context/`. These 5 files are the absolute source of truth:
|
|
11
|
+
- `00_PROJECT_VIBE.md` — Core vision and non-negotiable features
|
|
12
|
+
- `01_TECH_ARCHITECTURE.md` — Tech stack, DB schemas, API routes
|
|
13
|
+
- `02_CODING_RULES.md` — This file
|
|
14
|
+
- `03_TASK_TRACKER.md` — Living checklist of features
|
|
15
|
+
- `04_DECISION_LOG.md` — Running log of architectural decisions
|
|
16
|
+
|
|
17
|
+
The AI must immediately output updated markdown for any file after a decision changes it.
|
|
18
|
+
|
|
19
|
+
### Rule 2 — Human-Centric Commit Simulation
|
|
20
|
+
- Write code in small, iterative, logically ordered pieces
|
|
21
|
+
- Code must be clean, functional, and accurate
|
|
22
|
+
- No intentional bugs, missing imports, or logical errors
|
|
23
|
+
- No unsolicited comments in the code
|
|
24
|
+
|
|
25
|
+
### Rule 3 — The State Tracker
|
|
26
|
+
Every single response from the AI must end with a State Tracker block in this exact format:
|
|
27
|
+
```
---
**STATE TRACKER**
* **Current Phase:** [Name]
* **Pending Tasks:**
- [ ] Task
* **Completed Tasks:**
- [x] Task
* **Recent Decisions:** [Brief summary]
---
```
|
|
38
|
+
|
|
39
|
+
### Rule 4 — Test-Driven Vibe Validation
|
|
40
|
+
Before writing logic for any complex module:
|
|
41
|
+
1. The user provides the vibe/feature
|
|
42
|
+
2. The AI writes edge cases and expected behavior
|
|
43
|
+
3. The AI waits for explicit user approval before writing code
|
|
44
|
+
|
|
45
|
+
### Rule 5 — Hard Checkpoints
|
|
46
|
+
After every major component is complete:
|
|
47
|
+
- Generate a summary of the current codebase state
|
|
48
|
+
- Provide a simulated Git commit message as a mental fallback point
|
|
49
|
+
|
|
50
|
+
### Rule 6 — The Scratchpad Pre-Computation Rule
|
|
51
|
+
Before writing any code or vault file updates, the AI must use `<thinking> ... </thinking>` tags to:
|
|
52
|
+
- Plan the architecture
|
|
53
|
+
- Map out file dependencies
|
|
54
|
+
- Catch logical flaws before outputting the final result
|
|
55
|
+
|
|
56
|
+
### Rule 7 — The Devil's Advocate Clause
|
|
57
|
+
Before implementing any new library, database, or major architectural change:
|
|
58
|
+
- The AI must provide a strict 2-sentence warning detailing the biggest risk, bottleneck, or technical debt the choice might create
|
|
59
|
+
|
|
60
|
+
### Rule 8 — `.env` / Secrets Guardrail
|
|
61
|
+
- Never hardcode configuration values, API keys, database URIs, or sensitive data
|
|
62
|
+
- All user preferences → `~/.ezscreen/config.toml`
|
|
63
|
+
- All credentials → `~/.ezscreen/credentials` (chmod 600 enforced)
|
|
64
|
+
- Keep `.env.example` updated in the Context Vault
|
|
65
|
+
|
|
66
|
+
### Rule 9 — The Nuclear Reset Command
|
|
67
|
+
If the user types `[SYSTEM RESET]`, the AI must:
|
|
68
|
+
- Disregard all current conversational context and memory
|
|
69
|
+
- Wipe its internal state entirely
|
|
70
|
+
- Rely solely on the `.md` files in `/.ai_context/`
|
|
71
|
+
- Await fresh instructions based only on those files
|
|
72
|
+
|
|
73
|
+
---
|
|
74
|
+
|
|
75
|
+
## Project-Specific Code Rules
|
|
76
|
+
|
|
77
|
+
### Language & Version
|
|
78
|
+
- Python 3.10 minimum (3.11+ preferred)
|
|
79
|
+
- Type hints required on all function signatures
|
|
80
|
+
- Use `tomllib` (stdlib 3.11+) or `tomli` (backport for 3.10) for TOML reads
|
|
81
|
+
- Use `tomli-w` for TOML writes
|
|
82
|
+
|
|
83
|
+
### CLI Architecture Rules
|
|
84
|
+
- All Typer entry points live in `cli.py` only
|
|
85
|
+
- Every screen function follows the BACK sentinel pattern from `state.py`
|
|
86
|
+
- Context dict is passed forward through every prompt function — never use global state
|
|
87
|
+
- Every questionary select must include `← Back` as the last choice
|
|
88
|
+
|
|
89
|
+
### Error Handling Rules
|
|
90
|
+
- All error classes defined in `errors.py` — never define inline
|
|
91
|
+
- Error messages follow the 3-line format: what failed → why → exact fix/URL
|
|
92
|
+
- Ligand prep failures: log to `failed.sdf`, never abort the run
|
|
93
|
+
- Receptor prep failures: always fatal — clean message + exit
|
|
94
|
+
- Kaggle transient errors: auto-retry with `⟳` indicator, up to `shard_retry_limit`
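
The three-line message format (what failed → why → exact fix) could be carried by a base exception, as in this sketch. Both class names are hypothetical and are not taken from `errors.py`.

```python
class EzscreenError(Exception):
    """Hypothetical base class carrying the three parts of the message format."""

    def __init__(self, what: str, why: str, fix: str) -> None:
        self.what = what
        self.why = why
        self.fix = fix
        # One line per part, in the documented order.
        super().__init__(f"{what}\n{why}\n{fix}")


class KaggleAuthError(EzscreenError):
    """Illustrative subclass; the name is an assumption."""
```

Keeping the three parts as attributes lets the CLI style each line separately while `str(error)` still produces the plain three-line text for logs.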
|
|
95
|
+
|
|
96
|
+
### Security Rules
|
|
97
|
+
- Credentials stored in `~/.ezscreen/credentials` only — chmod 600 enforced at write time
|
|
98
|
+
- Warn if `config.toml` contains any string matching an API key pattern
|
|
99
|
+
- Never log credentials — scrub from all log output
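
A rough sketch of the `config.toml` key check follows. The `nvapi-` prefix is taken from the credentials example in `01_TECH_ARCHITECTURE.md`; the generic fallback pattern (40 or more token-like characters) is a guess and would likely need tuning. `find_suspected_keys` is a hypothetical name.

```python
import re

# Assumption: NIM keys start with "nvapi-", as in the credentials example.
# The generic fallback for long token-like strings is a guess.
KEY_PATTERNS = [
    re.compile(r"nvapi-[A-Za-z0-9_\-]{8,}"),
    re.compile(r"[A-Za-z0-9_\-]{40,}"),
]


def find_suspected_keys(config_text: str) -> list[int]:
    """Return 1-based line numbers whose text matches any key-like pattern."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in KEY_PATTERNS):
            hits.append(number)
    return hits
```

Returning line numbers rather than the matched text means the warning itself never echoes a possible secret, in keeping with the rule against logging credentials.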
|
|
100
|
+
|
|
101
|
+
### Output / UX Rules
|
|
102
|
+
- No emoji in terminal output
|
|
103
|
+
- Use the exact hex color palette defined in `01_TECH_ARCHITECTURE.md`
|
|
104
|
+
- Breadcrumb trail must be rendered at the top of every screen
|
|
105
|
+
- Confirmation screen must always show full raw parameter set (for citation)
|
|
106
|
+
- Post-action routing: always offer viewer / next stage / main menu — never exit to bare prompt
|
|
107
|
+
|
|
108
|
+
### Logging Rules
|
|
109
|
+
- Every run writes to `~/.ezscreen/logs/ezs-{run_id}.log`
|
|
110
|
+
- Log: full tracebacks, all API responses, every prep decision with timestamp, exact UniDock-Pro command, shard state transitions
|
|
111
|
+
- Users never see logs unless something goes wrong
|
|
112
|
+
|
|
113
|
+
---
|
|
114
|
+
|
|
115
|
+
## Git Commit Convention
|
|
116
|
+
|
|
117
|
+
Commit messages must sound like a real developer wrote them. No conventional commit prefixes. Natural English, lowercase, short and specific.
|
|
118
|
+
|
|
119
|
+
```
# Good — human, specific
set up project layout and dependencies
add all the error classes we'll need
wire up the back button pattern
stubbed out all 8 cli commands for now
start on the kaggle dataset uploader
got receptor prep working with pdbfixer and meeko
fix auth wizard blowing up on missing kaggle.json
add admet filter — ro5 and pains for now

# Bad — robotic
feat(errors): add full error taxonomy
chore(scaffold): initialize package structure and pyproject.toml
fix(backend): handle Kaggle 403 forbidden during kernel push
```
|
|
135
|
+
|
|
136
|
+
---
|
|
137
|
+
|
|
138
|
+
## Code Style
|
|
139
|
+
- Formatter: **black** (line length 88)
|
|
140
|
+
- Linter: **ruff**
|
|
141
|
+
- Import order: stdlib → third-party → local (enforced by ruff isort)
|
|
142
|
+
- No unsolicited inline comments
|
|
143
|
+
- Docstrings: Google style, only on public-facing functions
|
|
@@ -0,0 +1,148 @@
|
|
|
1
|
+
# 03 — Task Tracker
|
|
2
|
+
|
|
3
|
+
> **The living checklist. Updated after every session.**
|
|
4
|
+
> `[ ]` = pending · `[/]` = in progress · `[x]` = done
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Phase 0 — Project Initialization ✅
|
|
9
|
+
|
|
10
|
+
- [x] Establish the 9 Operating Rules
|
|
11
|
+
- [x] Create the Context Vault (`.ai_context/` directory)
|
|
12
|
+
- [x] Ingest full handoff document (ezscreen_handoff.md)
|
|
13
|
+
- [x] Populate all 5 vault files from handoff
|
|
14
|
+
- [x] Resolve 5 open discussion items (see `04_DECISION_LOG.md`)
|
|
15
|
+
- [x] Finalize Python 3.11 floor decision
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## Phase 1 — Project Scaffold
|
|
20
|
+
|
|
21
|
+
> One unit of work per commit. Write → commit → next.
|
|
22
|
+
|
|
23
|
+
### ~~"set up project layout and dependencies"~~ ✅ `b7afda3`
|
|
24
|
+
- [x] `pyproject.toml` — Python 3.11+, all deps pinned, hatchling build, ruff + black config
|
|
25
|
+
- [x] `ezscreen/__init__.py` — exposes `__version__`
|
|
26
|
+
- [x] All subpackage `__init__.py` files
|
|
27
|
+
- [x] `tests/__init__.py`
|
|
28
|
+
|
|
29
|
+
### ~~"add all the error classes we'll need"~~ ✅ `591d532`
|
|
30
|
+
- [x] `ezscreen/errors.py` — 18 classes across 7 categories
|
|
31
|
+
|
|
32
|
+
### ~~"wire up the back button pattern"~~ ✅ `591d532`
|
|
33
|
+
- [x] `ezscreen/state.py` — BACK sentinel + `make_context()` (swept into same commit)
|
|
34
|
+
|
|
35
|
+
### ~~"stubbed out all 8 cli commands for now"~~ ✅ `9075f14`
|
|
36
|
+
- [x] `ezscreen/cli.py` — Typer app + 8 subcommand stubs (no logic)
|
|
37
|
+
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## Phase 2 — Core Foundation Modules
|
|
41
|
+
|
|
42
|
+
### ~~"add config loading and writing"~~ ✅ `c2d9891`
|
|
43
|
+
- [x] `ezscreen/config.py` — load/write `~/.ezscreen/config.toml`
|
|
44
|
+
|
|
45
|
+
### ~~"got the auth wizard working"~~ ✅ `83bca34`
|
|
46
|
+
- [x] `ezscreen/auth.py`
|
|
47
|
+
- [x] Kaggle: env var detection, validate kaggle.json, live API call
|
|
48
|
+
- [x] NIM: optional endpoint ping, skippable
|
|
49
|
+
- [x] Summary + write credentials (chmod 600)
|
|
50
|
+
- [x] Re-run mode: show existing state → ask which to update
|
|
51
|
+
|
|
52
|
+
### ~~"add run checkpointing with sqlite"~~ ✅ `0c179bf`
|
|
53
|
+
- [x] `ezscreen/checkpoint.py` — create run, update shard, increment retry, mark complete, resume by run-id
|
|
54
|
+
|
|
55
|
+
---
|
|
56
|
+
|
|
57
|
+
## Phase 3 — Prep Pipeline
|
|
58
|
+
|
|
59
|
+
### ~~"got receptor prep working with pdbfixer and meeko"~~ ✅ `82206e2`
|
|
60
|
+
- [x] `ezscreen/prep/receptor.py` — AF 4-tier, chain multi-select, RCSB fetch, pdbfixer, mk_prepare_receptor
|
|
61
|
+
|
|
62
|
+
### ~~"ligand prep pipeline with fallback scrubber"~~ ✅ `82206e2`
|
|
63
|
+
- [x] `ezscreen/prep/ligands.py` — scrubber fallback, SDF+SMILES recursive, ETKDG, Meeko, sharding, failed_prep.sdf
|
|
64
|
+
|
|
65
|
+
### ~~"add tiered pocket detection"~~ ✅ `5aa5158`
|
|
66
|
+
- [x] `ezscreen/pocket/detect.py` — co-crystal, residue Cα box, P2Rank top-3 + AF profile, blind, volume validation
|
|
67
|
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
## Phase 4 — ADMET Filtering
|
|
71
|
+
|
|
72
|
+
### ~~"add admet filter — ro5 and pains for now"~~ ✅ `b6e7bde`
|
|
73
|
+
- [x] `ezscreen/admet/filter.py` — Lipinski Ro5, PAINS, Brenk toxicophores, Veber, Egan BBB; per-filter config; library-level SDF pass-through with breakdown stats
|
|
74
|
+
|
|
75
|
+
---
|
|
76
|
+
|
|
77
|
+
## Phase 5 — Kaggle Backend
|
|
78
|
+
|
|
79
|
+
### ~~"start on the kaggle dataset uploader"~~ ✅ `804e7bf`
|
|
80
|
+
- [x] `dataset.py` — SHA-256 dedup, manifest, dataset create, delete support
|
|
81
|
+
|
|
82
|
+
### ~~"add the kernel push wrapper"~~ ✅ `804e7bf`
|
|
83
|
+
- [x] `kernel.py` — notebook push, exp backoff, kernel-metadata.json, delete
|
|
84
|
+
|
|
85
|
+
### ~~"polling loop for running kaggle jobs"~~ ✅ `804e7bf`
|
|
86
|
+
- [x] `poller.py` — 30s poll, Rich Live, error classification, retry signalling
|
|
87
|
+
|
|
88
|
+
### ~~"kaggle runner that ties dataset and kernel together"~~ ✅ `804e7bf`
|
|
89
|
+
- [x] `runner.py` — orchestrates full pipeline, retry loop, output download, clean_run()
|
|
90
|
+
|
|
91
|
+
### ~~"jinja2 notebook template for kaggle"~~ ✅ `25a2602`
|
|
92
|
+
- [x] `vina_shard.ipynb.j2` — 9 cells, UniDock-Pro/UniDock switch, scrubber fallback, error.json on every failure, done.flag
|
|
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
### ~~"start on the auth command"~~ ✅ `ed1f421`
|
|
97
|
+
- [x] `commands/auth.py` — thin wrapper around auth.run_wizard()
|
|
98
|
+
|
|
99
|
+
### ~~"add the status command"~~ ✅ `ecf1442`
|
|
100
|
+
- [x] `commands/status.py` — Rich table, colour-coded status, run count
|
|
101
|
+
|
|
102
|
+
### ~~"standalone admet command working"~~ ✅ `fefcfe6`
|
|
103
|
+
- [x] `commands/admet.py` — interactive filter toggle, breakdown output
|
|
104
|
+
|
|
105
|
+
### ~~"got the 3d viewer working"~~ ✅ `4316267`
|
|
106
|
+
- [x] `commands/view.py` — Rich table + self-contained py3Dmol HTML viewer
|
|
107
|
+
|
|
108
|
+
### ~~"validate command for diffdock nim"~~ ✅ `2e49e39`
|
|
109
|
+
- [x] `commands/validate.py` — DiffDock-L REST, key guard, timeout/auth mapping
|
|
110
|
+
|
|
111
|
+
### ~~"the big run command — full decision tree"~~ ✅ `bb58e91`
|
|
112
|
+
- [x] `commands/run.py` — 8-step state machine (receptor→chains→AF→pocket→ligands→ADMET→depth→confirm→submit)
|
|
113
|
+
|
|
114
|
+
### ~~"wire all commands into cli.py"~~ ✅ `cd8e472`
|
|
115
|
+
- [x] `cli.py` — all 8 subcommands wired with typed Typer arguments
|
|
116
|
+
|
|
117
|
+
---
|
|
118
|
+
|
|
119
|
+
## Phase 7 — Results & Reporting
|
|
120
|
+
|
|
121
|
+
### ~~"merge shard results and render the viewer"~~ ✅ `af6b008`
|
|
122
|
+
- [x] `results/merger.py` — CSV merge, best-score dedup, global sort, SDF + failed_prep concat
|
|
123
|
+
|
|
124
|
+
### ~~"prep report writer"~~ ✅ `2988862`
|
|
125
|
+
- [x] `report.py` — Section 8.3 JSON schema, `.txt` human-readable, Rich summary panel
|
|
126
|
+
|
|
127
|
+
---
|
|
128
|
+
|
|
129
|
+
## Phase 8 — Status Screen & Version Check
|
|
130
|
+
|
|
131
|
+
### ~~"ezscreen status with live refresh"~~ ✅ `388efb2`
|
|
132
|
+
- [x] `ezscreen status` — 30s auto-refresh
|
|
133
|
+
- [x] Version check async + PyPI banner ✅ `45a2986`
|
|
134
|
+
|
|
135
|
+
---
|
|
136
|
+
|
|
137
|
+
## Phase 9 — Polish & Packaging
|
|
138
|
+
|
|
139
|
+
### ~~"final cleanup and readme"~~ ✅
|
|
140
|
+
- [x] Smoke test — merger, report, version_check ✅ `9cdebb5`
|
|
141
|
+
- [x] `README.md` — install + quickstart ✅ `3c2674a`
|
|
142
|
+
- [ ] Conda-forge recipe (post pip release)
|
|
143
|
+
|
|
144
|
+
---
|
|
145
|
+
|
|
146
|
+
## Backlog / Nice-to-Have (Post v1)
|
|
147
|
+
|
|
148
|
+
- All v2 features from handoff Section 16
|