ezscreen 0.1.0__tar.gz

Files changed (45)
  1. ezscreen-0.1.0/.ai_context/00_PROJECT_VIBE.md +72 -0
  2. ezscreen-0.1.0/.ai_context/01_TECH_ARCHITECTURE.md +240 -0
  3. ezscreen-0.1.0/.ai_context/02_CODING_RULES.md +143 -0
  4. ezscreen-0.1.0/.ai_context/03_TASK_TRACKER.md +148 -0
  5. ezscreen-0.1.0/.ai_context/04_DECISION_LOG.md +196 -0
  6. ezscreen-0.1.0/.gitignore +50 -0
  7. ezscreen-0.1.0/PKG-INFO +121 -0
  8. ezscreen-0.1.0/README.md +85 -0
  9. ezscreen-0.1.0/SESSION_2026-04-14.md +55 -0
  10. ezscreen-0.1.0/ezscreen/__init__.py +1 -0
  11. ezscreen-0.1.0/ezscreen/admet/__init__.py +0 -0
  12. ezscreen-0.1.0/ezscreen/admet/filter.py +175 -0
  13. ezscreen-0.1.0/ezscreen/auth.py +238 -0
  14. ezscreen-0.1.0/ezscreen/backends/__init__.py +0 -0
  15. ezscreen-0.1.0/ezscreen/backends/kaggle/__init__.py +0 -0
  16. ezscreen-0.1.0/ezscreen/backends/kaggle/dataset.py +124 -0
  17. ezscreen-0.1.0/ezscreen/backends/kaggle/kernel.py +91 -0
  18. ezscreen-0.1.0/ezscreen/backends/kaggle/poller.py +126 -0
  19. ezscreen-0.1.0/ezscreen/backends/kaggle/runner.py +265 -0
  20. ezscreen-0.1.0/ezscreen/backends/kaggle/templates/vina_shard.ipynb.j2 +395 -0
  21. ezscreen-0.1.0/ezscreen/checkpoint.py +187 -0
  22. ezscreen-0.1.0/ezscreen/cli.py +173 -0
  23. ezscreen-0.1.0/ezscreen/commands/admet.py +63 -0
  24. ezscreen-0.1.0/ezscreen/commands/auth.py +8 -0
  25. ezscreen-0.1.0/ezscreen/commands/run.py +563 -0
  26. ezscreen-0.1.0/ezscreen/commands/status.py +97 -0
  27. ezscreen-0.1.0/ezscreen/commands/validate.py +67 -0
  28. ezscreen-0.1.0/ezscreen/commands/view.py +119 -0
  29. ezscreen-0.1.0/ezscreen/config.py +77 -0
  30. ezscreen-0.1.0/ezscreen/errors.py +133 -0
  31. ezscreen-0.1.0/ezscreen/pocket/__init__.py +0 -0
  32. ezscreen-0.1.0/ezscreen/pocket/detect.py +226 -0
  33. ezscreen-0.1.0/ezscreen/prep/__init__.py +0 -0
  34. ezscreen-0.1.0/ezscreen/prep/ligands.py +248 -0
  35. ezscreen-0.1.0/ezscreen/prep/receptor.py +305 -0
  36. ezscreen-0.1.0/ezscreen/report.py +179 -0
  37. ezscreen-0.1.0/ezscreen/results/__init__.py +0 -0
  38. ezscreen-0.1.0/ezscreen/results/merger.py +103 -0
  39. ezscreen-0.1.0/ezscreen/state.py +9 -0
  40. ezscreen-0.1.0/ezscreen/vendor/__init__.py +1 -0
  41. ezscreen-0.1.0/ezscreen/vendor/scrubber/__init__.py +38 -0
  42. ezscreen-0.1.0/ezscreen/version_check.py +77 -0
  43. ezscreen-0.1.0/pyproject.toml +76 -0
  44. ezscreen-0.1.0/tests/__init__.py +1 -0
  45. ezscreen-0.1.0/tests/test_smoke.py +176 -0
@@ -0,0 +1,72 @@
1
+ # 00 — Project Vibe
2
+
3
+ > **STATUS: INITIALIZED FROM HANDOFF DOCUMENT**
4
+
5
+ ---
6
+
7
+ ## Core Vision
8
+
9
+ **ezscreen** is a command-line tool that makes GPU-accelerated virtual screening accessible to researchers who do not have access to high-performance computing hardware. It automates the entire molecular docking pipeline — from receptor and ligand preparation through docking, hit filtering, and pose validation — using Kaggle's free T4 GPU infrastructure as the compute backend.
10
+
11
+ ---
12
+
13
+ ## The Problem We Are Solving
14
+
15
+ Computational chemists and pharmacologists at under-resourced institutions spend hours configuring docking tools, managing file formats, and debugging prep pipelines before any science gets done. ezscreen collapses this into an interactive CLI with sensible defaults and a guided decision tree.
16
+
17
+ ---
18
+
19
+ ## Target Users
20
+
21
+ Researchers and students in molecular biology, pharmacology, and computational chemistry who are:
22
+ - Comfortable with a terminal
23
+ - Not necessarily expert computational chemists
24
+ - Using Kaggle or Google Colab for GPU access
25
+ - At institutions without dedicated HPC infrastructure
26
+
27
+ ---
28
+
29
+ ## The North Star Statement
30
+
31
+ > *"A researcher with a PDB code and a ligand library should be able to run GPU-accelerated virtual screening in under 10 minutes, with no HPC access and no configuration expertise."*
32
+
33
+ ---
34
+
35
+ ## Non-Negotiable Core Features (v1 MVP)
36
+
37
+ 1. **`ezscreen run`** — Full interactive guided pipeline: receptor prep → binding site detection → ligand prep → ADMET pre-filter → docking on Kaggle T4 → results
38
+ 2. **`ezscreen auth`** — Credential wizard for Kaggle API + NVIDIA NIM (optional)
39
+ 3. **`ezscreen status`** — Live job monitor with auto-refresh, attach/resume/delete actions
40
+ 4. **`ezscreen resume [run-id]`** — Checkpoint-based resume for interrupted runs (SQLite state)
41
+ 5. **`ezscreen validate`** — Stage 2 DiffDock-L hit validation via NVIDIA NIM API
42
+ 6. **`ezscreen admet`** — Standalone ADMET filtering on any CSV or SDF
43
+ 7. **`ezscreen view`** — Rich table + py3Dmol self-contained HTML results viewer
44
+ 8. **`ezscreen clean [run-id]`** — Remove Kaggle dataset + kernel artifacts for a run
45
+
46
+ ---
47
+
48
+ ## UX Non-Negotiables
49
+
50
+ - Every screen has a `← Back` option
51
+ - Breadcrumb trail always shown at top of every screen
52
+ - Never exit to a bare terminal prompt — always offer next logical action
53
+ - Full raw parameter set always shown on confirmation screen (for citation)
54
+ - No silent defaults on consequential decisions
55
+ - ctrl+c detaches gracefully — Kaggle kernel continues running
56
+ - No emoji in terminal output — color-coded with specific hex palette (see `01_TECH_ARCHITECTURE.md`)
57
+
58
+ ---
59
+
60
+ ## Scope — v1 Only
61
+
62
+ > ⚠️ Features marked v2 in the handoff are **explicitly deferred** and must not be implemented in v1.
63
+
64
+ **Out of scope for v1:**
65
+ - Local compute backend (no Kaggle)
66
+ - Google Colab backend
67
+ - `ezscreen refine` — MD / MMGBSA post-docking
68
+ - Web dashboard (Gradio / Streamlit)
69
+ - Multi-target screening (one library vs N receptors)
70
+ - Deep ADMET (ADMET-AI, pkCSM)
71
+ - Collaboration / team hit lists
72
+ - Consensus docking (UniDock-Pro + Gnina)
@@ -0,0 +1,240 @@
1
+ # 01 — Tech Architecture
2
+
3
+ > **STATUS: ALL DECISIONS FINAL — from handoff document**
4
+
5
+ ---
6
+
7
+ ## Tech Stack
8
+
9
+ | Layer | Technology | Rationale |
10
+ |---|---|---|
11
+ | CLI framework | **Typer** | Cleaner nested subcommands than Click; type-hint driven; less boilerplate |
12
+ | Terminal output | **Rich** | Progress bars, styled tables, panels, syntax highlighting |
13
+ | Interactive prompts | **Questionary** | Styled select menus, checkboxes, text inputs via prompt-toolkit |
14
+ | Config format | **TOML** | No YAML Norway-problem bugs; `tomllib` in stdlib (3.11+); `tomli-w` for writing |
15
+ | Python floor | **3.10 minimum** | 3.11+ preferred; compatibility matrix must be verified before finalising |
16
+ | Package distribution | **pip first** | `pyproject.toml`; conda-forge recipe filed after stable v1 release |
17
+ | Checkpointing | **SQLite** | Run state, shard status, resume by run-id |
18
+ | Notebook templating | **Jinja2** | Renders complete `.ipynb` from template variables |
19
+
20
+ ---
21
+
22
+ ## Docking Engines
23
+
24
+ | Stage | Engine | Notes |
25
+ |---|---|---|
26
+ | Stage 1 primary | **UniDock-Pro** | SBVS + LBVS + Hybrid; Apache 2.0; built from source on Kaggle (CUDA ≥ 11.8 + Boost) |
27
+ | Stage 1 fallback | **UniDock** | `pip install unidock-tools`; used when no reference ligand is available |
28
+ | Stage 2 validation | **DiffDock-L** | Via NVIDIA NIM API; free; called locally (not on Kaggle) |
29
+ | v2 (deferred) | Local backend, Colab backend, MD/MMGBSA | Do not implement |
30
+
31
+ > ⚠️ UniDock-Pro build from source happens in notebook template Cell 2 on Kaggle.
32
+
33
+ ---
34
+
35
+ ## Prep Pipeline
36
+
37
+ | Step | Tool | Notes |
38
+ |---|---|---|
39
+ | Receptor prep | pdbfixer → Meeko `mk_prepare_receptor` | Missing residues, waters, alternates, hydrogens |
40
+ | Ligand prep | scrub.py → RDKit ETKDG → Meeko | Protonation, 3D conformers, PDBQT output |
41
+ | Pocket detection | Tiered (see below) | co-crystal → residues → P2Rank → blind |
42
+ | ADMET filtering | RDKit | Ro5, PAINS, toxicophores, Veber, Egan BBB |
43
+ | Molscrub install | `git+https://github.com/forlilab/scrubber` | Not on PyPI — pinned in pyproject.toml |
44
+
45
+ ---
46
+
47
+ ## Package Structure
48
+
49
+ ```
50
+ ezscreen/
51
+ ├── cli.py # Typer entry points — all subcommands
52
+ ├── state.py # State machine — BACK sentinel, context dict
53
+ ├── config.py # TOML loader/writer — ~/.ezscreen/config.toml
54
+ ├── auth.py # Credential management — ~/.ezscreen/credentials
55
+ ├── backends/
56
+ │ └── kaggle/
57
+ │ ├── runner.py # Orchestrates dataset push + kernel + poll + download
58
+ │ ├── dataset.py # Kaggle datasets API wrapper
59
+ │ ├── kernel.py # Kaggle kernels API wrapper
60
+ │ ├── poller.py # Status polling — Rich progress, retry logic
61
+ │ └── templates/
62
+ │ └── vina_shard.ipynb.j2 # Jinja2 notebook template
63
+ ├── prep/
64
+ │ ├── receptor.py # pdbfixer + Meeko receptor prep
65
+ │ └── ligands.py # scrub.py + RDKit + Meeko ligand prep
66
+ ├── pocket/
67
+ │ └── detect.py # Tiered pocket detection
68
+ ├── admet/
69
+ │ └── filter.py # RDKit ADMET filters
70
+ ├── results/
71
+ │ ├── merger.py # Merge shard CSVs, global re-rank
72
+ │ └── viewer.py # Rich table + py3Dmol HTML generator
73
+ ├── checkpoint.py # SQLite run state — create, update, resume
74
+ ├── report.py # Prep report — .txt + .json writer
75
+ ├── errors.py # Error taxonomy — all error classes
76
+ └── vendor/
77
+ └── scrubber/ # Vendored molscrub (if vendoring decided)
78
+ ```
79
+
80
+ ---
81
+
82
+ ## Config Files
83
+
84
+ ### `~/.ezscreen/config.toml` — preferences only, never secrets
85
+
86
+ ```toml
87
+ [run]
88
+ auto_resume_threshold = 10 # auto-resume if failure % >= this
89
+ shard_retry_limit = 3 # retries before prompting user
90
+ default_search_depth = "Balanced"
91
+ default_ph = 7.4
92
+ admet_pre_filter = true
93
+
94
+ [kaggle]
95
+ default_dataset = "username/ezscreen-workspace"
96
+
97
+ [defaults]
98
+ box_padding = 5.0
99
+ enumerate_tautomers = false
100
+ ```
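One way to read such a file is to overlay the user's TOML on a table of built-in defaults. The sketch below is illustrative only: `DEFAULTS` and `merge_config` are hypothetical names, not taken from `ezscreen/config.py`, and the default values simply mirror the example above.

```python
import copy

try:
    import tomllib  # stdlib on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport, needed only on Python 3.10

# Hypothetical defaults table; values copied from the example config above.
DEFAULTS = {
    "run": {
        "auto_resume_threshold": 10,
        "shard_retry_limit": 3,
        "default_search_depth": "Balanced",
        "default_ph": 7.4,
        "admet_pre_filter": True,
    },
    "kaggle": {"default_dataset": "username/ezscreen-workspace"},
    "defaults": {"box_padding": 5.0, "enumerate_tautomers": False},
}


def merge_config(user_toml: str) -> dict:
    """Return DEFAULTS overlaid with whatever the user's TOML text sets."""
    merged = copy.deepcopy(DEFAULTS)
    for section, values in tomllib.loads(user_toml).items():
        if isinstance(values, dict):
            merged.setdefault(section, {}).update(values)
        else:
            merged[section] = values  # top-level scalar key
    return merged
```

For example, `merge_config('[run]\nshard_retry_limit = 5\n')` keeps every default except the retry limit. The deep copy means the defaults table itself is never mutated.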
101
+
102
+ ### `~/.ezscreen/credentials` — chmod 600 enforced — NEVER in config.toml
103
+
104
+ ```
105
+ kaggle_json_path = "~/.kaggle/kaggle.json"
106
+ nim_api_key = "nvapi-xxxxxxxxxxxx"
107
+ ```
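The "chmod 600 enforced" requirement can be met by creating the file with owner-only permissions from the start, rather than writing it and tightening afterwards. This is a minimal sketch for POSIX systems; `write_credentials` and `is_private` are hypothetical helpers, not the functions in `ezscreen/auth.py`.

```python
import os
import stat
from pathlib import Path


def write_credentials(path: Path, entries: dict[str, str]) -> None:
    """Write key = "value" lines to a file readable only by its owner."""
    path.parent.mkdir(parents=True, exist_ok=True)
    body = "".join(f'{key} = "{value}"\n' for key, value in entries.items())
    # Create with mode 0o600 so the secret is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(body)
    # The mode argument is ignored when the file already existed, so re-apply it.
    os.chmod(path, 0o600)


def is_private(path: Path) -> bool:
    """True when the permission bits are exactly rw------- (0o600)."""
    return stat.S_IMODE(path.stat().st_mode) == 0o600
```

A check like `is_private` could run on every read, so a file whose permissions were loosened by hand is caught before the key is used.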
108
+
109
+ ---
110
+
111
+ ## Kaggle Dataset Strategy
112
+
113
+ - Single persistent dataset: `ezscreen-workspace` per user
114
+ - Local manifest at `~/.ezscreen/manifest.json` — SHA-256 hash dedup
115
+ - Receptor: uploaded once, reused across runs if hash matches
116
+ - Ligand shards: always new per run
117
+ - Dataset structure: `receptor_{hash8}.pdb`, `shard_{run_id}_{n}.sdf`
118
+ - `ezscreen clean [run-id]` → deletes entire dataset + kernel artifacts
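The hash-based dedup described above could look roughly like this. It is a sketch under assumptions: the manifest is treated as a flat JSON object mapping file name to SHA-256 digest, and `plan_receptor_upload` is a hypothetical name. The real manifest layout in `dataset.py` may differ.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large receptors are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_receptor_upload(receptor: Path, manifest_path: Path) -> tuple[str, bool]:
    """Return (dataset file name, needs_upload) for a receptor file.

    Assumed manifest format: {"receptor_<hash8>.pdb": "<full sha256>"}.
    """
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    digest = sha256_of(receptor)
    name = f"receptor_{digest[:8]}.pdb"
    needs_upload = manifest.get(name) != digest
    if needs_upload:
        # Simplification: a real uploader would record this only after the
        # push to Kaggle has succeeded.
        manifest[name] = digest
        manifest_path.write_text(json.dumps(manifest, indent=2))
    return name, needs_upload
```

Comparing the full digest, not just the 8-character prefix in the file name, guards against the unlikely case of two receptors sharing a prefix.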
119
+
120
+ ---
121
+
122
+ ## Notebook Template Structure (Jinja2 → .ipynb)
123
+
124
+ | Cell | Purpose |
125
+ |---|---|
126
+ | Cell 1 | Auto-generated header — do not edit |
127
+ | Cell 2 | Install deps — UniDock-Pro build or UniDock pip, Meeko, RDKit |
128
+ | Cell 3 | Environment check — GPU? CUDA version? Disk space? |
129
+ | Cell 4 | Load inputs from `/kaggle/input/ezscreen-{run_id}/` |
130
+ | Cell 5 | Receptor prep — pdbfixer → Meeko |
131
+ | Cell 6 | Ligand prep — scrub.py → Meeko, batch processing |
132
+ | Cell 7 | Docking loop — UniDock-Pro, writes checkpoint per shard |
133
+ | Cell 8 | Collect results — write `scores.csv` + `poses.sdf` to `/kaggle/working/` |
134
+ | Cell 9 | Write `done.flag` — signals clean completion to poller |
135
+
136
+ > Every cell writes a structured `error.json` before raising. Poller distinguishes: timeout / prep failure / GPU OOM / clean completion.
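To make the error-signalling contract concrete, here is a small sketch of both sides: a notebook cell writing `error.json`, and a poller classifying the downloaded output directory. The JSON field names (`kind`, `cell`, `message`) and the function names are assumptions for illustration. The source only states that the file is structured and that the four outcomes are distinguishable.

```python
import json
from pathlib import Path

# Failure kinds named in the note above; "timeout" is inferred by the poller.
KNOWN_KINDS = {"prep_failure", "gpu_oom"}


def write_error(workdir: Path, kind: str, cell: int, message: str) -> None:
    """Called inside a notebook cell just before it re-raises."""
    payload = {"kind": kind, "cell": cell, "message": message}
    (workdir / "error.json").write_text(json.dumps(payload))


def classify_outcome(workdir: Path, timed_out: bool) -> str:
    """Map the files left in the working directory to a poller outcome."""
    if (workdir / "done.flag").exists():
        return "clean_completion"
    error_file = workdir / "error.json"
    if error_file.exists():
        kind = json.loads(error_file.read_text()).get("kind")
        return kind if kind in KNOWN_KINDS else "unknown_error"
    # No flag and no error file: either still running or killed by the limit.
    return "timeout" if timed_out else "running"
```

Checking `done.flag` first means a run counts as failed only when the final cell never executed.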
137
+
138
+ ### Jinja2 Template Variables
139
+
140
+ ```python
141
+ {
142
+ "run_id": "ezs-4f2a8c",
143
+ "engine": "unidock-pro", # or "unidock"
144
+ "mode": "hybrid", # sbvs | lbvs | hybrid
145
+ "box_center": [x, y, z],
146
+ "box_size": [sx, sy, sz],
147
+ "exhaustiveness": 8,
148
+ "num_modes": 3,
149
+ "refine_step": 5,
150
+ "max_step": 0,
151
+ "ph": 7.4,
152
+ "shard_index": 2,
153
+ "total_shards": 10,
154
+ "enumerate_tautomers": False,
155
+ "receptor_dataset_path": "/kaggle/input/ezscreen-abc123/receptor.pdb",
156
+ "ligand_dataset_path": "/kaggle/input/ezscreen-abc123/shard_02.sdf"
157
+ }
158
+ ```
159
+
160
+ ---
161
+
162
+ ## State Machine (Back Button)
163
+
164
+ ```python
165
+ BACK = object() # sentinel
166
+
167
+ def screen_binding_site(ctx):
168
+ choice = questionary.select(..., choices=[..., "← Back"]).ask()
169
+ if choice == "← Back":
170
+ return BACK
171
+ ctx["binding_site"] = choice
172
+ return ctx
173
+ ```
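The screen function above only returns the sentinel; something has to act on it. A possible driver loop is sketched below. `run_screens` is a hypothetical name, and restoring a context snapshot on Back is one design choice among several, not necessarily what `commands/run.py` does.

```python
from typing import Callable, Optional

BACK = object()  # same sentinel idea as above

Screen = Callable[[dict], object]


def run_screens(screens: list[Screen], ctx: dict) -> Optional[dict]:
    """Run screens in order; a BACK result re-runs the previous screen.

    Returns the final context, or None if the user backs out of screen 0.
    """
    history: list[dict] = []  # context as it was before each completed screen
    index = 0
    while index < len(screens):
        # Pass a copy so a screen that returns BACK cannot leave
        # half-written keys behind.
        result = screens[index](dict(ctx))
        if result is BACK:
            if index == 0:
                return None
            index -= 1
            ctx = history.pop()
            continue
        history.append(ctx)
        ctx = result
        index += 1
    return ctx
```

Because the sentinel is compared with `is`, no legitimate answer string can be mistaken for it.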
174
+
175
+ ---
176
+
177
+ ## Pocket Detection — Box Size Rules
178
+
179
+ | Method | Box Definition |
180
+ |---|---|
181
+ | Co-crystal ligand | Envelope ligand atoms + 5Å padding per side |
182
+ | Residue-defined | Convex hull of residue Cα atoms + 8Å padding |
183
+ | P2Rank | Predicted pocket volume + 6Å padding |
184
+ | Blind | Whole-protein bounding box + 4Å padding |
185
+
186
+ ⚠️ Warn if box volume > 30,000 ų (too large) or < 1,500 ų (too small).
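As a worked illustration of the padding and volume rules, the sketch below builds an axis-aligned box around a set of atom coordinates and applies the two volume limits. It is a simplification: every method is treated as a plain bounding box (the residue-defined rule above speaks of a convex hull), and `box_from_coords` and `volume_warning` are hypothetical names.

```python
from typing import Optional, Sequence

MAX_VOLUME = 30_000.0  # cubic angstroms, from the warning rule above
MIN_VOLUME = 1_500.0


def box_from_coords(
    coords: Sequence[tuple[float, float, float]], padding: float
) -> tuple[list[float], list[float]]:
    """Return (center, size) of the bounding box grown by `padding` per side."""
    axes = list(zip(*coords))  # [(x1, x2, ...), (y1, ...), (z1, ...)]
    lows = [min(axis) for axis in axes]
    highs = [max(axis) for axis in axes]
    center = [(lo + hi) / 2 for lo, hi in zip(lows, highs)]
    size = [(hi - lo) + 2 * padding for lo, hi in zip(lows, highs)]
    return center, size


def volume_warning(size: Sequence[float]) -> Optional[str]:
    """Return a warning string when the box volume is out of range."""
    volume = size[0] * size[1] * size[2]
    if volume > MAX_VOLUME:
        return f"box volume {volume:.0f} A^3 is too large"
    if volume < MIN_VOLUME:
        return f"box volume {volume:.0f} A^3 is too small"
    return None
```

For a ligand spanning 10 × 4 × 2 Å with the 5 Å co-crystal padding, the box is 20 × 14 × 12 Å, or 3,360 Å³, which is inside the accepted range.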
187
+
188
+ ---
189
+
190
+ ## Search Depth Presets
191
+
192
+ | Preset | Exhaustiveness | Poses | Description |
193
+ |---|---|---|---|
194
+ | Fast | 4 | 1 | triage only · misses ~25% best poses |
195
+ | Balanced ★ | 8 | 3 | standard VS · good for rigid pockets |
196
+ | Thorough | 16 | 5 | flexible ligands · induced-fit targets |
197
+ | Exhaustive | 32 | 9 | allosteric/cryptic pockets · pub quality |
198
+ | Expert | custom | custom | all params set manually |
199
+
200
+ LBVS mode: Fast preset quietly sets `--no_refine` — not exposed to user.
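The preset table maps naturally onto a lookup that fills the `exhaustiveness` and `num_modes` template variables. The sketch below assumes the "Poses" column corresponds to `num_modes`, which the source does not state explicitly. `resolve_depth` is a hypothetical helper.

```python
from typing import Optional

# (exhaustiveness, poses) per preset, copied from the table above.
PRESETS: dict[str, tuple[int, int]] = {
    "Fast": (4, 1),
    "Balanced": (8, 3),
    "Thorough": (16, 5),
    "Exhaustive": (32, 9),
}


def resolve_depth(
    preset: str,
    exhaustiveness: Optional[int] = None,
    num_modes: Optional[int] = None,
) -> dict[str, int]:
    """Translate a preset name into docking parameters.

    "Expert" has no table entry, so both values must be supplied by the caller.
    """
    if preset == "Expert":
        if exhaustiveness is None or num_modes is None:
            raise ValueError("Expert preset requires exhaustiveness and num_modes")
        return {"exhaustiveness": exhaustiveness, "num_modes": num_modes}
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    depth, poses = PRESETS[preset]
    return {"exhaustiveness": depth, "num_modes": poses}
```

Raising on a missing Expert value, rather than silently falling back to Balanced, matches the "no silent defaults" rule in the UX non-negotiables.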
201
+
202
+ ---
203
+
204
+ ## Terminal Color Palette
205
+
206
+ | Color | Hex | Used for |
207
+ |---|---|---|
208
+ | Cyan | `#79c0ff` | Prompts, input fields, highlights |
209
+ | Amber | `#e3b341` | Warnings, ETA, quota usage |
210
+ | Green | `#3fb950` | Success states |
211
+ | Red | `#f85149` | Errors, failures |
212
+ | Gray | `#8b949e` | Secondary text, unselected options |
213
+ | White bold | `#f0f6fc` | Primary text, headings |
214
+ | Muted | `#6e7681` | Descriptions, supplementary info |
215
+ | Dim | `#484f58` | Dividers, structural chrome |
216
+
217
+ ---
218
+
219
+ ## Shard & Retry Logic
220
+
221
+ - Shard size: ~2,000 compounds per Kaggle kernel
222
+ - `shard_retry_limit`: 3 (configurable)
223
+ - `auto_resume_threshold`: 10% (configurable)
224
+ - If `failure_pct < threshold`: prompt user (continue partial / resume / change threshold)
225
+ - If `failure_pct >= threshold`: auto-resume without prompt
226
+ - If auto-resume also fails: stop auto-resuming and prompt the user (continue partial / try later with checkpoint)
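The sharding and threshold rules above can be expressed as two small functions. This is a sketch of the decision logic as written, with hypothetical names. The "complete" outcome for zero failures is an assumption, since the rules only cover the failure cases.

```python
def split_into_shards(n_compounds: int, shard_size: int = 2000) -> list[range]:
    """Split compound indices into consecutive shards of at most shard_size."""
    return [
        range(start, min(start + shard_size, n_compounds))
        for start in range(0, n_compounds, shard_size)
    ]


def next_action(
    failed_shards: int,
    total_shards: int,
    threshold_pct: float = 10.0,
    auto_resume_failed: bool = False,
) -> str:
    """Decide what to do after a run, following the rules listed above."""
    if failed_shards == 0:
        return "complete"  # assumption: nothing to resume
    if auto_resume_failed:
        return "prompt"  # auto-resume was already tried once
    failure_pct = 100.0 * failed_shards / total_shards
    # Below the threshold the user is asked; at or above it, resume automatically.
    return "auto_resume" if failure_pct >= threshold_pct else "prompt"
```

With the default shard size, a 4,500-compound library yields three shards of 2,000, 2,000 and 500 compounds.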
227
+
228
+ ---
229
+
230
+ ## Environment Variables (`.env.example`)
231
+
232
+ ```env
233
+ # ezscreen does NOT use .env files for runtime config.
234
+ # All user preferences → ~/.ezscreen/config.toml
235
+ # All credentials → ~/.ezscreen/credentials (chmod 600)
236
+ #
237
+ # For development/testing only:
238
+ EZSCREEN_DEV_MODE=false
239
+ EZSCREEN_LOG_LEVEL=INFO
240
+ ```
@@ -0,0 +1,143 @@
1
+ # 02 — Coding Rules
2
+
3
+ > **These rules are the non-negotiable law of this project. Every line of code must comply.**
4
+
5
+ ---
6
+
7
+ ## The 9 Operating Rules
8
+
9
+ ### Rule 1 — The Context Vault
10
+ All project meta-knowledge lives in `/.ai_context/`. These 5 files are the absolute source of truth:
11
+ - `00_PROJECT_VIBE.md` — Core vision and non-negotiable features
12
+ - `01_TECH_ARCHITECTURE.md` — Tech stack, DB schemas, API routes
13
+ - `02_CODING_RULES.md` — This file
14
+ - `03_TASK_TRACKER.md` — Living checklist of features
15
+ - `04_DECISION_LOG.md` — Running log of architectural decisions
16
+
17
+ The AI must immediately output updated markdown for any file after a decision changes it.
18
+
19
+ ### Rule 2 — Human-Centric Commit Simulation
20
+ - Write code in small, iterative, logically ordered pieces
21
+ - Code must be clean, functional, and accurate
22
+ - No intentional bugs, missing imports, or logical errors
23
+ - No unsolicited comments in the code
24
+
25
+ ### Rule 3 — The State Tracker
26
+ Every single response from the AI must end with a State Tracker block in this exact format:
27
+ ```
28
+ ---
29
+ **STATE TRACKER**
30
+ * **Current Phase:** [Name]
31
+ * **Pending Tasks:**
32
+ - [ ] Task
33
+ * **Completed Tasks:**
34
+ - [x] Task
35
+ * **Recent Decisions:** [Brief summary]
36
+ ---
37
+ ```
38
+
39
+ ### Rule 4 — Test-Driven Vibe Validation
40
+ Before writing logic for any complex module:
41
+ 1. The user provides the vibe/feature
42
+ 2. The AI writes edge cases and expected behavior
43
+ 3. The AI waits for explicit user approval before writing code
44
+
45
+ ### Rule 5 — Hard Checkpoints
46
+ After every major component is complete:
47
+ - Generate a summary of the current codebase state
48
+ - Provide a simulated Git commit message as a mental fallback point
49
+
50
+ ### Rule 6 — The Scratchpad Pre-Computation Rule
51
+ Before writing any code or vault file updates, the AI must use `<thinking> ... </thinking>` tags to:
52
+ - Plan the architecture
53
+ - Map out file dependencies
54
+ - Catch logical flaws before outputting the final result
55
+
56
+ ### Rule 7 — The Devil's Advocate Clause
57
+ Before implementing any new library, database, or major architectural change:
58
+ - The AI must provide a strict 2-sentence warning detailing the biggest risk, bottleneck, or technical debt the choice might create
59
+
60
+ ### Rule 8 — `.env` / Secrets Guardrail
61
+ - Never hardcode configuration values, API keys, database URIs, or sensitive data
62
+ - All user preferences → `~/.ezscreen/config.toml`
63
+ - All credentials → `~/.ezscreen/credentials` (chmod 600 enforced)
64
+ - Keep `.env.example` updated in the Context Vault
65
+
66
+ ### Rule 9 — The Nuclear Reset Command
67
+ If the user types `[SYSTEM RESET]`, the AI must:
68
+ - Disregard all current conversational context and memory
69
+ - Wipe its internal state entirely
70
+ - Rely solely on the `.md` files in `/.ai_context/`
71
+ - Await fresh instructions based only on those files
72
+
73
+ ---
74
+
75
+ ## Project-Specific Code Rules
76
+
77
+ ### Language & Version
78
+ - Python 3.10 minimum (3.11+ preferred)
79
+ - Type hints required on all function signatures
80
+ - Use `tomllib` (stdlib 3.11+) or `tomli` (backport for 3.10) for TOML reads
81
+ - Use `tomli-w` for TOML writes
82
+
83
+ ### CLI Architecture Rules
84
+ - All Typer entry points live in `cli.py` only
85
+ - Every screen function follows the BACK sentinel pattern from `state.py`
86
+ - Context dict is passed forward through every prompt function — never use global state
87
+ - Every questionary select must include `← Back` as the last choice
88
+
89
+ ### Error Handling Rules
90
+ - All error classes defined in `errors.py` — never define inline
91
+ - Error messages follow the 3-line format: what failed → why → exact fix/URL
92
+ - Ligand prep failures: log to `failed.sdf`, never abort the run
93
+ - Receptor prep failures: always fatal — clean message + exit
94
+ - Kaggle transient errors: auto-retry with `⟳` indicator, up to `shard_retry_limit`
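A base class can enforce the three-line message format so that individual errors cannot drift from it. The class names below are hypothetical and are not the 18 classes in `ezscreen/errors.py`. The `fatal` flag is one possible way to encode the ligand versus receptor distinction.

```python
class EzscreenError(Exception):
    """Base error carrying the three message lines: what, why, fix."""

    fatal = False  # subclasses override when the run must stop

    def __init__(self, what: str, why: str, fix: str) -> None:
        super().__init__(what)
        self.what = what
        self.why = why
        self.fix = fix

    def render(self) -> str:
        """Return the message in the what / why / fix order."""
        return f"{self.what}\n{self.why}\n{self.fix}"


class ReceptorPrepError(EzscreenError):
    """Receptor prep failures always end the run."""

    fatal = True


class LigandPrepError(EzscreenError):
    """Ligand prep failures are logged and the run continues."""

    fatal = False
```

A caller might then print `err.render()` and exit only when `err.fatal` is true, for example after `ReceptorPrepError("Receptor prep failed", "the input file has no protein chains", "check the PDB code and try again")`.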
95
+
96
+ ### Security Rules
97
+ - Credentials stored in `~/.ezscreen/credentials` only — chmod 600 enforced at write time
98
+ - Warn if `config.toml` contains any string matching an API key pattern
99
+ - Never log credentials — scrub from all log output
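Both the config warning and the log scrubbing can share one pattern. The regular expression below is an assumption based only on the `nvapi-` prefix shown in the credentials example. Real key formats may differ, and `find_key_like` and `scrub` are hypothetical helpers.

```python
import re

# Assumed shape of a NIM key: the "nvapi-" prefix plus at least 8 key characters.
KEY_PATTERN = re.compile(r"nvapi-[A-Za-z0-9_\-]{8,}")


def find_key_like(text: str) -> list[str]:
    """Return every substring that looks like an API key."""
    return KEY_PATTERN.findall(text)


def scrub(text: str) -> str:
    """Replace anything key-shaped before the text reaches a log file."""
    return KEY_PATTERN.sub("nvapi-***", text)
```

Running `find_key_like` over the raw text of `config.toml` gives the matches to warn about, and passing every log message through `scrub` keeps keys out of the run logs.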
100
+
101
+ ### Output / UX Rules
102
+ - No emoji in terminal output
103
+ - Use the exact hex color palette defined in `01_TECH_ARCHITECTURE.md`
104
+ - Breadcrumb trail must be rendered at the top of every screen
105
+ - Confirmation screen must always show full raw parameter set (for citation)
106
+ - Post-action routing: always offer viewer / next stage / main menu — never exit to bare prompt
107
+
108
+ ### Logging Rules
109
+ - Every run writes to `~/.ezscreen/logs/ezs-{run_id}.log`
110
+ - Log: full tracebacks, all API responses, every prep decision with timestamp, exact UniDock-Pro command, shard state transitions
111
+ - Users never see logs unless something goes wrong
112
+
113
+ ---
114
+
115
+ ## Git Commit Convention
116
+
117
+ Commit messages must sound like a real developer wrote them. No conventional commit prefixes. Natural English, lowercase, short and specific.
118
+
119
+ ```
120
+ # Good — human, specific
121
+ set up project layout and dependencies
122
+ add all the error classes we'll need
123
+ wire up the back button pattern
124
+ stubbed out all 8 cli commands for now
125
+ start on the kaggle dataset uploader
126
+ got receptor prep working with pdbfixer and meeko
127
+ fix auth wizard blowing up on missing kaggle.json
128
+ add admet filter — ro5 and pains for now
129
+
130
+ # Bad — robotic
131
+ feat(errors): add full error taxonomy
132
+ chore(scaffold): initialize package structure and pyproject.toml
133
+ fix(backend): handle Kaggle 403 forbidden during kernel push
134
+ ```
135
+
136
+ ---
137
+
138
+ ## Code Style
139
+ - Formatter: **black** (line length 88)
140
+ - Linter: **ruff**
141
+ - Import order: stdlib → third-party → local (enforced by ruff isort)
142
+ - No unsolicited inline comments
143
+ - Docstrings: Google style, only on public-facing functions
@@ -0,0 +1,148 @@
1
+ # 03 — Task Tracker
2
+
3
+ > **The living checklist. Updated after every session.**
4
+ > `[ ]` = pending · `[/]` = in progress · `[x]` = done
5
+
6
+ ---
7
+
8
+ ## Phase 0 — Project Initialization ✅
9
+
10
+ - [x] Establish the 9 Operating Rules
11
+ - [x] Create the Context Vault (`.ai_context/` directory)
12
+ - [x] Ingest full handoff document (ezscreen_handoff.md)
13
+ - [x] Populate all 5 vault files from handoff
14
+ - [x] Resolve 5 open discussion items (see `04_DECISION_LOG.md`)
15
+ - [x] Finalize Python 3.11 floor decision
16
+
17
+ ---
18
+
19
+ ## Phase 1 — Project Scaffold
20
+
21
+ > One unit of work per commit. Write → commit → next.
22
+
23
+ ### ~~"set up project layout and dependencies"~~ ✅ `b7afda3`
24
+ - [x] `pyproject.toml` — Python 3.11+, all deps pinned, hatchling build, ruff + black config
25
+ - [x] `ezscreen/__init__.py` — exposes `__version__`
26
+ - [x] All subpackage `__init__.py` files
27
+ - [x] `tests/__init__.py`
28
+
29
+ ### ~~"add all the error classes we'll need"~~ ✅ `591d532`
30
+ - [x] `ezscreen/errors.py` — 18 classes across 7 categories
31
+
32
+ ### ~~"wire up the back button pattern"~~ ✅ `591d532`
33
+ - [x] `ezscreen/state.py` — BACK sentinel + `make_context()` (swept into same commit)
34
+
35
+ ### ~~"stubbed out all 8 cli commands for now"~~ ✅ `9075f14`
36
+ - [x] `ezscreen/cli.py` — Typer app + 8 subcommand stubs (no logic)
37
+
38
+ ---
39
+
40
+ ## Phase 2 — Core Foundation Modules
41
+
42
+ ### ~~"add config loading and writing"~~ ✅ `c2d9891`
43
+ - [x] `ezscreen/config.py` — load/write `~/.ezscreen/config.toml`
44
+
45
+ ### ~~"got the auth wizard working"~~ ✅ `83bca34`
46
+ - [x] `ezscreen/auth.py`
47
+ - [x] Kaggle: env var detection, validate kaggle.json, live API call
48
+ - [x] NIM: optional endpoint ping, skippable
49
+ - [x] Summary + write credentials (chmod 600)
50
+ - [x] Re-run mode: show existing state → ask which to update
51
+
52
+ ### ~~"add run checkpointing with sqlite"~~ ✅ `0c179bf`
53
+ - [x] `ezscreen/checkpoint.py` — create run, update shard, increment retry, mark complete, resume by run-id
54
+
55
+ ---
56
+
57
+ ## Phase 3 — Prep Pipeline
58
+
59
+ ### ~~"got receptor prep working with pdbfixer and meeko"~~ ✅ `82206e2`
60
+ - [x] `ezscreen/prep/receptor.py` — AF 4-tier, chain multi-select, RCSB fetch, pdbfixer, mk_prepare_receptor
61
+
62
+ ### ~~"ligand prep pipeline with fallback scrubber"~~ ✅ `82206e2`
63
+ - [x] `ezscreen/prep/ligands.py` — scrubber fallback, SDF+SMILES recursive, ETKDG, Meeko, sharding, failed_prep.sdf
64
+
65
+ ### ~~"add tiered pocket detection"~~ ✅ `5aa5158`
66
+ - [x] `ezscreen/pocket/detect.py` — co-crystal, residue Cα box, P2Rank top-3 + AF profile, blind, volume validation
67
+
68
+ ---
69
+
70
+ ## Phase 4 — ADMET Filtering
71
+
72
+ ### ~~"add admet filter — ro5 and pains for now"~~ ✅ `b6e7bde`
73
+ - [x] `ezscreen/admet/filter.py` — Lipinski Ro5, PAINS, Brenk toxicophores, Veber, Egan BBB; per-filter config; library-level SDF pass-through with breakdown stats
74
+
75
+ ---
76
+
77
+ ## Phase 5 — Kaggle Backend
78
+
79
+ ### ~~"start on the kaggle dataset uploader"~~ ✅ `804e7bf`
80
+ - [x] `dataset.py` — SHA-256 dedup, manifest, dataset create, delete support
81
+
82
+ ### ~~"add the kernel push wrapper"~~ ✅ `804e7bf`
83
+ - [x] `kernel.py` — notebook push, exp backoff, kernel-metadata.json, delete
84
+
85
+ ### ~~"polling loop for running kaggle jobs"~~ ✅ `804e7bf`
86
+ - [x] `poller.py` — 30s poll, Rich Live, error classification, retry signalling
87
+
88
+ ### ~~"kaggle runner that ties dataset and kernel together"~~ ✅ `804e7bf`
89
+ - [x] `runner.py` — orchestrates full pipeline, retry loop, output download, clean_run()
90
+
91
+ ### ~~"jinja2 notebook template for kaggle"~~ ✅ `25a2602`
92
+ - [x] `vina_shard.ipynb.j2` — 9 cells, UniDock-Pro/UniDock switch, scrubber fallback, error.json on every failure, done.flag
93
+
94
+ ---
95
+
96
+ ### ~~"start on the auth command"~~ ✅ `ed1f421`
97
+ - [x] `commands/auth.py` — thin wrapper around auth.run_wizard()
98
+
99
+ ### ~~"add the status command"~~ ✅ `ecf1442`
100
+ - [x] `commands/status.py` — Rich table, colour-coded status, run count
101
+
102
+ ### ~~"standalone admet command working"~~ ✅ `fefcfe6`
103
+ - [x] `commands/admet.py` — interactive filter toggle, breakdown output
104
+
105
+ ### ~~"got the 3d viewer working"~~ ✅ `4316267`
106
+ - [x] `commands/view.py` — Rich table + self-contained py3Dmol HTML viewer
107
+
108
+ ### ~~"validate command for diffdock nim"~~ ✅ `2e49e39`
109
+ - [x] `commands/validate.py` — DiffDock-L REST, key guard, timeout/auth mapping
110
+
111
+ ### ~~"the big run command — full decision tree"~~ ✅ `bb58e91`
112
+ - [x] `commands/run.py` — 8-step state machine (receptor→chains→AF→pocket→ligands→ADMET→depth→confirm→submit)
113
+
114
+ ### ~~"wire all commands into cli.py"~~ ✅ `cd8e472`
115
+ - [x] `cli.py` — all 8 subcommands wired with typed Typer arguments
116
+
117
+ ---
118
+
119
+ ## Phase 7 — Results & Reporting
120
+
121
+ ### ~~"merge shard results and render the viewer"~~ ✅ `af6b008`
122
+ - [x] `results/merger.py` — CSV merge, best-score dedup, global sort, SDF + failed_prep concat
123
+
124
+ ### ~~"prep report writer"~~ ✅ `2988862`
125
+ - [x] `report.py` — Section 8.3 JSON schema, `.txt` human-readable, Rich summary panel
126
+
127
+ ---
128
+
129
+ ## Phase 8 — Status Screen & Version Check
130
+
131
+ ### ~~"ezscreen status with live refresh"~~ ✅ `388efb2`
132
+ - [x] `ezscreen status` — 30s auto-refresh
133
+ - [x] Version check async + PyPI banner ✅ `45a2986`
134
+
135
+ ---
136
+
137
+ ## Phase 9 — Polish & Packaging
138
+
139
+ ### ~~"final cleanup and readme"~~ ✅
140
+ - [x] Smoke test — merger, report, version_check ✅ `9cdebb5`
141
+ - [x] `README.md` — install + quickstart ✅ `3c2674a`
142
+ - [ ] Conda-forge recipe (post pip release)
143
+
144
+ ---
145
+
146
+ ## Backlog / Nice-to-Have (Post v1)
147
+
148
+ - All v2 features from handoff Section 16