free-gpu 0.1.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,317 @@
Metadata-Version: 2.4
Name: free-gpu
Version: 0.1.1
Summary: Plan local and free-tier GPU workflows around llmfit and curated provider data.
Author: Francesco Piccolo
Project-URL: Homepage, https://francescoopiccolo.github.io/free-gpu/
Project-URL: Repository, https://github.com/francescoopiccolo/free-gpu
Project-URL: Documentation, https://francescoopiccolo.github.io/free-gpu/
Project-URL: Issues, https://github.com/francescoopiccolo/free-gpu/issues
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: fastapi<1,>=0.115
Requires-Dist: mcp<2,>=1.27
Requires-Dist: textual<0.90,>=0.80
Provides-Extra: publish
Requires-Dist: build>=1.2; extra == "publish"
Requires-Dist: twine>=5; extra == "publish"

# free-gpu

`free-gpu` is a terminal-first planner for free and near-free compute.

It is designed to sit on top of [`llmfit`](https://www.llmfit.org/):

- `llmfit` answers: what models fit my local hardware?
- `free-gpu` answers: given this workload and compute need, which providers should I use, and how should I split work across local and remote stages?

The point is not to clone `llmfit`. The point is to use `llmfit` as the local-fit engine, then add provider filtering, role-aware ranking, and workflow planning around free, cheap, and grant-style compute.

## What users actually get

`free-gpu` helps answer questions like:

- "I need quick inference for a small coding task. Which free-tier provider is the least painful?"
- "I need a few hours of GPU time for LoRA fine-tuning. Should I look at credits, trials, or a cloud free tier?"
- "This task is too heavy for casual free tiers. Which grant or program lane should I think about instead?"
- "What should stay local, and what should move to remote compute?"

## What the repo includes

- The original provider dataset in [`free_gpu/gpu_compute_database.csv`](./free_gpu/gpu_compute_database.csv)
- A Python CLI for provider ranking and workflow planning
- A Textual TUI focused on provider selection rather than local model browsing
- A small MCP server so external agents can ask for provider plans programmatically
- A GitHub Pages-ready project page in [`docs/index.html`](./docs/index.html)

## Core product rules

- Role is a ranking lens, not a hard exclusion filter.
- Budget buckets are semantic UX buckets, not literal accounting truth.
- Grant-like providers behave like card-required options.
- The planner should surface the right provider lane for each task instead of treating every task as the same generic ranking problem.

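The first rule can be made concrete: a role match boosts a provider's score rather than gating it. A minimal sketch, assuming an illustrative provider shape and boost weight (not free-gpu's actual schema or scoring):

```python
# Role as a ranking lens: a matching role boosts the score, but no
# provider is excluded. Field names and the boost weight are
# illustrative, not free-gpu's real schema.
def rank_providers(providers, role):
    def score(provider):
        base = provider.get("base_score", 0.0)
        if role in provider.get("roles", []):
            base += 1.0  # boost, don't filter
        return base
    return sorted(providers, key=score, reverse=True)

providers = [
    {"name": "notebook-host", "roles": ["inference"], "base_score": 0.5},
    {"name": "credit-cloud", "roles": ["finetune"], "base_score": 0.7},
]
ranked = rank_providers(providers, role="inference")
# The non-matching provider is still present, just ranked lower.
```
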
54
+ ## Provider lanes
55
+
56
+ `free-gpu` is not only about "free" in the narrow sense. It plans across several practical lanes:
57
+
58
+ - `free tier`: browser notebooks, starter quotas, short session access
59
+ - `under-25`: credits, trials, starter plans, light paid-but-cheap access
60
+ - `grant`: startup programs, research allocations, application-based access, and other heavier support paths
61
+
62
+ That matters because different tasks naturally fall into different lanes:
63
+
64
+ - quick demos and lightweight inference often fit the free-tier lane
65
+ - medium notebook work and moderate fine-tunes often fit the under-25 or credit lane
66
+ - heavier training and long-running work often belong in the grant lane
67
+
68
+ ## Workflow logic
69
+
70
+ The planner estimates a compute lane from:
71
+
72
+ - workload
73
+ - model size
74
+ - target VRAM
75
+ - estimated task hours
76
+ - parallel jobs
77
+ - API needs
78
+
79
+ Then it schedules providers accordingly:
80
+
81
+ - `burst`: short runs, quick inference, fast-start options
82
+ - `session`: notebook or credit-backed work that lasts longer
83
+ - `heavy`: bigger VRAM or sustained remote compute
84
+ - `grant-scale`: tasks that look more like allocations, programs, or heavy research/startup support
85
+
86
+ Each workflow step carries its own compute summary, so a multi-stage plan can recommend different provider types for prep, fine-tune, eval, and serving.
87
+
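The lane estimate can be pictured as a small threshold function over those inputs. The workload names below match the CLI, but every threshold is invented for illustration; free-gpu's real heuristics live in the planner:

```python
# Sketch of lane estimation. Thresholds are illustrative,
# not free-gpu's actual planner logic.
def estimate_lane(workload, task_hours, min_vram_gb, parallel_jobs=1):
    if workload == "scratch-train" or task_hours >= 24 or min_vram_gb >= 40:
        return "grant-scale"
    if min_vram_gb >= 16 or task_hours * parallel_jobs >= 8:
        return "heavy"
    if task_hours >= 1:
        return "session"
    return "burst"
```
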
## How Pages and MCP fit together

The project has two different surfaces:

- GitHub Pages hosts the public project site and docs
- the MCP server runs locally on the user's own machine

GitHub Pages cannot run the planner logic or host the Python MCP server. It is only the public website.

The actual MCP workflow is:

1. a user installs `free-gpu`
2. the user runs `free-gpu-mcp` locally
3. their coding agent connects to that local MCP server
4. the agent can call planner tools such as `plan_provider_workflow`

That means:

- no hosting cost for you
- no central backend to maintain
- users keep control because the tool runs locally
- any coding agent that supports local MCP servers can use it

This repository also supports an optional hosted HTTP deployment. If you deploy it on Vercel, the MCP endpoint is exposed at `/mcp`.

## Install

```bash
python -m pip install free-gpu
```

For local development from the repository:

```bash
python -m pip install -e .
```

## CLI

### Local profile

```bash
free-gpu local
free-gpu local --ram-gb 32 --vram-gb 12
```

### Provider ranking

```bash
free-gpu providers --workload inference --budget free
free-gpu providers --workload agent-loop --budget under-25 --task-hours 3 --parallel-jobs 4 --requires-api
```

### Workflow planning

```bash
free-gpu plan --workload inference --model qwen2.5-coder-7b --ram-gb 32 --vram-gb 8
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16
free-gpu plan --workload scratch-train --budget grant --task-hours 24 --min-vram-gb 40
```

Useful planning flags:

- `--task-hours`
- `--min-vram-gb`
- `--parallel-jobs`
- `--requires-api`
- `--budget any|free|under-25|grant`

Every command also accepts `--json`.

## Terminal UI

Run:

```bash
free-gpu ui
```

The TUI is inspired by `llmfit`'s visual grammar, but it stays focused on provider planning:

- a top system bar with local hardware context from `llmfit`
- broad provider browsing by default
- role, workload, budget, and payment filters
- a central provider table
- bottom panes for links, recommendation context, and workflow summary

Current budget options in the TUI:

- `Budget Any`
- `Free`
- `<25`
- `Grant`

## llmfit integration

If `llmfit` is installed, `free-gpu` will try to use:

- `llmfit system --json`
- `llmfit recommend -n N --json`

The adapter uses structured JSON output rather than scraping terminal text. If `llmfit` is missing or parsing fails, `free-gpu` continues in provider-first mode and reports what it could not infer.

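A rough sketch of that adapter pattern, calling the CLI for JSON and degrading gracefully; the exact fallback behavior inside free-gpu may differ:

```python
import json
import shutil
import subprocess

def llmfit_json(args, timeout=30):
    """Run an llmfit subcommand with --json; return None if unavailable."""
    if shutil.which("llmfit") is None:
        return None  # llmfit not installed: stay in provider-first mode
    try:
        result = subprocess.run(
            ["llmfit", *args, "--json"],
            capture_output=True, text=True, timeout=timeout, check=True,
        )
        return json.loads(result.stdout)
    except (subprocess.SubprocessError, json.JSONDecodeError):
        return None  # llmfit ran but failed or emitted non-JSON

profile = llmfit_json(["system"])  # None when llmfit is absent
```
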
## MCP server

Run:

```bash
free-gpu-mcp
```

The MCP server exposes tools for compute-aware planning, including:

- `plan_provider_workflow`
- `rank_providers_for_task`
- `assess_task_compute`

It also exposes a small dataset summary resource:

- `providers://snapshot`

### What the MCP is for

The MCP server lets an agent ask questions such as:

- "Plan a cheap inference workflow for this task"
- "Rank providers for a 6-hour fine-tune that needs about 16 GB VRAM"
- "Does this task look like free-tier, credit-tier, or grant-scale work?"

### Generic local MCP setup

If your coding agent supports local MCP servers over stdio, the setup is conceptually:

```json
{
  "mcpServers": {
    "free-gpu": {
      "command": "free-gpu-mcp"
    }
  }
}
```

The exact config file depends on the agent, but the idea is the same: point the client at the local `free-gpu-mcp` command.

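Under the hood, an MCP client exchanges JSON-RPC 2.0 messages with `free-gpu-mcp` over stdio. A sketch of the two key messages — the envelopes follow the MCP specification, while the client name and tool arguments are just an example:

```python
import json

# First message: the client initializes the MCP session.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    },
}

# Later: the client invokes a planner tool. Arguments are illustrative.
call_tool = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "plan_provider_workflow",
        "arguments": {"workload": "finetune-lora", "budget": "free", "task_hours": 6},
    },
}

# Each message is serialized as a single JSON object on the wire.
wire = json.dumps(call_tool)
```
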
### Hosted HTTP MCP on Vercel

This repository also includes a Vercel-friendly HTTP entrypoint via [`app.py`](./app.py).

When deployed on Vercel:

- `/` returns a small service description
- `/health` returns a simple health check
- `/mcp` is the MCP endpoint to connect to
- the live hosted endpoint for this repo is `https://free-gpu.vercel.app/mcp`

That means an MCP-capable client that supports remote HTTP MCP can connect to:

```text
https://free-gpu.vercel.app/mcp
```

If you open `/mcp` in a browser, it may return a protocol-level error such as `406 Not Acceptable`. That is expected: the route is meant for MCP clients, not normal browser navigation.

Example MCP-style request shape:

```json
{
  "tool": "plan_provider_workflow",
  "arguments": {
    "workload": "agent-loop",
    "budget": "under-25",
    "task_hours": 3,
    "parallel_jobs": 4,
    "requires_api": true
  }
}
```

Example agent flow:

1. You ask your coding agent: "I need to fine-tune an 8B model for about 6 hours and want to stay near free."
2. The agent calls `plan_provider_workflow`.
3. `free-gpu` estimates the compute lane.
4. The agent gets back a structured plan with:
   - local vs. remote recommendation
   - stage-by-stage workflow
   - top providers for that compute need
   - whether the task fits the free tier, cheap credits, or a grant-style path

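The returned plan might look roughly like this; the shape below is illustrative, not free-gpu's exact output schema:

```python
# Illustrative plan shape; field names and provider names are
# hypothetical, not free-gpu's exact schema.
plan = {
    "lane": "heavy",
    "run_local": False,
    "stages": [
        {"name": "prep", "where": "local"},
        {"name": "finetune", "where": "remote", "min_vram_gb": 16},
        {"name": "eval", "where": "local"},
    ],
    "top_providers": ["provider-a", "provider-b"],  # hypothetical names
}

# An agent can pick out just the stages that need remote compute.
remote_stages = [s["name"] for s in plan["stages"] if s["where"] == "remote"]
```
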
## GitHub Pages

A project page is included in [`docs/index.html`](./docs/index.html).

On GitHub, enable Pages and point it at:

- Branch: `main`
- Folder: `/docs`

## Tests

Run:

```bash
python -m unittest tests.test_planner -v
```

## Packaging and publishing

The project is structured so end users do not need to clone the repository.

After publishing to PyPI, users can install it with:

```bash
pip install free-gpu
```

To build distribution artifacts locally:

```bash
python -m pip install ".[publish]"
python -m build
python -m twine check dist/*
```

To publish to PyPI once you have a token for the `free-gpu` project:

```bash
python -m twine upload dist/*
```
@@ -0,0 +1,2 @@
"""free-gpu package."""