start-vibing-stacks 2.17.0 → 2.18.0

@@ -1,17 +1,37 @@
1
1
  ---
2
2
  name: pytest-testing
3
- version: 1.0.0
3
+ version: 2.0.0
4
+ description: "Pytest 9 (Nov 2025) testing patterns for Python 3.13/3.14. Covers conftest fixtures with autouse cleanup, parametrize + the new pytest 9 subtests API, async testing with pytest-asyncio 1.0 (May 2025 — `event_loop` fixture removed) and AnyIO for backend-agnostic tests, httpx.AsyncClient + ASGITransport as the FastAPI test client, DB rollback fixtures, mocking with `unittest.mock.AsyncMock`, parallel runs with pytest-xdist + sharding for CI, coverage with `--cov-fail-under` gate, and uv-friendly invocation. Invoke after writing any feature, fixture, or before merging."
4
5
  ---
5
6
 
6
- # Pytest Testing — Python Testing Patterns
7
+ # Pytest 9 — Python Testing Patterns (2026)
7
8
 
8
- **ALWAYS invoke AFTER implementing any feature.**
9
+ **ALWAYS invoke AFTER implementing a feature, before opening a PR, and as part of CI.**
10
+
11
+ ## Toolchain
12
+
13
+ | Tool | Version | Notes |
14
+ |---|---|---|
15
+ | `pytest` | **9.0+** (Nov 5, 2025) | Subtests built-in; new collection internals |
16
+ | `pytest-asyncio` | **1.0+** (May 26, 2025) | `event_loop` fixture removed; preliminary 3.14 support |
17
+ | `anyio` | 4.x | Backend-agnostic async tests (asyncio + trio); the pytest plugin ships with the package, and `anyio[trio]` pulls in the trio backend |
18
+ | `pytest-cov` | 6.x | `--cov` + `--cov-fail-under` |
19
+ | `pytest-xdist` | 3.x | Parallel runs across local processes (`-n auto`); sharding across CI runners needs a separate plugin |
20
+ | `httpx` | 0.27+ | `AsyncClient` + `ASGITransport` for FastAPI |
21
+ | `pytest-mock` | optional | `mocker` fixture wrapping unittest.mock |
22
+
23
+ Install via uv:
24
+
25
+ ```bash
26
+ uv add --dev pytest pytest-asyncio pytest-cov pytest-xdist httpx anyio
27
+ ```
9
28
 
10
29
  ## Structure
11
30
 
12
31
  ```
13
32
  tests/
14
- ├── conftest.py # Shared fixtures
33
+ ├── conftest.py # Shared fixtures (event-loop policy, DB, client)
34
+ ├── factories.py # Test data factories (uuid + faker)
15
35
  ├── unit/
16
36
  │ ├── test_services.py
17
37
  │ └── test_models.py
@@ -22,92 +42,190 @@ tests/
22
42
  └── test_flows.py
23
43
  ```
24
44
 
25
- ## Fixtures
45
+ ## `pyproject.toml` configuration
46
+
47
+ ```toml
48
+ [tool.pytest.ini_options]
49
+ minversion = "9.0"
50
+ asyncio_mode = "auto" # @pytest.mark.asyncio not required
51
+ asyncio_default_fixture_loop_scope = "session"
52
+ addopts = [
53
+ "-ra", # short summary for skip/xfail/error
54
+ "--strict-markers",
55
+ "--strict-config",
56
+ "--cov=app",
57
+ "--cov-report=term-missing",
58
+ "--cov-fail-under=80",
59
+ ]
60
+ testpaths = ["tests"]
61
+ markers = [
62
+ "slow: marks tests as slow (deselect with -m 'not slow')",
63
+ "e2e: end-to-end tests requiring external services",
64
+ ]
65
+ ```
66
+
67
+ ## Async client fixture (FastAPI)
68
+
69
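+ `pytest-asyncio` 1.0 dropped the `event_loop` fixture. Use the `asyncio_default_fixture_loop_scope = "session"` setting (above) instead of overriding the loop manually. If tests share loop-bound resources with session-scoped fixtures (for example a connection pool), also set `asyncio_default_test_loop_scope = "session"` so tests and fixtures run on the same loop.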
26
70
 
27
71
  ```python
28
- import pytest
+ # tests/conftest.py
+ import pytest_asyncio
  from httpx import AsyncClient, ASGITransport
  from app.main import app
- from app.core.config import settings
-
- @pytest.fixture
+ from app.db.session import async_session, engine
+ from app.db.base import Base
+
+ @pytest_asyncio.fixture(scope="session", autouse=True)
+ async def _create_schema():
+     async with engine.begin() as conn:
+         await conn.run_sync(Base.metadata.create_all)
+     yield
+     async with engine.begin() as conn:
+         await conn.run_sync(Base.metadata.drop_all)
+     await engine.dispose()
+
+ @pytest_asyncio.fixture
  async def client():
      transport = ASGITransport(app=app)
      async with AsyncClient(transport=transport, base_url="http://test") as ac:
          yield ac
 
- @pytest.fixture
- async def db_session():
+ @pytest_asyncio.fixture
+ async def db():
+     """Per-test session that always rolls back — no test pollutes another."""
      async with async_session() as session:
          yield session
-         await session.rollback() # Cleanup
-
- @pytest.fixture
- def sample_user():
-     return {"name": "Test User", "email": f"test_{uuid4().hex[:8]}@test.com", "password": "Pass1234!"}
+         await session.rollback()
48
100
  ```
49
101
 
50
- ## Async Tests
102
+ ## AnyIO — backend-agnostic async tests
103
+
104
+ When the code under test must work for both asyncio and trio (e.g. shared library code), use AnyIO instead of pytest-asyncio:
51
105
 
52
106
  ```python
53
107
  import pytest
54
108
 
55
- @pytest.mark.asyncio
56
- async def test_create_user(client, sample_user):
57
- response = await client.post("/api/v1/users", json=sample_user)
58
- assert response.status_code == 201
59
- data = response.json()
60
- assert data["email"] == sample_user["email"]
61
- assert "id" in data
62
- assert "password" not in data # Never leak passwords
63
-
64
- @pytest.mark.asyncio
65
- async def test_get_me_unauthorized(client):
66
- response = await client.get("/api/v1/users/me")
67
- assert response.status_code == 401
109
+ pytestmark = pytest.mark.anyio # whole module is async
110
+
111
+ async def test_works_under_either_backend(anyio_backend):
+     # anyio_backend is parametrised over ['asyncio', 'trio']
+     ...
68
114
  ```
69
115
 
70
- ## Parameterized Tests
116
+ Configure once in `pyproject.toml`:
117
+
118
+ ```toml
119
+ [tool.pytest.ini_options]
120
+ anyio_mode = "auto"
121
+ ```
122
+
123
+ ## Subtests (new in pytest 9)
124
+
125
+ For dataset-driven tests where you want **each row reported individually** without the parametrize ID overhead:
71
126
 
72
127
  ```python
73
- @pytest.mark.parametrize("email,expected", [
74
- ("valid@test.com", 201),
75
- ("invalid-email", 422),
76
- ("", 422),
77
- ])
78
- async def test_email_validation(client, email, expected):
79
- response = await client.post("/api/v1/users", json={"name": "Test", "email": email, "password": "Pass1234!"})
80
- assert response.status_code == expected
128
+ def test_email_normalization(subtests):
+     cases = [("A@B.com", "a@b.com"), (" a@b.com ", "a@b.com")]
+     for raw, expected in cases:
+         with subtests.test(raw=raw):
+             assert normalize_email(raw) == expected
133
+ ```
134
+
135
+ Each subtest reports as a distinct outcome — failures don't stop the others.
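
The subtests example calls `normalize_email` without defining it. A minimal sketch of such a helper, assuming that normalisation simply means trimming whitespace and lower-casing (the function body is hypothetical and not part of the skill):

```python
def normalize_email(raw: str) -> str:
    """Hypothetical helper: trim surrounding whitespace and lower-case."""
    return raw.strip().lower()


# Same rows as the subtests example: (raw input, expected output).
cases = [("A@B.com", "a@b.com"), (" a@b.com ", "a@b.com")]

# Evaluate every row instead of stopping at the first mismatch,
# which is the behaviour the subtests fixture gives you inside pytest.
results = [(raw, normalize_email(raw) == expected) for raw, expected in cases]
```

Inside pytest, each row would be wrapped in `subtests.test(raw=raw)` so that a failing row is reported on its own.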
136
+
137
+ ## Parametrize — for combinatorial inputs
138
+
139
+ ```python
140
+ @pytest.mark.parametrize(
+     "email,status",
+     [
+         ("valid@test.com", 201),
+         ("invalid-email", 422),
+         ("", 422),
+         ("a" * 320, 422),
+     ],
+     ids=["valid", "no-at", "empty", "too-long"],
+ )
+ async def test_email_validation(client, email, status):
+     body = {"name": "Test", "email": email, "password": "Pass1234!"}
+     r = await client.post("/api/v1/users", json=body)
+     assert r.status_code == status
81
154
  ```
82
155
 
83
156
  ## Mocking
84
157
 
85
158
  ```python
86
159
  from unittest.mock import AsyncMock, patch
160
+ import pytest
87
161
 
88
- @pytest.mark.asyncio
162
+ @pytest.mark.anyio
89
163
  async def test_external_api_failure(client):
      with patch("app.services.external.fetch_data", new_callable=AsyncMock) as mock:
          mock.side_effect = ConnectionError("API down")
-         response = await client.get("/api/v1/data")
-         assert response.status_code == 503
+         r = await client.get("/api/v1/data")
+         assert r.status_code == 503
168
+ ```
169
+
170
+ For HTTP mocking specifically, prefer `respx` (drop-in for httpx) over hand-rolled patches.
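
The `AsyncMock` plus `side_effect` pattern from the mocking test can be tried without FastAPI at all. This is a stdlib-only sketch: `load_dashboard` is a hypothetical service function standing in for the endpoint, and the 503 mapping is an assumption that mirrors the test's expectation.

```python
import asyncio
from unittest.mock import AsyncMock


async def load_dashboard(fetch_data) -> dict:
    """Hypothetical service: turn an upstream failure into a 503-style result."""
    try:
        payload = await fetch_data()
    except ConnectionError:
        return {"status": 503, "detail": "upstream unavailable"}
    return {"status": 200, "data": payload}


# Awaiting the mock raises the configured exception.
failing = AsyncMock(side_effect=ConnectionError("API down"))
failure = asyncio.run(load_dashboard(failing))

# return_value is what awaiting the mock yields on the happy path.
working = AsyncMock(return_value={"items": [1, 2, 3]})
success = asyncio.run(load_dashboard(working))
```

Passing the dependency in as an argument keeps the example free of `patch`; in the FastAPI test the same mock is injected by patching the import path instead.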
171
+
172
+ ## Test data factories
173
+
174
+ ```python
175
+ # tests/factories.py
176
+ from uuid import uuid4
177
+ from faker import Faker
178
+
179
+ fake = Faker()
180
+
181
+ def user_payload(**overrides):
+     return {
+         "name": fake.name(),
+         "email": f"{uuid4().hex[:8]}@test.com",
+         "password": "Pass1234!",
+         **overrides,
+     }
94
188
  ```
95
189
 
96
- ## Commands
190
+ Factories are functions, not classes — keeps them testable, composable, type-safe.
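
To illustrate the "composable" point, here is a stdlib-only variant of the factory (no Faker, so the name is a fixed placeholder) together with a second factory built on top of it. `admin_payload` and the `role` field are hypothetical and only serve the illustration.

```python
from uuid import uuid4


def user_payload(**overrides) -> dict:
    """Stdlib-only variant of the factory: unique email per call."""
    base = {
        "name": "Test User",
        "email": f"{uuid4().hex[:8]}@test.com",
        "password": "Pass1234!",
    }
    return {**base, **overrides}


def admin_payload(**overrides) -> dict:
    """Hypothetical composed factory: a user with a default admin role."""
    # Merge before calling so a caller can still override `role`.
    return user_payload(**{"role": "admin", **overrides})


first, second = user_payload(), user_payload()
admin = admin_payload(name="Root")
```

Because each call generates its own email, two tests running in parallel do not compete for the same row.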
191
+
192
+ ## Coverage gate
97
193
 
98
194
  ```bash
99
- pytest # Run all
100
- pytest -x # Stop on first failure
101
- pytest --tb=short # Short traceback
102
- pytest -k "test_create" # Filter by name
103
- pytest --cov=app --cov-report=html # Coverage
104
- pytest -n auto # Parallel (pytest-xdist)
195
+ pytest --cov=app --cov-report=term-missing --cov-fail-under=80
105
196
  ```
106
197
 
198
+ Set the gate per package, raise it incrementally — never lower it. Use `--cov-config=.coveragerc` to exclude generated code (`migrations/`, `__init__.py`).
199
+
200
+ ## CI parallelism + sharding
201
+
202
+ ```bash
203
+ # Local — use all cores
204
+ uv run pytest -n auto
205
+
206
+ # CI — split tests across N runners (matrix job in GitHub Actions)
207
+ uv run pytest --shard-id=$SHARD_INDEX --num-shards=$TOTAL_SHARDS
208
+ ```
209
+
210
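+ `pytest-xdist` distributes tests across processes on a single machine. Splitting a suite across CI runners needs a separate plugin: the `--shard-id` / `--num-shards` flags above come from `pytest-shard`, while `pytest-split` offers `--splits` / `--group`. pytest itself has no built-in sharding option.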
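
The idea behind sharding can be sketched in a few lines: every runner collects the full suite, then keeps only the tests whose ID maps to its own shard index. The helper names below are hypothetical and the hashing scheme is an assumption chosen for illustration; it is not the algorithm of any particular plugin.

```python
import zlib


def shard_for(test_id: str, num_shards: int) -> int:
    """Map a test node ID to a shard index in [0, num_shards)."""
    # crc32 is stable across processes and machines, unlike hash() on str.
    return zlib.crc32(test_id.encode("utf-8")) % num_shards


def select(test_ids: list[str], shard_id: int, num_shards: int) -> list[str]:
    """Return the subset of tests that this runner should execute."""
    return [t for t in test_ids if shard_for(t, num_shards) == shard_id]


test_ids = [f"tests/unit/test_services.py::test_case_{i}" for i in range(20)]
shards = [select(test_ids, i, 3) for i in range(3)]
```

Since the mapping depends only on the test ID and the shard count, every runner reaches the same partition without coordinating with the others.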
211
+
107
212
  ## FORBIDDEN
108
213
 
109
- 1. **No fixtures for cleanup** — always rollback/cleanup
110
- 2. **Hardcoded test data** — use factories/uuid
111
- 3. **Testing implementation**: test behavior, not internals
112
- 4. **Skipping async tests**: use `pytest-asyncio`
113
- 5. **No coverage in CI**: `--cov --cov-fail-under=70`
214
+ | Anti-pattern | Reason |
215
+ |---|---|
216
+ | `@pytest_asyncio.fixture(loop_scope="function")` for DB-heavy suites | Recreates the connection pool for every test, which is slow; use `session` scope + per-test rollback |
217
+ | Defining your own `event_loop` fixture | Removed in pytest-asyncio 1.0 — use `asyncio_default_fixture_loop_scope` |
218
+ | Hardcoded test data (`"test@test.com"`) | Tests collide in parallel; use uuid/faker |
219
+ | Testing private methods (`_calculate_x`) | Test public behaviour, not internals |
220
+ | No cleanup → flaky tests | Always rollback or use isolated DB per test |
221
+ | `print()` for debugging | Use `pytest -s` + `caplog` fixture |
222
+ | Skipping coverage in CI | `--cov-fail-under=N` gate, raise over time |
223
+ | Catching `Exception` then asserting | Use `pytest.raises(SpecificError)` |
224
+ | Importing app at module top in slow tests | Lazy-import in fixtures so collection is fast |
225
+
226
+ ## See Also
227
+
228
+ - `fastapi-patterns` — endpoints + DI under test
229
+ - `pydantic-validation` — schema fuzzing with hypothesis
230
+ - `async-patterns` — TaskGroup/timeout patterns the tests verify
231
+ - `_shared/skills/playwright-automation` — for browser/E2E coverage
@@ -1,12 +1,32 @@
1
1
  ---
2
2
  name: python-patterns
3
- version: 1.0.0
3
+ version: 2.0.0
4
+ description: "Python architecture decisions for Python 3.13 (Oct 2024) / 3.14 (Oct 2025) projects. Framework selection (FastAPI / Django 5.2 LTS / Flask / scripts), async vs sync rules, free-threaded mode awareness (officially supported in 3.14 via PEP 779), modern typing (`X | None`, `TypeIs`, type-param defaults), project structure per app type, error handling, background-task choice. Pairs with the per-framework skills (fastapi-patterns, django-patterns, scripting-automation). Invoke for any new Python project, framework choice, or architectural decision."
4
5
  ---
5
6
 
6
- # Python Patterns — Architecture & Decision-Making
7
+ # Python Patterns — Architecture & Decisions (3.13 / 3.14)
7
8
 
8
9
  **ALWAYS invoke when making Python architecture decisions.**
9
10
 
11
+ ## Version Policy (2026)
12
+
13
+ - **Python 3.13** (Oct 7, 2024) — minimum for new projects
14
+ - **Python 3.14** (Oct 7, 2025) — recommended; brings **officially supported free-threaded mode** (PEP 779), template strings (PEP 750), deferred annotation evaluation
15
+ - **Package manager: `uv`** (Astral, acquired by OpenAI Mar 2026) — 10–100× faster than pip, 10–20× faster than Poetry; surpassed Poetry in monthly downloads in early 2026. Pin `uv` for new projects unless a constraint forces Poetry/pip.
16
+ - **Lint + format: `ruff`** (one Rust binary; replaces flake8 + isort + black + pydocstyle + pyupgrade + autoflake)
17
+ - **Type checker: `pyright`** for correctness (97.8% conformance), `ty` (Astral) when Pyright is too slow on huge codebases — still beta but 10–60× faster
18
+
19
+ ## Free-Threaded vs GIL — When to care
20
+
21
+ | Workload | Build | Notes |
22
+ |---|---|---|
23
+ | Web server (I/O-bound) | Standard GIL | asyncio handles concurrency; free-threading buys little |
24
+ | Mixed I/O + light CPU | Standard GIL | Standard build is faster per-thread |
25
+ | **CPU-bound multi-thread** (parsing, math, ML pre-processing) | **Free-threaded 3.14** | Real parallelism — replaces the multiprocessing dance |
26
+ | Library author | Both | Test with `python3.14t` to flag thread-safety bugs |
27
+
28
+ The free-threaded interpreter ships as a separate binary (`python3.14t`). It's slower per-thread (~10–15% overhead) than the GIL build — only adopt when you actually need parallel CPU.
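
Before relying on free-threading, it helps to check which build is actually running. A small sketch, assuming CPython: `Py_GIL_DISABLED` is the build-time flag, and `sys._is_gil_enabled()` (available from 3.13) reports whether the GIL is active at runtime. The function name `free_threading_status` is made up for this example.

```python
import sys
import sysconfig


def free_threading_status() -> dict[str, bool]:
    """Report whether this interpreter is a free-threaded build and whether the GIL is on."""
    # Build-time flag: 1 on free-threaded builds, 0 or None otherwise.
    built = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Runtime check exists only on 3.13+; older interpreters always have the GIL.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True
    return {"free_threaded_build": built, "gil_enabled": gil_enabled}


status = free_threading_status()
```

A free-threaded build can still run with the GIL re-enabled (for instance when an extension module requires it), which is why the two values are reported separately.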
29
+
10
30
  ## Framework Selection
11
31
 
12
32
  ```
@@ -44,12 +64,22 @@ Don't:
44
64
 
45
65
  ## Type Hints (MANDATORY for public APIs)
46
66
 
67
+ Modern syntax — `X | None` over `Optional[X]`, lowercase generics, `TypeIs` for narrowing.
68
+
47
69
  ```python
48
- from typing import Optional
70
+ from typing import TypeIs
49
71
 
50
- def find_user(id: int) -> Optional[User]: ...
51
- def process(data: str | dict) -> None: ...
72
+ def find_user(id: int) -> User | None: ... # 3.10+ union syntax
73
+ def process(data: str | dict[str, object]) -> None: ...
52
74
  def get_items() -> list[Item]: ...
75
+
76
+ # TypeIs (3.13+) narrows both branches, so return True exactly when the value is a User
+ # (an extra check such as role == "admin" makes the negative branch unsound; use TypeGuard for that)
+ def is_user(user: User | Guest) -> TypeIs[User]:
+     return isinstance(user, User)
+
+ # Type parameter defaults (3.13+) — generic classes with sensible defaults
+ class Repo[T = User]:
+     def find(self, id: str) -> T | None: ...
53
83
  ```
54
84
 
55
85
  ## Project Structure
@@ -125,7 +155,19 @@ async def not_found_handler(request, exc):
125
155
  ## FORBIDDEN
126
156
 
127
157
  1. **Business logic in routes/views** — use services layer
128
- 2. **Sync libraries in async code** — blocks event loop
158
+ 2. **Sync libraries in async code** — blocks event loop (`requests`, `psycopg2` sync, `pymongo` sync, `time.sleep`)
129
159
  3. **No type hints on public APIs** — always type
130
- 4. **Raw SQL without parameterization** — injection risk
160
+ 4. **Raw SQL without parameterization** — injection risk (use ORM bindings or `:name` / `?` placeholders)
131
161
  5. **`import *`** — explicit imports only
162
+ 6. **`Optional[X]`** — write `X | None` (3.10+ syntax)
163
+ 7. **`pip install` in new projects** — use `uv add` (uv is the 2026 default; pip still fine for legacy)
164
+ 8. **Per-tool config files (`.flake8`, `.isort.cfg`, `pyproject` for black + isort + ruff…)** — consolidate under `[tool.ruff]` in `pyproject.toml`
165
+ 9. **Banking on free-threading for an I/O-bound web app** — use asyncio; the GIL build is faster
166
+
167
+ ## See Also
168
+
169
+ - `fastapi-patterns` / `django-patterns` / `scripting-automation` — per-application-type setup
170
+ - `pydantic-validation` — boundary validation (Pydantic V2)
171
+ - `pytest-testing` — pytest 9 + pytest-asyncio 1
172
+ - `async-patterns` — asyncio.timeout, TaskGroup, AnyIO
173
+ - `python-performance` — profiling, free-threading trade-offs
@@ -1,11 +1,191 @@
1
1
  ---
2
2
  name: python-performance
3
- version: 1.0.0
3
+ version: 2.0.0
4
+ description: "Performance profiling and optimisation for Python 3.13/3.14. Covers cProfile/line-profiler/memory-profiler/py-spy choice, the experimental copy-and-patch JIT in 3.13 (PEP 744 — disabled by default), free-threaded mode in 3.14 (PEP 779 officially supported — when it actually wins vs asyncio + multiprocessing), `functools.cache` (unbounded) vs `lru_cache` (bounded), structural optimisations (set vs list membership, generators for memory, `str.join` vs `+`), bulk DB ops in SQLAlchemy/Django, async caching with redis-py, and Polars (Rust-backed dataframes, ~10× pandas) for data work. Profile FIRST, optimise SECOND."
4
5
  ---
5
6
 
6
- # Python Performance — Profiling & Optimization
7
+ # Python Performance — Profiling & Optimisation (3.13 / 3.14)
7
8
 
8
- **ALWAYS invoke when optimizing slow Python code.**
9
+ **ALWAYS invoke when optimising slow Python code. Profile FIRST.**
10
+
11
+ ## What to reach for in 2026
12
+
13
+ | Symptom | Tool / Pattern |
14
+ |---|---|
15
+ | "Function X is hot" | `cProfile` → `snakeviz`, then `line_profiler` for line-by-line |
16
+ | "Process eats RAM" | `memory_profiler` for line-level, `tracemalloc` for snapshots |
17
+ | "Production is slow but we can't repro" | **`py-spy`** (sampling, no code change, attaches by PID) |
18
+ | "I want a flame graph" | `py-spy record -o profile.svg --pid …` |
19
+ | "Hot Python loop, can't rewrite in C" | Try **3.13 JIT** (`PYTHON_JIT=1`) — still experimental |
20
+ | "Multi-thread CPU-bound, GIL is the wall" | **3.14 free-threaded** build (`python3.14t`) — officially supported |
21
+ | "Tabular data crunching" | **Polars** (Rust-backed, ~10× pandas, lazy frames) |
22
+ | "Pure-Python hot path" | `mypyc`, `cython`, `numba` — pick based on dependency tolerance |
23
+
24
+ ## Profiling
25
+
26
+ ```bash
27
+ # CPU — cumulative time per function
28
+ python -m cProfile -o prof.out app.py
29
+ uv run snakeviz prof.out # interactive HTML view
30
+
31
+ # Line-level (decorate target with @profile, no import needed)
32
+ uv run kernprof -l -v script.py
33
+
34
+ # Memory — line-level allocations
35
+ uv run python -m memory_profiler script.py
36
+
37
+ # Production-safe sampling profiler — attach by PID
38
+ py-spy top --pid 12345
39
+ py-spy record -o flame.svg --duration 30 --pid 12345
40
+ ```
41
+
42
+ py-spy is the safest tool for prod: zero code changes, low overhead (~5%), works on a running process.
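
When the hot path is a single function rather than a whole script, `cProfile` can also be driven from code. A minimal sketch using only the standard library; `slow_sum` is a hypothetical stand-in for the function under investigation.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Hypothetical hot function: sum of squares in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 200_000)  # profile just this call

# Render the five most expensive entries by cumulative time into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

The same `pstats` sorting applies to a `prof.out` file written by the command-line run.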
43
+
44
+ ## Free-threading vs JIT — when each helps
45
+
46
+ ```
47
+ 3.13 JIT (PEP 744)
48
+ ├── Status: experimental, OFF by default
49
+ ├── Win: hot Python bytecode loops (~5-15% on micro-benchmarks)
50
+ └── Enable: build with --enable-experimental-jit OR run PYTHON_JIT=1 (when distro supports it)
51
+
52
+ 3.14 Free-threaded (PEP 779)
53
+ ├── Status: OFFICIALLY SUPPORTED
54
+ ├── Binary: python3.14t (separate from python3.14)
55
+ ├── Win: parallel CPU work across threads — no GIL
56
+ ├── Cost: ~10-15% slower per-thread vs GIL build
57
+ └── Use when: CPU-bound multi-thread work where multiprocessing overhead is too high
58
+ ```
59
+
60
+ Don't bank on either for I/O-bound web servers — asyncio dominates that case.
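
The kind of workload that free-threading targets looks like the sketch below: pure-Python CPU work split across a thread pool. On a GIL build the threads take turns and the result is correct but not faster; on `python3.14t` the same code can use several cores. The prime-counting function and the chunk sizes are arbitrary choices for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds: tuple[int, int]) -> int:
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


chunks = [(0, 5_000), (5_000, 10_000), (10_000, 15_000), (15_000, 20_000)]

# One chunk per worker thread; map() preserves chunk order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_total = sum(pool.map(count_primes, chunks))

sequential_total = count_primes((0, 20_000))
```

Timing both variants on each build is the measurement that decides whether the free-threaded interpreter pays off for a given workload.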
61
+
62
+ ## Caching primitives
63
+
64
+ ```python
65
+ from functools import cache, lru_cache
66
+
67
+ # Bounded — pick a sensible maxsize for your hot paths
68
+ @lru_cache(maxsize=1024)
+ def expensive(n: int) -> int:
+     return sum(range(n))
+
+ # Unbounded — only when input space is small AND fixed
+ @cache # 3.9+, same as @lru_cache(maxsize=None); skips the eviction bookkeeping of a bounded cache
+ def settings_for(env: str) -> Settings:
+     return Settings(env=env)
+
+ # Async — use redis-py async; lru_cache does NOT support coroutines
+ import redis.asyncio as redis
+
+ # Not named `cache`: that would shadow functools.cache imported above
+ redis_client = redis.from_url("redis://localhost", decode_responses=False)
+
+ async def get_user(id: str) -> User:
+     cached = await redis_client.get(f"user:{id}")
+     if cached:
+         return User.model_validate_json(cached)
+     user = await db.get(User, id) # `db`: an async session assumed to be in scope
+     await redis_client.set(f"user:{id}", user.model_dump_json(), ex=300)
+     return user
89
+ ```
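
Why `lru_cache` fails on coroutines can be shown in a few lines: it stores the coroutine object itself, and a coroutine can be awaited only once. The sketch below also includes `async_memoize`, a hypothetical in-process helper that caches the awaited result instead; it is an illustration, not a replacement for Redis or `aiocache`.

```python
import asyncio
from functools import lru_cache, wraps


@lru_cache(maxsize=None)
async def cached_wrong(n: int) -> int:
    return n * 2


def async_memoize(func):
    """Hypothetical helper: cache awaited results, not coroutine objects."""
    results: dict = {}

    @wraps(func)
    async def wrapper(*args):
        if args not in results:
            results[args] = await func(*args)
        return results[args]

    return wrapper


calls = 0


@async_memoize
async def cached_right(n: int) -> int:
    global calls
    calls += 1
    return n * 2


async def demo():
    first = await cached_wrong(1)
    try:
        await cached_wrong(1)  # cache hit returns the already-awaited coroutine
        reuse_error = None
    except RuntimeError as exc:
        reuse_error = str(exc)
    values = [await cached_right(1), await cached_right(1)]
    return first, reuse_error, values


first, reuse_error, values = asyncio.run(demo())
```

The helper has no locking, so two concurrent callers with the same arguments may both compute the value; production caches usually guard against that.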
90
+
91
+ ## Data structures
92
+
93
+ ```python
94
+ # O(n) → O(1) — set lookup wins by 100×+ on big lists
95
+ big_list = [...] # 1M items
96
+ big_set = set(big_list)
97
+ "target" in big_list # SLOW
98
+ "target" in big_set # FAST
99
+
100
+ # dict.get() over try/except for happy-path
101
+ value = data.get("key", default)
102
+
103
+ # Specialised collections
104
+ from collections import defaultdict, Counter, deque
105
+ counts = Counter(events)
106
+ queue = deque(maxlen=1000) # bounded ring buffer
107
+ ```
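
The list-versus-set gap is easy to measure with `timeit`. A small sketch; the collection size and repeat count are arbitrary, and the absolute timings depend on the machine.

```python
import timeit

items = list(range(200_000))
as_set = set(items)
missing = -1  # worst case for the list: every element gets compared

# Same membership test against both containers.
list_time = timeit.timeit(lambda: missing in items, number=20)
set_time = timeit.timeit(lambda: missing in as_set, number=20)

speedup = list_time / set_time if set_time else float("inf")
```

Building the set costs one O(n) pass, so the conversion pays off when the collection is queried more than a handful of times.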
108
+
109
+ ## Generators — memory wins
110
+
111
+ ```python
112
+ # WRONG — materialises 10M dicts in RAM
113
+ all_rows = [process(x) for x in huge_dataset]
114
+ total = sum(r["price"] for r in all_rows)
115
+
116
+ # CORRECT — single pass, constant memory
117
+ total = sum(process(x)["price"] for x in huge_dataset)
118
+ ```
119
+
120
+ Generator expressions are not always faster wall-clock, but they **always** beat list comprehensions on memory.
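
The memory difference is visible even with `sys.getsizeof`, which measures only the container and not the elements it references. A minimal sketch with an arbitrary size:

```python
import sys

n = 100_000

as_list = [i * 2 for i in range(n)]  # all results held at once
as_gen = (i * 2 for i in range(n))   # results produced on demand

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # fixed size regardless of n

# Both forms feed sum() the same values.
same_total = sum(as_list) == sum(i * 2 for i in range(n))
```

For a fuller picture that includes the elements themselves, `tracemalloc` snapshots before and after are the more accurate tool.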
121
+
122
+ ## String operations
123
+
124
+ ```python
125
+ # O(n²) — Python recreates the string each iteration
126
+ result = ""
127
+ for s in strings:
+     result += s
129
+
130
+ # O(n) — single allocation
131
+ result = "".join(strings)
132
+
133
+ # Building structured strings
134
+ parts = [f"row {i}" for i in range(1000)]
135
+ out = "\n".join(parts)
136
+ ```
137
+
138
+ ## Database — bulk over loops
139
+
140
+ ```python
141
+ # SQLAlchemy 2.0 async — bulk insert
142
+ from sqlalchemy import insert
143
+ await db.execute(insert(Item), [{"name": n} for n in names])
144
+ await db.commit()
145
+
146
+ # Django ORM
147
+ Item.objects.bulk_create([Item(name=n) for n in names], batch_size=1000)
148
+
149
+ # Avoid the N+1 trap — see django-patterns / fastapi-patterns
150
+ ```
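
The same bulk-over-loop idea applies at the DB-API level. A stdlib-only sketch with `sqlite3`, assuming an in-memory database and a made-up `item` table: one `executemany` call inside one transaction instead of a commit per row.

```python
import sqlite3

names = [f"item-{i}" for i in range(1000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The connection as a context manager wraps the batch in a single
# transaction: committed on success, rolled back on an exception.
with conn:
    conn.executemany(
        "INSERT INTO item (name) VALUES (?)",  # parameterised, never string-formatted
        [(name,) for name in names],
    )

row_count = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
conn.close()
```

The `?` placeholders also cover the parameterisation rule from `python-patterns`.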
151
+
152
+ ## Polars — when pandas is the bottleneck
153
+
154
+ ```python
155
+ import polars as pl
156
+
157
+ # Lazy — query is optimised before execution
158
+ df = (
+     pl.scan_csv("orders.csv")
+     .filter(pl.col("amount") > 100)
+     .group_by("customer_id")
+     .agg(pl.col("amount").sum().alias("total"))
+     .sort("total", descending=True)
+     # streaming engine for larger-than-RAM data; older Polars releases spell this collect(streaming=True)
+     .collect(engine="streaming")
+ )
166
+ ```
167
+
168
+ Polars is Rust-backed, multi-threaded by default, and lazy — typical 5–30× speedup over pandas on aggregation/filter pipelines, plus much lower memory.
169
+
170
+ ## FORBIDDEN
171
+
172
+ | Anti-pattern | Reason |
173
+ |---|---|
174
+ | Optimising before profiling | "Premature optimisation is the root of all evil" — measure first |
175
+ | `+` for string concat in loops | O(n²) — use `"".join()` |
176
+ | `list` for membership testing | O(n) per lookup — use `set` |
177
+ | Loading whole dataset in memory | Use generators / streaming / pagination |
178
+ | One-by-one DB inserts | Use `bulk_create`/`executemany`/SQLAlchemy `insert(...)` |
179
+ | `lru_cache` on `async def` | Caches the coroutine object, which can be awaited only once, so the second hit raises `RuntimeError`; use Redis or `aiocache` |
180
+ | Banking on JIT for production wins today | Still experimental in 3.13 — measure on YOUR workload |
181
+ | Switching whole app to free-threaded for "free speed" | Per-thread overhead can make I/O-bound code slower |
182
+
183
+ ## See Also
184
+
185
+ - `python-patterns` — async vs threads vs processes decision
186
+ - `async-patterns` — TaskGroup / Semaphore / httpx pooling
187
+ - `_shared/skills/observability` — measure latency and memory in prod
188
+ - `_shared/skills/postgres-patterns` — index design, EXPLAIN, AIO in PG18
9
189
 
10
190
  ## Profiling Tools
11
191