ctxeng 0.1.0__tar.gz

@@ -0,0 +1,68 @@
+ name: CI
+
+ on:
+   push:
+     branches: [main, dev]
+   pull_request:
+     branches: [main]
+
+ jobs:
+   test:
+     name: Test (Python ${{ matrix.python-version }})
+     runs-on: ubuntu-latest
+     strategy:
+       matrix:
+         python-version: ["3.10", "3.11", "3.12"]
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Python ${{ matrix.python-version }}
+         uses: actions/setup-python@v5
+         with:
+           python-version: ${{ matrix.python-version }}
+
+       - name: Install ctxeng + dev deps
+         run: |
+           pip install -e ".[dev]"
+
+       - name: Lint with ruff
+         run: ruff check ctxeng/
+
+       - name: Type check with mypy
+         run: mypy ctxeng/ --ignore-missing-imports
+         continue-on-error: true
+
+       - name: Run tests with coverage
+         run: |
+           pytest tests/ --cov=ctxeng --cov-report=xml --cov-report=term-missing
+
+       - name: Upload coverage to Codecov
+         uses: codecov/codecov-action@v4
+         with:
+           token: ${{ secrets.CODECOV_TOKEN }}
+           fail_ci_if_error: false
+
+   publish:
+     name: Publish to PyPI
+     runs-on: ubuntu-latest
+     needs: test
+     if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v')
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Build package
+         run: |
+           pip install hatchling
+           python -m hatchling build
+
+       - name: Publish to PyPI
+         uses: pypa/gh-action-pypi-publish@release/v1
+         with:
+           password: ${{ secrets.PYPI_API_TOKEN }}
@@ -0,0 +1,10 @@
+ __pycache__/
+ .pytest_cache/
+ *.pyc
+ *.pyo
+ dist/
+ build/
+ *.egg-info/
+ .venv/
+ venv/
+ .env
@@ -0,0 +1,73 @@
+ # Contributing to ctxeng
+
+ Thank you for helping make ctxeng better! This guide covers everything you need to get started.
+
+ ## Development setup
+
+ ```bash
+ git clone https://github.com/your-username/python-context-engineer
+ cd python-context-engineer
+ pip install -e ".[dev]"
+ ```
+
+ ## Running tests
+
+ ```bash
+ pytest                      # run all tests
+ pytest tests/unit/          # unit tests only
+ pytest -k "test_scoring"    # filter by name
+ pytest --cov=ctxeng         # with coverage
+ ```
+
+ ## Code style
+
+ We use `ruff` for linting and formatting:
+
+ ```bash
+ ruff check ctxeng/
+ ruff format ctxeng/
+ ```
+
+ ## Project layout
+
+ ```
+ ctxeng/
+ ├── __init__.py        Public API exports
+ ├── core.py            ContextEngine main class
+ ├── builder.py         ContextBuilder fluent API
+ ├── models.py          Data classes (Context, ContextFile, TokenBudget)
+ ├── scorer.py          File relevance scoring (keyword, AST, git, path)
+ ├── optimizer.py       Token counting, budget fitting, smart truncation
+ ├── cli.py             CLI entry point
+ ├── sources/           File collectors (filesystem, git, explicit)
+ └── integrations/      LLM client helpers (Claude, OpenAI, LangChain)
+ ```
+
+ ## How to add a new scoring signal
+
+ 1. Add a function `_my_signal_score(content, query, ...) -> float` in `scorer.py`
+ 2. Call it from `score_file()` and add it to the weighted average
+ 3. Add a unit test in `tests/unit/test_core.py`
+ 4. Document it in the README scoring table
+
+ ## How to add a new LLM integration
+
+ 1. Add an `ask_mymodel(ctx, ...) -> str` function in `ctxeng/integrations/__init__.py`
+ 2. Follow the pattern of `ask_claude` / `ask_openai`
+ 3. Add its client library as an extra under optional-dependencies in `pyproject.toml`
+ 4. Document it in the README
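The steps above can be sketched as follows. This is a minimal illustration, not the pattern used by `ask_claude` or `ask_openai`, whose bodies are not shown here. The `send` parameter is a hypothetical stand-in for whatever call the real client library exposes, and the only thing assumed about `ctx` is the `to_string(fmt)` method documented in the README.

```python
from typing import Callable, Protocol


class SupportsToString(Protocol):
    """The only part of Context this sketch relies on."""

    def to_string(self, fmt: str = "xml") -> str: ...


def ask_mymodel(
    ctx: SupportsToString,
    send: Callable[[str], str],  # hypothetical: wraps the provider's client call
    fmt: str = "xml",
) -> str:
    """Render the context, hand it to the model, and return the text reply."""
    prompt = ctx.to_string(fmt)
    return send(prompt)
```

Injecting `send` keeps the sketch free of any provider dependency, so it can be unit-tested without network access.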
+
+ ## Submitting a PR
+
+ 1. Fork the repo and create a branch: `git checkout -b feat/my-feature`
+ 2. Write code + tests
+ 3. Run `pytest` and `ruff check` — both must pass
+ 4. Open a PR with a clear description of what it does and why
+
+ ## Reporting bugs
+
+ Open an issue with:
+ - Python version
+ - `ctxeng` version
+ - Minimal reproduction case
+ - Expected vs actual behavior
ctxeng-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 ctxeng contributors
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
ctxeng-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,412 @@
+ Metadata-Version: 2.4
+ Name: ctxeng
+ Version: 0.1.0
+ Summary: Build perfect LLM context from your Python codebase — automatically.
+ Project-URL: Homepage, https://github.com/sayeem3051/python-context-engineer
+ Project-URL: Repository, https://github.com/sayeem3051/python-context-engineer
+ Project-URL: Issues, https://github.com/sayeem3051/python-context-engineer/issues
+ Project-URL: Changelog, https://github.com/sayeem3051/python-context-engineer/blob/main/CHANGELOG.md
+ Author: ctxeng contributors
+ License: MIT
+ License-File: LICENSE
+ Keywords: ai,claude,codebase,context,context-engineering,developer-tools,gpt,llm,openai,token
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
+ Requires-Python: >=3.10
+ Provides-Extra: all
+ Requires-Dist: anthropic>=0.25; extra == 'all'
+ Requires-Dist: langchain-core>=0.2; extra == 'all'
+ Requires-Dist: openai>=1.0; extra == 'all'
+ Requires-Dist: tiktoken>=0.7; extra == 'all'
+ Provides-Extra: anthropic
+ Requires-Dist: anthropic>=0.25; extra == 'anthropic'
+ Provides-Extra: dev
+ Requires-Dist: mypy>=1.10; extra == 'dev'
+ Requires-Dist: pytest-cov>=5.0; extra == 'dev'
+ Requires-Dist: pytest>=8.0; extra == 'dev'
+ Requires-Dist: ruff>=0.4; extra == 'dev'
+ Requires-Dist: tiktoken>=0.7; extra == 'dev'
+ Provides-Extra: langchain
+ Requires-Dist: langchain-core>=0.2; extra == 'langchain'
+ Provides-Extra: openai
+ Requires-Dist: openai>=1.0; extra == 'openai'
+ Provides-Extra: tiktoken
+ Requires-Dist: tiktoken>=0.7; extra == 'tiktoken'
+ Description-Content-Type: text/markdown
+
+ # ctxeng — Python Context Engineering Library
+
+ <p align="center">
+ <strong>Stop copy-pasting files into ChatGPT.<br>
+ Build the perfect LLM context from your codebase, automatically.</strong>
+ </p>
+
+ <p align="center">
+ <a href="https://pypi.org/project/ctxeng/"><img src="https://img.shields.io/pypi/v/ctxeng?color=blue&label=pypi" alt="PyPI"></a>
+ <a href="https://github.com/sayeem3051/python-context-engineer/actions"><img src="https://github.com/sayeem3051/python-context-engineer/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
+ <a href="https://pypi.org/project/ctxeng/"><img src="https://img.shields.io/pypi/pyversions/ctxeng" alt="Python"></a>
+ <img src="https://img.shields.io/github/license/sayeem3051/python-context-engineer" alt="License">
+ <img src="https://img.shields.io/pypi/dm/ctxeng?label=downloads" alt="Downloads">
+ </p>
+
+ ---
+
+ **Context engineering** is the new prompt engineering.
+ The quality of your LLM's output depends almost entirely on *what you put in the context window* — not how you phrase the question.
+
+ `ctxeng` solves this automatically:
+
+ - **Scans your codebase** and scores every file for relevance to your query
+ - **Ranks by signal** — keyword overlap, AST symbols, path relevance, git recency
+ - **Fits the budget** — smart truncation keeps the best parts within any model's token limit
+ - **Ships ready to paste** — XML, Markdown, or plain text output that works with Claude, GPT-4o, Gemini, and every other model
+
+ Zero required dependencies. Works with any LLM.
+
+ ---
+
+ ## Installation
+
+ ```bash
+ pip install ctxeng
+ ```
+
+ For accurate token counting (strongly recommended):
+
+ ```bash
+ pip install "ctxeng[tiktoken]"
+ ```
+
+ For one-line LLM calls:
+
+ ```bash
+ pip install "ctxeng[anthropic]"   # Claude
+ pip install "ctxeng[openai]"      # GPT-4o
+ pip install "ctxeng[all]"         # everything
+ ```
+
+ ---
+
+ ## Quickstart
+
+ ### Python API
+
+ ```python
+ from ctxeng import ContextEngine
+
+ engine = ContextEngine(root=".", model="claude-sonnet-4")
+ ctx = engine.build("Fix the authentication bug in the login flow")
+
+ print(ctx.summary())
+ # Context summary (12,340 tokens / 197,440 budget):
+ #   Included : 8 files
+ #   Skipped  : 23 files (over budget)
+ #   [████████  ] 0.84  src/auth/login.py
+ #   [███████   ] 0.71  src/auth/middleware.py
+ #   [█████     ] 0.53  src/models/user.py
+ #   [████      ] 0.41  tests/test_auth.py
+ #   ...
+
+ # Paste directly into your LLM
+ print(ctx.to_string())
+ ```
+
+ ### Fluent Builder API
+
+ ```python
+ from ctxeng import ContextBuilder
+
+ ctx = (
+     ContextBuilder(root=".")
+     .for_model("gpt-4o")
+     .only("**/*.py")
+     .exclude("tests/**", "migrations/**")
+     .from_git_diff()  # only changed files
+     .with_system("You are a senior Python engineer. Be concise.")
+     .build("Refactor the payment module to use async/await")
+ )
+
+ print(ctx.to_string("markdown"))
+ ```
+
+ ### One-line LLM call
+
+ ```python
+ from ctxeng import ContextEngine
+ from ctxeng.integrations import ask_claude
+
+ engine = ContextEngine(".", model="claude-sonnet-4")
+ ctx = engine.build("Why is the test_login test failing?")
+
+ response = ask_claude(ctx)
+ print(response)
+ ```
+
+ ### CLI
+
+ ```bash
+ # Build context for a query and print to stdout
+ ctxeng build "Fix the auth bug"
+
+ # Focused on git-changed files only
+ ctxeng build "Review my changes" --git-diff
+
+ # Target a specific model with markdown output
+ ctxeng build "Refactor this" --model gpt-4o --fmt markdown
+
+ # Save to file
+ ctxeng build "Explain the payment flow" --output context.md
+
+ # Project stats
+ ctxeng info
+ ```
+
+ ---
+
+ ## How It Works
+
+ ```
+ Your codebase              ctxeng                                   Your LLM
+ ─────────────              ────────────────                         ────────────────
+ src/auth/login.py    ─┐
+ src/models/user.py   ─┤    1. Score files       2. Fit budget       <context>
+ src/api/routes.py    ─┼─►  vs query + git   ─►  smart truncate  ─►    <file>...</file>
+ tests/test_auth.py   ─┤    recency + AST        token-aware           <file>...</file>
+ ...500 more files    ─┘                                             </context>
+ ```
+
+ ### Scoring signals
+
+ Each file gets a relevance score from 0 → 1, combining:
+
+ | Signal | What it measures |
+ |--------|-----------------|
+ | **Keyword overlap** | How many query terms appear in the file content |
+ | **AST symbols** | Class/function/import names that match the query (Python) |
+ | **Path relevance** | Filename and directory names matching query tokens |
+ | **Git recency** | Files touched in recent commits score higher |
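One way to combine signals like those in the table into a single 0 to 1 score is a weighted average, sketched below. The weights, the `overlap` helper, and the `combine` function are all hypothetical: the README names the signals but does not say how ctxeng weights or computes them.

```python
import re

# Hypothetical weights: the README lists the signals but not their weighting.
WEIGHTS = {"keyword": 0.4, "ast": 0.3, "path": 0.2, "recency": 0.1}


def _terms(text: str) -> set[str]:
    """Lowercased alphanumeric tokens, so a path splits on '/' and '.'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, text: str) -> float:
    """Fraction of query terms that also occur in `text` (0.0 for an empty query)."""
    q = _terms(query)
    return len(q & _terms(text)) / len(q) if q else 0.0


def combine(signals: dict[str, float]) -> float:
    """Weighted average over the signals supplied, clamped to [0, 1]."""
    total_weight = sum(WEIGHTS[name] for name in signals)
    if total_weight == 0:
        return 0.0
    score = sum(WEIGHTS[name] * value for name, value in signals.items()) / total_weight
    return min(1.0, max(0.0, score))


signals = {
    "keyword": overlap("login auth", "def login(user): ..."),
    "path": overlap("login auth", "src/auth/login.py"),
    "recency": 0.0,  # placeholder; a real value would come from git history
}
print(round(combine(signals), 2))
```

Dividing by the weight of the signals actually supplied means a file with no AST or git information is not penalized for the missing inputs.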
+
+ ### Token budget optimization
+
+ Files are ranked by score and added greedily until the token budget is full. A file that does not fit is **smart-truncated** rather than dropped entirely: its head and tail are kept and the middle is cut. The top of a file holds imports and class definitions, and the tail tends to hold the most recent changes, so both are high-signal.
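The greedy fill and head-plus-tail truncation described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `optimizer.py`: the four-characters-per-token estimate, the `MARKER` text, the `min_useful_tokens` cutoff, and every function name are hypothetical.

```python
MARKER = "\n# ... [truncated] ...\n"  # hypothetical marker for the removed middle


def estimate_tokens(text: str) -> int:
    """Rough heuristic of about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def truncate_head_tail(text: str, max_tokens: int) -> str:
    """Keep the start and end of `text` and cut the middle to fit `max_tokens`."""
    if estimate_tokens(text) <= max_tokens:
        return text
    half = (max_tokens * 4 - len(MARKER)) // 2
    if half <= 0:
        return ""  # not enough room for anything useful
    return text[:half] + MARKER + text[-half:]


def fit_to_budget(scored_files, budget_tokens, min_useful_tokens=50):
    """Greedy fill. `scored_files` is a list of (path, score, content) tuples."""
    included, skipped = [], []
    remaining = budget_tokens
    for path, _score, content in sorted(scored_files, key=lambda f: f[1], reverse=True):
        cost = estimate_tokens(content)
        if cost <= remaining:
            included.append((path, content))
            remaining -= cost
        elif remaining >= min_useful_tokens:
            cut = truncate_head_tail(content, remaining)
            included.append((path, cut))
            remaining -= estimate_tokens(cut)
        else:
            skipped.append(path)
    return included, skipped
```

The `min_useful_tokens` cutoff is a design choice in this sketch: once only a sliver of budget is left, a truncated file would carry too little to be worth including, so it is reported as skipped instead.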
+
+ ---
+
+ ## Examples
+
+ ### Debug a failing test
+
+ ```python
+ from ctxeng import ContextBuilder
+ from ctxeng.integrations import ask_claude
+
+ ctx = (
+     ContextBuilder(".")
+     .for_model("claude-sonnet-4")
+     .include_files("tests/test_payment.py", "src/payment/service.py")
+     .with_system("You are a Python debugging expert.")
+     .build("test_charge_user is failing with a KeyError on 'amount'")
+ )
+ response = ask_claude(ctx)
+ ```
+
+ ### Code review on a PR
+
+ ```python
+ # Only include what changed in this branch vs main
+ ctx = (
+     ContextBuilder(".")
+     .for_model("gpt-4o")
+     .from_git_diff(base="main")
+     .with_system("Do a thorough code review. Flag security issues first.")
+     .build("Review this pull request")
+ )
+ ```
+
+ ### Explain an unfamiliar codebase
+
+ ```python
+ from ctxeng import ContextEngine
+
+ engine = ContextEngine(
+     root="/path/to/project",
+     model="gemini-1.5-pro",  # 1M token window → include everything
+ )
+ ctx = engine.build("Give me a high-level architecture overview")
+ print(ctx.to_string())
+ ```
+
+ ### Targeted refactor
+
+ ```python
+ ctx = (
+     ContextBuilder(".")
+     .for_model("claude-sonnet-4")
+     .only("src/database/**/*.py")
+     .exclude("**/*_test.py")
+     .build("Convert all raw SQL queries to use SQLAlchemy ORM")
+ )
+ ```
+
+ ---
+
+ ## API Reference
+
+ ### `ContextEngine`
+
+ ```python
+ ContextEngine(
+     root=".",                  # Project root
+     model="claude-sonnet-4",   # Sets token budget automatically
+     budget=None,               # Or explicit TokenBudget(total=50_000)
+     max_file_size_kb=500,      # Skip files larger than this
+     include_patterns=None,     # ["**/*.py"] — only these files
+     exclude_patterns=None,     # ["tests/**"] — skip these
+     use_git=True,              # Use git recency signal
+ )
+ ```
+
+ ```python
+ engine.build(
+     query="",            # What you want the LLM to do
+     files=None,          # Explicit list of paths (skips auto-discovery)
+     git_diff=False,      # Only changed files
+     git_base="HEAD",     # Diff base ref
+     system_prompt="",    # System prompt (counts against budget)
+     fmt="xml",           # "xml" | "markdown" | "plain"
+ )
+ # → Context
+ ```
+
+ ### `ContextBuilder` (fluent API)
+
+ ```python
+ ContextBuilder(root=".")
+     .for_model("gpt-4o")
+     .with_budget(total=50_000, reserved_output=4096)
+     .only("**/*.py", "**/*.yaml")
+     .exclude("tests/**", "migrations/**")
+     .include_files("src/specific.py")
+     .from_git_diff(base="main")
+     .with_system("You are an expert Python engineer.")
+     .max_file_size(200)  # KB
+     .no_git()
+     .build("query")
+ # → Context
+ ```
+
+ ### `Context`
+
+ ```python
+ ctx.to_string(fmt="xml")   # → str ready to paste into an LLM
+ ctx.summary()              # → human-readable summary with token counts
+ ctx.files                  # → list[ContextFile], sorted by relevance
+ ctx.skipped_files          # → files that didn't fit the budget
+ ctx.total_tokens           # → estimated token usage
+ ctx.budget.available       # → remaining token budget
+ ```
+
+ ### `TokenBudget`
+
+ ```python
+ TokenBudget.for_model("claude-sonnet-4")  # auto-detect limit
+ TokenBudget(total=50_000, reserved_output=2048, reserved_system=512)
+ ```
+
+ Supported models (auto-detected): `claude-opus-4`, `claude-sonnet-4`, `claude-haiku-4`, `gpt-4o`, `gpt-4-turbo`, `gpt-4`, `gpt-3.5-turbo`, `gemini-1.5-pro`, `gemini-1.5-flash`, `llama-3`.
+
+ ---
+
+ ## CLI Reference
+
+ ```
+ ctxeng [--root PATH] <command> [options]
+
+ Commands:
+   build    Build context for a query
+   info     Show project info and file stats
+
+ build options:
+   --model, -m     Target model (default: claude-sonnet-4)
+   --fmt, -f       Output format: xml | markdown | plain (default: xml)
+   --output, -o    Write to file instead of stdout
+   --only          Glob patterns to include
+   --exclude       Glob patterns to exclude
+   --files         Explicit file list
+   --git-diff      Only include git-changed files
+   --git-base      Git base ref (default: HEAD)
+   --system        System prompt text
+   --budget        Override total token budget
+   --no-git        Disable git recency scoring
+   --max-size      Max file size in KB (default: 500)
+ ```
+
+ ---
+
+ ## Supported Models
+
+ | Model | Context window | Auto-detected |
+ |-------|---------------|---------------|
+ | claude-opus-4, claude-sonnet-4, claude-haiku-4 | 200K | ✓ |
+ | gpt-4o, gpt-4-turbo | 128K | ✓ |
+ | gpt-4 | 8K | ✓ |
+ | gpt-3.5-turbo | 16K | ✓ |
+ | gemini-1.5-pro, gemini-1.5-flash | 1M | ✓ |
+ | llama-3 | 32K | ✓ |
+ | any other | 32K (safe default) | — |
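The table and the Quickstart summary can be tied together with a little arithmetic: a 200K window minus the `reserved_output=2048` and `reserved_system=512` values shown under `TokenBudget` gives the 197,440 budget printed in the summary. The sketch below reproduces that calculation. `BudgetSketch` is a hypothetical stand-in rather than the library's `TokenBudget`, and reading "200K" and "32K" as round decimal thousands is an assumption that merely happens to match the summary figure.

```python
from dataclasses import dataclass

# Window sizes from the table, read as round decimal thousands (an assumption).
CONTEXT_WINDOWS = {
    "claude-opus-4": 200_000,
    "claude-sonnet-4": 200_000,
    "claude-haiku-4": 200_000,
    "gpt-4o": 128_000,
    "gpt-4-turbo": 128_000,
    "gpt-4": 8_000,
    "gpt-3.5-turbo": 16_000,
    "gemini-1.5-pro": 1_000_000,
    "gemini-1.5-flash": 1_000_000,
    "llama-3": 32_000,
}
DEFAULT_WINDOW = 32_000  # "any other" row of the table


@dataclass(frozen=True)
class BudgetSketch:
    """Hypothetical stand-in for TokenBudget, for illustration only."""

    total: int
    reserved_output: int = 2048
    reserved_system: int = 512

    @property
    def available(self) -> int:
        """Tokens left for file content after the reservations."""
        return max(0, self.total - self.reserved_output - self.reserved_system)

    @classmethod
    def for_model(cls, model: str) -> "BudgetSketch":
        return cls(total=CONTEXT_WINDOWS.get(model, DEFAULT_WINDOW))


print(BudgetSketch.for_model("claude-sonnet-4").available)  # 197440
```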
+
+ ---
+
+ ## Why not just paste files manually?
+
+ You could. But you'll hit these problems immediately:
+
+ - **Token limit errors** — too many files, context overflows
+ - **Irrelevant noise** — wrong files dilute signal, hurt output quality
+ - **Stale context** — you forget to update when code changes
+ - **Manual effort** — figuring out which files matter takes time
+
+ `ctxeng` solves all four. The right files, in the right order, trimmed to fit, every time.
+
+ ---
+
+ ## Roadmap
+
+ - [ ] Semantic similarity scoring (optional embedding model)
+ - [ ] `ctxeng watch` — auto-rebuild context on file changes
+ - [ ] VSCode extension
+ - [ ] Import graph analysis (include files imported by relevant files)
+ - [ ] `.ctxengignore` file support
+ - [ ] Streaming context into LLM APIs
+
+ ---
+
+ ## Contributing
+
+ PRs welcome! See [CONTRIBUTING.md](CONTRIBUTING.md).
+
+ ```bash
+ git clone https://github.com/sayeem3051/python-context-engineer
+ cd python-context-engineer
+ pip install -e ".[dev]"
+ pytest
+ ```
+
+ ---
+
+ ## License
+
+ MIT. Use freely, modify as needed, contribute back if you can.
+
+ ---
+
+ <p align="center">
+ If <code>ctxeng</code> saved you time, please ⭐ the repo — it helps others find it.
+ </p>