codectx 0.1.3__tar.gz → 0.2.0__tar.gz
- {codectx-0.1.3 → codectx-0.2.0}/ARCHITECTURE.md +35 -54
- {codectx-0.1.3 → codectx-0.2.0}/CONTEXT.md +372 -328
- {codectx-0.1.3 → codectx-0.2.0}/DECISIONS.md +1 -1
- codectx-0.2.0/PKG-INFO +252 -0
- codectx-0.2.0/PLAN.md +145 -0
- codectx-0.2.0/README.md +198 -0
- codectx-0.2.0/benchmark.png +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/advanced/dependency-graph.md +4 -0
- codectx-0.2.0/docs/src/content/docs/advanced/ranking-system.md +40 -0
- codectx-0.2.0/docs/src/content/docs/advanced/token-compression.md +26 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/comparison.md +1 -1
- codectx-0.2.0/docs/src/content/docs/getting-started/basic-usage.md +62 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/getting-started/installation.md +14 -3
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/getting-started/quick-start.mdx +6 -2
- codectx-0.2.0/docs/src/content/docs/guides/configuration.md +52 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/introduction/what-is-codectx.md +2 -0
- codectx-0.2.0/docs/src/content/docs/reference/architecture-overview.md +32 -0
- codectx-0.2.0/docs/src/content/docs/reference/cli-reference.md +115 -0
- {codectx-0.1.3 → codectx-0.2.0}/pyproject.toml +2 -3
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/__init__.py +1 -1
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/config/defaults.py +15 -21
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/graph/builder.py +33 -26
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/output/formatter.py +5 -1
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/languages.py +5 -2
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/ranker/scorer.py +12 -1
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/walker.py +4 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/test_scorer.py +71 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/test_walker.py +35 -0
- codectx-0.2.0/tests/unit/test_call_paths.py +161 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_git_meta.py +51 -0
- codectx-0.2.0/tests/unit/test_safety.py +27 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_semantic.py +31 -0
- {codectx-0.1.3 → codectx-0.2.0}/uv.lock +875 -820
- codectx-0.1.3/PKG-INFO +0 -300
- codectx-0.1.3/PLAN.md +0 -174
- codectx-0.1.3/README.md +0 -245
- codectx-0.1.3/docs/src/content/docs/advanced/ranking-system.md +0 -31
- codectx-0.1.3/docs/src/content/docs/advanced/token-compression.md +0 -22
- codectx-0.1.3/docs/src/content/docs/getting-started/basic-usage.md +0 -44
- codectx-0.1.3/docs/src/content/docs/guides/configuration.md +0 -40
- codectx-0.1.3/docs/src/content/docs/reference/architecture-overview.md +0 -18
- codectx-0.1.3/docs/src/content/docs/reference/cli-reference.md +0 -37
- codectx-0.1.3/requirements.txt +0 -115
- {codectx-0.1.3 → codectx-0.2.0}/.github/workflows/ci.yml +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/.github/workflows/codeql.yml +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/.github/workflows/publish.yml +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/.gitignore +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/.python-version +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/LICENSE +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/astro.config.mjs +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/build_output.txt +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/bun.lock +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/package.json +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/community/contributing.md +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/community/faq.md +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/guides/best-practices.md +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/guides/using-context-effectively.md +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/index.mdx +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content/docs/introduction/why-it-exists.md +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/content.config.ts +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/env.d.ts +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/src/styles/custom.css +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/docs/tsconfig.json +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/main.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/cache.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/cli.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/compressor/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/compressor/budget.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/compressor/summarizer.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/compressor/tiered.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/config/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/config/loader.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/graph/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/graph/resolver.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/ignore.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/output/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/output/sections.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/base.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/queries/go.scm +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/queries/java.scm +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/queries/javascript.scm +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/queries/python.scm +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/queries/rust.scm +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/queries/typescript.scm +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/parser/treesitter.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/ranker/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/ranker/git_meta.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/ranker/semantic.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/src/codectx/safety.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/test_compressor.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/test_ignore.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/test_integration.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/test_parser.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/__init__.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_cache_export.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_cache_wiring.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_cli.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_cycles.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_formatter_coverage.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_formatter_sections.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_multi_root.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_queries.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_resolver.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_semantic_mock.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_summarizer.py +0 -0
- {codectx-0.1.3 → codectx-0.2.0}/tests/unit/test_treesitter.py +0 -0
@@ -64,21 +64,21 @@ The graph enables ranking algorithms to identify important modules based on stru
 The Ranker computes a composite importance score for each file:
 
 ```
-score = (0.
-      + (0.
-      + (0.
+score = (0.40 × git_frequency)
+      + (0.40 × fan_in)
+      + (0.10 × recency)
       + (0.10 × entry_proximity)
 ```
 
-**Git Frequency (0.
+**Git Frequency (0.40):** Commit count touching the file. Frequently-modified files are typically more important.
 
-**Fan-in (0.
+**Fan-in (0.40):** Inverse-normalized in-degree. Files imported by many other modules are critical interfaces.
 
-**Recency (0.
+**Recency (0.10):** Days since last modification. Recently active files are prioritized.
 
 **Entry Proximity (0.10):** Graph distance from identified entry points. Files close to main execution paths rank higher.
 
-Scores are normalized to `[0.0, 1.0]` range for uniform compression tier assignment.
+Scores are normalized to `[0.0, 1.0]` range for uniform compression tier assignment. Semantic searches (`--query`) inject a 5th signal at 20% weight and rescale the other four to 80%.
 
 **Output:** `Dict[Path, float]` mapping file paths to scores.
 
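The weighting described in the hunk above can be illustrated with a small sketch. It is not taken from `src/codectx/ranker/scorer.py`; the function name `composite_score` and the dictionary layout are assumptions made for illustration, and each input signal is assumed to be already normalized to `[0.0, 1.0]`. Only the weights (0.40, 0.40, 0.10, 0.10) and the 20%/80% split for `--query` come from the diffed text.

```python
from typing import Dict, Optional

# Base weights as listed in the 0.2.0 side of the hunk.
BASE_WEIGHTS: Dict[str, float] = {
    "git_frequency": 0.40,
    "fan_in": 0.40,
    "recency": 0.10,
    "entry_proximity": 0.10,
}
# Share given to the semantic signal when a query is supplied.
SEMANTIC_WEIGHT = 0.20


def composite_score(signals: Dict[str, float],
                    semantic: Optional[float] = None) -> float:
    """Combine normalized signals into one score (hypothetical helper).

    Without a semantic signal the four base weights are used as-is.
    With one, the base weights are scaled to 80% in total and the
    semantic signal contributes the remaining 20%.
    """
    scale = 1.0 if semantic is None else 1.0 - SEMANTIC_WEIGHT
    score = sum(scale * weight * signals[name]
                for name, weight in BASE_WEIGHTS.items())
    if semantic is not None:
        score += SEMANTIC_WEIGHT * semantic
    return score
```

Under these assumptions the weights still sum to 1.0 in both modes, so a file with every signal at 1.0 scores 1.0 whether or not a query is present.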
@@ -86,11 +86,13 @@ Scores are normalized to `[0.0, 1.0]` range for uniform compression tier assignm
 
 **Purpose:** Fit code content within a token budget.
 
-The Compressor assigns content tiers based on
+The Compressor assigns content tiers based on scored percentiles:
 
-- **Tier 1** (
-- **Tier 2** (
-- **Tier 3** (
+- **Tier 1** (Top 15%) — AST-driven structured summaries or full source code for true entry points
+- **Tier 2** (Next 30%) — Function signatures and docstrings only
+- **Tier 3** (Remaining) — One-line summaries
+
+A Summarizer step (`--llm` extras) runs specifically evaluating `Tier 3` code mapping out detailed functions implicitly before output mapping.
 
 Files are emitted in order: Tier 1 by score descending, then Tier 2, then Tier 3.
 
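The percentile-based tiering added in this hunk (top 15%, next 30%, remainder) could look roughly like the sketch below. The function name `assign_tiers`, the use of string paths, and the round-up behaviour at tier boundaries are assumptions; the diff does not say how fractional cut-offs are handled.

```python
from typing import Dict


def assign_tiers(scores: Dict[str, float],
                 top_pct: int = 15,
                 next_pct: int = 30) -> Dict[str, int]:
    """Map each path to tier 1, 2, or 3 by score rank (hypothetical helper).

    Cut-offs are rounded up with integer arithmetic, which is an
    assumption rather than documented behaviour.
    """
    ranked = sorted(scores, key=lambda path: scores[path], reverse=True)
    n = len(ranked)
    tier1_end = (n * top_pct + 99) // 100               # ceil(n * 15%)
    tier2_end = tier1_end + (n * next_pct + 99) // 100  # plus ceil(n * 30%)
    tiers: Dict[str, int] = {}
    for index, path in enumerate(ranked):
        if index < tier1_end:
            tiers[path] = 1
        elif index < tier2_end:
            tiers[path] = 2
        else:
            tiers[path] = 3
    return tiers
```

With 20 scored files this yields 3 files in Tier 1, 6 in Tier 2, and 11 in Tier 3.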
@@ -111,13 +113,15 @@ This is a hard constraint. The tool does not emit context that exceeds the token
 
 The Formatter writes sections in fixed order:
 
-1. **ARCHITECTURE** — High-level project structure
-2. **
-3. **
-4. **
-5. **
-6. **
-7. **
+1. **ARCHITECTURE** — High-level project structure derived from files
+2. **ENTRY_POINTS** — Main files and public interfaces with full source
+3. **SYMBOL_INDEX** — Identifies references and mappings across the codebase
+4. **IMPORTANT_CALL_PATHS** — Tracks deep operational flows sequentially
+5. **CORE_MODULES** — High-scoring modules with structured logic constraints
+6. **SUPPORTING_MODULES** — Mid-scoring modules with signatures and docstrings
+7. **DEPENDENCY_GRAPH** — Mermaid diagram of module relationships
+8. **RANKED_FILES** — Sorted layout tracking cost algorithms
+9. **PERIPHERY** — Low-scoring files with one-line summaries
 
 Each section is preceded by a Markdown heading and terminated with metadata (token count, file count).
 
@@ -156,6 +160,7 @@ File System
 ├─→ [Compressor]
 │ ├ Tier assignment
 │ ├ Token budget enforcement
+│ ├ [Optional: AI Summarizer hooks]
 │ └ Output: Dict[Path, CompressedContent]
 │
 └─→ [Formatter]
@@ -203,7 +208,7 @@ Budget enforcement is hard: the tool does not emit context exceeding the specifi
 Consumption order:
 
 1. Fixed overhead (section headers, metadata) — typically 500–1000 tokens
-2. Tier 1 files by score descending (
+2. Tier 1 files by score descending (AST Summaries / Full source)
 3. Tier 2 files by score descending (signatures only)
 4. Tier 3 files by score descending (one-line summaries)
 
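The consumption order listed in this hunk can be sketched as a greedy pass over the ranked files. Everything named here (`RankedFile`, `select_within_budget`, the default overhead of 1000 tokens) is hypothetical. The diff does not say what happens to a file that does not fit; this sketch simply skips it and continues, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RankedFile:
    path: str
    tier: int      # 1, 2, or 3
    score: float   # normalized importance score
    tokens: int    # token cost of the file's compressed content


def select_within_budget(files: List[RankedFile],
                         token_budget: int,
                         overhead: int = 1000) -> List[str]:
    """Return the paths emitted without exceeding the budget (sketch)."""
    # Fixed overhead (section headers, metadata) is reserved first.
    remaining = token_budget - overhead
    emitted: List[str] = []
    # Tier 1 first, then Tier 2, then Tier 3; score descending within a tier.
    for item in sorted(files, key=lambda f: (f.tier, -f.score)):
        if item.tokens <= remaining:
            emitted.append(item.path)
            remaining -= item.tokens
        # Assumption: a file that does not fit is skipped, not truncated.
    return emitted
```

Because a file is added only when its cost fits in the remaining allowance, the total never exceeds `token_budget`, which matches the hard-constraint wording in the hunk header.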
@@ -218,11 +223,13 @@ The Parser uses tree-sitter for universal AST extraction. Each language requires
 
 Currently supported:
 
-- **Python**
-- **TypeScript/JavaScript**
-- **Go**
-- **Rust**
-- **Java**
+- **Python**
+- **TypeScript/JavaScript**
+- **Go**
+- **Rust**
+- **Java**
+- **C/C++**
+- **Ruby**
 
 Adding a language requires implementing a resolver in `src/codectx/graph/resolver.py` and adding the grammar dependency to `pyproject.toml`.
 
@@ -231,40 +238,14 @@ Adding a language requires implementing a resolver in `src/codectx/graph/resolve
 Configuration is applied in this precedence order:
 
 1. **CLI flags** (highest priority)
-2. **`.
+2. **`.codectx.toml`** in repository root
 3. **Built-in defaults** (lowest priority)
 
-Example `.
+Example `.codectx.toml`:
 
 ```toml
 [codectx]
 token_budget = 120000
-
-
-exclude_patterns = ["tests/**", "*.test.py"]
+output_file = "CONTEXT.md"
+extra_ignore = ["**/generated/**", "*.draft.py"]
 ```
-
-## Parallelism Strategy
-
-**CPU-bound tasks (Parser):** `ProcessPoolExecutor` — parsing and AST extraction leverages tree-sitter C extension.
-
-**I/O-bound tasks (Git metadata, file I/O):** `ThreadPoolExecutor` — reading git history and source files is I/O-bound.
-
-**Sync tasks:** Graph construction, ranking, and compression are single-threaded because they are fast and maintain simple state.
-
-This mixed-executor approach balances CPU and I/O contention.
-
-## Performance Characteristics
-
-On a typical 10k-file repository:
-
-- **Walker:** ~500ms (filesystem traversal)
-- **Parser:** ~2-5s (parallel tree-sitter parsing)
-- **Graph Builder:** ~100ms (import resolution)
-- **Ranker:** ~200ms (scoring and normalization)
-- **Compressor:** ~50ms (tier assignment)
-- **Formatter:** ~100ms (markdown generation)
-
-**Total:** ~3-6 seconds for full analysis.
-
-Incremental mode (watch) is typically 5-10x faster because it processes only changed files.
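The precedence order in the final hunk (CLI flags over `.codectx.toml` over built-in defaults) amounts to a layered merge. The sketch below is not the code in `src/codectx/config/loader.py`; `resolve_config` is a hypothetical name, the default values are placeholders, and reading the TOML file is left out so that only the merge order is shown. The key names are the ones that appear in the example `.codectx.toml`.

```python
from typing import Any, Dict


def resolve_config(defaults: Dict[str, Any],
                   file_values: Dict[str, Any],
                   cli_flags: Dict[str, Any]) -> Dict[str, Any]:
    """Merge three config layers, later layers winning (hypothetical helper)."""
    merged = dict(defaults)         # lowest priority
    merged.update(file_values)      # values from the [codectx] table
    # Assumption: a CLI flag left unset arrives as None and must not
    # override a value from the file or the defaults.
    merged.update({key: value for key, value in cli_flags.items()
                   if value is not None})
    return merged


# Placeholder defaults; the real values live in config/defaults.py.
defaults = {"token_budget": 100000, "output_file": "CONTEXT.md", "extra_ignore": []}
file_values = {"token_budget": 120000, "extra_ignore": ["**/generated/**", "*.draft.py"]}
cli_flags = {"token_budget": 60000, "output_file": None}

config = resolve_config(defaults, file_values, cli_flags)
```

Here `config["token_budget"]` comes from the CLI layer, `config["extra_ignore"]` from the file, and `config["output_file"]` falls through to the default.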