contexa-0.1.0.tar.gz

@@ -0,0 +1,10 @@
1
+ # Python-generated files
2
+ __pycache__/
3
+ *.py[oc]
4
+ build/
5
+ dist/
6
+ wheels/
7
+ *.egg-info
8
+
9
+ # Virtual environments
10
+ .venv
@@ -0,0 +1 @@
1
+ 3.12
contexa-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Swadhin Biswas
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
contexa-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,453 @@
1
+ Metadata-Version: 2.4
2
+ Name: contexa
3
+ Version: 0.1.0
4
+ Summary: Python implementation of Git-Context-Controller (GCC)
5
+ Project-URL: Homepage, https://github.com/swadhinbiswas/contexa
6
+ Project-URL: Repository, https://github.com/swadhinbiswas/contexa
7
+ Project-URL: Issues, https://github.com/swadhinbiswas/contexa/issues
8
+ Author-email: Swadhin Biswas <swadhinbiswas.cse@gmail.com>
9
+ License-Expression: MIT
10
+ License-File: LICENSE
11
+ Keywords: agent,context,gcc,git-context-controller,llm,memory
12
+ Classifier: Development Status :: 3 - Alpha
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: License :: OSI Approved :: MIT License
15
+ Classifier: Programming Language :: Python :: 3
16
+ Classifier: Programming Language :: Python :: 3.12
17
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
18
+ Requires-Python: >=3.12
19
+ Requires-Dist: pyyaml>=6.0
20
+ Description-Content-Type: text/markdown
21
+
22
+ # contexa
23
+
24
+ [![PyPI version](https://img.shields.io/pypi/v/contexa.svg)](https://pypi.org/project/contexa/)
25
+ [![Python 3.12+](https://img.shields.io/badge/python-3.12%2B-blue.svg)](https://www.python.org/downloads/)
26
+ [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/swadhinbiswas/contexa/blob/main/LICENSE)
27
+ [![GitHub](https://img.shields.io/badge/GitHub-swadhinbiswas%2Fcontexa-black.svg?logo=github)](https://github.com/swadhinbiswas/contexa)
28
+
29
+ A Python implementation of the **Git-Context-Controller (GCC)** framework for managing LLM agent memory.
30
+
31
+ Based on the research paper:
32
+
33
+ > *"Git Context Controller: Manage the Context of LLM-based Agents like Git"*
34
+ > Junde Wu et al., [arXiv:2508.00031v2](https://arxiv.org/abs/2508.00031v2), 2025
35
+
36
+ ---
37
+
38
+ ## Table of Contents
39
+
40
+ - [The Problem](#the-problem)
41
+ - [How GCC Solves It](#how-gcc-solves-it)
42
+ - [Installation](#installation)
43
+ - [Quick Start](#quick-start)
44
+ - [Core Concepts](#core-concepts)
45
+   - [OTA Logging](#1-ota-logging-observation-thought-action)
46
+   - [COMMIT](#2-commit---save-milestones)
47
+   - [BRANCH](#3-branch---explore-alternatives)
48
+   - [MERGE](#4-merge---integrate-results)
49
+   - [CONTEXT](#5-context---retrieve-history)
50
+ - [API Reference](#api-reference)
51
+ - [Directory Structure](#directory-structure)
52
+ - [Data Models](#data-models)
53
+ - [Real-World Example](#real-world-example)
54
+ - [Running Tests](#running-tests)
55
+ - [Contributing](#contributing)
56
+ - [Requirements](#requirements)
57
+ - [License](#license)
58
+ - [Citation](#citation)
59
+ - [Links](#links)
60
+
61
+ ---
62
+
63
+ ## The Problem
64
+
65
+ LLM-based agents (like coding assistants, research agents, or autonomous planners) accumulate **observations**, **thoughts**, and **actions** over time. But context windows are finite. As conversations grow, agents lose track of earlier reasoning, repeat mistakes, or forget prior decisions.
66
+
67
+ Current approaches either:
68
+ - Dump the entire history into the prompt (expensive, hits token limits)
69
+ - Use simple summarization (loses critical details)
70
+ - Have no structured way to explore alternative strategies
71
+
72
+ ## How GCC Solves It
73
+
74
+ GCC borrows **Git's branching model** to give agents structured, versioned memory:
75
+
76
+ ```
77
+ main
78
+ |
79
+ init ──> log OTA ──> COMMIT ──> COMMIT ──> MERGE <──┐
80
+ | |
81
+ BRANCH ──> COMMIT ────┘
82
+ (experiment)
83
+ ```
84
+
85
+ | Concept | Git Equivalent | What It Does |
86
+ |---------|---------------|--------------|
87
+ | **OTA Log** | Working directory | Continuous trace of Observation-Thought-Action cycles |
88
+ | **COMMIT** | `git commit` | Saves a milestone summary, compressing older OTA steps |
89
+ | **BRANCH** | `git branch` | Creates an isolated workspace for alternative reasoning |
90
+ | **MERGE** | `git merge` | Integrates a successful branch back into main |
91
+ | **CONTEXT** | `git log` | Retrieves historical context at varying resolutions (K commits) |
92
+
93
+ The key insight from the paper: by controlling **how much history** the agent sees (the K parameter in CONTEXT), you can balance between detailed recent context and compressed older summaries.
94
+
95
+ ---
96
+
97
+ ## Installation
98
+
99
+ ```bash
100
+ pip install contexa
101
+ ```
102
+
103
+ Or with [uv](https://docs.astral.sh/uv/):
104
+
105
+ ```bash
106
+ uv add contexa
107
+ ```
108
+
109
+ ---
110
+
111
+ ## Quick Start
112
+
113
+ ```python
114
+ from contexa import GCCWorkspace
115
+
116
+ # 1. Initialize a workspace
117
+ ws = GCCWorkspace("/path/to/project")
118
+ ws.init("Build a REST API service with user auth")
119
+
120
+ # 2. Agent logs its reasoning as it works
121
+ ws.log_ota(
122
+     observation="Project directory is empty",
122
+     thought="Need to scaffold the project structure first",
123
+     action="create_files(['main.py', 'requirements.txt', 'models.py'])"
125
+ )
126
+ ws.log_ota(
127
+     observation="Files created successfully",
127
+     thought="Now implement the user model",
128
+     action="write_code('models.py', user_model_code)"
130
+ )
131
+
132
+ # 3. Commit a milestone (compresses OTA history)
133
+ ws.commit("Project scaffold and User model complete")
134
+
135
+ # 4. Branch to explore an alternative approach
136
+ ws.branch("auth-jwt", "Explore JWT-based authentication instead of sessions")
137
+ ws.log_ota("Reading JWT docs", "JWT is stateless, good for APIs", "implement_jwt()")
138
+ ws.commit("JWT auth middleware implemented")
139
+
140
+ # 5. Merge the successful branch back
141
+ ws.merge("auth-jwt")
142
+
143
+ # 6. Retrieve context for the agent's next step
144
+ ctx = ws.context(k=1) # K=1: only the most recent commit (paper default)
145
+ print(ctx.summary())
146
+ ```
147
+
148
+ ---
149
+
150
+ ## Core Concepts
151
+
152
+ ### 1. OTA Logging (Observation-Thought-Action)
153
+
154
+ Every reasoning step an agent takes is an OTA cycle. These are logged continuously in `log.md`:
155
+
156
+ ```python
157
+ rec = ws.log_ota(
158
+     observation="API returns 500 error on /users endpoint",
158
+     thought="The database connection might not be initialized",
159
+     action="check_db_connection()"
161
+ )
162
+ print(rec.step) # 1 (auto-incremented)
163
+ print(rec.timestamp) # 2025-03-04T12:00:00+00:00
164
+ ```
165
+
166
+ This produces a markdown entry:
167
+
168
+ ```markdown
169
+ ### Step 1-2025-03-04T12:00:00+00:00
170
+ **Observation:** API returns 500 error on /users endpoint
171
+
172
+ **Thought:** The database connection might not be initialized
173
+
174
+ **Action:** check_db_connection()
175
+
176
+ --------
177
+ ```
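As a rough illustration of how a record could map onto the entry format shown above, here is a minimal sketch. `SketchOTARecord` is a hypothetical stand-in written for this example; it is not contexa's `OTARecord`, whose actual rendering may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SketchOTARecord:
    """Hypothetical stand-in for illustration only, not contexa's OTARecord."""

    step: int
    timestamp: str
    observation: str
    thought: str
    action: str

    def to_markdown(self) -> str:
        # Mirrors the entry layout shown in the README excerpt above.
        return (
            f"### Step {self.step}-{self.timestamp}\n"
            f"**Observation:** {self.observation}\n\n"
            f"**Thought:** {self.thought}\n\n"
            f"**Action:** {self.action}\n\n"
            "--------\n"
        )


rec = SketchOTARecord(
    step=1,
    timestamp=datetime(2025, 3, 4, 12, 0, tzinfo=timezone.utc).isoformat(),
    observation="API returns 500 error on /users endpoint",
    thought="The database connection might not be initialized",
    action="check_db_connection()",
)
print(rec.to_markdown())
```

Keeping the timestamp as an ISO 8601 string means the rendered entry can be parsed back without locale-specific handling.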
178
+
179
+ ### 2. COMMIT - Save Milestones
180
+
181
+ When the agent reaches a significant checkpoint, commit it. This creates a structured summary that can be retrieved later without replaying every OTA step:
182
+
183
+ ```python
184
+ commit = ws.commit(
185
+     contribution="Fixed database connection and /users endpoint now returns 200",
185
+     update_roadmap="Database layer is stable, move to auth next" # optional
187
+ )
188
+ print(commit.commit_id) # "a3f2b1c4" (8-character ID, e.g. a truncated UUID)
189
+ print(commit.branch_name) # "main"
190
+ ```
191
+
192
+ If `previous_summary` is not passed, the commit's `previous_progress_summary` field is auto-populated from the last commit.
193
+
194
+ ### 3. BRANCH - Explore Alternatives
195
+
196
+ When an agent wants to explore a different strategy without risking the main trajectory:
197
+
198
+ ```python
199
+ # Creates isolated workspace with fresh OTA log
200
+ ws.branch("redis-cache", "Try Redis caching instead of in-memory")
201
+
202
+ # Agent works in the branch
203
+ ws.log_ota("Redis docs reviewed", "Need redis-py package", "pip_install('redis')")
204
+ ws.commit("Redis caching layer implemented")
205
+
206
+ # Check what branches exist
207
+ print(ws.list_branches()) # ['main', 'redis-cache']
208
+ print(ws.current_branch) # 'redis-cache'
209
+ ```
210
+
211
+ Each branch gets its own:
212
+ - `log.md` -- fresh OTA trace (no carry-over from parent)
213
+ - `commit.md` -- independent commit history
214
+ - `metadata.yaml` -- records why the branch was created and from where
215
+
216
+ ### 4. MERGE - Integrate Results
217
+
218
+ When a branch's exploration succeeds, merge it back:
219
+
220
+ ```python
221
+ merge_commit = ws.merge("redis-cache", target="main")
222
+ # - Appends the branch's OTA trace to main's log
223
+ # - Creates a merge commit on main
224
+ # - Marks the branch as "merged" in its metadata
225
+ ```
226
+
227
+ After merging, `ws.current_branch` automatically switches back to the target.
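The three merge effects listed above, plus the switch back to the target, can be sketched with a small in-memory model. Everything here is an assumption made for illustration: `SketchBranch`, `sketch_merge`, the `"active"` status value, and the merge summary wording are hypothetical and do not come from contexa's source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchBranch:
    """Hypothetical in-memory branch, for illustration only."""

    name: str
    log: list[str] = field(default_factory=list)
    commits: list[str] = field(default_factory=list)
    status: str = "active"  # assumed initial status value
    merged_into: Optional[str] = None


def sketch_merge(branches: dict[str, SketchBranch], source: str, target: str = "main") -> str:
    """Apply the documented merge effects and return the new current branch."""
    src, dst = branches[source], branches[target]
    dst.log.extend(src.log)  # append the branch's OTA trace to the target's log
    dst.commits.append(f"Merge '{source}' into '{target}'")  # merge commit on target
    src.status = "merged"  # mark the source branch as merged
    src.merged_into = target
    return target  # the active branch switches back to the target


branches = {
    "main": SketchBranch("main", log=["step A"], commits=["scaffold"]),
    "redis-cache": SketchBranch(
        "redis-cache", log=["step B"], commits=["Redis caching layer implemented"]
    ),
}
current = sketch_merge(branches, "redis-cache")
```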
228
+
229
+ ### 5. CONTEXT - Retrieve History
230
+
231
+ The CONTEXT command is the agent's way of "remembering". The **K parameter** controls resolution:
232
+
233
+ ```python
234
+ # K=1: Only the most recent commit (paper's recommended default)
235
+ ctx = ws.context(k=1)
236
+
237
+ # K=3: Last 3 commits for more detailed history
238
+ ctx = ws.context(k=3)
239
+
240
+ # Access the structured result
241
+ print(ctx.branch_name) # "main"
242
+ print(ctx.main_roadmap) # Global project roadmap from main.md
243
+ print(ctx.commits) # List of last K CommitRecord objects
244
+ print(ctx.ota_records) # All OTA records on the branch
245
+ print(ctx.metadata) # BranchMetadata object
246
+
247
+ # Get a formatted markdown summary ready to inject into an LLM prompt
248
+ prompt_context = ctx.summary()
249
+ ```
250
+
251
+ The paper's experiments (Table 2, Section 4) show that **K=1 performs best** in most benchmarks -- agents do better with compressed recent context than with full history dumps.
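To make the K parameter concrete, the selection step can be thought of as taking the tail of the commit list. The helper below is a hypothetical sketch (the name `last_k_commits` and the use of plain strings for commits are assumptions); it does not reproduce how `ws.context()` is implemented.

```python
def last_k_commits(commits: list[str], k: int = 1) -> list[str]:
    """Return the K most recent commits, oldest first (illustrative only)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Slicing from the end yields fewer than K items when history is short.
    return commits[-k:]


history = ["scaffold", "auth merged", "blog CRUD"]
recent_one = last_k_commits(history, k=1)
recent_three = last_k_commits(history, k=3)
recent_many = last_k_commits(history, k=10)
```

A larger K trades prompt tokens for more detail, which is the balance the text above describes.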
252
+
253
+ ---
254
+
255
+ ## API Reference
256
+
257
+ ### `GCCWorkspace`
258
+
259
+ | Method | Parameters | Returns | Description |
260
+ |--------|-----------|---------|-------------|
261
+ | `__init__` | `project_root: str` | -- | Set the project root directory |
262
+ | `init` | `project_roadmap: str = ""` | `None` | Create `.GCC/` structure with main branch |
263
+ | `load` | -- | `None` | Load an existing workspace |
264
+ | `log_ota` | `observation, thought, action` | `OTARecord` | Append OTA step to current branch |
265
+ | `commit` | `contribution, previous_summary=None, update_roadmap=None` | `CommitRecord` | Create milestone checkpoint |
266
+ | `branch` | `name, purpose` | `GCCWorkspace` | Create and switch to new branch |
267
+ | `merge` | `branch_name, summary=None, target="main"` | `CommitRecord` | Merge branch into target |
268
+ | `context` | `branch=None, k=1` | `ContextResult` | Retrieve historical context |
269
+ | `switch_branch` | `name` | `None` | Switch active branch |
270
+ | `list_branches` | -- | `list[str]` | List all branch names |
271
+ | `update_roadmap` | `content` | `None` | Append to global roadmap |
272
+ | `current_branch` | *(property)* | `str` | Get current active branch name |
273
+
274
+ ---
275
+
276
+ ## Directory Structure
277
+
278
+ When you call `ws.init()`, the following structure is created on disk:
279
+
280
+ ```
281
+ your-project/
282
+   .GCC/
282
+     main.md               # Global roadmap / planning artifact
283
+     branches/
284
+       main/
285
+         log.md            # Continuous OTA trace
286
+         commit.md         # Milestone-level commit summaries
287
+         metadata.yaml     # Branch intent, status, creation info
288
+       feature-branch/     # Created by ws.branch()
289
+         log.md            # Independent OTA trace
290
+         commit.md         # Independent commit history
291
+         metadata.yaml     # Why this branch exists
293
+ ```
294
+
295
+ All data is stored as **human-readable Markdown and YAML** -- you can inspect and debug the agent's memory directly in your editor.
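Because the files are plain text, they can also be inspected with nothing but the standard library. The sketch below assumes the layout above, with one directory per branch under `.GCC/branches/`; the helper names `list_branch_dirs` and `read_log` are hypothetical and are not part of contexa's API. The demo builds a throwaway directory rather than touching a real workspace.

```python
import tempfile
from pathlib import Path


def list_branch_dirs(project_root: Path) -> list[str]:
    """List branch directory names under .GCC/branches/ (assumed layout)."""
    branches_dir = project_root / ".GCC" / "branches"
    return sorted(p.name for p in branches_dir.iterdir() if p.is_dir())


def read_log(project_root: Path, branch: str) -> str:
    """Read a branch's OTA log as plain text."""
    log_path = project_root / ".GCC" / "branches" / branch / "log.md"
    return log_path.read_text(encoding="utf-8")


# Demo against a temporary directory that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("main", "feature-branch"):
        branch_dir = root / ".GCC" / "branches" / name
        branch_dir.mkdir(parents=True)
        (branch_dir / "log.md").write_text(f"# OTA log for {name}\n", encoding="utf-8")
    found = list_branch_dirs(root)
    main_log = read_log(root, "main")
```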
296
+
297
+ ---
298
+
299
+ ## Data Models
300
+
301
+ | Class | Description | Key Fields |
302
+ |-------|-------------|------------|
303
+ | `OTARecord` | Single Observation-Thought-Action cycle | `timestamp`, `observation`, `thought`, `action`, `step` |
304
+ | `CommitRecord` | Milestone commit snapshot | `commit_id`, `branch_name`, `branch_purpose`, `previous_progress_summary`, `this_commit_contribution`, `timestamp` |
305
+ | `BranchMetadata` | Branch creation intent and status | `name`, `purpose`, `created_from`, `created_at`, `status`, `merged_into`, `merged_at` |
306
+ | `ContextResult` | Result of CONTEXT retrieval | `branch_name`, `k`, `commits`, `ota_records`, `main_roadmap`, `metadata` |
307
+
308
+ All models support serialization:
309
+
310
+ ```python
311
+ from contexa import OTARecord, BranchMetadata
312
+
313
+ # OTARecord <-> dict
314
+ record = OTARecord.from_dict({"timestamp": "...", "observation": "..."})  # remaining fields elided
315
+
316
+ # BranchMetadata <-> YAML
317
+ meta = BranchMetadata(name="main", purpose="Primary trajectory")  # remaining fields elided
318
+ yaml_str = meta.to_yaml()
319
+ meta_back = BranchMetadata.from_yaml(yaml_str)
320
+
321
+ # All records can be rendered as Markdown
322
+ print(record.to_markdown())
323
+ ```
324
+
325
+ ---
326
+
327
+ ## Real-World Example
328
+
329
+ Here's how an autonomous coding agent might use contexa to manage its memory while building a web application:
330
+
331
+ ```python
332
+ from contexa import GCCWorkspace
333
+
334
+ ws = GCCWorkspace("./my-webapp")
335
+ ws.init("Build a Flask web app with user auth, blog posts, and admin panel")
336
+
337
+ # === Phase 1: Project Setup ===
338
+ ws.log_ota("No project files exist", "Start with Flask boilerplate", "scaffold_project()")
339
+ ws.log_ota("Flask app created", "Need database models", "create_models()")
340
+ ws.log_ota("Models created", "Database migrations needed", "run_migrations()")
341
+ ws.commit("Project scaffold with Flask + SQLAlchemy models")
342
+
343
+ # === Phase 2: Explore auth strategies in parallel branches ===
344
+
345
+ # Try JWT auth
346
+ ws.branch("auth-jwt", "Explore stateless JWT authentication")
347
+ ws.log_ota("JWT docs reviewed", "Good for API, complex for sessions", "implement_jwt()")
348
+ ws.commit("JWT auth prototype -- works but session handling is messy")
349
+
350
+ # Go back and try session auth
351
+ ws.switch_branch("main")
352
+ ws.branch("auth-session", "Explore Flask-Login session authentication")
353
+ ws.log_ota("Flask-Login docs reviewed", "Simple, works well with templates", "implement_sessions()")
354
+ ws.commit("Session auth prototype -- clean integration with Flask")
355
+
356
+ # Session auth won, merge it
357
+ ws.merge("auth-session")
358
+
359
+ # === Phase 3: Continue on main with context ===
360
+ ctx = ws.context(k=2) # See last 2 commits: the merge + scaffold
361
+ # Feed ctx.summary() to the LLM as its "memory"
362
+
363
+ ws.log_ota("Auth is done", "Now build blog post CRUD", "implement_blog()")
364
+ ws.commit("Blog post CRUD with auth-protected routes")
365
+
366
+ # The agent always knows where it's been, without replaying everything
367
+ ```
368
+
369
+ ---
370
+
371
+ ## Running Tests
372
+
373
+ ```bash
374
+ # Clone the repository
375
+ git clone https://github.com/swadhinbiswas/contexa.git
376
+ cd contexa
377
+
378
+ # Install dev dependencies and run tests
379
+ uv sync
380
+ uv run pytest -v
381
+ ```
382
+
383
+ All 13 tests cover the core GCC commands:
384
+
385
+ ```
386
+ test_init_creates_gcc_directory # Workspace initialization
387
+ test_log_ota # OTA logging
388
+ test_commit # Milestone commits
389
+ test_branch_creates_isolated_workspace # Branch creation
390
+ test_branch_has_fresh_ota_log # Branch isolation
391
+ test_merge_integrates_branch # Branch merging
392
+ test_context_k1_returns_last_commit # Context retrieval (K=1)
393
+ test_context_k3_returns_last_three # Context retrieval (K=3)
394
+ test_context_includes_roadmap # Roadmap in context
395
+ test_branch_metadata_records_purpose # Metadata persistence
396
+ test_merge_marks_branch_as_merged # Post-merge metadata
397
+ test_switch_branch # Branch switching
398
+ test_ota_step_increments # Step auto-increment
399
+ ```
400
+
401
+ ---
402
+
403
+ ## Contributing
404
+
405
+ Contributions are welcome! Here's how to get started:
406
+
407
+ 1. Fork the repository: [https://github.com/swadhinbiswas/contexa](https://github.com/swadhinbiswas/contexa)
408
+ 2. Create a feature branch: `git checkout -b feature/my-feature`
409
+ 3. Make your changes and add tests
410
+ 4. Run the test suite: `uv run pytest -v`
411
+ 5. Submit a pull request
412
+
413
+ Please open an [issue](https://github.com/swadhinbiswas/contexa/issues) first for major changes to discuss the approach.
414
+
415
+ ---
416
+
417
+ ## Requirements
418
+
419
+ - **Python** >= 3.12
420
+ - **PyYAML** >= 6.0
421
+
422
+ No other dependencies. The entire implementation uses Python's standard library (`dataclasses`, `pathlib`, `uuid`, `datetime`) plus PyYAML for metadata serialization.
423
+
424
+ ---
425
+
426
+ ## License
427
+
428
+ This project is licensed under the MIT License. See the [LICENSE](https://github.com/swadhinbiswas/contexa/blob/main/LICENSE) file for details.
429
+
430
+ ---
431
+
432
+ ## Citation
433
+
434
+ If you use this in research, please cite the original paper:
435
+
436
+ ```bibtex
437
+ @article{wu2025gcc,
438
+ title={Git Context Controller: Manage the Context of LLM-based Agents like Git},
439
+ author={Wu, Junde and others},
440
+ journal={arXiv preprint arXiv:2508.00031v2},
441
+ year={2025}
442
+ }
443
+ ```
444
+
445
+ ---
446
+
447
+ ## Links
448
+
449
+ - **GitHub Repository**: [https://github.com/swadhinbiswas/contexa](https://github.com/swadhinbiswas/contexa)
450
+ - **PyPI Package**: [https://pypi.org/project/contexa/](https://pypi.org/project/contexa/)
451
+ - **Issue Tracker**: [https://github.com/swadhinbiswas/contexa/issues](https://github.com/swadhinbiswas/contexa/issues)
452
+ - **Original Paper**: [arXiv:2508.00031v2](https://arxiv.org/abs/2508.00031v2)
453
+ - **Author**: [Swadhin Biswas](https://github.com/swadhinbiswas)