codecortex-ai 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Rushikesh More
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,173 @@
1
+ # CodeCortex
2
+
3
+ > Persistent, AI-powered codebase knowledge layer. You shouldn't have to restructure your codebase for AI — CodeCortex gives AI the understanding automatically.
4
+
5
+ ## The Problem
6
+
7
+ Every AI coding session starts from scratch. When context compacts or a new session begins, the AI must re-scan the entire codebase — same files, same tokens, same time. It's like hiring a new developer every session who has to re-learn your entire codebase before writing a single line.
8
+
9
+ **The data backs this up:**
10
+ - AI agents increase defect risk by 30% on unfamiliar code ([CodeScene + Lund University, 2025](https://codescene.com/hubfs/whitepapers/AI-Coding-Assistants-and-Code-Quality.pdf))
11
+ - Code churn grew 2.5x in the AI era ([GitClear, 211M lines analyzed](https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality))
12
+ - No existing tool that we know of combines structural, semantic, temporal, and decision knowledge in one portable package
13
+
14
+ ## The Solution
15
+
16
+ CodeCortex pre-digests codebases into layered, structured knowledge files and serves them to any AI agent via MCP. Instead of re-understanding your codebase every session, the AI starts with knowledge.
17
+
18
+ **Hybrid extraction:** tree-sitter native N-API for precise structure (symbols, imports, calls across 27 languages) + host LLM for rich semantics (what modules do, why they're built that way). Zero extra API keys. Language-agnostic from day 1.
19
+
20
+ ## Quick Start
21
+
22
+ ```bash
23
+ # Install
24
+ npm install -g codecortex-ai
25
+
26
+ # Initialize knowledge for your project
27
+ cd /path/to/your-project
28
+ codecortex init
29
+
30
+ # Start MCP server (for AI agent access)
31
+ codecortex serve
32
+
33
+ # Check knowledge freshness
34
+ codecortex status
35
+ ```
36
+
37
+ ### Connect to Claude Code
38
+
39
+ Add to your MCP config:
40
+
41
+ ```json
42
+ {
43
+   "mcpServers": {
44
+     "codecortex": {
45
+       "command": "codecortex",
46
+       "args": ["serve"],
47
+       "cwd": "/path/to/your-project"
48
+     }
49
+   }
50
+ }
51
+ ```
52
+
53
+ ## Architecture
54
+
55
+ ```
56
+ ANY AI AGENT (Claude Code, Cursor, Codex, Windsurf, Zed)
57
+
58
+ MCP Protocol (stdio)
59
+
60
+ ┌──────▼───────────────────────────────────────────────┐
61
+ │ CODECORTEX MCP SERVER (14 tools)                     │
62
+ │ READ (9):  overview │ module │ briefing │ search │   │
63
+ │            decisions │ graph │ lookup_symbol │       │
64
+ │            change_coupling │ hotspots                │
65
+ │ WRITE (5): analyze_module │ save_analysis │          │
66
+ │            record_decision │ update_patterns │       │
67
+ │            report_feedback                           │
68
+ └──────┬───────────────────────────────────────────────┘
69
+        │ reads/writes
70
+ ┌──────▼───────────────────────────────────────────────┐
71
+ │ .codecortex/ (flat files in repo)                    │
72
+ │ HOT:  cortex.yaml │ constitution.md │ overview.md    │
73
+ │       graph.json │ symbols.json │ temporal.json      │
74
+ │ WARM: modules/*.md                                   │
75
+ │ COLD: decisions/*.md │ sessions/*.md │ patterns.md   │
76
+ └──────────────────────────────────────────────────────┘
77
+ ```
78
+
79
+ ## Six Knowledge Layers
80
+
81
+ | Layer | What | File |
82
+ |-------|------|------|
83
+ | 1. Structural | Modules, deps, symbols, entry points | `graph.json` + `symbols.json` |
84
+ | 2. Semantic | What each module DOES, data flow, gotchas | `modules/*.md` |
85
+ | 3. Temporal | Git behavioral fingerprint — coupling, hotspots, bug history | `temporal.json` |
86
+ | 4. Decisions | WHY things are built this way | `decisions/*.md` |
87
+ | 5. Patterns | HOW code is written here | `patterns.md` |
88
+ | 6. Sessions | What CHANGED between sessions | `sessions/*.md` |
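
As a rough illustration, the six layers and the HOT/WARM/COLD tiers from the architecture diagram can be modeled as one lookup table. This is only a sketch: the type names, the `LAYERS` constant, and `filesForTiers` are hypothetical and are not taken from CodeCortex's source. Only the file names and tier assignments come from this README.

```typescript
// Hypothetical model of the .codecortex/ layout described above.
// Type and function names are illustrative, not CodeCortex internals.
type Tier = "HOT" | "WARM" | "COLD";

interface KnowledgeLayer {
  layer: string;
  files: string[];
  tier: Tier;
}

const LAYERS: KnowledgeLayer[] = [
  { layer: "Structural", files: ["graph.json", "symbols.json"], tier: "HOT" },
  { layer: "Semantic", files: ["modules/*.md"], tier: "WARM" },
  { layer: "Temporal", files: ["temporal.json"], tier: "HOT" },
  { layer: "Decisions", files: ["decisions/*.md"], tier: "COLD" },
  { layer: "Patterns", files: ["patterns.md"], tier: "COLD" },
  { layer: "Sessions", files: ["sessions/*.md"], tier: "COLD" },
];

// Collect the knowledge files that belong to the requested tiers,
// in the order the layers are listed.
function filesForTiers(tiers: Tier[]): string[] {
  const result: string[] = [];
  for (const entry of LAYERS) {
    if (tiers.indexOf(entry.tier) !== -1) {
      for (const file of entry.files) {
        result.push(file);
      }
    }
  }
  return result;
}

// Layer files an agent would read at session start (HOT only).
console.log(filesForTiers(["HOT"]));
// Layer files read when working on a module (HOT + WARM).
console.log(filesForTiers(["HOT", "WARM"]));
```

The table covers only the files listed per layer; `cortex.yaml`, `constitution.md`, and `overview.md` appear in the HOT tier of the diagram but are not assigned to one of the six layers, so they are left out here.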
89
+
90
+ ### The Temporal Layer — Our Killer Differentiator
91
+
92
+ The temporal layer tells agents *"if you touch file X, you MUST also touch file Y"* even when there's no import between them. This comes from git co-change analysis, not static code analysis.
93
+
94
+ Example from a real codebase:
95
+ - `routes.ts` ↔ `worker.ts` co-changed in 9 of 12 commits (75%), with **zero imports between them**
96
+ - Without this knowledge, an AI editing only one of these files would likely miss a required companion change: historically, 75% of the commits counted above changed both
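
A minimal sketch of how a co-change ratio like the one above could be computed. It assumes each commit is reduced to the list of file paths it touched (for example, parsed from `git log --name-only`), and that the denominator counts commits touching either file. The README does not say how CodeCortex defines the denominator, so that choice, the `coChange` function, and the toy history are all assumptions.

```typescript
// Each commit is represented by the list of file paths it touched.
type Commit = string[];

interface Coupling {
  together: number; // commits that touched both files
  either: number;   // commits that touched at least one of the two files
  ratio: number;    // together / either, or 0 when neither file was touched
}

// Hypothetical helper: count how often two files change in the same commit.
function coChange(commits: Commit[], a: string, b: string): Coupling {
  let together = 0;
  let either = 0;
  for (const files of commits) {
    const hasA = files.indexOf(a) !== -1;
    const hasB = files.indexOf(b) !== -1;
    if (hasA || hasB) either++;
    if (hasA && hasB) together++;
  }
  return { together, either, ratio: either === 0 ? 0 : together / either };
}

// Toy history built to mirror the 9-of-12 example; not real project data.
const history: Commit[] = [];
for (let i = 0; i < 9; i++) {
  history.push(["routes.ts", "worker.ts"]);
}
history.push(["routes.ts"], ["routes.ts"], ["worker.ts"]);
history.push(["README.md"]); // unrelated commit, ignored by the ratio

const coupling = coChange(history, "routes.ts", "worker.ts");
console.log(
  `${coupling.together}/${coupling.either} = ${(coupling.ratio * 100).toFixed(0)}%`
); // prints "9/12 = 75%"
```

Because the signal comes from commit history rather than imports, it can surface dependencies that a static import graph would not show.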
97
+
98
+ ## MCP Tools
99
+
100
+ ### Read Tools (9)
101
+
102
+ | Tool | Description | Tier |
103
+ |------|-------------|------|
104
+ | `get_project_overview` | Constitution + overview + graph summary | HOT |
105
+ | `get_module_context` | Module doc by name, includes temporal signals | WARM |
106
+ | `get_session_briefing` | Changes since last session | COLD |
107
+ | `search_knowledge` | Keyword search across all knowledge | COLD |
108
+ | `get_decision_history` | Decision records filtered by topic | COLD |
109
+ | `get_dependency_graph` | Import/export graph, filterable | HOT |
110
+ | `lookup_symbol` | Symbol by name/file/kind | HOT |
111
+ | `get_change_coupling` | "What files must I also edit if I touch X?" | HOT |
112
+ | `get_hotspots` | Files ranked by risk (churn × coupling) | HOT |
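
The "churn × coupling" ranking behind `get_hotspots` could look roughly like the sketch below. The README gives only the formula, so the units are assumptions: `churn` is taken as the number of commits touching a file and `coupling` as the number of files it frequently co-changes with. The `FileStats` shape, `rankHotspots`, and the sample numbers are hypothetical.

```typescript
// Hypothetical per-file statistics; the units are assumptions (see lead-in).
interface FileStats {
  file: string;
  churn: number;    // assumed: commits that touched the file
  coupling: number; // assumed: number of files it frequently co-changes with
}

interface Hotspot extends FileStats {
  risk: number; // churn × coupling
}

// Score every file and sort from highest to lowest risk.
function rankHotspots(stats: FileStats[]): Hotspot[] {
  const scored: Hotspot[] = stats.map((s) => ({
    file: s.file,
    churn: s.churn,
    coupling: s.coupling,
    risk: s.churn * s.coupling,
  }));
  return scored.sort((x, y) => y.risk - x.risk);
}

// Illustrative numbers only, not measurements from a real repository.
const ranked = rankHotspots([
  { file: "utils.ts", churn: 20, coupling: 1 },
  { file: "routes.ts", churn: 12, coupling: 3 },
  { file: "types.ts", churn: 5, coupling: 0 },
]);

for (const h of ranked) {
  console.log(`${h.file}: risk ${h.risk}`);
}
```

With these sample values, `routes.ts` (risk 36) ranks above the more frequently edited `utils.ts` (risk 20), which shows why multiplying by coupling changes the ordering compared with churn alone.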
113
+
114
+ ### Write Tools (5)
115
+
116
+ | Tool | Description |
117
+ |------|-------------|
118
+ | `analyze_module` | Returns source files + structured prompt for LLM analysis |
119
+ | `save_module_analysis` | Persists LLM analysis to `modules/*.md` |
120
+ | `record_decision` | Saves architectural decision to `decisions/*.md` |
121
+ | `update_patterns` | Merges coding pattern into `patterns.md` |
122
+ | `report_feedback` | Agent reports incorrect knowledge for next analysis |
123
+
124
+ ## CLI Commands
125
+
126
+ | Command | Description |
127
+ |---------|-------------|
128
+ | `codecortex init` | Discover project + extract symbols + analyze git history |
129
+ | `codecortex serve` | Start MCP server (stdio transport) |
130
+ | `codecortex update` | Re-extract changed files, update affected modules |
131
+ | `codecortex status` | Show knowledge freshness, stale modules, symbol counts |
132
+
133
+ ## Progressive Disclosure
134
+
135
+ CodeCortex uses a three-tier memory model to minimize token usage:
136
+
137
+ ```
138
+ Session start (HOT only): ~4,300 tokens ← full codebase understanding
139
+ Working on a module (+WARM): ~5,000 tokens ← deep module knowledge
140
+ Need coding patterns (+COLD): ~5,900 tokens ← every pattern + gotcha
141
+
142
+ vs. raw scan of entire codebase: ~37,800 tokens ← and still might miss things
143
+ ```
144
+
145
+ **Result: 85-90% token reduction, 80-85% fewer tool calls, 7-10x efficiency multiplier.**
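
For reference, the token figures in the block above can be turned into reduction percentages and multipliers with simple arithmetic. The sketch below uses only the four approximate token counts quoted in this README; the helper names are hypothetical, and the tool-call reduction is not derivable from these numbers.

```typescript
// Approximate token counts quoted in the Progressive Disclosure block.
const RAW_SCAN_TOKENS = 37800;
const TIER_TOKENS = [
  { name: "HOT only", tokens: 4300 },
  { name: "HOT + WARM", tokens: 5000 },
  { name: "HOT + WARM + COLD", tokens: 5900 },
];

// Percentage of tokens saved relative to a baseline.
function reductionPercent(tokens: number, baseline: number): number {
  return (1 - tokens / baseline) * 100;
}

// How many times smaller the tiered context is than the baseline.
function multiplier(tokens: number, baseline: number): number {
  return baseline / tokens;
}

for (const tier of TIER_TOKENS) {
  const saved = reductionPercent(tier.tokens, RAW_SCAN_TOKENS).toFixed(1);
  const factor = multiplier(tier.tokens, RAW_SCAN_TOKENS).toFixed(1);
  console.log(`${tier.name}: ${saved}% fewer tokens, ${factor}x smaller`);
}
```

On these inputs the reduction works out to roughly 84% to 89% and the multiplier to roughly 6.4x to 8.8x, which is close to the summary figures above; those may also reflect measurements not shown in this README.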
146
+
147
+ ## Supported Languages (27)
148
+
149
+ Native tree-sitter grammars for:
150
+
151
+ | Category | Languages |
152
+ |----------|-----------|
153
+ | Web | TypeScript, TSX, JavaScript |
154
+ | Systems | C, C++, Objective-C, Rust, Zig, Go |
155
+ | JVM | Java, Kotlin, Scala |
156
+ | .NET | C# |
157
+ | Mobile | Swift, Dart |
158
+ | Scripting | Python, Ruby, PHP, Lua, Bash, Elixir |
159
+ | Functional | OCaml, Elm, Emacs Lisp |
160
+ | Other | Solidity, Vue, CodeQL |
161
+
162
+ ## Tech Stack
163
+
164
+ - TypeScript ESM, Node.js 20+
165
+ - `tree-sitter` (native N-API) + 27 language grammar packages
166
+ - `@modelcontextprotocol/sdk` — MCP server
167
+ - `commander` — CLI
168
+ - `simple-git` — git integration
169
+ - `yaml`, `zod`, `glob`
170
+
171
+ ## License
172
+
173
+ MIT