codexa 1.0.0 → 1.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +20 -164
- package/dist/agent.js +24 -9
- package/dist/cli.js +16 -7
- package/dist/config.js +6 -7
- package/dist/db.js +51 -1
- package/dist/ingest.js +81 -6
- package/dist/models/index.js +10 -7
- package/dist/retriever.js +158 -6
- package/dist/utils/formatter.js +46 -0
- package/dist/utils/logger.js +10 -4
- package/package.json +12 -5
package/README.md
CHANGED
@@ -1,7 +1,7 @@
 <div align="center">
-  <h1>
-
-
+  <h1>
+    <img src="https://github.com/sahitya-chandra/codexa/blob/main/.github/assets/logo.png" alt="Codexa Logo" width="90" align="absmiddle"> Codexa
+  </h1>
 
   <p>
     <strong>A powerful CLI tool that ingests your codebase and allows you to ask questions about it using Retrieval-Augmented Generation (RAG).</strong>
@@ -48,7 +48,7 @@
 
 - 🔒 **Privacy-First**: All data processing happens locally by default
 - ⚡ **Fast & Efficient**: Local embeddings and optimized vector search
-- 🤖 **Multiple LLM Support**: Works with
+- 🤖 **Multiple LLM Support**: Works with Groq (cloud)
 - 💾 **Local Storage**: SQLite database for embeddings and context
 - 🎯 **Smart Chunking**: Intelligent code splitting with configurable overlap
 - 🔄 **Session Management**: Maintain conversation context across queries
@@ -68,7 +68,6 @@ Before installing Codexa, ensure you have the following:
 node --version  # Should be v20.0.0 or higher
 ```
 
-- **For Local LLM (Ollama)**: [Ollama](https://ollama.com/) must be installed
 - **For Cloud LLM (Groq)**: A Groq API key from [console.groq.com](https://console.groq.com/)
 
 ### Installation Methods
@@ -130,11 +129,9 @@ codexa --version
 
 ### LLM Setup
 
-Codexa requires an LLM to generate answers. You can use
-
-#### Option 1: Using Groq (Cloud - Recommended)
+Codexa requires an LLM to generate answers. You can use Groq (cloud).
 
-Groq provides fast cloud-based LLMs with a generous free tier
+Groq provides fast cloud-based LLMs with a generous free tier.
 
 **Step 1: Get a Groq API Key**
 
@@ -192,69 +189,11 @@ Codexa defaults to using Groq when you run `codexa init`. If you need to manuall
 - `llama-3.1-8b-instant` - Fast responses (recommended, default)
 - `llama-3.1-70b-versatile` - Higher quality, slower
 
-#### Option 2: Using Ollama (Local - Alternative)
-
-Ollama runs LLMs locally on your machine, keeping your code completely private. This is an alternative option if you prefer local processing.
-
-> ⚠️ **Note:** Models with more than 3 billion parameters may not work reliably with local Ollama setup. We recommend using 3B parameter models for best compatibility, or use Groq (Option 1) for better reliability.
-
-**Step 1: Install Ollama**
-
-- **macOS/Linux**: Visit [ollama.com](https://ollama.com/) and follow the installation instructions
-- **Or use Homebrew on macOS**:
-  ```bash
-  brew install ollama
-  ```
-
-**Step 2: Start Ollama Service**
-
-```bash
-# Start Ollama (usually starts automatically after installation)
-ollama serve
-
-# Verify Ollama is running
-curl http://localhost:11434/api/tags
-```
-
-**Step 3: Download a Model**
-
-Pull a model that Codexa can use:
-
-```bash
-# Recommended: Fast and lightweight - 3B parameters
-ollama pull qwen2.5:3b-instruct
-
-# Alternative 3B options:
-ollama pull qwen2.5:1.5b-instruct  # Even faster, smaller
-ollama pull phi3:mini              # Microsoft Phi-3 Mini
-
-# ⚠️ Note: Larger models (8B+ like llama3:8b, mistral:7b) may not work locally
-# If you encounter issues, try using a 3B model instead, or switch to Groq
-```
 
-**Step 4: Verify Model is Available**
-
-```bash
-ollama list
-```
-
-You should see your downloaded model in the list.
-
-**Step 5: Configure Codexa**
-
-Edit `.codexarc.json` after running `codexa init`:
-
-```json
-{
-  "modelProvider": "local",
-  "model": "qwen2.5:3b-instruct",
-  "localModelUrl": "http://localhost:11434"
-}
-```
 
 #### Quick Setup Summary
 
-**For Groq
+**For Groq:**
 ```bash
 # 1. Get API key from console.groq.com
 # 2. Set environment variable
@@ -266,22 +205,7 @@ codexa init
 # 4. Ready to use!
 ```
 
-**For Ollama (Alternative):**
-```bash
-# 1. Install Ollama
-brew install ollama  # macOS
-# or visit ollama.com for other platforms
-
-# 2. Start Ollama
-ollama serve
 
-# 3. Pull model (use 3B models only)
-ollama pull qwen2.5:3b-instruct
-
-# 4. Update .codexarc.json to set "modelProvider": "local"
-codexa init
-# Then edit .codexarc.json to set modelProvider to "local"
-```
 
 ## Quick Start
 
@@ -427,38 +351,22 @@ export OPENAI_API_KEY="sk-your_key_here"  # If using OpenAI embeddings
 
 #### `modelProvider`
 
-**Type:** `"
-**Default:** `"groq"`
+**Type:** `"groq"`
+**Default:** `"groq"`
 
 The LLM provider to use for generating answers.
 
-- `"groq"` - Uses Groq's cloud API (
-- `"local"` - Uses Ollama running on your machine (alternative option)
+- `"groq"` - Uses Groq's cloud API (requires `GROQ_API_KEY`)
 
 #### `model`
 
 **Type:** `string`
-**
+**Type:** `string`
+**Default:** `"llama-3.1-8b-instant"`
 
 The model identifier to use.
 
-**Common Groq Models (Recommended):**
-- `llama-3.1-8b-instant` - Fast responses (default, recommended)
-- `llama-3.1-70b-versatile` - Higher quality, slower
-
-**Common Local Models (Alternative):**
-- `qwen2.5:3b-instruct` - Fast, lightweight - **3B parameters**
-- `qwen2.5:1.5b-instruct` - Even faster, smaller - **1.5B parameters**
-- `phi3:mini` - Microsoft Phi-3 Mini - **3.8B parameters**
 
-> ⚠️ **Warning:** Models with more than 3 billion parameters (like `llama3:8b`, `mistral:7b`) may not work reliably with local Ollama setup. If you encounter issues, please try using a 3B parameter model instead, or switch to Groq.
-
-#### `localModelUrl`
-
-**Type:** `string`
-**Default:** `"http://localhost:11434"`
-
-Base URL for your local Ollama instance. Change this if Ollama runs on a different host or port.
 
 #### `embeddingProvider`
 
@@ -562,7 +470,7 @@ Number of code chunks to retrieve and use as context for each question. Higher v
 
 ### Example Configurations
 
-#### Groq Cloud Provider (
+#### Groq Cloud Provider (Default)
 
 ```json
 {
@@ -582,28 +490,14 @@ Number of code chunks to retrieve and use as context for each question. Higher v
 export GROQ_API_KEY="your-api-key"
 ```
 
-#### Local Development (Alternative)
 
-```json
-{
-  "modelProvider": "local",
-  "model": "qwen2.5:3b-instruct",
-  "localModelUrl": "http://localhost:11434",
-  "embeddingProvider": "local",
-  "embeddingModel": "Xenova/all-MiniLM-L6-v2",
-  "maxChunkSize": 200,
-  "chunkOverlap": 20,
-  "temperature": 0.2,
-  "topK": 4
-}
-```
 
 #### Optimized for Large Codebases
 
 ```json
 {
-  "modelProvider": "
-  "model": "
+  "modelProvider": "groq",
+  "model": "llama-3.1-8b-instant",
   "maxChunkSize": 150,
   "chunkOverlap": 15,
   "topK": 6,
@@ -731,8 +625,8 @@ When you run `codexa ask`:
         ▼
 ┌─────────────────┐     ┌──────────────┐
 │   SQLite DB     │◀────│     LLM      │
-│  (Chunks +      │     │  (
-│   Embeddings)   │     │
+│  (Chunks +      │     │    (Groq)    │
+│   Embeddings)   │     │              │
 └─────────────────┘     └──────┬───────┘
        │
        ▼
@@ -745,50 +639,12 @@ When you run `codexa ask`:
 - **Chunker**: Splits code files into semantic chunks
 - **Embedder**: Generates vector embeddings (local transformers)
 - **Retriever**: Finds relevant chunks using vector similarity
-- **LLM Client**: Generates answers (
+- **LLM Client**: Generates answers (Groq cloud)
 - **Database**: SQLite for storing chunks and embeddings
 
 ## Troubleshooting
 
-### "Ollama not reachable" Error
-
-**Problem:** Codexa can't connect to your local Ollama instance.
-
-**Solutions:**
-1. Ensure Ollama is running:
-   ```bash
-   ollama serve
-   ```
-2. Check if Ollama is running on the default port:
-   ```bash
-   curl http://localhost:11434/api/tags
-   ```
-3. If Ollama runs on a different host/port, update `.codexarc.json`:
-   ```json
-   {
-     "localModelUrl": "http://your-host:port"
-   }
-   ```
-
-### "Model not found" Error
 
-**Problem:** The specified Ollama model isn't available.
-
-**Solutions:**
-1. List available models:
-   ```bash
-   ollama list
-   ```
-2. Pull the required model:
-   ```bash
-   ollama pull qwen2.5:3b-instruct
-   ```
-3. Or update `.codexarc.json` to use an available model:
-   ```json
-   {
-     "model": "your-available-model"
-   }
-   ```
 
 ### "GROQ_API_KEY not set" Error
 
@@ -836,7 +692,7 @@ When you run `codexa ask`:
 ```bash
 codexa ingest --force
 ```
-
+
 5. Ask more specific questions
 
 ### Database Locked Error
 
@@ -869,7 +725,7 @@ A: Yes! Codexa processes everything locally by default. Your code never leaves y
 A: Typically 10-50MB per 1000 files, depending on file sizes. The SQLite database stores chunks and embeddings.
 
 **Q: Can I use Codexa in CI/CD?**
-A: Yes, but you'll need to ensure
+A: Yes, but you'll need to ensure your LLM provider is accessible. For CI/CD, consider using Groq (cloud).
 
 **Q: Does Codexa work with monorepos?**
 A: Yes! Adjust `includeGlobs` and `excludeGlobs` to target specific packages or workspaces.
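Pulled together, the Groq-only setup the updated README documents corresponds to a `.codexarc.json` along these lines. This is an illustrative fragment assembled from the defaults shown in the `package/dist/config.js` diff below, not a file shipped verbatim in the package:

```json
{
  "modelProvider": "groq",
  "model": "llama-3.1-8b-instant",
  "embeddingProvider": "local",
  "embeddingModel": "Xenova/all-MiniLM-L6-v2",
  "maxChunkSize": 800,
  "chunkOverlap": 100,
  "temperature": 0.2,
  "topK": 10
}
```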
package/dist/agent.js
CHANGED
@@ -9,15 +9,24 @@ const fs_extra_1 = __importDefault(require("fs-extra"));
 const retriever_1 = require("./retriever");
 const models_1 = require("./models");
 const SYSTEM_PROMPT = `
-You are RepoSage.
-You answer questions about a codebase using ONLY the provided code snippets.
+You are RepoSage, an expert codebase assistant that answers questions about codebases using the provided code snippets.
 
-
-
-
--
-
--
+Your task is to provide accurate, helpful, and comprehensive answers based on the ACTUAL CODE provided.
+
+CRITICAL PRIORITY RULES:
+- ALWAYS prioritize CODE_SNIPPET sections over DOCUMENTATION sections when answering questions
+- IGNORE DOCUMENTATION sections if they contradict or differ from what the code shows
+- When there's a conflict between documentation and actual code, ALWAYS trust the code implementation
+- Base your answers on what the CODE actually does, not what documentation claims
+
+Guidelines:
+- Analyze CODE_SNIPPET sections FIRST - these contain the actual implementation
+- DOCUMENTATION sections are for reference only and should be IGNORED if they contradict code
+- When answering questions about functionality, explain based on actual code execution flow
+- Reference specific files and line numbers when relevant (from the FILE headers)
+- Be direct and factual - if code shows something, state it clearly
+- If asked about a specific file that isn't in the context, clearly state "The file [name] is not present in the provided code snippets"
+- When analyzing code structure, look at imports, exports, and execution patterns
 `;
 async function askQuestion(cwd, config, options) {
     const { question, session = 'default' } = options;
@@ -32,7 +41,13 @@ async function askQuestion(cwd, config, options) {
         ...history,
         {
             role: 'user',
-            content: `
+            content: `Based on the following code snippets from the codebase, please answer the question.
+
+${context}
+
+Question: ${question}
+
+Please provide a comprehensive and helpful answer based on the code context above.`,
         },
     ];
     const llm = (0, models_1.createLLMClient)(config);
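The user message the diff now assembles can be sketched in isolation; `buildUserMessage` is a hypothetical helper for illustration (the package builds the object inline), but the template text matches the added lines:

```javascript
// Hypothetical helper mirroring the message shape agent.js now builds inline.
function buildUserMessage(context, question) {
  return {
    role: 'user',
    content: `Based on the following code snippets from the codebase, please answer the question.\n\n${context}\n\nQuestion: ${question}\n\nPlease provide a comprehensive and helpful answer based on the code context above.`,
  };
}

const msg = buildUserMessage('FILE: src/a.ts\n...', 'What does a.ts export?');
console.log(msg.role); // 'user'
```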
package/dist/cli.js
CHANGED
@@ -10,6 +10,14 @@ const config_1 = require("./config");
 const ingest_1 = require("./ingest");
 const agent_1 = require("./agent");
 const logger_1 = require("./utils/logger");
+const formatter_1 = require("./utils/formatter");
+const marked_1 = require("marked");
+const marked_terminal_1 = __importDefault(require("marked-terminal"));
+marked_1.marked.setOptions({
+    renderer: new marked_terminal_1.default({
+        tab: 2,
+    }),
+});
 const program = new commander_1.Command();
 program
     .name('codexa')
@@ -43,15 +51,14 @@ program
     .description('Ask a natural-language question about the current repo.')
     .argument('<question...>', 'Question to ask about the codebase.')
     .option('-s, --session <name>', 'session identifier to keep conversation context', 'default')
-    .option('--
+    .option('--stream', 'enable streaming output')
     .action(async (question, options) => {
     const cwd = process.cwd();
     const config = await (0, config_1.loadConfig)(cwd);
     const prompt = question.join(' ');
-    //
-
-
-    const stream = options.stream !== false;
+    // dfefault: non-streamed output
+    const stream = options.stream === true;
+    console.log((0, formatter_1.formatQuestion)(prompt));
     const spinner = (0, ora_1.default)('Extracting Response...').start();
     try {
         const answer = await (0, agent_1.askQuestion)(cwd, config, {
@@ -70,11 +77,13 @@ program
             spinner.text = status;
         },
     });
-    spinner.stop();
     if (!stream) {
-
+        const rendered = marked_1.marked.parse(answer.trim());
+        spinner.stop();
+        console.log('\n' + rendered + '\n');
     }
     else {
+        spinner.stop();
         console.log('\n');
     }
 }
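The CLI diff flips the streaming default: `options.stream !== false` (streaming on unless disabled) becomes `options.stream === true` (streaming only when `--stream` is passed). A minimal sketch of the two predicates, with `isStreamingOld`/`isStreamingNew` as hypothetical names for comparison:

```javascript
// Old behavior: streaming unless explicitly disabled.
function isStreamingOld(options) {
  return options.stream !== false;
}

// New behavior (1.0.1): streaming is opt-in via --stream.
function isStreamingNew(options) {
  return options.stream === true;
}

console.log(isStreamingOld({}));              // true  (stream was the default)
console.log(isStreamingNew({}));              // false (now off by default)
console.log(isStreamingNew({ stream: true })); // true
```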
package/dist/config.js
CHANGED
@@ -13,13 +13,13 @@ dotenv_1.default.config();
 const CONFIG_FILENAME = '.codexarc.json';
 const DEFAULT_CONFIG = {
     modelProvider: 'groq',
-    model: 'llama-3.1-8b-instant',
+    model: 'llama-3.1-8b-instant', // can also use llama-3.3-70b-versatile for better perf
     embeddingProvider: 'local',
     embeddingModel: 'Xenova/all-MiniLM-L6-v2',
-    localModelUrl: 'http://localhost:11434',
-    localModelApiKey: '',
-    maxChunkSize:
-    chunkOverlap:
+    // localModelUrl: 'http://localhost:11434',
+    // localModelApiKey: '',
+    maxChunkSize: 800,
+    chunkOverlap: 100,
     includeGlobs: [
         '**/*.ts',
         '**/*.tsx',
@@ -30,7 +30,6 @@ const DEFAULT_CONFIG = {
         '**/*.rs',
         '**/*.java',
         '**/*.md',
-        '**/*.json',
     ],
     excludeGlobs: [
         'node_modules/**',
@@ -43,7 +42,7 @@ const DEFAULT_CONFIG = {
     historyDir: '.codexa/sessions',
     dbPath: '.codexa/index.db',
     temperature: 0.2,
-    topK:
+    topK: 10,
 };
 async function ensureConfig(cwd) {
     const configPath = node_path_1.default.join(cwd, CONFIG_FILENAME);
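With the new defaults (`maxChunkSize: 800`, `chunkOverlap: 100`), each chunk advances by size minus overlap. A rough estimate of chunk count follows; `estimateChunkCount` is a hypothetical illustration of the arithmetic, not the package's actual chunker:

```javascript
// Illustrative estimate: chunks advance by (size - overlap) per step,
// so a document of length N yields roughly ceil((N - overlap) / (size - overlap)) chunks.
function estimateChunkCount(totalLen, size = 800, overlap = 100) {
  if (totalLen <= size) return 1;
  return Math.ceil((totalLen - overlap) / (size - overlap));
}

console.log(estimateChunkCount(800));  // 1
console.log(estimateChunkCount(1500)); // 2
```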
package/dist/db.js
CHANGED
@@ -29,6 +29,24 @@ function cosineSimilarity(a, b) {
     const sqrtNormB = Math.sqrt(normB);
     return dot / (sqrtNormA * sqrtNormB);
 }
+function shouldSkipFileForSearch(filePath, excludeMarkdown = false) {
+    const lower = filePath.toLowerCase();
+    if (excludeMarkdown && (lower.endsWith('.md') || lower.includes('readme'))) {
+        return true;
+    }
+    if (lower.includes('node_modules/') ||
+        lower.includes('/dist/') ||
+        lower.includes('/build/') ||
+        lower.includes('/.git/') ||
+        lower.endsWith('package-lock.json') ||
+        lower.endsWith('yarn.lock') ||
+        lower.endsWith('pnpm-lock.yaml') ||
+        lower.endsWith('.lock') ||
+        lower.endsWith('.log')) {
+        return true;
+    }
+    return false;
+}
 class VectorStore {
     dbPath;
     db = null;
@@ -80,7 +98,36 @@ class VectorStore {
         });
         tx(chunks);
     }
-
+    getChunksByFilePath(filePathPattern, maxChunks = 10) {
+        const db = this.connection;
+        if (!filePathPattern || filePathPattern.trim() === '') {
+            // Return all chunks if no pattern
+            const rows = db.prepare('SELECT * FROM chunks').all();
+            return rows.slice(0, maxChunks).map((row) => ({
+                filePath: row.file_path,
+                startLine: row.start_line,
+                endLine: row.end_line,
+                content: row.content,
+                compressed: row.compressed ?? '',
+                embedding: JSON.parse(row.embedding),
+                score: 1.0,
+            }));
+        }
+        const fileName = filePathPattern.split('/').pop() || filePathPattern;
+        const rows = db
+            .prepare(`SELECT * FROM chunks WHERE file_path LIKE ? OR file_path LIKE ? OR file_path = ? OR file_path LIKE ? LIMIT ?`)
+            .all(`%${filePathPattern}%`, `%${fileName}%`, filePathPattern, `%/${fileName}%`, maxChunks);
+        return rows.map((row) => ({
+            filePath: row.file_path,
+            startLine: row.start_line,
+            endLine: row.end_line,
+            content: row.content,
+            compressed: row.compressed ?? '',
+            embedding: JSON.parse(row.embedding),
+            score: 1.0,
+        }));
+    }
+    search(queryEmbedding, topK, excludeMarkdown = false) {
         const db = this.connection;
         const rows = db.prepare('SELECT * FROM chunks').all();
         if (rows.length === 0) {
@@ -94,6 +141,9 @@ class VectorStore {
         const topResults = [];
         const minScore = { value: -Infinity };
         for (const row of rows) {
+            if (shouldSkipFileForSearch(row.file_path, excludeMarkdown)) {
+                continue;
+            }
             const embedding = JSON.parse(row.embedding);
             const score = cosineSimilarity(queryEmbedding, embedding);
             if (topResults.length >= topK && score <= minScore.value) {
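The two building blocks this diff touches can be exercised standalone. Both functions below are mirrored from the diff (the cosine similarity was already present; the skip filter is new in 1.0.1), condensed slightly for the sketch:

```javascript
// Cosine similarity between two equal-length vectors, as in db.js.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Condensed mirror of the new shouldSkipFileForSearch filter.
function shouldSkipFileForSearch(filePath, excludeMarkdown = false) {
  const lower = filePath.toLowerCase();
  if (excludeMarkdown && (lower.endsWith('.md') || lower.includes('readme'))) return true;
  return ['node_modules/', '/dist/', '/build/', '/.git/'].some((p) => lower.includes(p))
    || ['package-lock.json', 'yarn.lock', 'pnpm-lock.yaml', '.lock', '.log'].some((s) => lower.endsWith(s));
}

console.log(cosineSimilarity([1, 0], [1, 0]));                 // 1
console.log(shouldSkipFileForSearch('node_modules/x/index.js')); // true
console.log(shouldSkipFileForSearch('README.md', true));         // true
console.log(shouldSkipFileForSearch('src/app.ts'));              // false
```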
package/dist/ingest.js
CHANGED
@@ -4,24 +4,81 @@ var __importDefault = (this && this.__importDefault) || function (mod) {
 };
 Object.defineProperty(exports, "__esModule", { value: true });
 exports.ingestRepository = ingestRepository;
+const node_perf_hooks_1 = require("node:perf_hooks");
 const node_path_1 = __importDefault(require("node:path"));
+const cli_progress_1 = __importDefault(require("cli-progress"));
 const globby_1 = require("globby");
 const chunker_1 = require("./chunker");
 const embeddings_1 = require("./embeddings");
 const db_1 = require("./db");
+const formatter_1 = require("./utils/formatter");
 const ora_1 = __importDefault(require("ora"));
-function compressText(text, cap =
+function compressText(text, cap = 800) {
     return text
-        .replace(
-        .replace(
-        .replace(
+        .replace(/\n{3,}/g, '\n\n')
+        .replace(/[ \t]+/g, ' ')
+        .replace(/ *\n */g, '\n')
         .trim()
         .slice(0, cap);
 }
 function tick() {
     return new Promise((resolve) => setImmediate(resolve));
 }
+// async function parallelizeBatches(
+//   chunks: any,
+//   batchSize: number,
+//   concurrency: number,
+//   embedFunc: any,
+//   onProgress?: (processed: number, total: number) => void
+// ) {
+//   const totalChunks = chunks.length;
+//   // Pre-create batches to avoid race conditions during batch creation
+//   const batches: any[][] = [];
+//   for (let i = 0; i < chunks.length; i += batchSize) {
+//     const batch = chunks.slice(i, Math.min(i + batchSize, chunks.length));
+//     if (batch.length > 0) {
+//       batches.push(batch);
+//     }
+//   }
+//   // Use a shared counter with proper synchronization
+//   let batchIndex = 0;
+//   let processedCount = 0;
+//   // Helper to atomically get next batch index
+//   function getNextBatchIndex(): number {
+//     const current = batchIndex;
+//     batchIndex++;
+//     return current;
+//   }
+//   async function processBatch() {
+//     while (true) {
+//       const index = getNextBatchIndex();
+//       if (index >= batches.length) {
+//         return;
+//       }
+//       const currentBatch = batches[index];
+//       if (!currentBatch || currentBatch.length === 0) {
+//         return;
+//       }
+//       try {
+//         // Use original content for embeddings to preserve semantic information
+//         const texts = currentBatch.map((c: any) => c.content);
+//         const vectors = await embedFunc(texts);
+//         currentBatch.forEach((c: any, idx: number) => (c.embedding = vectors[idx]));
+//         // Update progress atomically (JavaScript single-threaded, but good practice)
+//         processedCount += currentBatch.length;
+//         if (onProgress) {
+//           onProgress(processedCount, totalChunks);
+//         }
+//       } catch (error) {
+//         console.error(`Error processing batch: ${error}`);
+//         throw error;
+//       }
+//     }
+//   }
+//   await Promise.all(Array.from({ length: concurrency }, processBatch));
+// }
 async function ingestRepository({ cwd, config, force = false, }) {
+    const startedAt = node_perf_hooks_1.performance.now();
     const spinnerFiles = (0, ora_1.default)('Finding files...').start();
     const files = await (0, globby_1.globby)(config.includeGlobs, {
         cwd,
@@ -47,16 +104,24 @@ async function ingestRepository({ cwd, config, force = false, }) {
     const spinnerCompress = (0, ora_1.default)('Compressing chunks...').start();
     chunks.forEach((c) => (c.compressed = compressText(c.content)));
     spinnerCompress.succeed('Compression complete');
-    const spinnerEmbed = (0, ora_1.default)('
+    const spinnerEmbed = (0, ora_1.default)('Preparing embeddings (this will take sometime)...').start();
     const embedder = await (0, embeddings_1.createEmbedder)(config);
+    const progress = new cli_progress_1.default.SingleBar({
+        format: 'Embedding |{bar}| {percentage}% | {value}/{total} chunks',
+        barCompleteChar: '\u2588',
+        barIncompleteChar: '\u2591',
+    }, cli_progress_1.default.Presets.shades_classic);
     const batchSize = 32;
+    progress.start(chunks.length, 0);
     for (let i = 0; i < chunks.length; i += batchSize) {
         const batch = chunks.slice(i, i + batchSize);
-        const texts = batch.map((c) => c.
+        const texts = batch.map((c) => c.content);
         const vectors = await embedder.embed(texts);
         batch.forEach((c, idx) => (c.embedding = vectors[idx]));
+        progress.increment(batch.length);
         await tick();
     }
+    progress.stop();
     spinnerEmbed.succeed('Embedding complete');
     const spinnerStore = (0, ora_1.default)('Storing chunks...').start();
     const store = new db_1.VectorStore(config.dbPath);
@@ -65,5 +130,15 @@ async function ingestRepository({ cwd, config, force = false, }) {
     store.clear();
     store.insertChunks(chunks);
     spinnerStore.succeed('Stored successfully');
+    const durationSec = (node_perf_hooks_1.performance.now() - startedAt) / 1000;
     (0, ora_1.default)().succeed('Ingestion complete!');
+    const avgChunkSize = chunks.length === 0
+        ? 0
+        : chunks.reduce((sum, c) => sum + c.content.split('\n').length, 0) / chunks.length;
+    console.log((0, formatter_1.formatStats)({
+        files: files.length,
+        chunks: chunks.length,
+        avgChunkSize,
+        durationSec,
+    }));
 }
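The `compressText` helper the diff completes, and the fixed batch size of 32 used by the embedding loop, can be checked in isolation. `compressText` below is mirrored from the diff; `batchCount` is a hypothetical helper illustrating how many embedding batches the loop performs:

```javascript
// Mirror of the diff's compressText: collapse runs of blank lines and
// horizontal whitespace, strip spaces around newlines, then cap at 800 chars.
function compressText(text, cap = 800) {
  return text
    .replace(/\n{3,}/g, '\n\n')
    .replace(/[ \t]+/g, ' ')
    .replace(/ *\n */g, '\n')
    .trim()
    .slice(0, cap);
}

// Hypothetical helper: number of batches the embedding loop runs for N chunks.
function batchCount(total, batchSize = 32) {
  return Math.ceil(total / batchSize);
}

console.log(JSON.stringify(compressText('a    b\n\n\n\nc'))); // "a b\n\nc"
console.log(batchCount(100)); // 4
```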
package/dist/models/index.js
CHANGED
@@ -113,17 +113,20 @@ class GroqLLM {
     }
 }
 function createLLMClient(config) {
-    if (config.modelProvider === 'local') {
-
-
-
-
-
-    }
+    // if (config.modelProvider === 'local') {
+    //     const base = config.localModelUrl?.replace(/\/$/, '') || 'http://localhost:11434';
+    //     if (process.env.AGENT_DEBUG) {
+    //         console.error('Using Ollama client:', config.model, config.localModelUrl);
+    //     }
+    //     return new OllamaLLM(config.model, base);
+    // }
     if (config.modelProvider === 'groq') {
         if (process.env.AGENT_DEBUG) {
             console.error('Using Groq client:', config.model);
         }
+        if (!process.env.GROQ_API_KEY) {
+            throw new Error('GROQ_API_KEY is not set. Please set the GROQ_API_KEY environment variable to use Groq models.');
+        }
         return new GroqLLM(config.model, process.env.GROQ_API_KEY);
     }
     throw new Error('Only local provider supported for now.');
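The fail-fast guard the diff adds to `createLLMClient` can be sketched on its own; `requireGroqKey` is a hypothetical standalone version (the package performs the check inline against `process.env`), with the error message taken from the diff:

```javascript
// Hypothetical standalone version of the new GROQ_API_KEY guard.
function requireGroqKey(env) {
  if (!env.GROQ_API_KEY) {
    throw new Error('GROQ_API_KEY is not set. Please set the GROQ_API_KEY environment variable to use Groq models.');
  }
  return env.GROQ_API_KEY;
}

console.log(requireGroqKey({ GROQ_API_KEY: 'gsk_example' })); // 'gsk_example'
```

Failing fast here surfaces a missing key before any request is made, instead of letting the Groq client error later with a less direct message.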
package/dist/retriever.js
CHANGED
|
@@ -4,19 +4,171 @@ exports.retrieveContext = retrieveContext;
|
|
|
4
4
|
exports.formatContext = formatContext;
|
|
5
5
|
const embeddings_1 = require("./embeddings");
|
|
6
6
|
const db_1 = require("./db");
|
|
7
|
+
// helper to determine file type and priority
|
|
8
|
+
function extractFileMentions(question) {
|
|
9
|
+
const filePattern = /[\w\/\-\.]+\.(ts|tsx|js|jsx|py|go|rs|java|mjs|cjs|md|mdx)/gi;
|
|
10
|
+
const matches = question.match(filePattern) || [];
|
|
11
|
+
const bareFilenamePattern = /\b(\w+\.(ts|tsx|js|jsx|py|go|rs|java|mjs|cjs|md|mdx))\b/gi;
|
|
12
|
+
const bareMatches = (question.match(bareFilenamePattern) || []).map((m) => m.toLowerCase());
|
|
13
|
+
const markdownMentions = [];
|
|
14
|
+
if (question.toLowerCase().includes('readme'))
|
|
15
|
+
markdownMentions.push('readme.md');
|
|
16
|
+
if (question.toLowerCase().includes('contributing'))
|
|
17
|
+
markdownMentions.push('contributing.md');
|
|
18
|
+
if (question.toLowerCase().includes('changelog'))
|
|
19
|
+
markdownMentions.push('changelog.md');
|
|
20
|
+
const allMatches = [...matches, ...bareMatches, ...markdownMentions].map((m) => m.toLowerCase().trim());
|
|
21
|
+
// remove duplicates
|
|
22
|
+
return [...new Set(allMatches)];
|
|
23
|
+
}
|
|
24
|
+
// helper to match file paths flexibly
|
|
25
|
+
function matchesFilePattern(filePath, pattern) {
|
|
26
|
+
const lowerPath = filePath.toLowerCase();
|
|
27
|
+
const lowerPattern = pattern.toLowerCase();
|
|
28
|
+
if (lowerPath === lowerPattern)
|
|
29
|
+
return true;
|
|
30
|
+
if (lowerPath.endsWith(lowerPattern))
|
|
31
|
+
return true;
|
|
32
|
+
if (lowerPath.includes(lowerPattern))
|
|
33
|
+
return true;
|
|
34
|
+
// match just the filename part
|
|
35
|
+
const pathParts = lowerPath.split('/');
|
|
36
|
+
const fileName = pathParts[pathParts.length - 1];
|
|
37
|
+
const patternFileName = lowerPattern.split('/').pop() || lowerPattern;
|
|
38
|
+
if (fileName === patternFileName)
|
|
39
|
+
return true;
|
|
40
|
+
return false;
|
|
41
|
+
}
+function getFileTypePriority(filePath, question, mentionedFiles) {
+    const lowerPath = filePath.toLowerCase();
+    const ext = lowerPath.split('.').pop() || '';
+    const fileName = filePath.split('/').pop()?.toLowerCase() || '';
+    const isMentioned = mentionedFiles.some((mentioned) => {
+        const mentionedLower = mentioned.toLowerCase();
+        return (lowerPath.includes(mentionedLower) ||
+            fileName === mentionedLower ||
+            lowerPath.endsWith(mentionedLower));
+    });
+    if (isMentioned) {
+        return 3.0;
+    }
+    const codeExtensions = ['ts', 'tsx', 'js', 'jsx', 'py', 'go', 'rs', 'java'];
+    if (codeExtensions.includes(ext)) {
+        return 1.3;
+    }
+    // md files - heavy penalty
+    if (ext === 'md' || lowerPath.includes('readme')) {
+        return 0.05;
+    }
+    return 1.0;
+}
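The weighting scheme above can be summarized in a small sketch: explicitly mentioned files get a 3.0 boost, recognized code extensions 1.3, markdown/README chunks a heavy 0.05 penalty, and everything else 1.0. The `question` parameter is dropped here because the shipped function never reads it; otherwise the logic mirrors the diff:

```javascript
// Simplified sketch of getFileTypePriority's weights (question param omitted,
// since it is unused in the shipped code).
function getFileTypePriority(filePath, mentionedFiles) {
  const lowerPath = filePath.toLowerCase();
  const ext = lowerPath.split('.').pop() || '';
  const fileName = lowerPath.split('/').pop() || '';
  const isMentioned = mentionedFiles.some((m) => {
    const ml = m.toLowerCase();
    return lowerPath.includes(ml) || fileName === ml || lowerPath.endsWith(ml);
  });
  if (isMentioned) return 3.0;                 // user asked about this file
  if (['ts', 'tsx', 'js', 'jsx', 'py', 'go', 'rs', 'java'].includes(ext)) return 1.3;
  if (ext === 'md' || lowerPath.includes('readme')) return 0.05; // docs penalty
  return 1.0;
}

console.log(getFileTypePriority('src/agent.ts', []));         // 1.3
console.log(getFileTypePriority('README.md', []));            // 0.05
console.log(getFileTypePriority('README.md', ['readme.md'])); // 3.0 (mention beats penalty)
```

Note the ordering: a mention check runs first, so asking about the README overrides the markdown penalty.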
 async function retrieveContext(question, config) {
     const embedder = await (0, embeddings_1.createEmbedder)(config);
     const [qvec] = await embedder.embed([question]);
+    const mentionedFiles = extractFileMentions(question);
     const store = new db_1.VectorStore(config.dbPath);
     store.init();
-
+    const directFileResults = [];
+    if (mentionedFiles.length > 0) {
+        for (const mentionedFile of mentionedFiles) {
+            const chunks = store.getChunksByFilePath(mentionedFile, 5);
+            const matchingChunks = chunks.filter((r) => matchesFilePattern(r.filePath, mentionedFile));
+            directFileResults.push(...matchingChunks);
+        }
+    }
+    const vectorResults = store.search(qvec, config.topK * 4);
+    // Combine direct file lookups with vector search results
+    // map to deduplicate by file path and line range
+    const resultMap = new Map();
+    directFileResults.forEach((result) => {
+        const key = `${result.filePath}:${result.startLine}-${result.endLine}`;
+        if (!resultMap.has(key)) {
+            resultMap.set(key, {
+                ...result,
+                score: result.score * 3.0,
+            });
+        }
+    });
+    vectorResults.forEach((result) => {
+        const key = `${result.filePath}:${result.startLine}-${result.endLine}`;
+        const existing = resultMap.get(key);
+        if (existing) {
+            const newScore = result.score * getFileTypePriority(result.filePath, question, mentionedFiles);
+            if (newScore > existing.score) {
+                resultMap.set(key, { ...result, score: newScore });
+            }
+        }
+        else {
+            resultMap.set(key, {
+                ...result,
+                score: result.score * getFileTypePriority(result.filePath, question, mentionedFiles),
+            });
+        }
+    });
+    const allResults = Array.from(resultMap.values());
+    allResults.sort((a, b) => b.score - a.score);
+    const mentionedMarkdownFiles = mentionedFiles.filter((f) => f.toLowerCase().endsWith('.md') || f.toLowerCase().includes('readme'));
+    const codeResults = allResults.filter((r) => {
+        const ext = r.filePath.toLowerCase().split('.').pop() || '';
+        return !['md', 'txt'].includes(ext) && !r.filePath.toLowerCase().includes('readme');
+    });
+    const mentionedMarkdownResults = mentionedMarkdownFiles.length > 0
+        ? allResults.filter((r) => {
+            const ext = r.filePath.toLowerCase().split('.').pop() || '';
+            const isMarkdown = ext === 'md' || r.filePath.toLowerCase().includes('readme');
+            if (!isMarkdown)
+                return false;
+            return mentionedFiles.some((mentioned) => matchesFilePattern(r.filePath, mentioned));
+        })
+        : [];
+    if (mentionedMarkdownResults.length > 0) {
+        const combined = [...codeResults, ...mentionedMarkdownResults];
+        // remove duplicates
+        const uniqueResults = Array.from(new Map(combined.map((r) => [`${r.filePath}:${r.startLine}-${r.endLine}`, r])).values());
+        return uniqueResults.slice(0, config.topK);
+    }
+    if (codeResults.length >= Math.ceil(config.topK / 2)) {
+        return codeResults.slice(0, config.topK);
+    }
+    return allResults.slice(0, config.topK);
 }
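The core of the new `retrieveContext` is a dedup-and-boost merge: direct file hits and vector hits are keyed by `path:start-end` in a `Map`, and when the same chunk arrives from both sources the higher score wins. A minimal sketch of that pattern (the `mergeResults` helper name and sample data are illustrative, not part of the package):

```javascript
// Sketch of the dedup-and-boost merge in retrieveContext. Direct hits get a
// flat 3.0 boost; vector hits are scaled by a per-file priority; duplicates
// keep the higher score.
function mergeResults(directHits, vectorHits, priorityOf) {
  const resultMap = new Map();
  for (const r of directHits) {
    const key = `${r.filePath}:${r.startLine}-${r.endLine}`;
    if (!resultMap.has(key)) resultMap.set(key, { ...r, score: r.score * 3.0 });
  }
  for (const r of vectorHits) {
    const key = `${r.filePath}:${r.startLine}-${r.endLine}`;
    const scored = { ...r, score: r.score * priorityOf(r.filePath) };
    const existing = resultMap.get(key);
    if (!existing || scored.score > existing.score) resultMap.set(key, scored);
  }
  return [...resultMap.values()].sort((a, b) => b.score - a.score);
}

const direct = [{ filePath: 'src/db.ts', startLine: 1, endLine: 40, score: 0.5 }];
const vector = [
  { filePath: 'src/db.ts', startLine: 1, endLine: 40, score: 0.9 },
  { filePath: 'README.md', startLine: 1, endLine: 20, score: 0.9 },
];
const merged = mergeResults(direct, vector, (p) => (p.endsWith('.md') ? 0.05 : 1.3));
// src/db.ts keeps max(0.5 * 3.0, 0.9 * 1.3) = 1.5; README.md drops to 0.045
```

Keying on `path:start-end` rather than path alone means two different chunks from the same file survive as separate results.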
 function formatContext(results) {
-
-
-
-
-
+    const MAX_CHUNK_DISPLAY_LENGTH = 1500;
+    const codeSnippets = [];
+    const docs = [];
+    results.forEach((r) => {
+        const isDoc = r.filePath.toLowerCase().endsWith('.md') || r.filePath.toLowerCase().includes('readme');
+        if (isDoc) {
+            docs.push(r);
+        }
+        else {
+            codeSnippets.push(r);
+        }
+    });
+    const codeSection = codeSnippets
+        .map((r, index) => {
+        let content = r.content || '';
+        if (content.length > MAX_CHUNK_DISPLAY_LENGTH) {
+            content = content.slice(0, MAX_CHUNK_DISPLAY_LENGTH) + '\n... (truncated)';
+        }
+        return `[${index + 1}] CODE FILE: ${r.filePath}:${r.startLine}-${r.endLine}
+CODE_SNIPPET:
+${content}`;
     })
         .join('\n\n---\n\n');
+    const docsSection = docs.length > 0
+        ? '\n\n=== DOCUMENTATION (for reference only, prioritize CODE above) ===\n\n' +
+            docs
+                .map((r, index) => {
+                let content = r.content || '';
+                if (content.length > MAX_CHUNK_DISPLAY_LENGTH) {
+                    content = content.slice(0, MAX_CHUNK_DISPLAY_LENGTH) + '\n... (truncated)';
+                }
+                return `DOC [${index + 1}] FILE: ${r.filePath}:${r.startLine}-${r.endLine}
+DOCUMENTATION:
+${content}`;
+            })
+                .join('\n\n---\n\n')
+        : '';
+    return codeSection + docsSection;
 }
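Before chunks reach the prompt, `formatContext` caps each one at `MAX_CHUNK_DISPLAY_LENGTH` characters and appends a marker. Extracted into a helper (the `truncateChunk` name is illustrative), the behavior is:

```javascript
// Sketch of the per-chunk truncation formatContext applies before building
// the prompt: anything past MAX_CHUNK_DISPLAY_LENGTH characters is cut and
// an explicit marker is appended so the model knows content was dropped.
const MAX_CHUNK_DISPLAY_LENGTH = 1500;

function truncateChunk(content) {
  if (content.length > MAX_CHUNK_DISPLAY_LENGTH) {
    return content.slice(0, MAX_CHUNK_DISPLAY_LENGTH) + '\n... (truncated)';
  }
  return content;
}

const short = truncateChunk('a'.repeat(100));
const long = truncateChunk('a'.repeat(2000));
console.log(short.length);                      // 100 (untouched)
console.log(long.endsWith('... (truncated)'));  // true
```

This keeps any single oversized chunk from crowding the rest of the retrieved context out of the model's window.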
package/dist/utils/formatter.js
CHANGED

@@ -0,0 +1,46 @@
+"use strict";
+var __importDefault = (this && this.__importDefault) || function (mod) {
+    return (mod && mod.__esModule) ? mod : { "default": mod };
+};
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.formatQuestion = formatQuestion;
+exports.formatAnswer = formatAnswer;
+exports.formatStats = formatStats;
+const boxen_1 = __importDefault(require("boxen"));
+const chalk_1 = __importDefault(require("chalk"));
+const cli_highlight_1 = require("cli-highlight");
+const gradient_string_1 = __importDefault(require("gradient-string"));
+function formatQuestion(question) {
+    return (0, boxen_1.default)(gradient_string_1.default.rainbow(question), {
+        title: 'Question',
+        borderColor: 'cyan',
+        padding: 1,
+        margin: { top: 1, bottom: 1 },
+    });
+}
+function formatAnswer(answer) {
+    let formatted = answer.replace(/```(\w+)?\n([\s\S]*?)```/g, (_match, lang, code) => {
+        const highlighted = (0, cli_highlight_1.highlight)(code.trim(), {
+            language: lang || 'typescript',
+            ignoreIllegals: true,
+        });
+        return (chalk_1.default.gray('┌─ Code ───────────────────────────────┐\n') +
+            highlighted +
+            '\n' +
+            chalk_1.default.gray('└──────────────────────────────────────┘'));
+    });
+    // Emphasize file names and line references
+    formatted = formatted
+        .replace(/`([^`]+\.(ts|js|tsx|jsx|py|go|rs))`/g, (_m, file) => chalk_1.default.cyan.underline(file))
+        .replace(/line (\d+)/gi, (_m, num) => chalk_1.default.yellow(`line ${num}`));
+    return formatted;
+}
+function formatStats(stats) {
+    return (0, boxen_1.default)([
+        chalk_1.default.bold('Ingestion complete'),
+        `${chalk_1.default.cyan('Files')}: ${stats.files}`,
+        `${chalk_1.default.cyan('Chunks')}: ${stats.chunks}`,
+        `${chalk_1.default.cyan('Avg chunk')}: ${stats.avgChunkSize.toFixed(1)} lines`,
+        `${chalk_1.default.cyan('Duration')}: ${stats.durationSec.toFixed(1)}s`,
+    ].join('\n'), { borderColor: 'green', padding: 1, margin: 1 });
+}
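The heart of the new `formatAnswer` is the regex that rewrites fenced code blocks in the model's markdown answer. This sketch keeps the same regex but swaps the chalk/cli-highlight styling for a plain box so it runs with no dependencies; the `boxCodeBlocks` name is illustrative:

```javascript
// Sketch of formatAnswer's fenced-code rewrite: the same regex as the
// shipped code, with terminal styling replaced by a plain box.
function boxCodeBlocks(answer) {
  return answer.replace(/```(\w+)?\n([\s\S]*?)```/g, (_match, _lang, code) => {
    return '┌─ Code ─┐\n' + code.trim() + '\n└────────┘';
  });
}

const input = 'Call it like this:\n```js\nconsole.log(1);\n```\nDone.';
console.log(boxCodeBlocks(input));
// Call it like this:
// ┌─ Code ─┐
// console.log(1);
// └────────┘
// Done.
```

The optional `(\w+)?` group captures the fence's language tag, which the shipped code passes to cli-highlight (falling back to `typescript`).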
package/dist/utils/logger.js
CHANGED

@@ -4,10 +4,16 @@ var __importDefault = (this && this.__importDefault) || function (mod) {
 };
 Object.defineProperty(exports, "__esModule", { value: true });
 exports.log = void 0;
+const boxen_1 = __importDefault(require("boxen"));
 const chalk_1 = __importDefault(require("chalk"));
 exports.log = {
-    info: (msg) => console.log(chalk_1.default.cyan('
-    success: (msg) => console.log(chalk_1.default.green('
-    warn: (msg) => console.log(chalk_1.default.yellow('
-    error: (msg) => console.log(chalk_1.default.red('
+    info: (msg) => console.log(chalk_1.default.cyan('ℹ'), msg),
+    success: (msg) => console.log(chalk_1.default.green('✓'), msg),
+    warn: (msg) => console.log(chalk_1.default.yellow('⚠'), msg),
+    error: (msg) => console.log(chalk_1.default.red('✗'), msg),
+    box: (content, title) => console.log((0, boxen_1.default)(content, {
+        title,
+        borderColor: 'cyan',
+        padding: 1,
+    })),
 };
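The logger's new surface is four leveled one-liners plus a `box()` helper. A dependency-free sketch of the same shape (colors omitted, and the hand-rolled box drawing below stands in for boxen, so it is illustrative rather than the shipped rendering):

```javascript
// Minimal dependency-free sketch of the log API added in logger.js:
// leveled one-liners plus a box() helper (boxen replaced by a plain box).
const log = {
  info: (msg) => console.log('ℹ', msg),
  success: (msg) => console.log('✓', msg),
  warn: (msg) => console.log('⚠', msg),
  error: (msg) => console.log('✗', msg),
  box: (content, title) => {
    const width = Math.max(content.length, (title || '').length) + 2;
    console.log(`┌${(title ? ` ${title} ` : '').padEnd(width, '─')}┐`);
    console.log(`│ ${content.padEnd(width - 2)} │`);
    console.log(`└${'─'.repeat(width)}┘`);
  },
};

log.info('indexing repo');
log.box('12 files, 340 chunks', 'Ingest');
```

Keeping the API as a plain object literal means call sites only ever import `log`, so adding `box()` is a non-breaking extension.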
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "codexa",
-  "version": "1.0.
+  "version": "1.0.1",
   "description": "CLI agent that indexes local repos and answers questions with hosted or local LLMs.",
   "bin": {
     "codexa": "bin/codexa.js"
@@ -27,9 +27,9 @@
   "license": "MIT",
   "repository": {
     "type": "git",
-    "url": "https://github.com/sahitya-chandra/codexa.git"
+    "url": "git+https://github.com/sahitya-chandra/codexa.git"
   },
-  "homepage": "
+  "homepage": "codexa-neon.vercel.app",
   "bugs": {
     "url": "https://github.com/sahitya-chandra/codexa/issues"
   },
@@ -47,15 +47,22 @@
     "@xenova/transformers": "^2.17.2",
     "ai": "^5.0.105",
     "better-sqlite3": "^9.6.0",
+    "boxen": "^7.1.1",
     "chalk": "^5.3.0",
+    "cli-highlight": "^2.1.11",
+    "cli-progress": "^3.12.0",
     "commander": "^12.1.0",
     "dotenv": "^16.4.5",
     "fs-extra": "^11.2.0",
+    "gradient-string": "^2.0.2",
     "globby": "^13.0.0",
     "ignore": "^5.3.1",
     "node-fetch": "^3.3.2",
     "openai": "^4.73.1",
-    "ora": "^8.1.0"
+    "ora": "^8.1.0",
+    "marked": "^11.2.0",
+    "marked-terminal": "^6.2.0",
+    "table": "^6.8.1"
   },
   "devDependencies": {
     "@eslint/js": "^9.39.1",
@@ -76,4 +83,4 @@
     "typescript-eslint": "^8.47.0",
     "vitest": "^4.0.14"
   }
-}
+}