quackstack 1.0.23 → 1.0.25

package/README.md CHANGED
@@ -4,23 +4,21 @@
4
4
 
5
5
  QuackStack is an interactive CLI tool that indexes your codebase using local AI embeddings and lets you ask questions about it conversationally. Perfect for understanding unfamiliar code, onboarding to new projects, or giving your AI coding assistant persistent context.
6
6
 
7
- ## 🎯 Quack in Action!
8
- Check out the QuackStack Live demo [here](https://courageous-spaniel.clueso.site/share/4f5e6395-8ad8-4d18-8e81-f736a6581a25)!
9
- ## ✨ Features
7
+ [Live Demo](https://courageous-spaniel.clueso.site/share/4f5e6395-8ad8-4d18-8e81-f736a6581a25) | [Documentation](https://quackstack.siddharththakkar.xyz/docs) | [Frontend](https://github.com/woustachemax/quack-web)
10
8
 
11
- * 🚀 **Zero-config** - Just run `quack` in any project directory
12
- * 🧠 **Smart code parsing** - Automatically extracts functions and classes
13
- * 💬 **Interactive REPL** - Ask questions conversationally, stays open until Ctrl+C
14
- * 🔒 **100% Local embeddings** - No API calls for vector generation, your code stays private
15
- * 🤖 **AI-powered answers** - Uses OpenAI, Claude, Gemini, DeepSeek, or Mistral for conversational responses
16
- * 🎯 **Universal AI tool support** - Auto-generate context for Cursor, Windsurf, Cline, Continue, and Aider
17
- * 📦 **Local database** - Your code stays on your infrastructure
18
- * šŸŒ **Multi-language** - Supports JS/TS, Python, Go, Rust, Java, C/C++, C#, Ruby, PHP, Swift, Kotlin, and more
9
+ ## Features
19
10
 
11
+ * **Zero-config** - Just run `quack` in any project directory
12
+ * **Smart code parsing** - Automatically extracts functions and classes
13
+ * **Interactive REPL** - Ask questions conversationally, stays open until Ctrl+C
14
+ * **100% Local embeddings** - No API calls for vector generation, your code stays private
15
+ * **AI-powered answers** - Uses OpenAI, Claude, Gemini, DeepSeek, Grok, or Mistral for conversational responses
16
+ * **Git history integration** - Track authorship, commit history, and code ownership
17
+ * **Universal AI tool support** - Auto-generate context for Cursor, Windsurf, Cline, Continue, and Aider
18
+ * **Local database** - Your code stays on your infrastructure
19
+ * **Multi-language** - Supports JS/TS, Python, Go, Rust, Java, C/C++, C#, Ruby, PHP, Swift, Kotlin, and more
20
20
 
21
- ## 💻 Frontend
22
- Check out the Frontend Repo [here](https://github.com/woustachemax/quack-web)
23
- ## 📦 Installation
21
+ ## Installation
24
22
 
25
23
  ### Global Install (Recommended)
26
24
 
@@ -39,7 +37,7 @@ pnpm install
39
37
  pnpm build
40
38
  ```
41
39
 
42
- ## ⚙️ Setup
40
+ ## Setup
43
41
 
44
42
  ### 1. Create `.env` in your project root
45
43
 
@@ -50,7 +48,7 @@ QUACKSTACK_DATABASE_URL=postgresql://user:pass@host:port/dbname
50
48
  # REQUIRED: Choose ONE AI provider for conversational answers
51
49
  # (Embeddings are computed locally - no API calls!)
52
50
 
53
- # Option 1: OpenAI (RECOMMENDED)
51
+ # Option 1: OpenAI
54
52
  QUACKSTACK_OPENAI_KEY=sk-...
55
53
 
56
54
  # Option 2: Anthropic Claude
@@ -59,10 +57,13 @@ QUACKSTACK_ANTHROPIC_KEY=sk-ant-...
59
57
  # Option 3: Google Gemini (has free tier!)
60
58
  QUACKSTACK_GEMINI_KEY=AIza...
61
59
 
62
- # Option 4: DeepSeek (cheapest option)
60
+ # Option 4: xAI Grok
61
+ QUACKSTACK_GROK_KEY=xai-...
62
+
63
+ # Option 5: DeepSeek (cheapest option)
63
64
  QUACKSTACK_DEEPSEEK_KEY=sk-...
64
65
 
65
- # Option 5: Mistral AI
66
+ # Option 6: Mistral AI
66
67
  QUACKSTACK_MISTRAL_KEY=...
67
68
  ```
68
69
 
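The `.env` options above imply that the CLI checks which provider keys are present. A minimal sketch of that check follows, assuming nothing about QuackStack's real `ai-provider.js`: the variable names come from the README, while `PROVIDER_ENV_VARS` and `configuredProviders` are hypothetical names used only for illustration.

```javascript
// Environment variable names as listed in the README's .env example.
const PROVIDER_ENV_VARS = {
  openai: "QUACKSTACK_OPENAI_KEY",
  anthropic: "QUACKSTACK_ANTHROPIC_KEY",
  gemini: "QUACKSTACK_GEMINI_KEY",
  grok: "QUACKSTACK_GROK_KEY",
  deepseek: "QUACKSTACK_DEEPSEEK_KEY",
  mistral: "QUACKSTACK_MISTRAL_KEY",
};

// Hypothetical helper: returns the providers whose key is set to a
// non-blank string in the given environment object.
function configuredProviders(env = process.env) {
  return Object.entries(PROVIDER_ENV_VARS)
    .filter(([, name]) => typeof env[name] === "string" && env[name].trim() !== "")
    .map(([provider]) => provider);
}
```

Passing the environment as a parameter keeps the helper easy to test; which provider wins when several keys are set is not stated in the README, so the sketch only reports what is configured.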
@@ -73,14 +74,13 @@ npx prisma generate
73
74
  npx prisma db push
74
75
  ```
75
76
 
76
- ## 🚀 Usage
77
+ ## Usage
77
78
 
78
79
  ### Interactive Mode (Default)
79
80
 
80
81
  ```bash
81
82
  quack
82
-
83
- # Answer appears with context
83
+ # Ask questions about your codebase
84
84
  # Press Ctrl+C to exit
85
85
  ```
86
86
 
@@ -95,8 +95,31 @@ quack --context
95
95
  # - Cline (.clinerules)
96
96
  # - Continue (.continue/context.md)
97
97
  # - Aider (.aider.conf.yml)
98
+ ```
99
+
100
+ ### Generate AGENTS.md Configuration
101
+
102
+ ```bash
103
+ quack --agent
104
+
105
+ # Creates agent.md with codebase context
106
+ # for AI agent frameworks
107
+ ```
108
+
109
+ ### Generate README
110
+
111
+ ```bash
112
+ quack --readme
113
+
114
+ # Auto-generates README.md from your codebase
115
+ ```
116
+
117
+ ### Generate Documentation
98
118
 
99
- # Your AI coding assistants automatically read these files!
119
+ ```bash
120
+ quack --docs
121
+
122
+ # Creates CODEBASE.md with architecture overview
100
123
  ```
101
124
 
102
125
  ### Watch Mode (Auto-update Context)
@@ -105,8 +128,21 @@ quack --context
105
128
  quack --watch
106
129
 
107
130
  # Watches for file changes
108
- # Auto-regenerates all context files
109
- # Keep running in background during development
131
+ # Auto-regenerates context files
132
+ ```
133
+
134
+ ### Git History Commands
135
+
136
+ ```bash
137
+ # View contributor statistics
138
+ quack authors
139
+
140
+ # View recently modified files
141
+ quack recent
142
+ quack recent --days 30
143
+
144
+ # View repository information
145
+ quack git-info
110
146
  ```
111
147
 
112
148
  ### Force Reindex
@@ -117,17 +153,27 @@ quack --reindex
117
153
  # Clears old index and re-scans entire codebase
118
154
  ```
119
155
 
120
- ## 📖 Example Session
156
+ ### List Available AI Models
157
+
158
+ ```bash
159
+ quack --list-models
160
+
161
+ # Shows all configured providers and available models
162
+ ```
163
+
164
+ ## Example Session
121
165
 
122
166
  ```bash
123
167
  $ quack
124
- Welcome to QuackStack! 🐄
125
- 🔍 Indexing your codebase (this may take a moment)...
126
- ✅ Indexing complete!
168
+ Welcome to QuackStack!
169
+
170
+ Using: OpenAI - gpt-4o
171
+ Press Ctrl+C to exit
127
172
 
128
- 💡 Tip: Press Ctrl+C to exit
173
+ Indexing your codebase...
174
+ Indexing complete
129
175
 
130
- 🐄 Quack! How can I help? > how does the search function work?
176
+ quack > how does the search function work?
131
177
 
132
178
  The search function uses local embeddings to convert your query into a vector,
133
179
  compares it against stored code embeddings using cosine similarity, ranks results,
@@ -135,141 +181,153 @@ and feeds the top matches to the AI for a conversational answer.
135
181
 
136
182
  Implementation is in src/commands/search.ts
137
183
 
138
- 💡 Want more details? (y/n) > y
184
+ Want more details? (y/n) > n
139
185
 
140
- 📚 Relevant Code:
186
+ quack > who wrote the authentication system?
141
187
 
142
- [1] src/commands/search.ts (relevance: 87.3%)
143
- export async function search(query: string, projectName: string) {
144
- const snippets = await client.codeSnippet.findMany({
145
- where: { projectName },
146
- });
147
- // ... cosine similarity ranking ...
148
- }
149
-
150
- 🐄 Quack! How can I help? > where are embeddings generated?
151
-
152
- Embeddings are generated locally using the local-embeddings module.
153
- No API calls are made for vector generation, keeping your code private.
154
-
155
- 💡 Want more details? (y/n) > n
188
+ The authentication system was primarily written by Siddharth Thakkar, with the
189
+ main implementation in app/api/auth/[...nextauth]/options.ts (last modified 187 days ago).
156
190
 
157
- 🐄 Quack! How can I help? > ^C
158
- 👋 Happy coding!
191
+ quack > ^C
192
+ Happy coding!
159
193
  ```
160
194
 
161
- ## 🛠️ How It Works
195
+ ## How It Works
162
196
 
163
197
  1. **Scanning** - Finds all code files (ignoring `node_modules`, `.git`, etc.)
164
- 2. **Parsing** - Uses AST parsing to extract functions/classes from JS/TS
198
+ 2. **Parsing** - Uses AST parsing to extract functions/classes
165
199
  3. **Chunking** - Breaks code into logical chunks
166
- 4. **Local Embedding** - Generates vector embeddings **locally** (no API calls!)
167
- 5. **Storage** - Saves to your PostgreSQL/Neon database
168
- 6. **Search** - Semantic search using cosine similarity + AI-powered conversational answers
169
-
170
- ## 🎯 Use Cases
171
-
172
- - **Context switching** - Quickly understand projects you haven't touched in months
173
- - **Onboarding** - New team members can ask questions instead of reading docs
174
- - **Code archaeology** - Find implementations without grepping
175
- - **AI coding assistants** - Give Cursor/Windsurf/Cline/Continue/Aider persistent codebase context
176
- - **Documentation** - Auto-generate explanations of how things work
177
- - **Privacy-focused** - All embeddings generated locally, no code sent to embedding APIs
200
+ 4. **Local Embedding** - Generates vector embeddings locally (no API calls)
201
+ 5. **Git Enrichment** - Extracts commit history, authorship, and ownership data
202
+ 6. **Storage** - Saves to your PostgreSQL database
203
+ 7. **Search** - Semantic search using cosine similarity + AI-powered conversational answers
178
204
 
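The search step above (cosine similarity over stored embeddings) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not QuackStack's actual search code: `cosineSimilarity`, `rankSnippets`, and the in-memory snippet shape (`{ filePath, embedding }`) are hypothetical.

```javascript
// Hypothetical helper: cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: score every snippet against the query vector
// and keep the topK highest-scoring ones.
function rankSnippets(queryVector, snippets, topK = 5) {
  return snippets
    .map((snippet) => ({
      ...snippet,
      score: cosineSimilarity(queryVector, snippet.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

In this sketch the top-ranked snippets would then be passed to the configured AI provider as context for the conversational answer.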
179
- ## 📋 Commands Reference
205
+ ## Commands Reference
180
206
 
181
207
  | Command | Description |
182
208
  |---------|-------------|
183
- | `quack` | Start interactive REPL (auto-indexes first time) |
184
- | `quack --context` | Generate context files for ALL AI coding tools |
209
+ | `quack` | Start interactive REPL |
210
+ | `quack --context` | Generate context files for all AI coding tools |
211
+ | `quack --agent` | Generate AGENTS.md configuration |
212
+ | `quack --readme` | Generate README.md from codebase |
213
+ | `quack --docs` | Generate CODEBASE.md documentation |
185
214
  | `quack --watch` | Watch mode - auto-update context on file changes |
186
215
  | `quack --reindex` | Force reindex the entire codebase |
187
- | `quack --cursor` | [DEPRECATED] Use `--context` instead |
216
+ | `quack --list-models` | Show available AI providers and models |
217
+ | `quack authors` | View contributor statistics |
218
+ | `quack recent [--days N]` | View recently modified files |
219
+ | `quack git-info` | View repository information |
188
220
 
189
- ## 🔑 Supported AI Providers
221
+ ## Supported AI Providers
190
222
 
191
223
  | Provider | Used For | Cost | Privacy | Setup |
192
224
  |----------|----------|------|---------|-------|
193
- | **Local** | Embeddings | FREE | 🔒 100% Private | Built-in |
225
+ | **Local** | Embeddings | FREE | 100% Private | Built-in |
194
226
  | OpenAI | Chat answers | $$ | Query only | [Get key](https://platform.openai.com/api-keys) |
195
227
  | Anthropic | Chat answers | $$$ | Query only | [Get key](https://console.anthropic.com/) |
196
228
  | Gemini | Chat answers | FREE | Query only | [Get key](https://aistudio.google.com/app/apikey) |
229
+ | xAI Grok | Chat answers | $$ | Query only | [Get key](https://x.ai/) |
197
230
  | DeepSeek | Chat answers | $ | Query only | [Get key](https://platform.deepseek.com/) |
198
231
  | Mistral | Chat answers | $$ | Query only | [Get key](https://console.mistral.ai/) |
199
232
 
200
- **Privacy Note:** QuackStack generates embeddings **locally** on your machine. Only your natural language queries and retrieved code context are sent to the AI provider for generating conversational answers. Your entire codebase is never sent to any API.
233
+ **Privacy Note:** QuackStack generates embeddings locally on your machine. Only your natural language queries and retrieved code context are sent to the AI provider for generating conversational answers. Your entire codebase is never sent to any API.
201
234
 
202
- ## 🗄️ Database Schema
235
+ ## Database Schema
203
236
 
204
237
  ```prisma
205
238
  model codeSnippet {
206
- id Int @id @default(autoincrement())
207
- content String
208
- embedding Json // Stored as JSON array of numbers
209
- filePath String
210
- projectName String
211
- language String?
212
- functionName String?
213
- lineStart Int?
214
- lineEnd Int?
215
- createdAt DateTime @default(now())
216
- updatedAt DateTime @updatedAt
239
+ id Int @id @default(autoincrement())
240
+ content String
241
+ embedding Json
242
+ filePath String
243
+ projectName String
244
+ language String?
245
+ functionName String?
246
+ lineStart Int?
247
+ lineEnd Int?
248
+
249
+ lastCommitHash String?
250
+ lastCommitAuthor String?
251
+ lastCommitEmail String?
252
+ lastCommitDate DateTime?
253
+ lastCommitMessage String?
254
+ totalCommits Int? @default(0)
255
+ primaryAuthor String?
256
+ primaryAuthorEmail String?
257
+ fileOwnerCommits Int? @default(0)
258
+
259
+ createdAt DateTime @default(now())
260
+ updatedAt DateTime @updatedAt
217
261
 
218
262
  @@index([projectName])
263
+ @@index([lastCommitDate])
264
+ @@index([primaryAuthor])
219
265
  }
220
- ```
221
266
 
222
- Each project is isolated by `projectName` (uses current directory name).
267
+ model gitAuthor {
268
+ id Int @id @default(autoincrement())
269
+ projectName String
270
+ author String
271
+ email String
272
+ totalCommits Int @default(0)
273
+ linesAdded Int @default(0)
274
+ linesRemoved Int @default(0)
275
+ recentActivity DateTime?
276
+ filesOwned String[]
277
+
278
+ createdAt DateTime @default(now())
279
+ updatedAt DateTime @updatedAt
280
+
281
+ @@unique([projectName, email])
282
+ @@index([projectName])
283
+ @@index([recentActivity])
284
+ }
285
+ ```
223
286
 
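The ownership fields in the schema above (`totalCommits`, `primaryAuthor`, `primaryAuthorEmail`, `fileOwnerCommits`) could plausibly be derived from a file's commit list as sketched below. The package's `git-history.js` is not part of this diff, so this is only an assumption about one reasonable approach; `summarizeFileHistory` and the `{ author, email }` commit shape are hypothetical.

```javascript
// Hypothetical helper: given a file's commits (newest first), count commits
// per author email and report the most frequent committer as the owner.
function summarizeFileHistory(commits) {
  const byEmail = new Map();
  for (const { author, email } of commits) {
    const entry = byEmail.get(email) ?? { author, email, commitCount: 0 };
    entry.commitCount += 1;
    byEmail.set(email, entry);
  }
  // Sort authors by commit count, highest first.
  const primaryAuthors = [...byEmail.values()].sort(
    (a, b) => b.commitCount - a.commitCount
  );
  const primary = primaryAuthors[0];
  return {
    totalCommits: commits.length,
    primaryAuthor: primary?.author,
    primaryAuthorEmail: primary?.email,
    fileOwnerCommits: primary?.commitCount,
  };
}
```

Grouping by email instead of display name avoids splitting one contributor across several name spellings, which matches the schema's `@@unique([projectName, email])` constraint on `gitAuthor`.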
224
- ## šŸŒ Supported Languages
287
+ ## Supported Languages
225
288
 
226
289
  JavaScript, TypeScript, Python, Go, Rust, Java, C, C++, C#, Ruby, PHP, Swift, Kotlin, Scala, R, Vue, Svelte
227
290
 
228
- ## 🎓 Development
291
+ ## Use Cases
292
+
293
+ - **Context switching** - Quickly understand projects you haven't touched in months
294
+ - **Onboarding** - New team members can ask questions instead of reading docs
295
+ - **Code archaeology** - Find implementations without grepping
296
+ - **Code ownership** - Identify who wrote and maintains specific parts of the codebase
297
+ - **AI coding assistants** - Give Cursor/Windsurf/Cline/Continue/Aider persistent codebase context
298
+ - **Documentation** - Auto-generate explanations of how things work
299
+ - **Privacy-focused** - All embeddings generated locally, no code sent to embedding APIs
300
+
301
+ ## Development
229
302
 
230
303
  ```bash
231
304
  git clone https://github.com/woustachemax/quackstack.git
232
305
  cd quackstack
233
306
  pnpm install
234
-
235
307
  pnpm build
236
308
 
237
309
  # Run locally
238
310
  node dist/cli.cjs
239
- node dist/cli.cjs --context
240
- node dist/cli.cjs --watch
241
311
  ```
242
312
 
243
- ## 🗺️ Roadmap
244
-
245
- - [x] Local embeddings (no API calls!)
246
- - [x] Support for all major AI coding assistants
247
- - [ ] VS Code extension
248
- - [ ] Official Cursor plugin
249
- - [ ] Export Q&A sessions as markdown docs
250
- - [ ] Add filtering by file type, date range, author
251
- - [ ] Support for code diffs and change tracking
252
- - [ ] Team collaboration features
253
- - [ ] Self-hosted web UI
254
-
255
- ## 🤝 Contributing
313
+ ## Contributing
256
314
 
257
315
  Contributions welcome! Feel free to:
258
316
  - Report bugs via [GitHub Issues](https://github.com/woustachemax/quackstack/issues)
259
317
  - Submit feature requests
260
318
  - Open pull requests
261
319
 
262
- ## 📄 License
320
+ ## License
263
321
 
264
322
  MIT
265
323
 
266
- ## 💡 Pro Tips
324
+ ## Pro Tips
267
325
 
268
326
  **Privacy First**: Embeddings are generated locally - your code never leaves your machine during indexing.
269
327
 
270
328
  **Gemini Free Tier**: Start with Google Gemini for chat responses - it's free and works great for most use cases.
271
329
 
272
- **Universal Context**: Run `quack --context` once to generate context files for ALL major AI coding tools at once.
330
+ **Universal Context**: Run `quack --context` once to generate context files for all major AI coding tools at once.
273
331
 
274
332
  **Background Watcher**: Run `quack --watch &` in the background to keep context always fresh across all your AI tools.
275
333
 
@@ -277,4 +335,6 @@ MIT
277
335
 
278
336
  **Large Codebases**: First index might take a few minutes. After that, only changed files are re-indexed.
279
337
 
280
- **No Vendor Lock-in**: Unlike other tools, QuackStack works with Cursor, Windsurf, Cline, Continue, and Aider - choose your favorite!
338
+ **Git Integration**: QuackStack automatically enriches your codebase with git history - no setup required. Track authorship, view recent changes, and understand code ownership.
339
+
340
+ **No Vendor Lock-in**: Unlike other tools, QuackStack works with Cursor, Windsurf, Cline, Continue, and Aider - choose your favorite!
package/dist/cli.cjs CHANGED
@@ -12,6 +12,8 @@ const context_generator_js_1 = require("./lib/context-generator.js");
12
12
  const readme_js_1 = require("./commands/readme.js");
13
13
  const agents_js_1 = require("./commands/agents.js");
14
14
  const ai_provider_js_1 = require("./lib/ai-provider.js");
15
+ const database_js_1 = require("./lib/database.js");
16
+ const git_history_js_1 = require("./lib/git-history.js");
15
17
  const path_1 = __importDefault(require("path"));
16
18
  const program = new commander_1.Command();
17
19
  const PROJECT_NAME = path_1.default.basename(process.cwd());
@@ -85,4 +87,84 @@ program
85
87
  }
86
88
  await (0, repl_js_1.startREPL)(options.reindex, options.provider, options.model);
87
89
  });
90
+ program
91
+ .command("authors")
92
+ .description("Show contributor statistics for this project")
93
+ .action(async () => {
94
+ if (!git_history_js_1.gitHistory.isRepository()) {
95
+ console.log(chalk_1.default.red("❌ Not a git repository"));
96
+ process.exit(1);
97
+ }
98
+ console.log(chalk_1.default.cyan("\n📊 Contributor Statistics\n"));
99
+ const authors = await (0, database_js_1.getProjectAuthors)(PROJECT_NAME);
100
+ if (authors.length === 0) {
101
+ console.log(chalk_1.default.yellow("No contributor data found. Run 'quack --reindex' first."));
102
+ process.exit(0);
103
+ }
104
+ authors.forEach((author, i) => {
105
+ console.log(chalk_1.default.green(`${i + 1}. ${author.author}`) + chalk_1.default.gray(` (${author.email})`));
106
+ console.log(chalk_1.default.white(` ${author.totalCommits} commits | +${author.linesAdded}/-${author.linesRemoved} lines`));
107
+ if (author.recentActivity) {
108
+ const daysAgo = Math.floor((Date.now() - author.recentActivity.getTime()) / (1000 * 60 * 60 * 24));
109
+ console.log(chalk_1.default.gray(` Last active ${daysAgo} days ago`));
110
+ }
111
+ if (author.filesOwned.length > 0) {
112
+ console.log(chalk_1.default.gray(` Owns ${author.filesOwned.length} files`));
113
+ }
114
+ console.log();
115
+ });
116
+ console.log(chalk_1.default.cyan(`Total: ${authors.length} contributors\n`));
117
+ });
118
+ program
119
+ .command("recent")
120
+ .description("Show recently modified files")
121
+ .option("-d, --days <number>", "Number of days to look back", "7")
122
+ .action(async (options) => {
123
+ if (!git_history_js_1.gitHistory.isRepository()) {
124
+ console.log(chalk_1.default.red("❌ Not a git repository"));
125
+ process.exit(1);
126
+ }
127
+ const days = parseInt(options.days);
128
+ console.log(chalk_1.default.cyan(`\nšŸ“ Files modified in last ${days} days\n`));
129
+ const files = await (0, database_js_1.getRecentlyModifiedFiles)(PROJECT_NAME, days);
130
+ if (files.length === 0) {
131
+ console.log(chalk_1.default.yellow(`No files modified in last ${days} days (or run 'quack --reindex')`));
132
+ process.exit(0);
133
+ }
134
+ files.forEach((file, i) => {
135
+ const daysAgo = Math.floor((Date.now() - file.lastCommitDate.getTime()) / (1000 * 60 * 60 * 24));
136
+ console.log(chalk_1.default.green(`${i + 1}. ${file.filePath}`));
137
+ console.log(chalk_1.default.gray(` Modified by ${file.lastCommitAuthor} ${daysAgo} days ago`));
138
+ if (file.lastCommitMessage) {
139
+ console.log(chalk_1.default.white(` "${file.lastCommitMessage.substring(0, 60)}${file.lastCommitMessage.length > 60 ? '...' : ''}"`));
140
+ }
141
+ console.log();
142
+ });
143
+ console.log(chalk_1.default.cyan(`Total: ${files.length} files\n`));
144
+ });
145
+ program
146
+ .command("git-info")
147
+ .description("Show git repository information")
148
+ .action(() => {
149
+ if (!git_history_js_1.gitHistory.isRepository()) {
150
+ console.log(chalk_1.default.red("❌ Not a git repository"));
151
+ process.exit(1);
152
+ }
153
+ console.log(chalk_1.default.cyan("\nšŸ” Git Repository Info\n"));
154
+ const branch = git_history_js_1.gitHistory.getCurrentBranch();
155
+ if (branch) {
156
+ console.log(chalk_1.default.white(`Current Branch: `) + chalk_1.default.green(branch));
157
+ }
158
+ const repoRoot = git_history_js_1.gitHistory.getRepositoryRoot();
159
+ console.log(chalk_1.default.white(`Repository Root: `) + chalk_1.default.gray(repoRoot));
160
+ console.log(chalk_1.default.cyan("\n📈 Recent Commits:\n"));
161
+ const commits = git_history_js_1.gitHistory.getRecentCommits(10);
162
+ commits.slice(0, 5).forEach((commit, i) => {
163
+ const date = new Date(commit.date).toLocaleDateString();
164
+ console.log(chalk_1.default.green(`${i + 1}. ${commit.author}`) + chalk_1.default.gray(` (${date})`));
165
+ console.log(chalk_1.default.white(` ${commit.message.substring(0, 70)}${commit.message.length > 70 ? '...' : ''}`));
166
+ console.log(chalk_1.default.gray(` ${commit.filesChanged.length} files changed`));
167
+ console.log();
168
+ });
169
+ });
88
170
  program.parse();
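The `recent` command added above passes `options.days` through `parseInt` without a radix or a `NaN` check, and both `authors` and `recent` repeat the same days-ago arithmetic and message truncation inline. A minimal sketch of how these could be factored into small helpers follows; `parseDays`, `daysAgo`, and `truncate` are hypothetical names and are not part of the package.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical helper: parse a --days value, falling back to a default
// when the input is not a positive integer (e.g. "abc", "0", "-3").
function parseDays(value, fallback = 7) {
  const days = Number.parseInt(value, 10);
  return Number.isInteger(days) && days > 0 ? days : fallback;
}

// Hypothetical helper: whole days elapsed between a Date and "now"
// (a millisecond timestamp, injectable for testing).
function daysAgo(date, now = Date.now()) {
  return Math.floor((now - date.getTime()) / MS_PER_DAY);
}

// Hypothetical helper: shorten text to at most `max` characters,
// appending "..." only when something was cut off.
function truncate(text, max) {
  return text.length > max ? `${text.substring(0, max)}...` : text;
}
```

With helpers like these, a non-numeric `--days` value would fall back to the default of 7 rather than propagating `NaN` into the query and the console output.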
@@ -2,14 +2,20 @@ import fs from "fs";
2
2
  import path from "path";
3
3
  import { scanDir } from "../lib/scanner.js";
4
4
  import { chunkCode } from "../lib/chunker.js";
5
- import { saveToDB } from "../lib/database.js";
5
+ import { saveToDB, saveAuthorToDB } from "../lib/database.js";
6
6
  import { localEmbeddings } from "../lib/local-embeddings.js";
7
- export async function ingest(rootDir, projectName, silent = false) {
7
+ import { gitHistory, initGitHistory } from "../lib/git-history.js";
8
+ export async function ingest(rootDir, projectName, silent = false, includeGitHistory = true) {
9
+ initGitHistory(rootDir);
8
10
  if (!silent)
9
11
  console.log("Starting ingestion...");
10
12
  const files = await scanDir(rootDir);
11
13
  if (!silent)
12
14
  console.log(`Found ${files.length} files to process`);
15
+ const isGitRepo = gitHistory.isRepository();
16
+ if (isGitRepo && includeGitHistory && !silent) {
17
+ console.log(`Git repository detected - enriching with history data`);
18
+ }
13
19
  const allChunks = [];
14
20
  for (const filePath of files) {
15
21
  try {
@@ -31,10 +37,38 @@ export async function ingest(rootDir, projectName, silent = false) {
31
37
  console.log(`Saving to database...`);
32
38
  const BATCH_SIZE = 50;
33
39
  let processedCount = 0;
40
+ const fileGitData = new Map();
41
+ if (isGitRepo && includeGitHistory) {
42
+ for (const { filePath } of allChunks) {
43
+ if (!fileGitData.has(filePath)) {
44
+ const history = gitHistory.getFileHistory(filePath, 100);
45
+ fileGitData.set(filePath, history);
46
+ }
47
+ }
48
+ }
34
49
  for (let i = 0; i < allChunks.length; i += BATCH_SIZE) {
35
50
  const batch = allChunks.slice(i, i + BATCH_SIZE);
36
51
  await Promise.all(batch.map(async ({ content, filePath, chunk }) => {
37
52
  const embedding = localEmbeddings.getVector(content);
53
+ let gitMetadata = {};
54
+ if (isGitRepo && includeGitHistory) {
55
+ const history = fileGitData.get(filePath);
56
+ if (history && history.commits.length > 0) {
57
+ const lastCommit = history.commits[0];
58
+ const primaryAuthor = history.primaryAuthors[0];
59
+ gitMetadata = {
60
+ lastCommitHash: lastCommit.hash,
61
+ lastCommitAuthor: lastCommit.author,
62
+ lastCommitEmail: lastCommit.email,
63
+ lastCommitDate: lastCommit.date,
64
+ lastCommitMessage: lastCommit.message,
65
+ totalCommits: history.totalCommits,
66
+ primaryAuthor: primaryAuthor?.author,
67
+ primaryAuthorEmail: primaryAuthor?.email,
68
+ fileOwnerCommits: primaryAuthor?.commitCount,
69
+ };
70
+ }
71
+ }
38
72
  await saveToDB({
39
73
  content,
40
74
  embedding,
@@ -44,6 +78,7 @@ export async function ingest(rootDir, projectName, silent = false) {
44
78
  functionName: chunk.functionName,
45
79
  lineStart: chunk.lineStart,
46
80
  lineEnd: chunk.lineEnd,
81
+ ...gitMetadata,
47
82
  });
48
83
  }));
49
84
  processedCount += batch.length;
@@ -51,6 +86,35 @@ export async function ingest(rootDir, projectName, silent = false) {
51
86
  console.log(`Saved ${processedCount}/${allChunks.length} chunks...`);
52
87
  }
53
88
  }
54
- if (!silent)
89
+ if (isGitRepo && includeGitHistory && !silent) {
90
+ console.log("📈 Computing author statistics...");
91
+ const authorStats = gitHistory.getAuthorStats();
92
+ console.log(`Found ${authorStats.length} authors`);
93
+ for (const stats of authorStats) {
94
+ const ownedFiles = [];
95
+ fileGitData.forEach((history, filePath) => {
96
+ if (history?.primaryAuthors[0]?.email === stats.email) {
97
+ ownedFiles.push(path.relative(gitHistory.getRepositoryRoot(), filePath));
98
+ }
99
+ });
100
+ await saveAuthorToDB({
101
+ projectName,
102
+ author: stats.author,
103
+ email: stats.email,
104
+ totalCommits: stats.totalCommits,
105
+ linesAdded: stats.linesAdded,
106
+ linesRemoved: stats.linesRemoved,
107
+ recentActivity: stats.recentActivity,
108
+ filesOwned: ownedFiles,
109
+ });
110
+ }
111
+ if (!silent)
112
+ console.log(`✅ Stored stats for ${authorStats.length} contributors`);
113
+ }
114
+ if (!silent) {
55
115
  console.log(`Done! Processed ${processedCount} chunks from ${files.length} files.`);
116
+ if (isGitRepo && includeGitHistory) {
117
+ console.log(`Git history enrichment complete`);
118
+ }
119
+ }
56
120
  }
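The ingest changes above rely on two patterns: a per-file cache (`fileGitData`) so git history is fetched once per file and not once per chunk, and fixed-size batches (`BATCH_SIZE = 50`) saved with `Promise.all`. The sketch below isolates both patterns under stated assumptions; `buildFileCache`, `processInBatches`, and the `loadHistory` / `worker` callbacks are hypothetical stand-ins, not functions from the package.

```javascript
// Hypothetical helper: call loadHistory once per distinct file path,
// even when many chunks share the same file.
function buildFileCache(filePaths, loadHistory) {
  const cache = new Map();
  for (const filePath of filePaths) {
    if (!cache.has(filePath)) {
      cache.set(filePath, loadHistory(filePath));
    }
  }
  return cache;
}

// Hypothetical helper: run an async worker over items in batches,
// waiting for each batch to finish before starting the next.
async function processInBatches(items, batchSize, worker) {
  let processed = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    await Promise.all(batch.map((item) => worker(item)));
    processed += batch.length;
  }
  return processed;
}
```

Batching bounds the number of concurrent database writes to the batch size, while the cache keeps the number of git lookups proportional to the number of files and not the number of chunks.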