commit-analyzer 1.1.3 → 1.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,43 +1,29 @@
1
1
  # Git Commit Analyzer
2
2
 
3
- A TypeScript/Node.js program that analyzes git commits and generates categorized summaries using Claude CLI.
3
+ A TypeScript/Node.js program that analyzes git commits and generates categorized
4
+ summaries using Claude CLI.
4
5
 
5
6
  ## Features
6
7
 
7
8
  - Extract commit details (message, date, diff) from git repositories
8
- - Categorize commits using LLM analysis into: `tweak`, `feature`, or `process`
9
- - Generate CSV reports with year, category, summary, and description
10
- - Generate condensed markdown reports from CSV data for stakeholder communication
11
- - Support for multiple LLM models (Claude, Gemini, Codex) with automatic detection
9
+ - Categorize commits using LLM analysis into:
10
+ `tweak`, `feature`, or `process`
11
+ - Generate CSV reports with timestamp, category, summary, and description
12
+ - Generate condensed markdown reports from CSV data for stakeholder
13
+ communication
14
+ - Support for multiple LLM models (Claude, Gemini, OpenAI) with automatic
15
+ detection
12
16
  - Support for batch processing multiple commits
13
17
  - Automatically filters out merge commits for cleaner analysis
14
18
  - Robust error handling and validation
15
19
 
16
- ## Prerequisites
17
-
18
- This tool requires Bun runtime. Install it globally:
19
-
20
- ```bash
21
- # Install bun globally
22
- curl -fsSL https://bun.sh/install | bash
23
- # or
24
- npm install -g bun
25
- ```
26
-
27
- ## Installation
28
-
29
- ```bash
30
- npm install
31
- bun link
32
- ```
33
-
34
- After linking, you can use `commit-analyzer` command globally.
35
20
 
36
21
  ## Usage
37
22
 
38
23
  ### Default Behavior
39
24
 
40
- When run without arguments, the program analyzes all commits authored by the current user:
25
+ When run without arguments, the program analyzes all commits authored by the
26
+ current user:
41
27
 
42
28
  ```bash
43
29
  # Analyze all your commits in the current repository
@@ -56,9 +42,6 @@ npx commit-analyzer --author user@example.com
56
42
  # Analyze specific commits
57
43
  npx commit-analyzer abc123 def456 ghi789
58
44
 
59
- # Read commits from file
60
- npx commit-analyzer --file commits.txt
61
-
62
45
  # Specify output file with default behavior
63
46
  npx commit-analyzer --output analysis.csv --limit 20
64
47
 
@@ -69,32 +52,46 @@ npx commit-analyzer --report --input-csv analysis.csv
69
52
  npx commit-analyzer --report --limit 50
70
53
 
71
54
  # Use specific LLM model
72
- npx commit-analyzer --model claude --limit 10
55
+ npx commit-analyzer --llm claude --limit 10
73
56
  ```
74
57
 
75
58
  ### Options
76
59
 
77
- - `-o, --output <file>`: Output file (default: `output.csv` for analysis, `summary-report.md` for reports)
78
- - `-f, --file <file>`: Read commit hashes from file (one per line)
79
- - `-a, --author <email>`: Filter commits by author email (defaults to current user)
80
- - `-l, --limit <number>`: Limit number of commits to analyze
81
- - `-m, --model <model>`: LLM model to use (claude, gemini, codex)
82
- - `-r, --resume`: Resume from last checkpoint if available
83
- - `-c, --clear`: Clear any existing progress checkpoint
84
- - `--report`: Generate condensed markdown report from existing CSV
85
- - `--input-csv <file>`: Input CSV file to read for report generation
86
- - `-h, --help`: Display help
87
- - `-V, --version`: Display version
88
-
89
- ### Input File Format
90
-
91
- When using `--file`, create a text file with one commit hash per line:
92
-
93
- ```
94
- abc123def456
95
- def456ghi789
96
- ghi789jkl012
97
- ```
60
+ - `-o, --output <file>`:
61
+ Output file (default:
62
+ `results/commits.csv` for analysis, `results/report.md` for reports)
63
+ - `--output-dir <dir>`:
64
+ Output directory for CSV and report files (default:
65
+ current directory)
66
+ - `-a, --author <email>`:
67
+ Filter commits by author email (defaults to current user)
68
+ - `-l, --limit <number>`:
69
+ Limit number of commits to analyze
70
+ - `--llm <model>`:
71
+ LLM model to use (claude, gemini, openai)
72
+ - `-r, --resume`:
73
+ Resume from last checkpoint if available
74
+ - `-c, --clear`:
75
+ Clear any existing progress checkpoint
76
+ - `--report`:
77
+ Generate condensed markdown report from existing CSV
78
+ - `--input-csv <file>`:
79
+ Input CSV file to read for report generation
80
+ - `-v, --verbose`:
81
+ Enable verbose logging (shows detailed error information)
82
+ - `--since <date>`:
83
+ Only analyze commits since this date (YYYY-MM-DD, '1 week ago', '2024-01-01')
84
+ - `--until <date>`:
85
+ Only analyze commits until this date (YYYY-MM-DD, '1 day ago', '2024-12-31')
86
+ - `--no-cache`:
87
+ Disable caching of analysis results
88
+ - `--batch-size <number>`:
89
+ Number of commits to process per batch (default:
90
+ 1 for sequential processing)
91
+ - `-h, --help`:
92
+ Display help
93
+ - `-V, --version`:
94
+ Display version
98
95
 
99
96
  ## Output Formats
100
97
 
@@ -102,36 +99,46 @@ ghi789jkl012
102
99
 
103
100
  The program generates a CSV file with the following columns:
104
101
 
105
- - `year`: Year of the commit
106
- - `category`: One of `tweak`, `feature`, or `process`
107
- - `summary`: One-line description (max 80 characters)
108
- - `description`: Detailed explanation (2-3 sentences)
102
+ - `timestamp`:
103
+ ISO 8601 timestamp of the commit (e.g., `2025-08-28T11:14:40.000Z`)
104
+ - `category`:
105
+ One of `tweak`, `feature`, or `process`
106
+ - `summary`:
107
+ One-line description (max 80 characters)
108
+ - `description`:
109
+ Detailed explanation (2-3 sentences)
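For reference, here is the row shape these columns imply. A minimal sketch, assuming only the documented fields; the interface and sample values below are illustrative and are not exported by the package:

```typescript
// Illustrative only: the row shape implied by the documented CSV columns.
// The interface and sample values are not part of the package itself.
interface CommitRow {
  timestamp: string // ISO 8601, e.g. "2025-08-28T11:14:40.000Z"
  category: "tweak" | "feature" | "process"
  summary: string // one line, max 80 characters
  description: string // 2-3 sentences
}

const exampleRow: CommitRow = {
  timestamp: "2025-08-28T11:14:40.000Z",
  category: "feature",
  summary: "Add --since/--until date filtering to commit selection",
  description:
    "Introduces flexible date filtering for choosing which commits to analyze. " +
    "Accepts ISO dates as well as relative expressions like '1 week ago'.",
}
```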
109
110
 
110
111
  ### Markdown Report Output
111
112
 
112
- When using the `--report` option, the program generates a condensed markdown report that:
113
+ When using the `--report` option, the program generates a condensed markdown
114
+ report that:
113
115
 
114
116
  - Groups commits by year (most recent first)
115
- - Organizes by categories: Features, Processes, Tweaks & Bug Fixes
117
+ - Organizes by categories:
118
+ Features, Processes, Tweaks & Bug Fixes
116
119
  - Consolidates similar items for stakeholder readability
117
120
  - Includes commit count statistics
118
- - Uses professional language suitable for both technical and non-technical audiences
121
+ - Uses professional language suitable for both technical and non-technical
122
+ audiences
119
123
 
120
124
  ## Requirements
121
125
 
122
- - Node.js 18+ with TypeScript support
126
+ - Node.js 18+ with TypeScript support (Bun runtime recommended)
123
127
  - Git repository (must be run within a git repository)
124
128
  - At least one supported LLM CLI tool:
125
129
  - Claude CLI (`claude`) - recommended, defaults to Sonnet model
126
130
  - Gemini CLI (`gemini`)
127
- - Codex CLI (`codex`)
131
+ - OpenAI CLI (`codex`)
128
132
  - Valid git commit hashes (when specifying commits manually)
129
133
 
130
134
  ## Categories
131
135
 
132
- - **tweak**: Minor adjustments, bug fixes, small improvements
133
- - **feature**: New functionality, major additions
134
- - **process**: Build system, CI/CD, tooling, configuration changes
136
+ - **tweak**:
137
+ Minor adjustments, bug fixes, small improvements
138
+ - **feature**:
139
+ New functionality, major additions
140
+ - **process**:
141
+ Build system, CI/CD, tooling, configuration changes
135
142
 
136
143
  ## Error Handling
137
144
 
@@ -151,7 +158,8 @@ The tool automatically:
151
158
  - **Stops processing after a commit fails all retry attempts**
152
159
  - Exports partial results to the CSV file before exiting
153
160
 
154
- If the process stops (e.g., after 139 commits due to API failure), you can resume from where it left off:
161
+ If the process stops (e.g., after 139 commits due to API failure), you can
162
+ resume from where it left off:
155
163
 
156
164
  ```bash
157
165
  # Resume from last checkpoint
@@ -165,13 +173,12 @@ npx commit-analyzer --resume
165
173
  ```
166
174
 
167
175
  The checkpoint file (`.commit-analyzer/progress.json`) contains:
176
+
168
177
  - List of all commits to process
169
178
  - Successfully processed commits (including failed ones to skip on resume)
170
179
  - Analyzed commit data (only successful ones)
171
180
  - Output file location
172
181
 
173
- **Important**: When a commit fails after all retries (default 3), the process stops immediately to prevent wasting API calls. The successfully analyzed commits up to that point are saved to the CSV file.
174
-
175
182
  ### Application Data Directory
176
183
 
177
184
  The tool creates a `.commit-analyzer/` directory to store internal files:
@@ -185,23 +192,80 @@ The tool creates a `.commit-analyzer/` directory to store internal files:
185
192
  └── ...
186
193
  ```
187
194
 
188
- - **Progress checkpoint**: Enables resuming interrupted analysis sessions
189
- - **Analysis cache**: Stores LLM analysis results to avoid re-processing the same commits (TTL: 30 days)
195
+ - **Progress checkpoint**:
196
+ Enables resuming interrupted analysis sessions
197
+ - **Analysis cache**:
198
+ Stores LLM analysis results to avoid re-processing the same commits (TTL:
199
+ 30 days)
190
200
 
191
201
  Use `--no-cache` to disable caching if needed.
202
+ Use `--clear` to clear the cache and progress checkpoint.
203
+
204
+ ### Date Filtering
205
+
206
+ The tool supports flexible date filtering using natural language or specific
207
+ dates:
208
+
209
+ ```bash
210
+ # Analyze commits from the last week
211
+ npx commit-analyzer --since "1 week ago"
212
+
213
+ # Analyze commits from a specific date range
214
+ npx commit-analyzer --since "2024-01-01" --until "2024-12-31"
215
+
216
+ # Analyze commits from the beginning of the year
217
+ npx commit-analyzer --since "2024-01-01"
218
+
219
+ # Analyze commits up to a specific date
220
+ npx commit-analyzer --until "2024-06-30"
221
+ ```
222
+
223
+ Date formats supported:
224
+ - Relative dates:
225
+ `"1 week ago"`, `"2 months ago"`, `"3 days ago"`
226
+ - ISO dates:
227
+ `"2024-01-01"`, `"2024-12-31"`
228
+ - Git-style dates:
229
+ Any format accepted by `git log --since` and `git log --until`
230
+
231
+ ### Batch Processing
232
+
233
+ Control processing speed and resource usage with batch size options:
234
+
235
+ ```bash
236
+ # Process commits one at a time (default, safest for rate limits)
237
+ npx commit-analyzer --batch-size 1
238
+
239
+ # Process multiple commits in parallel (faster but may hit rate limits)
240
+ npx commit-analyzer --batch-size 5 --limit 100
241
+
242
+ # Sequential processing for large datasets
243
+ npx commit-analyzer --batch-size 1 --limit 500
244
+ ```
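Conceptually, each batch is analyzed in parallel before the next batch starts, which is why larger batches are faster but more likely to hit rate limits. A rough sketch of that chunking pattern, with placeholder names rather than the package's actual API:

```typescript
// Sketch of the batching idea: process items in chunks of `batchSize`,
// running each chunk in parallel before moving on. `worker` stands in for
// whatever analyzes a single commit; none of these names come from the package.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = []
  for (let i = 0; i < items.length; i += batchSize) {
    const chunk = items.slice(i, i + batchSize)
    // Larger batches finish sooner but issue more concurrent LLM calls,
    // which is what makes rate limits more likely.
    results.push(...(await Promise.all(chunk.map(worker))))
  }
  return results
}
```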
192
245
 
193
246
  ### Retry Logic
194
247
 
195
- The tool includes automatic retry logic with exponential backoff for handling API failures when processing many commits. This is especially useful when analyzing large numbers of commits that might trigger rate limits.
248
+ The tool includes automatic retry logic with exponential backoff for handling
249
+ API failures when processing many commits.
250
+ This is especially useful when analyzing large numbers of commits that might
251
+ trigger rate limits.
196
252
 
197
253
  #### Configuration
198
254
 
199
255
  You can configure the retry behavior using environment variables:
200
256
 
201
- - `LLM_MAX_RETRIES`: Maximum number of retry attempts (default: 3)
202
- - `LLM_INITIAL_RETRY_DELAY`: Initial delay between retries in milliseconds (default: 5000)
203
- - `LLM_MAX_RETRY_DELAY`: Maximum delay between retries in milliseconds (default: 30000)
204
- - `LLM_RETRY_MULTIPLIER`: Multiplier for exponential backoff (default: 2)
257
+ - `LLM_MAX_RETRIES`:
258
+ Maximum number of retry attempts (default:
259
+ 3)
260
+ - `LLM_INITIAL_RETRY_DELAY`:
261
+ Initial delay between retries in milliseconds (default:
262
+ 5000)
263
+ - `LLM_MAX_RETRY_DELAY`:
264
+ Maximum delay between retries in milliseconds (default:
265
+ 30000)
266
+ - `LLM_RETRY_MULTIPLIER`:
267
+ Multiplier for exponential backoff (default:
268
+ 2)
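To make the defaults concrete, here is a small sketch of the delay schedule these variables describe; it is illustrative only, and the package's internal retry loop may differ:

```typescript
// Rough sketch of the backoff schedule these variables describe.
// Defaults: 3 retries, 5000 ms initial delay, 2x multiplier, 30000 ms cap.
const maxRetries = Number(process.env.LLM_MAX_RETRIES ?? 3)
const initialDelay = Number(process.env.LLM_INITIAL_RETRY_DELAY ?? 5000)
const maxDelay = Number(process.env.LLM_MAX_RETRY_DELAY ?? 30000)
const multiplier = Number(process.env.LLM_RETRY_MULTIPLIER ?? 2)

const delays = Array.from({ length: maxRetries }, (_, attempt) =>
  Math.min(initialDelay * multiplier ** attempt, maxDelay),
)
console.log(delays) // with the defaults: [ 5000, 10000, 20000 ]
```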
205
269
 
206
270
  #### Examples
207
271
 
@@ -226,19 +290,19 @@ The retry mechanism automatically:
226
290
 
227
291
  ```bash
228
292
  # Install dependencies
229
- npm install
293
+ bun install
230
294
 
231
295
  # Run in development mode
232
- npm run dev
296
+ bun run dev
233
297
 
234
298
  # Build for production
235
- npm run build
299
+ bun run build
236
300
 
237
301
  # Run linting
238
- npm run lint
302
+ bun run lint
239
303
 
240
304
  # Type checking
241
- npm run typecheck
305
+ bun run typecheck
242
306
  ```
243
307
 
244
308
  ## Examples
@@ -253,10 +317,6 @@ npx commit-analyzer --limit 20 --output my_analysis.csv
253
317
  # Analyze commits by a specific team member
254
318
  npx commit-analyzer --author teammate@company.com --limit 50
255
319
 
256
- # Analyze specific commits
257
- git log --oneline -5 | cut -d' ' -f1 > recent_commits.txt
258
- npx commit-analyzer --file recent_commits.txt --output recent_analysis.csv
259
-
260
320
  # Quick analysis of your recent work
261
321
  npx commit-analyzer --limit 10
262
322
 
@@ -267,8 +327,30 @@ npx commit-analyzer --report --limit 100 --output yearly_analysis.csv
267
327
  npx commit-analyzer --report --input-csv existing_analysis.csv --output team_report.md
268
328
 
269
329
  # Use specific LLM model for analysis
270
- npx commit-analyzer --model gemini --limit 25
330
+ npx commit-analyzer --llm gemini --limit 25
271
331
 
272
332
  # Resume interrupted analysis with progress tracking
273
333
  npx commit-analyzer --resume
274
334
  ```
335
+
336
+ ## Development
337
+
338
+ This tool requires the Bun runtime.
339
+ Install it globally:
340
+
341
+ ```bash
342
+ # Install bun globally
343
+ curl -fsSL https://bun.sh/install | bash
344
+ # or
345
+ npm install -g bun
346
+ ```
347
+
348
+ ## Installation
349
+
350
+ ```bash
351
+ bun install
352
+ bun run build
353
+ bun link
354
+ ```
355
+
356
+ After linking, you can use the `commit-analyzer` command globally.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "commit-analyzer",
3
- "version": "1.1.3",
3
+ "version": "1.1.5",
4
4
  "description": "Analyze git commits and generate categories, summaries, and descriptions for each commit. Optionally generate a yearly breakdown report of your commit history.",
5
5
  "main": "dist/main.ts",
6
6
  "bin": {
package/prompt.md CHANGED
@@ -58,12 +58,12 @@
58
58
  ### Technical Implementation
59
59
 
60
60
  - Use Node.js with TypeScript
61
- - Extract year from git commit timestamp
61
+ - Extract timestamp from git commit
62
62
 
63
63
  Output Format:
64
64
 
65
65
  CSV with headers:
66
- year,category,summary,description
66
+ timestamp,category,summary,description
67
67
 
68
68
  The program should be robust, handle edge cases, and provide clear error
69
69
  messages for invalid commits or API failures.
@@ -2,6 +2,14 @@ import { AnalyzedCommit } from "./analyzed-commit"
2
2
  import { Category, CategoryType } from "./category"
3
3
  import { DateRange } from "./date-range"
4
4
 
5
+ export type TimePeriod =
6
+ | "hourly"
7
+ | "daily"
8
+ | "weekly"
9
+ | "monthly"
10
+ | "quarterly"
11
+ | "yearly"
12
+
5
13
  /**
6
14
  * Statistics for analyzed commits
7
15
  */
@@ -151,27 +159,31 @@ export class ReportGenerationService {
151
159
  /**
152
160
  * Determines the appropriate time period for summaries based on date range
153
161
  */
154
- determineTimePeriod(commits: AnalyzedCommit[]): 'daily' | 'weekly' | 'monthly' | 'quarterly' | 'yearly' {
155
- if (commits.length === 0) return 'yearly'
162
+ determineTimePeriod(commits: AnalyzedCommit[]): TimePeriod {
163
+ if (commits.length === 0) return "yearly"
164
+
165
+ const dates = commits.map((c) => c.getDate())
166
+ const minDate = new Date(Math.min(...dates.map((d) => d.getTime())))
167
+ const maxDate = new Date(Math.max(...dates.map((d) => d.getTime())))
156
168
 
157
- const dates = commits.map(c => c.getDate())
158
- const minDate = new Date(Math.min(...dates.map(d => d.getTime())))
159
- const maxDate = new Date(Math.max(...dates.map(d => d.getTime())))
160
-
161
169
  const diffInMilliseconds = maxDate.getTime() - minDate.getTime()
162
170
  const diffInDays = diffInMilliseconds / (1000 * 60 * 60 * 24)
163
171
 
164
- if (diffInDays <= 1) return 'daily'
165
- if (diffInDays <= 7) return 'weekly'
166
- if (diffInDays <= 31) return 'monthly'
167
- if (diffInDays <= 93) return 'quarterly' // ~3 months
168
- return 'yearly'
172
+ if (diffInDays <= 1) return "hourly"
173
+ if (diffInDays <= 7) return "daily"
174
+ if (diffInDays <= 31) return "weekly"
175
+ if (diffInDays <= 93) return "monthly" // ~3 months
176
+ if (diffInDays <= 365) return "quarterly" // ~12 months
177
+ return "yearly"
169
178
  }
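Because the returned period is fed straight into `groupByTimePeriod`, it names the bucket size rather than the overall span: a history of up to a month is bucketed by week, up to a year by quarter, and so on. A standalone restatement of the thresholds, not the class method itself:

```typescript
// Standalone restatement of the thresholds above (not the class method itself).
type Period = "hourly" | "daily" | "weekly" | "monthly" | "quarterly" | "yearly"

function pickPeriod(diffInDays: number): Period {
  if (diffInDays <= 1) return "hourly"
  if (diffInDays <= 7) return "daily"
  if (diffInDays <= 31) return "weekly"
  if (diffInDays <= 93) return "monthly"
  if (diffInDays <= 365) return "quarterly"
  return "yearly"
}

console.log(pickPeriod(0.5)) // "hourly": one day of commits, bucketed by hour
console.log(pickPeriod(20)) // "weekly": a few weeks of history, bucketed by week
console.log(pickPeriod(200)) // "quarterly": most of a year, bucketed by quarter
```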
170
179
 
171
180
  /**
172
181
  * Groups commits by the appropriate time period
173
182
  */
174
- groupByTimePeriod(commits: AnalyzedCommit[], period: 'daily' | 'weekly' | 'monthly' | 'quarterly' | 'yearly'): Map<string, AnalyzedCommit[]> {
183
+ groupByTimePeriod(
184
+ commits: AnalyzedCommit[],
185
+ period: TimePeriod,
186
+ ): Map<string, AnalyzedCommit[]> {
175
187
  const grouped = new Map<string, AnalyzedCommit[]>()
176
188
 
177
189
  for (const commit of commits) {
@@ -179,19 +191,22 @@ export class ReportGenerationService {
179
191
  let key: string
180
192
 
181
193
  switch (period) {
182
- case 'daily':
194
+ case "hourly":
195
+ key = this.formatHourlyKey(date)
196
+ break
197
+ case "daily":
183
198
  key = this.formatDailyKey(date)
184
199
  break
185
- case 'weekly':
200
+ case "weekly":
186
201
  key = this.formatWeeklyKey(date)
187
202
  break
188
- case 'monthly':
203
+ case "monthly":
189
204
  key = this.formatMonthlyKey(date)
190
205
  break
191
- case 'quarterly':
206
+ case "quarterly":
192
207
  key = this.formatQuarterlyKey(date)
193
208
  break
194
- case 'yearly':
209
+ case "yearly":
195
210
  default:
196
211
  key = date.getFullYear().toString()
197
212
  break
@@ -206,15 +221,24 @@ export class ReportGenerationService {
206
221
  return grouped
207
222
  }
208
223
 
224
+ private formatHourlyKey(date: Date): string {
225
+ const hour = date.getHours()
226
+ const displayHour = hour === 0 ? 12 : hour > 12 ? hour - 12 : hour
227
+ const ampm = hour < 12 ? "AM" : "PM"
228
+ return `${displayHour}:00 ${ampm}`
229
+ }
230
+
209
231
  private formatDailyKey(date: Date): string {
210
232
  const year = date.getFullYear()
211
233
  const month = date.getMonth() + 1
212
234
  const day = date.getDate()
213
235
  const hour = date.getHours()
214
236
 
215
- if (hour < 12) return `${year}-${month.toString().padStart(2, '0')}-${day.toString().padStart(2, '0')} Morning`
216
- if (hour < 17) return `${year}-${month.toString().padStart(2, '0')}-${day.toString().padStart(2, '0')} Afternoon`
217
- return `${year}-${month.toString().padStart(2, '0')}-${day.toString().padStart(2, '0')} Evening`
237
+ if (hour < 12)
238
+ return `${year}-${month.toString().padStart(2, "0")}-${day.toString().padStart(2, "0")} Morning`
239
+ if (hour < 17)
240
+ return `${year}-${month.toString().padStart(2, "0")}-${day.toString().padStart(2, "0")} Afternoon`
241
+ return `${year}-${month.toString().padStart(2, "0")}-${day.toString().padStart(2, "0")} Evening`
218
242
  }
219
243
 
220
244
  private formatWeeklyKey(date: Date): string {
@@ -223,15 +247,27 @@ export class ReportGenerationService {
223
247
  const endOfWeek = new Date(startOfWeek)
224
248
  endOfWeek.setDate(startOfWeek.getDate() + 6)
225
249
 
226
- const formatDate = (d: Date) =>
227
- `${d.getFullYear()}-${(d.getMonth() + 1).toString().padStart(2, '0')}-${d.getDate().toString().padStart(2, '0')}`
250
+ const formatDate = (d: Date) =>
251
+ `${d.getFullYear()}-${(d.getMonth() + 1).toString().padStart(2, "0")}-${d.getDate().toString().padStart(2, "0")}`
228
252
 
229
253
  return `Week of ${formatDate(startOfWeek)} to ${formatDate(endOfWeek)}`
230
254
  }
231
255
 
232
256
  private formatMonthlyKey(date: Date): string {
233
- const months = ['January', 'February', 'March', 'April', 'May', 'June',
234
- 'July', 'August', 'September', 'October', 'November', 'December']
257
+ const months = [
258
+ "January",
259
+ "February",
260
+ "March",
261
+ "April",
262
+ "May",
263
+ "June",
264
+ "July",
265
+ "August",
266
+ "September",
267
+ "October",
268
+ "November",
269
+ "December",
270
+ ]
235
271
  return `${months[date.getMonth()]} ${date.getFullYear()}`
236
272
  }
237
273
 
@@ -241,45 +277,160 @@ export class ReportGenerationService {
241
277
  }
242
278
 
243
279
  /**
244
- * Converts analyzed commits to CSV string format for LLM consumption
280
+ * Converts analyzed commits to CSV string format for LLM consumption with enhanced context
245
281
  */
246
282
  convertToCSVString(commits: AnalyzedCommit[]): string {
247
- const header = "year,category,summary,description"
283
+ const header = "year,category,summary,description,commit_count,date_range"
284
+
285
+ // Group commits by year and category for context
286
+ const contextMap = new Map<string, { count: number; dates: Date[] }>()
287
+
248
288
  const rows = commits.map((commit) => {
249
289
  const analysis = commit.getAnalysis()
290
+ const key = `${commit.getYear()}-${analysis.getCategory().getValue()}`
291
+
292
+ if (!contextMap.has(key)) {
293
+ contextMap.set(key, { count: 0, dates: [] })
294
+ }
295
+
296
+ const context = contextMap.get(key)!
297
+ context.count++
298
+ context.dates.push(commit.getDate())
299
+
300
+ const dateRange =
301
+ context.dates.length > 1
302
+ ? `${this.formatDate(Math.min(...context.dates.map((d) => d.getTime())))} to ${this.formatDate(Math.max(...context.dates.map((d) => d.getTime())))}`
303
+ : this.formatDate(commit.getDate())
304
+
250
305
  return [
251
306
  commit.getYear().toString(),
252
307
  this.escapeCsvField(analysis.getCategory().getValue()),
253
308
  this.escapeCsvField(analysis.getSummary()),
254
309
  this.escapeCsvField(analysis.getDescription()),
310
+ context.count.toString(),
311
+ this.escapeCsvField(dateRange),
255
312
  ].join(",")
256
313
  })
257
314
 
258
315
  return [header, ...rows].join("\n")
259
316
  }
260
317
 
318
+ private formatDate(date: Date | number): string {
319
+ const d = new Date(date)
320
+ return `${d.getFullYear()}-${(d.getMonth() + 1).toString().padStart(2, "0")}-${d.getDate().toString().padStart(2, "0")}`
321
+ }
322
+
261
323
  /**
262
- * Converts grouped commits to CSV with time period information
324
+ * Converts grouped commits to CSV with time period information and enhanced context
263
325
  */
264
- convertGroupedToCSV(groupedCommits: Map<string, AnalyzedCommit[]>, period: string): string {
265
- const header = `${period},category,summary,description`
326
+ convertGroupedToCSV(
327
+ groupedCommits: Map<string, AnalyzedCommit[]>,
328
+ period: string,
329
+ ): string {
330
+ const header = `${period},category,summary,description,commit_count,similar_commits`
266
331
  const rows: string[] = []
267
-
332
+
268
333
  for (const [timePeriod, commits] of groupedCommits) {
334
+ // Group commits by category within the time period for context
335
+ const categoryGroups = new Map<string, AnalyzedCommit[]>()
336
+
337
+ for (const commit of commits) {
338
+ const category = commit.getAnalysis().getCategory().getValue()
339
+ if (!categoryGroups.has(category)) {
340
+ categoryGroups.set(category, [])
341
+ }
342
+ categoryGroups.get(category)!.push(commit)
343
+ }
344
+
345
+ // Add context about similar commits in the same period and category
269
346
  for (const commit of commits) {
270
347
  const analysis = commit.getAnalysis()
271
- rows.push([
272
- this.escapeCsvField(timePeriod),
273
- this.escapeCsvField(analysis.getCategory().getValue()),
274
- this.escapeCsvField(analysis.getSummary()),
275
- this.escapeCsvField(analysis.getDescription()),
276
- ].join(","))
348
+ const category = analysis.getCategory().getValue()
349
+ const similarCommits = categoryGroups.get(category)!
350
+
351
+ // Find similar summaries in the same category
352
+ const similarSummaries = similarCommits
353
+ .filter((c) => c !== commit)
354
+ .map((c) => c.getAnalysis().getSummary())
355
+ .filter((summary) =>
356
+ this.isSimilarSummary(analysis.getSummary(), summary),
357
+ )
358
+ .slice(0, 3) // Limit to 3 similar items
359
+
360
+ rows.push(
361
+ [
362
+ this.escapeCsvField(timePeriod),
363
+ this.escapeCsvField(category),
364
+ this.escapeCsvField(analysis.getSummary()),
365
+ this.escapeCsvField(analysis.getDescription()),
366
+ similarCommits.length.toString(),
367
+ this.escapeCsvField(similarSummaries.join("; ")),
368
+ ].join(","),
369
+ )
277
370
  }
278
371
  }
279
372
 
280
373
  return [header, ...rows].join("\n")
281
374
  }
282
375
 
376
+ /**
377
+ * Determines if two summaries are similar based on common keywords
378
+ */
379
+ private isSimilarSummary(summary1: string, summary2: string): boolean {
380
+ const keywords1 = this.extractKeywords(summary1)
381
+ const keywords2 = this.extractKeywords(summary2)
382
+
383
+ // Check if they share significant keywords (at least 2 common words)
384
+ const commonKeywords = keywords1.filter((word) => keywords2.includes(word))
385
+ return commonKeywords.length >= 2
386
+ }
387
+
388
+ /**
389
+ * Extracts meaningful keywords from a summary for similarity detection
390
+ */
391
+ private extractKeywords(summary: string): string[] {
392
+ // Remove common stopwords and extract meaningful terms
393
+ const stopwords = new Set([
394
+ "the",
395
+ "a",
396
+ "an",
397
+ "and",
398
+ "or",
399
+ "but",
400
+ "in",
401
+ "on",
402
+ "at",
403
+ "to",
404
+ "for",
405
+ "of",
406
+ "with",
407
+ "by",
408
+ "is",
409
+ "are",
410
+ "was",
411
+ "were",
412
+ "be",
413
+ "been",
414
+ "have",
415
+ "has",
416
+ "had",
417
+ "do",
418
+ "does",
419
+ "did",
420
+ "will",
421
+ "would",
422
+ "could",
423
+ "should",
424
+ ])
425
+
426
+ return summary
427
+ .toLowerCase()
428
+ .replace(/[^\w\s]/g, " ")
429
+ .split(/\s+/)
430
+ .filter((word) => word.length > 2 && !stopwords.has(word))
431
+ .slice(0, 5) // Take first 5 meaningful words
432
+ }
433
+
283
434
  /**
284
435
  * Escape CSV fields that contain commas, quotes, or newlines
285
436
  */
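The consolidation hints added to the CSV rest on a simple keyword-overlap test. A minimal self-contained sketch of that idea follows; the stopword list is shortened and the helper names are illustrative, not the service's private methods:

```typescript
// Minimal sketch of the keyword-overlap test used above: two summaries count
// as similar when they share at least two non-stopword keywords.
// The stopword list here is shortened; helper names are illustrative.
const STOPWORDS = new Set(["the", "a", "an", "and", "or", "to", "for", "of", "in", "on", "with"])

function keywords(summary: string): string[] {
  return summary
    .toLowerCase()
    .replace(/[^\w\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 2 && !STOPWORDS.has(word))
    .slice(0, 5)
}

function similar(a: string, b: string): boolean {
  const second = new Set(keywords(b))
  return keywords(a).filter((word) => second.has(word)).length >= 2
}

console.log(similar("Fix auth token refresh bug", "Fix expired auth token handling")) // true
console.log(similar("Fix auth token refresh bug", "Update CI pipeline caching")) // false
```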
@@ -24,7 +24,7 @@ export interface CLIOptions {
24
24
  }
25
25
 
26
26
  export class CLIApplication {
27
- private static readonly VERSION = "1.1.3"
27
+ private static readonly VERSION = "1.1.5"
28
28
  private static readonly DEFAULT_COMMITS_OUTPUT_FILE = "results/commits.csv"
29
29
  private static readonly DEFAULT_REPORT_OUTPUT_FILE = "results/report.md"
30
30
 
@@ -243,9 +243,11 @@ ${csvContent}
243
243
  INSTRUCTIONS:
244
244
  1. Group the data by year (descending order, most recent first)
245
245
  2. Within each year, group by category: Features, Process Improvements, and Tweaks & Bug Fixes
246
- 3. Consolidate similar items within each category to create readable summaries
247
- 4. Focus on what was accomplished rather than individual commit details
248
- 5. Use clear, professional language appropriate for stakeholders
246
+ 3. Use the 'commit_count' and 'date_range' columns to understand the scope and timeline of work
247
+ 4. Consolidate similar items within each category to create readable summaries
248
+ 5. Focus on what was accomplished rather than individual commit details
249
+ 6. Use clear, professional language appropriate for stakeholders
250
+ 7. Pay attention to recurring themes and patterns across commits
249
251
 
250
252
  CATEGORY MAPPING:
251
253
  - "feature" → "Features" section
@@ -253,12 +255,15 @@ CATEGORY MAPPING:
253
255
  - "tweak" → "Tweaks & Bug Fixes" section
254
256
 
255
257
  CONSOLIDATION GUIDELINES:
256
- - Group similar features together (e.g., "authentication system improvements")
257
- - Combine related bug fixes (e.g., "resolved 8 authentication issues")
258
- - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements")
259
- - Use bullet points for individual items within categories
258
+ - FIRST: Extract common themes and keywords from commit summaries within each category
259
+ - SECOND: Identify and merge duplicate or highly similar work items (e.g., multiple "fix auth bug" commits become "resolved authentication issues")
260
+ - Group similar features together by theme (e.g., "authentication system improvements", "payment processing enhancements")
261
+ - Combine related bug fixes by area/system (e.g., "resolved 8 authentication issues", "fixed 5 database connection problems")
262
+ - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements", "testing infrastructure improvements")
263
+ - Use bullet points for individual consolidated items within categories
260
264
  - Aim for 3-7 bullet points per category per year
261
265
  - Include specific numbers when relevant (e.g., "15 bug fixes", "3 new features")
266
+ - Avoid listing near-identical items separately - consolidate them into meaningful groups
262
267
 
263
268
  OUTPUT FORMAT:
264
269
  Generate yearly summary sections with this exact structure (DO NOT include the main title or commit analysis section):
@@ -290,6 +295,16 @@ QUALITY REQUIREMENTS:
290
295
  - Avoid technical jargon where possible
291
296
  - Ensure each bullet point represents meaningful work
292
297
  - Make the report valuable for both technical and non-technical readers
298
+ - Focus on business impact and user value rather than technical implementation details
299
+ - When consolidating, preserve the most important aspects from similar commits
300
+ - Use progressive disclosure: start with high-level themes, then add specific details
301
+
302
+ CONTEXT ANALYSIS:
303
+ Before consolidating, analyze the commit data for:
304
+ 1. Common file patterns or system areas being modified
305
+ 2. Recurring keywords in commit messages that indicate related work
306
+ 3. Sequential commits that build upon each other
307
+ 4. Bug fixes that address the same underlying issue
293
308
 
294
309
  Generate the markdown report now:`
295
310
  }
@@ -306,10 +321,12 @@ ${csvContent}
306
321
  INSTRUCTIONS:
307
322
  1. Group the data by ${periodDisplayName} (descending order, most recent first)
308
323
  2. Within each ${periodDisplayName.toLowerCase()}, group by category: Features, Process Improvements, and Tweaks & Bug Fixes
309
- 3. Consolidate similar items within each category to create readable summaries
310
- 4. Focus on what was accomplished rather than individual commit details
311
- 5. Use clear, professional language appropriate for stakeholders
312
- 6. Only include sections for time periods that have commits
324
+ 3. Use the 'commit_count' and 'similar_commits' columns to understand related work and consolidation opportunities
325
+ 4. Consolidate similar items within each category to create readable summaries
326
+ 5. Focus on what was accomplished rather than individual commit details
327
+ 6. Use clear, professional language appropriate for stakeholders
328
+ 7. Only include sections for time periods that have commits
329
+ 8. Pay attention to recurring themes and patterns across commits
313
330
 
314
331
  CATEGORY MAPPING:
315
332
  - "feature" → "Features" section
@@ -317,12 +334,15 @@ CATEGORY MAPPING:
317
334
  - "tweak" → "Tweaks & Bug Fixes" section
318
335
 
319
336
  CONSOLIDATION GUIDELINES:
320
- - Group similar features together (e.g., "authentication system improvements")
321
- - Combine related bug fixes (e.g., "resolved 8 authentication issues")
322
- - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements")
323
- - Use bullet points for individual items within categories
337
+ - FIRST: Extract common themes and keywords from commit summaries within each category
338
+ - SECOND: Identify and merge duplicate or highly similar work items (e.g., multiple "fix auth bug" commits become "resolved authentication issues")
339
+ - Group similar features together by theme (e.g., "authentication system improvements", "payment processing enhancements")
340
+ - Combine related bug fixes by area/system (e.g., "resolved 8 authentication issues", "fixed 5 database connection problems")
341
+ - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements", "testing infrastructure improvements")
342
+ - Use bullet points for individual consolidated items within categories
324
343
  - Aim for 3-7 bullet points per category per ${periodDisplayName.toLowerCase()}
325
344
  - Include specific numbers when relevant (e.g., "15 bug fixes", "3 new features")
345
+ - Avoid listing near-identical items separately - consolidate them into meaningful groups
326
346
 
327
347
  OUTPUT FORMAT:
328
348
  Generate ${periodDisplayName.toLowerCase()} summary sections with this exact structure (DO NOT include the main title or commit analysis section):
@@ -354,6 +374,16 @@ QUALITY REQUIREMENTS:
354
374
  - Avoid technical jargon where possible
355
375
  - Ensure each bullet point represents meaningful work
356
376
  - Make the report valuable for both technical and non-technical readers
377
+ - Focus on business impact and user value rather than technical implementation details
378
+ - When consolidating, preserve the most important aspects from similar commits
379
+ - Use progressive disclosure: start with high-level themes, then add specific details
380
+
381
+ CONTEXT ANALYSIS:
382
+ Before consolidating, analyze the commit data for:
383
+ 1. Common file patterns or system areas being modified
384
+ 2. Recurring keywords in commit messages that indicate related work
385
+ 3. Sequential commits that build upon each other
386
+ 4. Bug fixes that address the same underlying issue
357
387
 
358
388
  Generate the markdown report now:`
359
389
  }