commit-analyzer 1.1.3 → 1.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,43 +1,29 @@
1
1
  # Git Commit Analyzer
2
2
 
3
- A TypeScript/Node.js program that analyzes git commits and generates categorized summaries using Claude CLI.
3
+ A TypeScript/Node.js program that analyzes git commits and generates categorized
4
+ summaries using Claude CLI.
4
5
 
5
6
  ## Features
6
7
 
7
8
  - Extract commit details (message, date, diff) from git repositories
8
- - Categorize commits using LLM analysis into: `tweak`, `feature`, or `process`
9
- - Generate CSV reports with year, category, summary, and description
10
- - Generate condensed markdown reports from CSV data for stakeholder communication
11
- - Support for multiple LLM models (Claude, Gemini, Codex) with automatic detection
9
+ - Categorize commits using LLM analysis into:
10
+ `tweak`, `feature`, or `process`
11
+ - Generate CSV reports with timestamp, category, summary, and description
12
+ - Generate condensed markdown reports from CSV data for stakeholder
13
+ communication
14
+ - Support for multiple LLM models (Claude, Gemini, OpenAI) with automatic
15
+ detection
12
16
  - Support for batch processing multiple commits
13
17
  - Automatically filters out merge commits for cleaner analysis
14
18
  - Robust error handling and validation
15
19
 
16
- ## Prerequisites
17
-
18
- This tool requires Bun runtime. Install it globally:
19
-
20
- ```bash
21
- # Install bun globally
22
- curl -fsSL https://bun.sh/install | bash
23
- # or
24
- npm install -g bun
25
- ```
26
-
27
- ## Installation
28
-
29
- ```bash
30
- npm install
31
- bun link
32
- ```
33
-
34
- After linking, you can use `commit-analyzer` command globally.
35
20
 
36
21
  ## Usage
37
22
 
38
23
  ### Default Behavior
39
24
 
40
- When run without arguments, the program analyzes all commits authored by the current user:
25
+ When run without arguments, the program analyzes all commits authored by the
26
+ current user:
41
27
 
42
28
  ```bash
43
29
  # Analyze all your commits in the current repository
@@ -56,9 +42,6 @@ npx commit-analyzer --author user@example.com
56
42
  # Analyze specific commits
57
43
  npx commit-analyzer abc123 def456 ghi789
58
44
 
59
- # Read commits from file
60
- npx commit-analyzer --file commits.txt
61
-
62
45
  # Specify output file with default behavior
63
46
  npx commit-analyzer --output analysis.csv --limit 20
64
47
 
@@ -69,32 +52,46 @@ npx commit-analyzer --report --input-csv analysis.csv
69
52
  npx commit-analyzer --report --limit 50
70
53
 
71
54
  # Use specific LLM model
72
- npx commit-analyzer --model claude --limit 10
55
+ npx commit-analyzer --llm claude --limit 10
73
56
  ```
74
57
 
75
58
  ### Options
76
59
 
77
- - `-o, --output <file>`: Output file (default: `output.csv` for analysis, `summary-report.md` for reports)
78
- - `-f, --file <file>`: Read commit hashes from file (one per line)
79
- - `-a, --author <email>`: Filter commits by author email (defaults to current user)
80
- - `-l, --limit <number>`: Limit number of commits to analyze
81
- - `-m, --model <model>`: LLM model to use (claude, gemini, codex)
82
- - `-r, --resume`: Resume from last checkpoint if available
83
- - `-c, --clear`: Clear any existing progress checkpoint
84
- - `--report`: Generate condensed markdown report from existing CSV
85
- - `--input-csv <file>`: Input CSV file to read for report generation
86
- - `-h, --help`: Display help
87
- - `-V, --version`: Display version
88
-
89
- ### Input File Format
90
-
91
- When using `--file`, create a text file with one commit hash per line:
92
-
93
- ```
94
- abc123def456
95
- def456ghi789
96
- ghi789jkl012
97
- ```
60
+ - `-o, --output <file>`:
61
+ Output file (default:
62
+ `results/commits.csv` for analysis, `results/report.md` for reports)
63
+ - `--output-dir <dir>`:
64
+ Output directory for CSV and report files (default:
65
+ current directory)
66
+ - `-a, --author <email>`:
67
+ Filter commits by author email (defaults to current user)
68
+ - `-l, --limit <number>`:
69
+ Limit number of commits to analyze
70
+ - `--llm <model>`:
71
+ LLM model to use (claude, gemini, openai)
72
+ - `-r, --resume`:
73
+ Resume from last checkpoint if available
74
+ - `-c, --clear`:
75
+ Clear any existing progress checkpoint
76
+ - `--report`:
77
+ Generate condensed markdown report from existing CSV
78
+ - `--input-csv <file>`:
79
+ Input CSV file to read for report generation
80
+ - `-v, --verbose`:
81
+ Enable verbose logging (shows detailed error information)
82
+ - `--since <date>`:
83
+ Only analyze commits since this date (YYYY-MM-DD, '1 week ago', '2024-01-01')
84
+ - `--until <date>`:
85
+ Only analyze commits until this date (YYYY-MM-DD, '1 day ago', '2024-12-31')
86
+ - `--no-cache`:
87
+ Disable caching of analysis results
88
+ - `--batch-size <number>`:
89
+ Number of commits to process per batch (default:
90
+ 1 for sequential processing)
91
+ - `-h, --help`:
92
+ Display help
93
+ - `-V, --version`:
94
+ Display version
98
95
 
99
96
  ## Output Formats
100
97
 
@@ -102,36 +99,46 @@ ghi789jkl012
102
99
 
103
100
  The program generates a CSV file with the following columns:
104
101
 
105
- - `year`: Year of the commit
106
- - `category`: One of `tweak`, `feature`, or `process`
107
- - `summary`: One-line description (max 80 characters)
108
- - `description`: Detailed explanation (2-3 sentences)
102
+ - `timestamp`:
103
+ ISO 8601 timestamp of the commit (e.g., `2025-08-28T11:14:40.000Z`)
104
+ - `category`:
105
+ One of `tweak`, `feature`, or `process`
106
+ - `summary`:
107
+ One-line description (max 80 characters)
108
+ - `description`:
109
+ Detailed explanation (2-3 sentences)
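For reference, here is the row shape these columns imply. A minimal sketch, assuming only the documented fields; the interface and sample values below are illustrative and are not exported by the package:

```typescript
// Illustrative only: the row shape implied by the documented CSV columns.
// The interface and sample values are not part of the package itself.
interface CommitRow {
  timestamp: string // ISO 8601, e.g. "2025-08-28T11:14:40.000Z"
  category: "tweak" | "feature" | "process"
  summary: string // one line, max 80 characters
  description: string // 2-3 sentences
}

const exampleRow: CommitRow = {
  timestamp: "2025-08-28T11:14:40.000Z",
  category: "feature",
  summary: "Add --since/--until date filtering to commit selection",
  description:
    "Introduces flexible date filtering for choosing which commits to analyze. " +
    "Accepts ISO dates as well as relative expressions like '1 week ago'.",
}
```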
109
110
 
110
111
  ### Markdown Report Output
111
112
 
112
- When using the `--report` option, the program generates a condensed markdown report that:
113
+ When using the `--report` option, the program generates a condensed markdown
114
+ report that:
113
115
 
114
116
  - Groups commits by year (most recent first)
115
- - Organizes by categories: Features, Processes, Tweaks & Bug Fixes
117
+ - Organizes by categories:
118
+ Features, Processes, Tweaks & Bug Fixes
116
119
  - Consolidates similar items for stakeholder readability
117
120
  - Includes commit count statistics
118
- - Uses professional language suitable for both technical and non-technical audiences
121
+ - Uses professional language suitable for both technical and non-technical
122
+ audiences
119
123
 
120
124
  ## Requirements
121
125
 
122
- - Node.js 18+ with TypeScript support
126
+ - Node.js 18+ with TypeScript support (Bun runtime recommended)
123
127
  - Git repository (must be run within a git repository)
124
128
  - At least one supported LLM CLI tool:
125
129
  - Claude CLI (`claude`) - recommended, defaults to Sonnet model
126
130
  - Gemini CLI (`gemini`)
127
- - Codex CLI (`codex`)
131
+ - OpenAI CLI (`codex`)
128
132
  - Valid git commit hashes (when specifying commits manually)
129
133
 
130
134
  ## Categories
131
135
 
132
- - **tweak**: Minor adjustments, bug fixes, small improvements
133
- - **feature**: New functionality, major additions
134
- - **process**: Build system, CI/CD, tooling, configuration changes
136
+ - **tweak**:
137
+ Minor adjustments, bug fixes, small improvements
138
+ - **feature**:
139
+ New functionality, major additions
140
+ - **process**:
141
+ Build system, CI/CD, tooling, configuration changes
135
142
 
136
143
  ## Error Handling
137
144
 
@@ -151,7 +158,8 @@ The tool automatically:
151
158
  - **Stops processing after a commit fails all retry attempts**
152
159
  - Exports partial results to the CSV file before exiting
153
160
 
154
- If the process stops (e.g., after 139 commits due to API failure), you can resume from where it left off:
161
+ If the process stops (e.g., after 139 commits due to API failure), you can
162
+ resume from where it left off:
155
163
 
156
164
  ```bash
157
165
  # Resume from last checkpoint
@@ -165,13 +173,12 @@ npx commit-analyzer --resume
165
173
  ```
166
174
 
167
175
  The checkpoint file (`.commit-analyzer/progress.json`) contains:
176
+
168
177
  - List of all commits to process
169
178
  - Successfully processed commits (including failed ones to skip on resume)
170
179
  - Analyzed commit data (only successful ones)
171
180
  - Output file location
172
181
 
173
- **Important**: When a commit fails after all retries (default 3), the process stops immediately to prevent wasting API calls. The successfully analyzed commits up to that point are saved to the CSV file.
174
-
175
182
  ### Application Data Directory
176
183
 
177
184
  The tool creates a `.commit-analyzer/` directory to store internal files:
@@ -185,23 +192,80 @@ The tool creates a `.commit-analyzer/` directory to store internal files:
185
192
  └── ...
186
193
  ```
187
194
 
188
- - **Progress checkpoint**: Enables resuming interrupted analysis sessions
189
- - **Analysis cache**: Stores LLM analysis results to avoid re-processing the same commits (TTL: 30 days)
195
+ - **Progress checkpoint**:
196
+ Enables resuming interrupted analysis sessions
197
+ - **Analysis cache**:
198
+ Stores LLM analysis results to avoid re-processing the same commits (TTL:
199
+ 30 days)
190
200
 
191
201
  Use `--no-cache` to disable caching if needed.
202
+ Use `--clear` to clear the cache and progress checkpoint.
203
+
204
+ ### Date Filtering
205
+
206
+ The tool supports flexible date filtering using natural language or specific
207
+ dates:
208
+
209
+ ```bash
210
+ # Analyze commits from the last week
211
+ npx commit-analyzer --since "1 week ago"
212
+
213
+ # Analyze commits from a specific date range
214
+ npx commit-analyzer --since "2024-01-01" --until "2024-12-31"
215
+
216
+ # Analyze commits from the beginning of the year
217
+ npx commit-analyzer --since "2024-01-01"
218
+
219
+ # Analyze commits up to a specific date
220
+ npx commit-analyzer --until "2024-06-30"
221
+ ```
222
+
223
+ Date formats supported:
224
+ - Relative dates:
225
+ `"1 week ago"`, `"2 months ago"`, `"3 days ago"`
226
+ - ISO dates:
227
+ `"2024-01-01"`, `"2024-12-31"`
228
+ - Git-style dates:
229
+ Any format accepted by `git log --since` and `git log --until`
230
+
231
+ ### Batch Processing
232
+
233
+ Control processing speed and resource usage with batch size options:
234
+
235
+ ```bash
236
+ # Process commits one at a time (default, safest for rate limits)
237
+ npx commit-analyzer --batch-size 1
238
+
239
+ # Process multiple commits in parallel (faster but may hit rate limits)
240
+ npx commit-analyzer --batch-size 5 --limit 100
241
+
242
+ # Sequential processing for large datasets
243
+ npx commit-analyzer --batch-size 1 --limit 500
244
+ ```
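Conceptually, each batch is analyzed in parallel before the next batch starts, which is why larger batches are faster but more likely to hit rate limits. A rough sketch of that chunking pattern, with placeholder names rather than the package's actual API:

```typescript
// Sketch of the batching idea: process items in chunks of `batchSize`,
// running each chunk in parallel before moving on. `worker` stands in for
// whatever analyzes a single commit; none of these names come from the package.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = []
  for (let i = 0; i < items.length; i += batchSize) {
    const chunk = items.slice(i, i + batchSize)
    // Larger batches finish sooner but issue more concurrent LLM calls,
    // which is what makes rate limits more likely.
    results.push(...(await Promise.all(chunk.map(worker))))
  }
  return results
}
```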
192
245
 
193
246
  ### Retry Logic
194
247
 
195
- The tool includes automatic retry logic with exponential backoff for handling API failures when processing many commits. This is especially useful when analyzing large numbers of commits that might trigger rate limits.
248
+ The tool includes automatic retry logic with exponential backoff for handling
249
+ API failures when processing many commits.
250
+ This is especially useful when analyzing large numbers of commits that might
251
+ trigger rate limits.
196
252
 
197
253
  #### Configuration
198
254
 
199
255
  You can configure the retry behavior using environment variables:
200
256
 
201
- - `LLM_MAX_RETRIES`: Maximum number of retry attempts (default: 3)
202
- - `LLM_INITIAL_RETRY_DELAY`: Initial delay between retries in milliseconds (default: 5000)
203
- - `LLM_MAX_RETRY_DELAY`: Maximum delay between retries in milliseconds (default: 30000)
204
- - `LLM_RETRY_MULTIPLIER`: Multiplier for exponential backoff (default: 2)
257
+ - `LLM_MAX_RETRIES`:
258
+ Maximum number of retry attempts (default:
259
+ 3)
260
+ - `LLM_INITIAL_RETRY_DELAY`:
261
+ Initial delay between retries in milliseconds (default:
262
+ 5000)
263
+ - `LLM_MAX_RETRY_DELAY`:
264
+ Maximum delay between retries in milliseconds (default:
265
+ 30000)
266
+ - `LLM_RETRY_MULTIPLIER`:
267
+ Multiplier for exponential backoff (default:
268
+ 2)
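To make the defaults concrete, here is a small sketch of the delay schedule these variables describe; it is illustrative only, and the package's internal retry loop may differ:

```typescript
// Rough sketch of the backoff schedule these variables describe.
// Defaults: 3 retries, 5000 ms initial delay, 2x multiplier, 30000 ms cap.
const maxRetries = Number(process.env.LLM_MAX_RETRIES ?? 3)
const initialDelay = Number(process.env.LLM_INITIAL_RETRY_DELAY ?? 5000)
const maxDelay = Number(process.env.LLM_MAX_RETRY_DELAY ?? 30000)
const multiplier = Number(process.env.LLM_RETRY_MULTIPLIER ?? 2)

const delays = Array.from({ length: maxRetries }, (_, attempt) =>
  Math.min(initialDelay * multiplier ** attempt, maxDelay),
)
console.log(delays) // with the defaults: [ 5000, 10000, 20000 ]
```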
205
269
 
206
270
  #### Examples
207
271
 
@@ -226,19 +290,19 @@ The retry mechanism automatically:
226
290
 
227
291
  ```bash
228
292
  # Install dependencies
229
- npm install
293
+ bun install
230
294
 
231
295
  # Run in development mode
232
- npm run dev
296
+ bun run dev
233
297
 
234
298
  # Build for production
235
- npm run build
299
+ bun run build
236
300
 
237
301
  # Run linting
238
- npm run lint
302
+ bun run lint
239
303
 
240
304
  # Type checking
241
- npm run typecheck
305
+ bun run typecheck
242
306
  ```
243
307
 
244
308
  ## Examples
@@ -253,10 +317,6 @@ npx commit-analyzer --limit 20 --output my_analysis.csv
253
317
  # Analyze commits by a specific team member
254
318
  npx commit-analyzer --author teammate@company.com --limit 50
255
319
 
256
- # Analyze specific commits
257
- git log --oneline -5 | cut -d' ' -f1 > recent_commits.txt
258
- npx commit-analyzer --file recent_commits.txt --output recent_analysis.csv
259
-
260
320
  # Quick analysis of your recent work
261
321
  npx commit-analyzer --limit 10
262
322
 
@@ -267,8 +327,30 @@ npx commit-analyzer --report --limit 100 --output yearly_analysis.csv
267
327
  npx commit-analyzer --report --input-csv existing_analysis.csv --output team_report.md
268
328
 
269
329
  # Use specific LLM model for analysis
270
- npx commit-analyzer --model gemini --limit 25
330
+ npx commit-analyzer --llm gemini --limit 25
271
331
 
272
332
  # Resume interrupted analysis with progress tracking
273
333
  npx commit-analyzer --resume
274
334
  ```
335
+
336
+ ## Development
337
+
338
+ This tool requires the Bun runtime.
339
+ Install it globally:
340
+
341
+ ```bash
342
+ # Install bun globally
343
+ curl -fsSL https://bun.sh/install | bash
344
+ # or
345
+ npm install -g bun
346
+ ```
347
+
348
+ ## Installation
349
+
350
+ ```bash
351
+ bun install
352
+ bun run build
353
+ bun link
354
+ ```
355
+
356
+ After linking, you can use the `commit-analyzer` command globally.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "commit-analyzer",
3
- "version": "1.1.3",
3
+ "version": "1.1.5",
4
4
  "description": "Analyze git commits and generate categories, summaries, and descriptions for each commit. Optionally generate a yearly breakdown report of your commit history.",
5
5
  "main": "dist/main.ts",
6
6
  "bin": {
package/prompt.md CHANGED
@@ -58,12 +58,12 @@
58
58
  ### Technical Implementation
59
59
 
60
60
  - Use Node.js with TypeScript
61
- - Extract year from git commit timestamp
61
+ - Extract timestamp from git commit
62
62
 
63
63
  Output Format:
64
64
 
65
65
  CSV with headers:
66
- year,category,summary,description
66
+ timestamp,category,summary,description
67
67
 
68
68
  The program should be robust, handle edge cases, and provide clear error
69
69
  messages for invalid commits or API failures.
@@ -2,6 +2,14 @@ import { AnalyzedCommit } from "./analyzed-commit"
2
2
  import { Category, CategoryType } from "./category"
3
3
  import { DateRange } from "./date-range"
4
4
 
5
+ export type TimePeriod =
6
+ | "hourly"
7
+ | "daily"
8
+ | "weekly"
9
+ | "monthly"
10
+ | "quarterly"
11
+ | "yearly"
12
+
5
13
  /**
6
14
  * Statistics for analyzed commits
7
15
  */
@@ -151,27 +159,31 @@ export class ReportGenerationService {
151
159
  /**
152
160
  * Determines the appropriate time period for summaries based on date range
153
161
  */
154
- determineTimePeriod(commits: AnalyzedCommit[]): 'daily' | 'weekly' | 'monthly' | 'quarterly' | 'yearly' {
155
- if (commits.length === 0) return 'yearly'
162
+ determineTimePeriod(commits: AnalyzedCommit[]): TimePeriod {
163
+ if (commits.length === 0) return "yearly"
164
+
165
+ const dates = commits.map((c) => c.getDate())
166
+ const minDate = new Date(Math.min(...dates.map((d) => d.getTime())))
167
+ const maxDate = new Date(Math.max(...dates.map((d) => d.getTime())))
156
168
 
157
- const dates = commits.map(c => c.getDate())
158
- const minDate = new Date(Math.min(...dates.map(d => d.getTime())))
159
- const maxDate = new Date(Math.max(...dates.map(d => d.getTime())))
160
-
161
169
  const diffInMilliseconds = maxDate.getTime() - minDate.getTime()
162
170
  const diffInDays = diffInMilliseconds / (1000 * 60 * 60 * 24)
163
171
 
164
- if (diffInDays <= 1) return 'daily'
165
- if (diffInDays <= 7) return 'weekly'
166
- if (diffInDays <= 31) return 'monthly'
167
- if (diffInDays <= 93) return 'quarterly' // ~3 months
168
- return 'yearly'
172
+ if (diffInDays <= 1) return "hourly"
173
+ if (diffInDays <= 7) return "daily"
174
+ if (diffInDays <= 31) return "weekly"
175
+ if (diffInDays <= 93) return "monthly" // ~3 months
176
+ if (diffInDays <= 365) return "quarterly" // ~12 months
177
+ return "yearly"
169
178
  }
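Because the returned period is fed straight into `groupByTimePeriod`, it names the bucket size rather than the overall span: a history of up to a month is bucketed by week, up to a year by quarter, and so on. A standalone restatement of the thresholds, not the class method itself:

```typescript
// Standalone restatement of the thresholds above (not the class method itself).
type Period = "hourly" | "daily" | "weekly" | "monthly" | "quarterly" | "yearly"

function pickPeriod(diffInDays: number): Period {
  if (diffInDays <= 1) return "hourly"
  if (diffInDays <= 7) return "daily"
  if (diffInDays <= 31) return "weekly"
  if (diffInDays <= 93) return "monthly"
  if (diffInDays <= 365) return "quarterly"
  return "yearly"
}

console.log(pickPeriod(0.5)) // "hourly": one day of commits, bucketed by hour
console.log(pickPeriod(20)) // "weekly": a few weeks of history, bucketed by week
console.log(pickPeriod(200)) // "quarterly": most of a year, bucketed by quarter
```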
170
179
 
171
180
  /**
172
181
  * Groups commits by the appropriate time period
173
182
  */
174
- groupByTimePeriod(commits: AnalyzedCommit[], period: 'daily' | 'weekly' | 'monthly' | 'quarterly' | 'yearly'): Map<string, AnalyzedCommit[]> {
183
+ groupByTimePeriod(
184
+ commits: AnalyzedCommit[],
185
+ period: TimePeriod,
186
+ ): Map<string, AnalyzedCommit[]> {
175
187
  const grouped = new Map<string, AnalyzedCommit[]>()
176
188
 
177
189
  for (const commit of commits) {
@@ -179,19 +191,22 @@ export class ReportGenerationService {
179
191
  let key: string
180
192
 
181
193
  switch (period) {
182
- case 'daily':
194
+ case "hourly":
195
+ key = this.formatHourlyKey(date)
196
+ break
197
+ case "daily":
183
198
  key = this.formatDailyKey(date)
184
199
  break
185
- case 'weekly':
200
+ case "weekly":
186
201
  key = this.formatWeeklyKey(date)
187
202
  break
188
- case 'monthly':
203
+ case "monthly":
189
204
  key = this.formatMonthlyKey(date)
190
205
  break
191
- case 'quarterly':
206
+ case "quarterly":
192
207
  key = this.formatQuarterlyKey(date)
193
208
  break
194
- case 'yearly':
209
+ case "yearly":
195
210
  default:
196
211
  key = date.getFullYear().toString()
197
212
  break
@@ -206,15 +221,24 @@ export class ReportGenerationService {
206
221
  return grouped
207
222
  }
208
223
 
224
+ private formatHourlyKey(date: Date): string {
225
+ const hour = date.getHours()
226
+ const displayHour = hour === 0 ? 12 : hour > 12 ? hour - 12 : hour
227
+ const ampm = hour < 12 ? "AM" : "PM"
228
+ return `${displayHour}:00 ${ampm}`
229
+ }
230
+
209
231
  private formatDailyKey(date: Date): string {
210
232
  const year = date.getFullYear()
211
233
  const month = date.getMonth() + 1
212
234
  const day = date.getDate()
213
235
  const hour = date.getHours()
214
236
 
215
- if (hour < 12) return `${year}-${month.toString().padStart(2, '0')}-${day.toString().padStart(2, '0')} Morning`
216
- if (hour < 17) return `${year}-${month.toString().padStart(2, '0')}-${day.toString().padStart(2, '0')} Afternoon`
217
- return `${year}-${month.toString().padStart(2, '0')}-${day.toString().padStart(2, '0')} Evening`
237
+ if (hour < 12)
238
+ return `${year}-${month.toString().padStart(2, "0")}-${day.toString().padStart(2, "0")} Morning`
239
+ if (hour < 17)
240
+ return `${year}-${month.toString().padStart(2, "0")}-${day.toString().padStart(2, "0")} Afternoon`
241
+ return `${year}-${month.toString().padStart(2, "0")}-${day.toString().padStart(2, "0")} Evening`
218
242
  }
219
243
 
220
244
  private formatWeeklyKey(date: Date): string {
@@ -223,15 +247,27 @@ export class ReportGenerationService {
223
247
  const endOfWeek = new Date(startOfWeek)
224
248
  endOfWeek.setDate(startOfWeek.getDate() + 6)
225
249
 
226
- const formatDate = (d: Date) =>
227
- `${d.getFullYear()}-${(d.getMonth() + 1).toString().padStart(2, '0')}-${d.getDate().toString().padStart(2, '0')}`
250
+ const formatDate = (d: Date) =>
251
+ `${d.getFullYear()}-${(d.getMonth() + 1).toString().padStart(2, "0")}-${d.getDate().toString().padStart(2, "0")}`
228
252
 
229
253
  return `Week of ${formatDate(startOfWeek)} to ${formatDate(endOfWeek)}`
230
254
  }
231
255
 
232
256
  private formatMonthlyKey(date: Date): string {
233
- const months = ['January', 'February', 'March', 'April', 'May', 'June',
234
- 'July', 'August', 'September', 'October', 'November', 'December']
257
+ const months = [
258
+ "January",
259
+ "February",
260
+ "March",
261
+ "April",
262
+ "May",
263
+ "June",
264
+ "July",
265
+ "August",
266
+ "September",
267
+ "October",
268
+ "November",
269
+ "December",
270
+ ]
235
271
  return `${months[date.getMonth()]} ${date.getFullYear()}`
236
272
  }
237
273
 
@@ -241,45 +277,160 @@ export class ReportGenerationService {
241
277
  }
242
278
 
243
279
  /**
244
- * Converts analyzed commits to CSV string format for LLM consumption
280
+ * Converts analyzed commits to CSV string format for LLM consumption with enhanced context
245
281
  */
246
282
  convertToCSVString(commits: AnalyzedCommit[]): string {
247
- const header = "year,category,summary,description"
283
+ const header = "year,category,summary,description,commit_count,date_range"
284
+
285
+ // Group commits by year and category for context
286
+ const contextMap = new Map<string, { count: number; dates: Date[] }>()
287
+
248
288
  const rows = commits.map((commit) => {
249
289
  const analysis = commit.getAnalysis()
290
+ const key = `${commit.getYear()}-${analysis.getCategory().getValue()}`
291
+
292
+ if (!contextMap.has(key)) {
293
+ contextMap.set(key, { count: 0, dates: [] })
294
+ }
295
+
296
+ const context = contextMap.get(key)!
297
+ context.count++
298
+ context.dates.push(commit.getDate())
299
+
300
+ const dateRange =
301
+ context.dates.length > 1
302
+ ? `${this.formatDate(Math.min(...context.dates.map((d) => d.getTime())))} to ${this.formatDate(Math.max(...context.dates.map((d) => d.getTime())))}`
303
+ : this.formatDate(commit.getDate())
304
+
250
305
  return [
251
306
  commit.getYear().toString(),
252
307
  this.escapeCsvField(analysis.getCategory().getValue()),
253
308
  this.escapeCsvField(analysis.getSummary()),
254
309
  this.escapeCsvField(analysis.getDescription()),
310
+ context.count.toString(),
311
+ this.escapeCsvField(dateRange),
255
312
  ].join(",")
256
313
  })
257
314
 
258
315
  return [header, ...rows].join("\n")
259
316
  }
260
317
 
318
+ private formatDate(date: Date | number): string {
319
+ const d = new Date(date)
320
+ return `${d.getFullYear()}-${(d.getMonth() + 1).toString().padStart(2, "0")}-${d.getDate().toString().padStart(2, "0")}`
321
+ }
322
+
261
323
  /**
262
- * Converts grouped commits to CSV with time period information
324
+ * Converts grouped commits to CSV with time period information and enhanced context
263
325
  */
264
- convertGroupedToCSV(groupedCommits: Map<string, AnalyzedCommit[]>, period: string): string {
265
- const header = `${period},category,summary,description`
326
+ convertGroupedToCSV(
327
+ groupedCommits: Map<string, AnalyzedCommit[]>,
328
+ period: string,
329
+ ): string {
330
+ const header = `${period},category,summary,description,commit_count,similar_commits`
266
331
  const rows: string[] = []
267
-
332
+
268
333
  for (const [timePeriod, commits] of groupedCommits) {
334
+ // Group commits by category within the time period for context
335
+ const categoryGroups = new Map<string, AnalyzedCommit[]>()
336
+
337
+ for (const commit of commits) {
338
+ const category = commit.getAnalysis().getCategory().getValue()
339
+ if (!categoryGroups.has(category)) {
340
+ categoryGroups.set(category, [])
341
+ }
342
+ categoryGroups.get(category)!.push(commit)
343
+ }
344
+
345
+ // Add context about similar commits in the same period and category
269
346
  for (const commit of commits) {
270
347
  const analysis = commit.getAnalysis()
271
- rows.push([
272
- this.escapeCsvField(timePeriod),
273
- this.escapeCsvField(analysis.getCategory().getValue()),
274
- this.escapeCsvField(analysis.getSummary()),
275
- this.escapeCsvField(analysis.getDescription()),
276
- ].join(","))
348
+ const category = analysis.getCategory().getValue()
349
+ const similarCommits = categoryGroups.get(category)!
350
+
351
+ // Find similar summaries in the same category
352
+ const similarSummaries = similarCommits
353
+ .filter((c) => c !== commit)
354
+ .map((c) => c.getAnalysis().getSummary())
355
+ .filter((summary) =>
356
+ this.isSimilarSummary(analysis.getSummary(), summary),
357
+ )
358
+ .slice(0, 3) // Limit to 3 similar items
359
+
360
+ rows.push(
361
+ [
362
+ this.escapeCsvField(timePeriod),
363
+ this.escapeCsvField(category),
364
+ this.escapeCsvField(analysis.getSummary()),
365
+ this.escapeCsvField(analysis.getDescription()),
366
+ similarCommits.length.toString(),
367
+ this.escapeCsvField(similarSummaries.join("; ")),
368
+ ].join(","),
369
+ )
277
370
  }
278
371
  }
279
372
 
280
373
  return [header, ...rows].join("\n")
281
374
  }
282
375
 
376
+ /**
377
+ * Determines if two summaries are similar based on common keywords
378
+ */
379
+ private isSimilarSummary(summary1: string, summary2: string): boolean {
380
+ const keywords1 = this.extractKeywords(summary1)
381
+ const keywords2 = this.extractKeywords(summary2)
382
+
383
+ // Check if they share significant keywords (at least 2 common words)
384
+ const commonKeywords = keywords1.filter((word) => keywords2.includes(word))
385
+ return commonKeywords.length >= 2
386
+ }
387
+
388
+ /**
389
+ * Extracts meaningful keywords from a summary for similarity detection
390
+ */
391
+ private extractKeywords(summary: string): string[] {
392
+ // Remove common stopwords and extract meaningful terms
393
+ const stopwords = new Set([
394
+ "the",
395
+ "a",
396
+ "an",
397
+ "and",
398
+ "or",
399
+ "but",
400
+ "in",
401
+ "on",
402
+ "at",
403
+ "to",
404
+ "for",
405
+ "of",
406
+ "with",
407
+ "by",
408
+ "is",
409
+ "are",
410
+ "was",
411
+ "were",
412
+ "be",
413
+ "been",
414
+ "have",
415
+ "has",
416
+ "had",
417
+ "do",
418
+ "does",
419
+ "did",
420
+ "will",
421
+ "would",
422
+ "could",
423
+ "should",
424
+ ])
425
+
426
+ return summary
427
+ .toLowerCase()
428
+ .replace(/[^\w\s]/g, " ")
429
+ .split(/\s+/)
430
+ .filter((word) => word.length > 2 && !stopwords.has(word))
431
+ .slice(0, 5) // Take first 5 meaningful words
432
+ }
433
+
283
434
  /**
284
435
  * Escape CSV fields that contain commas, quotes, or newlines
285
436
  */
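The consolidation hints added to the CSV rest on a simple keyword-overlap test. A minimal self-contained sketch of that idea follows; the stopword list is shortened and the helper names are illustrative, not the service's private methods:

```typescript
// Minimal sketch of the keyword-overlap test used above: two summaries count
// as similar when they share at least two non-stopword keywords.
// The stopword list here is shortened; helper names are illustrative.
const STOPWORDS = new Set(["the", "a", "an", "and", "or", "to", "for", "of", "in", "on", "with"])

function keywords(summary: string): string[] {
  return summary
    .toLowerCase()
    .replace(/[^\w\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 2 && !STOPWORDS.has(word))
    .slice(0, 5)
}

function similar(a: string, b: string): boolean {
  const second = new Set(keywords(b))
  return keywords(a).filter((word) => second.has(word)).length >= 2
}

console.log(similar("Fix auth token refresh bug", "Fix expired auth token handling")) // true
console.log(similar("Fix auth token refresh bug", "Update CI pipeline caching")) // false
```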
@@ -24,7 +24,7 @@ export interface CLIOptions {
24
24
  }
25
25
 
26
26
  export class CLIApplication {
27
- private static readonly VERSION = "1.1.3"
27
+ private static readonly VERSION = "1.1.5"
28
28
  private static readonly DEFAULT_COMMITS_OUTPUT_FILE = "results/commits.csv"
29
29
  private static readonly DEFAULT_REPORT_OUTPUT_FILE = "results/report.md"
30
30
 
@@ -243,9 +243,11 @@ ${csvContent}
243
243
  INSTRUCTIONS:
244
244
  1. Group the data by year (descending order, most recent first)
245
245
  2. Within each year, group by category: Features, Process Improvements, and Tweaks & Bug Fixes
246
- 3. Consolidate similar items within each category to create readable summaries
247
- 4. Focus on what was accomplished rather than individual commit details
248
- 5. Use clear, professional language appropriate for stakeholders
246
+ 3. Use the 'commit_count' and 'date_range' columns to understand the scope and timeline of work
247
+ 4. Consolidate similar items within each category to create readable summaries
248
+ 5. Focus on what was accomplished rather than individual commit details
249
+ 6. Use clear, professional language appropriate for stakeholders
250
+ 7. Pay attention to recurring themes and patterns across commits
249
251
 
250
252
  CATEGORY MAPPING:
251
253
  - "feature" → "Features" section
@@ -253,12 +255,15 @@ CATEGORY MAPPING:
253
255
  - "tweak" → "Tweaks & Bug Fixes" section
254
256
 
255
257
  CONSOLIDATION GUIDELINES:
256
- - Group similar features together (e.g., "authentication system improvements")
257
- - Combine related bug fixes (e.g., "resolved 8 authentication issues")
258
- - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements")
259
- - Use bullet points for individual items within categories
258
+ - FIRST: Extract common themes and keywords from commit summaries within each category
259
+ - SECOND: Identify and merge duplicate or highly similar work items (e.g., multiple "fix auth bug" commits become "resolved authentication issues")
260
+ - Group similar features together by theme (e.g., "authentication system improvements", "payment processing enhancements")
261
+ - Combine related bug fixes by area/system (e.g., "resolved 8 authentication issues", "fixed 5 database connection problems")
262
+ - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements", "testing infrastructure improvements")
263
+ - Use bullet points for individual consolidated items within categories
260
264
  - Aim for 3-7 bullet points per category per year
261
265
  - Include specific numbers when relevant (e.g., "15 bug fixes", "3 new features")
266
+ - Avoid listing near-identical items separately - consolidate them into meaningful groups
262
267
 
263
268
  OUTPUT FORMAT:
264
269
  Generate yearly summary sections with this exact structure (DO NOT include the main title or commit analysis section):
@@ -290,6 +295,16 @@ QUALITY REQUIREMENTS:
290
295
  - Avoid technical jargon where possible
291
296
  - Ensure each bullet point represents meaningful work
292
297
  - Make the report valuable for both technical and non-technical readers
298
+ - Focus on business impact and user value rather than technical implementation details
299
+ - When consolidating, preserve the most important aspects from similar commits
300
+ - Use progressive disclosure: start with high-level themes, then add specific details
301
+
302
+ CONTEXT ANALYSIS:
303
+ Before consolidating, analyze the commit data for:
304
+ 1. Common file patterns or system areas being modified
305
+ 2. Recurring keywords in commit messages that indicate related work
306
+ 3. Sequential commits that build upon each other
307
+ 4. Bug fixes that address the same underlying issue
293
308
 
294
309
  Generate the markdown report now:`
295
310
  }
@@ -306,10 +321,12 @@ ${csvContent}
306
321
  INSTRUCTIONS:
307
322
  1. Group the data by ${periodDisplayName} (descending order, most recent first)
308
323
  2. Within each ${periodDisplayName.toLowerCase()}, group by category: Features, Process Improvements, and Tweaks & Bug Fixes
309
- 3. Consolidate similar items within each category to create readable summaries
310
- 4. Focus on what was accomplished rather than individual commit details
311
- 5. Use clear, professional language appropriate for stakeholders
312
- 6. Only include sections for time periods that have commits
324
+ 3. Use the 'commit_count' and 'similar_commits' columns to understand related work and consolidation opportunities
325
+ 4. Consolidate similar items within each category to create readable summaries
326
+ 5. Focus on what was accomplished rather than individual commit details
327
+ 6. Use clear, professional language appropriate for stakeholders
328
+ 7. Only include sections for time periods that have commits
329
+ 8. Pay attention to recurring themes and patterns across commits
313
330
 
314
331
  CATEGORY MAPPING:
315
332
  - "feature" → "Features" section
@@ -317,12 +334,15 @@ CATEGORY MAPPING:
317
334
  - "tweak" → "Tweaks & Bug Fixes" section
318
335
 
319
336
  CONSOLIDATION GUIDELINES:
320
- - Group similar features together (e.g., "authentication system improvements")
321
- - Combine related bug fixes (e.g., "resolved 8 authentication issues")
322
- - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements")
323
- - Use bullet points for individual items within categories
337
+ - FIRST: Extract common themes and keywords from commit summaries within each category
338
+ - SECOND: Identify and merge duplicate or highly similar work items (e.g., multiple "fix auth bug" commits become "resolved authentication issues")
339
+ - Group similar features together by theme (e.g., "authentication system improvements", "payment processing enhancements")
340
+ - Combine related bug fixes by area/system (e.g., "resolved 8 authentication issues", "fixed 5 database connection problems")
341
+ - Summarize process changes by theme (e.g., "CI/CD pipeline enhancements", "testing infrastructure improvements")
342
+ - Use bullet points for individual consolidated items within categories
324
343
  - Aim for 3-7 bullet points per category per ${periodDisplayName.toLowerCase()}
325
344
  - Include specific numbers when relevant (e.g., "15 bug fixes", "3 new features")
345
+ - Avoid listing near-identical items separately - consolidate them into meaningful groups
326
346
 
327
347
  OUTPUT FORMAT:
328
348
  Generate ${periodDisplayName.toLowerCase()} summary sections with this exact structure (DO NOT include the main title or commit analysis section):
@@ -354,6 +374,16 @@ QUALITY REQUIREMENTS:
354
374
  - Avoid technical jargon where possible
355
375
  - Ensure each bullet point represents meaningful work
356
376
  - Make the report valuable for both technical and non-technical readers
377
+ - Focus on business impact and user value rather than technical implementation details
378
+ - When consolidating, preserve the most important aspects from similar commits
379
+ - Use progressive disclosure: start with high-level themes, then add specific details
380
+
381
+ CONTEXT ANALYSIS:
382
+ Before consolidating, analyze the commit data for:
383
+ 1. Common file patterns or system areas being modified
384
+ 2. Recurring keywords in commit messages that indicate related work
385
+ 3. Sequential commits that build upon each other
386
+ 4. Bug fixes that address the same underlying issue
357
387
 
358
388
  Generate the markdown report now:`
359
389
  }