npm - commit-analyzer - Versions diffs - 1.0.1 - Mend

commit-analyzer 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

package/.claude/settings.local.json +12 -0
package/README.md +243 -0
package/csv-to-report-prompt.md +97 -0
package/package.json +39 -0
package/prompt.md +69 -0
package/src/cli.ts +143 -0
package/src/csv-reader.ts +180 -0
package/src/csv.ts +40 -0
package/src/errors.ts +49 -0
package/src/git.ts +112 -0
package/src/index.ts +283 -0
package/src/llm.ts +283 -0
package/src/progress.ts +77 -0
package/src/report-generator.ts +286 -0
package/src/types.ts +24 -0
package/tsconfig.json +19 -0

package/.claude/settings.local.json ADDED

@@ -0,0 +1,12 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(mkdir:*)",
+      "Bash(bun install:*)",
+      "Bash(bun run:*)",
+      "Bash(bun link:*)"
+    ],
+    "deny": [],
+    "ask": []
+  }
+}

package/README.md ADDED

@@ -0,0 +1,243 @@
+# Git Commit Analyzer
+A TypeScript/Node.js program that analyzes git commits and generates categorized summaries using an LLM CLI (Claude, Gemini, or Codex).
+## Features
+- Extract commit details (message, date, diff) from git repositories
+- Categorize commits using LLM analysis into: `tweak`, `feature`, or `process`
+- Generate CSV reports with year, category, summary, and description
+- Generate condensed markdown reports from CSV data for stakeholder communication
+- Support for multiple LLM models (Claude, Gemini, Codex) with automatic detection
+- Support for batch processing multiple commits
+- Automatically filters out merge commits for cleaner analysis
+- Robust error handling and validation
+## Installation
+```bash
+npm install
+npm run build
+```
+## Usage
+### Default Behavior
+When run without arguments, the program analyzes all commits authored by the current user:
+```bash
+# Analyze all your commits in the current repository
+npx commit-analyzer
+# Analyze your last 10 commits
+npx commit-analyzer --limit 10
+# Analyze commits by a specific user
+npx commit-analyzer --author user@example.com
+```
+### Command Line Arguments
+```bash
+# Analyze specific commits
+npx commit-analyzer abc123 def456 ghi789
+# Read commits from file
+npx commit-analyzer --file commits.txt
+# Specify output file with default behavior
+npx commit-analyzer --output analysis.csv --limit 20
+# Generate markdown report from existing CSV
+npx commit-analyzer --report --input-csv analysis.csv
+# Analyze commits and generate both CSV and markdown report
+npx commit-analyzer --report --limit 50
+# Use specific LLM model
+npx commit-analyzer --model claude --limit 10
+```
+### Options
+- `-o, --output <file>`: Output file (default: `output.csv` for analysis, `summary-report.md` for reports)
+- `-f, --file <file>`: Read commit hashes from file (one per line)
+- `-a, --author <email>`: Filter commits by author email (defaults to current user)
+- `-l, --limit <number>`: Limit number of commits to analyze
+- `-m, --model <model>`: LLM model to use (claude, gemini, codex)
+- `-r, --resume`: Resume from last checkpoint if available
+- `-c, --clear`: Clear any existing progress checkpoint
+- `--report`: Generate a condensed markdown report (from `--input-csv` if given, otherwise from the CSV produced by the analysis run)
+- `--input-csv <file>`: Input CSV file to read for report generation
+- `-h, --help`: Display help
+- `-V, --version`: Display version
+### Input File Format
+When using `--file`, create a text file with one commit hash per line:
+```
+abc123def456
+def456ghi789
+ghi789jkl012
+```
+## Output Formats
+### CSV Output
+The program generates a CSV file with the following columns:
+- `year`: Year of the commit
+- `category`: One of `tweak`, `feature`, or `process`
+- `summary`: One-line description (max 80 characters)
+- `description`: Detailed explanation (2-3 sentences)
+### Markdown Report Output
+When using the `--report` option, the program generates a condensed markdown report that:
+- Groups commits by year (most recent first)
+- Organizes by categories: Features, Processes, Tweaks & Bug Fixes
+- Consolidates similar items for stakeholder readability
+- Includes commit count statistics
+- Uses professional language suitable for both technical and non-technical audiences
+## Requirements
+- Node.js 18+ with TypeScript support
+- Git repository (must be run within a git repository)
+- At least one supported LLM CLI tool:
+  - Claude CLI (`claude`) - recommended, defaults to Sonnet model
+  - Gemini CLI (`gemini`)
+  - Codex CLI (`codex`)
+- Valid git commit hashes (when specifying commits manually)
+## Categories
+- **tweak**: Minor adjustments, bug fixes, small improvements
+- **feature**: New functionality, major additions
+- **process**: Build system, CI/CD, tooling, configuration changes
+## Error Handling
+The program includes comprehensive error handling for:
+- Invalid commit hashes
+- Git repository validation
+- LLM analysis failures with automatic retry
+- File I/O errors
+- Network connectivity issues
+### Resume Capability
+The tool automatically:
+- Saves progress checkpoints every 10 commits
+- Saves immediately when a failure occurs
+- **Stops processing after a commit fails all retry attempts**
+- Exports partial results to the CSV file before exiting
+If the process stops (e.g., after 139 commits due to API failure), you can resume from where it left off:
+```bash
+# Resume from last checkpoint
+npx commit-analyzer --resume
+# Clear checkpoint and start fresh
+npx commit-analyzer --clear
+# View checkpoint status (it will prompt you)
+npx commit-analyzer --resume
+```
+The checkpoint file (`.commit-analyzer-progress.json`) contains:
+- List of all commits to process
+- Processed commits (failed ones are included so that they are skipped on resume)
+- Analyzed commit data (only successful ones)
+- Output file location
+**Important**: When a commit fails after all retries (default 3), the process stops immediately to prevent wasting API calls. The successfully analyzed commits up to that point are saved to the CSV file.
+### Retry Logic
+The tool includes automatic retry logic with exponential backoff for handling API failures when processing many commits. This is especially useful when analyzing large numbers of commits that might trigger rate limits.
+#### Configuration
+You can configure the retry behavior using environment variables:
+- `LLM_MAX_RETRIES`: Maximum number of retry attempts (default: 3)
+- `LLM_INITIAL_RETRY_DELAY`: Initial delay between retries in milliseconds (default: 5000)
+- `LLM_MAX_RETRY_DELAY`: Maximum delay between retries in milliseconds (default: 30000)
+- `LLM_RETRY_MULTIPLIER`: Multiplier for exponential backoff (default: 2)
+#### Examples
+```bash
+# More aggressive retries for large batches (e.g., 139+ commits)
+LLM_MAX_RETRIES=5 LLM_INITIAL_RETRY_DELAY=10000 npx commit-analyzer --limit 200
+# Faster retries for testing
+LLM_MAX_RETRIES=2 LLM_INITIAL_RETRY_DELAY=2000 npx commit-analyzer
+# Conservative approach for rate-limited APIs
+LLM_MAX_RETRIES=4 LLM_INITIAL_RETRY_DELAY=15000 LLM_MAX_RETRY_DELAY=60000 npx commit-analyzer
+```
+The retry mechanism automatically:
+- Retries failed API calls with increasing delays
+- Shows progress and retry attempts in the console
+- Stops processing once a commit has failed all retry attempts, after saving partial results
+- Reports the total number of successful and failed commits at the end
+## Development
+```bash
+# Install dependencies
+npm install
+# Run in development mode
+npm run dev
+# Build for production
+npm run build
+# Run linting
+npm run lint
+# Type checking
+npm run typecheck
+```
+## Examples
+```bash
+# Analyze all your commits in the current repository
+npx commit-analyzer
+# Analyze your last 20 commits and save to custom file
+npx commit-analyzer --limit 20 --output my_analysis.csv
+# Analyze commits by a specific team member
+npx commit-analyzer --author teammate@company.com --limit 50
+# Analyze specific commits
+git log --oneline -5 | cut -d' ' -f1 > recent_commits.txt
+npx commit-analyzer --file recent_commits.txt --output recent_analysis.csv
+# Quick analysis of your recent work
+npx commit-analyzer --limit 10
+# Generate both CSV and markdown report from analysis
+npx commit-analyzer --report --limit 100 --output yearly_analysis.csv
+# Generate only a markdown report from existing CSV
+npx commit-analyzer --report --input-csv existing_analysis.csv --output team_report.md
+# Use specific LLM model for analysis
+npx commit-analyzer --model gemini --limit 25
+# Resume interrupted analysis with progress tracking
+npx commit-analyzer --resume
+```
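The retry settings documented in the README above (`LLM_MAX_RETRIES`, `LLM_INITIAL_RETRY_DELAY`, `LLM_MAX_RETRY_DELAY`, `LLM_RETRY_MULTIPLIER`) suggest a delay schedule that grows by the multiplier and is capped at the maximum. The following is a minimal sketch of such a schedule, assuming the wait before retry n is `initial * multiplier^(n-1)` capped at the maximum. The names `readRetryConfig` and `retryDelays` are hypothetical and are not taken from `src/llm.ts`, whose actual logic may differ.

```typescript
// Hypothetical sketch: the names and the exact formula are assumptions,
// not the package's implementation.
interface RetryConfig {
  maxRetries: number
  initialDelayMs: number
  maxDelayMs: number
  multiplier: number
}

// Read the documented environment variables, falling back to the documented defaults.
function readRetryConfig(env: Record<string, string | undefined>): RetryConfig {
  const num = (value: string | undefined, fallback: number): number => {
    if (value === undefined || value.trim() === "") return fallback
    const parsed = Number(value)
    return Number.isFinite(parsed) ? parsed : fallback
  }
  return {
    maxRetries: num(env.LLM_MAX_RETRIES, 3),
    initialDelayMs: num(env.LLM_INITIAL_RETRY_DELAY, 5000),
    maxDelayMs: num(env.LLM_MAX_RETRY_DELAY, 30000),
    multiplier: num(env.LLM_RETRY_MULTIPLIER, 2),
  }
}

// Compute the wait (in ms) before each retry: exponential growth, capped at maxDelayMs.
function retryDelays(config: RetryConfig): number[] {
  const delays: number[] = []
  for (let attempt = 0; attempt < config.maxRetries; attempt++) {
    const raw = config.initialDelayMs * Math.pow(config.multiplier, attempt)
    delays.push(Math.min(raw, config.maxDelayMs))
  }
  return delays
}
```

Under these assumptions the default settings would wait 5 s, 10 s, and 20 s before the three retries, and `LLM_MAX_RETRIES=5 LLM_INITIAL_RETRY_DELAY=10000` would reach the 30 s cap from the third retry onward.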

package/csv-to-report-prompt.md ADDED

@@ -0,0 +1,97 @@
+# LLM Prompt Template: CSV to Markdown Report Generation
+This is the prompt template to be used by the LLM service to generate condensed
+markdown reports from CSV commit analysis data.
+## Prompt Template
+```
+Analyze the following CSV data containing git commit analysis results and generate a condensed markdown development summary report.
+CSV DATA:
+{csv_content}
+INSTRUCTIONS:
+1. Group the data by year (descending order, most recent first)
+2. Within each year, group by category: Features, Processes, and Tweaks & Bug Fixes
+3. Consolidate similar items within each category to create readable summaries
+4. Focus on what was accomplished rather than individual commit details
+5. Use clear, professional language appropriate for stakeholders
+CATEGORY MAPPING:
+- "feature" → "Features" section
+- "process" → "Processes" section
+- "tweak" → "Tweaks & Bug Fixes" section
+CONSOLIDATION GUIDELINES:
+- Group similar features together (e.g., "authentication system improvements")
+- Combine related bug fixes (e.g., "resolved 8 authentication issues")
+- Summarize process changes by theme (e.g., "CI/CD pipeline enhancements")
+- Use bullet points for individual items within categories
+- Aim for 3-7 bullet points per category per year
+- Include specific numbers when relevant (e.g., "15 bug fixes", "3 new features")
+OUTPUT FORMAT:
+Generate a markdown report with this exact structure:
+```markdown
+# Development Summary Report
+## Commit Analysis
+- **Total Commits**: [X] commits across [YEAR_RANGE]
+- **[MOST_RECENT_YEAR]**: [X] commits ([X] features, [X] process, [X] tweaks)
+- **[PREVIOUS_YEAR]**: [X] commits ([X] features, [X] process, [X] tweaks)
+- [Continue for each year in the data]
+## [YEAR]
+### Features
+- [Consolidated feature summary 1]
+- [Consolidated feature summary 2]
+- [Additional features as needed]
+### Processes
+- [Consolidated process improvement 1]
+- [Consolidated process improvement 2]
+- [Additional process items as needed]
+### Tweaks & Bug Fixes
+- [Consolidated tweak/fix summary 1]
+- [Consolidated tweak/fix summary 2]
+- [Additional tweaks/fixes as needed]
+## [PREVIOUS YEAR]
+[Repeat structure for each year in the data]
+```
+QUALITY REQUIREMENTS:
+- Keep summaries concise but informative
+- Use active voice and clear language
+- Avoid technical jargon where possible
+- Ensure each bullet point represents meaningful work
+- Make the report valuable for both technical and non-technical readers
+Generate the markdown report now:
+```
+## Implementation Notes
+This prompt should be used in the `MarkdownReportGenerator` service with the following approach:
+1. **Input Processing**: Replace `{csv_content}` with the actual CSV data read from the input file
+2. **LLM Call**: Send this prompt to the configured LLM (Claude, Gemini, etc.)
+3. **Response Parsing**: Extract the markdown content from the LLM response
+4. **File Output**: Write the generated markdown to the specified output file
+### Error Handling
+- If the LLM returns a malformed response, retry up to MAX_RETRIES times
+- Validate that the response contains properly formatted markdown
+- Ensure all years from the CSV data are represented in the output
+- Handle edge cases like empty categories or single-item categories
+### Response Validation
+The generated report should:
+- Start with "# Development Summary Report"
+- Have year sections in descending chronological order
+- Include all three category sections for each year (even if empty)
+- Use proper markdown formatting with ## for years and ### for categories
+- Contain bullet points (-) for individual items
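The response validation rules listed in `csv-to-report-prompt.md` above can be sketched as a small checker. This is only an illustration under stated assumptions: `validateReport` and `REQUIRED_CATEGORIES` are hypothetical names, year headings are assumed to have the exact form `## 2024`, and the diff shown here does not confirm that `src/report-generator.ts` validates responses this way.

```typescript
// Hypothetical validator: one way the listed checks could be coded,
// not the package's actual implementation.
const REQUIRED_CATEGORIES = ["Features", "Processes", "Tweaks & Bug Fixes"]

// Returns a list of problems; an empty list means the report passed every check.
function validateReport(markdown: string): string[] {
  const problems: string[] = []
  const lines = markdown.trim().split("\n").map((line) => line.trimEnd())
  if (lines[0] !== "# Development Summary Report") {
    problems.push("report must start with '# Development Summary Report'")
  }
  // Collect year sections ("## 2024") and the "###" headings under each one.
  const years: { year: number; categories: string[] }[] = []
  for (const line of lines) {
    const yearMatch = /^## (\d{4})$/.exec(line)
    if (yearMatch) {
      years.push({ year: Number(yearMatch[1]), categories: [] })
    } else if (line.startsWith("### ") && years.length > 0) {
      years[years.length - 1].categories.push(line.slice(4))
    }
  }
  if (years.length === 0) problems.push("no year sections found")
  // Years must appear in strictly descending order.
  for (let i = 1; i < years.length; i++) {
    if (years[i].year >= years[i - 1].year) {
      problems.push(`year ${years[i].year} is out of descending order`)
    }
  }
  // Every year needs all three category sections, even if they are empty.
  for (const { year, categories } of years) {
    for (const required of REQUIRED_CATEGORIES) {
      if (!categories.includes(required)) {
        problems.push(`${year} is missing the '${required}' section`)
      }
    }
  }
  return problems
}
```

Returning a list of problems instead of a boolean would let a caller log why a response was rejected before retrying.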

package/package.json ADDED

@@ -0,0 +1,39 @@
+{
+  "name": "commit-analyzer",
+  "version": "1.0.1",
+  "description": "Analyze git commits and generate categories, summaries, and descriptions for each commit. Optionally generate a yearly breakdown report of your commit history.",
+  "main": "dist/index.js",
+  "bin": {
+    "commit-analyzer": "dist/index.js"
+  },
+  "prettier": {
+    "semi": false
+  },
+  "scripts": {
+    "build": "tsc",
+    "start": "node dist/index.js",
+    "dev": "ts-node src/index.ts",
+    "lint": "eslint src/**/*.ts",
+    "typecheck": "tsc --noEmit"
+  },
+  "keywords": [
+    "git",
+    "commit",
+    "analysis",
+    "llm",
+    "categorization"
+  ],
+  "author": "steverodri",
+  "license": "MIT",
+  "devDependencies": {
+    "@types/node": "^20.0.0",
+    "@typescript-eslint/eslint-plugin": "^6.0.0",
+    "@typescript-eslint/parser": "^6.0.0",
+    "eslint": "^8.0.0",
+    "ts-node": "^10.0.0",
+    "typescript": "^5.0.0"
+  },
+  "dependencies": {
+    "commander": "^11.0.0"
+  }
+}

package/prompt.md ADDED

@@ -0,0 +1,69 @@
+# Prompt for Git Commit Analysis Program
+  Create a TypeScript/Node.js program that analyzes git commits and generates
+  categorized summaries.
+  The program should:
+## Input Requirements
+  - Accept a list of git commit hashes as input (command line arguments or file)
+  - For each commit, extract the commit message, date, and diff
+  Core Functionality:
+  1. Git Integration:
+     Use git show and git diff to get commit details and changes
+  2. LLM Analysis:
+     Send commit message + diff to the claude cli for categorization.
+  3. CSV Export:
+     Generate output with columns:
+     year, category, summary, description
+## LLM Prompt Template
+  Analyze this git commit and provide a categorization:
+  - COMMIT MESSAGE:
+    {commit_message}
+  - COMMIT DIFF:
+    {diff_content}
+  Based on the commit message and code changes, categorize this commit as one
+  of:
+  - "tweak":
+    Minor adjustments, bug fixes, small improvements
+  - "feature":
+    New functionality, major additions
+  - "process":
+    Build system, CI/CD, tooling, configuration changes
+  Provide:
+  1. Category:
+     [tweak|feature|process]
+  2. Summary:
+     One-line description (max 80 chars)
+  3. Description:
+     Detailed explanation (2-3 sentences)
+  Format as JSON:
+  ```json
+  {
+    "category": "...",
+    "summary": "...",
+    "description": "..."
+  }
+  ```
+### Technical Implementation
+  - Use Node.js with TypeScript
+  - Extract year from git commit timestamp
+  Output Format:
+  CSV with headers:
+  year,category,summary,description
+  The program should be robust, handle edge cases, and provide clear error
+  messages for invalid commits or API failures.
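The prompt in `prompt.md` above asks the LLM for a JSON object with `category`, `summary`, and `description`. A minimal sketch of parsing and checking such a reply follows. The names `parseAnalysis`, `CommitAnalysis`, and `FENCE` are hypothetical, and truncating long summaries to 80 characters is an assumption made for illustration; the package's own parsing in `src/llm.ts` is not shown in this part of the diff.

```typescript
// Hypothetical parser for the JSON reply requested by the prompt;
// not taken from the package source.
type Category = "tweak" | "feature" | "process"

interface CommitAnalysis {
  category: Category
  summary: string
  description: string
}

const CATEGORIES: readonly string[] = ["tweak", "feature", "process"]

// Three backticks, built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3)

function parseAnalysis(reply: string): CommitAnalysis {
  // Prefer the contents of a fenced block; otherwise treat the whole reply as JSON.
  const fenced = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE).exec(reply)
  const data: unknown = JSON.parse(fenced ? fenced[1] : reply)
  if (typeof data !== "object" || data === null) {
    throw new Error("LLM reply is not a JSON object")
  }
  const { category, summary, description } = data as Record<string, unknown>
  if (typeof category !== "string" || !CATEGORIES.includes(category)) {
    throw new Error(`Invalid category: ${String(category)}`)
  }
  if (typeof summary !== "string" || typeof description !== "string") {
    throw new Error("summary and description must be strings")
  }
  // Assumption: over-long summaries are truncated to 80 characters rather than rejected.
  return { category: category as Category, summary: summary.slice(0, 80), description }
}
```

Rejecting unknown categories at parse time keeps invalid values out of the CSV's `category` column.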

package/src/cli.ts ADDED

@@ -0,0 +1,143 @@
+import { readFileSync } from "fs"
+import { Command } from "commander"
+export interface CLIOptions {
+  output?: string
+  file?: string
+  commits: string[]
+  author?: string
+  limit?: number
+  useDefaults: boolean
+  resume?: boolean
+  clear?: boolean
+  model?: string
+  report?: boolean
+  inputCsv?: string
+}
+export class CLIService {
+  static parseArguments(): CLIOptions {
+    const program = new Command()
+    program
+      .name("commit-analyzer")
+      .description("Analyze git commits and generate categorized summaries")
+      .version("1.0.1")
+      .option("-o, --output <file>", "Output CSV file (default: output.csv)")
+      .option(
+        "-f, --file <file>",
+        "Read commit hashes from file (one per line)",
+      )
+      .option(
+        "-a, --author <email>",
+        "Filter commits by author email (defaults to current user)",
+      )
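+      .option(
+        "-l, --limit <number>",
+        "Limit number of commits to analyze",
+        // Commander passes (value, previous); ignore `previous` and fix the radix at 10.
+        (value: string) => parseInt(value, 10),
+      )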
+      .option(
+        "-r, --resume",
+        "Resume from last checkpoint if available",
+      )
+      .option(
+        "-c, --clear",
+        "Clear any existing progress checkpoint",
+      )
+      .option(
+        "-m, --model <model>",
+        "LLM model to use (claude, gemini, codex)",
+      )
+      .option(
+        "--report",
+        "Generate condensed markdown report from existing CSV",
+      )
+      .option(
+        "--input-csv <file>",
+        "Input CSV file to read for report generation",
+      )
+      .argument(
+        "[commits...]",
+        "Commit hashes to analyze (if none provided, uses current user's commits)",
+      )
+      .parse()
+    const options = program.opts()
+    const args = program.args
+    let commits: string[] = []
+    let useDefaults = false
+    if (options.file) {
+      commits = this.readCommitsFromFile(options.file)
+    } else if (args.length > 0) {
+      commits = args
+    } else {
+      useDefaults = true
+    }
+    return {
+      output: options.output || "output.csv",
+      file: options.file,
+      commits,
+      author: options.author,
+      limit: options.limit,
+      useDefaults,
+      resume: options.resume,
+      clear: options.clear,
+      model: options.model,
+      report: options.report,
+      inputCsv: options.inputCsv,
+    }
+  }
+  private static readCommitsFromFile(filename: string): string[] {
+    try {
+      const content = readFileSync(filename, "utf8")
+      return content
+        .split("\n")
+        .map((line) => line.trim())
+        .filter((line) => line.length > 0)
+    } catch (error) {
+      throw new Error(
+        `Failed to read commits from file ${filename}: ${error instanceof Error ? error.message : "Unknown error"}`,
+      )
+    }
+  }
+  static showHelp(): void {
+    console.log(`
+Usage: commit-analyzer [options] [commits...]
+Analyze git commits and generate categorized summaries using LLM.
+If no commits are specified, analyzes all commits authored by the current user.
+Options:
+  -o, --output <file>   Output file (default: output.csv for analysis, summary-report.md for reports)
+  -f, --file <file>     Read commit hashes from file (one per line)
+  -a, --author <email>  Filter commits by author email (defaults to current user)
+  -l, --limit <number>  Limit number of commits to analyze
+  -r, --resume          Resume from last checkpoint if available
+  -c, --clear           Clear any existing progress checkpoint
+  -m, --model <model>   LLM model to use (claude, gemini, codex)
+  --report              Generate condensed markdown report from existing CSV
+  --input-csv <file>    Input CSV file to read for report generation
+  -h, --help            Display help for command
+  -V, --version         Display version number
+Examples:
+  commit-analyzer                                    # Analyze your authored commits
+  commit-analyzer --limit 10                         # Analyze your last 10 commits
+  commit-analyzer --author user@example.com          # Analyze specific user's commits
+  commit-analyzer abc123 def456 ghi789               # Analyze specific commits
+  commit-analyzer --file commits.txt                 # Read commits from file
+  commit-analyzer --output analysis.csv --limit 20   # Analyze last 20 commits to custom file
+  commit-analyzer --resume                           # Resume from last checkpoint
+  commit-analyzer --clear                            # Clear checkpoint and start fresh
+  commit-analyzer --report                           # Analyze commits, generate CSV, then generate report
+  commit-analyzer --input-csv data.csv --report      # Skip analysis, generate report from existing CSV
+  commit-analyzer --report -o custom-report.md       # Analyze commits, generate CSV, then generate custom report
+    `)
+  }
+}
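The `--limit` option in `cli.ts` above converts its argument with `parseInt`, which accepts inputs such as `10abc` (yielding 10) and returns `NaN` for non-numeric text. The following is a minimal sketch of a stricter converter; `parseLimit` is a hypothetical helper and is not part of the package.

```typescript
// Hypothetical helper, not part of the package: a stricter parser for --limit.
function parseLimit(value: string): number {
  const trimmed = value.trim()
  // Accept only plain digit strings such as "10"; reject "10abc", "-5", "1.5", and "".
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(`--limit expects a positive integer, got "${value}"`)
  }
  const limit = Number.parseInt(trimmed, 10)
  if (limit < 1) {
    throw new Error("--limit must be at least 1")
  }
  return limit
}
```

A function like this could be passed as the third argument to `.option("-l, --limit <number>", ...)` in place of `parseInt`. Commander also exports an `InvalidArgumentError` class that such a parser can throw to produce the library's standard error output.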