npm - @akin01/mailgen - Versions diffs - 0.1.0 - Mend

@akin01/mailgen 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Akin
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,260 @@
+# mailgen
+High-performance email generator using Markov chains and Bloom filters.
+[![Build Status](https://img.shields.io/github/actions/workflow/status/akin01/emailgen/ci.yml)](https://github.com/akin01/emailgen/actions)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+## Features
+- 🚀 **High Performance** - Generate 250K+ emails per second (Fast Mode)
+- 🎯 **Realistic Names** - Markov chain-based name generation
+- ✅ **Uniqueness Guaranteed** - Bloom filter for efficient duplicate detection
+- 📝 **Custom Wordlists** - Support for custom name and domain lists
+- 🔧 **Configurable** - Multiple email patterns and generation options
+- 💾 **Memory Efficient** - ~1.2 MB for 1 million unique emails
+## Installation
+### Binary Installation (Recommended)
+Quickly install the latest binary for your system (Linux, macOS, or Windows):
+```bash
+# Linux/macOS (using install script)
+curl -fsSL https://raw.githubusercontent.com/akin01/emailgen/main/install.sh | sudo bash
+# Windows PowerShell (one-liner, no file download needed)
+powershell -ExecutionPolicy Bypass -Command "iwr -useb https://raw.githubusercontent.com/akin01/emailgen/main/install.ps1 | iex"
+# Alternative PowerShell syntax
+Invoke-WebRequest -Uri https://raw.githubusercontent.com/akin01/emailgen/main/install.ps1 -UseBasicParsing | Invoke-Expression
+```
+### Build from Source
+## Quick Start
+### Generate Emails
+```bash
+# Generate 1000 emails to stdout
+./target/release/mailgen --count 1000
+# Generate 1 million emails to file (Fast Mode)
+./target/release/mailgen --count 1000000 --output emails.txt --fast
+# Use custom wordlists
+./target/release/mailgen --count 10000 \
+    --names data/example_names.txt \
+    --domains data/example_domains.txt \
+    --output emails.txt
+```
+### As a Library
+Add to your `Cargo.toml`:
+```toml
+[dependencies]
+emailgen = { git = "https://github.com/akin01/emailgen" }
+```
+```rust
+use emailgen::EmailGenerator;
+fn main() {
+    // Basic usage
+    let mut generator = EmailGenerator::new();
+    let email = generator.generate();
+    println!("Generated: {}", email);
+    // Generate many emails
+    let emails = generator.generate_many(1000);
+    // With custom wordlists
+    let names = vec!["John Doe".to_string(), "Jane Smith".to_string()];
+    let domains = vec!["example.com".to_string()];
+    let mut generator = EmailGenerator::with_names_and_domains(names, domains);
+    let emails = generator.generate_many(10000);
+}
+```
+## Performance
+### Generation Speed (Actual Benchmarks)
+| Mode | 10K | 100K | 1M |
+|------|-----|------|-----|
+| **Fast Mode** (`--fast`) | 0.04s | 0.38s | 7.5s |
+| **Default Mode** | 3.9s | 39s | ~6.5 min |
+**💡 Tip:** Use `--fast` mode for bulk generation (>10K emails) for best performance.
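Note on the table above: the per-second rates quoted elsewhere in this README can be cross-checked by dividing each count by its timing. The sketch below does only that arithmetic. The timings are copied from the table, `emailsPerSecond` is a hypothetical helper name, and "~6.5 min" is treated as exactly 390 seconds, which is an approximation.

```javascript
// Timings copied from the benchmark table: [emails generated, seconds].
const fastMode = [[10000, 0.04], [100000, 0.38], [1000000, 7.5]];
const defaultMode = [[10000, 3.9], [100000, 39], [1000000, 6.5 * 60]];

// Throughput in emails per second, rounded to the nearest whole email.
function emailsPerSecond(count, seconds) {
  return Math.round(count / seconds);
}

for (const [count, seconds] of fastMode) {
  console.log(`fast    ${count}: ${emailsPerSecond(count, seconds)}/sec`);
}
for (const [count, seconds] of defaultMode) {
  console.log(`default ${count}: ${emailsPerSecond(count, seconds)}/sec`);
}
```

By this arithmetic the 10K and 100K fast-mode rows work out to roughly 250K to 263K emails/sec, while the 1M fast-mode row works out to about 133K emails/sec. All three default-mode rows come to about 2.6K emails/sec.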
+### Memory Usage
+- **~1.2 MB** for 1 million unique emails (Bloom filter)
+### Usage
+```bash
+# Fast mode for bulk generation (~250K emails/sec)
+./target/release/mailgen --count 1000000 --output emails.txt --fast
+# Default mode with 30% Markov for variety (~2.6K emails/sec)
+./target/release/mailgen --count 100000 --output emails.txt
+# Generate to stdout
+./target/release/mailgen --count 1000 --fast
+```
+See [PERFORMANCE.md](PERFORMANCE.md) for detailed benchmarks.
+## Usage
+### Direct Command Line
+After installing via the script, `uv`, or `npm`, the `mailgen` command is available directly in your terminal:
+```bash
+# Basic usage
+mailgen --count 1000
+# Fast mode
+mailgen -c 1000000 --fast
+```
+### Command Line Options
+```
+USAGE:
+    emailgen [OPTIONS]
+OPTIONS:
+    -c, --count <COUNT>            Number of emails to generate [default: 1000]
+    -o, --output <OUTPUT>          Output file path (stdout if not specified)
+    -n, --names <NAMES>            Path to names wordlist file
+    -d, --domains <DOMAINS>        Path to domains file
+        --min-length <MIN>         Minimum username length [default: 5]
+        --max-length <MAX>         Maximum username length [default: 30]
+        --capacity <CAP>           Bloom filter capacity [default: 1000000]
+        --fpr <FPR>                Bloom filter false positive rate [default: 0.01]
+        --fast                     Fast mode (100% wordlist/cached, no Markov)
+        --wordlist-percent <PCT>   Wordlist name percentage (0-100, default: auto)
+        --cache-percent <PCT>      Cached name percentage (0-100, default: auto)
+        --markov-percent <PCT>     Markov generation percentage (0-100, default: 30)
+        --stats                    Show statistics after generation
+    -q, --quiet                    Quiet mode (no output except errors)
+    -h, --help                     Print help
+    -V, --version                  Print version
+```
+**Features:**
+- **TUI Progress Bar**: Animated text-based progress bar with spinner, percentage, speed, and ETA
+- **Parallel Generation**: Multi-threaded generation (always enabled)
+- **Async I/O**: Asynchronous file writing (always enabled)
+**Note:** The TUI progress bar animation works best in interactive terminals. When output is redirected, you'll see the final progress state.
+### Name Source Ratios
+Control the balance between speed and variety:
+```bash
+# Specify all three (must add up to 100)
+./target/release/mailgen --count 100000 --wordlist-percent 35 --cache-percent 35 --markov-percent 30
+# Specify only one - others auto-calculated
+./target/release/mailgen --count 100000 --markov-percent 20
+# Auto-calculates: 40% wordlist, 40% cached, 20% Markov
+./target/release/mailgen --count 100000 --wordlist-percent 80
+# Auto-calculates: 80% wordlist, 15% cached, 5% Markov
+./target/release/mailgen --count 100000 --cache-percent 70
+# Auto-calculates: 25% wordlist, 70% cached, 5% Markov
+# Specify two - third auto-calculated
+./target/release/mailgen --count 100000 --wordlist-percent 50 --markov-percent 10
+# Auto-calculates: 50% wordlist, 40% cached, 10% Markov
+# Fast mode shortcut (50% wordlist, 50% cached, 0% Markov)
+./target/release/mailgen --count 100000 --fast
+```
+| Ratio (wordlist/cache/markov) | Speed | Variety | Use Case |
+|-------------------------------|-------|---------|----------|
+| 100/0/0 | ~260K/sec | Low | Bulk test data |
+| 50/50/0 (--fast) | ~260K/sec | Medium | Fast generation |
+| 35/35/30 (default) | ~2.6K/sec | High | General use with variety |
+| 25/25/50 | ~1.5K/sec | Very High | Maximum variety |
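Note on the ratio rules above: two of them are stated outright. When all three percentages are given they must add up to 100, and when two are given the third is the remainder. A minimal sketch of just those two rules follows. `resolveRatios` is a hypothetical name, not part of the tool. The one-flag case is deliberately left out because the README shows its results but not the rule that produces them.

```javascript
// Hypothetical helper: resolves the wordlist/cache/markov percentages.
// Covers only the cases the README states explicitly.
function resolveRatios({ wordlist, cache, markov }) {
  const given = { wordlist, cache, markov };

  // Each supplied value must be a percentage.
  for (const [name, value] of Object.entries(given)) {
    if (value !== undefined && (value < 0 || value > 100)) {
      throw new RangeError(`${name} must be between 0 and 100`);
    }
  }

  const missing = Object.keys(given).filter((k) => given[k] === undefined);
  const sum = Object.values(given)
    .filter((v) => v !== undefined)
    .reduce((a, b) => a + b, 0);

  if (missing.length === 0) {
    // All three given: they must add up to 100.
    if (sum !== 100) throw new RangeError(`percentages add up to ${sum}, not 100`);
    return given;
  }
  if (missing.length === 1) {
    // Two given: the third is whatever remains.
    if (sum > 100) throw new RangeError(`percentages add up to ${sum}, over 100`);
    return { ...given, [missing[0]]: 100 - sum };
  }
  throw new Error('one-flag and zero-flag cases are not covered by this sketch');
}

console.log(resolveRatios({ wordlist: 50, markov: 10 }));
```

With `--wordlist-percent 50 --markov-percent 10` this yields 40% cached, matching the example in the README.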
+### Examples
+```bash
+# Generate 10K emails with stats
+./target/release/mailgen -c 10000 --stats
+# Generate with custom wordlists
+./target/release/mailgen -c 100000 \
+    -n names.txt \
+    -d domains.txt \
+    -o output.txt
+# Generate with specific constraints
+./target/release/mailgen -c 50000 \
+    --min-length 6 \
+    --max-length 20 \
+    --capacity 100000 \
+    --fpr 0.001
+```
+## Architecture
+### Markov Chain Name Generation
+The email generator uses character-level Markov chains to generate realistic names:
+1. **Training**: Names from wordlist are converted to character sequences
+2. **Generation**: New names are generated by walking the Markov chain
+3. **Patterns**: Multiple email patterns create variety (first.last, firstlast, etc.)
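Note on the training and generation steps above: they can be sketched in a few lines. This is an illustration of a first-order character-level Markov chain, not the project's implementation (the README credits markovify-rs for that). The `^` and `$` start/end markers, the function names, and the injectable `random` parameter are all assumptions made for this sketch.

```javascript
// Training: for each character, record the characters that followed it in
// the training names. '^' marks the start of a name and '$' marks the end.
function train(names) {
  const table = new Map();
  for (const name of names) {
    const chars = ['^', ...name.toLowerCase(), '$'];
    for (let i = 0; i < chars.length - 1; i++) {
      if (!table.has(chars[i])) table.set(chars[i], []);
      table.get(chars[i]).push(chars[i + 1]);
    }
  }
  return table;
}

// Generation: walk the chain from the start marker until the end marker is
// drawn or maxLength is reached. Assumes the table was trained on at least
// one name. `random` is injectable so the walk can be made deterministic.
function generate(table, maxLength = 12, random = Math.random) {
  let current = '^';
  let out = '';
  while (out.length < maxLength) {
    const followers = table.get(current);
    const pick = followers[Math.floor(random() * followers.length)];
    if (pick === '$') break;
    out += pick;
    current = pick;
  }
  return out;
}

const chain = train(['john', 'jane', 'joan', 'jon']);
console.log(generate(chain));
```

Because followers are stored with repetition, frequent transitions are drawn more often, which is what keeps the output name-like. A first-order chain is the simplest variant. Longer contexts usually give more realistic names at the cost of a larger table.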
+### Bloom Filter Uniqueness
+Bloom filters provide space-efficient uniqueness checking:
+- **Space Efficient**: ~1.14 MiB (about 1.2 MB) for 1M elements at 1% false positive rate
+- **Fast Operations**: O(k) where k is number of hash functions
+- **No False Negatives**: If it says "not seen", it's definitely unique
+- **Configurable FPR**: Trade memory for accuracy
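Note on the memory figures above: they follow from the standard Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below evaluates them for the default `--capacity` and `--fpr` values. `bloomSize` is a hypothetical helper, and the actual `bloomfilter` crate may round differently.

```javascript
// Standard Bloom filter sizing:
//   bits   m = -n * ln(p) / (ln 2)^2
//   hashes k = (m / n) * ln 2
function bloomSize(capacity, falsePositiveRate) {
  const bits = Math.ceil(
    (-capacity * Math.log(falsePositiveRate)) / (Math.LN2 * Math.LN2)
  );
  const hashes = Math.round((bits / capacity) * Math.LN2);
  return { bits, bytes: Math.ceil(bits / 8), hashes };
}

// Defaults from the CLI options: capacity 1,000,000 and FPR 0.01.
const size = bloomSize(1000000, 0.01);
console.log(`${(size.bytes / 1e6).toFixed(2)} MB`);
console.log(`${(size.bytes / 1048576).toFixed(2)} MiB`);
console.log(`${size.hashes} hash functions`);
```

For the defaults this gives about 1.20 MB, or 1.14 MiB, with 7 hash functions. So the ~1.2 and ~1.14 figures quoted in this README describe the same filter in decimal and binary units.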
+## Wordlist Format
+### Names File
+One name per line (first + last):
+```
+John Smith
+Jane Doe
+Bob Johnson
+```
+### Domains File
+One domain per line:
+```
+gmail.com
+yahoo.com
+example.com
+```
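Note on the two file formats above: a minimal sketch of how such files could be read and combined using the `first.last` and `firstlast` patterns named under Architecture. Skipping blank lines, lowercasing, and the round-robin domain choice are assumptions made for illustration. The README does not say how the real tool handles any of them.

```javascript
// Parse a names file: one "First Last" entry per line.
// Blank lines are skipped and names are lowercased (both assumptions).
function parseNames(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [first, ...rest] = line.split(/\s+/);
      return { first: first.toLowerCase(), last: rest.join('').toLowerCase() };
    });
}

// The two username patterns the README names: first.last and firstlast.
function usernames({ first, last }) {
  return last ? [`${first}.${last}`, `${first}${last}`] : [first];
}

const parsed = parseNames('John Smith\nJane Doe\nBob Johnson\n');
const domainList = ['gmail.com', 'yahoo.com', 'example.com'];
parsed.forEach((name, i) => {
  // Pair each name with a domain round-robin, purely for illustration.
  console.log(`${usernames(name)[0]}@${domainList[i % domainList.length]}`);
});
```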
+## License
+MIT License - see [LICENSE](LICENSE) for details.
+## Acknowledgments
+- [markovify-rs](https://crates.io/crates/markovify-rs) - Markov chain implementation
+- [bloomfilter](https://crates.io/crates/bloomfilter) - Bloom filter implementation

package/bin/mailgen.js ADDED

@@ -0,0 +1,20 @@
+#!/usr/bin/env node
+const fs = require('fs');
+const path = require('path');
+const os = require('os');
+const { spawnSync } = require('child_process');
+const binName = os.platform() === 'win32' ? 'emailgen.exe' : 'emailgen';
+const binPath = path.join(__dirname, binName);
+if (!fs.existsSync(binPath)) {
+    console.error('emailgen binary not found. Please reinstall the package.');
+    process.exit(1);
+}
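+const result = spawnSync(binPath, process.argv.slice(2), {
+    stdio: 'inherit'
+});
+// spawnSync reports launch failures via result.error instead of throwing.
+if (result.error) {
+    console.error(`Failed to run emailgen: ${result.error.message}`);
+    process.exit(1);
+}
+// status is null when the child was killed by a signal; exit non-zero then.
+process.exit(result.status === null ? 1 : result.status);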

package/package.json ADDED

@@ -0,0 +1,39 @@
+{
+  "name": "@akin01/mailgen",
+  "version": "0.1.0",
+  "description": "High-performance email generator using Markov chains and Bloom filters",
+  "main": "index.js",
+  "bin": {
+    "mailgen": "bin/mailgen.js"
+  },
+  "scripts": {
+    "test": "echo \"Error: no test specified\" && exit 1",
+    "postinstall": "node scripts/install.js"
+  },
+  "files": [
+    "bin/",
+    "scripts/",
+    "README.md",
+    "LICENSE"
+  ],
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/akin01/emailgen.git"
+  },
+  "keywords": [
+    "email",
+    "generator",
+    "markov",
+    "bloom-filter",
+    "cli"
+  ],
+  "author": "Akin <akinpasha82@gmail.com>",
+  "license": "MIT",
+  "bugs": {
+    "url": "https://github.com/akin01/emailgen/issues"
+  },
+  "homepage": "https://github.com/akin01/emailgen#readme",
+  "engines": {
+    "node": ">=12"
+  }
+}

package/scripts/benchmark.sh ADDED

@@ -0,0 +1,153 @@
+#!/bin/bash
+# Email Generator Benchmark Script
+#
+# This script runs comprehensive benchmarks for the emailgen project,
+# comparing different generation sizes and measuring performance.
+#
+# Usage:
+#   ./scripts/benchmark.sh
+#   ./scripts/benchmark.sh --quick    # Run quick benchmarks only
+#   ./scripts/benchmark.sh --full     # Run full benchmarks including 1M generation
+set -e
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
+cd "$PROJECT_DIR"
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+# Default settings
+QUICK_MODE=false
+FULL_MODE=false
+# Parse arguments
+while [[ $# -gt 0 ]]; do
+    case $1 in
+        --quick|-q)
+            QUICK_MODE=true
+            shift
+            ;;
+        --full|-f)
+            FULL_MODE=true
+            shift
+            ;;
+        --help|-h)
+            echo "Usage: $0 [options]"
+            echo ""
+            echo "Options:"
+            echo "  --quick, -q    Run quick benchmarks only (no 1M generation)"
+            echo "  --full, -f     Run full benchmarks including 1M email generation"
+            echo "  --help, -h     Show this help message"
+            exit 0
+            ;;
+        *)
+            echo "Unknown option: $1"
+            exit 1
+            ;;
+    esac
+done
+echo -e "${BLUE}============================================${NC}"
+echo -e "${BLUE}  Email Generator Performance Benchmarks   ${NC}"
+echo -e "${BLUE}============================================${NC}"
+echo ""
+# Ensure release build exists
+echo -e "${YELLOW}Building release binary...${NC}"
+cargo build --release --quiet 2>/dev/null || cargo build --release
+echo ""
+echo -e "${YELLOW}Running Criterion benchmarks...${NC}"
+cargo bench --quiet 2>/dev/null || cargo bench
+echo ""
+echo -e "${BLUE}============================================${NC}"
+echo -e "${BLUE}  CLI Generation Benchmarks                ${NC}"
+echo -e "${BLUE}============================================${NC}"
+echo ""
+# Function to run generation benchmark
+run_generation_benchmark() {
+    local count=$1
+    local label=$2
+    echo -e "${YELLOW}Generating $count emails ($label)...${NC}"
+    # Declare and assign separately so `set -e` still sees a failing command.
+    # Note: %N (nanoseconds) is a GNU date extension.
+    local start_time end_time elapsed rate
+    start_time=$(date +%s.%N)
+    # Generate to /dev/null for performance measurement
+    cargo run --release --quiet -- --count "$count" --output /dev/null 2>/dev/null
+    end_time=$(date +%s.%N)
+    elapsed=$(echo "$end_time - $start_time" | bc)
+    rate=$(echo "scale=0; $count / $elapsed" | bc)
+    echo -e "  ${GREEN}✓${NC} Generated $count emails in ${elapsed}s (${rate} emails/sec)"
+}
+# Quick benchmarks
+echo -e "${BLUE}Quick Benchmarks:${NC}"
+echo "----------------------------------------"
+run_generation_benchmark 1000 "1K emails"
+run_generation_benchmark 10000 "10K emails"
+run_generation_benchmark 100000 "100K emails"
+# Full benchmarks
+if [ "$FULL_MODE" = true ] || [ "$QUICK_MODE" = false ]; then
+    echo ""
+    echo -e "${BLUE}Full Benchmarks:${NC}"
+    echo "----------------------------------------"
+    run_generation_benchmark 500000 "500K emails"
+    if [ "$FULL_MODE" = true ]; then
+        run_generation_benchmark 1000000 "1M emails"
+    fi
+fi
+echo ""
+echo -e "${BLUE}============================================${NC}"
+echo -e "${BLUE}  Memory Usage Analysis                    ${NC}"
+echo -e "${BLUE}============================================${NC}"
+echo ""
+# Memory usage test
+echo -e "${YELLOW}Testing memory usage for different capacities:${NC}"
+for capacity in 10000 100000 1000000; do
+    echo ""
+    echo "Bloom filter capacity: $capacity"
+    cargo run --release --quiet -- --count 1000 --capacity $capacity --stats 2>&1 | grep "Memory usage" || true
+done
+echo ""
+echo -e "${BLUE}============================================${NC}"
+echo -e "${BLUE}  Benchmark Summary                        ${NC}"
+echo -e "${BLUE}============================================${NC}"
+echo ""
+# Generate summary
+echo "Performance Targets:"
+echo "  ✓ 1K emails:   < 0.1 seconds"
+echo "  ✓ 10K emails:  < 1 second"
+echo "  ✓ 100K emails: < 10 seconds"
+echo "  ✓ 1M emails:   < 100 seconds"
+echo ""
+echo "Memory Efficiency:"
+echo "  ✓ Bloom filter uses ~1.2 MB for 1M emails at 1% false positive rate"
+echo "  ✓ False positive rate: 1% (configurable)"
+echo ""
+echo -e "${GREEN}Benchmarks complete!${NC}"
+echo ""
+echo "For detailed results, see:"
+echo "  - target/criterion/report/index.html"
+echo ""

package/scripts/generate.sh ADDED

@@ -0,0 +1,36 @@
+#!/bin/bash
+# Quick Email Generation Script
+#
+# Usage:
+#   ./scripts/generate.sh [count] [output_file]
+#
+# Examples:
+#   ./scripts/generate.sh              # Generate 100 emails to stdout
+#   ./scripts/generate.sh 1000         # Generate 1K emails to stdout
+#   ./scripts/generate.sh 10000 emails.txt  # Generate 10K to file
+set -e
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
+cd "$PROJECT_DIR"
+# Default values
+COUNT=${1:-100}
+OUTPUT=${2:-""}
+# Build if needed
+if [ ! -f "target/release/emailgen" ]; then
+    echo "Building emailgen..."
+    cargo build --release --quiet
+fi
+# Generate
+if [ -n "$OUTPUT" ]; then
+    echo "Generating $COUNT emails to $OUTPUT..."
+    ./target/release/emailgen --count "$COUNT" --output "$OUTPUT"
+    echo "Done! Generated $COUNT emails."
+    echo "File size: $(ls -lh "$OUTPUT" | awk '{print $5}')"
+else
+    ./target/release/emailgen --count $COUNT --quiet
+fi

package/scripts/install.js ADDED

@@ -0,0 +1,103 @@
+const fs = require('fs');
+const path = require('path');
+const os = require('os');
+const https = require('https');
+const { execSync } = require('child_process');
+const REPO = "akin01/emailgen";
+const BIN_NAME = os.platform() === 'win32' ? 'emailgen.exe' : 'emailgen';
+const DEST_DIR = path.join(__dirname, '..', 'bin');
+if (!fs.existsSync(DEST_DIR)) {
+    fs.mkdirSync(DEST_DIR, { recursive: true });
+}
+const getPlatformAsset = () => {
+    const platform = os.platform();
+    const arch = os.arch();
+    if (platform === 'darwin') {
+        if (arch === 'arm64') return `emailgen-macos-aarch64.tar.gz`;
+        if (arch === 'x64') return `emailgen-macos-x86_64.tar.gz`;
+    }
+    if (platform === 'linux') {
+        if (arch === 'x64') return `emailgen-linux-x86_64.tar.gz`;
+    }
+    if (platform === 'win32') {
+        if (arch === 'x64') return `emailgen-windows-x86_64.zip`;
+    }
+    throw new Error(`Unsupported platform/architecture: ${platform}/${arch}`);
+};
+const download = (url, dest) => {
+    return new Promise((resolve, reject) => {
+        https.get(url, (response) => {
+            if (response.statusCode === 302 || response.statusCode === 301) {
+                response.resume(); // discard the redirect body
+                download(response.headers.location, dest).then(resolve).catch(reject);
+                return;
+            }
+            if (response.statusCode !== 200) {
+                response.resume();
+                reject(new Error(`Download failed with status code ${response.statusCode}`));
+                return;
+            }
+            // Open the file only once a real payload is arriving, so redirects
+            // and failures do not leave a dangling write stream behind.
+            const file = fs.createWriteStream(dest);
+            response.pipe(file);
+            file.on('finish', () => {
+                file.close(resolve);
+            });
+            file.on('error', (err) => {
+                fs.unlink(dest, () => {});
+                reject(err);
+            });
+        }).on('error', (err) => {
+            fs.unlink(dest, () => {});
+            reject(err);
+        });
+    });
+};
+const install = async () => {
+    try {
+        const assetName = getPlatformAsset();
+        console.log(`Getting latest release for ${REPO}...`);
+        const apiUrl = `https://api.github.com/repos/${REPO}/releases/latest`;
+        const options = {
+            headers: { 'User-Agent': 'node.js' }
+        };
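+        const release = await new Promise((resolve, reject) => {
+            https.get(apiUrl, options, (res) => {
+                if (res.statusCode !== 200) {
+                    res.resume(); // discard the body so the socket is released
+                    reject(new Error(`GitHub API request failed with status code ${res.statusCode}`));
+                    return;
+                }
+                let data = '';
+                res.on('data', chunk => data += chunk);
+                res.on('end', () => {
+                    // JSON.parse can throw; turn that into a rejection.
+                    try { resolve(JSON.parse(data)); } catch (err) { reject(err); }
+                });
+                res.on('error', reject);
+            }).on('error', reject); // network errors are emitted on the request
+        });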
+        const tag = release.tag_name;
+        const downloadUrl = `https://github.com/${REPO}/releases/download/${tag}/${assetName}`;
+        const tempPath = path.join(os.tmpdir(), assetName);
+        console.log(`Downloading ${assetName} from ${tag}...`);
+        await download(downloadUrl, tempPath);
+        console.log(`Extracting to ${DEST_DIR}...`);
+        if (assetName.endsWith('.zip')) {
+            const extractCmd = `powershell Expand-Archive -Path "${tempPath}" -DestinationPath "${DEST_DIR}" -Force`;
+            execSync(extractCmd);
+        } else {
+            const extractCmd = `tar -xzf "${tempPath}" -C "${DEST_DIR}"`;
+            execSync(extractCmd);
+        }
+        if (os.platform() !== 'win32') {
+            fs.chmodSync(path.join(DEST_DIR, BIN_NAME), 0o755);
+        }
+        console.log(`Successfully installed emailgen!`);
+        fs.unlinkSync(tempPath);
+    } catch (err) {
+        console.error(`Error during installation: ${err.message}`);
+        process.exit(1);
+    }
+};
+install();
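Note on `getPlatformAsset()` above: the platform-to-asset mapping is easier to check when it takes the platform and architecture as parameters instead of reading them from `os`. The sketch below is one way to do that with a lookup table. The asset names are copied from the script, while `ASSETS` and `assetFor` are hypothetical names introduced here.

```javascript
// Asset names copied from getPlatformAsset(), keyed by "platform/arch".
const ASSETS = {
  'darwin/arm64': 'emailgen-macos-aarch64.tar.gz',
  'darwin/x64': 'emailgen-macos-x86_64.tar.gz',
  'linux/x64': 'emailgen-linux-x86_64.tar.gz',
  'win32/x64': 'emailgen-windows-x86_64.zip',
};

// Pure function: platform and arch are parameters, so every combination can
// be checked without running on that machine.
function assetFor(platform, arch) {
  const asset = ASSETS[`${platform}/${arch}`];
  if (!asset) {
    throw new Error(`Unsupported platform/architecture: ${platform}/${arch}`);
  }
  return asset;
}

const os = require('os');
try {
  console.log(assetFor(os.platform(), os.arch()));
} catch (err) {
  // Same message the install script would print for an unsupported host.
  console.log(err.message);
}
```

The table also makes the supported set visible at a glance. Linux on arm64, for example, has no entry, so the install step fails there with the "Unsupported platform/architecture" error.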