npm - @intentsolutionsio/penetration-tester - Versions diffs - 2.0.0 → 3.0.4 - Mend

@intentsolutionsio/penetration-tester 2.0.0 → 3.0.4

Files changed (112)

package/skills/scanning-for-hardcoded-secrets/references/PLAYBOOK.md ADDED

@@ -0,0 +1,325 @@
+# Hardcoded-Secrets Remediation Playbook
+## After detection: the standing procedure
+1. **Rotate.** Within the hour. Between detection and rotation,
+   the credential is useful to no one but attackers.
+2. **Audit upstream logs.** Check the provider's API audit log
+   (AWS CloudTrail, GitHub audit log, Stripe events, etc.) for any
+   request against the credential since the leak commit timestamp.
+3. **Remove from source.** Replace the literal with an env-var
+   lookup or secrets-manager fetch. See per-language patterns
+   below.
+4. **Add a pre-commit gate.** Wire this skill into the pre-commit
+   hook so the same engineer doesn't re-introduce the same class
+   tomorrow.
+5. **(Optional) Scrub history.** Only if the repo is private with
+   controlled clones AND the credential is non-rotatable AND the
+   coordination overhead is acceptable.
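Step 5's gate is a conjunction of three conditions, which can be sketched as a small predicate. The function and parameter names are hypothetical, chosen for illustration; they are not part of the skill.

```python
def should_scrub_history(private_with_controlled_clones: bool,
                         credential_rotatable: bool,
                         overhead_acceptable: bool) -> bool:
    """Return True only when every condition from step 5 holds.

    Hypothetical helper: it decides the optional scrub only.
    Steps 1-4 of the procedure apply regardless of the result.
    """
    return (private_with_controlled_clones
            and not credential_rotatable
            and overhead_acceptable)
```

Any single failing condition (a public repo, a rotatable credential, or too much coordination cost) is enough to skip the scrub.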
+## Per-language migration patterns
+### Python — `python-dotenv`
+```python
+# Before (vulnerable):
+STRIPE_KEY = "sk_live_abc123..."
+# After:
+import os
+from dotenv import load_dotenv
+load_dotenv()
+STRIPE_KEY = os.environ["STRIPE_KEY"]  # KeyError on missing
+```
+`.env` (gitignored):
+```
+STRIPE_KEY=sk_live_abc123...
+```
+`.gitignore`:
+```
+.env
+.env.local
+.env.production
+```
+### Python — pydantic-settings (typed config)
+```python
+from pydantic_settings import BaseSettings, SettingsConfigDict
+class Settings(BaseSettings):
+    model_config = SettingsConfigDict(env_file=".env", case_sensitive=False)
+    stripe_key: str
+    aws_access_key_id: str
+    aws_secret_access_key: str
+settings = Settings()
+```
+### Node.js — dotenv
+```javascript
+// Before:
+const STRIPE_KEY = "sk_live_abc123...";
+// After:
+require('dotenv').config();
+const STRIPE_KEY = process.env.STRIPE_KEY;
+if (!STRIPE_KEY) throw new Error("STRIPE_KEY not configured");
+```
+### Ruby on Rails — `Rails.application.credentials`
+```bash
+# Edit (creates / decrypts encrypted credentials)
+EDITOR=vim rails credentials:edit
+```
+In the editor:
+```yaml
+stripe:
+  api_key: sk_live_abc123...
+```
+Then in code:
+```ruby
+Rails.application.credentials.dig(:stripe, :api_key)
+```
+The encrypted file (`config/credentials.yml.enc`) commits to git;
+the decryption key (`config/master.key`) does NOT. Both are
+required to read the secret at runtime.
+### Go — `envconfig`
+```go
+package main
+import (
+    "log"
+    "github.com/kelseyhightower/envconfig"
+)
+type Config struct {
+    StripeKey      string `envconfig:"STRIPE_KEY" required:"true"`
+    AWSAccessKeyID string `envconfig:"AWS_ACCESS_KEY_ID" required:"true"`
+    AWSSecretKey   string `envconfig:"AWS_SECRET_ACCESS_KEY" required:"true"`
+}
+func main() {
+    var c Config
+    if err := envconfig.Process("", &c); err != nil {
+        log.Fatal(err)
+    }
+}
+```
+### Rust — `dotenvy` + `envy`
+```rust
+use serde::Deserialize;
+#[derive(Deserialize)]
+struct Config {
+    stripe_key: String,
+    aws_access_key_id: String,
+    aws_secret_access_key: String,
+}
+fn main() {
+    dotenvy::dotenv().ok();
+    let config: Config = envy::from_env().expect("missing env config");
+}
+```
+### Java — Spring Boot `@Value`
+```java
+@Component
+public class StripeConfig {
+    @Value("${stripe.api.key}")
+    private String stripeKey;
+}
+```
+`application.yml`:
+```yaml
+stripe:
+  api:
+    key: ${STRIPE_KEY}   # interpolated from environment
+```
+Production: set `STRIPE_KEY` in the environment (Kubernetes secret,
+ECS task definition, etc.) and Spring picks it up at startup.
+## Secrets managers
+For production, env vars alone aren't sufficient (they leak via
+process listings, container introspection, error reports). Use a
+dedicated secrets manager.
+### AWS Secrets Manager
+```python
+import boto3, json
+client = boto3.client("secretsmanager")
+secret_value = client.get_secret_value(SecretId="prod/stripe")["SecretString"]
+config = json.loads(secret_value)
+STRIPE_KEY = config["api_key"]
+```
+IAM policy grants the runtime role `secretsmanager:GetSecretValue`
+on the specific secret ARN. No literal in source; no literal in
+env-var dumps.
+### GCP Secret Manager
+```python
+from google.cloud import secretmanager
+client = secretmanager.SecretManagerServiceClient()
+name = "projects/my-project/secrets/stripe-key/versions/latest"
+response = client.access_secret_version(request={"name": name})
+STRIPE_KEY = response.payload.data.decode("UTF-8")
+```
+### HashiCorp Vault
+```bash
+# At runtime, the application fetches the secret
+vault kv get -field=api_key secret/stripe
+```
+Or via the SDK:
+```python
+import hvac
+client = hvac.Client(url="https://vault.internal:8200")
+client.auth.approle.login(role_id=..., secret_id=...)
+secret = client.secrets.kv.v2.read_secret_version(path="stripe")
+STRIPE_KEY = secret["data"]["data"]["api_key"]
+```
+### Doppler / 1Password Secrets Automation / Bitwarden Secrets
+All follow the same pattern: SDK call at startup, no literals in
+source.
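The shared pattern can be sketched independently of any vendor: resolve every required secret once at startup through an injected fetch function, and fail fast if one is missing. `load_secrets_at_startup` and the `fetch` callable are hypothetical names; in practice `fetch` would wrap whichever vendor SDK call applies.

```python
from typing import Callable, Dict, Iterable, Optional


def load_secrets_at_startup(
    fetch: Callable[[str], Optional[str]],
    names: Iterable[str],
) -> Dict[str, str]:
    """Fetch each named secret once; raise if any is missing or empty."""
    loaded: Dict[str, str] = {}
    missing = []
    for name in names:
        value = fetch(name)  # hypothetical wrapper around a vendor SDK call
        if value:
            loaded[name] = value
        else:
            missing.append(name)
    if missing:
        # Report names only, never values, so the error is safe to log.
        raise RuntimeError("missing secrets: " + ", ".join(missing))
    return loaded


# Usage with a stand-in store; a real fetcher would call the vendor SDK.
fake_store = {"stripe/api_key": "value-from-manager"}
secrets = load_secrets_at_startup(fake_store.get, ["stripe/api_key"])
```

Failing at startup, rather than on first use, keeps a misconfigured deployment from running partway before discovering a missing credential.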
+## Pre-commit hook integration
+### Using `pre-commit` framework
+`.pre-commit-config.yaml`:
+```yaml
+repos:
+  - repo: local
+    hooks:
+      - id: scan-secrets
+        name: Scan for hardcoded secrets
+        entry: python3 plugins/security/penetration-tester/skills/scanning-for-hardcoded-secrets/scripts/scan_secrets.py
+        language: system
+        args: ['--min-severity', 'high']
+        pass_filenames: false
+```
+Install: `pre-commit install`. Now every `git commit` runs the
+scan; commits abort if the scan finds a high/critical credential.
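In hook terms, "abort" means the hook process exits with a non-zero status. Below is a minimal sketch of how a `--min-severity` gate could map findings to an exit code. It assumes each finding is a dict with a `severity` field and that severities are ordered low < medium < high < critical. The finding shape and the `low` level are assumptions; the script's actual internals are not shown in this diff.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]  # assumed ordering


def exit_code_for(findings, min_severity="high"):
    """Return 1 if any finding is at or above min_severity, else 0."""
    threshold = SEVERITY_ORDER.index(min_severity)
    for finding in findings:
        # Normalize case so "CRITICAL" and "critical" compare equally.
        if SEVERITY_ORDER.index(finding["severity"].lower()) >= threshold:
            return 1
    return 0
```

With `min_severity="high"`, a MEDIUM finding alone leaves the exit code at 0, so the commit proceeds.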
+### Husky (Node projects)
+`package.json` (Husky v4 configuration; Husky v5 and later read hooks from a `.husky/pre-commit` script instead):
+```json
+{
+  "husky": {
+    "hooks": {
+      "pre-commit": "python3 plugins/security/penetration-tester/skills/scanning-for-hardcoded-secrets/scripts/scan_secrets.py --min-severity high . || exit 1"
+    }
+  }
+}
+```
+## Provider rotation procedures
+### AWS access key
+```bash
+# Create new key (keep old active until apps cut over)
+aws iam create-access-key --user-name myuser
+# Update app config / secrets manager with new key
+# Verify apps are using the new key (CloudTrail should show no further requests from the old key)
+# Then deactivate + delete old key
+aws iam update-access-key --user-name myuser --access-key-id AKIAOLD --status Inactive
+# After 24h grace:
+aws iam delete-access-key --user-name myuser --access-key-id AKIAOLD
+```
+### GitHub PAT
+Settings → Developer settings → Personal access tokens → revoke
+old, generate new. Update CI / local config.
+### Stripe
+Dashboard → Developers → API keys → roll secret key. Three steps:
+generate new, deploy, deactivate old.
+### Anthropic
+Console → API keys → revoke old, create new. Update env / secrets
+manager.
+### Slack
+App settings → OAuth & Permissions → "Reinstall to Workspace" with
+admin approval generates fresh bot/user tokens.
+## GitHub Secret Scanning integration
+If your repo is on GitHub:
+- Settings → Code security → enable "Secret scanning alerts"
+- Settings → Code security → enable "Push protection" (this is the
+  game-changer: blocks pushes that contain detected credential
+  shapes, BEFORE the commit lands on the remote)
+GitHub's pattern library is roughly equivalent to this skill's;
+running both is defense-in-depth.
+## CI integration
+```yaml
+- name: Hardcoded-secrets scan
+  run: |
+    python3 plugins/security/penetration-tester/skills/scanning-for-hardcoded-secrets/scripts/scan_secrets.py \
+        . --min-severity high --format json --output secrets-scan.json
+- name: Fail on findings
+  run: |
+    if [ "$(jq 'length' secrets-scan.json)" != "0" ]; then
+      cat secrets-scan.json
+      exit 1
+    fi
+```
+## Verification after remediation
+```bash
+python3 ${CLAUDE_PLUGIN_ROOT}/skills/scanning-for-hardcoded-secrets/scripts/scan_secrets.py \
+    /path/to/repo --min-severity high
+```
+Expected: exit 0, zero high/critical findings. MEDIUM entropy-based
+findings may persist if your codebase legitimately contains
+high-entropy literals in test fixtures or build artifacts; verify
+manually.

package/skills/scanning-for-hardcoded-secrets/references/THEORY.md ADDED

@@ -0,0 +1,175 @@
+# Hardcoded-Secrets Theory
+## Why this class persists despite being well-known
+The pattern is well-understood: don't hardcode credentials in source.
+Every engineering team knows this. Every framework's getting-started
+guide opens with "set this in your `.env` file, not in code." Yet
+the class remains a leading root cause of credential compromise year
+over year.
+Three reasons:
+1. **The "just for testing" trap.** An engineer is debugging
+   integration with a new API. The right thing is to set
+   `STRIPE_KEY=...` in the local env and read from there. The fast
+   thing is to paste the key into the integration test file as a
+   literal. The test works, the engineer moves on, the literal
+   stays.
+2. **Migration leftovers.** A codebase migrated from one secrets
+   pattern to another (e.g., from `.env` to AWS Secrets Manager)
+   often leaves stale literals from the pre-migration state, even if
+   the runtime fetches from the new location.
+3. **Test fixtures with real keys.** Integration tests need real
+   credentials to test against real APIs. Some teams check those
+   into a `tests/fixtures/` directory with full intent. The test
+   harness is now a permanent credential leak surface.
+The defensive answer is automated detection: scan on every commit
+(pre-commit hook), every push (CI gate), every release (full audit).
+Tools like gitleaks, trufflehog, and GitHub Secret Scanning all
+implement variations of the same regex library this skill uses.
+## Why provider-specific regex is the right pattern
+The naive approach is entropy detection: "find any long string with
+high randomness." The problem is the false-positive rate. Hash
+digests, base64-encoded image data, minified JavaScript, and
+compiled artifacts all look entropy-shaped.
+Provider-specific regex works because credential issuers use
+prefixed shapes intentionally — partly for routing, partly so their
+own scanners can find leaks in customer code. Examples:
+| Provider | Prefix | Why prefixed |
+|---|---|---|
+| AWS | `AKIA`, `ASIA`, `ABIA` | IAM type indicator (`AKIA` = long-term, `ASIA` = STS session) |
+| GitHub | `ghp_`, `gho_`, `ghs_`, `ghu_`, `ghr_` | Token-class router |
+| Stripe | `sk_live_`, `sk_test_`, `rk_live_`, `pk_live_` | Environment + role |
+| Anthropic | `sk-ant-api03-` | Format-version + env |
+| OpenAI | `sk-` and `sk-proj-` | Origin (user vs project) |
+| Slack | `xoxb-`, `xoxp-`, `xoxa-` | Token type |
+| Twilio | `AC`, `SK` | Account SID vs API key SID |
+| Google | `AIza` | API-key marker (OAuth tokens and service-account keys use other formats) |
+The prefix means the provider's own scanner can find the leak the
+moment it lands in a public repo (GitHub Secret Scanning
+auto-notifies the provider). The scanner runs at machine speed: a public
+gist containing `AKIA...` is detected in seconds, and AWS gets a
+push notification.
+This skill scans for the same set on one assumption: if the
+provider's bots will find a leak within seconds, the defensive
+posture is to find it before the commit lands.
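A minimal sketch of prefix-anchored matching, using a few prefixes from the table above. The prefixes come from the table; the body character classes and lengths are illustrative assumptions, and `PROVIDER_PATTERNS` and `find_provider_secrets` are hypothetical names rather than the skill's actual rule set.

```python
import re

# Prefixes are taken from the table; body lengths and character
# classes are assumptions made for this illustration.
PROVIDER_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA|ABIA)[A-Z0-9]{16}\b"),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "stripe_secret_key": re.compile(r"\b(?:sk|rk)_live_[A-Za-z0-9]{16,}\b"),
    "slack_token": re.compile(r"\bxox[bpa]-[A-Za-z0-9-]{10,}"),
}


def find_provider_secrets(text):
    """Return (provider, matched_string) pairs found in text."""
    hits = []
    for provider, pattern in PROVIDER_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((provider, match.group(0)))
    return hits


# A placeholder shaped like an AWS key ID matches; a hex digest does not.
fake_aws = "AKIA" + "A" * 16
print(find_provider_secrets('AWS_KEY = "' + fake_aws + '"'))
print(find_provider_secrets("digest = a3b9c1d2e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b9"))
```

Anchoring on the prefix is what keeps hash digests and other random-looking strings out of the results, which is the false-positive problem described for pure entropy detection.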
+## The 1-minute leak window
+For public repos, the time between commit landing and credential
+extraction is on the order of seconds. GitHub Secret Scanning is
+roughly real-time; bot operators scraping the public push event
+stream are also real-time. By the time a developer notices the
+credential in their `git log` and force-pushes a rewrite, the
+extraction has happened. The credential must be considered
+compromised from the moment of push, regardless of subsequent
+history-scrub.
+For private repos: window depends on access posture. If contractors
+can clone, the window is "until you trust every contractor with
+every credential ever committed." If only employees clone, the
+window is "until any employee departs."
+Either way: rotation is mandatory; history scrubbing is optional.
+## Entropy as a fallback
+Provider regex covers known credential shapes. New providers and
+custom internal tokens don't match.
+Shannon entropy measures information density in a string. Higher
+entropy = less compressible = more "random-looking." Real
+credentials are by design high-entropy (an attacker brute-forcing a
+low-entropy credential succeeds instantly).
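+The threshold of ~4.5 bits/char is empirically calibrated against
+typical per-character values:
+- English text: ~3.5
+- Base64: up to 6.0 (log2 64; approached only by long strings)
+- Hex: up to 4.0 (log2 16)
+- Random 32-char: up to 5.0 (log2 32; a string of n characters cannot measure above log2 n)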
+Above 4.5 in a field labeled `key:`, `token:`, `secret:`,
+`password=` is a strong signal. Below 4.5 is usually English /
+placeholder / template variable.
+False positives:
+- High-entropy hashes (commit SHAs, content hashes) appearing in a
+  field labeled `key:` (e.g., `cache_key: a3b9...`). Use context to
+  filter.
+- Long base64-encoded test fixtures (PDF content, certificate
+  blobs). These trip the entropy check; the human-verification
+  step rejects them.
+The skill emits these as MEDIUM, not CRITICAL, with explicit
+"requires verification" framing.
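The entropy fallback described above can be sketched in a few lines. This is a minimal illustration, assuming the measurement is the empirical per-character Shannon entropy of the candidate string; `shannon_entropy`, `needs_verification`, and the two constants are hypothetical names, not the skill's actual code.

```python
import math
from collections import Counter

SECRET_LABELS = ("key", "token", "secret", "password")  # labels named in the text
ENTROPY_THRESHOLD = 4.5  # bits per character, per the text


def shannon_entropy(value: str) -> float:
    """Empirical Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def needs_verification(label: str, value: str) -> bool:
    """Flag a value when its label looks secret-like and its entropy is high."""
    labelled = any(word in label.lower() for word in SECRET_LABELS)
    return labelled and shannon_entropy(value) > ENTROPY_THRESHOLD
```

Under this definition the score is bounded by log2 of the number of distinct characters, so a hex digest can measure at most 4.0 and only denser encodings such as Base64 cross the 4.5 threshold. A placeholder like `changeme` scores far lower and is not flagged.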
+## History-scrub decision framework
+After finding a leaked credential in source:
+**Always rotate the credential.** Non-negotiable.
+**History scrub depends on these inputs:**
+1. **Is the repo public?** Yes → scrub is roughly futile. Anyone
+   who cloned has the history; mirror sites cache it. Public repo
+   = "publish a forever-archive of every commit ever made."
+2. **Is the credential still in the file's current state?** Yes →
+   scrub removes the live exposure. No (file's been fixed) → history
+   is the only remaining exposure surface.
+3. **Are clones controlled?** Yes (private repo, employees only) →
+   force-push + force-pull on every clone is feasible. No → don't
+   bother.
+4. **How long has the credential been in history?** Days → scrub
+   may catch unindexed copies. Months → assume copies exist
+   permanently.
+**Pragmatic default:** rotate, fix the current state, don't scrub
+history. The credential is dead either way; the historical
+disclosure of "we leaked something at this point in time" is mostly
+narrative, not technical risk, once the credential is rotated.
+**Exception:** if the credential CAN'T be rotated (e.g., it's
+embedded in a customer's deployed binary), scrub becomes the only
+remediation. Plan accordingly.
+## Why test directories are excluded by default
+Test fixtures often contain credential-shaped strings that ARE
+placeholders:
+```python
+# tests/fixtures/auth.py
+TEST_API_KEY = "ghp_FAKE0000000000000000000000000000000000"
+TEST_AWS_KEY = "AKIATESTKEY12345678"
+```
+These match the regex but are deliberately fake. Scanning tests by
+default produces high false-positive rates that train operators to
+ignore findings.
+The `--include-tests` flag is for the audit case: when reviewing an
+inherited codebase, you DO want to know whether fixtures contain
+real credentials someone forgot to redact. The flag is opt-in.
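The default exclusion and the opt-in flag can be sketched as a path filter. The directory names in `TEST_DIR_NAMES` and the `should_scan` function are assumptions for illustration; the skill's actual exclusion list is not shown in this diff.

```python
from pathlib import PurePosixPath

# Assumed set of directory names treated as test locations.
TEST_DIR_NAMES = {"test", "tests", "__tests__", "spec", "fixtures"}


def should_scan(path: str, include_tests: bool = False) -> bool:
    """Skip files under test directories unless include_tests is set."""
    if include_tests:
        return True
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    return not any(part in TEST_DIR_NAMES for part in directories)
```

With the default, `tests/fixtures/auth.py` is skipped and `src/app.py` is scanned; passing `include_tests=True` mirrors the audit case and scans both.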
+## Primary sources
+- [CWE-798 Use of Hard-coded Credentials](https://cwe.mitre.org/data/definitions/798.html)
+- [CWE-321 Use of Hard-coded Cryptographic Key](https://cwe.mitre.org/data/definitions/321.html)
+- [OWASP A07:2021 Identification and Authentication Failures](https://owasp.org/Top10/A07_2021-Identification_and_Authentication_Failures/)
+- [GitHub Secret Scanning patterns](https://docs.github.com/en/code-security/secret-scanning/secret-scanning-patterns)
+- [AWS IAM access-key reference](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html)
+- [Trufflehog detector list](https://github.com/trufflesecurity/trufflehog/tree/main/pkg/detectors)
+- [Gitleaks rules](https://github.com/gitleaks/gitleaks/blob/master/config/gitleaks.toml)