@aegis-scan/skills 0.4.0 → 0.5.0

Files changed (61)
  1. package/ATTRIBUTION.md +111 -0
  2. package/CHANGELOG.md +48 -3
  3. package/package.json +1 -1
  4. package/skills/compliance/aegis-native/brutaler-anwalt/CHANGELOG.md +202 -0
  5. package/skills/compliance/aegis-native/brutaler-anwalt/LICENSE +43 -0
  6. package/skills/compliance/aegis-native/brutaler-anwalt/README.md +236 -0
  7. package/skills/compliance/aegis-native/brutaler-anwalt/SKILL.md +339 -5
  8. package/skills/compliance/aegis-native/brutaler-anwalt/references/aegis-integration.md +3 -4
  9. package/skills/compliance/aegis-native/brutaler-anwalt/references/audit-patterns.md +842 -5
  10. package/skills/compliance/aegis-native/brutaler-anwalt/references/bgh-urteile.md +226 -10
  11. package/skills/compliance/aegis-native/brutaler-anwalt/references/branchenrecht.md +365 -1
  12. package/skills/compliance/aegis-native/brutaler-anwalt/references/checklisten.md +33 -0
  13. package/skills/compliance/aegis-native/brutaler-anwalt/references/dsgvo.md +26 -0
  14. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/BDSG/paragraphs.md +62 -0
  15. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/BFSG/paragraphs.md +85 -0
  16. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/BGB/paragraphs.md +112 -0
  17. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/DDG/paragraphs.md +71 -0
  18. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/DSGVO/articles.md +182 -0
  19. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/EU-Verordnungen/AI-Act-2024-1689/articles.md +108 -0
  20. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/EU-Verordnungen/DSA-2022-2065/articles.md +131 -0
  21. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/HGB-AO/paragraphs.md +61 -0
  22. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/INDEX.md +93 -0
  23. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/TDDDG/paragraphs.md +67 -0
  24. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/UWG/paragraphs.md +117 -0
  25. package/skills/compliance/aegis-native/brutaler-anwalt/references/gesetze/VSBG/paragraphs.md +57 -0
  26. package/skills/compliance/aegis-native/brutaler-anwalt/references/it-recht.md +22 -0
  27. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/INDEX.md +122 -0
  28. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/ai/mistral-eu.md +123 -0
  29. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/ai/openai-dpa.md +120 -0
  30. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/auth/nextauth-tom.md +120 -0
  31. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/auth/supabase-auth-tom.md +104 -0
  32. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/nextjs/proxy-csp-pattern.md +93 -0
  33. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/payment/stripe-pci-tom.md +121 -0
  34. package/skills/compliance/aegis-native/brutaler-anwalt/references/stack-patterns/tracking/plausible-pattern.md +107 -0
  35. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/AffiliateDisclaimer.tsx.example +54 -0
  36. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/COMPLIANCE-AUDIT-TRAIL-template.md +95 -0
  37. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/DSE-Section-UGC.md.example +77 -0
  38. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/DSFA-template.md +76 -0
  39. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/LostFoundReportForm-consent.tsx.example +126 -0
  40. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/README.md +33 -0
  41. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/UmamiScript.tsx.example +64 -0
  42. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/VVT-template.md +60 -0
  43. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/data-retention-cron.ts.example +52 -0
  44. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/data-retention-workflow.yml.example +47 -0
  45. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/proxy-strict-dynamic.ts.example +80 -0
  46. package/skills/compliance/aegis-native/brutaler-anwalt/references/templates/security.txt.example +26 -0
  47. package/skills/compliance/aegis-native/brutaler-anwalt/scripts/health-check.sh +120 -0
  48. package/skills/defensive/aegis-native/rls-defense/SKILL.md +85 -0
  49. package/skills/foundation/aegis-native/aegis-module-builder/SKILL.md +5 -1
  50. package/skills/foundation/aegis-native/aegis-orchestrator/SKILL.md +87 -4
  51. package/skills/foundation/aegis-native/aegis-quality-gates/SKILL.md +69 -9
  52. package/skills/offensive/matty-fork/cicd-redteam/SKILL.md +531 -0
  53. package/skills/offensive/matty-fork/cloud-security/SKILL.md +106 -0
  54. package/skills/offensive/matty-fork/container-escape/SKILL.md +174 -0
  55. package/skills/offensive/matty-fork/mobile-pentester/SKILL.md +357 -0
  56. package/skills/offensive/matty-fork/subdomain-takeover/SKILL.md +154 -0
  57. package/skills/osint/elementalsouls-fork/offensive-osint/README.md +92 -0
  58. package/skills/osint/elementalsouls-fork/offensive-osint/SKILL.md +4177 -0
  59. package/skills/osint/elementalsouls-fork/osint-methodology/README.md +66 -0
  60. package/skills/osint/elementalsouls-fork/osint-methodology/SKILL.md +1695 -0
  61. package/sbom.cdx.json +0 -1
@@ -0,0 +1,4177 @@
1
+ <!-- aegis-local: forked 2026-05-01 from elementalsouls/Claude-OSINT@ea42241d068e8112da0e4e28006207125c835c2e (MIT-licensed); attribution preserved, see ATTRIBUTION.md.
2
+ PORT-NOTE: This skill body references a stdlib-only `secret_scan.py`
3
+ helper that lives upstream but is NOT shipped in @aegis-scan/skills
4
+ (the package enforces a markdown-only invariant via CI). The helper
5
+ will be ported to `packages/scanners/src/recon/external-secret-scan.ts`
6
+ under F-EXTERNAL-SECRETS-1 (planned v0.18.x). Until then, perform
7
+ secret-scanning via AEGIS' existing gitleaks / trufflehog wrappers,
8
+ or fetch the upstream helper directly from the source repo above. -->
9
+
10
+ ---
11
+ name: offensive-osint
12
+ description: "Operational arsenal for external red-team and bug-bounty reconnaissance. Concrete wordlists (28 Swagger paths, 13 GraphQL paths, 35 high-risk ports, 6 missing-header findings, 15 always-on HTTP checks, 5 SAML paths, cloud bucket permutations, JS guess-paths, vendor product fingerprints for Citrix/F5/Pulse/Fortinet/Cisco/PaloAlto/VMware/Exchange, cloud-native service fingerprints, container/K8s exposure paths, CI/CD platform paths, documentation/wiki leak paths, WHOIS/RDAP, DNS record catalog, Wayback CDX recipes), 43+-pattern secret-regex catalog (incl. modern AI API keys: Anthropic/OpenAI/HuggingFace/Cloudflare/DigitalOcean/npm/PyPI/Docker Hub/Atlassian/DataDog/Sentry/ngrok), 80+ dork corpus across 9 categories, GitHub code-search dorks, copy-paste curl/httpie probes for every check, post-discovery enumeration workflows (AWS/GitHub/Slack/JWT/PMAK/Anthropic/OpenAI), endpoint interest scoring rubric (0–100), mobile app ownership confidence, identity-fabric endpoints (Entra/Okta/ADFS/Google/SAML/M365 Teams+SharePoint+OneDrive+OAuth + user-enum), GraphQL field-suggestion enumeration when introspection disabled, 9 read-only secret validators (Postman/AWS/GitHub/Slack/Anthropic/OpenAI/npm/Atlassian/DataDog), Postman workspace search (verified endpoint), Stack Exchange sweep, public SaaS dorks, email security analysis (SPF/DMARC/DKIM/BIMI/MTA-STS/DNSSEC), origin-discovery / CDN bypass techniques, TLS deep audit (sslyze/testssl.sh/JA3/JA4), reverse-DNS sweep + IPv6 enum, vulnerability prioritization data sources (NVD/EPSS/CISA KEV/ExploitDB/Metasploit), 27 attack-path hint templates, 80+ severity-matrix examples, LinkedIn employee enumeration, job posting tech-stack analysis, Slack/Discord workspace discovery, package registry leak hunting (npm/PyPI/Docker Hub/Quay/GHCR), sat imagery for physical recon, tooling quick-install one-liners, sector-specific recon notes (healthcare/finance/ICS-SCADA/IoT/government), runnable stdlib-only secret_scan.py helper, 
plus the existing tool references for username/email/phone/people/social/breach/infrastructure/crypto/media/geospatial/AI/archiving/automation. Use when you need concrete probe paths, regexes, payloads, scoring rules, curl one-liners, and tool URLs for an authorized external recon engagement."
13
+ version: 2.1.1
14
+ triggers:
15
+ - external recon
16
+ - external red team
17
+ - red team external
18
+ - attack surface management
19
+ - ASM
20
+ - bug bounty recon
21
+ - bug bounty
22
+ - reconnaissance
23
+ - footprinting
24
+ - asset discovery
25
+ - swagger discovery
26
+ - openapi discovery
27
+ - graphql introspection
28
+ - graphql discovery
29
+ - subdomain enumeration
30
+ - subdomain takeover
31
+ - cloud bucket enumeration
32
+ - bucket enum
33
+ - S3 enum
34
+ - GCS enum
35
+ - Azure blob enum
36
+ - identity fabric
37
+ - SSO discovery
38
+ - IdP fingerprinting
39
+ - tenant fingerprinting
40
+ - okta enum
41
+ - entra enum
42
+ - azure AD enum
43
+ - ADFS enum
44
+ - SAML metadata
45
+ - mobile recon
46
+ - APK analysis
47
+ - mobile attack surface
48
+ - secret scanning
49
+ - secret leak
50
+ - leaked credential
51
+ - github dorking
52
+ - google dorking
53
+ - bing dorking
54
+ - DDG dorking
55
+ - postman workspace
56
+ - stack exchange OSINT
57
+ - breach lookup
58
+ - have I been pwned
59
+ - HudsonRock cavalier
60
+ - infostealer
61
+ - dehashed
62
+ - intelx
63
+ - shodan recon
64
+ - censys recon
65
+ - certificate transparency
66
+ - crt.sh
67
+ - JARM
68
+ - favicon mmh3
69
+ - JS endpoint extraction
70
+ - sourcemap leak
71
+ - copy paste probes
72
+ - curl one-liner
73
+ - email security analysis
74
+ - SPF DMARC DKIM
75
+ - origin discovery
76
+ - CDN bypass
77
+ - WAF bypass
78
+ - vendor product fingerprints
79
+ - Citrix Netscaler
80
+ - F5 BIG-IP
81
+ - Pulse Secure
82
+ - FortiGate
83
+ - PaloAlto GlobalProtect
84
+ - Cisco AnyConnect
85
+ - VMware vCenter
86
+ - cloud native fingerprint
87
+ - Lambda function URL
88
+ - Cloud Run
89
+ - kubernetes exposure
90
+ - kubelet
91
+ - etcd
92
+ - CI CD exposure
93
+ - Jenkins recon
94
+ - GitLab self-hosted
95
+ - GitHub Actions secrets
96
+ - documentation leak
97
+ - Notion public
98
+ - Confluence anonymous
99
+ - Trello board
100
+ - WHOIS RDAP
101
+ - DNS record catalog
102
+ - Wayback CDX
103
+ - LinkedIn enumeration
104
+ - job posting tech stack
105
+ - Slack workspace discovery
106
+ - Discord server discovery
107
+ - npm token leak
108
+ - PyPI token leak
109
+ - Docker Hub leak
110
+ - sat imagery physical recon
111
+ - TLS deep audit
112
+ - JA3 JA4
113
+ - reverse DNS sweep
114
+ - IPv6 enumeration
115
+ - CVE prioritization
116
+ - EPSS scoring
117
+ - CISA KEV
118
+ - vulnerability prioritization
119
+ - tooling install
120
+ - sector specific recon
121
+ - healthcare DICOM
122
+ - finance SWIFT
123
+ - ICS SCADA
124
+ - Modbus
125
+ - BACnet
126
+ - post discovery workflow
127
+ - JWT triage
128
+ - AWS key triage
129
+ - GraphQL field suggestion
130
+ - Anthropic API key
131
+ - OpenAI API key
132
+ - Microsoft 365 deep
133
+ - Teams federation
134
+ - SharePoint enum
135
+ - OneDrive enum
136
+ ---
137
+
138
+ # Offensive OSINT — External Red-Team Arsenal
139
+
140
+ > Companion skill: `osint-methodology` (the "how to think" skill). This skill is the "what to reach for." Use them together.
141
+
142
+ ## 0. When to use / When NOT
143
+
144
+ **Use this skill when:**
145
+ - You need concrete probe paths, wordlists, regexes, payloads, scoring rules, or tool URLs.
146
+ - You're executing reconnaissance and need the actual technical reference (vs. methodology).
147
+ - You're building a recon automation and need specific lists to seed it.
148
+
149
+ **Do NOT use this skill when:**
150
+ - The user is asking for active exploitation, post-exploitation, or anything past reconnaissance.
151
+ - The user is asking for defensive / blue-team detections.
152
+ - The target's authorization isn't established — see §1.
153
+
154
+ ---
155
+
156
+ ## 1. Authorization & Legal Posture
157
+
158
+ Use this skill only against assets the operator owns or has written authorization to assess. Run a soft scope check before acting against an unverified third-party target; see methodology skill §1 for the full posture.
159
+
160
+ ---
161
+
162
+ ## 2. Confidence Levels
163
+
164
+ - **TENTATIVE** — plausible based on indirect evidence (snippet-only dork match, single-source asset, inferred email pattern).
165
+ - **FIRM** — directly observed (subdomain resolves, HEAD-confirmed bucket exists, banner returned).
166
+ - **CONFIRMED** — verified via independent corroboration OR direct verification (live PMAK validation, multiple sources agree, listable bucket with object retrieval).
167
+
168
+ ---
169
+
170
+ ## 3. Output Format Conventions
171
+
172
+ Findings should carry: `id`, `module`, `asset_key`, `category`, `severity` (info/low/medium/high/critical), `confidence`, `title`, `description`, `evidence` (url + UTC timestamp + sha256 + raw ≤ 2 KiB), `references`, `remediation`. UTC timestamps everywhere.
173
+
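A minimal sketch of how one finding record could be assembled from the fields listed above. The function name `build_finding`, the dict layout, and the choice to hash the full body before truncating the raw sample are assumptions for illustration, not part of the upstream skill.

```python
import hashlib
from datetime import datetime, timezone

MAX_RAW_BYTES = 2048  # "raw <= 2 KiB" from the evidence convention
SEVERITIES = ("info", "low", "medium", "high", "critical")
CONFIDENCES = ("TENTATIVE", "FIRM", "CONFIRMED")


def build_finding(finding_id, module, asset_key, category, severity,
                  confidence, title, description, url, raw_body,
                  references=(), remediation=""):
    """Assemble one finding dict (hypothetical helper, not an upstream API)."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    if confidence not in CONFIDENCES:
        raise ValueError(f"unknown confidence: {confidence}")
    evidence = {
        "url": url,
        # Timezone-aware UTC timestamp in ISO 8601 form.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Assumption: hash the full body, then keep only the first 2 KiB.
        "sha256": hashlib.sha256(raw_body).hexdigest(),
        "raw": raw_body[:MAX_RAW_BYTES].decode("utf-8", errors="replace"),
    }
    return {
        "id": finding_id,
        "module": module,
        "asset_key": asset_key,
        "category": category,
        "severity": severity,
        "confidence": confidence,
        "title": title,
        "description": description,
        "evidence": evidence,
        "references": list(references),
        "remediation": remediation,
    }
```

Hashing before truncation keeps the digest tied to the complete capture, so the 2 KiB sample can be checked against a separately archived copy.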
174
+ ---
175
+
176
+ ## 4. Source Hygiene & Citations
177
+
178
+ Record URL + UTC timestamp + SHA-256 + tool version + run_id for every artifact. Keep screenshots as PNG, run logs as JSONL, and raw HTTP captures with the body capped at 2 KiB.
179
+
180
+ ---
181
+
182
+ ## 5. Do NOT
183
+
184
+ - Don't paste creds/PII/session tokens into cloud LLMs.
185
+ - Don't run destructive probes outside DEEP/`--aggressive`.
186
+ - Don't use validated credentials for anything except a read-only liveness check.
187
+ - Don't attribute on a single source.
188
+ - Don't assume vendor labels are ground truth.
189
+
190
+ ---
191
+
192
+ ## 6. General OSINT (curated tool refs)
193
+
194
+ - [OSINT Bookmarks](https://tools.myosint.training/) — comprehensive bookmarks.
195
+ - [OSINT Framework](https://osintframework.com/) — tool/resource directory.
196
+ - [IntelTechniques Tools](https://inteltechniques.com/tools/) — investigative suite.
197
+ - [Bellingcat Toolkit](https://www.bellingcat.com/resources/2024/09/24/bellingcat-online-investigations-toolkit/) — investigative journalism.
198
+ - [CyberSudo OSINT Toolkit](https://docs.google.com/spreadsheets/d/1EC0sKA_W9znzsxUt0wye9UYtyATXw5m8) — OSINT websites list.
199
+ - [Google Dorks](https://dorksearch.com/) — efficient Google searching.
200
+ - [Distributed Denial of Secrets](https://ddosecrets.com/) — leaked datasets.
201
+ - [Country-Specific Resources](https://digitaldigging.org/osint/) — country-targeted OSINT.
202
+
203
+ ## 7. Search Engines
204
+
205
+ | Tool | Notes |
206
+ |------|-------|
207
+ | [Carrot2](https://search.carrot2.org/#/search/web) | Clusters results by topic |
208
+ | [etools](https://www.etools.ch/) | Metasearch |
209
+ | [Kagi](https://kagi.com/) | Privacy-first, non-personalized |
210
+ | [Brave Search](https://search.brave.com/) | Independent index; Goggles for custom ranking |
211
+ | [PDF Search](https://www.pdfsearch.io/) | PDF + table of contents |
212
+ | [Google Fact Check Explorer](https://toolbox.google.com/factcheck/explorer) | Cross-site fact-check |
213
+
214
+ ---
215
+
216
+ ## 8. Username & Email Investigation
217
+
218
+ | Tool | Purpose |
219
+ |------|---------|
220
+ | [Sherlock](https://github.com/sherlock-project/sherlock) | Username search across social networks |
221
+ | [Maigret](https://github.com/soxoj/maigret) | Profile collector by username |
222
+ | [What's My Name](https://whatsmyname.app/) | Username search |
223
+ | [Holehe](https://github.com/megadose/holehe) | Email registration check |
224
+ | [Epieos](https://epieos.com/) | Email pivots and metadata |
225
+ | [OSINT Industries](https://osint.industries/) | Email/username/phone lookups |
226
+ | [Hunter.io](https://hunter.io/) | Domain → emails |
227
+ | [EmailRep](https://emailrep.io/) | Email reputation |
228
+ | [Emailable](https://emailable.com/) | Email verification |
229
+ | [Mugetsu](https://mugetsu.io/) | X/Twitter username history |
230
+ | [RocketReach](https://rocketreach.co/) / [Apollo](https://www.apollo.io/) | Email enrichment + pattern guessing |
231
+ | [PhoneInfoga](https://github.com/sundowndev/phoneinfoga) | Phone number intelligence |
232
+
233
+ Browser extensions: [GetProspect](https://chromewebstore.google.com/detail/email-finder-getprospect/bhbcbkonalnjkflmdkdodieehnmmeknp), [SignalHire](https://chrome.google.com/webstore/detail/signalhire-find-email-or/aeidadjdhppdffggfgjpanbafaedankd).
234
+
235
+ ---
236
+
237
+ ## 9. People Search
238
+
239
+ - [TruePeopleSearch](https://www.truepeoplesearch.com/) — free U.S. people search.
240
+ - [WhitePages](https://www.whitepages.com/), [Spokeo](https://www.spokeo.com/), [Webmii](https://webmii.com/), [Pipl](https://pipl.com/) (paid).
241
+ - [Clearbit](https://clearbit.com/) — company/individual data enrichment.
242
+ - [FaceCheck](https://facecheck.id/) / [FaceSeek](https://faceseek.online/) — reverse face search.
243
+
244
+ ---
245
+
246
+ ## 10. Phone Number OSINT
247
+
248
+ - [TrueCaller](https://www.truecaller.com/) — caller ID + spam blocking.
249
+ - [ThatsThem](https://thatsthem.com/) — reverse phone search.
250
+ - [Infobel](https://infobel.com/) — non-USA phone search.
251
+ - [FreeCarrierLookup](https://freecarrierlookup.com/) — carrier/type (US).
252
+ - [NumlookupAPI](https://numlookupapi.com/) [Freemium] — programmatic carrier checks.
253
+ - [CallerIDTest](https://calleridtest.com/), [Advanced Background Checks](https://www.advancedbackgroundchecks.com/).
254
+
255
+ ---
256
+
257
+ ## 11. Email-Pattern Inference (TENTATIVE candidates)
258
+
259
+ Given a `(first_name, last_name, domain)`, generate these 8 candidate addresses for breach pre-hits, phishing list curation, and downstream enrichment. Mark as **TENTATIVE** confidence until corroborated.
260
+
261
+ ```
262
+ {first}.{last}@{domain} # john.doe@example.com
263
+ {first}{last}@{domain} # johndoe@example.com
264
+ {first}@{domain} # john@example.com
265
+ {first[0]}{last}@{domain} # jdoe@example.com
266
+ {first}.{last[0]}@{domain} # john.d@example.com
267
+ {last}@{domain} # doe@example.com
268
+ {first}_{last}@{domain} # john_doe@example.com
269
+ {first}-{last}@{domain} # john-doe@example.com
270
+ ```
271
+
272
+ Lowercase before lookup. Strip diacritics for ASCII fallback. If the org uses a known pattern (e.g., Hunter.io shows `{first}.{last}` is dominant), prioritize that one and mark FIRM.
273
+
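The eight templates and the lowercase / diacritic-stripping rules above can be sketched as follows. The names `candidate_addresses` and `ascii_fold` are hypothetical, and NFKD folding is one assumed way to produce the ASCII fallback; letters without a Unicode decomposition (for example `ø`) pass through unchanged.

```python
import unicodedata

# The eight patterns from the list above; {f} and {l} stand for first initials.
PATTERNS = (
    "{first}.{last}",
    "{first}{last}",
    "{first}",
    "{f}{last}",
    "{first}.{l}",
    "{last}",
    "{first}_{last}",
    "{first}-{last}",
)


def ascii_fold(text):
    """Strip combining marks after NFKD decomposition (e.g. 'é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def candidate_addresses(first_name, last_name, domain):
    """Return the eight TENTATIVE candidates, in the order listed above."""
    first = ascii_fold(first_name.strip().lower())
    last = ascii_fold(last_name.strip().lower())
    if not first or not last:
        raise ValueError("first and last name must be non-empty")
    domain = domain.strip().lower()
    return [
        pattern.format(first=first, last=last, f=first[0], l=last[0]) + "@" + domain
        for pattern in PATTERNS
    ]
```

Every returned address stays TENTATIVE until a second source corroborates it.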
274
+ ---
275
+
276
+ ## 12. Email-Harvest Source Stack
277
+
278
+ Six parallel sources, dedup at the end:
279
+
280
+ 1. **IntelX phonebook API** — 2-step search + poll. Largest single source for breach-era addresses.
281
+ 2. **Hunter.io** — domain-search endpoint. ~25 free/month. Returns verified emails + roles.
282
+ 3. **crt.sh** — extract X.509 SAN extensions. Many certs include admin/contact emails.
283
+ 4. **DuckDuckGo SERP scrape** — HTML scrape of `"@{target-domain}"` results.
284
+ 5. **Bing SERP scrape** — same query, complementary index.
285
+ 6. **Wayback CDX** — historic snapshots of the target's homepage / contact / about pages often contain emails removed from the live site.
286
+
287
+ **Email regex:**
288
+ ```regex
289
+ \b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b
290
+ ```
291
+
292
+ **Noise filter (reject numeric-only locals):**
293
+ ```regex
294
+ ^[0-9]+$
295
+ ```
296
+ (Discards garbage like `12345@example.com` from random tokens.)
297
+
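As a rough illustration, the two regexes above can be combined into a merge-and-dedup step over text already collected from the six sources. `harvest_emails` is a hypothetical name, and lowercasing before dedup is an assumption carried over from §11.

```python
import re

# Both patterns are copied from the section above.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b")
NUMERIC_LOCAL_RE = re.compile(r"^[0-9]+$")


def harvest_emails(texts):
    """Extract, normalize, filter, and dedup addresses from an iterable of strings."""
    seen = set()
    for text in texts:
        for match in EMAIL_RE.findall(text):
            email = match.lower()
            local_part = email.split("@", 1)[0]
            # Noise filter: drop numeric-only local parts.
            if NUMERIC_LOCAL_RE.match(local_part):
                continue
            seen.add(email)
    return sorted(seen)
```

Feeding each source's raw output through the same function means the dedup happens once, at the end, as the section describes.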
298
+ ---
299
+
300
+ ## 13. Social Media
301
+
302
+ | Platform | Tool |
303
+ |----------|------|
304
+ | Instagram | [Picuki](https://www.picuki.com/) — profile view without account |
305
+ | X/Twitter | [snscrape](https://github.com/snscrape/snscrape) — preferred CLI scraper; Twint as fallback |
306
+ | Facebook | [Graph Search](https://inteltechniques.com/tools/Facebook.html), [sowsearch.info](https://sowsearch.info/), [lookup-id.com](https://lookup-id.com/), [whopostedwhat.com](https://whopostedwhat.com/) |
307
+ | Facebook (research) | [Meta Content Library](https://transparency.meta.com/researcher) — CrowdTangle successor (researcher-gated) |
308
+ | YouTube/Twitch | [Social Blade](https://socialblade.com/) — analytics |
309
+ | TikTok | [Tokboard](https://tokboard.com/) — trends + profile analytics |
310
+ | Reddit | [Reveddit](https://www.reveddit.com/) — removed content; [RedTrack.social](https://redtrack.social/) — user history |
311
+ | Bluesky | [Firesky](https://firesky.tv/) — real-time firehose; [SkyView](https://bsky.jazco.dev/) — follower graphs |
312
+ | Mastodon | [FediSearch](https://fedisearch.skorpil.cz/) — cross-instance search; [Fedifinder](https://fedifinder.glitch.me/) — find Twitter users on Mastodon |
313
+ | Faces | [Search4Faces](https://search4faces.com/) |
314
+
315
+ ---
316
+
317
+ ## 14. Public Records & Company Information
318
+
319
+ - [OpenCorporates](https://opencorporates.com/) — world's largest open company DB.
320
+ - [SEC EDGAR](https://www.sec.gov/edgar.shtml) — U.S. company filings.
321
+ - [OpenOwnership Register](https://register.openownership.org/) — beneficial ownership.
322
+ - [MuckRock](https://www.muckrock.com/) — FOIA repository + request tracking.
323
+ - [EU Tenders (TED)](https://ted.europa.eu/) — EU procurement notices.
324
+ - [World Bank Projects](https://projects.worldbank.org/) — project + procurement records.
325
+ - [UK Companies House](https://find-and-update.company-information.service.gov.uk/) — UK companies + officers + filings.
326
+
327
+ ### 14.1 RU registries
328
+
329
+ [Rusprofile](https://www.rusprofile.ru/), [Kontur.Focus](https://focus.kontur.ru/) (freemium), [zakupki.gov.ru](https://zakupki.gov.ru/) (procurement), EGRUL/EGRIP (official, captcha-gated).
330
+
331
+ ### 14.2 CN registries + USCC + ICP
332
+
333
+ - **GSXT** — [gsxt.gov.cn](https://www.gsxt.gov.cn/) National Enterprise Credit Info; cross-check with Tianyancha / Qichacha.
334
+ - **USCC (Unified Social Credit Code)** — 18-character entity ID assigned to all CN legal entities. Format: `<authority:1><type:1><region:6><org-code:9><check:1>`. Useful for joining GSXT records to ICP filings.
335
+ - **ICP Beian** — [beian.miit.gov.cn](https://beian.miit.gov.cn/) — every domain serving traffic in mainland CN must register an ICP filing; the filing links the domain to a USCC, which links to the legal entity in GSXT.
336
+ - Workflow: `target.cn` domain → ICP lookup → USCC → GSXT → entity name + officers + adjacent registered entities.
337
+
338
+ ### 14.3 Sanctions & Compliance
339
+
340
+ - [OFAC SDN List](https://sanctionssearch.ofac.treas.gov/), [EU Sanctions Map](https://www.sanctionsmap.eu/).
341
+ - [OpenSanctions](https://www.opensanctions.org/) — aggregated.
342
+ - [OCCRP Aleph](https://aleph.occrp.org/) — investigative documents, leaks, company records.
343
+
344
+ ---
345
+
346
+ ## 15. Breach & Leak Data
347
+
348
+ - [Have I Been Pwned](https://haveibeenpwned.com/) — breach lookup; Pwned Passwords API (k-anonymity).
349
+ - [Dehashed](https://dehashed.com/) — credential search (paid).
350
+ - [IntelX](https://intelx.io/) — data intelligence.
351
+ - [LeakCheck](https://leakcheck.io/), [Snusbase](https://snusbase.com/), [BreachDirectory](https://breachdirectory.org/), [Scattered Secrets](https://scatteredsecrets.com/), [Phonebook](https://phonebook.cz/), [LeakPeek](https://leakpeek.com/).
352
+ - [Cavalier (Hudson Rock)](https://cavalier.hudsonrock.com/) — **infostealer log lookups; FREE; highest single-source ROI for finding compromised employee credentials in corporate SSO**.
353
+
354
+ ### 15.0.1 HudsonRock Cavalier — direct API recipe
355
+
356
+ The web UI wraps a **public, unauthenticated JSON API**. Hit it directly:
357
+
358
+ ```bash
359
+ # By domain (canonical first call)
360
+ curl -sk -m 30 "https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-domain?domain=target.com" | jq .
361
+
362
+ # By email (single-account check)
363
+ curl -sk -m 30 "https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-email?email=alice@target.com" | jq .
364
+
365
+ # By URL (when target's app is the breach victim)
366
+ curl -sk -m 30 "https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-url?url=https://app.target.com" | jq .
367
+ ```
368
+
369
+ PowerShell:
370
+ ```powershell
371
+ $hr = Invoke-RestMethod -Uri "https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-domain?domain=$D" -TimeoutSec 30
372
+ "Employees: $($hr.employees) | Users: $($hr.users) | Third-party: $($hr.third_parties) | Total: $($hr.total)"
373
+ $hr.data.employees_urls | Sort-Object -Property occurrence -Descending | Select-Object -First 20
374
+ $hr.data.clients_urls | Sort-Object -Property occurrence -Descending | Select-Object -First 15
375
+ ```
376
+
377
+ **Top-level JSON fields:**
378
+ - `total` — total stealer entries touching this domain.
379
+ - `totalStealers` — global stealer-log corpus size (context only).
380
+ - `employees` — count of `<*>@<domain>` accounts found.
381
+ - `users` — count of accounts where the domain appeared as a *visited* URL (customers/vendors).
382
+ - `third_parties` — accounts touching adjacent domains in the org.
383
+ - `data.employees_urls[]` — `{occurrence, type, url}` — internal apps where employees were logging in when stolen. **Subdomain hits here = recon gold.**
384
+ - `data.clients_urls[]` — same shape; user-facing apps (often reveals undocumented public portals).
385
+ - `data.stealer_families[]` — `{_key, _value}` → which stealer (RedLine / Lumma / StealC / Vidar / Raccoon).
386
+ - `data.dates_compromised[]` — `{_key, _value}` → temporal distribution.
387
+
388
+ **Free-tier caveats (CRITICAL to know):**
389
+ - Subdomain hostnames in `data.*_urls[]` past the first few are **redacted with asterisks** (`*****.target.com`). Pivot to paid Cavalier tier or other sources for unredacted.
390
+ - Free endpoint returns counts + sample URLs only. Cleartext passwords + emails are **never** in the free response.
391
+ - Rate limit ~1 req/sec/IP; 429 on burst. Sleep 1s between calls.
392
+ - For unredacted creds + bulk enumeration → paid Cavalier portal.
393
+
394
+ **Severity mapping (per §15.1 + §15.2):** `employees ≥ 10` → CRITICAL, **regardless of whether the breached service is still online** (legacy Lotus Domino / on-prem mail decommissioned + cloud SSO migration → employees almost always reuse passwords → SSO_EXPOSURE escalates CRITICAL).
395
+
396
+ ### 15.1 Domain-Level Breach Severity Mapping
397
+
398
+ When you query a breach corpus by domain, map the result to severity like so:
399
+
400
+ | Stat | Severity |
401
+ |---|---|
402
+ | ≥ 10 employees compromised | **CRITICAL** |
403
+ | 1–9 employees compromised | **HIGH** |
404
+ | ≥ 1 end-user (non-employee) compromised | **MEDIUM** |
405
+ | Domain seen in breach with 0 named accounts | **INFO** |
406
+
407
+ **Employees vs end-users distinction:** an employee account is `<anything>@<target-domain>` (the breach victim is the target's own staff). An end-user account is the target's customer who reused a password — useful for credential-stuffing risk awareness but not directly compromising the target's identity fabric.
408
+
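The table above can be expressed as a small lookup, sketched here under one assumption: when several rows match, the most severe one wins. The function names are hypothetical.

```python
def is_employee_account(email, target_domain):
    """An employee account is <anything>@<target-domain>."""
    return email.lower().endswith("@" + target_domain.lower())


def breach_severity(employees, end_users, domain_seen=True):
    """Map domain-level breach counts to a severity label.

    Assumption: rows are checked from most to least severe, so a domain
    with both employee and end-user hits is rated on the employee count.
    """
    if employees >= 10:
        return "CRITICAL"
    if employees >= 1:
        return "HIGH"
    if end_users >= 1:
        return "MEDIUM"
    if domain_seen:
        return "INFO"
    return None  # domain absent from the corpus: no finding
```

For example, `breach_severity(3, 40)` rates as HIGH because the three employee accounts outrank the end-user hits.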
409
+ ### 15.2 SSO_EXPOSURE finding
410
+
411
+ When a discovered SSO tenant (Entra GUID / Okta slug / Google Workspace domain) intersects with the breach corpus on its domain → `SSO_EXPOSURE` finding, severity **CRITICAL**. Evidence: tenant ID + product + employee count + per-account source attribution.
412
+
413
+ **Legacy-mail-decommissioned pattern (high-value variant):**
414
+
415
+ If `mail.<domain>` / `webmail.<domain>` returns **NXDOMAIN today** but HudsonRock/HIBP corpus still has historical employee credentials against it AND `autodiscover.<domain>` resolves to Microsoft IPs (M365) or `aspmx.l.google.com` MX (Workspace), the org migrated from on-prem to cloud — and the stolen passwords almost certainly survived the migration via password reuse. **Escalate to CRITICAL `SSO_EXPOSURE`** even when the legacy host is dead.
416
+
417
+ Concrete triggers (all three together):
418
+ 1. `Resolve-DnsName mail.<domain> -Type A` → NXDOMAIN (legacy gone)
419
+ 2. HudsonRock corpus has employee URLs against the *old* host (e.g. `mail.<domain>/names.nsf` for Lotus Domino, `mail.<domain>/owa/` for Exchange, `mail.<domain>/iwaredir.nsf` for iNotes, `mail.<domain>/zimbra/` for Zimbra)
420
+ 3. Current MX → M365 / Google Workspace / Zoho cloud (DNS confirms migration)
421
+
422
+ Evidence pack: tenant GUID + breach count + 3+ legacy URLs from corpus + autodiscover Microsoft IPs + current MX. Recommend forced password rotation + MFA audit + Conditional Access review.
423
+
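A minimal sketch of the three-trigger check above, working on data the operator has already collected (no network calls). The function names and the MX suffix table are assumptions for illustration; extend the table to match the providers actually in scope.

```python
from urllib.parse import urlsplit

# Assumed MX hostname suffixes for the cloud providers named above.
CLOUD_MX_SUFFIXES = {
    "mail.protection.outlook.com": "M365",
    "google.com": "Google Workspace",
    "googlemail.com": "Google Workspace",
    "zoho.com": "Zoho",
    "zoho.eu": "Zoho",
}


def cloud_mail_provider(mx_hosts):
    """Return the first recognized cloud provider among the MX hosts, else None."""
    for host in mx_hosts:
        name = host.rstrip(".").lower()
        for suffix, provider in CLOUD_MX_SUFFIXES.items():
            if name == suffix or name.endswith("." + suffix):
                return provider
    return None


def legacy_mail_sso_exposure(legacy_host, legacy_resolves, corpus_urls, mx_hosts):
    """Evaluate the three triggers together; all must hold to escalate."""
    wanted = legacy_host.lower()
    legacy_hits = [
        url for url in corpus_urls
        if urlsplit(url if "://" in url else "//" + url).hostname == wanted
    ]
    provider = cloud_mail_provider(mx_hosts)
    triggered = (not legacy_resolves) and bool(legacy_hits) and provider is not None
    return {
        "triggered": triggered,
        "severity": "CRITICAL" if triggered else None,
        "provider": provider,
        "legacy_urls": legacy_hits,
    }
```

The caller supplies `legacy_resolves` from its own DNS lookup (trigger 1), the corpus URLs from the breach source (trigger 2), and the current MX hosts (trigger 3).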
424
+ ---
425
+
426
+ ## 16. Pre-built Wordlists & Probe Paths
427
+
428
+ Copy-pasteable arsenals, severity-annotated where relevant.
429
+
430
+ ### 16.1 Swagger / OpenAPI discovery — 28 paths
431
+
432
+ Probe each path on every alive webapp. GET (or HEAD if rate-limited).
433
+
434
+ ```
435
+ swagger.json
436
+ swagger.yaml
437
+ swagger/v1/swagger.json
438
+ swagger/v2/swagger.json
439
+ swagger-ui.html
440
+ swagger-ui/
441
+ swagger-resources
442
+ api-docs
443
+ api-docs.json
444
+ api/swagger
445
+ api/swagger.json
446
+ api/swagger-ui.html
447
+ api/v1/swagger.json
448
+ api/v2/swagger.json
449
+ api/v3/api-docs
450
+ v2/api-docs
451
+ v3/api-docs
452
+ openapi.json
453
+ openapi.yaml
454
+ openapi/v1
455
+ openapi/v3
456
+ docs
457
+ redoc
458
+ rapidoc
459
+ api/docs
460
+ api/documentation
461
+ .well-known/openapi
462
+ ```
463
+
464
+ **Severity:**
465
+ - Reachable Swagger/OpenAPI spec without auth → **HIGH** `LEAKY_API_SPEC` (full endpoint enumeration leaks; often reveals undocumented internal APIs).
466
+ - Behind auth but accessible to any authenticated user → MEDIUM (still discloses internal API surface).
467
+
468
+ ### 16.2 GraphQL discovery — 13 paths
469
+
470
+ ```
471
+ graphql
472
+ graphiql
473
+ api/graphql
474
+ v1/graphql
475
+ v2/graphql
476
+ query
477
+ api/query
478
+ gql
479
+ altair
480
+ playground
481
+ subscriptions
482
+ graphql/console
483
+ api/v1/graphql
484
+ ```
485
+
486
+ **Standard introspection POST body:**
487
+ ```json
488
+ {
489
+ "operationName": "IntrospectionQuery",
490
+ "query": "query IntrospectionQuery { __schema { types { name kind fields { name type { name kind } } } queryType { name } mutationType { name } subscriptionType { name } } }"
491
+ }
492
+ ```
493
+
494
+ **Severity:**
495
+ - Introspection returns schema without auth → **HIGH** `OPEN_GRAPHQL_API`.
496
+ - Field-suggestion enumeration possible (server returns "did you mean" for typo'd field names) → **MEDIUM** (re-derive partial schema even when introspection is disabled).
497
+ - `/graphql` accepts batched queries (`[...]` request body) → MEDIUM (rate-limit bypass surface; auth bypass via mixed batches).
498
+
499
+ UI markers (lower severity but still discoverable):
500
+ - HTML response contains `graphiql`, `playground`, `apollo studio`, `altair` → GraphiQL UI exposed (often shipped accidentally on prod).
501
+
502
+ ### 16.3 High-risk ports — 35 services
503
+
504
+ For each open port, emit a finding with the severity and "why an attacker cares" below. Source for the open-port observation: Shodan InternetDB (free, 1 req/sec) is the recommended starting point.
505
+
506
+ | Port | Service | Severity | Why it matters |
507
+ |---|---|---|---|
508
+ | 21 | FTP | HIGH | Anonymous read often enabled; cleartext creds. |
509
+ | 22 | SSH | LOW | Banner discloses version; brute-force surface. |
510
+ | 23 | Telnet | HIGH | Cleartext protocol; should never be exposed. |
511
+ | 25 | SMTP | LOW | Open relay risk; version banner. |
512
+ | 53 | DNS | LOW | Recursion = DDoS amplifier; AXFR opportunism. |
513
+ | 80 | HTTP | INFO | Standard. |
514
+ | 110 | POP3 | LOW | Cleartext if no STARTTLS. |
515
+ | 111 | rpcbind | MEDIUM | NFS exports enumeration. |
516
+ | 135 | MS RPC | HIGH | Enum via Impacket. |
517
+ | 139 | NetBIOS-SSN | HIGH | File/printer enum. |
518
+ | 143 | IMAP | LOW | Cleartext if no STARTTLS. |
519
+ | 161 | SNMP | HIGH | Community strings often `public`/`private`; full device enum. |
520
+ | 389 | LDAP | HIGH | Anonymous bind = full directory dump. |
521
+ | 443 | HTTPS | INFO | Standard. |
522
+ | 445 | SMB | **CRITICAL** | EternalBlue, SMB relay, anonymous shares. |
523
+ | 465 | SMTPS | LOW | Banner. |
524
+ | 514 | rsyslog | MEDIUM | Log injection / DoS. |
525
+ | 587 | SMTP-MSA | LOW | Banner. |
526
+ | 631 | IPP/CUPS | MEDIUM | Print server enum / RCE in old CUPS. |
527
+ | 873 | rsync | HIGH | Modules often listable; backup data exposure. |
528
+ | 1433 | MSSQL | HIGH | Brute-force; xp_cmdshell. |
529
+ | 1521 | Oracle TNS | HIGH | Brute-force; SID enum. |
530
+ | 2049 | NFS | HIGH | World-readable exports. |
531
+ | 2375 | Docker API (unencrypted) | **CRITICAL** | Unauthenticated container/host takeover. |
532
+ | 2376 | Docker API (TLS) | HIGH | Cert validation bypass risk. |
533
+ | 3000 | Common dev / Grafana | MEDIUM | Often Grafana / Express dev with default creds. |
534
+ | 3306 | MySQL | HIGH | Brute-force; default `root:""`. |
535
+ | 3389 | RDP | **CRITICAL** | BlueKeep / DejaBlue / NLA bypass. |
536
+ | 5432 | PostgreSQL | HIGH | Brute-force; default `postgres:postgres`. |
537
+ | 5601 | Kibana | HIGH | Often unauthenticated; Elasticsearch pivot. |
538
+ | 5900 | VNC | HIGH | Often unauthenticated or weak password. |
539
+ | 5984 | CouchDB | HIGH | Default no auth; admin party. |
540
+ | 6379 | Redis | **CRITICAL** | No auth default; write `authorized_keys` for SSH. |
541
+ | 7001 | WebLogic | HIGH | Frequent CVEs (CVE-2020-14882, etc.). |
542
+ | 8000 | Common dev | MEDIUM | Django, common dev servers. |
543
+ | 8080 | HTTP-alt | MEDIUM | Tomcat, Jenkins, common proxy. |
544
+ | 8443 | HTTPS-alt | MEDIUM | Same as 8080. |
545
+ | 8888 | Common dev / Jupyter | HIGH | Jupyter often exposes interactive shell. |
546
+ | 9090 | Cockpit / Prometheus | HIGH | Server admin UI / metrics scraping. |
547
+ | 9200 | Elasticsearch | **CRITICAL** | Typically no auth. |
548
+ | 9300 | Elasticsearch transport | HIGH | Cluster join + RCE. |
549
+ | 11211 | memcached | MEDIUM | UDP DDoS amp; data dump. |
550
+ | 27017 | MongoDB | **CRITICAL** | No auth by default. |
551
+ | 50070 | Hadoop NameNode | HIGH | HDFS browse. |
552
+
553
+ When Shodan InternetDB returns a non-empty `vulns[]` for the host (InternetDB reports CVEs per IP, not per port), escalate the finding severity by one tier and include the CVE list in evidence.
554
+
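The one-tier escalation rule can be sketched as follows. This is a minimal illustration, assuming the tier order INFO < LOW < MEDIUM < HIGH < CRITICAL and that CRITICAL is the ceiling; the names `TIERS`, `escalate`, and `port_finding` are hypothetical and not part of the skill.

```python
# Assumed ordering of the severity labels used in the table above.
TIERS = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def escalate(severity: str, vulns: list[str]) -> str:
    """Raise severity by one tier when vulns is non-empty; CRITICAL stays CRITICAL."""
    if not vulns:
        return severity
    idx = TIERS.index(severity)
    return TIERS[min(idx + 1, len(TIERS) - 1)]


def port_finding(port: int, base_severity: str, vulns: list[str]) -> dict:
    """Build a finding record with the CVE list attached as evidence."""
    return {
        "port": port,
        "severity": escalate(base_severity, vulns),
        "evidence": {"cves": sorted(vulns)},
    }
```

For example, `port_finding(7001, "HIGH", ["CVE-2020-14882"])` would be reported as CRITICAL, while a port with an empty `vulns` list keeps its table severity.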
555
+ ### 16.4 Missing security headers — 6 findings
556
+
557
+ For every alive webapp, audit response headers. Each missing header below = one finding.
558
+
559
+ | Header | Severity (default) | Severity (sensitive path) | Notes |
560
+ |---|---|---|---|
561
+ | `Strict-Transport-Security` | MEDIUM | **HIGH** | Sensitive paths: `/login`, `/signin`, `/sso`, `/admin`, `/auth`. |
562
+ | `Content-Security-Policy` | MEDIUM | MEDIUM | XSS impact mitigation gone. |
563
+ | `X-Frame-Options` | LOW | LOW | Clickjacking. (CSP `frame-ancestors` is the modern replacement.) |
564
+ | `X-Content-Type-Options` | LOW | LOW | MIME-sniff XSS. |
565
+ | `Referrer-Policy` | INFO | INFO | Outbound link leakage. |
566
+ | `Permissions-Policy` | INFO | INFO | Feature-policy hardening. |
567
+
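A possible way to turn the header table into findings is sketched below. It assumes a path counts as sensitive when it starts with one of the listed prefixes (the table does not say how the match is done), and it does not treat a CSP `frame-ancestors` directive as a substitute for `X-Frame-Options`. The function name is hypothetical.

```python
# Sensitive path prefixes from the Strict-Transport-Security row.
SENSITIVE_PREFIXES = ("/login", "/signin", "/sso", "/admin", "/auth")

# header -> (default severity, sensitive-path severity), copied from the table.
HEADER_SEVERITY = {
    "strict-transport-security": ("MEDIUM", "HIGH"),
    "content-security-policy": ("MEDIUM", "MEDIUM"),
    "x-frame-options": ("LOW", "LOW"),
    "x-content-type-options": ("LOW", "LOW"),
    "referrer-policy": ("INFO", "INFO"),
    "permissions-policy": ("INFO", "INFO"),
}


def missing_header_findings(path: str, headers: dict) -> list[tuple[str, str]]:
    """Return (header, severity) for every audited header absent from the response."""
    present = {name.lower() for name in headers}
    # Assumption: prefix match on the request path decides "sensitive".
    sensitive = path.lower().startswith(SENSITIVE_PREFIXES)
    findings = []
    for header, (default, elevated) in HEADER_SEVERITY.items():
        if header not in present:
            findings.append((header, elevated if sensitive else default))
    return findings
```

Header names are compared case-insensitively, since HTTP header names are not case-sensitive.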
568
+ ### 16.5 Always-on HTTP checks — 15 paths
569
+
570
+ Run these against every alive webapp regardless of Nuclei availability. Cheap; high signal.
571
+
572
+ | Path | Finding | Severity | Match logic |
573
+ |---|---|---|---|
574
+ | `/.git/config` | Exposed `.git` repo | **CRITICAL** | Body contains `[core]`, `[remote`, `repositoryformatversion` |
575
+ | `/.git/HEAD` | Exposed `.git/HEAD` | HIGH | Body matches `^ref:\s` |
576
+ | `/.env` | Exposed `.env` | **CRITICAL** | Multiline regex `^\s*[A-Z_][A-Z0-9_]*\s*=` |
577
+ | `/server-status` | Apache server-status | MEDIUM | Body contains `Apache Server Status` or matching title |
578
+ | `/server-info` | Apache mod_info | MEDIUM | Body contains `Apache Server Information` |
579
+ | `/.DS_Store` | Exposed `.DS_Store` | LOW | Byte signature `\x00\x00\x00\x01Bud1` |
580
+ | `/phpinfo.php` | phpinfo() leak | HIGH | Body contains `phpinfo()`, `PHP Version`, or matching title |
581
+ | `/info.php` | phpinfo() (alt path) | HIGH | Same as above |
582
+ | `/actuator/env` | Spring Boot `/actuator/env` | **CRITICAL** | Body contains `"propertySources"`, `systemProperties`, `systemEnvironment` |
583
+ | `/actuator/heapdump` | Spring Boot heapdump | **CRITICAL** | HPROF magic bytes / large binary download |
584
+ | `/_cat/indices` | Elasticsearch open | HIGH | Returns index list |
585
+ | `/script` | Jenkins script console | HIGH | Body contains `Jenkins`/`Script Console` |
586
+ | `/manager/html` | Tomcat Manager | HIGH | Body contains `Tomcat Web Application Manager` |
587
+ | `/wp-admin/install.php` | Orphaned WP install | LOW | Body contains `WordPress Installation` |
588
+ | `/.well-known/security.txt` | Disclosure policy info | INFO | Parse contact + policy fields |
589
+
590
+ Plus parse `/robots.txt` for `Disallow:` paths — those become the next-tier wordlist for that target.
591
+
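Extracting the `Disallow:` paths from a fetched `/robots.txt` body might look like the sketch below. It ignores `User-agent` grouping and simply collects every non-empty `Disallow` value in order; the function name is hypothetical.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect unique Disallow paths from a robots.txt body, preserving order."""
    paths = []
    for raw in robots_txt.splitlines():
        # Strip trailing comments, then split "Field: value".
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "disallow":
            path = value.strip()
            # An empty Disallow means "allow everything"; skip it.
            if path and path not in paths:
                paths.append(path)
    return paths
```

The returned list can be appended to the per-target wordlist as-is; wildcard entries such as `/*.json$` would need separate handling.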
592
+ ### 16.6 SAML metadata — 5 paths
593
+
594
+ ```
595
+ /saml/metadata
596
+ /FederationMetadata/2007-06/FederationMetadata.xml
597
+ /federationmetadata/2007-06/federationmetadata.xml
598
+ /simplesaml/saml2/idp/metadata.php
599
+ /auth/saml2/metadata
600
+ ```
601
+
602
+ Reachable SAML metadata XML reveals: `EntityID`, signing certs (often pinned → cert-reuse pivot), `SingleSignOnService` URL, `NameIDFormat`. Mark as `MISCONFIG` (LOW severity unless metadata leaks internal hostnames or non-public certs, then MEDIUM).
603
+
604
+ ### 16.7 SSO subdomain prefixes — 8 prefixes
605
+
606
+ Probe each against root domain + every sibling brand domain:
607
+ ```
608
+ auth.{domain}
609
+ login.{domain}
610
+ sso.{domain}
611
+ idp.{domain}
612
+ iam.{domain}
613
+ identity.{domain}
614
+ accounts.{domain}
615
+ oauth.{domain}
616
+ ```
617
+
618
+ Plus probe `/.well-known/openid-configuration` on every alive subdomain (regardless of prefix).
619
+
620
+ ### 16.8 Cloud bucket permutation arsenal
621
+
622
+ **6 prefixes:**
623
+ ```
624
+ "" # bare candidate
625
+ backup-
626
+ assets-
627
+ static-
628
+ dev-
629
+ prod-
630
+ ```
631
+
632
+ **15 suffixes:**
633
+ ```
634
+ "" # bare candidate
635
+ -backup
636
+ -assets
637
+ -static
638
+ -media
639
+ -data
640
+ -uploads
641
+ -dev
642
+ -prod
643
+ -staging
644
+ -logs
645
+ -private
646
+ -public
647
+ -dump
648
+ -archive
649
+ ```
650
+
651
+ **47 generic stems** (filter unless combined with target-identifying token):
652
+ ```
653
+ www, mail, email, app, apps, web, webmail, ftp, cdn, static, assets, media, img, images,
654
+ videos, download, downloads, upload, uploads, data, files, docs, support, help, kb,
655
+ blog, news, dev, test, staging, stg, qa, uat, sandbox, preprod, preview, vpn,
656
+ mx, smtp, imap, pop, dns, ns, ns1, ns2, mx1, mx2
657
+ ```
658
+
659
+ **Provider URL templates:**
660
+
661
+ S3:
662
+ ```
663
+ https://{candidate}.s3.amazonaws.com/
664
+ https://{candidate}.s3-{region}.amazonaws.com/ # try us-east-1, us-west-2, eu-west-1, ap-southeast-1 first
665
+ https://s3.{region}.amazonaws.com/{candidate}/
666
+ ```
667
+
668
+ GCS:
669
+ ```
670
+ https://{candidate}.storage.googleapis.com/
671
+ https://storage.googleapis.com/{candidate}/
672
+ ```
673
+
674
+ Azure Blob:
675
+ ```
676
+ https://{candidate}.blob.core.windows.net/
677
+ ```
678
+
679
+ **Probe technique:** send HEAD first: 200/301 = bucket exists, 403 = exists but private, 404 = skip. If it exists, GET the bucket root; an XML/JSON object listing in the response → **CRITICAL** `PUBLIC_CLOUD_BUCKET`. Objects readable by direct URL while the bucket is not listable → **HIGH** `PUBLIC_CLOUD_BUCKET_OBJECT_READ`.
680
+
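The permutation step and the HEAD status mapping can be sketched as below. The prefix and suffix lists are copied from this section; `bucket_candidates` and `classify_head` are hypothetical names, and treating every status other than 200/301/403 as "skip" is an assumption (the text only names 404).

```python
from itertools import product

PREFIXES = ["", "backup-", "assets-", "static-", "dev-", "prod-"]
SUFFIXES = ["", "-backup", "-assets", "-static", "-media", "-data", "-uploads",
            "-dev", "-prod", "-staging", "-logs", "-private", "-public",
            "-dump", "-archive"]


def bucket_candidates(stem: str) -> list[str]:
    """Combine one target-identifying stem with every prefix/suffix pair."""
    seen = []
    for prefix, suffix in product(PREFIXES, SUFFIXES):
        name = f"{prefix}{stem}{suffix}"
        if name not in seen:
            seen.append(name)
    return seen


def classify_head(status: int) -> str:
    """Map a HEAD status code to the probe outcome described above."""
    if status in (200, 301):
        return "exists"
    if status == 403:
        return "exists-private"
    return "skip"  # assumption: anything else is treated like 404
```

Each candidate is then substituted into the provider URL templates; only names classified as `exists` proceed to the listing GET.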
681
+ ### 16.9 JS guess-paths for endpoint discovery
682
+
683
+ Probe these paths on every alive webapp (in addition to scraped `<script src=...>`):
684
+
685
+ ```
686
+ /main.js
687
+ /app.js
688
+ /bundle.js
689
+ /runtime.js
690
+ /index.js
691
+ /vendor.js
692
+ /_next/static/_buildManifest.js
693
+ /_next/static/_ssgManifest.js
694
+ /static/js/main.js
695
+ /static/js/bundle.js
696
+ /assets/index.js
697
+ /static/js/main.<hash>.js # try hash discovery via 404 patterns
698
+ ```
699
+
700
+ For every found JS, also try `<jsfile>.map` for sourcemap leaks (HIGH `INFO_DISCLOSURE`).
701
+
702
+ ### 16.10 Endpoint extraction regex tiers
703
+
704
+ Three tiers, run in order on every JS body + every sourcesContent[] blob:
705
+
706
+ **Tier 1 — generic quoted paths:**
707
+ ```regex
708
+ ['"`](/[A-Za-z0-9_\-./{}\[\]?=&%:]+)['"`]
709
+ ```
710
+ Match group: the path. High recall, lots of false positives — apply allowlist downstream.
711
+
712
+ **Tier 2 — API-ish paths (biased filter on tier 1):**
713
+ ```regex
714
+ ['"`](/(?:api|graphql|gql|v\d+|swagger|openapi|rest|services|internal|admin|auth|oauth|user|users|account|accounts|search|export|upload|file|files|download|webhook|hooks|callback)/[A-Za-z0-9_\-./{}\[\]?=&%:]+)['"`]
715
+ ```
716
+
717
+ **Tier 3 — fully-qualified URLs:**
718
+ ```regex
719
+ \bhttps?://[A-Za-z0-9.\-]+\.[A-Za-z]{2,}(?::\d+)?[/A-Za-z0-9_\-./{}\[\]?=&%:#]*
720
+ ```
721
+
722
+ Dedup on `(method, normalized-path-template)` where the template replaces `/123/` with `/{id}/` etc.
723
+
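One way to implement the dedup key is sketched below. Replacing purely numeric segments with `{id}` follows the text; the UUID rule and the dropping of query strings and fragments are assumptions standing in for the "etc.", and the function names are hypothetical.

```python
import re

_NUMERIC = re.compile(r"^\d+$")
# Assumption: UUID-shaped segments are also treated as identifiers.
_UUID = re.compile(r"^[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def normalize_path(path: str) -> str:
    """Turn a concrete path into a template, e.g. /users/123/ -> /users/{id}/."""
    path = path.split("?", 1)[0].split("#", 1)[0]
    segments = []
    for seg in path.split("/"):
        if _NUMERIC.match(seg):
            segments.append("{id}")
        elif _UUID.match(seg):
            segments.append("{uuid}")
        else:
            segments.append(seg)
    return "/".join(segments)


def dedup_endpoints(endpoints: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Deduplicate (method, path) pairs on (METHOD, normalized template)."""
    seen = set()
    unique = []
    for method, path in endpoints:
        key = (method.upper(), normalize_path(path))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Version segments such as `v1` are left untouched because they are not purely numeric.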
724
+ ### 16.11 Internal-host leakage regexes
725
+
726
+ Run on every JS body + sourcesContent + APK strings + manifest:
727
+
728
+ **RFC1918 + loopback (`127.0.0.0/8`):**
729
+ ```regex
730
+ \b(?:10\.(?:\d{1,3}\.){2}\d{1,3}|172\.(?:1[6-9]|2\d|3[01])\.(?:\d{1,3})\.(?:\d{1,3})|192\.168\.(?:\d{1,3})\.(?:\d{1,3})|127\.(?:\d{1,3}\.){2}\d{1,3})\b
731
+ ```
732
+
733
+ **Internal DNS suffixes:**
734
+ ```regex
735
+ \b[A-Za-z0-9][A-Za-z0-9\-]{0,62}\.(?:internal|corp|lan|intranet|local|prod|staging|dev|qa|test)\b
736
+ ```
737
+
738
+ **Kubernetes service DNS:**
739
+ ```regex
740
+ \b[A-Za-z0-9\-]+\.[A-Za-z0-9\-]+\.svc(?:\.cluster\.local)?\b
741
+ ```
742
+
743
+ Each match → MEDIUM `INFO_DISCLOSURE`. Aggregate per host: if many matches share the same internal subdomain, that's a recon seed for any future internal phase.
744
+
745
+ ### 16.12 Subdomain-takeover provider fingerprints (summary, 27 providers)
746
+
747
+ Watch for these CNAME targets + the corresponding "available for claim" response signature:
748
+
749
+ | Provider | CNAME pattern | Takeover signature |
750
+ |---|---|---|
751
+ | GitHub Pages | `*.github.io` | `There isn't a GitHub Pages site here.` |
752
+ | Heroku | `*.herokuapp.com` | `No such app` |
753
+ | AWS S3 | `*.s3*.amazonaws.com` | `NoSuchBucket` |
754
+ | AWS CloudFront | `*.cloudfront.net` | `Bad request` w/ specific X-Amz error |
755
+ | Azure (multiple) | `*.azurewebsites.net`, `*.blob.core.windows.net`, `*.cloudapp.net`, `*.trafficmanager.net` | Various per-product 404 patterns |
756
+ | Shopify | `shops.myshopify.com` | `Sorry, this shop is currently unavailable.` |
757
+ | Squarespace | `*.squarespace.com` | `No Such Account` |
758
+ | Tumblr | `*.tumblr.com` | `Whatever you were looking for doesn't currently exist.` |
759
+ | WordPress | `*.wordpress.com` | `Do you want to register *.wordpress.com?` |
760
+ | Fastly | various | Fastly-specific 404 |
761
+ | Pantheon | `*.pantheonsite.io` | `The gods are wise, but do not know of the site...` |
762
+ | Surge.sh | `*.surge.sh` | `project not found` |
763
+ | Bitbucket Pages | `*.bitbucket.io` | Repository not found |
764
+ | Tilda | `*.tilda.ws` | `Please renew your subscription` |
765
+ | Strikingly | `*.s.strikinglydns.com` | `PAGE NOT FOUND` |
766
+ | Smartling | `*.smartling.com` | Domain is not configured |
767
+ | Ngrok | `*.ngrok.io` | Tunnel not found |
768
+ | Webflow | `*.webflow.io` | Site not found |
769
+ | Zendesk | `*.zendesk.com` | `Help Center Closed` |
770
+ | Cargo | `*.cargocollective.com` | `404 Not Found` (with cargo branding) |
771
+ | Statuspage | `*.statuspage.io` | Not found |
772
+ | Intercom | `*.intercom.help` | Not found |
773
+ | Helpjuice | `*.helpjuice.com` | Not found |
774
+ | Helpscout | `*.helpscoutdocs.com` | Not found |
775
+ | Tictail | `*.tictail.com` | Not found |
776
+ | Brightcove | `*.brightcovegallery.com` | Not found |
777
+ | Smugmug | various | Not found |
778
+
779
+ For full per-provider detection signatures + edge cases, use SubdomainX or Subzy/Subjack against a freshly-fetched fingerprint database.
780
+
781
+ ---
782
+
783
+ ### 16.13 Copy-Paste Probes (curl one-liners)
784
+
785
+ Runnable curl commands for the probe paths in §16.1–16.12. Common flags: `-sk` (silent + ignore TLS errors), `-m 10` (10s max), `-o /tmp/r` (response body to disk), `-w '%{http_code}\n'` (print status code), `-A "Mozilla/5.0"` (UA; change per persona).
786
+
787
+ **Always-on HTTP checks (§16.5):**
788
+
789
+ ```bash
790
+ T="https://target.example"
791
+
792
+ # .git/config (CRITICAL)
793
+ curl -sk -m 10 "$T/.git/config" | grep -E '\[core\]|\[remote|repositoryformatversion'
794
+
795
+ # .git/HEAD (HIGH)
796
+ curl -sk -m 10 "$T/.git/HEAD" | grep -E '^ref:'
797
+
798
+ # .env (CRITICAL)
799
+ curl -sk -m 10 "$T/.env" | grep -E '^[[:space:]]*[A-Z_][A-Z0-9_]*[[:space:]]*='
800
+
801
+ # Apache /server-status (MEDIUM)
802
+ curl -sk -m 10 "$T/server-status" | grep -i 'Apache Server Status'
803
+
804
+ # Apache /server-info (MEDIUM)
805
+ curl -sk -m 10 "$T/server-info" | grep -i 'Apache Server Information'
806
+
807
+ # .DS_Store (LOW)
808
+ curl -sk -m 10 "$T/.DS_Store" | head -c 8 | xxd -p | grep -x '0000000142756431'  # magic bytes \x00\x00\x00\x01Bud1
809
+
810
+ # phpinfo.php (HIGH)
811
+ curl -sk -m 10 "$T/phpinfo.php" | grep -E 'phpinfo\(\)|PHP Version'
812
+
813
+ # info.php (HIGH)
814
+ curl -sk -m 10 "$T/info.php" | grep -E 'phpinfo\(\)|PHP Version'
815
+
816
+ # Spring Boot /actuator/env (CRITICAL)
817
+ curl -sk -m 10 "$T/actuator/env" | grep -E '"propertySources"|systemProperties|systemEnvironment'
818
+
819
+ # Spring Boot /actuator/heapdump (CRITICAL — saves binary; check size)
820
+ curl -sk -m 30 "$T/actuator/heapdump" -o /tmp/heap && file /tmp/heap | grep -i 'HPROF\|data'
821
+
822
+ # Elasticsearch open (HIGH)
823
+ curl -sk -m 10 "$T/_cat/indices?v"
824
+
825
+ # Jenkins script console (HIGH)
826
+ curl -sk -m 10 "$T/script" | grep -iE 'Jenkins|Script Console'
827
+
828
+ # Tomcat manager (HIGH)
829
+ curl -sk -m 10 -o /dev/null -w '%{http_code}\n' "$T/manager/html" # 401 = present + auth-gated; 200 = no auth
830
+
831
+ # WordPress orphan installer (LOW)
832
+ curl -sk -m 10 "$T/wp-admin/install.php" | grep -i 'WordPress Installation'
833
+
834
+ # security.txt (INFO)
835
+ curl -sk -m 10 "$T/.well-known/security.txt"
836
+ ```
837
+
838
+ **SSO subdomain prefixes (§16.7):**
839
+
840
+ ```bash
841
+ D="target.example"
842
+ for prefix in auth login sso idp iam identity accounts oauth; do
843
+ echo "=== ${prefix}.${D} ==="
844
+ curl -sk -m 10 "https://${prefix}.${D}/.well-known/openid-configuration" -o /dev/null -w '%{http_code}\n'
845
+ done
846
+
847
+ # Generic OIDC discovery on any host:
848
+ curl -sk -m 10 "https://${HOST}/.well-known/openid-configuration" | jq .
849
+ ```
850
+
851
+ **SAML metadata paths (§16.6):**
852
+
853
+ ```bash
854
+ H="target.example.com"
855
+ for p in /saml/metadata \
856
+ /FederationMetadata/2007-06/FederationMetadata.xml \
857
+ /federationmetadata/2007-06/federationmetadata.xml \
858
+ /simplesaml/saml2/idp/metadata.php \
859
+ /auth/saml2/metadata; do
860
+ echo "=== $p ==="
861
+ curl -sk -m 10 "https://${H}${p}" -o /dev/null -w '%{http_code} %{size_download}\n'
862
+ done
863
+ ```
864
+
865
+ **Cloud bucket probes (§16.8):**
866
+
867
+ ```bash
868
+ B="candidate-bucket-name"
869
+
870
+ # S3 (us-east-1 first)
871
+ curl -sk -m 10 -I "https://${B}.s3.amazonaws.com/" -w 'STATUS:%{http_code}\n' | head -20
872
+ # If 200/301: list objects
873
+ curl -sk -m 10 "https://${B}.s3.amazonaws.com/?list-type=2" | head -50
874
+
875
+ # S3 region-specific
876
+ for r in us-east-1 us-west-2 eu-west-1 ap-southeast-1; do
877
+ curl -sk -m 10 -I "https://${B}.s3-${r}.amazonaws.com/" -w "${r}: %{http_code}\n"
878
+ done
879
+
880
+ # GCS
881
+ curl -sk -m 10 -I "https://${B}.storage.googleapis.com/"
882
+ curl -sk -m 10 "https://storage.googleapis.com/${B}/"
883
+
884
+ # Azure Blob
885
+ curl -sk -m 10 -I "https://${B}.blob.core.windows.net/"
886
+ curl -sk -m 10 "https://${B}.blob.core.windows.net/?comp=list"
887
+ ```
888
+
889
+ **GraphQL introspection POST (§16.2):**
890
+
891
+ ```bash
892
+ H="https://target.example/graphql"
893
+
894
+ curl -sk -m 15 -X POST "$H" \
895
+ -H 'Content-Type: application/json' \
896
+ -d '{
897
+ "operationName":"IntrospectionQuery",
898
+ "query":"query IntrospectionQuery { __schema { types { name kind fields { name type { name kind } } } queryType { name } mutationType { name } subscriptionType { name } } }"
899
+ }' | jq '.data.__schema.types | length'
900
+ ```
901
+
902
+ **Read-only secret validators (§23):**
903
+
904
+ ```bash
905
+ # Postman PMAK
906
+ curl -sk -m 10 -H "X-Api-Key: PMAK-..." https://api.getpostman.com/me | jq .
907
+
908
+ # AWS (use boto3 instead of curl — pre-signing complexity)
909
+ python3 -c "import boto3; print(boto3.client('sts', aws_access_key_id='AKIA...', aws_secret_access_key='...').get_caller_identity())"
910
+
911
+ # GitHub PAT (note scope header)
912
+ curl -sk -m 10 -H "Authorization: token ghp_..." https://api.github.com/user -D /tmp/h | jq -r '.login,.email'
913
+ grep -i 'X-OAuth-Scopes' /tmp/h
914
+
915
+ # Slack
916
+ curl -sk -m 10 -H "Authorization: Bearer xoxb-..." -X POST https://slack.com/api/auth.test | jq .
917
+
918
+ # Anthropic (read-only validation)
919
+ curl -sk -m 10 -H "x-api-key: sk-ant-..." -H "anthropic-version: 2023-06-01" https://api.anthropic.com/v1/models | jq '.data | length'
920
+
921
+ # OpenAI
922
+ curl -sk -m 10 -H "Authorization: Bearer sk-..." https://api.openai.com/v1/models | jq '.data | length'
923
+
924
+ # npm
925
+ curl -sk -m 10 -H "Authorization: Bearer npm_..." https://registry.npmjs.org/-/whoami | jq .
926
+
927
+ # Atlassian (account)
928
+ curl -sk -m 10 -u "email:ATATT3xFfGF0_..." https://your-domain.atlassian.net/rest/api/3/myself | jq .
929
+
930
+ # DataDog (API + APP key both required)
931
+ curl -sk -m 10 -H "DD-API-KEY: ..." -H "DD-APPLICATION-KEY: ..." https://api.datadoghq.com/api/v1/validate | jq .
932
+ ```
933
+
934
+ **Bulk webapp triage (httpx, faster than curl loop):**
935
+
936
+ ```bash
937
+ # Install: go install github.com/projectdiscovery/httpx/cmd/httpx@latest
938
+ echo "target.example" | httpx -sc -title -tech-detect -web-server -ip -cdn -follow-redirects
939
+
940
+ # With probe list
941
+ cat subdomains.txt | httpx -sc -title -tech-detect -path /actuator/env,/.git/config,/.env -mc 200,301,403
942
+ ```
943
+
944
+ **Save responses for evidence:**
945
+
946
+ ```bash
947
+ mkdir -p evidence/$(date -u +%Y%m%d)
948
+ T="https://target.example"
949
+ P="/actuator/env"
950
+ TS=$(date -u +%Y%m%dT%H%M%SZ)
951
+ SAFE_NAME=$(echo "${T}${P}" | tr '/:' '_')
952
+ curl -sk -m 10 "$T$P" -o "evidence/$(date -u +%Y%m%d)/${TS}_${SAFE_NAME}.body" \
953
+ -D "evidence/$(date -u +%Y%m%d)/${TS}_${SAFE_NAME}.headers"
954
+ sha256sum "evidence/$(date -u +%Y%m%d)/${TS}_${SAFE_NAME}".* > "evidence/$(date -u +%Y%m%d)/${TS}_${SAFE_NAME}.sha256"
955
+ ```
956
+
957
+ ---
958
+
959
+ ### 16.14 Email Security Analysis (SPF/DMARC/DKIM/BIMI/MTA-STS/DNSSEC)
960
+
961
+ Assess spoof feasibility and infer SaaS tenants from a target's email-related DNS records.
962
+
963
+ **SPF lookup + parsing:**
964
+
965
+ ```bash
966
+ D="target.example"
967
+ dig +short TXT "$D" | grep -i 'v=spf1'
968
+ ```
969
+
970
+ **Common SPF parsing checklist:**
971
+ - Ends in `-all` (hardfail) → strict; major providers reject spoofs.
972
+ - Ends in `~all` (softfail) → spam folder for spoofs.
973
+ - Ends in `?all` or no `all` → permissive; spoofs likely deliver.
974
+ - Includes (`include:`) reveal SaaS tenants:
975
+ - `include:_spf.google.com` → Google Workspace.
976
+ - `include:spf.protection.outlook.com` → Microsoft 365.
977
+ - `include:_spf.salesforce.com` → Salesforce.
978
+ - `include:mail.zendesk.com` → Zendesk customer.
979
+ - `include:sendgrid.net` → SendGrid customer.
980
+ - `include:mailgun.org` → Mailgun customer.
981
+ - `include:_spf.atlassian.net` → Atlassian Cloud.
982
+ - `include:amazonses.com` → AWS SES.
983
+ - `include:mktomail.com` → Marketo.
984
+ - `include:_spf.intuit.com` → Intuit (QuickBooks/Mailchimp).
985
+ - `include:spf.mandrillapp.com` → Mandrill.
986
+ - `include:_spf.workday.com` → Workday.
987
+
988
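+ If evaluating the SPF record needs more than 10 DNS lookups (the RFC 7208 limit; `include`, `a`, `mx`, `ptr`, `exists`, and `redirect` each count, including those inside nested includes) → evaluation returns `permerror` → receivers may treat the mail as unauthenticated, so spoofs may pass. Tools: `spfquery`, `spftools` (online), `dig +trace`.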
989
+
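A rough parser for the checklist above is sketched here. It inspects only the top-level record: lookups hidden inside nested `include:` targets are not resolved, so the count is a lower bound. The function name and the returned keys are hypothetical.

```python
# Mechanisms that cost one DNS lookup each (RFC 7208); redirect= is handled separately.
LOOKUP_MECHANISMS = ("include", "a", "mx", "ptr", "exists")


def analyze_spf(record: str) -> dict:
    """Summarize a single SPF TXT record: all-qualifier, includes, top-level lookups."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    lookups = 0
    includes = []
    all_qualifier = None  # None means the record has no 'all' term
    for term in terms[1:]:
        t = term.lower()
        if t.startswith("redirect="):
            lookups += 1
            continue
        qualifier = "+"  # default qualifier when none is written
        if t[0] in "+-~?":
            qualifier, t = t[0], t[1:]
        name = t.split(":", 1)[0].split("/", 1)[0]
        if name == "all":
            all_qualifier = qualifier
        elif name in LOOKUP_MECHANISMS:
            lookups += 1
            if name == "include" and ":" in t:
                includes.append(t.split(":", 1)[1])
    return {"all": all_qualifier, "includes": includes, "top_level_lookups": lookups}
```

The `includes` list is what feeds the SaaS-tenant inference; `all` being `"-"`, `"~"`, or anything else maps to the strict, softfail, and permissive cases respectively.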
990
+ **DMARC policy + alignment:**
991
+
992
+ ```bash
993
+ dig +short TXT "_dmarc.${D}"
994
+ ```
995
+
996
+ Parse for:
997
+ - `p=` → primary policy (`none`, `quarantine`, `reject`).
998
+ - `sp=` → subdomain policy (defaults to `p=`).
999
+ - `aspf=` / `adkim=` → alignment mode (`r`=relaxed, `s`=strict).
1000
+ - `pct=` → percentage of mail to which policy applies.
1001
+ - `rua=` / `ruf=` → reporting addresses (often reveals SaaS DMARC vendors: dmarcian, valimail, Agari, easydmarc).
1002
+
1003
+ **Severity:**
1004
+ - `p=none` → spoof-feasible, downgrade trust → MEDIUM finding.
1005
+ - `p=quarantine pct<100` → partial enforcement → LOW.
1006
+ - `p=reject` + `aspf=s` + `adkim=s` → well-postured → no finding.
1007
+
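The three severity rules can be expressed as a small classifier. This sketch assumes a missing `p=` tag is treated like `p=none`, and it returns `"UNRATED"` for combinations the list above does not cover (for example `p=reject` with relaxed alignment); both choices and the function names are assumptions.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a lower-cased tag -> value mapping."""
    tags = {}
    for part in record.split(";"):
        key, sep, value = part.partition("=")
        if sep:
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_severity(record: str) -> str:
    """Apply the severity rules above; 'NONE' means no finding."""
    tags = parse_dmarc(record)
    policy = tags.get("p", "none").lower()  # assumption: absent p= behaves like none
    pct = int(tags.get("pct", "100"))       # pct defaults to 100
    if policy == "none":
        return "MEDIUM"
    if policy == "quarantine" and pct < 100:
        return "LOW"
    if policy == "reject" and tags.get("aspf") == "s" and tags.get("adkim") == "s":
        return "NONE"
    return "UNRATED"  # not covered by the rules in this section
```

The same `parse_dmarc` output exposes `rua` and `ruf` for the reporting-vendor lookup.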
1008
+ **DKIM key discovery:**
1009
+
1010
+ DKIM selectors aren't well-known; common patterns:
1011
+ ```bash
1012
+ for selector in default google selector1 selector2 mail email k1 dkim s1 s2 mta1 mta2 \
1013
+ amazonses 20240101 20230101 mailchimp sendgrid mxvault; do
1014
+ echo "=== ${selector} ==="
1015
+ dig +short TXT "${selector}._domainkey.${D}"
1016
+ done
1017
+ ```
1018
+
1019
+ If a key returns: extract `p=<base64>` and check key length. RSA-1024 → MEDIUM (deprecated; should be 2048+). Missing or rotated infrequently → LOW finding.
1020
+
1021
+ **BIMI (Brand Indicators for Message Identification):**
1022
+
1023
+ ```bash
1024
+ dig +short TXT "default._bimi.${D}"
1025
+ ```
1026
+
1027
+ If present + `p=reject` DMARC → brand-impersonation defense in inbox UI. Absence is LOW only (operational, not exploitable).
1028
+
1029
+ **MTA-STS (Mail Transfer Agent Strict Transport Security):**
1030
+
1031
+ ```bash
1032
+ dig +short TXT "_mta-sts.${D}"
1033
+ curl -sk -m 10 "https://mta-sts.${D}/.well-known/mta-sts.txt"
1034
+ ```
1035
+
1036
+ If neither responds → MX-server TLS not enforced; MITM-able. LOW finding. If `mode=enforce` present and policy file matches → well-postured.
1037
+
1038
+ **TLS-RPT (TLS Reporting):**
1039
+ ```bash
1040
+ dig +short TXT "_smtp._tls.${D}"
1041
+ ```
1042
+
1043
+ **DNSSEC validation:**
1044
+
1045
+ ```bash
1046
+ dig +dnssec "${D}" SOA | grep -E 'flags|RRSIG'
1047
+ delv "${D}" 2>&1 | grep -i 'fully validated\|unsigned answer\|insecur'
1048
+ ```
1049
+
1050
+ If `delv` reports `unsigned answer` (the zone is insecure) → DNSSEC not enabled (LOW finding; it does not enable spoofing but is a hardening gap).
1051
+
1052
+ **MX → IdP / mail-host inference:**
1053
+
1054
+ ```bash
1055
+ dig +short MX "${D}"
1056
+ ```
1057
+
1058
+ | MX pattern | IdP / hosting |
1059
+ |---|---|
1060
+ | `aspmx.l.google.com`, `*.googlemail.com` | Google Workspace |
1061
+ | `*.mail.protection.outlook.com` | Microsoft 365 |
1062
+ | `*.mail.eo.outlook.com` | Microsoft 365 (older) |
1063
+ | `*.zoho.com` | Zoho Mail |
1064
+ | `*.yandex.net` | Yandex 360 |
1065
+ | `*.fastmail.com` | Fastmail |
1066
+ | `*.proofpoint.com`, `*.pphosted.com` | Proofpoint (M365 user with Proofpoint inbound) |
1067
+ | `*.mimecast.com`, `*.mimecast-eu.com` | Mimecast |
1068
+ | `*.barracudanetworks.com` | Barracuda |
1069
+ | Self-hosted IPs in target ASN | On-prem mail server (often Exchange) |
1070
+
1071
+ **DMARC reporting-vendor inference (parse `rua=` / `ruf=`):**
1072
+
1073
+ | RUA/RUF host | Vendor | Implication |
1074
+ |---|---|---|
1075
+ | `*.dmarcian.com` | dmarcian | DMARC reporting customer |
1076
+ | `*.valimail.com`, `*.dmarc-rua.com` | Valimail | DMARC reporting customer |
1077
+ | `*.kdmarc.com` | Kratikal kDMARC | Indian DMARC vendor; common in IN orgs |
1078
+ | `*.agari.com` | Agari (Fortra) | Email security vendor |
1079
+ | `*.easydmarc.com` | EasyDMARC | DMARC reporting customer |
1080
+ | `*.dmarcanalyzer.com` | DMARC Analyzer | Reporting customer |
1081
+ | `*.postmarkapp.com` | Postmark | DMARC reporting addon |
1082
+ | `<addr>@<target-domain>` | Self-hosted reporting | Internal mailbox; sometimes leaks team-name (`itg@`, `secops@`, `dmarc@`) |
1083
+
1084
+ Capture the vendor + the internal RUA mailbox. Both are leak surfaces (vendor compromise = DMARC bypass; internal mailbox = phishing target).
1085
+
1086
+ **Windows / PowerShell parallel for the entire §16.14 audit:**
1087
+
1088
+ PS 5.1 `Resolve-DnsName` does **not** accept `-Type CAA` (use PowerShell 7+ or `nslookup -type=CAA <domain>`). Otherwise:
1089
+
1090
+ ```powershell
1091
+ $D = "target.example"
1092
+ "=== SPF ==="; (Resolve-DnsName $D -Type TXT -EA SilentlyContinue | ? { $_.Strings -match 'v=spf1' }).Strings
1093
+ "=== DMARC ==="; (Resolve-DnsName "_dmarc.$D" -Type TXT -EA SilentlyContinue).Strings
1094
+ "=== MTA-STS ==="; (Resolve-DnsName "_mta-sts.$D" -Type TXT -EA SilentlyContinue).Strings
1095
+ "=== TLS-RPT ==="; (Resolve-DnsName "_smtp._tls.$D" -Type TXT -EA SilentlyContinue).Strings
1096
+ "=== BIMI ==="; (Resolve-DnsName "default._bimi.$D" -Type TXT -EA SilentlyContinue).Strings
1097
+ "=== MX ==="; Resolve-DnsName $D -Type MX -EA SilentlyContinue | Select NameExchange,Preference
1098
+ "=== DKIM common selectors ==="
1099
+ foreach ($s in @("default","google","selector1","selector2","mail","email","k1","dkim","s1","s2","amazonses","mailchimp","sendgrid","mxvault","20240101","zoho","zmail","outlook","o365")) {
1100
+ $r = Resolve-DnsName "$s._domainkey.$D" -Type TXT -EA SilentlyContinue
1101
+ if ($r) { "${s}: FOUND" }
1102
+ }
1103
+ "=== CAA (PS 5.1 fallback) ==="; nslookup -type=CAA $D 2>$null
1104
+ ```
1105
+
1106
+ ### 16.15 Origin Discovery / CDN Bypass
1107
+
1108
+ If the target is behind Cloudflare/Akamai/Fastly/CloudFront, the CDN's IP ranges are published. An IP **outside** those ranges that serves the same site is an origin candidate.
1109
+
1110
+ **Cloudflare IPv4 ranges:**
1111
+ ```
1112
+ https://www.cloudflare.com/ips-v4
1113
+ ```
1114
+ **Akamai ASNs:** AS16625, AS20940, AS21342, AS21357.
1115
+ **Fastly:** AS54113.
1116
+ **AWS CloudFront:** published in `https://ip-ranges.amazonaws.com/ip-ranges.json` filter `service:CLOUDFRONT`.
1117
+
1118
+ **Origin discovery via DNS history:**
1119
+
1120
+ ```bash
1121
+ # SecurityTrails (paid)
1122
+ curl -sk -H "APIKEY: ..." \
1123
+ "https://api.securitytrails.com/v1/history/${D}/dns/a" | jq '.records[] | {ip:.values[].ip, first_seen, last_seen}'
1124
+ ```
1125
+
1126
+ Free alternatives:
1127
+ ```bash
1128
+ # Validin
1129
+ curl -sk "https://app.validin.com/api/axon/${D}/dns" | jq .
1130
+
1131
+ # RiskIQ Community (free tier; auth required)
1132
+ curl -sk -u "user:apikey" "https://api.riskiq.net/pt/v2/dns/passive?query=${D}" | jq .
1133
+ ```
1134
+
1135
+ Filter the result: any historical A record IP **not** in current CDN ranges = origin candidate.
1136
+
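The filtering step can be done with the standard `ipaddress` module, as in the sketch below. The two CIDRs in the usage note are sample values assumed to appear in Cloudflare's list; in practice the ranges should be fetched fresh from the published sources named above. The function name is hypothetical.

```python
import ipaddress


def origin_candidates(historical_ips: list[str], cdn_cidrs: list[str]) -> list[str]:
    """Return historical IPs that fall outside every supplied CDN range."""
    networks = [ipaddress.ip_network(cidr) for cidr in cdn_cidrs]
    candidates = []
    for raw in historical_ips:
        ip = ipaddress.ip_address(raw)
        # Membership tests across IP versions simply evaluate to False.
        in_cdn = any(ip in net for net in networks)
        if not in_cdn and raw not in candidates:
            candidates.append(raw)
    return candidates
```

Called with `["104.16.1.1", "203.0.113.42"]` and the sample ranges `["173.245.48.0/20", "104.16.0.0/13"]`, only `203.0.113.42` remains as a candidate to validate with the Host-header probe.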
1137
+ **Origin via certificate SAN pivot (Censys):**
1138
+
1139
+ ```bash
1140
+ # Censys (free 250 queries/month with key)
1141
+ censys search "services.tls.certificates.leaf_data.subject.common_name:${D} AND NOT services.tls.certificates.leaf_data.issuer.common_name:'Cloudflare'"
1142
+ ```
1143
+
1144
+ Or via crt.sh + manual IP check:
1145
+ ```bash
1146
+ curl -sk "https://crt.sh/?q=%25.${D}&output=json" | jq -r '.[].name_value' | sort -u
1147
+ ```
1148
+
1149
+ **Origin via favicon hash (Shodan):**
1150
+
1151
+ ```bash
1152
+ # Compute favicon mmh3
1153
+ python3 -c "
1154
+ import urllib.request, codecs, mmh3
1155
+ data = urllib.request.urlopen('https://target.example/favicon.ico').read()
1156
+ b64 = codecs.encode(data, 'base64')
1157
+ print(mmh3.hash(b64))"
1158
+
1159
+ # Search Shodan
1160
+ shodan search "http.favicon.hash:<computed-hash>" --fields ip_str,port,org
1161
+ ```
1162
+
1163
+ Cross-reference with CDN ranges; non-CDN matches = origin candidates.
1164
+
1165
+ **Origin via JARM:**
1166
+
1167
+ ```bash
1168
+ # Compute JARM
1169
+ python3 -c "
1170
+ from jarm.scanner.scanner import Scanner
1170
+ print(Scanner.scan('target.example', 443))
1172
+ " 2>/dev/null || echo "Install: pip install pyjarm"
1173
+
1174
+ # Search Shodan for matching JARM
1175
+ shodan search "ssl.jarm:<jarm-hash>" --fields ip_str,port
1176
+ ```
1177
+
1178
+ **Origin via Host-header probe (validate candidate):**
1179
+
1180
+ ```bash
1181
+ CANDIDATE_IP="203.0.113.42"
1182
+ curl -sk -m 10 --resolve "target.example.com:443:${CANDIDATE_IP}" "https://target.example.com/" -o /tmp/candidate.html  # pins DNS to the candidate so Host and SNI both match
1183
+ diff <(curl -sk -m 10 https://target.example.com/) /tmp/candidate.html | head -50
1184
+ ```
1185
+
1186
+ If small/no diff → confirmed origin. Document with detectability=low.
1187
+
1188
+ **Origin via auxiliary subdomains (often skip CDN):**
1189
+
1190
+ ```bash
1191
+ for sub in mail smtp ftp sftp cpanel webmail direct origin direct-connect noproxy \
1192
+ dev staging stg uat preprod sandbox preview origin-www old-www legacy \
1193
+ server srv host1 host2 vps server1; do
1194
+ echo "=== ${sub}.${D} ==="
1195
+ dig +short A "${sub}.${D}"
1196
+ done | grep -vE '^(===|$)' | sort -u
1197
+ ```
1198
+
1199
+ Cross-reference any returned IP against CDN ranges.
1200
+
1201
+ **Origin via email-header bounce:**
1202
+
1203
+ Send mail to `<random>@${D}` from a sock-puppet account. The bounce often includes `Received:` headers showing the inbound mail server's actual IP — sometimes co-located with web origin.
1204
+
1205
+ **Origin via misconfigured CDN error pages:**
1206
+
1207
+ Some CDN 5xx error pages historically leaked upstream details. Trigger errors and inspect:
1208
+ ```bash
1209
+ # Trigger CDN-side 5xx (oversized request, malformed Host)
1210
+ curl -sk -m 10 -H "Host: " "https://target.example/" -o /tmp/err.html
1211
+ curl -sk -m 10 -H "X-Forwarded-For: $(python3 -c 'print("a"*8000)')" "https://target.example/"
1212
+ grep -iE 'origin|upstream|server|backend|cf-ray' /tmp/err.html
1213
+ ```
1214
+
1215
+ ### 16.16 Vendor Product Fingerprints
1216
+
1217
+ Common edge appliances / products on the target's perimeter, with fingerprint paths and notes on common CVEs.
1218
+
1219
+ | Product | Fingerprint paths | Notes |
1220
+ |---|---|---|
1221
+ | **Citrix Netscaler / Gateway** | `/vpn/index.html`, `/logon/LogonPoint/tmindex.html`, `/citrix/` | Version in HTML; CVE-2023-3519 (RCE), CVE-2019-19781 (path traversal RCE) — both KEV-listed. |
1222
+ | **F5 BIG-IP TMUI** | `/tmui/login.jsp`, `/mgmt/tm/sys/` | Banner reveals version; CVE-2022-1388 (auth bypass), CVE-2023-46747 — KEV-listed. |
1223
+ | **Cisco ASA / AnyConnect** | `/+CSCOE+/`, `/CSCOE/index.html`, `/webvpn.html`, `/+CSCOE+/portal.html` | CVE-2020-3452 (file read), CVE-2018-0101 (RCE). |
1224
+ | **Pulse Secure / Ivanti Connect** | `/dana-na/`, `/dana-na/auth/url_default/welcome.cgi`, `/api/v1/` | CVE-2024-21887 (KEV), CVE-2023-46805 (KEV) — chained command injection. |
1225
+ | **FortiGate / FortiOS** | `/remote/login`, `/remote/info`, `/api/v2/` | CVE-2022-42475 (RCE, KEV), CVE-2024-21762 (RCE, KEV). |
1226
+ | **PaloAlto GlobalProtect** | `/global-protect/`, `/global-protect/portal/css/login.css`, `/api/?type=keygen` | CVE-2024-3400 (RCE, KEV), CVE-2019-1579. |
1227
+ | **VMware Horizon** | `/portal/info.jsp`, `/broker/xml`, `/login.jsp` | log4shell exposure (CVE-2021-44228, KEV). |
1228
+ | **VMware vCenter** | `/sdk`, `/ui/`, `/vsphere-client/`, `/websso/SAML2/` | CVE-2021-21972 (RCE, KEV), CVE-2021-22005. |
1229
+ | **VMware ESXi** | `/sdk`, `/ui/`, `/folder` | CVE-2021-21974 (heap overflow → ESXiArgs ransomware, KEV). |
1230
+ | **Microsoft Exchange OWA** | `/owa/`, `/ews/exchange.asmx`, `/ecp/` | ProxyShell (CVE-2021-34473), ProxyLogon (CVE-2021-26855), ProxyNotShell (CVE-2022-41040) — all KEV. |
1231
+ | **WatchGuard Firebox** | `/auth/`, `/wgcgi.cgi` | CVE-2022-26318 (CGI). |
1232
+ | **SonicWall SMA** | `/cgi-bin/welcome`, `/__api__/v1/`, `/diagnostics/` | CVE-2021-20016, CVE-2024-40766 (KEV). |
1233
+ | **Sophos UTM/XG/XGS** | `/userportal/`, `/webconsole/`, `/cgi-bin/` | CVE-2022-1040 (RCE, KEV). |
1234
+ | **Check Point R80/R81** | `/sslvpn/portal/`, `/clients/` | CVE-2024-24919 (KEV). |
1235
+ | **Zoho ManageEngine** | `/RestAPI/Login`, `/api/json/v2/` | Multiple RCE CVEs; check version. |
1236
+ | **Atlassian Confluence** | `/confluence/`, `/login.action`, `/rest/api/space` | CVE-2022-26134 (OGNL RCE, KEV), CVE-2023-22515 (KEV). |
1237
+ | **Atlassian Jira** | `/secure/Dashboard.jspa`, `/rest/api/2/serverInfo` | Multiple CVEs; check version. |
1238
+ | **GitLab self-hosted** | `/users/sign_in`, `/-/oauth/applications`, `/help` | Version in HTML footer; CVE-2021-22205 (RCE, KEV). |
1239
+ | **Telerik UI** | `/Telerik.Web.UI.WebResource.axd?type=rau` | CVE-2017-9248, CVE-2019-18935 — old but still found. |
1240
+ | **ConnectWise ScreenConnect** | `/SetupWizard.aspx`, `/Bin/SetupWizard.aspx` | CVE-2024-1709 (auth bypass, KEV). |
1241
+ | **SolarWinds Orion** | `/Orion/Login.aspx` | SUNBURST supply-chain compromise (2020); CVE-2020-10148 (Orion API auth bypass, associated with SUPERNOVA). |
1242
+ | **Kaseya VSA** | `/dl.asp`, `/userFilterTableRpt.asp` | CVE-2021-30116 (REvil supply-chain). |
1243
+ | **Microsoft IIS / OWA misc** | `Server: Microsoft-IIS/<version>` | Old versions = old CVEs; check. |
1244
+ | **Cisco Smart Install** | port 4786 open | CVE-2018-0171 (smart install client mode RCE). |
1245
+
1246
+ **Per-vendor probe pattern:**
1247
+
1248
+ ```bash
1249
+ T="https://target.example"
1250
+ # Citrix
1251
+ curl -sk -m 10 "$T/vpn/index.html" -o /tmp/c1 -w '%{http_code}\n'
1252
+ grep -iE 'NetScaler|Citrix|version' /tmp/c1
1253
+ # F5
1254
+ curl -sk -m 10 "$T/tmui/login.jsp" -o /tmp/c2 -w '%{http_code}\n'
1255
+ grep -iE 'BIG-IP|version' /tmp/c2
1256
+ # (etc — repeat per product)
1257
+ ```
1258
+
1259
+ **Auto-fingerprint with Nuclei:**
1260
+
1261
+ ```bash
1262
+ nuclei -u "$T" -t http/technologies/ -severity info,low,medium,high,critical
1263
+ nuclei -u "$T" -t http/cves/ -severity high,critical -etags fuzz
1264
+ ```
1265
+
1266
+ ### 16.17 Cloud-Native Service Fingerprints
1267
+
1268
+ Modern apps deploy on serverless / managed services. Fingerprint the platform from the URL pattern.
1269
+
1270
+ | Provider | URL pattern | Notes |
1271
+ |---|---|---|
1272
+ | **AWS Lambda Function URL** | `*.lambda-url.<region>.on.aws` | Direct invocation; check IAM auth posture. |
1273
+ | **AWS App Runner** | `*.<region>.awsapprunner.com` | Managed container; usually behind auth. |
1274
+ | **AWS API Gateway** | `*.execute-api.<region>.amazonaws.com` | REST/HTTP/WebSocket; check authorizer config. |
1275
+ | **AWS CloudFront** | `d[a-z0-9]+\.cloudfront\.net` | Distribution; origin behind it (see §16.15). |
1276
+ | **AWS ALB / ELB** | `*.<region>.elb.amazonaws.com` (ALB / Classic), `*.elb.<region>.amazonaws.com` (NLB) | Behind = EC2 / ECS. |
1277
+ | **AWS Amplify** | `*.amplifyapp.com` | Static + Lambda backend. |
1278
+ | **Google Cloud Run** | `*.run.app` (and `*.<region>.run.app`) | Container; check public-vs-IAM auth. |
1279
+ | **Google Cloud Functions** | `<region>-<project>.cloudfunctions.net` | Serverless; the function name is the first path segment. |
1280
+ | **Google App Engine** | `*.appspot.com` | Older serverless. |
1281
+ | **Azure Functions** | `*.azurewebsites.net` (also App Service) | Function App behind same domain pattern. |
1282
+ | **Azure Container Apps** | `*.azurecontainerapps.io` | Containers. |
1283
+ | **Azure Static Web Apps** | `*.azurestaticapps.net` | Static + Functions. |
1284
+ | **Vercel** | `*.vercel.app`, `*.now.sh` (legacy) | Frontend + serverless. |
1285
+ | **Netlify** | `*.netlify.app`, `*.netlify.com` | Frontend + functions. |
1286
+ | **Cloudflare Workers** | `*.workers.dev` | Edge functions. |
1287
+ | **Cloudflare Pages** | `*.pages.dev` | Static + functions. |
1288
+ | **Heroku** | `*.herokuapp.com` | Dynos. |
1289
+ | **Render** | `*.onrender.com` | Container/static. |
1290
+ | **Fly.io** | `*.fly.dev` | Edge containers. |
1291
+ | **Railway** | `*.railway.app` | App platform. |
1292
+ | **DigitalOcean App Platform** | `*.ondigitalocean.app` | Static + container. |
1293
+
1294
+ **For each pattern:**
1295
+ - Confirm public vs auth-required (HEAD / GET).
1296
+ - Check CORS posture.
1297
+ - For Lambda Function URLs / Cloud Run / Cloud Functions: check whether IAM auth is enforced (anonymous invocation = HIGH finding).
1298
+ - For static + functions hybrids (Vercel/Netlify/Cloudflare Pages): the function paths are usually `/api/*`; enumerate via JS extraction.
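The URL patterns in the table above can be turned into a quick triage helper. The following is a minimal sketch, assuming hostnames have already been collected elsewhere; `classify_host` is a hypothetical helper name and it encodes only a subset of the table as shell glob patterns.

```bash
# classify_host: map a hostname to a hosting platform by suffix.
# Hypothetical helper; patterns are copied from the table above (subset only).
classify_host() {
  case "$1" in
    *.lambda-url.*.on.aws)          echo "AWS Lambda Function URL" ;;
    *.execute-api.*.amazonaws.com)  echo "AWS API Gateway" ;;
    *.awsapprunner.com)             echo "AWS App Runner" ;;
    *.amplifyapp.com)               echo "AWS Amplify" ;;
    *.run.app)                      echo "Google Cloud Run" ;;
    *.cloudfunctions.net)           echo "Google Cloud Functions" ;;
    *.appspot.com)                  echo "Google App Engine" ;;
    *.azurewebsites.net)            echo "Azure Functions / App Service" ;;
    *.vercel.app|*.now.sh)          echo "Vercel" ;;
    *.netlify.app|*.netlify.com)    echo "Netlify" ;;
    *.workers.dev)                  echo "Cloudflare Workers" ;;
    *.pages.dev)                    echo "Cloudflare Pages" ;;
    *.herokuapp.com)                echo "Heroku" ;;
    *)                              echo "unknown" ;;
  esac
}

# Example: tag two hostnames.
classify_host "abc123.lambda-url.eu-west-1.on.aws"
classify_host "shop.target.example"
```

Hosts that come back as `unknown` are not necessarily self-hosted; they may sit behind a custom domain, so a CNAME lookup is still worth doing.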
1299
+
1300
+ ### 16.18 Container & Kubernetes Exposure
1301
+
1302
+ Exposed container and orchestration endpoints are increasingly common, and are often forgotten because they are assumed to sit safely behind NAT.
1303
+
1304
+ | Target | Port | Probe | Severity if exposed |
1305
+ |---|---|---|---|
1306
+ | **Docker API (unencrypted)** | 2375 | `curl -sk -m 5 http://${IP}:2375/v1.40/info` | CRITICAL (container/host takeover) |
1307
+ | **Docker API (TLS)** | 2376 | `curl -sk -m 5 https://${IP}:2376/v1.40/info` | HIGH (cert validation bypass possible) |
1308
+ | **Kubernetes API server** | 6443 / 8443 | `curl -sk -m 5 https://${IP}:6443/api` | HIGH if `system:anonymous` returns non-403 |
1309
+ | **Kubernetes Dashboard** | 8001 / 9090 / 30000+ | `curl -sk -m 5 http://${IP}:8001/api/v1/namespaces/kube-system/services/kubernetes-dashboard` | HIGH if reachable |
1310
+ | **kubelet** | 10250 (HTTPS), 10255 (HTTP, deprecated) | `curl -sk -m 5 https://${IP}:10250/pods` | CRITICAL (no auth = pod exec) |
1311
+ | **etcd** | 2379 (client), 2380 (peer) | `curl -sk -m 5 https://${IP}:2379/v2/keys/` (v2) or `etcdctl --endpoints=${IP}:2379 get /` (v3) | CRITICAL (cluster state + secrets) |
1312
+ | **kube-proxy** | 10256 | `curl -s -m 5 http://${IP}:10256/healthz` | INFO |
1313
+ | **kube-controller-manager** | 10257 | `curl -sk -m 5 https://${IP}:10257/metrics` | MEDIUM |
1314
+ | **kube-scheduler** | 10259 | `curl -sk -m 5 https://${IP}:10259/metrics` | MEDIUM |
1315
+ | **cAdvisor** | 4194 (deprecated) | `curl -s -m 5 http://${IP}:4194/metrics` | LOW (resource metrics) |
1316
+ | **Helm Tiller** (Helm 2 — deprecated but found) | 44134 | `helm --host ${IP}:44134 list` | HIGH (Tiller had cluster-admin) |
1317
+
1318
+ **Public container registries to check for leaks:**
1319
+
1320
+ | Registry | Search pattern |
1321
+ |---|---|
1322
+ | Docker Hub | `https://hub.docker.com/search?q=<target-keyword>&type=image` |
1323
+ | Quay (Red Hat) | `https://quay.io/search?q=<target-keyword>` |
1324
+ | GitHub Container Registry (GHCR) | enumerable via GitHub API: `https://api.github.com/orgs/<org>/packages?package_type=container` |
1325
+ | Amazon ECR Public | `https://gallery.ecr.aws/?searchTerm=<keyword>` |
1326
+ | Azure Container Registry (public) | varies; check for `*.azurecr.io` |
1327
+ | Google Container Registry (public) | `https://console.cloud.google.com/gcr/images/<project>?project=<project>` |
1328
+
1329
+ **Per-image scan workflow:**
1330
+ 1. `docker pull <registry>/<image>:<tag>` (or `skopeo inspect`).
1331
+ 2. `docker save <image> -o /tmp/img.tar`.
1332
+ 3. Extract layers; scan with secret catalog (§17).
1333
+ 4. Inspect `Dockerfile` history (`docker history <image>`) — sometimes reveals build args or COPY of secrets.
1334
+
1335
+ ### 16.19 CI/CD Platform Exposure
1336
+
1337
+ | Platform | Common exposure | Probe |
1338
+ |---|---|---|
1339
+ | **Jenkins** | `/script` (Groovy console = RCE if no auth), `/asynchPeople/`, `/jnlpJars/jenkins-cli.jar`, `/computer/`, `/job/<name>/api/json` | `curl -sk -m 10 "${T}/script"` and `curl -sk -m 10 "${T}/asynchPeople/api/json"` |
1340
+ | **GitLab self-hosted** | `/users/sign_in` (version in HTML), `/-/oauth/applications` (auth-required), `/api/v4/version`, `/-/snippets/<id>/raw` | `curl -sk -m 10 "${T}/api/v4/version"` |
1341
+ | **GitHub Actions workflow files** | `.github/workflows/*.yml` in any public repo | Search via GitHub code search: `path:.github/workflows extension:yml secrets` |
1342
+ | **CircleCI config** | `.circleci/config.yml` in any repo | Search: `path:.circleci/config.yml` |
1343
+ | **TeamCity** | `/login.html`, `/agent.html?agentId=*`, `/admin/admin.html` | `curl -sk -m 10 "${T}/login.html" \| grep -i 'TeamCity'` — version disclosure. CVE-2024-27198 (KEV). |
1344
+ | **Bamboo (Atlassian)** | `/userlogin.action`, `/rest/api/latest/info` | `curl -sk -m 10 "${T}/rest/api/latest/info"` |
1345
+ | **Drone CI** | `/api/info`, `/login` | `curl -sk -m 10 "${T}/api/info"` |
1346
+ | **Travis CI (legacy)** | `.travis.yml` in repos; `https://api.travis-ci.com/repos/<owner>/<repo>` | API often exposes build env. |
1347
+ | **Argo CD** | `/api/version`, `/applications` | `curl -sk -m 10 "${T}/api/version"`. Check anonymous-auth posture. |
1348
+ | **Tekton** | `/apis/tekton.dev/v1beta1/pipelineruns` (K8s native) | Enumerate via K8s API. |
1349
+ | **Spinnaker** | `/gate/info`, `/applications` | `curl -sk -m 10 "${T}/gate/info"` |
1350
+ | **Buildkite** | per-org dashboards; usually behind auth. | Check public agents page. |
1351
+
1352
+ **GitHub Actions secret-leak patterns to look for in workflows:**
1353
+
1354
+ ```yaml
1355
+ # Anti-pattern: secret echoed to log
1356
+ run: echo "${{ secrets.MY_API_KEY }}"
1357
+
1358
+ # Anti-pattern: secret handed to a script via env (log masking only covers the exact value, not transformed output such as base64)
1359
+ env:
1360
+   KEY: ${{ secrets.MY_API_KEY }}
1361
+ run: ./deploy.sh # script may echo $KEY
1362
+
1363
+ # Anti-pattern: pull_request_target with checkout of fork code (CVE class)
1364
+ on: pull_request_target
1365
+ jobs:
1366
+   test:
1366
+     steps:
1367
+       - uses: actions/checkout@v3
1368
+         with:
1369
+           ref: ${{ github.event.pull_request.head.sha }} # checks out fork code with secrets in env
1371
+ ```
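The third anti-pattern above is easy to pre-screen for in a cloned repository. This is a minimal sketch under stated assumptions: `risky_workflow` is a hypothetical helper, and it is a plain-text heuristic (two `grep` checks), not a YAML parser, so every hit needs manual review.

```bash
# risky_workflow: exit 0 when a workflow file both uses the
# pull_request_target trigger and references the PR head (fork code).
# Hypothetical helper; text heuristic only.
risky_workflow() {
  grep -q 'pull_request_target' "$1" &&
    grep -Eq 'github\.event\.pull_request\.head\.(sha|ref)' "$1"
}

# Example: flag candidate workflows in the current checkout.
for wf in .github/workflows/*.yml .github/workflows/*.yaml; do
  [ -f "$wf" ] || continue
  if risky_workflow "$wf"; then
    echo "REVIEW: $wf"
  fi
done
```

A flagged file is only a lead: the combination is exploitable when the checked-out fork code is then built or executed in a job that has access to secrets.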
1372
+
1373
+ ### 16.20 Documentation / Wiki Leak Paths
1374
+
1375
+ Public-share features on collaboration platforms regularly leak.
1376
+
1377
+ | Platform | URL pattern | What's exposed |
1378
+ |---|---|---|
1379
+ | **Notion (publish page)** | `*.notion.site/<slug>` or `notion.so/<workspace>/<page-id>` | Public page; sometimes whole workspaces published by accident. |
1380
+ | **Confluence Cloud (anonymous)** | `<target>.atlassian.net/wiki/spaces/` | Public spaces; check `/wiki/display/<SPACE>/`. |
1381
+ | **Atlassian Service Desk** | `<target>.atlassian.net/servicedesk/customer/portal/<N>` | Sometimes lists all internal request types. |
1382
+ | **Trello board** | `https://trello.com/b/<id>/<slug>` | Public board with cards; check via Google `site:trello.com "${target}"`. |
1383
+ | **Asana public project** | `https://app.asana.com/0/<id>/<id>` | Public project view. |
1384
+ | **ReadTheDocs** | `<project>.readthedocs.io` | Hosted docs; "private builds" sometimes default to public. |
1385
+ | **GitBook** | `<workspace>.gitbook.io/<book>/` | Published docs; sometimes contain internal SOPs. |
1386
+ | **MkDocs / Docusaurus on subdomain** | `docs.<target>` | Often contains internal architecture diagrams + setup notes. |
1387
+ | **Slab** | `<workspace>.slab.com/posts/<id>` | Published posts. |
1388
+ | **Coda** | `coda.io/d/<doc-id>` | Public docs. |
1389
+ | **Miro** | `https://miro.com/app/board/<id>/` | Public boards (often architecture diagrams). |
1390
+ | **Lucidchart** | `https://lucid.app/lucidchart/<id>/view` | Public diagrams. |
1391
+ | **Figma** | `https://www.figma.com/file/<key>/` | Public design files; sometimes leak product spec. |
1392
+ | **GitHub Wiki** | `github.com/<org>/<repo>/wiki` | Public wikis; check stale ones. |
1393
+ | **Linear** | `linear.app/<workspace>/issue/<id>` | Public issues (rare but happens). |
1394
+ | **Confluence anonymous server** | `<target>/confluence/`, `<target>/wiki/` (self-hosted) | Anonymous read sometimes left on. |
1395
+ | **Monday.com** | `view.monday.com/<id>` | Shared boards. |
1396
+ | **Wrike** | `app.wrike.com/external/<id>` | External-shared spaces. |
1397
+
1398
+ **Dork-driven discovery:**
1399
+ ```
1400
+ site:notion.site "{target}"
1401
+ site:notion.so "{target}"
1402
+ site:atlassian.net "{target}"
1403
+ site:trello.com "{target}"
1404
+ site:miro.com "{target}"
1405
+ site:lucid.app "{target}"
1406
+ site:figma.com "{target}"
1407
+ site:asana.com "{target}"
1408
+ site:gitbook.io "{target}"
1409
+ site:readthedocs.io "{target}"
1410
+ ```
1411
+
1412
+ ### 16.21 WHOIS / RDAP / Historical
1413
+
1414
+ WHOIS gives current registrant; RDAP is the structured replacement; historical WHOIS is the pivot gold.
1415
+
1416
+ **Current WHOIS:**
1417
+
1418
+ ```bash
1419
+ whois target.example # standard CLI
1420
+ curl -sk -m 10 "https://www.whois.com/whois/${D}" # web fallback
1421
+ ```
1422
+
1423
+ **RDAP (RFC 7480 family; structured JSON responses per RFC 9083):**
1424
+
1425
+ ```bash
1426
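+ # rdap.org uses the IANA bootstrap registry to redirect to the authoritative registry RDAP server (-L follows the redirect)
1427
+ curl -skL "https://rdap.org/domain/${D}" | jq .
1428
+ curl -sk "https://data.iana.org/rdap/dns.json" | jq . # IANA bootstrap registry (TLD → RDAP server map)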
1429
+ ```
1430
+
1431
+ What to extract from WHOIS / RDAP:
1432
+ - Registrant: name, org, email, phone, address (often redacted post-GDPR but not always for non-EU registrants).
1433
+ - Registrar: enables registrar-account pivot for related domains.
1434
+ - Created / updated / expiry dates: pattern of bulk registrations = same registrant.
1435
+ - Nameservers: NS reuse pivot.
1436
+ - Status flags (`clientHold`, `clientTransferProhibited`, etc.) = posture indicators.
1437
+ - Abuse contact: useful for responsible disclosure (§30).
1438
+
1439
+ **Historical WHOIS:**
1440
+
1441
+ Pre-GDPR records often have unredacted contact info. Sources:
1442
+
1443
+ | Source | Notes |
1444
+ |---|---|
1445
+ | **DomainTools** | Paid; gold-standard; full WHOIS history. |
1446
+ | **WhoisXML API** | Paid; bulk + history. |
1447
+ | **SecurityTrails** | Paid; WHOIS + DNS history. |
1448
+ | **viewdns.info** | Free WHOIS history (limited). |
1449
+ | **whoisology.com** | Paid; reverse WHOIS by registrant email. |
1450
+
1451
+ **Reverse-WHOIS pivots:**
1452
+
1453
+ If you have a registrant email, search "every domain registered by this email":
1454
+ ```bash
1455
+ # DomainTools (paid)
1456
+ curl -sk -H "X-API-Username: ..." -H "X-API-Key: ..." \
1457
+ "https://api.domaintools.com/v1/reverse-whois/?terms=admin@target.example"
1458
+ ```
1459
+
1460
+ This finds adjacent corporate assets (subsidiary domains, brand variations, employee personal projects on corp email).
1461
+
1462
+ ### 16.22 DNS Record Catalog (TXT verification tokens, MX→IdP)
1463
+
1464
+ For every target domain, dump all common record types:
1465
+
1466
+ ```bash
1467
+ D="target.example"
1468
+ for rtype in A AAAA MX TXT NS SOA CAA SRV CNAME PTR; do
1469
+ echo "=== ${rtype} ==="
1470
+ dig +short "${D}" "${rtype}"
1471
+ done
1472
+ ```
1473
+
1474
+ **TXT record verification token catalog** (each token reveals a SaaS tenancy):
1475
+
1476
+ | TXT pattern | SaaS / service | Implication |
1477
+ |---|---|---|
1478
+ | `google-site-verification=<token>` | Google Workspace / Search Console / Analytics | Google tenancy. |
1479
+ | `MS=ms<digits>` | Microsoft 365 (older) | M365 tenancy. |
1480
+ | `apple-domain-verification=<token>` | Apple Business Manager / iCloud Calendar | Apple ecosystem. |
1481
+ | `atlassian-domain-verification=<token>` | Atlassian Cloud (Jira/Confluence/etc.) | Atlassian customer. |
1482
+ | `facebook-domain-verification=<token>` | Facebook Business / Pixel | FB Business. |
1483
+ | `adobe-idp-site-verification=<token>` | Adobe Sign / Creative Cloud | Adobe customer. |
1484
+ | `docusign=<token>` | DocuSign | DocuSign customer. |
1485
+ | `dropbox-domain-verification=<token>` | Dropbox Business | Dropbox customer. |
1486
+ | `box-verification=<token>` | Box | Box customer. |
1487
+ | `webexdomainverification.<id>` | Webex | Cisco Webex. |
1488
+ | `zoom_verify_<id>` | Zoom | Zoom customer (admin domain). |
1489
+ | `notion=<token>` (rare) | Notion workspace | Notion enterprise. |
1490
+ | `slack-domain-verification=<token>` | Slack Enterprise Grid | Slack EG. |
1491
+ | `asana-domain-verification=<token>` | Asana Enterprise | Asana customer. |
1492
+ | `mongodb-site-verification=<token>` | MongoDB Atlas | DB tenant. |
1493
+ | `_acme-challenge.<domain>` (TXT record name) | ACME CAs (e.g. Let's Encrypt) | DNS-01 challenge issued or in progress; `_dnsauth.<domain>` is the DigiCert equivalent. |
1494
+ | `pinterest-site-verification=<token>` | Pinterest Business | Marketing surface. |
1495
+ | `cisco-ci-domain-verification=<token>` | Cisco Spark / Webex | Cisco. |
1496
+ | `_globalsign-domain-verification=<token>` | GlobalSign cert authority | Cert provider. |
1497
+ | `mailru-verification:<token>` | Mail.ru | RU presence. |
1498
+ | `yandex-verification:<token>` | Yandex services | RU presence. |
1499
+ | `zscaler-verification-<id>-<date>-<random>` | Zscaler (ZIA / ZPA / ZDX) | **Web SSE / SASE customer**; the date suffix is the verification-issued date. |
1500
+ | `cloudflare-verify=<token>` | Cloudflare (Zero Trust / Access / WARP) | Cloudflare org-tier customer. |
1501
+ | `autosect-site-verification=<token>` | AutoSect (security tooling) | Security vendor on tenant. |
1502
+ | `cisco-site-verification=<token>` | Cisco (various products) | Cisco vendor. |
1503
+ | `mscid=<token>` | Microsoft (newer M365 verification) | M365 tenancy (newer format). |
1504
+ | `_amazonses.<domain>` (TXT record name, value = token) | AWS SES sender verification | SES sender. |
1505
+ | `salesforce-domain-verification=<token>` | Salesforce | SF customer. |
1506
+ | `workday-domain-verification=<token>` | Workday | Workday customer (HR + Finance). |
1507
+ | `shopify-domain-verification=<token>` | Shopify | E-commerce customer. |
1508
+ | `klaviyo-domain-verification=<token>` | Klaviyo | Marketing automation. |
1509
+ | `mailchimp-domain-verification=<token>` | Mailchimp | Marketing email. |
1510
+ | `hubspot-domain-verification=<token>` | HubSpot | CRM / marketing. |
1511
+ | `zendesk-verification=<token>` | Zendesk | Support tenancy (also see §43). |
1512
+ | `freshworks-verification=<token>` | Freshworks | Support / CRM customer. |
1513
+ | `intercom-verification=<token>` | Intercom | Messaging tenancy. |
1514
+ | `loom-site-verification=<token>` | Loom | Video. |
1515
+ | `miro-site-verification=<token>` | Miro | Whiteboard tenancy. |
1516
+ | `gitlab-domain-verification=<token>` | GitLab | Self-hosted or cloud verification. |
1517
+
1518
+ Each discovered tenancy is a separate attack surface (own credentials, own MFA posture, own data).
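The catalog above lends itself to automated tagging of a TXT dump. The sketch below is illustrative only: `classify_txt` is a hypothetical helper that reads one unquoted TXT value per line on stdin and covers just a handful of the listed prefixes.

```bash
# classify_txt: read TXT record values (one per line, quotes stripped)
# and print the SaaS tenancy each verification token implies.
# Hypothetical helper; prefixes are taken from the table above (subset only).
classify_txt() {
  while IFS= read -r rec; do
    case "$rec" in
      google-site-verification=*)       echo "Google" ;;
      MS=ms*)                           echo "Microsoft 365" ;;
      atlassian-domain-verification=*)  echo "Atlassian Cloud" ;;
      docusign=*)                       echo "DocuSign" ;;
      slack-domain-verification=*)      echo "Slack Enterprise Grid" ;;
      zoom_verify_*)                    echo "Zoom" ;;
      *)                                ;;  # SPF, DKIM and unknown tokens are ignored
    esac
  done
}

# Example with canned records (no network access needed).
printf '%s\n' 'v=spf1 -all' 'MS=ms12345678' 'docusign=0000-example' | classify_txt
```

Against a live zone the input would come from something like `dig +short TXT "$D" | tr -d '"'`, with `$D` set as in the loop above.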
1519
+
1520
+ **Autodiscover-as-confirmation pattern:**
1521
+
1522
+ `autodiscover.<domain>` resolving to Microsoft IP space (`40.96.0.0/13`, `52.96.0.0/14`, `13.107.0.0/16`) is **strong evidence** of M365 Exchange Online tenancy, even when MX records are obscured by Mimecast/Proofpoint/Barracuda inbound filtering. Microsoft's published ranges change over time, so verify them against the current Microsoft 365 endpoint list. Probe:
1523
+
1524
+ ```powershell
1525
+ Resolve-DnsName "autodiscover.$D" -Type A | Select Name,IPAddress
1526
+ ```
1527
+
1528
+ If IPs are in Microsoft ranges → `M365_CONFIRMED`. Cross-reference with `getuserrealm.srf` (§22.1) for tenant GUID extraction.
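The range check itself can be scripted. The following is a minimal POSIX-shell sketch, assuming IPv4 only and using the three ranges quoted above verbatim (they may be outdated); `ip_to_int`, `ip_in_cidr` and `in_m365_ranges` are hypothetical helper names.

```bash
# ip_to_int: fold a dotted-quad IPv4 address into one integer.
# Runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# ip_in_cidr IP NET/BITS: exit 0 when IP falls inside the CIDR block.
ip_in_cidr() {
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# in_m365_ranges IP: check against the ranges listed in this section
# (assumption: copied from the text above, not fetched from Microsoft).
in_m365_ranges() {
  for cidr in 40.96.0.0/13 52.96.0.0/14 13.107.0.0/16; do
    if ip_in_cidr "$1" "$cidr"; then return 0; fi
  done
  return 1
}

# Example with a fixed address (no DNS lookup performed here).
if in_m365_ranges "52.97.10.10"; then echo "M365_CONFIRMED"; fi
```

On a non-Windows host the input address could come from `dig +short A "autodiscover.$D"`, filtered to IPv4 lines since the name is usually a CNAME.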
1529
+
1530
+ **CAA records:**
1531
+ ```bash
1532
+ dig +short CAA "${D}"
1533
+ ```
1534
+ Lists which CAs are allowed to issue certs. Absence = LOW finding (any CA can mis-issue). Presence + restrictive list = good posture.
1535
+
1536
+ **SOA serial pattern analysis:**
1537
+ ```bash
1538
+ dig +short SOA "${D}"
1539
+ ```
1540
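+ When the zone follows the `YYYYMMDDNN` serial convention, the serial (third field of the SOA answer) reveals the last-edit date; zones that use Unix timestamps or plain counters do not. Serial patterns across multiple zones can correlate ownership.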
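Whether a serial actually encodes a date can be checked mechanically. This is a small sketch, assuming the serial has already been extracted (for example with `dig +short SOA "$D" | awk '{print $3}'`); `soa_serial_date` is a hypothetical helper that prints `YYYY-MM-DD` and fails for serials that do not fit the convention.

```bash
# soa_serial_date SERIAL: print the edit date encoded in a YYYYMMDDNN serial,
# or return 1 when the serial does not look like that convention.
# Hypothetical helper; plausibility check only (no calendar validation).
soa_serial_date() {
  case "$1" in
    [12][0-9][0-9][0-9][01][0-9][0-3][0-9][0-9][0-9]) ;;
    *) return 1 ;;
  esac
  y=$(printf '%s' "$1" | cut -c1-4)
  m=$(printf '%s' "$1" | cut -c5-6)
  d=$(printf '%s' "$1" | cut -c7-8)
  # Strip one leading zero before numeric comparison, then range-check.
  [ "${m#0}" -ge 1 ] && [ "${m#0}" -le 12 ] &&
    [ "${d#0}" -ge 1 ] && [ "${d#0}" -le 31 ] || return 1
  printf '%s-%s-%s\n' "$y" "$m" "$d"
}

# Example: a date-style serial and a timestamp-style serial.
soa_serial_date 2024031502
soa_serial_date 1709876543 || echo "not a YYYYMMDDNN serial"
```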
1541
+
1542
+ ### 16.23 Wayback CDX Deep Usage
1543
+
1544
+ The Wayback Machine has a structured query API.
1545
+
1546
+ **Basic CDX query:**
1547
+ ```bash
1548
+ D="target.example"
1549
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*&output=json&fl=timestamp,original&limit=10000"
1550
+ ```
1551
+
1552
+ Returns a JSON array of `[timestamp, original_url]` rows; the first row is the field-name header (`["timestamp","original"]`), so skip it when parsing.
1553
+
1554
+ **Useful filters:**
1555
+ - `&from=20200101&to=20231231` — date range.
1556
+ - `&filter=mimetype:application/json` — only JSON responses (often APIs).
1557
+ - `&filter=mimetype:application/javascript` — JS bundles.
1558
+ - `&filter=statuscode:200` — only successful captures.
1559
+ - `&filter=urlkey:.*api.*` — only URLs containing "api".
1560
+ - `&collapse=urlkey` — dedup by URL.
1561
+ - `&collapse=digest` — dedup by content (catches identical pages re-archived).
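Since the filters above are plain query parameters, they compose by concatenation. A minimal sketch, assuming the same base query as the basic example; `cdx_url` is a hypothetical helper that only builds the URL and performs no request.

```bash
# cdx_url DOMAIN [PARAM...]: build a CDX query URL for DOMAIN and append
# each extra PARAM (e.g. "filter=statuscode:200") as "&PARAM".
# Hypothetical helper; parameters are passed through without URL-encoding.
cdx_url() {
  url="https://web.archive.org/cdx/search/cdx?url=$1/*&output=json&fl=timestamp,original"
  shift
  for p in "$@"; do
    url="$url&$p"
  done
  printf '%s\n' "$url"
}

# Example: successful JSON captures from 2020 onward, deduplicated by URL.
cdx_url "target.example" "from=20200101" "filter=statuscode:200" \
  "filter=mimetype:application/json" "collapse=urlkey"
```

The result can be passed straight to `curl -sk`, keeping the URL in double quotes so the shell does not expand the `*`.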
1562
+
1563
+ **Get specific snapshot:**
1564
+ ```bash
1565
+ TS="20231215120000"
1566
+ URL="https://target.example/admin/dashboard"
1567
+ curl -sk "https://web.archive.org/web/${TS}/${URL}"
1568
+ ```
1569
+
1570
+ **Diff snapshot vs live:**
1571
+ ```bash
1572
+ LIVE=$(curl -sk -m 10 "${URL}")
1573
+ ARCHIVED=$(curl -sk -m 10 "https://web.archive.org/web/${TS}/${URL}")
1574
+ diff <(echo "$LIVE") <(echo "$ARCHIVED") | head -100
1575
+ ```
1576
+
1577
+ **Save current page:**
1578
+ ```bash
1579
+ curl -sk -X POST "https://pragma.archivelab.org/" \
1580
+ -H 'Content-Type: application/json' \
1581
+ -d '{"url":"https://target.example/admin"}'
1582
+ ```
1583
+
1584
+ **Find every archived JS:**
1585
+ ```bash
1586
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*.js&output=json&fl=timestamp,original&filter=statuscode:200" | \
1587
+ jq -r '.[1:][] | "\(.[0]) \(.[1])"'
1588
+ ```
1589
+
1590
+ For each, fetch the archived JS and run the secret catalog (§17). Old JS often had hard-coded keys later removed.
1591
+
1592
+ **Legacy-app pivot (when `*.js` returns empty):**
1593
+
1594
+ Static brochure-ware sites (older corporate sites, especially pre-2015) often have **zero archived JS** because the frontend was server-rendered. Pivot to legacy file extensions:
1595
+
1596
+ ```bash
1597
+ # ASP / ASP.NET classic
1598
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*.asp&output=json&fl=timestamp,original&filter=statuscode:200&collapse=urlkey&limit=500"
1599
+
1600
+ # PHP
1601
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*.php&output=json&fl=timestamp,original&filter=statuscode:200&collapse=urlkey&limit=500"
1602
+
1603
+ # JSP / .NET aspx / CGI / Coldfusion
1604
+ for ext in aspx jsp cgi cfm; do
1605
+ echo "=== .$ext ==="
1606
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*.${ext}&output=json&fl=timestamp,original&filter=statuscode:200&collapse=urlkey&limit=200"
1607
+ done
1608
+
1609
+ # JSON / XML config (sometimes leaks endpoints + creds)
1610
+ for ext in json xml yml yaml ini conf; do
1611
+ echo "=== .$ext ==="
1612
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*.${ext}&output=json&fl=timestamp,original&filter=statuscode:200&collapse=urlkey&limit=100"
1613
+ done
1614
+
1615
+ # Anything indexed (broad sweep — useful for legacy enumeration)
1616
+ curl -sk "https://web.archive.org/cdx/search/cdx?url=${D}/*&output=json&fl=timestamp,original&filter=statuscode:200&collapse=urlkey&limit=10000"
1617
+ ```
1618
+
1619
+ Legacy `.asp` / `.cfm` / `.jsp` URLs often reveal forgotten admin panels, old user-enum endpoints, legacy auth flows, and SQL-injection-prone parameters. Cross-reference with current DNS: many legacy hosts now return NXDOMAIN, but the URL paths sometimes survive on a renamed host.
1620
+
1621
+ ### 16.24 Common-Prefix Subdomain Sweep (active, low-detectability)
1622
+
1623
+ Empirically: **passive cert-transparency enumeration (crt.sh / VirusTotal / Subfinder) misses 20–40% of high-value subdomains** because (a) many internal hosts use wildcard certs that don't expose the FQDN, (b) some hosts have never been issued public certs (HTTP-only or self-signed), (c) very-recently-provisioned hosts haven't propagated to CT log mirrors yet.
1624
+
1625
+ **Always pair passive enum with an active prefix-probe.** Detectability: low (single A-record query per host; no port scan, no HTTP).
1626
+
1627
+ **The high-yield prefix list (ordered by hit-rate from real engagements):**
1628
+
1629
+ ```
1630
+ www, mail, webmail, smtp, imap, pop, owa, autodiscover, ftp, sftp,
1631
+ vpn, sslvpn, gateway, gp, globalprotect, citrix, fortinet, anyconnect,
1632
+ api, app, apps, mobile, m,
1633
+ portal, login, sso, idp, iam, identity, accounts, oauth, auth, adfs,
1634
+ admin, manage, console, dashboard, cp, cpanel,
1635
+ intranet, internal, hr, payroll, finance, sap, erp, crm, helpdesk, servicedesk,
1636
+ support, help, kb, status, monitoring, grafana, kibana, prometheus,
1637
+ docs, wiki, confluence, jira, bitbucket, gitlab, jenkins, sonar, nexus,
1638
+ git, svn, repo, code,
1639
+ dev, test, staging, stg, qa, uat, sandbox, preprod, preview, demo,
1640
+ careers, jobs, vacancies, recruit, eapps,
1641
+ shop, store, ecommerce, checkout, payments, pay, billing,
1642
+ old, legacy, archive, backup, beta, v1, v2, classic,
1643
+ cdn, static, assets, media, img, files, downloads, public,
1644
+ ns, ns1, ns2, dns, mx, mx1, mx2,
1645
+ zoom, teams, slack, lync, sip, voice, meet,
1646
+ sclepro, tender, tenders, suppliers, vendor, vendors, procurement, purchase
1647
+ ```
1648
+
1649
+ **One-liner (PowerShell):**
1650
+ ```powershell
1651
+ $D = "target.example"
1652
+ $prefixes = @("www","mail","webmail","owa","autodiscover","ftp","vpn","sslvpn","gateway","api","app","portal","login","sso","idp","iam","identity","accounts","oauth","auth","adfs","admin","intranet","hr","sap","erp","crm","support","help","status","grafana","kibana","docs","wiki","jira","jenkins","gitlab","dev","test","staging","stg","qa","uat","sandbox","preprod","preview","careers","jobs","eapps","old","legacy","beta","tender","suppliers","procurement")
1653
+ foreach ($p in $prefixes) {
1654
+ $r = Resolve-DnsName "$p.$D" -Type A -ErrorAction SilentlyContinue
1655
+ if ($r) {
1656
+ $ips = ($r | ? {$_.IPAddress}).IPAddress -join ","
1657
+ "$p.$D -> $ips"
1658
+ }
1659
+ }
1660
+ ```
1661
+
1662
+ **One-liner (bash + dig):**
1663
+ ```bash
1664
+ D="target.example"
1665
+ for p in www mail webmail owa autodiscover ftp vpn sslvpn gateway api app portal login sso idp iam identity accounts oauth auth adfs admin intranet hr sap erp crm support help status grafana kibana docs wiki jira jenkins gitlab dev test staging stg qa uat sandbox preprod preview careers jobs eapps old legacy beta tender suppliers procurement; do
1666
+ IP=$(dig +short A "$p.$D" | grep -E '^[0-9]+(\.[0-9]+){3}$' | head -1) # skip CNAME targets, keep the first IPv4
1667
+ [ -n "$IP" ] && echo "$p.$D -> $IP"
1668
+ done
1669
+ ```
1670
+
1671
+ **Mass DNS approach (faster for large prefix lists):**
1672
+ ```bash
1673
+ # Generate candidate FQDNs from a wordlist; resolve in parallel via puredns
1674
+ puredns resolve <(awk -v d="$D" '{print $1"."d}' assetnote-best-dns-wordlist.txt) -r resolvers.txt
1675
+ ```
1676
+
1677
+ **What to extract from each hit:**
1678
+ - IP / IP block → ASN lookup (§28.1) → confirms target-owned vs hosted-elsewhere.
1679
+ - For `vpn.*` / `gateway.*` / `gp.*` / `globalprotect.*` / `citrix.*` → flag for active vendor fingerprint (§16.16) under separate engagement scope.
1680
+ - For `api.*` / `app.*` → seed for §16.1–16.10 webapp probes.
1681
+ - For `staging.*` / `dev.*` / `uat.*` → seed for §16.5 always-on HTTP checks (often weaker auth + debug endpoints).
1682
+ - For `intranet.*` / `eapps.*` / `sclepro.*` → public-intranet finding (often MEDIUM; per §40).
1683
+
1684
+ **Real-engagement validation:** in an internal smoke test, prefix-sweep found `vpn.`, `api.`, `intranet.`, `staging.`, `support.`, `eapps.`, `sclepro.`, `autodiscover.` — all of which crt.sh missed (or returned 502 for). Treat passive + active as **complementary, not alternatives**.
1685
+
1686
+ ---
1687
+
1688
+ ## 17. Secret-Pattern Catalog — 48 patterns (29 base + 19 modern)
1689
+
1690
+ The catalog runs against any text source: GitHub code, Postman workspaces, JS bodies, sourcesContent blobs, mobile-app strings, Wayback HTML, paste sites, Stack Exchange code blocks. **Order matters: most-specific patterns first** so generic catches don't pre-empt typed ones.
1691
+
1692
+ | # | Name | Regex | Severity | Category |
1693
+ |---|---|---|---|---|
1694
+ | 1 | AWS Access Key | `\b(AKIA\|ASIA)[0-9A-Z]{16}\b` | **CRITICAL** | aws |
1695
+ | 2 | AWS Secret Key (typed) | `(?i)aws[_\-]?secret[_\-]?access[_\-]?key['"\s:=]+([A-Za-z0-9/+=]{40})` | **CRITICAL** | aws |
1696
+ | 3 | AWS Secret (loose) | `(?i)aws(.{0,20})?(secret\|sk)["'=: ]+([0-9a-z/+=]{40})` | HIGH | aws |
1697
+ | 4 | GCP Service Account JSON | `"type"\s*:\s*"service_account"` | **CRITICAL** | gcp |
1698
+ | 5 | Google API Key | `\bAIza[0-9A-Za-z_\-]{35}\b` | HIGH | gcp |
1699
+ | 6 | GitHub Classic PAT | `\bghp_[A-Za-z0-9]{36}\b` | **CRITICAL** | github |
1700
+ | 7 | GitHub Fine-grained PAT | `\bgithub_pat_[A-Za-z0-9_]{82}\b` | **CRITICAL** | github |
1701
+ | 8 | GitHub OAuth | `\bgho_[A-Za-z0-9]{36}\b` | HIGH | github |
1702
+ | 9 | GitHub App Token (user-to-server / server-to-server / refresh) | `\bgh[usr]_[A-Za-z0-9]{36,}\b` | HIGH | github |
1703
+ | 10 | Stripe Live Key | `\bsk_live_[0-9A-Za-z]{24,}\b` | **CRITICAL** | stripe |
1704
+ | 11 | Stripe Test Key | `\bsk_test_[0-9A-Za-z]{24,}\b` | LOW | stripe |
1705
+ | 12 | Slack Token | `\bxox[abpors]-[0-9A-Za-z\-]{10,48}\b` | HIGH | slack |
1706
+ | 13 | Slack Webhook | `https://hooks\.slack\.com/services/T[A-Z0-9]+/B[A-Z0-9]+/[A-Za-z0-9]+` | MEDIUM | slack |
1707
+ | 14 | SendGrid Key | `\bSG\.[A-Za-z0-9_\-]{22}\.[A-Za-z0-9_\-]{43}\b` | HIGH | email_svc |
1708
+ | 15 | Mailgun Key (v1) | `\bkey-[0-9a-zA-Z]{32}\b` | HIGH | email_svc |
1709
+ | 16 | Mailgun Key (loose) | `\bkey-[0-9a-f]{32}\b` | HIGH | email_svc |
1710
+ | 17 | Twilio API Key | `\bSK[0-9a-fA-F]{32}\b` | HIGH | twilio |
1711
+ | 18 | Twilio Account SID | `\bAC[a-f0-9]{32}\b` | MEDIUM | twilio |
1712
+ | 19 | Twilio Auth Token | `(?i)twilio(.{0,20})?(auth\|token)["'=: ]+([a-f0-9]{32})` | HIGH | twilio |
1713
+ | 20 | Heroku API Key | `(?i)heroku(.{0,20})?api["'=: ]+([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})` | MEDIUM | paas |
1714
+ | 21 | Firebase URL | `\bhttps?://[a-z0-9\-]+\.firebaseio\.com\b` | LOW | firebase |
1715
+ | 22 | JWT (any) | `\beyJ[A-Za-z0-9_\-]{10,}\.eyJ[A-Za-z0-9_\-]{10,}\.[A-Za-z0-9_\-]{10,}\b` | MEDIUM | jwt |
1716
+ | 23 | Bearer Token Assignment | `(?i)authorization["'=: ]+bearer\s+[A-Za-z0-9._\-]{20,}` | MEDIUM | bearer |
1717
+ | 24 | Basic Auth in URL | `https?://[^/\s:@]+:[^/\s:@]+@[^/\s]+` | MEDIUM | basic_auth |
1718
+ | 25 | RSA Private Key | `-----BEGIN RSA PRIVATE KEY-----` | **CRITICAL** | private_key |
1719
+ | 26 | EC Private Key | `-----BEGIN EC PRIVATE KEY-----` | **CRITICAL** | private_key |
1720
+ | 27 | OpenSSH Private Key | `-----BEGIN OPENSSH PRIVATE KEY-----` | **CRITICAL** | private_key |
1721
+ | 28 | Generic Private Key | `-----BEGIN (DSA \|PGP \|)PRIVATE KEY( BLOCK)?-----` | **CRITICAL** | private_key |
1722
+ | 29 | Generic API Key | `(?i)(?:api[_\-]?key\|apikey\|api_secret\|access_token\|secret[_\-]?token)['"\s:=]+["']([A-Za-z0-9+/=_\-]{24,})["']` | MEDIUM | generic |
1723
+ | 30 | Anthropic API Key | `\bsk-ant-(?:api03\|admin01)-[A-Za-z0-9_\-]{93,}\b` | **CRITICAL** | ai_api |
1724
+ | 31 | OpenAI API Key (legacy) | `\bsk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}\b` | **CRITICAL** | ai_api |
1725
+ | 32 | OpenAI Project Key | `\bsk-proj-[A-Za-z0-9_\-]{40,}T3BlbkFJ[A-Za-z0-9_\-]{40,}\b` | **CRITICAL** | ai_api |
1726
+ | 33 | OpenAI User Session | `\bsess-[A-Za-z0-9]{40}\b` | HIGH | ai_api |
1727
+ | 34 | HuggingFace Token | `\bhf_[A-Za-z0-9]{30,}\b` | HIGH | ai_api |
1728
+ | 35 | Cloudflare API Token | `\b[A-Za-z0-9_\-]{40}\b` (when paired with `(?i)cloudflare`/`X-Auth-Key` context) | HIGH | infra_api |
1729
+ | 36 | Cloudflare Global API Key | `(?i)cf[_\-]?api[_\-]?key['"\s:=]+([a-f0-9]{37})` | **CRITICAL** | infra_api |
1730
+ | 37 | DigitalOcean Token | `\bdop_v1_[a-f0-9]{64}\b` | HIGH | infra_api |
1731
+ | 38 | npm Token (Modern) | `\bnpm_[A-Za-z0-9]{36}\b` | HIGH | package_registry |
1732
+ | 39 | PyPI Token | `\bpypi-AgENdGV[A-Za-z0-9_\-]+\b` | HIGH | package_registry |
1733
+ | 40 | Docker Hub PAT | `\bdckr_pat_[A-Za-z0-9_\-]{27,}\b` | HIGH | package_registry |
1734
+ | 41 | Atlassian API Token | `\bATATT3xFfGF0[A-Za-z0-9_\-]{180,}\b` | HIGH | saas_api |
1735
+ | 42 | New Relic License Key | `\b(?:NRAA\|NRAK\|NRBR)-[A-F0-9]{27}\b` | MEDIUM | observability |
1736
+ | 43 | DataDog API Key (in DD_API_KEY context) | `(?i)dd[_\-]?api[_\-]?key['"\s:=]+([a-f0-9]{32})` | HIGH | observability |
1737
+ | 44 | Sentry DSN | `https://[a-f0-9]+@o[0-9]+\.ingest\.sentry\.io/[0-9]+` | LOW | observability |
1738
+ | 45 | ngrok Auth Token | `\b[12][A-Za-z0-9]{26}_[A-Za-z0-9]{32,}\b` (when `(?i)ngrok` context) | MEDIUM | tunneling |
1739
+ | 46 | Linear API Key | `\blin_api_[A-Za-z0-9]{40}\b` | MEDIUM | saas_api |
1740
+ | 47 | Discord Bot Token | `\b[MN][A-Za-z\d]{23}\.[\w\-]{6}\.[\w\-]{27}\b` | HIGH | bot_token |
1741
+ | 48 | Telegram Bot Token | `\b\d{8,10}:[A-Za-z0-9_\-]{35}\b` | HIGH | bot_token |
1742
+
1743
+ **False-positive notes:**
1744
+ - Patterns 22 (JWT), 23 (Bearer), 29 (Generic) trigger on test/example data frequently. Always look at *context* — a JWT in a `README.md` example block ≠ a JWT in a production `.env` file.
1745
+ - Pattern 16 (Mailgun loose) and pattern 11 (Stripe test) are noisy by design; downgrade unconfirmed matches during triage regardless of the catalog severity.
1746
+ - Pattern 24 (Basic auth in URL) catches monitoring-tool URLs and CI-debug URLs as well as real creds — verify before alerting.
1747
+ - For GitHub's Fine-grained PAT (pattern 7), the `82` length is by GitHub's spec — be skeptical of matches significantly longer or shorter.
1748
+
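A minimal sketch of how catalog entries might be applied to a blob of text. It copies three patterns from the table (14, 25, 37); the `CATALOG` layout and the `scan` helper are illustrative assumptions, not the skill's actual implementation.

```python
import re

# Three entries copied from the catalog above (patterns 14, 25, 37).
# The tuple layout and scan() helper are hypothetical, for illustration only.
CATALOG = [
    (14, "SendGrid Key", re.compile(r"\bSG\.[A-Za-z0-9_\-]{22}\.[A-Za-z0-9_\-]{43}\b"), "HIGH"),
    (25, "RSA Private Key", re.compile(r"-----BEGIN RSA PRIVATE KEY-----"), "CRITICAL"),
    (37, "DigitalOcean Token", re.compile(r"\bdop_v1_[a-f0-9]{64}\b"), "HIGH"),
]


def scan(text):
    """Return (pattern_id, name, severity, offset) for every catalog hit."""
    hits = []
    for pattern_id, name, regex, severity in CATALOG:
        for match in regex.finditer(text):
            hits.append((pattern_id, name, severity, match.start()))
    return hits


# Synthetic sample: a fake DigitalOcean-shaped token and a PEM header.
sample = "token=dop_v1_" + "a" * 64 + "\n-----BEGIN RSA PRIVATE KEY-----\n"
for hit in scan(sample):
    print(hit)
```

Hits come back in catalog order, so context checks (file type, path, surrounding text) can run as a separate pass over the returned offsets.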
1749
+ ---
1750
+
1751
+ ## 18. Dork Corpus — 80+ templates, 9 categories
1752
+
1753
+ Substitute `{domain}` with the target domain (e.g., `example.com`) and `{company}` with the company name (e.g., `Acme Corporation`). Run via Google, Bing, Brave, DuckDuckGo, Yandex, Baidu — engines surface different results.
1754
+
1755
+ ### 18.1 Files
1756
+
1757
+ ```
1758
+ site:{domain} filetype:env
1759
+ site:{domain} ext:env OR ext:ini OR ext:cfg OR ext:conf
1760
+ site:{domain} ext:sql OR ext:sqlite OR ext:dump OR ext:bak
1761
+ site:{domain} ext:pem OR ext:key OR ext:p12 OR ext:pfx
1762
+ site:{domain} ext:log
1763
+ site:{domain} intitle:"index of"
1764
+ site:{domain} inurl:.git OR inurl:/.git/
1765
+ site:{domain} inurl:backup OR inurl:.bak OR inurl:old
1766
+ site:{domain} ext:yml OR ext:yaml
1767
+ site:{domain} ext:properties
1768
+ ```
1769
+
1770
+ ### 18.2 Admin / login panels
1771
+
1772
+ ```
1773
+ site:{domain} inurl:admin OR inurl:login OR inurl:sso OR inurl:dashboard
1774
+ site:{domain} intitle:"phpMyAdmin"
1775
+ site:{domain} intitle:"Jenkins"
1776
+ site:{domain} intitle:"Grafana"
1777
+ site:{domain} intitle:"Kibana"
1778
+ site:{domain} intitle:"Splunk"
1779
+ site:{domain} (intitle:"login" OR intitle:"sign in")
1780
+ site:{domain} intitle:"GitLab"
1781
+ site:{domain} intitle:"Swagger" OR intitle:"OpenAPI"
1782
+ site:{domain} inurl:phpinfo
1783
+ ```
1784
+
1785
+ ### 18.3 Secrets / credential leakage
1786
+
1787
+ ```
1788
+ "{domain}" ("api_key" OR "apikey" OR "access_token")
1789
+ "{domain}" (password OR passwd OR pwd)
1790
+ site:pastebin.com "{domain}"
1791
+ site:ghostbin.com "{domain}"
1792
+ site:rentry.co "{domain}"
1793
+ site:gist.github.com "{domain}"
1794
+ site:hastebin.com "{domain}"
1795
+ "{domain}" "BEGIN RSA PRIVATE KEY"
1796
+ ```
1797
+
1798
+ ### 18.4 Cloud / CI / shadow-IT
1799
+
1800
+ ```
1801
+ site:s3.amazonaws.com "{domain}"
1802
+ site:storage.googleapis.com "{domain}"
1803
+ site:blob.core.windows.net "{domain}"
1804
+ site:digitaloceanspaces.com "{domain}"
1805
+ site:trello.com "{domain}"
1806
+ site:*.atlassian.net "{domain}"
1807
+ site:dev.azure.com "{domain}"
1808
+ site:bitbucket.org "{domain}"
1809
+ site:firebaseio.com "{domain}"
1810
+ site:herokuapp.com "{domain}"
1811
+ ```
1812
+
1813
+ ### 18.5 Docs / intel mining
1814
+
1815
+ ```
1816
+ site:{domain} filetype:pdf (confidential OR internal OR restricted)
1817
+ site:{domain} filetype:xlsx OR filetype:csv
1818
+ site:{domain} filetype:docx
1819
+ site:scribd.com "{company}"
1820
+ "{company}" filetype:pdf (salary OR payroll OR org-chart OR "organization chart")
1821
+ site:linkedin.com/in "{company}"
1822
+ site:slideshare.net "{company}"
1823
+ ```
1824
+
1825
+ ### 18.6 Vuln indicators
1826
+
1827
+ ```
1828
+ site:{domain} intext:"sql syntax" OR intext:"you have an error in your sql"
1829
+ site:{domain} intext:"Warning: mysql_"
1830
+ site:{domain} intext:"Fatal error:" intext:"on line"
1831
+ site:{domain} intext:"stack trace" OR intext:"Traceback (most recent call last)"
1832
+ "Apache/2.4.49" site:{domain}
1833
+ "Server: nginx/1.14" site:{domain}
1834
+ site:{domain} inurl:wp-content OR inurl:wp-includes
1835
+ ```
1836
+
1837
+ ### 18.7 Internal tool exposure
1838
+
1839
+ ```
1840
+ site:{domain} intitle:"Splunk"
1841
+ site:{domain} intitle:"Grafana"
1842
+ site:{domain} intitle:"Kibana"
1843
+ site:{domain} intitle:"Prometheus Time Series"
1844
+ site:{domain} intitle:"Jaeger UI"
1845
+ site:{domain} intitle:"AlertManager"
1846
+ site:{domain} intitle:"Argo CD"
1847
+ site:{domain} intitle:"Sonarqube"
1848
+ site:{domain} intitle:"Sentry"
1849
+ site:{domain} intitle:"Confluence"
1850
+ site:{domain} intitle:"Jira"
1851
+ site:{domain} intitle:"GitLab"
1852
+ site:{domain} intitle:"Gitea"
1853
+ site:{domain} intitle:"Drone CI"
1854
+ site:{domain} inurl:"/jenkins/"
1855
+ ```
1856
+
1857
+ ### 18.8 Backup / dump file extensions
1858
+
1859
+ ```
1860
+ site:{domain} ext:bak OR ext:backup OR ext:old OR ext:orig OR ext:save OR ext:swp
1861
+ site:{domain} ext:tar OR ext:tar.gz OR ext:tgz OR ext:zip OR ext:rar OR ext:7z
1862
+ site:{domain} ext:db OR ext:sqlite OR ext:sqlite3 OR ext:mdb
1863
+ site:{domain} ext:dump OR ext:rdb OR ext:bson
1864
+ site:{domain} (intext:"-- MySQL dump" OR intext:"PostgreSQL database dump")
1865
+ site:{domain} ext:pcap OR ext:pcapng OR ext:cap
1866
+ site:{domain} ext:core OR ext:hprof OR ext:dmp
1867
+ ```
1868
+
1869
+ ### 18.9 Sector-specific (healthcare / finance / gov)
1870
+
1871
+ ```
1872
+ # Healthcare
1873
+ site:{domain} (filetype:pdf OR filetype:xlsx) (HIPAA OR PHI OR "patient records")
1874
+ site:{domain} ("DICOM" OR "HL7" OR "ICD-10")
1875
+
1876
+ # Finance
1877
+ site:{domain} (filetype:pdf OR filetype:xlsx) (SOC OR "audit report" OR "internal control")
1878
+ site:{domain} (filetype:pdf OR filetype:xlsx) ("Form 10-K" OR "Form 10-Q" OR earnings)
1879
+ site:{domain} ("SWIFT" OR "BIC" OR IBAN OR "wire transfer")
1880
+
1881
+ # Gov / public sector
1882
+ site:{domain} (filetype:pdf OR filetype:doc) (FOUO OR "controlled unclassified" OR CUI)
1883
+ site:{domain} (filetype:pdf OR filetype:xlsx) ("personnel security" OR clearance)
1884
+ ```
1885
+
1886
+ ### 18.10 Result classification
1887
+
1888
+ After running, score each result via URL signature → title hint → snippet regex:
1889
+ - **CRITICAL URL signatures:** `.pem`, `.p12`, `.pfx`, `.key` extensions; `id_rsa` filename.
1890
+ - **HIGH URL signatures:** `/.env`, `/.git/`, database dumps, `wp-config.bak`, `/phpmyadmin`, `/jenkins`, `/phpinfo.php`.
1891
+ - **MEDIUM URL signatures:** `/admin`, `/login`, `/swagger`, `.log`, `/backup`, `.DS_Store`.
1892
+ - Snippet content (e.g., a secret regex hit in the snippet) overrides URL signature only if higher severity.
1893
+ - Confidence: snippet-only match = TENTATIVE (operator must visit URL to confirm; tag detectability=medium).
1894
+
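The URL-signature step and the snippet override rule can be sketched as follows. The regexes are one possible encoding of the signatures listed above (database dumps are left out because the list does not name their extensions); `classify` and `RANK` are hypothetical names, and the function expects a URL path without a query string.

```python
import re

# One possible encoding of the URL signatures listed above (assumption).
URL_SIGNATURES = [
    ("CRITICAL", re.compile(r"(\.(pem|p12|pfx|key)$|(^|/)id_rsa$)", re.I)),
    ("HIGH", re.compile(r"(/\.env$|/\.git/|wp-config\.bak|/phpmyadmin|/jenkins|/phpinfo\.php)", re.I)),
    ("MEDIUM", re.compile(r"(/admin|/login|/swagger|\.log$|/backup|\.DS_Store)", re.I)),
]
RANK = {"INFO": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def classify(url_path, snippet_severity=None):
    """Score a result by URL signature; a snippet hit wins only if it is more severe."""
    severity = "INFO"
    for level, regex in URL_SIGNATURES:  # ordered from most to least severe
        if regex.search(url_path):
            severity = level
            break
    if snippet_severity and RANK[snippet_severity] > RANK[severity]:
        severity = snippet_severity
    return severity
```

For example, `classify("/.env", "LOW")` stays HIGH because the snippet severity is lower than the URL signature, while `classify("/admin/", "CRITICAL")` is raised to CRITICAL.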
1895
+ ---
1896
+
1897
+ ## 19. GitHub Code-Search Dorks for Targets — 13 dorks
1898
+
1899
+ Apply each template to `{target}` (root domain stem like `acme`), `{domain}` (full root domain like `acme.com`), and optionally `{company}` (`Acme Corporation`):
1900
+
1901
+ ```
1902
+ "{target}" filename:.env
1903
+ "{target}" filename:.env.example
1904
+ "{target}" filename:config
1905
+ "{target}" AWS_ACCESS_KEY_ID
1906
+ "{target}" AWS_SECRET_ACCESS_KEY
1907
+ "{target}" password
1908
+ "{target}" api_key
1909
+ "{target}" secret
1910
+ "{target}" authorization: Bearer
1911
+ "{target}" filename:id_rsa
1912
+ "{target}" filename:.git-credentials
1913
+ "{target}" filename:wp-config.php
1914
+ "@{domain}" password # emails + password context
1915
+ ```
1916
+
1917
+ **Requirements:** GitHub personal access token (any scope; a fine-grained PAT with read-only repository access is recommended). Rate limits apply per token; cap concurrency at 5 requests.
1918
+
1919
+ **For each result:**
1920
+ 1. Fetch the file (or relevant fragment) via the GitHub Contents API.
1921
+ 2. Run the secret catalog (§17).
1922
+ 3. If a secret hits → `SECRET_LEAK` finding with catalog severity, evidence = repo URL + file path + matched secret (truncated, last 4 chars only).
1923
+ 4. Optional: clone the repo to a tempdir, run `trufflehog`/`gitleaks` for full history scan.
1924
+
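Step 3 above calls for truncated evidence. A small sketch of that redaction, assuming a dictionary-shaped finding; `redact`, `build_finding`, and the field names are hypothetical and only illustrate keeping the last 4 characters.

```python
def redact(secret, keep=4):
    """Mask a matched secret, keeping only the last `keep` characters."""
    if len(secret) <= keep:
        return "*" * len(secret)
    return "*" * (len(secret) - keep) + secret[-keep:]


def build_finding(repo_url, file_path, secret, severity):
    """Assemble a SECRET_LEAK finding; the field names are illustrative."""
    return {
        "type": "SECRET_LEAK",
        "severity": severity,
        "evidence": {
            "repo": repo_url,
            "path": file_path,
            "secret": redact(secret),
        },
    }
```

Redacting before the finding is stored means the full secret never reaches reports or logs.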
1925
+ ---
1926
+
1927
+ ## 20. Endpoint Interest Score — 0–100 rubric
1928
+
1929
+ For every classified endpoint (§22 in methodology skill), apply this rubric:
1930
+
1931
+ | Signal | Points | Conditions |
1932
+ |---|---|---|
1933
+ | **Unauth write** | +40 | POST/PUT/DELETE/PATCH endpoint returns 200/201/202/204 anonymously. |
1934
+ | **Open GraphQL introspection** | +35 | `__schema` query returns full type list anonymously. |
1935
+ | **Verb tampering bypass** | +30 | OPTIONS reveals method not documented; that method is accessible. |
1936
+ | **Reflected CORS + credentials** | +25 | `Access-Control-Allow-Origin` reflects request `Origin` AND `Access-Control-Allow-Credentials: true`. |
1937
+ | **Sensitive keyword in path** | +20 | Path matches one of: `admin`, `internal`, `debug`, `user`, `password`, `token`, `key`, `export`, `upload`, `backup`, `config`, `secret`, `private`, `delete`, `purge`, `wipe`. |
1938
+ | **Schema leak in error** | +20 | Response body contains stack trace, ORM error class, framework signature (e.g., `ActiveRecord::RecordNotFound`, `org.hibernate.exception.*`, `django.db.utils.IntegrityError`). |
1939
+ | **API key in URL** | +15 | Path or query string contains `api_key=`, `apikey=`, `token=`, `access_token=`. |
1940
+ | **Wildcard CORS** | +10 | `Access-Control-Allow-Origin: *`. |
1941
+ | **Missing rate-limit headers** | +10 | No `RateLimit-*` / `X-RateLimit-*` headers; no `Retry-After` after rapid requests. |
1942
+
1943
+ **Thresholds:**
1944
+
1945
+ | Score | Severity |
1946
+ |---|---|
1947
+ | ≥ 90 | **CRITICAL** |
1948
+ | 70–89 | **HIGH** |
1949
+ | 50–69 | MEDIUM |
1950
+ | 25–49 | LOW |
1951
+ | < 25 | INFO |
1952
+
1953
+ For score ≥ 70, attach an `attack_path_hint` in evidence (see §29).
1954
+
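The rubric and thresholds can be sketched in a few lines. The signal keys are hypothetical identifiers for the table rows. Because the raw points sum to more than 100, this sketch clamps the total at 100, which is an assumption about how the 0–100 range is enforced.

```python
# Points per signal, taken from the rubric table; the keys are hypothetical names.
POINTS = {
    "unauth_write": 40,
    "open_graphql_introspection": 35,
    "verb_tampering_bypass": 30,
    "reflected_cors_credentials": 25,
    "sensitive_keyword_in_path": 20,
    "schema_leak_in_error": 20,
    "api_key_in_url": 15,
    "wildcard_cors": 10,
    "missing_rate_limit_headers": 10,
}


def interest_score(signals):
    """Sum points for the distinct signals observed, clamped to 100 (assumption)."""
    return min(sum(POINTS[s] for s in set(signals)), 100)


def severity(score):
    """Map a score to the severity bands in the thresholds table."""
    if score >= 90:
        return "CRITICAL"
    if score >= 70:
        return "HIGH"
    if score >= 50:
        return "MEDIUM"
    if score >= 25:
        return "LOW"
    return "INFO"
```

An endpoint with an unauthenticated write, a sensitive path keyword, and wildcard CORS scores 40 + 20 + 10 = 70, which lands in HIGH.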
1955
+ ---
1956
+
1957
+ ## 21. Mobile App Ownership Confidence — 0–100 rubric
1958
+
1959
+ Before running deep APK static analysis, score whether the discovered app actually belongs to the target. Threshold: **≥70 = accept**.
1960
+
1961
+ | Signal | Points |
1962
+ |---|---|
1963
+ | Package reverse-DNS matches target domain (e.g., `com.acme.android` for `acme.com`) | +40 |
1964
+ | Developer email is `<anything>@<target-domain>` | +25 |
1965
+ | Developer website URL is the target domain (or a confirmed sibling brand domain) | +20 |
1966
+ | App name contains a brand keyword from operator-supplied brand list | +10 |
1967
+ | App meets the minimum review-count threshold (default: 20 reviews) | +5 |
1968
+
1969
+ Apps below threshold are tagged `mobile_review_pending` and shown but not analyzed. Operator can re-score with `--mobile-ownership-threshold 50` for noisier collection.
1970
+
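A sketch of the ownership rubric under a few assumptions: subdomains of the target count as "the target domain" for the developer website check, sibling brand domains are not handled, and all function and parameter names are hypothetical.

```python
from urllib.parse import urlparse


def reverse_dns_matches(package, domain):
    """True if the package ID starts with the reversed domain (com.acme for acme.com)."""
    prefix = ".".join(reversed(domain.lower().split(".")))
    pkg = package.lower()
    return pkg == prefix or pkg.startswith(prefix + ".")


def ownership_score(package, domain, dev_email="", dev_site="",
                    app_name="", brands=(), reviews=0, min_reviews=20):
    """Apply the rubric above; sibling brand domains are out of scope for this sketch."""
    domain = domain.lower()
    score = 0
    if reverse_dns_matches(package, domain):
        score += 40
    if dev_email.lower().endswith("@" + domain):
        score += 25
    host = (urlparse(dev_site).hostname or "").lower()
    if host == domain or host.endswith("." + domain):
        score += 20
    if any(b.lower() in app_name.lower() for b in brands):
        score += 10
    if reviews >= min_reviews:
        score += 5
    return score


def accepted(score, threshold=70):
    """Apps below the threshold would be tagged mobile_review_pending."""
    return score >= threshold
```

An app matching only on brand keyword and review count scores 15 and stays pending, while a reverse-DNS, email, and website match reaches 85.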
1971
+ ---
1972
+
1973
+ ## 22. Identity Fabric — Concrete Endpoints
1974
+
1975
+ Methodology lives in the companion `osint-methodology` skill §11. This is the URL/payload reference.
1976
+
1977
+ ### 22.1 Microsoft Entra (Azure AD)
1978
+
1979
+ **OIDC metadata + tenant GUID extraction:**
1980
+ ```
1981
+ GET https://login.microsoftonline.com/{tenant-or-domain}/.well-known/openid-configuration
1982
+ ```
1983
+ Response field `issuer` contains the tenant GUID. GUID regex:
1984
+ ```regex
1985
+ \b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b
1986
+ ```
1987
+ Detectability: low.
1988
+
1989
+ **getuserrealm.srf — managed vs federated probe:**
1990
+ ```
1991
+ GET https://login.microsoftonline.com/getuserrealm.srf?login=<probe-user>@<domain>
1992
+ ```
1993
+ Response: JSON with `NameSpaceType` field (`Managed` / `Federated` / `Unknown`). Federated also includes `FederationBrandName` and `AuthURL` (the upstream IdP URL). Detectability: low.
1994
+
1995
+ **Autodiscover v2:**
1996
+ ```
1997
+ POST https://autodiscover-s.outlook.com/autodiscover/metadata/json/1
1998
+ Body: {"Email": "<probe-user>@<domain>"}
1999
+ ```
2000
+ Returns the protocol endpoint for the user; presence indicates tenant membership. Detectability: low.
2001
+
2002
+ **Autodiscover IP correlation (passive M365 confirmation):**
2003
+
2004
+ Resolve `autodiscover.<domain>` and check if it lands in Microsoft Exchange Online IP space. This works even when MX is wrapped by Mimecast/Proofpoint/Barracuda inbound filtering, where MX alone doesn't reveal the underlying mail platform.
2005
+
2006
+ ```bash
2007
+ dig +short A autodiscover.target.example
2008
+ ```
2009
+ ```powershell
2010
+ Resolve-DnsName "autodiscover.$D" -Type A | Select Name,IPAddress
2011
+ ```
2012
+
2013
+ Microsoft Exchange Online IPs (truncated common ranges): `40.96.0.0/13`, `52.96.0.0/14`, `13.107.6.152/31`, `13.107.18.10/31`, `40.99.0.0/16`, `40.104.0.0/15`, `52.98.0.0/15`. Full list: [Office 365 URLs and IP address ranges](https://learn.microsoft.com/en-us/microsoft-365/enterprise/urls-and-ip-address-ranges).
2014
+
2015
+ If `autodiscover.<domain>` lands in that space → `M365_CONFIRMED` even when nothing else does. Detectability: low (passive DNS).
2016
+
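Checking whether a resolved address falls in the ranges listed above can be done with the standard `ipaddress` module. This sketch hard-codes only the truncated ranges from this section and takes the IP as input (for example, from the `dig` output); `in_exchange_online` is a hypothetical helper name.

```python
import ipaddress

# Truncated Exchange Online ranges copied from the list above; not the full set.
EXO_RANGES = [ipaddress.ip_network(cidr) for cidr in (
    "40.96.0.0/13", "52.96.0.0/14", "13.107.6.152/31", "13.107.18.10/31",
    "40.99.0.0/16", "40.104.0.0/15", "52.98.0.0/15",
)]


def in_exchange_online(ip):
    """True if the address falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in EXO_RANGES)


print(in_exchange_online("52.97.1.1"))  # inside 52.96.0.0/14
print(in_exchange_online("8.8.8.8"))    # outside every listed range
```

A negative result against this truncated list is not conclusive; the full published list should be consulted before ruling M365 out.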
2017
+ **GetCredentialType — user-enum (deep mode only):**
2018
+ ```
2019
+ POST https://login.microsoftonline.com/common/GetCredentialType
2020
+ Content-Type: application/json
2021
+ Body:
2022
+ {
2023
+ "username": "<email>",
2024
+ "isOtherIdpSupported": true,
2025
+ "checkPhones": false,
2026
+ "isRemoteNGCSupported": true,
2027
+ "isCookieBannerShown": false,
2028
+ "isFidoSupported": true,
2029
+ "originalRequest": "",
2030
+ "country": "US",
2031
+ "forceotclogin": false,
2032
+ "isExternalFederationDisallowed": false,
2033
+ "isRemoteConnectSupported": false,
2034
+ "federationFlags": 0
2035
+ }
2036
+ ```
2037
+ Response field `IfExistsResult` indicates user existence: `0` = exists, `1` = doesn't exist, `5` = exists in federated tenant. Detectability: medium (logged in tenant audit). Cap at 20 attempts per tenant.
2038
+
2039
+ ### 22.2 Okta
2040
+
2041
+ **Org slug derivation:** start with stems from discovered subdomains and root-domain stem. Probe `<slug>.okta.com` and `<slug>.oktapreview.com`. Slug regex:
2042
+ ```regex
2043
+ [a-z0-9][a-z0-9-]{1,40}\.okta(?:preview)?\.com
2044
+ ```
2045
+
2046
+ **OIDC fingerprint:**
2047
+ ```
2048
+ GET https://<slug>.okta.com/.well-known/openid-configuration
2049
+ ```
2050
+
2051
+ **/api/v1/authn user-enum (deep mode):**
2052
+ ```
2053
+ POST https://<slug>.okta.com/api/v1/authn
2054
+ Content-Type: application/json
2055
+ Body: {"username": "<email>", "password": "invalid_password_for_enum"}
2056
+ ```
2057
+ Response distinguishes user existence:
2058
+ - `400` with `errorCode: E0000004` → user doesn't exist (or generic password error in some configs).
2059
+ - `401` with `status: PASSWORD_WARN` / `LOCKED_OUT` / `MFA_REQUIRED` → user exists.
2060
+ Detectability: medium (audit-log per attempt). Cap at 20 attempts per tenant.
2061
+
2062
+ ### 22.3 ADFS
2063
+
2064
+ **Passive fingerprint:**
2065
+ ```
2066
+ GET https://{domain}/adfs/idpinitiatedsignon.aspx
2067
+ ```
2068
+ A `200 OK` with a `urn:com:microsoft:ADFS:` reference in the HTML indicates ADFS. The version string can often be grepped from the HTML resource references.
2069
+
2070
+ **Mex endpoint (deep mode):**
2071
+ ```
2072
+ GET https://{domain}/adfs/Services/Trust/mex
2073
+ ```
2074
+ Returns SOAP federation metadata including endpoint URLs, signing certs, and supported claim types.
2075
+
2076
+ ### 22.4 Google Workspace
2077
+
2078
+ **OIDC discovery:**
2079
+ ```
2080
+ GET https://{domain}/.well-known/openid-configuration
2081
+ ```
2082
+ Google-Workspace-hosted-domain customers expose discovery endpoints with characteristic `issuer` URI (`https://accounts.google.com`) and JWKS URI. MX records pointing to `aspmx.l.google.com` are a corroborating signal.
2083
+
2084
+ ### 22.5 Generic OIDC (Keycloak / Auth0 / Ping / OneLogin / Duo)
2085
+
2086
+ **Discovery:** probe `/.well-known/openid-configuration` on every alive subdomain. The `issuer` and `authorization_endpoint` field URLs fingerprint the product:
2087
+
2088
+ | Product | URL pattern in `issuer` |
2089
+ |---|---|
2090
+ | Auth0 | `https://*.auth0.com` |
2091
+ | OneLogin | `https://*.onelogin.com` |
2092
+ | Ping | `https://*.pingone.com`, `https://*.pingidentity.com` |
2093
+ | Duo | `https://*.duosecurity.com` |
2094
+ | Keycloak | URL contains `/realms/<realm>` |
2096
+
2097
+ ### 22.6 SAML metadata
2098
+
2099
+ See §16.6.
2100
+
2101
+ ### 22.7 AWS account-ID and OAuth client-ID extraction
2102
+
2103
+ **S3 bucket region header (passive):**
2104
+ ```
2105
+ HEAD https://<known-bucket>.s3.amazonaws.com/
2106
+ ```
2107
+ Response includes `x-amz-bucket-region`. Cross-reference with bucket name entropy and known patterns to scope the account.
2108
+
2109
+ **ARN regex (in any JSON / HTML / JS response):**
2110
+ ```regex
2111
+ arn:aws:[a-z0-9\-]+:[a-z0-9\-]*:([0-9]{12}):
2112
+ ```
2113
+ Capture group: 12-digit AWS account ID.
2114
+
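Applying the ARN regex to a response body could look like the sketch below. The sample JSON is synthetic (it uses the AWS documentation placeholder account ID), and `account_ids` is a hypothetical helper.

```python
import re

# ARN regex from above; the capture group is the 12-digit account ID.
ARN_RX = re.compile(r"arn:aws:[a-z0-9\-]+:[a-z0-9\-]*:([0-9]{12}):")


def account_ids(text):
    """Return the distinct AWS account IDs found in ARNs within `text`."""
    return sorted(set(ARN_RX.findall(text)))


# Synthetic body; ARNs without an account field (e.g. S3 buckets) do not match.
body = (
    '{"role": "arn:aws:iam::123456789012:role/app", '
    '"topic": "arn:aws:sns:eu-west-1:123456789012:alerts", '
    '"bucket": "arn:aws:s3:::public-assets"}'
)
print(account_ids(body))
```

Deduplicating matters because the same account ID typically appears in many ARNs within one response.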
2115
+ **`AccountId` property pattern:**
2116
+ ```regex
2117
+ (?i)["']?account[_\-]?id["']?\s*[:=]\s*["']([0-9]{12})["']
2118
+ ```
2119
+
2120
+ **Google OAuth client_id:**
2121
+ ```regex
2122
+ \b\d{8,}-[a-z0-9]{10,40}\.apps\.googleusercontent\.com\b
2123
+ ```
2124
+
2125
+ **MSAL / Microsoft client_id (GUID property):**
2126
+ ```regex
2127
+ (?i)["']?client[_\-]?id["']?\s*[:=]\s*["']([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})["']
2128
+ ```
2129
+
2130
+ **OAuth scope extraction:**
2131
+ ```regex
2132
+ (?i)["']?scope["']?\s*[:=]\s*["']([^"']+)["']
2133
+ ```
2134
+
2135
+ ### 22.8 Microsoft 365 Deep Enumeration (Teams / SharePoint / OneDrive / OAuth)
2136
+
2137
+ **Teams federation status:**
2138
+ ```bash
2139
+ # Resolve tenant first
2140
+ curl -sk -m 10 "https://login.microsoftonline.com/${TARGET_DOMAIN}/.well-known/openid-configuration" | jq -r '.issuer'
2141
+ # Federation API requires authenticated request from a federated tenant; presence of error pattern reveals fed status
2142
+ curl -sk -m 10 "https://teams.microsoft.com/api/mt/emea/beta/users/<email>/externalsearchv3"
2143
+ ```
2144
+
2145
+ **SharePoint subdomain probe:**
2146
+ ```bash
2147
+ STEM=$(echo "$TARGET_DOMAIN" | cut -d. -f1)
2148
+ for sub in "" "-my" "-admin"; do
2149
+ echo "=== ${STEM}${sub}.sharepoint.com ==="
2150
+ curl -sk -m 10 -I "https://${STEM}${sub}.sharepoint.com/" -w '%{http_code}\n'
2151
+ done
2152
+ ```
2153
+
2154
+ **Reading the result correctly:** `HTTP 200` from these probes means **the tenant exists** (Microsoft serves a generic redirect-to-auth page) — it does **NOT** mean anonymous access is granted to the tenant's content. Distinguish:
2155
+ - 200 → tenant provisioned (INFO).
2156
+ - 200 + redirect to a custom anonymous-share URL (`/sites/<x>/Lists/<y>/AllItems.aspx?guestaccesstoken=...`) discovered via dorks → HIGH (data exposure).
2157
+ - 401/403 → tenant exists but auth required (INFO).
2158
+ - 404 / NXDOMAIN → tenant not provisioned at this stem (or vanity-named — check known stems from cert transparency).
2159
+
2160
+ PowerShell:
2161
+ ```powershell
2162
+ $STEM = ($D -split '\.')[0]
2163
+ foreach ($s in @("","-my","-admin")) {
2164
+ try {
2165
+ $r = Invoke-WebRequest -Uri "https://${STEM}${s}.sharepoint.com/" -Method Head -UseBasicParsing -TimeoutSec 10
2166
+ "${STEM}${s}.sharepoint.com -> HTTP $($r.StatusCode) (tenant exists)"
2167
+ } catch {
2168
+ $code = $_.Exception.Response.StatusCode.value__
2169
+ if ($code) { "${STEM}${s}.sharepoint.com -> HTTP $code" } else { "${STEM}${s}.sharepoint.com -> no host" }
2170
+ }
2171
+ }
2172
+ ```
2173
+
2174
+ **OneDrive personal site probe** (for a known email `alice@acme.com`):
2175
+ ```bash
2176
+ USER_TOKEN=$(echo "alice@acme.com" | tr '@.' '__')
2177
+ STEM="acme"
2178
+ curl -sk -m 10 -I "https://${STEM}-my.sharepoint.com/personal/${USER_TOKEN}/Documents/" -w '%{http_code}\n'
2179
+ # 401 = exists; 404 = not provisioned
2180
+ ```
2181
+
2182
+ **M365 OAuth client_id discovery in JS:**
2183
+ ```bash
2184
+ curl -sk -m 10 "https://app.target.example/main.js" | \
2185
+ grep -oE 'clientId["'\''[:=]+ ?["'\'']?[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
2186
+ ```
2187
+
2188
+ **Device-code phishing target check** (look for `device_authorization_endpoint` in OIDC metadata):
2189
+ ```bash
2190
+ curl -sk -m 10 "https://login.microsoftonline.com/${TARGET_DOMAIN}/v2.0/.well-known/openid-configuration" | \
2191
+ jq '.device_authorization_endpoint'
2192
+ ```
2193
+ If non-null and tenant doesn't restrict device-code: MEDIUM finding (device-code phishing feasible).
2194
+
2195
+ **Power Platform / Dynamics URLs to check:**
2196
+ - `*.crm.dynamics.com` (per-region hosts: `crm`, `crm2`-`crm15`).
2197
+ - `*.api.crm.dynamics.com` (Web API).
2198
+ - `make.powerapps.com` / `flow.microsoft.com` (auth-required dashboards).
2199
+
2200
+ **Severity:**
2201
+ - Discovered SharePoint/OneDrive tenants → INFO (asset only).
2202
+ - Exposed SharePoint anonymous-share link → HIGH (data exposure).
2203
+ - `device_authorization_endpoint` enabled on tenant → MEDIUM (operational risk).
2204
+ - Multi-tenant OAuth app with broad Graph scopes published by target → HIGH.
2205
+
2206
+ ### 22.9 GraphQL Field-Suggestion Enumeration (when introspection disabled)
2207
+
2208
+ When the standard introspection query (§16.2) returns `"errors":[{"message":"GraphQL introspection is disabled"}]`, fall back to field-suggestion enumeration. Apollo and most GraphQL libraries enable "did you mean" suggestions by default.
2209
+
2210
+ **Detection probe:**
2211
+ ```bash
2212
+ curl -sk -m 10 -X POST "$T/graphql" \
2213
+ -H 'Content-Type: application/json' \
2214
+ -d '{"query":"{ __schema { types { name } } }"}' | jq -r '.errors[0].message'
2215
+ # If "introspection disabled" → proceed.
2216
+ ```
2217
+
2218
+ **Field-suggestion probe** (intentionally typo a field name to trigger suggestions):
2219
+ ```bash
2220
+ curl -sk -m 10 -X POST "$T/graphql" \
2221
+ -H 'Content-Type: application/json' \
2222
+ -d '{"query":"{ usre { id } }"}' | jq -r '.errors[].message'
2223
+ # Expected: "Cannot query field \"usre\" on type \"Query\". Did you mean \"user\", \"users\", \"userById\"?"
2224
+ ```
2225
+
2226
+ Iterate over a candidate-field wordlist (use SecLists `Discovery/Web-Content/graphql.txt` or `clairvoyance` library's seed list). Each suggestion reveals real field names. Continue until no new suggestions emerge.
2227
+
2228
+ **Tooling:**
2229
+ - **Clairvoyance** (`pip install clairvoyance`) — automated field-suggestion enumerator. `clairvoyance -w wordlist.txt -o schema.json https://target.example/graphql`.
2230
+ - **GraphQL-Cop** — auditor that probes for introspection, batching, depth-limit, suggestion config. `pip install graphql-cop`.
2231
+ - **InQL** — Burp Suite extension for GraphQL endpoint analysis.
2232
+ - **GraphQL Voyager** — visualize once schema is reconstructed.
2233
+
2234
+ **Other GraphQL-when-introspection-disabled techniques:**
2235
+
2236
+ - **Alias-based query batching** (rate-limit / auth-bypass surface):
2237
+ ```json
2238
+ {
2239
+ "query": "{ a:user(id:1){name} b:user(id:2){name} c:user(id:3){name} ... }"
2240
+ }
2241
+ ```
2242
+ Many APIs rate-limit per-request, not per-alias. Test 100+ aliases per request.
2243
+
2244
+ - **Query-depth-limit bypass** (DoS / introspection bypass):
2245
+ ```json
2246
+ {
2247
+ "query": "{ user { friends { friends { friends { friends { id } } } } } }"
2248
+ }
2249
+ ```
2250
+ If server allows arbitrary depth → DoS surface; if depth-limited but doesn't strip nested `__type`/`__schema` → introspection-via-depth.
2251
+
2252
+ - **Subscription enumeration via WebSocket:**
2253
+ ```bash
2254
+ wscat -c "wss://target.example/graphql" -s graphql-ws
2255
+ > {"type":"connection_init"}
2256
+ > {"id":"1","type":"start","payload":{"query":"subscription { __schema { types { name } } }"}}
2257
+ ```
2258
+
2259
+ - **Batched query bypass** (some servers process all queries in batch even if first fails):
2260
+ ```json
2261
+ [
2262
+ {"query":"{ __schema { types { name } } }"},
2263
+ {"query":"{ user(id:1) { name } }"}
2264
+ ]
2265
+ ```
2266
+
2267
+ **Severity:**
2268
+ - Field-suggestion enumeration succeeds (50+ fields recoverable) → MEDIUM `MISCONFIG`.
2269
+ - Alias batching not rate-limited → MEDIUM (rate-limit-bypass surface).
2270
+ - Subscription endpoint exposed without auth → MEDIUM (often used for real-time data exfil).
2271
+
2272
+ ---
2273
+
2274
+ ## 23. Read-Only Secret Validators
2275
+
2276
+ Use these to confirm a discovered credential is live. **Read-only, never destructive.** Tag every validation with `detectability` and `checked_at` (UTC).
2277
+
2278
+ ### 23.1 Postman API Key (PMAK-*)
2279
+
2280
+ ```
2281
+ GET https://api.getpostman.com/me
2282
+ Header: X-Api-Key: PMAK-<key>
2283
+ ```
2284
+ - `200` → live; response contains `{user: {id, username, email}}`.
2285
+ - `401` → dead.
2286
+ - Scope: full read access to the user's Postman account (collections, env vars, history).
2287
+ - Detectability: low.
2288
+
2289
+ ### 23.2 AWS Access Key
2290
+
2291
+ ```
2292
+ sts:GetCallerIdentity
2293
+ ```
2294
+ Use boto3:
2295
+ ```python
2296
+ import boto3
2297
+ sts = boto3.client('sts',
2298
+ aws_access_key_id='<AKIA...>',
2299
+ aws_secret_access_key='<secret>',
2300
+ region_name='us-east-1')
2301
+ ident = sts.get_caller_identity()
2302
+ # ident['Account'], ident['Arn'], ident['UserId']
2303
+ ```
2304
+ - Valid → returns Account ID + ARN + UserId.
2305
+ - Invalid → `InvalidClientTokenId` or `SignatureDoesNotMatch`.
2306
+ - ARN scope: `:user/` is IAM user (broad), `:assumed-role/` is temp role (narrow), `:root` is account root (do NOT validate root keys you find).
2307
+ - Detectability: **medium** (CloudTrail logs `GetCallerIdentity` in account `<found>`).
2308
+
2309
+ ### 23.3 GitHub PAT
2310
+
2311
+ ```
2312
+ GET https://api.github.com/user
2313
+ Header: Authorization: token <ghp_*>
2314
+ ```
2315
+ - `200` → live; response contains `login`, `id`, `name`, `email` (if public).
2316
+ - Response header `X-OAuth-Scopes` lists token scopes. `repo` scope = write to all accessible repos; `admin:org` = org admin.
2317
+ - `401` → dead.
2318
+ - Detectability: low.
2319
+
2320
+ ### 23.4 Slack Token
2321
+
2322
+ ```
2323
+ POST https://slack.com/api/auth.test
2324
+ Header: Authorization: Bearer <xox*-*>
2325
+ ```
2326
+ - `200` with `{"ok": true}` → live; response includes `team`, `team_id`, `user`, `user_id`.
2327
+ - `200` with `{"ok": false, "error": "invalid_auth"}` → dead.
2328
+ - Detectability: low.
2329
+
2330
+ ### 23.5 Anthropic API Key
2331
+
2332
+ ```
2333
+ GET https://api.anthropic.com/v1/models
2334
+ Headers:
2335
+ x-api-key: sk-ant-api03-...
2336
+ anthropic-version: 2023-06-01
2337
+ ```
2338
+ - `200` → live; response lists available models.
2339
+ - `401` → dead.
2340
+ - `403` with org_disabled → key valid but org disabled.
2341
+ - Detectability: low; usage shows in Anthropic Console for the workspace owner.
2342
+
2343
+ ### 23.6 OpenAI API Key
2344
+
2345
+ ```
2346
+ GET https://api.openai.com/v1/models
2347
+ Header: Authorization: Bearer sk-...
2348
+ ```
2349
+ - `200` → live; lists models (may include org-specific fine-tunes).
2350
+ - `401` → dead.
2351
+ - `429` → live but quota exhausted.
2352
+ - Detectability: low; usage shows in OpenAI dashboard.
2353
+
2354
+ ### 23.7 npm Token
2355
+
2356
+ ```
2357
+ GET https://registry.npmjs.org/-/whoami
2358
+ Header: Authorization: Bearer npm_<token>
2359
+ ```
2360
+ - `200` with `{"username": "<user>"}` → live.
2361
+ - `401` → dead.
2362
+ - For scope check: `GET /-/npm/v1/tokens` returns the token's permissions (read/publish).
2363
+ - Detectability: low.
2364
+
2365
+ ### 23.8 Atlassian API Token
2366
+
2367
+ ```
2368
+ GET https://<workspace>.atlassian.net/rest/api/3/myself
2369
+ Auth: Basic <base64(email:ATATT3xFfGF0_...)>
2370
+ ```
2371
+ - `200` → live; returns account profile + email.
2372
+ - `401` → dead.
2373
+ - Workspace required — extract from leaked repo URL or Atlassian dork results.
2374
+ - Detectability: low.
2375
+
2376
+ ### 23.9 DataDog API + APP Key
2377
+
2378
+ ```
2379
+ GET https://api.datadoghq.com/api/v1/validate
2380
+ Headers:
2381
+ DD-API-KEY: <api-key>
2382
+ DD-APPLICATION-KEY: <app-key>
2383
+ ```
2384
+ - `200` → both keys valid.
2385
+ - `403` → either key invalid.
2386
+ - Per-region URL varies: `api.datadoghq.eu`, `api.us3.datadoghq.com`, etc.
2387
+ - Detectability: low; appears in DataDog audit log.
2388
+
2389
+ ### 23.10 Validator output schema
2390
+
2391
+ ```
2392
+ {
2393
+ "status": "verified_live" | "verified_dead" | "scope_restricted" |
2394
+ "scope_unrestricted" | "validation_skipped_by_policy" |
2395
+ "validation_unsupported" | "validation_failed_transient",
2396
+ "provider": "postman" | "aws" | "github" | "slack" | "anthropic" | "openai" | "npm" | "atlassian" | "datadog",
2397
+ "account_id": "<opaque>",
2398
+ "scope": "<freeform>",
2399
+ "metadata": {<provider-specific>},
2400
+ "checked_at": "<UTC ISO8601>",
2401
+ "detectability": "low" | "medium" | "high"
2402
+ }
2403
+ ```
2404
+
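One way to keep validator output consistent with the schema above is a small record type that rejects unknown enum values and stamps `checked_at` in UTC. The class name and the validation behaviour are assumptions for illustration; only the field names and allowed values come from the schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

STATUSES = {
    "verified_live", "verified_dead", "scope_restricted", "scope_unrestricted",
    "validation_skipped_by_policy", "validation_unsupported",
    "validation_failed_transient",
}
DETECTABILITY = {"low", "medium", "high"}


def _utc_now():
    """Current time as an ISO 8601 string with an explicit UTC offset."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class ValidationResult:
    """Hypothetical record mirroring the validator output schema."""
    status: str
    provider: str
    detectability: str
    account_id: str = ""
    scope: str = ""
    metadata: dict = field(default_factory=dict)
    checked_at: str = field(default_factory=_utc_now)

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        if self.detectability not in DETECTABILITY:
            raise ValueError(f"unknown detectability: {self.detectability}")


# Example: rules of engagement forbid validation, so nothing is sent.
record = asdict(ValidationResult("validation_skipped_by_policy", "aws", "low"))
print(record["status"], record["checked_at"])
```

Failing fast on an unknown status keeps free-form strings out of downstream reports.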
2405
+ ### 23.11 Hard rules
2406
+
2407
+ - Read-only endpoint only.
2408
+ - Never use the validated credential to create, modify, delete, or send anything.
2409
+ - Tag every validation with detectability.
2410
+ - Record `checked_at` (UTC).
2411
+ - If RoE forbids validation → `validation_skipped_by_policy`, stop, document.
2412
+ - For root AWS keys, infrastructure-write GitHub PATs, or admin Slack tokens — flag for the operator and let them decide.
2413
+
2414
+ ### 23.12 Post-Discovery Enumeration Workflows
2415
+
2416
+ After validation confirms a key is live, you often want to enumerate what it can do. Stay read-only.
2417
+
2418
+ **AWS access key — IAM enum:**
2419
+ ```bash
2420
+ export AWS_ACCESS_KEY_ID="AKIA..."
2421
+ export AWS_SECRET_ACCESS_KEY="..."
2422
+
2423
+ # Identity (already done as part of validation)
2424
+ aws sts get-caller-identity
2425
+
2426
+ # IAM-user details (only if ARN was :user/)
2427
+ aws iam get-user
2428
+ aws iam list-attached-user-policies --user-name $(aws iam get-user --query 'User.UserName' --output text)
2429
+ aws iam list-user-policies --user-name $(aws iam get-user --query 'User.UserName' --output text)
2430
+ aws iam list-groups-for-user --user-name $(aws iam get-user --query 'User.UserName' --output text)
2431
+
2432
+ # What can I actually do? (simulate-principal-policy for common dangerous actions)
2433
+ aws iam simulate-principal-policy \
2434
+ --policy-source-arn $(aws sts get-caller-identity --query Arn --output text) \
2435
+ --action-names s3:ListAllMyBuckets ec2:DescribeInstances iam:ListUsers \
2436
+ secretsmanager:ListSecrets ssm:DescribeParameters \
2437
+ lambda:ListFunctions rds:DescribeDBInstances
2438
+
2439
+ # Read-only enumeration of common services (do not WRITE)
2440
+ aws s3 ls
2441
+ aws ec2 describe-instances --output table --query 'Reservations[*].Instances[*].[InstanceId,State.Name,Tags[?Key==`Name`].Value]'
2442
+ aws secretsmanager list-secrets --query 'SecretList[*].Name'
2443
+ aws ssm describe-parameters --query 'Parameters[*].Name'
2444
+ aws lambda list-functions --query 'Functions[*].FunctionName'
2445
+ aws rds describe-db-instances --query 'DBInstances[*].DBInstanceIdentifier'
2446
+
2447
+ # CloudTrail check — is logging on?
2448
+ aws cloudtrail describe-trails
2449
+
2450
+ # Check MFA: AccountMFAEnabled reports the root user; list-mfa-devices reports the IAM user
2451
+ aws iam get-account-summary | jq '.SummaryMap.AccountMFAEnabled'
2452
+ aws iam list-mfa-devices --user-name <username>
2453
+ ```
2454
+
2455
+ **GitHub PAT — repo enum:**
2456
+ ```bash
2457
+ TOKEN="ghp_..."
2458
+ H="Authorization: token $TOKEN"
2459
+
2460
+ # Scopes already captured from X-OAuth-Scopes header
2461
+ curl -sk -m 10 -I -H "$H" https://api.github.com/user | grep -i 'X-OAuth-Scopes'
2462
+
2463
+ # All repos accessible (own + collaborator + org member)
2464
+ curl -sk -m 10 -H "$H" "https://api.github.com/user/repos?affiliation=owner,collaborator,organization_member&per_page=100"
2465
+
2466
+ # Org memberships
2467
+ curl -sk -m 10 -H "$H" "https://api.github.com/user/orgs"
2468
+
2469
+ # Per-org: members, repos, secrets (secrets endpoint is metadata-only — names not values)
2470
+ ORG="<orgname>"
2471
+ curl -sk -m 10 -H "$H" "https://api.github.com/orgs/$ORG/members"
2472
+ curl -sk -m 10 -H "$H" "https://api.github.com/orgs/$ORG/repos?per_page=100"
2473
+ curl -sk -m 10 -H "$H" "https://api.github.com/orgs/$ORG/actions/secrets" # requires admin:org
2474
+
2475
+ # Per-repo workflow secrets (metadata)
2476
+ REPO="<orgname/reponame>"
2477
+ curl -sk -m 10 -H "$H" "https://api.github.com/repos/$REPO/actions/secrets"
2478
+ ```
2479
+
2480
+ **Slack token — workspace enum:**
2481
+ ```bash
2482
+ TOKEN="xoxb-..."
2483
+ H="Authorization: Bearer $TOKEN"
2484
+
2485
+ # auth.test already validated
2486
+ # Identity details
2487
+ curl -sk -m 10 -H "$H" -X POST "https://slack.com/api/users.identity" | jq .
2488
+
2489
+ # What conversations can I see? (sweeping check; respects scope)
2490
+ curl -sk -m 10 -H "$H" -X POST "https://slack.com/api/conversations.list?types=public_channel,private_channel,mpim,im&limit=200" | jq '.channels[] | {id, name, is_private}'
2491
+
2492
+ # Workspace info
2493
+ curl -sk -m 10 -H "$H" -X POST "https://slack.com/api/team.info" | jq .
2494
+
2495
+ # User list (only if scope includes users:read)
2496
+ curl -sk -m 10 -H "$H" -X POST "https://slack.com/api/users.list?limit=100" | jq '.members[] | {name, real_name, is_admin}'
2497
+
2498
+ # DO NOT: chat.postMessage, files.upload, conversations.invite, etc.
2499
+ ```
2500
+
2501
+ **JWT — full triage workflow:**
2502
+ ```bash
2503
+ JWT="eyJhbGciOiJIUzI1NiI..."
2504
+
2505
+ # Decode header
2506
+ echo "$JWT" | cut -d. -f1 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .   # JWT segments are base64url, so map the alphabet back first
2507
+ # Look for: alg (none = critical, HS256/HS384/HS512 = symmetric, RS256/RS512 = asymmetric, ES256 = ECDSA)
2508
+ # Look for: kid (key ID — possible JKU/X5U injection target)
2509
+ # Look for: jku, x5u (JKU/X5U values — control these = sign attacker JWTs)
2510
+
2511
+ # Decode payload
2512
+ echo "$JWT" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .   # JWT segments are base64url, so map the alphabet back first
2513
+ # Look for: exp (expired = downgraded), iat, nbf
2514
+ # Look for: sub, iss, aud (identity disclosure)
2515
+ # Look for: roles, scopes, permissions (privilege markers)
2516
+ # Look for: sensitive claims (email, employee ID, SSN, etc.)
2517
+
2518
+ # Algorithm-confusion test (RS→HS)
2519
+ # If alg is RS256, try crafting an HS256 token signed with the public key as secret
2520
+ # Tools: jwt_tool, jwt-cracker
2521
+
2522
+ # Brute-force HS256 secret (if HS256 + short-secret suspicion)
2523
+ hashcat -m 16500 "$JWT" /path/to/wordlist.txt
2524
+ # Or: john --format=HMAC-SHA256 jwt-hash.txt --wordlist=...
2525
+
2526
+ # Check `none` algorithm bypass
2527
+ # Re-encode header with alg=none and empty signature; some libraries accept
2528
+ NEW_JWT=$(echo -n '{"alg":"none","typ":"JWT"}' | base64 -w0 | tr -d '=' | tr '/+' '_-')
2529
+ NEW_JWT="${NEW_JWT}.$(echo "$JWT" | cut -d. -f2)."
2530
+ # Test against API
2531
+ ```
2532
+
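The decode steps above pipe raw JWT segments into `base64 -d`, but JWT segments are base64url-encoded with the trailing `=` padding stripped. The following is a minimal sketch of a decoder that handles both cases. `b64url_decode` and `jwt_part` are hypothetical helper names introduced here for illustration; they are not part of any tool named in this section.

```bash
# b64url_decode: decode one base64url segment (e.g. a JWT header or payload).
# Runs in a subshell so its variables do not leak into the caller.
b64url_decode() (
  # Map the URL-safe alphabet back to the standard base64 alphabet.
  seg=$(printf '%s' "$1" | tr '_-' '/+')
  # Restore the '=' padding that JWT encoding strips.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
)

# jwt_part: print the decoded header (field 1) or payload (field 2) of a token.
jwt_part() {
  b64url_decode "$(printf '%s' "$1" | cut -d. -f"$2")"
}
```

Typical use would be `jwt_part "$JWT" 1 | jq .` for the header and `jwt_part "$JWT" 2 | jq .` for the payload. The signature (field 3) is binary and is not meant to be printed.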
2533
+ **Postman PMAK — workspace enum:**
2534
+ ```bash
2535
+ PMAK="PMAK-..."
2536
+ H="X-Api-Key: $PMAK"
2537
+
2538
+ # /me already done (validation)
2539
+ curl -sk -m 10 -H "$H" https://api.getpostman.com/me | jq '.user'
2540
+
2541
+ # Workspaces
2542
+ curl -sk -m 10 -H "$H" https://api.getpostman.com/workspaces | jq '.workspaces[] | {id, name, type}'
2543
+
2544
+ # Per-workspace collections
2545
+ WS="<workspace-id>"
2546
+ curl -sk -m 10 -H "$H" "https://api.getpostman.com/workspaces/$WS" | jq '.workspace.collections[]'
2547
+ curl -sk -m 10 -H "$H" "https://api.getpostman.com/workspaces/$WS" | jq '.workspace.environments[]'
2548
+
2549
+ # Per-collection requests (where the secrets often live)
2550
+ COL="<collection-id>"
2551
+ curl -sk -m 10 -H "$H" "https://api.getpostman.com/collections/$COL" | jq '.collection.item[]'
2552
+ # Run secret catalog over the JSON
2553
+
2554
+ # Environments (env vars often contain creds)
2555
+ ENV="<environment-id>"
2556
+ curl -sk -m 10 -H "$H" "https://api.getpostman.com/environments/$ENV" | jq '.environment.values[] | {key, value}'
2557
+ ```
2558
+
2559
+ **Anthropic API key — usage enum:**
2560
+ ```bash
2561
+ KEY="sk-ant-api03-..."
2562
+ H="x-api-key: $KEY"
2563
+ A="anthropic-version: 2023-06-01"
2564
+
2565
+ # Models accessible
2566
+ curl -sk -m 10 -H "$H" -H "$A" https://api.anthropic.com/v1/models | jq '.data[] | .id'
2567
+
2568
+ # Usage / quota (admin-scoped tokens only):
2569
+ curl -sk -m 10 -H "$H" -H "$A" https://api.anthropic.com/v1/organizations/usage_report | jq .
2570
+
2571
+ # DO NOT: send actual completion requests against organization budget
2572
+ ```
2573
+
2574
+ **OpenAI API key — usage enum:**
2575
+ ```bash
2576
+ KEY="sk-..."
2577
+ H="Authorization: Bearer $KEY"
2578
+
2579
+ # Models
2580
+ curl -sk -m 10 -H "$H" https://api.openai.com/v1/models | jq '.data | length'
2581
+
2582
+ # Org info (if key has org scope)
2583
+ curl -sk -m 10 -H "$H" https://api.openai.com/v1/organizations | jq .
2584
+
2585
+ # Files / fine-tunes (sometimes contain training data with PII)
2586
+ curl -sk -m 10 -H "$H" https://api.openai.com/v1/files | jq .
2587
+ curl -sk -m 10 -H "$H" https://api.openai.com/v1/fine_tuning/jobs | jq .
2588
+ ```
2589
+
2590
+ **Generic key — provenance enum:**
2591
+ 1. Find the consuming domain (where in JS bundle did the key appear? what URL is the bundle served from?).
2592
+ 2. Check the API docs of the inferred service.
2593
+ 3. If the key matches a known regex, lookup vendor-specific scope check.
2594
+ 4. If unknown service, search GitHub for the key prefix (`gh search code "<prefix>"`).
2595
+ 5. Identify scope before validating; some keys are write-broad on first use.
2596
+
2597
+ ---
2598
+
2599
+ ## 24. Postman Public Workspace Universal Search
2600
+
2601
+ Postman's public-search endpoint is unauthenticated and indexes every workspace marked public.
2602
+
2603
+ **Verified endpoint shape (mid-2025 onward):**
2604
+
2605
+ ```bash
2606
+ curl -sk -m 15 \
2607
+ "https://www.postman.com/_api/ws/proxy" \
2608
+ -H 'Content-Type: application/json' \
2609
+ -H 'X-Entity-Team-Id: 0' \
2610
+ -d '{
2611
+ "service":"search",
2612
+ "method":"POST",
2613
+ "path":"/search-all",
2614
+ "body":{
2615
+ "queryIndices":["collaboration.workspace","runtime.collection","runtime.request"],
2616
+ "queryText":"acme.com",
2617
+ "size":100,
2618
+ "from":0,
2619
+ "clientTraceId":"",
2620
+ "queryAllIndices":false,
2621
+ "domain":"public"
2622
+ }
2623
+ }' | jq '.data[]'
2624
+ ```
2625
+
2626
+ This proxies through Postman's web app to their internal search service. Pagination via `from` (0, 100, 200, ...).
2627
+
2628
+ **If the proxy shape changes** (it has historically): inspect a real search request from the Postman web UI:
2629
+ 1. Open `https://www.postman.com/explore` in a browser.
2630
+ 2. Open DevTools → Network tab.
2631
+ 3. Search for any term.
2632
+ 4. Find the request to `_api/...` — copy as cURL — adapt.
2633
+
2634
+ **Per-workspace walk:**
2635
+
2636
+ For each matching workspace ID:
2637
+
2638
+ ```bash
2639
+ WS_ID="<workspace-id>"
2640
+ # Workspace metadata (name, description, team, visibility)
2641
+ curl -sk -m 10 "https://www.postman.com/_api/workspace/$WS_ID" | jq .
2642
+
2643
+ # List collections + environments + monitors in workspace
2644
+ curl -sk -m 10 "https://www.postman.com/_api/workspace/$WS_ID/collection" | jq '.[].id'
2645
+ curl -sk -m 10 "https://www.postman.com/_api/workspace/$WS_ID/environment" | jq '.[].id'
2646
+
2647
+ # Per-collection: full content (requests, headers, scripts, env vars)
2648
+ COL_ID="<collection-id>"
2649
+ curl -sk -m 10 "https://www.postman.com/_api/collection/$COL_ID" | jq '.collection.item[]'
2650
+ ```
2651
+
2652
+ **Ownership scoring signals:**
2653
+ - Creator/team name mentions target domain or brand → strong.
2654
+ - Workspace name/description mentions target → strong.
2655
+ - Request URLs contain `*.target.com` → strongest signal (workspace is actively used against target's APIs).
2656
+
2657
+ **Run secret catalog (§17) over every text blob extracted** from the requests, env vars, pre-request scripts, and test scripts.
2658
+
2659
+ ---
2660
+
2661
+ ## 25. Stack Exchange OSINT Sweep
2662
+
2663
+ Stack Exchange and its sister sites collect code paste-ins from developers — many include secrets, internal hostnames, and proprietary code excerpts.
2664
+
2665
+ **Sites to query (8 with highest signal):**
2666
+ ```
2667
+ stackoverflow.com
2668
+ serverfault.com
2669
+ dba.stackexchange.com
2670
+ devops.stackexchange.com
2671
+ security.stackexchange.com
2672
+ superuser.com
2673
+ sharepoint.stackexchange.com
2674
+ salesforce.stackexchange.com
2675
+ ```
2676
+
2677
+ **API:**
2678
+ ```
2679
+ GET https://api.stackexchange.com/2.3/search/advanced
2680
+ ?site=<site>
2681
+ &q=<target>
2682
+ &filter=withbody
2683
+ &pagesize=100
2684
+ ```
2685
+
2686
+ **Code block extraction regex:**
2687
+ ```regex
2688
+ <pre><code>([\s\S]*?)</code></pre>
2689
+ ```
2690
+ (Stack Exchange wraps code in `<pre><code>` HTML.)
2691
+
2692
+ **Pipeline:**
2693
+ 1. Search each site for the target name, brand, root domain.
2694
+ 2. Extract code blocks from `body` HTML.
2695
+ 3. Run secret catalog (§17) over each block.
2696
+ 4. Cross-reference post author email (where exposed in profile) against email_osint discoveries — confirms employee posting target's internal code.
2697
+ 5. Extract hostnames from code blocks → upsert as `subdomain` assets.
2698
+
2699
+ **Quota:** Stack Exchange API permits 300 requests/day per IP without a key; with a free key, 10,000/day. Throttle with a 2-second minimum interval per call.
2700
+
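The API call shape above can be assembled per site with a small helper. This is a sketch only: `urlencode` and `se_search_url` are hypothetical names, and the encoder assumes ASCII input. Stack Exchange API responses are always compressed, so the usage example passes `--compressed` to curl.

```bash
# urlencode: percent-encode an ASCII string for use as a query parameter value.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s' "$out"
)

# se_search_url: build the /search/advanced URL for one site and one query.
se_search_url() {
  printf 'https://api.stackexchange.com/2.3/search/advanced?site=%s&q=%s&filter=withbody&pagesize=100' \
    "$1" "$(urlencode "$2")"
}

# Usage (network; one call per site, throttled to the 2-second interval):
#   for s in stackoverflow.com serverfault.com; do
#     curl -s --compressed "$(se_search_url "$s" 'acme.com')" > "se-$s.json"
#     sleep 2
#   done
```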
2701
+ ---
2702
+
2703
+ ## 26. Public SaaS Collaboration Surfaces
2704
+
2705
+ Many SaaS collaboration tools allow public sharing. Dork them like search engines.
2706
+
2707
+ **Platforms with high incident rate:**
2708
+ ```
2709
+ trello.com
2710
+ notion.so / notion.site
2711
+ *.atlassian.net (Jira / Confluence)
2712
+ miro.com
2713
+ asana.com
2714
+ clickup.com
2715
+ airtable.com
2716
+ ```
2717
+
2718
+ **Dork template:**
2719
+ ```
2720
+ site:{platform} "{target-keyword}"
2721
+ ```
2722
+
2723
+ **Run via search-engine adapter** (DDG default; Bing / Brave / Yandex / SerpAPI optional). The same classification logic from §18.7 applies.
2724
+
2725
+ **Common findings:**
2726
+ - Public Trello board with credentials in card titles or attached config files.
2727
+ - Public Notion page with internal SOPs, API keys in code blocks, customer data.
2728
+ - Public Confluence space with onboarding docs containing seed creds.
2729
+ - Public Miro board with architecture diagrams revealing internal hostnames.
2730
+
2731
+ ---
2732
+
2733
+ ## 27. Subdomain-Source Stack (Passive)
2734
+
2735
+ Practical "what actually returns useful data in 2026" reference, ordered by recall:
2736
+
2737
+ | Source | Tier | Notes |
2738
+ |---|---|---|
2739
+ | crt.sh | Free | Best single source for cert-derived subdomains; **frequently 502s during peak hours — see fallback chain below**. |
2740
+ | VirusTotal | Freemium | Domain → passive DNS history. |
2741
+ | AlienVault OTX | Free | Passive DNS + URL data. |
2742
+ | Shodan | Paid (low tier) | Subdomain enum via `domain:` filter. |
2743
+ | BinaryEdge | Paid | Comparable to Shodan. |
2744
+ | FOFA | Freemium | Strong China-side coverage. |
2745
+ | ZoomEye | Freemium | Comparable to Shodan; CN-strong. |
2746
+ | Netlas | Paid | Large-scale HTTP/DNS/cert pivots. |
2747
+ | SecurityTrails | Paid | Passive DNS + asset discovery. |
2748
+ | RapidDNS | Free | Public passive DNS. |
2749
+ | Subfinder bundled | Free | Aggregates 30+ free sources via one CLI. |
2750
+ | Amass | Free | Comparable, more thorough, slower. |
2751
+ | Recon-ng | Free | Modular framework; many free providers built in. |
2752
+
2753
+ **DNS AXFR opportunism:** for every name server discovered, attempt zone transfer:
2754
+ ```
2755
+ dig @<ns-host> <target-domain> AXFR
2756
+ ```
2757
+ Most NSs reject; those that don't = full zone disclosure (CRITICAL).
2758
+
2759
+ **Brute-force tier:** puredns/Subbrute against `assetnote.io` wordlists (best-curated public wordlist source). Subfinder itself is passive and does not brute-force.
2760
+
2761
+ ### 27.0.1 crt.sh down? Fallback chain (try in order)
2762
+
2763
+ crt.sh runs on a single nginx in front of a busy Postgres; 502 / 503 / timeout in peak hours is routine. Don't retry-loop — pivot:
2764
+
2765
+ ```bash
2766
+ D="target.example"
2767
+
2768
+ # 1. Censys cert search (free 250 queries/month with key) — same data, different infra
2769
+ censys search "names: ${D}" --index-type certificates --fields names | jq -r '.names[]' | sort -u
2770
+
2771
+ # 2. Cert Spotter API (sslmate) — free w/ rate limits
2772
+ curl -sk "https://api.certspotter.com/v1/issuances?domain=${D}&include_subdomains=true&expand=dns_names" | \
2773
+ jq -r '.[].dns_names[]' | sort -u
2774
+
2775
+ # 3. CertStream archive (Calidog) — historical CT log mirror
2776
+ curl -sk "https://crt.calidog.io/?q=${D}" | jq -r '.[].name_value' | sort -u
2777
+
2778
+ # 4. Subfinder bundled aggregator (uses 30+ sources internally — Chaos, Anubis, BinaryEdge, BufferOver, Censys, CertSpotter, Crobat, Crtsh, DNSDumpster, FOFA, Fullhunt, GitHub, HackerTarget, IntelX, PassiveTotal, Quake, Rapiddns, Shodan, Spyse, ThreatBook, ThreatMiner, URLScan, VirusTotal, WhoisXML, ZoomEye, etc.)
2779
+ subfinder -d ${D} -all -recursive -silent
2780
+
2781
+ # 5. AlienVault OTX — free, no key
2782
+ curl -sk "https://otx.alienvault.com/api/v1/indicators/domain/${D}/passive_dns" | \
2783
+ jq -r '.passive_dns[].hostname' | sort -u
2784
+
2785
+ # 6. ThreatMiner — free
2786
+ curl -sk "https://api.threatminer.org/v2/domain.php?q=${D}&rt=5" | jq -r '.results[]'
2787
+
2788
+ # 7. URLScan — passive DNS via past scans
2789
+ curl -sk "https://urlscan.io/api/v1/search/?q=domain:${D}" | \
2790
+ jq -r '.results[].page.domain' | sort -u
2791
+
2792
+ # 8. Anubis-DB / DNSDumpster (HTML scrape, last resort)
2793
+ curl -sk -A "Mozilla/5.0" "https://anubisdb.com/anubis/subdomains/${D}" | jq -r '.[]'
2794
+ ```
2795
+
2796
+ PowerShell crt.sh wrapper with retry + fallback to Subfinder:
2797
+
2798
+ ```powershell
2799
+ function Get-Subs {
2800
+ param($D)
2801
+ for ($i=0; $i -lt 3; $i++) {
2802
+ try {
2803
+ $r = Invoke-WebRequest -Uri "https://crt.sh/?q=%25.$D&output=json" -UseBasicParsing -TimeoutSec 90 -UserAgent "Mozilla/5.0"
2804
+ return ($r.Content | ConvertFrom-Json | %{ $_.name_value -split "`n" } | %{ $_.Trim().ToLower() } | ?{ $_ -and $_ -notlike "*@*" -and $_ -notmatch "^\*\." } | Sort -Unique)
2805
+ } catch {
2806
+ "crt.sh attempt $($i+1) failed; sleep 5s..." | Out-Host
2807
+ Start-Sleep -Seconds 5
2808
+ }
2809
+ }
2810
+ "crt.sh down — pivot to Subfinder: subfinder -d $D -all -silent" | Out-Host
2811
+ return @()
2812
+ }
2813
+ ```
2814
+
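The PowerShell wrapper above normalizes crt.sh `name_value` entries by lowercasing, trimming, and dropping e-mail and wildcard entries. A rough shell equivalent of that filter, usable on the output of any source in the fallback chain, might look like this. `normalize_ct_names` is a hypothetical helper name; it reads one name per line on stdin.

```bash
# normalize_ct_names: lowercase, trim, drop e-mail and wildcard entries, dedupe.
# Mirrors the filter in the PowerShell wrapper: wildcard entries are dropped,
# not rewritten to their parent name.
normalize_ct_names() {
  awk '{
    $0 = tolower($0)
    gsub(/^[ \t\r]+|[ \t\r]+$/, "")       # trim surrounding whitespace
    if ($0 != "" && $0 !~ /@/ && $0 !~ /^\*\./) print
  }' | sort -u
}

# Usage (network), assuming the crt.sh JSON output used above:
#   curl -s "https://crt.sh/?q=%25.${D}&output=json" \
#     | jq -r '.[].name_value' | normalize_ct_names
```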
2815
+ ### 27.1 Wordlist Sources for Subdomain + Content Brute-Force
2816
+
2817
+ | Source | URL | Notes |
2818
+ |---|---|---|
2819
+ | **Assetnote Wordlists** | `https://wordlists.assetnote.io/` | Best-curated; updated regularly. Subdomain top-N (1k, 10k, 100k, 1M, 10M); content-paths per CMS/framework; per-vendor (AWS, Azure, GitLab, etc.). |
2820
+ | **SecLists** | `https://github.com/danielmiessler/SecLists` | Massive collection. Subdomains: `Discovery/DNS/subdomains-top1million-110000.txt`. Content: `Discovery/Web-Content/`. |
2821
+ | **jhaddix all.txt** | `https://gist.github.com/jhaddix/86a06c5dc309d08580a018c66354a056` | Long-running curated list. |
2822
+ | **OneListForAll** | `https://github.com/six2dez/OneListForAll` | Aggregated; very large (millions). |
2823
+ | **dirsearch wordlists** | `https://github.com/maurosoria/dirsearch` | Bundled with the tool. |
2824
+ | **raft-large-words.txt** | inside SecLists `Discovery/Web-Content/raft-large-words.txt` | Time-tested content wordlist. |
2825
+ | **bo0om wordlist** | `https://github.com/bo0om/wordlists` | Russian-language-aware. |
2826
+ | **commonspeak2** | `https://github.com/assetnote/commonspeak2-wordlists` | Generated from BigQuery commit data. |
2827
+ | **fuzzdb** | `https://github.com/fuzzdb-project/fuzzdb` | Fuzzing payloads + wordlists. |
2828
+ | **PayloadsAllTheThings** | `https://github.com/swisskyrepo/PayloadsAllTheThings` | Per-vuln-class payloads (less for enum, more for follow-on). |
2829
+ | **Custom per-target** | n/a | Best practice: derive a custom wordlist from the target's own content (extract every word from their public website + LinkedIn + careers page → unique → use as seed). |
2830
+
2831
+ **Size guidance:**
2832
+ - **<10k entries** → fast subdomain check (1–2 min); use for opportunistic/passive-supplement.
2833
+ - **10k–100k entries** → standard depth (10–30 min); use as default brute-force.
2834
+ - **100k–1M entries** → thorough; use when the target is a known high-value engagement (1–4 hours).
2835
+ - **>1M entries** → exhaustive; reserve for week-long engagements; expect rate-limiting.
2836
+
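The size guidance above can be turned into a quick check before starting a run. This is a sketch with hypothetical function names. The exact boundary handling (10,000 counted as standard, 100,000 as standard, 1,000,000 as thorough) is an assumption, since the guidance gives overlapping range ends.

```bash
# tier_for_count: map a wordlist entry count to the depth tier from the guidance.
tier_for_count() (
  n=$1
  if [ "$n" -lt 10000 ]; then echo fast
  elif [ "$n" -le 100000 ]; then echo standard
  elif [ "$n" -le 1000000 ]; then echo thorough
  else echo exhaustive
  fi
)

# wordlist_tier: same, but counts the lines of a wordlist file first.
wordlist_tier() {
  tier_for_count "$(( $(wc -l < "$1") ))"
}

# Usage:
#   wordlist_tier raft-large-words.txt
```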
2837
+ **Tooling:**
2838
+ ```bash
2839
+ # Subfinder + brute-force with assetnote 100k
2840
+ subfinder -d target.example -all -recursive | tee passive.txt
2841
+ puredns bruteforce assetnote-best-dns-wordlist.txt target.example -r resolvers.txt | tee brute.txt
2842
+ cat passive.txt brute.txt | sort -u > all-subs.txt
2843
+
2844
+ # Content brute-force on alive hosts
2845
+ ffuf -u "https://target.example/FUZZ" -w raft-large-words.txt -mc 200,301,403 -t 50 -ac
2846
+ ```
2847
+
2848
+ ---
2849
+
2850
+ ## 28. Infrastructure & Attack-Surface OSINT
2851
+
2852
+ - [Shodan](https://www.shodan.io/), [Censys](https://search.censys.io/) — internet device + cert search.
2853
+ - [GreyNoise](https://viz.greynoise.io/) — distinguish background noise from targeted scans.
2854
+ - [SecurityTrails](https://securitytrails.com/) — passive DNS + asset discovery.
2855
+ - [SpiderFoot](https://www.spiderfoot.net/) — automated recon + correlation.
2856
+ - [theHarvester](https://github.com/laramies/theHarvester) — subdomain, email, metadata.
2857
+ - [Recon-ng](https://github.com/lanmaster53/recon-ng) — web recon framework.
2858
+ - [Amass](https://github.com/owasp-amass/amass) / [Subfinder](https://github.com/projectdiscovery/subfinder) — passive subdomain.
2859
+ - [BuiltWith](https://builtwith.com/) — tech stack enumeration.
2860
+ - [Netlas](https://netlas.io/) — large-scale HTTP/DNS/cert pivots.
2861
+ - [BinaryEdge](https://www.binaryedge.io/) / [FOFA](https://fofa.so/) / [ZoomEye](https://www.zoomeye.org/) — Shodan/Censys complements.
2862
+ - [RiskIQ PassiveTotal](https://community.riskiq.com/) — passive DNS/cert/host pivots.
2863
+ - [Spur](https://spur.us/) — IP lookups.
2864
+ - [Robtex](https://www.robtex.com/) — passive DNS + infrastructure.
2865
+
2866
+ ### 28.1 ASN/BGP & Internet Measurement
2867
+
2868
+ - [Hurricane Electric BGP Toolkit](https://bgp.he.net/), [RIPEstat](https://stat.ripe.net/), [BGPView](https://bgpview.io/), [bgp.tools](https://bgp.tools/), [PeeringDB](https://www.peeringdb.com/).
2869
+
2870
+ **Bulk IP → ASN — recipes that actually work in 2026:**
2871
+
2872
+ ```bash
2873
+ # Cymru bulk WHOIS (fastest; no rate-limit issues; no key required)
2874
+ echo -e "begin\nverbose\n8.8.8.8\n1.1.1.1\nend" | nc whois.cymru.com 43
2875
+ # Or one-shot:
2876
+ whois -h whois.cymru.com " -v 8.8.8.8"
2877
+
2878
+ # RIPEstat (free; CORS-friendly; ~1 req/sec polite limit)
2879
+ curl -sk "https://stat.ripe.net/data/network-info/data.json?resource=8.8.8.8" | jq '.data'
2880
+
2881
+ # bgp.tools per-IP API (free; light rate-limit; requires UA)
2882
+ curl -sk -A "osint-recon/1.0 (contact@example.com)" "https://bgp.tools/api/ip/8.8.8.8" | jq .
2883
+
2884
+ # IPinfo Lite (free 50k req/month with free key)
2885
+ curl -sk "https://ipinfo.io/8.8.8.8?token=<key>" | jq .
2886
+ ```
2887
+
2888
+ **Watch out:**
2889
+ - `bgpview.io` API has aggressive undocumented rate limits (~1 req/min/IP); not suitable for bulk.
2890
+ - `bgp.he.net` has no public API; HTML scraping only — fragile.
2891
+ - `PeeringDB` is for facility/IX info, not per-IP ASN lookup.
2892
+ - For bulk (>50 IPs): use the **Cymru bulk format** above; it accepts hundreds of IPs in one TCP session.
2893
+
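For the bulk case, the Cymru request body can be generated from a file of IPs and the pipe-delimited verbose reply reduced to CSV. This is a minimal sketch: `cymru_bulk_query` and `cymru_to_csv` are hypothetical helper names, and the parser assumes the verbose column order `AS | IP | BGP Prefix | CC | Registry | Allocated | AS Name`.

```bash
# cymru_bulk_query: wrap a file of IPs (one per line) in the bulk-mode envelope.
cymru_bulk_query() {
  printf 'begin\nverbose\n'
  sort -u "$1"
  printf 'end\n'
}

# cymru_to_csv: turn verbose reply lines into "ip,asn,prefix".
# Banner and header lines, and rows whose AS field is not numeric, are skipped.
cymru_to_csv() {
  awk -F'|' 'NF >= 7 && $1 ~ /^ *[0-9]+ *$/ {
    gsub(/ /, "", $1); gsub(/ /, "", $2); gsub(/ /, "", $3)
    print $2 "," $1 "," $3
  }'
}

# Usage (network):
#   cymru_bulk_query ips.txt | nc whois.cymru.com 43 | cymru_to_csv > ip-asn.csv
```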
2894
+ ### 28.2 Certificates & CT Monitoring
2895
+
2896
+ - [crt.sh](https://crt.sh/), [Censys Certificates](https://search.censys.io/certificates), [CertStream](https://certstream.calidog.io/) (real-time CT WebSocket), [Rapid7 Open Data](https://opendata.rapid7.com/), [Cert Spotter](https://sslmate.com/certspotter) (freemium).
2897
+ - **Favicon mmh3 hash:** cluster infrastructure across hosts; pair with Shodan/Censys favicon search for shared-infra discovery.
2898
+
2899
+ ### 28.3 Web tech / TLS / fingerprinting
2900
+
2901
+ - **httpx (ProjectDiscovery)** — Wappalyzer-compatible ~600 signatures, JARM, favicon mmh3, TLS cert SHA256, security headers, screenshots. Recommended one-shot probe wrapper for thousands of hosts.
2902
+ - **JARM** — TLS handshake hash; stable per server config; useful for clustering.
2903
+ - **Wappalyzer** browser extension or CLI for tech enumeration.
2904
+
2905
+ ### 28.4 TLS Deep Audit
2906
+
2907
+ Beyond the cert SAN + JARM, inspect cipher suites, protocols, and config quality.
2908
+
2909
+ **sslyze (most thorough):**
2910
+ ```bash
2911
+ pip install sslyze
2912
+ sslyze target.example:443   # sslyze 5+ runs the regular scan set by default; older releases need --regular
2913
+ sslyze --json_out=tls.json target.example:443
2914
+ ```
2915
+ Reports: protocols supported (TLS 1.0/1.1/1.2/1.3), cipher suites per protocol, cert chain, OCSP, key info, robot/heartbleed/lucky13/poodle/freak/logjam/drown/ccs/ticketbleed.
2916
+
2917
+ **testssl.sh (thorough + readable output):**
2918
+ ```bash
2919
+ docker run --rm -ti drwetter/testssl.sh https://target.example
2920
+ # Or native install: https://github.com/drwetter/testssl.sh
2921
+ testssl.sh --jsonfile-pretty=tls-report.json target.example:443
2922
+ ```
2923
+
2924
+ **nmap script alternative (lighter):**
2925
+ ```bash
2926
+ nmap --script ssl-enum-ciphers,ssl-cert -p 443 target.example
2927
+ ```
2928
+
2929
+ **Check for these issues:**
2930
+
2931
+ | Issue | Severity | What to look for |
2932
+ |---|---|---|
2933
+ | TLS 1.0 / 1.1 supported | MEDIUM | Deprecated; PCI-DSS forbids TLS 1.0. |
2934
+ | SSL 3.0 / 2.0 supported | HIGH | Critically deprecated. |
2935
+ | Weak ciphers (RC4, 3DES, CBC modes) | MEDIUM | RC4 = NOMORE attack; 3DES = SWEET32. |
2936
+ | Anonymous DH | HIGH | No authentication. |
2937
+ | Self-signed cert on production | MEDIUM | Trust failure. |
2938
+ | Expired cert | MEDIUM | Operational + trust failure. |
2939
+ | Cert valid for too long (>398 days) | LOW | Browsers reject publicly trusted certs issued since 1 Sep 2020 with longer validity. |
2940
+ | Wildcard cert covering critical hosts | INFO | Operational risk if private key compromised. |
2941
+ | Weak key size (<2048 RSA, <256 ECDSA) | HIGH | Cryptographically weak. |
2942
+ | Heartbleed (CVE-2014-0160) | CRITICAL | Memory disclosure. |
2943
+ | ROBOT (CVE-2017-13099) | HIGH | Bleichenbacher. |
2944
+ | CCS injection (CVE-2014-0224) | HIGH | OpenSSL specific. |
2945
+ | Ticketbleed (CVE-2016-9244) | HIGH | F5-specific memory disclosure. |
2946
+ | HSTS not present (covered §16.4) | MEDIUM | Header audit. |
2947
+
2948
+ **JA3 / JA4 reference databases:**
2949
+
2950
+ - [ja3er.com](https://ja3er.com) — community-curated JA3 → client-software mapping.
2951
+ - [TLS Fingerprint DB](https://tlsfingerprint.io/) — research aggregator.
2952
+ - For server JARM: search Shodan `ssl.jarm:<hash>` to find shared infrastructure / origin candidates (see §16.15).
2953
+
2954
+ ### 28.5 Reverse DNS Sweep & IPv6 Enumeration
2955
+
2956
+ When a target owns an IP range (their ASN), enumerate it.
2957
+
2958
+ **Reverse DNS sweep (within scope):**
2959
+ ```bash
2960
+ # Single /24
2961
+ for i in $(seq 1 254); do
2962
+ IP="203.0.113.$i"
2963
+ PTR=$(dig +short -x $IP)
2964
+ [ -n "$PTR" ] && echo "$IP -> $PTR"
2965
+ done
2966
+
2967
+ # Larger range with parallelism
2968
+ prips 203.0.113.0/22 | xargs -I {} -P 50 sh -c 'PTR=$(dig +short -x {}); [ -n "$PTR" ] && echo "{} -> $PTR"'
2969
+ ```
2970
+
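`dig -x` hides the name it actually queries. For tools that expect the PTR owner name directly, the reversed-octet form can be built by hand. This is a small sketch; `ptr_name` is a hypothetical helper name and handles IPv4 only.

```bash
# ptr_name: print the in-addr.arpa owner name for an IPv4 address.
# Runs in a subshell so the IFS change stays local.
ptr_name() (
  IFS=.
  set -- $1            # split the address into its four octets
  printf '%s.%s.%s.%s.in-addr.arpa\n' "$4" "$3" "$2" "$1"
)

# Usage:
#   dig +short PTR "$(ptr_name 203.0.113.5)"
```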
2971
+ **Mass DNS approach (better for large ranges):**
2972
+ ```bash
2973
+ # zdns: install via go install github.com/zmap/zdns/cmd/zdns@latest
2974
+ prips 203.0.113.0/22 | zdns PTR
2975
+ ```
2976
+
2977
+ **Banner-only sweep (no DNS round trip):**
2978
+ ```bash
2979
+ # masscan + banner-grab
2980
+ sudo masscan -p80,443 203.0.113.0/22 --rate=1000 --banners -oX masscan.xml
2981
+ ```
2982
+
2983
+ **IPv6 enumeration:**
2984
+
2985
+ IPv6 has weaker enumeration tradition (huge address space precludes brute-force) but the AAAA records and known-allocation prefixes are still useful.
2986
+
2987
+ ```bash
2988
+ # AAAA records for every discovered subdomain
2989
+ for sub in $(cat all-subs.txt); do
2990
+ AAAA=$(dig +short AAAA $sub)
2991
+ [ -n "$AAAA" ] && echo "$sub -> $AAAA"
2992
+ done
2993
+
2994
+ # IPv6 reverse DNS sweep is generally infeasible (64 host bits, i.e. 2^64 addresses, per /64 subnet)
2995
+ # Instead: extract IPv6 prefixes from the target's allocations
2996
+ whois -h whois.cymru.com " -v $(dig +short A target.example.com | grep -m1 -E '^[0-9.]+$')" # Cymru expects an IP, not a hostname; gets ASN; then look up prefix
2997
+ ```
2998
+
2999
+ **BGP route observation:**
3000
+
3001
+ - **RouteViews** — `http://archive.routeviews.org/` (free; historical BGP routing table snapshots).
3002
+ - **RIPE RIS** — `https://ris.ripe.net/` (free; route collectors).
3003
+ - Use these to detect route hijacks against the target's prefixes (defensive intel; sometimes IOC).
3004
+
3005
+ **Reverse DNS pivots from third-party IPs:**
3006
+
3007
+ If a third-party shows the target's domain in PTR records (e.g., a hosting provider's IP has PTR `customer-acme.example.com.hostingprovider.net`), that's a pivot for adjacent customer infrastructure on the same provider/datacenter.
3008
+
3009
+ ---
3010
+
3011
+ ## 29. Threat Intel & IOCs
3012
+
3013
+ - Vendor / CERT advisories: CISA/NSA/CSA joint advisories, CERT-EU, NCSC-UK, JPCERT/CC, CERT-UA.
3014
+ - [MISP Project](https://www.misp-project.org/) and public MISP feeds.
3015
+ - [OpenCTI](https://www.opencti.io/) — CTI knowledge graph.
3016
+ - [Malpedia](https://malpedia.caad.fkie.fraunhofer.de/) — malware families, YARA, references.
3017
+ - [ThreatFox](https://threatfox.abuse.ch/), [URLHaus](https://urlhaus.abuse.ch/), [SSLBL](https://sslbl.abuse.ch/).
3018
+ - [MalwareBazaar](https://bazaar.abuse.ch/) — hash-based sample sharing.
3019
+ - [PhishTank](https://www.phishtank.com/), [OpenPhish](https://openphish.com/).
3020
+
3021
+ ### 29.1 Malware Analysis & Sandboxes
3022
+
3023
+ - Static: [pefile](https://github.com/erocarrera/pefile), [FLOSS](https://github.com/mandiant/flare-floss), [capa](https://github.com/mandiant/capa).
3024
+ - Similarity: SSDEEP, TLSH.
3025
+ - Sandboxes: [ANY.RUN](https://any.run/), [Hybrid Analysis](https://www.hybrid-analysis.com/), [CAPE](https://capesandbox.com/), [Tria.ge](https://tria.ge/).
3026
+ - Intelligence: [Intezer](https://analyze.intezer.com/) (code reuse), [VirusTotal](https://www.virustotal.com/) — **caution: uploads become public**.
3027
+ - TLS: [JA3](https://github.com/salesforce/ja3), [JA4](https://github.com/FoxIO-LLC/ja4).
3028
+
3029
+ ### 29.2 Vulnerability Prioritization Data Sources
3030
+
3031
+ Methodology in companion skill §28. Concrete data sources here.
3032
+
3033
+ | Source | URL | What it tells you |
3034
+ |---|---|---|
3035
+ | **NVD** | `https://nvd.nist.gov/vuln/search` (or API `services.nvd.nist.gov/rest/json/cves/2.0`) | Base CVE catalog with CVSS v2/v3 scores. |
3036
+ | **EPSS** | `https://www.first.org/epss/` (CSV at `https://epss.cyentia.com/epss_scores-current.csv.gz`) | 0.0-1.0 probability of exploit in next 30 days. Updated daily. |
3037
+ | **CISA KEV** | `https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json` | CVEs proven exploited in the wild + federal-agency due-by dates. |
3038
+ | **ExploitDB** | `https://www.exploit-db.com/`; offline DB via `searchsploit` | POC code presence (Metasploit, Python, shell). |
3039
+ | **Metasploit module catalog** | `https://www.rapid7.com/db/modules/` (or `msfconsole > search cve:CVE-2024-XXXX`) | Automation availability. |
3040
+ | **InTheWild.io** | `https://inthewild.io/` | Community-curated "actively exploited" tracker. |
3041
+ | **OpenCVE** | `https://www.opencve.io/` | Timeline + watchlist + alerts. |
3042
+ | **Trickest CVE → POC mapping** | `https://github.com/trickest/cve` | Auto-generated CVE → public POC repo links. |
3043
+ | **GitHub Security Advisories** | `https://github.com/advisories` | Per-language / per-ecosystem advisories. |
3044
+ | **MITRE CVE List** | `https://cve.mitre.org/cve/` | Official CVE registry. |
3045
+ | **VulnDB** | `https://vulndb.cyberriskanalytics.com/` | Paid; commercial enrichment. |
3046
+ | **OSV.dev** | `https://osv.dev/` | Open-source vulnerability DB; JSON API. |
3047
+ | **Vulncheck KEV** | `https://vulncheck.com/kev` | Expanded KEV feed (more than CISA). |
3048
+ | **Tenable Research** | `https://www.tenable.com/research` | Tenable's CVE detail enrichment. |
3049
+ | **Qualys ThreatPROTECT** | `https://threatprotect.qualys.com/` | Qualys' threat-context enrichment. |
3050
+
3051
+ **Workflow:**
3052
+ ```bash
3053
+ # 1. Get EPSS score for a CVE
3054
+ curl -sk "https://api.first.org/data/v1/epss?cve=CVE-2024-3400" | jq '.data[0]'
3055
+
3056
+ # 2. Check if in CISA KEV
3057
+ curl -sk https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json | \
3058
+ jq '.vulnerabilities[] | select(.cveID == "CVE-2024-3400")'
3059
+
3060
+ # 3. Check ExploitDB
3061
+ searchsploit --cve 2024-3400
3062
+
3063
+ # 4. Check Metasploit
3064
+ msfconsole -q -x "search cve:2024-3400; exit"
3065
+ ```
3066
+
3067
+ **Bulk prioritization** (given a Nuclei scan output with N CVEs):
3068
+ ```bash
3069
+ # Extract CVEs from nuclei JSON output
3070
+ jq -r '.info.classification["cve-id"][]?' nuclei-results.json | sort -u > cves.txt
3071
+
3072
+ # Annotate each with EPSS + KEV (download the KEV feed once, not once per CVE)
+ curl -sk https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json -o kev.json
+ while IFS= read -r CVE; do
+ EPSS=$(curl -sk "https://api.first.org/data/v1/epss?cve=$CVE" | jq -r '.data[0].epss // "N/A"')
+ KEV=$(jq -r --arg c "$CVE" '.vulnerabilities[] | select(.cveID == $c) | .vulnerabilityName // empty' kev.json)
+ KEV_FLAG=$([ -n "$KEV" ] && echo "KEV" || echo "")
+ echo "$CVE | EPSS:$EPSS | $KEV_FLAG"
+ done < cves.txt | sort -t: -k2 -nr
+ ```
3080
+ ```
3081
+ ---
3082
+
3083
+ ## 30. Cryptocurrency OSINT
3084
+
3085
+ ### 30.1 Blockchain Explorers
3086
+
3087
+ | Chain | Explorer |
3088
+ |-------|---------|
3089
+ | Bitcoin | [Blockchain.com](https://www.blockchain.com/explorer), [Blockchair](https://blockchair.com/) |
3090
+ | Ethereum | [Etherscan](https://etherscan.io/) |
3091
+ | BNB Chain | [BSCScan](https://bscscan.com/) |
3092
+ | Polygon PoS | [PolygonScan](https://polygonscan.com/) |
3093
+ | Solana | [Solscan](https://solscan.io/) |
3094
+ | Multi-chain | [OKLink](https://www.oklink.com/) (freemium), [Cielo](https://cielo.io/) |
3095
+
3096
+ ### 30.2 L2 / Rollup Explorers
3097
+
3098
+ | L2 | Explorer | Notes |
3099
+ |---|---|---|
3100
+ | Arbitrum | [Arbiscan](https://arbiscan.io/) | Optimistic rollup; 7-day challenge window. |
3101
+ | Optimism | [Optimistic Etherscan](https://optimistic.etherscan.io/) | Optimistic rollup; 7-day challenge window. |
3102
+ | Base | [BaseScan](https://basescan.org/) | OP Stack. |
3103
+ | Blast | [Blastscan](https://blastscan.io/) | OP Stack derivative. |
3104
+ | Scroll | [Scrollscan](https://scrollscan.com/) | zkEVM. |
3105
+ | zkSync Era | [zkSync Era Block Explorer](https://explorer.zksync.io/) | zkRollup; faster finality. |
3106
+ | Polygon zkEVM | [PolygonScan zkEVM](https://zkevm.polygonscan.com/) | zkEVM. |
3107
+ | StarkNet | [Voyager](https://voyager.online/), [StarkScan](https://starkscan.co/) | Cairo VM; different address derivation. |
3108
+ | Cross-L2 | [L2Beat](https://l2beat.com/) | Risk framework + TVL comparison. |
3109
+
3110
+ ### 30.3 Transaction Tracking & Analytics
3111
+
3112
+ - [Arkham](https://www.arkhamintelligence.com/) — multichain, entity labels, graphs, alerts.
3113
+ - [TRM](https://www.trmlabs.com/) — address/tx graphs.
3114
+ - [MetaSleuth](https://metasleuth.io/) — visual flow.
3115
+ - [Breadcrumbs](https://www.breadcrumbs.app/) (freemium) — visual graphing + labels.
3116
+ - [Bubblemaps](https://bubblemaps.io/) — holder concentration.
3117
+ - [Whale Alert](https://whale-alert.io/) — large transaction monitoring.
3118
+ - [Chainalysis](https://www.chainalysis.com/) / [Crystal Blockchain](https://crystalblockchain.com/) — pro analytics.
3119
+ - [GraphSense](https://graphsense.info/) — open-source crypto analytics.
3120
+ - [Nansen](https://www.nansen.ai/) — Smart Money labels (paid).
3121
+ - [Dune](https://dune.com/) — custom queries.
3122
+ - [Token Sniffer](https://tokensniffer.com/) — honeypot/scam detection.
3123
+
3124
+ ### 30.4 NFT / Exchange / Bridges
3125
+
3126
+ - [OpenSea](https://opensea.io/), [NFTScan](https://www.nftscan.com/), [DappRadar](https://dappradar.com/), [CoinGecko](https://www.coingecko.com/), [CoinMarketCap](https://coinmarketcap.com/), [Glassnode](https://glassnode.com/).
3127
+ - Bridges: [Socketscan](https://socketscan.io/), [L2Beat Bridges](https://l2beat.com/bridges), [Pulsy](https://pulsy.io/).
3128
+
3129
+ ---
3130
+
3131
+ ## 31. Media Intelligence
3132
+
3133
+ ### 31.1 Reverse Image & Facial Search
3134
+
3135
+ - [Google Images](https://images.google.com/), [TinEye](https://tineye.com/), [Yandex Images](https://yandex.com/images/) (Russian/East European strong), [PimEyes](https://pimeyes.com/en), [FaceCheck](https://facecheck.id/).
3136
+
3137
+ ### 31.2 Image Forensics
3138
+
3139
+ - [Forensically](https://29a.ch/photo-forensics/), [ExifTool](https://exiftool.org/), [Jimpl](https://jimpl.com/), [Jeffrey's EXIF Viewer](http://exif.regex.info/exif.cgi), [FOCA](https://www.elevenpaths.com/labstools/foca), [Metagoofil](https://www.edge-security.com/metagoofil.php), [C2PA Verify](https://verify.contentauthenticity.org/).
3140
+
3141
+ ### 31.3 Video Analysis
3142
+
3143
+ - [YouTube Data Viewer](https://citizenevidence.amnestyusa.org/), [InVID & WeVerify](https://www.invid-project.eu/tools-and-services/invid-verification-plugin/), [YouTube Geo Tag](https://mattw.io/youtube-geofind/location), [MediaInfo](https://mediaarea.net/en/MediaInfo), Snap Map.
3144
+
3145
+ ### 31.4 Browser Extensions for Media
3146
+
3147
+ - [Fake News Debunker (InVID & WeVerify)](https://chrome.google.com/webstore/detail/fake-news-debunker-by-inv/mhccpoafgdgbhnjfhkcmgknndkeenfhe).
3148
+ - [RevEye Reverse Image Search](https://chrome.google.com/webstore/detail/reveye-reverse-image-sear/kejaocbebojdmebagkjghljkeefgimdj).
3149
+ - [EXIF Viewer Pro](https://chrome.google.com/webstore/detail/exif-viewer-pro/mmbhfeiddhndihdjeganjggkmjapkffm).
3150
+ - [Wayback Machine Extension](https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak).
3151
+ - [Search by Image](https://chromewebstore.google.com/detail/search-by-image/cnojnbdhbhnkbcieeekonklommdnndci).
3152
+
3153
+ ---
3154
+
3155
+ ## 32. Geospatial Intelligence
3156
+
3157
+ ### 32.1 Satellite & Mapping
3158
+
3159
+ - [Google Maps](https://www.google.com/maps), [Bing Maps](https://www.bing.com/maps/).
3160
+ - [Sentinel Hub EO Browser](https://apps.sentinel-hub.com/eo-browser/), [NASA Worldview](https://worldview.earthdata.nasa.gov/), [Zoom Earth](https://zoom.earth/).
3161
+ - [Wayback Imagery](https://livingatlas.arcgis.com/wayback/) — historical satellite.
3162
+ - [NASA FIRMS](https://firms.modaps.eosdis.nasa.gov/map/), [Open Infrastructure Map](https://openinframap.org/), [Windy](https://www.windy.com/).
3163
+
3164
+ ### 32.2 Geolocation Tools
3165
+
3166
+ - [Mapillary](https://www.mapillary.com/app), [KartaView](https://kartaview.org/), [Overpass Turbo](https://overpass-turbo.eu/), [SunCalc](https://www.suncalc.org/), [GeoNames](https://www.geonames.org/), [PeakVisor](https://peakvisor.com/), [GeoGuessr tips](https://somerandomstuff1.wordpress.com/2019/02/08/geoguessr-the-top-tips-tricks-and-techniques/).
3167
+
3168
+ **Street View:** Google Street View, [Apple Maps](https://maps.apple.com/), [Yandex Maps](https://yandex.com/maps/), [Baidu Maps](https://map.baidu.com/).
3169
+
3170
+ ### 32.3 Flight OSINT
3171
+
3172
+ - [FlightRadar24](https://www.flightradar24.com/), [FlightAware](https://www.flightaware.com/), [RadarBox](https://www.radarbox.com/).
3173
+ - [ADSBExchange](https://www.adsbexchange.com/) — unfiltered.
3174
+ - [Planespotters](https://www.planespotters.net/) — fleet/airframe history.
3175
+ - [AirFrames](https://www.airframes.org/), [JetPhotos](https://www.jetphotos.com/).
3176
+
3177
+ ### 32.4 Maritime OSINT
3178
+
3179
+ - [MarineTraffic](https://www.marinetraffic.com/), [VesselFinder](https://www.vesselfinder.com/), [FleetMon](https://www.fleetmon.com/).
3180
+ - [Global Fishing Watch](https://globalfishingwatch.org/map/) — vessel behavior + AIS gap analysis.
3181
+
3182
+ ---
3183
+
3184
+ ## 33. AI-Assisted OSINT
3185
+
3186
+ > **Warning:** Never paste PII, sensitive IOCs, or unique pivots into cloud LLMs. Providers may log inputs and, depending on plan and settings, use them for training. Use local models for sensitive analysis.
3187
+
3188
+ | Tool | Strength |
3189
+ |------|---------|
3190
+ | [ChatGPT](https://chat.openai.com/) (paid) | Log parsing, dataset analysis, Code Interpreter for CSV/JSON, Vision OCR. |
3191
+ | [Claude](https://claude.ai/) (paid) | 200K-token context for large doc dumps + report synthesis. |
3192
+ | [Gemini](https://gemini.google.com/) | Long-context; Deep Research mode with citations. |
3193
+ | [Perplexity Pro](https://www.perplexity.ai/) (paid) | Real-time web search + reasoning. |
3194
+
3195
+ **Local / privacy-preserving:** [Ollama](https://ollama.com/), [LM Studio](https://lmstudio.ai/), [GPT4All](https://gpt4all.io/).
3196
+
3197
+ ### 33.1 Commercial AI OSINT Platforms
3198
+
3199
+ - [Cylect](https://www.cylect.io/) — entity extraction + link analysis.
3200
+ - [Fivecast Matrix](https://www.fivecast.com/products/matrix/) — generative-AI triage for social-media datasets.
3201
+ - [Recorded Future](https://www.recordedfuture.com/) — AI-driven threat intel.
3202
+ - [DarkOwl Vision](https://www.darkowl.com/) — darknet data analysis.
3203
+
3204
+ ### 33.2 Deepfake & Synthetic Media Detection
3205
+
3206
+ - [Sensity AI](https://sensity.ai/), [Reality Defender](https://realitydefender.com/), [Adobe Content Credentials Verify](https://contentcredentials.org/verify), [CarNet](https://carnet.ai/).
3207
+
3208
+ ---
3209
+
3210
+ ## 34. Archiving & Evidence Preservation
3211
+
3212
+ - [archive.today](https://archive.today/) — one-page archiver + screenshot.
3213
+ - [URLScan.io](https://urlscan.io/) — webpage scan + resource map.
3214
+ - [ArchiveBox](https://archivebox.io/) — self-hosted (HTML, PDF, screenshots, media).
3215
+ - [Hunchly](https://www.hunch.ly/) — investigator evidence capture (paid).
3216
+ - Wayback SavePageNow API v3 — on-demand archiving with job IDs.
3217
+ - [SingleFileZ](https://github.com/gildas-lormeau/SingleFileZ) — browser ext for offline HTML.
3218
+ - [Kasm Workspaces](https://kasmweb.com/) — containerized OSINT browser isolation.
3219
+
3220
+ **Evidence handling:** URL + UTC timestamp + PNG + WARC/SingleFileZ archive, SHA-256 hash all downloads, separate work profiles per case, store evidence read-only, JSONL run logs with `run_id` + tool versions.
3221
+
3222
+ ---
3223
+
3224
+ ## 35. Automation & Workflows
3225
+
3226
+ - [n8n](https://n8n.io/) — self-hosted workflow automation (RSS → scrape → alert pipelines).
3227
+ - [Huginn](https://github.com/huginn/huginn) — agent-based monitoring/scraping/alerting.
3228
+ - [Playwright](https://playwright.dev/) — headless browser automation; stealth plugins are third-party add-ons, not part of Playwright itself.
3229
+ - [Browsertrix Crawler](https://github.com/webrecorder/browsertrix-crawler) — archival crawling with WARC export.
3230
+ - [Prefect](https://www.prefect.io/) / [Apache Airflow](https://airflow.apache.org/) — workflow orchestration.
3231
+
3232
+ ---
3233
+
3234
+ ## 36. Cross-Module Sidecar Coordination
3235
+
3236
+ When you run a multi-module recon, late-arriving outputs need to feed into already-running modules. The pattern:
3237
+
3238
+ 1. Each module writes a sidecar JSON to a known location when it finishes:
3239
+ - `<scan>/mobile_endpoints.json` — endpoints + hostnames extracted from APK static analysis.
3240
+ - `<scan>/secrets_sidecar.json` — hostnames + endpoints + Firebase project IDs from secrets-beyond-github sweep.
3241
+ - `<scan>/sso_tenants.json` — discovered IdP tenants for breach correlation.
3242
+ 2. Downstream modules check for sidecars on start; if present, ingest.
3243
+ 3. Cross-feed: API discovery consumes both `mobile_endpoints.json` and `secrets_sidecar.json`; SSO×breach correlation consumes `sso_tenants.json` and the breach DB.
3244
+
3245
+ **Sidecar shape (mobile_endpoints.json example):**
3246
+ ```json
3247
+ {
3248
+ "endpoints": [
3249
+ {"method": "GET", "url": "https://api.acme.com/v1/users", "source": "apk:com.acme.android"},
3250
+ {"method": "POST", "url": "https://api.acme.com/v1/login", "source": "apk:com.acme.android"}
3251
+ ],
3252
+ "hostnames": ["api.acme.com", "cdn.acme.com"],
3253
+ "firebase_project_ids": ["acme-prod-12345"]
3254
+ }
3255
+ ```
3256
+
3257
+ When you implement an ad-hoc multi-tool recon (no platform), use a `tmpdir + JSON sidecars + a one-line manifest` pattern. Composable, debuggable, replay-able.
3258
+
3259
+ ---
3260
+
3261
+ ## 37. Regional Search Engines
3262
+
3263
+ - **Russia / CIS:** [Yandex](https://yandex.com/), [Mail.ru Search](https://go.mail.ru/).
3264
+ - **China:** [Baidu](https://www.baidu.com/), [Sogou](https://www.sogou.com/), [360 Search](https://www.so.com/).
3265
+ - **Russia social:** [VK](https://vk.com/), [OK.ru](https://ok.ru/).
3266
+ - **China social:** [Weibo](https://weibo.com/), [Bilibili](https://www.bilibili.com/), [Zhihu](https://www.zhihu.com/), [Douyin](https://www.douyin.com/).
3267
+
3268
+ ---
3269
+
3270
+ ## 38. Telegram & Messaging Intelligence
3271
+
3272
+ - [TGStat](https://tgstat.com/) — channel analytics + search.
3273
+ - [Telemetr](https://telemetr.io/) — channel growth, overlaps, forwards.
3274
+ - [Combot](https://combot.org/) — group analytics (partial paid).
3275
+ - [TelegramDB Search Bot](https://t.me/TGdb_bot) — basic Telegram OSINT.
3276
+ - [Discord ID](https://discord.id/) — basic Discord account info.
3277
+ - [Sogou Weixin search](https://weixin.sogou.com/) — WeChat Official Accounts.
3278
+ - View public Telegram channels: `https://t.me/s/<channel>`.
3279
+
3280
+ ---
3281
+
3282
+ ## 39. Attack-Path Hint Patterns
3283
+
3284
+ When emitting a HIGH/CRITICAL API endpoint finding (score ≥ 70), include a one-sentence `attack_path_hint` in evidence so the operator knows where to start exploiting. Templates:
3285
+
3286
+ | Trigger | Attack-path hint |
3287
+ |---|---|
3288
+ | Unauth POST / PUT / DELETE | *"Unauthenticated {method} {path} — try IDOR + privilege escalation; check whether numeric IDs are sequential or guessable."* |
3289
+ | Open GraphQL introspection | *"Open GraphQL introspection on {path} — enumerate mutations, look for `createUser`, `setRole`, `transferFunds`-shaped names; pivot to broken-auth or business-logic flaws."* |
3290
+ | Reflected CORS + creds | *"Reflected CORS with credentials on {path} — host CSRF page on attacker-controlled origin; victim's browser will leak {sensitive-data-hint}."* |
3291
+ | Wildcard CORS + sensitive | *"Wildcard CORS on {path} returning user-tied data without creds — exfiltrate via cross-origin fetch from any page victim visits."* |
3292
+ | Verb tampering | *"Verb tampering: {hidden-method} allowed on documented-{visible-method}-only endpoint → likely missing-method-check authz bug; try {hidden-method} {path} with valid auth."* |
3293
+ | API key in URL | *"API key in URL: `?{param}=...` — token leaks to access logs, browser history, Referer headers, third-party CDNs. Check Wayback / Google for cached copies."* |
3294
+ | Schema leak in error | *"Schema leak in error response — framework signature `{framework}` exposed; map to known {framework} vulns and craft targeted payloads."* |
3295
+ | Sensitive keyword | *"Path contains '{keyword}' — review for direct object reference, mass-assignment, or hidden admin functionality."* |
3296
+ | Open RTDB Firebase | *"Open Firebase RTDB at https://{project}.firebaseio.com/.json — read everything, then test write at `/<random-key>.json` with PUT to gauge ACL scope."* |
3297
+ | Listable cloud bucket | *"Listable {provider} bucket `{bucket}` — recursive object listing + content-type analysis; look for backups, logs, customer data, AWS keys in JSON configs."* |
3298
+ | .git exposed | *"Exposed .git/config on {host} — reconstruct repository with git-dumper or githacker; full source history."* |
3299
+ | .env exposed | *"Exposed .env on {host} — grep for `_KEY`, `_SECRET`, `_TOKEN`, `_PASSWORD`; validate all credentials read-only via §23 validators."* |
3300
+ | /actuator/env | *"Spring Boot /actuator/env exposed — dump environment variables; look for `spring.datasource.password`, JWT secrets, cloud creds."* |
3301
+ | /actuator/heapdump | *"Spring Boot /actuator/heapdump exposed — download HPROF, open it in Eclipse MAT or VisualVM (`jhat` was removed in JDK 9), search for cleartext secrets in heap strings."* |
3302
+ | Open Elasticsearch | *"Open Elasticsearch on {host}:9200 — `/_cat/indices?v` for index list; sample documents from each high-value index; test write to `/test-idx/_doc` to gauge ACL."* |
3303
+ | Open Redis | *"Open Redis on {host}:6379 — `INFO`, `KEYS *`, sample reads; check for write access via `CONFIG SET` then `BGSAVE` to write `authorized_keys`."* |
3304
+ | Open MongoDB | *"Open MongoDB on {host}:27017 — `show dbs`, `show collections`, sample find queries; check user collection for password hashes."* |
3305
+ | Subdomain takeover | *"CNAME for {host} points to unclaimed {provider} resource → register `{takeover-target}` on {provider} to serve content from {host}; pivot to phishing or content injection on the trusted domain."* |
3306
+ | Open kubelet | *"Open kubelet on {host}:10250 — `GET /pods` to list; `POST /run/<ns>/<pod>/<container>` for in-container exec without K8s API auth."* |
3307
+ | Open etcd | *"Open etcd on {host}:2379 — `etcdctl get / --prefix --keys-only` for full cluster state; secrets stored under `/registry/secrets/`."* |
3308
+ | K8s API anonymous | *"Kubernetes API on {host}:6443 with anonymous-auth — `kubectl --server=https://{host}:6443 --insecure-skip-tls-verify get pods --all-namespaces`."* |
3309
+ | Citrix unpatched | *"Citrix NetScaler version {ver} on {host} — vulnerable to CVE-{cve} (KEV-listed); see vendor advisory; do not exploit but flag for client immediate patching."* |
3310
+ | F5 BIG-IP TMUI exposed | *"F5 BIG-IP TMUI on {host} reachable; CVE-2022-1388 / CVE-2023-46747 KEV applicable; advise immediate patching to vendor-released hotfix."* |
3311
+ | VMware vCenter accessible | *"vCenter at {host} accessible without VPN; CVE-2021-21972 RCE if unpatched; check version banner."* |
3312
+ | Cloud function URL unauth | *"AWS Lambda Function URL at {url} accessible anonymously — review IAM auth configuration; if unauthenticated by design, audit input validation aggressively."* |
3313
+ | npm typosquat candidate | *"Package name `{candidate}` is unregistered + similar to target's published `{official}` — typosquat takeover risk; advise client to defensively register."* |
3314
+ | DMARC missing/permissive | *"DMARC `p=none` on {domain} — spoof of `{anything}@{domain}` deliverable to recipients; recommend enforcement to `p=quarantine` or `p=reject` after observing reports."* |
3315
+ | Live AI API key (Anthropic/OpenAI) | *"Validated `sk-{provider}-...` key with model access — an attacker can burn quota at the owner's expense; rotate immediately + audit usage logs in provider console."* |
3316
+ | Public Slack invite link | *"Slack workspace invite link discoverable via search engine — anyone can join the workspace without approval; trivially access internal channels."* |
3317
+ | Open Docker registry | *"Public Docker registry at {host} — `GET /v2/_catalog` lists images; pull and scan layers for embedded secrets."* |
3318
+ | Telegram bot token live | *"Telegram bot token validated — `getUpdates` reveals bot recipients (admin chats); if `getMe` shows bot is in channels, full message read access."* |
3319
+ | Sourcemap with `sourcesContent[]` | *"Sourcemap on {host} includes embedded original sources — full frontend code reconstructable; grep for inline secrets and internal hostnames."* |
3320
+
3321
+ ---
3322
+
3323
+ ## 40. Severity Decision Matrix — Worked Examples
3324
+
3325
+ When in doubt, anchor on these worked examples (drawn from real engagements):
3326
+
3327
+ | Finding | Severity | Why |
3328
+ |---|---|---|
3329
+ | `/.git/config` reachable on prod webapp | **CRITICAL** | Full source-code disclosure; secret history reconstructable. |
3330
+ | `/.env` reachable on prod webapp | **CRITICAL** | Plaintext creds (DB, cloud, API). |
3331
+ | Open Firebase RTDB returning data | **CRITICAL** | All app data readable; often writable. |
3332
+ | Listable S3 bucket containing PII | **CRITICAL** | Direct data exfil. |
3333
+ | Listable S3 bucket containing logs only | HIGH | Internal hostnames + paths in logs; pivot data. |
3334
+ | Spring Boot `/actuator/env` exposed | **CRITICAL** | DB creds, JWT secrets, cloud keys in env. |
3335
+ | Spring Boot `/actuator/heapdump` exposed | **CRITICAL** | Heap contains live secrets in string form. |
3336
+ | Open Elasticsearch (`/_cat/indices` returns) | **CRITICAL** | Full data reads; often writable. |
3337
+ | Open MongoDB (no auth) | **CRITICAL** | Full data + password-hash collection. |
3338
+ | Open Redis (no AUTH) | **CRITICAL** | Write `authorized_keys` → SSH foothold. |
3339
+ | Open Docker API (port 2375) | **CRITICAL** | Container/host takeover. |
3340
+ | Public PMAK validated live with broad scope | **CRITICAL** | Full Postman account + all team workspaces. |
3341
+ | Public AWS root access key validated live | **CRITICAL** | Full account compromise. |
3342
+ | Live AWS IAM-user key found on GitHub | HIGH | Limited scope (depends on IAM policy); often elevatable. |
3343
+ | Live GitHub PAT found in JS bundle | HIGH | Repo write access (depends on scope). |
3344
+ | Live Slack token in pastebin | HIGH | Workspace data + history; sometimes channel post. |
3345
+ | Sourcemap (`.js.map`) accessible on prod | HIGH | Frontend source disclosure. |
3346
+ | Open GraphQL introspection on prod | HIGH | Full schema → mutations + business-logic discovery. |
3347
+ | Subdomain takeover possible (Heroku / GitHub Pages / etc.) | HIGH | Takeover → phishing on trusted domain. |
3348
+ | Reflected CORS with credentials on `/api/billing` | HIGH | CSRF-via-CORS for billing data. |
3349
+ | Verb tampering: DELETE allowed on documented-GET-only endpoint | HIGH | Authz bypass; potentially destructive. |
3350
+ | `phpinfo.php` reachable on prod | HIGH | Discloses paths, env vars, modules → vuln-version pivot. |
3351
+ | Tomcat `/manager/html` reachable | HIGH | Often default creds; WAR upload = RCE. |
3352
+ | Jenkins script console accessible | HIGH | Groovy script execution = RCE. |
3353
+ | Missing HSTS on `/login` | HIGH (escalated from MED) | Login pages must enforce HSTS. |
3354
+ | Missing HSTS on standard pages | MEDIUM | Hardening gap. |
3355
+ | Missing CSP | MEDIUM | XSS impact mitigation gone. |
3356
+ | Internal IP / K8s service DNS in JS | MEDIUM | Internal topology disclosure. |
3357
+ | Apache `/server-status` reachable | MEDIUM | Live request visibility. |
3358
+ | `android:debuggable=true` on prod app | **CRITICAL** | Production debug-build → full client compromise. |
3359
+ | `android:allowBackup=true` (no whitelist) | MEDIUM | App data exfil via `adb backup`. |
3360
+ | `android:usesCleartextTraffic=true` | MEDIUM | MITM-able on hostile networks. |
3361
+ | Sensitive deep-link handler (`myapp://reset-password`) | HIGH | Other apps can trigger sensitive flows. |
3362
+ | Exported Android component without permission | MEDIUM | IPC attack surface. |
3363
+ | Slack webhook URL leaked | MEDIUM | Send to channel; can be used for social-eng. |
3364
+ | Twilio Account SID leaked (no auth token) | MEDIUM | Half a credential pair; plus account enumeration. |
3365
+ | Wildcard CORS on data-returning API | MEDIUM | Lower than reflected+creds but still exfil-able. |
3366
+ | Missing `X-Frame-Options` | LOW | Clickjacking. |
3367
+ | `.DS_Store` exposed | LOW | Leaks file and folder names from the directory it was created in. |
3368
+ | Stripe **test** key leaked | LOW | No real money risk. |
3369
+ | Firebase URL exposed (no open RTDB) | LOW | Project-ID disclosure only. |
3370
+ | Cert pinning missing in mobile app | LOW | MITM possible on hostile networks. |
3371
+ | Outdated WordPress install detected | LOW | Pending CVE confirmation. |
3372
+ | Missing `Referrer-Policy` / `Permissions-Policy` | INFO | Hardening, not an exposure. |
3373
+ | `/.well-known/security.txt` discovered | INFO | Useful contact info only. |
3374
+ | Domain in breach with 0 named accounts | INFO | Contextual only. |
3375
+ | Private bucket exists (HEAD 403) | INFO | Asset only, no finding. |
3376
+ | Open kubelet on 10250 | **CRITICAL** | Pod exec without K8s API auth. |
3377
+ | Open etcd on 2379 | **CRITICAL** | Cluster state + secrets. |
3378
+ | K8s API on 6443 with anonymous-auth | HIGH | Cluster recon; sometimes pod exec. |
3379
+ | K8s dashboard exposed without auth | HIGH | Cluster admin UI. |
3380
+ | Helm Tiller (Helm 2) on 44134 | HIGH | Cluster-admin scope. |
3381
+ | Citrix Netscaler with KEV CVE | **CRITICAL** | Patch immediately; actively exploited. |
3382
+ | F5 BIG-IP TMUI accessible | HIGH | TMUI = admin panel; CVE-2022-1388 if unpatched = CRIT. |
3383
+ | Ivanti Connect Secure (formerly Pulse Secure) with CVE-2024-21887 | **CRITICAL** | KEV; command injection, chained with the CVE-2023-46805 auth bypass. |
3383
+ | FortiGate with CVE-2024-21762 | **CRITICAL** | KEV; pre-auth out-of-bounds write in SSL VPN → RCE. |
3384
+ | Palo Alto GlobalProtect with CVE-2024-3400 | **CRITICAL** | KEV; pre-auth RCE. |
3386
+ | VMware vCenter with CVE-2021-21972 | **CRITICAL** | KEV; pre-auth RCE. |
3387
+ | VMware ESXi exposed without VPN | HIGH | Multiple CVEs (ESXiArgs ransomware vector). |
3388
+ | MS Exchange with ProxyLogon / ProxyShell / ProxyNotShell unpatched | **CRITICAL** | KEV chain; RCE + mailbox dump. |
3389
+ | AWS Lambda Function URL accessible anonymously | HIGH | Direct invocation; check IAM auth posture. |
3390
+ | Public Cloud Run / Cloud Function unauthenticated | HIGH | Same. |
3391
+ | Public Docker registry (anonymous catalog) | MEDIUM | Image enum + secret hunt in layers. |
3392
+ | GitHub Actions secrets echoed in workflow logs | HIGH | Secret-in-log = full secret disclosure. |
3393
+ | GitHub Actions `pull_request_target` checkout of fork code | HIGH | Class of bug; secrets accessible to attacker PRs. |
3394
+ | GitLab self-hosted with CVE-2021-22205 | **CRITICAL** | KEV; ExifTool RCE. |
3395
+ | Jenkins with `pull_request_target`-equivalent misconfig | HIGH | Build secrets accessible to PRs. |
3396
+ | Public Notion page with internal SOPs | MEDIUM | Operational intel; sometimes credentials. |
3397
+ | Public Trello board with credentials in cards | HIGH | Often plaintext API keys. |
3398
+ | Public Confluence space with onboarding docs | MEDIUM | Seed creds + tech-stack reveal. |
3399
+ | Public Miro board with architecture diagrams | LOW | Internal-host disclosure. |
3400
+ | DMARC policy `p=none` on production sending domain | MEDIUM | Spoof feasible (escalated from LOW for risk surface). |
3401
+ | SPF `~all` (softfail) without strict DMARC | MEDIUM | Spoofs land in spam, but land. |
3402
+ | MX server allows open relay (test with 250 OK to RCPT TO foreign domain) | HIGH | Spam + spoof feasibility. |
3403
+ | Live Anthropic / OpenAI API key with broad scope | **CRITICAL** | Quota cost + potential PII in past responses. |
3404
+ | Live npm token with `publish` scope | **CRITICAL** | Supply-chain compromise of all maintained packages. |
3405
+ | Live PyPI / Docker Hub / GHCR token with publish scope | **CRITICAL** | Supply-chain compromise. |
3406
+ | Atlassian token with admin scope | HIGH | Workspace-wide read; sometimes write. |
3407
+ | Subdomain takeover candidate confirmed | HIGH | Trusted-domain phishing surface. |
3408
+ | Sensitive CI/CD wordlist hits (Jenkinsfile, .gitlab-ci.yml on public repo) | MEDIUM | Build-script intel; often references secret names. |
3409
+ | Public Postman workspace with internal API endpoints | MEDIUM | API attack surface mapped. |
3410
+ | WAF/CDN trivially bypassable (origin discoverable via §16.15) | HIGH | All WAF protections null. |
3411
+ | TLS 1.0/1.1 supported on prod | MEDIUM | Compliance gap; PCI-DSS forbids TLS 1.0. |
3412
+ | RC4 / 3DES cipher accepted | MEDIUM | NOMORE / SWEET32 attacks. |
3413
+ | Cert about to expire (<30 days) | LOW | Operational risk; not exploitable. |
3414
+ | Self-signed cert on prod | MEDIUM | Trust failure for users. |
3415
+ | Heartbleed (CVE-2014-0160) detected | **CRITICAL** | Memory disclosure including session tokens + keys. |
3416
+ | Public Slack invite link discoverable | HIGH | Anyone joins workspace; full DM/channel access. |
3417
+ | Vendor / supplier / e-procurement portal publicly exposed + breach corpus shows vendor accounts compromised | **HIGH** | Vendor impersonation + procurement fraud (BEC vector); regulatory exposure if PII/payment data flows. |
3418
+ | Job-application / careers portal collects PII over plain HTTP (no TLS) | **HIGH** | Cleartext PII at scale; regulatory exposure under GDPR / CCPA / India DPDP Act / LGPD. |
3419
+ | Decommissioned legacy mail (NXDOMAIN today) + breach corpus has historical employee URLs against it + cloud SSO migration confirmed via autodiscover IPs | **CRITICAL** | Stolen passwords almost certainly survived migration via reuse; SSO_EXPOSURE escalates regardless of the legacy host being dead. |
3420
+ | Public-facing intranet (`intranet.<domain>` resolves and returns content without VPN) | MEDIUM | Internal-staff portal exposed; often leaks org structure, employee directory, internal apps. |
3421
+ | Staging / preprod / UAT / sandbox subdomain publicly resolvable | MEDIUM | Often weaker auth, debug endpoints, test creds; sometimes mirrors prod data. |
3422
+ | `vpn.<domain>` resolves but vendor + version unknown (passive only) | INFO | Attack surface flag only; escalate to HIGH-CRITICAL after active fingerprint matches a KEV CVE (§16.16). |
3423
+ | DMARC RUA points to a third-party reporting vendor (kdmarc / dmarcian / Valimail / Agari / EasyDMARC) | INFO | Tenant signal only; vendor compromise = DMARC bypass for *all* their customers. |
3424
+
3425
+ ---
3426
+
3427
+ ## 41. LinkedIn Employee Enumeration
3428
+
3429
+ LinkedIn is the highest-signal source for employee enumeration during external red-team work. Use it for: target list generation, role prioritization, email-pattern derivation, pretext development.
3430
+
3431
+ ### 41.1 Search techniques
3432
+
3433
+ **Free LinkedIn (no Sales Navigator):**
3434
+ - People-search by company: `https://www.linkedin.com/search/results/people/?currentCompany=["<company-id>"]`. Get company-id from the company's LinkedIn URL or profile JSON.
3435
+ - Bypass connection-degree filter: search shows 1st/2nd-degree only by default; use Google dorking instead.
3436
+
3437
+ **Google dork for LinkedIn employee enum:**
3438
+ ```
3439
+ site:linkedin.com/in "<company name>"
3440
+ site:linkedin.com/in "<company name>" "engineer" # role filter
3441
+ site:linkedin.com/in "<company name>" "<location>" # location filter
3442
+ site:linkedin.com/in "<company name>" -inurl:/posts
3443
+ ```
3444
+
3445
+ **Bing/DuckDuckGo equivalents:** these sometimes return different result sets, so take the union of results across engines.
3446
+
3447
+ **LinkedIn Sales Navigator (paid):**
3448
+ - Most efficient if available. Lead lists by company × role × seniority. Export CSV.
3449
+
3450
+ **Tools:**
3451
+ - **theHarvester** with `-b linkedin` source (uses search-engine-driven enum).
3452
+ - **CrossLinked** — `https://github.com/m8r0wn/CrossLinked` — CLI tool that does the LinkedIn dorking.
3453
+ - **LinkedInDumper** / **Linkook** — open-source enum tools (verify currency; they break frequently).
3454
+ - **PhantomBuster** / **Apollo.io** / **RocketReach** / **Hunter.io Email Finder** — paid SaaS that does the enum + email derivation in one workflow.
3455
+
3456
+ ### 41.2 Role inference for prioritization
3457
+
3458
+ For each enumerated employee, capture:
3459
+ - **Name** (canonical form: First Last; remove suffixes like "PMP", "PhD" for email-pattern matching).
3460
+ - **Job title** (raw + normalized to a role tier).
3461
+ - **Tenure** (years at company; longer = more access typically).
3462
+ - **Location** (city / region; informs phishing time-of-day).
3463
+ - **Recent activity** (posts, comments, articles — informs pretext).
3464
+
3465
+ **Role priority for breach lookup + phishing target list:**
3466
+
3467
+ | Role tier | Examples | Why |
3468
+ |---|---|---|
3469
+ | **P0** | CEO, CFO, CTO, CISO, CIO, COO, GC, CRO | Exec accounts; BEC + finance + legal authority. |
3470
+ | **P1** | VP / Director of IT / Security / Engineering / Finance / HR | Privileged tool access; reset workflows. |
3471
+ | **P2** | DevOps, SRE, Platform, Security Engineer, DBA | GitHub / cloud / CI access; secrets in their accounts. |
3472
+ | **P3** | Software Engineer, Architect, Senior Developer | Code + occasional cloud access. |
3473
+ | **P4** | Sales, Marketing, HR, Finance Analyst, Customer Support | SaaS access (Salesforce, HubSpot, Workday); BEC enabler. |
3474
+ | **P5** | Generic individual contributor, intern, contractor | Lowest single-account value but breadth matters. |
3475
+
3476
+ ### 41.3 Email-pattern derivation from confirmed names
3477
+
3478
+ For each captured name, derive candidate emails using §11 templates. Cross-reference against:
3479
+ - Hunter.io `domain-search` to confirm pattern.
3480
+ - Breach corpus (HudsonRock + HIBP + DeHashed + IntelX) to find matches.
3481
+
3482
+ ### 41.4 Sock-puppet considerations
3483
+
3484
+ - **Never connect from the corporate persona.** LinkedIn shows "viewed your profile" notifications.
3485
+ - **Use a sock puppet** with a plausible profile (5+ years built history, similar industry, mutual connections to throw off correlation). Tools: persona-builder workflows.
3486
+ - **LinkedIn "private mode" (anonymous viewing)** — toggle in settings; reduces one signal but Sales Navigator can still see anonymized "someone viewed your profile."
3487
+ - **Connection requests are detectable.** Don't send any during recon.
3488
+ - **Profile views accumulate suspicion** if you view 100+ employees of one company in a day. Throttle: <20/day per persona.
3489
+
3490
+ ### 41.5 Output
3491
+
3492
+ Per discovered employee:
3493
+ ```
3494
+ Person:
+   name: "Alice Doe"
+   title: "Senior DevOps Engineer"
+   role_tier: P2
+   company: "Acme Corp"
+   location: "Boston, MA"
+   linkedin_url: https://www.linkedin.com/in/alicedoe
+   derived_emails:
+     - alice.doe@acme.com (TENTATIVE)
+     - adoe@acme.com (TENTATIVE)
+     - alice@acme.com (TENTATIVE)
+   breach_hits:
+     - alice.doe@acme.com (HudsonRock; cleartext password redacted; FIRM)
+   pretext_hooks:
+     - "DevOps tooling vendor evaluation" (recent posts)
+     - "Boston DevOps Days speaker" (conference activity)
3510
+ ```
3511
+
3512
+ ---
3513
+
3514
+ ## 42. Job Posting Tech-Stack Analysis
3515
+
3516
+ Job postings reveal the target's internal tech stack with surprising precision. They are free and public, and they often name the exact vendors.
3517
+
3518
+ ### 42.1 Sources
3519
+
3520
+ | Platform | URL | Notes |
3521
+ |---|---|---|
3522
+ | LinkedIn Jobs | `https://www.linkedin.com/jobs/search/?keywords=&f_C=<company-id>` | Most current; requires a LinkedIn account. |
3523
+ | Indeed | `https://www.indeed.com/cmp/<company>` | Company page with job feed. |
3524
+ | Glassdoor | `https://www.glassdoor.com/Jobs/<company>-Jobs-E<id>.htm` | Plus salary data + employee reviews. |
3525
+ | Lever (ATS) | `https://jobs.lever.co/<company>` | Direct ATS — full job descriptions. |
3526
+ | Greenhouse (ATS) | `https://boards.greenhouse.io/<company>` | Direct ATS. |
3527
+ | Workable (ATS) | `https://apply.workable.com/<company>/` | Direct ATS. |
3528
+ | AshbyHQ (ATS) | `https://jobs.ashbyhq.com/<company>` | Direct ATS. |
3529
+ | AngelList / Wellfound | `https://wellfound.com/company/<company>/jobs` | Startup-focused. |
3530
+ | BuiltIn | `https://builtin.com/companies/view/<company>` | Tech-focused. |
3531
+ | Stack Overflow Jobs | (deprecated 2022 but archive available) | Historical tech-stack data. |
3532
+ | Company careers page | `https://careers.<target>.com` or `https://<target>.com/careers` | Direct source; sometimes more detail than ATS. |
3533
+
3534
+ ### 42.2 What to extract
3535
+
3536
+ For each job posting, harvest:
3537
+ - **Required technologies** ("must have experience with X, Y, Z") → confirmed in-use.
3538
+ - **Nice-to-have technologies** → likely in use but maybe in transition.
3539
+ - **Vendor names** (Workday, Salesforce, Snowflake, Databricks, Datadog, etc.) → SaaS tenants.
3540
+ - **Internal tool / project codenames** (often slip into "you'll work on Project Aurora") → recon vocabulary.
3541
+ - **Team size hints** ("part of a 12-person platform team") → org-structure intel.
3542
+ - **Office locations** ("hybrid 3 days in Boston office") → physical recon.
3543
+ - **Cloud + on-prem ratio hints** ("migrating from on-prem to AWS") → posture intel.
3544
+ - **Compliance frameworks mentioned** (SOC2, FedRAMP, HIPAA, PCI) → defensive priorities + reporting context.
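A minimal sketch of the extraction step, assuming the posting text has already been saved locally. The vendor vocabulary, the `NICE_MARKER` heuristic, and the `extract_tech` helper are illustrative assumptions rather than part of any tool named in this section. The required versus nice-to-have split is crude: anything mentioned before the first "nice to have"-style phrase counts as required.

```python
import re

# Hypothetical starter vocabulary; extend it per engagement.
KNOWN_VENDORS = ["Workday", "Salesforce", "Snowflake", "Databricks", "Datadog"]

# Phrases that usually open the optional-skills part of a posting (assumed heuristic).
NICE_MARKER = re.compile(r"nice[- ]to[- ]have|preferred|bonus", re.IGNORECASE)


def extract_tech(posting_text: str) -> list[dict]:
    """Return one record per known vendor mentioned in the posting."""
    marker = NICE_MARKER.search(posting_text)
    cutoff = marker.start() if marker else len(posting_text)
    records = []
    for vendor in KNOWN_VENDORS:
        hit = re.search(rf"\b{re.escape(vendor)}\b", posting_text, re.IGNORECASE)
        if hit is None:
            continue
        records.append({
            "product": vendor,
            # Mentions before the marker are treated as required skills.
            "required_or_nice": "required" if hit.start() < cutoff else "nice",
            "confidence": "TENTATIVE",
        })
    return records


if __name__ == "__main__":
    sample = "Must have experience with Snowflake and Datadog. Nice to have: Databricks."
    for record in extract_tech(sample):
        print(record)
```

Every record stays `TENTATIVE` on purpose: a posting implies use but does not confirm it.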
3545
+
3546
+ ### 42.3 Tooling
3547
+
3548
+ - **scrapy / BeautifulSoup** — custom scrapers per ATS.
3549
+ - **theHarvester** with appropriate sources.
3550
+ - **JobScraper** scripts on GitHub.
3551
+ - **Manual** — for small targets, manual review of 20–30 postings is fast and high-fidelity.
3552
+
3553
+ ### 42.4 Output
3554
+
3555
+ Per discovered tech mention:
3556
+ ```
3557
+ Tech_inferred:
+   product: "Snowflake"
+   category: "data warehouse"
+   source: "linkedin job posting #<id>"
+   source_url: https://www.linkedin.com/jobs/view/...
+   confidence: TENTATIVE (job listing implies in-use; not yet confirmed by direct probe)
+   posting_date: 2026-03-15
+   required_or_nice: "required"
3565
+ ```
3566
+
3567
+ Aggregate to a **target tech-stack profile** that informs:
3568
+ - Which secret patterns to look for (Snowflake-specific keys, Databricks tokens).
3569
+ - Which SaaS tenants to fingerprint (Snowflake account URL pattern).
3570
+ - Which vendor-product fingerprints to probe (Snowflake DSN paths in JS).
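The aggregation step could be sketched as follows, assuming each `Tech_inferred` record has been loaded as a dict with the field names shown in the output block. The `build_profile` function and the shape of its return value are assumptions made for illustration.

```python
from collections import defaultdict


def build_profile(records: list[dict]) -> dict[str, dict]:
    """Collapse per-posting Tech_inferred records into one entry per product."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["product"]].append(rec)

    profile = {}
    for product, recs in grouped.items():
        profile[product] = {
            "category": recs[0]["category"],
            "mentions": len(recs),
            # A "required" mention in any posting outweighs "nice".
            "required": any(r["required_or_nice"] == "required" for r in recs),
            # ISO 8601 dates compare correctly as plain strings.
            "latest_posting": max(r["posting_date"] for r in recs),
            "sources": sorted({r["source_url"] for r in recs}),
        }
    return profile
```

Keeping `latest_posting` makes it easy to discount products that only appear in old postings.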
3571
+
3572
+ ---
3573
+
3574
+ ## 43. Slack / Discord / Telegram Workspace Discovery
3575
+
3576
+ ### 43.1 Slack
3577
+
3578
+ - **Public workspace search** (limited; Slack once offered one but has since deprecated it):
3579
+ - **Slofile** (third-party): `https://slofile.com/` — community Slack workspace directory.
3580
+ - **Slacklist** / **Slack Communities** — community-curated lists.
3581
+ - **Invite-link enumeration** — Slack invite URLs follow `https://join.slack.com/t/<workspace-slug>/shared_invite/<token>`. Common discovery:
3582
+ - Google: `site:join.slack.com "{target}"` or `inurl:slack.com inurl:shared_invite "{target}"`.
3583
+ - GitHub: `"join.slack.com/t/<target-stem>"` filename:README.
3584
+ - Twitter/X / Reddit: search for shared invite links.
3585
+ - **Confirm workspace exists**: request `https://<slug>.slack.com/api/auth.test`. An authenticated session gets workspace metadata back; without a session, the response still differs depending on whether the workspace exists.
3586
+ - **High-value finding**: any open invite link that bypasses the target's normal member-approval flow → operator can join workspace without authorization → MEDIUM/HIGH finding (depending on what's in the workspace).
3587
+
3588
+ ### 43.2 Discord
3589
+
3590
+ - **Discord server discovery** is harder (no central public directory).
3591
+ - **DiscordServers.com** — third-party directory.
3592
+ - **Discord.me** / **Top.gg** — community directories.
3593
+ - Google: `site:discord.gg "{target}"` or `site:discord.com "{target}"`.
3594
+ - **Confirm server**: invite URLs `https://discord.gg/<token>` resolve to a JSON via `https://discord.com/api/v9/invites/<token>?with_counts=true`. Returns server name, ID, member count, channel info.
3595
+ - **Bot enumeration**: if you find a bot token (catalog §17 row 47), call `GET /users/@me` for the bot identity and `GET /users/@me/guilds` for the servers it has joined (read-only check).
3596
+
3597
+ ### 43.3 Telegram
3598
+
3599
+ Already covered in §38. Quick reference:
3600
+ - TGStat — channel analytics + search.
3601
+ - Telemetr — channel growth + overlaps.
3602
+ - Combot — group analytics.
3603
+ - View public channels: `https://t.me/s/<channel>`.
3604
+ - Invite link enum: search Google `site:t.me "{target}"`.
3605
+
3606
+ ### 43.4 Microsoft Teams (federation)
3607
+
3608
+ - See companion methodology skill §11.10.
3609
+ - Federation status check via Microsoft Graph (auth-required).
3610
+ - With the open-federation default, any external tenant can look up and message the target's users by `<user>@<target>` address.
3611
+
3612
+ ### 43.5 Mattermost / Rocket.Chat / self-hosted
3613
+
3614
+ - `https://mattermost.<target>.com` or `chat.<target>` patterns.
3615
+ - Open registration check: probe `/signup` page; if accessible without invite → anyone joins.
3616
+ - Check version disclosure (`/api/v4/system/ping`) for known CVEs.
3617
+
3618
+ ---
3619
+
3620
+ ## 44. Package Registry Leak Hunting
3621
+
3622
+ Public package registries (npm, PyPI, RubyGems, Docker Hub, etc.) often contain inadvertent secrets in published packages.
3623
+
3624
+ ### 44.1 npm
3625
+
3626
+ - **Search packages by org / scope:**
3627
+ ```bash
3628
+ npm search "<target-keyword>"
3629
+ npm view @<scope>/<package-name>
3630
+ ```
3631
+ - **List org's packages:** `https://www.npmjs.com/org/<org>` or `https://registry.npmjs.org/-/org/<org>/package`.
3632
+ - **Per-package historical versions:** `https://registry.npmjs.org/<package>` — JSON with all versions.
3633
+ - **Tarball download for scan:**
3634
+ ```bash
3635
+ npm pack <package>@<version>
3636
+ tar -xzf <package>-<version>.tgz
+ # Scoped packages: @scope/name is written as scope-name-<version>.tgz
3637
+ # Run secret catalog (§17) on extracted files
3638
+ ```
3639
+ - **Common leaks:** `.env` files included in published tarball, `package.json` `scripts` references to internal CI secrets, hardcoded API keys in `dist/` builds.
3640
+
3641
+ ### 44.2 PyPI
3642
+
3643
+ - **Search packages:** `https://pypi.org/search/?q=<target>`.
3644
+ - **Per-package metadata + history:** `https://pypi.org/pypi/<package>/json`.
3645
+ - **Download wheel/sdist for scan:**
3646
+ ```bash
3647
+ pip download <package>==<version> --no-deps -d /tmp/pkg
3648
+ unzip /tmp/pkg/*.whl -d /tmp/pkg/extracted
+ # If only an sdist (.tar.gz) was published, extract it with tar -xzf instead
3649
+ # Run secret catalog
3650
+ ```
3651
+ - **Common leaks:** `setup.py` with hardcoded URLs, embedded test fixtures with real credentials, accidentally-included `.pypirc` files.
3652
+
3653
+ ### 44.3 RubyGems
3654
+
3655
+ - **Search:** `https://rubygems.org/search?query=<target>`.
3656
+ - **Per-gem metadata:** `https://rubygems.org/api/v1/gems/<gem-name>.json`.
3657
+ - **Download:**
3658
+ ```bash
3659
+ gem fetch <gem-name>
3660
+ gem unpack <gem-name>-<version>.gem
3661
+ ```
3662
+
3663
+ ### 44.4 Cargo (Rust crates)
3664
+
3665
+ - **Search:** `https://crates.io/search?q=<target>`.
3666
+ - **Per-crate metadata:** `https://crates.io/api/v1/crates/<crate-name>`.
3667
+
3668
+ ### 44.5 Packagist (PHP / Composer)
3669
+
3670
+ - **Search:** `https://packagist.org/search/?q=<target>`.
3671
+ - **Per-package metadata:** `https://packagist.org/packages/<vendor>/<package>.json`.
3672
+
3673
+ ### 44.6 NuGet (.NET)
3674
+
3675
+ - **Search:** `https://www.nuget.org/packages?q=<target>`.
3676
+
3677
+ ### 44.7 Maven Central (Java)
3678
+
3679
+ - **Search:** `https://search.maven.org/?q=<target>`.
3680
+
3681
+ ### 44.8 Docker Hub / Quay / GHCR / ECR Public
3682
+
3683
+ Already covered in §16.18; worth noting for completeness as part of registry-sweep workflow.
3684
+
3685
+ ### 44.9 Workflow
3686
+
3687
+ For each registry, for each candidate package owned-by-target:
3688
+ 1. List all historical versions (often `<package>@1.0.0` was clean but `<package>@0.9.0` had a leaked key).
3689
+ 2. Download each version's archive.
3690
+ 3. Extract; run secret catalog (§17) over all files.
3691
+ 4. Note `.env`, `package.json`/`setup.py`/`Cargo.toml` for hardcoded values.
3692
+ 5. For Docker images: scan each layer (use `dive` or `skopeo` + `docker save` + extract layers).
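Step 1 of the workflow can be sketched for npm as below. It assumes the JSON document from `https://registry.npmjs.org/<package>` has already been fetched and parsed; the `list_tarballs` helper and the `acme-utils` sample are hypothetical. The code reads only the `versions` and `dist.tarball` fields and makes no network calls.

```python
def list_tarballs(packument: dict) -> list[tuple[str, str]]:
    """Return (version, tarball_url) pairs from an npm registry metadata document."""
    pairs = []
    for version, meta in packument.get("versions", {}).items():
        tarball = meta.get("dist", {}).get("tarball")
        if tarball:
            pairs.append((version, tarball))
    return pairs


if __name__ == "__main__":
    # Hypothetical, trimmed-down metadata for an imaginary package.
    sample = {
        "name": "acme-utils",
        "versions": {
            "0.9.0": {"dist": {"tarball": "https://registry.npmjs.org/acme-utils/-/acme-utils-0.9.0.tgz"}},
            "1.0.0": {"dist": {"tarball": "https://registry.npmjs.org/acme-utils/-/acme-utils-1.0.0.tgz"}},
        },
    }
    for version, url in list_tarballs(sample):
        print(version, url)
```

Iterating every version, not just the latest, is the point: older releases are where leaked values tend to survive.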
3693
+
3694
+ ### 44.10 Typosquat surveillance
3695
+
3696
+ For every published package the target owns, generate typosquat candidates (similar names, common substitutions) and check whether they're already taken by attackers (supply-chain attack surface).
3697
+
3698
+ ```bash
3699
+ # Example: target package "acme-utils"
3700
+ # Candidates: acme-util, acmeutils, acme_utils, acme.utils, ac-me-utils, etc.
3701
+ for candidate in acme-util acmeutils acme_utils acme.utils ac-me-utils; do
3702
+     npm view "$candidate" 2>&1 | head -3
3703
+ done
3704
+ ```
3705
+
3706
+ If a candidate is registered to a non-target party → MEDIUM finding (typosquat, possible supply-chain attack vector).
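Candidate generation can be scripted rather than typed by hand. The sketch below covers three simple transformations (separator swaps, single-character omission, adjacent transposition); the `typosquat_candidates` name is an assumption, and the output is neither exhaustive nor checked against each registry's naming rules.

```python
def typosquat_candidates(name: str) -> set[str]:
    """Generate simple look-alike names for a package name (not exhaustive)."""
    candidates = set()
    # Separator swaps and removal: acme-utils -> acme_utils, acme.utils, acmeutils
    for sep in ("-", "_", "."):
        if sep in name:
            for other in ("-", "_", ".", ""):
                candidates.add(name.replace(sep, other))
    # Single-character omission: acme-utils -> acme-util, cme-utils, ...
    for i in range(len(name)):
        candidates.add(name[:i] + name[i + 1:])
    # Adjacent-character transposition: acme-utils -> amce-utils, ...
    for i in range(len(name) - 1):
        candidates.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # The original name and the empty string are not candidates.
    candidates.discard(name)
    candidates.discard("")
    return candidates


if __name__ == "__main__":
    for candidate in sorted(typosquat_candidates("acme-utils")):
        print(candidate)
```

The printed list can feed the `npm view` loop above or the equivalent lookup on another registry.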
3707
+
3708
+ ---
3709
+
3710
+ ## 45. Sat Imagery for Physical Recon
3711
+
3712
+ For engagements that include a physical-touch component (badge access, tailgating, dumpster diving, on-site network), public imagery helps scout the target.
3713
+
3714
+ ### 45.1 Sat imagery sources
3715
+
3716
+ | Source | URL | Notes |
3717
+ |---|---|---|
3718
+ | **Google Earth Pro** | desktop app | Historical timeline; high resolution (sub-meter) for major cities. |
3719
+ | **Google Maps** | maps.google.com | Current; satellite layer; street view inside building lobbies sometimes. |
3720
+ | **Bing Maps Bird's Eye** | bing.com/maps | Oblique/45-degree imagery for many regions; sometimes shows building facades better than top-down. |
3721
+ | **Apple Maps Look Around** | (iOS / Mac) | Street-level; 3D in major cities. |
3722
+ | **Yandex Maps Panorama** | yandex.com/maps | Russia + global; sometimes higher-resolution street-level than Google. |
3723
+ | **NearMap** (paid) | nearmap.com | Highest-resolution commercial; updated frequently in served regions (US/AU/NZ/CA mostly). |
3724
+ | **Maxar / Planet Labs** (paid) | maxar.com / planet.com | Tasking + recent imagery. |
3725
+ | **Sentinel Hub EO Browser** | apps.sentinel-hub.com | Free Sentinel-2 (10m); good for change detection. |
3726
+ | **NASA Worldview** | worldview.earthdata.nasa.gov | Free; multiple sensors. |
3727
+ | **Esri World Imagery Wayback** | livingatlas.arcgis.com/wayback/ | Historical satellite imagery archive. |
3728
+ | **OpenStreetMap** | openstreetmap.org | Crowd-sourced map data with building outlines. |
3729
+
3730
+ ### 45.2 What to extract for physical recon
3731
+
3732
+ - **Building entrance count + locations** — main entrance, employee entrances, loading docks, fire exits.
3733
+ - **Parking lot ingress / egress** — single guarded entry vs open lot.
3734
+ - **Fence lines + camera locations** — physical perimeter.
3735
+ - **HVAC / utility access** — roof access, service entries.
3736
+ - **Adjacent occupants** — neighboring tenants in same building / business park.
3737
+ - **Vehicle types in lot** — proxy for executive presence + employee count.
3738
+ - **Smoking area locations** — common social-engineering staging area.
3739
+
3740
+ ### 45.3 OSINT-derived physical intel beyond satellites
3741
+
3742
+ - **LinkedIn employee photos** — badge templates often visible in profile photos taken at the office.
3743
+ - **Glassdoor "office tour" photos** — employees post interior photos.
3744
+ - **Indeed / Glassdoor reviews** — sometimes describe security culture ("loose badge enforcement", "tailgating common").
3745
+ - **Instagram geotagged photos** — at the office address; reveals interior layout, badge designs, kitchen / common-area locations.
3746
+ - **Public press releases** — often contain "ribbon cutting" photos of new offices showing layout + executive faces.
3747
+ - **Conference talks by IT/security staff** — sometimes describe physical security setup.
3748
+ - **Meetup / workshop event listings** — at the target's office; may include photos.
3749
+
3750
+ ### 45.4 Vehicle / fleet intel
3751
+
3752
+ - **License plates** in LinkedIn/Instagram backgrounds — sometimes correlates to specific exec.
3753
+ - **Company-branded vehicles** in sat imagery — fleet count + location.
3754
+ - **Helicopter pad** / **executive parking** — clue to senior-leadership routine.
3755
+
3756
+ ### 45.5 Discipline
3757
+
3758
+ - Document that imagery + photos are public-source.
3759
+ - Don't trespass for "verification" — physical recon during OSINT phase = look only.
3760
+ - Note imagery date — buildings change.
3761
+
3762
+ ---
3763
+
3764
+ ## 46. Tooling Quick-Install
3765
+
3766
+ One-liner installs for the most-used external recon tools. All assume Linux or macOS with Go, Python, and git installed; a few entries also need `cargo` or `npm`.
3767
+
3768
+ ### 46.1 Subdomain enumeration
3769
+
3770
+ ```bash
3771
+ # Subfinder (passive, fast)
3772
+ go install github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
3773
+
3774
+ # Amass (thorough, slow)
3775
+ go install github.com/owasp-amass/amass/v4/...@master
3776
+
3777
+ # Assetfinder
3778
+ go install github.com/tomnomnom/assetfinder@latest
3779
+
3780
+ # DNSx (resolution + brute)
3781
+ go install github.com/projectdiscovery/dnsx/cmd/dnsx@latest
3782
+
3783
+ # Puredns (brute-force with wildcard handling)
3784
+ go install github.com/d3mondev/puredns/v2@latest
3785
+ ```
3786
+
3787
+ ### 46.2 HTTP probing & enrichment
3788
+
3789
+ ```bash
3790
+ # httpx (tech-detect, status, JARM, favicon)
3791
+ go install github.com/projectdiscovery/httpx/cmd/httpx@latest
3792
+
3793
+ # Gowitness (screenshots)
3794
+ go install github.com/sensepost/gowitness@latest
3795
+
3796
+ # Aquatone (screenshots + clustering)
3797
+ go install github.com/michenriksen/aquatone@latest
3798
+ ```
3799
+
3800
+ ### 46.3 Vulnerability scanning
3801
+
3802
+ ```bash
3803
+ # Nuclei (template scanner)
3804
+ go install github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest
3805
+ nuclei -ut # update templates
3806
+
3807
+ # Naabu (port scan)
3808
+ go install github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
3809
+
3810
+ # Masscan (fast port scan; requires sudo)
3811
+ git clone https://github.com/robertdavidgraham/masscan && cd masscan && make
3812
+ ```
3813
+
3814
+ ### 46.4 Content discovery
3815
+
3816
+ ```bash
3817
+ # Ffuf (fuzzer / dirbuster)
3818
+ go install github.com/ffuf/ffuf/v2@latest
3819
+
3820
+ # Gobuster
3821
+ go install github.com/OJ/gobuster/v3@latest
3822
+
3823
+ # Feroxbuster (recursive content disco)
3824
+ cargo install feroxbuster
3825
+ ```
3826
+
3827
+ ### 46.5 JS / endpoint extraction
3828
+
3829
+ ```bash
3830
+ # Katana (crawler)
3831
+ go install github.com/projectdiscovery/katana/cmd/katana@latest
3832
+
3833
+ # GoSpider
3834
+ go install github.com/jaeles-project/gospider@latest
3835
+
3836
+ # LinkFinder (JS endpoint regex)
3837
+ git clone https://github.com/GerbenJavado/LinkFinder && cd LinkFinder && pip install -r requirements.txt
3838
+
3839
+ # Subjs (extract JS URLs from HTML)
3840
+ go install github.com/lc/subjs@latest
3841
+ ```
3842
+
3843
+ ### 46.6 Wayback / archive
3844
+
3845
+ ```bash
3846
+ # gau (get all urls from Wayback + others)
3847
+ go install github.com/lc/gau/v2/cmd/gau@latest
3848
+
3849
+ # Waybackurls
3850
+ go install github.com/tomnomnom/waybackurls@latest
3851
+ ```
3852
+
3853
+ ### 46.7 Cloud / AWS
3854
+
3855
+ ```bash
3856
+ # AWS CLI
3857
+ pip install awscli
3858
+ # or: brew install awscli
3859
+
3860
+ # Cloud_enum (S3/Azure/GCP enum)
3861
+ git clone https://github.com/initstring/cloud_enum && cd cloud_enum && pip install -r requirements.txt
3862
+
3863
+ # S3Scanner
3864
+ pip install s3scanner
3865
+
3866
+ # CloudSploit
3867
+ git clone https://github.com/aquasecurity/cloudsploit && cd cloudsploit && npm install
3868
+ ```
3869
+
3870
+ ### 46.8 Identity / SSO
3871
+
3872
+ ```bash
3873
+ # o365creeper / o365enum
3874
+ git clone https://github.com/gremwell/o365enum
3875
+
3876
+ # CredMaster (per-protocol auth probe)
3877
+ git clone https://github.com/knavesec/CredMaster
3878
+ ```
3879
+
3880
+ ### 46.9 Mobile
3881
+
3882
+ ```bash
3883
+ # google-play-scraper (Python)
3884
+ pip install google-play-scraper
3885
+
3886
+ # androguard (APK static analysis)
3887
+ pip install androguard
3888
+ # or: brew install androguard
3889
+
3890
+ # apkleaks (secret scan in APK)
3891
+ pip install apkleaks
3892
+ ```
3893
+
3894
+ ### 46.10 TLS / cert
3895
+
3896
+ ```bash
3897
+ # sslyze
3898
+ pip install sslyze
3899
+
3900
+ # testssl.sh
3901
+ git clone --depth 1 https://github.com/drwetter/testssl.sh.git
3902
+
3903
+ # JARM
3904
+ pip install pyjarm
3905
+
3906
+ # Cert-spotter / certgraph
3907
+ go install github.com/lanrat/certgraph@latest
3908
+ ```
3909
+
3910
+ ### 46.11 Misc utilities
3911
+
3912
+ ```bash
3913
+ # Anew (line-dedup that streams)
3914
+ go install github.com/tomnomnom/anew@latest
3915
+
3916
+ # Gf (regex-based grep templates)
3917
+ go install github.com/tomnomnom/gf@latest
3918
+
3919
+ # Hakrawler (web crawler)
3920
+ go install github.com/hakluke/hakrawler@latest
3921
+
3922
+ # Trufflehog (secret scanner)
3923
+ go install github.com/trufflesecurity/trufflehog@latest
3924
+
3925
+ # Gitleaks
3926
+ go install github.com/zricethezav/gitleaks/v8@latest
3927
+
3928
+ # jq (JSON parsing)
3929
+ sudo apt install jq # or brew install jq
3930
+ ```
3931
+
3932
+ ### 46.12 Frameworks / orchestration
3933
+
3934
+ ```bash
3935
+ # ProjectDiscovery's "PDTM" (manages the full PD toolkit)
3936
+ go install -v github.com/projectdiscovery/pdtm/cmd/pdtm@latest
3937
+ pdtm -install-all
3938
+
3939
+ # reconftw (scripted recon framework)
3940
+ git clone https://github.com/six2dez/reconftw && cd reconftw && ./install.sh
3941
+
3942
+ # Axiom (distributed recon on cloud nodes)
3943
+ git clone https://github.com/pry0cc/axiom && cd axiom && ./interact/axiom-configure
3944
+ ```
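After installing, a quick check confirms the binaries are reachable. This is a sketch that assumes the tools land on `PATH` under the names listed in `EXPECTED_TOOLS`. That list is illustrative, and `go install` places binaries in `$(go env GOPATH)/bin`, which must itself be on `PATH`.

```python
import shutil

# Binary names as they normally appear on PATH after the installs above (assumed).
EXPECTED_TOOLS = ["subfinder", "dnsx", "httpx", "nuclei", "naabu", "ffuf", "katana", "gau", "jq"]


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(EXPECTED_TOOLS)
    print("missing:", ", ".join(missing) if missing else "none")
```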
3945
+
3946
+ ---
3947
+
3948
+ ## 47. Sector-Specific Recon Notes
3949
+
3950
+ Most recon generalizes; some sectors have unique attack-surface elements worth flagging.
3951
+
3952
+ ### 47.1 Healthcare
3953
+
3954
+ - **DICOM** (medical imaging) — port 104 (well-known) or 11112 (registered alternative); sometimes 4242 (testing).
3955
+ - **HL7 v2** (clinical messaging) — port 2575 (TCP, often plaintext).
3956
+ - **HL7 FHIR** (modern REST API) — typically `/fhir/R4/<resource>` paths; OAuth / SMART-on-FHIR auth posture varies wildly.
3957
+ - **PACS / RIS / EHR systems** — Epic (`*.epic.com` SaaS), Cerner/Oracle Health, Allscripts/Veradigm, Athenahealth, NextGen, Meditech, eClinicalWorks. Each has known CVE history.
3958
+ - **Searches:** `site:{domain} ("EHR" OR "PACS" OR "PHI" OR "HIPAA")`, `intitle:"Epic Systems" "{target}"`.
3959
+ - **Severity escalation:** any PHI exposure → CRITICAL (regulatory + reputational); HL7/DICOM open without auth → CRITICAL.
3960
+
3961
+ ### 47.2 Finance
3962
+
3963
+ - **SWIFT terminals** — typically internal-only; if external-facing, CRITICAL. Look for SWIFT Alliance Web Platform.
3964
+ - **FIX protocol** (electronic trading) — port 9876 (common); cleartext.
3965
+ - **Bloomberg terminals** — typically VDI; check for `bloomberg.com`-related auth surfaces.
3966
+ - **Trading platform vendors** — Fidessa, Charles River, Eze Software, Aladdin (BlackRock).
3967
+ - **Banking middleware** — Temenos T24, Finacle (Infosys), FIS, Jack Henry, Fiserv. Each has known CVE history.
3968
+ - **Searches:** `site:{domain} ("PCI" OR "SOX" OR "GLBA" OR "MAS")`, `intitle:"Temenos" "{target}"`.
3969
+ - **Severity escalation:** any account/balance data exposure → CRITICAL; SWIFT exposure → CRITICAL; trade-execution surface exposure → CRITICAL.
3970
+
3971
+ ### 47.3 ICS / SCADA / OT
3972
+
3973
+ > **Caution:** ICS/SCADA assets often run on legacy systems where even lightweight active scanning can cause disruption. **Do not actively probe ICS without explicit RoE coverage and operator coordination with the OT team.**
3974
+
3975
+ - **Modbus** — port 502 (TCP).
3976
+ - **BACnet** — port 47808 (UDP).
3977
+ - **Siemens S7** — port 102 (ISO-TSAP).
3978
+ - **DNP3** — port 20000 (TCP).
3979
+ - **EtherNet/IP** — port 44818 (TCP).
3980
+ - **Niagara Framework** — port 1911, 4911, 5011, 502.
3981
+ - **Honeywell EBI / Tridium** — varies.
3982
+ - **GE Proficy / iFIX** — varies.
3983
+ - **Common findings:** unauthenticated read access (BACnet point list, Modbus register read), default credentials on HMI panels, public-facing engineering workstations.
3984
+ - **Sources:** Shodan ICS-specific filters (`port:502`, `tag:ics`), Censys, Onyphe.
3985
+ - **Detectability:** medium-to-high; ICS networks often have low background traffic and are heavily monitored.
3986
+
3987
+ ### 47.4 IoT / Consumer / SOHO
3988
+
3989
+ - **MQTT** — port 1883 (cleartext), 8883 (TLS). Topics often readable without auth.
3990
+ - **CoAP** — port 5683 (UDP).
3991
+ - **UPnP / SSDP** — port 1900 (UDP); often discloses internal device map.
3992
+ - **Common router admin patterns:** `/cgi-bin/`, `/setup.cgi`, `/admin/index.html`. Default creds are the norm.
3993
+ - **Camera DVRs / NVRs** — Hikvision, Dahua, Axis. Multiple CVEs.
3994
+ - **Smart-home hubs** — exposed APIs sometimes leak auth tokens.
3995
+
3996
+ ### 47.5 Government
3997
+
3998
+ - **`.gov` and `.mil` domains** require special engagement-scope discipline.
3999
+ - **FedRAMP / FISMA / DoD CMMC** — defensive posture is generally above baseline.
4000
+ - **OSINT data sources:** USAspending.gov and SAM.gov (System for Award Management, which also hosts the contract opportunities formerly published on FBO.gov).
4001
+ - **Common findings:** vendor of record disclosed in public contracts → adjacent-vendor pivot.
4002
+ - **Severity:** as high or higher than commercial; political sensitivity layered on top of technical impact.
4003
+
4004
+ ### 47.6 Maritime / Aviation / Auto
4005
+
4006
+ - **Maritime:** AIS (Automatic Identification System) — vessel positions; tools MarineTraffic, VesselFinder. Engine telemetry sometimes exposed via VSAT.
4007
+ - **Aviation:** ADS-B (already covered §32.3); operator/airline-specific OPS data sometimes exposed.
4008
+ - **Automotive:** OEM telematics backends (Tesla, GM OnStar, etc.) — typically authenticated, but APIs leak via mobile-app reverse engineering.
4009
+
4010
+ ### 47.7 Universal sector caveat
4011
+
4012
+ **Most external recon techniques apply universally.** Sector-specific protocols add attack surface; sector-specific compliance regimes add reporting requirements. Don't assume "healthcare/finance/etc. has different OSINT" — the OSINT is the same; the targeted services differ.
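The ports named in §47.1, §47.3, and §47.4 can be collected into a lookup table for labelling observations that were already gathered from passive sources. This is a sketch; `SECTOR_PORTS` and `tag_ports` are hypothetical names, the table contains only ports listed above, and the code sends no traffic.

```python
# Port -> (sector, protocol) labels taken from the sector notes above.
SECTOR_PORTS = {
    11112: ("healthcare", "DICOM"),
    2575: ("healthcare", "HL7 v2"),
    502: ("ics", "Modbus"),
    47808: ("ics", "BACnet"),
    102: ("ics", "Siemens S7"),
    20000: ("ics", "DNP3"),
    44818: ("ics", "EtherNet/IP"),
    1883: ("iot", "MQTT (cleartext)"),
    8883: ("iot", "MQTT (TLS)"),
    5683: ("iot", "CoAP"),
    1900: ("iot", "UPnP / SSDP"),
}


def tag_ports(open_ports):
    """Label already-collected port observations; sends no traffic itself."""
    return {port: SECTOR_PORTS[port] for port in open_ports if port in SECTOR_PORTS}


if __name__ == "__main__":
    print(tag_ports([80, 443, 502, 1883]))
```

A port match is only a hint about the protocol; any hit in the `ics` group falls under the caution in §47.3 before further action.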
4013
+
4014
+ ---
4015
+
4016
+ ## 48. Runnable Helper — `secret_scan.py`
4017
+
4018
+ Drop-in Python helper that mirrors 29 patterns from the §17 catalog. Pure stdlib, no dependencies. Intended for operator use against captured text.
4019
+
4020
+ ```python
4021
+ #!/usr/bin/env python3
4022
+ """Stdlib-only secret scanner. Mirrors 29 patterns from the §17 catalog.
4023
+
4024
+ Usage:
4025
+ echo "AKIAIOSFODNN7EXAMPLE" | python3 secret_scan.py
4026
+ python3 secret_scan.py file1.txt file2.js dir/
4027
+
4028
+ Output: one JSON object per line: {pattern, severity, category, match, source, line}
4029
+ """
4030
+ import json
4031
+ import os
4032
+ import re
4033
+ import sys
4034
+
4035
+ SEV_CRITICAL = "critical"
4036
+ SEV_HIGH = "high"
4037
+ SEV_MEDIUM = "medium"
4038
+ SEV_LOW = "low"
4039
+
4040
+ PATTERNS = [
4041
+ ("AWS_ACCESS_KEY", SEV_CRITICAL, "aws", r"\b(AKIA|ASIA)[0-9A-Z]{16}\b"),
4042
+ ("AWS_SECRET_TYPED", SEV_CRITICAL, "aws", r"(?i)aws[_\-]?secret[_\-]?access[_\-]?key['\"\s:=]+([A-Za-z0-9/+=]{40})"),
4043
+ ("AWS_SECRET_LOOSE", SEV_HIGH, "aws", r"(?i)aws(.{0,20})?(secret|sk)[\"'=: ]+([0-9a-z/+=]{40})"),
4044
+ ("GCP_SERVICE_ACCOUNT", SEV_CRITICAL, "gcp", r'"type"\s*:\s*"service_account"'),
4045
+ ("GOOGLE_API_KEY", SEV_HIGH, "gcp", r"\bAIza[0-9A-Za-z_\-]{35}\b"),
4046
+ ("GH_PAT_CLASSIC", SEV_CRITICAL, "github", r"\bghp_[A-Za-z0-9]{36}\b"),
4047
+ ("GH_PAT_FINEGRAINED", SEV_CRITICAL, "github", r"\bgithub_pat_[A-Za-z0-9_]{82}\b"),
4048
+ ("GH_OAUTH", SEV_HIGH, "github", r"\bgho_[A-Za-z0-9]{36}\b"),
4049
+ ("GH_S2S", SEV_HIGH, "github", r"\bgh[usr]_[A-Za-z0-9]{36,}\b"),
4050
+ ("STRIPE_LIVE", SEV_CRITICAL, "stripe", r"\bsk_live_[0-9A-Za-z]{24,}\b"),
4051
+ ("STRIPE_TEST", SEV_LOW, "stripe", r"\bsk_test_[0-9A-Za-z]{24,}\b"),
4052
+ ("SLACK_TOKEN", SEV_HIGH, "slack", r"\bxox[abpors]-[0-9A-Za-z\-]{10,48}\b"),
4053
+ ("SLACK_WEBHOOK", SEV_MEDIUM, "slack", r"https://hooks\.slack\.com/services/T[A-Z0-9]+/B[A-Z0-9]+/[A-Za-z0-9]+"),
4054
+ ("SENDGRID", SEV_HIGH, "email_svc", r"\bSG\.[A-Za-z0-9_\-]{22}\.[A-Za-z0-9_\-]{43}\b"),
4055
+ ("MAILGUN_V1", SEV_HIGH, "email_svc", r"\bkey-[0-9a-zA-Z]{32}\b"),
4056
+ ("MAILGUN_LOOSE", SEV_HIGH, "email_svc", r"\bkey-[0-9a-f]{32}\b"),
4057
+ ("TWILIO_API", SEV_HIGH, "twilio", r"\bSK[0-9a-fA-F]{32}\b"),
4058
+ ("TWILIO_SID", SEV_MEDIUM, "twilio", r"\bAC[a-f0-9]{32}\b"),
4059
+ ("TWILIO_AUTH", SEV_HIGH, "twilio", r"(?i)twilio(.{0,20})?(auth|token)[\"'=: ]+([a-f0-9]{32})"),
4060
+ ("HEROKU_API", SEV_MEDIUM, "paas", r"(?i)heroku(.{0,20})?api[\"'=: ]+([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"),
4061
+ ("FIREBASE_URL", SEV_LOW, "firebase", r"\bhttps?://[a-z0-9\-]+\.firebaseio\.com\b"),
4062
+ ("JWT", SEV_MEDIUM, "jwt", r"\beyJ[A-Za-z0-9_\-]{10,}\.eyJ[A-Za-z0-9_\-]{10,}\.[A-Za-z0-9_\-]{10,}\b"),
4063
+ ("BEARER_AUTH", SEV_MEDIUM, "bearer", r"(?i)authorization[\"'=: ]+bearer\s+[A-Za-z0-9._\-]{20,}"),
4064
+ ("BASIC_AUTH_URL", SEV_MEDIUM, "basic_auth", r"https?://[^/\s:@]+:[^/\s:@]+@[^/\s]+"),
4065
+ ("RSA_PRIVKEY", SEV_CRITICAL, "private_key", r"-----BEGIN RSA PRIVATE KEY-----"),
4066
+ ("EC_PRIVKEY", SEV_CRITICAL, "private_key", r"-----BEGIN EC PRIVATE KEY-----"),
4067
+ ("OPENSSH_PRIVKEY", SEV_CRITICAL, "private_key", r"-----BEGIN OPENSSH PRIVATE KEY-----"),
4068
+ ("GENERIC_PRIVKEY", SEV_CRITICAL, "private_key", r"-----BEGIN (DSA |PGP |)PRIVATE KEY-----"),
4069
+ ("GENERIC_API_KEY", SEV_MEDIUM, "generic", r"(?i)(?:api[_\-]?key|apikey|api_secret|access_token|secret[_\-]?token)['\"\s:=]+[\"']([A-Za-z0-9+/=_\-]{24,})[\"']"),
4070
+ ]
4071
+
4072
+ COMPILED = [(n, s, c, re.compile(p)) for (n, s, c, p) in PATTERNS]
4073
+
4074
+ def scan_text(text: str, source: str = "<stdin>"):
+     """Yield one finding dict per pattern match, scanning line by line."""
+     for line_no, line in enumerate(text.splitlines(), start=1):
+         for name, sev, cat, rx in COMPILED:
+             for m in rx.finditer(line):
+                 yield {
+                     "pattern": name,
+                     "severity": sev,
+                     "category": cat,
+                     "match": m.group(0)[:80],  # truncate to avoid huge dumps
+                     "source": source,
+                     "line": line_no,
+                 }
+
+ def scan_path(path: str):
+     """Scan a single file, or recurse into a directory tree."""
+     if os.path.isdir(path):
+         for root, _, files in os.walk(path):
+             for f in files:
+                 p = os.path.join(root, f)
+                 yield from scan_path(p)
+         return
+     try:
+         with open(path, "r", errors="replace") as fh:
+             yield from scan_text(fh.read(), source=path)
+     except OSError:
+         return  # unreadable file: skip it
+
+ def main():
+     if len(sys.argv) > 1:
+         for arg in sys.argv[1:]:
+             for hit in scan_path(arg):
+                 print(json.dumps(hit))
+     else:
+         data = sys.stdin.read()
+         for hit in scan_text(data):
+             print(json.dumps(hit))
+
+ if __name__ == "__main__":
+     main()
4112
+ ```
4113
+
4114
+ Save as `secret_scan.py`, then:
4115
+ ```bash
4116
+ python3 secret_scan.py path/to/repo/ # scan a directory tree
4117
+ python3 secret_scan.py file1 file2 file3 # scan specific files
4118
+ cat my.log | python3 secret_scan.py # pipe stdin
4119
+ ```
4120
+
4121
+ Output is JSONL — one finding per line — drops cleanly into `jq` for filtering or directly into a finding store.
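For operators who prefer to stay in Python, the JSONL can be post-processed as sketched below. The `summarize` and `redact` helpers are illustrative assumptions and not part of `secret_scan.py`; they rely only on the `severity` and `match` fields the scanner emits.

```python
import json
from collections import Counter

SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def summarize(jsonl_text: str) -> dict:
    """Count findings per severity from secret_scan.py JSONL output."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between findings
        counts[json.loads(line)["severity"]] += 1
    # Report only severities that occurred, most severe first.
    return {sev: counts[sev] for sev in SEVERITY_ORDER if counts[sev]}


def redact(match: str, keep: int = 4) -> str:
    """Keep only the first few characters of a matched secret for reporting."""
    return match[:keep] + "*" * max(len(match) - keep, 0)


if __name__ == "__main__":
    sample = '{"severity": "high", "match": "abcdefgh"}\n{"severity": "critical", "match": "ijklmnop"}\n'
    print(summarize(sample))
    print(redact("abcdefgh"))
```

Redacting before findings reach a report keeps cleartext secrets out of deliverables, in line with the "cleartext password redacted" convention used in §41.5.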
4122
+
4123
+ ---
4124
+
4125
+ ## 49. Skill Self-Test
4126
+
4127
+ Drop these prompts into a fresh Claude session to verify the skill loads correctly.
4128
+
4129
+ 1. *"What paths should I probe to find Swagger or OpenAPI specs on a webapp?"* → §16.1.
4130
+ 2. *"Give me the GraphQL introspection query I should POST."* → §16.2.
4131
+ 3. *"What are the high-risk ports to flag from a Shodan scan?"* → §16.3.
4132
+ 4. *"Show me the secret regex catalog."* → §17 (48 patterns) + §48 (runnable Python).
4133
+ 5. *"How do I score an API endpoint by attack interest?"* → §20.
4134
+ 6. *"Validate a leaked Postman API key — what URL?"* → §23.1.
4135
+ 7. *"Give me dorks for pastebin/gist/ghostbin leaks for a target."* → §18.3.
4136
+ 8. *"What endpoints fingerprint a Microsoft Entra tenant?"* → §22.1 + §22.8 for M365 deep.
4137
+ 9. *"How do I score whether a discovered Android app belongs to my target?"* → §21.
4138
+ 10. *"What attack-path hint when I find unauth POST on `/api/users`?"* → §39 (first row).
4139
+ 11. *"Curl one-liner to test for `/actuator/env`."* → §16.13.
4140
+ 12. *"Show me the GraphQL field-suggestion enumeration trick when introspection is disabled."* → §22.9.
4141
+ 13. *"Found a hard-coded JWT in JS. Walk me through full triage."* → §23.12 (JWT workflow).
4142
+ 14. *"Generate cloud bucket candidates for `Shree Cement Limited` with subdomains api/billing/hr."* → §16.8.
4143
+ 15. *"How do I find Microsoft 365 Teams federation status + SharePoint subdomains?"* → §22.8.
4144
+ 16. *"Probe paths for Citrix Netscaler / F5 BIG-IP / Pulse Secure."* → §16.16.
4145
+ 17. *"Find the origin behind Cloudflare on `target.example`."* → §16.15 + companion methodology §27.
4146
+ 18. *"What ports/paths probe for Kubernetes/etcd/kubelet exposure?"* → §16.18.
4147
+ 19. *"Audit `acme.com`'s SPF/DMARC for spoof feasibility."* → §16.14.
4148
+ 20. *"List wordlist sources for subdomain bruteforce + content discovery."* → §27.1.
4149
+ 21. *"Run reverse-DNS sweep across a /22 the target owns."* → §28.5.
4150
+ 22. *"Validate an OpenAI API key without burning quota."* → §23.6 + §23.12.
4151
+ 23. *"Find leaked secrets across npm/PyPI/Docker Hub for the target."* → §44.
4152
+ 24. *"How do I enumerate target employees on LinkedIn for a phishing list?"* → §41.
4153
+ 25. *"What's a Slack invite link enumeration technique?"* → §43.1.
4154
+ 26. *"What's the EPSS score and KEV status for CVE-2024-3400?"* → §29.2.
4155
+ 27. *"What modern AI API keys (Anthropic / OpenAI / HuggingFace / Cloudflare) match catalog patterns?"* → §17 rows 30–48.
4156
+ 28. *"Severity matrix for `android:debuggable=true` on prod app?"* → §40.
4157
+ 29. *"Install commands for the standard recon toolkit (subfinder/httpx/nuclei/etc.)?"* → §46.
4158
+ 30. *"For a healthcare engagement, what additional ports / protocols matter?"* → §47.1.
4159
+ 31. *"Pull HudsonRock breach corpus for `target.com` via direct API (no UI)."* → §15.0.1.
4160
+ 32. *"Run the full §16.14 email security audit from a Windows box (PowerShell)."* → §16.14 PowerShell parallel.
4161
+ 33. *"crt.sh just 502'd. What's the fallback chain?"* → §27.0.1.
4162
+ 34. *"Bulk IP → ASN lookup for 200 IPs without burning bgpview rate limit."* → §28.1 (Cymru bulk).
4163
+ 35. *"Common-prefix subdomain sweep for `target.example` covering vpn / api / staging / portal / intranet."* → §16.24.
4164
+ 36. *"Legacy mail (`mail.<domain>`) is NXDOMAIN today but breach corpus has employee URLs against it. What's the finding?"* → §15.2 legacy-mail-decommissioned pattern.
4165
+ 37. *"Confirm M365 tenancy when MX is wrapped by Mimecast (so MX doesn't reveal underlying mail platform)."* → §22.1 autodiscover IP correlation + §16.22 autodiscover-as-confirmation.
4166
+ 38. *"DMARC RUA points to `kdmarc.com` — what does that tell me?"* → §16.14 DMARC reporting-vendor table.
4167
+ 39. *"SharePoint HEAD probe returns HTTP 200. Does that mean anonymous access is granted?"* → §22.8 (no — tenant exists, not anonymous access; distinguish).
4168
+ 40. *"Wayback `*.js` query returned empty for a brochure-ware site. Pivot?"* → §16.23 legacy-app pivot (.asp / .php / .jsp / .cfm / .aspx).
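The expected-section arrows above lend themselves to a mechanical check. What follows is a minimal sketch, assuming each response has been captured as plain text and that a passing answer cites its section as `§<number>`. The `EXPECTED` mapping (a small subset of the list) and both helper names are hypothetical and not part of the skill.

```python
import re

# Hypothetical subset of the self-test: prompt number -> expected section(s).
EXPECTED = {
    1: ["16.1"],
    4: ["17", "48"],
    11: ["16.13"],
    22: ["23.6", "23.12"],
}


def cites_section(response: str, section: str) -> bool:
    """Return True if `response` cites §<section> as a whole reference.

    The negative lookahead rejects longer references, so "16.1" does not
    match inside "§16.13" or "§16.1.2".
    """
    pattern = r"§\s*" + re.escape(section) + r"(?!\d|\.\d)"
    return re.search(pattern, response) is not None


def missing_sections(prompt_no: int, response: str) -> list[str]:
    """Return the expected sections that `response` fails to cite."""
    return [s for s in EXPECTED[prompt_no] if not cites_section(response, s)]
```

An empty list from `missing_sections` means every expected section was cited. A citation check only shows that the right section was referenced, so the answers still need a human read for correctness.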
4169
+
4170
+ ---
4171
+
4172
+ ## 50. Changelog
4173
+
4174
+ - **v2.1.1 (2026-04-27)**: battle-test gap fixes from a real-engagement smoke run.
+   - §15.0.1: added HudsonRock Cavalier direct-API recipe (curl + PowerShell, full JSON shape, free-tier redaction caveats, rate-limit guidance).
+   - §15.2: expanded with legacy-mail-decommissioned escalation pattern (NXDOMAIN legacy mail + breach corpus + autodiscover-confirmed cloud migration → CRITICAL SSO_EXPOSURE).
+   - §16.14: expanded with DMARC reporting-vendor table (Kratikal kdmarc / dmarcian / Valimail / Agari / EasyDMARC / DMARC Analyzer / Postmark) + full Windows/PowerShell parallel for the entire email security audit + caveat that PS 5.1 `Resolve-DnsName -Type CAA` errors (use PS 7+ or `nslookup -type=CAA`).
+   - §16.22: expanded TXT verification token catalog with 17 new tokens (zscaler-verification, cloudflare-verify, autosect, cisco-site-verification, mscid, _amazonses, salesforce-domain-verification, workday/shopify/klaviyo/mailchimp/hubspot/zendesk/freshworks/intercom/loom/miro/gitlab) + new "Autodiscover-as-confirmation" pattern for M365 detection when MX is wrapped by Mimecast/Proofpoint/Barracuda.
+   - §22.1: added passive Autodiscover IP correlation pattern with Microsoft Exchange Online IP ranges.
+   - §22.8: added clarification that SharePoint HEAD HTTP 200 = tenant exists, NOT anonymous access granted (operators commonly misread).
+   - §16.23: new legacy-app pivot block (when Wayback `*.js` returns empty for brochure-ware sites, pivot to .asp/.php/.jsp/.cfm/.aspx/.json/.xml/.yml/.ini/.conf, with full broad-sweep one-liner).
+   - §16.24: new Common-Prefix Subdomain Sweep, a formalized active prefix-probe technique with 100+ ordered prefix list, PowerShell + bash + puredns recipes, and real-engagement validation note (passive enum misses 20-40% of high-value subdomains; always pair with active prefix probe).
+   - §27.0.1: added crt.sh fallback chain (Censys, CertSpotter, Calidog, Subfinder, OTX, ThreatMiner, URLScan, Anubis-DB) with PowerShell wrapper that retries crt.sh 3× then falls back to Subfinder.
+   - §28.1: added Bulk IP→ASN recipes (Cymru bulk WHOIS, RIPEstat, bgp.tools, IPinfo Lite) + caveat that bgpview.io API has aggressive rate limits unsuitable for bulk.
+   - §40: severity matrix gained 8 rows: vendor procurement portal exposed + breach corpus hits (HIGH), PII-collection portal over plain HTTP (HIGH), decommissioned legacy mail + breach + cloud migration (CRITICAL), public-facing intranet without VPN (MEDIUM), staging/preprod publicly resolvable (MEDIUM), vpn.<domain> resolves but vendor unknown (INFO escalating to HIGH-CRITICAL on KEV match), DMARC RUA → third-party vendor (INFO).
+   - §49: self-test expanded from 30 → 40 prompts targeting all new content.
4175
+ - **v2.1 (2026-04-27)**: comprehensive expansion based on 32-test smoke-test gap analysis.
+   - Added §16.13: copy-paste curl probes for every check.
+   - Added §16.14: email security analysis with SPF/DMARC/DKIM/BIMI/MTA-STS/DNSSEC parsing + SaaS tenant inference.
+   - Added §16.15: origin discovery / CDN bypass via DNS history + cert SAN + favicon hash + JARM + Host-header probe.
+   - Added §16.16: vendor product fingerprints for Citrix/F5/Pulse/Fortinet/PaloAlto/Cisco/VMware/Exchange + KEV CVE associations.
+   - Added §16.17: cloud-native service URL fingerprints (Lambda Function URLs, Cloud Run, Cloud Functions, Azure Functions, Vercel, Netlify, Cloudflare Workers, etc.).
+   - Added §16.18: container & Kubernetes exposure (kubelet, etcd, K8s API, dashboard, Helm Tiller, container registries).
+   - Added §16.19: CI/CD platform exposure (Jenkins deeper, GitLab, GitHub Actions, CircleCI, TeamCity, Argo CD, Spinnaker).
+   - Added §16.20: documentation/wiki leak paths (Notion, Confluence, Trello, Miro, Lucidchart, Figma, ReadTheDocs, GitBook, Slab, Coda, etc.).
+   - Added §16.21: WHOIS/RDAP/historical-WHOIS recipes + reverse-WHOIS pivots.
+   - Added §16.22: DNS record catalog with TXT verification token table → SaaS tenant inference.
+   - Added §16.23: Wayback CDX deep usage with all filter parameters.
+   - Expanded §17 secret catalog from 29 → 48 patterns, adding modern AI API keys (Anthropic, OpenAI legacy + project, HuggingFace), infra (Cloudflare, DigitalOcean), package registries (npm, PyPI, Docker Hub), SaaS (Atlassian, Linear), observability (New Relic, DataDog, Sentry DSN), bot tokens (Discord, Telegram), and ngrok.
+   - Expanded §18 dork corpus from 50+ → 80+ with internal-tool exposure (Splunk/Grafana/Kibana/Argo CD/Sonarqube/Confluence/Jira/GitLab/Gitea), backup-file extensions, and sector-specific dorks (healthcare/finance/gov).
+   - Added §22.8 Microsoft 365 deep enumeration (Teams federation, SharePoint subdomain probe, OneDrive personal-site probe, OAuth client_id discovery, device-code phishing target check, Power Platform).
+   - Added §22.9 GraphQL field-suggestion enumeration recipe + alias batching, query-depth bypass, subscription enumeration, batched-query bypass.
+   - Added §23.5–23.9 read-only validators for Anthropic, OpenAI, npm, Atlassian, DataDog (5 new).
+   - Added §23.12 post-discovery enumeration workflows (AWS IAM enum, GitHub PAT scope/repo enum, Slack workspace enum, JWT full triage with algorithm-confusion + brute-force + none-bypass, Postman PMAK workspace enum, Anthropic + OpenAI usage enum, generic key provenance enum).
+   - Pinned §24 Postman search endpoint with verified shape + DevTools fallback recipe.
+   - Added §27.1 wordlist sources (Assetnote, SecLists, jhaddix, OneListForAll, raft-large-words, fuzzdb, etc.) + size guidance.
+   - Added §28.4 TLS deep audit (sslyze + testssl.sh + nmap + JA3/JA4 + cipher/protocol/cert checks).
+   - Added §28.5 reverse DNS sweep + IPv6 enumeration + BGP route observation.
+   - Added §29.2 vulnerability prioritization data sources (NVD/EPSS/CISA KEV/ExploitDB/Metasploit/InTheWild/OpenCVE/Trickest CVE+POC mapping/OSV.dev/VulnCheck KEV) + bulk prioritization workflow.
+   - Expanded §39 attack-path hints with 15 more templates (open kubelet/etcd, K8s API anonymous, Citrix/F5/vCenter/Cloud Function unauth, npm typosquat, DMARC missing, live AI keys, Slack invite, sourcemap with sourcesContent).
+   - Expanded §40 severity matrix with 30 more worked examples covering Kubernetes/container, vendor products with KEV CVEs, M365/cloud-native, CI/CD misconfig, documentation leaks, email-security gaps, AI/package-registry credentials, TLS issues.
+   - Added §41 LinkedIn employee enumeration tradecraft (search techniques + role inference + email-pattern derivation + sock-puppet considerations).
+   - Added §42 job posting tech-stack analysis (sources + extraction + tooling).
+   - Added §43 Slack/Discord/Telegram/Mattermost workspace discovery.
+   - Added §44 package registry leak hunting (npm/PyPI/RubyGems/Cargo/Packagist/NuGet/Maven Central + workflow + typosquat surveillance).
+   - Added §45 satellite imagery for physical recon (sources + extraction + LinkedIn/Glassdoor/Instagram/conference intel + vehicle/fleet intel).
+   - Added §46 tooling quick-install (subdomain, HTTP probing, vuln scanning, content discovery, JS extraction, Wayback, cloud, identity, mobile, TLS, utilities, frameworks).
+   - Added §47 sector-specific recon notes (healthcare DICOM/HL7/FHIR/EHR + finance SWIFT/FIX/Bloomberg/banking middleware + ICS-SCADA Modbus/BACnet/S7/DNP3 + IoT MQTT/CoAP/UPnP + government FedRAMP/FISMA + maritime/aviation/auto).
+   - Renumbered Runnable Helper → §48, Self-Test → §49 (refreshed for v2.1), Changelog → §50.
4176
+ - **v2.0 (2026-04-27)**: major rewrite for external red-team posture.
+   - Added: pre-built wordlists (§16), 29-pattern secret catalog (§17), 50+ dork corpus (§18), GitHub code-search dorks (§19), endpoint interest score (§20), mobile ownership confidence (§21), identity-fabric concrete endpoints (§22), read-only secret validators (§23), Postman workspace search (§24), Stack Exchange sweep (§25), public SaaS dorks (§26), subdomain-source stack (§27), domain-level breach severity (§15.1), L2 explorer table (§30.2), USCC + ICP workflow (§14.2), cross-module sidecar coordination (§36), attack-path hint patterns (§39), severity decision matrix (§40), runnable secret-scan helper (§41).
+   - Strengthened: confidence levels (§2), output format (§3), do-not rules (§5).
+   - Original tool tables retained and lightly reorganized.
4177
+ - **v1.x** — original tool-reference cheat sheet.