company-dossier 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,117 @@
+ # company-dossier
+
+ **Build a complete, sourced intelligence dossier on any company from public data (CLI, library, and MCP server).**
+
+ `company-dossier` compiles a structured, nine-section dossier on any company or
+ domain using only PUBLIC sources: a live website crawl, DNS reconnaissance, the
+ Internet Archive's Wayback Machine, a web-technology fingerprint, USASpending.gov
+ federal contracts, and social-profile discovery. Every derived claim is annotated
+ with its source, and sections without public data are clearly marked as gaps.
+
+ No API keys. No private databases. No login. Free.
+
+ 🔗 https://companydossier.lol
+
+ ## Install
+
+ ```bash
+ npm install -g company-dossier
+ # or run without installing:
+ npx company-dossier acme.com
+ ```
+
+ ## Quickstart (CLI)
+
+ ```bash
+ npx company-dossier acme.com
+ ```
+
+ This writes an `Acme DOSSIER/` folder containing one markdown file per section
+ plus a machine-readable `dossier.json`.
+
+ ```bash
+ # choose an output directory, stay quiet
+ company-dossier acme.com --out ./research --quiet
+
+ # research by name (no domain)
+ company-dossier "Acme Corporation"
+
+ # print JSON to stdout (good for piping)
+ company-dossier acme.com --json > acme.json
+
+ # only build specific sections
+ company-dossier acme.com --sections overview,tech,risk
+ ```
+
+ Run `company-dossier --help` for all options.
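
The `--json` output can be consumed by another program. The sketch below is an editorial illustration, not part of the package: it assumes only the top-level `{ meta, data }` shape that the README mentions, and `parseDossier` and `dataKeys` are hypothetical helper names. The keys inside `data` are not documented here, so the code only lists whatever is present.

```ts
// Assumption: the JSON has top-level `meta` and `data` objects, as the README states.
// Nothing about the contents of `data` is assumed.
interface DossierJson {
  meta: Record<string, unknown>;
  data: Record<string, unknown>;
}

// Parse the raw text (e.g. the contents of acme.json) and validate the top-level shape.
function parseDossier(raw: string): DossierJson {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('dossier JSON must be an object');
  }
  const { meta, data } = parsed as Partial<DossierJson>;
  if (typeof meta !== 'object' || meta === null) {
    throw new Error('missing "meta" object');
  }
  if (typeof data !== 'object' || data === null) {
    throw new Error('missing "data" object');
  }
  return { meta, data };
}

// List the top-level keys found under `data`, sorted for stable output.
function dataKeys(dossier: DossierJson): string[] {
  return Object.keys(dossier.data).sort();
}
```

Validating the shape up front keeps a truncated or failed run from being mistaken for an empty dossier.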
+
+ ## The nine sections
+
+ 1. **Overview & identity**: name, description, schema.org, keywords
+ 2. **People & org chart**: contact emails, individual-pattern emails (gaps marked)
+ 3. **Hiring radar**: careers pages and job URLs from the site/sitemap
+ 4. **Money trail**: USASpending.gov federal contracts and obligations
+ 5. **Locations**: structured addresses and phone numbers
+ 6. **Tech fingerprint**: CMS, analytics, pixels, CDN, frameworks, email/DNS
+ 7. **News & timeline**: Wayback history, growth, deleted pages, archived PDFs
+ 8. **Relationship web**: social and external profiles
+ 9. **Risk flags**: automated low-confidence technical signals (SPF/DMARC, churn)
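
A signal like the SPF/DMARC flag in section 9 can be derived from DNS TXT records. The sketch below is an editorial illustration and not the package's implementation; `classifyMailAuth` and its return shape are hypothetical. It expects TXT strings that have already been fetched for the domain and for `_dmarc.<domain>` (for example with Node's `dns.promises.resolveTxt`, whose per-record chunks must be joined first).

```ts
// Hypothetical result shape for a mail-authentication signal.
interface MailAuthSignal {
  hasSpf: boolean;
  hasDmarc: boolean;
  dmarcPolicy: string | null; // "none" | "quarantine" | "reject", or null if absent
}

// domainTxt: TXT records at the domain itself; dmarcTxt: TXT records at _dmarc.<domain>.
function classifyMailAuth(domainTxt: string[], dmarcTxt: string[]): MailAuthSignal {
  // An SPF record starts with the version tag "v=spf1".
  const hasSpf = domainTxt.some((r) => /^v=spf1(\s|$)/i.test(r.trim()));

  // A DMARC record starts with the version tag "v=DMARC1".
  const dmarcRecord = dmarcTxt
    .map((r) => r.trim())
    .find((r) => /^v=DMARC1\s*(;|$)/i.test(r));

  // The requested policy lives in the "p=" tag (distinct from the subdomain tag "sp=").
  const policyMatch = dmarcRecord?.match(/;\s*p\s*=\s*(none|quarantine|reject)\b/i);

  return {
    hasSpf,
    hasDmarc: dmarcRecord !== undefined,
    dmarcPolicy: policyMatch?.[1]?.toLowerCase() ?? null,
  };
}
```

Missing records are only a weak indicator, which fits the README's description of these flags as low-confidence.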
+
+ ## Library usage
+
+ ```ts
+ import { buildDossier, writeDossier } from 'company-dossier';
+
+ const result = await buildDossier('acme.com', {
+   sections: ['overview', 'tech', 'risk'], // optional subset
+ });
+
+ // result.meta: target, company name, sources & status
+ // result.json: full structured data ({ meta, data })
+ // result.files: [{ path, content }] markdown + dossier.json
+
+ const folder = writeDossier(result, './research');
+ console.log('Written to', folder);
+ ```
+
+ Individual collectors are also exported: `collectWebsite`, `collectDns`,
+ `collectWayback`, `extractTechStack`, `collectSearch`.
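
Because `result.files` is described as an array of `{ path, content }` entries, the generated output can be inspected in memory before anything is written to disk. The sketch below is an editorial illustration that assumes only that shape and that one entry is named `dossier.json`; `findDossierJson` and `markdownPaths` are hypothetical helpers, not package exports.

```ts
// Shape taken from the README comment: result.files is [{ path, content }].
interface DossierFile {
  path: string;
  content: string;
}

// Return the last path segment, accepting either "/" or "\" separators.
function baseName(path: string): string {
  const parts = path.split(/[\\/]/);
  return parts[parts.length - 1] ?? path;
}

// Locate the dossier.json entry and parse it; throws if the entry is missing.
function findDossierJson(files: DossierFile[]): unknown {
  const entry = files.find((f) => baseName(f.path) === 'dossier.json');
  if (!entry) {
    throw new Error('no dossier.json entry in files');
  }
  return JSON.parse(entry.content);
}

// List the paths of the markdown files, e.g. to preview what would be written.
function markdownPaths(files: DossierFile[]): string[] {
  return files.filter((f) => f.path.toLowerCase().endsWith('.md')).map((f) => f.path);
}
```

Matching on the base name keeps the helper independent of whether the paths include the output folder, which the README does not specify.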
+
+ ## MCP server
+
+ `company-dossier` ships an [MCP](https://modelcontextprotocol.io) server over
+ stdio exposing a single tool, `build_dossier`, that returns the markdown and JSON.
+
+ ```json
+ {
+   "mcpServers": {
+     "company-dossier": {
+       "command": "npx",
+       "args": ["-y", "company-dossier-mcp"]
+     }
+   }
+ }
+ ```
+
+ Tool input:
+
+ ```json
+ { "target": "acme.com", "sections": ["overview", "tech", "risk"] }
+ ```
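
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, and the stdio transport sends one JSON message per line. The sketch below is an editorial illustration of the message a client would send for the tool input above; `buildToolCall` and `toStdioLine` are hypothetical helpers and the `id` value is arbitrary. In practice an MCP client library builds this message after the initialization handshake.

```ts
// Input shape taken from the "Tool input" example: a target plus optional sections.
interface BuildDossierInput {
  target: string;
  sections?: string[];
}

// JSON-RPC 2.0 request envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: BuildDossierInput };
}

// Build the request for the `build_dossier` tool named in the README.
function buildToolCall(id: number, input: BuildDossierInput): ToolCallRequest {
  if (input.target.trim() === '') {
    throw new Error('target must be a non-empty domain or company name');
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'build_dossier', arguments: input },
  };
}

// Serialize as a single newline-terminated line, as the stdio transport expects.
function toStdioLine(request: ToolCallRequest): string {
  return JSON.stringify(request) + '\n';
}
```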
+
+ ## Output
+
+ - A `<Company> DOSSIER/` folder with `README.md`, nine numbered markdown
+   sections, and `dossier.json`.
+ - With `--json`, the structured dossier is printed to stdout instead.
+
+ ## Public sources only
+
+ This tool reads only publicly accessible data and labels every claim
+ with its source. It does not authenticate, scrape behind logins, or
+ access paid databases. Network-blocked or empty sources are reported as gaps,
+ never fabricated. Risk flags are automated signals, not legal or financial advice.
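
One way to make "every claim carries a source, and missing data is a gap" hard to violate is to encode it in the type of each finding. The sketch below is an editorial illustration only: the package's real data model is not shown in this README, and `Finding` and `renderFinding` are hypothetical names.

```ts
// Hypothetical model: a finding is either a sourced claim or an explicit gap.
// There is no variant for a value without a source.
type Finding<T> =
  | { kind: 'claim'; value: T; source: string }
  | { kind: 'gap'; reason: string };

// Render a finding as one line of text, always showing the source or the gap reason.
function renderFinding(label: string, finding: Finding<string>): string {
  switch (finding.kind) {
    case 'claim':
      return `${label}: ${finding.value} (source: ${finding.source})`;
    case 'gap':
      return `${label}: no public data (${finding.reason})`;
  }
}
```

With a discriminated union like this, code that builds a section cannot emit a value without also naming where it came from.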
+
+ ## License
+
+ MIT © EVERJUST