open-local-audit 0.37.0 → 0.38.0

package/README.md CHANGED
@@ -1,399 +1,402 @@
1
- # Open Local Audit
2
-
3
- Open Local Audit is an open-source website and local presence auditor for small businesses. It turns public website signals into practical reports for outreach, customer education, and implementation work.
4
-
5
- ## Current stage
6
-
7
- Published CLI. The project can run a single URL audit, an opt-in Playwright-rendered audit with screenshot evidence, optional Lighthouse category scoring, branded customer-facing reports, public contact readiness extraction, a plain-text batch URL list, or a profile-aware CSV batch file, optionally check same-origin links, and produce JSON, Markdown, HTML, PDF, or all standard report formats. Batch runs can use controlled concurrency, write per-site reports, add aggregate insight sections and contact/outreach rollups to the top-level index, and optionally export prospect CSV data for triage. Discovery runs can control Google Places result counts, cap website audits, write summary JSON, merge local review CSVs, explain opportunity scores, enrich outreach and public-contact columns, select a preferred manual outreach channel, and report exact duplicate lead groups plus advisory fuzzy duplicate review candidates. CRM-ready CSV exports can be validated locally before import, lead CSVs can be ranked into local shortlist reports with optional local review-state suppression, and single-site report folders can be packaged for local customer sharing.
8
-
9
- ## Business purpose
10
-
11
- Open Local Audit produces evidence-backed mini audits for local businesses. The tool should be useful as open-source software while keeping paid value in implementation, redesign, maintenance, and custom reporting.
12
-
13
- ## Product principles
14
-
15
- - Evidence first: every recommendation should point to a concrete finding.
16
- - Ethical scanning: only scan user-provided public URLs and avoid Google Maps scraping.
17
- - Small-business language: reports should be readable by non-technical owners.
18
- - Developer quality: deterministic checks, clear CLI output, tests, and release notes.
19
- - Commercial clarity: the open-source scanner is useful; paid work is execution and support.
20
-
21
- ## Proposed first stack
22
-
23
- - Runtime: Node.js with TypeScript.
24
- - CLI framework: `commander` or `cac`.
25
- - Validation: `zod`.
26
- - HTML parsing: `cheerio`.
27
- - Browser rendering: Playwright where static parsing is not enough.
28
- - Storage: no persistence by default; optional JSON/Markdown outputs.
29
- - Test runner: Vitest.
30
- - Package manager: npm unless implementation chooses pnpm before first commit.
31
-
32
- ## Documents
33
-
34
- - [Product brief](./docs/product-brief.md)
35
- - [Technical architecture](./docs/technical-architecture.md)
36
- - [MVP roadmap](./docs/mvp-roadmap.md)
37
- - [Security and ethics](./docs/security-and-ethics.md)
38
- - [Go-to-market plan](./docs/go-to-market.md)
39
- - [Audit checklist](./docs/research/audit-checklist.md)
40
- - [Release readiness](./docs/release/release-readiness.md)
41
- - [npm publishing plan](./docs/release/npm-publishing.md)
42
- - [Project operating standard](./docs/operations/project-standard.md)
43
- - [Decision log](./docs/operations/decision-log.md)
44
-
45
- ## Local development
46
-
47
- Install dependencies and run the checks:
48
-
49
- ```bash
50
- npm install
51
- npm test
52
- npm run lint
53
- npm run build
54
- npm run release-check
55
- ```
56
-
57
- Run the CLI locally:
58
-
59
- ```bash
60
- npm start -- https://example.com --format markdown
61
- ```
62
-
63
- After building, run the compiled CLI:
64
-
65
- ```bash
66
- node dist/cli.js https://example.com --format json --pretty
67
- ```
68
-
69
- Run the published package:
70
-
71
- ```bash
72
- npx open-local-audit https://example.com --format markdown
73
- ```
74
-
75
- Run with an industry profile:
76
-
77
- ```bash
78
- open-local-audit https://example.com --profile dental --format markdown
79
- ```
80
-
81
- Supported profiles are `generic`, `dental`, `beauty`, `restaurant`, `contractor`, `lawyer`, `clinic`, `gym`, `hotel`, and `auto-service`.
82
-
83
- Render the page before auditing when static HTML is not enough:
84
-
85
- ```bash
86
- open-local-audit https://example.com --render --format markdown
87
- ```
88
-
89
- Capture a rendered homepage screenshot and add it as visual evidence:
90
-
91
- ```bash
92
- open-local-audit https://example.com --screenshot --format all --out-dir reports
93
- ```
94
-
95
- `--render` loads Playwright from the current project. Install it alongside the CLI when needed:
96
-
97
- ```bash
98
- npm install -D playwright
99
- ```
100
-
101
- Run Lighthouse category scoring:
102
-
103
- ```bash
104
- open-local-audit https://example.com --lighthouse --format html --out report.html
105
- ```
106
-
107
- `--lighthouse` runs Chrome through Lighthouse and adds performance, accessibility, best-practices, and SEO category scores to the report. It requires a local Chrome or Chromium runtime that Lighthouse can launch.
108
-
109
- Write standard report formats to a directory:
110
-
111
- ```bash
112
- open-local-audit https://example.com --format all --out-dir reports
113
- ```
114
-
115
- Write an HTML report:
116
-
117
- ```bash
118
- open-local-audit https://example.com --format html --out report.html
119
- ```
120
-
121
- Write a branded PDF report:
122
-
123
- ```bash
124
- open-local-audit https://example.com --format pdf --out report.pdf
125
- ```
126
-
127
- Apply report branding:
128
-
129
- ```bash
130
- open-local-audit https://example.com --brand-config brand.json --format pdf --out report.pdf
131
- ```
132
-
133
- Minimal `brand.json`:
134
-
135
- ```json
136
- {
137
- "name": "Example Audit Studio",
138
- "primaryColor": "#123456",
139
- "accentColor": "#2f7d5f",
140
- "footerText": "Prepared for outreach review",
141
- "contact": "hello@example.com"
142
- }
143
- ```
144
-
145
- Brand colors must use six-digit hex values. Branding applies to Markdown, HTML, and PDF reports.
146
-
147
- Run a batch audit from a text file:
148
-
149
- ```bash
150
- open-local-audit --input sites.txt --format all --out-dir reports
151
- ```
152
-
153
- Run a labeled CSV batch audit:
154
-
155
- ```bash
156
- open-local-audit --input sites.csv --format all --out-dir reports
157
- ```
158
-
159
- Run a profile-aware batch audit and write a prospect CSV export:
160
-
161
- ```bash
162
- open-local-audit --input sites.csv --profile dental --export-csv prospects.csv --format all --out-dir reports
163
- ```
164
-
165
- Run a focused batch triage index:
166
-
167
- ```bash
168
- open-local-audit --input sites.csv --format all --out-dir reports --segment dental --sort score-asc --top 25
169
- ```
170
-
171
- Run a controlled parallel batch audit:
172
-
173
- ```bash
174
- open-local-audit --input sites.csv --format all --out-dir reports --concurrency 3
175
- ```
176
-
177
- Capture screenshots during batch audits:
178
-
179
- ```bash
180
- open-local-audit --input sites.csv --screenshot --format all --out-dir reports
181
- ```
182
-
183
- Batch triage supports `--segment <segment>`, `--min-score <score>`, `--top <count>`, `--sort score-asc|severity-desc`, and `--concurrency <count>`. Batch runs can also write `--export-csv <path>` for prospect triage. Batch index reports include aggregate average score, profile breakdown, segment breakdown, frequent finding sections, contact rollups, and advisory outreach channel rollups. Batch CSV exports include contact confidence, preferred contact channel, and contactability reason when reports were audited successfully. Use `--export-preset crm` with `--export-csv` to write a CRM-ready local import CSV instead of the standard operator CSV.
184
-
185
- Supported CSV columns:
186
-
187
- ```csv
188
- url,label,segment,profile
189
- example.com,Example Clinic,dental,dental
190
- https://example.org/path,Example Salon,beauty,beauty
191
- ```
192
-
193
- Manual CSV lead discovery:
194
-
195
- ```bash
196
- open-local-audit discover --input places.csv --provider manual-csv --profile dental --out-dir reports/dental --export-csv leads.csv
197
- ```
198
-
199
- The `manual-csv` provider reads an operator-prepared CSV, resolves supplied website URLs, audits website-present rows through the existing batch pipeline, and writes `leads.csv` with `hasWebsite`, `websiteUrl`, `priority`, and `nextAction`. Use `--dry-run` to create the prospect CSV without auditing websites.
200
-
201
- Google Places lead discovery:
202
-
203
- ```bash
204
- GOOGLE_MAPS_API_KEY=your-key open-local-audit discover "guzellik salonu Umraniye" --provider google-places --profile beauty --out-dir reports/umraniye-beauty --export-csv leads.csv
205
- ```
206
-
207
- Control discovery cost and audit volume:
208
-
209
- ```bash
210
- GOOGLE_MAPS_API_KEY=your-key open-local-audit discover "dis klinigi Kadikoy" --provider google-places --limit 20 --max-audits 5 --summary-json discovery-summary.json --out-dir reports/kadikoy-dental --export-csv leads.csv
211
- ```
212
-
213
- Skip previously reviewed leads and keep only stronger opportunities:
214
-
215
- ```bash
216
- open-local-audit discover --input places.csv --provider manual-csv --suppression-list reviewed-leads.csv --min-opportunity-score 90 --dry-run --export-csv leads.csv
217
- ```
218
-
219
- Maintain a review queue and duplicate review report across reruns:
220
-
221
- ```bash
222
- open-local-audit discover --input places.csv --provider manual-csv --review-csv review.csv --duplicates-json duplicates.json --dry-run --export-csv leads.csv
223
- ```
224
-
225
- Write a CRM-ready local import CSV:
226
-
227
- ```bash
228
- open-local-audit discover --input places.csv --provider manual-csv --dry-run --export-csv crm-leads.csv --export-preset crm
229
- ```
230
-
231
- Validate a CRM-ready local import CSV:
232
-
233
- ```bash
234
- open-local-audit validate-export --input crm-leads.csv --preset crm
235
- open-local-audit validate-export --input crm-leads.csv --preset crm --format json
236
- ```
237
-
238
- `validate-export` checks the local CSV shape before a manual import. It reports missing CRM columns, missing `companyName`, missing `website`, missing `leadKey`, duplicate `leadKey` values, low or empty contact confidence, and rows that still need manual contact review. Markdown output is the default; JSON is available for automation. Any issue returns exit code `1`, while a clean file returns exit code `0`.
239
-
240
- Create a local lead shortlist from a discovery or CRM CSV export:
241
-
242
- ```bash
243
- open-local-audit shortlist --input leads.csv --out shortlist.md --top 20
244
- open-local-audit shortlist --input leads.csv --out shortlist.md --min-opportunity-score 80
245
- open-local-audit shortlist --input leads.csv --sort score-desc --out shortlist.md
246
- open-local-audit shortlist --input leads.csv --out shortlist.md --summary-json shortlist-summary.json
247
- open-local-audit shortlist --input leads.csv --segment dental --priority high --contact-confidence High --out shortlist.csv --format csv
248
- open-local-audit shortlist --input leads.csv --preferred-contact-channel email --out shortlist.json --format json
249
- open-local-audit shortlist --input leads.csv --exclude-review-status deferred --out shortlist.json --format json
250
- open-local-audit shortlist --input leads.csv --require-website --out shortlist.json --format json
251
- open-local-audit shortlist --input leads.csv --missing-website --out shortlist.json --format json
252
- open-local-audit shortlist --input leads.csv --require-contact --out shortlist.json --format json
253
- open-local-audit shortlist --input leads.csv --missing-contact --out shortlist.json --format json
254
- open-local-audit shortlist --input leads.csv --require-report --out shortlist.json --format json
255
- open-local-audit shortlist --input leads.csv --missing-report --out shortlist.json --format json
256
- open-local-audit shortlist --input leads.csv --review-csv review.csv --review-status pending --out shortlist.json --format json
257
- open-local-audit shortlist --input crm-leads.csv --out shortlist.json --format json
258
- open-local-audit shortlist --input crm-leads.csv --out shortlist.csv --format csv
259
- open-local-audit shortlist --input leads.csv --review-csv review.csv --out shortlist.md
260
- ```
261
-
262
- `shortlist` ranks local CSV rows by `opportunityScore`, `priority`, `contactConfidence`, `score`, and company name. Use `--sort opportunity-desc|score-desc|company-asc|last-reviewed-asc` to change the ranking mode before top-N selection. Use `--summary-json` to write a separate automation summary with counts and selected lead identifiers. Use `--require-website` to keep only rows with a website, `--missing-website` to keep only rows without a website, `--require-contact` to keep only rows with contact confidence above `None`, `--missing-contact` to keep only rows with contact confidence set to `None`, `--require-report` to keep only rows with a report path, and `--missing-report` to keep only rows without a report path. Use `--min-opportunity-score` to keep only stronger local opportunities. Use `--segment`, `--profile`, `--priority`, `--contact-confidence`, and `--preferred-contact-channel` for case-insensitive exact focus filters; supplied filters are combined with `AND` semantics. Use `--review-status` to keep only active leads with a matching review status after completed or suppressed review rows have been removed, or `--exclude-review-status` to remove a matching active review status from the local report. Filtering runs after review-state suppression and before top-N ranking. With `--review-csv`, the command matches review rows by `leadKey`, normalized website, or company label; skips rows marked `rejected`, `contacted`, `not-fit`, `not_a_fit`, `do-not-contact`, or `suppressed`; and carries active `reviewStatus`, `reviewReason`, and `lastReviewedAt` values into Markdown, JSON, and CSV reports. Markdown output is the default; JSON is available for automation; CSV is available for spreadsheet review. The command only reads local CSV files and writes a local report. It does not mutate source files, call APIs, send outreach, or sync to a CRM.
263
-
264
- Package an existing single-site report folder for local customer sharing:
265
-
266
- ```bash
267
- open-local-audit package-report --input reports/example-com --out packages/example-com
268
- ```
269
-
270
- `package-report` reads `open-local-audit-report.json` from the input folder, copies available JSON, Markdown, HTML, and PDF report artifacts into `reports/`, and writes `README.md`, `next-actions.md`, and `manifest.json`. It is local file packaging only; it does not upload reports, send outreach, or sync to a CRM.
271
-
272
- The `google-places` provider is opt-in and requires `GOOGLE_MAPS_API_KEY`. It uses the official Places Text Search API and requests only `places.id`, `places.displayName`, and `places.websiteUri`. It does not scrape Google Maps, collect reviews/photos/ratings, send outreach, or store raw Places responses. Google Maps Platform billing and quota limits apply to API use, and the CLI prints a billing warning when this provider is selected.
273
-
274
- Discovery CSV exports include `leadKey`, `opportunityScore`, `opportunityReasons`, `pitchAngle`, `recommendedOffer`, `estimatedNeed`, `outreachPriorityReason`, website-derived public contact columns, manual outreach handoff columns, and review columns for local triage. Contact columns are populated only from audited public website HTML and include `publicEmail`, `publicPhone`, `whatsappUrl`, `contactPageUrl`, `socialProfiles`, `contactConfidence`, and `contactSource`. Handoff columns include `preferredContactChannel`, `outreachAction`, and `contactabilityReason`; they are advisory only and do not send outreach. Duplicate review JSON includes exact `duplicateGroups` and advisory `fuzzyDuplicateGroups` with confidence and matching reasons for local operator review. Fuzzy duplicate candidates do not auto-suppress leads, change review status, send outreach, or sync to a CRM. The CRM export preset writes `companyName`, `website`, `segment`, `profile`, `priority`, `score`, `opportunityScore`, `topFinding`, `contactConfidence`, `preferredContactChannel`, `contactabilityReason`, `publicEmail`, `publicPhone`, `contactPageUrl`, `source`, `leadKey`, and `reportPath`; it is a local import file only and does not call CRM APIs. A suppression list can reuse a prior discovery CSV or a smaller CSV with `leadKey`, `sourceId`, `websiteUrl`, or `label` plus optional `reviewStatus`, `reviewReason`, and `lastReviewedAt` columns. Rows marked `rejected`, `contacted`, `not-fit`, `do-not-contact`, or `suppressed` are skipped before audits run. `--review-csv` preserves prior operator decisions and adds new leads as `pending`. Spreadsheet formula-like cell values are neutralized before export, so values such as phone numbers starting with `+` may be prefixed for spreadsheet safety.
275
-
276
- See [Google Maps API key setup](./docs/operations/google-maps-api-key.md) for local environment setup.
277
-
278
- Check same-origin links and fail CI when high-severity issues are found:
279
-
280
- ```bash
281
- open-local-audit https://example.com --check-links --max-pages 10 --fail-on high --format all --out-dir reports
282
- ```
283
-
284
- ## First implementation milestone
285
-
286
- The first implementation milestone is a CLI that accepts one URL and outputs:
287
-
288
- - JSON report.
289
- - Markdown report.
290
- - HTML report.
291
- - PDF report.
292
- - Executive Summary section in customer-facing reports.
293
- - Contact Readiness section in customer-facing reports.
294
- - Optional report branding with `--brand-config`.
295
- - Combined JSON, Markdown, and HTML report output with `--format all --out-dir`.
296
- - Batch input files with per-site report folders.
297
- - CSV batch input with optional labels and segments.
298
- - Aggregate batch index reports for prospect triage.
299
- - Batch index filtering, sorting, and top-N triage controls.
300
- - Controlled parallel batch audits with `--concurrency`.
301
- - Batch index insights for average score, profile breakdown, segment breakdown, and frequent findings.
302
- - Batch contact and outreach rollups for audited public website contact readiness.
303
- - CRM-ready local CSV export preset for batch and discovery exports.
304
- - Local CRM export validation with Markdown and JSON issue reports.
305
- - Local lead shortlist reports from discovery and CRM CSV exports.
306
- - Local shortlist sort modes with `--sort`.
307
- - Local shortlist automation summaries with `--summary-json`.
308
- - Local shortlist website-present filtering with `--require-website`.
309
- - Local shortlist missing-website filtering with `--missing-website`.
310
- - Local shortlist contact-ready filtering with `--require-contact`.
311
- - Local shortlist missing-contact filtering with `--missing-contact`.
312
- - Local shortlist report-ready filtering with `--require-report`.
313
- - Local shortlist missing-report filtering with `--missing-report`.
314
- - Local shortlist preferred-contact-channel filtering with `--preferred-contact-channel`.
315
- - Local shortlist review-state suppression with `--review-csv`.
316
- - Local shortlist opportunity-score filtering with `--min-opportunity-score`.
317
- - Local shortlist focus filtering by segment, profile, priority, and contact confidence.
318
- - Local shortlist active review-status filtering with `--review-status`.
319
- - Local shortlist active review-status exclusion with `--exclude-review-status`.
320
- - Spreadsheet-ready local shortlist CSV output with `--format csv`.
321
- - Local report packaging with shareable summaries and copied report artifacts.
322
- - Industry profiles for generic, dental, beauty, restaurant, contractor, lawyer, clinic, gym, hotel, and auto-service audits.
323
- - Profile-specific findings for dental, beauty, restaurant, and contractor conversion/trust signals.
324
- - Prospect CSV export with profile, score, top finding, contact handoff, report path, and error columns.
325
- - Discovery CSV contact enrichment from audited public website HTML.
326
- - Discovery CSV outreach handoff fields for manual next-channel selection.
327
- - Advisory fuzzy duplicate lead review candidates for discovery reruns.
328
- - Duplicate review output remains local operator guidance and does not auto-suppress, send outreach, or sync to CRM systems.
329
- - Optional rendered DOM audits with `--render`.
330
- - Optional rendered screenshot evidence with `--screenshot`.
331
- - Optional Lighthouse category scoring with `--lighthouse`.
332
- - Score summary.
333
- - Evidence table.
334
- - Optional same-origin link checks.
335
- - Terminal summary when reports are written to files.
336
- - CI-friendly exit codes with `--fail-on`.
337
- - Clear owner-readable recommendations.
338
- - Structured-data quality, address, opening-hours, service-location, CTA, and placeholder-copy checks.
339
- - Trust and conversion checks for current date signals, review cues, service detail depth, brand icons, and placeholder social links.
340
-
341
- Example target command:
342
-
343
- ```bash
344
- open-local-audit https://example.com --format markdown --out report.md
345
- ```
346
-
347
- Example report artifacts are available under [`examples/reports`](./examples/reports).
348
-
349
- ## Known limits
350
-
351
- - `--render` requires Playwright in the calling project and a working browser runtime.
352
- - `--screenshot` uses the rendered audit path, requires `--out-dir`, and also requires Playwright.
353
- - `--lighthouse` requires a local Chrome or Chromium runtime that Lighthouse can launch.
354
- - `--format pdf` is supported for single URL audits and requires `--out` or `--out-dir`.
355
- - `--brand-config` reads local JSON only; it does not fetch remote assets or upload report data.
356
- - Batch triage options apply to the aggregate batch index, not individual per-site report contents.
357
- - Batch contact and outreach rollups are advisory local triage metadata; they do not send outreach or sync to a CRM.
358
- - `--export-preset crm` changes CSV columns only; it does not create, update, or sync CRM records.
359
- - `validate-export` checks local CSV files only; it does not create, update, or sync CRM records.
360
- - `shortlist` ranks local CSV files only; it does not call APIs, send outreach, or sync CRM records.
361
- - `shortlist --sort` changes local report ordering only; it does not update source or review CSV files.
362
- - `shortlist --summary-json` writes local automation metadata only; it does not update source or review CSV files.
363
- - `shortlist --require-website` filters local report output only; it does not update source or review CSV files.
364
- - `shortlist --missing-website` filters local report output only; it does not update source or review CSV files.
365
- - `shortlist --require-contact` filters local report output only; it does not update source or review CSV files.
366
- - `shortlist --missing-contact` filters local report output only; it does not update source or review CSV files.
367
- - `shortlist --require-report` filters local report output only; it does not update source or review CSV files.
368
- - `shortlist --missing-report` filters local report output only; it does not update source or review CSV files.
369
- - `shortlist --preferred-contact-channel` filters local report output only; it does not update source or review CSV files.
370
- - `shortlist --review-csv` reads review state for suppression and report context only; it does not mutate the review CSV.
371
- - `shortlist --min-opportunity-score` filters local report output only; it does not update source lead files.
372
- - Shortlist focus filters use case-insensitive exact matching and do not update source lead files.
373
- - `shortlist --review-status` filters local report output only; it does not update source or review CSV files.
374
- - `shortlist --exclude-review-status` filters local report output only; it does not update source or review CSV files.
375
- - `shortlist --format csv` writes a local spreadsheet review file only; it does not import or sync records.
376
- - `package-report` packages local files only; it does not upload reports, send outreach, or sync CRM records.
377
- - `--export-csv` is supported for batch audits and discovery exports.
378
- - `discover --provider google-places` requires `GOOGLE_MAPS_API_KEY` and may incur Google Maps Platform billing.
379
- - `--limit` caps Google Places candidates at 50; it does not paginate beyond one Text Search request.
380
- - `--max-audits` limits website audits only; all discovered candidates still appear in `leads.csv`.
381
- - `--suppression-list` uses exact lead identity matching; source IDs are preferred, then normalized website URLs, then normalized labels.
382
- - `--review-csv` is local operator state only; it does not send outreach or sync to a CRM.
383
- - `--duplicates-json` reports exact duplicate lead groups and advisory fuzzy duplicate candidates for manual review; it does not auto-suppress rows or update review decisions.
384
- - Batch input requires `--out-dir` and cannot be combined with a positional URL.
385
- - Industry profiles are deterministic vertical heuristics, not a replacement for a human review of each business model.
386
- - Higher `--concurrency` values can increase network load against audited sites; use conservative values for prospect batches.
387
- - Rule checks are deterministic heuristics, so they can miss or over-flag site-specific markup.
388
-
389
- ## GitHub and npm release intent
390
-
391
- The project should be prepared for:
392
-
393
- - Public GitHub repository.
394
- - Clear README and examples.
395
- - MIT license, unless maintainers choose a different license before the first public release.
396
- - GitHub Actions for lint, tests, build, and release checks.
397
- - npm package after the CLI has real tests and example reports.
398
-
399
- Release work should follow the checklist in `docs/release/release-readiness.md`.
1
+ # Open Local Audit
2
+
3
+ Open Local Audit is an open-source website and local presence auditor for small businesses. It turns public website signals into practical reports for outreach, customer education, and implementation work.
4
+
5
+ ## Current stage
6
+
7
+ Published CLI. The project can run a single URL audit, an opt-in Playwright-rendered audit with screenshot evidence, optional Lighthouse category scoring, branded customer-facing reports, public contact readiness extraction, a plain-text batch URL list, or a profile-aware CSV batch file, optionally check same-origin links, and produce JSON, Markdown, HTML, PDF, or all standard report formats. Batch runs can use controlled concurrency, write per-site reports, add aggregate insight sections and contact/outreach rollups to the top-level index, and optionally export prospect CSV data for triage. Discovery runs can control Google Places result counts, cap website audits, write summary JSON, merge local review CSVs, explain opportunity scores, enrich outreach and public-contact columns, select a preferred manual outreach channel, and report exact duplicate lead groups plus advisory fuzzy duplicate review candidates. CRM-ready CSV exports can be validated locally before import, lead CSVs can be ranked into local shortlist reports with optional local review-state suppression, and single-site report folders can be packaged for local customer sharing.
8
+
9
+ ## Business purpose
10
+
11
+ Open Local Audit produces evidence-backed mini audits for local businesses. The tool should be useful as open-source software while keeping paid value in implementation, redesign, maintenance, and custom reporting.
12
+
13
+ ## Product principles
14
+
15
+ - Evidence first: every recommendation should point to a concrete finding.
16
+ - Ethical scanning: only scan user-provided public URLs and avoid Google Maps scraping.
17
+ - Small-business language: reports should be readable by non-technical owners.
18
+ - Developer quality: deterministic checks, clear CLI output, tests, and release notes.
19
+ - Commercial clarity: the open-source scanner is useful; paid work is execution and support.
20
+
21
+ ## Proposed first stack
22
+
23
+ - Runtime: Node.js with TypeScript.
24
+ - CLI framework: `commander` or `cac`.
25
+ - Validation: `zod`.
26
+ - HTML parsing: `cheerio`.
27
+ - Browser rendering: Playwright where static parsing is not enough.
28
+ - Storage: no persistence by default; optional JSON/Markdown outputs.
29
+ - Test runner: Vitest.
30
+ - Package manager: npm unless implementation chooses pnpm before first commit.
31
+
32
+ ## Documents
33
+
34
+ - [Product brief](./docs/product-brief.md)
35
+ - [Technical architecture](./docs/technical-architecture.md)
36
+ - [MVP roadmap](./docs/mvp-roadmap.md)
37
+ - [Security and ethics](./docs/security-and-ethics.md)
38
+ - [Go-to-market plan](./docs/go-to-market.md)
39
+ - [Audit checklist](./docs/research/audit-checklist.md)
40
+ - [Release readiness](./docs/release/release-readiness.md)
41
+ - [npm publishing plan](./docs/release/npm-publishing.md)
42
+ - [Project operating standard](./docs/operations/project-standard.md)
43
+ - [Decision log](./docs/operations/decision-log.md)
44
+
45
+ ## Local development
46
+
47
+ Install dependencies and run the checks:
48
+
49
+ ```bash
50
+ npm install
51
+ npm test
52
+ npm run lint
53
+ npm run build
54
+ npm run release-check
55
+ ```
56
+
57
+ Run the CLI locally:
58
+
59
+ ```bash
60
+ npm start -- https://example.com --format markdown
61
+ ```
62
+
63
+ After building, run the compiled CLI:
64
+
65
+ ```bash
66
+ node dist/cli.js https://example.com --format json --pretty
67
+ ```
68
+
69
+ Run the published package:
70
+
71
+ ```bash
72
+ npx open-local-audit https://example.com --format markdown
73
+ ```
74
+
75
+ Run with an industry profile:
76
+
77
+ ```bash
78
+ open-local-audit https://example.com --profile dental --format markdown
79
+ ```
80
+
81
+ Supported profiles are `generic`, `dental`, `beauty`, `restaurant`, `contractor`, `lawyer`, `clinic`, `gym`, `hotel`, and `auto-service`.
82
+
83
+ Render the page before auditing when static HTML is not enough:
84
+
85
+ ```bash
86
+ open-local-audit https://example.com --render --format markdown
87
+ ```
88
+
89
+ Capture a rendered homepage screenshot and add it as visual evidence:
90
+
91
+ ```bash
92
+ open-local-audit https://example.com --screenshot --format all --out-dir reports
93
+ ```
94
+
95
+ `--render` loads Playwright from the current project. Install it alongside the CLI when needed:
96
+
97
+ ```bash
98
+ npm install -D playwright
99
+ ```
100
+
101
+ Run Lighthouse category scoring:
102
+
103
+ ```bash
104
+ open-local-audit https://example.com --lighthouse --format html --out report.html
105
+ ```
106
+
107
+ `--lighthouse` runs Chrome through Lighthouse and adds performance, accessibility, best-practices, and SEO category scores to the report. It requires a local Chrome or Chromium runtime that Lighthouse can launch.
108
+
109
+ Write standard report formats to a directory:
110
+
111
+ ```bash
112
+ open-local-audit https://example.com --format all --out-dir reports
113
+ ```
114
+
115
+ Write an HTML report:
116
+
117
+ ```bash
118
+ open-local-audit https://example.com --format html --out report.html
119
+ ```
120
+
121
+ Write a branded PDF report:
122
+
123
+ ```bash
124
+ open-local-audit https://example.com --format pdf --out report.pdf
125
+ ```
126
+
127
+ Apply report branding:
128
+
129
+ ```bash
130
+ open-local-audit https://example.com --brand-config brand.json --format pdf --out report.pdf
131
+ ```
132
+
133
+ Minimal `brand.json`:
134
+
135
+ ```json
136
+ {
137
+ "name": "Example Audit Studio",
138
+ "primaryColor": "#123456",
139
+ "accentColor": "#2f7d5f",
140
+ "footerText": "Prepared for outreach review",
141
+ "contact": "hello@example.com"
142
+ }
143
+ ```
144
+
145
+ Brand colors must use six-digit hex values. Branding applies to Markdown, HTML, and PDF reports.
146
+
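The six-digit hex rule can be checked before a run. The following is a minimal sketch, not the CLI's own validator: `validateBrandColors` and the `BrandConfig` interface are hypothetical names, the field names come from the minimal `brand.json` above, and treating every field as optional and requiring the leading `#` are assumptions.

```typescript
// Hypothetical shape based on the minimal brand.json example; the real schema may differ.
interface BrandConfig {
  name?: string;
  primaryColor?: string;
  accentColor?: string;
  footerText?: string;
  contact?: string;
}

// Assumption: a leading "#" followed by exactly six hex digits, as in the example file.
const SIX_DIGIT_HEX = /^#[0-9a-fA-F]{6}$/;

// Returns one message per color field that is present but not a six-digit hex value.
function validateBrandColors(config: BrandConfig): string[] {
  const errors: string[] = [];
  const colorFields = ["primaryColor", "accentColor"] as const;
  for (const field of colorFields) {
    const value = config[field];
    if (value !== undefined && !SIX_DIGIT_HEX.test(value)) {
      errors.push(`${field} must be a six-digit hex color such as #123456`);
    }
  }
  return errors;
}
```

An empty result means both color fields passed. Shorthand values such as `#123` are reported.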
147
+ Run a batch audit from a text file:
148
+
149
+ ```bash
150
+ open-local-audit --input sites.txt --format all --out-dir reports
151
+ ```
152
+
153
+ Run a labeled CSV batch audit:
154
+
155
+ ```bash
156
+ open-local-audit --input sites.csv --format all --out-dir reports
157
+ ```
158
+
159
+ Run a profile-aware batch audit and write a prospect CSV export:
160
+
161
+ ```bash
162
+ open-local-audit --input sites.csv --profile dental --export-csv prospects.csv --format all --out-dir reports
163
+ ```
164
+
165
+ Run a focused batch triage index:
166
+
167
+ ```bash
168
+ open-local-audit --input sites.csv --format all --out-dir reports --segment dental --sort score-asc --top 25
169
+ ```
170
+
171
+ Run a controlled parallel batch audit:
172
+
173
+ ```bash
174
+ open-local-audit --input sites.csv --format all --out-dir reports --concurrency 3
175
+ ```
176
+
177
+ Capture screenshots during batch audits:
178
+
179
+ ```bash
180
+ open-local-audit --input sites.csv --screenshot --format all --out-dir reports
181
+ ```
182
+
183
+ Batch triage supports `--segment <segment>`, `--min-score <score>`, `--top <count>`, `--sort score-asc|severity-desc`, and `--concurrency <count>`. Batch runs can also write `--export-csv <path>` for prospect triage. Batch index reports include aggregate average score, profile breakdown, segment breakdown, frequent finding sections, contact rollups, and advisory outreach channel rollups. Batch CSV exports include contact confidence, preferred contact channel, and contactability reason when reports were audited successfully. Use `--export-preset crm` with `--export-csv` to write a CRM-ready local import CSV instead of the standard operator CSV.
184
+
185
+ Supported CSV columns:
186
+
187
+ ```csv
188
+ url,label,segment,profile
189
+ example.com,Example Clinic,dental,dental
190
+ https://example.org/path,Example Salon,beauty,beauty
191
+ ```
192
+
193
+ Manual CSV lead discovery:
194
+
195
+ ```bash
196
+ open-local-audit discover --input places.csv --provider manual-csv --profile dental --out-dir reports/dental --export-csv leads.csv
197
+ ```
198
+
199
+ The `manual-csv` provider reads an operator-prepared CSV, resolves supplied website URLs, audits rows that have a website through the existing batch pipeline, and writes `leads.csv` with `hasWebsite`, `websiteUrl`, `priority`, and `nextAction`. Use `--dry-run` to create the prospect CSV without auditing websites.
200
+
201
+ Google Places lead discovery:
202
+
203
+ ```bash
204
+ GOOGLE_MAPS_API_KEY=your-key open-local-audit discover "guzellik salonu Umraniye" --provider google-places --profile beauty --out-dir reports/umraniye-beauty --export-csv leads.csv
205
+ ```
206
+
207
+ Control discovery cost and audit volume:
208
+
209
+ ```bash
210
+ GOOGLE_MAPS_API_KEY=your-key open-local-audit discover "dis klinigi Kadikoy" --provider google-places --limit 20 --max-audits 5 --summary-json discovery-summary.json --out-dir reports/kadikoy-dental --export-csv leads.csv
211
+ ```
212
+
213
+ Skip previously reviewed leads and keep only stronger opportunities:
214
+
215
+ ```bash
216
+ open-local-audit discover --input places.csv --provider manual-csv --suppression-list reviewed-leads.csv --min-opportunity-score 90 --dry-run --export-csv leads.csv
217
+ ```
218
+
219
+ Maintain a review queue and duplicate review report across reruns:
220
+
221
+ ```bash
222
+ open-local-audit discover --input places.csv --provider manual-csv --review-csv review.csv --duplicates-json duplicates.json --dry-run --export-csv leads.csv
223
+ ```
224
+
225
+ Write a CRM-ready local import CSV:
226
+
227
+ ```bash
228
+ open-local-audit discover --input places.csv --provider manual-csv --dry-run --export-csv crm-leads.csv --export-preset crm
229
+ ```
230
+
231
+ Validate a CRM-ready local import CSV:
232
+
233
+ ```bash
234
+ open-local-audit validate-export --input crm-leads.csv --preset crm
235
+ open-local-audit validate-export --input crm-leads.csv --preset crm --format json
236
+ ```
237
+
238
+ `validate-export` checks the local CSV shape before a manual import. It reports missing CRM columns, missing `companyName`, missing `website`, missing `leadKey`, duplicate `leadKey` values, low or empty contact confidence, and rows that still need manual contact review. Markdown output is the default; JSON is available for automation. Any issue returns exit code `1`, while a clean file returns exit code `0`.
239
+
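As a rough illustration of the required-field and duplicate `leadKey` checks described above, here is a minimal sketch. It assumes the CSV has already been parsed into string-keyed rows. `validateCrmRows`, `CsvRow`, and `ValidationResult` are hypothetical names, and only a subset of the documented checks is covered (no column-presence or contact-confidence checks).

```typescript
// Hypothetical row shape: one CSV record as column name -> cell value.
type CsvRow = Record<string, string>;

interface ValidationResult {
  issues: string[];
  exitCode: 0 | 1; // mirrors the documented behavior: any issue -> 1, clean file -> 0
}

// Only the required values named in the paragraph above; the CRM preset has more columns.
const REQUIRED_FIELDS = ["companyName", "website", "leadKey"];

function validateCrmRows(rows: CsvRow[]): ValidationResult {
  const issues: string[] = [];
  const firstRowByKey = new Map<string, number>(); // leadKey -> first row number seen

  rows.forEach((row, index) => {
    const rowNumber = index + 1;
    for (const field of REQUIRED_FIELDS) {
      if (!(row[field] ?? "").trim()) {
        issues.push(`row ${rowNumber}: missing ${field}`);
      }
    }
    const key = (row["leadKey"] ?? "").trim();
    if (key) {
      const first = firstRowByKey.get(key);
      if (first !== undefined) {
        issues.push(`row ${rowNumber}: duplicate leadKey "${key}" (first seen in row ${first})`);
      } else {
        firstRowByKey.set(key, rowNumber);
      }
    }
  });

  return { issues, exitCode: issues.length > 0 ? 1 : 0 };
}
```

Returning the exit code alongside the issue list keeps the check easy to wire into a script that stops before a manual import.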
240
+ Create a local lead shortlist from a discovery or CRM CSV export:
241
+
242
+ ```bash
243
+ open-local-audit shortlist --input leads.csv --out shortlist.md --top 20
244
+ open-local-audit shortlist --input leads.csv --out shortlist.md --min-opportunity-score 80
245
+ open-local-audit shortlist --input leads.csv --sort score-desc --out shortlist.md
246
+ open-local-audit shortlist --input leads.csv --out shortlist.md --summary-json shortlist-summary.json
247
+ open-local-audit shortlist --input leads.csv --segment dental --priority high --contact-confidence High --out shortlist.csv --format csv
248
+ open-local-audit shortlist --input leads.csv --preferred-contact-channel email --out shortlist.json --format json
249
+ open-local-audit shortlist --input leads.csv --exclude-review-status deferred --out shortlist.json --format json
250
+ open-local-audit shortlist --input leads.csv --require-website --out shortlist.json --format json
251
+ open-local-audit shortlist --input leads.csv --missing-website --out shortlist.json --format json
252
+ open-local-audit shortlist --input leads.csv --unreviewed --out shortlist.json --format json
253
+ open-local-audit shortlist --input leads.csv --require-contact --out shortlist.json --format json
254
+ open-local-audit shortlist --input leads.csv --missing-contact --out shortlist.json --format json
255
+ open-local-audit shortlist --input leads.csv --require-report --out shortlist.json --format json
256
+ open-local-audit shortlist --input leads.csv --missing-report --out shortlist.json --format json
257
+ open-local-audit shortlist --input leads.csv --review-csv review.csv --review-status pending --out shortlist.json --format json
258
+ open-local-audit shortlist --input crm-leads.csv --out shortlist.json --format json
259
+ open-local-audit shortlist --input crm-leads.csv --out shortlist.csv --format csv
260
+ open-local-audit shortlist --input leads.csv --review-csv review.csv --out shortlist.md
261
+ ```
262
+
263
+ `shortlist` ranks local CSV rows by `opportunityScore`, `priority`, `contactConfidence`, `score`, and company name. Use `--sort opportunity-desc|score-desc|company-asc|last-reviewed-asc` to change the ranking mode before top-N selection. Use `--summary-json` to write a separate automation summary with counts and selected lead identifiers. Use `--require-website` to keep only rows with a website, `--missing-website` to keep only rows without a website, `--unreviewed` to keep only rows whose normalized `lastReviewedAt` is empty, `--require-contact` to keep only rows with contact confidence above `None`, `--missing-contact` to keep only rows with contact confidence set to `None`, `--require-report` to keep only rows with a report path, and `--missing-report` to keep only rows without a report path. Use `--min-opportunity-score` to keep only stronger local opportunities. Use `--segment`, `--profile`, `--priority`, `--contact-confidence`, and `--preferred-contact-channel` for case-insensitive exact focus filters; supplied filters, including `--unreviewed`, are combined with `AND` semantics. Use `--review-status` to keep only active leads with a matching review status after completed or suppressed review rows have been removed, or `--exclude-review-status` to remove a matching active review status from the local report. Filtering runs after review-state suppression and before sorting and top-N selection. With `--review-csv`, the command matches review rows by `leadKey`, normalized website, or company label; skips rows marked `rejected`, `contacted`, `not-fit`, `not_a_fit`, `do-not-contact`, or `suppressed`; and carries active `reviewStatus`, `reviewReason`, and `lastReviewedAt` values into Markdown, JSON, and CSV reports. Markdown output is the default; JSON is available for automation; CSV is available for spreadsheet review. The command only reads local CSV files and writes a local report. It does not mutate source files, call APIs, send outreach, or sync to a CRM.
264
+
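The processing order described above (review-state suppression, then filters, then sorting, then top-N selection) can be sketched as follows. This is a simplified illustration under stated assumptions, not the command's implementation: the `Lead` shape, `ShortlistOptions`, and `shortlistLeads` are hypothetical, review rows are matched by `leadKey` only, the score threshold is assumed to be inclusive, and the ranking uses only `opportunityScore` and company name.

```typescript
// Hypothetical lead shape with a few of the documented columns.
interface Lead {
  leadKey: string;
  companyName: string;
  segment: string;
  opportunityScore: number;
}

// Review statuses the README lists as skipped when a review CSV is supplied.
const SUPPRESSED_STATUSES = new Set([
  "rejected", "contacted", "not-fit", "not_a_fit", "do-not-contact", "suppressed",
]);

interface ShortlistOptions {
  reviewStatusByLeadKey?: Map<string, string>; // hypothetical: review CSV reduced to leadKey -> status
  segment?: string;
  minOpportunityScore?: number;
  top?: number;
}

function shortlistLeads(leads: Lead[], options: ShortlistOptions): Lead[] {
  // 1. Review-state suppression runs first.
  const active = leads.filter((lead) => {
    const status = options.reviewStatusByLeadKey?.get(lead.leadKey);
    return status === undefined || !SUPPRESSED_STATUSES.has(status.toLowerCase());
  });

  // 2. Supplied filters are combined with AND semantics; segment matching is case-insensitive and exact.
  const filtered = active.filter((lead) => {
    if (options.segment !== undefined && lead.segment.toLowerCase() !== options.segment.toLowerCase()) {
      return false;
    }
    if (options.minOpportunityScore !== undefined && lead.opportunityScore < options.minOpportunityScore) {
      return false;
    }
    return true;
  });

  // 3. Sort: highest opportunity score first, then company name (a simplified ranking).
  const sorted = [...filtered].sort(
    (a, b) => b.opportunityScore - a.opportunityScore || a.companyName.localeCompare(b.companyName),
  );

  // 4. Top-N selection happens last.
  return options.top === undefined ? sorted : sorted.slice(0, options.top);
}
```

Because top-N selection is the final step, a suppressed or filtered-out lead never takes a slot in the shortlist.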
265
+ Package an existing single-site report folder for local customer sharing:
266
+
267
+ ```bash
268
+ open-local-audit package-report --input reports/example-com --out packages/example-com
269
+ ```
270
+
271
+ `package-report` reads `open-local-audit-report.json` from the input folder, copies available JSON, Markdown, HTML, and PDF report artifacts into `reports/`, and writes `README.md`, `next-actions.md`, and `manifest.json`. It is local file packaging only; it does not upload reports, send outreach, or sync to a CRM.
272
+
273
+ The `google-places` provider is opt-in and requires `GOOGLE_MAPS_API_KEY`. It uses the official Places Text Search API and requests only `places.id`, `places.displayName`, and `places.websiteUri`. It does not scrape Google Maps, collect reviews/photos/ratings, send outreach, or store raw Places responses. Google Maps Platform billing and quota limits apply to API use, and the CLI prints a billing warning when this provider is selected.
274
+
275
+ Discovery CSV exports include `leadKey`, `opportunityScore`, `opportunityReasons`, `pitchAngle`, `recommendedOffer`, `estimatedNeed`, `outreachPriorityReason`, website-derived public contact columns, manual outreach handoff columns, and review columns for local triage. Contact columns are populated only from audited public website HTML and include `publicEmail`, `publicPhone`, `whatsappUrl`, `contactPageUrl`, `socialProfiles`, `contactConfidence`, and `contactSource`. Handoff columns include `preferredContactChannel`, `outreachAction`, and `contactabilityReason`; they are advisory only and do not send outreach. Duplicate review JSON includes exact `duplicateGroups` and advisory `fuzzyDuplicateGroups` with confidence and matching reasons for local operator review. Fuzzy duplicate candidates do not auto-suppress leads, change review status, send outreach, or sync to a CRM. The CRM export preset writes `companyName`, `website`, `segment`, `profile`, `priority`, `score`, `opportunityScore`, `topFinding`, `contactConfidence`, `preferredContactChannel`, `contactabilityReason`, `publicEmail`, `publicPhone`, `contactPageUrl`, `source`, `leadKey`, and `reportPath`; it is a local import file only and does not call CRM APIs. A suppression list can reuse a prior discovery CSV or a smaller CSV with `leadKey`, `sourceId`, `websiteUrl`, or `label` plus optional `reviewStatus`, `reviewReason`, and `lastReviewedAt` columns. Rows marked `rejected`, `contacted`, `not-fit`, `do-not-contact`, or `suppressed` are skipped before audits run. `--review-csv` preserves prior operator decisions and adds new leads as `pending`. Spreadsheet formula-like cell values are neutralized before export, so values such as phone numbers starting with `+` may be prefixed for spreadsheet safety.
276
+
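The formula neutralization mentioned above can be illustrated with a short sketch. The README does not state which characters trigger it or which prefix is used, so both are assumptions here: the trigger set (`=`, `+`, `-`, `@`) and the single-quote prefix follow common CSV-injection guidance, and `neutralizeCell` is a hypothetical name.

```typescript
// Assumption: leading characters that spreadsheet applications may interpret as a formula.
const FORMULA_PREFIXES = ["=", "+", "-", "@"];

// Prefixes formula-like values with a single quote so they are treated as plain text.
// Assumption: the actual prefix used by the CLI is not documented.
function neutralizeCell(value: string): string {
  return FORMULA_PREFIXES.some((prefix) => value.startsWith(prefix)) ? `'${value}` : value;
}
```

Under these assumptions, a phone number such as `+90 555 000 00 00` is exported with a leading quote, while `hello@example.com` is left unchanged because `@` is not its first character.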
277
+ See [Google Maps API key setup](./docs/operations/google-maps-api-key.md) for local environment setup.
278
+
279
+ Check same-origin links and fail CI when high-severity issues are found:
280
+
281
+ ```bash
282
+ open-local-audit https://example.com --check-links --max-pages 10 --fail-on high --format all --out-dir reports
283
+ ```
284
+
285
+ ## First implementation milestone
286
+
287
+ The first implementation milestone is a CLI that accepts one URL and outputs:
288
+
289
+ - JSON report.
290
+ - Markdown report.
291
+ - HTML report.
292
+ - PDF report.
293
+ - Executive Summary section in customer-facing reports.
294
+ - Contact Readiness section in customer-facing reports.
295
+ - Optional report branding with `--brand-config`.
296
+ - Combined JSON, Markdown, and HTML report output with `--format all --out-dir`.
297
+ - Batch input files with per-site report folders.
298
+ - CSV batch input with optional labels and segments.
299
+ - Aggregate batch index reports for prospect triage.
300
+ - Batch index filtering, sorting, and top-N triage controls.
301
+ - Controlled parallel batch audits with `--concurrency`.
302
+ - Batch index insights for average score, profile breakdown, segment breakdown, and frequent findings.
303
+ - Batch contact and outreach rollups for audited public website contact readiness.
304
+ - CRM-ready local CSV export preset for batch and discovery exports.
305
+ - Local CRM export validation with Markdown and JSON issue reports.
306
+ - Local lead shortlist reports from discovery and CRM CSV exports.
307
+ - Local shortlist sort modes with `--sort`.
308
+ - Local shortlist automation summaries with `--summary-json`.
309
+ - Local shortlist website-present filtering with `--require-website`.
310
+ - Local shortlist missing-website filtering with `--missing-website`.
311
+ - Local shortlist unreviewed filtering with `--unreviewed`.
312
+ - Local shortlist contact-ready filtering with `--require-contact`.
313
+ - Local shortlist missing-contact filtering with `--missing-contact`.
314
+ - Local shortlist report-ready filtering with `--require-report`.
315
+ - Local shortlist missing-report filtering with `--missing-report`.
316
+ - Local shortlist preferred-contact-channel filtering with `--preferred-contact-channel`.
317
+ - Local shortlist review-state suppression with `--review-csv`.
318
+ - Local shortlist opportunity-score filtering with `--min-opportunity-score`.
319
+ - Local shortlist focus filtering by segment, profile, priority, and contact confidence.
320
+ - Local shortlist active review-status filtering with `--review-status`.
321
+ - Local shortlist active review-status exclusion with `--exclude-review-status`.
322
+ - Spreadsheet-ready local shortlist CSV output with `--format csv`.
323
+ - Local report packaging with shareable summaries and copied report artifacts.
324
+ - Industry profiles for generic, dental, beauty, restaurant, contractor, lawyer, clinic, gym, hotel, and auto-service audits.
325
+ - Profile-specific findings for dental, beauty, restaurant, and contractor conversion/trust signals.
326
+ - Prospect CSV export with profile, score, top finding, contact handoff, report path, and error columns.
327
+ - Discovery CSV contact enrichment from audited public website HTML.
328
+ - Discovery CSV outreach handoff fields for manual next-channel selection.
329
+ - Advisory fuzzy duplicate lead review candidates for discovery reruns.
330
+ - Duplicate review output remains local operator guidance and does not auto-suppress, send outreach, or sync to CRM systems.
331
+ - Optional rendered DOM audits with `--render`.
332
+ - Optional rendered screenshot evidence with `--screenshot`.
333
+ - Optional Lighthouse category scoring with `--lighthouse`.
334
+ - Score summary.
335
+ - Evidence table.
336
+ - Optional same-origin link checks.
337
+ - Terminal summary when reports are written to files.
338
+ - CI-friendly exit codes with `--fail-on`.
339
+ - Clear owner-readable recommendations.
340
+ - Structured-data quality, address, opening-hours, service-location, CTA, and placeholder-copy checks.
341
+ - Trust and conversion checks for current date signals, review cues, service detail depth, brand icons, and placeholder social links.
342
+
343
+ Example target command:
344
+
345
+ ```bash
346
+ open-local-audit https://example.com --format markdown --out report.md
347
+ ```
348
+
349
+ Example report artifacts are available under [`examples/reports`](./examples/reports).
350
+
351
+ ## Known limits
352
+
353
+ - `--render` requires Playwright in the calling project and a working browser runtime.
354
+ - `--screenshot` uses the rendered audit path, requires `--out-dir`, and also requires Playwright.
355
+ - `--lighthouse` requires a local Chrome or Chromium runtime that Lighthouse can launch.
356
+ - `--format pdf` is supported for single-URL audits and requires `--out` or `--out-dir`.
357
+ - `--brand-config` reads local JSON only; it does not fetch remote assets or upload report data.
358
+ - Batch triage options apply to the aggregate batch index, not individual per-site report contents.
359
+ - Batch contact and outreach rollups are advisory local triage metadata; they do not send outreach or sync to a CRM.
360
+ - `--export-preset crm` changes CSV columns only; it does not create, update, or sync CRM records.
361
+ - `validate-export` checks local CSV files only; it does not create, update, or sync CRM records.
362
+ - `shortlist` ranks local CSV files only; it does not call APIs, send outreach, or sync CRM records.
363
+ - `shortlist --sort` changes local report ordering only; it does not update source or review CSV files.
364
+ - `shortlist --summary-json` writes local automation metadata only; it does not update source or review CSV files.
365
+ - `shortlist --require-website` filters local report output only; it does not update source or review CSV files.
366
+ - `shortlist --missing-website` filters local report output only; it does not update source or review CSV files.
367
+ - `shortlist --unreviewed` filters local report output only; it does not update source or review CSV files.
368
+ - `shortlist --require-contact` filters local report output only; it does not update source or review CSV files.
369
+ - `shortlist --missing-contact` filters local report output only; it does not update source or review CSV files.
370
+ - `shortlist --require-report` filters local report output only; it does not update source or review CSV files.
371
+ - `shortlist --missing-report` filters local report output only; it does not update source or review CSV files.
372
+ - `shortlist --preferred-contact-channel` filters local report output only; it does not update source or review CSV files.
373
+ - `shortlist --review-csv` reads review state for suppression and report context only; it does not mutate the review CSV.
374
+ - `shortlist --min-opportunity-score` filters local report output only; it does not update source lead files.
375
+ - Shortlist focus filters use case-insensitive exact matching and do not update source lead files.
376
+ - `shortlist --review-status` filters local report output only; it does not update source or review CSV files.
377
+ - `shortlist --exclude-review-status` filters local report output only; it does not update source or review CSV files.
378
+ - `shortlist --format csv` writes a local spreadsheet review file only; it does not import or sync records.
379
+ - `package-report` packages local files only; it does not upload reports, send outreach, or sync CRM records.
380
+ - `--export-csv` is supported for batch audits and discovery exports.
381
+ - `discover --provider google-places` requires `GOOGLE_MAPS_API_KEY` and may incur Google Maps Platform billing.
382
+ - `--limit` caps Google Places candidates at 50; it does not paginate beyond one Text Search request.
383
+ - `--max-audits` limits website audits only; all discovered candidates still appear in `leads.csv`.
384
+ - `--suppression-list` uses exact lead identity matching; source IDs are preferred, then normalized website URLs, then normalized labels.
385
+ - `--review-csv` is local operator state only; it does not send outreach or sync to a CRM.
386
+ - `--duplicates-json` reports exact duplicate lead groups and advisory fuzzy duplicate candidates for manual review; it does not auto-suppress rows or update review decisions.
387
+ - Batch input requires `--out-dir` and cannot be combined with a positional URL.
388
+ - Industry profiles are deterministic vertical heuristics, not a replacement for a human review of each business model.
389
+ - Higher `--concurrency` values can increase network load against audited sites; use conservative values for prospect batches.
390
+ - Rule checks are deterministic heuristics, so they can miss or over-flag site-specific markup.
391
+
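The `--suppression-list` matching preference noted in the list above (source IDs, then normalized website URLs, then normalized labels) could look roughly like the sketch below. It is one possible interpretation, not the CLI's logic: `sameLead`, `LeadIdentity`, and both normalization helpers are hypothetical, and the normalization rules (lowercasing, dropping the scheme, `www.`, and trailing slashes) are assumptions.

```typescript
// Hypothetical identity fields shared by a discovered lead and a suppression row.
interface LeadIdentity {
  sourceId?: string;
  websiteUrl?: string;
  label?: string;
}

// Assumed normalization: lowercase, no scheme, no leading "www.", no trailing slashes.
function normalizeWebsite(url: string): string {
  return url
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\//, "")
    .replace(/^www\./, "")
    .replace(/\/+$/, "");
}

// Assumed normalization: lowercase with collapsed whitespace.
function normalizeLabel(label: string): string {
  return label.trim().toLowerCase().replace(/\s+/g, " ");
}

// Compares the strongest identity both sides provide: source ID, then website, then label.
function sameLead(a: LeadIdentity, b: LeadIdentity): boolean {
  if (a.sourceId && b.sourceId) return a.sourceId === b.sourceId;
  if (a.websiteUrl && b.websiteUrl) {
    return normalizeWebsite(a.websiteUrl) === normalizeWebsite(b.websiteUrl);
  }
  if (a.label && b.label) return normalizeLabel(a.label) === normalizeLabel(b.label);
  return false;
}
```

In this interpretation, two rows with different source IDs are treated as different leads even when their websites match, which keeps the match exact rather than fuzzy.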
392
+ ## GitHub and npm release intent
393
+
394
+ The project should be prepared for:
395
+
396
+ - Public GitHub repository.
397
+ - Clear README and examples.
398
+ - MIT license, unless maintainers choose a different license before the first public release.
399
+ - GitHub Actions for lint, tests, build, and release checks.
400
+ - npm package after the CLI has real tests and example reports.
401
+
402
+ Release work should follow the checklist in `docs/release/release-readiness.md`.