@consenttheater/playbill 0.1.0

package/README.md ADDED
@@ -0,0 +1,213 @@
1
+ # @consenttheater/playbill
2
+
3
+ [![npm version](https://img.shields.io/npm/v/@consenttheater/playbill.svg)](https://www.npmjs.com/package/@consenttheater/playbill)
4
+ [![npm downloads](https://img.shields.io/npm/dm/@consenttheater/playbill.svg)](https://www.npmjs.com/package/@consenttheater/playbill)
5
+ [![License: AGPL-3.0-or-later](https://img.shields.io/badge/License-AGPL--3.0--or--later-blue.svg)](LICENSE)
6
+ [![TypeScript](https://img.shields.io/badge/TypeScript-types%20included-3178C6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
7
+ [![Node](https://img.shields.io/node/v/@consenttheater/playbill.svg)](package.json)
8
+
9
+ **The Playbill** — an open-source, tiered knowledge base of GDPR-relevant web trackers, with pure-function matching and risk-scoring utilities.
10
+
11
+ A theater playbill lists every actor and their role on stage. This package does the same for the web: every tracking cookie, every tracking domain, every company behind them — identified, categorized, and scored against GDPR.
12
+
13
+ A standalone library — no browser dependencies, no runtime side effects, no lock-in. Useful for anyone building privacy tooling:
14
+
15
+ - Cookie banner auditors and Consent Management Platforms (CMPs)
16
+ - Browser extensions and user-agent privacy features
17
+ - CI/CD compliance scanners (catch GDPR regressions before they ship)
18
+ - Web crawlers, site-grading services, accessibility and privacy dashboards
19
+ - Academic research, regulatory studies, journalism projects
20
+ - Your own privacy tools, commercial or otherwise (subject to the AGPL — see License)
21
+
22
+ ## What's in the Playbill
23
+
24
+ - **Cookie signatures** — name, owning company, service, purpose, severity, lifetime, docs link
25
+ - **Domain signatures** — hostname, owning company, service, category, severity
26
+ - **2,800+ companies** — from Google and Meta to regional EU ad networks and niche SaaS tools
27
+ - **8,000+ entries** across 11 categories — one of the largest AGPL-licensed tracker databases available
28
+ - **Matching utilities** — exact + pattern (trailing `*`) cookie matching, exact + subdomain hostname matching
29
+ - **Scoring utilities** — GDPR-weighted compliance score with four risk bands
30
+
31
+ ### Current stats
32
+
33
+ | Metric | Count |
34
+ |--------|-------|
35
+ | Cookie signatures | 2,307 |
36
+ | Domain signatures | 6,297 |
37
+ | Total entries | 8,604 |
38
+ | Unique companies | 2,839 |
39
+ | Categories | 11 |
40
+
41
+ ## Tiers
42
+
43
+ Choose what you need. Every tier is **computed at runtime** from a single set of source files, so there are no pre-built tier bundles to drift out of sync:
44
+
45
+ | Tier | Entries | Companies | Size (gzip) | Use case |
46
+ |------|---------|-----------|-------------|----------|
47
+ | `mini` | ~620 | ~45 | ~25 KB | Lightweight widgets, top-50 company quick checks |
48
+ | `core` | ~7,000 | ~2,470 | ~275 KB | CI/CD scanners, most compliance tools |
49
+ | `full` | ~8,100 | ~2,825 | ~320 KB | Complete audits, regional/niche coverage |
50
+
51
+ **Tier semantics:**
52
+ - `mini` — top 50 companies by entry count, `critical` + `high` severity only
53
+ - `core` — all `critical` + `high` + `medium` severity, any company
54
+ - `full` — everything, including `low` severity, regional, and niche
55
+
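The tier semantics above can be read as a filter over a flat list of entries. The sketch below is one possible reading, not the package's actual implementation: the `Entry` shape and the `topCompanies` and `selectTier` helpers are hypothetical names introduced for illustration, and ranking companies over all entries before applying the severity filter is an assumption.

```typescript
// Hypothetical entry shape; the real package types may differ.
type Severity = 'critical' | 'high' | 'medium' | 'low';
type Tier = 'mini' | 'core' | 'full';

interface Entry {
  key: string; // cookie name or hostname
  company: string;
  severity: Severity;
}

// Return the `limit` companies that own the most entries.
function topCompanies(entries: Entry[], limit: number): Set<string> {
  const counts = new Map<string, number>();
  for (const e of entries) {
    counts.set(e.company, (counts.get(e.company) ?? 0) + 1);
  }
  const ranked = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  return new Set(ranked.slice(0, limit).map(([company]) => company));
}

// Apply the tier rules described above to a flat entry list.
function selectTier(entries: Entry[], tier: Tier, topN = 50): Entry[] {
  if (tier === 'full') return entries;
  if (tier === 'core') return entries.filter((e) => e.severity !== 'low');
  // mini: top-N companies by entry count, critical + high severity only
  const top = topCompanies(entries, topN);
  return entries.filter(
    (e) => top.has(e.company) && (e.severity === 'critical' || e.severity === 'high'),
  );
}
```

Because every tier is derived from the same list, a severity change in one source entry is reflected in all three tiers the next time they are computed.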
56
+ ## Usage
57
+
58
+ ```ts
59
+ import { loadPlaybill, matchCookie, matchDomain, computeScore } from '@consenttheater/playbill';
60
+
61
+ // Load the tier you need
62
+ const playbill = loadPlaybill('core');
63
+
64
+ // Identify a cookie
65
+ const cookie = matchCookie(playbill, '_ga');
66
+ // → {
67
+ // name: '_ga',
68
+ // company: 'Google',
69
+ // service: 'Google Analytics',
70
+ // category: 'analytics',
71
+ // severity: 'high',
72
+ // description: 'Distinguishes unique users...',
73
+ // lifetime: '2 years',
74
+ // docs_url: 'https://developers.google.com/...'
75
+ // }
76
+
77
+ // Identify a domain (exact or subdomain match)
78
+ const domain = matchDomain(playbill, 'connect.facebook.net');
79
+ // → { hostname: 'connect.facebook.net', company: 'Meta',
80
+ // service: 'Meta Pixel', category: 'advertising', severity: 'critical' }
81
+
82
+ // Score a page's compliance against GDPR
83
+ const result = computeScore({
84
+ preConsentCookies: [cookie],
85
+ preConsentRequests: [domain],
86
+ dataLeakRequests: [],
87
+ banner: { detected: true, hasAcceptButton: true, hasRejectButton: false }
88
+ });
89
+ // → { score: 45,
90
+ // band: { key: 'non_compliant', label: 'Non-Compliant' },
91
+ // violations: [...] }
92
+ ```
93
+
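The usage example above mentions exact and subdomain matching for hostnames. One plausible way to implement that lookup is to try the full hostname first and then strip leading labels one at a time. This is a sketch under that assumption: `lookupHost`, the plain `Map` index, and the sample data are hypothetical and not part of the package API.

```typescript
// Hypothetical index: known tracker hostname -> owning company.
const knownHosts = new Map<string, string>([
  ['connect.facebook.net', 'Meta'],
  ['doubleclick.net', 'Google'],
]);

// Try the exact hostname, then each parent domain, longest first.
function lookupHost(index: Map<string, string>, hostname: string): string | undefined {
  const labels = hostname.toLowerCase().split('.');
  // For multi-label hosts, stop before the bare TLD so 'net' alone is never tried.
  const stop = Math.max(1, labels.length - 1);
  for (let i = 0; i < stop; i++) {
    const candidate = labels.slice(i).join('.');
    const hit = index.get(candidate);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

console.log(lookupHost(knownHosts, 'stats.g.doubleclick.net')); // Google
console.log(lookupHost(knownHosts, 'notdoubleclick.net')); // undefined
```

Comparing whole labels rather than raw string suffixes keeps `notdoubleclick.net` from matching `doubleclick.net`.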
94
+ ### Loading individual categories
95
+
96
+ For tools that only care about a subset of categories (e.g. an ad and analytics auditor doesn't need chat widgets or CAPTCHA entries):
97
+
98
+ ```ts
99
+ import { loadActors } from '@consenttheater/playbill';
100
+
101
+ const playbill = loadActors(['advertising', 'analytics']);
102
+ ```
103
+
104
+ ### Direct category imports (tree-shakeable)
105
+
106
+ Each of the 11 categories is available as its own subpath export. Useful when your bundler can tree-shake and you want to skip categories entirely:
107
+
108
+ ```ts
109
+ import advertising from '@consenttheater/playbill/actors/advertising';
110
+ import analytics from '@consenttheater/playbill/actors/analytics';
111
+ import dataLeak from '@consenttheater/playbill/actors/data-leak';
112
+ // Also: marketing, functional, social, session-recording, security,
113
+ // consent, fingerprinting, tag-manager
114
+ ```
115
+
116
+ ### Matcher-only / scorer-only
117
+
118
+ If you only need the matcher (no scoring) or the scorer (no database):
119
+
120
+ ```ts
121
+ import { matchCookie, matchDomain } from '@consenttheater/playbill/matcher';
122
+ import { computeScore, bandForScore, SEVERITY_WEIGHTS, BANDS } from '@consenttheater/playbill/scorer';
123
+ import type { Playbill, CookieActor, ScoreResult } from '@consenttheater/playbill/types';
124
+ ```
125
+
126
+ ## Categories
127
+
128
+ | Category | Description |
129
+ |----------|-------------|
130
+ | `advertising` | Ad targeting, retargeting, DSPs, SSPs, conversion tracking, programmatic |
131
+ | `analytics` | Usage measurement, audience insights, A/B testing, CDP, attribution |
132
+ | `marketing` | Email, SMS, push, CRM tracking, marketing automation, lead capture |
133
+ | `functional` | Chat widgets, forms, payments, CMS features, loyalty, accessibility |
134
+ | `social` | Social media embeds, sharing widgets, social login |
135
+ | `session_recording` | Heatmaps, session replays, screen recording, click/scroll tracking |
136
+ | `data_leak` | Third-party resources that expose visitor IP — fonts, embeds, CDNs, maps |
137
+ | `security` | Bot detection, CAPTCHA, CSRF protection, fraud prevention |
138
+ | `consent` | Consent Management Platforms (CMPs), banners, preference management |
139
+ | `fingerprinting` | Browser fingerprinting, device identification, cross-device tracking |
140
+ | `tag_manager` | Tag management systems — container scripts that load other trackers |
141
+
142
+ ### The `data_leak` category is special
143
+
144
+ Entries categorized as `data_leak` (Google Fonts, Typekit, YouTube embeds, Google Maps, etc.) are counted as violations **even after consent**. Rationale: the Austrian DPA ruling (2022) and LG München judgments are read here as holding that sending a visitor's IP address to third parties violates GDPR regardless of consent, because the request fires before any dialog can mediate.
145
+
146
+ ## Severity levels
147
+
148
+ | Severity | GDPR weight | Meaning |
149
+ |----------|-------------|---------|
150
+ | `critical` | -25 points | Ad/retargeting trackers — clear GDPR violation if set before consent |
151
+ | `high` | -15 points | Analytics without legal basis; banner missing reject option |
152
+ | `medium` | -10 points | Session recording, data leaks — elevated risk exposure |
153
+ | `low` | -5 points | Functional/security — typically exempt under legitimate interest |
154
+
155
+ ## Risk bands
156
+
157
+ `computeScore()` returns one of four bands based on the total score (100 minus the points deducted for each violation):
158
+
159
+ | Score | Band | Meaning |
160
+ |-------|------|---------|
161
+ | ≥ 90 | `compliant` | GDPR-clean or near-clean |
162
+ | 70–89 | `at_risk` | Minor issues — one critical violation drops you here |
163
+ | 40–69 | `non_compliant` | Two critical violations or equivalent |
164
+ | < 40 | `violating` | Three or more critical violations or equivalent; systemic failure |
165
+
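To make the arithmetic concrete, the sketch below reproduces the score formula and band thresholds using only the weights and ranges published in the tables above. `scoreFromSeverities` and `bandFor` are hypothetical helpers written for this illustration; they are not the package's `computeScore` or `bandForScore`, and flooring the score at zero is an assumption.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';
type BandKey = 'compliant' | 'at_risk' | 'non_compliant' | 'violating';

// Points deducted per violation, taken from the severity table.
const WEIGHTS: Record<Severity, number> = { critical: 25, high: 15, medium: 10, low: 5 };

// 100 minus the summed deductions; flooring at 0 is an assumption.
function scoreFromSeverities(violations: Severity[]): number {
  const penalty = violations.reduce((sum, s) => sum + WEIGHTS[s], 0);
  return Math.max(0, 100 - penalty);
}

// Map a score to one of the four bands from the risk-band table.
function bandFor(score: number): BandKey {
  if (score >= 90) return 'compliant';
  if (score >= 70) return 'at_risk';
  if (score >= 40) return 'non_compliant';
  return 'violating';
}

// The Usage example: one high cookie, one critical domain, and a banner
// without a reject button (listed as high in the severity table).
const example = scoreFromSeverities(['high', 'critical', 'high']);
console.log(example, bandFor(example)); // 45 non_compliant
```

Under these weights a single critical violation yields 75 (`at_risk`) and two yield 50 (`non_compliant`), matching the band descriptions.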
166
+ ## Pattern cookies
167
+
168
+ Cookie entries marked `"pattern": true` use **prefix matching with a trailing `*`**:
169
+
170
+ ```json
171
+ "_ga_*": { "pattern": true, "company": "Google", ... }
172
+ ```
173
+
174
+ This matches `_ga_ABC123`, `_ga_XYZ789`, etc. Wildcards in the middle of a key are not supported — only trailing `*`.
175
+
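A minimal sketch of trailing-`*` prefix matching, assuming signatures live in a plain object keyed by cookie name as in the JSON snippet above. `findCookie` and the `CookieSignature` shape are hypothetical; the package's own `matchCookie` may order lookups or resolve overlapping patterns differently.

```typescript
interface CookieSignature {
  company: string;
  pattern?: boolean;
}

// Illustrative signatures only.
const signatures: Record<string, CookieSignature> = {
  '_ga': { company: 'Google' },
  '_ga_*': { pattern: true, company: 'Google' },
};

function findCookie(
  db: Record<string, CookieSignature>,
  name: string,
): CookieSignature | undefined {
  // Exact (non-pattern) names take priority over patterns.
  if (Object.prototype.hasOwnProperty.call(db, name) && !db[name].pattern) {
    return db[name];
  }
  // Otherwise pick the longest matching prefix among pattern entries.
  let best: string | undefined;
  for (const key of Object.keys(db)) {
    if (!db[key].pattern || !key.endsWith('*')) continue;
    const prefix = key.slice(0, -1); // drop the trailing '*'
    if (name.startsWith(prefix) && (best === undefined || key.length > best.length)) {
      best = key;
    }
  }
  return best === undefined ? undefined : db[best];
}
```

Here `findCookie(signatures, '_ga_ABC123')` resolves through the `_ga_*` pattern, while `_gat` matches nothing because it does not start with `_ga_`.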
176
+ ## Types
177
+
178
+ All types are exported:
179
+
180
+ ```ts
181
+ import type {
182
+ Playbill, Tier,
183
+ CookieActor, DomainActor,
184
+ CookieMatch, DomainMatch,
185
+ Severity, Category,
186
+ ScoreInput, ScoreResult, Violation,
187
+ Band, BandKey,
188
+ } from '@consenttheater/playbill';
189
+ ```
190
+
191
+ ## License
192
+
193
+ **AGPL-3.0-or-later** — free to use, including commercially, but modifications and derivative works must remain open source under a compatible license. This applies even when the software is offered as a hosted service (SaaS). See [LICENSE](./LICENSE).
194
+
195
+ The AGPL is a deliberate choice: the tracker knowledge encoded here represents substantial community research, and we want forks, hosted scanners, and downstream tools to stay open so the ecosystem as a whole improves.
196
+
197
+ ## Contributing
198
+
199
+ Found a tracker we're missing? Want to fix an incorrect severity or update a lifetime? PRs welcome.
200
+
201
+ Each entry needs:
202
+ - Cookie name or domain
203
+ - Owning company and service name
204
+ - Category and severity
205
+ - One-sentence description
206
+ - Cookie lifetime (for cookies)
207
+ - Link to official documentation
208
+
209
+ After editing any file under `src/actors/`, run:
210
+
211
+ ```sh
212
+ npm run normalize # sorts keys, reformats, flags duplicates, updates stats
213
+ ```