moderate 0.1.0 → 1.0.0.beta1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (65)
  1. checksums.yaml +4 -4
  2. data/.rubocop.yml +8 -0
  3. data/.simplecov +62 -0
  4. data/AGENTS.md +7 -0
  5. data/Appraisals +16 -0
  6. data/CHANGELOG.md +71 -1
  7. data/CLAUDE.md +7 -0
  8. data/README.md +376 -29
  9. data/Rakefile +28 -2
  10. data/app/controllers/concerns/moderate/moderation.rb +161 -0
  11. data/app/controllers/moderate/appeals_controller.rb +190 -0
  12. data/app/controllers/moderate/application_controller.rb +45 -0
  13. data/app/controllers/moderate/notices_controller.rb +382 -0
  14. data/app/controllers/moderate/transparency_reports_controller.rb +30 -0
  15. data/app/helpers/moderate/engine_helper.rb +151 -0
  16. data/app/views/moderate/appeals/new.html.erb +78 -0
  17. data/app/views/moderate/notices/new.html.erb +255 -0
  18. data/app/views/moderate/transparency_reports/_summary_card.html.erb +20 -0
  19. data/app/views/moderate/transparency_reports/show.html.erb +52 -0
  20. data/config/moderate/blocklists/en.yml +81 -0
  21. data/config/moderate/blocklists/es.yml +40 -0
  22. data/config/routes.rb +36 -0
  23. data/docs/compliance.md +178 -0
  24. data/docs/configuration.md +326 -0
  25. data/docs/dsa-notice-form.md +371 -0
  26. data/docs/madmin.md +490 -0
  27. data/docs/notifications.md +363 -0
  28. data/examples/aws_rekognition_adapter.rb +140 -0
  29. data/examples/openai_moderation_adapter.rb +111 -0
  30. data/gemfiles/rails_7.1.gemfile +36 -0
  31. data/gemfiles/rails_7.2.gemfile +36 -0
  32. data/gemfiles/rails_8.1.gemfile +36 -0
  33. data/lib/generators/moderate/install_generator.rb +56 -0
  34. data/lib/generators/moderate/templates/create_moderate_tables.rb.erb +237 -0
  35. data/lib/generators/moderate/templates/initializer.rb +198 -0
  36. data/lib/generators/moderate/views_generator.rb +63 -0
  37. data/lib/moderate/configuration.rb +341 -0
  38. data/lib/moderate/engine.rb +138 -0
  39. data/lib/moderate/errors.rb +26 -0
  40. data/lib/moderate/event.rb +75 -0
  41. data/lib/moderate/filters/base.rb +126 -0
  42. data/lib/moderate/filters/wordlist.rb +255 -0
  43. data/lib/moderate/jobs/classify_job.rb +158 -0
  44. data/lib/moderate/label.rb +111 -0
  45. data/lib/moderate/macros.rb +90 -0
  46. data/lib/moderate/models/appeal.rb +154 -0
  47. data/lib/moderate/models/application_record.rb +31 -0
  48. data/lib/moderate/models/block.rb +203 -0
  49. data/lib/moderate/models/concerns/actor.rb +174 -0
  50. data/lib/moderate/models/concerns/content_filterable.rb +155 -0
  51. data/lib/moderate/models/concerns/reportable.rb +282 -0
  52. data/lib/moderate/models/flag.rb +136 -0
  53. data/lib/moderate/models/report.rb +620 -0
  54. data/lib/moderate/result.rb +176 -0
  55. data/lib/moderate/services/intake_appeal.rb +89 -0
  56. data/lib/moderate/services/intake_notice.rb +132 -0
  57. data/lib/moderate/services/intake_report.rb +132 -0
  58. data/lib/moderate/services/resolve_appeal.rb +134 -0
  59. data/lib/moderate/services/resolve_flag.rb +101 -0
  60. data/lib/moderate/services/resolve_report.rb +291 -0
  61. data/lib/moderate/version.rb +1 -1
  62. data/lib/moderate.rb +365 -18
  63. data/log/development.log +0 -0
  64. data/log/test.log +0 -0
  65. metadata +154 -15
@@ -0,0 +1,363 @@
1
+ # Notifications — wire email, in-app, and admin alerts once
2
+
3
+ `moderate` never sends an email, never posts to Telegram, never writes a push. It does exactly one thing with notifications: it **emits events**. You wire those events to wherever they should go — **once** — and everything downstream (a report receipt, a decision email, an admin "🚩 new flag" ping, an in-app feed item) flows from that single hook.
4
+
5
+ That hook is `config.notify`:
6
+
7
+ ```ruby
8
+ Moderate.configure do |config|
9
+ config.notify = ->(event) { ... } # called for every notifiable event; no-op by default
10
+ end
11
+ ```
12
+
13
+ One lambda. Every notifiable thing that happens in your Trust & Safety layer passes through it. Inside it you `case` on `event.name` and fan out to:
14
+
15
+ - **[`goodmail`](https://github.com/rameerez/goodmail)** — beautiful transactional **emails** to the *user* (report receipt, decision / statement-of-reasons, appeal outcome).
16
+ - **[`telegrama`](https://github.com/rameerez/telegrama)** — a one-line Telegram **admin alert** (a new report landed, content got auto-flagged, a DSA notice arrived).
17
+ - **[`noticed`](https://github.com/excid3/noticed)** — multi-channel **in-app feed + push** to the user.
18
+
19
+ The whole point: **notify users by email and in-app, AND optionally ping yourself on Telegram, all wired in one place.** No notification logic scattered across your models, services, and controllers — `moderate` collects every moment worth telling someone about and hands them to you with a stable envelope.
20
+
21
+ > [!NOTE]
22
+ > `config.notify` is for **people** (users get emails/push, admins get a Telegram nudge). It's separate from `config.audit`, which is for **machines** (an append-only record of every action for your audit log). Same envelope, different intent; see [Audit](#audit--the-other-hook) at the bottom. Most apps wire both, in the same initializer, in about fifteen lines.
23
+
24
+ ---
25
+
26
+ ## TL;DR — the one recipe
27
+
28
+ ```ruby
29
+ # config/initializers/moderate.rb
30
+ Moderate.configure do |config|
31
+ config.notify = ->(event) do
32
+ # 1) Email the user (goodmail) — receipts, decisions, statements of reasons.
33
+ case event.name
34
+ when :report_received, :report_decision, :affected_user_decision,
35
+ :appeal_received, :appeal_decision, :notice_received
36
+ Array(event.recipients).each do |user|
37
+ next unless user.respond_to?(:email) # anonymous DSA notifiers carry an email, not a User
38
+ ModerationMailer.with(event: event, recipient: user)
39
+ .public_send(event.name)
40
+ .deliver_later
41
+ end
42
+ end
43
+
44
+ # 2) Ping admins on Telegram (telegrama) — ONE message, only for things YOU care about.
45
+ case event.name
46
+ when :report_received, :notice_received, :content_flagged
47
+ Telegrama.send_message(event.payload[:summary], formatting: { obfuscate_emails: true })
48
+ end
49
+
50
+ # 3) In-app feed + push (noticed) — the user's bell icon and device.
51
+ case event.name
52
+ when :report_decision, :affected_user_decision, :appeal_decision,
53
+ :user_banned, :content_removed
54
+ Moderate::DecisionNotifier.with(event: event.to_h).deliver(event.recipients)
55
+ end
56
+ end
57
+ end
58
+ ```
59
+
60
+ That's the entire integration. Three `case`s, one per channel, in one lambda. Add or drop an `event.name` from any list to turn a channel on/off for that moment. The rest of this doc explains the envelope, the full vocabulary, and each channel in detail.
61
+
62
+ > [!IMPORTANT]
63
+ > **Keep `config.notify` fast.** It runs inside the same call as the moderation action (often inside a transaction). Always hand off to a background job — `deliver_later` (goodmail / Action Mailer), `perform_later` (your own jobs), or `Telegrama`'s async mode (below). Never do blocking HTTP in the hook itself.
64
+
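A minimal sketch of the hand-off pattern described in the note above, in plain Ruby so it runs without Rails. The `jobs` queue and the `drain` worker are hypothetical stand-ins for Active Job and a real mailer or HTTP call; neither is part of `moderate`, and events are plain Hashes here rather than the gem's envelope.

```ruby
# Hypothetical stand-ins: `jobs` plays the role of a background queue
# (Active Job in a real app); `delivered` records what the worker sent.
jobs = Queue.new
delivered = []

# The hook only decides and enqueues. It never does the slow work itself.
notify = ->(event) do
  jobs << event if %i[report_received content_flagged].include?(event[:name])
end

# Simulated worker: drains the queue outside the moderation call.
drain = -> { delivered << jobs.pop[:summary] until jobs.empty? }

notify.call({ name: :report_received, summary: "New report on Comment #1" })
notify.call({ name: :user_blocked, summary: "Block created" }) # not in the list, ignored

queued_before_drain = jobs.size    # work is waiting, nothing has been sent yet
sent_before_drain = delivered.dup
drain.call
```

The point of the split is that the moderation action returns as soon as the event is queued; a slow or failing delivery can then only affect the worker.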
65
+ ---
66
+
67
+ ## The event envelope
68
+
69
+ Every event your `notify` hook receives is the same small, stable object. You never construct it — `moderate` does — you only read it:
70
+
71
+ ```ruby
72
+ event.name # Symbol — which moment this is, e.g. :report_decision (the full list below)
73
+ event.subject # the record this is about — a Moderate::Report / Flag / Block / Appeal / Notice
74
+ event.actor # who triggered it — a moderator, a user, or nil for system/automated events
75
+ event.recipients # Array of who should be told — usually the user(s) to email/notify
76
+ event.payload # Hash of event-specific context, ALWAYS including :summary (a ready-to-send line)
77
+ event.to_h # the whole envelope as a Hash (handy for logging, tests, and noticed params)
78
+ ```
79
+
80
+ Two fields do most of the work:
81
+
82
+ - **`event.recipients`** — already resolved to *the right people* for this event. For `report_received` it's the reporter (their receipt). For `report_decision` it's the reporter (the outcome). For `affected_user_decision` it's the person whose content/account was acted on (their statement of reasons). You don't compute audiences — `moderate` does, and hands you the array. Iterate it.
83
+ - **`event.payload[:summary]`** — a short, human, already-redaction-safe one-liner describing what happened (e.g. `"New harassment report on Comment #4213 by joh…e@example.com"`). It exists so your **admin Telegram ping is literally one line** — no string-building, no leaking. Everything richer (the model, the category, the moderator's note) is also in `payload` if you want to compose your own copy.
84
+
85
+ A recipient is usually one of your `User` records (it responds to `email`, `id`, etc.), **except** for DSA-notice events (`notice_received`, and the `affected_user_decision` for a public notice), where the notifier may be an anonymous person. There `moderate` gives you a lightweight recipient that responds to `email`/`name` but is **not** a `User`. The `respond_to?(:email)` guard in the recipes lets both kinds of recipient through to the mailer and skips anything without an address; add a `user.is_a?(User)` check on channels that need a real account, such as the in-app feed.
86
+
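The recipient handling described above can be sketched with stand-in objects. Everything here is an assumption made for illustration: `user_class`, `anonymous_class` and `event_class` are hypothetical Structs, not the gem's `Moderate::Event` or your `User` model.

```ruby
# Hypothetical stand-ins for a User record, an anonymous DSA notifier,
# and the event envelope. The real envelope is built by `moderate`.
user_class = Struct.new(:id, :email)
anonymous_class = Struct.new(:name, :email)
event_class = Struct.new(:name, :subject, :actor, :recipients, :payload, keyword_init: true)

event = event_class.new(
  name: :notice_received,
  subject: nil,
  actor: nil,
  recipients: [
    user_class.new(1, "ann@example.com"),
    anonymous_class.new("Bo", "bo@example.com"),
    anonymous_class.new("No address", nil)
  ],
  payload: { summary: "New DSA notice" }
)

# Email channel: anything that exposes a usable address qualifies.
emailable = event.recipients.select { |r| r.respond_to?(:email) && !r.email.to_s.empty? }

# In-app channel: only real accounts qualify.
in_app = event.recipients.grep(user_class)
```

Filtering per channel keeps the anonymous notifier's confirmation email working while keeping non-account recipients out of the in-app feed.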
87
+ ---
88
+
89
+ ## The full event vocabulary
90
+
91
+ This is every event `moderate` can emit. Wire the ones you care about; ignore the rest (the hook is a plain `case`, so unmatched names just fall through).
92
+
93
+ | Event | When it fires | `actor` | `recipients` | Typical channels |
94
+ | --- | --- | --- | --- | --- |
95
+ | `report_received` | A user files an in-app report | the reporter | the reporter (receipt) | 📧 user receipt · 💬 admin ping |
96
+ | `notice_received` | A public **DSA Art. 16** notice is submitted | the notifier (may be anonymous) | the notifier (Art. 16(4) confirmation of receipt) | 📧 confirmation · 💬 admin ping |
97
+ | `report_decision` | A moderator resolves/dismisses a report | the moderator | the reporter (outcome) | 📧 email · 🔔 in-app |
98
+ | `affected_user_decision` | A decision affects the reported party | the moderator | the reported user / notice subject (**statement of reasons**, Art. 17) | 📧 email · 🔔 in-app |
99
+ | `appeal_received` | A user appeals a decision (Art. 20) | the appellant | the appellant (receipt) | 📧 receipt · 💬 admin ping |
100
+ | `appeal_decision` | A moderator upholds/rejects an appeal | the moderator | the appellant | 📧 email · 🔔 in-app |
101
+ | `user_blocked` | One user blocks another | the blocker | — *(usually silent; no recipients)* | 🔕 audit only, typically |
102
+ | `user_unblocked` | A block is lifted | the blocker | — | 🔕 audit only, typically |
103
+ | `user_banned` | A user is banned (your `ban_handler` ran) | the moderator | the banned user | 📧 email · 🔔 in-app |
104
+ | `content_flagged` | The filter auto-creates a `Moderate::Flag` (`:flag` mode) | nil (system) | — *(no user; it's a queue item)* | 💬 admin ping |
105
+ | `content_removed` | Content is removed (via `remove_reported_field!`) | the moderator | the content's owner | 📧 email · 🔔 in-app |
106
+
107
+ A few notes that save you grief:
108
+
109
+ - **`content_flagged` has no user recipient.** It's the system telling *admins* "something needs a look" — there's nothing to email the author. Route it to Telegram (and/or your moderation dashboard), not to a user mailer. Its `recipients` is empty by design.
110
+ - **`report_decision` vs `affected_user_decision` are two events on purpose.** The same resolution tells the *reporter* "we handled your report" (one tone) and the *reported party* "here's what we did and why, and how to appeal" (the DSA Art. 17 statement of reasons — a different tone, different legal weight). Two events let you send two different emails from one moderator click.
111
+ - **Blocks are usually silent.** You rarely email "you've been blocked" (it invites retaliation). `user_blocked` / `user_unblocked` exist mostly for `config.audit`, but they're here in `notify` too if your product wants an in-app signal.
112
+
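To make the "two events from one moderator click" note concrete, here is a small sketch. The `emit_resolution` lambda only imitates the outcome described above (the gem emits the real events internally), events are plain Hashes, and the first subject line is a made-up example; the second is the one used in the mailer later in this doc.

```ruby
# Hypothetical imitation of one resolution producing two events,
# each with its own audience.
emit_resolution = ->(reporter:, reported:, action:) do
  [
    { name: :report_decision, recipients: [reporter],
      payload: { summary: "Report resolved: #{action}" } },
    { name: :affected_user_decision, recipients: [reported],
      payload: { summary: "Action taken: #{action}", action: action } }
  ]
end

# One subject line per event name, so each party gets its own tone.
subjects = {
  report_decision: "An update on your report",
  affected_user_decision: "An update about your content"
}

events = emit_resolution.call(
  reporter: "reporter@example.com",
  reported: "author@example.com",
  action: "Content removed"
)

# Pair every recipient with the subject chosen by the event name.
outbox = events.flat_map do |event|
  event[:recipients].map { |recipient| [recipient, subjects.fetch(event[:name])] }
end
```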
113
+ ---
114
+
115
+ ## Channel 1 — Email the user with [`goodmail`](https://github.com/rameerez/goodmail)
116
+
117
+ Receipts, decisions, and statements of reasons are exactly what `goodmail` is for: clean, single-template transactional email with zero HTML hell. Build one mailer keyed by event name and let `config.notify` call it.
118
+
119
+ ```ruby
120
+ # config/initializers/moderate.rb
121
+ config.notify = ->(event) do
122
+ case event.name
123
+ when :report_received, :report_decision, :affected_user_decision,
124
+ :appeal_received, :appeal_decision, :notice_received, :user_banned, :content_removed
125
+ Array(event.recipients).each do |recipient|
126
+ next unless recipient.respond_to?(:email) && recipient.email.present?
127
+ ModerationMailer.with(event: event, recipient: recipient)
128
+ .public_send(event.name)
129
+ .deliver_later
130
+ end
131
+ end
132
+ end
133
+ ```
134
+
135
+ ```ruby
136
+ # app/mailers/moderation_mailer.rb
137
+ class ModerationMailer < ApplicationMailer
138
+ # `event` and `recipient` arrive via `.with(...)` and are available as `params[:event]` / `params[:recipient]`.
139
+
140
+ def report_received
141
+ event = params[:event]
142
+ goodmail_mail(to: params[:recipient].email, from: "trust@myapp.com",
143
+ subject: "We received your report") do
144
+ h1 "Thanks — we're on it"
145
+ text "We received your report and our team is reviewing it. You don't need to do anything else."
146
+ info_row "Reference", event.subject.reference
147
+ info_row "Filed", event.subject.created_at.to_date.to_s
148
+ sign
149
+ end
150
+ end
151
+
152
+ def affected_user_decision
153
+ event = params[:event]
154
+ # DSA Art. 17 statement of reasons: action taken, the ground, automated-means flag, redress path.
155
+ goodmail_mail(to: params[:recipient].email, from: "trust@myapp.com",
156
+ subject: "An update about your content") do
157
+ h1 "We took action on your content"
158
+ text event.payload[:reason] # the human-readable ground
159
+ info_row "Action", event.payload[:action] # e.g. "Content removed"
160
+ info_row "Automated means", event.payload[:automated] ? "Yes" : "No"
161
+ space
162
+ text "If you believe this was a mistake, you can appeal — it's free and reviewed by a person."
163
+ button "Appeal this decision", event.payload[:appeal_url] # moderate mints a signed link for you
164
+ sign
165
+ end
166
+ end
167
+
168
+ def notice_received
169
+ event = params[:event]
170
+ goodmail_mail(to: params[:recipient].email, from: "legal@myapp.com",
171
+ subject: "Confirmation of your notice") do
172
+ h1 "We received your notice"
173
+ text "This confirms receipt of your notice under Article 16 of the Digital Services Act."
174
+ info_row "Reference", event.subject.reference
175
+ sign
176
+ end
177
+ end
178
+ end
179
+ ```
180
+
181
+ Why this shape:
182
+
183
+ - **`.with(event:, recipient:)`** is the standard Action Mailer params path, so the same arguments survive `deliver_later` serialization (the event's records are GlobalID-serializable AR objects).
184
+ - **`public_send(event.name)`** lets the *event name* select the mailer action — `report_received` → `#report_received` — so adding a new email is "add a method," not "add a branch."
185
+ - **`goodmail_mail`** is goodmail's mailer helper: it renders the DSL block, wires `List-Unsubscribe`, and hands a normal multipart message to Action Mailer. Use its DSL (`h1`, `text`, `info_row`, `button`, `sign`, …) instead of ERB templates. `info_row`/`price_row` are perfect for the reference/date/action rows a decision email needs.
186
+ - **The appeal link** in a decision email is the signed link `moderate` mints for you (`config.signed_gid_purposes` includes `:appeal`); read it from `event.payload[:appeal_url]` rather than building your own route.
187
+
188
+ > [!TIP]
189
+ > Per-product or per-tenant branding? `goodmail_mail` accepts `config: { company_name:, brand_color:, logo_url: }` for a scoped, thread-local override — handy if your app is white-labeled. See goodmail's "Per-message branding."
190
+
191
+ ---
192
+
193
+ ## Channel 2 — Ping admins on Telegram with [`telegrama`](https://github.com/rameerez/telegrama)
194
+
195
+ This is the "tell *me* something happened" channel. `event.payload[:summary]` is built for it — a single, safe line — so the admin ping is genuinely one call:
196
+
197
+ ```ruby
198
+ config.notify = ->(event) do
199
+ case event.name
200
+ when :report_received, :notice_received, :content_flagged, :appeal_received
201
+ Telegrama.send_message(event.payload[:summary], formatting: { obfuscate_emails: true })
202
+ end
203
+ end
204
+ ```
205
+
206
+ That's it. `event.payload[:summary]` for `content_flagged` reads like:
207
+
208
+ > 🚩 Auto-flagged Message #8821 — categories: hate, threats
209
+
210
+ and for `report_received`:
211
+
212
+ > 🚩 New harassment report on Comment #4213 by joh…e@example.com
213
+
214
+ `obfuscate_emails: true` means even if a summary or your own copy contains a user's address, telegrama redacts it (`john.doe@x.com` → `joh…e@x.com`) before it hits a shared admin chat. Turn it on for anything touching user PII.
215
+
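For intuition, the redaction shown above (`john.doe@x.com` → `joh…e@x.com`) can be reproduced with a few lines of Ruby. This is an illustrative sketch only, not telegrama's implementation: the `obfuscate_email` and `scrub` lambdas, the address regex, and the rule for short local parts are all assumptions.

```ruby
ELLIPSIS = "\u2026" unless defined?(ELLIPSIS)

# Keep the first three characters and the last one of the local part.
# Very short local parts keep only their first character (an assumed rule).
obfuscate_email = ->(address) do
  local, domain = address.split("@", 2)
  masked = local.length > 4 ? "#{local[0, 3]}#{ELLIPSIS}#{local[-1]}" : "#{local[0]}#{ELLIPSIS}"
  "#{masked}@#{domain}"
end

# Redact every address-looking token inside a longer message.
scrub = ->(text) do
  text.gsub(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/) { |match| obfuscate_email.call(match) }
end

redacted = scrub.call("New harassment report by john.doe@x.com")
short = obfuscate_email.call("ab@x.com")
```

Redacting at send time means a summary stays safe even when your own copy interpolates `event.actor&.email` into the message.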
216
+ ### Compose a richer alert when you want one
217
+
218
+ The `summary` is the fast path. When you'd rather build the message (the way you'd alert a sale), reach into `event.payload` and use telegrama's MarkdownV2 formatting:
219
+
220
+ ```ruby
221
+ when :report_received
222
+ report = event.subject
223
+ msg = <<~MSG
224
+ 🚩 *New report*
225
+
226
+ *Category:* #{report.category}
227
+ *On:* #{report.reportable_label}
228
+ *By:* #{event.actor&.email}
229
+
230
+ [🔗 Open in moderation queue](#{Rails.application.routes.url_helpers.admin_report_url(report)})
231
+ MSG
232
+ Telegrama.send_message(msg, formatting: { obfuscate_emails: true })
233
+ ```
234
+
235
+ ### Route different alerts to different chats / topics
236
+
237
+ Got separate Telegram chats (or forum topics) for "trust & safety" vs "everything else"? telegrama takes `chat_id:` and `message_thread_id:` per message:
238
+
239
+ ```ruby
240
+ TRUST_CHAT = Rails.application.credentials.dig(:telegram, :trust_chat_id)
241
+ TRUST_TOPIC = Rails.application.credentials.dig(:telegram, :trust_topic_id)
242
+
243
+ when :content_flagged, :notice_received
244
+ Telegrama.send_message(event.payload[:summary],
245
+ chat_id: TRUST_CHAT,
246
+ message_thread_id: TRUST_TOPIC,
247
+ formatting: { obfuscate_emails: true })
248
+ ```
249
+
250
+ > [!TIP]
251
+ > Enable telegrama's **async delivery** (`config.deliver_message_async = true` in `config/initializers/telegrama.rb`) so admin pings never block a moderation action — this is the cleanest way to keep your `config.notify` hook fast for the Telegram leg. telegrama also degrades gracefully (MarkdownV2 → HTML → plain text), so a weird character in a user's content can't break the alert.
252
+
253
+ ---
254
+
255
+ ## Channel 3 — In-app feed + push with [`noticed`](https://github.com/excid3/noticed)
256
+
257
+ For decisions and outcomes that the *user* should see in your app's notification bell (and on their phone), `noticed` is the multi-channel fan-out. The event envelope drops straight into `Notifier.with(...).deliver(recipients)`:
258
+
259
+ ```ruby
260
+ # config/initializers/moderate.rb
261
+ config.notify = ->(event) do
262
+ case event.name
263
+ when :report_decision, :affected_user_decision, :appeal_decision,
264
+ :user_banned, :content_removed
265
+ Moderate::DecisionNotifier.with(event: event.to_h).deliver(event.recipients)
266
+ end
267
+ end
268
+ ```
269
+
270
+ ```ruby
271
+ # app/notifiers/moderate/decision_notifier.rb
272
+ class Moderate::DecisionNotifier < Noticed::Event
273
+ deliver_by :database # the in-app feed
274
+ deliver_by :fcm do |config| # push to devices
275
+ config.credentials = Rails.application.credentials.fcm
276
+ config.json = ->(device_token) { { message: { token: device_token, notification: { title: "Moderation update", body: params.dig(:event, :payload, :summary) } } } }
277
+ end
278
+
279
+ notification_methods do
280
+ def message = params.dig(:event, :payload, :summary)
281
+ def url = params.dig(:event, :payload, :url)
282
+ end
283
+ end
284
+ ```
285
+
286
+ Notes:
287
+
288
+ - Pass **`event.to_h`** (the Hash) rather than the raw event object to `noticed` — `noticed` serializes params to the database, and a plain Hash of GlobalID-able records and scalars stores cleanly. `event.payload[:summary]` doubles as the push body.
289
+ - `event.recipients` is already the correct audience, so `deliver(event.recipients)` needs no massaging.
290
+ - The same `:summary` you used for Telegram works as the push body, so you write the human line once and reuse it across channels.
291
+
292
+ > This is the [RailsFast Native](https://railsfast.com) path: `noticed` + `:fcm`/APNs gives you the in-app feed and native push from the same decision event, so a moderation outcome reaches the user on web and mobile without a second integration.
293
+
294
+ ---
295
+
296
+ ## Putting all three together (the full hook)
297
+
298
+ Here's the complete, production-shaped `config.notify` — email via goodmail, admin pings via telegrama, in-app + push via noticed — wired exactly once:
299
+
300
+ ```ruby
301
+ # config/initializers/moderate.rb
302
+ Moderate.configure do |config|
303
+ config.notify = ->(event) do
304
+ # --- 1. Email the user (goodmail) ---------------------------------------
305
+ case event.name
306
+ when :report_received, :report_decision, :affected_user_decision,
307
+ :appeal_received, :appeal_decision, :notice_received,
308
+ :user_banned, :content_removed
309
+ Array(event.recipients).each do |recipient|
310
+ next unless recipient.respond_to?(:email) && recipient.email.present?
311
+ ModerationMailer.with(event: event, recipient: recipient)
312
+ .public_send(event.name)
313
+ .deliver_later
314
+ end
315
+ end
316
+
317
+ # --- 2. Ping admins on Telegram (telegrama) -----------------------------
318
+ case event.name
319
+ when :report_received, :notice_received, :content_flagged, :appeal_received
320
+ Telegrama.send_message(event.payload[:summary], formatting: { obfuscate_emails: true })
321
+ end
322
+
323
+ # --- 3. In-app feed + push (noticed) ------------------------------------
324
+ case event.name
325
+ when :report_decision, :affected_user_decision, :appeal_decision,
326
+ :user_banned, :content_removed
327
+ Moderate::DecisionNotifier.with(event: event.to_h).deliver(event.recipients)
328
+ end
329
+ end
330
+ end
331
+ ```
332
+
333
+ Every branch is independent: the same event can email a user **and** ping you on Telegram **and** drop into their in-app feed, or just one of those, depending on which lists you put its name in. One user-facing decision, one moderator click, three channels — no duplicate audience logic, no notification code in your models.
334
+
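The independence of the three branches can be checked with a table-driven sketch. The event lists below are copied from the full hook above; the `channels` Hash and the `channels_for` lambda are hypothetical helpers for illustration, not part of `moderate`.

```ruby
# Which event names each channel listens to (same lists as the full hook).
channels = {
  email: %i[report_received report_decision affected_user_decision
            appeal_received appeal_decision notice_received
            user_banned content_removed],
  telegram: %i[report_received notice_received content_flagged appeal_received],
  in_app: %i[report_decision affected_user_decision appeal_decision
             user_banned content_removed]
}.freeze

# Every channel whose list contains the event name, in declaration order.
channels_for = ->(event_name) do
  channels.select { |_channel, names| names.include?(event_name) }.keys
end

received = channels_for.call(:report_received)
flagged = channels_for.call(:content_flagged)
decision = channels_for.call(:appeal_decision)
blocked = channels_for.call(:user_blocked)
```

A table like this also makes a handy unit test for your initializer: moving an event name between lists changes exactly one channel.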
335
+ ---
336
+
337
+ ## Audit — the other hook
338
+
339
+ `config.notify` is for telling **people**. `config.audit` is for telling your **records**: an append-only log of every important action, with the *same* envelope, so compliance and forensics don't depend on whether an email went out.
340
+
341
+ ```ruby
342
+ Moderate.configure do |config|
343
+ config.audit = ->(event) do
344
+ AuditLog.record!(
345
+ event_type: event.name,
346
+ actor: event.actor,
347
+ subject: event.subject,
348
+ data: event.payload
349
+ )
350
+ end
351
+ end
352
+ ```
353
+
354
+ `audit` fires for **every** event (including the silent ones like `user_blocked` that you usually don't notify on), it's append-only by intent, and it's where DSA **Art. 24 transparency counters** ultimately draw from (notices received, actions taken, appeal outcomes). Wire it once next to `notify` and you have both halves: humans get told, history gets kept.
355
+
356
+ > [!NOTE]
357
+ > Same envelope (`name` / `subject` / `actor` / `recipients` / `payload` / `to_h`), two destinations. Keep both hooks fast and side-effect-light; push real work (emails, HTTP, heavy writes) to background jobs.
358
+
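A sketch of how the two hooks differ in practice, under stated assumptions: `audit_log` is an in-memory Array standing in for the hypothetical `AuditLog` model, events are plain Hashes rather than the gem's envelope, and the counter at the end is only one possible way to derive transparency numbers.

```ruby
audit_log = []   # stand-in for an append-only table
notified = []    # stand-in for "a human was told"

# Audit records every event as a frozen row.
audit = ->(event) do
  audit_log << { event_type: event[:name], actor: event[:actor], data: event[:payload] }.freeze
end

# Notify only reacts to the names you opted into.
notify = ->(event) do
  notified << event[:name] if %i[report_received].include?(event[:name])
end

events = [
  { name: :user_blocked, actor: "user-1", payload: { summary: "Block created" } },
  { name: :report_received, actor: "user-2", payload: { summary: "New report" } },
  { name: :report_received, actor: "user-3", payload: { summary: "New report" } }
]

events.each do |event|
  audit.call(event)
  notify.call(event)
end

# Counters derived from the log, independent of what was notified.
counters = audit_log.map { |row| row[:event_type] }.tally
```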
359
+ ## See also
360
+
361
+ - [Notifications & audit — one hook each](../README.md#-notifications---audit--one-hook-each) — the README overview
362
+ - [The DSA notice form](dsa-notice-form.md) — where `notice_received` (the Art. 16 confirmation of receipt) comes from
363
+ - [`goodmail`](https://github.com/rameerez/goodmail) · [`telegrama`](https://github.com/rameerez/telegrama) · [`noticed`](https://github.com/excid3/noticed) — the three destinations
@@ -0,0 +1,140 @@
1
+ # frozen_string_literal: true
2
+
3
+ # ──────────────────────────────────────────────────────────────────────────────
4
+ # REFERENCE ADAPTER — NOT shipped, NOT loaded, NOT a dependency of `moderate`.
5
+ #
6
+ # `moderate` ships exactly ONE built-in adapter: the offline `:wordlist` (text).
7
+ # It does NOT bundle an image classifier — a real NSFW/CSAM model needs a hosted
8
+ # service or a model you can't ship offline. This file is a "bring your own" IMAGE
9
+ # adapter: copy it into your app, add the AWS SDK gem to YOUR Gemfile, and register
10
+ # it. The gem has no `aws-sdk-rekognition` dependency, so nothing here is pulled
11
+ # into a host that doesn't want it.
12
+ #
13
+ # ── How to use it ─────────────────────────────────────────────────────────────
14
+ # 1. Copy this file into your app (e.g. app/adapters/aws_rekognition_adapter.rb).
15
+ # 2. Add the runtime dependency to YOUR app's Gemfile:
16
+ # gem "aws-sdk-rekognition"
17
+ # and provide AWS credentials the usual way (ENV / IAM role / shared config).
18
+ # 3. Register the adapter and point an image field at it, in :flag mode:
19
+ # Moderate.configure do |config|
20
+ # config.register_adapter(:rekognition, AwsRekognitionAdapter.new)
21
+ # config.filter "Profile", :avatar, with: :rekognition, mode: :flag
22
+ # end
23
+ # Hand the adapter the image BYTES (the value your model's filtering seam
24
+ # passes for the field — e.g. an attachment's `download`), or pass an
25
+ # { s3_object: { bucket:, name: } } hash to moderate an object already in S3.
26
+ #
27
+ # ── Why :flag, never :block ───────────────────────────────────────────────────
28
+ # `synchronous? == false` (below): a Rekognition call is blocking network I/O, so
29
+ # `moderate` runs it in `Moderate::ClassifyJob` (:flag mode) and refuses it in
30
+ # :block mode — you can't synchronously reject a save on an in-flight API call.
31
+ #
32
+ # ── Taxonomy mapping ──────────────────────────────────────────────────────────
33
+ # Rekognition has its OWN moderation taxonomy (top-level + second-level labels like
34
+ # "Explicit Nudity" / "Violence" / "Drugs"), NOT OpenAI's. Every adapter must map
35
+ # its provider labels onto the gem's ONE canonical taxonomy (Moderate::Label), so
36
+ # Moderate::Flag, the DSA statement of reasons, and the transparency counters all
37
+ # speak one vocabulary. CATEGORY_MAP below is that mapping — adjust it to taste.
38
+ # AWS DetectModerationLabels API:
39
+ # https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DetectModerationLabels.html
40
+ # Moderation label categories:
41
+ # https://docs.aws.amazon.com/rekognition/latest/dg/moderation.html
42
+ # ──────────────────────────────────────────────────────────────────────────────
43
+ class AwsRekognitionAdapter
44
+ # Only surface labels Rekognition is at least this confident about. Rekognition
45
+ # confidence is 0..100; we also pass MIN_CONFIDENCE to the API so it doesn't even
46
+ # return lower-confidence labels. Tune for your tolerance.
47
+ MIN_CONFIDENCE = 60.0
48
+
49
+ # Map Rekognition's top-level moderation categories onto the gem's canonical
50
+ # Moderate::Label taxonomy. Rekognition's top-level names (left) are mapped to a
51
+ # [category, subcategory] canonical pair (right). Unmapped/new Rekognition
52
+ # categories fall back to plain :sexual as a conservative "needs review" bucket —
53
+ # change that default if a different fallback fits your app.
54
+ CATEGORY_MAP = {
55
+ "Explicit Nudity" => [:sexual, nil],
56
+ "Sexual" => [:sexual, nil],
57
+ "Non-Explicit Nudity of Intimate parts and Kissing" => [:sexual, nil],
58
+ "Violence" => [:violence, nil],
59
+ "Visually Disturbing" => [:violence, :graphic],
60
+ "Hate Symbols" => [:hate, nil],
61
+ "Drugs & Tobacco" => [:illicit, nil],
62
+ "Gambling" => [:illicit, nil]
63
+ }.freeze
64
+
65
+ def initialize(client: nil)
66
+ # Lazily build the client so merely requiring this file doesn't construct an AWS
67
+ # client (and doesn't error when the gem/creds are absent). Inject one in tests.
68
+ @client = client
69
+ end
70
+
71
+ # classify(value) -> Moderate::Result. `value` is the image to inspect: raw bytes
72
+ # (a String) for DetectModerationLabels' `image: { bytes: ... }`, or an
73
+ # { s3_object: { bucket:, name: } } hash to point at an object already in S3.
74
+ def classify(value)
75
+ response = client.detect_moderation_labels(
76
+ image: image_param(value),
77
+ min_confidence: MIN_CONFIDENCE
78
+ )
79
+
80
+ labels = build_labels(response.moderation_labels)
81
+ return Moderate::Result.allowed(source: "image_filter", raw: response.to_h) if labels.empty?
82
+
83
+ Moderate::Result.new(
84
+ allowed: false,
85
+ labels: labels,
86
+ # "image_filter" is one of the four values the install migration's
87
+ # moderate_flags_source_check constraint allows. (The human-facing backend
88
+ # name is the adapter NAME, :rekognition, which you register it under.)
89
+ source: "image_filter",
90
+ raw: response.to_h
91
+ )
92
+ rescue => error
93
+ # FAIL OPEN on everything, same rationale as any network classifier: a moderation
94
+ # API is defense-in-depth, not a gatekeeper that should block uploads on an AWS
95
+ # blip. The image simply isn't auto-flagged this time; users can still report it.
96
+ warn("[moderate] Rekognition moderation call failed (failing open): #{error.class}: #{error.message}")
97
+ Moderate::Result.allowed(source: "image_filter", raw: { error: error.class.name, message: error.message })
98
+ end
99
+
100
+ # Background-only — see the header. This is what makes the spine route the adapter
101
+ # through ClassifyJob in :flag mode and forbid it in :block mode.
102
+ def synchronous?
103
+ false
104
+ end
105
+
106
+ private
107
+
108
+ # Build the API's `image` parameter from the host-agnostic value. A Hash is passed
109
+ # through (so { s3_object: {...} } or { bytes: ... } both work); anything else is
110
+ # treated as the raw image bytes.
111
+ def image_param(value)
112
+ return value if value.is_a?(Hash)
113
+
114
+ { bytes: value }
115
+ end
116
+
117
+ # Rekognition returns a flat list of moderation labels, each with `name`,
118
+ # `parent_name` (the top-level category, blank for a top-level label itself), and
119
+ # `confidence` (0..100). We key the canonical mapping off the TOP-LEVEL category
120
+ # (`parent_name` when present, else `name`) and emit one Moderate::Label per hit,
121
+ # normalizing the 0..100 confidence to the gem's 0..1 score, with input :image.
122
+ def build_labels(moderation_labels)
123
+ Array(moderation_labels).filter_map do |label|
124
+ top_level = label.parent_name.to_s.empty? ? label.name : label.parent_name
125
+ category, subcategory = CATEGORY_MAP.fetch(top_level, [:sexual, nil])
126
+
127
+ Moderate::Label.new(
128
+ category: category,
129
+ subcategory: subcategory,
130
+ score: label.confidence.to_f / 100.0, # Rekognition 0..100 -> canonical 0..1
131
+ flagged: true,
132
+ input: :image
133
+ )
134
+ end
135
+ end
136
+
137
+ def client
138
+ @client ||= Aws::Rekognition::Client.new
139
+ end
140
+ end
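The label normalization in `build_labels` above can be exercised without the AWS SDK. In this sketch `raw_label` is a hypothetical Struct standing in for the SDK's moderation label object, `category_map` is a shortened copy of `CATEGORY_MAP`, the sample label names are illustrative, and plain Hashes replace `Moderate::Label`.

```ruby
# Shortened copy of the adapter's mapping: top-level name => [category, subcategory].
category_map = {
  "Violence" => [:violence, nil],
  "Visually Disturbing" => [:violence, :graphic],
  "Hate Symbols" => [:hate, nil]
}.freeze

# Stand-in for the SDK label: name, parent_name, confidence (0..100).
raw_label = Struct.new(:name, :parent_name, :confidence)

normalize = ->(label) do
  # Key off the parent when present, else the label's own name.
  top_level = label.parent_name.to_s.empty? ? label.name : label.parent_name
  category, subcategory = category_map.fetch(top_level, [:sexual, nil])
  { category: category, subcategory: subcategory, score: label.confidence.to_f / 100.0 }
end

labels = [
  raw_label.new("Violence", "", 90.0),                  # top-level label
  raw_label.new("Weapon Violence", "Violence", 75.0),   # child label
  raw_label.new("Corpses", "Visually Disturbing", 50.0),
  raw_label.new("Some New Category", nil, 80.0)         # unmapped, uses fallback
].map(&normalize)
```

The last entry shows the fallback in action: an unmapped category lands in `:sexual`, so change the default passed to `fetch` if that bucket is wrong for your app.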
@@ -0,0 +1,111 @@
+ # frozen_string_literal: true
+
+ # ──────────────────────────────────────────────────────────────────────────────
+ # REFERENCE ADAPTER — NOT shipped, NOT loaded, NOT a dependency of `moderate`.
+ #
+ # `moderate` ships exactly ONE built-in adapter: the offline `:wordlist`. Every
+ # other backend — including this OpenAI one — is "bring your own": you copy this
+ # file into your app, add its gem to YOUR Gemfile, and register it yourself. The
+ # gem deliberately has no `ruby_llm`/`openai`/HTTP dependency, so nothing here is
+ # pulled into a host that doesn't want it.
+ #
+ # ── How to use it ─────────────────────────────────────────────────────────────
+ # 1. Copy this file into your app (e.g. app/adapters/openai_moderation_adapter.rb).
+ # 2. Add the runtime dependency to YOUR app's Gemfile:
+ #      gem "ruby_llm"
+ #    and configure your key (https://github.com/crmne/ruby_llm):
+ #      RubyLLM.configure { |c| c.openai_api_key = ENV["OPENAI_API_KEY"] }
+ # 3. Register the adapter and point a field at it, in :flag mode:
+ #      Moderate.configure do |config|
+ #        config.register_adapter(:openai, OpenAIModerationAdapter.new)
+ #        config.filter "Message", :body, with: :openai, mode: :flag
+ #      end
+ #
+ # ── Why :flag, never :block ───────────────────────────────────────────────────
+ # This adapter declares `synchronous? == false` (see below), so `moderate` routes
+ # it through `Moderate::ClassifyJob` in :flag mode and REFUSES it in :block mode.
+ # You can't synchronously reject a save on a result that's still in flight over the
+ # network — that's the spine's documented rule ("`:block` requires a synchronous
+ # adapter"). An async classifier allows the write, classifies in a job, and files a
+ # `Moderate::Flag` for review.
+ #
+ # ── Why `omni-moderation-latest` ──────────────────────────────────────────────
+ # OpenAI's moderation endpoint is free and the omni model is multimodal (text AND
+ # image in one call). Crucially, its category set IS the gem's canonical taxonomy
+ # (`Moderate::Label`) — so the mapping below is 1:1, no lossy translation.
+ # OpenAI moderation guide: https://developers.openai.com/api/docs/guides/moderation
+ # ruby_llm moderation API: https://github.com/crmne/ruby_llm
+ # ──────────────────────────────────────────────────────────────────────────────
+ class OpenAIModerationAdapter
+   # The multimodal model. The older text-only "text-moderation-*" models don't
+   # accept images and don't return per-category data the same way, so pin omni.
+   MODEL = "omni-moderation-latest"
+
+   # The single adapter contract: classify(value) -> Moderate::Result.
+   #
+   # `value` is whatever the gem hands an adapter for the field — typically a String
+   # for a text column. `ruby_llm`'s `RubyLLM.moderate` takes that input plus the
+   # model and returns a result exposing `flagged?`, `flagged_categories`, and
+   # `category_scores` (see the ruby_llm moderation docs linked in the header).
+   def classify(value)
+     result = RubyLLM.moderate(value, model: MODEL)
+
+     # `flagged_categories` is the list of canonical slugs that tripped, e.g.
+     # ["hate", "hate/threatening", "sexual/minors"]; `category_scores` is a
+     # slug => 0.0..1.0 hash. Trust OpenAI's own top-level `flagged?` for the
+     # verdict, and surface ONLY the flagged categories as labels (a non-flagged
+     # category still carries a near-zero score we don't want in the Flag).
+     return Moderate::Result.allowed(source: "external_classifier", raw: result.results) unless result.flagged?
+
+     labels = build_labels(result.flagged_categories, result.category_scores)
+     Moderate::Result.new(
+       allowed: false,
+       labels: labels,
+       # "external_classifier" is one of the four values the install migration's
+       # moderate_flags_source_check constraint allows; a remote classifier records
+       # that. (The human-facing "which backend" detail is the adapter NAME, :openai.)
+       source: "external_classifier",
+       raw: result.results
+     )
+   rescue => error
+     # FAIL OPEN on EVERYTHING (network error, timeout, auth failure, malformed
+     # response, …). A moderation API is best-effort defense-in-depth, not a
+     # gatekeeper that should take down user posting when OpenAI has a hiccup — and a
+     # :block field is anyway backed by the synchronous :wordlist, never this. The
+     # content simply isn't auto-flagged this time; users can still report it. Failing
+     # CLOSED (rejecting writes on an upstream outage) would be a far worse outage.
+     warn("[moderate] OpenAI moderation call failed (failing open): #{error.class}: #{error.message}")
+     Moderate::Result.allowed(source: "external_classifier", raw: { error: error.class.name, message: error.message })
+   end
+
+   # Background-only: this does blocking network I/O. Returning false here is exactly
+   # what makes the spine route the adapter through Moderate::ClassifyJob in :flag
+   # mode and forbid it in :block mode. (The spine probes `synchronous?` directly;
+   # an adapter need not inherit from Moderate::Filters::Base — answering this one
+   # predicate is enough.)
+   def synchronous?
+     false
+   end
+
+   private
+
+   # One Moderate::Label per FLAGGED canonical slug. OpenAI's slugs are the gem's
+   # canonical slugs, so we split "category/subcategory" (e.g. "hate/threatening" ->
+   # category :hate, subcategory :threatening; a bare "hate" has no subcategory) and
+   # attach the matching 0..1 score. `input: :unknown` because the simple ruby_llm
+   # surface doesn't expose OpenAI's per-category `category_applied_input_types`; if
+   # you need text-vs-image attribution, read it from `result.results` (the raw
+   # payload) and pass `input:` accordingly.
+   def build_labels(flagged_categories, scores)
+     Array(flagged_categories).map do |slug|
+       category, subcategory = slug.to_s.split("/", 2)
+       Moderate::Label.new(
+         category: category,
+         subcategory: subcategory,
+         score: scores && scores[slug.to_s],
+         flagged: true,
+         input: :unknown
+       )
+     end
+   end
+ end
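
Two pieces of the adapter above are easy to exercise without a network call: the slug splitting in `build_labels` and the fail-open `rescue`. The following is a minimal sketch under stated assumptions: `SlugLabel` is a hypothetical stand-in for `Moderate::Label`, the plain Hash returned by `fail_open` stands in for `Moderate::Result.allowed`, and the scores are made-up sample values.

```ruby
# Hypothetical stand-in for Moderate::Label so the sketch runs without the gem.
SlugLabel = Struct.new(:category, :subcategory, :score, :flagged, :input, keyword_init: true)

# Mirrors the adapter's build_labels: "category/subcategory" is split at the
# first slash only, and a bare slug yields a nil subcategory.
def labels_from_slugs(flagged_categories, scores)
  Array(flagged_categories).map do |slug|
    category, subcategory = slug.to_s.split("/", 2)
    SlugLabel.new(
      category: category,
      subcategory: subcategory,
      score: scores && scores[slug.to_s],
      flagged: true,
      input: :unknown
    )
  end
end

# Fail-open wrapper: any StandardError becomes an "allowed" verdict that
# records the error. The Hash is a stand-in for Moderate::Result.allowed.
def fail_open
  yield
rescue => error
  { allowed: true, raw: { error: error.class.name, message: error.message } }
end

# Made-up sample scores (assumption), keyed by slug like category_scores.
sample_scores = { "hate" => 0.91, "hate/threatening" => 0.77 }
slug_labels = labels_from_slugs(["hate", "hate/threatening"], sample_scores)
outage = fail_open { raise IOError, "upstream timeout" }
```

In this sketch `split` returns Strings, so `category` is `"hate"` rather than `:hate`. Whether `Moderate::Label` coerces those to the symbols mentioned in the adapter's comment is not visible in this part of the diff.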
@@ -0,0 +1,36 @@
+ # This file was generated by Appraisal
+
+ source "https://rubygems.org"
+
+ gem "rake", "~> 13.0"
+ gem "rails", "~> 7.1.0"
+
+ group :development do
+   gem "appraisal"
+   gem "web-console"
+   gem "standard"
+   gem "rubocop", "~> 1.0"
+   gem "rubocop-minitest", "~> 0.35"
+   gem "rubocop-performance", "~> 1.0"
+ end
+
+ group :test do
+   gem "minitest", "~> 5.0"
+   gem "mocha"
+   gem "simplecov", require: false
+   gem "activejob"
+   gem "actionmailer"
+   gem "activestorage"
+   gem "sqlite3"
+   gem "pg"
+   gem "mysql2"
+   gem "bootsnap", require: false
+   gem "puma"
+   gem "importmap-rails"
+   gem "sprockets-rails"
+   gem "stimulus-rails"
+   gem "turbo-rails"
+   gem "rdoc", ">= 7.0"
+ end
+
+ gemspec path: "../"