RubyGems - moderate - Versions diffs - 0.1.0 → 1.0.0.beta1 - Mend

moderate 0.1.0 → 1.0.0.beta1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (65)

checksums.yaml +4 -4
data/.rubocop.yml +8 -0
data/.simplecov +62 -0
data/AGENTS.md +7 -0
data/Appraisals +16 -0
data/CHANGELOG.md +71 -1
data/CLAUDE.md +7 -0
data/README.md +376 -29
data/Rakefile +28 -2
data/app/controllers/concerns/moderate/moderation.rb +161 -0
data/app/controllers/moderate/appeals_controller.rb +190 -0
data/app/controllers/moderate/application_controller.rb +45 -0
data/app/controllers/moderate/notices_controller.rb +382 -0
data/app/controllers/moderate/transparency_reports_controller.rb +30 -0
data/app/helpers/moderate/engine_helper.rb +151 -0
data/app/views/moderate/appeals/new.html.erb +78 -0
data/app/views/moderate/notices/new.html.erb +255 -0
data/app/views/moderate/transparency_reports/_summary_card.html.erb +20 -0
data/app/views/moderate/transparency_reports/show.html.erb +52 -0
data/config/moderate/blocklists/en.yml +81 -0
data/config/moderate/blocklists/es.yml +40 -0
data/config/routes.rb +36 -0
data/docs/compliance.md +178 -0
data/docs/configuration.md +326 -0
data/docs/dsa-notice-form.md +371 -0
data/docs/madmin.md +490 -0
data/docs/notifications.md +363 -0
data/examples/aws_rekognition_adapter.rb +140 -0
data/examples/openai_moderation_adapter.rb +111 -0
data/gemfiles/rails_7.1.gemfile +36 -0
data/gemfiles/rails_7.2.gemfile +36 -0
data/gemfiles/rails_8.1.gemfile +36 -0
data/lib/generators/moderate/install_generator.rb +56 -0
data/lib/generators/moderate/templates/create_moderate_tables.rb.erb +237 -0
data/lib/generators/moderate/templates/initializer.rb +198 -0
data/lib/generators/moderate/views_generator.rb +63 -0
data/lib/moderate/configuration.rb +341 -0
data/lib/moderate/engine.rb +138 -0
data/lib/moderate/errors.rb +26 -0
data/lib/moderate/event.rb +75 -0
data/lib/moderate/filters/base.rb +126 -0
data/lib/moderate/filters/wordlist.rb +255 -0
data/lib/moderate/jobs/classify_job.rb +158 -0
data/lib/moderate/label.rb +111 -0
data/lib/moderate/macros.rb +90 -0
data/lib/moderate/models/appeal.rb +154 -0
data/lib/moderate/models/application_record.rb +31 -0
data/lib/moderate/models/block.rb +203 -0
data/lib/moderate/models/concerns/actor.rb +174 -0
data/lib/moderate/models/concerns/content_filterable.rb +155 -0
data/lib/moderate/models/concerns/reportable.rb +282 -0
data/lib/moderate/models/flag.rb +136 -0
data/lib/moderate/models/report.rb +620 -0
data/lib/moderate/result.rb +176 -0
data/lib/moderate/services/intake_appeal.rb +89 -0
data/lib/moderate/services/intake_notice.rb +132 -0
data/lib/moderate/services/intake_report.rb +132 -0
data/lib/moderate/services/resolve_appeal.rb +134 -0
data/lib/moderate/services/resolve_flag.rb +101 -0
data/lib/moderate/services/resolve_report.rb +291 -0
data/lib/moderate/version.rb +1 -1
data/lib/moderate.rb +365 -18
data/log/development.log +0 -0
data/log/test.log +0 -0
metadata +154 -15

data/docs/compliance.md ADDED

@@ -0,0 +1,178 @@
+# Compliance: DSA, App Store, and Google Play — the checklist you hand to legal
+Trust & Safety is the one part of your app where "I think we covered it" isn't good enough. A missing **block** button gets your build rejected by Apple. A missing **notice form** gets you a letter from a European regulator. A missing **appeal** path is a DSA violation with a fine attached. The rules are real, they're specific, and they're written by people who have never opened your codebase.
+So this page does the boring, load-bearing work: it maps **every requirement** from the three regimes that actually gate a UGC app — the **EU Digital Services Act**, the **Apple App Store Review Guidelines**, and the **Google Play** developer policies — to the **exact `moderate` feature** that satisfies it, with the **test** that proves it. It's written to be **printed and handed to a lawyer or a store reviewer**: each row is a claim, and each claim has a receipt.
+> [!NOTE]
+> `moderate` gives you the **mechanisms** the law and the stores require — the report intake, the block edge, the filter, the queue, the appeal, the statement-of-reasons, the transparency counters, the public notice form. It cannot make your **policies** or **operations** compliant for you: you still have to publish a contact address, actually read the queue, and answer notices "in a timely manner." This checklist marks which is which — **[gem]** rows are done the moment you install; **[you]** rows are things only you can do, that the gem makes easy. Don't hand legal the **[gem]** column and call it a day.
+> [!IMPORTANT]
+> This is engineering documentation, not legal advice. It reflects the text of the **DSA (Regulation (EU) 2022/2065)**, the **App Store Review Guidelines**, and the **Google Play Developer Program Policies** as the gem was built against them. Regulations and store rules change; your obligations depend on your size, your users, and your content. Have your own counsel confirm the mapping before you rely on it.
+---
+## How to read this
+Every checklist row has four columns:
+| Column | What it means |
+| --- | --- |
+| **Requirement** | The specific obligation, quoted or closely paraphrased, with its article/guideline number. |
+| **How `moderate` satisfies it** | The concrete feature, model, hook, or method that covers it. |
+| **Who** | **[gem]** = built in, true the moment you install. **[you]** = your responsibility, made easy by the gem. **[gem + you]** = gem does the mechanism, you wire one hook or write one line. |
+| **Proof** | The test in the suite (or the manual check) that demonstrates it. |
+The **Proof** column points at `test/` paths. Run the whole thing with `bundle exec rake test`; the named files are your evidence that the mechanism actually works, not just that it's documented.
+---
+## 1. EU Digital Services Act (Regulation (EU) 2022/2065)
+The DSA applies to any "hosting service" — which includes basically any app that stores user-generated content — **that serves recipients in the EU**, regardless of where you're based. The four articles that matter for a small/mid app (you are almost certainly **not** a "Very Large Online Platform," which carries extra duties this gem does not cover) are **16, 17, 20, and 24**. `moderate` is organized around exactly these.
+> [!NOTE]
+> **Scope honesty.** `moderate` targets the obligations that apply to ordinary hosting services and online platforms. It does **not** implement VLOP-only duties (risk assessments, independent audits, the transparency database submission, vetted-researcher data access, Art. 34–43). If you cross 45M monthly EU users, you have a much bigger compliance program than any gem — talk to specialists.
+### Art. 16 — Notice and action mechanisms
+> Providers shall put mechanisms in place to allow **any individual or entity** to notify them of allegedly illegal content, **by electronic means**, that are **easy to access and user-friendly**.
+| Requirement | How `moderate` satisfies it | Who | Proof |
+| --- | --- | --- | --- |
+| A public, **electronic** notice mechanism, open to **anyone** (not just logged-in users). | The mountable notice engine, mounted at the path of your choosing (e.g. `mount Moderate::Engine => "/trust"`), serves a public form at `<mount>/notices/new`. See [`docs/dsa-notice-form.md`](dsa-notice-form.md). | **[gem]** | `test/integration/notice_form_test.rb` |
+| Notice contains a **sufficiently substantiated explanation** of why the content is illegal — Art. 16(2)(a). | `message` field, required and validated on `Moderate::Report` (a notice is a `Report` with `intake_kind: "dsa"`). | **[gem]** | `test/services/moderate/intake_notice_test.rb` |
+| Notice contains the **exact electronic location** (URL) — Art. 16(2)(b). | `subject_url` (one or more), validated as an `http(s)` URL; the model resolves it to a reportable record for the evidence snapshot. | **[gem]** | `test/models/moderate/report_test.rb` |
+| Notice contains the **name and email** of the notifier — Art. 16(2)(c). | `notifier_name` + `notifier_email`, required for `dsa` notices and email-validated. | **[gem]** | `test/models/moderate/report_test.rb` |
+| **Anonymity carve-out**: name/email are **not** required for notices alleging certain offences against minors (CSAM and related, Art. 16(2)(c) proviso). | When `anonymous` is set and `legal_reason` is `protection_of_minors`, the model **waives** the `notifier_name`/`notifier_email` requirement; any other anonymous notice is rejected. | **[gem]** | `test/models/moderate/report_test.rb` |
+| A **good-faith statement** that the information is accurate and complete — Art. 16(2)(d). | `good_faith_confirmed` (acceptance); the save is rejected unless it's truthy. | **[gem]** | `test/models/moderate/report_test.rb` |
+| **Confirmation of receipt**, sent to the notifier **without undue delay** — Art. 16(4). | `Moderate::Services::IntakeNotice` stamps the report's `acknowledged_at` and fires the `notice_received` event through `config.notify`; you wire it to your mailer once (e.g. `goodmail`) for the receipt, and the form shows an on-screen confirmation flash. | **[gem + you]** | `test/services/moderate/intake_notice_test.rb` |
+| Notify the **notifier** of the decision **and** of the available redress options, without undue delay — Art. 16(5). | Acting on the report (resolve/dismiss) emits `report_decision` (and `affected_user_decision`) carrying the statement of reasons; see Art. 17 below. | **[gem + you]** | `test/services/moderate/resolve_report_test.rb` |
+| Decisions taken in a **timely, diligent, non-arbitrary and objective manner** — Art. 16(6). | The notice lands in `Moderate::Report.pending` — the same queue as in-app reports — with full evidence; you act on it. The gem records who decided, when, and why (mandatory moderator + note on every action). | **[gem + you]** | `test/services/moderate/resolve_report_test.rb` |
+> [!TIP]
+> Don't want to mount the engine? You can build your own public page and call `Moderate::Services::IntakeNotice` (which persists a `Moderate::Report` with `intake_kind: "dsa"`) directly — every Art. 16 validation, the snapshot, the durable `acknowledged_at` receipt, and the `notice_received` event still apply. Mounting is the fast path; the service is the full-control path. See [Staying optional](dsa-notice-form.md#staying-optional-ignore-the-engine-entirely).
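To make the Art. 16(2) rules in the table above concrete, here is a minimal stand-alone sketch of the same checks in plain Ruby. It is not the gem's validation code: `notice_errors` is a hypothetical helper, and only the field names (`message`, `subject_url`, `notifier_name`, `notifier_email`, `anonymous`, `legal_reason`, `good_faith_confirmed`) come from the table. The email pattern is a deliberately loose assumption.

```ruby
require "uri"

# Hypothetical helper mirroring the Art. 16(2) rules described above.
# Returns a list of human-readable problems; an empty list means "acceptable".
def notice_errors(attrs)
  errors = []

  # Art. 16(2)(a): a substantiated explanation must be present.
  errors << "message is required" if attrs[:message].to_s.strip.empty?

  # Art. 16(2)(b): the exact electronic location must be an http(s) URL.
  begin
    uri = URI.parse(attrs[:subject_url].to_s)
    errors << "subject_url must be an http(s) URL" unless uri.is_a?(URI::HTTP) && uri.host
  rescue URI::InvalidURIError
    errors << "subject_url must be an http(s) URL"
  end

  # Art. 16(2)(c): name and email, waived only for the minors carve-out.
  if attrs[:anonymous]
    unless attrs[:legal_reason].to_s == "protection_of_minors"
      errors << "anonymous notices are only accepted for protection_of_minors"
    end
  else
    errors << "notifier_name is required" if attrs[:notifier_name].to_s.strip.empty?
    unless attrs[:notifier_email].to_s.match?(/\A[^@\s]+@[^@\s]+\z/)
      errors << "notifier_email is required and must look like an email"
    end
  end

  # Art. 16(2)(d): the good-faith statement must be accepted.
  errors << "good_faith_confirmed must be accepted" unless attrs[:good_faith_confirmed]

  errors
end
```

In a real app these rules live on `Moderate::Report`; a sketch like this is only useful for explaining to a reviewer which inputs are accepted and which are refused.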
+### Art. 17 — Statement of reasons
+> Where a provider restricts content, it shall provide a **clear and specific statement of reasons** to the affected recipient.
+The statement must include, at minimum: the **restriction imposed** (and its scope), the **facts and circumstances** relied on, whether **automated means** were used, the **legal or contractual ground**, and information on **redress** (internal complaints, out-of-court dispute settlement, judicial remedy).
+| Requirement | How `moderate` satisfies it | Who | Proof |
+| --- | --- | --- | --- |
+| State the **specific restriction** imposed and its scope (content removed? account suspended?). | Every resolution records its action (`remove_content:`, `ban_user:`) and emits `affected_user_decision` with that action in `event.payload`. | **[gem]** | `test/services/moderate/resolve_report_test.rb` |
+| State the **facts and circumstances** relied on. | The immutable **evidence snapshot** taken at report time travels with the decision; the moderator's mandatory `note:` is the human-readable ground. | **[gem]** | `test/models/moderate/report_test.rb` |
+| State whether **automated means** were used in detection or decision — Art. 17(3)(c). | Reports/flags carry their `source` (`text_filter`, `image_filter`, `external_classifier`, or `manual`). A decision acting on an auto-`Moderate::Flag` is flagged as automated-means in the `affected_user_decision` payload; a human report is not. | **[gem]** | `test/services/moderate/resolve_report_test.rb` |
+| State the **legal or contractual ground**. | For DSA notices, the `legal_reason` (from `Moderate::DSA_LEGAL_REASONS`) is the legal ground; for in-app reports, the community `category` + your `note:` is the contractual (terms-of-service) ground. Both ride in the decision payload. | **[gem + you]** | `test/services/moderate/resolve_report_test.rb` |
+| Communicate **redress** options (internal complaint, out-of-court, judicial). | The `affected_user_decision` event carries the appeal entry point; you render the redress text in your decision email (the gem ships the data; the copy is yours, because it names your jurisdiction). | **[gem + you]** | `test/services/moderate/resolve_report_test.rb` |
+| Deliver the statement to the **affected recipient** (the content owner), not just the reporter. | Two distinct events fire: `report_decision` → the **reporter**; `affected_user_decision` → the **content owner** (resolved via the reportable's `reported_owner`). You wire both. | **[gem + you]** | `test/services/moderate/resolve_report_test.rb` |
+> [!NOTE]
+> **Why two decision events.** Art. 16(5) wants the **notifier** informed; Art. 17 wants the **affected user** informed — they are different people with different rights. `moderate` keeps them separate (`report_decision` vs `affected_user_decision`) so your one `notify` hook sends the right message to the right person. Collapsing them into one email is a classic DSA mistake. See [Notifications](../README.md#-notifications---audit--one-hook-each).
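One way to picture the split between the two decision events is a small router that picks the recipient and the template from the event name. This is a sketch under stated assumptions, not the gem's API: the `Event` struct stands in for whatever object `config.notify` receives, the `name` accessor is assumed, and the payload keys `reporter_email` and `owner_email` are hypothetical. Only the event names and `payload` come from the text above.

```ruby
# Stand-in for the object handed to the notify hook. The `name` accessor is
# an assumption; `payload` is the attribute the documentation mentions.
Event = Struct.new(:name, :payload, keyword_init: true)

# Hypothetical router: one hook, two audiences, two different messages.
def route_decision(event)
  case event.name
  when :report_decision
    # Art. 16(5): tell the person who filed the notice what was decided.
    { to: event.payload[:reporter_email], template: :reporter_outcome }
  when :affected_user_decision
    # Art. 17: give the content owner the statement of reasons.
    { to: event.payload[:owner_email], template: :statement_of_reasons }
  end
end

owner_mail = route_decision(
  Event.new(name: :affected_user_decision, payload: { owner_email: "owner@example.com" })
)
puts owner_mail[:template]
```

Returning `nil` for any other event name keeps the router from sending a decision email for events it does not understand.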
+### Art. 20 — Internal complaint-handling system (appeals)
+> Providers shall provide recipients with access to an **effective internal complaint-handling system**, **free of charge**, for at least **six months** after a decision, with complaints handled in a **timely, non-discriminatory, diligent and non-arbitrary manner** and decided **not solely on the basis of automated means**.
+| Requirement | How `moderate` satisfies it | Who | Proof |
+| --- | --- | --- | --- |
+| An **internal** appeal mechanism against moderation decisions. | `Moderate::Appeal` — a complaint filed against a resolved report/notice; the queue is `Moderate::Appeal.pending`. | **[gem]** | `test/models/moderate/appeal_test.rb` |
+| **Free of charge.** | There is no charge anywhere in the appeal path — it's just a record + a queue. (You simply don't bill for it.) | **[gem]** | n/a (no payment code exists in the path) |
+| Open for **at least six months** after the decision. | Each report stores its **appeal window**; the gem's default window is **6 months** and `Moderate::Appeal` refuses to open against a decision whose window has closed. | **[gem]** | `test/models/moderate/appeal_test.rb` |
+| Decisions **reversible** — uphold the complaint and reverse the action. | `appeal.uphold!(by:, note:)` overturns the original decision (and runs the reverse enforcement); `appeal.reject!(by:, note:)` confirms it. | **[gem]** | `test/services/moderate/resolve_appeal_test.rb` |
+| **Not solely automated** — a human decides the complaint. | `uphold!`/`reject!` **require** a `by:` moderator and a `note:`; there is no path to auto-decide an appeal. | **[gem]** | `test/services/moderate/resolve_appeal_test.rb` |
+| Inform the complainant of the **appeal decision** and remaining redress (out-of-court / judicial). | Resolving an appeal emits `appeal_decision` to the complainant; you render the out-of-court / judicial redress copy. | **[gem + you]** | `test/services/moderate/resolve_appeal_test.rb` |
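The six-month window is simple calendar arithmetic, sketched below in plain Ruby. The helper name `appeal_window_open?` and the constant are hypothetical and this is not how `Moderate::Appeal` is necessarily implemented; it only illustrates the rule that an appeal is refused once the window after the decision has closed.

```ruby
require "date"

# Assumed default, matching the 6-month window described above.
APPEAL_WINDOW_MONTHS = 6

# True while `today` is on or before the last day of the appeal window.
# `Date#>>` adds calendar months and clamps to the end of shorter months.
def appeal_window_open?(decided_on, today = Date.today)
  today <= (decided_on >> APPEAL_WINDOW_MONTHS)
end

puts appeal_window_open?(Date.new(2025, 1, 15), Date.new(2025, 7, 15))
puts appeal_window_open?(Date.new(2025, 1, 15), Date.new(2025, 7, 16))
```

Treating the last day as still open is a design choice in this sketch; confirm with counsel whether your window should be inclusive.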
+### Art. 24 — Transparency reporting
+> Providers shall publish, at least **once a year**, reports on their content moderation, including the **number of notices** received, **action taken**, **use of automated means**, and **complaints** received and their outcomes.
+| Requirement | How `moderate` satisfies it | Who | Proof |
+| --- | --- | --- | --- |
+| Count of **notices received** (by type/ground). | `Moderate.transparency` aggregates `moderate_reports` by `intake_kind` and `legal_reason`/`category`. | **[gem]** | `test/integration/transparency_report_test.rb` |
+| Count of **actions taken** (removals, bans, dismissals). | The same aggregation tallies resolutions by action and dismissals. | **[gem]** | `test/integration/transparency_report_test.rb` |
+| **Median handling time** (notice → decision). | Computed from each report's received-at vs decided-at timestamps. | **[gem]** | `test/integration/transparency_report_test.rb` |
+| Use of **automated means** in moderation. | Counts of decisions acting on auto-`Moderate::Flag`s vs human reports, from the `source` column. | **[gem]** | `test/integration/transparency_report_test.rb` |
+| **Appeals** received and their **outcomes** (upheld / rejected). | Aggregation over `moderate_appeals` by status. | **[gem]** | `test/integration/transparency_report_test.rb` |
+| **Publish** the report (at least annually). | The gem produces the numbers; **you** publish them (a `/transparency` page, a PDF, whatever) — only you know your reporting period and format. | **[you]** | manual: render `Moderate.transparency(from:, to:)` |
+> [!TIP]
+> `Moderate.transparency(from: 1.year.ago, to: Time.current)` returns a plain hash you can drop straight into a view, a JSON endpoint, or a rake task that emails it to you each January. The counters are the regulator-aligned ones — same taxonomy as the notice form (`Moderate::DSA_LEGAL_REASONS`) — so the published numbers line up with the intake.
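For readers who want to sanity-check the published numbers, the median handling time can be recomputed by hand. The sketch below assumes each report is a plain hash with `received_at` and `decided_at` `Time` values; those keys and the helper name are hypothetical, and the gem's own aggregation may differ in detail.

```ruby
# Median of (decided_at - received_at) in seconds, ignoring undecided reports.
# Returns nil when nothing has been decided yet.
def median_handling_seconds(reports)
  durations = reports
    .select { |r| r[:decided_at] }
    .map { |r| r[:decided_at] - r[:received_at] }
    .sort
  return nil if durations.empty?

  mid = durations.length / 2
  if durations.length.odd?
    durations[mid]
  else
    (durations[mid - 1] + durations[mid]) / 2.0
  end
end
```

A median is less sensitive than a mean to the one notice that sat in the queue for a month, which is why it is the more honest figure to publish.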
+---
+## 2. Apple App Store — Guideline 1.2 (User-Generated Content)
+Apple is blunt: a UGC app that lacks any of these mechanisms gets **rejected**, and a missing mechanism is one of the most common reasons a social/community app fails review. Guideline 1.2 lists four mechanisms; **all four are required**, and reviewers test them by hand during review.
+> Apps with user-generated content … must include: **(a)** a method for **filtering objectionable material** from being posted, **(b)** a mechanism to **report** offensive content and timely responses to concerns, **(c)** the ability to **block abusive users**, and **(d)** **published contact information** so users can easily reach you.
+| Requirement | How `moderate` satisfies it | Who | Proof |
+| --- | --- | --- | --- |
+| **(a)** A method to **filter objectionable material** before it's posted. | `moderates :field` with `mode: :block` rejects the offending write before save; the default `:wordlist` adapter is a fast offline baseline, and you can register an image / remote adapter for stronger checks. | **[gem]** | `test/integration/content_filtering_test.rb` |
+| **(b)** A mechanism to **report** offensive content. | `current_user.report!(content, category:)` in-app; reportable content exposes `reports`, `reported?`, `flagged?`; the `moderate_report_link` helper drops the button into any view. | **[gem]** | `test/models/moderate/reportable_test.rb`, `test/integration/reporting_test.rb` |
+| **(b)** **Timely responses** to reports. | The report lands in `Moderate::Report.pending` with a snapshot; the reporter gets a `report_received` receipt immediately, and a `report_decision` when you act. (Acting promptly is on you — the gem surfaces the queue and the events.) | **[gem + you]** | `test/services/moderate/intake_report_test.rb` |
+| **(c)** The ability to **block abusive users**. | `current_user.block!(other)` — bidirectional, idempotent, audited; enforce it everywhere with the single `Moderate.blocked_ids_for(user)` query. | **[gem]** | `test/models/moderate/block_test.rb` |
+| **(d)** **Published contact information** to reach the developer. | The notice-engine root (wherever you mount the engine, e.g. `/legal`) is a natural home for your contact/abuse address; the gem gives you the page, **you** publish the address (Apple wants a real human-reachable contact). | **[gem + you]** | manual: contact shown in-app + on the notice page |
+> [!IMPORTANT]
+> **Guideline 1.2 also expects an EULA acknowledgement** for UGC apps: users must agree there's **no tolerance for objectionable content or abusive behavior**. That's a one-line acceptance in your signup/terms — `moderate` doesn't own your terms screen, but the **community-report categories** (`:harassment`, `:spam`, …) are what your EULA's "objectionable content" clause should enumerate, so the words match the buttons. Keep your terms and your report categories in sync.
+> [!TIP]
+> When you respond to App Review's inevitable "show us your moderation" question, point them at: the in-app **Report** button (1.2b), the **Block** action on a profile (1.2c), the fact that a banned-word post is **rejected** (1.2a), and your **contact** link (1.2d). Those are the four taps a reviewer makes. The rows above are the four they correspond to.
+---
+## 3. Google Play — User-Generated Content policy
+Google Play's UGC policy overlaps heavily with Apple's but is explicit about **two things Apple states more loosely**: blocking/reporting must cover **both users and content**, and you must do **ongoing** moderation (not just provide the buttons). It also requires an in-app way for users to **accept your terms / acceptable-use policy** before contributing UGC.
+> Apps with UGC must: provide an in-app system for **reporting and blocking objectionable users and content**; provide a method to **moderate UGC**; and require users to **accept the app's terms of use / user policy** before creating or uploading UGC.
+| Requirement | How `moderate` satisfies it | Who | Proof |
+| --- | --- | --- | --- |
+| In-app **reporting** of objectionable **content**. | `current_user.report!(content, category:)`; any model that is reportable can be reported. | **[gem]** | `test/models/moderate/reportable_test.rb` |
+| In-app **reporting** of objectionable **users**. | A user model with `has_reporting_and_blocking` is itself reportable: `current_user.report!(other_user, category: :impersonation)`. | **[gem]** | `test/models/moderate/report_test.rb` |
+| In-app **blocking** of objectionable **users**. | `current_user.block!(other)` — the bidirectional safety edge. | **[gem]** | `test/models/moderate/block_test.rb` |
+| In-app **blocking / hiding** of objectionable **content**. | Filter the blocked pair's content out of any feed with `Moderate.blocked_ids_for(current_user)` — the single source-of-truth query you apply in search, inbox, and listings. | **[gem]** | `test/integration/blocking_test.rb` |
+| A method to **moderate UGC** (a real review surface, not just intake). | `Moderate::Report.pending` / `Moderate::Flag.pending` give admins the queue; `resolve!`/`dismiss!`/`remove_content`/`ban_user` are the audited actions. (BYOUI — you bind these to your admin; see [`docs/madmin.md`](madmin.md).) | **[gem + you]** | `test/services/moderate/resolve_report_test.rb` |
+| **Ongoing** moderation, including proactive detection. | Pre-publication filtering (`moderates`) catches content at write time; `:flag` mode queues borderline content for review **after commit**; both feed the same admin queue. The mechanism is continuous, not one-shot. | **[gem]** | `test/integration/content_filtering_test.rb` |
+| Users **accept terms / acceptable-use** before contributing UGC. | This is your signup/terms gate — `moderate` doesn't own it — but, as with Apple, your acceptable-use policy should enumerate the **community-report categories** so the terms and the report buttons describe the same prohibited behavior. | **[you]** | manual: terms acceptance in your onboarding |
+> [!NOTE]
+> **"Both users and content" is the row people miss.** Plenty of apps add a "Report comment" button and stop there. Play wants you to be able to report **and** block **both** a person and a thing. `moderate` covers all four cells because a user model with `has_reporting_and_blocking` is *also* reportable, and blocking is enforced over content via `blocked_ids_for`. If you only made content reportable and never made users reportable, you might pass Apple's spot check and still fail Play's policy.
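The "blocking over content" cell can be illustrated without Rails at all. The sketch below models blocks as `[blocker_id, blocked_id]` pairs and hides posts from either side of a block, which is the bidirectional behavior described above. The local `blocked_ids_for` here is a plain-Ruby stand-in written for this example; the real `Moderate.blocked_ids_for(user)` takes a user and its return type is not shown in this document.

```ruby
# Stand-in for the bidirectional lookup: every id on the other side of a
# block involving `user_id`, regardless of who initiated the block.
def blocked_ids_for(user_id, blocks)
  blocks.flat_map do |blocker_id, blocked_id|
    if blocker_id == user_id
      [blocked_id]
    elsif blocked_id == user_id
      [blocker_id]
    else
      []
    end
  end.uniq
end

# Apply the same list everywhere content is shown (feed, search, inbox).
def visible_posts(posts, viewer_id, blocks)
  hidden = blocked_ids_for(viewer_id, blocks)
  posts.reject { |post| hidden.include?(post[:author_id]) }
end
```

In a database-backed app the same idea is usually a single `WHERE author_id NOT IN (...)` clause, so the filter costs one query rather than a per-row check.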
+---
+## 4. The one-page summary (hand this to legal)
+If you read nothing else, this is the table that says "we did the thing."
+| Regime | Obligation | `moderate` mechanism | Status |
+| --- | --- | --- | --- |
+| **DSA Art. 16** | Public electronic notice + confirmation of receipt | Notice engine (`mount Moderate::Engine`) + `notice_received` event | ✅ gem (+ wire 1 mailer) |
+| **DSA Art. 17** | Statement of reasons (action, ground, automated-means, redress) | `affected_user_decision` event carrying action + ground + source + appeal path | ✅ gem (+ wire 1 mailer) |
+| **DSA Art. 20** | Free internal appeals, ≥ 6 months, human-decided | `Moderate::Appeal` + 6-month window + `by:`/`note:`-required `uphold!`/`reject!` | ✅ gem |
+| **DSA Art. 24** | Annual transparency report | `Moderate.transparency` counters | ✅ gem (you publish) |
+| **Apple 1.2(a)** | Filter objectionable content | `moderates :field, mode: :block` | ✅ gem |
+| **Apple 1.2(b)** | Report + timely response | `report!` + `Report.pending` + `report_decision` | ✅ gem (you respond) |
+| **Apple 1.2(c)** | Block abusive users | `block!` + `blocked_ids_for` | ✅ gem |
+| **Apple 1.2(d)** | Published contact | Notice-engine root page (e.g. `/legal`) | ✅ gem (you publish address) |
+| **Play UGC** | Report + block, **users and content** | `report!`/`block!` on users; reportable + `blocked_ids_for` on content | ✅ gem |
+| **Play UGC** | Ongoing moderation surface | `Flag.pending` + `:flag`-mode filtering | ✅ gem (you review) |
+| **Play UGC** | Accept terms before UGC | your onboarding gate | ⬜ you (categories align) |
+Legend: **✅ gem** — the mechanism ships and is tested. **⬜ you** — your operational/policy step that the gem makes straightforward.
+> [!WARNING]
+> A green checklist is necessary, not sufficient. The stores and the DSA judge you on **behavior over time** — that you actually read the queue, answer notices, and decide appeals — not just that the buttons exist. `moderate` makes every one of those actions a one-liner with a built-in audit trail (`config.audit`), so doing the operational work is cheap. **Do it.** The mechanisms keep you compliant only if you keep using them.
+---
+## See also
+- [`docs/dsa-notice-form.md`](dsa-notice-form.md) — the public Art. 16 notice form, in depth (mount, fields, Turnstile gate, the `moderate:views` eject)
+- [DSA & app-store compliance](../README.md#️-dsa--app-store-compliance-out-of-the-box) — the README overview these tables expand on
+- [Notifications & audit](../README.md#-notifications---audit--one-hook-each) — wiring `notice_received` / `*_decision` so Art. 16(4) and Art. 17 are actually delivered
+- [Admin & the moderation queue](../README.md#️-admin--the-moderation-queue) — `resolve!`/`dismiss!`/`uphold!` and the audit trail behind every decision

data/docs/configuration.md ADDED

@@ -0,0 +1,326 @@
+# Configuration reference
+Everything `moderate` lets you configure lives in one initializer, written in the `Moderate.configure do |config|` block. `rails generate moderate:install` drops a fully-commented `config/initializers/moderate.rb` with every option present and annotated; this doc is the reference for what each one does, with copy-paste examples.
+Every option has a sensible default, so the minimum viable config is a single line:
+```ruby
+# config/initializers/moderate.rb
+Moderate.configure do |config|
+  config.user_class = "User"
+end
+```
+That alone gives you reporting, blocking, the default `:wordlist` filter in `:block` mode, and the moderation queue. Everything below is opt-in refinement.
+> [!NOTE]
+> Config is read at the point of use, not frozen at boot — class names are stored as strings and constantized lazily, so the initializer works no matter when your app loads. The block is validated at the end of `configure`, so a typo'd mode or unknown adapter raises a plain-English `ArgumentError` immediately instead of failing mysteriously later.
+---
+## The whole surface at a glance
+```ruby
+Moderate.configure do |config|
+  # --- Identity -------------------------------------------------------------
+  config.user_class          = "User"        # who reports / blocks / gets reported / gets banned
+  # --- Filtering ------------------------------------------------------------
+  config.default_filter_mode = :block        # :off / :block / :flag  (used by `moderates` w/o a mode)
+  config.filter_adapter      = :wordlist     # default text adapter
+  config.additional_words    = %w[…]         # extra :wordlist entries
+  config.excluded_words      = %w[…]         # :wordlist false-positives to never flag
+  config.report_categories   = %w[…]         # override the in-app community category list (no migration)
+  config.register_adapter :openai, OpenAIModerationAdapter.new   # bring your own remote adapter
+  config.filter "Message", :body, with: :wordlist, mode: :flag   # per-field policy in one place
+  # --- Hooks (all no-op by default) ----------------------------------------
+  config.audit       = ->(event) { … }                    # record important actions
+  config.notify      = ->(event) { … }                    # fan out emails / alerts / push
+  config.on_block    = ->(blocker:, blocked:, at:) { … }  # side effects when a block happens
+  config.ban_handler = ->(user:, by:, reason:) { … }      # how a "ban" is applied in YOUR app
+  # --- Misc -----------------------------------------------------------------
+  config.locale = :en                        # locale for copy moderate generates itself
+end
+```
+The public legal-form options (`parent_controller`, `notice_form_enabled`, `notice_rate_limit`, `notice_guard`, `notice_human_verification_skip_if`, `appeal_form_enabled`, `appeal_rate_limit`, `appeal_guard`, `appeal_human_verification_skip_if`, `appeal_return_path`) are documented in their own guide — see [The DSA notice form](dsa-notice-form.md#configuration-reference-notice-form). They're omitted here to keep this focused on the core T&S surface.
+---
+## Identity
+### `user_class`
+```ruby
+config.user_class = "User"   # default: "User"
+```
+The model that **acts** in your Trust & Safety system: it reports, it blocks, it gets reported, and it gets banned. This is the model where you add the actor macro:
+```ruby
+class User < ApplicationRecord
+  has_reporting_and_blocking # gains report!/block!/blocks?/blocked_with?…
+  # include Moderate::Actor # the documented, exactly-equivalent include form
+end
+```
+Stored as a **string** and constantized lazily, so it doesn't matter whether the class is loaded yet when the initializer runs. It's usually `"User"`, but it can be anything that represents "a person who acts" — `"Account"`, `"Member"`, etc. `moderate` deliberately doesn't own auth or current-user; you tell it the class, your auth gem (Devise, etc.) tells it who's logged in.
+---
+## Filtering
+`moderate` filters text and images **before they're saved**, declared per field with the `moderates` macro. The three config options below set the *defaults* that macro uses; you can always override per field.
+### `default_filter_mode`
+```ruby
+config.default_filter_mode = :block   # default: :block  (:off / :block / :flag)
+```
+The mode a bare `moderates :field` uses when you don't pass `mode:`:
+- **`:off`** — no check. (Useful as a global default if you want filtering opt-in per field.)
+- **`:block`** — the write is **rejected** with a validation error if the filter trips. Best for public, high-trust fields (a profile bio, a listing title).
+- **`:flag`** — the write **succeeds**, and a `Moderate::Flag` is created **after commit** for review. Best for DMs and chat, where blocking mid-conversation is hostile UX.
+```ruby
+class Message < ApplicationRecord
+  moderates :body                  # uses default_filter_mode
+end
+class Profile < ApplicationRecord
+  moderates :bio,    mode: :block  # override: reject the save
+  moderates :avatar, mode: :flag, with: :image   # `:image` is a registered adapter — see examples/ (only :wordlist is built in)
+end
+```
+> [!IMPORTANT]
+> `:flag` never lives in a validator. Validators must be side-effect-free, and a flag created inside a rolled-back transaction would silently vanish, so `moderate` creates the flag for you **after commit**. This is the whole reason `:flag` is a `moderates` mode and not something you can hand-roll with `validates`.
+Reportable records expose the review state directly:
+```ruby
+message.flagged?        # any pending flag?
+message.flagged?(:body) # pending flag for one field?
+```
+Use those predicates to render host-specific "under review" affordances if that is right for your product. The gem intentionally does not ship a visible banner/component because moderation copy, styling, and disclosure rules belong to the host app.
+Hotwire Native / Turbo Native apps also need host path-configuration rules for the report surfaces they mount. Cover both the form route (`/reports/new`, or your equivalent) and the form action (`/reports`) so validation errors stay in the intended native context, plus the engine's public legal routes and their form actions if you mount them (`<mount>/notices/new`, `<mount>/notices`, `<mount>/appeals/new`, `<mount>/appeals`, `<mount>/transparency`, where `<mount>` is your host-chosen `Moderate::Engine` mount point). Android rules must include the destination `uri` your app binary has registered.
+### `filter_adapter`
+```ruby
+config.filter_adapter = :wordlist   # default: :wordlist
+```
+The default **text** adapter used by `moderates :field` and `Moderate.classify`. Every adapter — built-in or yours — implements the same tiny contract, so they're interchangeable per field:
+```ruby
+adapter.classify(value)  # => Moderate::Result(allowed:, categories:, scores:)
+```
+Exactly **one** adapter ships built in:
+| Adapter | Use it for | Notes |
+| --- | --- | --- |
+| `:wordlist` (default) | text | Fast **offline** baseline, multilingual, zero-dependency. Includes Unicode normalization and common substitution handling, but it is not a contextual classifier. Ships `en`/`es` lists; extend with `additional_words` / `excluded_words`. |
+For anything nuanced — context-aware text, images, a hosted moderation API — you **bring and name your own adapter** with `register_adapter` (next section). Two ready-to-copy reference adapters live under [`examples/`](../examples/): `examples/openai_moderation_adapter.rb` (OpenAI `omni-moderation-latest`, text + image, via the `ruby_llm` gem) and `examples/aws_rekognition_adapter.rb` (image moderation via `aws-sdk-rekognition`). They are **not shipped, loaded, or a dependency** — copy one into your app, add its gem to *your* Gemfile, and register it. `moderate` intentionally does **not** ship a built-in "LLM" or image adapter: the contract is `classify(value) → Result`, and whether the backend behind your adapter is an LLM, a hosted endpoint, or a regex is your call, not the gem's.
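To make the `classify(value) → Result` contract concrete, here is a small regex-backed adapter sketched in plain Ruby. `SketchResult` is a stand-in for `Moderate::Result` so the example runs without the gem, and `RegexAdapter` and its patterns are hypothetical.

```ruby
# Stand-in for Moderate::Result (assumption: same three fields as documented).
SketchResult = Struct.new(:allowed, :categories, :scores, keyword_init: true)

# Hypothetical adapter: the only requirement it models is responding to #classify.
class RegexAdapter
  PATTERNS = {
    spam:    /\bfree money\b/i,
    contact: /\b\d{3}-\d{4}\b/
  }.freeze

  def classify(value)
    text = value.to_s
    hits = PATTERNS.select { |_category, pattern| text.match?(pattern) }.keys
    # A regex is binary, so each score is either 0.0 or 1.0 (within the 0..1 range).
    scores = PATTERNS.keys.to_h { |category| [category, hits.include?(category) ? 1.0 : 0.0] }
    SketchResult.new(allowed: hits.empty?, categories: hits, scores: scores)
  end
end

adapter = RegexAdapter.new
clean   = adapter.classify("See you at noon")
spammy  = adapter.classify("FREE MONEY, call 555-1234")
```

In a real app the adapter would return a `Moderate::Result` and be registered under a name with `config.register_adapter`.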
+### `register_adapter` — bring your own backend
+An adapter is just an object that responds to `classify` and returns a `Moderate::Result`. Register it once under a name you choose, then reference it anywhere by that name:
+```ruby
+class OpenAIModerator
+  def classify(value)
+    resp = OpenAI.moderate(value)   # placeholder: swap in your client's real moderation call
+    Moderate::Result.new(
+      allowed:    !resp.flagged?,
+      categories: resp.categories,                 # e.g. [:hate, :harassment]
+      scores:     resp.category_scores             # { hate: 0.92, harassment: 0.13 }  (0..1)
+    )
+  end
+end
+Moderate.configure do |config|
+  config.register_adapter :openai, OpenAIModerator.new
+  # now use it by name, per field:
+  config.filter "Comment", :body, with: :openai, mode: :flag
+end
+```
+```ruby
+# or right on the model:
+class Comment < ApplicationRecord
+  moderates :body, with: :openai, mode: :flag
+end
+```
+The name is **yours** — `:openai`, `:replicate`, `:hive`, `:my_classifier`, whatever reads well in your models. The `source` recorded on resulting `Moderate::Flag`s is that name, so your moderation queue shows exactly which backend flagged each item.
+> [!TIP]
+> You don't have to write the adapter from scratch. Two production-shaped reference adapters live under [`examples/`](../examples/): `examples/openai_moderation_adapter.rb` (OpenAI, text + image, via `ruby_llm`) and `examples/aws_rekognition_adapter.rb` (image moderation via `aws-sdk-rekognition`). Copy one in, add its gem to *your* Gemfile, and `register_adapter` it. They're reference code, not a gem dependency, so nothing is pulled into an app that doesn't want it.
+### `additional_words` / `excluded_words`
+```ruby
+config.additional_words = %w[customword anotherword]   # default: []
+config.excluded_words   = %w[scunthorpe assangea]      # default: []
+```
+Two layers on top of the built-in `:wordlist`:
+- **`additional_words`** — domain-specific terms you want caught that aren't in the shipped lists.
+- **`excluded_words`** — false positives you never want flagged (the classic "Scunthorpe problem" — legitimate words that contain a substring of a banned one).
+Both apply only to the `:wordlist` adapter. (The old `0.x` `additional_words`/`excluded_words` config keys carry over unchanged — see [Upgrading from 0.x](../README.md#upgrading-from-0x).)
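The interaction of the two lists can be sketched in plain Ruby. This assumes simple substring matching per word, which may differ from how the built-in `:wordlist` adapter matches; `BASE_WORDS` and `tripped_words` are hypothetical names.

```ruby
require "set"

BASE_WORDS = %w[badword].freeze # hypothetical stand-in for the shipped lists

# Returns the words in `text` that would trip the filter.
def tripped_words(text, additional: [], excluded: [])
  banned = BASE_WORDS + additional
  skip   = excluded.map(&:downcase).to_set

  text.downcase.scan(/[[:alpha:]]+/)
      .reject { |word| skip.include?(word) }                      # excluded_words win first
      .select { |word| banned.any? { |term| word.include?(term) } } # substring match
end

default_hits  = tripped_words("a badwordy day")
excluded_hits = tripped_words("a badwordy day", excluded: %w[badwordy])
custom_hits   = tripped_words("my customword", additional: %w[customword])
```

The first call shows the false positive that substring matching produces, the second shows an exclusion suppressing it, and the third shows a domain term being caught.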
+### `report_categories` — customize the in-app community category list
+```ruby
+config.report_categories = %w[harassment hate spam fraud my_custom_label]   # default: nil
+```
+The in-app **community report** category set a user picks from when they tap "Report" (`harassment`, `spam`, …). Leave it `nil` (the default) to use the gem's `Moderate::Report::DEFAULT_CATEGORIES`; set an Array to replace the list with your own. The `category` value is validated **in the model** (a frozen constant + an ActiveModel `inclusion` validation), **not** by a database `CHECK` constraint, so **adding a category or narrowing the list never requires a migration**: change this one config line and you're done.
+```ruby
+Moderate::Report.report_categories
+# => your config.report_categories if set, else Moderate::Report::DEFAULT_CATEGORIES
+```
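The fallback can be pictured with a few lines of plain Ruby. The category values in `DEFAULT_CATEGORIES` are placeholders rather than the gem's real list, and both method names are hypothetical.

```ruby
DEFAULT_CATEGORIES = %w[harassment hate spam].freeze # placeholder values only

# nil means "use the defaults"; an Array replaces the list entirely.
def effective_categories(configured)
  configured.nil? ? DEFAULT_CATEGORIES : Array(configured).map(&:to_s)
end

# Model-level inclusion check, so no database constraint is involved.
def valid_category?(category, configured)
  effective_categories(configured).include?(category.to_s)
end

with_defaults = valid_category?("fraud", nil)
with_custom   = valid_category?("fraud", %w[harassment fraud])
```

Since the check runs against an in-memory list, changing the configured Array is enough to accept or reject a category.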
+> [!NOTE]
+> This is the **community** taxonomy only. The separate, regulator-aligned **DSA legal-reason** taxonomy (`Moderate::Report::DSA_LEGAL_REASONS`) and the EU member-state list are **not** host-overridable — they're defined by the regulation, so widening them is a gem change, not host config.
+### `filter` — per-field policy in the initializer
+If you'd rather keep all your Trust & Safety policy in one place instead of sprinkling `moderates` across models, declare per-field filters in the initializer. Same effect, same arguments:
+```ruby
+config.filter "Message", :body,   with: :wordlist,    mode: :flag
+config.filter "Profile", :bio,    with: :wordlist,    mode: :block
+config.filter "Profile", :avatar, with: :rekognition, mode: :flag   # a reference adapter you registered
+```
+`config.filter "Class", :field, with:, mode:` is the initializer twin of `moderates :field, with:, mode:` on the model. Use whichever fits your taste; you can mix both. (Reportable classes themselves are auto-discovered from the reportable macro — there's no separate registry to maintain.)
+---
+## Hooks
+`moderate` never sends an email, writes to *your* audit log, or decides what "banned" means in your app. It **emits events** and **calls handlers** you wire once. All four hooks default to a no-op, so the gem works untouched — wire them as you need them.
+### `audit` — record important actions
+```ruby
+config.audit = ->(event) { AuditLog.record!(event_type: event.name, data: event.payload) }
+```
+Called for **every important action** so you can write it to your own audit system. The gem never touches your audit log directly. The event carries a stable envelope:
+```ruby
+event.name        # Symbol, e.g. :report_decision
+event.subject     # the record acted on (a Report, Block, Flag, Appeal…)
+event.actor       # who took the action (a moderator, a user, or nil for system)
+event.recipients  # who should be notified (Array)
+event.payload     # Hash of event-specific context (includes :summary)
+event.to_h        # the whole envelope as a Hash
+```
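As an illustration of how a handler consumes that envelope, the sketch below uses a `Struct` as a stand-in for the gem's event object. `SketchEvent`, the sample values, and the in-memory `audit_log` are all hypothetical.

```ruby
# Stand-in for the event envelope (assumption: the five readers listed above).
SketchEvent = Struct.new(:name, :subject, :actor, :recipients, :payload, keyword_init: true)

audit_log = []
audit = ->(event) { audit_log << { event_type: event.name, data: event.payload } }

event = SketchEvent.new(
  name:       :report_decision,
  subject:    "Report#1",        # a record in a real app
  actor:      "moderator",       # or nil for system actions
  recipients: [],
  payload:    { summary: "Report resolved" }
)

audit.call(event)
envelope = event.to_h # Struct#to_h gives the whole envelope as a Hash
```

The handler only reads `name` and `payload`, which is why the same lambda shape works for every event type.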
+### `notify` — fan out anywhere
+```ruby
+config.notify = ->(event) do
+  case event.name
+  when :report_received, :report_decision, :affected_user_decision
+    ModerationMailer.with(event: event).public_send(event.name).deliver_later   # goodmail
+  when :content_flagged
+    Telegrama.send_message("🚩 #{event.payload[:summary]}")                      # admin alert
+  end
+end
+```
+Called for **every notifiable event**. One hook drives them all — `goodmail` for user emails, `telegrama` for admin alerts, `noticed` for in-app feed + push — because every event shares the same envelope. The full vocabulary:
+```
+report_received   report_decision   affected_user_decision
+appeal_received   appeal_decision
+user_blocked      user_unblocked    user_banned
+content_flagged   content_removed
+```
+> [!IMPORTANT]
+> Keep `notify` (and `audit`) **fast** — use background jobs (`deliver_later`, `perform_later`). These hooks run inside the moderation action's flow; slow work here slows down every decision and every block.
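The "enqueue and return" shape behind that advice can be sketched with Ruby's built-in `Queue` and a worker thread. In a Rails app a job backend would play this role; the event Hashes and the `:stop` sentinel here are purely illustrative.

```ruby
queue     = Queue.new
delivered = []

# The hook itself only enqueues, so it returns immediately.
notify = ->(event) { queue << event }

# Slow delivery work happens off the caller's path.
worker = Thread.new do
  while (event = queue.pop) != :stop
    delivered << "sent #{event[:name]}"
  end
end

notify.call({ name: :report_received })
notify.call({ name: :content_flagged })

queue << :stop # sentinel so the sketch terminates
worker.join
```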
+### `on_block` — side effects when a block happens
+```ruby
+config.on_block = ->(blocker:, blocked:, at:) { CancelPendingInvites.call(blocker, blocked, at: at) }
+```
+Optional teardown when one user blocks another — cancel a pending invite, leave a shared room, drop a follow. Signature is **keyword args** (`blocker:`, `blocked:`, `at:`), where `at` is the created block row's timestamp. No-op by default. (A `user_blocked` event also fires through `notify`; use `on_block` for *domain side effects* and `notify` for *messaging*.)
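A short plain-Ruby sketch of what the keyword-args signature means in practice. The String actors and the `cancelled` array are placeholders for real records and a real teardown.

```ruby
cancelled = []
on_block  = ->(blocker:, blocked:, at:) { cancelled << [blocker, blocked, at] }

# Called with all three keywords.
on_block.call(blocker: "alice", blocked: "bob", at: Time.at(0).utc)

# A lambda enforces its keywords, so omitting one raises ArgumentError.
missing_keyword_error =
  begin
    on_block.call(blocker: "alice", blocked: "bob")
    nil
  rescue ArgumentError => e
    e
  end
```

Using a lambda with required keywords means a handler whose signature drifts from the documented one fails loudly instead of silently receiving `nil`.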
+### `ban_handler` — what "banned" means in your app
+```ruby
+config.ban_handler = ->(user:, by:, reason:) { user.suspend!(reason: reason) }
+```
+`moderate` doesn't own your user lifecycle, so it **never bans a user itself**. When a moderator resolves a report with `ban_user: true` (see [the madmin queue](madmin.md#step-3--the-controller-call-the-gems-decision-methods)), this proc decides what "banned" means in your domain — `suspend!`, soft-delete, flip a flag, revoke sessions, whatever. Signature is **keyword args** (`user:`, `by:`, `reason:`). No-op by default — the decision still audits and notifies even if you haven't wired a ban yet, so you're never silently dropping the action.
+---
+## Misc
+### `locale`
+```ruby
+config.locale = :en   # default: your app's I18n.default_locale
+```
+The locale for copy `moderate` generates on its own — filter validation messages, the DSA statement-of-reasons taxonomy labels, the notice-form strings. Leave it unset to follow `I18n.default_locale`.
+---
+## How the macros relate to config
+Config sets defaults; the model macros consume them. The two halves of the API:
+| In the model | In the initializer | What it controls |
+| --- | --- | --- |
+| `has_reporting_and_blocking` (or `include Moderate::Actor`) | `config.user_class` | Who can report/block and be reported/banned |
+| `has_reportable_content :title, :description` (or `include Moderate::Reportable`) | — (auto-discovered) | Which content is reportable, and which fields |
+| `moderates :body, with:, mode:` | `config.default_filter_mode`, `config.filter_adapter`, `config.filter "…"` | Pre-publication filtering per field |
+Both sugar macros have an exactly-equivalent `include` form for include-purists — `has_reporting_and_blocking` ⇔ `include Moderate::Actor`, `has_reportable_content` ⇔ `include Moderate::Reportable`. They compile to the same thing.
+---
+## Validation & errors
+`moderate` validates your config at the end of `configure` and raises a plain-English `ArgumentError` on a bad value:
+```ruby
+config.default_filter_mode = :reject
+# raises ArgumentError: default_filter_mode must be one of: off, block, flag
+config.filter "Message", :body, with: :gpt5, mode: :flag
+# raises ArgumentError: unknown filter adapter :gpt5 — the only built-in is :wordlist;
+#    register your own with `config.register_adapter :gpt5, MyAdapter.new`
+```
+Modes and adapter names are normalized (`to_s.strip.downcase.to_sym`), so `"Block"`, `:block`, and `" block "` all mean the same thing. This matches the validating-setter convention across the ecosystem (`usage_credits` `default_currency=`, `wallets` `default_asset=`).
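The normalization chain and the validating-setter idea can be sketched in plain Ruby. `VALID_MODES` and `normalize_mode` are hypothetical names; only the `to_s.strip.downcase.to_sym` chain and the error wording come from the text above.

```ruby
VALID_MODES = %i[off block flag].freeze

# Normalize first, then validate, so "Block" and " block " are accepted.
def normalize_mode(value)
  mode = value.to_s.strip.downcase.to_sym
  return mode if VALID_MODES.include?(mode)

  raise ArgumentError, "default_filter_mode must be one of: #{VALID_MODES.join(', ')}"
end

from_string = normalize_mode("Block")
from_padded = normalize_mode(" block ")
from_symbol = normalize_mode(:block)
```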
+## See also
+- [Admin & the moderation queue](madmin.md) — wiring `ban_handler` / `notify` / `audit` into a real admin
+- [The DSA notice form](dsa-notice-form.md) — the notice-form-specific config keys
+- [Notifications & audit](../README.md#-notifications---audit--one-hook-each) — the event vocabulary in full
+- [Content filtering](../README.md#-content-filtering-off--block--flag) — the `moderates` macro and the adapter contract
+- [Upgrading from 0.x](../README.md#upgrading-from-0x) — what carries over from the profanity-validator era