npm - spamscanner - Versions diffs - 4.0.0 → 5.1.1 - Mend

spamscanner 4.0.0 → 5.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md +79 -8
package/cedict_1_0_ts_utf-8_mdbg.txt +120853 -0
package/index.js +284 -119
package/package.json +37 -73
package/vocabulary-limit.js +3 -1

package/README.md CHANGED

@@ -2,9 +2,7 @@
   <a href="https://spamscanner.net"><img src="https://d1i8ikybhfrv4r.cloudfront.net/spamscanner.png" alt="spamscanner" /></a>
 </h1>
 <div align="center">
-  <a href="https://join.slack.com/t/ladjs/shared_invite/zt-fqei6z11-Bq2trhwHQxVc5x~ifiZG0g"><img src="https://img.shields.io/badge/chat-join%20slack-brightgreen" alt="chat" /></a>
-  <a href="https://travis-ci.com/spamscanner/spamscanner"><img src="https://travis-ci.com/spamscanner/spamscanner.svg?branch=master" alt="build status" /></a>
-  <a href="https://codecov.io/github/spamscanner/spamscanner"><img src="https://img.shields.io/codecov/c/github/spamscanner/spamscanner/master.svg" alt="code coverage" /></a>
+  <a href="https://github.com/spamscanner/spamscanner/actions/workflows/ci.yml"><img src="https://github.com/spamscanner/spamscanner/actions/workflows/ci.yml/badge.svg" alt="build status" /></a>
   <a href="https://github.com/sindresorhus/xo"><img src="https://img.shields.io/badge/code_style-XO-5ed9c7.svg" alt="code style" /></a>
   <a href="https://github.com/prettier/prettier"><img src="https://img.shields.io/badge/styled_with-prettier-ff69b4.svg" alt="styled with prettier" /></a>
   <a href="https://lass.js.org"><img src="https://img.shields.io/badge/made_with-lass-95CC28.svg" alt="made with lass" /></a>
@@ -48,6 +46,7 @@
   * [`scanner.getVirusResults(mail)`](#scannergetvirusresultsmail)
   * [`scanner.parseLocale(locale)`](#scannerparselocalelocale)
 * [Caching](#caching)
+* [Debugging](#debugging)
 * [Contributors](#contributors)
 * [References](#references)
 * [License](#license)
@@ -188,11 +187,48 @@ Note that you can simply use the Spam Scanner API for free at <https://spamscann
 2. Configure ClamAV:
    ```sh
+   # if you are on Intel macOS
+   sudo mv /usr/local/etc/clamav/clamd.conf.sample /usr/local/etc/clamav/clamd.conf
+   # if you are on M1 macOS (or newer brew which installs to `/opt/homebrew`)
+   sudo mv /opt/homebrew/etc/clamav/clamd.conf.sample /opt/homebrew/etc/clamav/clamd.conf
+   ```
+   ```sh
+   # if you are on Intel macOS
+   sudo vim /usr/local/etc/clamav/clamd.conf
+   # if you are on M1 macOS (or newer brew which installs to `/opt/homebrew`)
+   sudo vim /opt/homebrew/etc/clamav/clamd.conf
+   ```
+   ```diff
+   -Example
+   +#Example
+   -#StreamMaxLength 10M
+   +StreamMaxLength 50M
+   +# this file path may be different on your OS (that's OK)
+   \-#LocalSocket /tmp/clamd.socket
+   \+LocalSocket /tmp/clamd.socket
+   ```
+   ```sh
+   # if you are on Intel macOS
    sudo mv /usr/local/etc/clamav/freshclam.conf.sample /usr/local/etc/clamav/freshclam.conf
+   # if you are on M1 macOS (or newer brew which installs to `/opt/homebrew`)
+   sudo mv /opt/homebrew/etc/clamav/freshclam.conf.sample /opt/homebrew/etc/clamav/freshclam.conf
    ```
    ```sh
+   # if you are on Intel macOS
    sudo vim /usr/local/etc/clamav/freshclam.conf
+   # if you are on M1 macOS (or newer brew which installs to `/opt/homebrew`)
+   sudo vim /opt/homebrew/etc/clamav/freshclam.conf
    ```
    ```diff
@@ -210,6 +246,8 @@ Note that you can simply use the Spam Scanner API for free at <https://spamscann
    sudo vim /Library/LaunchDaemons/org.clamav.clamd.plist
    ```
+   > If you are on Intel macOS:
    ```plist
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
@@ -231,12 +269,37 @@ Note that you can simply use the Spam Scanner API for free at <https://spamscann
    </plist>
    ```
+   > If you are on M1 macOS (or newer brew which installs to `/opt/homebrew`)
+   ```plist
+   <?xml version="1.0" encoding="UTF-8"?>
+   <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+   <plist version="1.0">
+   <dict>
+     <key>Label</key>
+     <string>org.clamav.clamd</string>
+     <key>KeepAlive</key>
+     <true/>
+     <key>Program</key>
+     <string>/opt/homebrew/sbin/clamd</string>
+     <key>ProgramArguments</key>
+     <array>
+       <string>clamd</string>
+     </array>
+     <key>RunAtLoad</key>
+     <true/>
+   </dict>
+   </plist>
+   ```
+4. Enable it and start it on boot:
    ```sh
    sudo launchctl load /Library/LaunchDaemons/org.clamav.clamd.plist
    sudo launchctl start /Library/LaunchDaemons/org.clamav.clamd.plist
    ```
-4. You may want to periodically run `freshclam` to update the config, or configure a similar `plist` configuration for `launchctl`.
+5. You may want to periodically run `freshclam` to update the config, or configure a similar `plist` configuration for `launchctl`.
 ## Install
@@ -244,7 +307,7 @@ Note that you can simply use the Spam Scanner API for free at <https://spamscann
 [npm][]:
 ```sh
-npm install spamscanner node-snowball
+npm install spamscanner
 ```
@@ -359,7 +422,7 @@ Currently Spam Scanner supports the following locales for tokenization, stemming
 | Finnish    | `fn`       |
 | Farsi      | `fa`       |
 | French     | `fr`       |
-| German     | `gr`       |
+| German     | `de`       |
 | Hungarian  | `hr`       |
 | Indonesian | `in`       |
 | Italian    | `it`       |
@@ -406,7 +469,7 @@ A common example of this is a link of `рaypal.com` which when converted to ASCI
 This method checks against [Cloudflare for Families](https://developers.cloudflare.com/1.1.1.1/1.1.1.1-for-families) servers for both adult-related content, malware, and phishing.  This means we do two separate DNS over HTTPS requests to `1.1.1.2` for malware and `1.1.1.3` for adult-related content.  You can parse the messages results Array for messages that contain "adult-related content" if you need to parse whether or not you want to flag for adult-related content or not on your application.
-If you are using Cloudflare for Families DNS servers as mentioned in [Requirements](#requirements)), then if there are any HTTPS over DNS request errors, it will fallback to use the DNS servers set on the system for lookups, which would in turn use Cloudflare for Family DNS. (using DNS over HTTPS with a fallback of [dns.resolve4](https://nodejs.org/api/dns.html#dns_dns_resolve4\_hostname_options_callback)) – and if it returns `0.0.0.0` then it is considered to be phishing.
+If you are using Cloudflare for Families DNS servers as mentioned in [Requirements](#requirements)), then if there are any HTTPS over DNS request errors, it will fallback to use the DNS servers set on the system for lookups, which would in turn use Cloudflare for Family DNS. (using DNS over HTTPS with a fallback of [dns.resolve4](https://nodejs.org/api/dns.html#dns_dns_resolve4_hostname_options_callback)) – and if it returns `0.0.0.0` then it is considered to be phishing.
 We actually helped Cloudflare in August 2020 to update their documentation to note that this result of `0.0.0.0` is returned for maliciously found content on FQDN and IP lookups.
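
The `0.0.0.0` check described in this hunk can be sketched roughly as follows. This is only an illustration, not the package's implementation: it queries the two Cloudflare for Families resolvers over plain DNS with Node's `dns.promises.Resolver` rather than DNS over HTTPS, and the helper names `isFlaggedAnswer`, `isFlaggedBy`, and `checkHostname` are hypothetical.

```javascript
const dns = require('dns');

// Cloudflare for Families resolvers named in the README text.
const MALWARE_RESOLVER = '1.1.1.2';
const ADULT_RESOLVER = '1.1.1.3';

// Hypothetical helper: an answer counts as flagged when it contains 0.0.0.0.
function isFlaggedAnswer(addresses) {
  return addresses.includes('0.0.0.0');
}

// Hypothetical helper: resolve `hostname` against a single resolver over
// plain DNS (an assumption; the README describes DNS over HTTPS first).
async function isFlaggedBy(resolverIp, hostname) {
  const resolver = new dns.promises.Resolver();
  resolver.setServers([resolverIp]);
  const addresses = await resolver.resolve4(hostname);
  return isFlaggedAnswer(addresses);
}

// Hypothetical helper: run both lookups and report each result separately.
// Lookup errors (for example NXDOMAIN) are not handled here and will reject.
async function checkHostname(hostname) {
  const [malware, adult] = await Promise.all([
    isFlaggedBy(MALWARE_RESOLVER, hostname),
    isFlaggedBy(ADULT_RESOLVER, hostname)
  ]);
  return { malware, adult };
}
```

Keeping the two results separate mirrors the README's note that adult-related content is reported on its own, so an application can decide whether to act on it.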
@@ -430,7 +493,7 @@ Returns an Array of SHA-256 hashed tokenized and stemmed words, with respect to
 Note that this is "smart" in the sense it will parse the "Content-Language" header of the message, the `content` attribute of the HTML message's `<meta http-equiv="Content-Language" content="en-us">`, or the `lang` attribute of `<html lang="en">`.
-After parsing the language of the message, it will then use the package [franc][] to attempt to determine the language of the message (as long as the message has at least 150 characters, which is configurable).
+After parsing the language of the message, it will then use the package [franc][] to attempt to determine the language of the message (as long as the message has at least 5 characters, which is configurable).
 **Most importantly** the following types of tokens are replaced with cryptographically generated random hashes:
@@ -501,6 +564,13 @@ const scanner = new SpamScanner({
 Note that in [Forward Email][forward-email] we use the `client` approach as we have multiple threads across multiple servers running, and in-memory caching would not be efficient.
+## Debugging
+Spam Scanner has built-in debug output via `util.debuglog('spamscanner')`.
+This means you can run your app with `NODE_DEBUG=spamscanner node app.js` to get useful debug output to your console.
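
As a minimal sketch of how `util.debuglog` behaves in general, the snippet below creates a logger for the same `spamscanner` section name. The `countWords` helper is hypothetical and only exists to give the logger something to report; it is not part of the package.

```javascript
const util = require('util');

// Logger for the "spamscanner" section; it writes to stderr only when
// NODE_DEBUG includes "spamscanner" at process start, otherwise it is a no-op.
const debug = util.debuglog('spamscanner');

// Hypothetical helper, not part of Spam Scanner: counts whitespace-separated
// words and reports the count through the debug logger.
function countWords(text) {
  const words = text.split(/\s+/).filter(Boolean);
  debug('counted %d words', words.length);
  return words.length;
}

countWords('hello from spam scanner');
```

Run with `NODE_DEBUG=spamscanner node app.js`, the `debug` call prints a line prefixed with the section name and process ID; without the variable it prints nothing.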
 ## Contributors
 | Name             | Website                    |
@@ -511,6 +581,7 @@ Note that in [Forward Email][forward-email] we use the `client` approach as we h
 ## References
+* [CC-CEDICT](https://cc-cedict.org/) is licensed under Creative Commons Attribution-ShareAlike 4.0 International License.
 * <https://www.digitalocean.com/community/tutorials/how-to-setup-exim-spamassassin-clamd-and-dovecot-on-an-arch-linux-vps>
 * <https://medium.com/@wingsuitist/set-up-clamav-for-osx-1-the-open-source-virus-scanner-82a927b60fa3>
 * <http://redgreenrepeat.com/2019/08/09/setting-up-clamav-on-macos/>