profanalyzer 0.1.0
- data/History.txt +11 -0
- data/Manifest.txt +7 -0
- data/README.txt +97 -0
- data/Rakefile +13 -0
- data/config/list.yml +1586 -0
- data/lib/profanalyzer.rb +212 -0
- data/test/test_profanalyzer.rb +92 -0
- metadata +72 -0
data/lib/profanalyzer.rb
ADDED
@@ -0,0 +1,212 @@
require 'yaml'
# = profanalyzer
#
# * FIX (url)
#
# == DESCRIPTION:
#
# Profanalyzer has one purpose: analyze a block of text for profanity. It is
# able to filter profane words as well.
#
# What sets it slightly apart from other filters is that it classifies each
# blocked word as "profane", "racist", or "sexual" - although right now, each
# word is considered "profane". It also rates each word on a scale from 0-5,
# which is based on my subjective opinion, as well as whether the word is
# commonly used in non-profane situations, such as "ass" in "assess".
#
# The Profanalyzer will default to a tolerance of 2, which will kick back
# the arguably non-profane words. It will also test against all words,
# including racist or sexual words.
#
# Lastly, it allows for custom substitutions! For example, the filter at the
# website http://www.fark.com/ turns the word "fuck" into "fark", and "shit"
# into "shiat". You can specify these if you want.
#
# == FEATURES/PROBLEMS:
#
# * Tolerance-based filtering
# * Switch between checking all words, racist terms, sexual words, or some
#   mixture
# * Custom substitutions
# * Boolean-based profanity checking (skipping the filtering)
#
# == SYNOPSIS:
#
# Out of the box, you can simply use Profanalyzer.filter and
# Profanalyzer.profane?:
#
#   require 'rubygems'
#   require 'profanalyzer'
#
#   Profanalyzer.profane? "asshole"  #==> true
#   Profanalyzer.filter "asshole"    #==> "#!$%@&!"
#
# Then you can change the tolerance:
#
#   Profanalyzer.tolerance = 5
#   Profanalyzer.profane? "hooker"   #==> false
#
# Or do specific checking:
#
#   Profanalyzer.check_all = false    # turn off catch-all checking
#   Profanalyzer.check_racist = false # don't check racial slurs
#   Profanalyzer.check_sexual = true  # sexual checking on
#
#   Profanalyzer.profane? "mick"      #==> false
#   Profanalyzer.profane? "vagina"    #==> true
#
# Lastly, you can add custom substitutions:
#
#   Profanalyzer.substitute("shit","shiat")
#   Profanalyzer.filter "shit"        #==> "shiat"
#
#   Profanalyzer.substitute(:fuck => :fark)
#   Profanalyzer.filter("fuck")       #==> "fark"
#
# == REQUIREMENTS:
#
# * hoe - a gem for building gems, which I used for profanalyzer.
#
# == INSTALL:
#
#   sudo gem install profanalyzer
#
# == LICENSE:
#
# (The MIT License)
#
# Copyright (c) 2009 FIX
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# 'Software'), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
class Profanalyzer

  VERSION = "0.1.0"

  @@full_list   = YAML::load_file(File.dirname(__FILE__) + "/../config/list.yml")
  @@racist_list = @@full_list.select { |w| w[:racist] }
  @@sexual_list = @@full_list.select { |w| w[:sexual] }

  @@settings = { :racism => :forbidden, :sexual => :forbidden,
                 :profane => :forbidden, :tolerance => 4, :custom_subs => {} }

  def self.forbidden_words_from_settings # :nodoc:
    banned_words = []

    @@full_list.each do |word|
      banned_words << word[:word] if @@settings[:tolerance] <= word[:badness]
    end if @@settings[:profane] == :forbidden

    # The full list already includes racist and sexual words, so we can
    # save some processing.
    return banned_words if @@settings[:profane] == :forbidden

    @@racist_list.each do |word|
      banned_words << word[:word] if @@settings[:tolerance] <= word[:badness]
    end if @@settings[:racism] == :forbidden

    @@sexual_list.each do |word|
      banned_words << word[:word] if @@settings[:tolerance] <= word[:badness]
    end if @@settings[:sexual] == :forbidden
    banned_words
  end

  # Decides whether the given string is profane, given Profanalyzer's current
  # settings. Examples:
  #   Profanalyzer.profane?("you're an asshole") #==> true
  #
  # With custom settings:
  #   Profanalyzer.check_all = false
  #   Profanalyzer.check_racist = false
  #   Profanalyzer.profane?("you're a mick") #==> false
  #
  def self.profane?(str)
    banned_words = Profanalyzer.forbidden_words_from_settings
    banned_words.each do |word|
      return true if str =~ /#{word}/
    end
    false
  end

  # Filters the provided string using the currently set rules, with #!@$%-like
  # characters substituted in.
  #
  # Example:
  #   Profanalyzer.filter("shit") #==> "#!$%"
  #
  # With custom substitutions:
  #   Profanalyzer.substitute("shit","shiat")
  #   Profanalyzer.filter("shit") #==> "shiat"
  #   Profanalyzer.filter("damn") #==> "#!$%"
  #
  def self.filter(str)
    retstr = str.dup # work on a copy so the caller's string isn't mutated

    @@settings[:custom_subs].each do |k, v|
      retstr.gsub!(/#{k.to_s}/, v.to_s)
    end

    banned_words = Profanalyzer.forbidden_words_from_settings
    banned_words.each do |word|
      retstr.gsub!(/#{word}/,
        "#!$%@&!$%@%@&!$#!$%@&!$%@%@&!#!$%@&!$%@%@&!"[0..(word.length - 1)])
    end
    retstr
  end

  # Sets Profanalyzer's tolerance. Value should be an integer such that
  # 0 <= T <= 5.
  def self.tolerance=(new_tol)
    @@settings[:tolerance] = new_tol
  end

  # Sets Profanalyzer to scan (or not scan) for racist words, based on
  # the set tolerance.
  # This is set to +true+ by default.
  def self.check_racist=(check)
    @@settings[:racism] = check ? :forbidden : :ignore
  end

  # Sets Profanalyzer to scan (or not scan) for sexual words, based on
  # the set tolerance.
  # This is set to +true+ by default.
  def self.check_sexual=(check)
    @@settings[:sexual] = check ? :forbidden : :ignore
  end

  # Sets Profanalyzer to scan (or not scan) for all profane words, based on
  # the set tolerance.
  # This is set to +true+ by default.
  def self.check_all=(check)
    @@settings[:profane] = check ? :forbidden : :ignore
  end

  # Sets the list of substitutions to the hash passed in. Substitutions are
  # performed such that +Profanalyzer.filter(key) = value+.
  def self.substitutions=(hash)
    @@settings[:custom_subs] = hash
  end

  # Sets a custom substitution for the filter.
  # Can be passed as +substitute("foo","bar")+ or +substitute("foo" => "bar")+.
  def self.substitute(*args)
    case args[0]
    when String
      @@settings[:custom_subs].merge!(args[0] => args[1])
    when Hash
      @@settings[:custom_subs].merge!(args[0])
    end
  end
end
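The grawlix replacement in Profanalyzer.filter works by slicing a fixed "#!$%@&..." pattern to the banned word's length. A minimal, self-contained sketch of that mechanism (the GRAWLIX constant and the censor helper are illustrative names, not part of the gem, and the word list here is supplied inline rather than loaded from config/list.yml):

```ruby
# Sketch of the grawlix substitution used by Profanalyzer.filter:
# each banned word is replaced by a slice of a repeating "#!$%@&"
# pattern equal in length to the word itself.
GRAWLIX = "#!$%@&" * 8

def censor(str, banned_words)
  out = str.dup # avoid mutating the caller's string
  banned_words.each do |word|
    out.gsub!(/#{Regexp.escape(word)}/, GRAWLIX[0, word.length])
  end
  out
end

puts censor("what the heck", ["heck"]) # => "what the #!$%"
```

Note the Regexp.escape call: the gem interpolates list words into regexes directly, which is safe for plain alphabetic words but worth escaping in a general-purpose helper.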
data/test/test_profanalyzer.rb
ADDED
@@ -0,0 +1,92 @@
require "test/unit"
require "profanalyzer"

class TestProfanalyzer < Test::Unit::TestCase

  def test_single_word
    Profanalyzer.tolerance = 0
    Profanalyzer.check_all = true
    assert_equal(true, Profanalyzer.profane?("asshole"))
  end

  def test_single_racist_word
    Profanalyzer.tolerance = 0
    Profanalyzer.check_all = false
    Profanalyzer.check_sexual = false
    Profanalyzer.check_racist = true
    assert_equal(true, Profanalyzer.profane?("spic"))
  end

  def test_single_sexual_word
    Profanalyzer.tolerance = 0
    Profanalyzer.check_all = false
    Profanalyzer.check_racist = false
    Profanalyzer.check_sexual = true
    assert_equal(true, Profanalyzer.profane?("vagina"))
  end

  def test_tolerance
    Profanalyzer.tolerance = 4
    Profanalyzer.check_all = true
    assert_equal(false, Profanalyzer.profane?("asskisser")) # badness = 3
    assert_equal(true,  Profanalyzer.profane?("fuck"))      # badness = 5
  end

  def test_sexual_tolerance
    Profanalyzer.tolerance = 4
    Profanalyzer.check_all = false
    Profanalyzer.check_racist = false
    Profanalyzer.check_sexual = true
    assert_equal(false, Profanalyzer.profane?("vagina")) # badness = 3
    assert_equal(true,  Profanalyzer.profane?("cunt"))   # badness = 5
  end

  def test_racist_tolerance
    Profanalyzer.tolerance = 4
    Profanalyzer.check_all = false
    Profanalyzer.check_sexual = false
    Profanalyzer.check_racist = true
    assert_equal(false, Profanalyzer.profane?("mick"))   # badness = 3
    assert_equal(true,  Profanalyzer.profane?("nigger")) # badness = 5
  end

  def test_filter
    Profanalyzer.tolerance = 0
    Profanalyzer.check_all = true
    original_string = "You're a cocksucking piece of shit, you mick."
    filtered_string = "You're a #!$%@&!$%@% piece of #!$%, you #!$%."
    assert_equal(filtered_string, Profanalyzer.filter(original_string))
  end

  def test_sexual_filter
    Profanalyzer.tolerance = 0
    Profanalyzer.check_all = false
    Profanalyzer.check_sexual = true
    Profanalyzer.check_racist = false
    original_string = "You're a cocksucking piece of shit, you mick."
    filtered_string = "You're a #!$%@&!$%@% piece of shit, you mick."
    assert_equal(filtered_string, Profanalyzer.filter(original_string))
  end

  def test_racist_filter
    Profanalyzer.tolerance = 0
    Profanalyzer.check_all = false
    Profanalyzer.check_sexual = false
    Profanalyzer.check_racist = true
    original_string = "You're a cocksucking piece of shit, you mick."
    filtered_string = "You're a cocksucking piece of shit, you #!$%."
    assert_equal(filtered_string, Profanalyzer.filter(original_string))
  end

  def test_substitutions
    Profanalyzer.substitute("shit", "shiat")
    assert_equal("shiat", Profanalyzer.filter("shit"))

    Profanalyzer.substitute("damn" => "darn")
    assert_equal("darn", Profanalyzer.filter("damn"))

    Profanalyzer.substitute(:fuck => :fark)
    assert_equal("fark", Profanalyzer.filter("fuck"))
  end

end
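The tolerance comparisons these tests exercise reduce to one rule: a word is forbidden when the current tolerance is at or below its badness rating. A sketch of that selection, using hypothetical entries that mirror the {:word, :badness, :racist, :sexual} hash shape lib/profanalyzer.rb reads from config/list.yml (the real list and its ratings are not reproduced here):

```ruby
# Hypothetical list entries in the shape loaded from config/list.yml.
LIST = [
  { :word => "mild",  :badness => 2, :racist => false, :sexual => false },
  { :word => "harsh", :badness => 5, :racist => false, :sexual => false },
]

# Mirrors the comparison in forbidden_words_from_settings:
# a word is banned when tolerance <= badness.
def banned_words(list, tolerance)
  list.select { |w| tolerance <= w[:badness] }.map { |w| w[:word] }
end

p banned_words(LIST, 4) # => ["harsh"]
p banned_words(LIST, 0) # => ["mild", "harsh"]
```

This is why the tests set tolerance to 0 to catch everything, and to 4 to let badness-3 words such as "mick" pass while still flagging badness-5 words.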
metadata
ADDED
@@ -0,0 +1,72 @@
--- !ruby/object:Gem::Specification
name: profanalyzer
version: !ruby/object:Gem::Version
  version: 0.1.0
platform: ruby
authors:
- Michael J. Edgar
autorequire:
bindir: bin
cert_chain: []

date: 2009-03-22 00:00:00 -04:00
default_executable:
dependencies:
- !ruby/object:Gem::Dependency
  name: hoe
  type: :development
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 1.11.0
    version:
description: "Profanalyzer has one purpose: analyze a block of text for profanity. It is able to filter profane words as well. What sets it slightly apart from other filters is that it classifies each blocked word as \"profane\", \"racist\", or \"sexual\" - although right now, each word is considered \"profane\". It also rates each word on a scale from 0-5, which is based on my subjective opinion, as well as whether the word is commonly used in non-profane situations, such as \"ass\" in \"assess\". The Profanalyzer will default to a tolerance of 2, which will kick back the arguably non-profane words. It will also test against all words, including racist or sexual words. Lastly, it allows for custom substitutions! For example, the filter at the website http://www.fark.com/ turns the word \"fuck\" into \"fark\", and \"shit\" into \"shiat\". You can specify these if you want."
email:
- edgar@triqweb.com
executables: []

extensions: []

extra_rdoc_files:
- History.txt
- Manifest.txt
- README.txt
files:
- History.txt
- Manifest.txt
- README.txt
- Rakefile
- config/list.yml
- lib/profanalyzer.rb
- test/test_profanalyzer.rb
has_rdoc: true
homepage: FIX (url)
post_install_message:
rdoc_options:
- --main
- README.txt
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
  version:
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
  version:
requirements: []

rubyforge_project: profanalyzer
rubygems_version: 1.3.1
signing_key:
specification_version: 2
summary: "Profanalyzer has one purpose: analyze a block of text for profanity"
test_files:
- test/test_profanalyzer.rb