disavow_tool 0.3.2 → 0.3.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/Gemfile.lock +1 -1
- data/README.md +117 -14
- data/lib/disavow_tool/command_options.rb +1 -1
- data/lib/disavow_tool/list.rb +6 -1
- data/lib/disavow_tool/version.rb +1 -1
- data/miscellaneous/images/communist.jpg +0 -0
- data/miscellaneous/images/killed1.png +0 -0
- data/miscellaneous/images/killed2.png +0 -0
- metadata +5 -2
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: e42386b52c15d028a6f2155d7491ce131066ae0654cb99e883e44b6a0c33431d
|
|
4
|
+
data.tar.gz: 2aa2641e601d0feb670c1a3184d6a3f3a124debaf017a6f1ecb3f9d7da407fba
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: afc4eb397a8a5796427db12fa5d1c0c5298288ee302dadcb2ce912266fcf1a32ae8848d9b6890025b3ed6d721f9ae31aeb32d40856d7aa392ecde86e15339f3d
|
|
7
|
+
data.tar.gz: c7bb6614eda7503176fce92636142950109cbe7436589eb4a141121f41be6b2c03487550c49a0ab948b7ac1574454a568b7413c5c07164007baf7f5d59c0038e
|
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
|
@@ -1,38 +1,141 @@
|
|
|
1
1
|
# DisavowTool
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
A simple tool to create a Disavow file to be uploaded to the [Google Disavow Tool](https://www.google.com/webmasters/tools/disavow-links-main). You can import your current disavow file and a whitelist of trusted URLs and domains, then analyse them against the new URLs you feed DisavowTool. Auditing and cleaning the backlinks to your site is essential for good SEO.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Tips and donations: Bitcoin SV address **1NN3eRawdJ2dPoKQEnHAY7ug3Q6TwbJ4c3**
|
|
6
6
|
|
|
7
7
|
## Installation
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Simply install the gem:
|
|
10
10
|
|
|
11
|
-
|
|
12
|
-
gem 'disavow_tool'
|
|
13
|
-
```
|
|
11
|
+
$ gem install disavow_tool
|
|
14
12
|
|
|
15
|
-
|
|
13
|
+
You'll have the new command available in your system: `disavow_tool`
|
|
16
14
|
|
|
17
|
-
|
|
15
|
+
## Usage
|
|
18
16
|
|
|
19
|
-
|
|
17
|
+
From the output of `disavow_tool --help`:
|
|
20
18
|
|
|
21
|
-
|
|
19
|
+
Usage: disavow_tool [options] --disavow file_1,file_2,file_3\
|
|
20
|
+
--import file_1,file_2,file_3\
|
|
21
|
+
[--whitelist file1,file2,file3]
|
|
22
22
|
|
|
23
|
-
|
|
23
|
+
You simply feed DisavowTool the disavow file you currently have and all the new links you haven't analysed yet. You can pass multiple disavow files and multiple import files. Optionally, you can also pass a whitelist so that DisavowTool can do the heavy lifting.
|
|
24
|
+
|
|
25
|
+
DisavowTool will try to remove from the imported URLs all known URLs provided in the Disavow and Whitelist files, as well as all URLs whose domain appears in either of them.
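
The cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the gem's actual code: `host_of` and `pending_urls` are hypothetical helpers, and the sketch assumes a domain match means an exact host match.

```ruby
require "set"
require "uri"

# Hypothetical helper, not part of the gem: returns the host of a URL,
# or nil when the line cannot be parsed as a URL.
def host_of(url)
  URI.parse(url).host
rescue URI::InvalidURIError
  nil
end

# Drops every imported URL that is already known, either as an exact URL
# or because its host appears among the known domains.
def pending_urls(imported, known_urls, known_domains)
  imported.reject do |url|
    known_urls.include?(url) || known_domains.include?(host_of(url))
  end
end

known_urls    = Set["https://viagraforall.com/wordpress_comments"]
known_domains = Set["scammysite.com"]
imported = [
  "https://scammysite.com/page-1",
  "https://viagraforall.com/wordpress_comments",
  "https://newsite.com/path"
]

# Only the URL with an unknown host and unknown address is left to review.
puts pending_urls(imported, known_urls, known_domains)
```

Keeping known URLs and known domains in sets makes each lookup cheap, which helps when the import files contain many thousands of links.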
|
|
26
|
+
|
|
27
|
+
Once the cleanup of the imported URLs is done you'll be asked URL by URL what DisavowTool should do:
|
|
28
|
+
|
|
29
|
+
Links pending to analysed 241
|
|
30
|
+
**************************************************************
|
|
31
|
+
*
|
|
32
|
+
* Analysing url: https://scammysyte.com/wordpress-comments
|
|
33
|
+
* Website title: Viagra for all
|
|
34
|
+
* URls with this same domain: 180
|
|
35
|
+
*
|
|
36
|
+
***************************************************************
|
|
37
|
+
[W] Whitelist as domain [w] Whitelist URL [a] whitelist as url All urls with this domain
|
|
38
|
+
[D] Disavow as domain [d] Disavow as URL
|
|
39
|
+
[o] to open the URL [.] to exit
|
|
40
|
+
|
|
41
|
+
After the analysis is complete, DisavowTool will export a Disavow file ready to be uploaded to the [Google Disavow Tool](https://www.google.com/webmasters/tools/disavow-links-main). DisavowTool will also export a new whitelist. Ideally, you should feed this whitelist back in the next time you run your analysis for the same domain.
|
|
42
|
+
|
|
43
|
+
## Disavow files format
|
|
44
|
+
The format will be the same as [defined by Google](https://support.google.com/webmasters/answer/2648487?hl=en)
|
|
45
|
+
# Comments
|
|
46
|
+
# Domains
|
|
47
|
+
domain:scammysite.com
|
|
48
|
+
domain:scammysite2.com
|
|
49
|
+
# URLs
|
|
50
|
+
https://viagraforall.com/wordpress_comments
|
|
51
|
+
https://seo_scammers.com/wordpress_comments10
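
A file in this format can be read with a few lines of Ruby. The sketch below is only an illustration of the format, not the gem's own parser: `parse_disavow` is a hypothetical name, and it assumes every non-comment line is either a `domain:` entry or a plain URL.

```ruby
require "set"

# Hypothetical parser, not the gem's implementation: splits the lines of a
# disavow file into domain entries and URL entries, skipping comments.
def parse_disavow(text)
  domains = Set.new
  urls = Set.new
  text.each_line do |line|
    entry = line.strip
    next if entry.empty? || entry.start_with?("#")

    if entry.start_with?("domain:")
      domains << entry.delete_prefix("domain:")
    else
      urls << entry
    end
  end
  { domains: domains, urls: urls }
end

sample = <<~FILE
  # Domains
  domain:scammysite.com
  domain:scammysite2.com
  # URLs
  https://viagraforall.com/wordpress_comments
FILE

parsed = parse_disavow(sample)
puts parsed[:domains].to_a.inspect
puts parsed[:urls].to_a.inspect
```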
|
|
52
|
+
|
|
53
|
+
## Whitelist files format
|
|
54
|
+
The format will be the same as [defined by Google](https://support.google.com/webmasters/answer/2648487?hl=en)
|
|
55
|
+
# Comments
|
|
56
|
+
# Domains
|
|
57
|
+
domain:most_popular_site.com
|
|
58
|
+
domain:wow_this_site_linked_back_to_me.com
|
|
59
|
+
# URLs
|
|
60
|
+
https://maybe_a_good_site.com/the_best_product
|
|
61
|
+
https://should_we_trust_this_domain.com/product_review
|
|
62
|
+
|
|
63
|
+
## Import files format
|
|
64
|
+
Each line should contain a complete URL.
|
|
65
|
+
|
|
66
|
+
https://newsite.com/path
|
|
67
|
+
https://newsite2.com/path
|
|
68
|
+
https://newsite3.com/path
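
Before running an import it can help to check that every line really is a complete URL. The following is a hedged sketch, not something the gem is known to do: `complete_url?` is a hypothetical helper that assumes "complete" means an `http` or `https` scheme plus a host.

```ruby
require "uri"

# Hypothetical check, not taken from the gem: a line counts as a complete URL
# when it parses as http or https and has a host.
def complete_url?(line)
  uri = URI.parse(line.strip)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

lines = ["https://newsite.com/path", "newsite2.com/path", "not a url"]
lines.each do |line|
  puts "#{line}: #{complete_url?(line) ? 'ok' : 'skipped'}"
end
```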
|
|
69
|
+
|
|
70
|
+
|
|
71
|
+
## FAQ
|
|
72
|
+
|
|
73
|
+
### Why do I need to keep track of the toxic domains?
|
|
74
|
+
Well... Google lies https://www.youtube.com/watch?v=HWJUU-g5U_I
|
|
75
|
+
|
|
76
|
+

|
|
77
|
+
|
|
78
|
+
The reality is your site will get killed with a negative SEO attack, period.
|
|
79
|
+
|
|
80
|
+
Just check one of our sites that received an attack:
|
|
81
|
+
|
|
82
|
+

|
|
83
|
+
|
|
84
|
+
Google is slow. Your site's SEO won't be affected instantly if you are the victim of this type of attack, but you will have a colony of termites eating you from the inside. One day, out of nowhere, the wood will crack, and if your revenue depends heavily on SEO you may be very close to bankruptcy, if not completely broke. You had better integrate a backlink clean-up into your SEO routine as soon as possible.
|
|
85
|
+
|
|
86
|
+
In essence, you need to do all the work that Google, in our opinion, should be doing.
|
|
87
|
+
|
|
88
|
+
### Why not just use _SEMrush back link audit_ or _Ahrefs_?
|
|
89
|
+
|
|
90
|
+
Good question.
|
|
91
|
+
|
|
92
|
+
Ahrefs doesn't let you import links other than the ones their software finds, so you are out of luck if you have a better source of new backlinks.
|
|
93
|
+
|
|
94
|
+
SEMrush is better in that regard, as they let you import new backlinks and connect your account to other services such as Google Search Console or Majestic. The bad news is that SEMrush checks the newly imported backlinks with **curl**, identifying itself as a bot, and if it gets an HTTP error it discards the site, so you can't add that domain or URL to your disavow file from their web application. Malicious sites take advantage of this by returning HTTP 500 or 403 when they detect a bot. As an example, this clearly malicious or infected site http://carreviewauto.com/statti/kak-uberech-avto-ot-ugona-4-zolotyx-pravila.html returns 403 to the SEMrush routine. SEMrush's curl command:
|
|
95
|
+
|
|
96
|
+
curl -i -sS -L --proto-redir -all,http,https --max-time 5 -A 'Mozilla/5.0 (compatible; SemrushBot-SA/0.97;+http://www.semrush.com/bot.html)' -H 'Accept-Encoding: gzip, deflate' -H'Accept: */*' --compressed http://carreviewauto.com/statti/kak-uberech-avto-ot-ugona-4-zolotyx-pravila.html
|
|
97
|
+
|
|
98
|
+
Response headers:
|
|
99
|
+
|
|
100
|
+
HTTP/1.1 403 Forbidden
|
|
101
|
+
Server: nginx
|
|
102
|
+
Date: Sun, 04 Aug 2019 13:23:04 GMT
|
|
103
|
+
Content-Type: text/html; charset=iso-8859-1
|
|
104
|
+
Content-Length: 334
|
|
105
|
+
Connection: keep-alive
|
|
106
|
+
Keep-Alive: timeout=60
|
|
107
|
+
|
|
108
|
+
SEMRush's customer service has declined to change this.
|
|
109
|
+
|
|
110
|
+
## Our workflow
|
|
111
|
+
1. We keep using SEMRush backlink audit tool as it's a bit more polished. Once a month we run their tool and download the generated disavow file _(disavow_semrush.cvs)_ and whitelist _(whitelist.cvs)_.
|
|
112
|
+
|
|
113
|
+
2. We also download the current disavow file uploaded to [Google Disavow Tool](https://www.google.com/webmasters/tools/disavow-links-main).
|
|
114
|
+
|
|
115
|
+
3. We download the most sampled _(GSC_most_sampled.cvs)_ and latest links _(GSC_latest.cvs)_ found by [Google Search Console](https://search.google.com/search-console).
|
|
116
|
+
|
|
117
|
+
4. We download the new backlinks found by [ahrefs](https://ahrefs.com) _(new_ahrefs.cvs)_
|
|
118
|
+
|
|
119
|
+
5. We download the new backlinks found by [Majestic](https://majestic.com) _(new_majestic.cvs)_
|
|
120
|
+
|
|
121
|
+
6. We take the whitelist generated by DisavowTool in our last session _(whitelist_last_disavow_tool_session.csv)_.
|
|
122
|
+
|
|
123
|
+
7. We run the following command:
|
|
124
|
+
disavow_tool -V --disavow disavow_semrush.cvs,whitelist.cvs\
|
|
125
|
+
--whitelist whitelist.cvs,whitelist_last_disavow_tool_session.csv\
|
|
126
|
+
--import GSC_most_sampled.cvs,GSC_latest.cvs,new_ahrefs.cvs,new_majestic.cvs
|
|
24
127
|
|
|
25
|
-
|
|
128
|
+
8. Upload the generated disavow file to [Google Disavow Tool](https://www.google.com/webmasters/tools/disavow-links-main) and save the generated whitelist file for the next session.
|
|
26
129
|
|
|
27
130
|
## Development
|
|
28
131
|
|
|
29
|
-
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment
|
|
132
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment
|
|
30
133
|
|
|
31
134
|
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
|
|
32
135
|
|
|
33
136
|
## Contributing
|
|
34
137
|
|
|
35
|
-
Bug reports and pull requests are welcome on GitHub at https://github.com/
|
|
138
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/maesitos/disavow_tool.
|
|
36
139
|
|
|
37
140
|
## License
|
|
38
141
|
|
data/lib/disavow_tool/command_options.rb
CHANGED
|
@@ -16,7 +16,7 @@ module DisavowTool
|
|
|
16
16
|
options.network_requests = true
|
|
17
17
|
|
|
18
18
|
opt_parser = OptionParser.new do |opts|
|
|
19
|
-
opts.banner = "Usage:
|
|
19
|
+
opts.banner = "Usage: disavow_tool [options] --disavow file_1,file_2,file_3 --import file_1,file_2,file_3 [--whitelist file1,file2,file3]"
|
|
20
20
|
opts.separator ""
|
|
21
21
|
opts.separator "Required options:"
|
|
22
22
|
opts.on("-d","--disavow file_1,file_2", Array, "Disavow files as exported from Google Search Console") do |file|
|
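
The hunk above relies on `OptionParser` turning a comma-separated argument into an array when the `Array` type is given. A minimal stand-alone sketch of that behaviour follows. Only `-d`/`--disavow` appears in the diff; the long names `--import` and `--whitelist` are taken from the banner, while the short flags `-i` and `-w`, the descriptions, and the `parse_cli` helper are assumptions made for illustration.

```ruby
require "optparse"

# Hypothetical stand-alone parser; it mirrors the banner in the diff but is
# not the gem's command_options.rb.
def parse_cli(argv)
  options = { disavow: [], import: [], whitelist: [] }
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: disavow_tool [options] --disavow file_1,file_2 --import file_1,file_2 [--whitelist file1,file2]"
    # The Array type makes OptionParser split the value on commas.
    opts.on("-d", "--disavow file_1,file_2", Array, "Disavow files") do |files|
      options[:disavow] = files
    end
    # Short flags -i and -w are assumed, not confirmed by the diff.
    opts.on("-i", "--import file_1,file_2", Array, "Files with new links") do |files|
      options[:import] = files
    end
    opts.on("-w", "--whitelist file_1,file_2", Array, "Whitelist files") do |files|
      options[:whitelist] = files
    end
  end
  parser.parse(argv)
  options
end

puts parse_cli(["--disavow", "old_1.csv,old_2.csv", "--import", "new.csv"]).inspect
```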
data/lib/disavow_tool/list.rb
CHANGED
|
@@ -9,15 +9,17 @@ module DisavowTool
|
|
|
9
9
|
|
|
10
10
|
def initialize(import_files)
|
|
11
11
|
@list = Set.new
|
|
12
|
+
@list_files_imported = []
|
|
12
13
|
@verbose = OPTIONS.verbose
|
|
13
14
|
@verbose_hard = OPTIONS.hardcore_verbose
|
|
14
|
-
import import_files
|
|
15
|
+
import import_files
|
|
15
16
|
@original_list = @list.clone
|
|
16
17
|
end
|
|
17
18
|
|
|
18
19
|
def import(import_files)
|
|
19
20
|
import_files = [import_files] if import_files.class != Array
|
|
20
21
|
import_files.each do |file|
|
|
22
|
+
@list_files_imported << file
|
|
21
23
|
puts "Importing file: #{file}"
|
|
22
24
|
File.readlines(file).each do |line|
|
|
23
25
|
line.chomp!
|
|
@@ -89,6 +91,9 @@ module DisavowTool
|
|
|
89
91
|
def summary(list=nil, original_list=nil)
|
|
90
92
|
list = list || @list
|
|
91
93
|
original_list = original_list || @original_list
|
|
94
|
+
puts "Files imported:"
|
|
95
|
+
puts @list_files_imported
|
|
96
|
+
|
|
92
97
|
puts "#{message_sumary_imported} #{original_list.count}".blue
|
|
93
98
|
puts "#{mensaje_sumary_before_export} #{list.count}".blue
|
|
94
99
|
|
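
The change to `list.rb` boils down to recording each file name as it is imported and printing the names in the summary. The sketch below is a simplified stand-in for that idea, not the gem's class: `TrackedList` is a hypothetical name, and it reads content from strings instead of files so that it runs without fixtures.

```ruby
require "set"

# Simplified stand-in for the class in the diff. The sources hash maps a
# label (standing in for a file name) to that file's raw content.
class TrackedList
  attr_reader :files_imported

  def initialize(sources)
    @list = Set.new
    # Must be set up before import runs, because import appends to it.
    @files_imported = []
    import(sources)
  end

  def import(sources)
    sources.each do |name, content|
      @files_imported << name
      content.each_line do |line|
        @list << line.chomp unless line.strip.empty?
      end
    end
  end

  # Lists the imported sources first, then the number of unique entries.
  def summary
    ["Files imported:", *@files_imported, "Entries: #{@list.count}"].join("\n")
  end
end

list = TrackedList.new({ "first.csv" => "x\ny\n", "second.csv" => "y\nz\n" })
puts list.summary
```

As in the diff, the tracking array is initialised before `import` is called from the constructor; otherwise the first append would fail on `nil`.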
data/lib/disavow_tool/version.rb
CHANGED
|
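data/miscellaneous/images/communist.jpg
Binary file
|
data/miscellaneous/images/killed1.png
Binary file
|
data/miscellaneous/images/killed2.png
Binary file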
|
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: disavow_tool
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.3.
|
|
4
|
+
version: 0.3.3
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Maesitos
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: exe
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date: 2019-08-
|
|
11
|
+
date: 2019-08-04 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: bundler
|
|
@@ -124,6 +124,9 @@ files:
|
|
|
124
124
|
- lib/disavow_tool/list.rb
|
|
125
125
|
- lib/disavow_tool/version.rb
|
|
126
126
|
- lib/disavow_tool/white_list.rb
|
|
127
|
+
- miscellaneous/images/communist.jpg
|
|
128
|
+
- miscellaneous/images/killed1.png
|
|
129
|
+
- miscellaneous/images/killed2.png
|
|
127
130
|
- samples/disavowed.cvs
|
|
128
131
|
- samples/disavowed2.cvs
|
|
129
132
|
- samples/new_links.csv
|