rowr 0.2.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 4fea7df6d4e1adb7ebe5a92dc53bdbc19002171d
4
+ data.tar.gz: e29245ead1eedb69de5b92626cf3e4759bf03a74
5
+ SHA512:
6
+ metadata.gz: 064b7ec66b949cec8b6d9fc323ac8268b3a4c2736732ae916afb49559e53200879f25b59bbbb797555397acc23f2ea3875245d1283524fae7bcb6d358ac1e097
7
+ data.tar.gz: 0d535ab043e508d36b9238f33cb9b41cbf7c8f564bdda59c52a10afb62147c661e53897f9dc92eb5d599c6e66262c6ace2ddf690f5c0f49b62e50d388eaf4b29
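The checksums above can be compared against a downloaded archive. The following is a minimal sketch, assuming the file is available locally; `digest_matches?` and the `DIGESTS` table are hypothetical names used for illustration, not part of the gem or of RubyGems.

```ruby
require 'digest/sha1'
require 'digest/sha2'
require 'tempfile'

# Maps the algorithm names used in checksums.yaml to stdlib digest classes.
DIGESTS = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }.freeze

# Hypothetical helper: true when the file's hex digest equals the expected one.
def digest_matches?(path, algorithm, expected_hex)
  klass = DIGESTS.fetch(algorithm)
  klass.file(path).hexdigest == expected_hex.strip.downcase
end

# Usage with a throwaway file standing in for data.tar.gz.
Tempfile.create('data') do |f|
  f.write('example payload')
  f.flush
  expected = Digest::SHA512.hexdigest('example payload')
  puts digest_matches?(f.path, 'SHA512', expected)
end
```

For a real check, the expected value would be the `data.tar.gz` entry listed under the matching algorithm.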
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at lukeaeschleman@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/LICENSE ADDED
@@ -0,0 +1,201 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship, whether in Source or
36
+ Object form, made available under the License, as indicated by a
37
+ copyright notice that is included in or attached to the work
38
+ (an example is provided in the Appendix below).
39
+
40
+ "Derivative Works" shall mean any work, whether in Source or Object
41
+ form, that is based on (or derived from) the Work and for which the
42
+ editorial revisions, annotations, elaborations, or other modifications
43
+ represent, as a whole, an original work of authorship. For the purposes
44
+ of this License, Derivative Works shall not include works that remain
45
+ separable from, or merely link (or bind by name) to the interfaces of,
46
+ the Work and Derivative Works thereof.
47
+
48
+ "Contribution" shall mean any work of authorship, including
49
+ the original version of the Work and any modifications or additions
50
+ to that Work or Derivative Works thereof, that is intentionally
51
+ submitted to Licensor for inclusion in the Work by the copyright owner
52
+ or by an individual or Legal Entity authorized to submit on behalf of
53
+ the copyright owner. For the purposes of this definition, "submitted"
54
+ means any form of electronic, verbal, or written communication sent
55
+ to the Licensor or its representatives, including but not limited to
56
+ communication on electronic mailing lists, source code control systems,
57
+ and issue tracking systems that are managed by, or on behalf of, the
58
+ Licensor for the purpose of discussing and improving the Work, but
59
+ excluding communication that is conspicuously marked or otherwise
60
+ designated in writing by the copyright owner as "Not a Contribution."
61
+
62
+ "Contributor" shall mean Licensor and any individual or Legal Entity
63
+ on behalf of whom a Contribution has been received by Licensor and
64
+ subsequently incorporated within the Work.
65
+
66
+ 2. Grant of Copyright License. Subject to the terms and conditions of
67
+ this License, each Contributor hereby grants to You a perpetual,
68
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69
+ copyright license to reproduce, prepare Derivative Works of,
70
+ publicly display, publicly perform, sublicense, and distribute the
71
+ Work and such Derivative Works in Source or Object form.
72
+
73
+ 3. Grant of Patent License. Subject to the terms and conditions of
74
+ this License, each Contributor hereby grants to You a perpetual,
75
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76
+ (except as stated in this section) patent license to make, have made,
77
+ use, offer to sell, sell, import, and otherwise transfer the Work,
78
+ where such license applies only to those patent claims licensable
79
+ by such Contributor that are necessarily infringed by their
80
+ Contribution(s) alone or by combination of their Contribution(s)
81
+ with the Work to which such Contribution(s) was submitted. If You
82
+ institute patent litigation against any entity (including a
83
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
84
+ or a Contribution incorporated within the Work constitutes direct
85
+ or contributory patent infringement, then any patent licenses
86
+ granted to You under this License for that Work shall terminate
87
+ as of the date such litigation is filed.
88
+
89
+ 4. Redistribution. You may reproduce and distribute copies of the
90
+ Work or Derivative Works thereof in any medium, with or without
91
+ modifications, and in Source or Object form, provided that You
92
+ meet the following conditions:
93
+
94
+ (a) You must give any other recipients of the Work or
95
+ Derivative Works a copy of this License; and
96
+
97
+ (b) You must cause any modified files to carry prominent notices
98
+ stating that You changed the files; and
99
+
100
+ (c) You must retain, in the Source form of any Derivative Works
101
+ that You distribute, all copyright, patent, trademark, and
102
+ attribution notices from the Source form of the Work,
103
+ excluding those notices that do not pertain to any part of
104
+ the Derivative Works; and
105
+
106
+ (d) If the Work includes a "NOTICE" text file as part of its
107
+ distribution, then any Derivative Works that You distribute must
108
+ include a readable copy of the attribution notices contained
109
+ within such NOTICE file, excluding those notices that do not
110
+ pertain to any part of the Derivative Works, in at least one
111
+ of the following places: within a NOTICE text file distributed
112
+ as part of the Derivative Works; within the Source form or
113
+ documentation, if provided along with the Derivative Works; or,
114
+ within a display generated by the Derivative Works, if and
115
+ wherever such third-party notices normally appear. The contents
116
+ of the NOTICE file are for informational purposes only and
117
+ do not modify the License. You may add Your own attribution
118
+ notices within Derivative Works that You distribute, alongside
119
+ or as an addendum to the NOTICE text from the Work, provided
120
+ that such additional attribution notices cannot be construed
121
+ as modifying the License.
122
+
123
+ You may add Your own copyright statement to Your modifications and
124
+ may provide additional or different license terms and conditions
125
+ for use, reproduction, or distribution of Your modifications, or
126
+ for any such Derivative Works as a whole, provided Your use,
127
+ reproduction, and distribution of the Work otherwise complies with
128
+ the conditions stated in this License.
129
+
130
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
131
+ any Contribution intentionally submitted for inclusion in the Work
132
+ by You to the Licensor shall be under the terms and conditions of
133
+ this License, without any additional terms or conditions.
134
+ Notwithstanding the above, nothing herein shall supersede or modify
135
+ the terms of any separate license agreement you may have executed
136
+ with Licensor regarding such Contributions.
137
+
138
+ 6. Trademarks. This License does not grant permission to use the trade
139
+ names, trademarks, service marks, or product names of the Licensor,
140
+ except as required for reasonable and customary use in describing the
141
+ origin of the Work and reproducing the content of the NOTICE file.
142
+
143
+ 7. Disclaimer of Warranty. Unless required by applicable law or
144
+ agreed to in writing, Licensor provides the Work (and each
145
+ Contributor provides its Contributions) on an "AS IS" BASIS,
146
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147
+ implied, including, without limitation, any warranties or conditions
148
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149
+ PARTICULAR PURPOSE. You are solely responsible for determining the
150
+ appropriateness of using or redistributing the Work and assume any
151
+ risks associated with Your exercise of permissions under this License.
152
+
153
+ 8. Limitation of Liability. In no event and under no legal theory,
154
+ whether in tort (including negligence), contract, or otherwise,
155
+ unless required by applicable law (such as deliberate and grossly
156
+ negligent acts) or agreed to in writing, shall any Contributor be
157
+ liable to You for damages, including any direct, indirect, special,
158
+ incidental, or consequential damages of any character arising as a
159
+ result of this License or out of the use or inability to use the
160
+ Work (including but not limited to damages for loss of goodwill,
161
+ work stoppage, computer failure or malfunction, or any and all
162
+ other commercial damages or losses), even if such Contributor
163
+ has been advised of the possibility of such damages.
164
+
165
+ 9. Accepting Warranty or Additional Liability. While redistributing
166
+ the Work or Derivative Works thereof, You may choose to offer,
167
+ and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "{}"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright {yyyy} {name of copyright owner}
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.
@@ -0,0 +1,32 @@
1
+ # ROWR: Really Old Website Resurrector
2
+
3
+ [![Build Status](https://travis-ci.org/UNC-Libraries/ROWR.svg?branch=master)](https://travis-ci.org/UNC-Libraries/ROWR)
4
+
5
+ It's basically a link find/replace tool for really old websites. ROWR will parse through your site files and look for any
6
+ broken links. When it finds one, it will prompt you to either replace, remove, or skip it.
7
+ ROWR takes a really old website, one that might be living on a CD or flash drive for archival purposes, and allows you to clean up
8
+ any broken links.
9
+
10
+ ## Installation
11
+
12
+ After installing ruby, add `gem 'rowr'` to your application's Gemfile or run the following from the command line:
13
+
14
+ $ gem install rowr
15
+
16
+ ## Usage
17
+
18
+ `rowr start` Starts the script and prompts you for information about the really old site.
19
+ While running, you can always prematurely stop the script with CMD+C or CTRL+C.
20
+
21
+ `rowr continue` Continue where you left off.
22
+
23
+ `rowr reset` Destroy all changes made and restart the process.
24
+
25
+ ## Contributing
26
+
27
+ Bug reports and pull requests are welcome on GitHub at https://github.com/UNC-Libraries/ROWR/issues.
28
+
29
+ ## License
30
+
31
+ The gem is available as open source under the terms of the [Apache License 2.0](http://www.apache.org/licenses/).
32
+
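The README describes scanning site files for broken links. A rough sketch of that idea follows, assuming a simplified attribute pattern and resolving every local link against the site root. `LINK_PATTERN`, `extract_links`, and `broken_local_links` are hypothetical names; the gem's own regex and resolution rules are more involved.

```ruby
require 'tmpdir'

# Simplified stand-in for the gem's link regex: quoted href/src/background values.
LINK_PATTERN = /(?:href|src|background)\s*=\s*["']([^"']+)["']/i

# Collects link targets, ignoring mailto: links and in-page anchors.
def extract_links(html)
  html.scan(LINK_PATTERN).flatten.map(&:strip)
      .reject { |link| link.start_with?('mailto:', '#') }
end

# Returns local links whose target file is missing under site_dir.
# Assumption: every local link is resolved against the site root.
def broken_local_links(html, site_dir)
  extract_links(html)
    .reject { |link| link =~ %r{\A(https?:|//)}i }
    .reject { |link| File.exist?(File.join(site_dir, link.sub(/#.*\z/, ''))) }
end

# Usage: one existing page, one missing page, one mailto link.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'about.html'), '<p>About</p>')
  html = '<a href="about.html">ok</a> <a href="gone.html">broken</a> ' \
         '<a href="mailto:me@example.com">mail</a>'
  puts broken_local_links(html, dir).inspect
end
```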
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'rowr/command_line'
4
+
5
+ result = Rowr::CommandLine.start( ARGV )
6
+
7
+ exit 1 unless result # exit with a non-zero status when the command reports a problem
@@ -0,0 +1,20 @@
1
+ require 'rowr/version'
2
+ require 'rowr/zipper'
3
+ require 'rowr/resurrector'
4
+ require 'rowr/prompter'
5
+ require 'rowr/printer'
6
+ require 'rowr/link_processor'
7
+ require 'rowr/state_saver'
8
+ require 'fileutils'
9
+ require 'faraday'
10
+ require 'slop'
11
+ require 'thor'
12
+ require 'tty-prompt'
13
+ require 'json'
14
+ require 'yaml'
15
+ require 'zip'
16
+ require 'zip/zip_file_generator'
17
+
18
+ module Rowr
19
+
20
+ end
@@ -0,0 +1,61 @@
1
+ require 'rowr'
2
+ require 'tty-prompt'
3
+
4
+ module Rowr
5
+
6
+ class CommandLine < Thor
7
+
8
+ desc 'start',
9
+ 'resurrect a really old website'
10
+
11
+ def start
12
+ rowr = Rowr::Resurrector.new
13
+ rowr.start
14
+ end
15
+
16
+ desc 'continue',
17
+ 'continue resurrecting'
18
+
19
+ def continue
20
+ rowr = Rowr::Resurrector.new
21
+ rowr.continue
22
+ end
23
+
24
+ desc 'reset',
25
+ 'restart the really old website resurrection'
26
+
27
+ def reset
28
+ rowr = Rowr::Resurrector.new
29
+ rowr.reset
30
+ end
31
+
32
+ desc 'test <file>',
33
+ 'test the resurrector on a single file'
34
+
35
+ def test(file)
36
+ return unless File.exist? File.expand_path(file)
37
+
38
+ f = File.expand_path(file)
39
+ rowr = Rowr::Resurrector.new
40
+
41
+ rowr.options.source_directory = File.dirname(f)
42
+ rowr.prompt_user_for_option 'old_host?'
43
+ rowr.prompt_user_for_option 'new_base_path?'
44
+ rowr.prompt_user_for_option 'check_external_urls?'
45
+
46
+ rowr.init_link_processor
47
+
48
+ rowr.link_processor.containing_file = f
49
+
50
+ text = File.read(f)
51
+ unless text.valid_encoding?
52
+ text = text.encode('UTF-16be', invalid: :replace, replace: '&nbsp;').encode('UTF-8')
53
+ end
54
+ text = rowr.clean_no_quotes(text)
55
+ rowr.check_urls(text)
56
+
57
+ end
58
+
59
+ end
60
+
61
+ end
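The `test` command repairs text with invalid byte sequences by round-tripping through UTF-16. A minimal sketch of that idiom follows, assuming `?` as the replacement instead of the `&nbsp;` used above; `repair_encoding` is a hypothetical helper name.

```ruby
# Hypothetical helper: replaces invalid UTF-8 byte sequences so the
# result can safely be matched against regular expressions.
def repair_encoding(text, replacement = '?')
  return text if text.valid_encoding?

  # Transcoding to UTF-16BE forces Ruby to visit every byte sequence;
  # invalid ones are swapped for the replacement, then we convert back.
  text.encode('UTF-16BE', invalid: :replace, replace: replacement).encode('UTF-8')
end

broken = "abc\xFFdef" # \xFF can never appear in valid UTF-8
puts repair_encoding(broken).valid_encoding?
```

On Ruby 2.1 and later, `String#scrub` offers a shorter way to get a comparable result without the intermediate encoding.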
@@ -0,0 +1,234 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class LinkProcessor
6
+
7
+ attr_reader :local_site_dir
8
+ attr_reader :old_domain
9
+ attr_accessor :new_base_path
10
+ attr_accessor :cached
11
+ attr_reader :link_to_check
12
+ attr_accessor :containing_file
13
+ attr_reader :target_file
14
+
15
+ def initialize(src_dir, old_domain = nil, new_base_path = nil, check_external_urls = true, cached = {})
16
+ @printer = Rowr::Printer.new
17
+ @prompt = TTY::Prompt.new(active_color: :cyan)
18
+ @pastel = Pastel.new
19
+ @local_site_dir = src_dir
20
+ @old_domain = old_domain
21
+ @new_base_path = new_base_path
22
+ @check_external_urls = check_external_urls
23
+ @cached = cached
24
+ end
25
+
26
+ ################################
27
+ # Attributes
28
+ ################################
29
+ def link_to_check=(value)
30
+ if external?(value)
31
+ @link_to_check = value
32
+ else
33
+ value.sub!(old_url_regex, '') if @old_domain
34
+ if value.start_with?('/')
35
+ @link_to_check = value.sub(%r{(^/)}, '')
36
+ @target_file = File.expand_path(File.join(@local_site_dir, @link_to_check))
37
+ else
38
+ @link_to_check = File.dirname(@containing_file).sub(@local_site_dir, '') + '/' + value
39
+ @link_to_check.sub!(%r{(^/)}, '')
40
+ @target_file = File.expand_path(File.join(File.dirname(@containing_file), value))
41
+ end
42
+ end
43
+ end
44
+
45
+ def old_url_regex
46
+ %r{^(https?://|//)#{@old_domain}}i if @old_domain
47
+ end
48
+
49
+ ################################
50
+ # Checkers
51
+ ################################
52
+
53
+ def external?(link)
54
+ !old_uri?(link) && uri?(link) ? true : false
55
+ end
56
+
57
+ def old_uri?(link)
58
+ if @old_domain
59
+ link =~ old_url_regex
60
+ else
61
+ false
62
+ end
63
+ end
64
+
65
+ def uri?(link)
66
+ link =~ %r{^(https?:|//)}i
67
+ end
68
+
69
+ def in_cache?
70
+ @cached.key?(link_key)
71
+ end
72
+
73
+ def response_code(link)
74
+ begin
75
+ res = Faraday.get link
76
+ return res.status
77
+ rescue
78
+ return 0
79
+ end
80
+ end
81
+
82
+ def trim_hash(file)
83
+ file.sub(/#(.*?)$/,'')
84
+ end
85
+
86
+ def target_file_exists?
87
+ File.exist?(trim_hash(@target_file))
88
+ end
89
+
90
+ def broken_external_link?
91
+ res = response_code(@link_to_check)
92
+ res > 399 || res < 200
93
+ end
94
+
95
+ def is_valid_replacement?(link)
96
+ if uri?(link)
97
+ res = response_code(link)
98
+ res > 199 && res < 400
99
+ else
100
+ File.exist?(File.join(@local_site_dir, link))
101
+ end
102
+ end
103
+
104
+ ################################
105
+ # Misc
106
+ ################################
107
+
108
+ def link_key
109
+ @link_to_check.to_sym
110
+ end
111
+
112
+ def add_to_cache(new_link)
113
+ @cached[link_key] = new_link
114
+ end
115
+
116
+ def recommend_files
117
+ Dir.glob("#{@local_site_dir}/**/{#{File.basename(@target_file)}}").map! do |f|
118
+ f.sub(@local_site_dir,'')
119
+ end
120
+ end
121
+
122
+ def prepend_new_base_path(link)
123
+ check = @new_base_path[1..-1].chop
124
+ new_link = link.sub(%r{^/?#{check}},'')
125
+ new_link = new_link.sub(/^\//,'')
126
+ @new_base_path + new_link
127
+ end
128
+
129
+ ################################
130
+ # Processors
131
+ ################################
132
+
133
+ def process_link
134
+ @new_base_path + @link_to_check if target_file_exists?
135
+ end
136
+
137
+ def process_broken_link
138
+ return cached[link_key] if in_cache?
139
+ replacement = nil
140
+ @printer.print_broken_link_warning @containing_file, @link_to_check
141
+ replacement = ask_recommended_files unless recommend_files.empty?
142
+ replacement = ask_wtd unless replacement
143
+ ask_to_cache(replacement)
144
+ replacement
145
+ end
146
+
147
+ def process_external
148
+ return nil unless @check_external_urls && broken_external_link?
149
+ @printer.print_broken_link_warning @containing_file, @link_to_check
150
+ replacement = ask_wtd
151
+ ask_to_cache(replacement)
152
+ replacement
153
+ end
154
+
155
+ def process(link, file = nil)
156
+ @containing_file = file if file
157
+ self.link_to_check = link
158
+
159
+ if external?(@link_to_check)
160
+ replacement = process_external
161
+ else
162
+ replacement = process_link
163
+ replacement = process_broken_link unless replacement
164
+ end
165
+ replacement
166
+ end
167
+
168
+ ################################
169
+ # Asks
170
+ ################################
171
+
172
+ def ask_recommended_files
173
+ @printer.print_line ' I found some matching files ', '+', :blue
174
+ recommended_files = recommend_files
175
+ choice = @prompt.select(
176
+ 'Would you like to replace the broken link with any of the following?',
177
+ recommended_files + ['None of these match'],
178
+ per_page: 10
179
+ )
180
+ choice == 'None of these match' ? nil : prepend_new_base_path(choice)
181
+ end
182
+
183
+ def ask_to_cache(new_link)
184
+ case new_link
185
+ when nil
186
+ message = "SKIP all instances of " + @pastel.green("#{@link_to_check}") + "?"
187
+ when '#'
188
+ message = "REMOVE all instances of " + @pastel.green("#{@link_to_check}") + "?"
189
+ else
190
+ message = "REPLACE all instances of " + @pastel.green("#{@link_to_check}") + " with " + @pastel.blue("#{new_link}") + "?"
191
+ end
192
+ add_to_cache(new_link) if @prompt.yes?(message)
193
+ end
194
+
195
+ def ask_wtd
196
+ @printer.line_break 0
197
+ wtd = @prompt.enum_select('What would you like to do?') do |menu|
198
+ menu.default 1
199
+
200
+ menu.choice 'Enter a new link', 1
201
+ menu.choice 'Remove the link', 2
202
+ menu.choice 'Skip', 3
203
+ end
204
+
205
+ case wtd
206
+ when 1
207
+ ask_new_link
208
+ when 2
209
+ '#'
210
+ when 3
211
+ nil
212
+ end
213
+ end
214
+
215
+ def ask_new_link
216
+ new_link = @prompt.ask('Enter the replacement:')
217
+ unless is_valid_replacement?(new_link)
218
+ if uri?(new_link)
219
+ @prompt.error("Sorry, the url you've provided is not returning a 200 status code")
220
+ else
221
+ @prompt.error('Sorry, that file does not exist')
222
+ end
223
+ new_link = ask_new_link
224
+ end
225
+
226
+ if uri?(new_link)
227
+ new_link
228
+ else
229
+ prepend_new_base_path(new_link)
230
+ end
231
+ end
232
+
233
+ end
234
+ end
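Two rules in `LinkProcessor` are easy to check in isolation: how a local link maps to a file on disk, and which HTTP status codes count as broken. The sketch below restates both as standalone functions, assuming POSIX-style paths; `resolve_target` and `broken_status?` are hypothetical names, not methods of the class.

```ruby
# Maps a local link to an absolute file path.
# Links starting with "/" are resolved against the site root;
# all others are resolved against the directory of the containing file.
def resolve_target(site_dir, containing_file, link)
  path = link.sub(/#.*\z/, '') # drop any fragment such as "#top"
  base = path.start_with?('/') ? site_dir : File.dirname(containing_file)
  File.expand_path(File.join(base, path))
end

# A status outside 200..399 is treated as broken; 0 stands in for a
# request that raised an error, mirroring response_code above.
def broken_status?(code)
  code < 200 || code > 399
end

puts resolve_target('/site', '/site/docs/page.html', '../img/logo.png#top')
puts broken_status?(404)
```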
@@ -0,0 +1,64 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Printer
6
+
7
+ def initialize(line_length = 50)
8
+ @pastel = Pastel.new
9
+ @line_length = line_length
10
+ end
11
+
12
+ def line(text = '', char = '~')
13
+ message = text.to_s
14
+ return message if too_long?(message)
15
+
16
+ waves = char * ((@line_length - message.length) / 2).ceil
17
+ output = "#{waves}#{message}#{waves}"
18
+ output + (char * (@line_length - output.length))
19
+ end
20
+
21
+ def too_long?(string)
22
+ string.length > @line_length
23
+ end
24
+
25
+ def line_break(duration)
26
+ puts "\n"
27
+ sleep(duration)
28
+ end
29
+
30
+ def print_line(message = nil, char = '~', color = 'green')
31
+ puts @pastel.send(color.to_sym, line(message, char))
32
+ end
33
+
34
+ def print_intro
35
+ print_line
36
+ print_line ' ROWR! '
37
+ print_line
38
+ line_break 1
39
+ end
40
+
41
+ def print_outro
42
+ print_line
43
+ print_line " You're all done! "
44
+ print_line
45
+ print_line ' rowr... '
46
+ print_line
47
+ end
48
+
49
+ def print_broken_link_warning(file, link)
50
+ line_break 0
51
+ print_line '', '!', 'yellow'
52
+ print_line ' Broken Link ', '!', 'yellow'
53
+ puts @pastel.magenta.bold('File: ', @pastel.red(file))
54
+ puts @pastel.magenta.bold('Link: ', @pastel.green(link))
55
+ end
56
+
57
+ def print_file_header(file)
58
+ print_line ' FILE ', '*', 'cyan'
59
+ print_line " #{file} ", '*', 'cyan'
60
+ print_line '', '*', 'cyan'
61
+ end
62
+
63
+ end
64
+ end
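The padding logic in `Printer#line` centers a message between fill characters. For a single-character fill, Ruby's built-in `String#center` should give the same layout; the sketch below is an illustration under that assumption, and `centered_line` is a hypothetical name.

```ruby
# Centers text within width using char as filler. Messages longer than
# the width are returned unchanged, as in Printer#line.
def centered_line(text = '', char = '~', width = 50)
  message = text.to_s
  return message if message.length > width

  # String#center places any odd leftover fill character on the right.
  message.center(width, char)
end

puts centered_line(' ROWR! ')
puts centered_line(' Broken Link ', '!')
```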
@@ -0,0 +1,113 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Prompter
6
+
7
+ attr_reader :source_directory
8
+ attr_reader :exts_to_use
9
+ attr_reader :old_host
10
+ attr_reader :new_base_path
11
+ attr_accessor :check_external_urls
12
+
13
+ def initialize
14
+ @prompt = TTY::Prompt.new
15
+ end
16
+
17
+ def source_directory=(value)
18
+ dir = File.expand_path(value)
19
+ @source_directory = dir if Dir.exist?(dir)
20
+ end
21
+
22
+ def old_host=(value)
23
+ if value.to_s.empty?
24
+ @old_host = false
25
+ else
26
+ @old_host = value.chomp('/').chomp('/').sub(%r{https?://}, '')
27
+ end
28
+ end
29
+
30
+ def exts_to_use=(value)
31
+ exts = []
32
+ if value
33
+ exts = value.split
34
+ exts.map! { |e| e.start_with?('.') ? e[1..-1] : e } if exts.is_a?(Array)
35
+ end
36
+ exts = %w(htm html) + exts
37
+ @exts_to_use = exts.uniq
38
+ end
39
+
40
+ def new_base_path=(value)
41
+ clean = ''
42
+ unless value.to_s.empty?
43
+ clean = value.sub(%r{https?://(.*?)(/|$)}, '')
44
+ clean = clean.split('/')
45
+ clean = clean.reject(&:empty?)
46
+ clean.shift if clean.first =~ /\./
47
+ clean = clean.join('/')
48
+ end
49
+ @new_base_path = clean.to_s.empty? ? '/' : "/#{clean}/"
50
+ end
51
+
52
+ def dir_select
53
+ @prompt.select('Where is this really old website?') do |menu|
54
+ menu.default 1
55
+
56
+ menu.choice "#{Dir.pwd} (The current dir)", 1
57
+ menu.choice "Another directory?", 2
58
+ end
59
+ end
60
+
61
+ def ask_for_other_source_directory
62
+ dir = @prompt.ask('Please type in the path to that directory:') do |q|
63
+ q.required true
64
+ end
65
+ unless Dir.exist?(File.expand_path(dir))
66
+ @prompt.error("Sorry, #{dir} doesn't seem to exist")
67
+ return ask_for_other_source_directory
68
+ end
69
+ dir
70
+ end
71
+
72
+ def old_host?
73
+ self.old_host = @prompt.ask('What was the old host? (e.g. www.google.com)')
74
+ end
75
+
76
+ def source_directory?
77
+ self.source_directory = case dir_select
78
+ when 1
79
+ Dir.pwd
80
+ when 2
81
+ ask_for_other_source_directory
82
+ end
83
+ end
84
+
85
+ def additional_exts?
86
+ @prompt.say('By default, I\'ll scan any .html and .htm files.')
87
+ self.exts_to_use = @prompt.ask('Please list any other extensions, or hit Enter to skip')
88
+ end
89
+
90
+ def new_base_path?
91
+ self.new_base_path = @prompt.ask('What will be the url of the resurrected site?')
92
+ end
93
+
94
+ def check_external_urls?
95
+ self.check_external_urls = @prompt.select('If I find a link to an external site, what should I do?') do |menu|
96
+ menu.default 1
97
+
98
+ menu.choice 'Ask me about it', true
99
+ menu.choice 'Skip it', false
100
+ end
101
+ end
102
+
103
+ def generate_hash
104
+ {
105
+ source_directory: source_directory,
106
+ exts_to_use: exts_to_use,
107
+ old_host: old_host,
108
+ new_base_path: new_base_path,
109
+ check_external_urls: check_external_urls
110
+ }
111
+ end
112
+ end
113
+ end
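The setters in `Prompter` normalize free-form answers before they are stored. The sketch below restates three of them as pure functions so the effect on sample input is easy to see; the function names are hypothetical, and the example hosts are made up.

```ruby
# Strips the scheme and a trailing slash from a host; blank input gives false.
def normalize_host(value)
  return false if value.to_s.empty?

  value.chomp('/').sub(%r{https?://}, '')
end

# Reduces a URL or path to "/segment/.../" form, dropping any host part.
def normalize_base_path(value)
  segments = value.to_s
                  .sub(%r{https?://(.*?)(/|$)}, '')
                  .split('/')
                  .reject(&:empty?)
  # A leading segment containing a dot is assumed to be a bare host name.
  segments.shift if segments.first.to_s.include?('.')
  segments.empty? ? '/' : "/#{segments.join('/')}/"
end

# Adds user-supplied extensions to the defaults, without leading dots.
def normalize_exts(value)
  (%w[htm html] + value.to_s.split.map { |e| e.delete_prefix('.') }).uniq
end

puts normalize_host('http://www.example.com/')
puts normalize_base_path('http://example.com/archive/old/')
puts normalize_exts('.php asp').inspect
```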
@@ -0,0 +1,183 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Resurrector
6
+
7
+ attr_accessor :link_processor
8
+ attr_accessor :options
9
+
10
+ def initialize
11
+ @printer = Rowr::Printer.new
12
+ @option_getter = Rowr::Prompter.new
13
+ @state = Rowr::StateSaver.new Dir.pwd, 'rowr_state.json'
14
+ @prompt = TTY::Prompt.new
15
+ @config = {}
16
+ end
17
+
18
+ def init_link_processor(cached = {})
19
+ @link_processor = Rowr::LinkProcessor.new(
20
+ @config[:source_directory],
21
+ @config[:old_host],
22
+ @config[:new_base_path],
23
+ @config[:check_external_urls],
24
+ cached
25
+ )
26
+ end
27
+
28
+ def files_with_exts(exts)
29
+ regex = /\.(#{exts.join('|')})$/i
30
+ Dir.glob(File.join(@config[:source_directory], '**', '*')).grep(regex).select { |f| File.file?(f) }
31
+ end
32
+
33
+ def clean_no_quotes(file_contents)
34
+ find = %r{(?<=href=|src=|background=)(?!['"])(?<content>[^> ]*)}mi
35
+ file_contents.gsub(find) do |match|
36
+ "\"#{$~[:content]}\""
37
+ end
38
+ end
39
+
40
+ def check_urls(file_contents)
41
+ find = %r{(?<=(?<=href=|src=|url\(|background=)['"])(?!(mailto:|#))(?<content>.*?)(?=["'])}mi
42
+ file_contents.gsub(find) do |match|
43
+ content = $~[:content].strip
44
+ replacement = @link_processor.process(content)
45
+ replacement.nil? ? content : replacement
46
+ end
47
+ end
48
+
49
+ def prompt_user_for_option(option)
50
+ @option_getter.send(option.to_sym)
51
+ end
52
+
53
+ def gather_options
54
+ @printer.line_break 0
55
+ prompt_user_for_option 'old_host?'
56
+ @printer.line_break 0
57
+ prompt_user_for_option 'new_base_path?'
58
+ @printer.line_break 0
59
+ prompt_user_for_option 'additional_exts?'
60
+ @printer.line_break 0
61
+ prompt_user_for_option 'check_external_urls?'
62
+ @printer.line_break 0
63
+ end
64
+
65
+ def load_state
66
+ @state.load_state
67
+ @config = @state.config
68
+ end
69
+
70
+ def continue
71
+ unless @state.config_file_exists?
72
+ loop do
73
+ @printer.print_line " I can't find a rowr save file in this directory... "
74
+ prompt_user_for_option 'source_directory?'
75
+ @state.src = @option_getter.source_directory
76
+ break if @state.config_file_exists?
77
+ end
78
+ end
79
+ load_state
80
+ run
81
+ end
82
+
+     def start
+       @printer.print_intro
+       @printer.print_line " Before we start, I've got some questions "
+       prompt_user_for_option 'source_directory?'
+       @state.src = @option_getter.source_directory
+       if @state.config_file_exists?
+         @prompt.say("I've found a rowr save file.")
+         continue_resp = @prompt.select('Would you like to continue or reset?', %w(Continue Reset))
+         if continue_resp == 'Reset'
+           reset
+         else
+           continue
+         end
+         # Both branches already end by calling run, so stop here instead of
+         # falling through to prep and starting a second pass.
+         return
+       end
+       prep
+     end
+
+     def reset
+       zipper = Rowr::Zipper.new Dir.pwd
+       if zipper.backup_file_exists?
+         @option_getter.source_directory = Dir.pwd
+       else
+         loop do
+           @prompt.warn("I can't find a rowr backup file in this directory. Where is the old site?")
+           prompt_user_for_option 'source_directory?'
+           zipper.src = @option_getter.source_directory
+           break if zipper.backup_file_exists?
+         end
+       end
+       zipper.restore
+       prep
+     end
+
+     def prep
+       gather_options
+       @config = @option_getter.generate_hash
+       @state.save_config(@config)
+       @printer.print_line " Let's get started "
+       @printer.line_break 0.5
+       run
+     end
+
+     def run
+       # Backup
+       zipper = Rowr::Zipper.new @config[:source_directory]
+       zipper.backup
+
+       # Prep the link processor
+       init_link_processor(@state.cached)
+       files = files_with_exts(@config[:exts_to_use])
+       count = files.length
+
+       # Print the run intro
+       @printer.print_line
+       @printer.print_line " I've found #{count} files to scan "
+       @printer.print_line
+       @printer.line_break 0.5
+       unless @state.scanned_files.empty?
+         count -= @state.scanned_files.length
+         @printer.print_line(
+           " Skipping #{@state.scanned_files.length} files, previously scanned.",
+           '!',
+           'red'
+         )
+         @printer.line_break 0.5
+       end
+
+       files.each do |f|
+         # Skip any previously scanned files
+         next if @state.scanned_files.include?(f)
+
+         # Print the intro
+         @printer.print_file_header f
+         @printer.line_break 0
+
+         # Run @link_processor over all matching links
+         text = File.read(f)
+         @link_processor.containing_file = f
+         unless text.valid_encoding?
+           # Round-trip through UTF-16 so invalid bytes are replaced with '&nbsp;'
+           text = text.encode('UTF-16be', invalid: :replace, replace: '&nbsp;').encode('UTF-8')
+         end
+         text = clean_no_quotes(text)
+         text = check_urls(text)
+         # Write the text back as-is; puts would append a newline when one is missing
+         File.write(f, text)
+         count -= 1
+
+         # Update the state
+         @state.scanned_files << f
+         @state.cached = @link_processor.cached
+         @state.save_state
+
+         # Print count, onto next file
+         @printer.line_break 0
+         @printer.print_line " #{count} files left ", '*', 'cyan'
+         @printer.line_break 0
+       end
+
+       @printer.print_outro
+     end
+   end
+ end
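The two regex passes in `clean_no_quotes` and `check_urls` above can be tried in isolation. The following is a minimal, stdlib-only sketch: the regexes are copied from those methods, while `quote_bare_attributes`, `rewrite_urls` and the sample HTML are hypothetical names and data introduced here. The block passed to `rewrite_urls` merely stands in for `Rowr::LinkProcessor#process`, whose real behavior is not shown in this diff.

```ruby
# Regexes copied from clean_no_quotes and check_urls.
BARE_ATTR = %r{(?<=href=|src=|background=)(?!['"])(?<content>[^> ]*)}mi
QUOTED_URL = %r{(?<=(?<=href=|src=|url\(|background=)['"])(?!(mailto:|#))(?<content>.*?)(?=["'])}mi

# Hypothetical stand-in for clean_no_quotes: quote bare attribute values.
def quote_bare_attributes(html)
  html.gsub(BARE_ATTR) { "\"#{Regexp.last_match[:content]}\"" }
end

# Hypothetical stand-in for check_urls: the block plays the role of the
# link processor; returning nil keeps the original value.
def rewrite_urls(html)
  html.gsub(QUOTED_URL) do
    content = Regexp.last_match[:content].strip
    replacement = yield(content)
    replacement.nil? ? content : replacement
  end
end

html = '<a href=/old/page.html>x</a> <a href="mailto:a@b.c">m</a>'
quoted = quote_bare_attributes(html)
result = rewrite_urls(quoted) { |url| url.sub(%r{\A/old/}, '/archive/') }
puts quoted
puts result
```

The unquoted `href` is quoted first and then rewritten, while the `mailto:` link is skipped by the negative lookahead and never reaches the block.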
@@ -0,0 +1,47 @@
+ require 'rowr'
+
+ module Rowr
+
+   class StateSaver
+
+     def initialize(src_dir, filename)
+       @filename = filename
+       @config = {}
+       @cached = {}
+       @scanned_files = []
+       self.src = src_dir
+     end
+
+     attr_reader :src
+     attr_reader :file
+     attr_reader :config
+     attr_accessor :scanned_files
+     attr_accessor :cached
+
+     # Recompute the save-file path whenever the source directory changes, so
+     # config_file_exists?, load_state and save_state follow the new directory.
+     def src=(src_dir)
+       @src = src_dir
+       @file = File.expand_path(File.join(@src, @filename))
+     end
+
+     def config_file_exists?
+       File.exist?(@file)
+     end
+
+     def save_state
+       hashed = {
+         config: @config,
+         cached: @cached,
+         scanned_files: scanned_files
+       }
+       File.open(@file, 'wb') { |f| f.write JSON.pretty_generate(hashed) }
+     end
+
+     def load_state
+       # File.read closes the handle; File.open(...).read left it open
+       file = JSON.parse(File.read(@file), symbolize_names: true)
+       @config = file[:config]
+       @cached = file[:cached]
+       @scanned_files = file[:scanned_files]
+     end
+
+     def save_config(config)
+       @config = config
+       save_state
+     end
+
+   end
+ end
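The save/resume cycle in `save_state` and `load_state` is a plain JSON round trip. The sketch below reproduces it with only the standard library, under stated assumptions: the file name `rowr_state.json` and all values in `state` are hypothetical, since the real file name is passed in by the caller and is not visible in this diff.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical state, shaped like the hash built in save_state.
state = {
  config: { old_host: 'old.example.org' },
  cached: { 'http://old.example.org/a.html' => '/a.html' },
  scanned_files: ['index.html']
}

loaded = Dir.mktmpdir do |dir|
  path = File.join(dir, 'rowr_state.json') # hypothetical file name
  File.write(path, JSON.pretty_generate(state))
  # Same parse options as load_state
  JSON.parse(File.read(path), symbolize_names: true)
end

puts loaded[:scanned_files].inspect
puts loaded[:cached].keys.inspect
```

One detail worth noticing: `symbolize_names: true` applies to nested objects too, so string keys in the cache come back as symbols after a reload.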
@@ -0,0 +1,3 @@
+ module Rowr
+   VERSION = "0.2.0"
+ end
@@ -0,0 +1,64 @@
+ require 'rowr'
+
+ module Rowr
+
+   class Zipper
+
+     def initialize(src_dir)
+       @src = src_dir
+       @filename = 'rowr_backup_files'
+     end
+
+     attr_accessor :src
+     attr_reader :filename
+
+     # Derived from src on each call so it stays correct after src is reassigned.
+     def backup_dir
+       File.join(src, filename)
+     end
+
+     def backup
+       copy
+       zip
+       remove
+     end
+
+     def remove
+       FileUtils.remove_dir(backup_dir)
+     end
+
+     def copy
+       FileUtils.mkdir_p backup_dir
+       files = Dir.glob(File.join(src, '*')).reject { |f| File.basename(f) == filename }
+       FileUtils.cp_r files, backup_dir
+     end
+
+     def zip
+       zip = File.join(src, "#{filename}.zip")
+       File.delete zip if File.exist?(zip)
+       zf = ZipFileGenerator.new(backup_dir, zip)
+       zf.write
+     end
+
+     def restore
+       delete_all_files_except_backup
+       unzip
+     end
+
+     def unzip
+       Zip::File.open(File.join(src, "#{filename}.zip")) do |zip_file|
+         zip_file.each do |file|
+           file_path = File.join(src, file.name)
+           FileUtils.mkdir_p(File.dirname(file_path))
+           zip_file.extract(file, file_path) unless File.exist?(file_path)
+         end
+       end
+     end
+
+     def backup_file_exists?
+       File.exist?(File.expand_path(File.join(src, "#{filename}.zip")))
+     end
+
+     def delete_all_files_except_backup
+       FileUtils.rm_rf(Dir.glob(File.join(src, '*')).reject { |f| File.basename(f) =~ /^rowr_/ })
+     end
+
+   end
+ end
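The first step of `restore`, `delete_all_files_except_backup`, keeps anything whose name starts with `rowr_` and removes the rest. A minimal stdlib-only sketch of that filter follows; the temporary directory and the file names (including `rowr_state.json`) are hypothetical, chosen only to illustrate which entries survive.

```ruby
require 'fileutils'
require 'tmpdir'

remaining = Dir.mktmpdir do |dir|
  # Hypothetical site contents plus two rowr_* files
  %w[index.html about.html rowr_backup_files.zip rowr_state.json].each do |name|
    File.write(File.join(dir, name), 'x')
  end
  FileUtils.mkdir_p(File.join(dir, 'images'))

  # Same glob-and-reject expression as delete_all_files_except_backup
  doomed = Dir.glob(File.join(dir, '*')).reject { |f| File.basename(f) =~ /^rowr_/ }
  FileUtils.rm_rf(doomed)

  (Dir.entries(dir) - %w[. ..]).sort
end

puts remaining.inspect
```

Because `Dir.glob` with `*` does not match dotfiles by default, hidden files in the source directory would also be left in place by this expression.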
@@ -0,0 +1,48 @@
+ require 'zip'
+
+ # This is a simple example which uses rubyzip to
+ # recursively generate a zip file from the contents of
+ # a specified directory. The directory itself is not
+ # included in the archive, rather just its contents.
+ #
+ # Usage:
+ #   directoryToZip = "/tmp/input"
+ #   outputFile = "/tmp/out.zip"
+ #   zf = ZipFileGenerator.new(directoryToZip, outputFile)
+ #   zf.write()
+ class ZipFileGenerator
+
+   # Initialize with the directory to zip and the location of the output archive.
+   def initialize(inputDir, outputFile)
+     @inputDir = inputDir
+     @outputFile = outputFile
+   end
+
+   # Zip the input directory.
+   def write()
+     entries = Dir.entries(@inputDir) - %w(. ..)
+     # The block form closes the archive even if an entry fails to write.
+     Zip::File.open(@outputFile, Zip::File::CREATE) do |io|
+       writeEntries(entries, "", io)
+     end
+   end
+
+   private
+
+   # A helper method to make the recursion work.
+   def writeEntries(entries, path, io)
+     entries.each do |e|
+       zipFilePath = path == "" ? e : File.join(path, e)
+       diskFilePath = File.join(@inputDir, zipFilePath)
+       # puts "Deflating " + diskFilePath
+       if File.directory?(diskFilePath)
+         io.mkdir(zipFilePath)
+         subdir = Dir.entries(diskFilePath) - %w(. ..)
+         writeEntries(subdir, zipFilePath, io)
+       else
+         # Copy the bytes verbatim; puts would append a newline to files that lack one.
+         io.get_output_stream(zipFilePath) { |f| f.write(File.binread(diskFilePath)) }
+       end
+     end
+   end
+
+ end
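When file contents are copied into an archive entry, the choice between `puts` and `write` on the output stream matters for binary files. The sketch below uses `StringIO` as a stand-in for the zip entry stream (an assumption made so the example runs without rubyzip) and a few made-up bytes as the file contents.

```ruby
require 'stringio'

# A few binary bytes that do not end in a newline (illustrative data only)
bytes = "\x89PNG\r\n\x1A".b

with_puts = StringIO.new(String.new)
with_puts.puts(bytes)   # appends "\n" because the data does not end in one

with_write = StringIO.new(String.new)
with_write.write(bytes) # copies the bytes unchanged

puts "original: #{bytes.bytesize} bytes"
puts "puts:     #{with_puts.string.bytesize} bytes"
puts "write:    #{with_write.string.bytesize} bytes"
```

The extra byte is harmless for most text files but changes the content of images and other binary assets stored in a backup.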
metadata ADDED
@@ -0,0 +1,174 @@
+ --- !ruby/object:Gem::Specification
+ name: rowr
+ version: !ruby/object:Gem::Version
+   version: 0.2.0
+ platform: ruby
+ authors:
+ - Luke Aeschleman
+ - Jason Casden
+ autorequire:
+ bindir: exe
+ cert_chain: []
+ date: 2017-10-06 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: rubyzip
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 1.2.1
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 1.2.1
+ - !ruby/object:Gem::Dependency
+   name: slop
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 4.6.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 4.6.0
+ - !ruby/object:Gem::Dependency
+   name: thor
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.19'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.19'
+ - !ruby/object:Gem::Dependency
+   name: tty-prompt
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 0.13.2
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 0.13.2
+ - !ruby/object:Gem::Dependency
+   name: faraday
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 0.13.1
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 0.13.1
+ - !ruby/object:Gem::Dependency
+   name: bundler
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.13'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.13'
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 12.1.0
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 12.1.0
+ - !ruby/object:Gem::Dependency
+   name: rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.0'
+ description: Refactors pieces of old websites so they can be hosted again or archived.
+ email:
+ - lukeaeschleman@gmail.com
+ - casden@gmail.com
+ executables:
+ - rowr
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - CODE_OF_CONDUCT.md
+ - LICENSE
+ - README.md
+ - bin/setup
+ - exe/rowr
+ - lib/rowr.rb
+ - lib/rowr/command_line.rb
+ - lib/rowr/link_processor.rb
+ - lib/rowr/printer.rb
+ - lib/rowr/prompter.rb
+ - lib/rowr/resurrector.rb
+ - lib/rowr/state_saver.rb
+ - lib/rowr/version.rb
+ - lib/rowr/zipper.rb
+ - lib/zip/zip_file_generator.rb
+ homepage: https://github.com/UNC-Libraries/ROWR
+ licenses:
+ - Apache-2.0
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.6.13
+ signing_key:
+ specification_version: 4
+ summary: The Really Old Website Refactorer
+ test_files: []
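Every dependency in the metadata above uses the pessimistic `~>` operator, whose upper bound depends on how many version segments are given. The short sketch below checks a few versions against two of the listed constraints using RubyGems' own `Gem::Requirement`; the candidate version numbers are picked only for illustration.

```ruby
require 'rubygems'

# Three segments: allows >= 1.2.1 and < 1.3
rubyzip = Gem::Requirement.new('~> 1.2.1')
# Two segments: allows >= 0.19 and < 1.0
thor = Gem::Requirement.new('~> 0.19')

checks = {
  'rubyzip 1.2.4' => rubyzip.satisfied_by?(Gem::Version.new('1.2.4')),
  'rubyzip 1.3.0' => rubyzip.satisfied_by?(Gem::Version.new('1.3.0')),
  'thor 0.20.3'   => thor.satisfied_by?(Gem::Version.new('0.20.3')),
  'thor 1.0.0'    => thor.satisfied_by?(Gem::Version.new('1.0.0'))
}

checks.each { |label, ok| puts "#{label}: #{ok}" }
```

So the `rubyzip` constraint accepts only patch releases in the 1.2 series, while the two-segment `thor` constraint accepts any 0.x release from 0.19 onward.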