rowr 0.2.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 4fea7df6d4e1adb7ebe5a92dc53bdbc19002171d
4
+ data.tar.gz: e29245ead1eedb69de5b92626cf3e4759bf03a74
5
+ SHA512:
6
+ metadata.gz: 064b7ec66b949cec8b6d9fc323ac8268b3a4c2736732ae916afb49559e53200879f25b59bbbb797555397acc23f2ea3875245d1283524fae7bcb6d358ac1e097
7
+ data.tar.gz: 0d535ab043e508d36b9238f33cb9b41cbf7c8f564bdda59c52a10afb62147c661e53897f9dc92eb5d599c6e66262c6ace2ddf690f5c0f49b62e50d388eaf4b29
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at lukeaeschleman@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/LICENSE ADDED
@@ -0,0 +1,201 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship, whether in Source or
36
+ Object form, made available under the License, as indicated by a
37
+ copyright notice that is included in or attached to the work
38
+ (an example is provided in the Appendix below).
39
+
40
+ "Derivative Works" shall mean any work, whether in Source or Object
41
+ form, that is based on (or derived from) the Work and for which the
42
+ editorial revisions, annotations, elaborations, or other modifications
43
+ represent, as a whole, an original work of authorship. For the purposes
44
+ of this License, Derivative Works shall not include works that remain
45
+ separable from, or merely link (or bind by name) to the interfaces of,
46
+ the Work and Derivative Works thereof.
47
+
48
+ "Contribution" shall mean any work of authorship, including
49
+ the original version of the Work and any modifications or additions
50
+ to that Work or Derivative Works thereof, that is intentionally
51
+ submitted to Licensor for inclusion in the Work by the copyright owner
52
+ or by an individual or Legal Entity authorized to submit on behalf of
53
+ the copyright owner. For the purposes of this definition, "submitted"
54
+ means any form of electronic, verbal, or written communication sent
55
+ to the Licensor or its representatives, including but not limited to
56
+ communication on electronic mailing lists, source code control systems,
57
+ and issue tracking systems that are managed by, or on behalf of, the
58
+ Licensor for the purpose of discussing and improving the Work, but
59
+ excluding communication that is conspicuously marked or otherwise
60
+ designated in writing by the copyright owner as "Not a Contribution."
61
+
62
+ "Contributor" shall mean Licensor and any individual or Legal Entity
63
+ on behalf of whom a Contribution has been received by Licensor and
64
+ subsequently incorporated within the Work.
65
+
66
+ 2. Grant of Copyright License. Subject to the terms and conditions of
67
+ this License, each Contributor hereby grants to You a perpetual,
68
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69
+ copyright license to reproduce, prepare Derivative Works of,
70
+ publicly display, publicly perform, sublicense, and distribute the
71
+ Work and such Derivative Works in Source or Object form.
72
+
73
+ 3. Grant of Patent License. Subject to the terms and conditions of
74
+ this License, each Contributor hereby grants to You a perpetual,
75
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76
+ (except as stated in this section) patent license to make, have made,
77
+ use, offer to sell, sell, import, and otherwise transfer the Work,
78
+ where such license applies only to those patent claims licensable
79
+ by such Contributor that are necessarily infringed by their
80
+ Contribution(s) alone or by combination of their Contribution(s)
81
+ with the Work to which such Contribution(s) was submitted. If You
82
+ institute patent litigation against any entity (including a
83
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
84
+ or a Contribution incorporated within the Work constitutes direct
85
+ or contributory patent infringement, then any patent licenses
86
+ granted to You under this License for that Work shall terminate
87
+ as of the date such litigation is filed.
88
+
89
+ 4. Redistribution. You may reproduce and distribute copies of the
90
+ Work or Derivative Works thereof in any medium, with or without
91
+ modifications, and in Source or Object form, provided that You
92
+ meet the following conditions:
93
+
94
+ (a) You must give any other recipients of the Work or
95
+ Derivative Works a copy of this License; and
96
+
97
+ (b) You must cause any modified files to carry prominent notices
98
+ stating that You changed the files; and
99
+
100
+ (c) You must retain, in the Source form of any Derivative Works
101
+ that You distribute, all copyright, patent, trademark, and
102
+ attribution notices from the Source form of the Work,
103
+ excluding those notices that do not pertain to any part of
104
+ the Derivative Works; and
105
+
106
+ (d) If the Work includes a "NOTICE" text file as part of its
107
+ distribution, then any Derivative Works that You distribute must
108
+ include a readable copy of the attribution notices contained
109
+ within such NOTICE file, excluding those notices that do not
110
+ pertain to any part of the Derivative Works, in at least one
111
+ of the following places: within a NOTICE text file distributed
112
+ as part of the Derivative Works; within the Source form or
113
+ documentation, if provided along with the Derivative Works; or,
114
+ within a display generated by the Derivative Works, if and
115
+ wherever such third-party notices normally appear. The contents
116
+ of the NOTICE file are for informational purposes only and
117
+ do not modify the License. You may add Your own attribution
118
+ notices within Derivative Works that You distribute, alongside
119
+ or as an addendum to the NOTICE text from the Work, provided
120
+ that such additional attribution notices cannot be construed
121
+ as modifying the License.
122
+
123
+ You may add Your own copyright statement to Your modifications and
124
+ may provide additional or different license terms and conditions
125
+ for use, reproduction, or distribution of Your modifications, or
126
+ for any such Derivative Works as a whole, provided Your use,
127
+ reproduction, and distribution of the Work otherwise complies with
128
+ the conditions stated in this License.
129
+
130
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
131
+ any Contribution intentionally submitted for inclusion in the Work
132
+ by You to the Licensor shall be under the terms and conditions of
133
+ this License, without any additional terms or conditions.
134
+ Notwithstanding the above, nothing herein shall supersede or modify
135
+ the terms of any separate license agreement you may have executed
136
+ with Licensor regarding such Contributions.
137
+
138
+ 6. Trademarks. This License does not grant permission to use the trade
139
+ names, trademarks, service marks, or product names of the Licensor,
140
+ except as required for reasonable and customary use in describing the
141
+ origin of the Work and reproducing the content of the NOTICE file.
142
+
143
+ 7. Disclaimer of Warranty. Unless required by applicable law or
144
+ agreed to in writing, Licensor provides the Work (and each
145
+ Contributor provides its Contributions) on an "AS IS" BASIS,
146
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147
+ implied, including, without limitation, any warranties or conditions
148
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149
+ PARTICULAR PURPOSE. You are solely responsible for determining the
150
+ appropriateness of using or redistributing the Work and assume any
151
+ risks associated with Your exercise of permissions under this License.
152
+
153
+ 8. Limitation of Liability. In no event and under no legal theory,
154
+ whether in tort (including negligence), contract, or otherwise,
155
+ unless required by applicable law (such as deliberate and grossly
156
+ negligent acts) or agreed to in writing, shall any Contributor be
157
+ liable to You for damages, including any direct, indirect, special,
158
+ incidental, or consequential damages of any character arising as a
159
+ result of this License or out of the use or inability to use the
160
+ Work (including but not limited to damages for loss of goodwill,
161
+ work stoppage, computer failure or malfunction, or any and all
162
+ other commercial damages or losses), even if such Contributor
163
+ has been advised of the possibility of such damages.
164
+
165
+ 9. Accepting Warranty or Additional Liability. While redistributing
166
+ the Work or Derivative Works thereof, You may choose to offer,
167
+ and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "{}"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright {yyyy} {name of copyright owner}
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.
@@ -0,0 +1,32 @@
1
+ # ROWR: Really Old Website Resurrector
2
+
3
+ [![Build Status](https://travis-ci.org/UNC-Libraries/ROWR.svg?branch=master)](https://travis-ci.org/UNC-Libraries/ROWR)
4
+
5
+ It's basically a link find/replace tool for really old websites. ROWR will parse through your site files and look for any
6
+ broken links. When it finds one, it will prompt you to either replace it, remove it, or skip it.
7
+ ROWR takes a really old website, one that might be living on a CD or flash drive for archival purposes, and allows you to clean up
8
+ any broken links.
9
+
10
+ ## Installation
11
+
12
+ After installing Ruby, add `gem 'rowr'` to your application's Gemfile or run the following from the command line:
13
+
14
+ $ gem install rowr
15
+
16
+ ## Usage
17
+
18
+ `rowr start` Start the script; it will prompt you for information about the really old site.
19
+ While running, you can always prematurely stop the script with CMD+C or CTRL+C.
20
+
21
+ `rowr continue` Continue where you left off.
22
+
23
+ `rowr reset` Destroy all changes made and restart the process.
24
+
25
+ ## Contributing
26
+
27
+ Bug reports and pull requests are welcome on GitHub at https://github.com/UNC-Libraries/ROWR/issues.
28
+
29
+ ## License
30
+
31
+ The gem is available as open source under the terms of the [Apache License 2.0](http://www.apache.org/licenses/).
32
+
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'rowr/command_line'
4
+
5
+ result = Rowr::CommandLine.start( ARGV )
6
+
7
+ exit 1 unless result # exit with a non-zero status when the command reports a problem
@@ -0,0 +1,20 @@
1
+ require 'rowr/version'
2
+ require 'rowr/zipper'
3
+ require 'rowr/resurrector'
4
+ require 'rowr/prompter'
5
+ require 'rowr/printer'
6
+ require 'rowr/link_processor'
7
+ require 'rowr/state_saver'
8
+ require 'fileutils'
9
+ require 'faraday'
10
+ require 'slop'
11
+ require 'thor'
12
+ require 'tty-prompt'
13
+ require 'json'
14
+ require 'yaml'
15
+ require 'zip'
16
+ require 'zip/zip_file_generator'
17
+
18
+ module Rowr
19
+
20
+ end
@@ -0,0 +1,61 @@
1
+ require 'rowr'
2
+ require 'tty-prompt'
3
+
4
+ module Rowr
5
+
6
+ class CommandLine < Thor
7
+
8
+ desc 'start',
9
+ 'resurrect a really old website'
10
+
11
+ def start
12
+ rowr = Rowr::Resurrector.new
13
+ rowr.start
14
+ end
15
+
16
+ desc 'continue',
17
+ 'continue resurrecting'
18
+
19
+ def continue
20
+ rowr = Rowr::Resurrector.new
21
+ rowr.continue
22
+ end
23
+
24
+ desc 'reset',
25
+ 'restart the really old website resurrection'
26
+
27
+ def reset
28
+ rowr = Rowr::Resurrector.new
29
+ rowr.reset
30
+ end
31
+
32
+ desc 'test <file>',
33
+ 'test the resurrector on a single file'
34
+
35
+ def test(file)
36
+ return unless File.exist? File.expand_path(file)
37
+
38
+ f = File.expand_path(file)
39
+ rowr = Rowr::Resurrector.new
40
+
41
+ rowr.options.source_directory = File.dirname(f)
42
+ rowr.prompt_user_for_option 'old_host?'
43
+ rowr.prompt_user_for_option 'new_base_path?'
44
+ rowr.prompt_user_for_option 'check_external_urls?'
45
+
46
+ rowr.init_link_processor
47
+
48
+ rowr.link_processor.containing_file = f
49
+
50
+ text = File.read(f)
51
+ unless text.valid_encoding?
52
+ text = text.encode('UTF-16be', :invalid=>:replace, :replace=>'&nbsp;').encode('UTF-8')
53
+ end
54
+ text = rowr.clean_no_quotes(text)
55
+ rowr.check_urls(text)
56
+
57
+ end
58
+
59
+ end
60
+
61
+ end
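
The `test` command above repairs invalid byte sequences before it scans a file. Here is a minimal standalone sketch of that idiom, assuming the input is meant to be UTF-8. The helper name `repair_encoding` and the `?` replacement are illustrative and not part of the gem, which substitutes `&nbsp;` instead.

```ruby
# Hypothetical helper, not part of rowr: replaces invalid bytes in a
# string that is supposed to be UTF-8.
def repair_encoding(text, replacement = '?')
  return text if text.valid_encoding?

  # Round-tripping through UTF-16BE forces Ruby to transcode, which is
  # what triggers the invalid: :replace handling.
  text.encode('UTF-16BE', invalid: :replace, replace: replacement)
      .encode('UTF-8')
end

# \xFF can never appear in valid UTF-8, so this string is broken on purpose.
broken = "caf\xFF menu".b.force_encoding(Encoding::UTF_8)
fixed  = repair_encoding(broken)

puts broken.valid_encoding? # false
puts fixed.valid_encoding?  # true
puts fixed
```

On Ruby 2.1 and later, `String#scrub` does the same job in a single call.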
@@ -0,0 +1,234 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class LinkProcessor
6
+
7
+ attr_reader :local_site_dir
8
+ attr_reader :old_domain
9
+ attr_accessor :new_base_path
10
+ attr_accessor :cached
11
+ attr_reader :link_to_check
12
+ attr_accessor :containing_file
13
+ attr_reader :target_file
14
+
15
+ def initialize(src_dir, old_domain = nil, new_base_path = nil, check_external_urls = true, cached = {})
16
+ @printer = Rowr::Printer.new
17
+ @prompt = TTY::Prompt.new(active_color: :cyan)
18
+ @pastel = Pastel.new
19
+ @local_site_dir = src_dir
20
+ @old_domain = old_domain
21
+ @new_base_path = new_base_path
22
+ @check_external_urls = check_external_urls
23
+ @cached = cached
24
+ end
25
+
26
+ ################################
27
+ # Attributes
28
+ ################################
29
+ def link_to_check=(value)
30
+ if external?(value)
31
+ @link_to_check = value
32
+ else
33
+ value.sub!(old_url_regex, '') if @old_domain
34
+ if value.start_with?('/')
35
+ @link_to_check = value.sub(%r{(^/)}, '')
36
+ @target_file = File.expand_path(File.join(@local_site_dir, @link_to_check))
37
+ else
38
+ @link_to_check = File.dirname(@containing_file).sub(@local_site_dir, '') + '/' + value
39
+ @link_to_check.sub!(%r{(^/)}, '')
40
+ @target_file = File.expand_path(File.join(File.dirname(@containing_file), value))
41
+ end
42
+ end
43
+ end
44
+
45
+ def old_url_regex
46
+ %r{^(https?://|//)#{Regexp.escape(@old_domain)}}i if @old_domain
47
+ end
48
+
49
+ ################################
50
+ # Checkers
51
+ ################################
52
+
53
+ def external?(link)
54
+ !old_uri?(link) && uri?(link) ? true : false
55
+ end
56
+
57
+ def old_uri?(link)
58
+ if @old_domain
59
+ link =~ old_url_regex
60
+ else
61
+ false
62
+ end
63
+ end
64
+
65
+ def uri?(link)
66
+ link =~ %r{^(https?:|//)}i
67
+ end
68
+
69
+ def in_cache?
70
+ @cached.key?(link_key)
71
+ end
72
+
73
+ def response_code(link)
74
+ begin
75
+ res = Faraday.get link
76
+ return res.status
77
+ rescue
78
+ return 0
79
+ end
80
+ end
81
+
82
+ def trim_hash(file)
83
+ file.sub(/#(.*?)$/,'')
84
+ end
85
+
86
+ def target_file_exists?
87
+ File.exist?(trim_hash(@target_file))
88
+ end
89
+
90
+ def broken_external_link?
91
+ res = response_code(@link_to_check)
92
+ res > 399 || res < 200
93
+ end
94
+
95
+ def is_valid_replacement?(link)
96
+ if uri?(link)
97
+ res = response_code(link)
98
+ res < 400 && res > 199
99
+ else
100
+ File.exist?(File.join(@local_site_dir, link))
101
+ end
102
+ end
103
+
104
+ ################################
105
+ # Misc
106
+ ################################
107
+
108
+ def link_key
109
+ @link_to_check.to_sym
110
+ end
111
+
112
+ def add_to_cache(new_link)
113
+ @cached[link_key] = new_link
114
+ end
115
+
116
+ def recommend_files
117
+ Dir.glob("#{@local_site_dir}/**/{#{File.basename(@target_file)}}").map! do |f|
118
+ f.sub(@local_site_dir,'')
119
+ end
120
+ end
121
+
122
+ def prepend_new_base_path(link)
123
+ check = @new_base_path[1..-1].chop
124
+ new_link = link.sub(%r{^/?#{check}},'')
125
+ new_link = new_link.sub(/^\//,'')
126
+ @new_base_path + new_link
127
+ end
128
+
129
+ ################################
130
+ # Processors
131
+ ################################
132
+
133
+ def process_link
134
+ @new_base_path + @link_to_check if target_file_exists?
135
+ end
136
+
137
+ def process_broken_link
138
+ return cached[link_key] if in_cache?
139
+ replacement = nil
140
+ @printer.print_broken_link_warning @containing_file, @link_to_check
141
+ replacement = ask_recommended_files unless recommend_files.empty?
142
+ replacement = ask_wtd unless replacement
143
+ ask_to_cache(replacement)
144
+ replacement
145
+ end
146
+
147
+ def process_external
148
+ return nil unless @check_external_urls && broken_external_link?
149
+ @printer.print_broken_link_warning @containing_file, @link_to_check
150
+ replacement = ask_wtd
151
+ ask_to_cache(replacement)
152
+ replacement
153
+ end
154
+
155
+ def process(link, file = nil)
156
+ @containing_file = file if file
157
+ self.link_to_check = link
158
+
159
+ if external?(@link_to_check)
160
+ replacement = process_external
161
+ else
162
+ replacement = process_link
163
+ replacement = process_broken_link unless replacement
164
+ end
165
+ replacement
166
+ end
167
+
168
+ ################################
169
+ # Asks
170
+ ################################
171
+
172
+ def ask_recommended_files
173
+ @printer.print_line ' I found some matching files ', '+', :blue
174
+ recommended_files = recommend_files
175
+ choice = @prompt.select(
176
+ 'Would you like to replace the broken link with any of the following?',
177
+ recommended_files + ['None of these match'],
178
+ per_page: 10
179
+ )
180
+ choice == 'None of these match' ? nil : prepend_new_base_path(choice)
181
+ end
182
+
183
+ def ask_to_cache(new_link)
184
+ case new_link
185
+ when nil
186
+ message = "SKIP all instances of " + @pastel.green("#{@link_to_check}") + "?"
187
+ when '#'
188
+ message = "REMOVE all instances of " + @pastel.green("#{@link_to_check}") + "?"
189
+ else
190
+ message = "REPLACE all instances of " + @pastel.green("#{@link_to_check}") + " with " + @pastel.blue("#{new_link}") + "?"
191
+ end
192
+ add_to_cache(new_link) if @prompt.yes?(message)
193
+ end
194
+
195
+ def ask_wtd
196
+ @printer.line_break 0
197
+ wtd = @prompt.enum_select('What would you like to do?') do |menu|
198
+ menu.default 1
199
+
200
+ menu.choice 'Enter a new link', 1
201
+ menu.choice 'Remove the link', 2
202
+ menu.choice 'Skip', 3
203
+ end
204
+
205
+ case wtd
206
+ when 1
207
+ ask_new_link
208
+ when 2
209
+ '#'
210
+ when 3
211
+ nil
212
+ end
213
+ end
214
+
215
+ def ask_new_link
216
+ new_link = @prompt.ask('Enter the replacement:')
217
+ unless is_valid_replacement?(new_link)
218
+ if uri?(new_link)
219
+ @prompt.error("Sorry, the url you've provided is not returning a 200 status code")
220
+ else
221
+ @prompt.error('Sorry, that file does not exist')
222
+ end
223
+ new_link = ask_new_link
224
+ end
225
+
226
+ if uri?(new_link)
227
+ new_link
228
+ else
229
+ prepend_new_base_path(new_link)
230
+ end
231
+ end
232
+
233
+ end
234
+ end
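
The link processor above decides whether a link is external, points at the old host, or is a local path, and it treats HTTP status codes from 200 to 399 as healthy. The sketch below restates those checks as standalone functions so they can be tried without the prompts or network calls. The function signatures and the `www.old.edu` host are assumptions made for illustration, not the gem's API.

```ruby
# Hypothetical standalone versions of the link checks; not the gem's API.
def uri?(link)
  link.match?(%r{\A(https?:|//)}i)
end

def old_uri?(link, old_domain)
  return false if old_domain.to_s.empty?

  # Regexp.escape keeps the dots in the host name literal.
  link.match?(%r{\A(https?://|//)#{Regexp.escape(old_domain)}}i)
end

def external?(link, old_domain)
  uri?(link) && !old_uri?(link, old_domain)
end

# Both bounds must hold, so the comparison needs && rather than ||.
def healthy_status?(code)
  code >= 200 && code < 400
end

old = 'www.old.edu' # placeholder host
puts external?('http://example.org/a.html', old)  # true
puts external?('http://www.old.edu/a.html', old)  # false
puts external?('images/logo.gif', old)            # false
puts healthy_status?(404)                         # false
```

A status of `0`, which the processor uses when a request raises an error, falls outside the range and therefore counts as broken.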
@@ -0,0 +1,64 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Printer
6
+
7
+ def initialize(line_length = 50)
8
+ @pastel = Pastel.new
9
+ @line_length = line_length
10
+ end
11
+
12
+ def line(text = '', char = '~')
13
+ message = text.to_s
14
+ return message if too_long?(message)
15
+
16
+ waves = char * ((@line_length - message.length) / 2).ceil
17
+ output = "#{waves}#{message}#{waves}"
18
+ output + (char * (@line_length - output.length))
19
+ end
20
+
21
+ def too_long?(string)
22
+ string.length > @line_length
23
+ end
24
+
25
+ def line_break(duration)
26
+ puts "\n"
27
+ sleep(duration)
28
+ end
29
+
30
+ def print_line(message = nil, char = '~', color = 'green')
31
+ puts @pastel.send(color.to_sym, line(message, char))
32
+ end
33
+
34
+ def print_intro
35
+ print_line
36
+ print_line ' ROWR! '
37
+ print_line
38
+ line_break 1
39
+ end
40
+
41
+ def print_outro
42
+ print_line
43
+ print_line " You're all done! "
44
+ print_line
45
+ print_line ' rowr... '
46
+ print_line
47
+ end
48
+
49
+ def print_broken_link_warning(file, link)
50
+ line_break 0
51
+ print_line '', '!', 'yellow'
52
+ print_line ' Broken Link ', '!', 'yellow'
53
+ puts @pastel.magenta.bold('File: ', @pastel.red(file))
54
+ puts @pastel.magenta.bold('Link: ', @pastel.green(link))
55
+ end
56
+
57
+ def print_file_header(file)
58
+ print_line ' FILE ', '*', 'cyan'
59
+ print_line " #{file} ", '*', 'cyan'
60
+ print_line '', '*', 'cyan'
61
+ end
62
+
63
+ end
64
+ end
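
The `line` method above centers a message inside a run of filler characters and tops the result up to a fixed width. Here is a small standalone sketch of that padding logic. The function name `centered_line` and its `width` parameter are illustrative stand-ins for the printer's `line` method and `@line_length`.

```ruby
# Hypothetical standalone version of the padding logic; not the gem's API.
def centered_line(text = '', char = '~', width = 50)
  message = text.to_s
  return message if message.length > width

  # Integer division rounds down, so both sides get the same amount.
  side = char * ((width - message.length) / 2)
  output = "#{side}#{message}#{side}"
  # An odd remainder leaves the line one character short; top it up.
  output + (char * (width - output.length))
end

puts centered_line(' ROWR! ')
puts centered_line(' Broken Link ', '!', 30)
puts centered_line(' ROWR! ').length # 50
```

Using the `width` argument for the final top-up, rather than a literal `50`, keeps the output correct for any line length.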
@@ -0,0 +1,113 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Prompter
6
+
7
+ attr_reader :source_directory
8
+ attr_reader :exts_to_use
9
+ attr_reader :old_host
10
+ attr_reader :new_base_path
11
+ attr_accessor :check_external_urls
12
+
13
+ def initialize
14
+ @prompt = TTY::Prompt.new
15
+ end
16
+
17
+ def source_directory=(value)
18
+ dir = File.expand_path(value)
19
+ @source_directory = dir if Dir.exist?(dir)
20
+ end
21
+
22
+ def old_host=(value)
23
+ if value.to_s.empty?
24
+ @old_host = false
25
+ else
26
+ @old_host = value.chomp('/').chomp('/').sub(%r{https?://}, '')
27
+ end
28
+ end
29
+
30
+ def exts_to_use=(value)
31
+ exts = []
32
+ if value
33
+ exts = value.split
34
+ exts.map! { |e| e.start_with?('.') ? e[1..-1] : e } if exts.is_a?(Array)
35
+ end
36
+ exts = %w(htm html) + exts
37
+ @exts_to_use = exts.uniq
38
+ end
39
+
40
+ def new_base_path=(value)
41
+ clean = ''
42
+ unless value.to_s.empty?
43
+ clean = value.sub(%r{https?://(.*?)(/|$)}, '')
44
+ clean = clean.split('/')
45
+ clean = clean.reject(&:empty?)
46
+ clean.shift if clean.first =~ /\./
47
+ clean = clean.join('/')
48
+ end
49
+ @new_base_path = clean.to_s.empty? ? '/' : "/#{clean}/"
50
+ end
51
+
52
+ def dir_select
53
+ @prompt.select('Where is this really old website?') do |menu|
54
+ menu.default 1
55
+
56
+ menu.choice "#{Dir.pwd} (The current dir)", 1
57
+ menu.choice "Another directory?", 2
58
+ end
59
+ end
60
+
61
+ def ask_for_other_source_directory
62
+ dir = @prompt.ask('Please type in the path to that directory:') do |q|
63
+ q.required true
64
+ end
65
+ unless Dir.exist?(File.expand_path(dir))
66
+ @prompt.error("Sorry, #{dir} doesn't seem to exist")
67
+ return ask_for_other_source_directory
68
+ end
69
+ dir
70
+ end
71
+
72
+ def old_host?
73
+ self.old_host = @prompt.ask('What was the old host? (e.g. www.google.com)')
74
+ end
75
+
76
+ def source_directory?
77
+ self.source_directory = case dir_select
78
+ when 1
79
+ Dir.pwd
80
+ when 2
81
+ ask_for_other_source_directory
82
+ end
83
+ end
84
+
85
+ def additional_exts?
86
+ @prompt.say('By default, I\'ll scan any .html and .htm files.')
87
+ self.exts_to_use = @prompt.ask('Please list any other extensions, or hit Enter to skip')
88
+ end
89
+
90
+ def new_base_path?
91
+ self.new_base_path = @prompt.ask('What will be the url of the resurrected site?')
92
+ end
93
+
94
+ def check_external_urls?
95
+ self.check_external_urls = @prompt.select('If I find a link to an external site, what should I do?') do |menu|
96
+ menu.default 1
97
+
98
+ menu.choice 'Ask me about it', true
99
+ menu.choice 'Skip it', false
100
+ end
101
+ end
102
+
103
+ def generate_hash
104
+ {
105
+ source_directory: source_directory,
106
+ exts_to_use: exts_to_use,
107
+ old_host: old_host,
108
+ new_base_path: new_base_path,
109
+ check_external_urls: check_external_urls
110
+ }
111
+ end
112
+ end
113
+ end
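
The `new_base_path=` setter above turns whatever the user types (a full URL, a bare host with a path, or just a path) into a base path with a leading and trailing slash. Here is a standalone sketch of that normalization. The function name `normalize_base_path` and the `example.org` inputs are illustrative, not part of the gem.

```ruby
# Hypothetical standalone version of the normalization; not the gem's API.
def normalize_base_path(value)
  return '/' if value.to_s.empty?

  # Drop the scheme and host, e.g. "https://example.org/".
  path = value.sub(%r{https?://(.*?)(/|$)}, '')
  segments = path.split('/').reject(&:empty?)
  # A leading segment that contains a dot is treated as a bare host name.
  segments.shift if segments.first&.include?('.')
  segments.empty? ? '/' : "/#{segments.join('/')}/"
end

puts normalize_base_path('https://example.org/archive/old-site') # /archive/old-site/
puts normalize_base_path('example.org/archive/')                 # /archive/
puts normalize_base_path('https://example.org')                  # /
puts normalize_base_path(nil)                                    # /
```

One consequence of the dot heuristic is that a path whose first segment legitimately contains a dot, such as `v1.2/docs`, loses that segment.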
@@ -0,0 +1,183 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Resurrector
6
+
7
+ attr_accessor :link_processor
8
+ attr_accessor :options
9
+
10
+ def initialize
11
+ @printer = Rowr::Printer.new
12
+ @option_getter = Rowr::Prompter.new
13
+ @state = Rowr::StateSaver.new Dir.pwd, 'rowr_state.json'
14
+ @prompt = TTY::Prompt.new
15
+ @config = {}
16
+ end
17
+
18
+ def init_link_processor(cached = {})
19
+ @link_processor = Rowr::LinkProcessor.new(
20
+ @config[:source_directory],
21
+ @config[:old_host],
22
+ @config[:new_base_path],
23
+ @config[:check_external_urls],
24
+ cached
25
+ )
26
+ end
27
+
28
+ def files_with_exts(exts)
29
+ regex = /\.(#{exts.join('|')})$/i
30
+ Dir.glob(File.join(@config[:source_directory], '**', '*')).grep(regex).select { |f| File.file?(f) }
31
+ end
32
+
33
+ def clean_no_quotes(file_contents)
34
+ find = %r{(?<=href=|src=|background=)(?!['"])(?<content>[^> ]*)}mi
35
+ file_contents.gsub(find) do |match|
36
+ "\"#{$~[:content]}\""
37
+ end
38
+ end
39
+
40
+ def check_urls(file_contents)
41
+ find = %r{(?<=(?<=href=|src=|url\(|background=)['"])(?!(mailto:|#))(?<content>.*?)(?=["'])}mi
42
+ file_contents.gsub(find) do |match|
43
+ content = $~[:content].strip
44
+ replacement = @link_processor.process(content)
45
+ replacement.nil? ? content : replacement
46
+ end
47
+ end
48
+
49
+ def prompt_user_for_option(option)
50
+ @option_getter.send(option.to_sym)
51
+ end
52
+
53
+ def gather_options
54
+ @printer.line_break 0
55
+ prompt_user_for_option 'old_host?'
56
+ @printer.line_break 0
57
+ prompt_user_for_option 'new_base_path?'
58
+ @printer.line_break 0
59
+ prompt_user_for_option 'additional_exts?'
60
+ @printer.line_break 0
61
+ prompt_user_for_option 'check_external_urls?'
62
+ @printer.line_break 0
63
+ end
64
+
65
+ def load_state
66
+ @state.load_state
67
+ @config = @state.config
68
+ end
69
+
70
+ def continue
71
+ unless @state.config_file_exists?
72
+ loop do
73
+ @printer.print_line " I can't find a rowr save file in this directory... "
74
+ prompt_user_for_option 'source_directory?'
75
+ @state.src = @option_getter.source_directory
76
+ break if @state.config_file_exists?
77
+ end
78
+ end
79
+ load_state
80
+ run
81
+ end
82
+
83
+ def start
84
+ @printer.print_intro
85
+ @printer.print_line " Before we start, I've got some questions "
86
+ prompt_user_for_option 'source_directory?'
87
+ @state.src = @option_getter.source_directory
88
+ if @state.config_file_exists?
89
+ @prompt.say("I've found a rowr save file.")
90
+ continue_resp = @prompt.select('Would you like to continue or reset?', %w(Continue Reset))
91
+ if continue_resp == 'Reset'
92
+ return reset # reset already calls prep, so stop here
93
+ else
94
+ return continue # continue already calls run, so stop here
95
+ end
96
+ end
97
+ prep
98
+ end
99
+
100
+ def reset
101
+ zipper = Rowr::Zipper.new Dir.pwd
102
+ if zipper.backup_file_exists?
103
+ @option_getter.source_directory = Dir.pwd
104
+ else
105
+ loop do
106
+ @prompt.warn("I can't find a rowr backup file in this directory. Where is the old site?")
107
+ prompt_user_for_option 'source_directory?'
108
+ zipper.src = @option_getter.source_directory
109
+ break if zipper.backup_file_exists?
110
+ end
111
+ end
112
+ zipper.restore
113
+ prep
114
+ end
115
+
116
+ def prep
117
+ gather_options
118
+ @config = @option_getter.generate_hash
119
+ @state.save_config(@config)
120
+ @printer.print_line " Let's get started "
121
+ @printer.line_break 0.5
122
+ run
123
+ end
124
+
125
+ def run
126
+ # Backup
127
+ zipper = Rowr::Zipper.new @config[:source_directory]
128
+ zipper.backup
129
+
130
+ # Prep the link processor
131
+ init_link_processor(@state.cached)
132
+ files = files_with_exts(@config[:exts_to_use])
133
+ count = files.length
134
+
135
+ # Print the run intro
136
+ @printer.print_line
137
+ @printer.print_line " I've found #{count} files to scan "
138
+ @printer.print_line
139
+ @printer.line_break 0.5
140
+ unless @state.scanned_files.empty?
141
+ count -= @state.scanned_files.length
142
+ @printer.print_line(
143
+ " Skipping #{@state.scanned_files.length} files, previously scanned.",
144
+ '!',
145
+ 'red'
146
+ )
147
+ @printer.line_break 0.5
148
+ end
149
+
150
+ files.each do |f|
151
+ # Skip any previously scanned files
152
+ next if @state.scanned_files.include?(f)
153
+
154
+ # Print the intro
155
+ @printer.print_file_header f
156
+ @printer.line_break 0
157
+
158
+ # Run @link_processor over all matching links
159
+ text = File.read(f)
160
+ @link_processor.containing_file = f
161
+ unless text.valid_encoding?
162
+ text = text.encode('UTF-16be', invalid: :replace, replace: '&nbsp;').encode('UTF-8')
163
+ end
164
+ text = clean_no_quotes(text)
165
+ text = check_urls(text)
166
+ File.write(f, text)
167
+ count -= 1
168
+
169
+ ## Update the state
170
+ @state.scanned_files << f
171
+ @state.cached = @link_processor.cached
172
+ @state.save_state
173
+
174
+ # Print count, onto next file
175
+ @printer.line_break 0
176
+ @printer.print_line " #{count} files left ", '*', 'cyan'
177
+ @printer.line_break 0
178
+ end
179
+
180
+ @printer.print_outro
181
+ end
182
+ end
183
+ end
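The extension filter used by `files_with_exts` can be tried in isolation. The following is a minimal standalone sketch, not the gem's own code: taking the directory as a `source_directory` parameter (instead of reading `@config`) and the throwaway files created in a temporary directory are assumptions made for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical standalone variant of files_with_exts: returns every regular
# file under source_directory whose name ends in one of the given extensions.
def files_with_exts(source_directory, exts)
  # Escape each extension so characters such as "+" or "." are taken literally.
  alternatives = exts.map { |ext| Regexp.escape(ext) }.join('|')
  pattern = /\.(?:#{alternatives})\z/i
  Dir.glob(File.join(source_directory, '**', '*'))
     .select { |path| File.file?(path) && path =~ pattern }
     .sort
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'sub'))
  FileUtils.mkdir_p(File.join(dir, 'archive.html')) # a directory, so it is skipped
  FileUtils.touch(File.join(dir, 'index.HTML'))     # matched case-insensitively
  FileUtils.touch(File.join(dir, 'sub', 'page.htm'))
  FileUtils.touch(File.join(dir, 'notes.txt'))      # extension not requested

  puts files_with_exts(dir, %w(html htm)).map { |f| File.basename(f) }
end
```

Checking `File.file?` matters because a directory whose name ends in `.html` would otherwise be passed to `File.read` later in `run`.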
@@ -0,0 +1,47 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class StateSaver
6
+
7
+ def initialize(src_dir, filename)
8
+ @src = src_dir
9
+ @file = File.expand_path(File.join(@src, filename))
10
+ @config = {}
11
+ @cached = {}
12
+ @scanned_files = []
13
+ end
14
+
15
+ attr_accessor :src
16
+ attr_reader :file
17
+ attr_reader :config
18
+ attr_accessor :scanned_files
19
+ attr_accessor :cached
20
+
21
+ def config_file_exists?
22
+ File.exist?(@file)
23
+ end
24
+
25
+ def save_state
26
+ hashed = {
27
+ config: @config,
28
+ cached: @cached,
29
+ scanned_files: scanned_files
30
+ }
31
+ File.open(@file, 'wb') { |f| f.write JSON.pretty_generate(hashed) }
32
+ end
33
+
34
+ def load_state
35
+ file = JSON.parse(File.read(@file), symbolize_names: true)
36
+ @config = file[:config]
37
+ @cached = file[:cached]
38
+ @scanned_files = file[:scanned_files]
39
+ end
40
+
41
+ def save_config(config)
42
+ @config = config
43
+ save_state
44
+ end
45
+
46
+ end
47
+ end
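The save/load round trip in `StateSaver` has one property worth seeing in isolation: `symbolize_names: true` applies to every JSON object, including nested ones, so string keys in `cached` come back as symbols. The sketch below is a hedged, standalone illustration. The free functions `save_state` and `load_state`, the file name, and the sample URL are hypothetical and not part of the gem.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical free-function versions of StateSaver#save_state / #load_state.
def save_state(path, config, cached, scanned_files)
  state = { config: config, cached: cached, scanned_files: scanned_files }
  File.write(path, JSON.pretty_generate(state))
end

def load_state(path)
  JSON.parse(File.read(path), symbolize_names: true)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'rowr_state.json')
  save_state(
    path,
    { old_host: 'old.example.org' },
    { 'http://old.example.org/a.html' => '/a.html' }, # string key going in
    ['index.html']
  )

  state = load_state(path)
  p state[:cached].keys         # the key is now a Symbol
  p state[:scanned_files]       # array values are unaffected
end
```

If code that consumes the cache looks entries up by string, a reloaded cache would miss them. Whether that affects `LinkProcessor` depends on how it reads `cached`, which this section does not show.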
@@ -0,0 +1,3 @@
1
+ module Rowr
2
+ VERSION = "0.2.0"
3
+ end
@@ -0,0 +1,64 @@
1
+ require 'rowr'
2
+
3
+ module Rowr
4
+
5
+ class Zipper
6
+
7
+ def initialize(src_dir)
8
+ @src = src_dir
9
+ @filename = 'rowr_backup_files'
10
+ @backup_dir = File.join(src, filename)
11
+ end
12
+
13
+ attr_accessor :src
14
+ attr_reader :filename
15
+ attr_reader :backup_dir
16
+
17
+ def backup
18
+ copy
19
+ zip
20
+ remove
21
+ end
22
+
23
+ def remove
24
+ FileUtils.remove_dir(backup_dir)
25
+ end
26
+
27
+ def copy
28
+ FileUtils.mkdir_p backup_dir
29
+ files = Dir.glob(File.join(src, '*')).reject { |f| File.basename(f) == filename }
30
+ FileUtils.cp_r files, backup_dir
31
+ end
32
+
33
+ def zip
34
+ zip = File.join(src, "#{filename}.zip")
35
+ File.delete zip if File.exist?(zip)
36
+ zf = ZipFileGenerator.new(backup_dir, zip)
37
+ zf.write
38
+ end
39
+
40
+ def restore
41
+ delete_all_files_except_backup
42
+ unzip
43
+ end
44
+
45
+ def unzip
46
+ Zip::File.open(File.join(src, "#{filename}.zip")) do |zip_file|
47
+ zip_file.each do |file|
48
+ file_path = File.join(src, file.name)
49
+ FileUtils.mkdir_p(File.dirname(file_path))
50
+ zip_file.extract(file, file_path) unless File.exist?(file_path)
51
+ end
52
+ end
53
+ end
54
+
55
+ def backup_file_exists?
56
+ File.exist?(File.expand_path(File.join(src, "#{filename}.zip")))
57
+ end
58
+
59
+ def delete_all_files_except_backup
60
+ FileUtils.rm_rf(Dir.glob(File.join(src, '*')).reject { |f| File.basename(f) =~ /^rowr_/ })
61
+ end
62
+
63
+ end
64
+ end
@@ -0,0 +1,48 @@
1
+ require 'zip'
2
+
3
+ # This is a simple example which uses rubyzip to
4
+ # recursively generate a zip file from the contents of
5
+ # a specified directory. The directory itself is not
6
+ # included in the archive, rather just its contents.
7
+ #
8
+ # Usage:
9
+ # directoryToZip = "/tmp/input"
10
+ # outputFile = "/tmp/out.zip"
11
+ # zf = ZipFileGenerator.new(directoryToZip, outputFile)
12
+ # zf.write()
13
+ class ZipFileGenerator
14
+
15
+ # Initialize with the directory to zip and the location of the output archive.
16
+ def initialize(inputDir, outputFile)
17
+ @inputDir = inputDir
18
+ @outputFile = outputFile
19
+ end
20
+
21
+ # Zip the input directory.
22
+ def write()
23
+ entries = Dir.entries(@inputDir) - %w(. ..)
24
+ io = Zip::File.open(@outputFile, Zip::File::CREATE)
25
+
26
+ writeEntries(entries, "", io)
27
+ io.close()
28
+ end
29
+
30
+ # A helper method to make the recursion work.
31
+ private
32
+ def writeEntries(entries, path, io)
33
+
34
+ entries.each { |e|
35
+ zipFilePath = path == "" ? e : File.join(path, e)
36
+ diskFilePath = File.join(@inputDir, zipFilePath)
37
+ # puts "Deflating " + diskFilePath
38
+ if File.directory?(diskFilePath)
39
+ io.mkdir(zipFilePath)
40
+ subdir = Dir.entries(diskFilePath) - %w(. ..)
41
+ writeEntries(subdir, zipFilePath, io)
42
+ else
43
+ io.get_output_stream(zipFilePath) { |f| f.write(File.binread(diskFilePath)) }
44
+ end
45
+ }
46
+ end
47
+
48
+ end
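When file contents are copied into an archive entry, the choice between `puts` and `write` decides whether the copy is byte-for-byte. The sketch below uses an in-memory `StringIO` in place of a zip output stream, which is an assumption made so that it runs without rubyzip. The helper names `copy_with_puts` and `copy_with_write` are hypothetical.

```ruby
require 'stringio'

# Hypothetical helpers: copy bytes into an in-memory stream two ways.
def copy_with_puts(bytes)
  out = StringIO.new(String.new) # String.new yields an empty binary string
  out.puts(bytes)                # appends "\n" unless bytes already end in one
  out.string
end

def copy_with_write(bytes)
  out = StringIO.new(String.new)
  out.write(bytes)               # writes the bytes unchanged
  out.string
end

payload = "PK\x03\x04".b # four bytes, no trailing newline

puts copy_with_puts(payload).bytesize  # 5: one byte longer than the input
puts copy_with_write(payload).bytesize # 4: identical length to the input
```

For binary files such as images, the extra trailing byte means the restored file no longer matches the original, so `write` is the safer choice for backups.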
metadata ADDED
@@ -0,0 +1,174 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: rowr
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.2.0
5
+ platform: ruby
6
+ authors:
7
+ - Luke Aeschleman
8
+ - Jason Casden
9
+ autorequire:
10
+ bindir: exe
11
+ cert_chain: []
12
+ date: 2017-10-06 00:00:00.000000000 Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
15
+ name: rubyzip
16
+ requirement: !ruby/object:Gem::Requirement
17
+ requirements:
18
+ - - "~>"
19
+ - !ruby/object:Gem::Version
20
+ version: 1.2.1
21
+ type: :runtime
22
+ prerelease: false
23
+ version_requirements: !ruby/object:Gem::Requirement
24
+ requirements:
25
+ - - "~>"
26
+ - !ruby/object:Gem::Version
27
+ version: 1.2.1
28
+ - !ruby/object:Gem::Dependency
29
+ name: slop
30
+ requirement: !ruby/object:Gem::Requirement
31
+ requirements:
32
+ - - "~>"
33
+ - !ruby/object:Gem::Version
34
+ version: 4.6.0
35
+ type: :runtime
36
+ prerelease: false
37
+ version_requirements: !ruby/object:Gem::Requirement
38
+ requirements:
39
+ - - "~>"
40
+ - !ruby/object:Gem::Version
41
+ version: 4.6.0
42
+ - !ruby/object:Gem::Dependency
43
+ name: thor
44
+ requirement: !ruby/object:Gem::Requirement
45
+ requirements:
46
+ - - "~>"
47
+ - !ruby/object:Gem::Version
48
+ version: '0.19'
49
+ type: :runtime
50
+ prerelease: false
51
+ version_requirements: !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - "~>"
54
+ - !ruby/object:Gem::Version
55
+ version: '0.19'
56
+ - !ruby/object:Gem::Dependency
57
+ name: tty-prompt
58
+ requirement: !ruby/object:Gem::Requirement
59
+ requirements:
60
+ - - "~>"
61
+ - !ruby/object:Gem::Version
62
+ version: 0.13.2
63
+ type: :runtime
64
+ prerelease: false
65
+ version_requirements: !ruby/object:Gem::Requirement
66
+ requirements:
67
+ - - "~>"
68
+ - !ruby/object:Gem::Version
69
+ version: 0.13.2
70
+ - !ruby/object:Gem::Dependency
71
+ name: faraday
72
+ requirement: !ruby/object:Gem::Requirement
73
+ requirements:
74
+ - - "~>"
75
+ - !ruby/object:Gem::Version
76
+ version: 0.13.1
77
+ type: :runtime
78
+ prerelease: false
79
+ version_requirements: !ruby/object:Gem::Requirement
80
+ requirements:
81
+ - - "~>"
82
+ - !ruby/object:Gem::Version
83
+ version: 0.13.1
84
+ - !ruby/object:Gem::Dependency
85
+ name: bundler
86
+ requirement: !ruby/object:Gem::Requirement
87
+ requirements:
88
+ - - "~>"
89
+ - !ruby/object:Gem::Version
90
+ version: '1.13'
91
+ type: :development
92
+ prerelease: false
93
+ version_requirements: !ruby/object:Gem::Requirement
94
+ requirements:
95
+ - - "~>"
96
+ - !ruby/object:Gem::Version
97
+ version: '1.13'
98
+ - !ruby/object:Gem::Dependency
99
+ name: rake
100
+ requirement: !ruby/object:Gem::Requirement
101
+ requirements:
102
+ - - "~>"
103
+ - !ruby/object:Gem::Version
104
+ version: 12.1.0
105
+ type: :development
106
+ prerelease: false
107
+ version_requirements: !ruby/object:Gem::Requirement
108
+ requirements:
109
+ - - "~>"
110
+ - !ruby/object:Gem::Version
111
+ version: 12.1.0
112
+ - !ruby/object:Gem::Dependency
113
+ name: rspec
114
+ requirement: !ruby/object:Gem::Requirement
115
+ requirements:
116
+ - - "~>"
117
+ - !ruby/object:Gem::Version
118
+ version: '3.0'
119
+ type: :development
120
+ prerelease: false
121
+ version_requirements: !ruby/object:Gem::Requirement
122
+ requirements:
123
+ - - "~>"
124
+ - !ruby/object:Gem::Version
125
+ version: '3.0'
126
+ description: Refactors pieces of old websites so they can be hosted again or archived.
127
+ email:
128
+ - lukeaeschleman@gmail.com
129
+ - casden@gmail.com
130
+ executables:
131
+ - rowr
132
+ extensions: []
133
+ extra_rdoc_files: []
134
+ files:
135
+ - CODE_OF_CONDUCT.md
136
+ - LICENSE
137
+ - README.md
138
+ - bin/setup
139
+ - exe/rowr
140
+ - lib/rowr.rb
141
+ - lib/rowr/command_line.rb
142
+ - lib/rowr/link_processor.rb
143
+ - lib/rowr/printer.rb
144
+ - lib/rowr/prompter.rb
145
+ - lib/rowr/resurrector.rb
146
+ - lib/rowr/state_saver.rb
147
+ - lib/rowr/version.rb
148
+ - lib/rowr/zipper.rb
149
+ - lib/zip/zip_file_generator.rb
150
+ homepage: https://github.com/UNC-Libraries/ROWR
151
+ licenses:
152
+ - Apache-2.0
153
+ metadata: {}
154
+ post_install_message:
155
+ rdoc_options: []
156
+ require_paths:
157
+ - lib
158
+ required_ruby_version: !ruby/object:Gem::Requirement
159
+ requirements:
160
+ - - ">="
161
+ - !ruby/object:Gem::Version
162
+ version: '0'
163
+ required_rubygems_version: !ruby/object:Gem::Requirement
164
+ requirements:
165
+ - - ">="
166
+ - !ruby/object:Gem::Version
167
+ version: '0'
168
+ requirements: []
169
+ rubyforge_project:
170
+ rubygems_version: 2.6.13
171
+ signing_key:
172
+ specification_version: 4
173
+ summary: The Really Old Website Refactorer
174
+ test_files: []