webvac 0.1.5 → 0.1.6

Files changed (8)
  1. checksums.yaml +4 -4
  2. data/Changelog +3 -0
  3. data/README +27 -22
  4. data/Rakefile +11 -0
  5. data/config.ru +5 -4
  6. data/doc/TODO +0 -3
  7. data/lib/webvac.rb +6 -2
  8. metadata +4 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: dd9d4c498fdf6c8497e3f3ac3799621488304111
- data.tar.gz: 3f41f908d153a5aa880a098389f0bad4e6e95640
+ metadata.gz: c0be13ddc03fb863eaf5ada9ed4f554d3afc3d64
+ data.tar.gz: a4dc5592881d294c45bd2fbfe77ff4a68881b19a
  SHA512:
- metadata.gz: 1359633e7236896e71eecd91bd0fdc04d0941b3868292979d42a6dfd5dbf58d29aa7ca827bb0c9ef56b8d84f7b6087a4e53f4f70826f6fdff445925c65d89350
- data.tar.gz: bf08f77ec60e7b61eea7855de97139b3e180d2447e4bb863ef768ad93d12b00f1f71409bd1d3a6760c617768b2f4115ba3931b8ce5bd8e6b865e381be132a109
+ metadata.gz: 297b5ac456d6a199bc216a9079610f2e0b8c8a82745de232973443542de47e4d7ce3df7ef18fac2b0e8bb7667a7ec19547eb98990cea5ef58d777358317515f8
+ data.tar.gz: 568910a94852ee97daa01977aef1623ad989ddc07ef0f45236ae4929e9bd0b81bc70d33c1ba2c8f5d0ddecafd45f64777846f4ab74a94b1905c6a87860107a45
data/Changelog ADDED
@@ -0,0 +1,3 @@
+ Sat Jun 20 22:59:30 PDT 2020
+ * webvac now streams. This cuts RAM usage and latency. I tested it in production, because I am a gangsta (and because rollback is easy).
+ * First Changelog entry!
data/README CHANGED
@@ -1,29 +1,29 @@
  This is a somewhat specialized chunk of code!
  
- THIS IS ALSO BETA SOFTWARE. CAVEAT EMPTOR.
-
  Using a lookup table in Redis and venti as the backing storage, webvac
- serves static content. It comes with utilities "webvac-sweep" and
- "webvac-unsweep" that push things into venti and remove them from the
- filesystem and place the things back into the filesystem, respectively.
- The 'webvac' program itself starts up a webserver. Give these programs
- '-h' or '--help' to see helpful(?) information, and see below for
- configuration.
-
- Currently I am only using it to serve uploads for UGC in Pleroma. I
- have been using venti to do incremental backups of the data, and since
+ serves static content. It comes with utility `webvac-sweep`, that
+ pushes content into venti, optionally removing it from the filesystem.
+ The `webvac-server` program itself starts up a webserver. Give these
+ programs '-h' or '--help' to see helpful(?) information, and see below
+ for configuration.
+
+ Currently I am only using it to serve uploads for UGC in some small- to
+ moderately high-traffic Pleroma instances. I have been using venti to
+ do incremental backups of the data (replication is speedy), and since
  the uploads are WORM ("write once, read many") data, I thought it'd be
- cool to serve the data directly out of venti.
+ cool to serve the data directly out of venti. It worked better than
+ expected!
  
  I expect to use this more often and thus expect it to become a bit more
- general as a result.
+ general as a result, but for right now, it makes a couple of assumptions
+ about where it serves files.
  
  = Quick Start
  
  [Check that your venti and Redis servers are operational.]
  $ sudo ed /etc/webvac.json
  a
- '{"server_path_prepend":"/where/uploads/get/put/in/the/filesystem"}'
+ {"server_path_prepend":"/where/uploads/get/put/in/the/filesystem"}
  .
  wq
  $ sudo $EDITOR /etc/nginx/whatever
@@ -47,12 +47,12 @@ to configure probably one thing.
  
  = Overhead
  
- For about 100GB of files, venti takes 86GB of disk (no surprise, since
- it's mostly JPGs and MP4s, so it's already compressed; the savings are
- probably from dedup), and Redis takes about 60MB of RAM for this. All
- of the files were put into venti as part of the backup solution, but the
- originals weren't removed if they were bigger than 4MB (see doc/TODO).
- The workers take some RAM to run. CPU overhead is negligible.
+ Just empirically, for about 100GB of files, venti takes 86GB of disk (no
+ surprise, since it's mostly JPGs and MP4s, so it's already compressed;
+ the savings are probably from dedup), and Redis takes about 60MB of RAM
+ for this. All of the files were put into venti as part of the backup
+ solution. CPU overhead is negligible, the server takes about 200MB of
+ RAM for both workers.
  
  = Installation
  
@@ -110,7 +110,12 @@ server_path_prepend.
  
  = Usage
  
- Afer configuring, you can run the server with `webvac-server`. This will actually serve the content from venti, as long as it is present in the path→score index in Redis (so you can remove content as needed by just removing items from the index). In order to add items, you run `webvac-sweep`.
+ Afer configuring, you can run the server with `webvac-server`. This
+ will actually serve the content from venti, as long as it is present in
+ the path→score index in Redis (so you can remove content as needed by
+ just removing items from the index). In order to add items, you run
+ `webvac-sweep`. You can also delete the file (which will only happen if
+ the sweep is successful) with `-d`.
  
  = TODO
  
@@ -124,4 +129,4 @@ See doc/TODO
  
  = I feel dirty
  
- You can throw BTC at 1BZz3ndJUoWhEvm1BfW3FzceAjFqKTwqWV . Proceeds will go to funding the instance hosting thing.
+ You can throw Bitcoin (BTC) at this address: 1BZz3ndJUoWhEvm1BfW3FzceAjFqKTwqWV . Proceeds will go to funding the instance hosting thing.
data/Rakefile CHANGED
@@ -31,3 +31,14 @@ desc "Runs IRB, automatically require()ing #{spec.name}."
  task(:irb) {
  exec "irb -Ilib -r#{spec.name}"
  }
+
+ desc "Runs IRB, automatically require()ing #{spec.name}, with "\
+ "acme-suitable options"
+ task(:airb) {
+ exec "irb -Ilib -r#{spec.name} --prompt default --noreadline"
+ }
+
+ desc "Runs nginx test server."
+ task(:nginx) {
+ exec "nginx", "-c", "#{__dir__}/doc/nginx.example.conf"
+ }
data/config.ru CHANGED
@@ -32,6 +32,9 @@ module WebVac
  'Content-Disposition' => "filename=\"#{name}\"",
  'Fortune' => Fortunes.sample,
  }.merge!(t.metadata(score) || {}).tap { |h|
+ # Now that we stream, this condition never happens.
+ # Maybe it's a good idea to wrap the IO object and fill in
+ # the extra data as needed.
  if contents
  h['Content-Type'] ||= t.guess_mime(contents) rescue nil
  h['Content-Length'] ||= contents.bytesize.to_s
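
The comment added in this hunk floats the idea of wrapping the IO object. Below is a minimal sketch of what such a wrapper might look like, assuming only the usual Rack body contract (an object that responds to `each` and yields Strings, with an optional `close`). `ChunkedBody` and the 16 KiB chunk size are hypothetical and are not part of webvac. A bare IO also responds to `each`, but it yields line by line, which gives irregular chunk sizes for binary data.

```ruby
require 'stringio'

# Hypothetical wrapper, not part of webvac: adapts any IO-like object to a
# response body that yields fixed-size chunks instead of lines.
class ChunkedBody
  CHUNK_SIZE = 16 * 1024 # arbitrary value chosen for illustration

  def initialize(io, chunk_size = CHUNK_SIZE)
    @io = io
    @chunk_size = chunk_size
  end

  # Yields successive chunks until the underlying IO reaches EOF
  # (IO#read with a positive length returns nil at EOF).
  def each
    return enum_for(:each) unless block_given?

    while (chunk = @io.read(@chunk_size))
      yield chunk
    end
  end

  # A server calls close, when it is defined, once the response is finished.
  def close
    @io.close unless @io.closed?
  end
end

# Usage with a StringIO standing in for the pipe returned by load_io.
body = ChunkedBody.new(StringIO.new('x' * 40_000))
chunk_sizes = body.each.map(&:bytesize)
body.close
```

A wrapper like this cannot supply Content-Length up front, since the total size is only known after the whole stream has been read.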
@@ -66,14 +69,12 @@ module WebVac
  s = tab.path2score p
  return [404, {}, ["404 Not found\nNo such path: #{p}\n"]] unless s
  ct = Time.parse(env['HTTP_IF_MODIFIED_SINCE']) rescue nil
+ hs = headers(tab, p, s, nil)
  if ct && ct.to_i > 0
- hs = headers(tab, p, s, nil)
  mt = Time.parse(hs['Last-Modified']) rescue nil
  return [304, hs, []] if mt && mt > ct
  end
- contents = vac.load! s
- hs = headers(tab, p, s, contents)
- [200, headers(tab, p, s, contents), [contents]]
+ [200, hs, vac.load_io(s)]
  }
  end
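
For comparison with the If-Modified-Since branch above, here is a sketch of the conventional rule from RFC 7232: answer 304 when the stored modification time is no later than the date the client sent. `parse_http_date` and `not_modified?` are hypothetical helpers, not webvac code, and they use `Time.httpdate` where the hunk uses `Time.parse`. The comparison direction follows the RFC and may not match the hunk.

```ruby
require 'time'

# Hypothetical helper, not part of webvac: parses an HTTP date header value,
# returning nil for a missing or malformed value.
def parse_http_date(value)
  return nil if value.nil?

  Time.httpdate(value)
rescue ArgumentError
  nil
end

# Hypothetical helper: true when a 304 response is appropriate, i.e. the
# resource was last modified at or before the client's cached date.
def not_modified?(if_modified_since, last_modified)
  since = parse_http_date(if_modified_since)
  mtime = parse_http_date(last_modified)
  return false unless since && mtime

  mtime <= since
end

# Client cached the file on 21 June; the file dates from 9 June.
fresh = not_modified?('Sun, 21 Jun 2020 00:00:00 GMT',
                      'Tue, 09 Jun 2020 00:00:00 GMT')
```

When either header is absent or unparseable, the sketch falls through to a full response, which is the safe default.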
  
data/doc/TODO CHANGED
@@ -2,9 +2,6 @@ Unordered:
  
  · The closure abuse in Serv precludes using the URL to generate the routes.
  This needs a fix in order to generalize beyond Pleroma.
- · Should be easy to stream rather than loading everything into memory,
- but until then, big-ish files (≈4MB) take a second to get out of venti.
- Obviously, it'll be faster and more reliable to implement the venti protocol.
  · Stats and webvac-unsweep. This will allow hot objects to be swapped out of
  venti.
  · Implement the venti protocol instead of calling $plan9bin/vac.
data/lib/webvac.rb CHANGED
@@ -89,14 +89,18 @@ module WebVac
  io.read.chomp.sub(/^vac:/, '')
  end
  
- def load! vac
+ def load_io vac
  unless /^vac:[a-f0-9]{40}$/.match(vac)
  raise ArgumentError, "#{vac.inspect} not a vac score?"
  end
  IO.popen(
  {'venti' => config.venti_server},
  ["#{config.plan9bin}/unvac", '-c', vac]
- ).tap { |io| Thread.new { Process.wait(io.pid) } }.read
+ ).tap { |io| Thread.new { Process.wait(io.pid) } }
+ end
+
+ def load! vac
+ load_io(vac).read
  end
  end
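
The `load_io` change above hands back the pipe itself instead of its fully read contents. Here is a rough sketch of the same idea outside webvac, assuming nothing about `unvac`: the child is a throwaway Ruby process, the `venti` value is a placeholder, and the child is reaped by `io.close` rather than by a background thread as in the hunk.

```ruby
require 'rbconfig'

# Stand-in for the unvac call: a child Ruby process that writes 900,000 bytes
# to stdout, built from the environment variable it was handed.
child_script = '$stdout.write(ENV.fetch("venti") * 100_000)'
command = [RbConfig.ruby, '-e', child_script]

# Env hash first, argv array second, as in the hunk. With an array command,
# no shell is involved. 'localhost' is a placeholder value.
io = IO.popen({ 'venti' => 'localhost' }, command)

total = 0
largest_chunk = 0
# Read in bounded chunks, so at most 64 KiB of payload is held at a time.
while (chunk = io.read(64 * 1024))
  total += chunk.bytesize # a server would write the chunk to the client here
  largest_chunk = chunk.bytesize if chunk.bytesize > largest_chunk
end

io.close # for a popen pipe, close waits for the child and sets $?
exit_ok = $?.success?
```

Reading everything with a single `read`, as `load!` still does, holds the whole payload in memory at once; the loop never holds more than one chunk.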
  
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: webvac
  version: !ruby/object:Gem::Version
- version: 0.1.5
+ version: 0.1.6
  platform: ruby
  authors:
  - Pete
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-06-09 00:00:00.000000000 Z
+ date: 2020-06-21 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: redic
@@ -91,7 +91,9 @@ extra_rdoc_files:
  - doc/TODO
  - doc/nginx.example.conf
  - README
+ - Changelog
  files:
+ - Changelog
  - README
  - Rakefile
  - bin/webvac-server