giblish 0.3.1 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 963e7aeee8afe72afe253d10a5ef3f9fb8538af3668838d18ac6539d388f64a1
- data.tar.gz: 32916267003809333978cb261e459795b14d36e29e4d2f199ab21b9a02ba791a
+ metadata.gz: a6a7fc123856a6321bbad8b7f8f527490a09faa9df0ebeeb1a984bbda4b22d78
+ data.tar.gz: 2788cc7789ed4046307d9d4a7ad75be7f7b0f12c30a1b559500b18166602e966
  SHA512:
- metadata.gz: d0c13cd9f84c9bba7a241eb156b843145026b9e199c13255a518e487fc62c1968417c7482445457750c1c825b4b671ee80076459442f24d561a1eb670f46843c
- data.tar.gz: '084d83eb182a093f5b1dff2bbfd0074b6d3d9f4657b7661c7395dd4fdf1e7755d88d3342f3d1a40fc529e798543390d9e7ffefcc702a51cbbe5ecb00c7e6106c'
+ metadata.gz: 2ea03abe46dd0e23c06266e8b1a9bc444811931825a2155e9545ae911d9527fd7ad993d18287fd2ad5b7ce6a0e51ce40421d2c01454ec357161300969d7bdf57
+ data.tar.gz: 308d7306d39641442e17fa7b5a1687e8f7ae473e438c6eabf3ffbb3650521b091e15a8b0ed36700cb41cad917fbcd9c3f95646b38d2d4199c112eaa75b00ca7b
@@ -1 +1 @@
- 2.5.3
+ 2.6.1
@@ -7,17 +7,31 @@ image::https://travis-ci.com/rillbert/giblish.svg?branch=master[build status]
 
  giblish is used to convert a source directory tree containing AsciiDoc files to
  a destination directory tree containing the corresponding html or pdf files
- and add a handy index page for the converted files.
-
- If the source directory tree is part of a git repository, giblish can generate
- separate html/pdf trees for branches and/or tags that match a user specified
- regexp (see examples below).
+ and add some handy tools for easier navigation of the resulting files.
+
+ The tools include:
+
+ * An index page listing all rendered documents with clickable links
+ * Document ids - Note: the implementation of this is giblish-specific and thus
+ you need to render your adoc files using giblish to make this work as intended.
+ You can use document ids to:
+ ** Reference one doc in the source tree from another doc without depending on file
+ names or relative paths. The referenced doc can thus be moved within the source
+ tree or change its file name and the reference will still be valid.
+ ** Validate doc id references during document rendering and thus be alerted to
+ any invalid doc id references.
+ ** Let giblish generate a clickable graph of all document references (requires
+ graphviz and the 'dot' tool).
+ * A (stripped-down but nonetheless useful) text-search of your documents (requires
+ that you view your docs via a web-server).
+ * If the source directory tree is part of a git repository, giblish can generate
+ separate html/pdf trees for branches and/or tags that match a user specified
+ regexp (see examples below).
 
  == Dependencies and credits
 
- Giblish is basically a wrapper (with some extra candy) around the awesome
- *asciidoctor* and *asciidoctor-pdf* projects. Thank you @mojavelinux and others for
- making these brilliant tools available!!
+ Giblish uses the awesome *asciidoctor* and *asciidoctor-pdf* projects under the hood.
+ Thank you @mojavelinux and others for making these brilliant tools available!!
 
  == Installation
 
@@ -119,13 +133,14 @@ A summary page containing links to all branches will be generated directly in
  the `my_dst_root` dir.
  ====
 
- .Advanced usage; Publish a static html site from a git repo
+ .Advanced usage; Publish a static html site from a git repo with search capabilities
  ====
  giblish can be used to inject a tree of html docs suitable for serving via a web
  server (e.g. Apache). Below is an example of how to create such a tree. If you
  combine this with a server side git hook that invokes this script after push,
  you will have a way to auto-publish your latest documents and/or documents at
- specific git tags. In principle a poor-mans document managing system.
+ specific git tags. A document management system including nice index pages and
+ text search capabilities.
 
  Assumptions:
 
@@ -142,7 +157,7 @@ Assumptions:
  * You want to publish the documentation as it looked for your release tags
  myprod-v1.0-final, myprod-v2.0-final, ...
 
- giblish -t "-final$" -r ~/gh/myrepo/common/resources -s mylayout -w /var/www/html ~/gh/myrepo/common/Documents /var/www/html/proddocs
+ giblish -m -t "-final$" -r ~/gh/myrepo/common/resources -s mylayout -w /var/www/html ~/gh/myrepo/common/Documents /var/www/html/proddocs
 
  The above will create a tree of html docs under `/var/www/html/proddocs`. Each
  tag will get its own subdir (e.g. `/var/www/html/proddocs/myprod_v1.0_final`).
@@ -152,4 +167,10 @@ subdir and also to the .../proddocs dir.
  The `-w` switch above will strip the `/var/www/html` from the css link so that
  the paths to the css will be correct in the context of the serving of the
  pages via the web server.
+
+ The `-m` switch above will build a database (JSON file) with enough information
+ to enable a cgi-script to provide a text-search capability for your users. The
+ cgi-script must be located at http://your_web_site.com/cgi-bin/giblish-search.cgi
+ and this gem provides a default implementation that you can copy from the .../lib
+ folder to the correct destination.
  ====
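
The search script added in this release documents the JSON database as a list of `file_infos`, each with a `filepath`, a `title` and a list of `sections` (`id`, `title`, `line_no`). As a rough sketch of how such an index can map a matching source line to a link target, the following picks the closest heading at or above the match. The sample data and the helper name `closest_section` are illustrative assumptions and not part of giblish.

```ruby
require "json"

# Hypothetical sample index; the key names follow the format documented in the
# search script, the values are made up for illustration.
heading_db_json = <<~JSON
  {
    "file_infos": [{
      "filepath": "guides/setup.adoc",
      "title": "Setup Guide",
      "sections": [
        {"id": "_intro", "title": "Intro", "line_no": 1},
        {"id": "_install", "title": "Install", "line_no": 10},
        {"id": "_usage", "title": "Usage", "line_no": 25}
      ]
    }]
  }
JSON

# Returns the section whose heading is the closest one at or above line_no,
# or nil when the line precedes every heading.
def closest_section(sections, line_no)
  sections
    .select { |s| Integer(s["line_no"]) <= line_no }
    .max_by { |s| Integer(s["line_no"]) }
end

db = JSON.parse(heading_db_json)
file_info = db["file_infos"].first

# A match on line 12 of the source file belongs to the "Install" section.
section = closest_section(file_info["sections"], 12)
html_path = file_info["filepath"].sub(/\.adoc\z/, ".html")
puts "#{html_path}##{section["id"]}"
```

Selecting by maximum line number does not depend on the order of the sections in the JSON file.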
data/Rakefile CHANGED
@@ -10,7 +10,9 @@ end
  Rake::TestTask.new(:current) do |t|
  t.libs << "test"
  t.libs << "lib"
- t.test_files = FileList["test/**/depgraph_test.rb"]
+ t.test_files = FileList["test/**/docid_test.rb"]
+ # t.test_files = FileList["test/**/index_heading_test.rb"]
+ # t.test_files = FileList["test/**/depgraph_test.rb"]
  end
 
  Rake::TestTask.new(:sandbox) do |t|
@@ -35,7 +35,7 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "rake", "~> 10.0"
 
  # Usage: spec.add_runtime_dependency "[gem name]", [[version]]
- spec.add_runtime_dependency "asciidoctor", "~>1.5", ">= 1.5.7.1"
+ spec.add_runtime_dependency "asciidoctor", "~>1.5", ">= 1.5.8"
  spec.add_runtime_dependency "asciidoctor-diagram", ["~> 1.5"]
  spec.add_runtime_dependency "asciidoctor-pdf", [">= 1.5.0.alpha.16"]
  spec.add_runtime_dependency "asciidoctor-rouge", ["~> 0.3"]
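
The tightened asciidoctor constraint combines a pessimistic bound (`~> 1.5`) with a new lower bound (`>= 1.5.8`). A small sketch using RubyGems' own `Gem::Requirement` shows which versions the combined constraint admits. The list of candidate versions is chosen only for illustration.

```ruby
require "rubygems"

# The runtime dependency constraint from the updated gemspec.
requirement = Gem::Requirement.new("~> 1.5", ">= 1.5.8")

# Candidate versions picked for illustration only.
%w[1.5.7.1 1.5.8 1.6.0 2.0.0].each do |v|
  allowed = requirement.satisfied_by?(Gem::Version.new(v))
  puts "asciidoctor #{v}: #{allowed ? "allowed" : "rejected"}"
end
```

The previous lower bound, 1.5.7.1, no longer satisfies the constraint, and anything from 2.0 upwards is still excluded by the `~> 1.5` part.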
@@ -0,0 +1,444 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "pathname"
4
+ require "json"
5
+ require "asciidoctor"
6
+ require "open3"
7
+ require "cgi"
8
+ require "uri/generic"
9
+
10
+ class GrepDocTree
11
+ Line_info = Struct.new(:line, :line_no) {
12
+ def initialize(line,line_no)
13
+ self.line = line
14
+ self.line_no = Integer(line_no)
15
+ end
16
+ }
17
+
18
+ # grep_opts:
19
+ # :search_top
20
+ # :search_phrase
21
+ # :ignorecase
22
+ # :useregexp
23
+ def initialize(grep_opts)
24
+ @grep_opts = "-nHr --include '*.adoc' "
25
+ @grep_opts += "-i " if grep_opts[:ignorecase]
26
+ @grep_opts += "-F " unless grep_opts[:useregexp]
27
+
28
+ @search_root = grep_opts[:search_top]
29
+ @input = grep_opts[:search_phrase]
30
+
31
+ @output = ""
32
+ @error = ""
33
+ @status = 0
34
+ @match_index = {}
35
+ end
36
+
37
+ def grep
38
+ # This console code sequence will only show the matching word in bold ms=01:mc=:sl=:cx=:fn=:ln=:bn=:se=
39
+ grep_env="GREP_COLORS=\"ms=01:mc=:sl=:cx=:fn=:ln=:bn=:se=\""
40
+ @grep_opts += " --color=always"
41
+
42
+
43
+ @output, @error, @status = Open3.capture3("#{grep_env} grep #{@grep_opts} \"#{@input}\" #{@search_root}")
44
+
45
+ begin
46
+ @output.force_encoding(Encoding::UTF_8)
47
+ @output.gsub!(/\x1b\[01m\x1b\[K/,"##")
48
+ @output.gsub!(/\x1b\[m\x1b\[K/,"##")
49
+ rescue StandardError => e
50
+ print e.message
51
+ print e.backtrace.inspect
52
+ exit 0
53
+ end
54
+
55
+ grep2hash @search_root
56
+ end
57
+
58
+ # returns an indexed output where each match from the search is associated with the
59
+ # corresponding src file's closest heading.
60
+ # the format of the output:
61
+ # {html_filename#heading : [line_1, line_2, ...], ...}
62
+ #
63
+ # The heading_db has the following JSON format
64
+ # {
65
+ # file_infos : [{
66
+ # filepath : filepath_1,
67
+ # title : Title,
68
+ # sections : [{
69
+ # id : section_id_1,
70
+ # title : section_title_1,
71
+ # line_no : line_no
72
+ # },
73
+ # {
74
+ # id : section_id_2,
75
+ # title : section_title_2,
76
+ # line_no : line_no
77
+ # },
78
+ # ...
79
+ # ]
80
+ # },
81
+ # {
82
+ # filepath : filepath_2,
83
+ # ...
84
+ # }]
85
+ # }
86
+ def match_with_headings heading_db
87
+ matches = []
88
+
89
+ # for each file with at least one match
90
+ @match_index.each do |file_path,match_infos|
91
+ # assume that max one file with the specified path
92
+ # exists
93
+ files = heading_db["file_infos"].select do |fi|
94
+ fi["filepath"] == file_path.to_s
95
+ end
96
+ next if files.empty?
97
+
98
+ file_anchors = construct_user_info files.first, match_infos
99
+ matches << file_anchors
100
+ end
101
+ matches
102
+ end
103
+
104
+ # Produce a hash with all info needed for the user to navigate to the
105
+ # matching html section for all matches to the file in the supplied file
106
+ # info hash.
107
+ #
108
+ # format of the resulting hash:
109
+ # {
110
+ # filepath : Filepath,
111
+ # title : Title,
112
+ # matches : {
113
+ # section_id :
114
+ # {
115
+ # section_title : Section Title,
116
+ # location : Location,
117
+ # lines : [line_1, line_2, ...]
118
+ # }
119
+ # }
121
+ # }
122
+ #
123
+ def construct_user_info file_info, match_infos
124
+ matches = {}
125
+ file_anchors = {
126
+ "filepath" => file_info["filepath"],
127
+ "title" => file_info["title"],
128
+ "matches" => matches
129
+ }
130
+
131
+ match_infos.each do |match_info|
132
+ match_line_nr = match_info.line_no
133
+
134
+ # find section with closest lower line_no to line_info
135
+ best_so_far = 0
136
+ chosen_section_info = {}
137
+ file_info["sections"].each do |section_info|
138
+ l = Integer(section_info["line_no"])
139
+ if l <= match_line_nr && l > best_so_far
140
+ chosen_section_info = section_info
+ best_so_far = l
141
+ end
142
+ end
143
+
144
+ matches[chosen_section_info["id"]] =
145
+ {
146
+ "section_title" => chosen_section_info["title"],
147
+ "location" => "#{Pathname.new(file_info["filepath"]).sub_ext(".html").to_s}##{chosen_section_info["id"]}",
148
+ "lines" => []
149
+ } unless matches.key?(chosen_section_info["id"])
150
+ matches[chosen_section_info["id"]]["lines"] << match_info.line
151
+ end
152
+ file_anchors
153
+ end
154
+
155
+ def formatted_output
156
+ # assume we have an updated index
157
+ adoc_str = ""
158
+ @match_index.each do |k,v|
159
+ adoc_str += "#{k}::\n"
160
+ v.each { |line_info|
161
+ adoc_str += "#{line_info.line_no} : #{line_info.line}\n"
162
+ }
163
+ end
164
+ adoc_str
165
+ end
166
+
167
+ private
168
+
169
+ # converts the 'raw' matches from grep into a hash.
170
+ # i.e. from:
171
+ # <filename>:<line_no>:<line>
172
+ # <filename>:<line_no>:<line>
173
+ # ...
174
+ #
175
+ # to
176
+ # {file_path : [line_info1, line_info2, ...], ...}
177
+ def grep2hash(base_dir)
178
+ @match_index = {}
179
+ @output.split("\n").each do |line|
180
+ tokens = line.split(":",3)
181
+
182
+ # remove all lines starting with :<attrib>:
183
+ tokens[2].gsub!(/^:[[:graph:]]+:.*$/,"")
184
+ next if tokens[2].empty?
185
+
186
+ # remove everything above the repo root from the filepath
187
+ file_path = Pathname.new(tokens[0]).relative_path_from Pathname.new(base_dir)
188
+ @match_index[file_path] = [] unless @match_index.key? file_path
189
+ @match_index[file_path] << Line_info.new(tokens[2], tokens[1])
190
+ end
191
+ end
192
+ end
193
+
194
+ class SearchDocTree
195
+ def initialize(input_data)
196
+ @input_data = input_data
197
+ end
198
+
199
+ def search
200
+ # read the heading_db from file
201
+ jsonpath = @input_data[:search_top].join("heading_index.json")
202
+ src_index = {}
203
+ json = File.read(jsonpath.to_s)
204
+ src_index = JSON.parse(json)
205
+
206
+ # search the doc tree for regex
207
+ gt = GrepDocTree.new @input_data
208
+ gt.grep
209
+
210
+ matches = gt.match_with_headings src_index
211
+
212
+ format_search_adoc matches, get_uri_top
213
+ end
214
+
215
+ private
216
+
217
+ def get_uri_top
218
+ if @input_data[:gitbranch]
219
+ return @input_data[:referer][0,@input_data[:referer].rindex('/')]
220
+ end
221
+ return @input_data[:referer].chomp('/')
222
+ end
223
+
224
+ def wash_line line
225
+ # remove any '::'
226
+ result = line.gsub(/::*/,"")
227
+ # remove =,| at the start of a line
228
+ result.gsub!(/^[=|]+/,"")
229
+ result
230
+ end
231
+
232
+ # index is an array of file_info, see construct_user_info
233
+ # for format per file
234
+ # == Title (filename)
235
+ #
236
+ # <<location,section_title>>::
237
+ # line_1
238
+ # line_2
239
+ # ...
240
+ def format_search_adoc index,uri_top
241
+ str = ""
242
+ index.each do |file_info|
243
+ filename = Pathname.new(file_info["filepath"]).basename
244
+ str << "== #{file_info["title"]}\n\n"
245
+ file_info["matches"].each do |section_id, info |
246
+ str << "#{uri_top}/#{info["location"]}[#{info["section_title"]}]::\n\n"
247
+ # str << "<<#{info["location"]},#{info["section_title"]}>>::\n\n"
248
+ str << "[subs=\"quotes\"]\n"
249
+ str << "----\n"
250
+ info["lines"].each do | line |
251
+ str << "-- #{wash_line(line)}\n"
252
+ end
253
+ str << "----\n"
254
+ end
255
+ str << "\n"
256
+ end
257
+
258
+ <<~ADOC
259
+ = Search Result
260
+
261
+ #{str}
262
+ ADOC
263
+ end
264
+ end
265
+
266
+ def init_web_server web_root
267
+ require 'webrick'
268
+
269
+ root = File.expand_path web_root
270
+ puts "Trying to start a WEBrick instance at port 8000 serving files from #{web_root}..."
271
+
272
+ server = WEBrick::HTTPServer.new(
273
+ :Port => 8000,
274
+ :DocumentRoot => root,
275
+ :Logger => WEBrick::Log.new("webrick.log",WEBrick::Log::DEBUG)
276
+ )
277
+
278
+ puts "WEBrick instance now listening to localhost:8000"
279
+
280
+ trap 'INT' do server.shutdown end
281
+
282
+ server.start
283
+ end
284
+
285
+ def hello_world
286
+ require "pp"
287
+
288
+ # init a new cgi 'connection'
289
+ cgi = CGI.new
290
+ print cgi.header
291
+ print "<br>"
292
+ print "Useful cgi parameters and variables."
293
+ print "<br>"
294
+ print cgi.public_methods(false).sort
295
+ print "<br>"
296
+ print "<br>"
297
+ print "referer: #{cgi.referer}<br>"
298
+ print "path: #{URI(cgi.referer).path}<br>"
299
+ print "host: #{cgi.host}<br>"
300
+ print "client_sent_topdir: #{cgi["topdir"]}<br>"
301
+ print "<br>"
302
+ print "client_sent_reldir: #{cgi["reltop"]}<br>"
303
+ print "<br>"
304
+ print "ENV: "
305
+ pp ENV
306
+ print "<br>"
307
+ end
308
+
309
+ def cgi_main cgi
310
+ # retrieve the form data supplied by user
311
+ input_data = {
312
+ search_phrase: cgi["searchphrase"],
313
+ ignorecase: cgi.has_key?("ignorecase"),
314
+ useregexp: cgi.has_key?("useregexp"),
315
+ doc_root_abs: Pathname.new(cgi["topdir"]),
316
+ referer_rel_top: Pathname.new("/#{cgi["reltop"]}"),
317
+ referer: cgi.referer,
318
+ uri_path: URI(cgi.referer).path,
319
+ client_css: cgi["css"],
320
+ search_top: nil,
321
+ styles_top: nil
322
+ }
323
+
324
+ # fixup paths depending on git branch or not
325
+ #
326
+ # search_assets is an absolute path
327
+ # styles_top is a relative path
328
+ #
329
+ # if the source was rendered from a git branch, the paths
330
+ # search_assets = <index_dir>/../search_assets/<branch_name>/
331
+ # styles_dir = ../web_assets/css
332
+ #
333
+ # and if not, the path is
334
+ # search_assets = <index_dir>/search_assets
335
+ # styles_dir = ./web_assets/css
336
+ #
337
+ # The styles dir shall be a relative path
338
+ if input_data[:doc_root_abs].join("./search_assets").exist?
339
+ # this is not from a git branch
340
+ input_data[:search_top] = input_data[:doc_root_abs].join("./search_assets")
341
+ # input_data[:styles_top] = Pathname.new(input_data[:uri_path]).join("./web_assets/css")
342
+ input_data[:styles_top] = Pathname.new(input_data[:referer_rel_top]).join("web_assets/css")
343
+ input_data[:gitbranch] = false
344
+ elsif input_data[:doc_root_abs].join("../search_assets").exist?
345
+ # this is from a git branch
346
+ input_data[:search_top] = input_data[:doc_root_abs].join("../search_assets").join(input_data[:doc_root_abs].basename)
347
+ input_data[:styles_top] = Pathname.new(input_data[:referer_rel_top]).join("../web_assets/css")
348
+ input_data[:gitbranch] = true
349
+ else
350
+ raise ScriptError, "Could not find search_assets dir!"
351
+ end
352
+
353
+ # use a relative stylesheet (same as the index page was rendered with)
354
+ adoc_options = {
355
+ "data-uri" => 1,
356
+ "linkcss" => 1,
357
+ "stylesdir" => input_data[:styles_top].to_s,
358
+ "stylesheet" => input_data[:client_css],
359
+ "copycss!" => 1
360
+ }
361
+
362
+ # search the docs and render html
363
+ sdt = SearchDocTree.new(input_data)
364
+ docstr = sdt.search
365
+
366
+ # send the result back to the client
367
+ print Asciidoctor.convert docstr, header_footer: true, attributes: adoc_options
368
+ end
369
+
370
+ # assume that the file tree looks like this when running
371
+ # on a git branch:
372
+ #
373
+ # dst_root_dir
374
+ # |- branch_1_top_dir
375
+ # | |- index.html
376
+ # | |- file_1.html
377
+ # | |- dir_1
378
+ # | | |- file2.html
379
+ # |- branch_2_top_dir
380
+ # |- branch_x_...
381
+ # |- web_assets
382
+ # |- search_assets
383
+ # | |- branch_1_top_dir
384
+ # | |- heading_index.json
385
+ # | |- file1.adoc
386
+ # | |- dir_1
387
+ # | | |- file2.html
388
+ # | |- ...
389
+ # | |- branch_2_top_dir
390
+ # | | ...
391
+
392
+ # assume that the file tree looks like this when not
393
+ # rendering a git branch:
394
+ #
395
+ # dst_root_dir
396
+ # |- index.html
397
+ # |- file_1.html
398
+ # |- dir_1
399
+ # | |- file2.html
400
+ # |...
401
+ # |- web_assets (only if a custom stylesheet is used...)
402
+ # |- search_assets
403
+ # | |- heading_index.json
404
+ # | |- file1.adoc
405
+ # | |- dir_1
406
+ # | | |- file2.html
407
+ # | |- ...
408
+
409
+
410
+
411
+ # Usage:
412
+ # to start a local web server for development work
413
+ # giblish-giblish-search.rb <web_root>
414
+ #
415
+ # to run as a cgi script via a previously setup web server:
416
+ # giblish-giblish-search.rb
417
+ #
418
+ if __FILE__ == $PROGRAM_NAME
419
+
420
+ STDOUT.sync = true
421
+ if ARGV.length == 0
422
+ # 'Normal' cgi usage, as called from a web server
423
+
424
+ # init a new cgi 'connection' and print headers
425
+ cgi = CGI.new
426
+ print cgi.header
427
+ begin
428
+ # hello_world
429
+ cgi_main cgi
430
+ rescue Exception => e
431
+ print e.message
432
+ exit 1
433
+ end
434
+ exit 0
435
+ end
436
+
437
+ if ARGV.length == 1
438
+ # Run a simple web server to test this locally..
439
+ # and then create the html docs using:
440
+ # giblish -c -m -w <web_root> -r <resource_dir> -s <style_name> -g <git_branch> <src_root> <web_root>
441
+ init_web_server ARGV[0]
442
+ exit 0
443
+ end
444
+ end
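
The `grep` method in the script above builds one shell command string and interpolates the user-supplied search phrase into it, so quotes or shell metacharacters in the phrase are interpreted by the shell. As a hedged sketch of an alternative, `Open3.capture3` also accepts an environment hash followed by separate arguments, in which case no shell is involved. The helper name `build_grep_command` and the sample paths are assumptions made for this example and are not part of giblish.

```ruby
require "open3"

# Builds the environment and the argument list for grep.
# Each array element is passed to grep as exactly one argument.
def build_grep_command(search_root, phrase, ignorecase: false, useregexp: false)
  env = { "GREP_COLORS" => "ms=01:mc=:sl=:cx=:fn=:ln=:bn=:se=" }
  argv = ["grep", "-nHr", "--include=*.adoc", "--color=always"]
  argv << "-i" if ignorecase
  argv << "-F" unless useregexp
  # "-e" marks the next argument as the pattern, even if it starts with "-".
  argv += ["-e", phrase, search_root.to_s]
  [env, argv]
end

# Hypothetical search root and a phrase containing shell metacharacters.
env, argv = build_grep_command("/tmp/docs", 'foo"; rm -rf ~', ignorecase: true)
p argv

# Running the command without a shell (requires grep to be installed):
# output, error, status = Open3.capture3(env, *argv)
```

With this form the phrase never passes through shell parsing, so no manual quoting is needed.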