codnar 0.1.68 → 0.1.73

data/doc/system.markdown CHANGED
@@ -155,7 +155,16 @@ such formats are supported:
155
155
 
156
156
  [[lib/codnar/markdown.rb|named_chunk_with_containers]]
157
157
 
158
- In both cases, the HTML generated by the markup format conversion is a bit
158
+ * Haddock, a specific markup syntax used in comments to document Haskell code.
159
+ Here is a simple test that demonstrates using Haddock:
160
+
161
+ [[test/expand_haddock.rb|named_chunk_with_containers]]
162
+
163
+ And here is the implementation:
164
+
165
+ [[lib/codnar/haddock.rb|named_chunk_with_containers]]
166
+
167
+ In all cases, the HTML generated by the markup format conversion is a bit
159
168
  messy. We therefore clean it up:
160
169
 
161
170
  [[Clean html|named_chunk_with_containers]]
@@ -329,13 +338,17 @@ that demonstrates "splitting" documentation:
329
338
 
330
339
  And here are the actual configurations:
331
340
 
332
- [[Documentation "splitting" configurations|named_chunk_with_containers]]
341
+ [[lib/codnar/configuration/documentation.rb|named_chunk_with_containers]]
333
342
 
334
343
  #### Source code lines classification ####
335
344
 
336
345
  Splitting source code files is a more complex affair, which does typically
337
- require combining several configurations. The basic configuration marks all
338
- lines as belonging to some code syntax, as a single chunk:
346
+ require combining several configurations.
347
+
348
+ [[lib/codnar/configuration/code.rb|named_chunk_with_containers]]
349
+
350
+ The basic configuration marks all lines as belonging to some code syntax, as a
351
+ single chunk:
339
352
 
340
353
  [[Source code lines classification configurations|named_chunk_with_containers]]
341
354
 
@@ -349,7 +362,15 @@ Here is a simple test demonstrating using source code lines classifications:
349
362
 
350
363
  [[test/split_code_configurations.rb|named_chunk_with_containers]]
351
364
 
352
- #### Simple comment classification ####
365
+ #### Classifying comment lines ####
366
+
367
+ Classifying comment lines is the most complex part of splitting source code
368
+ files, requiring the use of one or more configurations specific to the language
369
+ used.
370
+
371
+ [[lib/codnar/configuration/comments.rb|named_chunk_with_containers]]
372
+
373
+ ##### Simple comment classification #####
353
374
 
354
375
  Many languages use a simple comment syntax, where some prefix indicates a
355
376
  comment that spans until the end of the line (e.g., shell `#` comments or C++
@@ -361,18 +382,30 @@ Here is a simple test demonstrating using simple comment classifications:
361
382
 
362
383
  [[test/split_simple_comment_configurations.rb|named_chunk_with_containers]]
363
384
 
364
- #### Complex comment classification ####
385
+ ##### Denoted comment classification #####
365
386
 
366
- Other languages use a complex multi-line comment syntax, where some prefix
387
+ Some simple comments require special treatment when they are denoted by
388
+ some leading prefix. For example, Haskell simple comments start with `--` but
389
+ Haddock (documentation) comments start with `-- |`, `-- ^`, etc.
390
+
391
+ [[Denoted comment classification configurations|named_chunk_with_containers]]
392
+
393
+ Here is a simple test demonstrating using denoted comment classifications:
394
+
395
+ [[test/split_denoted_comment_configurations.rb|named_chunk_with_containers]]
396
+
397
+ ##### Delimited comment classification #####
398
+
399
+ Other languages use a delimited multi-line comment syntax, where some prefix
367
400
  indicates the beginning of the comment, some suffix indicates the end, and by
368
401
  convention some prefix is expected for the inner comment lines (e.g., C's
369
402
  "`/*`", "` *`", "`*/`" comments or HTML's "`<!--`", "` -`", "`-->`" comments).
370
403
 
371
- [[Complex comment classification configurations|named_chunk_with_containers]]
404
+ [[Delimited comment classification configurations|named_chunk_with_containers]]
372
405
 
373
- Here is a simple test demonstrating using complex comment classifications:
406
+ Here is a simple test demonstrating using delimited comment classifications:
374
407
 
375
- [[test/split_complex_comment_configurations.rb|named_chunk_with_containers]]
408
+ [[test/split_delimited_comment_configurations.rb|named_chunk_with_containers]]
376
409
 
377
410
  #### Comment formatting ####
378
411
 
@@ -386,10 +419,18 @@ Here is a simple test demonstrating formatting comment contents:
386
419
 
387
420
  [[test/format_comment_configurations.rb|named_chunk_with_containers]]
388
421
 
389
- #### Syntax highlighting using GVim ####
422
+ #### Syntax highlighting ####
390
423
 
391
- Supporting a specific programming language (other than dealing with comments)
392
- is very easy using GVim for syntax highlighting, as demonstrated here:
424
+ Highlighting the syntax of the source code embedded in the documentation
425
+ improves readability. Codnar provides several ways to achieve this.
426
+
427
+ [[lib/codnar/configuration/highlighting.rb|named_chunk_with_containers]]
428
+
429
+ ##### Syntax highlighting using GVim #####
430
+
431
+ Supporting almost any known programming language (other than dealing with
432
+ comments) is very easy using GVim for syntax highlighting, as demonstrated
433
+ here:
393
434
 
394
435
  [[GVim syntax highlighting formatting configurations|named_chunk_with_containers]]
395
436
 
@@ -399,7 +440,7 @@ classes. Here is the default CSS stylesheet used by GVim:
399
440
 
400
441
  [[lib/codnar/data/gvim.css|named_chunk_with_containers]]
401
442
 
402
- #### Syntax highlighting using CodeRay ####
443
+ ##### Syntax highlighting using CodeRay #####
403
444
 
404
445
  For supported programming languages, you may choose to use CodeRay instead of GVim.
405
446
 
@@ -411,7 +452,7 @@ classes. Here is the default CSS stylesheet used by CodeRay:
411
452
 
412
453
  [[lib/codnar/data/coderay.css|named_chunk_with_containers]]
413
454
 
414
- #### Syntax highlighting using Sunlight ####
455
+ ##### Syntax highlighting using Sunlight #####
415
456
 
416
457
  For small projects in supported languages, you may choose to use Sunlight
417
458
  instead of GVim.
data/lib/codnar.rb CHANGED
@@ -8,6 +8,7 @@ require "fileutils"
8
8
  require "irb"
9
9
  require "open3"
10
10
  require "rdiscount"
11
+ require "rdoc"
11
12
  require "rdoc/markup/to_html"
12
13
  require "tempfile"
13
14
  require "yaml"
@@ -21,6 +22,7 @@ require "olag/string_unindent"
21
22
  require "codnar/version"
22
23
 
23
24
  require "codnar/coderay"
25
+ require "codnar/haddock"
24
26
  require "codnar/hash_extensions"
25
27
  require "codnar/markdown"
26
28
  require "codnar/rdoc"
@@ -36,6 +38,10 @@ require "codnar/merger"
36
38
  require "codnar/split"
37
39
  require "codnar/reader"
38
40
  require "codnar/scanner"
41
+ require "codnar/configuration/code"
42
+ require "codnar/configuration/comments"
43
+ require "codnar/configuration/documentation"
44
+ require "codnar/configuration/highlighting"
39
45
  require "codnar/split_configurations"
40
46
  require "codnar/splitter"
41
47
  require "codnar/sunlight"
@@ -0,0 +1,87 @@
1
+ module Codnar
2
+
3
+ module Configuration
4
+
5
+ # Configurations for splitting source code.
6
+ module Code
7
+
8
+ # {{{ Source code lines classification configurations
9
+
10
+ # Classify all lines as source code of some syntax (kind). This doesn't
11
+ # distinguish between comment and code lines; to do that, you need to
12
+ # combine this with comment classification configuration(s). Also, it just
13
+ # formats the lines in an HTML +pre+ element, without any syntax
14
+ # highlighting; to do that, you need to combine this with syntax
15
+ # highlighting formatting configuration(s).
16
+ CLASSIFY_SOURCE_CODE = lambda do |syntax|
17
+ return {
18
+ "formatters" => {
19
+ "#{syntax}_code" => "Formatter.lines_to_pre_html(lines, :class => :code)",
20
+ },
21
+ "syntax" => {
22
+ "patterns" => {
23
+ "#{syntax}_code" => { "regexp" => "^(\\s*)(.*)$" },
24
+ },
25
+ "states" => {
26
+ "start" => {
27
+ "transitions" => [
28
+ { "pattern" => "#{syntax}_code" },
29
+ ],
30
+ },
31
+ },
32
+ },
33
+ }
34
+ end
35
+
36
+ # }}}
37
+
38
+ # {{{ Nested foreign syntax code islands configurations
39
+
40
+ # Allow for comments containing "((( <syntax>" and "))) <syntax>" to
41
+ # designate nested islands of foreign syntax inside the normal code. The
42
+ # designator comment lines are always treated as part of the surrounding
43
+ # code, not as part of the nested foreign syntax code. There is no further
44
+ # classification of the nested foreign syntax code. Therefore, the nested
45
+ # code is not examined for begin/end chunk markers. Likewise, the nested
46
+ # code may not contain deeper nested code using a third syntax.
47
+ CLASSIFY_NESTED_CODE = lambda do |outer_syntax, inner_syntax|
48
+ {
49
+ "syntax" => {
50
+ "patterns" => {
51
+ "start_#{inner_syntax}_in_#{outer_syntax}" =>
52
+ { "regexp" => "^(\\s*)(.*\\(\\(\\(\\s*#{inner_syntax}.*)$" },
53
+ "end_#{inner_syntax}_in_#{outer_syntax}" =>
54
+ { "regexp" => "^(\\s*)(.*\\)\\)\\)\\s*#{inner_syntax}.*)$" },
55
+ "#{inner_syntax}_in_#{outer_syntax}" =>
56
+ { "regexp" => "^(\\s*)(.*)$" },
57
+ },
58
+ "states" => {
59
+ "start" => {
60
+ "transitions" => [
61
+ { "pattern" => "start_#{inner_syntax}_in_#{outer_syntax}",
62
+ "kind" => "#{outer_syntax}_code",
63
+ "next_state" => "#{inner_syntax}_in_#{outer_syntax}" },
64
+ [],
65
+ ],
66
+ },
67
+ "#{inner_syntax}_in_#{outer_syntax}" => {
68
+ "transitions" => [
69
+ { "pattern" => "end_#{inner_syntax}_in_#{outer_syntax}",
70
+ "kind" => "#{outer_syntax}_code",
71
+ "next_state" => "start" },
72
+ { "pattern" => "#{inner_syntax}_in_#{outer_syntax}",
73
+ "kind" => "#{inner_syntax}_code" },
74
+ ],
75
+ },
76
+ },
77
+ },
78
+ }
79
+ end
80
+
81
+ # }}}
82
+
83
+ end
84
+
85
+ end
86
+
87
+ end
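
The nested island patterns above can be exercised outside Codnar with a small standalone sketch. It reuses the start and end regexp strings from `CLASSIFY_NESTED_CODE`, but the helper names `island_regexps` and `classify_islands` are hypothetical, and the two-state loop is only an approximation of what the scanner presumably does with the "start" and "inner" states. In particular, it assumes every line outside an island is outer code, which in the real configuration depends on the other configurations merged into the empty `[]` slot.

```ruby
# Hypothetical helper (not part of Codnar): build the same start/end regexps
# that CLASSIFY_NESTED_CODE describes for a given inner syntax.
def island_regexps(inner_syntax)
  {
    :start => Regexp.new("^(\\s*)(.*\\(\\(\\(\\s*#{inner_syntax}.*)$"),
    :end => Regexp.new("^(\\s*)(.*\\)\\)\\)\\s*#{inner_syntax}.*)$"),
  }
end

# Hypothetical two-state classifier mirroring the configuration: designator
# lines keep the outer kind, lines between them get the inner kind.
def classify_islands(lines, outer_syntax, inner_syntax)
  regexps = island_regexps(inner_syntax)
  inside = false
  lines.map do |line|
    if !inside && line.match?(regexps[:start])
      inside = true
      "#{outer_syntax}_code"
    elsif inside && line.match?(regexps[:end])
      inside = false
      "#{outer_syntax}_code"
    else
      inside ? "#{inner_syntax}_code" : "#{outer_syntax}_code"
    end
  end
end

lines = [
  "x = 1",
  "# ((( html",
  "<p>Hello</p>",
  "# ))) html",
  "y = 2",
]
kinds = classify_islands(lines, "ruby", "html")
```

Only the line between the two designator comments is classified as `html_code`; the designator lines themselves stay with the surrounding code, as the comment on `CLASSIFY_NESTED_CODE` describes.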
@@ -0,0 +1,234 @@
1
+ module Codnar
2
+
3
+ module Configuration
4
+
5
+ # Configurations for splitting source code with comments.
6
+ module Comments
7
+
8
+ # {{{ Simple comment classification configurations
9
+
10
+ # Classify simple comment lines. It accepts a restricted format: each
11
+ # comment is expected to start with some exact prefix (e.g. "#" for shell
12
+ # style comments or "//" for C++ style comments). The following space, if
13
+ # any, is stripped from the payload. As a convenience, a comment that starts
14
+ # with "!" is not taken to start a comment. This both protects the 1st line
15
+ # of shell scripts ("#!"), and also any other line you wish to avoid being
16
+ # treated as a comment.
17
+ #
18
+ # This configuration is typically complemented by an additional one
19
+ # specifying how to format the (stripped!) comments; by default they are
20
+ # just displayed as-is using an HTML +pre+ element, which isn't very
21
+ # useful.
22
+ CLASSIFY_SIMPLE_COMMENTS = lambda do |prefix|
23
+ return Comments.simple_comments(prefix)
24
+ end
25
+
26
+ # Classify simple shell ("#") comment lines.
27
+ CLASSIFY_SHELL_COMMENTS = lambda do
28
+ return Comments.simple_comments("#")
29
+ end
30
+
31
+ # Classify simple C++ ("//") comment lines.
32
+ CLASSIFY_CPP_COMMENTS = lambda do
33
+ return Comments.simple_comments("//")
34
+ end
35
+
36
+ # Configuration for classifying lines to comments and code based on a
37
+ # simple prefix (e.g. "#" for shell style comments or "//" for C++ style
38
+ # comments).
39
+ def self.simple_comments(prefix)
40
+ return {
41
+ "syntax" => {
42
+ "patterns" => {
43
+ "comment_#{prefix}" => { "regexp" => "^(\\s*)#{prefix}(?!!)\\s?(.*)$" },
44
+ },
45
+ "states" => {
46
+ "start" => {
47
+ "transitions" => [
48
+ { "pattern" => "comment_#{prefix}", "kind" => "comment" },
49
+ []
50
+ ],
51
+ },
52
+ },
53
+ },
54
+ }
55
+ end
56
+
57
+ # }}}
58
+
59
+ # {{{ Denoted comment classification configurations
60
+
61
+ # Classify denoted comment lines. Denoted comments are similar to simple
62
+ # comments, except that the 1st simple comment line must start with a
63
+ # specific prefix (e.g., in Haskell, comment lines start with '--' but
64
+ # Haddock comments start with '-- |', '-- ^', etc.). The comment continues
65
+ # in additional simple comment lines.
66
+ #
67
+ # This configuration is typically complemented by an additional one
68
+ # specifying how to format the (stripped!) comments; by default they are
69
+ # just displayed as-is using an HTML +pre+ element, which isn't very
70
+ # useful.
71
+ CLASSIFY_DENOTED_COMMENTS = lambda do |start_prefix, continue_prefix|
72
+ return Comments.denoted_comments(start_prefix, continue_prefix)
73
+ end
74
+
75
+ # Classify denoted haddock ("--") comment lines. Note that non-haddock
76
+ # comment lines are not captured; they would be treated as code and handled
77
+ # by syntax highlighting, if any.
78
+ CLASSIFY_HADDOCK_COMMENTS = lambda do
79
+ return Comments.denoted_comments("-- [|^$]", "--")
80
+ end
81
+
82
+ # Configuration for classifying lines to comments and code based on a start
83
+ # comment prefix and continuation comment prefix (e.g., "-- |" and "--" for
84
+ # haddock).
85
+ def self.denoted_comments(start_prefix, continue_prefix)
86
+ # Ruby coverage somehow barfs if we inline this. Go figure.
87
+ start_transition = {
88
+ "pattern" => "comment_start_#{start_prefix}",
89
+ "next_state" => "comment_continue_#{continue_prefix}",
90
+ "kind" => "comment"
91
+ }
92
+ return {
93
+ "syntax" => {
94
+ "patterns" => {
95
+ "comment_start_#{start_prefix}" => { "regexp" => "^(\\s*)#{start_prefix}\\s?(.*)$" },
96
+ "comment_continue_#{continue_prefix}" => { "regexp" => "^(\\s*)#{continue_prefix}\\s?(.*)$" },
97
+ },
98
+ "states" => {
99
+ "start" => {
100
+ "transitions" => [ start_transition, [] ],
101
+ },
102
+ "comment_continue_#{continue_prefix}" => {
103
+ "transitions" => [ {
104
+ "pattern" => "comment_continue_#{continue_prefix}",
105
+ "kind" => "comment" },
106
+ { "next_state" => "start" }
107
+ ],
108
+ },
109
+ },
110
+ },
111
+ }
112
+ end
113
+
114
+ # }}}
115
+
116
+ # {{{ Delimited comment classification configurations
117
+
118
+ # Classify delimited comment lines. It accepts a restricted format: each
119
+ # comment is expected to start with some exact prefix (e.g. "/*" for C
120
+ # style comments or "<!--" for HTML style comments). The following space,
121
+ # if any, is stripped from the payload. Following lines are also considered
122
+ # comments; a leading inner line prefix (e.g., " *" for C style comments or
123
+ # " -" for HTML style comments) with an optional following space are
124
+ # stripped from the payload. Finally, a line containing some exact suffix
125
+ # (e.g. "*/" for C style comments, or "-->" for HTML style comments) ends
126
+ # the comment. A one line comment format is also supported containing the
127
+ # prefix, the payload, and the suffix. As a convenience, a comment that
128
+ # starts with "!" is not taken to start a comment. This allows protecting
129
+ # a comment block you wish to avoid being classified as a comment.
130
+ #
131
+ # This configuration is typically complemented by an additional one
132
+ # specifying how to format the (stripped!) comments; by default they are
133
+ # just displayed as-is using an HTML +pre+ element, which isn't very
134
+ # useful.
135
+ CLASSIFY_DELIMITED_COMMENTS = lambda do |prefix, inner, suffix|
136
+ return Comments.delimited_comments(prefix, inner, suffix)
137
+ end
138
+
139
+ # Classify delimited C ("/*", " *", " */") style comments.
140
+ CLASSIFY_C_COMMENTS = lambda do
141
+ # Since the prefix/inner/suffix passed to the configuration are regexps,
142
+ # we need to escape special characters such as "*".
143
+ return Comments.delimited_comments("/\\*", " \\*", " \\*/")
144
+ end
145
+
146
+ # Classify delimited HTML ("<!--", " -", "-->") style comments.
147
+ CLASSIFY_HTML_COMMENTS = lambda do
148
+ return Comments.delimited_comments("<!--", " -", "-->")
149
+ end
150
+
151
+ # Configuration for classifying lines to comments and code based on a
152
+ # delimited start prefix, inner line prefix and final suffix (e.g., "/*", "
153
+ # *", " */" for C-style comments or "<!--", " -", "-->" for HTML style
154
+ # comments).
155
+ def self.delimited_comments(prefix, inner, suffix)
156
+ return {
157
+ "syntax" => {
158
+ "patterns" => {
159
+ "comment_prefix_#{prefix}" => { "regexp" => "^(\\s*)#{prefix}(?!!)\\s?(.*)$" },
160
+ "comment_inner_#{inner}" => { "regexp" => "^(\\s*)#{inner}\\s?(.*)$" },
161
+ "comment_suffix_#{suffix}" => { "regexp" => "^(\\s*)#{suffix}\\s*$" },
162
+ "comment_line_#{prefix}_#{suffix}" => { "regexp" => "^(\\s*)#{prefix}(?!!)\\s?(.*?)\\s*#{suffix}\\s*$" },
163
+ },
164
+ "states" => {
165
+ "start" => {
166
+ "transitions" => [
167
+ { "pattern" => "comment_line_#{prefix}_#{suffix}",
168
+ "kind" => "comment" },
169
+ { "pattern" => "comment_prefix_#{prefix}",
170
+ "kind" => "comment",
171
+ "next_state" => "comment_#{prefix}" },
172
+ [],
173
+ ],
174
+ },
175
+ "comment_#{prefix}" => {
176
+ "transitions" => [
177
+ { "pattern" => "comment_suffix_#{suffix}",
178
+ "kind" => "comment",
179
+ "next_state" => "start" },
180
+ { "pattern" => "comment_inner_#{inner}",
181
+ "kind" => "comment" },
182
+ ],
183
+ },
184
+ },
185
+ },
186
+ }
187
+ end
188
+
189
+ # }}}
190
+
191
+ # {{{ Comment formatting configurations
192
+
193
+ # Format comments as HTML pre elements. Is used to complement a
194
+ # configuration that classifies some lines as +comment+.
195
+ FORMAT_PRE_COMMENTS = {
196
+ "formatters" => {
197
+ "comment" => "Formatter.lines_to_pre_html(lines, :class => :comment)",
198
+ },
199
+ }
200
+
201
+ # Format comments that use the RDoc notation. Is used to complement a
202
+ # configuration that classifies some lines as +comment+.
203
+ FORMAT_RDOC_COMMENTS = {
204
+ "formatters" => {
205
+ "comment" => "Formatter.markup_lines_to_html(lines, Codnar::RDoc, 'rdoc')",
206
+ "unindented_html" => "Formatter.unindented_lines_to_html(lines)",
207
+ },
208
+ }
209
+
210
+ # Format comments that use the Markdown notation. Is used to complement a
211
+ # configuration that classifies some lines as +comment+.
212
+ FORMAT_MARKDOWN_COMMENTS = {
213
+ "formatters" => {
214
+ "comment" => "Formatter.markup_lines_to_html(lines, Markdown, 'markdown')",
215
+ "unindented_html" => "Formatter.unindented_lines_to_html(lines)",
216
+ },
217
+ }
218
+
219
+ # Format comments that use the Haddock notation. Is used to complement a
220
+ # configuration that classifies some lines as +comment+.
221
+ FORMAT_HADDOCK_COMMENTS = {
222
+ "formatters" => {
223
+ "comment" => "Formatter.markup_lines_to_html(lines, Haddock, 'haddock')",
224
+ "unindented_html" => "Formatter.unindented_lines_to_html(lines)",
225
+ },
226
+ }
227
+
228
+ # }}}
229
+
230
+ end
231
+
232
+ end
233
+
234
+ end
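
To see how the simple and denoted comment patterns above behave, here is a minimal standalone sketch. It builds the same regexp strings as `Comments.simple_comments` and `Comments.denoted_comments`, but the helper names `simple_comment_regexp` and `denoted_kinds` are hypothetical, and the loop only approximates the scanner: it assumes that the pattern-less `{ "next_state" => "start" }` transition means the line is retried in the "start" state, and that unmatched lines end up as code.

```ruby
# Hypothetical helper (not part of Codnar): the regexp that
# Comments.simple_comments builds for a given prefix.
def simple_comment_regexp(prefix)
  Regexp.new("^(\\s*)#{prefix}(?!!)\\s?(.*)$")
end

shell = simple_comment_regexp("#")
payload = shell.match("  # Hello")[2]  # the stripped comment text
shebang = shell.match("#!/bin/sh")     # "!" right after the prefix blocks the match

# Hypothetical classifier for denoted comments: a comment starts only on a
# line matching the start prefix, and continues on lines matching the
# continuation prefix.
def denoted_kinds(lines, start_prefix, continue_prefix)
  start = Regexp.new("^(\\s*)#{start_prefix}\\s?(.*)$")
  continue = Regexp.new("^(\\s*)#{continue_prefix}\\s?(.*)$")
  in_comment = false
  lines.map do |line|
    in_comment = (in_comment && line.match?(continue)) || line.match?(start)
    in_comment ? "comment" : "code"
  end
end

haskell = [
  "-- plain comment",
  "-- | Adds two numbers.",
  "-- Continues here.",
  "add :: Int -> Int -> Int",
]
# Same arguments as CLASSIFY_HADDOCK_COMMENTS passes.
kinds = denoted_kinds(haskell, "-- [|^$]", "--")
```

The plain `--` line is left as code, while the `-- |` line and its `--` continuation are classified as a comment, which matches the note on `CLASSIFY_HADDOCK_COMMENTS` that non-Haddock comment lines are not captured.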