slaw 1.0.0.alpha.6 → 1.0.0
- checksums.yaml +4 -4
- data/README.md +13 -147
- data/bin/slaw +2 -1
- data/lib/slaw.rb +0 -6
- data/lib/slaw/generator.rb +2 -8
- data/lib/slaw/grammars/pl/act.treetop +10 -14
- data/lib/slaw/grammars/pl/act_text.xsl +271 -0
- data/lib/slaw/version.rb +1 -1
- data/slaw.gemspec +3 -3
- metadata +6 -17
- data/lib/slaw/act.rb +0 -452
- data/lib/slaw/bylaw.rb +0 -62
- data/lib/slaw/collection.rb +0 -60
- data/lib/slaw/lifecycle_event.rb +0 -23
- data/lib/slaw/render/html.rb +0 -70
- data/lib/slaw/render/xsl/act.xsl +0 -15
- data/lib/slaw/render/xsl/elements.xsl +0 -120
- data/lib/slaw/render/xsl/fragment.xsl +0 -16
- data/spec/act_spec.rb +0 -56
- data/spec/bylaw_spec.rb +0 -49
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 96bb9bd00dc6e71518da515b6595f4a8b5c9a5b0
+  data.tar.gz: 793f2aeaedb2dc7e89d479348270c339639b9363
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4e36394603f98a99668a868ea5458ef64a23585d4fc1794f42fc7e68c335c3186760bfcec83be04f40aa1636f6cb8a1b00c0be73b28515d5376d8838c8ceb3c7
+  data.tar.gz: d7e0cfbcf66ee8b1bbce81a5160c5f3302162f2a9df0e3c127ad576a0c0176929fdbd2d0abb0d948f63288d9afcf17d3bd7c9cd92f4168a0f41640b46221850a
data/README.md
CHANGED
@@ -1,14 +1,16 @@
 # Slaw [![Build Status](https://travis-ci.org/longhotsummer/slaw.svg)](http://travis-ci.org/longhotsummer/slaw)
 
-Slaw is a lightweight library for generating
-It is used to power [
-
+Slaw is a lightweight library for generating Akoma Ntoso 2.0 Act XML from plain text and PDF documents.
+It is used to power [Indigo](https://github.com/OpenUpSA/indigo) and uses grammars developed for the legal
+traditions in these countries:
+
+* South Africa
+* Poland
 
 Slaw allows you to:
 
-1.
-2.
-3. unparse Akoma Ntoso XML into text that can be parsed backed into Akoma Ntoso.
+1. parse plain text and transform it into an Akoma Ntoso Act XML document
+2. unparse Akoma Ntoso XML into a plain-text format suitable for re-parsing
 
 Slaw is lightweight because it wraps around a Nokogiri XML representation of
 the parsed document. It provides some support methods for manipulating these
@@ -40,7 +42,7 @@ installed by default on most systems (including Mac). On Ubuntu you can use:
 
 The simplest way to use Slaw is via the commandline:
 
-$ slaw parse myfile.pdf
+$ slaw parse myfile.pdf --grammar za
 
 ## Overview
 
@@ -63,150 +65,13 @@ The grammar cannot catch some subtleties of an act or by-law -- such as nested l
 so Slaw performs some post-processing on the XML produced by the parser. In particular,
 it nests lists correctly.
 
-## Quick Start
-
-Install the gem using
-
-gem install slaw
-
-Extract text from a PDF and parse it as a South African by-law:
-
-```ruby
-require 'slaw'
-
-# extract text from a PDF file and clean it up
-extractor = Slaw::Extract::Extractor.new
-text = extractor.extract_from_pdf('/path/to/file.pdf')
-
-# parse the text into a XML and
-generator = Slaw::ActGenerator.new
-bylaw = generator.generate_from_text(text)
-puts bylaw.to_xml(indent: 2)
-
-# render the by-law as HTML, using / as the root
-# for relative URLs
-renderer = Slaw::Render::HTMLRenderer.new
-puts renderer.render(bylaw.doc, '/')
-```
-
-## Extraction
-
-Extraction is done by the `Slaw::Extract::Extractor` class. It currently handles
-PDF and plain text files. Slaw uses `pdftotext` from the `xpdf` package to extract
-the plain text from PDFs. PDFs are great for presentation, but suck for accurately storing
-text. As a result, the extraction can produce oddities, such as lines broken in weird
-places (or not broken when they should be). Slaw gets around this by running
-some cleanup routines on the extracted text.
-
-For example, it knows that these lines:
-
-(b) any wall, swimming pool, reservoir or bridge
-or any other structure connected therewith; (c) any fuel pump or any
-tank used in connection therewith
-
-should probably be broken at the section numbers:
-
-(b) any wall, swimming pool, reservoir or bridge or any other structure connected therewith;
-(c) any fuel pump or any tank used in connection therewith
-
-If your region's numbering format differs significantly from this, these rules might not work.
-
-Some other steps Slaw takes after extraction include (check `Slaw::Parse::Cleanser` for the full set):
-
-* changing newlines to `\n`, and normalising quotation characters
-* removing page numbers and other boilerplate
-* stripping the table of contents (we can generate our own from the parsed document)
-* changing tabs to spaces, stripping leading and trailing spaces and removing blank lines
-
 ## Parsing
 
 Slaw uses Treetop to compile a grammar into a backtracking parser. The parser builds a parse
-tree,
-
-While most South African by-laws are superficially very similar, there are a sufficient differences
-in their typesetting to make parsing them difficult. The grammar handles most
-edge cases but may not catch them all. The one thing it cannot yet detect well is the difference
-between section titles before and after a section number:
-
-1. Definitions
-In this by-law, the following words ...
-
-Definitions
-1. In this by-law, the following words ...
-
-This must be set by the user before parsing:
-
-```ruby
-generator = Slaw::ZA::BylawGenerator.new
-generator.parser.options = {section_number_after_title: true}
-```
-
-The parser does its best not to choke on input it doesn't understand, preferring a best effort
-to a completely accurate result. For example it may not be able to work out a section heading
-and so will treat it as simply another statement in the previous section. This causes the parser
-to use a lot of backtracking and negative lookahead assertions, which can be slow for large documents.
-
-The grammar supports a number of subsection numbering formats, which are often mixed
-in a document to indicate different levels of nesting.
-
-(a)
-(2)
-(3b)
-(ii)
-3.4
-
-During post-processing it works out how to nest these appropriately.
-
-Special words, such as ``part`` and ``chapter`` are ignored if the line starts with a backslash ``\``.
-
-For more information see the South African by-law grammar at
-[lib/slaw/za/bylaw.treetop](lib/slaw/za/bylaw.treetop) and the list nesting
-at [lib/slaw/parse/blocklists.rb](lib/slaw/parse/blocklists.rb).
-
-## Rendering
-
-Slaw renders XML to HTML using XSLT. For the most part there is a direct mapping between
-Akoma Ntoso structure and the HTML layout, so most AN nodes are simply mapped to `div` or `span`
-elements with a class attribute derived from the name of the AN element and an ID element taken
-from the node, if any. This makes it both fast and flexible, since it's easy to
-apply layout rules with CSS.
-
-Slaw can render either an entire document like this, or just a portion of the XML tree.
-
-```ruby
-# render an entire document
-renderer = Slaw::Render::HTMLRenderer.new
-puts renderer.render(bylaw.doc, '/')
-
-# render the first section only
-puts renderer.render(bylaw.sections[0], '/')
-```
-
-For more information, see [/lib/slaw/render/html.rb](/lib/slaw/render/html.rb).
-
-## Meta-data
-
-Acts and by-laws have metadata which it is not possible to get from their plain text representations,
-such as their title, date and format of publication or act number. Slaw provides some helpers
-for manipulating this meta-data. For example,
-
-```ruby
-bylaw = Slaw::ByLaw.new('spec/fixtures/community-fire-safety.xml')
-print bylaw.id_uri
-bylaw.title = 'A new title'
-bylaw.name = 'a-new-title'
-bylaw.published!(date: '2014-09-28')
-print bylaw.id_uri
-```
-
-## Schedules
-
-South African acts and by-laws can have addendums called schedules. They are technically a part of
-the act but are not part of the primary body and have more relaxed formatting. Slaw finds schedules
-by looking for section headings, but makes no effort to capture the format of their contents.
+tree, the nodes of which know how to serialize themselves in XML format.
 
-
-
+Supporting formats from other country's legal traditions probably requires creating a new grammar
+and parser.
 
 ## Contributing
 
@@ -225,6 +90,7 @@ Akoma Ntoso `component` elements at the end of the XML document, with a name of
 * Slaw no longer does too much introspection of a parsed document, since that can be so tradition-dependent.
 * Move reformatting out of Slaw since it's tradition-dependent.
 * Remove definition linking, Slaw no longer supports it.
+* Remove unused code for interacting with the internals of acts.
 
 ### 0.17.2
 
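The README text kept by this diff says the parser builds a parse tree whose nodes know how to serialize themselves as XML. The idea can be sketched in plain Ruby as follows. This is only an illustration: `SketchText` and `SketchGroup` are hypothetical names, not Slaw's classes (Slaw's real nodes come from its Treetop grammars), and the `section` and `p` element names are chosen here just to make the output readable.

```ruby
# Hypothetical node classes, for illustration only; these are not Slaw's classes.
class SketchText
  def initialize(text)
    @text = text
  end

  # Leaf nodes emit their own XML-escaped text inside a <p> element.
  def to_xml
    "<p>#{@text.encode(xml: :text)}</p>"
  end
end

class SketchGroup
  def initialize(tag, children)
    @tag = tag
    @children = children
  end

  # Group nodes wrap whatever XML their children produce.
  def to_xml
    "<#{@tag}>#{@children.map(&:to_xml).join}</#{@tag}>"
  end
end

tree = SketchGroup.new('section', [SketchText.new('Fees & charges'), SketchText.new('Offences')])
puts tree.to_xml
```

Because each node only knows how to emit itself and delegate to its children, serializing the whole document is a single `to_xml` call on the root.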
data/bin/slaw
CHANGED
@@ -90,8 +90,9 @@ class SlawCLI < Thor
   end
 
   desc "unparse FILE", "Unparse FILE from Akoma Ntoso XML back into text suitable for re-parsing"
+  option :grammar, type: :string, desc: "Grammar name (usually a two-letter country code). Default is za."
   def unparse(name)
-    generator = Slaw::ActGenerator.new
+    generator = Slaw::ActGenerator.new(options[:grammar] || 'za')
 
     doc = File.open(name, 'r') { |f| doc = generator.builder.parse_xml(f.read) }
     puts generator.text_from_act(doc)
data/lib/slaw.rb
CHANGED
@@ -4,14 +4,8 @@ require 'slaw/version'
 require 'slaw/namespace'
 require 'slaw/logging'
 
-require 'slaw/act'
-require 'slaw/bylaw'
-require 'slaw/collection'
-
 require 'slaw/xml_support'
-require 'slaw/lifecycle_event'
 
-require 'slaw/render/html'
 require 'slaw/parse/blocklists'
 require 'slaw/parse/builder'
 require 'slaw/parse/cleanser'
data/lib/slaw/generator.rb
CHANGED
@@ -7,9 +7,6 @@ module Slaw
     # [Slaw::Parse::Builder] builder used by the generator
     attr_accessor :builder
 
-    # The type that will hold the generated document
-    attr_accessor :document_class
-
     @@parsers = {}
 
     def initialize(grammar)
@@ -19,7 +16,6 @@ module Slaw
       @builder = Slaw::Parse::Builder.new(parser: @parser)
       @parser = @builder.parser
       @cleanser = Slaw::Parse::Cleanser.new
-      @document_class = Slaw::Act
     end
 
     def build_parser
@@ -39,11 +35,9 @@ module Slaw
     #
     # @param text [String] plain text
     #
-    # @return [
+    # @return [Nokogiri::Document] the resulting xml
     def generate_from_text(text)
-
-      act.doc = @builder.parse_and_process_text(cleanup(text))
-      act
+      @builder.parse_and_process_text(cleanup(text))
     end
 
     # Run basic cleanup on text, such as ensuring clean newlines
data/lib/slaw/grammars/pl/act.treetop
CHANGED
@@ -111,32 +111,28 @@ module Slaw
     # these are used externally and provide support when parsing just
     # a particular portion of a document
 
-    rule
-      children:
-    end
-
-    rule subdivisions
-      children:subdivision+ <GroupNode>
+    rule articles
+      children:article+ <GroupNode>
     end
 
     rule chapters
       children:chapter+ <GroupNode>
     end
 
-    rule
-      children:
-    end
-
-    rule sections
-      children:section+ <GroupNode>
+    rule divisions
+      children:division+ <GroupNode>
     end
 
     rule paragraphs
       children:paragraph+ <GroupNode>
     end
 
-    rule
-      children:
+    rule sections
+      children:section+ <GroupNode>
+    end
+
+    rule subdivisions
+      children:subdivision+ <GroupNode>
     end
 
     ##########
data/lib/slaw/grammars/pl/act_text.xsl
ADDED
@@ -0,0 +1,271 @@
+<?xml version="1.0"?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
+  xmlns:a="http://www.akomantoso.org/2.0"
+  exclude-result-prefixes="a">
+
+  <xsl:output method="text" indent="no" omit-xml-declaration="yes" />
+  <xsl:strip-space elements="*"/>
+
+  <!-- adds a backslash to the start of the value param, if necessary -->
+  <xsl:template name="escape">
+    <xsl:param name="value"/>
+
+    <xsl:variable name="prefix" select="translate(substring($value, 1, 10), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')" />
+    <xsl:variable name="numprefix" select="translate(translate(substring($prefix, 1, 3), '1234567890', 'NNNNNNNNNN'), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'NNNNNNNNNNNNNNNNNNNNNNNNNN')" />
+
+    <!-- p tags must escape initial content that looks like a block element marker.
+         Note that the two hyphens are different characters. -->
+    <xsl:if test="$prefix = 'BODY' or
+                  $prefix = 'PREAMBLE' or
+                  $prefix = 'PREFACE' or
+                  starts-with($prefix, 'ROZDZIA') or
+                  starts-with($prefix, 'DZIA') or
+                  starts-with($prefix, 'ODDZIA') or
+                  starts-with($prefix, 'ART.') or
+                  starts-with($prefix, '§') or
+                  starts-with($prefix, 'SCHEDULE ') or
+                  starts-with($prefix, '{|') or
+                  starts-with($numprefix, 'N)') or
+                  starts-with($numprefix, 'NN)') or
+                  starts-with($numprefix, 'N.') or
+                  starts-with($numprefix, 'NN.') or
+                  starts-with($numprefix, '-') or
+                  starts-with($numprefix, '–')">
+      <xsl:text>\</xsl:text>
+    </xsl:if>
+    <xsl:value-of select="$value"/>
+  </xsl:template>
+
+  <xsl:template match="a:act">
+    <xsl:apply-templates select="a:coverPage" />
+    <xsl:apply-templates select="a:preface" />
+    <xsl:apply-templates select="a:preamble" />
+    <xsl:apply-templates select="a:body" />
+    <xsl:apply-templates select="a:conclusions" />
+  </xsl:template>
+
+  <xsl:template match="a:preface">
+    <xsl:text>PREFACE</xsl:text>
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates />
+  </xsl:template>
+
+  <xsl:template match="a:preamble">
+    <xsl:text>PREAMBLE</xsl:text>
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates />
+  </xsl:template>
+
+  <xsl:template match="a:division">
+    <xsl:text>Dział </xsl:text>
+    <xsl:value-of select="./a:num" />
+    <xsl:text> - </xsl:text>
+    <xsl:value-of select="./a:heading" />
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates select="./*[not(self::a:num) and not(self::a:heading)]" />
+  </xsl:template>
+
+  <xsl:template match="a:chapter">
+    <xsl:text>Rozdział </xsl:text>
+    <xsl:value-of select="./a:num" />
+    <xsl:text> - </xsl:text>
+    <xsl:value-of select="./a:heading" />
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates select="./*[not(self::a:num) and not(self::a:heading)]" />
+  </xsl:template>
+
+  <xsl:template match="a:article">
+    <xsl:text>Art. </xsl:text>
+    <xsl:value-of select="a:num" />
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates select="./*[not(self::a:num)]" />
+  </xsl:template>
+
+  <xsl:template match="a:section">
+    <xsl:text>§ </xsl:text>
+    <xsl:value-of select="a:num" />
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates select="./*[not(self::a:num)]" />
+  </xsl:template>
+
+  <xsl:template match="a:paragraph">
+    <xsl:if test="a:num != ''">
+      <xsl:value-of select="a:num" />
+      <xsl:text> </xsl:text>
+    </xsl:if>
+    <xsl:apply-templates select="./*[not(self::a:num) and not(self::a:heading)]" />
+  </xsl:template>
+
+  <xsl:template match="a:indent">
+    <xsl:value-of select="a:num" />
+    <xsl:text>- </xsl:text>
+    <xsl:apply-templates select="./*[not(self::a:num)]" />
+  </xsl:template>
+
+  <!-- these are block elements and have a newline at the end -->
+  <xsl:template match="a:heading">
+    <xsl:apply-templates />
+    <xsl:text>
+
+</xsl:text>
+  </xsl:template>
+
+  <xsl:template match="a:p">
+    <xsl:apply-templates/>
+    <!-- p tags must end with a newline -->
+    <xsl:text>
+
+</xsl:text>
+  </xsl:template>
+
+  <!-- numbered lists -->
+  <xsl:template match="a:item | a:alinea | a:point">
+    <xsl:value-of select="./a:num" />
+    <xsl:text> </xsl:text>
+    <xsl:apply-templates select="./*[not(self::a:num)]" />
+  </xsl:template>
+
+  <xsl:template match="a:list">
+    <xsl:if test="a:intro != ''">
+      <xsl:value-of select="a:intro" />
+      <xsl:text>
+
+</xsl:text>
+    </xsl:if>
+    <xsl:apply-templates select="./*[not(self::a:intro)]" />
+  </xsl:template>
+
+  <!-- first text nodes of these elems must be escaped if they have special chars -->
+  <xsl:template match="a:p[not(ancestor::a:table)]/text()[1] | a:intro/text()[1]">
+    <xsl:call-template name="escape">
+      <xsl:with-param name="value" select="." />
+    </xsl:call-template>
+  </xsl:template>
+
+  <!-- components/schedules -->
+  <xsl:template match="a:doc">
+    <xsl:text>Schedule - </xsl:text>
+    <xsl:value-of select="a:meta/a:identification/a:FRBRWork/a:FRBRalias/@value" />
+
+    <xsl:if test="a:mainBody/a:article/a:heading">
+      <xsl:text>
+</xsl:text>
+      <xsl:value-of select="a:mainBody/a:article/a:heading" />
+    </xsl:if>
+
+    <xsl:text>
+
+</xsl:text>
+    <xsl:apply-templates select="a:mainBody" />
+  </xsl:template>
+
+  <xsl:template match="a:mainBody/a:article/a:heading">
+    <!-- no-op, this is handled by the schedules template above -->
+  </xsl:template>
+
+  <!-- tables -->
+  <xsl:template match="a:table">
+    <xsl:text>{| </xsl:text>
+
+    <!-- attributes -->
+    <xsl:for-each select="@*[local-name()!='id']">
+      <xsl:value-of select="local-name(.)" />
+      <xsl:text>="</xsl:text>
+      <xsl:value-of select="." />
+      <xsl:text>" </xsl:text>
+    </xsl:for-each>
+    <xsl:text>
+|-</xsl:text>
+
+    <xsl:apply-templates />
+    <xsl:text>
+|}
+
+</xsl:text>
+  </xsl:template>
+
+  <xsl:template match="a:tr">
+    <xsl:apply-templates />
+    <xsl:text>
+|-</xsl:text>
+  </xsl:template>
+
+  <xsl:template match="a:th|a:td">
+    <xsl:choose>
+      <xsl:when test="local-name(.) = 'th'">
+        <xsl:text>
+! </xsl:text>
+      </xsl:when>
+      <xsl:when test="local-name(.) = 'td'">
+        <xsl:text>
+| </xsl:text>
+      </xsl:when>
+    </xsl:choose>
+
+    <!-- attributes -->
+    <xsl:if test="@*">
+      <xsl:for-each select="@*">
+        <xsl:value-of select="local-name(.)" />
+        <xsl:text>="</xsl:text>
+        <xsl:value-of select="." />
+        <xsl:text>" </xsl:text>
+      </xsl:for-each>
+      <xsl:text>| </xsl:text>
+    </xsl:if>
+
+    <xsl:apply-templates />
+  </xsl:template>
+
+  <!-- don't end p tags with newlines in tables -->
+  <xsl:template match="a:table//a:p">
+    <xsl:apply-templates />
+  </xsl:template>
+
+  <!-- END tables -->
+
+  <xsl:template match="a:remark">
+    <xsl:text>[</xsl:text>
+    <xsl:apply-templates />
+    <xsl:text>]</xsl:text>
+  </xsl:template>
+
+  <xsl:template match="a:ref">
+    <xsl:text>[</xsl:text>
+    <xsl:apply-templates />
+    <xsl:text>](</xsl:text>
+    <xsl:value-of select="@href" />
+    <xsl:text>)</xsl:text>
+  </xsl:template>
+
+  <xsl:template match="a:img">
+    <xsl:text>![</xsl:text>
+    <xsl:value-of select="@alt" />
+    <xsl:text>](</xsl:text>
+    <xsl:value-of select="@src" />
+    <xsl:text>)</xsl:text>
+  </xsl:template>
+
+  <xsl:template match="a:eol">
+    <xsl:text>
+</xsl:text>
+  </xsl:template>
+
+
+  <!-- for most nodes, just dump their text content -->
+  <xsl:template match="*">
+    <xsl:text/><xsl:apply-templates /><xsl:text/>
+  </xsl:template>
+
+</xsl:stylesheet>
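The `escape` named template in `act_text.xsl` prepends a backslash to text that would otherwise be read as a block marker when the unparsed text is parsed again. Its decision logic can be mirrored in plain Ruby as a rough sketch. The helper name `escape_block_marker` and the three constants are hypothetical and are not part of Slaw; only the marker strings and the two `translate` steps are taken from the stylesheet above.

```ruby
# Hypothetical helper mirroring the XSLT `escape` template; not part of Slaw's Ruby API.
EXACT_MARKERS = %w[BODY PREAMBLE PREFACE].freeze
WORD_PREFIXES = ['ROZDZIA', 'DZIA', 'ODDZIA', 'ART.', '§', 'SCHEDULE ', '{|'].freeze
NUM_PREFIXES  = ['N)', 'NN)', 'N.', 'NN.', '-', '–'].freeze

def escape_block_marker(value)
  # Upper-case ASCII letters in the first 10 characters, as translate() does in the XSLT.
  prefix = value[0, 10].tr('a-z', 'A-Z')
  # Collapse digits and ASCII capitals in the first three characters to 'N'.
  numprefix = prefix[0, 3].tr('0-9A-Z', 'N')

  needs_escape = EXACT_MARKERS.include?(prefix) ||
                 WORD_PREFIXES.any? { |marker| prefix.start_with?(marker) } ||
                 NUM_PREFIXES.any? { |marker| numprefix.start_with?(marker) }

  needs_escape ? "\\#{value}" : value
end

puts escape_block_marker('Art. 5 applies here')   # prints \Art. 5 applies here
puts escape_block_marker('Ordinary sentence.')    # prints Ordinary sentence.
```

As in the stylesheet, only ASCII letters are upper-cased, which is why the Polish markers are matched on their ASCII stems (`ROZDZIA`, `DZIA`, `ODDZIA`) rather than on the full words ending in `ł`.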