slaw 1.0.0.alpha.6 → 1.0.0
- checksums.yaml +4 -4
- data/README.md +13 -147
- data/bin/slaw +2 -1
- data/lib/slaw.rb +0 -6
- data/lib/slaw/generator.rb +2 -8
- data/lib/slaw/grammars/pl/act.treetop +10 -14
- data/lib/slaw/grammars/pl/act_text.xsl +271 -0
- data/lib/slaw/version.rb +1 -1
- data/slaw.gemspec +3 -3
- metadata +6 -17
- data/lib/slaw/act.rb +0 -452
- data/lib/slaw/bylaw.rb +0 -62
- data/lib/slaw/collection.rb +0 -60
- data/lib/slaw/lifecycle_event.rb +0 -23
- data/lib/slaw/render/html.rb +0 -70
- data/lib/slaw/render/xsl/act.xsl +0 -15
- data/lib/slaw/render/xsl/elements.xsl +0 -120
- data/lib/slaw/render/xsl/fragment.xsl +0 -16
- data/spec/act_spec.rb +0 -56
- data/spec/bylaw_spec.rb +0 -49
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 96bb9bd00dc6e71518da515b6595f4a8b5c9a5b0
+data.tar.gz: 793f2aeaedb2dc7e89d479348270c339639b9363
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 4e36394603f98a99668a868ea5458ef64a23585d4fc1794f42fc7e68c335c3186760bfcec83be04f40aa1636f6cb8a1b00c0be73b28515d5376d8838c8ceb3c7
+data.tar.gz: d7e0cfbcf66ee8b1bbce81a5160c5f3302162f2a9df0e3c127ad576a0c0176929fdbd2d0abb0d948f63288d9afcf17d3bd7c9cd92f4168a0f41640b46221850a
data/README.md
CHANGED
@@ -1,14 +1,16 @@
 # Slaw [](http://travis-ci.org/longhotsummer/slaw)
 
-Slaw is a lightweight library for generating
-It is used to power [
-
+Slaw is a lightweight library for generating Akoma Ntoso 2.0 Act XML from plain text and PDF documents.
+It is used to power [Indigo](https://github.com/OpenUpSA/indigo) and uses grammars developed for the legal
+traditions in these countries:
+
+* South Africa
+* Poland
 
 Slaw allows you to:
 
-1.
-2.
-3. unparse Akoma Ntoso XML into text that can be parsed backed into Akoma Ntoso.
+1. parse plain text and transform it into an Akoma Ntoso Act XML document
+2. unparse Akoma Ntoso XML into a plain-text format suitable for re-parsing
 
 Slaw is lightweight because it wraps around a Nokogiri XML representation of
 the parsed document. It provides some support methods for manipulating these
@@ -40,7 +42,7 @@ installed by default on most systems (including Mac). On Ubuntu you can use:
 
 The simplest way to use Slaw is via the commandline:
 
-$ slaw parse myfile.pdf
+$ slaw parse myfile.pdf --grammar za
 
 ## Overview
 
@@ -63,150 +65,13 @@ The grammar cannot catch some subtleties of an act or by-law -- such as nested l
 so Slaw performs some post-processing on the XML produced by the parser. In particular,
 it nests lists correctly.
 
-## Quick Start
-
-Install the gem using
-
-gem install slaw
-
-Extract text from a PDF and parse it as a South African by-law:
-
-```ruby
-require 'slaw'
-
-# extract text from a PDF file and clean it up
-extractor = Slaw::Extract::Extractor.new
-text = extractor.extract_from_pdf('/path/to/file.pdf')
-
-# parse the text into a XML and
-generator = Slaw::ActGenerator.new
-bylaw = generator.generate_from_text(text)
-puts bylaw.to_xml(indent: 2)
-
-# render the by-law as HTML, using / as the root
-# for relative URLs
-renderer = Slaw::Render::HTMLRenderer.new
-puts renderer.render(bylaw.doc, '/')
-```
-
-## Extraction
-
-Extraction is done by the `Slaw::Extract::Extractor` class. It currently handles
-PDF and plain text files. Slaw uses `pdftotext` from the `xpdf` package to extract
-the plain text from PDFs. PDFs are great for presentation, but suck for accurately storing
-text. As a result, the extraction can produce oddities, such as lines broken in weird
-places (or not broken when they should be). Slaw gets around this by running
-some cleanup routines on the extracted text.
-
-For example, it knows that these lines:
-
-(b) any wall, swimming pool, reservoir or bridge
-or any other structure connected therewith; (c) any fuel pump or any
-tank used in connection therewith
-
-should probably be broken at the section numbers:
-
-(b) any wall, swimming pool, reservoir or bridge or any other structure connected therewith;
-(c) any fuel pump or any tank used in connection therewith
-
-If your region's numbering format differs significantly from this, these rules might not work.
-
-Some other steps Slaw takes after extraction include (check `Slaw::Parse::Cleanser` for the full set):
-
-* changing newlines to `\n`, and normalising quotation characters
-* removing page numbers and other boilerplate
-* stripping the table of contents (we can generate our own from the parsed document)
-* changing tabs to spaces, stripping leading and trailing spaces and removing blank lines
-
 ## Parsing
 
 Slaw uses Treetop to compile a grammar into a backtracking parser. The parser builds a parse
-tree,
-
-While most South African by-laws are superficially very similar, there are a sufficient differences
-in their typesetting to make parsing them difficult. The grammar handles most
-edge cases but may not catch them all. The one thing it cannot yet detect well is the difference
-between section titles before and after a section number:
-
-1. Definitions
-In this by-law, the following words ...
-
-Definitions
-1. In this by-law, the following words ...
-
-This must be set by the user before parsing:
-
-```ruby
-generator = Slaw::ZA::BylawGenerator.new
-generator.parser.options = {section_number_after_title: true}
-```
-
-The parser does its best not to choke on input it doesn't understand, preferring a best effort
-to a completely accurate result. For example it may not be able to work out a section heading
-and so will treat it as simply another statement in the previous section. This causes the parser
-to use a lot of backtracking and negative lookahead assertions, which can be slow for large documents.
-
-The grammar supports a number of subsection numbering formats, which are often mixed
-in a document to indicate different levels of nesting.
-
-(a)
-(2)
-(3b)
-(ii)
-3.4
-
-During post-processing it works out how to nest these appropriately.
-
-Special words, such as ``part`` and ``chapter`` are ignored if the line starts with a backslash ``\``.
-
-For more information see the South African by-law grammar at
-[lib/slaw/za/bylaw.treetop](lib/slaw/za/bylaw.treetop) and the list nesting
-at [lib/slaw/parse/blocklists.rb](lib/slaw/parse/blocklists.rb).
-
-## Rendering
-
-Slaw renders XML to HTML using XSLT. For the most part there is a direct mapping between
-Akoma Ntoso structure and the HTML layout, so most AN nodes are simply mapped to `div` or `span`
-elements with a class attribute derived from the name of the AN element and an ID element taken
-from the node, if any. This makes it both fast and flexible, since it's easy to
-apply layout rules with CSS.
-
-Slaw can render either an entire document like this, or just a portion of the XML tree.
-
-```ruby
-# render an entire document
-renderer = Slaw::Render::HTMLRenderer.new
-puts renderer.render(bylaw.doc, '/')
-
-# render the first section only
-puts renderer.render(bylaw.sections[0], '/')
-```
-
-For more information, see [/lib/slaw/render/html.rb](/lib/slaw/render/html.rb).
-
-## Meta-data
-
-Acts and by-laws have metadata which it is not possible to get from their plain text representations,
-such as their title, date and format of publication or act number. Slaw provides some helpers
-for manipulating this meta-data. For example,
-
-```ruby
-bylaw = Slaw::ByLaw.new('spec/fixtures/community-fire-safety.xml')
-print bylaw.id_uri
-bylaw.title = 'A new title'
-bylaw.name = 'a-new-title'
-bylaw.published!(date: '2014-09-28')
-print bylaw.id_uri
-```
-
-## Schedules
-
-South African acts and by-laws can have addendums called schedules. They are technically a part of
-the act but are not part of the primary body and have more relaxed formatting. Slaw finds schedules
-by looking for section headings, but makes no effort to capture the format of their contents.
+tree, the nodes of which know how to serialize themselves in XML format.
 
-
-
+Supporting formats from other country's legal traditions probably requires creating a new grammar
+and parser.
 
 ## Contributing
 
@@ -225,6 +90,7 @@ Akoma Ntoso `component` elements at the end of the XML document, with a name of
 * Slaw no longer does too much introspection of a parsed document, since that can be so tradition-dependent.
 * Move reformatting out of Slaw since it's tradition-dependent.
 * Remove definition linking, Slaw no longer supports it.
+* Remove unused code for interacting with the internals of acts.
 
 ### 0.17.2
 
data/bin/slaw
CHANGED
@@ -90,8 +90,9 @@ class SlawCLI < Thor
 end
 
 desc "unparse FILE", "Unparse FILE from Akoma Ntoso XML back into text suitable for re-parsing"
+option :grammar, type: :string, desc: "Grammar name (usually a two-letter country code). Default is za."
 def unparse(name)
-generator = Slaw::ActGenerator.new
+generator = Slaw::ActGenerator.new(options[:grammar] || 'za')
 
 doc = File.open(name, 'r') { |f| doc = generator.builder.parse_xml(f.read) }
 puts generator.text_from_act(doc)
data/lib/slaw.rb
CHANGED
@@ -4,14 +4,8 @@ require 'slaw/version'
 require 'slaw/namespace'
 require 'slaw/logging'
 
-require 'slaw/act'
-require 'slaw/bylaw'
-require 'slaw/collection'
-
 require 'slaw/xml_support'
-require 'slaw/lifecycle_event'
 
-require 'slaw/render/html'
 require 'slaw/parse/blocklists'
 require 'slaw/parse/builder'
 require 'slaw/parse/cleanser'
data/lib/slaw/generator.rb
CHANGED
@@ -7,9 +7,6 @@ module Slaw
 # [Slaw::Parse::Builder] builder used by the generator
 attr_accessor :builder
 
-# The type that will hold the generated document
-attr_accessor :document_class
-
 @@parsers = {}
 
 def initialize(grammar)
@@ -19,7 +16,6 @@ module Slaw
 @builder = Slaw::Parse::Builder.new(parser: @parser)
 @parser = @builder.parser
 @cleanser = Slaw::Parse::Cleanser.new
-@document_class = Slaw::Act
 end
 
 def build_parser
@@ -39,11 +35,9 @@ module Slaw
 #
 # @param text [String] plain text
 #
-# @return [
+# @return [Nokogiri::Document] the resulting xml
 def generate_from_text(text)
-
-act.doc = @builder.parse_and_process_text(cleanup(text))
-act
+@builder.parse_and_process_text(cleanup(text))
 end
 
 # Run basic cleanup on text, such as ensuring clean newlines
data/lib/slaw/grammars/pl/act.treetop
CHANGED
@@ -111,32 +111,28 @@ module Slaw
 # these are used externally and provide support when parsing just
 # a particular portion of a document
 
-rule
-children:
-end
-
-rule subdivisions
-children:subdivision+ <GroupNode>
+rule articles
+children:article+ <GroupNode>
 end
 
 rule chapters
 children:chapter+ <GroupNode>
 end
 
-rule
-children:
-end
-
-rule sections
-children:section+ <GroupNode>
+rule divisions
+children:division+ <GroupNode>
 end
 
 rule paragraphs
 children:paragraph+ <GroupNode>
 end
 
-rule
-children:
+rule sections
+children:section+ <GroupNode>
+end
+
+rule subdivisions
+children:subdivision+ <GroupNode>
 end
 
 ##########
data/lib/slaw/grammars/pl/act_text.xsl
ADDED
@@ -0,0 +1,271 @@
+<?xml version="1.0"?>
+<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
+xmlns:a="http://www.akomantoso.org/2.0"
+exclude-result-prefixes="a">
+
+<xsl:output method="text" indent="no" omit-xml-declaration="yes" />
+<xsl:strip-space elements="*"/>
+
+<!-- adds a backslash to the start of the value param, if necessary -->
+<xsl:template name="escape">
+<xsl:param name="value"/>
+
+<xsl:variable name="prefix" select="translate(substring($value, 1, 10), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')" />
+<xsl:variable name="numprefix" select="translate(translate(substring($prefix, 1, 3), '1234567890', 'NNNNNNNNNN'), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'NNNNNNNNNNNNNNNNNNNNNNNNNN')" />
+
+<!-- p tags must escape initial content that looks like a block element marker.
+Note that the two hyphens are different characters. -->
+<xsl:if test="$prefix = 'BODY' or
+$prefix = 'PREAMBLE' or
+$prefix = 'PREFACE' or
+starts-with($prefix, 'ROZDZIA') or
+starts-with($prefix, 'DZIA') or
+starts-with($prefix, 'ODDZIA') or
+starts-with($prefix, 'ART.') or
+starts-with($prefix, '§') or
+starts-with($prefix, 'SCHEDULE ') or
+starts-with($prefix, '{|') or
+starts-with($numprefix, 'N)') or
+starts-with($numprefix, 'NN)') or
+starts-with($numprefix, 'N.') or
+starts-with($numprefix, 'NN.') or
+starts-with($numprefix, '-') or
+starts-with($numprefix, '–')">
+<xsl:text>\</xsl:text>
+</xsl:if>
+<xsl:value-of select="$value"/>
+</xsl:template>
+
+<xsl:template match="a:act">
+<xsl:apply-templates select="a:coverPage" />
+<xsl:apply-templates select="a:preface" />
+<xsl:apply-templates select="a:preamble" />
+<xsl:apply-templates select="a:body" />
+<xsl:apply-templates select="a:conclusions" />
+</xsl:template>
+
+<xsl:template match="a:preface">
+<xsl:text>PREFACE</xsl:text>
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates />
+</xsl:template>
+
+<xsl:template match="a:preamble">
+<xsl:text>PREAMBLE</xsl:text>
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates />
+</xsl:template>
+
+<xsl:template match="a:division">
+<xsl:text>Dział </xsl:text>
+<xsl:value-of select="./a:num" />
+<xsl:text> - </xsl:text>
+<xsl:value-of select="./a:heading" />
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates select="./*[not(self::a:num) and not(self::a:heading)]" />
+</xsl:template>
+
+<xsl:template match="a:chapter">
+<xsl:text>Rozdział </xsl:text>
+<xsl:value-of select="./a:num" />
+<xsl:text> - </xsl:text>
+<xsl:value-of select="./a:heading" />
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates select="./*[not(self::a:num) and not(self::a:heading)]" />
+</xsl:template>
+
+<xsl:template match="a:article">
+<xsl:text>Art. </xsl:text>
+<xsl:value-of select="a:num" />
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates select="./*[not(self::a:num)]" />
+</xsl:template>
+
+<xsl:template match="a:section">
+<xsl:text>§ </xsl:text>
+<xsl:value-of select="a:num" />
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates select="./*[not(self::a:num)]" />
+</xsl:template>
+
+<xsl:template match="a:paragraph">
+<xsl:if test="a:num != ''">
+<xsl:value-of select="a:num" />
+<xsl:text> </xsl:text>
+</xsl:if>
+<xsl:apply-templates select="./*[not(self::a:num) and not(self::a:heading)]" />
+</xsl:template>
+
+<xsl:template match="a:indent">
+<xsl:value-of select="a:num" />
+<xsl:text>- </xsl:text>
+<xsl:apply-templates select="./*[not(self::a:num)]" />
+</xsl:template>
+
+<!-- these are block elements and have a newline at the end -->
+<xsl:template match="a:heading">
+<xsl:apply-templates />
+<xsl:text>
+
+</xsl:text>
+</xsl:template>
+
+<xsl:template match="a:p">
+<xsl:apply-templates/>
+<!-- p tags must end with a newline -->
+<xsl:text>
+
+</xsl:text>
+</xsl:template>
+
+<!-- numbered lists -->
+<xsl:template match="a:item | a:alinea | a:point">
+<xsl:value-of select="./a:num" />
+<xsl:text> </xsl:text>
+<xsl:apply-templates select="./*[not(self::a:num)]" />
+</xsl:template>
+
+<xsl:template match="a:list">
+<xsl:if test="a:intro != ''">
+<xsl:value-of select="a:intro" />
+<xsl:text>
+
+</xsl:text>
+</xsl:if>
+<xsl:apply-templates select="./*[not(self::a:intro)]" />
+</xsl:template>
+
+<!-- first text nodes of these elems must be escaped if they have special chars -->
+<xsl:template match="a:p[not(ancestor::a:table)]/text()[1] | a:intro/text()[1]">
+<xsl:call-template name="escape">
+<xsl:with-param name="value" select="." />
+</xsl:call-template>
+</xsl:template>
+
+<!-- components/schedules -->
+<xsl:template match="a:doc">
+<xsl:text>Schedule - </xsl:text>
+<xsl:value-of select="a:meta/a:identification/a:FRBRWork/a:FRBRalias/@value" />
+
+<xsl:if test="a:mainBody/a:article/a:heading">
+<xsl:text>
+</xsl:text>
+<xsl:value-of select="a:mainBody/a:article/a:heading" />
+</xsl:if>
+
+<xsl:text>
+
+</xsl:text>
+<xsl:apply-templates select="a:mainBody" />
+</xsl:template>
+
+<xsl:template match="a:mainBody/a:article/a:heading">
+<!-- no-op, this is handled by the schedules template above -->
+</xsl:template>
+
+<!-- tables -->
+<xsl:template match="a:table">
+<xsl:text>{| </xsl:text>
+
+<!-- attributes -->
+<xsl:for-each select="@*[local-name()!='id']">
+<xsl:value-of select="local-name(.)" />
+<xsl:text>="</xsl:text>
+<xsl:value-of select="." />
+<xsl:text>" </xsl:text>
+</xsl:for-each>
+<xsl:text>
+|-</xsl:text>
+
+<xsl:apply-templates />
+<xsl:text>
+|}
+
+</xsl:text>
+</xsl:template>
+
+<xsl:template match="a:tr">
+<xsl:apply-templates />
+<xsl:text>
+|-</xsl:text>
+</xsl:template>
+
+<xsl:template match="a:th|a:td">
+<xsl:choose>
+<xsl:when test="local-name(.) = 'th'">
+<xsl:text>
+! </xsl:text>
+</xsl:when>
+<xsl:when test="local-name(.) = 'td'">
+<xsl:text>
+| </xsl:text>
+</xsl:when>
+</xsl:choose>
+
+<!-- attributes -->
+<xsl:if test="@*">
+<xsl:for-each select="@*">
+<xsl:value-of select="local-name(.)" />
+<xsl:text>="</xsl:text>
+<xsl:value-of select="." />
+<xsl:text>" </xsl:text>
+</xsl:for-each>
+<xsl:text>| </xsl:text>
+</xsl:if>
+
+<xsl:apply-templates />
+</xsl:template>
+
+<!-- don't end p tags with newlines in tables -->
+<xsl:template match="a:table//a:p">
+<xsl:apply-templates />
+</xsl:template>
+
+<!-- END tables -->
+
+<xsl:template match="a:remark">
+<xsl:text>[</xsl:text>
+<xsl:apply-templates />
+<xsl:text>]</xsl:text>
+</xsl:template>
+
+<xsl:template match="a:ref">
+<xsl:text>[</xsl:text>
+<xsl:apply-templates />
+<xsl:text>](</xsl:text>
+<xsl:value-of select="@href" />
+<xsl:text>)</xsl:text>
+</xsl:template>
+
+<xsl:template match="a:img">
+<xsl:text></xsl:text>
+</xsl:template>
+
+<xsl:template match="a:eol">
+<xsl:text>
+</xsl:text>
+</xsl:template>
+
+
+<!-- for most nodes, just dump their text content -->
+<xsl:template match="*">
+<xsl:text/><xsl:apply-templates /><xsl:text/>
+</xsl:template>
+
+</xsl:stylesheet>
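The `escape` template in the stylesheet above decides whether the first text of a paragraph needs a leading backslash so that it is not re-parsed as a block marker. A minimal Ruby sketch of the same rule may make the prefix logic easier to follow. The method name `escape_block_marker` and the three constants are hypothetical and are not part of Slaw; the sketch only mirrors the conditions visible in the XSLT, including the ASCII-only upper-casing done by `translate()`.

```ruby
# Hypothetical helper mirroring the `escape` template; not part of Slaw's API.

# Values that must match the (upper-cased, 10-character) prefix exactly.
BLOCK_KEYWORDS = %w[BODY PREAMBLE PREFACE].freeze
# Prefixes tested with starts-with() against the upper-cased prefix.
BLOCK_PREFIXES = ['ROZDZIA', 'DZIA', 'ODDZIA', 'ART.', '§', 'SCHEDULE ', '{|'].freeze
# Prefixes tested against the "N"-normalised first three characters.
NUMBER_PREFIXES = ['N)', 'NN)', 'N.', 'NN.', '-', '–'].freeze

def escape_block_marker(value)
  # Upper-case only ASCII letters in the first 10 characters, as translate() does,
  # so "Rozdział" becomes "ROZDZIAł" and still matches 'ROZDZIA'.
  prefix = value[0, 10].tr('a-z', 'A-Z')
  # Collapse digits and ASCII capitals in the first 3 characters to "N".
  numprefix = prefix[0, 3].tr('0-9A-Z', 'N')

  needs_escape = BLOCK_KEYWORDS.include?(prefix) ||
                 BLOCK_PREFIXES.any? { |marker| prefix.start_with?(marker) } ||
                 NUMBER_PREFIXES.any? { |marker| numprefix.start_with?(marker) }

  needs_escape ? "\\#{value}" : value
end

puts escape_block_marker('Art. 5 applies')  # text that looks like an article marker
puts escape_block_marker('2) second point') # text that looks like a numbered point
puts escape_block_marker('Plain text')      # ordinary text is left alone
```

Because letters are also collapsed to `N`, text such as `a) ...` is escaped as well as `1) ...`, which matches the conditions in the stylesheet.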