slaw 0.2.0 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +154 -17
- data/lib/slaw/act.rb +151 -20
- data/lib/slaw/bylaw.rb +36 -20
- data/lib/slaw/schemas/akomantoso20.xsd +6834 -0
- data/lib/slaw/schemas/xml.xsd +120 -0
- data/lib/slaw/version.rb +1 -1
- data/spec/bylaw_spec.rb +68 -0
- data/spec/fixtures/community-fire-safety.xml +3838 -0
- metadata +8 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 30603c7c9387a2f1c2fc9d617f667b41824e0b68
|
4
|
+
data.tar.gz: 2b153cb4679f469f4b0b18e4ba8b6da239d69016
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: a1fb11223dfbd14614eafaf1436e2b73c17fbdbc3fc0511f54492e3cab616c0a1db4f6ebdff0d0e56b498c23f3de1c8ce81ff7cd67e1784253dd6f22457976a2
|
7
|
+
data.tar.gz: 5d46d60e58c26cc44fef10a23f81de366361d538bfe35f649b6f6a81ea121ade63785d76d081fba26fc6413981ba831771f1c32aa5014bb8a02159122f3f40d5
|
data/README.md
CHANGED
@@ -1,7 +1,18 @@
|
|
1
1
|
# Slaw [![Build Status](https://travis-ci.org/longhotsummer/slaw.svg)](http://travis-ci.org/longhotsummer/slaw)
|
2
2
|
|
3
|
-
Slaw is a lightweight library for
|
4
|
-
It is used to power [openbylaws.org.za](http://openbylaws.org.za).
|
3
|
+
Slaw is a lightweight library for generating and rendering Akoma Ntoso 2.0 Act XML from plain text and PDF documents.
|
4
|
+
It is used to power [openbylaws.org.za](http://openbylaws.org.za) and [steno.openbylaws.org.za](http://steno.openbylaws.org.za)
|
5
|
+
and uses grammars developed for South African acts and by-laws.
|
6
|
+
|
7
|
+
Slaw allows you to:
|
8
|
+
|
9
|
+
1. extract plain text from PDFs and clean up that text
|
10
|
+
2. parse plain text and transform it into an Akoma Ntoso Act XML document
|
11
|
+
3. render the XML document into HTML
|
12
|
+
|
13
|
+
Slaw is lightweight because it wraps around a Nokogiri XML representation of
|
14
|
+
the parsed document. It provides some support methods for manipulating these
|
15
|
+
documents, but anything advanced must manipulate the XML directly.
|
5
16
|
|
6
17
|
## Installation
|
7
18
|
|
@@ -13,37 +24,163 @@ And then execute:
|
|
13
24
|
|
14
25
|
$ bundle
|
15
26
|
|
16
|
-
Or install it
|
27
|
+
Or install it with:
|
17
28
|
|
18
29
|
$ gem install slaw
|
19
30
|
|
20
|
-
|
31
|
+
To run PDF extraction you will also need [xpdf](http://www.foolabs.com/xpdf/).
|
32
|
+
If you're on a Mac, you can use:
|
21
33
|
|
22
|
-
|
34
|
+
brew install xpdf
|
23
35
|
|
24
|
-
|
36
|
+
## Overview
|
25
37
|
|
26
|
-
|
27
|
-
|
38
|
+
Slaw generates Acts in the [Akoma Ntoso](http://www.akomantoso.org) 2.0 XML
|
39
|
+
standard for legislative documents. It first parses plain text using a grammar
|
40
|
+
and then generates XML from the resulting syntax tree.
|
28
41
|
|
29
|
-
|
42
|
+
Most by-laws in South Africa are available as PDF documents. Slaw therefore has support
|
43
|
+
for extracting and cleaning up text from PDFs before parsing it. Extracting text from
|
44
|
+
PDFs can produce oddities (such as oddly wrapped lines) and Slaw has a number of
|
45
|
+
rules-of-thumb for correcting these. These rules are based on South African
|
46
|
+
by-laws and may not be suitable for all regions.
|
47
|
+
|
48
|
+
The grammar is expressed as a [Treetop](https://github.com/nathansobo/treetop/) grammar
|
49
|
+
and has been developed specifically for the format of South African acts and by-laws.
|
50
|
+
Grammars for other regions could be developed depending on the complexity of a region's
|
51
|
+
formats.
|
30
52
|
|
31
|
-
|
32
|
-
|
53
|
+
The grammar cannot catch some subtleties of an act or by-law -- such as nested list numbering --
|
54
|
+
so Slaw performs some post-processing on the XML produced by the parser. In particular,
|
55
|
+
it nests lists correctly and looks for specially defined terms and their occurrences in the document.
|
56
|
+
|
57
|
+
## Quick Start
|
58
|
+
|
59
|
+
Install the gem using
|
60
|
+
|
61
|
+
gem install slaw
|
62
|
+
|
63
|
+
Extract text from a PDF and parse it as a South African by-law:
|
33
64
|
|
34
65
|
```ruby
|
66
|
+
require 'slaw'
|
67
|
+
|
68
|
+
# extract text from a PDF file and clean it up
|
35
69
|
extractor = Slaw::Extract::Extractor.new
|
70
|
+
text = extractor.extract_from_pdf('/path/to/file.pdf')
|
36
71
|
|
37
|
-
#
|
38
|
-
|
72
|
+
# parse the text into an XML document and print it
|
73
|
+
generator = Slaw::ZA::ByLawGenerator.new
|
74
|
+
bylaw = generator.generate_from_text(text)
|
75
|
+
puts bylaw.to_xml(indent: 2)
|
39
76
|
|
40
|
-
#
|
41
|
-
|
77
|
+
# render the by-law as HTML, using / as the root
|
78
|
+
# for relative URLs
|
79
|
+
renderer = Slaw::Render::HTMLRenderer.new
|
80
|
+
puts renderer.render(bylaw.doc, '/')
|
81
|
+
```
|
82
|
+
|
83
|
+
## Extraction
|
84
|
+
|
85
|
+
Extraction is done by the `Slaw::Extract::Extractor` class. It currently handles
|
86
|
+
PDF and plain text files. Slaw uses `pdftotext` from the `xpdf` package to extract
|
87
|
+
the plain text from PDFs. PDFs are great for presentation, but suck for accurately storing
|
88
|
+
text. As a result, the extraction can produce oddities, such as lines broken in weird
|
89
|
+
places (or not broken when they should be). Slaw gets around this by running
|
90
|
+
some cleanup routines on the extracted text.
|
91
|
+
|
92
|
+
For example, it knows that these lines:
|
93
|
+
|
94
|
+
(b) any wall, swimming pool, reservoir or bridge
|
95
|
+
or any other structure connected therewith; (c) any fuel pump or any
|
96
|
+
tank used in connection therewith
|
97
|
+
|
98
|
+
should probably be broken at the section numbers:
|
99
|
+
|
100
|
+
(b) any wall, swimming pool, reservoir or bridge or any other structure connected therewith;
|
101
|
+
(c) any fuel pump or any tank used in connection therewith
|
102
|
+
|
103
|
+
If your region's numbering format differs significantly from this, these rules might not work.
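
The re-breaking shown above can be approximated in a few lines of plain Ruby. This is only a sketch: `rebreak_items` is a hypothetical helper, not part of Slaw, and it assumes item markers look like `(b)` or `(c)` and follow a semicolon.

```ruby
# Hypothetical helper, not Slaw's actual cleanup code.
# Joins wrapped lines into one, then starts a new line before every
# "(x)" item marker that directly follows a semicolon.
def rebreak_items(text)
  joined = text.lines.map(&:strip).reject(&:empty?).join(' ')
  joined.gsub(/;\s+(?=\([a-z]+\)\s)/, ";\n")
end

raw = <<~TEXT
  (b) any wall, swimming pool, reservoir or bridge
  or any other structure connected therewith; (c) any fuel pump or any
  tank used in connection therewith
TEXT

puts rebreak_items(raw)
```

A rule this simple would misfire on text that legitimately contains `; (a) ` mid-sentence, which is one reason such heuristics are region-specific.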
|
104
|
+
|
105
|
+
Some other steps Slaw takes after extraction include (check `Slaw::Parse::Cleanser` for the full set):
|
42
106
|
|
43
|
-
|
44
|
-
|
107
|
+
* changing newlines to `\n`, and normalising quotation characters
|
108
|
+
* removing page numbers and other boilerplate
|
109
|
+
* stripping the table of contents (we can generate our own from the parsed document)
|
110
|
+
* changing tabs to spaces, stripping leading and trailing spaces and removing blank lines
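
As a rough illustration of the simpler steps in this list, the following sketch normalises newlines, quotes and whitespace. `basic_cleanup` is a hypothetical name; the real rules live in `Slaw::Parse::Cleanser` and cover more cases (page numbers, tables of contents and so on).

```ruby
# Hypothetical sketch of generic cleanup steps, not Slaw's implementation.
def basic_cleanup(text)
  text
    .gsub(/\r\n?/, "\n")       # normalise \r\n and bare \r to \n
    .tr("\u201C\u201D", '"')   # curly double quotes -> straight
    .tr("\u2018\u2019", "'")   # curly single quotes -> straight
    .gsub("\t", ' ')           # tabs -> spaces
    .lines
    .map(&:strip)              # strip leading and trailing whitespace
    .reject(&:empty?)          # remove blank lines
    .join("\n")
end

puts basic_cleanup("  \u201CFire\u201D\tsafety \r\n\r\n  1. Title\t\n")
```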
|
111
|
+
|
112
|
+
## Parsing
|
113
|
+
|
114
|
+
Slaw uses Treetop to compile a grammar into a backtracking parser. The parser builds a parse
|
115
|
+
tree, each node of which knows how to serialize itself in XML format.
|
116
|
+
|
117
|
+
While most South African by-laws are superficially very similar, there are sufficient differences
|
118
|
+
in their typesetting to make parsing them difficult. The grammar handles most
|
119
|
+
edge cases but may not catch them all. The one thing it cannot yet detect well is the difference
|
120
|
+
between section titles before and after a section number:
|
121
|
+
|
122
|
+
1. Definitions
|
123
|
+
In this by-law, the following words ...
|
124
|
+
|
125
|
+
Definitions
|
126
|
+
1. In this by-law, the following words ...
|
127
|
+
|
128
|
+
This must be set by the user before parsing.
|
129
|
+
|
130
|
+
The parser does its best not to choke on input it doesn't understand, preferring a best effort
|
131
|
+
to a completely accurate result. For example it may not be able to work out a section heading
|
132
|
+
and so will treat it as simply another statement in the previous section. This causes the parser
|
133
|
+
to use a lot of backtracking and negative lookahead assertions, which can be slow for large documents.
|
134
|
+
|
135
|
+
The grammar supports a number of subsection numbering formats, which are often mixed
|
136
|
+
in a document to indicate different levels of nesting.
|
137
|
+
|
138
|
+
(a)
|
139
|
+
(2)
|
140
|
+
(3b)
|
141
|
+
(ii)
|
142
|
+
3.4
|
143
|
+
|
144
|
+
During post-processing it works out how to nest these appropriately.
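
To make the numbering formats concrete, here is a small sketch that classifies a marker by its shape. The patterns and the names `NUMBER_FORMATS` and `number_format` are assumptions for illustration and are not taken from the Treetop grammar or from `blocklists.rb`.

```ruby
# Hypothetical classifier for subsection markers; patterns are illustrative.
# Order matters: roman is checked before alpha, so "(ii)" is treated as roman.
NUMBER_FORMATS = {
  roman:         /\A\([ivx]+\)\z/,      # (i), (ii), (iv)
  alpha:         /\A\([a-z]+\)\z/,      # (a), (b), (aa)
  numeric:       /\A\(\d+\)\z/,         # (1), (2)
  numeric_alpha: /\A\(\d+[a-z]+\)\z/,   # (3b)
  dotted:        /\A\d+(?:\.\d+)+\z/    # 3.4, 3.4.1
}.freeze

# Returns the first matching format name, or nil if nothing matches.
def number_format(marker)
  NUMBER_FORMATS.each { |name, pattern| return name if marker.match?(pattern) }
  nil
end

%w[(a) (2) (3b) (ii) 3.4].each do |marker|
  puts "#{marker} -> #{number_format(marker)}"
end
```

Markers such as `(i)` or `(v)` are ambiguous between the roman and alphabetic schemes. Shape alone cannot settle that; the surrounding items are needed, which fits with nesting being resolved in post-processing rather than in the grammar.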
|
145
|
+
|
146
|
+
For more information see the South African by-law grammar at
|
147
|
+
[lib/slaw/za/bylaw.treetop](lib/slaw/za/bylaw.treetop) and the list nesting
|
148
|
+
at [lib/slaw/parse/blocklists.rb](lib/slaw/parse/blocklists.rb).
|
149
|
+
|
150
|
+
## Rendering
|
151
|
+
|
152
|
+
Slaw renders XML to HTML using XSLT. For the most part there is a direct mapping between
|
153
|
+
Akoma Ntoso structure and the HTML layout, so most AN nodes are simply mapped to `div` or `span`
|
154
|
+
elements with a class attribute derived from the name of the AN element and an ID element taken
|
155
|
+
from the node, if any. This makes it both fast and flexible, since it's easy to
|
156
|
+
apply layout rules with CSS.
|
157
|
+
|
158
|
+
Slaw can render either an entire document like this, or just a portion of the XML tree.
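
The mapping rule can be sketched without XSLT. The example below is plain Ruby over a hypothetical `Node` struct rather than Nokogiri nodes, and the `an-` class prefix is an assumption; it only illustrates the "element name becomes class, ID is carried over" idea.

```ruby
# Hypothetical stand-in for an XML node. Slaw itself renders with XSLT.
Node = Struct.new(:name, :id, :inline, :children, :text, keyword_init: true)

# Map a node to a div (block) or span (inline) whose class comes from the
# element name and whose id, if any, is copied across.
# Note: no HTML escaping is done here, to keep the sketch short.
def to_html(node)
  tag   = node.inline ? 'span' : 'div'
  attrs = %( class="an-#{node.name}")
  attrs += %( id="#{node.id}") if node.id
  inner = node.text.to_s + Array(node.children).map { |c| to_html(c) }.join
  "<#{tag}#{attrs}>#{inner}</#{tag}>"
end

section = Node.new(name: 'section', id: 'section-1', children: [
  Node.new(name: 'num', inline: true, text: '1.'),
  Node.new(name: 'heading', text: 'Definitions')
])

puts to_html(section)
```

Because `to_html` works on any node, the same function renders a whole tree or just one subtree.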
|
159
|
+
|
160
|
+
## Meta-data
|
161
|
+
|
162
|
+
Acts and by-laws have metadata which it is not possible to get from their plain text representations,
|
163
|
+
such as their title, date and format of publication or act number. Slaw provides some helpers
|
164
|
+
for manipulating this meta-data. For example,
|
165
|
+
|
166
|
+
```ruby
|
167
|
+
bylaw = Slaw::ByLaw.new('spec/fixtures/community-fire-safety.xml')
|
168
|
+
print bylaw.id_uri
|
169
|
+
bylaw.title = 'A new title'
|
170
|
+
bylaw.name = 'a-new-title'
|
171
|
+
bylaw.published!(date: '2014-09-28')
|
172
|
+
print bylaw.id_uri
|
45
173
|
```
|
46
174
|
|
175
|
+
## Schedules
|
176
|
+
|
177
|
+
South African acts and by-laws can have addendums called schedules. They are technically a part of
|
178
|
+
the act but are not part of the primary body and have more relaxed formatting. Slaw finds schedules
|
179
|
+
by looking for section headings, but makes no effort to capture the format of their contents.
|
180
|
+
|
181
|
+
Akoma Ntoso has no explicit support for schedules. Instead, Slaw stores all schedules under a single
|
182
|
+
Akoma Ntoso `component` element at the end of the XML document, with a name of `schedules`.
|
183
|
+
|
47
184
|
## Contributing
|
48
185
|
|
49
186
|
1. Fork it at http://github.com/longhotsummer/slaw/fork
|
data/lib/slaw/act.rb
CHANGED
@@ -18,25 +18,31 @@ module Slaw
|
|
18
18
|
attr_accessor :doc
|
19
19
|
|
20
20
|
# [Nokogiri::XML::Node] The `meta` XML node
|
21
|
-
|
21
|
+
attr_reader :meta
|
22
22
|
|
23
23
|
# [Nokogiri::XML::Node] The `body` XML node
|
24
|
-
|
24
|
+
attr_reader :body
|
25
25
|
|
26
26
|
# [String] The year this act was published
|
27
|
-
|
27
|
+
attr_reader :year
|
28
28
|
|
29
29
|
# [String] The act number in the year this act was published
|
30
|
-
|
30
|
+
attr_reader :num
|
31
31
|
|
32
32
|
# [String] The FRBR URI of this act, which uniquely identifies it globally
|
33
|
-
|
33
|
+
attr_reader :id_uri
|
34
34
|
|
35
35
|
# [String, nil] The source filename, or nil
|
36
|
-
|
36
|
+
attr_reader :filename
|
37
37
|
|
38
38
|
# [Time, nil] The mtime of when the source file was last modified
|
39
|
-
|
39
|
+
attr_reader :mtime
|
40
|
+
|
41
|
+
# [String] The underlying nature of this act, usually `act` although subclasses may override this.
|
42
|
+
attr_reader :nature
|
43
|
+
|
44
|
+
# [Nokogiri::XML::Schema] schema to validate against
|
45
|
+
attr_accessor :schema
|
40
46
|
|
41
47
|
# Get the act that wraps the document that owns this XML node
|
42
48
|
# @param node [Nokogiri::XML::Node]
|
@@ -49,6 +55,7 @@ module Slaw
|
|
49
55
|
# @param filename [String] filename to load XML from
|
50
56
|
def initialize(filename=nil)
|
51
57
|
self.load(filename) if filename
|
58
|
+
@schema = nil
|
52
59
|
end
|
53
60
|
|
54
61
|
# Load the XML in `filename` into this instance
|
@@ -60,8 +67,9 @@ module Slaw
|
|
60
67
|
File.open(filename) { |f| parse(f) }
|
61
68
|
end
|
62
69
|
|
63
|
-
# Parse the XML contained in the file-like object `io`
|
64
|
-
#
|
70
|
+
# Parse the XML contained in the file-like or String object `io`
|
71
|
+
#
|
72
|
+
# @param io [String, file-like] io object or String with XML
|
65
73
|
def parse(io)
|
66
74
|
self.doc = Nokogiri::XML(io)
|
67
75
|
end
|
@@ -76,26 +84,90 @@ module Slaw
|
|
76
84
|
|
77
85
|
@@acts[@doc] = self
|
78
86
|
|
79
|
-
|
87
|
+
extract_id_uri
|
80
88
|
end
|
81
89
|
|
82
|
-
#
|
83
|
-
|
84
|
-
|
85
|
-
|
90
|
+
# Directly set the FRBR URI of this act. This must be a well-formed URI,
|
91
|
+
# such as `/za/act/2002/2`. This will, in turn, update the {#year}, {#nature},
|
92
|
+
# {#country} and {#num} attributes.
|
93
|
+
#
|
94
|
+
# You probably don't want to use this method. Instead, set each component
|
95
|
+
# (such as {#date}) manually.
|
96
|
+
#
|
97
|
+
# @param uri [String] new URI
|
98
|
+
def id_uri=(uri)
|
99
|
+
for component, xpath in [['main', '//a:act/a:meta/a:identification'],
|
100
|
+
['schedules', '//a:component/a:doc/a:meta/a:identification']] do
|
101
|
+
ident = @doc.at_xpath(xpath, a: NS)
|
102
|
+
next if not ident
|
103
|
+
|
104
|
+
# work
|
105
|
+
ident.at_xpath('a:FRBRWork/a:FRBRthis', a: NS)['value'] = "#{uri}/#{component}"
|
106
|
+
ident.at_xpath('a:FRBRWork/a:FRBRuri', a: NS)['value'] = uri
|
107
|
+
|
108
|
+
# expression
|
109
|
+
ident.at_xpath('a:FRBRExpression/a:FRBRthis', a: NS)['value'] = "#{uri}/#{component}/eng@"
|
110
|
+
ident.at_xpath('a:FRBRExpression/a:FRBRuri', a: NS)['value'] = "#{uri}/eng@"
|
111
|
+
|
112
|
+
# manifestation
|
113
|
+
ident.at_xpath('a:FRBRManifestation/a:FRBRthis', a: NS)['value'] = "#{uri}/#{component}/eng@"
|
114
|
+
ident.at_xpath('a:FRBRManifestation/a:FRBRuri', a: NS)['value'] = "#{uri}/eng@"
|
115
|
+
end
|
86
116
|
|
87
|
-
|
88
|
-
|
117
|
+
extract_id_uri
|
118
|
+
end
|
119
|
+
|
120
|
+
# The date at which this act was first created/promulgated.
|
121
|
+
#
|
122
|
+
# @return [String] date, YYYY-MM-DD
|
123
|
+
def date
|
124
|
+
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRdate[@name="Generation"]', a: NS)
|
125
|
+
node && node['date']
|
126
|
+
end
|
127
|
+
|
128
|
+
# Set the date at which this act was first created/promulgated. This is usually the same
|
129
|
+
# as the publication date but this is not enforced.
|
130
|
+
#
|
131
|
+
# This also updates the {#year} of this act, which in turn updates the {#id_uri}.
|
132
|
+
#
|
133
|
+
# @param value [String] date, YYYY-MM-DD
|
134
|
+
def date=(value)
|
135
|
+
for frbr in ['FRBRWork', 'FRBRExpression'] do
|
136
|
+
@meta.at_xpath("./a:identification/a:#{frbr}/a:FRBRdate[@name=\"Generation\"]", a: NS)['date'] = value
|
137
|
+
end
|
138
|
+
|
139
|
+
self.year = value.split('-')[0]
|
140
|
+
end
|
141
|
+
|
142
|
+
# Set the year for this act. You probably want to call {#date=} instead.
|
143
|
+
#
|
144
|
+
# This will also update the {#id_uri} but will not change {#date} at all.
|
145
|
+
#
|
146
|
+
# @param year [String, Number] year
|
147
|
+
def year=(year)
|
148
|
+
@year = year.to_s
|
149
|
+
rebuild_id_uri
|
89
150
|
end
|
90
151
|
|
91
152
|
# An applicable short title for this act, either from the `FRBRalias` element
|
92
153
|
# or based on the act number and year.
|
93
154
|
# @return [String]
|
94
|
-
def
|
155
|
+
def title
|
95
156
|
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRalias', a: NS)
|
96
157
|
node ? node['value'] : "Act #{num} of #{year}"
|
97
158
|
end
|
98
159
|
|
160
|
+
# Change the title of this act.
|
161
|
+
def title=(value)
|
162
|
+
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRalias', a: NS)
|
163
|
+
unless node
|
164
|
+
node = @doc.create_element('FRBRalias')
|
165
|
+
@meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS).after(node)
|
166
|
+
end
|
167
|
+
|
168
|
+
node['value'] = value
|
169
|
+
end
|
170
|
+
|
99
171
|
# Has this act been amended? This is determined by testing the `contains`
|
100
172
|
# attribute of the `act` root element.
|
101
173
|
#
|
@@ -250,6 +322,24 @@ module Slaw
|
|
250
322
|
@meta.at_xpath('./a:publication', a: NS)
|
251
323
|
end
|
252
324
|
|
325
|
+
# Update the publication details of the act. All elements are optional.
|
326
|
+
#
|
327
|
+
# @option details [String] :name name of the publication
|
328
|
+
# @option details [String] :number publication number
|
329
|
+
# @option details [String] :date date of publication (YYYY-MM-DD)
|
330
|
+
def published!(details)
|
331
|
+
node = @meta.at_xpath('./a:publication', a: NS)
|
332
|
+
unless node
|
333
|
+
node = @doc.create_element('publication')
|
334
|
+
@meta.at_xpath('./a:identification', a: NS).after(node)
|
335
|
+
end
|
336
|
+
|
337
|
+
node['showAs'] = details[:name] if details.has_key? :name
|
338
|
+
node['name'] = details[:name] if details.has_key? :name
|
339
|
+
node['date'] = details[:date] if details.has_key? :date
|
340
|
+
node['number'] = details[:number] if details.has_key? :number
|
341
|
+
end
|
342
|
+
|
253
343
|
# Has this by-law been repealed?
|
254
344
|
#
|
255
345
|
# @return [Boolean]
|
@@ -297,14 +387,55 @@ module Slaw
|
|
297
387
|
node && node['date']
|
298
388
|
end
|
299
389
|
|
300
|
-
#
|
301
|
-
|
302
|
-
|
390
|
+
# Validate the XML behind this document against the Akoma Ntoso schema and return
|
391
|
+
# any errors.
|
392
|
+
#
|
393
|
+
# @return [Array] array of errors, possibly empty
|
394
|
+
def validate
|
395
|
+
@schema ||= Dir.chdir(File.dirname(__FILE__) + "/schemas") { Nokogiri::XML::Schema(File.read('akomantoso20.xsd')) }
|
396
|
+
@schema.validate(@doc)
|
397
|
+
end
|
398
|
+
|
399
|
+
# Does this document validate against the schema?
|
400
|
+
#
|
401
|
+
# @see {#validate}
|
402
|
+
def validates?
|
403
|
+
validate.empty?
|
404
|
+
end
|
405
|
+
|
406
|
+
# Serialise the XML for this act, passing `args` to the Nokogiri serialiser.
|
407
|
+
# The most useful argument is usually `indent: 2` if you like your XML perdy.
|
408
|
+
#
|
409
|
+
# @return [String] serialized XML
|
410
|
+
def to_xml(*args)
|
411
|
+
@doc.to_xml(*args)
|
303
412
|
end
|
304
413
|
|
305
414
|
def inspect
|
306
415
|
"<#{self.class.name} @id_uri=\"#{@id_uri}\">"
|
307
416
|
end
|
417
|
+
|
418
|
+
protected
|
419
|
+
|
420
|
+
# Parse the FRBR Uri into its constituent parts
|
421
|
+
def extract_id_uri
|
422
|
+
@id_uri = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS)['value']
|
423
|
+
empty, @country, @nature, date, @num = @id_uri.split('/')
|
424
|
+
|
425
|
+
# yyyy-mm-dd
|
426
|
+
@year = date.split('-', 2)[0]
|
427
|
+
end
|
428
|
+
|
429
|
+
def build_id_uri
|
430
|
+
# /za/act/2002/3
|
431
|
+
"/#{@country}/#{@nature}/#{@year}/#{@num}"
|
432
|
+
end
|
433
|
+
|
434
|
+
# This rebuilds the FRBR URI for this document using its constituent components. It will
|
435
|
+
# update the XML then re-split the URI and grab its components.
|
436
|
+
def rebuild_id_uri
|
437
|
+
self.id_uri = build_id_uri
|
438
|
+
end
|
308
439
|
end
|
309
440
|
|
310
441
|
end
|
data/lib/slaw/bylaw.rb
CHANGED
@@ -7,40 +7,56 @@ module Slaw
|
|
7
7
|
# is not identified by a year and a number, and therefore has a different FRBR uri structure.
|
8
8
|
class ByLaw < Act
|
9
9
|
|
10
|
-
# [String] The region this by-law applies to
|
11
|
-
|
10
|
+
# [String] The code of the region this by-law applies to
|
11
|
+
attr_reader :region
|
12
12
|
|
13
13
|
# [String] A short file-like name of this by-law, unique within its year and region
|
14
|
-
|
15
|
-
|
16
|
-
def _extract_id
|
17
|
-
# /za/by-law/cape-town/2010/public-parks
|
18
|
-
|
19
|
-
@id_uri = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS)['value']
|
20
|
-
empty, @country, type, @region, date, @name = @id_uri.split('/')
|
21
|
-
|
22
|
-
# yyyy[-mm-dd]
|
23
|
-
@year = date.split('-', 2)[0]
|
24
|
-
end
|
14
|
+
attr_reader :name
|
25
15
|
|
26
16
|
# ByLaws don't have numbers, use their short-name instead
|
27
17
|
def num
|
28
18
|
name
|
29
19
|
end
|
30
20
|
|
31
|
-
def
|
21
|
+
def title
|
32
22
|
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRalias', a: NS)
|
33
|
-
|
23
|
+
title = node ? node['value'] : "(Unknown)"
|
34
24
|
|
35
|
-
if amended? and not
|
36
|
-
|
25
|
+
if amended? and not title.end_with?("as amended")
|
26
|
+
title = title + " as amended"
|
37
27
|
end
|
38
28
|
|
39
|
-
|
29
|
+
title
|
30
|
+
end
|
31
|
+
|
32
|
+
# Set the short (file-like) name for this bylaw. This changes the {#id_uri}.
|
33
|
+
def name=(value)
|
34
|
+
@name = value
|
35
|
+
rebuild_id_uri
|
40
36
|
end
|
41
37
|
|
42
|
-
|
43
|
-
|
38
|
+
# Set the region code for this bylaw. This changes the {#id_uri}.
|
39
|
+
def region=(value)
|
40
|
+
@region = value
|
41
|
+
rebuild_id_uri
|
44
42
|
end
|
43
|
+
|
44
|
+
protected
|
45
|
+
|
46
|
+
def extract_id_uri
|
47
|
+
# /za/by-law/cape-town/2010/public-parks
|
48
|
+
|
49
|
+
@id_uri = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS)['value']
|
50
|
+
empty, @country, @nature, @region, date, @name = @id_uri.split('/')
|
51
|
+
|
52
|
+
# yyyy[-mm-dd]
|
53
|
+
@year = date.split('-', 2)[0]
|
54
|
+
end
|
55
|
+
|
56
|
+
def build_id_uri
|
57
|
+
# /za/by-law/cape-town/2010/public-parks
|
58
|
+
"/#{@country}/#{@nature}/#{@region}/#{@year}/#{@name}"
|
59
|
+
end
|
60
|
+
|
45
61
|
end
|
46
62
|
end
|
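
The split-and-rebuild cycle behind `extract_id_uri` and `build_id_uri` can be tried in isolation. The sketch below uses a hypothetical `ByLawUri` struct in place of `Slaw::ByLaw`, so it needs no XML document; the field order follows the `/za/by-law/cape-town/2010/public-parks` example in the source.

```ruby
# Hypothetical value object. Slaw::ByLaw keeps these parts as instance
# variables and writes the rebuilt URI back into the XML.
ByLawUri = Struct.new(:country, :nature, :region, :year, :name) do
  # Split a FRBR URI into its parts; the leading "/" yields an empty
  # first element, which is discarded.
  def self.parse(uri)
    _empty, country, nature, region, date, name = uri.split('/')
    new(country, nature, region, date.split('-', 2)[0], name)
  end

  # Rebuild the URI from the current parts.
  def to_s
    "/#{country}/#{nature}/#{region}/#{year}/#{name}"
  end
end

uri = ByLawUri.parse('/za/by-law/cape-town/2010-05-01/public-parks')
uri.name = 'a-new-title'
puts uri.to_s
```

Only the year survives the round trip: any month and day in the date segment are dropped when the URI is rebuilt, as in the diff above.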