slaw 0.2.0 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +154 -17
- data/lib/slaw/act.rb +151 -20
- data/lib/slaw/bylaw.rb +36 -20
- data/lib/slaw/schemas/akomantoso20.xsd +6834 -0
- data/lib/slaw/schemas/xml.xsd +120 -0
- data/lib/slaw/version.rb +1 -1
- data/spec/bylaw_spec.rb +68 -0
- data/spec/fixtures/community-fire-safety.xml +3838 -0
- metadata +8 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 30603c7c9387a2f1c2fc9d617f667b41824e0b68
|
4
|
+
data.tar.gz: 2b153cb4679f469f4b0b18e4ba8b6da239d69016
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: a1fb11223dfbd14614eafaf1436e2b73c17fbdbc3fc0511f54492e3cab616c0a1db4f6ebdff0d0e56b498c23f3de1c8ce81ff7cd67e1784253dd6f22457976a2
|
7
|
+
data.tar.gz: 5d46d60e58c26cc44fef10a23f81de366361d538bfe35f649b6f6a81ea121ade63785d76d081fba26fc6413981ba831771f1c32aa5014bb8a02159122f3f40d5
|
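The `checksums.yaml` entries above record SHA1 and SHA512 digests for `metadata.gz` and `data.tar.gz`. A minimal sketch of how such digests could be recomputed and compared using Ruby's standard `digest` library follows. The helper names `archive_checksums` and `checksum_mismatches` are hypothetical, and the temporary file merely stands in for a real archive; this is not how RubyGems itself performs verification.

```ruby
require 'digest/sha1'
require 'digest/sha2'
require 'tempfile'

# Compute the hex digests that a checksums.yaml-style listing records
# for a single archive file. (Hypothetical helper, for illustration.)
def archive_checksums(path)
  data = File.binread(path)
  {
    'SHA1'   => Digest::SHA1.hexdigest(data),
    'SHA512' => Digest::SHA512.hexdigest(data)
  }
end

# Return the algorithm names whose recorded digest does not match the file.
def checksum_mismatches(path, expected)
  actual = archive_checksums(path)
  expected.reject { |algorithm, digest| actual[algorithm] == digest }.keys
end

# Demonstration with a throwaway file standing in for data.tar.gz.
Tempfile.create('data.tar.gz') do |f|
  f.binmode
  f.write('example archive bytes')
  f.flush

  recorded = archive_checksums(f.path)
  puts checksum_mismatches(f.path, recorded).inspect
  puts checksum_mismatches(f.path, recorded.merge('SHA1' => '0' * 40)).inspect
end
```

An empty list means every recorded digest matched; any algorithm name in the list points at a digest that differs from the file on disk.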
data/README.md
CHANGED
@@ -1,7 +1,18 @@
|
|
1
1
|
# Slaw [](http://travis-ci.org/longhotsummer/slaw)
|
2
2
|
|
3
|
-
Slaw is a lightweight library for
|
4
|
-
It is used to power [openbylaws.org.za](http://openbylaws.org.za).
|
3
|
+
Slaw is a lightweight library for generating and rendering Akoma Ntoso 2.0 Act XML from plain text and PDF documents.
|
4
|
+
It is used to power [openbylaws.org.za](http://openbylaws.org.za) and [steno.openbylaws.org.za](http://steno.openbylaws.org.za)
|
5
|
+
and uses grammars developed for South African acts and by-laws.
|
6
|
+
|
7
|
+
Slaw allows you to:
|
8
|
+
|
9
|
+
1. extract plain text from PDFs and clean up that text
|
10
|
+
2. parse plain text and transform it into an Akoma Ntoso Act XML document
|
11
|
+
3. render the XML document into HTML
|
12
|
+
|
13
|
+
Slaw is lightweight because it wraps around a Nokogiri XML representation of
|
14
|
+
the parsed document. It provides some support methods for manipulating these
|
15
|
+
documents, but anything advanced must manipulate the XML directly.
|
5
16
|
|
6
17
|
## Installation
|
7
18
|
|
@@ -13,37 +24,163 @@ And then execute:
|
|
13
24
|
|
14
25
|
$ bundle
|
15
26
|
|
16
|
-
Or install it
|
27
|
+
Or install it with:
|
17
28
|
|
18
29
|
$ gem install slaw
|
19
30
|
|
20
|
-
|
31
|
+
To run PDF extraction you will also need [xpdf](http://www.foolabs.com/xpdf/).
|
32
|
+
If you're on a Mac, you can use:
|
21
33
|
|
22
|
-
|
34
|
+
brew install xpdf
|
23
35
|
|
24
|
-
|
36
|
+
## Overview
|
25
37
|
|
26
|
-
|
27
|
-
|
38
|
+
Slaw generates Acts in the [Akoma Ntoso](http://www.akomantoso.org) 2.0 XML
|
39
|
+
standard for legislative documents. It first parses plain text using a grammar
|
40
|
+
and then generates XML from the resulting syntax tree.
|
28
41
|
|
29
|
-
|
42
|
+
Most by-laws in South Africa are available as PDF documents. Slaw therefore has support
|
43
|
+
for extracting and cleaning up text from PDFs before parsing it. Extracting text from
|
44
|
+
PDFs can produce oddities (such as oddly wrapped lines) and Slaw has a number of
|
45
|
+
rules-of-thumb for correcting these. These rules are based on South African
|
46
|
+
by-laws and may not be suitable for all regions.
|
47
|
+
|
48
|
+
The grammar is expressed as a [Treetop](https://github.com/nathansobo/treetop/) grammar
|
49
|
+
and has been developed specifically for the format of South African acts and by-laws.
|
50
|
+
Grammars for other regions could be developed depending on the complexity of a region's
|
51
|
+
formats.
|
30
52
|
|
31
|
-
|
32
|
-
|
53
|
+
The grammar cannot catch some subtleties of an act or by-law -- such as nested list numbering --
|
54
|
+
so Slaw performs some post-processing on the XML produced by the parser. In particular,
|
55
|
+
it nests lists correctly and looks for specially defined terms and their occurrences in the document.
|
56
|
+
|
57
|
+
## Quick Start
|
58
|
+
|
59
|
+
Install the gem using
|
60
|
+
|
61
|
+
gem install slaw
|
62
|
+
|
63
|
+
Extract text from a PDF and parse it as a South African by-law:
|
33
64
|
|
34
65
|
```ruby
|
66
|
+
require 'slaw'
|
67
|
+
|
68
|
+
# extract text from a PDF file and clean it up
|
35
69
|
extractor = Slaw::Extract::Extractor.new
|
70
|
+
text = extractor.extract_from_pdf('/path/to/file.pdf')
|
36
71
|
|
37
|
-
#
|
38
|
-
|
72
|
+
# parse the text into XML and print it
|
73
|
+
generator = Slaw::ZA::ByLawGenerator.new
|
74
|
+
bylaw = generator.generate_from_text(text)
|
75
|
+
puts bylaw.to_xml(indent: 2)
|
39
76
|
|
40
|
-
#
|
41
|
-
|
77
|
+
# render the by-law as HTML, using / as the root
|
78
|
+
# for relative URLs
|
79
|
+
renderer = Slaw::Render::HTMLRenderer.new
|
80
|
+
puts renderer.render(bylaw.doc, '/')
|
81
|
+
```
|
82
|
+
|
83
|
+
## Extraction
|
84
|
+
|
85
|
+
Extraction is done by the `Slaw::Extract::Extractor` class. It currently handles
|
86
|
+
PDF and plain text files. Slaw uses `pdftotext` from the `xpdf` package to extract
|
87
|
+
the plain text from PDFs. PDFs are great for presentation, but suck for accurately storing
|
88
|
+
text. As a result, the extraction can produce oddities, such as lines broken in weird
|
89
|
+
places (or not broken when they should be). Slaw gets around this by running
|
90
|
+
some cleanup routines on the extracted text.
|
91
|
+
|
92
|
+
For example, it knows that these lines:
|
93
|
+
|
94
|
+
(b) any wall, swimming pool, reservoir or bridge
|
95
|
+
or any other structure connected therewith; (c) any fuel pump or any
|
96
|
+
tank used in connection therewith
|
97
|
+
|
98
|
+
should probably be broken at the section numbers:
|
99
|
+
|
100
|
+
(b) any wall, swimming pool, reservoir or bridge or any other structure connected therewith;
|
101
|
+
(c) any fuel pump or any tank used in connection therewith
|
102
|
+
|
103
|
+
If your region's numbering format differs significantly from this, these rules might not work.
|
104
|
+
|
105
|
+
Some other steps Slaw takes after extraction include (check `Slaw::Parse::Cleanser` for the full set):
|
42
106
|
|
43
|
-
|
44
|
-
|
107
|
+
* changing newlines to `\n`, and normalising quotation characters
|
108
|
+
* removing page numbers and other boilerplate
|
109
|
+
* stripping the table of contents (we can generate our own from the parsed document)
|
110
|
+
* changing tabs to spaces, stripping leading and trailing spaces and removing blank lines
|
111
|
+
|
112
|
+
## Parsing
|
113
|
+
|
114
|
+
Slaw uses Treetop to compile a grammar into a backtracking parser. The parser builds a parse
|
115
|
+
tree, each node of which knows how to serialize itself in XML format.
|
116
|
+
|
117
|
+
While most South African by-laws are superficially very similar, there are sufficient differences
|
118
|
+
in their typesetting to make parsing them difficult. The grammar handles most
|
119
|
+
edge cases but may not catch them all. The one thing it cannot yet detect well is the difference
|
120
|
+
between section titles before and after a section number:
|
121
|
+
|
122
|
+
1. Definitions
|
123
|
+
In this by-law, the following words ...
|
124
|
+
|
125
|
+
Definitions
|
126
|
+
1. In this by-law, the following words ...
|
127
|
+
|
128
|
+
This must be set by the user before parsing.
|
129
|
+
|
130
|
+
The parser does its best not to choke on input it doesn't understand, preferring a best effort
|
131
|
+
to a completely accurate result. For example, it may not be able to work out a section heading
|
132
|
+
and so will treat it as simply another statement in the previous section. This causes the parser
|
133
|
+
to use a lot of backtracking and negative lookahead assertions, which can be slow for large documents.
|
134
|
+
|
135
|
+
The grammar supports a number of subsection numbering formats, which are often mixed
|
136
|
+
in a document to indicate different levels of nesting.
|
137
|
+
|
138
|
+
(a)
|
139
|
+
(2)
|
140
|
+
(3b)
|
141
|
+
(ii)
|
142
|
+
3.4
|
143
|
+
|
144
|
+
During post-processing it works out how to nest these appropriately.
|
145
|
+
|
146
|
+
For more information see the South African by-law grammar at
|
147
|
+
[lib/slaw/za/bylaw.treetop](lib/slaw/za/bylaw.treetop) and the list nesting
|
148
|
+
at [lib/slaw/parse/blocklists.rb](lib/slaw/parse/blocklists.rb).
|
149
|
+
|
150
|
+
## Rendering
|
151
|
+
|
152
|
+
Slaw renders XML to HTML using XSLT. For the most part there is a direct mapping between
|
153
|
+
Akoma Ntoso structure and the HTML layout, so most AN nodes are simply mapped to `div` or `span`
|
154
|
+
elements with a class attribute derived from the name of the AN element and an ID element taken
|
155
|
+
from the node, if any. This makes it both fast and flexible, since it's easy to
|
156
|
+
apply layout rules with CSS.
|
157
|
+
|
158
|
+
Slaw can render either an entire document like this, or just a portion of the XML tree.
|
159
|
+
|
160
|
+
## Meta-data
|
161
|
+
|
162
|
+
Acts and by-laws have metadata which it is not possible to get from their plain text representations,
|
163
|
+
such as their title, date and format of publication or act number. Slaw provides some helpers
|
164
|
+
for manipulating this meta-data. For example,
|
165
|
+
|
166
|
+
```ruby
|
167
|
+
bylaw = Slaw::ByLaw.new('spec/fixtures/community-fire-safety.xml')
|
168
|
+
print bylaw.id_uri
|
169
|
+
bylaw.title = 'A new title'
|
170
|
+
bylaw.name = 'a-new-title'
|
171
|
+
bylaw.published!(date: '2014-09-28')
|
172
|
+
print bylaw.id_uri
|
45
173
|
```
|
46
174
|
|
175
|
+
## Schedules
|
176
|
+
|
177
|
+
South African acts and by-laws can have addendums called schedules. They are technically a part of
|
178
|
+
the act but are not part of the primary body and have more relaxed formatting. Slaw finds schedules
|
179
|
+
by looking for section headings, but makes no effort to capture the format of their contents.
|
180
|
+
|
181
|
+
Akoma Ntoso has no explicit support for schedules. Instead, Slaw stores all schedules under a single
|
182
|
+
Akoma Ntoso `component` element at the end of the XML document, with a name of `schedules`.
|
183
|
+
|
47
184
|
## Contributing
|
48
185
|
|
49
186
|
1. Fork it at http://github.com/longhotsummer/slaw/fork
|
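The Extraction section of the README above describes re-breaking badly wrapped lines at subsection numbers. A rough sketch of that idea in plain Ruby follows. It is not the code in `Slaw::Parse::Cleanser`; the `MARKER` pattern and the `rebreak` helper are assumptions made for illustration, and the rule of only breaking after a semicolon is a deliberately narrow heuristic.

```ruby
# Matches a subsection marker such as (a), (2), (3b) or (ii).
# Assumed pattern, for illustration only.
MARKER = /\([0-9a-z]{1,4}\)/

# Re-wrap text so that each subsection marker starts its own line.
def rebreak(text)
  # 1. Unwrap: join any line that does not start with a marker onto the
  #    previous line.
  unwrapped = text.lines.map(&:strip).reject(&:empty?).each_with_object([]) do |line, out|
    if out.empty? || line =~ /\A#{MARKER}/
      out << line
    else
      out.last << ' ' << line
    end
  end

  # 2. Break: start a new line at a marker that directly follows a semicolon.
  unwrapped.flat_map { |line| line.split(/(?<=;)\s+(?=#{MARKER})/) }.join("\n")
end

wrapped = <<~TEXT
  (b) any wall, swimming pool, reservoir or bridge
  or any other structure connected therewith; (c) any fuel pump or any
  tank used in connection therewith
TEXT

puts rebreak(wrapped)
```

A heuristic like this depends on the numbering style: a parenthesised word after a semicolon would be mistaken for a marker, which is one reason such rules may not carry over to other regions' formats.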
data/lib/slaw/act.rb
CHANGED
@@ -18,25 +18,31 @@ module Slaw
|
|
18
18
|
attr_accessor :doc
|
19
19
|
|
20
20
|
# [Nokogiri::XML::Node] The `meta` XML node
|
21
|
-
|
21
|
+
attr_reader :meta
|
22
22
|
|
23
23
|
# [Nokogiri::XML::Node] The `body` XML node
|
24
|
-
|
24
|
+
attr_reader :body
|
25
25
|
|
26
26
|
# [String] The year this act was published
|
27
|
-
|
27
|
+
attr_reader :year
|
28
28
|
|
29
29
|
# [String] The act number in the year this act was published
|
30
|
-
|
30
|
+
attr_reader :num
|
31
31
|
|
32
32
|
# [String] The FRBR URI of this act, which uniquely identifies it globally
|
33
|
-
|
33
|
+
attr_reader :id_uri
|
34
34
|
|
35
35
|
# [String, nil] The source filename, or nil
|
36
|
-
|
36
|
+
attr_reader :filename
|
37
37
|
|
38
38
|
# [Time, nil] The mtime of when the source file was last modified
|
39
|
-
|
39
|
+
attr_reader :mtime
|
40
|
+
|
41
|
+
# [String] The underlying nature of this act, usually `act` although subclasses may override this.
|
42
|
+
attr_reader :nature
|
43
|
+
|
44
|
+
# [Nokogiri::XML::Schema] schema to validate against
|
45
|
+
attr_accessor :schema
|
40
46
|
|
41
47
|
# Get the act that wraps the document that owns this XML node
|
42
48
|
# @param node [Nokogiri::XML::Node]
|
@@ -49,6 +55,7 @@ module Slaw
|
|
49
55
|
# @param filename [String] filename to load XML from
|
50
56
|
def initialize(filename=nil)
|
51
57
|
self.load(filename) if filename
|
58
|
+
@schema = nil
|
52
59
|
end
|
53
60
|
|
54
61
|
# Load the XML in `filename` into this instance
|
@@ -60,8 +67,9 @@ module Slaw
|
|
60
67
|
File.open(filename) { |f| parse(f) }
|
61
68
|
end
|
62
69
|
|
63
|
-
# Parse the XML contained in the file-like object `io`
|
64
|
-
#
|
70
|
+
# Parse the XML contained in the file-like or String object `io`
|
71
|
+
#
|
72
|
+
# @param io [String, file-like] io object or String with XML
|
65
73
|
def parse(io)
|
66
74
|
self.doc = Nokogiri::XML(io)
|
67
75
|
end
|
@@ -76,26 +84,90 @@ module Slaw
|
|
76
84
|
|
77
85
|
@@acts[@doc] = self
|
78
86
|
|
79
|
-
|
87
|
+
extract_id_uri
|
80
88
|
end
|
81
89
|
|
82
|
-
#
|
83
|
-
|
84
|
-
|
85
|
-
|
90
|
+
# Directly set the FRBR URI of this act. This must be a well-formed URI,
|
91
|
+
# such as `/za/act/2002/2`. This will, in turn, update the {#year}, {#nature},
|
92
|
+
# {#country} and {#num} attributes.
|
93
|
+
#
|
94
|
+
# You probably don't want to use this method. Instead, set each component
|
95
|
+
# (such as {#date}) manually.
|
96
|
+
#
|
97
|
+
# @param uri [String] new URI
|
98
|
+
def id_uri=(uri)
|
99
|
+
for component, xpath in [['main', '//a:act/a:meta/a:identification'],
|
100
|
+
['schedules', '//a:component/a:doc/a:meta/a:identification']] do
|
101
|
+
ident = @doc.at_xpath(xpath, a: NS)
|
102
|
+
next if not ident
|
103
|
+
|
104
|
+
# work
|
105
|
+
ident.at_xpath('a:FRBRWork/a:FRBRthis', a: NS)['value'] = "#{uri}/#{component}"
|
106
|
+
ident.at_xpath('a:FRBRWork/a:FRBRuri', a: NS)['value'] = uri
|
107
|
+
|
108
|
+
# expression
|
109
|
+
ident.at_xpath('a:FRBRExpression/a:FRBRthis', a: NS)['value'] = "#{uri}/#{component}/eng@"
|
110
|
+
ident.at_xpath('a:FRBRExpression/a:FRBRuri', a: NS)['value'] = "#{uri}/eng@"
|
111
|
+
|
112
|
+
# manifestation
|
113
|
+
ident.at_xpath('a:FRBRManifestation/a:FRBRthis', a: NS)['value'] = "#{uri}/#{component}/eng@"
|
114
|
+
ident.at_xpath('a:FRBRManifestation/a:FRBRuri', a: NS)['value'] = "#{uri}/eng@"
|
115
|
+
end
|
86
116
|
|
87
|
-
|
88
|
-
|
117
|
+
extract_id_uri
|
118
|
+
end
|
119
|
+
|
120
|
+
# The date at which this act was first created/promulgated.
|
121
|
+
#
|
122
|
+
# @return [String] date, YYYY-MM-DD
|
123
|
+
def date
|
124
|
+
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRdate[@name="Generation"]', a: NS)
|
125
|
+
node && node['date']
|
126
|
+
end
|
127
|
+
|
128
|
+
# Set the date at which this act was first created/promulgated. This is usually the same
|
129
|
+
# as the publication date but this is not enforced.
|
130
|
+
#
|
131
|
+
# This also updates the {#year} of this act, which in turn updates the {#id_uri}.
|
132
|
+
#
|
133
|
+
# @param date [String] date, YYYY-MM-DD
|
134
|
+
def date=(value)
|
135
|
+
for frbr in ['FRBRWork', 'FRBRExpression'] do
|
136
|
+
@meta.at_xpath("./a:identification/a:#{frbr}/a:FRBRdate[@name=\"Generation\"]", a: NS)['date'] = value
|
137
|
+
end
|
138
|
+
|
139
|
+
self.year = value.split('-')[0]
|
140
|
+
end
|
141
|
+
|
142
|
+
# Set the year for this act. You probably want to call {#date=} instead.
|
143
|
+
#
|
144
|
+
# This will also update the {#id_uri} but will not change {#date} at all.
|
145
|
+
#
|
146
|
+
# @param year [String, Number] year
|
147
|
+
def year=(year)
|
148
|
+
@year = year.to_s
|
149
|
+
rebuild_id_uri
|
89
150
|
end
|
90
151
|
|
91
152
|
# An applicable short title for this act, either from the `FRBRalias` element
|
92
153
|
# or based on the act number and year.
|
93
154
|
# @return [String]
|
94
|
-
def
|
155
|
+
def title
|
95
156
|
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRalias', a: NS)
|
96
157
|
node ? node['value'] : "Act #{num} of #{year}"
|
97
158
|
end
|
98
159
|
|
160
|
+
# Change the title of this act.
|
161
|
+
def title=(value)
|
162
|
+
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRalias', a: NS)
|
163
|
+
unless node
|
164
|
+
node = @doc.create_element('FRBRalias')
|
165
|
+
@meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS).after(node)
|
166
|
+
end
|
167
|
+
|
168
|
+
node['value'] = value
|
169
|
+
end
|
170
|
+
|
99
171
|
# Has this act been amended? This is determined by testing the `contains`
|
100
172
|
# attribute of the `act` root element.
|
101
173
|
#
|
@@ -250,6 +322,24 @@ module Slaw
|
|
250
322
|
@meta.at_xpath('./a:publication', a: NS)
|
251
323
|
end
|
252
324
|
|
325
|
+
# Update the publication details of the act. All elements are optional.
|
326
|
+
#
|
327
|
+
# @option details [String] :name name of the publication
|
328
|
+
# @option details [String] :number publication number
|
329
|
+
# @option details [String] :date date of publication (YYYY-MM-DD)
|
330
|
+
def published!(details)
|
331
|
+
node = @meta.at_xpath('./a:publication', a: NS)
|
332
|
+
unless node
|
333
|
+
node = @doc.create_element('publication')
|
334
|
+
@meta.at_xpath('./a:identification', a: NS).after(node)
|
335
|
+
end
|
336
|
+
|
337
|
+
node['showAs'] = details[:name] if details.has_key? :name
|
338
|
+
node['name'] = details[:name] if details.has_key? :name
|
339
|
+
node['date'] = details[:date] if details.has_key? :date
|
340
|
+
node['number'] = details[:number] if details.has_key? :number
|
341
|
+
end
|
342
|
+
|
253
343
|
# Has this by-law been repealed?
|
254
344
|
#
|
255
345
|
# @return [Boolean]
|
@@ -297,14 +387,55 @@ module Slaw
|
|
297
387
|
node && node['date']
|
298
388
|
end
|
299
389
|
|
300
|
-
#
|
301
|
-
|
302
|
-
|
390
|
+
# Validate the XML behind this document against the Akoma Ntoso schema and return
|
391
|
+
# any errors.
|
392
|
+
#
|
393
|
+
# @return [Object] array of errors, possibly empty
|
394
|
+
def validate
|
395
|
+
@schema ||= Dir.chdir(File.dirname(__FILE__) + "/schemas") { Nokogiri::XML::Schema(File.read('akomantoso20.xsd')) }
|
396
|
+
@schema.validate(@doc)
|
397
|
+
end
|
398
|
+
|
399
|
+
# Does this document validate against the schema?
|
400
|
+
#
|
401
|
+
# @see {#validate}
|
402
|
+
def validates?
|
403
|
+
validate.empty?
|
404
|
+
end
|
405
|
+
|
406
|
+
# Serialise the XML for this act, passing `args` to the Nokogiri serialiser.
|
407
|
+
# The most useful argument is usually `indent: 2` if you like your XML perdy.
|
408
|
+
#
|
409
|
+
# @return [String] serialized XML
|
410
|
+
def to_xml(*args)
|
411
|
+
@doc.to_xml(*args)
|
303
412
|
end
|
304
413
|
|
305
414
|
def inspect
|
306
415
|
"<#{self.class.name} @id_uri=\"#{@id_uri}\">"
|
307
416
|
end
|
417
|
+
|
418
|
+
protected
|
419
|
+
|
420
|
+
# Parse the FRBR URI into its constituent parts
|
421
|
+
def extract_id_uri
|
422
|
+
@id_uri = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS)['value']
|
423
|
+
empty, @country, @nature, date, @num = @id_uri.split('/')
|
424
|
+
|
425
|
+
# yyyy-mm-dd
|
426
|
+
@year = date.split('-', 2)[0]
|
427
|
+
end
|
428
|
+
|
429
|
+
def build_id_uri
|
430
|
+
# /za/act/2002/3
|
431
|
+
"/#{@country}/#{@nature}/#{@year}/#{@num}"
|
432
|
+
end
|
433
|
+
|
434
|
+
# This rebuilds the FRBR URI for this document using its constituent components. It will
|
435
|
+
# update the XML then re-split the URI and grab its components.
|
436
|
+
def rebuild_id_uri
|
437
|
+
self.id_uri = build_id_uri
|
438
|
+
end
|
308
439
|
end
|
309
440
|
|
310
441
|
end
|
data/lib/slaw/bylaw.rb
CHANGED
@@ -7,40 +7,56 @@ module Slaw
|
|
7
7
|
# is not identified by a year and a number, and therefore has a different FRBR uri structure.
|
8
8
|
class ByLaw < Act
|
9
9
|
|
10
|
-
# [String] The region this by-law applies to
|
11
|
-
|
10
|
+
# [String] The code of the region this by-law applies to
|
11
|
+
attr_reader :region
|
12
12
|
|
13
13
|
# [String] A short file-like name of this by-law, unique within its year and region
|
14
|
-
|
15
|
-
|
16
|
-
def _extract_id
|
17
|
-
# /za/by-law/cape-town/2010/public-parks
|
18
|
-
|
19
|
-
@id_uri = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS)['value']
|
20
|
-
empty, @country, type, @region, date, @name = @id_uri.split('/')
|
21
|
-
|
22
|
-
# yyyy[-mm-dd]
|
23
|
-
@year = date.split('-', 2)[0]
|
24
|
-
end
|
14
|
+
attr_reader :name
|
25
15
|
|
26
16
|
# ByLaws don't have numbers, use their short-name instead
|
27
17
|
def num
|
28
18
|
name
|
29
19
|
end
|
30
20
|
|
31
|
-
def
|
21
|
+
def title
|
32
22
|
node = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRalias', a: NS)
|
33
|
-
|
23
|
+
title = node ? node['value'] : "(Unknown)"
|
34
24
|
|
35
|
-
if amended? and not
|
36
|
-
|
25
|
+
if amended? and not title.end_with?("as amended")
|
26
|
+
title = title + " as amended"
|
37
27
|
end
|
38
28
|
|
39
|
-
|
29
|
+
title
|
30
|
+
end
|
31
|
+
|
32
|
+
# Set the short (file-like) name for this bylaw. This changes the {#id_uri}.
|
33
|
+
def name=(value)
|
34
|
+
@name = value
|
35
|
+
rebuild_id_uri
|
40
36
|
end
|
41
37
|
|
42
|
-
|
43
|
-
|
38
|
+
# Set the region code for this bylaw. This changes the {#id_uri}.
|
39
|
+
def region=(value)
|
40
|
+
@region = value
|
41
|
+
rebuild_id_uri
|
44
42
|
end
|
43
|
+
|
44
|
+
protected
|
45
|
+
|
46
|
+
def extract_id_uri
|
47
|
+
# /za/by-law/cape-town/2010/public-parks
|
48
|
+
|
49
|
+
@id_uri = @meta.at_xpath('./a:identification/a:FRBRWork/a:FRBRuri', a: NS)['value']
|
50
|
+
empty, @country, @nature, @region, date, @name = @id_uri.split('/')
|
51
|
+
|
52
|
+
# yyyy[-mm-dd]
|
53
|
+
@year = date.split('-', 2)[0]
|
54
|
+
end
|
55
|
+
|
56
|
+
def build_id_uri
|
57
|
+
# /za/by-law/cape-town/2010/public-parks
|
58
|
+
"/#{@country}/#{@nature}/#{@region}/#{@year}/#{@name}"
|
59
|
+
end
|
60
|
+
|
45
61
|
end
|
46
62
|
end
|
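Both `Act` and `ByLaw` in this diff split the FRBR URI on `/` and rebuild it from its parts whenever a component such as the year or name changes. The standalone sketch below mirrors that split-and-rebuild round trip without Nokogiri or the XML document. The function names and the hash-based representation are hypothetical stand-ins for the instance variables used in the real classes.

```ruby
# Split an act URI such as /za/act/2002/3 into named parts.
# The leading slash yields an empty first element, which is discarded.
def split_act_uri(uri)
  _empty, country, nature, date, num = uri.split('/')
  { country: country, nature: nature, year: date.split('-', 2)[0], num: num }
end

# Split a by-law URI such as /za/by-law/cape-town/2010/public-parks.
# By-laws carry a region and a short name instead of a number.
def split_bylaw_uri(uri)
  _empty, country, nature, region, date, name = uri.split('/')
  { country: country, nature: nature, region: region,
    year: date.split('-', 2)[0], name: name }
end

# Rebuild the URI strings from their parts.
def build_act_uri(parts)
  "/#{parts[:country]}/#{parts[:nature]}/#{parts[:year]}/#{parts[:num]}"
end

def build_bylaw_uri(parts)
  "/#{parts[:country]}/#{parts[:nature]}/#{parts[:region]}/#{parts[:year]}/#{parts[:name]}"
end

# Changing one component and rebuilding gives the new URI.
parts = split_bylaw_uri('/za/by-law/cape-town/2010/public-parks')
parts[:name] = 'a-new-title'
puts build_bylaw_uri(parts)  # prints "/za/by-law/cape-town/2010/a-new-title"
```

One consequence worth noting in this sketch: only the year is kept from the date segment, so a URI written with a full `yyyy-mm-dd` date comes back with just the year after a rebuild.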