uottawa_odesi_utils 0.0.2alpha

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 962c5f93167d6650ec0f2a0f1986a4b95663f12d
+   data.tar.gz: 1bc29f1c3e0620374f8aaed2d4bbaac4abf8dddc
+ SHA512:
+   metadata.gz: 672efaee5d6fb0960914ff5b3894740c908fff19e9e3dc3d5d43a5939abfaa80fce46ba87ec461b4259eefe625f3a032c5f058fc2919f71d1bb52f6bbbaf4ea4
+   data.tar.gz: 9dfb978bca564e3ba876438e4bbf00b1ab7cb6fa17e7b045918febf8171406dc1ad7b51b07fbcf746bbca70dd7b5e004da412afd2cee2ab660421bd4b61d6a5f
@@ -0,0 +1,17 @@
+ /.bundle/
+ /.yardoc
+ /Gemfile.lock
+ /_yardoc/
+ /coverage/
+ /doc/
+ /pkg/
+ /spec/reports/
+ /tmp/
+ *.bundle
+ *.so
+ *.o
+ *.a
+ mkmf.log
+
+ # I don't know yet whether the StatCan XML files are public or confidential
+ spec/data/*xml
data/Gemfile ADDED
@@ -0,0 +1,4 @@
+ source 'https://rubygems.org'
+
+ # Specify your gem's dependencies in uottawa_odesi_utils.gemspec
+ gemspec
@@ -0,0 +1,22 @@
+ Copyright (c) 2015 Guinsly Mondesir
+
+ MIT License
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,59 @@
+ # UottawaOdesiUtils
+
+ This library acts as a helper for working with Odesi XML documentation. SPSS truncates every label longer than 251 characters. The goal of this library is to make it easier to translate the `<labl>` and `<qstnLit>` tags by building a hash of the tag values. I can then tell which `<labl>` will be truncated by checking the total number of characters in that tag, and use an application such as Rails/Sinatra to do the translation or to edit the sentence truncated by SPSS.
+
+ This is a utility library for working with DDI XML. Its purpose is to ease the process of translating the variable labels in a document. It retrieves the `<labl>` and `<qstnLit>` values from a DDI file and flags every label longer than 251 characters, which means SPSS will truncate it. The result is easy to dump to a JSON file or to export to a database for a Rails/Flask app. The Python 2.7 library can be found [here: pyodesi](http://www.github.com/guinslym)
+
+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+ ```ruby
+ gem 'uottawa_odesi_utils'
+ ```
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install uottawa_odesi_utils
+
+ ## Usage
+
+ Dealing with one documentation file
+ ```ruby
+ content = UottawaOdesiUtils.retrieve_label_and_qstnlit('esg-cycle-xx.xml')
+ puts content.first
+ # => {:label=>"Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
+ #     :qstnLit=>"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
+ #     :label_warning=>false, :variable_name=>"VARIABLE_000"}
+ ```
+
+ Comparing two files
+ ```ruby
+ content = UottawaOdesiUtils.bilingual_files('esg-cycle-xx_fr.xml', 'gss-cycle-xx_en.xml')
+ # the French file must come first
+ puts content.first
+ # => {
+ #     :variable_name=>"VARIABLE_000",
+ #     :label_en=>"Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
+ #     :qstnLit_en=>"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
+ #     :label_warning_en=>false,
+ #     :label_fr=>"Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
+ #     :qstnLit_fr=>"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
+ #     :label_warning_fr=>false}
+ ```
+ Both methods return an array of hashes, one per variable. This makes it easier to build a web app that shows the English and French translations side by side, so I can make corrections when a translation is not good enough or when a label is longer than 251 characters.
+
+ ## TODO
+ A Sinatra app to store the JSON in a database
+
+ ## Contributing
+
+ 1. Fork it ( https://github.com/guinslym/uottawa_odesi_utils/fork )
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
+ 4. Push to the branch (`git push origin my-new-feature`)
+ 5. Create a new Pull Request
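The README mentions turning the result into a JSON file. A minimal sketch of that step follows, using only Ruby's standard `json` library. The sample rows are hypothetical stand-ins shaped like the hashes in the README, and the output path in the comment is an assumption, not something the gem defines.

```ruby
require 'json'

# Hypothetical sample rows, shaped like the hashes shown in the README.
rows = [
  { variable_name: 'VARIABLE_000', label: 'Short label',
    qstnLit: 'Question text', label_warning: false },
  { variable_name: 'VARIABLE_001', label: 'x' * 300,
    qstnLit: 'Another question', label_warning: true }
]

# Keep only the rows whose label SPSS would truncate (more than 251 characters).
flagged = rows.select { |row| row[:label_warning] }

# Serialize every row to JSON; symbol keys become string keys.
json = JSON.pretty_generate(rows)

# File.write('labels.json', json) # hypothetical output path
puts flagged.map { |row| row[:variable_name] }.inspect
```

Filtering on `:label_warning` before export is one way to hand a reviewer only the labels that need shortening; exporting the full array keeps the side-by-side view possible.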
@@ -0,0 +1,2 @@
+ require "bundler/gem_tasks"
+
@@ -0,0 +1,76 @@
+ require "uottawa_odesi_utils/version"
+ require 'nokogiri'
+
+ module UottawaOdesiUtils
+
+   def self.retrieve_label_and_qstnlit(file)
+     l = -> (name) { File.open(name) { |f| Nokogiri::XML(f) } }
+     doc = l.call(file)
+
+     sentences = doc.children.css('dataDscr').children.search('var')
+
+     node_found = []
+     sentences.each do |sentence|
+
+       x = -> (elem) { sentence.children.search(elem) }
+       label = x.call('labl')
+       qstnLit = x.call('qstnLit')
+
+       if label.text.size > 0 && qstnLit.text.size > 0
+         label = label.first.text.strip
+         qstnLit = qstnLit.first.text.strip
+         label_warning = label.size > 251
+         tmp = {label: label, qstnLit: qstnLit, label_warning: label_warning,
+                variable_name: sentence['name']}
+         node_found.push(tmp)
+       end
+     end
+     return node_found
+
+   end
+
+   def self.bilingual_files(xml_fr, xml_en)
+     l = -> (name) { File.open(name) { |f| Nokogiri::XML(f) } }
+     doc = l.call(xml_fr)
+     doc_other = l.call(xml_en)
+
+     sentences = doc.children.css('dataDscr').children.search('var')
+
+     idd_value = []
+     sentences.each do |sentence|
+
+       x = -> (elem) { sentence.children.search(elem) }
+       label = x.call('labl')
+       qstnLit = x.call('qstnLit')
+
+       if label.text.size > 0 && qstnLit.text.size > 0
+         variable_name = sentence['name']
+         label_fr = label.first.text.strip
+         qstnLit_fr = qstnLit.first.text.strip
+         label_warning_fr = label_fr.size > 251
+
+         # other language: look up the same variable, quoting the attribute value
+         var = doc_other.children.css("var[name=\"#{variable_name}\"]")
+
+         # if this French node is not present in the English file, leave the English fields blank
+         begin
+           b = -> (elem) { var.children.search(elem).first.text.strip }
+           label_en = b.call('labl')
+           qstnLit_en = b.call('qstnLit')
+         rescue
+           label_en = ""
+           qstnLit_en = ""
+         end
+
+         tmp = {variable_name: variable_name, label_en: label_en, qstnLit_en: qstnLit_en,
+                label_warning_en: label_en.size > 251, label_fr: label_fr, qstnLit_fr: qstnLit_fr,
+                label_warning_fr: label_warning_fr}
+         #puts tmp
+         idd_value.push(tmp)
+       end
+     end
+     return idd_value
+
+   end#bilingual_files
+
+ end#module
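`bilingual_files` looks up each French variable in the English document with one query per variable. As an illustration only, the same pairing could be done on rows that are already extracted by indexing the English rows by variable name first. `merge_bilingual` and its `limit` parameter are hypothetical names, not part of the gem, and the sample rows are invented.

```ruby
# Hypothetical helper: pair French and English rows by :variable_name.
# French rows drive the result; a missing English row yields blank fields,
# mirroring the rescue branch in bilingual_files.
def merge_bilingual(rows_fr, rows_en, limit = 251)
  en_by_name = rows_en.each_with_object({}) do |row, index|
    index[row[:variable_name]] = row
  end

  rows_fr.map do |fr|
    en = en_by_name.fetch(fr[:variable_name], {})
    label_en = en.fetch(:label, '')
    {
      variable_name: fr[:variable_name],
      label_en: label_en,
      qstnLit_en: en.fetch(:qstnLit, ''),
      label_warning_en: label_en.size > limit,
      label_fr: fr[:label],
      qstnLit_fr: fr[:qstnLit],
      label_warning_fr: fr[:label].size > limit
    }
  end
end

# Invented sample data: VAR_B has no English counterpart.
rows_fr = [
  { variable_name: 'VAR_A', label: 'Étiquette ' * 30, qstnLit: 'Question A ?' },
  { variable_name: 'VAR_B', label: 'Courte', qstnLit: 'Question B ?' }
]
rows_en = [
  { variable_name: 'VAR_A', label: 'Label', qstnLit: 'Question A?' }
]

merged = merge_bilingual(rows_fr, rows_en)
puts merged.map { |row| row[:variable_name] }.inspect
```

The hash index makes each lookup constant time, which may matter for documentation files with many variables; whether that is worth the extra pass is a judgment call for the author.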
@@ -0,0 +1,3 @@
+ module UottawaOdesiUtils
+   VERSION = "0.0.2alpha"
+ end
@@ -0,0 +1,30 @@
+ require 'spec_helper'
+
+ describe UottawaOdesiUtils do
+
+   # Crappy specs start here
+   describe ".IDD Documentation" do
+     let(:lines) { UottawaOdesiUtils.retrieve_label_and_qstnlit(Dir.pwd+'/spec/data/esg-c-25.xml')}
+     let(:b_lines) { UottawaOdesiUtils.bilingual_files(Dir.pwd+'/spec/data/esg-c-25.xml',
+                                                       Dir.pwd+'/spec/data/gss-c-25.xml')}
+
+     it "expecting it to be at least 79" do
+       expect(lines.size).to be >= 79
+     end
+
+     it "expecting it to return an array" do
+       expect(lines).to be_instance_of Array
+     end
+
+     it "each element is a Hash" do
+       expect(lines.first).to be_instance_of Hash
+     end
+
+     it "both files will have at least 79 elements" do
+       expect(b_lines.size).to be >= 79
+     end
+
+   end
+
+
+ end
@@ -0,0 +1,12 @@
+ require 'uottawa_odesi_utils'
+
+ RSpec.configure do |config|
+   # Use color in STDOUT
+   config.color = true
+
+   # Use color not only in STDOUT but also in pagers and files
+   config.tty = true
+
+   # Use the specified formatter
+   config.formatter = :documentation # :progress, :html, :textmate
+ end
@@ -0,0 +1,28 @@
+ # coding: utf-8
+ lib = File.expand_path('../lib', __FILE__)
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+ require 'uottawa_odesi_utils/version'
+
+ Gem::Specification.new do |spec|
+   spec.name          = "uottawa_odesi_utils"
+   spec.version       = UottawaOdesiUtils::VERSION
+   spec.authors       = ["Guinsly Mondesir"]
+   spec.email         = ["guinslym@gmail.com"]
+   spec.summary       = %q{A library to work with uottawa ddi xml file }
+   spec.description   = %q{Utils library for DDI file}
+   spec.homepage      = "https://github.com/guinslym/uottawa_odesi_utils"
+   spec.license       = "MIT"
+
+   spec.files         = `git ls-files -z`.split("\x0")
+   spec.executables   = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
+   spec.test_files    = spec.files.grep(%r{^(test|spec|features)/})
+   spec.require_paths = ["lib"]
+
+   spec.add_development_dependency "bundler", "~> 1.7"
+   spec.add_development_dependency "rake", "~> 10.0"
+   spec.add_development_dependency 'rspec', '~> 3.2', '>= 3.2.0'
+   spec.add_development_dependency 'nokogiri', '~> 1.6', '>= 1.6.6.2'
+   spec.add_development_dependency 'rspec-nc', '~> 0.2', '>= 0.2.0'
+   spec.add_development_dependency 'guard', '~> 2.12', '>= 2.12.4'
+   spec.add_development_dependency 'guard-rspec', '~> 4.5', '>= 4.5.0'
+ end
metadata ADDED
@@ -0,0 +1,185 @@
+ --- !ruby/object:Gem::Specification
+ name: uottawa_odesi_utils
+ version: !ruby/object:Gem::Version
+   version: 0.0.2alpha
+ platform: ruby
+ authors:
+ - Guinsly Mondesir
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2015-03-08 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: bundler
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.7'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.7'
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '10.0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '10.0'
+ - !ruby/object:Gem::Dependency
+   name: rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.2'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 3.2.0
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.2'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 3.2.0
+ - !ruby/object:Gem::Dependency
+   name: nokogiri
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.6'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.6.6.2
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.6'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.6.6.2
+ - !ruby/object:Gem::Dependency
+   name: rspec-nc
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.2'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.2.0
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.2'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.2.0
+ - !ruby/object:Gem::Dependency
+   name: guard
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.12'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 2.12.4
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.12'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 2.12.4
+ - !ruby/object:Gem::Dependency
+   name: guard-rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '4.5'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 4.5.0
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '4.5'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 4.5.0
+ description: Utils library for DDI file
+ email:
+ - guinslym@gmail.com
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - ".gitignore"
+ - Gemfile
+ - LICENSE.txt
+ - README.md
+ - Rakefile
+ - lib/uottawa_odesi_utils.rb
+ - lib/uottawa_odesi_utils/version.rb
+ - spec/Uottawa_odesi_spec.rb
+ - spec/spec_helper.rb
+ - uottawa_odesi_utils.gemspec
+ homepage: https://github.com/guinslym/uottawa_odesi_utils
+ licenses:
+ - MIT
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">"
+     - !ruby/object:Gem::Version
+       version: 1.3.1
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.2.2
+ signing_key:
+ specification_version: 4
+ summary: A library to work with uottawa ddi xml file
+ test_files:
+ - spec/Uottawa_odesi_spec.rb
+ - spec/spec_helper.rb
+ has_rdoc: