fide_xml_parser 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 67692b8c9fd3d78f38dd2dd37f3fa2de2586bfe70ba71f6be08fd0db0654383b
4
+ data.tar.gz: e55d84d2b2b6df2680dc646a9a873d969f909fc96c9b9d0526e979cddaebaec8
5
+ SHA512:
6
+ metadata.gz: ad8ac8524c1d89227b084df75ad505b98863e404af7d6f853ee29641cdda37e97be655ec68687c46e601f825b52ad8656e2c6fa0a13f9487cc56e61e62b752da
7
+ data.tar.gz: 92bf11de84011190588baef367d8512585b5b8a04e68c209a86b827d20aa3ddd0c6401e715e2b88473aa368bae9ce4b7a0f690eb49019444e8924ceaacbf345d
data/.gitignore ADDED
@@ -0,0 +1,11 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /_yardoc/
4
+ /coverage/
5
+ /doc/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+
10
+ # rspec failure tracking
11
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --format documentation
2
+ --color
3
+ --require spec_helper
data/.travis.yml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ sudo: false
3
+ language: ruby
4
+ cache: bundler
5
+ rvm:
6
+ - 2.6.5
7
+ before_install: gem install bundler -v 2.0.2
data/CHANGELOG.md ADDED
@@ -0,0 +1,4 @@
1
+ # 0.1.0
2
+
3
+ First release.
4
+
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in fide_xml_parser.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2020 Keith Bennett
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README-from-old-repo.md ADDED
@@ -0,0 +1,64 @@
1
+ # FIDE Chess Player Rating Information XML to JSON Converter & Data Analyzer
2
+
3
+ This repo contains two scripts:
4
+
5
+ * [fide_xml_to_json.rb](fide_xml_to_json.rb) - generates a JSON file from an XML file downloaded from [https://ratings.fide.com/download_lists.phtml](https://ratings.fide.com/download_lists.phtml) or [https://ratings.fide.com/download.phtml](https://ratings.fide.com/download.phtml).
6
+ * [analyze_fide_data.rb](analyze_fide_data.rb) - generates a JSON file containing statistics from a JSON file produced by fide_xml_to_json.rb. This file may not necessarily include information useful to you, but is an example of how the data can be manipulated. Sample output files are in the [samples](./samples) directory.
7
+
8
+
9
+ ## Installation Instructions
10
+
11
+ Download or clone this repository.
12
+
13
+ These scripts are Ruby scripts. They use Ruby gems (libraries). Install the gems by running `bundle` in the project directory. (Ruby and these gems work best on Mac OS or Linux. Windows should work but may present challenges.)
14
+
15
+ If you are running on Mac OS or Linux, you can `chmod +x *.rb` so that you can invoke the scripts without preceding their filespecs with `ruby`.
16
+
17
+
18
+ ## Operating Instructions
19
+
20
+ The scripts output the JSON in "pretty" format that is more human-readable than compact JSON. If you need to, you can use compact mode instead by changing the calls from `JSON.pretty_generate(records)` to `records.to_json`.
21
+
22
+ ### fide_xml_to_json.rb
23
+
24
+ [fide_xml_to_json.rb](fide_xml_to_json.rb) syntax is:
25
+
26
+ `ruby fide_xml_to_json.rb my_file.xml`
27
+
28
+ It will show its progress and, when it has finished parsing all the XML records, will write the JSON representation of the parsed data to a file whose name is the same as the XML file, but with `.xml` changed to `.json`.
29
+
30
+ ### analyze_fide_data.rb
31
+
32
+ [analyze_fide_data.rb](./analyze_fide_data.rb) syntax is:
33
+
34
+ `ruby analyze_fide_data.rb my_file.json`
35
+
36
+ It will read and parse the JSON file, analyze the data, and then write the resulting statistics (in pretty JSON format) to a file whose name is the same as the input file's, but with `-stats` inserted before the `.json`.
37
+
38
+
39
+ ## Adding Your Own Functionality
40
+
41
+ Assuming one can write even simple Ruby code, manipulating the data is easy.
42
+
43
+ You can write a script that includes:
44
+
45
+ ```ruby
46
+ require 'json'
47
+ records = JSON.parse(File.read('my_file.json'))
48
+ ```
49
+
50
+ ...and then use Ruby's excellent [Enumerable support](https://ruby-doc.org/core-2.7.0/Enumerable.html) to analyze or manipulate the `records` array.
51
+
52
+ As examples, the following can easily be done:
53
+
54
+ * _field filtering_ - exclude some fields from the JSON output
55
+ * _record filtering_ - exclude records meeting specified criteria
56
+ * _statistical analysis_ - analysis of the parsed data
57
+ * _data validation_ - to look for data errors and ambiguities
58
+
59
+ You could also insert the following to get an interactive Ruby shell where `self` is the records array (you may need to `gem install pry` if it hasn't already been installed):
60
+
61
+ ```ruby
62
+ require 'pry'
63
+ records.pry
64
+ ```
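As a rough illustration of the kinds of manipulation listed in the old README above, the following sketch works on a small inline array standing in for the parsed `records`. The field names `name` and `country` and all values are made up for the example; only `rating` and `title` appear elsewhere in this gem's source.

```ruby
require 'json'

# Hypothetical sample data standing in for JSON.parse(File.read('my_file.json')).
# Field names other than 'rating' and 'title' are assumptions for illustration.
records = JSON.parse(<<~JSON)
  [
    { "name": "Player A", "country": "USA", "title": "GM", "rating": 2700 },
    { "name": "Player B", "country": "USA", "rating": 2100 },
    { "name": "Player C", "country": "FRA", "title": "IM", "rating": 2450 }
  ]
JSON

# Field filtering: keep only the listed keys in each record.
slim_records = records.map { |rec| rec.slice('name', 'rating') }

# Record filtering: keep only records that have a title.
titled = records.select { |rec| rec['title'] }

# Statistical analysis: average rating per country.
average_by_country = records
  .group_by { |rec| rec['country'] }
  .transform_values { |recs| recs.sum { |r| r['rating'] } / recs.size.to_f }

# Data validation: collect records whose rating is missing or not an integer.
invalid = records.reject { |rec| rec['rating'].is_a?(Integer) }

puts JSON.pretty_generate(average_by_country)
```

Each step is an ordinary `Enumerable` or `Hash` call, so the steps can be chained or swapped freely before the result is written back out as JSON.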
data/README.md ADDED
@@ -0,0 +1,42 @@
1
+ # FideXmlParser
2
+
3
+ #### Sorry, the README is not yet written. Below is the boilerplate text:
4
+
5
+
6
+ Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/fide_xml_parser`. To experiment with that code, run `bin/console` for an interactive prompt.
7
+
8
+ TODO: Delete this and the text above, and describe your gem
9
+
10
+ ## Installation
11
+
12
+ Add this line to your application's Gemfile:
13
+
14
+ ```ruby
15
+ gem 'fide_xml_parser'
16
+ ```
17
+
18
+ And then execute:
19
+
20
+ $ bundle
21
+
22
+ Or install it yourself as:
23
+
24
+ $ gem install fide_xml_parser
25
+
26
+ ## Usage
27
+
28
+ TODO: Write usage instructions here
29
+
30
+ ## Development
31
+
32
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
33
+
34
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
35
+
36
+ ## Contributing
37
+
38
+ Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/fide_xml_parser.
39
+
40
+ ## License
41
+
42
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/fide_xml_parser.gemspec ADDED
@@ -0,0 +1,39 @@
1
+ lib = File.expand_path("lib", __dir__)
2
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
3
+ require "fide_xml_parser/version"
4
+
5
+ Gem::Specification.new do |spec|
6
+ spec.name = "fide_xml_parser"
7
+ spec.version = FideXmlParser::VERSION
8
+ spec.authors = ["Keith Bennett"]
9
+ spec.email = ["keithrbennett@gmail.com"]
10
+
11
+ spec.summary = %q{Parses XML files downloaded from fide.com and writes JSON files.}
12
+ spec.description = spec.summary
13
+ spec.homepage = "https://github.com/keithrbennett/fide-xml-parser"
14
+ spec.license = "Apache-2.0"
15
+
16
+ spec.metadata['allowed_push_host'] = "https://rubygems.org"
17
+
18
+ spec.metadata["homepage_uri"] = spec.homepage
19
+ spec.metadata["source_code_uri"] = spec.homepage
20
+ # spec.metadata["changelog_uri"] = '' # "TODO: Put your gem's CHANGELOG.md URL here."
21
+
22
+ # Specify which files should be added to the gem when it is released.
23
+ # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
24
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
25
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(bin|test|spec|features)/}) }
26
+ end
27
+ spec.bindir = "exe"
28
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
29
+ spec.require_paths = ["lib"]
30
+
31
+ spec.add_dependency "nokogiri", "~> 1.10"
32
+ spec.add_dependency "tty-cursor", "~> 0.7"
33
+ spec.add_dependency "pry", "~> 0.12"
34
+ spec.add_dependency "awesome_print", "~> 1.8"
35
+
36
+ spec.add_development_dependency "bundler", "~> 2.0"
37
+ spec.add_development_dependency "rake", "~> 10.0"
38
+ spec.add_development_dependency "rspec", "~> 3.0"
39
+ end
data/lib/fide_xml_parser.rb ADDED
@@ -0,0 +1,11 @@
1
+ module FideXmlParser
2
+
3
+ lib_dir = File.dirname(__FILE__)
4
+ file_mask = File.join(lib_dir, '**', '*.rb')
5
+ Dir[file_mask].each do |ruby_file|
6
+ require ruby_file
7
+ end
8
+
9
+ class Error < StandardError; end
10
+
11
+ end
data/lib/fide_xml_parser/json_writer.rb ADDED
@@ -0,0 +1,71 @@
1
+ require 'json'
2
+
3
+ module FideXmlParser
4
+
5
+ class JsonWriter
6
+
7
+ attr_reader :parser
8
+ attr_accessor :key_filter, :record_filter
9
+
10
+
11
+ def initialize
12
+ @key_filter = nil
13
+ @record_filter = nil
14
+ end
15
+
16
+
17
+ # Checks all input filespecs before processing the first one.
18
+ # Verifies not nil, ends in ".xml" (case sensitive), and exists as a file.
19
+ def validate_input_filespecs(filespecs)
20
+ filespecs = Array(filespecs)
21
+ bad_filespecs = filespecs.select do |filespec|
22
+ filespec.nil? || (! /\.xml$/.match(filespec)) || (! File.file?(filespec))
23
+ end
24
+ if bad_filespecs.any?
25
+ raise "The following filespecs were not valid XML filespecs: #{bad_filespecs.join(', ')}"
26
+ end
27
+ end
28
+
29
+
30
+ # Public entry point to write a single JSON file from XML.
31
+ # Pass the filespec of the XML file as the `input_filespec` parameter.
32
+ # To write multiple files, call `write_multiple` with an array of filespecs instead.
33
+ # json_mode: :pretty for human readable JSON, :compact for compact JSON
34
+ # Default json_filespec will be constructed from the input file, just replacing 'xml' with 'json'.
35
+ def write(input_filespec, json_mode = :pretty, json_filespec = nil)
36
+ if input_filespec.is_a?(Array)
37
+ raise Error.new("This method is used only for single files, use write_multiple for multiple files.")
38
+ end
39
+
40
+ validate_input_filespecs(Array[input_filespec])
41
+ write_private(input_filespec, json_mode, json_filespec)
42
+ end
43
+
44
+
45
+ # Public entry point to write multiple files.
46
+ # json_mode: :pretty for human readable JSON, :compact for compact JSON
47
+ def write_multiple(input_filespecs, json_mode = :pretty)
48
+ validate_input_filespecs(input_filespecs)
49
+ input_filespecs.each do |input_filespec|
50
+ write_private(input_filespec, json_mode)
51
+ end
52
+ end
53
+
54
+
55
+ # Implementation for writing a single file.
56
+ # Separated from the public `write` method in order to validate filespecs only once.
57
+ # Default json_filespec will be constructed from the input file, just replacing 'xml' with 'json'.
58
+ private
59
+ def write_private(input_filespec, json_mode = :pretty, json_filespec = nil)
60
+ @parser = FideXmlParser::Processor.new
61
+ parser.key_filter = key_filter
62
+ parser.record_filter = record_filter
63
+ records = parser.parse(File.new(input_filespec))
64
+
65
+ json_text = (json_mode == :pretty) ? JSON.pretty_generate(records) : records.to_json
66
+ json_filespec ||= input_filespec.sub(/\.xml$/, '.json')
67
+ File.write(json_filespec, json_text)
68
+ puts "#{records.size} records processed, #{input_filespec} --> #{json_filespec}"
69
+ end
70
+ end
71
+ end
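The output naming and the two JSON modes used by `write_private` above can be tried in isolation. The sketch below is not part of the gem: the helper names `json_text_for` and `default_json_filespec` and the sample record are invented for illustration, and it uses only Ruby's standard `json` library.

```ruby
require 'json'

# Hypothetical helper mirroring the mode switch in write_private:
# :pretty gives indented JSON, any other value gives compact JSON.
def json_text_for(records, json_mode = :pretty)
  json_mode == :pretty ? JSON.pretty_generate(records) : records.to_json
end

# Same substitution as write_private: only a trailing ".xml" is replaced.
def default_json_filespec(input_filespec)
  input_filespec.sub(/\.xml$/, '.json')
end

records = [{ 'name' => 'Player A', 'rating' => 2700 }] # made-up sample record

puts default_json_filespec('standard_rating_list.xml')
puts json_text_for(records, :compact)
puts json_text_for(records, :pretty)
```

Because the substitution is anchored to the end of the name, a filespec that does not end in `.xml` comes back unchanged, which is one reason the writer validates its input filespecs before writing anything.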
data/lib/fide_xml_parser/processor.rb ADDED
@@ -0,0 +1,118 @@
1
+ require 'awesome_print'
2
+ require 'nokogiri'
3
+ require 'tty-cursor'
4
+
5
+ module FideXmlParser
6
+
7
+ # Recommended entry point is Processor#parse, called on a new instance (Processor.new.parse(data_source)),
8
+ # which runs the SAX parse and returns the parsed records.
9
+ #
10
+ # Supports key and record filters.
11
+
12
+ # For key filter, pass a lambda that takes a key name as a parameter
13
+ # and returns true to include it, false to exclude it,
14
+ # e.g. to exclude 'foo' and 'bar', do this:
15
+ # processor.key_filter = ->(key) { ! %w(foo bar).include?(key) }
16
+
17
+ # For record filter, pass a lambda that takes a record as a parameter,
18
+ # and returns true to include it or false to exclude it,
19
+ # e.g. to include only records with a "title", do this:
20
+ # processor.record_filter = ->(rec) { rec['title'] }
21
+ class Processor < Nokogiri::XML::SAX::Document
22
+
23
+ attr_reader :start_time
24
+ attr_accessor :current_property_name, :record, :records, :key_filter, :record_filter,
25
+ :input_record_count, :output_record_count
26
+
27
+ NUMERIC_FIELDS = %w[
28
+ k
29
+ blitz_k
30
+ rapid_k
31
+ rating
32
+ blitz_rating
33
+ rapid_rating
34
+ games
35
+ blitz_games
36
+ rapid_games
37
+ ]
38
+
39
+ def initialize
40
+ @key_filter = nil
41
+ @record_filter = nil
42
+ @current_property_name = nil
43
+ @record = {}
44
+ @records = []
45
+ @start_time = current_time
46
+ @keys_to_exclude = []
47
+ @input_record_count = 0
48
+ @output_record_count = 0
49
+ end
50
+
51
+
52
+ def parse(data_source)
53
+ parser = Nokogiri::XML::SAX::Parser.new(self)
54
+ parser.parse(data_source)
55
+ records
56
+ end
57
+
58
+
59
+ def current_time
60
+ Process.clock_gettime(Process::CLOCK_MONOTONIC)
61
+ end
62
+
63
+
64
+ def output_status
65
+ print TTY::Cursor.column(1)
66
+ print "Records processed: %9d kept: %9d Seconds elapsed: %11.2f" % [
67
+ input_record_count,
68
+ output_record_count,
69
+ current_time - start_time
70
+ ]
71
+ end
72
+
73
+
74
+ def start_element(name, _attrs)
75
+ case name
76
+ when 'playerslist'
77
+ # ignore
78
+ when 'player'
79
+ self.input_record_count += 1
80
+ output_status if input_record_count % 1000 == 0
81
+ else # this is a field in the players record; process it as such
82
+ self.current_property_name = name
83
+ end
84
+ end
85
+
86
+
87
+ def end_element(name)
88
+ case name
89
+ when 'playerslist' # end of data; print the final status line
90
+ finish
91
+ when 'player'
92
+ if record_filter.nil? || record_filter.(record)
93
+ self.output_record_count += 1
94
+ records << record
95
+ end
96
+ self.record = {}
97
+ else
98
+ self.current_property_name = nil
99
+ end
100
+ end
101
+
102
+
103
+ def characters(string)
104
+ if current_property_name
105
+ if key_filter.nil? || key_filter.(current_property_name)
106
+ value = NUMERIC_FIELDS.include?(current_property_name) ? Integer(string) : string
107
+ record[current_property_name] = value
108
+ end
109
+ end
110
+ end
111
+
112
+
113
+ def finish
114
+ output_status
115
+ puts
116
+ end
117
+ end
118
+ end
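To make the filter hooks concrete, here is a small stand-alone sketch of how a key filter and a record filter behave on plain hashes. It imitates the nil-or-lambda checks in `characters` and `end_element` above without involving Nokogiri; the `apply_filters` helper and the sample data are hypothetical and not part of the gem.

```ruby
# Hypothetical helper: applies the same nil-or-lambda checks the Processor uses,
# but to already-built hashes instead of SAX events.
def apply_filters(raw_records, key_filter: nil, record_filter: nil)
  raw_records.each_with_object([]) do |raw, kept|
    # Key filter: keep a field when no filter is set or the lambda returns truthy.
    record = raw.select { |key, _value| key_filter.nil? || key_filter.(key) }
    # Record filter: keep the record when no filter is set or the lambda returns truthy.
    kept << record if record_filter.nil? || record_filter.(record)
  end
end

# Made-up sample records; 'foo' stands in for a field we want to drop.
raw_records = [
  { 'name' => 'Player A', 'title' => 'GM', 'rating' => 2700, 'foo' => 'x' },
  { 'name' => 'Player B', 'rating' => 2100, 'foo' => 'y' }
]

# Exclude the 'foo' and 'bar' keys; keep only records that have a title.
key_filter    = ->(key) { !%w[foo bar].include?(key) }
record_filter = ->(rec) { rec['title'] }

result = apply_filters(raw_records, key_filter: key_filter, record_filter: record_filter)
p result
```

As in the Processor, the record filter sees the record after key filtering, so a record filter should not rely on a key that the key filter removes.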
data/lib/fide_xml_parser/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module FideXmlParser
2
+ VERSION = "0.1.0"
3
+ end
metadata ADDED
@@ -0,0 +1,158 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: fide_xml_parser
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Keith Bennett
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2020-02-27 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: nokogiri
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.10'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.10'
27
+ - !ruby/object:Gem::Dependency
28
+ name: tty-cursor
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '0.7'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '0.7'
41
+ - !ruby/object:Gem::Dependency
42
+ name: pry
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '0.12'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '0.12'
55
+ - !ruby/object:Gem::Dependency
56
+ name: awesome_print
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '1.8'
62
+ type: :runtime
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '1.8'
69
+ - !ruby/object:Gem::Dependency
70
+ name: bundler
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '2.0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: '2.0'
83
+ - !ruby/object:Gem::Dependency
84
+ name: rake
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - "~>"
88
+ - !ruby/object:Gem::Version
89
+ version: '10.0'
90
+ type: :development
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - "~>"
95
+ - !ruby/object:Gem::Version
96
+ version: '10.0'
97
+ - !ruby/object:Gem::Dependency
98
+ name: rspec
99
+ requirement: !ruby/object:Gem::Requirement
100
+ requirements:
101
+ - - "~>"
102
+ - !ruby/object:Gem::Version
103
+ version: '3.0'
104
+ type: :development
105
+ prerelease: false
106
+ version_requirements: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - "~>"
109
+ - !ruby/object:Gem::Version
110
+ version: '3.0'
111
+ description: Parses XML files downloaded from fide.com and writes JSON files.
112
+ email:
113
+ - keithrbennett@gmail.com
114
+ executables: []
115
+ extensions: []
116
+ extra_rdoc_files: []
117
+ files:
118
+ - ".gitignore"
119
+ - ".rspec"
120
+ - ".travis.yml"
121
+ - CHANGELOG.md
122
+ - Gemfile
123
+ - LICENSE.txt
124
+ - README-from-old-repo.md
125
+ - README.md
126
+ - Rakefile
127
+ - fide_xml_parser.gemspec
128
+ - lib/fide_xml_parser.rb
129
+ - lib/fide_xml_parser/json_writer.rb
130
+ - lib/fide_xml_parser/processor.rb
131
+ - lib/fide_xml_parser/version.rb
132
+ homepage: https://github.com/keithrbennett/fide-xml-parser
133
+ licenses:
134
+ - Apache-2.0
135
+ metadata:
136
+ allowed_push_host: https://rubygems.org
137
+ homepage_uri: https://github.com/keithrbennett/fide-xml-parser
138
+ source_code_uri: https://github.com/keithrbennett/fide-xml-parser
139
+ post_install_message:
140
+ rdoc_options: []
141
+ require_paths:
142
+ - lib
143
+ required_ruby_version: !ruby/object:Gem::Requirement
144
+ requirements:
145
+ - - ">="
146
+ - !ruby/object:Gem::Version
147
+ version: '0'
148
+ required_rubygems_version: !ruby/object:Gem::Requirement
149
+ requirements:
150
+ - - ">="
151
+ - !ruby/object:Gem::Version
152
+ version: '0'
153
+ requirements: []
154
+ rubygems_version: 3.1.2
155
+ signing_key:
156
+ specification_version: 4
157
+ summary: Parses XML files downloaded from fide.com and writes JSON files.
158
+ test_files: []