clippings_pluck 1.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 2c580728b35727180d4f14814ae842a207ff0771
4
+ data.tar.gz: 4d68827fcf34987d3734d3580737132e919021ce
5
+ SHA512:
6
+ metadata.gz: 8599cccebcc2c241147bda84eb0553d5673d1c9875c0e72134e147afe332d7700c24b534453798da28bdb978603113371447f6b00148cf2836f5f29d1f242e99
7
+ data.tar.gz: 63f8278f57fc056e3cad2a52df8543c59e04a604e77909c8de7d5a00ee90f2d7fd1a34c0a6da7bbd45f5e438ddb7a3d9bf6334ccfab85d0ea01ed3ff785292e0
data/.gitignore ADDED
@@ -0,0 +1,12 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+
11
+ # rspec failure tracking
12
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,5 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.3.1
5
+ before_install: gem install bundler -v 1.14.6
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at jgplane@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in clippings_pluck.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2018 Jack Plane
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,81 @@
1
+ # ClippingsPluck
2
+
3
+ Amazon stores all of your Kindle highlights and notes in a TXT file called "My Clippings.txt".
4
+ Amazon will also let you export those highlights as a CSV file.
5
+ ClippingsPluck accepts a string read from one of those files and returns an array of hashes.
6
+
7
+ ## Installation
8
+
9
+ Add this line to your application's Gemfile:
10
+
11
+ ```ruby
12
+ gem 'clippings_pluck'
13
+ ```
14
+
15
+ And then execute:
16
+
17
+ $ bundle
18
+
19
+ ## ClippingsPluck::CsvParser Usage
20
+
21
+ _Note:_ this is the preferred way to interact with ClippingsPluck. The Kindle CSV file is
22
+ much more reliable and straightforward to parse than the TXT file.
23
+
24
+ First, read from the csv file:
25
+ ```ruby
26
+ string = File.read("clippings.csv", mode: "rb")
27
+ ```
28
+
29
+ Then, you can pass that string into ClippingsPluck like this:
30
+
31
+ ```ruby
32
+ ClippingsPluck::CsvParser.new.run(string)
33
+ ```
34
+
35
+ You'll get back an array of hashes, each of which might look something like this:
36
+
37
+ ```ruby
38
+ {
39
+ note: nil,
40
+ quote: '"He was not no machine!" screamed Gloria, fiercely and ungrammatically.',
41
+ author: "Isaac Asimov",
42
+ book_title: "I, Robot (The Robot Series Book 1)",
43
+ page: nil,
44
+ location: "245",
45
+ date: nil
46
+ }
47
+ ```
48
+
49
+ ## ClippingsPluck::TxtParser Usage
50
+
51
+ Using the TXT file parser is similar:
52
+
53
+ ```ruby
54
+ file = File.open("./My\ Clippings.txt", "rb")
55
+ contents = file.read
56
+ ```
57
+
58
+ Then, you can feed the string to ClippingsPluck like this:
59
+
60
+ ```ruby
61
+ ClippingsPluck::TxtParser.new.run(contents)
62
+ ```
63
+
64
+ You'll get back an array of hashes, each of which might look something like this:
65
+
66
+ ```ruby
67
+ {
68
+ note: nil,
69
+ quote: '"He was not no machine!" screamed Gloria, fiercely and ungrammatically.',
70
+ author: "Isaac Asimov",
71
+ book_title: "I, Robot (The Robot Series Book 1)",
72
+ page: nil,
73
+ location: "245",
74
+ date: "Tuesday, November 22, 2017 6:42:51 PM"
75
+ }
76
+ ```
77
+
78
+ ## License
79
+
80
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
81
+
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "clippings_pluck"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/clippings_pluck.gemspec ADDED
@@ -0,0 +1,27 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'clippings_pluck/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "clippings_pluck"
8
+ spec.version = ClippingsPluck::VERSION
9
+ spec.authors = ["Jack Plane"]
10
+ spec.email = ["jgplane@gmail.com"]
11
+
12
+ spec.summary = "Kindle Clippings file parser"
13
+ spec.description = "https://github.com/jgplane/clippings-pluck"
14
+ spec.homepage = "https://github.com/jgplane/clippings-pluck"
15
+ spec.license = "MIT"
16
+
17
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
18
+ f.match(%r{^(test|spec|features)/})
19
+ end
20
+ spec.bindir = "exe"
21
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
22
+ spec.require_paths = ["lib"]
23
+
24
+ spec.add_development_dependency "bundler"
25
+ spec.add_development_dependency "rake", "~> 10.0"
26
+ spec.add_development_dependency "rspec", "~> 3.0"
27
+ end
data/lib/clippings_pluck.rb ADDED
@@ -0,0 +1,9 @@
1
+ require "clippings_pluck/version"
2
+
3
+ require_relative "clippings_pluck/clippings"
4
+ require_relative "clippings_pluck/clipping"
5
+ require_relative "clippings_pluck/location"
6
+ require_relative "clippings_pluck/txt_parser"
7
+ require_relative "clippings_pluck/csv_parser"
8
+
9
+ module ClippingsPluck; end
data/lib/clippings_pluck/clipping.rb ADDED
@@ -0,0 +1,27 @@
1
+ module ClippingsPluck
2
+ class Clipping < Hash
3
+ def location=(location)
4
+ self[:location] = location.nil? ? nil : Location.new(location)
5
+ end
6
+
7
+ def normalized_location
8
+ missing_location? ? nil : self[:location].normalize
9
+ end
10
+
11
+ def eligible_for_note_attachment?(note_location)
12
+ has_location? && normalized_location.to_i <= note_location.to_i
13
+ end
14
+
15
+ def notated?
16
+ !self[:note].nil?
17
+ end
18
+
19
+ def missing_location?
20
+ self[:location].nil?
21
+ end
22
+
23
+ def has_location?
24
+ !missing_location?
25
+ end
26
+ end
27
+ end
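Kindle locations reach `Clipping` as strings, so the way two locations are compared affects which highlights count as eligible for a note. The sketch below is an illustration only: both helper methods are hypothetical and not part of the gem. It contrasts string comparison with integer comparison for the same pair of locations.

```ruby
# Hypothetical helpers (not part of the gem) contrasting two ways of
# deciding whether a highlight sits at or before a note's location.

# Compares the raw strings, which Ruby orders character by character.
def at_or_before_as_strings?(highlight_location, note_location)
  highlight_location <= note_location
end

# Converts both locations to integers before comparing.
def at_or_before_as_integers?(highlight_location, note_location)
  highlight_location.to_i <= note_location.to_i
end

puts at_or_before_as_strings?("99", "245")   # false: "9" sorts after "2"
puts at_or_before_as_integers?("99", "245")  # true: 99 <= 245
```

Integer comparison is the safer choice whenever two locations can have a different number of digits.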
data/lib/clippings_pluck/clippings.rb ADDED
@@ -0,0 +1,25 @@
1
+ module ClippingsPluck
2
+ class Clippings < Array
3
+ def closest_highlight(note_location)
4
+ sorted_eligible_note_matches(note_location.normalize).last
5
+ end
6
+
7
+ def with_notes
8
+ select(&:notated?)
9
+ end
10
+
11
+ def without_notes
12
+ self - with_notes
13
+ end
14
+
15
+ private
16
+
17
+ def sorted_eligible_note_matches(location)
18
+ eligible_note_matches(location).sort_by { |clipping| clipping.normalized_location.to_i }
19
+ end
20
+
21
+ def eligible_note_matches(location)
22
+ select { |clipping| clipping.eligible_for_note_attachment?(location) }
23
+ end
24
+ end
25
+ end
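The note-matching idea in `Clippings` can be sketched without the gem's classes: keep the highlights located at or before the note, then take the one with the highest location. This is a minimal sketch under stated assumptions. The method names and the plain-hash data are hypothetical, and locations are assumed to be strings such as "245" or "245-247".

```ruby
# Hypothetical stand-in for the closest-highlight lookup, using plain hashes.

# End of a range ("245-247" gives 247) or the single value ("245" gives 245).
def location_end(location)
  location.split("-").last.to_i
end

# Returns the highlight located closest to, but not after, the note.
# Highlights without a location are ignored; nil means nothing qualified.
def closest_highlight(highlights, note_location)
  note_end = location_end(note_location)
  highlights
    .reject { |highlight| highlight[:location].nil? }
    .select { |highlight| location_end(highlight[:location]) <= note_end }
    .max_by { |highlight| location_end(highlight[:location]) }
end

highlights = [
  { quote: "first",  location: "99" },
  { quote: "second", location: "245-247" },
  { quote: "third",  location: "300" }
]

p closest_highlight(highlights, "250") # the "second" highlight
p closest_highlight(highlights, "50")  # nil, no highlight precedes the note
```

Because the lookup can return `nil`, a caller should check the result before attaching a note to it.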
data/lib/clippings_pluck/csv_parser.rb ADDED
@@ -0,0 +1,81 @@
1
+ require 'csv'
2
+
3
+ module ClippingsPluck
4
+ class CsvParser
5
+ EXCEL_DIVIDER = "----------------------------------------------\t\t\t\r\t\t\t\r"
6
+ EXCEL_SEP = "\t"
7
+
8
+ AMZN_DIVIDER = "----------------------------------------------"
9
+ AMZN_SEP = ","
10
+
11
+ def initialize
12
+ @clippings = Clippings.new
13
+ end
14
+
15
+ def run(string)
16
+ @string = string
17
+ @raw_metadata, @clipping_data = @string.split(divider)
18
+ @book, @authors = parse_metadata
19
+ build_clippings
20
+ @clippings
21
+ end
22
+
23
+ private
24
+
25
+ def excel?
26
+ @string.include? EXCEL_DIVIDER
27
+ end
28
+
29
+ def divider
30
+ excel? ? EXCEL_DIVIDER : AMZN_DIVIDER
31
+ end
32
+
33
+ def sep
34
+ excel? ? EXCEL_SEP : AMZN_SEP
35
+ end
36
+
37
+ def parse_metadata
38
+ metadata = @raw_metadata.split("\r")[1..2].map(&:strip).map { |line| line.sub(/\A"?by /, "") }
39
+ metadata.map! { |string| string.gsub("\"", "").gsub(",,,", "") } if !excel?
40
+ metadata
41
+ end
42
+
43
+ def build_clippings
44
+ remove_ghost_rows if !excel?
45
+ csv_hash = CSV.parse(@clipping_data, headers: true, col_sep: sep).map(&:to_h)
46
+ csv_hash.each { |data| format_according_to_type(data) }
47
+ end
48
+
49
+ def remove_ghost_rows
50
+ @clipping_data = @clipping_data.gsub(",,,\r\n,,,\r\n", "")
51
+ end
52
+
53
+ def format_according_to_type(data)
54
+ if data['Annotation Type'] == 'Note'
55
+ attach_note(data) unless @clippings.empty?
56
+ else
57
+ @clipping = Clipping.new
58
+ @clippings << format_clipping(data)
59
+ end
60
+ end
61
+
62
+ def attach_note(data)
63
+ if @clippings.last[:note].empty?
64
+ @clippings.last[:note] = data['Annotation']
65
+ else
66
+ @clippings.last[:note] += " | #{data['Annotation']}"
67
+ end
68
+ end
69
+
70
+ def format_clipping(data)
71
+ @clipping[:quote] = data.delete 'Annotation'
72
+ @clipping.location = (data.delete 'Location').gsub(/Location /, '')
73
+ @clipping[:note] = ""
74
+ @clipping[:book_title] = @book
75
+ @clipping[:author] = @authors
76
+ data.delete 'Annotation Type'
77
+ data.delete 'Starred?'
78
+ @clipping
79
+ end
80
+ end
81
+ end
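The overall flow of the CSV parser (split the export at a divider, parse the remaining rows with headers, and attach each note to the highlight before it) can be sketched as follows. Everything here is an assumption made for illustration: the divider, the sample export, and the method name are hypothetical, and the real Amazon file differs in detail.

```ruby
require "csv"

# Hypothetical divider and sample export, made up for this sketch.
SKETCH_DIVIDER = "-" * 20

SAMPLE_EXPORT = <<~EXPORT
  Your Kindle Notes For:
  I, Robot
  by Isaac Asimov
  #{SKETCH_DIVIDER}
  Annotation Type,Location,Starred?,Annotation
  Highlight,Location 245,,"He was not no machine!"
  Note,Location 245,,Gloria defends Robbie
EXPORT

# Splits off the metadata block, then turns each CSV row into a hash.
def parse_export(text)
  _metadata, rows = text.split(SKETCH_DIVIDER, 2)
  clippings = []
  CSV.parse(rows.strip, headers: true).each do |row|
    if row["Annotation Type"] == "Note"
      # Attach the note to the highlight before it, if there is one.
      clippings.last[:note] = row["Annotation"] unless clippings.empty?
    else
      clippings << {
        quote: row["Annotation"],
        location: row["Location"].sub("Location ", ""),
        note: nil
      }
    end
  end
  clippings
end

p parse_export(SAMPLE_EXPORT)
```

Guarding on `clippings.empty?` keeps a note row that appears before any highlight from raising an error.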
data/lib/clippings_pluck/location.rb ADDED
@@ -0,0 +1,18 @@
1
+ module ClippingsPluck
2
+ class Location < String
3
+ def normalize
4
+ range? ? highest_of_range : self
5
+ end
6
+
7
+ private
8
+
9
+ def range?
10
+ include?("-")
11
+ end
12
+
13
+ def highest_of_range
14
+ split("-").last
15
+ end
16
+ end
17
+ end
18
+
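As a quick illustration of the normalization rule, a range collapses to its upper bound and a single location is returned unchanged. The free-standing method below is a hypothetical equivalent written for this sketch, not the gem's API.

```ruby
# Hypothetical free-standing version of the normalization rule.
def normalize_location(location)
  location.include?("-") ? location.split("-").last : location
end

puts normalize_location("245-247") # 247
puts normalize_location("245")     # 245
```

The result is still a string, so it needs converting with `to_i` before any numeric comparison.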
data/lib/clippings_pluck/txt_parser.rb ADDED
@@ -0,0 +1,88 @@
1
+ module ClippingsPluck
2
+ class TxtParser
3
+ def initialize
4
+ @clippings = Clippings.new
5
+ @clipping = Clipping.new
6
+ end
7
+
8
+ def run(file_content)
9
+ split_clippings(file_content)
10
+ @clippings
11
+ end
12
+
13
+ private
14
+
15
+ def split_clippings(file_content)
16
+ clippings = file_content.force_encoding("UTF-8").split("=" * 10).delete_if { |c| c.strip.empty? }
17
+ clippings.each { |clipping| parse_lines(clipping) }
18
+ end
19
+
20
+ def parse_lines(clipping)
21
+ lines = clipping.lines.delete_if { |line| line.strip.empty? }.collect(&:strip)
22
+ parse_type(lines) if lines.length == 3
23
+ end
24
+
25
+ def parse_type(lines)
26
+ parse_quote(lines) if lines[1].include? "Highlight"
27
+ parse_note(lines) if lines[1].include? "Note"
28
+ end
29
+
30
+ def parse_note(lines)
31
+ if @clippings.length.positive? && find_location(lines)
32
+ location = Location.new(find_location(lines))
33
+ highlight = @clippings.closest_highlight(location)
34
+ highlight[:note] = lines[2] unless highlight.nil?
35
+ end
36
+ end
37
+
38
+ def parse_quote(lines)
39
+ @clipping[:note] = nil
40
+ @clipping[:quote] = lines[2]
41
+ parse_author(lines)
42
+ end
43
+
44
+ def parse_author(lines)
45
+ author_names = lines[0].reverse[/\).*?\(/].to_s.gsub(/[()]/, "").reverse
46
+ if author_names.include? ","
47
+ last_name, first_name = author_names.split(",").collect(&:strip)
48
+ else
49
+ first_name, last_name = author_names.split(" ", 2).collect(&:strip)
50
+ end
51
+ @clipping[:author] = [first_name, last_name].compact.join(' ')
52
+ lines[0].slice!("(#{author_names})")
53
+ parse_book(lines)
54
+ end
55
+
56
+ def parse_book(lines)
57
+ @clipping[:book_title] = lines[0].strip
58
+ parse_page(lines)
59
+ end
60
+
61
+ def parse_page(lines)
62
+ page = lines[1].match(/(?<=page ).\d*/)
63
+ @clipping[:page] = page.nil? ? nil : page[0].to_i
64
+ parse_location(lines)
65
+ end
66
+
67
+ def parse_location(lines)
68
+ @clipping.location = find_location(lines)
69
+ parse_date(lines)
70
+ end
71
+
72
+ def find_location(lines)
73
+ location = lines[1].match(/(?<=Location ).\S*/)
74
+ location.nil? ? nil : location[0].to_s
75
+ end
76
+
77
+ def parse_date(lines)
78
+ date = lines[1].match(/(?<=Added on ).+/)
79
+ @clipping[:date] = date.nil? ? nil : date[0]
80
+ push_clipping
81
+ end
82
+
83
+ def push_clipping
84
+ @clippings << @clipping
85
+ @clipping = Clipping.new
86
+ end
87
+ end
88
+ end
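The TXT parser reads page, location, and date from the second line of each entry using lookbehind regexes. The sketch below shows that extraction on its own. The sample line and the method name are hypothetical, and the exact wording of this line varies between Kindle models, so treat the layout as an assumption.

```ruby
# Hypothetical sample of the second line of a "My Clippings.txt" entry.
SAMPLE_META_LINE =
  "- Your Highlight on page 12 | Location 245-247 | " \
  "Added on Wednesday, November 22, 2017 6:42:51 PM"

# Pulls page, location, and date out of a metadata line with lookbehind
# regexes; any field that is absent comes back as nil.
def parse_meta_line(line)
  {
    page: line[/(?<=page )\d+/]&.to_i,
    location: line[/(?<=Location )\S+/],
    date: line[/(?<=Added on ).+/]
  }
end

p parse_meta_line(SAMPLE_META_LINE)
p parse_meta_line("- Your Note on Location 250 | Added on Wednesday, November 22, 2017 6:45:10 PM")
```

Returning `nil` for a missing field lets the caller decide how to handle entries that have a page but no location, or the reverse.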
data/lib/clippings_pluck/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module ClippingsPluck
2
+ VERSION = "1.0.0"
3
+ end
metadata ADDED
@@ -0,0 +1,104 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: clippings_pluck
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Jack Plane
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2019-09-06 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rspec
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '3.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '3.0'
55
+ description: https://github.com/jgplane/clippings-pluck
56
+ email:
57
+ - jgplane@gmail.com
58
+ executables: []
59
+ extensions: []
60
+ extra_rdoc_files: []
61
+ files:
62
+ - ".gitignore"
63
+ - ".rspec"
64
+ - ".travis.yml"
65
+ - CODE_OF_CONDUCT.md
66
+ - Gemfile
67
+ - LICENSE.txt
68
+ - README.md
69
+ - Rakefile
70
+ - bin/console
71
+ - bin/setup
72
+ - clippings_pluck.gemspec
73
+ - lib/clippings_pluck.rb
74
+ - lib/clippings_pluck/clipping.rb
75
+ - lib/clippings_pluck/clippings.rb
76
+ - lib/clippings_pluck/csv_parser.rb
77
+ - lib/clippings_pluck/location.rb
78
+ - lib/clippings_pluck/txt_parser.rb
79
+ - lib/clippings_pluck/version.rb
80
+ homepage: https://github.com/jgplane/clippings-pluck
81
+ licenses:
82
+ - MIT
83
+ metadata: {}
84
+ post_install_message:
85
+ rdoc_options: []
86
+ require_paths:
87
+ - lib
88
+ required_ruby_version: !ruby/object:Gem::Requirement
89
+ requirements:
90
+ - - ">="
91
+ - !ruby/object:Gem::Version
92
+ version: '0'
93
+ required_rubygems_version: !ruby/object:Gem::Requirement
94
+ requirements:
95
+ - - ">="
96
+ - !ruby/object:Gem::Version
97
+ version: '0'
98
+ requirements: []
99
+ rubyforge_project:
100
+ rubygems_version: 2.6.14
101
+ signing_key:
102
+ specification_version: 4
103
+ summary: Kindle Clippings file parser
104
+ test_files: []