fyodor 0.1.1

data/README.md ADDED
@@ -0,0 +1,62 @@
+ # Fyodor
+
+ Convert your Amazon Kindle highlights, notes and bookmarks into markdown files.
+
+
+ ## What is it about
+ This application parses `My Clippings.txt` from your Kindle and generates a markdown file for each book/document, in the format `#{Author} - #{Title}.md`. This way, your annotations on the books you read are conveniently stored and easily managed.
+
+ To read more about the motivation and what problem it tries to solve, [check this blog post](http://rafaelc.org/blog/export-all-your-kindle-highlights-and-notes/).
+
+ [For samples of the output, click here.](samples/)
+
+
+ ## Features
+
+ * Supports all types of entries in your clippings file: highlights, notes, clips and bookmarks.
+ * Automatic removal of empty or duplicate entries (the clippings file can get a lot of those).
+ * Orders your entries by location/page in each book (the clippings file is ordered by time).
+ * Easily configurable for your language, allowing you to get all features and beautiful output.
+ * This software goes to some length to be locale agnostic: basic parsing should work without configuration for any language. It should also work even if your clippings file has multiple locales.
+ * Bookmarks are printed together and notes are formatted differently, for better visualization.
+ * Output in a format that is clean and easy to edit and fiddle with: markdown.
+
+ This program is based on the clippings file generated by Kindle 2019, but should work with other models.
+
+
+ ## Installation
+
+ Install Ruby and run:
+
+ ```
+ $ gem install fyodor
+ ```
+
+
+ ## Configuration
+
+ If your Kindle is not in English, you should configure Fyodor so it knows what your `My Clippings.txt` calls some things (e.g. highlights, pages). This is easily done in the `parser` section of the config file, by replacing the default English values.
+
+ Note that basic parsing should still work without configuration, but you won't take advantage of many features, resulting in a dirtier output.
+
+ To configure the application, [copy the sample config](https://github.com/rccavalcanti/fyodor/blob/master/fyodor.toml.sample) and place it at `~/.config/fyodor.toml`. Edit it as you like.
+
+ The configuration file also allows you to set whether to print the time of each entry. Under `[output]`, set `time` to `true` or `false`.
+
+
+ ## Running
+
+ ```
+ $ fyodor CLIPPINGS_FILE [OUTPUT_DIR]
+ ```
+
+ Where:
+ * `CLIPPINGS_FILE` is the path to `My Clippings.txt`.
+ * `OUTPUT_DIR` is the directory to write the markdown files to. If none is supplied, it will be `fyodor_output` in the current directory.
+
+
+ ## LICENSE
+
+ Released under [GNU GPL v3](LICENSE).
+
+ Copyright 2019 Rafael Cavalcanti <hi@rafaelc.org>
data/bin/fyodor ADDED
@@ -0,0 +1,10 @@
+ #!/usr/bin/env ruby
+
+ require "fyodor/cli"
+
+ # Suppress stack trace
+ begin
+   Fyodor::CLI.new.main
+ rescue StandardError => e
+   abort(e.message)
+ end
@@ -0,0 +1,42 @@
+ require "forwardable"
+ require "set"
+
+ module Fyodor
+   class Book
+     extend Forwardable
+     include Enumerable
+
+     attr_reader :title, :author, :rej_dup
+
+     def_delegators :@entries, :each, :size
+
+     def initialize(title, author=nil)
+       raise "Book title can't be empty" if title.to_s.empty?
+
+       @title = title
+       @author = author
+       @entries = SortedSet.new
+       @rej_dup = 0
+     end
+
+     def <<(entry)
+       return if entry.empty?
+       # #add? returns nil if the entry was duplicated
+       @rej_dup += 1 if @entries.add?(entry).nil?
+     end
+
+     def basename
+       base = @author.to_s.empty? ? @title : "#{@author} - #{@title}"
+       base.strip.gsub(/[?*:|\/"<>]/,"_")
+     end
+
+     def count_types
+       list = group_by(&:type).map { |k, v| [k, v.size] }
+       Hash[list]
+     end
+
+     def count_desc_unparsed
+       count { |entry| ! entry.desc_parsed? }
+     end
+   end
+ end
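The deduplication and file-naming logic in the `Book` hunk above can be tried in isolation. The following is a minimal sketch, not the gem's code: `Clip` is a hypothetical stand-in for `Fyodor::Entry`, and a plain `Set` plus `sort_by` replaces `SortedSet`, which was moved out of the standard `set` library in Ruby 3.0. `safe_basename` is an illustrative name for the same character substitution used by `Book#basename`.

```ruby
require "set"

# Hypothetical stand-in for Fyodor::Entry. Struct derives ==, eql? and hash
# from its members, which is what Set relies on to detect duplicates.
Clip = Struct.new(:text, :loc_start)

entries = Set.new
rejected = 0

[Clip.new("b", 20), Clip.new("a", 10), Clip.new("b", 20)].each do |clip|
  # Set#add? returns nil when an equal element is already present.
  rejected += 1 if entries.add?(clip).nil?
end

# Order by location, as the README promises for the generated files.
sorted = entries.sort_by(&:loc_start)

# Same substitution as Book#basename: replace characters that are
# unsafe in file names with underscores.
def safe_basename(title, author = nil)
  base = author.to_s.empty? ? title : "#{author} - #{title}"
  base.strip.gsub(/[?*:|\/"<>]/, "_")
end

puts rejected
puts sorted.map(&:text).inspect
puts safe_basename("War: Peace?", "Leo Tolstoy")
```

Sorting on demand keeps the sketch independent of the Ruby version; the original relies on `SortedSet` doing the same job through `Entry#<=>`.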
data/lib/fyodor/cli.rb ADDED
@@ -0,0 +1,43 @@
+ require_relative "config_getter"
+ require_relative "stats_printer"
+ require_relative "library"
+ require_relative "clippings_parser"
+ require_relative "output_writer"
+ require "pathname"
+
+ module Fyodor
+   class CLI
+     def initialize
+       get_args
+       @config = ConfigGetter.new.config
+     end
+
+     def main
+       library = Library.new
+       ClippingsParser.new(@clippings_path, @config["parser"]).parse(library)
+       StatsPrinter.new(library).print
+       OutputWriter.new(library, @output_dir, @config["output"]).write_all
+     end
+
+
+     private
+
+     def get_args
+       if ARGV.count != 1 && ARGV.count != 2
+         puts "Usage: #{File.basename($0)} my_clippings_path [output_dir]"
+         exit 1
+       end
+
+       @clippings_path = get_path(ARGV[0])
+       @output_dir = ARGV[1].nil? ? default_output_dir : get_path(ARGV[1])
+     end
+
+     def get_path(path)
+       Pathname.new(path).expand_path
+     end
+
+     def default_output_dir
+       Pathname.new(Dir.pwd) + "fyodor_output"
+     end
+   end
+ end
@@ -0,0 +1,37 @@
+ require_relative "entry_parser"
+
+ module Fyodor
+   class ClippingsParser
+     SEPARATOR = /^==========\r?\n$/
+     ENTRY_SIZE = 5
+
+     def initialize(clippings_path, parser_config)
+       @path = clippings_path
+       @config = parser_config
+     end
+
+     def parse(library)
+       lines = []
+       File.foreach(@path) do |line|
+         lines << line
+         if end_entry?(lines)
+           library << parse_entry(lines)
+           lines.clear
+         end
+       end
+       raise "MyClippings is badly formatted" if lines.size > 0
+     end
+
+     private
+
+     def end_entry?(lines)
+       return false if lines.size < ENTRY_SIZE
+       return true if lines.size == ENTRY_SIZE && lines.last =~ SEPARATOR
+       raise "MyClippings is badly formatted"
+     end
+
+     def parse_entry(lines)
+       EntryParser.new(lines, @config).entry
+     end
+   end
+ end
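The parser above treats the clippings file as a sequence of five-line records, each closed by a line of ten `=` characters. A minimal sketch of that grouping follows, working on an in-memory string rather than a file. `split_entries` and the sample text are illustrative assumptions, not part of the gem, and the sample only approximates what a Kindle writes.

```ruby
SEPARATOR = /^==========\r?\n$/
ENTRY_SIZE = 5

# Group raw lines into five-line entries: title, description, blank line,
# text, separator. Raises if a group does not end with the separator.
def split_entries(text)
  entries = []
  lines = []
  text.each_line do |line|
    lines << line
    next if lines.size < ENTRY_SIZE
    raise "clippings text is badly formatted" unless lines.last =~ SEPARATOR
    entries << lines.dup
    lines.clear
  end
  raise "clippings text is badly formatted" unless lines.empty?
  entries
end

# Invented sample data; the second entry is a bookmark with empty text.
SAMPLE = <<~TXT
  Some Book (Some Author)
  - Your Highlight on page 12 | Location 150-152 | Added on Monday, 1 January 2019 10:00:00

  A sample highlighted passage.
  ==========
  Another Book (Another Author)
  - Your Bookmark on page 3 | Location 40 | Added on Tuesday, 2 January 2019 11:00:00


  ==========
TXT

entries = split_entries(SAMPLE)
puts entries.size
puts entries.first[3]
```

A trailing partial record (fewer than five lines at the end of the input) is reported as an error, matching the check after the loop in `ClippingsParser#parse`.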
@@ -0,0 +1,51 @@
+ require_relative "core_extensions/hash/merging"
+ require "pathname"
+ require "toml"
+
+ module Fyodor
+   class ConfigGetter
+     PATHS = [Pathname.new(__FILE__).dirname + "../fyodor.toml",
+              Pathname.new("~/.config/fyodor.toml").expand_path]
+     DEFAULT = {
+       "parser" => {
+         "highlight" => "Your Highlight",
+         "note" => "Your Note",
+         "bookmark" => "Your Bookmark",
+         "clip" => "Clip This Article",
+         "loc" => "Location",
+         "page" => "page",
+         "time" => "Added on"
+       },
+       "output" => {
+         "time" => false
+       }
+     }
+
+     def config
+       @config ||= get_config
+     end
+
+
+     private
+
+     def get_config
+       Hash.include CoreExtensions::Hash::Merging
+
+       print_path
+       user_config = path.nil? ? {} : TOML.load_file(path)
+       DEFAULT.deep_merge(user_config)
+     end
+
+     def path
+       @path ||= PATHS.find { |path| path.exist? }
+     end
+
+     def print_path
+       if path.nil?
+         puts "No config found: using defaults.\n\n"
+       else
+         puts "Using config at #{path}\n\n"
+       end
+     end
+   end
+ end
@@ -0,0 +1,12 @@
+ module Fyodor
+   module CoreExtensions
+     module Hash
+       module Merging
+         def deep_merge(second)
+           merger = proc { |key, v1, v2| ::Hash === v1 && ::Hash === v2 ? v1.merge(v2, &merger) : ::Array === v1 && ::Array === v2 ? v1 | v2 : [:undefined, nil, :nil].include?(v2) ? v1 : v2 }
+           self.merge(second.to_h, &merger)
+         end
+       end
+     end
+   end
+ end
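To see what the deep merge is meant to do with a partial user config, here is a standalone sketch. It is an assumption-laden simplification: `deep_merge` is written as a plain top-level method instead of a `Hash` extension, and only `nil` is treated as "unset" (the original also ignores `:undefined` and `:nil`). It refers to `::Hash` and `::Array` explicitly, because inside a module nested under `CoreExtensions::Hash` a bare `Hash` constant resolves lexically to that module rather than to the core class.

```ruby
# Recursively merge override into base: nested hashes are merged key by key,
# arrays are unioned, and a nil override keeps the base value.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    if old.is_a?(::Hash) && new.is_a?(::Hash)
      deep_merge(old, new)
    elsif old.is_a?(::Array) && new.is_a?(::Array)
      old | new
    elsif new.nil?
      old
    else
      new
    end
  end
end

# Illustrative values only; the user hash overrides a single parser string.
defaults = {
  "parser" => { "highlight" => "Your Highlight", "loc" => "Location" },
  "output" => { "time" => false }
}
user = {
  "parser" => { "highlight" => "Seu destaque" },
  "output" => { "time" => true }
}

merged = deep_merge(defaults, user)
puts merged.inspect
```

With a shallow `Hash#merge`, the user's one-key `parser` table would replace the whole default table and drop `loc`; the recursive version keeps the defaults that the user did not override.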
@@ -0,0 +1,63 @@
+ module Fyodor
+   class Entry
+     TYPE = { note: "note",
+              highlight: "highlight",
+              bookmark: "bookmark",
+              clip: "clip" }
+
+     attr_reader :book_title, :book_author, :text, :desc, :type, :loc, :loc_start, :page, :time
+
+     def initialize(attrs)
+       @book_title = attrs[:book_title]
+       @book_author = attrs[:book_author]
+       @text = attrs[:text]
+       @desc = attrs[:desc]
+       @type = attrs[:type]
+       @loc = attrs[:loc]
+       # This is our comparable, we need it as a number.
+       @loc_start = attrs[:loc_start].to_i
+       @page = attrs[:page]
+       @time = attrs[:time]
+
+       raise ArgumentError, "Invalid Entry type" unless TYPE.value?(@type) || @type.nil?
+     end
+
+     def empty?
+       if @type == TYPE[:bookmark] || @type.nil?
+         @desc.strip == ""
+       else
+         @text.strip == ""
+       end
+     end
+
+     def desc_parsed?
+       @loc_start != 0 && ! @type.nil?
+     end
+
+     # Override this method to use a SortedSet.
+     def <=>(other)
+       @loc_start <=> other.loc_start
+     end
+
+     # Override the following methods for deduplication.
+     def ==(other)
+       return false if @type != other.type || @text != other.text
+
+       if desc_parsed? && other.desc_parsed?
+         @loc == other.loc
+       else
+         @desc == other.desc
+       end
+     end
+
+     alias eql? ==
+
+     def hash
+       if desc_parsed?
+         @text.hash ^ @type.hash ^ @loc.hash
+       else
+         @text.hash ^ @desc.hash
+       end
+     end
+   end
+ end
@@ -0,0 +1,93 @@
+ require_relative "entry"
+
+ module Fyodor
+   class EntryParser
+     def initialize(entry_lines, parser_config)
+       @lines = entry_lines
+       @config = parser_config
+       format_check
+     end
+
+     def entry
+       Entry.new({book_title: book[:title],
+                  book_author: book[:author],
+                  text: text,
+                  desc: desc,
+                  type: type,
+                  loc: loc,
+                  loc_start: loc_start,
+                  page: page,
+                  time: time})
+     end
+
+
+     private
+
+     def book
+       title, author = @lines[0].scan(regex_cap(:title_author)).first
+       # If book has no author, regex fails.
+       title = @lines[0] if title.nil?
+
+       {title: title.strip, author: author.to_s.strip}
+     end
+
+     def desc
+       @lines[1].delete_prefix("- ").strip
+     end
+
+     def type
+       Entry::TYPE.values.find { |type| @lines[1] =~ regex_type(type) }
+     end
+
+     def loc
+       @lines[1][regex_cap(:loc), 1]
+     end
+
+     def loc_start
+       @lines[1][regex_cap(:loc_start), 1].to_i
+     end
+
+     def page
+       @lines[1][regex_cap(:page), 1]
+     end
+
+     def time
+       @lines[1][regex_cap(:time), 1]
+     end
+
+     def text
+       @lines[3].strip
+     end
+
+     def regex_type(type)
+       s = Regexp.quote(@config[type])
+       /^- #{s}/
+     end
+
+     def regex_cap(item)
+       case item
+       when :title_author
+         /^(.*) \((.*)\)\r?\n$/
+       when :loc
+         s = Regexp.quote(@config["loc"])
+         /#{s} (\S+)/
+       when :loc_start
+         s = Regexp.quote(@config["loc"])
+         /#{s} (\d+)(-\d+)?/
+       when :page
+         s = Regexp.quote(@config["page"])
+         /#{s} (\S+)/
+       when :time
+         s = Regexp.quote(@config["time"])
+         /#{s} (.*?)\r?\n$/
+       end
+     end
+
+     def format_check
+       raise "Entry must have five lines" unless @lines.size == 5
+       raise "Entry is badly formatted" if @lines[0].strip.empty?
+       raise "Entry is badly formatted" if @lines[1].strip.empty?
+       raise "Entry is badly formatted" unless @lines[2].strip.empty?
+     end
+   end
+ end
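The description line is where the configurable strings matter. The sketch below applies the same kind of captures to one invented description line, using the English defaults. `CONFIG`, `capture` and the sample line are assumptions for illustration. The time pattern uses a lazy `(.*?)` so that the carriage return of a CRLF-terminated line is not captured along with the timestamp.

```ruby
# Hypothetical parser strings, mirroring the English defaults.
CONFIG = { "loc" => "Location", "page" => "page", "time" => "Added on" }.freeze

# Return the first capture group of "<label> <pattern>" in the line, or nil.
# The label is quoted so that localized strings with regex metacharacters
# are matched literally.
def capture(line, key, pattern)
  label = Regexp.quote(CONFIG[key])
  line[/#{label} #{pattern}/, 1]
end

# Invented sample line with a CRLF ending.
desc_line = "- Your Highlight on page 12 | Location 150-152 | Added on Monday, 1 January 2019 10:00:00\r\n"

page = capture(desc_line, "page", '(\S+)')
loc = capture(desc_line, "loc", '(\S+)')
loc_start = capture(desc_line, "loc", '(\d+)(-\d+)?').to_i
time = capture(desc_line, "time", '(.*?)\r?\n$')

puts [page, loc, loc_start, time].inspect
```

When a label does not occur in the line (for example, a Kindle set to another language without a matching config), each capture is `nil` and `loc_start` becomes `0`, which is the condition `Entry#desc_parsed?` uses to flag the entry as unparsed.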
@@ -0,0 +1,57 @@
+ require_relative "book"
+ require "forwardable"
+
+ module Fyodor
+   class Library
+     extend Forwardable
+     include Enumerable
+
+     def_delegators :@books, :each, :empty?, :size
+
+     def initialize
+       @books = []
+       @rej_empty = 0
+     end
+
+     def <<(entry)
+       if entry.empty?
+         @rej_empty += 1
+         return
+       end
+
+       book(entry.book_title, entry.book_author) << entry
+     end
+
+     def count_types
+       reduce({}) { |acc, book| acc.merge(book.count_types) { |key, val1, val2| val1 + val2 } }
+     end
+
+     def count_desc_unparsed
+       reduce(0) { |acc, book| acc + book.count_desc_unparsed }
+     end
+
+     def count_entries
+       reduce(0) { |acc, book| acc + book.size }
+     end
+
+     def rejected
+       {empty: @rej_empty, dup: count_rej_dup}
+     end
+
+
+     private
+
+     def book(title, author)
+       book = find { |book| book.title == title && book.author == author }
+       if book.nil?
+         book = Book.new(title, author)
+         @books << book
+       end
+       book
+     end
+
+     def count_rej_dup
+       reduce(0) { |acc, book| acc + book.rej_dup }
+     end
+   end
+ end
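`Library#count_types` folds the per-book counts into one hash by passing a block to `Hash#merge`, which is called only for keys present on both sides. A small sketch with invented per-book counts shows the effect:

```ruby
# Invented per-book counts, shaped like the hashes Book#count_types returns.
per_book = [
  { "highlight" => 2, "note" => 1 },
  { "highlight" => 3, "bookmark" => 1 }
]

# The merge block resolves key collisions by adding the two counts.
totals = per_book.reduce({}) do |acc, counts|
  acc.merge(counts) { |_type, a, b| a + b }
end

puts totals.inspect
```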
@@ -0,0 +1,102 @@
+ require_relative "strings"
+
+ module Fyodor
+   class MdGenerator
+     include Strings
+
+     def initialize(book, config)
+       @book = book
+       @config = config
+     end
+
+     def content
+       header + body + bookmarks
+     end
+
+
+     private
+
+     def header
+       return <<~EOF
+         # #{@book.title}
+         #{"by #{@book.author}" unless @book.author.to_s.empty?}
+
+         #{header_counts}
+
+       EOF
+     end
+
+     def header_counts
+       output = ""
+       @book.count_types.each do |type, n|
+         output += "#{n} #{pluralize(type, n)}, " if n > 0
+       end
+       output.delete_suffix(", ")
+     end
+
+     def pluralize(type, n)
+       n == 1 ? SINGULAR[type] : PLURAL[type]
+     end
+
+     def body
+       entries = @book.reject { |entry| entry.type == Entry::TYPE[:bookmark] }
+       entries.size == 0 ? "" : entries_render(entries)
+     end
+
+     def bookmarks
+       bookmarks = @book.select { |entry| entry.type == Entry::TYPE[:bookmark] }
+       bookmarks.size == 0 ? "" : entries_render(bookmarks, "Bookmarks")
+     end
+
+     def entries_render(entries, title=nil)
+       output = "---\n\n"
+       output += "## #{title}\n\n" unless title.nil?
+       entries.each do |entry|
+         output += "#{entry_text(entry)}\n\n"
+         output += "<p style=\"text-align: right;\"><sup>#{entry_desc(entry)}</sup></p>\n\n"
+       end
+       output
+     end
+
+     def entry_text(entry)
+       case entry.type
+       when Entry::TYPE[:bookmark]
+         "* #{page(entry)}"
+       when Entry::TYPE[:note]
+
+         "> _#{text(entry)}_"
+       else
+         "> #{text(entry)}"
+       end
+     end
+
+     def entry_desc(entry)
+       return entry.desc unless entry.desc_parsed?
+
+       case entry.type
+       when Entry::TYPE[:bookmark]
+         time(entry)
+       else
+         (type(entry) + " @ " + page(entry) + " " + time(entry)).strip
+       end
+     end
+
+     def page(entry)
+       ((entry.page.nil? ? "" : "page #{entry.page}, ") +
+         (entry.loc.nil? ? "" : "loc. #{entry.loc}")).delete_suffix(", ")
+     end
+
+     def time(entry)
+       @config["time"] ? "[#{entry.time}]" : ""
+     end
+
+     def type(entry)
+       SINGULAR[entry.type]
+     end
+
+     def text(entry)
+       # Markdown needs no white space between text and formatters
+       entry.text.strip
+     end
+   end
+ end
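The summary line under each book title is built from the per-type counts, picking a singular or plural label and skipping types with no entries. A compact sketch of that step, with `summary` as a hypothetical helper and trimmed-down label tables standing in for `Strings::SINGULAR` and `Strings::PLURAL`:

```ruby
# Trimmed-down label tables; the gem's versions also cover clips and
# unrecognized entries.
SINGULAR = { "highlight" => "highlight", "note" => "note", "bookmark" => "bookmark" }.freeze
PLURAL = { "highlight" => "highlights", "note" => "notes", "bookmark" => "bookmarks" }.freeze

# Build "3 highlights, 1 note" from a type => count hash,
# leaving out types whose count is zero.
def summary(counts)
  counts.select { |_type, n| n > 0 }
        .map { |type, n| "#{n} #{n == 1 ? SINGULAR[type] : PLURAL[type]}" }
        .join(", ")
end

puts summary({ "highlight" => 3, "note" => 1, "bookmark" => 0 })
```

Using `join` avoids the trailing-separator cleanup that the string-concatenation version needs.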
@@ -0,0 +1,35 @@
+ require_relative "md_generator"
+
+ module Fyodor
+   class OutputWriter
+     def initialize(library, output_dir, config)
+       @library = library
+       @output_dir = output_dir
+       @output_dir.mkdir unless @output_dir.exist?
+       @config = config
+     end
+
+     def write_all
+       puts "\nWriting to #{@output_dir}..." unless @library.empty?
+       @library.each do |book|
+         content = MdGenerator.new(book, @config).content
+         File.open(path(book), "w") { |f| f.puts(content) }
+       end
+     end
+
+
+     private
+
+     def path(book)
+       path = @output_dir + "#{book.basename}.md"
+
+       i = 2
+       while(path.exist?)
+         path = @output_dir + "#{book.basename} - #{i}.md"
+         i += 1
+       end
+
+       path
+     end
+   end
+ end
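`OutputWriter#path` never overwrites an existing file: when the target name is taken, it appends ` - 2`, ` - 3`, and so on. The sketch below exercises the same loop in a temporary directory. `unique_path` and the file names are illustrative assumptions.

```ruby
require "pathname"
require "tmpdir"

# Return the first path of the form "<basename>.md", "<basename> - 2.md", ...
# that does not exist yet in dir (a Pathname).
def unique_path(dir, basename)
  path = dir + "#{basename}.md"
  i = 2
  while path.exist?
    path = dir + "#{basename} - #{i}.md"
    i += 1
  end
  path
end

# Write three files for the same book into a throwaway directory
# and collect the names that were chosen.
names = Dir.mktmpdir do |tmp|
  dir = Pathname.new(tmp)
  3.times.map do
    path = unique_path(dir, "Author - Title")
    path.write("placeholder\n")
    path.basename.to_s
  end
end

puts names.inspect
```

One consequence of this design is that running the tool twice with the same output directory produces numbered copies instead of refreshing the earlier files.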
@@ -0,0 +1,53 @@
+ require_relative "strings"
+
+ module Fyodor
+   class StatsPrinter
+     include Strings
+
+     def initialize(library)
+       @library = library
+     end
+
+     def print
+       num_books
+       rejected
+       types
+       pretty_output
+     end
+
+     private
+
+     def num_books
+       puts "=> #{@library.size} books found"
+     end
+
+     def types
+       ct = @library.count_types
+       PLURAL.each { |type, label| puts "#{label.capitalize.rjust(12)}: #{ct[type] || 0}" }
+       puts "-------------------"
+       puts "#{"TOTAL".rjust(12)}: #{ct.sum {|k, v| v}}\n\n"
+     end
+
+     def rejected
+       rejected = @library.rejected
+       puts "=> Ignored #{rejected[:empty]} empty and #{rejected[:dup]} duplicated entries."
+     end
+
+     def pretty_output
+       bad = @library.count_desc_unparsed
+       percent = (bad.to_f / @library.count_entries.to_f) * 100
+
+       if bad > 0
+         warn <<~EOF
+           We couldn't *improve* the output of #{bad} (#{percent.round(1)}%) entries.
+           Possible causes:
+           - Wrong strings in the config file (most likely).
+           - Your locale has some specificity the app is not aware of. Please open an issue.
+           #{"- Your clippings file has more than one locale." if percent != 100}
+         EOF
+       else
+         puts "ALL entries were correctly parsed."
+       end
+     end
+   end
+ end
@@ -0,0 +1,17 @@
+ require_relative "entry"
+
+ module Fyodor
+   module Strings
+     PLURAL = { Entry::TYPE[:highlight] => "highlights",
+                Entry::TYPE[:note] => "notes",
+                Entry::TYPE[:bookmark] => "bookmarks",
+                Entry::TYPE[:clip] => "clips",
+                nil => "unrecognized" }
+
+     SINGULAR = { Entry::TYPE[:highlight] => "highlight",
+                  Entry::TYPE[:note] => "note",
+                  Entry::TYPE[:bookmark] => "bookmark",
+                  Entry::TYPE[:clip] => "clip",
+                  nil => "unrecognized" }
+   end
+ end
data/lib/fyodor.rb ADDED
@@ -0,0 +1,3 @@
+ require "fyodor/clippings_parser"
+ require "fyodor/library"
+ require "fyodor/output_writer"