RubyGems - bye-flickr - Versions diffs - 1.0.1 - Mend

bye-flickr 1.0.1

Files changed (11)

checksums.yaml +7 -0
data/LICENSE +21 -0
data/README.md +101 -0
data/bin/bye-flickr +44 -0
data/lib/bye_flickr.rb +6 -0
data/lib/bye_flickr/app.rb +133 -0
data/lib/bye_flickr/auth.rb +42 -0
data/lib/bye_flickr/photo_downloader.rb +101 -0
data/lib/bye_flickr/response_to_json.rb +39 -0
data/lib/bye_flickr/version.rb +3 -0
metadata +127 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 80703e70104c8f9222c918040e506979ae025819ad6cd1192687c8f9555a67cc
+  data.tar.gz: c207aee6f4ca65c80e839b24622d18c7d1eaf2b512920b3f70de850cb1074882
+SHA512:
+  metadata.gz: f5ff0ed9afc666b67a859f6ca3772c523b154da300bd7dd710e017cfe8516520933adb94db1f2a85cbd842f6ad67c2905db09b242a97acd366b5d6c7c742879c
+  data.tar.gz: c32483cfd81b38ab3e883472fa74c838c0ddf1140c6a83004b5297f86afff50d58ac0b3be1a56ce9801b9e31857e9d666ddc785a4e4fbe3c86b25142bd27cd93

data/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2018 Jens Krämer
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,101 @@
+Bye, Flickr!
+============
+Simple app to download everything from your flickr account. Your photos will be
+put into a directory structure reflecting Flickr collections and sets. Metadata
+for collections, sets and photos will be stored as JSON files, as well as
+contacts and groups data.
+Installation
+------------
+You need Ruby. I used it with Ruby 2.5; 2.4 should work as well. Install the
+gem, then run `bye-flickr -h` for usage info.
+~~~~
+$ gem install bye-flickr
+Successfully installed bye-flickr-0.1.0
+1 gem installed
+$ bye-flickr -h
+usage: /home/jk/.gem/ruby/2.5.1/bin/bye-flickr [options]
+Required arguments (create API key and secret in the Flickr web interface):
+    -d, --dir     directory to store data
+    -k, --key     API key
+    -s, --secret  API secret
+Optional arguments, if you already have authorized the app:
+    --at          Access token
+    --as          Access token secret
+Other commands:
+    --version     print the version
+    -h, --help
+~~~~
+Usage
+-----
+First of all, head to your [Flickr
+account](https://www.flickr.com/services/apps/create/apply/) and create an API
+key. Choose non-commercial and pick any name you like for your 'App'. In the
+end you will get a key and a secret which are what you need for the `-k` and
+`-s` options. Pick a directory in a location with enough disk space and there you go:
+~~~~
+$ bye-flickr -d /space/photos -k lengthyAPIkey -s notsolongsecret
+token_rejected
+Open this url in your browser to complete the authentication process:
+https://api.flickr.com/services/oauth/authorize?oauth_token=some-token&perms=read
+Copy here the number given when you complete the process.
+~~~~
+Do as you're told and go to the URL, authorize the app, copy/paste the
+nine-digit number and hit Enter.
+~~~~
+179-386-583
+You are now authenticated as flickrUserName with token some-other-token and secret yetanothersecret.
+~~~~
+For subsequent runs you can take note of the access token and secret you just
+got and use them as values for the `--at` and `--as` command line options. This
+will save you from having to authorize the app through the web interface over
+and over again.
+To show it's working the app prints out a `.` for every photo downloaded, and
+also prints the name of the directory (collection/set) it's currently working
+on. Photos not belonging to any set are, surprise, put into a directory named
+`not in any set`.
+Depending on the size of your Flickr account and your bandwidth this may take a
+long time. Downloading 26GB from my personal account took a couple of hours on
+my Hetzner server.
+Caveats
+-------
+I built this because I wanted to download my photos, so naturally I cut some
+corners where I could. Two things that I'm aware of which might need improvement are:
+- Support Flickr's pagination. If you have sets with more than 500 photos in them
+  (or more than 500 photos that are not in any set) you will need that because
+  500 is the maximum number of photos you can get with a single API request. My
+  sets aren't that large so I skipped this.
+- There is no support for resuming an unfinished download; the app always
+  starts from scratch.
+Pull Requests welcome :)
+License
+-------
+MIT. See LICENSE for the text.
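
A note on the pagination caveat in the README above: the loop it asks for can be sketched without touching the Flickr API at all. The helper name `fetch_all_pages` and its block interface are hypothetical and not part of the gem; the sketch assumes the underlying API call accepts a page number alongside `per_page` and returns fewer than `per_page` items on the last page.

```ruby
# Collects every item from a paged source.
# `per_page` mirrors the 500-item limit mentioned in the README.
# The block receives a 1-based page number and the page size, and must
# return an Array (a hypothetical stand-in for a paged API call).
def fetch_all_pages(per_page: 500)
  items = []
  page = 1
  loop do
    batch = yield(page, per_page)
    items.concat(batch)
    # A short (or empty) page means there is nothing left to fetch.
    break if batch.size < per_page
    page += 1
  end
  items
end

# Fake source with 1,234 "photos" to exercise the loop without network access.
photos = (1..1234).to_a
all = fetch_all_pages do |page, per_page|
  photos[(page - 1) * per_page, per_page] || []
end
puts all.size
```

In the app, the block would wrap a call such as the `getPhotos` request in `download_set`, passing the page number through; that wiring is left out here because it cannot be checked offline.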

data/bin/bye-flickr ADDED

@@ -0,0 +1,44 @@
+#!/usr/bin/env ruby
+require 'bye_flickr'
+require 'slop'
+o = Slop::Options.new
+o.separator 'Required arguments (create API key and secret in the Flickr web interface):'
+o.string '-d', '--dir', 'directory to store data', required: true
+o.string '-k', '--key', 'API key', required: true
+o.string '-s', '--secret', 'API secret', required: true
+o.separator ''
+o.separator 'Optional arguments, if you already have authorized the app:'
+o.string '--at', 'Access token'
+o.string '--as', 'Access token secret'
+o.separator ''
+o.separator 'Other commands:'
+o.on '--version', 'print the version' do
+  puts ByeFlickr::VERSION
+  exit
+end
+o.on '-h', '--help' do
+  puts o
+  exit
+end
+begin
+  opts = Slop::Parser.new(o).parse ARGV
+rescue Slop::Error => e
+  # Report the parse error and usage on stderr and exit with a failure status
+  warn e.message
+  warn o.to_s
+  exit 1
+end
+FlickRaw.api_key = opts[:key]
+FlickRaw.shared_secret = opts[:secret]
+flickr.access_token = opts[:at] if opts[:at]
+flickr.access_secret = opts[:as] if opts[:as]
+ByeFlickr::App.new(dir: opts[:dir]).run

data/lib/bye_flickr.rb ADDED

@@ -0,0 +1,6 @@
+require 'flickraw'
+require 'bye_flickr/app'
+module ByeFlickr
+end

data/lib/bye_flickr/app.rb ADDED

@@ -0,0 +1,133 @@
+require 'fileutils'
+require 'json'
+require 'pathname'
+require 'bye_flickr/auth'
+require 'bye_flickr/photo_downloader'
+require 'bye_flickr/response_to_json'
+module ByeFlickr
+  class App
+    def initialize(dir: '.')
+      @basedir = Pathname(dir)
+      FileUtils.mkdir_p @basedir
+      @downloader = ByeFlickr::PhotoDownloader.new(@basedir)
+    end
+    # This code does not take into account pagination. 500 photos per set (the
+    # maximum supported per_page value) is enough for my purposes.
+    def run
+      user = Auth.call
+      @username = user[:username]
+      @id = user[:id]
+      exit if @username.nil? || @id.nil?
+      # Download photos that are not in any set
+      download_not_in_set
+      # Get collection info
+      @collections = flickr.collections.getTree
+      write_info(
+        @collections, path('collections.json')
+      )
+      # Get sets info
+      @sets = Hash[
+        flickr.photosets.getList(per_page: 500).photoset.map{|s|[s.id, s]}
+      ]
+      # Download collections and their included sets, removing downloaded sets
+      # from the @sets list
+      @collections.collection.each do |collection|
+        download_collection collection
+      end
+      # download the remaining sets, which aren't in any collection
+      @sets.values.each do |set|
+        download_set set, @basedir
+      end
+      # Fetch contacts and groups meta data
+      write_info(
+        flickr.contacts.getList, path('contacts.json')
+      )
+      write_info(
+        flickr.people.getGroups(user_id: @id), path('groups.json')
+      )
+      # wait for photo downloads to finish
+      @downloader.wait
+    end
+    def path(name, base = @basedir)
+      base.join(name.gsub(%r{/}, '_'))
+    end
+    def subdir(name, base = @basedir)
+      path(name, base).tap do |dir|
+        FileUtils.mkdir_p dir
+      end
+    end
+    def download_not_in_set
+      dir = subdir 'not in any set'
+      download_photos_to_dir(
+        flickr.photos.getNotInSet(extras: 'url_o', per_page: 500),
+        dir
+      )
+    end
+    # can collections be nested? If so, this code ignores them.
+    def download_collection(collection)
+      dir = subdir collection.title # subdir already creates the directory
+      write_info collection, path("#{collection.title}.json")
+      collection.set.each do |set|
+        download_set set, dir
+        @sets.delete set.id # remove this set from the lists of sets to download
+      end
+    end
+    def download_set(set, basedir)
+      dir = subdir set.title, basedir
+      download_photos_to_dir(
+        flickr.photosets.getPhotos(photoset_id: set.id,
+                                   per_page: 500,
+                                   user_id: @id,
+                                   extras: 'url_o').photo,
+        dir
+      )
+      write_info(
+        flickr.photosets.getInfo(photoset_id: set.id),
+        path("#{set.title}.json", basedir)
+      )
+    rescue Net::OpenTimeout
+      puts "#{$!} - retrying to download Set #{set.title}"
+      retry
+    end
+    def write_info(info, path)
+      File.write(path, ResponseToJson.(info), mode: 'wb')
+    end
+    def download_photos_to_dir(photos, dir)
+      puts dir
+      photos.each do |photo|
+        # Name the file after the photo id (the title is not used for file names)
+        name = "#{photo.id}.jpg"
+        @downloader.add_image photo.url_o, path(name, dir).to_s
+      end
+      photos.each do |photo|
+        write_info(
+          flickr.photos.getInfo(photo_id: photo.id),
+          path("#{photo.id}.json", dir)
+        )
+      end
+    end
+  end
+end
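
`download_set` above retries on `Net::OpenTimeout` without an upper bound, so a persistent outage would loop forever. A bounded variant could look like the following sketch. `with_retries` and its `attempts` parameter are hypothetical names, not part of the gem.

```ruby
require 'net/http' # defines Net::OpenTimeout

# Runs the block, retrying on Net::OpenTimeout until `attempts` tries
# have been made in total, then re-raises the last error.
# `with_retries` is a hypothetical helper, not part of the gem.
def with_retries(attempts: 3)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Net::OpenTimeout
    retry if tries < attempts
    raise
  end
end

# Simulate two timeouts followed by a success.
calls = 0
result = with_retries do
  calls += 1
  raise Net::OpenTimeout, 'simulated timeout' if calls < 3
  :ok
end
puts "#{result} after #{calls} calls"
```

Re-raising after the last attempt leaves it to the caller to decide whether a failed set should abort the run or be recorded and skipped.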

data/lib/bye_flickr/auth.rb ADDED

@@ -0,0 +1,42 @@
+module ByeFlickr
+  class Auth
+    def self.call
+      new.call
+    end
+    def call
+      unless test_login
+        request_auth
+      end
+      { username: @username, id: @id}
+    end
+    def test_login
+      login = flickr.test.login
+      @username = login.username
+      @id = login.id
+      true
+    rescue
+      puts $!
+      false
+    end
+    def request_auth
+      token = flickr.get_request_token
+      auth_url = flickr.get_authorize_url(token['oauth_token'], perms: 'read')
+      puts "Open this url in your browser to complete the authentication process:\n#{auth_url}"
+      puts "Copy here the number given when you complete the process."
+      verify = $stdin.gets.strip
+      flickr.get_access_token(token['oauth_token'], token['oauth_token_secret'], verify)
+      if test_login
+        puts "You are now authenticated as #{@username} with token #{flickr.access_token} and secret #{flickr.access_secret}"
+      else
+        puts "Login failed."
+      end
+    end
+  end
+end

data/lib/bye_flickr/photo_downloader.rb ADDED

@@ -0,0 +1,101 @@
+require 'concurrent'
+require 'fileutils'
+require 'net/http/persistent'
+require 'tempfile'
+require 'thread'
+module ByeFlickr
+  class PhotoDownloader
+    Download = Struct.new(:url, :path, :tries)
+    attr_reader :errors
+    def initialize(dir, workers: 2)
+      @basedir = dir
+      @lock = Mutex.new
+      @http = Net::HTTP::Persistent.new
+      @images = Concurrent::Array.new
+      @errors = Concurrent::Array.new
+      @tempdir = @basedir.join 'tmp'
+      FileUtils.mkdir_p @tempdir
+      @running = true
+      @workers = 1.upto(workers).map do |i|
+        Thread.new{ Worker.new(self).run }
+      end
+    end
+    def wait
+      @running = false
+      @workers.each{|t|t.join}
+      if @errors.any?
+        File.write(@basedir.join('errors.json'), @errors.to_json, mode: 'wb')
+      end
+    end
+    def add_image(url, path)
+      @images << Download.new(url, path, 0)
+    end
+    def running?
+      !!@running
+    end
+    def next
+      @images.shift
+    end
+    def add_failure(dl)
+      @errors << dl
+    end
+    def download(dl)
+      response = @http.request dl.url
+      f = Tempfile.create('bye-flickr-download', @tempdir)
+      f << response.body
+      f.close
+      @lock.synchronize do
+        i = 0
+        path = dl.path
+        while File.readable?(path)
+          i = i+1
+          path = "#{dl.path.sub(/\.jpg$/i, '')}_#{i}.jpg"
+        end
+        FileUtils.mv f, path
+      end
+    rescue
+      puts "#{$!}:\n#{dl.url} => #{dl.path}"
+      dl.tries += 1
+      if dl.tries > 2
+        puts "giving up on this one"
+        add_failure dl
+      else
+        @images << dl
+      end
+    end
+    class Worker
+      def initialize(downloader)
+        @downloader = downloader
+      end
+      def run
+        while file = @downloader.next or @downloader.running? do
+          if file
+            @downloader.download(file)
+            print '.'
+          else
+            # queue is empty but we're still running, wait a bit and try again
+            sleep 1
+          end
+        end
+      end
+    end
+  end
+end
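
The collision handling inside `download` above (append `_1`, `_2`, ... until the target name is free) can be looked at in isolation. In this sketch the existence check is passed in as a callable so the logic can be exercised without touching the file system. `unique_path` is a hypothetical name; the gem performs the check inline with `File.readable?` while holding its mutex.

```ruby
# Returns `path` if it is free, otherwise the first "<base>_<n>.jpg"
# variant for which `taken` returns false.
# `taken` is any callable answering "is this path already in use?".
def unique_path(path, taken)
  candidate = path
  i = 0
  while taken.call(candidate)
    i += 1
    candidate = "#{path.sub(/\.jpg\z/i, '')}_#{i}.jpg"
  end
  candidate
end

# Pretend these two files already exist on disk.
existing = ['set/1.jpg', 'set/1_1.jpg']
in_use = ->(p) { existing.include?(p) }

puts unique_path('set/1.jpg', in_use) # => set/1_2.jpg
puts unique_path('set/2.jpg', in_use) # => set/2.jpg
```

With real files, `File.method(:exist?)` could be passed as `taken`. The check and the subsequent move still need to happen under one lock, as the gem does, or two workers may pick the same name.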

data/lib/bye_flickr/response_to_json.rb ADDED

@@ -0,0 +1,39 @@
+module ByeFlickr
+  class ResponseToJson
+    def initialize(r)
+      @r = r
+    end
+    def self.call(r)
+      new(r).serialize.to_json
+    end
+    def serialize(o = @r)
+      case o
+      when FlickRaw::Response
+        serialize_response o
+      when FlickRaw::ResponseList, Enumerable
+        serialize_response_list o
+      else
+        o
+      end
+    end
+    private
+    def serialize_response_list(list)
+      list.to_a.map{|o|serialize(o)}
+    end
+    def serialize_response(r)
+      Hash.new.tap do |hsh|
+        r.to_hash.each do |key, value|
+          hsh[key] = serialize value
+        end
+      end
+    end
+  end
+end
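
The recursion in `ResponseToJson` above can be tried out without FlickRaw installed. The sketch below swaps in a hypothetical `FakeResponse` class for `FlickRaw::Response`, assuming only that a response exposes `#to_hash`; the string keys are likewise an assumption. List handling is reduced to plain arrays.

```ruby
require 'json'

# Hypothetical stand-in for a FlickRaw response: anything exposing #to_hash.
class FakeResponse
  def initialize(attrs)
    @attrs = attrs
  end

  def to_hash
    @attrs
  end
end

# Recursively turns nested response objects and arrays into plain
# hashes and arrays that JSON can serialize.
def serialize(obj)
  case obj
  when FakeResponse
    obj.to_hash.each_with_object({}) { |(key, value), hsh| hsh[key] = serialize(value) }
  when Array
    obj.map { |item| serialize(item) }
  else
    obj
  end
end

photo = FakeResponse.new(
  'id' => '42',
  'tags' => [FakeResponse.new('raw' => 'sunset'), FakeResponse.new('raw' => 'beach')]
)
puts serialize(photo).to_json
# => {"id":"42","tags":[{"raw":"sunset"},{"raw":"beach"}]}
```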

data/lib/bye_flickr/version.rb ADDED

@@ -0,0 +1,3 @@
+module ByeFlickr
+  VERSION = '1.0.1'
+end

metadata ADDED

@@ -0,0 +1,127 @@
+--- !ruby/object:Gem::Specification
+name: bye-flickr
+version: !ruby/object:Gem::Version
+  version: 1.0.1
+platform: ruby
+authors:
+- Jens Krämer
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2018-04-27 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: flickraw
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.9'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.9'
+- !ruby/object:Gem::Dependency
+  name: net-http-persistent
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.0'
+- !ruby/object:Gem::Dependency
+  name: concurrent-ruby
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.0'
+- !ruby/object:Gem::Dependency
+  name: slop
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '4.6'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '4.6'
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.16'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.16'
+description: This gem will download all photos and as much metadata as possible from
+  your Flickr account. Metadata is stored in json files, one file per photo. Collection
+  / Set metadata and your group subscriptions and contacts are stored as JSON files,
+  as well.
+email:
+- jk@jkraemer.net
+executables:
+- bye-flickr
+extensions: []
+extra_rdoc_files: []
+files:
+- LICENSE
+- README.md
+- bin/bye-flickr
+- lib/bye_flickr.rb
+- lib/bye_flickr/app.rb
+- lib/bye_flickr/auth.rb
+- lib/bye_flickr/photo_downloader.rb
+- lib/bye_flickr/response_to_json.rb
+- lib/bye_flickr/version.rb
+homepage: http://github.com/jkraemer/bye_flickr
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '2.0'
+requirements: []
+rubyforge_project: bye-flickr
+rubygems_version: 2.7.6
+signing_key:
+specification_version: 4
+summary: Download all photos and metadata from your Flickr account.
+test_files: []