tumblr_cleanr 0.0.1

Files changed (4)

data/README ADDED

@@ -0,0 +1,7 @@
+== TumblrCleanr
+Delete all posts on your tumblr tumblelog.
+* To run 'ruby bin/tumblr_cleanr.rb'
+Documentation is in doc/index.html

data/bin/tumblr_cleanr.rb ADDED

@@ -0,0 +1,180 @@
+#See TumblrCleanr class for full documentation
+require 'rubygems'
+# Note: not using Tumblr API because it does not support delete
+# http://ruby-tumblr.rubyforge.org/
+require 'mechanize' # http://mechanize.rubyforge.org/mechanize
+require 'net/http' # http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html
+require 'uri'
+require 'logger'
+#== TumblrCleanr - clean/reset your tumblr by deleting all posts
+#Author:: engtech (http://InternetDuctTape.com, http://rubeh.tumblr.com)
+#Copyright:: Copyright (c) 2008 engtech
+#License:: Creative Commons Attribution-Noncommercial 2.5 License
+#
+#Tumblr is rapidly becoming my favorite hosted blogging platform (more so than Blogger/WordPress.com) because of all the things it does right:
+#
+#- RSS feed importing
+#- free domain name support
+#- free CSS/theme support
+#- Google Analytics support
+#- keeping it simple
+#
+#However, there's one feature that's missing: <b>how do you delete your Tumblr?</b> At some point you might want to destroy all traces of your tumblr (privacy concerns, or you want to use it for something else) and there isn't an option to do that -- other than clicking the delete button on every individual post. I wanted to repurpose a tumblr I had been using for feed aggregation and it had over 18,000 posts. That's a lot of clicks.
+#
+#Enter the TumblrCleanr. Provide it with your tumblr domain name, email address and password and it will delete up to the latest 3000 posts at a time. You can keep running it until your entire tumblr is clean as a whistle.
+#
+#== Privacy Concerns
+#
+#TumblrCleanr does not store your login information anywhere and only uses it to communicate with tumblr.com. Every time you run the program you will have to re-enter your login details.
+#
+#== Why Not Create a New Tumblr?
+#
+#That's true, it's much easier to create a new tumblr account with a different email address than it is to "reset" your existing Tumblr. You can even just change the tumblr domain name if you want to "free up" your good domain for something else. I created this more for my own learning process, because I wanted to try WWW::Mechanize, rdoc, rake and rubyscript2exe for the first time and give myself some more experience coding in Ruby.
+#
+#== License
+#
+#This work is licensed under the Creative Commons
+#Attribution-Noncommercial 2.5 License.
+#
+#To view a copy of this license, visit
+#  http://creativecommons.org/licenses/by-nc/2.5/ or
+#send a letter to
+# Creative Commons, 543 Howard Street, 5th Floor,
+# San Francisco, California, 94105, USA.
+#
+class TumblrCleanr
+  # tumblr domain name is set by query_user method
+  @domain
+  # email address is set by query_user method
+  @email
+  # password is set by query_user method
+  @password
+  # WWW::Mechanize agent is created in login method
+  @agent
+  # Array of tumblr postids is set by parse_archive method
+  @postids
+  #Initializing TumblrCleanr will start an interactive prompt asking for your tumblr domain name, email address and password.
+  #domain name:: the domain name used to access your tumblr, without the http:// prefix (eg: popstar.tumblr.com)
+  #email address:: the email address used to log in to tumblr (eg: brittney.spears@gmail.com)
+  #password:: the password for your tumblr account, <b>not the password for your email address</b>
+  #
+  #When the program finishes the user is prompted to press enter to quit.
+  #
+  def initialize
+    begin
+      puts "Welcome to TumblrCleanr by http://InternetDuctTape.com"
+      query_loop
+      parse_archive
+      clean
+      print "Success"
+    rescue Interrupt => e
+      puts "User pressed Ctrl-C"
+    rescue Exception => e
+      puts "#{e.class}: #{e.message}"
+      puts e.backtrace.join("\n")
+    end
+    puts "Press enter to exit..."
+    gets
+  end
+  private
+  #Keeps asking the user for their login information until they enter something that works.
+  #Press Ctrl-C to exit the loop (and the program).
+  #
+  def query_loop
+    login_success = false
+    while not login_success do
+      begin
+        query_user
+        login
+        login_success = true
+      rescue Interrupt => e
+        raise Interrupt, "user abort"
+      rescue Exception => e
+        puts "#{e.class}: #{e.message}"
+        puts e.backtrace.join("\n")
+        login_success = false
+        puts ""
+        puts "Unable to log in with #{@email} on #{@domain}"
+        puts "Type 'Ctrl-C' to abort"
+      end
+    end
+  end
+  #Asks the user for @domain, @email, and @password
+  #
+  #Side Effect:: sets @domain, @email and @password
+  #
+  def query_user
+    puts ""
+    puts "Tumblr domain (eg: popstar.tumblr.com): "
+    @domain = gets.chomp
+    puts "Tumblr email address (eg: brittney.spears@gmail.com): "
+    @email = gets.chomp
+    puts "Tumblr password (eg: kfedsux): "
+    @password = gets.chomp
+  end
+  #Creates a WWW::Mechanize @agent and uses it to verify that @domain is correct and
+  #that the @email/@password combination logs in.
+  #
+  #Side Effect:: creates @agent
+  #
+  def login
+    puts "Trying to connect to tumblr"
+    @agent = WWW::Mechanize.new do |a|
+#      a.log = Logger.new("mech.log")
+#      a.log.level = Logger::DEBUG
+      a.redirect_ok = true
+      a.user_agent_alias = 'Windows Mozilla'
+    end
+    # Is the domain any good? This will raise 404 error if bad.
+    @agent.get("http://#{@domain}")
+    # Can the user login?
+    page = @agent.get('http://www.tumblr.com/login')
+    login_form = page.forms.first
+    login_form.email = @email
+    login_form.password = @password
+    result = login_form.submit(login_form.buttons.first)
+    raise "Bad username or password" unless "Logging in..." == result.title
+  end
+  #The Tumblr API does not provide a bandwidth-efficient means of getting a list of all postids
+  #without getting the entire posts as well.
+  #This is a bad hack that uses the /archive page to get a list of up to 3000 post_ids at a time.
+  #It uses Net::HTTP because the post_ids are stored in JavaScript, so Mechanize can't access them.
+  #eg: location.href='http://rubeh.tumblr.com/post/22655521
+  #
+  #Side Effect:: sets up @postids as an array of postids (as strings)
+  #
+  def parse_archive
+    url = URI.parse("http://#{@domain}/archive")
+    req = Net::HTTP::Get.new(url.path)
+    res = Net::HTTP.start(url.host, url.port) {|http| http.request(req) }
+    # with the body of the archive page, split it into chunks that have one postid each.
+    # use a regular expression to extract the postid
+    @postids = res.body.split("onclick").map{|chunk| chunk[/location\.href='http:\/\/[^\/]+\/post\/(\d+)/, 1] }.compact
+  end
+  # Using the list of @postids from parse_archive, iterate through them and send an HTTP POST to the /delete action for each id.
+  # It does not check that the delete occurs. It intentionally asks to redirect to a 404 to reduce
+  # bandwidth.
+  #
+  def clean
+    total_ids = @postids.size
+    @postids.each_with_index do |postid, i|
+      puts "\nDeleted #{i}/#{total_ids}" if i % 25 == 0
+      print "."
+      @agent.post("http://www.tumblr.com/delete", 'id' => postid, 'redirect_to' => '/404') rescue nil
+      # usually tumblr redirects to the dashboard after a delete happens
+      # I'm intentionally creating a 404 because it's much less bandwidth intensive
+    end
+    puts
+  end
+end
+TumblrCleanr.new # appended by Rakefile
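
The post id extraction done by parse_archive in the file above can be tried in isolation. The following is only a sketch: the sample markup is hypothetical (the real /archive page layout is not shown in this diff), and the helper name extract_post_ids is made up for illustration. It uses String#scan rather than the split-and-map approach of the original, which gives the same ids when each onclick chunk holds one post link.

```ruby
# Hypothetical sample of archive-page markup; the real page layout is an assumption.
SAMPLE_ARCHIVE = <<~HTML
  <div onclick="location.href='http://rubeh.tumblr.com/post/22655521'">first</div>
  <div onclick="location.href='http://rubeh.tumblr.com/post/22655522'">second</div>
  <div onclick="alert('not a post')">noise</div>
HTML

# Same pattern as parse_archive, with the dot in "location.href" escaped.
POST_ID_PATTERN = %r{location\.href='http://[^/]+/post/(\d+)}

# Returns every post id found in the given HTML, as an array of strings.
# (extract_post_ids is an illustrative name, not part of the gem.)
def extract_post_ids(html)
  html.scan(POST_ID_PATTERN).flatten
end

puts extract_post_ids(SAMPLE_ARCHIVE).inspect
```

Chunks without a post link simply produce no match, so no nil filtering step is needed in this variant.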

data/lib/tumblr_cleanr.rb ADDED

@@ -0,0 +1,178 @@
+#See TumblrCleanr class for full documentation
+require 'rubygems'
+# Note: not using Tumblr API because it does not support delete
+# http://ruby-tumblr.rubyforge.org/
+require 'mechanize' # http://mechanize.rubyforge.org/mechanize
+require 'net/http' # http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html
+require 'uri'
+require 'logger'
+#== TumblrCleanr - clean/reset your tumblr by deleting all posts
+#Author:: engtech (http://InternetDuctTape.com, http://rubeh.tumblr.com)
+#Copyright:: Copyright (c) 2008 engtech
+#License:: Creative Commons Attribution-Noncommercial 2.5 License
+#
+#Tumblr is rapidly becoming my favorite hosted blogging platform (more so than Blogger/WordPress.com) because of all the things it does right:
+#
+#- RSS feed importing
+#- free domain name support
+#- free CSS/theme support
+#- Google Analytics support
+#- keeping it simple
+#
+#However, there's one feature that's missing: <b>how do you delete your Tumblr?</b> At some point you might want to destroy all traces of your tumblr (privacy concerns, or you want to use it for something else) and there isn't an option to do that -- other than clicking the delete button on every individual post. I wanted to repurpose a tumblr I had been using for feed aggregation and it had over 18,000 posts. That's a lot of clicks.
+#
+#Enter the TumblrCleanr. Provide it with your tumblr domain name, email address and password and it will delete up to the latest 3000 posts at a time. You can keep running it until your entire tumblr is clean as a whistle.
+#
+#== Privacy Concerns
+#
+#TumblrCleanr does not store your login information anywhere and only uses it to communicate with tumblr.com. Every time you run the program you will have to re-enter your login details.
+#
+#== Why Not Create a New Tumblr?
+#
+#That's true, it's much easier to create a new tumblr account with a different email address than it is to "reset" your existing Tumblr. You can even just change the tumblr domain name if you want to "free up" your good domain for something else. I created this more for my own learning process, because I wanted to try WWW::Mechanize, rdoc, rake and rubyscript2exe for the first time and give myself some more experience coding in Ruby.
+#
+#== License
+#
+#This work is licensed under the Creative Commons
+#Attribution-Noncommercial 2.5 License.
+#
+#To view a copy of this license, visit
+#  http://creativecommons.org/licenses/by-nc/2.5/ or
+#send a letter to
+# Creative Commons, 543 Howard Street, 5th Floor,
+# San Francisco, California, 94105, USA.
+#
+class TumblrCleanr
+  # tumblr domain name is set by query_user method
+  @domain
+  # email address is set by query_user method
+  @email
+  # password is set by query_user method
+  @password
+  # WWW::Mechanize agent is created in login method
+  @agent
+  # Array of tumblr postids is set by parse_archive method
+  @postids
+  #Initializing TumblrCleanr will start an interactive prompt asking for your tumblr domain name, email address and password.
+  #domain name:: the domain name used to access your tumblr, without the http:// prefix (eg: popstar.tumblr.com)
+  #email address:: the email address used to log in to tumblr (eg: brittney.spears@gmail.com)
+  #password:: the password for your tumblr account, <b>not the password for your email address</b>
+  #
+  #When the program finishes the user is prompted to press enter to quit.
+  #
+  def initialize
+    begin
+      puts "Welcome to TumblrCleanr by http://InternetDuctTape.com"
+      query_loop
+      parse_archive
+      clean
+      print "Success"
+    rescue Interrupt => e
+      puts "User pressed Ctrl-C"
+    rescue Exception => e
+      puts "#{e.class}: #{e.message}"
+      puts e.backtrace.join("\n")
+    end
+    puts "Press enter to exit..."
+    gets
+  end
+  private
+  #Keeps asking the user for their login information until they enter something that works.
+  #Press Ctrl-C to exit the loop (and the program).
+  #
+  def query_loop
+    login_success = false
+    while not login_success do
+      begin
+        query_user
+        login
+        login_success = true
+      rescue Interrupt => e
+        raise Interrupt, "user abort"
+      rescue Exception => e
+        puts "#{e.class}: #{e.message}"
+        puts e.backtrace.join("\n")
+        login_success = false
+        puts ""
+        puts "Unable to log in with #{@email} on #{@domain}"
+        puts "Type 'Ctrl-C' to abort"
+      end
+    end
+  end
+  #Asks the user for @domain, @email, and @password
+  #
+  #Side Effect:: sets @domain, @email and @password
+  #
+  def query_user
+    puts ""
+    puts "Tumblr domain (eg: popstar.tumblr.com): "
+    @domain = gets.chomp
+    puts "Tumblr email address (eg: brittney.spears@gmail.com): "
+    @email = gets.chomp
+    puts "Tumblr password (eg: kfedsux): "
+    @password = gets.chomp
+  end
+  #Creates a WWW::Mechanize @agent and uses it to verify that @domain is correct and
+  #that the @email/@password combination logs in.
+  #
+  #Side Effect:: creates @agent
+  #
+  def login
+    puts "Trying to connect to tumblr"
+    @agent = WWW::Mechanize.new do |a|
+#      a.log = Logger.new("mech.log")
+#      a.log.level = Logger::DEBUG
+      a.redirect_ok = true
+      a.user_agent_alias = 'Windows Mozilla'
+    end
+    # Is the domain any good? This will raise 404 error if bad.
+    @agent.get("http://#{@domain}")
+    # Can the user login?
+    page = @agent.get('http://www.tumblr.com/login')
+    login_form = page.forms.first
+    login_form.email = @email
+    login_form.password = @password
+    result = login_form.submit(login_form.buttons.first)
+    raise "Bad username or password" unless "Logging in..." == result.title
+  end
+  #The Tumblr API does not provide a bandwidth-efficient means of getting a list of all postids
+  #without getting the entire posts as well.
+  #This is a bad hack that uses the /archive page to get a list of up to 3000 post_ids at a time.
+  #It uses Net::HTTP because the post_ids are stored in JavaScript, so Mechanize can't access them.
+  #eg: location.href='http://rubeh.tumblr.com/post/22655521
+  #
+  #Side Effect:: sets up @postids as an array of postids (as strings)
+  #
+  def parse_archive
+    url = URI.parse("http://#{@domain}/archive")
+    req = Net::HTTP::Get.new(url.path)
+    res = Net::HTTP.start(url.host, url.port) {|http| http.request(req) }
+    # with the body of the archive page, split it into chunks that have one postid each.
+    # use a regular expression to extract the postid
+    @postids = res.body.split("onclick").map{|chunk| chunk[/location\.href='http:\/\/[^\/]+\/post\/(\d+)/, 1] }.compact
+  end
+  # Using the list of @postids from parse_archive, iterate through them and send an HTTP POST to the /delete action for each id.
+  # It does not check that the delete occurs. It intentionally asks to redirect to a 404 to reduce
+  # bandwidth.
+  #
+  def clean
+    total_ids = @postids.size
+    @postids.each_with_index do |postid, i|
+      puts "\nDeleted #{i}/#{total_ids}" if i % 25 == 0
+      print "."
+      @agent.post("http://www.tumblr.com/delete", 'id' => postid, 'redirect_to' => '/404') rescue nil
+      # usually tumblr redirects to the dashboard after a delete happens
+      # I'm intentionally creating a 404 because it's much less bandwidth intensive
+    end
+    puts
+  end
+end
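
The query_loop method above retries until login succeeds and re-raises Interrupt by hand because it rescues Exception. A possible variation, shown here only as a sketch, rescues StandardError instead, so Ctrl-C propagates on its own, and caps the number of attempts. The helper name with_retries and the attempt limit are assumptions for illustration and are not part of the gem.

```ruby
# Runs the block until it returns without raising, at most max_attempts times.
# Only StandardError is rescued, so Interrupt (Ctrl-C) is never swallowed.
# (with_retries and max_attempts are illustrative, not part of the gem.)
def with_retries(max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError => e
    puts "#{e.class}: #{e.message}"
    retry if attempts < max_attempts
    raise # give up and let the caller see the last error
  end
end

# Simulated login that fails twice before succeeding.
result = with_retries(3) do |attempt|
  raise "Bad username or password" if attempt < 3
  "logged in on attempt #{attempt}"
end
puts result
```

In the gem itself the block body would be the query_user and login calls; here a simulated failure stands in for them so the sketch runs offline.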

metadata ADDED

@@ -0,0 +1,48 @@
+--- !ruby/object:Gem::Specification
+rubygems_version: 0.9.4
+specification_version: 1
+name: tumblr_cleanr
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+date: 2008-02-25 00:00:00 -05:00
+summary: Script for deleting your tumblr account
+require_paths:
+- lib
+email: engtechnology+tumblrcleanr@gmail.com
+homepage:
+rubyforge_project:
+description:
+autorequire: tumblr_cleanr
+default_executable:
+bindir: bin
+has_rdoc: true
+required_ruby_version: !ruby/object:Gem::Version::Requirement
+  requirements:
+  - - ">"
+    - !ruby/object:Gem::Version
+      version: 0.0.0
+  version:
+platform: ruby
+signing_key:
+cert_chain:
+post_install_message:
+authors:
+- engtech
+files:
+- bin/tumblr_cleanr.rb
+- lib/tumblr_cleanr.rb
+- README
+test_files: []
+rdoc_options: []
+extra_rdoc_files:
+- README
+executables: []
+extensions: []
+requirements: []
+dependencies: []
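
For comparison, the same fields can be written in the block form of Gem::Specification. This is a sketch rather than the author's actual gemspec, which is not part of this diff. The mechanize dependency is an assumption: the metadata above lists no dependencies, yet both Ruby files require mechanize at load time.

```ruby
require 'rubygems'

# Block-form sketch of the metadata above. Values are copied from the YAML;
# the 'mechanize' dependency is an assumption added for illustration.
spec = Gem::Specification.new do |s|
  s.name             = 'tumblr_cleanr'
  s.version          = '0.0.1'
  s.summary          = 'Script for deleting your tumblr account'
  s.authors          = ['engtech']
  s.email            = 'engtechnology+tumblrcleanr@gmail.com'
  s.files            = ['bin/tumblr_cleanr.rb', 'lib/tumblr_cleanr.rb', 'README']
  s.require_paths    = ['lib']
  s.bindir           = 'bin'
  s.extra_rdoc_files = ['README']
  s.add_dependency 'mechanize' # assumed; not declared in the original metadata
end

puts spec.name
puts spec.version
```

Declaring the dependency would let RubyGems install mechanize alongside the gem instead of failing at the first require.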