refinerycms-wordpress-import 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -2,19 +2,31 @@
2
2
 
3
3
  This litte project is an importer for WordPress XML dumps into refinerycms(-blog).
4
4
 
5
- So far, only blog-relevant data gets imported, I'm working on the cms pages part.
6
-
7
5
  You can find the source code on github: https://github.com/mremolt/refinerycms-wordpress-import
8
6
 
7
+ Keep in mind, this gem imports blog posts and pages, NOT the media files, as they are not
8
+ part of the XML dump! You have to manually readd them to Refinery.
9
+
10
+ The same goes for links to other pages on your site. WordPress exports them just as <a>-Tags.
11
+ If your site (blog) structure uses new urls, the links WILL break! For example, if you used
12
+ the popular WP blog url structure "YYYY-MM/slug", be warned that Refinery just uses "blog/slug".
13
+ So your inner site links will point to the old WP url.
14
+
15
+
9
16
  == Prerequisites
10
17
 
11
18
  As refinerycms-wordpress-import is an addon for RefineryCMS, is shares the prerequisites with it.
12
19
  So you'll first need a running installation of refinerycms and refinerycms-blog. Make sure
13
20
  the site is running, all migrations are run and you created the first refinery user.
14
21
 
22
+
15
23
  == Installation
16
24
 
17
- As there is no official release out yet, just add this repos to your projects Gemfile:
25
+ Just add the gem to your projects Gemfile:
26
+
27
+ gem 'refinerycms-wordpress-import'
28
+
29
+ Or if you want to stay on the bleeding edge:
18
30
 
19
31
  gem 'refinerycms-wordpress-import', :git => 'git://github.com/mremolt/refinerycms-wordpress-import.git'
20
32
 
@@ -22,9 +34,10 @@ and run
22
34
 
23
35
  bundle
24
36
 
37
+
25
38
  == Usage
26
39
 
27
- Importing the XML dump is done via 3 rake tasks:
40
+ Importing the XML dump is done via rake tasks:
28
41
 
29
42
  rake wordpress:reset_blog
30
43
 
@@ -49,6 +62,41 @@ The task will then skip all posts that are not published.
49
62
 
50
63
  This one combines the two previous tasks.
51
64
 
65
+ If you also want to import the cms part of WordPress, three more rake tasks manage
66
+ the import into RefineryCMS Pages:
67
+
68
+ rake wordpress:reset_pages
69
+
70
+ This task deletes all data from the cms tables, ensuring a clean import. Otherwise existing
71
+ pages could break the import because of duplicate IDs.
72
+
73
+ rake wordpress:import_pages[file_name]
74
+
75
+ This task imports all the WordPress pages into Refinery. The page structure (parent - child)
76
+ is preserved.
77
+
78
+ If you want to skip the draft pages, add the ONLY_PUBLISHED parameter to this task,
79
+ just like with wordpress:import_blog.
80
+
81
+ rake wordpress:import_pages[file_name] ONLY_PUBLISHED=true
82
+
83
+ If you want to clean the tables and import in one task:
84
+
85
+ rake wordpress:reset_and_import_pages[file_name]
86
+
87
+
88
+ == Usage on ZSH
89
+
90
+ One more hint for users of zsh (like myself):
91
+
92
+ The square brackets following the rake task need to be escaped on zsh, as they have a
93
+ special meaning there. So the syntax is:
94
+
95
+ rake wordpress:reset_and_import_blog\[file_name\]
96
+
97
+ Ugly, but it works. This is the case for all rake tasks by the way, not just mine.
98
+
99
+
52
100
  == Feedback
53
101
 
54
102
  This is still a very new gem. It manages to import my own blog and a standard WordPress 3.1 dump with some sample posts.
@@ -5,7 +5,8 @@ namespace :wordpress do
5
5
  task :reset_blog do
6
6
  Rake::Task["environment"].invoke
7
7
 
8
- %w(taggings tags blog_comments blog_categories blog_categories_blog_posts blog_posts).each do |table_name|
8
+ %w(taggings tags blog_comments blog_categories blog_categories_blog_posts
9
+ blog_posts).each do |table_name|
9
10
  p "Truncating #{table_name} ..."
10
11
  ActiveRecord::Base.connection.execute "DELETE FROM #{table_name}"
11
12
  end
@@ -22,12 +23,13 @@ namespace :wordpress do
22
23
  only_published = ENV['ONLY_PUBLISHED'] == 'true' ? true : false
23
24
  dump.posts(only_published).each(&:to_refinery)
24
25
 
26
+ Refinery::WordPress::Post.create_blog_page_if_necessary
27
+
25
28
  ENV["MODEL"] = 'BlogPost'
26
29
  Rake::Task["friendly_id:redo_slugs"].invoke
27
30
  ENV.delete("MODEL")
28
31
  end
29
32
 
30
-
31
33
  desc "reset blog tables and then import blog data from a Refinery::WordPress XML dump"
32
34
  task :reset_and_import_blog, :file_name do |task, params|
33
35
  Rake::Task["environment"].invoke
@@ -35,4 +37,45 @@ namespace :wordpress do
35
37
  Rake::Task["wordpress:import_blog"].invoke(params[:file_name])
36
38
  end
37
39
 
40
+
41
+ desc "Reset the cms relevant tables for a clean import"
42
+ task :reset_pages do
43
+ Rake::Task["environment"].invoke
44
+
45
+ %w(page_part_translations page_translations page_parts pages).each do |table_name|
46
+ p "Truncating #{table_name} ..."
47
+ ActiveRecord::Base.connection.execute "DELETE FROM #{table_name}"
48
+ end
49
+ end
50
+
51
+ desc "import cms data from a Refinery::WordPress XML dump"
52
+ task :import_pages, :file_name do |task, params|
53
+ Rake::Task["environment"].invoke
54
+ dump = Refinery::WordPress::Dump.new(params[:file_name])
55
+
56
+ only_published = ENV['ONLY_PUBLISHED'] == 'true' ? true : false
57
+ dump.pages(only_published).each(&:to_refinery)
58
+
59
+ # After all pages are persisted we can now create the parent - child
60
+ # relationships. This is necessary, as WordPress doesn't dump the pages in
61
+ # a correct order.
62
+ dump.pages(only_published).each do |dump_page|
63
+ page = ::Page.find(dump_page.post_id)
64
+ page.parent_id = dump_page.parent_id
65
+ page.save!
66
+ end
67
+
68
+ Refinery::WordPress::Post.create_blog_page_if_necessary
69
+
70
+ ENV["MODEL"] = 'Page'
71
+ Rake::Task["friendly_id:redo_slugs"].invoke
72
+ ENV.delete("MODEL")
73
+ end
74
+
75
+ desc "reset cms tables and then import cms data from a Refinery::WordPress XML dump"
76
+ task :reset_and_import_pages, :file_name do |task, params|
77
+ Rake::Task["environment"].invoke
78
+ Rake::Task["wordpress:reset_pages"].invoke
79
+ Rake::Task["wordpress:import_pages"].invoke(params[:file_name])
80
+ end
38
81
  end
@@ -19,10 +19,13 @@ module Refinery
19
19
  end
20
20
  end
21
21
 
22
- def pages
23
- doc.xpath("//item[wp:post_type = 'page']").collect do |page|
22
+ def pages(only_published=false)
23
+ pages = doc.xpath("//item[wp:post_type = 'page']").collect do |page|
24
24
  Page.new(page)
25
25
  end
26
+
27
+ pages = pages.select(&:published?) if only_published
28
+ pages
26
29
  end
27
30
 
28
31
  def posts(only_published=false)
@@ -23,23 +23,10 @@ module Refinery
23
23
  end
24
24
 
25
25
  def content_formatted
26
- # WordPress doesn't export <p>-Tags, so let's run a simple_format over
27
- # the content. As we trust ourselves, no sanatize.
28
- formatted = simple_format(content, {}, { :sanitize => false })
29
-
30
- # Support for SyntaxHighlighter (http://alexgorbatchev.com/SyntaxHighlighter/):
31
- # In WordPress you can (via a plugin) enclose code in [lang][/lang]
32
- # blocks, which are converted to a <pre>-tag with a class corresponding
33
- # to the language.
34
- #
35
- # Example:
36
- # [ruby]p "Hello World"[/ruby]
37
- # -> <pre class="brush: ruby">p "Hello world"</pre>
38
- formatted.gsub!(/\[(\w+)\]/, '<pre class="brush: \1">')
39
- formatted.gsub!(/\[\/\w+\]/, '</pre>')
26
+ formatted = format_syntax_highlighter(format_paragraphs(content))
40
27
 
41
28
  # remove all tags inside <pre> that simple_format created
42
- # TODO: replace simple_format with a method, that ignores pre-tags
29
+ # TODO: replace format_paragraphs with a method, that ignores pre-tags
43
30
  formatted.gsub!(/(<pre.*?>)(.+?)(<\/pre>)/m) do |match|
44
31
  "#{$1}#{strip_tags($2)}#{$3}"
45
32
  end
@@ -60,7 +47,8 @@ module Refinery
60
47
  end
61
48
 
62
49
  def parent_id
63
- node.xpath("wp:post_parent").text.to_i
50
+ dump_id = node.xpath("wp:post_parent").text.to_i
51
+ dump_id == 0 ? nil : dump_id
64
52
  end
65
53
 
66
54
  def status
@@ -80,12 +68,40 @@ module Refinery
80
68
  end
81
69
 
82
70
  def to_refinery
83
- page = ::Page.create!(:title => title, :created_at => post_date,
84
- :draft => draft?, :parent_id => parent_id)
71
+ page = ::Page.create!(:id => post_id, :title => title,
72
+ :created_at => post_date, :draft => draft?)
85
73
 
86
74
  page.parts.create(:title => 'Body', :body => content_formatted)
87
75
  page
88
76
  end
77
+
78
+ private
79
+
80
+ def format_paragraphs(text, html_options={})
81
+ # WordPress doesn't export <p>-Tags, so let's run a simple_format over
82
+ # the content. As we trust ourselves, no sanatize. This code is heavily
83
+ # inspired by the simple_format rails helper
84
+ text = ''.html_safe if text.nil?
85
+ start_tag = tag('p', html_options, true)
86
+
87
+ text.gsub!(/\r\n?/, "\n") # \r\n and \r -> \n
88
+ text.gsub!(/\n\n+/, "</p>\n\n#{start_tag}") # 2+ newline -> paragraph
89
+ text.insert 0, start_tag
90
+
91
+ text.html_safe.safe_concat("</p>")
92
+ end
93
+
94
+ def format_syntax_highlighter(text)
95
+ # Support for SyntaxHighlighter (http://alexgorbatchev.com/SyntaxHighlighter/):
96
+ # In WordPress you can (via a plugin) enclose code in [lang][/lang]
97
+ # blocks, which are converted to a <pre>-tag with a class corresponding
98
+ # to the language.
99
+ #
100
+ # Example:
101
+ # [ruby]p "Hello World"[/ruby]
102
+ # -> <pre class="brush: ruby">p "Hello world"</pre>
103
+ text.gsub(/\[(\w+)\](.+?)\[\/\1\]/m, '<pre class="brush: \1">\2</pre>')
104
+ end
89
105
  end
90
106
  end
91
107
  end
@@ -61,6 +61,25 @@ module Refinery
61
61
 
62
62
  post
63
63
  end
64
+
65
+ def self.create_blog_page_if_necessary
66
+ # refinerycms wants a page at /blog, so let's make sure there is one
67
+ # taken from the original db seeds from refinery-blog
68
+ unless ::Page.where("link_url = ?", '/blog').exists?
69
+ page = ::Page.create(
70
+ :title => "Blog",
71
+ :link_url => "/blog",
72
+ :deletable => false,
73
+ :position => ((::Page.maximum(:position, :conditions => {:parent_id => nil}) || -1)+1),
74
+ :menu_match => "^/blogs?(\/|\/.+?|)$"
75
+ )
76
+
77
+ ::Page.default_parts.each do |default_page_part|
78
+ page.parts.create(:title => default_page_part, :body => nil)
79
+ end
80
+ end
81
+ end
82
+
64
83
  end
65
84
  end
66
85
  end
metadata CHANGED
@@ -1,93 +1,90 @@
1
- --- !ruby/object:Gem::Specification
1
+ --- !ruby/object:Gem::Specification
2
2
  name: refinerycms-wordpress-import
3
- version: !ruby/object:Gem::Version
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.2.0
4
5
  prerelease:
5
- version: 0.1.0
6
6
  platform: ruby
7
- authors:
7
+ authors:
8
8
  - Marc Remolt
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
-
13
- date: 2011-06-03 00:00:00 +02:00
12
+ date: 2011-06-05 00:00:00.000000000 +02:00
14
13
  default_executable:
15
- dependencies:
16
- - !ruby/object:Gem::Dependency
14
+ dependencies:
15
+ - !ruby/object:Gem::Dependency
17
16
  name: bundler
18
- prerelease: false
19
- requirement: &id001 !ruby/object:Gem::Requirement
17
+ requirement: &14624860 !ruby/object:Gem::Requirement
20
18
  none: false
21
- requirements:
19
+ requirements:
22
20
  - - ~>
23
- - !ruby/object:Gem::Version
24
- version: "1.0"
21
+ - !ruby/object:Gem::Version
22
+ version: '1.0'
25
23
  type: :runtime
26
- version_requirements: *id001
27
- - !ruby/object:Gem::Dependency
28
- name: refinerycms
29
24
  prerelease: false
30
- requirement: &id002 !ruby/object:Gem::Requirement
25
+ version_requirements: *14624860
26
+ - !ruby/object:Gem::Dependency
27
+ name: refinerycms
28
+ requirement: &14624380 !ruby/object:Gem::Requirement
31
29
  none: false
32
- requirements:
30
+ requirements:
33
31
  - - ~>
34
- - !ruby/object:Gem::Version
32
+ - !ruby/object:Gem::Version
35
33
  version: 1.0.0
36
34
  type: :runtime
37
- version_requirements: *id002
38
- - !ruby/object:Gem::Dependency
39
- name: refinerycms-blog
40
35
  prerelease: false
41
- requirement: &id003 !ruby/object:Gem::Requirement
36
+ version_requirements: *14624380
37
+ - !ruby/object:Gem::Dependency
38
+ name: refinerycms-blog
39
+ requirement: &14623920 !ruby/object:Gem::Requirement
42
40
  none: false
43
- requirements:
41
+ requirements:
44
42
  - - ~>
45
- - !ruby/object:Gem::Version
43
+ - !ruby/object:Gem::Version
46
44
  version: 1.5.2
47
45
  type: :runtime
48
- version_requirements: *id003
49
- - !ruby/object:Gem::Dependency
50
- name: nokogiri
51
46
  prerelease: false
52
- requirement: &id004 !ruby/object:Gem::Requirement
47
+ version_requirements: *14623920
48
+ - !ruby/object:Gem::Dependency
49
+ name: nokogiri
50
+ requirement: &14623460 !ruby/object:Gem::Requirement
53
51
  none: false
54
- requirements:
52
+ requirements:
55
53
  - - ~>
56
- - !ruby/object:Gem::Version
54
+ - !ruby/object:Gem::Version
57
55
  version: 1.4.4
58
56
  type: :runtime
59
- version_requirements: *id004
60
- - !ruby/object:Gem::Dependency
61
- name: rspec-rails
62
57
  prerelease: false
63
- requirement: &id005 !ruby/object:Gem::Requirement
58
+ version_requirements: *14623460
59
+ - !ruby/object:Gem::Dependency
60
+ name: rspec-rails
61
+ requirement: &14623080 !ruby/object:Gem::Requirement
64
62
  none: false
65
- requirements:
66
- - - ">="
67
- - !ruby/object:Gem::Version
68
- version: "0"
63
+ requirements:
64
+ - - ! '>='
65
+ - !ruby/object:Gem::Version
66
+ version: '0'
69
67
  type: :development
70
- version_requirements: *id005
71
- - !ruby/object:Gem::Dependency
72
- name: database_cleaner
73
68
  prerelease: false
74
- requirement: &id006 !ruby/object:Gem::Requirement
69
+ version_requirements: *14623080
70
+ - !ruby/object:Gem::Dependency
71
+ name: database_cleaner
72
+ requirement: &14622620 !ruby/object:Gem::Requirement
75
73
  none: false
76
- requirements:
77
- - - ">="
78
- - !ruby/object:Gem::Version
79
- version: "0"
74
+ requirements:
75
+ - - ! '>='
76
+ - !ruby/object:Gem::Version
77
+ version: '0'
80
78
  type: :development
81
- version_requirements: *id006
82
- description: This gem imports a WordPress XML dump into refinerycms (Page [soon], User) and refinerycms-blog (BlogPost, BlogCategory, Tag, BlogComment)
79
+ prerelease: false
80
+ version_requirements: *14622620
81
+ description: This gem imports a WordPress XML dump into refinerycms (Page, User) and
82
+ refinerycms-blog (BlogPost, BlogCategory, Tag, BlogComment)
83
83
  email: marc.remolt@googlemail.com
84
84
  executables: []
85
-
86
85
  extensions: []
87
-
88
86
  extra_rdoc_files: []
89
-
90
- files:
87
+ files:
91
88
  - lib/wordpress.rb
92
89
  - lib/wordpress/page.rb
93
90
  - lib/wordpress/author.rb
@@ -106,30 +103,26 @@ files:
106
103
  has_rdoc: true
107
104
  homepage: https://github.com/mremolt/refinerycms-wordpress-import
108
105
  licenses: []
109
-
110
106
  post_install_message:
111
107
  rdoc_options: []
112
-
113
- require_paths:
108
+ require_paths:
114
109
  - lib
115
- required_ruby_version: !ruby/object:Gem::Requirement
110
+ required_ruby_version: !ruby/object:Gem::Requirement
116
111
  none: false
117
- requirements:
118
- - - ">="
119
- - !ruby/object:Gem::Version
120
- version: "0"
121
- required_rubygems_version: !ruby/object:Gem::Requirement
112
+ requirements:
113
+ - - ! '>='
114
+ - !ruby/object:Gem::Version
115
+ version: '0'
116
+ required_rubygems_version: !ruby/object:Gem::Requirement
122
117
  none: false
123
- requirements:
124
- - - ">="
125
- - !ruby/object:Gem::Version
126
- version: "0"
118
+ requirements:
119
+ - - ! '>='
120
+ - !ruby/object:Gem::Version
121
+ version: '0'
127
122
  requirements: []
128
-
129
123
  rubyforge_project:
130
124
  rubygems_version: 1.6.2
131
125
  signing_key:
132
126
  specification_version: 3
133
127
  summary: Import WordPress XML dumps into refinerycms(-blog).
134
128
  test_files: []
135
-