algoliasearch-jekyll 0.2.2 → 0.2.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 8ba110a511e94a6806f7cea6863401e58e7ff2a8
4
- data.tar.gz: 8c498d9b582c93c4ee721d4c27af5f29a11cf528
3
+ metadata.gz: cde772c27a05882aba2a47f5c0fa626440ad7755
4
+ data.tar.gz: d0c6dafb3d6b3ade9a35057f6c91b9c6b3c3bfdc
5
5
  SHA512:
6
- metadata.gz: 0f85d1f891cc31d2221cb822af36c2c4e31e501630c31c6cdeab90b508f3740f5b36830624d5c655ba35278a1000b1498f4b86a80b47891b924dd32c406ce9c3
7
- data.tar.gz: af03f61ef88213c089ead811b6dc42676245614f7e4fada7f016851be224dfc9072ba355bcba7cd3772e68449257d5a92751c41f7ab4929db1455810d07e9961
6
+ metadata.gz: ad814fa1f1eee3d87a6407740d9aad6501874a86b395bda116c0ddeb82d35dae9ed17c87bbb59d98c11fc32c5c75a52a187838b8a8bf7714ec158c0b9a8137a0
7
+ data.tar.gz: 43d778da3ad04dec941d1f46e7a1943aeb8e60e771171243f9f502994d52a18325d26a85497f5919ccf2437c57e1c0b5fd1400ddaab0e7b8bf883650628f9b2a
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --color
2
+ --format progress
data/.rubocop.yml ADDED
@@ -0,0 +1,25 @@
1
+ # Defaults:
2
+ # https://github.com/bbatsov/rubocop/blob/master/config/default.yml
3
+ Metrics/AbcSize:
4
+ Max: 100
5
+
6
+ Metrics/ClassLength:
7
+ Max: 200
8
+
9
+ Metrics/ModuleLength:
10
+ Max: 200
11
+
12
+ Metrics/MethodLength:
13
+ Max: 50
14
+
15
+ Metrics/CyclomaticComplexity:
16
+ Max: 10
17
+
18
+ Metrics/PerceivedComplexity:
19
+ Max: 10
20
+
21
+ Style/FileName:
22
+ Enabled: false
23
+
24
+ Style/MultilineOperationIndentation:
25
+ Enabled: false
data/Gemfile ADDED
@@ -0,0 +1,15 @@
1
+ source 'http://rubygems.org'
2
+
3
+ gem 'algoliasearch', '~> 1.4'
4
+ gem 'awesome_print', '~> 1.6'
5
+ gem 'json', '~> 1.8'
6
+ gem 'nokogiri', '~> 1.6'
7
+
8
+ group :development do
9
+ gem 'guard-rspec', '~> 4.6'
10
+ gem 'jekyll', '~> 2.5'
11
+ gem 'jeweler', '~> 2.0'
12
+ gem 'rspec', '~> 3.0'
13
+ gem 'rubocop', '~> 0.31'
14
+ gem 'simplecov', '~> 0.10'
15
+ end
data/Guardfile ADDED
@@ -0,0 +1,7 @@
1
+ guard :rspec, cmd: 'bundle exec rspec --color --format documentation' do
2
+ watch(%r{^spec/.+_spec\.rb$})
3
+ watch(%r{^lib/(.+)\.rb$}) { |m| "spec/#{m[1]}_spec.rb" }
4
+ watch('spec/spec_helper.rb') { 'spec' }
5
+ end
6
+
7
+ notification :off
data/LICENSE.txt ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2015 Algolia
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,148 @@
1
+ # Algolia Jekyll Plugin
2
+
3
+ [![Gem Version](https://badge.fury.io/rb/algoliasearch-jekyll.svg)](http://badge.fury.io/rb/algoliasearch-jekyll)
4
+
5
+ Jekyll plugin to automatically index your Jekyll posts and pages into an
6
+ Algolia index by simply running `jekyll algolia push`.
7
+
8
+ ## Usage
9
+
10
+ ```shell
11
+ $ jekyll algolia push
12
+ ```
13
+
14
+ This will push the content of your jekyll website to your Algolia index.
15
+
16
+ You can specify any option you would normally pass to `jekyll build`, like
17
+ `--config`, `--source`, `--destination`, etc.
18
+
19
+ ## Installation
20
+
21
+ First, add the `algoliasearch-jekyll` gem to your `Gemfile`, in the
22
+ `:jekyll_plugins` section. If you do not yet have a `Gemfile`, here is the
23
+ minimum content to get you started.
24
+
25
+ ```ruby
26
+ source 'https://rubygems.org'
27
+
28
+ gem 'jekyll', '>=2.5.3'
29
+
30
+ group :jekyll_plugins do
31
+ gem 'algoliasearch-jekyll'
32
+ end
33
+ ```
34
+
35
+ Once this is done, download all dependencies with `bundle install`.
36
+
37
+ Then, add `algoliasearch-jekyll` to your `_config.yml` file, under the `gems`
38
+ section, like this:
39
+
40
+ ```yaml
41
+ gems:
42
+ - algoliasearch-jekyll
43
+ ```
44
+
45
+ If everything went well, you should be able to execute `jekyll help` and see the
46
+ `algolia` subcommand listed.
47
+
48
+ ## Configuration
49
+
50
+ Add information about your Algolia configuration into the `_config.yml` file,
51
+ under the `algolia` section, like this:
52
+
53
+ ```yaml
54
+ algolia:
55
+ application_id: 'your_application_id'
56
+ index_name: 'your_index_name'
57
+ ```
58
+
59
+ Your admin api key will be read from the `ALGOLIA_API_KEY` environment variable.
60
+ You can define it on the same line as your command, allowing you to type
61
+ `ALGOLIA_API_KEY='your_admin_api_key' jekyll algolia push`.
62
+
63
+ ### ⚠ Other, insecure, method ⚠
64
+
65
+ You can also store your admin api key in a file named `_algolia_api_key`, in
66
+ your source directory. If you do this we __very, very, very strongly__ encourage
67
+ you to make sure the file is not tracked in your versioning system.
68
+
69
+ ### Options
70
+
71
+ The plugin uses sensible defaults, but you may want to override some of its
72
+ configuration. Here are the various options you can add to your `_config.yml`
73
+ file, under the `algolia` section:
74
+
75
+ #### `excluded_files`
76
+
77
+ Defines which files should not be indexed for search.
78
+
79
+ ```yml
80
+ algolia:
81
+ excluded_files:
82
+ - index.html
83
+ - 2015-01-01-post.md
84
+ ```
85
+
86
+ #### `record_css_selector`
87
+
88
+ Defines the css selector inside a page/post used to choose which parts to index.
89
+ It is set to all paragraphs (`<p>`) by default.
90
+
91
+ If you would like to also index lists, you could set it like this:
92
+
93
+ ```yml
94
+ algolia:
95
+ record_css_selector: 'p,ul'
96
+ ```
97
+
98
+ #### `settings`
99
+
100
+ Here you can pass any custom settings you would like to push to your Algolia
101
+ index.
102
+
103
+ If you want to activate `distinct` and some snippets for example, you would do:
104
+
105
+ ```yml
106
+ algolia:
107
+ settings:
108
+ attributeForDistinct: 'hierarchy'
109
+ distinct: true
110
+ attributesToSnippet: ['text:20']
111
+ ```
112
+
113
+ ### Hooks
114
+
115
+ The `AlgoliaSearchRecordExtractor` contains two methods (`custom_hook_each` and
116
+ `custom_hook_all`) that are here so you can overwrite them to add your custom
117
+ logic. They currently simply return the argument they take as input.
118
+
119
+ ```ruby
120
+ class AlgoliaSearchRecordExtractor
121
+ # Hook to modify a record after extracting
122
+ # `node` refers to the Nokogiri HTML node of the element
123
+ def custom_hook_each(item, node)
124
+ item
125
+ end
126
+
127
+ # Hook to modify all records after extracting
128
+ def custom_hook_all(items)
129
+ items
130
+ end
131
+ end
132
+ ```
133
+
134
+ ## Searching
135
+
136
+ This plugin will only index your data in your Algolia index. Adding search
137
+ capabilities is quite easy. You can follow [our tutorials][1] or use our forked
138
+ version of the popular [Hyde theme][2].
139
+
140
+ ## GitHub Pages
141
+
142
+ Unfortunately, GitHub does not allow custom plugins to be run on GitHub Pages.
143
+ This means that you will have to manually run `jekyll algolia push` before
144
+ pushing your content to GitHub.
145
+
146
+
147
+ [1]: https://www.algolia.com/doc/javascript
148
+ [2]: https://github.com/algolia/hyde
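As a rough illustration of the hooks described in the README above, the sketch below reopens `AlgoliaSearchRecordExtractor` and overrides both methods. The stub class at the top, the `:language` and `:text` keys, and the sample records are assumptions made so the snippet runs on its own; in a real site only the overrides would be needed, loaded after the plugin.

```ruby
# Stand-in for the plugin's class so this sketch runs on its own
# (assumption: in a real site the plugin defines it before your overrides).
class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, _node)
    item
  end

  def custom_hook_all(items)
    items
  end
end

# Reopen the class to add custom logic.
class AlgoliaSearchRecordExtractor
  # Add a hypothetical :language attribute to every record.
  def custom_hook_each(item, _node)
    item.merge(language: 'en')
  end

  # Drop records whose hypothetical :text attribute is blank.
  def custom_hook_all(items)
    items.reject { |item| item[:text].to_s.strip.empty? }
  end
end

extractor = AlgoliaSearchRecordExtractor.new
records = [{ text: 'Hello' }, { text: '  ' }].map do |record|
  extractor.custom_hook_each(record, nil)
end
filtered = extractor.custom_hook_all(records)
```

Here `filtered` keeps only the record with non-blank text, now carrying the extra attribute.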
data/Rakefile ADDED
@@ -0,0 +1,44 @@
1
+ # encoding: utf-8
2
+
3
+ require 'rubygems'
4
+ require 'bundler'
5
+ begin
6
+ Bundler.setup(:default, :development)
7
+ rescue Bundler::BundlerError => e
8
+ $stderr.puts e.message
9
+ $stderr.puts 'Run `bundle install` to install missing gems'
10
+ exit e.status_code
11
+ end
12
+ require 'rake'
13
+
14
+ require 'jeweler'
15
+ Jeweler::Tasks.new do |gem|
16
+ # gem is a Gem::Specification... see
17
+ # http://guides.rubygems.org/specification-reference/ for more options
18
+ gem.name = 'algoliasearch-jekyll'
19
+ gem.homepage = 'https://github.com/algolia/algoliasearch-jekyll'
20
+ gem.license = 'MIT'
21
+ gem.summary = 'AlgoliaSearch for Jekyll'
22
+ gem.description = 'Index all your pages and posts to an Algolia index with ' \
23
+ '`jekyll algolia push`'
24
+ gem.email = 'tim@pixelastic.com'
25
+ gem.authors = ['Tim Carry']
26
+ # dependencies defined in Gemfile
27
+ end
28
+ Jeweler::RubygemsDotOrgTasks.new
29
+
30
+ require 'rspec/core'
31
+ require 'rspec/core/rake_task'
32
+ RSpec::Core::RakeTask.new(:spec) do |spec|
33
+ # spec.rspec_opts = '--color --format documentation'
34
+ spec.pattern = FileList['spec/**/*_spec.rb']
35
+ end
36
+ task test: :spec
37
+
38
+ desc 'Code coverage detail'
39
+ task :coverage do
40
+ ENV['COVERAGE'] = 'true'
41
+ Rake::Task['spec'].execute
42
+ end
43
+
44
+ task default: :test
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.2.3
data/lib/push.rb CHANGED
@@ -31,17 +31,19 @@ class AlgoliaSearchJekyllPush < Jekyll::Command
31
31
  def indexable?(file)
32
32
  return false if file.is_a?(Jekyll::StaticFile)
33
33
 
34
+ basename = File.basename(file.path)
35
+ extname = File.extname(basename)[1..-1]
36
+
34
37
  # Keep only markdown and html files
35
38
  allowed_extensions = %w(html)
36
39
  if @config['markdown_ext']
37
40
  allowed_extensions += @config['markdown_ext'].split(',')
38
41
  end
39
- current_extension = File.extname(file.name)[1..-1]
40
- return false unless allowed_extensions.include?(current_extension)
42
+ return false unless allowed_extensions.include?(extname)
41
43
 
42
44
  # Exclude files manually excluded from config
43
45
  excluded_files = @config['algolia']['excluded_files']
44
- return false if excluded_files && excluded_files.include?(file.name)
46
+ return false if excluded_files && excluded_files.include?(basename)
45
47
 
46
48
  true
47
49
  end
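The extension and exclusion checks in this hunk can be sketched in isolation. This is a simplified stand-in, not the plugin's method: `indexable_path?` is a hypothetical helper that takes a plain path string and explicit option values (including an assumed default for `markdown_ext`) instead of a Jekyll file object and `@config`.

```ruby
# Hypothetical helper mirroring the checks in `indexable?`, using plain
# strings instead of Jekyll file objects.
def indexable_path?(path, markdown_ext: 'md,mkd', excluded_files: [])
  basename = File.basename(path)
  extname = File.extname(basename)[1..-1] # drop the leading dot; nil if none

  # Keep only markdown and html files
  allowed_extensions = %w(html) + markdown_ext.split(',')
  return false unless allowed_extensions.include?(extname)

  # Exclusions are compared against the file name only, not the full path
  return false if excluded_files.include?(basename)

  true
end

post_ok = indexable_path?('_posts/2015-01-01-post.md')
image_ok = indexable_path?('assets/logo.png')
excluded_ok = indexable_path?('pages/index.html', excluded_files: ['index.html'])
```

Because the comparison uses the basename, an entry such as `index.html` in `excluded_files` matches regardless of the directory the file lives in.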
@@ -55,7 +57,6 @@ class AlgoliaSearchJekyllPush < Jekyll::Command
55
57
  items = []
56
58
  each_site_file do |file|
57
59
  next unless AlgoliaSearchJekyllPush.indexable?(file)
58
-
59
60
  new_items = AlgoliaSearchRecordExtractor.new(file).extract
60
61
  next if new_items.nil?
61
62
 
@@ -162,27 +163,33 @@ class AlgoliaSearchJekyllPush < Jekyll::Command
162
163
 
163
164
  def push(items)
164
165
  check_credentials
165
-
166
- index_name = @config['algolia']['index_name']
167
166
  Algolia.init(
168
167
  application_id: @config['algolia']['application_id'],
169
168
  api_key: api_key
170
169
  )
171
- index = Algolia::Index.new(index_name)
172
- configure_index(index)
173
- index.clear_index
174
170
 
171
+ # Create a temporary index
172
+ index_name = @config['algolia']['index_name']
173
+ index_name_tmp = "#{index_name}_tmp"
174
+ index_tmp = Algolia::Index.new(index_name_tmp)
175
+ configure_index(index_tmp)
176
+
177
+ # Push to temporary index
175
178
  items.each_slice(1000) do |batch|
176
179
  Jekyll.logger.info "Indexing #{batch.size} items"
177
180
  begin
178
- index.add_objects(batch)
181
+ index_tmp.add_objects!(batch)
179
182
  rescue StandardError => error
180
183
  Jekyll.logger.error 'Algolia Error: HTTP Error'
181
184
  Jekyll.logger.warn error.message
182
185
  exit 1
186
+
183
187
  end
184
188
  end
185
189
 
190
+ # Move temporary index to real one
191
+ Algolia.move_index(index_name_tmp, index_name)
192
+
186
193
  Jekyll.logger.info "Indexing of #{items.size} items " \
187
194
  "in #{index_name} done."
188
195
  end
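The push-to-a-temporary-index-then-move pattern in this hunk can be illustrated without touching the network. The `FakeStore` class below is entirely hypothetical (an in-memory stand-in, not the Algolia client), and `reindex` is an illustrative name. It only shows why the live index is replaced in one step rather than being cleared and refilled.

```ruby
# Hypothetical in-memory stand-in for a search backend (not the Algolia API).
class FakeStore
  def initialize
    @indexes = Hash.new { |hash, name| hash[name] = [] }
  end

  def add_objects(name, batch)
    @indexes[name].concat(batch)
  end

  # Replace `to` with the content of `from` in a single step.
  def move_index(from, to)
    @indexes[to] = @indexes.delete(from) || []
  end

  def records(name)
    @indexes.fetch(name, [])
  end
end

# Fill a temporary index in batches, then swap it over the real one.
def reindex(store, index_name, items, batch_size: 1000)
  tmp_name = "#{index_name}_tmp"
  items.each_slice(batch_size) { |batch| store.add_objects(tmp_name, batch) }
  store.move_index(tmp_name, index_name)
end

store = FakeStore.new
store.add_objects('site', [{ id: 'old' }])
reindex(store, 'site', [{ id: 1 }, { id: 2 }, { id: 3 }], batch_size: 2)
```

Until `move_index` runs, `store.records('site')` still returns the old content, which is the property the temporary index is meant to provide.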
@@ -24,9 +24,24 @@ class AlgoliaSearchRecordExtractor
24
24
 
25
25
  # Returns metadata from the current file
26
26
  def metadata
27
- return metadata_page if @file.is_a?(Jekyll::Page)
28
- return metadata_post if @file.is_a?(Jekyll::Post)
29
- {}
27
+ metadata = {}
28
+ @file.data.each { |key, value| metadata[key.to_sym] = value }
29
+
30
+ metadata[:type] = @file.class.name.split('::')[1].downcase
31
+ metadata[:url] = @file.url
32
+
33
+ if @file.respond_to? :slug
34
+ metadata[:slug] = @file.slug
35
+ else
36
+ basename = File.basename(@file.path)
37
+ extname = File.extname(basename)
38
+ metadata[:slug] = File.basename(basename, extname)
39
+ end
40
+
41
+ metadata[:posted_at] = @file.date.to_time.to_i if @file.respond_to? :date
42
+ metadata[:tags] = tags if @file.respond_to? :tags
43
+
44
+ metadata
30
45
  end
31
46
 
32
47
  # Extract a list of tags
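The duck-typed `metadata` method in this hunk can be sketched with plain structs. `FakePost`, `FakePage`, and `metadata_for` are hypothetical names, not Jekyll or plugin classes. The sketch leaves out the `:type` and `:tags` keys and calls `to_i` directly on a `Time`, where the plugin calls `to_time` first.

```ruby
# Hypothetical stand-ins for Jekyll objects; only the accessors used below
# are modelled.
FakePost = Struct.new(:data, :url, :path, :slug, :date)
FakePage = Struct.new(:data, :url, :path)

def metadata_for(file)
  metadata = {}
  # Copy every front-matter key, symbolized
  file.data.each { |key, value| metadata[key.to_sym] = value }
  metadata[:url] = file.url

  if file.respond_to?(:slug)
    metadata[:slug] = file.slug
  else
    # Fall back to the file name without its extension
    basename = File.basename(file.path)
    metadata[:slug] = File.basename(basename, File.extname(basename))
  end

  metadata[:posted_at] = file.date.to_i if file.respond_to?(:date)
  metadata
end

page = FakePage.new({ 'title' => 'About page', 'custom' => 'Foo' },
                    '/about.html', 'about.md')
post = FakePost.new({ 'title' => 'Test post' }, '/test-post.html',
                    '_posts/2015-07-02-test-post.md', 'test-post', Time.at(0))
page_metadata = metadata_for(page)
post_metadata = metadata_for(post)
```

Custom front-matter keys such as `custom` end up in the record without any per-type method, which appears to be the point of replacing `metadata_post` and `metadata_page`.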
@@ -37,41 +52,21 @@ class AlgoliaSearchRecordExtractor
37
52
  @file.tags.map(&:to_s)
38
53
  end
39
54
 
40
- # Extract metadata from a post
41
- def metadata_post
42
- {
43
- type: 'post',
44
- url: @file.url,
45
- title: @file.title,
46
- slug: @file.slug,
47
- posted_at: @file.date.to_time.to_i,
48
- tags: tags
49
- }
50
- end
51
-
52
- # Extract metadata from a page
53
- def metadata_page
54
- {
55
- type: 'page',
56
- url: @file.url,
57
- title: @file['title'],
58
- slug: @file.basename
59
- }
60
- end
61
-
62
55
  # Get the list of all HTML nodes to index
63
56
  def html_nodes
64
57
  document = Nokogiri::HTML(@file.content)
65
58
  document.css(@config['record_css_selector'])
66
59
  end
67
60
 
61
+ # Check if node is a heading
62
+ def node_heading?(node)
63
+ %w(h1 h2 h3 h4 h5 h6).include?(node.name)
64
+ end
65
+
68
66
  # Get the closest heading parent
69
67
  def node_heading_parent(node, level = 'h7')
70
- headings = %w(h1 h2 h3 h4 h5 h6)
71
-
72
- # If initially called on a heading, we must not accept it but only accept
73
- # strong headings
74
- level = node.name if level == 'h7' && headings.include?(node.name)
68
+ # If initially called on a heading, we only accept stronger headings
69
+ level = node.name if level == 'h7' && node_heading?(node)
75
70
 
76
71
  previous = node.previous_element
77
72
 
@@ -84,32 +79,31 @@ class AlgoliaSearchRecordExtractor
84
79
  end
85
80
 
86
81
  # This is a heading, we return it
87
- return previous if headings.include?(previous.name) && previous.name < level
82
+ return previous if node_heading?(previous) && previous.name < level
88
83
 
89
84
  node_heading_parent(previous, level)
90
85
  end
91
86
 
92
87
  # Get all the parent headings of the specified node
93
- def node_hierarchy(node, memo = { level: 7 })
94
- previous = node_heading_parent(node)
88
+ # If the node itself is a heading, we include it
89
+ def node_hierarchy(node, state = { level: 7 })
90
+ tag_name = node.name
91
+ level = tag_name.gsub('h', '').to_i
95
92
 
96
- # No previous heading, we can stop the recursion
97
- unless previous
98
- memo.delete(:level)
99
- return memo
93
+ if node_heading?(node) && level < state[:level]
94
+ state[tag_name.to_sym] = node_text(node)
95
+ state[:level] = level
100
96
  end
101
97
 
102
- tag_name = previous.name
103
- level = tag_name.gsub('h', '').to_i
104
- content = previous.content
98
+ heading = node_heading_parent(node)
105
99
 
106
- # Skip if item already as title of a higher level
107
- return node_hierarchy(previous, memo) if level >= memo[:level]
108
- memo[:level] = level
100
+ # No previous heading, we can stop the recursion
101
+ unless heading
102
+ state.delete(:level)
103
+ return state
104
+ end
109
105
 
110
- # Add to the memo and continue
111
- memo[tag_name.to_sym] = content
112
- node_hierarchy(previous, memo)
106
+ node_hierarchy(heading, state)
113
107
  end
114
108
 
115
109
  # Return the raw HTML of the element to index
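A flattened sketch of the heading hierarchy walk may help when reading the refactored `node_hierarchy`. It assumes the document is a flat array of `[tag, text]` pairs in document order, whereas the plugin walks a Nokogiri tree, including parent elements. `hierarchy_for` and `HEADINGS` are hypothetical names.

```ruby
HEADINGS = %w(h1 h2 h3 h4 h5 h6).freeze

# nodes: array of [tag_name, text] pairs in document order (assumption).
# Returns the closest heading of each stronger level above nodes[index],
# including nodes[index] itself when it is a heading.
def hierarchy_for(nodes, index)
  hierarchy = {}
  level = 7
  index.downto(0) do |i|
    tag, text = nodes[i]
    next unless HEADINGS.include?(tag)
    tag_level = tag.delete('h').to_i
    # Only accept headings stronger than the ones already collected
    next unless tag_level < level
    hierarchy[tag.to_sym] = text
    level = tag_level
  end
  hierarchy
end

nodes = [
  %w(h1 H1), %w(p TEXT1), %w(h2 H2A), %w(p TEXT2),
  %w(h2 H2B), %w(h3 H3A), %w(p TEXT5)
]
text_hierarchy = hierarchy_for(nodes, 6)
heading_hierarchy = hierarchy_for(nodes, 5)
```

For the last paragraph the walk skips `H2A` because an `h2` (`H2B`) was already found closer to the node. Calling it on the `h3` itself includes that heading, matching the "If the node itself is a heading, we include it" comment in the hunk.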
@@ -0,0 +1,10 @@
1
+ #!/usr/bin/env bash
2
+
3
+ # Do not commit any focused or excluded tests
4
+ if grep --color -r 'spec' -E -e '^( |\t)*(fit|fdescribe|xit|xdescribe)'; then
5
+ echo '✘ You have focused and/or skipped tests'
6
+ exit 1
7
+ fi
8
+
9
+ # Match style guide
10
+ rubocop './lib/' './spec'
@@ -0,0 +1,3 @@
1
+ #!/usr/bin/env bash
2
+
3
+ bundle exec rspec --format documentation
data/scripts/run_tests ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env bash
2
+ bundle exec rspec --color --format documentation
@@ -0,0 +1,10 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ build_output = `gem build algoliasearch-jekyll.gemspec`.split("\n")
4
+ puts build_output
5
+ build_output.select! { |line| line.match(/^ File:/) }
6
+
7
+ gem_file = build_output[0].gsub(' File: ', '')
8
+ puts 'Pushing to rubygems'
9
+ push_output = `gem push #{gem_file}`
10
+ puts push_output
@@ -0,0 +1,7 @@
1
+ algolia:
2
+ application_id: 'APPID'
3
+ index_name: 'INDEXNAME'
4
+
5
+ collections:
6
+ my-collection:
7
+ output: true
@@ -0,0 +1,6 @@
1
+ ---
2
+ title: Collection Item
3
+ ---
4
+
5
+ <p>The grandest of omelettes. Those that feast on dragon eggs often find that there
6
+ is very little they would not dare to do.</p>
@@ -0,0 +1,7 @@
1
+ ---
2
+ title: Collection Item
3
+ custom: Foo
4
+ ---
5
+
6
+ The grandest of omelettes. Those that feast on dragon eggs often find that there
7
+ is very little they would not dare to do.
@@ -0,0 +1,32 @@
1
+ ---
2
+ title: "Test post"
3
+ tags:
4
+ - tag
5
+ - another tag
6
+ custom: Foo
7
+ ---
8
+
9
+ Introduction text that also includes [some link](https://www.algolia.com). To
10
+ add a bit of fancy, we will also __bold__ and _italicize_ some text.
11
+
12
+ # Main title
13
+
14
+ We like writing stuff and then indexing it in a very fast engine. Here is why
15
+ a fast engine is good for you:
16
+
17
+ * fast
18
+ * fast
19
+ * fast
20
+ * and fast
21
+
22
+ ## Built with hands
23
+
24
+ All this text was typed with my own hands, on my own keyboard. I also did use
25
+ a Chair© and a Table™.
26
+
27
+ ## Features
28
+
29
+ The whole plugin is composed of parts of `code`, and sometime even
30
+ <code>&lt;code&gt;</code>.
31
+
32
+ Code is __✔ checked__ and errors are __✘ deleted__.
@@ -0,0 +1,34 @@
1
+ ---
2
+ title: About page
3
+ custom: Foo
4
+ ---
5
+
6
+ # Heading 1
7
+
8
+ Text 1
9
+
10
+ ## Heading 2
11
+
12
+ Text 2
13
+
14
+ ### Heading 3
15
+
16
+ Text 3
17
+
18
+ - item 1
19
+ - item 2
20
+ - item 3
21
+
22
+ ### Another Heading 3
23
+
24
+ <p id="text4">Another text 4</p>
25
+
26
+ <h2 id="heading2b">Another Heading 2</h2>
27
+
28
+ Another `<text>` 5
29
+
30
+ ### Last Heading 3
31
+
32
+ <div>Just a div</div>
33
+
34
+ <div><p>Last text 6 </p></div>
@@ -0,0 +1 @@
1
+ APIKEY_FROM_FILE
Binary file
@@ -0,0 +1,5 @@
1
+ ---
2
+ title: Authors
3
+ ---
4
+
5
+ <p>This is an HTML page</p>
@@ -0,0 +1,5 @@
1
+ ---
2
+ title: Excluded file
3
+ ---
4
+
5
+ <p>This should not be indexed</p>
@@ -0,0 +1,35 @@
1
+ ---
2
+ title: Hierarchy test
3
+ ---
4
+
5
+ # H1
6
+
7
+ TEXT1-H1
8
+
9
+ ## H2A
10
+
11
+ TEXT2-H2A-H1
12
+
13
+ TEXT3-H2A-H1
14
+
15
+ ## H2B
16
+
17
+ TEXT4-H2B-H1
18
+
19
+ ### H3A
20
+
21
+ TEXT5-H3-H2B-H1
22
+
23
+ <div>
24
+ <h4>H4</h4>
25
+ <p>TEXT7-H4-H3-H2B-H1</p>
26
+ </div>
27
+
28
+ ## H2C
29
+
30
+ TEXT8-H2C-H1
31
+
32
+ ### H3B `<code>`
33
+
34
+ TEXT9-H3B-H2C-H1
35
+
@@ -0,0 +1,19 @@
1
+ ---
2
+ title: Weight test
3
+ ---
4
+
5
+ # AAA BBB CCC DDD
6
+
7
+ aaa xxx aaa xxx aaa
8
+
9
+ ## AAA BBB
10
+
11
+ aaa bbb
12
+
13
+ ## CCC DDD
14
+
15
+ ccc ddd
16
+
17
+ ### DDD
18
+
19
+ aaa bbb ccc dddd