middleman-search 0.2.0 → 0.3.0 (RubyGems, version diff via Mend)

Files changed (6)

checksums.yaml +4 -4
data/README.md +56 -10
data/lib/middleman-search/extension.rb +11 -1
data/lib/middleman-search/search-index-resource.rb +20 -1
data/lib/middleman-search/version.rb +1 -1
metadata +1 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 83d2b65441f0766102161c0489cedec031fb34fe
-  data.tar.gz: 612b8f30867b4695af0b39ead29304360bf1d644
+  metadata.gz: 3dc491d945d930fbca061b6e1c93fa4d2f5b2d1e
+  data.tar.gz: 63d5f6e7adbe105b1a865f631c8b365151381951
 SHA512:
-  metadata.gz: 86b81ea83c06557e4f29e9a7823b88a96260a55dfbc36c268fb436ecd230df85adf7db9a74f2a7255a2ec1e5715ab5c38020e3933e227fc4e85a187c0ab0b12b
-  data.tar.gz: 8fab755798193a108e834998327d3350995b8f8027bb5476ffbfde35c3efbfa4eeab04e910185f740860dd06e6a551b3ac78bcbdebe488569ac391f4378aacb4
+  metadata.gz: ba2d93142dda2dbae2736566e81e5d97e77a05c1fa5ac9acec09a8bf7f37764f2d1e720af4084d0aa980cab197049a32e4ae36c2c8ce83d38c7f23e0f42e2e37
+  data.tar.gz: a16b4903505af7c62a79d0936cf09296e211fba49dc6e4557465deb275b8e5f91dac8b306e3555ce3f03e184260d54a1b8cb487e3ead12f841981e09038836a4

data/README.md CHANGED

@@ -22,19 +22,17 @@ You need to activate the module in your `config.rb`, telling the extension how t
 ```ruby
 activate :search do
   search.resources = ['blog/', 'index.html', 'contactus/index.html']
   search.index_path = 'search/lunr-index.json' # defaults to `search.json`
   search.fields = {
     title:   {boost: 100, store: true, required: true},
     content: {boost: 50},
     url:     {index: false, store: true},
     author:  {boost: 30}
   }
-  search.before_index = Proc.new do |to_index, to_store, resource|
-    if author = resource.data.author
-      to_index[:author] = data.authors[author].name
-    end
-  end
 end
 ```
@@ -47,18 +45,59 @@ Where `resources` is a list of the beginning of the URL of the resources to inde
 Note that a special field `id` is included automatically, with an autogenerated identifier to be used as the `ref` for the document.
-All fields values are retrieved from the resource `data` (ie its frontmatter), or from the `options` in the `resource.metadata` (i.e., any options specified in a `proxy` page), except for:
+All fields values are retrieved from the resource `data` (i.e. its frontmatter), or from the `options` in the `resource.metadata` (i.e. any options specified in a `proxy` page), except for:
 - `url` which is the actual resource url
 - `content` the text extracted from the rendered resource, without including its layout
-The `before_index` option accepts a callback that will be executed for each resource, and will be executed with the document to be indexed and the map to be stored, in the `index` and `docs` objects of the output respectively (see below), as well as the resource being processed. You can use this callback to modify either of those, or `throw(:skip)` to skip the resource in question.
+### Manual index manipulation
-You should also `require` the `lunr.min.js` file to your `all.js` file, to actually use the index for search in your website:
+You can fully customise the content to be indexed and stored per resource by defining a `before_index` callback:
-```javascript
-//= require lunr.min
+```ruby
+activate :search do
+  search.before_index = Proc.new do |to_index, to_store, resource|
+    if author = resource.data.author
+      to_index[:author] = data.authors[author].name
+    end
+  end
+end
+```
+This option accepts a callback that will be executed for each resource, and will be executed with the document to be indexed and the map to be stored, in the `index` and `docs` objects of the output respectively (see below), as well as the resource being processed. You can use this callback to modify either of those, or `throw(:skip)` to skip the resource in question.
+### Lunr pipeline configuration
+In some cases, you may want to add new function to the lunr pipeline, both for creating the indexing and then for searching. You can do this by providing a `pipeline` hash with function names and body, for example:
+```ruby
+activate :search do
+  search.pipeline = {
+    tildes: <<-JS
+      function(token, tokenIndex, tokens) {
+        return token
+          .replace('á', 'a')
+          .replace('é', 'e')
+          .replace('í', 'i')
+          .replace('ó', 'o')
+          .replace('ú', 'u');
+      }
+    JS
+  }
+end
 ```
+This will register the `tildes` function in the lunr pipeline and add it when building the index. From the Lunr documentation:
+> Functions in the pipeline are called with three arguments: the current token being processed; the index of that token in the array of tokens, and the whole list of tokens part of the document being processed. This enables simple unigram processing of tokens as well as more sophisticated n-gram processing.
+>
+> The function should return the processed version of the text, which will in turn be passed to the next function in the pipeline. Returning undefined will prevent any further processing of the token, and that token will not make it to the index.
+Note that if you add a function to the pipeline, it will also be loaded when de-serialising the index, and lunr will fail with an `Cannot load un-registered function: tildes` error if it has not been re-registered. You can either register them manually, or simply include the following in a `.js.erb` file to be executed __before__ loading the index:
+```erb
+<%= lunr_js_pipeline %>
+```
 ## Index file
 The generated index file contains a JSON object with two properties:
@@ -67,6 +106,13 @@ The generated index file contains a JSON object with two properties:
 You will typically load the `index` into a lunr index instance, and then use the `docs` map to look up the returned value and present it to the user.
+You should also `require` the `lunr.min.js` file in your main sprockets javascript file to be able to actually load the index:
+```javascript
+//= require lunr.min
+```
 ## Acknowledgments
 A big thank you to:
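
Note on the README changes above: the relocated `before_index` section says a callback may `throw(:skip)` to drop a resource. The sketch below shows, in plain Ruby, how such a callback could interact with an indexing loop. It assumes the loop wraps each call in `catch(:skip)`; the gem's actual loop is not part of this diff, and `index_all` and the hash-based resources are hypothetical stand-ins rather than Middleman objects.

```ruby
# Hypothetical stand-in for the indexing loop (not the gem's real code).
# Each resource is processed inside catch(:skip), so a callback that
# throws :skip abandons that resource before it is added to the result.
def index_all(resources, callback)
  resources.each_with_object([]) do |resource, indexed|
    catch(:skip) do
      to_index = { title: resource[:title] }
      to_store = {}
      callback.call(to_index, to_store, resource) if callback
      indexed << to_index
    end
  end
end

# Same signature as the README's before_index callback.
skip_drafts = proc do |to_index, to_store, resource|
  throw(:skip) if resource[:draft]
  to_index[:title] = to_index[:title].upcase
end

resources = [
  { title: 'hello', draft: false },
  { title: 'wip',   draft: true }
]

result = index_all(resources, skip_drafts)
```

Under that assumption, only the non-draft resource ends up in `result`, with its title modified by the callback.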

data/lib/middleman-search/extension.rb CHANGED

@@ -5,12 +5,22 @@ module Middleman
   class SearchExtension < Middleman::Extension
     option :resources, [], 'Paths of resources to index'
     option :fields, {}, 'Fields to index, with their options'
-    option :before_index, nil, 'Callback receiving (to_index, to_store, resource) to execute before indexing a document'
+    option :before_index, nil, 'Callback to execute before indexing a document'
     option :index_path, 'search.json', 'Index file path'
+    option :pipeline, {}, 'Javascript pipeline functions to use in lunr index'
     def manipulate_resource_list(resources)
       resources.push Middleman::Sitemap::SearchIndexResource.new(@app.sitemap, @options[:index_path], @options)
       resources
     end
+    helpers do
+      def lunr_js_pipeline
+        # Thanks http://stackoverflow.com/a/20187415/12791
+        extensions[:search].options[:pipeline].map do |name, function|
+          "lunr.Pipeline.registerFunction(#{function}, '#{name}');"
+        end.join("\n")
+      end
+    end
   end
 end
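
Note on the new `lunr_js_pipeline` helper: it interpolates each configured function body and name into a `lunr.Pipeline.registerFunction` call. The following standalone sketch reproduces only that string-building step with a plain Hash in place of the extension options, to show what the helper would emit into a `.js.erb` file. `render_pipeline_js` and the `lowercase` entry are illustrative names, not part of the gem.

```ruby
# Standalone sketch of the string-building done by `lunr_js_pipeline`.
# Takes a plain Hash of { name => javascript_function_source } instead of
# reading Middleman's extension options.
def render_pipeline_js(pipeline)
  pipeline.map do |name, function|
    "lunr.Pipeline.registerFunction(#{function}, '#{name}');"
  end.join("\n")
end

# Hypothetical pipeline entry, used only for illustration.
pipeline = {
  lowercase: "function(token) { return token.toLowerCase(); }"
}

js = render_pipeline_js(pipeline)
puts js
```

Because the name is placed inside single quotes without escaping, a pipeline key containing a quote character would produce invalid JavaScript, so plain identifier-like keys are the safe choice.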

data/lib/middleman-search/search-index-resource.rb CHANGED

@@ -1,3 +1,5 @@
+# encoding: UTF-8
 module Middleman
   module Sitemap
     class SearchIndexResource < ::Middleman::Sitemap::Resource
@@ -5,6 +7,7 @@ module Middleman
         @resources_to_index = options[:resources]
         @fields = options[:fields]
         @callback = options[:before_index]
+        @pipeline = options[:pipeline]
         super(store, path)
       end
@@ -20,12 +23,28 @@ module Middleman
         # Build js context
         context = V8::Context.new
         context.load(File.expand_path('../../../vendor/assets/javascripts/lunr.min.js', __FILE__))
-        context.eval('lunr.Index.prototype.indexJson = function () {return JSON.stringify(this);}')
+        context.eval('lunr.Index.prototype.indexJson = function () {return JSON.stringify(this.toJSON());}')
+        # Register pipeline functions
+        pipeline = context.eval('lunr.Pipeline')
+        @pipeline.each do |name, function|
+          context[name] = context.eval("(#{function})")
+          pipeline.registerFunction(context[name], name)
+        end
         # Build lunr based on config
         lunr = context.eval('lunr')
         lunr_conf = proc do |this|
+          # Use autogenerated id field as reference
           this.ref('id')
+          # Add functions to pipeline (just registering them isn't enough)
+          @pipeline.each do |name, function|
+            this.pipeline.add(context[name])
+          end
+          # Define fields with boost
           @fields.each do |field, opts|
             next if opts[:index] == false
             this.field(field, {:boost => opts[:boost]})

data/lib/middleman-search/version.rb CHANGED

@@ -1,3 +1,3 @@
 module MiddlemanSearch
-  VERSION = "0.2.0"
+  VERSION = "0.3.0"
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: middleman-search
 version: !ruby/object:Gem::Version
-  version: 0.2.0
+  version: 0.3.0
 platform: ruby
 authors:
 - Matías García Isaía