traject 0.16.0 → 0.17.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (53)

checksums.yaml +7 -0
data/.yardopts +1 -0
data/README.md +183 -191
data/bench/bench.rb +1 -1
data/doc/batch_execution.md +14 -0
data/doc/extending.md +14 -12
data/doc/indexing_rules.md +265 -0
data/lib/traject/command_line.rb +12 -41
data/lib/traject/debug_writer.rb +32 -13
data/lib/traject/indexer.rb +101 -24
data/lib/traject/indexer/settings.rb +18 -17
data/lib/traject/json_writer.rb +32 -11
data/lib/traject/line_writer.rb +6 -6
data/lib/traject/macros/basic.rb +1 -1
data/lib/traject/macros/marc21.rb +17 -13
data/lib/traject/macros/marc21_semantics.rb +27 -25
data/lib/traject/macros/marc_format_classifier.rb +39 -25
data/lib/traject/marc4j_reader.rb +36 -22
data/lib/traject/marc_extractor.rb +79 -75
data/lib/traject/marc_reader.rb +33 -25
data/lib/traject/mock_reader.rb +9 -10
data/lib/traject/ndj_reader.rb +7 -7
data/lib/traject/null_writer.rb +1 -1
data/lib/traject/qualified_const_get.rb +12 -2
data/lib/traject/solrj_writer.rb +61 -52
data/lib/traject/thread_pool.rb +45 -45
data/lib/traject/translation_map.rb +59 -27
data/lib/traject/util.rb +3 -3
data/lib/traject/version.rb +1 -1
data/lib/traject/yaml_writer.rb +1 -1
data/test/debug_writer_test.rb +7 -7
data/test/indexer/each_record_test.rb +4 -4
data/test/indexer/macros_marc21_semantics_test.rb +12 -12
data/test/indexer/macros_marc21_test.rb +10 -10
data/test/indexer/macros_test.rb +1 -1
data/test/indexer/map_record_test.rb +6 -6
data/test/indexer/read_write_test.rb +43 -4
data/test/indexer/settings_test.rb +2 -2
data/test/indexer/to_field_test.rb +8 -8
data/test/marc4j_reader_test.rb +4 -4
data/test/marc_extractor_test.rb +33 -25
data/test/marc_format_classifier_test.rb +3 -3
data/test/marc_reader_test.rb +2 -2
data/test/test_helper.rb +3 -3
data/test/test_support/demo_config.rb +52 -48
data/test/translation_map_test.rb +22 -4
data/test/translation_maps/bad_ruby.rb +2 -2
data/test/translation_maps/both_map.rb +1 -1
data/test/translation_maps/default_literal.rb +1 -1
data/test/translation_maps/default_passthrough.rb +1 -1
data/test/translation_maps/ruby_map.rb +1 -1
metadata +7 -31
data/doc/macros.md +0 -103

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: ab462aadfb1252846b617cf1adb288eeb519b353
+  data.tar.gz: 7eac38dd8ac32e1dbfd417686ff04f95c108f011
+SHA512:
+  metadata.gz: 331350a2a93083b10710943e71bdf31b30bb3c6aeed9dde97f05fd232eaa34681a7ac0bcdf0d7aae9e37fd6ea7b9d3e4da1c840f036ed9abc6d57a01aea02e12
+  data.tar.gz: 381e2c56dc2b92e0b91330bf20275e47462b86c1301ef723f31297cae1702b1f2fb77e6b8a016cf9213879f4c593ce001b68e854649613f12ba9a96238dc9da2
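A `.gem` archive carries `metadata.gz` and `data.tar.gz` alongside this `checksums.yaml`, and the digests above are meant to match those two members. The following is a minimal sketch of how such a check could be done by hand, assuming the member contents have already been read into memory; the `checksum_mismatches` helper and the `contents` hash are hypothetical names used for illustration, not part of RubyGems or traject.

```ruby
require "digest"
require "yaml"

# Map the algorithm names used in checksums.yaml to Ruby digest classes.
DIGESTS = {
  "SHA1"   => Digest::SHA1,
  "SHA512" => Digest::SHA512
}.freeze

# checksums_yaml: the text of checksums.yaml
# contents:       hypothetical hash of archive member name => raw bytes
# Returns an array of [algorithm, member] pairs whose digest did not match.
def checksum_mismatches(checksums_yaml, contents)
  expected = YAML.safe_load(checksums_yaml)
  mismatches = []
  expected.each do |algorithm, members|
    digest_class = DIGESTS.fetch(algorithm)
    members.each do |member, hex|
      actual = digest_class.hexdigest(contents.fetch(member))
      mismatches << [algorithm, member] unless actual == hex.to_s
    end
  end
  mismatches
end
```

An empty result means every listed digest matched. In normal use RubyGems performs its own verification at install time, so a sketch like this is mainly useful for auditing a downloaded archive manually.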

data/.yardopts CHANGED

@@ -1,2 +1,3 @@
+--markup markdown
 -
 doc/*.md

data/README.md CHANGED

@@ -1,11 +1,12 @@
 # Traject
-Tools for indexing MARC records to Solr.
+Tools for reading MARC records, transforming them with indexing rules, and indexing to Solr.
+It can be used to index MARC data for a Solr-based discovery product like [Blacklight](https://github.com/projectblacklight/blacklight) or [VuFind](http://vufind.org/).
-Generalizable to tools for configuring mapping records to associative array data structures, and sending
-them somewhere.
+Traject might also be generalized to a set of tools for getting structured data from a source, and sending it to a destination.
-**Currently under development, not production ready**
+**Traject is nearing 1.0; it is robust, feature-rich, and ready for trial use**
 [![Gem Version](https://badge.fury.io/rb/traject.png)](http://badge.fury.io/rb/traject)
 [![Build Status](https://travis-ci.org/jrochkind/traject.png)](https://travis-ci.org/jrochkind/traject)
@@ -13,23 +14,18 @@ them somewhere.
 ## Background/Goals
-Existing tools for indexing Marc to Solr served us well for many years, and have many features.
-But we were having more and more difficulty with them, including in extending/customizing in maintainable ways.
-We realized that to create a tool with the API (internal and external) we wanted, we could do a better
-job with jruby (ruby on the JVM).
+Initially by Jonathan Rochkind (Johns Hopkins Libraries) and Bill Dueber (University of Michigan Libraries).
+Traject was born out of our experience with similar tools, including the very popular and useful [solrmarc](https://code.google.com/p/solrmarc/) by Bob Haschart; and Bill Dueber's own [marc2solr](http://github.com/billdueber/marc2solr/).
-* **Easy to use**, getting started with standard use cases should be easy, even for non-rubyists.
-* **Support customization and flexiblity**, common customization use cases, including simple local
-  logic, should be very easy. More sophisticated and even complex customization use cases should still be possible,
-  changing just the parts of traject you want to change.
-* **Maintainable local logic**, supporting sharing of reusable logic via ruby gems.
-* **Comprehensible internal logic**; well-covered by tests, well-factored separation of concerns,
-easy for newcomer developers who know ruby to understand the codebase.
-* **High performance**, using multi-threaded concurrency where appropriate to maximize throughput.
-traject likely will provide higher throughput than other similar solutions.
-* **Well-behaved shell script**, for painless integration in batch processes and cronjobs, with
-exit codes, sufficiently flexible control of logging, proper use of stderr, etc.
+We're comfortable programming (especially in a dynamic language), and want to be able to experiment with different indexing patterns quickly, easily, and testably; but are admittedly less comfortable in Java.  In order to have a tool with the APIs and usage patterns convenient for us, we found we could do it better in JRuby -- Ruby on the JVM.
+* Basic configuration files can be easily written even by non-rubyists,  with a few simple directives traject provides. But config files are 'ruby all the way down', so we can provide a gradual slope to more complex needs, with the full power of ruby.
+* Easy to program, easy to read, easy to modify.
+* Fast. Traject by default indexes using multiple threads, on multiple cpu cores.
+* Composed of decoupled components, for flexibility and extensibility. The whole code base is only 6400 lines of code, more than a third of which is tests.
+* Designed to support local code and configuration that's maintainable and testable, and can be shared between projects as ruby gems.
+* Designed with batch execution in mind: flexible logging, good exit codes, good use of stdin/stdout/stderr.
 ## Installation
@@ -41,25 +37,30 @@ Then just `gem install traject`.
 ( **Note**: We may later provide an all-in-one .jar distribution, which does not require you to install jruby or use on your system. This is hypothetically possible. Is it a good idea?)
-# Usage
-## Configuration file format
+## Configuration files
-The traject command-line utility requires you to supply it with a configuration file. So let's start by describing the configuration file.
+traject is configured using configuration files. To get a sense of what they look like, you can
+take a look at our sample non-trivial configuration file,
+[demo_config.rb](./test/test_support/demo_config.rb), which you'd run like
+`traject -c path/to/demo_config.rb marc_file.marc`.
 Configuration files are actually just ruby -- so by convention they end in `.rb`.
 We hope you can write basic useful configuration files without being a ruby expert,
-they give you a subset of ruby to work with. But the full power
+traject gives you some easy functions to use for common directives. But the full power
 of ruby is available to you if needed.
 **rubyist tip**: Technically, config files are executed with `instance_eval` in a Traject::Indexer instance, so the special commands you see are just methods on Traject::Indexer (or mixed into it). But you can
 call ordinary ruby `require` in config files, etc., too, to load
 external functionality. See more at Extending Logic below.
+You can keep your settings and indexing rules in one config file,
+or split them across multiple config files however you like. (Connection details vs indexing? Common things vs environmental specific things?)
 There are two main categories of directives in your configuration files: _Settings_, and _Indexing Rules_.
-### Settings
+## Settings
 Settings are a flat list of key/value pairs, where the keys are always strings and the values usually are. They look like this
 in a config file:
@@ -105,91 +106,58 @@ You can also use `store` if you want to force-set, last set wins.
 See the docs page on [Settings](./doc/settings.md) for a list
 of all standardized settings.
-### Indexing Rules
-You can keep your settings and indexing rules in one config file,
-or split them accross multiple config files however you like. (Connection details vs indexing? Common things vs environmental specific things?)
-The main tool for indexing rules is the `to_field` command.
-Which can be used with a few standard functions.
+## Indexing rules: Let's start with `to_field` and `extract_marc`
-~~~ruby
-# configuration.rb
-# The first arguent, 'source' in this case, is what Solr field we're
-# sending to. And the 'literal' function supplies a hard-coded
-# constant string literal.
-to_field "source", literal("LIB_CATALOG")
-# you can call 'to_field' multiple times, additional values
-# are concatenated
-to_field "source", literal("ANOTHER ONE")
-# Serialize the marc record back out and
-# put it in a solr field.
-to_field "marc_record", serialized_marc(:format => "xml")
-# or :format => "json" for marc-in-json
-# or :format => "binary", by default Base64-encoded for Solr
-# 'binary' field, or, for more like what SolrMarc did, without
-# escaping:
-to_field "marc_record_raw", serialized_marc(:format => "binary", :binary_escape => false)
-# Take ALL of the text from the marc record, useful for
-# a catch-all field. Actually by default only takes
-# from tags 100 to 899.
-to_field "text", extract_all_marc_values
-# Now we have a simple example of the general utility function
-# `extract_marc`
-to_field "id", extract_marc("001", :first => true)
-~~~
+There are a few methods that can be used to create indexing rules, but the
+one you'll use most often is called `to_field`, and establishes a rule
+to extract content to a particular named output field.
-`extract_marc` takes a marc tag/subfield specification, and optional
-arguments. `:first => true` means if the specification returned multiple values, ignore all bet the first. It is wise to use this
-*whenever you have a non-multi-valued solr field* even if you think "There should only be one 001 field anyway!", to deal with unexpected
-data properly.
+The extraction rule can use built-in 'macros', or, as we'll see later,
+entirely custom logic.
-Other examples of the specification string, which can include multiple tag mentions, as well as subfields and indicators:
+The built-in macro you'll use the most is `extract_marc`, to extract
+data out of a MARC record according to a tag/subfield specification.
 ~~~ruby
-  # 245 subfields a, p, and s. 130, all subfields.
-  # built-in punctuation trimming routine.
-  to_field "title_t", extract_marc("245nps:130", :trim_punctuation => true)
-  # Can limit to certain indicators with || chars.
-  # "*" is a wildcard in indicator spec.  So
-  # 856 with first indicator '0', subfield u.
-  to_field "email_addresses", extract_marc("856|0*|u")
-  # Can list tag twice with different field combinations
-  # to extract separately
-  to_field "isbn", extract_marc("245a:245abcde")
+    # Take the value of the first 001 field, and put
+    # it in output field 'id', to be indexed in Solr
+    # field 'id'
+    to_field "id", extract_marc("001", :first => true)
+    # 245 subfields n, p, and s. 130, all subfields.
+    # built-in punctuation trimming routine.
+    to_field "title_t", extract_marc("245nps:130", :trim_punctuation => true)
+    # Can limit to certain indicators with || chars.
+    # "*" is a wildcard in indicator spec.  So
+    # 856 with first indicator '0', subfield u.
+    to_field "email_addresses", extract_marc("856|0*|u")
+    # Can list tag twice with different field combinations
+    # to extract separately
+    to_field "isbn", extract_marc("245a:245abcde")
+    # For MARC Control ('fixed') fields, you can optionally
+    # use square brackets to take a byte offset.
+    to_field "language_code", extract_marc("008[35-37]")
 ~~~
-The `extract_marc` function *by default* includes any linked
-MARC `880` fields with alternate-script versions. Another reason
-to use the `:first` option if you really only want one.
+`extract_marc` by default includes all 'alternate script' linked fields corresponding
+to matched specifications, but you can turn that off, or extract *only* corresponding
+880s.
-By default, specifications with multiple subfields (like "240abc") will produce
-one single string of output for each matching field. Specifications
-with single subfields (like "020a") will split subfields and produce
-an output string for each matching subfield.
+    to_field "title", extract_marc("245abc", :alternate_script => false)
+    to_field "title_vernacular", extract_marc("245abc", :alternate_script => :only)
-For MARC control (aka 'fixed') fields, you can use square
-brackets to take a slice by byte offset.
+By default, specifications with multiple subfields (like "240abc") will produce one single string of output for each matching field. Specifications with single subfields (like "020a") will split subfields and produce an output string for each matching subfield.
-~~~ruby
-    to_field "langauge_code", extract_marc("008[35-37]")
-~~~
-For more information on extraction specifications, see
-the [MarcExtractor class](./lib/traject/marc_extractor.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/MarcExtractor)).
+For the syntax and complete possibilities of the specification
+string argument to extract_marc, see docs at the [MarcExtractor class](./lib/traject/marc_extractor.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/MarcExtractor)).
 `extract_marc` also supports `translation maps` similar
 to SolrMarc's. There are some translation maps provided by traject,
-and you can also define your own. translation maps can be supplied
-in yaml or ruby.  Translation maps are especially useful
+and you can also define your own, in yaml or ruby. Translation maps are especially useful
 for mapping from MARC codes to user-displayable strings:
 ~~~ruby
@@ -198,131 +166,152 @@ for mapping from MARC codes to user-displayable strings:
     to_field "language", extract_marc("008[35-37]:041a:041d", :translation_map => "marc_language_code")
 ~~~
-See [Traject::TranslationMap](./lib/traject/translation_map.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/TranslationMap)) for more info on translation mapping.
+To see all options for `extract_marc`, see the [method documentation](http://rdoc.info/gems/traject/Traject/Macros/Marc21:extract_marc)
-#### Direct indexing logic vs. Macros
+## other built-in utility macros
-It turns out all those functions we saw above used with `to_field` -- `literal`, `serialized_marc`, `extract_all_marc_values`, and `extract_marc` -- are what Traject calls 'macros'.
+Other built-in methods that can be used with `to_field` include a hard-coded
+literal string:
-They are all actually built based upon a more basic element of
-indexing functionality, which you can always drop down to, and
-which is used to build the macros. The basic use of `to_field`,
-with directly specified logic instead of using a macro, looks like this:
+    to_field "source", literal("LIB_CATALOG")
-~~~ruby
-to_field "source" do |record, accumulator, context|
-   accumulator << "LIB CATALOG"
-end
-~~~~
+The current record serialized back out as MARC, in binary, XML, or json:
-That's actually equivalent to the macro we used earlier: `to_field("source"), literal("LIB_CATALOG")`.
+    # or :format => "json" for marc-in-json
+    # or :format => "binary", by default Base64-encoded for Solr
+    # 'binary' field, or, for more like what SolrMarc did, without
+    # escaping:
+    to_field "marc_record_raw", serialized_marc(:format => "binary", :binary_escape => false, :allow_oversized => true)
-This direct use of to_field happens to be a ruby "block", which is
-used to define a block of logic that can be stored and executed later. When the block is called, first argument (`record` above) is the marc_record being indexed (a ruby-marc MARC::Record object), and the second argument (`accumulator`) is a ruby array used to accumulate output values.
+Text of all fields in a range:
-The third argument is a `Traject::Indexer::Context` object that can
-be used for more advanced functionality, including caching expensive
-per-record calculations, writing out to more than one output field at a time, or taking account of current Traject Settings in your logic. The third argument is optional, you can supply
-a two-argument block too.
+    to_field "text", extract_all_marc_values(:from => 100, :to => 899)
-You can always drop out to this basic direct use whenever you need
-special purpose logic, directly in the config file, writing in
-ruby:
-~~~ruby
-# this is more or less nonsense, just an example
-to_field "weird_title" do |record, accumlator, context|
-   field = record['245']
-   title = field['a']
-   title.upcase! if field.indicator1 = '1'
-   accumulator << title
-end
+All of these methods are defined at [Traject::Macros::Marc21](./lib/traject/macros/marc21.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/Macros/Marc21))
-# To make use of marc extraction by specification, just like
-# marc_extract does, you may want to use the Traject::MarcExtractor
-# class
-to_field "weirdo" do |record, accumulator, context|
-   # use MarcExtractor.cached for performance, globally
-   # caching the MarcExtractor we create. See docs
-   # at MarcExtractor.
-   list = MarcExtractor.cached("700a").extract(record)
+## more complex canned MARC semantic logic
-   # combine all the 700a's in ONE string, cause we're weird
-   list = list.join(" ")
+Some more complex (and opinionated/subjective) algorithms for deriving semantics
+from Marc are also packaged with Traject, but not available by default. To make
+them available to your indexing, you just need to use ruby `require` and `extend`.
-   accumulator << list
-end
-~~~
+A number of methods are in [Traject::Macros::Marc21Semantics](./lib/traject/macros/marc21_semantics.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/Macros/Marc21Semantics))
-You can also *combine* a macro and a direct block for some
-post-processing. In this case, the `accumulator` parameter
-in our block will start out with the values left by
-the `extract_marc`:
+    require 'traject/macros/marc21_semantics'
+    extend Traject::Macros::Marc21Semantics
-~~~ruby
-to_field "subjects", extract_marc("600:650:610") do |record, accumulator, context|
-  # for some reason we want to uppercase all our subjects
-  accumulator.collect! {|s| s.upcase }
-end
-~~~
+    to_field 'title_sort',        marc_sortable_title
+    to_field 'broad_subject',     marc_lcc_to_broad_category
+    to_field "geographic_facet",  marc_geo_facet
+    # And several more
+And, there's a routine for classifying MARC to an internal
+format/genre/type vocabulary:
+    require 'traject/macros/marc_format_classifier'
+    extend Traject::Macros::MarcFormats
+    to_field 'format_facet',    marc_formats
+## Custom logic
+The built-in routines are there for your convenience, but if you need
+something local or custom, you can write ruby logic directly
+in a configuration file, using a ruby block, which looks like this:
+    to_field "id" do |record, accumulator|
+       # take the record's 001, prefix it with "bib_",
+       # and then add it to the 'accumulator' argument,
+       # to send it to the specified output field
+       value = record['001'].value
+       value = "bib_#{value}"
+       accumulator << value
+    end
+`do |record, accumulator|` is the definition of a ruby block taking
+two arguments.  The first one passed in will be a MARC record. The
+second is an array, you add values to the array to send them to
+output.
+Here's a more realistic example that shows how you'd get the
+record type byte 06 out of a MARC leader, then translate it
+to a human-readable string with a TranslationMap
+    to_field "marc_type" do |record, accumulator|
+      leader06 = record.leader.byteslice(6)
+      # this translation map doesn't actually exist, but could
+      accumulator << TranslationMap.new("marc_leader")[ leader06 ]
+    end
-If you find yourself repeating code a lot in direct blocks, you
-can supply your _own_ macros, for local use, or even to share
-with others in a ruby gem. See docs [Macros](./doc/macros.md)
+You can also add a block onto the end of a built-in 'macro', to
+further customize the output. The `accumulator` passed to your block
+will already have values in it from the first step, and you can
+use ruby methods like `map!` to modify it:
-#### each_record
+    to_field "big_title", extract_marc("245abcdefg") do |record, accumulator|
+      # put it all in all uppercase, I don't know why.
+      accumulator.map! {|v| v.upcase}
+    end
-There is also a method `each_record`, which is like `to_field`, but without
-a specific field. It can be used for other side-effects of your choice, or
-even for writing to multiple fields.
+There are many more things you can do with custom logic blocks like this too,
+including additional features we haven't discussed yet.
+If you find yourself repeating boilerplate code in your custom logic, you can
+even create your own 'macros' (like `extract_marc`). `extract_marc` and other
+macros are nothing more than methods that return ruby lambda objects of
+the same format as the blocks you write for custom logic.
+For tips, gotchas, and a more complete explanation of how this works, see
+additional documentation page on [Indexing Rules: Macros and Custom Logic](./doc/indexing_rules.md)
+## each_record and after_processing
+In addition to `to_field`, an `each_record` method is available, which,
+like `to_field`, is executed for every record, but without being tied
+to a specific field.
+`each_record` can be used for logging or notifying; computing intermediate
+results; or writing to more than one field at once.
 ~~~ruby
-  each_record do |record, context|
-    # example of writing to two fields at once.
-    (x, y) = Something.do_stuff
-    (context["one_field"] ||= [])     << x
-    (context["another_field"] ||= []) << y
+  each_record do |record|
+    some_custom_logging(record)
   end
 ~~~
-You could write or use macros for `each_record` too. It's suggested that
-such a macro take the field names it will effect as arguments (example?)
+For more on `each_record`, see documentation page on [Indexing Rules: Macros and Custom Logic](./doc/indexing_rules.md).
-`each_record` and `to_field` calls will be processed in one big order, guaranteed
-in order.
+There is also an `after_processing` method that can be used to register
+logic that will be called after the entire input has been processed. You can use it for whatever custom
+ruby code you might want for your app (send an email? Clean up a log file? Trigger
+a Solr replication?)
 ~~~ruby
-  to_field("foo") {...}  # will be called first on each record
-  each_record {...}      # will always be called AFTER above has potentially added values
-  to_field("foo") {...}  # and will be called after each of the preceding for each record
+after_processing do
+  whatever_ruby_code
+end
 ~~~
-#### Sample config
-A fairly complex sample config file can be found at [./test/test_support/demo_config.rb](./test/test_support/demo_config.rb)
+## Writers
-#### Built-in MARC21 Semantics
+Traject uses modular 'Writer' classes to take the output hashes from transformation, and
+send them somewhere or do something useful with them.
-There is another package of 'macros' that comes with Traject for extracting semantics
-from Marc21.  These are sometimes 'opinionated', using heuristics or algorithms
-that are not inherently part of Marc21, but have proven useful in actual practice.
+By default traject uses the [Traject::SolrJWriter](lib/traject/solrj_writer.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/SolrJWriter)) to send to Solr for indexing.
+A couple other writers are available too, mostly for debugging purposes:
+[Traject::DebugWriter](lib/traject/debug_writer.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/DebugWriter))
+and [Traject::JsonWriter](lib/traject/json_writer.rb) ([rdoc](http://rdoc.info/gems/traject/Traject/JsonWriter))
-It's not loaded by default, you can use straight ruby `require` and `extend`
-to load the macros into the indexer.
+You set which writer is being used in settings (`provide "writer_class_name", "Traject::DebugWriter"`),
+or on the command-line as a shortcut with `-w Traject::DebugWriter`.
-~~~ruby
-# in a traject config file, extend so we can use methods from...
-require 'traject/macros/marc21_semantics'
-extend Traject::Macros::Marc21Semantics
-to_field "date",        marc_publication_date
-to_field "author_sort", marc_sortable_author
-to_field "inst_facet",  marc_instrumentation_humanized
-~~~
+You can write your own Readers and Writers if you'd like, see comments at top
+of [Traject::Indexer](lib/traject/indexer.rb).
-See documented list of macros available in [Marc21Semantics](./lib/traject/macros/marc21_semantics.rb)
-## Command Line
+## The traject command Line
 The simplest invocation is:
@@ -363,7 +352,7 @@ Use `-u` as a shortcut for `s solr.url=X`
 Run `traject -h` to see the command line help screen listing all available options.
-Also see `-I load_path` and `-G Gemfile` options under Extending With Your Own Code.
+Also see `-I load_path` option and suggestions for Bundler use under Extending With Your Own Code.
 See also [Hints for batch and cronjob use](./doc/batch_execution.md) of traject.
@@ -396,9 +385,9 @@ Own Code](./doc/extending.md)
   * translation map files found on the load path or in a
     "./translation_maps" subdir on the load path will be found
     for Traject translation maps.
-* Traject `-G` command line can be used to tell traject to use
-  bundler with a `Gemfile` located at current working dirctory
-  (or give an argument to `-G ./some/myGemfile`)
+* Use [Bundler](http://bundler.io/) with traject simply by creating a Gemfile with `bundle init`,
+  and then running command line with `bundle exec traject` or
+  even `BUNDLE_GEMFILE=path/to/Gemfile bundle exec traject`
 ## More
@@ -423,6 +412,9 @@ this with other developers first!)
 Pull requests should come with tests, as well as docs where applicable. Docs can be inline rdoc-style, edits to this README,
 and/or extra files in ./docs -- as appropriate for what needs to be docs.
+**Inline api docs** Note that our [`.yardopts` file](./.yardopts) used by rdoc.info to generate
+online api docs has a `--markup markdown` specified -- inline class/method docs are in markdown, not rdoc.
 ## TODO