traject 0.13.0 → 0.13.1

data/.yardopts ADDED
@@ -0,0 +1,2 @@
+ -
+ doc/*.md
data/README.md CHANGED
@@ -15,15 +15,19 @@ them somewhere.
 
  Existing tools for indexing Marc to Solr exist, and have served us well for many years, and have many useful things about them -- which I've tried to preserve in traject. But I was having more and more difficulty working with the existing tools, including difficulty providing the custom logic I needed in a maintainable way. I realized that for me, to create a tool with the flexibility, maintainability, and performance I wanted, I would need to do it in jruby (ruby on the JVM).
 
- Some goals:
-
- * Aim to be accessible even to non-rubyists
- * Concise and maintainable local configuration -- including an only gradual increase in difficulty to write your own simple logic.
- * Support reusable and shareable mapping logic routines.
- * Built of modular and composable elements: If you want to change part of what traject does, you should be able to do so without having to reimplement other things you don't want to change.
- * A maintainable internal architecture, well-factored with seperated concerns and DRY logic. Aim to be comprehensible to newcomer developers, and well-covered by tests.
- * High performance, using multi-threaded concurrency where appropriate to maximize throughput. Actual throughput can depend on complexity of your mapping rules and capacity of your server(s), but I am getting throughput 2-5x greater than previous solutions.
- * Cooperate well in unix batch/pipeline, with control over output/logging of errors, proper exit codes, use of stdin/stdout, etc.
+ * *Easy to use*, getting started with standard use cases should be easy, even for non-rubyists.
+ * *Support customization and flexiblity*, common customization use cases, including simple local
+ logic, should be very easy. More sophisticated and even complex customization use cases should still be possible,
+ changing just the parts of traject you want to change.
+ * *Maintainable local logic*, including supporting sharing of reusable logic via ruby gems.
+ * *Maintainable understandable internal logic*; well-covered by tests, well-factored seperation of concerns,
+ easy for newcomer developers who know ruby to understand the codebase.
+ * *High performance*, using multi-threaded concurrency where appropriate to maximize throughput.
+ While it depends on your configuration and the size of your server(s), traject is likely higher
+ performance than other similar solutions.
+ * *Well-behaved shell script*, for painless integration in batch processes and cronjobs, with
+ exit codes, sufficiently flexible control of logging, proper use of stderr, etc.
+
 
 
  ## Installation
data/doc/extending.md CHANGED
@@ -5,7 +5,7 @@ organize it in files other than traject config files, but then
  use it in traject config files.
 
  You might want to have code local to your traject project; or you
- might want to use ruby gems with shared code in your traject project.
+ might want to use ruby gems to share code between projects and developers.
  A given project may use both of these techniques.
 
  Here are some suggestions for how to do this, along with mention
@@ -16,7 +16,7 @@ of a couple traject features meant to make it easier.
  * Traject `-I` argument command line can be used to list directories to
  add to the load path, similar to the `ruby -I` argument. You
  can then 'require' local project files from the load path.
- * translation map files found on the load path or in a
+ * translation map files found in a
  "./translation_maps" subdir on the load path will be found
  for Traject translation maps.
  * Traject `-G` command line can be used to tell traject to use
@@ -26,7 +26,7 @@ of a couple traject features meant to make it easier.
  ## Custom code local to your project
 
  You might want local translation maps, or local ruby
- code. Here's a standard way you might lay out
+ code. Here's a standard recommended way you might lay out
  this extra code in the file system, using a 'lib'
  directory kept next to your traject config files:
 
@@ -97,8 +97,8 @@ That's pretty much it!
 
  What about that translation map? The `$LOAD_PATH` modification
  took care of that too, the Traject::TranslationMap will look
- up translation map definition files on the load path, or
- in a `./translation_maps` subdir on the load path.
+ up translation map definition files
+ in a `./translation_maps` subdir on the load path, as in `./lib/translation_maps` in this case.
 
 
  ## Using gems in your traject project
@@ -128,11 +128,10 @@ require 'some_gem'
  SomeGem.whatever!
  ~~~
 
- Any gem can provide traject translation map definitions
- in it's `lib` directory, or in a `lib/translation_maps`
- sub-directory, and traject will be able to find those
+ A gem can provide traject translation map definitions
+ in a `lib/translation_maps` sub-directory, and traject will be able to find those
  translation maps when the gem is loaded. (Because gems'
- `./lib` directories are added to the ruby load path.)
+ `./lib` directories are by default added to the ruby load path.)
 
  ### Or, with bundler:
 
@@ -161,9 +160,14 @@ possibly with version restrictions, in the [Gemfile](http://bundler.io/v1.3/gemf
  Run `bundle install` from the directory with the Gemfile, on any system
  at any time, to make sure specified gems are installed.
 
- **Run traject** with the `-G` flag to tell it to use the Gemfile:
+ **Run traject** with the `-G` flag to tell it to use the Gemfile, for instance if
+ your working directory is the one that includes your Gemfile:
 
- traject -G -c some_traject_config.rb ...
+ traject -G -c some_traject_config.rb ...
+
+ Or explicitly specify a Gemfile somewhere else:
+
+ traject -G /some/path/Gemfile -c some_config.rb ...
 
  Traject will use bundler to setup with the Gemfile, making sure
  the specified versions of all gems are used (and also making sure
@@ -179,4 +183,4 @@ that bundler creates into your source control repo. The
  gem dependencies are currently being used, so you can get the exact
  same dependency environment on different servers.
 
- See the [bundler documentation](http://bundler.io/#getting-started), or google, for more information.
+ See the [bundler documentation](http://bundler.io/#getting-started), or google, for more information.
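The extending.md hunks above narrow where translation map files are looked up: only a `translation_maps` subdirectory of each load path entry, no longer the load path entry itself. A minimal sketch of that rule follows. The helper name `find_translation_map`, the per-directory search order, and the extension list are assumptions made for illustration; this is not traject's implementation.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of traject): return the first file named
# translation_maps/<name>.<ext> under any directory of the given load path.
def find_translation_map(name, load_path = $LOAD_PATH)
  load_path.each do |dir|
    # Assumption: .rb is tried before .yaml, echoing the
    # "finds .rb over .yaml" test that appears later in this diff.
    %w[rb yaml].each do |ext|
      candidate = File.join(dir.to_s, "translation_maps", "#{name}.#{ext}")
      return candidate if File.file?(candidate)
    end
  end
  nil
end

# Usage: a file sitting directly on the load path is ignored; the same
# kind of file under translation_maps/ is found.
Dir.mktmpdir do |lib|
  FileUtils.mkdir_p(File.join(lib, "translation_maps"))
  File.write(File.join(lib, "stray_map.yaml"), "a: b\n")
  File.write(File.join(lib, "translation_maps", "my_map.yaml"), "a: b\n")

  p find_translation_map("stray_map", [lib])
  puts find_translation_map("my_map", [lib])
end
```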
@@ -33,14 +33,8 @@ module Traject
  # Returns true on success or false on failure; may also raise exceptions;
  # may also exit program directly itself (yeah, could use some normalization)
  def execute
-   if options[:version]
-     self.console.puts "traject version #{Traject::VERSION}"
-     return
-   end
-   if options[:help]
-     self.console.puts slop.help
-     return
-   end
+   # Do bundler setup FIRST to try and initialize all gems from gemfile
+   # if requested.
 
    # have to use Slop object to tell diff between
    # no arg supplied and no option -g given at all
@@ -48,11 +42,21 @@ module Traject
      require_bundler_setup(options[:Gemfile])
    end
 
+
    # We require them here instead of top of file,
    # so we have done bundler require before we require these.
    require 'traject'
    require 'traject/indexer'
 
+   if options[:version]
+     self.console.puts "traject version #{Traject::VERSION}"
+     return
+   end
+   if options[:help]
+     self.console.puts slop.help
+     return
+   end
+
 
    (options[:load_path] || []).each do |path|
      $LOAD_PATH << path unless $LOAD_PATH.include? path
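The `-G` handling in this file has to tell three cases apart: flag absent, flag given bare, and flag given with a Gemfile path. The sketch below illustrates that three-way distinction with Ruby's standard `OptionParser`. Traject itself uses Slop, so this shows the idea only and not the code in this diff; the method name `parse_gemfile_flag` and the symbols it returns are hypothetical.

```ruby
require "optparse"

# Illustrative only: traject uses Slop, not OptionParser.
# Returns :no_bundler, :default_gemfile, or the given Gemfile path.
def parse_gemfile_flag(argv)
  options = {}
  parser = OptionParser.new do |opts|
    # "[PATH]" marks the argument as optional; the block receives nil
    # when -G is given without a path.
    opts.on("-G", "--Gemfile [PATH]") { |path| options[:gemfile] = path }
    opts.on("-c", "--conf FILE") { |file| options[:conf] = file }
  end
  parser.parse(argv)

  if !options.key?(:gemfile)
    :no_bundler          # -G never appeared
  elsif options[:gemfile].nil?
    :default_gemfile     # bare -G: let bundler locate the Gemfile
  else
    options[:gemfile]    # -G /some/path/Gemfile
  end
end

p parse_gemfile_flag(%w[-c some_config.rb])
p parse_gemfile_flag(%w[-G -c some_config.rb])
p parse_gemfile_flag(%w[-G /some/path/Gemfile -c some_config.rb])
```

With a `nil` value for a bare flag, presence has to be checked separately from the value, which is the same distinction the comment in `execute` draws for Slop.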
@@ -282,7 +286,7 @@ module Traject
  on :j, "output as pretty printed json, shortcut for -s writer_class_name=JsonWriter -s json_writer.pretty_print=true"
  on :t, :marc_type, "xml, json or binary. shortcut for -s marc_source.type=", :argument => true
  on :I, "load_path", "append paths to ruby $LOAD_PATH", :argument => true, :as => Array, :delimiter => ":"
- on :G, "Gemfile", "run with bundler and optionally specified Gemfile", :argument => :optional, :default => ""
+ on :G, "Gemfile", "run with bundler and optionally specified Gemfile", :argument => :optional, :default => nil
 
  on :x, "command", "alternate traject command: process (default); marcout", :argument => true, :default => "process"
 
@@ -109,6 +109,10 @@ module Traject
      end
    end
 
+   # Cached hash can't be mutated without weird consequences, let's
+   # freeze it!
+   found.freeze if found
+
    return found
  end
 
@@ -141,7 +145,7 @@ module Traject
    if options[:default]
      @default = options[:default]
    elsif @hash.has_key? "__default__"
-     @default = @hash.delete("__default__")
+     @default = @hash["__default__"]
    end
  end
 
@@ -158,6 +162,12 @@ module Traject
  end
  alias_method :map, :[]
 
+ # Returns a dup of internal hash, dup so you can modify it
+ # if you like.
+ def to_hash
+   @hash.dup
+ end
+
  # Run every element of an array through this translation map,
  # return the resulting array. If translation map returns nil,
  # original element will be missing from output.
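The three translation map hunks above fit together. The cached hash is shared between map instances, so the old `@hash.delete("__default__")` would remove the default for every later instance; the new code reads the key instead, freezes the cached hash, and hands out copies through `to_hash`. A minimal sketch of that interaction follows, using hypothetical stand-in classes `SketchCache` and `SketchMap` rather than traject's own code.

```ruby
# Hypothetical stand-ins, not traject's classes: a cache that hands out one
# shared Hash per map name, and a map object built on top of it.
class SketchCache
  def initialize(definitions)
    @definitions = definitions
    @loaded = {}
  end

  def lookup(name)
    @loaded[name] ||= begin
      found = @definitions[name]
      # Freeze the shared hash so no caller can mutate it by accident.
      found && found.dup.freeze
    end
  end
end

class SketchMap
  def initialize(name, cache)
    @hash = cache.lookup(name)
    # Read the default instead of deleting it: the hash is shared and frozen.
    @default = @hash["__default__"] if @hash.has_key?("__default__")
  end

  def [](key)
    @hash.fetch(key, @default)
  end

  # A dup of a frozen Hash is not frozen, so callers may modify the copy.
  def to_hash
    @hash.dup
  end
end

cache = SketchCache.new("demo" => { "key1" => "value1", "__default__" => "DEFAULT" })
SketchMap.new("demo", cache)
second = SketchMap.new("demo", cache)
puts second["not in the map"]   # prints DEFAULT
```

Had the first `SketchMap` deleted `"__default__"` from an unfrozen shared hash, the second lookup would have returned `nil`, which is the failure the "respects in-file default, even on second load" test further down guards against.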
@@ -1,3 +1,3 @@
  module Traject
-   VERSION = "0.13.0"
+   VERSION = "0.13.1"
  end
@@ -27,6 +27,19 @@ describe "TranslationMap" do
      assert_equal "value1", found["key1"]
    end
 
+   it "freezes the hash" do
+     found = @cache.lookup("yaml_map")
+
+     assert found.frozen?
+   end
+
+   it "respects in-file default, even on second load" do
+     map = Traject::TranslationMap.new("default_literal")
+     map = Traject::TranslationMap.new("default_literal")
+
+     assert_equal "DEFAULT LITERAL", map["not in the map"]
+   end
+
    it "finds .rb over .yaml" do
      found = @cache.lookup("both_map")
 
@@ -103,4 +116,17 @@ describe "TranslationMap" do
 
      assert_equal ["hola", "first", "second", "last thing", "buenas noches", "hola", "everything else"], arr
    end
+
+   it "#to_hash" do
+     map = Traject::TranslationMap.new("yaml_map")
+
+     hash = map.to_hash
+
+     assert_kind_of Hash, hash
+
+     assert ! hash.frozen?, "#to_hash result is not frozen"
+
+     refute_same hash, map.to_hash, "each #to_hash result is a copy"
+   end
+
  end
data/traject.gemspec CHANGED
@@ -17,6 +17,8 @@ Gem::Specification.new do |spec|
  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
  spec.require_paths = ["lib"]
 
+ spec.extra_rdoc_files = spec.files.grep(%r{^doc/})
+
 
  spec.add_dependency "marc", ">= 0.7.1"
  spec.add_dependency "marc-marc4j", ">=0.1.1"
metadata CHANGED
@@ -2,14 +2,14 @@
  name: traject
  version: !ruby/object:Gem::Version
    prerelease:
-   version: 0.13.0
+   version: 0.13.1
  platform: ruby
  authors:
  - Jonathan Rochkind
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-09-12 00:00:00.000000000 Z
+ date: 2013-09-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: marc
@@ -157,10 +157,16 @@ email:
  executables:
  - traject
  extensions: []
- extra_rdoc_files: []
+ extra_rdoc_files:
+ - doc/batch_execution.md
+ - doc/extending.md
+ - doc/macros.md
+ - doc/other_commands.md
+ - doc/settings.md
  files:
  - .gitignore
  - .travis.yml
+ - .yardopts
  - Gemfile
  - LICENSE.txt
  - README.md