bigindex-solr 1.2.0

Files changed (100)
  1. data/.document +5 -0
  2. data/.gitignore +8 -0
  3. data/MIT-LICENSE +20 -0
  4. data/README.rdoc +73 -0
  5. data/Rakefile +49 -0
  6. data/VERSION +1 -0
  7. data/bigindex-solr.gemspec +141 -0
  8. data/examples/bigindex.yml +9 -0
  9. data/install.rb +17 -0
  10. data/lib/bigindex-solr.rb +32 -0
  11. data/lib/bigindex-solr/tasks.rb +3 -0
  12. data/lib/tasks/bigindex_solr.rake +55 -0
  13. data/schema.xml +126 -0
  14. data/solr/CHANGES.txt +562 -0
  15. data/solr/LICENSE.txt +608 -0
  16. data/solr/NOTICE.txt +57 -0
  17. data/solr/etc/jetty.xml +206 -0
  18. data/solr/etc/webdefault.xml +379 -0
  19. data/solr/lib/commons-codec-1.3.jar +0 -0
  20. data/solr/lib/commons-csv-0.1-SNAPSHOT.jar +0 -0
  21. data/solr/lib/commons-fileupload-1.2.jar +0 -0
  22. data/solr/lib/commons-io-1.2.jar +0 -0
  23. data/solr/lib/easymock.jar +0 -0
  24. data/solr/lib/jetty-6.1.3.jar +0 -0
  25. data/solr/lib/jetty-util-6.1.3.jar +0 -0
  26. data/solr/lib/jsp-2.1/ant-1.6.5.jar +0 -0
  27. data/solr/lib/jsp-2.1/core-3.1.1.jar +0 -0
  28. data/solr/lib/jsp-2.1/jsp-2.1.jar +0 -0
  29. data/solr/lib/jsp-2.1/jsp-api-2.1.jar +0 -0
  30. data/solr/lib/lucene-analyzers-2007-05-20_00-04-53.jar +0 -0
  31. data/solr/lib/lucene-core-2007-05-20_00-04-53.jar +0 -0
  32. data/solr/lib/lucene-highlighter-2007-05-20_00-04-53.jar +0 -0
  33. data/solr/lib/lucene-snowball-2007-05-20_00-04-53.jar +0 -0
  34. data/solr/lib/lucene-spellchecker-2007-05-20_00-04-53.jar +0 -0
  35. data/solr/lib/servlet-api-2.4.jar +0 -0
  36. data/solr/lib/servlet-api-2.5-6.1.3.jar +0 -0
  37. data/solr/lib/xpp3-1.1.3.4.O.jar +0 -0
  38. data/solr/solr/README.txt +52 -0
  39. data/solr/solr/bin/abc +176 -0
  40. data/solr/solr/bin/abo +176 -0
  41. data/solr/solr/bin/backup +108 -0
  42. data/solr/solr/bin/backupcleaner +142 -0
  43. data/solr/solr/bin/commit +128 -0
  44. data/solr/solr/bin/optimize +129 -0
  45. data/solr/solr/bin/readercycle +129 -0
  46. data/solr/solr/bin/rsyncd-disable +77 -0
  47. data/solr/solr/bin/rsyncd-enable +76 -0
  48. data/solr/solr/bin/rsyncd-start +145 -0
  49. data/solr/solr/bin/rsyncd-stop +105 -0
  50. data/solr/solr/bin/scripts-util +83 -0
  51. data/solr/solr/bin/snapcleaner +148 -0
  52. data/solr/solr/bin/snapinstaller +168 -0
  53. data/solr/solr/bin/snappuller +248 -0
  54. data/solr/solr/bin/snappuller-disable +77 -0
  55. data/solr/solr/bin/snappuller-enable +77 -0
  56. data/solr/solr/bin/snapshooter +109 -0
  57. data/solr/solr/conf/admin-extra.html +31 -0
  58. data/solr/solr/conf/protwords.txt +21 -0
  59. data/solr/solr/conf/schema.xml +126 -0
  60. data/solr/solr/conf/scripts.conf +24 -0
  61. data/solr/solr/conf/solrconfig.xml +458 -0
  62. data/solr/solr/conf/stopwords.txt +57 -0
  63. data/solr/solr/conf/synonyms.txt +31 -0
  64. data/solr/solr/conf/xslt/example.xsl +132 -0
  65. data/solr/solr/conf/xslt/example_atom.xsl +63 -0
  66. data/solr/solr/conf/xslt/example_rss.xsl +62 -0
  67. data/solr/solr/data/development/index/_a.fdt +0 -0
  68. data/solr/solr/data/development/index/_a.fdx +0 -0
  69. data/solr/solr/data/development/index/_a.fnm +2 -0
  70. data/solr/solr/data/development/index/_a.frq +1 -0
  71. data/solr/solr/data/development/index/_a.nrm +1 -0
  72. data/solr/solr/data/development/index/_a.prx +0 -0
  73. data/solr/solr/data/development/index/_a.tii +0 -0
  74. data/solr/solr/data/development/index/_a.tis +0 -0
  75. data/solr/solr/data/development/index/_a_1.del +0 -0
  76. data/solr/solr/data/development/index/_b.fdt +0 -0
  77. data/solr/solr/data/development/index/_b.fdx +0 -0
  78. data/solr/solr/data/development/index/_b.fnm +2 -0
  79. data/solr/solr/data/development/index/_b.frq +0 -0
  80. data/solr/solr/data/development/index/_b.nrm +1 -0
  81. data/solr/solr/data/development/index/_b.prx +0 -0
  82. data/solr/solr/data/development/index/_b.tii +0 -0
  83. data/solr/solr/data/development/index/_b.tis +0 -0
  84. data/solr/solr/data/development/index/_b_1.del +0 -0
  85. data/solr/solr/data/development/index/_c.fdt +0 -0
  86. data/solr/solr/data/development/index/_c.fdx +0 -0
  87. data/solr/solr/data/development/index/_c.fnm +2 -0
  88. data/solr/solr/data/development/index/_c.frq +1 -0
  89. data/solr/solr/data/development/index/_c.nrm +1 -0
  90. data/solr/solr/data/development/index/_c.prx +0 -0
  91. data/solr/solr/data/development/index/_c.tii +0 -0
  92. data/solr/solr/data/development/index/_c.tis +0 -0
  93. data/solr/solr/data/development/index/_c_1.del +0 -0
  94. data/solr/solr/data/development/index/segments.gen +0 -0
  95. data/solr/solr/data/development/index/segments_q +0 -0
  96. data/solr/start.jar +0 -0
  97. data/solr/webapps/solr.war +0 -0
  98. data/spec/bigindex-solr_spec.rb +7 -0
  99. data/spec/spec_helper.rb +9 -0
  100. metadata +163 -0
data/solr/CHANGES.txt ADDED
@@ -0,0 +1,562 @@
1
+ Apache Solr Version 1.2
2
+ Release Notes
3
+
4
+ Introduction
5
+ ------------
6
+ Apache Solr is an open source enterprise search server based on the Lucene Java
7
+ search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search,
8
+ caching, replication, and a web administration interface. It runs in a Java
9
+ servlet container such as Tomcat.
10
+
11
+ See http://lucene.apache.org/solr for more information.
12
+
13
+
14
+ Getting Started
15
+ ---------------
16
+ You need a Java 1.5 VM or later installed.
17
+ In this release, there is an example Solr server including a bundled
18
+ servlet container in the directory named "example".
19
+ See the tutorial at http://lucene.apache.org/solr/tutorial.html
20
+
21
+
22
+ $Id: CHANGES.txt 543263 2007-05-31 21:19:02Z yonik $
23
+
24
+ ================== Release 1.2, 20070603 ==================
25
+
26
+ Upgrading from Solr 1.1
27
+ -------------------------------------
28
+ IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
29
+ should be upgraded before the master! If the master were to be updated
30
+ first, the older searchers would not be able to read the new index format.
31
+
32
+ Older Apache Solr installations can be upgraded by replacing
33
+ the relevant war file with the new version. No changes to configuration
34
+ files should be needed.
35
+
36
+ This version of Solr contains a new version of Lucene implementing
37
+ an updated index format. This version of Solr/Lucene can still read
38
+ and update indexes in the older formats, and will convert them to the new
39
+ format on the first index change. One change in the new index format
40
+ is that all "norms" are kept in a single file, greatly reducing the number
41
+ of files per segment. Users of compound file indexes will want to consider
42
+ converting to the non-compound format for faster indexing and slightly better
43
+ search concurrency.
44
+
45
+ The JSON response format for facets has changed to make it easier for
46
+ clients to retain sorted order. Use json.nl=map explicitly in clients
47
+ to get the old behavior, or add it as a default to the request handler
48
+ in solrconfig.xml
49
+
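To illustrate the facet format change described above, here is a minimal Python sketch of a client reading a flat, order-preserving facet listing (one JSON array alternating name and count, the json.nl=flat style) into ordered pairs. The helper name flat_to_pairs and the sample values are hypothetical and are not part of Solr.

```python
from typing import Any, List, Tuple


def flat_to_pairs(flat: List[Any]) -> List[Tuple[Any, Any]]:
    """Turn [name1, count1, name2, count2, ...] into [(name1, count1), ...].

    The order of the input list is preserved, which is the point of the
    flat format: a JSON object (json.nl=map) does not guarantee order in
    every client.
    """
    if len(flat) % 2 != 0:
        raise ValueError("flat facet list must have an even number of items")
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


# Hypothetical facet counts, already sorted by count on the server side.
sample = ["electronics", 14, "memory", 3, "cable", 0]
for name, count in flat_to_pairs(sample):
    print(name, count)
```

Clients that prefer the old object-style output can still request it with json.nl=map, as noted above.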
50
+ The Lucene based Solr query syntax is slightly more strict.
51
+ A ':' in a field value must be escaped or the whole value must be quoted.
52
+
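The stricter handling of ':' can be dealt with on the client side. The following Python sketch shows the two options named above, escaping the colon or quoting the whole value. The function names and the example field name "url" are hypothetical; treat this as an illustration rather than a complete query escaper.

```python
def escape_colons(value: str) -> str:
    """Backslash-escape every ':' in a field value.

    Existing backslashes are escaped first so they are not mistaken
    for escape characters by the query parser.
    """
    return value.replace("\\", "\\\\").replace(":", "\\:")


def quote_value(value: str) -> str:
    """Wrap the whole value in double quotes, escaping embedded quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


# Two ways to search a hypothetical "url" field for a value containing ':'.
print("url:" + escape_colons("http://example.com"))
print("url:" + quote_value("http://example.com"))
```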
53
+ The Solr "Request Handler" framework has been updated in two key ways:
54
+ First, if a Request Handler is registered in solrconfig.xml with a name
55
+ starting with "/" then it can be accessed using path-based URL, instead of
56
+ using the legacy "/select?qt=name" URL structure. Second, the Request
57
+ Handler framework has been extended making it possible to write Request
58
+ Handlers that process streams of data for doing updates, and there is a
59
+ new-style Request Handler for XML updates given the name of "/update" in
60
+ the example solrconfig.xml. Existing installations without this "/update"
61
+ handler will continue to use the old update servlet and should see no
62
+ changes in behavior. For new-style update handlers, errors are now
63
+ reflected in the HTTP status code, Content-type checking is more strict,
64
+ and the response format has changed and is controllable via the wt
65
+ parameter.
66
+
67
+
68
+
69
+ Detailed Change List
70
+ --------------------
71
+
72
+ New Features
73
+ 1. SOLR-82: Default field values can be specified in the schema.xml.
74
+ (Ryan McKinley via hossman)
75
+
76
+ 2. SOLR-89: Two new TokenFilters with corresponding Factories...
77
+ * TrimFilter - Trims leading and trailing whitespace from Tokens
78
+ * PatternReplaceFilter - applies a Pattern to each token in the
79
+ stream, replacing match occurrences with a specified replacement.
80
+ (hossman)
81
+
82
+ 3. SOLR-91: allow configuration of a limit of the number of searchers
83
+ that can be warming in the background. This can be used to avoid
84
+ out-of-memory errors, or contention caused by more and more searchers
85
+ warming in the background. An error is thrown if the limit specified
86
+ by maxWarmingSearchers in solrconfig.xml is exceeded. (yonik)
87
+
88
+ 4. SOLR-106: New faceting parameters that allow specification of a
89
+ minimum count for returned facets (facet.mincount), paging through facets
90
+ (facet.offset, facet.limit), and explicit sorting (facet.sort).
91
+ facet.zeros is now deprecated. (yonik)
92
+
93
+ 5. SOLR-80: Negative queries are now allowed everywhere. Negative queries
94
+ are generated and cached as their positive counterpart, speeding
95
+ generation and generally resulting in smaller sets to cache.
96
+ Set intersections in SolrIndexSearcher are more efficient,
97
+ starting with the smallest positive set, subtracting all negative
98
+ sets, then intersecting with all other positive sets. (yonik)
99
+
100
+ 6. SOLR-117: Limit a field faceting to constraints with a prefix specified
101
+ by facet.prefix or f.<field>.facet.prefix. (yonik)
102
+
103
+ 7. SOLR-107: JAVA API: Change NamedList to use Java5 generics
104
+ and implement Iterable<Map.Entry> (Ryan McKinley via yonik)
105
+
106
+ 8. SOLR-104: Support for "Update Plugins" -- RequestHandlers that want
107
+ access to streams of data for doing updates. ContentStreams can come
108
+ from the raw POST body, multi-part form data, or remote URLs.
109
+ Included in this change is a new SolrDispatchFilter that allows
110
+ RequestHandlers registered with names that begin with a "/" to be
111
+ accessed using a URL structure based on that name.
112
+ (Ryan McKinley via hossman)
113
+
114
+ 9. SOLR-126: DirectUpdateHandler2 supports autocommitting after a specified time
115
+ (in ms), using <autoCommit><maxTime>10000</maxTime></autoCommit>.
116
+ (Ryan McKinley via klaas).
117
+
118
+ 10. SOLR-116: IndexInfoRequestHandler added. (Erik Hatcher)
119
+
120
+ 11. SOLR-79: Add system property ${<sys.prop>[:<default>]} substitution for
121
+ configuration files loaded, including schema.xml and solrconfig.xml.
122
+ (Erik Hatcher with inspiration from Andrew Saar)
123
+
124
+ 12. SOLR-149: Changes to make Solr more easily embeddable, in addition
125
+ to logging which request handler handled each request.
126
+ (Ryan McKinley via yonik)
127
+
128
+ 13. SOLR-86: Added standalone Java-based command-line updater.
129
+ (Erik Hatcher via Bertrand Delacretaz)
130
+
131
+ 14. SOLR-152: DisMaxRequestHandler now supports configurable alternate
132
+ behavior when q is not specified. A "q.alt" param can be specified
133
+ using SolrQueryParser syntax as a mechanism for specifying what query
134
+ the dismax handler should execute if the main user query (q) is blank.
135
+ (Ryan McKinley via hossman)
136
+
137
+ 15. SOLR-158: new "qs" (Query Slop) param for DisMaxRequestHandler
138
+ allows for specifying the amount of default slop to use when parsing
139
+ explicit phrase queries from the user.
140
+ (Adam Hiatt via hossman)
141
+
142
+ 16. SOLR-81: SpellCheckerRequestHandler that uses the SpellChecker from
143
+ the Lucene contrib.
144
+ (Otis Gospodnetic and Adam Hiatt)
145
+
146
+ 17. SOLR-182: allow lazy loading of request handlers on first request.
147
+ (Ryan McKinley via yonik)
148
+
149
+ 18. SOLR-81: More SpellCheckerRequestHandler enhancements, including
150
+ support for relative or absolute directory path configurations, as
151
+ well as RAM based directory. (hossman)
152
+
153
+ 19. SOLR-197: New parameters for input: stream.contentType for specifying
154
+ or overriding the content type of input, and stream.file for reading
155
+ local files. (Ryan McKinley via yonik)
156
+
157
+ 20. SOLR-66: CSV data format for document additions and updates. (yonik)
158
+
159
+ 21. SOLR-184: add echoHandler=true to responseHeader, support echoParams=all
160
+ (Ryan McKinley via ehatcher)
161
+
162
+ 22. SOLR-211: Added a regex PatternTokenizerFactory. This extracts tokens
163
+ from the input string using a regex Pattern. (Ryan McKinley)
164
+
165
+ 23. SOLR-162: Added a "Luke" request handler and other admin helpers.
166
+ This exposes the system status through the standard requestHandler
167
+ framework. (ryan)
168
+
169
+ 24. SOLR-212: Added a DirectSolrConnection class. This lets you access
170
+ solr using the standard request/response formats, but does not require
171
+ an HTTP connection. It is designed for embedded applications. (ryan)
172
+
173
+ 25. SOLR-204: The request dispatcher (added in SOLR-104) can handle
174
+ calls to /select. This offers uniform error handling for /update and
175
+ /select. To enable this behavior, you must add:
176
+ <requestDispatcher handleSelect="true" > to your solrconfig.xml
177
+ See the example solrconfig.xml for details. (ryan)
178
+
179
+ 26. SOLR-170: StandardRequestHandler now supports a "sort" parameter.
180
+ Using the ';' syntax is still supported, but it is recommended to
181
+ transition to the new syntax. (ryan)
182
+
183
+ 27. SOLR-181: The index schema now supports "required" fields. Attempts
184
+ to add a document without a required field will fail, returning a
185
+ descriptive error message. By default, the uniqueKey field is
186
+ a required field. This can be disabled by setting required=false
187
+ in schema.xml. (Greg Ludington via ryan)
188
+
189
+ 28. SOLR-217: Fields configured in the schema to be neither indexed nor
190
+ stored will now be quietly ignored by Solr when Documents are added.
191
+ The example schema has a comment explaining how this can be used to
192
+ ignore any "unknown" fields.
193
+ (Will Johnson via hossman)
194
+
195
+ 29. SOLR-227: If schema.xml defines multiple fieldTypes, fields, or
196
+ dynamicFields with the same name, a severe error will be logged rather
197
+ than quietly continuing. Depending on the <abortOnConfigurationError>
198
+ settings, this may halt the server. Likewise, if solrconfig.xml
199
+ defines multiple RequestHandlers with the same name it will also add
200
+ an error. (ryan)
201
+
202
+ 30. SOLR-226: Added support for dynamic field as the destination of a
203
+ copyField using glob (*) replacement. (ryan)
204
+
205
+ 31. SOLR-224: Adding a PhoneticFilterFactory that uses apache commons codec
206
+ language encoders to build phonetically similar tokens. This currently
207
+ supports: DoubleMetaphone, Metaphone, Soundex, and RefinedSoundex (ryan)
208
+
209
+ 32. SOLR-199: new n-gram tokenizers available via NGramTokenizerFactory
210
+ and EdgeNGramTokenizerFactory. (Adam Hiatt via yonik)
211
+
212
+ 33. SOLR-234: TrimFilter can update the Token's startOffset and endOffset
213
+ if updateOffsets="true". By default the Token offsets are unchanged.
214
+ (ryan)
215
+
216
+ 34. SOLR-208: new example_rss.xsl and example_atom.xsl to provide more
217
+ examples for people about the Solr XML response format and how they
218
+ can transform it to suit different needs.
219
+ (Brian Whitman via hossman)
220
+
221
+ 35. SOLR-249: Deprecated SolrException( int, ... ) constructors in favor
222
+ of constructors that takes an ErrorCode enum. This will ensure that
223
+ all SolrExceptions use a valid HTTP status code. (ryan)
224
+
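Item 5 above (SOLR-80) describes the order in which sets are combined: start with the smallest positive set, subtract all negative sets, then intersect with the remaining positive sets. A minimal Python sketch of that ordering, using plain Python sets in place of Solr's DocSet classes (the function name combine_sets is hypothetical), might look like this:

```python
from typing import Iterable, List, Set


def combine_sets(positive: List[Set[int]], negative: Iterable[Set[int]]) -> Set[int]:
    """Combine document-id sets in the order described for SOLR-80.

    Starting from the smallest positive set keeps every intermediate
    result as small as possible.
    """
    if not positive:
        raise ValueError("at least one positive set is required")
    ordered = sorted(positive, key=len)
    result = set(ordered[0])      # smallest positive set first
    for neg in negative:          # subtract all negative sets
        result -= neg
    for pos in ordered[1:]:       # intersect with the other positive sets
        result &= pos
    return result


print(combine_sets([{1, 2, 3, 4}, {2, 3}], [{3}]))  # prints {2}
```

This only illustrates the ordering; the real implementation works on Solr's own set types and caches.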
225
+ Changes in runtime behavior
226
+ 1. Highlighting using DisMax will only pick up terms from the main
227
+ user query, not boost or filter queries (klaas).
228
+
229
+ 2. SOLR-125: Change default of json.nl to flat, change so that
230
+ json.nl only affects items where order matters (facet constraint
231
+ listings). Fix JSON output bug for null values. Internal JAVA API:
232
+ change most uses of NamedList to SimpleOrderedMap. (yonik)
233
+
234
+ 3. A new method "getSolrQueryParser" has been added to the IndexSchema
235
+ class for retrieving a new SolrQueryParser instance with all options
236
+ specified in the schema.xml's <solrQueryParser> block set. The
237
+ documentation for the SolrQueryParser constructor and its use of
238
+ IndexSchema have also been clarified.
239
+ (Erik Hatcher and hossman)
240
+
241
+ 4. DisMaxRequestHandler's bq, bf, qf, and pf parameters can now accept
242
+ multiple values (klaas).
243
+
244
+ 5. Queries are rewritten before highlighting is performed. This enables
245
+ proper highlighting of prefix and wildcard queries (klaas).
246
+
247
+ 6. A meaningful exception is raised when attempting to add a doc missing
248
+ a unique id if it is declared in the schema and allowDups=false.
249
+ (ryan via klaas)
250
+
251
+ 7. SOLR-183: Exceptions with error code 400 are raised when
252
+ numeric argument parsing fails. RequiredSolrParams class added
253
+ to facilitate checking for parameters that must be present.
254
+ (Ryan McKinley, J.J. Larrea via yonik)
255
+
256
+ 8. SOLR-179: By default, solr will abort after any severe initialization
257
+ errors. This behavior can be disabled by setting:
258
+ <abortOnConfigurationError>false</abortOnConfigurationError>
259
+ in solrconfig.xml (ryan)
260
+
261
+ 9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using
262
+ the new request dispatcher (SOLR-104). This requires posted content to
263
+ have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'
264
+ The response format matches that of /select and returns standard error
265
+ codes. To enable solr1.1 style /update, do not map "/update" to any
266
+ handler in solrconfig.xml (ryan)
267
+
268
+ 10. SOLR-231: If a charset is not specified in the contentType,
269
+ ContentStream.getReader() will use UTF-8 encoding. (ryan)
270
+
271
+ 11. SOLR-230: More options for post.jar to support stdin, xml on the
272
+ commandline, and deferring commits. Tutorial modified to take
273
+ advantage of these options so there is no need for curl.
274
+ (hossman)
275
+
276
+ 12. SOLR-128: Upgraded Jetty to the latest stable release 6.1.3 (ryan)
277
+
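Item 9 above requires posted content to carry a valid contentType when /update is mapped to the new-style handler. As a rough sketch, a Python client could build such a request with the standard library as shown below. The URL (host, port, and path) and the sample document are assumptions for illustration; the request is only constructed here, not sent.

```python
import urllib.request


def build_update_request(url: str, xml: str) -> urllib.request.Request:
    """Build a POST request for an XML update with an explicit charset."""
    return urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Assumed URL of a local example server; adjust to your installation.
req = build_update_request(
    "http://localhost:8983/solr/update",
    '<add><doc><field name="id">1</field></doc></add>',
)
print(req.get_method(), req.get_header("Content-type"))
# To actually send it: urllib.request.urlopen(req)
```

Because new-style update handlers report errors through the HTTP status code, a failed update sent with urlopen would surface as urllib.error.HTTPError.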
278
+ Optimizations
279
+ 1. SOLR-114: HashDocSet specific implementations of union() and andNot()
280
+ for a 20x performance improvement for those set operations, and a new
281
+ hash algorithm speeds up exists() by 10% and intersectionSize() by 8%.
282
+ (yonik)
283
+
284
+ 2. SOLR-115: Solr now uses BooleanQuery.clauses() instead of
285
+ BooleanQuery.getClauses() in any situation where there is no risk of
286
+ modifying the original query.
287
+ (hossman)
288
+
289
+ 3. SOLR-221: Speed up sorted faceting on multivalued fields by ~60%
290
+ when the base set consists of a relatively large portion of the
291
+ index. (yonik)
292
+
293
+ 4. SOLR-221: Added a facet.enum.cache.minDf parameter which avoids
294
+ using the filterCache for terms that match few documents, trading
295
+ decreased memory usage for increased query time. (yonik)
296
+
297
+ Bug Fixes
298
+ 1. SOLR-87: Parsing of synonym files did not correctly handle escaped
299
+ whitespace such as \r\n\t\b\f. (yonik)
300
+
301
+ 2. SOLR-92: DOMUtils.getText (used when parsing config files) did not
302
+ work properly with many DOM implementations when dealing with
303
+ "Attributes". (Ryan McKinley via hossman)
304
+
305
+ 3. SOLR-9,SOLR-99: Tighten up sort specification error checking, throw
306
+ exceptions for missing sort specifications or a sort on a non-indexed
307
+ field. (Ryan McKinley via yonik)
308
+
309
+ 4. SOLR-145: Fix for bug introduced in SOLR-104 where some Exceptions
310
+ were being ignored by all "out of the box" RequestHandlers. (hossman)
311
+
312
+ 5. SOLR-166: JNDI solr.home code refactoring. SOLR-104 moved
313
+ some JNDI related code to the init method of a Servlet Filter -
314
+ according to the Servlet Spec, all Filters should be initialized
315
+ prior to initializing any Servlets, but this is not the case in at
316
+ least one Servlet Container (Resin). This "bug fix" refactors
317
+ this JNDI code so that it should be executed the first time any
318
+ attempt is made to use the solr.home dir.
319
+ (Ryan McKinley via hossman)
320
+
321
+ 6. SOLR-173: Bug fix to SolrDispatchFilter to reduce the "too many open
322
+ files" problem: SolrDispatchFilter was not closing requests
323
+ when finished. Also modified ResponseWriters to only fetch a Searcher
324
+ reference if necessary for writing out DocLists.
325
+ (Ryan McKinley via hossman)
326
+
327
+ 7. SOLR-168: Fix display positioning of multiple tokens at the same
328
+ position in analysis.jsp (yonik)
329
+
330
+ 8. SOLR-167: The SynonymFilter sometimes generated incorrect offsets when
331
+ multi token synonyms were matched in the source text. (yonik)
332
+
333
+ 9. SOLR-188: bin scripts do not support non-default webapp names. Added "-U"
334
+ option to specify a full path to the update url, overriding the
335
+ "-h" (hostname), "-p" (port) and "-w" (webapp name) parameters.
336
+ (Jeff Rodenburg via billa)
337
+
338
+ 10. SOLR-198: RunExecutableListener always waited for the process to
339
+ finish, even when wait="false" was set. (Koji Sekiguchi via yonik)
340
+
341
+ 11. SOLR-207: Changed distribution scripts to remove recursive find
342
+ and avoid use of "find -maxdepth" on platforms where it is not
343
+ supported. (yonik)
344
+
345
+ 12. SOLR-222: Changing writeLockTimeout in solrconfig.xml did not
346
+ change the effective timeout. (Koji Sekiguchi via yonik)
347
+
348
+ 13. Changed the SOLR-104 RequestDispatcher so that /select?qt=xxx can not
349
+ access handlers that start with "/". This makes path based authentication
350
+ possible for path based request handlers. (ryan)
351
+
352
+ 14. SOLR-214: Some servlet containers (including Tomcat and Resin) do not
353
+ obey the specified charset. Rather than letting the container handle
354
+ it solr now uses the charset from the header contentType to decode posted
355
+ content. Using the contentType: "text/xml; charset=utf-8" will force
356
+ utf-8 encoding. If you do not specify a contentType, it will use the
357
+ platform default. (Koji Sekiguchi via ryan)
358
+
359
+ 15. SOLR-241: Undefined system properties used in configuration files now
360
+ cause a clear message to be logged rather than an obscure exception thrown.
361
+ (Koji Sekiguchi via ehatcher)
362
+
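Bug fix 15 above (SOLR-241) concerns undefined properties in the ${<sys.prop>[:<default>]} syntax introduced by SOLR-79. The Python sketch below imitates that substitution to show how a default is applied and how an undefined property without a default can produce a clear error. It is not Solr's implementation; the function name, the regular expression, and the sample property name are assumptions made for illustration.

```python
import re
from typing import Mapping

# Matches ${name} and ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(text: str, props: Mapping[str, str]) -> str:
    """Replace ${name[:default]} placeholders using the given properties."""

    def repl(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        # Fail with a clear message instead of an obscure error.
        raise KeyError(f"no value and no default for property {name!r}")

    return _PLACEHOLDER.sub(repl, text)


line = "<dataDir>${solr.data.dir:./solr/data}</dataDir>"
print(substitute(line, {}))                              # default is used
print(substitute(line, {"solr.data.dir": "/var/solr"}))  # property wins
```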
363
+ Other Changes
364
+ 1. Updated to Lucene 2.1
365
+
366
+ 2. Updated to Lucene 2007-05-20_00-04-53
367
+
368
+ ================== Release 1.1.0, 20061222 ==================
369
+
370
+ Status
371
+ ------
372
+ This is the first release since Solr joined the Incubator, and brings many
373
+ new features and performance optimizations including highlighting,
374
+ faceted browsing, and JSON/Python/Ruby response formats.
375
+
376
+
377
+ Upgrading from previous Solr versions
378
+ -------------------------------------
379
+ Older Apache Solr installations can be upgraded by replacing
380
+ the relevant war file with the new version. No changes to configuration
381
+ files are needed and the index format has not changed.
382
+
383
+ The default version of the Solr XML response syntax has been changed to 2.2.
384
+ Behavior can be preserved for those clients not explicitly specifying a
385
+ version by adding a default to the request handler in solrconfig.xml
386
+
387
+ By default, Solr will no longer use a searcher that has not fully warmed,
388
+ and requests will block in the meantime. To change back to the previous
389
+ behavior of using a cold searcher in the event there is no other
390
+ warm searcher, see the useColdSearcher config item in solrconfig.xml
391
+
392
+ The XML response format when adding multiple documents to the collection
393
+ in a single <add> command has changed to return a single <result>.
394
+
395
+
396
+ Detailed Change List
397
+ --------------------
398
+
399
+ New Features
400
+ 1. added support for setting Lucene's positionIncrementGap
401
+ 2. Admin: new statistics for SolrIndexSearcher
402
+ 3. Admin: caches now show config params on stats page
403
+ 3. max() function added to FunctionQuery suite
404
+ 4. postOptimize hook, mirroring the functionality of the postCommit hook,
405
+ but only called on an index optimize.
406
+ 5. Ability to HTTP POST query requests to /select in addition to HTTP-GET
407
+ 6. The default search field may now be overridden by requests to the
408
+ standard request handler using the df query parameter. (Erik Hatcher)
409
+ 7. Added DisMaxRequestHandler and SolrPluginUtils. (Chris Hostetter)
410
+ 8. Support for customizing the QueryResponseWriter per request
411
+ (Mike Baranczak / SOLR-16 / hossman)
412
+ 9. Added KeywordTokenizerFactory (hossman)
413
+ 10. copyField accepts dynamicfield-like names as the source.
414
+ (Darren Erik Vengroff via yonik, SOLR-21)
415
+ 11. new DocSet.andNot(), DocSet.andNotSize() (yonik)
416
+ 12. Ability to store term vectors for fields. (Mike Klaas via yonik, SOLR-23)
417
+ 13. New abstract BufferedTokenStream for people who want to write
418
+ Tokenizers or TokenFilters that require arbitrary buffering of the
419
+ stream. (SOLR-11 / yonik, hossman)
420
+ 14. New RemoveDuplicatesToken - useful in situations where
421
+ synonyms, stemming, or word-delimiting produce identical tokens at
422
+ the same position. (SOLR-11 / yonik, hossman)
423
+ 15. Added highlighting to SolrPluginUtils and implemented in StandardRequestHandler
424
+ and DisMaxRequestHandler (SOLR-24 / Mike Klaas via hossman,yonik)
425
+ 16. SnowballPorterFilterFactory language is configurable via the "language"
426
+ attribute, with the default being "English". (Bertrand Delacretaz via yonik, SOLR-27)
427
+ 17. ISOLatin1AccentFilterFactory, instantiates ISOLatin1AccentFilter to remove accents.
428
+ (Bertrand Delacretaz via yonik, SOLR-28)
429
+ 18. JSON, Python, Ruby QueryResponseWriters: use wt="json", "python" or "ruby"
430
+ (yonik, SOLR-31)
431
+ 19. Make web admin pages return UTF-8, change Content-type declaration to include a
432
+ space between the mime-type and charset (Philip Jacob, SOLR-35)
433
+ 20. Made query parser default operator configurable via schema.xml:
434
+ <solrQueryParser defaultOperator="AND|OR"/>
435
+ The default operator remains "OR".
436
+ 21. JAVA API: new version of SolrIndexSearcher.getDocListAndSet() which takes
437
+ flags (Greg Ludington via yonik, SOLR-39)
438
+ 22. A HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin
439
+ words that were hyphenated and split by a newline. (Boris Vitez via yonik, SOLR-41)
440
+ 23. Added a CompressableField base class which allows fields of derived types to
441
+ be compressed using the compress=true setting. The field type also gains the
442
+ ability to specify a size threshold at which field data is compressed.
443
+ (klaas, SOLR-45)
444
+ 24. Simple faceted search support for fields (enumerating terms)
445
+ and arbitrary queries added to both StandardRequestHandler and
446
+ DisMaxRequestHandler. (hossman, SOLR-44)
447
+ 25. In addition to specifying default RequestHandler params in the
448
+ solrconfig.xml, support has been added for configuring values to be
449
+ appended to the multi-val request params, as well as for configuring
450
+ invariant params that cannot be overridden in the query. (hossman, SOLR-46)
451
+ 26. Default operator for query parsing can now be specified with q.op=AND|OR
452
+ from the client request, overriding the schema value. (ehatcher)
453
+ 27. New XSLTResponseWriter does server side XSLT processing of XML Response.
454
+ In the process, an init(NamedList) method was added to QueryResponseWriter
455
+ which works the same way as SolrRequestHandler.
456
+ (Bertrand Delacretaz / SOLR-49 / hossman)
457
+ 28. json.wrf parameter adds a wrapper-function around the JSON response,
458
+ useful in AJAX with dynamic script tags for specifying a JavaScript
459
+ callback function. (Bertrand Delacretaz via yonik, SOLR-56)
460
+ 29. autoCommit can be specified every so many documents added (klaas, SOLR-65)
461
+ 30. ${solr.home}/lib directory can now be used for specifying "plugin" jars
462
+ (hossman, SOLR-68)
463
+ 31. Support for "Date Math" relative "NOW" when specifying values of a
464
+ DateField in a query -- or when adding a document.
465
+ (hossman, SOLR-71)
466
+ 32. useColdSearcher control in solrconfig.xml prevents the first searcher
467
+ from being used before it's done warming. This can help prevent
468
+ thrashing on startup when multiple requests hit a cold searcher.
469
+ The default is "false", preventing use before warm. (yonik, SOLR-77)
470
+
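Several of the features listed above are plain request parameters: df (item 6), wt (item 18), q.op (item 26), and json.wrf (item 28). As a small illustration, the Python sketch below assembles them into a query string for /select. The function name and the example values (field name, callback name) are hypothetical.

```python
from urllib.parse import urlencode


def build_select_query(q: str, df: str, op: str = "OR",
                       callback: str = "") -> str:
    """Build a /select query string using parameters from the list above."""
    params = {
        "q": q,
        "df": df,      # default search field for this request
        "q.op": op,    # default operator, AND or OR
        "wt": "json",  # JSON response writer
    }
    if callback:
        params["json.wrf"] = callback  # wrap the JSON in a function call
    return "/select?" + urlencode(params)


print(build_select_query("solr search", df="title", op="AND",
                         callback="handleResults"))
```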
471
+ Changes in runtime behavior
472
+ 1. classes reorganized into different packages, package names changed to Apache
473
+ 2. force read of document stored fields in QuerySenderListener
474
+ 3. Solr now looks in ./solr/conf for config, ./solr/data for data
475
+ configurable via solr.solr.home system property
476
+ 4. Highlighter params changed to be prefixed with "hl."; allow fragmentsize
477
+ customization and per-field overrides on many options
478
+ (Andrew May via klaas, SOLR-37)
479
+ 5. Default param values for DisMaxRequestHandler should now be specified
480
+ using a '<lst name="defaults">...</lst>' init param, for backwards
481
+ compatibility, all init params will be used as defaults if an init param
482
+ with that name does not exist. (hossman, SOLR-43)
483
+ 6. The DisMaxRequestHandler now supports multiple occurrences of the "fq"
484
+ param. (hossman, SOLR-44)
485
+ 7. FunctionQuery.explain now uses ComplexExplanation to provide more
486
+ accurate score explanations when composed in a BooleanQuery.
487
+ (hossman, SOLR-25)
488
+ 8. Document update handling locking is much sparser, allowing performance gains
489
+ through multiple threads. Large commits also might be faster (klaas, SOLR-65)
490
+ 9. Lazy field loading can be enabled via a solrconfig directive. This will be faster when
491
+ not all stored fields are needed from a document (klaas, SOLR-52)
492
+ 10. Made admin JSPs return XML and transform them with new XSL stylesheets
493
+ (Otis Gospodnetic, SOLR-58)
494
+ 11. If the "echoParams=explicit" request parameter is set, request parameters are copied
495
+ to the output. In an XML output, they appear in new <lst name="params"> list inside
496
+ the new <lst name="responseHeader"> element, which replaces the old <responseHeader>.
497
+ Adding a version=2.1 parameter to the request produces the old format, for backwards
498
+ compatibility (bdelacretaz and yonik, SOLR-59).
499
+
500
+ Optimizations
501
+ 1. getDocListAndSet can now generate both a DocList and a DocSet from a
502
+ single lucene query.
503
+ 2. BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate
504
+ set
505
+ 3. OpenBitSet completed, replaces BitSet as the implementation for BitDocSet.
506
+ Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize
507
+ is between 3 and 4 times faster. (yonik, SOLR-15)
508
+ 4. much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size)
509
+ 5. Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field
510
+ queries where DocSets aren't cached (for example, if the number of terms in the field
511
+ is larger than the filter cache.) (yonik)
512
+ 6. Optimized facet.field faceting by as much as 500 times when the field has
513
+ a single token per document (not multiValued & not tokenized) by using the
514
+ Lucene FieldCache entry for that field to tally term counts. The first request
515
+ utilizing the FieldCache will take longer than subsequent ones.
516
+
517
+ Bug Fixes
518
+ 1. Fixed delete-by-id for field types whose indexed form is different
519
+ from the printable form (mainly sortable numeric types).
520
+ 2. Added escaping of attribute values in the XML response (Erik Hatcher)
521
+ 3. Added empty extractTerms() to FunctionQuery to enable use in
522
+ a MultiSearcher (Yonik)
523
+ 4. WordDelimiterFilter sometimes lost token positionIncrement information
524
+ 5. Fix reverse sorting for fields where sortMissingFirst=true
525
+ (Rob Staveley, yonik)
526
+ 6. Worked around a Jetty bug that caused invalid XML responses for fields
527
+ containing non ASCII chars. (Bertrand Delacretaz via yonik, SOLR-32)
528
+ 7. WordDelimiterFilter can throw exceptions if configured with both
529
+ generate and catenate off. (Mike Klaas via yonik, SOLR-34)
530
+ 8. Escape '>' in XML output (because ]]> is illegal in CharData)
531
+ 9. field boosts weren't being applied and doc boosts were being applied to fields (klaas)
532
+ 10. Multiple-doc update generates well-formed xml (klaas, SOLR-65)
533
+ 11. Better parsing of pingQuery from solrconfig.xml (hossman, SOLR-70)
534
+ 12. Fixed bug with "Distribution" page introduced when Versions were
535
+ added to "Info" page (hossman)
536
+ 13. Fixed HTML escaping issues with user input to analysis.jsp and action.jsp
537
+ (hossman, SOLR-74)
538
+
539
+ Other Changes
540
+ 1. Upgrade to Lucene 2.0 nightly build 2006-06-22, lucene SVN revision 416224,
541
+ http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=markup&pathrev=416224
542
+ 2. Modified admin styles to improve display in Internet Explorer (Greg Ludington via billa, SOLR-6)
543
+ 3. Upgrade to Lucene 2.0 nightly build 2006-07-15, lucene SVN revision 422302,
544
+ 4. Included unique key field name/value (if available) in log message of add (billa, SOLR-18)
545
+ 5. Updated to Lucene 2.0 nightly build 2006-09-07, SVN revision 462111
546
+ 6. Added javascript to catch empty query in admin query forms (Tomislav Nakic-Alfirevic via billa, SOLR-48)
547
+ 7. backslash escape * in ssh command used in snappuller for zsh compatibility, SOLR-63
548
+ 8. check solr return code in admin scripts, SOLR-62
549
+ 9. Updated to Lucene 2.0 nightly build 2006-11-15, SVN revision 475069
550
+ 10. Removed src/apps containing the legacy "SolrTest" app (hossman, SOLR-3)
551
+ 11. Simplified index.jsp and form.jsp, primarily by removing/hiding XML
552
+ specific params, and adding an option to pick the output type. (hossman)
553
+ 12. Added new numeric build property "specversion" to allow clean
554
+ MANIFEST.MF files (hossman)
555
+ 13. Added Solr/Lucene versions to "Info" page (hossman)
556
+ 14. Explicitly set mime-type of .xsl files in web.xml to
557
+ application/xslt+xml (hossman)
558
+ 15. Config parsing should now work using DOM Level 2 parsers -- Solr
559
+ previously relied on getTextContent which is a DOM Level 3 addition
560
+ (Alexander Saar via hossman, SOLR-78)
561
+
562
+ 2006/01/17 Solr open sourced, moves to Apache Incubator