bigindex-solr 1.2.0
- data/.document +5 -0
- data/.gitignore +8 -0
- data/MIT-LICENSE +20 -0
- data/README.rdoc +73 -0
- data/Rakefile +49 -0
- data/VERSION +1 -0
- data/bigindex-solr.gemspec +141 -0
- data/examples/bigindex.yml +9 -0
- data/install.rb +17 -0
- data/lib/bigindex-solr.rb +32 -0
- data/lib/bigindex-solr/tasks.rb +3 -0
- data/lib/tasks/bigindex_solr.rake +55 -0
- data/schema.xml +126 -0
- data/solr/CHANGES.txt +562 -0
- data/solr/LICENSE.txt +608 -0
- data/solr/NOTICE.txt +57 -0
- data/solr/etc/jetty.xml +206 -0
- data/solr/etc/webdefault.xml +379 -0
- data/solr/lib/commons-codec-1.3.jar +0 -0
- data/solr/lib/commons-csv-0.1-SNAPSHOT.jar +0 -0
- data/solr/lib/commons-fileupload-1.2.jar +0 -0
- data/solr/lib/commons-io-1.2.jar +0 -0
- data/solr/lib/easymock.jar +0 -0
- data/solr/lib/jetty-6.1.3.jar +0 -0
- data/solr/lib/jetty-util-6.1.3.jar +0 -0
- data/solr/lib/jsp-2.1/ant-1.6.5.jar +0 -0
- data/solr/lib/jsp-2.1/core-3.1.1.jar +0 -0
- data/solr/lib/jsp-2.1/jsp-2.1.jar +0 -0
- data/solr/lib/jsp-2.1/jsp-api-2.1.jar +0 -0
- data/solr/lib/lucene-analyzers-2007-05-20_00-04-53.jar +0 -0
- data/solr/lib/lucene-core-2007-05-20_00-04-53.jar +0 -0
- data/solr/lib/lucene-highlighter-2007-05-20_00-04-53.jar +0 -0
- data/solr/lib/lucene-snowball-2007-05-20_00-04-53.jar +0 -0
- data/solr/lib/lucene-spellchecker-2007-05-20_00-04-53.jar +0 -0
- data/solr/lib/servlet-api-2.4.jar +0 -0
- data/solr/lib/servlet-api-2.5-6.1.3.jar +0 -0
- data/solr/lib/xpp3-1.1.3.4.O.jar +0 -0
- data/solr/solr/README.txt +52 -0
- data/solr/solr/bin/abc +176 -0
- data/solr/solr/bin/abo +176 -0
- data/solr/solr/bin/backup +108 -0
- data/solr/solr/bin/backupcleaner +142 -0
- data/solr/solr/bin/commit +128 -0
- data/solr/solr/bin/optimize +129 -0
- data/solr/solr/bin/readercycle +129 -0
- data/solr/solr/bin/rsyncd-disable +77 -0
- data/solr/solr/bin/rsyncd-enable +76 -0
- data/solr/solr/bin/rsyncd-start +145 -0
- data/solr/solr/bin/rsyncd-stop +105 -0
- data/solr/solr/bin/scripts-util +83 -0
- data/solr/solr/bin/snapcleaner +148 -0
- data/solr/solr/bin/snapinstaller +168 -0
- data/solr/solr/bin/snappuller +248 -0
- data/solr/solr/bin/snappuller-disable +77 -0
- data/solr/solr/bin/snappuller-enable +77 -0
- data/solr/solr/bin/snapshooter +109 -0
- data/solr/solr/conf/admin-extra.html +31 -0
- data/solr/solr/conf/protwords.txt +21 -0
- data/solr/solr/conf/schema.xml +126 -0
- data/solr/solr/conf/scripts.conf +24 -0
- data/solr/solr/conf/solrconfig.xml +458 -0
- data/solr/solr/conf/stopwords.txt +57 -0
- data/solr/solr/conf/synonyms.txt +31 -0
- data/solr/solr/conf/xslt/example.xsl +132 -0
- data/solr/solr/conf/xslt/example_atom.xsl +63 -0
- data/solr/solr/conf/xslt/example_rss.xsl +62 -0
- data/solr/solr/data/development/index/_a.fdt +0 -0
- data/solr/solr/data/development/index/_a.fdx +0 -0
- data/solr/solr/data/development/index/_a.fnm +2 -0
- data/solr/solr/data/development/index/_a.frq +1 -0
- data/solr/solr/data/development/index/_a.nrm +1 -0
- data/solr/solr/data/development/index/_a.prx +0 -0
- data/solr/solr/data/development/index/_a.tii +0 -0
- data/solr/solr/data/development/index/_a.tis +0 -0
- data/solr/solr/data/development/index/_a_1.del +0 -0
- data/solr/solr/data/development/index/_b.fdt +0 -0
- data/solr/solr/data/development/index/_b.fdx +0 -0
- data/solr/solr/data/development/index/_b.fnm +2 -0
- data/solr/solr/data/development/index/_b.frq +0 -0
- data/solr/solr/data/development/index/_b.nrm +1 -0
- data/solr/solr/data/development/index/_b.prx +0 -0
- data/solr/solr/data/development/index/_b.tii +0 -0
- data/solr/solr/data/development/index/_b.tis +0 -0
- data/solr/solr/data/development/index/_b_1.del +0 -0
- data/solr/solr/data/development/index/_c.fdt +0 -0
- data/solr/solr/data/development/index/_c.fdx +0 -0
- data/solr/solr/data/development/index/_c.fnm +2 -0
- data/solr/solr/data/development/index/_c.frq +1 -0
- data/solr/solr/data/development/index/_c.nrm +1 -0
- data/solr/solr/data/development/index/_c.prx +0 -0
- data/solr/solr/data/development/index/_c.tii +0 -0
- data/solr/solr/data/development/index/_c.tis +0 -0
- data/solr/solr/data/development/index/_c_1.del +0 -0
- data/solr/solr/data/development/index/segments.gen +0 -0
- data/solr/solr/data/development/index/segments_q +0 -0
- data/solr/start.jar +0 -0
- data/solr/webapps/solr.war +0 -0
- data/spec/bigindex-solr_spec.rb +7 -0
- data/spec/spec_helper.rb +9 -0
- metadata +163 -0
data/solr/CHANGES.txt
ADDED
@@ -0,0 +1,562 @@
|
|
1
|
+
Apache Solr Version 1.2
|
2
|
+
Release Notes
|
3
|
+
|
4
|
+
Introduction
|
5
|
+
------------
|
6
|
+
Apache Solr is an open source enterprise search server based on the Lucene Java
|
7
|
+
search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search,
|
8
|
+
caching, replication, and a web administration interface. It runs in a Java
|
9
|
+
servlet container such as Tomcat.
|
10
|
+
|
11
|
+
See http://lucene.apache.org/solr for more information.
|
12
|
+
|
13
|
+
|
14
|
+
Getting Started
|
15
|
+
---------------
|
16
|
+
You need a Java 1.5 VM or later installed.
|
17
|
+
In this release, there is an example Solr server including a bundled
|
18
|
+
servlet container in the directory named "example".
|
19
|
+
See the tutorial at http://lucene.apache.org/solr/tutorial.html
|
20
|
+
|
21
|
+
|
22
|
+
$Id: CHANGES.txt 543263 2007-05-31 21:19:02Z yonik $
|
23
|
+
|
24
|
+
================== Release 1.2, 20070603 ==================
|
25
|
+
|
26
|
+
Upgrading from Solr 1.1
|
27
|
+
-------------------------------------
|
28
|
+
IMPORTANT UPGRADE NOTE: In a master/slave configuration, all searchers/slaves
|
29
|
+
should be upgraded before the master! If the master were to be updated
|
30
|
+
first, the older searchers would not be able to read the new index format.
|
31
|
+
|
32
|
+
Older Apache Solr installations can be upgraded by replacing
|
33
|
+
the relevant war file with the new version. No changes to configuration
|
34
|
+
files should be needed.
|
35
|
+
|
36
|
+
This version of Solr contains a new version of Lucene implementing
|
37
|
+
an updated index format. This version of Solr/Lucene can still read
|
38
|
+
and update indexes in the older formats, and will convert them to the new
|
39
|
+
format on the first index change. One change in the new index format
|
40
|
+
is that all "norms" are kept in a single file, greatly reducing the number
|
41
|
+
of files per segment. Users of compound file indexes will want to consider
|
42
|
+
converting to the non-compound format for faster indexing and slightly better
|
43
|
+
search concurrency.
|
44
|
+
|
45
|
+
The JSON response format for facets has changed to make it easier for
|
46
|
+
clients to retain sorted order. Use json.nl=map explicitly in clients
|
47
|
+
to get the old behavior, or add it as a default to the request handler
|
48
|
+
in solrconfig.xml
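As a rough illustration of what the json.nl change means for a client, the sketch below reads a facet listing in either shape. The helper name facet_pairs and the sample payloads are assumptions made for this example, not output captured from Solr; the flat shape is assumed to alternate term and count in one array.

```python
import json


def facet_pairs(facet_counts):
    """Return (term, count) pairs from either assumed facet layout.

    json.nl=flat is assumed to give ["term", count, "term", count, ...];
    json.nl=map is assumed to give {"term": count, ...}.
    """
    if isinstance(facet_counts, dict):
        return list(facet_counts.items())
    # Flat layout: even positions hold terms, odd positions hold counts.
    return list(zip(facet_counts[0::2], facet_counts[1::2]))


# Hypothetical facet payloads for a single field.
flat = json.loads('["electronics", 14, "memory", 3, "card", 2]')
mapped = json.loads('{"electronics": 14, "memory": 3, "card": 2}')

print(facet_pairs(flat))
print(facet_pairs(mapped))
```

The flat array keeps its order in any JSON parser, whereas a JSON object is only ordered if the client's parser happens to preserve key order.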
|
49
|
+
|
50
|
+
The Lucene based Solr query syntax is slightly more strict.
|
51
|
+
A ':' in a field value must be escaped or the whole value must be quoted.
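A client could handle the stricter syntax in either of the two ways the note mentions. The helpers below are a minimal sketch; their names and the sample value are hypothetical, and the backslash escape follows the usual Lucene query-syntax convention.

```python
def escape_colons(value):
    """Backslash-escape every ':' so it is not read as a field separator."""
    return value.replace(":", "\\:")


def quote_value(value):
    """Quote the whole value, escaping embedded backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


raw = "urn:isbn:0451450523"  # hypothetical field value containing colons
print("id:" + escape_colons(raw))
print("id:" + quote_value(raw))
```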
|
52
|
+
|
53
|
+
The Solr "Request Handler" framework has been updated in two key ways:
|
54
|
+
First, if a Request Handler is registered in solrconfig.xml with a name
|
55
|
+
starting with "/" then it can be accessed using path-based URL, instead of
|
56
|
+
using the legacy "/select?qt=name" URL structure. Second, the Request
|
57
|
+
Handler framework has been extended making it possible to write Request
|
58
|
+
Handlers that process streams of data for doing updates, and there is a
|
59
|
+
new-style Request Handler for XML updates given the name of "/update" in
|
60
|
+
the example solrconfig.xml. Existing installations without this "/update"
|
61
|
+
handler will continue to use the old update servlet and should see no
|
62
|
+
changes in behavior. For new-style update handlers, errors are now
|
63
|
+
reflected in the HTTP status code, Content-type checking is more strict,
|
64
|
+
and the response format has changed and is controllable via the wt
|
65
|
+
parameter.
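The two URL styles described above can be compared with a small sketch. The base URL, the handler names, and both function names are assumptions chosen for illustration only.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr"  # assumed host, port, and webapp name


def legacy_url(handler_name, params):
    """Legacy style: the handler is selected through qt on /select."""
    query = dict(params, qt=handler_name)
    return BASE + "/select?" + urlencode(query)


def path_url(handler_name, params):
    """Path style: only for handlers registered with a name starting with '/'."""
    if not handler_name.startswith("/"):
        raise ValueError("path-based access needs a handler name starting with '/'")
    return BASE + handler_name + "?" + urlencode(params)


print(legacy_url("dismax", {"q": "solr"}))
print(path_url("/spellchecker", {"q": "solr"}))
```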
|
66
|
+
|
67
|
+
|
68
|
+
|
69
|
+
Detailed Change List
|
70
|
+
--------------------
|
71
|
+
|
72
|
+
New Features
|
73
|
+
1. SOLR-82: Default field values can be specified in the schema.xml.
|
74
|
+
(Ryan McKinley via hossman)
|
75
|
+
|
76
|
+
2. SOLR-89: Two new TokenFilters with corresponding Factories...
|
77
|
+
* TrimFilter - Trims leading and trailing whitespace from Tokens
|
78
|
+
* PatternReplaceFilter - applies a Pattern to each token in the
|
79
|
+
stream, replacing match occurrences with a specified replacement.
|
80
|
+
(hossman)
|
81
|
+
|
82
|
+
3. SOLR-91: allow configuration of a limit of the number of searchers
|
83
|
+
that can be warming in the background. This can be used to avoid
|
84
|
+
out-of-memory errors, or contention caused by more and more searchers
|
85
|
+
warming in the background. An error is thrown if the limit specified
|
86
|
+
by maxWarmingSearchers in solrconfig.xml is exceeded. (yonik)
|
87
|
+
|
88
|
+
4. SOLR-106: New faceting parameters that allow specification of a
|
89
|
+
minimum count for returned facets (facet.mincount), paging through facets
|
90
|
+
(facet.offset, facet.limit), and explicit sorting (facet.sort).
|
91
|
+
facet.zeros is now deprecated. (yonik)
|
92
|
+
|
93
|
+
5. SOLR-80: Negative queries are now allowed everywhere. Negative queries
|
94
|
+
are generated and cached as their positive counterpart, speeding
|
95
|
+
generation and generally resulting in smaller sets to cache.
|
96
|
+
Set intersections in SolrIndexSearcher are more efficient,
|
97
|
+
starting with the smallest positive set, subtracting all negative
|
98
|
+
sets, then intersecting with all other positive sets. (yonik)
|
99
|
+
|
100
|
+
6. SOLR-117: Limit a field faceting to constraints with a prefix specified
|
101
|
+
by facet.prefix or f.<field>.facet.prefix. (yonik)
|
102
|
+
|
103
|
+
7. SOLR-107: JAVA API: Change NamedList to use Java5 generics
|
104
|
+
and implement Iterable<Map.Entry> (Ryan McKinley via yonik)
|
105
|
+
|
106
|
+
8. SOLR-104: Support for "Update Plugins" -- RequestHandlers that want
|
107
|
+
access to streams of data for doing updates. ContentStreams can come
|
108
|
+
from the raw POST body, multi-part form data, or remote URLs.
|
109
|
+
Included in this change is a new SolrDispatchFilter that allows
|
110
|
+
RequestHandlers registered with names that begin with a "/" to be
|
111
|
+
accessed using a URL structure based on that name.
|
112
|
+
(Ryan McKinley via hossman)
|
113
|
+
|
114
|
+
9. SOLR-126: DirectUpdateHandler2 supports autocommitting after a specified time
|
115
|
+
(in ms), using <autoCommit><maxTime>10000</maxTime></autoCommit>.
|
116
|
+
(Ryan McKinley via klaas).
|
117
|
+
|
118
|
+
10. SOLR-116: IndexInfoRequestHandler added. (Erik Hatcher)
|
119
|
+
|
120
|
+
11. SOLR-79: Add system property ${<sys.prop>[:<default>]} substitution for
|
121
|
+
configuration files loaded, including schema.xml and solrconfig.xml.
|
122
|
+
(Erik Hatcher with inspiration from Andrew Saar)
|
123
|
+
|
124
|
+
12. SOLR-149: Changes to make Solr more easily embeddable, in addition
|
125
|
+
to logging which request handler handled each request.
|
126
|
+
(Ryan McKinley via yonik)
|
127
|
+
|
128
|
+
13. SOLR-86: Added standalone Java-based command-line updater.
|
129
|
+
(Erik Hatcher via Bertrand Delacretaz)
|
130
|
+
|
131
|
+
14. SOLR-152: DisMaxRequestHandler now supports configurable alternate
|
132
|
+
behavior when q is not specified. A "q.alt" param can be specified
|
133
|
+
using SolrQueryParser syntax as a mechanism for specifying what query
|
134
|
+
the dismax handler should execute if the main user query (q) is blank.
|
135
|
+
(Ryan McKinley via hossman)
|
136
|
+
|
137
|
+
15. SOLR-158: new "qs" (Query Slop) param for DisMaxRequestHandler
|
138
|
+
allows for specifying the amount of default slop to use when parsing
|
139
|
+
explicit phrase queries from the user.
|
140
|
+
(Adam Hiatt via hossman)
|
141
|
+
|
142
|
+
16. SOLR-81: SpellCheckerRequestHandler that uses the SpellChecker from
|
143
|
+
the Lucene contrib.
|
144
|
+
(Otis Gospodnetic and Adam Hiatt)
|
145
|
+
|
146
|
+
17. SOLR-182: allow lazy loading of request handlers on first request.
|
147
|
+
(Ryan McKinley via yonik)
|
148
|
+
|
149
|
+
18. SOLR-81: More SpellCheckerRequestHandler enhancements, including
|
150
|
+
support for relative or absolute directory path configurations, as
|
151
|
+
well as RAM based directory. (hossman)
|
152
|
+
|
153
|
+
19. SOLR-197: New parameters for input: stream.contentType for specifying
|
154
|
+
or overriding the content type of input, and stream.file for reading
|
155
|
+
local files. (Ryan McKinley via yonik)
|
156
|
+
|
157
|
+
20. SOLR-66: CSV data format for document additions and updates. (yonik)
|
158
|
+
|
159
|
+
21. SOLR-184: add echoHandler=true to responseHeader, support echoParams=all
|
160
|
+
(Ryan McKinley via ehatcher)
|
161
|
+
|
162
|
+
22. SOLR-211: Added a regex PatternTokenizerFactory. This extracts tokens
|
163
|
+
from the input string using a regex Pattern. (Ryan McKinley)
|
164
|
+
|
165
|
+
23. SOLR-162: Added a "Luke" request handler and other admin helpers.
|
166
|
+
This exposes the system status through the standard requestHandler
|
167
|
+
framework. (ryan)
|
168
|
+
|
169
|
+
24. SOLR-212: Added a DirectSolrConnection class. This lets you access
|
170
|
+
solr using the standard request/response formats, but does not require
|
171
|
+
an HTTP connection. It is designed for embedded applications. (ryan)
|
172
|
+
|
173
|
+
25. SOLR-204: The request dispatcher (added in SOLR-104) can handle
|
174
|
+
calls to /select. This offers uniform error handling for /update and
|
175
|
+
/select. To enable this behavior, you must add:
|
176
|
+
<requestDispatcher handleSelect="true" > to your solrconfig.xml
|
177
|
+
See the example solrconfig.xml for details. (ryan)
|
178
|
+
|
179
|
+
26. SOLR-170: StandardRequestHandler now supports a "sort" parameter.
|
180
|
+
Using the ';' syntax is still supported, but it is recommended to
|
181
|
+
transition to the new syntax. (ryan)
|
182
|
+
|
183
|
+
27. SOLR-181: The index schema now supports "required" fields. Attempts
|
184
|
+
to add a document without a required field will fail, returning a
|
185
|
+
descriptive error message. By default, the uniqueKey field is
|
186
|
+
a required field. This can be disabled by setting required=false
|
187
|
+
in schema.xml. (Greg Ludington via ryan)
|
188
|
+
|
189
|
+
28. SOLR-217: Fields configured in the schema to be neither indexed nor
|
190
|
+
stored will now be quietly ignored by Solr when Documents are added.
|
191
|
+
The example schema has a comment explaining how this can be used to
|
192
|
+
ignore any "unknown" fields.
|
193
|
+
(Will Johnson via hossman)
|
194
|
+
|
195
|
+
29. SOLR-227: If schema.xml defines multiple fieldTypes, fields, or
|
196
|
+
dynamicFields with the same name, a severe error will be logged rather
|
197
|
+
than quietly continuing. Depending on the <abortOnConfigurationError>
|
198
|
+
settings, this may halt the server. Likewise, if solrconfig.xml
|
199
|
+
defines multiple RequestHandlers with the same name it will also add
|
200
|
+
an error. (ryan)
|
201
|
+
|
202
|
+
30. SOLR-226: Added support for dynamic field as the destination of a
|
203
|
+
copyField using glob (*) replacement. (ryan)
|
204
|
+
|
205
|
+
31. SOLR-224: Adding a PhoneticFilterFactory that uses apache commons codec
|
206
|
+
language encoders to build phonetically similar tokens. This currently
|
207
|
+
supports: DoubleMetaphone, Metaphone, Soundex, and RefinedSoundex (ryan)
|
208
|
+
|
209
|
+
32. SOLR-199: new n-gram tokenizers available via NGramTokenizerFactory
|
210
|
+
and EdgeNGramTokenizerFactory. (Adam Hiatt via yonik)
|
211
|
+
|
212
|
+
33. SOLR-234: TrimFilter can update the Token's startOffset and endOffset
|
213
|
+
if updateOffsets="true". By default the Token offsets are unchanged.
|
214
|
+
(ryan)
|
215
|
+
|
216
|
+
34. SOLR-208: new example_rss.xsl and example_atom.xsl to provide more
|
217
|
+
examples for people about the Solr XML response format and how they
|
218
|
+
can transform it to suit different needs.
|
219
|
+
(Brian Whitman via hossman)
|
220
|
+
|
221
|
+
35. SOLR-249: Deprecated SolrException( int, ... ) constructors in favor
|
222
|
+
of constructors that take an ErrorCode enum. This will ensure that
|
223
|
+
all SolrExceptions use a valid HTTP status code. (ryan)
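Item 5 above (SOLR-80) describes an evaluation order for combining positive and negative sets. The sketch below mirrors that order with plain Python sets of document IDs. It is not SolrIndexSearcher's code: the function name is hypothetical, and sorting the remaining positive sets by size is an extra assumption beyond what the note states.

```python
def intersect(positive_sets, negative_sets):
    """Combine document-ID sets in the order the note describes.

    1. start with the smallest positive set,
    2. subtract every negative set,
    3. intersect with the remaining positive sets.
    """
    if not positive_sets:
        raise ValueError("at least one positive set is required")
    ordered = sorted(positive_sets, key=len)
    result = set(ordered[0])
    for neg in negative_sets:
        result -= neg
    for pos in ordered[1:]:
        result &= pos
    return result


positives = [{1, 2, 3, 4, 5}, {2, 3, 4}, {2, 4, 6}]
negatives = [{4}]
print(intersect(positives, negatives))
```

Starting from the smallest set keeps every intermediate result no larger than that set, which is the point of the ordering.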
|
224
|
+
|
225
|
+
Changes in runtime behavior
|
226
|
+
1. Highlighting using DisMax will only pick up terms from the main
|
227
|
+
user query, not boost or filter queries (klaas).
|
228
|
+
|
229
|
+
2. SOLR-125: Change default of json.nl to flat, change so that
|
230
|
+
json.nl only affects items where order matters (facet constraint
|
231
|
+
listings). Fix JSON output bug for null values. Internal JAVA API:
|
232
|
+
change most uses of NamedList to SimpleOrderedMap. (yonik)
|
233
|
+
|
234
|
+
3. A new method "getSolrQueryParser" has been added to the IndexSchema
|
235
|
+
class for retrieving a new SolrQueryParser instance with all options
|
236
|
+
specified in the schema.xml's <solrQueryParser> block set. The
|
237
|
+
documentation for the SolrQueryParser constructor and its use of
|
238
|
+
IndexSchema have also been clarified.
|
239
|
+
(Erik Hatcher and hossman)
|
240
|
+
|
241
|
+
4. DisMaxRequestHandler's bq, bf, qf, and pf parameters can now accept
|
242
|
+
multiple values (klaas).
|
243
|
+
|
244
|
+
5. Queries are rewritten before highlighting is performed. This enables
|
245
|
+
proper highlighting of prefix and wildcard queries (klaas).
|
246
|
+
|
247
|
+
6. A meaningful exception is raised when attempting to add a doc missing
|
248
|
+
a unique id if it is declared in the schema and allowDups=false.
|
249
|
+
(ryan via klaas)
|
250
|
+
|
251
|
+
7. SOLR-183: Exceptions with error code 400 are raised when
|
252
|
+
numeric argument parsing fails. RequiredSolrParams class added
|
253
|
+
to facilitate checking for parameters that must be present.
|
254
|
+
(Ryan McKinley, J.J. Larrea via yonik)
|
255
|
+
|
256
|
+
8. SOLR-179: By default, solr will abort after any severe initialization
|
257
|
+
errors. This behavior can be disabled by setting:
|
258
|
+
<abortOnConfigurationError>false</abortOnConfigurationError>
|
259
|
+
in solrconfig.xml (ryan)
|
260
|
+
|
261
|
+
9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using
|
262
|
+
the new request dispatcher (SOLR-104). This requires posted content to
|
263
|
+
have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'
|
264
|
+
The response format matches that of /select and returns standard error
|
265
|
+
codes. To enable solr1.1 style /update, do not map "/update" to any
|
266
|
+
handler in solrconfig.xml (ryan)
|
267
|
+
|
268
|
+
10. SOLR-231: If a charset is not specified in the contentType,
|
269
|
+
ContentStream.getReader() will use UTF-8 encoding. (ryan)
|
270
|
+
|
271
|
+
11. SOLR-230: More options for post.jar to support stdin, xml on the
|
272
|
+
command line, and deferring commits. Tutorial modified to take
|
273
|
+
advantage of these options so there is no need for curl.
|
274
|
+
(hossman)
|
275
|
+
|
276
|
+
12. SOLR-128: Upgraded Jetty to the latest stable release 6.1.3 (ryan)
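Items 9 and 10 above concern the Content-type header on posts to /update. The sketch below builds such a request without sending it and shows a UTF-8 fallback when no charset is given. The URL and both function names are assumptions for illustration; the fallback models item 10 only and is not Solr's own parsing code.

```python
from urllib.request import Request

UPDATE_URL = "http://localhost:8983/solr/update"  # assumed location of the handler


def build_update_request(xml_body):
    """Prepare (but do not send) a POST with an explicit XML content type."""
    return Request(
        UPDATE_URL,
        data=xml_body.encode("utf-8"),
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


def charset_of(content_type, default="utf-8"):
    """Return the charset parameter of a content type, or the default if absent."""
    for part in content_type.split(";")[1:]:
        key, _, value = part.strip().partition("=")
        if key.lower() == "charset" and value:
            return value.strip('"').lower()
    return default


req = build_update_request("<add><doc><field name='id'>1</field></doc></add>")
print(req.get_method(), req.get_header("Content-type"))
print(charset_of("text/xml"))
```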
|
277
|
+
|
278
|
+
Optimizations
|
279
|
+
1. SOLR-114: HashDocSet specific implementations of union() and andNot()
|
280
|
+
for a 20x performance improvement for those set operations, and a new
|
281
|
+
hash algorithm speeds up exists() by 10% and intersectionSize() by 8%.
|
282
|
+
(yonik)
|
283
|
+
|
284
|
+
2. SOLR-115: Solr now uses BooleanQuery.clauses() instead of
|
285
|
+
BooleanQuery.getClauses() in any situation where there is no risk of
|
286
|
+
modifying the original query.
|
287
|
+
(hossman)
|
288
|
+
|
289
|
+
3. SOLR-221: Speed up sorted faceting on multivalued fields by ~60%
|
290
|
+
when the base set consists of a relatively large portion of the
|
291
|
+
index. (yonik)
|
292
|
+
|
293
|
+
4. SOLR-221: Added a facet.enum.cache.minDf parameter which avoids
|
294
|
+
using the filterCache for terms that match few documents, trading
|
295
|
+
decreased memory usage for increased query time. (yonik)
|
296
|
+
|
297
|
+
Bug Fixes
|
298
|
+
1. SOLR-87: Parsing of synonym files did not correctly handle escaped
|
299
|
+
whitespace such as \r\n\t\b\f. (yonik)
|
300
|
+
|
301
|
+
2. SOLR-92: DOMUtils.getText (used when parsing config files) did not
|
302
|
+
work properly with many DOM implementations when dealing with
|
303
|
+
"Attributes". (Ryan McKinley via hossman)
|
304
|
+
|
305
|
+
3. SOLR-9,SOLR-99: Tighten up sort specification error checking, throw
|
306
|
+
exceptions for missing sort specifications or a sort on a non-indexed
|
307
|
+
field. (Ryan McKinley via yonik)
|
308
|
+
|
309
|
+
4. SOLR-145: Fix for bug introduced in SOLR-104 where some Exceptions
|
310
|
+
were being ignored by all "out of the box" RequestHandlers. (hossman)
|
311
|
+
|
312
|
+
5. SOLR-166: JNDI solr.home code refactoring. SOLR-104 moved
|
313
|
+
some JNDI related code to the init method of a Servlet Filter -
|
314
|
+
according to the Servlet Spec, all Filters should be initialized
|
315
|
+
prior to initializing any Servlets, but this is not the case in at
|
316
|
+
least one Servlet Container (Resin). This "bug fix" refactors
|
317
|
+
this JNDI code so that it should be executed the first time any
|
318
|
+
attempt is made to use the solr.home dir.
|
319
|
+
(Ryan McKinley via hossman)
|
320
|
+
|
321
|
+
6. SOLR-173: Bug fix to SolrDispatchFilter to reduce "too many open
|
322
|
+
files" problem: SolrDispatchFilter was not closing requests
|
323
|
+
when finished. Also modified ResponseWriters to only fetch a Searcher
|
324
|
+
reference if necessary for writing out DocLists.
|
325
|
+
(Ryan McKinley via hossman)
|
326
|
+
|
327
|
+
7. SOLR-168: Fix display positioning of multiple tokens at the same
|
328
|
+
position in analysis.jsp (yonik)
|
329
|
+
|
330
|
+
8. SOLR-167: The SynonymFilter sometimes generated incorrect offsets when
|
331
|
+
multi token synonyms were matched in the source text. (yonik)
|
332
|
+
|
333
|
+
9. SOLR-188: bin scripts do not support non-default webapp names. Added "-U"
|
334
|
+
option to specify a full path to the update url, overriding the
|
335
|
+
"-h" (hostname), "-p" (port) and "-w" (webapp name) parameters.
|
336
|
+
(Jeff Rodenburg via billa)
|
337
|
+
|
338
|
+
10. SOLR-198: RunExecutableListener always waited for the process to
|
339
|
+
finish, even when wait="false" was set. (Koji Sekiguchi via yonik)
|
340
|
+
|
341
|
+
11. SOLR-207: Changed distribution scripts to remove recursive find
|
342
|
+
and avoid use of "find -maxdepth" on platforms where it is not
|
343
|
+
supported. (yonik)
|
344
|
+
|
345
|
+
12. SOLR-222: Changing writeLockTimeout in solrconfig.xml did not
|
346
|
+
change the effective timeout. (Koji Sekiguchi via yonik)
|
347
|
+
|
348
|
+
13. Changed the SOLR-104 RequestDispatcher so that /select?qt=xxx can not
|
349
|
+
access handlers that start with "/". This makes path based authentication
|
350
|
+
possible for path based request handlers. (ryan)
|
351
|
+
|
352
|
+
14. SOLR-214: Some servlet containers (including Tomcat and Resin) do not
|
353
|
+
obey the specified charset. Rather than letting the container handle
|
354
|
+
it solr now uses the charset from the header contentType to decode posted
|
355
|
+
content. Using the contentType: "text/xml; charset=utf-8" will force
|
356
|
+
utf-8 encoding. If you do not specify a contentType, it will use the
|
357
|
+
platform default. (Koji Sekiguchi via ryan)
|
358
|
+
|
359
|
+
15. SOLR-241: Undefined system properties used in configuration files now
|
360
|
+
cause a clear message to be logged rather than an obscure exception thrown.
|
361
|
+
(Koji Sekiguchi via ehatcher)
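The ${<sys.prop>[:<default>]} substitution from SOLR-79, and the undefined-property case that item 15 above refers to, can be sketched as follows. This is an illustrative re-implementation in Python rather than Solr's own code: the function name, the sample property, and the choice to raise KeyError for an undefined property without a default are all assumptions.

```python
import re

# Matches ${name} or ${name:default}; the default may itself contain ':'.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def substitute(text, props):
    """Replace each placeholder with its property value or its default."""

    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        # Assumed behavior for this sketch only.
        raise KeyError("undefined property with no default: " + name)

    return _PLACEHOLDER.sub(replace, text)


line = "<dataDir>${solr.data.dir:./solr/data}</dataDir>"  # hypothetical config line
print(substitute(line, {}))
print(substitute(line, {"solr.data.dir": "/var/solr"}))
```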
|
362
|
+
|
363
|
+
Other Changes
|
364
|
+
1. Updated to Lucene 2.1
|
365
|
+
|
366
|
+
2. Updated to Lucene 2007-05-20_00-04-53
|
367
|
+
|
368
|
+
================== Release 1.1.0, 20061222 ==================
|
369
|
+
|
370
|
+
Status
|
371
|
+
------
|
372
|
+
This is the first release since Solr joined the Incubator, and brings many
|
373
|
+
new features and performance optimizations including highlighting,
|
374
|
+
faceted browsing, and JSON/Python/Ruby response formats.
|
375
|
+
|
376
|
+
|
377
|
+
Upgrading from previous Solr versions
|
378
|
+
-------------------------------------
|
379
|
+
Older Apache Solr installations can be upgraded by replacing
|
380
|
+
the relevant war file with the new version. No changes to configuration
|
381
|
+
files are needed and the index format has not changed.
|
382
|
+
|
383
|
+
The default version of the Solr XML response syntax has been changed to 2.2.
|
384
|
+
Behavior can be preserved for those clients not explicitly specifying a
|
385
|
+
version by adding a default to the request handler in solrconfig.xml
|
386
|
+
|
387
|
+
By default, Solr will no longer use a searcher that has not fully warmed,
|
388
|
+
and requests will block in the meantime. To change back to the previous
|
389
|
+
behavior of using a cold searcher in the event there is no other
|
390
|
+
warm searcher, see the useColdSearcher config item in solrconfig.xml
|
391
|
+
|
392
|
+
The XML response format when adding multiple documents to the collection
|
393
|
+
in a single <add> command has changed to return a single <result>.
|
394
|
+
|
395
|
+
|
396
|
+
Detailed Change List
|
397
|
+
--------------------
|
398
|
+
|
399
|
+
New Features
|
400
|
+
1. added support for setting Lucene's positionIncrementGap
|
401
|
+
2. Admin: new statistics for SolrIndexSearcher
|
402
|
+
3. Admin: caches now show config params on stats page
|
403
|
+
3. max() function added to FunctionQuery suite
|
404
|
+
4. postOptimize hook, mirroring the functionality of the postCommit hook,
|
405
|
+
but only called on an index optimize.
|
406
|
+
5. Ability to HTTP POST query requests to /select in addition to HTTP-GET
|
407
|
+
6. The default search field may now be overridden by requests to the
|
408
|
+
standard request handler using the df query parameter. (Erik Hatcher)
|
409
|
+
7. Added DisMaxRequestHandler and SolrPluginUtils. (Chris Hostetter)
|
410
|
+
8. Support for customizing the QueryResponseWriter per request
|
411
|
+
(Mike Baranczak / SOLR-16 / hossman)
|
412
|
+
9. Added KeywordTokenizerFactory (hossman)
|
413
|
+
10. copyField accepts dynamicfield-like names as the source.
|
414
|
+
(Darren Erik Vengroff via yonik, SOLR-21)
|
415
|
+
11. new DocSet.andNot(), DocSet.andNotSize() (yonik)
|
416
|
+
12. Ability to store term vectors for fields. (Mike Klaas via yonik, SOLR-23)
|
417
|
+
13. New abstract BufferedTokenStream for people who want to write
|
418
|
+
Tokenizers or TokenFilters that require arbitrary buffering of the
|
419
|
+
stream. (SOLR-11 / yonik, hossman)
|
420
|
+
14. New RemoveDuplicatesToken - useful in situations where
|
421
|
+
synonyms, stemming, or word-delimiter-ing produce identical tokens at
|
422
|
+
the same position. (SOLR-11 / yonik, hossman)
|
423
|
+
15. Added highlighting to SolrPluginUtils and implemented in StandardRequestHandler
|
424
|
+
and DisMaxRequestHandler (SOLR-24 / Mike Klaas via hossman,yonik)
|
425
|
+
16. SnowballPorterFilterFactory language is configurable via the "language"
|
426
|
+
attribute, with the default being "English". (Bertrand Delacretaz via yonik, SOLR-27)
|
427
|
+
17. ISOLatin1AccentFilterFactory, instantiates ISOLatin1AccentFilter to remove accents.
|
428
|
+
(Bertrand Delacretaz via yonik, SOLR-28)
|
429
|
+
18. JSON, Python, Ruby QueryResponseWriters: use wt="json", "python" or "ruby"
|
430
|
+
(yonik, SOLR-31)
|
431
|
+
19. Make web admin pages return UTF-8, change Content-type declaration to include a
|
432
|
+
space between the mime-type and charset (Philip Jacob, SOLR-35)
|
433
|
+
20. Made query parser default operator configurable via schema.xml:
|
434
|
+
<solrQueryParser defaultOperator="AND|OR"/>
|
435
|
+
The default operator remains "OR".
|
436
|
+
21. JAVA API: new version of SolrIndexSearcher.getDocListAndSet() which takes
|
437
|
+
flags (Greg Ludington via yonik, SOLR-39)
|
438
|
+
22. A HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin
|
439
|
+
words that were hyphenated and split by a newline. (Boris Vitez via yonik, SOLR-41)
|
440
|
+
23. Added a CompressableField base class which allows fields of derived types to
|
441
|
+
be compressed using the compress=true setting. The field type also gains the
|
442
|
+
ability to specify a size threshold at which field data is compressed.
|
443
|
+
(klaas, SOLR-45)
|
444
|
+
24. Simple faceted search support for fields (enumerating terms)
|
445
|
+
and arbitrary queries added to both StandardRequestHandler and
|
446
|
+
DisMaxRequestHandler. (hossman, SOLR-44)
|
447
|
+
25. In addition to specifying default RequestHandler params in the
|
448
|
+
solrconfig.xml, support has been added for configuring values to be
|
449
|
+
appended to the multi-val request params, as well as for configuring
|
450
|
+
invariant params that cannot be overridden in the query. (hossman, SOLR-46)
|
451
|
+
26. Default operator for query parsing can now be specified with q.op=AND|OR
|
452
|
+
from the client request, overriding the schema value. (ehatcher)
|
453
|
+
27. New XSLTResponseWriter does server side XSLT processing of XML Response.
|
454
|
+
In the process, an init(NamedList) method was added to QueryResponseWriter
|
455
|
+
which works the same way as SolrRequestHandler.
|
456
|
+
(Bertrand Delacretaz / SOLR-49 / hossman)
|
457
|
+
28. json.wrf parameter adds a wrapper-function around the JSON response,
|
458
|
+
useful in AJAX with dynamic script tags for specifying a JavaScript
|
459
|
+
callback function. (Bertrand Delacretaz via yonik, SOLR-56)
|
460
|
+
29. autoCommit can be specified every so many documents added (klaas, SOLR-65)
|
461
|
+
30. ${solr.home}/lib directory can now be used for specifying "plugin" jars
|
462
|
+
(hossman, SOLR-68)
|
463
|
+
31. Support for "Date Math" relative "NOW" when specifying values of a
|
464
|
+
DateField in a query -- or when adding a document.
|
465
|
+
(hossman, SOLR-71)
|
466
|
+
32. useColdSearcher control in solrconfig.xml prevents the first searcher
|
467
|
+
from being used before it's done warming. This can help prevent
|
468
|
+
thrashing on startup when multiple requests hit a cold searcher.
|
469
|
+
The default is "false", preventing use before warm. (yonik, SOLR-77)
|
470
|
+
|
471
|
+
Changes in runtime behavior
|
472
|
+
1. classes reorganized into different packages, package names changed to Apache
|
473
|
+
2. force read of document stored fields in QuerySenderListener
|
474
|
+
3. Solr now looks in ./solr/conf for config, ./solr/data for data
|
475
|
+
configurable via solr.solr.home system property
|
476
|
+
4. Highlighter params changed to be prefixed with "hl."; allow fragmentsize
|
477
|
+
customization and per-field overrides on many options
|
478
|
+
(Andrew May via klaas, SOLR-37)
|
479
|
+
5. Default param values for DisMaxRequestHandler should now be specified
|
480
|
+
using a '<lst name="defaults">...</lst>' init param, for backwards
|
481
|
+
compatibility all init params will be used as defaults if an init param
|
482
|
+
with that name does not exist. (hossman, SOLR-43)
|
483
|
+
6. The DisMaxRequestHandler now supports multiple occurrences of the "fq"
|
484
|
+
param. (hossman, SOLR-44)
|
485
|
+
7. FunctionQuery.explain now uses ComplexExplanation to provide more
|
486
|
+
accurate score explanations when composed in a BooleanQuery.
|
487
|
+
(hossman, SOLR-25)
|
488
|
+
8. Document update handling locking is much sparser, allowing performance gains
|
489
|
+
through multiple threads. Large commits also might be faster (klaas, SOLR-65)
|
490
|
+
9. Lazy field loading can be enabled via a solrconfig directive. This will be faster when
|
491
|
+
not all stored fields are needed from a document (klaas, SOLR-52)
|
492
|
+
10. Made admin JSPs return XML and transform them with new XSL stylesheets
|
493
|
+
(Otis Gospodnetic, SOLR-58)
|
494
|
+
11. If the "echoParams=explicit" request parameter is set, request parameters are copied
|
495
|
+
to the output. In an XML output, they appear in new <lst name="params"> list inside
|
496
|
+
the new <lst name="responseHeader"> element, which replaces the old <responseHeader>.
|
497
|
+
Adding a version=2.1 parameter to the request produces the old format, for backwards
|
498
|
+
compatibility (bdelacretaz and yonik, SOLR-59).
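
The response layout described for echoParams (item 11) can be sketched from the client side. The Java example below reads the echoed parameters out of a hand-written sample response. The sample document is an assumption built from the entry's description: the <response> root, the <str> and <int> child elements, and the status and QTime entries are illustrative, and EchoParamsSketch is a hypothetical class, not part of Solr.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: extracts echoed request parameters from an assumed response layout.
final class EchoParamsSketch {
    // Hand-written sample; a real response may contain more or different elements.
    static final String SAMPLE =
            "<response>"
          + "<lst name=\"responseHeader\">"
          + "<int name=\"status\">0</int>"
          + "<int name=\"QTime\">1</int>"
          + "<lst name=\"params\">"
          + "<str name=\"q\">solr</str>"
          + "<str name=\"echoParams\">explicit</str>"
          + "</lst>"
          + "</lst>"
          + "</response>";

    // Returns name -> value for every child of responseHeader/params, in document order.
    static Map<String, String> echoedParams(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    "/response/lst[@name='responseHeader']/lst[@name='params']/*",
                    doc, XPathConstants.NODESET);
            Map<String, String> params = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                params.put(e.getAttribute("name"), e.getTextContent());
            }
            return params;
        } catch (Exception e) {
            throw new IllegalStateException("could not read response", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoedParams(SAMPLE));
    }
}
```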
+
+Optimizations
+1. getDocListAndSet can now generate both a DocList and a DocSet from a
+    single Lucene query.
+2. BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate
+    set
+3. OpenBitSet completed, replaces BitSet as the implementation for BitDocSet.
+    Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize
+    are between 3 and 4 times faster. (yonik, SOLR-15)
+4. much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size)
+5. Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field
+    queries where DocSets aren't cached (for example, if the number of terms in the field
+    is larger than the filter cache.) (yonik)
+6. Optimized facet.field faceting by as much as 500 times when the field has
+    a single token per document (not multiValued & not tokenized) by using the
+    Lucene FieldCache entry for that field to tally term counts. The first request
+    utilizing the FieldCache will take longer than subsequent ones.
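
The set-size optimizations in items 2 to 4 share one idea: count matches directly instead of materializing a result set. The Java sketch below illustrates that idea only. It is not Solr's BitDocSet or HashDocSet code. It models a bit set as a long[] and a small set as an int[] of distinct document ids, and it assumes the bit set's cardinality is already known so that the union size costs only one pass over the smaller set. DocSetSizeSketch and its method names are hypothetical.

```java
// Illustrative only: size computations that avoid building an intermediate set.
final class DocSetSizeSketch {
    // Builds a bit array of the given number of 64-bit words with the listed docs set.
    static long[] bitsOf(int numWords, int... docs) {
        long[] bits = new long[numWords];
        for (int doc : docs) {
            bits[doc >>> 6] |= 1L << (doc & 63);
        }
        return bits;
    }

    // Bit set vs. small set: probe one bit per id, O(small set size).
    // Assumes smallSetDocIds holds distinct ids.
    static int intersectionSize(long[] bits, int[] smallSetDocIds) {
        int count = 0;
        for (int doc : smallSetDocIds) {
            int word = doc >>> 6;
            if (word < bits.length && (bits[word] & (1L << (doc & 63))) != 0) {
                count++;
            }
        }
        return count;
    }

    // Bit set vs. bit set: AND word by word and count the set bits.
    static long intersectionSize(long[] a, long[] b) {
        long count = 0;
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }

    // |A union B| = |A| + |B| - |A intersect B|, given a precomputed |A|.
    static long unionSize(long[] bits, long bitsCardinality, int[] smallSetDocIds) {
        return bitsCardinality + smallSetDocIds.length - intersectionSize(bits, smallSetDocIds);
    }

    public static void main(String[] args) {
        long[] bits = bitsOf(3, 1, 5, 64, 130);
        int[] small = {5, 64, 7, 200};
        System.out.println(intersectionSize(bits, small));
        System.out.println(unionSize(bits, 4, small));
    }
}
```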
+
+Bug Fixes
+1. Fixed delete-by-id for field types whose indexed form is different
+    from the printable form (mainly sortable numeric types).
+2. Added escaping of attribute values in the XML response (Erik Hatcher)
+3. Added empty extractTerms() to FunctionQuery to enable use in
+    a MultiSearcher (Yonik)
+4. WordDelimiterFilter sometimes lost token positionIncrement information
+5. Fix reverse sorting for fields where sortMissingFirst=true
+    (Rob Staveley, yonik)
+6. Worked around a Jetty bug that caused invalid XML responses for fields
+    containing non-ASCII chars. (Bertrand Delacretaz via yonik, SOLR-32)
+7. WordDelimiterFilter can throw exceptions if configured with both
+    generate and catenate off. (Mike Klaas via yonik, SOLR-34)
+8. Escape '>' in XML output (because ]]> is illegal in CharData)
+9. field boosts weren't being applied and doc boosts were being applied to fields (klaas)
+10. Multiple-doc update generates well-formed xml (klaas, SOLR-65)
+11. Better parsing of pingQuery from solrconfig.xml (hossman, SOLR-70)
+12. Fixed bug with "Distribution" page introduced when Versions were
+    added to "Info" page (hossman)
+13. Fixed HTML escaping issues with user input to analysis.jsp and action.jsp
+    (hossman, SOLR-74)
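
The escaping fixes in items 2 and 8 come down to a small character mapping. The Java sketch below shows one way to write it. It is not Solr's response writer, and XmlEscapeSketch is a hypothetical class. Escaping every '>' is a simple way to guarantee that the sequence ]]> never appears in character data, and attribute values additionally need the double quote escaped.

```java
// Illustrative only: minimal escaping for XML character data and attribute values.
final class XmlEscapeSketch {
    // Escapes text content. '>' is always escaped so "]]>" can never be emitted.
    static String escapeCharData(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Escapes a value that will be written inside double-quoted attribute syntax.
    static String escapeAttribute(String s) {
        return escapeCharData(s).replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(escapeCharData("a]]>b & <c>"));
        System.out.println("<str name=\"" + escapeAttribute("say \"hi\"") + "\"/>");
    }
}
```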
+
+Other Changes
+1. Upgrade to Lucene 2.0 nightly build 2006-06-22, Lucene SVN revision 416224,
+    http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=markup&pathrev=416224
+2. Modified admin styles to improve display in Internet Explorer (Greg Ludington via billa, SOLR-6)
+3. Upgrade to Lucene 2.0 nightly build 2006-07-15, Lucene SVN revision 422302
+4. Included unique key field name/value (if available) in log message of add (billa, SOLR-18)
+5. Updated to Lucene 2.0 nightly build 2006-09-07, SVN revision 462111
+6. Added javascript to catch empty query in admin query forms (Tomislav Nakic-Alfirevic via billa, SOLR-48)
+7. backslash escape * in ssh command used in snappuller for zsh compatibility, SOLR-63
+8. check solr return code in admin scripts, SOLR-62
+9. Updated to Lucene 2.0 nightly build 2006-11-15, SVN revision 475069
+10. Removed src/apps containing the legacy "SolrTest" app (hossman, SOLR-3)
+11. Simplified index.jsp and form.jsp, primarily by removing/hiding XML
+    specific params, and adding an option to pick the output type. (hossman)
+12. Added new numeric build property "specversion" to allow clean
+    MANIFEST.MF files (hossman)
+13. Added Solr/Lucene versions to "Info" page (hossman)
+14. Explicitly set mime-type of .xsl files in web.xml to
+    application/xslt+xml (hossman)
+15. Config parsing should now work using DOM Level 2 parsers -- Solr
+    previously relied on getTextContent which is a DOM Level 3 addition
+    (Alexander Saar via hossman, SOLR-78)
+
+2006/01/17 Solr open sourced, moves to Apache Incubator